GroundTruth eval: API changes #1353
- agreement_txt, min_score_val=0, max_score_val=3
- )
- / 3,
+ re_0_10_rating(agreement_txt) / 10,
why change this back to 10pt scoring?
This is because the instructions in the agreement prompt (for the feedback function GroundtruthAgreement.agreement_measure) haven't been updated to use the recently added configurable output score space yet.
I will do that, along with several other feedback prompts, in a separate PR. This change is here so that the e2e notebook can run GT eval successfully.
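To illustrate why the parser's score range has to match the range the prompt asks for, here is a minimal sketch. The helpers `extract_rating` and `normalized_score` are hypothetical and are not the TruLens implementations of `re_0_10_rating` or the configurable-range parser; they only assume that the model's reply contains an integer rating somewhere in its text.

```python
import re
from typing import Optional


def extract_rating(text: str, min_score: int, max_score: int) -> Optional[int]:
    """Return the first integer in `text` that lies in [min_score, max_score].

    Hypothetical helper for illustration only.
    """
    for match in re.finditer(r"\d+", text):
        value = int(match.group())
        if min_score <= value <= max_score:
            return value
    return None


def normalized_score(text: str, min_score: int = 0, max_score: int = 10) -> float:
    """Map a rating found in `text` onto the [0, 1] interval."""
    rating = extract_rating(text, min_score, max_score)
    if rating is None:
        raise ValueError(
            f"no rating between {min_score} and {max_score} found in {text!r}"
        )
    return (rating - min_score) / (max_score - min_score)


# A prompt that asks for a 0-10 rating, parsed with the matching range:
ten_point = normalized_score("Score: 7", 0, 10)

# The same reply parsed with a 0-3 range finds no valid rating at all:
mismatched = extract_rating("Score: 7", 0, 3)
```

With a mismatched range the rating is either rejected or silently rescaled, which is presumably why the parser here stays on the 10-point scale until the prompt itself is updated.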
@@ -25,7 +25,7 @@
 fileConfig(config.config_file_name)

 # Get `sqlalchemy.url` from the environment.
-if config.get_main_option("sqlalchemy.url", None) in (None, ""):
+if config.get_main_option("sqlalchemy.url", None) is None:
this change is b/c I've merged in https://github.com/truera/trulens/pull/1355/files so we no longer need to consider the empty string
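The difference between the two checks can be sketched without Alembic. The function below is a hypothetical stand-in for the fallback logic in `env.py`, and the environment variable name `SQLALCHEMY_URL` is an assumption made for this example, not something stated in this thread.

```python
import os
from typing import Mapping, Optional


def resolve_sqlalchemy_url(
    configured: Optional[str],
    env: Mapping[str, str] = os.environ,
    empty_is_missing: bool = False,
) -> Optional[str]:
    """Pick the database URL, falling back to the environment when unset.

    `configured` plays the role of the value read from the Alembic config.
    With `empty_is_missing=True` an empty string is also treated as unset,
    which mirrors the `in (None, "")` form of the check.
    """
    missing = configured is None or (empty_is_missing and configured == "")
    if missing:
        # Assumed variable name, for illustration only.
        return env.get("SQLALCHEMY_URL")
    return configured


fake_env = {"SQLALCHEMY_URL": "sqlite:///example.db"}

# `is None` check: only a truly absent option falls back to the environment.
from_env = resolve_sqlalchemy_url(None, fake_env)
kept_empty = resolve_sqlalchemy_url("", fake_env)

# `in (None, "")` check: an empty string also falls back.
replaced_empty = resolve_sqlalchemy_url("", fake_env, empty_is_missing=True)
```

If nothing upstream can produce an empty string any more, the two checks behave identically and the narrower `is None` form is enough.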
…for BEIR data loader + add docstring for beir_loader
Tq so much
* initial commit + fix type of ground_truth argument in benchmark_experiment * beir data loader impl * WIP ORM and persist API with chunking * wip with dataframe chunking * orm for groundtruth and dataset added * schema classes added for dataset and groundtruth * more CRUD code * more crud * separete tru changes * revert unwanted changes * schema fix * rm * move beir loader to its own change * schema update * tmp id handling * add BEIR dataset loader util * batch insertion of ground truth entries in tru sdk * wip * wip notebook test * add alembic new revision * add migration versions to data.py * remove ALTER column statement as it's not supported in SQLite * add autogenerated migration revision * add alembic new revision * remove ALTER column statement as it's not supported in SQLite * batch insertion of groundtruth entries more or less work * dataset use dataset_json just like gt * update revision * update api name * remove ts * added domain * nb update * better docstring * sdk renaming * renaming 'response' to 'expected_response' in GT eval * BEIR dataset loader WIP * nb * revisions * make data_path mandatory * beir done * adjust metadata in nb * remove domain * implement chunking * todo: refactor * no download cleanup zip * comment on expected_score * let groundtruth feedback handles pd df * api skeleton * pd concat * v1 working * speed up * fix groundtruth feedback * doc updates * more doc update * remove stuff * initial commit + fix type of ground_truth argument in benchmark_experiment * beir data loader impl * WIP ORM and persist API with chunking * wip with dataframe chunking * orm for groundtruth and dataset added * schema classes added for dataset and groundtruth * more CRUD code * more crud * separete tru changes * revert unwanted changes * schema fix * rm * move beir loader to its own change * schema update * tmp id handling * add autogenerated migration revision * dataset use dataset_json just like gt * update revision * remove ts * added domain * 
revisions * remove domain * let groundtruth feedback handles pd df * pd concat * v1 working * doc updates * more doc update * remove unused param * no more negative param * update * simplify name * improve batch insertion * remove unnecessary change in env.py * simplified and incorporating pr comments - no threads just in-memory * docstring * time-based to batch-size based * remove unnecessary dataset names and rely on the actual download URL for BEIR data loader + add docstring for beir_loader * pandas / pd
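The commit notes above mention moving ground truth insertion from time-based to batch-size-based chunking, done in memory without threads. A rough sketch of that general idea follows; `batched` and the `inserted` list are hypothetical names chosen for this example and do not reflect the actual TruLens SDK code.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield batch


# Stand-in for a database write: record each batch that would be inserted.
inserted: List[List[dict]] = []
entries = [{"query": f"q{i}", "expected_response": f"a{i}"} for i in range(5)]
for chunk in batched(entries, batch_size=2):
    inserted.append(chunk)
```

Flushing on a fixed batch size keeps memory use bounded and makes the number of writes predictable, whereas a time-based flush depends on how quickly entries arrive.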
Description
JIRA: https://snowflakecomputing.atlassian.net/browse/SNOW-1622124
Design: https://docs.google.com/document/d/1T67nNWL08jmQ7_xBpxulMU1mosecGgTKqBakhgBf79A/edit?pli=1
ORM / DAO PR: #1348
Type of change
not work as expected)