GroundTruth eval: API changes #1353

Merged
merged 111 commits into from
Aug 20, 2024

Conversation

sfc-gh-dhuang
Contributor

@sfc-gh-dhuang sfc-gh-dhuang commented Aug 16, 2024

Description

JIRA: https://snowflakecomputing.atlassian.net/browse/SNOW-1622124
Design: https://docs.google.com/document/d/1T67nNWL08jmQ7_xBpxulMU1mosecGgTKqBakhgBf79A/edit?pli=1

ORM / DAO PR: #1348

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to
    not work as expected)
  • New Tests
  • This change includes re-generated golden test results
  • This change requires a documentation update

@sfc-gh-dhuang sfc-gh-dhuang changed the title add BEIR dataset loader util GroundTruth eval: API changes Aug 16, 2024

-     agreement_txt, min_score_val=0, max_score_val=3
- )
- / 3,
+ re_0_10_rating(agreement_txt) / 10,
Contributor

why change this back to 10pt scoring?

Contributor Author

@sfc-gh-dhuang sfc-gh-dhuang Aug 20, 2024

This is b/c the instructions in the agreement prompt (for the feedback function GroundtruthAgreement.agreement_measure) haven't been updated yet to use the recently added configurable output score space.

I will do that, along with several other feedback prompts, in a separate PR. This change is here so that the e2e notebook can run GT eval successfully.
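
For readers following the scoring discussion, here is a minimal sketch of why the prompt's scale and the parser's scale have to match. It is not the TruLens implementation: `normalize_rating` is a hypothetical helper, and only the parameter names `min_score_val` and `max_score_val` are borrowed from the diff above.

```python
import re


def normalize_rating(text: str, min_score_val: int = 0, max_score_val: int = 10) -> float:
    """Hypothetical helper: find the first integer in `text` that lies in
    [min_score_val, max_score_val] and rescale it to the range [0, 1]."""
    if max_score_val <= min_score_val:
        raise ValueError("max_score_val must be greater than min_score_val")
    for match in re.finditer(r"\d+", text):
        value = int(match.group())
        if min_score_val <= value <= max_score_val:
            # Linear rescale so every score space maps onto [0, 1].
            return (value - min_score_val) / (max_score_val - min_score_val)
    raise ValueError("no rating in the configured range was found")


# A prompt that asks for a 0-10 rating, parsed on the 0-10 scale.
ten_point = normalize_rating("Score: 7")

# The same kind of answer parsed on a 0-3 scale fails, because 7 is out of range.
try:
    normalize_rating("Score: 7", min_score_val=0, max_score_val=3)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

If the prompt still instructs the model to answer on a 10-point scale, parsing on a 0-3 scale either rejects the answer or distorts it, which is consistent with the temporary revert described in the comment.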

Base automatically changed from daniel/gt-dataset-persistence to main August 20, 2024 18:18
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Aug 20, 2024
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Aug 20, 2024
@@ -25,7 +25,7 @@
fileConfig(config.config_file_name)

# Get `sqlalchemy.url` from the environment.
- if config.get_main_option("sqlalchemy.url", None) in (None, ""):
+ if config.get_main_option("sqlalchemy.url", None) is None:
Contributor Author

this change is b/c I've merged in https://github.com/truera/trulens/pull/1355/files so we no longer need to consider the empty string
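
As an illustration of what the simplified check implies, the sketch below falls back to the environment only when the option is missing. It is an assumption-laden stand-in, not the project's `env.py`: `resolve_sqlalchemy_url` is a hypothetical function, its `configured` argument stands in for the result of `config.get_main_option("sqlalchemy.url", None)`, and the environment variable name `SQLALCHEMY_URL` is assumed.

```python
import os
from typing import Mapping, Optional


def resolve_sqlalchemy_url(
    configured: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Hypothetical helper: return the configured URL, or read it from the
    environment when the option is missing (None)."""
    if configured is None:
        # Assumed variable name; the real project may use a different one.
        url = env.get("SQLALCHEMY_URL")
        if url is None:
            raise RuntimeError("no database URL configured or set in the environment")
        return url
    # Any non-None value, including an empty string, is treated as configured.
    return configured


from_env = resolve_sqlalchemy_url(None, {"SQLALCHEMY_URL": "sqlite:///default.sqlite"})
from_config = resolve_sqlalchemy_url("sqlite:///explicit.sqlite", {})
```

With an `is None` check, an empty string is no longer treated as "unset", so this form is only safe when the option can never be an empty string upstream.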

…for BEIR data loader + add docstring for beir_loader
Contributor

@sfc-gh-pdharmana sfc-gh-pdharmana left a comment

Thank you so much

@sfc-gh-dhuang sfc-gh-dhuang merged commit 8110c11 into main Aug 20, 2024
7 checks passed
@sfc-gh-dhuang sfc-gh-dhuang deleted the daniel/gt-api branch August 20, 2024 20:10
sfc-gh-dhuang added a commit that referenced this pull request Aug 21, 2024
* initial commit + fix type of ground_truth argument in benchmark_experiment

* beir data loader impl

* WIP ORM and persist API with chunking

* wip with dataframe chunking

* orm for groundtruth and dataset added

* schema classes added for dataset and groundtruth

* more CRUD code

* more crud

* separete tru changes

* revert unwanted changes

* schema fix

* rm

* move beir loader to its own change

* schema update

* tmp id handling

* add BEIR dataset loader util

* batch insertion of ground truth entries in tru sdk

* wip

* wip notebook test

* add alembic new revision

* add migration versions to data.py

* remove ALTER column statement as it's not supported in SQLite

* add autogenerated migration revision

* add alembic new revision

* remove ALTER column statement as it's not supported in SQLite

* batch insertion of groundtruth entries more or less work

* dataset use dataset_json just like gt

* update revision

* update api name

* remove ts

* added domain

* nb update

* better docstring

* sdk renaming

* renaming 'response' to 'expected_response' in GT eval

* BEIR dataset loader WIP

* nb

* revisions

* make data_path mandatory

* beir done

* adjust metadata in nb

* remove domain

* implement chunking

* todo: refactor

* no download cleanup zip

* comment on expected_score

* let groundtruth feedback handles pd df

* api skeleton

* pd concat

* v1 working

* speed up

* fix groundtruth feedback

* doc updates

* more doc update

* remove stuff

* initial commit + fix type of ground_truth argument in benchmark_experiment

* beir data loader impl

* WIP ORM and persist API with chunking

* wip with dataframe chunking

* orm for groundtruth and dataset added

* schema classes added for dataset and groundtruth

* more CRUD code

* more crud

* separete tru changes

* revert unwanted changes

* schema fix

* rm

* move beir loader to its own change

* schema update

* tmp id handling

* add autogenerated migration revision

* dataset use dataset_json just like gt

* update revision

* remove ts

* added domain

* revisions

* remove domain

* let groundtruth feedback handles pd df

* pd concat

* v1 working

* doc updates

* more doc update

* remove unused param

* no more negative param

* update

* simplify name

* improve batch insertion

* remove unnecessary change in env.py

* simplified and incorporating pr comments - no threads just in-memory

* docstring

* time-based to batch-size based

* remove unnecessary dataset names and rely on the actual download URL for BEIR data loader + add docstring for beir_loader

* pandas / pd
sfc-gh-chu pushed a commit that referenced this pull request Sep 25, 2024
Labels
documentation Improvements or additions to documentation lgtm This PR has been approved by a maintainer size:XL This PR changes 500-999 lines, ignoring generated files.
4 participants