fix erroranalysis test failures due to new shap release having inconsistent dimensions for single valued target #2552

Merged
merged 1 commit into from
Apr 8, 2024

Conversation

imatiach-msft
Contributor

@imatiach-msft imatiach-msft commented Apr 5, 2024

Description

Fix erroranalysis test failures caused by the new shap release returning SHAP values with inconsistent dimensions for a single-valued target.

The erroranalysis tests started failing, apparently because of the new shap 0.45.0 release.

Example exception:

error_correlation_method = <ErrorCorrelationMethods.GBM_SHAP: 'gbm_shap'>

    def run_error_analyzer(model, X_test, y_test, feature_names,
                           categorical_features,
                           error_correlation_method):
        model_analyzer = ModelAnalyzer(model, X_test, y_test,
                                       feature_names,
                                       categorical_features)
        scores = model_analyzer.compute_importances(error_correlation_method)
        if model_analyzer.model_task == ModelTask.CLASSIFICATION:
            diff = model.predict(model_analyzer.dataset) != model_analyzer.true_y
        else:
            diff = model.predict(model_analyzer.dataset) - model_analyzer.true_y
>       assert isinstance(scores, list)
E       assert False
E        +  where False = isinstance(0.0, list)

The most generic fix is to reshape the result: when we are running a classification scenario and shap returns a two-dimensional array of SHAP values, expand it to a three-dimensional array.
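
For illustration, the reshape could look roughly like the sketch below. The helper name `ensure_3d_shap_values`, the `is_classification` flag, and the choice to append the extra axis last are assumptions made for this example; this is not the code merged in this PR.

```python
import numpy as np


def ensure_3d_shap_values(shap_values, is_classification):
    """Hypothetical helper: make classification SHAP values three-dimensional.

    Assumes a two-dimensional result has shape (n_samples, n_features) and
    that downstream code can work with a trailing axis of length 1.
    """
    values = np.asarray(shap_values)
    if is_classification and values.ndim == 2:
        # (n_samples, n_features) -> (n_samples, n_features, 1)
        values = values[:, :, np.newaxis]
    return values


# A 2-D result for a classification task gains a trailing axis.
two_d = np.zeros((4, 3))
three_d = ensure_3d_shap_values(two_d, is_classification=True)
print(three_d.shape)
```

Regression results and arrays that are already three-dimensional pass through unchanged, so downstream code only has to handle one layout for classification.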

Also pinning scikit-learn (see py-why/EconML#854) so that the CD.yml build passes and all builds are green again.

Checklist

  • I have added screenshots above for all UI changes.
  • I have added e2e tests for all UI changes.
  • Documentation was updated if it was needed.

@codecov-commenter

codecov-commenter commented Apr 5, 2024

Codecov Report

Attention: Patch coverage is 66.66667%, with 1 line in your changes missing coverage. Please review.

Project coverage is 92.12%. Comparing base (4bb3835) to head (9b789ad).

Files Patch % Lines
...sis/erroranalysis/error_correlation_methods/gbm.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2552       +/-   ##
===========================================
+ Coverage   80.96%   92.12%   +11.16%     
===========================================
  Files          75      108       +33     
  Lines        3472     5436     +1964     
===========================================
+ Hits         2811     5008     +2197     
+ Misses        661      428      -233     
Flag Coverage Δ
unittests 92.12% <66.66%> (+11.16%) ⬆️

@imatiach-msft imatiach-msft merged commit 2a5a473 into main Apr 8, 2024
101 checks passed
@imatiach-msft imatiach-msft deleted the ilmat/fix-shap-ea-builds-err branch April 8, 2024 13:28

4 participants