Add resnet56 short tests. #6101

tfboyd · 2019-01-27T16:07:32Z

- created base benchmark module
- renamed accuracy test class to contain the word Accuracy, which will result in a need to update all the jobs and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor


official/resnet/keras/keras_cifar_benchmark.py

guptapriya · 2019-01-28T07:36:11Z

official/resnet/keras/keras_cifar_benchmark.py

+    def_flags['data_dir'] = DATA_DIR
+    def_flags['train_steps'] = 110
+    def_flags['log_steps'] = 10
+    FLAGS.skip_eval = True


Hm, why put it in def_flags and also set it here?
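One way to read the question above is that each default should be set in exactly one place. A minimal sketch of that idea, assuming a hypothetical `FakeFlags` stand-in for the real flags object and a hypothetical `apply_defaults` helper (neither name comes from this PR), with `/tmp/cifar10_data` as a placeholder for `DATA_DIR`:

```python
class FakeFlags:
    """Hypothetical stand-in for a flags object; this is not the absl FLAGS API."""

    def __init__(self):
        self.data_dir = None
        self.train_steps = None
        self.log_steps = None
        self.skip_eval = False


def apply_defaults(flags_obj, def_flags):
    """Copies every default onto flags_obj and rejects unknown flag names."""
    for name, value in def_flags.items():
        if not hasattr(flags_obj, name):
            raise AttributeError("unknown flag: %s" % name)
        setattr(flags_obj, name, value)
    return flags_obj


# Placeholder path; the PR uses a DATA_DIR constant whose value is not shown.
DATA_DIR = "/tmp/cifar10_data"

def_flags = {
    "data_dir": DATA_DIR,
    "train_steps": 110,
    "log_steps": 10,
    "skip_eval": True,  # kept with the other defaults instead of being set separately
}

FLAGS = apply_defaults(FakeFlags(), def_flags)
```

Keeping `skip_eval` in the same dict as the other defaults means a benchmark method only needs to override the values that differ for that test.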

official/resnet/keras/keras_benchmark.py

lindong28 · 2019-01-28T20:05:55Z

Thanks for the PR!

@tfboyd I have two other PRs (tensorflow/benchmarks#292 and #6103) that change the interface of PerfZero and add support for benchmarks defined in leading_indicators_test.py. In particular, these PRs would replace oss_report_object with a dict, which would affect the implementation of keras_benchmark.py added in this patch.

If we merge this PR first, some of the new code added in this PR will soon be replaced and have to be tested again. I am wondering if we could discuss and merge the other PR first so that the new code uses the new API from the beginning.

tfboyd · 2019-01-28T22:55:13Z

@lindong28 I am going to go with this one first so I can get tests up and running. Your change impacts a bunch of tests, and they will all need to be changed. That is a good move forward, but we have to plan it out; we cannot block tests for refactoring. One option you might consider in the future is a FLAG for the execution path, where old tests use the old path and new tests use the new path. That is not needed for quick changes, but your PR is a refactor touching much more than just execution and will take time to sort out. Since this change impacts old tests, a flag is still likely a good way to avoid disruption, and is fun as that is how it is often done across Google. Then phase the flag out ASAP.
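The flag-gated execution path suggested above could look roughly like the following. This is only a sketch under stated assumptions: the flag name `--use_new_report_path`, the `OldReport` class, and the `report_old` / `report_new` helpers are all hypothetical and are not part of PerfZero or this PR. The old path returns a report object and the new path returns a dict, mirroring the object-to-dict change discussed in this thread.

```python
import argparse


class OldReport:
    """Hypothetical stand-in for the older report-object style."""

    def __init__(self, wall_time, top_1):
        self.wall_time = wall_time
        self.top_1 = top_1


def report_old(wall_time, top_1):
    """Old path: results are carried on an object."""
    return OldReport(wall_time, top_1)


def report_new(wall_time, top_1):
    """New path: results are carried in a plain dict."""
    return {"wall_time": wall_time, "top_1": top_1}


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag: off by default so existing tests keep the old path.
    parser.add_argument("--use_new_report_path", action="store_true")
    return parser


def report(args, wall_time, top_1):
    """Dispatches to the old or new path based on the flag."""
    if args.use_new_report_path:
        return report_new(wall_time, top_1)
    return report_old(wall_time, top_1)


old_args = build_parser().parse_args([])
new_args = build_parser().parse_args(["--use_new_report_path"])
```

Once every test passes the flag, the old branch and the flag itself can be deleted, which is the "phase the flag out ASAP" step.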

lindong28 · 2019-01-29T00:02:17Z

@tfboyd Sounds good. I was not sure when we need the tests defined in this PR. There is a tradeoff between timeliness for this PR and the amount of code change. Yes, let's go with this PR first since we want to run the tests soon.

Regarding the use of flags for switching between old tests and new tests: the flag reduces the coupling between repo versions at the cost of more code change (the extra logic for flags will eventually not be useful). Given that we are currently the only user of PerfZero and the oss_report_object defined in this PR, we should be able to commit changes to both repositories at around the same time without having to run different versions of these repos together. So I thought the benefits of flags may not be worth the extra code and effort.

Does this make sense? I am happy to update the PRs to use the flag if the current solution (i.e. committing changes for PerfZero and the benchmark classes run by PerfZero within the same hour) has issues.

lindong28

Thanks for the PR. Left some minor comments related to code style.

official/resnet/keras/keras_benchmark.py

official/resnet/keras/keras_cifar_benchmark.py


* Add resnet56 short tests. (#6101)
  * Add resnet56 short tests.
    - created base benchmark module
    - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it.
    - short tests are mostly copied from shining with oss refactor
  * Address feedback.
  * Move flag_methods to init
    - Address setting default flags repeatedly.
  * Rename accuracy tests.
  * Lint errors resolved.
  * fix model_dir set to flags.data_dir.
  * fixed not fulling pulling out flag_methods.
* Use core mirrored strategy in official models (#6126)
* Imagenet short tests (#6132)
  * Add short imagenet tests (taken from seemuch)
    - also rename to match go forward naming
  * fix method name
  * Update doc strings.
  * Fixe gpu number.
* points default data_dir to child folder. (#6131)
  Failed test is python2 and was a kokoro failure
* Imagenet short tests (#6136)
  * Add short imagenet tests (taken from seemuch)
    - also rename to match go forward naming
  * fix method name
  * Update doc strings.
  * Fixe gpu number.
  * Add fill_objects
  * fixed calling wrong class in super.
  * fix lint issue.
* Flag (#6121)
  * Fix the turn_off_ds flag problem
  * add param names to all args
* Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103)
  * Export benchmark stats using tf.test.Benchmark.report_benchmark()
  * Fix python style using pyformat
* Typos. (#6120)
* log verbosity=2 logs every epoch no progress bars (#6142)
* tf_upgrade_v2 on resnet and utils folder.
* tf_upgrade_v2 on resnet and utils folder.



Add resnet56 short tests.

4de3c19

- created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor

tfboyd requested review from seemuch, lindong28 and guptapriya January 27, 2019 16:07

tfboyd requested review from karmel and a team as code owners January 27, 2019 16:07

googlebot added the cla: yes label Jan 27, 2019

tfboyd removed request for a team and karmel January 27, 2019 16:07

guptapriya approved these changes Jan 28, 2019


lindong28 reviewed Jan 29, 2019


tfboyd added 5 commits January 29, 2019 15:19

Address feedback.

5669a75

Move flag_methods to init

2c1d7a5

- Address setting default flags repeatedly.

Rename accuracy tests.

6990462

Lint errors resolved.

5296b95

fix model_dir set to flags.data_dir.

23efabf

lindong28 approved these changes Jan 31, 2019


fixed not fulling pulling out flag_methods.

578a158

tfboyd merged commit 2519f29 into tensorflow:master Jan 31, 2019

tfboyd deleted the short_perf_tests branch June 25, 2019 15:36
