-
Notifications
You must be signed in to change notification settings - Fork 45.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add resnet56 short tests. #6101
Conversation
- created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor
def_flags['data_dir'] = DATA_DIR | ||
def_flags['train_steps'] = 110 | ||
def_flags['log_steps'] = 10 | ||
FLAGS.skip_eval = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, why both put it in def_flags and set it here?
Thanks for the PR! @tfboyd I have two other PRs (tensorflow/benchmarks#292 and #6103) that changed the interface of PerfZero and added support for benchmarks defined in leading_indicators_test.py. In particularly, these PRs would replaced If we merge this PR first, some the new code added in this PR will be replaced soon and tested again. I am wondering if we could discuss and merge the other PR first so that the new code will use the new API from the beginning. |
@lindong28 I am going to go with this one first so I can get tests up and running. Your change impacts a bunch of tests and they will all need changed. Which is a good move forward and we have to plan it out. We cannot block tests for refactoring. One option you might consider in the future is a FLAG for the execution path, where old tests use the old path and new test new path. Not needed for quick changes but your PR is a refactor touching much more than just execution and will take time to sort out. With this change impacting old tests a flag is still likely good to avoid disruption and is fun as that is how it is often done across Google. Then phase the flag out ASAP. |
@tfboyd Sounds good. I was not sure about when we need the tests defined in this PR. There is tradeoff between timeliness for this PR and the amount of code change. Yes let's go with this PR first since we want to run the test soon. Regarding the use of flags for switching between old tests and new tests, the flag reduces the coupling between repos versions at the cost of increased amount of code change (the extra logic/code for flags will eventually not be useful). Given that we are currently the only user of PerfZero and the oss_report_object defined in in this PR, we should be able to commit change for both repositories at around the same time without having to run different versions of these repo together. So I thought the benefits of flags may not worth the extra code/effort. Does this make sense? I am happy to update the PRs to use the flag if the current solution (i.e. commit change for PerfZero and the benchmark classes which are run by PerZero within the same hour) has issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Left some minor comments related to code style.
- Address setting default flags repeatedly.
* Add resnet56 short tests. (#6101) * Add resnet56 short tests. - created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor * Address feedback. * Move flag_methods to init - Address setting default flags repeatedly. * Rename accuracy tests. * Lint errors resolved. * fix model_dir set to flags.data_dir. * fixed not fulling pulling out flag_methods. * Use core mirrored strategy in official models (#6126) * Imagenet short tests (#6132) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fixe gpu number. * points default data_dir to child folder. (#6131) Failed test is python2 and was a kokoro failure * Imagenet short tests (#6136) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fixe gpu number. * Add fill_objects * fixed calling wrong class in super. * fix lint issue. * Flag (#6121) * Fix the turn_off_ds flag problem * add param names to all args * Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103) * Export benchmark stats using tf.test.Benchmark.report_benchmark() * Fix python style using pyformat * Typos. (#6120) * log verbosity=2 logs every epoch no progress bars (#6142) * tf_upgrade_v2 on resnet and utils folder. * tf_upgrade_v2 on resnet and utils folder.
* Add resnet56 short tests. - created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor * Address feedback. * Move flag_methods to init - Address setting default flags repeatedly. * Rename accuracy tests. * Lint errors resolved. * fix model_dir set to flags.data_dir. * fixed not fulling pulling out flag_methods.
* Add resnet56 short tests. (tensorflow#6101) * Add resnet56 short tests. - created base benchmark module - renamed accuracy test class to contain the word Accuracy which will result in a need to update all the jobs and a loss of history but is worth it. - short tests are mostly copied from shining with oss refactor * Address feedback. * Move flag_methods to init - Address setting default flags repeatedly. * Rename accuracy tests. * Lint errors resolved. * fix model_dir set to flags.data_dir. * fixed not fulling pulling out flag_methods. * Use core mirrored strategy in official models (tensorflow#6126) * Imagenet short tests (tensorflow#6132) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fixe gpu number. * points default data_dir to child folder. (tensorflow#6131) Failed test is python2 and was a kokoro failure * Imagenet short tests (tensorflow#6136) * Add short imagenet tests (taken from seemuch) - also rename to match go forward naming * fix method name * Update doc strings. * Fixe gpu number. * Add fill_objects * fixed calling wrong class in super. * fix lint issue. * Flag (tensorflow#6121) * Fix the turn_off_ds flag problem * add param names to all args * Export benchmark stats using tf.test.Benchmark.report_benchmark() (tensorflow#6103) * Export benchmark stats using tf.test.Benchmark.report_benchmark() * Fix python style using pyformat * Typos. (tensorflow#6120) * log verbosity=2 logs every epoch no progress bars (tensorflow#6142) * tf_upgrade_v2 on resnet and utils folder. * tf_upgrade_v2 on resnet and utils folder.
which will result in a need to update all the jobs
and a loss of history but is worth it.