Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add resnet56 short tests. #6101

Merged
merged 7 commits into from
Jan 31, 2019
Merged

Add resnet56 short tests. #6101

merged 7 commits into from
Jan 31, 2019

Conversation

tfboyd
Copy link
Member

@tfboyd tfboyd commented Jan 27, 2019

  • created base benchmark module
  • renamed accuracy test class to contain the word Accuracy
    which will result in a need to update all the jobs
    and a loss of history but is worth it.
  • short tests are mostly copied from shining with oss refactor

- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor
@tfboyd tfboyd requested review from karmel and a team as code owners January 27, 2019 16:07
@tfboyd tfboyd removed request for a team and karmel January 27, 2019 16:07
def_flags['data_dir'] = DATA_DIR
def_flags['train_steps'] = 110
def_flags['log_steps'] = 10
FLAGS.skip_eval = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, why both put it in def_flags and set it here?

official/resnet/keras/keras_benchmark.py Outdated Show resolved Hide resolved
@lindong28
Copy link
Contributor

lindong28 commented Jan 28, 2019

Thanks for the PR!

@tfboyd I have two other PRs (tensorflow/benchmarks#292 and #6103) that changed the interface of PerfZero and added support for benchmarks defined in leading_indicators_test.py. In particularly, these PRs would replaced oss_report_object with a dict and this would affect the implementation of keras_benchmark.py added in this patch.

If we merge this PR first, some the new code added in this PR will be replaced soon and tested again. I am wondering if we could discuss and merge the other PR first so that the new code will use the new API from the beginning.

@tfboyd
Copy link
Member Author

tfboyd commented Jan 28, 2019

@lindong28 I am going to go with this one first so I can get tests up and running. Your change impacts a bunch of tests and they will all need changed. Which is a good move forward and we have to plan it out. We cannot block tests for refactoring. One option you might consider in the future is a FLAG for the execution path, where old tests use the old path and new test new path. Not needed for quick changes but your PR is a refactor touching much more than just execution and will take time to sort out. With this change impacting old tests a flag is still likely good to avoid disruption and is fun as that is how it is often done across Google. Then phase the flag out ASAP.

@lindong28
Copy link
Contributor

lindong28 commented Jan 29, 2019

@tfboyd Sounds good. I was not sure about when we need the tests defined in this PR. There is tradeoff between timeliness for this PR and the amount of code change. Yes let's go with this PR first since we want to run the test soon.

Regarding the use of flags for switching between old tests and new tests, the flag reduces the coupling between repos versions at the cost of increased amount of code change (the extra logic/code for flags will eventually not be useful). Given that we are currently the only user of PerfZero and the oss_report_object defined in in this PR, we should be able to commit change for both repositories at around the same time without having to run different versions of these repo together. So I thought the benefits of flags may not worth the extra code/effort.

Does this make sense? I am happy to update the PRs to use the flag if the current solution (i.e. commit change for PerfZero and the benchmark classes which are run by PerZero within the same hour) has issues.

Copy link
Contributor

@lindong28 lindong28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Left some minor comments related to code style.

official/resnet/keras/keras_benchmark.py Outdated Show resolved Hide resolved
official/resnet/keras/keras_benchmark.py Outdated Show resolved Hide resolved
official/resnet/keras/keras_benchmark.py Outdated Show resolved Hide resolved
official/resnet/keras/keras_benchmark.py Outdated Show resolved Hide resolved
official/resnet/keras/keras_cifar_benchmark.py Outdated Show resolved Hide resolved
@tfboyd tfboyd merged commit 2519f29 into tensorflow:master Jan 31, 2019
tfboyd pushed a commit that referenced this pull request Feb 5, 2019
* Add resnet56 short tests. (#6101)

* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fulling pulling out flag_methods.

* Use core mirrored strategy in official models (#6126)

* Imagenet short tests (#6132)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* points default data_dir to child folder. (#6131)

Failed test is python2  and was a kokoro failure

* Imagenet short tests (#6136)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* Add fill_objects

* fixed calling wrong class in super.

* fix lint issue.

* Flag (#6121)

* Fix the turn_off_ds flag problem

* add param names to all args

* Export benchmark stats using tf.test.Benchmark.report_benchmark() (#6103)

* Export benchmark stats using tf.test.Benchmark.report_benchmark()

* Fix python style using pyformat

* Typos. (#6120)

* log verbosity=2 logs every epoch no progress bars (#6142)

* tf_upgrade_v2 on resnet and utils folder.

* tf_upgrade_v2 on resnet and utils folder.
w4-jonghoon pushed a commit to w4-jonghoon/models that referenced this pull request Feb 6, 2019
* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fulling pulling out flag_methods.
w4-jonghoon pushed a commit to w4-jonghoon/models that referenced this pull request Feb 6, 2019
* Add resnet56 short tests. (tensorflow#6101)

* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fulling pulling out flag_methods.

* Use core mirrored strategy in official models (tensorflow#6126)

* Imagenet short tests (tensorflow#6132)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* points default data_dir to child folder. (tensorflow#6131)

Failed test is python2  and was a kokoro failure

* Imagenet short tests (tensorflow#6136)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* Add fill_objects

* fixed calling wrong class in super.

* fix lint issue.

* Flag (tensorflow#6121)

* Fix the turn_off_ds flag problem

* add param names to all args

* Export benchmark stats using tf.test.Benchmark.report_benchmark() (tensorflow#6103)

* Export benchmark stats using tf.test.Benchmark.report_benchmark()

* Fix python style using pyformat

* Typos. (tensorflow#6120)

* log verbosity=2 logs every epoch no progress bars (tensorflow#6142)

* tf_upgrade_v2 on resnet and utils folder.

* tf_upgrade_v2 on resnet and utils folder.
@tfboyd tfboyd deleted the short_perf_tests branch June 25, 2019 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants