Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tf_upgrade_v2 on resnet and utils folders. #6154

Merged
merged 15 commits into from
Feb 5, 2019
Merged

tf_upgrade_v2 on resnet and utils folders. #6154

merged 15 commits into from
Feb 5, 2019

Conversation

goldiegadde
Copy link
Contributor

No description provided.

tfboyd and others added 10 commits February 4, 2019 12:15
* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fulling pulling out flag_methods.
* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.
Failed test is python2  and was a kokoro failure
* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* Add fill_objects

* fixed calling wrong class in super.

* fix lint issue.
* Fix the turn_off_ds flag problem

* add param names to all args
)

* Export benchmark stats using tf.test.Benchmark.report_benchmark()

* Fix python style using pyformat
@goldiegadde goldiegadde requested review from karmel and a team as code owners February 4, 2019 21:57
@tfboyd tfboyd requested review from tfboyd and removed request for a team and karmel February 4, 2019 22:12
@tfboyd
Copy link
Member

tfboyd commented Feb 5, 2019

@guptapriya This is LGTM to me and I think we power through the lint errors separately as the majority are over 80 charcters per line issues die to compat.v1. This also bit rots really fast. What do you think? I ran some tests and TF 2.0 worked. If the TF 1.0 CIFAR-10 test that @goldiegadde did was 92.6 to 93% then I say we go with it.

@goldiegadde
Copy link
Contributor Author

@guptapriya This is LGTM to me and I think we power through the lint errors separately as the majority are over 80 charcters per line issues die to compat.v1. This also bit rots really fast. What do you think? I ran some tests and TF 2.0 worked. If the TF 1.0 CIFAR-10 test that @goldiegadde did was 92.6 to 93% then I say we go with it.

@tfboyd @guptapriya
Updated the PR to fix all the lint errors. I did run the cifar benchmark tests, will update the result tomorrow morning.

Copy link
Contributor

@guptapriya guptapriya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments, but most likely we will want to address these in subsequent PRs.

@@ -60,4 +60,4 @@ def _progress(count, block_size, total_size):

if __name__ == '__main__':
FLAGS, unparsed = parser.parse_known_args()
tf.app.run(argv=[sys.argv[0]] + unparsed)
tf.compat.v1.app.run(argv=[sys.argv[0]] + unparsed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we use absl`s app.run here? (simiar to cifar10_main.py)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ask and then see if they can fix it. We can also manually change it.

@@ -273,6 +274,6 @@ def main(_):


if __name__ == '__main__':
tf.logging.set_verbosity(tf.logging.INFO)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was checking the docs and I think we are supposed to use absl for logging in 2.0?
https://github.com/tensorflow/docs/blob/master/site/en/r2/guide/effective_tf2.md#api-cleanup

Looks like the upgrade script doesn't do this automatically. Can we do this in a subsequent PR perhaps?

from absl import logging

logging.info(...)
logging.error(...)

https://abseil.io/docs/python/guides/logging

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is on our list. I was shocked that no logging moves forward in TF 2.0 without a rewrite.

@@ -81,7 +81,7 @@ def parse_record_keras(raw_record, is_training, dtype):
Tuple with processed image tensor and one-hot-encoded label tensor.
"""
image, label = cifar_main.parse_record(raw_record, is_training, dtype)
label = tf.sparse_to_dense(label, (cifar_main.NUM_CLASSES,), 1)
label = tf.compat.v1.sparse_to_dense(label, (cifar_main.NUM_CLASSES,), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you file a ticket to fix this? We should not need to do this anyway and it might be better to just figure out the right way to do this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to go back and file tickets on anything compat.v1 in the keras code path.

@@ -98,7 +98,7 @@ def run(flags_obj):
Dictionary of training and eval stats.
"""
if flags_obj.enable_eager:
tf.enable_eager_execution()
tf.compat.v1.enable_eager_execution()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think maybe we should also add an else clause here for the opposite case so it can work in 2.0 too?

else:
  tf.compat.v1.disable_eager_execution()

wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will make a 2.0 branch as some things are in 2.0 that cannot be done in 1.0 like saving a model in 2.0 format, the call just does not exist.

@@ -88,7 +88,7 @@ def run(flags_obj):
ValueError: If fp16 is passed as it is not currently supported.
"""
if flags_obj.enable_eager:
tf.enable_eager_execution()
tf.compat.v1.enable_eager_execution()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here re: disable eager in else case

@@ -72,10 +72,10 @@ def _batch_norm_ops(self, test=False):

g = tf.Graph()
with g.as_default():
tf.set_random_seed(self.name_to_seed(name))
input_tensor = tf.get_variable(
tf.compat.v1.set_random_seed(self.name_to_seed(name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tf.set_random_seed(self.name_to_seed(name))
input_tensor = tf.get_variable(
tf.compat.v1.set_random_seed(self.name_to_seed(name))
input_tensor = tf.compat.v1.get_variable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is get_variable supposed to be just tf.Variable?

@@ -88,7 +88,7 @@ def process_record_dataset(dataset,

# Parses the raw records into images and labels.
dataset = dataset.apply(
tf.contrib.data.map_and_batch(
tf.data.experimental.map_and_batch(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. yeah this file will need a lot of work to make it 2.0 friendly

targets=labels,
k=5,
name='top_5_op'))
accuracy = tf.compat.v1.metrics.accuracy(labels, predictions['classes'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe most of these metrics and summaries can use their newer versions, we just need to do it carefully and make sure everything works fine.

@tfboyd
Copy link
Member

tfboyd commented Feb 5, 2019

I suspect we may (maybe maybe not) need some copybara changes but we can deal with that when we try to import to Google3 post merge. Maybe we get lucky and all it works out.

@tfboyd
Copy link
Member

tfboyd commented Feb 5, 2019

Merging now to avoid future conflicts and we have all agreed to address the comments with bugs as needed over the next 48 hours.

@tfboyd tfboyd merged commit d6b2b83 into tensorflow:master Feb 5, 2019
w4-jonghoon pushed a commit to w4-jonghoon/models that referenced this pull request Feb 6, 2019
* Add resnet56 short tests. (tensorflow#6101)

* Add resnet56 short tests.
- created base benchmark module
- renamed accuracy test class to contain the word Accuracy
which will result in a need to update all the jobs
and a loss of history but is worth it.
- short tests are mostly copied from shining with oss refactor

* Address feedback.

* Move flag_methods to init
- Address setting default flags repeatedly.

* Rename accuracy tests.

* Lint errors resolved.

* fix model_dir set to flags.data_dir.

* fixed not fulling pulling out flag_methods.

* Use core mirrored strategy in official models (tensorflow#6126)

* Imagenet short tests (tensorflow#6132)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* points default data_dir to child folder. (tensorflow#6131)

Failed test is python2  and was a kokoro failure

* Imagenet short tests (tensorflow#6136)

* Add short imagenet tests (taken from seemuch)
- also rename to match go forward naming

* fix method name

* Update doc strings.

* Fixe gpu number.

* Add fill_objects

* fixed calling wrong class in super.

* fix lint issue.

* Flag (tensorflow#6121)

* Fix the turn_off_ds flag problem

* add param names to all args

* Export benchmark stats using tf.test.Benchmark.report_benchmark() (tensorflow#6103)

* Export benchmark stats using tf.test.Benchmark.report_benchmark()

* Fix python style using pyformat

* Typos. (tensorflow#6120)

* log verbosity=2 logs every epoch no progress bars (tensorflow#6142)

* tf_upgrade_v2 on resnet and utils folder.

* tf_upgrade_v2 on resnet and utils folder.
@goldiegadde goldiegadde deleted the gadde-v2-convert branch February 6, 2019 19:18
tfboyd pushed a commit that referenced this pull request Feb 6, 2019
tfboyd pushed a commit that referenced this pull request Feb 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants