DOI: 10.5555/3026877.3026899

TensorFlow: a system for large-scale machine learning

Published: 02 November 2016

Abstract

TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
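As an illustration of the dataflow model the abstract describes (not an excerpt from the paper), the following minimal sketch uses the original TensorFlow 1.x Python API: a single graph holds stateless computation nodes alongside mutable shared state (a tf.Variable), nodes can be pinned to devices, and training runs operations that mutate that state. The variable names and toy data are hypothetical.

```python
# Minimal sketch (assumes TensorFlow 1.x) of a dataflow graph that mixes
# stateless ops with mutable shared state, as described in the abstract.
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    with tf.device("/cpu:0"):                      # nodes can be pinned to a device
        x = tf.placeholder(tf.float32, [None, 4])  # inputs fed at run time
        y = tf.placeholder(tf.float32, [None, 1])
        w = tf.Variable(tf.zeros([4, 1]))          # shared, mutable state in the graph
    pred = tf.matmul(x, w)                         # stateless dataflow node
    loss = tf.reduce_mean(tf.square(pred - y))
    # minimize() adds gradient nodes plus ops that update (mutate) `w` in place.
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]], y: [[10.0]]})
    print(sess.run(w))                             # the mutated state
```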

Published In

OSDI'16: Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation
November 2016
786 pages
ISBN: 9781931971331

Sponsors

• VMware
• NetApp
• Google Inc.
• Microsoft
• Facebook

Publisher

USENIX Association
United States
