Static Analysis of Shape in TensorFlow Programs

The Theano Development Team, Rami Al‐Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Fırat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, M. Graham, Çaǧlar Gülçehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrançois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, M. Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, Siva Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder
2020 arXiv (Cornell University) 1,669 citations

Abstract

Machine learning has been widely adopted in diverse science and engineering domains, aided by reusable libraries and quick development patterns. The TensorFlow library is probably the best-known representative of this trend, and most users employ the Python API to its powerful back-end. TensorFlow programs are susceptible to several systematic errors, especially in the dynamic typing setting of Python. We present Pythia, a static analysis that tracks the shapes of tensors across Python library calls and warns of several possible mismatches. The key technical aspects are a close modeling of library semantics with respect to tensor shape, and an identification of violations and error-prone patterns. Pythia is powerful enough to statically detect (with 84.62% precision) 11 of the 14 shape-related TensorFlow bugs in the recent Zhang et al. empirical study, an independent slice of real-world bugs.
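To make the idea of tracking tensor shapes across calls concrete, here is a minimal illustrative sketch in plain Python. It is not Pythia's implementation and does not use TensorFlow: the names `Shape`, `ShapeMismatch`, `matmul_shape`, and `check_program` are hypothetical, and the only operation modeled is a rank-2 matrix product. An unknown dimension (such as a batch size) is represented as `None`, and a mismatch is reported only when both dimensions involved are statically known.

```python
from typing import List, Optional, Tuple

# A shape is a tuple of dimensions; None marks a dimension unknown statically.
Shape = Tuple[Optional[int], ...]


class ShapeMismatch(Exception):
    """Raised when two shapes are provably incompatible."""


def matmul_shape(a: Shape, b: Shape) -> Shape:
    """Return the result shape of a rank-2 matrix product, or raise ShapeMismatch."""
    if len(a) != 2 or len(b) != 2:
        raise ShapeMismatch(f"expected rank-2 operands, got {a} and {b}")
    inner_a, inner_b = a[1], b[0]
    # Report an error only when both inner dimensions are known and differ.
    if inner_a is not None and inner_b is not None and inner_a != inner_b:
        raise ShapeMismatch(f"inner dimensions differ: {a} x {b}")
    return (a[0], b[1])


def check_program() -> List[str]:
    """Propagate shapes through a tiny hypothetical two-layer model."""
    warnings: List[str] = []
    x: Shape = (None, 784)     # batch of flattened inputs, batch size unknown
    w1: Shape = (784, 128)
    w2: Shape = (64, 10)       # deliberate bug: first dimension should be 128
    h = matmul_shape(x, w1)    # shape (None, 128)
    try:
        matmul_shape(h, w2)
    except ShapeMismatch as err:
        warnings.append(str(err))
    return warnings
```

Calling `check_program()` returns a single warning for the second product, without ever executing a tensor operation. A real analysis would have to model many more library operations and handle Python's dynamic features, which this sketch ignores.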

Keywords

Python (programming language), Computer science, Compiler, Computation, Section (typography), Principal (computer security), Artificial intelligence, Software, Programming language, Computational science, Software engineering, Machine learning, Operating system

Publication Info

Year: 2020
Type: preprint
Citations: 1669
Access: Closed

Citation Metrics

1669 (OpenAlex)

Cite This

The Theano Development Team, Rami Al‐Rfou, Guillaume Alain et al. (2020). Static Analysis of Shape in TensorFlow Programs. arXiv (Cornell University). https://doi.org/10.4230/lipics.ecoop.2020.15

Identifiers

DOI: 10.4230/lipics.ecoop.2020.15
