### When the state of the art is ahead of the state of understanding: Unintuitive properties of deep neural networks

#### Abstract

Deep learning is an undeniably hot topic, not only in academia and industry but also in society and the media. The reasons for its popularity are manifold: unprecedented availability of data and computing power, some innovative methodologies, minor but significant technical tricks, etc. Interestingly, however, the current success and practice of deep learning seem to be uncorrelated with its theoretical, more formal understanding. As a result, the state of the art in deep learning presents a number of unintuitive properties or situations. In this note, I highlight some of these unintuitive properties, point to relevant recent work, and argue for the need to gain insight into them, by either formal or more empirical means.


Texts in the journal are, unless otherwise indicated, published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.