The Unreasonable Effectiveness of Deep Learning
This essay consists of summaries, explanations, and discussions of several papers which provide high-level arguments and intuitions about why, conceptually, deep learning works. Particular areas of investigation are "Which classes of functions can deep neural nets approximate well in principle?"; "Why can they quickly learn functions which have very small training loss?"; and "Why do the functions they learn generalise so well?". Understanding deep learning requires rethinking generalisation - Zhang, Bengio, Hardt, Recht and Vinyals, 2017 Machine learning in general is about identifying functions in high-dimensional spaces based on finitely many samples from them. In doing so, we navigate between two potential errors: learning a function which is too simple to capture most of the variation in our data (underfitting) and learning a function which matches the data points well, but doesn't generalise (overfitting). Underfitting implies higher train