{"version":1,"type":"rich","provider_name":"Libsyn","provider_url":"https:\/\/www.libsyn.com","height":90,"width":600,"title":"31 - Singular Learning Theory with Daniel Murfet","description":"What's going on with deep learning? What sorts of models get learned, and what are the learning dynamics? Singular learning theory is a theory of Bayesian statistics broad enough in scope to encompass deep neural networks that may help answer these questions. In this episode, I speak with Daniel Murfet about this research program and what it tells us. Patreon: patreon.com\/axrpodcast Ko-fi: ko-fi.com\/axrpodcast Topics we discuss, and timestamps: 0:00:26 - What is singular learning theory? 0:16:00 - Phase transitions 0:35:12 - Estimating the local learning coefficient 0:44:37 - Singular learning theory and generalization 1:00:39 - Singular learning theory vs other deep learning theory 1:17:06 - How singular learning theory hit AI alignment 1:33:12 - Payoffs of singular learning theory for AI alignment 1:59:36 - Does singular learning theory advance AI capabilities? 2:13:02 - Open problems in singular learning theory for AI alignment 2:20:53 - What is the singular fluctuation? 
2:25:33 - How geometry relates to information 2:30:13 - Following Daniel Murfet's work The transcript: https:\/\/axrp.net\/episode\/2024\/05\/07\/episode-31-singular-learning-theory-dan-murfet.html Daniel Murfet's twitter\/X account: https:\/\/twitter.com\/danielmurfet Developmental interpretability website: https:\/\/devinterp.com Developmental interpretability YouTube channel: https:\/\/www.youtube.com\/@Devinterp Main research discussed in this episode: - Developmental Landscape of In-Context Learning: https:\/\/arxiv.org\/abs\/2402.02364 - Estimating the Local Learning Coefficient at Scale: https:\/\/arxiv.org\/abs\/2402.03698 - Simple versus Short: Higher-order degeneracy and error-correction: https:\/\/www.lesswrong.com\/posts\/nWRj6Ey8e5siAEXbK\/simple-versus-short-higher-order-degeneracy-and-error-1 Other links: - Algebraic Geometry and Statistical Learning Theory (the grey book): https:\/\/www.cambridge.org\/core\/books\/algebraic-geometry-and-statistical-learning-theory\/9C8FD1BDC817E2FC79117C7F41544A3A - Mathematical Theory of Bayesian Statistics (the green book): https:\/\/www.routledge.com\/Mathematical-Theory-of-Bayesian-Statistics\/Watanabe\/p\/book\/9780367734817 - In-context learning and induction heads: https:\/\/transformer-circuits.pub\/2022\/in-context-learning-and-induction-heads\/index.html - Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity: https:\/\/arxiv.org\/abs\/2106.15933 - A mathematical theory of semantic development in deep neural networks: https:\/\/www.pnas.org\/doi\/abs\/10.1073\/pnas.1820226116 - Consideration on the Learning Efficiency Of Multiple-Layered Neural Networks with Linear Units: https:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=4404877 - Neural Tangent Kernel: Convergence and Generalization in Neural Networks: https:\/\/arxiv.org\/abs\/1806.07572 - The 
Interpolating Information Criterion for Overparameterized Models: https:\/\/arxiv.org\/abs\/2307.07785 - Feature Learning in Infinite-Width Neural Networks: https:\/\/arxiv.org\/abs\/2011.14522 - A central AI alignment problem: capabilities generalization, and the sharp left turn: https:\/\/www.lesswrong.com\/posts\/GNhMPAWcfBCASy8e6\/a-central-ai-alignment-problem-capabilities-generalization - Quantifying degeneracy in singular models via the learning coefficient: https:\/\/arxiv.org\/abs\/2308.12108 Episode art by Hamish Doodles: hamishdoodles.com","author_name":"AXRP - the AI X-risk Research Podcast","author_url":"https:\/\/axrp.net","html":"<iframe title=\"Libsyn Player\" style=\"border: none\" src=\"\/\/html5-player.libsyn.com\/embed\/episode\/id\/31169122\/height\/90\/theme\/custom\/thumbnail\/yes\/direction\/forward\/render-playlist\/no\/custom-color\/88AA3C\/\" height=\"90\" width=\"600\" scrolling=\"no\"  allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen><\/iframe>","thumbnail_url":"https:\/\/assets.libsyn.com\/secure\/content\/171708452"}