Key people: Emin Orhan, Vaibhav Gupta, Wai Keen Vong, and Kanishk Gandhi
Young children have wide-ranging and sophisticated knowledge of the world. What is the origin of this early knowledge? How much can be explained through generic learning mechanisms applied to sensory data, and how much requires more substantive innate inductive biases? We examine these nature vs. nurture questions by training large-scale neural networks through the eyes of a single developing child, using longitudinal baby headcam videos (see recent dataset from Sullivan et al., 2020).
Our results so far show how high-level visual representations emerge from a subset of one baby’s experience, through only self-supervised learning. Our ongoing work is investigating whether basic concepts of objects and agents, and simple predictive models of the world, can also be learned via similar generic learning mechanisms.
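To make "self-supervised" concrete, here is a minimal sketch of one way a training signal can be derived from headcam video without human labels: each frame is labeled by the fixed-length time segment it falls in. This illustrates the general idea only and is not a description of the published pipeline. The function name `temporal_pseudo_labels` and the parameter `segment_seconds` are hypothetical.

```python
from typing import List


def temporal_pseudo_labels(
    timestamps: List[float], segment_seconds: float
) -> List[int]:
    """Assign each frame the index of the fixed-length time segment it falls in.

    The labels come from the video's own timeline, so no human annotation
    is needed. `segment_seconds` is a hypothetical hyperparameter.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    start = min(timestamps)
    return [int((t - start) // segment_seconds) for t in timestamps]


# Usage: six frames sampled over 25 seconds, grouped into 10-second segments.
frames = [0.0, 4.0, 9.5, 10.0, 19.9, 25.0]
labels = temporal_pseudo_labels(frames, segment_seconds=10.0)
print(list(zip(frames, labels)))
```

A network trained to predict these pseudo-labels from pixels has to pick up on visual features that stay stable within a segment, which is one route to learning representations from raw experience alone.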
Orhan, A. E., Gupta, V. B., and Lake, B. M. (2020). Self-supervised learning through the eyes of a child. Advances in Neural Information Processing Systems 33. [Supporting Info] [Code and pre-trained models] [New Scientist article]
Davidson, G. and Lake, B. M. (2021). Examining Infant Relation Categorization Through Deep Neural Networks. In Proceedings of the 43rd Annual Conference of the Cognitive Science Society.
Key people: Reuben Feinman
Human conceptual representations are rich in structural and statistical knowledge. Symbolic models excel at capturing compositional and causal structure, but they struggle to model the most complex correlations found in raw data. In contrast, neural network models excel at processing raw stimuli and capturing complex statistics, but they struggle to model compositional and causal knowledge. The human mind seems to transcend this dichotomy: learning structural and statistical knowledge from raw inputs.
We are developing neuro-symbolic models that learn compositional and causal generative programs from raw data, while using neural sub-routines for powerful statistical modeling (see diagram). We aim to better understand the dual structural and statistical natures of human concepts, and to learn neuro-symbolic representations for machine learning applications.
Feinman, R. and Lake, B. M. (2020). Learning Task-General Representations with Generative Neuro-Symbolic Modeling. International Conference on Learning Representations (ICLR).
Feinman, R. and Lake, B. M. (2020). Generating new concepts with hybrid neuro-symbolic models. In Proceedings of the 42nd Annual Conference of the Cognitive Science Society. [Supporting Info]
Key people: Yanli Zhou, Laura Ruis, Max Nye, and Marco Baroni
People make compositional generalizations in language, thought, and action. Once a person learns how to “photobomb,” she immediately understands how to “photobomb twice” or “photobomb vigorously.” We have shown that, despite recent advances in natural language processing, the best algorithms fail catastrophically on tests of compositionality.
To better understand these distinctively human abilities, we are studying human compositional learning of language-like instructions. Based on behavioral insights, we are developing novel meta-learning and neural-symbolic models to tackle popular compositional learning benchmarks. Additional work focuses on learning compositional visual concepts and developing more challenging benchmarks for AI, e.g., few-shot learning of concepts such as “cautiously” (see image of “walking to the small red circle cautiously,” which requires looking both ways before moving).
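As a toy illustration of what compositional generalization asks for, the sketch below interprets language-like instructions by combining a verb with repetition modifiers. The vocabulary, the action names, and the `interpret` function are hypothetical and are not taken from any of the benchmarks or models described here. The point is only that a system with the right compositional structure can use a newly learned primitive with every known modifier at once.

```python
from typing import Dict, List

# Hypothetical toy vocabulary; not taken from any specific benchmark.
PRIMITIVES: Dict[str, List[str]] = {
    "walk": ["WALK"],
    "jump": ["JUMP"],
}
REPEATS: Dict[str, int] = {"twice": 2, "thrice": 3}


def interpret(command: str, primitives: Dict[str, List[str]]) -> List[str]:
    """Map an instruction such as 'jump twice and walk' to an action sequence."""
    actions: List[str] = []
    for clause in command.split(" and "):
        words = clause.split()
        if not words or words[0] not in primitives:
            raise ValueError(f"unknown instruction: {clause!r}")
        verb, modifiers = words[0], words[1:]
        count = 1
        for modifier in modifiers:
            if modifier not in REPEATS:
                raise ValueError(f"unknown modifier: {modifier!r}")
            count *= REPEATS[modifier]
        actions.extend(primitives[verb] * count)
    return actions


# Learning one new primitive is enough to use it with every known modifier.
vocab = dict(PRIMITIVES, photobomb=["PHOTOBOMB"])
print(interpret("photobomb twice", vocab))
print(interpret("walk and photobomb thrice", vocab))
```

Here the rules are written by hand. The research question is whether a learner can acquire such rules from a few examples and apply them to new combinations.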
Ruis, L., Andreas, J., Baroni, M., Bouchacourt, D., and Lake, B. M. (2020). A Benchmark for Systematic Generalization in Grounded Language Understanding. Advances in Neural Information Processing Systems 33.
Nye, M., Solar-Lezama, A., Tenenbaum, J. B., and Lake, B. M. (2020). Learning Compositional Rules via Neural Program Synthesis. Advances in Neural Information Processing Systems 33.
Key people: Guy Davidson
Video games are a powerful tool for comparing human and machine learning. Although the best algorithms outscore people on many games, they require hundreds of hours of experience to learn a new game, while people need just a few minutes. The experience gap is only widening: OpenAI recently trained their Dota 2 bot for 45,000 years' worth of game experience. Our hypothesis is that key cognitive ingredients (objects, agents, compositionality, and causality) are missing from contemporary AI systems, and that this absence is holding these systems back.
To evaluate this hypothesis, we are incorporating cognitive ingredients into deep reinforcement learning (RL) algorithms and evaluating their performance. In the “Frostbite challenge,” we have found that adding object masks leads to higher scores and better generalization to novel test scenarios: An agent surrounded by crabs now knows it’s toast! (see image). Ongoing work is studying the importance of agents, compositionality, and causality.
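One simple way to picture "adding object masks" is as extra input channels stacked alongside the raw frame. The sketch below assumes a small frame whose pixel values are integer object IDs, and a hypothetical mapping from category names to IDs. Real game frames would need a segmentation step to produce such masks. The function names and the toy frame are illustrative and do not describe the implementation used in this work.

```python
from typing import Dict, List, Set

Grid = List[List[int]]


def object_masks(frame: Grid, categories: Dict[str, Set[int]]) -> Dict[str, Grid]:
    """Return one binary mask per category: 1 where the pixel's ID belongs to it."""
    return {
        name: [[1 if pixel in ids else 0 for pixel in row] for row in frame]
        for name, ids in categories.items()
    }


def augment_observation(frame: Grid, categories: Dict[str, Set[int]]) -> List[Grid]:
    """Stack the raw frame with its masks as channels: [frame, mask_1, mask_2, ...]."""
    masks = object_masks(frame, categories)
    return [frame] + [masks[name] for name in categories]


# Toy 3x3 frame (assumed encoding): 0 = background, 1 = player, 2 = crab.
frame = [
    [0, 2, 0],
    [2, 1, 2],
    [0, 2, 0],
]
categories = {"player": {1}, "crab": {2}}
observation = augment_observation(frame, categories)
for channel in observation:
    print(channel)
```

With the masks as separate channels, the agent's network no longer has to discover from pixels alone which regions are the player and which are threats.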
Davidson, G. and Lake, B. M. (2020). Investigating simple object representations in model-free deep reinforcement learning. In Proceedings of the 42nd Annual Conference of the Cognitive Science Society.
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., and Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and Brain Sciences, 40, E253.