Read about our current projects on learning through the eyes of a child, concept learning, compositional generalization, video game learning, and question asking.

Learning through the eyes of a child

Key people: Emin Orhan, Vaibhav Gupta, Wai Keen Vong, and Kanishk Gandhi

Young children have wide-ranging and sophisticated knowledge of the world. What is the origin of this early knowledge? How much can be explained through generic learning mechanisms applied to sensory data, and how much requires more substantive innate inductive biases? We examine these nature vs. nurture questions by training large-scale neural networks through the eyes of a single developing child, using longitudinal baby headcam videos (see recent dataset from Sullivan et al., 2020).

Our results so far show how high-level visual representations emerge from a subset of one baby’s experience, through only self-supervised learning. Our ongoing work is investigating whether basic principles of objects and agents, and simple predictive models of the world, can also be learned via similar generic learning mechanisms.

Concept learning in minds and machines

Key people: Reuben Feinman

Human conceptual representations are rich in structural and statistical knowledge. Symbolic models excel at capturing compositional and causal structure, but they struggle to model the most complex correlations found in raw data. In contrast, neural network models excel at processing raw stimuli and capturing complex statistics, but they struggle to model compositional and causal knowledge. The human mind seems to transcend this dichotomy: learning structural and statistical knowledge from raw inputs.

We are developing neuro-symbolic models that learn compositional and causal generative programs from raw data, while using neural sub-routines for powerful statistical modeling (see diagram). We aim to better understand the dual structural and statistical natures of human concepts, and to learn neuro-symbolic representations for machine learning applications.

Compositional generalization in minds and machines

Key people: Yanli Zhou, Laura Ruis, Max Nye, and Marco Baroni

People make compositional generalizations in language, thought, and action. Once a person learns how to “photobomb,” she immediately understands how to “photobomb twice” or “photobomb vigorously.” We have shown that, despite recent advances in natural language processing, the best algorithms fail catastrophically on tests of compositionality.

To better understand these distinctively human abilities, we are studying human compositional learning of language-like instructions. Based on behavioral insights, we are developing novel meta-learning and neuro-symbolic models to tackle popular compositional learning benchmarks. Additional work focuses on learning compositional visual concepts and developing more challenging benchmarks for AI, e.g., few-shot learning of concepts such as “cautiously” (see image of “walking to the small red circle cautiously,” which requires looking both ways before moving).
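The “photobomb twice” example above can be made concrete with a toy interpreter. This is only an illustrative sketch, not one of our models or benchmarks: the vocabulary, the action names, and the `interpret` function are all hypothetical. It shows what a perfectly compositional learner gets for free: once a new verb enters the lexicon, every modified form of it is understood without further training.

```python
# Hypothetical toy lexicon; not drawn from any specific benchmark.
PRIMITIVES = {"walk": ["WALK"], "jump": ["JUMP"]}

# Each modifier is a function that transforms a sequence of actions.
MODIFIERS = {
    "twice": lambda actions: actions * 2,
    "thrice": lambda actions: actions * 3,
}


def interpret(instruction, primitives):
    """Map an instruction such as 'jump twice' to a list of actions."""
    verb, *modifiers = instruction.split()
    actions = list(primitives[verb])
    # Apply modifiers left to right to the verb's base meaning.
    for modifier in modifiers:
        actions = MODIFIERS[modifier](actions)
    return actions


# Learning a single new primitive...
lexicon = {**PRIMITIVES, "photobomb": ["PHOTOBOMB"]}

# ...makes every modified form interpretable with no further examples.
photobomb_twice = interpret("photobomb twice", lexicon)
photobomb_thrice = interpret("photobomb thrice", lexicon)
```

Standard sequence-to-sequence networks have no such built-in separation between verbs and modifiers, which is one way to think about why they can fail on held-out combinations.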

Video game learning in minds and machines

Key people: Guy Davidson

Video games are ideal for comparing human and machine learning. Although the best algorithms outscore people on many games, they require hundreds of hours of experience to learn a new game, while people need just a few minutes. The experience gap is only widening: OpenAI recently trained their Dota 2 bot for 45,000 years’ worth of game experience. Our hypothesis is that key cognitive ingredients (objects, agents, compositionality, and causality) are missing from contemporary AI systems, and that this absence is holding these systems back.

To evaluate this hypothesis, we are incorporating cognitive ingredients into deep reinforcement learning (RL) algorithms and measuring their performance. In the “Frostbite challenge,” we have found that adding object masks leads to higher scores and better generalization to novel test scenarios: an agent surrounded by crabs now knows it’s toast (see image). Ongoing work is studying the importance of agents, compositionality, and causality.

Question asking in minds and machines

Key people: Anselm Rothe, Ziyun Wang, and Todd Gureckis

People learn by asking rich and creative questions. A child learning about animals might ask, “What does a lemur look like? Do all birds fly?” In contrast, an active learning algorithm repeatedly asks the same question, “What is the category label of this image?” We want to understand the computational basis of human question asking.

Our studies focus on domains amenable to ideal Bayesian modeling, such as a variant of “Battleship” where people can ask natural language questions to find hidden ships, e.g., “Do all three ships have the same size?” (see image). We find that although people do not ask optimal questions, they are excellent judges of question value. Computationally, we view question asking as program generation. Symbolic programs provide a compositional “language of thought” for questions, while probabilistic modeling captures which questions are likely in a given context. Ongoing work is studying neuro-symbolic models for faster inference and more domain-general question asking.
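As a rough illustration of treating questions as programs and scoring their value, the sketch below represents each question as a small function from a hypothesis to an answer and ranks questions by expected information gain. Everything here is an assumption made for illustration: the toy hypothesis space (three ships, each 2 to 4 tiles long, uniform prior), the example questions, and the choice of expected information gain as the measure of question value.

```python
import math
from collections import defaultdict
from itertools import product

# Hypothetical toy hypothesis space: the sizes of three hidden ships,
# each 2 to 4 tiles long, with a uniform prior over all combinations.
HYPOTHESES = list(product([2, 3, 4], repeat=3))
PRIOR = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}

# Each question is a program that maps a hypothesis to an answer.
QUESTIONS = {
    "Do all three ships have the same size?": lambda h: len(set(h)) == 1,
    "How many tiles is the first ship?": lambda h: h[0],
    "How many tiles do the ships cover in total?": lambda h: sum(h),
}


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(question, prior):
    """Prior entropy minus the expected posterior entropy after the answer."""
    # Group hypotheses by the answer they would produce.
    by_answer = defaultdict(dict)
    for h, p in prior.items():
        by_answer[question(h)][h] = p

    expected_posterior_entropy = 0.0
    for group in by_answer.values():
        p_answer = sum(group.values())
        posterior = [p / p_answer for p in group.values()]
        expected_posterior_entropy += p_answer * entropy(posterior)

    return entropy(prior.values()) - expected_posterior_entropy


# Rank the candidate questions from most to least informative.
scores = {text: expected_information_gain(q, PRIOR) for text, q in QUESTIONS.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, asking about the first ship's size is worth exactly log2(3) bits, while the yes/no question about equal sizes is worth less because most hypotheses give the same answer.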