MC4 – Learning Mappings via Symbolic, Probabilistic, and Connectionist Modeling

Lecturer: Afsaneh Fazly
Fields: Machine Learning, Cognitive Modelling, Language Acquisition


In session 1, we cover the basics of several mapping (association) problems, including theoretically important challenges such as the acquisition of word meanings in young children, as well as applied settings such as learning multimodal or multilingual representations.

Session 2 focuses on the early approaches applied to a mapping problem, including symbolic and probabilistic methods.

Session 3 covers the more recent techniques (linear transformations and deep learning), in the context of several mapping problems, such as learning multimodal and multilingual mappings.
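As a rough illustration of the linear-transformation idea mentioned for Session 3, the sketch below fits a least-squares map between two toy two-dimensional "embedding spaces" from a small seed dictionary of paired vectors. It is a sketch under stated assumptions, not the method of any particular paper: the helper names (`fit_linear_map`, `inverse_2x2`), the toy vectors, and the restriction to two dimensions are all chosen for illustration only.

```python
def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    """Invert a 2x2 matrix (assumption: the toy embeddings are 2-dimensional)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def fit_linear_map(source, target):
    """Least-squares W minimising ||source W - target||^2.

    Each row of `source` and `target` is one word vector; row i of `source`
    is assumed to translate to row i of `target`.
    """
    xt = transpose(source)
    return matmul(inverse_2x2(matmul(xt, source)), matmul(xt, target))

# Hypothetical seed dictionary: three paired word vectors.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]

W = fit_linear_map(source, target)

# Map a new source-space vector into the target space.
mapped = matmul([[2.0, 1.0]], W)[0]
```

In this toy case the target space is an exact rotation of the source space, so the fitted map recovers it and generalises to the unseen vector. Real embedding spaces are only approximately related, which is where the more recent techniques come in.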


The objective is to cover three different approaches applied to the same problem of learning mappings across modalities (e.g., learning the meanings of words, learning mappings between audio/words and image/video segments, learning multilingual representations, etc.).
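To make the word-meaning mapping problem concrete, the following minimal sketch learns word-referent associations purely from co-occurrence counts across situations. It is a deliberately simplified toy, not the model from any of the readings listed for this course; the function names and the toy situations are assumptions made for illustration.

```python
from collections import defaultdict

def learn_associations(situations):
    """Count word/referent co-occurrences and normalise them per word.

    `situations` is a list of (words, referents) pairs: an utterance and the
    candidate meanings present in the scene when it was heard.
    Returns a dict: word -> {referent: estimated p(referent | word)}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, referents in situations:
        for word in words:
            for referent in referents:
                counts[word][referent] += 1
    probs = {}
    for word, row in counts.items():
        total = sum(row.values())
        probs[word] = {referent: c / total for referent, c in row.items()}
    return probs

def best_referent(probs, word):
    """Return the referent most strongly associated with `word`."""
    return max(probs[word], key=probs[word].get)

# Hypothetical toy input: each utterance is ambiguous on its own.
situations = [
    (["the", "dog", "runs"], ["DOG", "RUN"]),
    (["the", "dog", "eats"], ["DOG", "EAT"]),
    (["the", "cat", "runs"], ["CAT", "RUN"]),
]

probs = learn_associations(situations)
dog_meaning = best_referent(probs, "dog")    # "DOG"
runs_meaning = best_referent(probs, "runs")  # "RUN"
```

No single situation determines what "dog" means, but the evidence accumulated across situations does. Words such as "the" remain ambiguous under plain counting, which is one of the problems that more elaborate symbolic and probabilistic models set out to address.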


J.M. Siskind (1995). Grounding Language in Perception. Artificial Intelligence Review, 8:371-391, 1995.

J.M. Siskind (1996). A Computational Study of Cross-Situational Techniques for Learning Word-to-Meaning Mappings. Cognition, 61(1-2):39-91, October/November 1996. Also appeared in Computational Approaches to Language Acquisition, M.R. Brent, ed., Elsevier, pp. 39-91, 1996.

Frank, M. C., Goodman, N. D., & Tenenbaum, J. B. (2009). Using speakers’ referential intentions to model early cross-situational word learning. Psychological Science, 20, 579-585.

Fazly, A., Alishahi, A., & Stevenson, S. (2010). A probabilistic computational model of cross-situational word learning. Cognitive Science: A Multidisciplinary Journal, 34(6): 1017-1063.

Tadas Baltrusaitis, Chaitanya Ahuja, and Louis-Philippe Morency (2017). Multimodal Machine Learning: A Survey and Taxonomy.

Zhang, Y., Chen, C.H., & Yu, C. (2019). Mechanisms of Cross-situational Learning: Behavioral and Computational Evidence. Advances in Child Development and Behavior.

Sebastian Ruder, Ivan Vulić, Anders Søgaard (2019). A Survey of Cross-lingual Word Embedding Models. Journal of Artificial Intelligence Research 65: 569-631.


Dr. Afsaneh Fazly

Afsaneh Fazly is a Research Director at Samsung Toronto AI Centre, and an Adjunct Professor at the Computer Science Department of the University of Toronto in Canada. Afsaneh has extensive experience in both academia and industry, publishing award-winning papers and building strong teams that solve real-world problems. Afsaneh’s research draws on many subfields of AI, including Computational Linguistics, Cognitive Science, Computational Vision, and Machine Learning. Afsaneh strongly believes that solving many of today’s real-world problems requires an interdisciplinary approach that can bridge the gap between machine intelligence and human cognition.

Before joining Samsung Research, Afsaneh worked at several Canadian companies as Research Director, where she helped build and lead teams of outstanding scientists and engineers solving a diverse set of AI problems. Prior to that, Afsaneh was a Research Scientist and Course Instructor at the University of Toronto, where she also received her PhD. Afsaneh lives in Toronto with her husband and two young children. Afsaneh’s main hobbies these days are reading and spending time with her family.

Affiliation: Samsung Toronto AI Centre

MC3 – Embodied Symbol Emergence

Lecturers: Malte Schilling and Michael Spranger
Fields: Robotics / Autonomous systems / Neurobiology / Artificial Intelligence / Developmental Artificial Intelligence / Symbol Emergence


Symbols are the bedrock of human cognition. They play a role in planning, but are also crucial to understanding and modeling language. Since they are so important for human cognition, they are likely also vital for implementing similar abilities in software agents and robots.

The course will focus on symbols from two integrated perspectives. On the one hand, we look at the emergence of internal models through interaction with the environment and their role in sensorimotor behavior. This perspective is the embodied perspective. The first two lectures of the course concentrate on the emergence of internal models and grounded symbols in simple animals and agents and show how interaction with an environment requires internal models and how these are structured. Here we use robots to show how effective the discussed mechanisms are.

The second perspective is that symbols can also be socially constructed. In particular, we will focus on language and how it is grounded in embodiment but also in social interaction. This will be the topic of the third and fourth lectures. We first investigate the emergence of grounded names and categories (and their terms) in social interactions between robots. We then turn to compositionality, that is, the interaction of embodied categories in larger phrases or sentences, and to grammar.

Lecture 1: Embodied systems

Embodied systems: sophisticated behaviors do not necessarily require internal models. There are many examples of relatively simple animals (for example insects) that are able to perform complex behaviors. In the first lecture we focus on behavior-based robots that simply react to their environment without internal models. Crucially, these reactive behaviors can lead to complex and adaptive behavior, even though the agent does not rely on internal representations. Instead, the system exploits its relation to the environment.
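A purely reactive controller of this kind can be sketched in a few lines. The example below is a hypothetical illustration in the spirit of a Braitenberg vehicle, not a controller taken from the course: the function name, the two light sensors, and the parameter values are assumptions. Sensor readings are wired directly to wheel speeds, and nothing is stored between steps.

```python
def reactive_step(left_light, right_light, base_speed=0.2, gain=1.0):
    """Map two light-sensor readings directly to (left, right) wheel speeds.

    The wiring is crossed: each sensor drives the wheel on the opposite side.
    On a differential-drive robot the faster wheel is then on the darker side,
    so the robot turns toward the brighter side. There is no memory and no
    internal model; the output depends only on the current readings.
    """
    left_wheel = base_speed + gain * right_light
    right_wheel = base_speed + gain * left_light
    return left_wheel, right_wheel

# Light source to the left: the left sensor reads more than the right one.
left_wheel, right_wheel = reactive_step(left_light=0.9, right_light=0.1)
turns_left = right_wheel > left_wheel
```

Called repeatedly in a sense-act loop, such a rule produces light-seeking behavior that looks goal-directed, although the "goal" is nowhere represented inside the agent.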

Lecture 2: Grounded internal models

Grounded internal models first of all serve a function for the system itself. But the flexibility of these models allows them to be recruited for additional tasks. An example is the use of internal body models in perception. The second part of the course introduces internal models and shows how they co-evolve in service of a specific behavior and how flexible models can be recruited for higher-level tasks such as perception or cognition. The session will consist of case studies from neuroscience, psychology and behavioral science as well as modeling approaches to internal models in robotics. Sharing such internal models in a population of agents provides a step towards symbolic systems and communication.

Lecture 3: Symbol emergence in robot populations

The lecture will examine the emergence of grounded, shared lexical language in populations of robots. Lexical languages consist of single-word (or in some cases multi-word) expressions. We show how such systems emerge in referential games. In particular, we focus on how internal representations become shared across agents through communication. The lecture will cover (proper) naming and categorization of objects, for instance, using color. The lecture will introduce important concepts such as symbol grounding and discuss them from the viewpoint of language emergence.
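The core dynamics of such a game can be sketched as a minimal naming game for a single object. This is a simplified illustration, not the setup used in the experiments covered in the lecture: perception and grounding are abstracted away (objects are plain strings), and the class and function names as well as the invented words are assumptions.

```python
class Agent:
    """An agent with a private lexicon: object -> set of candidate names."""

    def __init__(self):
        self.lexicon = {}

def play_game(speaker, hearer, obj, new_name):
    """Play one round of a minimal naming game about `obj`.

    If the speaker has no name for the object, it invents `new_name`.
    Returns True if the hearer already knew the spoken word.
    """
    names = speaker.lexicon.setdefault(obj, set())
    if not names:
        names.add(new_name)              # invention
    word = sorted(names)[0]              # deterministic choice, for the sketch only
    known = hearer.lexicon.setdefault(obj, set())
    if word in known:
        # Success: both agents discard all competing names.
        speaker.lexicon[obj] = {word}
        hearer.lexicon[obj] = {word}
        return True
    known.add(word)                      # Failure: the hearer adopts the word.
    return False

a, b = Agent(), Agent()
first = play_game(a, b, "ball", "wabu")   # a invents "wabu", b does not know it yet
second = play_game(b, a, "ball", "zito")  # b reuses "wabu", a recognises it
```

After the two rounds both agents share the name "wabu" for the object, although neither was told in advance which word to use. In a larger population with random pairings, the same rule lets competing names spread and die out until one convention remains.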

Lecture 4: Compositional Language

Human language is compositional, which means that the meaning of a phrase depends on its constituents but also on the grammatical relations between them. For instance, projective categories such as “front”, “back”, “left” and “right” can be used as adjectives or prepositionally. Different syntactic usage signals a different conceptualization. This lecture will focus on compositional representations of language meaning, how they are related to syntax and how such systems might emerge in populations of agents.


The course will give an introduction to computational models of symbol emergence through sensorimotor behavior and social construction. These models can be run in simulation or on real robots. Participants will be introduced to the field of Embodied Cognition, with an overview of interdisciplinary results from neuroscience, psychology, computer science, linguistics and robotics.


Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2016). Building Machines That Learn and Think Like People. Behav Brain Sci, 1–101.

Lecture 1-2

Dickinson, M. H., Farley, C. T., Full, R. J., Koehl, M. A. R., Kram, R., & Lehman, S. (2000). How Animals Move: An Integrative View. Science, 288(5463), 100–106.

Ijspeert, A. J. (2014). Biorobotics: Using robots to emulate and investigate agile locomotion. Science, 346(6206), 196–203.

Gallese, V., & Lakoff, G. (2005). The Brain’s concepts: The role of the Sensory-motor system in conceptual knowledge. Cognitive Neuropsychology, 22(3–4), 455–479.

Lecture 3-4

Steels, L. (2008). The symbol grounding problem has been solved. So what’s next? In M. de Vega (Ed.), Symbols and Embodiment: Debates on Meaning and Cognition. Oxford University Press.

Steels, L. (2015). The Talking Heads Experiment: Origins of Words and Meanings. Computational Models of Language Evolution, vol. 1. Language Science Press, Berlin.

Spranger, M. (2016). The Evolution of Grounded Spatial Language. Language Science Press.


Dr. Malte Schilling

Malte Schilling is a Responsible Investigator at the Center of Excellence for ‘Cognitive Interaction Technology’ in Bielefeld. His work concentrates on internal models, their grounding in behavior, and their application in higher-level cognitive functions such as planning ahead or communication. Previously, he was a postdoc at the ICSI in Berkeley, where he did research on the connection between linguistic and sensorimotor representations. He received his PhD in Biology from Bielefeld University in 2010, working on decentralized, biologically inspired minimal cognitive systems. He studied Computer Science at Bielefeld University and completed his Diploma in 2003 with a thesis on knowledge-based systems for virtual environments.

Dr. Michael Spranger

Michael Spranger received a PhD from the Vrije Universiteit in Brussels (Belgium) in 2011 (in Computer Science). For his PhD he was a researcher at Sony CSL Paris (France). He then worked in the R&D department of Sony Corporation in Tokyo (Japan) for almost 2 years. He is currently a researcher at Sony Computer Science Laboratories Inc (Tokyo, Japan). Michael is a roboticist by training with extensive experience in research on and construction of autonomous systems including research on robot perception, world modeling and behavior control. After his undergraduate degree he fell in love with the study of language and has since worked on different language domains from action language and posture verbs to time, tense, determination and spatial language. His work focuses on artificial language evolution, machine learning for NLP (and applications), developmental language learning, computational cognitive semantics and construction grammar.

Affiliation: Bielefeld University and Sony

MC5 – Low Complexity Modeling in Data Analysis and Image Processing

Lecturer: Emily King
Fields: Mathematical methods, data analysis, machine learning, image processing, harmonic analysis


Are you curious about how to extract important information from a data set? Very likely, you will be rewarded if you use some sort of low complexity model in your analysis and processing. A low complexity model is a representation of data which is in some sense much simpler than what the original format of the data would suggest. For example, every time you take a picture with a phone, about 80% of the data is discarded when the image is saved as a JPEG file. The JPEG compression algorithm works because discrete cosine functions yield a low complexity model for natural images that tricks human perception. Similarly, linear bottlenecks, pooling, pruning, and dropout all enforce a low complexity model on neural networks to prevent overfitting. Some benefits of low complexity models include:

  • Approximating data via a low complexity model often highlights overall structure of the data set or key features.
  • Appropriately reducing the complexity of data as a pre-processing step can speed up algorithms without drastically affecting the outcome.
  • Reducing the complexity of a system during a training task can prevent overfitting.
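The discrete cosine example can be tried out on a one-dimensional toy signal. The sketch below is only loosely analogous to JPEG, which works on two-dimensional 8x8 blocks and quantizes coefficients rather than simply dropping them; the function names and the test signal are assumptions made for illustration. It computes an orthonormal DCT, keeps only the largest coefficients, and reconstructs the signal from them.

```python
import math

def _scale(k, n):
    """Normalisation factor that makes the DCT-II orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(signal):
    """Orthonormal DCT-II of a 1-D signal."""
    n = len(signal)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                           for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]

def compress(signal, keep):
    """Reconstruct `signal` from its `keep` largest-magnitude DCT coefficients."""
    coeffs = dct(signal)
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    sparse = [c if k in kept else 0.0 for k, c in enumerate(coeffs)]
    return idct(sparse)

# A smooth toy signal: a linear ramp with 8 samples.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Keep 2 of 8 coefficients, i.e. discard 75% of the representation.
approx = compress(signal, keep=2)
error_energy = sum((a - b) ** 2 for a, b in zip(signal, approx))
signal_energy = sum(x * x for x in signal)
```

For this smooth signal the squared reconstruction error is well below one percent of the signal energy, even though three quarters of the coefficients were thrown away. A signal without such structure, for example random noise, would not compress this well.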

The course will begin with an introduction to applied harmonic analysis, touching on pertinent topics from linear algebra, Fourier analysis, time-frequency analysis, and wavelet/shearlet analysis.  Then an overview of low complexity models will be given, followed by specific discussions of

  • Linear dimensionality reduction (principal component analysis, Johnson-Lindenstrauss embeddings)
  • Sparsity and low rank assumptions (LASSO, l^p norms, k-means clustering, dictionary learning)
  • Nonlinear dimensionality reduction / manifold learning (Isomap, Locally Linear Embedding, local PCA)
  • Low complexity models in neural networks (linear bottlenecks, pooling, pruning, dropout, generative adversarial networks, Gaussian mean width)


The course aims to provide participants with a good understanding of basic concepts and applications of both classical mathematical tools like the Fourier or wavelet transform and more cutting edge methods like dropout in neural networks.  A variety of applications and algorithms will be presented.  Participants should finish the course with a clearer idea of when and how to use various approaches in data analysis and image processing.


The linear algebra chapter of MIT’s Deep Learning textbook:


Emily King is a professor of mathematics at Colorado State University, reigning IK Powerpoint Karaoke champion, an avid distance runner, and a lover of slow food / craft beer / third wave coffee.  Her research interests include algebraic and applied harmonic analysis, signal and image processing, data analysis, and frame theory.  In layman’s terms, she looks for the best building blocks to represent data, images, and even theoretical mathematical objects to better understand them.  She also has a tattoo symbolizing most of her favorite classes of mathematical objects.  If you are curious, you should ask her about it over a beer.

Affiliation: Colorado State University

MC1 – Applications of Bayesian Inference and the Free Energy Principle

Lecturer: Christoph Mathys
Fields: Bayesian inference, free energy principle, active inference, computational neuroscience, time series


We will start with a look at the fundamentals of Bayesian inference, model selection, and the free energy principle. We will then look at ways to reduce Bayesian inference to simple prediction adjustments based on precision-weighted prediction errors. This will provide a natural entry point to the field of active inference, a framework for modelling and programming the behaviour of agents negotiating their continued existence in a given environment. Under active inference, an agent uses Bayesian inference to choose its actions such that they minimize the free energy of its model of the environment. We will look at how an agent can infer the state of the environment and its own internal control states in order to generate appropriate actions.
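The reduction of Bayesian inference to precision-weighted prediction errors can be seen in the simplest possible case: a Gaussian belief about a fixed quantity, updated by one noisy Gaussian observation. The sketch below shows only this textbook special case, not the hierarchical Gaussian filter or an active inference agent; the function and parameter names are assumptions.

```python
def gaussian_update(prior_mean, prior_precision, observation, obs_precision):
    """Exact Bayesian update of a Gaussian belief from one Gaussian observation.

    Precision is inverse variance. The posterior mean is the prior mean plus
    the prediction error, weighted by the observation's share of the total
    precision.
    """
    prediction_error = observation - prior_mean
    posterior_precision = prior_precision + obs_precision
    weight = obs_precision / posterior_precision
    posterior_mean = prior_mean + weight * prediction_error
    return posterior_mean, posterior_precision

# Equal precisions: the belief moves halfway toward the observation.
mean_a, precision_a = gaussian_update(0.0, 1.0, 2.0, 1.0)   # (1.0, 2.0)

# A more precise observation pulls the belief further.
mean_b, precision_b = gaussian_update(0.0, 1.0, 2.0, 3.0)   # (1.5, 4.0)
```

The weight plays the role of a learning rate: it is large when the observation is precise relative to the prior belief and small when the prior is already confident.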


  • To understand the reduction of Bayesian inference to precision-weighting of
    prediction errors
  • To understand the free energy principle and the modelling framework of
    active inference
  • To know the principles of Bayesian inference and model selection, and to understand their application to a given data set.


  • Friston, K. J., Daunizeau, J., & Kiebel, S. J. (2009). Reinforcement Learning
    or Active Inference? PLoS ONE, 4(7), e6421.
  • Mathys, C., Lomakina, E.I., Daunizeau, J., Iglesias, S., Brodersen, K.H.,
    Friston, K.J., & Stephan, K.E. (2014). Uncertainty in perception and the
    Hierarchical Gaussian Filter. Frontiers in Human Neuroscience, 8:825.
  • Mathys, C., Daunizeau, J., Friston, K.J., Stephan, K.E., 2011. A Bayesian
    foundation for individual learning under uncertainty. Front. Hum. Neurosci. 5,
  • Friston, K. (2009). The free-energy principle: A rough guide to the brain? Trends in Cognitive Sciences, 13(7), 293–301.


Christoph Mathys is Associate Professor of Cognitive Science at Aarhus University. Originally a theoretical physicist, he worked in the IT industry for several years before doing a PhD in information technology at ETH Zurich and a master’s degree in psychology and psychopathology at the University of Zurich. During his graduate studies, he developed the hierarchical Gaussian filter (HGF), a generic hierarchical Bayesian model of inference in volatile environments. Based on this, he develops and maintains the HGF Toolbox, a Matlab-based free software package for the analysis of behavioural and neuroimaging experiments. His research focus is on the hierarchical message passing that supports inference in the brain, and on failures of inference that lead to psychopathology.

Affiliation: SISSA

MC2 – Symbolic Reasoning within Connectionist Systems

Lecturer: Klaus Greff
Fields: Artificial Intelligence / Neural Networks, Draws upon Theoretical Neuroscience and Cognitive Psychology


Our brains effortlessly organize our perception into objects, which they use to compose flexible mental models of the world. Objects are fundamental to our thinking, and our brains are so good at forming them from raw perception that it is hard to notice anything special happening at all. Yet, perceptual grouping is far from trivial and has puzzled neuroscientists, psychologists and AI researchers alike.

Current neural networks show impressive capacities in learning perceptual tasks but struggle with tasks that require a symbolic understanding. This ability to form high-level symbolic representations from raw data, I believe, is going to be a key ingredient of general AI. 

During this course, I will try to share my fascination with this important but often neglected topic. 

Within the context of neural networks, we will discuss the key challenges and how they may be addressed. Our main focus will be the so-called Binding Problem and how it prevents current neural networks from effectively dealing with multiple objects in a symbolic fashion.

After a general overview in the first session, the next lectures will explore in-depth three different aspects of the problem:

Session 2 (Representation) focuses on the challenges regarding distributed representations of multiple objects in artificial neural networks and the brain.

Session 3 (Segregation) is about splitting raw perception into objects, and we will discuss what objects even are in the first place.

Session 4 (Composition) will bring things back together and show how different objects can be related and composed into complex structures. 
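The representational side of the problem (Session 2) can be illustrated with a toy example. It is not taken from the course material; the feature set and function names are assumptions. When a scene is represented by simply adding up the feature vectors of its objects, the information about which features belong together is lost.

```python
FEATURES = ["red", "blue", "square", "circle"]

def encode(obj):
    """Feature vector for one object, e.g. ("red", "square") -> [1, 0, 1, 0]."""
    return [1 if feature in obj else 0 for feature in FEATURES]

def encode_scene(objects):
    """Represent a scene by summing the feature vectors of its objects."""
    vectors = [encode(obj) for obj in objects]
    return [sum(column) for column in zip(*vectors)]

# Two different scenes ...
scene_a = encode_scene([("red", "square"), ("blue", "circle")])
scene_b = encode_scene([("red", "circle"), ("blue", "square")])

# ... end up with exactly the same representation.
indistinguishable = scene_a == scene_b
```

Both scenes map to `[1, 1, 1, 1]`: the representation says that red, blue, square and circle are all present, but not which color is bound to which shape.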


  • Develop an appreciation for the subtleties of object perception.
  • Understand the importance of symbol-like representations in neural networks and how they relate to generalization.
  • Become familiar with the binding problem and its three aspects: representation, segregation, and composition.
  • Get an overview of the challenges and available approaches for each subproblem.


The course is a non-technical high-level overview, so only basic familiarity with neural networks is assumed. Optional background material: 


Klaus Greff studied Computer Science at the University of Kaiserslautern and is currently a PhD candidate under the supervision of Prof. Jürgen Schmidhuber. His main research interest is the unsupervised learning of symbol-like representations in neural networks (the content of this course).

Previously, Klaus worked on Recurrent Neural Networks and the training of very deep neural networks. He is also the maintainer of the popular experiment management framework Sacred.

Affiliation: IDSIA