RC2 – Can patterns of word usage tell us what lemon and moon have in common? Analyzing the semantic content of distributional semantic models

Lecturer: Pia Sommerauer
Fields: Computational linguistics, cognitive linguistics

Content

Can the patterns of textual contexts in which words appear tell you (or your model) that both a lemon and the moon are described as yellow and round but differ in (almost) every other respect? In other words: how much information about concepts is encoded in patterns of word usage (i.e. distributional data)?

In this course, I will take stock of what we know about the semantic content encoded in data-derived meaning representations (e.g. Word2Vec), which are commonly used in Natural Language Processing and cognitive modelling (e.g. metaphor interpretation).

I will focus on how we can find out whether these representations encode semantic knowledge, and which kind, beyond a general sense of semantic word similarity and relatedness. Drawing on methods from the area of neural network interpretability, I will discuss how we can “diagnose” semantic knowledge to find out whether a model can in fact distinguish flying from non-flying birds or tell you what lemons and the moon have in common.
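As a rough illustration of the diagnostic idea, the sketch below trains a small linear classifier (a perceptron) to predict a single property, "is yellow", from word vectors and then checks whether it generalizes to unseen words. Everything in it is an assumption made for illustration: the three-dimensional vectors are hand-made toy values rather than real Word2Vec embeddings, the word list and labels are invented, and the perceptron stands in for whatever classifier an actual diagnostic experiment would use. Because the toy vectors were constructed so that the property is linearly separable, the result says nothing about real models; actual experiments also need control tasks to rule out that the classifier merely exploits general similarity.

```python
# Toy, hand-made 3-dimensional vectors. These are NOT real Word2Vec
# embeddings; they only illustrate the shape of a diagnostic experiment.
TOY_VECTORS = {
    "lemon":  [0.9, 0.1, 0.3],
    "banana": [0.8, 0.2, 0.1],
    "sun":    [0.7, 0.0, 0.9],
    "moon":   [0.6, 0.1, 0.8],
    "raven":  [0.1, 0.9, 0.2],
    "coal":   [0.0, 0.8, 0.1],
    "sea":    [0.2, 0.7, 0.9],
    "grass":  [0.1, 0.6, 0.3],
}

# Hypothetical property labels: 1 = described as yellow, 0 = not.
IS_YELLOW = {
    "lemon": 1, "banana": 1, "sun": 1, "moon": 1,
    "raven": 0, "coal": 0, "sea": 0, "grass": 0,
}


def predict(weights, bias, vector):
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1 if score > 0 else 0


def train_perceptron(words, vectors, labels, epochs=10):
    """Fit a perceptron on the given words; return (weights, bias)."""
    dim = len(next(iter(vectors.values())))
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for word in words:
            error = labels[word] - predict(weights, bias, vectors[word])
            if error != 0:
                # Move the decision boundary towards the misclassified word.
                weights = [w + error * x for w, x in zip(weights, vectors[word])]
                bias += error
    return weights, bias


def accuracy(words, vectors, labels, weights, bias):
    """Share of words whose property is predicted correctly."""
    correct = sum(predict(weights, bias, vectors[w]) == labels[w] for w in words)
    return correct / len(words)


# Train on some words, then test on words the classifier has never seen.
train_words = ["lemon", "sun", "raven", "sea"]
test_words = ["banana", "moon", "coal", "grass"]

weights, bias = train_perceptron(train_words, TOY_VECTORS, IS_YELLOW)
print("held-out accuracy:", accuracy(test_words, TOY_VECTORS, IS_YELLOW, weights, bias))
```

If the classifier predicts the property for held-out words well above a suitable baseline, that is taken as evidence that the property is encoded in the vectors; if it does not, the property may be absent or encoded in a form the classifier cannot read.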

Objectives

  • Become familiar with linguistic theories of the semantic information encoded in linguistic context and what we can expect from it
  • Understand how distributional word representations are created, evaluated and used (with practical examples)
  • Understand why distributional word representations provide rich information for machine learning systems but, at the same time, do not allow for straightforward semantic interpretation
  • Understand the challenges of diagnostic methods and how they can be addressed

Literature


Lecturer

Pia Sommerauer is a PhD student at the Computational Lexicology and Terminology Lab at Vrije Universiteit Amsterdam. Her research focuses on the type of semantic information captured by distributional representations of word meaning and whether they could be used for semantic reasoning. She has authored papers on this topic at venues specialized in lexical semantics and model interpretability together with her supervisors Antske Fokkens and Piek Vossen.

Website: https://piasommerauer.github.io/