Lexical Semantics Seminar: Semantic systems and word meaning – Københavns Universitet

Lingvistkredsen event, Spring 2018

In this seminar, a number of Danish and international researchers will present exciting new results on lexical semantics. They will discuss how word meaning is grounded in our visual world, how it changes depending on the temporal context, and how it is formalized in concept dictionaries and in online resources for use in language technology applications. They will also discuss how a multilingual approach can be used to account for the meaning of compound words.

So that the coffee break can be ordered, those interested are asked to register by sending an email to Patrizia Paggio.

Program

13:00 – 14:00  Elia Bruni, University of Amsterdam: Grounding word meaning in the visual world.

14:00 – 14:30  Sanni Nimb, Det Danske Sprog- og Litteraturselskab: Expressing word meaning in The Danish Concept Dictionary: The benefits and further perspectives of a formal lexical approach.

14:30 – 15:00  Coffee

15:00 – 16:00  Lonneke van der Plas, University of Malta: Cross-lingual approaches to compound analysis.

16:00 – 16:30  Dorthe Duncker, University of Copenhagen: Word meaning in time and context.

16:30 – 17:00  Bolette S. Pedersen, University of Copenhagen: Word meaning: between lexica and text.

Abstracts

Elia Bruni, University of Amsterdam: Grounding word meaning in the visual world.

Computational semantic models derive representations of word meaning from contextual information, either by gathering patterns of co-occurrence of words in text or by predicting such patterns using word embeddings. These models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, semantic models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. I address the lack of perceptual grounding of semantic models by exploiting computer vision techniques so that the representation of a word can be extended to also encompass its co-occurrence with the visual information of images it is associated with.

In the second part of the talk, I will explain that, despite such progress, these models instantiate rather fragile connections between vision and language, and we are still far from truly grasping the linkage between these two modalities. One of the reasons is that these systems are mainly devised to learn from very static environments, where a single model is repeatedly exposed to a large amount of text and images but does not have any chance to interact with those environments. To alleviate this problem, I will introduce a multimodal learning framework where two agents will have to cooperate via language in order to achieve a goal that is grounded in an external visual world.

Sanni Nimb, Det Danske Sprog- og Litteraturselskab: Expressing word meaning in The Danish Concept Dictionary: The benefits and further perspectives of a formal lexical approach.

While computational lexicographers base word sense descriptions on a narrow and precise set of semantic types and relations when they compile semantic lexicons for natural language processing, this is not the case in traditional lexicography. Here the focus is on how to express the semantic peculiarities of each specific word sense, and on being able to do it in the most flexible way possible. At the Society for Danish Language and Literature (DSL) we combined these two quite different approaches in the dictionary-making process of a Danish thesaurus, “Den Danske Begrebsordbog” (‘The Danish Concept Dictionary’, published in print in 2015), based on our experiences with the compilation of the formal lexicon DanNet (the Danish WordNet) and the online Danish dictionary “Den Danske Ordbog”. While a set of predefined semantic types and relations constituted the overall organizing principles across the named sections in the thesaurus, we also made room for a more flexible description of the Danish vocabulary in each section. The approach allowed us, on the one hand, to include almost any type of word sense described in the Danish dictionary and, on the other hand, to make use of the resulting thesaurus data in subsequent research projects for the compilation of formal semantic lexicons for Danish. One example is a Danish FrameNet lexicon to be used for semantic annotation of Danish texts. In my presentation I will describe the lexical method and discuss the results.

Lonneke van der Plas, University of Malta: Cross-lingual approaches to compound analysis.

Compounding can be defined as the formation of a new lexeme by adjoining two or more lexemes (Bauer, 2003:40). Compounds are studied extensively in the linguistic literature and are enjoying more and more attention in the Natural Language Processing (NLP) literature. The high productivity of compounds makes compositional approaches to automatic processing indispensable: listing all possible compounds in a dictionary would be almost as impractical as listing all possible adjective-noun combinations. However, because they are at the interface between words and phrases and show variable levels of semantic transparency (with respect to the constituents and the covert relation between them), they are particularly challenging for lexical semantics.

After explaining in more detail why compounds are a challenging but worthwhile subject of study for lexical semanticists, I will give an overview of recent work we undertook that harvests parallel corpora as indirect supervision for two tasks in compound analysis: compound identification, and bracketing of compounds with three or more constituents. I will link this work to a resource we compiled automatically that contains English compounds and their semantic equivalents in many different languages, and showcase some ongoing work that uses this resource.

Dorthe Duncker, University of Copenhagen: Word meaning in time and context.

All communication processes are situated in time and context, because this is where the communicating participants are situated. It is impossible for anyone to say, write, read, hear, sign, etc. anything except in a particular situation. This means that word meaning is also situated and that meaning is always ‘now’. But if meaning is always ‘now’, how can we at the same time expect our words to mean tomorrow, in two years, or in two minutes, what they mean here, now, today? And how can we expect that our words mean the ‘same’ across individual participants? In practice, these two opposite tendencies coexist, paradoxically but blissfully, because we apparently want to have it both ways. We want a flexible stability. This means that we need to be prepared to let each other in on each other’s word meanings when the circumstances require us to do so, and it seems that we have certain strategies for meeting these demands. Interestingly, the methods we, as lay speakers, employ to this end are not in principle that different from the approach taken by the academic lexicographer and terminologist. In the talk, I will present conversational examples that illustrate how communicating participants describe and compare word meanings, and how they manage to ‘show’ each other what their words mean.

Bolette S. Pedersen, University of Copenhagen: Word meaning: between lexica and text.

Humans generally find word descriptions in dictionaries helpful when producing and interpreting language. The sense distinctions and definitions seem intuitive and explanatory and give the user an overview of the meaning potential of a particular word. When employing dictionaries for semantic annotation of text and speech with the aim of machine learning, however, we find it very hard to achieve agreement between annotators. It becomes very evident that dictionary senses are constructs that only approximately frame the meaning potential of words. This poses a problem for language processing purposes where we want to train systems to grasp meaning distinctions. How can we exploit the rich information of sense inventories as established in existing dictionaries and yet make them practically useful in language technology – not too vague and fine-grained to be operational and yet informative enough to be worth the trouble? In my talk, I will present a series of experiments dealing with this question and I will discuss the potential of combining dictionary sense descriptions with sense profiles generated by statistical methods.