We develop computational models of human language. Our work ranges from basic research on algorithms and machine learning to applied research in language technology.
The current focus of our research is on the analysis and enhancement of neural language models. We are working on methods for extending these models with non-linguistic signals such as images and videos, and on the application of neural language models to information extraction. In addition, we have a long-standing interest in work at the intersection of natural language processing and theoretical computer science.
STING: Synthesis and analysis with Transducers and Invertible Neural Generators
Human communication is multimodal and occurs through speech, language, gesture, facial expression, and similar signals. To enable natural interactions with human beings, artificial agents must be capable of analysing and producing these rich and interdependent signals and connecting them to their semantic implications. STING aims to unify synthesis and analysis through transducers and invertible neural models, connecting concrete, continuous-valued sensory data such as images, sound, and motion, with high-level, predominantly discrete representations of meaning. The project has the potential to endow synthesis output with human-understandable high-level explanations while simultaneously improving the ability to attach probabilities to semantic representations.
Funded by WASP
A Practical Theory of Computation for Modern Neural Network Architectures
Today, neural networks are the most important computational paradigm for AI and machine learning. While their computational power has been studied for decades, the theoretical development has not been able to keep up with the recent surge of new neural architectures, such as the Transformer. Also, existing theoretical results are of limited practical value because they often apply to monolithic architectures, whereas applications typically use networks composed of re-usable components. Moreover, theoretical results tend to make idealising assumptions such as infinite precision or unlimited training time rather than acknowledging that such commodities are in short supply and should thus be treated as parameters of expressive power. In this project, we want to address these shortcomings and develop a practical theory of computation for modern neural network architectures. The project combines methods from theoretical computer science – especially the theory of computation and formal languages – with empirical validation in natural language processing (NLP).
Funded by WASP
Relation Extraction with Deep Neural Language Models
The field of natural language processing (NLP) has seen major progress during the last few years with the development of deep neural language models, which learn tasks such as question answering, machine translation, and text summarization without any direct supervision. This project will apply these models to extracting semantic relations between named entities from raw text. Our main goal is to design, implement, and evaluate an end-to-end system for relation extraction based on deep neural language models. Because training these models from scratch is extremely resource-intensive, we are specifically interested in developing methods for maximizing the performance that fine-tuning pre-trained models can obtain, particularly models for smaller languages such as Swedish.
Funded by ELLIIT
Interpreting and Grounding Pre-Trained Representations for Natural Language Processing
Building computers that understand human language is one of the central goals of artificial intelligence. A recent breakthrough on the way towards this goal is the development of neural models that learn deep contextualized representations of language. However, while these models have substantially advanced the state of the art in natural language processing (NLP), our understanding of the learned representations and our repertoire of techniques for integrating them with other knowledge representations and reasoning facilities remain severely limited. To address these gaps, we will develop new methods for interpreting, grounding, and integrating deep contextualized representations of language and evaluate the usefulness of these methods in the context of threat intelligence applications together with our industrial partner.
Funded by WASP
Marco Kuhlmann, Lena Katharina Schiffer, and Andreas Maletti.
The Tree-Generative Capacity of Combinatory Categorial Grammars.
Journal of Computer and System Sciences, pages 214–233, 2022.
Jenny Kunz and Marco Kuhlmann.
Test Harder Than You Train: Probing with Extrapolation Splits.
In Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 15–25, Punta Cana, Dominican Republic, 2021.
Jenny Kunz and Marco Kuhlmann.
Classifier Probes May Just Learn from Linear Context Features.
In Proceedings of the 28th International Conference on Computational Linguistics (COLING), pages 5136–5146, Online, 2020.
Our teaching portfolio comprises courses and degree projects in natural language processing and text mining at the basic, advanced, and doctoral levels. We are committed to excellence in teaching through effective pedagogy that fosters relevant knowledge and skills and stimulates students to drive their own learning process.
Language Technology (Bachelor)
Language technology – technology for the analysis and interpretation of natural language – forms a key component of smart search engines, personal digital assistants, and many other innovative applications. The goal of this course is to give an introduction to language technology as an application area, as well as to its basic methods. The course focuses on methods that process text.
Course website 729G17 | Course website TDP030
Natural Language Processing (Master)
Natural Language Processing (NLP) develops methods for making human language accessible to computers. This course aims to provide students with a theoretical understanding of and practical experience with the advanced algorithms that power modern NLP. The course focuses on methods based on deep neural networks.
Course website TDDE09
Text Mining (Master)
Text Mining is about distilling actionable insights from text. The overall aim of this course is to provide students with practical experience of the main steps of text mining: information retrieval, processing of text data, modelling, and analysis of experimental results. The course ends with an individual project where students work on a self-defined problem.
Course website 732A92 | Course website TDDE16
Deep Learning for NLP (PhD)
Natural Language Processing (NLP) develops methods for making human language accessible to computers. The goal of this course is to provide students with a theoretical understanding of and practical experience with the advanced algorithms that power modern NLP. The course focuses on methods based on deep neural networks.
This course is given in the context of the WASP Graduate School.
Open Thesis Proposals
Do Large Language Models Speak Swedish?
Large language models like GPT-3 are trained on massive corpora and have become state-of-the-art on many natural language tasks. Even more importantly, they succeed on many tasks in few-shot or even zero-shot setups, removing the need for task-specific training. The primary language for these models is English, but we can prompt them to generate answers in Swedish. In this project, you will explore the capabilities of multilingual large language models on Swedish language tasks and compare them to Swedish monolingual models. You will also explore how we can adapt these models to Swedish through means other than computationally expensive pre-training.
Contact: Oskar Holmström