Copenhagen NLP Symposium
Friday, 20 June 2025

Arbejdermuseet, Copenhagen, Denmark

Register Interest


The Copenhagen NLP Symposium highlights NLP researchers from Denmark and other Nordic countries. It features keynotes from renowned international researchers, a poster session where attendees can present their work, and dedicated time and space for networking across academia, industry, and students. The symposium is a free, full-day event held in the heart of Copenhagen.

Keynote Speakers


Loubna Ben Allal

Loubna Ben Allal is a Research Engineer at Hugging Face, where she leads efforts on training small language models (SmolLM & SmolLM2) and building pre-training datasets such as Cosmopedia and FineWeb-Edu.
Title: The Rise of Smol Models
Abstract: On-device language models are revolutionizing AI by making advanced models accessible in resource-constrained environments. In this talk, we will explore the rise of small models and how they are reshaping the AI landscape, moving beyond the era of scaling to ever-larger models. We will also cover SmolLM, a series of compact yet powerful LLMs, focusing on data curation, and ways to leverage these models for on-device applications.


Marzieh Fadaee

Marzieh Fadaee is a Staff Research Scientist at Cohere Labs (formerly Cohere For AI) whose work centers on multilingual language models, data-efficient learning, and robust evaluation methods.
Title: Evaluating Language Models: A Mirror, a Microscope, and a Map
Abstract: Evaluation plays a central, but often underestimated, role in how large language models are developed and understood. This talk critically examines current evaluation practices, highlighting how they shape perceptions of model progress while often overlooking key challenges in robustness, multilingual performance, and real-world reliability. By reflecting on these gaps, I make the case for rethinking evaluation as a guiding force, and not just a final checkpoint, in building more capable, inclusive, and trustworthy LLMs.


Najoung Kim

Najoung Kim is an Assistant Professor in the Department of Linguistics and affiliate faculty in the Department of Computer Science at Boston University.
Title: What does it take to convince ourselves that a system is exhibiting compositionality?
Abstract: Compositionality is often stated to be a desirable property for AI systems. But how do we evaluate this claim, and what evidence do we need to convince ourselves that a system is exhibiting this property? In this talk, I will start from the not-so-controversial bottom line that there can be no meaningful claims of compositionality without nontrivial commitments about the compositional machinery, building up towards the main claim that what we really want from an AI system is the availability of a process-compositional route. Then, I will discuss the role of behavioral and mechanistic evidence in convincing ourselves that such a route exists, featuring work on contextual inferences from adjective + noun compositions (with Hayley Ross and Kate Davidson) and on using Tensor Product Operations as a means to investigate symbol manipulation in neural networks (with Aditya Yedetore).
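
For readers unfamiliar with Tensor Product Representations, the following minimal sketch (illustrative only, not material from the talk) shows the core binding and unbinding operations in Python with NumPy: filler vectors are bound to role vectors via outer products, superposed into a single tensor, and recovered by unbinding with the role vectors. The adjective + noun pair and all vector choices are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical filler vectors for an adjective + noun pair such as "fake gun".
    fillers = {"fake": rng.normal(size=4), "gun": rng.normal(size=4)}

    # Orthonormal role vectors for the adjective and noun positions.
    roles = {"adj": np.array([1.0, 0.0]), "noun": np.array([0.0, 1.0])}

    # Bind each filler to its role with an outer product, then superpose.
    tpr = (np.outer(fillers["fake"], roles["adj"])
           + np.outer(fillers["gun"], roles["noun"]))

    # Unbind: with orthonormal roles, multiplying the tensor by a role
    # vector recovers exactly the filler bound to that role.
    recovered = tpr @ roles["adj"]
    assert np.allclose(recovered, fillers["fake"])

With orthonormal roles, unbinding is exact; with merely linearly independent roles, one would unbind with the dual basis instead.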


Kyle Lo

Kyle Lo is a research scientist at the Allen Institute for AI (Ai2), where he co-leads the OLMo project on open language modeling research. His current work focuses on data-driven approaches to model behavior and efficient language model experimentation. His research on language model development and adaptation, evaluation methods, and human-AI interaction has won awards at ACL, NAACL, EMNLP, EACL, and CHI. Kyle's work on language models for scientific research assistance, including fact checking, summarization, and augmented reading, has been featured in Nature, Science, TechCrunch, and other publications. Kyle holds a degree in Statistics from the University of Washington. Outside of work, he enjoys board games, boba tea, D&D, and spending time with his cat Belphegor.
Title: The OLMo Cookbook: Open Recipes for Language Model Data Curation
Abstract: Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or recipes to reproduce them. As a result, it can be challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities, risks and limitations. In this talk, I'll present how we approach data curation research for OLMo, our project to develop and share fully open language models. Reflecting on our journey from OLMo 1 to our latest release of OLMo 2, I'll explore how data curation practices have matured across our work and the broader open data research ecosystem. Finally, I'll examine key challenges and opportunities for open data amid a rapidly changing language model landscape.


Yohei Oseki

Yohei Oseki is an Associate Professor at the University of Tokyo. Yohei investigates human linguistic intelligence and builds machines that process and learn natural language like humans, by comparing human language processing, as measured experimentally in the cognitive and brain sciences, with machine language processing, as implemented computationally in natural language processing (NLP) and artificial intelligence.
Title: Small Language Models through Human-Like Learning Strategies
Abstract: Large language models (LLMs) have achieved remarkable success, thanks to the rapid development of NLP and AI, and outperform humans at various downstream tasks. However, despite their super-human performance, LLMs are inefficient in terms of training data, model parameters, and computational resources. In this talk, I propose human-like learning strategies to efficiently train small language models (SLMs), building on insights from human language acquisition. Specifically, SLMs are trained on 100 million words of child-directed speech (CDS) through learning strategies such as curriculum learning, batch learning, indirect evidence, variation sets, and working memory. The results suggest that inductive biases, inherent in both training data and language models, play an important role in efficiently training SLMs, with scientific implications for human language acquisition as well as engineering applications for edge AI and low-resource languages.

Program

08:30 - 09:00 Welcome
09:00 - 09:15 Opening Remarks
09:30 - 10:10 Keynote Talk 1
10:10 - 10:30 Break
10:45 - 11:25 Keynote Talk 2
11:30 - 12:30 Poster Session
12:30 - 13:30 Lunch
13:30 - 14:10 Keynote Talk 3
14:20 - 15:00 Keynote Talk 4
15:10 - 15:50 Keynote Talk 5
15:50 - 16:00 Closing Remarks
16:00 - 17:00 Reception

Call for Posters

We invite researchers to present posters on any topic related to Natural Language Processing. Posters may showcase recently published work (e.g., at conferences or in journals) or ongoing research. The registration form contains a dedicated section to express an interest in presenting a poster. Please include the poster title, an abstract, and—if applicable—details about the publication venue. We are interested in receiving posters on any of the following topics:

  • Nordic Language Processing
  • Safety and Alignment in LLMs
  • AI/LLM Agents
  • Human-AI Interaction/Cooperation
  • Retrieval-Augmented Language Models
  • Mathematical, Symbolic, and Logical Reasoning in NLP
  • Computational Social Science, Cultural Analytics, and NLP for Social Good
  • Code Models
  • Interpretability, Model Editing, Transparency, and Explainability
  • LLM Efficiency
  • Generalizability and Transfer
  • Dialogue and Interactive Systems
  • Discourse, Pragmatics, and Reasoning
  • Low-resource Methods for NLP
  • Ethics, Bias, and Fairness
  • Natural Language Generation
  • Information Extraction and Retrieval
  • Linguistic Theories, Cognitive Modeling, and Psycholinguistics
  • Machine Translation
  • Multilinguality and Language Diversity
  • Multimodality and Language Grounding to Vision, Robotics and Beyond
  • Neurosymbolic Approaches to NLP
  • Phonology, Morphology and Word Segmentation
  • Question Answering
  • Resources and Evaluation
  • Semantics: Lexical and Sentence-Level Semantics, Textual Inference, and Other Areas
  • Sentiment Analysis, Stylistic Analysis, and Argument Mining
  • Speech Processing and Spoken Language Understanding
  • Summarization
  • Hierarchical Structure Prediction, Syntax, and Parsing
  • NLP Applications

Important Dates

Poster Submission Deadline: May 30, 2025
Registration Notifications: June 13, 2025
Symposium Date: June 20, 2025

Attend

How to get to Arbejdermuseet

The closest station is Nørreport Station (metro and S-train).
Tickets and travel cards can be bought at any station, or you can use the Rejsekort app for convenient access to public transport.

Useful Links

Read an extensive list of restaurants and bars here
Visit the Copenhagen Neighborhood Guide here
Check what to see and do in Copenhagen here

Dining Near the Venue

  • Slurp Ramen Joint
    Cuisine: Japanese Ramen
    Address: Nansensgade 90, 1366 Copenhagen
    Notes: Handmade noodles, rich broth; expect possible queues
  • Hanoi Alley
    Cuisine: Vietnamese
    Address: Nørrebrogade 62A, 2200 Copenhagen
    Notes: Excellent Vietnamese food at a decent price
  • Torvehallerne KBH
    Type: Food Market
    Address: Frederiksborggade 21, 1360 København K
    Notes: Indoor market with a wide range of food stalls (open daily)
  • Flindt & Ørsted
    Type: Café/Bar
    Address: Nørre Farimagsgade 6, 1364 Copenhagen (inside Ørstedsparken)
    Notes: Cozy café and bar with outdoor seating
  • Sporvejen
    Cuisine: Burgers
    Address: Gråbrødretorv 17, 1154 København K
    Notes: Retro-themed diner with great-value burgers
  • Poulette
    Cuisine: Fried Chicken & Tofu Sandwiches
    Address: Møllegade 1, 2200 København N
    Notes: Nashville-style spicy chicken sandwiches
  • Diamond Slice
    Cuisine: New York-Style Pizza
    Address: Blågårdsgade 27, 2200 København N
    Notes: Creative pizza toppings, casual vibe
  • Ramen to Bíiru (Nørrebro)
    Cuisine: Ramen & Craft Beer
    Address: Griffenfeldsgade 28, 2200 København N
    Notes: Japanese ramen with Danish craft beer
  • Gasoline Grill
    Cuisine: Burgers
    Address: Landgreven 10, 1301 København K
    Notes: Famous for juicy, organic burgers from a former gas station
  • Bangkok Cantine
    Cuisine: Thai
    Address: Nørre Allé 13, 2200 København
    Notes: Family-run Thai restaurant. Pretty small, so it may not accommodate large groups

Organisers

Anna Rogers
IT University of Copenhagen

Desmond Elliott
University of Copenhagen

Johannes Bjerva
Aalborg University Copenhagen

Russa Biswas
Aalborg University Copenhagen

Ernests Lavrinovics
Aalborg University Copenhagen

Ingo Ziegler
University of Copenhagen

Danae Sanchez Villegas
University of Copenhagen

Arzu Burcu Güven
IT University of Copenhagen

Andreas Geert Motzfeldt
IT University of Copenhagen

Contact

Please send all inquiries by email, or contact any of the organisers via their email addresses.