Here are some of the latest projects I've been working on.

STELA - Learning Simulation of Language Acquisition

In close collaboration with fellow PhD student Marvin Lavechin.

STELA (STatistical Learning of Early Language Acquisition) is a learning simulation that aims to model language acquisition within the scope of the statistical learning hypothesis.

The approach is based on self-supervised models that learn language from raw speech. The STELA framework also makes it possible to generate comparable developmental learning curves at the phonetic and lexical levels.
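To give a flavour of what a developmental learning curve is, here is a minimal sketch: a model checkpoint is probed after each amount of training speech, and the score is tracked as the amount of input grows. The `phonetic_error_rate` function is a hypothetical stand-in for a real phonetic probe (such as ABX discrimination), and the numbers are illustrative only, not results from STELA.

```python
# Sketch of assembling a developmental learning curve:
# probe the model after increasing amounts of training speech.

def phonetic_error_rate(hours_of_speech):
    # Hypothetical stand-in for a real probe (e.g. ABX discrimination):
    # here, error simply decreases as the model hears more speech.
    return round(0.5 / (1.0 + hours_of_speech / 100.0), 3)

checkpoints = [0, 100, 400, 1600]  # hours of raw speech seen so far

learning_curve = [(h, phonetic_error_rate(h)) for h in checkpoints]
for hours, error in learning_curve:
    print(f"{hours:>5} h -> phonetic error {error}")
```

In the same spirit, curves from infants (scores by age) and from models (scores by amount of input) can be laid side by side and compared.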

More info on the project coming soon.

Diagram of how learning simulations (like STELA) compare with infants.

Modelling bilingual language acquisition

This is the main focus of my thesis and a project that is still very much ongoing. In short, using the same approach as in the STELA project, I am modelling developmental learning curves for monolingual and bilingual models. We can then compare these curves with existing experimental and observational results.

More info on the project coming soon.

Automatic Language Similarity

I am really interested in the effect that language similarity can have on speech models. Will a model trained on one language perform or transfer better on languages that are similar to the seed language? What happens if the model is trained on multiple languages?

From there stems another question: what is language similarity? Can models capture it automatically? And what kind of typology do they capture?

I presented a paper at Speech Prosody 2022 with a pilot study on capturing language typology using i-vectors. I am also looking at the effect of language similarity when modelling various speech-related cognitive processes (language discrimination and separation, the language familiarity effect, language learning...).

Automatic clustering of language typologies using i-vectors.
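To illustrate the idea behind the clustering, here is a minimal sketch: languages are represented by vectors, pairwise cosine distances are computed, and single-linkage agglomerative clustering groups the closest languages first. The 4-dimensional vectors and the language names below are toy stand-ins (real i-vectors are typically 100+ dimensional and extracted from speech recordings), not data from the paper.

```python
import math

# Toy stand-ins for i-vectors (illustrative values only).
ivectors = {
    "French":  [0.9, 0.1, 0.3, 0.2],
    "Spanish": [0.8, 0.2, 0.4, 0.1],
    "Dutch":   [0.1, 0.9, 0.2, 0.7],
    "German":  [0.2, 0.8, 0.3, 0.8],
}

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def cluster_distance(c1, c2):
    # Single linkage: distance between the closest pair of members.
    return min(cosine_distance(ivectors[a], ivectors[b])
               for a in c1 for b in c2)

# Start with one cluster per language; repeatedly merge the two
# closest clusters until only two remain.
clusters = [[name] for name in ivectors]
while len(clusters) > 2:
    i, j = min(
        ((i, j) for i in range(len(clusters))
                for j in range(i + 1, len(clusters))),
        key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
    )
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

print(clusters)  # with these toy vectors: a Romance and a Germanic group
```

Whether the groupings recovered from real i-vectors line up with phylogenetic, phonological, or some other typology is exactly the open question.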

ZeroResource Speech Challenge 2021

I was a co-organiser of the 2021 edition of the Zero Speech Challenge, where, among other things, I developed the semantic and syntactic benchmarks.

ZeroSpeech 2021 is a challenge aimed at spoken language modelling from raw speech. The task consists in learning language models directly from raw audio in an unknown language, without any annotations or text.

For more info, check out the website (the challenge is still open for new submissions!).

Get in touch

maureen.deseyssel (at)