* denotes equal contribution as first author; † denotes other equal contribution. Preprints are also included and marked [preprint], which indicates a manuscript currently under review.
Cruz Blandón, M. A., Aldeneh, Z., Chi, J., & de Seyssel, M. (2025). Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models. arXiv preprint arXiv:2509.17523. [preprint]
de Seyssel, M. & Dhekane, E. G. (2025). Which Evaluation for Which Model? A Taxonomy for Speech Model Assessment. arXiv preprint arXiv:2510.19509. [preprint]
de Seyssel, M., Chi, J., Seto, S., ter Hoeve, M., Fedzechkina, M., & Schluter, N. (2025). Discriminating Form and Meaning in Multilingual Models with Minimal-Pair ABX Tasks. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. (SAC Highlight Award) [pdf]
Lavechin, M.*, de Seyssel, M.*, Titeux, H., Bredin, H., Wisniewski, G., Cristia, A., & Dupoux, E. (2025). Simulating Early Phonetic and Word Learning Without Linguistic Categories. Developmental Science, 28(2). Wiley. https://doi.org/10.1111/desc.13606 [pdf] [supplementaries] [original preprint] [video]
de Seyssel, M., D'Avirro, A., Williams, A., & Dupoux, E. (2024). EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. [pdf] [code]
de Seyssel, M.*, Lavechin, M.*, & Dupoux, E. (2023). Realistic and broad-scope learning simulations: First results and challenges. Journal of Child Language, 1-24. doi:10.1017/S0305000923000272 [pdf]
de Seyssel, M., Lavechin, M., Titeux, H., Thomas, A., Virlet, G., Santos Revilla, A., Wisniewski, G., Ludusan, B., & Dupoux, E. (2023). ProsAudit, a prosodic benchmark for self-supervised speech models. In Proc. Interspeech 2023. [pdf] [benchmark] [leaderboard]
Cuervo, S., Seto, S., de Seyssel, M., Bai, R. H., Gu, Z., Likhomanenko, T., Jaitly, N., & Aldeneh, Z. (2025). Closing the Gap Between Text and Speech Understanding in LLMs. arXiv preprint arXiv:2510.13632. [preprint]
Sperber, M., de Seyssel, M., Bao, J., & Paulik, M. (2025). Toward Machine Interpreting: Lessons from Human Interpreting Studies. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. [pdf]
Seto, S., ter Hoeve, M.†, de Seyssel, M.†, & Grangier, D.† (2025). Assessing the Role of Data Quality in Training Bilingual Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025. [pdf]
Chi, J., de Seyssel, M., & Schluter, N. (2025). The Role of Prosody in Spoken Question Answering. In Findings of the Association for Computational Linguistics: NAACL 2025. [pdf]
Lavechin, M., de Seyssel, M., Métais, M., Metze, F., Mohamed, A., Bredin, H., Dupoux, E. & Cristia, A. (2023). Modeling early phonetic acquisition from child-centered audio data. Cognition, 245, 105734. [pdf]
Lavechin, M., de Seyssel, M., Gautheron, L., Dupoux, E., & Cristia, A. (2021). Reverse-engineering language acquisition with child-centered long-form recordings. Annual Review of Linguistics, 8, 389-407. [pdf]
Dunbar, E., Bernard, M., Hamilakis, N., Nguyen, T.A., de Seyssel, M., Rozé, P., Rivière, M., Kharitonov, E. & Dupoux, E. (2021). The Zero Resource Speech Challenge 2021: Spoken Language Modelling. In Proc. Interspeech 2021, 1574-1578, doi: 10.21437/Interspeech.2021-1755. [pdf]
Nguyen, T.A., de Seyssel, M., Algayres, R., Rozé, P., Dunbar, E., & Dupoux, E. (2020). Are word boundaries useful for unsupervised language learning? CoML Technical Report, September 2020. [preprint]
Maudet, E., Cattan, O., de Seyssel, M., & Servan, C. (2019). Qwant Research @DEFT 2019: appariement de documents et extraction d’informations à partir de cas cliniques (Document matching and information retrieval using clinical cases). In Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Défi Fouille de Textes (atelier TALN-RECITAL) (pp. 67-80). [pdf]