SUMM-RE

Weak Supervision for Meeting Minutes with Rhetorical Relations

SUMM-RE aims to combine expertise in theories of discourse interpretation with recent developments in distant supervision to improve the automatic production of meeting summaries and minutes from spoken data. Its guiding hypothesis is that by exploiting information about discourse relations and the rich structures determined by relations between utterances, we can significantly improve models for abstractive summarization.


To test this hypothesis, SUMM-RE will begin by building a 100-hour audio-video corpus of multi-party, meeting-like interactions in French. Then, building on prior work by SUMM-RE members, we will extend the data programming paradigm Snorkel to automatically annotate the SUMM-RE corpus and the AMI corpus, a large meeting-style corpus in English, for discourse structure. The automatically annotated data will then be used to improve algorithms for both short topic summaries and more detailed meeting minutes.
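
As a rough illustration of the data programming idea mentioned above, the sketch below labels pairs of utterances with a discourse relation by combining several noisy heuristics ("labeling functions"). It is not the SUMM-RE implementation and does not use the Snorkel library: the relation names, the cue words, and every function name are hypothetical, and a simple majority vote stands in for the label model that Snorkel fits over the votes.

```python
from collections import Counter

# Hypothetical label set, chosen for this illustration only.
ABSTAIN = None
QUESTION_ANSWER = "Question-Answer"
ACKNOWLEDGEMENT = "Acknowledgement"
CONTRAST = "Contrast"


def first_word(utterance):
    """Return the lowercased first word without trailing punctuation, or ''."""
    words = utterance.lower().split()
    return words[0].strip(",.!?") if words else ""


def lf_question_then_reply(first, second):
    """Vote Question-Answer when the first utterance ends with a question mark."""
    return QUESTION_ANSWER if first.strip().endswith("?") else ABSTAIN


def lf_agreement_cue(first, second):
    """Vote Acknowledgement when the second utterance opens with an agreement cue."""
    cues = {"ok", "okay", "yeah", "right", "sure"}
    return ACKNOWLEDGEMENT if first_word(second) in cues else ABSTAIN


def lf_contrast_marker(first, second):
    """Vote Contrast when the second utterance opens with a contrastive marker."""
    return CONTRAST if first_word(second) in {"but", "however"} else ABSTAIN


LABELING_FUNCTIONS = [lf_question_then_reply, lf_agreement_cue, lf_contrast_marker]


def weak_label(first, second):
    """Combine the labeling-function votes by majority; abstain if none fires.

    Ties go to the label voted first, following the order of LABELING_FUNCTIONS.
    """
    votes = [lf(first, second) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    pairs = [
        ("Shall we start with the budget?", "The budget is on slide two."),
        ("I finished the design.", "Okay, thanks."),
        ("The remote should be cheap.", "But it still has to look good."),
    ]
    for first, second in pairs:
        print(weak_label(first, second), "|", first, "->", second)
```

Each heuristic is cheap to write and individually unreliable; the point of the paradigm is that many such sources, once combined, can produce training labels for a large corpus without manual annotation of every utterance pair.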


These algorithms will in turn be integrated into LINAGORA’s semi-automatic summarization tool to significantly improve the output for its users. All project results (corpus and algorithms) will be released under an open-source license as part of LINAGORA’s LinTO/Conversation Manager offering.


ANR PRCI project (ANR-20-CE23-0017)
15 December 2020 – 14 May 2024
Lead: LINAGORA

People

Jean-Pierre Lorré: R&D Director
Julie Hunter: NLP
Ilyes Rebai: Speech Recognition
Guokan Shang: NLP
Kate Thompson: NLP
Wajdi Ghezaiel: Speech & Speaker Recognition

Partners

IRIT
LINAGORA
Parole & Langage