LINAGORA is a French company specializing in open source software. One of its current projects is the development of LinTO, an open source smart voice assistant. LinTO helps employees organize and run meetings: thanks to its Natural Language Understanding system, it can respond to voice commands.
When training a natural language recognition engine, one of the major problems we have to tackle is data scarcity. These systems generally need a large training corpus to improve, but building one manually is time-consuming and labor-intensive. In this article, we describe the method we developed to create a data augmentation module that automatically generates alternative commands from a small existing French corpus.