CHANTER

ANR project Cha(nt)N(umérique)Te(mps)R(éel)

Previous work

The project builds on previous work carried out by the project members. Click on each article to learn more about it.

Ph.D. thesis

Master thesis

  • S. Delalez
    Modèle articulatoire pour la synthèse de voix chantée contrôlée par le geste
    Master thesis, Master 2 I3SR - LAM / UPMC, June 2013

Papers


Singing with hands: Chironomic interfaces for digital musical instruments

Ph.D. thesis, Université Paris-Sud (23/09/2015)

Olivier Perrotin
LIMSI-CNRS
Bâtiment 508 - Université Paris-Sud - 91405 Orsay Cedex
olivier.perrotin@limsi.fr

Access to the thesis and the sound examples

Abstract

This thesis deals with the real-time control of singing voice synthesis with a graphic tablet, based on the digital musical instrument Cantor Digitalis. The relevance of the graphic tablet for intonation control is considered first, showing that the tablet provides more precise pitch control than the real voice under experimental conditions. To extend this accuracy of control to any situation, a dynamic pitch warping method for intonation correction is developed. It makes it possible to play with pitch errors below the perceptual limens while preserving the musician's expressivity. Objective and perceptual evaluations validate the efficiency of the method. The use of new interfaces for musical expression raises the question of the modalities involved in playing the instrument. A third study reveals a predominance of the visual modality over auditory perception for intonation control, due to the introduction of visual cues on the tablet surface. Nevertheless, this is compensated by the expressivity allowed by the interface. The writing or drawing ability acquired since early childhood enables quick acquisition of expert control of the instrument. A set of gestures dedicated to the control of different vocal effects is suggested. Finally, the instrument is practised intensively through the Chorus Digitalis ensemble, to test and promote our work. Artistic research was conducted to choose the musical repertoire of the Cantor Digitalis. Moreover, visual feedback dedicated to the audience has been developed, extending the perception of the players' pitch and articulation.

Keywords

digital musical instrument, gestural interface, graphic tablet, voice synthesis, singing voice


Gestural control of singing voice synthesis by rules and musical applications

Ph.D. thesis, Université Pierre et Marie Curie - Paris VI (26/09/2013)

Lionel Feugère
LIMSI-CNRS
Bâtiment 508 - Université Paris-Sud - 91405 Orsay Cedex
lionel.feugere@limsi.fr

Access to the thesis and the sound examples

Abstract

This thesis deals with the production and control modeling of a synthetic singing voice in the context of making a digital musical instrument. Two instruments are presented: the Cantor Digitalis, focusing on singing vowel control and voice individualization, and the Digitartic, which aims at controlling the articulation of Vowel-Consonant-Vowel syllables. Using an augmented graphic tablet, these instruments allow interactive musical applications with fine temporal control of voice production parameters. The relevance of these musical instruments was established through several public performances of the Chorus Digitalis ensemble. The gestures of the musicians were studied along with the musical tasks required for playing the selected repertoire, which was composed of traditional world music (baroque choral, North Indian khayal singing) as well as more contemporary pieces. In particular, an experiment was conducted to analyze the ability to control the fundamental frequency of the Cantor Digitalis. Participants were asked to imitate intervals and melodies at three tempos using three different modalities (one's own voice, tablet, and tablet with audio feedback). Results showed that precision was better with the tablet modalities than with one's own voice, while no significant difference was found between the tablet with and without audio feedback. Both instruments have been unified into one Max/MSP application, which provides an audio-visual and interactive educational tool for understanding voice production.

Keywords

voice synthesis, gestural control, singing voice, musical gestures, digital musical instrument, digital orchestra


The performance, the phrasing and the vocal rhetoric of the French song since 1950: clarifying the inexpressible of the voice

Ph.D. thesis in musicology, Université Lumière - Lyon 2 (27/06/2013)

Céline Chabot-Canet
IRCAM
Université Lyon 2
Passages XX-XXI (EA 4160)
celine.chabot-canet@ircam.fr

Access to the paper

Abstract

The present thesis focuses on the study of the song not in its word and music dialectic but through the acquisition of a third entity: the vocal rendition. The point is to reveal its critical importance and richness and to make it legitimate as a subject of study, through the implementation of a specific methodological and lexical protocol that allows it to be analysed, as the composition is, although its changeable nature is not conducive to theorizing. Considered as a complex object (according to Edgar Morin’s terminology), vocal rendition is submitted to the crossfire of various disciplines (musicology, linguistics, rhetoric, acoustics) in order to favour its objectivization as far as possible. Within the framework of musicology, the use of computer tools makes it possible to establish a complementarity between the perspectives of social sciences and exact sciences, to catch and analyse the peculiarities of the performances both in their dominant or agogic characters, their connections to the score, their combinatorial complexity within the meta-parameters (timbre, rhythm, phrasing), as well as the dialogical tensions which run through them (variation and repetition, melodicity and noise integration, singing and speaking parts). Thanks to the existence of a large body of French-speaking singers (from Rive Gauche style to Nouvelle chanson française), it is possible, by studying studio and concert recordings, to grasp the irreducible specificity of each singer (what is issued from a unique body) as well as the great underlying networks of stylistic relationships. Disclosed by the semiological perspective, around the notions of strategy and performance designs, vocal rhetoric, the way to induce pathos and to express ethos, there emerges a typology of performing styles that is open to considering the intrinsic originality of each performer and integrating further generic developments.

Keywords

spectrum analysis, French song, French-speaking song, singing, expressiveness, vocal rendition, popular music, vocal performance, voice, phrasing, prosody, vocal rhetoric, timbre


Synthèse concaténative de la voix chantée

Master thesis
Master 2 ATIAM - IRCAM / UPMC (01/03/2013 - 31/07/2013)

Luc Ardaillon
IRCAM
luc.ardaillon@ircam.fr

Access to the thesis

Abstract

The internship project presented in this report is concerned with the evaluation and adaptation of existing technologies for the purpose of singing voice synthesis, and with identifying the main problems to be solved, for which some possible solutions are suggested. For this purpose, a singing voice synthesis system has been developed, based on a unit concatenation method. Given a score and the associated lyrics, this system should be able to synthesize an audio file of a singing voice that sounds as natural and expressive as possible. A previously recorded and segmented database is used to concatenate the various segments needed for the synthesis process, which are determined by a phonetization step applied to the input lyrics. Various treatments are then applied to smooth the junctions between the segments and make them inaudible to the listener. Then, in order for the result to correspond to the score, the program superVP is used for the analysis, transformation and resynthesis of the sounds, applying recent high-quality algorithms for transposition and time-stretching. Finally, some means of making the result more natural and expressive are explored, with the implementation of a few rules allowing for the control of the various parameters.


33 ans de synthèse de la parole à partir du texte: une promenade sonore (1968-2001)

Article published in volume 42, no. 1/2001 of the journal Traitement Automatique des Langues (TAL, éditions Hermès), pages 297 to 321.

Christophe d’Alessandro
LIMSI-CNRS
Bâtiment 508 - Université Paris-Sud - 91405 Orsay Cedex
cda@limsi.fr

Access to the paper and the sound examples

Abstract

This article presents a compact disc containing 69 text-to-speech (TTS) synthesis sound examples. Examples from 25 automatic TTS systems, mainly in French, are described, featuring 54 different voices. In the first part, early systems are presented (from 1968 to 1992). The second part describes sound examples linked to the papers of the present volume. The third part describes sound examples produced by other contemporary systems. Then, possible ways of listening to the examples are proposed to the reader/listener. This may be helpful for paying attention to specific aspects of TTS, e.g. synthesizer types, synthesis units, prosodic synthesis, regional accents. Finally, a common paragraph has been synthesized by 20 different voices.

Keywords

text-to-speech synthesis, sound examples, history of synthesis

