Reciprosody

Reciprosody is a repository for linguists and speech scientists to share data related to intonation and prosody.
All material is distributed under open-source or academic/scholarly research licenses.

Currently, prosody researchers lack a simple way to share their annotated data. This gap 1) limits our ability to test robust supervised techniques for automatic prosodic analysis; 2) makes it difficult to compare performance across publications that rely on privately held corpora; and 3) places the burden of maintaining and distributing data on the creators of each resource. The lack of shareable data is particularly worrisome in our field because prosodic annotation is an onerous and time-consuming task.

Any individual or organization sharing data through Reciprosody will retain all rights to the material. The only requirement for sharing data on Reciprosody is that it be free for non-commercial use by educators and academic research institutions. The default license is the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which also allows non-commercial use by commercial entities. Any license or access restrictions requested or required by a data owner will be supported, provided that no fee is charged and that open access is granted to educational and academic research institutions.

In addition to providing a venue for sharing resources, we plan to provide a number of tools and resources (and ways to share them) to support the development and maintenance of prosodic annotation. Some of our ideas are:
1) links and original material describing common annotation standards;
2) instructions for using tools such as Praat and WaveSurfer for prosodic annotation;
3) community forums for discussing annotation standards, interesting or difficult utterances, and other community issues, including support for uploading sound files and annotations to facilitate shared discussion;
4) annotation standard verification tools;
5) the ability for data users to provide feedback to data owners, who will be able to determine how they would like to receive it;
6) versioning tools for the ongoing development and maintenance of annotated corpora.

This project is very much in its pilot stages. We are actively seeking feedback and additional material to host in this repository.

Title                                   Duration  Language          # Speakers    Genre        Annotation       License               Owner
Boston Directions Corpus [Read]         50 mins   American English  4 (3M/1F)     Read         Full ToBI        Creative Commons 3.0  Julia Hirschberg
Boston Directions Corpus [Spontaneous]  60 mins   American English  4 (3M/1F)     Spontaneous  Full ToBI        Creative Commons 3.0  Julia Hirschberg
C-PROM                                  69 mins   French            28 (16M/12F)  Various      Prominence only  Creative Commons 3.0  Mathieu Avanzi

If you are interested in sharing data on Reciprosody, please contact Andrew Rosenberg.