The Speech Lab at Queens College CUNY was established by Andrew Rosenberg in the fall of 2009.

We pursue research broadly on how speech communicates information. We have a particular focus on computational approaches to understanding prosody and intonation. We are also working to understand how deception operates in spoken communication.

This lab is an active participant in the NSF IGERT program, "From Data to Solutions: A New PhD Program in Transformational Data & Information Sciences Research and Innovation".

Our laboratory in NSB A217 contains researcher workstations and a sound isolation booth.

Automatic Prosodic Analysis

Prosody is a critical component of spoken language. Prosody, broadly, describes all of the production qualities of speech that are not involved in conveying lexical information. Where the words are "what is said", prosody is "how it is said". This project explores techniques to automatically analyze the prosodic content of speech and incorporate this into spoken language processing applications.

This work has led to the development of the AuToBI toolkit for prosodic analysis. AuToBI is an open-source toolkit that performs automatic ToBI labeling. It is written in Java.


Keyword Search in Low-Resource Languages

The task of Keyword Search is to identify instances of a given query term in a collection of speech data. As part of the BABEL program, we are investigating how prosody can be used to identify useful regions of keywords in a diverse range of low-resource languages.


User-Identifying Pause and Revision Behavior in Typing

Keystroke dynamics have been used as a biometric indicator on passwords and other predefined text. Stylometry is typically used in authorship attribution investigations. This work combines these two techniques to enable user authentication on free text based on stylometric and keystroke-dynamic indicators.

Deceptive Speech

We are developing techniques to identify deceptive behavior, focusing on the within-culture and cross-culture differences between deceivers as well as their common characteristics. In addition to investigating verbal cues to deception across and within cultures, we are also examining whether personality differences can predict deceptive strategies within cultures. [Joint work with Julia Hirschberg (Columbia) and Michelle Levine (Barnard)]


Identifying Novelty, Vagueness and Disfluency in Speech

The task of recognizing useful information in speech sits at the intersection of information extraction and summarization. This work involves 1) identifying acoustic/prosodic indicators of new information and of vague or uncertain communication of information, and 2) enabling information extraction to operate more effectively on ASR hypotheses.

Reciprosody

Reciprosody is an open repository for sharing prosodically annotated data. We are developing Reciprosody to support communication between data producers and consumers, and to provide support for the maintenance and updating of annotated data.


Negative Language Transfer in Prosody

Non-native speech can impair communication with human speakers, and hinder interaction with speech technology. This project investigates how the prosody of a speaker's native language affects their speech production when speaking English. This work supports the development of a language instruction system that focuses on teaching prosody and takes a speaker's native language into consideration.

Location

New Science Building A217
Queens College
65-30 Kissena Blvd
Flushing, NY 11367

Phone

(718) 997-3562

Email

andrew@cs.qc.cuny.edu

Twitter

@SpeechLabQC

2013

Let Me Finish: Automatic Conflict Detection Using Speaker Overlap
Felix Grezes, Justin Richards, Andrew Rosenberg
Interspeech 2013, Lyon, France

AuToBI

AuToBI is an open-source tool for automatically generating ToBI annotations. This toolkit includes Java implementations of Praat's pitch and intensity extraction algorithms. The toolkit and trained models are available from the AuToBI web page. Users are encouraged to join the AuToBI-Users mailing list.



AuToBI as a Web Service

To make it even easier for users to explore automatic prosodic analysis and AuToBI, we have deployed AuToBI as a web service. Users need only upload word segmentation material and a WAV file for analysis.

Cluster Evaluation

Evaluating the success of clustering algorithms is not a trivial task. V-measure is an evaluation measure based on two criteria: homogeneity (each cluster should include as few classes of objects as possible) and completeness (each class should be represented in as few clusters as possible). The measure is described in "V-Measure: A conditional entropy-based external cluster evaluation measure". The name comes from its similarity in calculation to F-measure, and its goal of evaluating cluster validity.
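As an illustration of the two criteria, here is a minimal Python sketch of V-measure. It is not the ClusterEvaluator code. It assumes the standard conditional-entropy formulation: homogeneity is 1 - H(C|K)/H(C), completeness is 1 - H(K|C)/H(K), and V is their weighted harmonic mean, where C are the true classes and K are the cluster assignments. The function names, the `beta` parameter name, and the handling of degenerate cases (a score of 1 when the normalizing entropy is zero) are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) for two parallel lists of labels."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(classes, clusters, beta=1.0):
    """Return (homogeneity, completeness, V) for parallel label lists.

    beta > 1 weights completeness more strongly, beta < 1 weights
    homogeneity more strongly (assumed convention for this sketch).
    """
    h_c = entropy(classes)
    h_k = entropy(clusters)
    # Assumed convention: a criterion is trivially satisfied (score 1)
    # when its normalizing entropy is zero.
    homogeneity = 1.0 if h_c == 0 else 1.0 - conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - conditional_entropy(clusters, classes) / h_k
    denom = beta * homogeneity + completeness
    if denom == 0:
        return homogeneity, completeness, 0.0
    return homogeneity, completeness, (1 + beta) * homogeneity * completeness / denom


if __name__ == "__main__":
    true_classes = [0, 0, 1, 1]
    # Every item in its own cluster: fully homogeneous, but not complete.
    print(v_measure(true_classes, [0, 1, 2, 3]))
```

Because both scores are ratios of entropies, the choice of logarithm base does not affect the result. Putting every item in its own cluster maximizes homogeneity at the cost of completeness, while putting everything in one cluster does the reverse, which is why the two are combined with a harmonic mean.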

A Java package that implements V-Measure as well as a number of other cluster evaluation measures is distributed here. Feel free to contact Andrew Rosenberg with bug reports and feature requests.
ClusterEvaluator.jar
ClusterEvaluator source code
Javadoc for ClusterEvaluator


Reciprosody

Reciprosody is a shared open repository of prosodically annotated material. This material is quite time-consuming to annotate. This repository aims to broaden research into prosody by making this material freely available to non-commercial research institutions.

SPARKLER - Scalable Prosodic, Anomaly and Relational Knowledge exploration of Language with Enhanced Robustness

DARPA DEFT Program 2012-2015


Development and Pilot Testing of a Computerized Personal Reading Tutoring Program

DMNS Research Enhancement Funds 2012-2013

LORELEI - Low Resource Language Indexing

IARPA (subcontract via IBM) 2011-2015

Reciprosody - A Repository for Prosodically Annotated Material

NSF 2012-2013

Investigating Cognitive Rhythms as a New Modality for Continuous Authentication

DARPA 2012-2013

Identifying Deception Across Cultures

Air Force Office of Scientific Research (AFOSR) 2011-2015

Generating Expressive Cued Speech from Audio Speech Signals

PSC-CUNY Research Enhancement Fund 2010