Welcome to the NUS-48E Sung and Spoken Corpus, developed at the Sound and Music Computing Laboratory at the National University of Singapore.
The corpus is a 169-minute collection of audio recordings of the sung and spoken lyrics of 48 (20 unique) English songs, performed by 12 subjects. It also includes a complete set of phone-level transcriptions and duration annotations for all recordings of sung lyrics, comprising 25,474 phone instances.
The corpus is available here.
The corpus consists of the following:
- Twelve folders, one per subject
- Each subject folder contains a “sing” folder and a “read” folder, which hold the 4 sung recordings and the 4 corresponding spoken recordings as .wav files, along with their time-aligned, manually annotated phone-level transcriptions in .txt files
- A readme file
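The layout above could be traversed with a short script like the following. This is only a sketch under stated assumptions, not part of the corpus distribution: it assumes each annotation .txt file shares its base name with the matching .wav file, and that each annotation line holds three whitespace-separated fields (start time, end time, phone label). The readme file is the authoritative description of the actual format. The names `Phone`, `parse_annotation`, and `iter_recordings` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path
from typing import Iterator, List, NamedTuple, Optional, Tuple


class Phone(NamedTuple):
    """One annotated phone instance (hypothetical container)."""
    start: float
    end: float
    label: str


def parse_annotation(txt_path: Path) -> List[Phone]:
    """Parse an annotation file, ASSUMING 'start end label' on each line."""
    phones = []
    for line in txt_path.read_text(encoding="utf-8").splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines rather than guess
        start, end, label = fields
        phones.append(Phone(float(start), float(end), label))
    return phones


def iter_recordings(
    corpus_root: Path,
) -> Iterator[Tuple[str, str, Path, Optional[Path]]]:
    """Yield (subject, mode, wav_path, txt_path_or_None) for every recording."""
    subject_dirs = sorted(p for p in corpus_root.iterdir() if p.is_dir())
    for subject_dir in subject_dirs:
        for mode in ("sing", "read"):
            mode_dir = subject_dir / mode
            if not mode_dir.is_dir():
                continue
            for wav_path in sorted(mode_dir.glob("*.wav")):
                # ASSUMPTION: the annotation has the same base name as the audio.
                candidate = wav_path.with_suffix(".txt")
                txt_path = candidate if candidate.exists() else None
                yield subject_dir.name, mode, wav_path, txt_path
```

Calling `iter_recordings(Path("path/to/corpus"))` and passing each non-missing annotation path to `parse_annotation` would give per-recording phone lists, from which total phone counts or durations can be summed and checked against the figures quoted above.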
For information about any of the content described here, please contact Associate Professor Ye Wang (firstname.lastname@example.org) at the SMC Lab.
This dataset is shared on the agreement that it will be used solely for research purposes. If you use this dataset, please cite the following paper:
Zhiyan Duan, Haotian Fang, Bo Li, Khe Chai Sim and Ye Wang. “The NUS Sung and Spoken Lyrics Corpus: A Quantitative Comparison of Singing and Speech”. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2013 (APSIPA ASC 2013).