Welcome to the ICSI Corpus

The ICSI Meeting Corpus is an audio data set consisting of about 70 hours of meeting recordings. More information can be found at the ICSI web site. To access the data, follow the directions given on the download page.

The audio was recorded on close-talking microphones and is available as either separate SPH files or a single mixed WAV file. Orthographic transcription and manual annotation of dialog acts and speech quality are also available. Some third-party annotations may also be made available here.

All of the signals and transcription, and some of the annotations, have been released publicly under the Creative Commons Attribution 4.0 International Licence (CC BY 4.0).