The Switchboard Corpus in NXT

The Switchboard in NXT
project aims to bring together major annotations of the Switchboard
corpus within a unified framework in XML format. The Switchboard corpus,
consisting of telephone conversations between speakers of American
English,
is one of the longest-standing corpora of fully spontaneous
speech. As such, it has been annotated with a range of different kinds
of linguistic information, including syntax, discourse
semantics and prosody. In this project, we have converted all of these annotations
into XML format within the Nite XML Toolkit (NXT)
framework. This allows users to query the corpus to extract data with
any combination of features from the whole range of annotated
linguistic information.

With such a diverse range of annotations, we believe the corpus offers one of the richest resources available for the study of discourse in spontaneous speech. We hope that its public release will lead to it being widely used and developed by researchers in the linguistics and NLP communities.