Data Problems

Issues concerning the collection and pre-processing of audio, video, and auxiliary data, and resolution status

As indicated in the table below, some of the recordings are affected by signal discontinuities. In most cases, these were caused by a sudden failure of the audio recording equipment. To generate transcriptions and annotated data for these files, the signal segments were concatenated with zero padding equal in length to the discontinuity (performed on the 48 kHz files). Frame dropping also occurred in some of the video signals collected at Edinburgh and IDIAP when they were encoded from DV tape to DivX. These signals have undergone additional processing and are now synchronized with the audio signals. Due to hardware issues during the recording stage, the video signals collected at TNO are not perfectly synchronized. However, these have undergone an additional processing step (see Meeting Rooms) and now exhibit acceptable levels of synchronization.
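The zero-padding step described above can be illustrated with a minimal sketch. It is not the tooling actually used for the corpus: the function name `pad_discontinuity`, the use of plain Python lists for samples, and the idea that the gap length is known in seconds are all assumptions made for illustration. Only the 48 kHz sample rate comes from the text.

```python
# Hypothetical sketch: join two audio segments across a dropout by inserting
# silence (zero-valued samples) equal in length to the missing stretch.

SAMPLE_RATE_HZ = 48_000  # the text states padding was performed on 48 kHz files


def pad_discontinuity(before, after, gap_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return `before` + zeros + `after`, with the zeros spanning `gap_seconds`."""
    if gap_seconds < 0:
        raise ValueError("gap_seconds must be non-negative")
    # Convert the gap duration to a whole number of samples.
    gap_samples = round(gap_seconds * sample_rate_hz)
    return list(before) + [0] * gap_samples + list(after)


if __name__ == "__main__":
    # Two toy segments separated by a 1 ms dropout (48 samples at 48 kHz).
    joined = pad_discontinuity([100, -100], [50], gap_seconds=0.001)
    print(len(joined))
```

Padding with zeros rather than simply butting the segments together keeps every later sample at its original time offset, so timestamps in transcriptions and annotations remain valid after the dropout.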

Other data collection issues relate to seat swapping by meeting participants. In a few cases, this resulted in the mislabeling of participant IDs and thumbnails. In a very small number of meetings, headset microphones are poorly positioned, and in one meeting a participant removes his headset altogether. These and other data collection and pre-processing issues are presented in the following table, along with details of whether and, where relevant, how each issue was resolved.

MEETING ID | PROBLEM | RESOLVED (Y/N) | NOTES
* all meetings | Logitech I/O digital pen output not synchronized with rest of data | N | pens' internal clocks do not drift by more than a few seconds during each meeting, providing sufficiently accurate calibration
TS* meetings | video signals not perfectly synchronized | Y | manual processing performed to reach an acceptable level of synchronization quality; see Meeting Rooms
TS3011d | missing beginning of meeting due to dropout | N | audio begins at 00:03:55.66
TS3012c | missing end of meeting due to dropout | N | audio ends at 00:39:36.5