Graphical user interfaces

NXT gives three different levels of support for graphical user interfaces. The first is a very basic data display that will always work for data in the correct format. The second is a set of configurable end user tools for common coding tasks that covers simple timestamped labelling plus a range of discourse coding types. Finally, NXT contains a number of libraries that can be used to build tailored end user interfaces for a particular corpus.

Preliminaries

Invoking the GUIs

Most NXT corpora come with a script for invoking the GUIs that work with the data; look for a top-level file with the extension .bat (for Windows), .sh (for Linux), or .command (for Mac OS X). If one of these scripts fails to work, it is usually because the data is in a different place than the script author expected, and the script needs editing to match. These start-up scripts offer as options the standard search GUI and generic display GUI, plus any other interfaces that have been registered for the corpus in the callable-programs section of the metadata. For corpora with many different annotations, the generic display as accessed in this way is unusable because, by default, it tries to load and display everything; the command line call gives better control.

Time Highlighting

Section incomplete

Search Highlighting

In the search highlighting, if there isn't a direct representation of some element on the display, then there's nothing to highlight. For instance, in many data sets timestamped orthographic transcription consists of w and sil elements but the sil elements are not rendered in the display, so the query ($s sil): won't cause any highlighting to occur. This can be confusing but it is the correct behaviour. Good interface design will have a screen rendering for any elements of theoretical importance.

Generic tools that work on any data

Any corpus in NXT format is immediately amenable to two different graphical interfaces that allow the corpus to be searched, even without writing tailored programs. The first is a simple search GUI, and the second is a generic data display program that works in tandem with the search GUI to highlight search results.

The NXT Search GUI

The search GUI can be reached either by using search.bat/search.sh and specifying which corpus to load, or by using the .bat/.sh for the specific corpus (if it exists) and choosing the Search option. It has two tabbed windows. The query tab allows the user to type in a query; cut and paste from other applications works in this window. The query can also be saved on the Bookmark menu, although as of May 2004 this does not work well for long queries. Pressing the search button takes the user either to a pop-up window with an error message explaining where the syntax of the query is incorrect or, for a valid query, to the result tab. This window shows the results as an XML tree structure, with more information about the element the user has selected (with the mouse) displayed below the main tree.

The GUI includes an option to save the XML result tree to a file. This can be very handy in conjunction with knit for performing data analysis. It also includes an option to save the results in a rudimentary Excel spreadsheet. This is less handy, especially in the case of complex queries, because the return value is hierarchically structured but the spreadsheet just contains information about each matched element dumped into a flat list by performing a depth-first, left-to-right traversal of the results. However, for relatively simple queries and people who are used to data filtering and pivot tables in Excel, it can be the easiest first step for analysis.

The search GUI works on an entire corpus at once. This can make it slow to respond if the corpus is very large or if the query is very complicated (although of course it's possible to comment out observations in the metadata to reduce the amount of information it loads). Sometimes a query is slow because it's doing something more complicated than what the user intended. A query can be interrupted mid-processing and will still return a partial result list, which can be useful for checking that it matches what was intended.

As of May 2004, when the user chooses to open a corpus from the File menu, the search GUI expects the metadata file to be called something.corpus, although many users are likely to have it called something.xml (so that it behaves properly in other applications like web browsers). Choose the All files option (towards the bottom of the open dialogue box) in order to see .xml files as well as .corpus ones.

The Generic Display

NXT comes with a generic display so that it can at least display and search any corpus in NXT format "out of the box", without having to configure the end user coding tools or build a tailored tool. It provides the absolute basics. It isn't meant for serious use, but it can be useful to test out new data or if you don't need GUIs often enough to spend time getting something better set up.

The Generic Display works on one observation at a time. It can be invoked at the command line as follows:

java net.sourceforge.nite.gui.util.GenericDisplay -c CORPUS -o OBS -f FONTSIZE -q QUERY

In the call, CORPUS gives a path to a metadata file and OBS names an observation that is listed in that metadata file. These are mandatory. You may optionally specify a font size for the data rendering. You may also specify a query that chooses which kinds of data to display: only the type information from the query variables is used, and the display shows data just from the files that contain elements matching those types. For instance, -q '($w word)($d dialogue-act):' will render display windows for words and dialogue acts only, ignoring all other data. This is particularly useful for corpora with many different kinds of annotation, where showing everything would create too busy a display. For larger corpora, NXT is unable to render all of the annotations at once because this would take too much memory; for such corpora the generic display can only be run with the -q option.

The Generic Display simply puts up an audio/video window for each signal associated with an observation, plus one window per coding. Each coding window shows the elements in an NTextArea, one element per line, with indenting corresponding to the tree structure. Each line renders the element's attribute values, the PCDATA the element contains, and enough information about pointers to be able to find their targets visually in the other windows. The display does not try to do anything clever about window placement. As with other NXT GUIs, there is a Search menu, and the display shows both search and time highlights.

Configurable end user coding tools

There are currently three built-in and configurable end user GUIs for common interface requirements.

The signal labeller

The signal labeller is for creating timestamped labels against a signal, with the labels chosen from an enumerated list. This can be used for a very wide range of low-level annotations, such as gaze direction, movement in the room, rough starts and ends of turns, and areas to be included in or excluded from some other analysis. The tool treats the labels as mutually exclusive and exhaustive states; as the user plays the signal, whenever a new label is chosen (either with the mouse or using keyboard shortcuts), that time is used both for the beginning of the new label and the end of the old one. Although there are several similar tools available, this tool will work on either audio or video signals, including playing a set of synchronized signals together, and works natively on NXT format data, which is of benefit for user groups that intend to use NXT for further annotation. It does not, however, include the palette-based displays popularized by Anvil and TASX, and the signal control is meant for the coarser style of real-time coding, not for the precision timing that some projects require. It also does not contain a waveform display, and is therefore unsuitable for many kinds of speech annotation.

Java class: net.sourceforge.nite.tools.videolabeler.ContinuousVideoLabeling

The discourse entity coder

The second end user GUI is for coding discourse entities above an existing text or speech transcription. Coding is performed by sweeping out the words in the entity and then mousing on the correct entity type from a static display of the named entity type ontology, or choosing it by keyboard shortcut. It can be used for any coding that requires the user to categorize contiguous stretches of text (or of speech by one person) using labels chosen from a tree-shaped ontology. In addition, it allows the user to indicate directional relationships between two coded entities, with the relationship categorized from a set of labels. The most common uses for this style of interface are in marking up named entities and coreferential relationships.

Java class: net.sourceforge.nite.tools.necoder.NECoder

The discourse segmenter

The final GUI is for segmenting discourse into contiguous stretches of text (or of speech by one person) and categorizing the segments. The most common use for this style of interface is a dialogue act coder. Coding is performed by marking the end of each discourse segment; the segment is assumed to start at the end of the last segment (or the last segment by the same speaker), with the option of not allowing segments to draw words across some higher-level boundary, such as previously marked speaker turns. A permanent dialogue box displays information about the currently selected act and allows a number of properties to be specified for it beyond simple type. The coding mechanisms supported include a tickbox for boolean properties such as termination of the act before completion, free text comments, and choice from a small, enumerated, mutually exclusive list, such as might be used for noting the dialogue act's addressee. Although this structure covers some styles of dialogue act coding, this tool is not suitable for schemes such as MRDA where dual-coding from the same act type list is allowed. This tool additionally allows the user to indicate directional relationships between acts using the same mechanism as in the discourse entity coder, although for current dialogue act schemes this is a minority requirement.

Java class: net.sourceforge.nite.tools.dacoder.DACoder

The non-spanning comparison display

This is the first in a series of tools to display multiple versions of a particular type of annotation. The non-spanning comparison display can show two different annotators' data over the same base-level transcription. We use annotator loosely to mean any human or machine process that results in an annotation. This is a display only, not an annotation tool. Display details are controlled using a configuration file much like the other end user GUIs, though there are two extra settings required (see the nsannotatorlayer and nscommonlayer settings below). The display shows the two annotators' names, with the first underlined and the second italicised, in a small Annotator Legend window. Every annotation by the first annotator is underlined and every annotation by the second is italicised so that the two can be distinguished. The types of annotations are distinguished in the same way as for the discourse entity coder.

Java class: net.sourceforge.nite.tools.comparison.nonspanning.NonSpanningComparisonDisplay

The dual transcription comparison display

This is the second in a series of tools to display multiple versions of a particular type of annotation. The dual transcription comparison display shows two transcriptions side by side, using a different configuration for each: for example, a manual transcription on the left and an automatic transcription on the right. This is a display only, not an annotation tool. Display details are controlled using a configuration file much like the other end user GUIs. Any annotations on the transcriptions will be displayed in the same way as for the non-spanning comparison display.

Java class: net.sourceforge.nite.tools.comparison.dualtranscription.DualTranscriptionComparisonDisplay

How to configure the end user tools

There are two basic steps to configure one of these end-user tools for your corpus:

Edit the Metadata File

Consider what you want to code and which tool you want to use. Edit the codings and layers in the metadata file for your new annotation, then add something like this to the callable-programs section of your metadata file:

<callable-program description="Named Entity Annotation" 
  name="net.sourceforge.nite.tools.necoder.NECoder">
    <required-argument name="corpus" type="corpus"/>
    <required-argument name="observation" type="observation"/>
    <required-argument name="config" default="myConfig.xml"/>
    <required-argument name="corpus-settings" default="my-corpus-settings-id"/>
    <required-argument name="gui-settings" default="my-gui-settings-id"/>
</callable-program>

This tells NXT to allow the use of the built-in named entity coder on this corpus. When you start up net.sourceforge.nite.nxt.GUI on this metadata, a new entry will appear called Named Entity Annotation. The required-arguments require first that the corpus (metadata file name) is passed to the tool and then that an observation is chosen by the user. The third required-argument, config, tells NXT where to find the configuration file for this tool, relative to the metadata, and the last two tell it which settings to use within that file (see next section).

Edit or Create the Configuration File

Configuration files can look complicated but the requirements to get started are really quite simple. One example configuration file is included in the NXT distribution as lib/nxtConfig.xml. It contains extensive comments about what the settings mean. Below is a full discussion of the elements and attributes of the configuration files, but to continue with the above example, here is a configuration file (according to the above metadata fragment, it should be called myConfig.xml and located in the same directory as the metadata). This configures the named entity coder:

<NXTConfig>
  <DACoderConfig>
    <!-- Corpus settings for the ICSI corpus -->
    <corpussettings
        id                      = "my-corpus-settings-id"
        segmentationelementname = "segment"
        transcriptionlayername  = "words-layer"
        transcriptiondelegateclassname = "MyTranscriptionToTextDelegate"
        neelementname           = "named-entity"
        neattributename         = "type"
        annotatorspecificcodings= "nees"
    />    

    <guisettings
        id                      = "my-gui-settings-id"
        gloss                   = "My Corpus settings"
        applicationtitle        = "My Corpus Tool"
    />

  </DACoderConfig>
</NXTConfig>

Note the corpussettings element with the ID my-corpus-settings-id as referred to in the metadata file, and similarly a guisettings element named my-gui-settings-id. In this way, a configuration file can contain any number of different configurations for different corpora as well as different tools, though it's normally clearer to have at least one config file per corpus.

Some Important Settings

neelementname

the name of the element, which must be present in the metadata file, that will be created by the named entity tool

neattributename

if this is present, we are using an enumerated attribute directly on the neelementname rather than a pointer into a type hierarchy. The attribute must be present in the metadata file and must be enumerated. To use a pointer into a type hierarchy you should specify at least the neontology, neroot, nenameattribute and netyperole instead of this single attribute. Note: this feature is only available in NXT versions after March 2006.

segmentationelementname

the element used to split the transcription into 'lines'. It is normally assumed this is an agent-coding and if so, the agent associated with each speaker is placed in front of each line.

transcriptionlayername

the layer that contains the transcription to be printed. How it actually appears can be specified using transcriptiondelegateclassname.

transcriptiondelegateclassname

if this is absent, any element in the transcriptionlayername with text content will be displayed as transcription. If it is present, each element is passed to the delegate class in order to display the transcription. Any such delegate class has to implement the Java interface TranscriptionToTextDelegate which contains the single method getTextForTranscriptionElement(NOMElement nme).
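Such a delegate is straightforward to write. The sketch below shows the general shape of one; in real use the class would implement NXT's TranscriptionToTextDelegate interface against NOMElement, but here a simplified stand-in interface keeps the example self-contained, and the stand-in's accessor names (and the "orth" attribute) are illustrative assumptions, so check the NXT Javadoc for the real API.

```java
// Simplified stand-in for NXT's NOMElement, for illustration only;
// the real class lives in the net.sourceforge.nite packages and its
// accessor names may differ.
interface TranscriptionElement {
    String getText();                  // PCDATA content, possibly null
    String getAttribute(String name);  // attribute lookup, possibly null
}

// Sketch of a TranscriptionToTextDelegate-style class: derive the display
// text for a transcription element, falling back to a hypothetical "orth"
// attribute when the element carries no text content.
class MyTranscriptionToTextDelegate {
    public String getTextForTranscriptionElement(TranscriptionElement el) {
        String text = el.getText();
        if (text != null && !text.trim().isEmpty()) {
            return text;
        }
        String orth = el.getAttribute("orth"); // assumed attribute name
        return orth == null ? "" : orth;
    }
}
```

The class name matches the transcriptiondelegateclassname value in the earlier configuration example; whatever logic you choose, the tool simply calls the method once per transcription element and renders the returned string.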

Config File Detail

This section is a detailed look at the settings in the NXT configuration files. Note: some of these settings can only be used in NXT builds after 1.3.5 (9/5/06). The details may not be entirely static.

At the top level, in the NXTConfig element, there are currently two possible subelements: DACoderConfig and CSLConfig. The first is for configuring discourse coder tools (dialogue act coder, named entity coder, etc.); the second is for configuring the Continuous Signal Labeller tool.

Both CSLConfig and DACoderConfig can contain any number of corpussettings and guisettings elements, each of which has an id attribute to uniquely identify it: often these IDs will be used in the CallableTools section of an NXT metadata file. guisettings are preferences that affect the overall look of the interface and corpussettings tell NXT about the elements to be displayed and annotated. The detail of what goes where is described in each subsection below.

DACoderConfig

guisettings attributes

id

Unique identifier

gloss

Example element containing short explanation of all possible settings

showapwindow

If true, the Adjacency Pair (or relation) window is shown in the discourse entity coder. Defaults to true.

showlogwindow

If true, the log feedback window is shown. Defaults to true.

applicationtitle

The title that you want to see in the main frame

wordlevelselectiontype

This determines what units are selectable on the speech transcriptions (assuming transcriptselection is not false). There are currently five valid strings; anything else will result in the default behaviour, in_segment_phrase. The values and their meanings are: one_word: only a single word can be selected at a time; one_segment: only a single segment can be selected; multiple_segments: multiple complete segments can be selected; in_segment_phrase: contiguous words that lie within a single segment can be selected; cross_segment_phrase: contiguous words across segments can be selected (note that the selection can in fact be discontiguous if multiagentselection is not true).

transcriptselection

This determines whether you can select speech transcription elements. If this is false, no speech text selection will take place, regardless of settings such as multiagentselection or wordlevelselectiontype. Defaults to true.

annotationselection

This determines whether you can select annotation elements. If this is false no annotation selection will take place, regardless of other settings. Defaults to true.

multiagentselection

This determines whether you can select data from more than one agent. If this is true such selection can take place. Defaults to false.
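Taken together, a guisettings element for DACoderConfig might look like the following sketch; the id and title values are illustrative, and any attribute left out takes its default:

```xml
<guisettings
    id                     = "my-gui-settings-id"
    applicationtitle       = "My Corpus Tool"
    showapwindow           = "true"
    showlogwindow          = "false"
    wordlevelselectiontype = "in_segment_phrase"
    multiagentselection    = "false"
/>
```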

corpussettings attributes

id

Unique identifier

gloss

Example element containing short explanation of all possible settings

segmentationelementname

Element name of the segmentation elements that pre-segment the transcription layer. Used for the initial display of the text.

segmenttextattribute

Name of the attribute on the segment element to use as the header of each transcription line. Use a delegate (below) for more complex derivation. If neither delegate nor attribute is set, the agent is used as the line header (if agent is specified).

segmenttextdelegateclassname

full class name of a TranscriptionToTextDelegate that derives the text of the segment header from each segment element. Note this is not the transcription derivation, just the derivation of the header for each line of transcription. If neither this delegate nor segmenttextattribute is set, the agent is used as the line header (if agent is specified).

transcriptionlayername

LAYER name of the transcription layer

transcriptionattribute

Name of the attribute in which text of transcription is stored. Leave out if text not stored in attribute.

transcriptiondelegateclassname

full class name of a TranscriptionToTextDelegate. Leave out if no delegate is used. net.sourceforge.nite.gui.util.AMITranscriptionToTextDelegate is an example delegate class that works for the AMI corpus. For a new corpus you may have to write your own, but it is a simple process.

daelementname

element name of dialogue act instances

daontology

ontology name of dialogue acts

daroot

nite-id of dialogue act root

datyperole

role name of the pointer from a dialogue act to its type

daattributename

The enumerated attribute on the DA element used as its 'type'. If this attribute is set, the daontology, daroot and datyperole attributes are ignored.

dagloss

the name of the attribute of the dialog act types that contains some extra description of the meaning of this type

apelementname

element name of adjacency pair instances

apgloss

the name of the attribute of the relation types that contains some extra description of the meaning of this type

apontology

ontology name of adjacency pairs

aproot

nite-id of adjacency pair root

defaultaptype

nite-id of default adjacency pair type

aptyperole

role name of the pointer from a AP to its type

apsourcerole

role name of the pointer from a AP to its source

aptargetrole

role name of the pointer from a AP to its target

neelementname

element name of named entity instances

neattributename

The enumerated attribute on the NE element used as its 'type'. If this attribute is set, the neontology, neroot and netyperole attributes are ignored.

neontology

ontology name of named entities

neroot

nite-id of named entities root

neontologyexpanded

set to false if you want the ontology to remain in un-expanded form on startup. The default is to expand the tree.

nenameattribute

attribute name of the attribute that contains the name of the named entity

netyperole

role name of the pointer from a named entity to its type

nenesting

Set to true to allow named entities to nest inside each other. Defaults to false.

nemultipointers

if this is true, each span of words can be associated with multiple values in the ontology. Note that this only makes sense when neattributename is not set; this setting is ignored if neattributename is set. It also requires that the nenesting attribute is true.

abbrevattribute

name of the attribute which contains an abbreviated code for the named entity for in-text display

nelinkelementname

The element linking NEs together. Used by NELinker.

nelinkattribute

The enumerated attribute on the NE link element used as its 'type'. If this attribute is set, the nelinkontology, nelinkroot and nelinkrole attributes are ignored, and the nelinktypedefault if present is the default string value of the type. Used by NELinker.

nelinkontology

The type ontology pointed to by the NE link element. Used by NELinker.

nelinkroot

The root of the type ontology pointed into by the NE link element. Used by NELinker.

nelinktyperole

The role used to point into the type ontology by the NE link element. Used by NELinker.

nelinktypedefault

The default type value for NE link elements. Used by NELinker.

nelinksourcerole

The role of the pointer from the link element to the first (or source) NE element. Used by NELinker.

nelinktargetrole

The role of the pointer from the link element to the second (or target) NE element. Used by NELinker.

annotatorspecificcodings

the semi-colon-separated list of codings that are annotator specific, i.e. for which each individual annotator will get his or her own datafiles. Usually these are the codings for all layers that will be annotated in the DACoder; see AMI example. This setting only has effect when the tool is started for a named annotator or annotators.

nsannotatorlayer

Only used by NonSpanningComparisonDisplay, this specifies the layer containing the elements to compare. This is the top layer passed to the multi-annotator corpus load.

nscommonlayer

Only used by NonSpanningComparisonDisplay, this is the layer that is common to all annotators; it will normally be the same layer as transcriptionlayername.

Although the da and ap prefixes used in the attribute names above stand for 'dialogue act' and 'adjacency pair', these settings can refer to any kind of discourse elements, and relations between them, that you wish to annotate.
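For concreteness, here is a sketch of a corpussettings fragment for a dialogue act coder that points into a type ontology; all element, layer, ontology, and role names are illustrative and must match the declarations in your own metadata:

```xml
<corpussettings
    id                      = "my-da-settings-id"
    segmentationelementname = "segment"
    transcriptionlayername  = "words-layer"
    daelementname           = "dact"
    daontology              = "da-types"
    daroot                  = "da-root"
    datyperole              = "type"
    apelementname           = "adjacency-pair"
    apontology              = "ap-types"
    aproot                  = "ap-root"
    defaultaptype           = "ap-default"
    aptyperole              = "type"
    apsourcerole            = "source"
    aptargetrole            = "target"
    annotatorspecificcodings= "dacts;adjacency-pairs"
/>
```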

CSLConfig

guisettings attributes

id

Unique identifier

gloss

Example CSL settings, giving an explanation for every entry.

autokeystrokes

Optional (default false): if true, keystrokes will be made automatically if no keystroke is defined in the corpus data or if the defined keystroke is already in use.

showkeystrokes

Optional (default off): set to off (keystroke won't be shown in the GUI), tooltip (keystroke will be shown in the tooltip of a control) or label (keystroke will be shown in the label of a control).

continuous

Optional (default true): if true, the CSL tool will ensure that annotations remain continuous (preventing gaps in the timeline).

syncrate

Optional (default 200): the number of milliseconds between time change events from the NXT clock

timedisplay

Optional (default seconds): the type of display of coding times in the annotation window; if minutes, the format is like that of the clock, h:mm:ss.ms.

corpussettings attributes

id

Unique identifier

gloss

Example CSL settings for Dagmar demo corpus

annotatorspecificcodings

the semi-colon-separated list of codings that are annotator specific, as described for DACoderConfig above; in the Dagmar demo example this is pose

For the Continuous Signal Labeller we expect the corpussettings element to contain a number of layerinfo elements, each of which can contain these attributes. Each layer named within the current corpussettings element can be coded using the same tool: users choose what they're annotating using a menu.

corpussettings / layerinfo attributes

id

Unique identifier

gloss

Textual description of this layer

codename

Name of the elements that are annotated in the given layer

layername

The name of the layer that you want to code in the video labeler

layerclass

Delegate AnnotationLayer class. Defaults to net.sourceforge.nite.tools.videolabeler.LabelAnnotationLayer

controlpanelclass

Delegate TargetControlPanel class. Defaults to net.sourceforge.nite.tools.videolabeler.LabelTargetControlPanel

enumeratedattribute

Either this or pointerrole is required for LabelAnnotationLayer: the name of the attribute that should be set. The attribute must exist on the codename element and must be enumerated. Currently no flexibility is offered in the keyboard shortcuts: they always start at "1" and increase alphanumerically.

pointerrole

Either this or enumeratedattribute is required for LabelAnnotationLayer: the role of the pointer that points to the object set or ontology that contains the labels.

labelattribute

Required for LabelAnnotationLayer: name of the attribute of an object set or ontology element that contains the label name.

evaluationattribute

Required for FeeltraceAnnotationLayer: name of the double value attribute that contains the evaluation of an emotion.

activationattribute

Required for FeeltraceAnnotationLayer: name of the double value attribute that contains the activation of an emotion.

showlabels

Optional (default true) for FeeltraceTargetControlPanel: if true, labels for some predefined emotions will be shown in the Feeltrace circle.

clickannotation

Optional (default false) for FeeltraceTargetControlPanel: if true, the user can click to start and end annotating; if false, the user should keep the mouse button pressed while annotating.
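Putting the CSL settings together, a minimal configuration file might look like the following sketch; all id, element, layer, and attribute names are illustrative, and the layerinfo assumes a label layer driven by an enumerated attribute:

```xml
<NXTConfig>
  <CSLConfig>
    <guisettings
        id             = "csl-gui-settings"
        autokeystrokes = "true"
        continuous     = "true"
        syncrate       = "200"
    />
    <corpussettings
        id    = "csl-corpus-settings"
        gloss = "Example CSL corpus settings">
      <layerinfo
          id                  = "gaze-layerinfo"
          gloss               = "Gaze direction coding"
          codename            = "gaze"
          layername           = "gaze-layer"
          enumeratedattribute = "direction"
      />
    </corpussettings>
  </CSLConfig>
</NXTConfig>
```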

Libraries to support GUI authoring

Please refer to the NXT Javadoc and the example programs in the samples directory.

The NXT Search GUI as a component for other tools

It's often useful for applications to be able to pop up a search window and react to search results as they are selected by the user. Using any in-memory corpus that implements the SearchableCorpus interface (for example NOMWriteCorpus), you can very simply achieve this. If nom is a valid SearchableCorpus we could use:

net.sourceforge.nite.search.GUI searchGui = new GUI(nom);
searchGui.registerResultHandler(handler);
...
searchGui.popupSearchWindow();
          

In this extract, the first line initializes the search GUI by passing it a SearchableCorpus. The second line tells the GUI to inform handler when search results are selected. handler must implement the QueryResultHandler interface. This simple interface is already implemented by some of NXT's own GUI components like NTextArea, but this mechanism allows you complete freedom to do what you want with search results. There is no obligation to register a result handler at all, but it may result in a less useful interface.

The third line of the listing actually causes the search window to appear and will normally be the result of a user action like selecting the Search menu item or something similar.
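The handler itself can be very small. The sketch below is written against a simplified stand-in interface rather than NXT's real QueryResultHandler, whose method signatures should be checked in the Javadoc; it just records the most recent selection, which an application could then use to scroll a display or update its own state.

```java
import java.util.List;

// Simplified stand-in for NXT's QueryResultHandler interface, for
// illustration only; consult the NXT Javadoc for the real signatures.
interface ResultHandler {
    void acceptResults(List<String> elementIds);
}

// A minimal handler that remembers the last selection passed to it.
class RememberingResultHandler implements ResultHandler {
    private List<String> last = List.of();
    public void acceptResults(List<String> elementIds) {
        last = elementIds;
    }
    public List<String> getLastSelection() { return last; }
}
```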