NXT gives three different levels of support for graphical user interfaces. The first is a very basic data display that will always work for data in the correct format. The second is a set of configurable end user tools for common coding tasks that covers simple timestamped labelling plus a range of discourse coding types. Finally, NXT contains a number of libraries that can be used to build tailored end user interfaces for a particular corpus.
Most NXT corpora come with a script for invoking the GUIs that work
with the data; look for a top level file with the extension .bat
(for Windows), .sh
(for Linux), or .command
(for Mac OSX). Where these
scripts fail to work, it is usually because the data is in a
different place than the script author expected; edit the paths in
the script to match your installation.
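As a sketch, such a start-up script usually just sets a classpath and launches one of the NXT tools. Every path and jar name below is an assumption to be adapted to your installation:

```shell
#!/bin/sh
# Hypothetical start-up script: adjust NXT_HOME, the jar name and the
# data path to match where NXT and the corpus actually live.
NXT_HOME=/path/to/nxt
export CLASSPATH="$NXT_HOME/lib/nxt.jar"
java net.sourceforge.nite.gui.util.GenericDisplay \
    -c /path/to/Data/MyCorpus/metadata.xml -o observation1
```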
These start-up scripts offer as options the standard search GUI
and generic display GUI, plus any other interfaces that have been
registered for the corpus by editing the callable-programs section of
the metadata. For corpora with many different annotations, the
generic display as accessed in this way is unusable because, by
default, it tries to load and display everything; the command-line
call gives better control.
Explanation of Search Highlighting
In the search highlighting, if there isn't a direct representation of
some element on the display, then there's nothing to highlight. For
instance, in many data sets timestamped orthographic transcription
consists of w
and
sil
elements but the sil
elements are not rendered in the display,
so the query ($s sil):
won't cause any highlighting to occur. This
can be confusing but it is the correct behaviour. Good interface design
will have a screen rendering for any elements of theoretical importance.
Any corpus in NXT format is immediately amenable to two different graphical interfaces that allow the corpus to be searched, even without writing tailored programs. The first is a simple search GUI, and the second is a generic data display program that works in tandem with the search GUI to highlight search results.
The search GUI can be reached either by using search.bat/search.sh
and
specifying which corpus to load, or by using the .bat/.sh
for the
specific corpus (if it exists) and choosing the appropriate option. It
has two tabbed windows. The query tab allows the user to type in a
query. Cut and paste from other applications works with this
window. The query can also be saved via the menu, but as of
May 2004 this doesn't work well for long queries.
There is a button that executes the query, which
automatically takes the user either to a pop-up window with an
error message explaining where the syntax of the query is incorrect,
or, for a valid query, to the result tab. This window shows the
results as an XML tree structure, with more information about the
element the user has selected (with the mouse) displayed below the
main tree.
The GUI includes an option to knit the results for use in
data analysis. It also includes an option
to save the results as a rudimentary Excel spreadsheet. This is less
handy, especially in the case of complex queries, because the return
value is hierarchically structured but the spreadsheet
just contains information about each matched element dumped into
a flat list by performing a depth-first, left-to-right traversal of
the results. However, for relatively simple queries, and for people who are used to
data filtering and pivot tables in Excel, it can be the easiest first
step for analysis.
The search GUI works on an entire corpus at once. This can make it slow to respond if the corpus is very large or if the query is very complicated (although of course it's possible to comment out observations in the metadata to reduce the amount of information it loads). Sometimes a query is slow because it's doing something more complicated than what the user intended. A query can be interrupted mid-processing and will still return a partial result list, which can be useful for checking the query.
At May 2004, when the user chooses to open a corpus, the open
dialogue filters for files ending in .corpus, although many users
are likely to have the metadata file called something.xml
(so that it behaves properly in
other applications like web browsers). Choose the all-files
option (towards the bottom of the open dialogue box) in
order to see .xml files as well as .corpus
ones.
NXT comes with a generic display so that it can at least display and search any corpus in NXT format "out of the box", without having to configure the end user coding tools or build a tailored tool. It provides the absolute basics. It isn't meant for serious use, but it can be useful to test out new data or if you don't need GUIs often enough to spend time getting something better set up.
The Generic Display works on one observation at a time. It can be invoked at the command line as follows:
java net.sourceforge.nite.gui.util.GenericDisplay -c CORPUS -o OBS -f FONTSIZE -q QUERY
In the call, CORPUS
gives a path to a metadata file and OBS
names an observation
that is listed in that metadata file. These are mandatory. You may optionally specify a font size for the data rendering. You may also specify a query that will be used to choose kinds of data for display. Only the variable type information will be used in the processing; the display will show data just from the files that include data that matches variables of those types. For instance, -q '($w word)($d dialogue-act):' will render display windows for words and dialogue-acts only, ignoring all other data. This is particularly useful for corpora with many different kinds of annotation, where it would create too busy a display to show everything. For larger corpora, NXT is unable to render all of the annotations at once because this would take too much memory. It is only possible to run the generic display for such corpora with the -q option.
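For instance, a concrete call that restricts the display to words and dialogue acts might look like the following; the metadata path, observation name, and element names are placeholders to be replaced with your own:

```shell
# Hypothetical corpus path and observation name; -q limits the display
# to files containing word and dialogue-act elements.
java net.sourceforge.nite.gui.util.GenericDisplay \
    -c Data/MyCorpus/metadata.xml \
    -o meeting1 \
    -f 12 \
    -q '($w word)($d dialogue-act):'
```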
The Generic Display
simply puts up an audio/video window for each signal associated
with an observation, plus one window per coding that shows the
elements in an NTextArea
, one element per line,
with indenting corresponding to the tree
structure and a rendering of the attribute values, the PCDATA the
element contains, and enough information about pointers to be able
to find their targets visually on the other windows.
It doesn't try to do anything clever about window placement.
As with other NXT GUIs, there is a search menu, and the display
shows both search and time highlights.
There are currently three built-in and configurable end user GUIs for common interface requirements.
The signal labeller is for creating timestamped labels against a signal, with the labels chosen from an enumerated list. This can be used for a very wide range of low-level annotations, such as gaze direction, movement in the room, rough starts and ends of turns, and areas to be included in or excluded from some other analysis. The tool treats the labels as mutually exclusive and exhaustive states; as the user plays the signal, whenever a new label is chosen (either with the mouse or using keyboard shortcuts), that time is used both for the beginning of the new label and the end of the old one. Although there are several similar tools available, this tool will work on either audio or video signals, including playing a set of synchronized signals together, and works natively on NXT format data, which is of benefit for user groups that intend to use NXT for further annotation. It does not, however, currently include the palette-based displays popularized by Anvil and TASX, and the signal control is meant for the coarser style of real-time coding, not for the precision timing that some projects require. It also does not contain a waveform display, and is therefore unsuitable for many kinds of speech annotation.
Java class: net.sourceforge.nite.tools.videolabeler.ContinuousVideoLabeling
The second end user GUI is for coding discourse entities above an existing text or speech transcription. Coding is performed by sweeping out the words in the entity and then choosing the correct entity type, either by clicking it in a static display of the named entity type ontology or by keyboard shortcut. It can be used for any coding that requires the user to categorize contiguous stretches of text (or of speech by one person) using labels chosen from a tree-shaped ontology. In addition, it allows the user to indicate directional relationships between two coded entities, with the relationship categorized from a set of labels. The most common uses for this style of interface are in marking up named entities and coreferential relationships.
Java class: net.sourceforge.nite.tools.necoder.NECoder
The final GUI is for segmenting discourse into contiguous stretches of text (or of speech by one person) and categorizing the segments. The most common use for this style of interface is a dialogue act coder. Coding is performed by marking the end of each discourse segment; the segment is assumed to start at the end of the last segment (or the last segment by the same speaker), with the option of not allowing segments to draw words across some higher level boundary, such as previously marked speaker turns. A permanent dialogue box displays information about the currently selected act and allows a number of properties to be specified for it beyond simple type. The coding mechanisms supported include a tickbox to cover boolean properties such as termination of the act before completion, free text comments, and choice from a small, enumerated, mutually exclusive list, such as might be used for noting the dialogue act's addressee. Although this structure covers some styles of dialogue act coding, this tool is not suitable for schemes such as MRDA where dual-coding from the same act type list is allowed. This tool additionally allows the user to indicate directional relationships between acts using the same mechanism as in the discourse entity coder, although for current dialogue act schemes this is a minority requirement.
Java class: net.sourceforge.nite.tools.dacoder.DACoder
This is the first in a series of tools to display multiple versions of a particular type of annotation. The non-spanning comparison display can show two different annotators' data over the same base-level transcription. We use annotator loosely to mean any human or machine process that results in an annotation. This is a display only, not an annotation tool. Display details are controlled using a configuration file much like the other end user GUIs, though there are two extra settings required (see below). The display shows the two annotators' names, with the first underlined and the second italicised, in a small Annotator Legend window. Every annotation by the first annotator is underlined and every annotation by the second is italicised so that the two can be distinguished. The types of annotations will be distinguished in the same way as for the discourse entity coder.
Java class: net.sourceforge.nite.tools.comparison.nonspanning.NonSpanningComparisonDisplay
This is the second in a series of tools to display multiple versions of a particular type of annotation. The dual transcription comparison display shows two transcriptions side-by-side, using a different configuration for each: for example, a manual transcription on the left and an automatic transcription on the right. This is a display only, not an annotation tool. Display details are controlled using a configuration file much like the other end user GUIs. Any annotations on the transcriptions will be displayed in the same way as for the non-spanning comparison display.
Java class: net.sourceforge.nite.tools.comparison.dualtranscription.DualTranscriptionComparisonDisplay
There are two basic steps to configure one of these end-user tools for your corpus:
Consider what you want to code and which tool you want to
use. Edit the codings and layers in the metadata file for your new
annotation, then add something like this to the callable-programs
section of your metadata file:
<callable-program description="Named Entity Annotation"
                  name="net.sourceforge.nite.tools.necoder.NECoder">
  <required-argument name="corpus" type="corpus"/>
  <required-argument name="observation" type="observation"/>
  <required-argument name="config" default="myConfig.xml"/>
  <required-argument name="corpus-settings" default="my-corpus-settings-id"/>
  <required-argument name="gui-settings" default="my-gui-settings-id"/>
</callable-program>
This tells NXT to allow the use of the built-in Named Entity coder on
this corpus. When you start up
net.sourceforge.nite.nxt.GUI
on this metadata, a new
entry will appear called Named Entity
Annotation
. The required-arguments require
first that the corpus (metadata file name) is passed to the tool and
then that an observation is chosen by the user. The third required-argument
,
config
tells NXT where to find the
configuration file for this tool, relative to the metadata, and the
last two tell it which settings to use within that file (see next
section).
Configuration files can look complicated but the requirements to
get started are really quite simple. One example configuration file is
included in the NXT distribution as lib/nxtConfig.xml
. It
contains extensive comments about what the settings mean. Below is a
full discussion of the elements and attributes of the configuration
files, but to continue with the above example, here is a configuration
file (according to the above metadata fragment, it should be called
myConfig.xml
and located in the same directory as the
metadata). This configures the named entity coder:
<NXTConfig>
  <DACoderConfig>
    <!-- Corpus settings for the ICSI corpus -->
    <corpussettings
        id = "my-corpus-settings-id"
        segmentationelementname = "segment"
        transcriptionlayername = "words-layer"
        transcriptiondelegateclassname = "MyTranscriptionToTextDelegate"
        neelementname = "named-entity"
        neattributename = "type"
        annotatorspecificcodings = "nees"/>
    <guisettings
        id = "my-gui-settings-id"
        gloss = "My Corpus settings"
        applicationtitle = "My Corpus Tool"/>
  </DACoderConfig>
</NXTConfig>
Note the corpussettings
element with the ID
my-corpus-settings-id
as referred to in the metadata
file, and similarly a guisettings
element named
my-gui-settings-id
. In this way, a configuration file can
contain any number of different configurations for different corpora
as well as different tools, though it's normally clearer to have at
least one config file per corpus.
Some Important Settings
neelementname
the name of the element, which must be present in the metadata file, that will be created by the named entity tool
neattributename
if this is present, we are using an enumerated attribute directly on
the neelementname
rather than a pointer into a type
hierarchy. The attribute must be present in the metadata file and must
be enumerated. To use a pointer into a type hierarchy you should
specify at least the neontology
, neroot
,
nenameattribute
and netyperole
instead of this single attribute. Note:
this feature is only available in NXT versions after March 2006.
segmentationelementname
the element used to split the transcription into 'lines'. It is normally assumed that this is an agent coding; if so, the agent associated with each speaker is placed in front of each line.
transcriptionlayername
the layer that contains the transcription to be printed. How it
actually appears can be specified using
transcriptiondelegateclassname
.
transcriptiondelegateclassname
if this is absent, any element in the transcriptionlayername
with text
content will be displayed as transcription. If it is present, each
element is passed to the delegate class in order to display the
transcription. Any such delegate class has to implement the Java
interface TranscriptionToTextDelegate
which contains the
single method getTextForTranscriptionElement(NOMElement nme)
.
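As an illustration, here is a minimal sketch of such a delegate. The real TranscriptionToTextDelegate interface and NOMElement class come from the NXT jar; the stand-in versions below are simplified assumptions included only so the sketch is self-contained, and the rendering rule (hide sil elements, print everything else) is hypothetical:

```java
// Stand-in for NXT's TranscriptionToTextDelegate interface: the method
// name and signature follow the documentation above.
interface TranscriptionToTextDelegate {
    String getTextForTranscriptionElement(NOMElement nme);
}

// Stand-in for net.sourceforge.nite.nom.nomwrite.NOMElement: just
// enough state to carry an element name and its text content.
class NOMElement {
    private final String name;
    private final String text;
    NOMElement(String name, String text) { this.name = name; this.text = text; }
    String getName() { return name; }
    String getText() { return text; }
}

// Hypothetical delegate: render the text of each transcription element,
// but suppress silence (sil) elements entirely.
class MyTranscriptionToTextDelegate implements TranscriptionToTextDelegate {
    public String getTextForTranscriptionElement(NOMElement nme) {
        if ("sil".equals(nme.getName())) {
            return "";                       // don't render silences
        }
        String text = nme.getText();
        return text == null ? "" : text;     // guard against missing text
    }
}
```

Against the real library, the delegate would instead implement the interface from the NXT jar and be named in the transcriptiondelegateclassname setting.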
This section is a detailed look at the settings in the NXT configuration files. Note: some of these settings can only be used in NXT builds after 1.3.5 (9/5/06). The details may change in future versions.
At the top level, in the NXTConfig
element, there are
currently two possible subelements: DACoderConfig
and
CSLConfig
. The first is for configuring discourse coding
tools (the dialogue act coder, named entity coder, etc.); the second is
for configuring the Continuous Signal Labeller tool.
Both CSLConfig
and DACoderConfig
can contain
any number of corpussettings
and guisettings
elements, each of which has an id
attribute to uniquely identify it:
often these IDs will be used in the callable-programs section
of an NXT metadata file. guisettings
are preferences that
affect the overall look of the interface and
corpussettings
tell NXT about the elements to be
displayed and annotated. The detail of what goes where is described in
each subsection below.
guisettings
attributes
id
Unique identifier
gloss
Example element containing short explanation of all possible settings
showapwindow
If true, the Adjacency Pair (or relation) window is shown in
the discourse entity coder. Defaults to true
.
showlogwindow
If true
, the log feedback window is shown.
Defaults to true
.
applicationtitle
The title that you want to see in the main frame
wordlevelselectiontype
This determines what units are selectable on the
speech transcriptions (assuming
transcriptselection
is not false
). There are
currently five valid strings - anything else will result in the
default behaviour: in_segment_phrase
. The values
and their meanings are: one_word
: only single
words can be selected at a time; one_segment
:
only single segments can be selected;
multiple_segments
: multiple complete segments can
be selected; in_segment_phrase
: contiguous words
that lie within a single segment can be selected;
cross_segment_phrase
: contiguous words across
segments can be selected (note that the selection can in fact be
discontiguous if multiagentselection
is not
true).
transcriptselection
This determines whether you can select speech
transcription elements. If this is false
no speech text selection will take
place, regardless of settings such as allowMultiAgentSelect
or wordlevelSelectionType
. Defaults to true
.
annotationselection
This determines whether you can select annotation
elements. If this is false
no annotation selection will take
place, regardless of other settings. Defaults to true
.
multiagentselection
This determines whether you can select data from more than one agent.
If this is true
such selection can take place. Defaults to false
.
corpussettings
attributes
id
Unique identifier
gloss
Example element containing short explanation of all possible settings
segmentationelementname
Element name of the segmentation elements that pre-segments the transcription layer. Used for the initial display of the text.
segmenttextattribute
Name of the attribute on the segment element to use as the header of each transcription line. Use a delegate (below) for more complex derivation. If neither delegate nor attribute is set, the agent is used as the line header (if agent is specified).
segmenttextdelegateclassname
full class name of a TranscriptionToTextDelegate
that
derives the text of the segment header from each segment element. Note
this is not the transcription derivation, just the derivation of the
header for each line of transcription. If neither this delegate nor
segmenttextattribute
is set, the agent is used as the
line header (if agent is specified).
transcriptionlayername
Layer name of the transcription layer
transcriptionattribute
Name of the attribute in which text of transcription is stored. Leave out if text not stored in attribute.
transcriptiondelegateclassname
full class name of TranscriptionToTextDelegate
. Leave
out if no delegate is
used. net.sourceforge.nite.gui.util.AMITranscriptionToTextDelegate
is an example delegate class that works for the AMI corpus. For a new
corpus you may have to write your own, but it is a simple
process.
daelementname
element name of dialogue act instances
daontology
ontology name of dialogue acts
daroot
nite-id of dialogue act root
datyperole
role name of the pointer from a dialogue act to its type
daattributename
The
enumerated attribute on the DA element used as its 'type'. If this
attribute is set, the daontology
, daroot
and
datyperole
attributes are
ignored.
dagloss
the name of the attribute of the dialog act types that contains some extra description of the meaning of this type
apelementname
element name of adjacency pair instances
apgloss
the name of the attribute of the relation types that contains some extra description of the meaning of this type
apontology
ontology name of adjacency pairs
aproot
nite-id of adjacency pair root
defaultaptype
nite-id of default adjacency pair type
aptyperole
role name of the pointer from an AP to its type
apsourcerole
role name of the pointer from an AP to its source
aptargetrole
role name of the pointer from an AP to its target
neelementname
element name of named entity instances
neattributename
The
enumerated attribute on the NE element used as its 'type'. If this
attribute is set, the neontology
, neroot
and
netyperole
attributes are
ignored.
neontology
ontology name of named entities
neroot
nite-id of named entities root
neontologyexpanded
set to
false
if you want the ontology to remain in un-expanded
form on startup. The default is to expand the
tree.
nenameattribute
attribute name of the attribute that contains the name of the named entity
netyperole
role name of the pointer from a named entity to its type
nenesting
Set to true
to allow named entities to nest
inside each other. Defaults to false
.
if this is
true each span of words can be associated with
multiple values in the ontology. Note that this only makes sense when
the neattributename
is not set - this setting is ignored
if neattributename
is set. It also requires that the
nenesting
attribute is true.
abbrevattribute
name of the attribute which contains an abbreviated code for the named entity for in-text display
nelinkelementname
The
element linking NEs together. Used by
NELinker
.
nelinkattribute
The
enumerated attribute on the NE link element used as its 'type'. If
this attribute is set, the nelinkontology
,
nelinkroot
and nelinkrole
attributes are
ignored, and the nelinktypedefault
if present is the
default string value of the type. Used by
NELinker
.
nelinkontology
The type
ontology pointed to by the NE link element. Used by
NELinker
.
nelinkroot
The root of the
type ontology pointed into by the NE link element. Used by
NELinker
.
nelinktyperole
The role
used to point into the type ontology by the NE link element. Used by
NELinker
.
nelinktypedefault
The
default type value for NE link elements. Used by
NELinker
.
nelinksourcerole
The role
of the pointer from the link element to the first (or source) NE
element. Used by NELinker
.
nelinktargetrole
The role
of the pointer from the link element to the second (or target) NE
element. Used by NELinker
.
annotatorspecificcodings
the semi-colon-separated list of codings that are annotator specific,
i.e. for which each individual annotator will get his or her own
datafiles. Usually these are the codings for all layers that will be
annotated in the DACoder
; see AMI example. This setting only has
effect when the tool is started for a named annotator or
annotators.
nsannotatorlayer
Only used by NonSpanningComparisonDisplay, this specifies the layer
containing elements to compare. This is the top layer passed to the
multi-annotator corpus load.
nscommonlayer
Only used by NonSpanningComparisonDisplay, this is the layer that is
common between all annotators - it will normally be the same layer as
transcriptionlayername
.
Although the da and ap prefixes used in the attribute names above
stand for 'dialogue act' and 'adjacency pair', these settings can
refer to any kind of discourse elements, and relations between them,
that you wish to annotate.
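Putting the named entity and relation settings together, a corpussettings fragment for the NELinker style of coding might look as follows. All ids, element names, and roles here are invented and must match the declarations in your own metadata:

```xml
<corpussettings
    id                = "my-ne-link-settings"
    neelementname     = "named-entity"
    neontology        = "ne-types"
    neroot            = "netype-root"
    nenameattribute   = "name"
    netyperole        = "type"
    nelinkelementname = "ne-link"
    nelinkontology    = "nelink-types"
    nelinkroot        = "nelinktype-root"
    nelinktyperole    = "type"
    nelinksourcerole  = "source"
    nelinktargetrole  = "target"/>
```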
guisettings
attributes
id
Unique identifier
gloss
Example CSL settings, giving an explanation for every entry.
autokeystrokes
Optional (default false
):
if true
, keystrokes will be made automatically if no keystroke is defined
in the corpus data or if the defined keystroke is already in use.
showkeystrokes
Optional (default off
):
set to off
(keystroke won't be shown in the GUI),
tooltip
(keystroke will be shown in the tooltip of a control)
or label
(keystroke will be shown in the label of a control).
continuous
Optional (default true
):
if true
, the CSL tool will ensure that annotations remain continuous (preventing gaps in the time line)
syncrate
Optional (default 200
):
the number of milliseconds between time change events from the NXT clock
timedisplay
Optional (default seconds
):
the type of display of coding times in the annotation window: if minutes,
the format is like that of the clock, h:mm:ss.ms
corpussettings
attributes
id
Unique identifier
gloss
Example CSL settings for Dagmar demo corpus
annotatorspecificcodings
pose
For the Continuous Signal Labeller we expect the
corpussettings
element to contain a number of
layerinfo
elements, each of which can contain these
attributes. Each layer named within the current
corpussettings
element can be coded using the same tool:
users choose what they're annotating using a menu.
corpussettings / layerinfo
attributes
id
Unique identifier
gloss
Textual description of this layer
codename
Name of the elements that are annotated in the given layer
layername
The name of the layer that you want to code in the video labeler
layerclass
Delegate AnnotationLayer
class. Defaults to
net.sourceforge.nite.tools.videolabeler.LabelAnnotationLayer
controlpanelclass
Delegate
TargetControlPanel
class. Defaults to
net.sourceforge.nite.tools.videolabeler.LabelTargetControlPanel
enumeratedattribute
Either this or pointerrole
is required for LabelAnnotationLayer: name of the attribute that
should be set. The attribute must exist on the codename element and
must be enumerated. Currently no flexibility is offered in the
keyboard shortcuts; they always start at "1" and increase
alphanumerically.
pointerrole
Either this or
enumeratedattribute
is required for LabelAnnotationLayer
:
role of the pointer that points to the object set or ontology that contains the labels.
labelattribute
Required for
LabelAnnotationLayer
: name of the attribute of an object set or ontology
element that contains the label name.
evaluationattribute
Required
for FeeltraceAnnotationLayer
: name of the double value attribute that contains
the evaluation of an emotion.
activationattribute
Required for
FeeltraceAnnotationLayer
: name of the double value attribute that contains
the activation of an emotion.
showlabels
Optional (default true) for FeeltraceTargetControlPanel
:
if true
, labels for some predefined emotions will be shown in the Feeltrace circle.
clickannotation
Optional (default false
)
for FeeltraceTargetControlPanel
: if true
,
the user can click to start and end annotating; if false
, the user should keep
the mouse button pressed while annotating.
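Putting the CSL guisettings, corpussettings, and layerinfo pieces together, a configuration fragment might look as follows. All ids, layer names, element names, and attribute values here are hypothetical and must match your own metadata:

```xml
<NXTConfig>
  <CSLConfig>
    <guisettings
        id             = "my-csl-gui-settings"
        gloss          = "CSL GUI settings"
        autokeystrokes = "true"
        showkeystrokes = "tooltip"
        continuous     = "true"
        syncrate       = "200"/>
    <corpussettings
        id    = "my-csl-corpus-settings"
        gloss = "CSL settings for my corpus"
        annotatorspecificcodings = "pose">
      <layerinfo
          id                  = "pose-layerinfo"
          gloss               = "Pose labelling"
          codename            = "pose"
          layername           = "pose-layer"
          enumeratedattribute = "type"/>
    </corpussettings>
  </CSLConfig>
</NXTConfig>
```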
Please refer to the NXT Javadoc and
the example programs in the samples
directory.
It's often useful for applications to be
able to pop up a search window and react to search results as
they are selected by the user. Using any in-memory corpus that
implements the SearchableCorpus
interface (for
example NOMWriteCorpus
), you can achieve this very simply. If nom
is a valid
SearchableCorpus,
we could use:
net.sourceforge.nite.search.GUI searchGui = new GUI(nom);
searchGui.registerResultHandler(handler);
...
searchGui.popupSearchWindow();
In this extract, the first line initializes the search GUI by
passing it a SearchableCorpus
. The second line
tells the GUI to inform handler
when
search results are selected. handler
must implement the QueryResultHandler
interface.
This simple interface is already implemented by some of NXT's
own GUI components like NTextArea
, but this
mechanism allows you complete freedom to do what you want with
search results. There is no obligation to register a result
handler at all, but omitting one may result in a less useful interface.
The third line of the listing actually causes the search window to appear, and will normally be executed as the result of a user action such as selecting a menu item.