Downloading and Using NXT

Platform-independent binary and source distributions of NXT can be downloaded from SourceForge at http://sourceforge.net/projects/nite/. For most purposes the binary download is appropriate; the source download is distinguished by a _src suffix after the version number. The most up-to-date version of NXT is available from the SourceForge CVS repository. For example,

cvs -z3 -d:pserver:anonymous@nite.cvs.sourceforge.net:/cvsroot/nite co nxt

would get you a current snapshot of the entire NXT development tree.

Prerequisites

Before using NXT, make sure you have a recent version of Java installed on your machine: Java 1.4.2_04 is the minimum requirement and Java 1.5 is recommended. Learn about Java on your platform, and download the appropriate version using Sun's Java Pages.

For optimum media performance you may also want to download JMF and the platform-specific performance pack for your OS. NXT comes packaged with a platform-independent version of JMF. Mac OS users should use the FMJ libraries instead, which use QuickTime for media playback and give better performance and easier installation. NXT comes packaged with a version of FMJ compiled specifically with QuickTime support.

Getting Started

  • Step 1: Download and unzip nxt_version.zip.

  • Step 2: Some data and simple example programs are provided to give a feel for NXT. On Windows, try double-clicking a .bat file; on Mac, try running a .command file; on Linux (or Mac), try running a shell script from a terminal, e.g. sh single-sentence.sh. More details are in the Sample Corpora section below.

  • Step 3: Try some sample media files: download signals.zip (94 MB) and unzip it into the Data directory in your NXT directory. Now when you try the programs they should run with synchronised media.

Sample Corpora

Some example NXT data and simple example programs are provided with the NXT download. Several corpora are provided, with at most one observation per corpus, even though in some cases the full corpus consists of several hundred observations. Each corpus is described by a metadata file in the Data/meta directory, with the data itself in the Data/xml directory. The Java example programs reside in the samples directory and are provided as simple examples of the kind of thing you may want to do using the library.

  • single-sentence - a very small example corpus marked up for part of speech, syntax, gesture and prosody. Start the appropriate script for your platform; start the Generic Corpus Display and rearrange the windows. Even though there is no signal for this corpus, clicking the play button on the NITE Clock will time-highlight the words as the time goes by. Try popping up the search window using the Search menu and typing a query like ($g lgest)($w word):$g # $w. This searches for left-handed gestures that temporally overlap words. You should see the three results highlighted when you click them.

  • dagmar - a slightly larger example corpus: a single monologue marked up for syntax and gesture. We provide a sample gesture-type coding interface which shows synchronisation with video (please download signals.zip to see this in action).

  • smartkom - a corpus of human-computer dialogues. We provide several example stylesheet displays for this corpus showing the display object library and synchronisation with signal (again, please download signals.zip above to see synchronisation).

  • switchboard - a corpus of telephone dialogues. We provide coders for animacy and markables which are in real-world use.

  • maptask-standoff - This is the full multi-rooted tree version of the Map Task corpus. We provide one example program that saves a new version of the corpus with the part-of-speech values as attributes on the <tu> (timed unit) tags, moving them from the "tag" attribute of <tw> tags that dominate the <tu> tags.

  • monitor - an eye-tracking version of the Map Task corpus.

  • ICSI - a corpus of meetings. We provide coders for topic segmentation, extractive summarization etc. The entire meeting corpus consists of more than 75 hours of meeting data richly annotated both manually and automatically.

Setting the CLASSPATH

All of the .bat, .command and .sh scripts in the NXT download have to set the Java CLASSPATH before running an NXT program. To compile and run your own NXT programs you need to do the same thing. The classpath normally includes all of the .jar files in the lib directory, plus the lib directory itself. Many programs only use a small proportion of those JAR files, but it's as well to include them all. JMF is a special case: you should find NXT plays media if the CLASSPATH contains lib/JMF/lib/jmf.jar. However, this will be sub-optimal: on Windows JMF is often included with Java, so you will need no jmf.jar on your CLASSPATH at all; on other platforms consult ???.
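As a rough illustration of the rule described above, the following shell sketch builds a CLASSPATH from the lib directory plus every .jar file directly inside it. The function name nxt_classpath and the NXT_HOME variable are illustrative only and are not part of NXT. The sketch assumes a Unix-style ":" path separator (Windows uses ";") and deliberately leaves out jars in subdirectories such as lib/JMF, since JMF is a special case.

```shell
#!/bin/sh
# Sketch only: nxt_classpath and NXT_HOME are hypothetical names, not part of NXT.
# Prints the lib directory followed by every .jar file directly inside it,
# joined with ":" (the Unix path separator).

nxt_classpath() {
    nxt_home=$1
    classpath="$nxt_home/lib"            # the lib directory itself
    for jar in "$nxt_home"/lib/*.jar; do
        [ -e "$jar" ] || continue        # skip the literal pattern when no jars exist
        classpath="$classpath:$jar"
    done
    printf '%s\n' "$classpath"
}

# Example: export the result so that javac and java pick it up.
CLASSPATH=$(nxt_classpath "${NXT_HOME:-.}")
export CLASSPATH
```

With the variable exported, your own programs can be compiled and run with plain javac and java commands from the same shell.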

How to Play Media Signals in NXT

NXT plays media using JMF (the Java Media Framework). JMF's support for media formats is limited and it depends on the platform you are using. A list of JMF supported formats is at http://java.sun.com/products/java-media/jmf/2.1.1/formats.html. This list is for JMF 2.1.1, which NXT currently ships with.

There are several ways of improving the coverage of JMF on your platform:

  • Performance packs from Sun - these improve codec coverage for Windows and Linux, and are available from the JMF download page. In particular, note that MPEG format isn't supported in the cross-platform version of JMF, but it is in the performance packs.

  • Fobs4JMF for Windows / Linux / MacOSX is a very useful package providing Java wrappers for the ffmpeg libraries (C libraries used by many media players which have a wide coverage of codecs and formats). Download; information. Make sure you follow the full installation instructions which involve updating the JMFRegistry and amending your LD_LIBRARY_PATH.

  • MP3 - There's an MP3 plugin available for all platforms from Sun.

Note

Direct playback from DVDs or CDs is not supported by JMF.

NXT comes with a cross-platform distribution of JMF in the lib directory, and the .bat/.sh scripts that launch the GUI samples have this copy of JMF on the classpath. On a Windows machine, it is better to install JMF centrally on the machine and change the .bat script to refer to this installation. This will often get rid of error messages and exceptions (although they don't always affect performance), and allows JMF to find more codecs.

It is a good idea to produce a sample signal and test it in NXT (and any other tools you intend to use) before starting recording proper, since changing the format of a signal can be confusing and time-consuming. There are two tests that are useful. The first is whether you can view the signal at all under any application on your machine, and the second is whether you can view the signal from NXT. The simplest way of testing the latter is to name the signal as required for one of the sample data sets in the NXT download and try the generic display or some other tool that uses the signal. For video, if the former works and not the latter, then you may have the video codec you need, but NXT can't find it - it may be possible to fix the problem by adding the video codec to the JMF Registry. If neither works, the first thing to look at is whether or not you have the video codec you need installed on your machine. Another common problem is that the video is actually OK, but the header written by the video processing tool (if you performed a conversion) isn't what JMF expects. This suggests trying to convert in a different way, although some brave souls have been known to modify the header in a text editor.

Media on the Mac

NXT ships with some startup scripts for the Mac platform (the .command files) that attempt to use FMJ to pass control of media playing from JMF to the native codecs used by the QuickTime player.

If the FMJ approach fails, you should still be able to play media on your Mac, but you'll need to edit your startup script. Take an existing .command file as a template and change the classpath. It should contain the directory lib/JMF/lib (so that jmf.properties is picked up) and the files lib/JMF/lib/jmf.jar and lib/fmj/lib/jffmpeg-1.1.0.jar, but none of the other FMJ files. This approach uses JFFMPEG more directly and works on some Mac platforms where the default FMJ approach fails. It may become the default for NXT in future.
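The media-related part of such an edited classpath might be assembled as in the sketch below. The helper name mac_media_classpath is hypothetical; only the three paths come from the description above. The result would still need to be combined with the other non-FMJ entries already present in the .command file you use as a template.

```shell
#!/bin/sh
# Sketch only: mac_media_classpath is a hypothetical helper, not part of NXT.
# Prints the three media-related classpath entries for the fallback approach:
# the JMF lib directory (so jmf.properties is found), jmf.jar, and the JFFMPEG jar.

mac_media_classpath() {
    nxt_home=$1
    printf '%s:%s:%s\n' \
        "$nxt_home/lib/JMF/lib" \
        "$nxt_home/lib/JMF/lib/jmf.jar" \
        "$nxt_home/lib/fmj/lib/jffmpeg-1.1.0.jar"
}

# Example: show the entries for an NXT directory in the current location.
mac_media_classpath "${NXT_HOME:-.}"
```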

Programmatic Controls for NXT

This section describes how to control certain behaviours of NXT from the command line.

These switches can be set using Java properties. Environment variables with the same names and values are also read, though properties will override environment variables. Example:

java -DNXT_DEBUG=0 -DNXT_QUERY_REWRITE=true CountQueryResults -c mymeta.xml \
     -o IS1003d -q '($s summ)($w w):text($w)="project" && $s^$w'

This runs the CountQueryResults program with query rewriting on and in silent mode (i.e. no messages). Setting environment variables with the same names will no longer work.

Java Arguments Controlling NXT Behaviour

NXT_DEBUG=number

The expected value is a number between 0 and 4. 0: no messages; 1: errors only; 2: important messages; 3: warnings; 4: debug information. The arguments true and false are also accepted to turn messages on or off.

NXT_QUERY_REWRITE

Values accepted: true or false; defaults to false. If the value is true, NXT will automatically rewrite queries in an attempt to speed up execution.

NXT_LAZY_LOAD

Values accepted: true or false; defaults to true. If the value is false, lazy loading will not be used. That means that data will be loaded en masse rather than as required. This can cause memory problems when too much data is loaded.

NXT_RESOURCES_ALWAYS_ASK

Values accepted: true or false; defaults to false. If the value is true, the user will be asked for input at all points where there is more than one resource listed in the resource file for a coding that needs to be loaded. The user will be asked even if there are already preferred / forced / defaulted resources for the coding. This should only be used by people who really understand the use of resources in NXT.

NXT_RESOURCES

A list of strings separated by commas (no spaces). Each string is taken to be the name of a resource in the resources file for the corpus and is passed to forceResourceLoad so that it must be loaded. Messages will appear if the resource names do not appear in the resource file.

NXT_ANNOTATOR_CODINGS

A list of strings separated by semi-colons. If any of the strings are coding names in the metadata file, they are used when populating the list of existing annotators for the 'choose annotator' dialog. If no valid coding names are listed, all available annotators are listed.
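To show how these values fit together on a command line, here is a small shell sketch that checks an NXT_DEBUG value against the accepted range and joins resource names with commas and no spaces, as NXT_RESOURCES requires. The helper name nxt_props and the resource names in the test usage are hypothetical; only the property names and value formats come from the descriptions above.

```shell
#!/bin/sh
# Sketch only: nxt_props is a hypothetical helper, not part of NXT.
# Usage: nxt_props DEBUG_LEVEL [RESOURCE_NAME...]
# Prints -D options suitable for placing after "java" on the command line.

nxt_props() {
    level=$1
    shift
    case $level in
        [0-4]|true|false) ;;             # values accepted for NXT_DEBUG
        *) echo "NXT_DEBUG must be 0-4, true or false" >&2
           return 1 ;;
    esac
    resources=""
    for name in "$@"; do
        # comma-separated, no spaces, as NXT_RESOURCES expects
        resources="${resources:+$resources,}$name"
    done
    printf '%s' "-DNXT_DEBUG=$level"
    [ -n "$resources" ] && printf ' %s' "-DNXT_RESOURCES=$resources"
    printf '\n'
}
```

The output can be placed directly after java, for instance java $(nxt_props 1 myresource) CountQueryResults -c mymeta.xml, where myresource stands for a resource name from your own resource file.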

Compiling from Source and Running the Test Suites

  • Go into the top-level nxt directory, decide on a build file to use, and copy it into that directory, e.g. cp build_scripts/build.xml . (the final dot is the destination directory). Type ant to compile (ant jar is perhaps the most useful target, as it doesn't clean all compiled classes and rebuild the Javadoc every time). If there are compile errors, copy the error message into an email and send it to Jonathan or another developer (see the SourceForge members page for emails).

  • Run the test suite(s). The NXT test suite is by no means comprehensive but tests a subset of NXT functionality. To run it, you need to have the JUnit jar on your CLASSPATH. Then compile the test suite:

    javac -d . test-suites/nom-test-suite/NXTTestScratch.java
    

    Now run the tests:

    java junit.textui.TestRunner NXTTestScratch
    

    Again, any errors should be forwarded to a developer.

  • If you are making a real public release, update the README file in the top-level nxt directory, choosing a new minor or major release number. Commit this to CVS.

  • Now build the release using the build_scripts/build_release.xml ant file (use the default target). This compiles everything, makes a zip file of the source, and one of the compiled version for release, and produces the Javadoc. If you're on an Edinburgh machine, copy the Javadoc (in the apidoc directory) to /group/project/webltg/NITE/nxt/apidoc. Test the shell script examples, and upload the new release to SourceForge.
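The compile and test steps at the start of this list could be collected into a small script along the following lines. This is only a sketch: the run helper and the DRY_RUN switch are illustrative additions, not part of the NXT build. By default the commands are printed rather than executed, so the script can be inspected safely; setting DRY_RUN=0 executes them, which assumes you are in the top-level nxt directory with ant installed and the JUnit jar on your CLASSPATH.

```shell
#!/bin/sh
# Sketch only: run, build_and_test and DRY_RUN are hypothetical additions.
# The four commands themselves are the ones given in the steps above.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"               # dry run: just show the command
    else
        "$@"                             # real run: execute it
    fi
}

build_and_test() {
    run cp build_scripts/build.xml . &&
    run ant jar &&
    run javac -d . test-suites/nom-test-suite/NXTTestScratch.java &&
    run java junit.textui.TestRunner NXTTestScratch
}

build_and_test
```

Because the steps are chained with &&, a real run stops at the first command that fails, which is the point at which the error output should be forwarded to a developer.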