Additions to sml-jvm
====================
This directory contains an extended version of Peter Bertelsen's sml-jvm toolkit.
The main change is that class files can now be read in and converted into the
sml-jvm format.  A summary of the changes to each file is given below.


Bytecode.sml:   Added toString function for bytecode instructions.  Produces a string
  		corresponding roughly to Jasmin assembler format.

Classdecl.sml:  Added inner_info, INNER, SYNTHETIC and DEPRECATED attributes.
		Since we can now input classfiles we've extended the class_decl
		structure to include fields called 'major' and 'minor' which
		contain the class file version numbers.  Classdecl.sml also
		declares values 'major' and 'minor' (45 and 3 respectively)
		which should be used as default values when constructing a
		class_decl object.

Classfile.sml:  Added inner_info and new attributes.
		The main addition here is of functions which allow the input of
		a class file,  producing an object of type Classfile.class_file.
		There are two such functions:

			inputClassFile: string -> class_file,

		which opens an explicitly named file,  and

			vectorToClass: Word8Vector.vector -> class_file

		which converts a vector containing the contents of a class file
		into a class_file object (such a vector may for example have been
		obtained from a jar file by using the functions in ZipReader).

Constpool.sml:  Added code for parsing constant pool entries.
		Exported a couple of items for use elsewhere.

Decompile.sml:  This is a new file.  The main entry point is the function

			toClassDecl: Classfile.class_file -> Classdecl.class_decl

		which is (approximately) the inverse of Classfile.fromClassDecl.
		Usually you'll use Decompile.toClassDecl o Classfile.inputClassFile
		to input a class file and convert it to sml-jvm format.  Note that
		if you read in a class file and decompile it,  then recompile it and
		write a new class file,  the new file generally won't be identical to
		the old one,  because the order of the items in the constant pool will
		generally differ.  As a result it may occasionally be necessary to use
		an instruction of a different size to access a constant pool item
		(e.g. ldc_w instead of ldc),  so a reconstructed class file may even
		have a different size from the original;  however,  the new class file
		should behave indistinguishably from the original.  [Note:  I've only
		ever seen class files getting bigger because of this.  Do Java compilers
		construct the constant pools in such a way that frequently-used items
		are at the beginning,  so that more compact instructions can be used
		to access them?]
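
The size difference mentioned above comes from the operand width:  ldc carries
a one-byte constant pool index,  ldc_w a two-byte one.  The following is only a
sketch of that choice,  not the code used in sml-jvm;  the names load_const,
loadConst and encodedSize are made up for illustration.

```sml
(* Hypothetical helper, not part of sml-jvm: choose the load-constant
   instruction for a given constant pool index. *)
datatype load_const = LDC of int | LDC_W of int

fun loadConst (index : int) : load_const =
    if index < 1 orelse index > 65535 then
        (* index 0 is unused, and the operand is at most two bytes *)
        raise Fail "constant pool index out of range"
    else if index <= 255 then LDC index      (* fits in one byte *)
    else LDC_W index                         (* needs two bytes  *)

(* Encoded size in bytes: one opcode byte plus the index operand. *)
fun encodedSize (LDC _)   = 2
  | encodedSize (LDC_W _) = 3
```

An item that moves from index 255 to index 256 when the pool is rebuilt
therefore costs one extra byte at every ldc that refers to it.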

Emitcode.sml:   We now emit more compact jump instructions using EstimateJumps.
		emitTableSwitch now uses 32-bit arithmetic,  avoiding potential overflows.
		Fixed bug:  wrong codes were emitted for ifnull and ifnonnull (if_icmpeq
		and if_icmpne).

EstimateJumps.sml:
		New file.  Calculate upper bounds for lengths of forward jumps,  enabling us to
		(usually) avoid using goto_w when goto will be ok.
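
The idea can be sketched as follows,  assuming only that goto holds a signed
16-bit branch offset and goto_w a signed 32-bit one.  This is an illustration,
not the EstimateJumps code;  the names jump,  fitsShort and chooseForward are
hypothetical.

```sml
(* Hypothetical sketch, not the code in EstimateJumps.sml. *)
datatype jump = GOTO | GOTO_W

(* goto holds a signed 16-bit branch offset. *)
fun fitsShort (offset : int) : bool =
    offset >= ~32768 andalso offset <= 32767

(* upperBound is an over-estimate, in bytes, of the distance to a
   forward target.  The true distance is no larger, so if the bound
   fits in 16 bits then the real offset fits as well. *)
fun chooseForward (upperBound : int) : jump =
    if fitsShort upperBound then GOTO else GOTO_W
```

Because the bound is conservative,  a jump whose true offset would fit may
still be given goto_w when its estimate is too large,  which is why the short
form is only "usually" chosen.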

Inflate.sml:    New file.  Implements decompression for zip files.  (See ZipReader.sml)

Int32.sml:	Added 32-bit addition and subtraction operators Int32.+ and Int32.- (exported
		in Int32.sig).
		Int32.toInt raises Int32Overflow of string if a 32-bit integer is too big to fit
		into one of Moscow ML's 31-bit integers:  the string contains a hex representation
		of the offending value, for example "0x4C0117FC" (8 hex digits).
		Added toString function:  gives a decimal representation if it's
		small enough for Moscow ML,  otherwise a hex representation.

Int64.sml:	Added exception Word64Overflow of string.  If Int64.toInt fails then
		Word64Overflow is raised with an argument giving the hex value of the
		input word, for example "0x0000231F0000FF32" (16 hex digits).
		Added toString.

IntCvt.sml:	No changes.  (Unused?)

JvmType.sml:	Minor changes (some of the type descriptor conversion functions might have been
		truncating descriptors prematurely?).
		Added some code to convert type descriptors back to sml-jvm types.

Label.sml:	Added conversion functions between labels and integers (used in decompilation).

Localvar.sml:	No changes.

Real32.sml:	Attempted implementation of toReal and toString.

Real64.sml:	Attempted implementation of toReal and toString.

StackDepth.sml:
		No changes.

Word16.sml:	No changes.

Word32.sml:	Added exception Word32Overflow of string,  used when a value won't fit
		into a 31-bit integer.  Added 32-bit unsigned addition and subtraction
		operators,  but we don't use these anywhere.
		Added toString.

Word64.sml:	Added exception Word64Overflow of string.  If Word64.toInt fails then
		Word64Overflow is raised with an argument giving the hex value of the
		input word, for example "0x0000231F0000FF32" (16 hex digits).
		Added toString.

Wordn.sig and Intn.sig are unused.  Presumably they should be common signatures for
the various Word*.sml and Int*.sml files.  However,  some of these now contain functions
(e.g. arithmetic functions) which others don't.  Realn.sig is also unused.


Zip files
---------
I've provided a module for extracting information from zip files (which include jar files).
This isn't particularly fast for large files such as rt.jar.

The functions are implemented in ZipReader.sml,  which conforms to the following
signature from ZipReader.sig:

  type zipfile

  val open_in     : string -> zipfile
  val inputMember : zipfile * string -> Word8Vector.vector option
  val close_in    : zipfile -> unit
  val apply       : (string -> unit) -> zipfile -> unit
  val members     : zipfile -> string list

  val extractFile : string * string -> Word8Vector.vector

open_in opens a zip file for use;  this involves caching information about
  the members of the file,  and may take some time.

inputMember attempts to obtain the contents of a member of the zip file.
  If the purported member isn't present in the zip archive then inputMember
  returns NONE.  If the member does exist then inputMember returns SOME v,
  where v is a Word8Vector containing the data in the member file (suitable
  for input to Classfile.vectorToClass, for example).  If the data was
  compressed then v will contain the uncompressed version.  The zip file
  format supports several compression formats, but we only implement
  decompression for the "deflate" format.  I've never seen a zipfile
  containing data compressed by any other means.

close_in closes the zip file;  an exception will be raised if you try to
  use any of the other functions once the file's been closed.

apply applies a function to the name of every member of the file (in no particular order).

members returns a list of the names of the members of a zip archive
  (in no particular order).

extractFile attempts to extract a single member of a zip archive without requiring
  the user to open and close the archive.  extractFile (zipfile, member) is
  equivalent to

	let val z = open_in zipfile
	    val data = inputMember (z, member)
	    val () = close_in z
        in
	  data
	end

If you only expect to process one member of a zip archive then you should use
extractFile.  The other functions should be used if you expect to process
several members,  since they make use of information about the internal structure
of the archive which is cached by open_in.
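
As an illustration of the multi-member case,  here is a sketch that opens an
archive once and collects every member whose name ends in ".class".  It is
written as a functor over a copy of the signature above so that it depends on
nothing else;  ZIP_READER,  ClassMembers,  readAll and endsWith are hypothetical
names,  not part of the library.  Under Moscow ML,  units such as Word8Vector
may first need to be loaded with load.

```sml
(* Copy of the relevant part of the signature listed above. *)
signature ZIP_READER =
sig
  type zipfile
  val open_in     : string -> zipfile
  val inputMember : zipfile * string -> Word8Vector.vector option
  val close_in    : zipfile -> unit
  val members     : zipfile -> string list
end

(* Hypothetical functor, not part of the library. *)
functor ClassMembers (Z : ZIP_READER) =
struct
  (* True if s ends with suffix. *)
  fun endsWith suffix s =
      let val n = size s
          val m = size suffix
      in
        n >= m andalso String.substring (s, n - m, m) = suffix
      end

  (* Open the archive once, read every ".class" member, then close it.
     The archive is also closed if reading raises an exception. *)
  fun readAll (path : string) : (string * Word8Vector.vector) list =
      let val z = Z.open_in path
          fun fetch (name, acc) =
              case Z.inputMember (z, name) of
                  SOME v => (name, v) :: acc
                | NONE   => acc
          val wanted = List.filter (endsWith ".class") (Z.members z)
          val result = List.foldl fetch [] wanted
                       handle e => (Z.close_in z; raise e)
      in
        Z.close_in z;
        result
      end
end
```

With the real module this would be applied as ClassMembers (ZipReader),  and
each vector in the result could then be passed to Classfile.vectorToClass.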


---------------------------

New code contributed by:

	Laura Korte
	Kenneth MacKenzie
	Matthew Prowse
	Nicholas Wolverson

---------------------------

K. MacKenzie,  June 2004

