Chapter 13. Usage

This chapter presents an overview of the actual steps and commands for using Theon to manage a PostgreSQL physical database - be it a custom private one, upstream one or locally developed shared one.

Theon should have been installed first on the desktop machine you are using to run. This may be the case by default if there is a system wide installation. If not see the Installation and Configuration chapter. You should have access to a running PostgreSQL installation on the same desktop machine. It could also be on a remote machine but for the purposes of this guide it should be on the same machine. You will need access to create databases (and maintain the content of those created databases) as yourself. Depending on your desktop machine management, this might be a local or central configuration change.

13.1. Version Control Awareness

Theon is not tied to a specifc version control system, or indeed any, as while using one is highly recommended, it is not mandatory. However, the default toolkit commands are to some extent version control system aware. While not necessary, using this builtin behaviour can be simpler, as it reduces the number of command options required. If this aspect of the toolkit commands is to be used then Git must be used as the underlying version control system - either locally or against a remote repository.

13.2. Starting from Scratch

You always need to start from scratch when you have a new personal physical database that you want to start managing as a model within Theon. In this case there will be no SchemaTree to work from, so the initial version of a SchemaTree needs to be created.

In order to do this a model in XSDDB needs to be created describing the new physical database. This can be done in two ways: manually by creating the necessary entries in the XSDDB database, normally using TheonUI and the model management desktops (see later); or by importing the model content from the PostgreSQL "system catalog" of the physical database directly into the XSDDB database. Doing an import is usually better, as by avoiding most of the manual data entry, its much quicker and less error prone. It may not be as appropriate for databases where a lot of structure cannot be imported for whatever reason (usually as it is unsupported and needing manual attention). Some manual data entry may still be necessary subsequent to an import, particularly annotation for documentation purposes, but none may actually be "required".

As mentioned before, a model in Theon does not represent all the possible structural elements of a PostgreSQL physical database. Only supported elements can be added or imported into the XSDDB database. Structures which are not added or imported (and therefore not directly representable in a Theon model) need to be handled in a different way. They can be manually created within container entities in a model. This is particularly suitable if they are isolated and not likely to be subject to frequent change. Alternatively they can simply be ignored and handled out of band, for example by extracting the necessary fragments from the output of pg_dump -s. This latter approach does mean that the installation and upgrade of a physical database from a model in Theon will require an additional step, one that is not part of the Theon managed process, for handling the "out of band" structures.

By default to import content into XSDDB directly from a physical database requires that the database must be contained in a PostgreSQL installation on the same physical host where Theon is being run. If this is not the case for your database a suitable LiveProfile must be created first. Alternatively take a schema dump (no content is necessary) and transfer that to the host where Theon is running, and install it temporarily as a database there - for example:

pg_dump -s -h HOSTNAME mydb > mydb.sql
psql < mydb.sql

Note that this only applies when using a PostgreSQL system catalog import to create and/or update XSDDB content. The live physical database is not otherwise need to be on the same host where Theon is being run. Indeed, this would normally not be the case in a central service unless changes were being made for subsequent import directly to the live in-service database. This is not normally reccommended. It is always better to use a development snapshot of the database first to avoid any possible user disruption. In a private installation of a personal database, it is much more likely that Theon will be being run on the same host as the database service, and direct changes to the live in-service database don’t matter (as they only affect that user anyway).

13.2.1. Model Management Commands and Process

The commands for model management in Theon are all found in the model sub-command of the Theon toolkit command (theontk). For brevity the examples below use the ttkm alias instead of the full command (theontk model).

Each command is shown in context below, but refer also to the command appendix for full details and options. You can also do ttkm help to get a list of possible model sub-command actions and help on each action.

The process outlined in the sections below assumes that the XSDDB database is being populated by direct import of the system catalog of an existing physical database (either on the same host as Theon is being run or temporarily transferred to that host solely for the purpose of import, as described above). For manual creation of XSDDB content refer to the section on the model management desktops under TheonUI.

The process when using a model which is already held in an SchemaTree of XSD files (such as those supplied with Theon itself) is slightly different - refer to the later section on making changes.

13.2.2. Create XSDDB

Create the Theon schema management database XSDDB. To recap, this is the "transient" database that helps maintain the actual SchemaTree of XSD files.

ttkm create

If the XSDDB database already exists then the above command will fail with an error. This is reasonable, as you might not want to destroy the current content in the XSDDB database - it may reflect changes in other models that have not been pushed back out as a SchemaTree.

The create subcommand is really just for user convenience and is actually an alias for ttkm self install. The self model is realised as the XSDDB physical database.

13.2.3. Import Physical Database

Now create a model in XSDDB from the system catalog of a physical PostgreSQL database. By default the model will be given the same name as the name of the database being imported; however this can be overridden.

ttkm import mydb

The above command takes the supported structures of the mydb database and imports them into XSDDB as a model called (tagged) mydb. To import the database using a different model name (noting that each model name concurrently loaded within XSDDB must be unique, so this may be necessary for example if multiple instances of the same physical database need to be held in XSDDB) do the following instead.

ttkm MODELNAME import mydb

Note that in both cases (irrespective of model name) the realised physical database associated with the model will always be called mydb. This can be changed within XSDDB manually by using TheonUI and the relevant model management desktops.

Since the command toolkit is version control aware, if the current working directory is established to be a checked out Subversion working copy of a ModelLibrary, then when the MODELNAME is not specified it will default to the name of that directory (rather than the name of the physical database).

The name of the physical database to import from is also optional and if not specified will default to being the same as the MODELNAME (or if that is not specified the name of the current working directory when it is a Subversion working copy of a ModelLibrary).

The import subcommand is really just for user convenience and is actually an alias for a TheonCoupler stream processing command. In this instance the command would be theon coupler self stream pgcat mydb mydb. That is, run the pgcat stream with two arguments (the first is the name of the physical database catalog to use, the second is the name of the model to import that as) against the self model (which is realised as the XSDDB database). There is no external data source in this stream, it is a live query against mydb (in this example). For more details see the TheonCoupler chapter.

No manual alterations should subsequently need to be made at this stage, but at some point the imported structures should be annotated so as to provide useful documentation. These manual alterations would be made using TheonUI and the relevant model management desktops (see later section).

As previously mentioned only supported structures will be imported. Unsupported structures will be silently ignored. They will however be identified later on during validation and can then be addressed as appropriate.

13.2.4. Export XML-Schema Tree of XSD Files

Now the SchemaTree can be created from the content of XSDDB. This will represent the true initial version of the model for the physical database in Theon.

Normally the SchemaTree would be written out into a working copy of a ModelLibrary from a version control repository (the master of which can be local or remote), as this means schema change management uses the same approach as source code change management. However, this is not mandatory, a filesystem directory can hold the SchemaTree as-is with no version control, or just simplistic version control such as frequent backups. This is the more likely case for managing personal databases in a personal installation of Theon. A version control system is more likely to be deployed when changes are being managed to a physical database which is centrally hosted and might be being updated by many independent developers simultaneously.

Similarly, when making changes to any third party model the SchemaTree would be a checked out copy from an accessable central repository (or installed package) holding that (or an independently branched derivative). The examples below show the use of a Subversion repository local to the host where Theon is being run. This could equally just be a simple filesystem directory though (mkdir below instead of svn checkout).

svn checkout file:///disk/local/mydb_modlib
cd mydb_modlib
ttkm mydb export

Here mydb is the model name, not necessarily the same as the name of the physical database it will be realised as (or was imported from).

By default the export will be into the current working directory. A new directory will be created (if necessary) and called mydb (in this example). This will contain the SchemaTree (the structure of which was outlined in a previous section). Any existing content will be overwritten (but files in the directory but not in the export will be left untouched).

If the current working directory is a checked out Subversion working copy of a ModelLibrary then the name of the model exported will default (when not explicitly specified) to the name of the current working directory and the export will not include the top level directory itself.

If the current working directory is not under version control then nothing more needs to be done. If it is then then the changes will ultimately need to be merged and committed back to the repository. This is assuming they represent a complete change that should be committed as a group. They may not be, as they are likely to need to be tested first, and so there be many iterations of XSDDB update and export before the changes are finally ready to be committed.

Since the command toolkit is version control aware as a convenience for Subversion repositories the following command is available.

ttkm update

This command (which is only applicable to the version control process) does a pre-commit pass within the current working directory (only if it is a checked out Subversion working copy). This will svn add any new files not previously in the working copy and svn remove any files not touched by the most recent export (as they must no longer be in the representation of the model). This process is only applied to files and directories that are part of the Schema Tree, others are ignored.

Any merge and commit operation that would now normally follow within the version control system must be done using the tools provided by that system.

13.2.5. Realise Model

Next the model is realised, that is, install (or subsequently upgrade) a PostgreSQL physical database from the SchemaTree used to represent it. In principle, if this step immediately follows an import from a pre-existing physical database, then the pg_dump -s output of each database should be near identical (or functionally identical anyway). In practice this may not be the case, in particular where the original database contained unsupported elements.

Before doing this, in the case where a pre-existing physical database has been imported, the original database must be deleted (after being backed up first in order to do subsequent testing). Obviously this would not be done if it was being used in a live production service. In this case the new Theon managed version of the database should be created in a different PostgreSQL installation (probably on a different host). Alternatively the model in XSDDB could be altered to give the realised database a different name before the content is exported as a SchemaTree. Delete the existing database if necessary using PostgreSQL commands, making a structural backup first (no content is included in this).

pg_dump -s mydb > mydb.imported.sql
dropdb mydb

Now (re)create the database from the SchemaTree in the current working directory (by default).

ttkm mydb install

Here mydb is the name of the model to install. This will not necessarily be the same as the name of the physical database created (the actual name is defined within the model). If the physical database that the mydb model creates already exists then this command will simply fail as a safety measure.

If the current working directory is a ModelLibrary (checked out working copy of a Subversion repository) containing a SchemaTree then the model installed will default to the model represented by that SchemaTree (when a model name is not specified in the command).

Otherwise the model name will be searched for as a directory (containing a SchemaTree): in the current working directory; in the local Theon installation (for models included with Theon such as self) or a specific URI given as an option to the command. When a URI is given as an option and no model name is specified the URI must be to a directory containing a SchemaTree and that will be what is used. Otherwise the URI should contain a directory containing a SchemaTree which is the same as the model name specified in the command. The URI would normally be a file:/// but can equally be a reference to a SchemaTree published on the web or in a version control repository (and specific revision thereof).

The install sub-command creates the mydb physical database (or whatever the model calls it) and all the structures within it as defined in the Schema Tree. It does this by first creating a single XML-Schema document for the entire SchemaTree (by schematic traversal using the ttkm schema sub-command). It then transforms this using the ttkm derive sub-command into a file containing DDL SQL statements that will create the database and all its structures. It then uses the base theon install command (not part of Theon) to run the DDL SQL statements through psql. It will also update the model management table (always created for a database in the derived DDL SQL statements under the theon schema) storing within it meta data about the installation: a timestamp; a session user; a checksum for the single XML-Schema document representing the complete SchemaTree (this is used for handling subsequent upgrades).

Normally a snapshot of the entire SchemaTree used to install a physical database is also embedded inside that database (in the theon.model management table). This is useful for making subsequent differential upgrades and is the default as, in particular, it is better for the frequent upgrade cycle occuring during active development. However, this is not required and the behaviour can be suppressed using the noxsd option on the ttkm install command. In this case the upgrade process will automatically create a temporary XSDDB instance, import the system catalog of the physical database into it and then export it out again in order to re-create the corresponding SchemaTree. This is obviously more time consuming (particularly for large models), so for performance do not use this option. Making live structure changes in the database when using this option means future upgrades are made against the current structure, not the structure as it was immediately after the previous install/upgrade (which it would be without specifiying noxsd). In practice this makes no functional difference, however it does reduce the opportunity for upgrade validation when making releases so this is another reason to not normally use it. The noxsd option is generally used during release management as deployed live production databases would be expected to have a schema under strict version control and so the schema corresponding to the particular database structure should always reside in the repository (for the purposes of differential upgrade).

13.2.6. Testing Differences

This step is recommended after an import from a pre-existing PostgreSQL physical database. Do a structure dump from the original database and from the one created fresh out of the new SchemaTree and then compare.

pg_dump -s mydb > mydb.sql
meld mydb.imported.sql mydb.sql

There are likely to be some differences. For example, any unsupported elements in the original will be missing. Also some model conventions in Theon while functionally similar or identical will cause a difference in the dump outputs. For example, views in a Theon model are (by default) constructed as inline functions and in a Theon model (by default) every foreign key is automatically given an index.

The physical database created from a model in Theon also includes a management table simply called model (in the theon physical database schema). This includes various rows of meta data associated with the installation including timestamp, checksum, serial identification or version control system revision number (identifying the specific instance of the model that the physical database represented on original installation and used for handling subsequent upgrades. Manual alteration to the content of this management table is likely to break future upgrades (or worse cause future upgrades to corrupt actual content), so is unwise except in a purely development scenario.

Differences associated with unsupported elements will need to be addressed before proceeding, but that is beyond the (current) scope of this document.

13.2.7. Delete XSDDB

Since the SchemaTree has been created, in principle there is now no longer any need for a local instance of the XSDDB database and it can be deleted.

ttkm delete

The delete sub-command is really just for user convenience and is actually an alias for ttkm self destroy.

In practice though there is no need to delete the local XSDDB instance. It can be kept around for as long as necessary, for example for future model maintenance and for loading and maintaining other models.

13.2.8. Workflow Summary

Figure 13.1. Theon Workflow

TM_Arch_Flow1.svg

This diagram above illustrates the workflow described so far. So covering manipulating a model represented in a SchemaTree, using XSDDB either by manual update or import from an existing PostgreSQL physical database, then creating a new PostgreSQL physical database. The next stage of the modeling process is concerned with how changes are managed.

  1. The XSDDB is created using the create command.
  2. The content of a model in the XSDDB is created from the schema of an existing PostgreSQL physical database by using the import command (which internally uses TheonCoupler on a stream built against the internal system catalog).
  3. The model content held in XSDDB is converted into a SchemaTree of XSD files using the export command.
  4. The SchemaTree is used to install a new instance of the PostgreSQL physical database using the install command (which internally derives the necessary DDL via schematic transform). The new instance should be identical to the original instance barring some non-functional changes and with the exception of any unsupported structures.
  5. The XSDDB is removed as it is no longer necessary by using the delete command.
  6. Optionally the new SchemaTree is committed into a version control system so subsequent changes to it can be tracked and managed updates can be made to the live production service. The update command can assist with this process when Subversion is being used as the version control system.

13.3. Making Changes

Once a physical database is managed as a model within Theon it can be updated and any changes distributed and deployed in a controlled manner (when using an underlying version control system anyway).

13.3.1. Create XSDDB

First create the Theon schema management database XSDDB. To recap, this is the "transient" database that helps maintain the actual SchemaTree of XSD files.

ttkm create

This step is only necessary if there is not an XSDDB database already, which will often be the case when a model is in constant development. If the XSDDB database already exists then the above command will fail with an error, so as to protect the current content (which may contain changes to models that have not been exported out as a SchemaTree).

The create subcommand is really just for user convenience and is actually an alias for ttkm self install. The self model is realised as the XSDDB physical database.

13.3.2. Reload Model

Load XSDDB with a model contained in a SchemaTree. A SchemaTree might be one of the ones included in the Theon installation itself, or it might be in a central repository of a version control system, the local filesystem, or simply just in the current working directory.

ttkm mydb reload

Here mydb is the model name to load, not necessarily the same as the name of the physical database it is realised as. The command expects to find a Schema Tree in the current working directory if no model name is specified, or in a sub-directory of the current working directory with a name matching that of the model specified (mydb in this instance). If not found the command will search a defined path in the local installation of Theon. Optionally an explicit path can be given that will be used to search in place of the current working directory and the local installation.

If the current working directory is a ModelLibrary (checked out Subversion working copy) then the model name does not need to be specified and the model loaded will be that represented by the SchemaTree in the current working directory (just as above).

A reload operation destroys all the existing content related to the model (mydb in this instance) within XSDDB. It does not affect the content of any other model also concurrently being held in XSDDB. Therefore, if there are unexported changes in XSDDB for the model being reloaded, they should be exported first then potentially merged and committed (if using an underlying version control system) so they are not lost.

The reload sub-command adds records into all the tables in the XSDDB so as to represent the model held in a SchemaTree. It does this by transforming the SchemaTree using the ttkm derive sub-command into a single file (called derived/model.sql) containing SQL DML statements predominantly. These delete any existing records for the model and then add records into all the tables to represent the model. It then runs all the SQL statements through psql. For performance the derive sub-command only does the transform if it has not already been done for the current SchemaTree, a transform can always be forced using the full option here.

The model can also be loaded in parts. This can be more efficient if only one bit is in use (or actively being worked on).

ttkm mydb reload dictionary
ttkm mydb reload reflection
ttkm mydb reload extraction

The above load the data model (PostgreSQL DDL) component (Dictionary cluster) and the TheonUI component (Reflection cluster). Since dictionary is a pre-requisite for the other that will always also be loaded (destructively) when reflection is loaded. All reloads destroy existing data.

13.3.3. Making the Changes

Now make the actual updates to the model held in XSDDB. For simple changes this is commonly done by manual editing of the content using the model management desktops in TheonUI (see later section). Alternatively some updates might be made by processing templates and loading the generated structures using a TheonCoupler stream.

For updates affecting the physical database structure the changes can also be made live on the actual physical database (realisation of the model) and imported. This is often more convenient providing a faster development cycle for structural changes (as the import can be left to be an atomic set of "related" changes once all are completed and tested).

psql mydb -c "ALTER TABLE x ADD COLUMN y INTEGER;"
ttkm import mydb

The above will import the change directly from the PostgreSQL system catalog for the physical database database mydb into the model mydb being held in XSDDB. If the model name is not the same as the realised physical database name then that must be explicitly specified as well.

Since the command toolkit is version control aware, if the current working directory is established to be a checked out Subversion working copy of a SchemaTree, then when the model is not specified (as above) it will default to the name of that directory (rather than the name of the physical database).

The name of the physical database to import from is also optional and if not specified will default to being the same as the model (or if that is not specified the name of the current working directory when it is a Subversion working copy of a SchemaTree).

Doing an import like this is not really any different to doing the original import of a new physical database when starting from scratch in Theon but unlike that, which will be creating all the elements from scratch, this will just be making small adjustments to already existing elements. This is because import is actually just an instance of a particular data feed and sync mechanism implemented in TheonCoupler. So the import subcommand is really just for user convenience and is actually an alias for a TheonCoupler stream processing command.

While it is possible to mix and match all means of updating the XSDDB, it is relatively important that different approaches are not used on the same element until the changes have been exported and the SchemaTree and live database upgraded to match inbetween. There is effectively no master otherwise. Manual changes to XSDDB, imported changes from the live database instance and externally generated and coupled changes are all treated the same. So when they update the same element each will silently override the other simply in the order applied. This will be the case until the changes in that element are pushed through the true master (the SchemaTree) and propogated through.

13.3.4. Export XML-Schema Tree of XSD Files

Once all the necessary changes have been made the SchemaTree can be updated from the content of XSDDB. This will then represent the updated model for (or of when the changes were imported) the physical database.

ttkm mydb export

Here mydb is the model name, not necessarily the same as the name of the physical database it is realised as.

By default the export will be into the current working directory. A new directory will be created (if necessary) and called mydb (in this example). It will contain the SchemaTree. Any existing content will be overwritten (but files in the directory but not in the export will be left untouched).

If the current working directory is a checked out Subversion working copy then the name of the model exported will default (when not explicitly specified) to the name of the current working directory and the export will not include the top level directory itself.

If the current working directory is not under version control then nothing more needs to be done. If it is then then the changes will ultimately need to be merged and committed back to the repository. This is assuming they represent a complete change that should be committed as a group. They may not be, as they are likely to need to be tested first, and so there be many iterations of XSDDB update and export before the changes are finally ready to be committed.

Since the command toolkit is version control aware as a convenience for Subversion repositories the following command is available.

ttkm update

This command (which is only applicable to the version control process) does a pre-commit pass within the current working directory (only if it is a checked out Subversion working copy). This will svn add any new files not previously in the working copy and svn remove any files not touched by the most recent export (as they must no longer be in the representation of the model). This process is only applied to files and directories that are part of the Schema Tree, others are ignored.

Any merge and commit operation that would now normally follow within the version control system must be done using the tools provided by that system. However this is likely to be deferred until the changes have been applied and tested and any further changes made (to form a "related" grouping).

13.3.5. Apply Changes

Warning

The upgrade sub-command has not been implemented yet. An upgrade cannot currently be made in the way described. For development purposes it is necessary to destroy and recreate the database, that is ttkm mydb destroy then ttkm mydb install. For live production database upgrades a dump of the content must be made first so it can be reloaded after the destroy/install cycle. It is safer to use insert statements in the pg_dump command to increase the likelihood that the data will reload without compatibility issues.

Now the model changes can be applied back to the live physical database. This is not strictly necessary for changes that have been made directly via manual DDL SQL statements on the live database and then imported back into XSDDB as described above. However, it is harmless and helps to validate that the changes were correctly imported. During upgrade changes that have already been made and that are consistent with what is required in the upgrade are silently ignored.

ttkm mydb upgrade

Here mydb is the name of the model to upgrade. This will not necessarily be the same as the name of the physical database upgraded (the actual name is defined within the model).

If the current working directory is a checked out working copy of a Subversion repository containing a SchemaTree then the physical database upgraded will default to that defined by the model within the current working directory (when a model name is not specified in the command).

Otherwise the model name will be searched for as a directory (containing a SchemaTree): in the current working directory; in the local Theon installation (for models included with Theon such as self) or a specific URI given as an option to the command. When a path is given as an option and no model name is specified the URI must be to a directory containing a Schema Tree and that will be what is used. Otherwise the URI should contain a directory containing a SchemaTree which is the same as the model name specified in the command. The URI would normally be a file:/// but can equally be a reference to a SchemaTree published on the web or in a version control repository (and specific revision thereof).

The upgrade sub-command alters as necessary the structures within the mydb physical database (or whatever the model calls it) to match the changes made to the SchemaTree. In order to do this the command constructs a new SchemaTree with special markup that represents the differences between the one in the current working directory (or as otherwise specified) and the one which represents the current state of the physical database realising the model in that SchemaTree.

This latter SchemaTree is taken from the physical database itself, as a copy is normally stashed in the the management table (called theon.model) during previous install or upgrade commands. This will not be the case when the last of these commands was run with the noxsd option, or normally when using a live deployed production database built from a packaged release. Then a temporary SchemaTree is rebuilt by creating a temporary model in XSDDB, importing the system catalog of the database into it, exporting it then deleting that model in XSDDB. Alternatively a URI specified in the command will be used. This must contain a SchemaTree or a directory containing a SchemaTree which is the same as the model name specified in the command. When using a version control system the two specified URIs can be used, for example, to do an upgrade to a specific version of the SchemaTree. Here the first would point to the version to upgrade to and the second point to the current version of the SchemaTree installed.

Once the "differential" SchemaTree has been produced it is transformed (using ttkm derive) into a file of SQL DDL statements. Then the ttkm upgrade command is used to upgrade the physical database by running the DDL SQL statements through psql. This will also update the theon.model management table storing within it meta data about the upgrade: a timestamp; a session user; an embedded snapshot of the SchemaTree upgraded to (unless the noxsd option was specified).

When the SchemaTree is rebuilt from the physical database above and the model changes have been made by direct DDL statements in the database then the generated model will also have these changes already. The resulting upgrade SQL will be empty (if no other changes were made outside of the live database). Whereas if the SchemaTree is taken from the embedded snapshot this will represent the state of the live database prior to any live changes made. The resulting upgrade SQL will consequently contain these changes already made in the live database. The generated DDL SQL statements are wrapped so as to detect when a consistent change has already been made and silently ignore it.

During development this whole process is repeated multiple times, that is, more changes can be made (manually in XSDDB or by import/template), exported and then re-applied by upgrade. When a "complete" set of changes has been applied, tested and debugged then the final export of the SchemaTree would be merged and committed (when using an underlying version control system).

13.3.6. Workflow Summary

Figure 13.2. Theon Changes Workflow

TM_Arch_Flow2.svg

This diagram above illustrates the workflow described so far. So concerned with changing a model represented in an SchemaTree, in general using XSDDB either by manual update or import from the model realised as a live PostgreSQL physical database.

  1. The XSDDB is created using the create command.
  2. The content of a model in the XSDDB is created from an existing Schema Tree representing the current state of that model by using the reload command.
  3. The XSDDB is used to manipulate the model to make the neccessary changes. This can be done in a number of ways, none of which are mutually exclusive.

    • The model can be updated directly from the current state of an existing PostgreSQL physical database which is the instatiation of it by using the import command (which internally uses TheonCoupler on a stream built against the internal system catalog).
    • The model can be directly manipulated using desktops and panels designed for this in a locally running instance of TheonUI.
    • Aspects of the model can be updated from the SchemaFactory by directly editing the relevant template files and using the factor command (which internally uses TheonCoupler on a stream built from the aggregation of processing those template files).
  4. The model content held in XSDDB is converted back into a SchemaTree (updating the existing XSD files) using the export command.
  5. The SchemaTree is used to upgrade the existing instance of the PostgreSQL physical database using the upgrade command (which internally derives the necessary DDL differences via schematic transform). After the upgrade the live physical database will have been structurally migrated to correspond to the modified SchemaTree representing it.
  6. The XSDDB is removed as it is no longer necessary by using the delete command.
  7. Optionally the modified SchemaTree is committed into a version control system so the changes to it can be tracked and managed updates can then be made to the live production service. The update command can assist with this process when Subversion is being used as the version control system.

The next sections look at some specific actions in more detail.

13.4. Changing an Upstream Model

A published ModelLibrary for Theon, either in a package or repository, can be changed using exactly the same workflows and toolkit commands as described in the previous sections. The SchemaTree for is included with a the development package of a ModelLibrary and can be worked on directly from that (by exporting the changed SchemaTree into a local directory or version control repository for example). Alternatively the most recent revision of the SchemaTree can be checked out from a central public repository and worked on, subsequently exporting into a local directory, version control repository or into a local working copy for committing back to the central public repository (assuming sufficient access rights to do so). The examples below illustrate the last case, but steps are otherwise much the same irrespective of the original source of the SchemaTree.

git clone git://HOST:PORT/library/MODLIB mydb
cd mydb
ttkm create
ttkm reload
# make the required changes
ttkm export
ttkm update
git add .
git commit

Above a working copy of the ModelLibrary is checked out and made the current working directory. This latter steps means that none of the toolkit commands need a model name parameter as they will work directly with the model in the current working directory. The XSDDB database is created and the model loaded into it. The necessary changes are made in whatever way is appropriate/required (see the earlier sections on making changes generally). A new SchemaTree is then exported into the current working directory. Since this is Git we use the relevant commands to stage the changes and then the new (altered) SchemaTree in the working copy is committed back.

In practice in a normal development cycle it is unlikely that the changes would be immediately committed back as above. Instead the local Theon installation being used for testing would be upgraded live to the revised model and tested, probably followed by other changes iteratively until everything is correct. This is also usually necessary as changes to models are likely to include (directly or indirectly by virtue of changes to the physical database structure) changes to the datasets used for TheonUI and TheonCoupler as well as the underlying physical database schema. All these need to be consistent.

Another issue that may arise in a team development environment is conflicts on commit when the same elements of the SchemaTree have been changed incompatibly by different people. These must be merged as is normal practice. However this will typically require working directly with the XML document format in the raw XSD files in order to determine the source of the incompatibility (see section on Conflict Resolution).

Changes made to the SchemaTree in a public repository for ultimately need to be packaged to form part of a new Theon release for normal deployment onto production services and/or wider distribution. See section on release management for details on this.

13.5. Changing the XSDDB

The self model included with Theon is the representation of the physical database structure of the XSDDB itself. It can be represented within itself to facilitate adding enhancements or fixing bugs identically to any other model. However, some external processes have XSDDB elements hard coded (as there is otherwise an issue with bootstrapping) and it is likely that these will have to be changed as an additional step (certainly if any table structure is altered). That is beyond the scope of this documentation (for now) since it involves knowledge of modifying the underlying toolkit source itself. For example, it is likely modifications would be required to the SQL XML generator for the SchemaTree and also to the XSLT transforms that convert the Schema Tree into SQL script for reloading itself into an instance of XSDDB (as well as many other transforms).

Nonetheless the changes to the model itself (for the physical database but also including the streams used for TheonCoupler and the desktops and panels used to manage the content of XSDDB with TheonUI) can be done using exactly the same workflows and toolkit commands as described previously for any other model. The SchemaTree for self is included with an installation of Theon and can be worked on directly from that (by exporting the changed SchemaTree into a local directory or version control repository for example). Alternatively the most recent revision of the SchemaTree can be checked out from the Theon central public repository and worked on, subsequently exporting into a local directory, version control repository or into a local working copy for committing back to the central public repository (assuming sufficient access rights to do so). The examples below illustrate the last case, but steps are otherwise much the same irrespective of the original source of the SchemaTree.

svn checkout https://svn.theon.inf.ed.ac.uk/svn/theon/model/library/self/schemat self
cd self
ttkm create
ttkm reload
# now the database schema for XSDDB is held within XSDDB itself!
# make the necessary changes
ttkm export
ttkm update
svn commit

Above a working copy of the self SchemaTree is checked out and made the current working directory. This latter steps means that none of the toolkit commands need a model name parameter as they will work directly with the model in the current working directory. The XSDDB database is created and the self model loaded into it (the model for XSDDB itself). The necessary changes are made in whatever way is appropriate/required (see the earlier sections on making changes generally). A new SchemaTree is then exported into the current working directory. Since this is Subversion we can use the convenience sub-command update to automatically apply any svn add and svn remove statements. Then the new (altered) SchemaTree in the working copy is committed back.

In practice in a normal development cycle it is unlikely that the changes would be immediately committed back as above. Instead the local Theon installation being used for testing would be upgraded live to the revised self model and tested, probably followed by other changes iteratively until everything is correct. See the section Changing the Core Model above for more details.

13.6. Managing Multiple Models

The XSDDB database is not constrained to contain only one model at a time. Any number of models can be loaded and independently updated/exported.

ttkm create
ttkm import mydb
ttkm otherdb reload

Above two models have been loaded. The mydb physical database has been imported as a model and the otherdb model has been directly loaded from the SchemaTree that represents it.

Both can then be exported into a SchemaTree with one command.

ttkm export

This would create two directories in the current working directory, one for mydb and one for otherdb, each containing its respective SchemaTree. Alternatively each model can be exported independently as and when required.

ttkm mydb export
ttkm otherdb export

13.7. Doing Transforms

The SchemaTree has been transformed in the workflows above, but only indirectly through the install and upgrade commands. Often though it is useful to be able to just do the transform step, in particular as not all the transforms are related to the installation or upgrade of the physical database, although we look at that first.

Before a transform is applied a single XML-Schema document representing the entire SchemaTree is created (by schematic traversal). Internally transforms can be done against the SchemaTree itself but when multiple transforms are being applied it is more efficient to build an aggregated document first (in particular if all the individual XSD files in a Schema Tree have to be fetched from a public distribution via HTTP or from a version control repository for example).

ttkm mydb complete

Here mydb is the name of the model. If the current working directory is a checked out working copy of a Subversion repository containing a SchemaTree then the model used will default to the model within the current working directory (when a model is not specified in the command).

Otherwise the model name will be searched for as a directory (containing a SchemaTree): in the current working directory; in the local Theon installation (for models included with Theon such as self) or a specific path/URI given as an option to the command. When a path/URI is given as an option and no model name is specified the path/URI must lead to a directory containing a SchemaTree and that will be what is used. Otherwise the path/URI should contain a directory containing a SchemaTree which is the same as the model name specified in the command. A URI would normally be a file:/// (or just a local file path) but can equally be a reference to a SchemaTree published on the web or in a version control repository (and specific revision thereof). A specific node instance (XSD file) within a SchemaTree can also be specified so that only that part of the Schema Tree is completed.

ttkm mydb complete entity:ENTITY

The above will root the schematic traversal at the ENTITY.xsd file in the entity directory of the SchemaTree thus making a partial completion as that fragment down only.

The complete sub-command uses the schematic command (included with the Theon installation) to traverse the SchemaTree and create a single document for it. The schematic command follows XML-Schema inclusion syntax and (by default) suppresses multiple inclusions of the same file. The result of this command is a file called complete.xsd in a directory relative to the current working directory, or a sub-directory therein named after the model. When a node instance is used the name will be NODE-complete.xsd where NODE is the base name of the XSD file which was used as the root for completion. This might just be the MODEL.xsd file at the root of the SchemaTree but can be any file within for partial completion (to complete a fragment of the SchemaTree). The NODE-complete.xsd file is written out into the same directory as the XSD file used as the root. If an output path (or file) is specified as an option to this command then that directory (or file) will be used for the result instead.

13.7.1. DDL Transform

Now, as the first example, transform the completed SchemaTree into a file of SQL DDL statements that can be used to create the physical database represented (effectively by just piping the file into psql). This is what is done internally by the install sub-command.

ttkm mydb derive ddl

Here mydb is the name of the model. If the current working directory is a checked out working copy of a Subversion repository containing a SchemaTree then the model used will default to the model within the current working directory (when a model is not specified in the command).

Otherwise the model name will be searched for as a directory (containing a SchemaTree): in the current working directory; in the local Theon installation (for models included with Theon such as self) or a specific path/URI given as an option to the command. When a path/URI is given as an option and no model name is specified the path/URI must lead to a directory containing a SchemaTree and that will be what is used. Otherwise the path/URI should contain a directory containing a SchemaTree which is the same as the model name specified in the command. A URI would normally be a file:/// (or just a local file path) but can equally be a reference to a SchemaTree published on the web or in a version control repository (and specific revision thereof). A specific node instance (XSD file) within a SchemaTree can also be specified so that only that part of the Schema Tree is derived.

The derive command will look for an instance of complete.xsd in the relevant directory that is newer (by timestamp) then MODEL.xsd in the same directory. If it does not exist, is older (by timestamp) or the complete option is specified then the complete sub-command is used first to create it. When a node instance is specified then the relevant NODE-complete.xsd file is looked for in the appropriate directory (and created if necessary).

The result of this command is a directory called derived in the current working directory, or a sub-directory therein named after the model, containing a file called ddl.sql. When a node instance is specified then the result will be a file called NODE-ddl.sql in a sub-directory named the same as the one the XSD file associated with the node instance is in. If an output path (or file) is specified as an option to this command then that directory (or file) will be used for the result instead.

ttkm derive /tmp/mydb ddl entity:ENTITY /tmp

In the above the SchemaTree must be within /tmp/mydb and the node instance entity/ENTITY.xsd within that will be transformed (via completion if necessary) into /tmp/derived/entity/ENTITY-ddl.sql.

13.7.2. DDL Differential Transform

Warning

The ddldiff transform has not been implemented yet.

This transform uses a feature of schematic which takes two SchemaTree instances and annotates the single XML-Schema document resulting from the first SchemaTree with additional markup indicating how it differs from the second SchemaTree. This facility is used in the upgrade command to produce a "changeset" of SQL DDL statements to bring the structures in a physical database matching the second SchemaTree into alignment with the first SchemaTree.

ttkm mydb derive ddldiff

The usage, function and result of this command is largely as per the ddl transform described in the section above. In fact if a complete XML document is used that is not marked up with differences (created from a single Schema Tree) then the resulting SQL DDL statements will be almost identical (functionally) to those for ddl. When using a correctly produced differential complete XML-Schema document the result will be SQL DDL statements that make the necessary structural changes reflecting the differences.

The derive command and the ddldiff transform cannot push down a call to the complete command (and so the complete option is also redundant). So the complete command must always preceed this (unless there is known to be no change to the existing complete output). The complete command is simply given an additional path or URI argument in order to produce a differential output. The resulting file is stored and named in the same way as for a non-differential output however, as the processing at the level of the derive command is identical in either case (whatever transform is used, only the ddldiff transform actually looks for and uses the additional change set markup).

The result of this command is a directory called derived in the current working directory, or a sub-directory therein named after the model, containing a file called ddldiff.sql. When a node instance is specified then the result will be a file called NODE-ddldiff.sql in a sub-directory named the same as the one the XSD file associated with the node instance is in. If an output path (or file) is specified as an option to this command then that directory (or file) will be used for the result instead.

13.7.3. DDL Documentation Transform

This transform produces the documentation for a completed SchemaTree. It results in a directory tree of HTML and SVG files that fully describe the physical database structure represented by the model (assuming the model has been sufficiently annotated). The documentation directory tree for the self model included with a Theon installation is produced in this way.

ttkm mydb derive ddldoc

The usage and function of this command is as per the ddl transform described in the section above.

The result of this command is a directory called derived in the current working directory, or a sub-directory therein named after the model, containing a directory called html. This will contain a file called index.html plus the same sub-directory structure as a SchemaTree each sub-directory containing a NODE.html file where NODE is the base name of the associated XSD file. Each sub-directory may also have one or more corresponding NODE-TYPE.svg files per node instance depending on the node type. When a node instance is specified then the result will be one of the files called NODE.html in the sub-directory named the same as the one the XSD file associated with the node instance is in, as well as one or more corresponding NODE-TYPE.svg files. If an output path (or file) is specified as an option to this command then that directory (or file) will be used for the result instead.

Note that there is another transform (called ddldocbook) that is used to produce the same documentation in a form that can be directly included in the DocBook source for this book. This should never be needed in normal development. It only needs to be used when a new release of the book is produced that needs to include updates to self model included with Theon for example.

13.7.4. DDL Data Transform

This transform is used by the reload command to produce SQL DML statements to load a SchemaTree into the XSDDB.

ttkm mydb derive ddldata

The usage and function of this command is as per the ddl transform described in the section above.

The result of this command is a directory called derived in the current working directory, or a sub-directory therein named after the model, containing a file called ddldata.sql. When a node instance is specified then the result will be a file called NODE-ddldata.sql in a sub-directory named the same as the one the XSD file associated with the node instance is in. If an output path (or file) is specified as an option to this command then that directory (or file) will be used for the result instead.

Note that another transform (called ddlxsl) is actually used to produce the ddldata transform. This should never be needed in normal development. It only needs to be used when the XSDDB structure itself is altered.

13.7.5. Other Transforms

There are various other transforms that can be used, however these are for producing datasets for the other components (TheonUI and TheonCoupler) rather than Theon itself. See the sections on these individual components for further details.

13.8. Conflict Resolution

When using a version control system merge conflicts in a SchemaTree must be dealt with using the tools of that version control system. However the files conflicts will be in are the XSD files which are formatted as XML documents. This makes the merge process complicated as it is not a structure that lends itself to easy detection and/or correction. Also the actual XML-Schema structure used to hold a SchemaTree has been left as an implementation detail and is not documented here. An easier approach exists however. This involves converting the SchemaTree into a format that can more easily be used for checking the conflicts, and then do any conflict resolution by rolling back and changing the source itself.

For example, a merge conflict in an XSD file in the entity directory could be dealt with by transforming it to a set of SQL DDL statements instead. Differences in these are more easily seen and understood. Below assumes that we are in the working copy holding the exported SchemaTree with a conflict and are using Subversion as the version control system.

svn revert entity/ENTITY.xsd
ttkm derive ddl entity:ENTITY
mv derived/entity/ENTITY-ddl.sql derived/entity/ENTITY-ddl.sql.svn
mv entity/ENTITY.xsd entity/ENTITY.xsd.svn
ttkm export entity:ENTITY
ttkm derive ddl entity:ENTITY
meld derived/entity/ENTITY-ddl.sql.svn derived/entity/ENTITY-ddl.sql
# change XSDDB content as necessary
ttkm export entity:ENTITY
svn commit entity/ENTITY.xsd

Above uses commands already described but in a more selective way. First revert the changes leading to the conflict in ENTITY.xsd. Then derive SQL DDL statements into derived/ddl.sql solely for the fragment of the Schema Tree rooted at ENTITY.xsd. Move these out of the way as they represent the current Subversion content. Now re-export just the part of the XSDDB that results in ENTITY.xsd in the SchemaTree. Then do the SQL DDL statements file derivation again. This represents the new desired state of the Subversion content. Compare the two. Make any necessary changes in the XSDDB to resolve the conflict then export again to commit the changes.

Similarly parts of the SchemaTree representing data for TheonUI can be derived into their own native format (JSON or HTML in this case) so that little or no knowledge of the XSD file content itself is required to resolve the conflict. Similarly for content intended for TheonCoupler.

The above approach won’t always work as not all of the XML document content necessarily makes it out into any particular transformation, although most ultimately should. As a final fallback the internal Theon transform to make a set of SQL DML statements to load a model into XSDDB from the Schema Tree can be used as that must by design include all elements in the XML. This may also be necessary if something in the changes causing conflict means the particular transformation to naturally use is broken as a result.

In general development to minimise merge conflicts it is always good practice to update the working copy prior to reloading the model in XSDDB from it, so before making the changes to it.