This chapter presents an overview of the actual steps and commands for using Theon to manage a PostgreSQL physical database - be it a custom private one, upstream one or locally developed shared one.
Theon should have been installed first on the desktop machine you are using to run. This may be the case by default if there is a system wide installation. If not see the Installation and Configuration chapter. You should have access to a running PostgreSQL installation on the same desktop machine. It could also be on a remote machine but for the purposes of this guide it should be on the same machine. You will need access to create databases (and maintain the content of those created databases) as yourself. Depending on your desktop machine management, this might be a local or central configuration change.
Theon is not tied to a specifc version control system, or indeed any, as
while using one is highly recommended, it is not mandatory. However, the
default toolkit commands are to some extent version control system aware. While
not necessary, using this builtin behaviour can be simpler, as it reduces the
number of command options required. If this aspect of the toolkit commands is
to be used then Git
must be used as the underlying version control
system
- either locally or against a remote repository.
You always need to start from scratch when you have a new personal physical
database that you want to start managing as a model within Theon. In
this case there will be no SchemaTree
to work from, so the initial version
of a SchemaTree
needs to be created.
In order to do this a model in XSDDB needs to be created describing the new physical database. This can be done in two ways: manually by creating the necessary entries in the XSDDB database, normally using TheonUI and the model management desktops (see later); or by importing the model content from the PostgreSQL "system catalog" of the physical database directly into the XSDDB database. Doing an import is usually better, as by avoiding most of the manual data entry, its much quicker and less error prone. It may not be as appropriate for databases where a lot of structure cannot be imported for whatever reason (usually as it is unsupported and needing manual attention). Some manual data entry may still be necessary subsequent to an import, particularly annotation for documentation purposes, but none may actually be "required".
As mentioned before, a model in Theon does not represent all the
possible structural elements of a PostgreSQL physical database. Only supported
elements can be added or imported into the XSDDB database. Structures which
are not added or imported (and therefore not directly representable in a Theon
model) need to be handled in a different way. They can be manually created
within container entities in a model. This is particularly suitable if they are
isolated and not likely to be subject to frequent change. Alternatively they
can simply be ignored and handled out of band, for example by extracting the
necessary fragments from the output of pg_dump -s
. This latter approach
does mean that the installation and upgrade of a physical database from a
model in Theon will require an additional step, one that is not part of the
Theon managed process, for handling the "out of band" structures.
By default to import content into XSDDB directly from a physical database
requires that the database must be contained in a PostgreSQL installation on
the same physical host where Theon is being run. If this is not the case
for your database a suitable LiveProfile
must be created first. Alternatively
take a schema dump (no content is necessary) and transfer that to the host
where Theon is running, and install it temporarily as a database there -
for example:
pg_dump -s -h HOSTNAME mydb > mydb.sql psql < mydb.sql
Note that this only applies when using a PostgreSQL system catalog import to create and/or update XSDDB content. The live physical database is not otherwise need to be on the same host where Theon is being run. Indeed, this would normally not be the case in a central service unless changes were being made for subsequent import directly to the live in-service database. This is not normally reccommended. It is always better to use a development snapshot of the database first to avoid any possible user disruption. In a private installation of a personal database, it is much more likely that Theon will be being run on the same host as the database service, and direct changes to the live in-service database don’t matter (as they only affect that user anyway).
The commands for model management in Theon are all found in the model
sub-command of the Theon toolkit command (theontk
). For brevity the
examples below use the ttkm
alias instead of the full command (theontk
model
).
Each command is shown in context below, but refer also to the command appendix
for full details and options. You can also do ttkm help
to get a list of
possible model sub-command actions and help on each action.
The process outlined in the sections below assumes that the XSDDB database is being populated by direct import of the system catalog of an existing physical database (either on the same host as Theon is being run or temporarily transferred to that host solely for the purpose of import, as described above). For manual creation of XSDDB content refer to the section on the model management desktops under TheonUI.
The process when using a model which is already held in an SchemaTree
of
XSD files (such as those supplied with Theon itself) is slightly different
- refer to the later section on making changes.
Create the Theon schema management database XSDDB. To recap, this is
the "transient" database that helps maintain the actual SchemaTree
of XSD
files.
ttkm create
If the XSDDB database already exists then the above command will fail with an
error. This is reasonable, as you might not want to destroy the current content
in the XSDDB database - it may reflect changes in other models that have not
been pushed back out as a SchemaTree
.
The create
subcommand is really just for user convenience and is actually an
alias for ttkm self install
. The self
model is realised as the XSDDB
physical database.
Now create a model in XSDDB from the system catalog of a physical PostgreSQL database. By default the model will be given the same name as the name of the database being imported; however this can be overridden.
ttkm import mydb
The above command takes the supported structures of the mydb
database and
imports them into XSDDB as a model called (tagged) mydb
. To import the
database using a different model name (noting that each model name
concurrently loaded within XSDDB must be unique, so this may be necessary for
example if multiple instances of the same physical database need to be held in
XSDDB) do the following instead.
ttkm MODELNAME import mydb
Note that in both cases (irrespective of model name) the realised physical
database associated with the model will always be called mydb
. This can be
changed within XSDDB manually by using TheonUI and the relevant model
management desktops.
Since the command toolkit is version control aware, if the current working
directory is established to be a checked out Subversion working copy of a
ModelLibrary
, then when the MODELNAME
is not specified it will default to
the name of that directory (rather than the name of the physical database).
The name of the physical database to import from is also optional and if not
specified will default to being the same as the MODELNAME
(or if that is not
specified the name of the current working directory when it is a Subversion
working copy of a ModelLibrary
).
The import
subcommand is really just for user convenience and is actually an
alias for a TheonCoupler stream
processing command. In this instance the
command would be theon coupler self stream pgcat mydb mydb
. That is, run the
pgcat
stream with two arguments (the first is the name of the physical
database catalog to use, the second is the name of the model to import that
as) against the self
model (which is realised as the XSDDB database). There
is no external data source in this stream, it is a live query against mydb
(in this example). For more details see the TheonCoupler chapter.
No manual alterations should subsequently need to be made at this stage, but at some point the imported structures should be annotated so as to provide useful documentation. These manual alterations would be made using TheonUI and the relevant model management desktops (see later section).
As previously mentioned only supported structures will be imported. Unsupported structures will be silently ignored. They will however be identified later on during validation and can then be addressed as appropriate.
Now the SchemaTree
can be created from the content of XSDDB.
This will represent the true initial version of the model for the physical
database in Theon.
Normally the SchemaTree
would be written out into a working copy of a
ModelLibrary
from a version control repository (the master of which can be
local or remote), as this means schema change management uses the same
approach as source code change management. However, this is not mandatory, a
filesystem directory can hold the SchemaTree
as-is with no version control,
or just simplistic version control such as frequent backups. This is the more
likely case for managing personal databases in a personal installation of
Theon. A version control system is more likely to be deployed when changes
are being managed to a physical database which is centrally hosted and might
be being updated by many independent developers simultaneously.
Similarly, when making changes to any third party model the SchemaTree
would
be a checked out copy from an accessable central repository (or installed
package) holding that (or an independently branched derivative). The examples
below show the use of a Subversion repository local to the host where Theon
is being run. This could equally just be a simple filesystem directory though
(mkdir
below instead of svn checkout
).
svn checkout file:///disk/local/mydb_modlib cd mydb_modlib ttkm mydb export
Here mydb
is the model name, not necessarily the same as the name of the
physical database it will be realised as (or was imported from).
By default the export will be into the current working directory. A new
directory will be created (if necessary) and called mydb
(in this example).
This will contain the SchemaTree
(the structure of which was
outlined in a previous section). Any existing content will be overwritten (but
files in the directory but not in the export will be left untouched).
If the current working directory is a checked out Subversion working copy of a
ModelLibrary
then the name of the model exported will default (when not
explicitly specified) to the name of the current working directory and the
export will not include the top level directory itself.
If the current working directory is not under version control then nothing more needs to be done. If it is then then the changes will ultimately need to be merged and committed back to the repository. This is assuming they represent a complete change that should be committed as a group. They may not be, as they are likely to need to be tested first, and so there be many iterations of XSDDB update and export before the changes are finally ready to be committed.
Since the command toolkit is version control aware as a convenience for Subversion repositories the following command is available.
ttkm update
This command (which is only applicable to the version control process) does a
pre-commit pass within the current working directory (only if it is a checked
out Subversion working copy). This will svn add
any new files not previously
in the working copy and svn remove
any files not touched by the most recent
export (as they must no longer be in the representation of the model). This
process is only applied to files and directories that are part of the Schema
Tree
, others are ignored.
Any merge and commit operation that would now normally follow within the version control system must be done using the tools provided by that system.
Next the model is realised, that is, install (or subsequently upgrade) a
PostgreSQL physical database from the SchemaTree
used to represent it. In
principle, if this step immediately follows an import from a pre-existing
physical database, then the pg_dump -s
output of each database should be
near identical (or functionally identical anyway). In practice this may not be
the case, in particular where the original database contained unsupported
elements.
Before doing this, in the case where a pre-existing physical database has been
imported, the original database must be deleted (after being backed up first
in order to do subsequent testing). Obviously this would not be done if it was
being used in a live production service. In this case the new Theon
managed version of the database should be created in a different PostgreSQL
installation (probably on a different host). Alternatively the model in XSDDB
could be altered to give the realised database a different name before the
content is exported as a SchemaTree
. Delete the existing database if
necessary using PostgreSQL commands, making a structural backup first (no
content is included in this).
pg_dump -s mydb > mydb.imported.sql dropdb mydb
Now (re)create the database from the SchemaTree
in the current working
directory (by default).
ttkm mydb install
Here mydb
is the name of the model to install. This will not necessarily be
the same as the name of the physical database created (the actual name is
defined within the model). If the physical database that the mydb
model
creates already exists then this command will simply fail as a safety measure.
If the current working directory is a ModelLibrary
(checked out working
copy of a Subversion repository) containing a SchemaTree
then the model
installed will default to the model represented by that SchemaTree
(when a
model name is not specified in the command).
Otherwise the model name will be searched for as a directory (containing a
SchemaTree
): in the current working directory; in the local Theon
installation (for models included with Theon such as self
) or a specific
URI given as an option to the command. When a URI is given as an option and
no model name is specified the URI must be to a directory containing a
SchemaTree
and that will be what is used. Otherwise the URI should
contain a directory containing a SchemaTree
which is the same as the model
name specified in the command. The URI would normally be a file:///
but can
equally be a reference to a SchemaTree
published on the web or in a version
control repository (and specific revision thereof).
The install
sub-command creates the mydb
physical database (or whatever
the model calls it) and all the structures within it as defined in the Schema
Tree
. It does this by first creating a single XML-Schema document for the
entire SchemaTree
(by schematic traversal using the ttkm schema
sub-command). It then transforms this using the ttkm derive
sub-command into a file containing DDL SQL statements that will create the
database and all its structures. It then uses the base theon install
command
(not part of Theon) to run the DDL SQL statements through psql
. It
will also update the model
management table (always created for a database
in the derived DDL SQL statements under the theon
schema) storing within it
meta data about the installation: a timestamp; a session user; a checksum for
the single XML-Schema document representing the complete SchemaTree
(this
is used for handling subsequent upgrades).
Normally a snapshot of the entire SchemaTree
used to install a physical
database is also embedded inside that database (in the theon.model
management table). This is useful for making subsequent differential upgrades
and is the default as, in particular, it is better for the frequent upgrade
cycle occuring during active development. However, this is not required and
the behaviour can be suppressed using the noxsd
option on the ttkm
install
command. In this case the upgrade process will automatically create
a temporary XSDDB instance, import the system catalog of the physical database
into it and then export it out again in order to re-create the corresponding
SchemaTree
. This is obviously more time consuming (particularly for large
models), so for performance do not use this option. Making live structure
changes in the database when using this option means future upgrades are made
against the current structure, not the structure as it was immediately after
the previous install/upgrade (which it would be without specifiying noxsd
).
In practice this makes no functional difference, however it does reduce the
opportunity for upgrade validation when making releases so this is another
reason to not normally use it. The noxsd
option is generally used during
release management as deployed live production databases would be expected to
have a schema under strict version control and so the schema corresponding to
the particular database structure should always reside in the repository (for
the purposes of differential upgrade).
This step is recommended after an import from a pre-existing PostgreSQL
physical database. Do a structure dump from the original database and from the
one created fresh out of the new SchemaTree
and then compare.
pg_dump -s mydb > mydb.sql meld mydb.imported.sql mydb.sql
There are likely to be some differences. For example, any unsupported elements in the original will be missing. Also some model conventions in Theon while functionally similar or identical will cause a difference in the dump outputs. For example, views in a Theon model are (by default) constructed as inline functions and in a Theon model (by default) every foreign key is automatically given an index.
The physical database created from a model in Theon also includes a
management table simply called model
(in the theon
physical database
schema). This includes various rows of meta data associated with the
installation including timestamp, checksum, serial identification or version
control system revision number (identifying the specific instance of the model
that the physical database represented on original installation and used for
handling subsequent upgrades. Manual alteration to the content of this
management table is likely to break future upgrades (or worse cause future
upgrades to corrupt actual content), so is unwise except in a purely
development scenario.
Differences associated with unsupported elements will need to be addressed before proceeding, but that is beyond the (current) scope of this document.
Since the SchemaTree
has been created, in principle there is now no longer
any need for a local instance of the XSDDB database and it can be deleted.
ttkm delete
The delete
sub-command is really just for user convenience and is actually an
alias for ttkm self destroy
.
In practice though there is no need to delete the local XSDDB instance. It can be kept around for as long as necessary, for example for future model maintenance and for loading and maintaining other models.
This diagram above illustrates the workflow described so far. So covering
manipulating a model represented in a SchemaTree
, using XSDDB either by
manual update or import from an existing PostgreSQL physical database, then
creating a new PostgreSQL physical database. The next stage of the modeling
process is concerned with how changes are managed.
create
command.
import
command (which
internally uses TheonCoupler on a stream built against the internal system
catalog).
SchemaTree
of XSD
files using the export
command.
SchemaTree
is used to install a new instance of the PostgreSQL
physical database using the install
command (which internally derives the
necessary DDL via schematic transform). The new instance should be identical
to the original instance barring some non-functional changes and with the
exception of any unsupported structures.
delete
command.
SchemaTree
is committed into a version control system
so subsequent changes to it can be tracked and managed updates can be made to
the live production service. The update
command can assist with this process
when Subversion is being used as the version control system.
Once a physical database is managed as a model within Theon it can be updated and any changes distributed and deployed in a controlled manner (when using an underlying version control system anyway).
First create the Theon schema management database XSDDB. To recap, this
is the "transient" database that helps maintain the actual SchemaTree
of
XSD files.
ttkm create
This step is only necessary if there is not an XSDDB database already, which
will often be the case when a model is in constant development. If the XSDDB
database already exists then the above command will fail with an error, so
as to protect the current content (which may contain changes to models that
have not been exported out as a SchemaTree
).
The create
subcommand is really just for user convenience and is actually an
alias for ttkm self install
. The self
model is realised as the XSDDB
physical database.
Load XSDDB with a model contained in a SchemaTree
. A SchemaTree
might
be one of the ones included in the Theon installation itself, or it might be
in a central repository of a version control system, the local filesystem, or
simply just in the current working directory.
ttkm mydb reload
Here mydb
is the model name to load, not necessarily the same as the name of
the physical database it is realised as. The command expects to find a Schema
Tree
in the current working directory if no model name is specified, or in a
sub-directory of the current working directory with a name matching that of
the model specified (mydb
in this instance). If not found the command will
search a defined path in the local installation of Theon. Optionally an
explicit path can be given that will be used to search in place of the current
working directory and the local installation.
If the current working directory is a ModelLibrary
(checked out Subversion
working copy) then the model name does not need to be specified and the model
loaded will be that represented by the SchemaTree
in the current working
directory (just as above).
A reload
operation destroys all the existing content related to the model
(mydb
in this instance) within XSDDB. It does not affect the content of any
other model also concurrently being held in XSDDB. Therefore, if there are
unexported changes in XSDDB for the model being reloaded, they should be
exported first then potentially merged and committed (if using an underlying
version control system) so they are not lost.
The reload
sub-command adds records into all the tables in the XSDDB so as
to represent the model held in a SchemaTree
. It does this by transforming
the SchemaTree
using the ttkm derive
sub-command into a single
file (called derived/model.sql
) containing SQL DML statements predominantly.
These delete any existing records for the model and then add records into all
the tables to represent the model. It then runs all the SQL statements
through psql
. For performance the derive
sub-command only does the
transform if it has not already been done for the current SchemaTree
, a
transform can always be forced using the full
option here.
The model can also be loaded in parts. This can be more efficient if only one bit is in use (or actively being worked on).
ttkm mydb reload dictionary ttkm mydb reload reflection ttkm mydb reload extraction
The above load the data model (PostgreSQL DDL) component (Dictionary
cluster)
and the TheonUI component (Reflection
cluster). Since dictionary
is a
pre-requisite for the other that will always also be loaded (destructively)
when reflection
is loaded. All reloads destroy existing data.
Now make the actual updates to the model held in XSDDB. For simple changes this is commonly done by manual editing of the content using the model management desktops in TheonUI (see later section). Alternatively some updates might be made by processing templates and loading the generated structures using a TheonCoupler stream.
For updates affecting the physical database structure the changes can also be made live on the actual physical database (realisation of the model) and imported. This is often more convenient providing a faster development cycle for structural changes (as the import can be left to be an atomic set of "related" changes once all are completed and tested).
psql mydb -c "ALTER TABLE x ADD COLUMN y INTEGER;" ttkm import mydb
The above will import the change directly from the PostgreSQL system catalog
for the physical database database mydb
into the model mydb
being held in
XSDDB. If the model name is not the same as the realised physical database
name then that must be explicitly specified as well.
Since the command toolkit is version control aware, if the current working
directory is established to be a checked out Subversion working copy of a
SchemaTree
, then when the model is not specified (as above) it will default
to the name of that directory (rather than the name of the physical database).
The name of the physical database to import from is also optional and if not
specified will default to being the same as the model (or if that is not
specified the name of the current working directory when it is a Subversion
working copy of a SchemaTree
).
Doing an import like this is not really any different to doing the original
import of a new physical database when starting from scratch in Theon
but unlike that, which will be creating all the elements from scratch, this
will just be making small adjustments to already existing elements. This is
because import is actually just an instance of a particular data feed and sync
mechanism implemented in TheonCoupler. So the import
subcommand is really
just for user convenience and is actually an alias for a TheonCoupler
stream
processing command.
While it is possible to mix and match all means of updating the XSDDB, it is
relatively important that different approaches are not used on the same
element until the changes have been exported and the SchemaTree
and live
database upgraded to match inbetween. There is effectively no master
otherwise. Manual changes to XSDDB, imported changes from the live database
instance and externally generated and coupled changes are all treated the
same. So when they update the same element each will silently override the
other simply in the order applied. This will be the case until the changes in
that element are pushed through the true master (the SchemaTree
) and
propogated through.
Once all the necessary changes have been made the SchemaTree
can be updated
from the content of XSDDB. This will then represent the updated model for (or
of when the changes were imported) the physical database.
ttkm mydb export
Here mydb
is the model name, not necessarily the same as the name of the
physical database it is realised as.
By default the export will be into the current working directory. A new
directory will be created (if necessary) and called mydb
(in this example).
It will contain the SchemaTree
. Any existing content will be overwritten
(but files in the directory but not in the export will be left untouched).
If the current working directory is a checked out Subversion working copy then the name of the model exported will default (when not explicitly specified) to the name of the current working directory and the export will not include the top level directory itself.
If the current working directory is not under version control then nothing more needs to be done. If it is then then the changes will ultimately need to be merged and committed back to the repository. This is assuming they represent a complete change that should be committed as a group. They may not be, as they are likely to need to be tested first, and so there be many iterations of XSDDB update and export before the changes are finally ready to be committed.
Since the command toolkit is version control aware as a convenience for Subversion repositories the following command is available.
ttkm update
This command (which is only applicable to the version control process) does a
pre-commit pass within the current working directory (only if it is a checked
out Subversion working copy). This will svn add
any new files not previously
in the working copy and svn remove
any files not touched by the most recent
export (as they must no longer be in the representation of the model). This
process is only applied to files and directories that are part of the Schema
Tree
, others are ignored.
Any merge and commit operation that would now normally follow within the version control system must be done using the tools provided by that system. However this is likely to be deferred until the changes have been applied and tested and any further changes made (to form a "related" grouping).
The upgrade
sub-command has not been implemented yet. An upgrade cannot
currently be made in the way described. For development purposes it is
necessary to destroy and recreate the database, that is ttkm mydb
destroy
then ttkm mydb install
. For live production database
upgrades a dump of the content must be made first so it can be reloaded after
the destroy/install cycle. It is safer to use insert statements in the
pg_dump
command to increase the likelihood that the data will reload without
compatibility issues.
Now the model changes can be applied back to the live physical database. This is not strictly necessary for changes that have been made directly via manual DDL SQL statements on the live database and then imported back into XSDDB as described above. However, it is harmless and helps to validate that the changes were correctly imported. During upgrade changes that have already been made and that are consistent with what is required in the upgrade are silently ignored.
ttkm mydb upgrade
Here mydb
is the name of the model to upgrade. This will not necessarily be
the same as the name of the physical database upgraded (the actual name is
defined within the model).
If the current working directory is a checked out working copy of a Subversion
repository containing a SchemaTree
then the physical database
upgraded will default to that defined by the model within the current working
directory (when a model name is not specified in the command).
Otherwise the model name will be searched for as a directory (containing a
SchemaTree
): in the current working directory; in the local Theon
installation (for models included with Theon such as self
) or a specific
URI given as an option to the command. When a path is given as an option and
no model name is specified the URI must be to a directory containing a Schema
Tree
and that will be what is used. Otherwise the URI should contain a
directory containing a SchemaTree
which is the same as the model name
specified in the command. The URI would normally be a file:///
but can
equally be a reference to a SchemaTree
published on the web or in a version
control repository (and specific revision thereof).
The upgrade
sub-command alters as necessary the structures within the mydb
physical database (or whatever the model calls it) to match the changes made
to the SchemaTree
. In order to do this the command constructs a new
SchemaTree
with special markup that represents the differences between the
one in the current working directory (or as otherwise specified) and the one
which represents the current state of the physical database realising the
model in that SchemaTree
.
This latter SchemaTree
is taken from the physical database itself, as a
copy is normally stashed in the the management table (called theon.model
)
during previous install
or upgrade
commands. This will not be the case
when the last of these commands was run with the noxsd
option, or normally
when using a live deployed production database built from a packaged release.
Then a temporary SchemaTree
is rebuilt by creating a temporary model in
XSDDB, importing the system catalog of the database into it, exporting it then
deleting that model in XSDDB. Alternatively a URI specified in the command
will be used. This must contain a SchemaTree
or a directory containing a
SchemaTree
which is the same as the model name specified in the command.
When using a version control system the two specified URIs can be used, for
example, to do an upgrade to a specific version of the SchemaTree
. Here the
first would point to the version to upgrade to and the second point to the
current version of the SchemaTree
installed.
Once the "differential" SchemaTree
has been produced it is transformed
(using ttkm derive
) into a file of SQL DDL statements. Then the ttkm
upgrade
command is used to upgrade the physical database by running the DDL
SQL statements through psql
. This will also update the theon.model
management table storing within it meta data about the upgrade: a timestamp; a
session user; an embedded snapshot of the SchemaTree
upgraded to (unless the
noxsd
option was specified).
When the SchemaTree
is rebuilt from the physical database above and the
model changes have been made by direct DDL statements in the database then the
generated model will also have these changes already. The resulting upgrade
SQL will be empty (if no other changes were made outside of the live
database). Whereas if the SchemaTree
is taken from the embedded snapshot
this will represent the state of the live database prior to any live changes
made. The resulting upgrade SQL will consequently contain these changes
already made in the live database. The generated DDL SQL statements are
wrapped so as to detect when a consistent change has already been made and
silently ignore it.
During development this whole process is repeated multiple times, that is,
more changes can be made (manually in XSDDB or by import/template), exported
and then re-applied by upgrade. When a "complete" set of changes has been
applied, tested and debugged then the final export of the SchemaTree
would
be merged and committed (when using an underlying version control system).
This diagram above illustrates the workflow described so far. So concerned
with changing a model represented in an SchemaTree
, in general using XSDDB
either by manual update or import from the model realised as a live PostgreSQL
physical database.
create
command.
Schema
Tree
representing the current state of that model by using the reload
command.
The XSDDB is used to manipulate the model to make the neccessary changes. This can be done in a number of ways, none of which are mutually exclusive.
import
command (which internally uses TheonCoupler on a stream built
against the internal system catalog).
SchemaFactory
by directly
editing the relevant template files and using the factor
command (which internally
uses TheonCoupler on a stream built from the aggregation of processing those
template files).
SchemaTree
(updating the existing XSD files) using the export
command.
SchemaTree
is used to upgrade the existing instance of the
PostgreSQL physical database using the upgrade
command (which internally
derives the necessary DDL differences via schematic transform). After the
upgrade the live physical database will have been structurally migrated to
correspond to the modified SchemaTree
representing it.
delete
command.
SchemaTree
is committed into a version control
system so the changes to it can be tracked and managed updates can then be
made to the live production service. The update
command can assist with
this process when Subversion is being used as the version control system.
The next sections look at some specific actions in more detail.
A published ModelLibrary
for Theon, either in a package or repository, can
be changed using exactly the same workflows and toolkit commands as described
in the previous sections. The SchemaTree
for is included with a the
development package of a ModelLibrary
and can be worked on directly from
that (by exporting the changed SchemaTree
into a local directory or version
control repository for example). Alternatively the most recent revision of the
SchemaTree
can be checked out from a central public repository and worked
on, subsequently exporting into a local directory, version control repository
or into a local working copy for committing back to the central public
repository (assuming sufficient access rights to do so). The examples below
illustrate the last case, but steps are otherwise much the same irrespective of
the original source of the SchemaTree
.
git clone git://HOST:PORT/library/MODLIB mydb cd mydb ttkm create ttkm reload # make the required changes ttkm export ttkm update git add . git commit
Above a working copy of the ModelLibrary
is checked out and made the
current working directory. This latter steps means that none of the toolkit
commands need a model name parameter as they will work directly with the model
in the current working directory. The XSDDB database is created and the
model loaded into it. The necessary changes are made in whatever way is
appropriate/required (see the earlier sections on making changes generally). A
new SchemaTree
is then exported into the current working directory. Since
this is Git we use the relevant commands to stage the changes and then
the new (altered) SchemaTree
in the working copy is committed back.
In practice in a normal development cycle it is unlikely that the changes would be immediately committed back as above. Instead the local Theon installation being used for testing would be upgraded live to the revised model and tested, probably followed by other changes iteratively until everything is correct. This is also usually necessary as changes to models are likely to include (directly or indirectly by virtue of changes to the physical database structure) changes to the datasets used for TheonUI and TheonCoupler as well as the underlying physical database schema. All these need to be consistent.
Another issue that may arise in a team development environment is conflicts on
commit when the same elements of the SchemaTree
have been changed
incompatibly by different people. These must be merged as is normal practice.
However this will typically require working directly with the XML document
format in the raw XSD files in order to determine the source of the
incompatibility (see section on Conflict Resolution).
Changes made to the SchemaTree
in a public repository for ultimately need to
be packaged to form part of a new Theon release for normal deployment onto
production services and/or wider distribution. See section on release
management for details on this.
The self
model included with Theon is the representation of the physical
database structure of the XSDDB itself. It can be represented within itself to
facilitate adding enhancements or fixing bugs identically to any other model.
However, some external processes have XSDDB elements hard coded (as there is
otherwise an issue with bootstrapping) and it is likely that these will have
to be changed as an additional step (certainly if any table structure is
altered). That is beyond the scope of this documentation (for now) since it
involves knowledge of modifying the underlying toolkit source itself. For
example, it is likely modifications would be required to the SQL XML generator
for the SchemaTree
and also to the XSLT transforms that convert the Schema
Tree
into SQL script for reloading itself into an instance of XSDDB (as well
as many other transforms).
Nonetheless the changes to the model itself (for the physical database but
also including the streams used for TheonCoupler and the desktops and panels
used to manage the content of XSDDB with TheonUI) can be done using exactly
the same workflows and toolkit commands as described previously for any other
model. The SchemaTree
for self
is included with an installation of
Theon and can be worked on directly from that (by exporting the changed
SchemaTree
into a local directory or version control repository for
example). Alternatively the most recent revision of the SchemaTree
can be
checked out from the Theon central public repository and worked on,
subsequently exporting into a local directory, version control repository or
into a local working copy for committing back to the central public repository
(assuming sufficient access rights to do so). The examples below illustrate
the last case, but steps are otherwise much the same irrespective of the
original source of the SchemaTree
.
svn checkout https://svn.theon.inf.ed.ac.uk/svn/theon/model/library/self/schemat self cd self ttkm create ttkm reload # now the database schema for XSDDB is held within XSDDB itself! # make the necessary changes ttkm export ttkm update svn commit
Above a working copy of the self
SchemaTree
is checked out and made the
current working directory. This latter steps means that none of the toolkit
commands need a model name parameter as they will work directly with the model
in the current working directory. The XSDDB database is created and the
self
model loaded into it (the model for XSDDB itself). The necessary
changes are made in whatever way is appropriate/required (see the earlier
sections on making changes generally). A new SchemaTree
is then exported
into the current working directory. Since this is Subversion we can use the
convenience sub-command update
to automatically apply any svn add
and svn
remove
statements. Then the new (altered) SchemaTree
in the working copy
is committed back.
In practice in a normal development cycle it is unlikely that the changes
would be immediately committed back as above. Instead the local Theon
installation being used for testing would be upgraded live to the revised
self
model and tested, probably followed by other changes iteratively until
everything is correct. See the section Changing the Core Model above for more
details.
The XSDDB database is not constrained to contain only one model at a time. Any number of models can be loaded and independently updated/exported.
ttkm create ttkm import mydb ttkm otherdb reload
Above two models have been loaded. The mydb
physical database has been
imported as a model and the otherdb
model
has been directly loaded from the SchemaTree
that represents it.
Both can then be exported into a SchemaTree
with one command.
ttkm export
This would create two directories in the current working directory, one for
mydb
and one for otherdb
, each containing its respective SchemaTree
.
Alternatively each model can be exported independently as and when required.
ttkm mydb export ttkm otherdb export
The SchemaTree
has been transformed in the workflows above, but only
indirectly through the install
and upgrade
commands. Often though it is
useful to be able to just do the transform step, in particular as not all the
transforms are related to the installation or upgrade of the physical
database, although we look at that first.
Before a transform is applied a single XML-Schema document representing the
entire SchemaTree
is created (by schematic traversal). Internally
transforms can be done against the SchemaTree
itself but when multiple
transforms are being applied it is more efficient to build an aggregated
document first (in particular if all the individual XSD files in a Schema
Tree
have to be fetched from a public distribution via HTTP or from a version
control repository for example).
ttkm mydb complete
Here mydb
is the name of the model. If the current working directory is a
checked out working copy of a Subversion repository containing a SchemaTree
then the model used will default to the model within the current working
directory (when a model is not specified in the command).
Otherwise the model name will be searched for as a directory (containing a
SchemaTree
): in the current working directory; in the local Theon
installation (for models included with Theon such as self
) or a specific
path/URI given as an option to the command. When a path/URI is given as an
option and no model name is specified the path/URI must lead to a directory
containing a SchemaTree
and that will be what is used. Otherwise the
path/URI should contain a directory containing a SchemaTree
which is the
same as the model name specified in the command. A URI would normally be a
file:///
(or just a local file path) but can equally be a reference to a
SchemaTree
published on the web or in a version control repository (and
specific revision thereof). A specific node instance (XSD file) within a
SchemaTree
can also be specified so that only that part of the Schema
Tree
is completed.
ttkm mydb complete entity:ENTITY
The above will root the schematic traversal at the ENTITY.xsd
file in the
entity
directory of the SchemaTree
thus making a partial completion as
that fragment down only.
The complete
sub-command uses the schematic
command (included with the
Theon installation) to traverse the SchemaTree
and create a single
document for it. The schematic
command follows XML-Schema inclusion syntax
and (by default) suppresses multiple inclusions of the same file. The result
of this command is a file called complete.xsd
in a directory relative to the
current working directory, or a sub-directory therein named after the model.
When a node instance is used the name will be NODE-complete.xsd
where NODE
is the base name of the XSD file which was used as the root for completion.
This might just be the MODEL.xsd
file at the root of the SchemaTree
but
can be any file within for partial completion (to complete a fragment of the
SchemaTree
). The NODE-complete.xsd
file is written out into the same
directory as the XSD file used as the root. If an output path (or file) is
specified as an option to this command then that directory (or file) will be
used for the result instead.
Now, as the first example, transform the completed SchemaTree
into a file
of SQL DDL statements that can be used to create the physical database
represented (effectively by just piping the file into psql
). This is what is
done internally by the install
sub-command.
ttkm mydb derive ddl
Here mydb
is the name of the model. If the current working directory is a
checked out working copy of a Subversion repository containing a SchemaTree
then the model used will default to the model within the current working
directory (when a model is not specified in the command).
Otherwise the model name will be searched for as a directory (containing a
SchemaTree
): in the current working directory; in the local Theon
installation (for models included with Theon such as self
) or a specific
path/URI given as an option to the command. When a path/URI is given as an
option and no model name is specified the path/URI must lead to a directory
containing a SchemaTree
and that will be what is used. Otherwise the
path/URI should contain a directory containing a SchemaTree
which is the
same as the model name specified in the command. A URI would normally be a
file:///
(or just a local file path) but can equally be a reference to a
SchemaTree
published on the web or in a version control repository (and
specific revision thereof). A specific node instance (XSD file) within a
SchemaTree
can also be specified so that only that part of the Schema
Tree
is derived.
The derive
command will look for an instance of complete.xsd
in the
relevant directory that is newer (by timestamp) then MODEL.xsd
in the same
directory. If it does not exist, is older (by timestamp) or the complete
option is specified then the complete
sub-command is used first to create
it. When a node instance is specified then the relevant NODE-complete.xsd
file is looked for in the appropriate directory (and created if necessary).
The result of this command is a directory called derived
in the current
working directory, or a sub-directory therein named after the model,
containing a file called ddl.sql
. When a node instance is specified then the
result will be a file called NODE-ddl.sql
in a sub-directory named the same
as the one the XSD file associated with the node instance is in. If an output
path (or file) is specified as an option to this command then that directory
(or file) will be used for the result instead.
ttkm derive /tmp/mydb ddl entity:ENTITY /tmp
In the above the SchemaTree
must be within /tmp/mydb
and the node
instance entity/ENTITY.xsd
within that will be transformed (via completion if
necessary) into /tmp/derived/entity/ENTITY-ddl.sql
.
The ddldiff
transform has not been implemented yet.
This transform uses a feature of schematic
which takes two SchemaTree
instances and annotates the single XML-Schema document resulting from the
first SchemaTree
with additional markup indicating how it differs from the
second SchemaTree
. This facility is used in the upgrade
command to
produce a "changeset" of SQL DDL statements to bring the structures in a
physical database matching the second SchemaTree
into alignment with the
first SchemaTree
.
ttkm mydb derive ddldiff
The usage, function and result of this command is largely as per the ddl
transform described in the section above. In fact if a complete XML document
is used that is not marked up with differences (created from a single Schema
Tree
) then the resulting SQL DDL statements will be almost identical
(functionally) to those for ddl
. When using a correctly produced
differential complete XML-Schema document the result will be SQL DDL
statements that make the necessary structural changes reflecting the
differences.
The derive
command and the ddldiff
transform cannot push down a call to
the complete
command (and so the complete
option is also redundant). So
the complete
command must always preceed this (unless there is known to be
no change to the existing complete output). The complete
command is simply
given an additional path or URI argument in order to produce a differential
output. The resulting file is stored and named in the same way as for a
non-differential output however, as the processing at the level of the
derive
command is identical in either case (whatever transform is used, only
the ddldiff
transform actually looks for and uses the additional change set
markup).
The result of this command is a directory called derived
in the current
working directory, or a sub-directory therein named after the model,
containing a file called ddldiff.sql
. When a node instance is specified then
the result will be a file called NODE-ddldiff.sql
in a sub-directory named
the same as the one the XSD file associated with the node instance is in. If
an output path (or file) is specified as an option to this command then that
directory (or file) will be used for the result instead.
This transform produces the documentation for a completed SchemaTree
. It
results in a directory tree of HTML
and SVG
files that fully describe the
physical database structure represented by the model (assuming the model has
been sufficiently annotated). The documentation directory tree for the self
model included with a Theon installation is produced in this way.
ttkm mydb derive ddldoc
The usage and function of this command is as per the ddl
transform described
in the section above.
The result of this command is a directory called derived
in the current
working directory, or a sub-directory therein named after the model,
containing a directory called html
. This will contain a file called
index.html
plus the same sub-directory structure as a SchemaTree
each
sub-directory containing a NODE.html
file where NODE
is the base name of
the associated XSD file. Each sub-directory may also have one or more
corresponding NODE-TYPE.svg
files per node instance depending on the node type.
When a node instance is specified then the result will be one of the files
called NODE.html
in the sub-directory named the same as the one the XSD file
associated with the node instance is in, as well as one or more corresponding
NODE-TYPE.svg
files. If an output path (or file) is specified as an option to
this command then that directory (or file) will be used for the result
instead.
Note that there is another transform (called ddldocbook
) that is used to
produce the same documentation in a form that can be directly included in the
DocBook source for this book. This should never be needed in normal
development. It only needs to be used when a new release of the book is
produced that needs to include updates to self
model
included with Theon for example.
This transform is used by the reload
command to produce SQL DML statements
to load a SchemaTree
into the XSDDB.
ttkm mydb derive ddldata
The usage and function of this command is as per the ddl
transform described
in the section above.
The result of this command is a directory called derived
in the current
working directory, or a sub-directory therein named after the model,
containing a file called ddldata.sql
. When a node instance is specified then
the result will be a file called NODE-ddldata.sql
in a sub-directory named
the same as the one the XSD file associated with the node instance is in. If
an output path (or file) is specified as an option to this command then that
directory (or file) will be used for the result instead.
Note that another transform (called ddlxsl
) is actually used to produce the
ddldata
transform. This should never be needed in normal development. It
only needs to be used when the XSDDB structure itself is altered.
When using a version control system merge conflicts in a SchemaTree
must be
dealt with using the tools of that version control system. However the files
conflicts will be in are the XSD files which are formatted as XML documents.
This makes the merge process complicated as it is not a structure that lends
itself to easy detection and/or correction. Also the actual XML-Schema
structure used to hold a SchemaTree
has been left as an implementation
detail and is not documented here. An easier approach exists however. This
involves converting the SchemaTree
into a format that can more easily be
used for checking the conflicts, and then do any conflict resolution by
rolling back and changing the source itself.
For example, a merge conflict in an XSD file in the entity
directory could
be dealt with by transforming it to a set of SQL DDL statements instead.
Differences in these are more easily seen and understood. Below assumes that
we are in the working copy holding the exported SchemaTree
with a conflict
and are using Subversion as the version control system.
svn revert entity/ENTITY.xsd ttkm derive ddl entity:ENTITY mv derived/entity/ENTITY-ddl.sql derived/entity/ENTITY-ddl.sql.svn mv entity/ENTITY.xsd entity/ENTITY.xsd.svn ttkm export entity:ENTITY ttkm derive ddl entity:ENTITY meld derived/entity/ENTITY-ddl.sql.svn derived/entity/ENTITY-ddl.sql # change XSDDB content as necessary ttkm export entity:ENTITY svn commit entity/ENTITY.xsd
Above uses commands already described but in a more selective way. First
revert the changes leading to the conflict in ENTITY.xsd
. Then derive SQL
DDL statements into derived/ddl.sql
solely for the fragment of the Schema
Tree
rooted at ENTITY.xsd
. Move these out of the way as they represent the
current Subversion content. Now re-export just the part of the XSDDB that
results in ENTITY.xsd
in the SchemaTree
. Then do the SQL DDL statements
file derivation again. This represents the new desired state of the Subversion
content. Compare the two. Make any necessary changes in the XSDDB to resolve
the conflict then export again to commit the changes.
Similarly parts of the SchemaTree
representing data for TheonUI can be
derived into their own native format (JSON or HTML in this case) so that
little or no knowledge of the XSD file content itself is required to resolve
the conflict. Similarly for content intended for TheonCoupler.
The above approach won’t always work as not all of the XML document content
necessarily makes it out into any particular transformation, although most
ultimately should. As a final fallback the internal Theon transform to
make a set of SQL DML statements to load a model into XSDDB from the Schema
Tree
can be used as that must by design include all elements in the XML.
This may also be necessary if something in the changes causing conflict means
the particular transformation to naturally use is broken as a result.
In general development to minimise merge conflicts it is always good practice to update the working copy prior to reloading the model in XSDDB from it, so before making the changes to it.