Chapter 15. Schema Factory

Some objects in a model can be maintained through what is called the SchemaFactory. Objects produced this way are streamed into XSDDB via TheonCoupler when they are updated, but are otherwise loaded from a SchemaTree like all other objects. In the SchemaFactory, objects are generated from an external source. Theon provides a default framework for this (Template), but any other framework could be used as long as its output is compatible with the gather stream structure. Elements (or aspects of elements) maintained in this way can still be edited directly in XSDDB, but any such changes would be lost on a subsequent update and reload. This is because the master for this data becomes the external source rather than the exported SchemaTree, which for this data is just a snapshot copy that facilitates further processing.

The SchemaFactory in Theon supports providing entity definitions externally, and in principle any entity content (generally the actual view and function bodies). Large blocks of SQL are more easily edited in local files with a familiar text editor of the user's choice, and more easily tested through a raw command interface such as psql, than through a web browser interface such as TheonUI. Functions (and views) can also be built automatically from templates using a template processing engine. Multiple variant instances can be produced from a single template together with higher level per-instance descriptions. This is more robust, efficient and maintainable than manually editing multiple "similar" definitions and keeping them consistent. In both cases the SchemaFactory is then used to take this externally maintained content into XSDDB and thence into the SchemaTree itself. Since any entity content is supported, the SchemaFactory can also be used to maintain the content of container entities holding blocks of SQL that are to be incorporated into the physical database structure but cannot otherwise be represented through Theon.

Support can easily be added to the SchemaFactory for any other type of external data that needs to be mapped into a model. Entire sections of schema can be streamed in to supplement existing schema, for example to dynamically create and change structure based on a completely different external data representation. The import action in the workflows, which loads a model from an existing PostgreSQL physical database catalog, is effectively just an extreme instance of streaming content in from an external representation, although it uses a dedicated stream rather than the SchemaFactory. Adding support is simply a case of providing the data source and a suitable TheonCoupler configuration designed to be used with XSDDB. Further details are, however, beyond the (current) scope of this document.

It is not mandatory to define any or all view and function bodies by using the SchemaFactory. In a model representing a small database it would probably be an unnecessary step (simply copy/paste content from a separate editor back into XSDDB and/or use catalog import). In models for larger databases, however, it can quickly become necessary. When making changes to view and function body definitions in templates, be aware that they are "all" being mastered in the SchemaFactory, outside of the exported SchemaTree, and consequently changes made directly to the SchemaTree or via XSDDB will not be preserved.

It is not necessary to run TheonCoupler to update XSDDB if the external source data has not changed. The streamed content is normally loaded into XSDDB from the most recently generated snapshot in the SchemaTree itself (when using reload), along with the rest of the elements in the SchemaTree. Since the streamed content should not be altered in XSDDB, it will be exported back out into the SchemaTree exactly as it was, as for other content. When the streamed content does need to be changed, the original external source data files should be updated and then re-streamed into XSDDB so that the exported SchemaTree reflects the changes. When using an underlying version control system, the changed external source data files would usually be committed along with the changed SchemaTree for consistency, although that is not required (in either case the changes will not persist until the master source files are committed). However, this only applies to the default framework provided for the SchemaFactory in Theon. The external source data could come from anywhere, including a remote live data service, in which case it is not appropriate or even possible to combine the changes with the resulting SchemaTree under one commit.

Sections below describe how to update external content in the SchemaFactory using the default Template framework provided with Theon.

15.1. Updating Entities and Entity Bodies

The external data for entities and their body definitions is held in the factory sub-directory of the ModelLibrary.

  MODLIBDIR
    factory/generator/CLASS/*.grg
    factory/templates/[*/]*.fat

The MODLIBDIR directory is usually just the name of the model the SchemaFactory is contributing to (or rather it will be named using the value of the tag attribute for that model). The SchemaTree is also held here, in the schemat sub-directory; see the earlier section on its structure.

The factory contains a generator sub-directory and a templates sub-directory. Below the generator directory is a sub-directory per class, the content of which is structurally free but ultimately will need to contain one or more *.grg files which define the generator class. The content of templates is also structurally free but ultimately will need to contain one or more *.fat files which define templates.

The factory processing recursively collates and processes all files ending with .fat found within the templates directory. The actual location of a file within that directory is irrelevant and does not affect the final output. Each template file includes the relevant generator definition either from the local generator directory or from the search path.
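The collation rule can be illustrated with a short Python sketch. This is not the toolkit's own implementation, only a minimal model of "every .fat file at any depth counts". The function name collect_fat_files and the example directory layout are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_fat_files(templates_dir):
    """Return every *.fat file below templates_dir, at any depth, sorted by path."""
    return sorted(p for p in Path(templates_dir).rglob("*.fat") if p.is_file())


def demo():
    # Build a throwaway layout: two templates at different depths plus a header.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "factory" / "templates"
        (root / "reports" / "daily").mkdir(parents=True)
        (root / "top.fat").write_text("")
        (root / "reports" / "daily" / "summary.fat").write_text("")
        (root / "reports" / "common.grg").write_text("")  # not a template, ignored
        return [p.relative_to(root).as_posix() for p in collect_fat_files(root)]


if __name__ == "__main__":
    print(demo())
```

Both template files are picked up regardless of how deeply they are nested, while the .grg file is skipped because of its suffix.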

The optional template sub-directory contains additional template header files with .grg suffix (since that is the preferred suffix for gurgle processed content). These are template headers that provide a higher level of abstract processing specific to the model. Any structure can exist below the template sub-directory to organise these headers appropriately. The root of the template sub-directory is added to the gurgle search path so that individual fat files can include them.

The structure is ultimately analogous to directories containing .c source files and also .l definition files that must first be processed by Lex to produce additional .c source files, all of which are compiled to produce the output binary (or, in the case of Theon, the final ddl.sql related files). Normally (as in Theon for the generated .xsd files) the generated .c files are included along with the template that produces them (the .l file) so that Lex is not actually required to rebuild the final binary. Modifying the generated .c file will work, but the changes will not persist unless the source .l file is also modified and the .c file rebuilt. In the same way, in Theon a modification to the generated .xsd file (via XSDDB) will not persist unless the corresponding .fat file is updated and the .xsd file rebuilt (by streaming the changes into XSDDB and exporting).

The fat files are processed using pggurgle which is a standalone application included with a Theon installation. While pggurgle itself is generic the fat file must be written in a certain way so that it works to generate a suitable entity definition. This is done by including a suitable header defined in the SchemaFactory Generator framework. The generated output is collated and used as the data source for the forge stream in TheonCoupler which then effectively loads it into the XSDDB itself. All fat files in the factory/templates directory are considered to contribute to the stream, so common header files and supporting files must either use a different suffix or would normally be placed in the corresponding class generator sub-directory. Include files contain embedded "schema" specific content which is only active, as such, if the file is included (or eventually included) by a fat file that is held in the factory/templates directory.

A packaged ModelLibrary (or the central repository) holds all the associated fat template files and grg generator header files that are part of that model. They can be copied and modified directly in a local filesystem copy, or modified in a working copy (of the central repository or a local repository) and committed back.

There is a special wrapper command to process the fat files and load them into XSDDB. This refers to the current working directory (working copy) by default, but can also refer to installed versions, or to versions in a specific working copy branch or local filesystem, to facilitate update and testing. This command needs to be run after changing the fat files, otherwise the changes will never be pushed into XSDDB, exported into the SchemaTree and subsequently transformed. This is the case even if the fat file is committed. Although a full build is always done as part of building an installation package, a manual build is needed to facilitate testing.

ttkm mydb gather

This is a convenience command to rebuild all entity definitions that are in the factory sub-directory of the current working directory (or in a sub-directory of a directory named after the model specified, mydb in this case). Normally these are just the actual view and function bodies for entities that are otherwise present in the SchemaTree through being defined via XSDDB itself. In effect it processes all fat suffixed files. Once built, the resulting output is aggregated to form a data source for the forge stream so that it can be loaded into XSDDB. The aggregated data source is effectively a delimited text file equivalent to parts of the pgcat stream.

Alternatively, to rebuild and reload just one file (leaving the values of the others unchanged in XSDDB), use one of the following forms.

ttkm mydb gather for FILE
ttkm mydb gather against relations for FILE

Where FILE is the base name of the fat file (without the .fat suffix). This is most useful when changes have been made only to specific known files, in which case there is no need to do a full reload of all the others, which are known not to have changed.
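As a small illustration of what "base name without the .fat suffix" means, the Python sketch below derives such a name from a path. The helper name fat_base_name and the example path are hypothetical and are not part of the toolkit.

```python
from pathlib import Path


def fat_base_name(path):
    """Return the file name of a .fat file with directories and suffix removed."""
    p = Path(path)
    if p.suffix != ".fat":
        raise ValueError(f"not a fat file: {path}")
    return p.stem


if __name__ == "__main__":
    print(fat_base_name("factory/templates/reports/summary.fat"))
```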

The gather action processes fat files to produce forge data, then runs the forge stream with the generated data. The first part is effectively pggurgle FILE, with some options for this particular content environment, which simply runs pggurgle over that file (or over all the relevant files, in parallel where possible). The stream part is effectively self stream forge FILE. That is, run the forge stream against the self model (which is realised as the XSDDB database) with an argument which is the name of the constraining data subset to use (if any). The stream is parameterized so that when only one item is being processed in the synchronisation the other items are ignored (rather than being marked for deletion). The external data source for this stream is actually a dedicated script. The script aggregates the results of processing each fat file to produce the stream data source on demand. For more details on streaming in data see the TheonCoupler chapter.
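The difference between a full synchronisation and one constrained to a single item can be sketched in Python as follows. This is only a model of the general idea and does not claim to reflect how TheonCoupler implements it. The function plan_sync, its arguments and the example data are all hypothetical.

```python
def plan_sync(existing, incoming, subset=None):
    """Plan a one-way sync of incoming items into existing items.

    existing and incoming map item names to bodies. When subset names a
    single item, every other item is left out of the comparison entirely,
    so it can never be marked for deletion.
    """
    if subset is not None:
        existing = {k: v for k, v in existing.items() if k == subset}
        incoming = {k: v for k, v in incoming.items() if k == subset}
    return {
        "insert": sorted(k for k in incoming if k not in existing),
        "update": sorted(
            k for k in incoming if k in existing and incoming[k] != existing[k]
        ),
        "delete": sorted(k for k in existing if k not in incoming),
    }


if __name__ == "__main__":
    in_database = {"view_a": "old body", "view_b": "body"}
    rebuilt = {"view_a": "new body"}  # only one file was reprocessed
    print(plan_sync(in_database, rebuilt))                   # unconstrained
    print(plan_sync(in_database, rebuilt, subset="view_a"))  # constrained
```

In the unconstrained call view_b would be planned for deletion because it is absent from the incoming data. In the constrained call it is simply ignored, which is the behaviour wanted when only one file has been rebuilt.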

It would be possible to use a different data source (different templating engine and/or source file format) instead of fat files and pggurgle processing. However, to do this would require replacing the existing script with one that processes the alternate source into output data with the same format as the one used for the existing forge stream in Theon. The gather command can take an option to suppress the first step above (pggurgle processing) where the correctly formatted output data is provided in a different way (externally). Alternatively the stream configuration for forge in the model for XSDDB could be altered, or an additional custom one added, to process an entirely differently formatted data source.

The supporting entity definition framework for the SchemaFactory is defined in base generator class headers available in a Theon installation at model/factory/generator. The path is included via the GURGLE_PATH environment (normally set within the toolkit). The full syntax to use for any given factory file is documented within the relevant header file itself, which should be consulted directly; it is not documented here.
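How an include name might be resolved against a list of search directories can be sketched in Python as below. This is an assumption-laden illustration, not a description of pggurgle: the function resolve_include is hypothetical, the "first match wins" rule and the ordering of local before installed directories are assumptions, and the format of GURGLE_PATH itself is not covered here.

```python
import tempfile
from pathlib import Path


def resolve_include(name, search_dirs):
    """Return the first existing file called name under search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def demo():
    # Throwaway layout: an empty local generator directory and an
    # installed one that holds the header being looked for.
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "local" / "factory" / "generator"
        system = Path(tmp) / "system" / "model" / "factory" / "generator"
        local.mkdir(parents=True)
        (system / "relation").mkdir(parents=True)
        (system / "relation" / "view.grg").write_text("")
        found = resolve_include("relation/view.grg", [local, system])
        missing = resolve_include("relation/absent.grg", [local, system])
        return found == system / "relation" / "view.grg", missing


if __name__ == "__main__":
    print(demo())
```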

There are other objects that can be created through the SchemaFactory beyond those described below, see the factory header directory for details. Note that the SchemaFactory is not model specific - it is a generic framework to automatically build entity definitions that can be imported into XSDDB via the forge TheonCoupler stream. The actual fat file in the SchemaTree contains the schema specific content (specific to the model which the SchemaTree it is held within represents).

15.1.1. View Entity Syntax

To construct a view entity and/or body definition from the SchemaFactory use the following at the very top of the fat file.

%%include "relation/view.grg"

Refer to model/factory/generator/relation/view.grg for details and class specific options.

15.1.2. Process Entity Syntax

To construct a process entity and/or body definition (such as for containers) and also dependencies and related events from the SchemaFactory use the following at the very top of the fat file.

%%include "relation/entity.grg"

Refer to model/factory/generator/relation/entity.grg for details and class specific options.

15.1.3. Custom Model Factory Frameworks

The SchemaFactory can contain higher level model specific frameworks for generating views and functions. These are built on top of the provided headers so that a simplified interface is exposed for single entity construction or that wrap a complicated multi-entity construction.

model/library/MODLIBDIR/factory/generator/CLASS/*.grg
model/library/MODLIBDIR/factory/templates/*.fat

To construct entities from the SchemaFactory using these specific frameworks put the following, for example, at the very top of the fat file.

%%include "CLASS/FILE.grg"

Where FILE is appropriate to context. Refer to the model/library/MODLIBDIR/factory/generator/CLASS/*.grg files for further details on usage.

Others can be added to any model. Alternative directories for user defined factory frameworks can be added to the search path where necessary.