This recipe is for creating a new model from scratch in Theon, including using the coupler and factory. To use a real world example this will be for the Matlab TAH Licensing Anaysis Database, otherwise known as "mattah" (as in "wot’s the").
This recipe can be run on any machine that has the necessary packages installed and suitable configuration of PostgreSQL (see configuration recipe).
In addition this recipe has some support files, locally available. Obtain the support files archive (mattah2theon.tgz) and extract somewhere suitable and set the path in $MATLAB2THEON to point at them. Then do an initial database load from the dump file.
cd /tmp tar zxvf -/matlab2theon.tgz export MATLAB2THEON=/tmp/matlab2theon createdb matlab psql matlab < ${MATLAB2THEON}/matlab.sql
(1) Create our new ModelLibary directory and change into it. We will start from
the existing prototype database called matlab
(available to load from the
recipe example support files, see above) on localhost so we launch with a
LiveProfile for that.
ttkm mattah launch at /tmp using database=matlab host=/disk/scratch cd /tmp/mattah
(2) Create a local management database (destroy prior instance, don’t do this
if you have un-exported content). Import our current matlab
database.
ttkm create aok ttkm import
(3) Reset our LiveProfile for what will be the new Theon managed database name.
ttkm launch aok using database=mattah host=/disk/scratch
(4) Update the model default database name for consistency, we use the UI to do this.
ttkm self server ui
Point a locally running browser at localhost:8000
. Choose the Dictionary
desktop and on the first panel select the mattah
model (it will be the only
model if the management database was created anew in step 1). Change the
Name[Realised]
value from matlab
to mattah
. Then Control-C the server
process.
(5) Now export the model and do a test installation.
ttkm export ttkm derive ddl ttkm install psql mattah -c "\d"
If all ok then that is the database now held in Theon as a ModelLibrary. We can for re-assurance though do a diff between the original and this by dumping each and comparing (although note that they are rather unlikely to be absolutely identical, but would be looking here for any important functional differences).
pg_dump95 -s matlab > /tmp/m.old pg_dump95 -s mattah > /tmp/m.new diff /tmp/m.old /tmp/m.new
In this case only indexes have not been imported. FIX ME. Then delete the original database.
dropdb matlab
(6) We now want to setup a factory to build some additional views and functions for the model.
mkdir -p factory/generator/mattah mkdir factory/templates
(7) Create a generator specific to this model to contain some defaults for view templates. This can be copied from the recipe support files.
cp ${MATLAB2THEON}/factory/generator/mattah/view.grg factory/generator/mattah
Review the content.
$ cat factory/generator/mattah/view.grg %%include "relation/view.grg" %%equate relation_ViewName BASENAME(_eq_file, ".fat") %%equate relation_ViewValueBlock "SQL_View" %%equate relation_ForgeViewRequiresBlock "SQL_ViewRequires"
The first line includes the base class in the Factory to build a View. The second line defaults the name of the view to be the same as the name of the defining template file. The next two lines define the block names that will be used to define the view itself as well as any pre-requisites.
(8) Create a file for each individual view. These will each be named
factory/templates/VIEWNAME.fat
. Copy each from the recipe support files.
cp ${MATLAB2THEON}/factory/templates/concurrency_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/costs_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/maximums_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/report.fat factory/templates cp ${MATLAB2THEON}/factory/templates/usage_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/usage_data.fat factory/templates
Review the content of each, the structure of which will be as below.
%%include "mattah/view.grg" %%block SQL_ViewRequires VALUES ( 'VIEWNAME', NULL ), ( 'VIEWNAME', NULL ) %%end %%block SQL_View <<<VIEW SQL DEFINITION>>> %%end
Note that the template files do not define any dependencies on tables (these will be imported below), but dependencies on views have been added to ensure the correct creation order.
(9) Process the Factory to create all the views in the management database.
ttkm gather
(10) Export, re-derive and re-install the mattah
database. Then import again
(this time from the mattah
database as it is the default in the LiveProfile)
which will set up each View dependencies and columns, then export again to
update the SchemaTree.
ttkm export ttkm derive ddl ttkm install aok ttkm import ttkm export
Check the views have been defined.
psql mattah -c "\d"
(9) Next a quick test that the default TheonUI Browser interface works. First derive the UI meta data. Then start up the server.
THEON_MODEL_XSLT_PATH=/usr/share/theon/ui/xslt ttkm derive ui ttkm server ui
Navigate a locally running browser to localhost:8000
and choose the Browser
desktop, All Clusters
and then test opening each of the tables under the
Choose Table
menu (although obviously there will be no data in any of them
yet). Note that the views will not be listed, they are not directly accessable
via TheonUI. Then Control-C the server process.
(10) Set up a basic Stream to load the products
table data. This Stream does
not have any couples as the data is loaded straight into the target table an no
further work is required. Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the Stream.
Open the "Streaming" desktop. On the first panel ("Choose Model") ensure the right model is shown and then add a new row in the "Streams" box: Stream-Label = Products Origin = PROD ? Feed Entity = products Source = File Format = CSV Name[Info] = Products
Control-C the server process. Then build the TheonCoupler configuration.
ttkm export ttkm derive coupler
Do a test to run the stream agent and load the data. A data file in the correct format is available in the recipe support files.
ttkm stream agent products refresh from ${MATLAB2THEON}/products.csv
Check the data loaded.
psql mattah -c "SELECT * FROM products;"
(11) Set up a Stream with a Couple to load the licenses
table data. This
Stream will use a FDW (Foreign Data Wrapper) which will be created live and
then imported back into the management database. The data file that is the
source for the FDW is available in the recipe support files.
psql mattah -c "CREATE EXTENSION file_fdw;" psql mattah <<EOF CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw; CREATE FOREIGN TABLE licenses_data ( logentry INTEGER, logdate DATE, logtime TIME WITHOUT TIME ZONE, state TEXT, product TEXT, uun TEXT, machine TEXT, infinnac INTEGER ) SERVER file_server OPTIONS (format 'csv', filename '${MATLAB2THEON}/licenses.csv'); EOF
Import the added DDL back into the management database. This will only import the extension defintion however as FDW tables are not natively supported in Theon.
ttkm import
Add the necessary DDL for the FDW manually into the management database (although this could also be done by a Factory Template). Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the FDW.
Open the "Dictionary" desktop. On the first panel ("Model") ensure the right model is shown and then add a new row in the "Entities" box: Name[Realised] = licenses_data Type = Container Content = SQL Open the "Entity" panel. Query for the "licenses_data" record. Copy the SQL above (between the EOF delimiters, so not the extension) into the "Value" field (remember to manually expand the ${MATLAB2THEON} variable when doing this as it won't be expanded within PostgreSQL). Add the following rows in the "Attributes" box: Position Name[Realised] Type Content 1 logentry Column Number/Integer 2 logdate Column Date 3 logtime Column Time 4 state Column Text 5 product Column Text 6 uun Column Text 7 machine Column Text 8 infinnac Column Number/Integer
Control-C the server process.
Export, re-derive and re-install to check the FDW has been correctly added.
ttkm export ttkm derive ddl ttkm install aok psql mattah -c "SELECT * FROM licenses_data LIMIT 10;"
Create the Stream and a Couple using the FDW source and with a literal to add an extra column’s content. Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the FDW.
Open the "Streaming" desktop. On the first panel ("Choose Model") ensure the right model is shown and then add a new row in the "Streams" box: Stream-Label = Licenses Origin = LICE Name[Info] = Licenses Open the "Stream" panel and ensure "Licenses" is shown and then add a new row in the "Couples" box: Position = 1 Source Entity = licenses_data Target Entity = licenses Name[Info] = Licenses Open the "Couple" panel and enter and save the following. Source Origin Value = 'LIC' Then add two new rows in the "Bonds" box: Position Type 1 Handle 2 Column Open the "Bond" panel and add rows in the "Pairs" box as below: For "Handle": Position = 1 Source Column = logentry Target Column = logentry For "Column": Position Source Column Target Column 1 logdate logdate 2 logtime logtime 3 state state 4 product product 5 uun uun 6 machine machine 7 infinnac infinnac Then add one more row for "Column": Position = 8 Source Value = (@logdate || ' ' || @logtime)::TIMESTAMP Target Column = logts
Control-C the server process.
Rebuild the database to accommodate changes made for TheonCoupler.
ttkm export ttkm derive ddl ttkm install aok ttkm derive coupler
Build the TheonCoupler configuration.
ttkm derive coupler
Do a test to run the stream agent and load the data. The licenses data is a large file creating nearly one million records so this process will take approximately 10 minutes. Note that an FDW was used here as an exemplar and is not the most optimal way to load and process this data.
ttkm stream agent licenses
Check the data loaded.
psql mattah -c "SELECT * FROM licenses LIMIT 10;"
Note that when using a Container
as a source for a Couple like this it must
be (produce) a table-like or view-like object to which permissions can be
granted and to which a SELECT
can be applied.
(12) Set up a Stream to load the usage
table data. This Stream uses a View,
which will be defined by Factory Template first, and one Couple to process.
Create the Factory Template file for the file, this can be copied from the recipe support files.
cp ${MATLAB2THEON}/factory/templates/usage_data.fat factory/templates
Process the Factory content, then rebuild the database with the new view.
ttkm gather ttkm export ttkm derive ddl ttkm install aok
Do an import to pick up the new View’s columns so they can be used in the definiton of the Couple.
ttkm import
Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the Stream and Couple.
Open the "Streaming" desktop. On the first panel ("Choose Model") ensure the right model is shown and then add a new row in the "Streams" box: Stream-Label = Usage Origin = USAG Name[Info] = Usage Open the "Stream" panel and ensure "Usage" is shown (by query or row selection) and then add a new row in the "Couples" box: Position = 1 Source Entity = usage_data Target Entity = usage Name[Info] = Usage Open the "Couple" panel and enter and save the following. Source Origin Value = 'USE' Then add two new rows in the "Bonds" box: Position Type 1 Handle 2 Column Open the "Bond" panel and add rows in the "Pairs" box as below: For "Handle": Position = 1 Source Column = uun Target Column = uun For "Column": Position = 1 Source Column = n Target Column = n
Control-C the server process.
FIXME - the Source Origin Value and Origin are mandatory so probably should have a default.
Now export the changes, rebuild the DDL and re-install the database in order to add the TheonCoupler support functions.
ttkm export ttkm derive ddl ttkm install aok
Rebuild the TheonCoupler configuration.
ttkm derive coupler
Test the feed works - in principle only as there is no data for it to process
(unless we first rerun the licenses
stream).
ttkm stream agent usage
(13) Set up a Stream to load the category
table data. This Stream uses a live
dblink
to the central infdb
database to get the base data and a view to
wrap the data. While the two could be amalgamated this will result in
authentication issues - the dblink
is run with the authentication of the
PostgreSQL server, which if using the supplied configuration will be the local
user which should have access to infdb
. The Couple is run as a specific role
without (by default) the necessary remote access.
Create the Factory Template files for the dblink
function and wrapper view,
these can be copied from the recipe support files.
cp ${MATLAB2THEON}/factory/templates/people_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/people_data_type.fat factory/templates cp ${MATLAB2THEON}/factory/templates/people.fat factory/templates
Review the content. The first file defines the function, the second the type returned by the dblink call and the third the wrapper view.
Process the Factory content, then rebuild the database with the new structures.
ttkm gather ttkm export ttkm derive ddl ttkm install aok
Test the dblink
returns data from infdb
via the view wrapper.
psql mattah -c "SELECT * FROM people;"
Do an import to pick up the new View’s columns so they can be used in the definiton of the Couple.
ttkm import
Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the Stream and Couple.
Open the "Streaming" desktop. On the first panel ("Choose Model") ensure the right model is shown and then add a new row in the "Streams" box: Stream-Label = Categories Origin = CATE Name[Info] = Categories Open the "Stream" panel and ensure "Categories" is shown (by query or row selection) and then add a new row in the "Couples" box: Position = 1 Source Entity = people Target Entity = category Name[Info] = Category Open the "Couple" panel and enter and save the following. Source Origin Value = 'CAT' Then add two new rows in the "Bonds" box: Position Type 1 Handle 2 Column Open the "Bond" panel and add rows in the "Pairs" box as below: For "Handle": Position = 1 Source Column = uun Target Column = uun For "Column": Position Source Column Target Column 1 status status 2 category category 3 fundingcat fundingcat
Control-C the server process.
Now export the changes, rebuild the DDL and re-install the database in order to add the TheonCoupler support functions.
ttkm export ttkm derive ddl ttkm install aok
Rebuild the TheonCoupler configuration.
ttkm derive coupler
Test the feed works.
ttkm stream agent categories
Check loaded content.
psql mattah -c "SELECT * FROM category LIMIT 10;"
(14) The final Stream to set up is used for internal processing rather than for
loading data. It uses a paramaterized function as a primary source which can
be called on demand through the stream in order to analyse specific ranges of
license data. This populates the aligned
table with corresponding licenses
data. One Couple does this. Three further Couples do more processing in a
pipeline from the aligned
table into the concurrency
table, then into the
maximums
table then into the costs
table.
Create the Factory Template files for the base function and return type, these can be copied from the recipe support files.
cp ${MATLAB2THEON}/factory/templates/aligned_data.fat factory/templates cp ${MATLAB2THEON}/factory/templates/aligned_data_type.fat factory/templates
Review the content. The first file defines the function, the second the type.
Process the Factory content, then rebuild the database with the new structures.
ttkm gather ttkm export ttkm derive ddl ttkm install aok
Do an import to pick up the new Function’s paramaters and return type columns so they can be used in the definiton of the Couple.
ttkm import
Start up a management database server.
ttkm self server ui
Navigate a locally running browser to localhost:8000
and follow the following
steps to create the Stream and Couples.
Open the "Dictionary" desktop. On the first panel ("Choose Model") ensure the right model is shown and then open the "Entity" panel and find the "aligned_data" entity. The return type columns must be manually added so they can be referenced by the Couple as they are not expanded automatically (FIXME). Add the following rows in the "Attributes" box: Position Name[Realised] Type Content 1 checkout_le Column Number/Integer 2 checkout_ts Column Date/Time 3 checkin_ts Column Date/Time 4 uun Column Text 5 category Column Text 6 machine Column Text 7 product Column Text Open the "Streaming" desktop. On the first panel ("Choose Model") ensure the right model is shown and then add a new row in the "Streams" box: Stream-Label = Aligned Origin = ALIN Name[Info] = Aligned Open the "Stream" panel and ensure "Aligned" is shown (by query or row selection) and then add four new rows in the "Couples" box: Position Source Entity Target Entity Name[Info] 1 aligned_data aligned Aligned 2 concurrency_data concurrency Concurrency 3 maximums_data maximums Maximums 4 costs_data costs Costs Open the "Couple" panel and enter and save the following same data for each of the four couples above. Source Origin Value = 'ALI' Then add two new rows in the "Bonds" box: Position Type 1 Handle 2 Column Open the "Bond" panel and add rows in the "Pairs" box for the "Aligned" couple as below: For "Handle": Position = 1 Source Column = checkout_le Target Column = checkout_le For "Column": Position Source Column Target Column 1 checkout_ts checkout_ts 2 checkin_ts checkin_ts 3 uun uun 4 category category 5 machine machine 6 product product Open the "Bond" panel and add rows in the "Pairs" box for the "Concurrency" couple as below: For "Handle": Position Source Column Target Column 1 snap_ts snap_ts 2 category category 3 product product For "Column": Position Source Column Target Column 1 num_concurrent num_concurrent Open the "Bond" panel and add rows in the "Pairs" box for the "Maximums" couple as below: For "Handle": Position Source Column Target Column 1 category category 2 product product For "Column": Position Source Column Target Column 1 maxcon maxcon Open the "Bond" panel and add rows in the "Pairs" box for the "Costs" couple as below: For "Handle": Position Source Column Target Column 1 pgrouping grouping 2 product product For "Column": Position Source Column Target Column 1 current_prov current_prov 2 required_prov required_prov 3 purchase_cost purchase_cost 4 annual_cost annual_cost 5 total_3yr_cost total_3yr_cost
Control-C the server process.
Now export the changes, rebuild the DDL and re-install the database in order to add the TheonCoupler support functions.
ttkm export ttkm derive ddl ttkm install aok
Rebuild the TheonCoupler configuration.
ttkm derive coupler
Do a notional test of the Stream, although note at this stage there is no data.
ttkm stream agent aligned couple aligned passing lower as \"10\", upper as \"100\" ttkm stream agent aligned couple concurrency ttkm stream agent aligned couple maximums ttkm stream agent aligned couple costs
(15) Reload all data and process internal streams. Note that the aligned
and
particularly concurrency
couples can take a long time (many minutes)
depending on the values of the lower and upper parameters specified.
ttkm stream agent products refresh from ${MATLAB2THEON}/products.csv ttkm stream agent categories ttkm stream agent licenses ttkm stream agent usage ttkm stream agent aligned couple aligned passing lower as \"10\", upper as \"1000\" ttkm stream agent aligned couple concurrency ttkm stream agent aligned couple maximums ttkm stream agent aligned couple costs
Check the final output for the analysis.
psql mattah -c "SELECT * FROM report;"
(16) Produce documentation for the ModelLibrary. This will be rather sparse as not a lot of comments were added in the entries above.
ttkm derive ddldoc
Navigate a locally running browser to file:///tmp/mattah/derived/model/html/index.html
to review the full documentation.
(17) Commit and deploy the ModelLibrary. Commit the SchemaTree, create a release and build a package so it can be deployed and run by anyone on their own machine or to be installed on a central server.
git add schemat git commit ttkm release ttkm package ls -l package total 132 -rw-r--r-- 1 timc people 21512 Mar 31 13:53 mattah-1.1-1.src.rpm -rw-r--r-- 1 timc people 35360 Mar 31 13:53 mattah-theon-library-1.1-1.noarch.rpm -rw-r--r-- 1 timc people 52124 Mar 31 13:53 mattah-theon-library-devel-1.1-1.noarch.rpm -rw-r--r-- 1 timc people 18480 Mar 31 13:53 mattah-theon-library-ui-1.1-1.noarch.rpm
Use UI to look at tables, to look at licenses table (massive) need to add:
"limit_rows" : 250, "limit_rows_conditionally" : True,
to derived/ui/default/ui.conf and then query conditionally.