TheonCoupler has only two approaches to explicitly bringing external data
into the physical database. One is to do nothing, used when another mechanism
is in place to achieve this (see other configurations). The other is called
Snapshot. This truncates (deletes the entire content of) the Stream Source
Table and then copies all the stream data into it, effectively re-populating
it anew every time the stream data is known to have been refreshed. The only
supported format for the stream data is a CSV file (or CSV on standard
input).
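Conceptually, a Snapshot refresh behaves like the following minimal sketch,
assuming a PostgreSQL-style backend; the names stream_source and stream.csv
are illustrative only and are not TheonCoupler's actual internals.

    -- Minimal sketch of a Snapshot refresh (illustrative names only).
    BEGIN;
    -- Delete the entire existing content of the Stream Source Table.
    TRUNCATE stream_source;
    -- Re-populate it from the refreshed stream data (CSV is the only
    -- supported format, from a file or standard input).
    COPY stream_source FROM '/path/to/stream.csv' WITH (FORMAT csv);
    COMMIT;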
Optionally, the initial truncation of the Stream Source Table can be
disabled. In that case each new batch of stream data is appended to the data
already in the Stream Source Table. This is used when each initiation of
stream data processing is known to carry a single distinct subset of the
data, so the Stream Source Table represents the steady accumulation of that
data. As a result the Stream Source Table itself represents the stream data
(and is the master source, simply being updated by external processes).
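With the truncation disabled, the same refresh reduces to a plain append;
again a sketch with illustrative names rather than TheonCoupler's actual
internals.

    -- Append-only variant: no TRUNCATE, so each batch of stream data
    -- accumulates in the Stream Source Table.
    BEGIN;
    COPY stream_source FROM '/path/to/latest_batch.csv' WITH (FORMAT csv);
    COMMIT;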
A final optional (enabled by default) function of Snapshot is zero length
checking. When enabled, TheonCoupler will refuse to process a zero length
stream data file and will throw an error instead. This provides a minimal
level of protection against bad upstream data. A zero length check is also
carried out later in individual couples; see the coupling section.
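The effect of the check is comparable to a guard like the one below, which
aborts (and so rolls back) when nothing was loaded; this is only an analogy
in PostgreSQL-flavoured SQL, since the real check is applied to the incoming
stream data file itself.

    -- Hypothetical guard illustrating zero length checking: abort if the
    -- freshly loaded stream data is empty, rolling back the refresh.
    DO $$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM stream_source) THEN
            RAISE EXCEPTION 'stream data is empty, refusing to refresh';
        END IF;
    END $$;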
If the external upstream origin of the data does not fit the Snapshot
approach above, then TheonCoupler cannot itself be used to populate the
Stream Source Table and another custom approach must be taken. Some possible
examples are in the configurations below.
The refresh argument of the ttkm stream sub-command covers this stage of the
TheonCoupler process. It may, of course, be a no-op. The refresh is done
atomically: any error and the whole process is rolled back (including any
truncation). When combined with the couple argument (which generally runs
each individual couple against the stream data), an error in any individual
couple also rolls back the whole process, including the refresh stage (and
the initial truncation). In this way, in case of error, the original content
of the Stream Source Table is always preserved, and hence so is the state of
each final Target Table being synchronised against it.
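The overall effect corresponds to running the refresh and every couple inside
a single transaction, as in this illustrative sketch; target_table and its
columns are hypothetical, and the upsert assumes a unique key on id.

    -- Sketch of refresh plus coupling as one atomic unit: if any statement
    -- fails, everything (including the truncation) is rolled back, so the
    -- Stream Source Table and the Target Tables keep their prior state.
    BEGIN;
    -- refresh stage
    TRUNCATE stream_source;
    COPY stream_source FROM '/path/to/stream.csv' WITH (FORMAT csv);
    -- one illustrative couple: synchronise a Target Table from the stream
    -- (assumes a unique constraint on target_table.id)
    INSERT INTO target_table (id, value)
        SELECT id, value FROM stream_source
        ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value;
    COMMIT;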