Tabular SQL Database Writer

Tabular SQL Database Writer writes (uploads) the tabular data from input to a target database table. This handler works for MS SQL Server only.

Purpose

The Tabular SQL Database Writer handler is specifically for loading tabular data into an MS SQL Server database table. The input tabular data is expected to have a header row with each column’s name matches to those in the target database table. The input data’s total columns don’t have to match exactly the target table, they can be a subset of the target table. Database NULL value will be inserted into the target table’s columns if a column doesn’t exist in the input data.

So every time an input tabular data block (called a “container”) arrives at the handler, it will check the input container’s columns against the expected columns, which is either specified in the parameter settings or if such parameter is not set, automatically retrieve from the database’s schema. In order to meet the name-matching requirements in a Foldda solution, use the TabularColumnsReArranger handler and the TabularColumnsSetHeader handler to properly set up the input data, before it reaches this handler.

When loading into the database table, the handler wraps the whole operation into a database transaction, that is, if for some reason, any record from the input container failed to be inserted into the target table, the whole operation rolls back (cause a database ROLLBACK) and none of the record in the container will be written into the database. Such transactional behavior is in contrast to the line-by-line inserts, where partial data may be written into the database.

Apart from matching columns names, a “database connection string” needs to be provided to instruct the handler how to connect to the target database instance. Depending on your SQL Server’s security settings, the connection string may have different format and value. There are many connection-string examples can be found online.

Of course the target table also need to be specified as one of the parameter. Also,

Parameters

The parameter ‘db-target-column-names’ is optional. It specifies the column names on the target table that will be filled by the writer handler. This is used to verify input tabular data to have the expected structure. If not supplied, this information is retrieved automatically by querying the database’s schema – which may not be possible depending on the permission restrictions.

The columns’ data type of the target table is default to varchar. The ‘column-spec’ parameter is for specifying columns with other data types, eg. integer, decimal, or date-time.

As in the example below, when it’s a ‘date-time’ format, an additional ‘formatter’ pattern is also required for the handler to correctly parse the input value. The ‘column-spec’ parameter is also used for capping the maximum length of the input varchar data – the handler will automatically truncate the input data if the limit is exceeded.

NameRemark
csv-first-line-is-headerset as “YES”.
db-connection-stringa database connection string
db-target-table-nametarget table name.
db-target-column-namesthe expected column names in the target table. This parameter is optional, and if not supplied, the columns will be retrieved from the database automatically.
column-speccan have multiple instances, for specifying the data type for columns that are not varchar, or for restricting the max data length of a varchar column.

Config Example

The following example illustrates a Tabular SQL Database Writer handler’s settings for loading input tabular data into a table called ‘Patient1’ on database ‘Test1’. It specifies the input data is expected to have six columns, named as “Mrn,Name,Dob,Age,HeightInMeter,Address”, and the handler will throw an error if input data doesn’t match such format.

Note for defining the date-time data type for column ‘Dob’, a ‘column-spec’ parameter is used, and it specifies the input format is to have a ‘yyyyMMdd’ pattern.

Also note for varchar column ‘Address’, the ‘column-spec’ parameter specifies that the max input data length to be 120 chars long. The handler will truncate the input data if this limit is exceeded, avoiding to have a potential database error when loading the data into the target table (in the case that the target table’s column size is also limited).

<Parameters>
    <!-- expect the input data (when input from file) has a header row -->
    <Parameter>
      <Name>csv-first-line-is-header</Name>
      <Value>YES</Value>
    </Parameter>

    <!-- MS SQL Server connection string -->
    <Parameter>
      <Name>db-connection-string</Name>
      <Value>Server=localhost;Database=Test1;User Id=User1;Password=password1;</Value>
    </Parameter>
    <Parameter>
      <Name>db-target-table-name</Name>
      <Value>Patient1</Value>
    </Parameter> 

    <!-- optional, if not supplied, it'll be retrieved from db-schema -->
    <Parameter>  
      <Name>db-target-column-names</Name>
      <Value>Mrn,Name,Dob,Age,HeightInMeter,Address</Value>
    </Parameter> 

    <!--"column-spec" is for defining non-default (non-varchar) column data types --> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>Dob;date-time;yyyyMMdd</Value>
    </Parameter> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>Age;integer</Value>
    </Parameter> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>HeightInMeter;decimal</Value>
    </Parameter> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>Address;max-length;120</Value>
    </Parameter> 
  </Parameters>

See Also

The Patient Discharge Registry solution demonstrates how to use the Tabular SQL Database Writer for selecting and converting HL7 data elements and writing into a SQL Server database table.

For generic database reading and writing, especially for other database types, check out the use of the Generic Database Reader handler and the Generic Database Writer handler.