Generic Database Writer

Generic Database Writer utilizes the ODBC/OLE-DB layer, so with the driver installed, in theory it supports writing to most of the database types including MySQL, Oracle, SQL Server and MS Access.

Purpose

The Generic Database Writer handler supports writing tabular data to a wide range of databases and file formats because it utilizes various abstract driver layers such as the ODBC, OLE DB, and OS-supported native SQL Server driver. In theory, any type of database or data files are accessible as long as the related ODBC or OLE DB driver is installed. Note this handler only utilizes 64-bits drivers only.

Apart from specifying the driver-type, a “database connection string” also needs to be provided to instruct the handler on how to connect to the target database (or data file) instance. The connection string varies depending on the database (or data file), and the driver being used. There are many connection-string examples that can be found online. The target table also needs to be specified as one of the parameters.

The input tabular data is expected to have a header row which has its every column’s name matches to those in the target database table. The input data’s total columns can be only a subset of the target table. When a target table’s column does not have a matching column in the supplied input CSV data, a database NULL value will be inserted into the column.

So every time an input tabular data block (called a “container”) arrives at the handler, it will check the input container’s columns against a set of “expected” columns, which are either specified in the parameter settings or, if such parameter is not set, are automatically retrieved from the database’s schema. In order to meet the name-matching requirements in a Foldda solution, use the Tabular Columns Re-Arranger handler and the Tabular Columns Set Header handler to properly set up the input data, before it reaches this handler.

When loading into the database table, the handler constructs an SQL Insert statement for each record in the tabular container, that is “INSERT INTO {table-name} ({column-1, column-2, .., column-N}) VALUES ({value-1, value-2, …, value-N});”, where “column-X” is from the container’s header-line, and the “value-X” is from each column of the tabular record in the container, and the hander executes this statement via the established connection.

Parameters

The parameter ‘database-driver-type’ define which underlying driver is used for accessing the database or the data file. Type “odbc” indicates the targeted data table is access via a installed ODBC data source; type “oledb” indicates the data table is accessible as an OLE DB “data provider” via an installed OLE DB driver; type “mssql” indicates the target table is accessible via a native SQL Server driver installed in the Windows OS.

Note for ODBC and OLE DB drivers, the 64-bits version is required, as this handler is targeted for 64-bits Windows platforms such as Windows 7/10 and Windows Server 2012 and above. Also, you can supply a suffix to this parameter to indicate the type (“brand”) of the database that is under the ODBC or OLE DB layer, e.g. “oledb-mssql” for accessing MS Access via OLE DB, or “odbc-mysql” for accessing a MySQL database via ODBC.

The “connection-string” and “table-name” are required for locating and connecting to the database and data file, and for specifying the target table where the data will be written into.

The columns’ data type of the target table is default to ‘varchar’. The ‘column-spec’ parameter is for specifying columns with other data types, eg. ‘integer’, ‘decimal’, or ‘date-time’.

As in the example below, when it’s a ‘date-time’ format, an additional ‘formatter’ pattern is also required for the handler to correctly parse the input value. The ‘column-spec’ parameter is also used for capping the maximum length of the input varchar data – the handler will automatically truncate the input data if the limit is exceeded.

NameRemark
database-driver-typeValid values are “mssql/odbc/oledb/oledb-mssql/odbc-mssql/odbc-mysql”.
connection-stringa database connection string specifying how to connect to the target database.
table-nametarget table name.
column-namesOPTIONAL. specifying the expected column names in the tabular data and the target table. If not supplied, the columns will be retrieved from the target table in the database automatically.
column-speccan have multiple instances, for specifying the data type for columns that are not varchar, or for restricting the max data length of a varchar column.

Note, to prevent the “SQL injection” attack, certain SQL keywords are forbidden to be used as part of the “table-name” and the “column-names” parameters.

Config Example

The following example illustrates using the handler writing to a MySQL database table called “DischargeRegistry”.

Also note for varchar column ‘Address’, the ‘column-spec’ parameter specifies that the max input data length to be 120 chars long. The handler will truncate the input data if this limit is exceeded, avoiding to have a potential database error when loading the data into the target table (in the case that the target table’s column size is also limited).

Note a number of ‘date-time’ fields are defined as the ‘column-spec’ parameters, with different parsing patterns. For example, for the ‘Dob’ column, in the target table it’s a ‘date-time’ column, and the value to be inserted will be parsed from the supplied string with an expected “yyyyMMdd” format.

  <Handler>GenericDatabaseWriter</Handler>
  <Parameters>
    <Parameter>
      <!-- specifies the underlying data-access method eg. OleDb, ODBC, or native MS-SQL -->
      <Name>database-driver-type</Name>
      <Value>odbc-mysql</Value> <!-- mssql/odbc/oledb/oledb-mssql/odbc-mssql/odbc-mysql -->
    </Parameter>
    <Parameter>
      <!-- this is an ODBC (MySql database) connection string example -->
      <Name>connection-string</Name>
      <Value>Provider=System.Data.Odbc;DSN=MySQL-DB-test;Uid=User1;Pwd=password1;</Value>
    </Parameter>
    <Parameter>
      <Name>table-name</Name>
      <Value>DischargeRegistry</Value>
    </Parameter> 

    <!-- 'column-spec' parameter is only required for non-varchar columns -->
    <Parameter>
      <Name>column-spec</Name>
      <Value>DOB;date-time;yyyyMMdd</Value>
    </Parameter> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>Record Create DT;date-time;yyyyMMdd HH:mm:ss</Value>
    </Parameter> 
    <Parameter>
      <Name>column-spec</Name>
      <Value>Discharge DT;date-time;yyyyMMddHHmmss</Value>
    </Parameter> 
  </Parameters>
</NodeConfig>

See Also

The Tabular SQL Database Writer is a high-performance, transactional database writer specifically for MS SQL Server.

The Generic Database Reader handler does the reading from a database or a data file, via ODBC and OLE DB.