Capturing CCL output in a Database - Mirroring vs History
In many stream processing use cases there is a need to capture some of the output in a database. CCL uses database output adapters to do this, allowing you to connect a database adapter to an output stream or output window and point the adapter to a database table. It's important to understand, however, the two different ways of capturing streaming output and how opCodes affect what's stored in the database.
Now just to recap: CCL projects consist of two primary elements: Streams and Windows. The fundamental difference is that streams are stateless - an event comes in, it is processed by the stream, and an event comes out (or doesn't if the stream applies a filter), but the stream doesn't keep a record of any events. Windows, on the other hand, are like database tables and they maintain state. Each incoming event can add a row to the window, update a row in the window, or delete a row from the window according to the opCode on the event. If you aren't familiar with opCodes in CCL, read the Understanding CCL OpCodes tutorial.
So when it comes to capturing streaming data in a database, it's basically as simple as attaching the appropriate database output adapter to each output Stream or Window in the streaming project that produces the data you want to capture, and configuring the adapter to point it at your database and the table that will receive the events.
For an un-keyed output Stream, that's about all there is too it. Each event published by the output Stream will be inserted into the database table. There are some advanced properties that can be used to fine tune the adapter, balancing loading efficiency (batch and array size) with latency. And there's also the option of adding a timestamp. See the Adapters Guide for detail on the advanced properties - here's the section on the SAP HANA Output adapter.
Windows (and keyed streams), however, are where the choices come in. When you publish an output window to a database, you have two main choices: do you want the database table to "mirror" the CCL window - i.e. to always reflect the contents of the CCL window, or do you want the database table to record a history of the events added to the window? This is where the “Data Warehouse Mode” comes in – one of the advanced adapter properties.
With Data Warehouse Mode set to “OFF”, which is the default, you will get “mirroring”. Every change to the CCL window will be applied to the database table. This isn’t limited to inserts. When a row in the CCL Window is updated, the update will be applied to the database table. And when a row in the CCL Window is deleted, the corresponding row in the database table will be deleted.
Note that this also applies to deletes generated by the retention policy on the CCL Window. So if the Window has a retention policy of KEEP 1 HOUR, then each row will be deleted in the CCL Window after one hour and will also be deleted from the database table.
But in many use cases, what you want in the database is not a copy of the CCL Window, but rather an event history. For this, set the adapter Data Warehouse Mode property to either “ON” or “INSERT ONLY”. When it’s set to “ON”, all window inserts and updates will be applied as inserts to the database table. If it’s set to “INSERT ONLY”, only inserts will be applied to the database table; updates will be discarded. In both cases, deletes will be discarded. Also note that you can use the Timestamp Column Name property to add a timestamp to each row when it’s inserted into the database table – this is especially useful in Data Warehouse mode.