Reference Manual

Author: Adam Leszczyński <aleszczynski@bersler.com>, version: 1.6.1, date: 2024-06-01

This document describes configuration parameters and usage of OpenLogReplicator.

Program parameters

The OpenLogReplicator program is non-interactive. The only accepted parameters are:

  • -f|--file <config file> — configuration file name (default: "OpenLogReplicator.json"),

  • -p <process name> — process name (default: "OpenLogReplicator") displayed in the process list; useful when multiple instances are running,

  • -v|--version — print version and exit.

All parameters are defined in the OpenLogReplicator.json config file, which should be placed in the same directory. The file must be in JSON format. To get started, check the example config files in the scripts folder. Refer to the full parameter list for more details.
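For example, assuming the compiled binary is named OpenLogReplicator and the config file lives in the scripts folder (both are assumptions of this sketch, not fixed names), an invocation could look like:

    ./OpenLogReplicator -f scripts/OpenLogReplicator.json -p OpenLogReplicator-DB1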

All output messages are sent to the stderr stream. Optionally, JSON output can be sent to the stdout stream when the test parameter is set to a non-zero value.

English is the only language used in the documentation, the error messages and the program output.

Folder permissions

At regular intervals, the program writes a checkpoint file which contains information about the last processed transaction (sent to the Kafka output).

OpenLogReplicator should have read, write and execute permissions for the checkpoint directory. It creates or deletes files such as <database>-chkpt.json and <database>-chkpt-<scn>.json, where <database> is the database name defined in the OpenLogReplicator.json file and <scn> is a database SCN number.
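For example, if the name parameter is set to DB1, the checkpoint directory will contain files such as DB1-chkpt.json and DB1-chkpt-12345678.json, where 12345678 is an illustrative SCN.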

OpenLogReplicator.json file format

JSON config file main elements

The file is in JSON format and should contain a single object with the following parameters:
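A minimal illustrative sketch of such an object is shown below. The values are placeholders and the exact nesting of optional elements may differ; the example files in the scripts folder are the authoritative starting point.

    {
      "version": "1.1.0",
      "source": [
        {
          "alias": "src1",
          "name": "DB1",
          "reader": {
            "type": "online",
            "user": "replicator",
            "password": "secret",
            "server": "//dbhost:1521/DB1"
          },
          "format": {"type": "json"},
          "filter": {"table": [{"owner": "USR1", "table": "TAB%"}]}
        }
      ],
      "target": [
        {
          "alias": "out1",
          "source": "src1",
          "writer": {"type": "file"}
        }
      ]
    }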

Table 1. Global parameters
Parameter Specification Notes

source

list of source elements, mandatory

The list should contain just one source element.

target

list of target elements, mandatory

The list should contain just one target element.

version

string, max length: 256, mandatory

The value must be equal to "1.1.0".

TIP: This is a safety check that makes sure the content of the JSON configuration file is verified during a program upgrade. During an upgrade, always check the documentation for parameter changes and verify that the JSON configuration file is correct.

dump-path

string, max length: 256, default: "."

The location where the logdump files are created. The path can be relative to the current directory.

NOTE: This parameter is only valid when dump-redo-log parameter is set to non-zero value.

dump-raw-data

number, min: 0, max: 1, default: 0

Print hex dump of vector data for all dumped OP codes.

Possible values are:

  • 0 — No hex dump is added to the DUMP-<nnn>.trace file.

  • 1 — Before the logdump information for every vector, the full vector is dumped in HEX format; useful for analysis of the content.

NOTE: This parameter is only valid when dump-redo-log parameter is set to non-zero value.

dump-redo-log

number, min: 0, max: 2, default: 0

Create output similar to logdump command which can be compared as string to verify if certain parameters have been correctly decoded.

Possible values are:

  • 0 — No logdump file is created.

  • 1 — For every processed redo log file, a file <database>-<nnn>.logdump is created (<database> — database name, <nnn> — redo log sequence).

  • 2 — like 1 but additional information is printed which is not originally printed in logdump output — for example details about supplemental log groups.

CAUTION: The result doesn’t have to fully match the output of logdump; there can be some inconsistencies. Not all redo OP codes are parsed and analyzed, and there is no guarantee that the results will be exactly the same.

log-level

number, min: 0, max: 4, default: 2

Message verbosity level.

All messages are sent to stderr output stream.

Possible values are:

  • 0 — Silent — don’t print anything.

  • 1 — Error — print only error messages.

  • 2 — Warning — print error and warning messages (default setting).

  • 3 — Info — print error, warning and info messages.

  • 4 — Debug — print all messages.

trace

number, min: 0, max: 524287, default: 0

Print debug information.

The value is a sum of various trace flags; please refer to the source code for details.

CAUTION: The codes can change without prior notice.

Table 2. Source element
Parameter Specification Notes

alias

string, max length: 256, mandatory

The name of the source, referenced later in a target element.

TIP: This is just a logical name used in the config file. It doesn’t have to match the actual database SID.

format

element of format, mandatory

Configuration of output data.

name

string, max length: 256, mandatory

This name is used to identify the database connection. It is mentioned in the output and in the checkpoint files.

WARNING: After starting replication, the value shouldn’t change, otherwise the checkpoint files would not be properly read.

TIP: This is just a logical name used in the config file. It doesn’t have to match the actual database SID.

reader

element of reader, mandatory

Configuration of redo log reader.

arch

string, max length: 256, default is online for an online type; path for offline type; list for batch type

Way of getting an archive redo log file list.

Possible values are:

  • online — The archived redo log list is read directly from the database using a database connection. The connection is kept closed while the program works and is opened occasionally to read the archived redo log list.

  • online-keep — Like online, but the database connection is kept open.

  • path — The archived redo log file list is read from disk.

  • list — Like path, but the list of files is provided by the user. This is the only mode used for the batch type.

TIP: This parameter is only valid for online reader type.

arch-read-sleep-us

number, default: 10000000

Time to sleep between two attempts to read an archived redo log list.

Number in microseconds.

arch-read-tries

number, max: 1000000000, default: 10

Number of retries to read an archived redo log list before failing.

debug

element of debug

Group of options used for debugging.

filter

element of filter

Group of options used to filter the contents of the database and define which tables are replicated.

CAUTION: The filter is applied only to the data, not to the DDL operations.

IMPORTANT: During the first run, the schema is read only for tables which are selected by the filter. If the filter is changed later, the schema does not update. Startup would fail because the set of users present in the checkpoint files would not match the set of users defined in the config file. The schema updates only when the program is reset (i.e., the checkpoint files are removed and recreation is forced).

metrics

element of metrics

Group of options used for collecting metrics of OpenLogReplicator.

flags

number, min: 0, max: 524287, default: 0

A sum of flags which enable various options of the program; a worked example follows the list below.

Possible values are:

  • 0x0001 — Read-only archived redo logs. Online redo log files aren’t read at all.

CAUTION: This option causes a delay in data replication. When the redo log files are big or redo log group switches happen infrequently, a noticeable delay can occur: transactions are not read until the redo log group is switched.

  • 0x0002 — Schemaless mode. The program can operate without a schema.

NOTE: Refer to the User Manual for details.

  • 0x0004 — Adaptive schema mode. This mode is only valid when schemaless mode has been chosen.

NOTE: Refer to the User Manual for details.

  • 0x0008 — Don’t use direct read (O_DIRECT) for reading redo log files.

TIP: Direct IO bypasses the disk caching mechanism. Using this flag (i.e., disabling Direct IO) is not recommended and should be done only in special cases.

  • 0x0010 — Ignore basic errors and continue redo log processing.

CAUTION: This option is not recommended; it is useful only for debugging. In most cases when the program fails, it is better to stop the program and fix the problem. The program is not designed to continue after an error, as this can lead to schema data inconsistency, and nondeterministic data can be sent to the output.

  • 0x0020 — Show text of DDL commands in output.

  • 0x0040 — Show invisible (hidden) columns in output.

  • 0x0080 — Show guard columns in output.

  • 0x0100 — Show nested columns in output.

  • 0x0200 — Show unused columns in output.

  • 0x0400 — Include incomplete transactions in output.

TIP: Incomplete transactions are transactions that have started before replication was set up. Some starting elements of such transactions may be missing in the output. By default, such transactions are ignored.

  • 0x0800 — Include system transactions in output.

  • 0x1000 — Show checkpoint information in output.

TIP: The checkpoint records are useful to monitor the progress of replication. They’re also used to detect the last processed transaction. If the checkpoint records are hidden and there is low activity of data changes, it may be challenging to detect OpenLogReplicator failure.

  • 0x2000 — Don’t delete old checkpoint files.

TIP: The number of checkpoint files kept is defined by the keep-checkpoints parameter. This flag overrides that number and leaves all checkpoint files in place.

  • 0x4000 — Reserved for future use.

  • 0x8000 — Send column data to output in raw (hex) format.

  • 0x10000 — Decode binary XMLType data (experimental). Refer to the binary XMLType chapter for details.

  • 0x20000 — Pass JSON data values to output in binary format (experimental).

  • 0x40000 — Support UPDATE operations for NOT NULL columns with occasional NULL values (experimental).
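As a worked example (JSON accepts only decimal numbers, so the hexadecimal flags must be converted), enabling DDL text in the output (0x0020 = 32) together with checkpoint information (0x1000 = 4096) gives:

    "flags": 4128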

memory

element of memory

Configuration of memory settings.

redo-read-sleep-us

number, min: 0, default: 50000

The amount of time the program sleeps when all data from the online redo log has been read and the program is waiting for more transactions.

Number in microseconds.

IMPORTANT: The default setting is 50,000 microseconds, which is equal to 1/20 s or 50 ms. This means that OpenLogReplicator polls the disk for new changes 20 times a second (only while there is no activity; after new data appears, it is read sequentially to the end). With the default setting, in the worst case the read process notices new data after 50 ms. This is rapid and a proper setting for most cases. If this delay is too big, the value can be decreased, but this increases CPU usage.
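For example, to lower the worst-case polling latency to about 10 ms at the cost of somewhat higher CPU usage, the value could be reduced (illustrative value):

    "redo-read-sleep-us": 10000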

redo-verify-delay-us

number, min: 0, default: 0

When this parameter is set to a non-zero value, the redo log file data is read a second time for verification after the defined delay. Double-read mode applies only to online redo log files.

Number in microseconds.

IMPORTANT: Some filesystems (like ext4 or btrfs) can share the disk read cache between multiple processes. This can cause read inconsistencies when the database process is writing to the same memory buffer that the OpenLogReplicator process is reading. The checksum for disk blocks is just two bytes, so it is impossible to reliably detect whether the data is corrupted. The only way to detect this is to read the data again and compare it. This parameter defines the delay after which the redo log file data is read a second time for verification.

CAUTION: Instead of double reads, it is recommended to use Direct IO disk operations. Direct IO disables the disk read cache and guarantees that the data is read from disk. Use this parameter only as a workaround when Direct IO is not possible.

refresh-interval-us

number, min: 0, default: 10000000

During online redo log reading, a new redo log group could be created and the program would need to refresh the list of redo log groups. This parameter defines how often the list is refreshed when the old redo log file has been completely processed but no new group has appeared yet.

Number in microseconds.

Table 3. Memory element
Parameter Specification Notes

max-mb

number, min: 16, default: 1024

The maximum amount of memory the program can allocate.

Number in megabytes.

IMPORTANT: This number doesn’t include memory allocated for sending big JSON messages to Kafka; that memory is allocated on demand separately. It also doesn’t include memory used for LOB processing.

min-mb

number, min: 16, max: max-mb, default: 32

The amount of memory allocated at startup and the desired amount of allocated memory during operation. If more memory is allocated dynamically, it is released as soon as it is no longer required. See the notes for max-mb about memory for the Kafka buffer.

Number in megabytes.

read-buffer-max-mb

number, min: 1, max: max-mb, default: min(max-mb / 4, 32)

Size of memory buffer used for disk read.

Number in megabytes.

IMPORTANT: A greater buffer size increases performance but also increases memory usage. The disk buffer memory is part of the main memory (controlled by max-mb and min-mb). It is important not to allocate too much memory for the disk buffer, otherwise the program would not be able to allocate memory for other purposes. This memory is never swapped to disk, and OpenLogReplicator may suffer when not enough memory is left for other purposes.
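An illustrative memory element combining these parameters (placeholder values, not a sizing recommendation):

    "memory": {
      "min-mb": 64,
      "max-mb": 2048,
      "read-buffer-max-mb": 64
    }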

Table 4. Reader element
Parameter Specification Notes

type

string, max length: 256, default

Possible values are:

  • online — Primary mode: read online and archived redo logs and connect to a database for reading metadata. When the connection to the database is lost, the program will try to reconnect.

Example config file: OpenLogReplicator.json.example.

  • offline — Like online, but metadata is only read from a previously created checkpoint file; no connection to the database is required.

Example config file: OpenLogReplicator.json.example-offline.

  • batch — Process only redo log files provided as a list and then stop.

Example config file: OpenLogReplicator.json.example-batch.

con-id

signed number, min: -32768, max: 32767, default: -1

Defines the container ID for the database. This is used for multitenant databases.

TIP: -1 is the default value and means that the database is single-tenant.

db-timezone

string, default: database DBTIMEZONE value

Overrides the database DBTIMEZONE value.

Timezone should be in format +xx:yy or -xx:yy.

The time zone is used only as the base time zone for values of the TIMESTAMP WITH LOCAL TIME ZONE type.

disable-checks

number, min: 0, max: 15, default: 0

A sum of numbers:

  • 0x0001 — During startup, don’t check if the database user has appropriate grants to system tables.

  • 0x0002 — During startup, don’t check if listed tables contain supplemental logging for primary keys.

  • 0x0004 — Disable CRC check for read blocks.

NOTE: This field is valid only for online type.

IMPORTANT: This might increase performance a bit, but it is not recommended to use this option.

  • 0x0008 — Don’t check if JSON checkpoint and schema files and OpenLogReplicator.json configuration file contain invalid JSON tags.

NOTE: For performance reasons, the user might disable those checks. It is recommended to keep them enabled in production environments, especially because field names can change during program upgrades; referring to old, invalid field names might cause the program to fail.

host-timezone

string, default: time zone of OpenLogReplicator host

Time zone used by the host where the database is running.

Timezone should be in format +xx:yy or -xx:yy.

If OpenLogReplicator is running on a host with a different time zone, adjust this parameter to the proper value.

log-archive-format

string, max length: 4000

Format of expected archived redo log files. This parameter defines how to parse the redo log file name to read the sequence number.

When FRA is configured, the file name format is expected to be o1_mf_%t_%s_%h_.arc. When FRA is not used, the value for this parameter is read from the database configuration parameter log_archive_format.

log-timezone

string, default: time zone of OpenLogReplicator host

Time zone used for logging messages.

Timezone should be in format +xx:yy or -xx:yy.

By default, log messages are printed in the local time zone of the host where OpenLogReplicator is running. To print log messages in the UTC time zone, set the value to '+00:00'. The log time zone in use is printed on startup.

IMPORTANT: The value of this parameter can be configured by setting the environment variable OLR_LOG_TIMEZONE.

password

string, max length: 128

Password for connecting to database instance.

NOTE: This field is valid only for online type.

CAUTION: The password is stored as an unencrypted string in the configuration file.

path-mapping

list of string pairs, max length: 2048

List of path pairs [before1, after1, before2, after2, …]. Every path (of online and archived redo logs) is compared with the list. If a prefix of the path matches beforeX, it is replaced with afterX.

NOTE: This field is valid only for online and offline types.

TIP: The parameter is useful when OpenLogReplicator operates on a different host than the database server and the paths differ. For example, the database path may be /db/fra/o1_mf_1_1991_hkb9y64l_.arc, but the file is mounted using sshfs under a different path; with "path-mapping": ["/db/fra", "/opt/fast-recovery-area"], the program would look for /opt/fast-recovery-area/o1_mf_1_1991_hkb9y64l_.arc instead.

redo-copy-path

string, max length: 2048

Debugging parameter which copies the full contents of processed redo log files to the defined folder.

TIP: This parameter is useful for diagnosing disk-read related problems. When consistency errors are detected, the redo log file is copied to the defined folder. The file name has the format: path/<database>_<seq>.arc. Having a copy of the read redo log file allows easier post-mortem analysis, since the file contains exactly the same data as was processed.

redo-log

list of string, max length: 2048

List of redo log files which should be processed in batch mode. Elements can be files or folders. In the latter case, all files in the folder are processed.

NOTE: This field is valid only for batch type.

Example config file: OpenLogReplicator.json.example-batch.

server

string, max length: 4096

Connection string for connecting to the database instance. The format is //<host>:<port>/<service>.

NOTE: This field is valid only for online type.

start-scn

number, min: 0

The first SCN number to be processed. If not specified, the program will start from the current SCN.

CAUTION: Setting a very low value of starting SCN might cause problems during program startup if the schema has changed since this SCN and the schema is not available to read using database flashback. In such a case, the program will not be able to read the metadata and will stop.

IMPORTANT: Setting this parameter to some value would mean that transactions started before this SCN would not be processed.

start-seq

number, min: 0

First sequence number to be processed.

IMPORTANT: If not specified, the first sequence is determined by reading the SCN boundaries assigned to particular redo log files and matching them against the starting SCN.

start-time-rel

number, min: 0

Determines the starting SCN by relative time. The value is relative to the current time and is converted using the TIMESTAMP_TO_SCN SQL function. For example, if the value is set to 3600, the program will start from the SCN which was active 1 hour ago.

Number in seconds.

NOTE: This field is valid only for online type.

CAUTION: It is invalid to use this parameter when start-scn is specified.

start-time

string, max length: 256

Determines a starting SCN value by absolute time. The value is in the format YYYY-MM-DD HH24:MI:SS and is converted to an SCN using the TIMESTAMP_TO_SCN SQL function. For example, if the value is set to 2018-01-01 00:00:00, the program will start from the SCN which was active at the beginning of 2018.

NOTE: This field is valid only for online type.

CAUTION: It is invalid to use this parameter when start-scn or start-time-rel is specified.

state

element of state

Configuration of state settings to store checkpoint information.

user

string, max length: 128

Database user for connecting to database instance.

NOTE: This field is valid only for online type.

transaction-max-mb

number, min: 0, default: 0

An upper limit for transaction size. If the transaction size is greater than this value, the transaction is split into multiple transactions.

Number in megabytes.

CAUTION: This parameter is intended for debugging purposes only; it is not recommended in a production environment. Transaction splitting is intended to limit memory usage and assumes that the transaction is committed while the splitting is performed. If the transaction is not committed, the first part of the transaction is sent to the output anyway. If the transaction contains a large number of partially rolled back DML operations, they might appear in the output in spite of the rollback.
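An illustrative reader element for the online type, combining several of the parameters above (placeholder credentials and paths):

    "reader": {
      "type": "online",
      "user": "replicator",
      "password": "secret",
      "server": "//dbhost:1521/DB1",
      "path-mapping": ["/db/fra", "/opt/fast-recovery-area"]
    }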

Table 5. State element
Parameter Specification Notes

interval-mb

number, min: 0, default: 500

Threshold of processed redo log data after which a checkpoint file is created.

Number in megabytes.

interval-s

number, min: 0, default: 600

Time threshold of processed redo log data after which a checkpoint file is created.

Number in seconds.

IMPORTANT: The time refers not to processing time in OpenLogReplicator but to the time of the redo log data. For example, with the default setting of 600 seconds, if the last checkpoint was created after processing redo log data written at 10:40, a new checkpoint file is created when the processing reaches data written at 10:50.

keep-checkpoints

number, min: 0, default: 100

Number of checkpoint files which should be kept. The oldest checkpoint files are deleted.

TIP: Value 0 disables checkpoint files deletion.

TIP: Keeping a larger number of checkpoint files allows adjusting the starting SCN more precisely. It also provides more safety in case of filesystem corruption, when the last checkpoint file might not be available.

CAUTION: The actual number of checkpoint files may be larger than this parameter (up to keep-checkpoints + schema-force-interval). A checkpoint file can be deleted only if it is not referenced by consecutive checkpoint files (which don’t contain schema data).

path

string, max length: 2048, default: "checkpoint"

The path to store checkpoint files.

NOTE: This field is valid only for disk type.

IMPORTANT: The path should be writable by the user that runs the program.

schema-force-interval

number, min: 0, default: 20

To increase operating speed, not all checkpoint files contain the full schema of the database. If the schema didn’t change, it is not necessary to repeat it in every checkpoint file. The value determines the number of consecutive checkpoint files which may not contain the full schema.

TIP: The value of 0 means that the schema is always included in the checkpoint file.

type

string, max length: 256, default: "disk"

Only disk is supported.
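Putting the state parameters together, an illustrative state element (values close to the documented defaults):

    "state": {
      "type": "disk",
      "path": "checkpoint",
      "interval-mb": 500,
      "interval-s": 600,
      "keep-checkpoints": 100,
      "schema-force-interval": 20
    }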

Table 6. Debug element
Parameter Specification Notes

stop-log-switches

number, min: 0, default: 0

For debug purposes only. Stop program after specified number of log switches.

stop-checkpoints

number, min: 0, default: 0

For debug purposes only. Stop program after specified number of LWN checkpoints.

stop-transactions

number, min: 0, default: 0

For debug purposes only. Stop program after specified number of transactions.

owner

string, max length: 128

Owner of the debug table.

table

string, max length: 128

This is a technical parameter used primarily for running test cases; it defines the debug table name. If any DML transaction occurs for this table (insert, update or delete), the program stops. The transaction doesn’t necessarily need to be committed.
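An illustrative debug element which stops the program after 100 transactions or after a DML operation on a dedicated stop table (placeholder owner and table names):

    "debug": {
      "stop-transactions": 100,
      "owner": "USR1",
      "table": "OLR_STOP"
    }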

Table 7. Format element
Parameter Specification Notes

type

string, max length: 256, required

Possible values are:

  • json — Transactions in JSON OpenLogReplicator format.

  • protobuf — Transactions in Protocol Buffer format.

Refer to the output format chapter for details.

CAUTION: Protocol Buffer support is experimental. It is not fully tested and might not work properly. Don’t use it in production without testing.

attributes

number, min: 0, max: 7, default: 0

Transaction attributes location.

Field value is a sum of:

  • 1 — add attributes to the begin message of the transaction.

  • 2 — add attributes to every DML message of the transaction.

  • 4 — add attributes to the commit message of the transaction.

char

number, min: 0, max: 3, default: 0

Format for (n)char, (n)varchar(2) and clob column types.

By default, the value is written in Unicode format, using UTF-8 to code characters.

Field value is a sum of:

  • 0x0001 — No character set transformation is applied, the characters are copied from source "as is".

  • 0x0002 — Instead of characters, the output is in HEX format (using hex format — for example, "column":"4b4c204d").

column

numeric, min: 0, max: 2, default: 0

Column duplicate specification.

  • 0 — Default behavior: INSERT and DELETE contain only non-null values. UPDATE contains only changed columns or those which are members of the primary key.

TIP: This is the format that takes the least space. The assumption is that if a column doesn’t appear in the INSERT or DELETE statement, its value is NULL.

CAUTION: For LOB columns the before value is not available in the REDO stream. Therefore, the column is not included in the output; only the after value is included.

  • 1 — INSERT and DELETE contain all values. UPDATE contains only changed columns or those which are members of the primary key.

  • 2 — JSON output would contain all columns that appear in REDO stream, including those which didn’t change.

CAUTION: It is technically not possible to determine whether a column was actually mentioned in the UPDATE DML command. In some cases (especially for tables with many columns), UPDATE X SET A = A might produce the same redo log vector as UPDATE X SET A = A, B = B. The receiver of the output stream shouldn’t assume that the user included a column in the UPDATE operation just because it appeared in the output stream with the same before and after image.

db

number, min: 0, max: 3, default: 0

Present database name in payload.

Value is a sum of:

  • 0x0000 — Database name is not present.

  • 0x0001 — Database name is present in the db field in every DML message.

  • 0x0002 — Database name is present in the db field in every DDL message.

flush-buffer

numeric, min: 0, default: 1048576

Number of bytes after which the output buffer is flushed.

When set to 0, the buffer is flushed immediately when a new message arrives.

interval-dts

number, min: 0, max: 10, default: 0

INTERVAL DAY TO SECONDS field format.

Possible values are:

  • 0 — Value in nanoseconds — "val": 123456000000000.

  • 1 — Value in microseconds (possible data precision loss) — "val": 123456000000.

  • 2 — Value in milliseconds (possible data precision loss) — "val": 123456000.

  • 3 — Value in seconds (possible data precision loss) — "val": 123456.

  • 4 — Value in nanoseconds stored as a string — "val": "123456000000000".

  • 5 — Value in microseconds stored as a string (possible data precision loss) — "val": "123456000000".

  • 6 — Value in milliseconds stored as a string (possible data precision loss) — "val": "123456000".

  • 7 — Value in seconds stored as a string (possible data precision loss) — "val": "123456".

  • 8 — Value stored in part of ISO-8601 format stored as a string — "val": "01 06:00:00.123456789".

  • 9 — Value stored in part of ISO-8601 format stored as a string using "," as a separator between the number of days and time — "val": "01,06:00:00.123456789".

  • 10 — Value stored in part of ISO-8601 format stored as a string using "-" as a separator between the number of days and time — "val": "01-06:00:00.123456789".

interval-ytm

number, min: 0, max: 4, default: 0

INTERVAL YEAR TO MONTH field format.

Possible values are:

  • 0 — Value in months — "val": 20 (1 year, 8 months).

  • 1 — Value in months as a string — "val": "20".

  • 2 — Value in string format, number of years and months separated by " " — "val": "1 8".

  • 3 — Value in string format, number of years and months separated by "," — "val": "1,8".

  • 4 — Value in string format, number of years and months separated by "-" — "val": "1-8".

message

number, min: 0, max: 31, default: 0

Message format specification.

Value is a sum of:

  • 0x0001 — One message for the whole transaction.

TIP: By default, the transaction is split into many messages: begin, DML, DML, …, commit. Using this flag combines all messages into one. For performance reasons, this is not recommended when using Kafka with transactions that can be hundreds of megabytes in size.

  • 0x0002 — Add a num field to every message. The field contains a sequence number of the message within the transaction.

For the JSON output format only, the following additional flags are available:

  • 0x0004 — Skip begin message (when using flag 0x0001).

  • 0x0008 — Skip commit message (when using flag 0x0001).

  • 0x0010 — Add information about data offset (for debugging purposes).

rid

number, min: 0, max: 1, default: 0

Add rid field for every row in output with the Row ID.

Possible values are:

  • 0 — Don’t add rid field (default).

  • 1 — Add rid field for every row in output with the Row ID.

schema

number, min: 0, max: 7, default: 0

Schema format sent to output.

By default, the schema is not sent to output.

Example output: {"scns":"0x0","tm":0,"xid":"x","payload":[{"op":"c","schema":{"owner":"USR1","table":"ADAM2","obj":0},"after":{"A":100,"B":999,"C":10.22,"D":"xx2","E":"yyy","F":1564662896000}}]}

The field is a sum of values:

  • 0x0001 — Print full schema (including column descriptions), but just with the first message for every table.

TIP: This optimization is based on the fact that it is meaningless to attach the same schema definition every time if it didn’t change. It is assumed that the client would cache the schema and would not request it again. If the schema changes, the first message where new schema is used would contain the full schema.

Example output: {"scns":"0x0","tm":0,"xid":"x","payload":[{"op":"c","schema":{"owner":"USR1","table":"ADAM2","columns":[{"name":"A","type":"number","precision":-1,"scale":0,"nullable":1},{"name":"B","type":"number","precision":10,"scale":0,"nullable":1},{"name":"C","type":"number","precision":10,"scale":2,"nullable":1},{"name":"D","type":"char","length":10,"nullable":1},{"name":"E","type":"varchar2","length":10,"nullable":1},{"name":"F","type":"timestamp","length":11,"nullable":1},{"name":"G","type":"date","nullable":1}]},"after":{"A":100,"B":999,"C":10.22,"D":"xx2 ","E":"yyy","F":1564662896000}}]} {"scns":"0x0","tm":0,"xid":"x","payload":[{"op":"c","schema":{"owner":"USR1","table":"ADAM2","after":{"A":100,"B":999,"C":10.22,"D":"xx3 ","E":"yyy","F":1564662896000}}]}

  • 0x0002 — Add full schema definition (including column descriptions) to every message.

TIP: Remember to use flag 0x0001 together with flag 0x0002. The flag 0x0002 alone has no effect.

  • 0x0004 — Add objn field to schema description which contains database object ID.

Example output: {"scns":"0x0","tm":0,"xid":"x","payload":[{"op":"c","schema":{"owner":"USR1","table":"ADAM2"},"after":{"A":100,"B":999,"C":10.22,"D":"xx2 ","E":"yyy","F":1564662896000}}]}

scn

number, min: 0, max: 3, default: 0

SCN field format.

By default, every DML operation contains an scn field with the SCN value derived from the redo vector that contains the DML data.

Possible values are:

  • 0 — SCN is stored as a decimal number in scn field.

  • 1 — SCN values are stored as text in hexadecimal format (in "C" style, like 0xFF) in the scns field.

  • 2 — SCN values for all DML operations are copied from commit SCN record.

scn-all

number, min: 0, max: 1, default: 0

Include scn field in every payload.

Possible values are:

  • 0 — Put scn field only in the first message.

  • 1 — Put scn field in every message.

timestamp

number, min: 0, max: 15, default: 0

Format of timestamp values.

In the following description, the following timestamp is used as an example: "2022-05-01 06:00:00.123456789". Possible values are:

  • 0 — Unix with nanoseconds — "tm": 1651384800123456789.

  • 1 — Unix with a precision to the microsecond (possible data precision loss) — "tm": 1651384800123457.

  • 2 — Unix with precision to the millisecond (possible data precision loss) — "tm": 1651384800123.

  • 3 — Unix with precision to the second (possible data precision loss) — "tm": 1651384800.

  • 4 — Unix with nanoseconds precision stored as a string — "tms": "1651384800123456789".

  • 5 — Unix with microsecond precision stored as a string (possible data precision loss) — "tms": "1651384800123457".

  • 6 — Unix with millisecond precision stored as a string (possible data precision loss) — "tms": "1651384800123".

  • 7 — Unix with second precision stored as a string (possible data precision loss) — "tms": "1651384800".

  • 8 — ISO-8601 format stored with nanosecond precision — "tms": "2022-05-01T06:00:00.123456789Z".

  • 9 — ISO-8601 format stored with microsecond precision as a string — "tms": "2022-05-01T06:00:00.123456Z".

  • 10 — ISO-8601 format stored with millisecond precision as a string — "tms": "2022-05-01T06:00:00.123Z".

  • 11 — ISO-8601 format stored with second precision as a string — "tms": "2022-05-01T06:00:00Z".

  • 12 — ISO-8601 format stored with nanosecond precision as a string without "TZ" — "tms": "2022-05-01 06:00:00.123456789".

  • 13 — ISO-8601 format stored with microsecond precision as a string without "TZ" — "tms": "2022-05-01 06:00:00.123456".

  • 14 — ISO-8601 format stored with millisecond precision as a string without "TZ" — "tms": "2022-05-01 06:00:00.123".

  • 15 — ISO-8601 format stored with second precision as a string without "TZ" — "tms": "2022-05-01 06:00:00".

NOTE: This format is also used for type timestamp with local time zone since this type internally does not contain time zone data.

timestamp-tz

number, min: 0, max: 11, default: 0

Format of timestamp with time zone values.

In the following description, the following timestamp with time zone is used as an example: "2022-05-01 06:00:00.123456789 Europe/Warsaw".

Possible values are:

  • 0 — Unix with nanoseconds stored as a string with time zone after comma sign — "tms": "1651384800123456789,Europe/Warsaw".

  • 1 — Unix with microsecond precision stored as a string with time zone after comma sign (possible data precision loss) — "tms": "1651384800123457,Europe/Warsaw".

  • 2 — Unix with millisecond precision stored as a string with time zone after comma sign (possible data precision loss) — "tms": "1651384800123,Europe/Warsaw".

  • 3 — Unix with second precision stored as a string with time zone after comma sign (possible data precision loss) — "tms": "1651384800,Europe/Warsaw".

  • 4 — ISO-8601 format stored with nanosecond precision with time zone after space sign — "tms": "2022-05-01T06:00:00.123456789Z Europe/Warsaw".

  • 5 — ISO-8601 format stored with microsecond precision as a string with time zone after space sign — "tms": "2022-05-01T06:00:00.123456Z Europe/Warsaw".

  • 6 — ISO-8601 format stored with millisecond precision as a string with time zone after space sign — "tms": "2022-05-01T06:00:00.123Z Europe/Warsaw".

  • 7 — ISO-8601 format stored with second precision as a string with time zone after space sign — "tms": "2022-05-01T06:00:00Z Europe/Warsaw".

  • 8 — ISO-8601 format stored with nanosecond precision as a string without "TZ" with time zone after space sign — "tms": "2022-05-01 06:00:00.123456789 Europe/Warsaw".

  • 9 — ISO-8601 format stored with microsecond precision as a string without "TZ" with time zone after space sign — "tms": "2022-05-01 06:00:00.123456 Europe/Warsaw".

  • 10 — ISO-8601 format stored with millisecond precision as a string without "TZ" with time zone after space sign — "tms": "2022-05-01 06:00:00.123 Europe/Warsaw".

  • 11 — ISO-8601 format stored with second precision as a string without "TZ" with time zone after space sign — "tms": "2022-05-01 06:00:00 Europe/Warsaw".

timestamp-all

number, min: 0, max: 1, default: 0

Include timestamp field in every payload.

Possible values are:

  • 0 — Put timestamp field only in the first message.

  • 1 — Put timestamp field in every message.

unknown

number, min: 0, max: 1, default: 0

Unknown value reporting. For unknown values, '?' is sent to the output.

Possible values are:

  • 0 — Silently ignore unknown values.

  • 1 — Output to stderr information about decoding mismatch.

xid

number, min: 0, max: 2, default: 0

Format of the Transaction ID (XID).

Possible values are:

  • 0 — classic hex format (like: "xid":"0x0002.012.00004162").

  • 1 — decimal format (like: "xid":"2.18.16738").

  • 2 — a single 64-bit number format (like: "xidn":563027262849378).
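An illustrative format element combining several of the options above (placeholder values; most fields default to 0):

    "format": {
      "type": "json",
      "column": 0,
      "scn": 1,
      "timestamp": 8,
      "xid": 1,
      "rid": 1
    }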

Table 8. Filter element
Parameter Specification Notes

table

list of a table element

List of table regex rules which should be tracked in the redo log stream and sent to output.

A table that matches at least one of the rules is tracked, thus the rules can overlap.

Example: "table": {{"table": "owner1.table1"}, {"table": "owner2.table2", "key": "col1, col2, col3"}, {"table":"sys.%"}}.

skip-xid

list of string elements, max length: 32

List of transaction IDs which should be skipped. The format of the XID should be one of: UUUUSSSSQQQQQQQQ, UUUU.SSS.QQQQQQQQ, UUUU.SSSS.QQQQQQQQ, 0xUUUU.SSS.QQQQQQQQ, 0xUUUU.SSSS.QQQQQQQQ.

Example: "skip-xid": ["0x0002.012.00004162"]

dump-xid

list of string elements, max length: 32

Debug option to dump to stderr internals about certain XID. The format is the same as for skip-xid.
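An illustrative filter element combining the fields above, assuming the per-table owner, table and key fields described in Table 10 (placeholder names):

    "filter": {
      "table": [
        {"owner": "USR1", "table": "TAB%"},
        {"owner": "USR2", "table": "ORDERS", "key": "ORDER_ID"}
      ],
      "skip-xid": ["0x0002.012.00004162"]
    }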

Table 9. Metrics element
Parameter Specification Notes

type

string, max length: 128, mandatory

Name of the metrics module. Currently only prometheus is supported.

bind

string, max length: 128, mandatory for prometheus

Network address used to bind the metrics module for Prometheus. The format is <host>:<port>. Prometheus uses this address to connect to OpenLogReplicator.

Example: "bind": "127.0.0.1:8080"

tag-names

string, max length: 128

Define tags for dml_op metrics.

Possible values are:

  • all — Provide schema and table tags for every metric. This is equivalent to the filter and sys options combined.

  • filter — Provide schema and table tags only for tables which are defined in the filter section and thus are replicated.

  • none — Default, don’t provide schema or table tags.

  • sys — Provide schema and table tags just for system tables which are tracked for OpenLogReplicator to work properly.
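An illustrative metrics element for Prometheus (placeholder bind address):

    "metrics": {
      "type": "prometheus",
      "bind": "127.0.0.1:8080",
      "tag-names": "filter"
    }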

Table 10. Table element
Parameter Specification Notes

owner

string, max length: 128, mandatory

Regex pattern for matching owner name. The pattern is case-sensitive.

table

string, max length: 128, mandatory

Regex pattern for matching table name. The pattern is case-sensitive.

key

string, max length: 4096

A string field with a list of columns which should be used as a primary key. The columns are separated by commas. The column names are case-sensitive.

TIP: If a table doesn’t contain a primary key, a custom set of columns can be treated as a primary key.

condition

string, max length: 16384

An expression which should be evaluated for every row. The format of the field is C-like.

Example: "condition": "([op] != 'd') || ([login username] != 'USER1')"

The expression is evaluated from left to right. The following tokens can be used:

  • || — logical OR,

  • ! — logical NOT,

  • && — logical AND,

  • () — parentheses to define the order of evaluation,

  • == — equal,

  • != — not equal.

The expression can also contain the following tokens, whose names are derived from the attribute list of the transaction:

  • [audit sessionid]

  • [client id]

  • [client info]

  • [current username]

  • [login username] — the username which performed the operation;

  • [machine name]

  • [op] — type of operation: c - create (insert), u - update, d - delete, ddl - DDL operation;

  • [OS process name]

  • [OS process id]

  • [OS terminal]

  • [serial number]

  • [session number]

  • [transaction name] — the name of the transaction;

  • [version]

Table 11. Target element
Parameter Specification Notes

alias

string, max length: 256, mandatory

A logical name of the target used in JSON file for referencing.

source

string, max length: 256, mandatory

A logical name of the source which this target should be connected with.

writer

element of a writer, mandatory

Configuration of output processor.

Table 12. Writer element
Parameter Specification Notes

topic

string, max length: 256, mandatory

Name of a Kafka topic used to send transactions as JSON messages.

NOTE: This field is valid only for kafka type.

type

string, max length: 256, mandatory

Possible values are:

  • discard — No-op writer.

Performs all actions (parsing redo logs, producing messages), but the messages are discarded and not sent to any target.

TIP: This target is useful for testing purposes, to verify if redo log file parsing works correctly. This writer does not accept any parameters.

  • file — Write output messages directly to a file.

  • kafka — Connect directly to a Kafka message system and send transactions.

  • network — Stream using plain TCP/IP transmission.

This mode assumes that OpenLogReplicator acts as a server. A client connects to the server and receives the messages. If the client disconnects, the server will wait for a new client to connect and buffer transactions while no client connection is present.

  • zeromq — Stream using ZeroMQ messaging.

TIP: Technically this is the same as network but instead of using plain TCP/IP connection it uses ZeroMQ messaging.

uri

string, max length: 256, mandatory

For the network writer type: <host>:<port> — address for the network listener.

For the zeromq writer type: <protocol>://<host>:<port> — URI for the ZeroMQ connection.

NOTE: This field is valid only for network and zeromq types.

append

number, min: 0, max: 1, default: 1

If the defined output file for transactions already exists, append to it. If not, create a new file.

NOTE: This field is valid only for file type.

CAUTION: Parameter output can’t be used together with append.

max-message-mb

number, min: 1, max: 953, default: 100

Maximum size of a message sent to Kafka.

Number in megabytes.

CAUTION: Memory for this buffer is allocated independently of the memory defined by min-mb/max-mb when a big message to Kafka is being constructed. If the transaction size is close to this value, it is divided into multiple parts. Every time such a situation occurs, a warning is printed to the log.

NOTE: This field is valid only for kafka type.

max-file-size

number, min: 0, default: 0

Maximum size of the output file. The size can be defined only when the output parameter is set and uses the %i or %t placeholder.

NOTE: This field is valid only for file type.

new-line

number, min: 0, max: 2, default: 0

Put a new line after each transaction.

Possible values are:

  • 0 — no new line.

  • 1 — new line after each transaction in Unix format (\n).

  • 2 — new line after each transaction in Windows format (\r\n).

NOTE: This field is valid only for file type.

output

string, max length: 256

Format of the output file name. The format is the same as for the strftime function.

The following placeholders are supported:

  • %i — autogenerated sequence id, starting from 0.

  • %t — date and time in format defined by timestamp-format parameter.

  • %s — database sequence number.

NOTE: There should be only one placeholder in the format. When using the %i or %t placeholder, the max-file-size parameter must be set to a value greater than 0.

NOTE: This field is valid only for file type.

poll-interval-us

number, min: 100, max: 3600000000, default: 100000

Interval for polling for new messages.

Number in microseconds.

TIP: This parameter defines how often the client library checks for new messages. The smaller the value, the more often the client library checks for new messages. The larger the value, the more messages are buffered in the client library.

NOTE: This field is valid only for kafka, network and zeromq types.

properties

map of string to string

Additional properties for the Kafka producer. Refer to the librdkafka documentation for the full list of parameters. Typically used parameters are:

  • "brokers": "host1:9092, host2:9092" — list of Kafka brokers;

  • "compression.codec": "snappy" — compression codec;

  • "message.send.max.retries": "3" — number of retries for sending a message;

  • "retry.backoff.ms": "500" — delay between retries;

  • "queue.buffering.max.ms": "1000" — maximum time in milliseconds to buffer messages in memory;

  • "enable.idempotence": "true" — enable idempotence for producer;

This field also allows setting custom Kafka security-related parameters such as authentication and encryption.

CAUTION: You should not set the message.max.bytes parameter as maximum message size is defined by the max-message-mb parameter.

NOTE: This field is valid only for kafka type.

queue-size

number, min: 1, max: 1000000, default: 65536

Size of message queue.

TIP: This parameter defines how many messages can be queued for sending to the output. If the message transport offers a level of parallelism, messages can be sent in parallel; if it doesn’t, messages are sent one by one. The larger the value, the more messages can be sent in parallel.

timestamp-format

string, max length: 256, default: "%F_%T"

Format of the timestamp (defined using the placeholder %t in the output field) in the output file name. The format is the same as for the strftime function in C. Refer to the documentation of your C library for more information.

NOTE: This field is valid only for file type.
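Putting the target and writer parameters together, an illustrative target element sending messages to Kafka (placeholder alias, topic and broker addresses; only the brokers and compression.codec properties listed above are shown):

    "target": [
      {
        "alias": "kafka-out",
        "source": "src1",
        "writer": {
          "type": "kafka",
          "topic": "olr-db1",
          "max-message-mb": 100,
          "properties": {
            "brokers": "host1:9092,host2:9092",
            "compression.codec": "snappy"
          }
        }
      }
    ]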