Copyright © 2007-2025 JumpMind, Inc

Version 3.15.18

Permission to use, copy, modify, and distribute this SymmetricDS User Guide for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies.

Preface

This user guide introduces SymmetricDS and its features for data synchronization. It is intended for users, developers, and administrators who want to install the software, configure synchronization, and manage its operation. Thank you to all the members of the open source community whose feedback and contributions helped us build better software and documentation. This version of the guide was generated on 2025-06-20.

1. Introduction

SymmetricDS is open source software for database and file synchronization, with support for multi-master replication, filtered synchronization, and transformation. It uses web and database technologies to replicate change data as a scheduled or near real-time operation, and it includes an initial load feature for full data loads. The software was designed to scale for a large number of nodes, work across low-bandwidth connections, and withstand periods of network outage.

1.1. System Requirements

SymmetricDS is written in Java and requires a Java Standard Edition (SE) Runtime Environment (JRE) or Java Development Kit (JDK). Most major operating systems and databases are supported. See the list of supported databases in the Database Compatibility section. The minimum operating system requirements are:

  • Java SE Runtime Environment 17 or newer

  • Memory - 64 MB available

  • Disk - 256 MB available

The memory, disk, and CPU requirements increase with the number of connected clients and the amount of data being synchronized. The best way to size a server is to simulate synchronization in a lower environment and benchmark data loading. However, a rule of thumb for servers is one server-class CPU with 2 GB of memory for every 500 MB/hour of data transfer and 350 clients. Multiple servers can be used as a cluster behind a load balancer to achieve better performance and availability.
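By that rule of thumb, for example, a deployment of 1,000 clients transferring about 1.5 GB/hour would call for roughly three server-class CPUs and 6 GB of memory: 1.5 GB/hour is three 500 MB/hour units, and 1,000 clients is just under three 350-client units.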

SymmetricDS Pro is accessed from a web console, which requires one of the following supported web browsers:

  • Chrome - Evergreen

  • Firefox - Evergreen, ESR

  • Safari - 15 or newer

  • Edge - Evergreen, Chromium

1.2. Overview

A node is responsible for synchronizing the data from a database or file system with other nodes in the network using HTTP. Nodes are assigned to one of the node Groups that are configured together as a unit. The node groups are linked together with Group Links to define either a push or pull communication. A pull causes one node to connect with other nodes and request changes that are waiting, while a push causes one node to connect with other nodes when it has changes to send.

Each node is connected to a database with a Java Database Connectivity (JDBC) driver using a connection URL, username, and password. While nodes can be separated across wide area networks, the database a node is connected to should be located nearby on a local area network for the best performance. Using its database connection, a node creates tables as a Data Model for configuration settings and runtime operations. The user populates configuration tables to define the synchronization and the runtime tables capture changes and track activity. The tables to sync can be located in any Catalog and Schema that are accessible from the connection, while the files to sync can be located in any directory that is accessible on the local server.

overview

At startup, SymmetricDS looks for Node Properties Files and starts a node for each file it finds, which allows multiple nodes to run in the same instance and share resources. The property file for a node contains its external ID, node group, registration server URL, and database connection information. The external ID is the name for a node used to identify it from other nodes. One node is configured as the registration server where the master configuration is stored. When a node is started for the first time, it contacts the registration server using a registration process that sends its external ID and node group. In response, the node receives its configuration and a node password that must be sent as authentication during synchronization with other nodes.

1.3. Architecture

Each subsystem in the node is responsible for part of the data movement and is controlled through configuration. Data flows through the system in the following steps:

  1. Capture into a runtime table at the source database

  2. Route for delivery to target nodes and group into batches

  3. Extract and transform into the rows, columns, and values needed for the outgoing batch

  4. Send the outgoing batch to target nodes

  5. Receive the incoming batch at the target node

  6. Transform into the rows, columns, and values needed for the incoming batch

  7. Load data and return an acknowledgment to the source node

architecture
Capture

Change Data Capture (CDC) for tables uses database triggers that fire and record changes as comma-separated values into a runtime table called DATA. For file sync, a similar mechanism is used, except changes to the metadata about files are captured. The changes are recorded as insert, update, and delete event types. The subsystem installs and maintains triggers on tables based on the configuration provided by the user, and it can automatically detect schema changes on tables and regenerate triggers.
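
As an illustration, the insert trigger generated for a hypothetical ITEM table conceptually resembles the sketch below. The actual trigger text is generated per database dialect and records additional bookkeeping values, so treat this only as a conceptual model:

create trigger sym_on_i_for_item after insert on item
for each row
    -- record the new row as comma-separated values in the DATA runtime table
    insert into sym_data (table_name, event_type, row_data, create_time)
    values ('item', 'I', concat('"', new.item_id, '","', new.price, '"'), current_timestamp);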

Route

Routers run across new changes to determine which target nodes will receive the data. The user configures which routers to use and what criteria are used to match data, creating subsets of rows if needed. Changes are grouped into batches and assigned to target nodes in the DATA_EVENT and OUTGOING_BATCH tables.

Extract

Changes are extracted from the runtime tables and prepared to be sent as an outgoing batch. If large objects are configured for streaming instead of capture, they are queried from the table. Special event types like "reload" for Initial Loads are also processed.

Transform

If transformations are configured, they operate on the change data either during the extract phase at the source node or the load phase at the target node. The node’s database can be queried to enhance the data. Data is transformed into the tables, rows, columns, and values needed for either the outgoing or incoming batch.

Outgoing

During synchronization, batches are sent to target nodes to be loaded. Multiple batches can be configured to send during a single synchronization. The status of each batch is updated in the OUTGOING_BATCH table as it is processed. An acknowledgment is received from target nodes and recorded on the batch.

Incoming

During synchronization, batches are received from remote nodes and their data is loaded. The status of each batch is updated in the INCOMING_BATCH table as it is processed. The resulting status of the batch is returned to the source node in an acknowledgment.

1.4. Features

SymmetricDS offers a rich set of features with flexible configuration for large scale deployment in a mixed environment with multiple systems.

  • Web UI - The web console provides easy configuration, management, and troubleshooting.

  • Data Synchronization - Change data capture for relational databases and file synchronization for file systems can be periodic or near real-time, with an initial load feature to fully populate a node.

  • Central Management - Configure, monitor, and troubleshoot synchronization from a central location where conflicts and errors can be investigated and resolved.

  • Automatic Recovery - Data delivery is durable and low maintenance, withstanding periods of downtime and automatically recovering from a network outage.

  • Secure and Efficient - Communication uses a data protocol designed for low bandwidth networks and streamed over HTTPS for encrypted transfer.

  • Transformation - Manipulate data at multiple points to filter, subset, translate, merge, and enrich the data.

  • Conflict Management - Enforce consistency of two-way synchronization by configuring rules for automatic and manual resolution.

  • Extendable - Scripts and Java code can be configured to handle events, transform data, and create customized behavior.

  • Deployment Options - The software can be installed as a self-contained server that stands alone, deployed to a web application server, or embedded within an application.

1.5. Why SymmetricDS?

SymmetricDS is a feature-rich data synchronization solution that focuses on ease of use, openness, and flexibility. The software encourages interoperability and accessibility for users and developers with the availability of source code, an application programming interface (API), and a data model supported by documentation. Configuration includes a powerful set of options to define node topology, communication direction, transformation of data, and integration with external systems. Through scripts and Java code, the user can also extend functionality with custom behavior. With a central database for setup and runtime information, the user has one place to configure, manage, and troubleshoot synchronization, with changes taking immediate effect across the network.

The trigger-based data capture system is easy to understand and widely supported by database systems. Table synchronization can be set up by users and application developers without requiring a database administrator to modify the server. Triggers are database objects written in a procedural language, so they are open for examination, and they include flexible configuration options for conditions and customization. Some overhead is associated with triggers, but they perform well for online transaction processing applications, and their benefits of flexibility and maintenance outweigh the cost for most scenarios.

Using an architecture based on web server technology, many simultaneous requests can be handled at a central server, with proven deployments in production supporting more than ten thousand client nodes. Large networks of nodes can be grouped into tiers for more control and efficiency, with each group synchronizing data to the next tier. Data loading is durable and reliable by tracking batches in transactions and retrying of faults for automatic recovery, making it a low maintenance system.

1.6. License

SymmetricDS Pro is commercial software that is licensed, not sold. It is subject to the terms of the End User License Agreement (EULA) and any accompanying JumpMind Support Contract. See the standard SymmetricDS Pro license for reference, but your agreement with JumpMind may be different.

2. Installation

SymmetricDS at its core is a web application. A SymmetricDS instance runs within the context of a web application container like Jetty or Tomcat, and uses web based protocols like HTTP to communicate with other instances.

An instance has one of the following installation options:

  1. Standalone Installation - SymmetricDS is installed and run as a standalone process using the built-in Jetty web server. This is the simplest and recommended way to install an instance.

  2. Web Archive (WAR) - A SymmetricDS web archive (WAR) file is deployed to an existing web application container that is separately installed, maintained and run.

  3. Embedded - SymmetricDS is embedded within an existing application. In this option, a custom wrapper program is written that calls the SymmetricDS API to synchronize data.


2.1. Standalone Installation

The SymmetricDS Pro setup program is an executable JAR file that can run on any system with a Java Runtime Environment (JRE). See System Requirements for prerequisites. Download the setup program from SymmetricDS Pro Downloads.

Run the setup program:

  • From a desktop environment, double click the symmetric-pro-<version>-setup.jar file

  • If double clicking doesn’t work, use a command prompt to run: java -jar symmetric-pro-<version>-setup.jar

  • From a text-based environment, use a terminal to run: java -jar symmetric-pro-<version>-setup.jar -console

install1

The first screen shows the SymmetricDS Pro software version. The setup program will ask a series of questions before writing files to disk.

To begin selecting options, click Next.

install2

Carefully read the SymmetricDS Pro License Agreement.

If you accept, select I accept the terms of this license agreement and click Next.

install3

Specify Install new software to install a new version of SymmetricDS for the first time.

For upgrading an existing installation of SymmetricDS, see Upgrading.

Click Next to continue.

install4

Choose the installation path where SymmetricDS will either be installed or upgraded. If the directory does not already exist, it will be created for you. Make sure your user has permission to write to the file system.

After entering the directory path, click Next.

install5

Select the packages you want to install and verify disk space requirements are met. By default, all packages are selected. Drivers for popular databases are included, but they can be unselected if you don’t plan to use them.

After selecting packages, click Next.

install6

SymmetricDS can either be run automatically by the system or manually by the user. Select the Install service to run automatically checkbox to install a Windows service or Unix daemon that will start SymmetricDS when the computer is restarted. The service can be installed or uninstalled later using the Control Center or command line (see Running as a Service).

Select the Run server after installing checkbox to also run SymmetricDS after installation so it can be used immediately.

After selecting options, click Next.

install7

HTTPS and HTTP/2 protocols are recommended for protecting data security. For testing without security or encryption, the HTTP protocol can be enabled. Choose an available port number to listen on, which will be validated.

Java Management Extensions (JMX) is an optional way to manage the server from third party tools like JConsole. Most installations leave it disabled and use the web console for management.

Click Next to continue.

install8

Specify how much memory to use for sending and receiving data changes. More memory is needed to communicate with multiple clients and when data contains large objects (LOB). Estimate an extra 5 MB of memory for each client and each 500 MB/hour of data transfer.
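By that estimate, for example, a server communicating with 100 clients and transferring 1 GB/hour of data would need roughly 100 × 5 MB plus 2 × 5 MB, or about 510 MB, beyond the base memory.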

Click Next to continue.

install9

Specify disk space options for temporarily staging incoming and outgoing data changes. Using staging helps the overall performance of the system and minimizes use of the database. The default location is the "tmp" sub-directory of the installation directory. For Clustering, specify a common network share.

Click Next to continue.

install10

Confirm your installation settings look correct.

Click Next to begin installing files.

install11

The packages you selected are installed to disk.

After it finishes, click Next.

install12

During the finish step, the setup program installs the service and starts it if you selected those options.

After it finishes, click Next.

install13

The installation is now complete. Choose if you want to open the SymmetricDS Pro Control Center where you can view the server status and open a web console.

Click Done to exit the setup program.

install14

From the SymmetricDS Pro Control Center, you can start/stop the server, open the web console, and install/uninstall the service.

To begin configuration of SymmetricDS, check that the server is running, and then click Open Web Console.

To continue setup and configuration of SymmetricDS, refer to the Setup section.

2.2. Running as a Service

SymmetricDS can be configured to start automatically when the system boots, running as a Windows service or Linux/Unix daemon. A wrapper process starts SymmetricDS and monitors it, so it can be restarted if it runs out of memory or exits unexpectedly. The wrapper writes standard output and standard error to the logs/wrapper.log file.

For SymmetricDS Pro, you may have already installed SymmetricDS as a service, so this section shows you how to manually install the service from the command line.

2.2.1. Running as a Windows Service

To install the service, run the following command as Administrator:

bin\sym_service.bat install

Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as Administrator:

bin\sym_service.bat uninstall

To start and stop the service manually, run the following commands as Administrator:

bin\sym_service.bat start
bin\sym_service.bat stop

2.2.2. Running as a Linux/Unix daemon

An init script is written to the system /etc/init.d directory. Symbolic links are created for starting on run levels 2, 3, and 5 and stopping on run levels 0, 1, and 6. To install the script, run the following command as root:

bin/sym_service install

Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as root:

bin/sym_service uninstall

To start and stop the service manually, run the following commands:

bin/sym_service start
bin/sym_service stop

2.3. Clustering

A single SymmetricDS node can be deployed across a series of servers to cooperate as a cluster. A node can be clustered to provide load balancing and high availability.

Each node in the cluster shares the same database. A separate hardware or software load balancer is required to receive incoming requests and direct them to one of the backend nodes. Use the following steps to set up a cluster (a sample properties sketch follows the list):

  1. Set the cluster.lock.enabled property to true

  2. Optionally, set the cluster.server.id property to a unique name, otherwise the hostname will be used

  3. Set the sync.url property to the URL of the load balancer

  4. Set the initial.load.use.extract.job.enabled property to false if using local staging

  5. Copy the engine properties, security/keystore, and conf/sym_service.conf files to each installation

  6. Configure the load balancer for sticky sessions

  7. If the load balancer requires cookies, then client nodes need to set the server.http.cookies.enabled property to true in conf/symmetric-server.properties
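
As a sketch of steps 1 through 4, the relevant engine properties for a cluster behind a load balancer might look like the following (the server ID, host name, and engine name here are hypothetical):

cluster.lock.enabled=true
# optional; defaults to the host name of the server
cluster.server.id=app-server-01
# clients reach the cluster through the load balancer
sync.url=https://loadbalancer.example.com/sync/corp-000
# required when using local (non-shared) staging
initial.load.use.extract.job.enabled=false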

With the cluster.lock.enabled property set to true, jobs will acquire an entry in the LOCK table to ensure that only one instance of the job runs across the cluster. When a lock is acquired, a row is updated in the lock table with the time of the lock and the server ID of the locking job. The locking server ID defaults to the host name, but it can be specified with the cluster.server.id property if nodes are running on the same server. Another instance of the job cannot acquire a lock until the locking instance releases the lock and sets the lock time back to null. If an instance is terminated while the lock is still held, an instance with the same server ID is allowed to re-acquire the lock. If the locking instance remains down, the lock can be broken after it expires, as specified by the cluster.lock.timeout.ms property. Jobs refresh their lock periodically as they run, which prevents a lock from expiring due to a long run time.
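
Conceptually, acquiring a job lock amounts to a conditional update against the LOCK table, as in the simplified sketch below (the actual table tracks additional state):

update sym_lock
   set locking_server_id = 'app-server-01', lock_time = current_timestamp
 where lock_action = 'ROUTE' and lock_time is null;
-- exactly one row updated means this server holds the lock;
-- releasing the lock sets lock_time back to null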

The load balancer should be configured to use sticky sessions if the cluster will receive push synchronization. Push connections first request a reservation from the target node and then connect again using the reservation to push changes. Sticky sessions ensure that the push request is sent to the same server where the reservation is held.

Staging writes batches to disk before sending them over the network, using either local disk or a shared network drive. Staging can improve performance by reducing the time that resources are held open in the database and by extracting batches before they are served. To use local staging in a cluster, disable the initial.load.use.extract.job.enabled property so the initial load will extract batches on the node serving the request, rather than extracting in the background on a different node. To use shared staging in a cluster, set the staging.dir property to the directory path of the network drive and enable the cluster.staging.enabled property so files are locked during use. With shared staging, the initial load extracts in the background on one node, but batches can be served from any of the nodes in the cluster, which can improve performance.
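
For shared staging, for instance, each node's engine file might contain the following (the mount path is hypothetical):

staging.dir=/mnt/symmetricds-staging
cluster.staging.enabled=true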

When deploying nodes in a cluster to an application server like Tomcat or JBoss, the application server does NOT need any clustering of sessions configured.

2.4. Other Deployment Options

It is recommended that SymmetricDS be installed as a standalone service; however, there are other deployment options.

2.4.1. Web Archive (WAR)

This option means packaging a WAR file and deploying to your favorite web server, like Apache Tomcat. It’s a little more work, but you can configure the web server to do whatever you need. SymmetricDS can also be embedded in an existing web application, if desired. As a web application archive, a WAR is deployed to an application server, such as Tomcat, Jetty, or JBoss. The structure of the archive will have a web.xml file in the WEB-INF folder, an appropriately configured symmetric.properties file in the WEB-INF/classes folder, and the required JAR files in the WEB-INF/lib folder.

symmetric war
Figure 1. War

A war file can be generated using the standalone installation’s symadmin utility and the create-war subcommand. The command requires the name of the war file to generate. It essentially packages up the web directory, the conf directory and includes an optional properties file. Note that if a properties file is included, it will be copied to WEB-INF/classes/symmetric.properties. This is the same location conf/symmetric.properties would have been copied to. The generated war distribution uses the same web.xml as the standalone deployment.

bin/symadmin -p my-symmetric-ds.properties create-war /some/path/to/symmetric-ds.war

2.4.2. Embedded

This option means you must write a wrapper Java program that runs SymmetricDS. You would probably use the Jetty web server, which is also embeddable. You could bring up an embedded database like Derby or H2. You could configure the web server, database, or SymmetricDS to do whatever you needed, but it’s also the most work of the three options discussed thus far.

The deployment model you choose depends on how much flexibility you need versus how easy you want it to be. Both Jetty and Tomcat are excellent, scalable web servers that compete with each other and have great performance. Most people choose either the Standalone installation or the Web Archive deployed to Tomcat. Deploying to Tomcat is a good middle-of-the-road decision that requires a little more work for more flexibility.

A Java application with the SymmetricDS Java Archive (JAR) library on its classpath can use the SymmetricWebServer to start the server.

import org.jumpmind.symmetric.SymmetricWebServer;

public class StartSymmetricEngine {

    public static void main(String[] args) throws Exception {

        SymmetricWebServer node = new SymmetricWebServer(
                                   "classpath://my-application.properties", "conf/web_dir");

        // this will create the database, sync triggers, start jobs running
        node.start(8080);

        // this will stop the node
        node.stop();
    }
}

This example starts the SymmetricDS server on port 8080. The configuration properties file, my-application.properties, is packaged in the application to provide properties that override the SymmetricDS default values. The second parameter to the constructor points to the web directory. The default location is web. In this example the web directory is located at conf/web_dir. The web.xml is expected to be found at conf/web_dir/WEB-INF/web.xml.

2.4.3. Client Mode

This option runs the SymmetricDS engine without a web server, so it can initiate push and pull requests, but not receive them. Without the web server, there are no open ports listening for sync requests, which can help with security requirements. Be aware that this also means losing access to the web console at this node and any enhanced troubleshooting provided by remote status.

The conf/sym_service.conf file has a parameter to start the service in client mode:

wrapper.app.parameter.3=--client

2.5. Upgrade Considerations

Upgrading an existing SymmetricDS instance depends on the deployment option used originally.

  1. Standalone Installation: The SymmetricDS Pro setup program will automatically detect an older version and upgrade it.

  2. Clustered Installation has an added challenge of coordinating the database schema upgrade and synchronizing the keystore. See Upgrade SymmetricDS Cluster for more details.

  3. Web Archive (WAR): Use commands described in Web Archive (WAR) to generate a new WAR file and install that as before.

  4. Embedded Installation: As noted in the Embedded section, you would need to swap all SymmetricDS Java Archive (JAR) libraries on the classpath of your Java application for their new versions.

  5. Rollback: a combination of restoring the database from a prior backup and running the SymmetricDS Pro setup program to overwrite the current installation with an older version. See Downgrade Considerations for more details and options.

2.5.1. Upgrade SymmetricDS Cluster

Upgrading a cluster with multiple SymmetricDS installations requires careful coordination. Below are the most important steps. It is recommended to test the entire upgrade process in a non-Production environment and to measure the time each step takes.

In order to minimize risk of downtime, some IT departments require a verified rollback plan. To ensure your rollback plan works, practice it in a non-Production environment.

Contact JumpMind about JumpStart services to have experts help you plan and implement a complicated upgrade in the shortest amount of time possible.

  1. Delete any offline nodes that are not absolutely necessary.

  2. Run the SymmetricDS purge jobs on-demand: Purge Outgoing, Purge Incoming.

  3. Deactivate/disable all servers with SymmetricDS engines in the load balancer.

  4. Stop all SymmetricDS engines (on all servers in the cluster);

  5. Hold any external application jobs or batch processes.

  6. Database backup:

    • Disconnect all users from the database (use a single-user DBA mode, if available).

    • Take a full database backup (to support your rollback plan).

    • Exit single user DBA mode after database backup is complete.

  7. Install SymmetricDS on all new servers (upgrading in place is possible, but it slows down the rollback plan).

  8. Migrate SymmetricDS configuration from an old server:

    • Parameters: copy the engine .properties file from the old SymmetricDS server to a new one.

    • Logging: edit the log4j2.xml file on the new SymmetricDS installation to include settings from the old installation's log4j file.

  9. First start and initialization:

    • Start only one SymmetricDS engine

    • Allow SymmetricDS the time necessary to execute any database changes (schema upgrade scripts). The duration of this step varies, as it depends on the version you are upgrading from.

    • Enable the first SymmetricDS server in the load balancer;

    • Test the first server using the /ping URL.

    • Log into the SymmetricDS console and look for alerts.

    • Review symmetricds.log for any errors. If an error is linked to the SSL certificate CA (unrecognized issuing authority), use the [Import URL] button in the console (Manage→Security→Authorities tab). This will update the keystore file.

  10. Clone SymmetricDS cluster configuration to new servers

    • Copy the engine .properties file from the old SymmetricDS server to a new one.

    • Edit the log4j2.xml file on the new SymmetricDS installation and manually migrate all features from the older log4j configuration.

    • Copy the entire security sub-directory (it contains the keystore, which must be kept in sync for all nodes in this cluster!).

    • Start SymmetricDS engine on additional servers in the cluster and monitor errors in the log;

    • Enable additional SymmetricDS servers in the load balancer;

    • Test additional SymmetricDS servers by using the /ping URL;

  11. Test the cluster - one server at a time:

    • Stop the SymmetricDS engine;

    • Reboot the server;

    • Ensure the load balancer detects the change (the server is listed as offline);

    • Confirm that the SymmetricDS engine runs automatically at system start-up.

  12. Enable any external application jobs or batch processes.

2.6. Downgrade Considerations

There are two common ways to downgrade an existing SymmetricDS instance, depending on the availability of a full database backup.

  1. When a very recent full database backup (from before the upgrade) is available:

    • Shut down SymmetricDS engine;

    • Restore full database backup;

    • If the old server is still available, start it with the old installation of SymmetricDS;

    • Or run the SymmetricDS installer for an older version - matching the old installation.

    • Start the SymmetricDS engine and monitor logs for errors.

  2. When a database backup is not available or is out of date (missing new or unprocessed data):

    • Shut down SymmetricDS engine;

    • Manually execute a command to modify the database version marker to match the older version:
      UPDATE sym_node SET schema_version='your_prior_symmetric_version' WHERE node_id='your_cluster_nodes';

    • Run the SymmetricDS installer for an older version - matching the old installation.

    • Start the SymmetricDS engine and monitor logs for errors.

3. Setup

Once the SymmetricDS software is installed on a computer and an instance of it is running, the next step in setting up the synchronization scenario is to set up a SymmetricDS node within that running SymmetricDS instance. As a reminder, a SymmetricDS node is connected to a database or file system and is responsible for synchronizing that database’s data to other SymmetricDS nodes within the node network.

3.1. Node Type

When opening the web console, if there are no nodes defined within the running instance, the Connect Database Wizard will be displayed to guide you through the process of creating one. There are two types of nodes:

  1. Setup New Replication - The primary node is typically the first node set up when creating a new replication scenario, and it serves as the central place where configuration is done. All configuration is stored in the database that this node is attached to.

  2. Join Existing Replication - All other nodes join existing replication by registering with the primary node, where they receive a copy of the configuration and learn about other nodes.

A third option will perform the Setup Demo installation, which does the following:

  1. Creates two example databases

  2. Creates a primary node attached to the server database

  3. Creates a secondary node attached to the client database

  4. Configures synchronization of the two databases

  5. Randomly executes insert, update, and delete statements on the database to show activity

wizard node setup
Figure 2. Node selection.

Select whether the node will setup a new replication, join an existing replication, or run a demonstration.

3.2. Primary Node Setup

3.2.1. Connect to the database

wizard connect database

The first step in setting up a primary node is to connect it to the database it will be responsible for synchronizing. This database will also be used to store the SymmetricDS configuration.

  • Specify the database type

  • Choose the replication mode (not all options available for every database)

    • Trigger-based - Triggers installed and maintained by SymmetricDS to capture changes.

    • Log-based - The database provider's transaction logs are read to capture changes.

    • Time-based - A timestamp column or a single column numeric value is used to capture changes.

    • Load Only - A node that will only act as a target and will not capture changes.

    • Extract Only - Deprecated; use time-based moving forward.

  • Specify the JDBC URL. An example is provided based on the database type selected. Be sure to replace the host name, port, and database name as necessary (varies by database type)

  • Specify the user ID used to log in to the database

  • Specify the password used to log in to the database

3.2.2. Table permissions

wizard connect permissions

This screen shows the permissions required to install the SymmetricDS system tables for the provided user and database.

3.2.3. Select a configuration

SymmetricDS has several predefined configuration options that can assist you with setting up a synchronization scenario. These options create a base configuration and allow you to tweak that configuration after initial creation. The following configuration options are provided:

Two Tier

In this pattern, a central database is synchronized with many child databases of the same type. Typical examples include a central office database synchronizing to field databases (central office to stores, central office to field offices, etc.)

Source to Target

Similar to Two Tier, except the node groups are named source and target

Multi Master

Multi-master synchronization, typically two master databases/nodes used in failover fashion

Secondary

A primary node group and a secondary node group synchronize by pushing data to each other.

On Prem to Cloud

An on prem node group that includes the registration server synchronizes data to and from a cloud node group.

Multi Source

Many source databases synchronizing to a target database. A typical example for this pattern is operational databases to a data warehouse or data mart.

The other options aren’t predefined configurations, but options for making your own configuration.

Custom configuration

This option allows you to configure everything from scratch

Import existing configuration

Allows you to import an existing SymmetricDS configuration file for this node

wizard connect config

Select the option and click Next.

3.2.4. Communication Settings

wizard connect host

Provide the URL that other SymmetricDS nodes will use to connect to this primary node. The default option provided should be used unless the primary node will be accessed through a proxy, load balancer, or different host name than shown in the default option.

You can also select HTTPS or HTTP to be used for transferring data.

3.2.5. Console User

wizard connect user

By default SymmetricDS uses an internal security scheme for protecting the SymmetricDS application. The admin user is the administrative user for the SymmetricDS web console. Provide a password for this user and click Next.

Additional users can be setup in the web console after the wizard completes. For more information, see Users.

SymmetricDS can also use LDAP or SAML to secure the SymmetricDS application. For more information, see Single Sign-On.

3.2.6. Summary

wizard connect ready

The node is now ready to be installed. Click the Finish button.

3.3. Add Table(s) Wizard

Once the SymmetricDS software is installed, an instance of it is running, and a master node is set up, the synchronization scenario must be configured (i.e. what tables and/or files do I want to synchronize from which source node groups to which target node groups). The Add Table(s) Wizard simplifies this process by walking you through several simple configuration steps.

The Add Table(s) Wizard can be found under the Configure→Overview screen.

configure overview
The add table(s) wizard will show a screen for each router that is in place. The example below shows the Two Tier configuration option with a client and server.
wizard config start

3.3.1. Router: Client to Server

wizard config client server

Select all tables that should be synchronized in the direction stated in the header, in this case from the CLIENT to the SERVER.

3.3.2. Router: Server to Client

wizard config server client

Next we will select tables to sync from the SERVER to the CLIENT.

The item_selling_price table was selected on both wizard screens so this table will be configured for bi-directional sync.

3.3.3. Configuration Summary

wizard config verify

The summary of the configuration made by the wizard.

Once the add table(s) wizard has completed your setup, you can still navigate to the specific components created to provide customizations beyond the default settings.

3.4. Node Setup

Once configuration is complete, SymmetricDS will automatically open the Connect Database Wizard if there is only one node currently set up in your network.

See Add Node for more information.

3.5. Node Properties File

The connect database wizard generates a properties file that contains all the information a node needs to bootstrap itself.

Each node that is deployed to a server is represented by a properties file that allows it to connect to a database and register with a parent node. Properties are configured in a file named xxxxx.properties. It is placed in the engines directory of the SymmetricDS install. The file is usually named according to the engine.name, but it is not a requirement.

To give a node its identity, the following properties are required. Any other properties found in conf/symmetric.properties can be overridden for a specific engine in an engine’s properties file. If the properties are changed in conf/symmetric.properties they will take effect across all engines deployed to the server.

You can use the variable $(hostName) to represent the host name of the machine when defining these properties (for example, external.id=$(hostName)). You can also access external id, engine name, node group id, sync URL, and registration URL in this manner. (for example, engine.name=$(nodeGroupId)-$(externalId)).
You can also use a BSH script for the external id, engine name, node group id, sync URL, and registration URL. Use back ticks to indicate the BSH expression, and note that only one BSH expression is supported for a given property line. The script can be prefixed or suffixed with fixed text. For example, if you wish to base the external id on just a part of the hostname (e.g., a substring of hostName): external.id=store-`import org.apache.commons.lang.StringUtils; return StringUtils.substring(hostName,2,4);`
You can also set both environmental variables and system properties directly in the engine file. Environmental variables refer to external variables supplied by the Operating System, and system properties are variables that can be passed into the JVM. For example, if the environment variable you’re using is USERNAME, you would use $(USERNAME) in the engine file. Note, this is case sensitive.
engine.name

This is an arbitrary name that is used to access a specific engine using an HTTP URL. Each node configured in the engines directory must have a unique engine name. The engine name is also used for the domain name of registered JMX beans.

group.id

The node group that this node is a member of. Synchronization is specified between node groups, which means you only need to specify it once for multiple nodes in the same group.

external.id

The external id for this node has meaning to the user and provides integration into the system where it is deployed. For example, it might be a retail store number or a region number. The external id can be used in expressions for conditional and subset data synchronization. Behind the scenes, each node has a unique sequence number for tracking synchronization events. That makes it possible to assign the same external id to multiple nodes, if desired.

sync.url

The URL where this node can be contacted for synchronization. At startup and during each heartbeat, the node updates its entry in the database with this URL. The sync url is of the format: http://{hostname}:{port}/{webcontext}/sync/{engine.name}

The {webcontext} is blank for a standalone deployment. It will typically be the name of the war file for an application server deployment.

The {engine.name} can be left blank if there is only one engine deployed in a SymmetricDS server.
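
For example, a standalone registration server running the engine corp-000 on a hypothetical host corp.example.com with the default port would use:

sync.url=http://corp.example.com:31415/sync/corp-000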

When a new node is first started, it has no information about synchronizing. It contacts the registration server in order to join the network and receive its configuration. The configuration for all nodes is stored on the registration server, and the URL must be specified in the following property:

registration.url

The URL where this node can connect for registration to receive its configuration. The registration server is part of SymmetricDS and is enabled as part of the deployment. This is typically equal to the value of the sync.url of the registration server.

Note that a registration server node is defined as one whose registration.url is either blank or identical to its sync.url.

For a deployment where the database connection pool should be created using a JDBC driver, set the following properties:

db.driver

The class name of the JDBC driver.

db.url

The JDBC URL used to connect to the database.

db.user

The database username, which is used to login, create, and update SymmetricDS tables.

db.password

The password for the database user.

See Startup Parameters for additional parameters that can be specified in the engine properties file.
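
Pulling these properties together, an engine properties file for a hypothetical store node that registers with a corp server might look like the following sketch (the IDs, URLs, and credentials are placeholders):

engine.name=store-001
group.id=store
# a variable such as $(hostName) could be used instead of a literal value
external.id=001
registration.url=http://corp.example.com:31415/sync/corp-000
db.driver=org.h2.Driver
db.url=jdbc:h2:file:store-001;AUTO_SERVER=TRUE
db.user=symmetric
db.password=changeme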

3.6. Load Only Node

Load only nodes can be configured for nodes that are only designed to load data into the database. This prevents any triggers or runtime tables from being installed on this database but still allows data to be replicated to it.

Advantages of Load Only Nodes
  • No SymmetricDS runtime tables (SYM_*) installed inside the target database.

  • No SymmetricDS triggers installed inside the target database.

  • There is still support for bulk loaders if provided (MSSQL, Oracle, Postgres, MySQL for example).

  • Allows SymmetricDS to load data into dialects that are not fully supported yet or may not have full trigger support.

Constraints of Load Only Nodes
  • Cannot be set up to capture changes

  • Requires a JDBC driver

  • May require table creation outside of SymmetricDS

  • Requires an additional database to be used for SymmetricDS runtime usage. H2 is used in the setup below but any other database could be used as the runtime database.

3.6.1. Setup Load Only Node

  • To setup a load only node, begin by adding a new node to your configuration (See Add Node for details on adding a new node).

Setup Load Only Node For Supported Databases
  • On the database setup screen of the connect database wizard, select load only after selecting your database type.

manage load only
  • The remaining steps in the connect database wizard are the same as if adding a full node, so continue through the remainder of the connect database wizard to complete the configuration.

Setup Load Only Node For Unsupported Databases
  • Click the Add Driver…​ button to setup a new JDBC driver to support a load only node.

manage load only driver
  • Enter the information for the new JDBC driver

    Alias

    Provide a name for the database as it will appear in the future drop down list of available databases.

    Upload

    Upload button to upload a new JDBC driver jar file to the SymmetricDS /lib folder

    Driver Class

    This will be populated by default by a class in the uploaded JDBC jar that implements the java.sql.Driver interface.

    Example Url

    An example JDBC connection URL for this driver, so that when it is selected in the future a template will be provided.

  • Click Save and the database setup screen will now be populated with your new driver information.

  • The remaining steps in the connect database wizard are the same as if adding a full node, so continue through the remainder of the connect database wizard to complete the configuration.

Uploading a jar file is only available in the standalone installation of SymmetricDS, NOT with the deployed war file setup. For a web container (war) based deployment you would need to add the driver jar to the web server's classpath and restart. Then provide the driver class and URL manually as explained in steps 4 and 5 below.
If the driver is for a database already supported by SymmetricDS and the driver jar file was packaged with SymmetricDS (in the /lib folder) or has already been uploaded for another node, there is no need to upload anything here.

4. Configuration

Configuring SymmetricDS is the process of setting up your synchronization scenario.

The Configure Overview screen provides a summary of what is currently configured in this master node. The main table on this screen shows configuration elements (Groups, Group Links, Routers, etc.) and the number of items configured for each element. As an example, in the sample screenshot below, there are 2 Groups defined, 2 Group Links defined, 2 Routers defined, 9 Channels defined, etc. Certain configuration elements are required to configure the synchronization scenario. If these elements are not configured, a warning symbol will be displayed next to the element name.

In addition to using the Add Table(s) Wizard to configure the synchronization solution, you can also configure SymmetricDS configuration elements one by one using the links on the left hand side of the Configure Overview screen. Each configuration element is described in the following sections of this guide.

Configuration can also be imported from and exported to external files by pressing the Import and Export buttons from the Configure Overview screen.

configure overview

4.1. Import

There are two ways a SymmetricDS configuration can be imported. Both ways support importing an export that is in CSV format. In order to import in SQL format, you would run the SQL script via database-specific tools or import through the web console in Configure → Overview.

  1. During the creation of the master node, while selecting a configuration

    import wizard profile
    Figure 3. Import Existing configuration during the node creation
  2. Import a configuration through the web console

    This will wipe out all existing configuration settings and replace them with the newly imported ones.

4.2. Export

Export your configuration at any time through the web console. Use the drop down to select the format you would like to use for the export.

Exporting your configuration might be useful in order to save a configuration in source control or to be used as an import into another environment. Export and Import are frequently used to migrate configuration between test and production environments.

4.3. Groups

In SymmetricDS, configuration rules are applied to groups of nodes versus individual nodes. A group is a categorization of nodes with similar synchronization needs. For example, in a synchronization scenario where a corporate office database is synchronized with field office databases, two node groups would be created, one for the corporate office database (Corporate), and one for the field office databases (Field_office). In the corporate group, there would be a single node and database. In the field_office group, there would be many nodes and databases, one for each field office. Configuration rules/elements are applied to the node group versus the individual nodes in order to simplify the configuration setup (no need to configure each individual field office node, just how the field office nodes sync with the corporate office node).

group
Required Fields
Group ID

Unique identifier for the group.

Advanced Options
Description

Description of the group that is available through the console.

Example 1. Sample Node Groups

Basic Two Tier

Server, Client

Retail

Corp, Store, Register, Handheld

Field Office

Corporate, Field_office

Failover

Master, Failover

4.4. Group Links

Group links define at a high level how data moves throughout your synchronization scenario. The group link defines which node groups will synchronize data to other node groups and within that exchange, which node group will initiate the conversation for that exchange.

group link
Required Fields
Source Group ID

The source group of the communication link.

Link

Defines how the source and target groups will communicate.

Table 1. Options for Group Links

Push [P]

Indicates that nodes in the source node group will initiate communication over an HTTP PUT and push data to nodes in the target node group.

Wait for Pull [W]

Indicates nodes in the source node group will wait for a node in the target node group to connect via an HTTP GET and allow the nodes in the target node group to pull data from the nodes in the source node group.

Route-only [R]

Route-only indicates that the data isn’t exchanged between nodes in the source and nodes in the target node groups via SymmetricDS. This action type might be useful when using an XML publishing router or an audit table changes router.

Target Group ID

The target group of the communication link.

Advanced Options
Sync Configuration

Determines if configuration is also sent through this group link during routing. This affects all SymmetricDS configuration tables except for extract_request, monitor_event, node, node_host, node_security, table_reload_request, and table_reload_status. By default this is checked and configuration will communicate on this path. There are configurations that might cause configuration to continuously loop through the network. As a result, this might need to be unchecked for some links.

Sync SQL/DDL

Determines if SQL events (including DDL) are also sent through this group link during routing. By default this is checked and SQL/DDL will communicate on this path. When using DDL replication, this gives the user control of where to route DDL changes.

Reversible

Allows the communication link to send in the reverse direction if specified on the channel. A push link can be overridden to pull and a pull link can be overridden to push using a setting on the channel.

Example 2. Sample Group Links
Basic

Common two-tier corp and store bi-directional setup

group link basic
Firewall

All communication is initiated from the corp group. This is a common way to set up synchronization when the corp group is behind a firewall.

group link firewall
Peer to Peer

Allows a node to sync to all other nodes in the same group.

group link same
Loop

Turn sync configuration off on group link c to a so that configuration does not loop.

group link config
Group Links Tips and Tricks
The link also defines if configuration data will be synchronized on the link. For example, you might not want remote nodes to be able to change configuration and affect other nodes in the network. In this case you would uncheck sync configuration.
Turn off the sync configuration option where appropriate in non-hierarchical setups so that configuration changes do not create a loop. See the "Loop" example above.

4.5. Routers

Routers ride on top of group links. While a group link specifies that data should be moved from nodes in a source node group to nodes in a target node group, routers define more specifically which captured data from a source node should be sent to which specific nodes in a target node group, all within the context of the node group link.

router
Required Fields
Router Id

Unique description of a specific router

Group Link

The group link used for the source and target node groups of this router

Router Type

The type of router. Standard router types are listed below. Custom routers can be configured as extension points. If a router does not support non-DML data events, then it will send all non-DML data events to the default router instead.

Table 2. Router Types
Type Description Supports Non-DML

default

A router that sends all captured data to all nodes that belong to the target node group defined in the router. See Default Router

✔

column

A router that compares old or new column values in a captured data row to a constant value or the value of a target node’s external id or node id. See Column Match Router

audit

A router that inserts into an automatically created audit table. It records captured changes to tables that it is linked to. See Audit Table Router

java

A router that executes a Java expression in order to select nodes to route to. The script can use the old and new column values. See Java Router

✔

lookuptable

A router which can be configured to determine routing based on an existing or ancillary table specifically for the purpose of routing data. See Lookup Table Router

subselect

A router that executes a SQL expression against the database to select nodes to route to. This SQL expression can be passed values of old and new column values. See Subselect Router

convertToReload

When subselect router is too slow and there are too many tables or rows for lookuptable router, a convertToReload router can efficiently sub-set data by converting multiple changes into a reload batch. See ConvertToReload Router

bsh

A router that executes a Bean Shell script expression in order to select nodes to route to. The script can use the old and new column values. See Beanshell Router

✔

csv

A router that sends data from a CSV file to a target table of the target node group defined in the router. See CSV Router

dbf

A router that sends data from a dBase generated DBF file to a target table of the target node group defined in the router. See DBF Router

Router Expression

An expression that is specific to the type of router that is configured in router type. See the documentation for each router for more details.

Use Source Catalog/Schema

If set then the source catalog and source schema are sent to the target to be used to find the target table.

Target Catalog

Optional name of catalog where a target table is located. If this field is unspecified, the catalog will be either the default catalog at the target node or the "source catalog name" from the table trigger, depending on how "use source catalog schema" is set for the router. Variables are substituted for $(sourceNodeId), $(sourceExternalId), $(sourceNodeGroupId), $(targetNodeId), $(targetExternalId), $(targetNodeGroupId), $(sourceCatalogName), and $(sourceSchemaName). Parameter values can be substituted using $(name) syntax. See Variables.

Target Schema

Optional name of schema where a target table is located. If this field is unspecified, the schema will be either the default schema at the target node or the "source schema name" from the table trigger, depending on how "use source catalog schema" is set for the router. Variables are substituted for $(sourceNodeId), $(sourceExternalId), $(sourceNodeGroupId), $(targetNodeId), $(targetExternalId), $(targetNodeGroupId), $(sourceCatalogName), and $(sourceSchemaName). Parameter values can be substituted using $(name) syntax. See Variables.

Advanced Options
Sync on Update

Flag that indicates that this router should send updated rows from nodes in the source node group to nodes in the target node group.

Sync on Insert

Flag that indicates that this router should send inserted rows from nodes in the source node group to nodes in the target node group.

Sync on Delete

Flag that indicates that this router should send deleted rows from nodes in the source node group to nodes in the target node group.

Target Table

Optional name for a target table. Only use this if the target table name is different than the source.

Example 3. Sample Routers
routers examples
  • link break A router that is not linked to any triggers.

  • table A router that is linked to a single trigger.

  • table multiple A router that is linked to multiple triggers.

4.5.1. Router Types

Default Router

The simplest router is a router that sends all the data that is captured by its associated triggers to all the nodes that belong to the target node group defined in the router.

Column Match Router

Sometimes requirements may exist that require data to be routed based on the current value or the old value of a column in the table that is being routed.

The first part of the expression is always the column name. The column name should always be defined in upper case. The upper case column name prefixed by OLD_ can be used to compare against the old column value.

The second part of the expression can be a constant value, a token that represents another column, or a token that represents some other SymmetricDS concept. Token values always begin with a colon (:).

  1. Consider a table that needs to be routed to all nodes in the target group only when a status column is set to 'READY TO SEND.'

Router Expression
STATUS=READY TO SEND
  2. Consider a table that needs to be routed to all nodes in the target group only when a status column changes values.

This example uses OLD_STATUS, where the OLD_ prefix gives access to the old column value.
Router Expression
STATUS!=:OLD_STATUS
Attributes of a node can be referenced with the following tokens:
  • :NODE_ID

  • :SOURCE_NODE_ID

  • :EXTERNAL_ID

  • :SOURCE_EXTERNAL_ID

  • :NODE_GROUP_ID

  • :SOURCE_NODE_GROUP_ID

  • :REDIRECT_NODE

  3. Consider a table that needs to be routed only to nodes in the target group whose STORE_ID column matches the external id of a node.

Router Expression
STORE_ID=:EXTERNAL_ID
  4. Consider a table that needs to be routed to a redirect node defined by its external id in the REGISTRATION_REDIRECT table.

Router Expression
STORE_ID=:REDIRECT_NODE
  5. More than one column may be configured in a router_expression. When more than one column is configured, all matches are added to the list of nodes to route to. The following is an example where the STORE_ID column may contain either the STORE_ID to route to or the constant ALL, which indicates that all nodes should receive the update.

Router Expression
STORE_ID=ALL or STORE_ID=:EXTERNAL_ID
  6. The NULL keyword may be used to check if a column is null. If the column is null, then data will be routed to all nodes that qualify for the update. The following is an example where the STORE_ID column is used to route to a set of nodes whose STORE_ID equals their EXTERNAL_ID, or to all nodes if the STORE_ID is null.

Router Expression
STORE_ID=NULL or STORE_ID=:EXTERNAL_ID
  7. External data collected as part of the trigger firing (see External Select) can also be used as a virtual column in the router expression.

Router Expression
EXTERNAL_DATA=:EXTERNAL_ID
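Putting the pieces together, a column match router such as the STORE_ID example above could be registered in SQL roughly as follows. This is a sketch; the router id and node group ids are assumptions.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store-by-store-id', 'corp', 'store', 'column',
        'STORE_ID=:EXTERNAL_ID', current_timestamp, current_timestamp);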
Audit Table Router

This router audits captured data by recording the change in an audit table that the router creates and keeps up to date. The router creates a table named the same as the table for which data was captured, with a suffix of _AUDIT. It contains all of the same columns as the original table with the same data types, except that each column is nullable with no default values.

The following parameter must be set to true so that the audit table can be created.
auto.config.database=true
Three extra "AUDIT" columns are added to the table:
AUDIT_ID

The primary key of the audit table.

AUDIT_TIME

The time at which the change occurred.

AUDIT_EVENT

The DML operation that was performed on the row.

The audit router must be associated with a node group link of type 'R'. The 'R' stands for 'only routes to' (see Group Links).
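As a sketch, an audit router might be configured as follows, assuming a node group link from 'corp' to an 'audit' group with a link action of 'R'.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        create_time, last_update_time) values
        ('corp-2-audit', 'corp', 'audit', 'audit',
        current_timestamp, current_timestamp);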
Lookup Table Router

A lookup table may contain the id of the node where data needs to be routed. This could be an existing table or an ancillary table that is added specifically for the purpose of routing data.

Each of the following configuration parameters are required.
LOOKUP_TABLE

This is the name of the lookup table.

KEY_COLUMN

This is the name of the column on the table that is being routed. It will be used as a key into the lookup table.

LOOKUP_KEY_COLUMN

This is the name of the column that is the key on the lookup table.

EXTERNAL_ID_COLUMN

This is the name of the column that contains the external_id of the node to route to on the lookup table.

ALL_NODES_VALUE

This is an optional parameter that allows you to specify a value for the EXTERNAL_ID_COLUMN that means "send to all nodes". The value of "null" will be interpreted as a null value, not the string "null".

The lookup table will be read into memory and cached for the duration of a routing pass for a single channel.

Consider a table that needs to be routed to a specific store, but the data in the changing table only contains brand information. In this case, the STORE table may be used as a lookup table.

Router Expression

LOOKUP_TABLE=STORE KEY_COLUMN=BRAND_ID LOOKUP_KEY_COLUMN=BRAND_ID EXTERNAL_ID_COLUMN=STORE_ID
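A lookup table router using the expression above might be registered in SQL as follows. This is a minimal sketch; the router id and node group ids are assumptions.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store-by-brand', 'corp', 'store', 'lookuptable',
        'LOOKUP_TABLE=STORE KEY_COLUMN=BRAND_ID LOOKUP_KEY_COLUMN=BRAND_ID EXTERNAL_ID_COLUMN=STORE_ID',
        current_timestamp, current_timestamp);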

Subselect Router

Sometimes routing decisions need to be made based on data that is not in the current row being synchronized. A 'subselect' router can be used in these cases. A 'subselect' router is configured with a router expression that is a SQL select statement which returns a result set of the node ids that should be routed to. Column tokens can be used in the SQL expression and will be replaced with row column data.

The overhead of using this router type is high because the 'subselect' statement runs for each row that is routed. It should not be used for tables with a high volume of updates. It also has the disadvantage that if the data relied on to determine the node id has been deleted before routing takes place, then no results are returned and routing does not happen.

The router expression you specify is appended to the following SQL statement in order to select the node ids:

select c.node_id
from sym_node c
where c.node_group_id=:NODE_GROUP_ID
        and c.sync_enabled=1 and ...

The SQL statement has access to the following variables that are replaced before running:

Table 3. Variables available to the subselect router

:NODE_GROUP_ID

The target node group ID that is configured for the router.

:EXTERNAL_DATA

The external data for current row, as configured by sym_trigger.external_select.

:DATA_EVENT_TYPE

The event type of either INSERT, UPDATE, or DELETE.

:TABLE_NAME

The table name for the current row.

:COLUMN_NAME

Variables named for each column name (in uppercase), which return the column value for the new row.

:OLD_COLUMN_NAME

Variables named for each column name (in uppercase and prefixed with OLD_), which return the column value for the old row.

Example 4. Sample Use Case for Subselect Router

For an example, consider the case where an Order table and an OrderLineItem table need to be routed to a specific store. The Order table has a column named order_id and STORE_ID. A store node has an external_id that is equal to the STORE_ID on the Order table. OrderLineItem, however, only has a foreign key to its Order of order_id. To route OrderLineItems to the same nodes that the Order will be routed to, we need to reference the master Order record.

There are two possible ways to solve this in SymmetricDS.

  1. Configure a 'subselect' router type (shown below).

  2. Use an external select to capture the data via a trigger for use in a column match router, see External Select.

Router Expression
c.external_id in (select STORE_ID from order where order_id=:ORDER_ID)
Note that in this example the parent row in Order must still exist at the moment of routing for the child rows (OrderLineItem) to be routed, since the select statement is run when routing occurs, not when the change data is first captured.
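A subselect router for the OrderLineItem case could be registered roughly as follows; the router id and node group ids are assumptions.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store-orderlineitem', 'corp', 'store', 'subselect',
        'c.external_id in (select STORE_ID from order where order_id=:ORDER_ID)',
        current_timestamp, current_timestamp);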
ConvertToReload Router

This router converts multiple change events into a single reload event, which can be used to sub-set data quickly when the subselect router is too slow or when the lookuptable router can’t handle the amount of lookup data. ConvertToReload inserts the primary key values for each row, along with a unique load ID, into a temporary table. The reload event uses the initial load SQL along with a join to the temporary table to retrieve the changes and sub-set data.

The router expression requires the name of the temporary table to use:

temptable=mytable_router
Example 5. Sample Use Case for ConvertToReload Router
create table mytest (id integer, name varchar(50), primary key(id));
create table mytest_router (id integer, load_id integer, primary key(load_id, id));
insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store','corp', 'store', 'convertToReload', 'temptable=mytest_router', current_timestamp,
        current_timestamp);
Beanshell Router

When more flexibility is needed in the logic to choose the nodes to route to, a scripted router may be used. The currently available scripting language is BeanShell, a Java-like scripting language. Documentation for the BeanShell scripting language can be found at http://www.beanshell.org .

The router type for a BeanShell scripted router is 'bsh'. The router expression is a valid BeanShell script that has access to the following variables:

Table 4. Variables available to the script

nodes

Collection of org.jumpmind.symmetric.model.Node objects the router would route to normally.

nodeIds

Collection of node ids that the router would route to normally. You can just return this if you want the bsh router to behave like the default router.

targetNodes

Collection of org.jumpmind.symmetric.model.Node objects to be populated and returned.

engine

The instance of org.jumpmind.symmetric.ISymmetricEngine which has access to SymmetricDS services.

Any Data Column

Data column values are bound to the script evaluation as Java object representations of the column data. The columns are bound using the uppercase names of the columns. For example, a table with a store_id column will have a STORE_ID variable name available in Bean Shell script. When using file sync, columns from the FILE_SNAPSHOT table will be available in the script.

Any Old Values

Old Data column values are bound to the script evaluation as Java object representations of the column data. The columns are bound using the uppercase representations that are prefixed with 'OLD_'. For example, a table with a store_id column will have an OLD_STORE_ID variable name available in Bean Shell script representing the old value for the store_id before the change. When using file sync, columns from the FILE_SNAPSHOT table will be available in the script.

Table 5. Return options

targetNodes

Collection of org.jumpmind.symmetric.model.Node objects that will be routed to.

true

All nodes should be routed

false

No nodes should be routed

The last line of a bsh script is always the return value.
Example 6. Use case using a Bean Shell where the node_id is a combination of STORE_ID and WORKSTATION_NUMBER, both of which are columns on the table that is being routed.
targetNodes.add(STORE_ID + "-" + WORKSTATION_NUMBER)

The same could also be accomplished by simply returning the node id.

STORE_ID + "-" + WORKSTATION_NUMBER
Example 7. Use case using a Bean Shell script to synchronize to all nodes if the FLAG column has changed, otherwise no nodes will be synchronized.
FLAG != null && !FLAG.equals(OLD_FLAG)
Here we make use of OLD_, which provides access to the old column value.
Example 8. Use case using a Bean Shell script that iterates over each eligible node and checks to see if the trimmed value of the column named STATION equals the external_id.
for(org.jumpmind.symmetric.model.Node node : nodes) {
        if (STATION != null && node.getExternalId().equals(STATION.trim())) {
                targetNodes.add(node.getNodeId());
        }
}
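For reference, a BeanShell router using the script from Example 6 might be registered as follows. This is a sketch; the router id and node group ids are assumptions.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store-bsh', 'corp', 'store', 'bsh',
        'targetNodes.add(STORE_ID + "-" + WORKSTATION_NUMBER)',
        current_timestamp, current_timestamp);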
CSV Router

This router is used to route csv files to a specified target database table.

Consider a csv file that needs to be routed to a target table. To specify the target table, click on "Show Advanced Options" and enter your table name.


In the router expression you can tell the router to include a transaction ID for the routed data with INCLUDE_TRANSACTION_ID=true|false. The default is false.

The CSV Router routes from file to database so a file trigger must be created to specify the directory or path that the file trigger should watch. (see File Triggers)
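As a sketch, a CSV router might be registered in SQL as follows; the 'csv' router type string, router id, and node group ids shown here are assumptions.

insert into SYM_ROUTER (router_id,
        source_node_group_id, target_node_group_id, router_type,
        router_expression, create_time, last_update_time) values
        ('corp-2-store-csv', 'corp', 'store', 'csv',
        'INCLUDE_TRANSACTION_ID=false', current_timestamp, current_timestamp);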
DBF Router

This router is used to route dbf files that are generated from a dBase system to a specified target database table.

The DBF Router routes from file to database so a file trigger must be created to specify the directory or path that the file trigger should watch. (see File Triggers)

4.6. Channels

Once group links and routers are defined, configuration must be completed to specify which data (tables, file systems, etc.) should be synchronized over those links and routers. The next step in defining which specific data in the database is moved is to define logical groupings for that data. Channels define those logical groupings. As an example, a set of tables that hold customer data might be logically grouped together in a Customer channel. Sales, returns, tenders, etc. (transaction data) might be logically grouped into a transaction channel. A default channel is automatically created that all tables will fall into unless other channels are created and specified. The default channel is called 'default'.

Channels can be disabled, suspended, or scheduled as needed.

Transactions will NOT be preserved across channels, so it's important to set up channels to contain all tables that participate in a given transaction.
Required Fields
Channel ID

Identifier used through the system to identify a given channel.

Processing Order

Numeric value to determine the order in which a channel will be processed. Channels will be processed in ascending order.

Batch Algorithm

Batching is the grouping of data, by channel, to be transferred and committed at the client together.

Default

All changes that happen in a transaction are guaranteed to be batched together. Multiple transactions will be batched and committed together until there is no more data to be sent or the max_batch_size is reached. The routing.max.batch.size.exceed.percent parameter is used to keep batch sizes within a percentage over the max_batch_size, or it can be set to zero for no limit.

Transactional

Batches will map directly to database transactions. If there are many small database transactions, then there will be many batches. The max_batch_size column has no effect.

Nontransactional

Multiple transactions will be batched and committed together until there is no more data to be sent or the max_batch_size is reached. The batch will be cut off at the max_batch_size regardless of whether it is in the middle of a transaction.

Max Batch Size

Specifies the maximum number of data events to process within a batch for this channel.

Max Batch To Send

Specifies the maximum number of batches to send for a given channel during a 'synchronization' between two nodes. A 'synchronization' is equivalent to a push or a pull. For example, if there are 12 batches ready to be sent for a channel and max_batch_to_send is equal to 10, then only the first 10 batches will be sent even though 12 batches are ready.

Max Data To Route

Specifies the maximum number of data rows to route for a channel at a time.

Max KB/s

Specifies the maximum network transfer rate in kilobytes per second. Use zero to indicate unlimited. When throttling the channel, make sure the channel is on its own queue or within a queue of channels that are throttled at the same rate. This is currently only implemented when staging is enabled.

Data Loader Types

Determines how data will be loaded into the target tables. These are used during an initial load or a reverse initial load. Data loaders do not always have to load into the target relational database. They can write to a file, a web service, or any other type of non-relational data source. Data loaders can also use other techniques to increase the performance of data loads into the target relational database.

default

Performs an insert first, and if this fails, falls back to an update to load the data.

ftp_localhost

Sends the data in CSV format to a configured FTP location. These locations are set up in {SYM_HOME}/conf/ftp-extensions.xml.

bulk

Assigns the appropriate bulk loader to this channel. Supported bulk loaders include: Microsoft SQL, PostgreSQL, MySQL and Amazon Redshift over S3.

mongodb

MongoDB data loader.

Tables that should be loaded with a given data loader should be configured to use this channel. Often, a reload channel will be set to bulk load to increase the performance of an initial load.
Queue Name

Determines a queue that the channel will sync in. Channels with the same queue name are processed synchronously (one at a time) and channels on different queues are processed asynchronously (in parallel).

Advanced Options
Group Link Direction

For a node group link that is reversible, the channel can specify either "push" or "pull" to override the default group link communication. If this field is empty, the default group link communication is used.

Enabled

Indicates whether the channel is enabled or disabled. If a channel is disabled, data is still captured for changes that occur on the source system, but it will not be routed and sent to the target until the channel is re-enabled.

Reload Channel

Indicates whether a channel is available for initial loads and reverse initial loads.

File Sync Channel

Indicates whether a channel is available for file synchronization.

Use Old Data To Route

Indicates if the old data will be included for routing. Routing can then use this data for processing. Defaults to true.

Use Row Data To Route

Indicates if the current data will be included for routing. Routing can then use this data for processing. Defaults to true.

Use Primary Key (PK) Data to Route

Indicates if the primary key data will be included for routing. For example, a store ID in the key may be needed to apply routing logic before sending to the appropriate target nodes. Defaults to true.

Contains Lob or Wide Row Data

For Oracle, Tibero, Firebird, and Interbase, this setting can be enabled when change data capture exceeds the character limit. Oracle and Tibero have a character limit of 4000, while Firebird and Interbase have a character limit of 20000 for changes and 1000 for primary key values. Change data capture is first attempted as character data for better performance, then it automatically falls back to extracting as a large object (LOB). Enable this setting when most changes captured on the channel need to be extracted as LOB or when the extraction is receiving a truncation error.

Example 9. Sample Channels
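A channel can also be created directly in SQL using the SYM_CHANNEL table. This is a minimal sketch; the 'item' channel id and its settings are assumptions.

insert into SYM_CHANNEL (channel_id, processing_order, max_batch_size,
        enabled, description) values
        ('item', 10, 100000, 1, 'Item and pricing data');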
Channel Tips and Tricks
Increase performance by creating designated channels for tables that use LOB data types. For these channels, be sure to enable the "Contains Lob or Wide Row Data" option.

4.7. Table Triggers

The next step in defining the synchronization scenario is to define which tables should be synchronized. The simplest way to do this is by using the Auto Create…​ button on the Configure → Table Triggers screen. The Auto Create functionality introspects the master node’s database and looks for available tables to synchronize.

Note that you will be configuring tables to synchronize for specific Routers. You will get a chance to select tables for each configured Router. Select the channel you desire from the dropdown, select the tables that should be synchronized using that channel, and then click the Next button. A summary will be presented at the end of the configuration. Click Finish to create the triggers.

The Auto Create functionality can be used repeatedly to select different table, router, and channel combinations. If you are configuring a new set of table triggers that should use an independent trigger definition, be sure to enter values for either the Trigger Prefix or the Trigger Suffix fields. These will be used to create a unique Trigger Id.


For each table selected in the Auto Create wizard, a corresponding Table Trigger configuration element will be added that can be maintained within the Table Triggers list screen. New Table Trigger configuration elements can also be created manually from this screen.

Required Fields
Trigger Id

Unique identifier for a trigger.

Source Catalog

Optional name for the catalog the configured table is in. If the name includes * then a wildcard match on the catalog name will be attempted. Wildcard names can include a list of names that are comma separated. The ! symbol may be used to indicate a NOT match condition. Parameter values can be substituted using $(name) syntax. See Variables.

Source Schema

Optional name for the schema a configured table is in. If the name includes * then a wildcard match on the schema name will be attempted. Wildcard names can include a list of names that are comma separated. The ! symbol may be used to indicate a NOT match condition. Parameter values can be substituted using $(name) syntax. See Variables.

Source Table

The name of the source table that will have a trigger installed to watch for data changes. See Trigger Wildcards for using wildcards to specify multiple source tables. Parameter values can be substituted using $(name) syntax. See Variables.

Channel

The channel_id of the channel that data changes will flow through.

Advanced Options
Sync On Insert

Flag for installing an insert trigger.

Sync On Update

Flag for installing an update trigger.

Sync On Delete

Flag for installing a delete trigger.

Reload Channel Id

The channel_id of the channel that will be used for initial loads.

Sync Conditions

A procedure language expression included in the trigger text to determine whether a change is captured or not. Most platforms include the condition inside an "IF" statement, while SQL-Server includes the condition in a "WHERE" clause. Old and new values of a column can be referenced using "$(oldTriggerValue)" and "$(newTriggerValue)" aliases respectively. See Trigger Variables. For example, if a character column is named "STATUS" and the row should be captured when the value is "2", then the condition would be:

 $(newTriggerValue).status = '2'
Sync On Insert Condition

Conditional expression for the insert trigger to determine if a change is captured or not. See Sync Conditions.

Sync On Update Condition

Conditional expression for the update trigger to determine if a change is captured or not. See Sync Conditions.

Sync On Delete Condition

Conditional expression for the delete trigger to determine if a change is captured or not. See Sync Conditions.

Custom Insert Trigger Text

Specify insert trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.

Custom Update Trigger Text

Specify update trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.

Custom Delete Trigger Text

Specify delete trigger text (SQL) to execute after the SymmetricDS trigger fires. This field is not applicable for H2, HSQLDB 1.x or Apache Derby.

Sync On Incoming

Whether or not an incoming batch that loads data into this table should cause the triggers to capture changes. Changes are never sent back to the source node, but enabling this setting makes it possible to create a never-ending loop through intermediary nodes, such as A to B to C to A.

Capture Row As LOB

For Oracle, Tibero, SQL Server, and DB2, row data that exceeds the character limit needs to be converted into a large object (LOB) for capture. Oracle and Tibero have a character limit of 4000, SQL Server has an nvarchar limit of 4000 and a varchar limit of 8000, while DB2 has a character limit of 32767. The trigger will try to capture the row correctly automatically, so only enable this setting if you are getting truncation errors during trigger creation or when changing data.

Stream LOBs

Captures an empty placeholder for large object (LOB) data types when a row is changed, then queries for the LOB value later when the batch is extracted. If normal capturing of LOBs is not working, enabling this setting may work instead. When very large LOB data is involved, this setting can reduce the overhead of making changes in the database, but it usually results in worse performance of synchronization since it queries each row during extraction.

Stream Row

Captures only the primary key values when the trigger fires, which can reduce overhead for tables with wide data or many columns. The data will be queried using the PK values when the batch is extracted. This results in worse performance of synchronization, but it can be used when triggers for all columns won’t install or when contention from triggers is too high.

Capture Old Data

Indicates whether this trigger should capture and send the old data, which is the previous state of the row before the change. Enable this option if you need to access old data in custom trigger text, routing expression, or transform expression. Otherwise, disable this option for better performance.

Handle Key Updates

For SQL-Server and Sybase, enable this setting to capture changes to the primary key. The trigger needs to do some additional work to handle changes to the primary key, so this setting is normally disabled.

External Select

Specify a SQL select statement that returns a single row, single column result. It will be used in the generated database trigger to populate the EXTERNAL_DATA field on the data table.

Excluded Column Names

Specify a comma-delimited list of columns that should not be synchronized from this table.

Included Column Names

Specify a comma-delimited list of the only columns that should be synchronized from this table.

Sync Key Names

Specify a comma-delimited list of columns that should be used as the key for synchronization operations. By default, if not specified, then the primary key of the table will be used.

Channel Expression

An expression that will be used to capture the channel id in the trigger. This expression will only be used if the channel_id is set to 'dynamic'. The variable "$(schemaName)" can be used, which is replaced with the source schema of the table. See Variables.

Example 10. Sample Triggers
Multiple Triggers On A Table
Note that many databases allow for multiple triggers of the same type to be defined. Each database defines the order in which the triggers fire differently. If you have additional triggers beyond those SymmetricDS installs on your table, please consult your database documentation to determine if there will be issues with the ordering of the triggers.
Add Table(s) Wizard
Use the Add Table(s) Wizard to set up a complete synchronization scenario with default trigger settings.
Auto Create Triggers
By using the "Auto Create" button on the "Table Triggers" screen you can create all of your triggers through a single dialog.
Capture Changed Data

When determining whether a data change has occurred, by default the triggers will record a change even if the data was updated to the same value(s) it held originally. For example, a data change will be captured if an update set a column to the same value it already had. There is a global property that allows you to override this behavior (it defaults to false).

trigger.update.capture.changed.data.only.enabled=false

This property is currently only supported on MySQL, DB2, SQL Server, PostgreSQL, and Oracle.

4.7.1. Trigger Wildcards

The source table name may be an expression that is used to match multiple table names. Special characters include the asterisk ('*') for wildcards, the comma (',') for multiple expressions, and the exclamation ('!') for negation.

Wildcard Rules
  • Separate multiple table names or table name expressions with a comma.

  • Characters are always evaluated from left to right.

  • When a table match is made, the table is either added to or removed from the list of tables. If another trigger already exists for a table, then that table is not included in the wildcard match (the explicitly defined trigger entry takes precedence).

  • System tables and any table names that start with the SymmetricDS table prefix will be excluded.

  • To negate the expression and exclude tables, start the expression with an exclamation.

  • Double up special characters to match a single literal character. (Use two asterisks to match a single asterisk.) The entire expression is processed as wildcarded when an odd number of consecutive special characters are found.

Figure 4. Sample wildcard trigger for all tables that start with "sale" or "item"
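As a sketch of these rules, the wildcard trigger in Figure 4 might be created in SQL as follows; the trigger id and channel are assumptions.

insert into SYM_TRIGGER (trigger_id, source_table_name, channel_id,
        last_update_time, create_time) values
        ('sale_and_item_tables', 'sale*,item*', 'default',
        current_timestamp, current_timestamp);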

4.7.2. External Select

Occasionally, you may find that you need to capture and save away a piece of data present in another table when a trigger is firing. This data is typically needed to determine where to 'route' the data once routing takes place. Each trigger definition contains an optional "external select" field which can be used to specify the data to be captured. Once captured, this data is available during routing in the DATA table's external_data field.

For these cases, place a SQL select statement which returns the data item you need for routing in external_select. See Trigger Variables for a list of variables available for use.

The external select SQL must return a single row with a single column.
Example 11. Sample External Select SQL that returns STORE_ID based on the ORDER_ID captured in the trigger.
select STORE_ID
from order
where order_id=$(curTriggerValue).$(curColumnPrefix)order_id
External select SQL statements should be used carefully as they will cause the trigger to run the additional SQL each time the trigger fires.
Using an external select on the trigger is similar to using the 'subselect' router. The advantage of this approach over the 'subselect' approach is that it guards against the (somewhat unlikely) possibility that the master Order table row might have been deleted before routing has taken place. This external select solution also is a bit more efficient than the 'subselect' approach.

4.7.3. Trigger Variables

The Sync Condition, External Select, and Custom Trigger Text configurations allow the user to provide procedure language text that is included inside the trigger. Variables can be used for configuration that works across different database platforms. When triggers are created, the variables are replaced with the syntax needed for that specific database.

Table 6. Trigger Template Variables

$(newTriggerValue)

New row alias for inserts and updates. For example, "$(newTriggerValue).MYCOLUMN" becomes ":new.MYCOLUMN" for an insert/update trigger on Oracle.

$(oldTriggerValue)

Old row alias for updates and deletes. For example, "$(oldTriggerValue).MYCOLUMN" becomes ":old.MYCOLUMN" for an update/delete trigger on Oracle.

$(curTriggerValue)

Current row alias for insert, updates, and deletes. This variable acts like $(newTriggerValue) for inserts and updates, and it acts like $(oldTriggerValue) for deletes.

$(curColumnPrefix)

Column prefix used only by the H2 database. It is replaced with the NEW_ or OLD_ column prefix needed by H2. All other platforms replace the variable with an empty string.

4.7.4. Load Only Triggers

Occasionally the decision of what data to load initially results in additional triggers. These triggers, known as load only triggers, are configured such that they do not capture any data changes. In other words, the sync on insert, sync on update, and sync on delete attributes of the trigger are all set to false.

Example 12. Sample load only trigger
Load only triggers still participate in the following:
  • Initial Loads

  • Reverse Initial Loads

  • Table Reloads

  • Creation of tables during initial loads

Use cases for load only triggers:
  • To load a read-only lookup table. It could also be used to load a table that needs to be populated with example or default data.

  • Recovery of data for tables that have a single direction of synchronization. For example, a retail store records sales transactions that synchronize in one direction by trickling back to the central office. If the retail store needs to recover all the sales transactions from the central office, they can be sent as part of an initial load from the central office by setting up a load only trigger that syncs in that direction.
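A load only trigger can be created by turning off all three sync flags. This is a minimal sketch; the table and channel names are assumptions.

insert into SYM_TRIGGER (trigger_id, source_table_name, channel_id,
        sync_on_insert, sync_on_update, sync_on_delete,
        last_update_time, create_time) values
        ('item_load_only', 'item', 'default', 0, 0, 0,
        current_timestamp, current_timestamp);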

4.7.5. Expandable Triggers

When a set of tables is repeated for each node and the tables all contain a prefix which groups them together, an expandable trigger can be used. The prefix will be resolved to the external id of each node that the source node replicates with, through either a push or a pull.

Table 7. Variable

$(targetExternalId)

Resolved by looking up all the nodes the source node replicates with and will be replaced with each external id to represent a table.

Example
$(targetExternalId)-Item

If there are nodes present with external ids of client1, client2, client3 then this trigger configuration would match the following 3 tables.

client1-Item
client2-Item
client3-Item

This reduces configuration: if, for example, 10 tables are repeated for each client, then only 10 table trigger configurations are needed, and as new nodes are added (along with their tables) there is no need to add any more configuration.

4.8. Table Routing

As discussed previously, Routers define more specifically which captured data from a source node should be sent to which specific nodes in a target node group. Table Triggers define which tables should be synchronized. Table Routing defines how the Routers and Table Triggers work together, specifying what tables are associated with each router (and thus the direction the table data flows).

The simplest way to set up Table Routing is to use the Auto Create…​ button.


The default entry point for the auto creation is from the triggers perspective. This will allow you to select table triggers first then routers.

Click Next to select from available routers.


Select the routers (direction) to associate with the previously selected triggers.

Click Save to install the triggers on the database and begin data capture.

You can also select routers on the first screen of the auto create if you would like to select routers first then triggers. Both approaches will result in the same table routing.


Select from available routers.

Click Next to select from available table triggers.


Select the table triggers to associate with the previously selected routers.

Click Save to install the triggers on the database and begin data capture.

The table routing elements created by the wizard can be maintained from the Table Routing List Screen. New table routing entries can also be created manually from this screen if desired.

SymmetricDS will not install table triggers on the database until they are associated with a router through table routing.
Once a table routing relationship is created, data capture will begin.
Required Fields
Table Trigger

The table trigger determines the source of data.

Router

The router determines where data will be sent.

Initial Load Select

A SQL expression used in the WHERE clause of the SELECT that extracts the table during initial load. Leave blank to retrieve all rows. For correlated joins, reference the table using the "t" alias. Variables are substituted for $(groupId), $(nodeId), and $(externalId). See Variables section for formatting and manipulation of variables.
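For example, to limit a store's initial load to its own rows, a where-clause fragment such as the following might be used (the STORE_ID column is an assumption):

t.STORE_ID = '$(externalId)'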

Advanced Options
Initial Load Delete

A SQL statement to run that clears the table on the target node at the beginning of the initial load before data is sent. This is usually a delete or truncate statement. Variables are substituted for $(groupId), $(nodeId), $(externalId), $(sourceGroupId), $(sourceNodeId), and $(sourceExternalId). See Variables section for formatting and manipulation of variables.

Initial Load Order

Numeric position for this table in the initial load, sent in ascending numeric order. When two numeric values are the same, the ordering is based on foreign key constraints. Use a negative number to exclude the table from initial load.

Enabled

When enabled, changes are captured and routed into batches for the table. When disabled, changes are no longer captured for the table and any outstanding changes are placed into batches with a status of "unrouted".

Ping Back Enabled

When enabled, synchronized rows with the Sync On Incoming option set on a trigger will be routed back to the source node. By default, the data is not routed back to the source node (to prevent endless loops of data changes). A reason to turn this option on is when a transform changes some of the data and the transformed data needs to be returned to the source node.

Data Refresh Type

When the data refresh job runs (overnight by default), this controls how data in the table is refreshed. Blank means exclude the table from the data refresh. Auto chooses between a full reload or a compare and repair. Full uses a table reload request with the delete first option. Differences uses a compare and repair request.

4.9. Table Groups

Table groups are used to define collections of tables that will synchronize their data at the same time. When one row from a table in a table group is changed on the source, all corresponding rows from other tables in the same group will also be synchronized to the target.

Required Fields
ID

Unique identifier for the table group.

Source Node Group ID

The node group where data changes are captured.

Target Node Group ID

The node group where data changes will be sent.

Writer Type

Type of writer: D=default, J=JSON.

Type Description

Default

The source data will remain relational on the target.

JSON

The source data will be converted to JSON documents on the target. The target database must be MongoDB, Azure Cosmos DB, or Elasticsearch.

Advanced Options
Description

Description of the table group that is available through the console.

4.10. Table Group Hierarchies

Table group hierarchies define the tables that belong to a table group and their relationships to each other, with each hierarchy representing a single table. Each table group hierarchy can have a parent, but it is not necessary for every hierarchy to have a parent. It is possible to configure multiple table group hierarchies for a single table if the table has multiple parents.

Required Fields
ID

Unique identifier for the table group hierarchy.

Table Group ID

Unique identifier for the table group associated with this table in the hierarchy.

Source Catalog

Optional name for the catalog the configured table is in.

Source Schema

Optional name for the schema the configured table is in.

Source Table

The name of the source table that will be a part of this table group.

Parent ID

Unique identifier for the parent of this table in the hierarchy.

Relation Type

When the writer type is non-default, this defines the structure of the data in the hierarchy when writing output: O=object, A=array.

Type Description

Object

The data will be in the form of a single object containing one key/value pair for each column in the source table.

Array

The data will be in the form of an array of objects with each object containing one key/value pair for each column in the source table.

Primary Column Names

The primary key column names on this table that uniquely identify a row, comma-separated.

Local Column Names

The local column names on this table that reference a parent table, comma-separated.

Parent Column Names

The parent column names on the parent that join with local columns, comma-separated.

4.11. File Triggers

In addition to supporting database synchronization, SymmetricDS also supports File Synchronization. Similar to database synchronization which allows configuring Table Triggers, SymmetricDS also supports setting up File Triggers. A file trigger is equivalent to specifying a directory structure or path that should be "watched" for files that need to be synchronized.

Required Fields
File Trigger Id

Unique identifier for a trigger.

Channel

The channel_id of the channel that data changes will flow through.

Reload Channel Id

The channel_id of the channel that will be used for reloads.

Base Directory

The base directory on the source node that files will be synchronized from. To access Azure blob storage, use azure://container/optional-path.

Advanced Options
Recurse

Whether to synchronize child directories.

Include Files

Wildcard-enabled (*), comma-separated list of files to include in synchronization.

Exclude Files

Wildcard-enabled (*), comma-separated list of files to exclude from synchronization.

Sync On Create

Whether to capture and send files when they are created.

Sync On Modified

Whether to capture and send files when they are modified.

Sync On Delete

Whether to capture and send files when they are deleted.

Sync On Ctl File

Combined with sync_on_create, determines whether to capture and send files when a matching control file exists. The control file is a file of the same name with a '.ctl' extension appended to the end.

Delete After Sync

Determines whether to delete the file after it has synced successfully.

Before Copy Script

A BeanShell script that is run at the target node right before the file is copied to its destination directory.

After Copy Script

A BeanShell script that is run at the target node right after the file is copied to its destination directory.

4.11.1. BeanShell Scripts

There are two types of BeanShell scripts that can be leveraged to customize file synchronization behavior:

Before copy script

This runs on delivery of a file, before it is copied to its target location.

After copy script

This runs on delivery of a file, after it is copied to its target location.

Each of these scripts has access to local variables that can be read or set to affect the behavior of copying files.

targetBaseDir

The preset base directory as configured in file trigger or overwritten in file trigger router. This variable can be set by the before copy script to set a different target directory.

targetFileName

The name of the file that is being synchronized. This variable can be overwritten by the before copy script to rename a file at the target.

targetRelativeDir

The name of a directory relative to the target base directory to which the target file will be copied. The default value of this variable is the relative directory of the source. For example, if the source base directory is /src and the target base directory is /tgt and the file /src/subfolder/1.txt is changed, then the default targetRelativeDir will be subfolder. This variable can be overwritten by the before_copy_script to change the relative directory at the target. In the above example, if the variable is set to blank using the following script, then the target file will be copied to /tgt/1.txt.

targetRelativeDir = "";
processFile

This is a variable that is set to true by default. A custom before copy script may process the file itself and set this variable to false to indicate that the file should NOT be copied to its target location.

sourceFileName

This is the name of the file.

sourceFilePath

This is the path where the file can be found relative to the batch directory.

batchDir

This is the staging directory where the batch has been extracted. The batchDir + sourceFilePath + sourceFileName can be used to locate the extracted file.

engine

This is the bound instance of the ISymmetricEngine that is processing a file. It gives access to all of the APIs available in SymmetricDS.

sourceNodeId

This is a bound variable that represents the nodeId that is the source of the file.

log

This is the bound instance of an org.slf4j.Logger that can be used to log to the SymmetricDS log file.

Example 13. Example of a Before Copy Script
// locate the file that was extracted to the staging area for this batch
File file = new File(batchDir + "/" + sourceFilePath + "/" + sourceFileName);
if (file.exists()) {
    // copy the file to a backup directory before it is delivered
    String path = file.getAbsolutePath();
    cp (path, "/backup/" + sourceFileName);
}

4.12. File Routing

The file routing concept is identical to the Table Routing concept, except that it applies to files rather than database tables. The File Triggers specify which directory structures on the source node to watch, and the Routers define where the data is sent. As with Table Routing, the simplest way to set up file routing is via the Auto Create button.


The default entry point for the auto creation is from the file triggers perspective. This will allow you to select file triggers first then routers.

Click Next to select from available routers.


Select the routers (direction) to associate with the previously selected file triggers.

Click Save to set up the file triggers and begin file sync.

You can also select routers on the first screen of the auto create if you would like to select routers first then file triggers. Both approaches will result in the same file routing.


Select from available routers.

Click Next to select from available file triggers.


Select the file triggers to associate with the previously selected routers.

Click Save to set up the file routers and begin file sync.

The file routing elements created by the wizard can be maintained from the File Routing List Screen. New file routing entries can also be created manually from this screen if desired.

SymmetricDS will not create file triggers until they are associated with a router through file routing.
Once a file routing relationship is created, changes in the base directory will be captured and file sync will begin.
Required Fields
File Triggers

The file trigger to link.

Routers

The router to link.

Target Base Directory

The base directory on the target node that files will be synchronized to.

Conflict Strategy

The strategy to employ when a file has been modified at both the client and the server.

source_wins

The source file will be used when a conflict occurs.

target_wins

The target file will be used when a conflict occurs.

manual

If a conflict occurs the batch will be put in ER (error) status and require manual intervention to resolve the issue.

newer_wins

If a conflict occurs, the last modified dates of the incoming file and the file currently on disk are compared, and the newer file wins. This assumes the system clocks on each machine are in sync in order to perform an accurate comparison.

older_wins

If a conflict occurs, the last modified dates of the incoming file and the file currently on disk are compared, and the older file wins. This assumes the system clocks on each machine are in sync in order to perform an accurate comparison.

Advanced Options
Initial Load Enabled

Indicates whether this file trigger should be initial loaded.

Enabled

Indicates whether this file trigger router is enabled or not.

4.13. Conflicts

Conflict detection is the act of determining if an insert, update or delete is in "conflict" due to the target data row not being consistent with the data at the source prior to the insert/update/delete. Without any overriding configuration, the system uses a detection of USE_CHANGED_DATA and a resolution of NEWER_WINS by default.

Conflicts are broken into 3 key components in SymmetricDS:
  1. Detection - How to detect a conflict when loading data at the target.

  2. Resolution - When a row is in conflict, what to do with it when loading at the target.

  3. Ping Back - What data to capture at target during resolution, if any, that will be sent back to source.

Required Fields
Conflict Id

Unique identifier for a specific conflict detection setting.

Group Link

References a node group link.

Detection Type

Indicates the strategy to use for detecting conflicts during a dml action.

Conflicts are detected while data is being loaded into a target system.
Table 8. Detection Types

USE_PK_DATA

Indicates that only the primary key is used to detect a conflict. If a row exists with the same primary key, then no conflict is detected during an update or a delete. Updates and deletes are resolved using only the primary key columns. If a row already exists during an insert, then a conflict has been detected.

USE_CHANGED_DATA

Indicates that the primary key plus any data that has changed on the source system will be used to detect a conflict. If a row exists with the same old values on the target system as they were on the source system for the columns that have changed on the source system, then no conflict is detected during an update or a delete. If a row already exists during an insert then a conflict has been detected.

USE_OLD_DATA

Indicates that all of the old data values are used to detect a conflict. Old data is the data values of the row on the source system prior to the change. If a row exists with the same old values on the target system as they were on the source system, then no conflict is detected during an update or a delete. If a row already exists during an insert then a conflict has been detected.

USE_TIMESTAMP

Indicates that the primary key plus a timestamp column (as configured in detect_expression) will indicate whether a conflict has occurred. If the target timestamp column is not equal to the old source timestamp column, then a conflict has been detected. If a row already exists during an insert, then a conflict has been detected.

USE_VERSION

Indicates that the primary key plus a version column (as configured in detect_expression) will indicate whether a conflict has occurred. If the target version column is not equal to the old source version column, then a conflict has been detected. If a row already exists during an insert, then a conflict has been detected.

Detection Expression

An expression that provides additional information about the detection mechanism. If the detection mechanism is use_timestamp or use_version then this expression will be the name of the timestamp or version column. The detect_expression is also used to exclude certain column names from being used. For example, to exclude column1 and column2, the expression is "excluded_column_names=column1,column2".

Resolution Type

The choice of how to resolve a detected conflict is configured via the resolve type.

Table 9. Resolution Types

NEWER_WINS

Indicates that when a conflict is detected, either the source or the target will win, based on which side has the newer timestamp or higher version number. With USE_TIMESTAMP detection, the column specified in detect_expression is used; otherwise the time of capture is used.

FALLBACK

Indicates that when a conflict is detected the system should automatically apply the changes anyway. If the source operation was an insert, then an update will be attempted. If the source operation was an update and the row does not exist, then an insert will be attempted. If the source operation was a delete and the row does not exist, then the delete will be ignored.

IGNORE

Indicates that when a conflict is detected the system should automatically ignore the incoming change. Use IGNORE between two node groups in one direction, and FALLBACK in the other direction to establish which group wins a conflict.

MANUAL

Indicates that when a conflict is detected, the batch will remain in error until manual intervention occurs. A row in error is inserted into the INCOMING_ERROR table, which includes the conflict ID, old data, new data, and current data at the target. The user can specify the resolve data to use on the next load attempt. The resolve_ignore flag can also be used to indicate that the row should be ignored.

To make a primary node group always win a conflict, use a "fallback" resolution on group links where primary is the source and an "ignore" resolution on group links where primary is the target.
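As a sketch of that tip, conflict rows implementing "primary wins" might look like the following, assuming 'corp' is the primary group and 'store' is the secondary group.

insert into SYM_CONFLICT (conflict_id, source_node_group_id,
        target_node_group_id, detect_type, resolve_type, ping_back,
        create_time, last_update_time) values
        ('corp-wins-outbound', 'corp', 'store', 'USE_CHANGED_DATA',
        'FALLBACK', 'OFF', current_timestamp, current_timestamp);

insert into SYM_CONFLICT (conflict_id, source_node_group_id,
        target_node_group_id, detect_type, resolve_type, ping_back,
        create_time, last_update_time) values
        ('corp-wins-inbound', 'store', 'corp', 'USE_CHANGED_DATA',
        'IGNORE', 'OFF', current_timestamp, current_timestamp);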
Ping Back

For each configured conflict, you also have the ability to control if and how much "resolved" data is sent back to the node whose data change is in conflict.

Table 10. Ping Backs

SINGLE_ROW

The resolved data of the single row in the batch that caused the conflict is sent back to the originating node. Recommended use with MANUAL resolution, so the resolved data is sent back to the originating node.

REMAINING_ROWS

The resolved data of the single row in the batch in conflict, along with the entire remainder of the batch, is sent back to the originating node.

OFF

No data is sent back to the originating node, even if the resolved data doesn’t match the data the node sent. Recommended use with resolution types that choose a winning row, including NEWER_WINS and when IGNORE and FALLBACK are used on opposing group links.

Advanced Options
Channel

Optional channel that this setting will be applied to.

Target Catalog

Optional database catalog that the target table belongs to. Only use this if the target table is not in the default catalog.

Target Schema

Optional database schema that the target table belongs to. Only use this if the target table is not in the default schema.

Target Table

Optional database table that this setting will apply to. If left blank, the setting will be for any table in the channel (if set) and in the specified node group link.

Be aware that conflict detection will not detect changes to binary columns in the case where use_stream_lobs is true in the trigger for the table. In addition, some databases do not allow comparisons of binary columns whether use_stream_lobs is true or not.
Some platforms do not support comparisons of binary columns. Conflicts in binary column values will not be detected on the following platforms: DB2, DERBY, ORACLE, and SQLSERVER.

4.14. Transforms

Transforms allow you to manipulate data on a source node or target node, as the data is being loaded or extracted.

The source table must first be configured for synchronization (see Table Triggers) before it can be transformed.

The source trigger captures data changes that are passed through transformations during extraction from the source and during loading at the target. If any enhancement with additional data is needed, an extract transform can access the source database, while a load transform can access the target database. A single row change may be processed by multiple transforms on the same group link, which transforms it into multiple target rows. Rows for the same target table do not merge, which means multiple transforms with the same target table results in multiple rows for that target table.

Required Fields
Transform Id

Unique identifier of a specific transform.

Group Link

The group link defining which direction the transform will process.

Transform Point

Where this transform will occur. The options include:

Table 11. Transform Points

EXTRACT

The transform will execute while data is being extracted from the source. This means the transform will have access to the source’s database.

LOAD

The transform will execute while data is being loaded into the target. This means the transform will have access to the target’s database.

Column Policy

Indicates whether unspecified columns are passed through or if all columns must be explicitly defined. The options include:

SPECIFIED

Indicates that only the transform columns that are defined will be the ones that end up as part of the transformation.

IMPLIED

Indicates that columns not specified are passed through from the source to the target. This is useful if you just want to map a table from one name to another or from one schema to another. It is also useful if you want to transform a table but also want to pass it through. You would define an implied transform from the source to the target and would not have to configure each column.

Source Catalog

Name of the catalog of the configured source table. This should only be set if Use Source Catalog/Schema or Target Catalog are set on the Router. Parameter values can be substituted using $(name) syntax.

Source Schema

Name of the schema for the configured source table. This should only be set if Use Source Catalog/Schema or Target Schema are set on the Router. Parameter values can be substituted using $(name) syntax.

Source Table

The name of the source table that will be transformed. Parameter values can be substituted using $(name) syntax.

Target Catalog

Optional name for the catalog the target table is in. Only use this if the target table is not in the default catalog. Parameter values can be substituted using $(name) syntax.

Target Schema

Optional name of the schema the target table is in. Only use this if the target table is not in the default schema. Parameter values can be substituted using $(name) syntax.

Target Table

The name of the target table. Parameter values can be substituted using $(name) syntax.

Advanced Options
Update First

This option overrides the default behavior for an insert operation. Instead of attempting the insert first, SymmetricDS will always perform an update first and then fall back to an insert if that fails. Note that, by default, fallback logic always applies for inserts and updates. Here, all you are specifying is whether to always do an update first, which can have performance benefits in certain situations.

Delete Action

An action to take upon delete of a row.

Table 12. Delete Actions

DEL_ROW

The delete results in a delete of the row as specified by the pk columns defined in the transformation configuration.

UPDATE_COL

The delete results in an update operation on the target which updates the specific rows and columns based on the defined transformation.

NONE

The delete results in no target changes.

Update Action

An action to take upon update of a row.

UPD_ROW

The update performs normally.

INS_ROW

The update is transformed into an insert instead.

DEL_ROW

The update is transformed into a delete instead.

NONE

The update is ignored and no changes are made.

Transform Order

For a single source operation that is mapped to a transformation, there could be more than one target operation that takes place. You may control the order in which the target operations are applied through a configuration parameter defined for each source-target table combination. This might be important, for example, if the foreign key relationships on the target tables require you to execute the transformations in a particular order.

4.14.1. Columns

Transforms are not complete until the columns involved in the transformation have been defined. Typically there will be several columns defined for each transform, each of which will define a source column and a target column.

To see a transform’s columns, select the transform and click "Edit Columns".

transform edit columns
PK

Indicates that this mapping is used to define the "primary key" for identifying the target row(s) (which may or may not be the true primary key of the target table). This is used to define the "where" clause when an Update or Delete occurs on the target.

Unless the column policy is "IMPLIED", at least one row marked as a pk should be present for each transform_id.
Source

The source column name to be transformed.

Target

The target column name to be transformed.

Transform On

Defines whether this entry applies to source operations of Insert, Update, Delete, or All.

Type

The name of a specific type of transform; the default type is "copy". See Transform Types for more information.

Expression

An expression that is specific to the type of transform that is configured in transform_type. See Transform Types for more information.

Order

In the event there is more than one column to transform, this defines the relative order in which the transformations are applied.

4.14.2. Data Format

Transforms have access to the change data in character form using a standard set of data formats. Modifications to change data should also be character data that follows the format of the target field.

Table 13. Data Formats
Type Format

binary

Hex or Base64 encoding, depending on the source database dialect. Use context.getBatch().encodeBinary() and context.getBatch().decodeBinary().

boolean

Either 1 or 0. (Some databases require a small integer to represent boolean.)

date

Formatted as yyyy-MM-dd.

float

Decimal digits with a decimal point expressed as a period or comma. A negative (-) sign and an exponent (e or E) are also allowed.

geometry

Well-known text (WKT) format

integer

Decimal digits. A negative (-) sign is also allowed.

numeric

Decimal digits with a decimal point expressed as a period or comma. A negative (-) sign is also allowed.

time

Formatted as time HH:mm:ss.S or a timestamp yyyy-MM-dd HH:mm:ss.S.

timestamp

Formatted as yyyy-MM-dd HH:mm:ss.S.
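
For example, a BeanShell transform can convert the character form of a binary column to raw bytes and back using the batch helpers named above. This is a minimal sketch; the exact helper signatures are assumed:

byte[] bytes = context.getBatch().decodeBinary(currentValue);
// ... inspect or modify the bytes as needed ...
// Re-encode to the character form expected by the target.
return context.getBatch().encodeBinary(bytes);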

4.14.3. Transform Types

There are several pre-defined transform types. Additional ones can be defined by creating and configuring an extension point which implements the IColumnTransform interface.

Table 14. Transform Types
Type Description

additive

For numeric fields, calculate the difference between the old and new source values and apply that difference to the target.

bsh

Run a BeanShell script to return a value.

bleft

Copy the leftmost bytes of the value.

clarionDateTime

Convert a source column with a Clarion integer into a timestamp.

const

Use a provided constant value.

copy

Copy source column value to the target column.

copyIfChanged

Copy the source column value only if its old and new values are different.

columnsToRowsKey

Convert columns to rows, specifying which columns are the key for all rows.

columnsToRowsValue

Convert columns to rows, specifying which columns are the changing values.

deletedColumns

Comma-separated list of column names that were changed to null.

identity

Set INSERT_IDENTITY to OFF so the next identity value is generated by the database.

isBlank

If value is empty after trimming spaces, then replace its value.

isEmpty

If value is empty, then replace its value.

java

Run Java code to return a value.

isNull

If value is null, then replace its value.

left

Copy the leftmost characters of the value.

lookup

Run a SQL query to return a value.

math

Evaluate a math expression.

multiply

Run a SQL query with the primary key to create multiple rows.

remove

Remove the column from the row.

substring

Copy a substring of the value.

valueMap

Map from one literal value to another.

variable

Use a built-in variable value.

Copy Transform

This transformation type copies the source column value to the target column. This is the default behavior.

transform type copy
Remove Transform

This transformation type excludes the source column from the target. It is only valid for a column policy of 'IMPLIED', where all the columns from the source are automatically copied to the target.

transform type remove
Constant Transform

This transformation type allows you to map a constant value to the given target column. The constant itself is placed in the transform expression.

transform type const
Variable Transform

This transformation type allows you to map a built-in dynamic variable to the given target column. The variable name is placed in the transform expression. For example, mapping the system_timestamp variable to a last_update_time target column stamps each row with the time it was transformed. The following variables are available:

Table 15. Variables

batch_id

Batch number

batch_start_time

Timestamp of when the batch started to process

delete_indicator_flag

Whether or not this is a delete event, in format of Y or N.

null

null value

old_column_value

column’s old value prior to the DML operation.

source_catalog_name

Catalog name of source table

source_dml_type

Event type, usually one of INSERT, UPDATE, or DELETE, but possibly one of RELOAD, SQL, CREATE, or BSH.

source_node_id

node id of the source (from the batch)

source_node_id_from_data

source_node_id value from sym_data (source of a captured synchronization data change)

source_schema_name

Schema name of the source table

source_table_name

Name of the source table

system_date

current system date

system_timestamp

current system date and time using default timezone

system_timestamp_utc

current system date and time using UTC timezone

target_node_id

node id of the target (from the batch)

transform type variable
Additive Transform

This transformation type is used for numeric data. It computes the change between the old and new values on the source and then adds the change to the existing value in the target column. That is, target = target + multiplier * (source_new - source_old), where multiplier is a constant found in the transform expression (default is 1 if not specified).

Example 14. Additive Transform Example

If the source column changed from a 2 to a 4, the target column is currently 10, and the multiplier is 3, the effect of the transform will be to change the target column to a value of 16 ( 10+3*(4-2) ⇒ 16 ).

In the case of deletes, the new column value is considered 0 for the purposes of the calculation.
transform type additive
Substring Transform

This transformation computes a substring of the source column data and uses the substring as the target column value. The transform expression can be a single integer (n, the beginning index) or a pair of comma-separated integers (n,m, the beginning and ending index). The transform behaves as the Java substring function would using the specified values in the transform expression. For example, an expression of 2,5 applied to the value "transform" yields "ans".

transform type substring
Left Transform

This transform copies the leftmost number of characters specified in the transform expression. For example, an expression of 3 applied to "transform" yields "tra".

BLeft Transform

This transform copies the leftmost number of bytes specified in the transform expression.

Lookup Transform

This transformation determines the target column value by using a query, contained in the transform expression, to look up the value in another table. The query must return a single row, and the first column of the query is used as the value. The query can reference source column values by prefixing them with a colon (e.g., :MY_COLUMN). You can also reference old values with :OLD_COLUMN and previously transformed columns (see transform order) with :TRM_COLUMN.
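
For example, a lookup expression might resolve a price from a hypothetical item table using the source row's item_id column (the table and column names here are illustrative):

select price from item where item_id = :ITEM_ID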

transform type lookup
Multiply Transform

This transformation allows for the creation of multiple rows in the target table based on the transform expression. This transform type can only be used on a primary key column. The transform expression is a SQL statement, similar to the lookup transform, except it can return multiple rows that result in multiple rows for the target table. The first column of the query is used as the value for the target column. The query can reference source column values by prefixing them with a colon (e.g., :MY_COLUMN).
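
For example, a multiply expression might fan a single change out to one target row per store in a region (hypothetical tables and columns):

select store_id from store where region_id = :REGION_ID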

BeanShell Script Transform

This transformation allows you to provide a BeanShell script in the transform expression, which is executed at the time of transformation. BeanShell transforms can return either a String value or an instance of NewAndOldValue. The following variables are provided to the script:

Table 16. Variables

<COLUMN_NAME>

A variable for each source column of the row being changed, named as the column name in uppercase (replace <COLUMN_NAME> with your column name)

currentValue

The value of the current source column

oldValue

The old value of the source column for an updated row

sqlTemplate

org.jumpmind.db.sql.ISqlTemplate object for querying or updating the database

channelId

name of the channel on which the transformation is happening

sourceNode

org.jumpmind.symmetric.model.Node object that represents the node from where the data came

sourceNodeId

same as sourceNode.getNodeId()

sourceNodeGroupId

same as sourceNode.getNodeGroupId()

sourceNodeExternalId

same as sourceNode.getNodeExternalId()

targetNode

org.jumpmind.symmetric.model.Node object that represents the node where the data is being loaded.

targetNodeId

same as targetNode.getNodeId()

targetNodeGroupId

same as targetNode.getNodeGroupId()

targetNodeExternalId

same as targetNode.getNodeExternalId()

transformColumn

org.jumpmind.symmetric.io.data.transform.TransformColumn that is the transform configuration

includeOn

org.jumpmind.symmetric.io.data.transform.TransformColumn.IncludeOnType, same as transformColumn.getIncludeOn(), tells whether column transform is configured for all, insert, update, or delete

sourceSchemaName

source schema name that the transform matched

sourceCatalogName

source catalog name that the transform matched

sourceTableName

source table name that the transform matched

transformedData

org.jumpmind.symmetric.io.data.transform.TransformedData, the model object representing the outputted transformed data

sourceDmlType

org.jumpmind.symmetric.io.data.DataEventType that is the source row change type, either insert, update, or delete

sourceDmlTypeString

same as sourceDmlType.toString(), returning insert, update, or delete

log

org.slf4j.Logger, write to the log file

context

org.jumpmind.symmetric.io.data.DataContext containing internal variables and also acts like a Map for sharing variables between transforms for the current sync session

bshContext

java.util.Map, static map of variables to share between transforms

engine

org.jumpmind.symmetric.ISymmetricEngine, access to engine functions and services

Example 15. Transform Expression Example Returning a String
if (currentValue > oldValue) {
	return currentValue * .9;
} else {
	return PRICE;
}
Example 16. Transform Expression Example Returning a NewAndOldValue object
if (currentValue != null && currentValue.length() == 0) {
	return new org.jumpmind.symmetric.io.data.transform.NewAndOldValue(null, oldValue);
} else {
	return currentValue;
}
Example 17. Transform Expression Example Accessing Old/New Values for the Additional Column 'path'
String newFilePath = PATH;
String oldFilePath = null;
if (transformedData.getOldSourceValues() != null) {
    oldFilePath = transformedData.getOldSourceValues().get("path");
}
if (oldFilePath == null) {
    return newFilePath;
} else {
    return oldFilePath;
}
transform type beanshell
Identity Transform

This transformation allows you to insert into an identity column by letting the database compute a new identity, instead of loading an explicit value from the source database. This transform is needed on databases like SQL Server and Sybase, which have an INSERT_IDENTITY option that is normally ON during data sync. By using the identity transform, INSERT_IDENTITY is set to OFF, so the next value is generated by the database.

transform type identity
Mathematical Transform

This transformation allows you to evaluate a mathematical expression placed in the transform expression. The following variables are provided (a sample expression follows the list):

#{COLUMN_NAME}

A variable for a source column in the row, where the variable name is the column name in uppercase (replace COLUMN_NAME with your column name).

#{currentValue}

The value of the current source column

#{oldValue}

The old value of the source column for an updated row.
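
For example, assuming a numeric PRICE source column (hypothetical), a transform expression could apply a 7% markup to the value being loaded:

#{PRICE} * 1.07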

transform type math
Figure 5. Transform Expression Example
Copy If Changed

This transformation will copy the value to the target column only if the source value has changed. More specifically, the copy will occur if the old value of the source does not equal the new value.

Table 17. Target Expression Options

IgnoreColumn

If old and new values are equal, the COLUMN will be ignored

{empty string}

If old and new values are equal, the ROW will be ignored

transform type copy if change
Figure 6. If values are equal, ignore only the COLUMN
transform type copy if change blank
Figure 7. If values are equal, ignore the entire ROW
Value Map Transform

This transformation allows for simple value substitutions through use of the transform expression. The transform expression should consist of a space-separated list of value pairs in the format sourceValue=targetValue. The column value is used to locate the correct sourceValue, and the transform changes the value into the corresponding targetValue. A sourceValue of * can be used to represent a default targetValue in the event that the sourceValue is not found. Otherwise, if no default value is found, the result will be null.

Example 18. Value Map Examples

Each example below uses the transform expression s1=t1 s2=t2 s3=t3 *=t4.

Source Value Target Value (Result)

s1

t1

s2

t2

s3

t3

s4

t4 (default)

s5

t4 (default)

null

t4 (default)

transform type valueMap
Clarion Date Time

Convert a Clarion date column, with an optional time column, into a timestamp. Clarion dates are stored as the number of days since December 28, 1800, while Clarion times are stored as hundredths of a second since midnight, plus one. For example, a Clarion time of 360001 represents (360001 - 1) / 100 = 3600 seconds after midnight, or 01:00:00. Use a source column of the Clarion date and a target column of the timestamp. If the Clarion time exists in a separate column, it can optionally be provided through the transform expression to be included in the target timestamp column.

Columns To Rows

Convert column values from a single source row into a row per column value at the target. Two column mappings are needed to complete the work:

columnsToRowsKey

Maps which source columns are used and the key value each produces, with an expression of the form:

column1=key1,column2=key2

columnsToRowsValue

Maps the column that provides the value. Its expression supports two options:

changesOnly=true

Convert only rows where the old and new values have changed

ignoreNulls=true

Convert only rows that are not null


Example 19. Example

"fieldid" mapped as "columnsToRowsKey" with expression of "user1=1,user2=2" and column "color" mapped as "columnsToRowsValue" would convert a row with columns named "user1" and "user2" containing values "red" and "blue" into two rows with columns "fieldid" and "color" containing a row of "1" and "red" and a row of "2" and "blue".

isEmpty Transform

This transformation checks whether a string is null or zero length. If it is empty, the replacement value will be used. If no replacement value is provided, null will be used as the default replacement for empty values.

isBlank Transform

This transformation checks whether a string is null or zero length after trimming white space. If it is blank, the replacement value will be used. If no replacement value is provided, null will be used as the default replacement for blank values.

Null Value Transform

This transformation checks whether the source value is null and, if so, replaces it with the provided value.

Deleted Columns Transform

For an update, this transform returns a comma-separated list of column names that were set to null and were previously not null.

Java Transform

Use Java code in the transform expression; it will be included in the transform method of a class that extends JavaColumnTransform. The class is compiled whenever the transform expression changes and is kept in memory for runtime. The code must return a String for the new value of the column being mapped.

Some variables are provided to the code:

Table 18. Variables
Variable Name Java Type Description

platform

org.jumpmind.db.platform.IDatabasePlatform

The platform for the database that this node is connected to

context

org.jumpmind.symmetric.io.data.DataContext

The data context for the synchronization of the current batch

column

org.jumpmind.symmetric.io.data.transform.TransformColumn

The transform column

data

org.jumpmind.symmetric.io.data.transform.TransformedData

The transformed data

sourceValues

java.util.Map<java.lang.String, java.lang.String>

The map of source values

newValue

java.lang.String

The captured new value

oldValue

java.lang.String

The captured old value

Example 20. Transform Expression Example Returning a String
if (sourceValues.containsKey("OLDKEY")) {
    return sourceValues.get("OLDKEY");
} else {
    return sourceValues.get("NEWKEY");
}

4.14.4. Virtual Columns

Transforms provide the ability to create "virtual columns" which can pass data between nodes for use by other SymmetricDS processes.

Use cases for virtual columns
  1. Extract transform adds virtual column to be processed by a target load transform.

  2. Extract transform adds virtual column to be processed by a target load filter.

  3. Extract transform adds virtual column to be processed by a source router.

Example 21. Example of an extract transform passing a virtual column to a target load transform
transform virtual columns
Figure 8. Create two transforms, one for extract and one for target using different group links
transform virtual columns extract
Figure 9. Create lookup transform for the extract transform to create a new virtual column to be sent to target.
transform virtual columns load
Figure 10. Create copy transform for the load transform to populate the cost column from the virtual column that was sent over.

4.15. Load Filters

Load Filters are a way to take a specific action when a row of data is loaded by SymmetricDS at a destination database node.

Load filters run for each row of data being loaded.
load filter
Required Fields
Filter Id

The unique identifier for the load filter

Group Link

The group link for which the load filter will be applied.

Type

The type of load filter. Currently, only BeanShell, Java, and SQL are supported ('BSH', 'Java', 'SQL').

Target Table

The target table for which the load filter will execute when changes occur.

Use the wildcard * to specify all tables configured through the group link. Partial table names in conjunction with a wildcard are NOT supported. If the wildcard is used, it should be the only value.
Filter Order

The order in which load filters should execute if there are multiple scripts pertaining to the same source and target data.

Advanced Options
Filter On Update

Determines whether the load filter takes action (executes) on a database update statement.

Filter On Insert

Determines whether the load filter takes action (executes) on a database insert statement.

Filter On Delete

Determines whether the load filter takes action (executes) on a database delete statement.

Fail On Error

Whether the batch should fail if the filter fails.

Target Catalog

The name of the target catalog for which you would like to watch for changes.

Target Schema

The name of the target schema for which you would like to watch for changes.

4.15.1. Load Filter Scripts

Load filters are based on the execution of a script. A script can be set to execute at any of six different points; a script can be provided for one or more of these execution points.

Return Values
  • Return true to load the row of data.

  • Return false to not load the row of data.

load filter edit scripts
Figure 11. Edit load filter scripts by selecting the load filter and clicking the Edit Scripts button.
Available Load Filter Scripts
Before Write Script

The script to execute before the database write occurs.

After Write Script

The script to execute after the database write occurs.

Batch Complete Script

The script to execute after the entire batch completes.

Batch Commit Script

The script to execute after the entire batch is committed.

Batch Rollback Script

The script to execute if the batch rolls back.

Handle Error Script

A script to execute if data cannot be processed.

Table 19. Variables available within scripts
Variable Available In Description

engine

BSH

The SymmetricDS engine object.

COLUMN_NAME

BSH, SQL

The source values for the row being inserted, updated, or deleted.

OLD_COLUMN_NAME

BSH, SQL

The old values for the row being inserted, updated, or deleted.

context

BSH, Java

The data context object for the data being inserted, updated, or deleted.

table

BSH, Java

The table object for the table being inserted, updated, or deleted.

data

BSH, Java

The CsvData object for the data change.

error

BSH, Java

The java.lang.Exception that was raised.

Example 22. Example of simple load filter
load filter example 1
Figure 12. Create new load filter
load filter example 1 bsh
Figure 13. Provide a "Before Write" bsh script.
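
A minimal "Before Write" script, assuming the loaded rows carry a STATUS source column (hypothetical), might skip rows flagged as inactive:

// Return false to skip loading the row; return true to load it normally.
if (STATUS != null && STATUS.equals("INACTIVE")) {
    return false;
}
return true;
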
Example 23. Example load filter to send email on error
load filter example
Figure 14. Create new load filter
load filter example bsh
Figure 15. Select the filter and click the "Edit Scripts" button, then select "Handle Error Script" and enter the BSH script below.
BeanShell Expression
authListener = new javax.mail.Authenticator() {
  protected javax.mail.PasswordAuthentication getPasswordAuthentication() {
    return new javax.mail.PasswordAuthentication(engine.getParameterService().getString("mail.smtp.username"),
       engine.getParameterService().getString("mail.smtp.password"));
  }
};

if (bsh.shared.mailMap == void) {
  bsh.shared.mailMap = new HashMap();
}

String batchId = context.getBatch().getNodeBatchId();
String targetNodeId = context.getBatch().getTargetNodeId();
if (!bsh.shared.mailMap.containsKey(batchId)) {
  bsh.shared.mailMap.put(batchId, Boolean.TRUE);
  javax.mail.Session session = javax.mail.Session.getInstance
    (engine.getParameterService().getAllParameters(), authListener);
  javax.mail.internet.MimeMessage msg = new
    javax.mail.internet.MimeMessage(session);
  msg.setFrom(new javax.mail.internet.InternetAddress
    (engine.getParameterService().getString("mail.smtp.from")));
  msg.setRecipients(javax.mail.Message.RecipientType.TO,
    engine.getParameterService().getString("mail.smtp.to"));
  msg.setSubject("SymmetricDS - batch " + batchId + " is in error at node " + targetNodeId);
  msg.setSentDate(new java.util.Date());
  msg.setText(org.apache.commons.lang.exception.ExceptionUtils.
    getFullStackTrace(error));
  javax.mail.Transport.send(msg);

}

4.15.2. Custom Load Filters

Custom load filters can be created by implementing the IDatabaseWriterFilter, see IDatabaseWriterFilter for more information.
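
A minimal sketch of a custom filter in Java, assuming the DatabaseWriterFilterAdapter convenience base class and the beforeWrite callback (consult the interface source for the exact method signatures):

import org.jumpmind.db.model.Table;
import org.jumpmind.symmetric.io.data.CsvData;
import org.jumpmind.symmetric.io.data.DataContext;
import org.jumpmind.symmetric.io.data.writer.DatabaseWriterFilterAdapter;

// Hypothetical filter that skips loading rows for a table named "audit_log".
public class SkipAuditLogFilter extends DatabaseWriterFilterAdapter {

    @Override
    public boolean beforeWrite(DataContext context, Table table, CsvData data) {
        // Returning false skips the row; returning true loads it normally.
        return !"audit_log".equalsIgnoreCase(table.getName());
    }
}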

4.16. Extensions

Extensions are custom code written to a plug-in interface, which allows them to run inside the engine and change its default behavior. Saving extension code in the configuration has the advantage that it runs dynamically, without deployment or a restart. Configured extensions are available to other nodes and move between environments when configuration is exported and imported.

Extension Id

Identifier for a unique extension entry.

Extension Type

Type of extension, either written in Java or BeanShell. Java extensions are compiled to bytecode on first use and may be compiled to native code by the Just-In-Time (JIT) compiler, giving them the best performance. BeanShell extensions are parsed on first use and interpreted at runtime, but they are easier to write because of loose typing and syntax shortcuts.

Table 20. Options for Extension Type

Java

Indicates that Java code is provided in the extension text.

BSH

Indicates that BeanShell code is provided in the extension text. Built-in variables are available for engine, sqlTemplate, and log.

Interface Name

The full class name for the interface implemented by the extension, including the package name. Only needed for extension type of BSH.

Node Group Id

The node group where this extension will be active and run.

Enabled

Whether or not the extension should be run.

Extension Order

The order to register extensions when multiple extensions for the same interface exist.

Extension Text

The code for the extension that will be compiled or interpreted at runtime.

Example 24. BSH extension that adds a new transform for masking characters

Add a new transform type called "mask" that replaces all characters in a string with an asterisk except the last number of characters specified by the user in the expression. This BeanShell extension uses the ISingleValueColumnTransform interface and applies only to the "corp" node group.

Click the New button to create a new extension.

extension new

Click the Edit Script button to edit the script for the extension.

import org.apache.commons.lang.StringUtils;

isExtractColumnTransform() {
    return true;
}

isLoadColumnTransform() {
    return true;
}

transform(platform, context, column, data, sourceValues, newValue, oldValue) {
    if (StringUtils.isNotBlank(newValue)) {
        String expression = column.getTransformExpression();
        if (StringUtils.isNotBlank(expression)) {
            count = newValue.length() - Integer.parseInt(expression.trim());
            return StringUtils.repeat("*", count) + newValue.substring(count);
        }
    }
    return newValue;
}
Extensions Tips and Tricks
For BeanShell, implement only the methods needed from an interface, then write a special method, invoke(method, args) {}, that will be called for any unimplemented methods.
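
For instance, appended to a script like the masking example above, a catch-all might look like:

invoke(method, args) {
    // Called for any interface method the script does not implement; do nothing.
}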

4.17. Jobs

In SymmetricDS, jobs are tasks that are scheduled to run by a job manager. These jobs do the majority of the synchronization work for SymmetricDS. In addition to the built-in jobs, you can create custom jobs in Configuration to run for specified node groups.

customJob
Required Fields
Job Name

The Job Name is a unique identifier that refers to the job.

Job Type

The type of job. Possible types are listed below.

Type Provided Description

BSH

x

A job that will compile and run a BeanShell script.

JAVA

x

A job that will compile and run Java code.

SQL

x

A job that will run a SQL script.

Node Group

Targets the job at a specific node group id. To target all groups, use the value of 'ALL'.

Schedule

Specifies how often this job should run. If the schedule is just a number, it is interpreted as milliseconds. So 60000 means to run the job every minute. Otherwise, a cron expression can be specified. Some sample cron expressions include:

  • Every fifth minute: 0 0/5 * * * *

  • Every night at midnight: 0 0 0 * * *

  • Every ten minutes at the 1 o’clock hour: 0 0/10 1 * * *

Note that parameter values will be created for the schedule, in the form of job.<name>.cron or job.<name>.period.time.ms, depending on whether a cron expression or a period of time was entered. If both the cron and period parameters are present, the cron will take precedence.

Advanced Topics
Job Expression

The payload of the job. For BSH jobs, this should be a BeanShell script. For Java jobs, this should be the Java code of a class that implements the IJob interface. For SQL jobs, this should be a SQL script. BSH jobs are written in BeanShell script (http://www.beanshell.org/). The following variables are available to the BeanShell script:

  • engine - An instance of the current ISymmetricEngine

  • sqlTemplate - An instance of ISqlTemplate, which can be used to run SQL statements.

  • log - A logger that can be used to write messages to the log.

This is an example BSH job:

    // Configuration for this extension...
    final String TABLE_NAME = "item";
    final String CHANNEL_ID = "item";
    final String ROUTER_ID = "corp_2_store";
    // End Configuration.
    String INSERT_RELOAD_DATA_TEMPLATE =
            "insert into sym_data ( table_name, event_type, row_data, trigger_hist_id, channel_id, transaction_id, create_time)" +
            "    select  t.source_table_name, 'R', 'item_id > ''1''', " +
            "            h.trigger_hist_id, t.channel_id, '1', current_timestamp" +
            "        from sym_trigger t inner join sym_trigger_router tr on" +
            "            t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on" +
            "            h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist" +
            "                where trigger_id=t.trigger_id)" +
            "    where channel_id=? and" +
            "        tr.router_id like ? and" +
            "        (t.source_table_name = ?)" +
            "    order by tr.initial_load_order asc;";

    int updatedCount = sqlTemplate.update(INSERT_RELOAD_DATA_TEMPLATE, new Object[]{CHANNEL_ID, ROUTER_ID, TABLE_NAME});
    if (updatedCount == 1) {
        log.info("Issued reload for table '" + TABLE_NAME + "'");
    } else {
        throw new SymmetricException("SyncOnHeartbeatExtension is designed to insert exactly 1 sym_data row. Instead inserted " +
                updatedCount + ". Check TABLE_NAME, CHANNEL_ID, ROUTER_ID parameters in the extension itself.", null);
    }

This is the same job written as a Java job. It might perform a little better, but you’ll need a JDK (not just a JRE) available on your nodes for it to compile.

    // Configuration for this extension...
    final String TABLE_NAME = "item";
    final String CHANNEL_ID = "item";
    final String ROUTER_ID = "corp_2_store";
    // End Configuration.
    String INSERT_RELOAD_DATA_TEMPLATE =
            "insert into sym_data ( table_name, event_type, row_data, trigger_hist_id, channel_id, transaction_id, create_time)" +
            "    select  t.source_table_name, 'R', 'item_id > ''1''', " +
            "            h.trigger_hist_id, t.channel_id, '1', current_timestamp" +
            "        from sym_trigger t inner join sym_trigger_router tr on" +
            "            t.trigger_id=tr.trigger_id inner join sym_trigger_hist h on" +
            "            h.trigger_hist_id=(select max(trigger_hist_id) from sym_trigger_hist" +
            "                where trigger_id=t.trigger_id)" +
            "    where channel_id=? and" +
            "        tr.router_id like ? and" +
            "        (t.source_table_name = ?)" +
            "    order by tr.initial_load_order asc;";

    int updatedCount = sqlTemplate.update(INSERT_RELOAD_DATA_TEMPLATE, CHANNEL_ID, ROUTER_ID, TABLE_NAME);
    if (updatedCount == 1) {
        log.info("Issued reload for table '" + TABLE_NAME + "'");
    } else {
        throw new org.jumpmind.symmetric.SymmetricException("SyncOnHeartbeatExtension is designed to insert exactly 1 sym_data row. Instead inserted " +
                updatedCount + ". Check TABLE_NAME, CHANNEL_ID, ROUTER_ID parameters in the extension itself.");
    }

This is an example of a SQL job that updates the heartbeat_time of the current node. Available tokens are: :NODE_ID and :NODE_GROUP_ID.

update sym_node_host set heartbeat_time = now() where node_id = :NODE_ID;
delete from item where item_id = 0;
insert into ITEM (ITEM_ID, NAME) values (0, 'testing');

To add your job expression/script, navigate to the Configure Jobs page, select your job, and click the "Edit Script" button. The following editor will appear.

editScript

4.18. Parameters

Parameters can be used to help tune and configure your SymmetricDS configuration. Parameters can be set for an individual node or for all nodes in your network.

See Parameter List for a complete list of parameters.

4.18.1. Targeting

Parameters can be set for a given node or node group to allow more flexibility in configuration.

parameters target
Figure 16. Setting parameters by target
Only certain parameters can be edited on this screen. To edit a SymmetricDS startup parameter, you must go to the Manage→Startup Parameters screen or locate and edit the engine.properties file.

4.18.2. Filtering

Parameters can be filtered in order to help find specific parameters or parameters named a certain way.