Copyright © 2007-2024 JumpMind, Inc
Version 3.14.16
Permission to use, copy, modify, and distribute this SymmetricDS User Guide for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies.
Preface
This user guide introduces SymmetricDS and its features for data synchronization. It is intended for users, developers, and administrators who want to install the software, configure synchronization, and manage its operation. Thank you to all the members of the open source community whose feedback and contributions helped us build better software and documentation. This version of the guide was generated on 2024-08-16.
1. Introduction
SymmetricDS is open source software for database and file synchronization, with support for multi-master replication, filtered synchronization, and transformation. It uses web and database technologies to replicate change data as a scheduled or near real-time operation, and it includes an initial load feature for full data loads. The software was designed to scale for a large number of nodes, work across low-bandwidth connections, and withstand periods of network outage.
1.1. System Requirements
SymmetricDS is written in Java and requires a Java Runtime Environment (JRE) Standard Edition (SE) or Java Development Kit (JDK) Standard Edition (SE) version 8.0 or above. Most major operating systems and databases are supported. See the list of supported databases in the Database Compatibility section. The minimum operating system requirements are:
-
Java SE Runtime Environment 8 or above
-
Memory - 64 MB available
-
Disk - 256 MB available
The memory, disk, and CPU requirements increase with the number of connected clients and the amount of data being synchronized. The best way to size a server is to simulate synchronization in a lower environment and benchmark data loading. However, a rule of thumb for servers is one server-class CPU with 2 GB of memory for every 500 MB/hour of data transfer and 350 clients. Multiple servers can be used as a cluster behind a load balancer to achieve better performance and availability.
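As a rough illustration of the rule of thumb above, the following sketch computes how many "one CPU with 2 GB of memory" units a server might need. It assumes that the larger of the two ratios (data transfer or client count) drives the estimate, which is one possible reading of the rule; the class and method names are hypothetical, and a benchmark in a lower environment remains the better guide.

```java
/** Hypothetical helper that applies the sizing rule of thumb; not part of SymmetricDS. */
final class ServerSizingSketch {
    static final double MB_PER_HOUR_PER_UNIT = 500.0; // from the rule of thumb
    static final double CLIENTS_PER_UNIT = 350.0;     // from the rule of thumb
    static final int MEMORY_GB_PER_UNIT = 2;          // from the rule of thumb

    /** Number of "one server-class CPU + 2 GB" units, assuming the larger ratio wins. */
    static int units(double mbPerHour, int clients) {
        int byTransfer = (int) Math.ceil(mbPerHour / MB_PER_HOUR_PER_UNIT);
        int byClients = (int) Math.ceil(clients / CLIENTS_PER_UNIT);
        return Math.max(1, Math.max(byTransfer, byClients));
    }

    /** Memory estimate in GB for the same workload. */
    static int memoryGb(double mbPerHour, int clients) {
        return units(mbPerHour, clients) * MEMORY_GB_PER_UNIT;
    }

    public static void main(String[] args) {
        double mbPerHour = 1200; // example workload, chosen for illustration
        int clients = 1000;
        System.out.println(units(mbPerHour, clients) + " CPU(s), "
                + memoryGb(mbPerHour, clients) + " GB of memory");
    }
}
```

When the estimate exceeds what a single machine can offer, the units can be spread across several servers in a cluster behind a load balancer.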
SymmetricDS Pro is accessed from a web console, which requires one of the following supported web browsers:
-
Google Chrome 23 or newer
-
Internet Explorer 8 or newer
-
Mozilla Firefox 17 or newer
-
Safari 6 or newer
1.2. Overview
A node is responsible for synchronizing the data from a database or file system with other nodes in the network using HTTP. Nodes are assigned to one of the node Groups that are configured together as a unit. The node groups are linked together with Group Links to define either a push or pull communication. A pull causes one node to connect with other nodes and request changes that are waiting, while a push causes one node to connect with other nodes when it has changes to send.
Each node is connected to a database with a Java Database Connectivity (JDBC) driver using a connection URL, username, and password. While nodes can be separated across wide area networks, the database a node is connected to should be located nearby on a local area network for the best performance. Using its database connection, a node creates tables as a Data Model for configuration settings and runtime operations. The user populates configuration tables to define the synchronization and the runtime tables capture changes and track activity. The tables to sync can be located in any Catalog and Schema that are accessible from the connection, while the files to sync can be located in any directory that is accessible on the local server.
At startup, SymmetricDS looks for Node Properties Files and starts a node for each file it finds, which allows multiple nodes to run in the same instance and share resources. The property file for a node contains its external ID, node group, registration server URL, and database connection information. The external ID is the name for a node used to identify it from other nodes. One node is configured as the registration server where the master configuration is stored. When a node is started for the first time, it contacts the registration server using a registration process that sends its external ID and node group. In response, the node receives its configuration and a node password that must be sent as authentication during synchronization with other nodes.
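To make the contents of a node properties file more concrete, the sketch below builds one in memory with java.util.Properties and prints it. The property keys are the names commonly used for engine properties and should be verified against the parameter reference for your version; every value (node names, URL, database settings) is a hypothetical placeholder.

```java
import java.util.Properties;
import java.util.TreeSet;

/** Illustrative only: assembles the settings a node properties file typically carries. */
final class NodePropertiesSketch {

    /** Returns properties for a hypothetical store node; all values are placeholders. */
    static Properties storeNode() {
        Properties p = new Properties();
        p.setProperty("engine.name", "store-001");   // name of this engine in the instance
        p.setProperty("group.id", "store");          // node group
        p.setProperty("external.id", "001");         // external ID that identifies the node
        // URL of the registration server (hypothetical host and path)
        p.setProperty("registration.url", "https://corp.example.com:31417/sync/corp-000");
        // JDBC connection information (hypothetical database)
        p.setProperty("db.driver", "org.h2.Driver");
        p.setProperty("db.url", "jdbc:h2:store001");
        p.setProperty("db.user", "symmetric");
        p.setProperty("db.password", "change-me");
        return p;
    }

    public static void main(String[] args) {
        Properties p = storeNode();
        // Print in key order, one key=value pair per line
        for (String key : new TreeSet<>(p.stringPropertyNames())) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }
}
```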
1.3. Architecture
Each subsystem in the node is responsible for part of the data movement and is controlled through configuration. Data flows through the system in the following steps:
-
Capture into a runtime table at the source database
-
Route for delivery to target nodes and group into batches
-
Extract and transform into the rows, columns, and values needed for the outgoing batch
-
Send the outgoing batch to target nodes
-
Receive the incoming batch at the target node
-
Transform into the rows, columns, and values needed for the incoming batch
-
Load data and return an acknowledgment to the source node
- Capture
-
Change Data Capture (CDC) for tables uses database triggers that fire and record changes as comma-separated values into a runtime table called DATA. For file sync, a similar mechanism is used, except changes to the metadata about files are captured. The changes are recorded as insert, update, and delete event types. The subsystem installs and maintains triggers on tables based on the configuration provided by the user, and it can automatically detect schema changes on tables and regenerate triggers.
- Route
-
Routers run across new changes to determine which target nodes will receive the data. The user configures which routers to use and what criteria are used to match data, creating subsets of rows if needed. Changes are grouped into batches and assigned to target nodes in the DATA_EVENT and OUTGOING_BATCH tables.
- Extract
-
Changes are extracted from the runtime tables and prepared to be sent as an outgoing batch. If large objects are configured for streaming instead of capture, they are queried from the table. Special event types like "reload" for Initial Loads are also processed.
- Transform
-
If transformations are configured, they operate on the change data either during the extract phase at the source node or the load phase at the target node. The node’s database can be queried to enhance the data. Data is transformed into the tables, rows, columns, and values needed for either the outgoing or incoming batch.
- Outgoing
-
The synchronization sends batches to target nodes to be loaded. Multiple batches can be configured to send during a single synchronization. The status of the batch is updated on the OUTGOING_BATCH table as it processes. An acknowledgment is received from target nodes and recorded on the batch.
- Incoming
-
The synchronization receives batches from remote nodes and the data is loaded. The status of the batch is updated on the INCOMING_BATCH table as it processes. The resulting status of the batch is returned to the source node in an acknowledgment.
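The route step described above can be pictured with a toy model. The sketch below is not how SymmetricDS is implemented. It only shows the idea of a router choosing target nodes for each captured change and of changes being grouped into size-limited batches per target node. The Change class, the routing rule (first column names the target node, otherwise send to all nodes), and the batch size are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of routing and batching; names and rules are hypothetical. */
final class RouteAndBatchSketch {

    /** A captured change: table, event type (I, U, or D), and comma-separated row values. */
    static final class Change {
        final String table;
        final char eventType;
        final String rowData;

        Change(String table, char eventType, String rowData) {
            this.table = table;
            this.eventType = eventType;
            this.rowData = rowData;
        }
    }

    /** Hypothetical router: a row goes to the node named in its first column, else to all nodes. */
    static List<String> route(Change change, List<String> allNodes) {
        String firstColumn = change.rowData.split(",", 2)[0];
        return allNodes.contains(firstColumn)
                ? Collections.singletonList(firstColumn)
                : allNodes;
    }

    /** Groups routed changes into batches of at most maxPerBatch changes per target node. */
    static Map<String, List<List<Change>>> batch(List<Change> changes, List<String> nodes, int maxPerBatch) {
        Map<String, List<List<Change>>> batches = new LinkedHashMap<>();
        for (Change change : changes) {
            for (String node : route(change, nodes)) {
                List<List<Change>> nodeBatches = batches.computeIfAbsent(node, k -> new ArrayList<>());
                boolean needNewBatch = nodeBatches.isEmpty()
                        || nodeBatches.get(nodeBatches.size() - 1).size() >= maxPerBatch;
                if (needNewBatch) {
                    nodeBatches.add(new ArrayList<>());
                }
                nodeBatches.get(nodeBatches.size() - 1).add(change);
            }
        }
        return batches;
    }

    /** Sample changes used by main. */
    static List<Change> sample() {
        return Arrays.asList(
                new Change("item", 'I', "001,apple"),
                new Change("item", 'U', "002,pear"),
                new Change("item", 'D', "001,apple"),
                new Change("price", 'I', "all,1.99"));
    }

    public static void main(String[] args) {
        Map<String, List<List<Change>>> batches = batch(sample(), Arrays.asList("001", "002"), 2);
        batches.forEach((node, list) ->
                System.out.println("node " + node + " has " + list.size() + " batch(es)"));
    }
}
```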
1.4. Features
SymmetricDS offers a rich set of features with flexible configuration for large scale deployment in a mixed environment with multiple systems.
-
Web UI - The web console provides easy configuration, management, and troubleshooting.
-
Data Synchronization - Change data capture for relational databases and file synchronization for file systems can be periodic or near real-time, with an initial load feature to fully populate a node.
-
Central Management - Configure, monitor, and troubleshoot synchronization from a central location where conflicts and errors can be investigated and resolved.
-
Automatic Recovery - Data delivery is durable and low maintenance, withstanding periods of downtime and automatically recovering from a network outage.
-
Secure and Efficient - Communication uses a data protocol designed for low bandwidth networks and streamed over HTTPS for encrypted transfer.
-
Transformation - Manipulate data at multiple points to filter, subset, translate, merge, and enrich the data.
-
Conflict Management - Enforce consistency of two-way synchronization by configuring rules for automatic and manual resolution.
-
Extendable - Scripts and Java code can be configured to handle events, transform data, and create customized behavior.
-
Deployment Options - The software can be installed as a self-contained server that stands alone, deployed to a web application server, or embedded within an application.
1.5. Why SymmetricDS?
SymmetricDS is a feature-rich data synchronization solution that focuses on ease of use, openness, and flexibility. The software encourages interoperability and accessibility for users and developers with the availability of source code, an application programming interface (API), and a data model supported by documentation. Configuration includes a powerful set of options to define node topology, communication direction, transformation of data, and integration with external systems. Through scripts and Java code, the user can also extend functionality with custom behavior. With a central database for setup and runtime information, the user has one place to configure, manage, and troubleshoot synchronization, with changes taking immediate effect across the network.
The trigger-based data capture system is easy to understand and widely supported by database systems. Table synchronization can be set up by users and application developers without requiring a database administrator to modify the server. Triggers are database objects written in a procedural language, so they are open for examination, and include flexible configuration options for conditions and customization. Some overhead is associated with triggers, but they perform well for applications of online transaction processing, and their benefits of flexibility and maintenance outweigh the cost for most scenarios.
Using an architecture based on web server technology, many simultaneous requests can be handled at a central server, with proven deployments in production supporting more than ten thousand client nodes. Large networks of nodes can be grouped into tiers for more control and efficiency, with each group synchronizing data to the next tier. Data loading is durable and reliable because batches are tracked in transactions and faults are retried for automatic recovery, making it a low maintenance system.
1.6. License
SymmetricDS Pro is commercial software that is licensed, not sold. It is subject to the terms of the End User License Agreement (EULA) and any accompanying JumpMind Support Contract. See the standard SymmetricDS Pro license for reference, but your agreement with JumpMind may be different.
2. Installation
SymmetricDS at its core is a web application. A SymmetricDS instance runs within the context of a web application container like Jetty or Tomcat, and uses web based protocols like HTTP to communicate with other instances.
An instance has one of the following installation options:
-
Standalone Installation - SymmetricDS is installed and run as a standalone process using the built-in Jetty web server. This is the simplest and recommended way to install an instance.
-
Web Archive (WAR) - A SymmetricDS web archive (WAR) file is deployed to an existing web application container that is separately installed, maintained and run.
-
Embedded - SymmetricDS is embedded within an existing application. In this option, a custom wrapper program is written that calls the SymmetricDS API to synchronize data.
2.1. Standalone Installation
The SymmetricDS Pro setup program is an executable JAR file that can run on any system with a Java Runtime Environment (JRE). See System Requirements for prerequisites. Download the setup program from SymmetricDS Pro Downloads.
Run the setup program:
-
From a desktop environment, double click the symmetric-pro-<version>-setup.jar file
-
If double clicking doesn’t work, use a command prompt to run:
java -jar symmetric-pro-<version>-setup.jar
-
From a text-based environment, use a terminal to run:
java -jar symmetric-pro-<version>-setup.jar -console
The first screen shows the SymmetricDS Pro software version. The setup program will ask a series of questions before writing files to disk.
To begin selecting options, click Next.
Carefully read the SymmetricDS Pro License Agreement.
If you accept, select I accept the terms of this license agreement and click Next.
Specify Install new software to install a new version of SymmetricDS for the first time.
For upgrading an existing installation of SymmetricDS, see Upgrading.
Click Next to continue.
Choose the installation path where SymmetricDS will either be installed or upgraded. If the directory does not already exist, it will be created for you. Make sure your user has permission to write to the file system.
After entering the directory path, click Next.
Select the packages you want to install and verify disk space requirements are met. By default, all packages are selected. Drivers for popular databases are included, but they can be unselected if you don’t plan to use them.
After selecting packages, click Next.
SymmetricDS can either be run automatically by the system or manually by the user. Select the Install service to run automatically checkbox to install a Windows service or Unix daemon that will start SymmetricDS when the computer is restarted. The service can be installed or uninstalled later using the Control Center or command line (see Running as a Service).
Select the Run server after installing checkbox to also run SymmetricDS after installation so it can be used immediately.
After selecting options, click Next.
HTTPS and HTTP/2 protocols are recommended for protecting data security. For testing without security or encryption, the HTTP protocol can be enabled. Choose an available port number to listen on, which will be validated.
Java Management eXtension (JMX) is an optional way to manage the server from third party tools like JConsole. Most installations leave it disabled and use the web console for management.
Click Next to continue.
Specify how much memory to use for sending and receiving data changes. More memory is needed to communicate with multiple clients and when data contains large objects (LOB). Estimate an extra 5 MB of memory for each client and each 500 MB/hour of data transfer.
Click Next to continue.
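The memory estimate for this step can be worked through with a small calculation. The sketch below assumes the guideline means 5 MB per client plus 5 MB per 500 MB/hour of transfer (rounded up), which is one reading of the sentence; the class and method names are hypothetical.

```java
/** Hypothetical helper for the "extra 5 MB" memory guideline; not part of SymmetricDS. */
final class SyncMemoryEstimateSketch {

    /** Extra memory in MB, assuming 5 MB per client plus 5 MB per started 500 MB/hour. */
    static long extraMemoryMb(int clients, double mbPerHour) {
        long transferUnits = (long) Math.ceil(mbPerHour / 500.0);
        return 5L * clients + 5L * transferUnits;
    }

    public static void main(String[] args) {
        // Example workload chosen for illustration: 100 clients, 1000 MB/hour
        System.out.println(extraMemoryMb(100, 1000) + " MB extra");
    }
}
```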
Specify disk space options for temporarily staging incoming and outgoing data changes. Using staging helps the overall performance of the system and minimizes use of the database. The default location is the "tmp" sub-directory of the installation directory. For Clustering, specify a common network share.
Click Next to continue.
Confirm your installation settings look correct.
Click Next to begin installing files.
The packages you selected are installed to disk.
After it finishes, click Next.
During the finish step, the setup program installs the service and starts it if you selected those options.
After it finishes, click Next.
The installation is now complete. Choose if you want to open the SymmetricDS Pro Control Center where you can view the server status and open a web console.
Click Done to exit the setup program.
From the SymmetricDS Pro Control Center, you can start/stop the server, open the web console, and install/uninstall the service.
To begin configuration of SymmetricDS, check that the server is running, and then click Open Web Console.
To continue setup and configuration of SymmetricDS, refer to the Setup section.
2.2. Running as a Service
SymmetricDS can be configured to start automatically when the system boots, running as a Windows service or Linux/Unix daemon.
A wrapper process starts SymmetricDS and monitors it, so it can be restarted if it runs out of memory or exits unexpectedly.
The wrapper writes standard output and standard error to the logs/wrapper.log file.
SymmetricDS Pro may already have been installed as a service by the setup program. This section shows how to install the service manually from the command line.
2.2.1. Running as a Windows Service
To install the service, run the following command as Administrator:
bin\sym_service.bat install
Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as Administrator:
bin\sym_service.bat uninstall
To start and stop the service manually, run the following commands as Administrator:
bin\sym_service.bat start
bin\sym_service.bat stop
2.2.2. Running as a Linux/Unix daemon
An init script is written to the system /etc/init.d directory.
Symbolic links are created for starting on run levels 2, 3, and 5 and stopping on run levels 0, 1, and 6.
To install the script, run the following command as root:
bin/sym_service install
Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as root:
bin/sym_service uninstall
To start and stop the service manually, run the following commands:
bin/sym_service start
bin/sym_service stop
2.3. Clustering
A single SymmetricDS node can be deployed across a series of servers to cooperate as a cluster. A node can be clustered to provide load balancing and high availability.
Each node in the cluster shares the same database. A separate hardware or software load balancer is required to receive incoming requests and direct them to one of the backend nodes. Use the following steps to set up a cluster:
-
Set the cluster.lock.enabled property to true
-
Optionally, set the cluster.server.id property to a unique name, otherwise the hostname will be used
-
Set the sync.url property to the URL of the load balancer
-
Set the initial.load.use.extract.job.enabled property to false if using local staging
-
Copy the engine properties, security/keystore, and conf/sym_service.conf files to each installation
-
Configure the load balancer for sticky sessions
With the cluster.lock.enabled property set to true, jobs will acquire an entry in the LOCK table to ensure that only one instance of the job runs across the cluster. When a lock is acquired, a row is updated in the lock table with the time of the lock and the server ID of the locking job. The locking server ID defaults to the host name, but it can be specified with the cluster.server.id property if nodes are running on the same server. Another instance of the job cannot acquire a lock until the locking instance releases the lock and sets the lock time back to null. If an instance is terminated while the lock is still held, an instance with the same server ID is allowed to re-acquire the lock. If the locking instance remains down, the lock can be broken after it expires, specified by the cluster.lock.timeout.ms property. Jobs refresh their lock periodically as they run, which prevents a lock from expiring due to a long run time.
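The locking rules above can be summarized in a short in-memory model. This is a sketch of the described behavior only, not the SymmetricDS implementation: a real cluster stores this state in a row of the LOCK table, and the class, field, and method names here are hypothetical. The timeout argument stands in for the cluster.lock.timeout.ms property.

```java
/** In-memory model of one job lock; illustrates the rules only, names are hypothetical. */
final class ClusterLockSketch {
    private final long timeoutMs;     // stands in for cluster.lock.timeout.ms
    private String lockingServerId;   // server ID of the last locking job
    private Long lockTimeMs;          // null means the lock is released

    ClusterLockSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Acquires the lock if it is free, held by the same server ID, or expired. */
    synchronized boolean tryAcquire(String serverId, long nowMs) {
        boolean free = lockTimeMs == null;
        boolean sameServer = serverId.equals(lockingServerId);
        boolean expired = !free && nowMs - lockTimeMs > timeoutMs;
        if (free || sameServer || expired) {
            lockingServerId = serverId;
            lockTimeMs = nowMs;
            return true;
        }
        return false;
    }

    /** A running job refreshes its lock so a long run does not let it expire. */
    synchronized void refresh(String serverId, long nowMs) {
        if (lockTimeMs != null && serverId.equals(lockingServerId)) {
            lockTimeMs = nowMs;
        }
    }

    /** Releasing sets the lock time back to null. */
    synchronized void release(String serverId) {
        if (serverId.equals(lockingServerId)) {
            lockTimeMs = null;
        }
    }

    public static void main(String[] args) {
        ClusterLockSketch lock = new ClusterLockSketch(1000);
        System.out.println("a acquires: " + lock.tryAcquire("a", 0));
        System.out.println("b acquires while held: " + lock.tryAcquire("b", 500));
        System.out.println("b acquires after expiry: " + lock.tryAcquire("b", 2000));
    }
}
```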
The load balancer should be configured to use sticky sessions if the cluster will receive push synchronization. Push connections first request a reservation from the target node and then connect again using the reservation to push changes. Sticky sessions ensure that the push request is sent to the same server where the reservation is held.
Staging is writing batches to disk before sending over the network, which can use local disk or a shared network drive.
Staging can improve performance by reducing the time that resources are held open in the database and by extracting batches before they are served.
To use local staging in a cluster, disable the initial.load.use.extract.job.enabled property so the initial load will extract batches on the node serving the request, rather than extracting in the background on a different node.
To use shared staging in a cluster, set the staging.dir property to the directory path of the network drive and enable the cluster.staging.enabled property so files are locked during use. With shared staging, the initial load extracts in the background on one node, but batches can be served from any of the nodes in the cluster, which can improve performance.
When deploying nodes in a cluster to an application server like Tomcat or JBoss, the application server does NOT need any clustering of sessions configured.
2.4. Other Deployment Options
Installing SymmetricDS as a standalone service is recommended, but other deployment options are available.
2.4.1. Web Archive (WAR)
This option means packaging a WAR file and deploying to your favorite web server, like Apache Tomcat. It’s a little more work, but you can configure the web server to do whatever you need. SymmetricDS can also be embedded in an existing web application, if desired.
As a web application archive, a WAR is deployed to an application server, such as Tomcat, Jetty, or JBoss. The structure of the archive will have a web.xml file in the WEB-INF folder, an appropriately configured symmetric.properties file in the WEB-INF/classes folder, and the required JAR files in the WEB-INF/lib folder.
A war file can be generated using the standalone installation’s symadmin utility and the create-war subcommand. The command requires the name of the war file to generate. It essentially packages up the web directory, the conf directory and includes an optional properties file. Note that if a properties file is included, it will be copied to WEB-INF/classes/symmetric.properties. This is the same location conf/symmetric.properties would have been copied to. The generated war distribution uses the same web.xml as the standalone deployment.
bin/symadmin -p my-symmetric-ds.properties create-war /some/path/to/symmetric-ds.war
2.4.2. Embedded
This option means you must write a wrapper Java program that runs SymmetricDS. You would probably use Jetty web server, which is also embeddable. You could bring up an embedded database like Derby or H2. You could configure the web server, database, or SymmetricDS to do whatever you needed, but it’s also the most work of the three options discussed thus far.
The deployment model you choose depends on how much flexibility you need versus how easy you want it to be. Both Jetty and Tomcat are excellent, scalable web servers that compete with each other and have great performance. Most people choose either the Standalone or Web Archive with Tomcat 5.5 or 6. Deploying to Tomcat is a good middle-of-the-road decision that requires a little more work for more flexibility.
A Java application with the SymmetricDS Java Archive (JAR) library on its classpath can use the SymmetricWebServer to start the server.
import org.jumpmind.symmetric.SymmetricWebServer;

public class StartSymmetricEngine {

    public static void main(String[] args) throws Exception {
        // first argument: properties file on the classpath; second argument: web directory
        SymmetricWebServer node = new SymmetricWebServer(
                "classpath://my-application.properties", "conf/web_dir");

        // this will create the database, sync triggers, start jobs running
        node.start(8080);

        // this will stop the node
        node.stop();
    }
}
This example starts the SymmetricDS server on port 8080. The configuration properties file, my-application.properties, is packaged in the application to provide properties that override the SymmetricDS default values. The second parameter to the constructor points to the web directory. The default location is web. In this example the web directory is located at conf/web_dir. The web.xml is expected to be found at conf/web_dir/WEB-INF/web.xml.
2.4.3. Client Mode
This option runs the SymmetricDS engine without a web server, so it can initiate push and pull requests, but not receive them. Without the web server, there are no open ports listening for sync requests, which can help with security requirements. Be aware that this also means losing access to the web console at this node and any enhanced troubleshooting provided by remote status.
The conf/sym_service.conf file has a parameter to start the service in client mode:
wrapper.app.parameter.3=--client
3. Setup
Once the SymmetricDS software is installed on a computer and an instance of it is running, the next step in setting up the synchronization scenario is to set up a SymmetricDS node within that running SymmetricDS instance. As a reminder, a SymmetricDS node is connected to a database or file system and is responsible for synchronizing that database’s data to other SymmetricDS nodes within the node network.
3.1. Node Type
When opening the web console, if there are no nodes defined within the running instance, the Connect Database Wizard will be displayed to guide you through the process of creating one. There are two types of nodes:
-
Setup New Replication - The primary node is typically the first node set up when creating a new replication scenario, and it serves as the central place where configuration is done. All configuration is stored in the database that this node is attached to.
-
Join Existing Replication - All other nodes join existing replication by registering with the primary node, where they receive a copy of the configuration and learn about other nodes.
A third option will perform the <