In a database synchronization system, typically you have a big central database on the server side and a large number of small databases each residing on a device. The central database contains data for all the devices while each device’s local database only contains the device’s private data and some shared data.
In Fig. 2 we illustrate the subsetting of the data of a big central DB table. A table has columns and rows. One or more columns form a primary key to identify a row. To define the data sub-set, you can first select columns of the table. Only primary key columns and selected columns will be synchronized to the devices. This is called vertical subsetting. All devices share the same vertical subsetting, meaning they all get the same set of columns.
In the more interesting horizontal data subsetting, you select a sub-set of the rows by specifying a set of primary key values. In the figure, the rows identified by primary keys <PK 1> and <PK 2> are chosen to be distributed to all devices. These rows contain shared data. The rows identified by primary keys <PK 3> and <PK 4> are chosen to be distributed to device # 1 only. These rows contain private data for device #1. Similarly, the rows identified by primary keys <PK 5> and <PK 6> are synchronized to device # 2.
Figure 2 Data Subsetting in Data Base Synchronization
The horizontal subsetting can be carried out through a SQL query that returns the primary key values of the row sub-set. When a table is published, the SQL query is defined with embedded parameters (optional). During subscription of a client to the sync schema containing the table, concrete values for the parameters are specified. Different values cause the SQL query to return a different sub-set of data to the clients.