You want to synchronize data in a local database with a SyncML server and there is no SyncML client which supports that? Perhaps you are using KDE PIM or GPE? This post explains how you can write your own SyncML client for them using the SyncEvolution framework. There are several arguments in favor of taking that route compared to directly using a library like the one provided by Funambol:
Of course, not all of these arguments may apply. Perhaps some information is still missing. In that case I count on your feedback and questions! I intend to keep this blog post up-to-date so that there always is a good starting point. For further information please refer to the Doxygen documentation of the different classes mentioned in this post.
2011-04-19: unfortunately I did not have enough time to keep the content really up-to-date. The whole backend API was redesigned. For the rationale and information about this change, see the mailing list discussion. The file backend continues to serve as a functional example for a source derived from TrackingSyncSource. The KCalExtended (aka mKCal) backend shows how to compose a source differently, with change tracking implemented using a backend specific method.
Enough said, let’s get down to business. You need some understanding of C++. A compile environment which is supported by autoconf/automake/libtool (i.e., a Linux/Unix/Mac OS X system or Windows with something like Cygwin) is useful; otherwise you’ll have to figure out yourself how to compile the Funambol client library and SyncEvolution. No knowledge of SyncML is required, but you need to know how to read and modify items in your database.
The code that you have to write currently has to be in C++. I have thought a bit about extending SyncEvolution with modules written in Python: it could be done by exposing the essential classes (there aren’t that many) to Python in a syncevolution module and calling a Python module by embedding the Python interpreter. By providing some more classes this might even become useful for GUI frontends, like Genesis (which currently calls the syncevolution binary and parses its output). If someone would like to have such a facility or (even better) wants to implement it, please leave a comment.
- Client and server:
- SyncML synchronizes data between unequal partners. The server is the central hub which coordinates data synchronization between multiple, possibly heterogeneous clients.
- Item:
- in SyncML, each item is an opaque binary blob as far as the protocol is concerned. An item can be in one of three different states: updated/added/deleted since the last sync.
- Database:
- that’s the collection of items that is to be kept consistent between client and server. “local” from a client developer’s perspective is the database which the client directly has access to.
- Synchronization session, or just “sync”:
- after initiating the session, first the client tells the server about its changes, the server imports them and then sends any changes that it has itself. This is the normal “two-way” sync. If a sync fails, then usually client and server cannot know for sure whether their peer has received all changes. The next sync then has to be a “slow” one where the client sends all items, the server compares against the ones it has, and sends back those items that the client doesn’t have or that must be updated. The comparison on the server is based on heuristics, so unwanted duplicates are likely. In this mode deleted items are recreated because the server cannot determine whether items missing on one side were intentionally deleted or lost during a failed sync. Other modes are also possible (sending changes only one way; replacing all items on one side with all items from the other side).
- Sync source:
- this is a term introduced by Funambol. It stands for the class which connects a specific database to the sync infrastructure, both on the server and in clients. In SyncEvolution, sync sources are implemented by backends. Each backend can provide one or more modules, which can be compiled into the main binary (easy to install and debug) or loaded dynamically (useful when some optional modules might have unsatisfied library dependencies). Sync sources and the data format they are expected to use are selected by a source’s type property. Multiple sync sources can be active during the same sync session. However, the databases on client and server have to be different. On the server, databases are identified by a URI. Strictly speaking, this is a path relative to the sync URL of the server (./contacts is the same as contacts), but some servers treat it as a string which has to match exactly (./contacts is not the same as contacts).
- Conflict:
- if an item was modified or deleted on the client and the server also has a change for that item, then the server must resolve this conflict. It cannot know which change was made earlier, because the clients do not send time stamps for their changes, so servers usually arbitrarily choose “server wins” or “client wins” and discard the other change. When dealing with two updated revisions of the same item, a server might also try to merge them automatically or keep both revisions. In the former case old, unwanted data might be preserved, in the latter case the user has to do the merging manually.
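The remark about database URIs in the “Sync source” entry can be illustrated with a small sketch. The function names are made up for this post and are not part of SyncEvolution or the Funambol library; the code only shows the difference between a server that treats the URI as a relative path and one that compares strings exactly.

```cpp
#include <string>

// Hypothetical helper: strips leading "./" so that "./contacts" and
// "contacts" refer to the same database.
std::string normalizeDatabaseURI(const std::string &uri)
{
    std::string result = uri;
    while (result.compare(0, 2, "./") == 0) {
        result.erase(0, 2);
    }
    return result;
}

// A server which interprets the URI as a path relative to its sync URL.
bool sameDatabaseAsPath(const std::string &a, const std::string &b)
{
    return normalizeDatabaseURI(a) == normalizeDatabaseURI(b);
}

// A server which insists on an exact string match.
bool sameDatabaseExact(const std::string &a, const std::string &b)
{
    return a == b;
}
```

When in doubt, configure exactly the URI that the server documentation lists, because that works with both kinds of servers.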
Ideally the SyncML server understands the data format that you intend to exchange with it. The standard formats are:
- Contacts:
- vCard 2.1/3.0, with 2.1 being supported by all servers
- Calendar events and tasks:
- vCalendar 1.0 or (less often, but more capable) iCalendar 2.0
- Notes:
- plain text, first line serves as summary
When using these standard formats, the server is aware of the semantics of the data and can do conversions, like vCalendar 1.0 <-> iCalendar 2.0. It can also handle clients which support only a subset of the full standard: when receiving an update of an item from such a client, the server has a chance to preserve properties that the client was unable to store. A dumb server would replace a complete item with the incomplete one just received from a less capable client.
If you want to synchronize other kinds of data, then you need a server which accepts arbitrary binary blobs and stores them verbatim. In the Funambol server one can configure a file sync source which does that. Going that route limits synchronization to clients which all use exactly the same binary format.
Planning your Sync Source
The first step is determining the data format. If you have access to code which imports/exports items in the local database in one of the standard formats, then you are ready to go forward. If not, then you will have to define your own mapping between the database and a suitable interchange format. This can easily be the hardest part of the whole exercise and is beyond the scope of this blog post. It is useful to support both vCard 2.1 and 3.0 to meet the server’s needs. The class EvolutionContactSource shows how this conversion can be done.
The next step is deciding about change tracking. There are two different ways to do that. The easy solution relies on the TrackingSyncSource to do all the heavy lifting for you. In this case your code must be able to iterate over all items in the database. It must provide a unique ID and a “revision string” for each item. Both are arbitrary strings, but keeping them simple (printable ASCII, no white spaces, no special characters) makes debugging easier because such strings can often be used as-is without escaping problematic characters. The ID only has to be unique among all items currently in the database, so recycling an ID after deleting an older item is possible. However, this makes debugging harder and is better avoided.
The “revision string” must change if the item was changed and should not change when the item was not changed. If the revision string changes between syncs, then the item is sent as “updated” to the server even if it hasn’t really changed. This is usually not a problem, but can lead to unnecessary conflicts and wastes some bandwidth.
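To make the idea of ID plus revision string more concrete, here is a minimal sketch of how changes can be derived from them. It is not the code inside TrackingSyncSource; the Changes struct and detectChanges() are names invented for this illustration, and I assume that the ID/revision pairs of the previous sync were stored somewhere.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical result type: item IDs grouped by what happened to them.
struct Changes {
    std::set<std::string> added;
    std::set<std::string> updated;
    std::set<std::string> deleted;
};

// Both maps go from item ID to revision string. "previous" is the snapshot
// stored at the end of the last sync, "current" is what iterating over the
// database yields now.
Changes detectChanges(const std::map<std::string, std::string> &previous,
                      const std::map<std::string, std::string> &current)
{
    Changes changes;
    for (const auto &entry : current) {
        auto old = previous.find(entry.first);
        if (old == previous.end()) {
            changes.added.insert(entry.first);    // ID was not known before
        } else if (old->second != entry.second) {
            changes.updated.insert(entry.first);  // revision string differs
        }
    }
    for (const auto &entry : previous) {
        if (current.find(entry.first) == current.end()) {
            changes.deleted.insert(entry.first);  // ID no longer in database
        }
    }
    return changes;
}
```

An item whose revision string changed without a real modification ends up in updated, which is exactly the harmless but wasteful case described above.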
If the database keeps a modification date and time for each item, then a textual representation of that time stamp is a good revision string. Take care that changes made after a sync really assign a different time stamp: if the time stamp has seconds as resolution, then waiting for one second at the end of a sync is a good idea. This can be done in the close() method of your sync source.
If there is no modification time stamp, then a hash of a textual representation of the item can be used instead. The drawback is that even minor changes in the way the item is formatted make it look like the item was modified.
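A hash-based revision string could look like the following sketch. The choice of 64-bit FNV-1a and the function name revisionFromContent() are my assumptions for this example, nothing that SyncEvolution prescribes. I deliberately avoid std::hash here because its result is only guaranteed to be stable within one run of a program, whereas a revision string has to survive until the next sync.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// 64-bit FNV-1a hash of the item's textual representation, returned as
// 16 hex digits so that it is a simple, printable ASCII revision string.
std::string revisionFromContent(const std::string &itemText)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : itemText) {
        hash ^= c;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    char buffer[17];
    std::snprintf(buffer, sizeof(buffer), "%016llx",
                  static_cast<unsigned long long>(hash));
    return buffer;
}
```

Feeding the hash with a normalized export of the item (fixed property order, fixed line endings) reduces the false “updated” reports caused by formatting differences.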
Instead of using TrackingSyncSource, it is also possible to derive directly from its base class, EvolutionSyncSource. In that case change tracking has to be implemented explicitly, possibly by utilizing facilities in the database API. This is not explained further here; for an example, see
One last point worth further consideration is whether the backend shall support more than one local database, how the user can select them with the evolutionsource property and whether the backend creates databases that don’t exist yet. Supporting more than one database is required for automatic testing against a server; creating them automatically simplifies the setup of that testing. The convention adopted by the existing backends is that databases starting with file:// are created. Other strings have to match an existing database. This is done to catch typos.
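The database selection convention can be sketched as follows. DatabaseAction and selectDatabase() are hypothetical names, and opening an already existing file:// database instead of recreating it is my assumption about the sensible behavior; the existing backends may differ in such details.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

enum class DatabaseAction { Create, OpenExisting };

// Decides what to do with the value of the evolutionsource property.
// "existing" lists the databases that the backend found.
DatabaseAction selectDatabase(const std::string &requested,
                              const std::vector<std::string> &existing)
{
    if (std::find(existing.begin(), existing.end(), requested) != existing.end()) {
        return DatabaseAction::OpenExisting;
    }
    if (requested.compare(0, 7, "file://") == 0) {
        return DatabaseAction::Create;  // explicit request, create on demand
    }
    // Anything else is most likely a typo: refuse instead of guessing.
    throw std::runtime_error("no such database: " + requested);
}
```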
SyncEvolution keeps track of whether a sync source has failed in a property of the EvolutionSyncSource base class. If only one of several active sources fails, then the sync continues with the remaining ones. For failed sources SyncEvolution forces a slow sync by resetting all sync anchors (a string which is stored in the client and the server) before a sync and only updating the anchors of sources which have not failed at the end of a sync. For the failed ones the server will detect the anchor mismatch during the next sync and force a slow sync.
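The anchor handling is easier to follow in code. This is a strongly simplified model with invented names (SourceState, beginSync(), endSync(), serverForcesSlowSync()); the real protocol exchanges “last” and “next” anchors, which I collapse into a single string per source here.

```cpp
#include <map>
#include <string>

struct SourceState {
    std::string anchor;   // anchor remembered on the client side
    bool failed = false;  // set when the source reports an error
};

using Sources = std::map<std::string, SourceState>;

// Before the sync: forget all anchors, so that an aborted sync leaves
// nothing behind that could match the server's anchor.
void beginSync(Sources &sources)
{
    for (auto &entry : sources) {
        entry.second.anchor.clear();
        entry.second.failed = false;
    }
}

// After the sync: only sources which did not fail get the new anchor.
void endSync(Sources &sources, const std::string &newAnchor)
{
    for (auto &entry : sources) {
        if (!entry.second.failed) {
            entry.second.anchor = newAnchor;
        }
    }
}

// Server side check at the start of the next sync.
bool serverForcesSlowSync(const std::string &clientAnchor,
                          const std::string &serverAnchor)
{
    return clientAnchor != serverAnchor;
}
```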
There was a discussion on the Funambol developers list about resuming from a failed sync more gracefully. This is still in the planning stage and the necessary support in the client library has not been implemented yet.
Errors have to be reported back to the caller via a C++ exception. The methods EvolutionSyncSource/Client::throwError() should be used for that. The one in EvolutionSyncSource adds the name of the source to the error description and also sets its state to “failed”.
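The following stand-alone sketch mimics that behavior. ExampleSource is a made-up class, not the real EvolutionSyncSource, and the exact signature and exception type of the real throwError() may differ; please check the header before relying on it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a sync source with error tracking.
class ExampleSource {
public:
    explicit ExampleSource(const std::string &name) :
        m_name(name), m_failed(false) {}

    bool hasFailed() const { return m_failed; }

    // Marks the source as failed and reports the error to the caller,
    // prefixed with the source name.
    [[noreturn]] void throwError(const std::string &description)
    {
        m_failed = true;
        throw std::runtime_error(m_name + ": " + description);
    }

    // Example of a method which reports a problem via throwError().
    void readItem(const std::string &id)
    {
        if (id.empty()) {
            throwError("cannot read item without ID");
        }
        // ... access the database here ...
    }

private:
    std::string m_name;
    bool m_failed;
};
```

The caller catches the exception, logs the message and can later ask each source whether it failed.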
This section describes how to create your new backend and how to compile it as part of SyncEvolution using the autotools. As a first step make sure that you can compile the SyncEvolution 0.8 beta 2 (or later) source .tar.gz, using configure && make. You don’t need Evolution for that: there is a local file backend which compiles on all platforms. In the next step check that you can rebuild configure and the Makefiles by invoking ./autogen.sh. You may have to install autotools/automake/libtool packages for that to succeed.
Backends are stored in subdirectories of src/backends. Each subdirectory contains one subconfigure.in fragment that gets included in the base configure and a corresponding Makefile.am. This slightly unusual setup has the advantage that configure --help lists all options and common tests only need to be run once. Maintainer mode (= regenerating configure and rerunning it when one of the build files is edited) still works.
To create a new backend, copy the existing src/backends/file directory. It is a very minimal backend, so you can easily replace the parts that have to be implemented differently for your sync source. At a minimum, change the enable part of subconfigure.in and rename the .cpp|h files. Remember to rename the files accordingly in Makefile.am. There can be more than one backend per directory. The two original Evolution backends for contacts and calendars/task lists/notes are built like that inside backends/evolution, sharing a common
In the *Register.cpp file(s) your backend adds itself to the SyncEvolution framework, simply by being linked into the executable or (if you link dynamically) when loading the module. You decide how users select the sync source via the type. In order to work with the default configuration templates, treat the standard names (such as addressbook) as an alias for your source. The first backend which accepts a certain type “wins”, therefore also add a dedicated type that can be used to select your backend directly in case of conflicts. Your backend will always be asked before the file backend, but might come after other backends.
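Why a dedicated type is needed becomes obvious when the “first one wins” rule is written down. The sketch below uses an invented BackendEntry struct and findBackend() function plus made-up type names like my-contacts; the real registration goes through RegisterSyncSource and looks different.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical registry entry: a backend and the type values it accepts.
struct BackendEntry {
    std::string name;
    std::vector<std::string> types;
};

// Returns the first backend in registration order which accepts the type,
// or nullptr if none does.
const BackendEntry *findBackend(const std::vector<BackendEntry> &registry,
                                const std::string &type)
{
    for (const BackendEntry &backend : registry) {
        if (std::find(backend.types.begin(), backend.types.end(), type) !=
            backend.types.end()) {
            return &backend;
        }
    }
    return nullptr;
}
```

With a registry of "other" (accepting addressbook and other-contacts) followed by "mybackend" (accepting addressbook and my-contacts), the generic addressbook goes to "other", while my-contacts always reaches "mybackend".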
In the same file you’ll also register tests for your new class. We’ll go into that in more detail below.
Note that none of the changes above require modifications to SyncEvolution source files. This is intentional: you can start working on your own sync source without needing write access to the SyncEvolution repository. Instead of working with the source tar ball (as suggested above for simplicity reasons) you can also check out the source directly and keep track of changes in SyncEvolution and the client library that way. If you decide to keep your source separate from SyncEvolution permanently, then that’s okay, but I would rather like to have all code in one place (no copyright transfer needed!). Ask [for Subversion access], and it shall be given you…
Have a look at the base class that you are deriving from: both src/core/TrackingSyncSource.h as well as src/core/EvolutionSyncSource.h contain comments that describe what you are expected to implement. Running make doc in the root directory invokes Doxygen, which turns these comments into HTML files in the html directory of the build directory. Doxygen and dot (from the graphviz package) must be installed.
I hope that those explanations are good enough so that I don’t have to add further instructions here. If not, please let me know in the comments.
You have implemented your own backend and it compiles? Great, now let’s make sure that it really works.
In this section I’m assuming that you implemented one sync source, configured the compilation so that only your backend is active (check the configuration summary at the end of the configure output, use --disable-foo as necessary) and are not using a modular build (the default, i.e., no --enable-shared). When I write about invoking syncevolution, I mean the binary which is created in the src directory. Be careful not to accidentally invoke one that might be installed elsewhere…
First run syncevolution --source-property type=?. The description should contain an entry for your sync source. If not, check that your registration code is linked into the main binary and invoked.
Then create a configuration. For each of the four standard sources (addressbook, calendar, todo, text) the createSource() that you have provided as part of an RegisterSyncSource instance will be called to check whether it can potentially provide such data. If it returns a sync source, then that sync source’s listDatabases() is called to verify that the user really has data of that kind. The convention is that the first database is used by sync sources by default. For testing purposes it makes sense to point the evolutionsource property towards a scratch database with no valuable data (I don’t have to tell you about backups, right?). If any of these steps yields no results, the source will be disabled. If you developed a special sync source which does not match with any of the standard sources, then you will have to create your own source config file.
syncevolution --print-servers prints the names of existing configurations and where they are stored.
Once you have created the configuration, you can use syncevolution --sync <mode> <server> <sourcename> to sync with your backend in different sync modes. Running syncevolution --sync ? prints a list of valid modes.
After creating a second configuration with the same server URI and a different local database, copying items and changes to and from the server can be tested. But this manual testing quickly gets tedious. Better and more convenient coverage of your code is achieved by also integrating your backend into the automatic SyncEvolution regression testing. This CPPUnit based testing was originally developed for SyncEvolution and later moved into the Funambol C++ client library so that other SyncML clients can also use it. It covers both local tests (importing/exporting items, change tracking) and synchronization with a server.
Local tests only require one database with change tracking for multiple different servers (something that TrackingSyncSource provides automatically). For real syncs, the sync source must support two different local databases: the first database is associated with a simulated client A that synchronizes via the server with a client B accessing the second database. Both “clients” run on the same machine in the same account because that’s where the main test program, client-test, runs. On Unix systems it might be possible to use multiple accounts if a sync source only supports one database per account (like the Mac OS X Addressbook), but that idea hasn’t been implemented yet.
Testing your sync source is plugged into the client-test test runner using the same *Register.cpp file that is also used for registering the sync source itself. These files will always be compiled with -DENABLE_INTEGRATION_TESTS when building them for client-test, therefore the tests are always available even when configuring without the corresponding --enable switches. They do not get in the way when including them in
Local instances of the RegisterSyncSourceTest class that you have to provide then add the tests: you have to choose a name for each test, which is how it will be listed by client-test: “file_vcard21” appears as “Client::Source::file_vcard21” and “Client::Sync::file_vcard21”. There are predefined tests for vCard 2.1/3.0 and iCalendar 2.0 events/tasks, but you can also provide your own test data in a ClientTest::Config struct from scratch (EvolutionCalendarSourceRegister.cpp) or just your own test case file (SQLiteContactSourceRegister.cpp). Unit tests for your code can also go into the
Here’s a step-by-step guide for getting started with automated testing, using ScheduleWorld as example:
=> should list your tests and create ~/.config/syncevolution/scheduleworld_configs, using local databases in /tmp/testing/. The corresponding evolutionsource is updated each time client-test is run, based on the current value of CLIENT_TEST_EVOLUTION_PREFIX. The default prefix is SyncEvolution_Test_. The name of your test and _2 will be appended. Your sync source must support creating these databases, otherwise you’ll have to use a different prefix and perhaps create suitable databases by hand.
- check the configuration and enter your server account information (the syncevolution is your friend here:
./client-test SyncEvolution Client::Source
=> runs all tests involving just local operations
=> runs one test that checks that one contact can be copied to and from the server using the two configurations
=> runs all tests which involve the SyncML server; tests with just one active source are run first, followed by the same tests with all enabled sources in two different orders
One not unlikely result is that items are slightly modified while they are handed from one client to the server and back to the other client. The synccompare script brings vCard/iCalendar items into a normal form before comparison (thus preventing diffs because of perfectly legitimate reformatting) and ignores changes that some servers are known to introduce. If you get diffs, then first check whether your sync source passes the local import test, Client::Source::*::testImport. Either fix that or use your own test data that doesn’t contain the problematic data. When a server is involved, the *send.client.A.log shows what was sent to it and *refresh.client.B.log what was received.
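To give an idea what “normal form” means, here is a tiny sketch in C++. It is not what synccompare does (that is a Perl script with many more rules); normalizeItem() is an invented function which only unfolds continuation lines, ignores the line ending style and sorts the lines. Sorting is only valid for flat items like a single vCard, because it would destroy the nesting of components in iCalendar data.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Returns the lines of one item in a canonical order, with folded lines
// (continuation lines start with a space or tab) joined again.
std::vector<std::string> normalizeItem(const std::string &item)
{
    std::vector<std::string> lines;
    std::istringstream in(item);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();                 // accept CRLF and LF
        }
        if (line.empty()) {
            continue;                        // skip blank lines
        }
        if ((line[0] == ' ' || line[0] == '\t') && !lines.empty()) {
            lines.back() += line.substr(1);  // unfold continuation line
        } else {
            lines.push_back(line);
        }
    }
    std::sort(lines.begin(), lines.end());   // property order is irrelevant
    return lines;
}
```

Two items are then considered equal if their normalized line lists are identical.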
If it was the server, then work with the server developer to improve the server and/or add more suppressions to the script. Unfortunately this is currently not possible without modifying the original file in the Funambol source code repository. Temporarily a local file in SyncEvolution’s test/synccompare.pl can be used to override the one from Funambol. The script also doesn’t handle normalization of vCard 2.1 and vCalendar 1.0 well (for instance, it doesn’t remove redundant QUOTED-PRINTABLE parameters). Patches welcome…
If tests fail in some other way, then have a look at the definitions of the tests (ClientTest.cpp) to find out what a failing test does. In the current directory there are detailed client log files that can help, too.
It’s your choice whether and how you publish your work. You can generate a source tar ball with the standard make dist and a binary package with the non-standard make distbin BINSUFFIX=foo. The latter will produce syncevolution-<version>-foo.tar.gz with all the files necessary to run SyncEvolution, plus documentation. There’s also a make deb BINSUFFIX=foo. It depends on a CheckInstall which contains some patches that I wrote. When releasing source code make sure that your source tar ball is complete with
Just remember that you are linking with SyncEvolution which is licensed under “GPL v2 or later”, so you must publish your source under a compatible license if you redistribute SyncEvolution. As I said above, no copyright transfer is necessary for backends. For the core SyncEvolution or backends that I maintain I would like to keep the copyright situation unambiguous, so if you want to contribute to that, I kindly ask for an informal copyright transfer. You’ll be granted full rights to your contributions, as if you still owned the copyright, of course.
Enough of this boring legalese. Happy hacking!