Monday, July 18, 2005

Today I visited one of my customers to finish some work on a system we built. During the day a discussion came up about the tons of reference data that the system contains. Over the past few years the system’s functionality has grown enormously, and so has the need for reference data. Although the system isn’t based on a real service oriented design, most of it is exposed as web services because the functionality (and data) is also used by other systems at the customer’s location. At first the discussion covered system-related topics, but it soon turned into a more general discussion about how to handle data in a service oriented system.

Two weeks ago I attended a presentation by Ron Jacobs (Product Manager, Patterns & Practices) at Tech-Ed 2005 Europe about handling data in a service oriented world. It was very easy for me to use some of his content to keep the discussion alive. Below are some parts of the discussion:

The purpose of reference data is mainly to fill out requests and interpret response messages between services. Reference data is maintained by the service itself. The service implements a separate interface that can be used by service consumers to retrieve the reference data for a service. Further, reference data must be versioned, and gets updated periodically.

The service that maintains the reference data must provide its consumers with a mechanism to retrieve the latest version of that data. The publisher of the reference data might implement some sort of notification mechanism to tell its subscribers about a new version of the data. A service consumer can store the latest version of the reference data in a local format to optimize communication with the service. Of course this has consequences for development, because the functionality for storing the reference data has to be built as well.
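A minimal sketch of this idea in Python, under my own assumptions: the class and method names (`ReferenceDataService`, `ConsumerCache`, `latest_version`) are hypothetical and not from the presentation, and the version check stands in for whatever cheap call or notification the real service would offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceDataSet:
    """One published version of a service's reference data."""
    version: int
    entries: dict


class ReferenceDataService:
    """Hypothetical service-side interface for reference data."""

    def __init__(self):
        self._current = ReferenceDataSet(version=1, entries={})

    def latest_version(self):
        return self._current.version

    def get_latest(self):
        return self._current

    def publish(self, entries):
        # Every update produces a new version instead of changing the old one.
        self._current = ReferenceDataSet(self._current.version + 1, dict(entries))
        return self._current


class ConsumerCache:
    """Consumer-side local store that refetches only when its copy is stale."""

    def __init__(self, service):
        self._service = service
        self._local = None
        self.fetch_count = 0  # counts full downloads, for illustration only

    def get(self):
        if self._local is None or self._local.version < self._service.latest_version():
            self._local = self._service.get_latest()
            self.fetch_count += 1
        return self._local
```

With a notification mechanism in place, the consumer would not even need to ask for the latest version number; it would simply refresh its local copy when told to.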

Interaction between services includes request, response and reference data.

Data (a message) is immutable and therefore needs a unique identifier. This identifier might include a version number of the data. Retrieving the data by its unique id should ALWAYS return the same data, no matter when the data is retrieved.
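As a rough illustration, here is a small Python sketch of a store that hands out ids containing a version number and never lets a stored message change. The `MessageStore` class and the `name:vN` id format are my own assumptions, chosen only to make the point concrete.

```python
from types import MappingProxyType


class MessageStore:
    """Hypothetical store for immutable, versioned messages."""

    def __init__(self):
        self._messages = {}

    def put(self, name, version, payload):
        # The id includes the version, so a new version gets a new id.
        message_id = f"{name}:v{version}"
        if message_id in self._messages:
            raise ValueError(f"{message_id} already exists and cannot be changed")
        # Copy the payload and wrap it read-only so later changes by the
        # caller cannot alter what was stored (shallow protection only).
        self._messages[message_id] = MappingProxyType(dict(payload))
        return message_id

    def get(self, message_id):
        # The same id always returns the same data.
        return self._messages[message_id]
```

Publishing a corrected data set then means calling `put` with a higher version; the old id keeps returning the old data.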

The data outside the services is represented as messages and is understood by both the sender and the receiver. Special attention must be given to the schema that describes the message, to reduce the possibility of misinterpreting the message (data).
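To make the role of the schema a bit more tangible, here is a small Python sketch that checks a message against an agreed description before it is sent or accepted. The dictionary-based `ORDER_SCHEMA` and its field names are purely hypothetical stand-ins; in a web service setting the schema would normally be something like an XML Schema.

```python
# Hypothetical message schema: field name -> expected Python type.
ORDER_SCHEMA = {"order_id": str, "country_code": str, "quantity": int}


def validate(message, schema):
    """Return a list of problems; an empty list means the message conforms."""
    problems = []
    for name, expected in schema.items():
        if name not in message:
            problems.append(f"missing field: {name}")
        elif not isinstance(message[name], expected):
            actual = type(message[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    for name in message:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems
```

Rejecting unexpected fields is a strict choice on my part; a more tolerant receiver might ignore them instead.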

The data inside the service can be in any format that is appropriate for the service and is (and stays) private to the service.

Whenever data is needed by many services, choose one owner of the data. The owner is the only one that can make changes to the data. Whenever other services need the data changed, they send an update request to the owner. If the owner decides to make the update, it publishes the changes to all subscribers, and the version number of the (reference) data increases. Request and response messages should be archived for auditing purposes.
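The single-owner rule could look roughly like this in Python. Everything here is an assumption made for illustration: the `DataOwner` class, the callback-style subscribers, and the placeholder acceptance rule are mine, not something shown in the presentation.

```python
class DataOwner:
    """Hypothetical owner service: the only place where the data changes."""

    def __init__(self, data):
        self.version = 1
        self._data = dict(data)
        self._subscribers = []
        self.audit_log = []  # archived (request, response) pairs

    def subscribe(self, callback):
        # callback(version, data) is called after every accepted update.
        self._subscribers.append(callback)

    def request_update(self, requester, key, value):
        request = {"from": requester, "key": key, "value": value}
        accepted = self._accept(request)
        if accepted:
            self._data[key] = value
            self.version += 1
            for notify in self._subscribers:
                # Hand out a copy so subscribers cannot change the owner's data.
                notify(self.version, dict(self._data))
        response = {"accepted": accepted, "version": self.version}
        self.audit_log.append((request, response))
        return response

    def _accept(self, request):
        # Placeholder business rule (assumption): reject empty values.
        return bool(request["value"])
```

Rejected requests are archived too, so the audit trail shows what was asked for as well as what was changed.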

Of course there is a lot more that can be said about data and services but the above topics kept us busy for most of the afternoon.

posted on 7/18/2005 7:18:42 PM UTC  #    Comments [3]

7/19/2005 11:14:42 AM UTC
We are doing something similar for our customer. Since the response times of the consumed services aren’t guaranteed, our client suffers from performance issues. So we started replicating data on the client. The service uses an initial load to publish the data. Versioned data is then published using a publisher/subscriber messaging pattern.

Funny thing is that Clemens Vasters made a drawing of such a solution during the chalk & talk session at Tech Ed 2005 Europe.
7/19/2005 7:32:06 PM UTC
Paul, interesting! Are you referring to a web or a Windows client? Are the services you are talking about primarily maintaining so-called "resource oriented data"? In this case, think of resource oriented data as customer or employee data. This data is usually valid across multiple (long running) transactions and therefore easy to cache.
Do your services provide functionality for the publish and subscribe patterns in their interface? Is your client only communicating with the "local store", using the service only to retrieve the initial set of data and to send updates back?
7/23/2005 9:55:04 AM UTC
Web client. The services are largely resource oriented and most of them encapsulate legacy systems (back office). This means that since the data in the back office system is valid for 24 hours, so is the data inside the services. It is therefore easy to cache, though the services use bulk updates, which complicates the way we handle data inside the service. The pub/sub implementation is a separate interface based on MQ.