Explaining Integration
By Phil Howard, Bloor Research
January 27, 2004
Article URL: http://www.it-director.com/article.php?articleid=11615
When Bloor Research first set up its practice model, integration was divided between me and my colleague Peter Abrahams. Amongst other things, he was to look after application integration, while I would focus on data integration. However, at no time was it explicitly stated where application integration ended and where data integration began, because it was by no means clear to us where any such dividing line should be placed. As my next project is a report on data integration, I have had to devote some effort to clarifying this position in my own mind, and the following represents some of the conclusions I have reached.
The first and most obvious way in which you might differentiate between application and data integration is that the former uses messaging and the latter does not. This might once have been valid but it is not so any more.
If you think about types of data integration then there is the traditional ETL (extract, transform and load) functionality, which effectively acts in batch mode.
Then there is near real-time data transfer. This is typically provided not by the ETL technology itself but by the data connection technology that underpins it. For obvious reasons, such functionality requires change data capture to give it decent performance and this is what vendors like Striva (now part of Informatica) and Attunity provide.
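To make the role of change data capture concrete, here is a minimal Python sketch. It assumes a simple sequence-numbered change log; the names `Change`, `ChangeLog` and `apply_changes` are hypothetical and do not describe how Striva, Attunity or any other product works. The point it illustrates is only that the target processes the changes made since its last position rather than re-reading the whole source, which is where the performance gain comes from.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    seq: int               # position of this change in the log
    key: str               # identifier of the changed row
    value: Optional[int]   # new value; None marks a delete


class ChangeLog:
    """Hypothetical append-only log of changes made at the source."""

    def __init__(self) -> None:
        self._entries: List[Change] = []

    def record(self, key: str, value: Optional[int]) -> None:
        # Sequence numbers start at 1 and increase by one per change.
        self._entries.append(Change(len(self._entries) + 1, key, value))

    def since(self, seq: int) -> List[Change]:
        # Return only the changes recorded after the given position.
        return [c for c in self._entries if c.seq > seq]


def apply_changes(target: Dict[str, int], log: ChangeLog, last_seq: int) -> int:
    """Apply changes newer than last_seq to target; return the new position."""
    for change in log.since(last_seq):
        if change.value is None:
            target.pop(change.key, None)
        else:
            target[change.key] = change.value
        last_seq = change.seq
    return last_seq
```

Each call to `apply_changes` picks up where the previous one stopped, so the cost of a transfer depends on the number of changes rather than on the size of the source.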
Finally, there is true real-time capability, which is clearly a growing requirement. Both Ascential and Informatica, for example, have real-time engines, but the truth is that more and more companies are using messaging to update targets such as data warehouses where they require real-time capability.
However, there is a fundamental difference between messaging and conventional data integration: the latter is a pull technology whereas the former uses a push-based approach. If you think about it, this must be the natural approach to real-time integration: changes and updates must be propagated as they occur rather than by using some technology that goes and asks if some change has been made, since the latter must involve some sort of time delay.
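The difference between pull and push can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a model of any real product: `Source`, `PullTarget` and `PushTarget` are hypothetical names, and a plain callback stands in for whatever messaging or replication mechanism would deliver the change in practice.

```python
from typing import Callable, Dict, List, Tuple


class Source:
    """Hypothetical operational system that can be polled or subscribed to."""

    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}
        self._subscribers: List[Callable[[str, int], None]] = []

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        self._subscribers.append(callback)

    def update(self, key: str, value: int) -> None:
        self.rows[key] = value
        # Push: every subscriber is told about the change as it occurs.
        for callback in self._subscribers:
            callback(key, value)


class PullTarget:
    """Sees changes only when it goes and asks the source."""

    def __init__(self, source: Source) -> None:
        self.source = source
        self.rows: Dict[str, int] = {}

    def poll(self) -> None:
        self.rows = dict(self.source.rows)


class PushTarget:
    """Receives each change at the moment the source makes it."""

    def __init__(self, source: Source) -> None:
        self.rows: Dict[str, int] = {}
        source.subscribe(self.on_change)

    def on_change(self, key: str, value: int) -> None:
        self.rows[key] = value


def demo() -> Tuple[Dict[str, int], Dict[str, int], Dict[str, int]]:
    source = Source()
    pull = PullTarget(source)
    push = PushTarget(source)
    source.update("order-1", 100)
    before_poll = dict(pull.rows)   # pull target has not asked yet
    pushed = dict(push.rows)        # push target already has the change
    pull.poll()
    after_poll = dict(pull.rows)    # pull target catches up after polling
    return before_poll, pushed, after_poll
```

In `demo`, the push target holds the update immediately, while the pull target remains empty until its next poll; that gap is the time delay described above.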
Nevertheless, just because you introduce push technology does not mean you have to use messaging. It should be equally feasible to use database replication. However, this would mean replicating updates to the ETL tool for transformation purposes rather than to another database, which is the usual purpose of replication. Thus this is not generally considered as an option. Nevertheless, there is one ETL vendor, GoldenGate, which does precisely this: combining ETL with its own replication technology.
Does this get us a lot further? Well, it suggests that EII (enterprise information integration) is getting squeezed between EAI (enterprise application integration) and ETL, so we might lose an unnecessary initialism. It also suggests a negative: that messaging does not help to define the distinction between data and application integration. Perhaps we can say that application integration is initiated by an operational application and data integration isn't, or that the data passed during EAI is used directly by the recipient application whereas it is only indirectly used in data integration environments. If anybody has a more specific (and concise) definition, I would be pleased to hear it.