Semantics: A new beginning?
Semantics has been around in the computer industry for a long time. Way back when, for example, Unisys used to market a database called SIM (Semantic Information Manager), which was based on Peter Chen's entity-relationship (E-R) model. For a variety of reasons, SIM never gained much traction in the market but, from a theoretical standpoint at least, it had some significant advantages over its counterparts at that time.
Today, semantics is back in the news. In fact, it has been for a while, albeit at a relatively low level. In particular, Tim Berners-Lee and others have been working on the "semantic web" for some time and, just a couple of weeks ago, the W3C approved the Web Ontology Language (OWL), an enabling technology for the semantic web, as a standard.
However, my task is not to talk about the semantic web but semantics in general and, in particular, its potential impact on data integration. But before I do that, the previous paragraph may need some explanation. The question that may be puzzling you is why the Web Ontology Language has an acronym of OWL, when it is spelt WOL? There are two answers: the first is that WOL sounds awful and the second is that that is how Owl spelled his name in Winnie-the-Pooh. That stands as one of the best reasons for an acronym that I have run across in a long time.
To return to the subject in hand: I want to discuss the impact that semantics or, more specifically, semantic information architectures, may have on the data integration market (whether moving data or federating it), data quality, corporate governance and reference data management markets, amongst others.
One of the key issues in all of the environments mentioned is that disparate data is coming from diverse sources, and very often these sources are dealing with what is essentially the same sort of information. For example, they may all be dealing with customers. That said, one source may refer to customers, another to clients, a third to sales accounts, a fourth to Cust_3, and so on.
So it is necessary to create some sort of mapping that enables these to be related to one another. This is what data integration products do, either in the form of transformations in the case of ETL or as virtual databases or views within data federation products. At a more specific level, for example when dealing with individual customers, this is what the linking features in data cleansing tools do.
However, the problem with all of these approaches is that you have to create these transformations and mappings manually. You may have a tool that helps you create them but, one way or the other, you have to read in the source data, identify that it matches some other source data that you have read, and then create the mapping. Moreover, there is no control over this process. If a new column name is created on a source system, then that has to be read, understood and mapped all over again.
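To make that maintenance burden concrete, here is a minimal Python sketch of a hand-maintained mapping. It is only an illustration: the source system names (crm, billing, sales, legacy), the canonical name "customer" and both helper functions are hypothetical, and only the labels customers, clients, sales accounts and Cust_3 come from the example above.

```python
# Hypothetical hand-written mapping: (source system, source field) -> canonical field.
# The system names and the canonical name are invented for illustration.
MANUAL_MAPPING = {
    ("crm", "customers"): "customer",
    ("billing", "clients"): "customer",
    ("sales", "sales accounts"): "customer",
    ("legacy", "Cust_3"): "customer",
}


def to_canonical(source, field):
    """Look up the canonical name for a source field, failing loudly if unmapped."""
    try:
        return MANUAL_MAPPING[(source, field)]
    except KeyError:
        raise LookupError(
            f"no mapping for {field!r} from {source!r}; someone must add one by hand"
        ) from None


def unmapped_fields(source, fields):
    """Return the fields of a source that nobody has mapped yet."""
    return [f for f in fields if (source, f) not in MANUAL_MAPPING]
```

In this sketch, a newly created column such as a hypothetical Cust_4 on the legacy system simply shows up in `unmapped_fields` and stays there until a person reads it, understands it and adds another entry to the table.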
The reason for all this continual mapping is that the source data has no meaning in any real sense of the word. That is, it has no context. What the new interest in semantics is looking to do is to provide that meaning. Unfortunately, there is no indication that database vendors (the leading ones at least) are going to provide this contextual information any time soon. However, there are moves afoot within the semantic community to establish semantic rules and ontologies that can be used across the enterprise. With appropriate tools, you can then start to automate the creation of these transformations and mappings rather than building them by hand in a mapping tool.
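As a rough illustration of the idea, the following Python sketch derives cross-source links from a shared vocabulary instead of from hand-written pairs. It is a toy under stated assumptions: the `ONTOLOGY` dictionary, the source names and both functions are hypothetical, it does not use OWL, and it is not a description of how any vendor's product works.

```python
from itertools import combinations

# Hypothetical shared ontology: each concept lists the labels known to denote it.
ONTOLOGY = {
    "Customer": {"customers", "clients", "sales accounts", "cust_3"},
}


def concept_for(label):
    """Return the ontology concept a source label denotes, or None if unknown."""
    key = label.strip().lower()
    for concept, labels in ONTOLOGY.items():
        if key in labels:
            return concept
    return None


def derive_mappings(source_fields):
    """Link fields from different sources that denote the same concept.

    source_fields maps a source name to its list of field names.
    Returns (field_a, field_b, concept) tuples.
    """
    tagged = [
        (source, field, concept_for(field))
        for source, fields in source_fields.items()
        for field in fields
    ]
    return [
        (f"{sa}.{fa}", f"{sb}.{fb}", ca)
        for (sa, fa, ca), (sb, fb, cb) in combinations(tagged, 2)
        if ca is not None and ca == cb and sa != sb
    ]
```

The design point is that meaning is recorded once, against the concept, so adding a source means describing its fields rather than writing a new mapping to every other source. Fields the ontology does not recognise are left unlinked here rather than guessed at.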
There are a number of vendors already in this space. For example, Unicorn is an Israeli company that has moved its headquarters to the States, Contivo is completely American, and Network Inference started in the UK but is currently following Unicorn to the US.
Now, although all of these companies have a number of installations already, this is really bleeding-edge stuff and I confess to not having fully got to grips with it. However, I am in the process of arranging briefings with all of these companies and I will report back when I have more information and can explain it more simply. In the meantime the technology appears to have potential, not just for data integration but also for other areas like EAI. Watch this space.
Copyright 2004. Originally published by IT-Director.com, reprinted with permission.