Monday, June 15, 2009

A Business Data Dictionary generated from XSD Schemas

With XSD schemas becoming the default Enterprise Data Dictionary at most companies, keeping the schemas and the Business Data Dictionary synchronized is becoming a challenge. Business people need a friendly, visually modeled, non-technical Data Dictionary to work with.

Tools which can generate documentation from Schemas (XSDs) include DocFlex, Stylus Studio, xnsdoc, TechWriter, DocumentX, etc. However, all of them seem targeted at technical folks and produce technical documentation (they generate namespaces, "AttributeGroups", "SimpleTypes" and "ComplexTypes" etc.) - that is gibberish to business folks and scares them away.

We need a Business Data Dictionary that is always in synch with the XSD and is visually modeled, with the XSD terminology represented by user-friendly, common-language terms.

In the absence of available tools to generate one from a schema, most companies use Excel to maintain the data dictionary. However, the right tool to model and represent a schema is a treeview, not Excel. Secondly, the spreadsheet often gets out of synch with the schema because of the manual process involved, and a few mistakes later, business folks start to distrust the manually maintained Data Dictionary.

The best way to do this seems to be to embed xs:annotation elements (xs:documentation and xs:appinfo) directly into the XSD and then use a configurable style sheet to generate a formatted tree-view Help (CHM / HHX / Framed HTML) that is a visual model of the XSD. This keeps the XSDs and the Data Dictionary always in synch, with no external Data Dictionary to update separately through a manual process that often falls out of synch.
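
To make the idea concrete, here is a minimal sketch of the annotation-to-tree step. It is written in Python rather than the XSLT style sheet proposed above, purely for illustration. The sample schema, the element names (Cust, AcctNo, DOB) and the helper functions are all hypothetical, and the sketch assumes inline type definitions only (no xs:include, xs:import or named types).

```python
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema: technical names carry business labels
# in xs:annotation/xs:documentation.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Cust">
    <xs:annotation>
      <xs:documentation>Customer</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="AcctNo" type="xs:string">
          <xs:annotation>
            <xs:documentation>Account Number</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="DOB" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def business_label(element):
    """Return the xs:documentation text, falling back to the technical name."""
    doc = element.find(f"{XS}annotation/{XS}documentation")
    if doc is not None and doc.text and doc.text.strip():
        return doc.text.strip()
    return element.get("name", "(unnamed)")


def nested_elements(node):
    """Yield the nearest xs:element declarations below node, skipping annotations."""
    for child in node:
        if child.tag == XS + "element":
            yield child
        elif child.tag != XS + "annotation":
            yield from nested_elements(child)


def dictionary_lines(node, depth=0):
    """Build an indented, business-friendly outline of the schema."""
    lines = []
    for element in nested_elements(node):
        lines.append("  " * depth + business_label(element))
        lines.extend(dictionary_lines(element, depth + 1))
    return lines


schema = ET.fromstring(SAMPLE_XSD)
for line in dictionary_lines(schema):
    print(line)
```

Because the labels live inside the XSD itself, regenerating the outline after every schema change is what keeps the two in synch; a real generator would also have to resolve named types and included schemas, which this sketch leaves out.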

Amazingly, there is no purchasable tool on the market which can do this - as far as my extensive research shows (I would be glad to be proven wrong). This is the second time I have run into this in my career. I have explained the issue in the following attachment and am seeking developers who are skilled with XSD and XSLT to help me out with this project and also provide some insights.




At June 24, 2009 at 7:29 AM , Blogger Paul said...

I share your frustration over needing a business-friendly view of the data, in dictionary form. Over the years I have built XSLT tools for navigating over all xsd:include and xsd:import references, as well as walking the tree via types, unions, derivations, etc.
In short, I think I have many of the ingredients you are mentioning. I would be happy to work with you on creating a true business data dictionary.
Paul Kiel Consulting

At July 13, 2009 at 3:16 PM , Blogger Leonid said...


I happen to be the chief author of one of the XML schema documentation tools mentioned here under the name DocFlex.

As I see in the original post, there is some misunderstanding of what exactly our tool is. This is frustrating to me because our tool was designed primarily for diverse and nonstandard problems like those mentioned above -- not only to produce some "gibberish" XML schema docs. (But actually I cannot blame you, because so far we have invested too much effort in the development of the technology itself rather than its promotion and explanation. The latter is a particularly difficult task because people tend not to read any explanations at all. Instead, most are quick to claim they cannot find the tool they need even when they may be looking exactly at one... This may sound strange, but just try to develop and market something really new, particularly in a sophisticated field, and you will know I am right.)

Well. Now concerning DocFlex.

First, "DocFlex" is not an XML schema documentation tool and, by itself, has little to do with XML schemas. Rather, it is an abstract and very powerful technology for fast creation of any kind of documentation generator from any sort of data provided by any Java API. That is probably a thing that may be dreamed of only in some nightmare, I guess :) Anyway, it exists! For further details, please read here: I also suggest looking closely at our whole website. You may notice more...

Second, it is only DocFlex/XML -- a branch of DocFlex -- that deals with XML-file data sources. DocFlex/XML is a universal template-driven documentation generator for any XML-file based data sources, which we developed based on our main technology.

Now you may wonder, where in that system is the place of an XML schema documentation generator mentioned originally as "DocFlex"?

This is actually a specific template set for DocFlex/XML. We sell that template set as a separate product called DocFlex/XML XSDDoc. This is a very powerful XML schema documentation generator indeed, even though it is implemented on some mystic "templates" nobody has heard of (rather than on such venerable stuff as XSLT).

Of course, I am biased, you may think, but I don't really know what a true alternative to our DocFlex/XML XSDDoc currently is (especially given that most of its functionality is available for free!). I can't imagine where people go otherwise... unless they write their XML schema docs by hand.

Now, concerning this remark: "... However, all of them seem targeted at Technical folks and produce technical documentation (they generate namespaces, "AttributeGroups", "SimpleTypes" and "ComplexTypes" etc.) - that is gibberish to business folks and scares them away."

Well, I don't see anything wrong with this. Our XML schema doc generator (as well as all the others mentioned in the original post) speaks in terms of the XSD (XML Schema Definition) language. All XML schemas are written in that language! I see nothing wrong with the idea that any general XML schema documentation should first of all represent the primary things (e.g. components) defined in the XML schema -- those namespaces, "AttributeGroups", "SimpleTypes" and "ComplexTypes" etc.

What else could you expect to see there? After all, the XSD language does not (and should not) provide any notions from each particular application field where XML schemas are used.

[to be continued; see my next comment]

At July 13, 2009 at 4:58 PM , Blogger Leonid said...

[second part; see my first comment]

In my view, what a good general XML schema doc generator should do is not only mention all the components (along with their properties) defined in the XML schemas being documented, but also represent all the important interconnections between those components -- the job you would typically do in your head (normally, by reading the XML schema sources)! That was the primary focus of our XML schema documentation generator DocFlex/XML XSDDoc.

Concerning the remark: "... Amazingly, there is no purchasable tool in the market which can do this - as far as my extensive research shows (I would be glad to be proven wrong)."

Right. You will not find such a tool now and, I believe, are very unlikely to find one in the future either. That's because the tool you want is nearly impossible!

Look, you have some business application task where XML schemas are used to represent some notions specific to that particular business field. Someone else is using XML schemas to represent UML models. That list could be continued... What's common here is that these are particular applications of the XSD language. Any such application adds its own semantic layer over XSD, which describes things inherent to that application field.

What you essentially want is a general XML schema doc generator that could be easily set (preferably in some visual way) to understand any given application-specific layer of semantics so that it could take some raw XSD files, recognize in them that semantics and generate some documentation according to its notions.

Unfortunately, such a tool would be similar to another dream: a tool that would primarily "know" Java (or any other universal programming language), to which you could somehow (visually) explain your programming task, and it would generate for you the Java program you need. Do you know many such tools? I think UML MDA is the only field that comes close to that goal. But how "close" is it actually? Does anybody say that Java programmers will be eliminated any time soon?

Well, my opinion is this. If you need an XML schema documentation generator to produce some specific docs that your clients could easily understand, you have to program that doc generator by yourself and program it meticulously, in every detail. You cannot avoid this!

Some technologies may indeed help you. XSLT is one, you are right. But XSLT, apart from being a W3C standard, is rather cumbersome and far from universal. A lot of things cannot be easily done with it.

I would suggest looking at our main XML tool, DocFlex/XML.

Our current XML schema documentation generator XSDDoc is just a set of templates open to any changes. The templates play the same role as XSLT scripts in XSL. However, unlike XSLT scripts, our templates are edited in a special graphic template designer, which provides a layer of visualization that resembles the output you are going to produce (and also maintains the template integrity).

The basic XSDDoc templates can be easily modified according to specific application semantics and other requirements. It is possible to generate both framed and single-file HTML output as well as RTF documents (which can be converted to PDF). What else does anyone need?

Best regards,
Leonid Rudy

At May 25, 2011 at 3:06 AM , Blogger michel.bormans said...

I have experienced the same things in my own environment. I'd like to add the complexity where each person speaks his own national language (that is, Dutch, French, German, English, ...).
I tried to build a documentation user interface offering a tree view based not on the tag names but on the more textual, language-specific labels. You can see it at

