Monday, June 15, 2009

A Business Data Dictionary generated from XSD Schemas

With XSD schemas becoming the default Enterprise Data Dictionary at most companies, keeping the schemas and the Business Data Dictionary synchronized is becoming a challenge. Business people need a friendly, visually modeled, non-technical Data Dictionary to work with.

Tools that can generate documentation from schemas (XSDs) include DocFlex, Stylus Studio, xnsdoc, TechWriter and DocumentX. However, all of them seem targeted at technical folks and produce technical documentation (they generate namespaces, "AttributeGroups", "SimpleTypes", "ComplexTypes" and so on). That is gibberish to business folks and scares them away.

We need a Business Data Dictionary that is always in sync with the XSD and is visually modelled, with the XSD terminology represented by user-friendly, common-language terms.

In the absence of tools to generate it from a schema, most companies use Excel to maintain the data dictionary. However, the natural way to model and represent a schema is a tree view, not a spreadsheet. Moreover, because the process is manual, the spreadsheet often gets out of sync with the schema, and a few mistakes later business folks start to distrust the manually maintained Data Dictionary.


The best way to do this seems to be to embed xs:annotation elements (with xs:documentation and xs:appinfo children) directly into the XSD and then use a configurable style sheet to generate a formatted tree-view Help (CHM / HHX / Framed HTML) which is a visual model of the XSD. This keeps the XSDs and the Data Dictionary always in sync, with no external Data Dictionary to update separately by hand.
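To make the idea concrete, here is a minimal sketch in Python (rather than XSLT) of the core step: walk the XSD, and for each element show the business-friendly text from its xs:documentation annotation, falling back to the technical name when no annotation exists. The sample schema, the element names and the function names are all hypothetical, and a real generator would also need to handle named types, references and HTML/CHM output.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
NS = {"xs": XS}

# Hypothetical sample schema: technical names carry business-friendly annotations.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="CustOrd">
    <xs:annotation>
      <xs:documentation>Customer Order</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="OrdDt" type="xs:date">
          <xs:annotation>
            <xs:documentation>Order Date</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="ShipTo" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def business_label(element):
    """Return the xs:documentation text if present, else the technical name."""
    doc = element.find("xs:annotation/xs:documentation", NS)
    if doc is not None and doc.text and doc.text.strip():
        return doc.text.strip()
    return element.get("name", "(unnamed)")


def nested_elements(node):
    """Yield the nearest xs:element descendants, skipping annotation content."""
    for child in node:
        if child.tag == f"{{{XS}}}element":
            yield child
        elif child.tag != f"{{{XS}}}annotation":
            # Look through wrappers such as complexType, sequence, choice.
            yield from nested_elements(child)


def dictionary_lines(node, depth=0):
    """Build an indented tree of business labels, one line per element."""
    lines = []
    for element in nested_elements(node):
        lines.append("  " * depth + business_label(element))
        lines.extend(dictionary_lines(element, depth + 1))
    return lines


def build_dictionary(xsd_text):
    return dictionary_lines(ET.fromstring(xsd_text))


if __name__ == "__main__":
    print("\n".join(build_dictionary(SAMPLE_XSD)))
```

Because the labels live inside the schema itself, regenerating the tree after every schema change is what keeps the dictionary in sync; nothing is maintained by hand.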

Amazingly, there is no commercial tool on the market that can do this, as far as my extensive research shows (I would be glad to be proven wrong). This is the second time I have run into this in my career. I have explained the issue in the following attachment and am seeking developers who are skilled with XSD and XSLT to help me out with this project and also provide some insights.


SchemaModelDocumenter.doc

Friday, February 6, 2009

Migrating BizTalk 2002 to 2006

Recently I was involved in migrating a BizTalk 2002 solution to 2006 for a large Canadian customs and brokerage firm. With continued growth over the past few years and equally high growth forecast for the future, the transaction load on the company's BizTalk Server 2002 environment had grown to a point where some transactions were taking close to 2 minutes to process. Given the anticipated growth, upgrading the environment had become absolutely necessary, but the immediate need was to relieve congestion and remove the single point of failure, which is the biggest Achilles' heel of BizTalk 2002: the lack of failover and clustering for redundancy.

BizTalk 2002, for those not familiar with it, offers only a small subset of the functionality that BizTalk 2006 has grown to include. BizTalk Server 2006 is a significantly different product from BizTalk Server 2002: it provides not only a rich set of new features but also new ways to do things. For example, imagine you have a business rule that must be invoked from numerous business processes. With BizTalk 2002, you can solve this challenge in one of two ways: implement a decision shape in each orchestration, or implement the business rule in a custom object that is called from each orchestration. In BizTalk 2006, you may instead implement the business rule in the Business Rules Engine, which is called from each orchestration. The question then becomes: when I migrate this area of my BizTalk 2002 solution to BizTalk 2006, do I take advantage of the new Business Rules Engine?

The migration effort is riddled with issues like this. To help with the migration, I have attached a presentation comparing BizTalk 2002 and 2006 artifacts:

Comparison of BizTalk 2002 and 2006 artifacts

Thursday, February 5, 2009

Microsoft Data Access strategies

Lately, there is a lot of buzz around two technologies from Microsoft about representing and querying data. The first is the new Entity Data Model exposed as part of the ADO.NET Entity Framework, and the second is a set of extensions to the .NET Framework for integrating queries into the programming language known as LINQ.

What are these technologies, how do they relate to one another, and what role do they play in Microsoft’s Data Access Strategy? How do they compare with the current ADO.NET, which is widely used? How does XML data access feature in the new world? What about community projects like NHibernate and Repository Factory: how do they fit in?

See my PowerPoint presentation DATA ACCESS STRATEGIES - 2008 to get an idea of what to use where and how Microsoft's data access strategies have evolved.

Migrating SeeBeyond integration to BizTalk

Integration migration projects are humbling in their complexity. You are likely to run into daunting communication, management and operational challenges which can conspire to defeat your march.

But what happens when you need to migrate an integration, probably quite complex (say hundreds of interfaces) and settled in the organization to another platform, with minimal disruption of business continuity?

We faced such a challenge when migrating a SeeBeyond project at a large US client to a BizTalk platform. The project's Achilles' heel was that the original SeeBeyond solution, built by a large consulting company (EDS), did not have a scrap of documentation.

SeeBeyond has been around as an EAI tool for more than 15 years and was one of the pioneering tools in the realm of EAI (even before the EAI buzzword was coined). It was bought by Sun, and the new version of its e*Gate tool was redesigned so that Java can be used for development. BizTalk is the relatively newer kid on the block, with a different vision, and is part of the .NET family.

How do you begin to approach this problem when a large multinational's business-critical applications depend on this integration? Consider that professionals with functional skills in both platforms (SeeBeyond and BizTalk) are impossible to find. How do you assess the project scope, get the “big picture”, peek into many technical details and eccentricities early, and foresee upcoming challenges – not to mention planning ahead for infrastructure and performance?

As with most large and complex IT projects, there is no one magic bullet. We evaluated a range of strategies for project management and technical solutions. Planning the migration strategy is key to a successful project, especially when it is a complex, mission-critical EAI migration. Certain approaches are low-risk while others hide spiralling costs and technical potholes. It is essential to identify the project roadmap and the right blend of migration strategies before setting out on this difficult path.

In the attached white paper, I share some project management strategies that worked for us for this project, which was a success.

Technically, SeeBeyond and BizTalk don't have a clear one-to-one mapping of features, and it is apparent at the outset that these are separate organisms from different lineages that evolved to fill the same niche, much like convergent evolution in nature. So how does a BizTalk developer without SeeBeyond knowledge begin to proceed? In the white paper, I have tried to provide the best possible analogies between the concepts, features, technologies and artifacts of the BizTalk and SeeBeyond platforms. Please note that many of the parallels are loose analogies meant to help the BizTalk developer gain a newcomer’s familiarity with SeeBeyond. Since the two platforms have different architectures, drawing parallels between them is a hazardous task, fraught with shoe-horning and over-simplification. We have attempted it nevertheless, on the grounds of giving the migration team a first look at, and a toehold in, the world of SeeBeyond and dispelling some of the fog surrounding it. Caution is advised against literal interpretation of these parallels, and no technical implementation based on them is recommended. A deeper and more thorough understanding of SeeBeyond's technical architecture and concepts is advised before the migration project is undertaken.

White Paper: SeeBeyond to BizTalk Migration

Wednesday, November 14, 2007

EAI or ETL ?

I see a lot of confusion in the industry today about when to use EAI and when to use ETL. A very large EAI implementation at a Canadian processed foods company is a performance dud because it is basically a database integration, which could have been done much more efficiently via the ETL route.

ETL tools are most appropriate for data integration that consists of data synchronization between applications, and for point-to-point, single-step interactive processing. Real-time data-oriented integration projects that involve large amounts of data, complex transformations, or data augmentation are appropriate for these tools. You will also get better performance moving and transforming large chunks of data with an ETL tool performing relational database-type operations on large amounts of data.

EAI tools are most appropriate for process integration, where a contract (in the form of a schema say) needs to be exposed for multiple parties to consume and adhere to, and where the end systems are not fixed in stone but can be replaced.

Let's capture these differences definitively:

  1. EAI is message oriented (it acts on a single row of data, treated as a ‘message’, performing conversion, transformation or string operations), while ETL is data-set oriented (able to perform joins, merges, unions, sorts, aggregations and pivots on a huge set of data at once).
  2. ETL can perform heavy-duty data lifting, while EAI tools are geared more towards heavy throughput of messages or transactions, especially when moving to many destinations.
  3. ETL is better at relational data manipulation/transformation while EAI is better at hierarchical data manipulation (XML, Flat files).
  4. EAI tools are not generally designed to understand the data schemas of the applications and to perform data transformations. They are designed to interact with the applications at an API level. To do relational and complex transformations you have to drop down to writing code. ETL is a natural at this and offers significant performance gains by using the right tool for the right job.
  5. EAI is based on a pub-sub model while ETL is based on an on-demand model. An EAI subscription interface is based on a schema rather than on an ERD data model, which is far more difficult to share between data stores. Hence EAI subscribers are contract-based and replaceable.
  6. EAI connects disparate applications while the destination of ETL is usually a data store or a data warehouse.
  7. EAI provides much better workflow and process integration capabilities while ETL is good at integrating data.
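The first two differences can be illustrated with a small Python sketch. It is only an analogy in plain code, not how any EAI or ETL product is implemented: the order rows, the field names and the function names are all hypothetical. The EAI-style function handles one row at a time as a message and fans it out to several subscribers, while the ETL-style function works on the whole data set at once to aggregate and sort it.

```python
from collections import defaultdict

# Hypothetical order rows standing in for source data.
ORDERS = [
    {"customer": "acme", "amount": 120.0},
    {"customer": "globex", "amount": 75.5},
    {"customer": "acme", "amount": 30.0},
]


def to_message(row):
    """Per-message transformation: conversion and string operations on one row."""
    return {"Customer": row["customer"].upper(), "Amount": f'{row["amount"]:.2f}'}


def eai_style(rows, transform, subscribers):
    """Message oriented: transform each row alone, then deliver it to every subscriber."""
    for row in rows:
        message = transform(row)
        for deliver in subscribers:
            deliver(message)


def etl_style(rows):
    """Data-set oriented: aggregate per customer across all rows, then sort."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return sorted(totals.items())


if __name__ == "__main__":
    billing, shipping = [], []  # two hypothetical destinations
    eai_style(ORDERS, to_message, [billing.append, shipping.append])
    print(billing)
    print(etl_style(ORDERS))
```

Note that the EAI-style path never needs to see more than one row at a time, which is why it suits many destinations and high message throughput, whereas the aggregation only makes sense once the whole set is available.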

Wednesday, May 9, 2007

Why SOA?

Is it all about buzzword compliance? Is it hype or a genuine megatrend? Is it another noun for Web Services? How can you cut through the rivers of literature about SOA to first understand - why do we need another paradigm?

See my presentation - Why SOA? - the reason why everyone thinks SOA is futureproofing - and importantly, a business case for it.
