
Igor Androsov -- webMethods Ezine Columnist

Writing Custom Content Handlers



By Igor Androsov

 

Introduction

In many integration projects, architects and designers spend considerable amounts of time figuring out business logic and database models, but little time determining how to get the data from its source to the Integration Server. This article presents a few useful points on importing data to the Integration Server -- a subject often missed because it is thought to be trivial and simple!


Why Custom Handlers are Handy

Integrating using current XML standards or other contemporary solutions is simple. It is not so simple, however, when custom data formats are involved. Not coincidentally, I have yet to be involved with an IS project that did not require the processing of at least one custom flat file or some other type of legacy data format. Put simply, processing custom flat files is a challenge.

There are three ways to process non-XML data and each requires a custom content handler or a complete data transformation before ever introducing the data into TN or IS. The three ways are:

  1. The producer of the non-XML data wraps the data into a simple XML wrapper carrying identifying tags for sender, receiver and data.
    	<?xml version="1.0"?>
    	<sender>SENDER INFO GOES HERE</sender>
    	<receiver>RECEIVER INFO GOES HERE</receiver>
    	<data>DATA GOES HERE</data>
    
  2. The receiver of the data maps the document into an XML document and then sends the XML document to TN or IS.
  3. The producer delivers the data to a custom content handler service. The handler then parses the document, identifies the sender (and receiver), and chooses the appropriate processing rules.
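
As an illustration of the first option, here is a minimal Java sketch of how a producer might wrap a flat-file payload before sending it. The class name FlatFileEnvelope and the <envelope> root element are assumptions made for this example (a well-formed XML document needs a single root element); only the sender, receiver and data tags come from the wrapper shown above.

```java
// Hypothetical helper, not part of any webMethods API: wraps a non-XML
// payload in a simple XML envelope with sender, receiver and data tags.
class FlatFileEnvelope {

    // Escape the characters that would otherwise be read as XML markup.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Build the envelope. The <envelope> root element is an assumption.
    static String wrap(String sender, String receiver, String data) {
        return "<?xml version=\"1.0\"?>\n"
             + "<envelope>\n"
             + "  <sender>" + escape(sender) + "</sender>\n"
             + "  <receiver>" + escape(receiver) + "</receiver>\n"
             + "  <data>" + escape(data) + "</data>\n"
             + "</envelope>\n";
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        System.out.print(wrap("ACME", "HQ", "HDR|0001|A&B"));
    }
}
```

Escaping matters here because flat-file records routinely contain characters such as & and <, which would otherwise break the wrapper for the receiving parser.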

webMethods' methodology suggests building custom content handlers for all text documents and registering the handlers with the Integration Server. Once registered, the Integration Server will direct all requests with "text" MIME types to the content handler. This is an excellent solution because it is transparent to data producers, easier for developers to build, and easier for IS administrators to maintain. After all, there is only one service with one URL servicing all partners.
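
The registration idea can be pictured with a small, self-contained Java sketch. It is not the Integration Server API: TextHandler and HandlerRegistry are hypothetical stand-ins that only show how a server could map a request's Content-Type header to one registered handler, which then places the whole body into a pipeline-like map.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a content handler; NOT the webMethods interface.
interface TextHandler {
    Map<String, Object> toPipeline(InputStream body) throws IOException;
}

// Hypothetical registry that dispatches on the MIME type of a request.
class HandlerRegistry {
    private final Map<String, TextHandler> handlers = new HashMap<>();

    void register(String mimeType, TextHandler handler) {
        handlers.put(normalize(mimeType), handler);
    }

    // Returns the handler for a Content-Type header value, or null if none.
    TextHandler lookup(String contentTypeHeader) {
        return handlers.get(normalize(contentTypeHeader));
    }

    // Drops parameters such as "; charset=UTF-8" and lower-cases the type.
    static String normalize(String contentType) {
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws IOException {
        HandlerRegistry registry = new HandlerRegistry();

        // One handler for all text/plain requests. It reads the entire body
        // into memory (readAllBytes needs Java 9 or later).
        registry.register("text/plain", body -> {
            Map<String, Object> pipeline = new HashMap<>();
            pipeline.put("content", new String(body.readAllBytes(), StandardCharsets.UTF_8));
            return pipeline;
        });

        // Sample record is made up for the demonstration.
        InputStream request = new ByteArrayInputStream(
                "HDR|ACME|20020101".getBytes(StandardCharsets.UTF_8));
        TextHandler handler = registry.lookup("Text/Plain; charset=UTF-8");
        System.out.println(handler.toPipeline(request).get("content"));
    }
}
```

Note that the sample handler loads the complete document before any processing starts; this is the convenience of the single-handler approach and also its cost.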

The disadvantage of a single handler, though, is that all inbound data is immediately placed into the pipeline by the Integration Server. This is because the IS instance is invoking the custom content handler for all inbound text documents. So, if the data set is very large and multiple documents are being processed concurrently, memory problems may arise for the Integration Server. Additionally, with one content handler for all text documents, custom processing of the inbound documents is not possible.

An alternative is to build multiple, specific services that extract the inbound data from the input stream. With multiple custom handlers, we can customize data processing and handle any size and volume of incoming data. If the sender of the data knows to which service to submit its data, the data can be processed asynchronously or in any other customized fashion.
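
To make the memory argument concrete, here is a rough Java sketch of a service that consumes the input stream one record at a time instead of loading the whole document. It assumes newline-delimited records in UTF-8, which will not hold for every flat-file format, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical streaming reader: memory use is bounded by the longest
// record rather than by the size of the whole document.
class StreamingFlatFileReader {

    // Passes each non-empty line to recordHandler and returns the number
    // of records handled. Assumes newline-delimited, UTF-8 encoded records.
    static long processRecords(InputStream in, Consumer<String> recordHandler) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            long count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines between records
                }
                recordHandler.accept(line);
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample data is made up for the demonstration.
        byte[] sample = "A|1\nB|2\n\nC|3\n".getBytes(StandardCharsets.UTF_8);
        long handled = processRecords(new ByteArrayInputStream(sample),
                record -> System.out.println("record: " + record));
        System.out.println("handled " + handled + " records");
    }
}
```

Because each record is handed off as soon as it is read, the handler can write it to a database or a queue and let it be garbage collected, so concurrent large documents no longer multiply the memory footprint.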

This alternative provides greater flexibility and stability in handling data than a single handler does. The disadvantage, though, is that separate processing code must be developed and maintained for each "flavor" of inbound data.

The need for custom data processing arises only from performance and load concerns. If the data volume exhausts the available memory of the IS, then custom handling is the only way to ensure reliable processing of inbound data. However, if data volumes are small, then a generic content handler for text files is more appropriate.


How Large is Large?

From my discussions with webMethods and from my own working research, a "large inbound document" is an inbound document within the range of 20MB -- 600MB. In practice, though, even a 10MB flat file can slow down the Integration Server. This is true of all IS versions -- including version 4.6. The IS maintains its stability and its reliability in processing large documents, but performance does suffer. As volume rises on the server, the performance will continue to slow and, eventually, the IS will be unresponsive to requests.

Version 4.6 of webMethods IS is purported to have solved this performance issue by writing inbound data to disk (once it exceeds a threshold) and later reading it back in chunks, but I have yet to see this approach used effectively in real-world applications.
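
The general technique of spooling to disk past a threshold can be sketched in plain Java as follows. This is not how IS 4.6 implements it, which is not described here; ThresholdSpooler, the temp-file prefix and the 8 KB chunk size are all assumptions chosen for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: keep small documents in memory, move large ones
// to a temporary file so they can be read back in chunks later.
class ThresholdSpooler {

    static final class Spooled {
        private final byte[] inMemory; // null when the data is on disk
        private final Path onDisk;     // null when the data is in memory

        Spooled(byte[] inMemory, Path onDisk) {
            this.inMemory = inMemory;
            this.onDisk = onDisk;
        }

        boolean isOnDisk() { return onDisk != null; }

        long size() {
            try {
                return isOnDisk() ? Files.size(onDisk) : inMemory.length;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Opens the stored document for chunked reading.
        InputStream open() {
            try {
                return isOnDisk() ? Files.newInputStream(onDisk)
                                  : new ByteArrayInputStream(inMemory);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Removes the temporary file, if one was created.
        void delete() {
            try {
                if (isOnDisk()) Files.deleteIfExists(onDisk);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Copies the stream in 8 KB chunks. The in-memory buffer may overshoot
    // the threshold by at most one chunk before the switch to disk.
    static Spooled spool(InputStream in, int thresholdBytes) {
        try {
            ByteArrayOutputStream memory = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                memory.write(chunk, 0, n);
                if (memory.size() > thresholdBytes) {
                    Path file = Files.createTempFile("inbound-", ".dat");
                    try (OutputStream out = Files.newOutputStream(file)) {
                        memory.writeTo(out); // flush what was buffered so far
                        while ((n = in.read(chunk)) != -1) {
                            out.write(chunk, 0, n);
                        }
                    }
                    return new Spooled(null, file);
                }
            }
            return new Spooled(memory.toByteArray(), null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would choose the threshold from the memory available per concurrent request, call open() to process the document in chunks, and call delete() once processing has finished.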






Igor Androsov has over 14 years of experience in the IT industry. For the last 5 years he has been a software mercenary specializing in integration processes and technology, using a wide array of languages and platforms: C/C++, Java, UNIX, LINUX, OS/2 and Windows. Igor started working with webMethods in 1998, when it was a new integration tool, implementing a B2B exchange. He has implemented large distributed systems for multiple Fortune 500 enterprises.

Igor can be reached via email at









wMUsers is an independent organization and is not sponsored in any manner by Software AG.


© All Rights Reserved, 2001-2008.