How to convert a non-XML file into an XML format that can be sent to wm.tn.receive
Hi,
I need to write a service that will take a non-XML flat file and convert it into an XML form that can be passed into wm.tn.receive.
This is for an integration project with a large number of clients who will have widely different file formats.
Presuming that I can convert a flat file into an object, using my own Java service, is there an easy way to transmit this information to wm.tn.receive as a java.io.File or String with a content type set to text/xml?
Thanks in advance for any help. This is my first real webMethods project - I received my training around two months ago, so I would appreciate advice as to the most effective solution.
Nick_F
Nick, you did a good job explaining your goal. I have some questions about the methods, though.
You said that you are receiving a flat file. Is the flat file EDI? Is it a comma-separated list? Is it some other format?
It sounds like you are figuring this to be more work than it really is. That should be good news to you, right? :-)
Nick,
You can do two things to convert your flat file into XML and send it to TN.
1. Write your own Java service (conventional string handling) to read the flat file and convert it to IData objects.
2. Create flat file templates to read and parse a flat file into IData objects using some EDI services (this holds good for both EDI and non-EDI flat files).
Finally, build an XML document from these objects and route it to Trading Networks.
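For option 1, here is a minimal sketch of what "conventional string handling" might look like. It assumes a simple comma-separated file with no quoted fields, and it uses a plain Map as a stand-in for an IData record; the class name and the field names are hypothetical, and no webMethods API is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns comma-separated lines into ordered key/value
// records. A Map stands in for IData here; no webMethods classes are used.
class FlatFileSketch {

    // Splits one line on commas; the -1 limit keeps trailing empty fields.
    static Map<String, String> parseLine(String[] fieldNames, String line) {
        String[] values = line.split(",", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.length; i++) {
            // Missing trailing fields become empty strings.
            record.put(fieldNames[i], i < values.length ? values[i].trim() : "");
        }
        return record;
    }

    // Parses every non-blank line of the file content into one record.
    static List<Map<String, String>> parse(String[] fieldNames, String content) {
        List<Map<String, String>> records = new ArrayList<>();
        for (String line : content.split("\\r?\\n")) {
            if (!line.trim().isEmpty()) {
                records.add(parseLine(fieldNames, line));
            }
        }
        return records;
    }
}
```

Inside a real Java service the same loop would populate IData objects instead of Maps; quoted fields and fixed-width layouts would need more than a plain split.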
halsalam
07-10-2002, 10:14
Nick,
Another option is to create a simple XML document with one ROOT TAG that wraps around your flat-file content.
This can be submitted to TN as usual.
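A rough sketch of the wrapping idea, assuming the flat-file content is already available as a string. The class name and the root tag passed in are hypothetical; the only real work is escaping the characters that would otherwise break the XML.

```java
// Hypothetical helper that wraps raw flat-file text in a single root element.
class RootTagWrapper {

    // Escapes the characters that are markup-significant in XML text.
    // The ampersand must be replaced first so later entities are not re-escaped.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Returns a small XML document whose only element holds the file content.
    static String wrap(String rootTag, String flatFileContent) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<" + rootTag + ">" + escape(flatFileContent) + "</" + rootTag + ">";
    }
}
```

The root tag then becomes the thing a TN document type can key on. Control characters that are illegal in XML 1.0 are not handled here and would need to be stripped or encoded separately.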
You can also create your own service to populate the SENDER and RECEIVER IDs within the BIZDOC prior to SUBMIT being called (if you want your Processing Rules to be SENDER/RECEIVER dependent).
Hope this helps.
Halsalam-
Do you have any sample Java code to map the flat file to the IData structure of an EDI template? Could you please share it if you have?
Thanks,
Bals
Bals-
If I understand correctly - you want to convert IData to EDI (or vice versa)...
There are sample templates that ship with the EDI Adapter. Also, you should be able to download EDI templates directly from the webMethods site (I have not tried this, but that's what their docs say).
I'm impressed and thankful for such a swift reply - there is obviously a strong community here!
Dan - for the long-term idea of this project we anticipate having over 200 clients who may have different file formats for purchase orders, etc. Our high end clients will be submitting XML, in all probability, but we need to cater for older systems that will deliver in CSV and a host of proprietary formats.
So, I need to plan an architecture that can handle this with T.N. in a way that will produce a system that won't have a different mapping service for each integration partner. Hence, a Java service converting each order into a uniform XML format then passing it into T.N.
PU - If I convert the flat file into an IData object, I cannot invoke receive? We want to establish Document Types, use pre-processing rules, etc. for our clients.
I really want to know if there are any webMethods mechanisms for building an XML file from the Java objects holding client information. Is the best option simply working with java.io.* to create an XML file within a Java service?
Halsalam - your suggestion is very tidy. I will think it over.
Once again, thanks guys.
If you're going to support multiple formats from multiple partners, you're going to have a bunch of different mapping services. The key will be to keep these to the bare minimum. Here are some strategies:
* Define an XML schema for each doc type that meets your needs. Ideally, this would be an industry standard but quite often those don't work out. Get as many partners as you can to support your format. This works great if your company is an 800 pound gorilla and can dictate terms to some extent.
* As you pointed out, differing formats are a fact of life. Fortunately, dealing with these is a strength of Integration Server. Translate incoming docs to what's commonly referred to as a canonical format--a format that supports all the data you need for a given document type. From this canonical, you can translate to various target formats needed by backroom systems. Do the same for outbound documents as well--legacy format-->canonical format-->partner format. This approach keeps the number of mapping services down (the ol' m+n vs. m*n rationale).
* Although Java services are very useful (and sometimes indispensable) don't be too quick to drop into writing Java code. Normally, most of what you want done can be done with built-in services or with custom FLOW services. Mapping services in particular should be done in FLOW, not Java services.
* Use TN for all interaction with everybody. This provides document tracking, restarts, etc.
* It's important to have a strong doc type and IS record naming structure. You're going to have a lot of definitions, so you should structure things to aid the management of the definitions.
* Search in the forums in Advantage and in ITToolbox (http://eai.ittoolbox.com/) for discussions regarding content handlers, flat file parsing, etc. with tips on handling custom formats. If you want to run everything through TN (which is a good idea) you may need to "pre-process" some docs with IS services before submitting to TN.
There are definitely mechanisms for generating XML docs within IS. Getting the data into IS can be tricky sometimes (though usually it's trivial) but once there, there is tremendous capability to create virtually any XML doc layout. Using java.io.* to create the XML file is NOT the way to go. Use the built-in services and FLOW mechanisms.
I've got a somewhat different suggestion, which you might want to consider. Especially if you decided to go the Java service route. Please keep in mind that though I understand B2B well enough to have taught it in the past I don't work with it on a daily basis (hey, it wasn't my idea).
The problem you describe seems to be ideally suited for the Strategy Design Pattern ("Design Patterns," Gamma et al, 1994). The idea would be to define an interface with a method that takes a standard input, such as a string, and returns a standard output, such as a canonical document. For each different format that can be passed as a string you will define a different java class implementing the interface. Each java class will contain the specific strategy for converting a specific format to a canonical document.
For each Trading Partner you store the class name for the java class that performs the correct conversion strategy as an optional attribute in the Partner Profile.
Create a single Java service that takes the string with all the information and the Java class name attribute as arguments. The class name is used to dynamically create an instance of that class. The beauty is that since all the classes implement the same Strategy interface, the service only needs a reference to the interface, which has a defined method that returns the canonical document.
The beauty of this solution is that you can very easily change the conversion strategy for a client by just creating a new Java class (if one doesn't already exist) and modifying the partner profile attribute.
Another benefit of this approach if you deal with many (10+) different formats is that you don't have to create a 10+ nested if then else flow to call the correct conversion flow. Now it might be possible to dynamically select a flow service in a similar fashion using a partner profile attribute, but this is where my knowledge is lacking. Someone else can feel free to chime in here.
The consideration would be how to extract the string with information of the specific format. I would assume you would use different types of entry services: ftp, socket, file, http, etc. Each one would extract the string and call the same Java service with the string and the attribute from the partner profile, without any nested if then else structures.
A second side effect is that you can easily add new partners with new conversion strategies by just creating a new class, adding it to the classpath, and setting the attribute of the partner profile.
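To make the idea concrete, here is a small sketch of the Strategy approach described above. Everything in it is illustrative: the interface and class names are made up, a Map stands in for the canonical document, and the class name that would come from the partner profile is simply passed in as a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Strategy interface: one method, standard input, standard output.
// A Map stands in for the canonical document in this sketch.
interface ConversionStrategy {
    Map<String, String> toCanonical(String rawDocument);
}

// Hypothetical strategy for "poNumber,quantity" comma-separated input.
class CsvOrderStrategy implements ConversionStrategy {
    public Map<String, String> toCanonical(String rawDocument) {
        String[] parts = rawDocument.trim().split(",", -1);
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("poNumber", parts[0]);
        doc.put("quantity", parts.length > 1 ? parts[1] : "");
        return doc;
    }
}

// Hypothetical strategy for "poNumber|quantity" pipe-separated input.
class PipeOrderStrategy implements ConversionStrategy {
    public Map<String, String> toCanonical(String rawDocument) {
        String[] parts = rawDocument.trim().split("\\|", -1);
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("poNumber", parts[0]);
        doc.put("quantity", parts.length > 1 ? parts[1] : "");
        return doc;
    }
}

class StrategyDispatcher {
    // className is assumed to come from an attribute in the partner profile.
    static Map<String, String> convert(String className, String rawDocument) {
        try {
            // Load the class by name and make sure it implements the interface.
            ConversionStrategy strategy = Class.forName(className)
                    .asSubclass(ConversionStrategy.class)
                    .getDeclaredConstructor()
                    .newInstance();
            return strategy.toCanonical(rawDocument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load strategy " + className, e);
        }
    }
}
```

The dispatcher never names a concrete strategy, so adding a format means adding a class and changing one profile attribute, with no if/else chain to extend.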
People, I would like some feedback on this idea. It is based on the wonderful world of design patterns, which are very useful in an OO setting. I would like to know if it wouldn't work for some reason or if a similar solution could be devised within the flow service world.
Rgs,
Andreas Amundin
www.amundin.com
Nick - you've received excellent advice in this thread. I'd like to emphasize how important Rob's point about using canonical formats is.
Someone famous once said most computer science problems can be solved by adding another level of indirection. That process is at work here. Translating 'n' partner formats --> one canonical format --> backend format (and vice versa) has two main advantages:
(1) As familiarity with the canonical format grows with each implementation, your time-to-implement speeds up.
(2) If your backend changes, you change just one translation (canonical format-->backend format) instead of 'n' translations.
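The arithmetic behind this (the m+n versus m*n rationale mentioned earlier in the thread) can be sketched in a few lines. The class name and the numbers in the usage note are purely illustrative, and only one direction of traffic is counted.

```java
// Illustrative only: counts mapping services for one direction of traffic.
class MappingCount {

    // Every partner format mapped directly to every backend format.
    static int pointToPoint(int partnerFormats, int backendFormats) {
        return partnerFormats * backendFormats;
    }

    // Every format mapped exactly once, to or from the canonical format.
    static int viaCanonical(int partnerFormats, int backendFormats) {
        return partnerFormats + backendFormats;
    }
}
```

With, say, 20 partner formats and 3 backend formats, pointToPoint(20, 3) is 60 mappings while viaCanonical(20, 3) is 23, and adding a backend costs one new mapping instead of 20.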
In my opinion, the best way to implement a canonical format is to spend time cooking up your own 'inhouse' DTD for each document type. Let's say you work at a company called Data Dimensions ("DD Inc.") that receives POs and sends out Invoices in many formats. Start off by defining your standard (called, say, "ddXML"). The fastest way to do this is to emulate the structure of another standard, say xCBL 2.0. Of course, you can use xCBL as your inhouse standard too, but that may be overkill, especially if it doesn't address some special requirements you have. If you do create your own inhouse XML standard, here are some tips: keep your XML standard "loose" -- avoid doing much business validation -- leave that to your backend. Validate only basic requirements - try keeping fields optional as much as possible. Plan on creating two separate fields for much of your data - your backend's version of that datum and your partner's version, i.e. for PO numbers, plan on having separate tags for the backend's PO number and the partner's PO number.
Let's say you're done creating DTDs for two documents:
ddXML Order...
and
ddXML Invoice...
Now import these two DTDs into B2B and create the records in a *separate*, *versioned* package (called say, 'ddXML_1_Records'). You may find you have to do some manual cut and paste creation of records, especially if you have a single source DTD.
Once you've got your canonical documents as records in B2B, you're ready to start using them in your mappings.
Thanks guys.
We already have the canonical format idea working at the level of data structures and XML.
I'm currently working through your ideas and communicating them to my colleagues.
Hearing the design alternatives pursued at this level of conversation is enlightening.
Your advice and responses have been excellent. I hope to be able to contribute in some way to this community.
Will keep you informed ...
Nick
I am having fun with this thread.
[Switching from DesignFreak to BusinessFreak while stepping up on soap box. General warning issued]
Speaking of Canonicals, a term borrowed from the Enterprise server world (my stomping grounds), I would like to add some general comments.
Actually there is only one point I want to make. Who should define your canonicals?
Let me give you a hint, by proposing to use the term Business Model rather than the term canonicals.
IDEALLY a company, more specifically the business people running the company, should be aware of its Business Model and the Business Processes operating on the Business Model data. IDEALLY this Business Model should already be well defined by the business people.
I keep on using the word IDEALLY because many times the business is too complex or the exercise of defining a business model is considered to be too great of an effort. The problem is that if the business people won't define it, then the developers implementing the systems or business integration ends up defining it. This is wrong, the technical people should not define the business model that is supposed to support a company.
Don't get me wrong, I am not trying to put down the capabilities of the technical people working on integration projects. We all know we integrators are the bomb, but our approach tends to be to choose the quickest implementation (especially under time constraints). A common mistake I have seen in the past is to adopt the business model as defined by the most important third party system, be it Oracle Financials, xCBL, SAP, or whatever. As more systems are added, more and more integrations have to conform to a third party business model while the entire system is actually supposed to support the undefined business model of the company. Over time the entire system will become more and more difficult to maintain.
The key is to define the "true" business model for your company. Make sure it is created and maintained by the business people (find someone who actually knows how to create a business model). Us techies then just have to make sure the system supports the true business model, no mean feat in and of itself.
IN REALITY, I have not seen one project where a "true" business model has been defined before a business or systems integration project has started. There are many reasons for this. Lack of modeling experience and failure to recognize and act on the need for a defined business model are two reasons, but there are many more. For now, we techies have to do our best with what we are dealt, but at least we should know what we should strive for and maybe we can help the business people realize they need a business model.
The move to create Business Models comes and goes. There was a big movement years ago for Business Process Reengineering, which amounted to defining the "true" business model and thereby streamlining processes, mostly through increased understanding of the business.
[Stepping off soap box]
Rgs,
Andreas Amundin
www.amundin.com
I mostly agree with Rob. We use an internal format for a particular document and have developed generic flow services to store the data. While the design pattern approach is interesting, Trading Networks has already nicely set up Processing Rules - it will invoke a company-specific service (e.g. transformation) based on who's sending us the document and what document type is received. This transforms csv, tsv, and other XML into our XML version (while we wait for external standards such as PIDX to evolve).
The wrinkle with incoming non-XML files is that you can't have a partner send them directly to TN, so we provide different entry points (URLs) depending on the file type the partner wants to send (only for non-XML). It's not as easy to detect the document type of, say, a csv file (as opposed to an XML file, where you can use the root tag). The other thing is that you can't have one entry point for all csv's because each company will most likely have their own format. So you will only have as many different URLs as you have doc types - you don't have to have company specific URLs. E.g. you wrap all files coming from one URL with a root tag and send them to TN. We stick the sender (from the current user invoking the service) into a bizdoc and send it to TN. TN invokes a service based on the current partner and doc type (a special type with one tag).
We see webMethods' value as mostly a transformation service using TN - ie. company specific processing. Audit trails for doc, conversations, etc.
Hope this helps!
Will, the Strategy design pattern should be applicable to the problem of determining the doc type as well, thereby avoiding the need to add distinct URLs for each client.
Andreas
Ahh, the error of my thoughts. I just realized that without a doctype, there is no way to identify the partner. This would make it pretty difficult to pick up an attribute from the partner profile to use when creating a Strategy instance whose responsibility it would be to determine the doctype.
Andreas
Andreas--Will's approach didn't have distinct URLs for each client. It had a distinct URL for each doc type.
While the strategy pattern is elegant, and conceptually can be applied to the wM environments, the specific thought of having a Java class as a parameter suffers from the need for the partner to specify the Java class. This doesn't seem like a good idea. And as Will pointed out, the whole purpose of TN is to invoke services based on content type, sender, receiver, and other attributes. TN more or less implements the strategy design pattern. Duplicating this in Java classes is...duplicative :-)
Regarding your "BusinessFreak" post (great name by the way!) I'm with you that it is rare that there is a business-oriented process model created before the integration project is started. What I've seen however, is that it's the IT folks that tend to ask and push for this and it's the business folks that demand the quick and dirty. "We just want to connect system A to B--why do we have to go through all this other jazz? I don't have time for that nor can I pay for it." IT people usually see the bigger picture because they are exposed to more processes and systems and see the overlaps. Business folks *tend* to be a bit more myopic.
I could go off on a rant on how this situation is caused and perpetuated by how companies organize their business units, how they try to optimize the profitability of the company by optimizing the profitability of each business unit, how most companies continue to view IT as a cost center that provides "free" services to the business units, etc. but that would be *way* off topic. ;-)
Rob, good point about distinct URLs per doctype. As to partners specifying the Java class, that was not my intention. I was under the impression that you could define optional attributes in a partner profile that would NOT be entered by the partner.
I have to say I am enjoying this thread. I couldn't agree more with your sentiment that IT folks are the ones driving for a business-oriented process. After all it is self-defense, we are the ones that will get blamed when it doesn't work. Most times we will assign blame to the inadequate technology and soon the company is buying a new million dollar silver bullet. So far we have gone through Mainframes, client/server systems, ERPs, Corba, EAI, and now it's Web services. I almost forgot the intranet portal. What will be next? Hmm, am I really this cynical?
Andreas
Ah, I see. You can indeed specify additional attributes (called Extended Fields) within profiles. A project I worked on recently used extended fields to identify the name of the mapping service (conceptually equivalent to specifying a Java class) that a general purpose document handling service would invoke. Here are the high-level steps:
1. TN would do its thing, identifying the doc type, the sender, the receiver, etc., select the appropriate processing rule and then invoke the document handler.
2. The doc handler is generic--it knows nothing about the specifics of the document type. Using the sender and/or receiver data, it would look up from extended fields which mapping service to use to change the record to a target format. It knows what extended field to retrieve from parameters passed to it from the processing rule.
3. The mapping service is document specific. It changes a source record to a target record and returns the target record along with the target record name.
4. The doc handler converts the target record to a bizdoc.
5. The target bizdoc is submitted to TN for delivery, additional transformation, or whatever other processing that needs to be done.
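Steps 2 and 3 can be sketched outside of webMethods as a lookup followed by an indirect call. This is only an analogy under stated assumptions: the class and method names are hypothetical, in-memory Maps stand in for TN profile extended fields and for invoking an IS service by name, and no TN API is used.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical generic handler: knows nothing about any document type.
class GenericDocHandler {

    // Mapping services registered by name (stand-in for invoking a service by name).
    private final Map<String, UnaryOperator<Map<String, String>>> mappingServices = new HashMap<>();

    // Extended fields per partner (stand-in for TN profile extended fields).
    private final Map<String, Map<String, String>> extendedFields = new HashMap<>();

    void registerService(String name, UnaryOperator<Map<String, String>> service) {
        mappingServices.put(name, service);
    }

    void setExtendedField(String partnerId, String field, String value) {
        extendedFields.computeIfAbsent(partnerId, k -> new HashMap<>()).put(field, value);
    }

    // fieldName is supplied by the caller, mirroring the processing rule
    // passing in which extended field holds the mapping service name.
    Map<String, String> handle(String senderId, String fieldName, Map<String, String> source) {
        Map<String, String> fields =
                extendedFields.getOrDefault(senderId, Collections.<String, String>emptyMap());
        String serviceName = fields.get(fieldName);
        UnaryOperator<Map<String, String>> service = mappingServices.get(serviceName);
        if (service == null) {
            throw new IllegalStateException("No mapping service configured for sender " + senderId);
        }
        // The document-specific work happens entirely inside the mapping service.
        return service.apply(source);
    }
}
```

Supporting a new partner then means registering one mapping service and setting one extended field; the handler itself does not change.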
We did a custom delivery service to move documents to the Enterprise Server, when that was desired.
Thus, we had an architecture where every document was submitted to TN, using TN as a sort of broker. Docs are never submitted directly to B2B services by outside processes (except when necessary to "preprocess" in order to get to TN) and docs are never directly sent from B2B services to anywhere other than TN.
Regarding the silver bullet, I agree it can be discouraging. You'll note that all those things are still with us! I think it's a matter of viewing things as evolutionary and as being the natural progression of things--client/server led to more powerful/flexible ERPs which led to the need to tie things together more quickly which led to EAI which led to the desire for standards which led to web services and so on.
IMO, web services won't supplant EAI any more than any other RPC mechanism would. Direct RPC connections are often the right way to go but the facilities provided by EAI tools will not be/cannot be replaced by web services. The "aggregate application" model, in which some umbrella logic exists that does not/cannot be hosted in one of the participating applications, assures this.
Of course, that's just my opinion. I'm probably wrong. ;-)
I am still trying to digest all the above. I am comparatively new to TN.
In this scenario...
partner formats --> one canonical format --> backend format
why isn't it a good idea to convert from partner format to canonical format in B2B? Once it's converted, then pass it on to TN to manage. (Sorry for such a dumb question :| )
can somebody explain, please.
Thanks.
Ultimately, that's exactly what you should do, but via TN.
The reason you want to pass the partner format directly to TN is so you can record it (and restart, log errors, etc).
Partner doc --> TN --> recorded --> passed to service for conversion to canonical format
Canonical format --> TN --> recorded --> passed to service for conversion to backend format
Backend format --> TN --> recorded --> delivered using ftp, http or custom delivery service (written to DB, passed to wM Enterprise, written to MQSeries queue, etc.)
TN does some important work, but none of it pertains to transformation directly. It simply figures out what services to invoke or where to deliver a doc based on doc contents and processing rules. The B2B services do all the transformation work.
To sum up: use TN for all routing, use B2B services for transformation
This is similar in concept to wM Enterprise--all events are routed by the broker. All transformations are done not by the broker but by the agents and adapters.
Thanks for making me understand a little better. This thread has been quite informative in trying to understand whole webMethods Architecture.
But in regard to the approach defined above, I feel it's an extra effort.
If we do all logging/DB etc. in TN, then we need to pass extra information between B2B and TN. Say, for example, partner format --> canonical format fails; then we need to pass failure information back to TN, instead of logging it in B2B itself.
Also, I think TN Console has some nice features, but not a good web interface for customer support people to look into orders and do order management (e.g. if an order fails, what the reason is, whether the customer needs to be informed, whether to correct/resubmit, etc.). I think to use these features, one needs to log them into TN Console, which opens another box full of questions (how to restrict them to a few basic tasks so that they don't wreak havoc on the system).
Hence, if the above mentioned features are to be developed in B2B, I don't understand the need to pass formats around TN for DB/logging purposes. Anyway, one has to develop a logging utility in webMethods to log all transformations between partner, canonical and backend formats.
Once again, I am new to TN. Hence these basic questions.
Thanks.
Keep in mind that TN isn't an entity separate from B2B, even though the wM positioning of TN would lead you to believe that it is. TN is just a collection of B2B services. TN is part of B2B and runs fully within the B2B Server (now Integration Server) environment.
In addition to the routing of documents to specific services based on doc type, sender, receiver, etc., TN has a pretty good set of logging services. During processing of a bizdoc, you can log any message you wish into the TN db using simple service calls. During the processing of a partner doc, you can associate it with the resulting canonical doc and later with the backend doc. Then with TN Console, you can track what happened to a given doc through all its transformations, and restart it at any point if necessary.
Of course you can log things to B2B server logs but logging things to the TN db is a bit more robust and useful (though wading through the often huge server log is sometimes loads o' fun.)
Thus, there really is no need to develop your own logging utility. TN has what you need (though it has some warts).
I agree that it would be nice to have a restricted TN Console that allowed only certain operations for different roles.
Thanks a lot, Rob (and everyone else, for the wonderful insights into webMethods products).
I now have a much better understanding, in particular of the relationship between TN and B2B IS, and how both should be structured.
On a different note.
(Theoretically) one can also build a web interface around the TN database. Since TN Console is a little restricted in its utility in terms of order management, one can ideally build a web interface wrapped around the database. [Obviously, this interface will be read only - as one doesn't want to mess with TN update db calls.]
Warm Regards.
raymoser
07-11-2002, 19:02
I've been following this thread with great interest to see exactly where it would go. Rob raised some good points regarding the use of TN and I agree with him in principle with respect to the document routing capabilities of TN.
I want to clarify the building of a canonical for the mapping of the flat files to XML. The canonical must contain a superset of all data objects. This will permit mapping from each flat file to a common XML (or IData Record) structure. However, in B2B land, we don't refer to our data structures as canonicals, so the term will be unclear unless you are an Enterprise god like Rob. If the structure of the "canonical" is correct, future data value additions will not be a problem.
From my perspective, I am used to NOT having TN. I've either connected to TN or a queue in most of the deployments. From my perspective, TN is a luxury due to the DB support needed (and my lack of desire to administer.)
Some of the suggestions of wrapping the flat file contents into xml tags and submitting the file is great, but probably won't happen in a mainframe environment.
One thing that worked well for me in the past was intelligent file naming conventions. For example, you could name the file as such:
FileType_custNum_dateTimeStamp_formatType_To_FormatType
Order_1001_2002-07-11T200246_FlatFile_ebXML
The client machine that is posting this can send the contents and then the local file name as parameters. You can parse this out.
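Parsing such a name is straightforward. The sketch below is hypothetical (class and field names are made up) and accepts both the five-part form used in the example and a six-part form with a literal "To" between the two formats, as in the pattern line. It assumes none of the parts contains an underscore.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical value class for the file naming convention described above.
class PostedFileName {
    // Matches timestamps like 2002-07-11T200246 (no colons in the time part).
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HHmmss");

    final String fileType;
    final String customerNumber;
    final LocalDateTime timestamp;
    final String sourceFormat;
    final String targetFormat;

    private PostedFileName(String[] parts, String targetFormat) {
        this.fileType = parts[0];
        this.customerNumber = parts[1];
        this.timestamp = LocalDateTime.parse(parts[2], STAMP);
        this.sourceFormat = parts[3];
        this.targetFormat = targetFormat;
    }

    // Accepts "..._FlatFile_ebXML" as well as "..._FlatFile_To_ebXML".
    static PostedFileName parse(String name) {
        String[] parts = name.split("_");
        boolean hasTo = parts.length == 6 && parts[4].equalsIgnoreCase("To");
        if (parts.length != 5 && !hasTo) {
            throw new IllegalArgumentException("Unexpected file name: " + name);
        }
        // The target format is always the last part in either form.
        return new PostedFileName(parts, parts[parts.length - 1]);
    }
}
```

The parsed file type, customer number and formats can then drive which doc type or mapping is applied before the content is handed on.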
The majority of my mainframe or backend experience used FTP to move the files back and forth. I have one instance of a shared drive on a Unisys system (more trouble than it's worth.)
Keep in mind that one of the main reasons that companies use B2B is the quick deployment capability. Overarchitecture of the system can lead to circular debugging. You'll be chasing yourself in circles trying to find your tail when something goes wrong.
I suggest planning the ultimate system, but then implementing it in segments that will be easy to handle, yet still deliver to the business folks who requested it. So, you can implement a B2B/TN service with the bare minimum and then ramp up to a full-fledged reporting/routing system later.
I also heard through the WM folks that the TN console in the new version will be more restrictive and customizable.
Ray wrote:
"...wrapping the flat file contents into xml tags and submitting the file is great, but probably won't happen in a mainframe environment."
This is one area where the marriage of wM Enterprise with IS turns out to be a Good Thing. Using the adapters of wME or using the Mainframe Integration Server, you can easily do minimal data massaging to get things into a form acceptable to TN. Then using the B-E Package, you can submit the doc to TN.
Saurabh - you raised a good point about giving customer service people access to carry out order management in TN via a web interface.
There is a product called WmTNWeb that does this exact thing. However, it is a bit insecure because people who use it must be in the 'TNAdministrators' group. This is insecure because they could easily install TN Console on their local machine, log in, and change processing rules, etc. if they wanted to. We have requested the TN team to work on this.
If you are willing to accept the security-by-obscurity approach, i.e. assuming that it is secure because the customer service people don't have access to the TN Console install binary, then your problem is basically solved. Beware though: a few weeks ago I heard some talk about WM making its products available as a trialware download -- I don't remember where.
g_lokanadh
02-03-2004, 08:08
Hi professionals,
I am new to webMethods. Can anyone briefly explain the role of templates in webMethods: what is a template, and how do you create one in webMethods?
I am in urgent need and am hoping for an early reply.
with regards
Check the DSP and Output Templates developer guide. You can access it from WM Developer Help -> User Documentation.
raymoser
02-03-2004, 09:05
The output templates are ok but I think you need to concentrate on four possible scenarios from worst to best:
1. Write your own java service to parse the file.
2. Write flow service to parse the file (not very optimal).
3. Use the WmEDI parser to create templates and parse and convert the file to an IData object that can then map to an XML file.
4. Use the WmFlatFile package. This is not the easiest to understand upfront, but it seems to work well for simple files.
I go between writing my own flow services and using the WmFlatFile package depending on what my client requires. I do not utilize the WmEDI unless the client uses the package already due to the overhead.
HTH
Ray
jack (Unregistered Guest)
02-03-2004, 13:29
hi lok,
As you are new to webMethods and the software industry, it is better to stick to the fundamentals first, such as Intro to Integration, EdiModuleConceptsGuide, EdiModuleUserGuide, and so on; only then will you avoid panicking. Try to build up both your Java and your webMethods skills. If you start from both ends, you will meet in the middle. Please don't take this negatively; take it in a positive way.
Thanks
Jack
Hi to everybody,
I am Srinivas, learning webMethods Developer 6.x. I want to be good at the fundamentals. Could anybody send me documentation with step-by-step technical examples using BRANCH, LOOP, MAP and SEQUENCE? Kindly respond to my request.
Thanks in advance,
Best regards
srinivas,
ssirapu@miraclesoft.com
Srinivas,
Go through the documentation (ISDeveloperGuide.pdf); it explains everything you are looking for. The doc is located in the (C:\webMethods6\Developer\doc) folder.
For practical examples of using those steps, look into the WmSamples package, which comes with the standard webMethods installation.
Goodluck...