WebMethods & Informatica: 15,000 PDFs once per month


wleishman
05-26-2003, 08:45
Hi.
We have a requirement to transfer 15,000 PDF's from our mainframe to two image repository applications (FileNet & Calligo) once per month. Each PDF is approximately 1MB.

Our ETL tool (Informatica) can (in theory) retrieve the PDF's from the mainframe (using their PowerConnect for Mainframe). The repository applications require the use of an API, but Informatica does not have a connection to FileNet or Calligo. It can write binary (BLOB) data to some RDBMSs (like MS SQL Server, but not Oracle). Our company uses Oracle.

We do have the new PowerConnect for webMethods, which gives Informatica the ability to pub/sub with webMethods broker.

I'm thinking of using a combo-solution. Informatica to retrieve the PDF's, and publish (using PowerConnect for webMethods) to the broker.

We would then create two custom webMethods adapters. One to talk to FileNet using their Java api, and the other to talk to Calligo using their Java api. These adapters would subscribe to the documents published from Informatica.

Could I get some thoughts on this ?

Any concerns about webMethods broker (5.0.1. right now, and 6.0.1 eventually) handling 15,000 1MB documents once per month ?

Regards,

Wayne

reamon
05-27-2003, 21:26
A 1MB event seems a bit huge for a document/event. You might consider not using the broker (no real need for pub/sub here) and either writing custom connectivity for Informatica to FileNet/Calligo (seeing as you'd be writing custom code for wM Broker anyway) or going through IS instead, which has better support for large doc handling.

wleishman
05-28-2003, 06:13
Rob, thanks very much for the feedback.

Regards,

Wayne

fredh666
05-28-2003, 10:30
My guess is that a 1MB doc on a fast network would not require compression.

If using IS, the main issue would be getting enough concurrency to utilize the network and disks, but not so much that you run out of RAM.

Given that the data is PDF and, therefore, doesn't need to be transformed, you really would just need a service that gets the file using the IS Mainframe Adapter and a service that puts a stream (or blob depending on the APIs available from FileNet & Calligo). Also, I would make a "launcher" service.
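A minimal sketch of the get/put idea in plain Java, assuming both ends can be exposed as ordinary java.io streams. The class and method names here (GetPutSketch, copy) are made up for illustration; no FileNet, Calligo, or webMethods API is used, and in-memory streams stand in for the mainframe source and the repository target.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: moves one document from a source stream to a target
// stream without transforming it, since a PDF is copied byte for byte.
final class GetPutSketch {

    // Copies everything from 'in' to 'out' in fixed-size chunks and returns
    // the number of bytes copied. Only one buffer is held in memory at a time.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a 1MB PDF; a real source would be the mainframe read.
        byte[] fakePdf = new byte[1024 * 1024];
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(fakePdf), target, 8192);
        System.out.println("bytes copied: " + copied);
    }
}
```

If one of the repository APIs only accepts a whole byte array (a blob) rather than a stream, the same loop can fill a ByteArrayOutputStream first, at the cost of holding the full 1MB per in-flight document.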

The launcher would be a service scheduled to run each month that knows what docs need to be copied. It could kick off XX copies of the GetPut service (where XX is the number of concurrent transfers that can be supported by the hardware).
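The launcher described above could be sketched in plain Java roughly as follows. This is only an illustration under assumptions: LauncherSketch, run, and the getPut callback are hypothetical names, the document list is passed in rather than discovered, and a fixed-size thread pool plays the role of "XX concurrent transfers". It is not a webMethods service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical launcher: runs one get/put task per document id, with at most
// maxConcurrent tasks in flight, and reports which ids failed.
final class LauncherSketch {

    static List<String> run(List<String> docIds, int maxConcurrent, Consumer<String> getPut)
            throws InterruptedException {
        // The pool size caps concurrency, which in turn caps memory use.
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String id : docIds) {
                futures.add(pool.submit(() -> getPut.accept(id)));
            }
            // Wait for every task; collect the ids whose transfer threw.
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    failed.add(docIds.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            ids.add("doc-" + i);
        }
        // The callback here just prints; a real one would do the transfer.
        List<String> failed = run(ids, 4, id -> System.out.println("copied " + id));
        System.out.println("failed: " + failed);
    }
}
```

Returning the failed ids lets the monthly run be retried for just those documents instead of all of them.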

To get 15,000 transferred in a day, you would need a little more than 10 a minute. I hear the IS Mainframe stuff is pretty smokin' fast, so the bottleneck would probably be inserting into the doc management systems.
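Working from the 15,000 documents of roughly 1MB each in the original question, the required rate can be checked with a few lines of Java. The class and method names are made up for this sketch, and the 24-hour and 8-hour windows are assumptions, since the thread does not say how long the monthly run may take.

```java
import java.util.Locale;

// Hypothetical back-of-the-envelope calculator for the monthly transfer.
final class ThroughputSketch {

    // Documents that must complete per minute to finish within the window.
    static double docsPerMinute(int docs, double windowHours) {
        return docs / (windowHours * 60.0);
    }

    // Sustained data rate in MB per second for the same window.
    static double megabytesPerSecond(int docs, double mbPerDoc, double windowHours) {
        return (docs * mbPerDoc) / (windowHours * 3600.0);
    }

    public static void main(String[] args) {
        int docs = 15_000;      // from the original question
        double mbPerDoc = 1.0;  // approximate size from the original question
        for (double hours : new double[] {24.0, 8.0}) {  // assumed windows
            System.out.println(String.format(Locale.ROOT,
                    "%.0f h window: %.1f docs/min, %.2f MB/s",
                    hours,
                    docsPerMinute(docs, hours),
                    megabytesPerSecond(docs, mbPerDoc, hours)));
        }
    }
}
```

Over 24 hours that works out to about 10.4 documents a minute and about 0.17 MB/s; squeezing the run into 8 hours raises it to 31.25 a minute and about 0.52 MB/s. The total is around 15GB either way, so the per-document insert time matters more than raw bandwidth.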

wleishman
05-28-2003, 10:40
Fred thanks for the feedback. Unfortunately the webMethods Mainframe server/adapter does not allow us to retrieve PDF's from the mainframe. It can connect to CICS and IMS, but not VSAM or flat files on the mainframe. Perhaps we can create a custom CICS program to read these files, and then use the webMethods mainframe server/adapter to get the data. Not sure though.
The Informatica PowerConnect for Mainframe (which we have) can (in theory) retrieve data from Partitioned DataSets (PDS) on the mainframe.

Keep those ideas coming!

Regards,

Wayne

jbraunstein
06-17-2003, 18:51
webM is not well designed for batch processes or large file transfer. Informatica IS well designed for batch processes and large amounts of data (at least, that's what I keep hearing from the data warehousing guys). As Rob mentioned, there is no webM adapter for either app, so you're either going to be writing custom Java services or using the ADK to create a pre-packaged adapter in webM.

I don't know much about Informatica's development platform, but I would be extremely surprised if they don't have the capability to write Java code, similar to writing Java services in Developer. Every integration vendor has the capability to write Java code in their platform.

So, if the choice is to write custom connectivity in Informatica or webM, for this particular process I would transfer through Informatica and only use webM for the real-time triggering, status reporting, and event controlling (start, end) parts of the integration. Send the files through Informatica.