Introduction

Processing a large file is a real challenge for any developer. When a recent project's requirements dictated that a large XML file be parsed, I recognized that there were two choices: 1) write a custom Java service to read a chunk of XML into memory and then process it, or 2) use the webMethods Node Iteration services for parsing. To minimize memory consumption, I took the Node Iterator approach. By iterating over each node of the XML document as needed, rather than reading the entire document node tree, the webMethods built-in Node Iterator services give developers an efficient parsing method for large XML documents.

An Example -- The Purchase Order

Let us consider an XML Purchase Order document with the following structure. It is a fictional purchase order, but let's consider it for the sake of illustration.

    <PurchaseOrder>
      .........................
      <OrderItemList>
        <OrderItem>
          <ItemName>Item 1</ItemName>
          <ItemQuantity>100</ItemQuantity>
          <ItemPrice>100.00</ItemPrice>
        </OrderItem>
        <OrderItem>
          <ItemName>Item 2</ItemName>
          <ItemQuantity>200</ItemQuantity>
          <ItemPrice>200</ItemPrice>
        </OrderItem>
        ................
        <OrderItem>
          <ItemName>Item 100000</ItemName>
          <ItemQuantity>700</ItemQuantity>
          <ItemPrice>700.00</ItemPrice>
        </OrderItem>
      </OrderItemList>
    </PurchaseOrder>

Let’s assume that this purchase order contains millions of items, which would make the XML document very large. The conventional webMethods approach of parsing the whole XML document into memory will result in the well-known Integration Server "Out of Memory" error. Meaning, if you use the built-in services [...]

Since we are only interested in the large document's OrderItem nodes, parsing the entire document is a waste of both system resources and time. Node Iteration is a better option because the Node Iterator services provide a cursor that can be placed directly on any node of an XML document.
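The cursor idea can be illustrated outside of webMethods. The sketch below is not the Node Iterator API; it is a plain-Java analogue that uses the standard StAX `XMLStreamReader` to visit one `OrderItem` at a time without building the whole document tree. The class name `OrderItemCursor` and the method `forEachOrderItem` are hypothetical names chosen for this illustration, and the code assumes that each child of `OrderItem` contains only text, as in the sample purchase order.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class OrderItemCursor {

    /**
     * Walks the stream and hands each OrderItem to the handler as a small map
     * of child element name to text. Only one item is held at a time.
     * Returns the number of OrderItem elements seen.
     */
    static int forEachOrderItem(InputStream in, Consumer<Map<String, String>> handler) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            Map<String, String> item = null; // non-null only while inside an OrderItem
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("OrderItem".equals(name)) {
                        item = new LinkedHashMap<>();
                    } else if (item != null) {
                        // Assumption: children of OrderItem are text-only elements.
                        item.put(name, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "OrderItem".equals(reader.getLocalName())) {
                    handler.accept(item);
                    item = null; // drop the reference so the item can be collected
                    count++;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the XML stream", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<PurchaseOrder><OrderItemList>"
                + "<OrderItem><ItemName>Item 1</ItemName><ItemQuantity>100</ItemQuantity>"
                + "<ItemPrice>100.00</ItemPrice></OrderItem>"
                + "<OrderItem><ItemName>Item 2</ItemName><ItemQuantity>200</ItemQuantity>"
                + "<ItemPrice>200</ItemPrice></OrderItem>"
                + "</OrderItemList></PurchaseOrder>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        int total = forEachOrderItem(in, item -> System.out.println(item));
        System.out.println("Items processed: " + total);
    }
}
```

In a real job the `InputStream` would come from a file or network source rather than an in-memory string, so that memory use stays roughly constant regardless of how many items the order contains.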
In the XML document above, for example, we can invoke the Node Iteration services and parsing logic directly on the <OrderItem> nodes.

Using the Node Iteration Built-In Services

webMethods provides two built-in services for iterating over XML document nodes:
If the option is available, load the XML document as a stream rather than as bytes. Stream-based data is loaded on demand, so system memory is managed more efficiently. Once loaded (either as a stream or as bytes), generate a Node object by invoking the [...] The output of the [...] Now, with the Node object in the pipeline, proceed with the [...]

A sample package accompanying this webMethods Ezine article demonstrates the above example. Run the service [...]

Moving Window Mode

The Node Iterator services may also be invoked in "Moving Window" mode. In Moving Window mode, only the most recent Node object returned by [...] Disregarding "old" nodes makes memory management more efficient. The burden is placed on the developer, though, to completely process the Node object before getting the next node with [...]

To use Moving Window mode, set the [...]

A Few Extra Tips

A few more tips for processing large files using the Node Iterator services:
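To make the Moving Window contract described earlier more concrete, here is a small plain-Java sketch. It is not the webMethods implementation: `MovingWindowDemo`, `Node`, and `MovingWindowIterator` are hypothetical names invented for this illustration. The sketch only models the rule that the previous node stops being usable once the next one is fetched, which is why each node must be fully processed (or its data copied) before advancing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class MovingWindowDemo {

    /** Hypothetical stand-in for a node; usable only until the iterator advances. */
    static final class Node {
        private String text;
        private boolean valid = true;

        Node(String text) { this.text = text; }

        boolean isValid() { return valid; }

        String text() {
            if (!valid) {
                throw new IllegalStateException("Node was released when the window moved");
            }
            return text;
        }

        private void release() { valid = false; text = null; }
    }

    /** Keeps only the most recently returned node alive. */
    static final class MovingWindowIterator implements Iterator<Node> {
        private final Iterator<String> source;
        private Node current;

        MovingWindowIterator(Iterator<String> source) { this.source = source; }

        @Override
        public boolean hasNext() { return source.hasNext(); }

        @Override
        public Node next() {
            if (current != null) {
                current.release(); // the "old" node is discarded here
            }
            current = new Node(source.next());
            return current;
        }
    }

    /** Correct usage: copy what is needed from each node before asking for the next. */
    static List<String> collectNames(Iterator<String> source) {
        List<String> names = new ArrayList<>();
        MovingWindowIterator it = new MovingWindowIterator(source);
        while (it.hasNext()) {
            Node node = it.next();
            names.add(node.text());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(collectNames(Arrays.asList("Item 1", "Item 2").iterator()));

        // Incorrect usage: holding on to a node after the window has moved.
        MovingWindowIterator it =
                new MovingWindowIterator(Arrays.asList("Item 1", "Item 2").iterator());
        Node first = it.next();
        it.next();
        System.out.println("first still valid? " + first.isValid());
    }
}
```

The trade-off mirrors the one in the article: discarding old nodes keeps memory use low, at the cost of requiring the caller to finish with each node before moving on.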
Go Deeper on the Subject: The wMUsers Discussion Forums

Prashantha Upadhya is a Senior Technical Principal consultant at Inventa, a firm specializing in performance management, application development, and enterprise and business-to-business integration solutions. Prashantha has over 9 years of experience in the software industry, including B2B integrations. Prior to Inventa, he held positions in bio-medical research labs, and he has a master's degree in Bio-Medical Engineering.
Prashantha can be reached via email at
© All Rights Reserved, 2001-2008.