
Can't query an html page with charset=unicode using pub.web:queryDocument


marius
02-07-2003, 15:58
Hello !

I'm building a simple web flow in B2B 4.0 and I've run into the following issue:
I have a flow service with "loadDocument" and "QueryDocument". I can't query the following html:
"
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=unicode"><form name="usfForm">
"
It seems that the "unicode" charset is causing the problem.
The error that I get is:
Could not obtain the Document View from the Server
...
com.wm.util.LocalizedCharConversionException
Incorrect character encoding (Missing byte-order mark)

I've tried specifying the encoding in the loadDocument parameters (e.g. UTF-16), but with no success.

Any ideas?

Best regards,
Marius

fredh666
02-08-2003, 15:53
Two guesses:

1. Even if the HTTP headers are good and the content really is UTF-16, charset=unicode doesn't define whether the data is big-endian or little-endian, so without a BOM (byte-order mark) the parser doesn't know how to decode the data.
http://www.unicode.org/faq/utf_bom.html#22


2. Check the HTTP headers using pub.flow:getTransportInfo. The web server may be doing something funky with the encoding. Then look at the characters in the stream (fetch it with pub.client:http and look at the raw bytes to see whether they really are Unicode).
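
To make both guesses concrete, here is a rough sketch (plain Python, outside of webMethods) of what "looking at the bytes" could mean. The function name guess_utf16_variant and the 512-byte sample size are made up for illustration, and the zero-byte heuristic assumes the page is mostly ASCII markup, which is typical for HTML but not guaranteed.

```python
import codecs

def guess_utf16_variant(data: bytes):
    """Return a codec name for UTF-16 data, or None if it does not look like UTF-16.

    Hypothetical helper: checks for a byte-order mark first, then falls back
    to counting zero bytes, which only works for mostly-ASCII content.
    """
    # With a BOM, the generic "utf-16" codec picks the byte order and strips the mark.
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"

    sample = data[:512]
    half = len(sample) // 2
    if half == 0:
        return None

    # ASCII text in UTF-16LE looks like 3C 00 68 00 ... (zeros at odd offsets),
    # in UTF-16BE like 00 3C 00 68 ... (zeros at even offsets).
    even_zeros = sample[0::2].count(0)
    odd_zeros = sample[1::2].count(0)
    if odd_zeros > 0.5 * half and even_zeros < 0.1 * half:
        return "utf-16-le"
    if even_zeros > 0.5 * half and odd_zeros < 0.1 * half:
        return "utf-16-be"
    return None  # probably a single-byte encoding or UTF-8
```

If this returns "utf-16-le" or "utf-16-be", the content is UTF-16 without a BOM, which would match the "Missing byte-order mark" error. If it returns None, the page is probably not UTF-16 at all and the meta tag is simply wrong.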

Marius Ciplinschi
02-08-2003, 19:03
Thanks for the response!

I don't have any control over how the html is built. I need to find a way to load it and parse it using flow services.
I've tried it on webMethods version 3.0 and it works fine. I assume that 3.0 just ignores any unicode specifier.
IE is able to render it.
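
Since the html itself can't be changed, one possible workaround is to fetch the raw bytes, decode them outside the parser, and rewrite the meta tag before parsing. The sketch below is plain Python and only illustrates the idea. The name normalize_to_utf8 is hypothetical, and the utf-16-le default is an assumption (browsers generally treat the "unicode" label as little-endian UTF-16, which may be why IE renders the page). How this would be wired into a flow, e.g. as a small custom service between the fetch and the query step, is not shown and would need to be worked out.

```python
import re

def normalize_to_utf8(data: bytes, source_encoding: str = "utf-16-le") -> bytes:
    """Decode the page, replace the ambiguous charset label, re-encode as UTF-8.

    Hypothetical helper; source_encoding is an assumption and should be
    confirmed by inspecting the actual bytes first.
    """
    text = data.decode(source_encoding)
    # Drop a leading byte-order mark if one happens to be present.
    text = text.lstrip("\ufeff")
    # Replace charset=unicode in the meta tag so the parser sees a label it accepts.
    text = re.sub(r"charset\s*=\s*unicode", "charset=utf-8", text, flags=re.IGNORECASE)
    return text.encode("utf-8")
```

The output is ordinary UTF-8 with a matching meta tag, so a parser that rejects BOM-less UTF-16 should have nothing left to complain about.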