**Scott Klement** · July 8, 2019, 10:09 AM

Your third link is to the ILE COBOL manual. XML-SAX is an RPG opcode.

**AnthonyO** · July 9, 2019, 08:43 AM

It sounds like your file is not truly formatted in CCSID 819? I'd expect xml-sax should be able to translate it for you provided all your formats are correct. If it's a different format, I think you should be able to use the correct CCSID instead.

**RDKells** · July 9, 2019, 10:08 AM

I did not notice the COBOL thing. I really do find it hard to find the information I need on the IBM site; the search never seems to show what you want. Although it's for COBOL, it seems somewhat relevant for RPGLE, as normal events seem to stop firing once an exception is hit. I have had a look for something similar in the RPGLE docs but I can't find info on it.

I also can't find a code page for 819 (one that details the symbols and their hexadecimal values) on the IBM site, but I've found one on Google and it doesn't have the € symbol. I guess, therefore, that the problem is € not existing in CCSID 819, triggering error code 6.

Per my initial post, I've tried modifying the CCSID of various things without any luck. How would one determine the correct code page? And again, is there a way to skip the parser error codes? Or is it just a case of doing some sort of pre-validation of the file to make sure it's valid?

**AnthonyO** · July 9, 2019, 10:58 AM

In order to use the CCSID, you would need to know the source. Any information from the source system?
Does your XML file have an encoding declaration at the top? Something along the lines of <?xml version="1.0" encoding="utf-8"?>. UTF-8 would be CCSID 1208.
If you were using 819 you may see ISO 8859-1 in the encoding.
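
A minimal sketch of that check, assuming the file can be inspected off-platform with Python. The function name `declared_ccsid` and the lookup table are hypothetical; the table holds only the encoding/CCSID pairs mentioned in this thread and is not a complete list.

```python
import re

# Encoding name -> CCSID, limited to the pairs mentioned in this thread.
# This table is illustrative, not a complete or authoritative mapping.
ENCODING_TO_CCSID = {
    "utf-8": 1208,
    "iso-8859-1": 819,
    "iso-8859-15": 923,
    "windows-1252": 1252,
}

# Matches an XML declaration at the start of the data and captures the
# value of its encoding attribute, if there is one.
DECL = re.compile(rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def declared_ccsid(head: bytes):
    """Return (encoding_name, ccsid) from the XML declaration.

    Returns (None, None) when there is no declaration or no encoding
    attribute; ccsid is None when the name is not in the table above.
    """
    match = DECL.match(head)
    if not match:
        return None, None
    name = match.group(1).decode("ascii").lower()
    return name, ENCODING_TO_CCSID.get(name)
```

Passing the first few hundred bytes of the file is enough, since the declaration can only appear at the very start of the document.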

Not sure if you could skip the parser error codes, did you try an on-error check, and see if it continues?

**RDKells** · July 9, 2019, 11:19 AM

Hi Anthony,

No source system information, it's a translation log from a till.

The header of the file is literally;
<POS xmlns:dt="urn:schemas-microsoft-com:datatypes">

From what I understand this is a name space, not an encoding declaration, so it's no help here?

On-error catches the parser error, and that's how it works currently, but later on down the line it crashes because data is missing. I traced it back to the error code 6 issue.

After reading up it seems the information for COBOL is pertinent to RPGLE also; if an exception occurs the program stops firing normal events and therefore the XML file is only partially processed.

I have tested this with the XML-SAX example in the URL and it seems to be the case, so I guess I'll try some different CCSID combinations to see if I can get it to work, otherwise I'm going to have to do some sort of pre-validation on the file to make sure it's valid.

**Scott Klement** · July 9, 2019, 01:28 PM

Originally posted by RDKells

No source system information, it's a translation log from a till.

Sounds like the till is the source system, then. If it's able to create a file, then it has some sort of CPU and some sort of software, etc, running on it.

You would want to look in the documentation, or talk to the technical support for the company that makes it to find out which character set/encoding it is using to create the file.

Originally posted by RDKells

The header of the file is literally;
<POS xmlns:dt="urn:schemas-microsoft-com:datatypes">

I think Anthony was asking about the XML processing instructions (PI). That would look like a tag with question marks in it, like this: <?xml?> It sounds like there isn't one in this document, which is a shame.

If you can't get an answer from the people who make the device, you may be able to guess based on the code point used. For example, iso-8859-15 is mostly the same as iso-8859-1, except that the Euro symbol is at x'A4'. So if all other characters look the same as iso-8859-1 and the euro is x'A4', that might be it. If it's mostly the same as iso-8859-1, but the euro is x'80', that'd be Windows-1252. If it is a 3-byte sequence x'E282AC' then it is UTF-8, etc.

Guessing based on the code point is not as good as finding out from the source because the information is circumstantial. Multiple encodings may use the same code points, and other characters that aren't in the particular instance of the document you're reading would be unknown, so might not match. But, sometimes a guess is the best you can do.
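
As a rough sketch of that guessing approach, the three euro encodings described above can be checked in a few lines of Python. The function name `guess_encoding_from_euro` is hypothetical, and the result is only circumstantial: x'A4' is also a valid character (the generic currency sign) in iso-8859-1, so a hit on that byte proves little by itself.

```python
def guess_encoding_from_euro(data: bytes) -> str:
    """Guess an encoding from how a euro sign appears to be stored.

    Checks the three byte patterns discussed in the thread. The answer is
    a hint, not proof: other encodings may use the same code points.
    """
    # UTF-8 stores the euro as the 3-byte sequence E2 82 AC.
    if b"\xe2\x82\xac" in data:
        return "utf-8"
    # Windows-1252 stores the euro at 0x80.
    if b"\x80" in data:
        return "windows-1252"
    # iso-8859-15 stores the euro at 0xA4 (iso-8859-1 has a generic
    # currency sign there, so this case is ambiguous).
    if b"\xa4" in data:
        return "iso-8859-15"
    return "unknown"
```

The UTF-8 test comes first because it is the most specific pattern; the single-byte tests are only reached when no 3-byte euro sequence is present.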

**vazymimil** · July 10, 2019, 03:06 AM

I think you should try CCSID 923 (iso-8859-15); iso-8859-1 does not include the euro sign.

**RDKells** · July 10, 2019, 03:41 AM

The only "header" type information at the start of the XML document is what I posted, which I thought is what AnthonyO was asking for. There is no <?xml?> tag in the file.

How would I determine what the hex value of the character is? Just run it in debug and check it via eval x 32?

Thanks for the suggestion Nicolas - I changed it to 923 but still get the same error;

Message ID . . . . . . : RNX0351 Severity . . . . . . . : 50
Message type . . . . . : Escape
Date sent . . . . . . : 10/07/19 Time sent . . . . . . : 10:25:59

Message . . . . : The XML parser detected error code 6.
Cause . . . . . : While parsing an XML document for an RPG procedure, the
parser detected an error at offset 14778 with reason code 6. The actual
document is

The symbol at said position is the € symbol.

Opening the file in notepad++ gives this error;
XML Parsing error at line 1:
Input is not proper UTF-8, indicate encoding!
Bytes: 0x80 0x31 0x32 0x2E

As it doesn't seem possible to ignore the parser error and continue processing normally I'm just going to write something to pre-validate the file, if it fails then advise the offsets where the exceptions occurred and reject it - this will then force the other 3rd party to fix the issue their end.
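
A minimal sketch of such a pre-validation step, written in Python for illustration rather than RPG. It assumes the file is supposed to be iso-8859-1 (CCSID 819), where bytes x'80' to x'9F' are control codes rather than printable characters, so their presence suggests the data came from a different code page such as Windows-1252. The function name `find_suspect_bytes` is hypothetical, and whether these bytes are exactly what triggers reason code 6 is an assumption.

```python
def find_suspect_bytes(data: bytes, limit: int = 20):
    """Return (offset, byte) pairs for bytes in the range 0x80-0x9F.

    Offsets are 0-based positions in the raw file; they are not
    guaranteed to match the offset reported by the XML parser.
    Scanning stops once `limit` hits have been collected.
    """
    hits = []
    for offset, value in enumerate(data):
        if 0x80 <= value <= 0x9F:
            hits.append((offset, value))
            if len(hits) >= limit:
                break
    return hits
```

An empty result means no bytes in that range were found; a non-empty one gives the offsets to report back to the party that produced the file.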

**Scott Klement** · July 10, 2019, 08:51 AM

Originally posted by RDKells

How would I determine what the hex value of the character is? Just run it in debug and check it via eval x 32?

One way is to use the DSPF command, there's an F10=Hex option.

**RDKells** · July 10, 2019, 11:11 AM

I have just realised that I never stipulated that this is an IFS file, rather than a database file - apologies.

If I view it via WRKLNK / 5 / F10 the symbol doesn't show for me;

My job CCSID options are;
Coded character set identifier . . . . . . . . . : 65535
Default coded character set identifier . . . . . : 37

The hex is;

Code:

6D61726B 65642080 31322E30 30266C74 3B2F4C69   marked  12.00&lt;/Li

Codepage 819 has x'31' for '1'; preceding that is x'80', which I guess is the 1252 code point for the euro, as you pointed out earlier.
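
That reading of the hex can be cross-checked off-platform. The sketch below decodes the same 20 bytes with Python's `cp1252` and `latin-1` codecs, which are assumed here to correspond to CCSIDs 1252 and 819.

```python
# The 20 bytes from the hex display above.
raw = bytes.fromhex("6D61726B 65642080 31322E30 30266C74 3B2F4C69")

# As Windows-1252, byte 0x80 decodes to the euro sign.
as_1252 = raw.decode("cp1252")

# As iso-8859-1, the same byte decodes to a C1 control character
# (U+0080), not to a printable symbol.
as_819 = raw.decode("latin-1")

print(repr(as_1252))
print(repr(as_819))
```

Note that the `&lt;` in the text is literally present in the file as bytes 26 6C 74 3B, so it is an escaped `<` in the XML content rather than a display artifact.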

I've tried changing it to 1252 and I get error 302 instead;

302	The parser does not support the requested CCSID value or the first character of the XML document was not '<'.

Reading this document (don't worry, I checked it was relevant first);
https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/rzasc/xmlparselimit.htm

Suggests that 1252 isn't supported.

I wonder if the file is created in 1252 and changed to 819 somewhere, I will take a look at that.

Cheers,
Ryan

**Scott Klement** · July 10, 2019, 11:31 AM

1252 should work fine. The error message says that the document does not begin with a '<' character... Something's not right, here. How did you change it to 1252?

**RDKells** · July 11, 2019, 03:33 AM

Via CHGATR OBJ('/myifsfile.xml') ATR(*CCSID) VALUE(1252) (aka option 13 on the IFS file)

Is there another way of doing it?

**Scott Klement** · July 13, 2019, 04:50 PM

CHGATR is a good way to do it.

I've heard of people doing it other ways (such as the dialog in EDTF, or using CPY, etc) that would translate the text rather than just changing the CCSID value, and people can sometimes get confused about that. But the way you're doing it is correct.

I'm really surprised that CCSID 1252 wouldn't be supported. This seems like a bizarre limitation (the OS supports 1252, and RPG is internally translating the file... why on earth would it not support all of the CCSIDs that the OS supports?).

If that is indeed the problem, you should be able to work around it. Just do a CPY command, specify *TEXT (not *BINARY) and tell it to copy your file from CCSID 1252 to 1200. Then it should work, since 1200 is supported.

But I'm highly skeptical about that link you provided because it doesn't even list 1208, which is far and away the most widely used CCSID for XML documents. I'm away from the office right now with no access to try things, but maybe on Monday I'll do some quick tests on stuff like 1252 or 1208.

**RDKells** · July 15, 2019, 05:11 AM

Thanks for the CPY suggestion; 1252 didn't work but 1146 did - details below.

1252 no longer gave error 302 but it still gave error 6;

Code:

CPY OBJ('myfile.xml') TOOBJ('mynew1200file.xml')
FROMCCSID(1252) TOCCSID(1200) DTAFMT(*TEXT)

Message ID . . . . . . : RNX0351 Severity . . . . . . . : 50
Message type . . . . . : Escape
Date sent . . . . . . : 15/07/19 Time sent . . . . . . : 11:29:08

Message . . . . : The XML parser detected error code 6.
Cause . . . . . : While parsing an XML document for an RPG procedure, the
parser detected an error at offset 14778 with reason code 6. The actual
document is

Here's the hex of file after conversion; the Euro symbol was changed from 80 to 3F (The EUR symbol is just before the 1 in 12.00);

Code:

 - - - -  + - - -  - * - -  - - + -  - - - *    ----+----*----+----*
 D5E34040 40404094 81999285 84403FF1 F24BF0F0   NT     marked  12.00

I had a look at some of the common code pages we use here, 37 & 285, and the 285 page advises that in CCSID 1146 "9F" is replaced by the euro symbol. Looking at 37 and 285, "9F" is defined as;

"The currency sign (¤) is a character used to denote an unspecified currency."

So I converted from 1252 -> 1146;

Code:

CPY OBJ('myfile.xml') TOOBJ('mynew1146file.xml')
FROMCCSID(1252) TOCCSID(1146) DTAFMT(*TEXT)

Hex of file - you can now see the currency symbol;

Code:

 - - - -  + - - -  - * - -  - - + -  - - - *    ----+----*----+----*
 D5E34040 40404094 81999285 84409FF1 F24BF0F0   NT     marked ¤12.00

I didn't get a parser code 6 error and the file was processed via a job set as CCSID 65535/37.

I'll need to fully check all the relevant files to ensure nothing got mistranslated but at first glance; all looks good!
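
The x'9F' behaviour can be illustrated off-platform with Python, with one substitution: as far as I know the standard library ships no codec for 285 or 1146, so this sketch uses `cp037` and `cp1140` (the euro-enabled variant of 37) as stand-ins. That pairing is an assumption made for illustration; the 285/1146 pair is described above as differing at the same code point.

```python
# The text shown in the hex display, with the euro sign in place.
text = "NT     marked €12.00"

# In cp1140 (37 plus euro) the euro sign is stored at 0x9F.
ebcdic_1140 = text.encode("cp1140")
print(ebcdic_1140.hex().upper())

# The same byte read as cp037 is the generic currency sign, which is
# why the display shows that symbol instead of a euro.
shown_as_037 = ebcdic_1140.decode("cp037")
print(shown_as_037)
```

This is consistent with the file processing cleanly while the hex display, rendered under a job default CCSID of 37, still shows the generic currency sign.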

Thanks for your help, Scott

XML-SAX - Error code 6