**Barbara Morris** · October 14, 2019, 04:11 PM

'£' _does_ exist in UTF-8. X'C2A3' is the 2-byte UTF-8 character for '£'. It sounds like something along the way doesn't understand that it's UTF-8.

Here's a little RPG program that shows how interpreting the UTF-8 character as ASCII gives the result you're seeing.

Code:

        dcl-ds *n;                                           
           utf8 varchar(5) ccsid(*utf8) inz('£') pos(1);     
           ascii_819 varchar(5) ccsid(819)       pos(1);     
        end-ds;                                              
        dcl-s job varchar(5);                                
        job = utf8;        // interpret x'C2A3' as one UTF-8 character
        job = ascii_819;   // interpret x'C2A3' as two ASCII characters
        return;

In debug:

Code:

> EVAL utf8:x                   
     00000     0002C2A3 404040..

After assignment from utf8:
> EVAL job                                         
  JOB = '£    '                                    
> EVAL job:x                                       
     00000     0001B100 000000..

After assignment from ascii_819:
> EVAL job                                         
  JOB = 'Â£   '                                    
> EVAL job:x                                       
     00000     000262B1 000000..
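
For anyone without an IBM i debugger handy, the same byte values can be reproduced in a few lines of Python. This is only a sketch: `latin-1` stands in for CCSID 819, and `cp037` for the job CCSID is an assumption (the thread never names it, but X'B1' and X'62B1' in the debug session above are consistent with CCSID 37).

```python
# Bytes for the pound sign as a UTF-8 client would send them.
wire = "£".encode("utf-8")

# Correct path: decode as UTF-8, then convert to EBCDIC.
# cp037 is an assumption; the thread does not state the job CCSID.
good = wire.decode("utf-8").encode("cp037")

# Faulty path: treat each byte as a separate ISO 8859-1 (CCSID 819) character.
misread = wire.decode("latin-1")
bad = misread.encode("cp037")

print(wire.hex().upper())   # C2A3
print(good.hex().upper())   # B1
print(misread)              # Â£
print(bad.hex().upper())    # 62B1
```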

**Vectorspace** · October 15, 2019, 01:52 AM

So it sounds like one of two things:
Either the web server's conversion to EBCDIC should be converting %C2%A3 to %B1 instead of %62%B1 (maybe it thinks the incoming data is ASCII instead of UTF-8?),
or QzhbCgiParse() is not correctly decoding the escaped EBCDIC-encoded characters, and should be converting %62%B1 to £ instead of Â£.

I think the web-side character set is determined by the web server? Any idea where or how I can view that?
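
The two possibilities can be told apart by modelling each stage separately. The sketch below is not how the web server or QzhbCgiParse() is actually implemented; `reescape_as_ebcdic` is a hypothetical helper, and `cp037` is an assumed job CCSID. It only shows which stage would have to be at fault to produce each escaped value.

```python
from urllib.parse import unquote_to_bytes


def reescape_as_ebcdic(escaped: str, assumed_charset: str,
                       ebcdic: str = "cp037") -> str:
    """Hypothetical model of the server's conversion step.

    Unescape the incoming value, decode it using whatever character set
    the server assumes, convert to EBCDIC, and escape every byte again.
    """
    raw = unquote_to_bytes(escaped)
    text = raw.decode(assumed_charset)
    return "".join(f"%{b:02X}" for b in text.encode(ebcdic))


# Stage 1: the server's conversion of the incoming escaped value.
print(reescape_as_ebcdic("%C2%A3", "utf-8"))    # %B1
print(reescape_as_ebcdic("%C2%A3", "latin-1"))  # %62%B1

# Stage 2: decoding the EBCDIC escapes, as the parse step would.
print(unquote_to_bytes("%62%B1").decode("cp037"))  # Â£
```

Under these assumptions, %62%B1 decodes faithfully to Â£, so the second stage is doing what it was asked to do with the bytes it was handed.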

**Scott Klement** · October 15, 2019, 09:58 AM

Frankly, it's very clearly the former. It's treating the input as ASCII rather than UTF-8.

**Vectorspace** · October 16, 2019, 01:05 AM

I assume there's no way to override that within my CGI program? So much hangs off this server instance there's no way anyone will want to risk changing its config.

**Scott Klement** · October 17, 2019, 02:01 PM

Seems to me that the error has already been made by the time your CGI program is invoked, so it's too late to fix it at that point.

Either the program that is sending the document has to explicitly specify that the data is UTF-8, or you'd have to change your config.
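
The thread doesn't say how the sender would specify UTF-8. One common way in HTTP is a `charset` parameter on the `Content-Type` header of a POST, and the sketch below assumes that is what's meant. Whether this particular server honours it is not confirmed anywhere in this discussion. The URL and the field name `amount` are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# urlencode() percent-escapes non-ASCII text as UTF-8 by default.
body = urlencode({"amount": "£"}).encode("ascii")

# Hypothetical endpoint; the charset parameter labels the form data as UTF-8.
req = Request(
    "http://example.invalid/cgi-bin/hypothetical",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
)

print(req.data)                        # b'amount=%C2%A3'
print(req.get_header("Content-type"))  # urllib stores header names capitalized
```

Note that this only applies to a request body. A value passed in the query string of a GET has no equivalent label, so in that case the server configuration would be the only place to fix it.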

QzhbCgiParse not correctly decoding URL encoded characters?
