ibmi-brunch-learn

XMLTABLE() Not Correctly Translating O-Umlaut


  • XMLTABLE() Not Correctly Translating O-Umlaut

    We receive XML documents that contain special characters that "should be" displayable on an IBM i, for example the O-umlaut character Ö. When we parse the XML document using XMLTABLE(), we get the à symbol followed by a question mark inside a box. This is not the only symbol we have this problem with. Some symbols even cause XMLTABLE() to fail parsing. Two questions:

    1. How do we get XMLTABLE() to properly parse special characters (like O-umlaut)?
    2. How do we determine in advance which special characters XMLTABLE() cannot parse?
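
    One possible explanation for the symptom is that the bytes are UTF-8 but are being read with a single-byte code page. The following is a minimal Python sketch, run off-platform purely as an illustration and not on the IBM i itself. It assumes the document really contains UTF-8 bytes, which the thread has not confirmed at this point. It shows that the two UTF-8 bytes for Ö turn into an A-tilde plus a second character when decoded as Windows-1252 or ISO-8859-1.

```python
# Illustration only: how the UTF-8 bytes for "Ö" look under the wrong code page.
# Assumption: the incoming file holds UTF-8 bytes but is read as single-byte data.
utf8_bytes = "Ö".encode("utf-8")          # two bytes: 0xC3 0x96

as_cp1252 = utf8_bytes.decode("cp1252")   # Windows-1252 reading of the same bytes
as_latin1 = utf8_bytes.decode("latin-1")  # ISO-8859-1 reading of the same bytes

print(utf8_bytes.hex())
print([hex(ord(c)) for c in as_cp1252])   # code points seen under Windows-1252
print([hex(ord(c)) for c in as_latin1])   # code points seen under ISO-8859-1
```

    Under ISO-8859-1 the second byte maps to a control character, which many displays render as a box or question mark.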




  • #2
    I assume the XML document is in ASCII and not UTF-8!
    XML Table expects UTF-8, i.e. try to convert the ASCII data first into UTF-8
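
    The conversion suggested here can be sketched off-platform in Python. This is only an illustration of the transcoding step, assuming the source bytes really are Windows-1252; the function name `transcode_to_utf8` and the sample element are hypothetical, and on the IBM i itself the conversion would be done with native tools instead.

```python
def transcode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode as UTF-8."""
    text = data.decode(source_encoding)   # raises UnicodeDecodeError on unmapped bytes
    return text.encode("utf-8")


# Hypothetical sample: an element containing O-umlaut, stored as Windows-1252.
sample = "<name>Ö</name>".encode("cp1252")
converted = transcode_to_utf8(sample)

print(sample)      # single byte 0xD6 for the umlaut
print(converted)   # two bytes 0xC3 0x96 for the umlaut
```

    If the document carries an encoding declaration in its XML prolog, that declaration would also need to match the new encoding.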



    • #3
      When a file is received, it often gets CCSID 1252 by default.
      Check the CCSID of the IFS file and change it to 1208 (UTF-8).
      This might do it.

      When using XML-INTO or XML-SAX in RPG, I have found that the encoding in the XML prolog
      is ignored and the CCSID of the IFS file is used instead.



      • #4
        Originally posted by B.Hauser
        I assume the XML document is in ASCII and not UTF-8!
        XML Table expects UTF-8, i.e. try to convert the ASCII data first into UTF-8
        Thanks for your response. Indeed the CCSID is 1252. I tried using the native CPY command to convert it to 1208, but I get the following error:


        Code:
        Message ID . . . . . . :   CPFA098       Severity . . . . . . . :   40        
        Message type . . . . . :   Escape                                            
        Date sent  . . . . . . :   04/08/23      Time sent  . . . . . . :   09:44:12  
                                                                                      
        Message . . . . :   The CCSID of the target file could not be set to match the
          CCSID of the source file.                                                  
        Cause . . . . . :   The copy operation failed because the coded character set
          identifier (CCSID) of the target file could not be set to match the CCSID of
          the source file. This could happen for one of the following reasons:        
            -- The file system you are copying into does not support the setting of  
          CCSIDs.                                                                    
            -- You are attempting to copy into a database file member (.MBR) and the  
          member could not be created with the same CCSID as your source file. This is
          because members must have the same CCSID as the database file (.FILE) they  
          are in. The database file's CCSID does not match that of your source file.
        I assume this is because we are accessing the file via a network path through QNTC. I copied the file to my home folder on the IBM i IFS and it seemed to work. It looks like we need to either (1) introduce another layer that copies network files to the local IFS, or (2) find a way to get the third-party software to generate the file in UTF-8 (CCSID 1208).



        • #5
          The content of the file may well be CCSID 1208 already. But when you copy a file onto the IFS via a network share, the IBM i does not inspect the file contents to determine the actual CCSID of the data. It just sets the CCSID to whatever the configured default for the network share is - usually 2352.



          • Vectorspace
            Typo - the default is usually 1252
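
          The point in #5, that the bytes may already be UTF-8 regardless of how the file is tagged, can be checked off-platform with a strict decode. This is a minimal Python sketch; the helper name `looks_like_utf8` is hypothetical, and the check is only a heuristic: pure ASCII content passes as UTF-8 too, so it says nothing about files without any special characters.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (heuristic only)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same character stored two ways: as UTF-8 and as Windows-1252.
utf8_sample = "<name>Ö</name>".encode("utf-8")
cp1252_sample = "<name>Ö</name>".encode("cp1252")

print(looks_like_utf8(utf8_sample))
print(looks_like_utf8(cp1252_sample))
```

          A lone 0xD6 byte, as Windows-1252 uses for Ö, is not a valid UTF-8 sequence, which is why the strict decode can tell the two samples apart.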