ibmi-brunch-learn

Using CPYTOIMPF to create a UTF-8 encoded file.

  • Using CPYTOIMPF to create a UTF-8 encoded file.

    Hi,

    I have a PF, TEST/ADDMST, created with CCSID 1208 to store UTF-8 data. I have populated the file with some Cyrillic characters.
    We need to create a flat file from this data and send it to a third-party application.
    I use 'CPYTOIMPF FROMFILE(TEST/ADDMST) TOSTMF('/tmp/utf8test1.txt') STMFCCSID(1208) RCDDLM(*CRLF)' to create the IFS flat file.
    I then transfer this file to my PC with FileZilla (FTP), but the transferred file doesn't contain the correct data from the PF. Where the Cyrillic characters should be, there is only garbled data.
    Only the non-Cyrillic characters appear correctly.
    I've also tried creating the flat file with BOM codes, as described below.
    CPYTOIMPF does not add BOM codes to a UTF-8 streamfile to clearly identify it as a UTF-8 file on other platforms. This document includes information on how to add them.


    What is the correct way to transfer UTF-8 data via a flat file?

    Regards,
    Ash
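A quick way to narrow down where the data gets garbled is to inspect the raw bytes of the transferred file on the PC. The following is a minimal Python sketch, not part of the original thread: the helper name `inspect_utf8` is hypothetical, and the sample bytes merely stand in for the contents of a file such as utf8test1.txt. It reports whether the data starts with the UTF-8 BOM (EF BB BF), whether it decodes as UTF-8, and whether any Cyrillic characters survived.

```python
import codecs


def inspect_utf8(data: bytes) -> dict:
    """Report BOM presence, UTF-8 validity, and Cyrillic content for raw bytes."""
    has_bom = data.startswith(codecs.BOM_UTF8)  # EF BB BF
    try:
        # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
        text = data.decode("utf-8-sig")
        valid = True
    except UnicodeDecodeError:
        text = ""
        valid = False
    # The basic Cyrillic block is U+0400 through U+04FF.
    has_cyrillic = valid and any("\u0400" <= ch <= "\u04ff" for ch in text)
    return {"has_bom": has_bom, "valid_utf8": valid, "has_cyrillic": has_cyrillic}


# Stand-in for the bytes read from the transferred file with open(path, "rb").
sample = codecs.BOM_UTF8 + "Москва\r\n".encode("utf-8")
print(inspect_utf8(sample))
```

If the bytes fail to decode as UTF-8, the problem is likely in how the file was created or transferred; if they decode correctly, the viewer on the PC may simply be using a different encoding.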

  • #2
    How did you create a flat file with CCSID(1208)?! I didn't think that was possible. Also, CPYTOIMPF is not designed for flat files.

    Assuming you really mean a database table (i.e. an externally defined PF) rather than a flat file, I would recommend coding the fields as CCSID 1200, which is UTF-16. I've found that this works much better for database access than UTF-8, at least on IBM i.

    Then, by all means, convert it from UTF-16 to UTF-8 when you use CPYTOIMPF. Since both UTF-16 and UTF-8 are fully compliant with Unicode, all character values will be preserved correctly.
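The claim that nothing is lost when converting between the two encodings can be illustrated off-platform. This is a minimal Python sketch, not IBM i code: the function name `utf16_to_utf8` and the sample string are made up for illustration, and it assumes big-endian UTF-16, which is how CCSID 1200 data is normally laid out.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Convert big-endian UTF-16 bytes to UTF-8 bytes."""
    return data.decode("utf-16-be").encode("utf-8")


# Hypothetical field value containing Cyrillic text.
original = "Адрес: Москва"

as_utf16 = original.encode("utf-16-be")  # how the value might sit in a CCSID 1200 field
as_utf8 = utf16_to_utf8(as_utf16)        # what the UTF-8 stream file should contain

# Decoding the UTF-8 bytes gives back exactly the original text.
assert as_utf8.decode("utf-8") == original
print(len(as_utf16), "UTF-16 bytes ->", len(as_utf8), "UTF-8 bytes")
```

Only the byte representation changes; the sequence of Unicode characters is identical before and after the conversion.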

    If you really and truly mean a flat file, then I would strongly recommend that you eliminate it and write straight to a stream file. This supports UTF-8 perfectly, and does not require a separate PF -- you just write it to your "import file" directly. This saves disk space, runs faster, and has only a single point of failure, making it easier to maintain.



    • #3
      Hi Scott,

      Thanks for your reply.
      Yes, I meant an externally defined PF with "CCSID(1208)" at file level, so all fields are coded as CCSID 1208.
      I've also tried an externally defined PF with CCSID 1200, but I get the same result with the command 'CPYTOIMPF FROMFILE(TEST/ADDMSTUT16) TOSTMF('/tmp/utf8test2.txt') STMFCCSID(1200) RCDDLM(*CRLF)'.
      The Russian characters in the PF are replaced with a blocky character in the "utf8test2.txt" stream file.


      • #4
        Makes sense. Try the things I suggested in my earlier post.


        • #5
          For anyone searching for an answer, I managed to create the UTF-8 encoded file on the IFS from a UTF-16 encoded PF.
          1. Create a template UTF-8 file as described here: https://www.ibm.com/support/pages/ho...pytoimpf-utf-8 .
          2. Copy the UTF-8 template:
             CPY OBJ('utf8bom.txt') TOOBJ('utf8test2.txt') REPLACE(*YES)
          3. Convert the UTF-16 PF TEST/ADDMST to the UTF-8 stream file utf8test2.txt:
             CPYTOIMPF FROMFILE(TEST/ADDMST) TOSTMF('/home/test/utf8test2.txt') MBROPT(*REPLACE) FROMCCSID(1200) STMFCCSID(1208) RCDDLM(*CRLF)
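For readers off the IBM i platform, the intended end result of these steps is a stream file that begins with the UTF-8 BOM and then holds the records as UTF-8 text with CRLF delimiters. The following Python sketch only models that file layout; it does not reproduce the behavior of CPY or CPYTOIMPF (including MBROPT), and the function name `build_utf8_file` and the sample records are hypothetical.

```python
import codecs
import os
import tempfile


def build_utf8_file(path: str, records: list[str]) -> None:
    """Write a UTF-8 BOM followed by each record encoded as UTF-8 with CRLF."""
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8)                   # the "template" part: EF BB BF
        for rec in records:
            f.write(rec.encode("utf-8") + b"\r\n")  # like RCDDLM(*CRLF)


# Hypothetical records; a temporary directory stands in for the IFS path.
records = ["1,Москва", "2,Санкт-Петербург"]
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "utf8test2.txt")
    build_utf8_file(target, records)
    with open(target, "rb") as f:
        raw = f.read()

# "utf-8-sig" strips the BOM, leaving the CRLF-delimited records.
print(raw[:3] == codecs.BOM_UTF8, raw.decode("utf-8-sig").split("\r\n")[:-1] == records)
```

A receiving application that honors the BOM can use the leading bytes to recognize the file as UTF-8 before reading the records.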
