ibmi-brunch-learn

Multiple XML Rows As Columns With Data


  • Multiple XML Rows As Columns With Data

    I have an XML file in a format where columns are being sent as rows, and I need to take the list of values and turn them back into columns. Here is the sample XML:

    Code:
    <RootNode>
      <Vehicle VIN="3C3CFFBR5ET236538">
        <Values>
          <Value Attribute="Make">Fiat</Value>
          <Value Attribute="Model">500 Sport</Value>
          <Value Attribute="Year">2014</Value>
          <Value Attribute="Engine Type">4 cylinders</Value>
          <Value Attribute="Body">Passenger Car</Value>
          <Value Attribute="Fuel">Gas</Value> 
        </Values>
      </Vehicle>
      <Vehicle VIN="1NKWL29X1FS326723">
        <Values>
          <Value Attribute="Make">Kenworth</Value>
          <Value Attribute="Model">Kenworth</Value>
          <Value Attribute="Year">1985</Value>
          <Value Attribute="Engine Type">14.0L 6X4</Value>
          <Value Attribute="Body">Conventional Cab</Value>
          <Value Attribute="Fuel">Diesel</Value>
        </Values>
      </Vehicle>
    </RootNode>
    I know how to return each of the individual value nodes as a row, but I need each value attribute as a column in the result. Returning one row per attribute increases the likelihood of exceeding the maximum size of a multiple-occurrence data structure. The data structure should look something like this:

    Code:
    dcl-ds Data qualified dim(9999) inz;
      VIN varchar(17); 
      Make varchar(128);
      Model varchar(128);
      EngineType varchar(128);
      Body varchar(128);
      Fuel varchar(64);
    end-ds ;

    Here is the SQL I wrote that returns one row for each attribute value:

    Code:
    select x.*
    from myxml a, xmltable(
      '$doc/RootNode/Vehicle/Values/Value' passing a.xmlstring as "doc" columns
        VIN varchar(17) path '../../@VIN'
        , attribute varchar(128) path '@Attribute'
        , attributeValue varchar(128) path '.'
    ) as x ;

    Do I need to apply the "pivot trick" to this data using common table expressions?
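    By the "pivot trick" I mean something along these lines (an untested sketch, reusing the names from the query above): wrap the per-attribute query in a CTE and collapse the rows with MAX(CASE ...) grouped by VIN.

    Code:
    with rows as (
      select x.*
      from myxml a, xmltable(
        '$doc/RootNode/Vehicle/Values/Value' passing a.xmlstring as "doc" columns
          VIN varchar(17) path '../../@VIN'
          , attribute varchar(128) path '@Attribute'
          , attributeValue varchar(128) path '.'
      ) as x
    )
    select VIN
      , max(case when attribute = 'Make' then attributeValue end) as Make
      , max(case when attribute = 'Model' then attributeValue end) as Model
      , max(case when attribute = 'Year' then attributeValue end) as Year
    from rows
    group by VIN ;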

  • #2
    You can do more than return attribute values. You can also find elements by attribute value, using: "myCol varchar(128) path '/path/to/elem[@attr=''attrValue'']'"
    If this level of the doc contains multiple <elem> tags, it will only match the tag with the specified attribute. The attribute is only being used for matching, so it will still return the child of <elem>.

    Your root node will need to be at the higher level, i.e. the Vehicle level, otherwise it cannot tell that the <Value>s all belong to the same vehicle.

    This worked for me:
    Code:
    select x.*
    from myxml a, xmltable(
      '$doc/RootNode/Vehicle' passing xmlparse(document a.xmlstring) as "doc" columns
        VIN varchar(17) path '@VIN'
        , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
        , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
        , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
        , EngineType varchar(128) path 'Values/Value[@Attribute=''Engine Type'']'
        , Body varchar(128) path 'Values/Value[@Attribute=''Body'']'
        , Fuel varchar(128) path 'Values/Value[@Attribute=''Fuel'']'
    ) as x
    (Note I used xmlparse() because my source data is a string rather than an already-parsed XML document.)



    • #3
      Vectorspace

      Worked like a charm! Thanks!



      • #4
        Out of curiosity, how are you parsing the XML? Do you copy the file to the myxml table?



        • #5
          Reboot1970

          Correct. The raw XML file exists on the IFS. I use the following code (found via a Google search) to get the IFS data into a table in QTEMP that has one column of type XML.

          Code:
          dcl-s Rows uns(5) inz(%elem(Data)) ;
          dcl-s InFile sqltype(clob_file) ;
          dcl-s Path varchar(100) inz('/yourfolder/yourfile.xml') ;
          
          dcl-ds Data qualified dim(9999) inz;
            productId varchar(128);
            UserTypeId varchar(128);
            ParentID varchar(128);
            attributeID varchar(64);
            attributeValue varchar(64);
          end-ds ;
          
          exec sql SET OPTION COMMIT = *NONE,
          CLOSQLCSR = *ENDMOD,
          DATFMT = *ISO;
          
          clear InFile;
          Infile_Name = %trimr(Path);
          Infile_NL = %len(%trimr(Infile_Name));
          Infile_FO = SQFRD;
          
          exec sql DROP TABLE QTEMP.TESTFILE;
          exec sql CREATE TABLE QTEMP.TESTFILE(COL1 XML);
          exec sql INSERT INTO QTEMP.TESTFILE VALUES(:InFile);
          
          // now TESTFILE.COL1 has xml data that can be used in the query
          There are ways to get the data directly from the IFS using the newer DB2 IFS functions, but I hit too many problems with them. The code above worked right out of the box and seems stable.
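          (The kind of function I mean is get_xml_file(); the direct route I had trouble with looked roughly like the sketch below, with the path as a placeholder.)

          Code:
          select x.*
          from xmltable(
            '$doc/RootNode/Vehicle'
            passing xmlparse(document get_xml_file('/yourfolder/yourfile.xml')) as "doc"
            columns VIN varchar(17) path '@VIN'
          ) as x ;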



          • #6
            Originally posted by TheZenbudda View Post
            Reboot1970

            Correct. The raw XML file exists on the IFS. <snip>
            Great stuff



            • #7
              Here are some increasingly better ways (IMHO) to do that. Keep reading, because each method is better than the last.

               You do not need to create an actual temporary table; you can use a CTE instead. A CTE, or common table expression, is SQL code that effectively lets you create dummy tables from hard-coded values that a subsequent select statement can reference. No actual table is created, and it is part of the same overall SQL statement as the select.

              Code:
              with myxml(xmlstring) as (values(:InFile)) -- This is the CTE
              select x.*
              from myxml a, xmltable(
                '$doc/RootNode/Vehicle' passing xmlparse(document a.xmlstring) as "doc" columns
                  VIN varchar(17) path '@VIN'
                  , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                  , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
                  , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
                  , EngineType varchar(128) path 'Values/Value[@Attribute=''Engine Type'']'
                  , Body varchar(128) path 'Values/Value[@Attribute=''Body'']'
                  , Fuel varchar(128) path 'Values/Value[@Attribute=''Fuel'']'
              ) as x
               CTEs are a good way of testing SQL statements with dummy data instead of actual files, without changing the content of the select. This is exactly the same method I used to test the SQL I provided, except it was values('your SQL string as a string literal') so I could run it in Run SQL Scripts.
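               For instance, something like this (the XML literal is trimmed down here just to show the shape):

               Code:
               with myxml(xmlstring) as (values(
                 '<RootNode><Vehicle VIN="3C3CFFBR5ET236538"><Values>' concat
                 '<Value Attribute="Make">Fiat</Value>' concat
                 '<Value Attribute="Model">500 Sport</Value>' concat
                 '</Values></Vehicle></RootNode>'))
               select x.*
               from myxml a, xmltable(
                 '$doc/RootNode/Vehicle' passing xmlparse(document a.xmlstring) as "doc" columns
                   VIN varchar(17) path '@VIN'
                   , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                   , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
               ) as x ;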



              Even better
              xmltable does not need to be joined with an actual table to work, so you can do this without creating a temporary table. You can use :InFile directly in the xmltable function:

              Code:
              select x.*
              from xmltable(
                '$doc/RootNode/Vehicle' passing xmlparse(document :InFile) as "doc" columns
                  VIN varchar(17) path '@VIN'
                  , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                  , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
                  , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
                  , EngineType varchar(128) path 'Values/Value[@Attribute=''Engine Type'']'
                  , Body varchar(128) path 'Values/Value[@Attribute=''Body'']'
                  , Fuel varchar(128) path 'Values/Value[@Attribute=''Fuel'']'
              ) as x

              Even Even better
               Rather than using sqltype(CLOB_FILE), you can use sqltype(XML_CLOB_FILE). This works exactly the same way, but it assumes the IFS file is a properly formatted XML document, parses it, and provides it as the XML data type, meaning you do not need to use xmlparse() as in the previous example.

              Code:
              dcl-s InFile  sqltype(XML_CLOB_FILE);
              
              select x.*
              from xmltable(
                '$doc/RootNode/Vehicle' passing :InFile as "doc" columns
                  VIN varchar(17) path '@VIN'
                  , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                  , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
                  , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
                  , EngineType varchar(128) path 'Values/Value[@Attribute=''Engine Type'']'
                  , Body varchar(128) path 'Values/Value[@Attribute=''Body'']'
                  , Fuel varchar(128) path 'Values/Value[@Attribute=''Fuel'']'
              ) as x



              • Vectorspace
                Vectorspace commented:
                Typo in the above. "values('your SQL string as a string literal')" should have been "values('your XML string as a string literal')"

            • #8
              Vectorspace

               Thanks a ton for your suggestions! The last recommendation is the one I couldn't get to work; I ran into several problems. Would you mind posting a more complete example, along with the CCSID of your IFS file? Here is the most recent error I'm getting. The CCSID on my file is 437, and it was uploaded to the IFS using the ACS IFS tool.

              Error when opening the cursor using the last SQL statement you posted above:

              Code:
              Field HVR00001 and value 1 not compatible. Reason 19.
              Input variable INFILE or argument 1 not valid.
              SQLSTT = '42895'
              
              Message . . . . : Field HVR00001 and value 1 not compatible. Reason 19.
              Cause . . . . . : The attributes of variable field HVR00001 in query
              record format FORMAT0002 are not compatible with the attributes of value
              number 1. The value is █ê??È+?ÀÁ████████îÁÇÑÄ%ÁË█████████îÁÇÑÄ%. The
              reason code is 19. The reason codes and their meanings follow:
              
              19 -- The field is XML and the value is not a compatible string or the
              field is not XML and the value is XML.

               Here is the XML that is in the file on the IFS:

              Code:
               <RootNode>
                 <Vehicles>
                   <Vehicle VIN="3C3CFFBR5ET236538">
                     <Values>
                       <Value Attribute="Make">Fiat</Value>
                       <Value Attribute="Model">500 Sport</Value>
                       <Value Attribute="Year">2014</Value>
                       <Value Attribute="Engine Type">4 cylinders</Value>
                       <Value Attribute="Body">Passenger Car</Value>
                       <Value Attribute="Fuel">Gas</Value>
                     </Values>
                   </Vehicle>
                   <Vehicle VIN="1NKWL29X1FS326723">
                     <Values>
                       <Value Attribute="Make">Kenworth</Value>
                       <Value Attribute="Model">Kenworth</Value>
                       <Value Attribute="Year">1985</Value>
                       <Value Attribute="Engine Type">14.0L 6X4</Value>
                       <Value Attribute="Body">Conventional Cab</Value>
                       <Value Attribute="Fuel">Diesel</Value>
                     </Values>
                   </Vehicle>
                 </Vehicles>
               </RootNode>
              Here is how I am populating InFile:


              Code:
              clear InFile;
              Infile_Name = %trimr(Path);
              Infile_NL = %len(%trimr(Infile_Name));
              Infile_FO = SQFRD;



              • #9
                Vectorspace

                I did finally get something similar to work.

                Code:
                 **free
                 ctl-opt dftactgrp(*no) actgrp(*caller)
                   option(*nodebugio : *srcstmt : *showcpy);

                 dcl-s Rows uns(5) inz(%elem(Data));
                 dcl-s Path varchar(100) inz('/myfolder/Vehicles.xml');

                 dcl-ds Data qualified dim(9999) inz;
                   VIN varchar(17);
                   Make varchar(128);
                   Model varchar(128);
                   Year varchar(128);
                   EngineType varchar(128);
                   Body varchar(128);
                   Fuel varchar(64);
                 end-ds;

                 // have to use commitment control with get_xml_file()
                 exec sql
                   set option commit = *CS,
                              closqlcsr = *endmod,
                              datfmt = *iso;

                 exec sql
                   declare c0 cursor for
                   select coalesce(VIN, '')
                        , coalesce(Make, '')
                        , coalesce(Model, '')
                        , coalesce(Year, '')
                        , coalesce(EngineType, '')
                        , coalesce(Body, '')
                        , coalesce(Fuel, '')
                   from xmltable(
                     '$doc/RootNode/Vehicles/Vehicle'
                     passing xmlparse(document get_xml_file(:path)) as "doc"
                     columns
                       VIN varchar(17) path '@VIN'
                       , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                       , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
                       , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
                       , EngineType varchar(128) path 'Values/Value[@Attribute=''Engine Type'']'
                       , Body varchar(128) path 'Values/Value[@Attribute=''Body'']'
                       , Fuel varchar(64) path 'Values/Value[@Attribute=''Fuel'']'
                   ) as x ;

                 exec sql OPEN C0;
                 exec sql FETCH C0 FOR :Rows ROWS INTO :Data;
                 exec sql GET DIAGNOSTICS :Rows = ROW_COUNT;
                 exec sql CLOSE C0;

                 *inlr = *on;
                 return;
                Is there a way to copy/paste and keep formatting?



                • #10
                   Among the formatting buttons in the text editor is one that looks like a sheet of paper with <> in it. That switches to plain-text mode, and then your formatting should be kept when you paste.

                   Personally I don't like get_xml_file() precisely because it requires commitment control; I would have used my third example.

                   CCSIDs are a pain. How you need to deal with them depends on whether the bytes inside the IFS file actually match CCSID 437, or whether it is perhaps CCSID 1208 (UTF-8) data that is incorrectly tagged as CCSID 437.
                   I'm surprised get_xml_file() worked if xml_clob_file didn't.



                  • #11
                    Originally posted by TheZenbudda View Post
                    I have an xml file in a format where columns <snip>


                    Code:
                     dcl-ds Data qualified dim(9999) inz;
                       VIN varchar(17);
                       Make varchar(128);
                       Model varchar(128);
                       EngineType varchar(128);
                       Body varchar(128);
                       Fuel varchar(64);
                     end-ds ;
                     This is a classic case for me where using XML-INTO is way simpler than messing with SQL. You already have the DS; a simple XML-INTO with the IFS file named would do the job in a single simple statement.
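                     Roughly, because each <Value> carries its name in an Attribute rather than as an element name, the natural XML-INTO target mirrors the document shape, and the pivot into your flat Data subfields then happens in RPG. An untested sketch against your first sample XML (the file path, array sizes, and the num_/data names produced by the countprefix/datasubf options are just placeholders):

                     Code:
                     **free
                     dcl-s i uns(5);
                     dcl-s j uns(5);

                     // DS that mirrors the XML shape: RootNode -> Vehicle(n) -> Values -> Value(n)
                     dcl-ds rootNode qualified;
                       num_vehicle uns(5);
                       dcl-ds vehicle dim(9999);
                         vin varchar(17);
                         dcl-ds values;
                           num_value uns(5);
                           dcl-ds value dim(20);
                             attribute varchar(30);
                             data varchar(128);   // element text, captured via datasubf=data
                           end-ds;
                         end-ds;
                       end-ds;
                     end-ds;

                     // one statement loads the whole document from the IFS
                     xml-into rootNode %xml('/yourfolder/yourfile.xml'
                            : 'doc=file case=any countprefix=num_ datasubf=data');

                     // then pivot the attribute/value pairs into the flat Data DS
                     for i = 1 to rootNode.num_vehicle;
                       Data(i).VIN = rootNode.vehicle(i).vin;
                       for j = 1 to rootNode.vehicle(i).values.num_value;
                         select;
                           when rootNode.vehicle(i).values.value(j).attribute = 'Make';
                             Data(i).Make = rootNode.vehicle(i).values.value(j).data;
                           when rootNode.vehicle(i).values.value(j).attribute = 'Model';
                             Data(i).Model = rootNode.vehicle(i).values.value(j).data;
                           when rootNode.vehicle(i).values.value(j).attribute = 'Engine Type';
                             Data(i).EngineType = rootNode.vehicle(i).values.value(j).data;
                           when rootNode.vehicle(i).values.value(j).attribute = 'Body';
                             Data(i).Body = rootNode.vehicle(i).values.value(j).data;
                           when rootNode.vehicle(i).values.value(j).attribute = 'Fuel';
                             Data(i).Fuel = rootNode.vehicle(i).values.value(j).data;
                         endsl;
                       endfor;
                     endfor;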



                    • #12
                      JonBoy

                       Understood and agreed. I was vetting the SQL option to see what features we get with it.



                      • #13
                        I can and have used both methods.

                        IMHO xmltable's biggest advantage is it can parse and insert/update/merge into database in one go, without needing to step the data into your program and back out again.
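                        For example, the parse and the load can be a single statement, something like this rough sketch (the VEHICLES target table and its columns are made up for illustration, and :InFile is the xml_clob_file host variable from my earlier example):

                        Code:
                        insert into vehicles (vin, make, model, model_year)
                          select x.VIN, x.Make, x.Model, x.Year
                          from xmltable(
                            '$doc/RootNode/Vehicle' passing :InFile as "doc" columns
                              VIN varchar(17) path '@VIN'
                              , Make varchar(128) path 'Values/Value[@Attribute=''Make'']'
                              , Model varchar(128) path 'Values/Value[@Attribute=''Model'']'
                              , Year varchar(128) path 'Values/Value[@Attribute=''Year'']'
                          ) as x ;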

                        The thing I don't like about xml-into or data-into is you must define a DS that can hold the entire parsed xml in its maximum size (e.g. as in JonBoy's example with a dim(9999)), which means you are allocating way more memory than you need most of the time. And if for example it's a service program, that memory remains allocated until the job ends even if the procedure is never called again.

                        So my preference is to use xml-into (or preferably data-into with Scott Klement's JSON parser) for smaller datasets, and xmltable/json_table for larger datasets.



                        • #14
                          Originally posted by Vectorspace View Post
                          IMHO xmltable's biggest advantage is it can parse and insert/update/merge into database in one go, without needing to step the data into your program and back out again.
                          True, but in most cases some validation of the data is required, even if only to weed out duplicates etc.
                          Originally posted by Vectorspace View Post
                          The thing I don't like about xml-into or data-into is you must define a DS that can hold the entire parsed xml in its maximum size (e.g. as in JonBoy's example with a dim(9999)), which means you are allocating way more memory than you need most of the time. And if for example it's a service program, that memory remains allocated until the job ends even if the procedure is never called again.
                          Not true - or at least only partly true. For one thing, you can usually use %Handler for repeating structures and don't need the full DS. Secondly, if memory is an issue, allocate it dynamically in the procedure and release it when done. For that matter, on V7.4 you can use dynamic arrays and the RPG runtime takes care of it.
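                          As a rough, untested illustration of the %Handler idea (the procedure name, chunk size, and single vin subfield are just placeholders): the handler receives the <Vehicle> entries a chunk at a time, so the full dim(9999) array is never declared.

                          Code:
                          **free
                          ctl-opt dftactgrp(*no);

                          dcl-ds vehicle_t qualified template;
                            vin varchar(17);
                          end-ds;

                          dcl-pr handleVehicles int(10);
                            total int(10);
                            vehicles likeds(vehicle_t) dim(100) const;
                            numVehicles int(10) value;
                          end-pr;

                          dcl-s total int(10) inz(0);

                          // parse straight from the IFS; the handler is called per chunk
                          xml-into %handler(handleVehicles : total)
                                   %xml('/yourfolder/yourfile.xml'
                                      : 'doc=file case=any path=RootNode/Vehicle allowextra=yes');
                          *inlr = *on;
                          return;

                          dcl-proc handleVehicles;
                            dcl-pi *n int(10);
                              total int(10);                           // communication area
                              vehicles likeds(vehicle_t) dim(100) const;
                              numVehicles int(10) value;
                            end-pi;
                            dcl-s i int(10);

                            for i = 1 to numVehicles;
                              // validate / write vehicles(i).vin to the database here
                            endfor;
                            total += numVehicles;
                            return 0;                                  // 0 = keep parsing
                          end-proc;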



                          • Vectorspace
                            Vectorspace commented:
                             Thanks JonBoy, I was not aware of %handler; I will read up on it.
                             As for dynamic arrays, unfortunately we are still on 7.3.

                        • #15
                          JonBoy

                          Using XML-INTO, how would you convert the XML posted above to the following data structure:

                           Code:
                          dcl-ds Data qualified dim(9999) inz;
                            VIN varchar(17);
                            Make varchar(128);
                            Model varchar(128);
                            EngineType varchar(128);
                            Body varchar(128);
                            Fuel varchar(64);
                          end-ds ;


