ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

Improve the query with select (max) in the cobol program

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Improve the query with select (max) in the cobol program

    Hi everyone,

    I need to improve a query that I use in a cobol program. This query takes 3 minutes and I need to reduce the times. The query is something like that:

    PHP Code:
    SELECT B.CLIENTB.TYPEB.DOMAINB.REFERENCIA,
    B.AMOUNT A.AMOUNTA.DATEFINA.TYPODATO
    FROM TABLEA 
    AS A JOIN TABLEB AS B ON
    A
    .CLIENT B.CLIENT WHERE
    B
    .AMOUNT  <> AND B.OPERAC <> "L" AND
    B.DOMAIN IN ("XXX""TTT",
    "OOO") AND A.DATEFIN= (SELECT MAX(A.DATEFIN)
    FROM TABLEA  WHERE A.CLIENTB.CLIENT) AND
    B.TYPODATO "J" AND A.REFERENCIA B.REFERENCIA
    AND B.CLIENT IN (SELECT C.CLIENT
    FROM TABLEC  
    AS C WHERE C.STATUS <> "A")
    AND 
    B.CLIENT IN (SELECT D.CLIENT
    FROM TABLED 
    AS D WHERE D.STATUS <> "A")
    AND 
    A.AMOUNT <> B.AMOUNT
    AND B.TYPE "000" AND A.TYPODATO2 "O"
    AND B.TYPE "000" 
    The query is correct, but the problem is that TABLAA has 25 million records and is slow. I am not an expert in SQL and have looked for some ways to improve it but not long ago, is it possible to find a way to do the SELECT (MAX) in another way? I think that may be the problem. The point is that I need to match two tables and choose the maximum date for each of the records.

    Thank you very much for your attention



  • #2
    Have you run this through the advisor in ACS? That would at least let you know if any indexes etc. may be of help. Any indexes that you already have may be being ignored for some reason. Visual explain should help you to understand where the time is being spent.

    Comment


    • #3
      Originally posted by JonBoy View Post
      Have you run this through the advisor in ACS? That would at least let you know if any indexes etc. may be of help. Any indexes that you already have may be being ignored for some reason. Visual explain should help you to understand where the time is being spent.
      Thanks, I will try in visual explain

      Comment


      • #4
        Hi everyone,

        I have been able to verify that where later is doing the SELECT (MAX) of the date, how could I improve this?





        Comment


        • #5
          I suspect the problem is the "Table Scan" item at the top. That means that whatever the query is trying to do with that table, there is no key or access path for, so it has to scan every record. All 2.1E7 records (I think that's 21 million?)

          It's hard to know, without knowing which table that is. If you click on the Table Scan item, then the stats panel on the right should tell you somewhere what table it is. Once we know what table it is, we will be able to tell form your query what kind of lookup is being done on it, and suggest ways to speed it up.

          In addition, the analyser includes (somewhere off one of the top menus) an Index Advisor - which will suggest indexes you could create that may speed up the query.

          Finally - in my personal experience, using INNER JOIN or INNER JOIN TABLE (SELECT...) is often faster than using WHERE ABC in (SELECT...) or WHERE ABC = (SELECT...). Though I think that has seen performance improvements in recent years so it may no longer be the case

          Comment


          • #6
            I'd first rewrite the query, i.e. put the sub-selects from the WHERE clause into the from Clause or into a CTE
            Code:
            With x as (Select Client xClient, Max(DateFin) xDateFin
                         from TABLEA)
            Select B.CLIENT, B.TYPE, B.DOMAIN, B.REFERENCIA,
                   B.AMOUNT, A.AMOUNT, A.DATEFIN, A.TYPODATO
               From x Join TABLEA As A on     xClient      = A.Client
                                          And xDateFin     = A.DateFin                              
                      Join TABLEB As B On     A.CLIENT     = B.CLIENT
                                          And A.REFERENCIA = B.REFERENCIA
                      Join TABLEC As C on     B.Client     = C.Client
                                          And C.Status    <> "A"
                      Join TABLED as D on     B.Client     = D.Client
                                          and D.Status    <> "A"                                    
               Where     B.AMOUNT    <> 0
                     And A.AMOUNT    <> B.AMOUNT
                     And B.OPERAC    <> "L"
                     And B.DOMAIN    In ("XXX", "TTT", "OOO")
                     And B.TYPODATO  = "J"
                     And B.TYPE      = "000"
                     And A.TYPODATO2 = "O"
            Then make sure that you have an index for
            TABLEA on Client, Referencia, DateFin, TYPODATO2
            TABLEB on Client, Referencia, TYPE, TYPODATO, DOMAIN (and may be on OPERAC)
            TABLEC on Client and Status
            TABLED on Client and Status

            Comment


            • #7
              Originally posted by Vectorspace View Post
              Finally - in my personal experience, using INNER JOIN or INNER JOIN TABLE (SELECT...) is often faster than using WHERE ABC in (SELECT...) or WHERE ABC = (SELECT...). Though I think that has seen performance improvements in recent years so it may no longer be the case
              From what I hear and understand, that is still the case, Vector. Joins are faster than subselects.

              I have also heard that some query optimizers convert subselects to joins when possible. Whether DB2 for i does that, I have no idea.

              Comment


              • #8
                Originally posted by B.Hauser View Post
                I'd first rewrite the query, i.e. put the sub-selects from the WHERE clause into the from Clause or into a CTE
                Code:
                With x as (Select Client xClient, Max(DateFin) xDateFin
                from TABLEA)
                Select B.CLIENT, B.TYPE, B.DOMAIN, B.REFERENCIA,
                B.AMOUNT, A.AMOUNT, A.DATEFIN, A.TYPODATO
                From x Join TABLEA As A on xClient = A.Client
                And xDateFin = A.DateFin
                Join TABLEB As B On A.CLIENT = B.CLIENT
                And A.REFERENCIA = B.REFERENCIA
                Join TABLEC As C on B.Client = C.Client
                And C.Status <> "A"
                Join TABLED as D on B.Client = D.Client
                and D.Status <> "A"
                Where B.AMOUNT <> 0
                And A.AMOUNT <> B.AMOUNT
                And B.OPERAC <> "L"
                And B.DOMAIN In ("XXX", "TTT", "OOO")
                And B.TYPODATO = "J"
                And B.TYPE = "000"
                And A.TYPODATO2 = "O"
                Then make sure that you have an index for
                TABLEA on Client, Referencia, DateFin, TYPODATO2
                TABLEB on Client, Referencia, TYPE, TYPODATO, DOMAIN (and may be on OPERAC)
                TABLEC on Client and Status
                TABLED on Client and Status
                Thank very much, improve the query!
                Thanks Crack

                Comment

                Working...
                X