ibmi-brunch-learn

Announcement

Collapse
No announcement yet.

SQL Slow on Big tables

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • SQL Slow on Big tables

    Hello,
    I have a big table and its index built on column used on the "where clause".

    I did alter table and index to keep in memory but results are still very slow.
    Isolation Level is set to *NONE


    It takes more than 6 seconds just to count rows when the "where clause" is in place and only 20ms when not in place.

    [ 27/07/2023, 13:59:40 ] Run Selected...
    SELECT count(*) FROM bms71t_dat.movmg00f
    Statement ran successfully (20 ms)



    [ 27/07/2023, 14:00:06 ] Run and Explain...
    SELECT count(*) FROM bms71t_dat.movmg00f WHERE mgcazi='020'
    Statement ran successfully (6,685 ms = 6.685 sec​


    I attach Visual Explain and it look good.
    Any idea to improve performance?

    Thanks
    Giovanni



    ​​
    Attached Files

  • #2
    Is there an index built over the columns specified in the WHERE clause?

    Comment


    • #3
      Hi,
      thank you for your reply.

      yes there is an index over the column specified in the where clause and that index is used by the SQE
      CPI4328 Access path of file MOVMG98X was used by query. Cause . . . . . : Access path for member MOVMG98X of file MOVMG98X in library BMS71T_DAT was used to access records from member MOVMG00F of file MOVMG00F in library BMS71T_DAT for reason code 1. The reason codes and their meanings follow: 1 - Record selection. 2 - Ordering/grouping criteria. 3 - Record selection and ordering/grouping criteria. If file MOVMG00F in library BMS71T_DAT is a logical file then member MOVMG00F of physical file MOVMG00F in library BMS71T_DAT is the actual file being accessed. Index only access was used for this query: *YES. A value of *YES for index only access processing indicates that all of the fields used for this query can be found within the access path of file MOVMG98X. A value of *NO indicates that index only access could not be performed for this access path. Index only access is generally a performance advantage since all of the data can be extracted from the access path and the data space does not have to be paged into active memory. Recovery . . . : An access path can only be considered for index only access if all of the fields used within the query for this file are also key fields for that access path. Refer to the DB2 for i - Database Performance and Query Optimization topic collection in the Database category in the IBM i Information Center for additional tips on index only access.



      This is my create index command.

      CREATE INDEX bms71t_dat.movmg98x
      ON bms71t_dat.movmg00f (mgcazi)
      KEEP IN MEMORY YES

      Comment


      • #4
        How many rows are in this table? I suspect the query engine is just pulling the row count from system catalog, not the table itself.
        How many deleted rows are in this table?
        What is row count where mgcazi = '020' ?
        What is the definition of column mgcazi ?

        You may want to RGZPFM if you have the storage space.

        Ringer

        Comment


        • #5
          Hi,
          there are 12.000.000 rows in the table.
          No deleted row.
          Row count for mgcazi = '020' is 11.500.000
          mgcazi is char(3)

          Table has been created from scratch , I did try RGZPFM but got same results.

          I ran a similar SQL on another IBM I on a different table with 22.000.000 rows and got exactly the same problem,
          select (*) without where condition takes 15 milliseconds, , with where clause on Unique Index column takes 20 seconds.

          Both Ibm I are on V7R4 with latest PTF's applied.
          Giovanni

          Comment


          • #6
            Try to create an Encoded Vector Index for the MGCAZI column with an INCLUDE:

            Example:
            Code:
            Create Encoded Vector Index YourSchema/YourEVI
                   On YourSchema/YourTable (YourColumn Asc)
                      With 65535Distinct Values
                    Include Count(*));​
            Last edited by B.Hauser; July 28, 2023, 05:28 AM.

            Comment


            • #7
              Hi Birgitta,
              It works fine now , thanks .

              Comment


              • #8
                Some background on this.

                An encoded vector index is an index on a column where the column value can only be one of a fixed distinct list of values. In Brigitta's example, the index is defined such that the column can have 65k distinct possible values. This lets the index be faster, but only so long as there aren't more 65k possible values.
                Any more than that and (I think) some of the possible values won't be indexed so will fall back on the slower index

                Comment

                Working...
                X