Query tuning example

A coworker of mine cleaned up a bunch of old data to improve performance on an older system of ours and one of the queries started running slower.  It looks like the optimizer was choosing a full table scan when an index existed that was much faster.  So, I took at look at why it was choosing the full scan and what could be done to get the query to run with the index.

This is a 9.2.0.6 64 bit HP-UX PA-Risc system.  The query without any hints runs like this:

Elapsed: 00:04:54.65

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=23208 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'MYTABLE' 
              (Cost=23208 Card=68262 Bytes=819144)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     242929  consistent gets
     238854  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I renamed the production table to MYTABLE to hide its name.  Here is how the query runs with the index hint:

Elapsed: 00:00:00.21

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=34865 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=34865 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1180  consistent gets
          0  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        242  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I got this output by putting these commands before the sql:

set autotrace on
set timing on

The query looks something like this:

SELECT  sum(COLUMN2)
  FROM MYTABLE
   WHERE MYTABLE.COLUMN1 = '01242014';

The index is on COLUMN1 only.

Here is what the query looks like with the index hint:

SELECT /*+ INDEX (MYTABLE MYINDEX) */ sum(COLUMN2)
  FROM MYTABLE
   WHERE MYTABLE.COLUMN1 = '01242014';

So, the question is why does the optimizer choose a full scan which runs for almost 5 minutes instead of the index scan which runs in a fifth of a second?  I think that the answer is that the optimizer assumes there is no caching.  Notice the number of consistent gets and physical reads in each case:

FULL SCAN

     242929  consistent gets
     238854  physical reads

INDEX SCAN

       1180  consistent gets           
          0  physical reads

If you look at the autotrace output and you see consistent gets but no physical reads it means that the blocks are being read from memory and not the disk.  So, just about every block read by the full scan is read from the disk and not from memory.  In the case of the index scan all the blocks are in memory.

Other than the caching the optimizer’s cost estimates aren’t that far off.  I think the optimizer’s cost units are the equivalent of single block reads.  I think I read this in Jonathan Lewis’ book.  These are the sections of the plan that show the cost of the two ways of reading from the table:

   2    1     TABLE ACCESS (FULL) OF 'MYTABLE' 
              (Cost=23208 Card=68262 Bytes=819144)

   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=34865 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

So the full scan has a cost of 23208 which is the equivalent of that many 10 millisecond single block reads.  That’s about 232 seconds which is about 4 minutes.  So, that’s not far off for the cost of the full scan which ran in 5 minutes.  Also, full table scans do multi block reads so when autotrace says physical reads = 238854 what it really means is that many blocks were read in through physical reads.  Our system has db_file_multiblock_read_count=16 so probably 16 blocks are read per physical read.  The optimizer estimated a cost of 23208 which is about 10% of the physical blocks and this wasn’t a bad estimate because the blocks are read together.  So it assumed that the 238854 blocks would be read in the time it would take for 23208 single block reads and this was pretty close.

But, the index scan estimated 34865 sequential reads which is about 6 minutes.  Now, if there was no caching this wouldn’t be such a bad estimate.  Autotrace says there are 1180 blocks read from memory.  If these were read from disk each block would be a separate disk I/O so it would be about 12 seconds.  I checked and I did find that some values of the column had more rows than others and for the value 01242014 that I was testing with was one of the values with fewer rows.  So, for different values of COLUMN1 without caching the index would be a lot closer to the full scan:

SELECT COLUMN1,count(*) cnt
FROM MYTABLE
group by COLUMN1
order by cnt desc;

COLUMN1         CNT
-------- ----------
11262013     873867
11252013     576299
09222013     237098
08052013     179476
12082013     177359
11102013     175178
09152013     174220
10132013     174204
11172013     174102
08252013     173758
... skipped some ...
01242014      53801
10232013      53785
10072013      53335

So the value 01242014 only has 53801 rows but 11262013 has 863867 rows so it probably has blocks closer to the estimated 34865.  Here is a query to show how many blocks each value of the column has:

SELECT COLUMN1,
count(distinct DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)) cnt
FROM MYTABLE
group by COLUMN1
order by cnt desc;

COLUMN1        CNT
-------- ----------
11262013      16338
11252013      10847
09222013       4409
08052013       3437
12082013       3337
11102013       3305
09152013       3290
10132013       3286
11172013       3285
08252013       3281
10272013       3255
... skipped some ...
01242014        926
01232014        924
09262013        922
01132014        915

So, for the values of the column with the most rows there are 16338 blocks, somewhere in the vicinity of the estimated 34865 single block reads.  Anyway, if the optimizer is off by less than a factor of 100 I’d say it is pretty close.  But, the run time is off by about 600 – 5 minutes versus .2 seconds so that is a real problem.

Oracle’s method of dealing with caching of index blocks and indexed table blocks is one of the optimizer parameters.  Here are the settings as we have them on our system:

NAME                                 VALUE
------------------------------------ -----
optimizer_index_caching              0
optimizer_index_cost_adj             100

I messed with different values of optimizer_index_cost_adj and found that 66 was the highest value that would cause the plan to switch to an index scan.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE 
          (Cost=23011 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'MYTABLE' 
              (Cost=23011 Card=68262 Bytes=819144)
   3    2       INDEX (RANGE SCAN) OF 'MYINDEX' (NON-UNIQUE) 
                (Cost=194 Card=68262)

Note that the cost of the index access to the table has come down to 23011 which is less than the 23208 estimated for the full scan.  Also note that 23011/34865=.66=66/100.  So, the cost of the index access to the table is just multiplied by the value of optimizer_index_cost_adj divided by 100.

So, now we have come down to where the rubber meets the road.  What should we then do with all of this information?  The development team has told us that they can add a hint to their code, but they are concerned that there is some general problem with the database and they will have to keep adding hints in various other places.  Note that this is a very old system with vendor supplied code.  Presumably before the data was cleaned up the cost of the full scan exceeded that of the index scan and that it is only because our new table has so much less data that the index scan has become less costly.

So, we could set optimizer_index_cost_adj to 66 or some other value but then we would need to test the entire system.  How much effort are we going to put into a system that we need to retire anyway?  This is really where I come down with hints and other performance methods.  I’ve known about using optimizer_index_cost_adj to encourage index scans for a long time.  But, if I’m supporting a bunch of older systems I can’t just go in and start tweaking parameters without appropriate testing.  So, I’d love to adjust the parameter rather than using the hint to band aid this one query, but we have a lot of databases to support and some times you just have to apply the quick fix rather than taking the time to apply a more global solution.

– Bobby

About Bobby

I live in Chandler, Arizona with my wife and three daughters. I work for US Foods, the second largest food distribution company in the United States. I have worked in the Information Technology field since 1989. I have a passion for Oracle database performance tuning because I enjoy challenging technical problems that require an understanding of computer science. I enjoy communicating with people about my work.
This entry was posted in Uncategorized. Bookmark the permalink.

2 Responses to Query tuning example

  1. Kumar says:

    Hi Bob
    I know those two parameters should not be set. You probably know that there are lots of discussions on this. So my question is did you try to remove them and see if that would still have the desired effect. I would have done the same thing as you did in this case (use a hint if it is helping as the system is going to be retired anyway and dont go for global change) but just curious if this is something that can be tested.

    Thanks as always for your analysis and blogging about it. Apart of actual content, it gives ideas for others working on similar issues.

    – Kumar

    • Bobby says:

      Kumar,

      Thanks for your comment. I think that the optimizer parameters I mentioned are not really set in the sense that we didn’t override the default. This is not a good practice on our part, it is more ignorance and lack of time to do the job right. If I had the time I would carefully choose the settings of the optimizer parameters that affect query plans based on the settings that cause the overall best effect. I.e. Every time I setup a new database I could carefully choose the settings of the parameters. I have not heard of anyone saying that optimizer_index_caching and optimizer_index_cost_adj should be left at their default values.

      – Bobby

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.