We have been discussing query tuning at work, and I’ve come back to an example of the Oracle optimizer choosing the wrong plan because of an error in its estimate of the number of rows a query returns. I want to show how easy and common it is to have queries that the optimizer cannot optimize correctly because it can’t get the number of rows, or cardinality, right.
I’ve recast the fourth script from my cardinality talk – old scripts here. I’ve made it more realistic by changing the table names to something more business related. I’ve also tried to prove that multi-column histograms don’t help and neither does cardinality feedback. This is all on 11.2.0.3 on 32-bit Windows. Here is the updated script and its log.
Two tables are joined. The first is a small division table with text division names and numeric division numbers:
create table DIVISION (DIVNAME VARCHAR2(2000),DIVNUM NUMBER);

insert into DIVISION values ('Los Angeles',1);
insert into DIVISION values ('Mayberry',2);
Next is the sales detail table. Note that there was only one sale in Mayberry but a million in Los Angeles:
create table SALES (DIVNUM NUMBER);

/* SALES table has 1,000,000 1's for LA and one 2 for Mayberry */

declare
  i number;
begin
  i:=1;
  loop
    insert into SALES values (1);
    i:=i+1;
    if (mod(i,10000)=0) then
      commit;
    end if;
    exit when (i > 1000000);
  end loop;
  commit;
  insert into SALES values (2);
  commit;
end;
/

create index SALESINDEX on SALES(DIVNUM);
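(Not shown above: the script also gathers optimizer statistics on both tables after the load. A call along these lines does that; the method_opt value here, which asks for histograms on every column, is just one reasonable choice and may not match the posted script exactly.)

begin
  -- gather stats on both tables, requesting histograms on all columns
  dbms_stats.gather_table_stats(user,'DIVISION',method_opt=>'FOR ALL COLUMNS SIZE 254');
  dbms_stats.gather_table_stats(user,'SALES',method_opt=>'FOR ALL COLUMNS SIZE 254');
end;
/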
The query just retrieves the one Mayberry row by joining the division and sales tables. There is an index on SALES.DIVNUM, but the plan doesn’t use it.
SQL> select B.DIVNUM
  2  from DIVISION A,SALES B
  3  where
  4  a.DIVNUM=B.DIVNUM and
  5  A.DIVNAME='Mayberry';

    DIVNUM
----------
         2

Elapsed: 00:00:00.07

--------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   455 (100)|
|*  1 |  HASH JOIN         |          |   500K|  8300K|   455   (2)|
|*  2 |   TABLE ACCESS FULL| DIVISION |     1 |    14 |     3   (0)|
|   3 |   TABLE ACCESS FULL| SALES    |  1000K|  2929K|   448   (2)|
--------------------------------------------------------------------
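The problem is the join estimate: the optimizer predicts 500K rows from the hash join when the real answer is one row, because nothing in its statistics ties DIVNAME='Mayberry' to the fact that DIVNUM 2 occurs only once in SALES, so it assumes an average division. Just to show what the better plan looks like, hints such as these force a nested loops join that drives into the index (this hinted version isn’t part of the test script; it’s only here for illustration):

select /*+ use_nl(B) index(B SALESINDEX) */ B.DIVNUM
from DIVISION A,SALES B
where
a.DIVNUM=B.DIVNUM and
A.DIVNAME='Mayberry';

With the hints in place the plan should scan the two-row DIVISION table and then probe SALESINDEX once for DIVNUM 2, which is clearly cheaper than hashing a million SALES rows.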
We have queries like this all the time: two tables joined on some sort of numeric key, with the where clause applying a meaningful text-based condition to the smaller table. In this example it is a small division table joined to a sales table, but it could just as easily be a date table that relates quarters or fiscal periods to particular days.
Some sort of cross table histogram would be needed to fix this. It would look something like this:
DIVISION.DIVNAME  DIVISION.DIVNUM  SALES.DIVNUM  NUM_ROWS
Los Angeles                     1             1   1000000
Mayberry                        2             2         1
But think about how complex this could get in terms of all the cross table relationships, even if this type of histogram were implemented. For the optimizer to figure out the row count across joins in the most general case, it would need countless different cross table histograms to capture the relationships among the many tables and columns that make up a large database.
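In 11g a multi-column histogram means extended statistics on a column group, and a column group can only span columns within a single table. Here is a sketch of what creating one on DIVISION looks like (the exact calls in my test script may differ); it captures the DIVNAME to DIVNUM correlation inside DIVISION, but it still says nothing about how many SALES rows match each DIVNUM, so it doesn’t change the plan above:

-- create a column group on DIVISION's two columns, then regather stats
-- (a column group cannot span DIVISION and SALES, which is the real problem)
select dbms_stats.create_extended_stats(user,'DIVISION','(DIVNAME,DIVNUM)') from dual;

begin
  dbms_stats.gather_table_stats(user,'DIVISION',method_opt=>'FOR ALL COLUMNS SIZE 254');
end;
/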
The type of join demonstrated in the test script is common in our production database applications. So, I believe this test shows that we should expect the optimizer to frequently produce inaccurate cardinality estimates, which can lead to inefficient query execution plans.
– Bobby
P.S. I ran across a case yesterday where a dynamic sampling hint helped a query with multiple correlated predicates against a single table. So, I thought I’d try that hint on both tables in the example for this post, but it still didn’t get the good plan with the index. The updated example script and its log are here. I ran this test on Exadata on 11.2.0.2 because that was where I saw the dynamic sampling hint work.
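For reference, the hinted query looked something like this (the sampling level shown here is just an example and may not be the level I actually used):

select /*+ dynamic_sampling(A 4) dynamic_sampling(B 4) */ B.DIVNUM
from DIVISION A,SALES B
where
a.DIVNUM=B.DIVNUM and
A.DIVNAME='Mayberry';

Dynamic sampling estimates predicate selectivity by sampling each table on its own, so it is not surprising that it still can’t see the cross table relationship between DIVNAME on DIVISION and the distribution of DIVNUM in SALES.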