In an earlier post I described how some queries with long parse times were causing long library cache lock waits. Friday I applied a patch from Oracle that resolved the long parse times. Here are the conditions which may relate to this bug:
- Interval partitioned table
- Partitioned by range
- Sub-partitioned by list
- Open ended range on partitioning column in where clause
- Tens of thousands of sub-partitions
Prior to applying the patch I did an explain plan on two versions of the problematic query. One version specified an open-ended range on the partitioning column and the other closed off the range.
... T392658.CLNDR_DT >= TO_DATE('2014-11-17' , 'YYYY-MM-DD') and ... Explained. Elapsed: 00:00:46.20
... T392658.CLNDR_DT >= TO_DATE('2014-11-17' , 'YYYY-MM-DD') and T392658.CLNDR_DT <= TO_DATE('2014-11-26' , 'YYYY-MM-DD') and ... Explained. Elapsed: 00:00:00.09
The queries are the same except for the extra date condition which closes off the date range. Note that the explain plan took 46 seconds with the open-ended range and less than a tenth of a second with the closed off range.
With the patch the slow version is just as fast as the fast one.
This is bug 20061582 in Oracle’s system.
From my conversations with Oracle it sounds like in certain cases the optimizer is reviewing information for many or all of the sub-partitions and since we have 20,000 for this table in production it can take minutes to parse. I also messed with the dates on the open-ended condition and found that if I made the date early enough the parse time issue went away. So, it seems that there is some set of conditions, which I don’t know how to easily reproduce, which lead the optimizer to look at sub-partition information and slows parse time.
This is on 18.104.22.168 on an Exadata system. I was able to reproduce the long parse times on non-Exadata Linux 22.214.171.124 and 126.96.36.199 systems so it does not occur only on Exadata.
This issue surprised me because I thought that the optimizer would not look at partition or sub-partition statistics in a query that has a range condition which spanned more than one partition. In the past I have always seen the global or table level stats used in these situations. But, now I know that there are some cases where the optimizer will dig into the sub-partition stats even though the query covers more than one partition.
Looking at the plan of my slow parsing query before and after the patch I get a clue that the optimizer is looking at sub-partition stats:
Partition range without the patch:
Partition range with the patch:
1 | 136
Evidently KEY(SQ) relates to some sort of filtering of the sub-partitions which cause the long parse time. The manuals describe KEY(SQ) in a section titled “Dynamic Pruning with Subqueries” so maybe the problem queries have some sub-query that the optimizer was using to choose which sub-partitions that the query needed.
If you have an interval partitioned table with tens of thousands of sub-partitions and parse times in the minutes with open-ended ranges specified on the partitioning column your system could be hitting this same issue.