I was asked to look at a batch SQL query that had run for 12 hours on June 4th. I tried a few other ideas first and then went back to my best method for tuning long-running batch queries: I ripped the query apart. My version ran in 45 minutes.
This approach only works if you have the code and can change it. I work with a lot of PL/SQL code in Unix shell scripts running SQL*Plus. If someone comes to me with a SQL statement embedded in this kind of code, I do not tear my hair out trying to tune it in some clever way. This also assumes that I cannot use a SQL Profile. SQL Profiles are my first choice for production performance problems. But for long-running SQL that I fully control and cannot fix with a SQL Profile, I tear it apart.
Ripping or tearing a SQL statement apart means that I become the optimizer. In this case the problem query joined 5 tables. It summarized 3 months of data from a large fact table, and the other 4 tables were joined to the fact table. I replaced the one query with 5 queries, each of which saved its results in a table. The first query summarized the fact table, and each of the remaining four joined one more table to the current results. Something like this:
- Summarize 3 months of fact table data – table 1
- Join table 2 on surrogate key
- Join table 3 on surrogate key
- Join table 4 on natural keys
- Join table 5 on natural keys
So, I created 5 tables, each of which held the results of the joins so far. I dropped each table as soon as I was done with it to save space.
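The steps above can be sketched in SQL. All table and column names here are hypothetical (the real query is not shown), but the pattern is the one described: each step materializes its results with CREATE TABLE AS SELECT, and the previous step's table is dropped once it is no longer needed.

```sql
-- Step 1: summarize 3 months of fact table data (hypothetical names and dates)
CREATE TABLE step1 AS
SELECT f.key2, f.key3, f.nat_key4, f.nat_key5,
       SUM(f.amount) AS total_amount
FROM   fact_table f
WHERE  f.sale_date >= DATE '2022-03-01'
AND    f.sale_date <  DATE '2022-06-01'
GROUP BY f.key2, f.key3, f.nat_key4, f.nat_key5;

-- Step 2: join table 2 on its surrogate key
CREATE TABLE step2 AS
SELECT s.*, t2.attr2
FROM   step1 s
JOIN   table2 t2 ON s.key2 = t2.key2;

DROP TABLE step1;  -- done with step 1's results, drop it to save space

-- Step 3: join table 3 on its surrogate key
CREATE TABLE step3 AS
SELECT s.*, t3.attr3
FROM   step2 s
JOIN   table3 t3 ON s.key3 = t3.key3;

DROP TABLE step2;

-- Steps 4 and 5 follow the same pattern, joining table4 and table5
-- on their natural key columns and dropping the previous step's table.
```

Because each step runs alone, you can time it, look at its plan, and tune it in isolation before moving on.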
I have a feeling that I could use some clever hints to force the join order and access methods to match my 5 queries. But my time is short and this works, so I did not bother trying. When you have a query that runs for 12 hours, it is not very motivating to try various hints to see if one makes it run faster. How long do you wait on each try before you give up? Working on one table at a time is nice. This approach has worked for me time and time again. It is almost a mechanical process, without a lot of painful thinking.
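For reference, the hint approach I skipped would look something like this in Oracle: a LEADING hint to fix the join order and join-method hints for each table. The aliases are hypothetical and this is untested against the real query; it is just a sketch of the alternative.

```sql
SELECT /*+ LEADING(f t2 t3 t4 t5)
           USE_HASH(t2) USE_HASH(t3) USE_HASH(t4) USE_HASH(t5)
           FULL(f) */
       ...  -- same select list, joins, and predicates as the original query
```

The catch, as noted above, is that each attempt costs a full run of the query to find out whether the hints helped.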
Anyway, I pass this on to the internet. People may think that breaking up a 5 table join into 5 queries is ugly, but it works.
Very cool – I didn’t know that would work!
Thanks Karl. It should work with any SQL-based database. The optimizer has to come up with a join order and choose access methods for every table. So if you have a tough query that you can change, you can make those choices instead of the optimizer. I hope that you and your family are doing well.
Do you have the execution plan with the row source info of the long-running job? It would be interesting to compare the actual rows with Oracle's estimated cardinality and work out if there is a misestimation somewhere.
Then eventually tune it and compare to the split job timings.
Thank you for your comment. I could have run the original query for 12 hours with the gather_plan_statistics hint, gotten the actual row counts, and looked for bad row estimates, but it was just easier to tear the query apart. If you have access to the code, why not break up a problem query instead of trying to fix it? On shorter queries that I cannot easily change, I normally take the approach that you describe. But with a huge batch query that I fully control, why not tear it apart and tune each join?
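For anyone following along, the check David describes looks like this in Oracle: run the query with the gather_plan_statistics hint, then pull the cursor's plan with actual row counts so you can compare E-Rows (estimated) against A-Rows (actual) at each step.

```sql
-- Run the problem query with row source statistics collection enabled
SELECT /*+ gather_plan_statistics */ ...
FROM   ...;

-- Then, in the same session, display the last plan with estimates vs. actuals
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on some plan line points at the misestimation that led the optimizer to a bad join order or method.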
There is no right or wrong way. Now you have a query that is performant, and you have control over it, as you say.
However, resolving the potential optimizer issue causing your "bad" plan could help tune other queries in the database.
Anyway thanks for sharing,
Thanks again for commenting, David. You are right that resolving an underlying issue could tune other queries in the database. In this case it is a very old data warehouse that I do not want to make changes to. Maybe I should have shared that in the blog post. A lot of the systems that I work on are old and poorly maintained. I use SQL Profiles and break up queries rather than making changes that affect other queries because I am supporting older systems and do not have the resources to test a change that would affect multiple SQL statements. If I were working on a new system, I would adjust parameters, change how stats are gathered, modify partitioning, use indexes of various kinds, etc. to set up the system so that the optimizer makes the best choices for as many SQL statements as possible. In my environment, supporting many older systems, I have to focus on fixing one SQL statement and leaving the rest unchanged. I would certainly rather do the sort of thing you describe when I can.