Yesterday I used Delphix to quickly recover ten production tables that had accidentally been emptied over the weekend. We knew that at a certain time on Saturday the tables were fully populated and that some batch processing wrecked them after that. So we created a new virtual database that was a clone of production as of the date and time just before the problem occurred. We could have accomplished the same task using RMAN to clone production, but Delphix spun up the new copy more quickly than RMAN would have.
The source database is 5.4 terabytes and there were about 50 gigabytes of archive logs that we needed to apply to recover to the needed date and time. It took about 15 minutes to complete the clone including applying all the redo. The resulting database occupies only 10 gigabytes of disk storage.
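To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above and assumes decimal units (1 terabyte = 1000 gigabytes). Since the 15 minutes covered the whole clone and not just the redo apply, the rate it prints is only a lower bound.

```python
# Figures quoted in the post; decimal units are an assumption (1 TB = 1000 GB).
SOURCE_TB = 5.4      # size of the source database
CLONE_GB = 10        # disk storage occupied by the virtual database
REDO_GB = 50         # archive logs applied during recovery
CLONE_MINUTES = 15   # total time for the clone, including redo apply

source_gb = SOURCE_TB * 1000

# Fraction of the source size that the clone actually occupies on disk.
clone_fraction = CLONE_GB / source_gb

# Lower bound on the redo apply rate: the whole clone took 15 minutes,
# so the redo apply alone cannot have been slower than this.
min_redo_rate_gb_per_min = REDO_GB / CLONE_MINUTES

print(f"Clone occupies {clone_fraction:.2%} of the source size")
print(f"Redo applied at no less than {min_redo_rate_gb_per_min:.1f} GB per minute")
```

In other words, the virtual copy takes up well under one percent of the space a full physical copy would need.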
If we had used RMAN we would first have had to add more disk storage because we don’t have a system with enough free space to hold a copy of the needed tablespaces. Then, after waiting for our storage and Unix teams to add the needed storage, we would have had to do the restore and recovery. All these manual steps take time and are prone to human error, whereas the Delphix system is point and click through a graphical user interface (GUI).
Lastly, during the recovery we ran into Oracle bug 7373196, which caused our first attempted recovery to fail with an ORA-00600 [krr_init_lbufs_1] error. After researching this bug I had to rerun the restore and recovery with the parameter _max_io_size set to 33554432, which is the workaround for the bug. Had we been using RMAN we probably would have had to run the recovery at least twice to resolve this bug. Maybe we could have restarted from the point where it failed, but I’m not sure. With Delphix it was just a matter of setting the _max_io_size parameter and starting from scratch, since I knew the process only took 15 minutes. Actually it took me two or three attempts to figure out how to set the parameter, but once I figured it out it was so simple I’m not sure why I didn’t do it right the first time.
So, at the end of the day it was just under 3 hours from my first contact about this issue until they had the database up and were able to funnel off the data they needed to resolve the production issue. Had I been doing an RMAN recovery I don’t doubt that I would have worked late into the night yesterday accomplishing the same thing.
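A side note on the workaround value: 33554432 is easy to mistype, so a quick sanity check before setting it does not hurt. The snippet below is only an arithmetic check of the number quoted above. It does not touch Oracle or Delphix, and the variable names are mine.

```python
# Workaround value for _max_io_size as quoted in the post.
MAX_IO_SIZE = 33554432

# Express the value in mebibytes (1 MiB = 1024 * 1024 bytes).
mib = MAX_IO_SIZE // (1024 * 1024)

# A positive integer is a power of two when exactly one bit is set.
is_power_of_two = (MAX_IO_SIZE & (MAX_IO_SIZE - 1)) == 0

print(f"{MAX_IO_SIZE} bytes = {mib} MiB, power of two: {is_power_of_two}")
```

The value works out to exactly 32 MiB, which makes a typo easy to spot.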
P.S. These databases are on HP-UX 11.31 on IA64, Oracle version 18.104.22.168.0.