Exadata compression reduces row chaining > 255 columns

After I did my previous post on how adding a column to a table with more than 255 columns can cause a ton of row chaining the question came up about Exadata and the form of compression we use on our > 255 column tables.  Would we see the same sort of row chaining on our tables that have more than 255 columns, especially after we add new columns and populate them with data?

So, I reran the same scripts from the previous post unchanged and with the addition of QUERY HIGH compression of the tables after they are loaded with data.  Our real tables on our Exadata system are compressed in this way so I wanted to see if, assuming the tables are re-compressed when updates require it, will we still see row chaining.

Bottom line is that after compressing the tables for query high row chaining was almost eliminated even with more than 255 columns and after adding a new column to a table with more than 255 columns.

Here are the test results from the previous post – 16K block, 11.2.0.3 HP-UX Itanium 64-bit.  These are the increase in the table fetch continued row statistic after running a query with the given column in the where clause and forcing a serial full scan.

Table         Column 1 Column 2 Column 3 Last Column-1 Last Column

TEST255C             0        0        0             0           0

TEST256C             0    83333    83333         83333       83333

TEST256CPLUS1        0    83333  1999999       1999999     1999999

Here is on Exadata – V2 quarter rack with 11.2.0.2 BP20 and no compression.  Also, block size is 8K – half of what we had before.

Table         Column 1 Column 2 Column 3 Last Column-1 Last Column

TEST255C             0        0        0             0          14

TEST256C             0   166666   166666        166666      166666

TEST256CPLUS1        0   166666  1999999       1999999     1999999

So, without compression basically the same results.  I think the smaller block is probably why there is more chaining for the same amount of data.

But check out the dramatic reduction in chaining if you re-compress the table in each case with QUERY HIGH compression:

Table         Column 1 Column 2 Column 3 Last Column-1 Last Column

TEST255C            46       46       46           181         182

TEST256C            42       42       42           172         172

TEST256CPLUS1       41       41       41           172         172

Negligible chaining regardless and the elapsed times are all less than .2 seconds.

...>grep Elapsed *.log
nexttolastcolumns.log:Elapsed: 00:00:00.09
nexttolastcolumns.log:Elapsed: 00:00:00.08
nexttolastcolumns.log:Elapsed: 00:00:00.09
test255c.log:Elapsed: 00:00:00.09
test256c.log:Elapsed: 00:00:00.09
test256cplus1.log:Elapsed: 00:00:00.07
testfirstcolumns.log:Elapsed: 00:00:00.11
testfirstcolumns.log:Elapsed: 00:00:00.11
testfirstcolumns.log:Elapsed: 00:00:00.09
testsecondcolumns.log:Elapsed: 00:00:00.08
testsecondcolumns.log:Elapsed: 00:00:00.08
testsecondcolumns.log:Elapsed: 00:00:00.09
testthirdcolumns.log:Elapsed: 00:00:00.09
testthirdcolumns.log:Elapsed: 00:00:00.08
testthirdcolumns.log:Elapsed: 00:00:00.08

By comparison the uncompressed Exadata elapsed times were in the seconds with chaining:

...>grep Elapsed *.log
nexttolastcolumns.log:Elapsed: 00:00:00.99
nexttolastcolumns.log:Elapsed: 00:00:01.83
nexttolastcolumns.log:Elapsed: 00:00:02.81
test255c.log:Elapsed: 00:00:01.07
test256c.log:Elapsed: 00:00:01.83
test256cplus1.log:Elapsed: 00:00:03.19
testfirstcolumns.log:Elapsed: 00:00:00.72
testfirstcolumns.log:Elapsed: 00:00:10.97
testfirstcolumns.log:Elapsed: 00:00:00.51
testsecondcolumns.log:Elapsed: 00:00:04.80
testsecondcolumns.log:Elapsed: 00:00:04.63
testsecondcolumns.log:Elapsed: 00:00:01.18
testthirdcolumns.log:Elapsed: 00:00:04.36
testthirdcolumns.log:Elapsed: 00:00:04.56
testthirdcolumns.log:Elapsed: 00:00:02.21

On our non-Exadata system the chaining resulted in times in minutes with the most chaining:

...>grep Elapsed *.log
nexttolastcolumns.log:Elapsed: 00:00:22.89
nexttolastcolumns.log:Elapsed: 00:00:30.15
nexttolastcolumns.log:Elapsed: 00:04:19.77
test255c.log:Elapsed: 00:00:02.49
test256c.log:Elapsed: 00:00:06.79
test256cplus1.log:Elapsed: 00:00:09.41
testfirstcolumns.log:Elapsed: 00:00:20.91
testfirstcolumns.log:Elapsed: 00:00:23.24
testfirstcolumns.log:Elapsed: 00:00:28.78
testsecondcolumns.log:Elapsed: 00:00:11.34
testsecondcolumns.log:Elapsed: 00:00:15.91
testsecondcolumns.log:Elapsed: 00:00:18.46
testthirdcolumns.log:Elapsed: 00:00:13.29
testthirdcolumns.log:Elapsed: 00:00:17.95
testthirdcolumns.log:Elapsed: 00:04:12.92

So, if we can keep our tables with more than 255 columns compressed for query high row chaining doesn’t appear to be a factor, even if we add new columns and populate them.  I believe this is because QUERY HIGH compression spreads the data for the rows in several blocks across the blocks anyway so they really aren’t stored in the same way that uncompressed rows are stored with all the columns for a row kept together.

- Bobby

 

About Bobby

I live in Chandler, Arizona with my wife and three daughters. I work for US Foods, the second largest food distribution company in the United States. I've been working as an Oracle database administrator and PeopleSoft administrator since 1994. I'm very interested in Oracle performance tuning.
This entry was posted in Uncategorized. Bookmark the permalink.

4 Responses to Exadata compression reduces row chaining > 255 columns

  1. Jim says:

    Bobby, Do you now what is the impact to update/merge time on a compressed row vs an uncompressed row?
    From what I have read, it will take longer to update a compresed row because of the time to uncompress.

    • Bobby says:

      Jim,

      It’s a good question. I’d have to test it to know for sure. I think that with hybrid columnar compression like we have with query high compression the columns for the rows in a group of blocks get grouped together where for one column in all the rows they share the same block. So, if you do an update of a single column that might be pretty efficient. But, if you are updating a bunch of columns it seems like it would have to be less efficient since they would be spread over multiple blocks. Maybe I can put together a test to prove this out.

      But, one thing to consider is that in our systems we often have merge statements where almost all the time is spent processing the query behind the merge and not on the update/insert itself. So, if the query that’s part of the merge statement takes most of the time then the time to uncompress may not be important in terms of the overall run time of the merge statement.

      - Bobby

    • Bobby says:

      Jim,

      I did a quick test. Updates on query high compressed tables are much slower than on uncompressed tables. At least, they were in my testcase which was on a table with 256 columns. It didn’t matter which column I updated. It was more than twice as long.

      I may put out a blog post on this if I get time today.

      - Bobby

  2. Pingback: Updates of tables with query high compression slow | Bobby Durrett's DBA Blog

Leave a Reply