The Arizona Oracle User Group has moved tomorrow’s meeting to Oracle’s offices on Camelback Road:
Meetup link with meeting details
Bobby
This morning a colleague of mine noticed an alert from our plan change monitor and forced a SQL statement back to its normal fast plan before any users were affected. I want to share the monitor script and describe how we use its output.
I know this is long, but I want to paste the full SQL text of the monitor script here. Review my comments on each step. Even though I wrote the script I cannot remember its details myself, but hopefully the comments will help:
```
set linesize 160
set pagesize 1000
set echo on
set termout on
set trimspool on
set define off

-- status active
-- this plan isn't the most commonly
-- run plan.
-- average this plan in v$sql > 10 x average of
-- most common plan

-- status active sessions
-- sql_id and plan_hash_value
-- elapsed and executions
-- max elapsed for the sql
-- eliminate pl/sql which has
-- plan hash value of 0

drop table active_sql;

create table active_sql as
select
vs.sql_id,
sq.plan_hash_value,
max(sq.ELAPSED_TIME) elapsed,
max(sq.executions) executions
from
v$session vs,
v$sql sq
where
vs.sql_id=sq.sql_id and
vs.SQL_CHILD_NUMBER=sq.child_number and
vs.status='ACTIVE' and
sq.plan_hash_value <> 0
group by
vs.sql_id,
sq.plan_hash_value;

-- to get the most frequently
-- used plan first get the number
-- of executions by plan
-- for each of the active sqls

drop table plan_executions;

create table plan_executions as
select
ss.sql_id,
ss.plan_hash_value,
sum(ss.executions_delta) total_executions
from
dba_hist_sqlstat ss,
active_sql a
where
a.sql_id=ss.sql_id
group by
ss.sql_id,
ss.plan_hash_value;

-- use the previous table to get
-- the plans that are most frequently
-- used. note that two plans could
-- have the same number of executions
-- but this is unlikely.

drop table most_frequent_executions;

create table most_frequent_executions as
select
pe1.sql_id,
pe1.plan_hash_value
from
plan_executions pe1
where
pe1.total_executions =
(select max(pe2.total_executions)
from plan_executions pe2
where pe1.sql_id=pe2.sql_id);

-- handle special case of two plans with
-- same number of executions.
-- pick one with highest plan value
-- just to eliminate dups.

drop table most_frequent_nodups;

create table most_frequent_nodups as
select
mfe1.sql_id,
mfe1.plan_hash_value
from
most_frequent_executions mfe1
where
mfe1.plan_hash_value =
(select max(mfe2.plan_hash_value)
from most_frequent_executions mfe2
where mfe1.sql_id=mfe2.sql_id);

-- get list of active sql that
-- are not running the most
-- frequently executed plan

drop table not_most_freq;

create table not_most_freq as
select * from active_sql
where
(sql_id,plan_hash_value) not in
(select sql_id,plan_hash_value from most_frequent_nodups);

-- working on this part of the logic:
-- average this plan in v$sql > 10 x average of
-- most common plan

-- get average elapsed of most
-- frequently executed plans
-- add 1 to handle case of 0 executions

drop table avg_elapsed_most_f;

create table avg_elapsed_most_f as
select
ss.sql_id,
ss.plan_hash_value,
sum(ss.ELAPSED_TIME_DELTA)/(sum(ss.executions_delta)+1) avg_elapsed
from
DBA_HIST_SQLSTAT ss,
most_frequent_nodups nd
where
ss.sql_id = nd.sql_id and
ss.plan_hash_value = nd.plan_hash_value
group by
ss.sql_id,
ss.plan_hash_value;

-- get list of the sqls that are running
-- the plan that isn't most frequently
-- executed and has an average elapsed
-- more than 10 times the average of
-- the most frequently executed plan
-- add 1 to executions to prevent
-- divide by zero

drop table more_than_10x;

create table more_than_10x as
select
n.sql_id,
n.plan_hash_value
from
not_most_freq n,
avg_elapsed_most_f m
where
(n.elapsed/(n.executions+1)) > 10 * m.avg_elapsed and
n.sql_id=m.sql_id;

spool planchangemonitor.log

select name db_name from v$database;

-- The listed sql_id and plan_hash_value items correspond to
-- sql statements that have plans that may be
-- inefficient and need to be investigated.
-- The session id and username are included if a
-- session is currently running the sql with the plan.

select
'CHANGED '||'PLAN' flag,
m.sql_id,
m.plan_hash_value,
s.sid,
s.username
from
more_than_10x m,
v$session s,
v$sql q
where
m.sql_id=s.sql_id(+) and
m.plan_hash_value=q.plan_hash_value(+) and
s.sql_id=q.sql_id and
s.SQL_CHILD_NUMBER=q.child_number
order by m.sql_id, m.plan_hash_value, s.sid, s.username;

spool off
```
If I remember correctly, the script looks for sessions running a plan whose current average run time is more than 10 times that of the SQL’s most frequently executed plan. This script is not perfect. The join to v$sql is not exact and in some cases you can get duplicates. People could also quibble about the approach: why 10 times the normal run time? I thought about more complex approaches, but I just needed to get something in place. Still, on one database with a lot of small transactions we have made good use of this script, despite its flaws.
This morning my colleague noticed emails from the script that runs this SQL. They listed a query whose plan had changed. The output looked like this:
```
FLAG         SQL_ID        PLAN_HASH_VALUE        SID USERNAME
------------ ------------- --------------- ---------- ---------
CHANGED PLAN 75ufmwrcmsuwz      2484041482         35 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        394 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        395 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        446 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        463 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        464 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        544 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        613 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        631 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        665 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        678 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        738 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        746 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        750 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482        752 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       1333 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       1416 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       1573 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       1943 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       1957 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       3038 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       3445 YOURUSER
CHANGED PLAN 75ufmwrcmsuwz      2484041482       3816 YOURUSER
```
I changed the real user to YOURUSER. This output indicates that 23 sessions were all running the same SQL – sql_id 75ufmwrcmsuwz – and that this SQL was running on a new plan that made it take at least 10 times its normal run time. In fact, it was taking about 30 times as long.
To resolve the issue my colleague used our script to find the history of plans for 75ufmwrcmsuwz.
```
select ss.sql_id,
ss.plan_hash_value,
sn.END_INTERVAL_TIME,
ss.executions_delta,
ELAPSED_TIME_DELTA/(executions_delta*1000) "Elapsed Average ms",
CPU_TIME_DELTA/(executions_delta*1000) "CPU Average ms",
IOWAIT_DELTA/(executions_delta*1000) "IO Average ms",
CLWAIT_DELTA/(executions_delta*1000) "Cluster Average ms",
APWAIT_DELTA/(executions_delta*1000) "Application Average ms",
CCWAIT_DELTA/(executions_delta*1000) "Concurrency Average ms",
BUFFER_GETS_DELTA/executions_delta "Average buffer gets",
DISK_READS_DELTA/executions_delta "Average disk reads",
ROWS_PROCESSED_DELTA/executions_delta "Average rows processed"
from DBA_HIST_SQLSTAT ss,DBA_HIST_SNAPSHOT sn
where ss.sql_id = '75ufmwrcmsuwz'
and ss.snap_id=sn.snap_id
and executions_delta > 0
and ss.INSTANCE_NUMBER=sn.INSTANCE_NUMBER
order by ss.snap_id,ss.sql_id;

SQL_ID        PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms IO Average ms Cluster Average ms Application Average ms Concurrency Average ms Average buffer gets Average disk reads Average rows processed
------------- --------------- ------------------------- ---------------- ------------------ -------------- ------------- ------------------ ---------------------- ---------------------- ------------------- ------------------ ----------------------
75ufmwrcmsuwz      2297146707 30-AUG-15 09.00.54.472 AM              830         587.207673     49.9638554    541.617188                  0                      0                      0          5234.01928         125.083133              332.66747
75ufmwrcmsuwz      2297146707 30-AUG-15 10.00.46.872 AM             1178         578.205867     49.3972835    532.377174                  0                      0                      0          4870.22326         126.048387              324.50764
75ufmwrcmsuwz      2297146707 30-AUG-15 11.00.39.206 AM             1433         631.484713     49.1486392    585.826676                  0                      0                      0          4624.11305         125.446615              299.57083
75ufmwrcmsuwz      2297146707 30-AUG-15 12.00.32.247 PM             1620         592.593823     49.5987654     546.29731                  0                      0                      0          4744.17284         121.961728             312.735185
75ufmwrcmsuwz      2297146707 30-AUG-15 01.00.20.783 PM             1774         534.412339      51.059752     485.46836                  0                      0                      0          4983.44983         119.564825             326.067644
75ufmwrcmsuwz      2297146707 30-AUG-15 02.00.08.843 PM             1757         447.385105     44.9345475    404.415659                  0                      0                      0          4525.13147         107.277746             293.739328
75ufmwrcmsuwz      2297146707 30-AUG-15 03.00.57.308 PM             1626         431.718507      45.904059    388.200416                  0                      0                      0          4462.93296         118.027675             300.724477
75ufmwrcmsuwz      2297146707 30-AUG-15 04.00.44.209 PM             1080         375.905966      44.212963    334.434835                  0                      0                      0          4766.81574         109.157407             310.712037
75ufmwrcmsuwz      2297146707 30-AUG-15 05.00.36.753 PM              707         368.289475     44.3140028    327.166223                  0                      0                      0          4894.20509         108.050919             315.565771
75ufmwrcmsuwz      2297146707 30-AUG-15 06.00.26.449 PM              529         341.483588     39.6408318     305.47356                  0                      0                      0          4381.19849         96.2646503             288.030246
75ufmwrcmsuwz      2297146707 30-AUG-15 07.00.17.636 PM              356         380.733635     41.5168539    342.034876                  0                      0                      0           4553.4691         105.272472             292.283708
75ufmwrcmsuwz      2297146707 30-AUG-15 08.00.11.170 PM              313         435.449406     37.1565495    402.636489                  0                      0                      0          4144.30351         92.8690096             264.923323
75ufmwrcmsuwz      2297146707 30-AUG-15 09.00.07.894 PM              214         516.455509     44.5794393    477.020692                  0                      0                      0          4567.67757         114.415888             289.607477
75ufmwrcmsuwz      2297146707 30-AUG-15 10.00.59.991 PM              182         720.749681     44.3956044    684.439467                  0                      0                      0          3811.83516         95.2362637             239.027473
75ufmwrcmsuwz      2297146707 30-AUG-15 11.00.47.388 PM               83          1043.1503     43.7349398    1008.41358                  0                      0                      0          3575.96386         114.289157             250.120482
75ufmwrcmsuwz      2484041482 30-AUG-15 11.00.47.388 PM                6         25314.6558     4311.66667    22971.4913                  0                      0                      0          78533.8333         69813.3333             157.833333
75ufmwrcmsuwz      2484041482 31-AUG-15 12.00.36.033 AM               96         25173.7346     5105.20833    21475.9516                  0                      0                      0          135242.802         62433.3125             118.395833
75ufmwrcmsuwz      2484041482 31-AUG-15 01.00.29.070 AM               39         26877.0626     5540.51282    22977.6229                  0                      0                      0          139959.308         68478.1795             93.7179487
75ufmwrcmsuwz      2484041482 31-AUG-15 02.00.18.755 AM               38          26993.419     5998.15789    22768.4285                  0                      0                      0          153843.342              74492             149.342105
75ufmwrcmsuwz      2484041482 31-AUG-15 03.00.09.615 AM               29         25432.5074     4952.06897    22288.7966                  0                      0                      0          112813.552         69803.0345             187.689655
75ufmwrcmsuwz      2484041482 31-AUG-15 04.00.01.749 AM               34         27281.7339     4541.47059    24543.1609                  0                      0                      0             95144.5         69187.3824             135.676471
75ufmwrcmsuwz      2484041482 31-AUG-15 05.00.55.876 AM              146         30512.9976     5421.43836    26984.2559                  0                      0                      0          115531.801         71886.6644             136.321918
75ufmwrcmsuwz      2484041482 31-AUG-15 06.00.50.769 AM              405         24339.6641     4853.40741    20794.0472                  0                      0                      0           115490.01         62004.4642             229.106173
75ufmwrcmsuwz      2484041482 31-AUG-15 07.00.43.047 AM              838         27552.3731     4903.06683    23661.2101                  0                      0                      0          111654.558         58324.9511             176.058473
75ufmwrcmsuwz      2484041482 31-AUG-15 08.00.50.864 AM             1653         30522.8358     4550.56261    26526.2183                  0                      0                      0           93818.418         49865.4701             137.212341
```
Based on this output my colleague chose 2297146707 as the good plan. She ran coe_xfr_sql_profile.sql, which is found in SQLT’s utl directory, like this:
```
coe_xfr_sql_profile.sql 75ufmwrcmsuwz 2297146707
```
Then she ran the generated script:
```
coe_xfr_sql_profile_75ufmwrcmsuwz_2297146707.sql
```
This forced the SQL back to its earlier, efficient plan. My coworker did all of this early this morning, before our peak usage period, so the bad plan never affected our users.
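To verify that the profile took effect, you can check v$sql once new executions come in. This is a quick sketch, not part of the monitor script; the sql_profile column is only populated on child cursors parsed after the profile was created:

```
-- Hypothetical verification query (not from the original monitor):
-- after accepting the profile, new child cursors should show the
-- profile name and the good plan_hash_value.
select sql_id,
       child_number,
       plan_hash_value,
       sql_profile  -- non-null once the cursor was parsed with the profile
from v$sql
where sql_id = '75ufmwrcmsuwz';
```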
Bobby
P.S. This example is from an 11.2.0.3 database running on HP-UX Itanium.
I mentioned this class earlier in a blog post but I wanted to remind people who read this blog that the class is starting again on Wednesday. Here is the URL for the class: link
The class is completely free and taught at a very high level of quality.
It teaches computer science concepts that apply in any programming language but also teaches Python programming.
It is valuable information in our increasingly computer-oriented world and economy, and the class is free, which is remarkable given its quality.
Here is the class name:
MITx: 6.00.1x Introduction to Computer Science and Programming Using Python
Bobby
Development asked me to look at ways to cut the run time of a series of PeopleSoft payroll batch jobs, so I took an AWR report of the entire 4-hour period. Based on the waits, the percentage of elapsed time spent on I/O, and the buffer pool advisory, I chose to double the size of the buffer cache. But the added memory did not improve the run time of the batch jobs. Maybe the affected blocks are each read into memory only once, so they would not be cached no matter how big the buffer pool was.
Here is the original run on June 22 with the original memory settings:
Cache Sizes
| | Begin | End | | |
|---|---|---|---|---|
| Buffer Cache: | 3,328M | 3,424M | Std Block Size: | 8K |
| Shared Pool Size: | 1,600M | 1,520M | Log Buffer: | 7,208K |
Top 5 Timed Foreground Events
| Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| db file sequential read | 1,910,393 | 10,251 | 5 | 72.03 | User I/O |
| DB CPU | | 2,812 | | 19.76 | |
| log file sync | 35,308 | 398 | 11 | 2.80 | Commit |
| resmgr:cpu quantum | 31,551 | 62 | 2 | 0.43 | Scheduler |
| db file scattered read | 7,499 | 60 | 8 | 0.42 | User I/O |
Buffer Pool Advisory

| P | Size for Est (M) | Size Factor | Buffers (thousands) | Est Phys Read Factor | Estimated Phys Reads (thousands) | Est Phys Read Time | Est %DBtime for Rds |
|---|---|---|---|---|---|---|---|
| D | 336 | 0.10 | 41 | 2.71 | 6,513,502 | 1 | 9842530.00 |
| D | 672 | 0.20 | 83 | 2.42 | 5,831,130 | 1 | 8737799.00 |
| D | 1,008 | 0.29 | 124 | 2.18 | 5,241,763 | 1 | 7783636.00 |
| D | 1,344 | 0.39 | 166 | 1.96 | 4,720,053 | 1 | 6939010.00 |
| D | 1,680 | 0.49 | 207 | 1.77 | 4,250,981 | 1 | 6179603.00 |
| D | 2,016 | 0.59 | 248 | 1.59 | 3,825,904 | 1 | 5491420.00 |
| D | 2,352 | 0.69 | 290 | 1.43 | 3,438,372 | 1 | 4864023.00 |
| D | 2,688 | 0.79 | 331 | 1.28 | 3,083,734 | 1 | 4289879.00 |
| D | 3,024 | 0.88 | 373 | 1.15 | 2,758,459 | 1 | 3763273.00 |
| D | 3,360 | 0.98 | 414 | 1.02 | 2,459,644 | 1 | 3279504.00 |
| D | 3,424 | 1.00 | 422 | 1.00 | 2,405,118 | 1 | 3191229.00 |
| D | 3,696 | 1.08 | 455 | 0.91 | 2,184,668 | 1 | 2834329.00 |
| D | 4,032 | 1.18 | 497 | 0.80 | 1,931,082 | 1 | 2423784.00 |
| D | 4,368 | 1.28 | 538 | 0.71 | 1,696,756 | 1 | 2044421.00 |
| D | 4,704 | 1.37 | 579 | 0.62 | 1,479,805 | 1 | 1693185.00 |
| D | 5,040 | 1.47 | 621 | 0.53 | 1,278,370 | 1 | 1367070.00 |
| D | 5,376 | 1.57 | 662 | 0.45 | 1,090,505 | 1 | 1062925.00 |
| D | 5,712 | 1.67 | 704 | 0.38 | 914,112 | 1 | 777352.00 |
| D | 6,048 | 1.77 | 745 | 0.31 | 746,434 | 1 | 505888.00 |
| D | 6,384 | 1.86 | 786 | 0.24 | 580,310 | 1 | 236941.00 |
| D | 6,720 | 1.96 | 828 | 0.17 | 414,233 | 1 | 149325.00 |
In the SQL ordered by Elapsed Time report, the top batch job SQL was 99.14% I/O.
Based on this report it seems that doubling the buffer cache should have cut the physical reads to about 20% of what they were on June 22: the advisory row for size factor 1.96 estimates a physical read factor of 0.17. But adding the memory did not cut the number of physical reads in any major way.
Here is yesterday’s run:
Cache Sizes
| | Begin | End | | |
|---|---|---|---|---|
| Buffer Cache: | 6,848M | 6,816M | Std Block Size: | 8K |
| Shared Pool Size: | 3,136M | 3,136M | Log Buffer: | 16,572K |
Top 5 Timed Foreground Events
| Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| db file sequential read | 1,789,852 | 10,173 | 6 | 72.15 | User I/O |
| DB CPU | | 2,970 | | 21.06 | |
| log file sync | 37,562 | 200 | 5 | 1.42 | Commit |
| resmgr:cpu quantum | 24,996 | 59 | 2 | 0.42 | Scheduler |
| db file scattered read | 5,409 | 54 | 10 | 0.38 | User I/O |
Buffer Pool Advisory

| P | Size for Est (M) | Size Factor | Buffers (thousands) | Est Phys Read Factor | Estimated Phys Reads (thousands) | Est Phys Read Time | Est %DBtime for Rds |
|---|---|---|---|---|---|---|---|
| D | 672 | 0.10 | 83 | 11.25 | 516,440 | 1 | 1309098.00 |
| D | 1,344 | 0.20 | 166 | 5.98 | 274,660 | 1 | 683610.00 |
| D | 2,016 | 0.29 | 248 | 4.02 | 184,712 | 1 | 450915.00 |
| D | 2,688 | 0.39 | 331 | 2.90 | 133,104 | 1 | 317404.00 |
| D | 3,360 | 0.49 | 414 | 2.20 | 100,860 | 1 | 233990.00 |
| D | 4,032 | 0.59 | 497 | 1.80 | 82,768 | 1 | 187185.00 |
| D | 4,704 | 0.69 | 580 | 1.53 | 70,445 | 1 | 155305.00 |
| D | 5,376 | 0.79 | 663 | 1.31 | 60,345 | 1 | 129176.00 |
| D | 6,048 | 0.88 | 745 | 1.14 | 52,208 | 1 | 108127.00 |
| D | 6,720 | 0.98 | 828 | 1.01 | 46,477 | 1 | 93301.00 |
| D | 6,848 | 1.00 | 844 | 1.00 | 45,921 | 1 | 91862.00 |
| D | 7,392 | 1.08 | 911 | 0.95 | 43,572 | 1 | 85785.00 |
| D | 8,064 | 1.18 | 994 | 0.89 | 40,789 | 1 | 78585.00 |
| D | 8,736 | 1.28 | 1,077 | 0.85 | 38,889 | 1 | 73671.00 |
| D | 9,408 | 1.37 | 1,160 | 0.81 | 37,112 | 1 | 69073.00 |
| D | 10,080 | 1.47 | 1,242 | 0.77 | 35,490 | 1 | 64876.00 |
| D | 10,752 | 1.57 | 1,325 | 0.75 | 34,439 | 1 | 62158.00 |
| D | 11,424 | 1.67 | 1,408 | 0.73 | 33,353 | 1 | 59347.00 |
| D | 12,096 | 1.77 | 1,491 | 0.71 | 32,524 | 1 | 57204.00 |
| D | 12,768 | 1.86 | 1,574 | 0.69 | 31,909 | 1 | 55613.00 |
| D | 13,440 | 1.96 | 1,656 | 0.68 | 31,361 | 1 | 54194.00 |
After the memory add, the same top batch job SQL was 98.80% I/O. That is some improvement, but not nearly as much as I expected based on the buffer pool advisory.
I guess the moral of the story is that the buffer pool advisory does not apply to every specific workload and is only a general guideline. Maybe this is the same kind of fallacy that you have with buffer cache hit ratios, where certain workloads make the ratio irrelevant. Here were the hit ratios: 98.59% before and 98.82% after. Basically these are the same.
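For reference, here is a rough sketch of how to pull the hit ratio for a snapshot interval out of DBA_HIST_SYSSTAT. The &begin_snap and &end_snap substitution variables are placeholders, and this is just the standard formula, not a script from this post:

```
-- Buffer cache hit ratio between two AWR snapshots:
-- 1 - (physical reads / (db block gets + consistent gets))
select round((1 - phy.val/nullif(dbg.val + cons.val,0)) * 100, 2) "Hit Ratio %"
from
(select sum(e.value - b.value) val
 from dba_hist_sysstat b, dba_hist_sysstat e
 where b.snap_id = &begin_snap and e.snap_id = &end_snap
 and b.instance_number = e.instance_number
 and b.stat_name = e.stat_name and b.stat_name = 'physical reads') phy,
(select sum(e.value - b.value) val
 from dba_hist_sysstat b, dba_hist_sysstat e
 where b.snap_id = &begin_snap and e.snap_id = &end_snap
 and b.instance_number = e.instance_number
 and b.stat_name = e.stat_name and b.stat_name = 'db block gets') dbg,
(select sum(e.value - b.value) val
 from dba_hist_sysstat b, dba_hist_sysstat e
 where b.snap_id = &begin_snap and e.snap_id = &end_snap
 and b.instance_number = e.instance_number
 and b.stat_name = e.stat_name and b.stat_name = 'consistent gets') cons;
```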
I just thought I would share this to document a real case of using the buffer pool advisory and having it not produce the expected results.
Bobby
We keep 6 weeks of history in the AWR on our databases, but I want to capture some information for long-term trending. What I really want is to capture some metrics and put them in database tables to use for generating reports, but I have not had time to build the scripts to do that. So instead I built a simple set of scripts to capture an AWR report for the previous month. Since we keep 6 weeks of history, if I run my report in the first week of a month all the days of the previous month should still be in the AWR. I have just finished building this script, so I cannot promise that there is value in keeping monthly AWR reports, but I thought it was worth sharing. Maybe something in the code will be useful to someone. Here is the script:
```
-- Has to be run in the first week of the month so the entire
-- previous month is available. We keep 6 weeks of awr history.

-- setup columns for snapshots

column bsnap1 new_value bsnap1s noprint;
column esnap1 new_value esnap1s noprint;
column filenm new_value filenms noprint;

-- get snap id for first day of previous month

select min(snap_id) bsnap1
from dba_hist_snapshot
where
extract(month from END_INTERVAL_TIME)=
extract(month from (sysdate-to_number(to_char(sysdate,'DD')))) and
STARTUP_TIME=
(select max(STARTUP_TIME)
from dba_hist_snapshot
where
extract(month from END_INTERVAL_TIME)=
extract(month from (sysdate-to_number(to_char(sysdate,'DD')))));

-- get snap id for last day of previous month

select max(snap_id) esnap1
from dba_hist_snapshot
where
extract(month from END_INTERVAL_TIME)=
extract(month from (sysdate-to_number(to_char(sysdate,'DD'))));

-- get html file name

select
name||
'_'||
to_char(extract(month from (sysdate-to_number(to_char(sysdate,'DD')))))||
'_'||
to_char(extract(year from (sysdate-to_number(to_char(sysdate,'DD')))))||
'.html' filenm
from v$database;

-- get awr report

define report_type='html';
define begin_snap = &bsnap1s;
define end_snap = &esnap1s;
define report_name = '&filenms';
define num_days = 0;

@@$ORACLE_HOME/rdbms/admin/awrrpt.sql

undefine report_type
undefine report_name
undefine begin_snap
undefine end_snap
undefine num_days
```
If the database bounced during the previous month, we only get the snapshots from after the last bounce.
I am not sure whether this approach will give us any benefits but I think it may help to show how to use queries to pick begin and end snapshots and then run an AWR report.
The tricky part of the code is this:
```
extract(month from (sysdate-to_number(to_char(sysdate,'DD'))))
```
It just returns the previous month as a number. It is August now, so here is what it returns today:
```
SQL> select
  2  extract(month from (sysdate-to_number(to_char(sysdate,'DD'))))
  3  prev_month
  4  from dual;

PREV_MONTH
----------
         7
```
sysdate-to_number(to_char(sysdate,'DD')) is the last day of the previous month:
```
SQL> select sysdate-to_number(to_char(sysdate,'DD')) last_day
  2  from dual;

LAST_DAY
---------
31-JUL-15
```
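For what it’s worth, the same two values can be computed with built-in date functions. This sketch is an equivalent alternative, not the code the script actually uses:

```
-- add_months(sysdate,-1) is a date in the previous month, so extract
-- gives the previous month number and last_day gives its last day.
select extract(month from add_months(sysdate,-1)) prev_month,
       last_day(add_months(sysdate,-1)) last_day_prev_month
from dual;
```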
– Bobby
I applied the current July patch sets to an 11.2 and a 12.1 test database. Now I have an 11.2.0.4.7 and a 12.1.0.2.4 test database. It is helpful to have test databases on the most current patch sets and releases. If I see unexpected behavior on some other database I can try the same thing on the patched test databases to see if some patch changed the behavior to what I expect. Also, our production databases are all on 11.2.0.4 or earlier releases, so I can check whether the new fully patched 12.1 release behaves differently from our older systems.
Here are the patch numbers:
6880880 – current version of opatch
20760982 – 11.2.0.4.7
20831110 – 12.1.0.2.4
My test environments are on x86-64 Linux.
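If you want to confirm what a database ended up on after patching, one way is to query the patch history that the PSU scripts record in the registry. A sketch (the exact output format varies by version):

```
-- Lists patch actions (e.g. PSU applies) recorded in the registry.
select to_char(action_time,'YYYY-MM-DD') applied,
       action,
       version,
       comments
from dba_registry_history
order by action_time;
```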
– Bobby
I registered myself for Oracle OpenWorld and I have my hotel reserved and my flights ticketed.
I think it has been over 12 years – probably more like 15 years – since I went to OpenWorld. I went at least once between December 1994 and November 2003 when I still lived in Florida and was working on Oracle databases. But since I moved from Florida I do not believe that I have been to the conference. I have presented at Collaborate and ECOUG conferences since then. I’m thinking that maybe next year I will try to present at the RMOUG conference. I live in Arizona so RMOUG is close. ECOUG was a nice distance when I still lived near the East Coast. I like the smaller conferences and I have a better shot at getting a presentation accepted there.
But, this year it is OpenWorld and I am looking forward to it. I may get a chance to interact with some Delphix employees and customers. Also, I’m hoping to check out some technical presentations by the Oak Table members. And it does not hurt to hear from Oracle itself on its technology. No doubt many of Oracle’s top technical leaders will be presenting. And any interaction I get with fellow DBAs will be great. It is always good to hear from people about their own experiences, which may differ from mine.
Anyway, I’m all booked for OpenWorld. Hope to see you there.
– Bobby
I just finished the last program for a computer science class on edX and I urge you to try it.
I took this class:
MITx: 6.00.1x Introduction to Computer Science and Programming Using Python
I was more interested in how MIT taught the class than in the material itself because I already know the subjects covered.
The class taught the basics of programming – expressions, variables, loops, if statements, and functions.
It also had a large focus on bisection or binary search and the performance benefits of this type of search over sequentially reading through a list.
It also covered lists, hash tables, trees, stacks, and queues.
It discussed object-oriented programming.
The class concluded with the professor stating that the programming and computer science skills taught in this class are key to advancing your career, even if you do not work in a computer related job.
I interacted with a number of students in the class and found some who were in other fields and were having success with the class. Others were in business computing or IT but did not have a computer science background, so they were good programmers learning new concepts. Many struggled with the class, but it is free and is offered often. The class starts up again August 26th. Nothing stops you from taking it multiple times.
I tried to think about whether I should recommend this class to the people I work with as a way of helping develop coworkers who do not have experience in these areas. At first I thought that the subject is too academic and has no connection to their jobs. But, after thinking about it for a while, I now believe that just the opposite is true.
Searching for practical applications of the class, I first remembered the programs we wrote comparing a sequential search through a list to a binary search. In one test case the sequential method took 15 seconds but the binary search took less than one second. This reminded me of tuning Oracle SQL queries: the sequential scan of the list was like a full table scan in Oracle, and the binary search was like looking up a single row using an index. As I tune Oracle queries, my computer science knowledge of binary search and binary trees makes it easy to understand index scans and full table scans.
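To make the analogy concrete, here is a sketch in SQL; the table and index names are invented for illustration:

```
-- Hypothetical example: the same lookup with and without an index.
create table big_list (id number, payload varchar2(100));

-- With no index, Oracle must check every row (TABLE ACCESS FULL),
-- like scanning sequentially through a Python list.
select payload from big_list where id = 42;

create index big_list_id_ix on big_list (id);

-- With the B-tree index, the lookup walks down the tree
-- (INDEX RANGE SCAN), like binary search on a sorted list.
select payload from big_list where id = 42;
```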
In another example, we recently had slowness on a Weblogic portal server. CPU was getting maxed out and the CPU spent most of its time in a Java ConcurrentHashMap object. I don’t know the internals of Weblogic and I have never used a ConcurrentHashMap but I know how hashing works. I know that hashing is very fast until your hash table fills up or if the hash function distributes the items in an unequal way. My knowledge of hashing helped me grasp why our portal server was using a lot of CPU despite my lack of access to its internals.
So, contrary to my original fear that the edX class was too academic and not practical, I believe that the concepts covered are very practical. If you do not know how binary search works or what a binary tree is, you will benefit from 6.00.1x on edX. If you cannot explain how a hash table works and what causes hashing to slow down, you can learn from 6.00.1x. And if you have never written a computer program, you may find the class difficult and have to take it more than once, but you will benefit from 6.00.1x on edX.
– Bobby
I’m still working on resolving the issues caused by bug 13914613.
Oracle support recommended a parameter change to resolve the issue, but that change requires us to bounce the database, and I was looking for a resolution that does not need a bounce. The bug causes very bad shared pool latch waits when the automatic memory management feature of our 11.2.0.3 database expands the shared pool. Oracle support recommended setting _enable_shared_pool_durations=false, and I verified that changing this parameter requires a bounce. Bouncing this database is a big hassle because of the application, so I thought I might try flushing the shared pool on a regular basis so that automatic memory management would not keep increasing the shared pool’s size. The shared pool was growing because we have a lot of SQL statements without bind variables. So I ran a test, and in my test flushing the shared pool did not slow the shared pool’s growth.
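For reference, the recommended change would look something like this; note the double quotes required for an underscore parameter and scope=spfile because the parameter cannot be changed in memory:

```
-- Only takes effect after a bounce, which is what I wanted to avoid.
alter system set "_enable_shared_pool_durations"=false scope=spfile;
```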
Here is a zip of the scripts I used for this test and their outputs: zip
I set the shared pool to a small value so it was more likely to grow, and I created a script to run many different SQL statements that don’t use bind variables:
```
spool runselects.sql

select 'select * from dual where dummy=''s'
||to_char(sysdate,'HHMISS')||rownum||''';'
from dba_objects;

spool off

@runselects
```
So, the queries looked like this:
```
select * from dual where dummy='s0818111';
select * from dual where dummy='s0818112';
select * from dual where dummy='s0818113';
select * from dual where dummy='s0818114';
select * from dual where dummy='s0818115';
select * from dual where dummy='s0818116';
select * from dual where dummy='s0818117';
```
I ran these for an hour and tested three different configurations. The first two did not use the _enable_shared_pool_durations=false setting and the last one did. The first test was a baseline that showed the growth of the shared pool without flushing it. The second test included a flush of the shared pool every minute (see the sketch after the query below). The last run included the parameter change and no flush. I queried V$SGA_RESIZE_OPS after each test to see how many times the shared pool grew. Here is the query:
```
SELECT OPER_TYPE,FINAL_SIZE Final,
to_char(start_time,'dd-mon hh24:mi:ss') Started,
to_char(end_time,'dd-mon hh24:mi:ss') Ended
FROM V$SGA_RESIZE_OPS
where component='shared pool'
order by start_time,end_time;
```
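For the flush-every-minute run, the flushing can be driven by a simple PL/SQL loop like this sketch (not the exact script from the zip):

```
-- Flush the shared pool once a minute for an hour.
begin
  for i in 1 .. 60 loop
    execute immediate 'alter system flush shared_pool';
    dbms_lock.sleep(60);  -- requires execute privilege on dbms_lock
  end loop;
end;
/
```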
Here are the results.
Baseline – no flush, no parameter change:
```
OPER_TYPE FINAL       STARTED         ENDED
--------- ----------- --------------- ---------------
GROW      150,994,944 18-jun 05:03:54 18-jun 05:03:54
GROW      134,217,728 18-jun 05:03:54 18-jun 05:03:54
STATIC    117,440,512 18-jun 05:03:54 18-jun 05:03:54
GROW      167,772,160 18-jun 05:04:36 18-jun 05:04:36
GROW      184,549,376 18-jun 05:47:38 18-jun 05:47:38
```
Flush every minute, no parameter change:
```
OPER_TYPE FINAL       STARTED         ENDED
--------- ----------- --------------- ---------------
GROW      134,217,728 18-jun 06:09:15 18-jun 06:09:15
GROW      150,994,944 18-jun 06:09:15 18-jun 06:09:15
STATIC    117,440,512 18-jun 06:09:15 18-jun 06:09:15
GROW      167,772,160 18-jun 06:09:59 18-jun 06:09:59
GROW      184,549,376 18-jun 06:22:26 18-jun 06:22:26
GROW      201,326,592 18-jun 06:42:29 18-jun 06:42:29
GROW      218,103,808 18-jun 06:47:29 18-jun 06:47:29
```
Parameter change, no flush:
```
OPER_TYPE FINAL        STARTED         ENDED
--------- ------------ --------------- ---------------
STATIC    117,440,512  18-jun 07:16:09 18-jun 07:16:09
GROW      134,217,728  18-jun 07:16:18 18-jun 07:16:18
```
So, at least in this test – which I have run only twice – flushing the shared pool, if anything, made the growth of the shared pool worse. But changing the parameter seems to lock the size in.
– Bobby
I hacked together a query today that shows the overall I/O performance that a database is experiencing.
The output looks like this:
```
End snapshot time   number of IOs ave IO time (ms) ave IO size (bytes)
------------------- ------------- ---------------- -------------------
2015-06-15 15:00:59        359254               20              711636
2015-06-15 16:00:59        805884               16              793033
2015-06-15 17:00:13        516576               13              472478
2015-06-15 18:00:27        471098                6              123565
2015-06-15 19:00:41        201820                9              294858
2015-06-15 20:00:55        117887                5              158778
2015-06-15 21:00:09         85629                1               79129
2015-06-15 22:00:23        226617                2               10744
2015-06-15 23:00:40        399745               10              185236
2015-06-16 00:00:54       1522650                0               43099
2015-06-16 01:00:08       2142484                0               19729
2015-06-16 02:00:21        931349                0                9270
```
I’ve combined reads and writes and focused on three metrics: number of IOs, average IO time in milliseconds, and average IO size in bytes. I think it is a helpful way to compare how two systems perform. Here is the output from another, better-performing system:
```
End snapshot time   number of IOs ave IO time (ms) ave IO size (bytes)
------------------- ------------- ---------------- -------------------
2015-06-15 15:00:25        331931                1              223025
2015-06-15 16:00:40        657571                2               36152
2015-06-15 17:00:56       1066818                1               24599
2015-06-15 18:00:11        107364                1              125390
2015-06-15 19:00:26         38565                1               11023
2015-06-15 20:00:41         42204                2              100026
2015-06-15 21:00:56         42084                1               64439
2015-06-15 22:00:15       3247633                3              334956
2015-06-15 23:00:32       3267219                0               49896
2015-06-16 00:00:50       4723396                0               32004
2015-06-16 01:00:06       2367526                1               18472
2015-06-16 02:00:21       1988211                0                8818
```
Here is the query:
```
select
to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS') "End snapshot time",
sum(after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS) "number of IOs",
-- READTIM and WRITETIM are in hundredths of a second; multiply by 10 for ms.
-- The +1 in the denominators avoids divide by zero.
trunc(10*sum(after.READTIM+after.WRITETIM-before.WRITETIM-before.READTIM)/
sum(1+after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS)) "ave IO time (ms)",
trunc((select value from v$parameter where name='db_block_size')*
sum(after.PHYBLKRD+after.PHYBLKWRT-before.PHYBLKRD-before.PHYBLKWRT)/
sum(1+after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS)) "ave IO size (bytes)"
from DBA_HIST_FILESTATXS before, DBA_HIST_FILESTATXS after, DBA_HIST_SNAPSHOT sn
where
after.file#=before.file# and
after.snap_id=before.snap_id+1 and
before.instance_number=after.instance_number and
after.snap_id=sn.snap_id and
after.instance_number=sn.instance_number
group by to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')
order by to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS');
```
I hope this is helpful.
– Bobby
P.S. Here is an updated version of the query using suggestions from the comments below:
```
select
to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS') "End snapshot time",
sum(after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS) "number of IOs",
-- READTIM and WRITETIM are in hundredths of a second; multiply by 10 for ms
trunc(10*sum(after.READTIM+after.WRITETIM-before.WRITETIM-before.READTIM)/
sum(1+after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS)) "ave IO time (ms)",
-- use each file's own block size instead of the db_block_size parameter
trunc(sum(after.block_size*
(after.PHYBLKRD+after.PHYBLKWRT-before.PHYBLKRD-before.PHYBLKWRT))/
sum(1+after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS)) "ave IO size (bytes)"
from DBA_HIST_FILESTATXS before, DBA_HIST_FILESTATXS after, DBA_HIST_SNAPSHOT sn
where
after.file#=before.file# and
after.snap_id=before.snap_id+1 and
before.instance_number=after.instance_number and
after.snap_id=sn.snap_id and
after.instance_number=sn.instance_number
group by to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')
-- skip intervals with negative deltas (counters reset, e.g. after a bounce)
HAVING SUM(after.PHYRDS+after.PHYWRTS-before.PHYWRTS-before.PHYRDS) >= 0
AND SUM(after.PHYBLKRD+after.PHYBLKWRT-before.PHYBLKRD-before.PHYBLKWRT) >= 0
order by to_char(sn.END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS');
```
P.P.S. Bug 25416731 makes this script return 0 rows on 12.2, 18, and 19 databases.