Getting geared up for Collaborate 2018 #C18LV

I’m leaving for Las Vegas a week from this Sunday for the Collaborate 2018 conference #C18LV.

I’m doing the same two talks that I did in Denver at RMOUG Training Days.

Here are the times and locations for the talks if you will be there at C0llaborate:

Session: 1415: Python for the Oracle DBA
Date/time: Tue, Apr 24, 2018 (09:45 AM – 10:45 AM)
Location: Jasmine H

Session: 1416: Toastmasters for the Oracle DBA
Date/time: Wed, Apr 25, 2018 (08:30 AM – 09:30 AM)
Location: South Pacific I, North Convention Center

I haven’t looked at these talks since the conference in February and I have focused my attention on other things so now I am starting to get my mind back on these talks. I am passionate about both topics so hopefully that will come across. I got good feedback at the conference in Denver so I think that those who hear my talks in Vegas will benefit from them. If you can make it to Collaborate I would love to have you attend.

Las Vegas will be just a bit warmer than the 6 degree Fahrenheit mornings that we had in Denver, so that will be fun. 🙂

Bobby

Posted in Uncategorized | Leave a comment

Had to set parallel_max_servers to 2 for Delphix clone

This is a quick note about a problem I had creating a Delphix virtual database clone of a production database.

This is an older 11.1.0.7 HP-UX Itanium database. I tried to make a new virtual database copy of the production database and it failed due to the following error:

ORA-07445: exception encountered: core dump [krd_flush_influx_buffers()+96] [SIGSEGV] [ADDR:0x10000000005D8] [PC
:0x400000000B415880] [Address not mapped to object] []

I only found one thing on Oracle’s support site about krd_flush_influx_buffers but it was not an exact match because it had to do with Data Guard.

So I tried various parameters and none worked. I tried setting parallel_max_servers to 0  down from 100 in production but that caused other issues. Then I remembered something about setting it to 2 so I tried that and it worked.

The strange things is that I still see a ORA-07445 krd_flush_influx_buffers error in the alert log for the successful clone. But, somehow changing the parallel_max_servers parameter to 2 allowed the various Delphix and Oracle processes to complete.

Bobby

Posted in Uncategorized | Leave a comment

Does an automated cloud data warehouse need partitioning and indexes?

This is a late post inspired by my time at RMOUG Training Days 2018. I attended two RMOUG Training Days presentations about automated cloud data warehouse database systems. The first was about Snowflake. The second was about Oracle’s coming Autonomous Data Warehouse Cloud offering. Both products seem to have the goal of automating as much as possible in a data warehouse database and neither seem to allow users to create indexes or to partition their data in the way they choose.

This raises the obvious question – do you need the ability to create indexes and define your own partitioning scheme in a data warehouse database? I have seen many situations on data warehouse databases where both are helpful. When we went to Exadata there was a push to drop all of our indexes but we did not. Also, we used partitioning to improve query performance. If we had to get rid of both features what would we have done on Exadata?

I don’t know the answer, but people say that we need more automation and less control. So, you could spin up a cloud based data warehouse database quickly and start using it. But how many knobs are left to turn when performance is slow? Do users need indexes and partitioning methods to get acceptable performance? Really, what is wrong with giving users the ability to create indexes and to partition tables in the way they want?

Time will tell whether index and partition free cloud based data warehouse database systems make it. But, for now I’m left to wonder what we are losing without these features.

Bobby

 

Posted in Uncategorized | Leave a comment

Pushing predicates into MySQL subqueries

I am just getting started looking at MySQL performance tuning but I thought I would share something that I may have figured out. I’m working with a query that does a count and a group by in a subquery and it runs too long. It seems that the query reads every row from a table instead of passing a predicate into the subselect and accessing only a couple of rows. MySQL 5.7 seems to behave this way normally. I have built a simple test case to show that MySQL 5.7 works differently than Oracle 12.1 in this situation.

I build a table called TEST with about 1,000,000 rows (1,1) and one row (2,2) and put an index on the first column.

create table test(a int,b int);

insert into test values(1,1);

insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;
insert into test select * from test;

-- 1048576 rows

select count(*) from test;

create index testi on test (a);

insert into test values (2,2);

analyze table test;

Then I have a subselect with a SUM and a GROUP by and a where clause condition outside of the subselect that would cause the subselect to look at only one row – the one with values (2,2).

select sum_b from
(select a,sum(b) sum_b
from test
group by a) inner_query
where 
a=2;

This takes almost 2 seconds running in an Amazon RDS instance:

+-------+
| sum_b |
+-------+
|     2 |
+-------+
1 row in set (1.53 sec)

The corresponding Oracle query running in an on-premises Linux VM runs faster:

SQL> select sum_b from
  2  (select a,sum(b) sum_b
  3  from test
  4  group by a) inner_query
  5  where
  6  a=2;

     SUM_B
----------
         2

Elapsed: 00:00:00.08

I realize that the hardware is not the same but I think they are reasonably close. Best I can tell the Oracle version is faster because Oracle pushes the a=2 predicate into the subselect but MySQL doesn’t. The Oracle execution plan shows the a=2 predicate in the inner index scan:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |     6 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT        |       |     1 |     6 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TEST  |     1 |     6 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | TESTI |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
...
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"=2)

I am still learning MySQL plans but it seems that MySQL does a full index scan:

As I said, I am new to MySQL so I may have missed some way around this. One simple solution would be to write your query without a subselect so that the predicate is in the group by query. This runs about as fast as the Oracle example above:

select sum(b) sum_b
from test
where 
a=2
group by a
--------------

+-------+
| sum_b |
+-------+
|     2 |
+-------+
1 row in set (0.10 sec)

This does a simple index lookup.

I have uploaded a zip of my Oracle and MySQL test here: zip

Maybe if you have used Oracle in the past and are now using MySQL 5.7 you should keep in mind that MySQL 5.7 does not push predicates into subqueries in the same situations that Oracle does and build your queries around this difference.

Bobby

 

 

Posted in Uncategorized | Leave a comment

Follow up from RMOUG Training Days talks

I have uploaded a zip of my two RMOUG Training Days talks here: zip

During the Toastmasters talk there was a question about finding a club in your area.

Here is how to find a Toastmasters club near you. Go to www.toastmasters.org and click on the FIND A CLUB button. Enter your address or zip and you get a map with club locations like this:

Click on the club to see details of when it meets and whether they are open to new members.

You can call the phone number and you can visit their website if they have one.

Toastmasters provides clubs with hosting for sites like this so it is easy to find out where they meet and to ask questions. In a public club like the one in this example you can just show up at a meeting and they will welcome you. A corporate club will often be just for employees so unless you work for the company that hosts the club you probably will have to look elsewhere. But there are a ton of clubs and most would love to have new members.

At the Python presentation people wanted to look at my PythonDBAGraphs source code. It is on GitHub here: PythonDBAGraphs. This is my largest example of Python code that an Oracle DBA might write. I think that the Oracle community may find PythonDBAGraphs more useful as an example than as a practical application. I use it every day for my work so it has value but it is more command line and less finished than a product that you would buy. Still, it shows how to use Python with Oracle databases and how to graph things using Python. And, it is a fairly simple yet still useful example of Python code.

The README has a lot of these details but I thought I would point people to the source files that might be most helpful to them as examples.

db.py has the Oracle database routines using the cx_Oracle module.

myplot.py has the graphing routines using the Matplotlib module.

util.py has some general functions that read or write from files, prompt for input from the command line, or process command line arguments.

The rest of the files are each one graph. I use these in the same way that I use the command line sqlplus scripts that I use for performance tuning. For each graph there is some code for stitching together a SQL query to get information from DBA_HIST or V$ views. Then each one calls routines in db.py, myplot.py, and util.py to do the rest. I tried to put as much logic as possible in db.py, myplot.py, and util.py so that it would be easy to quickly create a new graph.

onewait.py is an example of a single graph.

A lot of this information is in other blog posts but I thought I would put it here based on the questions that I got from doing the Python talk at RMOUG Training Days.

Anyway, I wanted to publish the PowerPoint files and give these two follow ups based on the questions that I got after each talk.

Bobby

Posted in Uncategorized | Leave a comment

Early morning RMOUG post

Well, it is early Wednesday morning here at the Westin hotel in Denver where the RMOUG Training Days conference is being held. I can’t sleep anyway so I thought I would write-up some of my impressions of yesterday’s presentations.

I appreciate all the effort people put in making their presentations. Since I have done Toastmasters I’ve learned to appreciate more what goes into being an effective speaker. But, the nature of my work is that I have to be critical of everything people say about technology. Maybe I should say “I have to think critically” instead of “be critical”. The problem with the type of work we do is that it involves a lot of money and that inevitably obscures the truth about the technical details of how things work. So, I want to just sit back and applaud but my technical side wants to tear apart every detail.

A nice perk of being a RMOUG presenter is that I got to attend the pre-conference workshops for free as well as the rest of the talks. In past conferences that I have spoken at that was not the case. So, I went to a four-hour Snowflake workshop. I have read a fair amount on Snowflake so much that the speaker presented was familiar. I wonder how people who had no Snowflake background perceived the talk? Being a nuts and bolts Oracle person I would have liked to dig in more to Snowflake internals and discuss its limitations. Surely any tool has things it does better and things that it does not do so well because of the choices that the developers made in its design. I’m interested in how Snowflake automatically partitions data across files on S3 and caches data in SSD and RAM at the compute level. At least, that is what the information on the web site suggests. But with cloud computing it seems that people frown upon looking under the covers. The goal is to spin up new systems quickly and Snowflake is fantastic at that. Also, it seems to get great performance with little effort. No tuning required! Anyway, it was a good presentation but didn’t get into nuts and bolts tuning and limitations which I would have liked to see.

I spent the rest of the day attending hour-long presentations on various topics. AWS offered a 3 hour session on setting up Oracle on RDS but since I’ve played with RDS at work I decided to skip it. Instead I went to mostly cloud and Devops sessions. I accidentally went to an Oracle performance session which was amusing. It was about tuning table scans in the cloud. The speaker claimed that in Oracle’s cloud you get sub-millisecond I/O which raised a bunch of questions in my mind. But the session was more about using Oracle database features to speed up a data warehouse query. It was fun but not what I expected.

I was really surprised by the Devops sessions. Apparently Oracle has some free Devops tools in their cloud that you can use for on premise work. My office is working with a variety of similar tools already so it is not something we would likely use. But it could be helpful to someone who doesn’t want to install the tools yourself. I’m hopeful that today’s Devops session(s) will fill in more details about how people are using Devlops with databases. I’m mostly interested in how to work with large amounts of data in Devops. It’s easy to store PL/SQL code in Git for versioning and push it out with Flywaydb or something like it. It is hard to make changes to large tables and have a good backout. Data seems to be Devops’s Achilles heel and I haven’t seen something that handles it well. I would love to hear about companies that have had success handling data changes with Devops tools.

Well, I’ve had one cup of coffee and Starbucks doesn’t open for another half hour but this is probably enough of a pre-dawn RMOUG data dump. Both of my talks are tomorrow so today is another day as a spectator. Likely it will be another day of cloud and Devops but I might sneak an Oracle performance talk in for one session.

Bobby

Posted in Uncategorized | Leave a comment

Should I use SQL or Python?

We had an outage on an important application last Thursday. A particular SQL statement locked up our database with library cache: mutex X waits. I worked with Oracle support to find a bug that caused the issue and we came up with a good workaround. The bug caused a bunch of shared cursor entries. So, I wanted to run a test on a test database to recreate the excess shared cursor entries. I wanted to run the SQL query that caused the outage a bunch of times. Also, we embed the SQL query inside a PL/SQL procedure so I wanted to run the query by calling the procedure. So, I needed to come up with a bunch of calls to the procedure using realistic data as a test script. This blog post is about the decision I had to make about creating the test script. Would I use SQL or Python to quickly hack together my test script? I thought it would be interesting to write about my choice because I am working on my Python for the Oracle DBA talk that encourages Oracle DBAs to learn Python. In this situation I turned to SQL instead of Python so what does that say about the value of Python for Oracle DBAs?

Let me lay out the problem that I needed to solve. Note that I was trying to get this done quickly and not spend a lot of time coming up with the perfect way to do it. I had over 6000 sets of bind variable values that the problem query has used in the past. I used my bind2.sql script to get some sample bind variable values for the problem query. The output of bind2.sql was in this format:

2017-11-27 15:08:56 :B1 1
2017-11-27 15:08:56 :B2 ABC
2017-11-27 15:08:56 :B3 JAFSDFSF
2017-11-27 15:08:56 :B4 345
2017-11-27 15:08:56 :B5 6345
2017-11-27 15:08:56 :B6 10456775
2017-11-27 15:08:56 :B7 34563465
2017-11-27 15:08:56 :B8 433
2017-11-27 15:09:58 :B1 1
2017-11-27 15:09:58 :B2 JUL
2017-11-27 15:09:58 :B3 KSFJSDJF
2017-11-27 15:09:58 :B4 234
2017-11-27 15:09:58 :B5 234253
2017-11-27 15:09:58 :B6 245
2017-11-27 15:09:58 :B7 66546
2017-11-27 15:09:58 :B8 657576
2017-11-27 15:10:12 :B1 1
2017-11-27 15:10:12 :B2 NULL
2017-11-27 15:10:12 :B3 NULL
2017-11-27 15:10:12 :B4 45646
2017-11-27 15:10:12 :B5 43
2017-11-27 15:10:12 :B6 3477
2017-11-27 15:10:12 :B7 6446
2017-11-27 15:10:12 :B8 474747

I needed to convert it to look like this:

exec myproc(34563465,10456775,345,433,6345,'JAFSDFSF','ABC',1,rc);
exec myproc(66546,245,234,657576,234253,'KSFJSDJF','JUL',1,rc);
exec myproc(6446,3477,45646,474747,43,'NULL','NULL',1,rc);

I gave myself maybe a minute or two to decide between using SQL or Python. I choose SQL. All I did was insert the data into a table and then manipulate it using SQL statements. Note that the order of the arguments in the procedure call is not the same as the order of the bind variable numbers. Also, some are character and some are number types.

Here is the SQL that I used:

drop table bindvars;

create table bindvars
(datetime varchar2(20),
 varname varchar2(2),
 varvalue varchar2(40));

insert into bindvars values ('2017-11-27 15:08:56','B1','1');
insert into bindvars values ('2017-11-27 15:08:56','B2','ABC');
insert into bindvars values ('2017-11-27 15:08:56','B3','JAFSDFSF');
insert into bindvars values ('2017-11-27 15:08:56','B4','345');
insert into bindvars values ('2017-11-27 15:08:56','B5','6345');
insert into bindvars values ('2017-11-27 15:08:56','B6','10456775');
insert into bindvars values ('2017-11-27 15:08:56','B7','34563465');
insert into bindvars values ('2017-11-27 15:08:56','B8','433');
insert into bindvars values ('2017-11-27 15:09:58','B1','1');
insert into bindvars values ('2017-11-27 15:09:58','B2','JUL');
insert into bindvars values ('2017-11-27 15:09:58','B3','KSFJSDJF');
insert into bindvars values ('2017-11-27 15:09:58','B4','234');
insert into bindvars values ('2017-11-27 15:09:58','B5','234253');
insert into bindvars values ('2017-11-27 15:09:58','B6','245');
insert into bindvars values ('2017-11-27 15:09:58','B7','66546');
insert into bindvars values ('2017-11-27 15:09:58','B8','657576');
insert into bindvars values ('2017-11-27 15:10:12','B1','1');
insert into bindvars values ('2017-11-27 15:10:12','B2','NULL');
insert into bindvars values ('2017-11-27 15:10:12','B3','NULL');
insert into bindvars values ('2017-11-27 15:10:12','B4','45646');
insert into bindvars values ('2017-11-27 15:10:12','B5','43');
insert into bindvars values ('2017-11-27 15:10:12','B6','3477');
insert into bindvars values ('2017-11-27 15:10:12','B7','6446');
insert into bindvars values ('2017-11-27 15:10:12','B8','474747');

commit;

drop table bindvars2;

create table bindvars2 as
select 
b1.varvalue b1,
b2.varvalue b2,
b3.varvalue b3,
b4.varvalue b4,
b5.varvalue b5,
b6.varvalue b6,
b7.varvalue b7,
b8.varvalue b8
from 
bindvars b1,
bindvars b2,
bindvars b3,
bindvars b4,
bindvars b5,
bindvars b6,
bindvars b7,
bindvars b8
where
b1.datetime = b2.datetime and
b1.datetime = b3.datetime and
b1.datetime = b4.datetime and
b1.datetime = b5.datetime and
b1.datetime = b6.datetime and
b1.datetime = b7.datetime and
b1.datetime = b8.datetime and
b1.varname = 'B1' and
b2.varname = 'B2' and
b3.varname = 'B3' and
b4.varname = 'B4' and
b5.varname = 'B5' and
b6.varname = 'B6' and
b7.varname = 'B7' and
b8.varname = 'B8';

select 'exec myproc('||
B7||','||
B6||','||
B4||','||
B8||','||
B5||','''||
B3||''','''||
B2||''','||
B1||',rc);'
from bindvars2;

I hacked the insert statements together with my Textpad text editor and then loaded the rows into a little table. Then I built a new table which combines the 8 rows for each call into a single row with a column for each bind variable. Finally I queried the second table generating the procedure calls with single quotes, commas and other characters all in the right place.

Now that the rush is past and my testing is done I thought I would hack together a quick Python script to do the same thing. If I had chosen Python how would have I done it without spending a lot of time making it optimal? Here is what I came up with:

Instead of insert statements I pulled the data into a multi-line string constant. Then I split it into a list of strings with each string representing a line. Then I split each line into space delimited strings so each line would have date,time,bind variable name, bind variable value. Finally I looped through each set of 8 lines extracting the bind variable values and then printing the bind variables in the correct order and format.

These are two quickly hacked together solutions. I think the key point is how I stored the data. With SQL I used tables. With Python I used lists. I’m not sure which I like better in this case. I’ve been doing SQL longer but Python wasn’t really harder. I guess my decision under pressure to use SQL shows that I still have more comfort with the SQL way of doing things, but my after the fact Python hacking shows that the Python solution was not any harder. FWIW.

Bobby

 

Posted in Uncategorized | 3 Comments

RMOUG Training Days 2018 early registration deadline rapidly approaching

The deadline for early registration for RMOUG Training Days 2018 is rapidly approaching. The deadline is January 12. The early registration fee is $385 for RMOUG members and $450 for non-members. There is a packed agenda over the three-day event. The hotel costs are reasonable at $159/night which is very good compared to the hotel prices at the vendor conferences in San Francisco. RMOUG is reasonably priced and high quality training. With all of the change going on with cloud computing and some of the database systems that are competing with Oracle now I am looking forward to hearing from other people about how they are adapting in this changing climate.

I am giving two talks which are both about personal development for DBAs. One is about whether a DBA should learn the Python programming language. The other is about whether DBAs could benefit from communication and leadership training through Toastmasters. Here are the dates and times for my two talks:

Toastmasters for the Oracle DBA
Thursday, February 22
Session 12, 1:30 pm – 2:30 pm

Python for the Oracle DBA
Thursday, February 22
Session 14, 4:00 pm – 5:00 pm

As things change it makes sense for DBAs to continue to develop themselves and I think that my talks and the conference in general will help us to move forward by preparing us to meet the challenges that the future will bring. I hope to see you in Denver and don’t forget to register by January 12.

B0bby

 

Posted in Uncategorized | Leave a comment

Both talks accepted for Collaborate 18

IOUG accepted both my Toastmasters and Python talks for Collaborate 18. RMOUG also accepted them both so I will be doing these two talks in both February and April. I am a little surprised because I have had talks rejected by IOUG in the past. There are a lot of great speakers competing for speaking slots. This is my first time for RMOUG so I did not know how hard it would be to get a talk accepted. I put both talks in for both conferences not knowing if either would be accepted and both were at both conferences!  So, 2018 will be a busier year than normal for me in terms of speaking at conferences. My last conference was two years ago at Oracle OpenWorld where I spoke about Delphix, a tool that I use with the Oracle database. Next year I’m talking about two things that I feel passionate about. The talks are not about the Oracle database but they are about things I have learned that have helped me in my Oracle database work. They are about how Toastmasters has helped me improve my speaking and leadership skills and about why the Python programming language has become my favorite general purpose scripting tool. I am looking forward to giving the talks. If you are able to attend one of the conferences maybe you could check out one of my talks. Fun.

Bobby

Posted in Uncategorized | 2 Comments

Arizona Oracle User Group meeting Thursday

The Arizona Oracle User Group (AZORA) is meeting this Thursday, November 16th, 2017 from 12:30 to 3:30 pm Arizona time.

Here is the link to the Meetup:

AZ-ORA November Meetup – Dan Morgan live!

Thursday, Nov 16, 2017, 12:30 PM

Republic Services – 2nd Floor Conference Rooms
14400 N 87th St (AZ101 & Raintree) Scottsdale, AZ

13 AZORAS Attending

It’s time to learn more about Oracle and Oracle products! Dan Morgan from Forsythe Meta 7 will be speaking at our November 16 Arizona Oracle User Group (AZ-ORA) meeting on two topics: “Channeling Oracle OpenWorld 2017” for DBAs and SQL & PL/SQL Developers and “Security for Oracle DBAs and Developers” – more details to come…When: Thursday Novembe…

Check out this Meetup →

Here is a link to the user group site with more information:

AZORA Meeting Nov 16, 2017

I’m looking forward to it. I hope to see you there!

Bobby

Posted in Uncategorized | Leave a comment