A coworker of mine asked me to look at a MySQL query that was running longer than we want it too. If they added a LIMIT 1000 clause at the end of the query it popped up the results in a couple of minutes but without the LIMIT clause it apparently hung forever without returning a row of the result set. I figured out that the query was returning 4 million rows with a bunch of columns in each row which is why it was slow. But the question remained why did it pop up the first row faster with LIMIT clause? I checked the plan with and without the LIMIT clause and didn’t see a difference. Then I did more testing and realized that even a simple query that returns a large result set took a long time to display the first row. This is on MySQL 5.7 running in AWS RDS.
I asked a question about this on two forums that answer MySQL questions:
I appreciate people responding to my questions but I didn’t really get the answer I was looking for. I really wanted to know why large result sets made the mysql client take a long time to display the first row. But, I guess people thought that you should just add a LIMIT clause and not worry about it. Why would you want to return the first rows of a large result set quickly? I just want to understand it and I still don’t.
I built a test case and ran it against an Oracle database and a MySQL database and the Oracle database returned the first row immediately and the MySQL version hung for a couple of minutes. I used the Oracle sqlplus utility and MySQL mysql command line client.
Here is the test case:
I ran this from my laptop at home over a VPN so they both had to send results over the internet. But the MySQL database was in AWS so it had to go over the internet twice to get to me. The Oracle database was in our data center. The select * from test statement pops up the first rows instantly on Oracle but took at least a minute to show up on MySQL. A simple Python script against the same MySQL table pops up rows instantly.
So, this leaves me wondering if the slowness in displaying the first row is due to some processing that occurs in the mysql client. I assume, but don’t know, that running the query through Python is no different from through the mysql client from the server’s perspective.
I guess the point is that there is a lot that I don’t know about MySQL’s architecture because most of my experience is with Oracle. But it was helpful to me to know that the mysql client is slow to produce the first row of a large result set and to know that a Python script could be used as a workaround to this slowness.