This is a follow-up to my earlier post about the assembly language book that I am working through. I have struggled to speed up a program using something that the book recommends, unrolling a loop. I think I have finally found an example where unrolling a loop speeds up a program so I wanted to share it.
I am working on Chapter 17 Exercise 2 of the book which asks you to write a program to find the longest common substring from two strings. I choose an inefficient and simple way to find the common substring and tried to speed the program up without changing the algorithm.
Here is the C version on GitHub: url
The core of the program is three loops. The outer loop tries each character in string 1 as the start of the substring. The middle loop tries each character in string 2 as the start of the substring. The inner loop advances through both strings until it finds the end of the common substring.
The C version ran in 27.2 seconds.
I built an assembly version that uses registers for most of the variables and it ran in about 11.7 seconds. It has the same three loops. Assembly register version: url
I tried to improve on the 11.7 seconds by unrolling each of the three loops. Unrolling the outer and inner loops resulted in no improvement in runtime. I was about to give up but finally decided to try unrolling the middle loop and it caused the program to run in 10.2 seconds. Very cool. Assembly unrolled middle loop version: url
I had to figure out how to use %rep, %assign, and how to have a label that I based on a nasm variable such as .skipif %+ i.
Kind of fun. I realize that this is off topic for my “DBA Blog” but it is something I’m playing with so I thought I would throw it out there. It doesn’t hurt a DBA to know some low-level computer science, even if you are not using it directly in your job. Anyway, it was a bit of a struggle to come up with an example that was faster with the loop unrolling and I think that I have found one.
P.S. Came across the term “loop unrolling” in the Oracle 12.2 New Features Manual. The section titled “Oracle Database Java Virtual Machine Performance Enhancements” has the same sort of performance enhancements that the assembly language book describe including loop unrolling and using SIMD instructions. So, maybe this assembly stuff is not as far removed from a DBA’s job as I thought. It helps me understand this section of the new features manual.