I have written earlier blog posts about my diversion from studying Oracle to studying computer science. Here are some relevant posts: url1,url2,url3,url4. After finishing the math for computer science online class and book that I was working on I stared working through a book about assembly language programming. I bought the book in a frenzy of enthusiasm about studying computer science for fun. But then I had to decide if I was going to work through the assembly book now and delay moving on to the algorithms class using Python that I had intended to do next. I intended to use the math that I studied to prepare me for the algorithms class. But, assembly is a nice low-level hardware oriented thing to study and a contrast from the math, computer science theory, and higher level Python scripting language. So, I decided to delay my study of algorithms and work on assembly. I’m working on the last exercise of the 16th chapter out of 19 in the book and thought I would start this blog post to document my experience so that others might benefit from it. The book is Ray Seyfarth’s “Introduction to 64 Bit Assembly Language Programming for Linux and OS X“. I have saved my work on the exercises on GitHub: repository.
I want to let people know what type of environment and tools that I used so they can compare notes with my experience if they decide to work through the book. I started out on an Oracle Enterprise Linux 7 virtual machine running under VirtualBox on my laptop. Oracle’s Linux is a version of Red Hat Linux. I believe that I had to compile YASM and the author’s ebe tool. It has been a while but I think I had to search around a bit to get the right packages so that they would compile and I had some parts of ebe that never worked correctly. Starting with Chapter 9 Exercise 2 I switched from YASM, the assembler recommended by the author of the book to NASM, a more commonly used assembler. I switched because I hit a bug in YASM. So, chapter 9 exercise 1 and earlier were all YASM. Also, after the early lessons I moved from the GUI ebe debugger to the command line gdb debugger. I wanted to get more familiar with gdb anyway since I use Linux for my job and might need gdb to help resolve problems. After I got a new laptop I switched to using Centos 7 on VirtualBox. I was able to install nasm and gdb using yum from standard repositories and did not have to do any manual compilation of development tools in my new environment. So, if you choose to work through the book you could go the nasm and gdb route that I ended up with if you have challenges installing and using ebe and yasm. There are some minor differences between nasm and yasm but they are pretty easy to figure out using the nasm manual.
There are connections between x86-64 assembly programming on Linux and my job working with Oracle databases. At work, 64 bit Linux virtual machines are our standard Oracle database platform. They are also the building blocks of the cloud. You see a lot of 64 bit Linux on Amazon Web Services, for example. So, I’m really kind of doing assembly language for fun since it is so impractical as a programming language, but at the same time I’m doing it on the platform that I use at work. Maybe when I’m looking at a dump in an Oracle trace file on Linux it will help me to know all the registers. If I’m working with some open source database like MySQL it can’t hurt to know how to debug in gdb and compile with gcc.
The assembly language book also connected with my passion for performance tuning. The author had some interesting things to say about performance. He kind of discouraged people from thinking that they could easily improve upon the performance of the GCC C compiler with all of its optimizations. It was interesting to think about the benefits of SIMD and how you might write programs to work better with pipelining and the CPU’s cache. It was kind of like Oracle performance tuning except you were looking at just CPU and lower level factors. But you still have tests to prove out your assumptions and you have to try to build tests to show that what you think is so will really hold up. Chapter 16 Exercise 1 is a good example of SIMD improving performance. I started with a simple C version that ran in 3.538 seconds. An AVX version of the subroutine did 8 floating point operations at a time and ran the same function in 2.1057 seconds. Here are some of the AVX instructions just for fun:
vmovups ymm0,[x_buffer] ; load 8 x[i] values vmovups ymm1,[rsi+r10*4] ; load 8 x[j] values vsubps ymm0,ymm1 ; do 8 x[i]-x[j] ops vmulps ymm0,ymm0 ; square difference
Generally, x86-64 assembly ended up feeling a lot like C. The book has you use a variety of calls from the C library so in the later chapters the assembly programs had calls to a lot of the functions that you use in C such as printf, scanf, strlen, strcmp, and malloc. Like C it was common to get segmentation faults without a lot to go on about what caused it. Still the back trace (bt) command in gdb leads you right to the instruction that got the error so it some ways it was easier to diagnose segmentation faults in assembly than I remember it being in C. It brought back memories of taking C in college and puzzling over segmentation faults and bad pointers. It also made me think of the time in a previous job when I progressed from C to C++ to Java. I came out of school having done a fair amount of C programming. Then I read up on C++ and object-oriented programming. But C++ still had the segmentation faults. Then I found Java and thought it was great because it gave you more meaningful error messages than segmentation fault. Now, I have embraced Python recently because of the edX classes that I took and because of the ways I have used it at work. Working with assembly has kind of taking me back down the chain of ease of use from Python to C to assembly. I can’t see using C or assembly for every day use but most of the software that we use is written in C so it seems reasonable to have some familiarity with C and the lower level assembly that lies beneath it.
Anyway, I have three more chapters to go but thought I would put out this update now while I am thinking about it. I may tweak this post later or put out a follow-up, but I hope that it is useful to someone. If you feel inclined to study 64 bit x86 assembly on Linux I think that you will find the Ray Seyfarth book a good resource for you. If you want to talk to me about my experience feel free to leave at comment on this post or send an email.