In this tutorial, we'll be covering how to RTFM and what exactly that means in NASM and Linux.

Preface:
Just some information/expectations for you guys/gals:
1. I will be using Linux for these articles. No, they will not work on Windows. However, because of the nature of this site, I expect that most if not all people on here have access to, at the very least, a virtual machine they can use. If not, direct yourself over to VMware and to Ubuntu.

2. I will be using Ubuntu 12.10 for these articles. Don't fret, the code will assemble on any Linux-based distro, but you may have to use a different package management system to download the assembler.

3. You need to know hexadecimal for these articles. You don't need to know all that much, just what it is, how to convert it, and just be generally comfortable seeing it.

4. This is not for people new to programming. In my opinion, you should learn assembler after you know a bit more about C/C++, or some other higher-level language. Because this is not for those who are new to programming, I will not explain certain programming concepts I expect most people with moderate knowledge of programming will know. (Functions, pointers, arrays, etc.)

5. After a recent upgrade, I now find myself using Ubuntu 12.10 x64. This means that I will be including the linking/assembling commands for that system as well. However, because x86-64 CPUs can still run 32-bit code, I will only be using 32-bit opcodes and registers. This means that people on a 64-bit OS, such as myself, will have to use a slightly longer linking command than those on a 32-bit OS. Everything else should be the same.

6. I expect that you all have read and comprehended to some degree my previous tutorial. If not, you can find it here.

7. That's it! Let's get started!

What is RTFM?:

Historically, RTFM stood for "Read The Fucking Manual". It was usually directed at inexperienced people who asked experienced people for help with something that was trivial to do. I'll give an example below:

New guy: "Hey guys, I'm learning C and I was wondering how you compile a program?"

Old annoyed guy: "Dude, just go RTFM."

New guy: ":("

In the example above, it's pretty clear that the new person could have easily solved his problem on his own, yet he continues to go and annoy people and thus RTFM was born.

How does RTFM pertain to NASM?:

Most languages have a manual that comes with them, or at least a book that everyone agrees sums everything up fairly well. For C there's "The C Programming Language", commonly referred to as K&R. For D there's "The D Programming Language", for Java there's "Head First Java", for Python it's the Python Docs, etc. NASM by itself does have a manual that comes with it in the form of documentation. This documentation goes through the ins and outs of NASM and some of its more interesting quirks (I HIGHLY recommend you guys have this on hand, it'll really help you out). However, as we've learned in the past, most of the functions that we want to use in our programming (input, output, etc.) are not built into NASM. Instead, they're built into the kernel that you're currently using (which is why Windows ASM code has different interrupts than Linux ASM code would). So then, where is the manual for that? How can we look up function calls in the kernel, or for the kernel? Well my little sapling, it's time for you to learn about the man pages.

The man pages:

As many of you (should) know, every function and program in Linux, or at least the vast majority of them, has a manual that ships with it. To access this manual, you have to execute the command "man" in the terminal. Once executed, the terminal will load up the manual for the requested page and show it to you. For example, if we wanted to know more about the "ls" command, we would execute the following command in the terminal:

CODE :

man ls

The output of the command would be:

CODE :

NAME
ls - list directory contents

SYNOPSIS
ls [OPTION]... [FILE]...

DESCRIPTION
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

Mandatory arguments to long options are mandatory for short options
too.

-a, --all
do not ignore entries starting with .

-A, --almost-all
do not list implied . and ..
...

From the output, we can see every argument that the command can take, what each one does, and a general description of the command itself. Quite nice, if you ask me.

How does this apply to ASM programming?:

The man pages don't only cover commands and programs. Kernel calls are listed there as well. For instance, if we wanted to, say, read from a file, we could execute the following command to figure out how:

CODE :

man 2 read

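Why is there a '2', you ask? The man pages are split into numbered sections. Section 1 holds the manuals for programs and shell commands, which is what most people reading the man pages are after, so it gets searched first. Section 2 holds the system calls (the kernel calls we care about) and section 3 holds library functions. Asking for section 2 makes sure we get the kernel call "read" and not some other page that happens to share the name.

The synopsis on that page reads "ssize_t read(int fd, void *buf, size_t count);", so from here we can clearly see exactly what parameters this kernel call will take: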

1. The file descriptor, which is an int
2. The buffer to read into
3. The number of bytes to read (at most the size of the buffer)

The way one would organize this information in the registers is exactly how you would think:

CODE :

EAX = The system call number for read
EBX = The file descriptor
ECX = The address of the buffer to read into
EDX = The number of bytes to read (the value itself, not an address)
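To make that layout concrete, here is a minimal sketch of a read from standard input on 32-bit x86 Linux. It assumes the syscall numbers 3 for read and 1 for exit (where those numbers come from is covered in the next section), and the names SYS_READ, SYS_EXIT, STDIN, BUF_SIZE and buf are made up for this example. Treat it as one way to lay things out, not the only one.

```nasm
; read_stdin.asm: minimal sketch of read(2) through int 0x80.
; Assumptions: 32-bit x86 Linux, where read = 3 and exit = 1.
; All constant and label names below are hypothetical.

SYS_EXIT    equ 1
SYS_READ    equ 3
STDIN       equ 0                   ; file descriptor 0 is standard input
BUF_SIZE    equ 64

section .bss
    buf     resb BUF_SIZE           ; uninitialized buffer for read to fill

section .text
    global _start

_start:
    mov     eax, SYS_READ           ; EAX = which kernel call we want
    mov     ebx, STDIN              ; EBX = int fd
    mov     ecx, buf                ; ECX = void *buf (an address)
    mov     edx, BUF_SIZE           ; EDX = size_t count (a value)
    int     0x80                    ; EAX = bytes read, or a negative error code

    mov     ebx, eax                ; hand the return value back as the exit status
    mov     eax, SYS_EXIT
    int     0x80
```

On a 64-bit system this can be assembled with "nasm -f elf32 read_stdin.asm -o read_stdin.o" and linked with "ld -m elf_i386 read_stdin.o -o read_stdin". Running "echo hello | ./read_stdin; echo $?" then shows how many bytes the kernel handed back.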

Alright, that's cool, but how do we find the system calls?:

If you noticed above, EAX is going to hold the system call number for the kernel, but that number isn't listed in the man pages, so how do we find it? That's actually one of the easiest parts, because all the system call numbers are housed in one file. For Linux, it's in "unistd.h" (which should be in your /lib or /usr/include folder somewhere; I had to search to find mine) and for the *BSDs, it's in "syscall.h". On x86, "unistd.h" often just includes a second file named "unistd_32.h" that holds the 32-bit numbers, and those are the ones int 0x80 uses. The 64-bit numbers are different. Find that file, and open it up in a text editor. Once you open it up, you'll notice that every system call is in there, including the ones we've already used. Not only that, but we can see that the system call number for "read" (listed as __NR_read) is 3. This means that when we want to read, EAX should be set to 3. So, all the registers should look something like this for a read call:

CODE :

EAX = 3
EBX = The file descriptor (you can just use the return value from an "open" call)
ECX = The address of the buffer to read into
EDX = The number of bytes to read (the value itself, not an address)
int 0x80 = The call to the kernel!
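As a rough sketch of the whole chain, the program below opens a file, feeds the descriptor that "open" returns into "read", and then writes whatever it got to the terminal. It assumes the 32-bit x86 Linux numbers exit = 1, read = 3, write = 4, open = 5 and close = 6. The path "/etc/hostname" and every label and constant name are only placeholders for illustration.

```nasm
; cat_file.asm: sketch of open -> read -> write -> close via int 0x80.
; Assumptions: 32-bit x86 Linux syscall numbers; the path and all
; names below are hypothetical and chosen only for this example.

SYS_EXIT    equ 1
SYS_READ    equ 3
SYS_WRITE   equ 4
SYS_OPEN    equ 5
SYS_CLOSE   equ 6
O_RDONLY    equ 0                   ; open flag: read only
STDOUT      equ 1
BUF_SIZE    equ 256

section .data
    path    db "/etc/hostname", 0   ; NUL-terminated, as open(2) expects

section .bss
    buf     resb BUF_SIZE

section .text
    global _start

_start:
    mov     eax, SYS_OPEN
    mov     ebx, path               ; const char *pathname
    mov     ecx, O_RDONLY           ; int flags
    int     0x80
    test    eax, eax
    js      fail                    ; a negative return value means an error
    mov     esi, eax                ; keep the file descriptor for later

    mov     eax, SYS_READ
    mov     ebx, esi                ; fd returned by open
    mov     ecx, buf                ; address of the buffer
    mov     edx, BUF_SIZE           ; read at most this many bytes
    int     0x80
    test    eax, eax
    js      fail

    mov     edx, eax                ; write exactly the bytes we read
    mov     eax, SYS_WRITE
    mov     ebx, STDOUT
    mov     ecx, buf
    int     0x80

    mov     eax, SYS_CLOSE
    mov     ebx, esi
    int     0x80

    xor     ebx, ebx                ; exit status 0 on success
    jmp     quit

fail:
    mov     ebx, 1                  ; exit status 1 on any error
quit:
    mov     eax, SYS_EXIT
    int     0x80
```

The descriptor is parked in ESI because the kernel overwrites EAX with each call's return value. Each call's parameters can be checked the same way as before, with "man 2 open", "man 2 write" and "man 2 close".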

And bam! Just like that, we now know how to find any function or system call we need and the parameters they take. Enjoy exploring!

If you have any questions AT ALL, I'm just a PM away!

Thank you all for reading, and there shall be more to come!
~Centip3de

Because Linux is an open source kernel, many people have modded it and made it their own. Because of this, they are not different operating systems (as they all share the same kernel) and are generally called flavors, varieties, or distributions of the kernel. This would be the same as when someone takes a game and mods it, making it a slightly different game. Because it has the same logic implemented behind it, it's not a completely different game, and as such is usually called a modification for that game.

As for your second comment:

Because of the degree of blatantly rude, condescending, offensive, audacious remarks in your first comment, you can go fuck yourself.

Technically speaking, GNU/Linux is the OS (the OS includes the kernel). ;) However, I'm not aware of anybody crazy enough to not just use a premade distro and settle for an absolute base system instead. :P

HackThisSite is the collective work of the HackThisSite staff, licensed under a CC BY-NC license.
We ask that you inform us upon sharing or distributing.