All data is physically stored in memory. Memory is organized as a contiguous array of bytes, each located at its own unique memory address. (Well, words actually, with different computer systems potentially having their own proprietary word lengths, but the common convention now is to use bytes which are 8 bits in length.) A memory address is an integer value that can be loaded into a Memory Address Register, which the hardware can then use to access that particular memory location.

A pointer is simply a variable that contains a memory address. That's all that a pointer is. The rest is just learning the syntax of working with pointers. In C++, a reference is simply working with pointers without the pointer syntax.

However, from your example, you're asking about the address operator, which I guess some books might call a "reference operator". All it does is to return the address of the variable.

1. When you use this operator, is it just a way for the compiler to find out where the variable is?

example

int var = 5;
printf("%d",&var);

To access the variable, the compiler needs to know where it's at. So does using the & operator do so?

So then you're asking how to display the variable's address? You're close, but the format specifier for a memory address is "%p", which typically displays it in hexadecimal; memory addresses are much more meaningful in hex. Strictly speaking, %p expects a void pointer, so cast the address when you pass it: (void *)&var. And, yes, you need to use the address operator to get the address; otherwise you would display the value that's stored at that variable location.

2. How does the compiler assign what its memory address is? Does it do it random or in a certain sequence?

Basically, as the compiler analyzes the code, it builds tables in which to hold information about the identifiers that it finds. Then at a particular stage of the process, as it generates code, it sets up segments to hold variables and code, such as TEXT for code and literals, DATA for static variables, the HEAP, and the STACK (different compilers can use different names). Static variables (AKA "global variables") go into a segment of read/write memory, where they are placed one after the other, though possibly with some padding between them in order to optimize accessing them. Each location's address is its offset from the start of that memory segment; at compile time, the compiler has no idea where in memory the program will end up being loaded, so all addresses are offsets relative to some starting point. Auto variables (AKA "local variables") are set up in the code as offsets from a reference point in the function's stack frame so that they can be created when that function is called.

Now actually, when you compile a source file, you generate an object file that contains the object code and tables of what global variables it declared and what global variables and functions it expects to find in another object file. It is the linker that then reads all those tables and strings the object codes together and builds the global variable memory segment and the global initialization segment and generates the executable file. But even now, no actual memory addresses have been allocated or assigned; nobody can possibly know exactly where in memory any of the program's variables will reside.

Now you run the program. The OS' loader program reads the executable file's header for how much memory it needs (the linker had generated that header) and requests a block of memory of that size from the OS. All the addresses, both in the code and data, had been resolved as offsets from a starting memory address. That block of memory that the loader requested has a starting address, so now all the addresses in the executable can be fixed. Now we know the addresses. But they could be different each time we run the program, because subsequent executions can have different starting addresses.

But your example evokes an additional wrinkle. The loader fixes the addresses of the global variables, but not of the local variables; it cannot know where they will be because they are created and destroyed dynamically on the stack as you call and return from the functions that contain those local variables. You can play around with that by outputting a function's local variable addresses and then calling it repeatedly from different functions and after having passed through differing numbers of functions (e.g., the function is func, so call func from main, then main->func1->func, then main->func2->func1->func, etc.).

Also, even though the compiler will assign offsets to the variables pretty much in the order in which they appear in the code, it might create them in the reverse order of what you might expect. Play around with it and see for yourself.

I really simplified that down for you. Hope it made sense and helped to answer your questions.