Programming

A Guide for Absolute Beginners


Programming is not at all difficult. It's just a matter of putting words in the right order, so that a computer will understand them and do what it is asked to do.
The hard part is to figure out what you want the computer to do, and to imagine how a list of instructions will explain to a machine how you want it to solve a given problem.
The other hard part is to write a good program: one that does exactly what it is supposed to, in an efficient way, and preferably with a touch of beauty. Anybody can take an axe to a piece of wood; not everybody can make furniture with an axe.

A program is basically a list of instructions, in a certain order, to be executed by a computer. This is most obvious in one of the oldest (but not necessarily obsolete) forms of programming: batch programming. A batch is like a list of things to do. A very common batch program used to be the autoexec.bat on DOS systems. It can also be used on Windows 95/98 machines. An autoexec.bat on a DOS/Windows for Workgroups machine could look something like this:

SET TEMP=C:\TEMP
SET TMP=C:\TEMP
keyb be,,C:\DOS\keyboard.sys
LH C:\DOS\MSCDEX.EXE /d:mscd001
WIN

The autoexec.bat is executed automatically at startup, and this one would:

SET TEMP=C:\TEMP
SET TMP=C:\TEMP
make sure that temporary files are put in the 'temp' directory

keyb be,,C:\DOS\keyboard.sys
load the Belgian keyboard layout

LH C:\DOS\MSCDEX.EXE /d:mscd001
start a program that gives access to CD-ROMs under DOS

WIN
start Microsoft Windows

Another example could be a 'program', say backup.bat, that copies files from one directory to another.

XCOPY C:\MYDOC D:\BACKUP /S

This copies all files from the MyDoc folder to the Backup folder on drive D:. Adding the /S switch makes sure that files in subdirectories of the MyDoc folder are included.

Something similar is done with scripting. A script is a text file that is read by a program, a so-called command interpreter; this program reads and executes the commands in the script. This can be used to avoid having to repeat the same commands again and again: you write them down once, then tell the PC or the program to read the script and do what is written there.
Let's say that you have made a website. All the files are still on your computer, in the folder MySite on your hard disk (c:\mysite). You want to upload it to your web space at Yoohoo.com. You could now create an ftp script (myscript.txt) containing the location of your web space (on Yoohoo's ftp server), your Yoohoo user name (Pete), password (3211), an ftp command to upload all files from your MySite folder, and a command to close the connection. Maybe something like this:

pete
3211
lcd c:\mysite
mput *.*
bye

At a command prompt, the command ftp -s:myscript.txt (followed by the address of the ftp server, if the script itself does not open the connection) would then upload all files (mput *.*). If you have a website, and use a program such as CuteFTP to upload your pages, you can monitor the exchange of commands and replies between the server and your computer, and you'll notice it is quite similar to this script.

In the example of an autoexec.bat above, the batch file is very much like a script to configure the computer, to put it in a certain state (default location for temp files, Belgian keyboard, access to CD-ROMs, then run Windows). It doesn't feel like a real program. So let's add a couple of things.

Interaction with users

Let's say that, after the computer has started, you want to be reminded to check your mail. One way could be to add a line that puts "don't forget to check your e-mail" on your monitor. In DOS batch language that would be ECHO don't forget to check your e-mail. ECHO is the DOS batch command for 'show on screen'.

Even better would be to ask the user: "do you want to check your mail?". If the user answers Yes, the e-mail program is started. If the user answers No, nothing happens. The program should give the user a choice, then check what the user has chosen. This is done with a test.

Tests, conditions

In a DOS batch file, such a reminder could be written along these lines:

CHOICE /T:n,06 start Pegasus mail to check e-mail ?
IF ERRORLEVEL 2 GOTO end
c:\progra~1\pegasu~1\winpm-32.exe
:end

CHOICE is a DOS batch command that puts a question on the screen ("start Pegasus mail to check e-mail ?") and then waits for the user to give an answer: by default Y for Yes, N for No. The /T:n,06 switch means that if no answer is given within 6 seconds, the program continues as if the answer were "No" (n). This allows you to ignore the e-mail reminder and go get some coffee while you wait for Windows to get ready with whatever it's doing during startup.
The errorlevel is a number that indicates (in this case) whether Y or N has been answered. IF ERRORLEVEL 2 GOTO end is almost English; it means something like: if the 2nd option (N) is chosen, go to the line (in this batch file) that starts with :end, and in all other cases, just continue with the next command.
c:\progra~1\pegasu~1\winpm-32.exe is a statement that starts my favorite e-mail program: Pegasus Mail for Windows. You'll understand that if the user answers 'N(o)' to the question "Start Pegasus Mail?", this line will be skipped, as the program will go directly to ':end'. Simple but effective.

The "IF ERRORLEVEL 2" statement is called a TEST or CONDITION: it can be true or false. If it's true, one thing happens; if it's not true, another thing happens (or nothing at all). Conditions are very important in programming.
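As an illustration only, here is how a similar yes/no test might be sketched in C++. The function name shouldStartMail is made up for this example, and the sketch does not start any real mail program; it only reports the decision.

```cpp
#include <cctype>

// Hypothetical helper for this sketch: decides whether the mail program
// should be started, based on the single key the user pressed.
bool shouldStartMail(char answer) {
    // The condition is either true or false. Only 'Y' or 'y' counts as yes;
    // any other key is treated as "No".
    if (std::toupper(static_cast<unsigned char>(answer)) == 'Y') {
        return true;
    } else {
        return false;
    }
}
```

A program would call shouldStartMail with the key that was pressed, and start the mail program only when the result is true.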

Good programmers write "structured" programs. Structure in a program is obtained through the following means:

SEQUENCE: put things in the right order. Start with the beginning. Avoid jumping around. IF ... GO TO ... ELSE GO TO ... is a recipe for disaster, especially if you go to another of these GoTo contraptions. There's no way you can keep track of what such a program will do, and the code will read like a plate of pasta ("spaghetti programming").
This is one of the reasons that some people consider programming languages such as (Visual) Basic kids' toys rather than real programming languages: the BASIC family allows the use of GOTO.

CHOICE (TEST): if something is true, then do this; if it's not true, do something else. This is known as IF ... THEN ... ELSE ... in (Visual) Basic. A more complex test works with several cases: in case 1, do action a, in case 2, do action b, in case 3, do action c, and so on.

REPEAT (LOOP): traditionally, one distinguishes between the following kinds of loops:

fixed: repeat a given number of times

conditional: repeat as long as (WHILE) a condition is true, or UNTIL a condition becomes true
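To make these building blocks a little more concrete, here is a small C++ sketch of a test with several cases, a fixed loop and a conditional loop. All function names and the menu texts are invented for this example.

```cpp
#include <string>

// CHOICE with several cases: pick an action depending on a menu number.
std::string describeChoice(int option) {
    switch (option) {
        case 1:  return "action a";
        case 2:  return "action b";
        case 3:  return "action c";
        default: return "unknown option";
    }
}

// Fixed loop: repeat a given number of times.
// Adds up the numbers 1, 2, ... up to 'count'.
int sumOfFirst(int count) {
    int sum = 0;
    for (int i = 1; i <= count; i++) {
        sum = sum + i;
    }
    return sum;
}

// Conditional loop: repeat as long as (WHILE) a condition is true.
// Counts how many halvings it takes for a number to drop below 1.
int timesHalved(int number) {
    int times = 0;
    while (number >= 1) {
        number = number / 2;
        times = times + 1;
    }
    return times;
}
```

C++ has no UNTIL keyword; its do ... while loop tests the condition after each pass instead of before it.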

The exact way (vocabulary, syntax) to structure your program will differ from one language to another, but that's just the language. It's the story that counts, not the language it's written in.
A test in Visual Basic is fairly easy to read, because it sounds almost like (Tarzan) English:

If a > b Then max = a Else max = b

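Obviously, this is part of a program (or a function) to decide which of two numbers is bigger (a or b).
In C++ it would look something like this: if (a > b) max = a; else max = b; with parentheses around the condition and a semicolon after each statement.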
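Wrapped in a complete C++ function, the test could be sketched like this. The function name larger is made up for this example.

```cpp
// Returns the bigger of two numbers (or b when both are equal).
int larger(int a, int b) {
    int max;
    if (a > b) {
        max = a;
    } else {
        max = b;
    }
    return max;
}
```

The curly braces are optional around a single statement, but they make it easier to see which lines belong to the 'if' part and which to the 'else' part.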

If a program requires the same lines of code to be repeated more than once, you can park those lines somewhere outside your 'main' program, and 'call' them every time you want to use them. This is called a 'function' or a 'subroutine'. Say the programming language you're using (Microsoft Visual C++?) has no command to clear the screen. Yet you're writing a program that asks for user input a few times, then calculates something, and you want a blank screen before you show the result. This happens several times throughout the program.
You might create a 'ClearScreen' subroutine by having the program print 25 blank lines. A blank line in C++ is created with cout << "\n"; and you'll use a 'for' loop to repeat that 25 times. That may look as follows:

for (int i = 0; i < 25; i++)
{ cout << "\n"; }

Instead of repeating those lines every time you need a blank screen, you can put them somewhere separate, in a "function definition", and give it an obvious name, such as ClearScreen(). Every time you mention ClearScreen() in your program, it will go to (!) the function, execute the code in it, and return to the program. A very useful technique indeed.
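In C++, such a function definition might be sketched as follows. The helper blankLines and the example function showTwoResults are assumptions made for this sketch; only the name ClearScreen comes from the text above.

```cpp
#include <iostream>
#include <string>

// Builds a string consisting of 'count' line breaks.
std::string blankLines(int count) {
    std::string lines;
    for (int i = 0; i < count; i++) {
        lines += "\n";
    }
    return lines;
}

// The subroutine itself: pushes the old text off a 25-line screen.
void ClearScreen() {
    std::cout << blankLines(25);
}

// Example of calling the function more than once from other code.
void showTwoResults() {
    ClearScreen();
    std::cout << "First result: 6\n";
    ClearScreen();
    std::cout << "Second result: 8\n";
}
```

The loop is written only once, inside blankLines; the rest of the program just mentions ClearScreen() wherever a blank screen is wanted.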

How exactly you go about 'putting this code somewhere separate, in a function or subroutine' varies from language to language. We won't go into the details, as that would lead us too far beyond the scope of this paper. So would a discussion of the many uses of subroutines and functions. But you get the idea.

Imagine you write a program to add the numbers 4 and 2 together, and show the result on the monitor of your computer. In some form of BASIC, it may look like this:

Result = 4 + 2
PRINT Result

This will work, but it's rather silly. You'd have to write a different program every time you want to add different numbers. It makes sense to try and write something where you could give 2 numbers, let the program add them together, and put the result on your screen.
The program (in QBasic) might look as follows:

INPUT FirstNumber
INPUT SecondNumber

Result = FirstNumber + SecondNumber

PRINT Result

Result, FirstNumber and SecondNumber are variables. With the QBasic command 'INPUT' you can let the user type a number on the keyboard. It is stored in the computer's memory for later use. The name of a variable is how your program remembers which value is stored where in the computer's memory. That way it can be retrieved from memory when needed (e.g. to add the two numbers together and store the result in another memory location). You as a programmer don't need to worry about where that is; the program will find it by the name of the variable.
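For comparison, the same idea could be sketched in C++ as follows. The function names are made up for this example, and the small demo function feeds the text "4 2" to the program in place of a real keyboard.

```cpp
#include <iostream>
#include <sstream>

// Reads two whole numbers from 'input' and returns their sum.
int addTwoNumbers(std::istream& input) {
    int firstNumber = 0;   // a named place in memory for the first value
    int secondNumber = 0;  // and one for the second value
    input >> firstNumber;
    input >> secondNumber;
    int result = firstNumber + secondNumber;
    return result;
}

// Tries the function without a keyboard: the text "4 2" stands in for
// what a user would type.
int demoAddTwoNumbers() {
    std::istringstream typed("4 2");
    return addTwoNumbers(typed);
}
```

Calling addTwoNumbers(std::cin) instead would read the two numbers from the keyboard.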

Assigning a value to a variable

Note that assigning a value to a variable has to be written correctly: the name of the variable to the left, the value to the right.
So although FirstNumber + SecondNumber = Result is mathematically correct, you usually cannot program it that way. You want to assign a value (the sum of the 2 numbers) to the variable 'Result', so 'Result' has to be on the left-hand side, and the value (FirstNumber + SecondNumber) on the right.

In Pascal (a programming language that was originally developed as a tool to teach programming), the assignment is written := and programmers are advised to read it as 'becomes' or 'changes to' rather than 'is' or 'is equal to'. So Result := FirstNumber + SecondNumber reads as "(the value of) Result changes to (the sum of) FirstNumber and SecondNumber". This is a good way to distinguish between = (is equal to) and := (changes to), which are in fact two separate things. Many other programming languages make the same distinction, but use other symbols, such as == (equal to) and = (changes to).
Read a line such as x = x + 1 out loud, saying "is equal to" for == and "changes to" for =, and you'll see (and hear) the difference: "x changes to x plus 1" makes sense, whereas "x is equal to x plus 1" does not.
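Here is a short C++ sketch to practise on, with the spoken version in the comments. The function name and the variable names are made up for this example.

```cpp
// Returns true when the assignments and the comparisons behave as the
// comments describe.
bool assignmentVersusEquality() {
    int result = 0;           // result changes to 0
    int firstNumber = 4;      // firstNumber changes to 4
    int secondNumber = 2;     // secondNumber changes to 2

    result = firstNumber + secondNumber;   // result changes to 6

    bool same = (result == 6);             // is result equal to 6? yes
    bool other = (result == firstNumber);  // is result equal to firstNumber? no
    return same && !other;
}
```

Note that == only asks a question and never changes a variable, while = always changes the variable on its left.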

Variables: type, declaration, initialization

There are different 'types' of variables. This has to do with the fact that variables are, in fact, parts of the computer's memory. How big a part? Well, they need to be large enough to hold their value, but if they are too big they just use up memory for no good reason. Say you give one variable 16 bits of memory. 16 bits of memory can hold 2 ^ 16 different binary numbers. (Each bit can be 1 or 0, so 16 bits allow for 2 ^ 16 different combinations of 1 and 0.) That's 65,536 different (integer) numbers. If half of them are used for negative numbers, a variable of 16 bits can hold any whole number between -32,768 and +32,767. Often, that's more than enough, so you can use this type of variable (called 'integer'). In case you need larger numbers, you can use a larger type, e.g. a 'long integer' that uses more bits of memory, and thus can hold larger numbers. For characters or strings, there are also types of variables: one character traditionally takes 8 bits. There are also variables that can hold decimal numbers (such as 2/3 or 3.14159). These are called 'real numbers' and they require their own type of variable because they are stored in a different way and usually need more memory than, say, an integer. Depending on how precise they are (how many digits after the decimal point), they'll use up less or more memory.
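The arithmetic can be checked with a few lines of C++. This is only a sketch: it assumes a compiler that offers the 16-bit type std::int16_t (practically all current ones do), and the function names are invented for the example.

```cpp
#include <cstdint>
#include <limits>

// Smallest and largest value that fits in a signed 16-bit integer.
long smallest16() { return std::numeric_limits<std::int16_t>::min(); }
long largest16()  { return std::numeric_limits<std::int16_t>::max(); }

// Number of different values that 'bits' bits can hold: 2 to the power
// 'bits'. This sketch is only meant for bit counts below 63.
long long combinations(int bits) {
    long long count = 1;
    for (int i = 0; i < bits; i++) {
        count = count * 2;
    }
    return count;
}
```

With these definitions, combinations(16) gives 65536, smallest16() gives -32768 and largest16() gives 32767. The positive side is one short because zero also needs one of the 65,536 combinations.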

Giving a name to your variables and indicating their type is called the "declaration" of variables. It is usually done at the beginning of the program. Not all languages require that you declare variables at the beginning of the program; in those languages, they can be declared anywhere in the program. Some languages allow the use of variables without any declaration. It's usually still important to remember that variables have a type. Conversions from one type to another are possible, but can have side effects. For example, converting a 'long' or 'real' value to an integer can make the number you're using less accurate or even wrong: the digits after the decimal point are lost, and a value that is too large simply does not fit in an integer variable.
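A small C++ sketch of such a conversion and its side effect. The function names are made up for this example, and toWholeNumber assumes that the value is small enough to fit in an int.

```cpp
// Converting a real number to an integer drops the digits after the
// decimal point (it does not round).
// Assumption for this sketch: 'value' lies within the range of an int.
int toWholeNumber(double value) {
    return static_cast<int>(value);
}

// Converting an integer to a real number first keeps the fraction
// when dividing.
double average(int total, int count) {
    return static_cast<double>(total) / count;
}
```

So toWholeNumber(3.99) gives 3, not 4, while average(7, 2) gives 3.5, where a division of the two integers would have given 3.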

Variables are part of the computer's memory: a variable is a name for a specific location in the memory. It is always possible that this location still contains a value that is no longer used but has not been removed. Not all programming languages remove 'old' values when they look for memory locations for a new variable (C/C++ is a typical example). Your variable would then unexpectedly have a value that it should not have. It is therefore usual to assign a value to all variables as soon as possible (i.e. in or right after the declaration). This is called 'initialization' of the variables.

'Declaration
Dim UserName As String

'Initialization: make sure UserName is empty before the loop
UserName = ""

'Ask for the user name; repeat until a value for UserName is given
Do While UserName = ""
    UserName = InputBox("Enter username:")
Loop

In such a situation, if the variable 'UserName' were not empty at the start of the program (because the memory location contains a few forgotten bits ...), the program would not show the input box where the user can enter his or her name. It is therefore a good idea to make sure UserName is empty before the Do ... Loop, as the line UserName = "" does here. (Visual Basic happens to give a new String variable an empty value by itself, but not every language is that helpful.)
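A comparable loop in C++ might be sketched as follows. The function names are invented for this example, and the demo function feeds two empty answers and then a name to the loop in place of a real user.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Keeps asking until a non-empty name is entered, then returns it.
// Returns an empty string if the input ends before a name is given.
std::string askForUserName(std::istream& input, std::ostream& output) {
    // Strings start out empty in C++ anyway; the value is written out
    // here to show the intent.
    std::string userName = "";
    // A local int has no defined starting value unless you give it one.
    int attempts = 0;
    while (userName == "") {
        output << "Enter username: ";
        if (!std::getline(input, userName)) {
            break;               // no more input: stop asking
        }
        attempts = attempts + 1;
    }
    output << "(asked " << attempts << " time(s))\n";
    return userName;
}

// Tries the function with two empty answers followed by a real name.
std::string demoAskForUserName() {
    std::istringstream typed("\n\nPete\n");
    std::ostringstream shown;
    return askForUserName(typed, shown);
}
```

Calling askForUserName(std::cin, std::cout) would ask a real user. Without the = 0 on attempts, the counter could start at any leftover value.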

Since programming is not at all difficult, anyone with more than an average interest in computers will sooner or later have a go at it and learn to write a few lines of code, probably in Visual Basic (for Applications) or some form of scripting. It takes a bit more than an average interest in computers to write a good program.

A well-written program works correctly. It does what it is supposed to do. Always. Even in unexpected situations. The odds that your program works correctly increase if you think things through before you start writing the code, then write your code in a structured manner.

A well-written program is robust. It can handle unexpected situations. For example, what would happen if you ask the user to enter a number, and he enters a letter? (Users will do that, by mistake or because they never read the instructions.) Will your program crash? Will it continue to work with the wrong information and therefore return a result that you can't rely on? Or have you foreseen this, so that your program will tell the user he's made a mistake, and ask him to enter the correct information?
Or: what is your e-mail program supposed to do when you hit the "Send Mail" button while there is no connection to the internet?
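As a sketch of the first case (a letter where a number was expected), a C++ program could check what was typed before using it. The function name parseWholeNumber is made up for this example, and this is just one of several ways to do such a check.

```cpp
#include <sstream>
#include <string>

// Tries to read a whole number from what the user typed.
// Returns true and stores the value in 'number' only if the entire text
// is a valid whole number; otherwise returns false and leaves 'number'
// unchanged.
bool parseWholeNumber(const std::string& typed, int& number) {
    std::istringstream text(typed);
    int value = 0;
    char leftover;
    if (!(text >> value)) {
        return false;          // not a number at all, e.g. "abc"
    }
    if (text >> leftover) {
        return false;          // a number followed by junk, e.g. "12x"
    }
    number = value;
    return true;
}
```

A program would keep asking as long as parseWholeNumber returns false, instead of crashing or calculating with a wrong value.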

A well-written program is easy to maintain. That means that when you look at the source code 3 months later, you can still read the code, and understand what the program does and how it does it. This means you'll have to take care of the layout of your code (use indents, ...), add explanations and comments in the code, declare variables at the beginning of the program, etc.