Languages and Environments

Low Level Language

Computers are digital devices, which means they only store the values 1 and 0

Programs and data all end up in low level form (1s and 0s). The language used to directly control a computer is called low level language or machine code.

Low Level Language code is very hard to read and understand

Low level language can be arranged into assembly code (groups of 1s and 0s are replaced with keywords to make the code easier to read)

Each instruction in assembly code is equivalent to one machine code instruction

High Level Language

Programming computers is easier if the language used is similar to natural language (the way we communicate)

High Level Languages are programming languages written in an English-like language, that must be translated to machine code before a computer can use them

High level languages follow a strict syntax. This means that the language follows a firm set of rules about the order in which keywords and symbols can be used. This strict syntax allows the computer to translate the code into machine code accurately.

There are many HLLs, all with different areas of focus, such as web development, mobile app development or computer games development

Procedural Language

A procedural language is a language that consists of a sequence of commands which will be executed in a predictable order from beginning to end.

A procedural language can group blocks of code together using procedures

The sequence of instructions will be translated into machine code before the code can run

Declarative Language

Declarative languages are designed to run tests on data

Instead of a sequence of commands, declarative languages have a set of facts and rules (the knowledge base)

A declarative program is used by sending it a query. This is a test that is checked against the facts and rules in the knowledge base.

Declarative programs do not tend to use variables and control structures in the way procedural programs do, but depend on rules to control the behaviour of the program

Declarative programs can edit their own knowledge base (such as adding new facts) which means they can “learn” from user interaction.

Object Oriented Language

Object-oriented programs are designed around objects rather than sequences of instructions

Each type of item used in an OO program is described using code called a class

A class has a list of attributes and a list of methods

Each class can be used as the blueprint to create an object, for example the code in a class called Square could describe the attributes and methods of a square, and each square the program uses would be created as an object

Objects are closed off from other areas of the program – the design of the object determines which attributes and methods can be called from other parts of the program. This is called encapsulation.

Each class can be used as the base code for another class, and the programmer can add on new attributes and methods. For example, the Shape class could have two subclasses called Square and Triangle.

Creating a subclass from an existing class is called inheritance
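The Shape, Square and Triangle example above could be sketched in Python as follows. This is only an illustration: the attribute and method names (side_lengths, perimeter, area) are assumptions made for the example, and Python marks "closed off" attributes by convention (a leading underscore) rather than enforcing encapsulation.

```python
class Shape:
    """Base class: attributes and methods shared by every shape."""

    def __init__(self, name, side_lengths):
        self.name = name
        # Leading underscore: by convention, only the class itself
        # should touch this attribute (a simple form of encapsulation).
        self._side_lengths = list(side_lengths)

    def perimeter(self):
        return sum(self._side_lengths)


class Square(Shape):
    """Subclass: inherits perimeter() from Shape and adds area()."""

    def __init__(self, side):
        super().__init__("square", [side] * 4)
        self.side = side

    def area(self):
        return self.side ** 2


class Triangle(Shape):
    """Subclass: inherits everything from Shape unchanged."""

    def __init__(self, a, b, c):
        super().__init__("triangle", [a, b, c])


# Each object is created from its class (the blueprint).
small = Square(2)
large = Square(5)
tri = Triangle(3, 4, 5)
```

Here small and large are two separate objects built from the same Square class, and both Square and Triangle reuse the perimeter method written once in Shape.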

Computational Constructs

Parameter Passing

Parameters are items of data that can be passed between subprograms

To pass data into a subprogram, a list of parameters is given

To make it easy for different people to write subprograms that work together, the name of a parameter inside the subprogram can be different from its name outside of the subprogram.

The name given to the parameter within a subprogram is a formal parameter

The variable or value supplied for the parameter when the subprogram is called is the actual parameter.

Actual parameters may also be numbers rather than variables, such as in the function call ROUND(30.423,1)

If a copy of the variable passed into a subprogram is used, it will not change the original variable. This is called pass by value.

If a reference (the address in memory) of a variable is passed into a subprogram, the subprogram can change the original variable. This is called pass by reference.
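The difference between the two approaches can be imitated in Python. Python does not let the programmer choose between pass by value and pass by reference, so this is a sketch under that assumption: passing a copy of a list stands in for pass by value, and passing the list itself stands in for pass by reference. The names add_bonus, scores and marks are made up for the example.

```python
def add_bonus(scores, bonus):
    # 'scores' and 'bonus' are the formal parameters.
    for i in range(len(scores)):
        scores[i] += bonus


marks = [10, 20, 30]

# Pass-by-value style: the subprogram receives a copy,
# so the original list is left unchanged.
add_bonus(list(marks), 5)      # actual parameters: a copy of marks, and 5
after_copy = list(marks)

# Pass-by-reference style: the subprogram receives the original list,
# so its changes are visible to the caller.
add_bonus(marks, 5)            # actual parameters: marks, and 5
after_reference = list(marks)
```

After the first call marks is still 10, 20, 30. After the second call it has become 15, 25, 35.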

Scoping

Variables declared within a subprogram will be local variables. This means that they only exist during the lifetime of the subprogram’s execution.

Variables declared outside of subprograms will be global variables. This means they are accessible from any part of the program.

Global variables can be dangerous for programmers – a variable with global scope can be seen by the whole program. A global variable with a common name such as total or name may “overlap” with a local variable elsewhere. This could cause confusion.
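A short Python sketch of the "overlap" problem described above. The names total and sum_prices are chosen only for illustration.

```python
total = 100  # global variable: visible to the whole program


def sum_prices(prices):
    total = 0  # local variable: a different variable with the same name
    for price in prices:
        total += price
    return total  # the local total stops existing when the function ends


basket_total = sum_prices([2, 3, 4])
```

The global total is still 100 afterwards, because the subprogram only changed its own local total. A reader who did not notice the two variables share a name could easily be confused about which one is being used.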

Subprograms

A subprogram is a block of code within a program with a label. Subprograms can be used more than once.

Functions, procedures and methods

A function is a subprogram that returns a value. This means when you call the function, it sends back data directly to the program, e.g. the RND, ROUND or MOD functions.

A procedure is a block of code that runs in sequential order.

Procedures allow programs to group together code, but more importantly, reuse it.

A procedure can be called multiple times in a program - this saves on code duplication.

A procedure can take parameters, which allows it to work with different items of data each time it runs.

The difference between a function and a procedure is that a procedure is not designed to return a value to a program. Calling a procedure is simply calling a modular block of code.
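As a rough illustration in Python (which uses the same def keyword for both), a function and a procedure might look like this. The names average and print_report are assumptions for the example.

```python
def average(values):
    # Function: works out a result and returns it to the caller.
    return sum(values) / len(values)


def print_report(title, values):
    # Procedure: carries out a task but is not designed to return a value.
    print(title)
    for value in values:
        print(" -", value)


mean_mark = average([4, 6, 8])     # the returned value is stored
print_report("Marks", [4, 6, 8])   # the call stands alone as a statement
```

Both subprograms take parameters, so each can be reused with different data every time it is called.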

Data types and structures

Data types

A string is an array of characters. Most languages treat strings differently from arrays, because text is commonly used and manipulated in a program.

An integer is a whole number stored as a positive binary number, or a two’s complement binary number (which allows storage of positive and negative numbers)

A real number is a simple data type stored as a floating point number – a number split into two parts

A real number is a number with a fractional part (or decimal place).

A real number is stored in parts called the mantissa and exponent. This kind of number is called a floating point number.

The mantissa stores the significant digits of the number. The number of bits reserved for the mantissa determines the precision of the number.

The exponent stores the power that the mantissa is scaled by. The number of bits reserved for the exponent determines the range of numbers that can be stored.
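Python's standard math.frexp function splits a real number into a mantissa and a power-of-two exponent, which gives a quick way to see the two parts. This shows the idea only, not the exact bit layout a particular computer uses.

```python
import math

# 6.5 is stored as mantissa x 2^exponent
mantissa, exponent = math.frexp(6.5)

# Multiplying the parts back together rebuilds the original number.
rebuilt = mantissa * 2 ** exponent

# Limited mantissa bits mean limited precision:
# 0.1 and 0.2 cannot be stored exactly, so the sum is slightly off.
precision_lost = (0.1 + 0.2) != 0.3
```

For 6.5 the mantissa is 0.8125 and the exponent is 3, because 0.8125 × 8 = 6.5.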

Floating point numbers can also use an extra bit called a sign bit to indicate whether a number is positive or negative

Boolean is a simple data type that is stored as either True or False

An array is an ordered sequence of simple data types, all of the same type

An array is used to store a list of items - a variable only stores one item.

Each item in the list is given an index number. Some languages index from 0 onwards, some index from 1. Haggis Pseudocode (which is used by the SQA in exams) is indexed from 0 onwards.

Arrays are a good way of storing lists as they can be processed using their index, making it easy to search and sort the list

A record is a data structure that contains values of different types, e.g. a record of type Person could consist of a name (string), address (string) and house number (integer)
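The Person record above, and an array of such records, could be sketched in Python using a dataclass and a list. The field names follow the example in the text; the sample data is invented.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # A record groups values of different types under one name.
    name: str
    address: str
    house_number: int


# An array (a Python list here) of records, indexed from 0.
people = [
    Person("Ada", "High Street", 12),
    Person("Alan", "Mill Road", 7),
]

first_name = people[0].name
second_house = people[1].house_number
```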

Sequential files

Programs can access data from files. The process is the same in almost every programming language:

Open a file

For each line in the file:

Read line from file

Close file

Programs can also write data to files:

Open a file (create if file doesn’t exist)

For each item to be written to a file:

Write line to file

Close file

Multiple items can be read from each line, or written to one line of a file. Most files separate items using a comma.
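The write and read steps above might look like this in Python. The file name people.txt and the sample data are assumptions for the example, and a temporary folder is used so that the sketch does not overwrite any real file.

```python
import os
import tempfile

rows = [("Ada", 12), ("Alan", 7)]
path = os.path.join(tempfile.mkdtemp(), "people.txt")

# Open a file for writing (created if it does not exist).
with open(path, "w") as f:
    for name, house_number in rows:
        # Write one line per item, separating values with a comma.
        f.write(f"{name},{house_number}\n")
# Leaving the 'with' block closes the file.

loaded = []
# Open the file for reading.
with open(path) as f:
    for line in f:
        # Read the line and split it back into separate items.
        name, house_number = line.strip().split(",")
        loaded.append((name, int(house_number)))
```

The values come back from the file as text, so the house number has to be converted back to an integer after reading.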

Testing and documenting solutions

Testing

Systematic testing involves carrying out a set of tests according to a test plan

Comprehensive testing involves testing every aspect of the software

Types of error

A syntax error is an error that can be spotted by a translator because a line in a program is incorrectly formed. This would include missing symbols (like a semi-colon or quotation mark), spelling mistakes or missing keywords such as a missing END IF

Syntax errors tend to prevent a program from running (depending on the language).

An execution error is an error that is caused by the code when it runs. This would include an array being accessed at a position that doesn’t exist, a number being outside the possible range of numbers stored in the programming language, or dividing by zero.

A division by zero is an execution error where the computer tried to divide one number by zero, resulting in a crash.

Truncation is an execution error where, due to the way the program has been written, a number becomes less precise, leading to errors in a calculation.

An out of bounds error is an execution error that happens when an array is accessed in a position that does not exist, e.g. requesting position 20 of an array of ten numbers.

When handling files, trying to access a file with an incorrect path will cause an execution error.
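The execution errors described above can be triggered, and caught, in Python. The helper names and the file path below are hypothetical. Python reports these errors as exceptions, so the program can respond instead of crashing.

```python
def try_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return "division by zero"


def try_index(items, position):
    try:
        return items[position]
    except IndexError:
        return "out of bounds"


def try_open(path):
    try:
        with open(path) as f:
            return f.readline()
    except FileNotFoundError:
        return "file not found"


results = [
    try_divide(1, 0),                      # dividing by zero
    try_index([0] * 10, 20),               # position 20 of a ten-item array
    try_open("/no/such/folder/data.txt"),  # incorrect file path
]

# Truncation: converting to an integer discards the fractional part.
truncated = int(7.9)
```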

A logic error is an error in the program’s decision making. This is caused by the programmer. The program will run, but not carry out instructions in the intended way, because the logic of the program is wrong.

Common causes of logic errors are incorrect calculations and incorrect conditions in IF statements.

Testing tools

A dry run is a desk-based analysis of code. The program is executed line by line by a programmer and the programmer notes down the values of each variable as they change. This will help the programmer find logic and execution errors.

A trace table is a table of values showing the contents of a variable within a program as each line is run. The table can list several variables and will show the values of the variables in rows as the program progresses.

A watch variable is a tool that shows the contents of a variable in a program as it runs. The programmer can step through a program, line by line, and observe the changes to the variables by looking at the watch variable table.

A breakpoint is a marker in a program that pauses execution so that the programmer can use other tools such as a watch variable to observe what is happening at a given moment in a program

Algorithm Specification

Standard Algorithms

Input validation is an algorithm that asks the user for data within a particular range, and repeats the question until data in the correct range is entered.
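A minimal Python sketch of input validation. The function name get_valid_number and the read parameter are assumptions. The read parameter is there only so the example can be run with simulated answers instead of waiting at the keyboard.

```python
def get_valid_number(low, high, read=input):
    # Ask once, then keep asking while the answer is out of range.
    value = int(read(f"Enter a number from {low} to {high}: "))
    while value < low or value > high:
        value = int(read(f"Out of range. Enter a number from {low} to {high}: "))
    return value


# Simulate a user who types 15, then 0, then 7.
answers = iter(["15", "0", "7"])
chosen = get_valid_number(1, 10, read=lambda prompt: next(answers))
```

The first two answers are rejected because they are outside 1 to 10, so chosen ends up as 7. In a real program the call would simply be get_valid_number(1, 10).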

Linear search is an algorithm that checks an array for the presence of a particular item of data. To do this, the user must be asked what they are searching for. Each item in the array will be compared to the search term, and if the item is found, a message is displayed.
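One way to sketch linear search in Python. As an assumption for the example, the function returns the position of the item (or -1 if it is absent) rather than displaying a message, so the result can be used by the rest of a program.

```python
def linear_search(items, target):
    # Compare each item in the array with the search term.
    for index in range(len(items)):
        if items[index] == target:
            return index   # found: report where
    return -1              # reached the end without finding it


names = ["Ann", "Bob", "Cal"]
found_at = linear_search(names, "Bob")
missing = linear_search(names, "Zed")
```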

Find the minimum is an algorithm that finds the smallest number in an array. To do this, a variable will be used to store the smallest number found so far. Each item in the array will be compared to this number, and if the number in the array is smaller than the lowest, then the variable that contains the lowest will be set to this new number.

Find the maximum is an algorithm that finds the highest number in an array. To do this, a variable will be used to store the highest number found so far. Each item in the array will be compared to this number, and if the number in the array is larger than the highest, then the variable that contains the highest will be set to this new number.
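Both algorithms can be sketched in Python as below. Starting with the first item in the array as the initial lowest or highest is an assumption of this sketch, and it requires the array to contain at least one number.

```python
def find_minimum(numbers):
    lowest = numbers[0]          # start with the first item
    for number in numbers[1:]:
        if number < lowest:
            lowest = number      # a smaller number has been found
    return lowest


def find_maximum(numbers):
    highest = numbers[0]
    for number in numbers[1:]:
        if number > highest:
            highest = number     # a larger number has been found
    return highest


temperatures = [7, 3, 9, 1, 4]
coldest = find_minimum(temperatures)
warmest = find_maximum(temperatures)
```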

Counting occurrences is an algorithm that checks an array for the presence of a particular item of data and adds to a count each time it is found. To do this, the user must be asked what they are searching for. A variable storing the number of times the item is found is set to zero. Each item in the array will be compared to the search term, and each time the item is found, the count will be increased by one.
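A short Python sketch of counting occurrences. The function name and sample data are invented for the example, and the search term is passed in as a parameter rather than typed by the user.

```python
def count_occurrences(items, target):
    count = 0                    # set the count to zero
    for item in items:
        if item == target:
            count += 1           # found again: add one to the count
    return count


grades = ["A", "B", "A", "C", "A"]
number_of_as = count_occurrences(grades, "A")
number_of_ds = count_occurrences(grades, "D")
```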

Low-level operations and computer architecture

Computer architecture

A virtual machine is a software program which simulates a hardware platform. This means that it can allow a computer to run an Operating System within another operating system. The hosted OS acts as if it is running on a physical computer, but every hardware call is routed through the virtual machine.

Virtual machines contain the actions of the hosted Operating System because the software can control access to devices and networks.

Virtual machines use the processor of the computer directly (i.e. with the same machine code) but the programs run within the virtual machine can only interact with the hosted OS.

Virtual machines can also be created to run a program. The VM can protect the rest of the computer from being accessed by the program directly. This makes the program's execution much safer. This approach is called sandboxing. Sandboxing is used with Java, Python, JavaScript and many other languages to separate the code being executed from the host computer.

Examples of virtual machines include PC-based solutions such as VirtualBox or VMware, server-based hypervisors such as Xen (used to host “Virtual Private Servers”), or sandboxed virtual machines for programming languages such as Java.

An emulator is a software program that simulates a hardware platform including the processor. This means that a computer with one type of processor can run software designed for another type of computer - e.g. a Windows PC with an Intel processor can run Game Boy games, even though the Game Boy has a different set of hardware features and a Z80-like processor.

Emulators are useful for running executables that were written for a different hardware platform, which means that software can be tested without the original device. This is used to test Android or iOS applications (running on ARM processors) on Intel computers.

Many applications are now developed for mobile devices, rather than PCs

Mobile apps are not developed on mobile phones. They are developed on PCs and emulators are used to test apps before they are deployed to mobile devices.

Data storage

Binary is the name given to the base 2 number system. Binary numbers are represented with two values, 0 and 1.

Computers store numbers as binary code.

Converting between decimal and binary can be carried out by grouping the number into values that represent each binary column: 128, 64, 32, 16, 8, 4, 2, 1

To convert a decimal number to binary, start with the highest column value that is less than or equal to the decimal number, and place a 1 under that column. Subtract the column value and repeat the process with the remainder. Place a 0 in any unused column.

To convert a binary number to decimal, add together the values of each column.
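The column method described above can be sketched in Python. The function names are assumptions, and the sketch only handles 8 bit positive numbers (0 to 255), matching the eight columns listed.

```python
COLUMNS = [128, 64, 32, 16, 8, 4, 2, 1]


def to_binary(number):
    bits = ""
    for column in COLUMNS:
        if column <= number:
            bits += "1"          # this column is needed
            number -= column     # carry on with the remainder
        else:
            bits += "0"          # unused column
    return bits


def to_decimal(bits):
    # Add together the values of the columns that contain a 1.
    return sum(column for column, bit in zip(COLUMNS, bits) if bit == "1")


sixty_five = to_binary(65)
back_again = to_decimal("01000001")
```

For 65, the 64 column and the 1 column are used, giving 01000001.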

Two’s complement binary is used when numbers must be positive or negative

Two’s complement binary uses the highest value column to represent a negative number instead of a positive number (for example, an 8 bit two’s complement number stores the number -128 in the highest bit)

To calculate the two’s complement for a negative number, flip the bits of the binary for the positive version of the number, and add one (e.g. 65 is 01000001, so to get -65 you need to flip the bits to 10111110 and add one, giving 10111111). This can be double checked by adding the columns: -128 + 32 + 16 + 8 + 4 + 2 + 1 = -65
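The flip-and-add-one method, and the column check, can be sketched in Python. The function names are assumptions, and the sketch assumes the value fits in the chosen number of bits (-128 to 127 for 8 bits).

```python
def twos_complement(value, bits=8):
    positive = format(abs(value), f"0{bits}b")
    if value >= 0:
        return positive
    # Flip every bit of the positive version...
    flipped = "".join("1" if bit == "0" else "0" for bit in positive)
    # ...then add one.
    return format(int(flipped, 2) + 1, f"0{bits}b")


def from_twos_complement(pattern):
    # The highest value column counts as negative; the rest are positive.
    top_column = -int(pattern[0]) * 2 ** (len(pattern) - 1)
    return top_column + int(pattern[1:], 2)


minus_65 = twos_complement(-65)
checked = from_twos_complement("10111111")
```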

The ASCII character code uses 7 bits giving 128 possible characters. ASCII characters are stored in groups of 8 bits – the extra bit can be used for error checking.

ASCII stands for American Standard Code for Information Interchange and has been used as a standard character code by many computers for decades.

Unicode is an international character code that extends ASCII to include characters used in non-English alphabets. In its 16 bit form it uses 16 bits rather than 7, which gives 65536 possible characters.

A 16 bit Unicode character takes double the storage space of an ASCII character stored in 8 bits.
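Character codes and their storage sizes can be inspected in Python. As an assumption for this sketch, the UTF-16 encoding stands in for the 16 bit Unicode described above.

```python
code = ord("A")                  # the character code for A
seven_bits = format(code, "07b") # the same code written in 7 bits
letter = chr(66)                 # the character with code 66

# Storage needed for one character, in bytes.
ascii_bytes = len("A".encode("ascii"))
unicode_bytes = len("A".encode("utf-16-be"))
```

The letter A has code 65 (1000001 in 7 bits). It takes one byte when stored as ASCII and two bytes in the 16 bit encoding.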

Graphics

Bitmap graphics are stored as a grid of pixels. Each pixel is represented by a number that tells the computer what colour the pixel should be.

The bit depth of an image is the number of bits used to represent each pixel. A bit depth of 1 will only allow a 1 or 0 per pixel (black or white). A bit depth of 8 will allow 8 bits per pixel meaning the computer has 256 colours to choose from.

The colour for each pixel is often represented by a colour code, which uses a number of bits for red, green and blue. In HTML, this is represented as a 24 bit number made up of three numbers between 0 and 255, one each for red, green and blue.

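The size of a bitmap graphic in bits can be calculated by multiplying the resolution of the image (width × height in pixels) by the bit depth. Divide by 8 to give the size in bytes. The bit depth can be worked out from the number of colours: a bit depth of n allows 2 to the power of n colours.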
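The calculation can be sketched in Python. The function names and the 800 by 600 example image are assumptions made for illustration.

```python
def bit_depth_for(colours):
    # Smallest number of bits that gives at least this many colours.
    return (colours - 1).bit_length()


def bitmap_size_bits(width, height, bit_depth):
    # Resolution (number of pixels) multiplied by the bit depth.
    return width * height * bit_depth


depth = bit_depth_for(256)                   # bits per pixel
size_bits = bitmap_size_bits(800, 600, depth)
size_bytes = size_bits // 8
```

An 800 by 600 image in 256 colours needs 8 bits per pixel, so it takes 3840000 bits, which is 480000 bytes.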

Vector graphics store the attributes (position, colour, line thickness etc) of shapes and allow the user to perform operations (resize, delete, move) on the shapes. This means that the value of each pixel in a picture is not stored, just a list of instructions to create each shape on screen.

A vector image can be re-edited at any point because the computer can simply change the attributes to edit the picture. A bitmap image can only be edited at pixel level – no information about what shapes are used in the picture is stored.

Vectors are resolution independent. This means that the vector graphic can be redrawn at any scale on the screen. This is not the case with bitmaps, which pixelate when the user zooms in.

Sounds

MIDI stands for Musical Instrument Digital Interface. MIDI is a file format and protocol for communicating between devices. MIDI is used to send digital data about music between devices, rather than the resulting analogue sound.

A digital instrument (MIDI) sound file does not store samples. Like a vector graphic, a MIDI file is a set of instructions that is used to play music. Each note has attributes (sound, length, volume, time etc) that place it into the sequence of notes in the file. MIDI files can be played back by a computer using its own digital instruments.

Sampled sounds are stored by digitising the sound wave recorded by a microphone. The sound wave is sampled thousands of times per second and the values are recorded into a file.

The quality of a sound file is determined by how many samples per second are taken, and the sampling depth of each sample (the number of bits used to represent the sample)

Inside the computer

There are three main components to the processor: The ALU, Registers and Control Unit

The ALU (Arithmetic and Logic Unit) is the part of the processor where logical operations and calculations are carried out.

The Control Unit is used to control the flow of data between the processor and memory. The Control Unit interacts with the control bus to do this.

Registers store single items of data while they are being used in the processor.

A register called the Program Counter keeps track of the location in memory of the next instruction to be fetched.

A register called the accumulator is used to store the result of any calculation.

Cache is fast, efficient memory that is used to speed up the retrieval of the most-used data and instructions from memory.

Cache is built onto the processor, or near to it on the motherboard, so that it can be accessed quickly.

Cache is expensive compared to RAM, so usually it is much smaller than RAM.

ROM chips are read-only memory. The data is written onto the chip when it is manufactured.

ROM chips are used to store programs and data that must be available to the computer when it boots up.

Bootstrap loaders, programs that help to identify the components of a PC and start up the Operating System from the hard drive, are often stored on ROMs.

Computers store programs and data currently in use in RAM (Random Access Memory)

RAM consists of a set of memory locations each with an individual address

RAM is not persistent – when the computer is switched off, RAM will lose its values.

RAM is used for storing programs and data when in use because its access time is much quicker than backing storage devices. RAM data access is measured in nanoseconds – hundreds of times faster than accessing a hard drive.

The control bus is a set of discrete lines, each of which indicates what mode the processor is working in.

The read line on the control bus is set when a read operation is to be carried out

The write line on the control bus is set when a write operation is to be carried out

The clock line is switched on and off to provide a synchronising pulse for the processor to use when coordinating the movement of data

The interrupt line is used to change the state of the processor by replacing all the values in its registers (this is how programs can be switched)

The non-maskable interrupt line is used to change the state of the processor even when another program wishes to use the processor

The reset line zeroes all data in the processor and buses

The data bus sends data from the processor to memory and back again. The data bus only works in conjunction with the address bus (it must have a memory address to read from, or write to)

The data bus is bidirectional – data can travel from memory to processor or processor to memory.

The size of the data bus determines how many bits the processor can process in one operation. This is called the word size.

Increasing the size of the data bus improves system performance, as more bits can be transferred each clock cycle.

The address bus identifies locations in memory that can be used to read or write data. An address such as 10011001100110011110101101111001 would be used to identify the location in memory to read from, or write to.

The address bus is unidirectional – addresses only ever transfer from the processor to the memory.

The size of the address bus determines the number of locations in memory - 2 to the power of the number of lines on the address bus. This can be multiplied by the size of the data bus to calculate the total addressable memory.
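The addressable memory calculation can be sketched in Python. The function name and the example bus sizes (16 address lines, 8 bit data bus) are assumptions chosen for illustration.

```python
def addressable_memory_bits(address_lines, data_bus_width):
    # Each extra address line doubles the number of locations.
    locations = 2 ** address_lines
    # Each location holds one word, the width of the data bus.
    return locations * data_bus_width


locations = 2 ** 16
total_bits = addressable_memory_bits(16, 8)
total_bytes = total_bits // 8
```

With 16 address lines there are 65536 locations. With an 8 bit data bus that gives 524288 bits, or 65536 bytes, of addressable memory.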

An interface is a piece of hardware that allows two devices to communicate. An interface can also be used between internal components in a computer.

Interfaces are needed because different components/devices have different operating speeds, data formats and protocols