In this unit you will:

• Learn about the difference between data and information
• See how the analysing process can transform data into information
• Discover the role of hardware in the analysing process

Important Terms in Unit 1

(See the Glossary for the definitions)

Information

Data

Analysing

Primary data

Bus

CPU

Clock speed

Data and information

In this unit, the terms ‘data’ and ‘information’ are used throughout. People often give different and complicated definitions for both words, and a few insist that the two terms mean the same thing, but do they? One reason for this claim may be that there is no physical difference between them: information and data both come in several forms, such as text, numbers, graphics, audio and video.

Even though data and information come in similar forms, the definitions of these terms are quite different.

Data- is the raw material used by information processes; it has no meaning or clear purpose.

Information- is data that has been ordered and given some meaning.

Whether something is data or information depends purely on how the person viewing it perceives it. For example, if someone sees a list of numbers that doesn’t make sense or have a clear meaning, it is data. However, if that same list is presented as a graph and makes more sense to them, it is now information. This process is known as analysing.

Analysing

Analysing- the process that transforms data into information. However, the original data is not altered during the process.

The analysing process adds a clear meaning and purpose to raw data without creating new data. This could be as simple as adding titles, labels or descriptions to the data (as shown in the tables below).

Raw Data

24 | 90
22 | 97
20 | 99
18 | 94
15 | 81
13 | 86

Analysed Data

Monthly climate in Wagga Wagga

Week | Temperature (°C) | Rainfall (mm)
Week 1 | 24 | 90
Week 2 | 22 | 97
Week 3 | 20 | 99
Week 4 | 18 | 94
Week 5 | 15 | 81
Week 6 | 13 | 86

The top table is filled only with numbers that have no meaning or clear purpose, whereas the bottom table shows the same numbers with labels, which makes them easier to understand.
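The labelling step can be sketched in Python. This is only an illustration: it assumes the raw values alternate between temperature and rainfall, as the analysed table suggests, and the function name and field names are made up for the example.

```python
# Raw data exactly as collected: just numbers, with no labels.
raw_data = [24, 90, 22, 97, 20, 99, 18, 94, 15, 81, 13, 86]


def label_readings(raw):
    """Pair up the raw values and attach labels (hypothetical field names)."""
    rows = []
    # Assumption: values alternate temperature, rainfall, one pair per week.
    for week, start in enumerate(range(0, len(raw), 2), start=1):
        rows.append({
            "week": f"Week {week}",
            "temperature_c": raw[start],
            "rainfall_mm": raw[start + 1],
        })
    return rows


analysed = label_readings(raw_data)
for row in analysed:
    print(row["week"], row["temperature_c"], row["rainfall_mm"])
```

The raw list is never changed. The labels are added around it, which matches the idea that analysing does not alter the original data.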

Another reason why analysing data is useful is that it helps people make decisions. Analysing data can reveal trends or patterns that help in the decision-making process. E.g. when planning for the Sydney Olympics, a detailed analysis of Sydney’s past weather patterns helped the organisers choose the best month in which to host the Games.

Hardware for Analysing

Primary & Secondary Storage

The analysing process generally needs to access all the collected data and must also be able to store the data it creates, which means a large amount of storage space is necessary. Most information systems have two types of storage, primary storage and secondary storage, and the analysing process needs large amounts of both.

Primary Storage- is the onboard storage built into an information system.

It is divided into:

RAM (Random Access Memory), which is temporary storage that holds only the programs and files currently being used.

ROM (Read Only Memory), which is permanent storage that holds instructions the information system needs to operate, such as those used when it starts up.

Secondary Storage- is storage provided by devices attached to an information system. This type of storage is more permanent, as it keeps data and software for future use. Examples include hard disk drives, floppy disk drives and CD drives.

Speed of Analysing

When analysing, the information system performs many difficult operations that use a large amount of data. The speed and data processing capabilities of the CPU (central processing unit) determine the time taken to complete these operations. Speed can be enhanced if the CPU:

Has a wider data bus. The ‘bus’ consists of wires that let data enter and leave the CPU. The width of the bus determines how many bytes of data can move in or out of the CPU at one time. For example, a data bus which is 64 bits wide can move 8 bytes of data in or out of the CPU at a time.

Has a faster clock speed. Clock speed is the number of timing signals produced every second by an electronic clock on the motherboard or CPU. For example, a CPU with a clock speed of 2 GHz receives twice as many timing signals each second as one running at 1 GHz, so, all else being equal, it can complete more tasks in the same time.

Has a higher FLOP rating. A ‘FLOP’ (floating-point operation) is a calculation on ‘floating-point numbers’ (decimals and also very large or very small numbers). The rating shows how many of these calculations, such as adding two numbers together, can be carried out in a second. It is said that this is a more reliable indicator of CPU speed than clock speed.

Is working in parallel with other CPUs. If the analysing process is shared between multiple CPUs, each CPU has a shorter list of tasks to complete. This can save time and money.
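The bus width and clock speed arithmetic above can be sketched in Python. This is a simplified illustration only: it assumes one transfer across the bus per timing signal, which real systems may not achieve, and the function names are invented for the example.

```python
BITS_PER_BYTE = 8


def bytes_per_transfer(bus_width_bits):
    """Number of whole bytes a data bus of this width moves at one time."""
    return bus_width_bits // BITS_PER_BYTE


def peak_bytes_per_second(bus_width_bits, clock_hz):
    """Upper limit on data moved per second.

    Assumption: exactly one transfer happens on every timing signal.
    """
    return bytes_per_transfer(bus_width_bits) * clock_hz


print(bytes_per_transfer(64))                        # a 64-bit bus moves 8 bytes
print(peak_bytes_per_second(64, 1_000_000_000))      # 1 GHz clock
print(peak_bytes_per_second(64, 2_000_000_000))      # 2 GHz clock
```

Under this assumption, doubling either the bus width or the clock speed doubles the amount of data the CPU can move in a second.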

Questions

Write the definitions of 'data' and 'information'.

What is the purpose of the analysing process?

Why are primary and secondary storage important for the analysing process?

Would every analysis of data produce information? Explain

Why would a processor's FLOP rating give a better indication of its analysing abilities than its clock speed?

Answers

In this unit you will:

• See how data can be searched and sorted as part of the analysing tools
• Learn about modelling, simulations and charts as analysing tools
• See how file comparisons can be used to analyse data

Important Terms in Unit 2

(See the Glossary for the definitions)

Search key

Exact match

Wildcard matching

Logical operators

ASCII

Model

Simulation

Continuous data

Discrete data

Series data

Searching for Patterns

A part of analysing is searching data for specific values or patterns, e.g. searching a telephone book for a certain name. Programs on an information system have tools that can search different types of data:

Text data- Word processors, text editors and web browsers are used

Numerical data- Spreadsheet applications are used

Image, audio and video- This is a more complex search than the others

Searching for Text Data

When searching text data, a search key is compared with the data. The system checks the collected data to try to find the requested search key. There are two types of matching when searching text: exact matching and wildcard matching.

Exact matching
For an exact match, all the characters in the search key must be specified before the search starts. For the search to be successful, the same sequence of characters must appear in the text data. An example is shown below.

Exact match search keyword: manage

Office management
The current manager, Mr Smith, has recently returned from a management conference where she presented a paper on how to manage an office staffed by part-time workers. Her experiences in this new area of management skills well…
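An exact match search can be sketched in Python as follows. This is a minimal illustration, not the way any particular word processor works: the function name is made up, and the sample text is a shortened version of the passage above.

```python
def exact_match_positions(text, search_key):
    """Return the position of every place the search key occurs in the text."""
    positions = []
    start = text.find(search_key)
    while start != -1:
        positions.append(start)
        # Continue looking from the character after the last match.
        start = text.find(search_key, start + 1)
    return positions


sample = (
    "Office management. The current manager, Mr Smith, has recently "
    "returned from a management conference where she presented a paper "
    "on how to manage an office staffed by part-time workers."
)

matches = exact_match_positions(sample, "manage")
print(len(matches), "matches found at positions", matches)
```

Note that the search key ‘manage’ is also found inside longer words such as ‘manager’ and ‘management’, because the same sequence of characters appears in them.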

Wildcard Matching
In wildcard matching, not all the characters have to be specified before the search starts. ‘Placeholders’ or ‘wildcard characters’, either ‘?’ or ‘*’, stand in for an individual character or a group of characters.

‘?’- is used to represent an individual character in a certain place, e.g. the search key ‘b?t’ can be used to search for ‘bat’, ‘bit’ and ‘but’

‘*’- is used to represent any additional characters (other than the characters searched for). An example of this is below.

Wildcard search keyword: manage*

Office management
The current manager, Mr Smith, has recently returned from a management conference where she presented a paper on how to manage an office staffed by part-time workers. Her experiences in this new area of management skills well…
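One possible way to try wildcard matching is with Python's standard `fnmatch` module, whose ‘?’ and ‘*’ characters behave like the placeholders described above. The sketch below checks each word of the text against the search key. The function name and the way the text is split into words are assumptions made for the example.

```python
import re
from fnmatch import fnmatchcase


def wildcard_search(text, search_key):
    """Return every word in the text that matches the wildcard search key."""
    # Assumption: a 'word' is any unbroken run of letters.
    words = re.findall(r"[A-Za-z]+", text)
    return [word for word in words if fnmatchcase(word, search_key)]


sample = (
    "Office management. The current manager, Mr Smith, has recently "
    "returned from a management conference where she presented a paper "
    "on how to manage an office staffed by part-time workers."
)

print(wildcard_search("bat bit but boat", "b?t"))
print(wildcard_search(sample, "manage*"))
```

Here ‘b?t’ matches ‘bat’, ‘bit’ and ‘but’ but not ‘boat’, because ‘?’ stands for exactly one character, while ‘manage*’ matches ‘manage’ followed by any number of extra characters.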

Simple searches can be linked together by logical operators to make a more complex search called a 'combined search'. The words ‘and’ and ‘or’ are the most commonly used logical operators:

‘Or’- includes all data found by any of the separate searches

‘And’- only includes data common to all the separate searches

An example of a combined search is below.
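A combined search can be sketched in Python using sets, where ‘or’ corresponds to set union and ‘and’ to set intersection. The records, field names and function name here are invented purely for illustration.

```python
# Hypothetical records; the names and towns are made up for this example.
records = [
    {"name": "Smith", "town": "Wagga Wagga"},
    {"name": "Jones", "town": "Wagga Wagga"},
    {"name": "Smith", "town": "Sydney"},
    {"name": "Brown", "town": "Albury"},
]


def simple_search(records, field, value):
    """Return the set of record positions where the field equals the value."""
    return {i for i, record in enumerate(records) if record[field] == value}


smiths = simple_search(records, "name", "Smith")
in_wagga = simple_search(records, "town", "Wagga Wagga")

# 'or' keeps records found by either search; 'and' keeps only the common ones.
either = sorted(smiths | in_wagga)
both = sorted(smiths & in_wagga)

print("Smith OR Wagga Wagga:", [records[i] for i in either])
print("Smith AND Wagga Wagga:", [records[i] for i in both])
```

The ‘or’ search returns three records, while the ‘and’ search returns only the one record that satisfies both conditions.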

Searching Numerical Data

Unlike text data, numerical data is not searched character by character, as it is stored in binary form. This means that wildcard matching cannot be used to search numerical data. Instead, the exact numerical value of the search key must be compared with the exact numerical value of each data item.
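The difference between comparing numerical values and comparing text characters can be seen in a short Python sketch. The data values and the function name are assumptions chosen for the example.

```python
def numeric_search(values, search_key):
    """Return the positions of items whose numerical value equals the key."""
    return [i for i, value in enumerate(values) if value == search_key]


# The same items stored as numbers and as text.
as_numbers = [2, 100, 2.0, 20]
as_text = ["2", "100", "2.0", "20"]

number_matches = numeric_search(as_numbers, 2)
text_matches = [i for i, item in enumerate(as_text) if item == "2"]

print(number_matches)  # 2 and 2.0 have the same numerical value
print(text_matches)    # as text, "2" and "2.0" are different character sequences
```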

Searching Image, Audio and Video Data

There is no single method of searching image, audio and video data, as there are many different ways of organising such data. Commonly used methods include searching through filenames or descriptions. Generally, the people who develop multimedia file libraries must add a description to each file, either by typing it manually or by using automated software.

Sorting Data

Sorting can be a helpful first step in transforming data into information. Unlike data in a manual information system (such as a telephone book), collected data in a computer-based system does not have to be stored in a particular order. Computers today make organising a large amount of data a quick and simple process. Each data type is sorted in a different way. For example, text and numerical data are sorted differently (explained in the table below).

Data Type | Ascending Order Sort | Descending Order Sort
Text | A to Z: Uses the ASCII value of each character, beginning with the first character, which can produce some unexpected orders. E.g. 'Z' will come before 'a' (as capital letters come before lower case letters), and when numbers are treated as text characters, '100' will come before '2'. | Z to A: Uses the ASCII value of each character, beginning with the first character. This is the opposite of 'A to Z'; for example, 'a' will come before 'Z' (lower case letters come before upper case letters).
Numerical | 0 to 9: Uses the numerical value of the whole data item, not individual characters. E.g. 2 will come before 100 and 0.99 will be placed before 1.0. | 9 to 0: Uses the numerical value of the whole data item. This is the opposite of '0 to 9'; for example, 100 will be placed before 2.

Image, audio and video data, on the other hand, are usually sorted by their filenames, their descriptions or even their file sizes.
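The text and numerical sort orders described in the table above can be reproduced with Python's built-in `sorted` function. Python compares text by character code, which for these characters is the same as the ASCII value. The sample items are chosen only to illustrate the points in the table.

```python
text_items = ["apple", "Zebra", "100", "2"]
numbers = [100, 2, 0.99, 1.0]

# Text is compared character by character using character codes.
text_ascending = sorted(text_items)
text_descending = sorted(text_items, reverse=True)

# Numbers are compared by the value of the whole item.
numeric_ascending = sorted(numbers)
numeric_descending = sorted(numbers, reverse=True)

print(text_ascending)      # '100' before '2', and 'Zebra' before 'apple'
print(text_descending)
print(numeric_ascending)   # 0.99 before 1.0, and 2 before 100
print(numeric_descending)
```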

Modelling and Simulation

Modelling

A computer model uses the analysing process to represent another system, process or object (either real or imaginary). A model is built by analysing collected data such as measurements, observations and equations. The model could be an image, a group of equations, a sound or an animation. Because computer modelling helps us to understand systems, it is helpful if the model is realistic.

Simulation

Computer simulation is used to examine the performance of a model by analysing how it responds to changes in data and equations. It is intended to predict how the system being modelled will respond to various situations. Examples of computer simulation are weather forecasting programs, transport simulators, scientific and business software and even some computer games.

Simulation Software

This image shows a simulation program called 'TOA Image Simulation Software', which engineers would use.

Simulators are used in a variety of research and training areas. These areas can include:

Economic simulations let you adjust economic conditions such as interest rates to see the results.

Simulations used by businesses can involve using a spreadsheet program to see the effect of an increase in raw material costs or interest rates.

Simulations used by engineers can involve software that allows them to predict the characteristics of an aircraft or car design.

Military recruits use simulators to practise being in battle without facing real danger.

Even though simulations can be set up in a spreadsheet program, specialised software is often used to supply faster calculations and/or appear more realistic.

The advantage of using simulation software over a real approach is that simulation is generally cheaper and less dangerous. However, if the data used is incorrect then the simulation will be incorrect, and if the data used is too simple then the answers will be too simple as well.

‘What-If’ Analysis

A spreadsheet application has the advantage of being able to quickly recalculate a sheet of equations every time a piece of the data is changed. This is used for ‘what-if’ analysis in numerous application areas. The image below is an example of this: when the interest rate cell is adjusted, it affects all the other data as well.
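The same ‘what-if’ idea can be sketched in Python, where a function plays the role of the spreadsheet's equations and is recalculated for each interest rate tried. The loan amount, the rates and the use of simple interest are all assumptions made for the example; they are not taken from the spreadsheet in the image.

```python
def loan_summary(principal, annual_rate, years):
    """Recalculate simple-interest loan figures from the current inputs."""
    interest = principal * annual_rate * years
    total = principal + interest
    monthly = total / (years * 12)
    return {
        "interest": round(interest, 2),
        "total": round(total, 2),
        "monthly": round(monthly, 2),
    }


# What if the interest rate were 5%, 6% or 7%? (hypothetical figures)
for rate in (0.05, 0.06, 0.07):
    print(f"{rate:.0%}:", loan_summary(10_000, rate, 2))
```

Changing only the interest rate changes every figure that depends on it, just as adjusting one cell in a spreadsheet updates all the cells whose equations refer to it.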

Charts and Graphs

Charts and graphs are common methods for analysing data, as they can display relationships, patterns and comparisons and are a faster way to take in information. Charts and graphs have three main advantages:

Impact- can attract attention to important information with colours, symbols and patterns

Speed- well-drawn charts and graphs can quickly and clearly show trends in the data

Simplicity- most people can understand the message in a graph or chart more easily than in a table

The choice of chart or graph type is important and will depend on the type of data being analysed. The table below shows information about different chart types. The data in this table is labelled as either ‘continuous’ or ‘discrete’. ‘Continuous’ refers to data that is collected or sampled repeatedly, for example, the number of cars crossing an intersection every hour in a day, whereas ‘discrete’ refers to ‘one-off’ data or a snapshot of data, for example, votes for the political parties in the last election.

When many series are plotted the individual lines can become ‘lost’ or difficult to see
Pie | Discrete | Ideal when comparing data as percentages | Small values are hard to see; can only plot a single data series
Combination (line and column) | Continuous or discrete | Best for plotting two separate data series in the same chart | If too complex it can become very confusing

The table above also mentions the term ‘data series’. The data series identifies the number of data sets (columns and/or rows) that are being plotted in the chart. There are two types of data series: single, which is a single column or row of data, and multiple, which is several columns or rows of data.

The table below is an example of multiple data series and the picture on the right shows different types of charts used to analyse multiple series data.

Hour Beginning At | 7 am | 8 am | 9 am | 10 am | 11 am | 12 pm | 1 pm | 2 pm | 3 pm | 4 pm | 5 pm | 6 pm
Cars
Vans
Trucks
Taxis
Buses
Motorcycles

Comparing Files

The results of analysing data can be checked by comparing the processed data with the original to identify changes. Analysing the differences between data files works best on fixed-length files, as the information system can compare them immediately. For example, a word-processed document that has been altered can be compared with the original document. However, the simplest way to compare documents is to look at their file lengths (in characters or bytes).
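Both kinds of comparison can be sketched in Python: a quick length check, followed by a line-by-line comparison using the standard `difflib` module. The two short documents and the function name are invented for the example.

```python
import difflib


def compare_texts(original, altered):
    """Return (same_length, changed_lines) for two versions of a document."""
    # Simplest check: do the two versions have the same number of characters?
    same_length = len(original) == len(altered)
    # Detailed check: list lines removed from ('- ') or added to ('+ ') the original.
    diff = difflib.ndiff(original.splitlines(), altered.splitlines())
    changed_lines = [line for line in diff if line.startswith(("- ", "+ "))]
    return same_length, changed_lines


original_doc = "Meeting at 9 am\nBring the report"
altered_doc = "Meeting at 10 am\nBring the report"

same_length, changes = compare_texts(original_doc, altered_doc)
print("Same length:", same_length)
for line in changes:
    print(line)
```

A length check is fast but limited: two documents of the same length can still have different contents, so the line-by-line comparison is needed to see what actually changed.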

Unit 5.3- Non-Computer-Based Analysing Tools & Social Issues

In this unit you will:

• Identify some non-computer-based tools for analysing data
• Examine some of the social and ethical issues affecting data analysis

Manual Analysing

Searching Manual Systems

Compared with computer-based searching, manual and people-based searching can sometimes be unreliable and is limited in what it can do, e.g. a human user searching for a single word in a large document.

An example of a useful manual search system is a card index system (shown in the pictures beside). This can organise data and make it more accessible by:

Arranging it in alphabetical order

Dividing it with tabbed dividers

Using colour codes to categorise it

However, a card index system also has the disadvantage of being inflexible, e.g. it is difficult to reorganise the data once it has been sorted by date.

Mechanical tools are also frequently used for searching data. A commonly used example of this is the telephone Rolodex/Teledex which is considered a faster way of finding telephone numbers than a telephone book.

Non-Computer-Based Models and Simulations

Non-computer-based models are used in numerous areas of science and engineering, e.g. NASA relies on non-computer-based models to build wind tunnels. A non-computer-based model can sometimes be easier to build than a computer-based model, especially if we do not have a good knowledge of the system.

A model does not have to include physical objects or working/moving parts; it could be a paper-based model, which is commonly used for analysing information systems.

Social and Ethical Issues

Unauthorised analysis of data

Access to government and census data for analysis is covered by the Freedom of Information Act. Authorised data analysis has practical and important uses, as it supports reliable decisions, for example, about where to build infrastructure. On the other hand, unauthorised data analysis can lead to an abuse of privacy.

Data incorrectly analysed

The analysis of data is significant in making decisions and in changing data into information. However, if the data is incorrect, the information will be too. Preventing this is the responsibility of the people who design and use data analysis systems. End users also have a responsibility to check data for faults.

Loss of Privacy from Linking Databases for Analysis

The ability of information systems to link data from different databases has the great advantage of increasing the quality and quantity of information. However, it also has the disadvantage of being a major danger to privacy.

Glossary

Analysing- the process that transforms data into information. However, the original data is not altered during the process

ASCII- stands for ‘American Standard Code for Information Interchange’, a set of codes used to represent text characters

Bus- consists of wires which allow data to enter and leave the CPU

Clock Speed- the number of timing signals produced every second by an electronic clock on a motherboard or CPU

Continuous Data- data that is collected repeatedly

CPU (Central Processing Unit)- Is built into a computer to process data

Data- raw facts/items that have no meaning or purpose

Data Series- the number of complete data sets such as columns and rows

Discrete Data- is data taken at a certain time or position

Exact Match- searching for the exact arrangement of text characters in collected data

Information- data that is meaningful and has a clear purpose

Logical Operators- link searches together, ‘and’ & ‘or’ are commonly used for this

Model- a description of a system, process or object

Primary Storage- onboard storage built into an information system, divided into RAM and ROM

Search Key- a set of text characters that is searched for in collected data

Secondary Storage- permanent storage provided by devices attached to an information system

Simulation- uses a model to predict the actions/behavior of a system

Wildcard Matching- Uses placeholders such as ‘?’ and ‘*’ to represent characters

HSC Questions and Model Answers

Primary data storage is not permanent and is stored on onboard devices that are built into an information system whereas secondary data storage is permanent and is stored in attached devices in an information system, such as hard drives and floppy disks.

Modelling is where data, such as measurements, observations and equations, is collected about a system, process or object and then used to create a model, whereas a simulation uses a model to predict how a system will behave under certain conditions.

Summarise how information technology has improved the analysing process.

The analysing process has been improved by technology, as it has made completing analysing tasks easier and faster. These tasks include searching for different types of data, sorting data, creating models and simulations of systems and also comparing files.

A person’s privacy can be abused by this process in that it has become easier to access data without authorisation and also easier to link different databases. Even though these capabilities have many advantages, they can easily be misused.