As stated in my introduction, I will take my data from a census carried out at school. This will save time on collecting results, and the results should be reliable. For this investigation I will obtain 500 data entries. If I plotted all 500 entries on one graph it would be illegible, so I plan to use only 20 of them.

To select the 20 entries I will use random sampling. First I will number my 500 data entries from 1 to 500. I will then use the RANDOM key on my calculator to generate 20 numbers in that range, and I will use the entries that have those numbers.
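The selection step can also be sketched on a computer instead of a calculator. The short Python example below is only an illustration: the function name `pick_sample` and the seed value are my own assumptions and not part of the original method. It uses the standard `random` module and picks without replacement, so no entry number can come up twice, which a calculator's RANDOM key does not guarantee.

```python
import random

def pick_sample(population_size=500, sample_size=20, seed=None):
    """Return sorted entry numbers chosen at random, with no repeats.

    pick_sample is a hypothetical helper name used for illustration only.
    """
    rng = random.Random(seed)
    # Entries are numbered 1..population_size, as described in the plan.
    entry_numbers = range(1, population_size + 1)
    return sorted(rng.sample(entry_numbers, sample_size))

# The seed is an arbitrary choice so the same sample can be reproduced.
chosen = pick_sample(seed=1)
print(chosen)
```

If the calculator gives the same number twice, the repeat would have to be ignored and another number generated; sampling without replacement avoids that step.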

I have now drawn my cumulative frequency graphs for both Australian and British males. My original hypotheses were:

Null Hypothesis: There will not be a difference between the Median height of Australian men and that of British men.

Alternate Hypothesis: The Median height of Australian men will be higher than that of British men.

The cumulative frequency graphs show that, on average, Australian males are taller than British males. I can tell this because the median height is 158 cm for Britain but 161 cm for Australia. This supports my Alternate Hypothesis, although with a small sample it cannot prove it for certain.

The shortest person was British and the tallest was Australian. The inter-quartile range was also smaller in the British population. The advantage of the inter-quartile range is that it only looks at the middle 50% of the data, so it ignores wild and uncharacteristic results. This means that the Australian heights vary more and are spread further from the median than the British heights. Because the tallest person was Australian, it is also possible that the Australian sample contains a few very tall people, which would stretch the range, although the median and inter-quartile range are not much affected by such values.
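To show how the median and inter-quartile range are worked out, here is a small Python sketch. It calculates them directly from a list of heights, which is a different route from reading them off a cumulative frequency graph, so the answers could differ slightly from graph readings. The two lists of heights are made-up illustration values, not my real data, and `median_and_iqr` is a hypothetical helper name.

```python
import statistics

def median_and_iqr(heights):
    """Return (median, inter-quartile range) for a list of heights in cm."""
    # quantiles(..., n=4) returns the lower quartile, median and upper quartile.
    lower_q, _, upper_q = statistics.quantiles(heights, n=4)
    return statistics.median(heights), upper_q - lower_q

# Made-up example heights (cm), NOT the data used in this investigation.
sample_a = [150, 155, 158, 160, 162, 165, 170]
sample_b = [145, 152, 158, 161, 166, 172, 185]

for name, sample in (("Sample A", sample_a), ("Sample B", sample_b)):
    median, iqr = median_and_iqr(sample)
    print(name, "median:", median, "IQR:", iqr)
```

The very tall value of 185 in the second list makes its range much bigger, but the quartiles only depend on the middle of the data, which is why the inter-quartile range is a fairer measure of spread.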

I have now drawn my histograms for the heights of both UK males and UK females. My original hypotheses were:

Null Hypothesis: There will not be a difference between the Mean height of UK Males and that of UK Females.

Alternate Hypothesis: The Mean height of UK Males will be higher than that of UK Females.

The histograms clearly show that the average height of UK males is greater than that of UK females. I can tell this because the mean height for UK males is 164 cm, whereas the mean height for UK females is 158.5 cm, a difference of 5.5 cm. This is quite a large difference, and it supports the alternate hypothesis.

For females the highest frequency density, 2.5, was in the 160 ≤ h < 170 class, so this is the modal class for UK females. For males the highest frequency density, 1.9, was in the 170 ≤ h < 180 class, so this is the modal class for UK males. This is another indication that UK males are on average taller than UK females.
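Frequency density is found by dividing the frequency of a class by its class width, and the modal class is the one with the highest frequency density. The Python sketch below shows the calculation. The table of classes is an assumption for illustration: only the 160 ≤ h < 170 class is chosen to agree with the density of 2.5 quoted above (25 ÷ 10 = 2.5), and the other frequencies are made up. The function names are hypothetical.

```python
def frequency_densities(classes):
    """classes is a list of (lower, upper, frequency) tuples.

    Returns a list of ((lower, upper), frequency density) pairs.
    """
    return [((lo, hi), freq / (hi - lo)) for lo, hi, freq in classes]

def modal_class(classes):
    """Return the (lower, upper) bounds of the class with the highest density."""
    densities = frequency_densities(classes)
    return max(densities, key=lambda item: item[1])[0]

# Illustrative table with unequal class widths. Only the 160-170 row matches
# the density quoted in the text; the other rows are made-up values.
example_classes = [
    (130, 150, 8),
    (150, 160, 20),
    (160, 170, 25),
    (170, 190, 10),
]

for bounds, density in frequency_densities(example_classes):
    print(bounds, density)
print("Modal class:", modal_class(example_classes))
```

Dividing by the class width is what makes classes of different widths comparable, because the area of each bar then equals its frequency.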

I think that in general Investigation 3 went successfully. Using a histogram allowed me to use groups of different interval widths, because in a histogram it is the area of a bar, not its height, that represents the size of the group. The stratified sampling of the data was a much fairer method than random sampling because it ensured that the same age group from each population was represented.
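One common way to carry out stratified sampling is to take the same proportion from each group (stratum) as that group makes up in the whole data set. The Python sketch below shows this idea under stated assumptions: the year-group names, the group sizes and the function name `stratified_sample` are all invented for illustration and are not the strata used in my investigation.

```python
import random

def stratified_sample(strata, total, seed=None):
    """Pick about `total` items, split across strata in proportion to size.

    strata maps a group name to a list of its members.
    Rounding can make the overall sample one or two items off `total`.
    """
    rng = random.Random(seed)
    population = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the sample = share of the population, rounded to a whole number.
        k = round(total * len(members) / population)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical strata: entry numbers grouped by year (sizes are made up).
example_strata = {
    "Year 9": list(range(1, 101)),      # 100 entries
    "Year 10": list(range(101, 251)),   # 150 entries
    "Year 11": list(range(251, 501)),   # 250 entries
}

picked = stratified_sample(example_strata, total=20, seed=1)
for name, members in picked.items():
    print(name, len(members), sorted(members))
```

With these made-up sizes the groups contribute 4, 6 and 10 entries, so each group is represented in the sample in the same proportion as in the full set of 500.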
