Centrality Measures

Why Are Centrality Measures Insufficient to Describe a Distribution?

Consider three data samples, each given as a list of values. We can assume, for example, that they are students' grades:

Sample A: 7, 7, 7, 7, 7, 7, 7

Sample B: 10, 10, 7, 7, 7, 4, 4

Sample C: 8, 7, 7, 7, 7, 7, 6

These three samples have the same mode (7), the same median (7), and the same mean (7). What, then, is the difference between them? They clearly do not have identical characteristics: they differ in how their values are dispersed.
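The claim that all three samples share the same centrality measures can be checked directly. The following is a minimal sketch using Python's standard `statistics` module; the dictionary name `samples` and the variable `centrality` are illustrative choices, not part of the original text.

```python
from statistics import mean, median, mode

# The three samples from the text (for example, students' grades).
samples = {
    "A": [7, 7, 7, 7, 7, 7, 7],
    "B": [10, 10, 7, 7, 7, 4, 4],
    "C": [8, 7, 7, 7, 7, 7, 6],
}

# For each sample, collect (mode, median, mean).
centrality = {
    name: (mode(values), median(values), mean(values))
    for name, values in samples.items()
}

for name, (mo, med, avg) in centrality.items():
    print(f"Sample {name}: mode={mo}, median={med}, mean={avg}")
```

All three samples report the same three values, so the centrality measures alone cannot tell them apart.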

Sample A is not dispersed at all: every value is concentrated at a single point.

Sample B is the most dispersed of the three samples. Four of its seven values (10, 10, 4, 4) lie far from the midpoint, so distant values make up a significant share of the sample.

Sample C has moderate dispersal: most of its values sit exactly at the midpoint, and the remaining two (8 and 6) are close to it.
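One way to put numbers on this ordering is sketched below. It assumes two common dispersion measures that this section has not defined: the range (largest value minus smallest) and the population standard deviation (`statistics.pstdev`). They are used here only to illustrate the comparison; other measures would rank the samples the same way.

```python
from statistics import pstdev

# The three samples from the text.
samples = {
    "A": [7, 7, 7, 7, 7, 7, 7],
    "B": [10, 10, 7, 7, 7, 4, 4],
    "C": [8, 7, 7, 7, 7, 7, 6],
}

# Assumed dispersion measures: range and population standard deviation.
dispersion = {
    name: {"range": max(values) - min(values), "pstdev": pstdev(values)}
    for name, values in samples.items()
}

for name, d in dispersion.items():
    print(f"Sample {name}: range={d['range']}, pstdev={d['pstdev']:.3f}")
```

Under both measures, Sample A has zero dispersion, Sample C has a small amount, and Sample B has the most.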

We will look at additional examples:

The opposite situation is also possible: samples can have different midpoints but identical levels of dispersal. We will demonstrate this using continuous variables:

Suppose we sample the heights (in centimeters) of the residents of a certain city. 100 people were sampled, and the following results were obtained:

| Height (cm) | Frequency | Relative Frequency | Interval Width | Density |
| --- | --- | --- | --- | --- |
| 140-150 | 10 | 10% | 10 | 1 |
| 150-160 | 20 | 20% | 10 | 2 |
| 160-170 | 40 | 40% | 10 | 4 |
| 170-180 | 20 | 20% | 10 | 2 |
| 180-190 | 10 | 10% | 10 | 1 |
| Total | 100 | 100% | | |
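The derived columns of the height table can be recomputed from the frequencies alone. The sketch below assumes that density means the relative frequency (in percent) divided by the interval width; the text does not state this definition, but it is consistent with the tabulated values. The variable names are illustrative.

```python
# Height intervals (cm) and their frequencies, copied from the table.
intervals = [(140, 150), (150, 160), (160, 170), (170, 180), (180, 190)]
frequencies = [10, 20, 40, 20, 10]

total = sum(frequencies)  # sample size

rows = []
for (low, high), freq in zip(intervals, frequencies):
    width = high - low
    rel_freq_pct = 100 * freq / total   # relative frequency, in percent
    density = rel_freq_pct / width      # assumed definition: percent per unit of width
    rows.append((low, high, freq, rel_freq_pct, width, density))

for low, high, freq, rel, width, density in rows:
    print(f"{low}-{high}: freq={freq}, rel={rel:.0f}%, width={width}, density={density:g}")
```

Because density is built from the relative frequency rather than the raw count, it does not depend on the sample size.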

In another city, we sample the weights (in kilograms) of the residents. This time 500 people were sampled. The frequency table is as follows:

| Weight (kg) | Frequency | Relative Frequency | Interval Width | Density |
| --- | --- | --- | --- | --- |
| 50-60 | 50 | 10% | 10 | 1 |
| 60-70 | 100 | 20% | 10 | 2 |
| 70-80 | 200 | 40% | 10 | 4 |
| 80-90 | 100 | 20% | 10 | 2 |
| 90-100 | 50 | 10% | 10 | 1 |
| Total | 500 | 100% | | |

We will examine the histograms of the two samples:

[Figure: histograms of the height sample and the weight sample]

It is easy to see that the dispersal is identical, but the values around which the samples are dispersed are different: 165 cm in the sample of heights, and 75 kg in the sample of weights.
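The two central values can be recovered from the frequency tables. The sketch below assumes the usual grouped-data approximation, in which every observation in an interval is treated as if it sat at the interval's midpoint; the helper `grouped_summary` is a hypothetical name introduced for illustration only.

```python
def grouped_summary(lower_bounds, frequencies, width=10):
    """Return (approximate mean, relative frequencies) for grouped data.

    Assumes every interval has the same width and that each observation
    lies at the midpoint of its interval.
    """
    total = sum(frequencies)
    midpoints = [low + width / 2 for low in lower_bounds]
    rel_freqs = [f / total for f in frequencies]
    center = sum(m * f for m, f in zip(midpoints, frequencies)) / total
    return center, rel_freqs


# Heights (cm), 100 people; weights (kg), 500 people.
height_center, height_shape = grouped_summary(
    [140, 150, 160, 170, 180], [10, 20, 40, 20, 10]
)
weight_center, weight_shape = grouped_summary(
    [50, 60, 70, 80, 90], [50, 100, 200, 100, 50]
)

print(height_center, weight_center)
print(height_shape == weight_shape)
```

The relative frequencies of the two samples match interval by interval, which is why the histograms have the same shape even though the sample sizes and the centers differ.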