Confused with histograms

Hi!

I'm having trouble with histograms! I did stats for A level, and as I remember, histograms are used to represent continuous variables. You get the data, choose some class widths, find the frequencies, calculate the frequency densities, and draw your histogram; the X-axis has the units of the variable and the Y-axis has frequency density - right?

Now that I've come to uni, we're recapping stats. We get shown some lectures and are told to use STATA, a statistics program.
It produces histograms with only frequency up the side, and the units across the bottom.

This means that the areas of the bars no longer equal the frequency! I mean, obviously the class widths are all the same, as they have to be, but surely this defeats the point of a histogram? It's like a bar chart, as the heights of the bars are now the frequencies!

But if my stats lecturer is telling us this, and STATA is producing histograms like this, surely it can't be wrong! What type of histograms are these?

Attached is the picture from one of our online lectures, which is meant to show that the histogram on the right (100 samples, with 326 people in each sample) has a reduced spread compared to the one on the left (100 samples, with only 47 people in each). If you add up the heights of all the bars, they come to 100 on both - like a bar chart. I'm confused!

I would regard what STATA and your lecturer are telling you as the usual
type of histogram, where the area equals the number of cases. However,
I have also seen the density estimate type of histogram. In fact,
the Wikipedia page on histograms gives examples of both.

ZB

Thanks for the reply! But if you look at the two attached histograms, which were created by STATA, it's not the area but the height of the bars that equals the frequency. Yet the lecturer and all the A-level books I've read say that in a histogram it's the area that equals the frequency. Sorry if I'm not making sense! So from what I've seen there are (at least) two types of histogram: one with frequency on the Y-axis and one with frequency density on the Y-axis, both with the variable on the X-axis. Sorry, but I'm still a bit confused.

I was using the term area rather more loosely than I should have; in this
form the scale should be the count in each bin. It's the total over all the
bins that should equal the number of cases.

From what I understand, a histogram is very like a bar chart. The essential distinguishing feature of a histogram is that it is used for continuous data. Provided the classes (bins) are all the same width, then so far as the overall shape of the histogram is concerned, it doesn't matter whether you use frequency or frequency density on the y-axis. That's because, for a constant width, the frequency is proportional to the frequency density. Although the frequency will not equal the area, it is proportional to the area. That said, you should use frequency density to scale the y-axis correctly. If the class width is not constant, then you must use frequency density.
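To make the proportionality argument concrete, here is a minimal Python sketch. It is only an illustration: the data values, the class boundaries, and the helper names `bin_counts` and `frequency_density` are all made up for this example and have nothing to do with the lecture data or with STATA. It computes frequency and frequency density (frequency divided by class width), first for equal classes and then for unequal ones.

```python
from bisect import bisect_right

def bin_counts(data, edges):
    """Count values per class. Classes are [edges[i], edges[i+1]);
    the last class also includes its right-hand boundary."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # ignore values outside the overall range
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

def frequency_density(counts, edges):
    """Frequency density = frequency / class width."""
    widths = [edges[i + 1] - edges[i] for i in range(len(counts))]
    return [c / w for c, w in zip(counts, widths)]

# Hypothetical data, invented purely for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 12, 18]

# Equal class widths: frequency is a constant multiple (the width, 5)
# of frequency density, so both charts have the same shape.
equal_edges = [0, 5, 10, 15, 20]
eq_counts = bin_counts(data, equal_edges)
eq_density = frequency_density(eq_counts, equal_edges)

# Unequal class widths: the last two classes are merged into one.
# Bar heights must now be densities for the areas to equal the counts.
unequal_edges = [0, 5, 10, 20]
un_counts = bin_counts(data, unequal_edges)
un_density = frequency_density(un_counts, unequal_edges)
un_areas = [d * (unequal_edges[i + 1] - unequal_edges[i])
            for i, d in enumerate(un_density)]

print("equal widths:   counts", eq_counts, "densities", eq_density)
print("unequal widths: counts", un_counts, "densities", un_density)
print("unequal widths: areas ", un_areas, "total", sum(un_areas))
```

With equal widths, plotting the counts or the densities gives bars in the same proportions, which is why a frequency axis does no harm there. With the merged class, the count (2) would make the last bar look half as tall as the first two, while the density shows it a quarter as tall; only the density version keeps area equal to frequency.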

Are there two types of histogram? Strictly, I think not. A true histogram uses frequency density. The area is the frequency.