Covid Death Rates: Is the data correct?

[This article was first published on sweissblaug, and kindly contributed to R-bloggers].

Getting correct data on covid-19 cases is important for obtaining up-to-date information on how the disease is progressing. It’s also necessary for creating models and making accurate forecasts.

However, I’m starting to think most of the charts we see for covid deaths over time are incorrect. The reason is that the datasets with aggregate statistics by state are built by scraping data and looking at the changes in ‘headline’ numbers: the running total published each day. Scraping the data this way only captures when new deaths were reported, not when they happened.
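To make the distinction concrete, here is a minimal sketch in Python. The records are made up purely for illustration (they are not real Colorado data), and the helper names `headline_totals`, `daily_changes` and `by_death_date` are my own. It compares the daily counts a scraper would derive from changes in the headline total with the counts by actual date of death.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (date of death, date the death was reported).
# These values are invented for illustration only.
records = [
    (date(2020, 4, 1), date(2020, 4, 3)),
    (date(2020, 4, 1), date(2020, 4, 6)),
    (date(2020, 4, 2), date(2020, 4, 3)),
    (date(2020, 4, 2), date(2020, 4, 6)),
    (date(2020, 4, 3), date(2020, 4, 6)),
]

def headline_totals(records, days):
    """Running total a scraper would see published on each day."""
    return [sum(1 for _, reported in records if reported <= day) for day in days]

def daily_changes(totals):
    """Day-over-day change in the headline total (what scraped charts plot)."""
    return [totals[0]] + [b - a for a, b in zip(totals, totals[1:])]

def by_death_date(records, days):
    """Daily counts keyed on when the deaths actually happened."""
    counts = Counter(died for died, _ in records)
    return [counts[day] for day in days]

days = [date(2020, 4, d) for d in range(1, 7)]
scraped = daily_changes(headline_totals(records, days))
actual = by_death_date(records, days)

for day, s, a in zip(days, scraped, actual):
    print(day.isoformat(), "scraped:", s, "by date of death:", a)
```

Both series add up to the same total, but the scraped series pushes deaths later in time and bunches them on the days when reports came in.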

The official data can be found on the covid19 Colorado website and looks like a smooth, somewhat symmetric curve. Below is a chart comparing the two datasets.

This appears to be a known issue, but the people publishing these charts still seem fine with putting out incorrect ones. This can lead to erroneous conclusions, both for people looking at the charts and for anyone trying to do an analysis with them.