I have 4 years of road kill data for one road, though each year covers a different sample time frame (e.g., May-June vs. April-October) and therefore has a different number of surveys and a different total of road kills. I would like to compare kernel density rasters to find the worst kill locations when the 4 years are combined, but I do not want to bias the result toward the year with the greatest number of kills. (I am assuming kill locations in years with fewer samples are representative.) How can I normalize the rasters so I can add them and find the greatest kill hot spots on the road?

1 Answer

Apart from edge effects, the integral of (that is, the volume beneath) a kernel density surface is supposed to equal the total count of the data it represents, so its values are in units of count per area. You want count per area per unit time. Evidently, that is achieved by dividing each raster by the amount of time it represents.
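
In code, that normalization is just an elementwise division followed by a cell-by-cell sum. Here is a minimal sketch in Python with NumPy; the arrays, years, and survey durations are hypothetical stand-ins for rasters you would load from your GIS (e.g., with rasterio):

```python
import numpy as np

# Hypothetical per-year kernel density rasters (counts per unit area),
# here as random NumPy arrays; in practice these would be loaded from
# GeoTIFFs. Shapes and values are purely illustrative.
rng = np.random.default_rng(0)
density_by_year = {
    2019: rng.random((100, 100)) * 5,   # kills per km^2
    2020: rng.random((100, 100)) * 8,
    2021: rng.random((100, 100)) * 3,
    2022: rng.random((100, 100)) * 6,
}

# Survey effort for each year: here, the number of days covered by the
# survey window (May-June vs. April-October, etc.).
survey_days = {2019: 61, 2020: 214, 2021: 61, 2022: 153}

# Convert each raster from count/area to count/area/day, then sum.
# The result estimates a combined daily roadkill rate surface that is
# no longer biased toward years with longer survey windows.
combined_rate = sum(
    density_by_year[year] / survey_days[year] for year in density_by_year
)

# Cells with the highest rates are the candidate hot spots.
hot_threshold = np.percentile(combined_rate, 99)
hotspots = combined_rate >= hot_threshold
print(f"{hotspots.sum()} cells at or above the 99th percentile rate")
```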

There are many other considerations, because the question is essentially about whether the combined raster is a biased estimator of the actual roadkill rates. A good approach is to ask, "what is the probability that a particular roadkill will be detected by some survey?" That probability depends on how long roadkill typically remains visible on the road (which can vary among surveys with weather and time of year) and may also vary from year to year when different numbers of surveys are made (at non-random times) each year. Several estimators, including the Horvitz-Thompson and Hansen-Hurwitz estimators, use these probabilities to adjust the sums so that they are unbiased estimates of the total roadkill during the study period. The adjustments are simple, often involving nothing more complicated than dividing observed counts by their detection probabilities. Read about them in Steven K. Thompson's book, Sampling, a good practical (and theoretical) resource for making estimates from the kinds of observational data ecologists collect.
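
To make that adjustment concrete, here is a small sketch of the Horvitz-Thompson idea: each observed count is divided by the probability that such a kill would be detected at all. The counts and probabilities below are invented for illustration; real values would come from carcass-persistence trials or your survey design:

```python
# Hypothetical observed kills and detection probabilities per year.
observed_counts = [12, 47, 9, 30]          # kills recorded in each year
detection_prob = [0.35, 0.60, 0.35, 0.50]  # P(a kill is seen by some survey)

# Dividing each observed count by its detection probability yields an
# (approximately) unbiased estimate of the true total for that year.
adjusted = [n / p for n, p in zip(observed_counts, detection_prob)]

for year, est in zip([2019, 2020, 2021, 2022], adjusted):
    print(f"{year}: estimated true kills ~ {est:.0f}")

print(f"Estimated total over the study period: {sum(adjusted):.0f}")
```

The same per-probability division can be applied cell by cell before summing the rasters, so that years with sparser surveys (and lower detection probabilities) are scaled up rather than underweighted.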