Dear All,
I assume there is an obvious answer to my question, but I’m not entirely sure myself after reading up on it.
I have a panel dataset with a time dimension, indicated by the variable “date”, and two cross-sectional dimensions: countries, identified by “country_id”, and country-specific financial assets, identified by “asset_id”. My total number of observations is therefore (#assets per country)*(#countries)*T. Each asset has a time-varying market capitalisation, stored in “mcap”. I want to compute market-capitalisation-weighted statistics of a number of variables by country and date, for instance the mean return of the assets in country j at date k. Would it be correct to use:
-collapse (mean) return [pweight = mcap], by(country_id date)-
The background to this question is that I found -pweight- defined as w_i in the following equation for the weighted mean
ybar_w = (1/W) * Sum_{i=1}^N (w_i * y_i)
where W = Sum_{i=1}^N w_i (see Cameron/Trivedi 2009, Microeconometrics Using Stata). If, in combination with -by(country_id date)-, N is the number of assets within each country at each date, then the command above should be correct, I think. But if N is the overall number of assets (i.e. number of assets per country * number of countries), the weighting would not yield the by-country statistics I am looking for.
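One way I could check this myself is to build the weighted mean by hand within each country-date cell and compare it with the -collapse- output. Below is a minimal sketch, assuming the variables are named as described above; wy, w and manual_mean are names I made up for the illustration.

```stata
* Manual cross-check: weighted mean of return within each country-date cell.
* Assumes return, mcap, country_id and date exist as described above;
* wy, w and manual_mean are hypothetical helper names.
preserve

* Keep only observations where both the weight and the return are observed
generate double wy = mcap * return if !missing(mcap, return)
generate double w  = mcap          if !missing(mcap, return)

* Sum numerator and denominator separately within each cell
collapse (sum) wy w, by(country_id date)

* ybar_w = Sum(w_i * y_i) / Sum(w_i), with the sums running over the cell only
generate double manual_mean = wy / w

list country_id date manual_mean
restore
```

If manual_mean agrees with the mean from the -collapse- command with [pweight = mcap] for every country and date, that would suggest N is the number of assets within the cell rather than in the whole sample.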
Thanks for your advice,
Best
Julian
___________________________________
Julian Schumacher
schumacher@hertie-school.org