I would like some help with coding and generating a faultline variable in Stata

17 May 2019, 11:35

Hello everyone,

I'm a student and I'm currently having some issues with coding and generating the variables for my research. My research is in the area of Strategic Management and is about the moderating effect of CEO-TMT characteristics on the relationship between TMT faultlines and R&D spending.

I have collected the following data on TMT members:

- executive age
- executive gender
- executive tenure
- executive tenure on the top management team
- type of degree of executive

This is the first time I am using Stata to generate variables, so I don't know much about it.

I want to see whether any faultlines exist within the TMT, based on four characteristics: age, gender, tenure and education.

I would like to generate a faultlines variable using the following formula:

n = number of group members (in my dataset this is "TMT size")

p = number of characteristics (in my dataset: age, gender, tenure and education)

S = total number of possible splits into two subgroups = 2^(n-1) - 1

x_ijk = the value of the jth characteristic of the ith member of subgroup k

x̄_j = the overall group mean of characteristic j

x̄_jk = the mean of characteristic j in subgroup k

g = index of a possible split, g = 1, 2, …, S

n^g_k = the number of members of the kth subgroup (k = 1, 2) under split g

Fau = the maximum value of Fau_g over all possible splits g = 1, 2, …, S
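
The symbols defined above match the faultline strength measure of Thatcher et al. (2003): between-subgroup variation divided by total variation, maximised over splits. Assuming that is the formula intended here, it can be written as:

```latex
\[
Fau_g = \frac{\sum_{j=1}^{p} \sum_{k=1}^{2} n_k^g \left(\bar{x}_{jk} - \bar{x}_{j}\right)^2}
             {\sum_{j=1}^{p} \sum_{k=1}^{2} \sum_{i=1}^{n_k^g} \left(x_{ijk} - \bar{x}_{j}\right)^2},
\qquad g = 1, 2, \ldots, S
\]
\[
Fau = \max_{g} Fau_g
\]
```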

Welcome to Statalist. You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Look at postings on the Blau index; they have a similar structure.

You can do the means using bysort x: egen meanz=mean(z). Create the squares and then you can do totals with egen and rowtotal.
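
As a rough illustration of that recipe (means, then squares, then totals), here is a minimal sketch for one candidate split. The variable names firm_id, member, sub, age and female, and the toy values, are assumptions for illustration only; sub is a hypothetical variable holding the subgroup (1 or 2) of each executive under the split being evaluated.

```stata
* Toy data: one team of four executives, one candidate split (sub)
clear
input firm_id member sub age female
1 1 1 45 0
1 2 1 47 0
1 3 2 60 1
1 4 2 62 1
end

foreach v of varlist age female {
    * overall team mean and subgroup mean of each characteristic
    bysort firm_id: egen `v'_gmean = mean(`v')
    bysort firm_id sub: egen `v'_smean = mean(`v')
    * squared deviations: subgroup mean vs team mean, and member vs team mean
    generate `v'_b = (`v'_smean - `v'_gmean)^2
    generate `v'_t = (`v' - `v'_gmean)^2
}

* add across characteristics within each row, then total within each team
egen row_b = rowtotal(age_b female_b)
egen row_t = rowtotal(age_t female_t)
bysort firm_id: egen ss_between = total(row_b)
bysort firm_id: egen ss_total = total(row_t)

* between-subgroup share of total variation for this one split
generate fau_g = ss_between / ss_total
list firm_id member sub fau_g
```

Summing the squared subgroup-mean deviation over members weights each subgroup by its size, which is why no separate count variable is needed. Here the characteristics are left unscaled, so age dominates; in practice they would usually be rescaled first.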


Hello Statalist members,

I am interested in constructing faultline strength (FLS) and faultline distance (FLD) following well-known studies in faultline research: Thatcher et al. (2003), Bezrukova et al. (2009), and Hutzschenreuter and Horstkotte (2013). I have previously calculated a board diversity index using the Blau index, and now I want to calculate these faultline measures. I tried to do it in Stata, but I don't know how. Some professors recommended SAS or R for this calculation. I would be grateful if someone could guide me on how to calculate it in Stata, because I have no command of R or SAS.

For your information, this concept is the opposite of board diversity.
Here we first divide a top management team into two subgroups, group 1 and group 2 (the division may be based on gender, i.e. male vs. female, or on another attribute). We then assess how similar the members are on the other attributes (i.e. age, tenure and education). Within each group there is homogeneity: similar attributes in group 1 (male) and similar attributes in group 2 (female), while there is heterogeneity between the two groups.
If there are 4 members (n = 4), there are S = 7 possible splits (g = 1, …, 7), calculated as 2^(n-1) - 1: 2^(4-1) - 1 = 8 - 1 = 7. k indexes the subgroups (say, two subgroups, male and female), and j indexes the attributes (age, gender, tenure and education). Most of the attributes used are dichotomous. Continuous attributes are scaled by their range: tenure (number of years) is scaled by its range (maximum minus minimum) among the TMT members of the focal company, and members' share ownership is scaled by its range (maximum number of shares minus minimum number of shares).
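
A minimal Stata sketch of the split count and the range scaling described above. The variable names firm_id, member and tenure and the toy values are assumptions for illustration. Dividing by the range without first subtracting the minimum is one literal reading of "scaled by range"; subtracting the minimum as well would not change differences between members.

```stata
* Number of two-subgroup splits for a team of n members
local n = 4
display 2^(`n' - 1) - 1

* Toy data: two firms with different team sizes
clear
input firm_id member tenure
1 1 2
1 2 5
1 3 10
1 4 12
2 1 1
2 2 3
2 3 9
end

* Range of tenure within each firm's team
bysort firm_id: egen tenure_min = min(tenure)
bysort firm_id: egen tenure_max = max(tenure)

* Scale by the range; left missing when all members share the same tenure
generate tenure_s = tenure / (tenure_max - tenure_min) if tenure_max > tenure_min
list firm_id member tenure tenure_s
```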
Then we calculate faultline strength for the other attributes, e.g. age (the group is now divided on the basis of age instead of gender), and repeat the process. Finally, we add up all the faultline strength values, i.e. the aggregate group-wise values, and take the average. This value ranges between 0 and 1; the higher the value, the more homogeneous the within-subgroup structure. The FLS measure thus captures how neatly a group splits into subgroups. Furthermore, FLS is computed as the proportion of the total variance in TMT members' characteristics that is explained by the group split found by clustering algorithms (between-subgroup variance over total group variance).
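
One way to sketch the "between-subgroup variance over total group variance" idea in Stata is to enumerate every two-subgroup split of a single team and keep the largest ratio, as in Thatcher et al. (2003). This is an illustrative sketch, not the clustering procedure of the cited papers. The variable names and toy values are assumptions, the data hold one team only, attributes are left unscaled, and subgroups of size one are allowed.

```stata
* Toy data: one team of four members, two attributes
clear
input member age female
1 45 0
2 47 0
3 60 1
4 62 1
end

local n = _N
local S = 2^(`n' - 1) - 1

* Total sum of squares over all attributes (same for every split)
local sst = 0
foreach v of varlist age female {
    quietly summarize `v'
    local sst = `sst' + r(Var) * (r(N) - 1)
}

local fau = 0
local best = 0
forvalues g = 1/`S' {
    * bit (member - 1) of g picks the subgroup; member n always gets 0,
    * so each split is counted once and both subgroups are non-empty
    quietly generate byte sub = mod(floor(`g' / 2^(member - 1)), 2)
    local ssb = 0
    foreach v of varlist age female {
        quietly summarize `v'
        local grand = r(mean)
        forvalues k = 0/1 {
            quietly summarize `v' if sub == `k'
            local ssb = `ssb' + r(N) * (r(mean) - `grand')^2
        }
    }
    local fau_g = `ssb' / `sst'
    if `fau_g' > `fau' {
        local fau = `fau_g'
        local best = `g'
    }
    drop sub
}
display "Fau = " `fau' " at split " `best'
```

For a panel of many teams, the same loop would have to run team by team (for example inside a loop over firm identifiers), and the number of splits grows quickly with team size.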
So we want to calculate two things. One is faultline strength (similarity within subgroups). The other is faultline distance, defined as the average Euclidean distance between the subgroup centroids (the vectors of subgroup means of the TMT characteristics), i.e. the dissimilarity between groups, or the extent of difference between the two subgroups. The formula is

FLD = sqrt( sum over j = 1 to p of (x̄_1j - x̄_2j)^2 )

where x̄_1j is the mean of the jth characteristic in subgroup 1 and x̄_2j is the mean of that characteristic in subgroup 2. The square root is taken of the squared attribute differences between the two subgroups, summed over attributes. FLD can range from 0 to 3 (the square root of 9).
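
A minimal Stata sketch of the distance formula for one given split. The variable names age_s, tenure_s, female and sub, and the toy values, are assumptions for illustration; the attributes are taken to be already scaled, and sub is a hypothetical 0/1 subgroup indicator.

```stata
* Toy data: four members, three scaled attributes, one given split
clear
input member age_s tenure_s female sub
1 0.0 0.2 0 0
2 0.1 0.4 0 0
3 0.9 0.8 1 1
4 1.0 1.0 1 1
end

* Sum the squared differences between subgroup means over attributes
local d2 = 0
foreach v of varlist age_s tenure_s female {
    quietly summarize `v' if sub == 0
    local m0 = r(mean)
    quietly summarize `v' if sub == 1
    local d2 = `d2' + (r(mean) - `m0')^2
}

* Euclidean distance between the two subgroup centroids
local fld = sqrt(`d2')
display "FLD = " `fld'
```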
Finally, we multiply faultline strength (FLS) by the reciprocal of faultline distance (FLD), written as FLS*(1-FLD), which ranges from 0 to 1. The maximum value indicates more similarity of attributes within subgroups. Reference: Van Peteghem et al. (2018).
If you need further information, I will provide it. Could you please guide me on how to do this? I hope it will ease my work and save me time, and that many people will benefit from it.
Thank you in advance.
I look forward to your response.
Ayub