Attribution of Responsibility and Blame Regarding a Man-made Disaster: #FlintWaterCrisis¶

This work was presented at the 4th International Workshop on Social Web for Disaster Management (SWDM'16), co-located with CIKM 2016; here is the paper.

**Abstract**

Attribution of responsibility and blame is an important topic in political science: individuals tend to think of political issues in terms of questions of responsibility, and blame carries far more weight in voting behavior than credit does. Surprisingly, however, there is a paucity of studies on the attribution of responsibility and blame in the field of disaster research.

The Flint water crisis is a story of government failure at all levels. By studying microblog posts about it, we examine how citizens assign responsibility and blame for such a man-made disaster online. We form hypotheses based on social-scientific theories in disaster research and then operationalize them on unobtrusive, observational social media data. In particular, we investigate the following phenomena: the source of blame; partisan predisposition; the concerned geographies; and the contagion of complaining.
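One way to operationalize "the source of blame" is simple keyword matching over tweet texts. The sketch below is purely illustrative and is not the paper's actual method: the target categories and keyword lists are invented for the example.

```python
# Hypothetical sketch: count tweets mentioning each blame target.
# The target lists below are illustrative, not the paper's lexicon.
BLAME_TARGETS = {
    'state':   ['snyder', 'governor', 'state of michigan'],
    'federal': ['epa', 'obama', 'federal'],
    'local':   ['mayor', 'city council', 'flint officials'],
}

def blame_counts(texts):
    """Count how many tweets mention each (hypothetical) blame target."""
    counts = {target: 0 for target in BLAME_TARGETS}
    for text in texts:
        lowered = text.lower()
        for target, keywords in BLAME_TARGETS.items():
            if any(k in lowered for k in keywords):
                counts[target] += 1
    return counts

tweets = ["Snyder still won't say when he knew about #FlintWaterCrisis",
          "The EPA knew too. #FlintWaterCrisis"]
print(blame_counts(tweets))  # {'state': 1, 'federal': 1, 'local': 0}
```

A real analysis would need a much richer lexicon and disambiguation, but the counting structure stays the same.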

This paper adds to the sociology of disasters research by exploiting a new, rarely used data source (the social web), and by employing new computational methods (such as sentiment analysis and a retrospective cohort study design) on this new form of data. In this regard, this work should be seen as a first step toward drawing more challenging inferences on the sociology of disasters from "big social data".
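The sentiment analyzer itself is not reproduced in this excerpt. To illustrate the general lexicon-based idea behind such tools, a toy scorer could average word polarities; the polarity lexicon below is invented for the example and is not the paper's.

```python
# Minimal lexicon-based sentiment sketch (illustrative only): score a
# text as the mean polarity of its lexicon words, in [-1, 1].
POLARITY = {'horrific': -1, 'nightmare': -1, 'cringe': -1,
            'thank': 1, 'easy': 1, 'prevented': 1}

def sentiment(text):
    hits = [POLARITY[w] for w in text.lower().split() if w in POLARITY]
    return sum(hits) / len(hits) if hits else 0.0

print(sentiment("The Flint crisis is a horrific nightmare"))  # -1.0
```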

In [ ]:

# Read the JSON data and save it to Flint.pkl once;
# whenever we want to read the data, we read the pickle
# instead of the raw JSON files.
# This code block is here just to show how we created the pickle (.pkl) file.
import pandas as pd
import json
from glob import glob
from datetime import datetime

tw = []
for f in glob("data/TweetCollection/*.json"):
    with open(f, 'r', encoding='utf-8') as fin:
        for line in fin:
            a = json.loads(line)
            tw.append({'id': a['id_str'],
                       'created_at': datetime.strptime(a['created_at'], '%a %b %d %H:%M:%S +0000 %Y'),
                       'hashtagged': any('flintwatercrisis' in h['text'].lower()
                                         for h in a['entities']['hashtags']),
                       'screen_name': a['user']['screen_name'],
                       'location': a['user']['location'],
                       'followers': a['user']['followers_count'],
                       'verified': bool(a['user']['verified']),
                       'text': a['text']})
df = pd.DataFrame(tw).set_index('id').drop_duplicates()
# df.to_pickle('data/Flint.pkl')
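The `hashtagged` flag above can be sanity-checked on a synthetic tweet payload (a made-up dict mirroring the shape of Twitter's JSON): the check is case-insensitive and looks only at the hashtag entities, not the tweet body.

```python
# Synthetic (made-up) payload shaped like Twitter's JSON entities field.
a = {'entities': {'hashtags': [{'text': 'FlintWaterCrisis'}, {'text': 'MI'}]}}
hashtagged = any('flintwatercrisis' in h['text'].lower()
                 for h in a['entities']['hashtags'])
print(hashtagged)  # True
```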

In [1]:

import pandas as pd
import numpy as np
pd.set_option('max_colwidth', 200)
df = pd.read_pickle('../data/Flint.pkl')
from utilities.geocoder import Geocoder
gc = Geocoder('utilities/geodata/state_abbr_file', 'utilities/geodata/city_file')
df['latlon'] = df.location.str.strip().apply(gc.geocode)
from IPython.display import HTML
HTML(df.head().to_html(index=False))  # what the data looks like

Out[1]:

| created_at | followers | hashtagged | location | screen_name | text | verified | latlon |
|---|---|---|---|---|---|---|---|
| 2016-01-15 21:00:24 | 265 | True | Sugar Land, Texas | zachsciba | RT @TheDailyShow: #FlintWaterCrisis could have been prevented by an easy $100/day solution. https://t.co/4Jf7oH20EX https://t.co/7fLogvuwrx | False | (29.599580, -95.614089) |
| 2016-01-15 21:00:07 | 968 | True | None | scootey | You can thank the Republican party for this #Michigan #FlintWaterCrisis #GOP #Uniteblue https://t.co/wK7IFvkk8k | False | None |
| 2016-01-15 21:00:30 | 189 | True | s. pasadena,ca | steve1204 | RT @TheDailyShow: #FlintWaterCrisis could have been prevented by an easy $100/day solution. https://t.co/4Jf7oH20EX https://t.co/7fLogvuwrx | False | (34.112958, -118.155778) |
| 2016-01-15 21:00:09 | 8053 | True | Lansing, Michigan | ProgressMich | Snyder still won’t say when he knew about #FlintWaterCrisis. Protest with us on Tuesday to demand answers: https://t.co/aRfLc99QUy #MISOTS | False | (42.717585, -84.554916) |
| 2016-01-15 21:00:35 | 7 | True | None | marcgilbert77 | RT @TheDailyShow: #FlintWaterCrisis could have been prevented by an easy $100/day solution. https://t.co/4Jf7oH20EX https://t.co/7fLogvuwrx | | |

(The last row's `verified` and `latlon` cells were lost in extraction. The lines below are residue of a later cell's output listing top retweeted tweets with their counts; the pairing of texts and counts is only partially recoverable.)

- RT @opinionatedcxnt: Saw this on Tumblr & it made me cringe. The Flint crisis is a horrific nightmare https://t.co/j6sT5c5p3O (2354)
- (204672)
- RT @BuzzFeedVideo: People See What Flint Water Looks Like https://t.co/3fV2EZFz21 (1950)
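The `Geocoder` used in `In [1]` is project-specific code built from the state and city files. A minimal stand-in could map normalized free-text user locations to coordinates with a lookup table; the table below is illustrative, seeded with two values from the sample output.

```python
# Hypothetical minimal geocoder: exact lookup on normalized location
# strings. Coordinates taken from the sample rows above; a real geocoder
# also handles abbreviations, misspellings, and city/state variants.
CITY_LATLON = {
    'sugar land, texas': (29.599580, -95.614089),
    'lansing, michigan': (42.717585, -84.554916),
}

def geocode(location):
    """Return (lat, lon) for a known location string, else None."""
    if location is None:
        return None
    return CITY_LATLON.get(location.strip().lower())

print(geocode('Sugar Land, Texas'))
```

Unresolvable locations map to `None`, which is why some `latlon` cells above are empty.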

In [15]:

# The original dates are in UTC/GMT; convert them to EST.
# Also, as given in footnote #4, report the missing dates.
import pytz
eastern = pytz.timezone('US/Eastern')
# group tweets by day
df.created_at = df.created_at.dt.tz_localize(pytz.utc).dt.tz_convert(eastern)
# print missing date intervals in our dataset
day = df.groupby(df.created_at.dt.strftime('%m-%d'))['created_at'].count()
days = day.index.tolist()
for i in range(len(days) - 1):
    m1, d1 = days[i].split('-')
    m2, d2 = days[i + 1].split('-')
    if m1 == m2:
        if int(d1) == int(d2) - 1:
            continue
    else:
        if d2 == '01':
            continue
    print('(' + days[i] + ',' + days[i + 1] + ')', end=' ')
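The hand-rolled consecutive-day check above can also be expressed with a pandas date range: any calendar day between the first and last observed day that carries no tweets is missing. The observed days here are synthetic, for illustration only.

```python
# Alternative gap check via pandas: build the full daily range and
# subtract the observed days. Synthetic example dates.
import pandas as pd

observed = pd.to_datetime(['2016-01-15', '2016-01-16', '2016-01-19'])
full = pd.date_range(observed.min(), observed.max(), freq='D')
missing = full.difference(observed)
print([d.strftime('%m-%d') for d in missing])  # ['01-17', '01-18']
```

This avoids the month-boundary special case in the loop above, since `date_range` handles calendar arithmetic itself.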

from matplotlib import animation, font_manager
import matplotlib.pyplot as plt
from html import unescape
import os

plt.rcParams['savefig.dpi'] = 150
plt.rcParams['animation.html'] = 'html5'
fig, ax = plt.subplots(figsize=(6, 1))
ax.set_axis_off()
plt.subplots_adjust(left=0.1, right=0.9, top=0.9, bottom=0.1)
prop = font_manager.FontProperties(fname='Quivira.otf')  # 'Symbola.ttf'
text = ax.text(.5, .5, '', fontsize=11, va='center', ha='center', wrap=True, fontproperties=prop)
txt = list(g.head(30).text)  # g is a pandas DataFrame

def animate(i):
    text.set_text('(' + str(i + 1) + ') ' + unescape(txt[i]))
    return (text,)

anim = animation.FuncAnimation(fig, animate, frames=len(txt), interval=2000, blit=True)
anim.save('top30.mp4')  # matplotlib can save as mp4, but not as gif yet.
os.system("convert -delay 200 top30.mp4 top30.gif")  # ImageMagick's convert
anim  # eye candy for the presentation :-)

import matplotlib  # needed for matplotlib.style / matplotlib.rcParams below
import matplotlib.pyplot as plt

colorm = dict(boxes='lightgreen', whiskers='black', medians='black', caps='black')
# ax = compare[['followers', 'population']].plot(kind='box', patch_artist=True, showfliers=False)
boxprops = dict(linestyle='-', color='black')
matplotlib.style.use('ggplot')
plt.rcParams['axes.facecolor'] = 'w'
plt.rcParams['savefig.facecolor'] = 'w'
matplotlib.rcParams['xtick.labelsize'] = 20
matplotlib.rcParams['ytick.labelsize'] = 18
matplotlib.rcParams['axes.titlesize'] = 14
co = {'color': 'black'}
ma = {'color': 'black', 'linestyle': '-'}
plt.figure(figsize=(9, 3))
cohort = ffdf[ffdf.usent < 0].fsent            # ffdf[ffdf.fsent < 0].usent
control = ffdf[ffdf.usent > 0].fsent.dropna()  # ffdf[ffdf.fsent > 0].usent
print(cohort.mean(), control.mean())
bp = plt.boxplot([cohort, control], patch_artist=True, showfliers=False,
                 whiskerprops=co, capprops=co, medianprops=ma, boxprops=boxprops,
                 labels=['Friends of the cohort', 'Friends of the control'])
ax = plt.gca()
for patch, color in zip(bp['boxes'], ['magenta', 'lightgreen']):
    patch.set_facecolor(color)
ax.set_ylabel('Sentiment score', fontsize=22)
ax.set_ylim(-.23, .08)
# plt.yticks(np.arange(-.6, .6, .1))
ax.yaxis.grid(True, linestyle='-', which='major', color='lightgrey', alpha=0.5)
ax.get_figure().savefig('../figs/contagion-exp2.pdf', bbox_inches='tight')

-0.105396072804 -0.0749410768295

In [193]:

from scipy.stats import ks_2samp
from math import sqrt

c_a = 1.36  # coefficient c_a is 1.36 for alpha 0.05 and 1.95 for alpha 0.001
print(ks_2samp(ffdf[ffdf.fsent < 0].usent, ffdf[ffdf.fsent > 0].usent))
n1 = len(ffdf[ffdf.fsent < 0])
n2 = len(ffdf[ffdf.fsent > 0])
print('Critical value D_a (KS statistic D should be greater than this):',
      c_a * sqrt((n1 + n2) / (n1 * n2)))
# that is the case for the 95% confidence level:
# https://daithiocrualaoich.github.io/kolmogorov_smirnov/

from scipy.stats import ks_2samp
from math import sqrt

c_a = 1.36  # coefficient c_a is 1.36 for alpha 0.05 and 1.95 for alpha 0.001
print(ks_2samp(ffdf[ffdf.usent < 0].fsent, ffdf[ffdf.usent > 0].fsent))
n1 = len(ffdf[ffdf.usent < 0])
n2 = len(ffdf[ffdf.usent > 0])
print('Critical value D_a (KS statistic D should be greater than this):',
      c_a * sqrt((n1 + n2) / (n1 * n2)))
# that is the case for the 95% confidence level:
# https://daithiocrualaoich.github.io/kolmogorov_smirnov/
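The same two-sample KS procedure can be illustrated end-to-end on synthetic data: for two clearly shifted normal samples, the KS statistic exceeds the critical value D_a and the p-value falls below 0.05. The sample sizes and shift here are arbitrary and unrelated to the actual cohort data.

```python
# Self-contained illustration of the KS test used above, on synthetic
# data: a mean-shifted sample is detected at alpha = 0.05.
from math import sqrt
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.RandomState(0)          # fixed seed for reproducibility
x = rng.normal(0.0, 1.0, 500)
y = rng.normal(0.5, 1.0, 500)           # same spread, mean shifted by 0.5

stat, p = ks_2samp(x, y)
d_crit = 1.36 * sqrt((len(x) + len(y)) / (len(x) * len(y)))  # D_a at alpha=0.05
print(stat > d_crit, p < 0.05)
```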

import subprocess

# the table that went into the presentation
template = r'''\documentclass[preview]{{standalone}}
\usepackage{{booktabs}}
\usepackage[vcentering,dvips]{{geometry}}
\geometry{{total={{3.05in}}}}
\begin{{document}}
{}
\end{{document}}
'''
filename = "../figs/concerned_geo.tex"
with open(filename, 'w') as f:
    f.write(template.format(cc.to_latex()))
subprocess.call(['pdflatex', filename], cwd=r'../figs');