Category Archives: Japan Tohoku Earthquake

“Location” has become an essential component of many social media technologies. It is important not only to convey what happened, but also to reveal where it happened. Check-in technologies have been popularized by companies like Foursquare, Gowalla and Yelp, further blurring the line between content and geo-coordinates. Twitter has traditionally not emphasized the notion of “place”, but in late 2009 it announced its own geolocation feature. Users can enable geolocation on their settings page to reveal the exact location they are tweeting from. While Twitter itself does not have an interface for mapping tweet locations, making the data available through its Geotagging API allows third-party applications to access this information and map it accordingly.

This infographic was created through a painstaking process that used almost 10 different applications to generate the final result. The main application used to create the word cluster graphic was Gephi, an open source platform that lets you explore complex networked data in a visually compelling and interactive environment. However, arriving at this particular end result was complicated by various factors, one of which was the handling of Japanese characters in the analysis.

The Workflow

Step 1

The first step in this Japan Twitter project was to collect and archive the Twitter data coming out of Japan after the earthquake. For this, David Shepard, a member of the UCLA Digital Humanities Collaborative, wrote a PHP script that ran as a cron job. The script used the Twitter search API to find and filter tweets based on relevant hashtags, and saved them into our own MySQL database. The cron job ran every 3 minutes for 30 days, collecting over 650,000 tweets during this period.
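The archiving logic can be sketched as follows. This is a minimal Python illustration, not the original PHP script: it uses SQLite in place of MySQL, and the `tweets` table, its columns, and the `store_tweets` helper are all hypothetical names. The point it shows is that polling a search endpoint every few minutes is likely to return overlapping results, so keying rows on the tweet id keeps repeated polls from creating duplicates.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema; the original MySQL layout is not documented here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, user TEXT, created_at TEXT, text TEXT)"
    )


def store_tweets(conn, tweets):
    """Insert a batch of tweets and return how many were new.

    `tweets` is a list of dicts with the keys id, user, created_at, text.
    Rows whose id is already archived are silently skipped.
    """
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, user, created_at, text) "
        "VALUES (:id, :user, :created_at, :text)",
        tweets,
    )
    conn.commit()
    return conn.total_changes - before
```

A scheduler such as cron would call `store_tweets` once per polling interval with whatever the search request returned.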

Once the Twitter data was safely in our MySQL database, I queried it to generate 30 separate text files, one for each day following the earthquake. Each “day” file consisted of just the tweet text from the thousands of tweets that belonged to that day (on average, about 20,000 tweets per day).
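Splitting the archive into per-day files might look like the sketch below. It assumes each row is a `(created_at, text)` pair with the timestamp formatted as `YYYY-MM-DD HH:MM:SS` and already in the desired time zone; both the format and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from pathlib import Path


def group_text_by_day(rows):
    """Group tweet text by calendar day.

    rows: iterable of (created_at, text) pairs, where created_at is a
    'YYYY-MM-DD HH:MM:SS' string (an assumed format).
    Returns {'YYYY-MM-DD': [text, ...]}.
    """
    days = defaultdict(list)
    for created_at, text in rows:
        day = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S").date()
        days[day.isoformat()].append(text)
    return dict(days)


def write_day_files(days, directory):
    """Write one UTF-8 text file per day, one tweet per line."""
    for day, texts in sorted(days.items()):
        path = Path(directory) / f"{day}.txt"
        path.write_text("\n".join(texts) + "\n", encoding="utf-8")
```

Writing the files explicitly as UTF-8 matters here because the tweet text is mostly Japanese.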

Here, you can see the number of tweets collected on an hourly basis:

Step 2

In order to capture the range of emotions through the different phases of recovery following the disaster, I followed a methodology employed by Eiji Aramaki from Tokyo University, who took the words from an Emotion Dictionary to extract emotion patterns in a set of text files. Dr. Aramaki provided me with about 2000 of the most commonly used “emotion” words in the Japanese language, sub-divided into 10 different categories. A separate CSV file for each emotion was generated.

I then used WordSmith, an application that allows you to extract word patterns, to find occurrences of every emotion word in each “day” file. Through WordSmith’s concordance tool, I was able to run a batch process that matched each of my 10 “emotion” files against each of my 30 “day” files.
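For readers without WordSmith, the matching step can be approximated in a few lines of Python. This is a simplified stand-in, not a reproduction of WordSmith's concordance tool: it counts plain substring matches, which suits Japanese text (no spaces between words) but can overcount when a short emotion word is embedded in a longer, unrelated word. The function names are hypothetical.

```python
def count_emotion_words(day_text, emotion_words):
    """Count non-overlapping substring occurrences of each word in one day's text."""
    return {word: day_text.count(word) for word in emotion_words}


def build_count_matrix(day_texts, emotion_words):
    """Return {day: {word: count}} for every day and every emotion word.

    day_texts: {day: full text of that day's tweets}
    """
    return {
        day: count_emotion_words(text, emotion_words)
        for day, text in day_texts.items()
    }
```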

Here is a screenshot of WordSmith’s concordance function:

Step 3

The data generated from WordSmith was exported as a series of spreadsheets. These spreadsheets were combined, merged, analyzed, and recalculated to produce a single matrix of emotion words by day. While I was able to do most of the work in Excel, character encoding problems forced me to use Google Spreadsheets, mostly to generate the CSV file format that Gephi requires as an input source file (Excel lost the Japanese text on CSV export, while Google did not).

In order to create an emotion “measure” for each day, the spreadsheet generated columns that recorded how often each keyword was found on each of the 30 days, expressed per 10,000 tweets. For example, the word 悲しみ (sadness) was found 0.5 times for every 10,000 tweets on March 11th, 3.1 times on March 12th, 325 times on March 13th, and so on.
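The per-10,000 normalization is simple arithmetic, sketched here in Python. The function name and the sample figures in the usage note are hypothetical, chosen only to illustrate the calculation.

```python
def rate_per_10k(word_count, tweets_in_day):
    """Occurrences of a word per 10,000 tweets on a given day."""
    if tweets_in_day <= 0:
        raise ValueError("tweets_in_day must be positive")
    return word_count * 10_000 / tweets_in_day
```

For instance, a word found once on a day with 20,000 tweets gives `rate_per_10k(1, 20_000)`, which is 0.5. Normalizing this way keeps busy days from dominating simply because more tweets were collected.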

Step 4

The heart of the word cluster analysis was conducted in Gephi. Gephi requires you to define your data in two basic elements: Nodes and Edges. For this analysis, I chose to define these as follows:

Nodes: Every emotion word and every day was defined as a Gephi node.

Edges: Every connection between a “word” and a “day” was defined as an edge, and weighted by how many times that word was found for every 10,000 tweets, for each day.
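A sketch of how such node and edge tables could be written for Gephi's spreadsheet import, which recognizes `Id` and `Label` columns for nodes and `Source`, `Target`, `Type` and `Weight` columns for edges. The `Kind` column, the function name, and the shape of the `rates` input are assumptions made for this example; the original files were produced in a spreadsheet.

```python
import csv


def write_gephi_csv(rates, nodes_file, edges_file):
    """Write node and edge tables from {day: {word: rate_per_10k}}.

    Only word/day pairs with a positive rate become edges, and only
    days and words that appear in at least one edge become nodes.
    """
    nodes = csv.writer(nodes_file)
    edges = csv.writer(edges_file)
    nodes.writerow(["Id", "Label", "Kind"])
    edges.writerow(["Source", "Target", "Type", "Weight"])
    seen = set()
    for day, word_rates in rates.items():
        for word, rate in word_rates.items():
            if rate <= 0:
                continue
            for node_id, kind in ((day, "day"), (word, "word")):
                if node_id not in seen:
                    seen.add(node_id)
                    nodes.writerow([node_id, node_id, kind])
            edges.writerow([word, day, "Undirected", rate])
```

When writing to disk, opening both files with `open(path, "w", encoding="utf-8", newline="")` keeps the Japanese labels intact.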

Here is a screen shot of Gephi’s data view:

Once the data elements are defined, Gephi is ready to visualize (i.e., the fun part!). Gephi comes with many layout templates that you can choose from. Each layout has its own built-in algorithms that take the nodes and edges from your data to generate a network diagram. I chose to use a layout called “Parallel Force Atlas” (it sure sounds good). You can choose to size and/or color each node by different data attributes, and do the same for the edges, which serve as the connectors between the nodes. You then press a button, configure a few parameters (such as “gravity”), and voilà! You are presented with a beautiful infographic.

Step 5

What I thought would be an easy step, exporting the graphic and creating a web viewer for panning and zooming the huge image, turned out to be a much bigger task than I anticipated. First of all, the Gephi exporters failed to export the Japanese characters, with one exception: for some reason, SVG was the only export format that allowed the Japanese characters to survive. For the web interface, I ended up choosing a viewer built on the OpenLayers JavaScript API, which is predominantly used for geo-spatial data visualizations but can also be used with plain images. In order to get the image ready for OpenLayers, I used MapTiler, an application that generates the image “tiles” needed for the different zoom levels. You can see a full screen version of the final infographic here.
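To give a sense of why tiling is needed, the sketch below computes a generic power-of-two tile pyramid for a large image: each zoom level halves the resolution of the one above it, and every level is cut into fixed-size tiles. This is a common scheme and an assumption on my part, not a description of MapTiler's exact output; the 256-pixel tile size and the function name are also assumed.

```python
def tile_pyramid(width, height, tile=256):
    """Return [(zoom, tiles_x, tiles_y)] from most zoomed-out to full resolution."""
    longest = max(width, height)
    max_zoom = 0
    # Find the smallest zoom at which the full-size image fits the tile grid.
    while tile * 2 ** max_zoom < longest:
        max_zoom += 1
    levels = []
    for zoom in range(max_zoom + 1):
        factor = 2 ** (max_zoom - zoom)   # downscaling applied at this zoom
        scaled_w = -(-width // factor)    # ceiling division
        scaled_h = -(-height // factor)
        levels.append((zoom, -(-scaled_w // tile), -(-scaled_h // tile)))
    return levels
```

Under these assumptions an 8192 x 4096 pixel export needs six zoom levels, from a single tile at zoom 0 to a 32 x 16 grid at full resolution.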

What are they tweeting about?

One key feature of social media is that it provides a snapshot of a moment’s mood, reflected in what people are tweeting about in real time. In order to analyze the emotional and psychological state of the nation in the days after the disaster, I have taken the tweet text in the UCLA archive and divided it into 30 text files, one for each day following the earthquake, starting on March 11, 2011. To measure day-to-day fluctuations in emotion, I use a methodology similar to the one employed by Eiji Aramaki, PhD (Tokyo University), which takes words from an “Emotion Dictionary” (感情表現辞典) and matches them against the tweet content. The dictionary classifies different emotions into 10 groups:

喜び – Happiness

怒る – Anger

哀しい – Sad

怖い – Fear

恥 – Shame

好き – Like

厭 – Unpleasant

昻 – Nervous

安 – Relief

驚く – Surprise

In order to visualize the relationship between the emotion keywords and the different days following the earthquake, a visualization was generated using Gephi. The words are color coded by emotion type, and the line thickness of each connector represents the strength of the connection between the word and the day.

Sinsai.info database
Courtesy of Makoto Inoue, administrator of the sinsai.info Ushahidi website, this database includes the official incident data of more than 20,000 reports curated and posted by hundreds of volunteers. More than 80% of the reports came from Twitter.

UCLA’s Twitter Archive

UCLA’s archive was collected over a 30 day period, from March 10 – April 11, via a cron job that queried Twitter’s search API every 3 minutes to collect relevant tweets. The tweets were subsequently saved on UCLA’s own database server. The archive holds more than 650,000 records, a small portion of the 700 million total tweets reported for the same period, but it nevertheless represents an accurate sampling of the sentiment presented by the social web during this time. It should be noted that the tweets were filtered by user location, focusing only on users based in Japan.

Here’s a look at the raw numbers:

666,552 Total number of tweets collected

232,914 Distinct users

558,040 Retweets (with the word “RT” in the text)

186,697 Distinct tweets

These numbers reveal some interesting Twitter usage statistics:

2.86 Average number of tweets per user during this 30 day period

84% Percentage of tweets that were “retweets”
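Both derived figures follow directly from the raw counts above, as this short Python check shows. The `is_retweet` helper is a hypothetical illustration of the “RT” rule; the exact matching used on the archive is not documented here.

```python
import re

total_tweets = 666_552
distinct_users = 232_914
retweets = 558_040

tweets_per_user = total_tweets / distinct_users
retweet_share = retweets / total_tweets

print(f"{tweets_per_user:.2f} tweets per user")  # 2.86 tweets per user
print(f"{retweet_share:.0%} retweets")           # 84% retweets


def is_retweet(text):
    """Treat a tweet as a retweet if it contains 'RT' as a standalone word."""
    return re.search(r"\bRT\b", text) is not None
```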

The following chart shows a temporal display of the number of tweets per hour:

It is interesting to note that the highest number of tweets per hour comes about a month after the earthquake, on April 7th at 11:32pm. This is likely due to the second largest aftershock, which shook Japan at magnitude 7.1 (there was actually a 7.9 earthquake that followed 30 minutes after the main 9.0 earthquake on March 11th). Coming at a time when the psychological, emotional and physical state of the nation was still frayed, the spike reflects the lingering fear and distress of the population, expressed through tweets like these:

What’s going on? Why are we made to suffer so much? Haven’t you shaken us enough? What have we done to deserve this?

Where are the users from?

One of the criteria of the data collection was to keep only tweets whose authors included a location in their user profile. Because of this, we are able to map the location of the users in this sample set during the 30 day period following the earthquake. Many users had the same location in their profile, so there were only 14,607 distinct locations (out of a total of 666,507 tweets). The following are the top 10 most “popular” user profile locations. The location with the most users was in Shinjuku, Tokyo, with 24,169 users:

| Rank | Location | Count |
| --- | --- | --- |
| 1 | 東京都新宿区市谷本村町5-1 | 24169 |
| 2 | 東京都千代田区大手町 | 16346 |
| 3 | 東京都渋谷区神南２−２−１ | 14981 |
| 4 | 島根県松江市 | 14857 |
| 5 | 東京都千代田区霞が関 中央合同庁舎５号館 | 13913 |
| 6 | 渋谷区, 東京都 JP | 9297 |
| 7 | 東京都新宿区(Tokyo Shinjuku) | 7845 |
| 8 | 東京都千代田区霞が関 | 7563 |
| 9 | 仙台市, 宮城県 JP | 7450 |
| 10 | Tokyo ときどき Kyoto | 7311 |

Out of the top 10 locations, only 3 are located outside of Tokyo. Number 4, oddly, is Matsue in Shimane Prefecture. Number 9 is Sendai, Miyagi, in the region most devastated by the tsunami. Number 10 is “Tokyo, sometimes Kyoto”.
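A ranking like this can be produced by tallying the free-text profile locations, as in the Python sketch below. The function name is hypothetical, and the tally deliberately uses exact string matches, which is why differently written versions of the same place (for example, two spellings of a Shinjuku or Kasumigaseki address) appear as separate rows.

```python
from collections import Counter


def top_locations(profile_locations, n=10):
    """Return the n most common non-empty location strings with their counts."""
    cleaned = (
        loc.strip() for loc in profile_locations if loc and loc.strip()
    )
    return Counter(cleaned).most_common(n)
```

Merging spelling variants would require geocoding or manual normalization, which this sketch does not attempt.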

This exploratory paper looks at how location technologies were not effectively utilized during the Japan Earthquake, despite their availability through social media and mobile devices. It also looks at how geo-enabling might be used to monitor future disaster relief efforts.

Part 1: How Twitter was used after the Earthquake

For many of us, the moments during and after March 11th, 2011 were both harrowing and unreal, as we saw the horrors of the Japan Earthquake and Tsunami unfold. Those of us who were not physically in Japan were forced to look upon the disaster in despair, helpless to provide any immediate assistance. What made this disaster closer to us, in some ways even personal to the global audience, was the abundance of social media streams that allowed the world to feel the pain, see user generated media content, and listen to what was going on… in real time.

My uncle’s house is underwater because of the tsunami. He is stuck on the second floor. Please save him. Ishinomaki-shi 3-2-26 #j_j_helpme

On March 11th, 2011, the tweet shown above was seen on Twitter. It was a plea for help from a woman trying to save her uncle, trapped on the second floor of his house in an area of Ishinomaki flooded by the tsunami. She added the hashtag #j_j_helpme, which was designated for people seeking help in the aftermath of the earthquake. Her plea for help was retweeted, over and over again. She even left an address that allowed us to locate her uncle. Looking at the location on a map, sure enough, we find that her uncle’s house was located in one of the hardest hit residential areas inundated by the tsunami.

Uncle’s house is in the tsunami flood zone

While it is unclear whether her tweet actually mobilized relief agencies to save her uncle, we are able to follow her thread by “following” her Twitter account, and find out that just a few days later, she posted the following message:

“My uncle was rescued! Thank you everybody! I pray that others will be rescued as well!!!”

The power of the social web

It was moments like these, following stories via the social web, that enabled many of us from around the world to experience what was happening on the ground, as if we were there. In some ways, the spatial boundaries were bridged through the power of social media. The social fabric of the nation quickly came to revolve around Twitter as the primary mode of communication: requesting medical aid and assistance, seeking information about missing people, sending encouragement, and reporting damage and the status of transportation infrastructure. While Twitter was used predominantly to talk about entertainment and anime before the earthquake, it quickly morphed into something entirely different on the day of the disaster, when 72% of the topics were related to the earthquake, and another 8% were about transportation.

Before and after Twitter topics

In some ways, Twitter became the virtual bulletin board for exchanging valuable information, disseminating it to the public, and utilizing the social networks to “spread the word” quickly and effectively. For March 11 alone, 33 million tweets were reported in Japan, almost double the average daily usage. Over the next 30 days, more than 700 million tweets were reported. Out of a total population of 128 million, that is a lot of tweets, even when you take into account the fact that most users tweet multiple times.

The power of the “re”tweet

Part of the intrigue and power of the social web lies in its ability to transmit data through a multitude of networks that grows exponentially the more “popular” the information is. In Twitter, this is accomplished through retweeting: the simple act of sharing a tweet with others in your network, who then retweet it again, until a single tweet reaches a massive audience, sometimes in a matter of hours.

In the case of the tweets related to the earthquake, retweeting was used effectively to communicate infrastructure damage and missing person notices, and even to announce relevant hashtags. Here are some examples of tweets that were retweeted more than a thousand times:

Tweet from a hospital director in Miyagi announcing that 30 patients are near starvation, seeking food, medical equipment and fuel.

If you have crayons, leave them close to the children. Children have the ability to draw what they are unable to communicate. Do not stop them even if they draw pictures of dead bodies or violent scenes. Drawing allows them to express their feelings, and helps them cope with the situation.

People in Iwaki are dying. The media is going to Miyagi and Iwate, places that are safe to visit. Iwaki has no food or water. There is no media present. There is no gasoline, and therefore no way to leave.

“My eyes filled up with tears when I heard that my father volunteered to go to the Fukushima Nuclear Plant, even though he will be retiring in just half a year. He said that “the future of this nuclear crisis depends on what we do now, and I must go.” At home, he is not always the most reliable father…but today, I have never felt as proud of him. And I pray for his safe return.”

While most tweets were informational in nature, the most “popular” tweet in Japan following the earthquake was about courage and sacrifice. Because the effects of radiation typically take years to appear, it was the older generation that stepped up to go to the front lines, risking exposure but knowing that they had fewer years to live than their younger counterparts. In many ways, they symbolized the spirit of the Japanese people during these trying times, even prompting the Prime Minister to proclaim to these volunteers, “You are the only ones who can resolve a crisis. Retreat is unthinkable,” according to the Financial Times.

Hashtags

Just a day after the earthquake, Twitter announced a set of recommended hashtags to be used to categorize specific post-disaster situational needs:

About me

I came to Los Angeles and UCLA in 1995 after living across the globe, in 5 different countries. At UCLA I serve as the Campus GIS Coordinator and hold lecturer positions in the Digital Humanities, Urban Planning and Public Policy. With 14 years of GIS project management experience, I have supervised projects in urban planning, emergency preparedness, disaster relief, volunteerism, archaeology, and the digital humanities. Current research and projects involve the geo-spatial web, visualization of temporal and spatial data, and creating systems that leverage current social media and web services in conjunction with traditional information systems.