9 Twitter for Subjective Well-being Word counting (single or tokenized) will not be used, nor will dictionaries for sentiment analysis. Instead, a supervised learning method will be used. Humans grade the sentiment in a sample of tweets, which yields a training set. This set is then used to find similarities in order to classify the remaining and future tweets by sentiment.
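
The supervised approach on this slide can be illustrated with a minimal sketch. It is not the method the project actually implemented, which the slides do not specify. It assumes a k-nearest-neighbour classifier that compares tweets by the cosine similarity of their character trigrams, chosen here because the slide speaks of "finding similarities" and rules out word counting. The function names, the value of `k` and the example tweets are all invented for illustration.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text):
    """Return a Counter of character trigrams for a lower-cased tweet."""
    padded = f"  {text.lower().strip()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse trigram profiles."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(tweet, training_set, k=3):
    """Label a tweet by majority vote among its k most similar graded tweets."""
    profile = trigram_profile(tweet)
    ranked = sorted(
        training_set,
        key=lambda item: cosine(profile, trigram_profile(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented examples standing in for the human-graded sample (assumption).
TRAINING_SET = [
    ("I love this sunny day", "positive"),
    ("what a wonderful trip, so happy", "positive"),
    ("feeling great and happy today", "positive"),
    ("I hate this awful traffic", "negative"),
    ("so sad and tired of everything", "negative"),
    ("terrible service, very angry", "negative"),
    ("the bus leaves at nine", "neutral"),
    ("meeting moved to room four", "neutral"),
    ("the store opens at ten", "neutral"),
]

if __name__ == "__main__":
    print(classify("so happy with this wonderful day", TRAINING_SET))
```

Character trigrams are only one possible similarity feature. A production system would need a far larger graded sample and a held-out set to measure accuracy.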

10 Twitter for Subjective Well-being INEGI developed an app to classify tweets by sentiment (positive, negative or neutral) and then allocate them to different domains. Universidad Tec Milenio supports the exercise with about 3000 students, who are classifying the training set.
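
With many students grading, one practical question is how to turn several grades for the same tweet into a single training label. The slides do not say whether each tweet is graded more than once, so the sketch below rests on that assumption. It uses a simple majority vote and drops tweets on which graders do not agree. The function names and the `min_agreement` threshold are hypothetical.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def consensus_label(grades, min_agreement=0.5):
    """Return the majority label, or None when graders do not agree enough."""
    valid = [g for g in grades if g in LABELS]
    if not valid:
        return None
    label, count = Counter(valid).most_common(1)[0]
    # Require a strict majority of the valid grades (assumed rule).
    return label if count / len(valid) > min_agreement else None


def build_training_set(graded):
    """Keep only the tweets whose graders reached a consensus."""
    training = []
    for tweet, grades in graded.items():
        label = consensus_label(grades)
        if label is not None:
            training.append((tweet, label))
    return training


if __name__ == "__main__":
    # Invented grades for illustration only.
    graded = {
        "tweet A": ["positive", "positive", "negative"],
        "tweet B": ["positive", "negative"],
    }
    print(build_training_set(graded))
```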

11 Twitter for Subjective Well-being The Spanish firm Lambdoop offered, free of charge, to implement the processing software that uses supervised learning to extract the sentiment from the tweets. The Dattlas Division of the firm KioNetworks offered, free of charge, a cluster with enough capacity to carry out our pilot tests and to identify, assess and budget the hardware and software requirements.

12 Tourist Exercise

13 Twitter- Tourist Exercise Background: Collaboration with the Ministry of Tourism. 95% of Twitter users in Mexico are in the age group of 18 to % of domestic tourists are between 18 and 55 years old.

14 Twitter- Tourist Exercise Production of statistics for the tourist sector: 60 M tweets were processed from January to July. Guanajuato and Puebla, on February 1st, 2nd and 3rd. 7,955 Twitter users: 827,424 tweets produced from any other state during a 6-month period. Find out from which state people tweeted and how long they had stayed in that state. Stays of 1 to 15 consecutive days in Puebla or Guanajuato. With the data obtained, generate a map for Puebla and another for Guanajuato.
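
The stay-detection step on this slide can be sketched as follows. This is not the project's actual code. It assumes each tweet has already been reduced to a `(user, day, state)` tuple, that a stay is a run of consecutive calendar days with at least one tweet from the destination state, and that a user's state of origin is the other state they tweet from most often. The function names, the sample records and the year used in the dates are invented.

```python
from collections import Counter
from datetime import date, timedelta

DESTINATIONS = {"Puebla", "Guanajuato"}


def stays(tweets, min_days=1, max_days=15):
    """Return (user, state, first_day, last_day) for each qualifying stay."""
    # Distinct days on which each user tweeted from each destination state.
    days_by_key = {}
    for user, day, state in tweets:
        if state in DESTINATIONS:
            days_by_key.setdefault((user, state), set()).add(day)

    result = []
    for (user, state), days in sorted(days_by_key.items()):
        ordered = sorted(days)
        start = prev = ordered[0]
        # The trailing None closes the last run.
        for day in ordered[1:] + [None]:
            if day is not None and day - prev == timedelta(days=1):
                prev = day
                continue
            length = (prev - start).days + 1
            if min_days <= length <= max_days:
                result.append((user, state, start, prev))
            start = prev = day
    return result


def home_state(tweets, user):
    """Most frequent non-destination state for a user (assumed origin rule)."""
    counts = Counter(
        state for u, _, state in tweets if u == user and state not in DESTINATIONS
    )
    return counts.most_common(1)[0][0] if counts else None


# Invented records; the year is arbitrary.
SAMPLE = [
    ("u1", date(2000, 1, 20), "Jalisco"),
    ("u1", date(2000, 1, 25), "Jalisco"),
    ("u1", date(2000, 2, 1), "Puebla"),
    ("u1", date(2000, 2, 2), "Puebla"),
    ("u1", date(2000, 2, 3), "Puebla"),
    ("u1", date(2000, 2, 5), "Jalisco"),
]

if __name__ == "__main__":
    for user, state, first, last in stays(SAMPLE):
        print(user, home_state(SAMPLE, user), "->", state, first, last)
```

Counting stays by origin state would then give the figures needed for one map per destination.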

15 A. Twitter- Tourist Exercise

16 B. Twitter- Tourist Exercise

17 B. Twitter- Tourist Exercise Trips to Guanajuato or Puebla before, during and after a long weekend

20 Mobility studies Current work: using the National Roads Network to find out whether domestic mobility analysis can be conducted. (Plot of 70 M tweets)

21 Mobility studies

22 Next projects and the institutional strategy

23 Other potential projects from Twitter: consumer confidence; indicators on public safety; definition of metropolitan areas; quality of tourist services; conformation of regions.

24 Institutional steps: Framework: Data Revolution (get everyone involved), Big Data. Participate in the main national and international initiatives. Partnerships with the government, research centers and universities. Explore methodologies and data sets. Approach the private sector. Publish results as experimental data. Form an inter-disciplinary task team.
