Sarcasm Detection on Czech and English Twitter T Ptáček, I Habernal, J Hong – anthology.aclweb.org … The Ark-tweet-nlp tool (Gimpel et al., 2011) offers precisely that and although it was developed and tested in English, it yields satisfactory results in Czech as well. … Part of speech tagging was done using the Ark-tweet-nlp tool (Gimpel et al., 2011). 5 Results …

Sentiment analysis in czech social media using supervised machine learning I Habernal, T Ptáček, J Steinberger – Proceedings of the 4th Workshop …, 2013 – aclweb.org … of social media. Although Ark-tweet-nlp tool (Gimpel et al., 2011) was developed and tested in English, it yields satisfactory results in Czech as well, according to our initial experiments on the Facebook corpus. Its significant … Cited by 7

Mobile sentiment analysis L Chambers, E Tromp, M Pechenizkiy… – Proceedings of the 16th …, 2012 – eprints.port.ac.uk … Future work is to include experimentation on a wider set of Android mobile phones with differing memory and Android operating systems, and inclusion of analysis on other POS taggers implemented in Java and those specifically aimed at Twitter data such as Tweet NLP. … Cited by 3

Approaches to Automatically Constructing Polarity Lexicons for Sentiment Analysis on Social Networks VN Khuc – 2012 – rave.ohiolink.edu … 3.4.1. Co-occurrence matrix. Co-occurrence matrix A (N is the number of words) contains information about how many times the word i co-occurs with the word j. The MapReduce job for calculating the co-occurrence … (footnote 1: http://code.google.com/p/ark-tweet-nlp/)

How noisy social media text, how diffrnt social media sources T Baldwin, P Cook, M Lui, A MacKinlay… – Proceedings of the 6th …, 2013 – aclweb.org … In line with the findings of Read et al. (2012a) based on experimentation with a selection of sentence tokenisers over user-generated content, we sentence-tokenise with tokenizer.4 Finally, we tokenise and POS tag the datasets using TweetNLP 0.3 (Owoputi et al., 2013). … Cited by 13

Supervised sentiment analysis in Czech social media I Habernal, T Ptáček, J Steinberger – Information Processing & …, 2014 – Elsevier … of social media. Although Ark-tweet-nlp tool (Gimpel et al., 2011) was developed and tested in English, it yields satisfactory results in Czech as well, according to our initial experiments on the Facebook corpus. Its significant …

Mining divergent opinion trust networks through latent dirichlet allocation N Dokoohaki, M Matskin – … of the 2012 International Conference on …, 2012 – dl.acm.org … We have used Carnegie Mellon university’s TweetNLP tool set [28]. In this tool set authors propose for a tag set, annotated data and features. We used TweetNLP for tokenization and part-of-speech tagging. Figure 2 shows a … (footnote 3: TweetNLP, http://www.ark.cs.cmu.edu/TweetNLP/) Cited by 2

Part-of-speech tagging for Twitter: Word clusters and other advances O Owoputi, B O’Connor, C Dyer, K Gimpel, N Schneider – 2012 – ra.adm.cs.cmu.edu … released as TweetNLP version 0.3, along with the new annotated data and large-scale word clusters at http://www.ark.cs.cmu.edu/TweetNLP. … (footnote 6: Included with the tagger, and accessible online at https://github.com/brendano/ark-tweet-nlp/blob/master/docs/annot_guidelines.md) … Cited by 16

Automatic Domain-Specific Sentiment Lexicon Generation with Label Propagation YJ Tai, HY Kao – … of International Conference on Information Integration …, 2013 – dl.acm.org … For the goal to extract our candidate words, we use the Ark-TweetNLP [19] as our POS tagging parser. … The main difference between Ark-TweetNLP and other parsers is that Ark-TweetNLP is built from tweet corpus. They manually annotate tweet corpus and train the POS tagger. …

Semantic sentiment analysis of twitter H Saif, Y He, H Alani – The Semantic Web–ISWC 2012, 2012 – Springer … In this work, we build various NB classifiers trained using a combination of word unigrams and POS features and use them as baseline models. We extract the POS features using the TweetNLP POS tagger,9 which is trained specifically from tweets. … Cited by 47

Evaluation datasets for twitter sentiment analysis H Saif, M Fernandez, Y He, H Alani – Proceedings, 1st Workshop …, 2013 – researchgate.net … To extract the number of unigrams, we use the TweetNLP tokenizer [7], which is specifically built to work on tweets data.9 Note that we considered all tokens found in the tweets including words, numbers, URLs, emoticons, and special characters (eg, question marks, intensifiers … Cited by 2

Event Detection in Twitter using Aggressive Filtering and Hierarchical Tweet Clustering. G Ifrim, B Shi, I Brigadir – SNOW-DC@WWW, 2014 – insight-centre.org … Note the log in the denominator, allowing the current document frequency to have more weight than the previous/historical average frequency. Another important focus is on tweet NLP in order to recognize named entities. … (footnote 4: http://www.ark.cs.cmu.edu/TweetNLP/) … Cited by 2

A pipeline tweet contextualization system at inex 2013 K Ansary, AT Tran, NK Tran – 2013 – ims-sites.dei.unipd.it … be found at http://www.ark.cs.cmu.edu/TweetNLP/annot_guidelines.pdf. After tokenizing the tweet, we employed several heuristics to detect the key phrases as overlapping consecutive tokens. For example, we restricted that … Cited by 1

Data-Mining Twitter and the Autism Spectrum Disorder: A Pilot Study A Beykikhoshk, O Arandjelovic, D Phung, S Venkatesh… – deakin.edu.au … In this context it is beneficial to have different inflections of the same word normalized and represented by a single term. In linguistics this process is referred to as lemmatization and we apply it automatically using the freely available TweetNLP software package [18]. …

UT-DB: an experimental study on sentiment analysis in twitter Z Zhu, D Hiemstra, PMG Apers, A Wombacher – 2013 – eprints.eemcs.utwente.nl … So we map the words ranging from -5 to -1 in SentiStrength to negative in our grading system, and the words ranging from +1 to +5 to positive. The rest are mapped to neutral. … (footnotes: 2 http://www.ark.cs.cmu.edu/TweetNLP/; 3 http://sentistrength.wlv.ac.uk/)

Dynamic Language Models for Streaming Text D Yogatama, C Wang, BR Routledge, NA Smith… – cs.cmu.edu … We look at tweets from the period 2011-01-01 to 2012-09-30 (639 days). As a result, we have approximately 100–800 tweets per day. We tokenized the tweets using the CMU ARK TweetNLP tools,12 numerical terms are mapped to a single word, and all letters are downcased. …

Part-of-speech tagging for twitter: Annotation, features, and experiments K Gimpel, N Schneider, B O’Connor, D Das… – Proceedings of the 49th …, 2011 – dl.acm.org … only 1.7% absolute. 5 Conclusion We have developed a part-of-speech tagger for Twitter and have made our data and tools available to the research community at http://www.ark.cs.cmu.edu/TweetNLP. More generally, we … Cited by 218

Contextual Sentiment Analysis in Social Media Using High-Coverage Lexicon A Muhammad, N Wiratunga, R Lothian… – … and Development in …, 2013 – Springer … lemma. We use Stanford CoreNLP 1 pipeline for sentence split and lemmatization. However, we use TweetNLP 2 [9] for tokenization and PoS tagging because it recognises social media symbols such as emoticons. Stemming …

A Comparison of Sequential and Topic models for Named Entity Recognition on Tweets Y Chen, X Yan, W Zhang – cs.cmu.edu … 3.1 Tokenizing The tokenizer for tweets originally comes from Ark-Tweet-NLP [16], though we did minor modifications on it. Table 2 shows an example of how the tokenizer for tweets could tokenize the special strings that appear in tweets (eg the emotional icon “:)”) …

The utility of social and topical factors in anticipating repliers in twitter conversations J Schantl, R Kaiser, C Wagner… – … of the 5th Annual ACM Web …, 2013 – dl.acm.org … information. We calculate the similarity of the concept-vector of user a and the concept vector of user c using the cosine similarity which is defined as follows: … (footnotes: 4 http://www.ark.cs.cmu.edu/TweetNLP/; 5 http://dbpedia.org) Cited by 2

Learning part-of-speech taggers with inter-annotator agreement loss B Plank, D Hovy, A Søgaard – Proceedings of EACL, 2014 – cst.dk … The cost-sensitive model is able to improve performance on two out of the three test sets, while being slightly below baseline performance on the MSM challenge data. … (footnotes: 5 http://www.ark.cs.cmu.edu/TweetNLP/; 6 http://oak.dcs.shef.ac.uk/msm2013/ie_challenge/) Cited by 3

Scat: A System For Concept Annotation Of Tweets S Sachidanandan – 2014 – web2py.iiit.ac.in … semantics contained in it, especially because, we are defining the pertinent concepts based on the hashtags, phrases and proper nouns present in the tweet. We use tweet NLP pos tagger proposed in [10] to encode the tweets based on their POS tag sequences and uses …

Recurrent Chinese Restaurant Process with a Duration-based Discount for Event Identification from Twitter Q Diao, J Jiang – SIAM … Abstract: Due to the fast development of social media on the Web, Twitter has become …