"... It is well known that utterances convey a great deal of information about the speaker in addition to their semantic content. One such type of information consists of cues to the speaker’s personality traits, the most fundamental dimension of variation between humans. Recent work explores the automat ..."

It is well known that utterances convey a great deal of information about the speaker in addition to their semantic content. One such type of information consists of cues to the speaker’s personality traits, the most fundamental dimension of variation between humans. Recent work explores the automatic detection of other types of pragmatic variation in text and conversation, such as emotion, deception, speaker charisma, dominance, point of view, subjectivity, opinion and sentiment. Personality affects these other aspects of linguistic production, and thus personality recognition may be useful for these tasks, in addition to many other potential applications. However, to date, there is little work on the automatic recognition of personality traits. This article reports experimental results for recognition of all Big Five personality traits, in both conversation and text, utilising both self and observer ratings of personality. While other work reports classification results, we experiment with classification, regression and ranking models. For each model, we analyse the effect of different feature sets on accuracy. Results show that for some traits, any type of statistical model performs significantly better than the baseline, but ranking models

by
Jon Oberlander
- In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics (ACL), 2006

We report initial results on the relatively novel task of automatic classification of author personality. Using a corpus of personal weblogs, or ‘blogs’, we investigate the accuracy that can be achieved when classifying authors on four important personality traits. We explore both binary and multiple classification, using differing sets of n-gram features. Results are promising for all four traits examined.

by
Dominique Estival, Tanja Gaustad, Ben Hutchinson, Son Bao Pham, Will Radford
- In Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics, 2007

This paper reports on some aspects of a project aimed at automating the analysis of texts for the purpose of author profiling and identification. The complete analysis provides probabilities for the author’s basic demographic traits (gender, age, geographic origin, level of education and native language) as well as for five psychometric traits. We describe the email data which was collected for the project, the ways this data is processed and analysed, and the experimental setup used for classification with the Text Attribution Tool (TAT) before presenting our results for the demographic and psychometric traits using English email. Results are very promising for all ten traits examined.

...ntifying the source of the threat is the first step in countering it. In this context, author profiling forensics can be helpful to at least narrow the list of potential authors (Corney et al., 2002; Argamon et al., 2005; Abbasi and Chen, 2005). Another area where author identification and profiling can provide valuable information is in deriving marketing intelligence from the acquired profiles (Glance et al., 2005)...

It would be useful to enable dialogue agents to project, through linguistic means, their individuality or personality. Equally, each member of a pair of agents ought to adjust its language (to a greater or lesser extent) to match that of its interlocutor. We describe CRAG, which generates dialogues between pairs of agents, who are linguistically distinguishable, but able to align. CRAG-2 makes use of OPENCCG and an over-generation and ranking approach, guided by a set of language models covering both personality and alignment. We illustrate with examples of output, and briefly note results from user studies with the earlier CRAG-1, indicating how CRAG-2 will be further evaluated. Related work is discussed, along with current limitations and future directions.

by
Scott Nowson
- In Proceedings of the International Conference on Weblogs and Social, 2007

We report new results on the relatively novel task of automatic classification of blog author personality. Promisingly high classification accuracies have recently been reported for four important personality traits (Extraversion, Neuroticism, Agreeableness and Conscientiousness). But the blog corpus used in that work required careful preparation, and was consequently quite small (with less than a hundred authors; and less than half a million words). Here, we provide an initial report on the classification accuracies that can be achieved when classifiers conditioned on the small corpus are applied to a larger, automatically-acquired blog corpus, using lower-granularity personality data and substantially less manual preparation (with over a thousand bloggers, and approximately five million words). Predictably, results on the larger corpus are not as impressive as those on the smaller; nevertheless, they point the way forward for further work.

...re [11]. 2.3 Classification Perhaps the most relevant work here is the small but growing collection on the automatic classification of personality (to which this paper is an addition). Argamon et al. [1] focused on Extraversion and Neuroticism, dividing Pennebaker and King’s [28] population into just the top- and bottom-third scorers on a dimension, discarding the middle third. Employing various feat...

Conversation is an essential component of social behavior, one of the primary means by which humans express intentions, beliefs, emotions, attitudes and personality. Thus the development of systems to support natural conversational interaction has been a long term research goal. In natural conversation, humans adapt to one another across many levels of utterance production via processes variously described as linguistic style matching, entrainment, alignment, audience design, and accommodation. A number of recent studies strongly suggest that dialogue systems that adapted to the user in a similar way would be more effective. However, a major research challenge in this area is the ability to dynamically generate user-adaptive utterance variations. As part of a personality-based user adaptation framework, this article describes Personage, a highly parameterizable generator which provides a large number of parameters to support adaptation to a user’s linguistic style. We show how we can systematically apply results from psycholinguistic studies that document the linguistic reflexes of personality, in order to develop models to control Personage’s parameters, and produce utterances matching particular personality profiles. When we evaluate these outputs with human judges, the results indicate that humans perceive the personality of system utterances in the way that the system intended.

...2; Gosling et al., 2003; John et al., 1991). The other possibility is to identify relevant behavioral cues, e.g. based on the user’s interaction (Dunn et al., 2009) or the user’s speech and language (Argamon et al., 2005; Mairesse et al., 2007; Oberlander and Nowson, 2006). While personality questionnaires have a high predictive value and only need to be filled once by the user, they lack the objectivity of observer-...

We present a new corpus for computational stylometry, more specifically authorship attribution and the prediction of author personality from text. Because of the large number of authors (145), the corpus will allow previously impossible studies of variation in features considered predictive for writing style. The innovative meta-information (personality profiles of the authors) associated with these texts allows the study of personality prediction, a not yet very well researched aspect of style. In this paper, we describe the contents of the corpus and show its use in both authorship attribution and personality prediction. We focus on features that have been proven useful in the field of author recognition. Syntactic features like part-of-speech n-grams are generally accepted as not being under the author’s conscious control and therefore providing good clues for predicting gender or authorship. We want to test whether these features are helpful for personality prediction and authorship attribution on a large set of authors. Both tasks are approached as text categorization tasks. First a document representation is constructed based on feature selection from the linguistically analyzed corpus (using the Memory-Based Shallow Parser (MBSP)). These are associated with each of the 145 authors or each of the four components of the Myers-Briggs Type Indicator (Introverted-Extraverted, Sensing-iNtuitive, Thinking-Feeling, Judging-Perceiving). Authorship attribution on 145 authors achieves results around 50% accuracy. Preliminary results indicate that the first two personality dimensions can be predicted fairly accurately.

In this paper, we address the issue of how different personalities interact in Twitter. In particular we study users’ interactions using one trait of the standard model known as the “Big Five”: emotional stability. We collected a corpus of about 200000 Twitter posts and we annotated it with an unsupervised personality recognition system. This system exploits linguistic features, such as punctuation and emoticons, and statistical features, such as followers count and retweeted posts. We tested the system on a dataset annotated with personality models produced from human judgements. Network analysis shows that neurotic users post more than secure ones and have the tendency to build longer chains of interacting users. Secure users instead have more mutual connections and simpler networks.

This paper reports on the application of the Text Attribution Tool (TAT) to profiling the authors of Arabic emails. The TAT system has been developed for the purpose of language-independent author profiling and has now been trained on two email corpora, English and Arabic. We describe the overall TAT system and the Machine Learning experiments resulting in classifiers for the different author traits. Predictions for demographic and psychometric author traits show improvements over the baseline for some of the author traits with both the English and the Arabic data. Arabic presents particular challenges for NLP and this paper describes more specifically the text processing components developed to handle Arabic emails.

...ion where profiling can make a contribution. Also, author profiling forensics may be helpful in narrowing the choice of potential authors when identifying the source of a threat (Corney et al., 2002; Argamon et al., 2005; Abbasi and Chen, 2005a). Author attribution is the task of deciding for a given text which author, usually from a predefined set of authors, has written it. Historically, author identification has i...