Alessandro Vinciarelli, professor of computer science at Glasgow University, said: “Apple is a company that can make significant investments and can count on some of the best minds in the world. When they enter a market, they can significantly improve the technology. This is particularly important in the case of speech interfaces because they ‘learn’ from the users how to map speech into suitable answers or actions.”

He said the more users an AI has, the better it ‘learns’ to interpret requests – and Apple products typically attract a large number of users. Vinciarelli also pointed to evidence that Apple had managed to “penetrate our everyday life to unprecedented extents” with products including phones and tablets, although the Apple Watch has failed to catch on.

At present machines still have limitations, added Vinciarelli. “If a request is not particularly frequent – or it is expressed in a way that is ambiguous, metaphoric or it assumes implicit knowledge – the matching will not produce the expected results.” He believes the technology is likely to improve dramatically in coming years.

This is nothing new – AI products, after all, are programmed to accumulate and share data. Alessandro Vinciarelli, Professor of Computing Science at the University of Glasgow, believes that these types of products are “a little bit like a pet” in terms of their ability to learn.

“These machines – the more you use them, the better they work.”

They are constantly learning about our habits and collecting “millions upon millions of requests”.

Within the next decade, it is inevitable that more and more products will come equipped with artificial intelligence.

As Vinciarelli puts it: “More and more companies are contacting us to create AI for appliances. Vacuum cleaners, heating appliances, kitchen equipment and so on.”

It is also likely that these products will communicate with each other – creating household AI units.

Team member Prof Alessandro Vinciarelli, also of Glasgow University, said a key challenge was creating robots with the technical capability to read human emotions in noisy and crowded public spaces such as shopping malls. Creating sensors, microphones and cameras capable of picking up non-verbal signals is one of the issues that the project will grapple with, he said.

“The [robots] have to be able to understand subtle signs that go with social norms,” he added. “Is someone’s body still facing in the original direction, has there been a change in facial expression, is there a change in the speed of walking? We are looking at creating a socially-intelligent machine that can understand very subtle signs.”

Recent work in the development of robots has included creating machines capable of working with autistic children, as well as companions such as Paro, an interactive furry seal robot developed in Japan that aims to replicate the benefits of pet therapy.

So far all applications have been experimental, though Vinciarelli claims they could be mainstream within a decade.

But he admitted that there were some risks associated with the development of robots, including over-reliance and threats to privacy. “Robots should work alongside humans to support them, not to replace them,” he said. “And when machines can understand our emotions and are in our public spaces the entire concept of privacy needs to be re-defined.”

I was interviewed by Lennox Morrison (BBC Capital) about the use of Artificial Intelligence and Social Signal Processing in the management of Human Resources and, in particular, in hiring and promotion processes:

The following excerpt shows my contribution to the discussion proposed in the article:

Dr Alessandro Vinciarelli, expert on computing science, neuroscience and psychology at the University of Glasgow in the UK, says that it’s only in the last five years or so that speech analysis has come into use. In terms of accuracy, he says, “half of the cases you can deal with automatically and be confident. With the other 50%, it is better if you have an expert to go through [the findings]. Overall it is always better to have an expert. These technologies are not supposed to replace people. They are supposed to help and support professionals.”

The rest of the article reports on other opinions and, in particular, on the claims of a company that commercialises systems aimed at identifying the best employees in a company.

The University of Glasgow aims to develop a world-class research emphasis in social robotics, shared between the Institute of Neuroscience and Psychology and the School of Computing Science. Within the Institute, we aim to appoint a new member of staff who will significantly develop our research presence in this area. The appointment can be at Lecturer or Senior Lecturer level, dependent on the applicant’s credentials.

Potential candidates should perform research in social neuroscience, computational neuroscience, social cognition, grounded cognition, or a related field that bears on social interaction and social robotics. Examples of relevant research areas include facial or bodily mirroring, theory of mind (intention attribution), the perception of agency, coordinated social action, etc. Primary qualifications for the position include research excellence, together with leadership potential for moving collaborative research on social robotics forward. Commitment to social robotics in previous and current research will be weighed positively.

The candidate’s research program should align with the strategic objectives of the Centre for Social Cognitive and Affective Neuroscience (cSCAN), and should complement our existing expertise in social signal processing, interactive communication, and/or grounded cognition. The Centre has excellent research facilities, including 4-D face motion capture, whole body motion capture, a variety of eye-tracking facilities, together with state-of-the-art neuroimaging facilities for fMRI, MEG, EEG and TMS associated with the Centre for Cognitive Neuroimaging (CCNi). As part of the University of Glasgow’s Social Robotics Initiative, the successful applicant will have excellent opportunities for collaborating with computer scientists and engineers in the College of Science and Engineering, with access to their robotic facilities and resources.

On June 10th, 2014, I was the keynote speaker at UK-Speech, the UK conference on speech processing and synthesis (the slides of the talk are available at this link). The conference gathers a large number of UK-based researchers working on all aspects of speech science, including speech recognition, dialogue systems, therapy of speech pathologies, speech-based applications, etc. It was a great pleasure to meet the conference participants and the organisers, Naomi Harte and Rogier van Dalen.

Such social-signal analysis will be useful beyond call centres and meeting rooms, Vinciarelli says. Monitoring conversation in operating theatres or plane cockpits could help surgeons and pilots know when their colleagues are really paying attention to their instructions, potentially saving lives. But such pervasive recording could be invasive, says Vinciarelli. “This is recording your life to a very deep extent, and it’s mainly being done in professional settings. It’s not really ready for your private life.”

Here is the abstract of the article: “Mobile phones pervade our everyday life like no other technology, but the effects they have on one-to-one conversations are still relatively unknown. This paper focuses on how mobile phones influence negotiations, i.e. on discussions where two parties try to reach an agreement starting from opposing preferences. The experiments involve 60 pairs of unacquainted individuals (120 subjects). They must make a “yes” or “no” decision on whether several objects increase the chances of survival in a polar environment or not. When the participants disagree about a given object (one says “yes” and the other says “no”), they must try to convince one another and reach a common decision. Since the subjects discuss via phone, one of them (selected randomly) calls while the other is called. The results show that the caller convinces the receiver in 70% of the cases (p-value = 0.005 according to a two-tailed binomial test). Gender, age, personality and conflict handling style, measured during the experiment, fail to explain such a persuasiveness difference. Calling or being called appears to be the most important factor behind the observed result.”
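A significance figure of this kind can be checked with a short exact test. The sketch below is only an illustration, assuming 42 of the 60 callers prevailed (70% of 60); the exact p-value depends on how the paper counted caller wins, so the number it produces need not match the abstract's 0.005 exactly. It uses only the standard library:

```python
from math import comb

def binomial_two_tailed(k, n):
    """Exact two-tailed binomial test against a fair-coin null (p = 0.5).

    Because the null distribution is symmetric, the two-tailed p-value
    is twice the probability of observing k or more successes out of n
    (capped at 1).
    """
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Assumed count: 42 of the 60 callers (70%) convinced the receiver.
p_value = binomial_two_tailed(42, 60)
```

The result is well below the usual 0.05 threshold, consistent with the abstract's conclusion that who calls matters.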

The journal paper “A Survey of Personality Computing” (A.Vinciarelli and G.Mohammadi) has been accepted for publication by the IEEE Transactions on Affective Computing. The article proposes an extensive overview of the works published so far on personality computing, i.e. on Automatic Personality Recognition (prediction of self-assessed traits), Automatic Personality Perception (prediction of traits attributed by others) and Automatic Personality Synthesis (generation of artefacts that elicit the attribution of predefined personality traits).

The paper proposes an approach that detects social signals in speech (overlapping speech, pitch, loudness, etc.) and predicts the conflict level perceived by human observers. The prediction is performed with Gaussian Processes, and the adoption of Automatic Relevance Determination allows the identification of the social signals that most influence the perception of conflict.
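The idea behind Automatic Relevance Determination can be sketched with a minimal example (this is an illustration of the general technique, not the paper's implementation): an RBF kernel with one length-scale per input feature, where a feature whose fitted length-scale grows very large stops influencing the kernel and is thereby flagged as irrelevant.

```python
import math

def ard_rbf(x, y, length_scales):
    """RBF kernel with Automatic Relevance Determination (ARD):
    one length-scale per input dimension. The larger a feature's
    length-scale, the less that feature changes the kernel value,
    i.e. the less relevant it is to the prediction."""
    d2 = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return math.exp(-0.5 * d2)

# Two inputs differing only in their second feature:
relevant = ard_rbf((0.0, 0.0), (0.0, 3.0), (1.0, 1.0))    # feature counts
irrelevant = ard_rbf((0.0, 0.0), (0.0, 3.0), (1.0, 1e6))  # feature suppressed
```

With a small length-scale the differing feature drives the two inputs apart (kernel value near 0); with a huge length-scale the same difference is ignored (kernel value near 1). In a Gaussian Process, the length-scales are fitted from data, so this comparison falls out of training automatically.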

The paper “Predicting Online Lecture Ratings Based on Gesturing and Vocal Behavior” has been accepted by the Journal of Multimodal User Interfaces. The article is the result of a collaboration with Marco Cristani (University of Verona), Pietro Salvagnini and Vittorio Murino (Italian Institute of Technology), Dong-Seong Chen (Yonsei University, Seoul) and Hugues Salamin (University of Glasgow). The experiments show that it is possible to predict the ratings that people assign to an online lecture based on the nonverbal behavioural cues of the teacher (in particular gesturing, prosody and voice quality).

The SSPNet Speaker Personality Corpus includes 640 speech clips (10 seconds each) covering a total of 322 subjects. Each clip was assessed by 11 raters in terms of the Big Five personality traits (the assessments were performed using the BFI-10 questionnaire), namely Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness. The corpus includes not only the speech clips, but also the raw personality questionnaires, the overall personality scores and the metadata associated with each clip: speaker gender, speaker status (journalist or non-journalist) and speaker ID.
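As a rough illustration of how one such per-clip record might be represented in code (the field names and example values below are hypothetical, not the corpus's actual schema or data):

```python
from dataclasses import dataclass

@dataclass
class ClipAnnotation:
    """Hypothetical record for one 10-second corpus clip."""
    clip_id: str
    speaker_id: str
    speaker_gender: str        # "male" / "female"
    speaker_status: str        # "journalist" / "non-journalist"
    # Mean of the 11 raters' BFI-10-derived scores, one per Big Five trait:
    extraversion: float
    agreeableness: float
    conscientiousness: float
    neuroticism: float
    openness: float

# Illustrative record with made-up values:
clip = ClipAnnotation("clip_001", "spk_017", "female", "journalist",
                      3.2, 4.1, 3.8, 2.5, 4.0)
```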

The corpus was collected in the framework of the Social Signal Processing Network (SSPNet), the European Network of Excellence on modelling, analysis and synthesis of nonverbal communication in social interactions.

I am the keynote speaker at the International Workshop on Social Behaviour Analysis in Naples (September 11, 2013), held in conjunction with the International Conference on Image Analysis and Processing. According to the organisers (Pietro Pala, Alberto del Bimbo and Maja Pantic), “This workshop is to bring together leading researchers in this and related fields to advocate and promote the research into human behaviors and social interactions analysis. The workshop aims to provide an interactive platform for researchers to disseminate their most recent research results, discuss rigorously and systematically potential solutions and challenges, and promote new collaborations among researchers.”

I am giving an invited lecture at the SICSA Summer School on Cognitive Computation (August 25-30, 2013). The School takes place at the University of Stirling (Scotland) and is organized by Amir Hussain. One of the main goals of the school is “to empower participants with an interdisciplinary understanding of some of the key underlying methodologies, concepts and techniques in cognitive computation, and their strengths and limitations (demonstrated by a range of case studies).”

I participate as an invited lecturer in the Summer School on Social Signal Processing (August 5-10, 2013), organized by Jens Allwood for the Swedish Society of Cognitive Sciences. The school takes place in Mullsjo (close to Gothenburg), at the Intercultural Center for the Quality of Life. The other lecturers include Anton Nijholt (University of Twente), Marc Mehu (newly appointed Assistant Professor at the Webster University in Vienna), and Jurgen Trouvain (Saarland University).

The organisers of the next Workshop on Image and Audio Analysis for Multimedia Interactive Services (WIAMIS) have invited me as a keynote speaker, a great honour and pleasure. According to the description on the website of the conference, “As a major event in the multimedia community, the International Workshop on Image and Audio Analysis for Multimedia Interactive Services gives the possibility for researchers and developers to exchange on issues linked to interactive services. WIAMIS 2013 will be held at Telecom ParisTech, Paris, France. For this 14th edition, particular emphasis will be put on audio analysis and audio-driven multimedia analysis research.”

I am the keynote speaker at the International Conference on Intelligent Virtual Agents (Edinburgh, August 29-31, 2013). The event is in its thirteenth edition and gathers the scientific community working on the development of “interactive characters that exhibit human-like qualities and communicate with humans or with each other using natural human modalities such as facial expressions, speech and gesture” (from the home page of the conference). This year, IVA will focus in particular on Virtual Agents and Cognition: “IVAs and models of personality; theory of mind; learning and adaptation; motivation and goal-management; creativity; social and culturally-specific behaviour.”

I will have the opportunity to give a keynote talk at the Inputs-Outputs conference in Brighton. The event gathers scientists, artists, media practitioners, doctors and people interested in interdisciplinary research revolving around people and their interactions with both others and technology. While waiting for the day of the conference (June 26, 2013), I am enjoying the beautiful flyer designed for the event.

My student Gelareh Mohammadi earned her PhD at the Swiss Federal Polytechnic Institute of Lausanne (EPFL). Gelareh gave a great presentation and her thesis (“Automatic Personality Perception: Inferring Personality Traits from Nonverbal Vocal Behavior”) was greatly appreciated by the jury.

I will deliver a keynote speech at the COST Workshop on Social Robotics and its sustainability, an exciting event organised by COST, the European body promoting collaboration and networking among scientists all over the continent. The organisers describe their goals as follows: “This three-day event aims to give an overview of the current state of the art on social robotics across disciplines in order to discern what the near future may hold. Covered topics include but are not limited to the evolution of the field traced by all the transdisciplinary stakeholders (e.g. engineering, social sciences, etc.), the communication and emotional impact social robots might generate in users and in the general public, the models of society embodied in social robots, and the societies that such robots will contribute to outline.” It looks like a great opportunity and I really thank the organising committee for having included me in the list of speakers.

Gelareh Mohammadi, an EPFL PhD student I supervise at the Idiap Research Institute, has been awarded the “Google Anita Borg Scholarship”. Her thesis revolves around the inference of the personality traits people attribute to unacquainted speakers.

I have been appointed editor of the new Springer Series “Computational Social Sciences“. The goal of the series is to publish contributions at the crossroad between social and computing sciences, dealing with human interactions at all scales, from dyadic exchanges to worldwide online social networks.

The scope of the series is as follows:

“Computational Social Sciences consists of authored and edited monographs that utilize quantitative and computational methods to model, analyze and interpret large-scale social phenomena. Titles within the series contain methods and practices which test and develop theories of complex social processes through bottom-up modeling of social interactions. Of particular interest is the study of the co-evolution of modern communication technology and social behavior and norms, in connection with emerging issues such as trust, risk, security and privacy in novel socio-technical environments.

Computational Social Sciences is explicitly transdisciplinary: quantitative methods from fields such as, e.g., dynamical systems, artificial intelligence, network theory, agent-based modeling and statistical mechanics are invoked and combined with state-of-the-art mining and analysis of large data sets to help us understand social agents, their very interactions on and offline, and the effect of these interactions at the macro level. Topics include but are not limited to social networks and media, dynamics of opinions, cultures and conflicts, socio-technical co-evolution and social psychology. Computational Social Sciences will also publish monographs and selected edited contributions from specialized conferences and workshops specifically aimed at communicating new findings to a large transdisciplinary audience. A fundamental goal of the series is to provide a single forum within which commonalities and differences in the workings of this field may be discerned, hence leading to deeper insight and understanding.”

I am an invited speaker at the Signal Processing Laboratory Workshop organized by Hicham Atassi at the Technical University of Brno (Czech Republic). The workshop takes place between October 24th and 26th, 2012.