Early diagnosis of social isolation in older adults can prevent physical and cognitive impairment or further impoverishment of their social network. This diagnosis is usually performed through the periodic, in-person application of psychological assessment instruments. This situation encourages the development of novel approaches able to monitor risk situations in social interactions in order to obtain an early diagnosis and implement appropriate measures. This paper presents the development of a prediction model of social isolation in older adults through Ambient Intelligence (AmI) and Social Networking Sites (SNSs). The predictive model has been evaluated in terms of its accuracy, sensitivity, specificity, and predictive values. This paper also presents the results of an experimental test applying the proposed approach with real users, which obtained a prediction accuracy of 87.5% and a type II error rate of 12.5%. The proposed model will benefit institutions interested in developing technological solutions to detect early stages of social isolation, thereby improving the quality of life of older adults.

One of the most pronounced issues in late adulthood is social isolation due to such factors
as retirement, children living in different places, or the loss of a spouse. Social
isolation is defined as the lack of contact and interaction with others [1,2]. An early diagnosis of this condition significantly
reduces the risk of depression, cognitive impairment, decreased food intake, reduced
physical exercise, or impoverishment of the social network [3]. This risk underscores the importance of knowing at
all times when an older adult stops socializing in order to carry out interventions
that allow him/her to overcome this condition and remain socially
active. Currently, several psychological scales are used to assess the level of
social isolation in older adults [3,4]. Unfortunately, the application of these instruments is
tedious because older adults need to visit assistance centers or specialized
professionals in order to be assessed. This motivates the development of novel
approaches that enable automatic monitoring of significant changes in their social
interactions. In this context, Ambient Intelligence (AmI) provides widely accepted
computational mechanisms that can help older people in their daily lives in a
manner that is simple, unintrusive, ubiquitous, and proactive at the same time [6,7]. In addition, the increasing
participation of older adults in Social Networking Sites (SNSs) opens a range of
opportunities to monitor social interactions through these virtual communication
channels [8]. Therefore, this paper
describes a predictive model developed to serve as a baseline for determining social
isolation levels. This model receives as input quantitative values from
indoor/outdoor social interactions performed by older adults. The proposed model
will benefit institutions interested in developing systems to improve the quality of
life of older adults.

The paper is structured as follows: Section 2 describes related work on the detection of social isolation. Section 3 presents the proposed predictive model of social isolation. Section 4 describes the design of the experimental test used to evaluate the performance of the predictive model. Section 5 discusses the results of the experimental test. Finally, Section 6 presents conclusions and future work.

2 Related Work

Recent research has studied the impact of Ambient Intelligence and Social Networking Sites on
socialization. This body of research has addressed how these technologies help i) to
reduce the level of social isolation and increase independent life at home [7,8], ii) to keep seniors in touch with friends through
natural ways of interaction [9,10], iii) to encourage physical
exercise [11,12], and iv) to monitor the state of
health and to keep caregivers and relatives informed [15]. However, monitoring social isolation through
computational mechanisms has not been addressed.

3 Predictive Model

A predictive model examines an attribute set and produces an outcome class. Our research work focuses on identifying attributes that have a correlation with social isolation. These attributes correspond to social activities performed by older adults that can be monitored by AmI and SNSs. Table 1 shows a summary of social interaction activities grouped by technological resource and the location where the activity is performed. Previous studies have demonstrated that these activities are correlated with subjective social isolation (loneliness), for instance, time spent inside home [16], time spent out of home [17], and communication through mobile phones [18]. Also, previous research suggests that the use of SNSs could help prevent social isolation in older adults [8]. For this reason, such activities were considered as the attribute set for the predictive model development.

3.1 Data Collection

Data collection consisted of non-probability sampling through a questionnaire administered to 144 older adults, both men and women between 60 and 89 years of age (68.2 ± 8.9), with full physical and cognitive abilities and without mobility impairment, who owned a mobile phone and were able to use it to make calls or send text messages. In addition, these subjects had a Facebook profile, had no difficulty understanding the questions, and signed an informed consent form indicating that they were willing to take part in the research. The sample was collected in the city of Cuernavaca, Mexico. The interviews took place in public parks and malls. The questionnaire comprised two parts. The first was the Spanish version of the Lubben Social Network Scale (LSNS-6) [19], used to determine the level of social isolation of older adults.

The second part of the questionnaire collected data concerning demographic information as well as social interaction activities described in Table 1. The questions formulated by the LSNS-6 request information about the frequency of social interactions during one month previous to the interview, which is often difficult to remember accurately. Figure 1 shows a summary of the sample's general information. In this chart we can observe 48 severe cases of social isolation, 93 moderate cases of isolation, and 3 cases where no social isolation was detected.

Table 1 Social interaction activities classified by the technologies that can be used to infer social isolation

Fig. 1 Sample general information

3.2 Attribute Selection

Attribute selection is the process of identifying and removing irrelevant and redundant information. Most machine learning algorithms are designed to identify the most appropriate attributes for classification. Decision tree methods choose the most promising attribute to split on at each point and should, in theory, never select irrelevant or unhelpful attributes [19]. In order to obtain the first subset of relevant attributes, the J48 classification algorithm was applied to the full dataset. Then, the subset obtained was assessed using the Chi-Squared and InfoGain methods [20] with the Ranker method to evaluate individual attributes, and the Correlation-based Feature Selection method with BestFirst and Greedy Stepwise [21] to evaluate sets of attributes. All the tests were performed with ten repetitions of 10-fold cross-validation, the standard evaluation technique [19].
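As an illustration of the ranking idea behind an information-gain evaluator, the following minimal sketch scores each discrete attribute by how much it reduces the entropy of the class label. It is not the WEKA implementation used in this work; the attribute names, the discretized values, and the toy labels are hypothetical and serve only to show the computation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(values, labels):
    """Class entropy minus the expected class entropy after splitting on one attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [(name, info_gain([r[i] for r in rows], labels))
             for i, name in enumerate(names)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: two discretized attributes and an isolation label.
names = ["places_visited", "gender"]
rows = [("low", "m"), ("low", "f"), ("high", "m"), ("high", "f")]
labels = ["severe", "severe", "moderate", "moderate"]
ranking = rank_attributes(rows, labels, names)
```

In this toy example the first attribute separates the classes perfectly and the second carries no information, so a ranker would place them first and last respectively.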

The resulting relevant attributes were gender, the number of different places visited, the number of times that a person initiates a chat conversation with the family, the number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family in minutes, the duration of outgoing calls to the family in minutes, the number of incoming messages from the family, the number of outgoing messages to friends, time spent in the bedroom, time spent in the living room, time spent in the dining room, time spent in the garden, and time spent in other areas inside the home.

3.3 Classification

In order to develop the most suitable model for predicting social isolation, a range of classifier algorithms was assessed [22]. This process was carried out using WEKA [23]. The ZeroR (ZR) algorithm was used as a baseline. The other classifier algorithms used were NaiveBayes (NB), Simple Logistic (SL), Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), AdaBoost (AB), OneR (OR), J48, and SimpleCart (SC). Stratified ten-fold cross-validation, repeated ten times, was used because it is the standard evaluation technique in situations where only limited data is available [19].
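The baseline and the evaluation protocol described above can be sketched in plain Python. This is a simplified illustration and not the WEKA procedure used in this work: the fold assignment (dealing shuffled instances of each class round-robin into the folds), the seed, and the function names are assumptions. Only the class counts come from the collected sample.

```python
import random
from collections import Counter, defaultdict


def zero_r(train_labels):
    """ZeroR: ignore all predictors and always predict the majority class."""
    return Counter(train_labels).most_common(1)[0][0]


def stratified_folds(labels, k, rng):
    """Split instance indices into k folds that preserve class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    pos = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            folds[pos % k].append(idx)  # deal instances round-robin
            pos += 1
    return folds


def repeated_cv_accuracy(labels, k=10, repeats=10, seed=0):
    """Mean ZeroR accuracy over `repeats` runs of stratified k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for fold in stratified_folds(labels, k, rng):
            held_out = set(fold)
            train = [y for i, y in enumerate(labels) if i not in held_out]
            guess = zero_r(train)
            hits = sum(labels[i] == guess for i in fold)
            scores.append(hits / len(fold))
    return sum(scores) / len(scores)


# Class counts reported for the sample: 48 severe, 93 moderate, 3 not isolated.
labels = ["severe"] * 48 + ["moderate"] * 93 + ["none"] * 3
baseline = repeated_cv_accuracy(labels)
```

Because ZeroR always answers with the majority class, its cross-validated accuracy stays close to the share of that class in the data, which is what makes it a useful floor for the other classifiers.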

3.4 Balancing the Dataset

A dataset is imbalanced if the classification categories are not equally represented. The imbalance between classes can affect some classification algorithms, typically biasing them toward predicting the majority class. Therefore, a dataset balancing technique is required. In order to handle the imbalance, the dataset was resampled by applying the synthetic minority oversampling technique (SMOTE) [24].

Each derived model is denoted by the name of the classifier algorithm plus "_S" when SMOTE is applied. For example, a model derived using kNN classification and SMOTE for data resampling is denoted as "kNN_S" and the one without data resampling is denoted as "kNN". Table 2 shows the dataset before and after applying SMOTE.
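The core idea of SMOTE, creating synthetic minority instances by interpolating between a minority instance and one of its nearest minority-class neighbours, can be sketched as follows. This is a simplified variant for illustration only (it picks base instances at random rather than generating a fixed number per instance) and is not the WEKA filter used in this work; the sample points, `k`, and the seed are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating minority samples
    toward one of their k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class instances with two numeric attributes.
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
new_points = smote(minority, n_new=4, k=2)
```

Every synthetic point lies on a segment joining two real minority instances, so the new instances stay inside the region already occupied by the minority class rather than duplicating existing rows.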

Table 2 Number of instances of the dataset before and after SMOTE applied

3.5 Model Evaluation

The performance of the predictive models was evaluated in terms of accuracy [25], sensitivity, specificity, positive and negative predictive values [26], and type I and type II error rates. In order to corroborate the results of the predictive models, a reference standard providing an independent diagnosis of each participant's actual condition was necessary. The reference standard used was the LSNS-6.
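For reference, the evaluation measures listed above can all be derived from the four cells of a confusion matrix. The sketch below assumes a binary framing in which "isolated" is the positive class; the counts are hypothetical and chosen only to exercise the formulas, not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Evaluation measures derived from a binary confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "ppv": tp / (tp + fp),              # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
        "type_i_error": fp / (fp + tn),     # false positive rate
        "type_ii_error": fn / (fn + tp),    # false negative rate
    }


# Hypothetical counts, chosen only to exercise the formulas.
m = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
```

Under this framing the type II error rate is the complement of sensitivity, and the type I error rate is the complement of specificity.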

3.6 Suitable Model Selection

In selecting a suitable model we focused on reducing type II errors (false negative, or FN, rate); that is, we are more concerned with failing to detect actual cases of isolation than with predicting isolation when it is not actually present (type I error, or false positive, FP, rate), particularly because an undetected older adult might unknowingly be at serious risk of depression and suicide [1]. The first model performance evaluation was carried out with ZR as the baseline. This algorithm is the simplest classification method: it relies on the target and ignores all predictors. The ZR classifier simply predicts the majority class. Although ZR has no predictive power, it is useful for determining a baseline performance as a benchmark for other classification methods [19]. All the models obtained from the dataset, with and without SMOTE, produced a significantly higher accuracy than the baseline.

The second model performance evaluation was carried out in terms of accuracy. The classification algorithms were applied to the data using the relevant attribute subsets, with and without SMOTE. All the models obtained from the dataset using SMOTE produced higher accuracy scores than their counterparts without SMOTE. The accuracy of the baseline and of the classification algorithms applied to the dataset with and without SMOTE is shown in Figure 2.

Once the performance of the models had been compared based on their accuracy, the best models were examined in terms of sensitivity, specificity, and positive and negative predictive values. Type II error was weighted more heavily than the other criteria, since this type of error could lead to the most adverse effects in older adults. A summary of all criteria is presented in Table 3.

Table 3 Models' prediction performance rank table

The AB_S model obtained the best accuracy score (85%) and also the best type II error rate (15%). Because it outperformed the rest of the models, it was selected as the most suitable one.
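To make the boosting idea behind the selected classifier concrete, the following is a minimal AdaBoost sketch with one-attribute threshold rules (decision stumps) as weak learners. It is an illustration under stated assumptions, not the model trained in this work: the base learner, the number of rounds, the binary +1/-1 labelling (isolated versus not isolated), and the toy data are all hypothetical.

```python
import math


def train_stump(X, y, w):
    """Find the single-attribute threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    """Apply a stump of the form (error, attribute index, threshold, sign)."""
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost(X, y, rounds=10):
    """Train AdaBoost with decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero on a perfect stump
        if err >= 0.5:
            break  # weak learner is no better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified instances, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: one attribute; +1 = isolated, -1 = not isolated.
X = [(1,), (2,), (3,), (6,), (7,), (8,)]
y = [1, 1, 1, -1, -1, -1]
model = adaboost(X, y, rounds=5)
predictions = [predict(model, row) for row in X]
```

The reweighting step is what lets later weak learners concentrate on the instances that earlier ones misclassified.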

4 Experiment

In order to evaluate the AB_S model, an experiment was conducted. A comparison of the model's results with the real condition of the older adults was carried out.

4.1 Materials

Participants

The experiment included 8 older adults, 6 men and 2 women, who met the same profile requirements as in the previous phase, with ages between 60 and 85 years old (68.61 ±7.47).

Data collection

Participants' mobile phones, four wireless IP cameras, a wireless router, an internet connection, Facebook message history, and a printed form were used.

4.2 Procedure

First, the participants were asked to sign an informed consent form in which they agreed to participate in the experiment. Then, each participant was monitored for one month, since the LSNS-6 requires this period in order to obtain the social isolation level. The monitoring of all participants lasted four months.

The attributes related to mobile phone use (the number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family, the duration of outgoing calls to the family, the number of incoming messages from the family, and the number of outgoing messages to friends) were obtained by retrieving the call log of each participant's mobile phone at the end of the month. The indoor location attributes (time spent in the living room, time spent in the garden, time spent in the dining room, time spent in the bedroom, and time spent in other areas inside the home) were obtained from two IP cameras strategically installed in each home. Each camera recorded 12 hours per day, and about 5760 hours of video were recorded in total. The videos were transcribed [27] every day. The number of times that the older person initiated a chat conversation with the family was obtained from the Facebook message history of each participant's personal account at the end of the month. Gender and the number of places visited were obtained through self-report, using the printed form on which participants recorded the number of places visited every day. At the end of each participant's monitoring period, the LSNS-6 was administered in order to obtain their actual social isolation level. From the data collected during the monitoring phase, each participant's social isolation level was then predicted with the AB_S model. Finally, the social isolation levels obtained with the LSNS-6 and with the AB_S model were compared.
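The reported volume of about 5760 hours of video is consistent with the setup described above, as the following back-of-the-envelope check shows. It assumes that the two cameras were installed in every participant's home and that "one month" corresponds to 30 days; both are assumptions made for this check.

```python
participants = 8        # older adults in the experiment
cameras_per_home = 2    # IP cameras per home (assumed to apply to every home)
hours_per_day = 12      # recording time per camera per day
days_monitored = 30     # assumption: "one month" taken as 30 days

total_hours = participants * cameras_per_home * hours_per_day * days_monitored
```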

4.3 Results

Table 4 summarizes the data of the 8 older adults who took part in the experiment. The columns correspond to: (A) gender; (B) the number of different places visited; (C) the number of times that the older person initiates a chat conversation with the family; (D) the number of incoming calls from the family; (E) the number of incoming calls from friends; (F) the duration of incoming calls from the family (in minutes); (G) the duration of outgoing calls to the family (in minutes); (H) the number of incoming messages from the family; (I) the number of outgoing messages to friends; (J) time spent in the bedroom (in minutes, excluding sleep time); (K) time spent in the living room (in minutes); (L) time spent in the dining room (in minutes); (M) time spent in the garden (in minutes); and (N) time spent in other areas inside the home (in minutes).

Table 4 Data collected in the experiment

The comparison between the social isolation level results obtained by LSNS-6 and AB_S model is shown in Figure 3.

Fig. 3 Comparison between results obtained with the Lubben Social Network scale (LSNS-6) and AB_S model

The AB_S model correctly classified 7 of 8 participants producing an accuracy of 87.5% and a type II error rate of 12.5%.

5 Discussion

This research focused on inferring the older adults' social isolation level through activities that can be monitored by AmI and SNSs. From the collected sample, which gave rise to the predictive model, the attributes that have a correlation with social isolation were identified. Also, we made some findings during the development of the predictive model.

5.1 Relevant Attributes

Of all the demographic attributes, only gender proved to be relevant. Thirty-four percent of the sample had a severe level of social isolation. Of this 34%, 69.8% were male and 30.2% were female. As can be observed, men run a greater risk of social isolation than women. Of the women with a severe level of social isolation, 60% live alone.

One possible interpretation of this finding is that in Mexico men's life expectancy is lower than that of women [28], so women become widows and live alone. Another relevant attribute was the number of different places visited. One possible interpretation of this finding is that older adults need to perform activities outside their homes in order to encourage social interactions. Attributes such as Facebook posts and messages did not prove to be relevant; what turned out to be relevant was the number of times that an older person initiates a chat conversation with the family. One possible interpretation of this finding is that older adults use only private messages and avoid posting on the Facebook wall due to security concerns. Another possible interpretation is that older adults have only recently begun to use Facebook, so they have not yet developed sufficient skills. Among the attributes concerning mobile phone use, the relevant ones were the number of incoming calls from the family, the number of incoming calls from friends, the duration of incoming calls from the family, the duration of outgoing calls to the family, the number of incoming messages from the family, and the number of outgoing messages to friends.

As can be observed, most of these attributes refer to communication with the family. One possible interpretation of this finding is that older adults currently use their mobile phones more frequently to communicate with the family than with others. Another finding is that the use of SMS by older adults is increasing. This increase could be explained by the fact that new technologies take older adults' limitations into account, which results in more appropriate interfaces.

Finally, the relevant attributes referring to indoor location were time spent in the bedroom, time spent in the living room, time spent in the dining room, time spent in the garden, and time spent in other areas inside the home. One possible interpretation of this finding is that it is important for older adults to move around inside their homes, since this encourages social interactions with the people they live with and thus prevents them from becoming isolated in one area of the home.

5.2 Predictive Models

In order to handle the imbalanced data, an oversampling technique (SMOTE) [24] was applied. The predictive model performance was better using SMOTE. The AB model obtained an accuracy of 65% and a type II error rate of 32%, whereas the AB_S model obtained an accuracy of 85% and a type II error rate of 15%. In this case, the synthetic instances created with SMOTE enhanced the learning of the AdaBoost classifier algorithm. In the experiment, the AB_S model produced a higher accuracy than in the cross-validation test (87.5% versus 85%), and its type II error rate was lower than in the cross-validation test (12.5% versus 15%).

Thus, both the accuracy and the type II error rate improved in the experiment. A lower type II error rate means that fewer older adults would unknowingly remain at serious risk of other diseases [1]. Nevertheless, new experiments with a larger sample are required.

6 Conclusions and Future Work

Social isolation is considered to be one of the possible factors that cause such disorders as
depression, cognitive impairment, or impoverishment of the social network [1,3]. Therefore, an early diagnosis and suitable
interventions from relatives and caregivers would allow older adults to cope with
this health condition. In order to infer social isolation in older adults, an
evaluation of a number of activities that can be monitored through AmI and SNSs was
carried out. From these activities, relevant attributes were identified through
attribute evaluation methods. Using such attributes, a number of predictive models
were developed by implementing a range of classifier algorithms. Each model went
through a performance evaluation, and a technique was applied to handle imbalanced data and bias.
The AB_S model was selected due to its performance (accuracy: 85%,
sensitivity: 85%, specificity: 92%, PPV: 91%, NPV: 85%) and its lower type II error
rate (15%). In order to evaluate the selected model, an experiment with 8 older
adults was carried out. The experiment compared the social isolation level of each
participant obtained by the AB_S model with the reference standard, the LSNS-6.
The experiment results showed that the AB_S model correctly classified 7 of 8
participants, producing an accuracy of 87.5% and a type II error rate of 12.5%.

A limitation of our work is the amount of available data. It was both expensive and time-consuming to collect such data from older adults. Nevertheless, a collaborative project with geriatric institutions is currently underway, which will allow our current approach to be extended to a larger sample size. As future work, an implementation of the AB_S model in a computer system is being considered. This system will be capable of monitoring older adults' activities in a ubiquitous manner and of inferring their social isolation levels. Also, older adults will be able to share their social isolation level with previously authorized caregivers and relatives in order to alert them to a risk situation. Finally, some older adults make use of and enjoy their time alone, which does not imply a risk situation. In order to adapt the model to individual requirements, an implementation of statistical learning algorithms is planned.

Wilfrido Campos is a Ph.D. student at CENIDET, Cuernavaca, Morelos, Mexico. He
received his M.Cs. degree in Computing from Autonomous University of
Guerrero (UAGro), Guerrero, Mexico. His research interests are data mining,
modeling, and information technology in software development.

Alicia Martinez is a research professor at CENIDET, Cuernavaca, Morelos,
Mexico. She received her Ph.D. degree in Computer Science from Technical
University of Valencia, Spain, and the Trento University, Italy. Her
research interests are affective computing, ubiquitous computing,
organizational modeling, and ontologies.

Wendy Sánchez is a Ph.D. student at CENIDET, Cuernavaca, Morelos, Mexico. She
received her M.Cs. degree from CENIDET, Cuernavaca, Morelos, Mexico. Her
research interests are data mining, AmI, behavior patterns and information
technology in software development.

Hugo Estrada is a computer science researcher at Information and Documentation
Fund for Industry, INFOTEC, Mexico City, Mexico. He received his Ph.D.
degree in Computer Science from Technical University of Valencia, Spain, and
the Trento University, Italy. His research interests are organizational
modeling, semantic Web, and ontologies.

Jesus Favela is a research professor at Ensenada Center for Scientific Research
and Higher Education, Baja California, Mexico. He received his Ph.D. degree
from MIT-Massachusetts Institute of Technology-Cambridge, Massachusetts,
United States. His research interests are mobile and ubiquitous computing,
medical informatics, and collaborative systems.

Joaquin Pérez is a research professor in Software Engineering Area at CENIDET,
Cuernavaca, Morelos, Mexico. He received his Ph.D. degree in Computer
Science from Technological Institute and of Higher Studies of Monterrey,
Morelos. His research interests are software engineering, data mining,
algorithms, and mathematical modeling.

This is an open-access article distributed under the terms of the Creative Commons Attribution License