Notes

Abstract:

Limited access to meaningful collections of sensory data is one of the major impediments in pervasive computing and activity recognition research. Researchers often need data to evaluate the viability of their ideas and algorithms. But obtaining useful sensory data from real world deployments is challenging because of the high cost and significant ground work involved in building actual spaces. Also, regulatory limitations on human subject use can prevent the researcher from executing all possible test scenarios. This situation can be improved by a community effort to enable and encourage the sharing of the existing inventory of datasets. However, powerful simulation tools and techniques are also needed to satisfy the growing demand for activity data and accelerate research on pervasive and human-centered computing. In this thesis, the Persim project is presented as a solution to these problems by (1) introducing a standard representation of sensory data, the Sensory Dataset Description Language (SDDL), for effective sharing of existing datasets among research communities, and (2) contributing a powerful event-driven simulation tool to generate synthetic data for human activities in a standardized format. The simulator can capture the physical space in terms of sensors/actuators as well as user behavior (activities) and generate focused simulation data from the targeted space to achieve a particular research goal. Moreover, the simulator is verified by a fuzzy-based verification technique which assesses the fidelity of the simulated data against the real data collected from smart home deployments.

General Note:

In the series University of Florida Digital Collections.

General Note:

Includes vita.

Bibliography:

Includes bibliographical references.

Source of Description:

Description based on online resource; title from PDF title page.

Source of Description:

This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.

activities either from real world deployments or from simulated environments. The scope

of this description language is to provide the collective information about the pervasive

space in terms of available sensors/actuators, activities that can happen inside the

space, researcher implicit assumptions, dataset parameters, and labeling. It does not

include post processing of sensor data or any physical properties of sensors/actuators.

Development Methodology

To develop a description language that is intended to be a global representation of

sensory data, we realized that it just cannot be done overnight. Unless we repeatedly

analyze the existing datasets from different research communities, it is not possible to

achieve our desired goal. With this realization, we have adopted an iterative

development methodology in designing SDDL.

Challenges

While developing SDDL we were confronted with several practical challenges,

both technical and methodological. We attempted to answer the following questions as

we developed the initial version of SDDL:

* How to develop a standard that is neither overly generic nor overly specific?
* How can different styles of dataset organization be satisfied by a common format?
* How can the format allow the flexibility to support diverse experiments?
* How to make the language simple to understand as well as easy to parse?

Approach

Case-study analyses were used as a means for an iterative conjecture-refinement

process. While analyzing existing datasets, our objective was to gain insight into the

practices of researchers in organizing sensor data. Based on the analysis results,

we have learned of critical issues related to the characterization of sensor data and the

way in which they were collected. Then, we tried to combine different practices in a

common design that can serve the major purposes.

The development process was divided into two phases. In the first phase, our

focus was to develop an initial structure of SDDL based on a few existing and well-known

datasets. In the second phase, the previously developed structure was refined and

adapted iteratively to accommodate additional information according to the analysis

result and what we have learned from new datasets.

In the first phase shown in Figure 3-1, we analyzed a few datasets and

contacted several researchers (Dr. Diane J. Cook, Washington State University

(WSU) regarding CASAS smart home datasets [17] and Dr. Hani Hagras, University of

Essex regarding iDorm datasets [18]). Based upon their feedback, we have thoroughly

surveyed the datasets and developed the initial version of the schema structure of

The dataset corresponds to data about human activity performed in two single-person

apartments for two weeks. In each apartment, between 77 and 84 sensor data

collection boards equipped with reed switch sensors were installed. In this experiment,

a supervised learning approach was followed and a naive Bayesian classifier was used to

detect activities [19], [20].

The dataset consists of three types of data in three separate files. These are:

* List of activities. This file consists of all predefined list of activities corresponding
to the experiment. Each activity is defined by comma separated values of heading,
category, sub-category, and code.

* List of sensors. This file contains all the sensors installed in the apartment. Each
sensor is defined by comma separated values of sensor ID, location, and object
(to which the sensor is attached).

* Activity data. This is the actual data file which contains the information about the
activity which was being performed and the sensors which were active during that
time. Each entry of the dataset corresponds to an activity which is defined by
comma separated values of activity label, start time, end time, sensor ID, sensor
object, sensor activation time, and sensor deactivation time.
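The activity data layout described above can be read with a few lines of code. The following is a minimal sketch, assuming one comma-separated entry per line with exactly the seven fields listed; the class name ActivityRecord, the function parse_activity_data, and the sample rows are hypothetical and are not taken from the original dataset.

```python
import csv
from dataclasses import dataclass
from io import StringIO


@dataclass
class ActivityRecord:
    """One activity entry, using the seven fields named in the text."""
    activity_label: str
    start_time: str
    end_time: str
    sensor_id: str
    sensor_object: str
    sensor_activation_time: str
    sensor_deactivation_time: str


def parse_activity_data(text):
    """Parse comma-separated activity entries, one entry per line (assumed)."""
    records = []
    for row in csv.reader(StringIO(text)):
        if not row:  # skip blank lines
            continue
        fields = [field.strip() for field in row]
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields, got {len(fields)}: {row}")
        records.append(ActivityRecord(*fields))
    return records


# Made-up sample rows, for illustration only.
SAMPLE = """Toileting,17:30:36,17:46:41,100,Light switch,17:39:37,17:40:02
Preparing lunch,12:01:05,12:25:40,137,Freezer,12:03:11,12:03:20
"""

records = parse_activity_data(SAMPLE)
for record in records:
    print(record.activity_label, "->", record.sensor_object)
```

Keeping the timestamps as strings avoids guessing at the date format used in the real files.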

At this point we had a few major observations. First, the dataset has pre-defined

sensor labeling and activity labeling. Second, the actual activity-data is also labeled with

activity and sensor information corresponding to that activity. Third, different activities

are defined by different subsets of sensors. For example, in Figure 3-5, the 'Toileting'

activity is defined by sensors which are attached to the sink faucet (hot) and the light

switch. On the other hand, the 'Preparing Lunch' activity is defined by sensors that are

attached to the Door, Freezer, Toaster, Cabinet, and Drawer. Fourth, the same activity can

also be defined by different subsets of sensors. For example, the same activity 'Preparing

Figure 3-6. A snippet of MavPad testbed, University of Texas at Arlington [21]

Overview of SDDL Schema Structure

Terms and Notations

In this thesis, the following terms and notations are used to describe SDDL

schema elements:

* Element name starts with an upper-case letter. Any attribute name starts with a
lower-case letter.

* Element and attribute names use an underscore ("_") to separate multiple words in
order to improve readability (For example, ).

* Elements can be simple or complex.

* A simple schema element is the one that does not have any attributes.

* A complex schema element is the one that can have multiple simple or complex
sub elements. It can also have several attributes. Sub-elements are described in
orderly fashion after the parent element.

* Not all schema elements are mandatory in SDDL. In the schema diagram, the
mandatory elements are shown in solid rectangles whereas optional elements are
shown in dashed rectangles.

Schema Elements and Attributes

SDDL is designed as a hierarchical collection of elements and their attributes. One

element can contain one or more sub-elements. Each element can also have one or

more attributes. We now explain all the schema elements in some detail. The

complete schema definition of SDDL is provided in Appendix A.

Sensory_Dataset. This is the root element of the SDDL schema that captures

both metadata and sensor data of a dataset. Figure 3-7 shows the schema diagram of

this element. It has the following attributes:

* version. The version number of the SDDL instance.
* id. The identification number of the dataset.
* name. The name of the dataset.
* date_from. The start date and time of the sensor data generation.
* date_to. The end date and time of the sensor data generation.
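As an illustration of the root element and its five attributes, the sketch below builds and serializes such an element with Python's standard xml.etree.ElementTree module. It assumes the root tag is Sensory_Dataset (the name that appears in the schema diagram) and ignores the sddl namespace prefix; the helper build_root and all attribute values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_root(version, dataset_id, name, date_from, date_to):
    """Create a root element carrying the five attributes listed above."""
    return ET.Element(
        "Sensory_Dataset",  # assumed tag name, taken from the schema diagram
        {
            "version": version,
            "id": dataset_id,
            "name": name,
            "date_from": date_from,
            "date_to": date_to,
        },
    )


# Hypothetical values, for illustration only.
root = build_root(
    version="1.0",
    dataset_id="1",
    name="example_dataset",
    date_from="2010-01-01T00:00:00",
    date_to="2010-01-14T23:59:59",
)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The date and time format shown is only a placeholder; the actual format is fixed by the schema definition in Appendix A.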

Contact_Info. An element that briefly describes the contact information (e.g.,

name, phone, email, etc.) of the authorized person of the dataset. The schema diagram

is shown in Figure 3-8. This element has the following attributes:

* name. The name of the authorized person of the dataset.
* role. The role of the authorized person such as owner, lab manager, etc.
* organization. The name of the corresponding organization.
* phone. The contact number of the authorized person of the dataset.
* email. The email address of the authorized person.

. This element contains all necessary details of the pervasive space

where data is generated. The schema diagram for this element is shown in Figure 3-8.

Figure 3-7. Schema diagram of the Sensory_Dataset element in SDDL (attributes: version, id, name, date_from, date_to; sub-elements: Contact_Info, History, Dataset_Contexts, Sensor_Event)

It includes three simple sub-elements ,

, and and one complex element -

. and briefly

describe the general information about the project and dataset.

represents information about the area and the layout of the target space.

actuators are physically located. Count attribute denotes the number of such areas. The

schema diagram of the element is shown in Figure 3-11. It can contain one or more

Location sub-elements, each of which represents an individual location. The Location element has

the following attributes:

* id. The identifier of the particular area in the space.
* name. The name of the location such as bedroom, bathroom etc.

* zone. The name of the zone if an area is divided into zones.

Figure 3-11. The schema diagram of the Location_Info element in SDDL (sub-element: Location, with attributes id, name, zone)

. This element contains the relevant information about all the

sensors that are available in the space. It has count attribute which denotes the number

of sensors. The schema diagram is shown in Figure 3-12. Here, the element can

contain one or more complex Sensor sub-elements, each of which describes the characteristics of

a particular sensor. Each Sensor sub-element has the following attributes:

* id. The identifier of the particular sensor.
* name. The name of the sensor.
* type. The functional type of the sensor such as motion sensor, light sensor, etc.
* location_id. The location identifier of the sensor, as defined in the corresponding Location element.
* unit. The measurement unit of the sensor value in textual form.
* min_value. The minimum output value range of the sensor.
* max_value. The maximum output value range of the sensor.

. This element contains the relevant information about all the

actuators that are available in the space. It has count attribute which denotes the

number of actuators. The schema diagram is shown in Figure 3-13. Here, the element

can contain one or more complex Actuator sub-elements, each of which describes

the characteristics of a particular actuator. Each Actuator sub-element has the following attributes:

* id. The identifier of the particular actuator.
* name. The name of the actuator.
* type. The functional type of the actuator such as servo, electric motor, etc.
* location_id. The location identifier of the actuator, as defined in the corresponding Location element.
* unit. The measurement unit of the actuator value in textual form.
* min_value. The minimum output value range of the actuator.
* max_value. The maximum output value range of the actuator.

Figure 3-12. The schema diagram of the Sensor_Info element in SDDL (attribute: count; sub-element: Sensor, with attributes id, name, type, location_id, unit, min_value, max_value)

. This element denotes the activities or tasks that can happen

inside the space. The number of activities is stored in the 'count' attribute. Figure 3-14

shows the schema diagram of the element. The Activity sub-element describes

information about a particular activity and has the following attributes:

* id. The identifier of the activity.
* name. The name of the activity.
* category. The category of activities such as housekeeping, food preparation etc.
* sub-category. The sub-category of activities, if any.

Figure 3-13. The schema diagram of element in SDDL (sub-element: Actuator, with attributes id, name, type, location_id, unit, min_value, max_value)

. The element denotes the contexts that are involved while

collecting data from the space. For example, 'age of the participant' and 'month of the year'

are examples of such contexts. The number of contexts is stored in the 'count' attribute.

Figure 3-14. The schema diagram of element in SDDL (sub-element: Activity, with attributes id, name, category, sub-category)

The schema diagram of this element is shown in Figure 3-15. It consists of sub-

element that has the following attributes:

* id. The identifier of the context.
* name. The name of the context.
* type. The type or category of the context.

experiment. The number of subjects is stored in the 'count' attribute. The schema diagram

of the element is shown in Figure 3-16. It contains Subject sub-elements, each of which has the

following attributes:

* id. The identifier of the subject.
* name. The name of the subject.
* age. The age of the subject.
* ethnicity. The ethnicity of the subject.
* gender. The gender of the subject.
* disease/disability condition. The disease or disability condition such as obesity and
diabetes.

. The element contains the implicit information about the

organization and representation of the dataset such as parameters and the separator of

the parameter. The schema diagram of the element is shown in Figure 3-17. It has three

sub-elements , , and .

The Parameters sub-element can have several Parameter sub-elements, which denote the information

that each sensor event captures. For example, parameters can be timestamp, sensorID,

sensorValue, etc. The number of parameters is represented by the 'count' attribute. The

Parameter sub-element has the following attributes:

* name. The name of the parameter that is present in the sensor data.
* index. The index number where the parameter is placed in the sensor data.

Figure 3-16. The schema diagram of element in SDDL (attribute: count; sub-element: Subject, with attributes including id, ethnicity, gender, and disease_or_disability_condition1 to disease_or_disability_condition3)

The Separators sub-element refers to the various separators, such as comma,

semicolon, space, etc., used in representing the particular sensor data instance. It has

the following attributes:

* parameter_separator. The separator between parameters in the dataset.
* line_separator. The separator between two lines/data instances in the dataset.

The third sub-element refers to the mode of sensor data reporting. It is restricted

to two values: single_sensor_per_event (i.e., only one sensor reading is recorded per

line) and multiple_sensors_per_event (i.e., multiple sensor readings are recorded in one

line).
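The parameter and separator information described above is what a reader of the dataset needs in order to split raw sensor data into named fields. The sketch below shows one way this could work for the single_sensor_per_event mode. The function parse_events, the 0-based interpretation of the index attribute, and the sample data are assumptions made for illustration; they are not part of SDDL.

```python
def parse_events(raw, parameters, parameter_separator=",", line_separator="\n"):
    """Split raw sensor data into events.

    `parameters` maps each parameter name to its index within an event.
    The index is assumed to be 0-based here.
    """
    events = []
    for line in raw.split(line_separator):
        if not line.strip():  # skip empty trailing segments
            continue
        fields = [field.strip() for field in line.split(parameter_separator)]
        events.append({name: fields[index] for name, index in parameters.items()})
    return events


# Hypothetical parameter declarations and separators.
PARAMETERS = {"timestamp": 0, "sensorID": 1, "sensorValue": 2}
RAW = "2010-01-01 08:00:00;M01;1|2010-01-01 08:00:05;M01;0|"

events = parse_events(RAW, PARAMETERS, parameter_separator=";", line_separator="|")
for event in events:
    print(event)
```

Because the separators are read from the dataset description rather than hard-coded, the same function can handle datasets that use commas, semicolons, or spaces.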

Figure 3-17. The schema diagram of the Dataset_Context element in SDDL (sub-elements: Parameters, with attribute count and sub-element Parameter with attributes name and index; Separators, with attributes parameter_separator and line_separator; Sensor_Event_Type)

. The element represents the actual sensor data collected from

the space. It also contains labeling information such as 'performed activity' and 'active

context'. The schema diagram of the element is shown in Figure 3-18.

The element contains sub element which denote a collection of sensor

data that may correspond to an activity under certain context. There can be multiple

instances of element for representing sensor data corresponding to different

activities. It has the following attributes:

* data. The numeric sensor data collected from space.
* activity_performed. The activity that was being performed during data collection.
* active_context. The activity context under which the data was collected.

experiments, there are several practical limitations. A few major issues are explained

below:

High Cost of Building Pervasive Spaces

To build a pervasive space with correct design and planning is very expensive. It

requires significant ground work and involvement of area specialists to create the smart

home with necessary devices and appliances. Not everybody has a large budget to

build such a space. Moreover, integrating and connecting various heterogeneous

devices and collecting data from the system are also very challenging [15].

Difficulty to Recruit Human Subjects

Human subjects are not always easy to find and recruit. Even when subjects are

available, they cannot be used to perform all the activities under all possible conditions

or contexts that a researcher wishes to consider. To ensure safety and protect against

abuse and exploitation, many governmental agencies and institutional review boards

restrict the length of time human subjects can be used in any research study [15].

Significant Time to Generate Data

It is usually very time consuming to generate adequate data for a meaningful

collection of events for all possible test scenarios. Depending on the goals of the

experiment, it can take weeks to generate activity data from a real deployment.

Difficulty to Modify the Physical Space

A physical space is not scalable in the sense that it cannot be extended easily by

adding/replacing sensors or changing the layout. Thus all the experiments that are

conducted have strong coupling with the hardware infrastructure of the space. Also, any

alteration to the experimental setup is time consuming and associated with financial

cost [16]. Additionally, once data is collected from the space, it is impossible to modify

the setup without repeating the experiment again.

Inability to Reproduce Experimental Data

Sometimes, a researcher may find that the collected amount of data is not adequate for

learning his/her activity model, so more data corresponding to the same activity

under the same experimental setup is needed. However, it is not possible to reproduce

similar activity data from the space without running the experiment again.

Challenges

In order to design a simulator that is free from the limitations of actual space

deployment and capable of generating human activities, we have faced a number of

challenges. A few of those are:

* How to define an actual space in terms of its intrinsic elements such as area,
sensors, actuators and extrinsic elements or events like 'a person is cooking',
'room temperature is changed from 60F to 62F', etc.?

* How to support the large set of diverse and heterogeneous elements of the space
and extend the utility of the simulator over time?

* How to define the semantics of an activity in terms of sensors?

* What parameters are essential for the simulation model to generate activity
events?

* How to represent the simulated dataset to the world in a way to foster collaborative
research?

Persim Architecture

We used the Model-View-Controller (MVC) pattern [23], a simple yet powerful

architectural pattern, to develop Persim. According to this pattern, the architecture is

divided into three components (model, view, and controller) so that there is a clean

separation among the components and one can be modified without affecting the

others. This feature allows Persim to be extended without much effort. Figure 4-1

into SDDL format to analyzing other's dataset in a formal way. Lastly, Persim has a

Project Renderer that captures the meta-data such as sensor information, space layout

etc. from an SDDL file and creates the corresponding simulation project that represents

the simulated environment of the actual space from where data was collected.

Organization of the Simulation Model

Persim is developed as a component-based simulator in order to accommodate

the various heterogeneous entities of the simulation environment. These components

are used for defining (i) space to be simulated in terms of layout and sensors/actuators,

(ii) activities to be simulated and (iii) simulation criteria/configuration. Each component is

characterized by several attributes. Now, we examine the major components of Persim

in some detail.

Space

This component captures the spatial aspects of the target simulated environment.

It is currently in a primitive stage, capturing only attributes such as space id and space

name, and giving no details on layouts, walkable and non-walkable areas, or the precise

relative locations of objects, sensors and actuators.

Sensor

The Sensor component denotes the physical sensors that provide contextual

information such as movement, illumination, humidity, etc. about the space. Motion

sensors, light sensors, humidity sensors, and contact sensors are examples of

sensors commonly used in a home setting.

We have identified several attributes that are required to simulate the sensor in the

virtual space of Persim. The major attributes are:

* Id. The identifier of the specific sensor instance.

* Name. The name of the specific sensor instance.

* Type. The type of the sensor is determined by the nature of its value generation.
Persim allows three types of sensors for simulation:

Independent sensor. A sensor which is triggered by the environment or nature.

A temperature sensor is an example of an independent sensor since it generates

temperature values regardless of any human activities around it. We also call the

occurrence of an independent sensor event a Time-driven event since it is generated

continuously over time. To simulate an independent sensor, Persim requires the following

additional attributes related to the process generating function that generates the sensor

event:

* Process generation type. Denotes whether the stepping function of the
independent sensor event is based on fixed interval or probabilistic distribution.

* Interval size. The interval size in milliseconds if the process generation function is
based on fixed interval.

* Process generation function. A function that defines the interval or stepping of
sensor data generation. In other words, it defines how the sensor will be propelled
with time. For example, the exponential distribution can be a process generation
function.

* Distribution parameters. Denotes the parameters of the distribution such as
mean, variance etc. depending on the type of the process generation function. For
example, the exponential distribution has a single parameter, its rate (the reciprocal of its mean).
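The process generation attributes above determine when an independent sensor reports. The following is a minimal sketch of that idea, not Persim's actual implementation: the function event_times, its parameter names, and the labels "fixed" and "exponential" are hypothetical. It steps a clock forward either by a fixed interval or by exponentially distributed gaps with a given mean.

```python
import random


def event_times(start_ms, end_ms, generation_type,
                interval_ms=None, mean_ms=None, rng=None):
    """Return the times (in ms) at which an independent sensor fires.

    generation_type: "fixed" uses interval_ms as the step;
    "exponential" draws each step from an exponential distribution
    whose mean is mean_ms (rate = 1 / mean_ms).
    """
    if generation_type == "fixed":
        if not interval_ms or interval_ms <= 0:
            raise ValueError("interval_ms must be positive")
    elif generation_type == "exponential":
        if not mean_ms or mean_ms <= 0:
            raise ValueError("mean_ms must be positive")
    else:
        raise ValueError(f"unknown generation type: {generation_type}")

    rng = rng or random.Random()
    times = []
    t = start_ms
    while True:
        if generation_type == "fixed":
            step = interval_ms
        else:
            step = rng.expovariate(1.0 / mean_ms)
        t += step
        if t > end_ms:
            break
        times.append(t)
    return times


fixed_times = event_times(0, 1000, "fixed", interval_ms=250)
random_times = event_times(0, 60000, "exponential", mean_ms=5000,
                           rng=random.Random(42))
print(fixed_times)
print(len(random_times), "probabilistic events generated")
```

Passing a seeded random.Random instance makes a probabilistic run repeatable, which is useful when the same synthetic data must be regenerated.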

Dependent sensor. A sensor which is triggered by any external activity or human

behavior. For example, pressure sensor is a dependent sensor since it only generates

output when an external pressure is applied on the sensor. We have identified a special

Usually, when we refer to dependent sensor in Persim, we actually mean non-object

dependent sensor.

Object sensor. A dependent sensor that triggers when any object attached to it is

being handled or operated. For example, an RFID tag on the TV remote control can

be an object sensor. The object sensor is different from other dependent sensors

because it has only two output values, namely 'value when detected' and 'value when

not detected', and its value generation process can be expressed as a Finite State

Machine (FSM). Figure 4-2 shows a simple two-state FSM for the object sensor. In this

figure, 'start' denotes the initial state and p_ii, p_ij, p_ji, and p_jj are the transition probabilities

that change the state of the object sensor over time.

Figure 4-2. Finite state machine for object sensor (states: 'when not detected' and 'when detected'; transition probabilities p_ii, p_ij, p_ji, p_jj; 'start' marks the initial state)
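The two-state finite state machine of the object sensor can be sketched as a small Markov chain. The code below is an illustrative sketch only: the function simulate_object_sensor and its parameter names are hypothetical, the machine is assumed to start in the 'not detected' state, and the probabilities of leaving a state are taken to be one minus the probability of staying in it.

```python
import random


def simulate_object_sensor(steps, p_stay_not_detected, p_stay_detected,
                           value_not_detected=0, value_detected=1, rng=None):
    """Simulate a two-state object sensor for a number of time steps.

    p_stay_not_detected plays the role of p_ii and p_stay_detected the
    role of p_jj; the switching probabilities are 1 - p_ii and 1 - p_jj.
    """
    rng = rng or random.Random()
    detected = False  # assumed initial state: 'not detected'
    values = []
    for _ in range(steps):
        stay = p_stay_detected if detected else p_stay_not_detected
        if rng.random() >= stay:  # leave the current state
            detected = not detected
        values.append(value_detected if detected else value_not_detected)
    return values


always_idle = simulate_object_sensor(5, 1.0, 1.0)
alternating = simulate_object_sensor(4, 0.0, 0.0)
typical = simulate_object_sensor(20, 0.9, 0.6, rng=random.Random(7))
print(always_idle)
print(alternating)
print(typical)
```

With both staying probabilities set to 1.0 the sensor never leaves its initial state, and with both set to 0.0 it toggles at every step, which gives two easy sanity checks for the transition logic.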

* Location id. Location identification for a specific sensor.

* Functional type. Type of the sensor based on the functionality. For example,
motion and pressure denote two different functional types of dependent sensors.

* Minimum value. The minimum value that a specific sensor can generate.

* Maximum value. The maximum value that a specific sensor can generate.

* Initial value. The initial value of a specific sensor before the experiment begins.

* Value generation function. A distribution function that characterizes the data
generated by the sensor. For example, the value generation function for a motion
sensor can be a binary valued function which generates value 1 when any
movement of a person is detected and 0 otherwise.

* Distribution parameters. Denotes the parameters of the distribution such as
mean, variance, etc. depending on the type of the value generation function. For
example, upper bound and lower bound are two parameters of the uniform
distribution function.

Actuator

The Actuator component denotes the actuators or actors in the space that change the

state of the environment by acting on sensors or some objects. Servos, electric motors,

etc. are a few examples of commonly used actuators. A great majority of existing

datasets use only sensors and no actuators. However, datasets like the iDorm dataset of

the University of Essex [18] use an actuator as another device deployed in the space. In

this case, the actuator component must be configured. Persim requires the following

attributes for actuators, which are similar to those of sensors:

* Id. The identifier of the specific actuator instance.

* Name. The name of the specific actuator instance.

* Location id. Location identification for a specific actuator.

* Functional type. Type of the actuator based on the functionality. For example,
servo and motor denote two different functional types of actuators.

* Minimum value. The minimum value that a specific actuator can generate.

* Maximum value. The maximum value that a specific actuator can generate.

* Initial value. The initial value of a specific actuator before the experiment starts.

* Value generation function. A distribution function that characterizes the data
generated by the actuator. For example, the value generation function for a servo
can be a uniform distribution which generates value from 0 to 360 degrees.

* Distribution parameters. Denotes the parameters of the distribution such as
mean, variance, etc. depending on the type of the value generation function. For
example, upper bound and lower bound are two parameters of uniform distribution
function.

Activity

Activity is an independent Persim event that captures a human activity (e.g.,

walking to kitchen, getting up, leaving house, etc.). It is the basic event that acts as a

stimulus and propels the simulation engine. Activities are generally sensed by several

sensors and are characterized by the following attributes:

* Id. The identifier of the specific activity instance.

* Name. The name of the specific activity instance.

* Type. The type/category of the specific activity such as housekeeping, food
preparation, etc.

* Interval type. Denotes whether the stepping/inter-arrival function of the activity is
based on fixed interval or probabilistic distribution.

* Interval size. The interval size in milliseconds if the inter-arrival function is based
on fixed interval.

* Inter-arrival function. A function that defines the interval or stepping of a
particular activity. It defines how the activity is generated with time. It is similar to
the process generation function of the independent sensor.

* Distribution parameter. Denotes the parameters of the distribution such as
mean, variance, etc. depending on the type of the inter-arrival function.

Activity-Sensor Mapping

An activity can be detected by one or more sensors. To define the semantics of

the activity in terms of corresponding sensors, an Activity-Sensor Mapping table is

required. The mapping captures the dependency of a specific activity on a specific set

of sensors by a causal relationship. Also, it is able to customize the ordering of sensor

trigger and minimum/maximum value of a sensor for a particular activity. The details of

this mapping are explained with an example in the Simulation Steps section. Figure 4-3

shows the concept of Activity-Sensor Mapping. The mapping function can be described

mathematically as $f_{\text{Act-Map}}: Act_i \rightarrow S$, where $Act_i$ is an activity and $S = \{S_1, S_2, \ldots, S_k\}$ is the

set of sensors that detect the activity.
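One simple way to picture the Activity-Sensor Mapping is as a lookup table from each activity to an ordered set of sensors, with optional per-activity value bounds. The sketch below is illustrative only: the table layout, the field names, the sensor identifiers, and the helper sensors_for are assumptions, with the 'Toileting' example loosely based on the sensors mentioned earlier for that activity.

```python
# Hypothetical mapping table: activity name -> sensor entries.
# 'order' is the trigger order; min_value/max_value bound the sensor
# output for that particular activity.
ACTIVITY_SENSOR_MAP = {
    "Toileting": [
        {"sensor": "sink_faucet_hot", "order": 2, "min_value": 0, "max_value": 1},
        {"sensor": "light_switch", "order": 1, "min_value": 0, "max_value": 1},
    ],
    "Preparing Lunch": [
        {"sensor": "freezer", "order": 1, "min_value": 0, "max_value": 1},
        {"sensor": "toaster", "order": 2, "min_value": 0, "max_value": 1},
        {"sensor": "cabinet", "order": 3, "min_value": 0, "max_value": 1},
    ],
}


def sensors_for(activity, mapping=ACTIVITY_SENSOR_MAP):
    """Return the sensors that detect an activity, in trigger order."""
    entries = sorted(mapping[activity], key=lambda entry: entry["order"])
    return [entry["sensor"] for entry in entries]


print(sensors_for("Toileting"))
print(sensors_for("Preparing Lunch"))
```

Storing the trigger order and value bounds per activity, rather than per sensor, lets the same sensor behave differently under different activities.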

Actuator-Sensor Mapping

This component needs to be configured only when the actual space has some

actuators. Actuators act as a domain manipulator of the sensors. Typically, single

sensor events such as values, ranges, etc., or the combination of multiple sensor

events, trigger an actuator. It then shifts the environment from one state to another

which again changes the value of some sensors.

Figure 4-3. Activity-Sensor mapping

An Actuator-Sensor Mapping table describes the relationship between the actuator

and sensors by two logistic functions: M1 and M2. The design of the Actuator-Sensor

Mapping table is shown in Figure 4-4. In the figure, M1 defines how a set of sensors

triggers an actuator and M2 defines how the triggered actuator changes the system state by

affecting another set of sensors. Mathematically, the combined mapping function can be

written as:

$f_{\text{Actu-Map}}: S \times Actu_i \rightarrow S'$ (4-1)

In Equation 4-1, $S$ is the set of sensors that affect the actuator $Actu_i$ and $S'$ is the

set of sensors that will be affected by the action of the actuator Actui. In Figure 4-4, S =

This component of the simulator manages all the basic configuration parameters

needed by the simulation engine. The parameters are listed with some details:

* Simulation start time. The starting time of the simulation.

* Simulation end time. The finishing time of the simulation.

* Activities to be simulated. The list of activities to be simulated.
The user can add all possible activities while defining the Activity component
mentioned earlier and choose a subset of them to be simulated.

Figure 4-4. Actuator-Sensor mapping

* Simulation mode. Persim supports two simulation modes: Activity-driven and
State space. In the Activity-driven mode, the user specifies the set of activities that
happen in the space. This mode activates the set of dependent sensors for all desired
activities based on the Activity-Sensor Mapping. In State space mode, Persim
generates a sequence of events reporting on all sensors (dependent and
independent) deployed in the space irrespective of any activity. In this mode, the
simulator records the state of the entire space, and the generated sensor events may
or may not include any information about occurring activities.

* Time-driven events to be simulated. Denotes the list of events that are
generated from independent sensors. The user can choose any number of
independent sensors from the available sensor list and simulate according to
his/her experimental requirement. Since these events are not triggered by any
activity, they are simulated based on the process generation function of the
corresponding sensor.
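The parameters listed above could be grouped into a single configuration object. The sketch below is one possible shape for it, assuming hypothetical names (`SimulationConfig`, `SimulationMode`, `validate`) that do not come from Persim itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Iterable, List


class SimulationMode(Enum):
    ACTIVITY_DRIVEN = "activity-driven"
    STATE_SPACE = "state-space"


@dataclass
class SimulationConfig:
    """Hypothetical container for the basic simulation parameters."""
    start_time: datetime
    end_time: datetime
    mode: SimulationMode
    activities: List[str] = field(default_factory=list)           # subset to simulate
    time_driven_sensors: List[str] = field(default_factory=list)  # independent sensors

    def validate(self, defined_activities: Iterable[str],
                 independent_sensors: Iterable[str]) -> None:
        """Raise ValueError if the configuration is inconsistent."""
        if self.end_time <= self.start_time:
            raise ValueError("end time must be after start time")
        unknown = set(self.activities) - set(defined_activities)
        if unknown:
            raise ValueError(f"activities not defined: {sorted(unknown)}")
        unknown = set(self.time_driven_sensors) - set(independent_sensors)
        if unknown:
            raise ValueError(f"not independent sensors: {sorted(unknown)}")
```

Validating the selected activities against the full list defined in the Activity component mirrors the "choose a subset" rule stated in the list above.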
Simulation Algorithm

event, time-advanced approach to capture the dynamic nature of the pervasive system

which evolves over time. According to this classical simulation technique, the target

state space changes whenever any event (either activity or time-driven) occurs and the

system variables of the space are updated based on the simulation logic. The flowchart

of the simulation algorithm is shown in Figure 4-5. The Persim simulation model consists of

the following major variables and routines:

* Simulation clock (simClk). A variable that keeps track of the current value of the
logical simulated time. The domain of the clock value is the set of timestamps of all
events. This simClk is always advanced only to the time of the next imminent
event.

* List of Events (eventList(event, eventTime)). A list of tuples of the form (event,
eventTime) where event denotes the simulation event and eventTime
denotes the time of the occurrence of that event. Each tuple represents an event
trigger which needs to be processed by the simulator.

* List of Activities (activityList). A list of activities to be simulated, selected by the
user for a specific simulation project.

* List of contexts (contextList). A list of independent sensors selected by the user
that act as "contexts" for the simulation.

* Initialization Routine. A method to initialize simClk and eventList before the
simulation loop starts.

* Timing Routine. A method that determines the next event from the eventList and
advances the simClk to the time of occurrence of the selected event.

* Event Routine. A method to process the current event based on its type. It also
determines the time of occurrence of an event based on the library routine and
reschedules it as needed.

* Library Routine. Consists of several methods that serve as utility functions for the
simulation loop. These functions are used to generate random variables from the
probability distribution of the inter-arrival time of an event and generation function
of sensors (as specified by the user and configured by Persim).

* Simulation Loop. The main program that invokes the initialization routine, timing
routine and event routine and that propels the simulation of the target space by
processing and generating/scheduling new events for the entire lifecycle of the
simulation.
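The variables and routines listed above follow the usual next-event time-advance pattern, which can be sketched in a few lines. This is a simplified illustration and not Persim's code: the function name `run_simulation`, the use of a heap for the event list, the exponential inter-arrival distribution, and the numeric clock are all assumptions, and the difference between the two simulation modes is not modeled.

```python
import heapq
import random
from typing import Dict, List, Tuple


def run_simulation(event_rates: Dict[str, float], end_time: float,
                   seed: int = 0) -> List[Tuple[float, str]]:
    """Simulate recurring events until the end time; return (time, event) pairs.

    event_rates maps an event name to its mean arrivals per time unit
    (an assumed parameterization used only for this sketch).
    """
    rng = random.Random(seed)

    # Library routine: draw an inter-arrival time (exponential is an assumption).
    def next_gap(event: str) -> float:
        return rng.expovariate(event_rates[event])

    # Initialization routine: reset the clock and schedule one occurrence per event.
    sim_clk = 0.0
    event_list: List[Tuple[float, str]] = []
    for event in sorted(event_rates):
        heapq.heappush(event_list, (sim_clk + next_gap(event), event))

    log: List[Tuple[float, str]] = []
    # Simulation loop.
    while event_list:
        # Timing routine: pick the most imminent event.
        event_time, event = heapq.heappop(event_list)
        if event_time > end_time:
            break  # every remaining event is later still, so stop here
        sim_clk = event_time  # advance the clock to that event
        # Event routine: process (here, just record) and reschedule the event.
        log.append((sim_clk, event))
        heapq.heappush(event_list, (sim_clk + next_gap(event), event))
    return log
```

A heap keeps the event list ordered by time, so the timing routine is a single pop. The sketch stops before processing any event past the end time instead of letting the clock overshoot it.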

The simulation starts from the empty and idle state. First, it invokes the

Initialization Routine to initialize the simClk and eventList. Then, the Timing Routine is

invoked to find the most imminent (next) event from the eventList. It then advances the

simClk to the time of occurrence of the next event. It processes the event in the Event

Routine according to the mode of simulation, whether it is Activity-driven or State space.

The event routine also reschedules the current event to propel the simulation loop. The

simulation continues until the simClk exceeds the designated simulation end time at

Daily Living (ADLs) which are considered as basic functionalities for independent living.

Our verification process focused on the following activities:

* Make a phone call (T1). The participants are asked to look up a specified number
in a phonebook, call the number, and write down the cooking directions spoken by
the recorded message. The phonebook, notepad, and telephone are located on
the dining room table.

* Wash hands (T2). For this activity, participants are told to wash their hands in the
kitchen sink using the soap and paper towels provided.

* Clean (T5). The cleaning activity required participants to clean the dishes and put
the medicine bottle and other materials back into a cabinet.

We created a simulated environment in Persim similar to that of the CASAS

smart home and generated 40-50 instances of simulated data for each of these three

tasks. To verify the accuracy of Persim, the verification approach used the

sensor data to construct a fuzzy logic based system (FLS) for both the simulated and real

datasets. The FLS models the given process and generates the mapping as a set of rules from

the data (real or simulated) to the correct activities/tasks. The motivation behind this

approach is that, using only the data, we can generate fuzzy models which can be

easily interpreted by the researcher [3], [15].

The verification process is accomplished in four phases and includes both black

learning. Persim can be extended to annotate both actual and simulated data to
assist their research effort.

* Include a verification tool. Including the fuzzy-based verification tool in Persim
can help the researcher assess the accuracy of their simulated data against
similar real data. The verification tool can take one simulated dataset in SDDL and
a corresponding reference dataset in SDDL and act as a standalone tool for
generating the percentage of accuracy. If the result is below some threshold, then
the researcher can modify the simulation environment to generate data that is
closer to the actual deployment.

* Add space realism. Currently, Persim has rudimentary space definitions with
a simple space layout and no relative locations among objects, sensors and
actuators. Adding a higher level of realism would allow the simulator to capture more
detail of the space and use that information to create more realistic simulations of
activities. Space realism can be achieved through the use of space templates such
as a single family home, an apartment, or an assisted living facility.

* Support rules for actuator mapping. Usually, an actuator is triggered by certain
sensor events such as values, ranges or combinations of sensor events. An actuation,
on the other hand, could affect the environment and cause a change in some
sensor values (for example, turning a light on may affect photo sensors). Realizing
actuators in the space can allow Persim to create diverse scenarios where
pervasive application logic such as 'activate alarm on potential danger' or
'automatic lighting' is enabled. For this, we need to map actuators to sensors
(described in Chapter 4) to define the causal relationship among them. Persim can
allow rule-based mapping that helps the researcher to easily simulate desired
scenarios. The rules should be able to define any combination of actuators and
sensors.

* Apply FSM to model object sensors. To model object sensors more accurately,
a Finite State Machine (FSM) (described in Chapter 4) can be added to Persim. The
interface should allow the user to create an FSM for each object sensor with
minimum effort. In the simulation algorithm, Persim will maintain the state of each
object sensor and transition from one state to another based on the specified FSM.

* Support automatic project generation from SDDL. In designing Persim, one of
our objectives was to import an SDDL file from any real deployment and generate the
simulation project automatically with minimum input from the user. This feature
can be implemented so that the researcher can easily analyze and generate data with
slight variations to satisfy his/her research goals.
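The FSM extension proposed in the list above could look roughly like the following. It is a speculative sketch of a feature the text describes only as future work: the `ObjectSensorFSM` class, the cabinet-door states, and the event names are hypothetical, and the FSM description in Chapter 4 may differ.

```python
from typing import Dict, Tuple

# (current state, event) -> next state
Transitions = Dict[Tuple[str, str], str]


class ObjectSensorFSM:
    """Tracks the state of one object sensor (hypothetical design)."""

    def __init__(self, initial_state: str, transitions: Transitions) -> None:
        self.state = initial_state
        self.transitions = dict(transitions)

    def fire(self, event: str) -> str:
        """Apply an event; events with no matching transition leave the state as is."""
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state


# Illustrative transition table for a contact sensor on a cabinet door.
CABINET_DOOR: Transitions = {
    ("closed", "open_door"): "open",
    ("open", "close_door"): "closed",
}
```

Ignoring events that have no transition from the current state keeps the simulated sensor from reporting impossible sequences, such as a door opening twice in a row.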

This is an example SDDL file generated by the SDDL Converter for the 't03.t2' data file

collected from the WSU Apartment Testbed [17] for normal ADL activities.

date to= "02-29-2008" date from= "02-29-2008">
phone= "" organization= "Washington State University" name="Dr. Diane J. Cook" email=""/>
The experiment is performed in WSU Smart Apartment is part of the ongoing
CASAS smart home project at WSU. The CASAS project treats environments as intelligent agents, where
the status of the residents and their physical surroundings are perceived using sensors and the
environment is acted upon using controllers in a way that improves the comfort, safety, and/or productivity
of the residents
These datasets represent sensor events collected in the WSU smart
apartment testbed. The data represents participants performing five ADL activities in the
apartment.
The apartment is a three-bedroom apartment located on the Washington State
University campus. It includes three bedrooms, one bathroom, a kitchen, and a living / dining room. The
apartment is equipped with motion sensors distributed approximately 1 meter apart throughout the
space.
location id="" id="AD1-B"/>
location id="" id="M18"/>
location id="" id="M17"/>
location id="" id="M16"/>
location id="" id="M15"/>
location id="" id="M14"/>
location id="" id="M13"/>

xsi:type= "Dataset_ Contexts">

Single_Sensor_per_Event

Example of an SDDL file of Persim Simulation

This SDDL file is generated by Persim which corresponds to the scenario

BIOGRAPHICAL SKETCH

Shantonu Hossain was born in 1984 in Bangladesh. She completed her Bachelor of Science degree in computer science and engineering in July 2007 from Bangladesh University of Engineering and Technology, Dhaka, Bangladesh. Then she worked as a software engineer in Spectrum Engineering Consortium Ltd., Dhaka for two years. She continued her studies at the University of Florida in the Department of Computer and Information Science and Engineering. She received her Master of Science degree in August 2010. She will be joining the PhD program in the Department of Computer Science at the University of Rochester in September 2010. Her major research interests are pervasive computing, distributed systems and distributed computing.