TY - CONF
T1 - CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation on Relations
T2 - The 13th IEEE International Conference on Semantic Computing
Y1 - 2019
A1 - Saeedeh Shekarpour
A1 - Faisal Alshargi
A1 - Krishnaprasad Thirunarayan
A1 - Valerie L. Shalin
A1 - Amit Sheth
KW - abstract ontology
KW - cevo
KW - cognitive annotation
KW - event ontology
KW - relation annotation
AB - While the general analysis of named entities has received substantial research attention on unstructured as well as structured data, the analysis of relations among named entities has received limited focus. In fact, a review of the literature revealed a deficiency in research on the abstract conceptualization required to organize relations. We believe that such an abstract conceptualization can benefit various communities and applications such as natural language processing, information extraction, machine learning, and ontology engineering. In this paper, we present the Comprehensive EVent Ontology (CEVO), built on Levin’s conceptual hierarchy of English verbs, which categorizes verbs with shared meaning and syntactic behavior. We present the fundamental concepts and requirements for this ontology. Furthermore, we present three use cases employing the CEVO ontology on annotation tasks: (i) annotating relations in plain text, (ii) annotating ontological properties, and (iii) linking textual relations to ontological properties. These use cases demonstrate the benefits of using CEVO for annotation: (i) annotating English verbs from an abstract conceptualization, (ii) playing the role of an upper ontology for organizing ontological properties, and (iii) facilitating the annotation of text relations using any underlying vocabulary. This resource is available at https://shekarpour.github.io/cevo.io/ using the https://w3id.org/cevo namespace.
JA - The 13th IEEE International Conference on Semantic Computing
PB - IEEE
CY - Newport Beach, California
ER -
TY - CONF
T1 - empathi: An ontology for Emergency Managing and Planning about Hazard Crisis
T2 - 13th IEEE International Conference on Semantic Computing
Y1 - 2019
A1 - Manas Gaur
A1 - Saeedeh Shekarpour
A1 - Amelie Gyrard
A1 - Amit Sheth
KW - Crisis Management
KW - Disaster Management
KW - Emergency
KW - Hazard Domain
KW - Knowledge Reuse
KW - Ontology
KW - Ontology Quality
KW - Vocabularies
AB - In the domain of emergency management during hazard crises, having sufficient situational awareness information is critical. It requires capturing and integrating information from sources such as satellite images, local sensors, and social media content generated by local people. A major obstacle to capturing, representing, and integrating such heterogeneous and diverse information is the lack of an ontology that properly conceptualizes this domain and aggregates and unifies datasets. Thus, in this paper, we introduce the empathi ontology, which conceptualizes the core concepts describing the domain of emergency managing and planning of hazard crises. Although empathi has a coarse-grained view, it considers the concepts and relations that are essential in this domain. This ontology is available at https://w3id.org/empathi/.
JA - 13th IEEE International Conference on Semantic Computing
PB - IEEE
CY - Newport Beach, California
ER -
TY - CONF
T1 - Knowledge-aware Assessment of Severity of Suicide Risk for Early Intervention
T2 - The Web Conference 2019
Y1 - 2019
A1 - Manas Gaur
A1 - Amanuel Alambo
A1 - Joy P. Sain
A1 - Ugur Kursuncu
A1 - Krishnaprasad Thirunarayan
A1 - Ramakanth Kavuluru
A1 - Amit Sheth
A1 - Randon S. Welton
A1 - Jyotishman Pathak
KW - Surveillance and Behavior Monitoring
KW - Reddit
KW - Mental Health
KW - Suicide Risk Assessment
KW - C-SSRS
KW - Medical Knowledge Bases
KW - Perceived Risk Measure
KW - Semantic Social Computing
AB - Mental health illness such as depression is a significant risk factor for suicide ideation, behaviors, and attempts. A report by the Substance Abuse and Mental Health Services Administration (SAMHSA) shows that 80% of the patients suffering from Borderline Personality Disorder (BPD) have suicidal behavior, 5-10% of whom commit suicide. While multiple initiatives have been developed and implemented for suicide prevention, a key challenge has been the social stigma associated with mental disorders, which deters patients from seeking help or sharing their experiences directly with others, including clinicians. This is particularly true for teenagers and younger adults, for whom suicide is the second highest cause of death in the US. Prior research involving surveys and questionnaires (e.g. PHQ-9) for suicide risk prediction failed to provide a quantitative assessment of risk that informed timely clinical decision-making for intervention. Our interdisciplinary study concerns the use of Reddit as an unobtrusive data source for gleaning information about suicidal tendencies and other related mental health conditions afflicting depressed users. We provide details of our learning framework that incorporates domain-specific knowledge to predict the severity of suicide risk for an individual. Our approach involves developing a suicide risk severity lexicon using medical knowledge bases and suicide ontology to detect cues relevant to suicidal thoughts and actions. We also use language modeling, medical entity recognition, and normalization and negation detection to create a dataset of 2181 redditors who have discussed or implied suicidal ideation, behavior, or attempt. Given the importance of clinical knowledge, our gold standard dataset of 500 redditors (out of 2181) was developed by four practicing psychiatrists following the guidelines outlined in the Columbia Suicide Severity Rating Scale (C-SSRS), with a pairwise annotator agreement of 0.79 and a group-wise agreement of 0.73.
Compared to the existing four-label classification scheme (no risk, low risk, moderate risk, and high risk), our proposed C-SSRS-based 5-label classification scheme distinguishes people who are supportive from those who show different severities of suicidal tendency. Our 5-label classification scheme outperforms the state-of-the-art schemes by improving the graded recall by 4.2% and reducing the perceived risk measure by 12.5%. A convolutional neural network (CNN) provided the best performance in our scheme due to the discriminative features and use of domain-specific knowledge resources, in comparison to SVM-L, which has been used in state-of-the-art tools over a similar dataset.
JA - The Web Conference 2019
PB - Association for Computing Machinery
CY - San Francisco, CA, USA
UR - http://knoesis.org/sites/default/files/Suicide_Paper.pdf
ER -
TY - CONF
T1 - Metrics for Evaluating Quality of Embeddings for Ontological Concepts
T2 - AAAI 2019 Spring Symposium on Combining Machine Learning with Knowledge Engineering
Y1 - 2019
A1 - Faisal Alshargi
A1 - Saeedeh Shekarpour
A1 - Tommaso Soru
A1 - Amit Sheth
KW - Embedding
KW - knowledge graph
KW - Metric
KW - Quality
AB - Although there is an emerging trend towards generating embeddings for primarily unstructured data and, recently, for structured data, no systematic suite for measuring the quality of embeddings has been proposed yet. This deficiency is felt more strongly for embeddings generated for structured data because there are no concrete evaluation metrics measuring the quality of the encoded structure as well as semantic patterns in the embedding space. In this paper, we introduce a framework containing three distinct tasks concerned with the individual aspects of ontological concepts: (i) the categorization aspect, (ii) the hierarchical aspect, and (iii) the relational aspect. Then, in the scope of each task, a number of intrinsic metrics are proposed for evaluating the quality of the embeddings. Furthermore, w.r.t. this framework, multiple experimental studies were run to compare the quality of the available embedding models. Employing this framework in future research can reduce misjudgment and provide greater insight about quality comparisons of embeddings for ontological concepts. We have made our sampled data and code available at https://github.com/alshargi/Concept2vec under the GNU General Public License v3.0.
JA - AAAI 2019 Spring Symposium on Combining Machine Learning with Knowledge Engineering
PB - AAAI
CY - Palo Alto, California
ER -
TY - CONF
T1 - Multimodal Emotion Classification
T2 - 2019 World Wide Web Conference
Y1 - 2019
A1 - Anurag Illendula
A1 - Amit Sheth
KW - Emoji Understanding
KW - Emotion Classification
KW - Multimodal Analysis
AB - Most NLP and Computer Vision tasks are limited by the scarcity of labelled data. In social media emotion classification and other related tasks, hashtags have been used as indicators to label data. With the rapid increase in emoji usage on social media, emojis are used as an additional feature for major social NLP tasks. However, this is less explored in the case of multimedia posts on social media, where posts are composed of both image and text. At the same time, we have seen a surge of interest in incorporating domain knowledge to improve machine understanding of text. In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emotion classification task. We exploit the importance of different modalities from social media posts for the emotion classification task using state-of-the-art deep learning architectures. Our experiments demonstrate that the three modalities (text, emoji, and images) encode different information to express emotion and therefore can complement each other. Our results also demonstrate that emoji sense depends on the textual context, and that emoji combined with text encode better information than either considered separately. The highest accuracy of 71.98% is achieved with training data of 550k posts.
JA - 2019 World Wide Web Conference
CY - San Francisco, CA, USA
ER -
TY - CONF
T1 - Question Answering for Suicide Risk Assessment using Reddit
T2 - 13th IEEE International Conference on Semantic Computing
Y1 - 2019
A1 - Amanuel Alambo
A1 - Manas Gaur
A1 - Usha Lokala
A1 - Ugur Kursuncu
A1 - Krishnaprasad Thirunarayan
A1 - Amelie Gyrard
A1 - Randall Hand
A1 - Jyotishman Pathak
A1 - Amit Sheth
KW - C-SSRS
KW - Reddit
KW - Semantic Machine Learning
KW - Semantic Social Computing
KW - Suicide Risk Assessment
KW - The Diagnostic and Statistical Manual of Mental Disorders – Fifth Edition (DSM-5)
KW - Web-based Intervention
AB - Mental Health America designed ten questionnaires that are used to determine the risk of mental disorders. They are also commonly used by Mental Health Professionals (MHPs) to assess suicidality. Specifically, the Columbia Suicide Severity Rating Scale (C-SSRS), a widely used suicide assessment questionnaire, helps MHPs determine the severity of suicide risk and offer appropriate treatment. A major challenge in suicide treatment is the social stigma that makes the patient reluctant to discuss his/her conditions with an MHP, which leads to inaccurate assessment and treatment of patients. On the other hand, the same patient is comfortable freely discussing his/her mental health condition on social media due to the anonymity of platforms such as Reddit, and the ability to control what, when, and how to share. The popular “SuicideWatch” subreddit has been widely used among individuals who experience suicidal thoughts, and provides significant cues for suicidality. The timeliness in sharing thoughts, the flexibility in describing feelings, and the interoperability in using medical terminologies make Reddit an important platform to be utilized as a complementary tool to the conventional healthcare system. As MHPs develop an implicit weighting scheme over the questionnaire (i.e., C-SSRS) to assess suicide risk severity, creating a relative weighting scheme for automatically generated answers to the questions in the questionnaire poses a key challenge. In this interdisciplinary study, we position our approach towards a solution for an automated suicide risk elicitation framework through a novel question answering mechanism. Our two-fold approach benefits from using: 1) semantic clustering, and 2) sequence-to-sequence (Seq2Seq) models. We also generate a gold standard dataset of suicide posts with their risk levels.
This work forms a basis for the next step of building conversational agents that elicit suicide-related natural conversation based on questions.
JA - 13th IEEE International Conference on Semantic Computing
PB - IEEE
CY - Newport Beach, California
ER -
TY - Generic
T1 - Augmented Personalized Health: Using Semantically Integrated Multimodal Data for Patient Empowered Health Management Strategies
Y1 - 2018
A1 - Amit Sheth
A1 - Hong Yung Yip
A1 - Utkarshani Jaimini
A1 - Dipesh Kadariya
A1 - Vaikunth Sridharan
A1 - Revathy Venkataramanan
A1 - Tanvi Banerjee
A1 - Krishnaprasad Thirunarayan
A1 - Maninder Kalra
KW - Augmented Personalised Health
KW - mHealth
KW - Patient Evaluations
AB - Healthcare as we know it is in the process of going through a massive change from: 1. Episodic to continuous 2. Disease-focused to wellness and quality of life focused 3. Clinic-centric to anywhere a patient is 4. Clinician controlled to patient empowered 5. Being driven by limited data to 360-degree, multimodal personal-public-population physical-cyber-social big data-driven
CY - mHealth Technology Showcase, National Institutes of Health, June 2018
ER -
TY - JOUR
T1 - Building IoT based applications for Smart Cities: How can ontology catalogs help?
JF - IEEE Internet of Things Journal
Y1 - 2018
A1 - Amelie Gyrard
A1 - Antoine Zimmermann
A1 - Amit Sheth
KW - Internet of Things
KW - Interoperability
KW - Knowledge Directory
KW - Ontologies
KW - Ontology Best Practices
KW - Ontology Catalogs
KW - Ontology Improvement
KW - ontology validation
KW - Reusable Knowledge
KW - Semantic Data Interoperability
KW - Semantic Web
KW - Semantic Web Technologies
KW - Semantics
KW - Semantics-based Smart Cities
KW - Sensors
KW - smart cities
AB - The Internet of Things (IoT) plays an ever-increasing role in enabling Smart City applications. An ontology-based semantic approach can help improve interoperability between a variety of IoT-generated as well as complementary data needed to drive these applications. While multiple ontology catalogs exist, using them for IoT and smart city applications requires a significant amount of work. In this paper, we demonstrate how ontology catalogs can be used more effectively to design and develop smart city applications. We consider four ontology catalogs that are relevant for IoT and smart cities: READY4SmartCities, LOV, OpenSensingCity (OSC), and LOV4IoT. To support semantic interoperability with the reuse of ontology-based smart city applications, we present a methodology to enrich ontology catalogs with those ontologies. Our methodology is generic enough to be applied to any other domain, as is demonstrated by its adoption by the OSC and LOV4IoT ontology catalogs. Researchers and developers have completed a survey-based evaluation of the LOV4IoT catalog. The usefulness of ontology catalogs ascertained through this evaluation has encouraged their ongoing growth and maintenance. The quality of IoT and smart city ontologies has been evaluated to improve the ontology catalog quality. We also share the lessons learned regarding ontology best practices and provide suggestions for ontology improvements with a set of software tools.
PB - IEEE
ER -
TY - CONF
T1 - Concept Extraction from the Web of Things Knowledge Bases
T2 - International Conference WWW/Internet 2018
Y1 - 2018
A1 - Mahda Noura
A1 - Amelie Gyrard
A1 - Sebastian Heil
A1 - Martin Gaedke
KW - concept extraction
KW - Internet of Things
KW - iot ontologies
KW - Knowledge Bases
KW - Web of Things
AB - Semantic web technologies are a major driver for semantic interoperability in IoT-generated data by using shared vocabularies in an ontology-driven approach. While there is a growing interest in standardization of ontologies for IoT, there is still a lack of common agreement on a specific IoT ontology. Numerous concepts and relations have been designed within existing ontologies to handle different features of IoT data. However, there are many redundant and overlapping concepts designed within existing standardizations and groups. We found that new ontologies constantly redesign the same concepts in IoT. Therefore, it is a challenge to reuse and unify these different IoT ontologies with redundant concepts. In this paper, we investigate which terms are most used within IoT ontologies. We identify the fourteen most popular ontologies within the generic IoT and WoT domains. Analysis of popular concepts among these ontologies allows the knowledge to be ranked automatically. This work will help guide ontology engineers to reuse and unify existing ontologies, a required step to achieve semantic interoperability. Moreover, this work could contribute towards building iot.schema.org.
JA - International Conference WWW/Internet 2018
PB - Elsevier
CY - 21-23 October 2018 Budapest, Hungary
ER -
TY - Generic
T1 - Correlating Multimodal Signals with Asthma Control in Children Using kHealth Personalized Digital Health System
Y1 - 2018
A1 - Amit Sheth
A1 - Tanvi Banerjee
A1 - Utkarshani Jaimini
A1 - Dipesh Kadariya
A1 - Vaikunth Sridharan
A1 - Krishnaprasad Thirunarayan
A1 - Revathy Venkataramanan
A1 - Hong Yung Yip
A1 - Maninder Kalra
AB - In order to manage a multifactorial disease like asthma, it is important to be able to measure and analyze the vast amount of multimodal data as each different factor affects a patient’s asthma symptoms differently. Technology can assist in identifying personalized risk factors for each patient, allowing us to compute the impact of factors such as environmental conditions on asthma control, as well as their vulnerability score. Future work seeks to evaluate whether this approach can support improved self management and adherence to clinical guidance.
CY - In proceedings of the American Thoracic Society (ATS) International Conference
CP - American Thoracic Society (ATS) International Conference
ER -
TY - Generic
T1 - Creating Real-Time Dynamic Knowledge Graphs
Y1 - 2018
A1 - Swati Padhee
A1 - Sarasi Lalithsena
A1 - Amit Sheth
KW - Dynamic Knowledge Graphs
CY - International Semantic Web Research School (ISWS) 2018, Bertinoro, Italy
ER -
TY - CONF
T1 - Demo Track Chairs' Welcome and Organization
T2 - Companion Proceedings of the The Web Conference 2018
Y1 - 2018
A1 - Paul Groth
A1 - Amelie Gyrard
KW - demo
KW - demonstrations
KW - Web
AB - The Demo Track is one of the most exciting parts of any Web Conference. It allows researchers and practitioners to demonstrate new systems in an engaging and hands-on manner to the community. The Web has been driven forward by building systems and technology. The demo track is a venue that encourages this important type of result. This year the track received 71 submissions, of which 30 were accepted, for a 42% acceptance rate. We had a comprehensive review procedure that looked at a number of dimensions including the novelty of the demo, its fit with the conference, its research content, and its potential for audience engagement. We were pleased by the number of submissions that included links to online demonstrations and/or videos. This gave reviewers additional information about how the demo would be presented. Overall, we had 232 reviews across all submissions. Many of the reviews provided not only expert judgement but also ways in which the submissions could be improved. It is often difficult to judge demonstrations as there are multiple factors to be taken into account. We want to thank the entire committee for taking the time to support the track. The resulting set of selected demos reflects the wide variety of technology and research interests impacting the Web. Demonstrations cover topics such as using data on the web, the integration of the web and the physical world, knowledge graphs, search engines, security and privacy, and dealing with multimedia data. We believe that these demos provide an exciting taste of the future of the Web.
JA - Companion Proceedings of the The Web Conference 2018
PB - International World Wide Web Conferences Steering Committee
CY - Republic and Canton of Geneva, Switzerland
SN - 978-1-4503-5640-4
ER -
TY - CHAP
T1 - Domain-specific Use Cases for Knowledge-enabled Social Media Analysis
T2 - Emerging Research Challenges and Opportunities in Social Network Analysis and Mining
Y1 - 2018
A1 - Soon Jye Kho
A1 - Swati Padhee
A1 - Goonmeet Bajaj
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - domain knowledge
KW - Drug Abuse Epidemiology
KW - Emoji Sense Disambiguation
KW - Implicit Entity Recognition
KW - knowledge graph
KW - Language Understanding
KW - Machine intelligence
KW - Mental Health Disorder
KW - Social Media
AB - Social media provides a virtual platform for users to share and discuss their daily life, activities, opinions, health, feelings, etc. Such personal accounts readily generate Big Data marked by velocity, volume, value, variety, and veracity challenges. This type of Big Data analytics already supports useful investigations ranging from research into data mining and developing public policy to actions targeting an individual in a variety of domains such as branding and marketing, crime and law enforcement, crisis monitoring and management, as well as public and personalized health management. However, using social media to solve domain-specific problems is challenging due to the complexity of the domain, lack of context, the colloquial nature of language, and changing topic relevance in temporally dynamic domains. In this article, we discuss the need to go beyond data-driven machine learning and natural language processing, and incorporate deep domain knowledge as well as knowledge of how experts and decision makers explore and perform contextual interpretation. Four use cases are used to demonstrate the role of domain knowledge in addressing each challenge.
JA - Emerging Research Challenges and Opportunities in Social Network Analysis and Mining
PB - Springer
CY - Dayton
ER -
TY - CONF
T1 - D-record: Disaster Response and Relief Coordination Pipeline
T2 - Proceedings of the ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities (ARIC 2018)
Y1 - 2018
A1 - Shruti Kar
A1 - Hussein S. Al-Olimat
A1 - Krishnaprasad Thirunarayan
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - Srinivasan Parthasarathy
KW - disaster relief
KW - flood mapping
KW - location-centric processing
KW - need matching
AB - We employ multi-modal data (i.e., unstructured text, gazetteers, and imagery) for location-centric demand/request matching in the context of disaster relief. After classifying the Need expressed in a tweet (the WHAT), we leverage OpenStreetMap to geolocate that Need on a computationally accessible map of the local terrain (the WHERE) populated with location features such as hospitals and housing. Further, our novel use of flood mapping based on satellite images of the affected area supports the elimination of candidate resources that are not accessible by road transportation. The resulting map-based visualization combines disaster-related tweets, imagery and pre-existing knowledge-base resources (gazetteers) to reduce decision-making latency and enhance resiliency by assisting individual decision-makers and first responders for relief effort coordination.
JA - Proceedings of the ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities (ARIC 2018)
PB - ACM
CY - Seattle, Washington
ER -
TY - RPRT
T1 - eDarkTrends update: New synthetic opioids identified on two cryptomarkets.
Y1 - 2018
A1 - Francois R. Lamy
A1 - Raminta Daniulaityte
A1 - Amit Sheth
A1 - Robert Carlson
A1 - Ramzi W. Nahhas
A1 - Monica Barratt
A1 - Usha Lokala
KW - Crypto Market
KW - Fentanyl
KW - Synthetic opioids
AB - New synthetic opioids emerge periodically, circumventing existing administrative bans and fueling the opiate/opioid crisis with unknown substances. In an effort to inform public health practitioners on new emerging substances in a timely manner, the eDarkTrends project aims to collect and analyze synthetic opioid-related data from Darknet cryptomarkets. Recently notified through the NDEWS network, new substances (U-48,800, ortho-Methylmethoxyacetylfentanyl, Despropionyl ortho-Methylfentanyl) were identified by the Organized Crime Drug Enforcement Task Force (OCDETF). Based on these recent developments, we would like to share some preliminary results from the eDarkTrends project.
ER -
TY - CONF
T1 - Enhancing crowd wisdom using explainable diversity inferred from social media
T2 - IEEE/WIC/ACM International Conference on Web Intelligence
Y1 - 2018
A1 - Shreyansh Bhatt
A1 - Manas Gaur
A1 - Beth Bullemer
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - Brandon Minnery
KW - Collective Intelligence
KW - DBpedia
KW - Diversity
KW - Fantasy Sports
KW - Semantic Analysis
KW - twitter
KW - Wisdom of Crowds
AB - A crowd sampled from a set of individuals can provide a more accurate prediction in aggregate than most individuals. This effect, referred to as wisdom of the crowd, exists when crowd members bring diverse perspectives to decision making. Such diversity leads to uncorrelated prediction errors that cancel out in aggregate. As crowd members’ judgments are often the result of solution strategies, diversity in solution strategies can enhance crowd wisdom. One of the most challenging tasks in sampling such a crowd is to determine the individual’s solution strategy for a prediction problem. As participating individuals often share their perspectives through social media, we can use such data to identify an individual’s solution strategy. In this paper, we propose a crowd selection approach using social media posts (tweets) indicating diverse solution strategies. We use tweet classification to identify participants’ prediction strategies and categorize participants based on the binomial test to identify sets of participants that apply a similar strategy. We then form a diverse crowd by sampling participants from different sets. Using the domain of Fantasy Sports, we show that such a diverse crowd can outperform a crowd selected at random, 90% of individual participants, and participant categorization schemes using word2vec. Further, we use a knowledge graph to investigate the factors forming such a diverse crowd and how these factors can lead to a better decision. Relative to bottom-up (data-driven) processes, the approach presented here provides an explanation of diverse crowd behavior.
JA - IEEE/WIC/ACM International Conference on Web Intelligence
PB - IEEE
CY - Santiago, Chile
ER -
TY - Generic
T1 - Feasibility of Recording Sleep Quality And Sleep Duration Using Fitbit in Children with Asthma
Y1 - 2018
A1 - Amit Sheth
A1 - Hong Yung Yip
A1 - Utkarshani Jaimini
A1 - Dipesh Kadariya
A1 - Vaikunth Sridharan
A1 - Revathy Venkataramanan
A1 - Tanvi Banerjee
A1 - Krishnaprasad Thirunarayan
A1 - Maninder Kalra
AB - Sleep disorders are common in children with asthma and are increasingly implicated in poor asthma control. Smart wearables such as the Fitbit wristband allow monitoring of users’ sleep duration and quality in their natural surroundings. However, the utility and efficacy of using such wearable devices to monitor sleep in pediatric patients with asthma have not been well established. Thus, the objective of this study is to demonstrate the feasibility of recording sleep quality and sleep duration using Fitbit in children with asthma. We evaluated the sleep characteristics of 34 children (recruited from Dayton Children’s Hospital for a period of one month or three months) with varying levels of asthma severity (mild, moderate, and severe persistent) by age group (19 pre-teens of age 5 to 12 and 15 teens of age 13 to 17) in terms of average (i) time in bed, (ii) sleep time, (iii) REM sleep, (iv) light sleep, and (v) deep sleep. We observed that, on average, these children spent most of their sleep time in light sleep (58.5%), followed by REM sleep (21.7%) and deep sleep (19.8%). The sleep efficiency (total sleep time over time in bed) was higher in pre-teens (90%) than in teens (88%), with an overall efficiency of 89%. In addition, teenagers spent significantly more time asleep (p=0.03) on the weekends than on week nights. These results correlated well with polysomnography-based normative data in children. Our findings supported the potential use of wrist-worn devices to continuously monitor sleep duration and quality in children with asthma to allow for better evaluation of the effect of sleep on asthma outcomes in children.
CY - 32nd Annual Meeting of the Associated Professional Sleep Societies (SLEEP), 2-6 June 2018, Baltimore, MD
ER -
TY - CHAP
T1 - Feature Engineering for Twitter-based Applications
T2 - Feature Engineering for Machine Learning and Data Analytics
Y1 - 2018
A1 - Sanjaya Wijeratne
A1 - Amit Sheth
A1 - Shreyansh Bhatt
A1 - Lakshika Balasuriya
A1 - Hussein S. Al-Olimat
A1 - Manas Gaur
A1 - Amir Hossein Yazdavar
A1 - Krishnaprasad Thirunarayan
KW - Depression
KW - emoji
KW - Emotion Analysis
KW - gang member identification
KW - location extraction
KW - Sentiment Analysis
KW - twitris
KW - twitter
KW - twitter features
AB - This chapter presents studies concerning feature engineering for Twitter-based applications. It first discusses how Twitter data can be downloaded from the Twitter Application Programming Interface (API) and the kinds of data available in the downloaded tweets. Then, it discusses various textual features, image and video features, Twitter metadata-related features, and network features that can be extracted. Next, it discusses the uses of different feature types along with an analysis on why certain features perform well in the context of informal short text messages typically found in tweets. It then presents five real-world Twitter applications that utilize the different feature types. For each application, it also highlights the features that perform well in the corresponding application setting. Finally, it concludes the chapter by discussing Twitris, a real-time semantic social web analytics platform that has already been commercialized, and its use of Twitter features.
JA - Feature Engineering for Machine Learning and Data Analytics
PB - Chapman and Hall. Data Mining and Knowledge Discovery Series
CY - New York
SN - 978-1-1387-4438-7
ER -
TY - THES
T1 - A Framework to Understand Emoji Meaning: Similarity and Sense Disambiguation of Emoji using EmojiNet
T2 - College of Engineering and Computer Science
Y1 - 2018
A1 - Sanjaya Wijeratne
KW - emoji
KW - Emoji Analysis
KW - Emoji Analysis and Search
KW - Emoji Research
KW - Emoji Sense Disambiguation
KW - Emoji Similarity
KW - EmojiNet
KW - Social Media
KW - social media analysis
KW - twitter
KW - Word Embeddings
AB - Pictographs, commonly referred to as ‘emoji’, have become a popular way to enhance electronic communications. They are an important component of the language used in social media. Since their introduction in the late 1990s, emoji have been widely used to enhance the sentiment, emotion, and sarcasm expressed in social media messages. They are equally popular across many social media sites including Facebook, Instagram, and Twitter. In 2015, Instagram reported that nearly half of the photo comments posted on Instagram contain emoji, and in the same year, Twitter reported that the ‘face with tears of joy’ emoji has been tweeted 6.6 billion times. As of 2017, Facebook and Facebook Messenger processed over 60 million and 6 billion messages with emoji per day, respectively. Emogi, an Internet marketing firm, reports that over 92% of all online users have used emoji at least once. Creators of the SwiftKey Keyboard for mobile devices report that they process 6 billion messages per day that contain emoji. Moreover, business organizations have adopted and now accept the use of emoji in professional communication. For example, Appboy, an Internet marketing company, reports that there has been a 777% year-over-year increase and 20% month-over-month increase in emoji usage for marketing campaigns by business organizations in 2016. These statistics leave little doubt that emoji are a significant and important aspect of electronic communication across the world. The ability to automatically process and interpret text fused with emoji will be essential as society embraces emoji as a standard form of online communication. In the same way that natural language is processed with sophisticated machine learning techniques and technologies for many important applications, including text similarity and word sense disambiguation, emoji should also be amenable to such analysis.
Yet the pictorial nature of emoji, the fact that the same emoji may be used in different contexts to express different meanings, and that emoji are used in different cultures over the world which can interpret emoji differently, make it especially difficult to apply traditional Natural Language Processing (NLP) techniques to analyze them. Indeed, emoji were developed organically with no overt/explicit semantics assigned to them. This contributed to their flexible usage but also led to ambiguity. Thus, similar to words, emoji can take on different meanings depending on context and part-of-speech (POS). Polysemy in emoji complicates determination of emoji similarity and emoji sense disambiguation. However, having access to machine-readable sense repositories that are specifically designed to capture emoji meaning can play a vital role in representing, contextually disambiguating, and converting pictorial forms of emoji into text, thereby leveraging and generalizing NLP techniques for processing a richer medium of communication. This dissertation presents the creation of EmojiNet, the largest machine-readable emoji sense inventory that links Unicode emoji representations to their English meanings extracted from the Web. EmojiNet consists of (i) 12,904 sense labels over 2,389 emoji, which were extracted from reliable online web sources and linked to machine-readable sense definitions seen in BabelNet; (ii) context words associated with each emoji sense, which are inferred through word embedding models trained over Google News and Twitter message corpora for each emoji sense definition; and (iii) recognizing discrepancies in the presentation of emoji on different platforms and specification of the most likely platform-based emoji sense for a selected set of emoji. It then discusses the application of emoji meanings extracted from EmojiNet to solve novel downstream applications including emoji similarity and emoji sense disambiguation.
To address the problem of emoji similarity, first, it presents a comprehensive analysis of the semantic similarity of emoji through emoji embedding models learned over emoji meanings in EmojiNet. Using emoji descriptions, emoji sense labels, and emoji sense definitions, and with different training corpora obtained from Twitter and Google News, multiple embedding models are learned to measure emoji similarity. Using a benchmark sentiment analysis dataset, it further shows that incorporating emoji meanings in EmojiNet into embedding models can improve the accuracy of sentiment analysis tasks by ∼9%. To address the problem of emoji sense disambiguation, it uses word embedding models learned over Twitter and Google News corpora and shows that word embedding models can be used to improve the accuracy of emoji sense disambiguation tasks. The EmojiNet framework, its RESTful web services, and other benchmarking datasets created as part of this dissertation are publicly released at http://emojinet.knoesis.org/.
JA - College of Engineering and Computer Science
PB - Wright State University
CY - Dayton
VL - Ph.D. in Computer Science and Engineering
U1 - Download Link - http://rave.ohiolink.edu/etdc/view?acc_num=wright1547506375922938
ER -
TY - JOUR
T1 - "How Is My Child's Asthma?" Digital Phenotype and Actionable Insights for Pediatric Asthma
JF - JMIR Pediatr Parent
Y1 - 2018
A1 - Utkarshani Jaimini
A1 - Krishnaprasad Thirunarayan
A1 - Maninder Kalra
A1 - Revathy Venkataramanan
A1 - Dipesh Kadariya
A1 - Amit Sheth
KW - actionable insights
KW - asthma control level
KW - asthma control test
KW - controller compliance score
KW - digital phenotype
KW - digital phenotype score
KW - mobile health
AB - Background: In the traditional asthma management protocol, a child meets with a clinician infrequently, once in 3 to 6 months, and is assessed using the Asthma Control Test questionnaire. This information is inadequate for timely determination of asthma control, compliance, precise diagnosis of the cause, and assessing the effectiveness of the treatment plan. The continuous monitoring and improved tracking of the child's symptoms, activities, sleep, and treatment adherence can allow precise determination of asthma triggers and a reliable assessment of medication compliance and effectiveness. Digital phenotyping refers to moment-by-moment quantification of the individual-level human phenotype in situ using data from personal digital devices, in particular, mobile phones. The kHealth kit consists of a mobile app, provided on an Android tablet, that asks timely and contextually relevant questions related to asthma symptoms, medication intake, reduced activity because of symptoms, and nighttime awakenings; a Fitbit to monitor activity and sleep; a Microlife Peak Flow Meter to monitor the peak expiratory flow and forced exhaled volume in 1 second; and a Foobot to monitor indoor air quality. The kHealth cloud stores personal health data and environmental data collected using Web services. The kHealth Dashboard interactively visualizes the collected data. Objective: The objective of this study was to discuss the usability and feasibility of collecting clinically relevant data to help clinicians diagnose or intervene in a child's care plan by using the kHealth system for continuous and comprehensive monitoring of child's symptoms, activity, sleep pattern, environmental triggers, and compliance. The kHealth system helps in deriving actionable insights to help manage asthma at both the personal and cohort levels. 
The Digital Phenotype Score and Controller Compliance Score introduced in the study are the basis of ongoing work on addressing personalized asthma care and answer questions such as, "How can I help my child better adhere to care instructions and reduce future exacerbation?" Methods: The Digital Phenotype Score and Controller Compliance Score summarize the child's condition from the data collected using the kHealth kit to provide actionable insights. The Digital Phenotype Score formalizes the asthma control level using data about symptoms, rescue medication usage, activity level, and sleep pattern. The Compliance Score captures how well the child is complying with the treatment protocol. We monitored and analyzed data for 95 children, each recruited for a 1- or 3-month-long study. The Asthma Control Test scores obtained from the medical records of 57 children were used to validate the asthma control levels calculated using the Digital Phenotype Scores. Results: At the cohort level, we found asthma was very poorly controlled in 37% (30/82) of the children, not well controlled in 26% (21/82), and well controlled in 38% (31/82). Among the very poorly controlled children (n=30), we found 30% (9/30) were highly compliant toward their controller medication intake, suggesting a re-evaluation for change in medication or dosage, whereas 50% (15/30) were poorly compliant and candidates for a more timely intervention to improve compliance to mitigate their situation. We observed a negative Kendall Tau correlation between Asthma Control Test scores and Digital Phenotype Score as −0.509 (P
Knowledge-empowered real-time event-centric situational analysis: 1. Motivation and Objective: Making sense of heterogeneous multimodal big data 2. Architecture and Approach: Transfer learning TensorFlow model inference combined with location information and elevation data 3. Preliminary Outcome: Retrained models performance, object detection performance 4. Future Work: Develop heuristics to estimate flooding level 5. References
CY - NSF I/UCRC, Center for Surveillance Research Advisory Board meeting, August 7th, 2018
ER -
TY - Generic
T1 - Knowledge-enabled Personalized Dashboard for Asthma Management in Children
Y1 - 2018
A1 - Vaikunth Sridharan
A1 - Revathy Venkataramanan
A1 - Dipesh Kadariya
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Maninder Kalra
AB - Introduction: Childhood Asthma is a significant public health concern worldwide. Effective management of childhood asthma requires close monitoring of disease triggers, medication compliance and symptom control. The recent growth of the Internet of Things (IoT) based devices has enabled continuous monitoring of patients. kHealth-Asthma is a knowledge-enabled semantic framework consisting of IoT enabled sensors to record patient symptoms, medication usage and their environment. For each patient, 29 diverse parameters with 1852 data points are collected daily. kHealthDash platform enables real-time visual analysis at an individual and cohort level over such high volume, high variety data. Methods: The kHealth kit was given to 100 asthmatic children (5 to 17 years of age) for a period of one or three months each. The kit consists of an Android app-based questionnaire to record symptoms and medication usage, Fitbit to track activity and sleep, peak flow meter to measure PEF and FEV1, Foobot to monitor indoor air quality and web services to obtain outdoor environmental observations. Data collected are pushed to a private cloud storage in near real-time and visualized using kHealthDash. Five healthcare providers evaluated the effectiveness of kHealthDash by answering questions on data interpretation. Results: Providers reported that analyzing data with kHealthDash was 65% easier than using data in tabular format. The System Usability Score for kHealthDash is 80.5 (>68.5 - threshold), implying that kHealthDash is a user-friendly interface. Conclusion: kHealthDash integrates and visualizes multimodal data and holds promise to aid the clinicians in better decision making for asthma management.
CY - American College of Allergy, Asthma & Immunology Annual Meeting
ER -
TY - CONF
T1 - "Let Me Tell You About Your Mental Health!" Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention
T2 - The 27th ACM International Conference on Information and Knowledge Management (CIKM’18)
Y1 - 2018
A1 - Manas Gaur
A1 - Ugur Kursuncu
A1 - Amanuel Alambo
A1 - Amit Sheth
A1 - Raminta Daniulaityte
A1 - Krishnaprasad Thirunarayan
A1 - Jyotishman Pathak
KW - Reddit
KW - Mental Health
KW - DSM-5
KW - Semantic Encoding and Decoding
KW - Medical Knowledge bases
KW - Drug Abuse Ontology
KW - Semantic Social Computing
AB - Social media platforms are increasingly being used to share and seek advice on mental health issues. In particular, Reddit users freely discuss such issues on various subreddits, whose structure and content can be leveraged to formally interpret and relate subreddits and their posts in terms of mental health diagnostic categories. There is prior research on the extraction of mental health-related information, including symptoms, diagnosis, and treatments from social media; however, our approach can additionally provide actionable information to clinicians about the mental health of a patient in diagnostic terms for web-based intervention. Specifically, we provide a detailed analysis of the nature of subreddit content from a domain expert's perspective and introduce a novel approach to map each subreddit to the best matching DSM-5 (Diagnostic and Statistical Manual of Mental Disorders - 5th Edition) category using a multi-class classifier. Our classification algorithm analyzes all the posts of a subreddit by adapting topic modeling and word-embedding techniques, and utilizing curated medical knowledge bases to quantify the relationship to DSM-5 categories. Our semantic encoding-decoding optimization approach reduces the false-alarm rate from 30% to 2.5% over a comparable heuristic baseline, and our mapping results have been verified by domain experts achieving a kappa score of 0.84.
JA - The 27th ACM International Conference on Information and Knowledge Management (CIKM’18)
PB - Association for Computing Machinery
CY - Torino, Italy
ER -
TY - CONF
T1 - Location Name Extraction from Targeted Text Streams using Gazetteer-based Statistical Language Models
T2 - Proceedings of the 27th International Conference on Computational Linguistics
Y1 - 2018
A1 - Hussein S. Al-Olimat
A1 - Krishnaprasad Thirunarayan
A1 - Valerie Shalin
A1 - Amit Sheth
AB - Extracting location names from informal and unstructured social media data requires the identification of referent boundaries and partitioning compound names. Variability, particularly systematic variability in location names (Carroll, 1983), challenges the identification task. Some of this variability can be anticipated as operations within a statistical language model, in this case drawn from gazetteers such as OpenStreetMap (OSM), Geonames, and DBpedia. This permits evaluation of an observed n-gram in Twitter targeted text as a legitimate location name variant from the same location-context. Using n-gram statistics and location-related dictionaries, our Location Name Extraction tool (LNEx) handles abbreviations and automatically filters and augments the location names in gazetteers (handling name contractions and auxiliary contents) to help detect the boundaries of multi-word location names and thereby delimit them in texts. We evaluated our approach on 4,500 event-specific tweets from three targeted streams to compare the performance of LNEx against that of ten state-of-the-art taggers that rely on standard semantic, syntactic and/or orthographic features. LNEx improved the average F-Score by 33-179%, outperforming all taggers. Further, LNEx is capable of stream processing.
JA - Proceedings of the 27th International Conference on Computational Linguistics
PB - Association for Computational Linguistics
CY - Santa Fe, New Mexico, USA
VL - 2018
ER -
TY - CONF
T1 - Personalized Health Knowledge Graph
T2 - Contextualized Knowledge Graph (CKG) Workshop International Semantic Web Conference (ISWC) 2018
Y1 - 2018
A1 - Amelie Gyrard
A1 - Manas Gaur
A1 - Saeedeh Shekarpour
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
AB - Our current health applications do not adequately take into account contextual and personalized knowledge about patients. In order to design "Personalized Coach for Healthcare" applications to manage chronic diseases, there is a need to create a Personalized Healthcare Knowledge Graph (PHKG) that takes into consideration a patient's health condition (personalized knowledge) and enriches that with contextualized knowledge from environmental sensors and Web of Data (e.g., symptoms and treatments for diseases). To develop PHKG, aggregating knowledge from various heterogeneous sources such as the Internet of Things (IoT) devices, clinical notes, and Electronic Medical Records (EMRs) is necessary. In this paper, we explain the challenges of collecting, managing, analyzing, and integrating patients' health data from various sources in order to synthesize and deduce meaningful information embodying the vision of the Data, Information, Knowledge, and Wisdom (DIKW) pyramid. Furthermore, we sketch a solution that combines 1) IoT data analytics and 2) explicit knowledge, and illustrate it using three chronic disease use cases: asthma, obesity, and Parkinson's.
JA - Contextualized Knowledge Graph (CKG) Workshop International Semantic Web Conference (ISWC) 2018
PB - LNCS
CY - Monterey, California, USA
ER -
TY - Generic
T1 - Personalized Prediction of Suicide Risk for Web-based Intervention
Y1 - 2018
A1 - Amanuel Alambo
A1 - Manas Gaur
A1 - Ugur Kursuncu
A1 - Krishnaprasad Thirunarayan
A1 - Jeremiah Schumm
A1 - Jyotishman Pathak
A1 - Amit Sheth
AB - Across the United States, suicide is the second leading cause of death for people aged between 15 and 34, and younger people are more prone to mental health problems, suicidal thoughts, and behaviors. For instance, 80% of patients with Borderline Personality Disorder have suicide-related behaviors, and between 4% and 9% of them commit suicide. Moreover, the social stigma associated with mental health issues and suicide deters patients from sharing their experiences directly with others. In such a situation, social media that provides a free and open forum for voluntary expression can provide insights into suicide ideation and self-destructive behavior. Reddit is a widely used and highly relevant social-media platform where users subscribe to specific subreddits and share their experiences. The users on the respective subreddits often make use of metaphoric suicidal language with related intentions, while interacting with other like-minded users sharing similar experiences. The Columbia-Suicide Severity Rating Scale (C-SSRS) has been employed by clinicians to measure the level of suicidal risk but has not been adequately personalized for improved prevention and resiliency. In this study, we develop a framework for the prediction of suicidal risk by conducting a user-level analysis supervised by C-SSRS and using medical knowledge bases. This will eventually help a clinician perform a personalized web-based intervention. Our two-fold approach creates a user-level decision-making mechanism that factors in the linguistic, temporal, homophily-based, metaphorical, and intent-based information from the dialogues of 93K users interacting on r/SuicideWatch and other related subreddits that aid in the characterization of users’ suicidal vulnerability.
CY - 24th NIMH Conference on Mental Health Services Research (MHSR)
ER -
TY - Generic
T1 - Poster: Image Disguising for Privacy-preserving Deep Learning
Y1 - 2018
A1 - Sagar Sharma
A1 - Keke Chen
KW - Privacy-preserving
AB - Due to the high training costs of deep learning, model developers often rent cloud GPU servers to achieve better efficiency. However, this practice raises privacy concerns. An adversarial party may be interested in 1) personally identifiable information encoded in the training data and the learned models, 2) misusing the sensitive models for its own benefits, or 3) launching model inversion (MIA) and generative adversarial network (GAN) attacks to reconstruct replicas of training data (e.g., sensitive images). Learning from encrypted data seems impractical due to the large training data and expensive learning algorithms, while differential-privacy based approaches have to make significant trade-offs between privacy and model quality. We investigate the use of image disguising techniques to protect both data and model privacy. Our preliminary results show that with block-wise permutation and transformations, surprisingly, disguised images still give reasonably well-performing deep neural networks (DNN). The disguised images are also resilient to the deep-learning enhanced visual discrimination attack and provide an extra layer of protection from MIA and GAN attacks.
CY - ACM Conference on Computer and Communications Security (CCS) 2018
ER -
TY - CHAP
T1 - Predictive Analysis on Twitter: Techniques and Applications
T2 - Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining
Y1 - 2018
A1 - Ugur Kursuncu
A1 - Manas Gaur
A1 - Usha Lokala
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Ismailcem Budak Arpinar
KW - Citizen sensing
KW - Community evolution
KW - Demographic prediction
KW - Drug trends
KW - Election prediction
KW - event analysis
KW - Harassment detection
KW - machine learning
KW - Mental Health
KW - Semantic Social Computing
KW - Sentiment-Emotion-Intent Analysis
KW - social media analysis
KW - Spatio-temporal-thematic analysis
KW - Stock Market prediction
AB - Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories.
JA - Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining
PB - Springer
CY - Dayton
ER -
TY - Generic
T1 - Privacy-Preserving Boosting with Random Linear Classifiers
Y1 - 2018
A1 - Sagar Sharma
A1 - Keke Chen
KW - Boosting
KW - Privacy-preserving
AB - We propose SecureBoost, a privacy-preserving predictive modeling framework, that allows service providers (SPs) to build powerful boosting models over encrypted or randomly masked user-submitted data. SecureBoost uses random linear classifiers (RLCs) as the base classifiers. A Cryptographic Service Provider (CSP) manages keys and assists the SP’s processing to reduce the complexity of the protocol constructions. The SP learns only the base models (i.e., RLCs) and the CSP learns only the weights of the base models and a limited leakage function. This separated parameter holding prevents any party from abusing the final model or conducting model-based attacks. We evaluate two constructions of SecureBoost, HE+GC and SecSh+GC, using combinations of primitives: homomorphic encryption, garbled circuits, and random masking. We show that SecureBoost efficiently learns high-quality boosting models from protected user-generated data with practical costs.
CY - ACM Conference on Computer and Communications Security (CCS) 2018
ER -
TY - CONF
T1 - A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research
T2 - Proceedings of the 10th ACM Conference on Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018
Y1 - 2018
A1 - Mohammadreza Rezvan
A1 - Saeedeh Shekarpour
A1 - Lakshika Balasuriya
A1 - Krishnaprasad Thirunarayan
A1 - Valerie Shalin
A1 - Amit Sheth
KW - Annotated corpus
KW - appearance-related
KW - context
KW - cyberbullying
KW - harassment
KW - intellectual
KW - offensive Lexicon
KW - political
KW - profane word
KW - racial
KW - sexual
AB - A quality annotated corpus is essential to research. Despite the recent focus of the Web science community on cyberbullying research, the community lacks standard benchmarks. This paper provides both a quality annotated corpus and an offensive words lexicon capturing different types of harassment content: (i) sexual, (ii) racial, (iii) appearance-related, (iv) intellectual, and (v) political. We first crawled data from Twitter using this content-tailored offensive lexicon. As the mere presence of an offensive word is not a reliable indicator of harassment, human judges annotated tweets for the presence of harassment. Our corpus consists of 25,000 annotated tweets for the five types of harassment content and is available on the Git repository.
JA - Proceedings of the 10th ACM Conference on Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018
PB - ACM
CY - Amsterdam, The Netherlands
ER -
TY - THES
T1 - A Semantically Enhanced Approach to Identify Depression Indicative Symptoms Using Twitter Data
T2 - Department of Engineering & Computer Science
Y1 - 2018
A1 - Ankita Saxena
AB - According to the World Health Organization, more than 300 million people suffer from Major Depressive Disorder (MDD) worldwide. PHQ-9 is used to screen and diagnose MDD clinically and identify its severity. With the unprecedented growth and enthusiastic acceptance of social media such as Twitter, a large number of people have come to share their feelings and emotions on it openly. Each tweet can indicate a user’s opinion, thought or feeling. A tweet can also indicate multiple symptoms related to PHQ-9. Identifying PHQ-9 symptoms indicated by a tweet can provide crucial information about a user regarding his/her depression diagnosis. The current state-of-the-art approach using supervised machine learning to classify a tweet regarding PHQ-9 symptoms relies on explicit reference to a particular PHQ-9 symptom, i.e., it considers an exact string matching-based feature representation. This approach of explicit referencing falls short on classifying tweets having an implicit symptom indicator in several possible PHQ-9 symptoms. This thesis proposes a semantically enhanced approach that considers explicit as well as implicit depression-indicative symptoms. We better capture the semantics of a word in a tweet as it relates to depression condition by employing the context of the word indicated by the surrounding words using Word2Vec model trained on a corpus of ~3 million tweets. Using a two-stage (binary class - multi-label) classification model, we demonstrate that our approach outperforms the baseline model for depression-indicative symptoms by around 20% on f-measure. We further evaluated our semantically-enhanced approach to fill in the PHQ-9 questionnaire and identify the severity of depression by standard guidelines by considering a dataset of 932,108 self-reported users.
JA - Department of Engineering & Computer Science
PB - Wright State University
CY - Dayton
VL - M.S.
ER -
TY - THES
T1 - Sensor Data Streams Correlation Platform for Asthma Management
T2 - Department of Engineering & Computer Science
Y1 - 2018
A1 - Vaikunth Sridharan
AB - Asthma is a high-burden chronic inflammatory disease whose prevalence in children is twice that in adults. Its management can be improved by continuously monitoring patients and their environment using Internet of Things (IoT) based devices. The sensor data streams so obtained are essential to comprehend the multiple factors triggering asthma symptoms. In order to support physicians in exploring causal associations and finding actionable insights, a visualization system with a scalable cloud infrastructure that can process multi-modal sensor data and Patient Generated Health Data (PGHD) is necessary. In this thesis, we describe a cloud-based asthma management and visualization platform that integrates personalized PGHD from the kHealth kit and outdoor environmental observations from web services. When applied to data from an individual, the tool assists in analyzing and explaining symptoms using “personalized” causes, monitoring disease progression, and improving asthma management. The front-end visualization was built with Bootstrap Framework and Highcharts. Google’s Firebase and Elasticsearch engine were used as back-end storage to aggregate data from various sources. Further, Node.js and Express Framework were used to develop several Representational State Transfer services useful for the visualization.
JA - Department of Engineering & Computer Science
PB - Wright State University
CY - Dayton
VL - M.S.
ER -
TY - Generic
T1 - Towards Practical Privacy-Preserving Analytics for IoT and Cloud Based Healthcare Systems
Y1 - 2018
A1 - Sagar Sharma
A1 - Keke Chen
A1 - Amit Sheth
KW - Analytical models
KW - Cloud Computing
KW - Computational modeling
KW - Data models
KW - Data privacy
KW - Medical services
KW - Privacy
AB - Modern healthcare systems now rely on advanced computing methods and technologies, such as IoT devices and clouds, to collect and analyze personal health data at unprecedented scale and depth. Patients, doctors, healthcare providers, and researchers depend on analytical models derived from such data sources to remotely monitor patients, early-diagnose diseases, and find personalized treatments and medications. However, without appropriate privacy protection, conducting data analytics becomes a source of privacy nightmare. In this paper, we present the research challenges in developing practical privacy-preserving analytics in healthcare information systems. The study is based on kHealth - a personalized digital healthcare information system that is being developed and tested for disease monitoring. We analyze the data and analytic requirements for the involved parties, identify the privacy assets, analyze existing privacy substrates, and discuss the potential tradeoff among privacy, efficiency, and model quality.
PB - IEEE
VL - 22
CP - 2
ER -
TY - CHAP
T1 - Twitris: A System for Collective Social Intelligence
T2 - Encyclopedia of Social Network Analysis and Mining
Y1 - 2018
A1 - Amit Sheth
A1 - Hemant Purohit
A1 - Gary Alan Smith
A1 - Jeremy Brunn
A1 - Ashutosh Jadhav
A1 - Pavan Kapanipathi
A1 - Chen Lu
A1 - Wenbo Wang
ED - Reda Alhajj
ED - Jon Rokne
KW - Citizen sensing
KW - Community evolution
KW - Event analysis on social media
KW - Interaction Network
KW - People-Content-Network Analysis
KW - Real-time social media analysis
KW - Semantic Perception
KW - semantic social web
KW - Sentiment-Emotion-Intent Analysis
KW - Social Computing
KW - Social data analysis
KW - Social Media
KW - social media analysis
KW - Spatio-temporal-thematic analysis
KW - twitris
KW - Web 3.
AB - The massive amount of data on social networks (e.g., Twitter, Reddit, Facebook, Instagram, Web forums) provides an exceptional opportunity for leveraging citizen sensing (user expressed observations and opinions) for collective intelligence (deeper insights that drive decisions and actions based on social data) in numerous domains – retailing including branding and marketing, financial markets, entertainment and sports, disaster coordination, social movements, healthcare and epidemiology, etc. However, social data is extremely rich, providing opportunity to understand them from many dimensions such as spatio-temporal-thematic, people-content-network, and sentiment-emotion-intention. It takes a rich variety of semantic techniques including knowledge graphs or ontology enhanced or empowered text mining, natural language processing, and machine learning (including deep learning). Twitris is an embodiment of all these for real-time and highly scalable technology used both in scientific research and in commercial applications.
JA - Encyclopedia of Social Network Analysis and Mining
PB - Springer-Verlag New York
CY - New York
SN - 978-1-4614-7163-9
ER -
TY - JOUR
T1 - Using electronic health records to characterize prescription patterns: focus on antidepressants in nonpsychiatric outpatient settings
JF - JAMIA Open
Y1 - 2018
A1 - Joseph J Deferio
A1 - Tomer T Levin
A1 - Judith Cukor
A1 - Samprit Banerjee
A1 - Rozan Abdulrahman
A1 - Amit Sheth
A1 - Neel Mehta
A1 - Jyotishman Pathak
KW - antidepressants
KW - EHR
KW - outpatient
KW - prescription patterns
AB - Objective To characterize nonpsychiatric prescription patterns of antidepressants according to drug labels and evidence assessments (on-label, evidence-based, and off-label) using structured outpatient electronic health record (EHR) data. Methods A retrospective analysis was conducted using deidentified EHR data from an outpatient practice at a New York City-based academic medical center. Structured “medication–diagnosis” pairs for antidepressants from 35 325 patients between January 2010 and December 2015 were compared to the latest drug product labels and evidence assessments. Results Of 140 929 antidepressant prescriptions prescribed by primary care providers (PCPs) and nonpsychiatry specialists, 69% were characterized as “on-label/evidence-based uses.” Depression diagnoses were associated with 67 233 (48%) prescriptions in this study, while pain diagnoses were slightly less common (35%). Manual chart review of “off-label use” prescriptions revealed that on-label/evidence-based diagnoses of depression (39%), anxiety (25%), insomnia (13%), mood disorders (7%), and neuropathic pain (5%) were frequently cited as prescription indication despite lacking ICD-9/10 documentation. Conclusions The results indicate that antidepressants may be prescribed for off-label uses, by PCPs and nonpsychiatry specialists, less frequently than believed. This study also points to the fact that there are a number of off-label uses that are efficacious and widely accepted by expert clinical opinion but have not been included in drug compendia. Despite the fact that diagnosis codes in the outpatient setting are notoriously inaccurate, our approach demonstrates that the correct codes are often documented in a patient’s recent diagnosis history. Examining both structured and unstructured data will help to further validate findings. Routinely collected clinical data in EHRs can serve as an important resource for future studies in investigating prescribing behaviors in outpatient clinics.
ER -
TY - CONF
T1 - "What's ur type?" Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding
T2 - 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
Y1 - 2018
A1 - Ugur Kursuncu
A1 - Manas Gaur
A1 - Usha Lokala
A1 - Anurag Illendula
A1 - Krishnaprasad Thirunarayan
A1 - Raminta Daniulaityte
A1 - Amit Sheth
A1 - I. Budak Arpinar
KW - Compositional Multiview Embedding
KW - Emoji Embedding
KW - Marijuana
KW - Network Embedding
KW - Semantic Social Computing
KW - User classification
AB - With 93% of pro-marijuana population in US favoring legalization of medical marijuana, high expectations of a greater return for Marijuana stocks, and public actively sharing information about medical, recreational and business aspects related to marijuana, it is no surprise that marijuana culture is thriving on Twitter. After the legalization of marijuana for recreational and medical purposes in 29 states, there has been a dramatic increase in the volume of drug-related communication on Twitter. Specifically, Twitter accounts have been established for promotional and informational purposes, some prominent among them being American Ganja, Medical Marijuana Exchange, and Cannabis Now. Identification and characterization of different user types can allow us to conduct more fine-grained spatiotemporal analysis to identify dominant or emerging topics in the echo chambers of marijuana-related communities on Twitter. In this research, we mainly focus on classifying Twitter accounts created and run by ordinary users, retailers, and informed agencies. Classifying user accounts by type can enable better capturing and highlighting of aspects such as trending topics, business profiling of marijuana companies, and state-specific marijuana policy making. Furthermore, type-based analysis can provide more profound understanding and reliable assessment of the implications of marijuana-related communications. We developed a comprehensive approach to classifying users by their types on Twitter through contextualization of their marijuana-related conversations. We accomplished this using compositional multiview embedding synthesized from People, Content, and Network views achieving 8% improvement over the empirical baseline.
JA - 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
PB - IEEE
CY - Santiago, Chile
ER -
TY - CONF
T1 - Why Reinvent the Wheel: Let's Build Question Answering Systems Together
T2 - The Web Conference (WWW 2018)
Y1 - 2018
A1 - Kuldeep Singh
A1 - Arun Sethupat
A1 - Andreas Both
A1 - Saeedeh Shekarpour
A1 - Ioanna Lytra
A1 - Ricardo Usbeck
A1 - Akhilesh Vyas
A1 - Akmal Khikmatullaev
A1 - Dharmen Punjani
A1 - Christoph Lange
A1 - Maria-Esther Vidal
A1 - Jens Lehmann
A1 - Sören Auer
KW - QA Framework
KW - question answering
KW - Semantic Search
KW - Semantic Web
KW - Software Re-usability
AB - Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train Classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. These results not only suggest that Frankenstein precisely solves the QA optimisation problem, but also enables the automatic composition of optimised QA pipelines, which outperform the static Baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein thus improving the performance of the generated pipelines.
JA - The Web Conference (WWW 2018)
CY - Lyon, France
ER -
TY - JOUR
T1 - "You got to love rosin: Solventless dabs, pure, clean, natural medicine." Exploring Twitter data on emerging trends in Rosin Tech marijuana concentrates
JF - Drug and Alcohol Dependence
Y1 - 2018
A1 - Francois R. Lamy
A1 - Raminta Daniulaityte
A1 - Mussah Zatreh
A1 - Ramzi W. Nahhas
A1 - Amit Sheth
A1 - Silvia S. Martins
A1 - Edward W. Boyer
A1 - Robert G. Carlson
KW - Cannabis legislation
KW - marijuana concentrates
KW - Rosin technique
KW - Social Media
KW - twitter
AB - Background “Rosin tech” is an emerging solventless method consisting in applying moderate heat and constant pressure on marijuana flowers to prepare marijuana concentrates referred to as “rosin.” This paper explores rosin concentrate-related Twitter data to describe tweet content and analyze differences in rosin-related tweeting across states with varying cannabis legal statuses. Method English language tweets were collected between March 15, 2015 and April 17, 2017, using Twitter API. U.S. geolocated unique (no retweets) tweets were manually coded to evaluate the content of rosin-related tweets. Adjusted proportions of Twitter users and personal communication tweets per state related to rosin concentrates were calculated. A permutation test was used to analyze differences in normalized proportions between U.S. states with different cannabis legal statuses. Results eDrugTrends collected 8389 tweets mentioning rosin concentrates/technique. 4164 tweets (49.6% of total sample) posted by 1264 unique users had identifiable state-level geolocation. Content analysis of 2010 non-retweeted tweets revealed a high proportion of media-related tweets (44.2%) promoting rosin as a safer and solventless production method. Tweet-volume-adjusted percentages of geolocated Twitter users and personal communication tweets about rosin were respectively up to seven and sixteen times higher between states allowing recreational use of cannabis and states where cannabis is illegal. Conclusion Our results indicate that there are higher proportions of personal communication tweets and Twitter users tweeting about rosin in U.S. states where cannabis is legalized. Rosin concentrates are advertised as a safer, more natural form of concentrates, but more research on this emerging form of marijuana concentrate is needed.
VL - 183
ER -
TY - CONF
T1 - Adaptive Training Instance Selection for Cross-domain Emotion Identification
T2 - Proceedings of the International Conference on Web Intelligence
Y1 - 2017
A1 - Wenbo Wang
A1 - Lu Chen
A1 - Keke Chen
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - cross-domain emotion identification
KW - instance selection
AB - This paper exploits a large number of self-labeled emotion tweets as the training data from the source domain to improve emotion identification in target domains (i.e., blogs and fairy tales), where there is a short supply of labeled data. Due to the noisy and ambiguous nature of self-labeled emotion training data, the existing domain adaptation methods that typically depend on high-quality labeled source-domain data do not work satisfactorily. This paper describes an adaptive source-domain training instance selection method to address the problem of noisy source-domain training data. The proposed approach can effectively identify the most informative training examples based on three carefully designed measures: consistency, diversity, and similarity. It uses an iterative method that consists of the following steps in each iteration: selecting informative samples from the source domain with the informativeness measures, merging with the target-domain training data, evaluating the performance of learned classifier for the target domain, and updating the informativeness measures for the next iteration. It stops until no new training instance is selected or in a designated number of iterations. Experiments show that our approach performs effectively for cross-domain emotion identification and consistently outperforms baseline approaches across four domains.
JA - Proceedings of the International Conference on Web Intelligence
PB - ACM
CY - New York, NY, USA
SN - 978-1-4503-4951-2
ER -
TY - CONF
T1 - Adaptive Training Instance Selection for Cross-Domain Emotion Identification
T2 - WI ’17
Y1 - 2017
A1 - Wenbo Wang
A1 - Lu Chen
A1 - Keke Chen
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - Cross-Domain Emotion Identification
KW - instance selection
AB - This paper exploits a large number of self-labeled emotion tweets as the training data from the source domain to improve emotion identification in target domains (i.e., blogs and fairy tales), where there is a short supply of labeled data. Due to the noisy and ambiguous nature of self-labeled emotion training data, the existing domain adaptation methods that typically depend on high-quality labeled source-domain data do not work satisfactorily. This paper describes an adaptive source-domain training instance selection method to address the problem of noisy source-domain training data. The proposed approach can effectively identify the most informative training examples based on three carefully designed measures: consistency, diversity, and similarity. It uses an iterative method that consists of the following steps in each iteration: selecting informative samples from the source domain with the informativeness measures, merging with the target-domain training data, evaluating the performance of learned classifier for the target domain, and updating the informativeness measures for the next iteration. It stops until no new training instance is selected or in a designated number of iterations. Experiments show that our approach performs effectively for cross-domain emotion identification and consistently outperforms baseline approaches across four domains.
JA - WI ’17
CY - Leipzig, Germany
ER -
TY - CONF
T1 - Augmented Personalized Health: How Smart Data with IoTs and AI is about to Change Healthcare
T2 - 2017 IEEE 3rd International Forum on Research and Technologies for Society and Industry (RTSI 2017)
Y1 - 2017
A1 - Amit Sheth
A1 - Utkarshani Jaimini
A1 - Krishnaprasad Thirunarayan
A1 - Tanvi Banerjee
KW - Augmented Personalized Health
KW - cognitive computing
KW - Internet of Things
KW - perceptual computing
KW - semantic computing
KW - Sensors
KW - Smart Data
KW - Wearable
AB - Healthcare as we know it is in the process of going through a massive change - from episodic to continuous, from disease focused to wellness and quality of life focused, from clinic centric to anywhere a patient is, from clinician controlled to patient empowered, and from being driven by limited data to 360-degree, multimodal personal-public-population physical-cybersocial big data driven. While ability to create and capture data is already here, the upcoming innovations will be in converting this big data into smart data through contextual and personalized processing such that patients and clinicians can make better decisions and take timely actions for augmented personalized health. This paper outlines current opportunities and challenges, with a focus on key AI approaches to make this a reality. The broader vision is exemplified using three ongoing applications (asthma in children, bariatric surgery, and pain management) as part of the Kno.e.sis kHealth personalized digital health initiative.
JA - 2017 IEEE 3rd International Forum on Research and Technologies for Society and Industry (RTSI 2017)
CY - Modena, Italy
ER -
TY - JOUR
T1 - On the Challenges of Sentiment Analysis for Dynamic Events
JF - IEEE Intelligent Systems
Y1 - 2017
A1 - Monireh Ebrahimi
A1 - Amir Hossein Yazdavar
A1 - Amit Sheth
AB - With the proliferation of social media over the last decade, determining people’s attitude with respect to a specific topic, document, interaction or events has fueled research interest in natural language processing and introduced a new channel called “sentiment and emotion analysis” [1]. For instance, businesses routinely look to develop systems to automatically understand their customer conversations by identifying the relevant content to enhance marketing their products and managing their reputations [2]. Previous efforts to assess people’s sentiment on Twitter have suggested that Twitter may be a valuable resource for studying political sentiment and that it reflects the offline political landscape. According to a Pew Research Center report, in January 2016 44% of US adults stated having learned about the presidential election through social media. Furthermore, 24% reported use of social media posts of the two candidates as a source of news and information, which is more than the 15% who have used both candidates’ websites or emails combined (http://j.mp/PewSocM). The first presidential debate between Trump and Hillary was the most tweeted debate ever with 17.1 million tweets.
PB - IEEE
ER -
TY - JOUR
T1 - Characterizing marijuana concentrate users: A web-based survey
JF - Drug and Alcohol Dependence
Y1 - 2017
A1 - Raminta Daniulaityte
A1 - Francois R. Lamy
A1 - Monica Barratt
A1 - Ramzi W. Nahhas
A1 - Silvia S. Martins
A1 - Edward W. Boyer
A1 - Amit Sheth
A1 - Robert G. Carlson
KW - cannabis
KW - marijuana concentrates
KW - web survey
AB - Aims: The study seeks to characterize marijuana concentrate users, describe reasons and patterns of use, perceived risk, and identify predictors of daily/near daily use. Methods: An anonymous web-based survey was conducted (April-June 2016) with 673 US-based cannabis users recruited via the Bluelight.org web-forum and included questions about marijuana concentrate use, other drugs, and socio-demographics. Multivariable logistic regression analyses were conducted to identify characteristics associated with greater odds of lifetime and daily use of marijuana concentrates. Results: About 66% of respondents reported marijuana concentrate use. The sample was 76% male, and 87% white. Marijuana concentrate use was viewed as riskier than flower cannabis. Greater odds of marijuana concentrate use was associated with living in states with “recreational” (AOR = 4.91; p = 0.001) or “medical, less restrictive” marijuana policies (AOR = 1.87; p = 0.014), being male (AOR = 2.21, p = 0.002), younger (AOR = 0.95, p < 0.001), number of other drugs used (AOR = 1.23, p < 0.001), daily herbal cannabis use (AOR = 4.28, p < 0.001), and lower perceived risk of cannabis use (AOR = 0.96, p = 0.043). About 13% of marijuana concentrate users reported daily/near daily use. Greater odds of daily concentrate use was associated with being male (AOR = 9.29, p = 0.033), using concentrates for therapeutic purposes (AOR = 7.61, p = 0.001), using vape pens for marijuana concentrate administration (AOR = 4.58, p = 0.007), and lower perceived risk of marijuana concentrate use (AOR = 0.92, p = 0.017). Conclusions: Marijuana concentrate use was more common among male, younger and more experienced users, and those living in states with more liberal marijuana policies. Characteristics of daily users, in particular patterns of therapeutic use and utilization of different vaporization devices, warrant further research with community-recruited samples.
PB - Elsevier
ER -
TY - CONF
T1 - Constructing Synthetic Social Media Stimuli for an Emergency Preparedness Functional Exercise
T2 - 14th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2017)
Y1 - 2017
A1 - Andrew Hampton
A1 - Shreyansh Bhatt
A1 - Alan Smith
A1 - Jeremy Brunn
A1 - Hemant Purohit
A1 - Valerie Shalin
A1 - John Flach
A1 - Amit Sheth
KW - disaster response training
KW - emergency preparedness
KW - Social Media
KW - synthetic microblog corpus
AB - This paper details the creation of a massive (over 32,000 messages) artificially constructed ‘Twitter’ microblog stream for a regional emergency preparedness functional exercise. By combining microblog conversion, manual production, and a control set, we created a web-based information stream providing valid, misleading, and irrelevant information to public information officers (PIOs) representing hospitals, fire departments, the local Red Cross, and city and county government officials. Addressing the challenges in constructing this corpus constitutes an important step in providing experimental evidence that complements observational study, necessary for designing effective social media tools for the emergency response setting. Preliminary results in the context of an emergency preparedness exercise suggest how social media can participate in the work practice of a PIO concerning the assessment of the disaster and the dissemination of information within the emergency response organization and to the public.
JA - 14th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2017)
CY - Albi, Occitanie Pyrénées-Méditerranée, France
ER -
TY - CONF
T1 - Discovering Explanatory Models to Identify Relevant Tweets on Zika
T2 - 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2017)
Y1 - 2017
A1 - Roopteja Muppalla
A1 - Michele Miller
A1 - Tanvi Banerjee
A1 - William Romine
AB - Zika virus has caught the world's attention, and has led people to share their opinions and concerns on social media like Twitter. Using text-based features, extracted with the help of Parts of Speech (POS) taggers and N-gram, a classifier was built to detect Zika related tweets from Twitter. With a simple logistic classifier, the system was successful in detecting Zika related tweets from Twitter with a 92% accuracy. Moreover, key features were identified that provide deeper insights on the content of tweets relevant to Zika. This system can be leveraged by domain experts to perform sentiment analysis, and understand the temporal and spatial spread of Zika.
JA - 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2017)
PB - IEEE
CY - Jeju Island, Republic of Korea
ER -
TY - CONF
T1 - Domain-specific Hierarchical Subgraph Extraction: A Recommendation Use Case
T2 - IEEE Conference on Big Data
Y1 - 2017
A1 - Sarasi Lalithsena
A1 - Sujan Perera
A1 - Pavan Kapanipathi
A1 - Amit Sheth
KW - Domain-specific knowledge graph
KW - hierarchical relationships
KW - probabilistic soft logic
KW - recommendation systems
AB - Hierarchical relationships play a key role in knowledge graphs. Particularly, large and well-known knowledge graphs such as DBpedia contain significant number of facts expressed with hierarchical relationships in comparison to the other types of relationships. These hierarchical relationships are extensively harnessed by applications such as personalization, question answering, and recommendation systems. However, the presence of large number of facts with hierarchical relationships makes the applications computationally intensive. Additionally, the applications can be domain-specific and may not require all the hierarchical facts available, but only require those that are specific to the domain. In this paper, we present an approach to extract domain-specific hierarchical subgraph from large knowledge graphs by identifying the domain-specificity of the categories in the hierarchy. Given a domain, the domain-specificity of categories are determined by combining different types of evidence using a probabilistic framework. We show the effectiveness of our approach with a recommendation use case for movie and book domains. Our evaluation demonstrates that the domain-specific hierarchical subgraphs extracted by our approach can reduce the baseline subgraph by 40% to 50% without compromising the accuracy of the recommendations. Furthermore, the presented approach outperforms the recommendation results obtained with a state-of-the-art domain-specific subgraph extraction technique which uses supervised learning.
JA - IEEE Conference on Big Data
CY - Boston, MA, USA
ER -
TY - CONF
T1 - eAssistant: Cognitive Assistance for Identification and Auto-Triage of Actionable Conversations
T2 - 26th International World Wide Web Conference (WWW 2017)
Y1 - 2017
A1 - Hamid R. Motahar Nezhad
A1 - Kalpa Gunaratna
A1 - Juan Cappi
KW - Activity Management
KW - Cognitive Assistance
KW - information extraction
KW - Natural Language Interface
KW - Natural Language Understanding
KW - Online Learning
KW - personalization
AB - The browser and screen have been the main user interfaces of the Web and mobile apps. The notification mechanism is an evolution in the user interaction paradigm by keeping users updated without checking applications. Conversational agents are poised to be the next revolution in user interaction paradigms. However, without intelligence on the triage of content served by the interaction and content differentiation in applications, interaction paradigms may still place the burden of information overload on users. In this paper, we focus on the problem of intelligent identification of actionable information in the content served by applications, and in particular in productivity applications (such as email, chat, messaging, social collaboration tools, etc.). We present eAssistant, which offers a novel fine-grained action identification method in an adaptive, personalizable, and online-trainable manner, and a cognitive agent/API that uses action information and user-centric conversation characteristics to auto-triage user conversations. The introduced method identifies individual actions and associated metadata; it is extensible in terms of the number of action classes; it learns in an online and continuous manner via user interactions and feedback, and it is personalizable to different users. We have evaluated the proposed method using real-world datasets. The results show that the method achieves higher accuracy compared to traditional ways of formulating the problem, while exhibiting additional desired properties of online, personalized, and adaptive learning. In eAssistant, we introduce a multi-dimensional learning model of conversations auto-triage, defined based on a user study and NLP-based information extraction techniques, to auto-triage user conversations on social collaboration and productivity tools.
JA - 26th International World Wide Web Conference (WWW 2017)
CY - Perth, Australia
ER -
TY - CONF
T1 - EmojiNet: An Open Service and API for Emoji Sense Discovery
T2 - 11th International AAAI Conference on Web and Social Media (ICWSM 2017)
Y1 - 2017
A1 - Sanjaya Wijeratne
A1 - Lakshika Balasuriya
A1 - Amit Sheth
A1 - Derek Doran
KW - Emoji Analysis
KW - Emoji Sense Disambiguation
KW - Emoji Similarity
KW - EmojiNet
AB - This paper presents the release of EmojiNet, the largest machine-readable emoji sense inventory that links Unicode emoji representations to their English meanings extracted from the Web. EmojiNet is a dataset consisting of: (i) 12,904 sense labels over 2,389 emoji, which were extracted from the web and linked to machine-readable sense definitions seen in BabelNet; (ii) context words associated with each emoji sense, which are inferred through word embedding models trained over Google News corpus and a Twitter message corpus for each emoji sense definition; and (iii) recognizing discrepancies in the presentation of emoji on different platforms, specification of the most likely platform-based emoji sense for a selected set of emoji. The dataset is hosted as an open service with a REST API and is available at http://emojinet.knoesis.org/. The development of this dataset, evaluation of its quality, and its applications including emoji sense disambiguation and emoji sense similarity are discussed.
JA - 11th International AAAI Conference on Web and Social Media (ICWSM 2017)
CY - Montreal, Canada
ER -
TY - THES
T1 - Finding Street Gang Member Profiles on Twitter
T2 - Department of Engineering & Computer Science
Y1 - 2017
A1 - Lakshika Balasuriya
KW - Gang Activity Understanding
KW - social media analysis
KW - Street Gangs
KW - Twitter Profile Identification
KW - Word Embeddings
AB - The crime and violence street gangs introduce into neighborhoods is a growing epidemic in cities around the world. Today, over 1.4 million people, belonging to more than 33,000 gangs, are active in the United States, of which 88% identify themselves as being members of a street gang. With the recent popularity of social media, street gang members have established online presences coinciding with their physical occupation of neighborhoods. Recent studies report that approximately 45% of gang members participate in online offending activities such as threatening, harassing individuals, posting violent videos or attacking someone on the street for something they said online in social media platforms. Thus, their social media posts may be useful to social workers and law enforcement agencies to discover clues about recent crimes or to anticipate ones that may occur in a community. Finding these posts, however, requires a method to discover gang member social media profiles. This is a challenging task since gang members represent a very small population compared to the active social media user base. This thesis studies the problem of automatically identifying street gang member profiles on Twitter, which is a popular social media platform that is commonly used by street gang members to promote their online gang-related activities. It outlines a process to curate one of the largest sets of verifiable gang member Twitter profiles that have ever been studied. A review of these profiles establishes differences in the language, profile and cover images, YouTube links, and emoji shared on Twitter by gang members compared to the rest of the Twitter population. 
Beyond the earlier efforts in Twitter profile identification that utilize features derived from the profile and tweet text, this thesis uses additional heterogeneous sets of features from the emoji usage, profile images, and links to YouTube videos reflecting gang-related music culture towards solving the gang member profile identification problem. Features from this review are used to train a series of supervised machine learning classifiers and they are further improved upon by using word embeddings learned over a large corpus of tweets. Experimental results demonstrate that heterogeneous features enabled our classifiers to achieve low false positive rates and promising F1-scores.
JA - Department of Engineering & Computer Science
PB - Wright State University
CY - Dayton
VL - M.S.
ER -
TY - THES
T1 - Finding Street Gang Member Profiles on Twitter
T2 - Computer Science and Engineering
Y1 - 2017
A1 - Lakshika Balasuriya
KW - Gang Activity Understanding
KW - Street Gangs
KW - Street Gangs on Twitter
KW - twitter
KW - Twitter Gangs
AB - The crime and violence street gangs introduce into neighborhoods is a growing epidemic in cities around the world. Today, over 1.4 million people, belonging to more than 33,000 gangs, are active in the United States, of which 88% identify themselves as being members of a street gang. With the recent popularity of social media, street gang members have established online presences coinciding with their physical occupation of neighborhoods. Recent studies report that approximately 45% of gang members participate in online offending activities such as threatening, harassing individuals, posting violent videos or attacking someone on the street for something they said online in social media platforms. Thus, their social media posts may be useful to social workers and law enforcement agencies to discover clues about recent crimes or to anticipate ones that may occur in a community. Finding these posts, however, requires a method to discover gang member social media profiles. This is a challenging task since gang members represent a very small population compared to the active social media user base. This thesis studies the problem of automatically identifying street gang member profiles on Twitter, which is a popular social media platform that is commonly used by street gang members to promote their online gang-related activities. It outlines a process to curate one of the largest sets of verifiable gang member Twitter profiles that have ever been studied. A review of these profiles establishes differences in the language, profile and cover images, YouTube links, and emoji shared on Twitter by gang members compared to the rest of the Twitter population. 
Beyond the earlier efforts in Twitter profile identification that utilize features derived from the profile and tweet text, this thesis uses additional heterogeneous sets of features from the emoji usage, profile images, and links to YouTube videos reflecting gang-related music culture towards solving the gang member profile identification problem. Features from this review are used to train a series of supervised machine learning classifiers and they are further improved upon by using word embeddings learned over a large corpus of tweets. Experimental results demonstrate that heterogeneous features enabled our classifiers to achieve low false positive rates and promising F1-scores.
JA - Computer Science and Engineering
PB - Wright State University
CY - Dayton
VL - Master of Science
U1 - Download Link - http://rave.ohiolink.edu/etdc/view?acc_num=wright1516054679956178
ER -
TY - THES
T1 - Harassment Detection on Twitter using Conversations
T2 - Computer Science
Y1 - 2017
A1 - Venkatesh Edupuganti
AB - Social media has brought people closer than ever before, but the use of social media has also brought with it a risk of online harassment. Such harassment can have a serious impact on a person such as causing low self-esteem and depression. The past research on detecting harassment on social media is primarily based on the content of messages exchanged on social media. The lack of context when relying on a single social media post can result in a high degree of false alarms. In this study, I focus on the reliable detection of harassment on Twitter by better understanding the context in which a pair of users is exchanging messages, thereby improving precision. Specifically, I use a comprehensive set of features involving content, profiles of users exchanging messages, and the sequence of messages. By analyzing the conversation between users and features such as change of behavior during their conversation, length of conversation and frequency of curse words, I find that the detection of harassment can be improved significantly over merely using content features and user profile information. Experimental results demonstrate that the comprehensive set of features I use in my supervised machine learning classifier achieves F-score of 88.2 and Area Under Curve (AUC) of Receiver Operating Characteristic (ROC) of 94.3.
JA - Computer Science
PB - Wright State University
CY - Dayton
VL - MS
ER -
TY - JOUR
T1 - Increases in synthetic cannabinoids-related harms: Results from a longitudinal web-based content analysis
JF - International Journal of Drug Policy
Y1 - 2017
A1 - Francois R. Lamy
A1 - Raminta Daniulaityte
A1 - Ramzi W. Nahhas
A1 - Monica J. Barratt
A1 - Alan G. Smith
A1 - Amit Sheth
A1 - Silvia S. Martins
A1 - Edward W. Boyer
A1 - Robert G. Carlson
KW - Drug use ontology
KW - NLP text processing
KW - Semantic Web
KW - synthetic cannabinoids
KW - Web-forums
AB - Background: Synthetic Cannabinoid Receptor Agonists (SCRA), also known as “K2” or “Spice,” have drawn considerable attention due to their potential of abuse and harmful consequences. More research is needed to understand user experiences of SCRA-related effects. We use semi-automated information processing techniques through eDrugTrends platform to examine SCRA-related effects and their variations through a longitudinal content analysis of web-forum data. Method: English language posts from three drug-focused web-forums were extracted and analyzed between January 1st 2008 and September 30th 2015. Search terms are based on the Drug Use Ontology (DAO) created for this study (189 SCRA-related and 501 effect-related terms). EDrugTrends NLP-based text processing tools were used to extract posts mentioning SCRA and their effects. Generalized linear regression was used to fit restricted cubic spline functions of time to test whether the proportion of drug-related posts that mention SCRA (and no other drug) and the proportion of these “SCRA-only” posts that mention SCRA effects have changed over time, with an adjustment for multiple testing. Results: 19,052 SCRA-related posts (Bluelight (n = 2782), Forum A (n = 3882), and Forum B (n = 12,388)) posted by 2543 international users were extracted. The most frequently mentioned effects were “getting high” (44.0%), “hallucinations” (10.8%), and “anxiety” (10.2%). The frequency of SCRA-only posts declined steadily over the study period. The proportions of SCRA-only posts mentioning positive effects (e.g., “High” and “Euphoria”) steadily decreased, while the proportions of SCRA-only posts mentioning negative effects (e.g., “Anxiety,” “Nausea,” “Overdose”) increased over the same period. Conclusion: This study’s findings indicate that the proportion of negative effects mentioned in web forum posts and linked to SCRA has increased over time, suggesting that recent generations of SCRA generate more harms. 
This is also one of the first studies to conduct automated content analysis of web forum data related to illicit drug use.
PB - Elsevier
VL - 44
ER -
TY - JOUR
T1 - Investigation of an Indoor Air Quality Sensor for Asthma Management in Children
JF - IEEE Sensors Letters
Y1 - 2017
A1 - Utkarshani Jaimini
A1 - Tanvi Banerjee
A1 - William Romine
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Maninder Kalra
AB - Monitoring indoor air quality is critical because Americans spend 93% of their life indoors, and around 6.3 million children suffer from asthma. We want to passively and unobtrusively monitor the asthma patient’s environment to detect the presence of two asthma-exacerbating activities: smoking and cooking using the Foobot sensor. We propose a data-driven approach to develop a continuous monitoring-activity detection system aimed at understanding and improving indoor air quality in asthma management. In this study, we were successfully able to detect a high concentration of particulate matter, volatile organic compounds, and carbon dioxide during cooking and smoking activities. We detected 1) smoking with an error rate of 1%; 2) cooking with an error rate of 11%; and 3) obtained an overall 95.7% classification accuracy across all events (control, cooking and smoking). Such a system will allow doctors and clinicians to correlate potential asthma symptoms and exacerbation reports from patients with environmental factors without having to personally be present.
PB - IEEE
VL - 1
CP - 2
ER -
TY - JOUR
T1 - IoT Quality Control for Data and Application Needs
JF - IEEE Intelligent Systems
Y1 - 2017
A1 - Tanvi Banerjee
A1 - Amit Sheth
KW - IoT
KW - mHealth
KW - sensor analysis
KW - sensor data quality
AB - With the rapid growth of sensors and devices that communicate—that is, the Internet of Things (IoT)—smart devices have permeated every facet of modern life. These IoT devices are within our bodies, on our bodies, in the environment both inside and outside our homes, observing our behavior patterns on a day-to-day basis, and assisting in production systems and surveillance. Figure 1 highlights some of the more popular IoT applications in the world. However, with these sensors’ ubiquity and pervasiveness comes vast amounts of data that need to be processed and analyzed to extract meaningful or actionable information from the data for recommending appropriate changes in the real world. This requires using not only semantic approaches, but also data streamlining to ensure that the decisions made are not erroneous. Moreover, due to the sheer volume of the data from these IoT devices, any errors from user entry, data corruption, data accumulation, data integration, or data processing can snowball, causing massive errors that can detrimentally affect the decision-making process. Consequently, there needs to be a clear understanding of the challenges associated with data quality and a way to evaluate and ensure that data quality is maintained for different applications.
PB - IEEE
VL - 32
CP - 2
ER -
TY - Generic
T1 - kHealth Bariatrics: A Multisensory Approach to Monitoring Patients' Postsurgical Behavior
Y1 - 2017
A1 - Revathy Venkatramanan
A1 - Utkarshani Jaimini
A1 - Amit Sheth
A1 - Joon K Shim
A1 - Priti Parikh
A1 - Dene S Berman
AB - The rate of obesity is on the rise, reaching epidemic proportions. According to the American Society for Metabolic and Bariatric Surgery (ASMBS), 500 million people all over the world are obese. Data from the Centers for Disease Control and Prevention (CDC) show that more than 36% of adults in the United States have obesity. According to the World Health Organization (WHO), 65% of the world’s population lives in countries where the occurrence of death due to overweight and obesity is higher than being underweight. It is well established that weight loss surgery can play a significant role in reducing, or even eliminating, medical problems associated with obesity. Weight recidivism is one of the biggest challenges following bariatric surgery. As many as 50% of patients may regain a small amount of weight two years or more following their bariatric surgery. A lifetime commitment to diet and behavior modifications after surgery is essential for success. In this project, computer scientists working at Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, are collaborating with a bariatric surgeon and a psychologist to bolster weight loss surgery patients for appropriate postsurgical progress. In our mobile personalized digital health solution, we use an Android application coupled with sensors to monitor patients’ compliance with post-surgery progress and motivate patients to have proper follow-ups. The sensors include a wireless weighing machine that automatically sends data to the cloud, an activity and sleep monitoring wristband which also measures heart rate, and a water bottle sensor and pill bottle sensor which prompt the patient for proper intake of water and vitamin pills. Additionally, the Android app with its simple questionnaire helps in monitoring the patient’s diet and emotional well-being. 
One of the key challenges for the surgeon is to continuously monitor the patient to identify the deviations from recommended postsurgical guidelines. We aid bariatric surgeons to identify noncompliance with direction by providing aggregated data of all the primary parameters to be monitored. We also monitor patient’s mental health, following diet and sleep cycle. Thus, a joint effort with the surgeon and psychologist to track patient’s postsurgical behavior differentiates our approach from others and contributes to improved outcomes for bariatric surgery patients.
CY - WSU Celebration of Research, Scholarship, and Creative Activities 2017
ER -
TY - CONF
T1 - A Knowledge Graph Framework for Detecting Traffic Events Using Stationary Cameras
T2 - Industrial Knowledge Graphs 2017 Workshop (co-located with 9th International ACM Web Science Conference 2017)
Y1 - 2017
A1 - Roopteja Muppalla
A1 - Sarasi Lalithsena
A1 - Tanvi Banerjee
A1 - Amit Sheth
KW - Knowledge graphs
KW - Traffic events
KW - Traffic image feature extraction
AB - With the rapid increase in urban development, it is critical to utilize dynamic sensor streams for traffic understanding, especially in larger cities where route planning or infrastructure planning is more critical. This creates a strong need to understand traffic patterns using ubiquitous sensors to allow city officials to be better informed when planning urban construction and to provide an understanding of the traffic dynamics in the city. In this study, we propose our framework ITSKG (Imagery-based Traffic Sensing Knowledge Graph) which utilizes the stationary traffic camera information as sensors to understand the traffic patterns. The proposed system extracts image-based features from traffic camera images, adds a semantic layer to the sensor data for traffic information, and then labels traffic imagery with semantic labels such as congestion. We share a prototype example to highlight the novelty of our system and provide an online demo to enable users to gain a better understanding of our system. This framework adds a new dimension to existing traffic modeling systems by incorporating dynamic image-based features as well as creating a knowledge graph to add a layer of abstraction to understand and interpret concepts like congestion to the traffic event detection system.
JA - Industrial Knowledge Graphs 2017 Workshop (co-located with 9th International ACM Web Science Conference 2017)
CY - Troy, NY
ER -
TY - CONF
T1 - Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
T2 - 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
Y1 - 2017
A1 - Amit Sheth
A1 - Sujan Perera
A1 - Sanjaya Wijeratne
A1 - Krishnaprasad Thirunarayan
KW - Emoji Sense Disambiguation
KW - Implicit Entity Recognition
KW - Knowledge-driven deep content understanding
KW - Knowledge-enhanced Machine Learning
KW - Knowledge-enhanced NLP
KW - Machine intelligence
KW - Multimodal exploitation
KW - Personalized Digital Health
KW - Semantic-Cognitive-Perceptual Computing
KW - Understanding complex text
AB - Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition for utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.
JA - 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
PB - ACM
CY - Leipzig, Germany
SN - 978-1-4503-4951-2/17/08
ER -
TY - JOUR
T1 - Machine Learning for Internet of Things Data Analysis: A Survey
JF - Digital Communication and Networks
Y1 - 2017
A1 - Mohammad Saeid Mahdavinejad
A1 - Mohammadreza Rezvan
A1 - Mohammadamin Barekatain
A1 - Peyman Adibi
A1 - Payam Barnaghi
A1 - Amit Sheth
KW - Internet of Things
KW - machine learning
KW - Smart City
KW - Smart Data
AB - Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected devices technology, referred to as Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is the presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information. The potential and challenges of machine learning for IoT data analytics are also discussed. A use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for a more detailed exploration.
PB - ELSEVIER
UR - http://www.sciencedirect.com/science/article/pii/S235286481730247X
ER -
TY - CONF
T1 - A Novel Approach for Classifying Gene Expression Data using Topic Modeling
T2 - 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Y1 - 2017
A1 - Soon Jye Kho
A1 - Hima Bindu Yalamanchili
A1 - Michael L. Raymer
A1 - Amit Sheth
KW - Cancer
KW - Classification
KW - clustering
KW - Gene Expression
KW - Latent Dirichlet Allocation
KW - machine learning
KW - Topic modeling
AB - Understanding the role of differential gene expression in cancer etiology and cellular process is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has been recently applied for clustering and exploring genomic data but not for classification and prediction. Here, we propose to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer and breast cancer messenger RNA (mRNA) sequencing data. We describe our study in three phases: clustering, classification, and gene interpretation. First, LDA is used as a clustering algorithm to group the data in an unsupervised manner. Next, we develop a novel LDA-based classification approach to classify unknown samples based on similarity of co-expression patterns. Evaluation to assess the effectiveness of this approach shows that LDA can achieve high accuracy compared to alternative approaches. Lastly, we present a functional analysis of the genes identified using a novel topic profile matrix formulation. This analysis identified several genes and pathways that could potentially be involved in differentiating tumor samples from normal. Overall, our results project LDA as a promising approach for classification of tissue types based on gene expression data in cancer studies.
JA - 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
PB - ACM
CY - Boston, MA
ER -
TY - CONF
T1 - PhD Forum: Multimodal IoT and EMR Based Smart Health Application for Asthma Management in Children
T2 - 2017 IEEE International Conference on Smart Computing (SMARTCOMP)
Y1 - 2017
A1 - Utkarshani Jaimini
KW - Air quality
KW - Humidity
KW - Medical services
KW - Monitoring
KW - Pediatrics
KW - Sensors
KW - Temperature measurement
AB - According to a study done in 2014 by the National Health Interview Survey, around 6.3 million children in the United States suffer from asthma [1]. Asthma remains one of the leading reasons for pediatric admissions to children's hospitals, and has a prevalence rate of approximately 10% in children and it leads to missed days from school and other societal costs. This occurs despite improved medications to control asthma symptoms. Asthma management is challenging as it involves understanding asthma causes and avoiding asthma triggers that are both multi-factorial and individualistic in nature. It is almost impossible for doctors to constantly monitor each patient's health and environmental triggers. According to a recent article, the IoT device market in health-care will increase to a worth of 17 billion by the year 2020 [2]. The monitoring segment of IoT devices have been predicted to increase 15 billion in 2017 [5]. The sales of smart watches, fitness and health trackers, are expected to account for more than 70% of all wearables sale worldwide in 2016 [6]. According to IBM, the volume of health-care data has reached to 150 exabytes in 2017 [7]. The data generated from these consumer graded devices is increasing day by day. This data collection has exacerbated the problem of understanding the data and making sense of it. We can use these low-cost sensors and consumer graded devices for continuous monitoring and management of asthma patients. We developed kHealth [1], a framework for continuous monitoring of the patient's personal, public and population-based health signals and send alerts to the patient when a condition deserves patient's or clinician's attention. This can assist the clinician in determining the triggers and deciding the future course of action for prevention and treatment of the disease. 
More importantly, it can also help a patient to better take control of his/her health management by taking more timely actions (e.g., in case of asthma, using an inhaler in a more timely manner to ward off an attack). Our kHealth framework goes well beyond the efforts of data collection and focuses on contextual and personalized processing of multi-modal data to help understand asthma control level and vulnerability score (change in conditions that increases the chances of an adverse event, thus requiring proactive action). Another unique aspect of our research is close collaboration with clinician combined with on-going evaluation of clinician's at the Dayton Children's hospital which involves an ongoing trial of our novel technical approach with a cohort of 200 patients.
JA - 2017 IEEE International Conference on Smart Computing (SMARTCOMP)
PB - IEEE
CY - Hong Kong, China
ER -
TY - CONF
T1 - PrivateGraph: A Cloud-Centric System for Spectral Analysis of Large Encrypted Graphs
T2 - 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
Y1 - 2017
A1 - S. Sharma
A1 - K. Chen
KW - Algorithm design and analysis
KW - Cloud Computing
KW - cloud-centric system
KW - cloud-client interaction protocols
KW - Clustering algorithms
KW - cryptography
KW - Data privacy
KW - distributed databases
KW - eigen-decomposition algorithms
KW - eigenvalues and eigenfunctions
KW - encrypted data
KW - framework scalability
KW - large encrypted graphs
KW - privacy-preserving data submission
KW - PrivateGraph
KW - result quality
KW - spectral analysis
AB - Graph datasets have invaluable use in business applications and scientific research. Because of the growing size and dynamically changing nature of graphs, graph data owners may want to use public cloud infrastructures to store, process, and perform graph analytics. However, when outsourcing data and computation, data owners are at burden to develop methods to preserve data privacy and data ownership from curious cloud providers. This demonstration exhibits a prototype system for privacy-preserving spectral analysis framework for large graphs in public clouds (PrivateGraph) that allows data owners to collect graph data from data contributors, and store and conduct secure graph spectral analysis in the cloud with preserved privacy and ownership. This demo system lets its audience interactively learn the major cloud-client interaction protocols: the privacy-preserving data submission, the secure Lanczos and Nyström approximate eigen-decomposition algorithms that work over encrypted data, and the outcome of an important application of spectral analysis - spectral clustering. In the process of demonstration the audience will understand the intrinsic relationship amongst costs, result quality, privacy, and scalability of the framework.
JA - 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
CY - Atlanta, GA, USA
ER -
TY - CONF
T1 - PrivateGraph: A Cloud-Centric System for Spectral Analysis of Large Encrypted Graphs
T2 - Distributed Computing Systems (ICDCS), 2017 IEEE 37th International Conference
Y1 - 2017
A1 - Sagar Sharma
A1 - Keke Chen
AB - Graph datasets have invaluable use in business applications and scientific research. Because of the growing size and dynamically changing nature of graphs, graph data owners may want to use public cloud infrastructures to store, process, and perform graph analytics. However, when outsourcing data and computation, data owners are at burden to develop methods to preserve data privacy and data ownership from curious cloud providers. This demonstration exhibits a prototype system for privacy-preserving spectral analysis framework for large graphs in public clouds (PrivateGraph) that allows data owners to collect graph data from data contributors, and store and conduct secure graph spectral analysis in the cloud with preserved privacy and ownership. This demo system lets its audience interactively learn the major cloud-client interaction protocols: the privacy-preserving data submission, the secure Lanczos and Nyström approximate eigen-decomposition algorithms that work over encrypted data, and the outcome of an important application of spectral analysis - spectral clustering. In the process of demonstration the audience will understand the intrinsic relationship amongst costs, result quality, privacy, and scalability of the framework.
JA - Distributed Computing Systems (ICDCS), 2017 IEEE 37th International Conference
PB - IEEE
CY - Atlanta, GA, USA
SN - 978-1-5386-1792-2
ER -
TY - CONF
T1 - Relatedness-based Multi-Entity Summarization
T2 - In IJCAI: proceedings of the conference (Vol. 2017, p. 1060)
Y1 - 2017
A1 - Kalpa Gunaratna
A1 - Amir Hossein Yazdavar
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Gong Cheng
KW - clustering
KW - Entity Summarization
KW - knowledge graph
AB - Representing world knowledge in a machine processable format is important as entities and their descriptions have fueled tremendous growth in knowledge-rich information processing platforms, services, and systems. Prominent applications of knowledge graphs include search engines (e.g., Google Search and Microsoft Bing), email clients (e.g., Gmail), and intelligent personal assistants (e.g., Google Now, Amazon Echo, and Apple’s Siri). In this paper, we present an approach that can summarize facts about a collection of entities by analyzing their relatedness in preference to summarizing each entity in isolation. Specifically, we generate informative entity summaries by selecting: (i) inter-entity facts that are similar and (ii) intra-entity facts that are important and diverse. We employ a constrained knapsack problem solving approach to efficiently compute entity summaries. We perform both qualitative and quantitative experiments and demonstrate that our approach yields promising results compared to two other stand-alone state-of-the-art entity summarization approaches.
JA - In IJCAI: proceedings of the conference (Vol. 2017, p. 1060)
CY - Melbourne, Australia
VL - Vol. 2017
ER -
TY - CONF
T1 - Relatedness-based Multi-Entity Summarization
T2 - International Joint Conference on Artificial Intelligence 2017 (IJCAI-17)
Y1 - 2017
A1 - Kalpa Gunaratna
A1 - Amir Hossein Yazdavar
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Gong Cheng
AB - Representing world knowledge in a machine processable format is important as entities and their descriptions have fueled tremendous growth in knowledge-rich information processing platforms, services, and systems. Prominent applications of knowledge graphs include search engines (e.g., Google Search and Microsoft Bing), email clients (e.g., Gmail), and intelligent personal assistants (e.g., Google Now, Amazon Echo, and Apple’s Siri). In this paper, we present an approach that can summarize facts about a collection of entities by analyzing their relatedness in preference to summarizing each entity in isolation. Specifically, we generate informative entity summaries by selecting: (i) inter-entity facts that are similar and (ii) intra-entity facts that are important and diverse. We employ a constrained knapsack problem solving approach to efficiently compute entity summaries. We perform both qualitative and quantitative experiments and demonstrate that our approach yields promising results compared to two other stand-alone state-of-the-art entity summarization approaches.
JA - International Joint Conference on Artificial Intelligence 2017 (IJCAI-17)
CY - Melbourne, Australia
ER -
TY - CONF
T1 - Road Accidents Bigdata Mining and Visualization using Support Vector Machines
T2 - The ICMLA 2017: 19th International Conference on Machine Learning and Applications
Y1 - 2017
A1 - Usha Lokala
A1 - Prabhakar K Sharma
A1 - Srinivas Nowduri
JA - The ICMLA 2017: 19th International Conference on Machine Learning and Applications
CY - Copenhagen, Denmark
ER -
TY - CONF
T1 - RQUERY: Rewriting Natural Language Queries on Knowledge Graphs to Alleviate the Vocabulary Mismatch Problem
T2 - 31st AAAI Conference on Artificial Intelligence (AAAI-17)
Y1 - 2017
A1 - Saeedeh Shekarpour
A1 - Edgard Marx
A1 - Sören Auer
A1 - Amit Sheth
AB - For non-expert users, a textual query is the most popular and simple means for communicating with a retrieval or question answering system. However, there is a risk of receiving queries which do not match the background knowledge. Query expansion and query rewriting are solutions for this problem, but they are in danger of potentially yielding a large number of irrelevant words, which in turn negatively influences runtime as well as accuracy. In this paper, we propose a new method for automatically rewriting input queries on graph-structured RDF knowledge bases. We employ a Hidden Markov Model to determine the most suitable derived words from linguistic resources. We introduce the concept of triple-based co-occurrence for recognizing co-occurring words in RDF data. This model was bootstrapped with three statistical distributions. Our experimental study demonstrates the superiority of the proposed approach to the traditional n-gram model.
JA - 31st AAAI Conference on Artificial Intelligence (AAAI-17)
CY - San Francisco, California
ER -
TY - CONF
T1 - Seasonality in dynamic stochastic block models
T2 - Proceedings of the International Conference on Web Intelligence
Y1 - 2017
A1 - Jace Robinson
A1 - Derek Doran
AB - Sociotechnological and geospatial processes exhibit time-varying structure that makes insight discovery challenging. This paper proposes a new statistical model for such systems, modeled as dynamic networks, to address this challenge. It assumes that vertices fall into one of k types and that the probability of edge formation at a particular time depends on the types of the incident nodes and the current time. The time dependencies are driven by unique seasonal processes, which many systems exhibit (e.g., predictable spikes in geospatial or web traffic each day). The paper defines the model as a generative process and an inference procedure to recover the seasonal processes from data when they are unknown. Evaluation with synthetic dynamic networks shows the recovery of the latent seasonal processes that drive their formation.
JA - Proceedings of the International Conference on Web Intelligence
PB - ACM
CY - Leipzig, Germany
ER -
TY - THES
T1 - Semantic Web Foundations for Representing, Reasoning, and Traversing Contextualized Knowledge Graphs
T2 - Department of Computer Science & Engineering
Y1 - 2017
A1 - Vinh Nguyen
KW - Contextualized Knowledge Graph
KW - Model theoretic semantics
KW - RDF Data Model
KW - Semantic Web
KW - Singleton Property
AB - Semantic Web technologies such as RDF and OWL have become World Wide Web Consortium (W3C) standards for knowledge representation and reasoning. RDF triples about triples, or meta triples, form the basis for a contextualized knowledge graph. They represent the contextual information about individual triples such as the source, the occurring time or place, or the certainty. However, an efficient RDF representation for such meta-knowledge of triples remains a major limitation of the RDF data model. The existing reification approach allows such meta-knowledge of RDF triples to be expressed in RDF by using four triples per reified triple. While reification is simple and intuitive, this approach does not have a formal foundation and is not commonly used in practice as described in the RDF Primer. This dissertation presents the foundations for representing, querying, reasoning and traversing the contextualized knowledge graphs (CKG) using Semantic Web technologies. A triple-based compact representation for CKGs: We propose a principled approach and construct RDF triples about triples by extending the current RDF data model with a new concept, called singleton property (SP), as a triple identifier. The SP representation needs two triples to the RDF datasets and can be queried with SPARQL. A formal model-theoretic semantics for CKGs: We formalize the semantics of the singleton property and its relationships with the triple it represents. We extend the current RDF model-theoretic semantics to capture the semantics of the singleton properties and provide the interpretation at three levels: simple, RDF, and RDFS. It provides a single interpretation of the singleton property semantics across applications and systems. A sound and complete inference mechanism for CKGs: Based on the semantics we propose, we develop a set of inference rules for validating and inferring new triples based on the SP syntax. 
We also develop different sets of context-based inference rules for provenance, time, and uncertainty. A graph-based formalism for CKGs: We propose a formal contextualized graph model for the SP representation. We formalize the RDF triples as a mathematical graph by combining the model theory and the graph theory into a hybrid RDF formal semantics. The unified semantics allows the RDF formal semantics to be leveraged in the graph-based algorithms.
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - Ph.D
ER -
TY - CONF
T1 - A Semantics-Based Measure of Emoji Similarity
T2 - 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
Y1 - 2017
A1 - Sanjaya Wijeratne
A1 - Lakshika Balasuriya
A1 - Amit Sheth
A1 - Derek Doran
KW - Emoji Analysis and Search
KW - Emoji Similarity
KW - semantic similarity
AB - Emoji have grown to become one of the most important forms of communication on the web. With its widespread use, measuring the similarity of emoji has become an important problem for contemporary text processing since it lies at the heart of sentiment analysis, search, and interface design tasks. This paper presents a comprehensive analysis of the semantic similarity of emoji through embedding models that are learned over machine-readable emoji meanings in the EmojiNet knowledge base. Using emoji descriptions, emoji sense labels and emoji sense definitions, and with different training corpora obtained from Twitter and Google News, we develop and test multiple embedding models to measure emoji similarity. To evaluate our work, we create a new dataset called EmoSim508, which assigns human-annotated semantic similarity scores to a set of 508 carefully selected emoji pairs. After validation with EmoSim508, we present a real-world use-case of our emoji embedding models using a sentiment analysis task and show that our models outperform the previous best-performing emoji embedding model on this task. The EmoSim508 dataset and our emoji embedding models are publicly released with this paper and can be downloaded from http://emojinet.knoesis.org/.
JA - 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
PB - ACM
CY - Leipzig, Germany
SN - 978-1-4503-4951-2/17/08
ER -
TY - THES
T1 - Semantics-based Summarization of Entities in Knowledge Graphs
T2 - Department of Computer Science & Engineering
Y1 - 2017
A1 - Kalpa Gunaratna
AB - The processing of structured and semi-structured content on the Web has been gaining attention with the rapid progress in the Linking Open Data project and the development of commercial knowledge graphs. Knowledge graphs capture domain-specific or encyclopedic knowledge in the form of a data layer and add rich and explicit semantics on top of the data layer to infer additional knowledge. The data layer of a knowledge graph represents entities and their descriptions. The semantic layer on top of the data layer is called the schema (ontology), where relationships of the entity descriptions, their classes, and the hierarchy of the relationships and classes are defined. Today, there exist large knowledge graphs in the research community (e.g., encyclopedic datasets like DBpedia and Yago) and corporate world (e.g., Google knowledge graph) that encapsulate a large amount of knowledge for human and machine consumption. Typically, they consist of millions of entities and billions of facts describing these entities. While it is good to have this much knowledge available on the Web for consumption, it leads to information overload, and hence proper summarization (and presentation) techniques need to be explored. In this dissertation, we focus on creating both comprehensive and concise entity summaries at: (i) the single entity level and (ii) the multiple entity level. To summarize a single entity, we propose a novel approach called FACeted Entity Summarization (FACES) that considers importance, which is computed by combining popularity and uniqueness, and diversity of facts getting selected for the summary. We first conceptually group facts using semantic expansion and hierarchical incremental clustering techniques and form facets (i.e., groupings) that go beyond syntactic similarity. Then we rank both the facts and facets using Information Retrieval (IR) ranking techniques to pick the highest ranked facts from these facets for the summary. 
The important and unique contribution of this approach is that because of its generation of facets, it adds diversity into entity summaries, making them comprehensive. For creating multiple entity summaries, we propose the RElatedness-based Multi-Entity Summarization (REMES) approach that simultaneously processes facts belonging to the given entities using combinatorial optimization techniques. In this process, we maximize diversity and importance of facts within each entity summary and relatedness of facts between the entity summaries. The proposed approach uniquely combines semantic expansion, graph-based relatedness, and combinatorial optimization techniques to generate relatedness-based multi-entity summaries. Complementing the entity summarization approaches, we introduce a novel approach using light Natural Language Processing (NLP) techniques to enrich knowledge graphs by adding type semantics to literals. This makes datatype properties semantically rich compared to having only implementation types. As a result of the enrichment process, we could use both object and datatype properties in the entity summaries, which improves coverage. Moreover, the added type semantics can be useful in other applications like dataset profiling and data integration. We evaluate the proposed approaches against the state-of-the-art methods and highlight their capabilities for single and multiple entity summarization.
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - CONF
T1 - Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media
T2 - Advances in Social Networks Analysis and Mining (ASONAM)
Y1 - 2017
A1 - Amir Hossein Yazdavar
A1 - Hussein S. Al-Olimat
A1 - Monireh Ebrahimi
A1 - Goonmeet Bajaj
A1 - Tanvi Banerjee
A1 - Krishnaprasad Thirunarayan
A1 - Jyotishman Pathak
A1 - Amit Sheth
KW - Depression
KW - Mental Health
KW - Natural Language Processing
KW - Semi-supervised Machine Learning
KW - Social Media
AB - With the rise of social media, millions of people are routinely expressing their moods, feelings, and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential for detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expression on Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.
JA - Advances in Social Networks Analysis and Mining (ASONAM)
PB - Advances in Social Networks Analysis and Mining (ASONAM), 2017 IEEE/ACM International Conference on
CY - Sydney, Australia
ER -
TY - JOUR
T1 - Sentinels of breach: Lexical choice as a measure of urgency in social media
JF - Human factors
Y1 - 2017
A1 - Andrew J Hampton
A1 - Valerie Shalin
KW - disaster response
KW - Psycholinguistics
AB - Objective: This paper identifies general properties of language style in social media to help identify areas of need in disasters. Background: In the search for metrics of need in social media data, much of the existing literature ignores processes of language usage. Psychological concepts, such as narrative breach, Gricean maxims, and lexical marking in cognition, may assist the recovery of disaster-relevant metrics from altered patterns of word prevalence. Method: We analyzed several hundred thousand location-specific microblogs from Twitter for Hurricane Sandy, Oklahoma tornadoes, and the Boston Marathon bombing along with a fantasy football control corpus, examining the relative frequency of words in 36 antonym pairs. We compared the ratio of words within these pairs to the corresponding ratios recovered from an online word norm database. Results: Partial rank correlation values between observed antonym ratios demonstrate consistent patterns across disasters. For Hurricane Sandy data, 25 antonym pairs have moderate to large effect sizes for discrepancies between observed and normative ratios. Across disasters, 7 pairs are stable and meet effect size criteria. Sentiment analysis, supplementary word frequency counts with respect to disaster proximity, and examples support a "breach" account for the observed results. Conclusion: Lexical choice between antonyms, only somewhat related to sentiment, suggests that social media capture wide-ranging breaches of normal functioning. Application: Antonym selection contributes to screening tools based on language style for identifying relevant content and quantifying disruption using social media without the a priori specification of content keywords.
PB - SAGE Publications
VL - 59
CP - 4
ER -
TY - JOUR
T1 - A soft computing approach for benign and malicious web robot detection
JF - Expert Systems with Applications
Y1 - 2017
A1 - Mahdieh Zabihimayvan
A1 - Reza Sadeghi
A1 - H. Nathan Rude
A1 - Derek Doran
KW - Fuzzy Rough Set Theory
KW - Malicious web agents
KW - Markov clustering algorithm
KW - Web crawler
KW - Web Robot Detection
AB - The accurate detection of web robot sessions from a web server log is essential to take accurate traffic-level measurements and to protect the performance and privacy of information on a Web server. Moreover, the irrecoverable risks of visits from malicious robots that intentionally try to evade web server intrusion detection systems, covering up their visits with fabricated fields in their HTTP request packets, cannot be ignored. To separate both types of robots from humans in practice, analysts turn to heuristic methods or state-of-the-art soft computing approaches that have only been tuned to the specification of a kind of web server. Noting that the landscape of web robot agents is ever changing, and that behavioral patterns and characteristics vary across different web servers, both options are lacking. To overcome this challenge, this paper presents SMART, a soft computing system that simultaneously detects benign and malicious types of robot agents from web server logs and can automatically adapt to the session characteristics of a web server. The results of experiments over some access log file servers, each servicing different domains of the web, demonstrate that the proposed method outperforms state-of-the-art ones for benign and malicious robot detection.
PB - Elsevier
VL - 87
UR - http://www.sciencedirect.com/science/article/pii/S0957417417304116
ER -
TY - CHAP
T1 - A Soft Computing Prefetcher to Mitigate Cache Degradation by Web Robots
T2 - Advances in Neural Networks - ISNN 2017
Y1 - 2017
A1 - Xie, Ning
A1 - Kyle Brown
A1 - Rude, Nathan
A1 - Derek Doran
ED - Fengyu Cong
ED - Leung, Andrew
ED - Wei, Qinglai
AB - This paper investigates the feasibility of a resource prefetcher able to predict future requests made by web robots, which are software programs rapidly overtaking human users as the dominant source of web server traffic. Such a prefetcher is a crucial first line of defense for web caches and content management systems that must service many requests while maintaining good performance. Our prefetcher marries a deep recurrent neural network with a Bayesian network to combine prior global data with local data about specific robots. Experiments with traffic logs from web servers across two universities demonstrate improved predictions over a traditional dependency graph approach. Finally, preliminary evaluation of a hypothetical caching system that incorporates our prefetching scheme is discussed.
JA - Advances in Neural Networks - ISNN 2017
PB - Springer International Publishing
CY - Cham
VL - 1
SN - 978-3-319-59072-1
UR - http://dx.doi.org/10.1007/978-3-319-59072-1_63
ER -
TY - JOUR
T1 - SPIN: Cleaning, Monitoring, and Querying Image Streams Generated by Ground-Based Telescopes for Space Situational Awareness
Y1 - 2017
A1 - Keke Chen
A1 - Bharath Avusherla
A1 - Sarah Allison
A1 - Vincent Schmidt
KW - Data Stream
KW - Image Processing
KW - Image Query Processing
KW - machine learning
KW - Space Situational Awareness
AB - With the increasing number of objects in Earth orbits, space situational awareness (SSA) becomes critical to space safety. As an economical option, ground-based telescopes can be deployed around the world and continuously provide imagery information of space objects. However, they also raise unique challenges regarding big, noisy, and streaming data processing. In this paper, we present the SPIN system to address these challenges. The core algorithms process image sequences generated by ground-based telescopes and conduct: (1) image quality classification for data cleaning, (2) stream-based key-object identification and anomaly detection, and (3) efficient query processing on large image sequence repositories. Our goal is to design or adopt algorithms that handle the domain-specific image streams most efficiently and effectively. We use a 17-inch telescope to collect a large real dataset for evaluating the core algorithms, which covers more than ten satellites in one month and contains about 16,400 images. The experimental results show that the developed algorithms are fast enough for stream-based real-time processing and also yield high-quality results for all the primary tasks.
N1 - Citation
Keke Chen, Bharath Avusherla, Sarah Allison, Vincent Schmidt
Data Intensive Analysis and Computing Lab
Department of Computer Science and Engineering
Wright State University, OH, USA
E-mail: {keke.chen, avusherla.2, allison.24}@wright.edu
ER -
TY - CONF
T1 - Torpedo: Improving the State-of-the-Art RDF Dataset Slicing
T2 - 11th International Conference on Semantic Computing (IEEE ICSC 2017)
Y1 - 2017
A1 - Edgard Marx
A1 - Saeedeh Shekarpour
A1 - Tommaso Soru
A1 - Adrian Brasoveanu
A1 - Muhammad Saleem
A1 - Ciro Baron
A1 - Albert Weichselbraun
A1 - Jens Lehmann
A1 - Axel-Cyrille Ngonga Ngomo
A1 - Sören Auer
AB - Over the last years, the amount of data published as Linked Data on the Web has grown enormously. In spite of the high availability of Linked Data, organizations still encounter an accessibility challenge while consuming it. This is mostly due to the large size of some of the datasets published as Linked Data. The core observation behind this work is that a subset of these datasets suffices to address the needs of most organizations. In this paper, we introduce Torpedo, an approach for efficiently selecting and extracting relevant subsets from RDF datasets. In particular, Torpedo adds optimization techniques to reduce the cost of seek operations, as well as support for multi-join graph patterns and SPARQL FILTERs that enable a more granular data selection. We compare the performance of our approach with existing solutions on nine different queries against four datasets. Our results show that our approach is highly scalable and is up to 26% faster than the current state-of-the-art RDF dataset slicing approach.
JA - 11th International Conference on Semantic Computing (IEEE ICSC 2017)
CY - San Diego, California
ER -
TY - JOUR
T1 - On Using the Intelligent Edge for IoT Analytics
JF - IEEE Intelligent Systems
Y1 - 2017
A1 - Pankesh Patel
A1 - Muhammad Intizar Ali
A1 - Amit Sheth
KW - Artificial Intelligence
KW - Bandwidth
KW - Cloud Computing
KW - Data Analysis
KW - Edge computing
KW - Internet of Things
KW - Mobile communication
KW - Sensors
AB - This article presents a flexible architecture for Internet of Things (IoT) data analytics using the concept of fog computing. The authors identify different actors and their roles in order to design adaptive IoT data analytics solutions. The presented approach can be used to effectively design robust IoT applications that require a tradeoff between cloud- and edge-based computing depending on dynamic application requirements. The potential use cases of this technology can be found in scenarios such as smart cities, security surveillance, and smart manufacturing, where the quality of user experience is important.
PB - IEEE
VL - 32
UR - http://ieeexplore.ieee.org/document/8070894/
CP - 5
ER -
TY - JOUR
T1 - What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, and Prevention
JF - JMIR Public Health Surveillance
Y1 - 2017
A1 - Michele Miller
A1 - Tanvi Banerjee
A1 - Roopteja Muppalla
A1 - William Romine
A1 - Amit Sheth
KW - Epidemiology
KW - machine learning
KW - Social Media
KW - viruses
AB - Background: In order to harness what people are tweeting about Zika, there needs to be a computational framework that leverages machine learning techniques to recognize relevant Zika tweets and, further, categorize these into disease-specific categories to address specific societal concerns related to the prevention, transmission, symptoms, and treatment of Zika virus. Objective: The purpose of this study was to determine the relevancy of the tweets and what people were tweeting about the 4 disease characteristics of Zika: symptoms, transmission, prevention, and treatment. Methods: A combination of natural language processing and machine learning techniques was used to determine what people were tweeting about Zika. Specifically, a two-stage classifier system was built to find relevant tweets about Zika, and then the tweets were categorized into 4 disease categories. Tweets in each disease category were then examined using latent Dirichlet allocation (LDA) to determine the 5 main tweet topics for each disease characteristic. Results: Over 4 months, 1,234,605 tweets were collected. The number of tweets by males and females was similar (28.47% [351,453/1,234,605] and 23.02% [284,207/1,234,605], respectively). The classifier performed well on the training and test data for relevancy (F1 score=0.87 and 0.99, respectively) and disease characteristics (F1 score=0.79 and 0.90, respectively). Five topics for each category were found and discussed, with a focus on the symptoms category. Conclusions: We demonstrate how categories of discussion on Twitter about an epidemic can be discovered so that public health officials can understand specific societal concerns within the disease-specific categories. Our two-stage classifier was able to identify relevant tweets to enable more specific analysis, including the specific aspects of Zika that were being discussed as well as misinformation being expressed. 
Future studies can capture sentiments and opinions on epidemic outbreaks like Zika virus in real time, which will likely inform efforts to educate the public at large.
PB - JMIR Publications
VL - 3
ER -
TY - CONF
T1 - Analyzing Clinical Depressive Symptoms in Twitter
T2 - 23rd NIMH Conference on Mental Health Services Research (MHSR): Harnessing Science to Strengthen the Public Health Impact
Y1 - 2016
A1 - Amir Hossein Yazdavar
A1 - Hussein S. Al-Olimat
A1 - Tanvi Banerjee
A1 - Krishnaprasad Thirunarayan
A1 - Jyotishman Pathak
A1 - Amit Sheth
AB - Twitter provides a rich source for studying people’s mood in order to detect depressive behaviors. We developed a novel technique to unobtrusively analyze individual posts in social media to detect signs of depression that can be utilized to build a proactive and automatic screening tool for early recognition of clinical depression. Leveraging the clinical definition of depression, we build a depression lexicon that contains common depression symptoms determined by experts, such as those from the established clinical assessment questionnaire PHQ-9. We expanded the terms expressing the nine PHQ-9 depression symptom categories using Urban Dictionary and Big Huge Thesaurus. The lexicon contains depression-related symptoms that are likely to appear in the tweets of individuals either having depressive-like symptoms or suffering from depression. A subset of highly informative seed terms is selected from this depression lexicon for crawling depression-related tweets. For each lexical term, we calculate its association with all of the variations of the term “depress” using Pointwise Mutual Information (PMI) and the Chi-squared test to quantify their correlation and thereby rank order them. We leverage Twitris, our social media analysis platform, to study the language, sentiment, emotions, topics, and people-content network of depressed individuals.
JA - 23rd NIMH Conference on Mental Health Services Research (MHSR): Harnessing Science to Strengthen the Public Health Impact
CY - Bethesda, MD
ER -
TY - JOUR
T1 - Building the Web of Knowledge with Smart IoT Applications
JF - IEEE Intelligent Systems
Y1 - 2016
A1 - Amelie Gyrard
A1 - Pankesh Patel
A1 - Amit Sheth
A1 - Martin Serrano
KW - Internet of Things
KW - Linked Open Data
KW - Linked Open Services
KW - Linked Open Vocabularies
KW - physical-cyber-social computing
KW - Programming Framework
KW - Semantic Web
KW - Semantic Web of Things
KW - Smart Data
KW - Web of Things
AB - The Internet of Things (IoT) is experiencing fast adoption in society, from industrial to home applications. The number of sensors and connected devices deployed on the Internet is changing our perspective and the way we understand the world. The development and generation of IoT applications is just starting, and they will modify our physical and virtual lives, from how we remotely control appliances at home to how we deal with insurance companies in order to start insurance schemes via smart cards. This massive deployment of IoT devices represents a tremendous economic impact and at the same time offers multiple opportunities. However, the potential of IoT is underexploited, and day by day the gap between devices and useful applications is getting bigger. Additionally, the physical and cyber worlds are largely disconnected, requiring a lot of manual effort to integrate, find, and use information in a meaningful way. To build a connection between the physical and the virtual, we need a knowledge framework that allows bilateral understanding: devices producing data, information systems managing the data, and applications transforming information into meaningful knowledge. The first column in this series, in the previous issue of this magazine, titled “Internet of Things to Smart IoT Through Semantic, Cognitive, and Perceptual Computing” [Sheth et al., 2016], reviews the IoT growth and potential that have energized research and technology development, centered on aspects of Artificial Intelligence to build future intelligent systems. This column steps back and demonstrates the benefits of using semantic web technologies to get meaningful knowledge from sensor data to design smart systems.
PB - IEEE
VL - 32
CP - 5
ER -
TY - THES
T1 - Characterizing Concepts in Taxonomy for Entity Recommendations
T2 - Department of Engineering & Computer Science
Y1 - 2016
A1 - Siva Kumar Cheekula
AB - Entity recommendation systems are enormously popular on the Web. These systems harness manually crafted taxonomies for improving recommendations. For example, Yahoo created the Open Directory Project for search and recommendation, and Amazon utilizes its own product taxonomy. While these taxonomies are of high quality, it is a labor and time-intensive process to manually create and keep them up to date. Instead, in this era of Web 2.0 where users collaboratively create large amounts of information on the Web, it is possible to utilize user-generated content to automatically generate good quality taxonomies. However, harnessing such taxonomies for entity recommendations has not been well explored. We exploit the Wikipedia category structure as a taxonomy and explore three prominent characteristics of concepts in the taxonomies for entity recommendations. The three characteristics we explore are: 1) Specificity, 2) Priority, and 3) Domain Relatedness of concepts in the taxonomy. We demonstrate the utility of specificity and priority of concepts in the taxonomies in achieving high quality recommendations by evaluating our recommender system on two diverse datasets.
JA - Department of Engineering & Computer Science
PB - Wright State University
CY - Dayton
VL - M.S.
ER -
TY - CONF
T1 - Clustering for Simultaneous Extraction of Aspects and Features from Reviews
T2 - North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL: HLT)
Y1 - 2016
A1 - Lu Chen
A1 - Justin Martineau
A1 - Doreen Cheng
A1 - Amit Sheth
KW - aspect discovery
KW - aspect-based opinion mining
KW - clustering
KW - feature extraction
AB - This paper presents a clustering approach that simultaneously identifies product features and groups them into aspect categories from online reviews. Unlike prior approaches that first extract features and then group them into categories, the proposed approach combines feature and aspect discovery instead of chaining them. In addition, prior work on feature extraction tends to require seed terms and focus on identifying explicit features, while the proposed approach extracts both explicit and implicit features, and does not require seed terms. We evaluate this approach on reviews from three domains. The results show that it outperforms several state-of-the-art methods on both tasks across all three domains.
JA - North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL: HLT)
PB - NAACL
CY - San Diego, California
ER -
TY - JOUR
T1 - Co-evolution of RDF Datasets
JF - Computing Research Repository (CoRR)
Y1 - 2016
A1 - Sidra Faisal
A1 - Kemele Endris
A1 - Saeedeh Shekarpour
A1 - Sören Auer
KW - Conflict Identification
KW - Conflict Resolution
KW - Dataset Co-evolution
KW - Dataset Synchronization
KW - RDF Dataset
AB - Linking Data initiatives have fostered the publication of a large number of RDF datasets in the Linked Open Data (LOD) cloud, as well as the development of query processing infrastructures to access these data in a federated fashion. However, different experimental studies have shown that the availability of LOD datasets cannot always be ensured, so RDF data replication is required for envisioning reliable federated query frameworks. Albeit enhancing data availability, RDF data replication requires synchronization and conflict resolution when replicas and source datasets are allowed to change data over time, i.e., co-evolution management needs to be provided to ensure consistency. In this paper, we tackle the problem of RDF data co-evolution and devise an approach for conflict resolution during co-evolution of RDF datasets. Our proposed approach is property-oriented and allows for exploiting semantics about RDF properties during co-evolution management. The quality of our approach is empirically evaluated in different scenarios on the DBpedia-live dataset. Experimental results suggest that the proposed techniques have a positive impact on the quality of data in source datasets and replicas.
ER -
TY - PAT
T1 - Data Processing System and Method for Computer-Assisted Coding of Natural Language Medical Text
Y1 - 2016
A1 - Nehal Shah
A1 - Amit Sheth
A1 - Shreyansh Bhatt
A1 - Raxit Goswami
A1 - Vatsal Shah
A1 - Rahil Kanani
A1 - Amrish Patel
A1 - Parth Pathak
AB - A system and method utilizing deep clinical knowledge represented as a knowledge-graph to complement and enhance Natural Language Processing (NLP) for efficient and high-quality computer assisted coding of medical text. One embodiment utilizes the International Classification of Diseases version-10 Procedural Coding System (ICD-10-PCS). The system uses multiple knowledge bases combined with direct mapping provided by the ICD-10-PCS standard to enhance the coverage of assigned code. The system identifies ICD-10-PCS code considering hierarchical mapping and identifies the code by individual ICD-10-PCS character.
PB - ezDI, LLC
CY - USA
VL - US 2016/0132648 A1
ER -
TY - RPRT
T1 - EmojiNet: A Machine Readable Emoji Sense Inventory
Y1 - 2016
A1 - Sanjaya Wijeratne
A1 - Lakshika Balasuriya
A1 - Amit Sheth
A1 - Derek Doran
KW - Emoji Analysis
KW - Emoji Sense Disambiguation
KW - EmojiNet
AB - With the rise of social media, ‘emoji’ have become extremely popular in online communications. People have started using emoji as a new language in social media to add color and whimsiness to their messages. Without rigid semantics attached to them, emoji symbols take on different meanings based on the context of a message. This has resulted in ambiguity in emoji use. Similar to word sense disambiguation, machine readable sense inventories that list emoji meanings are essential for machines to understand emoji without ambiguity. As the first step towards building machines that can understand emoji, this paper presents EmojiNet, the first machine readable sense inventory for emoji. It links Unicode emoji representations to their English meanings extracted from the Web, enabling systems to link emoji with their context-specific meaning. EmojiNet is automatically constructed by integrating multiple emoji resources with BabelNet, which is the most comprehensive multilingual sense inventory available to date. The paper discusses its construction, evaluates the automatic resource creation process, and presents a use case where EmojiNet disambiguates emoji usage in tweets. EmojiNet is available online for use at http://emojinet.knoesis.org.
JA - Wright Brother's Day, Wright State University
CY - Dayton, Ohio, USA
ER -
TY - Generic
T1 - EmojiNet: Building a Machine Readable Sense Inventory for Emoji
T2 - 8th International Conference on Social Informatics (SocInfo 2016)
Y1 - 2016
A1 - Sanjaya Wijeratne
A1 - Lakshika Balasuriya
A1 - Amit Sheth
A1 - Derek Doran
ED - Emma Spiro
ED - Yong-Yeol Ahn
KW - Emoji Analysis
KW - Emoji Sense Disambiguation
KW - EmojiNet
AB - Emoji are a contemporary and extremely popular way to enhance electronic communication. Without rigid semantics attached to them, emoji symbols take on different meanings based on the context of a message. Thus, like the word sense disambiguation task in natural language processing, machines also need to disambiguate the meaning or ‘sense’ of an emoji. In a first step toward achieving this goal, this paper presents EmojiNet, the first machine readable sense inventory for emoji. EmojiNet is a resource enabling systems to link emoji with their context-specific meaning. It is automatically constructed by integrating multiple emoji resources with BabelNet, which is the most comprehensive multilingual sense inventory available to date. The paper discusses its construction, evaluates the automatic resource creation process, and presents a use case where EmojiNet disambiguates emoji usage in tweets. EmojiNet is available online for use at http://emojinet.knoesis.org.
JA - 8th International Conference on Social Informatics (SocInfo 2016)
PB - Springer International Publishing
CY - Bellevue, WA
VL - 10046
SN - 978-3-319-47880-7
U1 - In the reviewers’ words:

"Emoji is an important tool of nonverbal communication, but its usage lacks 'universal', uniform and rigorous semantic attachments. This paper introduces the first machine readable sense inventory for emoji—EmojiNet, a resource enabling systems to link emoji with its context-specific meaning. It is automatically constructed by integrating multiple emoji resources. It is a useful application tool for public use in online communication that will facilitate human interaction."

"The representation of emojis with a tuple of 8 field is well designed and puts in a single place almost all the information available about emojis in previous dictionaries, reported in the previous work section. The authors evaluate the resource under the aspects of image detection/alignment and word sense disambiguation. both evaluation tasks are performed correctly."

ER -
TY - Generic
T1 - Exploring Term Networks for Semantic Search over RDF Knowledge Graphs
T2 - Metadata and Semantics Research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings
Y1 - 2016
A1 - Edgard Marx
A1 - Konrad Hoffner
A1 - Saeedeh Shekarpour
A1 - Axel-Cyrille Ngonga Ngomo
A1 - Jens Lehmann
A1 - Sören Auer
ED - Emmanouel Garoufallou
ED - Imma Subirats Coll
ED - Armando Stellato
ED - Jane Greenberg
KW - Knowledge graphs
KW - rdf
KW - Semantic Search
AB - Information retrieval approaches are considered as a key technology to empower lay users to access the Web of Data. A large number of related approaches such as Question Answering and Semantic Search have been developed to address this problem. While Question Answering promises more accurate results by returning a specific answer, Semantic Search engines are designed to retrieve the best top-K ranked resources. In this work, we propose *path, a Semantic Search approach that explores term networks for querying RDF knowledge graphs. The adequacy of the approach is evaluated employing benchmark datasets against state-of-the-art Question Answering as well as Semantic Search systems. The results show that *path achieves better F1-score than the currently best performing Semantic Search system.
JA - Metadata and Semantics Research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings
PB - Springer International Publishing
CY - Cham
SN - 978-3-319-49157-8
ER -
TY - CONF
T1 - Features for Ranking Tweets Based on Credibility and Newsworthiness.
T2 - 17th International Conference on Collaboration Technologies and Systems (CTS 2016)
Y1 - 2016
A1 - Jacob Ross
A1 - Krishnaprasad Thirunarayan
KW - Credibility
KW - learning to rank
KW - Social Media
KW - twitter
AB - We create a robust and general feature set for learning to rank tweets based on credibility and newsworthiness. In previous works, it has been demonstrated that when the training and testing data are from two distinct time periods, the ranker performs poorly. We improve upon this by creating a feature set that does not overfit a particular year or set of topics. This is critical for robust analysis of social media over time. In order to derive such features, we use the studies done on credibility perception of social media as well as the clues provided in past works in this domain. We also present new features that, to our knowledge, are more effective than the state of the art.
JA - 17th International Conference on Collaboration Technologies and Systems (CTS 2016)
PB - IEEE Computer Society
CY - Orlando, Florida, USA
SN - 978-1-5090-2300-4
ER -
TY - CONF
T1 - Finding Street Gang Members on Twitter
T2 - 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016)
Y1 - 2016
A1 - Lakshika Balasuriya
A1 - Sanjaya Wijeratne
A1 - Derek Doran
A1 - Amit Sheth
KW - Gang Activity Understanding
KW - social media analysis
KW - Street Gangs
KW - Twitter Profile Identification
AB - Most street gang members use Twitter to intimidate others, to present outrageous images and statements to the world, and to share recent illegal activities. Their tweets may thus be useful to law enforcement agencies to discover clues about recent crimes or to anticipate ones that may occur. Finding these posts, however, requires a method to discover gang member Twitter profiles. This is a challenging task since gang members represent a very small population of the 320 million Twitter users. This paper studies the problem of automatically finding gang members on Twitter. It outlines a process to curate one of the largest sets of verifiable gang member profiles that have ever been studied. A review of these profiles establishes differences in the language, images, YouTube links, and emojis gang members use compared to the rest of the Twitter population. Features from this review are used to train a series of supervised classifiers. Our classifier achieves a promising F1 score with a low false positive rate.
JA - 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2016)
CY - San Francisco, CA, USA
VL - 8
ER -
TY - JOUR
T1 - A Formal Graph Model for RDF and Its Implementation
JF - CoRR
Y1 - 2016
A1 - Vinh Nguyen
A1 - Jyoti Leeka
A1 - Olivier Bodenreider
A1 - Amit Sheth
AB - Formalizing an RDF abstract graph model to be compatible with the RDF formal semantics has remained one of the foundational problems in the Semantic Web. In this paper, we propose a new formal graph model for RDF datasets. This model allows us to express the current model-theoretic semantics in the form of a graph. We also propose the concepts of resource path and triple path as well as an algorithm for traversing the new graph. We demonstrate the feasibility of this graph model through two implementations: one is a new graph engine called GraphKE, and the other is extended from RDF-3X to show that existing systems can also benefit from this model. To evaluate the empirical aspect of our graph model, we choose the shortest path algorithm and implement it in both GraphKE and RDF-3X. Our experiments on both engines for finding the shortest paths in the YAGO2S-SP dataset give decent performance in terms of execution time. The empirical results show that our graph model with well-defined semantics can be effectively implemented.
VL - abs/1606.00480
ER -
TY - JOUR
T1 - Gender-based Violence in 140 Characters or Fewer: A #BigData Case Study of Twitter
JF - First Monday
Y1 - 2016
A1 - Hemant Purohit
A1 - Tanvi Banerjee
A1 - Andrew Hampton
A1 - Valerie Shalin
A1 - Nayanesh Bhandutia
A1 - Amit Sheth
KW - Citizen sensing
KW - computational social science
KW - gender-based violence
KW - intervention campaign
KW - policy
KW - public attitude
KW - public awareness
KW - qualitative analysis
KW - quantitative analysis
KW - Social Media
AB - Public institutions are increasingly reliant on data from social media sites to measure public attitude and provide timely public engagement. Such reliance includes the exploration of public views on important social issues such as gender-based violence (GBV). In this study, we examine big (social) data consisting of nearly 14 million tweets collected from Twitter over a period of 10 months to analyze public opinion regarding GBV, highlighting the nature of tweeting practices by geographical location and gender. We demonstrate the utility of computational social science to mine insight from the corpus while accounting for the influence of both transient events and sociocultural factors. We reveal public awareness regarding GBV tolerance and suggest opportunities for intervention and the measurement of intervention effectiveness assisting both governmental and non-governmental organizations in policy development.
VL - 21
CP - 1-4
ER -
TY - Generic
T1 - Gleaning Types for Literals in RDF Triples with Application to Entity Summarization
T2 - 13th International Conference, ESWC 2016
Y1 - 2016
A1 - Kalpa Gunaratna
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Gong Cheng
ED - Harald Sack
ED - Eva Blomqvist
ED - Mathieu d'Aquin
ED - Chiara Ghidini
ED - Simone P. Ponzetto
ED - Christoph Lange
KW - Entity Summarization
KW - Properties
KW - Type assignment
AB - Associating meaning with data in a machine-readable format is at the core of the Semantic Web vision, and typing is one such process. Typing (assigning a class selected from a schema) information can be attached to URI resources in RDF/S knowledge graphs and datasets to improve quality, reliability, and analysis. There are two types of properties: object properties and datatype properties. Type information can be made available for object properties as their object values are URIs. Typed object properties allow richer semantic analysis compared to datatype properties, whose object values are literals. In fact, many datatype properties can be analyzed to suggest types selected from a schema similar to object properties, enabling their wider use in applications. In this paper, we propose an approach to glean types for datatype properties by processing their object values. We show the usefulness of generated types by utilizing them to group facts on the basis of their semantics in computing diversified entity summaries by extending a state-of-the-art summarization algorithm.
JA - 13th International Conference, ESWC 2016
PB - Springer International Publishing
CY - Heraklion, Crete, Greece
VL - 9678
SN - 978-3-319-34128-6
ER -
TY - CONF
T1 - Harnessing Relationships for Domain-specific Subgraph Extraction: A Recommendation Use Case
T2 - IEEE International Conference on Big Data
Y1 - 2016
A1 - Sarasi Lalithsena
A1 - Pavan Kapanipathi
A1 - Amit Sheth
KW - Domain-specific knowledge graph
KW - Recommendation
KW - Relationship Ranking
AB - Applications on the Web such as search engines and recommendation systems are increasingly adopting semantic approaches by leveraging knowledge graphs. While some applications require processing of the whole knowledge graph, most are domain-specific and require only a relevant subset of it. For example, a movie or a book recommendation system would require a subgraph that comprises knowledge relevant to the specific domain. In such scenarios, processing the whole knowledge graph, particularly the commonly used, large, and openly available knowledge graphs on the Web, is computationally intensive and the irrelevant portion may negatively impact the performance of the application. This necessitates the identification and extraction of relevant subgraphs that adequately capture entities and their relationships for a given application domain and/or task. In this work, we present an approach to identify a minimal domain-specific subgraph by utilizing statistical and semantic-based metrics. Our approach highlights the importance of relationships as first-class elements to capture domain specificity of a subgraph. We demonstrate the applicability of this approach for a recommendation use case on two domains, i.e. movie and book. Our evaluation demonstrates a reduction of 80% to 90% of the knowledge graph with orders of magnitude decrease in time for computation without compromising accuracy.
JA - IEEE International Conference on Big Data
CY - Washington D.C.
ER -
TY - Generic
T1 - HeadEx: Triple Extraction from Stream of News Headlines on Twitter using n-ary Relations
Y1 - 2016
A1 - Saeedeh Shekarpour
A1 - Hussein S. Al-Olimat
A1 - Amir Hossein Yazdavar
A1 - Krishnaprasad Thirunarayan
A1 - Valerie Shalin
A1 - Amit Sheth
AB - The ever-growing datasets published on Linked Data mainly contain encyclopedic information. However, there is a lack of quality structured datasets extracted from unstructured real-time sources. News Headlines published on Twitter provide a real-time stream of events. In this paper, we propose an approach for extracting triples, leveraging n-ary relations, from News Headlines on Twitter in real-time. First, we introduce a mechanism for representing n-ary relations and their arguments as a background data model. This representation leverages Levin’s classification of English Verbs in [10] to support the use of unstructured text for constructing the background data model and capturing mentions of n-ary relations. Then, we use learning approaches, employing proposed syntactic features derived from parsing, to extract information respecting the data model. As a proof-of-concept, we follow a case study containing three distinct n-ary relations. The results of our experiments are promising and can be used to create a timely and structured news headlines dataset.
JA - Kno.e.sis Library Archive
ER -
TY - ABST
T1 - Identifying Offensive Videos on YouTube
Y1 - 2016
A1 - Rajeshwari Kandakatla
AB - Harassment on social media has become a critical problem and social media content depicting harassment is becoming commonplace. Video-sharing websites such as YouTube contain content that may be offensive to certain communities, insulting to certain religions, races, etc., or that makes fun of disabilities. These videos can also provoke and promote altercations leading to online harassment of individuals and groups. In this thesis, we present a system that identifies offensive videos on YouTube. Our goal is to determine features that can be used to detect offensive videos efficiently and reliably. We conducted experiments using content and metadata available for each YouTube video such as comments, title, description and number of views to develop Naive Bayes and Support Vector Machine classifiers. We used a training dataset of 300 videos and a test dataset of 86 videos and obtained a classification F-Score of 0.86. It was surprising to note that the sentiment and content of the comments were less effective in detecting offensive videos than the unigrams and bigrams in the video title, and that other feature combinations did not improve the performance appreciably. Thus, the simplicity of these features contributes to the efficiency of computation and implies that the uploaders provide good titles.
UR - https://etd.ohiolink.edu/!etd.send_file?accession=wright1484751212961772&disposition=inline
N1 - Rajeshwari Kandakatla: Identifying Offensive Videos on YouTube, MS Thesis, Wright State University, Dayton, OH, December 2016.
ER -
TY - THES
T1 - Identifying Tweets with Implicit Entity Mentions
T2 - Department of Computer Science and Engineering
Y1 - 2016
A1 - Adarsh Alex
AB - Social networking sites like Twitter and Facebook have become a significant source of user-generated content in the past decade. Mining of this user-generated content has proved beneficial for a broad range of applications like Event Extraction, Document Retrieval, and Sentiment Analysis. Identifying entities is one of the major tasks that supply important information to the above applications. Identification of entities is typically performed in two steps: Named Entity Recognition (NER) and Entity Linking. State-of-the-art NER solutions focus on recognizing the entities that are mentioned explicitly in social media posts. However, entities are frequently mentioned implicitly in them. For example, the tweet ‘Didn’t know that its the same actress in Fault in our stars and Divergent.’ contains explicit references to the movies Fault in our stars and Divergent while it implicitly refers to the actress Shailene Woodley. Spotting and classifying tweets with such implicit entity mentions (i.e., recognizing that the above tweet has an implicit entity of type ACTRESS) is the initial step towards identifying the implicit mention of Shailene Woodley in this tweet. In this thesis, we propose a two-step, semantics-driven approach to address the spotting and typing of implicit entity mentions in text. Specifically, we answer two research questions in this thesis:

How to find tweets that have implicit entity mentions of a given type?

What features help to distinguish tweets with implicit entity mentions from tweets with explicit entity mentions and tweets with no entity mentions at all?

We answer the first question by developing a technique to find semantic cues that indicate the presence of implicit entity mentions in tweets. The second research question is answered by exploiting the syntactic features of the tweets, along with semantic features extracted from crowd-sourced knowledge bases like Wikipedia and DBpedia, to determine whether a tweet has an implicit entity mention or not. We evaluate our approach by creating a gold standard dataset for two domains, namely movies and books.
JA - Department of Computer Science and Engineering
PB - Wright State University
CY - Dayton
ER -
TY - Generic
T1 - Implicit Entity Linking in Tweets
T2 - Extended Semantic Web Conference
Y1 - 2016
A1 - Sujan Perera
A1 - Pablo N. Mendes
A1 - Adarsh Alex
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - Contextual Knowledge
KW - entity Linking
KW - Entity Modeling
KW - Implicit Entities
AB - Over the years, Twitter has become one of the largest communication platforms providing key data to various applications such as brand monitoring and trend detection, among others. Entity linking is one of the major tasks in natural language understanding from tweets and it associates entity mentions in text to corresponding entries in knowledge bases in order to provide unambiguous interpretation and additional context. State-of-the-art techniques have focused on linking explicitly mentioned entities in tweets with reasonable success. However, we argue that in addition to explicit mentions (e.g., 'The movie Gravity was more expensive than the mars orbiter mission'), entities (movie Gravity) can also be mentioned implicitly (e.g., 'This new space movie is crazy. you must watch it!.'). This paper introduces the problem of implicit entity linking in tweets. We propose an approach that models the entities by exploiting their factual and contextual knowledge. We demonstrate how to use these models to perform implicit entity linking on a ground truth dataset with 397 tweets from two domains, namely, Movie and Book. Specifically, we show: 1) the importance of linking implicit entities and its value addition to the standard entity linking task, and 2) the importance of exploiting contextual knowledge associated with an entity for linking their implicit mentions. We also make the ground truth dataset publicly available to foster research in this new area.
JA - Extended Semantic Web Conference
PB - Springer
CY - Heraklion, Crete, Greece
ER -
TY - MGZN
T1 - Internet of Things to Smart IoT Through Semantic, Cognitive, and Perceptual Computing
Y1 - 2016
A1 - Amit Sheth
KW - Big Data
KW - Blood pressure
KW - cognitive computing
KW - intelligent computing
KW - Intelligent sensors
KW - Intelligent systems
KW - Internet of Things
KW - Interoperability
KW - perceptual computing
KW - physical-cyber-social
KW - semantic computing
KW - Semantics
KW - smart IoT
AB - Rapid growth in the Internet of Things (IoT) has resulted in a massive growth of data generated by these devices and sensors put on the Internet. Physical-cyber-social (PCS) big data consist of this IoT data, complemented by relevant Web-based and social data of various modalities. Smart data is about exploiting this PCS big data to get deep insights and make it actionable, and making it possible to facilitate building intelligent systems and applications. This article discusses key AI research in semantic computing, cognitive computing, and perceptual computing. Their synergistic use is expected to power future progress in building intelligent systems and applications for rapidly expanding markets in multiple industries. Over the next two years, this column on IoT will explore many challenges and technologies on intelligent use and applications of IoT data.
JA - IEEE Intelligent Systems
PB - IEEE
VL - 31
CP - 2
ER -
TY - RPRT
T1 - Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Y1 - 2016
A1 - Amit Sheth
A1 - Sujan Perera
A1 - Sanjaya Wijeratne
KW - background knowledge
KW - domain-specific information retrieval
KW - Emoji Sense Disambiguation
KW - EmojiNet
KW - Enhancing statistical models with knowledge
KW - Implicit Entity Linking
KW - Knowledge Bases
KW - Knowledge-Aware Search
KW - Knowledge-driven deep content understanding
KW - Knowledge-enabled computing
KW - Knowledge-enhanced ML and NLP
KW - Machine intelligence
KW - Multimodal exploitation
KW - Ontology
KW - Semantic analysis of multimodal data
KW - Semantic Search
KW - Understanding complex text
AB - Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to unsupervised learning from a massive amount of data, albeit much of it relates to one modality/type of data at a time. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we focus on discussing the indispensable role of knowledge for deeper understanding of complex text and multimodal data in situations where (i) large amounts of training data (labeled/unlabeled) are not available or are labor intensive to create, (ii) the objects (particularly text) to be recognized are complex (i.e., beyond simple entity-person/location/organization names), such as implicit entities and highly subjective content, and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create knowledge, varying from comprehensive or cross domain to domain or application specific, and (b) carefully exploit the knowledge to further empower or extend the applications of ML/NLP techniques. Using the early results in several diverse situations - both in data types and applications - we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data.
CP - Technical Report, Wright State University
ER -
TY - THES
T1 - Knowledge-driven Implicit Information Extraction
T2 - Department of Computer Science & Engineering
Y1 - 2016
A1 - Sujan Perera
KW - Implicit Entities
KW - implicit relationships
KW - information extraction
KW - Knowledge Base
AB - Natural language is a powerful tool developed by humans over hundreds of thousands of years. The extensive usage, flexibility of the language, creativity of the human beings, and social, cultural, and economic changes that have taken place in daily life have added new constructs, styles, and features to the language. One such feature of the language is its ability to express ideas, opinions, and facts in an implicit manner. This is a feature that is used extensively in day to day communications in situations such as: 1) expressing sarcasm, 2) when trying to recall forgotten things, 3) when required to convey descriptive information, 4) when emphasizing the features of an entity, and 5) when communicating a common understanding. Consider the tweet “New Sandra Bullock astronaut lost in space movie looks absolutely terrifying” and the text snippet extracted from a clinical narrative “He is suffering from nausea and severe headaches. Dolasteron was prescribed”. The tweet has an implicit mention of the entity “Gravity” and the clinical text snippet has implicit mention of the relationship between medication “Dolasteron” and clinical condition “nausea”. Such implicit references of the entities and the relationships are common occurrences in daily communication and they add value to conversations. However, extracting implicit constructs has not received enough attention in the information extraction literature. This dissertation focuses on extracting implicit entities and relationships from clinical narratives and extracting implicit entities from Tweets. When people use implicit constructs in their daily communication, they assume the existence of a shared knowledge with the audience about the subject being discussed. This shared knowledge helps to decode implicitly conveyed information. For example, the above Twitter user assumed that his/her audience knows that the actress “Sandra Bullock” starred in the movie “Gravity” and it is a movie about space exploration. 
The clinical professional who wrote the clinical narrative above assumed that the reader knows that “Dolasteron” is an anti-nausea drug. The audience without such domain knowledge may not have correctly decoded the information conveyed in the above examples. This dissertation demonstrates manifestations of implicit constructs in text, studies their characteristics, and develops a software solution that is capable of extracting implicit information from text. The developed solution starts by acquiring relevant knowledge to solve the implicit information extraction problem. The relevant knowledge includes domain knowledge, contextual knowledge, and linguistic knowledge. The acquired knowledge can take different syntactic forms such as a text snippet, structured knowledge represented in standard knowledge representation languages such as the Resource Description Framework (RDF) or other custom formats. Hence, the acquired knowledge is pre-processed to create models that can be processed by machines. Such models provide the infrastructure to perform implicit information extraction. This dissertation focuses on three different use cases of implicit information and demonstrates the applicability of the developed solution in these use cases. They are: 1) implicit entity linking in clinical narratives, 2) implicit entity linking in Twitter, and 3) implicit relationship extraction from clinical narratives. The evaluations are conducted on relevant annotated datasets for implicit information and they demonstrate the effectiveness of the developed solution in extracting implicit information from text.
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - THES
T1 - Knowledge-driven Search Intent Mining
T2 - Department of Computer Science & Engineering
Y1 - 2016
A1 - Ashutosh Jadhav
KW - health informatics
KW - search intent mining
KW - Search Log analysis
KW - Semantic Search
KW - Semantic Web
KW - Social Media Analytics
KW - Text Analytics
AB - Understanding users’ latent intents behind search queries is essential for satisfying a user’s search needs. Search intent mining can help search engines to enhance their ranking of search results, enabling new search features like instant answers, personalization, search result diversification, and the recommendation of more relevant ads. Consequently, there has been increasing attention on studying how to effectively mine search intents by analyzing search engine query logs. While state-of-the-art techniques can identify the domain of the queries (e.g. sports, movies, health), identifying domain-specific intent is still an open problem. Among all the topics available on the Internet, health is one of the most important in terms of impact on the user and it is one of the most frequently searched areas. This dissertation presents a knowledge-driven approach for domain-specific search intent mining with a focus on health-related search queries. First, we identified 14 consumer-oriented health search intent classes based on inputs from focus group studies and based on analyses of popular health websites, literature surveys, and an empirical study of search queries. We defined the problem of classifying millions of health search queries into zero or more intent classes as a multi-label classification problem. Popular machine learning approaches for multi-label classification tasks (namely, problem transformation and algorithm adaptation methods) were not feasible due to limitations in creating labeled data and health domain constraints. Another challenge in solving the search intent identification problem was mapping terms used by laymen to medical terms. To address these challenges, we developed a semantics-driven, rule-based search intent mining approach leveraging rich background knowledge encoded in the Unified Medical Language System (UMLS) and a crowd-sourced encyclopedia (Wikipedia).
The approach can identify search intent in a disease-agnostic manner and has been evaluated on three major diseases. While users often turn to search engines to learn about health conditions, a surprising amount of health information is also shared and consumed via social media, such as public social platforms like Twitter. Although Twitter is an excellent information source, identifying informative tweets in the deluge of tweets is a major challenge. We used a hybrid approach consisting of supervised machine learning, rule-based classifiers, and biomedical domain knowledge to facilitate the retrieval of relevant and reliable health information shared on Twitter in real time. Furthermore, we extended our search intent mining algorithm to classify health-related tweets into health categories. Finally, we performed a large-scale study to compare health search intents and features that contribute to the expression of search intent from more than 100 million search queries from smart devices (smartphones or tablets) and personal computers (desktops or laptops).
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - THES
T1 - Knowledge-empowered Probabilistic Graphical Models For Physical-cyber-social Systems
T2 - Department of Computer Science & Engineering
Y1 - 2016
A1 - Pramod Anantharam
AB - There is a rapid intertwining of sensors and mobile devices into the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. Sensing and computational components embedded in the physical world are termed a Cyber-Physical System (CPS). The current science of CPS has yet to effectively integrate citizen observations in CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems. Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS Systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework to deal with uncertainty, complexity, and dynamism that help translate observations into actions. Data-driven approaches alone are not guaranteed to be able to synthesize PGMs reflecting real-world dependencies accurately. To overcome this limitation, we propose to empower PGMs using declarative domain knowledge.
Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from ConceptNet used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS) used in PCS event understanding, and (d) transformation of knowledge of goals and actions into a Markov Decision Process (MDP) model used in PCS action recommendation. We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - CONF
T1 - KnowledgeWiki: An OpenSource Tool for Creating Community-Curated Vocabulary, with a Use Case in Materials Science
T2 - World Wide Web 2016
Y1 - 2016
A1 - Nishita Jaykumar
A1 - PavanKalyan Yallamelli
A1 - Vinh Nguyen
A1 - Sarasi Lalithsena
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - KnowledgeWiki
KW - Linked Data application
KW - Materials Science
KW - Open source
KW - Provenance metadata
KW - Semantic MediaWiki
KW - Semantic Web
KW - Singleton Property
KW - Wikidata
AB - Resource Description Framework (RDF) datasets can be created by transforming structured databases, extracting triples from semi-structured and unstructured sources, crowd-sourcing, or by integrating existing datasets. The reliability and quality of these datasets can be improved by the participation of domain experts via a special purpose tool or a crowd-sourced application. Wikidata and Semantic MediaWiki are platforms which facilitate this kind of crowd-sourced data curation. We present our system, KnowledgeWiki, which is built upon the existing Semantic MediaWiki. We develop a novel extension by adopting the singleton property data model in our KnowledgeWiki. This extension allows various kinds of metadata about the RDF triples to be created in the Wiki. We combine this extension with other extensions such as semantic forms to provide a user-friendly, Wiki-like interface for domain experts with no prior technical expertise to easily curate data. We also present our new enhancement to Semantic MediaWiki, which facilitates importing existing RDF datasets into the wiki-based curating platform based on the singleton property approach, which preserves the provenance of individual triples. We also describe how it is being used by the materials science community to create and curate consolidated vocabularies.
JA - World Wide Web 2016
CY - Montreal, Canada
ER -
TY - CONF
T1 - Ontology-enabled Healthcare Applications Exploiting Physical-Cyber-Social Big Data
T2 - Ontology Summit 2016
Y1 - 2016
A1 - Amit Sheth
KW - EHR
KW - healthcare
AB - Healthcare applications now have the ability to exploit big data in all its complexity. A crucial challenge is to achieve interoperability or integration so that a variety of content from diverse physical (IoT), cyber (web-based), and social sources, with diverse formats and modality (text, image, video), can be used in analysis, insight, and decision-making. At Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, we have a variety of large, collaborative healthcare/clinical/biomedical projects, all involving domain experts and end-users, and access to real world data that include: clinical/EMR data (of individual patients and that related to public health), data from a variety of sensors (IoT) on and around patients (measuring real-time physiological and environmental observations), social data (Twitter, Web forums, PatientsLikeMe), Web search logs, etc. Key projects include: Prescription drug abuse online-surveillance and epidemiology (PREDOSE), Social media analysis to monitor cannabis and synthetic cannabinoid use (eDrugTrends), Modeling Social Behavior for Healthcare Utilization in Depression, Medical Information Decision Assistant and Support (MIDAS) with application to musculoskeletal issues, kHealth: A Semantic Approach to Proactive, Personalized Asthma Management Using Multimodal Sensing (also for dementia), and Cardiology Semantic Analysis System (with applications to Computer Assisted Coding and Computerized Document Improvement). This talk will review how ontologies or knowledge graphs play a central role in supporting semantic filtering, interoperability and integration (including issues such as disambiguation), reasoning and decision-making in all our health-centric research and applications. Additional relevant information is at the speaker’s HCLS page.
JA - Ontology Summit 2016
CY - Virtual
ER -
TY - CONF
T1 - Preliminary Investigation of Walking Motion Using a Combination of Image and Signal Processing
T2 - 2016 International Conference on Computational Science and Computational Intelligence
Y1 - 2016
A1 - Bradley Schneider
A1 - Tanvi Banerjee
KW - activity detection
KW - gait analysis
KW - motion and tracking algorithms and applications
KW - video analysis
AB - We present the results of analyzing gait motion in first-person video taken from a commercially available wearable camera embedded in a pair of glasses. The video is analyzed with three different computer vision methods to extract motion vectors from different gait sequences from four individuals for comparison against a manually annotated ground truth dataset. Using a combination of signal processing and computer vision techniques, gait features are extracted to identify the walking pace of the individual wearing the camera and are validated using the ground truth dataset. Our preliminary results indicate that the extraction of activity from the video in a controlled setting shows strong promise of being utilized in different activity monitoring applications such as in the eldercare environment, as well as for monitoring chronic healthcare conditions. Citation: B. Schneider & T. Banerjee, “Preliminary Investigation of Walking Motion Using a Combination of Image and Signal Processing”, accepted, 2016 International Conference on Computational Science and Computational Intelligence (CSCI'16: December 15-17, 2016, Las Vegas, USA).
JA - 2016 International Conference on Computational Science and Computational Intelligence
CY - Las Vegas, NV
ER -
TY - Generic
T1 - Privacy-Preserving Spectral Analysis of Large Graphs in Public Clouds
T2 - Asia Conference on Computer and Communications Security 2016
Y1 - 2016
A1 - Sagar Sharma
A1 - James Powers
A1 - Keke Chen
KW - Database and storage security
KW - Management and querying of encrypted data
KW - Privacy-preserving protocols
KW - Security services
AB - Large graph datasets have become invaluable assets for studying problems in business applications and scientific research. These datasets, collected and owned by data owners, may also contain privacy-sensitive information. When using public clouds for elastic processing, data owners have to protect both data ownership and privacy from curious cloud providers. We propose a cloud-centric framework that allows data owners to efficiently collect graph data from the distributed data contributors, and privately store and analyze graph data in the cloud. Data owners can conduct expensive operations in untrusted public clouds with privacy and scalability preserved. The major contributions of this work include two privacy-preserving approximate eigen decomposition algorithms (the secure Lanczos and Nystrom methods) for spectral analysis of large graph matrices, and a personalized privacy-preserving data submission method based on differential privacy that allows for the trade-off between data sparsity and privacy. For an N-node graph, the proposed approach allows a data owner to finish the core operations with only O(N) client-side costs in computation, storage, and communication. The expensive O(N^2) operations are performed in the cloud with the proposed privacy-preserving algorithms. We prove that our approach can satisfactorily preserve data privacy against the untrusted cloud providers. We have conducted an extensive experimental study to investigate these algorithms in terms of the intrinsic relationships among costs, privacy, scalability, and result quality.
JA - Asia Conference on Computer and Communications Security 2016
PB - ACM
CY - Xi'an, China
ER -
TY - Generic
T1 - Qanary - A Methodology for Vocabulary-driven Open Question Answering Systems
T2 - ESWC 2016
Y1 - 2016
A1 - Andreas Both
A1 - Dennis Diefenbach
A1 - Kuldeep Singh
A1 - Saeedeh Shekarpour
A1 - Didier Cherix
A1 - Christoph Lange
KW - Annotation Model
KW - Ontologies
KW - question answering
KW - Semantic Search
KW - Semantic Web
KW - Software Reusability
AB - It is very challenging to access the knowledge expressed within (big) data sets. Question answering (QA) aims at making sense out of data via a simple-to-use interface. However, QA systems are very complex and earlier approaches are mostly singular and monolithic implementations for QA in specific domains. Therefore, it is cumbersome and inefficient to design and implement new or improved approaches, in particular as many components are not reusable. Hence, there is a strong need for enabling best-of-breed QA systems, where the best performing components are combined, aiming at the best quality achievable in the given domain. Taking into account the high variety of functionality that might be of use within a QA system and therefore reused in new QA systems, we provide an approach driven by a core QA vocabulary that is aligned to existing, powerful ontologies provided by domain-specific communities. We achieve this by a methodology for binding existing vocabularies to our core QA vocabulary without re-creating the information provided by external components. We thus provide a practical approach for rapidly establishing new (domain-specific) QA systems, while the core QA vocabulary is re-usable across multiple domains. To the best of our knowledge, this is the first approach to open QA systems that is agnostic to implementation details and that inherently follows the linked data principles.
JA - ESWC 2016
PB - Springer International Publishing
CY - Heraklion, Crete, Greece
ER -
TY - JOUR
T1 - Question Answering on Linked Data: Challenges and Future Directions
JF - Computing Research Repository (CoRR)
Y1 - 2016
A1 - Saeedeh Shekarpour
A1 - Denis Lukovnikov
A1 - Ashwini Jaya Kumar
A1 - Kemele M. Endris
A1 - Kuldeep Singh
A1 - Harsh Thakkar
A1 - Christoph Lange
KW - Data Quality
KW - Distributed and Heterogeneous Datasets
KW - Interoperability of Components
KW - Query Understanding
KW - Question Answering System
KW - Research Challenge
KW - Speech Interface
AB - Question Answering (QA) systems are becoming the inspiring model for the future of search engines. Although the datasets underlying QA systems have recently been promoted from unstructured datasets to structured datasets with highly semantic-enriched metadata, question answering systems still face serious challenges that leave them far from meeting desired expectations. In this paper, we raise the challenges of building a QA system, with a particular focus on employing structured data (i.e., knowledge graphs). This paper provides an exhaustive insight into the challenges known so far. Thus, it helps researchers to easily spot open room for future research agendas.
ER -
TY - Generic
T1 - Question Answering on Linked Data: Challenges and Future Directions
T2 - International World Wide Web Conference
Y1 - 2016
A1 - Saeedeh Shekarpour
A1 - Kemele Endris
A1 - Ashwini Jaya Kumar
A1 - Denis Lukovnikov
A1 - Kuldeep Singh
A1 - Harsh Thakkar
A1 - Christoph Lange
KW - Data Quality
KW - Distributed and Heterogeneous Datasets
KW - Interoperability of Components
KW - Query Understanding
KW - Question Answering System
KW - Research Challenge
KW - Speech Interface
AB - Question Answering (QA) systems are becoming the inspiring model for the future of search engines. While, recently, datasets underlying QA systems have been promoted from unstructured datasets to structured datasets with semantically highly enriched metadata, question answering systems are still facing serious challenges and are therefore not meeting users' expectations. This paper provides an exhaustive insight of challenges known so far for building QA systems, with a special focus on employing structured data (i.e. knowledge graphs). It thus helps researchers to easily spot gaps to fill with their future research agendas.
JA - International World Wide Web Conference
PB - International World Wide Web Conferences Steering Committee
CY - Republic and Canton of Geneva, Switzerland
ER -
TY - JOUR
T1 - Recognition of side effects as implicit-opinion words in drug reviews
Y1 - 2016
A1 - Monireh Ebrahimi
A1 - Amir Hossein Yazdavar
A1 - Naomie Salim
A1 - Safaa Eltyeb
KW - Drug review
KW - Drug side effect
KW - Medical-opinion mining
KW - Regular expression
KW - Rule based
KW - Supervised approach
AB - Purpose – Many opinion-mining systems and tools have been developed to provide users with the attitudes of people toward entities and their attributes or the overall polarities of documents. In addition, side effects are one of the critical measures used to evaluate a patient’s opinion of a particular drug. However, side effect recognition is a challenging task, since side effects coincide with disease symptoms lexically and syntactically. The purpose of this paper is to extract drug side effects from drug reviews as integral implicit-opinion words. Design/methodology/approach – This paper proposes a detection algorithm for a medical-opinion-mining system using rule-based and support vector machine (SVM) algorithms. A corpus of 225 drug reviews was manually annotated by a medical expert for training and testing. Findings – The results show that SVM significantly outperforms a rule-based algorithm. However, the results of both algorithms are encouraging and a good foundation for future research. Obviating the limitations and exploiting combined approaches would improve the results. Practical implications – Automatic extraction of adverse drug effect information from online text can help regulatory authorities in rapid information screening and extraction instead of manual inspection, and contributes to the acceleration of medical decision support and safety alert generation. Originality/value – The results of this study can help database curators in compiling adverse drug effects databases and researchers to digest the huge amount of textual online information, which is growing rapidly.
PB - Online Information Review
VL - 40
CP - 7
ER -
TY - THES
T1 - ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization
T2 - Department of Engineering & Computer Science
Y1 - 2016
A1 - Nishita Jaykumar
KW - Abstractive Summaries
KW - Automatic Summarization
KW - Summarization Evaluation
AB - Automatic generation of summaries that capture the salient aspects of a search result set (i.e., automatic summarization) has become an important task in biomedical research. Automatic summarization offers an avenue for overcoming the information overload problem prevalent in large online digital libraries. However, across many of the knowledge-driven approaches for automatic summarization it is not always clear which features highly impact or influence the quality of a summary. Instead, there has been considerable focus on utilizing schema knowledge to facilitate browsing and exploration of generated summaries a posteriori. Such informative features should not be ignored, since they could be utilized to help optimize the models that generate these semantic summaries in the first place. In this research, we adopt a leave-one-out approach to assess the impact of various features on the quality of automatically generated summaries that contain structured background knowledge. We first create the gold standard summaries, using information-theoretic methods, by extraction and validation; then the semantic summaries are transformed into an equivalent textual format. Finally, various similarity metrics, such as cosine similarity, Euclidean distance, and Jensen-Shannon divergence, are computed under different feature combinations to assess summary quality against the textual gold standard. We report on the relative importance of the various features used to automatically generate the semantic summaries in a biomedical application. Our evaluation suggests that the proposed approach is an effective automatic evaluation method for assessing feature importance in automatically generated semantic summaries.
JA - Department of Engineering & Computer Science
PB - Wright State University
CY - Dayton
ER -
TY - MGZN
T1 - On Searching the Internet of Things: Requirements and Challenges
Y1 - 2016
A1 - Payam Barnaghi
A1 - Amit Sheth
KW - Cloud Computing
KW - Data Analysis
KW - Data Handling
KW - Data Storage Data
KW - Data/service Provider Resources
KW - Data/service Discovery
KW - Data/service Search
KW - Information Networks
KW - Information Retrieval
KW - Information Search And Retrieval Intelligent Systems
KW - Internet of Things
KW - Internet Of Things Searching
KW - IoT
KW - Iot Data Resources
KW - Iot Data Service Access
KW - Iot Discovery
KW - Iot Search Engine
KW - Knowledge Discovery
KW - Search Methods
KW - Search Engines
KW - Semantic Web
KW - Service Resources
KW - Stored Data
KW - Web Services
AB - Internet of Things (IoT) data services are designed to be available to devices and users on request at any time and at any location. Quality, latency, trust, availability, reliability, and continuity are among the key parameters that impact efficient access and use of IoT data and services. However, current data and service search, discovery, and access methods and solutions are more suited for fewer and/or static (or stored) data and service resources. IoT resources differ in terms of the number of resources and the complexity and amount of data. Efficient discovery, ranking, selection, access, integration, and interpretation and understanding of the data and services requires coordinated efforts among network, data/service provider resources, and core IoT components. This article describes some of the requirements and discusses the key challenges to build scalable and efficient search and discovery mechanisms for the IoT.
JA - IEEE Intelligent Systems
PB - IEEE
VL - 32
CP - 6
ER -
TY - MGZN
T1 - Semantic, Cognitive, and Perceptual Computing: Paradigms That Shape Human Experience
Y1 - 2016
A1 - Amit Sheth
A1 - Pramod Anantharam
A1 - Cory Henson
KW - Big Data
KW - cognitive computing
KW - computing for human experience
KW - experiential computing
KW - human-centric computing
KW - Internet of Everything
KW - Internet of Things
KW - Internet/Web technologies
KW - IoE
KW - IoT
KW - perceptual computing
KW - semantic computing
KW - Semantic Web
KW - Web 3.0
KW - Web evolution
AB - Unlike machine-centric computing, in which efficient data processing takes precedence over contextual tailoring, human-centric computation provides a personalized data interpretation that most users find highly relevant to their needs. The authors show how semantic, cognitive, and perceptual computing paradigms work together to produce actionable information.
JA - Computer
PB - IEEE
VL - 49
CP - 3
ER -
TY - MGZN
T1 - Semantic Filtering for Social Data
Y1 - 2016
A1 - Amit Sheth
A1 - Pavan Kapanipathi
KW - collective semantics
KW - context in social data hierarchical interest graph
KW - Continuous Semantics
KW - dynamically changing vocabulary
KW - filtering social media big data
KW - Linked Open Data
KW - Semantic filtering
KW - social data stream
KW - twitris
KW - velocity in Big Data
AB - More than a billion users on the Web are on social networks sharing and consuming short and real-time updates. Consumers of social data face information overload. Although information filtering can help, challenges that are specific to the short-text and real-time nature of social networks must be addressed. Knowledge bases, particularly those derived from crowd-sourced platforms such as Wikipedia, can be harnessed for building an intelligent and effective information-filtering system for social networks.
JA - IEEE Internet Computing
PB - IEEE
VL - 20
CP - 4
ER -
TY - CONF
T1 - Signals Revealing Street Gang Members on Twitter
T2 - Workshop on Computational Approaches to Social Modeling (ChASM 2016) co-located with 8th International Conference on Social Informatics (SocInfo 2016)
Y1 - 2016
A1 - Lakshika Balasuriya
A1 - Sanjaya Wijeratne
A1 - Derek Doran
A1 - Amit Sheth
AB - We study the problem of automatically finding gang member profiles on Twitter. We outline a process to curate one of the largest sets of verifiable gang member profiles that has ever been studied. A review of these profiles establishes differences in the language, images, YouTube links, and emoji features gang members use compared to the rest of the Twitter population. We generate word embeddings that translate these features into a real vector format amenable for machine learning classification and use them to train a series of supervised classifiers. Our classifiers achieve promising F1 scores with low false positive rates.
JA - Workshop on Computational Approaches to Social Modeling (ChASM 2016) co-located with 8th International Conference on Social Informatics (SocInfo 2016)
CY - Bellevue, WA, USA
VL - 4
ER -
TY - JOUR
T1 - Smart Cities – Enabling Services and Applications
JF - Journal of Internet Services and Applications
Y1 - 2016
A1 - Edward Curry
A1 - Schahram Dustdar
A1 - Quan Sheng
A1 - Amit Sheth
AB - The proliferation of “Smart Cities” initiatives around the world is a part of the strategic response by governments to the challenges and opportunities of increasing urbanization and the rise of cities as the nexus of societal development. This JISA Thematic Series presents significant research contributions related to the design and development of Infrastructure, Services and Applications for the Smart City and Urban context.
PB - SpringerOpen
VL - 7
CP - 6
ER -
TY - RPRT
T1 - A Study of Social Web Data on Buprenorphine Abuse Using Semantic Web Technology
Y1 - 2016
A1 - Raminta Daniulaityte
A1 - Robert Carlson
A1 - Gregory Brigham
A1 - Delroy Cameron
A1 - Amit Sheth
AB - The Specific Aims of this application are to use a paradigmatic approach that combines Semantic Web technology, Natural Language Processing and Machine Learning techniques to:

Describe drug users’ knowledge, attitudes, and behaviors related to the non-medical use of Suboxone and Subutex as discussed on Web-based forums.

Identify and describe temporal patterns of non-medical use of Suboxone and Subutex as discussed on Web-based forums.

AIMS: Several states in the U.S. have legalized cannabis for recreational or medical uses. In this context, cannabis edibles have drawn considerable attention after adverse effects were reported. This paper investigates Twitter users' perceptions concerning edibles and evaluates the association between edibles-related tweeting activity and local cannabis legislation. METHODS: Tweets were collected between May 1 and July 31, 2015, using Twitter API and filtered through the eDrugTrends/Twitris platform. A random sample of geolocated tweets was manually coded to evaluate Twitter users' perceptions regarding edibles. Raw state proportions of Twitter users mentioning edibles were adjusted relative to the total number of Twitter users per state. Differences in adjusted proportions of Twitter users mentioning edibles between states with different cannabis legislation status were assessed via a permutation test. RESULTS: We collected 100,182 tweets mentioning cannabis edibles with 26.9% (n=26,975) containing state-level geolocation. Adjusted percentages of geolocated Twitter users posting about edibles were significantly greater in states that allow recreational and/or medical use of cannabis. The differences were statistically significant. Overall, cannabis edibles were generally positively perceived among Twitter users despite some negative tweets expressing the unreliability of edible consumption linked to variability in effect intensity and duration. CONCLUSION: Our findings suggest that Twitter data analysis is an important tool for epidemiological monitoring of emerging drug use practices and trends. Results tend to indicate greater tweeting activity about cannabis edibles in states where medical THC and/or recreational use are legal.
Although the majority of tweets conveyed positive attitudes about cannabis edibles, analysis of experiences expressed in negative tweets confirms the potential adverse effects of edibles and calls for educating edibles-naïve users, improving edibles labeling, and testing their THC content.

ER -
TY - Generic
T1 - Towards a Message-Driven Vocabulary for Promoting the Interoperability of Question Answering Systems
T2 - IEEE International Conference on Semantic Computing 2016
Y1 - 2016
A1 - Kuldeep Singh
A1 - Andreas Both
A1 - Dennis Diefenbach
A1 - Saeedeh Shekarpour
KW - Annotation Model
KW - Ontologies
KW - question answering
KW - Semantic Search
KW - Semantic Web
KW - Software Reusability
AB - Question answering (QA) is one of the biggest challenges for making sense out of data. The Web of Data has attracted the attention of the question answering community and, recently, a number of schema-aware question answering systems have been introduced. While research achievements are individually significant, integrating different approaches is not possible due to the lack of a systematic approach for conceptually describing QA systems. In this paper, we present a message-driven vocabulary built upon an abstract level. This vocabulary is derived from conceptual views of different question answering systems. In this way, we are enabling researchers and industry to implement message-driven QA systems and to reuse and extend different approaches without interoperability and extension concerns.
JA - IEEE International Conference on Semantic Computing 2016
PB - IEEE
CY - Laguna Hills, CA
ER -
TY - Generic
T1 - Tweet Properly: Analyzing Deleted Tweets to Understand and Identify Regrettable Ones
T2 - 25th International World Wide Web Conference (WWW 2016)
Y1 - 2016
A1 - Lu Zhou
A1 - Wenbo Wang
A1 - Keke Chen
AB - Inappropriate tweets can cause severe damage to authors’ reputation or privacy. However, many users do not realize the negative consequences until they publish these tweets. Published tweets have lasting effects that may not be eliminated by simple deletion because other users may have read them or third-party tweet analysis platforms may have cached them. Regrettable tweets, i.e., tweets with identifiable regrettable contents, cause the most damage to their authors because other users can easily notice them. In this paper, we study how to identify the regrettable tweets published by normal individual users via the contents and users’ historical deletion patterns. We identify normal individual users based on their publishing, deleting, followers and friends statistics. We manually examine a set of randomly sampled deleted tweets from these users to identify regrettable tweets and understand the corresponding regrettable reasons. By applying content-based features and personalized history-based features, we develop
JA - 25th International World Wide Web Conference (WWW 2016)
PB - ACM
CY - Montreal, Canada
ER -
TY - CONF
T1 - Understanding City Traffic Dynamics Utilizing Sensor and Textual Observations
T2 - 30th AAAI Conference on Artificial Intelligence (AAAI-16)
Y1 - 2016
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
A1 - Surendra Marupudi
A1 - Amit Sheth
A1 - Tanvi Banerjee
KW - city traffic events
KW - linear dynamical system
KW - log likelihood
KW - restricted switching linear dynamical system
KW - sensor data
KW - textual data
KW - time series
KW - traffic dynamics
KW - twitter traffic event
AB - Understanding speed and travel-time dynamics in response to various city related events is an important and challenging problem. Sensor data (numerical) containing average speed of vehicles passing through a road segment can be interpreted in terms of near real-time report of traffic related incidents from city authorities and social media data (textual), providing a complementary understanding of traffic dynamics. State-of-the-art research is focused on either analyzing sensor observations or citizen observations; we seek to exploit both in a synergistic manner. We demonstrate the role of domain knowledge in capturing the non-linearity of speed and travel-time dynamics by segmenting speed and travel-time observations into simpler components amenable to description using linear models such as Linear Dynamical System (LDS). Specifically, we propose Restricted Switching Linear Dynamical System (RSLDS) to model normal speed and travel time dynamics and thereby characterize anomalous dynamics. We utilize the city traffic events extracted from text to explain anomalous dynamics. We present a large scale evaluation of the proposed approach on a real-world traffic and twitter dataset collected over a year with promising results.
JA - 30th AAAI Conference on Artificial Intelligence (AAAI-16)
CY - Phoenix, Arizona
ER -
TY - CONF
T1 - Using Social Media Data to Understand Brand Development
T2 - 2016 Direct/Interactive Marketing Research Summit
Y1 - 2016
A1 - Kamer Yuksel
A1 - Sergio Biggemann
A1 - Amit Sheth
A1 - Jeremy Brunn
KW - Brand Development
KW - Critical Hermeneutics
KW - Social Media
KW - Social-Actor Engagement
KW - twitris
AB - Today, the world is a deeply interconnected place, with a broad array of social-actors (e.g. organizations, customers, and community members) interacting online. These interactions are oftentimes enhanced, contextually and dialogically, with unique online capabilities such as image sharing, linking and tagging. Twitter, in particular, has emerged as a simple yet effective form of blogging, where social-actors engage using a limited number of characters (specifically, the “microblogs” are limited to 140 characters, not including links and user handles). Some unique aspects of Twitter make it a particularly ideal platform for some brands to get noticed and become influential within the platform—given that their content is relevant and engaging. However, we know little about how organizations and other social-actors engage on social media platforms such as Twitter, and what constitutes an “engaging content”.
JA - 2016 Direct/Interactive Marketing Research Summit
PB - Marketing EDGE
CY - Los Angeles, CA
ER -
TY - CONF
T1 - What Motivates High School Students to Take Precautions Against the Spread of Influenza? Latent Modeling of Compliance with Preventative Practice: A Data Science Approach
T2 - International Conference on Health Informatics and Medical Systems (HIMS)
Y1 - 2016
A1 - William Romine
A1 - Tanvi Banerjee
A1 - William Folk
A1 - Lloyd Barrow
KW - data science
KW - decision support system
KW - Health behavior
KW - health belief model
KW - health informatics
KW - hygiene
KW - influenza
KW - multilevel logistic regression
KW - quantitative analysis
KW - vaccination
AB - This study focuses on a central question: What key behavioral factors influence high school students’ compliance with preventative measures against the transmission of influenza? We use multi-level logistic regression to equate logit measures for eight precautions to students’ latent compliance levels on a common scale. Using linear regression, we explore the efficacy of knowledge of influenza, affective perceptions about influenza and its prevention, prior illness, and gender in predicting compliance. Hand washing and respiratory etiquette are the easiest precautions for students, and hand sanitizer use and keeping the hands away from the face are the most difficult. Perceptions of barriers against taking precautions and sense of social responsibility had the greatest influence on compliance.
JA - International Conference on Health Informatics and Medical Systems (HIMS)
CY - Las Vegas, Nevada
ER -
TY - JOUR
T1 - "When 'Bad' is 'Good'": Identifying Personal Communication and Sentiment in Drug-Related Tweets
JF - JMIR Public Health Surveillance
Y1 - 2016
A1 - Raminta Daniulaityte
A1 - Lu Chen
A1 - Francois R. Lamy
A1 - Robert Carlson
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - cannabis
KW - eDrugTrends
KW - machine learning
KW - Sentiment Analysis
KW - Social Media
KW - synthetic cannabinoids
KW - twitter
AB - Background: To harness the full potential of social media for epidemiological surveillance of drug abuse trends, the field needs a greater level of automation in processing and analyzing social media content. Objectives: The objective of the study is to describe the development of supervised machine-learning techniques for the eDrugTrends platform to automatically classify tweets by type/source of communication (personal, official/media, retail) and sentiment (positive, negative, neutral) expressed in cannabis- and synthetic cannabinoid–related tweets. Methods: Tweets were collected using Twitter streaming Application Programming Interface and filtered through the eDrugTrends platform using keywords related to cannabis, marijuana edibles, marijuana concentrates, and synthetic cannabinoids. After creating coding rules and assessing intercoder reliability, a manually labeled data set (N=4000) was developed by coding several batches of randomly selected subsets of tweets extracted from the pool of 15,623,869 collected by eDrugTrends (May-November 2015). Out of 4000 tweets, 25% (1000/4000) were used to build source classifiers and 75% (3000/4000) were used for sentiment classifiers. Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM) were used to train the classifiers. Source classification (n=1000) tested Approach 1 that used short URLs, and Approach 2 where URLs were expanded and included into the bag-of-words analysis. For sentiment classification, Approach 1 used all tweets, regardless of their source/type (n=3000), while Approach 2 applied sentiment classification to personal communication tweets only (2633/3000, 88%). Multiclass and binary classification tasks were examined, and machine-learning sentiment classifier performance was compared with Valence Aware Dictionary for sEntiment Reasoning (VADER), a lexicon and rule-based method. 
The performance of each classifier was assessed using 5-fold cross validation that calculated average F-scores. One-tailed t test was used to determine if differences in F-scores were statistically significant. Results: In multiclass source classification, the use of expanded URLs did not contribute to significant improvement in classifier performance (0.7972 vs 0.8102 for SVM, P=.19). In binary classification, the identification of all source categories improved significantly when unshortened URLs were used, with personal communication tweets benefiting the most (0.8736 vs 0.8200, P

AIMS: Media reports suggest increasing popularity of marijuana concentrates ("dabs"; "earwax"; "budder"; "shatter"; "butane hash oil") that are typically vaporized and inhaled via a bong, vaporizer or electronic cigarette. However, data on the epidemiology of marijuana concentrate use remain limited. This study aims to explore Twitter data on marijuana concentrate use in the U.S. and identify differences across regions of the country with varying cannabis legalization policies.

METHODS: Tweets were collected between October 20 and December 20, 2014, using Twitter's streaming API. Twitter data filtering framework was available through the eDrugTrends platform. Raw and adjusted percentages of dabs-related tweets per state were calculated. A permutation test was used to examine differences in the adjusted percentages of dabs-related tweets among U.S. states with different cannabis legalization policies.

RESULTS: eDrugTrends collected a total of 125,255 tweets. Almost 22% (n=27,018) of these tweets contained identifiable state-level geolocation information. Dabs-related tweet volume for each state was adjusted using a general sample of tweets to account for different levels of overall tweeting activity for each state. Adjusted percentages of dabs-related tweets were highest in states that allowed recreational and/or medicinal cannabis use and lowest in states that have not passed medical cannabis use laws. The differences were statistically significant.

CONCLUSIONS: Twitter data suggest greater popularity of dabs in the states that legalized recreational and/or medical use of cannabis. The study provides new information on the epidemiology of marijuana concentrate use and contributes to the emerging field of social media analysis for drug abuse research.

VL - 155
ER -
TY - Generic
T1 - Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety & Velocity Using Semantics and Semantic Web
Y1 - 2015
A1 - Amit Sheth
PB - RMIT School of Computer Science and Information Technology
CY - Melbourne, Australia
ER -
TY - UNPB
T1 - Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety & Velocity Using Semantics and Semantic Web
Y1 - 2015
A1 - Amit Sheth
PB - Singapore University of Technology and Design
CY - Singapore, Singapore
ER -
TY - Generic
T1 - Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety & Velocity Using Semantics and Semantic Web
Y1 - 2015
A1 - Amit Sheth
PB - University of Otago
CY - Dunedin, New Zealand
ER -
TY - Generic
T1 - Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety & Velocity Using Semantics and Semantic Web
Y1 - 2015
A1 - Amit Sheth
PB - CSIRO
CY - Hobart, Australia
ER -
TY - CONF
T1 - Trust Management: Multimodal Data Perspective, Invited Tutorial
T2 - 2015 International Conference on Collaboration Technologies and Systems (CTS 2015)
Y1 - 2015
A1 - Krishnaprasad Thirunarayan
KW - Bayesian Approach
KW - gleaning trustworthiness
KW - Multi-level trust
KW - Semantics of Trust
KW - Sensor
KW - social and interpersonal trust
KW - trust ontology
JA - 2015 International Conference on Collaboration Technologies and Systems (CTS 2015)
CY - Atlanta, Georgia
ER -
TY - CONF
T1 - Understanding Social Effects in Online Networks
T2 - International Symposium on Social Computing and Semantic Data Mining
Y1 - 2015
A1 - Huda Alhazmi
A1 - Swapna Gokhale
A1 - Derek Doran
KW - Network analytics
KW - Social dynamics
KW - Triad-based analysis
AB - Understanding the motives behind people's interactions online can offer sound bases to predict how a social network may evolve and also support a host of applications. We hypothesize that three offline social factors, namely, stature, relationship strength, and egocentricity may also play an important role in driving users' interactions online. Therefore, we study the influence of these three social factors in online interactions by analyzing the transitivity in triads or three-way relationships among users. Analyzing transitivity through the lens of triad census for four popular social networks, namely, Facebook, Twitter, YouTube and Slashdot, we find that: (i) users' interactions are largely influenced by intermediary relations, which enhances the mediators' stature; (ii) the strength of offline relationships plays a salient role in transitivity of relations online; and (iii) egocentricity, embodied in over-active and popular users, has a significant effect on the dynamics of online interactions.
JA - International Symposium on Social Computing and Semantic Data Mining
CY - Anaheim, California
ER -
TY - CHAP
T1 - Using EHRs for Heart Failure Therapy Recommendation Using Multidimensional Patient Similarity Analytics
T2 - Digital Healthcare Empowering Europeans: Proceedings of MIE 2015
Y1 - 2015
A1 - Maryam Panahiazar
A1 - Vahid Taslimitehrani
A1 - Naveen Pereira
A1 - Jyotishman Pathak
ED - Ronald Cornet
ED - Lăcrămioara Stoicu-Tivadar
ED - Alexander Hörbst
ED - Carlos Parra-Calderón
ED - Stig Andersen
ED - Mira Hercigonja-Szekeres
KW - electronic health records
KW - heart failure
KW - patient similarity
AB - Electronic Health Records (EHRs) contain a wealth of information about an individual patient's diagnosis, treatment and health outcomes. This information can be leveraged effectively to identify patients who are similar to each other for disease diagnosis and prognosis. In recent years, several machine learning methods have been proposed for assessing patient similarity, although the techniques have primarily focused on the use of patient diagnoses data from EHRs for the learning task. In this study, we develop a multidimensional patient similarity assessment technique that leverages multiple types of information from the EHR and predicts a medication plan for each new patient based on prior knowledge and data from similar patients. In our algorithm, patients have been clustered into different groups using a hierarchical clustering approach and subsequently have been assigned a medication plan based on the similarity index to the overall patient population. We evaluated the performance of our approach on a cohort of heart failure patients (N=1386) identified from EHR data at Mayo Clinic and achieved an AUC of 0.74. Our results suggest that it is feasible to harness population-based information from EHRs for an individual patient-specific assessment.
JA - Digital Healthcare Empowering Europeans: Proceedings of MIE 2015
PB - IOS Press
VL - 210
ER -
TY - CONF
T1 - Value Oriented Big Data Processing with Applications
T2 - 2015 International Conference on Collaboration Technologies and Systems (CTS 2015)
Y1 - 2015
A1 - Krishnaprasad Thirunarayan
KW - Big Data
KW - Correlation vs causation
KW - Hybrid reasoning
KW - Interleaved deduction and abduction
KW - physical-cyber-social systems
KW - Semantic Perception
AB - We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. To handle Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle Variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of heterogeneity of data formats and media. To handle Velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize relevant new concepts, entities and facts. To handle Veracity, we explore trust models and approaches to glean trustworthiness. Our ultimate goal is to deal with the challenges due to the four Vs of Big Data to derive Value to enable decision-making and action. In what follows, we discuss the primary characteristics of the Big Data problem as it pertains to the Five Vs. The first three were originally introduced by Doug Laney of Gartner.
JA - 2015 International Conference on Collaboration Technologies and Systems (CTS 2015)
CY - Atlanta, Georgia
ER -
TY - ABST
T1 - Value-Oriented Big Data Processing with Applications
Y1 - 2015
A1 - Krishnaprasad Thirunarayan
AB - We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. To handle Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle Variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of heterogeneity of data formats and media. To handle Velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize relevant new concepts, entities and facts. To handle Veracity, we explore trust models and approaches to glean trustworthiness. Our ultimate goal is to deal with the challenges due to the four Vs of Big Data to derive Value to enable decision-making and action. In what follows, we discuss the primary characteristics of the Big Data problem as it pertains to the Five Vs. The first three were originally introduced by Doug Laney of Gartner.
ER -
TY - CONF
T1 - Where do we Develop? Discovering Regions for Urban Investment in Senegal
T2 - NetMob 2015
Y1 - 2015
A1 - Derek Doran
A1 - Andrew Fox
A1 - Veena Mendiratta
KW - International Conference on the Analysis of Mobile Phone Datasets Data for Development Challenge: Scientific Papers
AB - The rate of urbanization in developing countries, defined as the speed with which a population shifts from rural to urban areas, is among the highest in the world. The disproportionate number of citizens that live in a small number of cities places incredible pressure on the largest cities in these countries, which may already be faced with limited resources, weak industrialization, and underdeveloped infrastructures. Urban planning researchers as well as policy makers have suggested that governments in developing countries make capital investments within and surrounding smaller cities to attract citizens away from large urban centers, thereby lowering the pressure placed on overpopulated urban centers and making it more attractive for citizens to migrate to the smaller cities. This paper proposes a methodology that maps signals in mobile phone usage data to longstanding urban planning theories. These signals are subsequently combined in an unsupervised learner to discover regions within which city investments should be made. Qualitative evaluations of the selected arrondissements illustrate the promise of our approach.
JA - NetMob 2015
CY - Cambridge, MA
ER -
TY - Generic
T1 - Accurate Local Estimation of Geo-Coordinates for Social Media Posts
T2 - 26th International Conference on Software Engineering and Knowledge Engineering (SEKE 2014)
Y1 - 2014
A1 - Derek Doran
A1 - Swapna Gokhale
A1 - Aldo Dagnino
AB - Associating geo-coordinates with the content of social media posts can enhance many existing applications and services and enable a host of new ones. Unfortunately, a majority of social media posts are not tagged with geo-coordinates. Even when location data is available, it may be inaccurate, very broad or sometimes fictitious. Contemporary location estimation approaches based on analyzing the content of these posts can identify only broad areas such as a city, which limits their usefulness. To address these shortcomings, this paper proposes a methodology to narrowly estimate the geo-coordinates of social media posts with high accuracy. The methodology relies solely on the content of these posts and prior knowledge of the wide geographical region from where the posts originate. An ensemble of language models, which are smoothed over non-overlapping sub-regions of a wider region, lie at the heart of the methodology. Experimental evaluation using a corpus of over half a million tweets from New York City shows that the approach, on an average, estimates locations of tweets to within just 2.15km of their actual positions.
JA - 26th International Conference on Software Engineering and Knowledge Engineering (SEKE 2014)
CY - Vancouver, Canada
ER -
TY - Generic
T1 - Active Learning with Efficient Feature Weighting Methods for Improving Data Quality and Classification Accuracy
T2 - 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014)
Y1 - 2014
A1 - Justin Martineau
A1 - Lu Chen
A1 - Doreen Cheng
A1 - Amit Sheth
KW - Active learning
KW - Emotion Analysis
KW - twitter
AB - Many machine learning datasets are noisy with a substantial number of mislabeled instances. This noise yields sub-optimal classification performance. In this paper we study a large, low quality annotated dataset, created quickly and cheaply using Amazon Mechanical Turk to crowdsource annotations. We describe computationally cheap feature weighting techniques and a novel non-linear distribution spreading algorithm that can be used to iteratively and interactively correct mislabeled instances to significantly improve annotation quality at low cost. Eight different emotion extraction experiments on Twitter data demonstrate that our approach is just as effective as more computationally expensive techniques. Our techniques save a considerable amount of time.
JA - 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014)
CY - Baltimore, MD
ER -
TY - JOUR
T1 - Alignment and Dataset Identification of Linked Data in Semantic Web
JF - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Y1 - 2014
A1 - Kalpa Gunaratna
A1 - Sarasi Lalithsena
A1 - Amit Sheth
KW - Dataset Source Identification
KW - Linked Data
KW - ontology alignment
AB - The Linked Open Data (LOD) cloud has gained significant attention in the Semantic Web community over the past few years. With rapid expansion in size and diversity, it consists of over 800 interlinked datasets with over 60 billion triples. These datasets encapsulate structured data and knowledge spanning over varied domains such as entertainment, life sciences, publications, geography, and government. Applications can take advantage of this by using the knowledge distributed over the interconnected datasets, which is not realistic to find in a single place elsewhere. However, two of the key obstacles in using the LOD cloud are the limited support for data integration tasks over concepts, instances, and properties, and relevant data source selection for querying over multiple datasets. We review, in brief, some of the important and interesting technical approaches found in the literature that address these two issues. We observe that the general purpose alignment techniques developed outside the LOD context fall short in meeting the heterogeneous data representation of LOD. Therefore, an LOD-specific review of these techniques (especially for alignment) is important to the community. The topics covered and discussed in this article fall under two broad categories, namely alignment techniques for LOD datasets and relevant data source selection in the context of query processing over LOD datasets.
VL - 4
CP - 2
ER -
TY - CONF
T1 - An Analysis of Mayo Clinic Search Query Logs for Cardiovascular Diseases
T2 - AMIA Annual Symposium 2014
Y1 - 2014
A1 - Ashutosh Jadhav
A1 - Amit Sheth
A1 - Jyotishman Pathak
KW - cardiovascular diseases
KW - CVD
KW - eHealth
KW - health information search
KW - health search log
KW - health seeking behavior
KW - online health information seeking
KW - search query analysis
AB - Increasingly, individuals are taking active participation in learning and managing their health by leveraging online resources. Understanding online health information searching behavior can help us to study what health topics users search for and how search queries are formulated. In this work, we analyzed 10 million cardiovascular diseases (CVD) related search queries from MayoClinic.com. We performed semantic analysis on the queries using UMLS MetaMap and analyzed structural and textual properties as well as linguistic characteristics of the queries.
JA - AMIA Annual Symposium 2014
CY - Washington, D.C.
ER -
TY - Generic
T1 - Analysis of Online Information Searching for Cardiovascular Diseases on a Consumer Health Information Portal
T2 - AMIA Annual Symposium 2014
Y1 - 2014
A1 - Ashutosh Jadhav
A1 - Amit Sheth
A1 - Jyotishman Pathak
KW - cardiovascular diseases
KW - CVD
KW - eHealth
KW - health information search
KW - health search log
KW - health seeking behavior
KW - online health information seeking
KW - search intent mining
KW - search query analysis
AB - Since the early 2000s, Internet usage for health information searching has increased significantly. Studying search queries can help us to understand users' information needs and how they formulate search queries (expression of information need). Although cardiovascular diseases (CVD) affect a large percentage of the population, few studies have investigated how and what users search for CVD. We address this knowledge gap in the community by analyzing a large corpus of 10 million CVD related search queries from MayoClinic.com. Using UMLS MetaMap and UMLS semantic types/concepts, we developed a rule-based approach to categorize the queries into 14 health categories. We analyzed structural properties, types (keyword-based/Wh-questions/Yes-No questions) and linguistic structure of the queries. Our results show that the most searched health categories are Diseases/Conditions, Vital signs, Symptoms and Living-with. CVD queries are longer and are predominantly keyword-based. This study extends our knowledge about online health information searching and provides useful insights for Web search engines and health websites.
JA - AMIA Annual Symposium 2014
PB - American Medical Informatics Association
CY - Washington, D.C.
ER -
TY - CONF
T1 - An Analytic Model of Airport Security Checkpoint Screening Times
T2 - Transportation Research Board's 93rd Annual Meeting
Y1 - 2014
A1 - Derek Doran
A1 - Swapna Gokhale
A1 - Nicholas Lownes
AB - Security checkpoints at airports across the United States are essential for preventing passengers with dangerous weapons, explosives, and other threats from boarding airplanes. However, the multiple screening technologies and speeds of passengers lead to unpredictable and sometimes long waiting times. Security agencies and airport managers must find ways to minimize screening times at checkpoints without compromising the security of aviation transportation. This paper introduces an analytic model that derives the distribution of completion times for passengers through a security checkpoint, given its architecture, passenger profiles, and expected service times at checkpoint components. By varying the model's parameters and checkpoint architecture, security agencies and airport managers can quickly understand how the end-to-end completion times of passengers are affected by policy changes and checkpoint reconfigurations. The model can also be used to forecast the performance of future checkpoint architectures that use new components and policies. The authors demonstrate the utility of the model by analyzing a prototypical security checkpoint.
JA - Transportation Research Board's 93rd Annual Meeting
ER -
TY - JOUR
T1 - Analytical Modelling and Simulation of I-V Characteristics in Carbon Nanotube Based Gas Sensors Using ANN and SVR Methods
JF - Chemometrics and Intelligent Laboratory Systems
Y1 - 2014
A1 - Elnaz Akbari
A1 - Zolkafle Buntat
A1 - Aria Enzevaee
A1 - Monireh Ebrahimi
A1 - Amir Yazdavar
A1 - Rubiyah Yusof
KW - Artificial neural networks
KW - Carbon nanotubes (CNTs)
KW - Field effect transistor (FET)
KW - I-V characteristic
KW - Support vector regression (SVR)
AB - As one of the most interesting advancements in the field of nanotechnology, carbon nanotubes (CNTs) have been given special attention because of their remarkable mechanical and electrical properties and are being used in many scientific and engineering research projects. One such application facilitated by the fact that CNTs experience changes in electrical conductivity when exposed to different gases is the use of these materials as part of gas detection sensors. These are typically constructed on a field effect transistor (FET) based structure in which the CNT is employed as the channel between the source and the drain. In this study, an analytical model has been proposed and developed with the initial assumption that the gate voltage is directly proportional to the gas concentration as well as its temperature. Using the corresponding formulae for CNT conductance, the proposed mathematical model is derived. Artificial neural network (ANN) and support vector regression (SVR) algorithms have also been incorporated to obtain other models for the current-voltage (I-V) characteristic in which the experimental data extracted from a recent work by N. Peng et al. has been used as the training data set. The comparative study of the results from ANN, SVR, and the analytical models with the experimental data in hand shows a satisfactory agreement which validates the proposed models. However, SVR outperforms the ANN approach and gives more accurate results.
VL - 137
ER -
TY - CONF
T1 - Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights
T2 - Cyber and Social Data for Reliable and Actionable Insights, 2nd International Workshop on Internet of Things(C-IOT 2014) in conjunction with IEEE CollaborateCom 2014
Y1 - 2014
A1 - Amit Sheth
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
AB - Physical objects with embedded sensors are increasingly being networked together using wireless and internet technologies to form the Internet of Things (IoT). However, early applications that rely on IoT data fail to provide comprehensive situational awareness. This often requires combining physical (i.e., IoT) data with social data created by humans on the Web and increasingly on their mobile phones (i.e., citizen sensing) as well as other data such as structured open data and background knowledge available on the Web (i.e., cyber data and knowledge). In this paper, we explore how integration and analysis of multimodal physical-cyber-social data can support advanced applications and enrich human experience. Specifically, we illustrate the complementary role played by sensor and social data, often intermediated by other Web based data and knowledge, using real-world examples in the domain of situational awareness, traffic monitoring, and healthcare. We also show how semantic techniques and technologies support critical data interoperability needs, advanced computation capabilities including reasoning, and significantly enhance our ability to exploit the growing amount of data from the proliferation of the Internet of Things.
JA - Cyber and Social Data for Reliable and Actionable Insights, 2nd International Workshop on Internet of Things(C-IOT 2014) in conjunction with IEEE CollaborateCom 2014
CY - Miami, Florida
ER -
TY - RPRT
T1 - Architecture and Prototype for Materials Knowledge Management System using Semantic Web Technologies and Techniques: A Preliminary Report
Y1 - 2014
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Kalpa Gunaratna
A1 - Vinh Nguyen
A1 - Siva Cheekula
A1 - Sarasi Lalithsena
A1 - Nishita Jaykumar
A1 - Swapnil Soni
A1 - Clare Paul
AB - We present a semantic web principles and technology enabled framework for addressing digital data challenges relevant to the Materials Genomics Initiative. Specifically, this paper presents support for, and prototypes to, (i) integrate different legacy Materials vocabularies, (ii) annotate documents using the unified vocabulary, and (iii) enable convenient elicitation of relevant materials knowledge from domain experts. We have also adapted our iExplore tool to visualize and navigate materials related semantic web data. Further, in order to permit extraction of specific biomaterials knowledge, we have developed pattern-based extraction rules and an annotation tool that can work on technical papers in PDF format. Our work exploits a recently proposed technique for annotating semantic web triples (using singleton property) that can associate provenance and access control information succinctly.
PB - Wright State University
CY - Dayton
ER -
TY - Generic
T1 - Assisting Coordination during Crisis: A Domain Ontology Based Approach to Infer Resource Needs from Tweets
T2 - 2014 ACM Conference on Web Science
Y1 - 2014
A1 - Shreyansh Bhatt
A1 - Hemant Purohit
A1 - Andrew Hampton
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - John Flach
KW - crisis computing
KW - crisis coordination
KW - Crisis Informatics
KW - crisis response
KW - crisis response coordination
KW - domain model
KW - Emergency Response
KW - semantic inference
KW - Social Media
KW - social media for emergency management (SMEM)
AB - Ubiquitous social media during crises provides citizen reports on the situation, needs and supplies. Previous research extracts resource needs directly from the text (e.g. "Power cut to Coney Island and Brighton beach" indicates a power need). This approach assumes that citizens derive and write about specific needs from their observations, properly specified for the emergency response system, an assumption that is not consistent with general conversational behavior. In our study, Twitter messages (tweets) from Hurricane Sandy in 2012 clearly indicate power blackouts, but not their probable implications (e.g. loss of power to hospital life support systems). We use a domain model to capture such interdependencies between resources and needs. We represent these dependencies in an ontology that specifies the functional association between resources. Accurate interpretation of resource need/supply also depends on the location of a message. We show how inference based on a domain model combined with location detection and interpretation in the social data can enhance situational awareness, e.g., predicting a medical emergency before it is reported as critical.
JA - 2014 ACM Conference on Web Science
PB - ACM
CY - New York, NY
ER -
TY - JOUR
T1 - Automated fall detection with quality improvement "rewind" to reduce falls in hospital rooms
JF - Journal of Gerontological Nursing Technology Innovations
Y1 - 2014
A1 - Marilyn Rantz
A1 - Tanvi Banerjee
A1 - Erin Cattoor
A1 - Susan Scott
A1 - Mihail Popescu
A1 - Marjorie Skubic
AB - The purpose of this study was to test the implementation of a fall detection and “rewind” privacy-protecting technique using the Microsoft® Kinect™ to not only detect but prevent falls from occurring in hospitalized patients. Kinect sensors were placed in six hospital rooms in a step-down unit and data were continuously logged. Prior to implementation with patients, three researchers performed a total of 18 falls (walking and then falling down or falling from the bed) and 17 non-fall events (crouching down, stooping down to tie shoe laces, and lying on the floor). All falls and non-falls were correctly identified using automated algorithms to process Kinect sensor data. During the first 8 months of data collection, processing methods were perfected to manage data and provide a “rewind” method to view events that led to falls for post-fall quality improvement process analyses. Preliminary data from this feasibility study show that using the Microsoft Kinect sensors provides detection of falls, fall risks, and facilitates quality improvement after falls in real hospital environments unobtrusively, while taking into account patient privacy.
VL - 40
CP - 1
ER -
TY - Generic
T1 - Building a framework for recognition of activities of daily living from depth images using fuzzy logic
T2 - 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
Y1 - 2014
A1 - Tanvi Banerjee
A1 - James Keller
A1 - Marjorie Skubic
KW - 3D feature extraction
KW - Activities of daily living
KW - behavior change measurement
KW - bounding box parameters
KW - data collection
KW - Data Mining
KW - depth image
KW - Depth images
KW - DH-HEMTs
KW - feature extraction
KW - foreground images
KW - Fuzzy Logic
KW - fuzzy reasoning
KW - fuzzy rules
KW - health change prediction
KW - hierarchical fuzzy rule model
KW - IADL
KW - IADLS
KW - instrumental activities-of-daily living recognition
KW - Kinect depth data
KW - learning (artificial intelligence)
KW - machine learning
KW - Niobium
KW - older adults
KW - Sensors
KW - Three-dimensional displays
KW - system model
KW - three-layered FIS model
KW - three-level fuzzy inference
AB - Complex activities such as instrumental activities of daily living (IADLs) can be identified by creating a hierarchical model of fuzzy rules. In this work, we present a framework to model a specific IADL - "making the bed". For this activity recognition, the need for a three-level Fuzzy Inference System (FIS) model is shown. Simple features such as bounding box parameters were extracted from the foreground images and combined with 3D features extracted from the Kinect depth data. This was then fed as input to the three-layered FIS for further analysis. Data collected from several participants were tested and evaluated. Such a framework can be used to model several other IADLs as well as basic activities of daily living (ADLs). Analysis of ADLs can be used to compare daily patterns in older adults to measure changes in behavior. This can then be used to predict health changes to assist older adults in leading independent lifestyles for longer time periods.
JA - 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
PB - IEEE
CY - Beijing, China
ER -
TY - CONF
T1 - Characterizing Trajectories of Daily Routines of Older Adults with Sensor Technology
T2 - 7th Western Institute of Nursing Communicating Research Conference
Y1 - 2014
A1 - M. Yefimova
A1 - Z. Hajihashemi
A1 - Tanvi Banerjee
A1 - D. Woods
A1 - M. Popescu
A1 - M. Skubic
A1 - M. Rantz
A1 - J. Keller
A1 - M. Keller
JA - 7th Western Institute of Nursing Communicating Research Conference
CY - Seattle, WA
ER -
TY - CONF
T1 - Comparative Analysis of Online Health Information Search by Device Type
T2 - American Medical Informatics Association, TBI/CRI Joint Summit
Y1 - 2014
A1 - Ashutosh Jadhav
KW - eHealth
KW - health information search
KW - health search log
KW - health seeking behavior
KW - mHealth
KW - mobile health
KW - online health information seeking
KW - search query analysis
JA - American Medical Informatics Association, TBI/CRI Joint Summit
CY - San Francisco, California
ER -
TY - JOUR
T1 - Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal
JF - Journal of Medical Internet Research
Y1 - 2014
A1 - Ashutosh Jadhav
A1 - Donna Andrews
A1 - Alexander Fiksdal
A1 - Ashok Kumbamu
A1 - Jennifer McCormick
A1 - Andrew Misitano
A1 - Laurie Nelsen
A1 - Euijung Ryu
A1 - Amit Sheth
A1 - Stephen Wu
A1 - Jyotishman Pathak
KW - eHealth
KW - health information search
KW - health search log
KW - health seeking behavior
KW - mHealth
KW - mobile health
KW - online health information seeking
KW - search query analysis
AB - Background: The number of people using the Internet and mobile/smart devices for health information seeking is increasing rapidly. Although the user experience for online health information seeking varies with the device used, for example, smart devices (SDs) like smartphones/tablets versus personal computers (PCs) like desktops/laptops, very few studies have investigated how online health information seeking behavior (OHISB) may differ by device. Objective: The objective of this study is to examine differences in OHISB between PCs and SDs through a comparative analysis of large-scale health search queries submitted through Web search engines from both types of devices. Methods: Using the Web analytics tool, IBM NetInsight OnDemand, and based on the type of devices used (PCs or SDs), we obtained the most frequent health search queries between June 2011 and May 2013 that were submitted on Web search engines and directed users to the Mayo Clinic consumer health information website. We performed analyses on Queries with considering repetition counts (QwR) and Queries without considering repetition counts (QwoR). The dataset contains (1) 2.74 million and 3.94 million QwoR, respectively for PCs and SDs, and (2) more than 100 million QwR for both PCs and SDs. We analyzed structural properties of the queries (length of the search queries, usage of query operators and special characters in health queries), types of search queries (keyword-based, wh-questions, yes/no questions), categorization of the queries based on health categories and information mentioned in the queries (gender, age-groups, temporal references), misspellings in the health queries, and the linguistic structure of the health queries. Results: Query strings used for health information searching via PCs and SDs differ by almost 50%. The most searched health categories are Symptoms (1 in 3 search queries), Causes, and Treatments & Drugs. 
The distribution of search queries for different health categories differs with the device used for the search. Health queries tend to be longer and more specific than general search queries. Health queries from SDs are longer and have slightly fewer spelling mistakes than those from PCs. Users specify words related to women and children more often than that of men and any other age group. Most of the health queries are formulated using keywords; the second-most common are wh- and yes/no questions. Users ask more health questions using SDs than PCs. Almost all health queries have at least one noun and health queries from SDs are more descriptive than those from PCs. Conclusions: This study is a large-scale comparative analysis of health search queries to understand the effects of device type (PCs vs SDs) used on OHISB. The study indicates that the device used for online health information search plays an important role in shaping how health information searches by consumers and patients are executed.
VL - 16
CP - 7
ER -
TY - JOUR
T1 - Compensatory Enlargement of Ossabaw Miniature Swine Coronary Arteries in Diffuse Atherosclerosis
JF - International Journal of Cardiology
Y1 - 2014
A1 - Jenny Choy
A1 - Tong Luo
A1 - Yunlong Huo
A1 - Thomas Wischgoll
A1 - Kyle Schultz
A1 - Shawn Teague
A1 - Michael Sturek
A1 - Ghassan Kassab
AB - Studies in human and non-human primates have confirmed the compensatory enlargement or positive remodeling (Glagov phenomenon) of coronary vessels in the presence of focal stenosis. To our knowledge, this is the first study to document arterial enlargement in a metabolic syndrome animal model with diffuse coronary artery disease (DCAD) in the absence of severe focal stenosis. Two different groups of Ossabaw miniature pigs were fed a high fat atherogenic diet for 4 months (Group I) and 12 months (Group II), respectively. Group I (6 pigs) underwent contrast enhanced computed tomographic angiography (CCTA) and intravascular ultrasound (IVUS) at baseline and after 4 months of high fat diet, whereas Group II (7 pigs) underwent only IVUS at 12 months of high fat diet. IVUS measurements of the left anterior descending (LAD), left circumflex (LCX) and right coronary (RCA) arteries in Group I showed an average increase in their lumen cross-sectional areas (CSA) of 25.8%, 11.4%, and 43.4%, respectively, as compared to baseline. The lumen CSA values of LAD in Group II were found to be between the baseline and 4 months values in Group I. IVUS and CCTA measurements showed a similar trend and positive correlation. Fractional flow reserve (FFR) was 0.91 ± 0.07 at baseline and 0.93 ± 0.05 at 4 months with only 2.2%, 1.6% and 1% stenosis in the LAD, LCX and RCA, respectively. The relation between percent stenosis and lumen CSA shows a classical Glagov phenomenon in this animal model of DCAD.
VL - 6
ER -
TY - THES
T1 - A Context-Driven Subgraph Model for Literature-Based Discovery
T2 - Department of Computer Science & Engineering
Y1 - 2014
A1 - Delroy Cameron
KW - Graph mining
KW - Literature-based discovery (LBD)
KW - Path clustering
KW - semantic predications
KW - Semantic relatedness
AB - Literature-Based Discovery (LBD) refers to the process of uncovering hidden connections that are implicit in scientific literature. Numerous hypotheses have been generated from scientific literature using the LBD paradigm, which influenced innovations in diagnosis, treatment, preventions and overall public health. However, much of the existing research on discovering hidden connections among concepts has used distributional statistics and graph-theoretic measures to capture implicit associations. Such metrics do not explicitly capture the semantics of hidden connections. Rather, they only allude to the existence of meaningful underlying associations. To gain in-depth insights into the meaning of hidden (and other) connections, complementary methods have often been employed. Some of these methods include: 1) the use of domain expertise for concept filtering and knowledge exploration, 2) leveraging structured background knowledge for context and to supplement concept filtering, and 3) developing heuristics a priori to help eliminate spurious connections. While effective in some situations, the practice of relying on domain expertise, structured background knowledge, and heuristics to complement distributional and graph-theoretic approaches has serious limitations. The main issue is that the intricate context of complex associations is not always known a priori and cannot easily be computed without understanding the underlying semantics of the associations. Complex associations should not be overlooked, since they are often needed to elucidate the mechanisms of interaction and causality relationships among concepts. Moreover, they can capture the broader aspects of a biomedical sub-domain by segregating associations along different thematic dimensions, such as Metabolic Function, Pharmaceutical Treatment, and Neurological Activity. 
This dissertation proposes an innovative context-driven, automatic subgraph creation method for finding hidden and complex associations among concepts, along multiple thematic dimensions. It outlines definitions for context and shared context, based on implicit and explicit (or formal) semantics, which compensate for deficiencies in statistical and graph-based metrics. It also eliminates the need for heuristics a priori. An evidence-based evaluation of the proposed framework showed that 8 out of 9 existing scientific discoveries could be recovered using this approach. Additionally, insights into the meaning of associations could be obtained using provenance provided by the system. In a statistical evaluation to determine the interestingness of the generated subgraphs, it was observed that an arbitrary association is mentioned in only approximately 4 articles in MEDLINE, on average. These results suggest that leveraging implicit and explicit context, as defined in this dissertation, is a significant advancement of the state-of-the-art in LBD research.
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - Generic
T1 - Cursing in English on Twitter
T2 - ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2014)
Y1 - 2014
A1 - Wenbo Wang
A1 - Lu Chen
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - Cursing
KW - Emotion
KW - Gender Difference
KW - Profanity
KW - Social Media
KW - twitter
AB - Cursing is not uncommon during conversations in the physical world: 0.5% to 0.7% of all the words we speak are curse words, given that 1% of all the words are first-person plural pronouns (e.g., we, us, our). On social media, people can instantly chat with friends without face-to-face interaction, usually in a more public fashion and broadly disseminated through highly connected social network. Will these distinctive features of social media lead to a change in people's cursing behavior? In this paper, we examine the characteristics of cursing activity on a popular social media platform, Twitter, involving the analysis of 51 million tweets and about 14 million users. In particular, we explore a set of questions that have been recognized as crucial for understanding cursing in offline communications by prior studies, including the ubiquity, utility, and contextual dependencies of cursing.
JA - ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2014)
PB - ACM
CY - New York, NY
ER -
TY - CONF
T1 - Daily Routines of Older Adults: A Novel Method of Measurement
T2 - 67th Scientific Meeting of the Gerontological Society of America
Y1 - 2014
A1 - M. Yefimova
A1 - Z. Hajihashemi
A1 - Tanvi Banerjee
A1 - D. Woods
A1 - M. Popescu
A1 - M. Rantz
A1 - M. Skubic
A1 - J. Keller
JA - 67th Scientific Meeting of the Gerontological Society of America
CY - Washington DC
ER -
TY - Generic
T1 - Data Analytics for Power Utility Storm Planning
T2 - 6th International Conference on Knowledge Discovery and Information Retrieval
Y1 - 2014
A1 - Lan Lin
A1 - Aldo Dagnino
A1 - Derek Doran
A1 - Swapna Gokhale
KW - Data Analytics
KW - machine learning
KW - On-line social media
KW - Smart grid
KW - Storm damage projection
AB - As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damages that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and operation systems. This greatly increases the amount of data collected during normal and storm conditions. These data, when complemented with data from weather stations, storm forecasting systems, and online social media, can be used in analyses for enhancing storm preparedness for utilities. This paper presents a data analytics approach that uses real-world historical data to help utilities in storm damage projection. Preliminary results from the analysis are also included.
JA - 6th International Conference on Knowledge Discovery and Information Retrieval
CY - Rome, Italy
ER -
TY - Generic
T1 - Distributed OWL EL Reasoning: The Story So Far
T2 - 10th International Workshop on Scalable Semantic Web Knowledge Base Systems
Y1 - 2014
A1 - Raghava Mutharaju
A1 - Pascal Hitzler
A1 - Prabhaker Mateti
ED - Thomas Liebig
ED - Achille Fokoue
KW - classification
KW - distributed reasoning
KW - MapReduce
KW - OWL 2 EL
KW - peer-to-peer system
AB - Automated generation of axioms from streaming data, such as traffic and text, can result in very large ontologies that single-machine reasoners cannot handle. Reasoning with large ontologies requires distributed solutions. Scalable reasoning techniques for RDFS, OWL Horst and OWL 2 RL now exist. For OWL 2 EL, several distributed reasoning approaches have been tried, but all are perceived to be inefficient. We analyze this perception. We analyze completion rule based distributed approaches using different characteristics, such as dependency among the rules, implementation optimizations, and how axioms and rules are distributed. We also present a distributed queue approach for the classification of ontologies in description logic EL+ (a fragment of OWL 2 EL).
JA - 10th International Workshop on Scalable Semantic Web Knowledge Base Systems
PB - CEUR
CY - Riva del Garda, Italy
VL - 1261
ER -
TY - Generic
T1 - Don't like RDF Reification? Making Statements about Statements using Singleton Property
T2 - 23rd International conference on World Wide Web (WWW '14)
Y1 - 2014
A1 - Vinh Nguyen
A1 - Olivier Bodenreider
A1 - Amit Sheth
KW - Meta triples
KW - rdf
KW - RDF reification
KW - Semantic Web
KW - Singleton Property
KW - sparql
AB - Statements about RDF statements, or meta triples, provide additional information about individual triples, such as the source, the occurring time or place, or the certainty. Integrating such meta triples into semantic knowledge bases would enable the querying and reasoning mechanisms to be aware of provenance, time, location, or certainty of triples. However, an efficient RDF representation for such meta knowledge of triples remains challenging. The existing standard reification approach allows such meta knowledge of RDF triples to be expressed using RDF by two steps. The first step is representing the triple by a Statement instance which has subject, predicate, and object indicated separately in three different triples. The second step is creating assertions about that instance as if it is a statement. While reification is simple and intuitive, this approach does not have formal semantics and is not commonly used in practice as described in the RDF Primer. In this paper, we propose a novel approach called Singleton Property for representing statements about statements and provide a formal semantics for it. We explain how this singleton property approach fits well with the existing syntax and formal semantics of RDF, and the syntax of SPARQL query language. We also demonstrate the use of singleton property in the representation and querying of meta knowledge in two examples of Semantic Web knowledge bases: YAGO2 and BKR. Our experiments on the BKR show that the singleton property approach gives a decent performance in terms of number of triples, query length and query execution time compared to existing approaches. This approach, which is also simple and intuitive, can be easily adopted for representing and querying statements about statements in other knowledge bases.
JA - 23rd International conference on World Wide Web (WWW '14)
PB - ACM
CY - New York, NY
ER -
TY - Generic
T1 - Don't like RDF reification?: making statements about statements using singleton property
T2 - 23rd International Conference on World Wide Web
Y1 - 2014
KW - Meta triples
KW - rdf
KW - RDF Singleton Property
KW - Reification
KW - Semantic Web
KW - sparql
AB - Statements about RDF statements, or meta triples, provide additional information about individual triples, such as the source, the occurring time or place, or the certainty. Integrating such meta triples into semantic knowledge bases would enable the querying and reasoning mechanisms to be aware of provenance, time, location, or certainty of triples. However, an efficient RDF representation for such meta knowledge of triples remains challenging. The existing standard reification approach allows such meta knowledge of RDF triples to be expressed using RDF by two steps. The first step is representing the triple by a Statement instance which has subject, predicate, and object indicated separately in three different triples. The second step is creating assertions about that instance as if it is a statement. While reification is simple and intuitive, this approach does not have formal semantics and is not commonly used in practice as described in the RDF Primer. In this paper, we propose a novel approach called Singleton Property for representing statements about statements and provide a formal semantics for it. We explain how this singleton property approach fits well with the existing syntax and formal semantics of RDF, and the syntax of SPARQL query language. We also demonstrate the use of singleton property in the representation and querying of meta knowledge in two examples of Semantic Web knowledge bases: YAGO2 and BKR. Our experiments on the BKR show that the singleton property approach gives a decent performance in terms of number of triples, query length and query execution time compared to existing approaches. This approach, which is also simple and intuitive, can be easily adopted for representing and querying statements about statements in other knowledge bases.
JA - 23rd International Conference on World Wide Web
PB - ACM
CY - Seoul, South Korea
SN - 978-1-4503-2744-2
ER -
TY - RPRT
T1 - Dynamic Update of Public Transport Schedules in Cities Lacking Traffic Instrumentation
Y1 - 2014
A1 - Pramod Anantharam
A1 - Biplav Srivastava
A1 - Raj Gupta
AB - A common obstacle for citizens in switching to public transportation is the lack of information about available choices when they need to travel. Although schedules of individual modes like bus or metro may be available as paper pamphlets, or digital files on websites, they do not give an integrated view of the complete services possible when a citizen actually wants to travel. Furthermore, the situation on the roads evolves and this demands timeliness of public transportation information. We want to tackle this problem in the context of cities of developing countries like India, which lack basic instrumentation to track road conditions or vehicle location. Our solution extends a public transportation recommender working only with static schedule information to utilize SMS messages about road conditions sent by city authorities. Our solution consists of: (a) extracting events from traffic alert messages, (b) reasoning about traffic delays from extracted events by qualitatively deciding what stops (locations) will be affected, and quantitatively estimating the lower bound on the probability of having a delay at those locations in the city, and (c) utilizing the delay estimates for route recommendation. We use publicly available traffic related SMS messages from Delhi, India, to evaluate our approach and show its promise. Our solution provides dynamic updates for transport network in cities with low investment and quick time to realization.
PB - IBM Research
ER -
TY - Generic
T1 - Economically-efficient Sentiment Stream Analysis
T2 - 37th international ACM SIGIR Conference on Research & Development in Information Retrieval
Y1 - 2014
A1 - Roberto Lourenco
A1 - Adriano Veloso
A1 - Adriano Pereira
A1 - Wagner Meira
A1 - Renato Ferreira
A1 - Srinivasan Parthasarathy
KW - Economic Efficiency
KW - Sentiment Analysis
KW - Streams and Drifts
AB - Text-based social media channels, such as Twitter, produce torrents of opinionated data about the most diverse topics and entities. The analysis of such data (aka. sentiment analysis) is quickly becoming a key feature in recommender systems and search engines. A prominent approach to sentiment analysis is based on the application of classification techniques, that is, content is classified according to the attitude of the writer. A major challenge, however, is that Twitter follows the data stream model, and thus classifiers must operate with limited resources, including labeled data and time for building classification models. Also challenging is the fact that sentiment distribution may change as the stream evolves. In this paper we address these challenges by proposing algorithms that select relevant training instances at each time step, so that training sets are kept small while providing to the classifier the capabilities to suit itself to, and to recover itself from, different types of sentiment drifts. Simultaneously providing capabilities to the classifier, however, is a conflicting-objective problem, and our proposed algorithms employ basic notions of Economics in order to balance both capabilities. We performed the analysis of events that reverberated on Twitter, and the comparison against the state-of-the-art reveals improvements both in terms of error reduction (up to 14%) and reduction of training resources (by orders of magnitude).
JA - 37th international ACM SIGIR Conference on Research & Development in Information Retrieval
PB - ACM
CY - Gold Coast, Australia
ER -
TY - JOUR
T1 - Emergency-Relief Coordination on Social Media: Automatically Matching Resource Requests and Offers
JF - First Monday
Y1 - 2014
A1 - Hemant Purohit
A1 - Carlos Castillo
A1 - Fernando Diaz
A1 - Amit Sheth
A1 - Patrick Meier
KW - Coordination
KW - crisis computing
KW - crisis coordination
KW - Crisis Informatics
KW - crisis response
KW - disaster response
KW - donation coordination
KW - Emergency Response
KW - needs-offers matching
KW - relief coordination
KW - request-offer matching
KW - Social Media
AB - Disaster affected communities are increasingly turning to social media for communication and coordination. This includes reports on needs (demands) and offers (supplies) of resources required during emergency situations. Identifying and matching such requests with potential responders can substantially accelerate emergency relief efforts. Current work of disaster management agencies is labor intensive, and there is substantial interest in automated tools. We present machine-learning methods to automatically identify and match needs and offers communicated via social media for items and services such as shelter, money, clothing, etc. For instance, a message such as 'we are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us' can be matched with a message such as 'I got a bunch of clothes I'd like to donate to hurricane sandy victims. Anyone know where/how I can do that?' Compared to traditional search, our results can significantly improve the matchmaking efforts of disaster response agencies.
VL - 19
CP - 1
ER -
TY - Generic
T1 - Empowering Personalized Medicine with Big Data and Semantic Web Technology: Promises, Challenges, and Use Cases
T2 - 2014 IEEE International Conference on Big Data
Y1 - 2014
A1 - Maryam Panahiazar
A1 - Vahid Taslimitehrani
A1 - Ashutosh Jadhav
A1 - Jyotishman Pathak
KW - Big Data
KW - health care
KW - Personalized Medicine
KW - Semantic Web
KW - Smart Data
AB - In healthcare, big data tools and technologies have the potential to create significant value by improving outcomes while lowering costs for each individual patient. Diagnostic images, genetic test results and biometric information are increasingly generated and stored in electronic health records, presenting us with challenges in data that is by nature high volume, variety and velocity, thereby necessitating novel ways to store, manage and process big data. This presents an urgent need to develop new, scalable and expandable big data infrastructure and analytical methods that can enable healthcare providers to access knowledge for the individual patient, yielding better decisions and outcomes. In this paper, we briefly discuss the nature of big data and the role of semantic web and data analysis for generating "smart data", which offers actionable information that supports better decisions for personalized medicine. In our view, the biggest challenge is to create a system that makes big data robust and smart for healthcare providers and patients that can lead to more effective clinical decision-making, improved health outcomes, and ultimately, managing the healthcare costs. We highlight some of the challenges in using big data and propose the need for a semantic data-driven environment to address them. We illustrate our vision with practical use cases, and discuss a path for empowering personalized medicine using big data and semantic web technology.
JA - 2014 IEEE International Conference on Big Data
PB - IEEE
CY - Washington, D.C.
ER -
TY - JOUR
T1 - Evaluating the Process of Online Health Information Searching: A Qualitative Approach to Exploring Consumer Perspectives
JF - Journal of Medical Internet Research
Y1 - 2014
A1 - Alexander Fiksdal
A1 - Ashok Kumbamu
A1 - Ashutosh Jadhav
A1 - Cristian Cocos
A1 - Laurie Nelsen
A1 - Jyotishman Pathak
A1 - Jennifer McCormick
KW - Consumer Health Information
KW - Information Seeking Behavior
KW - Internet
KW - Qualitative Research
AB - Background: The Internet is becoming a common resource that patients and consumers use to access health-related information. Multiple practical, cultural, and socioeconomic factors influence why, when, and how people utilize this tool. The multitude of preferred vocabularies used to describe medical conditions and concepts represents a major challenge for content providers in providing relevant search results and information to consumers. Although a wide body of quantitative research examining search behavior exists, qualitative approaches have been under-utilized and provide unique perspectives that may prove useful in improving the delivery of health information over the Internet. Objective: We conducted this study to gain a deeper understanding of online health-searching behavior in order to inform future developments of personalizing information searching and content delivery. Methods: We completed three focus groups of adult Olmsted County, MN residents that explored perceptions of online health information searching. Participants were recruited through flyers and classifieds advertisements posted throughout the community. We audio recorded and transcribed all focus groups, and analyzed data using standard qualitative methods. Results: Almost all participants reported using the Internet to gather health information. They described a common experience of searching, filtering, and comparing results in order to obtain information relevant to their intended search target. Information saturation and fatigue were cited as main reasons for terminating searching. This information was often used as a resource to enhance their interactions with health care providers. Conclusions: Many participants viewed the Internet as a valuable tool for finding health information in order to support their existing healthcare resources. 
As the length of interactions between patients and providers continues to decrease, health information retrieved from the Internet will play an increasingly important role in health care. Although the Internet is a preferred source of health information, challenges persist in streamlining the search process. Content providers should continue to develop new strategies and technologies aimed at accommodating diverse populations, vocabularies, and health information needs.
VL - 16
CP - 10
ER -
TY - Generic
T1 - Evolving a Fuzzy Goal-Driven Strategy for the Game of Geister
T2 - 2014 IEEE Congress on Evolutionary Computation (CEC)
Y1 - 2014
A1 - Andrew Buck
A1 - Tanvi Banerjee
A1 - James Keller
KW - autonomous gameplay agent
KW - coevolutionary algorithm
KW - computational intelligence
KW - computer games
KW - evolutionary computation
KW - fuzzy goal-driven strategy
KW - Fuzzy Logic
KW - fuzzy reasoning
KW - Games
KW - German for ghosts game
KW - goal-based fuzzy inference system
KW - IEEE Computational Intelligence Society
KW - Inference algorithms
KW - multi-agent systems
KW - neural nets
KW - neural network
KW - Neural networks
KW - teaching
KW - Training
KW - unobservable feature estimation
KW - Vectors
AB - This paper presents an approach to designing a strategy for the game of Geister using the three main research areas of computational intelligence. We use a goal-based fuzzy inference system to evaluate the utility of possible actions and a neural network to estimate unobservable features (the true natures of the opponent ghosts). Finally, we develop a coevolutionary algorithm to learn the parameters of the strategy. The resulting autonomous gameplay agent was entered in a global competition sponsored by the IEEE Computational Intelligence Society and finished second among eight participating teams.
JA - 2014 IEEE Congress on Evolutionary Computation (CEC)
PB - IEEE
CY - Beijing, China
ER -
TY - Generic
T1 - Gender classification under extended operating conditions
T2 - SPIE
Y1 - 2014
A1 - Nathan Rude
A1 - Mateen Rizki
AB - Gender classification is a critical component of a robust image security system. Many techniques exist to perform gender classification using facial features. In contrast, this paper explores gender classification using body features extracted from clothed subjects. Several of the most effective types of features for gender classification identified in literature were implemented and applied to the newly developed Seasonal Weather And Gender (SWAG) dataset. SWAG contains video clips of approximately 2000 samples of human subjects captured over a period of several months. The subjects are wearing casual business attire and outer garments appropriate for the specific weather conditions observed in the Midwest. The results from a series of experiments are presented that compare the classification accuracy of systems that incorporate various types and combinations of features applied to multiple looks at subjects at different image resolutions to determine a baseline performance for gender classification
JA - SPIE
VL - 9079
ER -
TY - Generic
T1 - Hierarchical Interest Graph from Tweets
T2 - 23rd International World Wide Web Conference
Y1 - 2014
A1 - Pavan Kapanipathi
A1 - Prateek Jain
A1 - Chitra Venkataramani
A1 - Amit Sheth
ED - Kyuseok Shim
ED - Torsten Suel
KW - Hierarchical Interest Graph
KW - personalization
KW - Social Semantic Web
KW - twitter
KW - User Profiles
KW - Wikipedia
AB - Industry and researchers have identified numerous ways to monetize microblogs for personalization and recommendation. A common challenge across these different works is the identification of user interests. Although techniques have been developed to address this challenge, a flexible approach that spans multiple levels of granularity in user interests has not been forthcoming. In this work, we focus on exploiting hierarchical semantics of concepts to infer richer user interests expressed as a Hierarchical Interest Graph. To create such graphs, we utilize users' tweets to first ground potential user interests to structured background knowledge such as Wikipedia Category Graph. We then adapt spreading activation theory to assign user interest score to each category in the hierarchy. The Hierarchical Interest Graph comprises not only users' explicitly mentioned interests determined from Twitter, but also their implicit interest categories inferred from the background knowledge source.
JA - 23rd International World Wide Web Conference
PB - ACM
CY - Seoul, South Korea
ER -
TY - JOUR
T1 - A Hybrid Approach to Finding Relevant Social Media Content for Complex Domain Specific Information Needs
JF - Journal of Web Semantics
Y1 - 2014
A1 - Delroy Cameron
A1 - Amit Sheth
A1 - Nishita Jaykumar
A1 - Krishnaprasad Thirunarayan
A1 - Gaurish Anand
A1 - Gary Alan Smith
KW - background knowledge
KW - Complex Information Needs
KW - Context-Free Grammar
KW - Information Retrieval
KW - Knowledge-Aware Search
KW - Ontology
KW - Semantic Search
AB - While contemporary semantic search systems offer to improve classical keyword-based search, they are not always adequate for complex domain specific information needs. The domain of prescription drug abuse, for example, requires knowledge of both ontological concepts and 'intelligible constructs' not typically modeled in ontologies. These intelligible constructs convey essential information that include notions of intensity, frequency, interval, dosage and sentiments, which could be important to the holistic needs of the information seeker. In this paper, we present a hybrid approach to domain specific information retrieval (or knowledge-aware search) that integrates ontology-driven query interpretation with synonym-based query expansion and domain specific rules, to facilitate search in social media. Our framework is based on a context-free grammar (CFG) that defines the query language of constructs interpretable by the search system. The grammar provides two levels of semantic interpretation: 1) a top-level CFG that facilitates retrieval of diverse textual patterns, which belong to broad templates and 2) a low-level CFG that enables interpretation of certain specific expressions that belong to such patterns. These low-level expressions occur as concepts from four different categories of data: 1) ontological concepts, 2) concepts in lexicons (such as emotions and sentiments), 3) concepts in lexicons with only partial ontology representation, called lexico-ontology concepts (such as side effects and routes of administration (ROA)), and 4) domain specific expressions (such as date, time, interval, frequency and dosage) derived solely through rules. Our approach is embodied in a novel Semantic Web platform called PREDOSE, which provides search support for complex domain specific information needs in prescription drug abuse epidemiology. 
When applied to a corpus of over 1 million drug abuse-related web forum posts, our search framework proved effective in retrieving relevant documents when compared with three existing search systems.
VL - 29
ER -
TY - JOUR
T1 - Identifying Seekers and Suppliers in Social Media Communities to Support Crisis Coordination
JF - Journal of Computer Supported Cooperative Work (JCSCW)
Y1 - 2014
A1 - Hemant Purohit
A1 - Andrew Hampton
A1 - Shreyansh Bhatt
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - John Flach
KW - Cooperative behavior
KW - cooperative crisis response
KW - Coordination
KW - crisis computing
KW - crisis coordination
KW - Crisis Informatics
KW - Distributed Decision Making
KW - Emergency Response
KW - Organizational semantic analysis
KW - Psycholinguistics
KW - seeker-supplier behavior
KW - Semantic Web
KW - Sensemaking
KW - Social Media
KW - Spatio-temporal-thematic analysis
KW - Targeted delivery
KW - twitris
AB - Effective crisis management has long relied on both the formal and informal response communities. Social media platforms such as Twitter increase the participation of the informal response community in crisis response. Yet, challenges remain in realizing the formal and informal response communities as a cooperative work system. We demonstrate a supportive technology that recognizes the existing capabilities of the informal response community to identify needs (seeker behavior) and provide resources (supplier behavior), using their own terminology. To facilitate awareness and the articulation of work in the formal response community, we present a technology that can bridge the differences in terminology and understanding of the task between the formal and informal response communities. This technology includes our previous work using domain-independent features of conversation to identify indications of coordination within the informal response community. In addition, it includes a domain-dependent analysis of message content (drawing from the ontology of the formal response community and patterns of language usage concerning the transfer of property) to annotate social media messages. The resulting repository of annotated messages is accessible through our social media analysis tool, Twitris. It allows recipients in the formal response community to sort on resource needs and availability along various dimensions including geography and time. Thus, computation indexes the original social media content and enables complex querying to identify contents, players, and locations. Evaluation of the computed annotations for seeker-supplier behavior with human judgment shows fair to moderate agreement. In addition to the potential benefits to the formal emergency response community regarding awareness of the observations and activities of the informal response community, the analysis serves as a point of reference for evaluating more computationally intensive efforts and characterizing the patterns of language behavior during a crisis.
VL - 23
CP - 4-6
ER -
TY - JOUR
T1 - An Information Filtering and Management Model for Twitter Traffic to Assist Crises Response Coordination
JF - Journal of Computer Supported Cooperative Work (Special Issue on Crisis Informatics and Collaboration)
Y1 - 2014
A1 - Hemant Purohit
A1 - Andrew Hampton
A1 - Shreyansh Bhatt
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - John Flach
AB - Disasters such as Hurricane Sandy in 2012 result in extensive social media traffic, using networking platforms such as Twitter, as citizens report on their situations, identify needs and attempt to distribute resources. We address the challenge of finding relevant, actionable tweets from this large volume with an information filtering model. Driven primarily by concern for coordination, the initial domain independent analysis incorporates psycholinguistic theory to filter for potential messages of cooperation. The subsequent domain dependent analysis leverages a lightweight, language-driven, disaster-related domain model to extract resource references (e.g., food, shelter, etc.) in its first phase. Using a lexicon of verbs concerning the transfer of property, combined with simple syntactic frames, a second phase of domain dependent analysis assists in the identification of a particular kind of tacit cooperation, in the declarations of resource needs and availability. The results populate an annotated information repository to support the presentation of organized, actionable information nuggets regarding resource needs and availability at varying levels of abstraction. Computationally grounding the abstractions in raw data enables complex querying ability for who-what-where in coordination. Initial evaluation of the annotations relative to human judgment shows fair to good agreement. In addition to the potential benefits to the formal emergency response community of a filtered and organized corpus, the results serve as a benchmark for evaluating more computationally intensive efforts and characterizing the patterns of language behavior for coordination during a disaster.
VL - 23
CP - 4
ER -
TY - CONF
T1 - Interactive Visualization of GRT and BioHTS Data
T2 - AFRL-AFIT Human-Machine Systems Colloquium
Y1 - 2014
A1 - Sara Gharabaghi
A1 - Thomas Wischgoll
A1 - Rhonda Vickery
A1 - Ross Smith
A1 - Leslie Blaha
A1 - Thomas Lamkin
A1 - Steven Kawamoto
A1 - Robert Trevino
A1 - Eric Bardes
A1 - Scott Tabar
AB - The scope of this project is to provide better tools for statistical and informational visual analysis for High Throughput Screening of Biological Infectious Agents (BioHTS), General Recognition Theory (GRT) modeling, and areas where pipelines of unstructured datasets of all types must be analyzed. A parallel coordinates plot is one of the more effective visualization methods for visualizing multivariate data.
JA - AFRL-AFIT Human-Machine Systems Colloquium
ER -
TY - MGZN
T1 - The Internet of Things: The Story So Far
Y1 - 2014
A1 - Payam Barnaghi
A1 - Amit Sheth
KW - Actionable Information
KW - Actionable Knowledge
KW - Ambient Intelligence
KW - Internet of Things
KW - physical-cyber-social systems
KW - Semantic Sensor Network
AB - The IoT is not about collecting and publishing data from the physical world but rather about providing knowledge and insights regarding objects (i.e., things), the physical environment, the human and social activities in the physical environments (as may be recorded by devices), and enabling systems to take action based on the knowledge obtained. In other words, raw IoT data is not what the IoT user wants; it is mainly about ambient intelligence and actionable knowledge enabled by real world and real time data.
JA - IEEE Internet of Things
ER -
TY - JOUR
T1 - IVUS Validation of Patient Coronary Artery Lumen Area obtained from CT Images
JF - PLoS ONE
Y1 - 2014
A1 - Tong Luo
A1 - Thomas Wischgoll
A1 - Bon Kwon Koo
A1 - Yunlong Huo
A1 - Ghassan Kassab
VL - 9
CP - 1
ER -
TY - CONF
T1 - A Keyword Sense Disambiguation Based Approach for Noise Filtering in Twitter
T2 - 1st Insight Student Conference
Y1 - 2014
A1 - Sanjaya Wijeratne
A1 - Bahareh Heravi
KW - Digital Journalism
KW - Twitter Noise Filtering
KW - Word Sense Disambiguation
AB - In this paper, we describe an approach to filter out noisy data generated by keywords-based tweet filtering methods by performing Word Sense Disambiguation on those keywords used to collect tweets. We present the noise filtering problem as a binary classification problem and discuss our evaluation strategy which is to be carried out in future.
JA - 1st Insight Student Conference
PB - University College Dublin
CY - Dublin, Ireland
ER -
TY - CONF
T1 - kHealth: Proactive Personalized Actionable Information for Better Healthcare
T2 - Workshop on Personal Data Analytics in the Internet of Things (PDA@IOT 2014), collocated at VLDB 2014
Y1 - 2014
A1 - Amit Sheth
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
AB - Mobile devices and sensors are profoundly changing the way we create, consume, and share information. Health aficionados and patients with chronic conditions are increasingly using sensors and mobile devices to track sleep, food, activity, and other physiological observations (e.g., weight, heart rate, blood pressure). This trend is leading to a paradigm shift from reactive medicine to predictive, preventive, personalized, and participatory medicine. This is also empowering an individual to more fully participate in health related decision making. Facilitating this transformation requires understanding the richness and nuances of health care data, yet there is a dearth of research in this area. There are many healthcare applications that utilize mobile devices and sensors to monitor the health of an individual. Increased instrumentation, such as the use of smart phones and social media, provides fine-grained access to the activities of a person and of the population in general. The majority of analytics is focused on finding discrepancies in a single stream of observations without much insight into the problem and actionable information. kHealth analyzes observations from passive (no human involvement in data collection) and active (human input involved in data collection) sensors to provide explanations that are intelligible to individuals and, when needed, their clinicians for well-informed decision making.
JA - Workshop on Personal Data Analytics in the Internet of Things (PDA@IOT 2014), collocated at VLDB 2014
CY - Hangzhou, China
ER -
TY - Generic
T1 - Leveraging Social Media and Web of Data for Crisis Response Coordination
Y1 - 2014
A1 - Carlos Castillo
A1 - Fernando Diaz
A1 - Hemant Purohit
KW - Big Crisis Data
KW - crisis response
KW - crisis response coordination
KW - disaster response
KW - Emergency Response
KW - social media analysis
KW - Web of Data
AB - There is an ever increasing number of users in social media (1B+ Facebook users, 500M+ Twitter users) and ubiquitous mobile access (6B+ mobile phone subscribers) who share their observations and opinions. In addition, the Web of Data and existing knowledge bases keep on growing at a rapid pace. In this scenario, we have unprecedented opportunities to improve crisis response by extracting social signals, creating spatio-temporal mappings, performing analytics on social and Web of Data, and supporting a variety of applications. Such applications can help provide situational awareness during an emergency, improve preparedness, and assist during the rebuilding/recovery phase of a disaster. Data mining can provide valuable insights to support emergency responders and other stakeholders during crisis. However, there are a number of challenges and existing computing technology may not work in all cases. Therefore, our objective here is to present the characterization of such data mining tasks, and challenges that need further research attention.
ER -
TY - RPRT
T1 - Location Prediction of Twitter Users using Wikipedia
Y1 - 2014
A1 - Revathy Krishnamurthy
A1 - Pavan Kapanipathi
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - Location prediction
KW - Social data analysis
KW - twitter
KW - Wikipedia
AB - The mining of user generated content in social media has proven very effective in domains ranging from personalization and recommendation systems to crisis management. The knowledge of online users' locations makes their tweets more informative and adds another dimension to their analysis. Existing approaches to predict the location of Twitter users are purely data-driven and require large training data sets of geo-tagged tweets. The collection and modelling process of tweets can be time intensive. To overcome this drawback, we propose a novel knowledge based approach that does not require any training data. Our approach uses information in Wikipedia, about cities in the geographical area of our interest, to score entities most relevant to a city. By semantically matching the scored entities of a city and the entities mentioned by the user in his/her tweets, we predict the most likely location of the user. Using a publicly available benchmark dataset, we achieve a 3% increase in accuracy and an 80-mile drop in the average error distance with respect to the state-of-the-art approaches.
PB - Wright State University
CY - Dayton
ER -
TY - CONF
T1 - Monitoring Patients in Hospital Beds Using Unobtrusive Depth Sensors
T2 - 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Y1 - 2014
A1 - Tanvi Banerjee
A1 - Moein Enayati
A1 - James Keller
A1 - Marjorie Skubic
A1 - Mihail Popescu
A1 - Marilyn Rantz
AB - We present an approach for patient activity recognition in hospital rooms using depth data collected using a Kinect sensor. Depth sensors such as the Kinect ensure that activity segmentation is possible during day time as well as night while addressing the privacy concerns of patients. It also provides a technique to remotely monitor patients in a non-intrusive manner. An existing fall detection algorithm is currently generating fall alerts in several rooms in the University of Missouri Hospital (MUH). In this paper we describe a technique to reduce false alerts such as pillows falling off the bed or equipment movement. We do so by detecting the presence of the patient in the bed for the times when the fall alert is generated. We test our algorithm on 96 hours of data obtained in two hospital rooms from MUH.
JA - 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
PB - IEEE
CY - Chicago, IL
ER -
TY - Generic
T1 - A New CPXR Based Logistic Regression Method and Clinical Prognostic Modeling Results Using the Method on Traumatic Brain Injury
T2 - IEEE 14th International Conference on BioInformatics and BioEngineering (BIBE)
Y1 - 2014
A1 - Vahid Taslimitehrani
A1 - Guozhu Dong
KW - contrast pattern mining
KW - Logistic regression
KW - Prognostic modeling
KW - Traumatic brain injury
AB - Prognostic modeling is central to medicine, as it is often used to predict patients' outcome and response to treatments and to identify important medical risk factors. Logistic regression is one of the most used approaches for clinical prediction modeling. Traumatic brain injury (TBI) is an important public health issue and a leading cause of death and disability worldwide. In this study, we adapt CPXR (Contrast Pattern Aided Regression, a recently introduced regression method), to develop a new logistic regression method called CPXR(Log), for general binary outcome prediction (including prognostic modeling), and we use the method to carry out prognostic modeling for TBI using admission time data. The models produced by CPXR(Log) achieved AUC as high as 0.93 and specificity as high as 0.97, much better than those reported by previous studies. Our method produced interpretable prediction models for diverse patient groups for TBI, which show that different kinds of patients should be evaluated differently for TBI outcome prediction and the odds ratios of some predictor variables differ significantly from those given by previous studies; such results can be valuable to physicians.
JA - IEEE 14th International Conference on BioInformatics and BioEngineering (BIBE)
PB - IEEE
CY - Boca Raton, Florida
ER -
TY - Generic
T1 - A novel web-based depth video rewind approach toward fall preventive interventions in hospitals
T2 - 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Y1 - 2014
A1 - Moein Enayati
A1 - Tanvi Banerjee
A1 - Mihail Popescu
A1 - Marjorie Skubic
A1 - Marilyn Rantz
KW - depth video review
KW - fall alarms generation
KW - fall detection system
KW - fall preventive interventions
KW - hospital rooms
KW - Kinect depth images
KW - patients privacy
KW - shadow-like image capture
KW - video frames
KW - visualization techniques
KW - Web based application
KW - Web-based depth video rewind approach
AB - The purpose of this study was to implement a web based application to provide the ability to rewind and review depth videos captured in hospital rooms to investigate the event chains that led to a patient's fall at a specific time. In this research, Kinect depth images are being used to capture shadow-like images of the patient and their room to resolve concerns about patients' privacy. As a result of our previous research, a fall detection system has been developed and installed in hospital rooms, and fall alarms are generated if any falls are detected by the system. Then nurses will go through the stored depth videos to investigate for possible injury as well as the reasons and events that may have caused the patient's fall to prevent future occurrences. This paper proposes a novel web application to ease the process of searching and reviewing the videos by means of new visualization techniques to highlight video frames that contain potential risk of fall based on our previous research.
JA - 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
PB - IEEE
CY - Chicago, IL
ER -
TY - CONF
T1 - Online Information Seeking for Cardiovascular Diseases: A Case Study from Mayo Clinic
T2 - 25th European Medical Informatics Conference (MIE 2014)
Y1 - 2014
A1 - Ashutosh Jadhav
A1 - Stephen Wu
A1 - Amit Sheth
A1 - Jyotishman Pathak
KW - Health Information Seeking
KW - Search Log analysis
KW - UMLS MetaMap
AB - The objective of this study is to understand the types of health information (health topics) that users search online for Cardiovascular Diseases, by performing categorization of health search queries (from Mayoclinic.com) using UMLS MetaMap based on UMLS concepts and semantic types.
JA - 25th European Medical Informatics Conference (MIE 2014)
CY - Istanbul, Turkey
ER -
TY - Generic
T1 - QFed: Query Set for Federated SPARQL Query Benchmark
T2 - 16th International Conference on Information Integration and Web-based Applications & Services
Y1 - 2014
A1 - Nur Rakhmawati
A1 - Muhammad Saleem
A1 - Sarasi Lalithsena
A1 - Stefan Decker
KW - Data Integration
KW - Federation SPARQL Query
KW - Linked Data
AB - The increasing attention to federated SPARQL query systems emphasizes the necessity for benchmarking systems to evaluate their performance. Most of the existing benchmark systems rely on a set of predefined static queries over a particular set of data sources. Such benchmarks are useful for comparing general purpose SPARQL query federation systems such as FedX, SPLENDID, etc. However, special purpose federation systems such as TopFed, SAFE, etc. cannot be tested with these static benchmarks since these systems only operate on specific data sets and the corresponding queries. To facilitate the process of benchmarking for such special purpose SPARQL query federation systems, we propose QFed, a dynamic SPARQL query set generator that takes into account the characteristics of both dataset and queries along with the cost of data communication. Our experimental results show that QFed can successfully generate a large set of meaningful federated SPARQL queries to be considered for the performance evaluation of different federated SPARQL query engines.
JA - 16th International Conference on Information Integration and Web-based Applications & Services
PB - ACM
CY - Hanoi, Vietnam
ER -
TY - RPRT
T1 - Semantic Gateway as a Service architecture for IoT Interoperability
Y1 - 2014
A1 - Pratikkumar Desai
A1 - Amit Sheth
A1 - Pramod Anantharam
AB - The Internet of Things (IoT) is set to occupy a substantial component of the future Internet. The IoT connects sensors and devices that record physical observations to applications and services of the Internet. As a successor to technologies such as RFID and Wireless Sensor Networks (WSN), the IoT has stumbled into vertical silos of proprietary systems, providing little or no interoperability with similar systems. As the IoT represents the future state of the Internet, an intelligent and scalable architecture is required to provide connectivity between these silos, enabling discovery of physical sensors and interpretation of messages between things. This paper proposes a gateway and Semantic Web enabled IoT architecture to provide interoperability between systems using established communication and data standards. The Semantic Gateway as a Service (SGS) allows translation between messaging protocols such as XMPP, CoAP and MQTT via a multi-protocol proxy architecture. Utilization of broadly accepted specifications such as W3C's Semantic Sensor Network (SSN) ontology for semantic annotations of sensor data provides semantic interoperability between messages and supports semantic reasoning to obtain higher-level actionable knowledge from low-level sensor data.
ER -
TY - Generic
T1 - Semantic Modelling of Smart City Data
T2 - W3C Workshop on the Web of Things: Enablers and Services for an Open Web of Devices
Y1 - 2014
A1 - Stefan Bischof
A1 - Athanasios Karapantelakis
A1 - Cosmin-Septimiu Nechifor
A1 - Amit Sheth
A1 - Alessandra Mileo
A1 - Payam Barnaghi
KW - Semantic Descriptions
KW - Smart City
KW - Web of Things
KW - WoT
AB - Cities present an opportunity for rendering Web of Things-enabled services. According to the World Health Organization, population in cities will double by the middle of this century, while cities deal with increasingly pressing issues such as environmental sustainability, economic growth and citizen mobility. In this paper, we propose a discussion around the need for common semantic descriptions for smart city data to facilitate future services in smart cities. We present examples of data that can be collected from cities, discuss issues around this data and put forward some preliminary thoughts for creating a semantic description model to describe and help discover, index and query smart city data.
JA - W3C Workshop on the Web of Things: Enablers and Services for an Open Web of Devices
CY - Berlin, Germany
ER -
TY - JOUR
T1 - Semantics Driven Approach for Knowledge Acquisition from EMRs
JF - IEEE Journal of Biomedical and Health Informatics
Y1 - 2014
A1 - Sujan Perera
A1 - Cory Henson
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Suhas Nair
AB - Semantic computing technologies have matured to be applicable to many critical domains such as national security, life sciences, and health care. However, the key to their success is the availability of a rich domain knowledge base. The creation and refinement of domain knowledge bases pose difficult challenges. The existing knowledge bases in the health care domain are rich in taxonomic relationships, but they lack non-taxonomic (domain) relationships. In this paper, we describe a semi-automatic technique for enriching existing domain knowledge bases with causal relationships gleaned from Electronic Medical Records (EMR) data. We determine missing causal relationships between domain concepts by validating domain knowledge against EMR data sources and leveraging semantic-based techniques to derive plausible relationships that can rectify knowledge gaps. Our evaluation demonstrates that semantic techniques can be employed to improve the efficiency of knowledge acquisition.
VL - 18
CP - 2
ER -
TY - MGZN
T1 - Semantics empowered Big Data Processing with Applications
Y1 - 2014
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
AB - We discuss the nature of big data and address the role of semantics in analyzing and processing big data that arises in the context of physical-cyber-social systems. To handle volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision making. To handle variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of heterogeneity of data formats and media. To handle velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize relevant new concepts, entities and facts. To handle veracity, we explore trust models and approaches to glean trustworthiness. These four of the five V's of big data are harnessed by the semantics-empowered analytics to derive value to support applications transcending the physical-cyber-social continuum.
JA - AI Magazine
VL - 36
CP - 1
ER -
TY - Generic
T1 - Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
Y1 - 2014
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
AB - We present our research ideas for developing cyberinfrastructure for Geoscience applications developed in the context of the EarthCube initiative, and our NSF-sponsored work on incorporating spatial-temporal-thematic semantics for enhanced querying and feature extraction from sensor data streams.
PB - Geospatial Semantics Workshop and GeoVoCamp
CY - Madison, WI
ER -
TY - Generic
T1 - Simultaneous Detection of Communities and Roles from Large Networks
T2 - 2nd ACM Conference on Online Social Networks
Y1 - 2014
A1 - Yiye Ruan
A1 - Srinivasan Parthasarathy
KW - community detection
KW - role detection
KW - social networks
KW - structural role
AB - Community detection and structural role detection are two distinct but closely-related perspectives in network analytics. In this paper, we propose RC-Joint, a novel algorithm to simultaneously identify community and structural role assignments in a network. Rather than being agnostic to one assignment while inferring the other, RC-Joint employs a principled approach to guide the detection process in a nonparametric fashion and ensures that the two sets of assignments are sufficiently different from each other. Roles and communities generated by RC-Joint are both soft assignments, reflecting the fact that many real-world networks have overlapping community structures and role memberships. By comparing with state-of-the-art methods in community detection and structural role detection, we demonstrate that RC-Joint harvests the best of two worlds and outperforms existing approaches, while still being competitive in efficiency. We also investigate the effect of different initialization schemes, and find that using the results of RC-Joint on a sparse network as the seed often leads to faster convergence and higher quality.
JA - 2nd ACM Conference on Online Social Networks
PB - ACM
CY - New York, NY
ER -
TY - JOUR
T1 - Sit-to-stand Measurement For In Home Monitoring Using Voxel Analysis
JF - IEEE Transactions in IT and Biomedicine
Y1 - 2014
A1 - Tanvi Banerjee
A1 - Marjorie Skubic
A1 - James Keller
A1 - Carmen Abbott
KW - Activity recognition
KW - eldercare technology
KW - ellipse fit
KW - sit-to-stand (STS)
KW - voxel
AB - We present algorithms to segment the activities of sitting and standing, and identify the regions of sit-to-stand (STS) transitions in a given image sequence. As a means of fall risk assessment, we propose methods to measure STS time using the 3-D modeling of a human body in voxel space as well as ellipse fitting algorithms and image features to capture orientation of the body. The proposed algorithms were tested on ten older adults with ages ranging from 83 to 97. Two techniques in combination yielded the best results, namely the voxel height in conjunction with the ellipse fit. Accurate STS time was computed on various STSs and verified using a marker-based motion capture system. This application can be used as part of a continuous video monitoring system in the homes of older adults and can provide valuable information to help detect fall risk and enable early interventions.
VL - 18
CP - 4
ER -
TY - JOUR
T1 - Transmission of data with orthogonal frequency division multiplexing technique for communication networks using GHz frequency band soliton carrier
JF - IET Communications
Y1 - 2014
A1 - Iraj Amiri
A1 - Monireh Ebrahimi
A1 - Amir Yazdavar
A1 - S. Ghorbani
A1 - S. Alavi
A1 - S. Idrus
A1 - J. Ali
KW - discrete wavelet transforms
KW - fast Fourier transforms
KW - intercarrier
KW - interference
KW - micro-optomechanical devices
KW - micromechanical resonators
KW - microwave photonics
KW - OFDM modulation
KW - optical fibre networks
KW - optical resonators
KW - optical solitons
AB - Microring resonators (MRRs) can be used to generate optical millimetre-wave solitons with a broadband frequency of 40-60 GHz. Non-linear light behaviours within MRRs, such as chaotic signals, can be used to generate logic codes (digital codes). The soliton signals can be multiplexed and modulated with the logic codes using an orthogonal frequency division multiplexing (OFDM) technique to transmit the data via a network system. OFDM uses overlapping subcarriers without causing inter-carrier interference. It provides both a high data rate and symbol duration using frequency division multiplexing over multiple subcarriers within one channel. The results show that MRRs support both single-carrier and multi-carrier optical soliton pulses, which can be used in an OFDM transmission/receiver system based on either fast Fourier transform or discrete wavelet transform. Localised ultra-short soliton pulses within frequencies of 50 and 52 GHz can be seen at the throughput port of the panda system with respect to full-width at half-maximum (FWHM) and free spectrum range of 5 MHz and 2 GHz, respectively. The soliton pulses with FWHMs of 10 MHz could be generated at the drop port. Therefore, transmission of data information can be performed via a communication network using soliton pulse carriers and an OFDM technique.
VL - 8
CP - 8
ER -
TY - CHAP
T1 - Triad-based Role Discovery for Large Social Systems
T2 - Social Informatics: SocInfo 2014 International Workshops
Y1 - 2014
A1 - Derek Doran
ED - Luca Maria Aiello
ED - Daniel McFarland
KW - Network analytics
KW - Social Network Analysis
KW - Social Roles
AB - The social role of a participant in a social system conceptualizes the circumstances under which she chooses to interact with others, making their discovery and analysis important for theoretical and practical purposes. In this paper, we propose a methodology to detect such roles by utilizing the conditional triad censuses of ego-networks. These censuses are a promising tool for social role extraction because they capture the degree to which basic social forces push upon a user to interact with others in a system. Clusters of triad censuses, inferred from network samples that preserve local structural properties, define the social roles. The approach is demonstrated on two large online interaction networks.
JA - Social Informatics: SocInfo 2014 International Workshops
PB - Springer International Publishing
VL - 8852
ER -
TY - CHAP
T1 - Twitris: A System for Collective Social Intelligence
T2 - Encyclopedia of Social Network Analysis and Mining
Y1 - 2014
A1 - Amit Sheth
A1 - Ashutosh Jadhav
A1 - Pavan Kapanipathi
A1 - Lu Chen
A1 - Hemant Purohit
A1 - Gary Alan Smith
A1 - Wenbo Wang
ED - Reda Alhajj
ED - Jon Rokne
KW - Citizen sensing
KW - Community evolution
KW - Event analysis on social media
KW - Interaction Network
KW - People-Content-Network Analysis
KW - Real-time social media analysis
KW - Semantic Perception
KW - semantic social web
KW - Sentiment-Emotion-Intent Analysis
KW - Social Computing
KW - Social data analysis
KW - Social Media
KW - social media analysis
KW - Spatio-temporal-thematic analysis
KW - twitris
KW - Web 3.0
AB - Twitris is a Semantic Web application that facilitates understanding of social perceptions by semantics-based processing of massive amounts of event-centric data. Twitris addresses challenges in large scale processing of social data, preserving spatio-temporal-thematic properties and focusing on multi-dimensional analysis of spatio-temporal-thematic, people-content-network and sentiment-emotion-subjectivity facets. Twitris also covers context based semantic integration of multiple Web resources and exposes semantically enriched social data to the public domain. Semantic Web technologies enable the system's integration and analysis abilities. It has applications for studying and analyzing social sensing and perception of a broad variety of events: politics and elections, social movements and uprisings, crisis and disasters, entertainment, environment, decision making and coordination, brand management, campaign effectiveness, etc.
JA - Encyclopedia of Social Network Analysis and Mining
PB - Springer-Verlag New York
CY - New York
ER -
TY - CONF
T1 - Understanding Common Perceptions from Online Social Media
T2 - International Conference of Software Engineering and Knowledge Engineering
Y1 - 2014
A1 - Derek Doran
A1 - Swapna Gokhale
A1 - Aldo Dagnino
AB - Modern society habitually uses online social media services to publicly share observations, thoughts, opinions, and beliefs at any time and from any location. These geotagged social media posts may provide aggregate insights into people's perceptions on a broad range of topics across a given geographical area beyond what is currently possible through services such as Yelp and Foursquare. This paper develops probabilistic language models to investigate whether collective, topic-based perceptions within a geographical area can be extracted from the content of geotagged Twitter posts. The capability of the methodology is illustrated using tweets from three areas of different sizes. An application of the approach to support power grid restoration following a storm is presented.
JA - International Conference of Software Engineering and Knowledge Engineering
CY - Vancouver, Canada
ER -
TY - CONF
T1 - On Understanding Divergence of Online Social Group Discussion
T2 - 8th International AAAI Conference on Weblogs and Social Media (ICWSM 2014)
Y1 - 2014
A1 - Hemant Purohit
A1 - Yiye Ruan
A1 - Dave Fuhry
A1 - Srinivasan Parthasarathy
A1 - Amit Sheth
KW - Coordination
KW - Group Discussion Divergence
KW - group dynamics
KW - self organization
KW - social cohesion
KW - Social dynamics
KW - social identity
KW - Social Media
AB - We study online social group dynamics based on how group members diverge in their online discussions. Previous studies mostly focused on the link structures to characterize social group dynamics, whereas the group behavior of content generation in discussions is not well understood. Particularly, we use Jensen-Shannon (JS) divergence to measure the divergence of topics in user-generated contents, and how it progresses over time. We study Twitter messages (tweets) in multiple real-world events (natural disasters and social activism) with different times and demographics. We also model structural and user features with guidance from two socio-psychological theories, social cohesion and social identity, to learn their implications on group discussion divergence. Those features show significant correlation with group discussion divergence. By leveraging them we are able to construct a classifier to predict the future increase or decrease in group discussion divergence, which achieves an area under the curve (AUC) of 0.84 and an F-1 score (harmonic mean of precision and recall) of 0.8. Our approach allows systematic study of collective diverging group behavior independent of group formation design. It can help to prioritize whom to engage with in communities for specific topics of needs during disaster response coordination, and for specific concerns and advocacy in brand management.
JA - 8th International AAAI Conference on Weblogs and Social Media (ICWSM 2014)
CY - University Park, PA
ER -
TY - CONF
T1 - U.S. Religious Landscape on Twitter
T2 - 6th International Conference on Social Informatics (SocInfo 2014)
Y1 - 2014
A1 - Lu Chen
A1 - Ingmar Weber
A1 - Adam Okulicz-Kozaryn
KW - Correlation Analysis
KW - Linkage Preference
KW - Religion
KW - Religiosity on Twitter
KW - twitter
AB - Religiosity is a powerful force shaping human societies, affecting domains as diverse as economic growth or the ability to cope with illness. As more religious leaders and organizations as well as believers start using social networking sites (e.g., Twitter, Facebook), online activities become important extensions to traditional religious rituals and practices. However, there has been a lack of research on religiosity in online social networks. This paper takes a step toward the understanding of several important aspects of religiosity on Twitter, based on the analysis of more than 250k U.S. users who self-declared their religions/beliefs, including Atheism, Buddhism, Christianity, Hinduism, Islam, and Judaism. Specifically, (i) we examine the correlation of geographic distribution of religious people between Twitter and offline surveys. (ii) We analyze users' tweets and networks to identify discriminative features of each religious group, and explore supervised methods to identify believers of different religions. (iii) We study the linkage preference of different religious groups, and observe a strong preference of Twitter users connecting to others sharing the same religion.
JA - 6th International Conference on Social Informatics (SocInfo 2014)
CY - Barcelona, Spain
ER -
TY - Generic
T1 - User Interests Identification on Twitter Using a Hierarchical Knowledge Base
T2 - 11th Extended Semantic Web Conference (ESWC 2014)
Y1 - 2014
A1 - Pavan Kapanipathi
A1 - Prateek Jain
A1 - Chitra Venkataramani
A1 - Amit Sheth
ED - Valentina Presutti
ED - Milan Stankovic
ED - Erik Cambria
ED - Iván Cantador
ED - Angelo Di Iorio
ED - Tommaso Di Noia
ED - Christoph Lange
ED - Diego Reforgiato-Recupero
ED - Anna Tordai
KW - Hierarchical Interest Graph
KW - personalization
KW - Semantics
KW - Social Web
KW - twitter
KW - User Profiles
KW - Wikipedia
AB - Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts to determine user interests has been an active area of research in the recent past. These approaches typically use available public knowledge-bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge-bases to create richer user profiles has yet to be explored. In this work, we leverage hierarchical relationships present in knowledge-bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests.
JA - 11th Extended Semantic Web Conference (ESWC 2014)
PB - Springer International Publishing
CY - Crete, Greece
ER -
TY - Generic
T1 - Visualization Support for Cognitive Sciences
T2 - Midwestern Cognitive Science Conference (MWCogSci 14)
Y1 - 2014
A1 - Matthew Marangoni
A1 - Thomas Wischgoll
A1 - Yue Zhou
A1 - Leslie Blaha
A1 - Ross Smith
A1 - Rhonda Vickery
JA - Midwestern Cognitive Science Conference (MWCogSci 14)
CY - Dayton, OH
ER -
TY - Generic
T1 - Visualizing Confusion Matrices for Multidimensional Signal Detection Correlational Methods
T2 - Visualization and Data Analysis (VDA 2014)
Y1 - 2014
A1 - Yue Zhou
A1 - Thomas Wischgoll
A1 - Leslie Blaha
A1 - Ross Smith
A1 - Rhonda Vickery
KW - Analytics
KW - D3
KW - General Recognition Theory
KW - Information Visualization
KW - Infovis
KW - Javascript
KW - Scalable Vector Graphics (SVG)
KW - Web-browser-based Visualization
AB - Advances in modeling and simulation for General Recognition Theory have produced more data than can be easily visualized using traditional techniques. In this area of psychological modeling, domain experts are struggling to find effective ways to compare large-scale simulation results. This paper describes methods that adapt the web-based D3 visualization framework combined with pre-processing tools to enable domain specialists to more easily interpret their data. The D3 framework utilizes JavaScript and scalable vector graphics (SVG) to generate visualizations that can run readily within the web browser for domain specialists. Parallel coordinate plots and heat maps were developed for identification-confusion matrix data, and the results were shown to a GRT expert for an informal evaluation of their utility. There is a clear benefit to model interpretation from these visualizations when researchers need to interpret larger amounts of simulated data.
JA - Visualization and Data Analysis (VDA 2014)
CY - San Francisco, CA
ER -
TY - MGZN
T1 - The Web of Things
Y1 - 2014
A1 - Steven Gustafson
A1 - Amit Sheth
KW - Industrial Internet
KW - Internet of Everything
KW - Internet of Things
KW - Web of Things
AB - The Internet of Things (IoT) is an extension to the current Internet that enables connections and communication among physical objects and devices (see the September 2013 Computing Now theme for more on IoT and its role in ubiquitous sensing). Estimates suggest that there will be 50 billion devices and people connected and leveraging the vision and technology behind IoT by 2020. A related term that's currently somewhat in vogue is Internet of Everything (IoE), which recognizes the key role of people, or citizen sensing (such as through online social media), to complement the physical sensing implied by IoT. The term Web of Things (WoT) goes beyond the focus on the Internet as the mode of exchanging data, instead bringing in all resources and interactions involving devices, data, and people on the Web. Correspondingly, it brings into focus a wide variety of challenges and opportunities while paving the way to a variety of exciting applications for users ranging from individuals to industries.
JA - Computing Now
VL - 7
CP - 3
ER -
TY - CONF
T1 - What Information about Cardiovascular Diseases do People Search Online?
T2 - 25th European Medical Informatics Conference (MIE 2014)
Y1 - 2014
A1 - Ashutosh Jadhav
A1 - Stephen Wu
A1 - Amit Sheth
A1 - Jyotishman Pathak
KW - Health Information Seeking
KW - Search Log analysis
KW - UMLS MetaMap
AB - The objective of this study is to understand the types of health information (health topics) that users search online for Cardiovascular Diseases, by performing categorization of health search queries (from Mayoclinic.com) using UMLS MetaMap based on UMLS concepts and semantic types.
JA - 25th European Medical Informatics Conference (MIE 2014)
CY - Istanbul, Turkey
ER -
TY - CONF
T1 - When Less is More: A Web-Based Study of User Beliefs about Buprenorphine Dosing in Self-Treatment of Opioid Withdrawal Symptoms
T2 - College on Problems of Drug Dependence (CPDD 2014)
Y1 - 2014
A1 - Raminta Daniulaityte
A1 - Robert Carlson
A1 - Delroy Cameron
A1 - Gary Alan Smith
A1 - Amit Sheth
AB - There is growing evidence of an alarming increase in the illicit use of buprenorphine in the U.S., but our understanding of its use is limited because current epidemiologic systems do not systematically monitor buprenorphine. This study aims to explore Web-based data on illicit buprenorphine use, focusing on user beliefs about the appropriate dosing in self-treatment of opioid withdrawal. A web forum that allows free discussion of illicit drugs and is accessible for public viewing was selected for analysis. Posts that contained discussions of buprenorphine and opioid withdrawal symptoms were retrieved using PREDOSE, a novel semantic web platform developed for the information extraction and analysis of social web data on illicit drugs. All unique user names were anonymized. A total of 1,140 posts were retrieved, covering a time period between 2005 and May 2013. These posts were uploaded to an NVivo database. A random sample of 378 (33%) posts was selected for content analysis. The number of buprenorphine-related posts increased from 46 in 2005 to 1,012 in 2009 and 4,376 in 2011. Over 65% of coded posts that contained information about buprenorphine dose in the self-treatment of withdrawal symptoms endorsed and/or advocated the use of significantly lower amounts of buprenorphine (2 mg and lower) than the typical doses of 16-24 mg per day recommended for standard treatment. Such posts expressed a belief that lower doses of buprenorphine are more effective in the self-treatment of opioid dependence, while the physician-prescribed dosage is too high. Thus, prescribed doses can be 'conserved' or shared with others. Social Web data suggest that the 'less is more' approach to buprenorphine dosing may be fairly prevalent among illicit opioid users and may be one of the contributing factors to the increasing availability of diverted buprenorphine. Our findings highlight the importance of Web-based data in drug abuse epidemiology research.
JA - College on Problems of Drug Dependence (CPDD 2014)
CY - San Juan, Puerto Rico
ER -
TY - Generic
T1 - Wisdom Application in Swarm: A WisSwarm Approach
T2 - 2014 International Conference on Optimization, Reliability, and Information Technology (ICROIT)
Y1 - 2014
A1 - Utkarshani Jaimini
A1 - V. Panchal
KW - Rough granular computing
KW - Swarm Intelligence
KW - Wisdom Technology
KW - WisSwarm
AB - In Swarm Intelligence, every single agent works in a group as a system to solve a problem. There is no centralized force governing the system. Each agent uses its own wisdom to work and collaborate with its fellow agents to constitute a swarm intelligence. Therefore, wisdom plays a key role in swarm intelligence. Without wisdom, problem solving is an impossible task in every domain of life. This combination of Wisdom and Swarm is known as WisSwarm (Wisdom in Swarm).
JA - 2014 International Conference on Optimization, Reliability, and Information Technology (ICROIT)
PB - IEEE
CY - Faridabad, India
ER -
TY - CONF
T1 - With Whom to Coordinate, Why and How in Ad-hoc Social Media Communities during Crisis Response
T2 - 11th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2014)
Y1 - 2014
A1 - Hemant Purohit
A1 - Shreyansh Bhatt
A1 - Andrew Hampton
A1 - Valerie Shalin
A1 - Amit Sheth
A1 - John Flach
KW - Ad-hoc communities
KW - crisis computing
KW - crisis coordination
KW - Crisis Informatics
KW - crisis response
KW - crisis response coordination
KW - influential virtual responders
KW - Social Media
KW - social media engagement
AB - During crises, affected people, well-wishers, and observers join social media communities to discuss the event. They often share useful information relevant to response coordination, for example, specific resource needs. However, responders face the challenge of massive data overload and lack the time to monitor social media traffic for important information. Analysis shows that only a small number of event-related conversations are actionable. Moreover, responders do not know which sources are trustworthy. To address these challenges, response teams may apply manual filtering methods, resulting in limited coverage and quality. We propose a framework and interface for extracting specific resource-related information and engaging with influential users in the evolving social media community. These users can act as both sources and disseminators of important information to assist coordination, thereby emerging as virtual responders.
JA - 11th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2014)
CY - University Park, PA
ER -
TY - CONF
T1 - YouRank: Let User Engagement Rank Microblog Search Results
T2 - 8th International AAAI Conference on Weblogs and Social Media
Y1 - 2014
A1 - Wenbo Wang
A1 - Lei Duan
A1 - Anirudh Koul
A1 - Amit Sheth
KW - Ad-hoc communities
KW - crisis computing
KW - crisis coordination
KW - Crisis Informatics
KW - crisis response
KW - crisis response coordination
KW - influential virtual responders
KW - Social Media
KW - social media engagement
AB - We propose an approach for ranking microblog search results. The basic idea is to leverage user engagement for the purpose of ranking: if a microblog post received many retweets/replies, this means users find it important and it should be ranked higher. However, simply applying the raw count of engagement may bias the ranking by favoring posts from celebrity users whose posts generally receive a disproportionate amount of engagement regardless of the contents of posts. To reduce this bias, we propose a variety of time window-based outlier features that transform the raw engagement count into an importance score, on a per-user basis. The evaluation on five real-world datasets confirms that the proposed approach can be used to improve microblog search.
JA - 8th International AAAI Conference on Weblogs and Social Media
CY - Ann Arbor, MI
ER -
TY - JOUR
T1 - Accurate Analysis of Angiograms based on 3D Vector Field Topology
JF - Computer Aided Geometric Design
Y1 - 2013
A1 - Thomas Wischgoll
AB - Cardiovascular diseases are still the number one killer in the United States. The typical diagnostic method is using angiograms for detecting these types of diseases. As is the case with many diseases, early detection can help reduce further progression or enable physicians to take countermeasures early on. Hence, accurate analysis techniques are needed for processing these angiogram data sets. In order to perform such analysis of CTA (Computed Tomography Angiograms) data sets, accurate measurements of the coronary vasculature have to be extracted from the volumetric data, such as vessel length, vessel bifurcation angles, cross-sectional area, and vessel volume. These measurements can then be used to discriminate healthy cases from diseased cases. Therefore, this article describes an improved segmentation algorithm based on a hybrid approach between isovalue and image-gradient segmentation and a center line extraction method utilizing 3D vector field topology analysis. Based on the center lines of the coronary vessels found in the angiogram, quantitative measurements that can help in the diagnostic process are then computed.
VL - 30
CP - 6
ER -
TY - THES
T1 - Adaptive Semantic Annotation of Entity and Concept Mentions in Text
T2 - Department of Engineering and Computer Science
Y1 - 2013
A1 - Pablo Mendes
KW - Concept Tagging
KW - entity Disambiguation
KW - entity Extraction
KW - entity Linking
KW - Entity Tagging
KW - named Entity Recognition
KW - phrase Recognition
KW - phrase Spotting
KW - Semantic Annotation
KW - semantic Markup
KW - tag Extraction
KW - topic Indexing
KW - word Sense Disambiguation
KW - Word Spotting
AB - Recent years have seen an increase in interest in knowledge repositories that are useful across applications, in contrast to the creation of ad hoc or application-specific databases. These knowledge repositories figure as a central provider of unambiguous identifiers and semantic relationships between entities. As such, these shared entity descriptions serve as a common vocabulary to exchange and organize information in different formats and for different purposes. Therefore, there has been remarkable interest in systems that are able to automatically tag textual documents with identifiers from shared knowledge repositories so that the content in those documents is described in a vocabulary that is unambiguously understood across applications. Tagging textual documents according to these knowledge bases is a challenging task. It involves recognizing the entities and concepts that have been mentioned in a particular passage and attempting to resolve eventual ambiguity of language in order to choose one of many possible meanings for a phrase. There has been substantial work on recognizing and disambiguating entities for specialized applications, or constrained to limited entity types and particular types of text. In the context of shared knowledge bases, since each application has potentially very different needs, systems must have unprecedented breadth and flexibility to ensure their usefulness across applications. Documents may exhibit different language and discourse characteristics, discuss very diverse topics, or require the focus on parts of the knowledge repository that are inherently harder to disambiguate. In practice, for developers looking for a system to support their use case, it is often unclear if an existing solution is applicable, leading those developers to trial-and-error and ad hoc usage of multiple systems in an attempt to achieve their objective.
In this dissertation, I propose a conceptual model that unifies related techniques in this space under a common multidimensional framework that enables the elucidation of strengths and limitations of each technique, supporting developers in their search for a suitable tool for their needs. Moreover, the model serves as the basis for the development of flexible systems that are able to support document tagging for different use cases. I describe such an implementation, DBpedia Spotlight, along with extensions that we performed to the knowledge base DBpedia to support this implementation. I report evaluations of this tool on several well-known datasets, and demonstrate applications to diverse use cases for further validation.
JA - Department of Engineering and Computer Science
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - JOUR
T1 - Advancing Data Reuse in Phyloinformatics Using an Ontology-driven Semantic Web Approach
JF - BMC Medical Genomics
Y1 - 2013
A1 - Maryam Panahiazar
A1 - Amit Sheth
A1 - Ajith Ranabahu
A1 - Rutger Vos
A1 - Jim Leebens-Mack
KW - Ontology
KW - Phylogenetic analyses
KW - Semantic Technology
AB - Phylogenetic analyses can resolve historical relationships among genes, organisms or higher taxa. Understanding such relationships can elucidate a wide range of biological phenomena, including, for example, the importance of gene and genome duplications in the evolution of gene function, the role of adaptation as a driver of diversification, or the evolutionary consequences of biogeographic shifts. Phyloinformaticists are developing data standards, databases and communication protocols (e.g. Application Programming Interfaces, APIs) to extend the accessibility of gene trees, species trees, and the metadata necessary to interpret these trees, thus enabling researchers across the life sciences to reuse phylogenetic knowledge. Specifically, Semantic Web technologies are being developed to make phylogenetic knowledge interpretable by web agents, thereby enabling intelligently automated, high-throughput reuse of results generated by phylogenetic research. This manuscript describes an ontology-driven, semantic problem-solving environment for phylogenetic analyses and introduces artefacts that can support phyloinformatic efforts to promote accessibility of trees and underlying metadata. PhylOnt is an extensible ontology with concepts describing tree types and tree building methodologies including estimation methods, models and programs. In addition, we present the PhylAnt platform for annotating scientific articles and NeXML files with PhylOnt concepts. The novelty of this work is the annotation of NeXML files and phylogenetics-related documents with PhylOnt Ontology. This approach advances data reuse in phyloinformatics.
VL - 6
CP - 3
ER -
TY - JOUR
T1 - Application Portability in Cloud Computing: An Abstraction Driven Perspective
JF - IEEE Transactions on Services Computing
Y1 - 2013
A1 - Ajith Ranabahu
A1 - Eugene Maximilien
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - application generation
KW - Cloud Computing
KW - domain-specific language
AB - Cloud computing has changed the way organizations create, manage, and evolve their applications. While the abundance of computing resources at low cost opens up many possibilities for migrating applications to the cloud, this migration also comes at a price. Cloud applications, in many cases, depend on certain provider-specific features or services. In moving applications to the cloud, application developers face the challenge of balancing these dependencies to avoid vendor lock-in. We present an abstraction-driven approach to address the application portability issues and focus on the application development process. We also present our theoretical basis and experience in two practical projects where we have applied the abstraction-driven approach.
VL - PP
CP - 99
ER -
TY - CONF
T1 - Automatic Domain Identification for Linked Open Data
T2 - 2013 IEEE/WIC/ACM International Conference on Web Intelligence
Y1 - 2013
A1 - Sarasi Lalithsena
A1 - Prateek Jain
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - Dataset search
KW - Domain Identification
KW - Linked Open Data Cloud
AB - Linked Open Data (LOD) has emerged as one of the largest collections of interlinked structured datasets on the Web. Although the adoption of such datasets for applications is increasing, identifying relevant datasets for a specific task or topic is still challenging. As an initial step to make such identification easier, we provide an approach to automatically identify the topic domains of given datasets. Our method utilizes existing knowledge sources, more specifically Freebase, and we present an evaluation which validates the topic domains we can identify with our system. Furthermore, we evaluate the effectiveness of identified topic domains for the purpose of finding relevant datasets, thus showing that our approach improves reusability of LOD datasets.
JA - 2013 IEEE/WIC/ACM International Conference on Web Intelligence
PB - ACM
CY - Atlanta, GA
U1 - Full citation
Sarasi Lalithsena, Prateek Jain, Pascal Hitzler, and Amit Sheth, "Automatic Domain Identification for Linked Open Data," 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Atlanta, GA, 2013, pp. 205-212.
doi: 10.1109/WI-IAT.2013.206
ER -
TY - Generic
T1 - Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help
T2 - International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM)
Y1 - 2013
A1 - Sujan Perera
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
A1 - Suhas Nair
A1 - Neil Shah
KW - electronic medical records
KW - Knowledge Base
KW - NLP shortcomings
AB - Understanding of Electronic Medical Records (EMRs) plays a crucial role in improving healthcare outcomes. However, the unstructured nature of EMRs poses several technical challenges for structured information extraction from clinical notes leading to automatic analysis. Although Natural Language Processing (NLP) techniques developed to process EMRs are effective for a variety of tasks, they often fail to preserve the semantics of original information expressed in EMRs, particularly in complex scenarios. This paper illustrates the complexity of the problems involved, deals with conflicts created by the shortcomings of NLP techniques, and demonstrates where domain-specific knowledge bases can come to the rescue in resolving conflicts, which can significantly improve semantic annotation and structured information extraction. We discuss various insights gained from our study on a real-world dataset.
JA - International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM)
CY - Burlingame, California
ER -
TY - Generic
T1 - Characterising Concepts of Interest Leveraging Linked Data and the Social Web
T2 - 2013 IEEE/WIC/ACM International Conference on Web Intelligence
Y1 - 2013
A1 - Fabrizio Orlandi
A1 - Pavan Kapanipathi
A1 - Amit Sheth
A1 - Alexandre Passant
AB - Extracting and representing user interests on the Social Web is becoming an essential part of the Web for personalisation and recommendations. Such personalisation is required in order to provide an adaptive Web to users, where content fits their preferences, background and current interests, making the Web more social and relevant. Current techniques analyse user activities on social media systems and collect structured or unstructured sets of entities representing users' interests. These sets of entities, or user profiles of interest, are often missing the semantics of the entities in terms of: (i) popularity and temporal dynamics of the interests on the Social Web and (ii) abstractness of the entities in the real world. State-of-the-art techniques for computing these values use specific knowledge bases or taxonomies and need to analyse the dynamics of the entities over a period of time. Hence, we propose a real-time, computationally inexpensive, domain-independent model for concepts of interest composed of: popularity, temporal dynamics and specificity. We describe and evaluate a novel algorithm for computing specificity leveraging the semantics of Linked Data and evaluate the impact of our model on user profiles of interests.
JA - 2013 IEEE/WIC/ACM International Conference on Web Intelligence
CY - Atlanta, GA
ER -
TY - CONF
T1 - City Notifications as a Data Source for Traffic Management
T2 - 20th ITS World Congress
Y1 - 2013
A1 - Pramod Anantharam
A1 - Biplav Srivastava
KW - City Notifications
KW - Developing Countries
KW - information extraction
KW - Traffic Data
AB - A common problem for cities of developing countries like India in managing traffic is the lack of basic automated instrumentation to track road conditions or vehicle locations. Still, to help their citizens make informed travel decisions based on changing city dynamics, many cities have an authorized, city-initiated notification service in place to alert subscribing commuters about road conditions. Here, alternative means may be used to create informal textual notifications (e.g., inputs from field personnel, citizen updates, and pre-authorized events from a city calendar). In this paper, we show that collections of such notifications, when processed with information extraction techniques, can become a rich source of data for traffic managers. Specifically, we use Short Message Service (SMS) notifications from the city of Delhi, India to show promising insights.
JA - 20th ITS World Congress
CY - Tokyo, Japan
ER -
TY - JOUR
T1 - Comparative Trust Management with Applications: Bayesian Approaches Emphasis
JF - Future Generation Computer Systems
Y1 - 2013
A1 - Krishnaprasad Thirunarayan
A1 - Pramod Anantharam
A1 - Cory Henson
A1 - Amit Sheth
KW - beta-PDF
KW - binary and multi-level trust
KW - collaborative systems
KW - Dirichlet distribution
KW - gleaning trustworthiness
KW - social and sensor networks
KW - trust metrics and models (propagation: chaining and aggregation)
KW - trust ontology
KW - trust system attacks
KW - trust vs. reputation
AB - Trust relationships occur naturally in many diverse contexts such as collaborative systems, e-commerce, interpersonal interactions, social networks, and the semantic sensor web. As agents providing content and services become increasingly removed from the agents that consume them, the issue of robust trust inference and update becomes critical. There is a need to find online substitutes for traditional (direct or face-to-face) cues to derive measures of trust, and create efficient and robust systems for managing trust in order to support decision-making. Unfortunately, there is neither a universal notion of trust that is applicable to all domains nor a clear explication of its semantics or computation in many situations. We motivate the trust problem, explain the relevant concepts, summarize research in modeling trust and gleaning trustworthiness, and discuss challenges confronting us. The goal is to provide a comprehensive broad overview of the trust landscape, with the nitty-gritty details of a handful of approaches. We also provide details of the theoretical underpinnings and comparative analysis of Bayesian approaches to binary and multi-level trust, to automatically determine trustworthiness in a variety of reputation systems including those used in sensor networks, e-commerce, and collaborative environments. Ultimately, we need to develop expressive trust networks that can be assigned objective semantics.
VL - 30
CP - 6
ER -
TY - JOUR
T1 - Computed Tomography-Based Diagnosis of Diffuse Compensatory Enlargement of Coronary Arteries Using Scaling Power-Laws
JF - Journal of The Royal Society Interface
Y1 - 2013
A1 - Yunlong Huo
A1 - Jenny Choy
A1 - Thomas Wischgoll
A1 - Tong Luo
A1 - Shawn Teague
A1 - Deepak Bhatt
A1 - Ghassan Kassab
ER -
TY - CHAP
T1 - Configural and Pictorial Displays
T2 - The Oxford Handbook of Cognitive Engineering
Y1 - 2013
A1 - Kevin Bennett
A1 - John Flach
ED - John Lee
ED - Alex Kirlik
AB - A framework for ecological display design is presented from the cognitive systems engineering perspective. This triadic approach views the interface as a medium that stands between the domain (situations) and the human (awareness); effective displays “close the loop” of this dynamical system, thereby supporting abductive learning and reasoning processes. All three of these system components (domain, interface, agent) and the interactions between them are incorporated into a single framework for display design. Analogical representations (i.e., configural displays) will typically provide good solutions for law-driven domains (e.g., process control); the visual properties (i.e., emergent features) should reflect the constraints of the underlying work domain. Metaphorical representations (i.e., pictorial displays) will typically provide the best options for intent-driven domains (e.g., mobile phones); the visual properties (icons) should leverage existing knowledge and support the processes of assimilation and accommodation. The role of CSE work domain analysis tools in obtaining the information needed to implement these alternative designs is described.
JA - The Oxford Handbook of Cognitive Engineering
PB - Oxford University Press
CY - Oxford
ER -
TY - CHAP
T1 - Coordination and Control in Emergency Response
T2 - Handbook of Emergency Response: A Human Factors and Systems Engineering Approach
Y1 - 2013
A1 - John Flach
A1 - Debra Steele-Johnson
A1 - Valerie Shalin
A1 - Glenn Hamilton
ED - Adedeji Badiru
ED - LeeAnn Racz
AB - On the afternoon of Sunday, September 14, 2008, the remnants from the Gulf Coast's Hurricane Ike moved through Ohio. Although not as devastating as a tornado, hurricane-force winds caused substantial storm damage and extensive loss of power. The high winds cut electrical service to more than half of the local power company's one-half million regional customers. The company's website initially indicated that service restoration would be a multi-day effort. However, full restoration was not reported until 2 weeks after the initial storm, on September 29, 2008. The result was that significant stress was placed on the regional medical emergency response resources. For example, 1100 new patients reported to local emergency departments in the first week of the power outage, and some ambulances were diverted from the nearest emergency rooms due to overflowing capacity. In this chapter, we share lessons learned from extensive interviews with a wide range of participants in this emergency event. We summarize our observations in the context of theoretical work on the dynamics of complex adaptive organizations. Our goal is to explicate theories of complex adaptive organizations by grounding them in the concrete events associated with this particular emergency situation. We also provide some practical suggestions that might be useful in preparing for future emergency situations.
JA - Handbook of Emergency Response: A Human Factors and Systems Engineering Approach
PB - CRC Press
CY - Boca Raton
ER -
TY - CONF
T1 - Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citizen Roles for Crisis Response
T2 - 7th International AAAI Conference on Weblogs and Social Media
Y1 - 2013
A1 - Hemant Purohit
A1 - Amit Sheth
A1 - Carlos Castillo
A1 - Patrick Meier
KW - Citizen sensing
KW - Crisis Mapping
KW - crisis response
KW - crisis response coordination
KW - disaster response
KW - Emergency Response
KW - Social Media Analytics
AB - With the explosion in social media (1B+ Facebook users, 500M+ Twitter users) and ubiquitous mobile access (6B+ mobile phone subscribers) sharing their observations and opinions, we have unprecedented opportunities to extract social signals, create spatio-temporal mappings, perform analytics on social data, and support applications that vary from situational awareness during crisis response, preparedness and rebuilding phases to advanced analytics on social data, and gaining valuable insights to support improved decision making. This tutorial weaves three themes and corresponding relevant topics: (a) citizen sensing and crisis mapping, (b) technical challenges and recent research for leveraging citizen sensing to improve crisis response coordination, and (c) experiences in building robust and scalable platforms/systems. It will couple technical insights with identification of computational techniques and algorithms along with real-world examples. We will also do exemplary demos of the features in the Sahana, CrowdMap (Ushahidi's version) and Twitris platforms while elaborating on the practical issues and pitfalls of the development and operation of these large-scale platforms, especially during real-time crisis response.
JA - 7th International AAAI Conference on Weblogs and Social Media
CY - Cambridge, MA
ER -
TY - CONF
T1 - Crisis Response Coordination in Online Communities
T2 - NSF SOCS Symposium
Y1 - 2013
A1 - Hemant Purohit
KW - Coordination
KW - crisis computing
KW - Crisis Informatics
KW - crisis response
KW - crisis response coordination
KW - donation coordination
KW - emergency coordination
KW - need-offer matching
KW - seeker-supplier analysis
KW - Social Media
AB - During recent crises, citizens (sensors) are increasingly using social media to share a variety of information: the situation on the ground, emerging needs, donation offers, damage, etc. In such an evolving ad-hoc community, how can we extract actionable nuggets from the social media streams to aid relief efforts? This doctoral consortium presentation summarizes a framework to analyze social data and manage information to assist coordination by focusing on three important questions: whom to coordinate with, why to coordinate, and how to coordinate, with exemplary insights for needs and availability from recent disaster events.
JA - NSF SOCS Symposium
ER -
TY - JOUR
T1 - CT-Based Diagnosis of Diffuse Coronary Artery Disease on the Basis of Scaling Power-Laws
JF - Radiology
Y1 - 2013
A1 - Yunlong Huo
A1 - Thomas Wischgoll
A1 - Jenny Choy
A1 - Srikanth Sola
A1 - Jose Navia
A1 - Shawn Teague
A1 - Deepak Bhatt
A1 - Ghassan Kassab
VL - 268
ER -
TY - Generic
T1 - Data Processing and Semantics for Advanced Internet of Things (IoT) Applications: modeling, annotation, integration, and perception
T2 - International Conference on Web Intelligence, Mining, and Semantics (WIMS '13)
Y1 - 2013
A1 - Pramod Anantharam
A1 - Payam Barnaghi
A1 - Amit Sheth
KW - annotation
KW - Cyber-Physical-Social Systems
KW - healthcare
KW - inference
KW - integration
KW - Internet of Things (IoT)
KW - Knowledge Engineering
KW - Modelling
KW - Ontology
KW - reasoning
KW - Semantic Sensor Web
KW - traffic analytics
AB - This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g., modelling for constrained environments, complexity issues and time/location dependency of data), integration, analysis, and reasoning will be discussed. The tutorial will describe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g., the W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions related to these topics will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT related ontology development, linked data, domain knowledge integration and management, querying large-scale IoT data, and AI applications for automated knowledge extraction from real world data.
JA - International Conference on Web Intelligence, Mining, and Semantics (WIMS '13)
PB - ACM
CY - Madrid, Spain
ER -
TY - JOUR
T1 - Day or Night Activity Recognition from Video Using Fuzzy Clustering Techniques
JF - IEEE Transactions on Fuzzy Systems
Y1 - 2013
A1 - Tanvi Banerjee
A1 - James Keller
A1 - Marjorie Skubic
A1 - Erik Stone
KW - Activity labeling
KW - Depth images
KW - fuzzy clustering
KW - image moments
KW - infrared images
AB - We present an approach for activity state recognition implemented on data collected from various sensors—standard web cameras under normal illumination, web cameras using infrared lighting, and the inexpensive Microsoft Kinect camera system. Sensors such as the Kinect ensure that activity segmentation is possible during the daytime as well as night. This is especially useful for activity monitoring of older adults since falls are more prevalent at night than during the day. This paper is an application of fuzzy set techniques to a new domain. The approach described herein is capable of accurately detecting several different activity states related to fall detection and fall risk assessment including sitting, being upright, and being on the floor to ensure that elderly residents get the help they need quickly in case of emergencies and ultimately to help prevent such emergencies.
VL - 22
CP - 3
ER -
TY - Generic
T1 - Demo: Approximate Semantic Matching in the COLLIDER Event Processing Engine
T2 - 7th ACM International Conference on Distributed Event-Based Systems (DEBS '13)
Y1 - 2013
A1 - Souleiman Hasan
A1 - Kalpa Gunaratna
A1 - Yongrui Qin
A1 - Edward Curry
KW - Approximate Event Matching
KW - Loose Semantic Coupling
KW - semantic matching
AB - This demo presents a use case from the energy management domain. It builds upon previous work on approximate semantic matching of heterogeneous events and compares two semantic matching scenarios: exact and approximate. It illustrates how a large number of exact matching event subscriptions are needed to match heterogeneous power consumption events. It then demonstrates how a small number of approximate semantic matching subscriptions are needed but possibly with a lower true positives/negatives performance. The demo is delivered via the COLLIDER approximate event processing engine currently under development in DERI.
JA - 7th ACM International Conference on Distributed Event-Based Systems (DEBS '13)
CY - Arlington, Texas
ER -
TY - Generic
T1 - Electro-optical seasonal weather and gender data collection
T2 - SPIE 8751, Machine Intelligence and Bio-inspired Computation: Theory and Applications VII
Y1 - 2013
A1 - Ryan McCoppin
A1 - Nathan Koester
A1 - Nathan Rude
A1 - Mateen Rizki
A1 - Louis Tamburino
A1 - Andrew Freeman
A1 - Olga Mendoza-Schrock
AB - This paper describes the process used to collect the Seasonal Weather And Gender (SWAG) dataset, an electro-optical dataset of human subjects that can be used to develop advanced gender classification algorithms. Several novel features characterize this ongoing effort: (1) the human subjects self-label their gender by performing a specific action during the data collection and (2) the data collection will span months and even years, resulting in a dataset containing realistic levels and types of clothing corresponding to the various seasons and weather conditions. It is envisioned that this type of data will support the development and evaluation of more robust gender classification systems that are capable of accurate gender recognition under extended operating conditions.
JA - SPIE 8751, Machine Intelligence and Bio-inspired Computation: Theory and Applications VII
ER -
TY - JOUR
T1 - From Data to Actionable Knowledge: Big Data Challenges in the Web of Things
JF - IEEE Intelligent Systems
Y1 - 2013
A1 - Payam Barnaghi
A1 - Amit Sheth
A1 - Cory Henson
KW - Actionable Knowledge
KW - Big Data
KW - Citizen sensing
KW - IoT
KW - sensor data query processing
KW - Smart Data
KW - Web of Things
AB - Extending the current Internet and providing connection, communication, and internetworking between devices and physical objects, or 'things,' is a growing trend that's often referred to as the Internet of Things (IoT). Integrating real-world data into the Web, with its large repositories of data, and providing Web-based interactions between humans and IoT resources is what the Web of Things (WoT) stands for. Here, the guest editors describe the Big Data issues in the WoT, discuss the challenges of extracting actionable knowledge and insights from raw sensor data, and introduce the theme articles in this special issue.
VL - 28
CP - 6
ER -
TY - Generic
T1 - From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data
T2 - 9th International Conference on Data Integration in the Life Sciences (DILS '13)
Y1 - 2013
A1 - Amir Asiaee
A1 - Prashant Doshi
A1 - Todd Minning
A1 - Satya Sahoo
A1 - Priti Parikh
A1 - Amit Sheth
A1 - Rick Tarleton
KW - forms based querying
KW - knowledge-driven querying
KW - Paige tools
KW - Parasite Knowledge Repository
KW - T. cruzi
KW - template based querying
AB - We compare two distinct approaches for querying data in the context of the life sciences. The first approach utilizes conventional databases to store the data and provides intuitive form-based interfaces to facilitate querying of the data, commonly used by the life science researchers that we study. The second approach utilizes a large OWL ontology and the same datasets associated as RDF instances of the ontology. Both approaches are being used in parallel by a team of cell biologists in their daily research activities, with the objective of gradually replacing the conventional approach with the knowledge-driven one. We describe several benefits of the knowledge-driven approach in comparison to the traditional one, and highlight a few limitations. We believe that our analysis not only explicitly highlights the benefits and limitations of semantic Web technologies in the context of life sciences but also contributes toward effective ways of translating a question in a researcher's mind into precise queries with the intent of obtaining effective answers.
JA - 9th International Conference on Data Integration in the Life Sciences (DILS '13)
CY - Montreal, Canada
ER -
TY - CONF
T1 - From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data
T2 - Data Integration in the Life Sciences, DILS 2013
Y1 - 2013
A1 - Amir H. Asiaee
A1 - Prashant Doshi
A1 - Todd Minning
A1 - Satya S. Sahoo
A1 - Priti Parikh
A1 - Amit Sheth
A1 - Rick L. Tarleton
KW - Advantag
KW - Tate
KW - Trypanosoma
AB - We compare two distinct approaches for querying data in the context of the life sciences. The first approach utilizes conventional databases to store the data and provides intuitive form-based interfaces to facilitate querying of the data, commonly used by the life science researchers that we study. The second approach utilizes a large OWL ontology and the same datasets associated as RDF instances of the ontology. Both approaches are being used in parallel by a team of cell biologists in their daily research activities, with the objective of gradually replacing the conventional approach with the knowledge-driven one. We describe several benefits of the knowledge-driven approach in comparison to the traditional one, and highlight a few limitations. We believe that our analysis not only explicitly highlights the benefits and limitations of semantic Web technologies in the context of life sciences but also contributes toward effective ways of translating a question in a researcher’s mind into precise queries with the intent of obtaining effective answers.
JA - Data Integration in the Life Sciences, DILS 2013
PB - Springer
CY - Berlin, Heidelberg
SN - 978-3-642-39437-9
ER -
TY - JOUR
T1 - A Graph-Based Recovery and Decomposition of Swanson's Hypothesis using Semantic Predications
JF - Journal of Biomedical Informatics
Y1 - 2013
A1 - Delroy Cameron
A1 - Olivier Bodenreider
A1 - Hima Yalamanchili
A1 - Tu Danh
A1 - Sreeram Vallabhaneni
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Thomas Rindflesch
KW - background knowledge
KW - Literature-based discovery (LBD)
KW - semantic associations
KW - semantic predications
KW - Subgraph Creation
KW - Swanson's Hypothesis
AB - Objectives: This paper presents a methodology for recovering and decomposing Swanson's Raynaud Syndrome-Fish Oil Hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson's manually intensive techniques can be undertaken semi-automatically paves the way for fully automatic semantics-based hypothesis generation from scientific literature. Methods: Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson have been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson has been developed. Results: Our methodology not only recovered the 3 associations commonly recognized as Swanson's Hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson's Hypothesis has never been attempted. Conclusion: In this work, we presented a methodology for semi-automatically recovering and decomposing Swanson's RS-DFO Hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD).
These suggest that three critical aspects of LBD include: 1) the need for more expressive representations beyond Swanson's ABC model; 2) an ability to accurately extract semantic information from text; and 3) the semantic integration of scientific literature with structured background knowledge.
VL - 46
CP - 2
ER -
TY - JOUR
T1 - I Just Wanted to Tell You That Loperamide WILL WORK: A Web-Based Study of Extra-Medical Use of Loperamide
JF - Drug and Alcohol Dependence
Y1 - 2013
A1 - Raminta Daniulaityte
A1 - Robert Carlson
A1 - Russel Falck
A1 - Delroy Cameron
A1 - Sujan Perera
A1 - Lu Chen
A1 - Amit Sheth
KW - emerging drug use
KW - illicit opioid use
KW - Loperamide
KW - opiate withdrawal
KW - Prescription Drug Abuse
KW - Self-Treatment
KW - user-generated content
AB - Many websites provide a means for individuals to share their experiences and knowledge about different drugs. Such User-Generated Content (UGC) can be a rich data source to study emerging drug use practices and trends. This study examined UGC on extra-medical use of loperamide among illicit opioid users. A website that allows for the free discussion of illicit drugs and is accessible for public viewing was selected for analysis. Web-forum posts were retrieved using web crawlers and retained in a local text database. The database was queried to extract posts with a mention of loperamide and relevant brand/slang terms. Over 1290 posts were identified. A random sample of 258 posts was coded using NVivo to identify intent, dosage, and side-effects of loperamide use. There has been an increase in discussions related to loperamide's use by non-medical opioid users, especially in 2010-2011. Loperamide was primarily discussed as a remedy to alleviate a broad range of opioid withdrawal symptoms, and was sometimes referred to as "poor man's" methadone. Typical doses ranged from 70 to 100 mg per day, much higher than an indicated daily dose of 16 mg. This study suggests that loperamide is being used extra-medically to self-treat opioid withdrawal symptoms. There is a growing demand among people who are opioid dependent for drugs to control withdrawal symptoms, and loperamide appears to fit that role. The study also highlights the potential of the Web as a "leading edge" data source in identifying emerging drug use practices.
VL - 130
CP - 1-3
ER -
TY - CONF
T1 - Logical Linked Data Compression
T2 - 10th Extended Semantic Web Conference (ESWC 2013)
Y1 - 2013
A1 - Amit Joshi
A1 - Pascal Hitzler
A1 - Guozhu Dong
AB - Linked data has experienced accelerated growth in recent years. With the continuing proliferation of structured data, demand for RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets, called Rule Based Compression (RB Compression) that compresses datasets by generating a set of new logical rules from the dataset and removing triples that can be inferred from these rules. Unlike other compression techniques, our approach not only takes advantage of syntactic verbosity and data redundancy but also utilizes semantic associations present in the RDF graph. Depending on the nature of the dataset, our system is able to prune more than 50% of the original triples without affecting data integrity.
JA - 10th Extended Semantic Web Conference (ESWC 2013)
CY - Montpellier, France
ER -
TY - JOUR
T1 - Mining Effective Multi-Segment Sliding Window for Pathogen Incidence Rate Prediction
JF - Data & Knowledge Engineering
Y1 - 2013
A1 - Lei Duan
A1 - Changjie Tang
A1 - Xiasong Li
A1 - Guozhu Dong
A1 - Xianming Wang
A1 - Jie Zuo
A1 - Min Jiang
A1 - Zhongqiao Li
A1 - Yongqing Zhang
KW - Data Mining
KW - Multi-segment sliding window
KW - Pathogen incidence rate prediction
KW - Time series modeling
AB - Pathogen incidence rate prediction, which can be considered as time series modeling, is an important task for infectious disease incidence rate prediction and for public health. This paper investigates applying a genetic computation technique, namely GEP, for pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP-based time series modeling, the paper introduces the problem of mining effective sliding windows: discovering optimal sliding windows for building accurate prediction models. To utilize the periodical characteristic of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodical intervals is proposed and used. Since the number of such candidate windows is still very large, a heuristic method is designed for enumerating the candidate effective multi-segment sliding windows. Moreover, methods to find the optimal sliding window and then produce a mathematical model based on that window are proposed. A performance study on real-world datasets shows that the techniques are effective and efficient for pathogen incidence rate prediction.
VL - 87
ER -
TY - CONF
T1 - Physical Cyber Social Computing: An Early 21st Century Approach to Computing for Human Experience
T2 - International Conference on Web Intelligence, Mining and Semantics (WIMS '13)
Y1 - 2013
A1 - Amit Sheth
A1 - Pramod Anantharam
KW - computing for human experience
KW - healthcare
KW - Intelligence
KW - Machine Perception
KW - physical-cyber-social systems
KW - semantic computing
KW - Social Computing
KW - traffic analytics
AB - Computing has a critical role in solving some of the grand challenges spanning health and wellbeing, sustainability, and prevention of crime. Traditionally, computing has focused on a single narrow view of the problem to provide solutions, ignoring the essential human experience component. The availability of low-cost sensors and mobile devices leading to simultaneous and continuous access to events spanning physical, cyber, and social worlds demands rethinking of the traditional computational approaches. We propose Physical-Cyber-Social (PCS) computing, which takes a human-centric and holistic view of computing by analyzing observations, knowledge, and experiences from physical, cyber, and social worlds. We exemplify real-world problems that demand the approach of PCS computing and outline the research challenges in building algorithms and techniques for PCS computing.
JA - International Conference on Web Intelligence, Mining and Semantics (WIMS '13)
CY - Madrid, Spain
ER -
TY - Generic
T1 - Physical-Cyber-Social Computing
T2 - Dagstuhl Seminar 13402
Y1 - 2013
A1 - Ramesh Jain
A1 - Amit Sheth
A1 - Steffen Staab
A1 - Payam Barnaghi
A1 - Markus Strohmaier
JA - Dagstuhl Seminar 13402
VL - 3
ER -
TY - MGZN
T1 - Physical-Cyber-Social Computing: An Early 21st Century Approach
Y1 - 2013
A1 - Amit Sheth
A1 - Pramod Anantharam
A1 - Cory Henson
KW - Abstraction
KW - cyber-physical systems
KW - cyber-social systems
KW - Human Centric Computing
KW - PCS operators
KW - Physical Cyber Social Computing
KW - socio-technical systems
AB - Visionaries and scientists from the early days of computing and electronic communication have discussed the proper role of technology to improve human experience. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that a human, as an intelligent being, seeks to do. This article presents a vision of Physical-Cyber-Social (PCS) computing for a holistic treatment of data, information, and knowledge from physical, cyber, and social worlds to integrate, understand, correlate, and provide contextually relevant abstractions to humans.
JA - IEEE Intelligent Systems
VL - 28
ER -
TY - Generic
T1 - PLASMA-HD: Probing the Lattice Structure and Makeup of High-dimensional Data
T2 - VLDB Endowment
Y1 - 2013
A1 - David Fuhry
A1 - Yang Zhang
A1 - Venu Satuluri
A1 - Arnab Nandi
A1 - Srinivasan Parthasarathy
AB - Rapidly making sense of, analyzing, and extracting useful information from large and complex data is a grand challenge. A user tasked with meeting this challenge is often befuddled with questions on where and how to begin to understand the relevant characteristics of such data. Real-world problem scenarios often involve scalability limitations and time constraints. In this paper we present an incremental interactive data analysis system as a step to address this challenge. This system builds on recent progress in the fields of interactive data exploration, locality sensitive hashing, knowledge caching, and graph visualization. Using visual clues based on rapid incremental estimates, a user is provided a multi-level capability to probe and interrogate the intrinsic structure of data. Throughout the interactive process, the output of previous probes can be used to construct increasingly tight coherence estimates across the parameter space, providing strong hints to the user about promising analysis steps to perform next. We present examples, interactive scenarios, and experimental results on several synthetic and real-world datasets which show the effectiveness and efficiency of our approach. The implications of this work are quite broad and can impact fields ranging from top-k algorithms to data clustering and from manifold learning to similarity search.
JA - VLDB Endowment
VL - 6
ER -
TY - RPRT
T1 - Predicting Parkinson's Disease Progression with Smartphone Data
Y1 - 2013
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
A1 - Vahid Taslimi
A1 - Amit Sheth
KW - activity monitoring
KW - health and well-being
KW - Sensor data analytics
KW - Smart Phones
AB - Most of the existing approaches for detecting diseases/risk scores from observations (sensor and textual) ignore the presence of any prior knowledge of the disease. In this work, we start top-down by enumerating the symptoms of Parkinson's Disease (PD) and map the symptoms to their possible manifestations in sensor observations (bottom-up). We show such manifestations and further use these manifestations as features to build classifiers to differentiate between the PD patients and the control group.
ER -
TY - JOUR
T1 - Prediction of Subscriber Churn Using Social Network Analysis
JF - Bell Laboratories Technical Journal
Y1 - 2013
A1 - Chitra Phadke
A1 - Huseyin Uzunalioglu
A1 - Veena Mendiratta
A1 - Dan Kushnir
A1 - Derek Doran
AB - In today's world, mobile phone penetration has reached a saturation point. As a result, subscriber churn has become an important issue for mobile operators as subscribers switch operators for a variety of reasons. Mobile operators typically employ churn prediction algorithms based on service usage metrics, network performance indicators, and traditional demographic information. A newly emerging technique is the use of social network analysis (SNA) to identify potential churners. Intuitively, a subscriber who is churning will have an impact on the churn propensity of his social circle. Call detail records are useful to understand the social connectivity of subscribers through call graphs but do not directly provide the strength of their relationship or have enough information to determine the diffusion of churn influence. In this paper, we present a way to address these challenges by developing a new churn prediction algorithm based on a social network analysis of the call graph. We provide a formulation that quantifies the strength of social ties between users based on multiple attributes and then apply an influence diffusion model over the call graph to determine the net accumulated influence from churners. We combine this influence and other social factors with more traditional metrics and apply machine-learning methods to compute the propensity to churn for individual users. We evaluate the performance of our algorithm over a real data set and quantify the benefit of using SNA in churn prediction.
VL - 17
CP - 4
ER -
TY - RPRT
T1 - Prediction of User Engagement in Online Discussion Using Multi-faceted Features
Y1 - 2013
A1 - Yiye Ruan
A1 - Hemant Purohit
A1 - David Fuhry
A1 - Srinivasan Parthasarathy
A1 - Amit Sheth
PB - Wright State University
CY - Dayton
ER -
TY - JOUR
T1 - PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media
Y1 - 2013
A1 - Delroy Cameron
A1 - Gary Alan Smith
A1 - Raminta Daniulaityte
A1 - Amit Sheth
A1 - Drashti Dave
A1 - Lu Chen
A1 - Gaurish Anand
A1 - Robert Carlson
A1 - Kera Watkins
A1 - Russel Falck
KW - Drug Abuse Ontology
KW - Entity Identification
KW - Epidemiology
KW - knowledge
KW - Opioid abuse
KW - Prescription Drug Abuse
KW - Relationship Extraction
KW - Semantic Web
KW - Sentiment Extraction
KW - Triple Extraction
AB - The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. 
The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. 
A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in the future.
ER -
TY - THES
T1 - A Semantic Situation Awareness Framework for Indoor Cyber-Physical Systems
T2 - Department of Engineering and Computer Science
Y1 - 2013
A1 - Pratik Desai
AB - Recently, the domain of cyber-physical systems (CPSs) has emerged as a successor to traditional embedded systems and wireless sensor networks. The relatively new cyber-physical domain offers tight integration of control, communication and computation components to develop advanced web-based applications in various heterogeneous domains such as health care, disaster management, automation and environment monitoring. The applications of indoor CPSs include remote patient monitoring, smart homes, etc., with a focus on situation awareness via event identification from context information. The principal challenges associated with the development of situation awareness applications include uncertainty in contextual data, incomplete domain knowledge, interoperability between interconnected systems and effective utilization of spatial information. This dissertation addresses these challenges by providing a comprehensive situation awareness framework for event comprehension utilizing raw sensor data and spatial information. Semantic web-based annotation and mapping techniques are used to provide interoperability. The framework contains contextual situation awareness and location awareness stages towards achieving effective event assessment. The contextual situation awareness stage provides a fuzzy abductive reasoning-based architecture to transform raw physical sensor data into low-level fuzzy abstractions. These abstractions are used for event assessment with an associated degree of certainty. The location awareness stage includes methodologies to hierarchically map indoor objects and define the object-event relationship in an ontology, which is further exploited for event discrimination. This dissertation also presents a fusion-based indoor positioning algorithm to provide accurate spatial information to assist location awareness.
The algorithm uses extensive training of received signal strength (RSS) and time difference of arrival (TDoA) signals to estimate distance and position. The comprehensive framework is evaluated through an implementation of simulated indoor fire in a controlled environment.
JA - Department of Engineering and Computer Science
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - THES
T1 - A Semantics-based Approach to Machine Perception
T2 - Department of Engineering and Computer Science
Y1 - 2013
A1 - Cory Henson
KW - Semantic Perception
KW - Semantic Sensor Web
AB - Machine perception can be formalized using semantic web technologies in order to derive abstractions from sensor data using background knowledge on the Web, and efficiently executed on resource-constrained devices. Advances in sensing technology hold the promise to revolutionize our ability to observe and understand the world around us. Yet the gap between observation and understanding is vast. As sensors are becoming more advanced and cost-effective, the result is an avalanche of data of high volume, velocity, and of varied type, leading to the problem of too much data and not enough knowledge (i.e., insights leading to actions). Current estimates predict over 50 billion sensors connected to the Web by 2020. While the challenge of data deluge is formidable, a resolution has profound implications. The ability to translate low-level data into high-level abstractions closer to human understanding and decision-making has the potential to disrupt data-driven interdisciplinary sciences, such as environmental science, healthcare, and bioinformatics, as well as enable other emerging technologies, such as the Internet of Things. The ability to make sense of sensory input is called perception; and while people are able to perceive their environment almost instantaneously, and seemingly without effort, machines continue to struggle with the task. Machine perception is a hard problem in computer science, with many fundamental issues that are yet to be adequately addressed, including: (a) annotation of sensor data, (b) interpretation of sensor data, and (c) efficient implementation and execution. This dissertation presents a semantics-based machine perception framework to address these issues.
JA - Department of Engineering and Computer Science
PB - Wright State University
CY - Dayton
VL - PhD
ER -
TY - CONF
T1 - Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Social Applications
T2 - AAAI 2013 Fall Symposium Series
Y1 - 2013
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
AB - We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the five Vs of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can be done at a level independent of heterogeneity of data formats and media. To handle the challenge of Velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize new concepts, entities and facts. To handle Veracity, we explore the formalization of trust models and approaches to glean trustworthiness. The above four Vs of Big Data are harnessed by the semantics-empowered analytics to derive Value for supporting practical applications transcending the physical-cyber-social continuum.
JA - AAAI 2013 Fall Symposium Series
CY - Arlington, Virginia
ER -
TY - THES
T1 - SemSOS: an Architecture for Query, Insertion, and Discovery for Semantic Sensor Networks
T2 - Department of Computer Science & Engineering
Y1 - 2013
A1 - Joshua Pschorr
KW - Linked Data
KW - Semantic Sensor Web
KW - Semantic Web
KW - sensor data
KW - Sensor Observation Service
KW - Sensor Web
KW - Sensor Web Enablement
AB - With sensors, storage, and bandwidth becoming ever cheaper, there has been a drive recently to make sensor data accessible on the Web. However, because of the vast number of sensors collecting data about our environment, finding relevant sensors on the Web and then interpreting their observations is a non-trivial challenge. The Open Geospatial Consortium (OGC) defines a web service specification known as the Sensor Observation Service (SOS) that is designed to standardize the way sensors and sensor data are discovered and accessed on the Web. Though this standard goes a long way in providing interoperability between sensor data producers and consumers, it is predicated on the idea that the consuming application is equipped to handle raw sensor data. Sensor data consuming end-points are generally interested not just in the raw data itself, but rather in actionable information regarding their environment. The approaches for dealing with this are either to make each individual consuming application smarter or to make the data served to them smarter. This thesis presents an application of the latter approach, which is accomplished by providing a more meaningful representation of sensor data by leveraging semantic web technologies. Specifically, this thesis describes an approach to sensor data modeling, reasoning, discovery, and query over richer semantic data derived from raw sensor descriptions and observations. The artifacts resulting from this research include an implementation of an SOS service that hews to both Sensor Web and Semantic Web standards in order to bridge the gap between syntactic and semantic sensor data consumers, and that has been proven by use in a number of research applications storing large amounts of data. This implementation also serves as an example of an approach for designing applications that integrate syntactic services over semantic models and allow for interactions with external reasoning systems.
As more sensors and observations move online and as the Internet of Things becomes a reality, issues of integration of sensor data into our everyday lives will become important for all of us. The research represented by this thesis explores this problem space and presents an approach to dealing with many of these issues. Going forward, this research may prove a useful elucidation of the design considerations and affordances which can allow low-level sensor and observation data to become the basis for machine processable knowledge of our environment.
JA - Department of Computer Science & Engineering
PB - Wright State University
CY - Fairborn
VL - M.S.
ER -
TY - Generic
T1 - A Statistical and Schema Independent Approach to Identify Equivalent Properties on Linked Data
T2 - 9th International Conference on Semantic Systems (I-SEMANTICS)
Y1 - 2013
A1 - Kalpa Gunaratna
A1 - Krishnaprasad Thirunarayan
A1 - Prateek Jain
A1 - Amit Sheth
A1 - Sanjaya Wijeratne
KW - Linked Open Data
KW - Property Alignment
KW - Relationship Identification
KW - Statistical Equivalence
AB - The Linked Open Data (LOD) cloud has gained significant attention in the Semantic Web community recently. Currently it consists of approximately 295 interlinked datasets with over 50 billion triples including 500 million links, and continues to expand in size. This vast source of structured information has the potential to have a significant impact on knowledge-based applications. However, a key impediment to the use of the LOD cloud is limited support for data integration tasks over concepts, instances, and properties. Efforts to address this limitation over properties have focused on matching data-type properties across datasets; however, matching of object-type properties has not received similar attention. We present an approach that can automatically match object-type properties across linked datasets, primarily exploiting and bootstrapping from entity co-reference links such as owl:sameAs. Our evaluation, using sample instance sets taken from Freebase, DBpedia, LinkedMDB, and DBLP datasets covering multiple domains, shows that our approach matches properties with high precision and recall (on average, F measure gain of 57% - 78%).
JA - 9th International Conference on Semantic Systems (I-SEMANTICS)
CY - Graz, Austria
ER -
TY - JOUR
T1 - Striving for Safety: Communicating and Deciding in Sociotechnical Systems
JF - Ergonomics
Y1 - 2013
A1 - John Flach
A1 - John Carroll
A1 - Marvin Dainoff
A1 - Ian Hamilton
KW - communications
KW - controllability
KW - decision-making
KW - dynamical systems
KW - observability
KW - safety
KW - sociotechnical systems
AB - How do communications and decisions impact the safety of sociotechnical systems? This paper frames this question in the context of a dynamic system of nested sub-systems. Communications are related to the construct of observability (i.e. how components integrate information to assess the state with respect to local and global constraints). Decisions are related to the construct of controllability (i.e. how component sub-systems act to meet local and global safety goals). The safety dynamics of sociotechnical systems are evaluated as a function of the coupling between observability and controllability across multiple closed-loop components. Two very different domains (nuclear power and the limited service food industry) provide examples to illustrate how this framework might be applied. While the dynamical systems framework does not offer simple prescriptions for achieving safety, it does provide guides for exploring specific systems to consider the potential fit between organisational structures and work demands, and for generalising across different systems regarding how safety can be managed.
VL - 58
CP - 4
ER -
TY - JOUR
T1 - Summarization via Pattern Utility and Ranking: A Novel Framework for Social Media Data Analytics
JF - Bulletin of the Technical Committee on Data Engineering
Y1 - 2013
A1 - Xintian Yang
A1 - Yiye Ruan
A1 - Srinivasan Parthasarathy
A1 - Amol Ghoting
AB - The firehose of data generated by users on social networking and microblogging sites such as Facebook and Twitter is enormous. The data can be classified into two categories: the textual content written by the users and the topological structure of the connections among users. Real-time analytics on such data is challenging, with most current efforts largely focusing on the efficient querying and retrieval of data produced recently. In this article, we present a dynamic pattern-driven approach to summarize social network content and topology. The resulting family of algorithms relies on the common principles of summarization via pattern utilities and ranking (SPUR). SPUR and its dynamic variant (D-SPUR) rely on an in-memory summary while retaining sufficient information to facilitate a range of user-specific and topic-specific temporal analytics. We then follow up by describing variants that take the implicit graph of connections into account to realize the Graph-based SPUR variant (G-SPUR). Finally, we describe scalable algorithms for implementing these ideas on commercial GPU-based systems. We examine the effectiveness of the summarization approaches along the axes of storage cost, query accuracy, and efficiency using real data from Twitter.
VL - 36
CP - 3
ER -
TY - JOUR
T1 - A Survey on the Privacy Preserving Algorithms of Association Rule Hiding
JF - International Journal of Data Mining & Knowledge Management Process
Y1 - 2013
A1 - Eynollah Khanjari Miyaneh
A1 - Mohammadreza Lorestani
AB - Following the ever-increasing pace of growth of the Internet, data storage devices, and data processing technologies, privacy preservation has emerged as one of the paramount issues in data mining. The issue has been widely studied given the increase in sensitive data on the Internet. People are increasingly expressing concerns about their privacy and personal information. To deal with this new demand, privacy preserving data mining (PPDM) has become increasingly popular because it allows sharing of private data for analysis purposes. However, there is a natural tradeoff between privacy and accuracy, though this tradeoff is affected by the particular algorithm used for privacy preservation.
VL - 3
CP - 2
ER -
TY - Generic
T1 - Swarm Intelligence computational paradigm
T2 - 2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA)
Y1 - 2013
A1 - Utkarshani Jaimini
A1 - V. Panchal
KW - Multi-Agent
KW - Single-Agent
KW - Space Variant
KW - Swarm Intelligence
KW - Time-Space Invariant
KW - Time-Space Variant
AB - Swarm Intelligence has emerged as an important technique among various computational techniques due to the efficiency and robustness of its solutions. The authors have categorized the different swarm intelligence techniques based on the agent population involved and on space-time variation to get an optimal solution to a problem.
JA - 2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA)
PB - IEEE
CY - Bangalore, India
ER -
TY - CONF
T1 - Touch-Enabled Input Devices for Controlling Virtual Environments
T2 - 12th IFAC, IFIP, IFORS, IEA Symposium on Analysis, Design, and Evaluation of Human-Machine Systems
Y1 - 2013
A1 - Taylor Edmiston
A1 - Adam Golden
A1 - Adam Meily
A1 - Thomas Wischgoll
AB - A key benefit of using virtual environment display technology is the familiarity of the user with the modalities of that environment, which provides very intuitive access to models or data sets represented using this technology. Various styles of input devices are typically used for such virtual environments, ranging from standard game-pads to high-end commercial devices like an A.R.T. flystick2. These devices work well for operations such as selection or navigating the scene. Whenever more sophisticated dialog-based input is required, these devices typically rely on traditional 2D metaphors projected into the virtual environment. The use of tablet devices can provide a significantly more natural input paradigm under these circumstances. This paper describes the deployment of a standard Android tablet device that interfaces with a virtual environment over the wireless network. The tablet device was tested using traditional CAVE-type display configurations and wall-type display systems using various 3D stereoscopic technologies, including active stereo and passive stereo.
JA - 12th IFAC, IFIP, IFORS, IEA Symposium on Analysis, Design, and Evaluation of Human-Machine Systems
CY - Las Vegas, Nevada
ER -
TY - CONF
T1 - Traffic Analytics using Probabilistic Graphical Models Enhanced with Knowledge Bases
T2 - 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International Conference on Data Mining (SDM 2013)
Y1 - 2013
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - cyber physical systems
KW - Declarative Knowledge
KW - Graphical models
KW - probabilistic modeling
KW - traffic analytics
AB - Graphical models have been successfully used to deal with uncertainty, incompleteness, and dynamism within many domains. These models built from data often ignore pre-existing declarative knowledge about the domain in the form of ontologies and Linked Open Data (LOD) that is increasingly available on the web. In this paper, we present an approach to leverage such 'top-down' domain knowledge to enhance 'bottom-up' building of graphical models. Specifically, we propose three operations on the graphical model structure to enrich it with nodes, edges, and edge directions. We illustrate the enrichment process using traffic data from 511.org and declarative knowledge from ConceptNet. The resulting enriched graphical model can potentially lead to better predictions of traffic delays.
JA - 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International Conference on Data Mining (SDM 2013)
CY - Austin, Texas
ER -
TY - Generic
T1 - Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety and Velocity Using Semantics and Semantic Web
Y1 - 2013
A1 - Amit Sheth
KW - Big Data
KW - computing for human experience
KW - physical-cyber-social computing
KW - Semantic abstraction intelligence at edge
KW - Semantic Perception
KW - Semantic Web
KW - Smart Data
KW - Web 3.0
AB - Full Citation:

Amit Sheth, 'Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web,' keynote at the 21st Italian Symposium on Advanced Database Systems, June 30 - July 03 2013, Roccella Jonica, Italy. Also given as invited talks at universities in Spain and Italy in June 2013.

Provenance storage and propagation - to allow efficient storage and seamless propagation of provenance as the data is transferred across applications

Provenance query - to support queries with increasing complexity over large data size and also support knowledge discovery applications

We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for Trypanosoma cruzi (T.cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness. Conclusions: The SPF provides a unified framework to effectively manage provenance of translational research data during pre- and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.
ER -
TY - CONF
T1 - An Up-to-date Knowledge-Based Literature Search and Exploration Framework for Focused Bioscience Domains
T2 - IHI 2012 - 2nd ACM SIGHIT International Health Informatics Symposium
Y1 - 2012
A1 - Ramakanth Kavuluru
A1 - Christopher Thomas
A1 - Amit Sheth
A1 - Victor Chan
A1 - Wenbo Wang
A1 - Gary Alan Smith
KW - domain models
KW - hypothesis generation
KW - information extraction
KW - knowledge based systems
KW - SCOONER
KW - Semantic Web for life sciences
KW - text mining
KW - Virtuoso
AB - To handle the exponential growth in bioscience literature, several knowledge-based search systems that facilitate domain-specific search have been proposed. In such systems, knowledge of a domain of interest is embedded as a backbone that guides the search process. But the knowledge used in most such systems (1) exists only for a few well-known broad domains; (2) is of a basic nature: either purely hierarchical or involving only a few relationship types; and (3) is not always kept up to date, missing insights from recently published results. In this paper we present a framework and implementation of a focused and up-to-date knowledge-based search system, called Scooner, that utilizes domain-specific knowledge extracted from recent bioscience abstracts. To our knowledge, this is the first attempt in the field to address all three shortcomings mentioned above. Since its recent introduction for operational use at the Applied Biotechnology Branch of AFRL, some biologists have been using Scooner on a regular basis, while it is being made available for use by many more. Initial evaluations point to the promise of the approach in addressing the challenge we set out to address.
JA - IHI 2012 - 2nd ACM SIGHIT International Health Informatics Symposium
CY - Miami, Florida
ER -
TY - JOUR
T1 - Use Attribute Behavior Diversity to Build Accurate Decision Tree Committees for Microarray Data
Y1 - 2012
A1 - Qian Han
A1 - Guozhu Dong
ER -
TY - CONF
T1 - A Web-Based Study of Self-Treatment of Opioid Withdrawal Symptoms with Loperamide
T2 - College on Problems of Drug Dependence (CPDD)
Y1 - 2012
A1 - Raminta Daniulaityte
A1 - Robert Carlson
A1 - Russel Falck
A1 - Delroy Cameron
A1 - Sujan Udayanga
A1 - Lu Chen
A1 - Amit Sheth
KW - Loperamide
KW - Prescription Drug Abuse
KW - Self-Treatment
KW - Withdrawal
AB - Aims: Many websites provide a medium for individuals to freely share their experiences and knowledge about different drugs. Such user-generated content can be used as a rich data source to study emerging drug use practices and trends. The study aims to examine web-based reports of loperamide use practices among non-medical opioid users. Loperamide, a piperidine derivative, is an opioid agonist approved for the control of diarrhea symptoms. Because of its general inability to cross the blood-brain barrier, it is considered to have no abuse potential and is available without a prescription. Methods: A website that allows free discussion of illicit drugs and is accessible for public viewing was selected for analysis. Web-forum posts were retrieved using Web Crawlers and retained in an Informal Text Database. All unique user names were anonymized. The database was queried to extract posts with a mention of loperamide and relevant brand/slang terms. Over 1200 posts were identified and entered into NVivo to assist with consistent application of codes related to the reasons, dosage, and effects of loperamide use. Results: Since the first post in 2005, there was a substantial rise in discussions related to its use by non-medical opioid users, especially in 2009-2011. Loperamide was primarily discussed as a remedy to alleviate a broad range of opiate withdrawal symptoms, and was sometimes referred to as 'poor man's methadone.' Typical doses frequently ranged from 100 mg to 200 mg per day, much higher than an indicated dose of 16 mg per day. Conclusions: This study suggests that loperamide is being used extra-medically by people who are involved with the abuse of opioids to control withdrawal symptoms. There is a growing demand among people who are opioid dependent for drugs to control withdrawal symptoms, and loperamide appears to fit that role.
The study also highlights the potential of the Web as a 'leading edge' data source in identifying emerging drug use practices.
JA - College on Problems of Drug Dependence (CPDD)
PB - College on Problems of Drug Dependence (CPDD)
CY - The College on Problems of Drug Dependence (CPDD), Palm Springs, CA, USA, June 9-14, 2012
ER -
TY - JOUR
T1 - 3D Reconstruction of Human Ribcage and Lungs and Improved Visualization of Lung X-ray Images Through Removal of the Ribcage
JF - Dagstuhl Follow-Ups
Y1 - 2011
A1 - Christopher Koehler
A1 - Thomas Wischgoll
AB - The analysis of X-ray imagery is the standard pre-screening approach for lung cancer. Unlike CT scans, X-ray images only provide a 2D projection of the patient's body. As a result, occlusions, i.e. some body parts covering other areas of the body within this projected X-ray image, can make the analysis more difficult. For example, the ribs, a predominant feature within the X-ray image, can cover up cancerous nodules, making it difficult for Computer Aided Diagnostic (CAD) systems or even a doctor to detect such nodules. Hence, this paper describes a methodology for reconstructing a patient-specific 3D model of the ribs and lungs based on a set of lateral and PA X-ray images, which allows the system to calculate simulated X-ray images of just the ribs. The simulated X-ray images can then be subtracted from the original PA X-ray image, resulting in an image where most of the cross-hatching pattern caused by the ribs is removed to improve automated diagnostic processes.
ER -
TY - CHAP
T1 - Active Perception Over Machine and Citizen Sensing
Y1 - 2011
A1 - Cory Henson
A1 - Amit Sheth
KW - Perception
KW - Semantic Web
KW - Sensing
AB - Today, many sensor networks and their applications employ a brute force approach to collecting and analyzing the huge volumes of sensor data currently generated. Such an approach often wastes valuable energy and computational resources by unnecessarily tasking sensors and generating observations of minimal use. People, on the other hand, have evolved sophisticated mechanisms to efficiently perceive their environment. This is accomplished through an ability to isolate the signal from the noise by focusing attention and seeking out those observations containing useful information. With the rise of social networking technology, people are now empowered to share their observations and perceptions with the world. A satisfying integration of such knowledge with observations generated by machine sensors, however, remains elusive. In this talk, we describe and demonstrate a semantics-driven active perception prototype, derived from cognitive theory, that may be used to more effectively integrate observations from both machine sensors and people in order to efficiently generate comprehensive and robust situation awareness.
PB - Semantic Technology Conference
ER -
TY - CONF
T1 - Aligning the Parasite Experiment Ontology and the Ontology for Biomedical Investigations Using AgreementMaker
T2 - International Conference on Biomedical Ontology (ICBO 2011)
Y1 - 2011
A1 - Valerie Cross
A1 - Cosmin Stroe
A1 - Xueheng Hu
A1 - Pramit Silwal
A1 - Maryam Panahiazar
A1 - Isabel F. Cruz
A1 - Priti Parikh
A1 - Amit Sheth
KW - biomedical ontologies
KW - lexicons
KW - mapping provenance
KW - ontology alignment
KW - ontology profiling
AB - Tremendous amounts of data exist in the life sciences along with many bio-ontologies. Though these databases contain important information about genes, proteins, functions, etc., this information is not well utilized due to the heterogeneous formats of these databases. Therefore, ontology alignment (OA) is now critical for the life science domain. Our work utilizes AgreementMaker for OA and describes results, difficulties faced in the process, and lessons learned. We aligned two real-world ontologies, the Parasite Experiment Ontology (PEO) and the Ontology for Biomedical Investigations (OBI). The former is more application-oriented and the latter is a reference ontology for any biomedical or clinical investigations. Our study led to several enhancements to AgreementMaker: annotation profiling, mapping provenance information, and tailored lexicon building. These enhancements, which are applicable to any OA system, greatly improved the alignment of these real-world ontologies, producing 90% precision with 60% recall from the BSMlex+, the Base Similarity Matcher, and 57% precision with 67% recall from the PSMlex+, the Parametric String Matcher, both using lexicon lookup for synonyms. The mappings obtained through this study are posted on the BioPortal site for public use.
JA - International Conference on Biomedical Ontology (ICBO 2011)
CY - Buffalo, New York
ER -
TY - ABST
T1 - Analysis and Visualization of Vascular Structures
Y1 - 2011
A1 - Thomas Wischgoll
ER -
TY - ABST
T1 - Analysis on Partial Relationship in LOD
Y1 - 2011
A1 - Kalpa Gunaratna
A1 - Sarasi Lalithsena
A1 - Cory Henson
A1 - Prateek Jain
KW - DBpedia
AB - Relationships play a key role in the Semantic Web, connecting the dots between entities (concepts or instances) in a way that lets one grasp the real sense of the entities. Some interesting relationships give proof of the existence of the subject and object in triples, and these in turn can be defined as evidential relationships. Identifying evidential relationships will yield solutions to some existing inference problems and open doors for new applications and research. Part_of relationships are identified as a special kind of evidential relationship, among others such as membership and causality. Linked Open Data, as a global data space, would provide a good platform to explore these relationships and solve interesting inference problems. But this is not trivial, because LOD does not have a rich schema in terms of the data sets, and the existing work on schema mapping in LOD is limited to concepts and not relationships. This project is based on finding a novel approach to identify partial relationships, which are a superset of part_of relationships, from LOD instance data by conducting a proper analysis of the data patterns in instance data. Ultimately this approach would provide a way to enhance the shallow schemas in LOD, which in turn would be helpful in schema matching in LOD. We apply the determined approach to the DBpedia data set in order to identify the partial relationships in DBpedia.
ER -
TY - CONF
T1 - A Better Uncle For OWL - Nominal Schemas for Integrating Rules and Ontologies
T2 - International World Wide Web Conference (WWW2011)
Y1 - 2011
A1 - Markus Krötzsch
A1 - Frederick Maier
A1 - Adila Alfa Krisnadhi
A1 - Pascal Hitzler
KW - Datalog
KW - Description Logic
KW - Semantic Web Rule Language
KW - SROIQ
KW - tractability
KW - Web Ontology Language
AB - We propose a description-logic style extension of OWL 2 with nominal schemas which can be used like 'variable nominal classes' within axioms. This feature allows ontology languages to express arbitrary DL-safe rules (as expressible in SWRL or RIF) in their native syntax. We show that adding nominal schemas to OWL 2 does not increase the worst-case reasoning complexity, and we identify a novel tractable language SROELV_3(⊓, X) that is versatile enough to capture the lightweight languages OWL EL and OWL RL.
JA - International World Wide Web Conference (WWW2011)
PB - Proceedings of the 20th International World Wide Web Conference (WWW2011)
CY - New York
ER -
TY - ABST
T1 - Beyond Positive/Negative Classification: Automatic Extraction of Sentiment Clues from Microblogs
Y1 - 2011
A1 - Lu Chen
A1 - Wenbo Wang
A1 - Meenakshi Nagarajan
A1 - Shaojun Wang
A1 - Amit Sheth
KW - Optimization
KW - Opinion Mining
KW - Sentiment Analysis
KW - Sentiment Extraction
AB - Microblogging provides a large volume of text for learning and understanding people's sentiments on a variety of topics. Much of the current work on sentiment analysis of microblogs (e.g., tweets) focuses on document level polarity. However, identifying sentiment clues with respect to specific targets (e.g., named entities) can be more useful than pure document polarity results. For example, sentiment clues such as 'must see', 'awesome', 'rate 5 stars' (in the movie domain) are much more meaningful than the polarities of tweets only. Previous attempts at single-word sentiment clue extraction from formal text will not suffice for extracting multi-word sentiment phrases. Single words 'must' and 'see' do not separately convey polarity, but their combination 'must see' expresses strong positive sentiment towards a movie target. Another issue with identifying sentiment clues is identifying informal sentiment expressions, such as misspellings ('kool'), abbreviations ('wtf') and slang ('da bomb'). In this paper, we propose an approach for automatically extracting both single-word and multi-word sentiment clues. Such clues can include both traditional and slang expressions. We also present a mechanism for assessing their target-specific polarities from an unlabeled microblog corpus. Our approach first leverages traditional and slang subjective lexicons to generate candidate sentiment clues given a specific target. It then incorporates inter-clue relations from corpora into an optimization model to estimate the probability of a clue denoting positive/negative sentiment. Experiments using microblog data sets from two different domains -- movie and person -- show that the proposed approach can effectively 1) extract single-word as well as phrase sentiment clues, 2) identify both traditional and slang sentiment clues, and 3) determine their target-specific polarities. We also demonstrate how the proposed approach is superior in comparison with several baseline methods.
ER -
TY - CONF
T1 - Building a Foundation To Enable Semantic Technologies For Phylogenetically-Based Comparative Analyses
T2 - iEvoBio2011
Y1 - 2011
A1 - Maryam Panahiazar
A1 - Rutger Vos
A1 - Enrico Pontelli
A1 - Todd Vision
A1 - Arlin Stoltzfus
A1 - Jim Leebens-Mack
KW - Ontology
KW - Phylogeny
KW - phyloinformatics
KW - Semantic Annotation
KW - Semantic Web
AB - In revealing historical relationships among genes and species, phylogenies provide a unifying context across the life sciences for investigating diversification of biological form and function. The utility of phylogenies for addressing a wide variety of biological questions is evident in the rapidly increasing number of published gene and species trees. Further, this trend is certain to pick up pace with the explosion of data being generated with next generation sequencing technologies. The impact that this deluge of species and gene tree estimates will have on our understanding of the forces that shape biodiversity will be limited by the accessibility of these trees, and the underlying data and methods of analysis. The true structure of species trees and gene trees is rarely known. Rather, estimates are obtained through the application of increasingly sophisticated phylogenetic inference methods to increasingly large and complicated datasets. The need for a Minimum Information about Phylogenetic Analyses (MIAPA) reporting standard is clear, but specification of the standard has been hampered by the absence of controlled vocabularies to describe phylogenetic methodologies and workflows. PhylOnt is an extensible ontology being developed to describe the methods employed to estimate trees given a data matrix and thus support specification of MIAPA. PhylOnt will be linked with the Comparative Data Analysis Ontology (CDAO) to provide a comprehensive set of concepts relating to phylogeny estimation that can be used by searchable tree databases and web services. Moreover, we aim to use PhylOnt/CDAO concepts that describe tree estimation procedures to explicitly relate tree descriptions to data matrices within NeXML files. We view this as an important step in the development and specification of MIAPA.
JA - iEvoBio2011
PB - iEvoBio 2011
CY - Norman, Oklahoma
ER -
TY - CHAP
T1 - Citizen Sensing Opportunities and Challenges in Mining Social Signals and Perceptions
Y1 - 2011
A1 - Amit Sheth
KW - Citizen sensing
KW - Continuous Semantics
KW - Dynamic Domain model
KW - Semantic Search
KW - semantic social web
KW - Social Data Annotation
KW - Social Perception
KW - Social Signal
AB -
ER -
TY - ABST
T1 - Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications
Y1 - 2011
A1 - Meenakshi Nagarajan
A1 - Amit Sheth
A1 - Selvam Velmurugan
KW - Citizen sensing
KW - mobile development application
KW - people-content-network view of social media
KW - semantic social mashup
KW - semantic social web
KW - social development application
KW - social media analysis
KW - social signals
KW - user generated content
AB - With the rapid rise in the popularity of social media (500M+ Facebook users, 100M+ Twitter users), and near ubiquitous mobile access (4.1 billion actively-used mobile phones), the sharing of observations and opinions has become commonplace (nearly 100M tweets a day, 1.8 trillion SMSs in the US last year). This has given us unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it towards targeted online content delivery, crisis management, organizing revolutions or promoting social development in underdeveloped and developing countries. This tutorial will address challenges and techniques for building applications that support a broad variety of users and types of social media. This tutorial will focus on social intelligence applications for social development, and cover the following research efforts in sufficient depth: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) building social media analytics platforms. Technical insights will be coupled with identification of computational techniques and real-world examples.
PB - WWW 2011
ER -
TY - JOUR
T1 - The Cloud Agnostic e-Science Analysis Platform
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Paul Anderson
A1 - Amit Sheth
KW - Cloud Computing
KW - escience
AB - The amount of data being generated for e-Science domains has grown exponentially in the past decade, yet the adoption of new computational techniques in these fields hasn't seen similar improvements. The presented platform can exploit the power of cloud computing while providing abstractions for scientists to create highly scalable data processing workflows.
ER -
TY - MGZN
T1 - Cloud Centric Mobile Application Development Using Domain Specific Languages
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Ashwin Manjunatha
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
AB - Stellar growth in the use of mobile platforms for business users, combined with rapid business growth in developing countries that have much higher utilization of mobile access, has put cloud back-end hosted mobile applications in a sweet spot. By using a cloud-mobile combination, computationally intensive services can be delivered right to the consumer anywhere, anytime. Two unmet challenges in developing such applications are managing the development of applications for heterogeneous mobile platforms with equivalent functionality, and maintaining a portable back-end to mitigate any catastrophic outage (such as the recent outage of Amazon EC2). This article discusses the mobile application aspects of the MobiCloud [1] approach for rapidly developing cloud-mobile hybrid applications for heterogeneous mobile front-ends and cloud back-ends. MobiCloud exploits the features of Domain Specific Languages (DSLs) to address the difficulty of programming for multiple mobile platforms. It also helps to overcome some of the limitations of mobile platforms by pairing with cloud back-ends.
PB - IEEE ComSoc
VL - 6
CP - 10
ER -
TY - JOUR
T1 - Cloud Centric Mobile Application Development Using Domain Specific Languages
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Ashwin Manjunatha
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - application generation
KW - Cloud Computing
KW - Mobicloud
KW - Mobile Computing
ER -
TY - CONF
T1 - CloudVista: Visual Cluster Exploration for Extreme Scale Data in the Cloud
T2 - Scientific and Statistical Database Management Conference
Y1 - 2011
A1 - Keke Chen
A1 - Huiqi Xu
A1 - Fengguang Tian
A1 - Shumin Guo
KW - visual cluster exploration
AB - The problem of efficient and high-quality clustering of extreme scale datasets with complex clustering structures continues to be one of the most challenging data analysis problems. An innovative use of the data cloud would provide a unique opportunity to address this challenge. In this paper, we propose the CloudVista framework to address (1) the problems caused by using sampling in the existing approaches and (2) the problems with the latency caused by cloud-side processing on interactive cluster visualization. The CloudVista framework aims to explore the entire large data stored in the cloud with the help of the data structure visual frame and the previously developed VISTA visualization model. The latency of processing large data is addressed by the RandGen algorithm that generates a series of related visual frames in the cloud without user intervention, and a hierarchical exploration model supported by cloud-side subset processing. An experimental study shows that this framework is effective and efficient for visually exploring clustering structures for extreme scale datasets stored in the cloud.
JA - Scientific and Statistical Database Management Conference
CY - Portland OR
ER -
TY - CONF
T1 - Computing for Human Experience: Semantics Empowered Cyber-Physical, Social and Ubiquitous Computing Beyond the Web
Y1 - 2011
A1 - Amit Sheth
KW - semantic sensor web
KW - computing for human experience
KW - semantic social web
KW - pervasive computing supported humanity
KW - social-cyber-technical systems
KW - semantic Web of Things
AB - Traditionally, we had to artificially simplify the complexity and richness of the real world to constrained computer models and languages for more efficient computation. Today, devices, sensors, human-in-the-loop participation and social interactions enable something more than a 'human instructs machine' paradigm. Web as a system for information sharing is being replaced by pervasive computing with mobile, social, sensor and device dominated interactions. Correspondingly, computing is moving from targeted tasks focused on improving efficiency and productivity to a vastly richer context that supports events and situational awareness, and enriches human experiences encompassing recognition of rich sets of relationships, events and situational awareness with spatio-temporal-thematic elements, and socio-cultural-behavioral facets. Such progress positions us for what I call an emerging era of 'computing for human experience' (CHE). Four of the key enablers of CHE are: (a) bridging the physical/digital (cyber) divide, (b) elevating levels of abstractions and utilizing vast background knowledge to enable integration of machine and human perception, (c) converting raw data and observations, ranging from sensors to social media, into understanding of events and situations that are meaningful to humans, and (d) doing all of the above at massive scale covering the Web and pervasive computing supported humanity. Semantic Web (conceptual models/ontologies and background knowledge, annotations, and reasoning) techniques and technologies play a central role in important tasks such as building context, integrating online and offline interactions, and helping enhance human experience in their natural environment.
PB - OnTheMove Federated Conferences and Workshops
ER -
TY - JOUR
T1 - Computing Inconsistency Measure based on Paraconsistent Semantics
Y1 - 2011
A1 - Yue Ma
A1 - Guilin Qi
A1 - Pascal Hitzler
AB - Measuring inconsistency in knowledge bases has been recognized as an important problem in several research areas. Many methods have been proposed to solve this problem and a main class of them is based on some kind of paraconsistent semantics. However, existing methods suffer from two limitations: (i) they are mostly restricted to propositional knowledge bases; (ii) very few of them discuss computational aspects of computing inconsistency measures. In this article, we try to solve these two limitations by exploring algorithms for computing an inconsistency measure of first-order knowledge bases. After introducing a four-valued semantics for first-order logic, we define an inconsistency measure of a first-order knowledge base, which is a sequence of inconsistency degrees. We then propose a precise algorithm to compute our inconsistency measure. We show that this algorithm reduces the computation of the inconsistency measure to classical satisfiability checking. This is done by introducing a new semantics, named S[n]-4 semantics, which can be calculated by invoking a classical SAT solver. Moreover, we show that this auxiliary semantics also gives a direct way to compute upper and lower bounds of inconsistency degrees. That is, it can be easily revised to compute approximating inconsistency measures. The approximating inconsistency measures converge to the precise values if enough resources are available. Finally, by some nice properties of the S[n]-4 semantics, we show that some upper and lower bounds can be computed in P-time, which says that the problem of computing these approximating inconsistency measures is tractable.
ER -
TY - CONF
T1 - Contextual Ontology Alignment of LOD With an Upper Ontology: A Case Study With Proton
T2 - 8th Extended Semantic Web Conference, ESWC 2011
Y1 - 2011
A1 - Prateek Jain
A1 - Peter Z. Yeh
A1 - Kunal Verma
A1 - Reymonrod Vasquez
A1 - Mariana Damova
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - Contextual Ontology Alignment
KW - Linked Open Data
KW - Ontology Mapping
KW - Schema Alignment
AB - The Linked Open Data (LOD) is a major milestone towards realizing the Semantic Web vision, and can enable applications such as robust Question Answering (QA) systems that can answer queries requiring multiple, disparate information sources. However, realizing these applications requires relationships at both the schema and instance level, but currently the LOD only provides relationships for the latter. To address this limitation, we present a solution for automatically finding schema-level links between two LOD ontologies -- in the sense of ontology alignment. Our solution, called BLOOMS+, extends our previous solution (i.e. BLOOMS) in two significant ways. BLOOMS+ 1) uses a more sophisticated metric to determine which classes between two ontologies to align, and 2) considers contextual information to further support (or reject) an alignment. We present a comprehensive evaluation of our solution using schema-level mappings from LOD ontologies to Proton (an upper level ontology) -- created manually by human experts for a real world application called FactForge. We show that our solution performed well on this task. We also show that our solution significantly outperformed existing ontology alignment solutions (including our previously published work on BLOOMS) on this same task.
JA - 8th Extended Semantic Web Conference, ESWC 2011
PB - Proceedings of 8th Extended Semantic Web Conference, ESWC 2011
CY - Greece
ER -
TY - JOUR
T1 - DBpedia Spotlight: Shedding Light on the Web of Documents
Y1 - 2011
A1 - Pablo Mendes
A1 - Max Jakob
A1 - Andres Garcia
A1 - Christian Bizer
ER -
TY - ABST
T1 - Demonstration: Real-Time Semantic Analysis of Sensor Streams
Y1 - 2011
A1 - Harshal Patni
A1 - Cory Henson
A1 - Michael Cooney
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - Abstraction
KW - Semantic Sensor Web
KW - Semantic Web
KW - Streaming Sensor data
AB - The emergence of dynamic information sources - including sensor networks - has led to large streams of real-time data on the Web. Research studies suggest that these dynamic networks have created more data in the last three years than in the entire history of civilization, and this trend will only increase in the coming years [1]. With this coming data explosion, real-time analytics software must either adapt or die [2]. This paper focuses on the task of integrating and analyzing multiple heterogeneous streams of sensor data with the goal of creating meaningful abstractions, or features. These features are then temporally aggregated into feature streams. We will demonstrate an implemented framework, based on Semantic Web technologies, that creates feature-streams from sensor streams in real-time, and publishes these streams as Linked Data. The generation of feature streams can be accomplished in reasonable time and results in massive data reduction.
ER -
TY - CONF
T1 - Discovering Dynamic Logical Blog Communities Based on Their Distinct Interest Profiles
T2 - SOTICS 2011
Y1 - 2011
A1 - Neil Fore
A1 - Guozhu Dong
JA - SOTICS 2011
CY - Barcelona, Spain
ER -
TY - ABST
T1 - A Domain Specific Language Based Approach for Developing Complex Cloud Computing Applications
Y1 - 2011
A1 - Ashwin Manjunatha
KW - Cloud Computing
KW - Cloud Mobile Hybrid Application
KW - Domain Specific Language
KW - DSL
KW - Metabolink
KW - Metabolomics
KW - mobi-cloud
KW - Mobicloud
KW - Mobicloud Toolkit
KW - Mobile Computing
KW - SCALE toolkit
AB - Computing has changed. Lately, a slew of cheap, ubiquitous, connected mobile devices, as well as seemingly unlimited, utility-style, pay-as-you-go computing resources, has become available to the common man. The latter, commonly called Cloud Computing (or just Cloud), is democratizing computing by making large computing power accessible to people and corporations around the world easily and economically.

However, taking full advantage of this computing landscape, especially for the data intensive domains, has been hampered by many factors, the primary one being the complexity in developing applications for the variety of available platforms.

This thesis attempts to alleviate many of the issues faced in developing complex Cloud centric applications by using Domain Specific Language (DSL) based methods. The research is focused on two main areas. One area is hybrid applications with mobile device based front-ends and Cloud based back-ends. The other is data and compute intensive biological experiments, exemplified by applying a DSL to metabolomics data analysis. This research investigates the viability of using a DSL in each domain and provides evidence of successful application.
ER -
TY - Generic
T1 - A Domain Specific Language for Enterprise Grade Cloud-Mobile Hybrid Applications
T2 - Proceedings of the 11th Workshop on Domain-Specific Modeling
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Michael Maximilien
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - cloud computing
KW - mobicloud
KW - application development
AB - Cloud computing has changed the technology landscape by offering flexible and economical computing resources to the masses. However, vendor lock-in makes the migration of applications and data across clouds an expensive proposition. The lock-in is especially serious when considering the new technology trend of combining cloud with mobile devices. In this paper, we present a domain specific language (DSL) that is purposely created for generating hybrid applications spanning across mobile devices as well as computing clouds. We propose a model-driven development process that makes use of a DSL to provide sufficient programming abstractions over both cloud and mobile features. We describe the underlying domain modeling strategy as well as the details of our language and the tools supporting our approach.
JA - Proceedings of the 11th Workshop on Domain-Specific Modeling
PB - ACM
CY - Portland, Oregon
ER -
TY - CONF
T1 - An Equivalence Class Based Clustering Algorithm for Categorical Data
T2 - International Conference on Advances in Information Mining and Management
Y1 - 2011
A1 - Qingbao Liu
A1 - Wanjun Wang
A1 - Su Deng
A1 - Guozhu Dong
JA - International Conference on Advances in Information Mining and Management
PB - International Conference on Advances in Information Mining and Management
CY - Barcelona, Spain
ER -
TY - JOUR
T1 - Extending Semantic Provenance into the Web of Data
JF - Provenance in Web Applications
Y1 - 2011
A1 - Jun Zhao
A1 - Satya S. Sahoo
A1 - Paolo Missier
A1 - Amit Sheth
A1 - Carole Goble
KW - Janus ontology framework
KW - Linked Open Data
KW - Provenir ontology
KW - semantic provenance
KW - Web of Data
AB - The importance of tracking and querying the provenance of experimental data for scientific applications is only now beginning to emerge, as a number of provenance management systems reach maturity. In addition to provenance, domain-specific semantic annotations to data products and the Web of Data are playing an increasingly important role as contextual metadata that can be used to assist with the interpretation of experimental data. In this article we use an example workflow, and a simple classification of user questions on the workflow's data products, to explore the combination of these three strands of contextual metadata through a semantic data model and infrastructure, and their potential to support enhanced semantic provenance applications.
ER -
TY - CONF
T1 - Flying under the Radar: Maintaining Control of Kernel without Changing Kernel Code or Persistent Data Structures
Y1 - 2011
A1 - Jinpeng Wei
A1 - Calton Pu
A1 - Keke Chen
AB - Cyber-spies rely on technologies such as rootkits to maintain a stealthy control of the victim kernel. Current techniques can detect changes to kernel code (e.g., SecVisor) and data (e.g., SBCFI), but have difficulties with transient kernel control flow attacks that insert execution requests into interrupt or kernel work queues (K-queues) without changing kernel code or data. Two examples implemented using Linux tasklets illustrate the effectiveness of K-queue attacks: key logger and CPU cycle stealer. Possible defenses to protect the kernel against K-queue attacks are outlined.
ER -
TY - JOUR
T1 - Geometric Data Perturbation for Privacy Preserving Outsourced Data Mining
JF - Journal of Knowledge and Information Systems (KAIS)
Y1 - 2011
A1 - Keke Chen
A1 - Ling Liu
KW - Data mining algorithms
KW - Data perturbation
KW - Geometric data perturbation
KW - Privacy evaluation
KW - Privacy-preserving data mining
AB - Data perturbation is a popular technique in privacy-preserving data mining. A major challenge in data perturbation is to balance privacy protection and data utility, which are normally considered as a pair of conflicting factors. We argue that selectively preserving the task/model specific information in perturbation will help achieve better privacy guarantee and better data utility. One type of such information is the multidimensional geometric information, which is implicitly utilized by many data-mining models. To preserve this information in data perturbation, we propose the Geometric Data Perturbation (GDP) method. In this paper, we describe several aspects of the GDP method. First, we show that several types of well-known data-mining models will deliver a comparable level of model quality over the geometrically perturbed data set as over the original data set. Second, we discuss the intuition behind the GDP method and compare it with other multidimensional perturbation methods such as random projection perturbation. Third, we propose a multi-column privacy evaluation framework for evaluating the effectiveness of geometric data perturbation with respect to different levels of attacks. Finally, we use this evaluation framework to study a few attacks on geometrically perturbed data sets. Our experimental study also shows that geometric data perturbation can not only provide satisfactory privacy guarantee but also preserve modeling accuracy well.
ER -
TY - JOUR
T1 - Guest Editors' Introduction: Provenance in Web Applications
JF - IEEE Internet Computing
Y1 - 2011
A1 - Geetika T. Lakshmanan
A1 - Juliana Freire
A1 - Francisco Curbera
A1 - Amit Sheth
KW - provenance
KW - Web
AB - Provenance, from the French word provenir, 'to come from,' means the origin, or the source, of something, or the history of an object's ownership or location. A digital object's provenance (also referred to as its audit trail or lineage) contains information about the process and data used to derive the object. Provenance provides important documentation that's vital to preserving data, determining data quality and authorship, and reproducing and validating results. As increasing volumes of data are shared and modified over the Web, it's crucial to track their provenance for business, scientific, and social networking applications. This special issue provides a snapshot of ongoing work in this area.
ER -
TY - JOUR
T1 - Guest Editors' Introduction: Provenance in Web Applications
Y1 - 2011
A1 - Geetika T. Lakshmanan
A1 - Juliana Freire
A1 - Amit Sheth
KW - provenance
KW - Web
AB - Provenance, from the French word provenir, 'to come from,' means the origin, or the source, of something, or the history of an object's ownership or location. A digital object's provenance (also referred to as its audit trail or lineage) contains information about the process and data used to derive the object. Provenance provides important documentation that's vital to preserving data, determining data quality and authorship, and reproducing and validating results. As increasing volumes of data are shared and modified over the Web, it's crucial to track their provenance for business, scientific, and social networking applications. This special issue provides a snapshot of ongoing work in this area.
ER -
TY - CONF
T1 - Identifying and Implementing the Underlying Operators for Nuclear Magnetic Resonance Based Metabolomics Data Analysis
T2 - 3rd International Conference on Bioinformatics and Computational Biology (BICoB-2011)
Y1 - 2011
A1 - Ashwin Manjunatha
A1 - Paul Anderson
A1 - Ajith Ranabahu
A1 - Amit Sheth
KW - Domain Specific Languages
KW - Metabolomics
KW - Nuclear Magnetic Resonance
KW - PIGLatin
KW - Program generation
AB - The science of metabolomics is a relatively young field that requires intensive signal processing and multivariate data analysis for interpretation of experimental results. The lack of integration and standardization for metabolomics, compounded by the complexity of the experimental data, has led to a fragmented research community. While efforts have been undertaken to approach these problems, the efforts to develop a set of standards for reporting processing and analysis procedures have stalled. In this paper, we propose a set of fundamental operators for nuclear magnetic resonance (NMR) based metabolomics. These operators are implementation independent, and can be used to easily and precisely describe the processing and analysis steps that led to research conclusions. This formalization can facilitate inter-lab communication, and due to its simplicity, it is easily adapted by the metabolomics community. A Domain Specific Language (DSL) is also included to demonstrate an implementation of these operators. The DSL is simple, convenient for a domain scientist, and can be easily transformed into multiple target platforms.
JA - 3rd International Conference on Bioinformatics and Computational Biology (BICoB-2011)
CY - New Orleans, Louisiana
ER -
TY - CONF
T1 - Kino: A Generic Document Management System for Biologists Using SA-REST and Faceted Search
T2 - Fifth IEEE International Conference on Semantic Computing (ICSC)
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Priti Parikh
A1 - Maryam Panahiazar
A1 - Amit Sheth
A1 - Flora Logan-Klumpler
KW - APIHut
KW - Kino
KW - SA-REST
KW - scientific annotation
KW - semantic annotation of biomedical documents
KW - semantic annotation of document
KW - semantic service annotation
JA - Fifth IEEE International Conference on Semantic Computing (ICSC)
PB - Proceedings of the 5th IEEE International Conference on Semantic Computing
CY - Palo Alto, CA
ER -
TY - CONF
T1 - The Knowledge Driven Exploration of Integrated Biomedical Knowledge Sources Facilitates the Generation of New Hypotheses
T2 - 1st International Conference on Linked Science
Y1 - 2011
A1 - Vinh Nguyen
A1 - Olivier Bodenreider
A1 - Todd Minning
A1 - Amit Sheth
KW - Semantic Web
KW - Knowledge Exploration
AB - Knowledge gained from the scientific literature can complement newly obtained experimental data in helping researchers understand the pathological processes underlying diseases. However, unless the scientific literature and experimental data are semantically integrated, it is generally difficult for scientists to exploit the two sources effectively. We argue that, in addition to the semantic integration of heterogeneous knowledge sources, the usability of the integrated resource by scientists is dependent upon the availability of knowledge visualization and exploration tools. Moreover, the integration techniques must be scalable and the exploration interfaces must be easy to use by bench scientists. The end goal of such integrated knowledge sources and exploration tools is to enable scientists to generate novel hypotheses from the knowledge they explore. We tested the feasibility of our approach on a real use case in the domain of human health and parasite biology. On the one hand, we integrated the experimental data generated as part of ongoing research on Chagas disease with the knowledge extracted from the PubMed articles, using Semantic Web technologies. On the other hand, we developed iExplore, a web tool with a graphical interface for interactive knowledge exploration, that allows non-technical users to explore the integrated knowledge base using a relationship-focused approach. We illustrate the effectiveness of our approach by describing the knowledge-driven process of using iExplore to generate a new hypothesis for the treatment of Chagas disease.
JA - 1st International Conference on Linked Science
PB - Linked Science workshop, ISWC
CY - Bonn, Germany
ER -
TY - CONF
T1 - Local Closed World Reasoning: Grounded Circumscription for OWL
T2 - The Semantic Web - ISWC 2011 - 10th International Semantic Web Conference, Bonn, Germany, October 23-27, 2011
Y1 - 2011
A1 - Kunal Sengupta
A1 - Adila Alfa Krisnadhi
A1 - Pascal Hitzler
AB - We present a new approach to adding closed world reasoning to the Web Ontology Language OWL. It transcends previous work on circumscriptive description logics which had the drawback of yielding an undecidable logic unless severe restrictions were imposed. In particular, it was not possible, in general, to apply local closure to roles. In this paper, we provide a new approach, called grounded circumscription, which is applicable to SROIQ and other description logics around OWL without these restrictions. We show that the resulting language is decidable, and we derive an upper complexity bound. We also provide a decision procedure in the form of a tableaux algorithm.
JA - The Semantic Web - ISWC 2011 - 10th International Semantic Web Conference, Bonn, Germany, October 23-27, 2011
PB - Proceedings, Part I. Lecture Notes in Computer Science Vol. 7031, Springer, Heidelberg, 2011
CY - Bonn, Germany
ER -
TY - CONF
T1 - Local closed world semantics: grounded circumscription for description logics
T2 - Web Reasoning and Rule Systems. 5th International Conference, RR 2011, Galway, Ireland, August 29-30, 2011
Y1 - 2011
A1 - Kunal Sengupta
A1 - Adila Alfa Krisnadhi
A1 - Pascal Hitzler
KW - circumscription
KW - Description Logic
KW - local closed world
AB - We present an improved local closed world extension for description logics. It is based on circumscription, and deviates from previous circumscriptive description logics [1,3] in that extensions of minimized predicates may contain only extensions of named individuals in the knowledge base. Besides an (arguably) higher intuitive appeal, the improved semantics is applicable to expressive description logics without loss of decidability.
JA - Web Reasoning and Rule Systems. 5th International Conference, RR 2011, Galway, Ireland, August 29-30, 2011
PB - Lecture Notes in Computer Science Vol. 6902, Springer, Heidelberg, 2011
CY - Galway, Ireland
ER -
TY - CONF
T1 - Local Closed World Semantics: Keep it simple, stupid!
T2 - Local Closed World Semantics: Keep it simple, stupid!
Y1 - 2011
A1 - Adila Alfa Krisnadhi
A1 - Kunal Sengupta
A1 - Pascal Hitzler
KW - circumscription
KW - closed world
KW - decidability
KW - Description Logic
JA - Local Closed World Semantics: Keep it simple, stupid!
ER -
TY - JOUR
T1 - Local Closed-World Reasoning with Description Logics under the Well-founded Semantics
Y1 - 2011
A1 - Matthias Knorr
A1 - Jose Julio Alferes
A1 - Pascal Hitzler
KW - Description logics and ontologies
KW - Knowledge Representation
KW - logic programming
KW - Non-monotonic reasoning
KW - Semantic Web
AB - An important question for the upcoming Semantic Web is how to best combine open world ontology languages, such as the OWL-based ones, with closed world rule-based languages. One of the most mature proposals for this combination is known as hybrid MKNF knowledge bases (Motik and Rosati, 2010 [52]), and it is based on an adaptation of the Stable Model Semantics to knowledge bases consisting of ontology axioms and rules. In this paper we propose a well-founded semantics for nondisjunctive hybrid MKNF knowledge bases that promises to provide better efficiency of reasoning, and that is compatible with both the OWL-based semantics and the traditional Well-Founded Semantics for logic programs. Moreover, our proposal allows for the detection of inconsistencies, possibly occurring in tightly integrated ontology axioms and rules, with only little additional effort. We also identify tractable fragments of the resulting language.
ER -
TY - CONF
T1 - Local graph sparsification for scalable clustering
T2 - ACM SIGMOD International Conference on Management of data
Y1 - 2011
A1 - Venu Satuluri
A1 - Srinivasan Parthasarathy
A1 - Yiye Ruan
KW - Graph Clustering
KW - Graph Sparsification
KW - Minwise Hashing
AB - In this paper we look at how to sparsify a graph, i.e., how to reduce the edge set while keeping the nodes intact, so as to enable faster graph clustering without sacrificing quality. The main idea behind our approach is to preferentially retain the edges that are likely to be part of the same cluster. We propose to rank edges using a simple similarity-based heuristic that we efficiently compute by comparing the minhash signatures of the nodes incident to the edge. For each node, we select the top few edges to be retained in the sparsified graph. Extensive empirical results on several real networks and using four state-of-the-art graph clustering and community discovery algorithms reveal that our proposed approach realizes excellent speedups (often in the range 10-50), with little or no deterioration in the quality of the resulting clusters. In fact, for at least two of the four clustering algorithms, our sparsification consistently enables higher clustering accuracies.
JA - ACM SIGMOD International Conference on Management of data
CY - Athens, Greece
ER -
TY - CONF
T1 - MapSSS results for OAEI 2011
T2 - MapSSS results for OAEI 2011
Y1 - 2011
A1 - Michelle Cheatham
JA - MapSSS results for OAEI 2011
ER -
TY - CONF
T1 - Multi-field Visualization for Biomedical Data Sets
Y1 - 2011
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Nominal Schemas for Integrating Rules and Description Logics
Y1 - 2011
A1 - Markus Krotzsch
A1 - Frederick Maier
A1 - Adila Alfa Krisnadhi
A1 - Pascal Hitzler
AB - We propose an extension of SROIQ with nominal schemas which can be used like "variable nominal concepts" within axioms. This feature allows us to express arbitrary DL-safe rules in description logic syntax. We show that adding nominal schemas to SROIQ does not increase its worst-case reasoning complexity, and we identify a family of tractable DLs SROELVn that allow for restricted use of nominal schemas.
ER -
TY - JOUR
T1 - An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web
JF - Applied Ontology
Y1 - 2011
A1 - Cory Henson
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - observation
KW - Ontology
KW - Perception
KW - Sensor
AB - Today, many sensor networks and their applications employ a brute force approach to collecting and analyzing sensor data. Such an approach often wastes valuable energy and computational resources by unnecessarily tasking sensors and generating observations of minimal use. People, on the other hand, have evolved sophisticated mechanisms to efficiently perceive their environment. One such mechanism is the use of background knowledge to determine which aspects of the environment to focus attention on. In this paper, we develop an ontology of perception, IntellegO, that may be used to more efficiently convert observations into perceptions. IntellegO is derived from cognitive theory, encoded in set theory, and provides a formal semantics of machine perception. We then present an implementation that iteratively and efficiently processes low-level, heterogeneous sensor data into knowledge through use of the perception ontology and domain-specific background knowledge. Finally, we evaluate IntellegO by collecting and analyzing observations of weather conditions on the Web, and show significant resource savings in the generation and storage of perceptual knowledge.
ER -
TY - CONF
T1 - Overview of Contrast Data Mining as a Field and Preview of an Upcoming Book
T2 - 11th IEEE International Conference on Data Mining Workshops
Y1 - 2011
A1 - Guozhu Dong
A1 - James Bailey
JA - 11th IEEE International Conference on Data Mining Workshops
PB - ICDM 2011
CY - Las Vegas, NV
ER -
TY - ABST
T1 - OWL and Rules
Y1 - 2011
A1 - Adila Alfa Krisnadhi
A1 - Frederick Maier
A1 - Pascal Hitzler
AB - The relationship between the Web Ontology Language OWL and rule-based formalisms has been the subject of many discussions and research investigations, some of them controversial. From the many attempts to reconcile the two paradigms, we present some of the newest developments. More precisely, we show which kind of rules can be modeled in the current version of OWL, and we show how OWL can be extended to incorporate rules without compromising OWL design principles.
PB - Reasoning Web. Semantic Technologies for the Web of Data. 7th International Summer School 2011, Galway, Ireland, August 23-27, 2011
ER -
TY - CONF
T1 - Paraconsistent Semantics for Hybrid MKNF Knowledge Bases
T2 - Web Reasoning and Rule Systems. 5th International Conference, RR 2011, Galway, Ireland, August 29-30, 2011
Y1 - 2011
A1 - Shasha Huang
A1 - Qingguo Li
A1 - Pascal Hitzler
AB - Hybrid MKNF knowledge bases, originally based on the stable model semantics, are a mature method of combining rules and Description Logics (DLs). The well-founded semantics for such knowledge bases has been proposed subsequently for better efficiency of reasoning. However, integration of rules and DLs may give rise to inconsistencies, even if they are respectively consistent. Accordingly, reasoning systems based on the previous two semantics will break down. In this paper, we employ the four-valued logic proposed by Belnap, and present a paraconsistent semantics for Hybrid MKNF knowledge bases, which can detect inconsistencies and handle them effectively. Besides, we transform our proposed semantics to the stable model semantics via a linear transformation operator, which indicates that the data complexity in our paradigm is not higher than that of classical reasoning. Moreover, we provide a fixpoint algorithm for computing paraconsistent MKNF models.
JA - Web Reasoning and Rule Systems. 5th International Conference, RR 2011, Galway, Ireland, August 29-30, 2011
PB - Proceedings. Lecture Notes in Computer Science Vol. 6902, Springer, Heidelberg, 2011
CY - Galway, Ireland
ER -
TY - CONF
T1 - Personalized Filtering of the Twitter Stream
T2 - ISWC 2011
Y1 - 2011
A1 - Pavan Kapanipathi
A1 - Fabrizio Orlandi
A1 - Amit Sheth
A1 - Alexandre Passant
KW - Semantic Web
KW - Twitter
KW - PubSubHubbub
KW - Social Network
KW - User Profiling
KW - Personalization
AB - With the rapid growth in users on social networks, there is a corresponding increase in user-generated content, in turn resulting in information overload. On Twitter, for example, users tend to receive uninteresting information because their interests do not overlap with those of the people whom they follow. In this paper we present a Semantic Web approach to filter public tweets matching interests from personalized user profiles. Our approach includes automatic generation of multi-domain and personalized user profiles, filtering the Twitter stream based on the generated profiles and delivering the matching tweets in real time. Given that users' interests and personalization needs change with time, we also discuss how our application can adapt to these changes.
JA - ISWC 2011
PB - International Semantic Web Conference
CY - Bonn, Germany
ER -
TY - CONF
T1 - Privacy-Aware and Scalable Content Dissemination in Distributed Social Networks
T2 - International Semantic Web Conference InUSe
Y1 - 2011
A1 - Pavan Kapanipathi
A1 - Julia Anaya
A1 - Amit Sheth
A1 - Brett Slatkin
A1 - Alexandre Passant
KW - Distributed Social Network
KW - FOAF
KW - Privacy
KW - PubSubHubbub
KW - Semantic Web
KW - Social Web
AB - Centralized social networking websites raise scalability issues, due to the growing number of participants, as well as policy concerns such as control, privacy and ownership over the user's published data. Distributed Social Networks aim to solve this issue by enabling an architecture where people own their data and share it their own way. However, the privacy and scalability challenge is still to be tackled. This paper presents a privacy-aware extension to Google's PubSubHubbub protocol, using Semantic Web technologies, solving both the scalability and the privacy issues in Distributed Social Networks. We enhanced the traditional feature of the PubSubHubbub protocol, which decouples publishers and subscribers, in order to allow publishers to decide whom they want to share their information with. We also present a use case of applying this to SMOB (our Semantic Microblogging framework). Our architecture is application agnostic, and can be adopted by any system that requires scalable and privacy-aware content broadcasting.
JA - International Semantic Web Conference InUSe
CY - Bonn, Germany
ER -
TY - CONF
T1 - Privacy-By-Design in Federated Social Web Applications
T2 - Web Science 2011
Y1 - 2011
A1 - Alexandre Passant
A1 - Owen Sacco
A1 - Julia Anaya
A1 - Pavan Kapanipathi
KW - privacy and web applications
JA - Web Science 2011
CY - Koblenz, Germany
ER -
TY - JOUR
T1 - Ranking Function Adaptation With Boosting Trees
JF - ACM Transactions on Information Systems
Y1 - 2011
A1 - Keke Chen
A1 - Jing Bai
A1 - Zhaohui Zheng
KW - boosting regression trees
KW - domain adaptation
KW - learning to rank
KW - user feedback
KW - web search ranking
AB - Machine learned ranking functions have shown successes in web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, i.e., the problem of insufficient labeled training data, which has significantly slowed the development and deployment of machine learned ranking functions for different domains. There are two possible approaches to address this problem: (1) combining labeled training data from similar domains with the small target-domain labeled data for training or (2) using pairwise preference data extracted from user clickthrough logs for the target domain for training. In this paper, we propose a new approach called tree-based ranking function adaptation ('Trada') to effectively utilize these data sources for training cross-domain ranking functions. Tree adaptation assumes that ranking functions are trained with the Stochastic Gradient Boosting Trees method, a gradient boosting method on regression trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. Its unique features are that (1) it can automatically identify the part of the model that needs adjustment for the new domain, and (2) it can appropriately weigh training examples considering both local and global distributions. Based on a novel pairwise loss function that we developed for pairwise learning, the basic tree adaptation algorithm is also extended ('Pairwise Trada') to utilize the pairwise preference data from the target domain to further improve the effectiveness of adaptation. Experiments are performed on real datasets to show that tree adaptation can provide better-quality ranking functions for a new domain than other methods.
ER -
TY - CONF
T1 - RASP: Efficient Multidimensional Range Query on Attack-Resilient Encrypted Databases
T2 - ACM Conference on Data and Application Security and Privacy (CODASPY) 2011
Y1 - 2011
A1 - Keke Chen
A1 - Ramakanth Kavuluru
A1 - Shumin Guo
KW - Attack Analysis
KW - Multidimensional Range Query
KW - Outsourced Database
KW - Random Space Encryption
AB - Range query is one of the most frequently used queries for online data analytics. Providing such a query service could be expensive for the data owner. With the development of services computing and cloud computing, it has become possible to outsource large databases to database service providers and let the providers maintain the range-query service. With outsourced services, the data owner can greatly reduce the cost of maintaining computing infrastructure and data-rich applications. However, the service provider, although honestly processing queries, may be curious about the hosted data and received queries. Most existing encryption-based approaches require a linear scan over the entire database, which is inappropriate for online data analytics on large databases. While a few encryption solutions focus more on efficiency, they are vulnerable to attackers equipped with certain prior knowledge. We propose the Random Space Encryption (RASP) approach that allows efficient range search with stronger attack resilience than existing efficiency-focused approaches. We use RASP to generate indexable auxiliary data that is resilient to prior-knowledge-enhanced attacks. Range queries are securely transformed to the encrypted data space and then efficiently processed with a two-stage processing algorithm. We thoroughly studied the potential attacks on the encrypted data and queries at three different levels of prior knowledge available to an attacker. Experimental results on synthetic and real datasets show that this encryption approach allows efficient processing of range queries with high resilience to attacks.
JA - ACM Conference on Data and Application Security and Privacy (CODASPY) 2011
CY - San Antonio, TX
ER -
TY - ABST
T1 - Real Time Semantic Analysis of Streaming Sensor Data
Y1 - 2011
A1 - Harshal Patni
KW - Semantic Analysis
KW - Semantic Sensor Web
KW - Sensor Web Enablement
KW - Streaming Sensor data
AB - The emergence of dynamic information sources such as social, mobile, and sensor networks has led to enormous streams of real-time data on the Web, a development also called the era of Big Data [1]. Research studies suggest that these dynamic networks have created more data in the last three years than in the entire history of civilization, and this trend will only increase in the coming years [1]. Figure 1 shows how the total information generated by these dynamic information sources has completely surpassed the total storage capacity. Keeping in mind the problem of ever-increasing data, this thesis focuses on semantically integrating and analyzing multiple, multimodal, heterogeneous streams of weather data with the goal of creating meaningful thematic abstractions in real time. This is accomplished by implementing an infrastructure for creating and mining thematic abstractions over massive amounts of real-time sensor streams. The evaluation section shows a 69% data reduction with this approach.
ER -
TY - ABST
T1 - Reconciling OWL and Rules
Y1 - 2011
A1 - David Carral Martinez
A1 - Adila Alfa Krisnadhi
A1 - Frederick Maier
A1 - Kunal Sengupta
A1 - Pascal Hitzler
KW - OWL
KW - description logic
KW - decidability
KW - local closed world
KW - reasoning algorithms
KW - rules
KW - datalog
AB - We report on a recent advance in integrating Rules and OWL. We discuss a recent proposal, known as nominal schemas, which realizes a seamless integration of Datalog rules into the description logic SROIQ which underlies OWL 2 DL. We present extensions of the standardized OWL syntaxes to incorporate nominal schemas, reasoning algorithms, and a first naive implementation. We also argue why this approach goes a long way towards overcoming the present paradigm split.
ER -
TY - CONF
T1 - Representation of Parsimonious Covering Theory in OWL-DL
T2 - 8th International Workshop on OWL: Experiences and Directions (OWLED 2011)
Y1 - 2011
A1 - Cory Henson
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Pascal Hitzler
KW - OWL
KW - Abductive Reasoning
KW - Parsimonious Covering Theory
AB - The Web Ontology Language has not been designed for representing abductive inference, which is often required for applications such as medical disease diagnosis. As a consequence, existing OWL ontologies have limited ability to encode knowledge for such applications. In the last 150 years, many logic frameworks for the representation of abductive inference have been developed. Among these frameworks, Parsimonious Covering Theory (PCT) has achieved wide recognition. PCT is a formal model of diagnostic reasoning in which knowledge is represented as a network of causal associations, and whose goal is to account for observed symptoms with plausible explanatory hypotheses. In this paper, we argue that OWL does provide some of the expressivity required to approximate diagnostic reasoning, and outline a suitable encoding of PCT in OWL-DL.
JA - 8th International Workshop on OWL: Experiences and Directions (OWLED 2011)
CY - San Francisco, California
ER -
TY - CONF
T1 - SECURE: Semantics Empowered Rescue Environment (Demonstration Paper)
T2 - International Semantic Web Conference (ISWC 2011)
Y1 - 2011
A1 - Pratikkumar Desai
A1 - Cory Henson
A1 - Pramod Anantharam
A1 - Amit Sheth
KW - Semantic Sensor Web
KW - wireless sensor network
KW - rescue environment
KW - robotics
KW - sensor
KW - abstraction
AB - This paper demonstrates a Semantic Web enabled system for collecting and processing sensor data within a rescue environment. The real-time system collects heterogeneous raw sensor data from rescue robots through a wireless sensor network. The raw sensor data is converted to RDF using the Semantic Sensor Network (SSN) ontology and further processed to generate abstractions used for event detection in emergency scenarios.
JA - International Semantic Web Conference (ISWC 2011)
PB - 4th International Workshop on Semantic Sensor Networks 2011 (SSN 2011)
CY - Bonn, Germany
ER -
TY - ABST
T1 - SemPuSH: Privacy-Aware and Scalable Broadcasting for Semantic Microblogging
Y1 - 2011
A1 - Pavan Kapanipathi
A1 - Julia Anaya
A1 - Alexandre Passant
KW - Semantic Web
KW - Social Web
KW - Privacy
KW - PubSubHubbub
AB - Users of traditional microblogging platforms such as Twitter face drawbacks in terms of (1) privacy of status updates as a followee (reaching undesired people) and (2) information overload as a follower (receiving uninteresting microposts from followees). In this paper we demonstrate distributed and user-controlled dissemination of microposts using SMOB (semantic microblogging framework) and Semantic Hub (privacy-aware implementation of the PuSH protocol). The approach leverages users' Social Graph to dynamically create groups of followers who are eligible to receive a micropost. The restrictions used to create the groups are provided by the followee based on the hashtags in the micropost. Both SMOB and Semantic Hub are available as open source.
ER -
TY - CONF
T1 - Semantic Annotation and Search for resources in the next Generation Web with SA-REST
Y1 - 2011
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - Maryam Panahiazar
A1 - Sanjaya Wijeratne
KW - SA-REST
KW - Service discovery
KW - Annotations
KW - Microdata
AB - SA-REST, the W3C member submission, can be used for supporting a wide variety of Plain Old Semantic HTML (POSH) annotation capabilities on any type of Web resource. The Kino framework and tools provide the capabilities needed to realize SA-REST's promised value. These tools include (a) a browser plugin to support annotation of a Web resource (including services) with respect to an ontology, domain model or vocabulary, (b) an annotation-aware indexing engine and (c) faceted search and selection of the Web resources. At one end of the spectrum, we present KinoE (aka Kino for Enterprise), which uses NCBO formal ontologies and associated services for searching ontologies and mappings, for annotating RESTful services and Web APIs, which are then used to support faceted search. At the other end of the spectrum, we present KinoW (aka Kino for the Web), capable of adding SA-REST or Microdata annotations to Web pages, using Schema.org as a model and Linked Open Data (LOD) as a knowledge base. We also present two use cases based on KinoE and the benefits to data and service integration enabled through this annotation approach.
PB - W3C Workshop on Data and Services Integration
ER -
TY - Generic
T1 - Semantic Predications for Complex Information Needs in Biomedical Literature
T2 - 2011 IEEE International Conference on Bioinformatics and Biomedicine
Y1 - 2011
A1 - Delroy Cameron
A1 - Pablo Mendes
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
A1 - Ramakanth Kavuluru
A1 - Olivier Bodenreider
KW - background knowledge
KW - literature-based discovery
KW - question answering
KW - semantic predications
KW - text mining
AB - Many complex information needs that arise in biomedical disciplines require exploring multiple documents in order to obtain information. While traditional information retrieval techniques that return a single ranked list of documents are quite common for such tasks, they may not always be adequate. The main issue is that ranked lists typically impose a significant burden on users to filter out irrelevant documents. Additionally, users must intuitively reformulate their search query when relevant documents have not been highly ranked. Furthermore, even after interesting documents have been selected, very few mechanisms exist that enable document-to-document transitions. In this paper, we demonstrate the utility of assertions extracted from biomedical text (called semantic predications) to facilitate retrieving relevant documents for complex information needs. Our approach offers an alternative to query reformulation by establishing a framework for transitioning from one document to another. We evaluate this novel knowledge-driven approach using precision and recall metrics on the 2006 TREC Genomics Track.
JA - 2011 IEEE International Conference on Bioinformatics and Biomedicine
PB - IEEE
CY - Atlanta, GA
ER -
TY - CONF
T1 - Semantic Predications for Complex Information Needs in Biomedical Literature
T2 - 5th International Conference on Bioinformatics and Biomedicine (BIBM2011)
Y1 - 2011
A1 - Delroy Cameron
A1 - Ramakanth Kavuluru
A1 - Olivier Bodenreider
A1 - Pablo Mendes
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - background knowledge
KW - literature-based discovery
KW - question answering
KW - semantic predications
KW - text mining
AB - Many complex information needs that arise in biomedical disciplines require exploring multiple documents in order to obtain information. While traditional information retrieval techniques that return a single ranked list of documents are quite common for such tasks, they may not always be adequate. The main issue is that ranked lists typically impose a significant burden on users to filter out irrelevant documents. Additionally, users must intuitively reformulate their search query when relevant documents have not been highly ranked. Furthermore, even after interesting documents have been selected, very few mechanisms exist that enable document-to-document transitions. In this paper, we demonstrate the utility of assertions extracted from biomedical text (called semantic predications) to facilitate retrieving relevant documents for complex information needs. Our approach offers an alternative to query reformulation by establishing a framework for transitioning from one document to another. We evaluate this novel knowledge-driven approach using precision and recall metrics on the 2006 TREC Genomics Track.
JA - 5th International Conference on Bioinformatics and Biomedicine (BIBM2011)
PB - 2011 IEEE International Conference on Bioinformatics and Biomedicine
CY - Atlanta, GA
ER -
TY - RPRT
T1 - Semantic Sensor Network XG Final Report
Y1 - 2011
A1 - Payam Barnaghi
A1 - Michael Compton
A1 - Oscar Corcho
A1 - Raul Garcia Castro
A1 - John Graybeal
A1 - Arthur Herzog
A1 - Krzysztof Janowicz
A1 - Holger Neuhaus
A1 - Andriy Nikolov
A1 - Kevin Page
ED - Laurent Lefort
ED - Cory Henson
ED - Kerry Taylor
KW - Ontology
KW - Semantic Sensor Network
KW - Sensor
AB - This document is the final report of the W3C Semantic Sensor Network Incubator Group. The group had two main objectives. The first was to develop an ontology to describe sensors and sensor networks for use in sensor network and sensor web applications. The second was to study and recommend methods for using the ontology to semantically enable applications developed according to available standards such as the Open Geospatial Consortium's (OGC) Sensor Web Enablement (SWE) standards. The Sensor and Sensor Network ontology presented here, known as the SSN ontology, answers the need for a domain-independent and end-to-end model for sensing applications by merging sensor-focused (e.g. SensorML), observation-focused (e.g. Observation & Measurement) and system-focused views. It covers the sub-domains which are sensor-specific such as the sensing principles and capabilities and can be used to define how a sensor will perform in a particular context to help characterise the quality of sensed data or to better task sensors in unpredictable environments. Although the ontology leaves the observed domain unspecified, domain semantics, units of measurement, time and time series, and location and mobility ontologies can be easily attached when instantiating the ontology for any particular sensors in a domain. The alignment between the SSN ontology and the DOLCE Ultra Lite upper ontology has helped to normalise the structure of the ontology to assist its use in conjunction with ontologies or linked data resources developed elsewhere. While the OGC SWE standards provide description and access to data and metadata for sensors, they do not provide facilities for abstraction, categorization, and reasoning offered by semantic technologies. This report presents a semantic annotation method defined by the XG.
This method should help the users of OGC standards to retrofit XML-based web services to better support semantic mashups and to ease the integration with linked open data applications relying on semantic web technologies like RDF and SPARQL. Finally, this report identifies where ongoing research and standardisation efforts are likely to benefit from the work done by this Incubator Activity and also where further work is required. Three directions for future work have been identified: (1) standardise the SSN ontology to use it in a Linked Sensor Data context, (2) standardise the SSN ontology to bridge the Internet of Things and the Internet of Services, (3) foster the adoption of the SSN ontology in the OGC community. More than 25 papers discussing applications of the SSN XG work have been published by the XG participants and by early adopters who were not directly involved in the ontology development but were closely following it via the mailing list and the wiki. This is a clear signal that there is a good prospect for the creation of a W3C community group focused on the maintenance and extension of the SSN Ontology. This is the first recommendation issued by the group. The group's decision to align the SSN Ontology with DOLCE Ultra Lite has raised specific challenges for the publication, packaging and maintenance of the SSN Ontology which are not frequently addressed by other W3C groups publishing recommendations focusing on ontologies. The second recommendation is to encourage further work on these issues at W3C. The third recommendation made by the group is to encourage its participants and followers to join the Provenance working group to work on sensor-specific issues. The fourth recommendation requests the W3C to promote uptake of the SSN ontology in the W3C community and beyond.
The final recommendation of this report is to encourage W3C and OGC to coordinate their activities in this area and especially to build a larger pool of experts to tackle the challenges linked to differences between the modelling approach used by the group, based on OWL, and the modelling principles currently applied within the OGC community.
ER -
TY - BOOK
T1 - Semantic Services, Interoperability and Web Applications: Emerging Concepts
Y1 - 2011
A1 - Amit Sheth
AB - Semantic Web technology is of fundamental interest to researchers in a number of fields, including information systems and Web design, as continued advancements in this discipline impact other, related fields of study. Semantic Services, Interoperability and Web Applications: Emerging Concepts offers suggestions, solutions, and recommendations for new and emerging research in Semantic Web technology. Focusing broadly on methods and techniques for making the Web more useful and meaningful, this book unites discussions on Semantic Web directions, issues, and challenges.
PB - IGI Global
CY - New York
ER -
TY - CONF
T1 - Semantic Social Mashup approach for Designing Citizen Diplomacy
Y1 - 2011
A1 - Amit Sheth
KW - Semantic Social Web
KW - Semantic Social Mashup
KW - Designing Citizen Diplomacy
KW - socio-technical systems
AB - Advancement in technology has brought exceptional connectivity and easy, open access to communication media via the Internet. Every day millions of people communicate interactively with each other and share multimedia content through social media and social networks, using Web-based and mobile-based technologies. Social media provide a variety of interesting, engaging applications such as Twitter, Facebook, YouTube, Flickr, and blogs. People interested in contributing to global welfare and improving humanity are connected to various NGOs like Red Cross, Ushahidi (www.ushahidi.com), eMoksha (emoksha.org), etc. Social media and NGOs act as an excellent medium of communication and sharing, connecting diverse people irrespective of their nationality, religion, culture, etc. Social media and mobile communication are natural tools to support citizen diplomacy; they have played a pivotal role in activism for governance, democracy and other causes, as was demonstrated in the Iran election, the Haiti earthquake, and the Tunisia crisis [1].
PB - NSF Workshop on Designing Citizen Diplomacy
ER -
TY - Generic
T1 - Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Interoperability
Y1 - 2011
A1 - Ora Lassila
A1 - Amit Sheth
ER -
TY - JOUR
T1 - Semantic Web surveys and applications
Y1 - 2011
A1 - Krzysztof Janowicz
A1 - Pascal Hitzler
ER -
TY - JOUR
T1 - Semantic Web Tools and Systems
JF - Semantic Web – Interoperability, Usability, Applicability
Y1 - 2011
A1 - Pascal Hitzler
A1 - Krzysztof Janowicz
AB - Semantic Web research relies on a number of key methodologies such as knowledge representation languages or reasoning algorithms. As a research community, however, we could not progress based on these methodologies exclusively, but require tools and systems that realize our research results as key technologies for the Semantic Web.
PB - IOS Press
ER -
TY - JOUR
T1 - Semantics in Location-Based Services [Guest editor's introduction]
Y1 - 2011
A1 - Sergio Ilarri
A1 - Arantza Ilarramendi
A1 - Eduardo Mena
A1 - Amit Sheth
KW - Internet
KW - Mobile radio mobility management
KW - Semantic Web
KW - Semantics
KW - Special issues and sections
KW - Web and internet services
AB - Advances in wireless networks and mobile devices have motivated an intensive research effort in mobile computing and mobile data services. The goal is to provide users with anywhere, anytime connectivity to the Internet and access to services that are valuable depending on users' context. One of the most interesting context factors is user location. Thus, context-awareness, and particularly location-based services (LBSs), have attracted great interest. Moreover, rapid progress in developing Semantic Web standards, tools, and techniques, including applications that are rich in spatial and temporal dimensions, and the vision of the geospatial Semantic Web enable the development of more intelligent LBSs.
ER -
TY - JOUR
T1 - Semantics Scales Up: Beyond Search in Web 3.0
JF - IEEE Internet Computing
Y1 - 2011
A1 - Amit Sheth
KW - computing for human experience
KW - scaling semantics
KW - Semantic Search
KW - semantics empowered cyber-physical systems
KW - semantics in Web 3.0
KW - semantics-empowered physical-virtual systems
AB - Concern for scalability - both in computational terms and in terms of human effort needed to develop semantic models and background knowledge - has hampered the adoption of semantic techniques and the Semantic Web. This concern is misplaced given the extensive progress the past decade has seen on standards, methods, and technologies for developing semantic models or ontologies, semantic annotations, and techniques for semantic integration, analysis, and reasoning. Such progress is complemented by myriad recent success stories that use semantics in broad-based applications such as Web search, as well as in a growing number of vertical domains. As the future of computing expands beyond cyberspace to cyber-physical-social computing, with extensive growth in social and sensor data, semantics will play an even larger and more pervasive role in exploiting larger amounts of increasingly heterogeneous and multimodal data.
VL - 15
CP - 6
ER -
TY - CONF
T1 - SemPuSH: Privacy-Aware and Scalable Broadcasting for Semantic Microblogging
T2 - 10th International Semantic Web Conference
Y1 - 2011
A1 - Pavan Kapanipathi
A1 - Julia Anaya
A1 - Alexandre Passant
KW - Semantic Web
KW - Distributed Social Network
KW - Social Web
KW - Privacy
KW - FOAF
KW - PubSubHubbub
AB - Users of traditional microblogging platforms such as Twitter face drawbacks in terms of (1) privacy of status updates as a followee (reaching undesired people) and (2) information overload as a follower (receiving uninteresting microposts from followees). In this paper we demonstrate distributed and user-controlled dissemination of microposts using SMOB (a semantic microblogging framework) and Semantic Hub (a privacy-aware implementation of the PuSH protocol). The approach leverages users' Social Graph to dynamically create groups of followers who are eligible to receive a micropost. The restrictions used to create the groups are provided by the followee based on the hashtags in the micropost. Both SMOB and Semantic Hub are available as open source.
JA - 10th International Semantic Web Conference
CY - Bonn, Germany
ER -
TY - CONF
T1 - SMOB: The Best of Both Worlds
T2 - Federated Social Web Europe Conference
Y1 - 2011
A1 - Alexandre Passant
A1 - Julia Anaya
A1 - Owen Sacco
A1 - Pavan Kapanipathi
AB - This paper presents the architecture of SMOB and the way it combines Semantic Web standards (RDF(S) / SPARQL) and new protocols such as PubSubHubbub to enable a Federated and Privacy-Aware Social Web.
JA - Federated Social Web Europe Conference
ER -
TY - CHAP
T1 - SPARQL-ST: Extending SPARQL to Support Spatiotemporal Queries
T2 - Geospatial Semantics and the Semantic Web
Y1 - 2011
A1 - Matthew Perry
A1 - Prateek Jain
A1 - Amit Sheth
KW - SPARQL-ST
KW - spatial filter
KW - spatial join
KW - spatial selection
KW - spatio-temporal filter
KW - spatio-temporal query
KW - spatio-temporal semantic web
KW - temporal filter
KW - temporal join
KW - temporal selection
AB - Spatial and temporal data is plentiful on the Web, and Semantic Web technologies have the potential to make this data more accessible and more useful. Semantic Web researchers have consequently made progress towards better handling of spatial and temporal data. SPARQL, the W3C-recommended query language for RDF, does not adequately support complex spatial and temporal queries. In this work, we present the SPARQL-ST query language. SPARQL-ST is an extension of SPARQL for complex spatiotemporal queries. We present a formal syntax and semantics for SPARQL-ST. In addition, we describe a prototype implementation of SPARQL-ST and demonstrate the scalability of this implementation with a performance study using large real-world and synthetic RDF datasets.
JA - Geospatial Semantics and the Semantic Web
PB - Springer
CY - New York
VL - 12
SN - 10.1007/978-1-4419-9446-2
ER -
TY - RPRT
T1 - A Systematic Property Mapping Using Category Hierarchy and Data
Y1 - 2011
A1 - Kalpa Gunaratna
A1 - Sarasi Lalithsena
A1 - Prateek Jain
A1 - Cory Henson
A1 - Amit Sheth
KW - Linked Open Data
KW - Property Matching
AB - Relationships play a key role in the Semantic Web: they connect the dots between entities (concepts or instances) in a way that conveys the real sense of the entities. Even though relationships are important, it is difficult to categorize or identify them because they consist of complex knowledge in the schema. Systematically identifying relationships therefore yields many advantages and opens doors to new research avenues. In this work, we try to identify a specific type of relationship (part of) in a multi-domain dataset, and we devise an algorithm that uses Wikipedia to identify patterns of part-of relationships in the dataset. This paper reports initial, in-progress work on identifying part-of relationships.
ER -
TY - CONF
T1 - Towards Optimal Resource Provisioning for Running MapReduce Programs in Public Clouds
T2 - IEEE Conference on Cloud Computing
Y1 - 2011
A1 - Fengguang Tian
A1 - Keke Chen
KW - Cloud Computing
KW - MapReduce
KW - Performance Modeling
KW - Resource Provisioning
AB - Running MapReduce programs in the public cloud introduces an important problem: how to optimize resource provisioning to minimize the financial charge for a specific job? In this paper, we study the whole process of MapReduce processing and build up a cost function that explicitly models the relationship between the amount of input data, the available system resources (Map and Reduce slots), and the complexity of the Reduce function for the target MapReduce job. The model parameters can be learned from test runs with a small number of nodes. Based on this cost model, we can solve a number of decision problems, such as finding the optimal amount of resources that minimizes the financial cost under a time deadline or minimizes the time under a certain financial budget. Experimental results show that this cost model performs well on tested MapReduce programs.
JA - IEEE Conference on Cloud Computing
ER -
TY - CHAP
T1 - Trust Networks
Y1 - 2011
A1 - Krishnaprasad Thirunarayan
A1 - Pramod Anantharam
A1 - Cory Henson
A1 - Amit Sheth
AB - Trust relationships occur naturally in many diverse contexts such as collaborative systems, e-commerce, interpersonal interactions, social networks, semantic sensor web, etc. As collaborating agents providing content and services become increasingly removed from the agents that consume them, the issue of robust trust inference and update becomes critical. There is a need to find online substitutes for traditional (direct or face-to-face) cues to derive measures of trust, and to create efficient and secure systems for managing trust, to support decision-making. Unfortunately, there is neither a universal notion of trust that is applicable to all domains nor a clear explication of its semantics or computation in many situations. In this survey, we motivate the trust problem, explain the relevant concepts, summarize research in modeling trust and gleaning trustworthiness, and discuss challenges confronting us. The goal is to provide a comprehensive broad overview of the trust landscape, with the nitty-gritties of a handful of approaches. We also provide details of the theoretical underpinnings and comparative analysis of Bayesian approaches to binary and multi-level trust, to automatically determine trustworthiness in a variety of reputation systems including those used in sensor networks, e-commerce, and collaborative environments.
PB - 5th Indian International Conference on Artificial Intelligence (IICAI-11)
ER -
TY - CONF
T1 - Trust Networks: Interpersonal, Sensor, and Social
T2 - International Conference on Collaborative Technologies and Systems (CTS 2011)
Y1 - 2011
A1 - Krishnaprasad Thirunarayan
A1 - Pramod Anantharam
KW - beta-PDF
KW - gleaning trustworthiness
KW - security attacks
KW - social and sensor networks
KW - trust metrics (propagation: chaining and aggregation)
KW - trust ontology
KW - trust vs. reputation
AB - Trust relationships occur naturally in many diverse contexts such as e-commerce, interpersonal interactions, social networks, sensor web, etc. As agents providing content and services become increasingly removed from the agents that consume them, the issue of robust trust inference and update becomes critical. Unfortunately, there is neither a universal notion of trust that is applicable to all domains nor a clear explication of its semantics or computation in many situations. In this beginner's level tutorial, we motivate the trust problem, explain the relevant concepts, summarize research in modeling trust and gleaning trustworthiness, and discuss challenges confronting us in this process.
JA - International Conference on Collaborative Technologies and Systems (CTS 2011)
PB - Proceedings of 2011 International Conference on Collaborative Technologies and Systems (CTS 2011)
CY - Philadelphia, Pennsylvania, USA
ER -
TY - CONF
T1 - Trusted Semantic Sensor Network
T2 - AFRL Strategic Expansion Workshop on Cognitive Modeling
Y1 - 2011
A1 - Cory Henson
A1 - Amit Sheth
KW - Semantic Sensor Network
AB - Development of a framework capable of robust and comprehensive trusted situation awareness involving heterogeneous sensor and social networking data.
JA - AFRL Strategic Expansion Workshop on Cognitive Modeling
PB - AFRL
CY - Dayton, OH
ER -
TY - CONF
T1 - Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter
T2 - SoME 2011 (Workshop on Social Media Engagement, in conjunction with WWW 2011)
Y1 - 2011
A1 - Hemant Purohit
A1 - Yiye Ruan
A1 - Amruta Joshi
A1 - Srinivasan Parthasarathy
A1 - Amit Sheth
KW - Social Networks
KW - Twitter
KW - People-Content-Network Analysis (PCNA)
KW - Network Analysis
KW - Content Analysis
KW - User Engagement
KW - Community Formation
AB - The widespread use of social networking websites in recent years has suggested a need for effective methods to understand the new forms of user engagement, the factors impacting them, and the fundamental reasons for such engagements. We perform exploratory analysis on Twitter to understand the dynamics of user engagement by studying what attracts a user to participate in discussions on a topic. We identify various factors which might affect user engagement, ranging from content properties, network topology to user characteristics on the social network, and use them to predict user joining behavior. As opposed to traditional ways of studying them separately, these factors are organized in our framework, People-Content-Network Analysis (PCNA), mainly designed to enable understanding of human social dynamics on the web. We perform experiments on various Twitter user communities formed around topics from diverse domains, with varied social significance, duration and spread. Our findings suggest that capabilities of content, user and network features vary greatly, motivating the incorporation of all the factors in user engagement analysis, and hence, a strong need can be felt to study dynamics of user engagement by using the PCNA framework. Our study also reveals certain correlation between types of event for discussion topics and impact of user engagement factors.
JA - SoME 2011 (Workshop on Social Media Engagement, in conjunction with WWW 2011)
PB - 2011 workshop on Social Media Engagement (SoME)
CY - Hyderabad, India
ER -
TY - CONF
T1 - Vortex Detection and Visualization for Design of Micro Air Vehicles and Turbomachinery
T2 - DoD High Performance Computing Modernization Program 21st Users Group Conference
Y1 - 2011
A1 - Rhonda J. Vickery
A1 - Thomas Wischgoll
A1 - Christopher Koehler
A1 - Matthew Picket
A1 - Neal Eikenberry
A1 - Lance Harris
A1 - Richard Snyder
A1 - Darius Sanders
A1 - Haibo Dong
A1 - Randall Hand
A1 - Hugh Thornburg
JA - DoD High Performance Computing Modernization Program 21st Users Group Conference
CY - Portland, OR
ER -
TY - JOUR
T1 - Vortex Visualization in Ultra Low Reynolds Number Insect Flight
JF - IEEE Transactions on Visualization and Computer Graphics
Y1 - 2011
A1 - Christopher Koehler
A1 - Thomas Wischgoll
A1 - Haibo Dong
A1 - Zachary Gaston
ER -
TY - JOUR
T1 - Web Wisdom: An Essay on How Internet and Semantic Web can foster a Global Knowledge Society
JF - Computers in Human Behavior
Y1 - 2011
A1 - Christopher Thomas
A1 - Amit Sheth
KW - Human Computation
KW - knowledge
KW - Semantic Web
KW - Web 2.0
AB - Admittedly this is a presumptuous title that should never be used when reporting on individual research advances. Wisdom is just not a scientific concept. In this case, though, we are reporting on recent developments on the web that lead us to believe that the web is on the way to providing a platform for not only information acquisition and business transactions but also for large scale knowledge development and decision support. It is likely that by now every web user has participated in some sort of social function or knowledge accumulating function on the web, many times without even being aware of it, simply by searching and browsing, other times deliberately by e.g. adding a piece of information to a Wikipedia article or by voting on a movie on IMDB.com. In this paper we will give some examples of how Web Wisdom is already emerging, some ideas of how we can create platforms that foster Web Wisdom and a critical evaluation of types of problems that can be subjected to Web Wisdom.
PB - Computers in Human Behavior
VL - 27
CP - 4
ER -
TY - CHAP
T1 - Web Wisdom: An Essay on How Web 2.0 and Semantic Web Can Foster a Global Knowledge Society
Y1 - 2011
A1 - Christopher Thomas
A1 - Amit Sheth
KW - Human and social computation
KW - Problem solving
KW - Social networking
AB - Admittedly this is a presumptuous title that should never be used when reporting on individual research advances. Wisdom is just not a scientific concept. In this case, though, we are reporting on recent developments on the web that lead us to believe that the web is on the way to providing a platform for not only information acquisition and business transactions but also for large scale knowledge development and decision support. It is likely that by now every web user has participated in some sort of social function or knowledge accumulating function on the web, many times without even being aware of it, simply by searching and browsing, other times deliberately by e.g. adding a piece of information to a Wikipedia article or by voting on a movie on IMDB.com. In this paper we will give some examples of how Web Wisdom is already emerging, some ideas of how we can create platforms that foster Web Wisdom and a critical evaluation of types of problems that can be subjected to Web Wisdom.
ER -
TY - CONF
T1 - What's happening in Semantic Web ... and what FCA could have to do with it
T2 - What's happening in Semantic Web ... and what FCA could have to do with it
Y1 - 2011
A1 - Pascal Hitzler
AB - The Semantic Web [27] is gaining momentum. Driven by over 10 years of focused project funding in the US and the EU, Semantic Web Technologies are now entering application areas in industry, academia, government, and the open Web. The Semantic Web is based on the idea of describing the meaning - or semantics - of data on the Web using metadata - data that describes other data - in the form of ontologies, which are represented using logic-based knowledge representation languages [26]. Central to the transfer of Semantic Web into practice is the Linked Open Data effort [7], which has already resulted in the publication, on the Web, of billions of pieces of information using ontology languages. This provides the basic data needed for establishing intelligent system applications on the Web in the tradition of Semantic Web Technologies.
JA - What's happening in Semantic Web ... and what FCA could have to do with it
PB - Lecture Notes in Artificial Intelligence 6628, Springer, Heidelberg
CY - Formal Concept Analysis, 9th International Conference, ICFCA 2011, Nicosia, Cyprus, May 2011
ER -
TY - CONF
T1 - What's happening in Semantic Web ... and what FCA could have to do with it
Y1 - 2011
A1 - Pascal Hitzler
ER -
TY - JOUR
T1 - 3-D Reconstruction of the Human Ribcage Based on Chest X-Ray Images and Geometric Template Models
JF - IEEE Multimedia
Y1 - 2010
A1 - Christopher Koehler
A1 - Thomas Wischgoll
A1 - Forouzan Golshani
ER -
TY - THES
T1 - Algorithmic Techniques Employed in the Quantification and Characterization of Nuclear Magnetic Resonance Spectroscopic Data
Y1 - 2010
AB - Nuclear magnetic resonance (NMR) based metabolomics is a developing research field with broad applicability, including the identification of biomarkers associated with pathophysiologic changes, sample classification based on the type of toxic exposure, and clinical diagnosis. Intrinsic to these applications is the need for statistical and computational techniques to facilitate the associated data analysis. Further, a typical 1H NMR spectrum of pure proteins, biofluids, or tissue may contain thousands of resonances (i.e., peaks), thus, a pure visual inspection is insufficient to fully utilize the spectral information. Common practice within the metabolomics community is to evaluate and validate novel algorithms on empirical and simplified simulated data. Empirical data captures the complex characteristics of experimental data; however, evaluations on empirical data often rely on indirect performance metrics because the optimal or correct output is difficult to obtain a priori. To overcome the drawback of relying on indirect performance metrics, researchers often evaluate their algorithms on simplified simulated data. The conclusions obtained on this type of data can be difficult to generalize to true experimental data. This dissertation combines the advantages of both empirical and simplified simulated data by generating exacting synthetic data sets that emulate the salient features of experimental data. The analysis of NMR metabolic spectroscopic data can be divided into four steps: (1) standard post-instrumental processing of spectroscopic data; (2) quantification of spectral features; (3) normalization and scaling; and (4) multivariate statistical modeling of data. Quantification of spectral features, step (2), is a key step in the development of classification algorithms and biomarker identification (i.e., pattern recognition). Algorithms for spectral quantification are designed to enhance the efficacy of pattern recognition and multivariate statistical techniques for metabolomics. This is accomplished by reducing the dimensionality of the spectra, while retaining salient information and mitigating peak misalignment. This dissertation develops two novel spectral quantification techniques: Gaussian binning and dynamic adaptive binning. Gaussian binning utilizes a kernel-based binning algorithm to decrease the sensitivity to peak misalignment. Dynamic adaptive binning optimizes the bin boundaries through an objective function using a dynamic programming strategy. Both Gaussian binning and dynamic adaptive binning are compared to common spectral binning techniques by analyzing their ability to reduce the probability of peaks spanning bin boundaries and increase the interpretability of the results. Finally, a case study is presented to show the ability of dynamic adaptive binning and Gaussian binning to enhance the analysis of a 1H NMR-based experiment to monitor rat urinary metabolites following exposure to the toxin α-naphthylisothiocyanate.
ER -
TY - Generic
T1 - Analyzing and Tracking Weblog Communities Using Discriminative Collection Representatives
T2 - SBP 2010
Y1 - 2010
A1 - Guozhu Dong
A1 - Ting Sa
KW - Behavioral Modeling
KW - Discriminative Collection Representatives
KW - Prediction
KW - Social Computing
JA - SBP 2010
CY - Bethesda, MD
VL - 6007
ER -
TY - CONF
T1 - Approximate Instance Retrieval on Ontologies
T2 - 21st International Conference, DEXA 2010
Y1 - 2010
A1 - Tuvshintur Tserendorj
A1 - Stephan Grimm
A1 - Pascal Hitzler
AB - With the development of more expressive description logics (DLs) for the Web Ontology Language OWL, the question arises how we can properly deal with the high computational complexity for efficient reasoning. In application cases that require scalable reasoning with expressive ontologies, non-standard reasoning solutions such as approximate reasoning are necessary to tackle the intractability of reasoning in expressive DLs. In this paper, we are concerned with the approximation of the reasoning task of instance retrieval on DL knowledge bases, trading correctness of retrieval results for gain of speed. We introduce our notion of an approximate concept extension and we provide implementations to compute an approximate answer for a concept query by a suitable mapping to efficient database operations. Furthermore, we report on experiments of our approach on instance retrieval with the Wine ontology and discuss first results in terms of error rate and speed-up.
JA - 21st International Conference, DEXA 2010
CY - Bilbao, Spain
ER -
TY - ABST
T1 - Automatic Domain Model Creation Using Pattern-Based Fact Extraction
Y1 - 2010
A1 - Christopher Thomas
A1 - Pankaj Mehra
A1 - Wenbo Wang
A1 - Amit Sheth
A1 - Gerhard Weikum
KW - Domain Model Creation
KW - Information Extraction
KW - Knowledge Extraction
KW - Ontology Learning
AB - This paper describes a minimally guided approach to automatic domain model creation. The first step is to carve an area of interest out of the Wikipedia hierarchy based on a simple query or other starting point. The second step is to connect the concepts in this domain hierarchy with named relationships. A starting point is provided by Linked Open Data, such as DBPedia. Based on these community-generated facts we train a pattern-based fact-extraction algorithm to augment a domain hierarchy with previously unknown relationship occurrences. Pattern vectors are learned that represent occurrences of relationships between concepts. The process described can be fully automated and the number of relationships that can be learned grows as the community adds more information. Unlike approaches that are aimed at finding single, highly indicative patterns, we use the cumulative score of many pattern occurrences to increase extraction recall. The relationship identification process itself is based on positive-only classification of training facts.
ER -
TY - CONF
T1 - Biomedical Ontologies for Parasite Research
T2 - Intelligent Systems for Molecular Biology (ISMB)
Y1 - 2010
A1 - Vinh Nguyen
A1 - Satya S. Sahoo
A1 - Priti Parikh
A1 - Todd Minning
A1 - D. Brent Weatherly
A1 - Flora Logan
A1 - Amit Sheth
A1 - Rick Tarleton
KW - Ontology
KW - Parasite
AB - Trypanosoma cruzi is a protozoan parasite that causes Chagas disease or American trypanosomiasis, which is the leading cause of death in Latin America. The primary objective of this study is to create an ontology-driven information infrastructure to support parasite researchers in identifying gene knockout, vaccination, or drug targets for T. cruzi. This involves querying across multiple datasets from diverse sources, such as proteome, pathway, internal lab data, etc. that are often represented in heterogeneous formats. To address this, a multi-ontology parasite knowledge repository (PKR) is being created with an intuitive graphical query interface called Cuebee. The PKR is underpinned by the Parasite Lifecycle Ontology (PLO) and Parasite Experiment Ontology (PEO) that have been released for public use through the National Center for Biomedical Ontology (NCBO). The PLO models lifecycle stages of T. cruzi and two related kinetoplastids, T. brucei and Leishmania major. It also describes the host, parasitic, and vector organisms, and anatomical location corresponding to each stage. On the other hand, the PEO models gene knockout, strain creation, proteomics, and microarray data. All the entities in PLO and PEO are linked to each other by explicitly modeled named relationships; for example, 'Trypanosoma_cruzi->has_vector_organism->triatominae.' Both PLO and PEO are used as schema to semantically annotate the experimental data and transform them into semantic web Resource Description Framework (RDF) format. PLO and PEO along with the Cuebee interface will allow biologists to formulate queries over multiple data sources to find gene knockout, vaccination, or drug targets for T. cruzi and related parasites.
JA - Intelligent Systems for Molecular Biology (ISMB)
PB - Intelligent Systems for Molecular Biology (ISMB)
CY - Boston, MA
ER -
TY - CONF
T1 - Cloud Based Scientific Workflow for NMR Data Analysis
T2 - 18th Annual International Conference on Intelligent Systems for Molecular Biology
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Paul Anderson
A1 - Satya S. Sahoo
A1 - Ajith Ranabahu
A1 - Michael Raymer
A1 - Amit Sheth
KW - Workflow
KW - Cloud
KW - Scientific Workflow
KW - NMR
KW - Taverna
KW - Hadoop
KW - Map-reduce
AB - This work presents a service oriented scientific workflow approach to NMR-based metabolomics data analysis. We demonstrate the effectiveness of this approach by implementing several common spectral processing techniques in the cloud using a parallel map-reduce framework, Hadoop.
JA - 18th Annual International Conference on Intelligent Systems for Molecular Biology
PB - 18th Annual International Conference on Intelligent Systems for Molecular Biology
CY - Boston, MA
ER -
TY - Generic
T1 - Cloud-based Map-Reduce Architecture for Nuclear Magnetic Resonance based Metabolomics
T2 - 7th Microsoft Research eScience Workshop
Y1 - 2010
A1 - Paul Anderson
A1 - Ashwin Manjunatha
A1 - Ajith H. Ranabahu
A1 - Nicholas J. DelRaso
A1 - Nicholas V. Reo
A1 - Amit Sheth
A1 - Michael Raymer
JA - 7th Microsoft Research eScience Workshop
ER -
TY - JOUR
T1 - A Clustering Comparison Measure Using Density Profiles and its Application to the Discovery of Alternate Clusterings
JF - Data Mining and Knowledge Discovery
Y1 - 2010
A1 - Eric Bae
A1 - James Bailey
A1 - Guozhu Dong
KW - alternate clustering algorithms
KW - alternate clusterings
KW - cluster analysis
KW - clustering
KW - clustering comparison
KW - clustering similarity
AB - Data clustering is a fundamental and very popular method of data analysis. Its subjective nature, however, means that different clustering algorithms or different parameter settings can produce widely varying and sometimes conflicting results. This has led to the use of clustering comparison measures to quantify the degree of similarity between alternative clusterings. Existing measures, though, can be limited in their ability to assess similarity and sometimes generate unintuitive results. They also cannot be applied to compare clusterings which contain different data points, an activity which is important for scenarios such as data stream analysis. In this paper, we introduce a new clustering similarity measure, known as ADCO, which aims to address some limitations of existing measures, by allowing greater flexibility of comparison via the use of density profiles to characterize a clustering. In particular, it adopts a 'data mining style' philosophy to clustering comparison, whereby two clusterings are considered to be more similar, if they are likely to give rise to similar types of prediction models. Furthermore, we show that this new measure can be applied as a highly effective objective function within a new algorithm, known as MAXIMUS, for generating alternate clusterings.
ER -
TY - JOUR
T1 - Computational Complexity and Anytime Algorithm for Inconsistency Measurement
JF - International Journal of Software and Informatics
Y1 - 2010
A1 - Yue Ma
A1 - Guilin Qi
A1 - Guohui Xiao
A1 - Pascal Hitzler
A1 - Zuoquan Lin
KW - algorithm
KW - computational complexity
KW - inconsistency measurement
KW - multi-valued logic
AB - Measuring inconsistency degrees of inconsistent knowledge bases is an important problem as it provides context information for facilitating inconsistency handling. Many methods have been proposed to solve this problem and a main class of them is based on some kind of paraconsistent semantics. In this paper, we consider the computational aspects of inconsistency degrees of propositional knowledge bases under 4-valued semantics. We first give a complete analysis of the computational complexity of computing inconsistency degrees. As it turns out that computing the exact inconsistency degree is intractable, we then propose an anytime algorithm that provides tractable approximations of the inconsistency degree from above and below. We show that our algorithm satisfies some desirable properties and give experimental results of our implementation of the algorithm.
ER -
TY - CHAP
T1 - Computing for Human Experience: Semantics Empowered Sensors, Services, Social and Ubiquitous Computing in Web 3.0 and Beyond
Y1 - 2010
A1 - Amit Sheth
PB - University of Illinois, NCSA
ER -
TY - JOUR
T1 - Computing for Human Experience: Semantics-Empowered Sensors, Services, and Social Computing on the Ubiquitous Web
JF - IEEE Internet Computing
Y1 - 2010
A1 - Amit Sheth
KW - computing for human experience
KW - human perception
KW - human-computer interaction
KW - Internet
KW - Machine Perception
KW - Ontologies
KW - pervasive computing
KW - semantic computing
KW - Semantic Web
KW - Social Computing
AB - We're on the verge of an era in which the human experience can be enriched in ways we couldn't have imagined two decades ago. Recent technological progress with several technologies (including computing technologies; communication, social-interaction, and Web technologies; and embedded, fixed, or mobile sensors and devices) enables us to rethink the relationship and interactions between humans and machines. Their semantics-empowered convergence and integration will enable us to capture, understand, and reapply human knowledge and intellect. Such capabilities will consequently elevate our technological ability to deal with the abstractions, concepts, and actions that characterize human experiences. This will herald computing for human experience (CHE). The CHE vision is built on a suite of technologies that serves, assists, and cooperates with humans to nondestructively and unobtrusively complement and enrich normal activities, with minimal explicit concern or effort on the humans' part. CHE will anticipate when to gather and apply relevant knowledge and intelligence. It will enable human experiences that are intertwined with the physical, conceptual, and experiential worlds (emotions, sentiments, and so on), rather than immerse humans in cyber worlds for a specific task. Instead of focusing on humans interacting with a technology or system, CHE will feature technology-rich human surroundings that often initiate interactions. Interaction will be more sophisticated and seamless compared to today's precursors such as automotive accident-avoidance systems.
ER -
TY - JOUR
T1 - Concept Learning in Description Logics Using Refinement Operators
JF - Machine Learning
Y1 - 2010
A1 - Jens Lehmann
A1 - Pascal Hitzler
KW - description logics
KW - Inductive logic programming
KW - Refinement operators
KW - Semantic Web
KW - Structured machine learning
AB - With the advent of the Semantic Web, description logics have become one of the most prominent paradigms for knowledge representation and reasoning. Progress in research and applications, however, is constrained by the lack of well-structured knowledge bases consisting of a sophisticated schema and instance data adhering to this schema. It is paramount that suitable automated methods for their acquisition, maintenance, and evolution will be developed. In this paper, we provide a learning algorithm based on refinement operators for the description logic ALCQ including support for concrete roles. We develop the algorithm from thorough theoretical foundations by identifying possible abstract property combinations which refinement operators for description logics can have. Using these investigations as a basis, we derive a practically useful complete and proper refinement operator. The operator is then cast into a learning algorithm and evaluated using our implementation DL-Learner. The results of the evaluation show that our approach is superior to other learning approaches on description logics, and is competitive with established ILP systems.
ER -
TY - Generic
T1 - Contextual Ontology Alignment of LOD with an Upper Ontology: A Case Study with Proton
T2 - ESWC 2011
Y1 - 2010
A1 - Prateek Jain
A1 - Peter Yeh
A1 - Kunal Verma
A1 - Reymonrod Vasquez
A1 - Mariana Damova
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - ontology alignment
KW - lod
KW - proton
JA - ESWC 2011
PB - Springer
CY - Heraklion, Greece
VL - 6643
ER -
TY - JOUR
T1 - Continuous Semantics to Analyze Real-Time Data
JF - IEEE Internet Computing
Y1 - 2010
A1 - Amit Sheth
A1 - Christopher Thomas
A1 - Pankaj Mehra
KW - circle of knowledge life
KW - Continuous Semantics
KW - Doozer
KW - Dynamic Domain model
KW - real time search
KW - real-time social data
KW - semantic analysis of real-time data
KW - semantic annotation of social data
KW - Semantic Web
AB - Increasingly we are presented with dynamic domains involved in social, mobile, and sensor webs. Such domains are spontaneous (arising suddenly), follow a period of rapid evolution, involve real-time or near real-time data, involve many distributed participants and diverse viewpoints involving topical or contentious subjects, and involve feature context colored by local knowledge and sociocultural backgrounds. This article presents how continuous semantics can help us model such dynamic domains and analyze the related real-time data. Capabilities include creating dynamic domain models by mining social data and using dynamic models for semantic analysis of real-time data.
VL - 14
ER -
TY - CONF
T1 - Cross-Market Model Adaptation with Pairwise Preference Data for Web Search Ranking
T2 - International Conference on Computational Linguistics
Y1 - 2010
A1 - Jing Bai
A1 - Fernando Diaz
A1 - Yi Chang
A1 - Zhaohui Zheng
A1 - Keke Chen
KW - machine-learned ranking
KW - pairwise-trada
KW - search engine ranking
AB - Machine-learned ranking techniques automatically learn a complex document ranking function given training data. These techniques have demonstrated the effectiveness and flexibility required of a commercial web search. However, manually labeled training data (with multiple absolute grades) has become the bottleneck for training a quality ranking function, particularly for a new domain. In this paper, we explore the adaptation of machine-learned ranking models across a set of geographically diverse markets with the market-specific pairwise preference data, which can be easily obtained from clickthrough logs. We propose a novel adaptation algorithm, Pairwise-Trada, which is able to adapt ranking models that are trained with multi-grade labeled training data to the target market using the target-market-specific pairwise preference data. We present results demonstrating the efficacy of our technique on a set of commercial search engine data.
JA - International Conference on Computational Linguistics
ER -
TY - Generic
T1 - Distance-based Measures of Inconsistency and Incoherency for Description Logics
Y1 - 2010
A1 - Yue Ma
A1 - Pascal Hitzler
AB - Inconsistency and incoherency are two sorts of erroneous information in a DL ontology which have been widely discussed in ontology-based applications. For example, they have been used to detect modeling errors during ontology construction. To provide more informative metrics which can tell the differences between inconsistent ontologies and between incoherent terminologies, there has been some work on measuring inconsistency of an ontology and on measuring incoherency of a terminology. However, most of these approaches focus on measuring either inconsistency or incoherency, with no clear idea of how to extend them to cover the other. In this paper, we propose a novel approach to measure DL ontologies, named distance-based measures. It has the merit that both inconsistency and incoherency can be measured in a unified framework. Moreover, only classical DL interpretations are used, so there is no restriction on the DL languages used.
PB - 23rd International Workshop on Description Logics (DL2010)
CY - Waterloo, Canada
ER -
TY - Generic
T1 - Dynamic Associative Relationships on the Linked Open Data Web
T2 - WebSci10, Extending the Frontiers of Society On-Line Conference
Y1 - 2010
A1 - Pablo Mendes
A1 - Pavan Kapanipathi
A1 - Delroy Cameron
A1 - Amit Sheth
KW - semantic web and social media and microblogging and linked data and real-time web
AB - In this work we approach relationships on the Linked Open Data Web as key facilitators of information exploration. Linked Open Data (LOD) principles contribute to a shift in paradigm for information representation and access, enhancing the ability of users and computers to connect, browse and query data on the Web through standard languages and protocols. We present a brief discussion on the current relationship types on the Web, and observe the need for the extraction of a particular type. We focus on trending socially contextual relationships since they highlight the dynamics of the social interaction between users creating, linking, and consuming information on the Web. Our extraction method is based on the identification of co-occurring entity mentions in microblog posts. Real time entity and relationship extraction can be useful tools for the on demand extension of the Web of Data. We demonstrate the usefulness of this approach in the context of two applications: database exploration and semantic browsing for exploratory search.
JA - WebSci10, Extending the Frontiers of Society On-Line Conference
CY - Raleigh, NC
ER -
TY - JOUR
T1 - Extracting Reduced Logic Programs from Artificial Neural Networks
Y1 - 2010
A1 - Jens Lehmann
A1 - Sebastian Bader
A1 - Pascal Hitzler
KW - artificial neural networks
KW - reduced logic programs
AB - Artificial neural networks can be trained to perform excellently in many application areas. While they can learn from raw data to solve sophisticated recognition and analysis problems, the acquired knowledge remains hidden within the network architecture and is not readily accessible for analysis or further use: Trained networks are black boxes. Recent research efforts therefore investigate the possibility of extracting symbolic knowledge from trained networks, in order to analyze, validate, and reuse the structural insights gained implicitly during the training process. In this paper, we will study how knowledge in the form of propositional logic programs can be obtained in such a way that the programs are as simple as possible - where simple is understood in some clearly defined and meaningful way.
ER -
TY - CONF
T1 - Flexible Bootstrapping-Based Ontology Alignment
Y1 - 2010
A1 - Prateek Jain
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - Ontology Matching and Wikipedia and BLOOMS and Plug-n-Play Ontology Matching System
AB - BLOOMS (Jain et al., ISWC2010) is an ontology alignment system which, at its core, utilizes the Wikipedia category hierarchy for establishing alignments. In this paper, we present a Plug-and-Play extension to BLOOMS, which makes it possible to flexibly replace or complement the use of Wikipedia with other online or offline resources, including domain-specific ontologies or taxonomies. By making use of automated translation services and of Wikipedia in languages other than English, it also makes it possible to apply BLOOMS to alignment tasks where the input ontologies are written in different languages.
PB - The Fifth International Workshop on Ontology Matching collocated with the 9th International Semantic Web Conference ISWC-2010, November 7, 2010
ER -
TY - BOOK
T1 - Foundations of Semantic Web Technologies
Y1 - 2010
A1 - Pascal Hitzler
A1 - Markus Krotzsch
A1 - Sebastian Rudolph
KW - foundations of semantic web
KW - foundations of semantic web technologies
ER -
TY - JOUR
T1 - Generalized Distance Functions in the Theory of Computation
JF - The Computer Journal
Y1 - 2010
A1 - Anthony K. Seda
A1 - Pascal Hitzler
KW - denotational semantics
KW - fixed-point theorems
KW - logic programming
KW - stable model
KW - supported model
KW - topology
KW - ultra-metrics
AB - We discuss a number of distance functions encountered in the theory of computation, including metrics, ultra-metrics, quasi-metrics, generalized ultrametrics, partial metrics, d-ultra-metrics, and generalized metrics. We consider their properties, associated fixed-point theorems, and some general applications they have within the theory of computation. We consider in detail the applications of generalized distance functions in giving a uniform treatment of several important semantics for logic programs, including acceptable programs and natural generalizations of them, and also the supported model and the stable model in the context of locally stratified extended disjunctive logic programs and databases.
ER -
TY - JOUR
T1 - Geometric Data Perturbation for Privacy Preserving Outsourced Data Mining
JF - Journal of Knowledge and Information Systems (KAIS)
Y1 - 2010
A1 - Keke Chen
A1 - Ling Liu
KW - Data mining algorithms
KW - Data perturbation
KW - Geometric data perturbation
KW - Privacy evaluation
KW - Privacy-preserving data mining
AB - Data perturbation is a popular technique in privacy-preserving data mining. A major challenge in data perturbation is to balance privacy protection and data utility, which are normally considered as a pair of conflicting factors. We argue that selectively preserving the task/model specific information in perturbation will help achieve better privacy guarantee and better data utility. One type of such information is the multidimensional geometric information, which is implicitly utilized by many data-mining models. To preserve this information in data perturbation, we propose the Geometric Data Perturbation (GDP) method. In this paper, we describe several aspects of the GDP method. First, we show that several types of well-known data-mining models will deliver a comparable level of model quality over the geometrically perturbed data set as over the original data set. Second, we discuss the intuition behind the GDP method and compare it with other multidimensional perturbation methods such as random projection perturbation. Third, we propose a multi-column privacy evaluation framework for evaluating the effectiveness of geometric data perturbation with respect to different levels of attacks. Finally, we use this evaluation framework to study a few attacks on geometrically perturbed data sets. Our experimental study also shows that geometric data perturbation can not only provide satisfactory privacy guarantee but also preserve modeling accuracy well.
ER -
TY - CONF
T1 - Getting Code Near the Data: A Study of Generating Customized Data Intensive Scientific Workflows with Domain Specific Language
T2 - 2nd IEEE International Conference on Cloud Computing Technology and Science
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Ajith Ranabahu
A1 - Paul Anderson
A1 - Amit Sheth
KW - NMR and metabolomics and DSL and Domain Specific Language
JA - 2nd IEEE International Conference on Cloud Computing Technology and Science
CY - Indianapolis, IN
ER -
TY - CONF
T1 - How To Make Linked Data More than Data
T2 - Semantic Technology Conference 2010
Y1 - 2010
A1 - Prateek Jain
A1 - Pascal Hitzler
A1 - Amit Sheth
A1 - Peter Z. Yeh
A1 - Kunal Verma
KW - Linked Open Data
KW - Lod federated query
KW - LoD ontology
KW - LoD Schema
KW - LoD Schema Enrichment
KW - LoD Semantic Enrichment
KW - SPARQL federated query
AB - The LOD cloud has a potential for applicability in many AI-related tasks, such as open domain question answering, knowledge discovery, and the Semantic Web. An important prerequisite before the LOD cloud can enable these goals is allowing its users (and applications) to effectively pose queries to and retrieve answers from it. However, this prerequisite is still an open problem for the LOD cloud and has restricted it to 'merely more data.' To transform the LOD cloud from 'merely more data' to 'semantically linked data' there are plenty of open issues which should be addressed. We believe this transformation of the LOD cloud can be performed by addressing the shortcomings identified by us: lack of conceptual description of datasets, lack of expressivity, and difficulties with respect to querying.
JA - Semantic Technology Conference 2010
CY - San Francisco, California
ER -
TY - CONF
T1 - Janus: from Workflows to Semantic Provenance and Linked Open Data
T2 - 3rd International Provenance and Annotation Workshop
Y1 - 2010
A1 - Paolo Missier
A1 - Satya S. Sahoo
A1 - Jun Zhao
A1 - Carole Goble
A1 - Amit Sheth
KW - Linked Open Data
KW - semantic provenance
AB - Data provenance graphs are a form of metadata that can be used to establish a variety of properties of data products that undergo sequences of transformations, typically specified as workflows. Their usefulness for answering user provenance queries is limited, however, unless the graphs are enhanced with domain-specific annotations. In this paper we propose a model and architecture for semantic, domain-aware provenance, and demonstrate its usefulness in answering typical user queries. Furthermore, we discuss the additional benefits and the technical implications of publishing provenance graphs as a form of Linked Data. A prototype implementation of the model is available for data produced by the Taverna workflow system.
JA - 3rd International Provenance and Annotation Workshop
CY - Troy, NY
ER -
TY - JOUR
T1 - Knowledge Based 3D Reconstruction and Visualization of Human Ribcage and Lungs
JF - IEEE Computer Graphics and Applications
Y1 - 2010
A1 - Christopher Koehler
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Learning Paradigms in Dynamic Environments
T2 - Dagstuhl Seminar Proceedings 10302
Y1 - 2010
A1 - Barbara Hammer
A1 - Pascal Hitzler
A1 - Wolfgang Maass
A1 - Marc Toussaint
AB - The seminar centered around problems which arise in the context of machine learning in dynamic environments. Particular emphasis was put on a couple of specific questions in this context: how to represent and abstract knowledge appropriately to shape the problem of learning in a partially unknown and complex environment and how to combine statistical inference and abstract symbolic representations; how to infer from few data and how to deal with non i.i.d. data, model revision and life-long learning; how to come up with efficient strategies to control realistic environments for which exploration is costly, the dimensionality is high and data are sparse; how to deal with very large settings; and how to apply these models in challenging application areas such as robotics, computer vision, or the web.
JA - Dagstuhl Seminar Proceedings 10302
PB - Schloss Dagstuhl
CY - Dagstuhl, Germany
ER -
TY - Generic
T1 - Linked Data is Merely More Data
T2 - 2010 AAAI Spring Symposium
Y1 - 2010
A1 - Prateek Jain
A1 - Pascal Hitzler
A1 - Peter Z. Yeh
A1 - Kunal Verma
A1 - Amit Sheth
KW - Artificial Intelligence
KW - Linked Data
KW - Semantic Web
KW - Web of Data
AB - In this position paper, we argue that the Linked Open Data (LoD) Cloud, in its current form, is only of limited value for furthering the Semantic Web vision. Being merely a weakly linked 'triple collection', it will only be of very limited benefit for the AI or Semantic Web communities. We describe the corresponding problems with the LoD Cloud and give directions for research to remedy the situation.
JA - 2010 AAAI Spring Symposium
PB - AAAI Press
CY - Menlo Park, California
SN - 978-1-57735-461-1
ER -
TY - CONF
T1 - Linked Open Social Signals
T2 - International Conference on Web Intelligence
Y1 - 2010
A1 - Pablo N. Mendes
A1 - Alex Passant
A1 - Pavan Kapanipathi
A1 - Amit Sheth
KW - annotation
KW - information extraction
KW - microblogging
KW - rdf
KW - Social Media
KW - social signals
KW - sparql
KW - twitter
AB - In this paper we discuss the collection, semantic annotation and analysis of real-time social signals from micro-blogging data. We focus on users interested in analyzing social signals collectively for sensemaking. Our proposal enables flexibility in selecting subsets for analysis, alleviating information overload. We define an architecture that is based on state-of-the-art Semantic Web technologies and a distributed publish-subscribe protocol for real time communication. In addition, we discuss our method and application in a scenario related to the health care reform in the United States.
JA - International Conference on Web Intelligence
PB - IEEE/WIC/ACM International Conference on Web Intelligence (WI-10)
CY - Toronto, Canada
ER -
TY - CONF
T1 - Linked Sensor Data
T2 - Collaborative Technologies and Systems (CTS 2010)
Y1 - 2010
A1 - Harshal Patni
A1 - Cory Henson
A1 - Amit Sheth
KW - Semantic Sensor Web and Dataset Generation and Linked Data and Sensor Web Enablement and Sensor Data and Resource Description Framework (RDF)
AB - A number of government, corporate, and academic organizations are collecting enormous amounts of data provided by environmental sensors. However, this data is too often locked within organizations and underutilized by the greater community. In this paper, we present a framework to make this sensor data openly accessible by publishing it on the Linked Open Data (LOD) Cloud. This is accomplished by converting raw sensor observations to RDF and linking with other datasets on LOD. With such a framework, organizations can make large amounts of sensor data openly accessible, thus allowing greater opportunity for utilization and analysis.
JA - Collaborative Technologies and Systems (CTS 2010)
CY - Chicago, Illinois
ER -
TY - JOUR
T1 - Logical Queries over Views: Decidability and Expressiveness
JF - ACM Transactions on Computational Logic
Y1 - 2010
A1 - James Bailey
A1 - Guozhu Dong
A1 - Anthony Widjaja
KW - conjunctive query
KW - containment
KW - database query
KW - database view
KW - decidability
KW - first-order logic
KW - Lowenheim class
KW - monadic logic
KW - ontology reasoning
KW - Satisfiability
KW - unary logic
KW - unary view
AB - We study the problem of deciding the satisfiability of first-order logic queries over views, with our aim to delimit the boundary between the decidable and the undecidable fragments of this language. Views currently occupy a central place in database research due to their role in applications such as information integration and data warehousing. Our main result is the identification of a decidable class of first-order queries over unary conjunctive views that generalizes the decidability of the classical class of first-order sentences over unary relations known as the Lowenheim class. We then demonstrate how various extensions of this class lead to undecidability and also provide some expressivity results. Besides its theoretical interest, our new decidable class is potentially interesting for use in applications such as deciding implication of complex dependencies, analysis of a restricted class of active database rules, and ontology reasoning.
ER -
TY - CONF
T1 - Managing Provenance Information in Parasite Research
Y1 - 2010
A1 - Vinh Nguyen
A1 - Priti Parikh
A1 - Satya Sahoo
A1 - Amit Sheth
KW - ontology and provenance and RDF triple
AB - The objective of this research is to create a semantic problem solving environment (PSE) for the human parasite Trypanosoma cruzi. As a part of the PSE, we are trying to manage provenance of the experiment data as it is generated. This requires capturing the provenance, which is often collected through web forms used by biologists to input the information about experiments they conduct. We have created the Parasite Experiment Ontology (PEO), which represents provenance information used in the project. We have modified the back end which processes the data gathered from biologists, generates RDF triples and serializes them into the triple store. Moreover, it is necessary to assert that RDF triples conform to the PEO schema. This work allows us to capture provenance of experiments conducted at Tarleton Research Group as a part of this project.
PB - Ohio State University, Columbus
ER -
TY - CONF
T1 - A MapReduce Algorithm for EL+
T2 - 23rd International Workshop on Description Logics (DL2010)
Y1 - 2010
A1 - Raghava Mutharaju
A1 - Frederick Maier
A1 - Pascal Hitzler
AB - Recently, the use of the MapReduce framework for distributed RDF Schema reasoning has shown that it is possible to compute the deductive closure of sets of over a billion RDF triples within a reasonable time span [22], and that it is also possible to carry the approach over to OWL Horst [21]. Following this lead, in this paper we provide a MapReduce algorithm for the description logic EL+, more precisely for the classification of EL+ ontologies. To do this, we first modify the algorithm usually used for EL+ classification. The modified algorithm can then be converted into a MapReduce algorithm along the same key ideas as used for RDF schema.
JA - 23rd International Workshop on Description Logics (DL2010)
CY - Waterloo, Canada
ER -
TY - BOOK
T1 - Mathematical Aspects of Logic Programming Semantics
Y1 - 2010
A1 - Anthony K. Seda
A1 - Pascal Hitzler
ER -
TY - CONF
T1 - MobiCloud Making Clouds Reachable: A Toolkit for Easy and Efficient Development of Customized Cloud Mobile Hybrid Applications
T2 - 2nd IEEE International Conference on Cloud Computing Technology and Science
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - Mobicloud
AB - The advancements in computing have resulted in a boom of cheap, ubiquitous, connected mobile devices, as well as seemingly unlimited, utility style, pay as you go computing resources, commonly referred to as Cloud computing. However, taking full advantage of this mobile and cloud computing landscape, especially for the data intensive domains, has been hampered by the many heterogeneities that exist in the mobile space, as well as the Cloud space. Our research attempts to exploit the capabilities of the mobile and cloud landscape by introducing MobiCloud, an online toolkit to efficiently develop Cloud-mobile hybrid (CMH) applications. We define a CMH application as a collective application that has a Cloud based back-end and a mobile device front-end. Using a single Domain Specific Language (DSL) script, MobiCloud toolkit is capable of generating a variety of CMH applications. These applications are composed of multiple combinations of native Cloud and mobile applications. Our approach not only reduces the learning curve, but also shields the developers from the complexities of the target platforms. In this paper, we provide a brief description of the MobiCloud toolkit and the workflow.
JA - 2nd IEEE International Conference on Cloud Computing Technology and Science
CY - Indianapolis, Indiana
ER -
TY - ABST
T1 - MobiCloud - Making Clouds Reachable: A Toolkit for Easy and Efficient Development of Customized Cloud Mobile Hybrid Application
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
KW - Mobicloud
AB - The advancements in computing have resulted in a boom of cheap, ubiquitous, connected mobile devices, as well as seemingly unlimited, utility style, pay as you go computing resources, commonly referred to as Cloud computing. However, taking full advantage of this mobile and cloud computing landscape, especially for the data intensive domains, has been hampered by the many heterogeneities that exist in the mobile space, as well as the Cloud space. Our research attempts to exploit the capabilities of the mobile and cloud landscape by introducing MobiCloud, an online toolkit to efficiently develop Cloud-mobile hybrid (CMH) applications. We define a CMH application as a collective application that has a Cloud based back-end and a mobile device front-end. Using a single Domain Specific Language (DSL) script, MobiCloud toolkit is capable of generating a variety of CMH applications. These applications are composed of multiple combinations of native Cloud and mobile applications. Our approach not only reduces the learning curve, but also shields the developers from the complexities of the target platforms. In this paper, we provide a brief description of the MobiCloud toolkit and the workflow.
ER -
TY - JOUR
T1 - Modeling and Visualization of Cardiovascular Systems
JF - Dagstuhl Follow-Ups
Y1 - 2010
A1 - Thomas Wischgoll
AB - Modeling complex organs, such as the human heart, requires a detailed understanding of the geometric and mechanical properties of that organ. Similarly, the model is only as accurate as the precision of the underlying properties allow. Hence, it is of great importance that accurate measurements of the geometric configuration are available. This paper describes the different steps that are necessary for creating and visualizing such a vascular model, ranging from determining a basic geometric model, gathering statistical data necessary to extend an existing model up to the visualization of the resulting large-scale vascular models.
ER -
TY - JOUR
T1 - Multimodal Social Intelligence in a Real-Time Dashboard System
JF - VLDB Journal Special Issue on Data Management and Mining for Social Networks and Social Media
Y1 - 2010
A1 - Daniel Gruhl
A1 - Meenakshi Nagarajan
A1 - Jan Pieper
A1 - Christine Robson
A1 - Amit Sheth
KW - BBC SoundIndex
KW - informal text analysis
KW - Information Mashups
KW - Semantic Annotation
KW - Semantic Domain Models
KW - Slang Sentiment Identification
KW - Social Intelligence
KW - Spam Filtering
KW - Voting Theory
AB - Social Networks provide one of the most rapidly evolving data sets in existence today. Traditional Business Intelligence applications struggle to take advantage of such data sets in a timely manner. The BBC SoundIndex, developed by the authors and others, enabled real-time analytics of music popularity using data from a variety of Social Networks. We present this system as a grounding example of how to overcome the challenges of working with this data from social networks. We discuss a variety of technologies to implement near real-time data analytics to transform Social Intelligence into Business Intelligence and evaluate their effectiveness in the music domain. The SoundIndex project helped to highlight a number of key research areas, including named entity recognition and sentiment analysis in Informal English. It also drew attention to the importance of metadata aggregation in multimodal environments. We explored challenges such as drawing data from a wide set of sources spanning a myriad of modalities, developing adjudication techniques to harmonize inputs, and performing deep analytics on extremely challenging Informal English snippets. Ultimately, we seek to provide guidance on developing applications in a variety of domains that allow an analyst to rapidly grasp the evolution in the social landscape, and show how to validate such a system for a real-world application.
PB - Springer
ER -
TY - ABST
T1 - Nominal Schemas for Integrating Rules and Ontologies.
Y1 - 2010
A1 - Frederick Maier
A1 - Adila Alfa Krisnadhi
A1 - Pascal Hitzler
KW - Web Ontology Language and Description Logic and SROIQ and Semantic Web Rule Language and Datalog and tractability
AB - We propose a description-logic style extension of OWL DL, which includes DL-safe variable SWRL and seamlessly integrates datalog rules. Our language also sports a tractable fragment, which we call ELP 2, covering OWL EL, OWL RL, most of OWL QL, and variable restricted datalog.
ER -
TY - CHAP
T1 - Ontological Evaluation and Validation
Y1 - 2010
A1 - Samir Tartir
A1 - Ismailcem Budak Arpinar
A1 - Amit Sheth
KW - ontology evaluation
KW - ontology quality analysis
KW - ontology validation
KW - OntoQA
KW - relationship richness
AB - In the last few years, the Semantic Web gained scientific acceptance as the means of sharing knowledge in different domains, and the cornerstone of the Semantic Web is ontologies. Currently, users trying to incorporate ontologies in their applications have to rely on their experience to try to find a suitable ontology for their applications. Methods for evaluating ontology quality and validity, ontology characterization and ranking have been developed for that purpose. In this chapter, we introduce several approaches that have ...
ER -
TY - CONF
T1 - Ontology Alignment for Linked Open Data
T2 - 9th International Semantic Web Conference
Y1 - 2010
A1 - Prateek Jain
A1 - Pascal Hitzler
A1 - Amit Sheth
A1 - Kunal Verma
A1 - Peter Z. Yeh
KW - BLOOMS
KW - Linked Open Data
KW - Linked Open Data Schema Matching
KW - ontology alignment
KW - Schema Alignment
KW - Wikipedia
AB - The Web of Data currently coming into existence through the Linked Open Data (LOD) effort is a major milestone in realizing the Semantic Web vision. However, the development of applications based on LOD faces difficulties due to the fact that the different LOD datasets are rather loosely connected pieces of information. In particular, links between LOD datasets are almost exclusively on the level of instances, and schema-level information is being ignored. In this paper, we therefore present a system for finding schema-level links between LOD datasets in the sense of ontology alignment. Our system, called BLOOMS, is based on the idea of bootstrapping information already present on the LOD cloud. We also present a comprehensive evaluation which shows that BLOOMS outperforms state-of-the-art ontology alignment systems on LOD datasets. At the same time, BLOOMS is also competitive compared with these other systems on the Ontology Alignment Evaluation Initiative Benchmark datasets.
JA - 9th International Semantic Web Conference
CY - Shanghai, China
ER -
TY - CHAP
T1 - Ontology Evaluation and Validation
T2 - Theory and Applications of Ontology
Y1 - 2010
A1 - Samir Tartir
A1 - Ismailcem Budak Arpinar
A1 - Amit Sheth
KW - information systems applications
KW - Ontology
AB - In the last few years, the Semantic Web gained scientific acceptance as the means of sharing knowledge in different domains, and the cornerstone of the Semantic Web is ontologies. Currently, users trying to incorporate ontologies in their applications have to rely on their experience to try to find a suitable ontology for their applications. Methods for evaluating ontology quality and validity, ontology characterization and ranking have been developed for that purpose. In this chapter, we introduce several approaches that have been developed to aid in evaluating ontologies. In addition, we present highlights of OntoQA, an ontology evaluation and analysis tool that uses a set of metrics measuring different aspects of the ontology schema and knowledge base to give insight into the overall characteristics of the ontology. It is important to keep in mind while reading this chapter that the definition of the “goodness” or the “validity” of an ontology might vary between different users or different domains.
JA - Theory and Applications of Ontology
PB - Springer Netherlands
CY - Amsterdam
VL - 2
ER -
TY - JOUR
T1 - Pattern Space Maintenance for Data Updates and Interactive Mining
JF - Computational Intelligence
Y1 - 2010
A1 - Mengling Feng
A1 - Guozhu Dong
A1 - Jinyan Li
A1 - Yap-Peng Tan
A1 - Limsoon Wong
KW - Data mining algorithms
KW - data update and interactive mining
KW - frequent pattern
KW - incremental maintenance
AB - This paper addresses the incremental and decremental maintenance of the frequent pattern space. We conduct an in-depth investigation on how the frequent pattern space evolves under both incremental and decremental updates. Based on the evolution analysis, a new data structure, Generator-Enumeration Tree (GE-tree), is developed to facilitate the maintenance of the frequent pattern space. With the concept of GE-tree, we propose two novel algorithms, Pattern Space Maintainer+ (PSM+) and Pattern Space Maintainer- (PSM-), for the incremental and decremental maintenance of frequent patterns. Experimental results demonstrate that the proposed algorithms, on average, outperform the representative state-of-the-art methods by an order of magnitude.
VL - 26
CP - 3
ER -
TY - CONF
T1 - Pattern-Based Synonym and Antonym Extraction
T2 - 48th ACM Southeast Conference
Y1 - 2010
A1 - Wenbo Wang
A1 - Christopher Thomas
A1 - Amit Sheth
A1 - Victor Chan
KW - information extraction
KW - Pattern-Based extraction
KW - Verb Relationship
AB - Many research studies adopt manually selected patterns for semantic relation extraction. However, manually identifying and discovering patterns is time consuming and it is difficult to discover all potential candidates. Instead, we propose an automatic pattern construction approach to extract verb synonyms and antonyms from English newspapers. Instead of relying on a single pattern, we combine results indicated by multiple patterns to maximize the recall.
JA - 48th ACM Southeast Conference
CY - Oxford, Mississippi
ER -
TY - JOUR
T1 - Perspectives and Challenges for Recurrent Neural Network Training
JF - Logic Journal of the IGPL
Y1 - 2010
A1 - Marco Gori
A1 - Barbara Hammer
A1 - Pascal Hitzler
A1 - Guenther Palm
KW - neural network training challenges
AB - Recurrent neural networks (RNNs) offer flexible machine learning tools which share the learning abilities of feedforward networks and which extend their expression abilities based on dynamical equations. Hence, they can directly process complex spatiotemporal data and model complex dynamic systems. Since temporal and spatial data are present in many domains such as processing environmental time series, modelling the financial market, speech and language processing, robotics, bioinformatics, medical informatics, etc., RNNs constitute promising candidates for a variety of applications. Further, their rich dynamic repertoire as time dependent systems makes them suitable candidates for modelling brain phenomena or mimicking large-scale distributed computations and argumentations. Thus, RNNs carry the promise of efficient biologically plausible signal processing models optimally suited for a wide area of industrial applications on the one hand and an explanation of cognitive phenomena of the human brain on the other hand.
ER -
TY - CONF
T1 - Power of Clouds In Your Pocket: An Efficient Approach for Cloud Mobile Hybrid Application Development
T2 - 2nd IEEE International Conference on Cloud Computing Technology and Science
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - Krishnaprasad Thirunarayan
AB - The advancements in computing have resulted in a boom of cheap, ubiquitous, connected mobile devices as well as seemingly unlimited, utility-style, pay-as-you-go computing resources, commonly referred to as Cloud computing. However, taking full advantage of this mobile and cloud computing landscape, especially for data-intensive domains, has been hampered by the many heterogeneities that exist in the mobile space as well as the Cloud space. Our research focuses on exploiting the capabilities of the mobile and cloud landscape by defining a new class of applications called cloud mobile hybrid (CMH) applications and a Domain Specific Language (DSL) based methodology to develop these applications. We define Cloud-mobile hybrid as a collective application that has a Cloud based back-end and a mobile device front-end. Using a single DSL script, our toolkit is capable of generating a variety of CMH applications. These applications are composed of multiple combinations of native Cloud and mobile applications. Our approach not only reduces the learning curve but also shields developers from the complexities of the target platforms. We provide a detailed description of our language and present the results obtained using our prototype generator implementation. We also present a list of extensions that will enhance the various aspects of this platform.
JA - 2nd IEEE International Conference on Cloud Computing Technology and Science
CY - Indianapolis, Indiana
ER -
TY - BOOK
T1 - Progressive Concepts for Semantic Web Evolution: Applications and Developments
Y1 - 2010
A1 - Miltiadis Lytras
A1 - Amit Sheth
KW - semantic web applications
KW - semantic web development
KW - semantic web evolution
AB - Semantic Web technologies and applications have become increasingly important as new methods for understanding and expressing information are discovered.

Progressive Concepts for Semantic Web Evolution: Applications and Developments unites research on essential theories, models, and applications of Semantic Web research. Contributions focus on mobile ontologies and agents, fuzzy databases, and new approaches to retrieval and evaluation in the Semantic Web.
PB - IGI Global
CY - Pennsylvania
SN - 9781605669922
ER -
TY - CONF
T1 - Provenance Aware Linked Sensor Data
T2 - 2nd Workshop on Trust and Privacy on the Social and Semantic Web
Y1 - 2010
A1 - Harshal Patni
A1 - Satya S. Sahoo
A1 - Cory Henson
A1 - Amit Sheth
KW - Semantic Sensor Web
KW - Dataset Generation
KW - Linked Data
KW - provenir ontology
KW - Sensor Web Enablement
KW - Provenance Management Framework
KW - Provenance
KW - Lineage
KW - Sensor Data
KW - Resource Description Framework (RDF)
AB - Provenance, from the French word 'provenir', describes the lineage or history of a data entity. Provenance is critical information in the sensors domain to identify a sensor and analyze the observation data over time and geographical space. In this paper, we present a framework to model and query the provenance information associated with the sensor data exposed as part of the Web of Data using the Linked Open Data conventions. This is accomplished by developing an ontology-driven provenance management infrastructure that includes a representation model and query infrastructure. This provenance infrastructure, called Sensor Provenance Management System (PMS), is underpinned by a domain specific provenance ontology called Sensor Provenance (SP) ontology. The SP ontology extends the Provenir upper level provenance ontology to model domain-specific provenance in the sensor domain. In this paper, we describe the implementation of the Sensor PMS for provenance tracking in the Linked Sensor Data.
JA - 2nd Workshop on Trust and Privacy on the Social and Semantic Web
CY - Heraklion, Greece
ER -
TY - Generic
T1 - Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data
T2 - Scientific and Statistical Database Management (SSDBM 2010)
Y1 - 2010
A1 - Satya Sahoo
A1 - Olivier Bodenreider
A1 - Krishnaprasad Thirunarayan
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - Provenir ontology
KW - Provenance context entity
KW - Biomedical knowledge repository
KW - Context theory
KW - RDF reification
AB - The Resource Description Framework (RDF) format is being used by a large number of scientific applications to store and disseminate their datasets. The provenance information, describing the source or lineage of the datasets, is playing an increasingly significant role in ensuring data quality, computing trust value of the datasets, and ranking query results. Current provenance tracking approaches using the RDF reification vocabulary suffer from a number of known issues, including lack of formal semantics, use of blank nodes, and application-dependent interpretation of reified RDF triples. In this paper, we introduce a new approach called Provenance Context Entity (PaCE) that uses the notion of provenance context to create provenance-aware RDF triples. We also define the formal semantics of PaCE through a simple extension of the existing RDF(S) semantics that ensures compatibility of PaCE with existing Semantic Web tools and implementations. We have implemented the PaCE approach in the Biomedical Knowledge Repository (BKR) project at the US National Library of Medicine. The evaluations demonstrate a minimum of 49% reduction in total number of provenance-specific RDF triples generated using the PaCE approach as compared to RDF reification. In addition, performance for complex queries improves by three orders of magnitude and remains comparable to the RDF reification approach for simpler provenance queries.
JA - Scientific and Statistical Database Management (SSDBM 2010)
CY - Heidelberg, Germany
VL - 6187
ER -
TY - CONF
T1 - Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data
T2 - SSDBM2010
Y1 - 2010
A1 - Satya S. Sahoo
A1 - Krishnaprasad Thirunarayan
A1 - Olivier Bodenreider
A1 - Pascal Hitzler
A1 - Amit Sheth
KW - Biomedical knowledge repository
KW - Context theory
KW - domain specific provenance
KW - Model theoretic semantics
KW - PACE
KW - PrOM
KW - Provenance context
KW - Provenance context entity
KW - Provenance Management Framework
KW - Provenir ontology
KW - RDF reification
KW - semantic provenance
AB - The Resource Description Framework (RDF) format is being used by a large number of scientific applications to store and disseminate their datasets. The provenance information, describing the source or lineage of the datasets, is playing an increasingly significant role in ensuring data quality, computing trust value of the datasets, and ranking query results. Current provenance tracking approaches using the RDF reification vocabulary suffer from a number of known issues, including lack of formal semantics, use of blank nodes, and application-dependent interpretation of reified RDF triples. In this paper, we introduce a new approach called Provenance Context Entity (PaCE) that uses the notion of provenance context to create provenance-aware RDF triples. We also define the formal semantics of PaCE through a simple extension of the existing RDF(S) semantics that ensures compatibility of PaCE with existing Semantic Web tools and implementations. We have implemented the PaCE approach in the Biomedical Knowledge Repository (BKR) project at the US National Library of Medicine. The evaluations demonstrate a minimum of 49% reduction in total number of provenance-specific RDF triples generated using the PaCE approach as compared to RDF reification. In addition, performance for complex queries improves by three orders of magnitude and remains comparable to the RDF reification approach for simpler provenance queries.
JA - SSDBM2010
PB - The 22nd International Conference on Scientific and Statistical Database Management (SSDBM) 2010
ER -
TY - CONF
T1 - A Qualitative Examination of Topical Tweet and Retweet Practices
Y1 - 2010
A1 - Meenakshi Nagarajan
A1 - Hemant Purohit
A1 - Amit Sheth
KW - twitris
KW - twitter
KW - retweet
KW - user-generated content analysis
KW - information diffusion
AB - This work contributes to the study of retweet behavior on Twitter surrounding real-world events. We analyze over a million tweets pertaining to three events, present general tweet properties in such topical datasets and qualitatively analyze the properties of the retweet behavior surrounding the most tweeted/viral content pieces. Findings include a clear relationship between sparse/dense retweet patterns and the content and type of a tweet itself, suggesting the need to study content properties in link-based diffusion models.
PB - 4th Int'l AAAI Conference on Weblogs and Social Media (ICWSM)
ER -
TY - CONF
T1 - Ranking Documents Semantically Using Ontological Relationships
Y1 - 2010
A1 - Boanerges Aleman-Meza
A1 - Ismailcem Budak Arpinar
A1 - Mustafa V. Nural
A1 - Amit Sheth
KW - relationship based document relevance
KW - Semantic Association
KW - semantic document ranking
KW - semantic ranking
KW - semantic relationship-based ranking
KW - UIMA
AB - Despite the arguable success of today's keyword-based search engines in certain information retrieval tasks, ranking search results in a meaningful way remains an open problem. In this work, the goal is to use semantic relationships for ranking documents without relying on the existence of any specific structure in a document or links between documents. Instead, real-world entities are identified and the relevance of documents is determined using relationships that are known to exist between the entities in a populated ontology. We introduce a measure of relevance that is based on traversal and the semantics of relationships that link entities in an ontology. We expect that the semantic relationship-based ranking approach will be either an alternative or a complement to widely deployed document search for finding highly relevant documents that traditional syntactic and statistical techniques cannot find.
ER -
TY - CONF
T1 - A Rate Distortion Approach for Semi-Supervised Conditional Random Fields
Y1 - 2010
A1 - G. Haffari
A1 - Y. Wang
A1 - Shaojun Wang
A1 - G. Mori
KW - semi-supervised learning
AB - We propose a novel information theoretic approach for semi-supervised learning of conditional random fields that defines a training objective to combine the conditional likelihood on labeled data and the mutual information on unlabeled data. In contrast to previous minimum conditional entropy semi-supervised discriminative learning methods, our approach is grounded on a more solid foundation, the rate distortion theory in information theory. We analyze the tractability of the framework for structured prediction and present a convergent variational training algorithm to defy the combinatorial explosion of terms in the sum over label configurations. Our experimental results show the rate distortion approach outperforms standard l2 regularization, minimum conditional entropy regularization as well as maximum conditional entropy regularization on both multi-class classification and sequence labeling problems.
PB - Advances in Neural Information Processing Systems
ER -
TY - JOUR
T1 - A Reasonable Semantic Web
JF - Semantic Web – Interoperability, Usability, Applicability
Y1 - 2010
A1 - Frank Van Harmelen
A1 - Pascal Hitzler
KW - Automated Reasoning
KW - Formal Semantics
KW - Knowledge Representation
KW - Linked Open Data
AB - The realization of Semantic Web reasoning is central to substantiating the Semantic Web vision. However, current mainstream research on this topic faces serious challenges, which forces us to question established lines of research and to rethink the underlying approaches. We argue that reasoning for the Semantic Web should be understood as 'shared inference,' which is not necessarily based on deductive methods. Model-theoretic semantics (and sound and complete reasoning based on it) functions as a gold standard, but applications dealing with large-scale and noisy data usually cannot afford the required runtimes. Approximate methods, including deductive ones, but also approaches based on entirely different methods like machine learning or nature-inspired computing need to be investigated, while quality assurance needs to be done in terms of precision and recall values (as in information retrieval) and not necessarily in terms of soundness and completeness of the underlying algorithms.
PB - IOS Press
ER -
TY - JOUR
T1 - SCALE: a Scalable Framework for Efficiently Clustering Large Transactional Data
JF - Journal of Data Mining and Knowledge Discovery (DMKD)
Y1 - 2010
A1 - Hua Yan
A1 - Keke Chen
A1 - Ling Liu
A1 - Zhang Yi
KW - Framework
KW - Large Data Clusters
AB - This paper presents SCALE, a fully automated transactional clustering framework. The SCALE design highlights three unique features. First, we introduce the concept of Weighted Coverage Density as a categorical similarity measure for efficient clustering of transactional datasets. The concept of weighted coverage density is intuitive and it allows the weight of each item in a cluster to be changed dynamically according to the occurrences of items. Second, we develop the weighted coverage density measure based clustering algorithm, a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Third, we introduce two clustering validation metrics and show that these domain specific clustering evaluation metrics are critical to capture the transactional semantics in clustering analysis. Our SCALE framework combines the weighted coverage density measure for clustering over a sample dataset with self-configuring methods. These self-configuring methods can automatically tune the two important parameters of our clustering algorithms: (1) the candidates of the best number K of clusters; and (2) the application of two domain-specific cluster validity measures to find the best result from the set of clustering results. We have conducted extensive experimental evaluation using both synthetic and real datasets and our results show that the weighted coverage density approach powered by the SCALE framework can efficiently generate high quality clustering results in a fully automated manner.
ER -
TY - JOUR
T1 - Semantic Modeling for Cloud Computing, Part 1
Y1 - 2010
A1 - Amit Sheth
A1 - Ajith Ranabahu
KW - cloud computing
KW - semantic modeling
AB - Cloud computing has lately become the attention grabber in both academia and industry. The promise of seemingly unlimited, readily available utility-type computing has opened many doors previously considered difficult, if not impossible, to open. The cloud-computing landscape, however, is still evolving, and we must overcome many challenges to foster widespread adoption of clouds. The main challenge is interoperability. Numerous vendors have introduced paradigms and services, making the cloud landscape diverse and heterogeneous. Just as in the computer hardware industry's early days, when each vendor made and marketed its own version of (incompatible) computer equipment, clouds are diverse and vendor-locked. Although many efforts are under way to standardize clouds' important technical aspects, notably from the US National Institute of Standards and Technology (NIST), consolidation and standardization are still far from reality. In this two-part article, we discuss how a little bit of semantics can help address clouds' key interoperability and portability issues.
ER -
TY - JOUR
T1 - Semantic Modeling for Cloud Computing, Part 2
Y1 - 2010
A1 - Amit Sheth
A1 - Ajith Ranabahu
KW - Cloud computing
AB - Cloud computing has lately become the attention grabber in both academia and industry. The promise of seemingly unlimited, readily available utility-type computing has opened many doors previously considered difficult, if not impossible, to open. The cloud-computing landscape, however, is still evolving, and we must overcome many challenges to foster widespread adoption of clouds. The main challenge is interoperability. Numerous vendors have introduced paradigms and services, making the cloud landscape diverse and heterogeneous. Just as in the computer hardware industry's early days, when each vendor made and marketed its own version of (incompatible) computer equipment, clouds are diverse and vendor-locked. Although many efforts are under way to standardize clouds' important technical aspects, notably from the US National Institute of Standards and Technology (NIST), consolidation and standardization are still far from reality. In this two-part article, we discuss how a little bit of semantics can help address clouds' key interoperability and portability issues. In this part we discuss the potential for semantics in Clouds.
ER -
TY - THES
T1 - Semantic Provenance: Modeling, Querying, and Application in Scientific Discovery
T2 - Department of Computer Science and Engineering
Y1 - 2010
A1 - Satya S. Sahoo
KW - Biomedical Informatics
KW - Materialized Provenance View
KW - Provenance context entity
KW - Provenance Query Operators
KW - Provenir ontology
KW - RDF reification
KW - semantic provenance
KW - Semantic Web
KW - SPARQL Query Optimization
AB - Provenance metadata, describing the history or lineage of an entity, is essential for ensuring data quality, correctness of process execution, and computing trust values. Traditionally, provenance management issues have been dealt with in the context of workflow or relational database systems. However, existing provenance systems are inadequate to address the requirements of an emerging set of applications in the new eScience or Cyberinfrastructure paradigm and the Semantic Web. Provenance in these applications incorporates complex domain semantics on a large scale with a variety of uses, including accurate interpretation by software agents, trustworthy data integration, reproducibility, attribution for commercial or legal applications, and trust computation. In this dissertation, we introduce the notion of "semantic provenance" to address these requirements for eScience and Semantic Web applications. In addition, we describe a framework for management of semantic provenance by addressing three issues: (a) provenance representation, (b) query & analysis, and (c) scalable implementation. First, we introduce a foundational model of provenance called Provenir to serve as an upper-level reference ontology to facilitate provenance interoperability. Second, we define a classification scheme for provenance queries based on the query characteristics and use this scheme to define a set of specialized provenance query operators. Third, we describe the implementation of a highly scalable query engine to support the provenance query operators, which uses a new class of materialized views based on the Provenir ontology, called Materialized Provenance Views (MPV), for query optimization. We also define a novel provenance tracking approach called Provenance Context Entity (PaCE) for the Resource Description Framework (RDF) model used in Semantic Web applications.
PaCE, defined in terms of the Provenir ontology, is an effective and scalable approach for RDF provenance tracking in comparison to the currently used RDF reification vocabulary. Finally, we describe the application of the semantic provenance framework in biomedical and oceanography research projects.
JA - Department of Computer Science and Engineering
PB - Wright State University
CY - Dayton
ER -
TY - JOUR
T1 - Semantic Web - Interoperability, Usability, Applicability
JF - Semantic Web
Y1 - 2010
A1 - Pascal Hitzler
A1 - Krzysztof Janowicz
KW - Applicability
KW - Interoperability
KW - Semantic Web
KW - Usability
AB - While this statement seems obvious, this was not so a few years ago, when basic research funding seemed to be running out, and industrial uptake was hardly happening. In the meantime, we see not only sustained funding for Semantic Web related research (in particular by the European Commission), but also significant investment by industry, including major IT and venture capital companies. The Semantic Web is here to stay – and to grow. The Semantic Web is multidisciplinary and heterogeneous. Many Semantic Web researchers maintain close ties to neighboring disciplines which provide methods or application areas for their work. However, the Semantic Web has now established itself as a research field in its own right. Consequently, a growing number of researchers, in particular those of the second or third generation, seem to identify themselves with the Semantic Web as their primary field of work. The growing number of top quality events dedicated to Semantic Web topics is also a clear indication of this trend. Another indicator is the increasing interweavement of Semantic Web methods into related disciplines leading to research topics such as geospatial semantics, the Semantic Sensor Web, semantic desktop, or work on cultural heritage.
PB - IOS Press
VL - 1
ER -
TY - CHAP
T1 - Semantic Web Services
T2 - Handbook of Semantic Web Technologies: Semantic Web Applications
Y1 - 2010
A1 - Carlos Pedrinaci
A1 - John Domingue
A1 - Amit Sheth
KW - Semantic Web Services
AB - In recent years service orientation has increasingly been adopted as one of the main approaches for developing complex distributed systems out of reusable components called services. Realizing the potential benefits of this software engineering approach requires semi-automated and automated techniques and tools for searching or locating services, selecting the suitable ones, composing them into complex processes, resolving heterogeneity issues through process and data mediation, and reducing other tedious yet recurrent tasks with minimal manual effort. Just as semantics has brought significant benefits to search, integration and analysis of data, semantics is also seen as a key to achieving a greater level of automation to service orientation. This has led to research and development, as well as standardization efforts on semantic Web services. Activities related to semantic Web services have involved developing conceptual models or ontologies, algorithms and engines that could support machines in semi-automatically or automatically discovering, selecting, composing, orchestrating, mediating and executing services. In this chapter we provide an overview of the area after nearly a decade of research. We present the main principles and conceptual models proposed thus far including OWL-S, WSMO, and SAWSDL/METEOR-S. We also describe the main approaches developed by the research community that are able to use these semantic descriptions of services to support some of the typical activities related to services and service-based applications. We then illustrate the ideas and techniques described through two applications that integrate semantic Web services technologies within real-world applications. Finally, we provide a set of key resources that would allow the reader to reach a greater understanding of the field, and we present what we believe are the main issues that will drive the future of semantic Web services.
JA - Handbook of Semantic Web Technologies: Semantic Web Applications
PB - Springer-Verlag Berlin Heidelberg
CY - Berlin
VL - 2
SN - 978-3-540-92912-3
ER -
TY - CONF
T1 - Semantically Annotated RESTful Services for Large-scale Metabolomics Data Analysis
T2 - Ohio Collaborative Conference on Bioinformatics 2010
Y1 - 2010
A1 - Ashwin Manjunatha
A1 - Paul Anderson
A1 - Satya S. Sahoo
A1 - Ajith Ranabahu
A1 - Michael Raymer
A1 - Amit Sheth
JA - Ohio Collaborative Conference on Bioinformatics 2010
CY - Columbus, Ohio
ER -
TY - CONF
T1 - Semantics Centric Solutions for Application and Data Portability in Cloud Computing
T2 - 2nd IEEE International Conference on Cloud Computing Technology and Science
Y1 - 2010
A1 - Ajith Ranabahu
A1 - Amit Sheth
AB - Cloud computing has become one of the key considerations both in academia and industry. Cheap, seemingly unlimited computing resources that can be allocated almost instantaneously and pay-as-you-go pricing schemes are some of the reasons for the success of Cloud computing. The Cloud computing landscape, however, is plagued by many issues hindering adoption. One such issue is vendor lock-in, forcing the Cloud users to adhere to one service provider in terms of data and application logic. Semantic Web has been an important research area that has seen significant attention from both academic and industrial researchers. One key property of Semantic Web is the notion of interoperability and portability through high level models. Significant work has been done in the areas of data modeling, matching, and transformations. The issues the Cloud computing community is facing now with respect to portability of data and application logic are exactly the same issue the Semantic Web community has been trying to address for some time. In this paper we present an outline of the use of well established semantic technologies to overcome the vendor lock-in issues in Cloud computing. We present a semantics-centric programming paradigm to create portable Cloud applications and discuss MobiCloud, our early attempt to implement the proposed approach.
JA - 2nd IEEE International Conference on Cloud Computing Technology and Science
ER -
TY - THES
T1 - Semantics Enriched Service Environments
Y1 - 2010
A1 - Karthik Rajagopal
KW - data mediation
KW - Event identification
KW - Mediatability
KW - RESTful API search and ranking
KW - RESTful services
KW - SA-REST
KW - Semantic services
KW - Services computing
KW - Smart Mashups
AB - During the past seven years, services-centric computing has emerged as the preferred approach to architect complex software. Software is increasingly developed by integrating remotely existing components, popularly called services. This architectural paradigm, also called Service Oriented Architecture (SOA), brings with it the benefits of interoperability, agility and flexibility to software design and development. One can easily add or change features in existing systems, either by the addition of new services or by replacing existing ones. Two popular approaches have emerged for realizing SOA. The first approach is based on the SOAP protocol for communication and the Web Service Description Language (WSDL) for service interface description. SOAP and WSDL are built over XML, thus guaranteeing minimal structural and syntactic interoperability. In addition to SOAP and WSDL, the WS-* (WS-Star) stack or SOAP stack comprises other standards and specifications that enable features such as security and services integration. More recently, the RESTful approach has emerged as an alternative to the SOAP stack. This approach advocates the use of the HTTP operations of GET/PUT/POST/DELETE as standard service operations and the REpresentational State Transfer (REST) paradigm for maintaining service states. The RESTful approach leverages the HTTP protocol and has gained a lot of traction, especially in the context of consumer Web applications such as Maps.

Despite their growing adoption, the stated objectives of interoperability, agility, and flexibility have been hard to achieve using either of the two approaches. This is largely because of the various heterogeneities that exist between different service providers. These heterogeneities are present both at the data and the interaction levels. Fundamental to addressing these heterogeneities are the problems of service Description, Discovery, Data mediation and Dynamic configuration. Currently, service descriptions capture the various operations, the structure of the data, and the invocation protocol. They do not, however, capture the semantics of either the data or the interactions. This minimal description impedes the ability to find the right set of services for a given task, thus affecting the important task of service discovery. Data mediation is by far the most arduous task in service integration. This has been a well studied problem in the areas of workflow management, multi-database systems and services computing. Data models that describe real world data, such as enterprise data, often involve hundreds of attributes. Approaches for automatic mediation have not been very successful, while the complexity of the models requires considerable human effort. The above mentioned problems in description, discovery and data mediation pose a considerable challenge to creating software that can be dynamically configured.

This dissertation is one of the first attempts to address the problems of description, discovery, data mediation and dynamic configuration in the context of both SOAP and RESTful services. This work builds on past research in the areas of Semantic Web, Semantic Web services and Service Oriented Architectures. In addition to addressing these problems, this dissertation also extends the principles of services computing to the emerging area of social and human computation. The core contributions of this work include a mechanism to add semantic metadata to RESTful services and resources on the Web, an algorithm for service discovery and ranking, and techniques for aiding data mediation and dynamic configuration. This work also addresses the problem of identifying events during service execution, and data integration in the context of socially powered services.
ER -
TY - CONF
T1 - Semantics-Empowered Text Exploration for Knowledge Discovery
T2 - 48th ACM Southeast Conference Oxford Mississippi, April 15-17, 2010
Y1 - 2010
A1 - Delroy Cameron
A1 - Pablo N. Mendes
A1 - Amit Sheth
A1 - Victor Chan
KW - annotation
KW - Exploratory Search
KW - Knowledge Exploration
KW - Navigation
KW - Semantic Browsing
KW - Semantic Metadata
AB - The interaction paradigm offered by most contemporary Web Information Systems is a search-and-sift paradigm in which users manually seek information using hyperlinked documents. This paradigm is derived from a document-centric model that gives users minimal support for scanning through high volumes of text. We present a novel information exploration paradigm based on a data-centric view of corpora, along with a prototype implementation that demonstrates the value in content-driven navigation. We leverage semantic metadata to link data in documents by exploiting named relationships between entities. We also present utilities for gathering user generated navigation trails, critical for knowledge discovery. We discuss the impact of our approach in the context of knowledge exploration.
JA - 48th ACM Southeast Conference Oxford Mississippi, April 15-17, 2010
PB - ACM
CY - Oxford, Mississippi
ER -
TY - Generic
T1 - Sensor Data and Perception: Can Sensors Play 20 Questions?
T2 - Dagstuhl Seminar 10042: Semantic Challenges in Sensor Networks
Y1 - 2010
A1 - Cory Henson
KW - Perception
KW - Semantic Sensor Web
KW - SSW
AB - Currently, there are many sensors collecting information about our environment, leading to an overwhelming number of observations that must be analyzed and explained in order to achieve situation awareness. As perceptual beings, we are also constantly inundated with sensory data, yet we are able to make sense of our environment with relative ease. Why is the task of perception so easy for us, and so hard for machines; and could this have anything to do with how we play the game 20 Questions?
JA - Dagstuhl Seminar 10042: Semantic Challenges in Sensor Networks
PB - Schloss Dagstuhl
CY - Dagstuhl, Germany
ER -
TY - RPRT
T1 - Sensor Discovery on Linked Data
Y1 - 2010
A1 - Josh Pschorr
A1 - Cory Henson
A1 - Harshal Patni
A1 - Amit Sheth
KW - Semantic Web and Linked Data and Sensor Web Enablement and Architectures and Middleware for Semantic Sensor Networks and Sensor Discovery
AB - There has been a drive recently to make sensor data accessible on the Web. However, because of the vast number of sensors collecting data about our environment, finding relevant sensors on the Web is a non-trivial challenge. In this paper, we present an approach to discovering sensors through a standard service interface over Linked Data. This is accomplished with a semantic sensor network middleware that includes a sensor registry on Linked Data and a sensor discovery service that extends the OGC Sensor Web Enablement. With this approach, we are able to access and discover sensors that are positioned near named-locations of interest.
ER -
TY - CONF
T1 - Some Trust Issues in Social Networks and Sensor Networks
T2 - Collaborative Technologies and Systems (CTS 2010)
Y1 - 2010
A1 - Krishnaprasad Thirunarayan
A1 - Cory Henson
A1 - Pramod Anantharam
A1 - Amit Sheth
KW - Social Networks and Sensor Networks and Reputation and Trust Management and Trust Metrics and Semantic Web Technologies
AB - Trust and reputation are becoming increasingly important in diverse areas such as search, e-commerce, social media, semantic sensor networks, etc. We review past work and explore future research issues relevant to trust in social/sensor networks and interactions. We advocate a balanced, iterative approach to trust that marries both theory and practice. On the theoretical side, we investigate models of trust to analyze and specify the nature of trust and trust computation. On the practical side, we propose to uncover aspects that provide a basis for trust formation and techniques to extract trust information from concrete social/sensor networks and interactions. We expect the development of formal models of trust and techniques to glean trust information from social media and sensor web to be fundamental enablers for applying semantic web technologies to trust management.
JA - Collaborative Technologies and Systems (CTS 2010)
CY - Chicago, Illinois
ER -
TY - CONF
T1 - Spatial Semantics for Better Interoperability and Analysis: Challenges and Experiences in Building Semantically Rich Applications in Web 3.0
T2 - 3rd Annual Spatial Ontology Community of Practice Workshop: Development, Implementation, and Use of Geo-Spatial Ontologies and Semantics
Y1 - 2010
A1 - Amit Sheth
KW - Semantic Sensor Web
KW - Linked Sensor Data
KW - Linked Open Data
KW - Spatial Semantics
KW - LOD
KW - Spatial Aggregation
KW - Active Machine Perception
KW - Streaming Real-time Data
KW - Streaming Sensor Data
JA - 3rd Annual Spatial Ontology Community of Practice Workshop: Development, Implementation, and Use of Geo-Spatial Ontologies and Semantics
CY - Reston, VA
ER -
TY - CONF
T1 - A Study in Hadoop Streaming with Matlab for NMR Data Processing
T2 - 2nd IEEE International Conference on Cloud Computing Technology and Science
Y1 - 2010
A1 - Ajith Ranabahu
A1 - Paul Anderson
A1 - Kalpa Gunaratna
A1 - Amit Sheth
KW - Hadoop
KW - hadoop streaming
KW - matlab
KW - NMR
AB - Applying Cloud computing techniques to the analysis of large data sets has shown promise in many data-driven scientific applications. Our approach is to use Cloud computing for Nuclear Magnetic Resonance (NMR) data analysis, which normally involves large amounts of data. Biologists often use third-party or commercial software for ease of use, and being able to run this kind of software in a Cloud would be highly advantageous. Scripting languages designed especially for clouds may not have the flexibility biologists need. At the same time, the special software packages biologists are familiar with, which let them write complex calculations with minimal effort, are often not compatible with a Cloud environment. Our solution therefore benefits biologists who analyze NMR data: it gives them the flexibility to Cloud-enable their familiar software, and it enables them to perform calculations on amounts of data that were not previously feasible. Our study is also applicable to any other environment in need of similar flexibility. We are currently in the initial stage of developing a framework for NMR data analysis.
JA - 2nd IEEE International Conference on Cloud Computing Technology and Science
ER -
TY - CONF
T1 - A Taxonomy-based Model for Expertise Extrapolation
T2 - 4th International Conference on Semantic Computing (ICSC2010)
Y1 - 2010
A1 - Boanerges Aleman-Meza
A1 - Amit Sheth
A1 - Ismailcem Budak Arpinar
A1 - Delroy Cameron
A1 - Sheron Decker
KW - Bibliographic Data
KW - Collaboration Networks
KW - Expert Finder
KW - Semantic Association
KW - Taxonomy
AB - While many ExpertFinder applications succeed in finding experts, their techniques are not always designed to capture the various levels at which expertise can be expressed. Indeed, expertise can be inferred from relationships between topics and subtopics in a taxonomy. The conventional wisdom is that expertise in subtopics is also indicative of expertise in higher-level topics. The enrichment of Expertise Profiles for finding experts can therefore be facilitated by taking domain hierarchies into account. We present a novel semantics-based model for finding experts, expertise levels and collaboration levels in a peer review context, such as composing a Program Committee (PC) for a conference. The implicit coauthorship network encompassed by bibliographic data makes it possible to discover unknown experts within various degrees of separation in the coauthorship graph. Our results show an average of 85% recall in finding experts, when evaluated against three WWW Conference PCs and close to 80 additional comparable experts outside the immediate collaboration network of the PC Chairs.
JA - 4th International Conference on Semantic Computing (ICSC2010)
CY - Pittsburgh PA, USA
ER -
TY - CONF
T1 - Three-dimensional Stereoscopic Exploration System for the Heart
Y1 - 2010
A1 - Thomas Wischgoll
AB - Coronary heart disease (CHD) is one of the main causes of death in the United States. Although it is well known that CHD mainly occurs due to blocked arteries, many of the specifics of this disease are still the subject of current research. It is commonly accepted that certain factors, such as a high-cholesterol diet, increase the risk of coronary heart disease. As a consequence, people should be educated to adhere to a diet low in low-density lipoprotein (LDL or bad cholesterol). In order for children to become familiar with these facts, educational, explorative computer systems can be employed to raise awareness. Hence, this presentation describes an educational computer game for children. While practicing their navigation skills, children can learn about the various types of blood cells and particles within the blood stream. A geometric model of the arterial vascular system of the heart was generated based on a CT scan, which considers vessels of different sizes. An interactive flythrough using a standard game controller facilitates the exploration of the interior structure of the vasculature. A blood flow simulation including several different particles within the blood stream allows the young explorer to understand their functionality. Atherosclerotic lesions can be modeled to add calcifications to the geometric model. Since the blood flow visualized by the particles within the blood stream adapts to those lesions, the user learns about the immediate effects of such calcifications on the blood flow. The computer game supports stereoscopic display systems. Through the use of, for example, active shutter glasses and large-screen plasma displays, a fully immersive gaming environment can be achieved. An early version of this computer game has been deployed as an interactive museum exhibit for children. The primary age group addressed by the science museum where it was displayed is 4-9 years.
ER -
TY - CONF
T1 - Towards Cloud Mobile Hybrid Application Generation using Semantically Enriched Domain Specific Languages
Y1 - 2010
A1 - Krishnaprasad Thirunarayan
A1 - Ajith Ranabahu
A1 - Ashwin Manjunatha
A1 - Amit Sheth
KW - Cloud Computing
KW - Cirrocumulus
KW - Mobile Computing
AB - Advances in computing have resulted in a boom of cheap, ubiquitous, connected mobile devices as well as seemingly unlimited, utility-style, pay-as-you-go computing resources, commonly referred to as Cloud computing. Taking advantage of this computing landscape, however, has been hampered by the many heterogeneities that exist in the mobile space as well as the Cloud space. This research attempts to introduce a disciplined methodology for developing Cloud-mobile hybrid applications by using a Domain Specific Language (DSL) centric approach to generate applications. A Cloud-mobile hybrid is an application that is split between a Cloud-based back-end and a mobile-device-based front-end. We present mobicloud, a prototype system built on a DSL that is capable of developing these hybrid applications. This not only reduces the learning curve but also shields the developers from the native complexities of the target platforms. We also present our vision for propelling this research forward by enriching the DSLs with semantics. The high-level vision is outlined in the ambitious Cirrocumulus project, the driving principle being write once, run on any device.
PB - International Workshop on Mobile Computing and Clouds (MobiCloud 2010)
ER -
TY - CONF
T1 - Trust In Social and Sensor Networks
Y1 - 2010
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
A1 - Cory Henson
A1 - Amit Sheth
KW - trust
KW - trust model
AB - Trust can be defined as the perception of the trustor about the degree to which the trustee would satisfy an expectation about a transaction constituting risk. Trust plays a pivotal role when the risk in believing incorrect information is high. With Web 2.0 where user generated content and real time interactions dominate, the openness of data contribution may hinder the quality of information we can get.
PB - Wright State University
ER -
TY - CONF
T1 - Trust Model for Semantic Sensor and Social Networks: A Preliminary Report
T2 - National Aerospace & Electronics Conference (NAECON)
Y1 - 2010
A1 - Pramod Anantharam
A1 - Krishnaprasad Thirunarayan
A1 - Cory Henson
A1 - Amit Sheth
KW - reputation
KW - sensor networks
KW - social networks
KW - trust
KW - trust model
AB - Trust is an amorphous concept that is becoming increasingly important in many domains, such as P2P networks, E-commerce, social networks, and sensor networks. While we all have an intuitive notion of trust, the literature is scattered with a wide assortment of differing definitions and descriptions; often these descriptions are highly dependent on a single domain or application of interest. In addition, they often discuss orthogonal aspects of trust while continuing to use the general term 'trust'. In order to make sense of the situation, we have developed an ontology of trust that integrates and relates its various aspects into a single model.
JA - National Aerospace & Electronics Conference (NAECON)
PB - In Proceedings of National Aerospace & Electronics Conference (NAECON)
CY - Dayton, Ohio
ER -
TY - CONF
T1 - Twarql: Tapping into the Wisdom of the Crowd
T2 - Triplification Challenge 2010
Y1 - 2010
A1 - Pavan Kapanipathi
A1 - Alexandre Passant
A1 - Pablo Mendes
KW - Brand tracking
KW - Linked Open Data
KW - Real time
KW - Semantic Web
KW - Social Media
KW - twitter
AB - Twarql is an infrastructure translating microblog posts from Twitter as Linked Open Data in real-time. The approach employed in Twarql can be summarized as follows: (1) extract content (e.g. entity mentions, hashtags and URLs) from microposts streamed from Twitter; (2) encode content in RDF using shared and well-known vocabularies (FOAF, SIOC, MOAT, etc.); (3) enable structured querying of microposts with SPARQL; (4) enable subscription to a stream of microposts that match a given query; and (5) enable scalable real-time delivery of streaming annotated data using sparqlPuSH. In this paper we use a brand tracking scenario to demonstrate how Twarql enables flexibility in handling the information overload of those interested in collectively analyzing microblog data for sensemaking. The dataset produced is shared as Linked Data. Twarql is available as open source and can be easily deployed or extended for monitoring Twitter data in various contexts such as brand tracking, disaster relief management, stock exchange monitoring, etc.
JA - Triplification Challenge 2010
PB - 6th International Conference on Semantic Systems (I-SEMANTICS)
ER -
TY - ABST
T1 - Twitris 2.0 : Semantically Empowered System for Understanding Perceptions From Social Data
Y1 - 2010
A1 - Ajith Ranabahu
A1 - Ashutosh Jadhav
A1 - Hemant Purohit
A1 - Pramod Anantharam
A1 - Pablo Mendes
A1 - Pavan Kapanipathi
A1 - Vinh Nguyen
A1 - Gary Alan Smith
A1 - Michael Cooney
A1 - Amit Sheth
KW - Twitris
KW - Semantic Web
KW - Twitter Data Analysis
KW - Social Data Analysis
KW - Social Semantic Web
KW - Social Network Analysis
AB - We present Twitris 2.0, a Semantic Web application that facilitates understanding of social perceptions through semantics-based processing of massive amounts of event-centric data. Twitris 2.0 addresses challenges in large-scale processing of social data, preserving spatio-temporal-thematic properties. Twitris 2.0 also covers context-based semantic integration of multiple Web resources and exposes semantically enriched social data to the public domain. Semantic Web technologies enable these systematic integration and analysis capabilities.
ER -
TY - JOUR
T1 - Unary First Order Logic Queries Over Views
Y1 - 2010
A1 - Guozhu Dong
A1 - James Bailey
A1 - Anthony Widjaja
ER -
TY - THES
T1 - Understanding User-Generated Content on Social Media
Y1 - 2010
A1 - Meenakshi Nagarajan
KW - domain knowledge
KW - informal text analysis
KW - Social Media
KW - user-generated content
AB - Over the last few years, there has been a growing public and enterprise fascination with 'social media' and its role in modern society. At the heart of this fascination is the ability of users to participate, collaborate, consume, create and share content via a variety of platforms such as blogs, micro-blogs, email, instant messaging services, social network services, collaborative wikis, social bookmarking sites, and multimedia sharing sites. This dissertation is devoted to understanding informal user-generated textual content on social media platforms and using the results of the analysis to build Social Intelligence Applications. The body of research presented in this thesis focuses on understanding what a piece of user-generated content is about via two sub-goals of Named Entity Recognition and Key Phrase Extraction on informal text. In light of the poor context and informal nature of content on social media platforms, we investigate the role of contextual information from documents, domain models and the social medium to supplement and improve the reliability and performance of existing text mining algorithms for Named Entity Recognition and Key Phrase Extraction. In all cases we find that using multiple contextual cues together leads to reliable inter-dependent decisions, better than using the cues in isolation, and that such improvements are robust across domains and content of varying characteristics, from micro-blogs like Twitter, to social networking forums such as those on MySpace and Facebook, to blogs on the Web. Finally, we showcase two deployed Social Intelligence applications that build on the results of Named Entity Recognition and Key Phrase Extraction algorithms to provide near real-time information about the pulse of an online populace.
Specifically, we describe what it takes to build applications that wish to exploit the 'wisdom of the crowds' - highlighting challenges in data collection, processing informal English text, metadata extraction and presentation of the resulting information.
ER -
TY - CONF
T1 - What Goes Around Comes Around - Improving Linked Open Data through On-Demand Model Creation
Y1 - 2010
A1 - Delroy Cameron
A1 - Christopher Thomas
A1 - Pablo N. Mendes
A1 - Pankaj Mehra
A1 - Wenbo Wang
A1 - Amit Sheth
KW - Information Extraction
KW - Model Creation
KW - Linked Open Data
KW - Knowledge Extraction
AB - We present a method for growing the amount of knowledge available on the Web using a hermeneutic method that involves background knowledge, Information Extraction techniques and validation through discourse and use of the extracted information. We exemplify this using Linked Data as background knowledge, automatic Model/Ontology creation for the IE part and a Semantic Browser for evaluation. The hermeneutic approach, however, is open to be used with other IE techniques and other evaluation methods. We will present results from the model creation and anecdotal evidence for the feasibility of 'Validation through Use'.
PB - Web Science
ER -
TY - CONF
T1 - 3D Reconstruction and Visualization of a Hovering Dragonfly
Y1 - 2009
A1 - Christopher Koehler
A1 - Haibo Dong
A1 - Zachary Gaston
A1 - Hui Wan
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Analysis and Monetization of Social Data
T2 - Analysis and Monetization of Social Data
Y1 - 2009
A1 - Amit Sheth
KW - advertising on social networks
KW - analysis of user generated content
KW - intent of user post
KW - monetizing social data
KW - spatio-temporal-thematic analysis of social data
KW - twitris
JA - Analysis and Monetization of Social Data
ER -
TY - CONF
T1 - An Anytime Algorithm for Computing Inconsistency Measurement
T2 - Third International Conference, KSEM 2009
Y1 - 2009
A1 - Yue Ma
A1 - Guilin Qi
A1 - Guohui Xiao
A1 - Zuoquan Lin
A1 - Pascal Hitzler
AB - Measuring inconsistency degrees of inconsistent knowledge bases is an important problem as it provides context information for facilitating inconsistency handling. Many methods have been proposed to solve this problem and a main class of them is based on some kind of paraconsistent semantics. In this paper, we consider the computational aspects of inconsistency degrees of propositional knowledge bases under 4-valued semantics. We first analyze its computational complexity. As it turns out that computing the exact inconsistency degree is intractable, we then propose an anytime algorithm that provides a tractable approximation of the inconsistency degree from above and below. We show that our algorithm satisfies some desirable properties and give experimental results of our implementation of the algorithm.
JA - Third International Conference, KSEM 2009
ER -
TY - CHAP
T1 - Applications of Emerging Patterns for Microarray Gene Expression Data Analysis
Y1 - 2009
A1 - Jinyan Li
KW - Data Analysis
KW - Gene Expression
KW - Microarray
ER -
TY - CHAP
T1 - Association Analytics for Network Connectivity in a Bibliographic and Expertise Dataset
Y1 - 2009
A1 - Ismailcem Budak Arpinar
A1 - Boanerges Aleman-Meza
A1 - Sheron L. Decker
A1 - Delroy Cameron
KW - Bibliography Measures
KW - Centrality
KW - Collaboration Strength
KW - DBLP
KW - Expert Finder
KW - Ontologies
KW - rdf
KW - Semantic Web
ER -
TY - JOUR
T1 - Automated Isolation of Translational Efficiency Bias that Resists the Confounding Effect of GC(AT)-Content
JF - IEEE Control Systems Society
Y1 - 2009
A1 - Travis Doom
A1 - D. Krane
A1 - Doug Raiford
A1 - Michael Raymer
KW - codon usage bias
KW - GC-content
KW - strand bias
KW - translational efficiency
AB - Genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale; however, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Detection of this bias is an important tool in the analysis of genomic data, particularly as a predictor of gene expressivity. Methods for identifying codon usage bias in genomic data that rely solely on genomic sequence data are susceptible to being confounded by the presence of several factors simultaneously influencing codon selection. Presented here is a new technique for removing the effects of one of the more common confounding factors, GC(AT)-content, and of visualizing the search-space for codon usage bias through the use of a solution landscape. This technique successfully isolates expressivity-related codon usage trends, using only genomic sequence information, where other techniques fail due to the presence of GC(AT)-content confounding influences.
ER -
TY - CONF
T1 - Automated Review of Natural Language Requirements Documents: Generating Useful Warnings with User-extensible Glossaries Driving a Simple State Machine
T2 - Automated Review of Natural Language Requirements Documents: Generating Useful Warnings with User-extensible Glossaries Driving a Simple State Machine
Y1 - 2009
A1 - Prateek Jain
A1 - Kunal Verma
A1 - Alex Kass
A1 - Reymonrod Vasquez
AB - We present an approach to automating some of the quality assurance review of software requirements documents and promoting best practices for requirements documentation. The system we describe, the Requirements Analysis Tool (RAT), has been deployed and is currently being used in pilot projects with large and complex requirements documents. Preliminary results indicate a reduction in the time needed to review documents and a reduction in requirements defects, as well as a change in the way users think about writing requirements. Our approach allows users to write requirements in natural language instead of an artificial formalism. RAT enforces requirements documentation best practices such as using standardized syntaxes and internally consistent use of terminology. It supports the use of user-extensible glossaries to define terms. The formalism driving RAT is a state machine, which is used to classify requirements into types based on keywords and then verify that the requirements follow one of the best-practice syntaxes supported by the tool. It generates helpful warning messages explaining where requirements are not following best practices and suggests ways to rectify them.
JA - Automated Review of Natural Language Requirements Documents: Generating Useful Warnings with User-extensible Glossaries Driving a Simple State Machine
ER -
TY - CONF
T1 - Automatic Composition of Semantic Web Services Using Process and Data Mediation
T2 - Tenth International Conference on Web Information Systems Engineering
Y1 - 2009
A1 - Amit Sheth
A1 - John Miller
A1 - Karthik Gomadam
A1 - Ajith Ranabahu
A1 - Zixin Wu
AB - Web service composition has quickly become a key area of research in the service-oriented architecture community. One of the challenges in composition is the existence of heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or, where it sought largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflows for real-world problems. In this paper, we present a planning-based approach to solve both the process heterogeneity and data heterogeneity problems. Our system successfully outputs an executable BPEL file which correctly solves non-trivial real-world process specifications outlined in the 2006 SWS Challenge.
JA - Tenth International Conference on Web Information Systems Engineering
CY - Poznan, Poland
ER -
TY - CONF
T1 - A Best Practice Model for Cloud Middleware Systems
Y1 - 2009
A1 - Michael Maximilien
A1 - Ajith Ranabahu
KW - Cloud Computing
KW - Best Practices
AB - Cloud computing is the latest trend in computing where the intention is to facilitate cheap, utility-type computing resources in a service-oriented manner. However, the cloud landscape is still maturing and there are heterogeneities between the clouds, ranging from the application development paradigms to their service interfaces and scaling approaches. These differences hinder the adoption of the cloud by major enterprises. We believe that a cloud middleware can solve most of these issues to allow cross-cloud inter-operation. Our proposed system is Altocumulus, a cloud middleware that homogenizes the clouds. In order to provide the best use of the cloud resources and make that use predictable and repeatable, Altocumulus is based on the concept of cloud best practices. In this paper we briefly describe the Altocumulus middleware and detail the cloud best practice model it encapsulates. We also present examples based on real-world deployments as evidence of the applicability of our middleware.
PB - 24th ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)
ER -
TY - JOUR
T1 - Biophysical Model of Spatial Heterogeneity of Myocardial Flow
JF - Biophysical Journal
Y1 - 2009
A1 - Ghassan Kassab
A1 - Benjamin Kaimovitz
A1 - Yoram Lanir
A1 - Yunlong Huo
A1 - Julien I.E. Hoffman
A1 - Thomas Wischgoll
AB - The blood flow in the myocardium has significant spatial heterogeneity. The objective of this study was to develop a biophysical model based on detailed anatomical data to determine the heterogeneity of regional myocardial flow during diastole. The model predictions were compared with experimental measurements in a diastolic porcine heart in the absence of vessel tone using nonradioactive fluorescent microsphere measurements. The results from the model and experimental measurements showed good agreement. The relative flow dispersion in the arrested, vasodilated heart was found to be 44% and 48% numerically and experimentally, respectively. Furthermore, the flow dispersion was found to have fractal characteristics with fractal dimensions (D) of 1.25 and 1.27 predicted by the model and validated by the experiments, respectively. This validated three-dimensional model of the normal diastolic heart will play an important role in elucidating the spatial heterogeneity of coronary blood flow, and serve as a foundation for understanding the interplay between cardiac mechanics and coronary hemodynamics.
ER -
TY - CONF
T1 - Characterization of 1H NMR Spectroscopic Data and the Generation of Synthetic Validation Sets.
T2 - Characterization of 1H NMR Spectroscopic Data and the Generation of Synthetic Validation Sets.
Y1 - 2009
A1 - Travis Doom
A1 - Ben Kelly
A1 - Paul Anderson
A1 - Michael Raymer
A1 - Nicholas J. DelRaso
A1 - Nicholas Reo
AB - Motivation: Common contemporary practice within the nuclear magnetic resonance (NMR) metabolomics community is to evaluate and validate novel algorithms on empirical data or simplified simulated data. Empirical data captures the complex characteristics of experimental data, but the optimal or most correct analysis is unknown a priori; therefore, researchers are forced to rely on indirect performance metrics, which are of limited value. In order to achieve fair and complete analysis of competing techniques, more exacting metrics are required. Thus, metabolomics researchers often evaluate their algorithms on simplified simulated data with a known answer. Unfortunately, the conclusions obtained on simulated data are only of value if the data sets are complex enough for results to generalize to true experimental data. Ideally, synthetic data should be indistinguishable from empirical data, yet retain a known best analysis. Results: We have developed a technique for creating realistic synthetic metabolomics validation sets based on NMR spectroscopic data. The validation sets are developed by characterizing the salient distributions in sets of empirical spectroscopic data. Using this technique, several validation sets are constructed with a variety of characteristics present in real data. A case study is then presented to compare the relative accuracy of several alignment algorithms using the increased precision afforded by these synthetic data sets. Availability: These data sets are available for download at http://birg.cs.wright.edu/nmr_synthetic_data_sets.
JA - Characterization of 1H NMR Spectroscopic Data and the Generation of Synthetic Validation Sets.
ER -
TY - ABST
T1 - Citizen Sensing, Social Signals, and Enriching Human Experience
Y1 - 2009
A1 - Amit Sheth
AB - In this article, I introduce the exciting paradigm of citizen sensing enabled by mobile sensors and human computing: humans as citizens on the ubiquitous Web, acting as sensors and sharing their observations and views using mobile devices and Web 2.0 services.
ER -
TY - CONF
T1 - A Comparison of Codon Usage Trends in Prokaryotes
T2 - A Comparison of Codon Usage Trends in Prokaryotes
Y1 - 2009
A1 - Travis Doom
A1 - Amanda Hanes
A1 - D. Krane
A1 - Michael Raymer
AB - Codon usage bias is an effective measure of the differences among organisms at a genomic level. These genomic differences also reflect some differences in the organisms' lifestyles and physiology. Here we demonstrate that prokaryotic obligate intracellular parasites and symbionts have a codon usage pattern that differs significantly from that of exclusively free-living prokaryotes. This result is valuable in that it suggests that the habitat of an organism may directly influence that organism's use of synonymous codons, which in turn provides evidence of an evolutionary mechanism that operates at a finer molecular level than that of amino acids and proteins.
JA - A Comparison of Codon Usage Trends in Prokaryotes
ER -
TY - ABST
T1 - Computational Analysis of Metabolomic Toxicological Data Derived from NMR Spectroscopy
Y1 - 2009
A1 - Ben Kelly
KW - Metabolomic Toxicological
KW - NMR Spectroscopy
AB - Nuclear magnetic resonance (NMR) spectroscopy is a non-invasive method of acquiring metabolic profiles from biofluids. The most informative metabolomic features, or biomarkers, may provide keys to the early detection of changes within an organism such as those that result from exposure to a toxin. One major difficulty with typical NMR data, whether it comes from a toxicological, medical or other source, is that it features a low sample size relative to the number of variables measured. Thus, traditional pattern recognition techniques are not always feasible. The curse of dimensionality is an important consideration in selecting appropriate statistical and pattern recognition methods for the identification of potential biomarkers. In this thesis, several alternatives for isolating biomarkers are evaluated on an NMR-derived toxicological data set and the results are compared: the fold test, univariate ranking, the unpaired t-test, and the paired t-test are examined. Potential biomarkers were inspected for differences based on several subjective criteria, including the ability to identify consistent differences between treatment and control samples and to distinguish potential vehicle effects, those effects caused by the method of delivery performed on both treated and control animals. Based on these results, the paired t-test method is preferred, due to its ability to attribute statistical significance, to take into consideration consistency of a single subject over a time course, and to mitigate the low sample, high dimensionality problem. A protocol for the paired t-test is also proposed to remove potential vehicle effects and identify toxic responses above the vehicle effects. Due to the large number of variables to be considered, a correction for multiple testing must be employed. In this thesis, several methods of correction for multiple testing are evaluated.
An acceptable p-value cutoff for each correction is proposed so that the most appropriate correction can be applied based on the purpose of the metabolomic toxicology experiment. Also in this thesis, a more complex method for identifying biomarkers, Orthogonal Projection to Latent Structures Discriminant Analysis (O-PLS-DA), is compared to the t-test using synthetic data sets based on the characterization of experimental NMR spectra. The ranking of potential biomarkers produced by both methods is compared to the ranking of features used to create the synthetic data. In addition, an O-PLS-DA permutation test method of determining an important feature cutoff is evaluated using the synthetic data. The variable-at-a-time t-test method using a p-value threshold is also evaluated for comparison. Based on these results the O-PLS-DA permutation test was not consistent or stable enough to distinguish truly responding biomarkers. The benefits of O-PLS-DA, including its ability to deal with correlated variables, removal of unwanted systematic variation, and the ability to deal with some amount of missing data, make it sufficient for identifying potential biomarkers. It is determined that O-PLS-DA does not rank potential biomarkers differently than the t-test nor does it classify new samples significantly better or worse than a majority-vote based t-test classifier.
ER -
TY - CONF
T1 - Computing for Human Experience: Semantics Empowered Sensors, Services, and Social Computing on Ubiquitous Web
Y1 - 2009
A1 - Amit Sheth
ER -
TY - CONF
T1 - Computing for Human Experience: Sensing, Perception, Semantics, Social Computing, Web 3.0, and beyond
Y1 - 2009
A1 - Amit Sheth
PB - 20th Midwest Artificial Intelligence and Cognitive Science Conference (MAICS2009)
ER -
TY - CONF
T1 - Context and Domain Knowledge Enhanced Entity Spotting in Informal Text
T2 - International Semantic Web Conference (ISWC 2009)
Y1 - 2009
A1 - Meenakshi Nagarajan
A1 - Daniel Gruhl
A1 - Christine Robson
A1 - Jan Pieper
A1 - Amit Sheth
AB - This paper explores the application of restricted relationship graphs (RDF) and statistical NLP techniques to improve named entity annotation in challenging Informal English domains. We validate our approach using on-line forums discussing popular music. Named entity annotation is particularly difficult in this domain because it is characterized by a large number of ambiguous entities, such as the Madonna album "Music" or Lilly Allen's pop hit "Smile". We evaluate improvements in annotation accuracy that can be obtained by ...
JA - International Semantic Web Conference (ISWC 2009)
CY - Chantilly, VA, USA
ER -
TY - CONF
T1 - Context is Highly Contextual!
T2 - Context is Highly Contextual!
Y1 - 2009
A1 - Amit Sheth
JA - Context is Highly Contextual!
ER -
TY - CONF
T1 - A Contrast Pattern Based Clustering Quality Index for Categorical Data
T2 - IEEE International Conference on Data Mining series (ICDM 2009)
Y1 - 2009
A1 - Qingbao Liu
A1 - Guozhu Dong
AB - Since clustering is unsupervised and highly explorative, clustering validation (i.e. assessing the quality of clustering solutions) has been an important and long-standing research problem. Existing validity measures have significant shortcomings. This paper proposes a novel Contrast Pattern based Clustering Quality index (CPCQ) for categorical data, by utilizing the quality and diversity of the contrast patterns (CPs) which contrast the clusters in clusterings. High quality CPs can characterize clusters and discriminate them against each other. Experiments show that the CPCQ index (1) can recognize that expert-determined classes are the best clusters for many datasets from the UCI repository; (2) does not give inappropriate preference to a larger number of clusters; (3) does not require a user to provide a distance function.
JA - IEEE International Conference on Data Mining series (ICDM 2009)
CY - Miami, Florida
ER -
TY - ABST
T1 - Cuebee: Knowledge Driven Query Formulation
Y1 - 2009
A1 - Pablo N. Mendes
KW - rdf
KW - sparql
KW - query
KW - interface
KW - UI
AB - Cuebee (http://knoesis.wright.edu/library/tools/cuebee/) is a knowledge-driven query formulation system targeted at non-computer expert users for SPARQL generation. Cuebee uses ontology schemata to guide users step-by-step in formulating queries in an intuitive way. Queries are translated to SPARQL, sent to servers over the web and the results are presented to the user in multiple visualization interfaces. Each perspective should help the user in accomplishing a different analytical task, e.g. graph-based visualization to explore connections between elements, chart-based visualization for result set summarization and genome map visualization for positional analysis, amongst others. Cuebee supports utility tools that can be executed over the Internet (as web services) to perform computations as part of query solutions. Such tools are currently integrated with the query formulation process in an ad hoc fashion. Future work includes a flexible integration of the query formulation with semantic web service registry. This document aims at describing the basic components of Cuebee, presenting what has been implemented, as well as future work. We start by showing how users interact with the system to formulate queries, then discuss how to encode SPARQL through that process. Query execution is also discussed, providing insight into the integration with the Web Services infrastructure to be developed.
ER -
TY - CONF
T1 - On Domain Similarity and Effectiveness of Adapting to Rank
T2 - On Domain Similarity and Effectiveness of Adapting to Rank
Y1 - 2009
A1 - Belle Tseng
A1 - Jing Bai
A1 - Srihari Reddy
A1 - Keke Chen
AB - Adapting to rank addresses the problem of insufficient domain-specific labeled training data in learning to rank. However, the initial study shows that adaptation is not always effective. In this paper, we investigate the relationship between domain similarity and the effectiveness of domain adaptation with the help of two domain similarity measures: relevance correlation and sample distribution correlation.
JA - On Domain Similarity and Effectiveness of Adapting to Rank
ER -
TY - BOOK
T1 - Emerging Pattern Based Classification
Y1 - 2009
A1 - Guozhu Dong
A1 - Jinyan Li
ER -
TY - JOUR
T1 - Emerging Patterns
Y1 - 2009
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - CHAP
T1 - Encyclopedia of Database Systems
Y1 - 2009
A1 - Jinyan Li
KW - Database Systems
KW - Pattern Based Classification
ER -
TY - BOOK
T1 - Evaluation of Inter Laboratory and Cross Platform Concordance of DNA Microarrays through Discriminating Genes and Classifier Transferability
Y1 - 2009
A1 - Shihong Mao
A1 - Charles Wang
A1 - Guozhu Dong
ER -
TY - CONF
T1 - An Evolutionary Computing Approach for Reasoning in the Semantic Web
Y1 - 2009
A1 - Sebastian Rudolph
A1 - Gaston Tagni
A1 - Christophe Gueret
A1 - Stefan Schlobach
A1 - Pascal Hitzler
PB - International Workshop on Collective Intelligence and Evolution
ER -
TY - CONF
T1 - An Examination of Language Use in Online Dating Profiles
Y1 - 2009
A1 - Marti Hearst
A1 - Meenakshi Nagarajan
AB - This paper contributes to the study of self-presentation in online dating systems by performing a factor analysis on the text portions of online profiles. Findings include a similarity in the overall factor structures between male and female profiles, including use of tentative words by men. Contrasts between sexes were also found in a cluster analysis of the profiles using their factor scores. Finally, we also found similarities in frequent words used by the gender groups.
PB - 3rd Int'l AAAI Conference on Weblogs and Social Media
ER -
TY - CONF
T1 - Extending SPARQL to Support Spatially and Temporally Related Information
Y1 - 2009
A1 - Amit Sheth
A1 - Prateek Jain
A1 - Kunal Verma
A1 - Peter Yeh
PB - Semantic Technology Conference
ER -
TY - CONF
T1 - Extending SPARQL to Support Spatially and Temporally Related Information
Y1 - 2009
A1 - Prateek Jain
A1 - Kunal Verma
A1 - Amit Sheth
A1 - Peter Yeh
PB - Semantic Technology Conference
ER -
TY - JOUR
T1 - Facets of Artificial General Intelligence
JF - Kunstliche Intelligenz
Y1 - 2009
A1 - Kai-Uwe Kuhnberger
A1 - Pascal Hitzler
AB - We argue that the time has come for a serious endeavor to work towards artificial general intelligence (AGI). This positive assessment of the very possibility of AGI is partly rooted in the development of new methodological achievements in the AI area, like new learning paradigms and new integration techniques for different methodologies. The article sketches some of these methods as prototypical examples for approaches towards AGI.
ER -
TY - CONF
T1 - FAnToM - Lessons Learned from Design, Implementation, Administration, and Use of a Visualization System for Over 10 Years
Y1 - 2009
A1 - Gerik Scheuermann
A1 - Alexander Wiebel
A1 - Christoph Garth
A1 - Mario Hlawitschka
A1 - Thomas Wischgoll
AB - Scientific visualization has become a central tool in many research areas since it was established as a research discipline in 1987 [2]. Naturally, this development resulted in software tools specifically tailored for the visualization task at hand. While many such tools exist, the design choices underlying them vary greatly. This abstract describes some aspects of the FAnToM visualization system that has been under development since 1999. Initially created to support research in topological methods for vector and tensor fields, the system quickly grew into a visualization platform for general flow visualization specialized to data represented on unstructured grids. From this origin, FAnToM derives advanced data structures for point location and interpolation over unstructured meshes, as well as fast integral curve capabilities. More recently, FAnToM has gradually been extended to serve a wider area of visualization applications, including medical and graph visualization. Throughout the development of FAnToM, close collaboration with application domain scientists has been a strong priority to facilitate the system's usefulness on state-of-the-art problems. The continuous development of this system over a period of ten years revealed a number of important aspects that are crucial for the usefulness of a visualization system. Furthermore, some design choices underlying FAnToM are uncommon among visualization systems in general. Here, it is our aim to discuss some aspects and design choices underlying the FAnToM system to illustrate some of its properties and differences from other visualization systems. During the discussion, we will point out some experiences and lessons learned in working with the system on modern visualization applications.
ER -
TY - BOOK
T1 - Foundations of Semantic Web Technologies
Y1 - 2009
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
ER -
TY - JOUR
T1 - A Genetic Optimization Approach for Isolating Translational Efficiency Bias
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
Y1 - 2009
A1 - Douglas Raiford
A1 - Dan Krane
A1 - Travis Doom
A1 - Michael Raymer
KW - Artificial Intelligence
KW - codon usage bias
KW - computing methodologies
KW - Evolutionary computing and genetic algorithms
KW - GC-content
KW - miscellaneous
KW - strand bias
KW - translational efficiency
AB - The study of codon usage bias is an important research area that contributes to our understanding of molecular evolution, phylogenetic relationships, respiratory lifestyle, and other characteristics. Translational efficiency bias is perhaps the most well studied codon usage bias, as it is frequently utilized to predict relative protein expression levels. We present a novel approach to isolating translational efficiency bias in microbial genomes. There are several existent methods for isolating translational efficiency bias. Previous approaches are susceptible to the confounding influences of other potentially dominant biases. Additionally, existing approaches to identifying translational efficiency bias generally require both genomic sequence information and prior knowledge of a set of highly expressed genes. This novel approach provides more accurate results from sequence information alone by resisting the confounding effects of other biases. We validate this increase in accuracy in isolating translational efficiency bias on ten microbial genomes, five of which have proven particularly difficult for existing approaches due to the presence of strong confounding biases.
PB - IEEE Computer Society Press
VL - 8
CP - 2
ER -
TY - THES
T1 - Graph summaries for Optimizing Graph Pattern Queries on RDF Databases
Y1 - 2009
A1 - Angela Maduko
AB - The adoption of the Resource Description Framework (RDF) as a metadata representation standard is spurring the development of high-level mechanisms for storing and querying RDF data. Many of the proposed systems are built on Relational/Object-Relational Databases with a translation of queries posed in the supported RDF query language to SQL for processing by the database. Graph pattern matching, which matches a query graph against a data graph, often requires join operations. To process join operations, the database optimizer determines an optimal join order from a cost model which employs the expected cardinality of join results as a key parameter. This parameter is estimated from a statistical summary of the data maintained in memory. In this work, we argue that the data summarization techniques employed by database systems are oblivious to the graph structure of RDF data and may lead to estimation errors which result in the choice of a sub-optimal query plan. We present and evaluate two techniques for estimating the frequency of subgraphs utilizing a small statistical summary of the graph, based on occurrences. In the first technique, we summarize the graph in the P-Tree by pruning small subgraphs based on a valuation scheme that blends information about their importance and estimation power. In the second technique, we assume that edge occurrences on edge sequences of length maxL are position independent. We then summarize the most informative dependencies in the MD-Tree. In both techniques, we assume conditional independence to estimate the frequencies of larger subgraphs. We present extensive experiments on real world and synthetic datasets which confirm the feasibility of our approach. Our experiments are geared towards showing that the estimates obtained from the proposed summaries are accurate as well as effective for optimizing graph pattern queries posed over RDF graphs.
ER -
TY - JOUR
T1 - HE-Tree: a Framework for Detecting Changes in Clustering Structure for Categorical Data Streams
Y1 - 2009
A1 - Keke Chen
A1 - Ling Liu
KW - Categorical Data Clustering
KW - Change Detection
KW - Data Stream Mining
AB - Analyzing clustering structures in data streams can provide critical information for real-time decision making. Most research in this area has focused on clustering algorithms for numerical data streams, and very few have proposed to monitor the change of clustering structure. Most surprisingly, to our knowledge, no work has been proposed on monitoring clustering structure for categorical data streams. In this paper, we present a framework for detecting the change of primary clustering structure in categorical data streams, which is indicated by the change of the best number of clusters (Best K) in the data stream. The framework uses a Hierarchical Entropy Tree structure (HE-Tree) to capture the entropy characteristics of clusters in a data stream, and detects the change of Best K by combining our previously developed BKPlot method. The HE-Tree can efficiently summarize the entropy property of a categorical data stream and allow us to draw precise clustering information from the data stream for generating high-quality BKPlots. We also develop the time-decaying HE-Tree structure to make the monitoring more sensitive to recent changes of clustering structure. The experimental result shows that with the combination of the HE-Tree and the BKPlot method we are able to promptly and precisely detect the change of clustering structure in categorical data streams.
ER -
TY - CONF
T1 - The Importance of Being Neural-Symbolic - A Wilde Position
T2 - Second Conference on Artificial General Intelligence, AGI
Y1 - 2009
AB - We argue that Neural-Symbolic Integration is a topic of central importance for the advancement of Artificial General Intelligence.
JA - Second Conference on Artificial General Intelligence, AGI
PB - Second Conference on Artificial General Intelligence, AGI 2009
CY - Arlington, Virginia, USA
ER -
TY - CONF
T1 - Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices
T2 - The 2009 International Conference on Bioinformatics and Computational Biology (BIOCOMP 09)
Y1 - 2009
A1 - Gina Cooper
A1 - Michael Raymer
AB - Understanding the structure and function of proteins is a key part of understanding biological systems. Although proteins are complex biological macromolecules, they are made up of only 20 basic building blocks known as amino acids. The makeup of a protein can be described as a sequence of amino acids. One of the most important tools in modern bioinformatics is the ability to search for biological sequences (such as protein sequences) that are similar to a given query sequence. There are many tools for doing this (Altschul et al., 1990, Hobohm and Sander, 1995, Thomson et al., 1994, Karplus and Barrett, 1998). Most of these tools, however, focus on closely related, or homologous, sequences. Distantly related protein sequences (remote homologs) are of interest to biologists but remain notoriously difficult to find. This dissertation presents a novel method for finding remote homologs in databases of protein sequences. In this method, proteins are characterized according to physiochemical and sequence-based features. Features are then weighted according to their utility in identifying distantly related protein sequences. The feature weights are optimized by a custom genetic algorithm. Position-specific scoring matrices are used to further increase the ability of the tuned algorithm to generalize its search capability to new sequences. The resulting search method outperforms the most well-known techniques for finding distant homologs, both in terms of accuracy and computation time.
JA - The 2009 International Conference on Bioinformatics and Computational Biology (BIOCOMP 09)
CY - Las Vegas, Nevada
ER -
TY - JOUR
T1 - Incremental Computation of Queries
Y1 - 2009
A1 - Guozhu Dong
A1 - Jianwen Su
KW - Computer Communication Networks
KW - computer imaging
KW - database management
KW - Information Storage and Retrieval
KW - information systems applications
KW - Multimedia Information Systems
KW - pattern recognition and graphics
KW - vision
AB - A view on a database is defined by a query over the database. When the database is updated, the value of the view (namely the answer to the query) will likely change. The computation of the new answer to the query using the old answer is called incremental query computation or incremental view maintenance. Incremental computation is typically performed by identifying the part in the old answer that needs to be removed, and the part in the new answer that needs to be added. Incremental computation is desirable when it is much more efficient than a re-computation of the query. Efficiency can be measured by computation time, storage space, or query language desirability/availability, etc. Incremental computation algorithms could use auxiliary relations (in addition to the query answer), which also need to be incrementally computed. Two query languages can be involved for the incremental query computation problem. One is used for defining the view to be maintained, and the other for describing the incremental computation algorithm. For relational databases, the two languages can be relational algebra, SQL, nested relational algebra, Datalog, SQL embedded in a host programming language, etc.
ER -
TY - JOUR
T1 - Inexact Matching of Ontology Graphs Using Expectation-Maximization.
Y1 - 2009
A1 - Prashant Doshi
A1 - Ravikanth Kolli
A1 - Christopher Thomas
KW - Expectation-maximization
KW - Homomorphism
KW - Matching
AB - We present a new method for mapping ontology schemas that address similar domains. The problem of ontology matching is crucial since we are witnessing a decentralized development and publication of ontological data. We formulate the problem of inferring a match between two ontologies as a maximum likelihood problem, and solve it using the technique of expectation-maximization (EM). Specifically, we adopt directed graphs as our model for ontology schemas and use a generalized version of EM to arrive at a map between the nodes of the graphs. We exploit the structural, lexical and instance similarity between the graphs, and differ from the previous approaches in the way we utilize them to arrive at a, possibly inexact, match. Inexact matching is the process of finding a best possible match between the two graphs when exact matching is not possible or is computationally difficult. In order to scale the method to large ontologies, we identify the computational bottlenecks and adapt the generalized EM by using a memory bounded partitioning scheme. We provide comparative experimental results in support of our method on two well-known ontology alignment benchmarks and discuss their implications.
ER -
TY - CONF
T1 - InforMate - A GIS for Diverse Mobile Devices
T2 - eASiA 2009
Y1 - 2009
A1 - Lakshika Balasuriya
A1 - K. Perera
A1 - N. Bandara
A1 - S. Perera
A1 - D. Amarasena
A1 - D. Dias
JA - eASiA 2009
CY - Sri Lanka
ER -
TY - CONF
T1 - Information Theoretic Regularization for Semi-Supervised Boosting
T2 - Knowledge Discovery and Data Mining - KDD2009
Y1 - 2009
A1 - Lei Zheng
A1 - Yan Liu
A1 - Shaojun Wang
AB - We present novel semi-supervised boosting algorithms that incrementally build linear combinations of weak classifiers through generic functional gradient descent using both labeled and unlabeled training data. Our approach is based on extending the information regularization framework to boosting, bearing loss functions that combine log loss on labeled data with the information-theoretic measures to encode unlabeled data. Even though the information-theoretic regularization terms make the optimization non-convex, we propose simple sequential gradient descent optimization algorithms, and obtain impressively improved results on synthetic, benchmark and real world tasks over supervised boosting algorithms which use the labeled data alone and a state-of-the-art semi-supervised boosting algorithm.
JA - Knowledge Discovery and Data Mining - KDD2009
CY - Paris, France
ER -
TY - CHAP
T1 - Integrated Retrieval from Web of Documents and Data
Y1 - 2009
A1 - Krishnaprasad Thirunarayan
A1 - Trivikram Immaneni
KW - Data Retrieval
KW - Hybrid Query Language
KW - Hypertext Web
KW - Information Retrieval
KW - Semantic Web
KW - Unified Web
AB - The Semantic Web is evolving into a property-linked web of data, conceptually different from but contained in the Web of hyperlinked documents. Data Retrieval techniques are typically used to retrieve data from the Semantic Web while Information Retrieval techniques are used to retrieve documents from the Hypertext Web. We present a Unified Web model that integrates the two webs and formalizes the connection between them. We then present an approach to retrieving documents and data that captures the best of both worlds. Specifically, it improves recall for legacy documents and provides keyword-based search capability for the Semantic Web. We specify the Hybrid Query Language that embodies this approach, and the prototype system SITAR that implements it. We conclude with areas of future work.
ER -
TY - ABST
T1 - Investigation and Quantification of Codon Usage Bias Trends in Prokaryotes
Y1 - 2009
A1 - Amanda Hanes
KW - codon usage
KW - prokaryotes
AB - Organisms construct proteins out of individual amino acids using instructions encoded in the nucleotide sequence of a DNA molecule. The genetic code associates combinations of three nucleotides, called codons, with every amino acid. Most amino acids are associated with multiple synonymous codons, but although they result in the same amino acid and thus have no effect on the final protein, synonymous codons are not present in equal amounts in the genomes of most organisms. This phenomenon is known as codon usage bias, and the literature has shown that all organisms display a unique pattern of codon usage. Research also suggests that organisms with similar codon usage share biological similarities as well. This thesis helps to verify this theory by using an existing computational algorithm along with multivariate analysis to demonstrate that there is a significant difference between the codon usage of free-living prokaryotes and that of obligate intracellular prokaryotes. The observed difference is primarily the result of GC content, with the additional effect of an unknown factor. Although the existing literature often mentions the strength of biased codon usage, it does not contain a clear, consistent definition of the concept. This thesis provides a disambiguated definition of bias strength and clarifies the relationships between this and other properties of biased codon usage. A bias strength metric, designed to match the given definition of bias strength, is proposed. Evaluation of this metric demonstrates that it compares favorably with existing metrics used in the literature as criteria for bias strength, and also suggests that codon usage bias in general follows the trend of being either strong and global to the genome, or weak and present in only a subset of the genome.
Analysis of these metrics provides insight into the unknown factor partially responsible for the codon usage difference between free-living and obligatorily intracellular prokaryotes, and the proposed bias strength metric is used to draw conclusions about the characteristics of GC-content bias.
ER -
TY - CONF
T1 - A Local Qualitative Approach to Referral and Functional Trust
T2 - A Local Qualitative Approach to Referral and Functional Trust
Y1 - 2009
A1 - Amit Sheth
A1 - Cory Henson
A1 - Dharan Althuru
A1 - Krishnaprasad Thirunarayan
AB - Trust and confidence are becoming key issues in diverse applications such as ecommerce, social networks, semantic sensor web, semantic web information retrieval systems, etc. Both humans and machines use some form of trust to make informed and reliable decisions before acting. In this work, we briefly review existing work on trust networks, pointing out some of its drawbacks. We then propose a local framework to explore two different kinds of trust among agents called referral trust and functional trust, that are modelled using local partial orders, to enable qualitative trust personalization. The proposed approach formalizes reasoning with trust, distinguishing between direct and inferred trust. It is also capable of dealing with general trust networks with cycles.
JA - A Local Qualitative Approach to Referral and Functional Trust
ER -
TY - CHAP
T1 - Maintenance of Frequent Patterns: A Survey
Y1 - 2009
A1 - Jinyan Li
A1 - Limsoon Wong
A1 - Mengling Feng
A1 - Guozhu Dong
AB - This chapter surveys the maintenance of frequent patterns in transaction datasets. It is written to be accessible to researchers familiar with the field of frequent pattern mining. The frequent pattern maintenance problem is summarized with a study on how the space of frequent patterns evolves in response to data updates. This chapter focuses on incremental and decremental maintenance. Four major types of maintenance algorithms are studied: Apriori-based, partition-based, prefix-tree-based, and concise-representation-based algorithms. The authors study the advantages and limitations of these algorithms from both the theoretical and experimental perspectives. Possible solutions to certain limitations are also proposed. In addition, some potential research opportunities and emerging trends in frequent pattern maintenance are also discussed.
ER -
TY - CHAP
T1 - Mining Conditional Contrast Patterns
Y1 - 2009
A1 - Guozhu Dong
A1 - Guimei Liu
A1 - Limsoon Wong
A1 - Jinyan Li
AB - This chapter considers the problem of 'conditional contrast pattern mining.' It is related to contrast mining, where one considers the mining of patterns/models that contrast two or more datasets, classes, conditions, time periods, and so forth. Roughly speaking, conditional contrasts capture situations where a small change in patterns is associated with a big change in the matching data of the patterns. More precisely, a conditional contrast is a triple (B, F1, F2) of three patterns; B is the condition/context pattern of the conditional contrast, and F1 and F2 are the contrasting factors of the conditional contrast. Such a conditional contrast is of interest if the difference between F1 and F2 as itemsets is relatively small, and the difference between the corresponding matching dataset of B∪F1 and that of B∪F2 is relatively large. It offers insights on 'discriminating' patterns for a given condition B. Conditional contrast mining is related to frequent pattern mining and analysis in general, and to the mining and analysis of closed pattern and minimal generators in particular. It can also be viewed as a new direction for the analysis (and mining) of frequent patterns. After formalizing the concepts of conditional contrast, the chapter will provide some theoretical results on conditional contrast mining. These results (i) relate conditional contrasts with closed patterns and their minimal generators, (ii) provide a concise representation for conditional contrasts, and (iii) establish a so-called dominance-beam property. An efficient algorithm will be proposed based on these results, and experiment results will be reported. Related works will also be discussed.
ER -
TY - JOUR
T1 - Mining Disease State Converters for Medical Intervention of Diseases.
Y1 - 2009
A1 - Changjie Tang
A1 - Lei Duan
A1 - Guozhu Dong
KW - Class membership conversion
KW - Classification
KW - Contrast mining
KW - Disease state conversion
KW - Drug design
AB - In applications such as gene therapy and drug design, a key goal is to convert the disease state of diseased objects from an undesirable state into a desirable one. Such conversions may be achieved by changing the values of some attributes of the objects. For example, in gene therapy one may convert cancerous cells to normal ones by changing some genes' expression level from low to high or from high to low. In this paper, we define the disease state conversion problem as the discovery of disease state converters; a disease state converter is a small set of attribute value changes that may change an object's disease state from undesirable into desirable. We consider two variants of this problem: personalized disease state converter mining mines disease state converters for a given individual patient with a given disease, and universal disease state converter mining mines disease state converters for all samples with a given disease. We propose a DSCMiner algorithm to discover small and highly effective disease state converters. Since real-life medical experiments on living diseased instances are expensive and time consuming, we use classifiers trained from the datasets of given diseases to evaluate the quality of discovered converter sets. The effectiveness of a disease state converter is measured by the percentage of objects that are successfully converted from undesirable state into desirable state as deemed by state-of-the-art classifiers. We use experiments to evaluate the effectiveness of our algorithm and to show its effectiveness. We also discuss possible research directions for extensions and improvements. We note that the disease state conversion problem also has applications in customer retention, criminal rehabilitation, and company turn-around, where the goal is to convert class membership of objects whose class is an undesirable class.
ER -
TY - CONF
T1 - Monetizing User Activity on Social Networks - Challenges and Experiences
T2 - Monetizing User Activity on Social Networks - Challenges and Experiences
Y1 - 2009
A1 - Shaojun Wang
A1 - Meenakshi Nagarajan
A1 - Kamal Baid
A1 - Amit Sheth
JA - Monetizing User Activity on Social Networks - Challenges and Experiences
ER -
TY - CONF
T1 - Monetizing User Activity on Social Networks - Challenges and Experiences
Y1 - 2009
A1 - Meenakshi Nagarajan
A1 - Kamal Baid
A1 - Amit Sheth
A1 - Shaojun Wang
PB - Beyond Search: Semantic Computing and Internet Economics 2009 Workshop
ER -
TY - CONF
T1 - An Ontological Representation of Time Series Observations on the Semantic Sensor Web
Y1 - 2009
A1 - Rajkumar Buyya
A1 - Amit Sheth
A1 - Holger Neuhaus
A1 - Cory Henson
A1 - Krishnaprasad Thirunarayan
KW - Semantic Sensor Web
KW - Ontology
KW - SSW
KW - Observations and Measurements
KW - Sensor Web Enablement
KW - Time Series Observations
AB - Time series observations are a common method of collecting sensor data. The Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) provides a standard representation for time series observations within the Observations and Measurements language, and therefore is in heavy use on the Sensor Web. By providing a common model, Observations and Measurements (O&M) facilitates syntax-level integration, but lacks the ability to facilitate semantic-level integration. This inability can cause problems with interoperability between disparate sensor networks that may have subtle variations in their sensing methods. An ontological representation of time series observations could provide a more expressive model and resolve problems of semantic-level interoperability of sensor networks on the Semantic Sensor Web. In this paper, such an ontology model is proposed, as well as a real-world use case from sensor networks currently measuring rainfall in the South Esk river catchment in the North East of Tasmania, Australia.
ER -
TY - CHAP
T1 - Ontologies and Rules
Y1 - 2009
A1 - Bijan Parsia
A1 - Pascal Hitzler
KW - OWL
KW - Web Ontology Language
AB - The Web Ontology Language OWL, as introduced in Chapter 4, is the language recommended by the World Wide Web Consortium (W3C) for expressing ontologies for the Semantic Web. OWL is based on Description Logics, see Chapter 1, and as such is based on first-order predicate logic as its underlying knowledge representation and reasoning paradigm.
ER -
TY - Generic
T1 - Ontology Supported Knowledge Discovery in the Field of Human Performance and Cognition
Y1 - 2009
A1 - Delroy Cameron
A1 - C. Ramakrishnan
A1 - Amit Sheth
A1 - Pablo N. Mendes
A1 - Christopher Thomas
A1 - Krishnaprasad Thirunarayan
KW - domain model creation
KW - Human Performance and Cognition
KW - Knowledge Discovery
KW - Knowledge-Driven Web Browsing
KW - Relationship Matching
PB - Wright-Patterson Air Force Base
ER -
TY - CONF
T1 - Ontology-driven Provenance Management in eScience: An Application in Parasite Research
T2 - Ontology-driven Provenance Management in eScience: An Application in Parasite Research
Y1 - 2009
A1 - Satya S. Sahoo
A1 - D. Brent Weatherly
A1 - Raghava Mutharaju
A1 - Pramod Anantharam
A1 - Rick L. Tarleton
A1 - Amit Sheth
JA - Ontology-driven Provenance Management in eScience: An Application in Parasite Research
ER -
TY - CONF
T1 - Paraconsistent Reasoning for OWL 2
T2 - Third International Conference, RR
Y1 - 2009
A1 - Yue Ma
A1 - Pascal Hitzler
AB - A four-valued description logic has been proposed to reason with description logic based inconsistent knowledge bases. This approach has a distinct advantage that it can be implemented by invoking classical reasoners to keep the same complexity as under the classical semantics. However, this approach has so far only been studied for the basic description logic ALC. In this paper, we further study how to extend the four-valued semantics to the more expressive description logic SROIQ which underlies the forthcoming revision of the Web Ontology Language, OWL 2, and also investigate how it fares when adapted to tractable description logics including EL++, DL-Lite, and Horn-DLs. We define the four-valued semantics along the same lines as for ALC and show that we can retain most of the desired properties.
JA - Third International Conference, RR
PB - Web Reasoning and Rule Systems, Third International Conference, RR 2009
CY - Chantilly, VA, USA
ER -
TY - CONF
T1 - A Preferential Tableaux Calculus for Circumscriptive ALCO
T2 - International Conference, RR
Y1 - 2009
A1 - Stephan Grimm
A1 - Pascal Hitzler
AB - Nonmonotonic extensions of description logics (DLs) allow for default and local closed-world reasoning and are an acknowledged desired feature for applications, e.g. in the Semantic Web. A recent approach to such an extension is based on McCarthy's circumscription, which rests on the principle of minimising the extension of selected predicates to close off dedicated parts of a domain model. While decidability and complexity results have been established in the literature, no practical algorithmisation for circumscriptive DLs has been proposed so far. In this paper, we present a tableaux calculus that can be used as a decision procedure for concept satisfiability with respect to concept circumscribed ALCO knowledge bases. The calculus builds on existing tableaux for classical DLs, extended by the notion of a preference clash to detect the non-minimality of constructed models.
JA - International Conference, RR
PB - International Conference, RR 2009
CY - Chantilly, VA, USA
ER -
TY - JOUR
T1 - Privacy-preserving Multiparty Collaborative Mining with Geometric Data Perturbation
JF - IEEE Transactions on Parallel and Distributed Systems
Y1 - 2009
A1 - Keke Chen
A1 - Ling Liu
AB - In multiparty collaborative data mining, participants contribute their own datasets and hope to collaboratively mine a comprehensive model based on the pooled dataset. How to efficiently mine a quality model without breaching each party's privacy is the major challenge. In this paper, we propose an approach based on geometric data perturbation and data mining-service oriented framework. The key problem of applying geometric data perturbation in multiparty collaborative mining is to securely unify multiple geometric perturbations that are preferred by different parties, respectively. We have developed three protocols for perturbation unification. Our approach has three unique features compared to the existing approaches. (1) With geometric data perturbation, these protocols can work for many existing popular data mining algorithms, while most of other approaches are only designed for a particular mining algorithm. (2) Both the two major factors: data utility and privacy guarantee are well preserved, compared to other perturbation-based approaches. (3) Two of the three proposed protocols also have great scalability in terms of the number of participants, while many existing cryptographic approaches consider only two or a few more participants. We also study different features of the three protocols and show the advantages of different protocols in experiments.
ER -
TY - ABST
T1 - PrOM: A Semantic Web Framework for Provenance Management in Science
Y1 - 2009
A1 - Amit Sheth
A1 - Pascal Hitzler
A1 - Krishnaprasad Thirunarayan
A1 - Satya S. Sahoo
A1 - Roger Barga
AB - The eScience paradigm is enabling researchers to collaborate over the Web in virtual laboratories and conduct experiments on an industrial scale. But, the inherent variability in the quality and trust associated with eScience resources necessitates the use of provenance information describing the origin of an entity. Existing systems often model provenance using ambiguous terminology, have poor domain semantics and include modeling inconsistencies that hinder interoperability. Further, mere collection of provenance information is of little value without a well-defined and scalable query mechanism. In this paper, we present 'PrOM', a framework that addresses both the modeling and querying issues in eScience provenance management. The theoretical underpinning for PrOM consists of (a) a novel foundational ontology for provenance representation called 'Provenir', and (b) the first set of query operators to be defined for provenance query and analysis. The PrOM framework also includes a scalable provenance query engine that supports complex queries (high 'expression complexity') over a very large real world dataset with 308 million RDF triples. The query engine uses a new class of materialized views for query optimization that confers significant advantages (up to three orders of magnitude) in query performance.
ER -
TY - CONF
T1 - Provenir ontology: Towards a Framework for eScience Provenance Management
Y1 - 2009
A1 - Satya S. Sahoo
A1 - Amit Sheth
KW - trident ontology
KW - parasite experiment ontology
KW - provenance management framework
KW - provenir ontology
AB - Provenance metadata describes the 'lineage' or history of an entity and necessary information to verify the quality of data, validate experiment protocols, and associate trust value with scientific results. eScience projects generate data and the associated provenance metadata in a distributed environment (such as myGrid) and on a very large scale that often precludes manual analysis. Given this scenario, provenance information should (a) be interoperable across projects, research groups, and application domains, and (b) support analysis over large datasets using reasoning to discover implicit information. In this paper, we introduce an ontology-driven framework for eScience provenance management underpinned by an 'upper-level' ontology called provenir defined in OWL-DL. This framework is implemented in a modular fashion by extending provenir ontology to create a suite of domain-specific provenance ontologies that facilitate interoperability and enable reasoning. We demonstrate the application of this framework in two eScience project domains through creation of (a) Parasite Experiment ontology to model provenance in parasite research, and (b) Trident ontology to model provenance in the Neptune oceanography project.
PB - Microsoft eScience Workshop
ER -
TY - CONF
T1 - Reconstruction of the Upper Torso Using X-Ray Imagery
Y1 - 2009
A1 - Christopher Koehler
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Research in Semantic Web and Information Retrieval: Trust, Sensors, and Search
Y1 - 2009
A1 - Krishnaprasad Thirunarayan
KW - trust
KW - sensor web
KW - information retrieval
ER -
TY - JOUR
T1 - SCALE: a Scalable Framework for Efficiently Clustering Large Transactional Data
JF - Journal of Data Mining and Knowledge Discovery (DMKD)
Y1 - 2009
A1 - Keke Chen
A1 - Hua Yan
A1 - Ling Liu
KW - data clustering
AB - This paper presents SCALE, a fully automated transactional clustering framework. The SCALE design highlights three unique features. First, we introduce the concept of Weighted Coverage Density as a categorical similarity measure for efficient clustering of transactional datasets. The concept of weighted coverage density is intuitive and it allows the weight of each item in a cluster to be changed dynamically according to the occurrences of items. Second, we develop the weighted coverage density measure based clustering algorithm, a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Third, we introduce two clustering validation metrics and show that these domain specific clustering evaluation metrics are critical to capture the transactional semantics in clustering analysis. Our SCALE framework combines the weighted coverage density measure for clustering over a sample dataset with self-configuring methods. These self-configuring methods can automatically tune the two important parameters of our clustering algorithms: (1) the candidates of the best number K of clusters; and (2) the application of two domain-specific cluster validity measures to find the best result from the set of clustering results.
ER -
TY - CONF
T1 - Semantic Information and Sensor Networks
T2 - Semantic Information and Sensor Networks
Y1 - 2009
A1 - Krishnaprasad Thirunarayan
A1 - Josh Pschorr
AB - Embedded Networked Sensing involves untethered, networked devices tightly coupled to the physical world, to monitor and interact with it. Raw sensor observations can be annotated with semantic metadata to provide interpretation and context for them. In this paper, we discuss the what, why and how of semantic sensor data and semantic sensor networks. Specifically, we explore (i) benefits of augmenting sensor data with semantics, (ii) domain-specific and spatio-temporal problems to be addressed, (iii) role of knowledge representation and reasoning (Semantic Web technology), and (iv) standardization efforts underway to make sensor-related data and sensor observations widely available.
JA - Semantic Information and Sensor Networks
ER -
TY - CONF
T1 - Semantic Integration of Citizen Sensor Data and Multilevel Sensing: A Comprehensive Path Towards Event Monitoring and Situational Awareness
Y1 - 2009
A1 - Amit Sheth
KW - Citizen Sensor Data
PB - From E-Gov to Connected Governance: the Role of Cloud Computing, Web 2.0 and Web 3.0 Semantic Technologies
ER -
TY - BOOK
T1 - The Semantic Web - ISWC 2009
Y1 - 2009
KW - Description Logic
KW - information management
KW - machine learning
KW - mash ups
KW - ontology modeling
KW - ontology search
KW - p2p
KW - Privacy
KW - security
KW - Semantic Web
KW - trust
KW - visualization
KW - Web 2.0
KW - web applications
AB - This book constitutes the refereed proceedings of the 8th International Semantic Web Conference, ISWC 2009, held in Chantilly, VA, USA, during October 25-29, 2009. The volume contains 43 revised full research papers selected from a total of 250 submissions; 15 papers out of 59 submissions to the semantic Web in-use track, and 7 papers and 12 posters accepted out of 19 submissions to the doctoral consortium. The topics covered in the research track are ontology engineering; data management; software and service engineering; non-standard reasoning with ontologies; semantic retrieval; OWL; ontology alignment; description logics; user interfaces; Web data and knowledge; semantic Web services; semantic social networks; and rules and relatedness. The semantic Web in-use track covers knowledge management; business applications; applications from home to space; and services and infrastructure.
ER -
TY - JOUR
T1 - Semantics-Empowered Social Computing
JF - IEEE Internet Computing
Y1 - 2009
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
KW - semantic social web
KW - user-generated content
KW - Web 3.0
AB - In this article, we discuss some of the challenges in marking-up or annotating UGC, a first step toward the realization of the social semantic Web. Using examples from real-world UGC, we show how domain knowledge can effectively complement statistical natural language processing techniques for metadata creation.
ER -
TY - CONF
T1 - SemSOS: Semantic Sensor Observation Service
T2 - SemSOS: Semantic Sensor Observation Service
Y1 - 2009
A1 - Krishnaprasad Thirunarayan
A1 - Josh Pschorr
A1 - Cory Henson
A1 - Amit Sheth
AB - Sensor Observation Service (SOS) is a Web service specification defined by the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) group in order to standardize the way sensors and sensor data are discovered and accessed on the Web. This standard goes a long way in providing interoperability between repositories of heterogeneous sensor data and applications that use this data. Many of these applications, however, are ill-equipped to handle raw sensor data as provided by SOS and require actionable knowledge of the environment in order to be practically useful. There are two approaches to deal with this obstacle: make the applications smarter or make the data smarter. We propose the latter option and accomplish this by leveraging semantic technologies in order to provide and apply more meaningful representation of sensor data. More specifically, we are modeling the domain of sensors and sensor observations in a suite of ontologies, adding semantic annotations to the sensor data, using the ontology models to reason over sensor observations, and extending an open source SOS implementation with our semantic knowledge base. This semantically enabled SOS, or SemSOS, provides the ability to query high-level knowledge of the environment as well as low-level raw sensor data.
JA - SemSOS: Semantic Sensor Observation Service
ER -
TY - ABST
T1 - Service Level Agreement in Cloud Computing
Y1 - 2009
A1 - Ajith Ranabahu
A1 - Pankesh Patel
A1 - Amit Sheth
KW - Cloud
KW - SLA
AB - Cloud computing that provides cheap and pay-as-you-go computing resources is rapidly gaining momentum as an alternative to traditional IT Infrastructure. As more and more consumers delegate their tasks to cloud providers, Service Level Agreements (SLA) between consumers and providers emerge as a key aspect. Due to the dynamic nature of the cloud, continuous monitoring of Quality of Service (QoS) attributes is necessary to enforce SLAs. Also numerous other factors such as trust (in the cloud provider) come into consideration, particularly for enterprise customers that may outsource their critical data. This complex nature of the cloud landscape warrants a sophisticated means of managing SLAs. This paper proposes a mechanism for managing SLAs in a cloud computing environment using the Web Service Level Agreement (WSLA) framework, developed for SLA monitoring and SLA enforcement in a Service Oriented Architecture (SOA). We use the third party support feature of WSLA to delegate monitoring and enforcement tasks to other entities in order to solve the trust issues. We also present a real world use case to validate our proposal.
ER -
TY - CONF
T1 - Situation Awareness via Abductive Reasoning for Semantic Sensor Data: A Preliminary Report
T2 - Situation Awareness via Abductive Reasoning for Semantic Sensor Data: A Preliminary Report
Y1 - 2009
A1 - Amit Sheth
A1 - Cory Henson
A1 - Krishnaprasad Thirunarayan
AB - Semantic Sensor Web enhances raw sensor data with spatial, temporal, and thematic annotations to enable high-level reasoning. In this paper, we explore how an abductive reasoning framework can benefit formalization and interpretation of sensor data to garner situation awareness. Specifically, we show how abductive logic programming techniques, in conjunction with symbolic knowledge rules, can be used to detect inconsistent sensor data and to generate a human-accessible description of the state of the world from a consistent subset of the sensor data. We also show how trust/belief information can be incorporated into the interpreter to enhance reliability. For concreteness, we formalize the Weather domain and develop a meta-interpreter in Prolog to explain Weather data. This preliminary work illustrates synthesis of high-level, reliable information for situation awareness by querying low-level sensor data.
JA - Situation Awareness via Abductive Reasoning for Semantic Sensor Data: A Preliminary Report
ER -
TY - CONF
T1 - SPARQL Query Re-writing for Spatial Datasets Using Partonomy Based Transformation Rules
T2 - SPARQL Query Re-writing for Spatial Datasets Using Partonomy Based Transformation Rules
Y1 - 2009
A1 - Cory Henson
A1 - Kunal Verma
A1 - Amit Sheth
A1 - Peter Yeh
A1 - Prateek Jain
AB - Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of a SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and knowledge base. Our experiments were performed on completely third party datasets and queries. Evaluations were performed on the Geonames dataset using questions from National Geographic Bee serialized into SPARQL and on the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.
JA - SPARQL Query Re-writing for Spatial Datasets Using Partonomy Based Transformation Rules
ER -
TY - CONF
T1 - Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences
T2 - Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences
Y1 - 2009
A1 - Karthik Gomadam
A1 - Ajith Ranabahu
A1 - Ashutosh Jadhav
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
A1 - Raghava Mutharaju
AB - We present work in the spatio-temporal-thematic analysis of citizen-sensor observations pertaining to real-world events. Using Twitter as a platform for obtaining crowd-sourced observations, we explore the interplay between these three dimensions in extracting insightful summaries of social perceptions behind events. We present our experiences in building a web mashup application, Twitris, that extracts and facilitates the spatio-temporal-thematic exploration of event descriptor summaries.
JA - Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences
ER -
TY - ABST
T1 - Stereoscopic Display Technology for Visualizing Vascular Structures
Y1 - 2009
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Suggestions for OWL 3
Y1 - 2009
A1 - Pascal Hitzler
AB - With OWL 2 about to be completed, it is the right time to start discussions on possible future modifications of OWL. We present here a number of suggestions in order to discuss them with the OWL user community. They encompass expressive extensions on polynomial OWL 2 profiles, a suggestion for an OWL Rules language, and expressive extensions for OWL DL.
PB - 5th International Workshop on OWL: Experiences and Directions (OWLED 2009)
ER -
TY - CONF
T1 - A Survey of the Semantic Specification of Sensors
Y1 - 2009
A1 - Cory Henson
A1 - Holger Neuhaus
A1 - Michael Compton
A1 - Laurent Lefort
A1 - Amit Sheth
KW - Semantic Sensor Web
KW - Ontology
KW - Semantic Annotation
KW - SSW
KW - Semantic Sensor Networks
AB - Semantic sensor networks use declarative descriptions of sensors to promote reuse and integration, and to help solve the difficulties of installing, querying and maintaining complex, heterogeneous sensor networks. This paper reviews the state of the art for the semantic specification of sensors, one of the fundamental technologies in the semantic sensor network vision. Twelve sensor ontologies are reviewed and analysed for the range and expressive power of their concepts. The reasoning and search technology developed in conjunction with these ontologies is also reviewed, as is technology for annotating OGC standards with links to ontologies. Sensor concepts that cannot be expressed accurately by current sensor ontologies are also discussed.
PB - 2nd International Workshop on Semantic Sensor Networks (SSN09)
ER -
TY - ABST
T1 - Tableau Algorithm for Concept Satisfiability in Description Logic ALCH
Y1 - 2009
A1 - Satya S. Sahoo
A1 - Krishnaprasad Thirunarayan
AB - The provenir ontology is an upper-level ontology to facilitate interoperability of provenance information in scientific applications. The description logic (DL) expressivity of provenir ontology is ALCH, that is, it models role hierarchies (H) (without transitive roles and inverse roles). Even though the complexity results for concept satisfiability for numerous variants of DL such as ALC with transitively closed roles (ALCR+ also called S), inverse roles SI, and role hierarchy SHI have been well-established, similar results for ALCH have been surprisingly missing from the literature. Here, we show that the complexity of the concept satisfiability problem for the ALCH variant of DL is PSpace-complete. This result contributes towards a complete set of complexity results for DL variants and establishes a lower bound on complexity for domain-specific provenance ontologies that extend provenir ontology.
ER -
TY - JOUR
T1 - Time for DNA Disclosure
JF - DNA Disclosure
Y1 - 2009
KW - DNA Disclosure
ER -
TY - CONF
T1 - Towards Reasoning Pragmatics
T2 - Third International Conference, GeoS
Y1 - 2009
A1 - Pascal Hitzler
AB - The realization of Semantic Web reasoning is central to substantiating the Semantic Web vision. However, current mainstream research on this topic faces serious challenges, which force us to question established lines of research and to rethink the underlying approaches.
JA - Third International Conference, GeoS
CY - Mexico City, Mexico
ER -
TY - JOUR
T1 - Trusted Query in Spatial and Temporal Correlated Wireless Sensor Networks
Y1 - 2009
A1 - Giovani R. Abuaitah
A1 - Bin Wang
KW - belief propagation
KW - compromised and misbehaving node detection
KW - reputation characterization and update
KW - Security and trust framework
KW - spatial and temporal correlated wireless sensor network
KW - trust management
KW - trusted query
AB - In this work, we design and demonstrate the feasibility of an innovative reputation-based framework rooted in rigorous statistical theory and belief theory to characterize the trustworthiness of individual nodes in a wireless sensor network (WSN). The resulting mechanism allows the detection of compromised nodes as well as misbehaving nodes. Moreover, trusted querying is enabled by filtering out 'untrustworthy sensor nodes and data' and returning the most-trusted aggregate response. We showcase the effectiveness of the proposed framework through a simulation based study.
ER -
TY - CONF
T1 - Trykipedia: Collaborative Bio-Ontology Development using Wiki Environment
T2 - Ohio Collaborative Conference on BioInformatics (OCCBIO 2009)
Y1 - 2009
A1 - Raghava Mutharaju
A1 - Satya Sahoo
A1 - Pramod Anantharam
A1 - Rick Tarleton
A1 - Flora Logan
A1 - Amit Sheth
A1 - D. Brent Weatherly
KW - Collaborative Bio-Ontology
AB - Biomedical ontology development is an intensely collaborative process between biology experts and computer scientists. With the proliferation of ontology-based approaches to solving informatics problems in the biological domain, there is a need for a collaborative environment that is intuitive and widely accepted for modeling the ontology.
JA - Ohio Collaborative Conference on BioInformatics (OCCBIO 2009)
CY - Case Western Reserve University, Cleveland, OH
ER -
TY - ABST
T1 - Twitris: Socially Influenced Browsing
Y1 - 2009
A1 - Meenakshi Nagarajan
A1 - Raghava Mutharaju
A1 - Karthik Gomadam
A1 - Pramod Anantharam
A1 - Ajith Ranabahu
A1 - Ashutosh Jadhav
A1 - Wenbo Wang
A1 - Vinh Nguyen
A1 - Amit Sheth
KW - twitris
AB - In this paper, we present Twitris, a semantic Web application that facilitates browsing for news and information, using social perceptions as the fulcrum. In doing so we address challenges in large scale crawling, processing of real time information, and preserving spatio-temporal-thematic properties central to observations pertaining to realtime events. We extract metadata about events from Twitter and bring related news and Wikipedia articles to the user. In developing Twitris, we have used the DBPedia ontology.
ER -
TY - CONF
T1 - User-Generated Content on Social Media Challenges, Opportunities
Y1 - 2009
A1 - Meenakshi Nagarajan
KW - domain
KW - user-generated content
AB - Understanding and exploiting user generated (textual) content (UGC) on social media is at the forefront of information management challenges today. The variety of UGC in detailed blog commentaries, collaborative wiki-content, online conversations, short messages in micro-blogs etc., is powering several personalization, monetization, crowd/business intelligence applications, and also providing an electronic microscope on social phenomena at an extraordinary scale. Certain characteristics of UGC, however, necessitate key computational linguistic interventions before systems can tap into this data. A large portion of language found on social media is in the Informal English domain: a blend of abbreviations, slang and context-dependent terms delivered with an indifferent approach to grammar and spelling. Large-scale informal content analysis issues are not only a challenge for natural language engineers but are also relevant to scientists observing the effects of content semantics and style on the structure and behavior of a social medium. In this talk, I will cover some representative work in understanding three dimensions of user-generated informal content on social media - what are the named entities and topics they are making references to (what), what words or language constructs are they making use of (how) and what are the intentions behind what they write (why). Many of these investigations were conducted in collaboration with researchers at IBM, MSR and UC Berkeley. We will demonstrate how these perspectives along with other contextual properties of data (when, where they were generated) and the network they were generated in (type of media, poster characteristics) have been absorbed into two deployed Social Intelligence applications. We will close with a discussion on the potential in using social data for understanding complex phenomena like online conversations, diffusion of information, study of emergent social order etc., that necessitate a confluence of both network and content analyses.
PB - Social Data on the Web Workshop, collocated with International Semantic Web Conference, 2009
ER -
TY - CHAP
T1 - Vascular Geometry Reconstruction and Grid Generation
Y1 - 2009
A1 - Daniel R. Einstein
A1 - Andrew P. Kuprat
A1 - Xiangmin Jiao
A1 - Ghassan Kassab
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - VisGBT: Visually Analyzing Evolving Datasets for Adaptive Learning
T2 - VisGBT: Visually Analyzing Evolving Datasets for Adaptive Learning
Y1 - 2009
A1 - Fengguang Tian
A1 - Keke Chen
JA - VisGBT: Visually Analyzing Evolving Datasets for Adaptive Learning
ER -
TY - CHAP
T1 - Visualization of Two Parameters in a Three Dimensional Environment
T2 - Human-Computer Systems Interaction
Y1 - 2009
A1 - Leonidas Deligiannidis
A1 - Amit Sheth
AB - Visualization techniques and tools allow a user to make sense of enormous amounts of data. Querying capabilities and direct manipulation techniques enable a user to filter out irrelevant data and focus only on information that could lead to a conclusion. Effective visualization techniques should enable a user or an analyst to get to the conclusion in a short time and with minimal training. We illustrate, via three different research projects, how to visualize two parameters where a user can get to a conclusion in a very short amount of time with minimal or no training. The two parameters could be the relation of documents and their importance, or spatial events and their timing, or even the ratio between carbon dioxide emission levels and number of trees per country. To accomplish this, we visualize the data in a three dimensional environment on a regular computer display. For data manipulation, we use techniques familiar to novice computer users.
JA - Human-Computer Systems Interaction
ER -
TY - CONF
T1 - Web 3.0: Semantics, Services, Sensor & Social Computing
Y1 - 2009
A1 - Amit Sheth
PB - DA-IICT
ER -
TY - JOUR
T1 - Where Did You Come From... Where Did You Go? - An Algebra and Query Engine for Scientific Data Provenance
JF - Computing
Y1 - 2009
A1 - Satya Sahoo
A1 - Roger Barga
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
A1 - Jonathan Goldstein
KW - Provenance Management Framework
AB - Introduces provenance and the research issues in this area, and describes a provenance management framework consisting of a common provenance model and a set of provenance query operators supported by a query engine. A naive engine is impractical for queries over large scientific datasets. A new class of materialized view, called MPV, has been defined using the provenir model for query optimization. One example from the Neptune project significantly reduces provenance query time. Heterogeneous data formats from eScience.
ER -
TY - JOUR
T1 - Where Did You Come From... Where Did You Go? - An Algebra and Query Engine for Scientific Data Provenance
Y1 - 2009
A1 - Roger Barga
A1 - Jonathan Goldstein
A1 - Krishnaprasad Thirunarayan
A1 - Satya Sahoo
A1 - Amit Sheth
KW - Provenance Management Framework
AB - Introduces provenance and the research issues in this area, and describes a provenance management framework consisting of a common provenance model and a set of provenance query operators supported by a query engine. A naive engine is impractical for queries over large scientific datasets. A new class of materialized views, called MPV, is defined using the provenir model for query optimization. An example from the Neptune project shows a significant reduction in provenance query time. The data come from eScience sources in heterogeneous formats.
ER -
TY - CONF
T1 - Why Gujarat Needs Much Better Higher Education & Research to Succeed in Knowledge Economy & What We Can Do About It?
Y1 - 2009
A1 - Sanjay Chaudhary
A1 - Amit Sheth
A1 - Kamlesh Lulla
KW - Research University and Higher Education Regulatory Reform and Higher Education in Gujarat
AB - This white paper distills the deliberations on the role of higher education and research as a key enabler of a Knowledge based Society. In particular it discusses (a) the importance of higher quality PhDs for building a knowledge society, (b) the initiatives and progress in competing economies in higher education and research, (c) where Gujarat stands in comparison, and (d) some recommendations on what Gujarat can do to enable timely progress towards building a knowledge based society and economy. These deliberations were conducted in conjunction with the International Conference on 'Reconnecting Gujarati Diaspora with its Homeland: Contribution to its Development with focus on Building a Knowledge Society' (January 17-19, 2009, Patan) and presented to the CM Shri. Narendrabhai Desai at the Round Table on 'Regulatory & Policy Reform for Higher Education in Gujarat' (January 18, 2009, Gandhinagar).
PB - Round Table on Regulatory & Policy Reform for Higher Education in Gujarat, in conjunction with the International Conference on Reconnecting Gujarati Diaspora with its Homeland: Contribution to its Development with a Focus on Building a Knowledge Society
ER -
TY - CHAP
T1 - Active Semantic Electronic Medical Records
Y1 - 2008
A1 - Nicole Oldham
A1 - K. Gallegher
A1 - Amit Sheth
A1 - P. Yadav
A1 - S. Agrawal
A1 - Jon Lathem
A1 - H. Wingate
KW - Active Semantic Document
KW - Clinical Ontology
KW - Semantic Annotation of Patient Records
KW - Semantic Electronic Medical Record
KW - Semantic Web Clinical Application
KW - Semantic Web Health Application
AB - The most cumbersome aspect of health care is the extensive documentation which is legally required for each patient. For these reasons, physicians and their assistants spend about 30% of their time documenting encounters. Paper charts are slowly being phased out due to inconvenience, inability to mine data, costs and safety concerns. Many practices are now investing in electronic medical records (EMR) systems which allow them to have all patient data at their fingertips. Although current adoption by medical groups (based on a 2005 survey (AHRQ 2005)) is still below 15% with an even lower adoption rate for smaller practices, the trend is clearly towards increasing adoption. This trend will accelerate as regulatory pressures such as 'Pay-4-Performance' become mandatory thus enhancing the ROI sophisticated systems can achieve. This paper focuses on the first known development and deployment of a comprehensive EMR system that utilizes semantic Web and Web service/process technologies. It is based on substantial collaboration between practicing physicians (Dr. Agrawal is a cardiologist and a fellow of the American Cardiology Association, Dr. Wingate is an emergency room physician) at the Athens Heart Center and the LSDIS lab at UGA. More specifically, we leverage the concept and technology of Active Semantic Documents (ASDs) developed at the LSDIS lab. ASDs get their semantic feature by automatic semantic annotation of documents with respect to one or more ontologies. These documents are termed active since they support automatic and dynamic validation and decision making on the content of the document by applying contextually relevant rules to components of the documents. This is accomplished by executing rules on semantic annotations and relationships that span across ontologies.
Specifically, Active Semantic Electronic Medical Record (ASEMR) is an application of ASDs in health care which aims to reduce medical errors, improve physician efficiency, improve patient safety and satisfaction in medical practice, improve quality of billing records leading to better payment, and make it easier to capture and analyze health outcome measures. In ASEMR, rules specified in conjunction with ontologies play a key role. Examples of the rules include prevention of drug interaction (i.e., not allowing a patient to be prescribed two severely interacting drugs, or alerting the doctor and requiring him/her to make specific exceptions when low or moderate degree of interactions are acceptable) or ensuring the procedure performed has supporting diagnoses. ASDs display the semantic (for entities defined in the ontologies) and lexical (for terms and phrases that are part of specialist lexicon, specific items related to the clinics, and other relevant parts of speech) annotations in documents displayed in a browser, show results of rule execution, and provide the ability to modify semantic and lexical components of its content in an ontology-supported and otherwise constrained manner such as through lists, bags of terms, specialized reference sources, or a thesaurus or lexical reference system such as WordNet. This feature allows for better and more efficient patient care because of the ability of ASDs to offer suggestions when rules are broken or exceptions made. ASEMR is currently in daily and routine use by the Athens Heart Center (AHC) and eight other sites in Georgia. ASEMRs have been implemented as an enhancement of AHC's Panacea electronic medical management system. Panacea is a web-based, end-to-end medical records and management system, and hence it is used with respect to each patient seen at AHC.
This has enhanced the collaborative environment and has provided insights into the components of electronic medical records and the kinds of data available in these systems. The preliminary version was implemented during Summer 2005 and tested in early fall. The current version was deployed and has been fully functional since January 2006. The parts of ASEMR we will focus on in this paper are:

the development of populated ontologies in the healthcare (specifically cardiology) domain

the development of an annotation tool that utilizes the developed ontologies for annotation of patient records

the development of decision support algorithms that support rule and ontology based checking/validation and evaluation.

ER -
TY - CONF
T1 - Adapting Ranking Functions to User Preference
T2 - Adapting Ranking Functions to User Preference
Y1 - 2008
A1 - Rongqing Lu
A1 - Keke Chen
A1 - Larry Heck
A1 - Belle Tseng
A1 - Gordon Sun
A1 - C.K. Wong
JA - Adapting Ranking Functions to User Preference
ER -
TY - CONF
T1 - All Elephants are Bigger than All Mice
Y1 - 2008
AB - We investigate the concept product as an expressive feature for description logics (DLs). While this construct allows us to express an arguably very common and natural type of statement, it can be simulated only by the very expressive DL SROIQ for which no tight worst-case complexity is known. However, we show that concept products can also be added to the DLs SHOIQ and SHOI, and to the tractable DL EL++ without increasing the worst-case complexities in any of those cases. We therefore argue that concept products provide practically relevant expressivity at little cost, making them a good candidate for future extensions of the DL-based ontology language OWL.
PB - 21st International Workshop on Description Logics, DL2008
ER -
TY - CONF
T1 - Applications of Voting Theory to Information Mashups
T2 - 2nd IEEE International Conference on Semantic Computing
Y1 - 2008
A1 - Christine Robson
A1 - Jan Pieper
A1 - Nachiketa Sahoo
A1 - Alfredo Alba
A1 - Meenakshi Nagarajan
A1 - Daniel Gruhl
A1 - Varun Bhagwan
A1 - Julia Grace
A1 - Kevin Haas
AB - Blogs, discussion forums and social networking sites are an excellent source for people's opinions on a wide range of topics. We examine the application of voting theory to 'Information Mashups' - the combining and summarizing of data from the multitude of often-conflicting sources. This paper presents an information mashup in the music domain: a Top 10 artist chart based on user comments and listening behavior from several Web communities. We consider different voting systems as algorithms to combine opinions from multiple sources and evaluate their effectiveness using social welfare functions. Different voting schemes are found to work better in some applications than others. We observe a tradeoff between broad popularity of established artists versus emerging superstars that may only be popular in one community. Overall, we find that voting theory provides a solid foundation for information mashups in this domain.
JA - 2nd IEEE International Conference on Semantic Computing
CY - Santa Clara, CA, USA
ER -
TY - CONF
T1 - Approximate OWL Instance Retrieval with Screech
T2 - Approximate OWL Instance Retrieval with Screech
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Tuvshintur Tserendorj
A1 - Pascal Hitzler
AB - With the increasing interest in expressive ontologies for the Semantic Web, it is critical to develop scalable and efficient ontology reasoning techniques that can properly cope with very high data volumes. For certain application domains, approximate reasoning solutions, which trade soundness or completeness for increased reasoning speed, will help to deal with the high computational complexities which state of the art ontology reasoning tools have to face. In this paper, we present a comprehensive overview of the SCREECH approach to approximate instance retrieval with OWL ontologies, which is based on the KAON2 algorithms, facilitating a compilation of OWL DL TBoxes into Datalog, which is tractable in terms of data complexity. We present three different instantiations of the Screech approach, and report on experiments which show that the gain in efficiency outweighs the number of introduced mistakes in the reasoning process.
JA - Approximate OWL Instance Retrieval with Screech
ER -
TY - CONF
T1 - Approximate OWL-Reasoning with Screech
T2 - International Conference, RR 2008
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Tuvshintur Tserendorj
A1 - Pascal Hitzler
AB - Applications of expressive ontology reasoning for the Semantic Web require scalable algorithms for deducing implicit knowledge from explicitly given knowledge bases. Besides the development of more efficient such algorithms, awareness is rising that approximate reasoning solutions will be helpful and needed for certain application domains. In this paper, we present a comprehensive overview of the Screech approach to approximate reasoning with OWL ontologies, which is based on the KAON2 algorithms, facilitating a compilation of OWL DL TBoxes into Datalog, which is tractable in terms of data complexity. We present three different instantiations of the Screech approach, and report on experiments which show that a significant gain in efficiency can be achieved.
JA - International Conference, RR 2008
PB - Second International Conference, RR 2008
CY - Karlsruhe, Germany
ER -
TY - ABST
T1 - Artist Ranking Through Analysis of Online Community Comments
Y1 - 2008
A1 - Julia Grace
A1 - Daniel Gruhl
A1 - Meenakshi Nagarajan
A1 - Nachiketa Sahoo
A1 - Christine Robson
A1 - Kevin Haas
AB - We describe an approach to measure the popularity of music tracks, albums and artists by analyzing the comments of music listeners in social networking online communities such as MySpace. This measure of popularity appears to be more accurate than the traditional measure based on album sales figures, as demonstrated by our focus group study. We faced many challenges in our attempt to generate a popularity ranking from the user comments on social networking sites, e.g., broken English sentences, comment spam, etc. We discuss the steps we took to overcome these challenges and describe an end to end system for generating a new popularity measure based on online comments, and the experiments performed to evaluate its success.
ER -
TY - CHAP
T1 - Attribute Grammars and their Applications
Y1 - 2008
A1 - Krishnaprasad Thirunarayan
ER -
TY - JOUR
T1 - Automated Isolation of Translational Efficiency Bias that Resists the Confounding Effect of GC(AT)-Content
Y1 - 2008
A1 - Doug Raiford
A1 - Dan Krane
A1 - Travis Doom
A1 - Michael Raymer
AB - Genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale; however, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Detection of this bias is an important tool in the analysis of genomic data, particularly as a predictor of gene expressivity. Methods for identifying codon usage bias in genomic data that rely solely on genomic sequence data are susceptible to being confounded by the presence of several factors simultaneously influencing codon selection. Presented here is a new technique for removing the effects of one of the more common confounding factors, GC(AT)-content, and of visualizing the search-space for codon usage bias through the use of a solution landscape. This technique successfully isolates expressivity-related codon usage trends, using only genomic sequence information, where other techniques fail due to the presence of GC(AT)-content confounding influences.
ER -
TY - JOUR
T1 - Best K: the Critical Clustering Structures in Categorical Data
Y1 - 2008
A1 - Ling Liu
A1 - Keke Chen
AB - The demand for cluster analysis of categorical data has continued to grow over the last decade. A well-known problem in categorical clustering is to determine the best K number of clusters. Although several categorical clustering algorithms have been developed, surprisingly, none has satisfactorily addressed the problem of Best K for categorical clustering. Since categorical data does not have an inherent distance function as the similarity measure, traditional cluster validation techniques based on geometric shapes and density distributions are not appropriate for categorical data. In this paper, we study the entropy property between the clustering results of categorical data with different K number of clusters, and propose the BKPlot method to address the three important cluster validation problems: 1) How can we determine whether there is significant clustering structure in a categorical dataset? 2) If there is significant clustering structure, what is the set of candidate 'best Ks'? 3) If the dataset is large, how can we efficiently and reliably determine the best Ks?
ER -
TY - CONF
T1 - Boosting with Incomplete Information
T2 - Boosting with Incomplete Information
Y1 - 2008
A1 - Shaojun Wang
A1 - G. Haffari
A1 - F. Jiao
A1 - Y. Wang
A1 - G. Mori
JA - Boosting with Incomplete Information
ER -
TY - JOUR
T1 - Business Process Management
Y1 - 2008
A1 - Amit Sheth
A1 - José Luiz Fiadeiro
A1 - S. Dustdar
ER -
TY - CONF
T1 - Capturing Workflow Event Data for Monitoring, Performance Analysis, and Management of Scientific Workflows
Y1 - 2008
A1 - Satya S. Sahoo
A1 - Roger Barga
A1 - Jared Jackson
A1 - Matthew Valerio
KW - Scientific workflow
KW - ontology-aware notification system
KW - event data capture
KW - Trident scientific workbench
AB - To effectively support real-time monitoring and performance analysis of scientific workflow execution, varying levels of event data must be captured and made available to interested parties. This paper discusses the creation of an ontology-aware workflow monitoring system for use in the Trident system which utilizes a distributed publish/subscribe event model. The implementation of the publish/subscribe system is discussed and performance results are presented.
PB - Scientific Workflows and Business Workflow Standards in e-Science (SWBES08) in conjunction with IEEE International e-Science Conference 2008
ER -
TY - CHAP
T1 - Challenges of Creating a Knowledge-based Society: Education & Research for India & Gujarat
Y1 - 2008
A1 - Amit Sheth
PB - World Gujarat Conference
ER -
TY - CONF
T1 - Cheap Boolean Role Constructors for Description Logics
T2 - 11th European Conference on Logics in Artificial Intelligence (JELIA)
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We investigate the possibility of incorporating Boolean role constructors on simple roles into some of today's most popular description logics, focussing on cases where those extensions do not increase complexity of reasoning. We show that the expressive DLs SHOIQ and SROIQ, serving as the logical underpinning of OWL and the forthcoming OWL 2, can accommodate arbitrary Boolean expressions. The prominent OWL-fragment SHIQ can be safely extended by safe role expressions, and the tractable fragments EL++ and DLP retain tractability if extended by conjunction on roles, where in the case of DLP the restriction on role simplicity can even be discarded.
JA - 11th European Conference on Logics in Artificial Intelligence (JELIA)
CY - Dresden, Germany
ER -
TY - JOUR
T1 - A Coherent Query Language for XML
Y1 - 2008
A1 - Trivikram Immaneni
A1 - Krishnaprasad Thirunarayan
KW - Retrieval
KW - Search
ER -
TY - CONF
T1 - A Coherent Well-founded Model for Hybrid MKNF Knowledge Bases
T2 - 18th European Conference on Artificial Intelligence, ECAI
Y1 - 2008
A1 - Matthias Knorr
A1 - Jose Julio Alferes
A1 - Pascal Hitzler
AB - With the advent of the Semantic Web, the question of how best to combine open-world based ontology languages, like OWL, with closed-world rules paradigms becomes important. One of the most mature proposals for this combination is known as Hybrid MKNF knowledge bases [11], which is based on an adaptation of the stable model semantics to knowledge bases consisting of ontology axioms and rules. In this paper, we propose a well-founded semantics for such knowledge bases which promises to provide better efficiency of reasoning, which is compatible both with the OWL-based semantics and the traditional well-founded semantics for logic programs, and which surpasses previous proposals for such a well-founded semantics by avoiding some issues related to inconsistency handling.
JA - 18th European Conference on Artificial Intelligence, ECAI
PB - 18th European Conference on Artificial Intelligence, ECAI 2008
CY - Patras, Greece
ER -
TY - CHAP
T1 - Collaborative R01 with NCBO Semantics and Services Enabled Problem Solving Environment For Trypanosoma Cruzi
Y1 - 2008
A1 - Rick Tarleton
A1 - Pablo N. Mendes
A1 - Prashant Doshi
A1 - Natasha Noy
A1 - Satya S. Sahoo
A1 - Mark Musen
A1 - D. Brent Weatherly
A1 - Amit Sheth
ER -
TY - CHAP
T1 - Computing Center-Lines: An Application of Vector Field Topology
Y1 - 2008
A1 - Thomas Wischgoll
AB - Flow visualization has been a very active subfield of scientific visualization in recent years. From the resulting large variety of methods this paper discusses structure-based techniques. The aim of these approaches is to partition the flow in areas of common behavior. Based on this partitioning, subsequent visualization techniques can be applied. A classification is suggested and advantages/disadvantages of the different techniques are discussed as well.
ER -
TY - CONF
T1 - Computing for Human Experience: Sensors, Perception, Semantics, Web N.0, and Beyond
T2 - 3rd Asian Semantic Web Conference (ASWC 2008)
Y1 - 2008
A1 - Amit Sheth
KW - semantic web and perception
AB - Traditionally there has been a strong separation between computing and human activities in the real world. The approach has largely been that of mapping the complexity and richness of the real world to constrained computer models and languages for more efficient computation, and then transferring the results for use in the real world. I think the time is ripe to reverse the situation, for computing and communication to transparently enrich and enhance human experience. Today, devices enable something more than a 'human instructs machine' paradigm. We are seeing computing and communication engage transparently in human activities by enriching them in ways not possible before. Assimilating, linking, computing over and disseminating multimodal information (maps, images, events, reviews, videos etc.) occur with far less human interaction. Systems are more 'aware': they not only deal with simple objects such as documents or entities, but also model, compute over, and communicate complex facets, such as the relationships between objects and the temporal ('when'), thematic ('what') and spatial ('where') aspects of objects. This era of 'computing for human experience' involves a seamless interaction between the physical world and the virtual or cyber world with advanced integrated capabilities in perception and awareness of the physical world (e.g., in extending sensory engagement with environments and narrowing the gaps between the real world and computing), using 'humans as sensors' of intentions and emotions, understanding (semantics) and using knowledge and collective wisdom, while integrating online and offline interactions. Some of these ideas have been posited as the Internet of Things, Intelligence@Interfaces, Humanist Computing, Relationship Web, PeopleWeb, EventWeb, and Experiential Computing.
Applications and infrastructures embodying the principles of computing for richer human experiences are already emerging: MyLifeBits, linked data, Open Social, and reusable knowledge bases (Semantic Web or Web3.0) are some examples. Building on these and related concepts and visions, we will outline recent progress and highlight where semantics and semantic Web technologies play an important role in leading to the next phase 'computing to enrich human experience' as well as the types of enrichments one can expect.
JA - 3rd Asian Semantic Web Conference (ASWC 2008)
PB - Asian Semantic Web Conference
CY - Bangkok, Thailand
ER -
TY - JOUR
T1 - Connectionist Model Generation: A First-Order Approach
JF - Neurocomputing
Y1 - 2008
A1 - Sebastian Bader
A1 - Steffen Holldobler
A1 - Pascal Hitzler
KW - Connectionist Model Generation
KW - First-Order Logic Programs
KW - Neural-Symbolic Integration
KW - Recurrent RBF Networks
AB - Knowledge based artificial neural networks have been applied quite successfully to propositional knowledge representation and reasoning tasks. However, as soon as these tasks are extended to structured objects and structure-sensitive processes as expressed e.g., by means of first-order predicate logic, it is not obvious at all what neural symbolic systems would look like such that they are truly connectionist, are able to learn, and allow for a declarative reading and logical reasoning at the same time. The core method aims at such an integration. It is a method for connectionist model generation using recurrent networks with feed-forward core. We show in this paper how the core method can be used to learn first-order logic programs in a connectionist fashion, such that the trained network is able to do reasoning over the acquired knowledge. We also report on experimental evaluations which show the feasibility of our approach.
ER -
TY - CONF
T1 - Constrained Classification on Structured Data
T2 - Constrained Classification on Structured Data
Y1 - 2008
A1 - C. Lee
A1 - Shaojun Wang
A1 - M. Brown
A1 - A. Murtha
A1 - R. Greiner
JA - Constrained Classification on Structured Data
ER -
TY - CHAP
T1 - A Declarative Approach using SAWSDL and Semantic Templates Towards Process Mediation
Y1 - 2008
A1 - Amit Sheth
A1 - Karthik Gomadam
A1 - Ajith Ranabahu
A1 - John A. Miller
A1 - Zixin Wu
AB - In this paper we address the challenges that arise due to heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or in cases of the efforts seeking to provide largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflow for real-world problems. In this paper, we present a planning-based ...
ER -
TY - CONF
T1 - Defeasible Inference with Circumscriptive OWL Ontologies
Y1 - 2008
A1 - Stephan Grimm
A1 - Pascal Hitzler
AB - The Web Ontology Language (OWL) adheres to the open-world assumption and thus cannot be used for forms of nonmonotonic reasoning or defeasible inference, an acknowledged desirable feature in open Semantic Web environments. We investigate the use of the formalism of circumscriptive description logics (DLs) to realise defeasible inference within the OWL framework. By example, we demonstrate how reasoning with (restricted) circumscribed OWL ontologies facilitates various forms of defeasible inference, also in comparison to alternative approaches. Moreover, we sketch an extension to DL tableaux for handling the circumscriptive case and report on a preliminary implementation.
PB - 5th European Semantic Web Conference, ESWC08
ER -
TY - CONF
T1 - Dependence of Binary Associations on Co-occurrence Granularity in News Documents
T2 - IKE 2008
Y1 - 2008
A1 - Krishnaprasad Thirunarayan
A1 - Trivikram Immaneni
A1 - Mastan Vali Shaik
AB - We describe and formalize an approach to correlate binary associations (such as between entities and events, between persons and events, etc.) implied by News documents on the co-occurrence granularity (such as document-level, paragraph-level, sentence-level, etc.) of the corresponding text phrases in the documents. Specifically, we present both qualitative and quantitative characterization of searching News documents: the former in terms of the nature of the content and the queries, and the latter in terms of a metric obtained by adapting the notions of precision and recall. In particular, the approach tries to reduce the manual effort required to analyze the News documents to compare the three alternatives for granularity of co-occurrence. Furthermore, the analysis suggests ways to improve retrieval performance as illustrated by applying our findings to News documents for the year 2005.
JA - IKE 2008
CY - New York, USA
ER -
TY - CONF
T1 - Description Logic Reasoning with Decision Diagrams: Compiling SHIQ to Disjunctive Datalog
T2 - The Semantic Web - ISWC 2008, 7th International Semantic Web Conference, 2008
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We propose a novel method for reasoning in the description logic SHIQ. After a satisfiability preserving transformation from SHIQ to the description logic ALCIb, the obtained ALCIb Tbox T is converted into an ordered binary decision diagram (OBDD) which represents a canonical model for T. This OBDD is turned into a disjunctive datalog program that can be used for Abox reasoning. The algorithm is worst-case optimal w.r.t. data complexity, and admits easy extensions with DL-safe rules and ground conjunctive queries.
JA - The Semantic Web - ISWC 2008, 7th International Semantic Web Conference, 2008
PB - The Semantic Web - ISWC 2008, 7th International Semantic Web Conference
CY - Karlsruhe, Germany
ER -
TY - CONF
T1 - Description Logic Rules
T2 - 18th European Conference on Artificial Intelligence, ECAI
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We introduce description logic (DL) rules as a new rule-based formalism for knowledge representation in DLs. As a fragment of the Semantic Web Rule Language SWRL, DL rules allow for a tight integration with DL knowledge bases. In contrast to SWRL, however, the combination of DL rules with expressive description logics remains decidable, and we show that the DL SROIQ - the basis for the ongoing standardisation of OWL 2 - can completely internalise DL rules. On the other hand, DL rules capture many expressive features of SROIQ that are not available in simpler DLs yet. While reasoning in SROIQ is highly intractable, it turns out that DL rules can be introduced to various lightweight DLs without increasing their worst-case complexity. In particular, DL rules enable us to significantly extend the tractable DLs EL++ and DLP.
JA - 18th European Conference on Artificial Intelligence, ECAI
CY - Patras, Greece
ER -
TY - JOUR
T1 - Determining the Best K for Clustering Transactional Datasets: A Coverage Density-based Approach
Y1 - 2008
A1 - Hua Yan
A1 - Keke Chen
A1 - Ling Liu
AB - The problem of determining the optimal number of clusters is important but mysterious in cluster analysis. In this paper, we propose a novel method to find a set of candidate optimal number Ks of clusters in transactional datasets. Concretely, we propose Transactional-cluster-modes Dissimilarity based on the concept of coverage density as an intuitive transactional inter-cluster dissimilarity measure. Based on the above measure, an agglomerative hierarchical clustering algorithm is developed and the Merge Dissimilarity Indexes, which are generated in hierarchical cluster merging processes, are used to find the candidate optimal number Ks of clusters of transactional data. Our experimental results on both synthetic and real data show that the new method often effectively estimates the number of clusters of transactional data.
ER -
TY - JOUR
T1 - Do Amino Acid Biosynthetic Costs Constrain Protein Evolution in Saccharomyces Cerevisiae?
Y1 - 2008
A1 - Dan Krane
A1 - R. Miller
A1 - Michael Raymer
A1 - D. Raiford
A1 - E. Heizer
A1 - H. Akashi
ER -
TY - JOUR
T1 - Dose and Time Response Metabolomic Analyses of α-Naphthylisothiocyanate Toxicity in the Rat
JF - Chemical Research in Toxicology
Y1 - 2008
A1 - M. Westrick
A1 - Nicholas J. DelRaso
A1 - A. Neuforth
A1 - D. Mahle
A1 - Michael Raymer
A1 - Nicholas Reo
ER -
TY - JOUR
T1 - Dose and Time Response Metabolomic Analyses of α-Napthylisothiocyanate Toxicity in the Rat
Y1 - 2008
A1 - Nicholas J. DelRaso
A1 - M. Westrick
A1 - Michael Raymer
A1 - D. Mahle
A1 - Nicholas Reo
A1 - A. Neuforth
ER -
TY - CONF
T1 - Dynamic and Agile SOA using SAWSDL
Y1 - 2008
A1 - Amit Sheth
A1 - Kunal Verma
A1 - Karthik Gomadam
KW - SOA
KW - SAWSDL
KW - Agile
PB - Semantic Technology Conference
ER -
TY - ABST
T1 - ELP: Tractable Rules for OWL 2
Y1 - 2008
A1 - Markus Krotzsch
A1 - Sebastian Rudolph
A1 - Pascal Hitzler
AB - We introduce ELP as a decidable fragment of the Semantic Web Rule Language (SWRL) that admits reasoning in polynomial time. ELP is based on the tractable description logic EL++, and encompasses an extended notion of the recently proposed DL rules for that logic. Thus ELP extends EL++ with a number of features introduced by the forthcoming OWL 2, such as disjoint roles, local reflexivity, certain range restrictions, and the universal role. We present a reasoning algorithm based on a translation of ELP to Datalog, and this translation also enables the seamless integration of DL-safe rules into ELP. While reasoning with DL-safe rules as such is already highly intractable, we show that DL-safe rules based on the Description Logic Programming (DLP) fragment of OWL 2 can be admitted in ELP without losing tractability.
ER -
TY - CONF
T1 - ELP: Tractable Rules for OWL 2
T2 - ELP: Tractable Rules for OWL 2
Y1 - 2008
A1 - Markus Krotzsch
A1 - Sebastian Rudolph
A1 - Pascal Hitzler
AB - We introduce ELP as a decidable fragment of the Semantic Web Rule Language (SWRL) that admits reasoning in polynomial time. ELP is based on the tractable description logic EL++, and encompasses an extended notion of the recently proposed DL rules for that logic. Thus ELP extends EL++ with a number of features introduced by the forthcoming OWL 2, such as disjoint roles, local reflexivity, certain range restrictions, and the universal role. We present a reasoning algorithm based on a translation of ELP to Datalog, and this translation also enables the seamless integration of DL-safe rules into ELP. While reasoning with DL-safe rules as such is already highly intractable, we show that DL-safe rules based on the Description Logic Programming (DLP) fragment of OWL 2 can be admitted in ELP without losing tractability.
JA - ELP: Tractable Rules for OWL 2
ER -
TY - RPRT
T1 - ELP: Tractable Rules for OWL 2
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Pascal Hitzler
A1 - Markus Krotzsch
AB - We introduce ELP as a decidable fragment of the Semantic Web Rule Language (SWRL) that admits reasoning in polynomial time. ELP is based on the tractable description logic EL++, and encompasses an extended notion of the recently proposed DL rules for that logic. Thus ELP extends EL++ with a number of features introduced by the forthcoming OWL 2, such as disjoint roles, local reflexivity, certain range restrictions, and the universal role. We present a reasoning algorithm based on a translation of ELP to Datalog, and this translation also enables the seamless integration of DL-safe rules into ELP. While reasoning with DL-safe rules as such is already highly intractable, we show that DL-safe rules based on the Description Logic Programming (DLP) fragment of OWL 2 can be admitted in ELP without losing tractability.
JA - ELP: Tractable Rules for OWL 2
ER -
TY - CONF
T1 - Empowering Translational Research using Semantic Web Technologies
Y1 - 2008
A1 - Amit Sheth
PB - Ohio Collaborative Conference on Bioinformatics (OCCBIO)
ER -
TY - CONF
T1 - Enhancing Process-Adaptation Capabilities with Web-Based Corporate Radar Technologies
Y1 - 2008
A1 - Kunal Verma
A1 - Prateek Jain
A1 - Alex Kass
A1 - Peter Z. Yeh
A1 - Amit Sheth
KW - Dynamic business process
KW - Corporate Radars
KW - Dynamic Process Adaptation
AB - Dynamic business processes are capable of adapting themselves to internal events but lack the ability to adapt to external events. Corporate radars are capable of mining the Web for external events of interest to produce structured representations of these events but are not capable of adapting themselves based on these events. In this position paper, we advocate and propose a system that integrates these two approaches to enhance the capability of process-adaptation engines, greatly increasing the scope of events they can respond to.
PB - First International workshop on Ontology-supported business intelligence
ER -
TY - CONF
T1 - Event Visualization in a 3D Environment
T2 - Event Visualization in a 3D Environment
Y1 - 2008
A1 - Deligiannidis L
A1 - Hakimpour F
A1 - Amit Sheth
AB - Semantic event tracker (SET) is an interactive visualization tool for analyzing events (activities) in a three-dimensional environment. We model an event as an object that describes an action, its location, time, and relations to other objects. Real world event information is extracted from Internet sources, then stored and processed using Semantic Web technologies that enable us to discover semantic associations between events. We use RDF graphs to represent semantic metadata and ontologies. SET is capable of visualizing as well as navigating through the event data in all three aspects of space, time and theme. Temporal data is illustrated as a 3D multi-line in the 3D environment that connects consecutive events. The line is marked with user-selectable objects that represent the events being visualized. Upon an event's selection, SET 'speaks' to the user semantically associated information via a speech synthesizer. Then, upon the user's verbal command SET can display semantically associated media such as digital images, audio and/or video clips. SET provides access to multi-source, heterogeneous, multimedia data, and is capable of visualizing events that contain geographic and time information.
JA - Event Visualization in a 3D Environment
ER -
TY - ABST
T1 - Expressive Tractable Description Logics based on SROIQ Rules
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We introduce description logic (DL) rules as a new rule-based formalism for knowledge representation in DLs. As a fragment of the Semantic Web Rule Language SWRL, DL rules allow for a tight integration with DL knowledge bases. In contrast to SWRL, however, the combination of DL rules with expressive description logics remains decidable, and we show that the DL SROIQ - the basis for the ongoing standardisation of OWL 2 - can completely internalise DL rules. On the other hand, DL rules capture many expressive features of SROIQ that are not available in simpler DLs yet. While reasoning in SROIQ is highly intractable, it turns out that DL rules can be introduced to various lightweight DLs without increasing their worst-case complexity. In particular, DL rules enable us to significantly extend the tractable DLs EL++ and DLP.
ER -
TY - THES
T1 - Extracting, Representing and Mining Semantic Metadata from Text: Facilitating Knowledge Discovery in Biomedicine
Y1 - 2008
A1 - Cartic Ramakrishnan
KW - Knowledge Discovery
KW - text mining
AB - The information access paradigm offered by most contemporary text information systems is a search-and-sift paradigm where users have to manually glean and aggregate relevant information from the large number of documents that are typically returned in response to keyword queries. Expecting the users to glean and aggregate information has led to several inadequacies in these information systems. Owing to the size of many text databases, search-and-sift is very tedious, often requiring repeated keyword searches that refine or generalize query terms. A more serious limitation arises from the lack of automated mechanisms to aggregate content across different documents to discover new knowledge. This dissertation focuses on processing text to assign semantic interpretations to its content (extracting semantic metadata) and the design of algorithms and heuristics to utilize the extracted semantic metadata to support knowledge discovery operations over text content. Contributions in extracting semantic metadata in this dissertation cover the extraction of compound entities and complex relationships connecting entities. Extraction results are represented using a standard Semantic Web representation language (RDF) and are manually evaluated for accuracy. Knowledge discovery algorithms presented herein operate on RDF data. To further improve access mechanisms to text content, applications supporting semantic browsing and semantic search of text are presented.
ER -
TY - CONF
T1 - A Faceted Classification Based Approach to Search and Rank Web APIs
T2 - International Conference on Web Services, 2008
Y1 - 2008
A1 - Ajith Ranabahu
A1 - Karthik Gomadam
A1 - Kunal Verma
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
AB - Web application hybrids, popularly known as mashups, are created by integrating services on the Web using their APIs. Support for finding an API is currently provided by generic search engines or domain specific solutions such as ... Shortcomings of both these solutions in terms of and reliance on user tags make the task of identifying an API challenging. Since these APIs are described in HTML documents, it is essential to look beyond the boundaries of current approaches to Web service discovery that rely on formal descriptions. In this work, we present a faceted approach to searching and ranking Web APIs that takes into consideration attributes or facets of the APIs as found in their HTML descriptions. Our method adopts current research in document classification and faceted search and introduces the serviut score to rank APIs based on their utilization and popularity. We evaluate classification, search accuracy and ranking effectiveness using available APIs while contrasting our solution with existing ones.
JA - International Conference on Web Services, 2008
PB - http://conferences.computer.org/icws/2008/
CY - Beijing, China
ER -
TY - CONF
T1 - A Forgetting-based Approach for Handling Inconsistency in Distributed Ontologies
Y1 - 2008
A1 - Guilin Qi
A1 - Yimin Wang
A1 - Peter Haase
A1 - Pascal Hitzler
AB - In the context of multiple distributed ontologies, we are often confronted with the problem of dealing with inconsistency. In this paper, we propose an approach for reasoning with inconsistent distributed ontologies based on concept forgetting. We first define concept forgetting in description logics. We then adapt the notions of recoveries and preferred recoveries in propositional logic to description logics. Two consequence relations are then defined based on the preferred recoveries.
PB - 5th European Semantic Web Conference, ESWC08
ER -
TY - CONF
T1 - A Framework for Trust and Distrust Networks
T2 - A Framework for Trust and Distrust Networks
Y1 - 2008
A1 - Rakesh Verma
A1 - Krishnaprasad Thirunarayan
AB - In this age of internet and electronic commerce it is becoming increasingly important to have and to manipulate information about the trustworthiness of the content or service providers in order to make informed decisions. This paper explores realistic models of trust and distrust based on partially ordered discrete values and proposes a framework, which is sensitive to local, relative ordering of values rather than their magnitudes. The framework distinguishes between direct and inferred trust, preferring direct information over possibly conflicting inferred information. It also represents ambiguity or inconsistency explicitly. The framework is capable of handling general trust and belief networks containing cycles.
JA - A Framework for Trust and Distrust Networks
ER -
TY - ABST
T1 - A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data
Y1 - 2008
A1 - Matthew Perry
A1 - Amit Sheth
KW - RDF
KW - Ontology
KW - Temporal Query Processing
KW - Spatio-Temporal-Thematic Query Processing
KW - RDF Query Processing
KW - Spatial Query Processing
KW - SPARQL
AB - Spatial and temporal data are critical components in many applications. This is especially true in analytical applications ranging from scientific discovery to national security and criminal investigation. The analytical process often requires uncovering and analyzing complex thematic relationships between disparate people, places and events. Fundamentally new query operators based on the graph structure of Semantic Web data models, such as semantic associations, are proving useful for this purpose. However, these analysis mechanisms are primarily intended for thematic relationships. In this paper, we describe a framework built around the RDF data model for analysis of thematic, spatial and temporal relationships between named entities. We present a spatiotemporal modeling approach that uses an upper-level ontology in combination with temporal RDF graphs. A set of query operators that use graph patterns to specify a form of context are formally defined. We also describe an efficient implementation of the framework in Oracle DBMS and demonstrate the scalability of our approach with a performance study using both synthetic and real-world RDF datasets of over 25 million triples.
ER -
TY - THES
T1 - A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data
Y1 - 2008
A1 - Matthew Perry
A1 - Amit Sheth
KW - Ontology
KW - rdf
KW - RDF Query Processing
KW - sparql
KW - SPARQL-ST
KW - Spatial Query Processing
KW - Spatio-Temporal-Thematic Query Processing
KW - Spatiotemporal Graph Patterns
KW - Temporal Query Processing
AB - Spatial and temporal data are critical components in many applications. This is especially true in analytical applications ranging from scientific discovery to national security and criminal investigation. The analytical process often requires uncovering and analyzing complex thematic relationships between disparate people, places and events. Fundamentally new query operators based on the graph structure of Semantic Web data models, such as semantic associations, are proving useful for this purpose. However, these analysis mechanisms are primarily intended for thematic relationships. In this paper, we describe a framework built around the RDF data model for analysis of thematic, spatial and temporal relationships between named entities. We present a spatiotemporal modeling approach that uses an upper-level ontology in combination with temporal RDF graphs. A set of query operators that use graph patterns to specify a form of context are formally defined. We also describe an efficient implementation of the framework in Oracle DBMS and demonstrate the scalability of our approach with a performance study using both synthetic and real-world RDF datasets of over 25 million triples.
ER -
TY - CONF
T1 - Fundamental Concepts of Bioinformatics
Y1 - 2008
A1 - Michael Raymer
ER -
TY - JOUR
T1 - Gaussian Binning: A new Kernel-based Method for Processing NMR Spectroscopic Data for Metabolomics
Y1 - 2008
A1 - Michael Raymer
ER -
TY - JOUR
T1 - Gaussian Binning for Processing NMR Spectroscopic Data for Metabolomics
Y1 - 2008
A1 - P. Anderson
A1 - Michael Raymer
A1 - Nicholas Reo
A1 - Nicholas J. DelRaso
ER -
TY - JOUR
T1 - A Genetic Optimization Approach for Isolating Translational Efficiency Bias
JF - IEEE Control Systems Society
Y1 - 2008
A1 - Doug Raiford
A1 - Dan Krane
A1 - Travis Doom
A1 - Michael Raymer
KW - codon usage bias
KW - Evolutionary computing and genetic algorithms
KW - GC-content
KW - strand bias
KW - translational efficiency
AB - The study of codon usage bias is an important research area that contributes to our understanding of molecular evolution, phylogenetic relationships, respiratory lifestyle, and other characteristics. Translational efficiency bias is perhaps the best-studied codon usage bias, as it is frequently utilized to predict relative protein expression levels. We present a novel approach to isolating translational efficiency bias in microbial genomes. Several methods for isolating translational efficiency bias already exist, but they are susceptible to the confounding influences of other potentially dominant biases. Additionally, existing approaches to identifying translational efficiency bias generally require both genomic sequence information and prior knowledge of a set of highly expressed genes. This novel approach provides more accurate results from sequence information alone by resisting the confounding effects of other biases. We validate this increase in accuracy in isolating translational efficiency bias on ten microbial genomes, five of which have proven particularly difficult for existing approaches due to the presence of strong confounding biases.
ER -
TY - CONF
T1 - Graph Summaries for Subgraph Frequency Estimation
T2 - 5th European Semantic Web Conference (ESWC2008)
Y1 - 2008
A1 - Angela Maduko
A1 - Kemafor Anyanwu
A1 - Paul Schliekelman
A1 - Amit Sheth
KW - Data summaries
KW - Frequency estimation
KW - Graph summaries
KW - RDF graph patterns
KW - RDF query optimizer
AB - A fundamental problem related to graph structured databases is searching for substructures. One issue with respect to optimizing such searches is the ability to estimate the frequency of substructures within a query graph. In this work, we present and evaluate two techniques for estimating the frequency of subgraphs from a summary of the data graph. In the first technique, we assume that edge occurrences on edge sequences are position independent and summarize only the most informative dependencies. In the second technique, we prune small subgraphs using a valuation scheme that blends information about their importance and estimation power. In both techniques, we assume conditional independence to estimate the frequencies of larger subgraphs. We validate the effectiveness of our techniques through experiments on real and synthetic datasets.
JA - 5th European Semantic Web Conference (ESWC2008)
CY - Tenerife, Spain
ER -
TY - CONF
T1 - Growing Fields of Interest - Using an Expand and Reduce Strategy for Domain Model Extraction
T2 - Growing Fields of Interest - Using an Expand and Reduce Strategy for Domain Model Extraction
Y1 - 2008
A1 - Christopher Thomas
A1 - Roger Brooks
A1 - Pankaj Mehra
A1 - Amit Sheth
AB - Domain hierarchies are widely used as models underlying information retrieval tasks. Formal ontologies and taxonomies enrich such hierarchies further with properties and relationships associated with concepts and categories but require manual effort; therefore they are costly to maintain, and often stale. Folksonomies and vocabularies lack rich category structure and are almost entirely devoid of properties and relationships. Classification and extraction require the coverage of vocabularies and the alterability of folksonomies and can largely benefit from category relationships and other properties. With Doozer, a program for building conceptual models of information domains, we want to bridge the gap between vocabularies and folksonomies on the one side and rich, expert-designed ontologies and taxonomies on the other. Doozer mines Wikipedia to produce tight domain hierarchies, starting with simple domain descriptions. It also adds relevancy scores for use in automated classification of information. The output model is described as a hierarchy of domain terms that can be used immediately for classifiers and IR systems or as a basis for manual or semi-automatic creation of formal ontologies.
JA - Growing Fields of Interest - Using an Expand and Reduce Strategy for Domain Model Extraction
ER -
TY - CONF
T1 - HTML Microformat for Describing RESTful Web Services and APIs
T2 - IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
Y1 - 2008
A1 - Tomas Vitvar
A1 - Jacek Kopecky
A1 - Karthik Gomadam
AB - The Web 2.0 wave brings, among other aspects, the Programmable Web: increasing numbers of Web sites provide machine-oriented APIs and Web services. However, most APIs are only described with text in HTML documents. The lack of machine-readable API descriptions affects the feasibility of tool support for developers who use these services. We propose a microformat called hRESTS (HTML for RESTful Services) for machine-readable descriptions of Web APIs, backed by a simple service model. The hRESTS microformat describes main aspects of services, such as operations, inputs and outputs. We also present two extensions of hRESTS: SA-REST, which captures the facets of public APIs important for mashup developers, and MicroWSMO, which provides support for semantic automation.
JA - IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
CY - Sydney, Australia
ER -
TY - JOUR
T1 - Inferring the Number of Contributors to Mixed DNA Profiles
Y1 - 2008
A1 - D. Paoletti
A1 - Michael Raymer
A1 - Travis Doom
A1 - Dan Krane
ER -
TY - CHAP
T1 - Inheritance in Programming Languages
Y1 - 2008
A1 - Krishnaprasad Thirunarayan
ER -
TY - CONF
T1 - Introduction to Bioinformatics
Y1 - 2008
A1 - Michael Raymer
ER -
TY - CONF
T1 - Joint Extraction of Compound Entities and Relationships from Biomedical Literature
T2 - Joint Extraction of Compound Entities and Relationships from Biomedical Literature
Y1 - 2008
A1 - Cartic Ramakrishnan
A1 - Pablo N. Mendes
A1 - Rodrigo Da Gama
A1 - Guilherme Ferreira
A1 - Amit Sheth
AB - In this paper we identify some limitations of contemporary information extraction mechanisms in the context of biomedical literature. We present an extraction mechanism that generates structured representations of textual content. Our extraction mechanism achieves this by extracting compound entities, and relationships between them, occurring in text. A detailed evaluation of the relationships and compound entities extracted is presented. Our results show over 62% average precision across 8 relationship types tested with over 82% average precision for compound entity identification.
JA - Joint Extraction of Compound Entities and Relationships from Biomedical Literature
CY - Sydney, Australia
ER -
TY - CHAP
T1 - Learning Expressive Ontologies
Y1 - 2008
A1 - Johanna Volker
A1 - Peter Haase
A1 - Pascal Hitzler
ER -
TY - JOUR
T1 - A Low-Cost, Linux-Based Virtual Environment for Visualizing Vascular Structures
JF - Lecture Notes in Computer Science, Volume 5358
Y1 - 2008
A1 - Thomas Wischgoll
AB - The analysis of morphometric data of the vasculature of any organ requires appropriate visualization methods due to the vast number of vessels that can be present in such data. In addition, the geometric properties of vessel segments, i.e. being rather long and thin, can make it difficult to judge relative position, despite depth cues such as proper lighting and shading of the vessels. Virtual environments that provide true 3-D visualization of the data can help enhance visual perception. Ideally, the system should be relatively cost-effective. Hence, this paper describes a Linux-based virtual environment that utilizes a 50-inch plasma screen as its main display. The overall cost of the entire system is less than $3,500, which is considerably less than other commercial systems. The system was successfully used for visualizing vascular data sets, providing true three-dimensional perception of the morphometric data.
ER -
TY - GEN
T1 - Mechanisms for Improved Covariant Type-Checking
Y1 - 2008
A1 - K. Cleereman
A1 - Krishnaprasad Thirunarayan
A1 - M. Cheatham
PB - Computer Languages, Systems and Structures
ER -
TY - CONF
T1 - Mediatability: Estimating the Degree of Human Involvement in XML Schema Mediation
T2 - Mediatability: Estimating the Degree of Human Involvement in XML Schema Mediation
Y1 - 2008
A1 - Kunal Verma
A1 - Lakshmish Ramaswamy
A1 - Ajith Ranabahu
A1 - Karthik Gomadam
A1 - Amit Sheth
JA - Mediatability: Estimating the Degree of Human Involvement in XML Schema Mediation
ER -
TY - CONF
T1 - Mining Sequence Classifiers for Early Prediction
T2 - Mining Sequence Classifiers for Early Prediction
Y1 - 2008
A1 - Guozhu Dong
A1 - Zhengzheng Xing
A1 - Philip Yu
A1 - Jian Pei
JA - Mining Sequence Classifiers for Early Prediction
ER -
TY - ABST
T1 - Mobile Semantic Computing
Y1 - 2008
A1 - Karthik Gomadam
A1 - Anupam Joshi
A1 - Amit Sheth
KW - Data Mining
KW - Natural Language Processing
KW - Semantic Database Theory and Systems
KW - Service-oriented Architectures and Computing
AB - The advent of powerful mobile devices, mobile computing platforms and interfaces is rapidly ushering in the next computing revolution. The recent surge in the growth and adoption of the RESTful services paradigm for delivering applications as services on the Web has challenged and changed conventional approaches to software design and delivery. Combined with the evolution of Semantic Web techniques and principles, the areas of Mobile Computing and RESTful services can take us closer toward realizing the Web as a ubiquitous platform for data, information, knowledge and application exchange. While the possibilities are endless and exciting, one needs to address certain basic issues before the grand vision can be realized. Some of these include: (i) efficient ways to distribute data processing between service providers and users, especially data integration in the case of mashups; (ii) semantic annotation of data and data formats that can be used in bandwidth-constrained environments such as mobile environments; (iii) the need to identify techniques to model and describe various execution environments, their capabilities and constraints; and (iv) the role of the Semantic Web in taking the foundations of social computing and other emerging Web 2.0 paradigms into ubiquitous environments.
PB - 2nd International Conference on Semantic Computing (ICSC2008)
ER -
TY - ABST
T1 - Monetizing User Activity on Social Networks
Y1 - 2008
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
A1 - Kamal Baid
A1 - Shaojun Wang
KW - Social Networks
KW - Monetization
KW - User activity
KW - Computational Advertising
KW - Off-topic content
KW - Intents
AB - In this work, we investigate techniques to monetize user activity on public forums, marketplaces and groups on social network sites. Our approach involves (a) identifying the monetization potential of user posts and (b) eliminating off-topic content in monetizable posts to use the most relevant keywords for advertising. Our first user study, involving 30 users and data from MySpace and Facebook, shows that 52% of ad impressions shown after using our system were more targeted compared to the 30% relevant impressions generated without using our system. A second smaller study suggests that profile ads that are based on user activity generate more interest than ads solely based on profile information.
ER -
TY - CONF
T1 - On the Move to Meaningful Internet Systems: OTM 2008 Workshops
Y1 - 2008
A1 - Pieter De Leenheer
A1 - Martin Hepp
A1 - Amit Sheth
PB - On the Move to Meaningful Internet Systems: OTM 2008 Workshops COMBEK 2008
ER -
TY - JOUR
T1 - An Online Platform for Web APIs and Service Mashups
Y1 - 2008
A1 - Ajith Ranabahu
A1 - Karthik Gomadam
A1 - E. Michael Maximilien
KW - DSL
ER -
TY - CONF
T1 - Ontology Driven Semantic Provenance for Heterogeneous Bionomics Experimental Data
Y1 - 2008
A1 - Michael Raymer
A1 - Satya S. Sahoo
A1 - Cory Henson
A1 - William York
A1 - Amit Sheth
AB - Scientific experimental data generated by all the bionomic technologies is characterized by heterogeneity in its representation formats, constituents, and generation processes and, therefore, also in its usage. Using the proteomics domain, we demonstrate the important role of provenance information to manage, interpret and analyze experimental data. We present a novel approach that employs an ontology as a knowledge model to automatically create semantic provenance information for high-throughput mass spectrometry (MS) data in the glycoproteomics domain. The Semantic Provenance Annotation of Data in protEomics (SPADE) implementation is based on the ProPreO ontology, a large-process ontology (~500 classes, 40 named relationships with 170 class-level restrictions, and 3.1 million instances) that models the complete experimental protocol for MS-based glycoproteomics data analysis. The semantic provenance information created in SPADE enables biologists to query over the semantic provenance information and retrieve exact data using 'train-of-thought' expressive queries in the SPARQL query language. We also discuss our current work in extending the ProPreO ontology to support toxicological metabolomics experimentation using Nuclear Magnetic Resonance (NMR) spectroscopy. Our strategic goal is to use semantic provenance information in pattern recognition and data mining algorithms for comparative or correlation analysis of Liquid Chromatography MS (LCMS) and NMR spectroscopy experimental data as part of toxicological metabolomics studies.
ER -
TY - JOUR
T1 - An Ontology-Driven Semantic Mashup of Gene and Biological Pathway Information: Application to the Domain of Nicotine Dependence
Y1 - 2008
A1 - Satya Sahoo
A1 - Olivier Bodenreider
A1 - Joni Rutter
A1 - Karen Skinner
A1 - Amit Sheth
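AB - This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base.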
ER -
TY - JOUR
T1 - An Ontology-driven Semantic Mash-up of Gene and Biological Pathway Information: Application to the Domain of Nicotine Dependence
JF - Journal of Biomedical Informatics
Y1 - 2008
A1 - Karen Skinner
A1 - Joni Rutter
A1 - Amit Sheth
A1 - Olivier Bodenreider
A1 - Satya S. Sahoo
KW - Entrez Knowledge Model (EKoM)
KW - Gene-Pathway data integration
KW - information integration
KW - Multi-ontology schema integration
KW - Nicotine dependence
KW - Semantic Bioinformatics
KW - Semantic mashup
AB - Objectives: This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. Methods: We use an ontology-driven approach to integrate two gene resources (Entrez Gene and HomoloGene) and three pathway resources (KEGG, Reactome and BioCyc), for five organisms, including humans. We created the Entrez Knowledge Model (EKoM), an information model in OWL for the gene resources, and integrated it with the extant BioPAX ontology designed for pathway resources. The integrated schema is populated with data from the pathway resources, publicly available in BioPAX-compatible format, and gene resources for which a population procedure was created. The SPARQL query language is used to formulate queries over the integrated knowledge base to answer the three biological queries. Results: Simple SPARQL queries could easily identify hub genes, i.e., those genes whose gene products participate in many pathways or interact with many other gene products. The identification of the genes expressed in the brain turned out to be more difficult, due to the lack of a common identification scheme for proteins. Conclusion: Semantic Web technologies provide a valid framework for information integration in the life sciences. Ontology-driven integration represents a flexible, sustainable and extensible solution to the integration of large volumes of information. Additional resources, which enable the creation of mappings between information sources, are required to compensate for heterogeneity across namespaces.
ER -
TY - CONF
T1 - Paraconsistent Reasoning for Expressive and Tractable Description Logics
Y1 - 2008
A1 - Yue Ma
A1 - Zuoquan Lin
A1 - Pascal Hitzler
AB - Four-valued description logic has been proposed to reason with description logic based inconsistent knowledge bases, mainly ALC. This approach has a distinct advantage that it can be implemented by invoking classical reasoners to keep the same complexity as classical semantics. In this paper, we further study how to extend the four-valued semantics to more expressive description logics, such as SHIQ, and to more tractable description logics including EL++, DL-Lite, and Horn-DLs. The most effort we spend defining the four-valued semantics of expressive four-valued description logics is on keeping the reduction from four-valued semantics to classical semantics as in the case of ALC; while for tractable description logics, we mainly focus on how to maintain their tractability when adopting four-valued semantics.
PB - 21st International Workshop on Description Logics, DL2008
ER -
TY - CONF
T1 - Proceedings of the 2008 International Workshop on Context Enabled Source and Service Selection, Integration and Adaptation
T2 - Proceedings of the 2008 International Workshop on Context Enabled Source and Service Selection, Integration and Adaptation
Y1 - 2008
A1 - Amit Sheth
A1 - Z. Maamar
A1 - Q. Z. Sheng
A1 - Biplav Srivastava
A1 - Ullas Nambiar
A1 - S. Elnaffar
KW - context awareness
KW - service and data integration
KW - web service
JA - Proceedings of the 2008 International Workshop on Context Enabled Source and Service Selection, Integration and Adaptation
CY - Beijing, China
VL - 292
SN - 978-1-60558-107-1
ER -
TY - ABST
T1 - Provenance Algebra and Materialized View-based Provenance Management
Y1 - 2008
A1 - Satya S. Sahoo
A1 - Roger Barga
A1 - Amit Sheth
A1 - Jonathan Goldstein
AB - Provenance, from the French word 'provenir' meaning 'to come from', describes the lineage of an entity. Provenance is critical information in eScience to accurately interpret scientific results. Though information provenance has been recognized as a hard problem in computing science (British Computing Society, 2004), many fundamental research issues in provenance have yet to be addressed. A common provenance model with well-defined formal semantics to facilitate interoperability of provenance metadata from different sources has not been defined. Another important issue is the lack of a systematic study of provenance query characteristics across multiple applications. A classification or taxonomy of the provenance queries will not only help to better understand provenance metadata, but will also enable the definition of provenance query operators. Finally, while provenance for a user or an application is a specific view over all available provenance metadata, a provenance management system that supports provenance storage as views has not been implemented. In this paper we propose a novel provenance algebra consisting of a common provenance model called provenir, defined in description logic based W3C Web Ontology Language (OWL-DL), along with a set of provenance query operators derived from the classification of provenance queries. We also introduce a practical provenance storage solution using materialized views over a generic relational database system. Our approach takes advantage of provenance query operators and well-defined indices to efficiently process complex provenance queries over very large datasets. To support our claims we present an evaluation of both performance and scalability aspects of our initial implementation. To the best of our knowledge this is the first provenance management system that supports the complete process from a formal provenance model and query operators to storage and efficient queries over provenance data.
ER -
TY - CONF
T1 - RDB2RDF: Incorporating Domain Semantics in Structured Data
T2 - RDB2RDF: Incorporating Domain Semantics in Structured Data
Y1 - 2008
A1 - Satya S. Sahoo
KW - Entrez Gene
KW - Gene Ontology
KW - information integration
KW - knowledge integration
KW - Nicotine dependence
KW - Ontologies
KW - rdf
KW - Semantic mashup
KW - Semantic Web
JA - RDB2RDF: Incorporating Domain Semantics in Structured Data
ER -
TY - ABST
T1 - Reasoning in Circumscriptive ALCO
Y1 - 2008
A1 - Stephan Grimm
A1 - Pascal Hitzler
AB - Non-monotonic extensions of description logics (DLs) allow for default and local closed-world reasoning and are an acknowledged desired feature for applications, e.g. in the Semantic Web. A recent approach to such an extension is based on McCarthy's circumscription, which rests on the principle of minimising the extension of selected predicates to locally close off dedicated parts of a domain model. While decidability and complexity results have been established in the literature, no practical algorithmisation for circumscriptive DLs has been proposed so far. In this paper, we present a tableaux calculus that can be used as a sound and complete decision procedure for concept satisfiability with respect to concept-circumscribed ALCO knowledge bases. The calculus builds on existing tableaux for classical DLs, extended by the notion of a preference clash to detect the non-minimality of constructed models.
ER -
TY - CONF
T1 - Relationship Web: Trailblazing, Analytics and Computing for Human Experience
T2 - Relationship Web: Trailblazing, Analytics and Computing for Human Experience
Y1 - 2008
A1 - Amit Sheth
JA - Relationship Web: Trailblazing, Analytics and Computing for Human Experience
PB - 27th International Conference on Conceptual Modeling (ER 2008)
CY - Barcelona, Spain
ER -
TY - JOUR
T1 - Resolution of Forensic DNA Mixtures
Y1 - 2008
A1 - Dan Krane
A1 - Michael Raymer
A1 - J. Gilder
A1 - K. Inman
A1 - Travis Doom
ER -
TY - CONF
T1 - SA-REST: Using Semantics to Empower RESTful Services and Smashups with Better Interoperability and Mediation
Y1 - 2008
A1 - Amit Sheth
A1 - Karthik Gomadam
KW - Web 2.0
KW - mashups
KW - SA-REST
KW - Smashups
PB - Semantic Technology Conference
ER -
TY - JOUR
T1 - Scalable Semantic Analytics on Social Networks for Addressing the Problem of Conflict of Interest Detection
JF - ACM Transactions on the Web
Y1 - 2008
A1 - Boanerges Aleman-Meza
A1 - Meenakshi Nagarajan
A1 - Li Ding
A1 - Amit Sheth
A1 - Ismailcem Budak Arpinar
A1 - Anupam Joshi
A1 - Timothy Finin
KW - Information Systems
KW - information systems applications
AB - In this paper, we demonstrate the applicability of semantic techniques for detection of Conflict of Interest (COI). We explain the common challenges involved in building scalable Semantic Web applications, in particular those addressing connecting-the-dots problems. We describe in detail the challenges involved in two important aspects of building Semantic Web applications, namely, data acquisition and entity disambiguation (or reference reconciliation). We extend upon our previous work where we integrated the collaborative network of a subset of DBLP researchers with persons in a Friend-of-a-Friend social network (FOAF). Our method finds the connections between people, measures collaboration strength, and includes heuristics that use friendship/affiliation information to provide an estimate of potential COI in a peer-review scenario. Evaluations are presented by measuring what could have been the COI between accepted papers in various conference tracks and their respective program committee members. The experimental results demonstrate that scalability can be achieved by using a dataset of over 3 million entities (all bibliographic data from DBLP and a large collection of FOAF documents).
PB - ACM
VL - 2
CP - 1
ER -
TY - CONF
T1 - Segmenting Brain Tumors Using Pseudo-Conditional Random Fields
T2 - Segmenting Brain Tumors Using Pseudo-Conditional Random Fields
Y1 - 2008
A1 - M. Brown
A1 - A. Murtha
A1 - C. Lee
A1 - Shaojun Wang
A1 - R. Greiner
JA - Segmenting Brain Tumors Using Pseudo-Conditional Random Fields
ER -
TY - JOUR
T1 - Semantic Knowledge Facilities for a Web-based Recipe Database System Supporting Personalization
Y1 - 2008
A1 - Qing Li
A1 - Liping Wang
A1 - Guozhu Dong
A1 - Yu Li
ER -
TY - JOUR
T1 - Semantic Matchmaking of Web Resources with Local Closed-World Reasoning
JF - International Journal of e-Commerce
Y1 - 2008
A1 - Stephan Grimm
A1 - Pascal Hitzler
KW - closed-world reasoning
KW - local closed world reasoning
AB - Ontology languages like OWL allow for semantically rich annotation of resources, such as products advertised at an electronic online marketplace, while the Description Logic (DL) formalism underlying OWL provides reasoning techniques to perform matchmaking on such annotations. We identify peculiarities in the use of DL inferences for matchmaking which are due to the open-world semantics of OWL, and we analyse the use of local closed-world reasoning for its applicability to matchmaking. In particular, we investigate two nonmonotonic extensions to DL, namely autoepistemic DLs and DLs with circumscription, for their suitability of realising local closed-world reasoning in the matchmaking context to overcome these problems. We discuss their different characteristics by means of an elaborate example of an electronic marketplace for PC product catalogues from the eCommerce domain and demonstrate how these formalisms can be used to realise such scenarios.
ER -
TY - JOUR
T1 - Semantic Provenance for eScience: Managing the Deluge of Scientific Data
Y1 - 2008
A1 - Cory Henson
A1 - Amit Sheth
A1 - Satya S. Sahoo
KW - semantic provenance
AB - Provenance information in eScience is metadata that's critical to effectively manage the exponentially increasing volumes of scientific data from industrial-scale experiment protocols. Semantic provenance, based on domain-specific provenance ontologies, lets software applications unambiguously interpret data in the correct context. The semantic provenance framework for eScience data comprises expressive provenance information and domain-specific provenance ontologies and applies this information to data management. ...
ER -
TY - CONF
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Amit Sheth
A1 - Cory Henson
KW - Semantic Sensor Web
KW - SSW
PB - Semantic Technology Conference
ER -
TY - CONF
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Cory Henson
A1 - Amit Sheth
A1 - Satya S. Sahoo
PB - AGU Fall Meeting
ER -
TY - JOUR
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Cory Henson
A1 - Amit Sheth
KW - SSW
AB - In March 2008, heavy rainstorms across the Midwestern region of the US caused many rivers to breach their banks. Residents of Valley Park, a small town along the Meramec River, Missouri, had to decide whether to rely on a newly constructed levee or abandon their homes for higher ground. Although the levee held, many chose the latter option and fled their homes; it was a chaotic situation that might have been avoided through access to better situational knowledge regarding the current water pressure and the levee's structural integrity. Had pressure sensors been embedded in the levee, they might have provided accurate real-time information that let residents make informed decisions about the safety of the levee, their homes, and themselves. This scenario demonstrates the increasingly critical role of sensors that collect and distribute observations of our world in our everyday lives. In recent years, sensors have been increasingly adopted by a diverse array of disciplines, such as meteorology for weather forecasting and wildfire detection (www.met.utah.edu/mesowest/), civic planning for traffic management (www.buckeyetraffic.org/), satellite imaging for earth and space observation (http://vast.uah.edu/), medical sciences for patient care using biometric sensors (www.liebertonline.com/doi/abs/10.1089/109350703322682531), and homeland security for radiation and biochemical detection at ports (www.msnbc.msn.com/id/8092280). Sensors are thus distributed across the globe, leading to an avalanche of data about our environment. The rapid development and deployment of sensor technology involves many different types of sensors, both remote and in situ, with diverse capabilities such as range, modality, and maneuverability. Today, it's possible to use sensor networks to detect and identify a multitude of observations, from simple phenomena to complex events and situations. The lack of integration and communication between these networks, however, often isolates important data streams and intensifies the existing problem of too much data and not enough knowledge. With a view to addressing this problem, we discuss a semantic sensor Web (SSW) in which sensor data is annotated with semantic metadata to increase interoperability as well as provide contextual information essential for situational knowledge. In particular, this involves annotating sensor data with spatial, temporal, and thematic semantic metadata.
ER -
TY - CONF
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Cory Henson
A1 - Amit Sheth
KW - Semantic Sensor Web
KW - SSW
PB - Semantic Interoperability Community of Practice (SICoP) Conference: Building Semantic Interoperability Solutions for Information Sharing and Integration
ER -
TY - CHAP
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Cory Henson
A1 - Amit Sheth
A1 - Satya S. Sahoo
KW - Semantic Sensor Web
KW - SSW
PB - Semantic Interoperability Community of Practice (SICoP): Sensor Standards Harmonization WG
ER -
TY - CHAP
T1 - Semantic Sensor Web
Y1 - 2008
A1 - Amit Sheth
A1 - Cory Henson
A1 - Satya S. Sahoo
AB - In March 2008, heavy rainstorms across the Midwestern region of the US caused many rivers to breach their banks. Residents of Valley Park, a small town along the Meramec River, Missouri, had to decide whether to rely on a newly constructed levee or abandon their homes for higher ground. Although the levee held, many chose the latter option and fled their homes; it was a chaotic situation that might have been avoided through access to better situational knowledge regarding the current water pressure and the levee's structural integrity. Had pressure sensors been embedded in the levee, they might have provided accurate real-time information that let residents make informed decisions about the safety of the levee, their homes, and themselves. This scenario demonstrates the increasingly critical role of sensors that collect and distribute observations of our world in our everyday lives. In recent years, sensors have been increasingly adopted by a diverse array of disciplines, such as meteorology for weather forecasting and wildfire detection (www.met.utah.edu/mesowest/), civic planning for traffic management (www.buckeyetraffic.org/), satellite imaging for earth and space observation (http://vast.uah.edu/), medical sciences for patient care using biometric sensors (www.liebertonline.com/doi/abs/10.1089/109350703322682531), and homeland security for radiation and biochemical detection at ports (www.msnbc.msn.com/id/8092280). Sensors are thus distributed across the globe, leading to an avalanche of data about our environment. The rapid development and deployment of sensor technology involves many different types of sensors, both remote and in situ, with diverse capabilities such as range, modality, and maneuverability. Today, it's possible to use sensor networks to detect and identify a multitude of observations, from simple phenomena to complex events and situations. The lack of integration and communication between these networks, however, often isolates important data streams and intensifies the existing problem of too much data and not enough knowledge. With a view to addressing this problem, we discuss a semantic sensor Web (SSW) in which sensor data is annotated with semantic metadata to increase interoperability as well as provide contextual information essential for situational knowledge. In particular, this involves annotating sensor data with spatial, temporal, and thematic semantic metadata.
PB - ARC Research Network on Intelligent Sensors, Sensor Networks and Information Processing
ER -
TY - BOOK
T1 - Semantic Web. Grundlagen
Y1 - 2008
A1 - Sebastian Rudolph
A1 - York Sure
A1 - Markus Krotzsch
A1 - Pascal Hitzler
ER -
TY - CONF
T1 - The Semantic Web - ISWC 2008
Y1 - 2008
A1 - Diana Maynard
A1 - M. Paolucci
A1 - Amit Sheth
A1 - Timothy Finin
A1 - Steffen Staab
A1 - Krishnaprasad Thirunarayan
A1 - Mike Dean
KW - OWL and non-standard reasoning with ontologies and semantic social networks and ontology alignment and business applications and ontology engineering and user interfaces and Web data and knowledge and semantic Web services and semantic retrieval and descr
PB - 7th International Semantic Web Conference, ISWC 2008
ER -
TY - Generic
T1 - A Semantic Web Model for the Personalized e-Learning
T2 - 9th International Information Technology Conference
Y1 - 2008
A1 - S. Lalithsena
A1 - K. Hewagamage
A1 - K. Jayaratne
AB - Personalized e-Learning aims at adapting the learning process of e-Learning based on the needs and preferences of the learner instead of providing a one-size-fits-all learning model as in conventional e-Learning. Existing solutions for personalized e-Learning depend on the metadata and various standards defined on the content and user. This paper proposes a semantic web model for personalized e-Learning through which personalization can be further enhanced. It describes a proper mechanism to handle learning content by maintaining a separate knowledge layer for knowledge representation and another layer to store the content according to the knowledge represented in the above layer. Both the Subject Domain of the learning content and the User Profile for the learner’s information would be modeled using an Ontology approach in the semantic web. The proposed model generates the most matching learning path for the learner in a goal oriented way and then could be used to extract relevant learning content.
JA - 9th International Information Technology Conference
PB - ICTer
CY - Negombo, Sri Lanka
ER -
TY - CONF
T1 - Semantic Web: Promising Technologies, Current Applications and Research Directions
Y1 - 2008
A1 - Amit Sheth
KW - Semantic Web
PB - Melbourne, University of Adelaide, University of Melbourne, Victoria University
ER -
TY - CHAP
T1 - Semantics and Services enabled Problem Solving Environment for Trypanosoma cruzi, NIH RO1 grant (collaborations with NCBO)
Y1 - 2008
A1 - Satya S. Sahoo
KW - Bioinformatics
KW - NCBO
KW - NIH
KW - Research Grant
KW - T.cruzi
PB - Kno.e.sis Lab
ER -
TY - JOUR
T1 - Semantics enhanced Services: METEOR-S, SAWSDL and SA-REST
Y1 - 2008
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - Karthik Gomadam
KW - data mediation
KW - publication and discovery of semantic Web services
AB - Services Research Lab at the Knoesis center and the LSDIS lab at University of Georgia have played a significant role in advancing the state of research in the areas of workflow management, semantic Web services and service oriented computing. Starting with the METEOR workflow management system in the 90's, researchers have addressed key issues in the area of semantic Web services and more recently, in the domain of RESTful services and Web 2.0. In this article, we present a brief discussion on the various contributions of METEOR-S including SAWSDL, publication and discovery of semantic Web services, data mediation, dynamic configuration and adaptation of Web processes. We finally discuss our current and future research in the area of RESTful services.
ER -
TY - CHAP
T1 - Semantics to Empower Services Science: Using Semantics at Middleware, Web Services and Business Levels
Y1 - 2008
A1 - Amit Sheth
PB - University of New South Wales, Australia
ER -
TY - JOUR
T1 - Services Mashups: The New Generation of Web Applications
Y1 - 2008
A1 - S. Dustdar
A1 - Amit Sheth
A1 - Djamal Benslimane
KW - service mashups
KW - Web APIs
KW - Web Services
KW - WSDL
AB - Web services are becoming a major technology for deploying automated interactions between distributed and heterogeneous applications, and for connecting business processes. Service mashups indicate a way to create new Web applications by combining existing Web resources utilizing data and Web APIs. They facilitate the design and development of novel and modern Web applications based on easy-to-accomplish end-user service compositions.
ER -
TY - CHAP
T1 - Spieltheorie
T2 - MINT (Mathematik, Informatik, Naturwissenschaften, Technik) Vol. 18
Y1 - 2008
A1 - Pascal Hitzler
A1 - Alexander Chocholaty
A1 - Gudrun Kalmbach
JA - MINT (Mathematik, Informatik, Naturwissenschaften, Technik) Vol. 18
ER -
TY - JOUR
T1 - The State of the Art in Flow Visualization: Structure-Based Techniques
JF - SimVis 2008
Y1 - 2008
A1 - Gerik Scheuermann
A1 - Tobias Salzbrunn
A1 - Heike Jaenicke
A1 - Thomas Wischgoll
AB - The analysis of morphometric data of the vasculature of any organ requires appropriate visualization methods to be applied due to the vast number of vessels that can be present in such data. In addition, the geometric properties of vessel segments, i.e. being rather long and thin, can make it difficult to judge on relative position, despite depth cues such as proper lighting and shading of the vessels. Virtual environments that provide true 3-D visualization of the data can help enhance the visual perception. Ideally, the system should be relatively cost-effective. Hence, this paper describes a Linux-based virtual environment that utilizes a 50 inch plasma screen as its main display. The overall cost of the entire system is less than $3,500 which is considerably less than other commercial systems. The system was successfully used for visualizing vascular data sets providing true three-dimensional perception of the morphometric data.
ER -
TY - CONF
T1 - Statistical Population Thresholding: A novel non-linear thresholding method for peak and baseline selection in biological spectra containing thermally generated noise
Y1 - 2008
A1 - D. Homer
A1 - Nicholas Reo
A1 - Michael Raymer
ER -
TY - CHAP
T1 - Strategic Importance of Higher Education and Research in Positioning Gujarat for Global Competitiveness
Y1 - 2008
A1 - Amit Sheth
PB - Global Gujarat & Its Diaspora
ER -
TY - CONF
T1 - Substructure Similarity Search in Chinese Recipes
T2 - Substructure Similarity Search in Chinese Recipes
Y1 - 2008
A1 - Guozhu Dong
A1 - Yu Yang
A1 - Qing Li
A1 - Liping Wang
A1 - Na Li
JA - Substructure Similarity Search in Chinese Recipes
ER -
TY - CHAP
T1 - A Survey of Multiplicative Data Perturbation for Privacy Preserving Data Mining
Y1 - 2008
A1 - Ling Liu
A1 - Keke Chen
KW - Multiplicative Data Perturbation
KW - Privacy Preserving Data Mining
AB - The major challenge of data perturbation is to achieve the desired balance between the level of privacy guarantee and the level of data utility. Data privacy and data utility are commonly considered as a pair of conflicting requirements in privacy-preserving data mining systems and applications. Multiplicative perturbation algorithms aim at improving data privacy while maintaining the desired level of data utility by selectively preserving the mining task and model specific information during the data perturbation process. By preserving the task and model specific information, a set of 'transformation-invariant data mining models' can be applied to the perturbed data directly, achieving the required model accuracy. Often a multiplicative perturbation algorithm may find multiple data transformations that preserve the required data utility. Thus the next major challenge is to find a good transformation that provides a satisfactory level of privacy guarantee. In this chapter, we review three representative multiplicative perturbation methods: rotation perturbation, projection perturbation, and geometric perturbation, and discuss the technical issues and research challenges. We first describe the mining task and model specific information for a class of data mining models, and the transformations that can (approximately) preserve the information. Then we discuss the design of appropriate privacy evaluation models for multiplicative perturbations, and give an overview of how we use the privacy evaluation model to measure the level of privacy guarantee in the context of different types of attacks.
ER -
TY - ABST
T1 - Targeted Content Delivery for Social Media Content
Y1 - 2008
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
A1 - Kamal Baid
A1 - Shaojun Wang
KW - Mutual Information
KW - Contextual keywords
KW - Contextual Content Delivery
KW - Social Media Content
AB - Spotting contextually relevant keywords is fundamental to effective content suggestions on the Web. In this regard, misspellings, entity variations and off-topic discussions in content from Social Media pose unique challenges. Here, we present an algorithm that assists content delivery systems by identifying contextually relevant keywords and eliminating off-topic keywords. A preliminary user study over data from MySpace and Facebook clearly suggests the usefulness of our work in delivering more targeted content suggestions.
ER -
TY - CONF
T1 - TcruziKB: Enabling Complex Queries for Genomic Data Exploration
T2 - 2nd IEEE International Conference on Semantic Computing
Y1 - 2008
A1 - Amit Sheth
A1 - Bobby Mcknight
A1 - Pablo N. Mendes
A1 - Jessica Kissinger
KW - Genome Analysis
KW - Querying
KW - rdf
KW - sparql
AB - We developed a novel analytical environment to aid in the examination of the extensive amount of interconnected data available for genome projects. Our focus is to enable flexibility and abstraction from implementation details, while retaining the expressivity required for post-genomic research. To achieve this goal, we associated genomics data to ontologies and implemented a query formulation and execution
JA - 2nd IEEE International Conference on Semantic Computing
PB - 2nd IEEE International Conference on Semantic Computing (ICSC 2008)
CY - Santa Clara, CA, USA
ER -
TY - CONF
T1 - Terminological Reasoning in SHIQ with Ordered Binary Decision Diagrams
T2 - Terminological Reasoning in SHIQ with Ordered Binary Decision Diagrams
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We present a new algorithm for reasoning in the description logic SHIQ, which is the most prominent fragment of the Web Ontology Language OWL. The algorithm is based on ordered binary decision diagrams (OBDDs) as a data structure for storing and operating on large model representations. We thus draw on the success and the proven scalability of OBDD-based systems. To the best of our knowledge, we present the very first algorithm for using OBDDs for reasoning with general Tboxes.
JA - Terminological Reasoning in SHIQ with Ordered Binary Decision Diagrams
ER -
TY - ABST
T1 - Terminological Reasoning in SHIQ with Ordered Binary Decision Diagrams
Y1 - 2008
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - We present a new algorithm for reasoning in the description logic SHIQ, which is the most prominent fragment of the Web Ontology Language OWL. The algorithm is based on ordered binary decision diagrams (OBDDs) as a data structure for storing and operating on large model representations. We thus draw on the success and the proven scalability of OBDD-based systems. To the best of our knowledge, we present the very first algorithm for using OBDDs for reasoning with general Tboxes.
ER -
TY - ABST
T1 - Text Analytics for Semantic Computing - the good, the bad and the ugly
Y1 - 2008
A1 - Cartic Ramakrishnan
A1 - Meenakshi Nagarajan
A1 - Amit Sheth
KW - semantic computing
KW - text mining
PB - International Conference on Semantic Computing, 2008
ER -
TY - JOUR
T1 - A Time and Dose Response Metabonomics Study of D-serine Toxicity in Rats
Y1 - 2008
A1 - Michael Raymer
A1 - W. Couch
A1 - Nicholas J. DelRaso
A1 - P. Erson
A1 - A. Neuforth
A1 - Nicholas Reo
A1 - D. Mahle
ER -
TY - CONF
T1 - Trada: a Tree-Based Ranking Function Adaptation Approach
T2 - 17th ACM Conference on Information and Knowledge Management
Y1 - 2008
A1 - Keke Chen
A1 - Rongqing Lu
A1 - C.K. Wong
A1 - Gordon Sun
A1 - Larry Heck
A1 - Belle Tseng
AB - Machine Learned Ranking approaches have shown successes in web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, i.e., the problem of insufficient training data, which has significantly limited the fast development and deployment of machine learned ranking functions for different web search domains. In this paper, we propose a new approach called tree based ranking function adaptation (tree adaptation) to address this problem. Tree adaptation assumes that ranking functions are trained with regression-tree based modeling methods, such as Gradient Boosting Trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. The unique features include (1) it can automatically identify the part of model that needs adjustment for the new domain, (2) it can appropriately weight training examples considering both local and global distributions. Experiments are performed to show that tree adaptation can provide better-quality ranking functions for a new domain, compared to other modeling methods.
JA - 17th ACM Conference on Information and Knowledge Management
PB - ACM
CY - Napa Valley, California
ER -
TY - JOUR
T1 - Traveling the Semantic Web through Space, Theme and Time
JF - IEEE Educational Activities Department
Y1 - 2008
A1 - Amit Sheth
A1 - Matthew Perry
KW - Domain Ontology
KW - Events
KW - EventWeb
KW - rdf
KW - SOA
KW - Spatial Ontology
KW - Spatial Relationships
KW - Spatio-temporal-thematic Querying
KW - Temporal Ontology
KW - Temporal Relationships
AB - In this installment of Semantics and Services, we further develop the idea of spatial, temporal, and thematic (STT) processing of semantic Web data and describe the Web infrastructure needed to support it. Starting from Ramesh Jain's vision of the EventWeb as a view of what's possible with a Web that better accommodates all three dimensions of event-related information (thematic, spatial, and temporal), we outline the architecture needed to support it and current research that aims to realize it.
ER -
TY - CONF
T1 - Unsupervised Discovery of Compound Entities for Relationship Extraction
T2 - Unsupervised Discovery of Compound Entities for Relationship Extraction
Y1 - 2008
A1 - Shaojun Wang
A1 - Cartic Ramakrishnan
A1 - Pablo N. Mendes
A1 - Amit Sheth
AB - In this paper we investigate unsupervised population of a biomedical ontology via information extraction from biomedical literature. Relationships in text seldom connect simple entities. We therefore focus on identifying compound entities rather than mentions of simple entities. We present a method based on rules over grammatical dependency structures for unsupervised segmentation of sentences into compound entities and relationships. We complement the rule-based approach with a statistical component that prunes structures with low information content, thereby reducing false positives in the prediction of compound entities, their constituents and relationships. The extraction is manually evaluated with respect to the UMLS Semantic Network by analyzing the conformance of the extracted triples with the corresponding UMLS relationship type definitions.
JA - Unsupervised Discovery of Compound Entities for Relationship Extraction
ER -
TY - JOUR
T1 - Validation of Image-Based Extraction Method for Morphometry of Coronary Arteries
JF - Annals of Biomedical Engineering
Y1 - 2008
A1 - Jenny Choy
A1 - Erik Ritman
A1 - Ghassan Kassab
A1 - Thomas Wischgoll
AB - An accurate analysis of the spatial distribution of blood flow in any organ must be based on detailed morphometry (diameters, lengths, vessel numbers, and branching pattern) of the organ vasculature. Despite the significance of detailed morphometric data, there is relative scarcity of data on 3D vascular anatomy. One of the major reasons is that the process of morphometric data collection is labor intensive. The objective of this study is to validate a novel segmentation algorithm for semi-automation of morphometric data extraction. The utility of the method is demonstrated in porcine coronary arteries imaged by computerized tomography (CT). The coronary arteries of five porcine hearts were injected with a contrast-enhancing polymer. The coronary arterial tree proximal to 1 mm was extracted from the 3D CT images. By determining the center lines of the extracted vessels, the vessel radii and lengths were identified for various vessel segments. The extraction algorithm described in this paper is based on a topological analysis of a vector field generated by normal vectors of the extracted vessel wall. With this approach, special focus is placed on achieving the highest accuracy of the measured values. To validate the algorithm, the results were compared to optical measurements of the main trunk of the coronary arteries with microscopy. The agreement was found to be excellent with a root mean square deviation between computed vessel diameters and optical measurements of 0.16 mm (<10% of the mean value) and an average deviation of 0.08 mm. The utility and future applications of the proposed method to speed up morphometric measurements of vascular trees are discussed.
ER -
TY - CONF
T1 - WS3: International Workshop on Context-enabled Source and Service Selection, Integration and Adaptation
T2 - WS3: International Workshop on Context-enabled Source and Service Selection, Integration and Adaptation
Y1 - 2008
A1 - Ullas Nambiar
A1 - S. Elnaffar
A1 - Zakaria Maamar
A1 - Biplav Srivastava
A1 - Q. Z. Sheng
A1 - Amit Sheth
JA - WS3: International Workshop on Context-enabled Source and Service Selection, Integration and Adaptation
ER -
TY - JOUR
T1 - An XML-Based Approach to Handling Tables in Documents
Y1 - 2008
A1 - Krishnaprasad Thirunarayan
A1 - Trivikram Immaneni
AB - We explore application of XML technology for handling tables in legacy semistructured documents. Specifically, we analyze annotating heterogeneous documents containing tables to obtain a formalized XML Master document that improves traceability (hence easing verification and update) and enables manipulation using XSLT stylesheets. This approach is useful when table instances far outnumber distinct table types because the effort required to annotate a table instance is relatively less compared to formalizing table processing that respects the table's semantics. This work is also relevant for authoring new documents with tables that should be accessible to both humans and machines.
ER -
TY - CONF
T1 - The 4 X 4 Semantic Model: Exploiting Data, Functional, Non-Functional and Execution Semantics across Business Process, Workflow, Partner Services and Middleware Services Tiers
T2 - The 4 X 4 Semantic Model: Exploiting Data, Functional, Non-Functional and Execution Semantics across Business Process, Workflow, Partner Services and Middleware Services Tiers
Y1 - 2007
A1 - Amit Sheth
A1 - Karthik Gomadam
AB - Business processes in the global environment increasingly encompass multiple partners and complex, rapidly changing requirements. In this context it is critical that strategic business objectives align with and map accurately to systems that support flexible and dynamic business processes. To support the demanding requirements of global business processes, we propose a comprehensive, unifying 4 X 4 Semantic Model that uses Semantic Templates to link four tiers of implementation with four types of semantics. The four tiers are the Business Process Tier, the Workflow Enactment Tier, the Partner Services Tier, and the Middleware Services Tier. The four types of semantics are Data Semantics, Function Semantics, Nonfunctional Semantics, and Execution Semantics. Our model encompasses services architectures that include enterprise class WSDL-based Web services as well as the lightweight but broadly used REST-based services.
JA - The 4 X 4 Semantic Model: Exploiting Data, Functional, Non-Functional and Execution Semantics across Business Process, Workflow, Partner Services and Middleware Services Tiers
ER -
TY - CONF
T1 - Acquisition of OWL DL Axioms from Lexical Resources
T2 - 4th European Semantic Web Conference
Y1 - 2007
A1 - Johanna Volker
A1 - Philipp Cimiano
A1 - Pascal Hitzler
AB - State-of-the-art research on automated learning of ontologies from text currently focuses on inexpressive ontologies. The acquisition of complex axioms involving logical connectives, role restrictions, and other expressive features of the Web Ontology Language OWL remains largely unexplored. In this paper, we present a method and implementation for enriching inexpressive OWL ontologies with expressive axioms which is based on a deep syntactic analysis of natural language definitions. We argue that it can serve as a core for a semi-automatic ontology engineering process supported by a methodology that integrates methods for both ontology learning and evaluation. The feasibility of our approach is demonstrated by generating complex class descriptions from Wikipedia definitions and from a fishery glossary provided by the Food and Agriculture Organization of the United Nations.
JA - 4th European Semantic Web Conference
CY - Innsbruck, Austria
ER -
TY - BOOK
T1 - Advances in Data and Web Management: Proceedings of the Joint International ApWeb/WAIM Conference on Web-Age Information Management
Y1 - 2007
A1 - Xuemin Lin
A1 - Guozhu Dong
A1 - Yu Yang
A1 - Jeffrey Xu Yu
A1 - Wei Wang
ER -
TY - CONF
T1 - Advances in Semantics for Web Services 2007
Y1 - 2007
A1 - Amit Sheth
A1 - Steven Battle
A1 - Dumitru Roman
A1 - John Domingue
A1 - David Martin
PB - Introduction to the 2nd Edition of the Workshop "Advances in Semantics for Web Services 2007" (Semantics4ws 2007). Business Process Management Workshops
ER -
TY - CONF
T1 - An Algorithm for Computing Inconsistency Measurement by Paraconsistent Semantics
T2 - Proceedings of Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
Y1 - 2007
A1 - Yue Ma
A1 - Guilin Qi
A1 - Zuoquan Lin
A1 - Pascal Hitzler
AB - Measuring inconsistency in knowledge bases has been recognized as an important problem in many research areas. Most of the approaches proposed for measuring inconsistency are based on paraconsistent semantics. However, very few of them provide an algorithm for implementation. In this paper, we first give a four-valued semantics for first-order logic and then propose an approach for measuring the degree of inconsistency based on this four-valued semantics. After that, we propose an algorithm to compute the inconsistency degree by introducing a new semantics for first order logic, which is called S[n]-4 semantics.
JA - Proceedings of Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
CY - Hammamet, Tunisia
ER -
TY - CONF
T1 - Algorithms for Paraconsistent Reasoning with OWL
T2 - 4th European Semantic Web Conference, ESWC2007
Y1 - 2007
A1 - Yue Ma
A1 - Zuoquan Lin
A1 - Pascal Hitzler
JA - 4th European Semantic Web Conference, ESWC2007
CY - Innsbruck, Austria
ER -
TY - CONF
T1 - Altering Document Term Vectors for Classification - Ontologies as Expectations of Co-occurrence
T2 - 16th World Wide Web Conference (WWW2007)
Y1 - 2007
A1 - Meenakshi Nagarajan
A1 - Mustafa Uysal
A1 - Amit Sheth
A1 - Arif Merchant
A1 - Kimberly Keeton
A1 - Marcos Aguilera
KW - Vector Space Models and Supervised Document Classification and Background domain knowledge and Ranking semantic relationships
JA - 16th World Wide Web Conference (WWW2007)
PB - 16th World Wide Web Conference (WWW2007)
CY - Banff, Canada
ER -
TY - CONF
T1 - Any-World Access to OWL from Prolog
T2 - 30th Annual German Conference on AI
Y1 - 2007
A1 - Tobias Matzner
A1 - Pascal Hitzler
AB - The W3C standard OWL provides a decidable language for representing ontologies. While its use is rapidly spreading, efforts are being made by researchers worldwide to augment OWL with additional expressive features or by interlacing it with other forms of knowledge representation, in order to make it applicable for even further purposes. In this paper, we integrate OWL with one of the most successful and most widely used forms of knowledge representation, namely Prolog, and present a hybrid approach which layers Prolog on top of OWL in such a way that the open-world semantics of OWL becomes directly accessible within the Prolog system.
JA - 30th Annual German Conference on AI
PB - Advances in Artificial Intelligence, 30th Annual German Conference on AI, KI 2007
CY - KI, Osnabruck, Germany
ER -
TY - CONF
T1 - Automatic Composition of Semantic Web Services using Process Mediation
T2 - 9th International Conference on Enterprise Information Systems (ICEIS 2007)
Y1 - 2007
A1 - Zixin Wu
A1 - Karthik Gomadam
A1 - Ajith Ranabahu
A1 - Amit Sheth
A1 - John Miller
AB - Web service composition has quickly become a key area of research in the services oriented architecture community. One of the challenges in composition is the existence of heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or in cases of the efforts seeking to provide largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflow for real-world problems. In this paper, we present a planning-based approach to solve both the process heterogeneity and data heterogeneity problems. Our system successfully outputs an executable BPEL file which correctly solves non-trivial real-world process specifications outlined in the 2006 SWS Challenge.
JA - 9th International Conference on Enterprise Information Systems (ICEIS 2007)
CY - Funchal, Portugal
ER -
TY - ABST
T1 - Automatic Composition of Semantic Web Services using Process Mediation
Y1 - 2007
A1 - Zixin Wu
A1 - John Miller
A1 - Amit Sheth
A1 - Karthik Gomadam
A1 - Ajith Ranabahu
KW - Semantic Web Services and Process and Data Mediation
AB - Web service composition has quickly become a key area of research in the services oriented architecture community. One of the challenges in composition is the existence of heterogeneities across independently created and autonomously managed Web service requesters and Web service providers. Previous work in this area either involved significant human effort or in cases of the efforts seeking to provide largely automated approaches, overlooked the problem of data heterogeneities, resulting in partial solutions that would not support executable workflow for real-world problems. In this paper, we present a planning-based approach to solve both the process heterogeneity and data heterogeneity problems. Our system successfully outputs an executable BPEL file which correctly solves non-trivial real-world process specifications outlined in the 2006 SWS Challenge.
ER -
TY - JOUR
T1 - Beyond SAWSDL: A Game Plan for Broader Adoption of Semantic Web Services
JF - Trends & Controversies: Semantic Web Services, Part 2
Y1 - 2007
A1 - Amit Sheth
KW - SAWSDL
KW - Semantic Web Services
KW - WSDL-S
ER -
TY - CHAP
T1 - Chapter 4, Knowledge Representation on the Semantic Web
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - CHAP
T1 - Chapter 5: Modeling and Aggregating Social Network Data
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - CHAP
T1 - Chapter 6: Developing Social-Semantic Applications
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - CHAP
T1 - Chapter 9, Ontologies Are Us: Emergent Semantics in Folksonomy Systems
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - CONF
T1 - Charting the Winds of Evolutionary Change: Bioinformatics Methods for Identifying Bias in Prokaryotic Codon Usage.
Y1 - 2007
A1 - Michael Raymer
ER -
TY - CONF
T1 - Charting the Winds of Evolutionary Change: Bioinformatics Methods for Identifying Bias in Prokaryotic Codon Usage.
T2 - Charting the Winds of Evolutionary Change: Bioinformatics Methods for Identifying Bias in Prokaryotic Codon Usage.
Y1 - 2007
A1 - Michael Raymer
JA - Charting the Winds of Evolutionary Change: Bioinformatics Methods for Identifying Bias in Prokaryotic Codon Usage.
ER -
TY - CONF
T1 - Collecting Expertise of Researchers for Finding Relevant Experts in a Peer-Review Setting
Y1 - 2007
A1 - Delroy Cameron
A1 - Ismailcem Budak Arpinar
A1 - Boanerges Aleman-Meza
AB - We present ideas for determining the expertise of researchers across various areas of computer science and for finding relevant experts/reviewers in a peer-review setting. We explain how Semantic Web techniques for data collection and data representation using ontologies can be used in addressing this specific 'ExpertFinder' problem.
PB - 1st International ExpertFinder Workshop (EFW 2007), co-located with 7th Knowledge Web General Assembly
ER -
TY - CONF
T1 - A Comparison of Disjunctive Well-founded Semantics
Y1 - 2007
A1 - Matthias Knorr
A1 - Pascal Hitzler
AB - While the stable model semantics, in the form of Answer Set Programming, has become a successful semantics for disjunctive logic programs, a corresponding satisfactory extension of the well-founded semantics to disjunctive programs remains to be found. The many current proposals for such an extension are so diverse that even a systematic comparison between them is a challenging task. In order to aid the quest for suitable disjunctive well-founded semantics, we present a systematic approach to a comparison based on level mappings, a recently introduced framework for characterizing logic programming semantics, which was quite successfully used for comparing the major semantics for normal logic programs. We extend this framework to disjunctive logic programs, which will allow us to gain comparative insights into their different handling of negation. Additionally, we show some of the problems occurring when trying to handle minimal models (and thus disjunctive stable models) within the framework.
PB - Foundations of Artificial Intelligence (FAInt-07)
ER -
TY - CONF
T1 - Comparison of Statistical Techniques for the Analysis of Metabolic Toxicological Data Derived from NMR Spectroscopy
Y1 - 2007
A1 - B. Kelly
A1 - Travis Doom
A1 - P. Erson
A1 - Nicholas Reo
A1 - Nicholas J. DelRaso
A1 - Michael Raymer
ER -
TY - CONF
T1 - Complexity Boundaries for Horn Description Logics
T2 - The 22nd AAAI Conference on Artificial Intelligence
Y1 - 2007
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - Horn description logics (Horn-DLs) have recently started to attract attention due to the fact that their (worst-case) data complexities are in general lower than their overall (i.e. combined) complexities, which makes them attractive for reasoning with large ABoxes. However, the natural question whether Horn-DLs also provide advantages for TBox reasoning has hardly been addressed so far. In this paper, we therefore provide a thorough and comprehensive analysis of the combined complexities of Horn-DLs. While the combined complexity for many Horn-DLs turns out to be the same as for their non-Horn counterparts, we identify subboolean DLs where Hornness simplifies reasoning.
JA - The 22nd AAAI Conference on Artificial Intelligence
CY - Vancouver, British Columbia, Canada
ER -
TY - ABST
T1 - Complexity of Horn Description Logics
Y1 - 2007
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - Horn description logics (Horn-DLs) have recently started to attract attention due to the fact that their (worst-case) data complexities are in general lower than their overall (i.e. combined) complexities, which makes them attractive for reasoning with large ABoxes. However, the natural question whether Horn-DLs also provide advantages for TBox reasoning has hardly been addressed so far. In this paper, we therefore provide a thorough and comprehensive analysis of the combined complexities of Horn-DLs. While the combined complexity for many Horn-DLs turns out to be the same as for their non-Horn counterparts, we identify subboolean DLs where Hornness simplifies reasoning.
ER -
TY - JOUR
T1 - Computing Center-Lines: An Application of Vector Field Topology
JF - Topological Methods in Visualization 2007
Y1 - 2007
A1 - Thomas Wischgoll
AB - Flow visualization has been a very active subfield of scientific visualization in recent years. From the resulting large variety of methods this paper discusses structure-based techniques. The aim of these approaches is to partition the flow in areas of common behavior. Based on this partitioning, subsequent visualization techniques can be applied. A classification is suggested and advantages/disadvantages of the different techniques are discussed as well.
ER -
TY - CONF
T1 - Conjunctive Queries for a Tractable Fragment of OWL 1.1
T2 - 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC
Y1 - 2007
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - Despite the success of the Web Ontology Language OWL, the development of expressive means for querying OWL knowledge bases is still an open issue. In this paper, we investigate how a very natural and desirable form of queries, namely conjunctive ones, can be used in conjunction with OWL such that one of the major design criteria of the latter, namely decidability, can be retained. More precisely, we show that querying the tractable fragment EL++ of OWL 1.1 is decidable. We also provide a complexity analysis and show that querying unrestricted EL++ is undecidable.
JA - 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC
PB - The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC
CY - Busan, Korea
ER -
TY - CHAP
T1 - The Core Method: Connectionist Model Generation for First-Order Logic Programs
Y1 - 2007
A1 - Sebastian Bader
A1 - Steffen Holldobler
A1 - Andreas Witzel
A1 - Pascal Hitzler
KW - Artificial Intelligence
AB - In Artificial Intelligence, knowledge representation studies the formalisation of knowledge and its processing within machines. Techniques of automated reasoning allow a computer system to draw conclusions from knowledge represented in a machine-interpretable form. Recently, ontologies have evolved in computer science as computational artefacts to provide computer systems with a conceptual yet computational model of a particular domain of interest. In this way, computer systems can base decisions on reasoning about domain knowledge, similar to humans. This chapter gives an overview on basic knowledge representation aspects and on ontologies as used within computer systems. After introducing ontologies in terms of their appearance, usage and classification, it addresses concrete ontology languages that are particularly important in the context of the Semantic Web. The most recent and predominant ontology languages and formalisms are presented in relation to each other and a selection of them is discussed in more detail.
ER -
TY - CONF
T1 - CSI Revisited: The Science of Forensic DNA Analysis.
Y1 - 2007
A1 - Michael Raymer
ER -
TY - CHAP
T1 - Data Integration
Y1 - 2007
A1 - Meenakshi Nagarajan
KW - Data Integration
KW - Semantic Web
ER -
TY - CONF
T1 - Decidability Under the Well-Founded Semantics
T2 - First International Conference on Web Reasoning and Rule Systems, RR2007
Y1 - 2007
A1 - Pascal Hitzler
A1 - Natalia Cherchago
A1 - Steffen Holldobler
AB - The well-founded semantics (WFS) for logic programs is one of the few major paradigms for closed-world reasoning. With the advent of the Semantic Web, it is being used as part of rule systems for ontology reasoning, and also investigated as to its usefulness as a semantics for hybrid systems featuring combined open- and closed-world reasoning. Even in its most basic form, however, the WFS is undecidable. In fact, it is not even semi-decidable, which means that it is a theoretical impossibility that sound and complete reasoners for the WFS exist. Surprisingly, however, this matter has received next to no attention in research, although it was already shown in 1995 by John Schlipf [1]. In this paper, we present several conditions under which query-answering under the well-founded semantics is decidable or semi-decidable. To the best of our knowledge, these are the very first results on such conditions.
JA - First International Conference on Web Reasoning and Rule Systems, RR2007
CY - Innsbruck, Austria
ER -
TY - CONF
T1 - Description Logic Programs: Normal Forms
Y1 - 2007
A1 - Andreas Eberhart
A1 - Pascal Hitzler
AB - The relationship and possible interplay between different knowledge representation and reasoning paradigms is a fundamental topic in artificial intelligence. For expressive knowledge representation for the Semantic Web, two different paradigms - namely Description Logics (DLs) and Logic Programming - are the two most successful approaches. A study of their exact relationships is thus paramount. An intersection of OWL with (function-free non-disjunctive) Datalog, called DLP (for Description Logic Programs), has been described in [1,2]. We provide normal forms for DLP in Description Logic syntax and in Datalog syntax, thus providing a bridge for the researcher and user who is familiar with either of these paradigms. We argue that our normal forms are the most convenient way to define DLP for teaching and dissemination purposes.
PB - FAInt-07
ER -
TY - CHAP
T1 - Developing Social-Semantic Applications
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - JOUR
T1 - Document Clustering and Ranking System for Exploring MEDLINE Citations
Y1 - 2007
A1 - Ling Liu
A1 - Keke Chen
KW - citation count (CC)
KW - text summarization
AB - Objective: A major problem faced in biomedical informatics involves how best to present information retrieval results. When a single query retrieves many results, simply showing them as a long list often provides poor overview. With a goal of presenting users with reduced sets of relevant citations, this study developed an approach that retrieved and organized MEDLINE citations into different topical groups and prioritized important citations in each group. Design: A text mining system framework for automatic document clustering and ranking organized MEDLINE citations following simple PubMed queries. The system grouped the retrieved citations, ranked the citations in each cluster, and generated a set of keywords and MeSH terms to describe the common theme of each cluster. Measurements: Several possible ranking functions were compared, including citation count per year (CCPY), citation count (CC), and journal impact factor (JIF). We evaluated this framework by identifying as 'important' those articles selected by the Surgical Oncology Society. Results: Our results showed that CCPY outperforms CC and JIF, i.e., CCPY better ranked important articles than did the others. Furthermore, our text clustering and knowledge extraction strategy grouped the retrieval results into informative clusters as revealed by the keywords and MeSH terms extracted from the documents in each cluster. Conclusions: The text mining system studied effectively integrated text clustering, text summarization, and text ranking and organized MEDLINE retrieval results into different topical groups.
ER -
TY - JOUR
T1 - Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions
Y1 - 2007
A1 - Xiuzhen Zhang
A1 - Pauline Lienhua Chou
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Efficient OWL Reasoning with Logic Programs - Evaluations
T2 - International Conference on Web Reasoning and Rule Systems, RR2007
Y1 - 2007
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Michael Sintek
A1 - Denny Vrandecic
A1 - Pascal Hitzler
AB - We report on efficiency evaluations concerning two different approaches to using logic programming for OWL [1] reasoning and show how the two approaches can be combined. Introduction. Scalability of reasoning remains one of the major obstacles in leveraging the full power of the Web Ontology Language OWL [1] for practical applications. Among the many possible approaches to address scalability, one of them concerns the use of logic programming for this purpose. It was recently shown that reasoning in Horn-SHIQ [2-4] can be realised by invoking Prolog systems on the output of the
JA - International Conference on Web Reasoning and Rule Systems, RR2007
CY - Innsbruck, Austria
ER -
TY - ABST
T1 - Engineering Mathematics Education at Wright State University: Uncorking the First-Year Bottleneck
Y1 - 2007
A1 - N. Klingbeil
A1 - Michael Raymer
A1 - David Reynolds
A1 - K. Rattan
A1 - R. Mercer
JA - Engineering Mathematics Education at Wright State University: Uncorking the First-Year Bottleneck
ER -
TY - CONF
T1 - Estimating the Cardinality of RDF Graph Patterns
Y1 - 2007
A1 - Angela Maduko
A1 - Paul Schliekelman
A1 - Kemafor Anyanwu
A1 - Amit Sheth
KW - RDF
KW - RDF graph patterns
KW - RDF Semantic Summary
KW - RDF Structural Summary
KW - RDF Query processing
KW - Statistical Summaries
KW - Pattern Cardinality Estimation
AB - Most RDF query languages allow for graph structure search through a conjunction of triples, which is typically processed using join operations. A key factor in optimizing joins is determining the join order, which depends on the expected cardinality of intermediate results. This work proposes a pattern-based summarization framework for estimating the cardinality of RDF graph patterns. We present experiments on real-world and synthetic datasets which confirm the feasibility of our approach.
PB - 16th World Wide Web Conference (WWW2007)
ER -
TY - CONF
T1 - Evolution and Maintenance of Frequent Pattern Space When Transactions Are Removed
T2 - Proceedings of PAKDD
Y1 - 2007
A1 - Jinyan Li
A1 - Limsoon Wong
A1 - Yap-Peng Tan
A1 - Mengling Feng
A1 - Guozhu Dong
JA - Proceedings of PAKDD
ER -
TY - CONF
T1 - An experiment in integrating large biomedical knowledge resources with RDF: Application to associating genotype and phenotype information
Y1 - 2007
A1 - Amit Sheth
A1 - Satya S. Sahoo
A1 - Olivier Bodenreider
A1 - Kelly Zeng
KW - Semantic Web
KW - SPARQL
KW - Gene Ontology
KW - Entrez Gene
KW - path queries
KW - Data integration
KW - Resource Description Framework
AB - Bridging between genotype and phenotype is generally achieved through the integration of knowledge sources such as Entrez Gene (EG), Online Mendelian Inheritance in Man (OMIM) and the Gene Ontology (GO). Traditionally, such integration implies manual effort or the development of customized software. In this paper, we demonstrate how the Resource Description Framework (RDF) can be used to represent and integrate these resources and support complex queries over the unified resource. We illustrate the ...
PB - Workshop on Health Care and Life Sciences Data Integration for the Semantic Web at WWW2007
ER -
TY - JOUR
T1 - Flow Patterns in Three-Dimensional Porcine Epicardial Coronary Arterial Tree
JF - American Journal of Physiology - Heart and Circulatory Physiology
Y1 - 2007
A1 - Ghassan Kassab
A1 - Yunlong Huo
A1 - Thomas Wischgoll
AB - Flow patterns in three-dimensional porcine epicardial coronary arterial tree. Am J Physiol Heart Circ Physiol 293: H2959-H2970, 2007. First published September 7, 2007; doi:10.1152/ajpheart.00586.2007. The branching pattern of epicardial coronary arteries is clearly three-dimensional, with correspondingly complex flow patterns. The objective of the present study was to perform a detailed hemodynamic analysis using a three-dimensional finite element method in a left anterior descending (LAD) epicardial arterial tree, including main trunk and primary branches, based on computed tomography scans. The inlet LAD flow velocity was measured in an anesthetized pig, and the outlet pressure boundary condition was estimated based on scaling laws. The spatial and temporal wall shear stress (WSS), gradient of WSS (WSSG), and oscillatory shear index (OSI) were calculated and used to identify regions of flow disturbances in the vicinity of primary bifurcations. We found that low WSS and high OSI coincide with disturbed flows (stagnated, secondary, and reversed flows) opposite to the flow divider and lateral to the junction orifice of the main trunk and primary branches. High time-averaged WSSG occurs in regions of bifurcations, with the flow divider having maximum values. Low WSS and high OSI were found to be related through a power law relationship. Furthermore, zones of low time-averaged WSS and high OSI amplified for larger diameter ratio and high inlet flow rate. Hence, different focal atherosclerotic-prone regions may be explained by different physical mechanisms associated with certain critical levels of low WSS, high OSI, and high WSSG, which are strongly affected by the diameter ratio. The implications of the flow patterns for atherogenesis are enumerated.
ER -
TY - CONF
T1 - Foundations of Refinement Operators for Description Logics
T2 - 17th International Conference, ILP 2007
Y1 - 2007
A1 - Jens Lehmann
A1 - Pascal Hitzler
AB - In order to leverage techniques from Inductive Logic Programming for the learning in description logics (DLs), which are the foundation of ontology languages in the Semantic Web, it is important to acquire a thorough understanding of the theoretical potential and limitations of using refinement operators within the description logic paradigm. In this paper, we present a comprehensive study which analyses desirable properties such operators should have. In particular, we show that ideal refinement operators in general do not exist, which is indicative of the hardness inherent in learning in DLs. We also show which combinations of desirable properties are theoretically possible, thus providing an important step towards the definition of practically applicable operators.
JA - 17th International Conference, ILP 2007
CY - Corvallis, OR, USA
ER -
TY - CONF
T1 - From "Glycosyltransferase" to "Congenital Muscular Dystrophy": Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology
T2 - From "Glycosyltransferase" to "Congenital Muscular Dystrophy": Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology
Y1 - 2007
A1 - Kelly Zeng
A1 - Satya S. Sahoo
A1 - Amit Sheth
A1 - Olivier Bodenreider
AB - Entrez Gene (EG), Online Mendelian Inheritance in Man (OMIM) and the Gene Ontology (GO) are three complementary knowledge resources that can be used to correlate genomic data with disease information. However, bridging between genotype and phenotype through these resources currently requires manual effort or the development of customized software. In this paper, we argue that integrating EG and GO provides a robust and flexible solution to this problem. We demonstrate how the Resource Description Framework (RDF) developed for the Semantic Web can be used to represent and integrate these resources and enable seamless access to them as a unified resource. We illustrate the effectiveness of our approach by answering a real-world biomedical query linking a specific molecular function, glycosyltransferase, to the disorder congenital muscular dystrophy.
JA - From "Glycosyltransferase" to "Congenital Muscular Dystrophy": Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology
ER -
TY - JOUR
T1 - From "Glycosyltransferase" to "Congenital Muscular Dystrophy": Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology
Y1 - 2007
A1 - Satya Sahoo
A1 - Kelly Zeng
A1 - Olivier Bodenreider
A1 - Amit Sheth
KW - Entrez Gene
KW - Gene Ontology
KW - knowledge integration
KW - rdf
KW - Semantic Web
AB - Entrez Gene (EG), Online Mendelian Inheritance in Man (OMIM) and the Gene Ontology (GO) are three complementary knowledge resources that can be used to correlate genomic data with disease information. However, bridging between genotype and phenotype through these resources currently requires manual effort or the development of customized software. In this paper, we argue that integrating EG and GO provides a robust and flexible solution to this problem. We demonstrate how the Resource Description Framework (RDF) developed for the Semantic Web can be used to represent and integrate these resources and enable seamless access to them as a unified resource. We illustrate the effectiveness of our approach by answering a real-world biomedical query linking a specific molecular function, glycosyltransferase, to the disorder congenital muscular dystrophy.
ER -
TY - CONF
T1 - A Fully Connectionist Model Generator for Covered First-Order Logic Programs
T2 - Twentieth International Joint Conference on Artificial Intelligence, IJCAI-07
Y1 - 2007
A1 - Sebastian Bader
A1 - Steffen Holldobler
A1 - Andreas Witzel
A1 - Pascal Hitzler
AB - We present a fully connectionist system for the learning of first-order logic programs and the generation of corresponding models: Given a program and a set of training examples, we embed the associated semantic operator into a feed-forward network and train the network using the examples. This results in the learning of first-order knowledge while damaged or noisy data is handled gracefully.
JA - Twentieth International Joint Conference on Artificial Intelligence, IJCAI-07
CY - Hyderabad, India
ER -
TY - CONF
T1 - Fundamental Concepts of Bioinformatics
Y1 - 2007
A1 - Michael Raymer
ER -
TY - CONF
T1 - A General Boosting Method and its Application to Learning Ranking Functions for Web Search
T2 - A General Boosting Method and its Application to Learning Ranking Functions for Web Search
Y1 - 2007
A1 - Zhaohui Zheng
A1 - Hongyuan Zha
A1 - Gordon Sun
A1 - Olivier Chapelle
A1 - Keke Chen
A1 - Tong Zhang
JA - A General Boosting Method and its Application to Learning Ranking Functions for Web Search
ER -
TY - CHAP
T1 - Geospatial and Temporal Semantic Analytics
Y1 - 2007
A1 - Matthew Perry
A1 - Ismailcem Budak Arpinar
A1 - Amit Sheth
A1 - Farshad Hakimpour
KW - Ontology
KW - rdf
KW - Semantic Analytics
KW - Semantic Association
KW - Spatiotemporal Thematic Context
AB - The amount of digital data available to researchers and knowledge workers has grown tremendously in recent years. This is especially true in the geography domain. As the amount of data grows, problems of data relevance and information overload become more severe. The use of semantics has been proposed to combat these problems (Berners-Lee et al., 2001; Egenhofer,
ER -
TY - CHAP
T1 - GlycO Ontology
Y1 - 2007
A1 - Christopher Thomas
KW - Glycomics
ER -
TY - CHAP
T1 - Glycomics Project Overview
Y1 - 2007
A1 - Satya S. Sahoo
KW - Glycomics
ER -
TY - CONF
T1 - Hybrid Retrieval from the Unified Web
T2 - Hybrid Retrieval from the Unified Web
Y1 - 2007
A1 - Trivikram Immaneni
A1 - Krishnaprasad Thirunarayan
JA - Hybrid Retrieval from the Unified Web
ER -
TY - JOUR
T1 - Implicit Online Learning with Kernels
Y1 - 2007
A1 - L. Cheng
A1 - S. Vishwanathan
A1 - D. Schuurmans
A1 - Shaojun Wang
A1 - Terry Caelli
AB - We present two new algorithms for online learning in reproducing kernel Hilbert spaces. Our first algorithm, ILK (implicit online learning with kernels), employs a new, implicit update technique that can be applied to a wide variety of convex loss functions. We then introduce a bounded memory version, SILK (sparse ILK), that maintains a compact representation of the predictor without compromising solution quality, even in non-stationary environments. We prove loss bounds and analyze the convergence rate of both. Experimental evidence shows that our proposed algorithms outperform current methods on synthetic and real data.
ER -
TY - CONF
T1 - Inter-enterprise System and Application Integration: A Reality Check
T2 - 9th International Conference on Enterprise Information Systems (ICEIS 2007)
Y1 - 2007
A1 - Amit Sheth
A1 - Christoph Bussler
A1 - Jorge Cardoso
A1 - Kurt Sandkuhl
A1 - Wil van der Aalst
AB - This paper structures the summary of the panel held at the 9th International Conference on Enterprise Information Systems, Funchal, Madeira, 12-16 June 2007, which addressed the following question: Are you still working on Inter-Enterprise System and Application Integration? The panel brought together distinguished experts from the areas of process management, workflow, Web services, SOA, and the Semantic Web. The panelists' position statements were: Wil van der Aalst: We Are Creating Our Own Problems; Christoph Bussler: The World Moved On; Amit Sheth: New World Order for Interactions Across Enterprise Information Systems in the Flat World; Kurt Sandkuhl: It Is an Illusion to Believe We Will Ever Solve All Interoperability Problems!; Jorge Cardoso: Devise Conceptual Nodes Instead of Leaves Solutions!
JA - 9th International Conference on Enterprise Information Systems (ICEIS 2007)
CY - Funchal, Madeira
ER -
TY - ABST
T1 - Keyword Search Interface to Express Path Queries in RDF
Y1 - 2007
A1 - Sujeeth Thirumalai
KW - Keyword Search
KW - Ontology
KW - Path Query
KW - Semantic Web
AB - Today's Semantic Web has a growing wealth of machine-understandable metadata represented using markup languages like RDF, XML or OWL. There exists a plethora of query languages that aid in searching such data models. However, most real-world searches involve queries expressed in natural language, as it allows the user to get information without using complex formal query languages. This paper presents a search interface for path queries on ontologies, which accepts keywords and finds answers where each answer is a subgraph containing paths between nodes that match the keywords. Our approach for building such a system comprises (1) a full-text search index for triples in the ontology, (2) lexical and semantic query expansion to match user keywords to entities in the ontology, and (3) an algorithm which uses the Sparq2l path sequence indices to compute the answer subgraphs.
ER -
TY - CHAP
T1 - Knowledge Representation on the Semantic Web
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - CHAP
T1 - Kursarbeit mit Schulern - die Intensivkurse Mathematik
Y1 - 2007
A1 - Gudrun Kalmbach
A1 - Pascal Hitzler
ER -
TY - JOUR
T1 - Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
Y1 - 2007
A1 - F. Jiao
A1 - D. Schuurmans
A1 - R. Greiner
A1 - C. Lee
A1 - Shaojun Wang
ER -
TY - CHAP
T1 - Leveraging Semantic Web techniques to gain situational awareness
Y1 - 2007
A1 - Amit Sheth
PB - Cyber Situational Awareness Workshop
ER -
TY - CONF
T1 - Measuring Inconsistency for Description Logics Based on Paraconsistent Semantics
Y1 - 2007
A1 - Yue Ma
A1 - Guilin Qi
A1 - Pascal Hitzler
A1 - Zuoquan Lin
AB - In this paper, we present an approach for measuring inconsistency in a knowledge base. We first define the degree of inconsistency using a four-valued semantics for the description logic ALC. Then an ordering over knowledge bases is given by considering their inconsistency degrees. Our measure of inconsistency can provide important information for inconsistency handling.
PB - the 2007 International Workshop on Description Logics (DL-2007)
ER -
TY - CONF
T1 - Measuring Inconsistency for Description Logics Based on Paraconsistent Semantics
T2 - Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
Y1 - 2007
A1 - Yue Ma
A1 - Guilin Qi
A1 - Zuoquan Lin
A1 - Pascal Hitzler
AB - In this paper, we present an approach for measuring inconsistency in a knowledge base. We first define the degree of inconsistency using a four-valued semantics for the description logic ALC. Then an ordering over knowledge bases is given by considering their inconsistency degrees. Our measure of inconsistency can provide important information for inconsistency handling.
JA - Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
CY - Hammamet, Tunisia
ER -
TY - JOUR
T1 - Mining Minimal Distinguishing Subsequence Patterns with Gap Constraints
Y1 - 2007
A1 - Guozhu Dong
A1 - Xiaonan Ji
A1 - James Bailey
ER -
TY - CONF
T1 - Modeling and Simulation of Cardiovascular Systems
Y1 - 2007
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - A multi-objective genetic algorithm that employs a hybrid approach for isolating codon usage bias indicative of translational efficiency
Y1 - 2007
A1 - Dan Krane
A1 - Travis Doom
A1 - Michael Raymer
A1 - D. Raiford
ER -
TY - CONF
T1 - A National Model for Engineering Mathematics Education
T2 - 117th ASEE Annual Conference and Exposition
Y1 - 2007
A1 - Nathan Klingbeil
A1 - Kuldip Rattan
A1 - Michael Raymer
A1 - David Reynolds
A1 - Richard Mercer
A1 - Anant Kukreti
A1 - Brian Randolph
AB - The traditional approach to engineering mathematics education begins with one year of freshman calculus as a prerequisite to subsequent core engineering courses. However, the inability of incoming students to successfully advance through the traditional freshman calculus sequence is a primary cause of attrition in engineering programs across the country. In response, this paper describes an NSF-funded initiative at Wright State University to redefine the way in which engineering mathematics is taught, with the goal of increasing student retention, motivation and success in engineering. This paper provides an overview of the WSU model for engineering mathematics education, followed by an assessment of student performance, perception and retention through its initial implementation. It also summarizes the scope of a recent NSF CCLI Phase 2 Expansion award, which involves a multiyear assessment at WSU, pilot adoption and assessment at two collaborating institutions, and a widespread dissemination of results.
JA - 117th ASEE Annual Conference and Exposition
PB - Proceedings of the 2007 ASEE Annual Conference & Exposition
CY - Honolulu, HI
ER -
TY - Generic
T1 - A National Model for Engineering Mathematics Education
Y1 - 2007
A1 - N. Klingbeil
A1 - R. Mercer
A1 - K. Rattan
A1 - Michael Raymer
PB - ASEE Southeastern Section Conference
ER -
TY - JOUR
T1 - A Novel Method for Visualization of Entire Coronary Arterial Tree
JF - Annals of Biomedical Engineering
Y1 - 2007
A1 - Joerg Meyer
A1 - Ghassan Kassab
A1 - Benjamin Kaimovitz
A1 - Yoram Lanir
A1 - Thomas Wischgoll
AB - The complexity of the coronary circulation, especially in the deep layers, largely evades experimental investigations. Hence, virtual/computational models depicting structure-function relation of the entire coronary vasculature including the deep layer are imperative. In order to interpret such anatomically based models, fast and efficient visualization algorithms are essential. The complexity of such models, which include vessels from the large proximal coronary arteries and veins down to the capillary level (3 orders of magnitude difference in diameter), is a challenging visualization problem since the resulting geometrical representation consists of millions of vessel segments. In this study, a novel method for interactively rendering the entire porcine coronary arterial tree down to the first segments of capillaries is described, which employs geometry reduction and occlusion culling techniques. Due to the tree-shaped nature of the vasculature, these techniques exploit the geometrical topology of the object to achieve a faster rendering speed while still handling the full complexity of the data. We found a significant increase in performance combined with a more accurate, gap-less representation of the vessel segments, resulting in a more interactive visualization and analysis tool for the entire coronary arterial tree. The proposed techniques can also be applied to similar data structures, such as neuronal trees, airway structures, bile ducts, and other tree-like structures. The utility and future applications of the proposed algorithms are explored.
ER -
TY - CHAP
T1 - Ontologies Are Us: Emergent Semantics in Folksonomy Systems
Y1 - 2007
A1 - Peter Mika
A1 - Ramesh Jain
A1 - Amit Sheth
KW - Beyond Computing
KW - Data Integration
KW - Mika
KW - Networks
KW - Online networks
KW - Ontology
KW - Semantic Web
KW - Semantics
KW - Social
KW - social networks
KW - Web
KW - Web 2.0
AB - Social Networks and the Semantic Web combines the concepts and the methods of two fields of investigation, which together have the power to aid in the analysis of the social Web and the design of a new class of applications that combine human intelligence with machine processing. Social Network Analysis and the emerging Semantic Web are also the fields that stand to gain most from the new Web in achieving their full potential. On the one hand, the social Web delivers social network data at an extraordinary scale, with a dynamics and precision that has been outside of reach for more traditional methods of observing social structure and behavior. In realizing this potential, the technology of the Semantic Web provides the key in aggregating information across heterogeneous sources. The Semantic Web itself benefits by incorporating user-generated metadata and other clues left behind by users.
ER -
TY - JOUR
T1 - Ontology Driven Data Mediation in Web Services
JF - International Journal of Web Services Research
Y1 - 2007
A1 - Amit Sheth
A1 - Meenakshi Nagarajan
A1 - Kunal Verma
A1 - John Miller
KW - Axis 2.0
KW - data mediation
KW - Mapping
KW - Matching
KW - Message-level heterogeneities
KW - METEOR-S
KW - Ontology
KW - SAWSDL
KW - Web Service interoperation
AB - With the rising popularity of Web services, both academia and industry have invested considerably in Web service description standards, discovery, and composition techniques. The standards-based approach utilized by Web services has supported interoperability at the syntax level. However, issues of structural and semantic heterogeneity between messages exchanged by Web services are far more complex and crucial to interoperability. It is for these reasons that we recognize the value that schema/data mappings bring to Web service descriptions. In this paper, we examine challenges to interoperability, classify the types of heterogeneities that can occur between interacting services, and present a possible solution for data interoperability using the mapping support provided by WSDL-S, a key driver behind SAWSDL. We present a data mediation architecture using the extensibility features of WSDL and the popular SOAP engine, Axis 2.
ER -
TY - CONF
T1 - Ontology Evaluation and Ranking using OntoQA
T2 - Ontology Evaluation and Ranking using OntoQA
Y1 - 2007
A1 - Ismailcem Budak Arpinar
A1 - Samir Tartir
AB - Ontologies form the cornerstone of the Semantic Web and are intended to help researchers analyze and share knowledge. As more ontologies are introduced, it becomes difficult for users to find good ontologies related to their work. Therefore, tools for evaluating and ranking ontologies are needed. In this paper, we present OntoQA, a tool that evaluates ontologies related to a certain set of terms and then ranks them according to a set of metrics that captures different aspects of ontologies. Since there are no global criteria defining how a good ontology should be, OntoQA allows users to tune the ranking towards certain features of ontologies to suit the needs of their applications. We also show the effectiveness of OntoQA in ranking ontologies by comparing its results to the rankings of other comparable approaches as well as expert users.
JA - Ontology Evaluation and Ranking using OntoQA
CY - Irvine, CA, USA
ER -
TY - CONF
T1 - Paraconsistent Resolution for Four-valued Description Logics
Y1 - 2007
A1 - Yue Ma
A1 - Zuoquan Lin
A1 - Pascal Hitzler
AB - In this paper, we propose an approach to translating any ALC ontology (possibly inconsistent) into a logically consistent set of disjunctive datalog rules. We achieve this in two steps: First we give a simple way to make any ALC-based ontology 4-valued satisfiable, and then we study a sound and complete paraconsistent ordered-resolution decision procedure for our 4-valued ALC. Our approach can be viewed as a paraconsistent version of the KAON2 algorithm.
PB - the 2007 International Workshop on Description Logics (DL-2007)
ER -
TY - CONF
T1 - The Programmable Web: Agile, Social, and Grassroot Computing
T2 - 1st IEEE International Conference on Semantic Computing (ICSC)
Y1 - 2007
A1 - E. Michael Maximilien
A1 - Ajith Ranabahu
AB - Web services, the Semantic Web, and Web 2.0 are three somewhat separate movements trying to make the Web a programmable substrate. While each has achieved some level of success in its own right, it is becoming apparent that the grassroots approach of Web 2.0 is gaining greater success than the other two. In this paper we analyze the movements, briefly describing their main traits and outlining their primary assumptions. We then frame the common problem of achieving a programmable Web within the context of distributed computing and software engineering and attempt to show why Web 2.0 is closest to giving a pragmatic solution to the problem and will therefore likely continue to have the most success, while the other two make only cursory contributions.
JA - 1st IEEE International Conference on Semantic Computing (ICSC)
PB - http://icsc2007.eecs.uci.edu/
CY - Irvine, CA, USA
ER -
TY - CONF
T1 - A proposed statistical protocol for the analysis of metabolic toxicological data derived from NMR spectroscopy
Y1 - 2007
A1 - Michael Raymer
A1 - Nicholas Reo
A1 - P. Anderson
A1 - Travis Doom
A1 - B. Kelly
A1 - Nicholas J. DelRaso
ER -
TY - CONF
T1 - Quo Vadis, CS? - On the (non)-impact of Conceptual Structures on the Semantic Web
T2 - ICCS 2007
Y1 - 2007
A1 - Sebastian Rudolph
A1 - Markus Krotzsch
A1 - Pascal Hitzler
AB - Conceptual Structures is a field of research which shares abstract concepts and interests with recent work on knowledge representation for the Semantic Web. However, while the latter is an area of research and development which is rapidly expanding in recent years, the former fails to participate in these developments on a large scale. In this paper, we attempt to stimulate the Conceptual Structures community to catch the Semantic Web train.
JA - ICCS 2007
CY - Sheffield, UK
ER -
TY - CONF
T1 - RDF data exploration and visualization
T2 - RDF data exploration and visualization
Y1 - 2007
A1 - Amit Sheth
A1 - Leonidas Deligiannidis
A1 - Krzysztof Kochut
AB - We present Paged Graph Visualization (PGV), a new semiautonomous tool for RDF data exploration and visualization. PGV consists of two main components: a) the 'PGV explorer' and b) the 'RDF pager' module utilizing BRAHMS, our high performance main-memory RDF storage system. Unlike existing graph visualization techniques which attempt to display the entire graph and then filter out irrelevant data, PGV begins with a small graph and provides the tools to incrementally explore and visualize relevant data of very large RDF ontologies. We implemented several techniques to visualize and explore hot spots in the graph, i.e. nodes with large numbers of immediate neighbors. In response to the user-controlled, semantics-driven direction of the exploration, the PGV explorer obtains the necessary sub-graphs from the RDF pager and enables their incremental visualization leaving the previously laid out sub-graphs intact. We outline the problem of visualizing large RDF data sets, discuss our interface and its implementation, and through a controlled experiment we show the benefits of PGV.
JA - RDF data exploration and visualization
ER -
TY - CONF
T1 - Realizing the Relationship Web: Morphing information access on the Web from today's document- and entity-centric paradigm to a relationship-centric paradigm
T2 - Realizing the Relationship Web: Morphing information access on the Web from today's document- and entity-centric paradigm to a relationship-centric paradigm
Y1 - 2007
A1 - Amit Sheth
KW - Relationship Web
JA - Realizing the Relationship Web: Morphing information access on the Web from today's document- and entity-centric paradigm to a relationship-centric paradigm
ER -
TY - CONF
T1 - Realizing the Relationship Web: Morphing information access on the Web from today's document- and entity-centric paradigm to a relationship-centric paradigm
Y1 - 2007
A1 - Amit Sheth
KW - Relationship Web
PB - ACM Multimedia 2007 International Workshop on the Many Faces of Multimedia Semantics
ER -
TY - CONF
T1 - A Refinement Operator Based Learning Algorithm for the ALC Description Logic
T2 - 17th International Conference, ILP
Y1 - 2007
A1 - Jens Lehmann
A1 - Pascal Hitzler
AB - With the advent of the Semantic Web, description logics have become one of the most prominent paradigms for knowledge representation and reasoning. Progress in research and applications, however, faces a bottleneck due to the lack of available knowledge bases, and it is paramount that suitable automated methods for their acquisition will be developed. In this paper, we provide the first learning algorithm based on refinement operators for the most fundamental description logic ALC. We develop the algorithm from thorough theoretical foundations and report on a prototype implementation.
JA - 17th International Conference, ILP
PB - 17th International Conference, ILP 2007
CY - Corvallis, OR, USA
ER -
TY - CONF
T1 - A Regression Framework for Learning Ranking Functions Using Relative Relevance Judgments
T2 - A Regression Framework for Learning Ranking Functions Using Relative Relevance Judgments
Y1 - 2007
A1 - Keke Chen
A1 - Hongyuan Zha
A1 - Gordon Sun
A1 - Zhaohui Zheng
JA - A Regression Framework for Learning Ranking Functions Using Relative Relevance Judgments
ER -
TY - JOUR
T1 - Relationship Web: Between Web Resources
Y1 - 2007
A1 - Cartic Ramakrishnan
A1 - Amit Sheth
ER -
TY - JOUR
T1 - Relationship Web: Blazing Semantic Trails between Web Resources
Y1 - 2007
A1 - Cartic Ramakrishnan
A1 - Amit Sheth
KW - Relationship Web
AB - Using keywords as inputs to search engines and receiving documents as responses remains the prevalent way to access information on the Web. Although a shift toward entity awareness is a fairly recent trend in information access, such methods remain devoid of semantics, which are increasingly recognized as the lynchpin of search, integration, and analysis. We argue that relationships are at the heart of semantics, and, as such, we envision a Web of relationships to relate content across Web resources. Under ...
ER -
TY - CONF
T1 - Relationship Web: realizing the Memex vision with the help of semantic web
T2 - International Multimedia Conference
Y1 - 2007
A1 - Amit Sheth
AB - Relationship Web takes us from 'which document' could have information I need to 'what's in the resources' that gives me the insight and knowledge I need for decision making. Dr. Vannevar Bush outlined his vision for Memex in a 1945 Atlantic Monthly article [1]. Describing how the human brain navigates an information space in what he called trailblazing, Dr. Bush said, 'It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain.' Now that we can label content to associate semantics (meaning) to data and build information processing in which relationships rather than keywords and entities play the central role, the possibility of realizing the Memex vision seems tantalizingly close. Although through much of the recent past attention has been on search, finding a document is seldom the end goal of a human activity. Aligned with the Memex vision, human need for information is related to a desire and need for information processing that goes well beyond delivering a list of documents that matches the keywords or even the implied intent. Human information seeking is likely to be driven by more demanding activities such as interaction and entertainment, finding associations and answers, performing analysis, gaining insights, or making decisions. The Memex vision provides an interesting paradigm for supporting these objectives. Changing the computing paradigm to one that focuses on relationships is the key to realizing the Memex vision. We term our realization of Memex Relationship Web. In past work we observed the changing focus from documents to entities to relationships. We also investigated a broad variety of issues related to modeling, validating, discovering, and exploiting the many types of relationships between entities in content [2]. The first result of these efforts was the concept of Metadata Reference Links (MREFs), which proposed associating semantic metadata with hypertext links [3]. MREF faced several limitations, but recent significant advances resulting from research, standards, and technology development associated with Semantic Web provide building blocks for realizing the Relationship Web. We outline below some recent relationship-centric research to which we have had the opportunity to contribute, at the same time acknowledging extensive work in each area by many researchers and practitioners.
JA - International Multimedia Conference
CY - Augsburg, Germany
ER -
TY - CHAP
T1 - Relationship Web: Realizing the Memex vision with the help of Semantic Web
Y1 - 2007
A1 - Amit Sheth
KW - Relationship Web
KW - Semantic Web
AB - Relationship Web takes us from 'which document' could have information I need to 'what's in the resources' that gives me the insight and knowledge I need for decision making. Dr. Vannevar Bush outlined his vision for Memex in a 1945 Atlantic Monthly article [1]. Describing how the human brain navigates an information space in what he called trailblazing, Dr. Bush said, 'It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with ...
PB - SemGrail 2007
ER -
TY - CONF
T1 - Relationship Web: Spinning the Semantic Web from Trailblazing to Complex Hypothesis Evaluation
Y1 - 2007
A1 - Amit Sheth
KW - Relationship Web
PB - College of Engineering, University of Illinois at Chicago
ER -
TY - JOUR
T1 - Role of Semantics in Autonomic & Adaptive Web Services and Processes
Y1 - 2007
ER -
TY - CONF
T1 - Role of semantics in Autonomic and Adaptive Web Services & Processes
T2 - Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI)
Y1 - 2007
A1 - Jana Koehler
A1 - Martin Wirsing
A1 - Marco Pistore
A1 - Amit Sheth
A1 - Paolo Traverso
AB - The emergence of Service Oriented Architectures (SOA) has created a new paradigm of loosely coupled distributed systems. In the METEOR-S project, we have studied the comprehensive role of semantics in all stages of the life cycle of service and process, including annotation, publication, discovery, interoperability/data mediation, and composition. In 2002-2003, we had offered a broad framework of semantics consisting of four types: 1) Data semantics, 2) Functional semantics, 3) Non-Functional semantics and 4) Execution semantics. This talk describes the need for the four types of semantics, their standards-based support through WSDL-S/SAWSDL, and the need for such semantic representation to dynamic and adaptive SOA. We also briefly review the proposal for Adaptive Web Processes introduced earlier in an ICSOC 2005 vision talk.
JA - Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI)
CY - Schloss Dagstuhl, Germany
ER -
TY - CONF
T1 - Role of semantics in Autonomic and Adaptive Web Services & Processes
T2 - Autonomous and Adaptive Web Services 2007
Y1 - 2007
A1 - Amit Sheth
AB - The emergence of Service Oriented Architectures (SOA) has created a new paradigm of loosely coupled distributed systems. In the METEOR-S project, we have studied the comprehensive role of semantics in all stages of the life cycle of service and process, including annotation, publication, discovery, interoperability/data mediation, and composition. In 2002-2003, we had offered a broad framework of semantics consisting of four types: 1) Data semantics, 2) Functional semantics, 3) Non-Functional semantics and 4) Execution semantics. This talk describes the need for the four types of semantics, their standards-based support through WSDL-S/SAWSDL, and the need for such semantic representation to dynamic and adaptive SOA. We also briefly review the proposal for Adaptive Web Processes introduced earlier in an ICSOC 2005 vision talk.
JA - Autonomous and Adaptive Web Services 2007
CY - Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI)
ER -
TY - CONF
T1 - Runtime Support of Speculative Optimization for Offline Escape Analysis
T2 - Runtime Support of Speculative Optimization for Offline Escape Analysis
Y1 - 2007
A1 - K. Cleereman
A1 - Krishnaprasad Thirunarayan
A1 - M. Cheatham
AB - Escape analysis can improve the speed and memory efficiency of garbage collected languages by allocating objects to the call stack, but an offline analysis will potentially interfere with dynamic class loading and an online analysis must sacrifice precision for speed. We describe a technique that permits the safe use of aggressive, speculative offline escape analysis in programs potentially loading classes that violate the analysis results.
JA - Runtime Support of Speculative Optimization for Offline Escape Analysis
ER -
TY - CONF
T1 - SA-REST and (S)mashups: Adding Semantics to RESTful Services
T2 - International Conference on Semantic Computing
Y1 - 2007
A1 - Karthik Gomadam
A1 - Amit Sheth
A1 - Jonathan Lathem
AB - The evolution of the Web 2.0 phenomenon has led to the increased adoption of the RESTful services paradigm. RESTful services often take the form of RSS/Atom feeds and AJAX-based lightweight services. The XML-based messaging paradigm of RESTful services has made it possible to compose various services together. Such compositions of RESTful services are widely referred to as mashups. In this paper, we outline the limitations in current approaches to creating mashups. We address these limitations by proposing a framework called SA-REST. SA-REST adds semantics to RESTful services. Our proposed framework builds upon the original ideas in WSDL-S, our W3C submission, which was subsequently adapted for Semantic Annotation of WSDL (SAWSDL), now a W3C proposed recommendation. We demonstrate use of microformats for semantic annotation of RESTful services and then the use of such semantically enabled services with better support for interoperability for creating dynamic mashups called SMashups.
JA - International Conference on Semantic Computing
CY - Irvine, CA, USA
ER -
TY - JOUR
T1 - SA-REST: Semantically Interoperable and Easier-to-Use Services and Mashups
JF - IEEE Internet Computing
Y1 - 2007
A1 - Karthik Gomadam
A1 - Amit Sheth
A1 - Jonathan Lathem
KW - SA-REST
KW - SAWSDL
KW - Semantic Annotation of RESTful Services
KW - Semantic mashup
KW - Smashup
AB - Services based on the representational state transfer (REST) paradigm, a lightweight implementation of a service-oriented architecture, have found even greater success than their heavyweight siblings, which are based on the Web Services Description Language (WSDL) and SOAP. By using XML-based messaging, RESTful services can bring together discrete data from different services to create meaningful data sets; mashups such as these are extremely popular today.
ER -
TY - CHAP
T1 - Schema-Driven Relationship Extraction from Unstructured Text
Y1 - 2007
A1 - Cartic Ramakrishnan
KW - NLP
KW - Relationship Extraction
KW - Semantic Web
ER -
TY - CONF
T1 - Selecting Labels for News Document Clusters
T2 - Selecting Labels for News Document Clusters
Y1 - 2007
A1 - M. Shaik
A1 - Krishnaprasad Thirunarayan
A1 - Trivikram Immaneni
AB - This work deals with the determination of meaningful and terse cluster labels for news document clusters. We analyze a number of alternatives for selecting headlines and/or sentences of documents in a document cluster (obtained as a result of an entity-event-duration query), and formalize an approach to extracting a short phrase from well-supported headlines/sentences of the cluster that can serve as the cluster label. Our technique maps a sentence into a set of significant stems to approximate its semantics, for comparison. Eventually a cluster label is extracted from a selected headline/sentence as a contiguous sequence of words, resuscitating word sequencing information lost in the formalization of semantic equivalence.
JA - Selecting Labels for News Document Clusters
ER -
TY - CONF
T1 - Semantic Annotations for WSDL
Y1 - 2007
A1 - Jacek Kopecky
A1 - Amit Sheth
PB - W3C Track, 16th World Wide Web Conference (WWW2007)
ER -
TY - CHAP
T1 - Semantic Biological Web Services Registry
Y1 - 2007
A1 - Amit Sheth
A1 - Satya S. Sahoo
A1 - B. Hunter
A1 - William York
AB - There are now more than a thousand Web Services [22] offering access to disparate biological resources, namely data and computational tools. It is extremely difficult for biological researchers to search in a Web Services (WS) registry for a relevant WS using the standard (primarily computational) descriptions used to describe it. Semantic Biological Web Services Registry (SemBOWSER) is an ontology-based implementation of the UDDI specification, which enables, at present, glycoproteomics researchers to publish, search and discover WS using semantic, service-level, descriptive domain keywords. SemBOWSER classifies a WS along two dimensions: the task it implements and the domain it is associated with. Each published WS is associated with the relevant ProPreO (comprehensive process ontology for glycoproteomics experimental lifecycle) ontology-based keywords (implemented as part of the registry). A researcher, in turn, can search for relevant WS using only the descriptive keywords, part of their everyday working lexicon. This intuitive search is underpinned by the ProPreO ontology, thereby making use of the inherent advantages of a semantic search, as compared to a purely syntactic search, namely disambiguation and use of named relationships between concepts. SemBOWSER is part of the glycoproteomics web portal 'Stargate'.
ER -
TY - CONF
T1 - Semantic Convergence of Wikipedia Articles
T2 - Semantic Convergence of Wikipedia Articles
Y1 - 2007
A1 - Amit Sheth
A1 - Christopher Thomas
AB - Social networking, distributed problem solving and human computation have gained high visibility. Wikipedia is a well established service that incorporates aspects of these three fields of research. For this reason it is a good object of study for determining quality of solutions in a social setting that is open, completely distributed, bottom up and not peer reviewed by certified experts. In particular, this paper aims at identifying semantic convergence of Wikipedia articles; the notion that the content of an article stays stable regardless of continuing edits. This could lead to an automatic recommendation of good article tags but also add to the usability of Wikipedia as a Web Service and to its reliability for information extraction. The methods used and the results obtained in this research can be generalized to other communities that iteratively produce textual content.
JA - Semantic Convergence of Wikipedia Articles
CY - Fremont Marriott Hotel, Silicon Valley, USA
ER -
TY - CONF
T1 - A Semantic Framework for Identifying Events in SOA
T2 - A Semantic Framework for Identifying Events in SOA
Y1 - 2007
A1 - Amit Sheth
A1 - Kunal Verma
A1 - Karthik Gomadam
A1 - Lakshmish Ramaswamy
A1 - Ajith Ranabahu
JA - A Semantic Framework for Identifying Events in SOA
ER -
TY - CHAP
T1 - Semantic Web applications in Industry, Government, Health Care and Life Sciences
Y1 - 2007
A1 - Amit Sheth
KW - Semantic Web
PB - Greater Dayton IT Alliance
ER -
TY - CONF
T1 - Semantic Web for Health Care and Biomedical Informatics
Y1 - 2007
A1 - Amit Sheth
PB - NSF Biomedical Informatics Workshop: Expanding Secondary Use of Health Data
ER -
TY - CONF
T1 - Semantic Web: Promising Technologies and Current Applications in Health Care and Life Sciences
Y1 - 2007
A1 - Amit Sheth
KW - Semantic Web
PB - LexisNexis/Elsevier Web Conference
ER -
TY - JOUR
T1 - Semantic Web Services, Part 2.
Y1 - 2007
A1 - Katia Sycara
A1 - Steven Battle
A1 - Dieter Fensel
A1 - Amit Sheth
A1 - David Martin
A1 - John Domingue
ER -
TY - CHAP
T1 - Semantic Web techniques empower perception and comprehension in Cyber Situational Awareness?
Y1 - 2007
A1 - Amit Sheth
KW - Semantic Web
KW - situational awareness
PB - ARO Cyber Situational Awareness Workshop
CY - Fairfax, VA
ER -
TY - CONF
T1 - Semantic Web: Technologies and Applications for the Real-World
T2 - Semantic Web: Technologies and Applications for the Real-World
Y1 - 2007
A1 - Susie Stephens
A1 - Amit Sheth
JA - Semantic Web: Technologies and Applications for the Real-World
ER -
TY - BOOK
T1 - Semantic Web-Based Information Systems: State-of-the-Art Applications
Y1 - 2007
A1 - Miltiadis Lytras
A1 - Amit Sheth
KW - Semantic Web
KW - semantic web applications
AB - As a new generation of technologies, frameworks, concepts and practices for information systems emerges, practitioners, academicians, and researchers are in need of a source where they can go to educate themselves on the latest innovations in this area. Semantic Web-Based Information Systems: State-of-the-Art Applications establishes value-added knowledge transfer and personal development channels in three distinctive areas: academia, industry, and government.
ER -
TY - JOUR
T1 - Semantically Annotating a Web Service
JF - IEEE Internet Computing
Y1 - 2007
A1 - Amit Sheth
A1 - Kunal Verma
KW - OWL-S
KW - SAWSDL
KW - Semantic SOA
KW - Semantic Web Service
KW - WSDL-S
KW - WSMO
AB - In the past few years, service-oriented architecture (SOA) has transitioned from a partially formed vision into a widely implemented paradigm, with Web services (WS) being the forerunners to implementing SOA-based solutions. But even though the current trend is to use Web services' standards-based nature to establish static connections between various components, businesses are starting to explore dynamic value-added propositions, such as reuse, interoperability, and agility.
ER -
TY - CONF
T1 - Semantics to Empower Services Science: Using Semantics at Middleware, Web Services and Business Levels
Y1 - 2007
A1 - Amit Sheth
KW - Semantic Web Services
KW - WSDL-S
KW - semantic REST
KW - Semantic Services Science (3S) model
KW - services semantics
KW - Semantics of Lightweight services
KW - Service Sciences
KW - semantics of knowledge services
KW - Semantic mashups (Smashup)
PB - International Conference on Enterprise Information Systems
ER -
TY - JOUR
T1 - Semantic-Web-Based Knowledge Management
JF - IEEE Internet Computing
Y1 - 2007
A1 - John Davies
A1 - Amit Sheth
A1 - Miltiadis Lytras
VL - 11
CP - 5
ER -
TY - CHAP
T1 - Sembowser - Semantic Biological Web Services Registry
Y1 - 2007
A1 - William York
A1 - Satya S. Sahoo
A1 - Amit Sheth
A1 - B. Hunter
AB - There are now more than a thousand Web Services [22] offering access to disparate biological resources, namely data and computational tools. It is extremely difficult for biological researchers to search in a Web Services (WS) registry for a relevant WS using the standard (primarily computational) descriptions used to describe it. Semantic Biological Web Services Registry (SemBOWSER) is an ontology-based implementation of the UDDI specification, which enables, at present, glycoproteomics researchers to publish, search and discover WS using semantic, service-level, descriptive domain keywords. SemBOWSER classifies a WS along two dimensions: the task it implements and the domain it is associated with. Each published WS is associated with the relevant ProPreO (comprehensive process ontology for glycoproteomics experimental lifecycle) ontology-based keywords (implemented as part of the registry). A researcher, in turn, can search for relevant WS using only the descriptive keywords, part of their everyday working lexicon. This intuitive search is underpinned by the ProPreO ontology, thereby making use of the inherent advantages of a semantic search, as compared to a purely syntactic search, namely disambiguation and use of named relationships between concepts. SemBOWSER is part of the glycoproteomics web portal 'Stargate'.
ER -
TY - CHAP
T1 - Sensor Data Management
Y1 - 2007
A1 - Cory Henson
KW - Semantic Sensor Web
KW - SSW
KW - XML
ER -
TY - CHAP
T1 - Sensor Networks Survey
Y1 - 2007
A1 - Cory Henson
A1 - Satya S. Sahoo
KW - Semantic Sensor Web
KW - SSW
PB - daytaOhio
ER -
TY - BOOK
T1 - Sequence Data Mining
Y1 - 2007
A1 - Jian Pei
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Situational Web Applications Mashups
Y1 - 2007
A1 - Ajith Ranabahu
A1 - Stefan Tai
A1 - E. Michael Maximilien
KW - mashups
KW - Domain Specific Languages
KW - Ruby on Rails
AB - Distributed programming has shifted from private networks to the Internet using heterogeneous Web APIs. This enables the creation of situational applications of composed services exposing user interfaces, i.e., mashups. However, this programmable Web lacks unified models that can facilitate mashup creation, reuse, and deployments. This poster demonstrates a platform to facilitate Web 2.0 mashups.
PB - ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)
ER -
TY - CONF
T1 - Space Adaptation: Privacy-preserving Multiparty Collaborative Mining with Geometric Perturbation
T2 - Space Adaptation: Privacy-preserving Multiparty Collaborative Mining with Geometric Perturbation
Y1 - 2007
A1 - Keke Chen
A1 - Ling Liu
JA - Space Adaptation: Privacy-preserving Multiparty Collaborative Mining with Geometric Perturbation
ER -
TY - CONF
T1 - SPARQ2L: Towards Support For Subgraph Extraction Queries in RDF Databases
T2 - 16th International World Wide Web Conference (WWW 2007)
Y1 - 2007
A1 - Kemafor Anyanwu
A1 - Amit Sheth
A1 - Angela Maduko
AB - Many applications in analytical domains often have the need to 'connect the dots' i.e., query about the structure of data. In bioinformatics for example, it is typical to want to query about interactions between proteins. The aim of such queries is to 'extract' relationships between entities i.e. paths from a data graph. Often, such queries will specify certain constraints that qualifying results must satisfy e.g. paths involving a set of mandatory nodes. Unfortunately, most present day Semantic Web query languages including the current draft of the anticipated recommendation SPARQL, lack the ability to express queries about arbitrary path structures in data. In addition, many systems that support some limited form of path queries rely on main memory graph algorithms limiting their applicability to very large scale graphs. In this paper, we present an approach for supporting Path Extraction queries. Our proposal comprises (i) a query language SPARQ2L which extends SPARQL with path variables and path variable constraint expressions, and (ii) a novel query evaluation framework based on efficient algebraic techniques for solving path problems which allows for path queries to be efficiently evaluated on disk resident RDF graphs. The effectiveness of our proposal is demonstrated by a performance evaluation of our approach on both real world and synthetic datasets.
JA - 16th International World Wide Web Conference (WWW 2007)
CY - Banff, Alberta
ER -
TY - CHAP
T1 - Spatiotemporal and Thematic Semantic Analytics
Y1 - 2007
A1 - Matthew Perry
KW - Spatiotemporal
KW - Thematic Semantic Analytics
ER -
TY - CHAP
T1 - Spatiotemporal-Thematic Data Processing in Semantic Web
Y1 - 2007
A1 - Matthew Perry
A1 - Amit Sheth
A1 - Farshad Hakimpour
A1 - Boanerges Aleman-Meza
KW - Event
KW - GIS
KW - rdf
KW - Semantics
KW - Spatiotemporal
KW - spatiotemporal thematic (STT) functions
KW - proximity
AB - This chapter presents practical approaches to data processing in space, time and theme dimensions using current Semantic Web technologies. It describes how we obtain geographic and event data from Internet sources and also how we integrate them into an RDF store. We briefly introduce a set of functionalities in space, time and semantics. These functionalities are implemented based on our existing technology for main-memory based RDF data processing developed at the LSDIS Lab. A number of these functionalities are exposed as REST Web services. We present two sample client-side applications that are developed using a combination of our services with the Google Maps service.
ER -
TY - CONF
T1 - Supporting Complex Thematic, Spatial and Temporal Queries over Semantic Web Data
T2 - 2nd International Conference on Geospatial Semantics (GEOS 07)
Y1 - 2007
A1 - Amit Sheth
A1 - Prateek Jain
A1 - Farshad Hakimpour
A1 - Matthew Perry
AB - Spatial and temporal data are critical components in many applications. This is especially true in analytical domains such as national security and criminal investigation. Often, the analytical process requires uncovering and analyzing complex thematic relationships between disparate people, places and events. Fundamentally new query operators based on the graph structure of Semantic Web data models, such as semantic associations, are proving useful for this purpose. However, these analysis mechanisms are primarily intended for thematic relationships. In this paper, we describe a framework built around the RDF metadata model for analysis of thematic, spatial and temporal relationships between named entities. We discuss modeling issues and present a set of semantic query operators. We also describe an efficient implementation in Oracle DBMS and demonstrate the scalability of our approach with a performance study using a large synthetic dataset from the national security domain.
JA - 2nd International Conference on Geospatial Semantics (GEOS 07)
PB - GeoS 2007: http://geosco.org/geos2007/
CY - Mexico City, MX
ER -
TY - THES
T1 - Supporting Link Analysis using Advanced Querying Methods on Semantic Web Databases
Y1 - 2007
KW - Link Analysis
KW - rdf
KW - semantic associations
KW - Semantic Query Languages
KW - Semantic Web Databases
KW - SPARQ2L
KW - sparql
AB - There is an increasing demand for technologies that can help organizations unearth actionable knowledge from their data assets. This demand continues to drive the flurry of activities in data mining research where the emphasis is on technologies that can identify patterns in data. However, in addition to the 'patterns' view of data, other data and knowledge perspectives are required to support the broad range of complex analytical tasks found in contemporary applications. For example, in some applications in homeland security, bioinformatics, business and other investigative domains many tasks are focused on 'connecting the dots'. For this genre of applications, support for identifying, revealing and analyzing links or relationships between groups of entities (link analysis) is crucial. Currently, mainstream database systems do not provide support for such analyses and current solutions rely on exporting their data from their databases into custom applications to be analyzed. This has the disadvantage of additional overhead and precludes the ability to exploit other mature technologies offered by today's database systems. This thesis argues for database support for link analysis by providing an appropriate interpretation for such information requests in a graph database model. It addresses several key database issues with respect to supporting such queries. First, it identifies a number of querying constructs that are crucial to supporting linking analysis applications and proposes a formal query language called SPARQ2L that allows their expression. A formal semantics and characterization of the computational complexity of SPARQ2L's query constructs is also presented. Second, it proposes a database storage model that supports efficient processing of queries while being tolerant of data persistence. 
The storage model combines a graph linearization strategy rooted in algebraic techniques for solving path problems with a set of heuristics for node and edge clustering that aims to minimize external path lengths. Third, it proposes a novel relevance model SemRank which exploits the 'machine processible semantics' of data in ascribing relative importance to query results and offers a flexible or 'modulative ranking' model enabling serendipitous knowledge discovery.
ER -
TY - Generic
T1 - SwetoDblp Ontology of Computer Science Publications
Y1 - 2007
A1 - Amit Sheth
A1 - Farshad Hakimpour
A1 - Boanerges Aleman-Meza
A1 - Ismailcem Budak Arpinar
KW - Ontology
KW - Ontology Population
KW - Semantic Analytics
KW - XML
AB - SwetoDblp is a large populated ontology with a shallow schema yet a large number of real-world instance data. We describe how such an ontology is built from an XML source and how it can be maintained. Instead of a one-to-one mapping from XML to RDF, the creation of the ontology emphasizes the addition of relationships and the value of URIs. SwetoDblp is publicly available online. We also summarize research efforts that have used or are using this freely available community resource.
PB - Web Semantics: Science, Services and Agents on the World Wide Web
ER -
TY - CONF
T1 - Towards Attack-Resilient Geometric Data Perturbation
T2 - Towards Attack-Resilient Geometric Data Perturbation
Y1 - 2007
A1 - Ling Liu
A1 - Keke Chen
JA - Towards Attack-Resilient Geometric Data Perturbation
ER -
TY - CONF
T1 - Towards Tractable Local Closed World Reasoning for the Semantic Web
T2 - 13th Portuguese Conference on Artificial Intelligence, EPIA
Y1 - 2007
A1 - Matthias Knorr
A1 - Jose Julio Alferes
A1 - Pascal Hitzler
AB - Recently, the logics of minimal knowledge and negation as failure (MKNF) [12] were used to introduce hybrid MKNF knowledge bases [14], a powerful formalism for combining open and closed world reasoning for the Semantic Web. We present an extension based on a new three-valued framework including an alternating fixpoint, the well-founded MKNF model. This approach, the well-founded MKNF semantics, derives its name from the very close relation to the corresponding semantics known from logic programming. We show that the well-founded MKNF model is the least model among all (three-valued) MKNF models, thus soundly approximating also the two-valued MKNF models from [14]. Furthermore, its computation yields better complexity results (up to polynomial) than the original semantics where models usually have to be guessed.
JA - 13th Portuguese Conference on Artificial Intelligence, EPIA
PB - 13th Portuguese Conference on Artificial Intelligence, EPIA 2007
CY - Guimaraes, Portugal
ER -
TY - CHAP
T1 - Trailblazing, Complex Hypothesis Evaluation, Abductive Inference and Semantic Web - Exploring possible synergy
Y1 - 2007
A1 - Amit Sheth
PB - Evidence and Intelligent Systems: ARO Workshop on Abductive Reasoning
ER -
TY - CONF
T1 - A Unified Approach to Retrieving Web Documents and Semantic Web Data
T2 - 4th European Semantic Web Conference (ESWC 2007)
Y1 - 2007
A1 - Krishnaprasad Thirunarayan
A1 - Trivikram Immaneni
JA - 4th European Semantic Web Conference (ESWC 2007)
CY - Innsbruck, Austria
ER -
TY - CONF
T1 - Usability of multiple degree-of-freedom input devices and virtual reality displays for interactive visual data analysis
Y1 - 2007
A1 - Joerg Meyer
A1 - Hans Hagen
A1 - Elke Moritz
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Using SAWSDL for Semantic Service Interoperability
Y1 - 2007
A1 - Kunal Verma
A1 - Amit Sheth
PB - Semantic Technology Conference
ER -
TY - CONF
T1 - Video on the Semantic Sensor Web
Y1 - 2007
A1 - Terry Rapoch
A1 - Amit Sheth
A1 - Josh Pschorr
A1 - Prateek Jain
A1 - Cory Henson
KW - Semantic Sensor Web
KW - Semantic Annotation
KW - SSW
KW - Semantic Mashup
AB - Millions of sensors around the globe currently collect avalanches of data about our world. The rapid development and deployment of sensor technology is intensifying the existing problem of too much data and not enough knowledge. With a view to alleviating this glut, we propose that sensor data, especially video sensor data, can be annotated with semantic metadata to provide contextual information about videos on the Web. In particular, we present an approach to annotating video sensor data with spatial, temporal, and thematic semantic metadata. This technique builds on current standardization efforts within the W3C and Open Geospatial Consortium (OGC) and extends them with Semantic Web technologies to provide enhanced descriptions and access to video sensor data.
PB - W3C Video on the Web Workshop
ER -
TY - ABST
T1 - Visualization and Analysis of CT Data in Cardiovascular Research
Y1 - 2007
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Visualization of Events in a Spatially and Multimedia Enriched Virtual Environment
T2 - IEEE Intelligence and Security Informatics (ISI)
Y1 - 2007
A1 - Farshad Hakimpour
A1 - Leonidas Deligiannidis
A1 - Amit Sheth
AB - Semantic Event Tracker (SET) is a highly interactive visualization tool for tracking and associating activities (events) in a spatially and Multimedia Enriched Virtual Environment. SET provides integrated views of information spaces while providing overview and detail to improve perception and evaluation of complex scenarios. We model an event as an object that describes an action and its location, time, and relations to other objects. Real world event information is extracted from Internet sources, then stored and processed using Semantic Web technologies that enable us to discover semantic associations between events. We use RDF graphs to represent semantic metadata and ontologies. SET is capable of visualizing as well as navigating through the event data in all three aspects of space, time and theme.
JA - IEEE Intelligence and Security Informatics (ISI)
CY - New Brunswick, NJ
ER -
TY - CONF
T1 - Visualization of Vascular Structures
Y1 - 2007
A1 - Thomas Wischgoll
ER -
TY - CONF
T1 - Visualizing Morphometric Data of Vasculatures
Y1 - 2007
A1 - Thomas Wischgoll
AB - Volume visualization is a common, very well-established visualization technique for volumetric data sets. Numerous advancements have been proposed and sophisticated improvements have been implemented to produce elaborate renderings that are capable of enhancing details within the volume. However, volume visualization alone is often not sufficient for the application domain. Oftentimes, researchers are interested in accurate measurements extracted from volumetric data to gain further insight into the specimen. These extracted measurements can then be used to generate and visualize a geometric reconstruction of the specimen. The visualization can incorporate additional tools that allow researchers to determine additional measurements of the specimen to help in the analysis. This paper describes such a software system, which is capable of accurately extracting measurements from volumetric data, reconstructing the geometric properties of the specimen, and interactively visualizing the results, thereby allowing researchers to further investigate the geometric configuration of the specimen. Index Terms: Morphometric measurements, geometric reconstruction, visualization of vascular structures.
ER -
TY - JOUR
T1 - Welcome to Prof. Amit Sheth
JF - Springer Science+Business Media
Y1 - 2007
A1 - A. Elmagarmid
A1 - Amit Sheth
KW - Distrib Parallel Databases
KW - Elmagarmid
KW - Sheth
AB - It is with great pleasure and enthusiasm that I announce the creation of a new co-EIC position for the DAPD Journal. We have recently published our 21st Volume without ever missing an issue or being late; the journal is now published by Springer and we have moved completely into electronic reviewing and publishing. As we move forward, I could not have found a better partner than my old friend and college classmate, Amit Sheth. Prof. Sheth has recently moved to the Wright State University as the LexisNexis Ohio Eminent Scholar in Advanced Data Management and Analysis. There are several compelling reasons why Prof. Sheth was my only choice to help me lead the Journal going forward. He has been a pioneer and a leader in the database, semantic web and workflow communities. He has consistently pursued discovery, learning and engagement more successfully than anyone else I know. Of particular interest to our Journal, he co-authored 3 of the most cited papers in the journal with over 1600 combined citations: An overview of Workflow Management System, OBSERVER: An Approach for Query Processing in Global Information (3rd most cited with over 400 citations), and Managing heterogeneous multi-system tasks to support enterprise-wide operations (5th most cited with 198 citations). Please share with me a sincere thanks to Prof. Sheth for his willingness to work with me to take this journal to new heights. Ahmed Elmagarmid, Co-Editor in Chief, DAPD
VL - 105-106
ER -
TY - CONF
T1 - A Well-founded Semantics for Hybrid MKNF Knowledge Bases
Y1 - 2007
A1 - Matthias Knorr
A1 - Jose Julio Alferes
A1 - Pascal Hitzler
AB - In [10], hybrid MKNF knowledge bases have been proposed for combining open and closed world reasoning within the logics of minimal knowledge and negation as failure ([8]). For this powerful framework, we define a three-valued semantics and provide an alternating fixpoint construction for nondisjunctive hybrid MKNF knowledge bases. We thus provide a well-founded semantics which is a sound approximation of the cautious MKNF model semantics, and which also features improved computational properties. We also show that whenever the DL knowledge base part is empty, then the alternating fixpoint coincides with the classical well-founded model.
PB - the 2007 International Workshop on Description Logics (DL-2007)
ER -
TY - ABST
T1 - What, Where and When: Supporting Semantic, Spatial and Temporal Queries in a DBMS
Y1 - 2007
A1 - Prateek Jain
A1 - Matthew Perry
A1 - Amit Sheth
A1 -