Welcome to the homepage of the chair "Internet Technologies and Systems" of Prof. Dr. Christoph Meinel and his team. Here we inform you about our teaching and ongoing research activities in security, knowledge engineering, innovation, and design thinking research.

The chair of Prof. Dr. Christoph Meinel offers courses in the following disciplines: Internet and Web Technologies, (Discrete) Mathematics and Logic, IT Security and Internet Security, Complexity Theory and Information Security as well as Design Thinking.

In Security and Trust Engineering, our research and development work is mainly focused on Network & Internet Security, Cloud and SOA Security (SOA: Service-Oriented Architectures), and Security Awareness.

The research of the team of Prof. Dr. Christoph Meinel in the field of knowledge management and engineering focuses on the challenging question of how to manage the mass of digital data, so-called "big data", from the Internet and other sources in order to generate new knowledge.

Cloud storage brokers (CSBs) abstract cloud storage complexities by mediating technical and business relationships between cloud service providers (CSPs) and cloud users, while providing value-added services such as increased security, identity management, and file sharing/syncing. However, CSBs face several security challenges, including enlarged attack surfaces due to the integration of disparate components, e.g., on-premise and cloud APIs/services. Therefore, appropriate security risk assessment methods are required to identify and evaluate these security issues and to examine the efficiency of countermeasures. A possible approach to satisfying these requirements is the employment of threat modeling concepts, which have been successfully applied in traditional paradigms. In this work, we employ threat models, including attack trees, attack graphs, and data flow diagrams, against a representative, real CSB and analyze the resulting security threats and risks. We also propose a technique for combining Common Vulnerability Scoring System (CVSS) and Common Configuration Scoring System (CCSS) base scores in probabilistic attack graphs in order to cater for configuration-based vulnerabilities, which are typically leveraged to compromise cloud storage systems. This effort is necessary since existing schemes do not provide sufficient security metrics, which are imperative for comprehensive risk assessments. We demonstrate the efficiency of our proposal by devising CCSS base scores for two common attacks against cloud storage, the Cloud Storage Enumeration Attack and the Cloud Storage Exploitation Attack, and then using these metrics in attack-graph-metric-based risk assessment. Our approach can thus be employed by CSBs and CSPs to improve cloud security.
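To illustrate the general idea of treating base scores as exploitation probabilities in an attack graph (not the exact combination technique proposed above), the following sketch maps hypothetical CVSS/CCSS base scores onto probabilities and propagates them through a small acyclic attack graph; all node names and scores are invented for illustration:

```python
# Illustrative sketch: propagating CVSS/CCSS base scores (0-10) as
# exploitation probabilities through a small acyclic attack graph.
# Node names and scores are hypothetical, not taken from the paper.

def to_probability(base_score):
    """Map a CVSS/CCSS base score (0-10) to a probability in [0, 1]."""
    return base_score / 10.0

def node_probability(node, graph, scores, memo=None):
    """P(node) = local exploitation probability, combined (OR-style)
    with the probability that at least one prerequisite is reached."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    local = to_probability(scores[node])
    parents = graph.get(node, [])
    if parents:
        # OR node: reachable if at least one prerequisite is compromised
        p_no_parent = 1.0
        for p in parents:
            p_no_parent *= 1.0 - node_probability(p, graph, scores, memo)
        local *= 1.0 - p_no_parent
    memo[node] = local
    return local

# Hypothetical attack graph: each key lists its prerequisite steps.
graph = {
    "exploit_api": ["enumerate_storage"],  # needs enumeration first
    "exfiltrate": ["exploit_api"],
}
scores = {
    "enumerate_storage": 6.5,  # CCSS-style configuration score (assumed)
    "exploit_api": 8.0,        # CVSS-style software score (assumed)
    "exfiltrate": 9.0,
}

print(round(node_probability("exfiltrate", graph, scores), 4))  # 0.468
```

The point of mixing the two scoring systems is visible even in this toy: the configuration weakness (CCSS) gates the reachability of the software exploits (CVSS) further down the graph.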

Battery-powered and energy-harvesting IEEE 802.15.4 nodes are subject to so-called denial-of-sleep attacks. Such attacks generally aim at draining the energy of a victim device. In particular, session key establishment schemes for IEEE 802.15.4 security are susceptible to denial-of-sleep attacks since injected requests for session key establishment typically trigger energy-consuming processing and communication. Nevertheless, Krentz et al.’s Adaptive Key Establishment Scheme (AKES) for IEEE 802.15.4 security is deemed resilient to denial-of-sleep attacks thanks to its energy-efficient design and special defenses. However, thus far, AKES’ resilience to denial-of-sleep attacks has presumably never been evaluated. In this paper, we make two contributions. First, we evaluate AKES’ resilience to denial-of-sleep attacks both theoretically and empirically. We particularly consider two kinds of denial-of-sleep attacks, namely HELLO flood attacks and what we introduce in this paper as “yo-yo attacks”. Our key finding is that AKES’ denial-of-sleep defenses require trade-offs between denial-of-sleep resilience and the speed at which AKES adapts to topology changes. Second, to alleviate these trade-offs, we devise and evaluate new denial-of-sleep defenses. Indeed, our newly devised defenses turn out to significantly accelerate AKES’ reaction to topology changes, without incurring much overhead or sacrificing security.

In talk-based mental health interventions, treatment outcomes can be decisively improved by enhancing the relationship between patient and therapist. We developed the interactive documentation system Tele-Board MED (TBM) with the goal of supporting patients and doctors in their cooperative task of patient care. The system offers a whiteboard-inspired graphical user interface that allows them to take notes jointly during the treatment session. Two proxy studies were conducted in which TBM was introduced in a role play that showcased the dialogue in a therapy session, with the patient role played by a volunteer. An audience of human-centered design and eHealth experts rated the therapist-patient relationship in a session with and without TBM. The data collected via questionnaires shows that TBM consistently received a positive rating from study participants (N=36) in the areas of collaboration, communication, patient-doctor relationship, and patient empowerment.

Many providers of Massive Open Online Course (MOOC) platforms have released mobile applications in recent years to enable learning offline and on the go, for a more ubiquitous learning experience. So far, however, mainly the MOOC content has been optimized for small screens, even though mobile devices provide the opportunity to enrich the MOOC experience further by enabling new forms of learning. Based on a previous evaluation of learning patterns and a user survey, this paper presents a second-screen prototype for the MOOC platform of the Hasso Plattner Institute, with which the mobile application can be used as a learning companion while using the web platform on a computer. Four different actions were implemented that can be performed alongside watching a video lecture. The evaluation showed that users found the prototype helpful and reported that it made learning more efficient; they also proposed ideas for further improvements.

This research note addresses the design and testing set-up of a prototype learning unit (protoLU). We explain the advantages of adapting Massive Open Online Courses (MOOCs) to digital learning units for professional working environments and introduce the use case of vocational design thinking trainings. We then describe the case of a protoLU created to support a physical, two-phase design thinking workshop. The protoLU aimed at refreshing participants' knowledge, fostering skill transfer, and inspiring through behavioral modeling. With our research set-up, we intend to gather results for iterating the protoLU and to learn more about the needs of design thinking learner teams in independent teamwork phases.

In this paper, we tackle the problem of detecting malicious domains and IP addresses using graph inference. To this end, we mine proxy and DNS logs to construct an undirected graph in which vertices represent domains and IP addresses, and edges represent relationships describing an association between those nodes. More specifically, we investigate three main relationships: subdomainOf, referredTo, and resolvedTo. We show that by providing minimal ground truth information, it is possible to estimate the marginal probability of a domain or IP node being malicious based on its association with other malicious nodes. This is achieved by adopting belief propagation, an efficient and popular inference algorithm for probabilistic graphical models. We implemented our system in Apache Spark and evaluated it on one day of proxy and DNS logs, collected from a global enterprise and spanning over 2 terabytes of disk space. We show that our approach is not only efficient but also achieves a high detection rate (96% TPR) with a reasonably low false positive rate (8% FPR). Furthermore, it is capable of fixing errors in the ground truth as well as identifying previously unknown malicious domains and IP addresses. Our proposal can be adopted by enterprises to increase both the quality and the quantity of their threat intelligence and blacklists using only proxy and DNS logs.
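The inference step can be sketched in miniature. The following is a minimal loopy belief propagation over a tiny domain/IP association graph with two states (benign, malicious); the graph, the priors, and the homophily-style edge potential are illustrative assumptions, not the paper's actual parameters:

```python
# Minimal loopy belief propagation sketch over a domain/IP association
# graph with two states (0 = benign, 1 = malicious). Graph, priors, and
# the edge potential are illustrative, not the paper's values.
from collections import defaultdict

STATES = (0, 1)
# Homophily: associated nodes tend to share the same label.
EDGE_POT = {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.6}

def belief_propagation(edges, priors, iterations=10):
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # messages[(u, v)][state] = message from u to v, initially uniform
    messages = {(u, v): [1.0, 1.0] for u in neighbors for v in neighbors[u]}
    for _ in range(iterations):
        new = {}
        for (u, v) in messages:
            msg = []
            for xv in STATES:
                total = 0.0
                for xu in STATES:
                    prod = priors[u][xu] * EDGE_POT[(xu, xv)]
                    for w in neighbors[u]:
                        if w != v:  # exclude the recipient's own message
                            prod *= messages[(w, u)][xu]
                    total += prod
                msg.append(total)
            s = sum(msg)
            new[(u, v)] = [m / s for m in msg]
        messages = new
    beliefs = {}
    for u in neighbors:
        b = []
        for xu in STATES:
            prod = priors[u][xu]
            for w in neighbors[u]:
                prod *= messages[(w, u)][xu]
            b.append(prod)
        s = sum(b)
        beliefs[u] = [x / s for x in b]
    return beliefs

# Hypothetical ground truth: evil.com is known malicious; the shared IP
# and the unknown domain should inherit elevated malicious marginals.
edges = [("evil.com", "10.0.0.1"), ("unknown.com", "10.0.0.1")]
priors = defaultdict(lambda: [0.5, 0.5])  # uninformative default prior
priors["evil.com"] = [0.01, 0.99]

beliefs = belief_propagation(edges, priors)
print(beliefs["unknown.com"][1] > 0.5)  # prints True
```

Even with a single labeled node, the malicious belief leaks through the shared IP to the previously unknown domain, which is the effect the approach exploits at enterprise scale.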

Massive Open Online Courses (MOOCs) have left their mark on the face of education in recent years. At the Hasso Plattner Institute (HPI) in Potsdam, Germany, we are actively developing a MOOC platform, which provides our research with a plethora of e-learning topics, such as learning analytics, automated assessment, peer assessment, teamwork, online proctoring, and gamification. We run several instances of this platform. On openHPI, we provide our own courses from within the HPI context. Further instances are openSAP, openWHO, and mooc.HOUSE, the smallest of these platforms, targeting customers with a less extensive course portfolio. In 2013, we started to work on the gamification of our platform. By now, we have implemented about two thirds of the features that we had initially evaluated as useful for our purposes. About a year ago, we activated the implemented gamification features on mooc.HOUSE, where they have been employed actively in the course “Design for Non-Designers”. We plan to activate the features on openHPI at the beginning of 2017. The paper at hand recaps, examines, and re-evaluates our initial recommendations.

Nowadays, social networks produce a huge amount of spatial and spatiotemporal data that contains interesting knowledge. This knowledge can be discovered by clustering algorithms, and the results can be used for different applications. One such application is geospatial event detection based on data from social networks. Many such detection methods rely on clustering algorithms that should produce clusters with a high level of density in space and intensity in time. Meanwhile, traditional clustering methods are not always practical for spatial and spatiotemporal data because of the specific characteristics of such data. Therefore, in this paper, we present a density- and intensity-based spatiotemporal clustering algorithm with a fixed distance and time radius. This approach produces clusters that have a density-based center in space and an intensity-based center in time. In the paper, we describe the method from two perspectives, spatial and temporal, and conclude with a full description of the algorithm's methods and a detailed explanation of its pseudocode.
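A much-simplified sketch of the core idea, clustering points that are close within a fixed spatial radius and a fixed time radius, might look as follows. This is an illustrative reconstruction of the general technique, not the paper's actual algorithm; the point data and thresholds are invented:

```python
# Simplified sketch of spatiotemporal clustering with a fixed distance
# radius (eps_space) and time radius (eps_time). Illustrative only,
# not the paper's exact density/intensity algorithm.
import math

def neighbors(points, i, eps_space, eps_time):
    """Indices of points within both the spatial and temporal radius."""
    x, y, t = points[i]
    result = []
    for j, (px, py, pt) in enumerate(points):
        if j == i:
            continue
        if math.hypot(px - x, py - y) <= eps_space and abs(pt - t) <= eps_time:
            result.append(j)
    return result

def cluster(points, eps_space, eps_time, min_pts):
    unassigned = set(range(len(points)))
    clusters = []
    while unassigned:
        # Choose the densest remaining point as the next cluster center.
        best = max(unassigned, key=lambda i: len(
            [j for j in neighbors(points, i, eps_space, eps_time)
             if j in unassigned]))
        members = [j for j in neighbors(points, best, eps_space, eps_time)
                   if j in unassigned]
        if len(members) + 1 < min_pts:
            break  # no dense region left; remaining points are noise
        clusters.append([best] + members)
        unassigned -= set(clusters[-1])
    return clusters

# Hypothetical geo-tagged posts: (x, y, unix_time).
points = [(0.0, 0.0, 0), (0.1, 0.1, 10), (0.2, 0.0, 20),  # dense event
          (5.0, 5.0, 1000)]                               # isolated noise
print(cluster(points, eps_space=0.5, eps_time=60, min_pts=3))
```

The fixed dual radius is what separates a burst of nearby, near-simultaneous posts (a candidate event) from a point that is close in space but far in time, or vice versa.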

The Hasso Plattner Institute has successfully run a self-developed Massive Open Online Course (MOOC) platform—openHPI—since 2012. MOOCs, even more than classic classroom situations, depend on automated solutions to assess programming exercises. Manual evaluation is not an option due to the massive number of users that participate in these courses. The paper at hand maps the landscape of tools that are used on openHPI in the context of automated grading of programming exercises. Furthermore, it provides a sneak preview of new features that will be integrated in the near future. In particular, we introduce CodeHarbor, our platform to share auto-gradable exercises between various online code execution platforms.

Web-based e-learning uses Internet technologies and digital media to deliver educational content to learners. In recent years, many universities have applied their capacities to producing Massive Open Online Courses (MOOCs), offering them with the expectation of rendering a comprehensive online apprenticeship. Typically, an online content delivery process requires an Internet connection. However, broadband access has never been a readily available resource in many regions. In Africa, poor or no network coverage is still the predominant experience of Internet users, who go offline each time a digital device disconnects from a network. As a result, learning processes in such regions are frequently disrupted, delayed, or terminated. This paper raises the concerns of e-learning under poor and low bandwidth; in particular, it highlights the need for an offline-enabled mode. The paper also explores technical approaches aimed at enhancing the user experience in web-based e-learning, particularly in Africa.

Auto-gradable hands-on programming exercises are a key element of scalable programming courses. A variety of auto-graders already exist; however, creating suitable high-quality exercises in sufficient amounts is a very time-consuming and tedious task. One way to approach this problem is to enable the sharing of auto-gradable exercises between several interested parties. School teachers, MOOC instructors, workshop providers, and university-level teachers need programming exercises to provide their students with hands-on experience, and auto-gradability of these exercises is an important requirement. The paper at hand introduces a tool that enables the sharing of such exercises and addresses the various needs and requirements of the different stakeholders.

Teamwork is an important topic in education. It fosters deep learning and allows educators to assign interesting tasks that would be too complex to be solved by single participants due to the time restrictions defined by the context of a course. Furthermore, today's jobs require an increasing amount of team skills. On the other hand, teamwork comes with a variety of issues of its own. Particularly in large-scale settings, such as MOOCs, teamwork is challenging. Courses often end with dysfunctional teams due to drop-outs or insufficient matching. The paper at hand presents a set of three tools that we have recently added to our system to enable teamwork in our courses. This toolset consists of the TeamBuilder, a tool to match successful teams based on a variable set of parameters; CollabSpaces, which provide teams with a secluded area to communicate and collaborate within the course context; and a TeamPeerAssessment tool, which makes it possible to assign complex tasks to teams and provides assessment that scales sufficiently for the MOOC context. The presented tools are evaluated in terms of the success rates of the created teams and the workload reduction for the courses' teaching teams.

The popularity of MOOCs has increased considerably in recent years. A typical MOOC consists of video content, self-tests after each video, and homework, which is normally in multiple-choice format. After solving these homework assignments for every week of a MOOC, the final exam certificate can be issued once the student has reached a sufficient score. There have also been some attempts to include practical tasks, such as programming, in MOOCs for grading. Nevertheless, until now there has been no known way to teach embedded systems programming in a MOOC where the programming can be done in a remote lab and where grading of the tasks is also possible. This embedded programming includes communication over GPIO pins to control LEDs and measure sensor values. We started a MOOC called "Embedded Smart Home" as a pilot to prove the concept of teaching real hardware programming in a MOOC environment under real-life MOOC conditions with over 6000 students. Furthermore, students with real hardware also have the possibility to program on their own hardware and grade their results in the MOOC. Finally, we evaluate our approach and analyze student acceptance of offering a course on embedded programming in this way. We also analyze the hardware usage and working time of students solving tasks to find out whether real hardware programming is an advantage and a motivating achievement that supports students' learning success.

Generating seeds on Internet of Things (IoT) devices is challenging because these devices typically lack common entropy sources, such as user interaction or hard disks. A promising replacement is to use power-up static random-access memory (SRAM) states, which are partly random due to manufacturing deviations. Thus far, however, there seems to be no method for extracting close-to-uniformly distributed seeds from power-up SRAM states in an information-theoretically secure and practical manner. Moreover, the min-entropy of power-up SRAM states reduces with temperature, rendering this entropy source vulnerable to so-called freezing attacks. In this paper, we make three main contributions. First, we propose a new method for extracting uniformly distributed seeds from power-up SRAM states. Unlike current methods, ours is information-theoretically secure, practical, and freezing-attack-resistant rolled into one. Second, we point out a trick that enables using power-up SRAM states not only for self-seeding at boot time, but also for reseeding at runtime. Third, we compare the energy consumption of seeding an IoT device with either radio noise or power-up SRAM states. While seeding with power-up SRAM states turned out to be more energy efficient, we argue for mixing both entropy sources.
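For readers unfamiliar with information-theoretically secure extraction, one classical building block is a seeded 2-universal hash, whose output is close to uniform by the leftover hash lemma. The sketch below condenses a (stand-in) SRAM dump this way; it illustrates the general technique only and is not the extraction method proposed in the paper. The prime field, key handling, and dump are all illustrative assumptions:

```python
# Sketch of seeded randomness extraction from a power-up SRAM dump via
# a 2-universal hash family h_{a,b}(x) = (a*x + b) mod P (leftover hash
# lemma). Illustrative of the general technique only, not the paper's
# proposed method.
import secrets

P = (1 << 521) - 1  # Mersenne prime defining the hash family's field

def extract_seed(sram_bytes, a, b, out_bits=128):
    """Condense a partly random SRAM dump into a short near-uniform seed."""
    x = int.from_bytes(sram_bytes, "big") % P
    y = (a * x + b) % P           # 2-universal hash of the dump
    return y & ((1 << out_bits) - 1)

# The hash key (a, b) may be public, but must be chosen independently of
# the SRAM content, e.g. once at manufacturing time (assumption).
a = secrets.randbelow(P)
b = secrets.randbelow(P)
sram_dump = bytes(range(256)) * 8  # stand-in for a real power-up dump

seed = extract_seed(sram_dump, a, b)
```

The extraction is deterministic given the key, so the same power-up state always yields the same seed; the uniformity guarantee rests on the min-entropy of the dump, which is exactly what the freezing attacks described above degrade.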

802.15.4 security protects against the replay, injection, and eavesdropping of 802.15.4 frames. A core concept of 802.15.4 security is the use of frame counters for both nonce generation and anti-replay protection. While functional, frame counters (i) cause an increased energy consumption, as they incur a per-frame overhead of 4 bytes, and (ii) only provide sequential freshness. The Last Bits (LB) optimization does reduce the per-frame overhead of frame counters, yet at the cost of an increased RAM consumption and occasional energy- and time-consuming resynchronization actions. Alternatively, the time-slotted channel hopping (TSCH) media access control (MAC) protocol of 802.15.4 avoids the drawbacks of frame counters by replacing them with timeslot indices, but findings of Yang et al. question the security of TSCH in general. In this paper, we assume the use of ContikiMAC, a popular asynchronous MAC protocol for 802.15.4 networks. Under this assumption, we propose an Intra-Layer Optimization for 802.15.4 Security (ILOS), which intertwines 802.15.4 security and ContikiMAC. In effect, ILOS reduces the security-related per-frame overhead even more than the LB optimization and achieves strong freshness. Furthermore, unlike the LB optimization, ILOS neither incurs an increased RAM consumption nor requires resynchronization actions. Beyond that, ILOS integrates with and advances other security supplements to ContikiMAC. We implemented ILOS using OpenMotes and the Contiki operating system.

In order to create an effective article, having great content is essential. To achieve this, however, the writer needs to target a specific audience. A target audience is a group of readers that a writer intends to reach with their content. Defining a target audience is essential because it has a direct effect on adjusting the writing style and content of an article. Nowadays, writers rely solely on annotated attributes of articles, such as location and language, to understand their audience. The aim of this work is to identify audience attributes of articles, especially attributes that are not annotated. Among others, this work focuses on the detection of three key audience attributes of related articles: age, gender, and personality. We compare multiple machine learning classifiers for detecting these attributes. Finally, we demonstrate a prototypical application that enables writers to run existing algorithms, such as trend detection and showing related articles, restricted to a defined target audience based on the newly detected attributes.

In this paper, we present our experiences in analyzing Twitter data. Since its launch in 2006, the microblogging service has grown tremendously, with tweets sent by users all around the world. Our analysis has shown that information diffuses over time through the Twitter network in certain patterns, and that friend relationships significantly influence the speed of information propagation on Twitter. The results show that there are two major patterns. While these patterns help us understand the diffusion of information through Twitter even better, the analysis of friend networks provides information on who influences the network, in terms of the number of retweets and the time between a tweet and its retweets. The approaches have been evaluated both technically, based on how closely a topic matches one of the patterns and how prominent friends are compared to other users, and conceptually, based on existing, well-known approaches to measuring the speed and scale of information diffusion on Twitter.

This paper discusses a new approach for designing and deploying Security-as-a-Service (SecaaS) applications using cloud native design patterns. Current SecaaS approaches do not efficiently handle the increasing threats to computer systems and applications. For example, requests for security assessments increase drastically after a high-risk security vulnerability is disclosed. In such scenarios, SecaaS applications are unable to scale dynamically to serve the requests. A root cause of this challenge is the employment of architectures not specifically fitted to cloud environments. Cloud native design patterns resolve this challenge by enabling properties such as massive scalability and resiliency via the combination of microservice patterns and cloud-focused design patterns. However, adopting these patterns is a complex process, during which several security issues are introduced. In this work, we investigate these security issues, and we redesign and deploy a monolithic SecaaS application using cloud native design patterns while considering appropriate, layered security countermeasures, i.e., at the application and cloud networking layers. Our prototype implementation outperforms traditional, monolithic applications with an average Scanner Time of 6 minutes, without compromising security. Our approach can be employed for designing secure, scalable, and performant SecaaS applications that effectively handle unexpected increases in security assessment requests.

Providing patients access to mental health records is a controversial topic that is gaining growing attention in research and practice. While it has great potential for increasing patient engagement, skepticism prevails among therapists, who fear detrimental effects and face a lack of feasibility when treatment notes are handwritten. We aim to empower therapists to adopt new documentation approaches and patients to engage more, and we develop the collaborative documentation system Tele-Board MED (TBM) as an adjunct to talk-based mental health interventions. We present an evaluation of TBM by comparing four prototypes and testing scenarios, ranging from early simulations to attempts at real-life implementation in clinical routines. This paper delivers a systematic comparison of the needs of therapists as primary users and patients as secondary users, both during and beyond treatment sessions. While patient feedback is thoroughly positive, therapist needs are only partially addressed; the benefits remain hidden behind the perceived effort.

Cloud Native Applications (CNA) consist of multiple collaborating microservice instances working together towards common goals. These microservices leverage the underlying cloud infrastructure to enable properties such as scalability and resiliency. CNA are complex distributed applications, vulnerable to several security issues affecting both microservices and traditional cloud-based applications. For example, each microservice instance could be developed with different technologies, e.g., programming languages and databases. This diversity of technologies increases the chances of security vulnerabilities in microservices. Moreover, the fast-paced development cycles of CNA increase the probability of insufficient security tests in the development pipelines and the consequent deployment of vulnerable microservices. Furthermore, cloud native environments are ephemeral: microservices are dynamically launched and de-registered, which creates a discoverability challenge for traditional security assessment techniques. Hence, security assessments in such environments require new approaches that are specifically adapted and integrated into CNA. In fact, such techniques should themselves be cloud native, i.e., well integrated into the cloud’s fabric. In this paper, we tackle the above-mentioned challenges by introducing a novel security control concept, the Security Gateway. To support the Security Gateway concept, two further concepts are proposed: the dynamic document store and security health endpoints. We have implemented these concepts using cloud native design patterns and integrated them into the CNA workflow. Our experimental evaluations validate the efficiency of our proposals: the time overhead due to the Security Gateway is minimal, and the vulnerability detection rate surpasses that of traditional security assessment approaches. Our proposal can therefore be employed to secure CNA and microservice-based implementations.

Like virtually all media access control (MAC) protocols for 802.15.4 networks, ContikiMAC is vulnerable to various denial-of-sleep attacks. The focus of this paper is on countering three specific denial-of-sleep attacks on ContikiMAC, namely ding-dong ditching, pulse-delay attacks, and collision attacks. Ding-dong ditching is when attackers emit interference, inject frames, or replay frames so as to mislead ContikiMAC into staying in receive mode for extended periods of time and hence consuming much energy. Pulse-delay attacks are actually attacks on time synchronization, but can also be launched against ContikiMAC’s phase-lock optimization to cause an increased energy consumption. Lastly, in collision attacks, an attacker provokes retransmissions via jamming. To counter these three kinds of denial-of-sleep attacks, we propose two optimizations to ContikiMAC. The dozing optimization, on the one hand, significantly reduces the energy consumption under ding-dong ditching. Beyond that, the dozing optimization helps during normal operation, as it reduces the energy consumption of true wake-ups, too. The secure phase-lock optimization, on the other hand, is a version of ContikiMAC’s phase-lock optimization that resists pulse-delay attacks. Additionally, the secure phase-lock optimization makes ContikiMAC resilient to collision attacks, as well as more energy efficient. We implemented and evaluated both optimizations using the Contiki operating system and OpenMotes.

IoT devices are usually battery-powered and directly connected to the Internet, which makes them vulnerable to so-called path-based denial-of-service (PDoS) attacks. In a PDoS attack, for example, an adversary sends multiple Constrained Application Protocol (CoAP) messages towards an IoT device, causing each IoT device along the path to expend energy on forwarding these messages. Current end-to-end security solutions, such as DTLS or IPsec, fail to prevent such attacks since they only filter out inauthentic CoAP messages at their destination. This demonstration shows an approach to en-route filtering in which a trusted gateway has all necessary information to check the integrity of a message, decrypt it, and, if necessary, drop it before forwarding it to the constrained mote. Our approach preserves the precious resources of IoT devices in the face of path-based denial-of-service attacks by remote attackers.

Teamwork is an important topic in education. It fosters deep learning and allows educators to assign interesting tasks that would be too complex to be solved by single participants due to the time restrictions defined by the context of a course. Furthermore, today’s jobs require an increasing amount of team skills. On the other hand, teamwork comes with a variety of issues of its own. Particularly in large-scale settings, such as MOOCs, teamwork is challenging. Courses often end with dysfunctional teams due to drop-outs or insufficient matching. The paper at hand presents a set of three tools that we have recently added to our system to enable teamwork in our courses. This toolset consists of the TeamBuilder, a tool to match successful teams based on a variable set of parameters; CollabSpaces, which provide teams with a secluded area to communicate and collaborate within the course context; and a TeamPeerAssessment tool, which makes it possible to assign complex tasks to teams and provides assessment that scales sufficiently for the MOOC context. The presented tools are evaluated in terms of the success rates of the created teams and the workload reduction for the platform’s OPS team, which prepares the courses in accordance with the requirements of the teaching teams. The evaluation is based on the analysis of data collected in five courses conducted on one of our platforms during 2015 and 2016, as well as on interviews with the platform’s OPS team.

Inclusion dependencies form one of the most fundamental classes of integrity constraints. Their importance in classical data management is reinforced by modern applications such as data profiling, data cleaning, entity resolution, and schema matching. Their discovery in an unknown dataset is at the core of any data analysis effort. Therefore, several research approaches have focused on their efficient discovery in a given, static dataset. However, none of these approaches is appropriate for applications on dynamic datasets, such as transactional datasets, scientific applications, and social networks. In these cases, discovery techniques should be able to efficiently update the inclusion dependencies after an update to the dataset, without reprocessing the entire dataset. We present the first approach for incrementally updating unary inclusion dependencies. In particular, our approach is based on the concept of attribute clustering, from which the unary inclusion dependencies are efficiently derivable. We incrementally update the clusters after each update of the dataset. Updating the clusters does not require access to the dataset thanks to special data structures designed to efficiently support the updating process. We perform an exhaustive analysis of our approach by applying it to large datasets with several hundred attributes and more than 116,200,000 tuples. The results show that the incremental discovery significantly reduces the runtime needed by the static discovery: the reduction in runtime is up to 99.9996% for both inserts and deletes.
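The attribute-clustering idea can be sketched compactly: map each distinct value to the cluster of attributes containing it; then A ⊆ B holds iff B appears in every cluster touched by A's values, and an insert only needs to touch one cluster. This is an illustrative reconstruction of the concept described above, not the paper's actual data structures; the table and column names are invented:

```python
# Sketch of deriving unary inclusion dependencies (INDs) from an
# attribute clustering. Illustrative of the concept only, not the
# paper's exact data structures.
from collections import defaultdict

def attribute_clusters(tables):
    """tables: {attribute_name: iterable of values}.
    Returns {value: set of attributes containing that value}."""
    clusters = defaultdict(set)
    for attr, values in tables.items():
        for v in values:
            clusters[v].add(attr)
    return clusters

def unary_inds(tables):
    """A ⊆ B iff B occurs in every cluster that contains A."""
    clusters = attribute_clusters(tables)
    inds = []
    for a in tables:
        candidates = None
        for v in tables[a]:
            candidates = (clusters[v] if candidates is None
                          else candidates & clusters[v])
        for b in (candidates or set()) - {a}:
            inds.append((a, b))  # values(a) ⊆ values(b)
    return sorted(inds)

def insert_value(clusters, attr, value):
    """Incremental update: only the one touched cluster changes, so the
    base dataset never needs to be rescanned."""
    clusters[value].add(attr)

tables = {
    "orders.customer_id": [1, 2, 3],
    "customers.id":       [1, 2, 3, 4],
}
print(unary_inds(tables))  # [('orders.customer_id', 'customers.id')]
```

Deletions are the harder direction in practice, since removing a value may re-enable an IND candidate, which is why supporting data structures beyond this sketch are needed.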

Micro-grids offer a cost-effective approach to providing reliable power supply in isolated and disadvantaged communities. These communities present a special case where access to national power networks is either non-existent or intermittent due to load-shedding to provision urban areas and/or due to high interconnection costs. By necessity, such micro-grids rely on renewable energy sources that are variable and so only partly predictable. Ensuring reliable power provisioning and billing must therefore be supported by demand management and fair-billing policies. Furthermore, since trusted centralized grid management is not always possible, using a distributed model offers a viable solution approach. However, such a distributed system may be subject to subversion attacks aimed at power theft. In this paper, we present a novel and innovative distributed architecture for power distribution and billing on micro-grids. The architecture is designed to operate efficiently over a lossy communication network, which is an advantage for disadvantaged communities. Since lossy networks are undependable, differentiating system failures from adversarial manipulations is important because grid stability is to a large extent dependent on user participation. To this end, we provide a characterization of potential adversarial models to underline how these can be differentiated from failures.

Amirkhanyan, A., Meinel, C.: Analysis of data from the Twitter account of the Berlin Police for public safety awareness. 2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD). pp. 209-214 (2017).

Many datasets change over time. As a consequence, long-running applications that cache and repeatedly use query results obtained from a SPARQL endpoint may resubmit the queries regularly to ensure up-to-dateness of the results. While this approach may be feasible if the number of such regular refresh queries is manageable, with an increasing number of applications adopting it, the SPARQL endpoint may become overloaded with refresh queries. A more scalable approach would be to use a middleware component at which the applications register their queries and get notified with updated query results once the results have changed. This middleware can then schedule the repeated execution of the refresh queries without overloading the endpoint. In this paper, we study the problem of scheduling refresh queries for a large number of registered queries, assuming an overload-avoiding upper bound on the length of the regular time slot available for refresh queries. We investigate a variety of scheduling strategies and compare them experimentally in terms of the time slots needed before they recognize changes and the number of changes they miss.
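One simple slot-bounded strategy could look like the following greedy sketch (illustrative only; the `schedule` helper and its cost/score inputs are assumptions, not one of the strategies evaluated in the paper): rank registered queries by estimated change score per unit of execution cost, then select queries until the slot budget is exhausted.

```python
def schedule(queries, slot_len):
    """Pick refresh queries for the next time slot.
    queries: list of (query_id, cost, change_score) tuples, where a
    higher change_score means the cached result is more likely stale.
    Greedy selection by score/cost under the slot-length budget."""
    ranked = sorted(queries, key=lambda q: q[2] / q[1], reverse=True)
    chosen, used = [], 0.0
    for qid, cost, score in ranked:
        if used + cost <= slot_len:
            chosen.append(qid)
            used += cost
    return chosen
```

Queries left out of a slot simply compete again in the next one, so every query is eventually refreshed.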

Nowadays, social networks are an essential part of modern life. People post everything that happens to them and around them. The amount of data produced by social networks increases dramatically every year, and users more often post geo-tagged messages. This opens new possibilities for the visualization and analysis of social data, since we can be interested not only in the content of a message but also in the location from which it was posted. We aim to use public data from location-based social networks to improve situational awareness. In this paper, we present our approach for handling geodata from Twitter in real time and providing advanced methods for visualization, analysis, search and statistics, in order to improve situational awareness.

There is significant, unexploited potential to improve patients' engagement in psychotherapy treatment through technology use. We developed Tele-Board MED (TBM), a digital tool to support documentation and patient-provider collaboration in medical encounters. Our objective is the evaluation of TBM's practical effects on patient-provider relationships and patient empowerment in the domain of talk-based mental health interventions. We tested TBM in individual therapy sessions at a psychiatric ward using action research methods. The qualitative results, in the form of therapist observations and patient stories, show an increased acceptance of diagnoses and patient-therapist bonding. We compare the observed effects to patient-provider relationship and patient empowerment models. We conclude that the functions of TBM – namely that notes are shared and cooperatively taken with the patient, that diagnostics and treatment procedures are depicted via visuals and in plain language, and that patients get a copy of their file – lead to increased patient engagement and improved collaboration, communication and integration in consultations.

Links are the key enabler for the retrieval of related information on the Web of Data. Currently, DBpedia is one of the central interlinking hubs in the Linked Open Data (LOD) cloud. With over 28 million described and localized things, it is one of the largest open datasets. With the increasing number of linked datasets, there is a need for proper maintenance of these links. In this paper, we describe the DBpedia Links repository, which maintains linksets between DBpedia and other LOD datasets. We describe the system for the maintenance, update and quality assurance of the linksets.

This joint volume of proceedings gathers together papers from the 2nd Workshop on Managing the Evolution and Preservation of the Data Web (MEPDaW) and the 3rd Workshop on Linked Data Quality (LDQ), held on 30 May 2016 during the 13th ESWC conference in Anissaras, Crete, Greece.

The handwritten signature is widely employed and accepted as a proof of a person's identity. In our everyday life, it is often verified manually, yet only casually. As a result, the need for automatic signature verification arises. In this paper, we propose a new approach to the writer independent verification of offline signatures. Our approach, named Signature Embedding, is based on deep metric learning. Comparing triplets of two genuine and one forged signature, our system learns to embed signatures into a high-dimensional space, in which the Euclidean distance functions as a metric of their similarity. Our system ranks best in nearly all evaluation metrics from the ICDAR SigWiComp 2013 challenge. The evaluation shows a high generality of our system: being trained exclusively on Latin script signatures, it outperforms the other systems even for signatures in Japanese script.
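The triplet objective behind an embedding system like this can be sketched as follows (a minimal illustration, assuming embedding vectors are already computed; the `triplet_loss` helper and the margin value are not from the paper):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss over embedded signatures: pull two genuine
    signatures (anchor, positive) together and push a forgery
    (negative) at least `margin` further away in Euclidean space."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

A loss of zero means the forgery is already sufficiently far from the genuine pair; otherwise the network weights would be updated to reduce the loss.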

Most Online Social Networks (OSNs) implement privacy policies that enable users to protect their sensitive information against privacy violations. However, observations indicate that users find these privacy policies cumbersome and difficult to configure. Consequently, various approaches have been proposed to assist users with privacy policy configuration. These approaches are, however, limited to protecting either only profile attributes or only user-generated content. This is problematic because both profile attributes and user-generated content can contain sensitive information; protecting one without the other can still result in privacy violations. A further drawback of existing approaches is that most require considerable user input, which is time-consuming and inefficient in terms of privacy policy configuration. To address these problems, we propose an automated privacy policy recommender system. The system relies on the expertise of existing OSN users, in addition to the target user's privacy policy history, to provide him/her with personalized privacy policy suggestions for profile attributes as well as user-generated content. Results from our prototype implementation indicate that the proposed recommender system provides accurate privacy policy suggestions with minimal user input.

In recent years, named entity linking (NEL) tools were primarily developed as general approaches, whereas today numerous tools focus on specific domains, such as the mapping of persons and organizations only, or the annotation of locations or events in microposts. However, the available benchmark datasets used for the evaluation of NEL tools do not reflect this specialization trend. We have analyzed the evaluation process applied in the NEL benchmarking framework GERBIL and its benchmark datasets. Based on these insights, we extend the GERBIL framework to enable a more fine-grained evaluation and an in-depth analysis of the used benchmark datasets according to different emphases. In this paper, we present the implementation of an adaptive filter for arbitrary entities, as well as a system to automatically measure benchmark dataset properties such as the extent of content-related ambiguity and diversity. The implementation, as well as a result visualization, is integrated in the publicly available GERBIL framework.

The practice of rejecting injected and replayed 802.15.4 frames only after they have been received leaves 802.15.4 nodes vulnerable to broadcast and droplet attacks. In these attacks, an attacker injects or replays plenty of 802.15.4 frames; as a result, victim nodes stay in receive mode for extended periods of time and expend their limited energy. He et al. considered embedding one-time passwords in the synchronization headers of 802.15.4 frames so that nodes can reject injected and replayed frames at the start of reception. However, He et al.'s proposal, as well as similar ones, lacks support for broadcast frames and depends on special hardware. In this paper, we propose Practical On-the-fly Rejection (POTR) to reject injected and replayed 802.15.4 frames early during receipt. Unlike previous proposals, POTR supports broadcast frames and can be implemented with many off-the-shelf 802.15.4 transceivers. In fact, we implemented POTR with CC2538 transceivers and integrated it into the Contiki operating system. Furthermore, we demonstrate that, compared to using no defense, POTR reduces the time that nodes stay in receive mode upon receiving an injected or replayed frame by a factor of up to 16. Beyond that, POTR has a small processing and memory overhead, and incurs no communication overhead.
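The general one-time-password idea can be sketched in a few lines (purely illustrative of OTP-based early rejection; the helper names are hypothetical, and POTR's actual construction is hardware-oriented and differs from this): each frame carries a short password derived from a shared key and a frame counter, and a receiver drops any frame whose password does not match the expected counter.

```python
import hashlib

def next_otp(key, counter):
    """Derive a short per-frame one-time password from a shared
    key and a frame counter (truncated SHA-256 for illustration)."""
    return hashlib.sha256(key + counter.to_bytes(4, "big")).digest()[:2]

def accept_frame(key, expected_counter, received_otp):
    # Reject early if the OTP does not match the expected counter,
    # so injected or replayed frames never occupy the receiver.
    return received_otp == next_otp(key, expected_counter)
```

A replayed frame carries the OTP of an old counter value and is therefore rejected immediately.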

Analyzing data is a cost-intensive process, particularly for organizations lacking the necessary in-house human and computational capital. Data analytics outsourcing offers a cost-effective solution, but data sensitivity and query response time requirements make data protection a necessary pre-processing step. For performance and privacy reasons, anonymization is preferred over encryption. Yet manual anonymization is time-intensive and error-prone, and automated anonymization, while a better alternative, requires satisfying the conflicting objectives of utility and privacy. In this paper, we present an automated anonymization scheme that extends the standard k-anonymization and l-diversity algorithms to satisfy the dual objectives of data utility and privacy. We use a multi-objective optimization scheme that employs a weighting mechanism to minimize information loss and maximize privacy. Our results show that automating l-diversity adds an average information loss of 7 % over automated k-anonymization, but achieves a diversity of 9–14 %, compared to 10–30 % in k-anonymized datasets. The lesson that emerges is that automated l-diversity offers better privacy than k-anonymization, with negligible information loss.
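The underlying k-anonymity property can be checked with a short sketch (an illustrative verifier, not the paper's optimization scheme; the `is_k_anonymous` helper and record layout are assumptions): every combination of quasi-identifier values must occur in at least k records.

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """Check k-anonymity: each quasi-identifier value combination
    must appear in at least k records, so no individual can be
    singled out within a group smaller than k."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())
```

l-diversity additionally requires that each such group contains at least l distinct values of the sensitive attribute.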

Inclusion dependencies (INDs) within and across databases are an important relationship for many applications in data integration, schema (re-)design, integrity checking, and query optimization. Existing techniques for detecting all INDs need to generate IND candidates and test their validity in the given data instance. However, the major disadvantage of this approach is the exponentially growing number of data accesses, in terms of both the number of SQL queries and I/O operations. We introduce Mind2, a new approach for detecting n-ary INDs (n > 1) without any candidate generation. Mind2 implements a new characterization of the maximum INDs that we develop in this paper. This characterization is based on set operations defined on certain metadata, which Mind2 generates by accessing the database only twice per valid unary IND. Thus, Mind2 eliminates the exponential number of data accesses needed by existing approaches. Furthermore, our experiments show that Mind2 is significantly more scalable than hypergraph-based approaches.

The explosive growth of surveillance cameras and their 24/7 recording brings massive amounts of surveillance video data. Therefore, efficiently retrieving the rare but important events inside these videos is an urgent problem. Recently, deep convolutional networks have shown outstanding performance in event recognition on general videos. Hence, we study the characteristics of the surveillance video context and propose a very competitive ConvNet approach for real-time event recognition on surveillance videos. Our approach adopts a two-stream ConvNet to recognize the spatial and temporal information of an action, respectively. In particular, we propose to use fast feature cascades and motion history images as the templates of the spatial and temporal streams. We conducted our experiments on the UCF-ARG and UT-Interaction datasets. The experimental results show that our approach achieves superior recognition accuracy and runs in real time.

In this article, we show that a mutual exclusion protocol supporting continuous double auctioning (CDA) for power trading on computationally constrained micro-grids can be fault-tolerant. Fault tolerance allows the CDA algorithm to operate reliably and contributes to overall grid stability and robustness. Contrary to fault tolerance approaches proposed in the literature, which bypass faulty nodes through a network reconfiguration process, our approach masks crash failures of cluster-head nodes through redundancy. Masking failure of the main node ensures that the dependent cluster nodes hosting trading agents are not isolated from auctioning. A redundant component acts as a backup that takes over if the primary component fails, allowing for some fault tolerance and a graceful degradation of the network. Our proposed fault-tolerant CDA algorithm has a time complexity of O(N) and a check-pointing message complexity of O(W), where N is the number of messages exchanged per critical section and W is the number of check-pointing messages.
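The primary-backup masking idea can be sketched briefly (a simplified illustration under assumed heartbeat semantics, not the paper's full check-pointing protocol; `elect_active` is a hypothetical helper): the first redundant cluster-head node with a fresh heartbeat acts as primary, and a backup takes over transparently when the primary's heartbeats stop.

```python
def elect_active(nodes, heartbeats, now, timeout=3.0):
    """Mask a crash of the cluster-head node via redundancy:
    nodes are ordered by priority; the first node whose last
    heartbeat is within `timeout` seconds becomes the primary."""
    for node in nodes:
        if now - heartbeats.get(node, float("-inf")) <= timeout:
            return node
    return None  # all replicas crashed
```

Because the backup already holds check-pointed state, trading agents attached to the cluster continue auctioning without reconfiguration.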

With the significant increase in surveillance cameras, the amount of surveillance video is growing rapidly. Thus, automatically and efficiently recognizing semantic actions and events in surveillance videos becomes an important problem. In this paper, we investigate state-of-the-art Deep Learning (DL) approaches for human action recognition and propose an improved two-stream ConvNet architecture for this task. In particular, we propose to use the Motion History Image (MHI) as the motion representation for training the temporal ConvNet, which achieves impressive results in both accuracy and recognition speed. In our experiments, we conducted an in-depth study of important network options and compared our approach to the latest deep networks for action recognition. The detailed evaluation results show the superior ability of our proposed approach, which achieves state-of-the-art performance in the surveillance video context.
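The Motion History Image itself is a standard construction and can be sketched as follows (the decay and threshold values here are illustrative, not the paper's settings): pixels that changed in the most recent frame receive the maximum value, and older motion fades out linearly.

```python
import numpy as np

def motion_history_image(frames, tau=10, thresh=30):
    """Compute a Motion History Image over a sequence of grayscale
    frames: pixels with recent motion get the value tau; motion
    observed earlier decays by 1 per subsequent frame."""
    mhi = np.zeros_like(frames[0], dtype=np.float32)
    for prev, cur in zip(frames, frames[1:]):
        motion = np.abs(cur.astype(np.int16) - prev.astype(np.int16)) > thresh
        mhi = np.where(motion, tau, np.maximum(mhi - 1, 0))
    return mhi
```

The resulting single image encodes where and how recently motion occurred, which is why it is a compact input for a temporal ConvNet stream.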

Deep Convolutional Neural Networks (CNNs) have recently been shown to outperform previous state-of-the-art approaches for image classification. Their success must in part be attributed to the availability of large labeled training sets such as those provided by the ImageNet benchmarking initiative. When training data is scarce, however, CNNs have proven to fail to learn descriptive features. Recent research shows that supervised pre-training on external data followed by domain-specific fine-tuning yields a significant performance boost when external data and target domain show similar visual characteristics. Transfer learning from a base task to a highly dissimilar target task, however, has not yet been fully investigated. In this paper, we analyze the performance of different feature representations for the classification of paintings into art epochs. Specifically, we evaluate the impact of training set sizes on CNNs trained with and without external data and compare the obtained models to linear models based on Improved Fisher Encodings. Our results underline the superior performance of fine-tuned CNNs but likewise suggest Fisher Encodings in scenarios where training data is limited.

“Internetworking with TCP/IP” is a massive open online course (MOOC) provided by the Germany-based MOOC platform “openHPI”, which has been offered in German, English and – recently – Chinese, with similar content. In this paper, the authors, who worked jointly as teacher and teaching assistants in this course, share their insights derived from daily teaching experiences, analysis of the statistics, comparison of performance across the different language offerings, and feedback from user questionnaires. Additionally, the motivation, attempts and suggestions regarding MOOC localization are discussed.

In this paper, we consider the Continuous Double Auction (CDA) scheme as a comprehensive power resource allocation approach for micro-grids. Users of CDA schemes are typically self-interested and work to maximize self-profit. Meanwhile, security in CDAs has received limited attention, with little to no theoretical or experimental evidence demonstrating how an adversary cheats to gain excess energy or derive economic benefits. We identify two forms of cheating, realized by changing the trading agent (TA) strategy of some of the agents in a homogeneous CDA scheme. In one case, an adversary gains control and degrades other trading agents' strategies to gain more surplus, while in the other, K colluding trading agents employ an automated, coordinated approach to changing their TA strategies to maximize surplus power gains. We propose an exception handling mechanism that makes use of allocative efficiency and message overheads to detect and mitigate these forms of cheating.

In a video-recorded university class, students have to watch several hours of video content, which can easily add up to several days of material over a semester. Naturally, not all 90 minutes of a typical lecture are relevant for the exam, so when the semester ends with a final exam, students have to study the important parts of all the lectures more intensively. To simplify the learning process and make it more efficient, we have introduced the Couch Learning Mode in our lecture video archive. With this approach, students can create custom playlists from the video lecture archive, with a time frame for every selected video, and then lean back and watch all relevant video parts consecutively for the exam without being interrupted. Additionally, students can share their playlists with other students or use the video search to watch all relevant lecture videos about a topic. Our approach uses playlists and HTML5 technologies to realize the consecutive video playback, and the powerful Lecture Butler search engine to find worthwhile video parts for certain topics. Our evaluation shows that students are more satisfied when using manual playlist creation to view the relevant parts for an exam, and that they are keen on watching the top search results showing relevant parts of lectures for a topic of interest. The Couch Learning Mode supports and motivates students to learn with video lectures for an exam and in daily life.

Efforts towards improving security in cloud infrastructures recommend regulatory compliance approaches such as HIPAA and PCI DSS. Similarly, vulnerability assessments are imperative for fulfilling these regulatory compliance requirements. Nevertheless, conducting vulnerability assessments in cloud environments requires approaches different from those found in traditional computing: factors such as multi-tenancy, elasticity, self-service and cloud-specific vulnerabilities must be considered. Furthermore, the Anything-as-a-Service model of the cloud stimulates security automation and user-intuitive services. In this paper, we tackle the challenge of efficient vulnerability assessments at the system level, in particular for core cloud applications. Within this scope, we focus on the use case of a cloud administrator. We believe the security of the underlying cloud software is crucial to the overall health of a cloud infrastructure, since it is the foundation upon which other applications within the cloud function. We demonstrate our approach using OpenStack and, through our experiments, show that our prototype implementation is effective at identifying OpenStack-native vulnerabilities. We also automate the process of identifying insecure configurations in the cloud and initiate steps for deploying Vulnerability Assessment-as-a-Service in OpenStack.

We present a physical attestation and authentication approach to detecting cheating in resource-constrained smart micro-grids. A multi-user smart micro-grid (SMG) architecture supported by a low-cost and unreliable communications network forms our application scenario. In this scenario, a malicious adversary can cheat by manipulating the measured power consumption/generation data; the reward is access to more than the per-user allocated power quota. Cheating discourages user participation and results in grid destabilisation and, in the worst case, a breakdown of the grid. Detecting cheating attacks is thus essential for a secure and resilient SMG, but also a challenging problem. We develop a cheating detection scheme that integrates the idea of physical attestation to assess whether the SMG system is under attack. Subsequently, we support our scheme with an authentication mechanism based on control signals to uniquely identify node subversion. A theoretical analysis demonstrates the efficiency and correctness of our proposed scheme for constrained SMGs.

The selection of initial points, the number of clusters and finding proper cluster centers are still the main challenges in clustering processes. In this paper, we suggest a genetic-algorithm-based method that searches several solution spaces simultaneously. The solution spaces are population groups consisting of elements with similar structure: elements in a group have the same size, while elements in different groups have different sizes. The proposed algorithm processes the population in groups of chromosomes with one gene, two genes, up to k genes, where the genes hold the corresponding information about the cluster centers. In the proposed method, the crossover and mutation operators can accept parents of different sizes, which can lead to versatility in the population and information transfer among sub-populations. We implemented the proposed method and evaluated its performance against several random datasets as well as the Ruspini dataset. The experimental results show that the proposed method can effectively determine the appropriate number of clusters and recognize their centers. Overall, this research implies that using a heterogeneous population in the genetic algorithm can lead to better results.
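A crossover operator that accepts parents of different lengths can be sketched as follows (an illustrative one-point variant; the paper's exact operator may differ, and the `crossover` helper is an assumption). Each chromosome is a list of cluster centers, and cutting the two parents at independent points yields children whose lengths can differ from both parents, which is how genetic material moves between sub-populations of different sizes.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """One-point crossover between chromosomes of different lengths.
    Each chromosome is a list of cluster centers (genes); the cut
    points are chosen independently per parent."""
    cut_a = rng.randrange(1, len(parent_a) + 1)
    cut_b = rng.randrange(1, len(parent_b) + 1)
    child_a = parent_a[:cut_a] + parent_b[cut_b:]
    child_b = parent_b[:cut_b] + parent_a[cut_a:]
    return child_a, child_b
```

Together the two children always carry every gene of both parents, so no center information is lost in a single crossover step.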

In the world of football, performance analytics about a player's skill level and the overall tactics of a match support the success of a team. These analytics are based on positional data on the one hand and events about the game on the other. The positional data of the ball and players is tracked automatically by cameras or via sensors. However, the events are still captured manually by humans, which is time-consuming and error-prone. Therefore, this paper introduces an approach to detect events based on the positional data of football matches. We trained and aggregated the machine learning algorithms Support Vector Machine, K-Nearest Neighbours and Random Forest on features calculated from the positional data. We evaluated the quality of our approach by comparing the recall and precision of the results. This allows an assessment of how event detection in football matches can be improved by automating the process based on spatio-temporal data. We discovered that it is possible to detect football events from positional data. Nevertheless, the choice of a specific algorithm has a strong influence on the quality of the predicted results.
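The kind of feature computed from positional data can be illustrated with a small sketch (the `pass_features` helper and the specific features are assumptions, not the paper's feature set): from consecutive ball positions sampled at fixed intervals, one can derive per-step speed and direction change, which an event classifier could consume.

```python
import math

def pass_features(ball_track):
    """Derive simple spatio-temporal features from a list of (x, y)
    ball positions: per-step speed and turn angle between
    consecutive movement vectors."""
    feats = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(ball_track, ball_track[1:], ball_track[2:]):
        v1 = (x1 - x0, y1 - y0)
        v2 = (x2 - x1, y2 - y1)
        speed = math.hypot(*v2)
        turn = math.atan2(v2[1], v2[0]) - math.atan2(v1[1], v1[0])
        feats.append((speed, turn))
    return feats
```

A sudden spike in speed combined with a sharp turn, for instance, is a plausible indicator of a pass or shot event.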

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Various techniques of artificial intelligence have already been used for intrusion detection; the main challenge in this area is the running speed of the available implementations. In this research work, we present a hybrid approach based on linear discriminant analysis and the extreme learning machine to build a tool for intrusion detection. In the proposed method, linear discriminant analysis is used to reduce the dimensions of the data, and the extreme learning machine neural network is used for data classification. This allows us to benefit from the advantages of both methods. We implemented the proposed method on a computer with a Core i5 1.6 GHz processor using a machine learning toolbox. To evaluate its performance, we ran it on a comprehensive intrusion detection data set: the KDD data set, a version of the DARPA data set presented by MIT Lincoln Labs. The experimental results, organized in tables and charts, show meaningful improvements in intrusion detection. In general, compared to existing methods, the proposed approach works faster with higher accuracy.
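The classification stage can be sketched with a minimal extreme learning machine (an illustrative NumPy sketch under assumed hyperparameters; the paper uses a toolbox implementation, and the LDA reduction step would precede this): the hidden layer weights are random and fixed, and only the output weights are solved, by least squares, which is what makes ELM training fast.

```python
import numpy as np

def train_elm(X, y, hidden=50, rng=None):
    """Minimal Extreme Learning Machine: a fixed random hidden
    layer followed by output weights fit via least squares."""
    rng = rng or np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], hidden))  # random, never trained
    H = np.tanh(X @ W)                             # hidden activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # closed-form output fit
    return W, beta

def predict_elm(X, W, beta):
    return np.tanh(X @ W) @ beta
```

Because training reduces to one matrix factorization instead of iterative backpropagation, this design directly targets the running-speed concern raised above.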

Prototypes help people to externalize their ideas and are a basic element for gathering feedback on an early product design. Prototyping is oftentimes a team-based method traditionally involving physical and analog tools. At the same time, collaboration among geographically dispersed team members is becoming standard practice for companies and research teams. Therefore, a growing need arises for collaborative prototyping environments. We present a standards-compliant, web-browser-based real-time remote 3D modeling system. We utilize the cross-platform WebGL rendering API for hardware-accelerated visualization of 3D models. Synchronization relies on WebSocket-based message interchange over a centralized Node.js real-time collaboration server. In a first co-located user test, participants were able to rebuild physical prototypes without prior knowledge of the system. This way, the provided system design and its implementation can serve as a basis for visual real-time collaboration systems available across a multitude of hardware devices.

During the last years, e-learning has become more and more important. There are several approaches, such as teleteaching or MOOCs, to deliver knowledge to students on different topics. However, a major problem of most learning platforms is that students often lose motivation quickly. This is caused, for example, by solving similar tasks again and again and by learning alone at a personal computer. One possible way to avoid this situation in coding-based courses is the use of embedded devices. This approach increases the practical programming part and should boost student motivation. This paper presents a way to use embedded systems with an LED panel to motivate students to practice programming languages and complete a course successfully. To analyze the success of this approach, it was tested within a MOOC called "Java for beginners" with 11,712 participants. The result was evaluated through personal feedback from the students, and user data was analyzed to measure the acceptance and motivation of students solving the embedded system tasks. The results show that the approach is well accepted by the students and that they are more motivated by tasks with real hardware support.

Earlier research shows that using an embedded LED system motivates students to learn programming languages in massive open online courses (MOOCs) efficiently. Since this earlier approach was very successful, the system should be improved to increase the learning experience for students during programming exercises. The problem of the current system is that only a static image is shown on the LED matrix, controlled by the students' array programming on the embedded system. The idea of this paper is to change this static behavior into a dynamic display of information on the LED matrix by using sensors connected to the embedded system. For this approach, a light sensor and a temperature sensor are connected to an analog-to-digital converter (ADC) port of the embedded system. These sensors' values can be read by the students to compute the correct output for the LED matrix. The result is captured and sent back to the students for direct feedback. Furthermore, unit tests can be used to automatically evaluate the programming results. The system was evaluated during a MOOC about web technologies using JavaScript. Evaluation results are taken from the students' feedback and an analysis of the students' code executions on the system. The positive feedback and the evaluation of the students' executions, which shows a higher number of code executions compared to standard programming tasks, together with the fact that students solving these tasks achieve overall better course results, highlight the advantage of the approach. Based on these evaluation results, this approach should be used in e-learning, e.g., in MOOCs teaching programming languages, to increase the learning experience and motivate students to learn programming.
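The kind of exercise described, mapping a sensor reading onto the LED matrix, can be sketched as a pure function (illustrative only; the `led_bar` helper, the value ranges and the matrix layout are assumptions, and no real hardware API is used): the ADC value is normalized and rendered as a bar graph of lit rows.

```python
def led_bar(sensor_value, v_min, v_max, rows=8, cols=8):
    """Map an ADC sensor reading onto an LED-matrix bar graph:
    the number of lit rows (from the bottom) is proportional to
    where the value sits in the [v_min, v_max] range."""
    frac = max(0.0, min(1.0, (sensor_value - v_min) / (v_max - v_min)))
    lit = round(frac * rows)
    return [[1 if rows - r <= lit else 0 for _ in range(cols)] for r in range(rows)]
```

A pure function like this is also what makes the automatic unit-test evaluation mentioned above straightforward: the expected matrix for a given sensor value is deterministic.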

Renz, J., Schwerer, F., Meinel, C.: openSAP: Evaluating xMOOC Usage and Challenges for Scalable and Open Enterprise Education. Proceedings of the Eighth International Conference on E-Learning in the Workplace (2016).

Smart micro-grid architectures are small-scale electricity provision networks composed of individual electricity providers and consumers. Supporting micro-grids with computationally limited devices is a cost-effective approach to service provisioning in resource-limited settings. However, the limited availability of real-time measurements and the unreliable communication network make the use of Advanced Metering Infrastructure (AMI) for monitoring and control a challenging problem. Grid operation and stability are therefore reliant on inaccurate and incomplete information. Consequently, data gathering and analytics raise privacy concerns for grid users, which is undesirable. In this paper, we study adversarial scenarios for privacy violations on micro-grids. We consider two types of privacy threats in constrained micro-grids, namely inferential and aggregation attacks, because both capture scenarios that can be used to provoke energy theft and destabilize the grid. Grid destabilization leads to distrust between suppliers and consumers. This work provides a roadmap towards secure and resilient smart micro-grid energy networks.

Massive Open Online Courses (MOOCs) have revolutionized higher education by offering university-like courses to a large number of learners via the Internet. The paper at hand takes a closer look at peer assessment as a tool for delivering individualized feedback and engaging assignments to MOOC participants. Benefits, such as scalability for MOOCs and higher-order learning, and challenges, such as grading accuracy and rogue reviewers, are described. Common practices and the state of the art in counteracting these challenges are highlighted. Based on this research, the paper describes a peer assessment workflow and its implementation on the openHPI and openSAP MOOC platforms. This workflow combines the best practices of existing peer assessment tools and introduces some small but crucial improvements.

There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving semantic Web data. As novel archiving systems are starting to address this challenge, foundations and standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries for evaluating emerging RDF archiving systems. We then instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on this data. Our work comprises, to the best of our knowledge, the first benchmark for querying evolving RDF data archives.

Decentralized Continuous Double Auctioning (CDA) offers a flexible marketing approach to power distribution in resource-constrained (RC) smart micro-grids. Grid participants (buyers and sellers) can obtain power at a suitable price during both on- and off-peak periods. Decentralized CDA schemes are, however, vulnerable to two attacks, namely 'Victim Strategy Downgrade' and 'Collusion'. Both attacks foil the CDA scheme by allowing an individual to gain surplus energy, which leads to low allocative efficiency and is undesirable for maintaining grid stability and reliability. In this paper, we propose a novel scheme to circumvent power auction cheating attacks. Our scheme employs an exception handling mechanism with cheating detection and resolution algorithms. Our correctness and complexity analysis demonstrates that the solution is both sound and performance-efficient under resource-constrained conditions.

This paper reports on a research project in progress. Each year many software vulnerabilities are discovered and reported. These vulnerabilities can lead to system exploitation and, consequently, financial and information losses. Soon after vulnerabilities are detected, requests for solutions arise. Usually it takes some time and effort until an effective solution is provided. It is therefore very desirable to have an automated vulnerability solution predictor. In this paper we introduce an effective approach for building such a predictive system. In the first step, using text mining techniques, we extract features from the available textual data concerning vulnerabilities. Because of the pattern of overlap between different categories of vulnerabilities and their solutions, we found overlapping clustering to be the most suitable method for clustering them. After that, we determine the relationships among the obtained clusters. In the last step, we use machine learning methods to construct the requested solution predictor. Our approach proposes an automated quick workaround solution: with workaround solutions, users do not need to wait for a patch or a new software version, but instead bypass the problem caused by the vulnerability, with additional effort, to avoid its damage.

In this paper we propose an approach to predicting punctuation marks for unsegmented speech transcripts. The approach is purely lexical, with pre-trained word vectors as the only input. A Deep Neural Network (DNN) or Convolutional Neural Network (CNN) model is trained to classify whether a punctuation mark should be inserted after the third word of a five-word sequence, and which kind of punctuation mark it should be. TED talks from the IWSLT dataset are used in both the training and evaluation phases. The proposed approach shows its effectiveness by achieving better results than the state-of-the-art lexical solution that works with the same type of data, especially when predicting punctuation position only.
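The sliding-window formulation can be sketched as follows; the window padding, label set and toy classifier below are illustrative assumptions, with the trained DNN/CNN abstracted to a `classify` callback:

```python
# Hypothetical sketch: decide whether a punctuation mark follows the
# third (middle) word of each five-word window of the transcript.
PUNCTUATION_CLASSES = ["NONE", "COMMA", "PERIOD", "QUESTION"]

def windows(words, size=5):
    """Yield one five-word window per word; pad the edges so every
    word appears once as the middle element."""
    padded = ["<PAD>"] * 2 + words + ["<PAD>"] * 2
    for i in range(len(words)):
        yield padded[i:i + size]

def predict_punctuation(words, classify):
    """`classify` stands in for the trained DNN/CNN: it maps a
    five-word window to one of PUNCTUATION_CLASSES."""
    marks = {"COMMA": ",", "PERIOD": ".", "QUESTION": "?"}
    out = []
    for word, window in zip(words, windows(words)):
        out.append(word)
        label = classify(window)
        if label in marks:
            out.append(marks[label])
    text = " ".join(out)
    for m in marks.values():            # no space before punctuation
        text = text.replace(" " + m, m)
    return text

# Toy rule standing in for the neural classifier:
demo = predict_punctuation(
    "how are you today I am fine".split(),
    lambda w: "QUESTION" if w[2] == "today" else "NONE",
)
```

Here the real model's posterior over the label set would replace the lambda; the windowing and insertion logic stay the same.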

In this paper we propose a solution that detects sentence boundaries in speech transcripts. First we train a purely lexical model with a deep neural network, which takes word vectors as the only input feature. Then a simple acoustic model is also prepared. Because the models work independently, they can be trained on different data. In the next step, the posterior probabilities of both the lexical and the acoustic model are combined in a heuristic two-stage joint decision scheme to classify sentence boundary positions. This approach ensures that the models can be updated or swapped freely in actual use. Evaluation on TED Talks shows that the proposed lexical model achieves good results: 75.5% accuracy on error-prone ASR transcripts and 82.4% on error-free manual references. The joint decision scheme can further improve accuracy by 3-10% when acoustic data is available.
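A hedged sketch of such a two-stage joint decision (the thresholds and the averaging rule are illustrative assumptions; the paper's heuristic may differ):

```python
# Stage 1: trust the lexical model when its posterior is confident.
# Stage 2: in the uncertain band, consult the acoustic posterior.
def joint_boundary_decision(p_lex, p_ac=None, hi=0.8, lo=0.2):
    """Return True iff a sentence boundary is predicted.
    p_lex / p_ac are the lexical and acoustic boundary posteriors."""
    if p_lex >= hi:                 # confidently a boundary
        return True
    if p_lex <= lo:                 # confidently not a boundary
        return False
    if p_ac is None:                # no acoustic data: lexical fallback
        return p_lex >= 0.5
    return (p_lex + p_ac) / 2 >= 0.5   # average the two posteriors
```

Because the scheme only consumes posteriors, either model can be retrained or replaced without touching the decision logic, which is the independence property the abstract emphasizes.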

Programming tasks are an important part of teaching computer programming, as they help students develop essential programming skills and techniques through practice. The design of educational problems plays a crucial role in the extent to which experiential knowledge is imparted to the learner, both in terms of quality and quantity. Badly designed tasks have been known to put students off practicing programming. Hence, there is a need for carefully designed problems. Cellular Automata programming lends itself as a very suitable source of problems for programming practice. In this paper we describe how various types of problems can be designed using concepts from Cellular Automata and discuss the features which make them good practice problems with regard to instructional pedagogy. We also present a case study on a Cellular Automata programming exercise used in a MOOC on Test-Driven Development using JUnit, and discuss the automated evaluation of code submissions and the feedback about the reception of this exercise by participants in this course.
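As an illustration of why Cellular Automata suit automated grading, here is the kind of task a submission might implement: a single step of Conway's Game of Life, whose deterministic output is easy to check with unit tests (this function is an example exercise, not the MOOC's actual task):

```python
# One generation of Conway's Game of Life on a finite 0/1 grid
# (cells outside the grid count as dead).
def life_step(grid):
    rows, cols = len(grid), len(grid[0])

    def neighbours(r, c):
        return sum(
            grid[r + dr][c + dc]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < rows and 0 <= c + dc < cols
        )

    new = []
    for r in range(rows):
        row = []
        for c in range(cols):
            n = neighbours(r, c)
            # Born with exactly 3 neighbours; survives with 2 or 3.
            row.append(1 if n == 3 or (grid[r][c] and n == 2) else 0)
        new.append(row)
    return new

# A "blinker" oscillates between a horizontal and a vertical bar:
blinker = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
```

Known oscillators like the blinker give graders exact expected outputs, which is precisely what makes such exercises attractive for automated evaluation.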

Amirkhanyan, A., Meinel, C.: Analysis of the Value of Public Geotagged Data from Twitter from the Perspective of Providing Situational Awareness. Proceedings of the 15th IFIP Conference on e-Business, e-Services and e-Society (I3E2016) - Social Media: The Good, the Bad, and the Ugly. Springer, Swansea, Wales, UK (2016).

In the era of social networks, we have a huge amount of social geotagged data that reflects the real world. These data can be used to provide or enhance situational and public safety awareness. This can be achieved through the analysis and visualization of geotagged data, which helps to better understand the surrounding situation and to detect local geo-spatial threats. One of the challenges on the way to this goal is providing valuable statistics and advanced methods for filtering data. Therefore, in the scope of this paper, we collect a sufficient amount of public social geotagged data from Twitter, build different valuable statistics and analyze them. We also try to find valuable parameters and propose useful filters based on these parameters that can separate valuable from valueless data and, in this way, support the analysis of geotagged data from the perspective of providing situational awareness.

In this paper we propose a method to evaluate the importance of lecture video segments in online courses. The video is first segmented based on slide transitions. Then we evaluate the importance of each segment based on our analysis of the teacher's focus. This focus is mainly identified by exploring features in the slides and the speech. Since the whole analysis process is based on multimedia materials, it can be completed before the official start of the course. The proposed method is evaluated by setting survey questions and collecting forum statistics in the MOOC "Web Technologies". Both the general trend and the high accuracy of selected key segments (over 70%) prove the effectiveness of the proposed method.

A virtual laboratory is needed for practical, hands-on exercises in e-learning courses. The e-learning system needs to provide a specific laboratory environment for a specific learning unit. A virtual laboratory system with high-requirement learning units struggles to serve a large number of users, because the available hardware resources are limited and the budget to provide more resources is low. The number of e-learning users that simultaneously access the virtual laboratory varies. In this paper, we propose an architecture for a virtual laboratory system that serves a large number of users. A person or a company can contribute hardware resources in a crowdsourcing manner. The system uses a hybrid cloud platform to be able to scale out and scale in rapidly. The architecture can expand by receiving more hardware resources from a person or a company that is willing to contribute. The resources can be anywhere but must be connected to the Internet. For example, if a user has a Virtual Machine (VM) in the cloud or on his own bare-metal system connected to the Internet, he can integrate his VM into the virtual laboratory system. Because the e-learning system is a non-profit system, we assume that some users and companies are willing to contribute. We use the Tele-Lab architecture as the basis for the proposed architecture. Tele-Lab is a virtual laboratory for Internet security e-learning. It uses a private cloud (OpenNebula) to provide VMs and containers that represent hosts in a virtual laboratory. As in Tele-Lab, our architecture consists of a frontend and a backend. The frontend provides an interface to the users. In our architecture, we focus on enabling the backend to provide a virtual laboratory that can serve a large number of users.
In the architecture, we use a middleware to provide communication between a private cloud and a public cloud, as well as communication between the virtual laboratory system and the resources that belong to the crowd. This work is part of the continuous improvement of Tele-Lab to make it more reliable and more scalable. We are heading toward using Tele-Lab in the implementation of Massive Open Online Courses (MOOCs).

In many MOOCs, hands-on exercises are a key component. Their format must be deliberately planned to satisfy the needs of an increasingly heterogeneous student body. At the same time, maintenance and support costs have to be kept low on the course provider's side. The paper at hand reports on our experiments with a tool called Vagrant in this context. It has been successfully employed for use cases similar to ours and thus promises to be an option for achieving our goals.

We have addressed the problems of independent e-lecture learning with an approach involving collaborative learning with lecture recordings. In order to make this type of learning possible, we have prototypically enhanced the video player of a lecture video platform with functionality that allows simultaneous viewing of a lecture on two or more computers. While watching the video, synchronization of the playback and every click event, such as play, pause, seek, and playback speed adjustment can be carried out. We have also added the option of annotating slides. With this approach, it is possible for learners to watch a lecture together, even though they are in different places. In this way, the benefits of collaborative learning can also be used when learning online. Now, it is more likely that learners stay focused on the lecture for a longer time (as the collaboration creates an additional obligation not to leave early and desert a friend). Furthermore, the learning outcome is higher because learners can ask their friends questions and explain things to each other as well as mark important points in the lecture video.

Many datasets change over time. As a consequence, long-running applications that cache and repeatedly use query results obtained from a SPARQL endpoint may resubmit the queries regularly to ensure the results are up to date. While this approach may be feasible if the number of such regular refresh queries is manageable, with an increasing number of applications adopting it, the SPARQL endpoint may become overloaded with refresh queries. A more scalable approach would be a middleware component at which the applications register their queries and get notified with updated query results once the results have changed. This middleware can then schedule the repeated execution of the refresh queries without overloading the endpoint. In this paper, we study the problem of scheduling refresh queries for a large number of registered queries, assuming an overload-avoiding upper bound on the length of the regular time slot available for executing refresh queries. We investigate a variety of scheduling strategies and compare them experimentally in terms of the number of time slots needed before they recognize changes and the number of changes that they miss.
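One strategy from this family can be sketched as follows; the class and its priority heuristic are illustrative, not necessarily one of the paper's evaluated strategies:

```python
import heapq

# Sketch: each slot refreshes at most `slot_capacity` registered queries,
# preferring queries whose results have changed often per check, weighted
# by how long they have gone unchecked.
class RefreshScheduler:
    def __init__(self, queries, slot_capacity):
        self.capacity = slot_capacity
        # Laplace-style prior of one observed change per one check.
        self.stats = {q: {"changes": 1, "checks": 1, "idle": 0}
                      for q in queries}

    def pick_slot(self):
        """Choose the queries to refresh in the next time slot."""
        def priority(q):
            s = self.stats[q]
            return (s["changes"] / s["checks"]) * (s["idle"] + 1)
        chosen = heapq.nlargest(self.capacity, self.stats, key=priority)
        for q in self.stats:          # everyone ages by one slot
            self.stats[q]["idle"] += 1
        return chosen

    def report(self, query, changed):
        """Feed back whether a refreshed query's result had changed."""
        s = self.stats[query]
        s["checks"] += 1
        s["changes"] += int(changed)
        s["idle"] = 0
```

Bounding each slot to `slot_capacity` refresh executions is what keeps the endpoint load constant regardless of how many queries are registered, which is the overload-avoiding constraint the paper assumes.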

Remote collaboration systems are a necessity for geographically dispersed teams working toward a common goal. Real-time groupware systems frequently provide a shared workspace where users interact with shared artifacts. However, a shared workspace is often not enough for maintaining awareness of other users. Video conferencing can create a visual context that simplifies the users' communication and understanding. In addition, flexible working modes and modern communication systems allow users to work at any time and at any location. It is therefore desirable that a groupware system can run on users' everyday devices, such as smartphones and tablets, in the same way as on traditional desktop hardware. We present a standards-compliant, web browser-based real-time remote collaboration system that includes WebRTC-based video conferencing. It allows a full-body video setup where everyone can see what other participants are doing and where they are pointing in the shared workspace. In contrast to WebRTC's standard peer-to-peer architecture, our system implements star-topology WebRTC video conferencing. In this way, our solution reduces each user's network upstream consumption from linear to constant in the number of participants.

Keeping data confidential is a deeply rooted requirement in medical documentation. However, there are increasing calls for patient transparency in medical record documentation. Tele-Board MED is an interactive system being developed for the joint documentation of doctor and patient. This web-based application, designed for digital whiteboards, will be tested in treatment sessions with psychotherapy patients and therapists. In order to ensure the security of patient data, security measures were implemented; they are illustrated in this paper. We followed the major information security objectives: confidentiality, integrity, availability and accountability. Besides technical aspects, such as data encryption, access restriction through firewall and password, and measures for remote maintenance, we also address issues at the organizational and infrastructural levels (e.g., patients' access to notes). With this paper we want to raise awareness of information security and promote a security conception from the very beginning of health software research projects. The measures described here can serve as an example for other health software applications dealing with sensitive patient data, from early user testing phases on.

In this paper we showcase a system for real-time text detection and recognition. We apply deep features created by Convolutional Neural Networks (CNNs) to both the text detection and the word recognition task. For text detection we follow the common localization-verification scheme, which has already shown its excellent ability in numerous previous works. In the text localization stage, textual regions are roughly detected by using an MSER (Maximally Stable Extremal Regions) detector with a high recall rate. False alarms are then eliminated by a CNN classifier, and the remaining text regions are grouped into words. In the word recognition stage, we developed a skeleton-based text binarization method for segmenting text from its background. A CNN-based recognizer is then applied to recognize the characters. Initial experiments show the power of deep features for text classification compared with commonly used visual features. Our current implementation demonstrates real-time performance for recognizing scene text on a standard PC with a webcam.

The detection of vulnerabilities in computer systems and networks, as well as the representation of the results, are crucial problems. The presented method tackles this problem with automated detection and an intuitive representation. For detecting vulnerabilities, the approach uses a logical representation of the preconditions and postconditions of vulnerabilities. Thus an automated analytical function can detect security leaks on a target system. The gathered information is used to provide security advisories and enhanced diagnostics for the system. Additionally, the conditional structure allows us to create attack graphs to visualize the network structure and the integrated vulnerability information. Finally, we propose methods to resolve the identified weaknesses, either by removing or by updating vulnerable applications, and thus secure the target system. These advisories are created automatically and provide possible solutions for the security risks.

Micro-grid architectures based on renewable energy sources offer a viable solution to electricity provision in regions that are not connected to the national power grid or that are severely affected by load shedding. The limited power generated in micro-grids however makes monitoring power consumption an important consideration in guaranteeing efficient and fair energy sharing. A further caveat is that adversarial data tampering poses a major impediment to fair energy sharing on small scale energy systems, like micro-grids, and can result in a complete breakdown of the system. In this paper, we present an innovative approach to monitoring home power consumption in smart micro-grids. This is done by taking into account power consumption measurement on a per appliance and/or device basis. Our approach works by employing a distributed snapshot algorithm to asynchronously collect the power consumption data reported by the appliances and devices. In addition, we provide a characterization of noise that affects the quality of the data making it difficult to differentiate measurement errors and power fluctuations from deliberate attempts to misreport consumption.

In existing cloud brokerage systems, the client does not have the ability to verify the result of the cloud service selection. There is a possibility that the cloud broker is biased in selecting the best Cloud Service Provider (CSP) for a client. A compromised or dishonest cloud broker can unfairly select a CSP for its own advantage by cooperating with the selected CSP. To address this problem, we propose a mechanism to verify the CSP selection result of the cloud broker. In this verification mechanism, the properties of every CSP are also verified. It uses a trusted third party to gather clustering results from the cloud broker. This trusted third party also serves as a base station for collecting CSP properties in a multi-agent system. Software agents are installed and run on every CSP. The CSP is monitored by agents acting as the representative of the customer inside the cloud. These agents report to a third party that must be trusted by CSPs, customers and the cloud broker. The third party provides transparency by publishing reports to the authorized parties (CSPs and customers).

Privacy, security, and trust concerns continue to hinder the growth of cloud computing despite its attractive features. To mitigate these concerns, an emerging approach targets the use of multi-cloud architectures to achieve portability and reduce cost. Multi-cloud architectures, however, suffer from several challenges, including inadequate cross-provider APIs, insufficient support from cloud service providers, and especially non-unified access control mechanisms. Consequently, the available multi-cloud proposals are unwieldy or insecure. This paper makes two contributions. First, we survey existing cloud storage provider interfaces. Then, we propose a novel technique that deals with the challenges of connecting modern authentication standards and multiple cloud authorization methods.

In modern computer systems, multicore processors are prevalent, even on mobile devices. Since JavaScript WebWorkers provide execution parallelism in a web browser, they can help utilize multicore CPUs more effectively. However, WebWorker limitations include a lack of access to the web browser's native XML processing capabilities and the related Document Object Model (DOM). We present a JavaScript DOM and XML processing implementation that adds the missing APIs to WebWorkers. This way, it is possible to use JavaScript code that relies on native APIs within WebWorkers. We show and evaluate the seamless integration of an external XMPP library to enable parallel processing of network data and user input in a web-based real-time remote collaboration system. Evaluation shows that our XML processing solution has the same linear execution-time complexity as its native API counterparts. The proposed JavaScript solution is a general approach to enabling parallel XML data processing within web browser-based applications. By implementing standards-compliant DOM interfaces, our implementation allows existing libraries and applications to leverage the processing power of multicore systems.

An important technique for attack detection in complex company networks is the analysis of log data from various network components. As networks grow, the number of produced log events increases dramatically, sometimes to multiple billion events per day. The analysis of such big data relies heavily on full normalization of the log data in real time. Until now, the important issue of fully normalizing a large number of log events has been handled only insufficiently by many software solutions and is not well covered in existing research work. In this paper, we propose and evaluate multiple approaches for handling the normalization of a large number of typical logs more efficiently. The main idea is to organize the normalization in multiple levels by using a hierarchical knowledge base (KB) of normalization rules. In the end, we achieve a performance gain of about 1000x with our presented approaches, in comparison to a naive approach typically used in existing normalization solutions. With this improvement, big log data can now be handled much faster and can be used to find and mitigate attacks in real time.
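A minimal sketch of the hierarchical idea, assuming an illustrative two-level KB keyed by log source (the sources, rule patterns and event-type names are invented for the example):

```python
import re

# Level 1 of the hierarchy dispatches on the log source; level 2 holds
# only that source's normalization rules. A naive flat KB would test
# every event against every rule in the system.
RULE_KB = {
    "sshd": [
        (re.compile(r"Accepted \w+ for (?P<user>\S+) from (?P<ip>\S+)"),
         "ssh_login_success"),
        (re.compile(r"Failed \w+ for (?P<user>\S+) from (?P<ip>\S+)"),
         "ssh_login_failure"),
    ],
    "kernel": [
        (re.compile(r"Out of memory: Kill process (?P<pid>\d+)"),
         "oom_kill"),
    ],
}

def normalize(source, message):
    """Return (event_type, extracted fields) or None if nothing matches.
    Only the rules filed under `source` are tried -- the hierarchy turns
    an O(all rules) scan into an O(rules per source) one."""
    for pattern, event_type in RULE_KB.get(source, []):
        match = pattern.search(message)
        if match:
            return event_type, match.groupdict()
    return None

event = normalize("sshd", "Failed password for root from 10.0.0.5")
```

With more levels (e.g., source, then event class, then field-extraction rule), the fraction of the KB touched per event shrinks further, which is the intuition behind the reported speedup.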

Ussath, M., Cheng, F., Meinel, C.: Concept for a Security Investigation Framework. Proceedings of the 7th IFIP International Conference on New Technologies, Mobility, and Security (NTMS'15) (2015).

The vast amount of information on the Web poses a challenge when trying to identify the most important facts. Many fact ranking algorithms have emerged; however, thus far there has been a lack of a general-domain, objective gold standard to serve as an evaluation benchmark for comparing such systems. We present FRanCo, a ground truth for fact ranking acquired through crowdsourcing. The corpus is built on a representative DBpedia sample of 541 entities and is made freely available. We have published both the aggregated and the raw data collected, including identified nonsense statements, which contributes to improving data quality in DBpedia.

Inclusion dependencies within and across databases are an important relationship for many applications, including anomaly detection, schema (re-)design, query optimization and data integration. When such dependencies are not available as explicit metadata, scalable and efficient algorithms have to discover them from a given data instance. We introduce a new idea for clustering the attributes of database relations. Based on this idea we have developed S-indd, an efficient and scalable algorithm for discovering all unary inclusion dependencies in large datasets. S-indd is scalable both in the number of attributes and in the number of rows. We show that previous approaches reveal themselves as special cases of S-indd. We exhaustively evaluate S-indd's scalability using many datasets with several thousand attributes and up to one million rows. The experiments show that S-indd is up to 11x faster than previous approaches.
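As a point of reference, the problem S-indd solves can be stated with a naive quadratic baseline (this is not the S-indd algorithm, which avoids exactly this all-pairs comparison via attribute clustering):

```python
# Naive unary inclusion dependency (IND) discovery: a unary IND
# A ⊆ B holds when every value of column A also appears in column B.
def unary_inds(tables):
    """tables: {table_name: {column_name: set of values}}.
    Returns all pairs (A, B), A != B, with values(A) ⊆ values(B)."""
    columns = {
        (table, col): values
        for table, cols in tables.items()
        for col, values in cols.items()
    }
    return [
        (a, b)
        for a, va in columns.items()
        for b, vb in columns.items()
        if a != b and va <= vb          # set inclusion test
    ]

tables = {
    "orders": {"customer_id": {1, 2}},
    "customers": {"id": {1, 2, 3}},
}
# orders.customer_id ⊆ customers.id holds; the reverse does not.
```

This baseline compares every column against every other and keeps all value sets in memory, which is what makes it infeasible for thousands of attributes and motivates clustering-based approaches like S-indd.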

For testing new methods of network security or new algorithms of security analytics, we need experimental environments as well as test data that are as similar as possible to real-world data. Researchers are therefore always looking for the best approaches and recommendations for creating and simulating testbeds, because automating testbed creation is a crucial goal for accelerating research progress. One way to generate data is to simulate user behavior on virtual machines, but the challenge is how to describe what we want to simulate. In this paper, we present a new approach for describing user behavior for a simulation tool. This approach meets the requirements of simplicity and extensibility, and it can be used to generate user behavior scenarios to be simulated on Windows-family virtual machines. The proposed approach is applied in our simulation tool, which we use to address the lack of data for research in network security and security analytics by generating log datasets that can be used for testing new methods of network security and new algorithms of security analytics.

This paper presents a novel approach to text categorization by fusing the "bag-of-words" (BOW) word feature with a multilevel semantic feature (SF). By extending Online LDA (OLDA) into a multilevel topic model that learns a semantic space with different topic granularities, multilevel semantic features are extracted to represent text. The effectiveness of our approach is evaluated on both a large-scale Wikipedia corpus and the middle-sized 20newsgroups dataset. The former experiment shows that our approach is able to perform semantic feature extraction on a large-scale dataset. It also demonstrates that the topics generated at different topic levels have different semantic scopes, which is more appropriate for representing text content. Our classification experiments on 20newsgroups achieve 82.19% accuracy, which illustrates the effectiveness of fusing BOW and SF features. A further investigation of word and semantic feature fusion shows that the Support Vector Machine (SVM) is more sensitive to semantic features than Naive Bayes (NB), K Nearest Neighbor (KNN) and Decision Tree (DT). It is shown that appropriately fusing low-level word features and high-level semantic features can achieve results equal to or even better than the state of the art, with reduced feature dimension and computational complexity.

Cross-modal mapping plays an essential role in multimedia information retrieval systems. However, most existing work pays much attention to learning mapping functions but neglects the exploration of high-level semantic representations of the modalities. Inspired by the recent success of deep learning, in this paper deep CNN (convolutional neural network) features and topic features are utilized as visual and textual semantic representations, respectively. To investigate the highly non-linear semantic correlation between image and text, we propose a regularized deep neural network (RE-DNN) for semantic mapping across modalities. By imposing intra-modal regularization as supervised pre-training, we learn a joint model that captures both intra-modal and inter-modal relationships. Our approach is superior to previous work in the following respects: (1) it explores high-level semantic correlations, (2) it requires little prior knowledge for model training, and (3) it is able to tackle the missing-modality problem. Extensive experiments on the benchmark Wikipedia dataset show that RE-DNN outperforms state-of-the-art approaches in cross-modal retrieval.

Lecture video archives offer a large variety of lecture recordings on different topics. Naturally, different lectures cover a topic at different levels of detail, from superficial introductions to detailed, difficult treatments. Users interested in certain topics have problems finding lectures that cover a topic in order, from basic lectures to more detailed, difficult ones. The Lecture Butler automatically offers e-learning students lectures on their topics of interest as ordered playlists. The approach finds lecture information using title, description, OCR and ASR data. This data is indexed and searched by an in-memory database to fulfill the speed requirements for playlist creation. In the search results, lectures are ordered by their occurrence in the university semester schedule or by a given lecture difficulty level. As a result, students can automatically create playlists for their topic of interest, sequenced by lecture level. Hence, students are not overwhelmed, because they start with basic lectures first; basic lectures provide the information needed to understand more complex lectures. The research shows that an automatic approach incorporating the difficulty level or the university semester timetable produces reasonable playlists for finding topics of interest. This solves the main problem students encounter when they try to learn a topic step by step using recorded lectures. The approach will support and motivate students in using e-learning opportunities.

Lecture video archives offer hundreds of lectures. Students have to watch lecture videos in a lecture archive without any feedback; they do not know whether they have understood everything correctly. This is in contrast to MOOCs (Massive Open Online Courses), where direct feedback through self-tests or assignments is common. Unlike MOOCs, video lecture archives normally do not offer self-test or assignment sections after every video; therefore, questions have to be made visible on the video page itself. Furthermore, lecture recordings are typically longer than videos in MOOCs, so it is less reasonable, and sometimes even demotivating, to ask many questions after a long video, when the student has not yet memorized all of the information. The approach of this paper is to overcome these self-test problems in lecture video archives and to solve them in a reasonable way, in order to improve the learning experience and help students learn more efficiently with recorded lecture videos.

Modern machine learning techniques have been applied to many aspects of network analytics in order to discover patterns that can clarify or better demonstrate the behavior of users and systems within a given network. Often the information to be processed has to be converted to a different type so that machine learning algorithms can process it. To accurately process the information generated by systems within a network, the true intention and meaning behind the information must be preserved. In this paper we propose different approaches for mapping network information, such as IP addresses, to integer values in a way that attempts to keep the relations present in the original format of the information intact. With one exception, all of the proposed mappings produce outputs of at most 64 bits in order to allow atomic operations on CPUs with 64-bit registers; the mapping output size is restricted in the interest of performance. Additionally, we demonstrate the benefits of the new mappings for one specific machine learning algorithm (k-means) and compare the algorithm's results for datasets with and without the proposed transformations.
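For IPv4 addresses, one such relation-preserving mapping is the standard conversion to a 32-bit integer, which fits comfortably in a 64-bit register and keeps addresses from the same subnet numerically close (a sketch of the general idea, not necessarily one of the paper's exact mappings):

```python
import ipaddress

# An IPv4 address is four bytes; interpreting them as one big-endian
# integer preserves subnet locality: addresses sharing a prefix differ
# only in the low-order bits, so a distance-based algorithm like
# k-means can still group hosts of the same network together.
def ipv4_to_int(addr):
    return int(ipaddress.IPv4Address(addr))

a = ipv4_to_int("192.168.1.10")
b = ipv4_to_int("192.168.1.11")   # same /24 subnet: adjacent integers
c = ipv4_to_int("10.0.0.1")       # different network: far away
```

A naive alternative, such as hashing the address string, would destroy exactly this neighborhood structure, which is the "relation present in the original format" the abstract refers to.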

To survive reboots, 802.15.4 security normally requires an 802.15.4 node to store both its anti-replay data and its frame counter in non-volatile memory. However, the only non-volatile memory on most 802.15.4 nodes is flash memory, which is energy consuming, slow, and prone to wear. Establishing session keys frees 802.15.4 nodes from storing anti-replay data and frame counters in non-volatile memory. For establishing pairwise session keys for use in 802.15.4 security in particular, Krentz et al. proposed the Adaptable Pairwise Key Establishment Scheme (APKES). Yet, APKES supports neither reboots nor mobile nodes. In this paper, we propose the Adaptive Key Establishment Scheme (AKES) to overcome these limitations of APKES. Above all, AKES makes 802.15.4 security survive reboots without storing data in non-volatile memory. We also implemented AKES for Contiki and demonstrate its memory and energy efficiency. Of independent interest, we resolve the issue that 802.15.4 security stops working once a node's frame counter reaches its maximum value, and we propose a technique for reducing the security-related per-frame overhead.

We present an approach to monitoring power consumption in a distributed computationally constrained micro-grid. Computationally constrained micro-grids typically operate over low-powered computational devices on insecure and unreliable communication networks. Monitoring these networks is important in distinguishing faulty power reports from adversarially manipulated ones. We address this problem with a two-pronged approach, first by characterizing electrical appliances according to consumption behavior models. Second, we propose a delayed snapshot algorithm for collecting aggregated data on power consumption patterns. Our snapshot algorithm is demonstrably efficient in terms of message exchange and offers the added advantage of being resilient to malicious attacks aimed at communication disruption.

Video lecture archives on the Web are widely used and have grown rapidly during the last couple of years, resulting in a large number of lecture recordings that contain knowledge on a variety of subjects. The typical way of searching these videos is by title and description. Unfortunately, not all important keywords and facts are mentioned in the title or description, where these are available at all. Furthermore, there is no way to analyze how important the detected keywords are for the whole video. A further characteristic specific to lecture archives is that every regular university lecture is repeated yearly, which normally leads to duplicate lecture recordings. Such duplicates clutter search results for students who want to watch only the most recent lectures. This paper addresses these problems by analyzing the recorded lecture slides with Optical Character Recognition (OCR). In addition to the name and description, the OCR data is used for a full-text analysis to create an index for the lecture archive search. Furthermore, a fuzzy search is introduced, which resolves misspelled search requests and OCR detection defects. The paper also deals with the performance issues of a full-text search on an in-memory database, issues in OCR detection, and the handling of duplicate recordings of lectures repeated every year. Finally, we evaluate the search performance in comparison with database alternatives to the in-memory approach, and we report a user acceptance survey on the search results aimed at improving the learning experience on lecture archives. As a result, this paper shows how to handle the large amount of OCR data for a full-text live search on an in-memory database in reasonable time, with a fuzzy search performed additionally to resolve spelling mistakes and OCR detection problems.
In conclusion, this paper presents a solution for an enhanced video lecture archive search that supports students in online research and enhances their learning experience.
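The system performs its fuzzy search inside the in-memory database; as a minimal standard-library sketch (illustrative index and terms, not the paper's implementation), the same idea of tolerating both user typos and OCR defects can be shown with similarity-based term lookup:

```python
import difflib

# Toy OCR-derived index: term -> lecture IDs it appears in.
# "algorlthm" is a typical OCR defect (the letter i misread as l).
index = {
    "algorithm": [3, 7],
    "algorlthm": [12],
    "database": [5],
}

def fuzzy_lookup(query: str, cutoff: float = 0.75):
    """Return lecture IDs for all index terms sufficiently similar to
    `query`, so misspelled queries and OCR recognition errors still
    find the relevant lectures."""
    hits = difflib.get_close_matches(query, index, n=5, cutoff=cutoff)
    return sorted({lec for term in hits for lec in index[term]})
```

A misspelled query such as `fuzzy_lookup("algoritm")` matches both the correct term and its OCR-garbled variant, unifying results that an exact-match index would miss.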

Social media services nowadays produce a lot of data, and increasingly these data contain location information, which opens a wide range of possibilities for analysis: we may be interested not only in the content, but also in the location where the content was produced. Analyzing geo-spatial data well requires good approaches to geo clustering, ideally real-time clustering of massive geodata with high accuracy. In this paper, we present a new approach to clustering geodata for online maps such as Google Maps, OpenStreetMap and others. Clustering geodata by location improves its visual analysis and improves situational awareness. Our approach is a server-side online algorithm that does not need the entire dataset to start clustering; it works in real time and can be used to cluster massive geodata for online maps in reasonable time. We implemented the proposed approach as a proof of concept, and we provide experiments and an evaluation of our approach.
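The abstract does not name the concrete algorithm; one way to sketch the "online, server-side" property is a grid-based aggregator (an assumed variant, not necessarily the paper's method) that updates a per-tile count for each incoming point, so no full pass over the dataset is ever needed:

```python
import math
from collections import defaultdict

class OnlineGeoClusterer:
    """Illustrative sketch of server-side online clustering: each
    incoming point incrementally updates a Web-Mercator tile aggregate
    for a fixed zoom level, as used by online maps."""

    def __init__(self, zoom: int):
        self.zoom = zoom
        self.cells = defaultdict(int)  # (x_tile, y_tile) -> point count

    def add_point(self, lat: float, lon: float):
        n = 2 ** self.zoom            # tiles per axis at this zoom
        x = int((lon + 180.0) / 360.0 * n)
        lat_r = math.radians(lat)
        y = int((1 - math.log(math.tan(lat_r) + 1 / math.cos(lat_r))
                 / math.pi) / 2 * n)
        self.cells[(x, y)] += 1       # O(1) update per point
```

Because each point touches only its own cell, the structure can serve cluster markers for a map viewport at any time while ingestion continues.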

Gamification is currently a hot topic: many websites, applications and research projects adopt the method to increase users' motivation. Past experience shows that gamification indeed has a positive influence on users' motivation, especially in the e-learning field. However, existing gamification methods are either hard to apply to professional content (so-called meaningful gamification) or harmful to users' intrinsic motivation (so-called reward-based gamification). We therefore study the mechanisms behind game addiction and propose a reward-based intermittent reinforcement method for gamification that retains the user-independence advantage of reward-based gamification while eliminating its negative influence on users' intrinsic motivation. To investigate its practicability and effectiveness, we implement this model in our tele-teaching platform.

Microgrids are power networks which may operate autonomously or in parallel with national grids and which can keep functioning during islanding events, allowing critical national infrastructures to be both more efficient and more robust. Particularly at smaller scales and when relying on renewable energy, the stability of microgrids is critical. In this paper we propose a token-based CDA algorithm variant which may be run frequently on resource-constrained devices to efficiently match loads and generator capacity. We prove theoretically that the new algorithm satisfies the mutual exclusion properties, while yielding an acceptable time and message complexity of O(N) and O(log N), respectively. The algorithm is generally compatible with microgrids supported by a hierarchical network topology in which households form cluster nodes around a single smart meter cluster head (a setup similar to the one discussed in Sect. 3).
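The mutual exclusion property at the heart of such token-based schemes can be illustrated with a generic token-ring simulation (a textbook sketch, not the paper's exact variant): only the current token holder may act, so two nodes can never enter the critical section concurrently.

```python
class TokenRing:
    """Simplified token-based mutual exclusion: only the node holding
    the token may enter the critical section (e.g. submit its bid in
    an auction round), and the token is forwarded along a logical
    ring. Generic sketch; the paper's CDA variant adds load matching."""

    def __init__(self, n_nodes: int):
        self.n = n_nodes
        self.holder = 0               # node 0 holds the token initially

    def enter(self, node: int) -> bool:
        return node == self.holder    # the token grants exclusive access

    def pass_token(self):
        self.holder = (self.holder + 1) % self.n  # forward on the ring
```

At any moment exactly one node's `enter` call succeeds, which is the invariant the paper proves for its algorithm.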

In this paper, we propose an automated, adaptive solution for generating a logical, accurate and detailed tree-structured outline for video-based online lectures by extracting the attached slides and reconstructing their content. The proposed solution begins with slide-transition detection and optical character recognition, and then proceeds with a static analysis of the layout of each slide and of the logical relations within the slide series. Features of the slide series under processing, such as a fixed title position, are identified and applied in adaptive rounds to improve the outline quality. Our experiments show that the overall accuracy of the final lecture outline reaches 85%, which is about 13% higher than the static method.
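Once titles and their hierarchy levels are recovered from the slides, assembling the tree-structured outline is a standard stack-based pass. The (level, title) input format below is an illustrative assumption; the paper derives levels from richer layout features:

```python
def build_outline(items):
    """Build a nested outline from (level, title) pairs, e.g. as
    recovered from OCR'd slide titles and indentation. Returns a list
    of {"title", "children"} dicts."""
    root = {"children": []}
    stack = [(0, root)]               # (level, node) of open ancestors
    for level, title in items:
        node = {"title": title, "children": []}
        # Close deeper or equal levels before attaching the new node.
        while len(stack) > 1 and stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]
```

For example, `build_outline([(1, "Intro"), (2, "Motivation"), (2, "Goals"), (1, "Method")])` nests the two level-2 titles under "Intro".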

Security issues are still prevalent in cloud computing, particularly in public clouds. Efforts by Cloud Service Providers to secure outsourced resources are not sufficient to gain the trust of customers. Service Level Agreements (SLAs) are currently used to guarantee security and privacy; however, research into SLA monitoring suggests levels of dissatisfaction among cloud users. Accordingly, enterprises favor private clouds such as OpenStack, as they offer more control and security visibility. However, private clouds do not provide absolute security: they share some security challenges with public clouds while eliminating others. Security-metrics-based approaches such as quantitative security assessments can be adopted to quantify the security value of private and public clouds. Quantitative security assessments of software provide extensive visibility into security postures and help assess whether security has improved or deteriorated. In this paper we focus on private cloud security using OpenStack as a case study, conducting a quantitative assessment of OpenStack based on empirical data. Our analysis is multi-faceted, covering OpenStack's major releases and services. We employ security metrics to determine the vulnerability density, vulnerability severity metrics and patching behavior. We show that OpenStack's security has improved since inception; however, concerted efforts remain imperative for secure deployments, particularly in production environments.
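One of the metrics named above, vulnerability density, is commonly defined as vulnerabilities per thousand lines of code; the exact formula used in the paper is not stated in the abstract, so the following is the standard definition:

```python
def vulnerability_density(n_vulns: int, kloc: float) -> float:
    """Vulnerability density: reported vulnerabilities per thousand
    lines of code (KLOC), allowing releases of different sizes to be
    compared on equal footing."""
    return n_vulns / kloc

# E.g. 120 reported vulnerabilities in a 2,000 KLOC release:
density = vulnerability_density(120, 2000)   # 0.06 vulns per KLOC
```

Tracking this ratio across releases is what allows a statement like "security has improved since inception" to be made quantitatively.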

An important technique for attack detection in complex company networks is the analysis of log data from various network components. As networks grow, the number of produced log events increases dramatically, sometimes to multiple billion events per day. The analysis of such big data relies heavily on full normalization of the log data in real time. Until now, the important issue of fully normalizing large numbers of log events has been handled only insufficiently by many software solutions and is not well covered in existing research. In this paper, we propose and evaluate multiple approaches for handling the normalization of large numbers of typical logs better and more efficiently. The main idea is to organize the normalization in multiple levels by using a hierarchical knowledge base (KB) of normalization rules. In the end, we achieve a performance gain of about 1000x with the presented approaches, compared to the naive approach typically used in existing normalization solutions. With this improvement, big log data can be handled much faster and can be used to find and mitigate attacks in real time.
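The central idea, a hierarchical rule base so that each message is matched only against its own branch rather than every rule, can be sketched with a two-level KB (illustrative rules and sources; the paper's hierarchy is deeper and its rule format differs):

```python
import re

# Hierarchical knowledge base: normalization rules grouped by log
# source, so a message is tested only against its branch instead of
# against every rule in the KB.
KB = {
    "sshd": [
        (re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\S+)"),
         "ssh_login_failed"),
    ],
    "kernel": [
        (re.compile(r"Out of memory: Kill process (?P<pid>\d+)"),
         "oom_kill"),
    ],
}

def normalize(source: str, message: str):
    """Return a normalized event dict, or None if no rule matches."""
    for pattern, event_type in KB.get(source, []):
        m = pattern.search(message)
        if m:
            return {"type": event_type, **m.groupdict()}
    return None
```

With thousands of rules, dispatching on the source first turns an O(rules) scan into a scan of one small branch, which is the kind of structural saving behind the reported speedup.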

The current increase in software vulnerabilities necessitates concerted research into vulnerability lifecycles and how effective mitigative approaches can be implemented. This is especially imperative in cloud infrastructures, considering the novel attack vectors introduced by this emerging computing paradigm. By conducting a quantitative security assessment of OpenStack's vulnerability lifecycle, we discovered severe risk levels resulting from the prolonged gap between vulnerability discovery and patch release. We also observed an additional time lag between patch release and patch inclusion in vulnerability scanning engines. This window gives malicious actors sufficient time to develop zero-day exploits and other types of malicious software. Mitigating these concerns requires systems with current knowledge of events within the vulnerability lifecycle. However, current threat mitigation systems such as vulnerability scanners are designed to depend on information from public vulnerability repositories, which mostly do not retain comprehensive information on vulnerabilities. Accordingly, we propose a framework that mitigates the aforementioned risks by gathering and correlating information from several security information sources, including exploit databases, malware signature repositories, bug tracking systems and other channels. This information is thereafter used to automatically generate plugins armed with current information about possible zero-day exploits and other unknown vulnerabilities. We characterize two new security metrics to describe the discovered risks: Scanner Patch Time and Scanner Patch Discovery Time.
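The abstract names the two metrics without defining them; one plausible reading (an assumption, since the precise definitions are given only in the paper) is the lag in days between a vendor's patch release and the corresponding scanner plugin release:

```python
from datetime import date

def scanner_patch_time(patch_release: date, plugin_release: date) -> int:
    """Assumed reading of Scanner Patch Time: days between a vendor
    patch release and the release of the scanner plugin that detects
    the patched vulnerability. During this window, scanners report
    the system as clean even though the flaw is publicly known."""
    return (plugin_release - patch_release).days

# E.g. patch on 2015-03-01, scanner plugin on 2015-03-21: 20 days exposed.
lag = scanner_patch_time(date(2015, 3, 1), date(2015, 3, 21))
```

Scanner Patch Discovery Time would be the analogous lag measured from vulnerability discovery; both shrink as plugins are generated automatically, which is the framework's goal.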

Design Thinking and Massive Open Online Courses (MOOCs) have enjoyed widespread attention and uptake by both institutes of higher education and the media. These two increasingly popular phenomena have joined forces in recent years, with several reputable universities offering MOOCs on Design Thinking. However, the MOOC model of learning and Design Thinking education seem contradictory at first glance: Design Thinking is taught in a learning-by-doing fashion, in small teams and through various hands-on activities, whereas MOOCs are most often completed individually. Hence the seemingly unfitting characteristics of MOOCs and Design Thinking are worth further investigation. This paper presents the initial stage of a research project that explores the potential of teaching Design Thinking at scale. It offers a pedagogical evaluation of the existing Design Thinking MOOCs using the Taxonomy Table and the Seven Principles of Good Practice in Undergraduate Education. The results shed light on how Design Thinking is taught today in a MOOC environment and on the learning objectives that the course providers expect.

The detection of vulnerabilities in computer systems and computer networks, as well as the analysis of their weaknesses, are crucial problems. The presented method tackles the problem with automated detection. To identify vulnerabilities, the approach uses a logical representation of the preconditions and postconditions of vulnerabilities; this conditional structure models the requirements and impacts of each vulnerability. An automated analytical function can thus detect security leaks on a target system based on this logical format. With this method it is possible to scan a system without much expertise, since the automated or computer-aided vulnerability detection does not require special knowledge about the target system. The gathered information is used to provide security advisories and enhanced diagnostics, which can also detect attacks that exploit multiple vulnerabilities of the system.
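The precondition/postcondition format makes multi-step attacks discoverable by forward chaining: starting from the attacker's initial capabilities, any vulnerability whose preconditions are satisfied contributes its postconditions to the reachable state. A minimal sketch with sets (illustrative condition names and logical format; the paper's representation may differ):

```python
def reachable_impacts(initial, vulns):
    """Forward-chain over (preconditions, postconditions) pairs until a
    fixed point: each vulnerability whose preconditions are met adds
    its postconditions, exposing chains of multiple vulnerabilities."""
    state = set(initial)
    changed = True
    while changed:
        changed = False
        for pre, post in vulns:
            if pre <= state and not post <= state:
                state |= post
                changed = True
    return state

# Hypothetical three-step chain from mere network access to root:
vulns = [
    ({"network_access"}, {"web_shell"}),   # e.g. file-upload flaw
    ({"web_shell"}, {"local_user"}),       # e.g. weak service account
    ({"local_user"}, {"root"}),            # e.g. privilege escalation
]
```

Here no single vulnerability grants root, yet the chained analysis reveals that an attacker with only network access can reach it, which is exactly the multi-vulnerability case the method aims to detect.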

Boundary devices such as routers, firewalls, proxies, and domain controllers continuously generate logs reflecting the behavior of internal and external users, the working state of the network, and that of the devices themselves. Analyzing these logs rapidly and efficiently is of great value for security and reliability. However, it is a challenging task because a huge amount of data may be generated for analysis within a very short time. In this paper, we address this challenge by applying complex analytics and modern in-memory database technology to large amounts of log data. Logs from different kinds of devices are collected, normalized, and stored in the in-memory database. Machine learning approaches are then applied to analyze the centralized big data and identify attacks and anomalies that are not easy to detect from individual log events. The proposed method is implemented on an in-memory platform, the SAP HANA Platform, and the experimental results show that it has the expected capabilities as well as high performance.
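The abstract does not specify the models; one simple illustration of detecting anomalies across centralized events, rather than in any single log line, is distance-based outlier flagging over numeric feature vectors derived from the normalized logs (illustrative features and method, not the paper's HANA implementation):

```python
import math

def flag_anomalies(events, threshold=3.0):
    """Flag events whose feature vectors (e.g. [logins_per_minute,
    distinct_hosts_contacted]) lie far from the centroid of all
    events. Using the median distance as the scale keeps a single
    extreme outlier from masking itself."""
    dims = len(events[0])
    centroid = [sum(e[d] for e in events) / len(events) for d in range(dims)]
    dists = [math.dist(e, centroid) for e in events]
    median = sorted(dists)[len(dists) // 2]
    return [i for i, d in enumerate(dists) if d > threshold * median]
```

A burst of login failures across many hosts stands out only when events from all devices are viewed together, which is the point of centralizing the normalized logs first.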

Modern Security Information and Event Management systems should be capable of storing and processing large numbers of events or log messages in different formats and from different sources. This requirement often prevents such systems from using computationally heavy algorithms for security analysis. To deal with this issue, we built our system on an in-memory database with an integrated machine learning library, namely SAP HANA. Three approaches, i.e. (1) deep normalization of log messages, (2) storing data in main memory and (3) running data analysis directly in the database, allow us to increase processing speed to the point that machine learning analysis of security events becomes possible nearly in real time. To prove our concepts, we measured the processing speed of the developed system on data generated using an Active Directory testbed and showed the efficiency of our approach for high-speed analysis of security events.

Perlich, A., Meinel, C.: Automatic Treatment Session Summaries in Psychotherapy – a Step towards Therapist-Patient Cooperation. In: Proceedings of the International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare (ICTH 2015), 2015.