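5 Background
The behavior of a discrete-event dynamic system is formally given in terms of a labeled state transition system $(S, \Lambda, \rightarrow)$, where:
$\Lambda$ is a set of labels;
$\rightarrow \subseteq S \times \Lambda \times S$ s.t. $(s, \lambda, s') \in \rightarrow$ iff $s'$ is reachable from $s$ via $\lambda$ (written $s \xrightarrow{\lambda} s'$);
$s_0 \in S$ is the initial state.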
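
The definition above can be made concrete for a small finite system. The following is a minimal sketch, assuming the transition relation is given as an explicit set of triples; the function name `reachable_states` and the toy system are illustrative and do not come from the slides.

```python
from collections import deque


def reachable_states(transitions, s0):
    """Enumerate the states reachable from s0.

    transitions: iterable of (s, label, s_next) triples, i.e. the relation ->.
    """
    # Index the relation by source state for quick successor lookup.
    successors = {}
    for s, _label, s_next in transitions:
        successors.setdefault(s, set()).add(s_next)

    # Breadth-first exploration from the initial state.
    seen = {s0}
    frontier = deque([s0])
    while frontier:
        s = frontier.popleft()
        for s_next in successors.get(s, ()):
            if s_next not in seen:
                seen.add(s_next)
                frontier.append(s_next)
    return seen


# Toy system (hypothetical): four states, two labels; s3 is not reachable from s0.
toy = {("s0", "a", "s1"), ("s1", "b", "s2"), ("s2", "a", "s1"), ("s3", "b", "s0")}
```

This kind of exhaustive enumeration only terminates when the reachable part of the state space is finite, which is why abstraction is needed in the general case.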

6 Background
In general $S$ may be infinite, or even uncountable, so some abstraction technique is required in order to enumerate the whole state space.
Abstract state space: $(A, L, \rightarrow_A)$, where $A$ is a coverage of $S$ and $\rightarrow_A \subseteq A \times L \times A$, such that there exists a morphism $f$ which maps $\Lambda$ labels into $L$ labels. The initial abstract state is $a_0 \subseteq S$.

13 Map-Reduce TB nets analysis tool
Map step: given an unexplored state, it applies the createsuccessors function. Incoming transitions are stored in the destination states as a list of identifiers.
Shuffle step: gathers together potentially related states, using the value returned by the getfeatures function as the intermediate key.
Reduce step: given a set of potentially related states, it applies the identifyrelationship function to each pair of states.
Building blocks:
State = <M, C> pair, where M is the marking and C the constraint.
identifyrelationship computes the actual relationship between two states according to the following rule: $a \subseteq a' \iff \sigma(m) = \sigma(m') \wedge C \Rightarrow C'$.
getfeatures returns just the topological part of M, $\sigma(m)$.
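
The Shuffle and Reduce steps described above can be sketched in memory as follows. This is only an illustration under strong assumptions: the relationship is taken to be state inclusion, the constraint is modeled as a closed interval so that implication becomes interval containment, and the Reduce step is assumed to drop included states. The `State` type, `implies`, `shuffle`, and `reduce_group` are hypothetical names; the Map step is omitted because createsuccessors depends on the formalism.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class State(NamedTuple):
    marking: tuple     # hypothetical topological part: tokens per place
    constraint: tuple  # hypothetical constraint: closed interval (lo, hi)


def get_features(state):
    # Intermediate key: only the topological part of the marking.
    return state.marking


def implies(c1, c2):
    # Toy implication: interval c1 lies inside interval c2.
    return c2[0] <= c1[0] and c1[1] <= c2[1]


def identify_relationship(a, b):
    # "included" if a is included in b, "includes" for the converse, else None.
    if a.marking != b.marking:
        return None
    if implies(a.constraint, b.constraint):
        return "included"
    if implies(b.constraint, a.constraint):
        return "includes"
    return None


def shuffle(states):
    # Group states that share the same features, i.e. potentially related ones.
    groups = defaultdict(list)
    for s in states:
        groups[get_features(s)].append(s)
    return groups


def reduce_group(group):
    # Compare every pair and drop the states included in another state.
    group = list(dict.fromkeys(group))  # remove exact duplicates, keep order
    dropped = set()
    for a, b in combinations(group, 2):
        rel = identify_relationship(a, b)
        if rel == "included":
            dropped.add(a)
        elif rel == "includes":
            dropped.add(b)
    return [s for s in group if s not in dropped]
```

Grouping by features first means the pairwise comparison only runs inside each group, rather than over the whole set of states.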

14 Hybrid Iterative Map-Reduce
A single Map-Reduce job is not enough, so the algorithm uses iterative Map-Reduce.
During the first and last iterations of the algorithm the set of states is quite small, so a MapReduce job over a large cluster of machines is useless and expensive in terms of time and resources.
The computation starts with a sequential algorithm and goes on until the state space size passes a configurable threshold. After that, the computation is distributed over a big cluster.
while (N > 0) {
    if (N > threshold)
        runmapreduce()      // Map() and Reduce() iterations
    else
        runlocalbuilder()   // sequential builder
} // end while
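
The hybrid loop can be sketched in runnable form as below. This is a minimal illustration, assuming that N is the number of states still to be explored in the current iteration; `hybrid_build`, `expand`, and the toy state space are hypothetical, and `run_mapreduce` is just a stand-in callable for a distributed iteration.

```python
def hybrid_build(initial, run_local_builder, run_mapreduce, threshold):
    """Explore a state space, choosing the builder from the frontier size.

    Both builders take a set of unexplored states and return their successors;
    run_mapreduce stands in for a distributed Map-Reduce iteration.
    """
    explored = set()
    frontier = {initial}
    trace = []                    # which builder handled each iteration
    while len(frontier) > 0:      # N > 0, with N = number of unexplored states
        if len(frontier) > threshold:
            successors = run_mapreduce(frontier)
            trace.append("mapreduce")
        else:
            successors = run_local_builder(frontier)
            trace.append("local")
        explored |= frontier
        frontier = set(successors) - explored
    return explored, trace


# Toy state space (hypothetical): a binary tree with nodes 0..14.
def expand(frontier):
    return {c for s in frontier for c in (2 * s + 1, 2 * s + 2) if c < 15}
```

In this toy run both builders call the same function; in a real deployment only the dispatch on the threshold would stay the same.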

15 Hybrid Iterative Map-Reduce
Gas Burner example: the execution with 8 machines is almost 80% faster than the sequential algorithm.
#machines | machine type | #abstract states | threshold | time (m)
1 | m2.2xlarge | 1.456x | |
4 | m2.2xlarge | 1.456x | |
 | m2.2xlarge | 1.456x | |

17 Use Cases
P/T nets
State = <M>: a marking, which associates places with natural numbers.
$s = s' \iff M = M'$, thus we can use the optimized Reduce phase.
In order to prove the effectiveness of using MaRDiGraS to improve legacy tools, we adapted an existing P/T nets tool: PIPE.
To adapt the sequential algorithm of PIPE into a distributed one, we needed just 290 lines of code: a very small number, even when compared with the size of the PIPE modules actually used (about 6500 lines of code).
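
Because two P/T net states coincide exactly when their markings are equal, duplicate detection reduces to set membership. The sketch below illustrates this with a plain firing rule; it is not the PIPE or MaRDiGraS code, and the tuple encoding of markings, the function names, and the two-place net are assumptions made for the example.

```python
def create_successors(marking, transitions):
    """Return the markings reachable from `marking` by firing one transition.

    marking: tuple of token counts, one entry per place.
    transitions: dict mapping a name to (pre, post) tuples of token counts.
    """
    result = []
    for name, (pre, post) in transitions.items():
        # A transition is enabled if every place holds enough tokens.
        if all(m >= p for m, p in zip(marking, pre)):
            nxt = tuple(m - p + q for m, p, q in zip(marking, pre, post))
            result.append((name, nxt))
    return result


def reachability_set(initial, transitions):
    # Equal markings are the same state, so a plain set removes duplicates.
    seen = {initial}
    stack = [initial]
    while stack:
        current = stack.pop()
        for _name, nxt in create_successors(current, transitions):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical two-place net: t1 moves a token from p0 to p1, t2 moves it back.
net = {"t1": ((1, 0), (0, 1)), "t2": ((0, 1), (1, 0))}
```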

18 Use Cases
P/T nets
Shared Memory example: 120GB of data; execution time = 530 min. using 20 machines.
Simple Load Balancing example: the PIPE tool takes more than 20 hours to complete the computation. The adapted version takes 74 min to complete the same computation, using 16 machines.

19 CTL model checking in the cloud
We developed a software tool which exploits the graphs computed by MaRDiGraS by applying iterative map-reduce algorithms based on fixpoint characterizations of the basic temporal operators of CTL (Computation Tree Logic).
Given a state transition system $T = \langle S, s_0, R, L \rangle$ and the set of states $[\varphi]_T$ that satisfy the formula $\varphi$:
$[EX\varphi]_T = R^{-1}([\varphi]_T)$
$[EG\varphi]_T = \nu X.([\varphi]_T \cap R^{-1}(X))$
$[E[\varphi U \psi]]_T = \mu X.([\psi]_T \cup ([\varphi]_T \cap R^{-1}(X)))$
where $R^{-1}(X)$ denotes the set of predecessors of the states in $X$.
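
The fixpoint characterizations can be evaluated directly on a small explicit graph. The following is a sequential, in-memory sketch and not the iterative map-reduce implementation of the tool; it assumes the operator applied to R is the predecessor (pre-image) operator, and the names `pre_image`, `sat_EX`, `sat_EG`, `sat_EU` and the toy relation are illustrative.

```python
def pre_image(R, X):
    """States with at least one successor in X; R is a set of (s, t) pairs."""
    return {s for (s, t) in R if t in X}


def sat_EX(R, phi):
    return pre_image(R, phi)


def sat_EG(R, phi):
    # Greatest fixed point: start from [phi], which contains every fixed
    # point, and shrink until nothing changes.
    X = set(phi)
    while True:
        nxt = phi & pre_image(R, X)
        if nxt == X:
            return X
        X = nxt


def sat_EU(R, phi, psi):
    # Least fixed point: start from the empty set and grow until stable.
    X = set()
    while True:
        nxt = psi | (phi & pre_image(R, X))
        if nxt == X:
            return X
        X = nxt


# Toy transition relation (hypothetical) over states 0..3.
R = {(0, 1), (1, 2), (2, 1), (3, 3), (0, 3)}
```

Each loop iteration applies the predicate transformer once to the whole current set, which is the step that an iterative distributed job would parallelize.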

20 Computation Tree Logic
CTL is a branching-time logic which models time as a tree-like structure, where each moment can be followed by several different possible futures.
In CTL each basic temporal operator (i.e., X, F, or G) must be immediately preceded by a path quantifier (i.e., either A or E).
CTL formulas are defined inductively, and the interpretation of a CTL formula is defined over a Kripke structure (i.e., a state transition system).
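
For reference, the usual textbook grammar of CTL over atomic propositions $p$ is sketched below. It is the standard presentation and may differ in notation from the definition shown on the slide.

```latex
\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi
      \mid \mathrm{AX}\,\varphi \mid \mathrm{EX}\,\varphi
      \mid \mathrm{AF}\,\varphi \mid \mathrm{EF}\,\varphi
      \mid \mathrm{AG}\,\varphi \mid \mathrm{EG}\,\varphi
      \mid \mathrm{A}[\varphi \,\mathrm{U}\, \varphi] \mid \mathrm{E}[\varphi \,\mathrm{U}\, \varphi]
```

Every temporal operator appears paired with a path quantifier, which is exactly the restriction stated above.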

21 Computation Tree Logic
It can be shown that any CTL formula can be written in terms of $\neg$, $\vee$, EX, EG, and EU.
EG is characterized as the greatest fixed point, and EU as the least fixed point, of a monotonic predicate transformer.
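
The reduction to a small operator basis rests on standard equivalences. Assuming negation and disjunction as the Boolean connectives of the basis, the remaining operators can be rewritten as follows; these are the usual textbook identities rather than formulas taken from the slide.

```latex
\begin{align*}
\varphi \wedge \psi &\equiv \neg(\neg\varphi \vee \neg\psi) \\
\mathrm{AX}\,\varphi &\equiv \neg\,\mathrm{EX}\,\neg\varphi \\
\mathrm{EF}\,\varphi &\equiv \mathrm{E}[\mathit{true} \,\mathrm{U}\, \varphi] \\
\mathrm{AG}\,\varphi &\equiv \neg\,\mathrm{EF}\,\neg\varphi \\
\mathrm{AF}\,\varphi &\equiv \neg\,\mathrm{EG}\,\neg\varphi \\
\mathrm{A}[\varphi \,\mathrm{U}\, \psi] &\equiv
  \neg\,\mathrm{E}[\neg\psi \,\mathrm{U}\, (\neg\varphi \wedge \neg\psi)] \wedge \neg\,\mathrm{EG}\,\neg\psi
\end{align*}
```

A model checker therefore only needs procedures for EX, EG, and EU, plus set complement and union.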

27 Conclusion
MaRDiGraS and CTL verification in the cloud allow users to implement distributed reachability graph builders and verification tools for different formalisms without caring about non-functional aspects.
They apply techniques typically used by the big data community and so far poorly explored for this kind of problem.
We believe that this work could be a first step towards a synergy between two very different, but related communities: the formal verification community and the big data community.
Open Questions
How can it be optimized when the remaining set gets very small?
How to choose the optimal threshold dynamically?
Are there classes of formalisms for which this approach cannot be used? And how can we adapt it to these classes?

28 Planned Work
Development of a technique for tackling topologically infinite TB net models: computation of minimal coverability sets (so far unexplored).
This provides a means to decide several important properties also for real-time systems:
coverability: is it possible to reach a marking dominating a given marking?
boundedness: is the set of reachable markings finite?
place boundedness: is it possible to bound the number of tokens in a given place?
semi-liveness: is there a reachable marking in which a given transition is enabled?
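
The notion of a marking dominating another can be illustrated with a small check. This sketch assumes markings are tuples of token counts over the same ordered places and that the set of markings is finite and already enumerated; it does not address the planned minimal coverability set computation, and the function names are hypothetical.

```python
def dominates(marking, target):
    """True if `marking` has at least as many tokens as `target` in every place."""
    return all(m >= t for m, t in zip(marking, target))


def is_coverable(markings, target):
    """Check coverability against an explicitly enumerated, finite set of markings."""
    return any(dominates(m, target) for m in markings)
```

For example, the marking (2, 1) dominates (1, 1), while (0, 3) does not dominate (1, 0) because the first place holds too few tokens.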
