Tools

"... curvilinear configurations to hand-drawn sketches. It collects observations from our own recent research, which focused initially on the domain of sketched human stick figures in diverse postures, as well as related computer vision literature. Sketch recognition, i.e., labeling strokes in the i ..."

curvilinear configurations to hand-drawn sketches. It collects observations from our own recent research, which focused initially on the domain of sketched human stick figures in diverse postures, as well as related computer vision literature. Sketch recognition, i.e., labeling strokes in the input with the names of the model parts they depict, would be a key component of higher-level sketch understanding processes that reason about the recognized configurations. A sketch recognition technology must meet three main requirements. It must cope reliably with the pervasive variability of hand sketches, provide interactive performance, and be easily extensible to new configurations. We argue that useful sketch recognition may be within the grasp of current research, if these requirements are addressed systematically and in concert.

"... This paper describes an active learning approach to the problem of grammatical inference, specifically the inference of deterministic finite automata (DFAs). We refer to the algorithm as the estimation-exploration algorithm (EEA). This approach differs from previous passive and active learning appro ..."

This paper describes an active learning approach to the problem of grammatical inference, specifically the inference of deterministic finite automata (DFAs). We refer to the algorithm as the estimation-exploration algorithm (EEA). This approach differs from previous passive and active learning approaches to grammatical inference in that training data is actively proposed by the algorithm, rather than passively receiving training data from some external teacher. Here we show that this algorithm outperforms one version of the most powerful set of algorithms for grammatical inference, evidence driven state merging (EDSM), on randomly-generated DFAs. The performance increase is due to the fact that the EDSM algorithm only works well for DFAs with specific balances (percentage of positive labelings), while the EEA is more consistent over a wider range of balances. Based on this finding we propose a more general method for generating DFAs to be used in the development of future grammatical inference algorithms.

"... The natural language learning problem has attracted the attention of researchers for several decades. Computational and formal models of language acquisition have provided some preliminary, yet promising insights of how children learn the language of their community. Further, these formal models als ..."

The natural language learning problem has attracted the attention of researchers for several decades. Computational and formal models of language acquisition have provided some preliminary, yet promising insights of how children learn the language of their community. Further, these formal models also provide an operational framework for the numerous practical applications of language learning. We will survey some of the key results in formal language learning. In particular, we will discuss the prominent computational approaches for learning different classes of formal languages and discuss how these fit in the broad context of natural language learning.

"... Efficient learning of DFA is a challenging research problem in grammatical inference. It is known that both exact and approximate (in the PAC sense) identifiability of DFA is hard. Pitt, in his seminal paper posed the following open research problem: "Are DFAPAC-identifiable if examples are d ..."

Efficient learning of DFA is a challenging research problem in grammatical inference. It is known that both exact and approximate (in the PAC sense) identifiability of DFA is hard. Pitt, in his seminal paper posed the following open research problem: &quot;Are DFAPAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?&quot; [25]. We demonstrate that the class of simple DFA (i.e., DFA whose canonical representations have logarithmic Kolmogorov complexity) is efficiently PAC learnable under the Solomonoff Levin universal distribution. We prove that if the examples are sampled at random according to the universal distribution by a teacher that is knowledgeable about the target concept, the entire class of DFA is efficiently PAC learnable under the universal distribution. Thus, we show that DFA are efficiently learnable under the PACS model [6]. Further, we prove that any concept that is learnable under Gold&apos;s model for learning from characteristic samples, Goldman and Mathias&apos; polynomial teachability model, and the model for learning from example based queries is also learnable under the PACS model.

"... Abstract. Techniques for inferring a regular language, in the form of a finite automaton, from a sufficiently large sample of accepted and nonaccepted input words, have been employed to construct models of software and hardware systems, for use, e.g., in test case generation. We intend to adapt thes ..."

Abstract. Techniques for inferring a regular language, in the form of a finite automaton, from a sufficiently large sample of accepted and nonaccepted input words, have been employed to construct models of software and hardware systems, for use, e.g., in test case generation. We intend to adapt these techniques to construct state machine models of entities of communication protocols. The alphabet of such state machines can be very large, since a symbol typically consists of a protocol data unit type with a number of parameters, each of which can assume many values. In typical algorithms for regular inference, the number of needed input words grows with the size of the alphabet and the size of the minimal DFA accepting the language. We therefore modify such an algorithm (Angluin’s algorithm) so that its complexity grows not with the size of the alphabet, but only with the size of a certain symbolic representation of the DFA. The main new idea is to infer, for each state, a partitioning of input symbols into equivalence classes, under the hypothesis that all input symbols in an equivalence class have the same effect on the state machine. Whenever such a hypothesis is disproved, equivalence classes are refined. We show that our modification retains the good properties of Angluin’s original algorithm, but that its complexity grows with the size of our symbolic DFA representation rather than with the size of the alphabet. We have implemented the algorithm; experiments on synthesized examples are consistent with these complexity results. 1

"... Abstract. Existing algorithms for regular inference (aka automata learn-ing) allows to infer a finite state machine by observing the output that the machine produces in response to a selected sequence of input strings. We generalize regular inference techniques to infer a class of state ma-chines wi ..."

Abstract. Existing algorithms for regular inference (aka automata learn-ing) allows to infer a finite state machine by observing the output that the machine produces in response to a selected sequence of input strings. We generalize regular inference techniques to infer a class of state ma-chines with an infinite state space. We consider Mealy machines extended with state variables that can assume values from a potentially unbounded domain. These values can be passed as parameters in input and output symbols, and can be used in tests for equality between state variables and/or message parameters. This is to our knowledge the first exten-sion of regular inference to infinite-state systems. We intend to use these techniques to generate models of communication protocols from observa-tions of their input-output behavior. Such protocols often have param-eters that represent node adresses, connection identifiers, etc. that have a large domain, and on which test for equality is the only meaningful operation. Our extension consists of two phases. In the first phase we apply an existing inference technique for finite-state Mealy machines to generate a model for the case that the values are taken from a small data domain. In the second phase we transform this finite-state Mealy machine into an infinite-state Mealy machine by folding it into a compact symbolic form. 1

"... We present a novel approach for verifying safety properties of finite state machines communicating over unbounded FIFO channels that is based on applying machine learning techniques. We assume that we are given a model of the system and learn the set of reachable states from a sample set of exec ..."

We present a novel approach for verifying safety properties of finite state machines communicating over unbounded FIFO channels that is based on applying machine learning techniques. We assume that we are given a model of the system and learn the set of reachable states from a sample set of executions of the system, instead of attempting to iteratively compute the reachable states. The learnt set of reachable states is then used to either prove that the system is safe or to produce a valid execution of the system leading to an unsafe state (i.e. a counterexample). We have implemented this method for verifying FIFO automata in a tool called Lever that uses a regular language learning algorithm called RPNI. We apply our tool to a few case studies and report our experience with this method. We also demonstrate how this method can be generalized and applied to the verification of other infinite state systems.

"... This article presents a novel application of grammatical inference techniques to the synthesis of behavior models of software systems. This synthesis is used for the elicitation of software requirements. This problem is formulated as a deterministic finite-state automaton induction problem from pos ..."

This article presents a novel application of grammatical inference techniques to the synthesis of behavior models of software systems. This synthesis is used for the elicitation of software requirements. This problem is formulated as a deterministic finite-state automaton induction problem from positive and negative scenarios provided by an end user of the software-to-be. A query-driven state merging (QSM) algorithm is proposed. It extends the Regular Positive and Negative Inference (RPNI) and blue-fringe algorithms by allowing membership queries to be submitted to the end user. State merging operations can be further constrained by some prior domain knowledge formulated as fluents, goals, domain properties, and models of external software components. The incorporation of domain knowledge both reduces the number of queries and guarantees that the induced model is consistent with such knowledge. The proposed techniques are implemented in the ISIS tool and practical evaluations on standard requirements engineering test cases and synthetic data illustrate the interest of this approach.