# Inferring Approximated Models for Systems Engineering

In January 2014, SAP researcher will present a paper “Inferring Approximated Models for Systems Engineering” in the 15th IEEE International Symposium on High Assurance Systems Engineering (HASE 2014) in Miami, Florida, USA.

Engineering safe and reliable systems demands rigorous approaches which are capable of providing an increased confidence in achieving a desired level of functional safety and reliability. One of such approaches is the model-based engineering which uses formal methods. Formal, executable models, once they become available, can be used to perform comprehensive validation to prove the absence of undesired properties by simulation and model checking, as well by testing, where models guide the test generation process. At the same time, the main prerequisite for using formal methods is the availability of adequate models of software artefacts. The problem is that many systems are developed without formal models. This motivates research on reengineering methods retrieving models from software artefacts. Depending on the context, goal and assumptions, various types of techniques have been considered for specification mining or model inference. When source code is available, automated analysis of its structure can yield adequate models. Some methods also reverse engineer models from binary code.

In this paper, we propose a new method to infer finite state behavioural models of black box components by testing them. Typically, such components could be accessed over a network, so that we do not even assume that the executable can be scrutinized: the system can only be observed at its interfaces. This corresponds to a typical black box testing situation where the tester would send inputs to a system and observe its outputs.

We assume that the System Under Test (SUT) can be modelled, at some level of abstraction, on its inputs and outputs, as a Finite State Machine (FSM). FSM-based testing theory has shown that an FSM can be identified, i.e., the SUT can be tested to be proven equivalent to it, with the help of state identifying (distinguishing) sequences, constituting, e.g., a characterization set, W-set, of input sequences.

Central to our approach is the notion of initial quotient of an FSM associated with a “partial” characterization set Z, which is a subset of W. In essence, the quotient represents an approximate model where some states might not be distinguished.

Previous methods for automata inference have mostly been derived from grammatical inference techniques. Our method follows partly the same paradigm framework, in particular the Minimally Adequate Teacher. It assumes that the SUT can be used to answer queries in two forms: output queries, where an input sequence can be submitted to the SUT (after resetting it to its initial state) to get the corresponding output sequence; and a restricted form of equivalence queries: we assume that further testing of the SUT can provide counterexamples, i.e., input sequences for which the output sequences of the SUT differ from those of the inferred FSM. Our method will thus infer an SUT by building increasingly precise quotients of it using counterexamples.

Our method departs from previous methods in that it is directly inspired by testing theory. It is also designed to be more adapted and efficient in a software testing context: although it is an active testing and learning method, it can start from existing test records as in passive testing and inference; an initial Z-set of distinguishing sequences can be provided, e.g., by the test expert. Our experiments confirm that it requires fewer queries (tests) than existing recent algorithms to infer a given FSM. The efficiency of the method is measured by the number of queries submitted to the SUT, because it is assumed that the internal bookkeeping updates of data structures used by the method is usually negligible compared to interactions with the SUT (esp. when the system is accessed remotely).

This paper is a joint work of Alexandre Petrenko of CRIM, Canada, Keqin Li of SAP Product Security Research, France, Roland Groz, Karim Hossen, and Catherine Oriat of Grenoble University, France.