Formal methods: practical applications and foundations: Editorial
Incremental column-wise verification of arithmetic circuits using computer algebra
Verifying arithmetic circuits, and most prominently multiplier circuits, is an important problem which in practice still requires substantial manual effort. The currently most effective approach uses polynomial reasoning over pseudo-Boolean polynomials. In this approach, a word-level specification is reduced by a Gröbner basis which is implied by the gate-level representation of the circuit. This reduction returns zero if and only if the circuit is correct. We give a rigorous formalization of this approach, including soundness and completeness arguments. Furthermore, we present a novel incremental column-wise technique to verify gate-level multipliers. This approach is further improved by extracting full- and half-adder constraints in the circuit, which allows us to rewrite and reduce the Gröbner basis. We also present a new technical theorem which allows rewriting local parts of the Gröbner basis. Optimizing the Gröbner basis reduces computation time substantially. In addition, we extend these algebraic techniques to verify the equivalence of bit-level multipliers without using a word-level specification. Our experiments show that regular multipliers can be verified efficiently using off-the-shelf computer algebra tools, while more complex and optimized multipliers require more sophisticated techniques. We discuss our complete verification approach, including all optimizations, in detail.
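To make the reduction concrete, here is a minimal sketch using Python and sympy (our own toy half-adder, not the authors' tool or benchmarks): the word-level specification reduces to zero modulo the Gröbner basis implied by the gate polynomials.

```python
# A minimal sketch, assuming a half-adder with gates s = a XOR b, c = a AND b
# and word-level specification a + b = 2c + s (a toy example, not the paper's).
from sympy import symbols, groebner, reduced

a, b, s, c = symbols('a b s c')

gates = [
    s - (a + b - 2*a*b),   # XOR gate as a polynomial over Boolean values
    c - a*b,               # AND gate as a polynomial
    a**2 - a, b**2 - b,    # Boolean constraints on the inputs
]
spec = 2*c + s - (a + b)   # word-level specification

# Order gate outputs before inputs so reduction eliminates internal signals.
G = groebner(gates, c, s, a, b, order='lex')
_, remainder = reduced(spec, list(G.exprs), c, s, a, b, order='lex')
print(remainder)           # 0 iff the circuit implements the specification
```

The zero remainder is exactly the correctness criterion described in the abstract.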
From LTL to rLTL monitoring: improved monitorability through robust semantics
Runtime monitoring is commonly used to detect the violation of desired properties in safety-critical cyber-physical systems by observing their executions. Bauer et al. introduced an influential framework for monitoring Linear Temporal Logic (LTL) properties based on a three-valued semantics for a finite execution: the formula is already satisfied by the given execution, it is already violated, or it is still undetermined, i.e., it can still be satisfied and violated by appropriate extensions of the given execution. However, a wide range of formulas are not monitorable under this approach, meaning that there are executions for which satisfaction and violation will always remain undetermined, no matter how they are extended. In particular, Bauer et al. report that 44% of the formulas they consider in their experiments fall into this category. Recently, a robust semantics for LTL was introduced to capture different degrees by which a property can be violated. In this paper we introduce a robust semantics for finite strings and show its potential in monitoring: every formula considered by Bauer et al. is monitorable under our approach. Furthermore, we discuss which properties that come naturally in LTL monitoring, such as the realizability of all truth values, can be transferred to the robust setting. We show that LTL formulas with robust semantics can be monitored by deterministic automata, and provide tight bounds on the size of the constructed automaton. Lastly, we report on a prototype implementation and compare it to the LTL monitor of Bauer et al. on a sample of examples.
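As a toy illustration of the three-valued verdicts (hand-coded for two fixed formulas; not the automaton construction of Bauer et al.):

```python
# A minimal sketch, assuming traces are lists of dicts mapping proposition
# names to truth values (a toy encoding, not the paper's implementation).
def monitor_globally_p(trace):
    # "G p" is monitorable: violated as soon as p fails at some position.
    for letter in trace:
        if not letter['p']:
            return 'false'   # violated on every extension of the trace
    return '?'               # still satisfiable and violable by extensions

def monitor_gf_p(trace):
    # "GF p" is non-monitorable under the three-valued semantics: every
    # finite trace has both satisfying and violating infinite extensions.
    return '?'

print(monitor_globally_p([{'p': True}, {'p': False}]))  # false
print(monitor_gf_p([{'p': True}] * 10))                 # ? (forever)
```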
Towards efficient verification of population protocols
Population protocols are a well-established model of computation by anonymous, identical finite-state agents. A protocol is well-specified if from every initial configuration, all fair executions of the protocol reach a common consensus. The central verification question for population protocols is the well-specification problem: deciding if a given protocol is well-specified. Esparza et al. have recently shown that this problem is decidable, but with very high complexity: it is at least as hard as the Petri net reachability problem, which is TOWER-hard, and for which only algorithms of non-primitive recursive complexity are currently known. In this paper we introduce the class of well-specified strongly-silent protocols and we prove that it is suitable for automatic verification. More precisely, we show that this class has the same computational power as general well-specified protocols, and captures standard protocols from the literature. Moreover, we show that the membership and correctness problems for this class reduce to solving boolean combinations of linear constraints over the natural numbers. This allowed us to develop the first software able to automatically prove correctness for all of the infinitely many possible inputs.
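To illustrate the model, here is a toy simulation of fair executions (the 4-state exact-majority protocol is a standard example from the literature; the encoding is ours, not the paper's tool):

```python
# A minimal sketch: agents hold strong (A, B) or weak (a, b) opinions;
# strong opposites cancel into weak ones, and strong opinions convert
# opposing weak ones. Assumes a non-tied input, so consensus is reached.
import random

DELTA = {('A', 'B'): ('a', 'b'), ('B', 'A'): ('b', 'a'),
         ('A', 'b'): ('A', 'a'), ('b', 'A'): ('a', 'A'),
         ('B', 'a'): ('B', 'b'), ('a', 'B'): ('b', 'B')}
OUTPUT = {'A': 'A', 'a': 'A', 'B': 'B', 'b': 'B'}

def simulate(config):
    config = list(config)
    while len({OUTPUT[q] for q in config}) > 1:   # no common consensus yet
        i, j = random.sample(range(len(config)), 2)
        if (config[i], config[j]) in DELTA:
            config[i], config[j] = DELTA[(config[i], config[j])]
    return OUTPUT[config[0]]

print(simulate(['A'] * 5 + ['B'] * 3))  # 'A': the majority opinion wins
```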
Golem: a flexible and efficient solver for constrained Horn clauses
The logical framework of Constrained Horn Clauses (CHC) models verification tasks from a variety of domains, ranging from verification of safety properties in transition systems to modular verification of programs with procedures. In this work we present Golem, a flexible and efficient solver for satisfiability of CHCs over linear real and integer arithmetic. Golem provides flexibility through a modular architecture and multiple back-end model-checking algorithms, and efficiency through tight integration with the underlying SMT solver. This paper describes the architecture of Golem and its back-end engines, which include our recently introduced model-checking algorithm TPA for deep exploration. The description is complemented by an extensive evaluation demonstrating the competitive performance of the solver.
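For readers unfamiliar with the format, here is a minimal CHC safety query sketched in z3py against Z3's Spacer engine (Golem consumes the same SMT-LIB Horn-clause inputs; the counter loop is our own toy example):

```python
# A minimal sketch: Inv(0) holds, Inv is preserved by x := x + 1 while
# x < 10, and we ask whether a state with Inv(x) and x > 10 is reachable.
from z3 import Ints, Function, IntSort, BoolSort, Fixedpoint, And

x, x1 = Ints('x x1')
Inv = Function('Inv', IntSort(), BoolSort())

fp = Fixedpoint()
fp.set(engine='spacer')
fp.register_relation(Inv)
fp.declare_var(x, x1)

fp.rule(Inv(x), [x == 0])                          # initial states
fp.rule(Inv(x1), [Inv(x), x < 10, x1 == x + 1])    # transition relation
print(fp.query(And(Inv(x), x > 10)))               # unsat: bad unreachable
```

An `unsat` answer means the clause system is satisfiable, i.e., the safety property holds and an interpretation of Inv is an inductive invariant.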
From non-preemptive to preemptive scheduling using synchronization synthesis
We present a computer-aided programming approach to concurrency. The approach allows programmers to program assuming a friendly, non-preemptive scheduler, and our synthesis procedure inserts synchronization to ensure that the final program works even with a preemptive scheduler. The correctness specification is implicit, inferred from the non-preemptive behavior. Let us consider sequences of calls that the program makes to an external interface. The specification requires that any such sequence produced under a preemptive scheduler should be included in the set of sequences produced under a non-preemptive scheduler. We guarantee that our synthesis does not introduce deadlocks and that the synchronization inserted is optimal w.r.t. a given objective function. The solution is based on a finitary abstraction, an algorithm for bounded language inclusion modulo an independence relation, and generation of a set of global constraints over synchronization placements. Each model of the global constraints set corresponds to a correctness-ensuring synchronization placement. The placement that is optimal w.r.t. the given objective function is chosen as the synchronization solution. We apply the approach to device-driver programming, where the driver threads call the software interface of the device and the API provided by the operating system. Our experiments demonstrate that our synthesis method is precise and efficient. The implicit specification helped us find one concurrency bug previously missed when model-checking using an explicit, user-provided specification. We implemented objective functions for coarse-grained and fine-grained locking and observed that different synchronization placements are produced for our experiments, favoring a minimal number of synchronization operations or maximum concurrency, respectively.
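The implicit specification can be illustrated with a toy example (ours, not one of the paper's device drivers): under a non-preemptive scheduler the two interface calls of each thread are never interleaved, and inserting a lock makes the preemptive behavior match.

```python
# A minimal sketch, assuming "open"/"close" stand for calls to an external
# interface; the lock plays the role of the synthesized synchronization.
import threading

calls = []
lock = threading.Lock()

def work(name):
    with lock:                     # inserted synchronization
        calls.append(f'{name}:open')
        calls.append(f'{name}:close')

threads = [threading.Thread(target=work, args=(n,)) for n in ('t1', 't2')]
for t in threads: t.start()
for t in threads: t.join()
print(calls)   # open/close pairs never interleave, matching the
               # non-preemptive behavior
```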
Algorithmic games for full ground references
We present a full classification of decidable and undecidable cases for contextual equivalence in a finitary ML-like language equipped with full ground storage (both integers and reference names can be stored), and we identify the simplest type for which equivalence is undecidable. At the technical level, our results marry game semantics with automata-theoretic techniques developed to handle infinite alphabets. On the automata-theoretic front, we show decidability of the emptiness problem for register pushdown automata extended with fresh-symbol generation.
Global guidance for local generalization in model checking
SMT-based model checkers, especially IC3-style ones, are currently the most effective techniques for verification of infinite-state systems. They infer inductive invariants by reasoning about a single step of the transition relation of a system, while employing SMT-based procedures, such as interpolation, to mitigate the limitations of local reasoning and allow for better generalization. Unfortunately, these mitigations intertwine model checking with heuristics of the underlying SMT solver, negatively affecting the stability of model checking. In this paper, we propose to tackle the limitations of locality in a systematic manner. We introduce explicit global guidance into the local reasoning performed by IC3-style algorithms. To this end, we extend the SMT-IC3 paradigm with three novel rules, designed to mitigate fundamental sources of failure that stem from locality. We instantiate these rules for Linear Integer Arithmetic and Linear Rational Arithmetic and implement them on top of the Spacer solver in Z3. Our empirical results show that GSpacer, Spacer extended with global guidance, is significantly more effective than both Spacer and global reasoning alone, and, furthermore, is insensitive to interpolation.
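The single-step ("local") check that IC3-style algorithms build on can be sketched with z3py (a toy lockstep counter of our own, not from the paper):

```python
# A minimal sketch: check that x == y is an inductive invariant of the
# system with transition x' = x + 1, y' = y + 1 (one-step reasoning only).
from z3 import Ints, Implies, And, Not, Solver, unsat

x, y, x1, y1 = Ints('x y x1 y1')
inv, inv_next = x == y, x1 == y1
trans = And(x1 == x + 1, y1 == y + 1)

s = Solver()
s.add(Not(Implies(And(inv, trans), inv_next)))
print(s.check() == unsat)   # True: the invariant is preserved by one step
```

Global guidance targets the cases where such one-step queries alone keep producing lemmas that fail to converge to an invariant of this kind.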
Isla: integrating full-scale ISA semantics and axiomatic concurrency models (extended version)
Architecture specifications such as Armv8-A and RISC-V are the ultimate foundation for software verification and the correctness criteria for hardware verification. They should define the allowed sequential and relaxed-memory concurrency behaviour of programs, but hitherto there has been no integration of full-scale instruction-set architecture (ISA) semantics with axiomatic concurrency models, either in mathematics or in tools. These ISA semantics can be surprisingly large and intricate, e.g. 100k lines for Armv8-A. In this paper we present a tool, Isla, for computing the allowed behaviours of concurrent litmus tests with respect to full-scale ISA definitions, in the Sail language, and arbitrary axiomatic relaxed-memory concurrency models, in the Cat language. It is based on a generic symbolic engine for Sail ISA specifications. We equip the tool with a web interface to make it widely accessible, and illustrate and evaluate it for Armv8-A and RISC-V. The symbolic execution engine is also valuable for other verification tasks: it has been used in automated ISA test generation for the Arm Morello prototype architecture, extending Armv8-A with CHERI capabilities, and for Iris program-logic reasoning about binary code above the Armv8-A and RISC-V ISA specifications. By using full-scale and authoritative ISA semantics, Isla lets one evaluate litmus tests using arbitrary user instructions with high confidence. Moreover, because these ISA specifications give detailed and validated definitions of the sequential aspects of functionality, as used by hypervisors and operating systems, e.g. instruction fetch, exceptions, and address translation, our tool provides a basis for developing concurrency semantics for these. We demonstrate this for the Armv8-A instruction-fetch and virtual-memory models and the examples of Simner et al.
Practical algebraic calculus and Nullstellensatz with the checkers Pacheck and Pastèque and Nuss-Checker
Automated reasoning techniques based on computer algebra have seen renewed interest in recent years and are, for example, heavily used in the formal verification of arithmetic circuits. However, the verification process might contain errors, so generating and checking proof certificates is important to increase the trust in automated reasoning tools. For algebraic reasoning, two proof systems, Nullstellensatz and polynomial calculus, are available and well-known in proof complexity. A Nullstellensatz proof captures whether a polynomial can be represented as a linear combination of a given set of polynomials by providing the co-factors of the linear combination. Proofs in polynomial calculus dynamically capture that a polynomial can be derived from a given set of polynomials using algebraic ideal theory. In this article we present the practical algebraic calculus as an instantiation of the polynomial calculus that can be checked efficiently. We further modify the practical algebraic calculus to obtain LPAC (practical algebraic calculus + linear combinations). In this way we are not only able to represent both Nullstellensatz and polynomial calculus proofs, but are also able to blend both proof formats. Furthermore, we introduce extension rules to simulate essential rewriting techniques required in practice. For efficiency, we additionally make use of indices for existing polynomials and include deletion rules. We demonstrate the different proof formats on the use case of arithmetic circuit verification and discuss how these proofs can be produced as a by-product of formal verification. We present the proof checkers Pacheck, Pastèque, and Nuss-Checker. Pacheck checks proofs in the practical algebraic calculus more efficiently than Pastèque, but the latter is formally verified using the proof assistant Isabelle/HOL. The tool Nuss-Checker checks proofs in the Nullstellensatz format.
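Checking a Nullstellensatz-style step boils down to verifying a claimed linear combination, as in this minimal sympy sketch (polynomials and co-factors are a toy example of ours, not the tools' input format):

```python
# A minimal sketch: accept the step iff the target polynomial equals the
# supplied co-factor combination of the given constraint polynomials.
from sympy import symbols, expand

a, b, s = symbols('a b s')

constraints = [s - (a + b - 2*a*b), a**2 - a]   # given polynomial set
cofactors   = [1, 2*b]                          # co-factors claimed by proof
target      = s - a - b + 2*a**2*b              # claimed consequence

lhs = expand(sum(f * p for f, p in zip(cofactors, constraints)))
assert lhs == expand(target), 'proof step rejected'
print('proof step accepted')
```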
Information-flow interfaces
Contract-based design is a promising methodology for taming the complexity of developing sophisticated systems. A formal contract distinguishes between assumptions, which are constraints that the designer of a component puts on the environments in which the component can be used safely, and guarantees, which are promises that the designer asks from the team that implements the component. A theory of formal contracts can be formalized as an interface theory, which supports the composition and refinement of both assumptions and guarantees. Although there is a rich landscape of contract-based design methods that address functional and extra-functional properties, we present the first interface theory designed to ensure system-wide security properties. Our framework provides a refinement relation and a composition operation that support both incremental design and independent implementability. We develop our theory for both stateless and stateful interfaces. Additionally, we introduce information-flow contracts, where assumptions and guarantees are sets of flow relations. We use these contracts to illustrate how to enrich information-flow interfaces with a semantic view. We illustrate the applicability of our framework with two examples inspired by the automotive domain.
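The flavor of the underlying reasoning can be sketched as follows (a toy encoding of ours: flows as pairs, security as forbidden source-sink pairs; the paper's interfaces are considerably richer):

```python
# A minimal sketch: compose component and environment flows and check a
# system-wide no-flow requirement against the transitive closure.
from itertools import product

def closure(flows):
    flows = set(flows)
    while True:
        new = {(a, c) for (a, b1), (b2, c) in product(flows, flows) if b1 == b2}
        if new <= flows:
            return flows
        flows |= new

component   = {('key', 'buf')}          # flow inside the component
environment = {('buf', 'net')}          # flow assumed possible outside
forbidden   = ('key', 'net')            # system-wide security requirement

print(forbidden in closure(component | environment))  # True: violated!
```

The example shows why assumptions about environment flows matter: each part is harmless in isolation, but their composition leaks.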
Predicate abstraction for hyperliveness verification
Temporal hyperproperties are system properties that relate multiple execution traces. In finite-state systems, temporal hyperproperties are supported by model-checking algorithms, and tools for general temporal logics like HyperLTL exist. In infinite-state systems, the analysis of temporal hyperproperties has, so far, been limited to k-safety properties, i.e., properties that stipulate the absence of a bad interaction between any k traces. In this paper, we present an automated method for the verification of ∀^k∃^l-safety properties in infinite-state systems. A ∀^k∃^l-safety property stipulates that for any k traces, there exist l traces such that the resulting k + l traces do not interact badly. This combination of universal and existential quantification captures many properties beyond k-safety, including hyperliveness properties such as generalized non-interference or program refinement. Our verification method is based on a strategy-based instantiation of existential trace quantification combined with a program reduction, both in the context of a fixed predicate abstraction.
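The quantifier structure can be illustrated on finite trace sets (a toy example of ours; the paper targets infinite-state systems where such enumeration is impossible):

```python
# A minimal sketch of a forall-exists property: refinement as "for every
# implementation trace there exists a spec trace with equal observations".
impl_traces = {('init', 'out1'), ('init', 'out2')}
spec_traces = {('init', 'out1'), ('init', 'out2'), ('init', 'out3')}

refines = all(any(t == s for s in spec_traces) for t in impl_traces)
print(refines)   # True: every implementation behavior is allowed
```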
Synthesizing adaptive test strategies from temporal logic specifications
Constructing good test cases is difficult and time-consuming, especially if the system under test is still under development and its exact behavior is not yet fixed. We propose a new approach to compute test strategies for reactive systems from a given temporal logic specification using formal methods. The computed strategies are guaranteed to reveal certain simple faults in every realization of the specification and for every behavior of the uncontrollable part of the system's environment. The proposed approach supports different assumptions on occurrences of faults (ranging from a single transient fault to a persistent fault) and by default aims at unveiling the weakest one. We argue that such tests are also sensitive to more complex bugs. Since the specification may not define the system behavior completely, we use reactive synthesis algorithms with partial information. The computed strategies are adaptive test strategies that react to the observed behavior at runtime. We work out the underlying theory of adaptive test strategy synthesis and present experiments for a safety-critical component of a real-world satellite system. We demonstrate that our approach can be applied to industrial specifications and that the synthesized test strategies are capable of detecting bugs that are hard to detect with random testing.
The probabilistic termination tool amber
We describe the Amber tool for proving and refuting the termination of a class of probabilistic while-programs with polynomial arithmetic, in a fully automated manner. Amber combines martingale theory with properties of asymptotic bounding functions and implements relaxed versions of existing probabilistic termination proof rules to prove/disprove (positive) almost sure termination of probabilistic loops. Amber supports programs parametrized by symbolic constants and drawing from common probability distributions. Our experimental comparisons give practical evidence of Amber outperforming existing state-of-the-art tools.
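The martingale-style reasoning that Amber automates can be sketched with sympy (a toy loop of ours, not one of the tool's benchmarks):

```python
# A minimal sketch for "while x > 0: x = x + u" with u ~ Uniform(-2, 1):
# the expected one-step change of x is negative, so x behaves like a
# ranking supermartingale and the loop terminates almost surely.
from sympy import symbols, simplify
from sympy.stats import Uniform, E

x = symbols('x')
u = Uniform('u', -2, 1)

drift = simplify(E(x + u) - x)
print(drift)   # -1/2: expected decrease per iteration
```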
Monitoring hyperproperties
Hyperproperties, such as non-interference and observational determinism, relate multiple system executions to each other. They are not expressible in standard temporal logics, like LTL, CTL, and CTL*, and thus cannot be monitored with standard runtime verification techniques. HyperLTL extends linear-time temporal logic (LTL) with explicit quantification over traces in order to express hyperproperties. We investigate the runtime verification problem of HyperLTL formulas for three different input models: (1) The parallel model, where a fixed number of system executions is processed in parallel. (2) The unbounded sequential model, where system executions are processed sequentially, one execution at a time. In this model, the number of incoming executions is a-priori unbounded and may in fact grow forever. (3) The bounded sequential model, where the traces are processed sequentially and the number of incoming executions is bounded. We show that the existence of a bound in the parallel and bounded sequential models leads to a different notion of monitorability than in the unbounded sequential model. We show that deciding the monitorability problem for alternation-free HyperLTL is PSPACE-complete, while the problem is undecidable in general. For every input model, we provide monitoring algorithms along with run-time and storage optimizations. By recognizing properties of specifications such as reflexivity, symmetry, and transitivity, we reduce the number of comparisons between traces. For the sequential models, we present a technique that minimizes the number of traces that need to be stored. We evaluate our optimizations, showing that they lead to more scalable monitoring and, in particular, significantly lower memory consumption.
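A two-trace instance can be sketched as follows (our own encoding of observational determinism for the sequential model; not the tool's algorithm):

```python
# A minimal sketch: traces are lists of (low_input, low_output) pairs; a
# new trace is compared against all stored ones, and symmetry of the
# property lets us compare each unordered pair of traces only once.
stored = []

def process(trace):
    for old in stored:
        for (li1, lo1), (li2, lo2) in zip(old, trace):
            if li1 != li2:
                break                   # low inputs diverge: pair is fine
            if lo1 != lo2:
                return 'violation'
    stored.append(trace)
    return 'no violation yet'

print(process([(0, 1), (1, 0)]))        # no violation yet
print(process([(0, 1), (1, 1)]))        # violation: same inputs, diff output
```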
Mining definitions in Kissat with Kittens
Bounded variable elimination is one of the most important preprocessing techniques in SAT solving. It benefits from discovering functional dependencies in the form of definitions encoded in the CNF. While the common approach pioneered in SatELite relies on syntactic pattern matching, our new approach uses cores produced by an embedded SAT solver, Kitten. In contrast to a similar semantic technique implemented in Lingeling based on BDD algorithms to generate irredundant CNFs, our new approach is able to generate DRAT proofs. We further discuss design choices for our embedded SAT solver Kitten. Experiments with Kissat show the effectiveness of this approach.
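The syntactic baseline that the core-based approach generalizes can be sketched in a few lines (a toy matcher of ours for AND-gate definitions; clauses use the usual signed-integer literal encoding):

```python
# A minimal sketch: x <-> (a AND b) appears in CNF as the three clauses
# (-x a), (-x b), (x -a -b); match the ternary clause, then check that the
# two binary clauses are present.
def find_and_gates(cnf):
    clause_set = {frozenset(c) for c in cnf}
    gates = []
    for c in cnf:
        if len(c) != 3:
            continue
        for x in c:                       # try each literal as the output
            a, b = [-l for l in c if l != x]
            if {frozenset({-x, a}), frozenset({-x, b})} <= clause_set:
                gates.append((x, a, b))   # x is defined as (a AND b)
    return gates

cnf = [[-3, 1], [-3, 2], [3, -1, -2]]
print(find_and_gates(cnf))                # [(3, 1, 2)]
```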
Quantitative monitoring of STL with edit distance
In cyber-physical systems (CPS), physical behaviors are typically controlled by digital hardware. As a consequence, continuous behaviors are discretized by sampling and quantization prior to their processing. Quantifying the similarity between CPS behaviors and their specification is an important ingredient in evaluating the correctness and quality of such systems. We propose a novel procedure for measuring robustness between digitized CPS signals and signal temporal logic (STL) specifications. We first equip STL with a quantitative semantics based on the weighted edit distance, a metric that quantifies both space and time mismatches between digitized CPS behaviors. We then develop a dynamic programming algorithm for computing the robustness degree between digitized signals and STL specifications. In order to promote hardware-based monitors, we implemented our approach on an FPGA. We evaluated it on automotive benchmarks defined by the research community, as well as on realistic data obtained from a magnetic sensor used in modern cars.
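The dynamic-programming core is the classic edit-distance recurrence; here is a plain (unweighted) sketch in Python, whereas the paper's weighted variant additionally prices space and time mismatches:

```python
# A minimal sketch: edit distance between two digitized signals given as
# sequences of sampled values.
def edit_distance(s, t):
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1): d[i][0] = i
    for j in range(n + 1): d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = d[i-1][j-1] + (s[i-1] != t[j-1])          # substitution
            d[i][j] = min(d[i-1][j] + 1, d[i][j-1] + 1, sub)
    return d[m][n]

print(edit_distance([0, 0, 1, 1], [0, 1, 1, 1]))  # 1
```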
SMT-based verification of program changes through summary repair
This article presents an innovative approach to verifying continuously changing programs by model checking. To avoid repeating the entire model-checking run for each new version of the program, our approach verifies programs incrementally. It reuses the computational history of the previous program version, namely function summaries; in particular, the summaries are over-approximations of the bounded program behaviors. Whenever summaries cannot be reused straight away, our algorithm repairs them to maximize the chance that they can be reused in subsequent runs. We base our approach on satisfiability modulo theories (SMT) to take full advantage of a lightweight modeling approach and, at the same time, the ability to provide concise function summarization. Our approach leverages pre-computed function summaries in SMT to localize the checks of changed functions. Furthermore, to exploit the trade-off between precision and performance, our approach relies on an SMT solver not only for the underlying reasoning, but also for program modeling and the adjustment of its precision. On a benchmark suite consisting primarily of versions of Linux device drivers, we demonstrate that our algorithm achieves an order of magnitude speedup compared to prior approaches.
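Summary repair can be sketched with z3py (a toy function and summaries of ours; the article's algorithm operates on bounded program encodings):

```python
# A minimal sketch: if the old summary no longer over-approximates the
# changed body, weaken it and re-check instead of starting from scratch.
from z3 import Int, Implies, Solver, Not, unsat

x, ret = Int('x'), Int('ret')

def valid(formula):
    s = Solver()
    s.add(Not(formula))
    return s.check() == unsat

old_summary = ret == x + 1              # summary of the previous version
new_body    = ret == x + 2              # changed function body

if not valid(Implies(new_body, old_summary)):
    repaired = ret > x                  # weakened summary covering both
    assert valid(Implies(new_body, repaired))
    print('summary repaired to: ret > x')
```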
Empirical software metrics for benchmarking of verification tools
We study empirical metrics for software source code, which can predict the performance of verification tools on specific types of software. Our metrics comprise variable usage patterns, loop patterns, and indicators of control-flow complexity, and are extracted by simple data-flow analyses. We demonstrate that our metrics are powerful enough to devise a machine-learning-based portfolio solver for software verification. We show that this portfolio solver would be the (hypothetical) overall winner of the international competition on software verification (SV-COMP) in three consecutive years (2014-2016). This gives strong empirical evidence for the predictive power of our metrics and demonstrates the viability of portfolio solvers for software verification. Moreover, we demonstrate the flexibility of our algorithm for portfolio construction in novel settings: originally conceived for SV-COMP'14, the construction works just as well for SV-COMP'15 (considerably more verification tasks) and for SV-COMP'16 (considerably more candidate verification tools).
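The portfolio construction can be sketched as ordinary supervised learning over the extracted metrics (all feature values and tool names below are illustrative, not the paper's data):

```python
# A minimal sketch: predict the best verifier for a task from source-code
# metrics such as loop and variable-usage counts.
from sklearn.tree import DecisionTreeClassifier

# features per task: [loop_count, pointer_vars, branching_depth]
X = [[0, 2, 1], [5, 0, 3], [1, 8, 2], [7, 1, 4]]
y = ['ToolA', 'ToolB', 'ToolA', 'ToolB']     # best tool per training task

clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[6, 1, 3]]))              # pick a verifier for a new task
```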
Preserving provability over GPU program optimizations with annotation-aware transformations
GPU programs are widely used in industry. To obtain the best performance, a typical development process involves the manual or semi-automatic application of optimizations prior to compiling the code. Such optimizations can introduce errors. To avoid this, we can augment GPU programs with (pre- and postcondition-style) annotations to capture functional properties. However, keeping these annotations correct when optimizing GPU programs is labor-intensive and error-prone. This paper presents an approach to automatically apply optimizations to GPU programs while preserving provability, by defining annotation-aware transformations. It applies frequently-used GPU optimizations, but besides transforming code, it also transforms the annotations. The approach has been implemented in the Alpinist tool, and we evaluate Alpinist in combination with the VerCors program verifier, to automatically apply optimizations to a collection of verified programs and re-verify them.
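The idea can be illustrated with a before/after pair (JML-style syntax for illustration only; this is not Alpinist's actual rule set):

```python
# A minimal sketch: unrolling a verified loop by a factor of two must also
# transform its invariant, here by adding the evenness of the counter.
before = """
//@ loop_invariant 0 <= i && i <= n;
for (int i = 0; i < n; i++) { a[i] = 0; }
"""
after = """
//@ loop_invariant 0 <= i && i <= n && i % 2 == 0;
for (int i = 0; i + 1 < n; i += 2) { a[i] = 0; a[i + 1] = 0; }
if (n % 2 == 1) { a[n - 1] = 0; }
"""
print(before, after)
```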
