Dependability and Error-Recovery in Security Intensive Financial Systems


An Overview of Project Results

Handling Runtime Monitoring Overhead - Christian Colombo from CESCA on Vimeo.

Project Description

Background

As software systems grow in size and complexity, it is becoming harder to ensure their correctness. In particular, security-intensive systems, such as software handling online financial transaction are particularly difficult to test for robustness, since their correctness depends on a combination of the system's behaviour, that of potentially malicious users, all working in a real-time setting. Potential errors or uncatered for sequences of user interaction in the code, may lead to huge losses for the service-provider, and loss of trust from the users' end. A number of techniques are currently used to improve the robustness, including heavy testing prior to the online deployment of the systems. However, the difficulty of testing such systems limits the number of and type of errors which can be identified. Execution paths not covered in the testing phase may still contain errors. Recently, other approaches have been proposed and explored. One such approach, runtime verification, a complementary technique to traditional testing regimes, which enriches the system with monitors which validate the execution path of the system either at runtime or offline through logs produced by the system. In this project, we propose to explore and extend the use of runtime verification on online financial transaction handling software currently being developed and used by Ixaris Ltd for identifying anomalies in the system behaviour at runtime.


State of the art

Increasing the robustness and dependability of safety-critical systems has been one of the areas on the forefront of computer science since its inception. Different techniques have been proposed and used to improve the reliability of systems, most of which, however, prove to be of limited applicability in security-intensive interactive systems. Extensive testing is currently the primary means used to discover errors in a system's logic. Unfortunately, testing explores only a small subset of possible system execution paths, and its effectiveness is substantially low in security-intensive systems, in which malicious users actively try to discover weaknesses in the system. Furthermore, once the system passes a testing regime, it is usually deployed with no further runtime checks for errors. This puts further pressure on system administrators to identify bugs and vulnerabilities in the system discovered by users, who may abusing of such unintentional features.

Model checking, and other automatic verification techniques [CGP00], which ensure the correctness relative to a precise system specification, are being developed and scaled up to work on larger systems. Unfortunately, these approaches still cannot handle large, real-life software systems and require expensive and continuous expert input to support software revisions.

An alternative approach, complementary to these techniques, is runtime verification [CM05], in which one checks the correctness with respect to a set of given properties, of the correctness-critical system during its execution. Through the specification of system invariants in an appropriate logic, monitoring code is automatically instrumented into the system, to ensure that execution paths followed at runtime do not violate the properties. Upon identifying such violations, the system may take any necessary action, specified by the system developers, to remedy the problem or stop further violations.

Such an approach can be done in an online fashion - during and within the actual execution of the system, or offline - by having the system responsible for recording traces of relevant events which are eventually analysed [CDR04]. Online runtime verification has the advantage that the monitors may give direct feedback to the system, correct the detect property violation during the execution of the system. The main challenge of online verification is that of performing the runtime checks without a major impact on the system's performance. However, although various algorithms have been proposed to make online runtime verification viable, for instance using dynamic programming [HR02] and term rewriting systems [CELM96, HR04] have been proposed, degradation of performance is inevitable, especially when more expressive logics are required for expressing system properties. Another challenge in online runtime verification is that of properties cross cutting across independent concurrent threads. Instrumenting monitors automatically for such systems is particularly challenging without introducing substantial communication overhead between the threads. Offline runtime verification [RH01, RH05] circumvents these problems by only assigning to the system the task of logging all relevant events. The monitoring of these traces is performed independently, possibly on another machine, at a different time. This not only minimises the performance issue, but also ensures that the system logic remains intact, since no monitoring code is injected inside. Despite these advantages, offline monitoring suffers from the fact that it cannot directly influence the system, thus still allowing errors and problems to go through, even if the system administrators are alerted about such anomalies.


Proposal

In this project, we propose to explore and extend the use of runtime verification on online financial transaction handling software currently being developed and used by Ixaris Ltd for identifying anomalies in the system behaviour at runtime.

Due to various factors, online monitoring is not a viable on such systems: (i) the systems have a high traffic throughput; (ii) the type of properties one would need to monitor require a highly expressive logic, to express properties over real-time; (iii) financial transactions may give rise to different threads related to the same transaction (for example, a transaction may involve communicating to both the buyer and the bank), relations between which may be required to be enforced; (iv) response time on such systems is crucial; and (v) instrumenting monitoring code which influences the logic of the system directly is considered too risky. Due to thse factors, it is imperative that the runtime monitoring is performed offline. Such an approach will guarantee the generation of alerts whenever the system invariants are violated, without loss of performance on the system itself. However, using such an approach would not enable automatic remedial actions to be taken to stop the online system to stop further compromising behaviour.

We propose to extend current runtime-verification techniques to handle time-lagged runtime-monitoring, in which erroneous behaviour captured by the monitor may trigger remedial actions in the system. Traditionally, this can be done when the verification is performed at runtime. In systems where the runtime verification is performed asynchronously, there may be a time lag from when the actual error occurs and when it is discovered, requiring the system to have to rollback any further actions it may have performed. Building such a system enables both the benefits of offline verification, where the performance of the system is not degraded or compromised, and those of online verification, which enables anomalies to be remedied automatically. Such a balance between on- and offline verification is key to real-life systems with intricate properties to be monitored on massive logs, and will certainly have a direct benefit on security-critical systems by making them safer for the service providers and the end users.


References
  • [CDR04] Feng Chen, Marcelo D'Amorim, and Grigore Rosu. Monitoring-oriented programming: A tool-supported methodology for higher quality object-oriented software. Technical Report (No. UIUCDCS- R-2004-2420), Department of Computer Science, University of Illinois at Urbana-Champaign, 2004.
  • [CELM96] Manuel Clavel, Steven Eker, Patrick Lincoln, and Jose Meseguer. Principles of maude. Electr. Notes Theor. Comput. Sci., 4, 1996.
  • [HR01a] Klaus Havelund and Grigore Rosu. Monitoring programs using rewriting. In ASE '01: Proceedings of the 16th IEEE international conference on Automated software engineering, page 135, Washington, DC, USA, 2001.
  • [HR01b] Klaus Havelund and Grigore Rosu. Testing linear temporal logic formulae on finite execution traces. Technical report, RIACS, 2001.
  • [HR02] Klaus Havelund and Grigore Rosu. Synthesizing monitors for safety properties. In Tools and Algorithms for Construction and Analysis of Systems, pages 342-356, 2002.
  • [HR04] Klaus Havelund and Grigore Rosu. Efficient monitoring of safety properties. Int. J. Softw. Tools Technol. Transf., 6(2):158-173, 2004.
  • [RH01] Grigore Rosu and Klaus Havelund. Synthesizing dynamic programming algorithms from linear temporal logic formulae. Technical report, RIACS, 2001.
  • [RH05] Grigore Rosu and Klaus Havelund. Rewriting-based techniques for runtime verification. Automated Software Engg., 12(2):151-197,2005.
  • [CGP00] Edmund M. Clarke, Orna Grumberg and Doron Peled, Model Checking, MIT Press, 2000.
  • [CM05] S. Colin and L. Mariani, Run-Time Verification, chapter 18 of Model-based Testing of Reactive Systems, volume 3472 of Lecture Notes in Computer Science. Springer, 2005.

Publications
  • Dynamic Event-Based Runtime Monitoring of Real-Time and Contextual Properties, Christian Colombo, Gordon J. Pace and Gerardo Schneider, in FMICS 2008, Italy. (PDF) (BiBTeX entry)
  • A Practical Approach to Runtime Verification of Real-Time Properties for Java Programs, Christian Colombo, Gordon J. Pace and Gerardo Schneider, internal report 01-WICT-2008, Malta. (PDF)
  • Practical Runtime Monitoring with Impact Guarantees of Java Programs with Real-Time Constraints, Christian Colombo, in MSc Thesis 2008, University of Malta. (PDF)
  • Safe Runtime Verification of Real-Time Properties, Christian Colombo, Gordon J. Pace and Gerardo Schneider, in FORMATS 2009, Budapest, Hungary. (PDF) (BiBTeX entry)
  • LARVA - Safer Monitoring of Real-Time Java Programs, Christian Colombo, Gordon J. Pace and Gerardo Schneider, in SEFM 2009, Hanoi, Vietnam. (PDF) (BiBTeX entry)
  • Resource-Bounded Runtime Verification of Java Programs with Real-Time Properties, Christian Colombo, Gordon J. Pace and Gerardo Schneider, Department of Computer Science, University of Malta. Technical Report CS2009-01, 2009. (PDF) (BiBTeX entry)
  • Offline Runtime Verification with Real-Time Properties: A Case Study, Christian Colombo, Gordon J. Pace and Patrick Abela, internal report 01-WICT-2009, Malta. (PDF)
  • Statistics and Runtime Verification, Andrew Gauci, Gordon J. Pace and Christian Colombo, internal report 02-WICT-2009, Malta. (PDF)
  • Slowdown Invariance of Timed Regular Expressions, Ingram Bondin, Gordon J. Pace and Christian Colombo, internal report 03-WICT-2009, Malta. (PDF)
  • LarvaStat: Monitoring of Statistical Properties, Christian Colombo, Andrew Gauci and Gordon J. Pace, in RV, Malta, 2010. (PDF)
  • Compensation-Aware Runtime Monitoring, Christian Colombo, Gordon J. Pace and Patrick Abela, in RV, Malta, 2010. (PDF)

Tools

Presentations
  • Dynamic Event-Based Runtime Monitoring of Real-Time and Contextual Properties, FMICS'08. (PDF)
  • Conflict Detection in Contracts, FLACOS'08. (PDF)
  • Logical Automata for Runtime Verification and Analysis (Larva), MaltaJUG. (PDF)
  • Safe Runtime Verification of Real-Time Properties, FORMATS'09. (PDF)
  • LARVA - Safer Monitoring of Real-Time Java Programs, SEFM'09. (PDF)
  • Resource-Bounded Runtime Verification of Java Programs with Real-Time Properties, SYNCHRON'09. (PDF)
  • Resource-Bounded Runtime Verification and Compensations, SYNCHRON'09. (PDF)
  • Offline Runtime Verification with Real-Time Properties: A Case Study, WICT'09. (PDF)
  • Statistics and Runtime Verification, WICT'09,
  • Slowdown Invariance of Timed Regular Expressions, WICT'09.
  • LarvaStat: Monitoring of Statistical Properties, RV'10. (PDF)
  • Compensation-Aware Runtime Monitoring, RV'10. (PDF)