Parallel and Selective Symbolic Execution

Prof. George Candea and Stefan Bucur                    [slides]

EPFL (Lausanne, Switzerland)

We will describe two approaches to scaling the analysis of software systems to large, real-world systems: parallel symbolic execution with Cloud (http://cloud9.epfl.ch) and selective symbolic execution of binaries with S2E (http://s2e.epfl.ch).

Cloud9 is a platform for automated testing. It is based on a scalable parallelization of symbolic execution on clusters of commodity hardware. Cloud9 provides a systematic interface for writing “symbolic tests” that concisely specify entire families of inputs and behaviors to be tested. Cloud9 can handle not only single-threaded programs but also multi-threaded and distributed systems. It includes a new symbolic environment model that is the first to support all major aspects of the POSIX interface, such as processes, threads, synchronization, networking, IPC, and file I/O. We find that Cloud9 can automatically test real systems like memcached, Apache httpd, lighttpd, the Python interpreter, rsync, and curl, while scaling the amount of useful work done linearly with the number of nodes in the cluster.

S2E is a platform for writing tools that analyze the properties and behavior of software systems. So far, we have used S2E to develop an automated bug finding tool for both kernel-mode and user-mode binaries, a reverse engineering tool for proprietary software, and a comprehensive performance profiler. Building these tools on top of S2E took less than 770 LOC and 40 person-hours each. S2E is based on two new ideas, selective symbolic execution and relaxed execution consistency models, which allow S2E to simultaneously analyze entire families of execution paths, instead of just one execution at a time; perform the analyses in-vivo within a real software stack – user programs, libraries, kernel, drivers, etc. – instead of using abstract models of these layers; and to operate directly on binaries, including proprietary and obfuscated software. One can analyze a full Windows or Linux software stack using S2E.

We will close the lecture with measurements offering insights into how one might be able to speed up the constraint solver underlying our tools.

Lecturer Bios

Prof. George Candea heads the Dependable Systems Lab at EPFL in Switzerland, where he leads research focused on techniques, tools, and runtimes that improve the dependability of software systems while increasing programmer productivity. Until recently, George was also Chief Scientist of Aster Data, a Silicon Valley-based large-scale data analytics company he co-founded in 2005 (now part of Teradata). Aster Data has just received the 2011 "Technology Pioneer" award from the World Economic Forum. George is a recipient of the MIT TR35 "Top 35 Young Technology Innovators" award for 2005. George was part of the founding team of the Stanford/Berkeley Recovery-Oriented Computing (ROC) project. During his studies, he also worked at Oracle, IBM Research, and Microsoft Research. George received his PhD in computer science from Stanford University in 2005 and his BS (1997) and MEng (1998) in computer science from the Massachusetts Institute of Technology.

Stefan Bucur is a PhD student in the Dependable Systems laboratory, under the supervision of Prof. George Candea. He works on finding systems solutions to alleviate path explosion in automated testing, and enable it to scale to real systems with millions of lines of code. In this regard, Stefan is building Cloud9, a cluster-based automated testing platform based on parallel symbolic execution. He was recently awarded the 2011 Google Europe Fellowship in Systems Dependability to support his research for three years. Before joining EPFL, Stefan was a student in "Politehnica" University in Bucharest, where he also received his Dipl.Eng. in 2009. During his undergraduate studies, he was a Google Summer of Code student (2008) and mentor (2009), he worked with Adobe Systems Inc. as a student intern (2009), and was a finalist in the Microsoft Windows Embedded Student Challenge (WESC) 2006.

  • No labels