Untangling System Configurations

Assistant Professor Arjun Guha

Assistant Professor Arjun Guha looks to programming languages to manage complex computing infrastructures.

The primary role of a system administrator is to maintain their organization's computing infrastructure -- they deploy new software, manage server load, and address security vulnerabilities, among other responsibilities. All of these tasks require the administrator to write and maintain system configurations.

Not too long ago, it was feasible to manage systems by directly running installers, editing configuration files, and so on. However, according to CICS assistant professor Arjun Guha, the scale of modern data centers and cloud computing environments has made these old approaches brittle and ineffective. In the past few years, several large-scale outages at Facebook, Skype, the New York Stock Exchange, and other organizations have been attributed not to software bugs, but to misconfigured software.

Guha, who specializes in programming languages, said, "System configuration in these new, complex environments is a problem that naturally lends itself to domain-specific languages (DSLs)." In fact, he added, there are already several commercially-supported system configuration DSLs available, such as Puppet, Chef, CFengine, Ansible, and others. A common refrain among researchers and practitioners is that the system configuration languages should be "declarative" - they should specify what the state of a system should be and not how to get there. "Real-world languages are far from that ideal and suffer a variety of problems," he said.

In a paper presented at ACM SIGPLAN's Conference on Programming Language Design and Implementation 2016, "Rehearsal: A Configuration Verification Tool for Puppet," Arjun and his students, Rian Shambaugh and Aaron Weiss, investigated problems that arise in Puppet, one of the most widely-used system configuration languages. They discovered that writing Puppet configurations that are either non-deterministic, or flap between two states, or have other kinds of unexpected effects, is straightforward. The fundamental issue is that system configurations are necessarily complex and different parts of a configuration may interact in unexpected ways. In a large configuration spanning thousands of lines, it can be impossible for a system administrator to keep all interactions in mind.

To address this issue, Shambaugh, Weiss, and Guha developed Rehearsal, a system configuration verification tool that quickly and automatically answers questions such as, "Is this configuration deterministic?" and "Does this configuration flap between states?" In addition, Rehearsal can automatically repair certain classes of system configuration bugs.

"System configuration in these new, complex environments is a problem that naturally lends itself to domain-specific languages (DSLs)."

The researchers explained that the key idea behind Rehearsal is simple: take a complicated system configuration language and compile it to a simple language of file-system operations. These file-system manipulating programs can be modeled as logical formulas in a SAT solver and properties such as determinism and idempotence can be posed as queries to the SAT solver. The challenge is to do this and keep the model tractable. Rehearsal uses both classic techniques (partial-order reduction and program slicing) and new insights about the structure of system configurations to achieve dramatic speedups over naive modeling approaches. With all optimizations enabled, the tool runs in a few seconds on thousands on lines of configurations.

In addition to Rehearsal, Guha and his students are working on ways to make system configuration testing and the bug fixing process more efficient. System administrators typically test configuration changes in testing environments, such as virtual machines to find system configuration bugs. Once detected, these bugs are easy to fix using terminal commands; however, these actions only fix the test machine and not the configuration, resulting in a time-intensive testing and recompiling process. To address this problem, they are using program synthesis techniques to derive repairs to system configurations from terminal interactions. "The key challenge," said Weiss, "is to update the configuration without breaking the abstractions and structure set up by the system administrator. In other words, the tool needs to synthesize an edit that the sysadmin can understand and accept."

Guha and his students are particularly interested in domain-specific languages and applying programming languages to problems in systems. In addition to the work on system configurations described above, the group also works on languages for software-defined networking and Web programming.