Improving Configuration Troubleshooting with Dynamic Information Flow Analysis
Complex software systems are difficult to configure and manage. When problems inevitably arise, operators spend considerable time troubleshooting them. Even for casual computer users, troubleshooting is often enormously frustrating. I focus specifically on configuration errors, in which the application code is correct but the software has been configured incorrectly, so that it does not behave as desired. For instance, a mistake in a configuration file may cause software to crash, produce undesired output, or run with degraded performance. In this talk, I show that system support for dynamic information flow analysis can substantially simplify troubleshooting and reduce the human effort it requires. I present ConfAid and X-ray, two diagnosis tools that use dynamic information flow analysis to identify the likely root cause of a configuration problem. ConfAid diagnoses configuration problems that lead to crashes and undesired outcomes, while X-ray focuses on diagnosing misconfigurations that cause performance problems. The output of these tools is an ordered list of the configuration tokens most likely to have caused the exhibited problem. I show that troubleshooting with information flow analysis takes only a few minutes, which is much faster and far less labor-intensive than manual troubleshooting.
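To make the idea of tracing a failure back to configuration tokens concrete, the following is a toy sketch only. It is not how ConfAid or X-ray is implemented; the class `Tainted`, the functions `load_config` and `rank_suspects`, and the token names are all hypothetical, and the ranking heuristic (counting how many failure-related values each token influenced) is an assumption chosen for brevity.

```python
from collections import Counter


class Tainted:
    """A value paired with the set of configuration tokens that influenced it."""

    def __init__(self, value, tokens=()):
        self.value = value
        self.tokens = frozenset(tokens)

    def __add__(self, other):
        # Propagate taint: the result depends on the tokens of both operands.
        if not isinstance(other, Tainted):
            other = Tainted(other)
        return Tainted(self.value + other.value, self.tokens | other.tokens)


def load_config(raw):
    """Label each configuration value with its own token name (hypothetical loader)."""
    return {name: Tainted(value, {name}) for name, value in raw.items()}


def rank_suspects(failure_values):
    """Order tokens by how many failure-related values they influenced.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for item in failure_values:
        counts.update(item.tokens)
    return sorted(counts, key=lambda token: (-counts[token], token))


# Hypothetical configuration and two values observed at the point of failure.
config = load_config({"max_clients": 0, "timeout": 30, "log_level": 2})
budget = config["max_clients"] + config["timeout"]
pool_size = config["max_clients"] + 5

suspects = rank_suspects([budget, pool_size])
print(suspects)
```

In this sketch, `max_clients` ranks first because it influenced both values involved in the failure, while `log_level` never appears because nothing related to the failure depended on it.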