To what extent can we reverse-engineer variability models from code? Our paper investigating this question will appear at ICSE'14. It's joint work with Sarah Nadi, Christian Kästner, and Krzysztof Czarnecki, and uses some heavy machinery (FarCE, TypeChef) to extract feature constraints from C codebases.
Highly configurable systems offer configuration options for tailoring software to specific needs. Not all combinations of configuration options are valid, though, and constraints arise for technical or nontechnical reasons. To reason about configurations, it is valuable to explicitly describe the constraints in a variability model. To automate creating variability models, we need to understand the origin of such configuration constraints. We propose an approach to automatically extract configuration constraints from C code based on build-time errors and a novel feature-effect heuristic. We analyze the feasibility of our approach on four highly configurable open-source systems, and use the results to quantitatively and qualitatively study the various kinds of constraints in existing variability models. Both of our extraction heuristics are highly accurate (93% and 77%, respectively). We find that 19% of the variability model constraints are statically reflected in the code. However, many of the remaining constraints require expert knowledge or more expensive analyses. This suggests that while substantial parts of variability models can be efficiently reverse-engineered from the codebase, other parts contain significant domain and expert knowledge not readily reflected in the code. We argue that our approach, tooling, and experimental results support researchers and practitioners working on variability model re-engineering, evolution, and consistency-checking techniques.
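The abstract does not spell out how the feature-effect heuristic works, so the following is only a rough illustration of one intuition behind it: if code guarded by one option appears only inside code guarded by another, selecting the inner option without the outer one has no effect, which hints at an implication between the two. This toy Python sketch scans preprocessor directives line by line; it is not how FarCE or TypeChef operate, and the function name and the `CONFIG_*` options are hypothetical.

```python
import re

# Matches the conditional directives this sketch cares about.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b\s*(\w+)?")


def nested_ifdef_implications(source):
    """Return (inner, outer) pairs, read as "inner implies outer".

    Simplifying assumption: only plain `#ifdef NAME` blocks are interpreted.
    `#ifndef`, `#if`, `#elif` and `#else` branches are tracked for nesting
    depth but contribute no constraints.
    """
    stack = []  # enclosing conditions; None marks a branch we do not interpret
    implications = set()
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if not match:
            continue
        kind, name = match.group(1), match.group(2)
        if kind == "ifdef" and name:
            for outer in stack:
                if outer is not None and outer != name:
                    implications.add((name, outer))
            stack.append(name)
        elif kind in ("ifndef", "if"):
            stack.append(None)
        elif kind in ("elif", "else"):
            if stack:
                stack[-1] = None  # the alternative branch no longer means "NAME is defined"
        elif kind == "endif" and stack:
            stack.pop()
    return implications


# Hypothetical C fragment: IPv6 code only exists inside the networking block.
EXAMPLE = """
#ifdef CONFIG_NET
void net_init(void);
#ifdef CONFIG_IPV6
void ipv6_init(void);
#endif
#endif
"""

if __name__ == "__main__":
    for inner, outer in sorted(nested_ifdef_implications(EXAMPLE)):
        print(inner, "implies", outer)
```

A purely textual scan like this misses arbitrary `#if` expressions, macros, and the build system; handling those is presumably where the heavier analysis machinery comes in.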