Discrete Mathematics & Theoretical Computer Science |
Finding a satisfying assignment for a $k$-CNF formula $(k \geq 3)$, assuming such exists, is a notoriously hard problem. In this work we consider the uniform distribution over satisfiable $k$-CNF formulas with a linear number of clauses (clause-variable ratio greater than some constant). We rigorously analyze the structure of the space of satisfying assignments of a random formula in that distribution, showing that basically all satisfying assignments are clustered in one cluster, and agree on all but a small, though linear, number of variables. This observation enables us to describe a polynomial time algorithm that finds $\textit{whp}$ a satisfying assignment for such formulas, thus asserting that most satisfiable $k$-CNF formulas are easy (whenever the clause-variable ratio is greater than some constant). This should be contrasted with the setting of very sparse $k$-CNF formulas (which are satisfiable $\textit{whp}$), where experimental results show some regime of clause density to be difficult for many SAT heuristics. One explanation for this phenomena, backed up by partially non-rigorous analytical tools from statistical physics, is the complicated clustering of the solution space at that regime, unlike the more "regular" structure that denser formulas possess. Thus in some sense, our result rigorously supports this explanation.