On Incremental Pre-processing for SMT

CADE 2023

Nikolaj Bjørner
Microsoft Research
Katalin Fazekas
TU Wien

Outline

What semantic conditions and interfaces are required for SMT pre-processing to allow incremental use?

Calculus

  • Main notion: Simplification modulo model updates

  • Rules covering: SAT/SMT/FTP/MIP

  • Replay: re-add removed formulas when new constraints are added

Why Pre-processing?

Regin

Why Incremental pre-processing?

  • Better performance for highly incremental applications

  • Offer more uniform user experiences

  • Unleash in-processing

Pre-processing for SMT - examples

  • $x > x - y + 1$ $\leadsto$ $y > 1$

    • Equivalence preserving
  • $x + 3 = y + z \land \phi[x]$ $\leadsto$ $\phi[y + z - 3]$

    • $x$ is solved for
  • $F, x \leq y, x \leq z, y \leq u$ $\leadsto$ $F$

    • $x, y \not\in FV(F)$
    • interpret as: $x \mapsto \min(y,z), y \mapsto u$
  • $F, p \lor C \leadsto F$

    • $p$ is a blocked literal
    • (resolving on $p \lor C$ in $F$ produces tautologies)

Pre-processing for SMT - incremental

  • $x > x - y + 1$ $\leadsto$ $y > 1$ - add $z > x$

    • $\leadsto y > 1 \land z > x$.
  • $x + 3 = y + z \land \phi[x]$ $\leadsto$ $\phi[y + z - 3]$ - add $z > x$

    • $\leadsto \phi[y + z - 3] \land z > y + z - 3$.
  • $F, x \leq y, x \leq z, y \leq u$ $\leadsto$ $F$ - add $z > y + x$

    • $\leadsto F, x \leq y, y \leq u, x \leq z, z > y + x$.
  • $F, p \lor C \leadsto F$ - add $\neg p \lor D$

    • $\leadsto F, p \lor C, \neg p \lor D$.

Pre-processing - as inference rule

$x + 3 = y + z \land F[x]$ $\leadsto$ $F[y + z - 3]$

\[ \state{F,\ x + 3 = y + z}{\theta} \Longrightarrow \state{F[y + z - 3 / x]}{\theta\rigidSubst{x}{y + z - 3}{x + 3 = y + z}} \]
  • If $M \models F[y + z - 3]$,
  • then $M[x \mapsto y + z - 3] \models F, x + 3 = y + z$.
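
The model update for a solved variable can be checked by brute force on a small example. The sketch below is only an illustration: the predicate `F` is a hypothetical stand-in for $F[x]$ and the integer range is arbitrary; neither comes from the talk.

```python
from itertools import product

def F(x, y, z):
    # Hypothetical stand-in for F[x]; chosen only for illustration.
    return x > 0 and x + y < 10

def reconstruction_holds(domain=range(-5, 6)):
    """Check: M |= F[y + z - 3] implies M[x -> y + z - 3] |= F, x + 3 = y + z."""
    for y, z in product(domain, repeat=2):
        if F(y + z - 3, y, z):            # M satisfies the simplified formula
            x = y + z - 3                 # model update for the solved variable
            if not (F(x, y, z) and x + 3 == y + z):
                return False
    return True
```

The check succeeds for any choice of `F`, because the updated value of `x` is exactly the term that was substituted for it.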

Simplification modulo $\theta$ - motivation

Just preserving satisfiability is not sufficient for ensuring compositionality.

Semantic condition on pre-processing and inferences that

  • Capture model reconstruction

  • Allow formalizing main useful cases of incremental pre-processing

Simplification modulo $\theta$ - definition

We say that the formula $F$ simplifies to $F'$ modulo $\theta$, denoted $\modelEquivTheta{F}{\theta}{F'}$, if both of the following hold:

  • If $\model \models F$ then there is a model $\model'$ such that $\model' \models F'$ and $\model'$ agrees with $\model$ on all symbols that are in $F$ or in background theories or not in $F'$.
  • If $\model' \models F'$ then $\model'\theta \models F$.

Simplification state

Substitution with side-constraints

  • $\theta := \subst{x_1}{t_1}{\Psi_1}{\mathbb{B}_1} \ldots \subst{x_k}{t_k}{\Psi_k}{\mathbb{B}_k}$

  • The effect of $\theta$

    • On models - $\model\theta\subst{x}{t}{\Psi}{\mathbb{B}} = \modelUpdate{\model}{x}{t^\model} \theta$
    • On formulas - $F\subst{x}{t}{\Psi}{\mathbb{B}}\theta = F[t/x]\theta$
    • To undo simplifications - add back $\Psi_i$ if $\mathbb{B}_i = \top$.
    • Each $\mathbb{B}_i$ is either $\top$ or $\bot$.
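
One way to picture the effect of $\theta$ on models is as a list of entries that is replayed from the last entry to the first, following $\model\theta\subst{x}{t}{\Psi}{\mathbb{B}} = \modelUpdate{\model}{x}{t^\model}\theta$. The sketch below is an assumption-laden illustration: the names `Entry` and `convert_model`, the use of Python callables for terms, and keeping $\Psi$ as plain text are all hypothetical choices, and the flag is stored without being interpreted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Dict[str, int]

@dataclass
class Entry:
    var: str                        # x
    term: Callable[[Model], int]    # t, evaluated in the current model
    side: str                       # side constraints Psi (text only, for replay)
    flag: bool                      # B, stored but not interpreted in this sketch

def convert_model(model: Model, theta: List[Entry]) -> Model:
    """Apply theta to a model: the last entry is applied first."""
    m = dict(model)
    for entry in reversed(theta):
        m[entry.var] = entry.term(m)
    return m

# Entries for the earlier example F, x <= y, x <= z, y <= u  ~>  F.
theta = [
    Entry("x", lambda m: min(m["y"], m["z"]), "x <= y, x <= z", True),
    Entry("y", lambda m: m["u"], "y <= u", True),
]
full_model = convert_model({"u": 5, "z": 2}, theta)
```

Here `y` is assigned before `x`, since the value of `x` is computed from `y`; the resulting model satisfies all three removed inequalities.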

Example

\[ \state{F,\ p \lor C}{\theta} \Longrightarrow \state{F}{\theta\flexSubst{p}{p \lor \neg C}{p \lor C}} \]
\[ \modelEquivTheta{p \lor C, F}{p \mapsto p \lor \neg C}{F} \]
  • If $M' \models F$, then $M'\flexSubst{p}{p \lor \neg C}{p \lor C} \models F, p \lor C$.

  • If $M \models F, p \lor C$, then $M \models F$.
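
The first direction can be checked exhaustively on a tiny propositional instance. The formula below is a hypothetical example, not one from the talk: $C = a$ and $F = (\neg p \lor \neg a \lor b) \land (a \lor b)$, so that $p \lor a$ is blocked on $p$.

```python
from itertools import product

def F(m):
    # F = (not p or not a or b) and (a or b); hypothetical example formula.
    return (not m["p"] or not m["a"] or m["b"]) and (m["a"] or m["b"])

def removed_clause(m):
    # The blocked clause p or C, with C = a.
    return m["p"] or m["a"]

def update(m):
    # Model update p -> p or not C.
    m2 = dict(m)
    m2["p"] = m["p"] or not m["a"]
    return m2

def all_models():
    for p, a, b in product([False, True], repeat=3):
        yield {"p": p, "a": a, "b": b}

def reconstruction_is_sound():
    """Every model of F, once updated, satisfies both F and p or C."""
    return all(F(update(m)) and removed_clause(update(m))
               for m in all_models() if F(m))
```

The update only flips `p` to true when `a` is false, which is exactly the situation in which the clause $\neg p \lor \neg a \lor b$ stays satisfied through $\neg a$.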

Pre-processing as abstract inference rules

  1. Generic rule covering many scenarios (including from SAT).

    • specialization when model preservation can be established compositionally
    • specialization when variables occur uniquely
  2. Rule to handle when variables can be solved for.

Blocked Clauses

\[ \state{F,\ p \lor C}{\theta} \Longrightarrow \state{F}{\theta\flexSubst{p}{p \lor \neg C}{p \lor C}} \]
  • All resolvents on $p$ of $p \lor C$ with clauses in $F$ are tautologies.

  • Model for $p$ is updated to ensure $p \lor C$ is satisfied

  • If adding constraints such that $p \lor C$ is no longer blocked, add back $p \lor C$
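
The blockedness condition is purely syntactic and can be sketched as follows. The encoding is an assumption: clauses are frozensets of non-zero integers in the DIMACS style (a negative number is a negated variable), and the function names are illustrative only.

```python
def is_tautology(clause):
    """A clause is a tautology if it contains a literal and its negation."""
    return any(-lit in clause for lit in clause)

def resolve(c1, c2, lit):
    """Resolvent of c1 (containing lit) and c2 (containing -lit)."""
    return (c1 - {lit}) | (c2 - {-lit})

def is_blocked(clause, lit, formula):
    """True if every resolvent of clause on lit with a clause of formula is a tautology."""
    return all(is_tautology(resolve(clause, other, lit))
               for other in formula if -lit in other)

# p = 1, a = 2, b = 3: the clause p or a against F = {not p or not a or b, a or b}.
clause = frozenset({1, 2})
formula = [frozenset({-1, -2, 3}), frozenset({2, 3})]
blocked_before = is_blocked(clause, 1, formula)

# Adding the clause not p or b yields a non-tautological resolvent a or b,
# so p or a is no longer blocked and has to be added back.
blocked_after = is_blocked(clause, 1, formula + [frozenset({-1, 3})])
```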

Covered Clauses

\[ \state{F,\ p \lor q}{\theta} \Longrightarrow \state{F,\ p \lor q \lor r}{\theta\flexSubst{p}{p \lor r}{p \lor q}} \]
  • $F := \neg p \lor r \lor s, \neg p \lor r \lor t, F'$
  • $p$ occurs only positively in $F'$

Skolemization, Tseitin

\[ \state{\forall x \ . \ \exists y \ . \ p(x, y),\ F}{\theta} \Longrightarrow \state{\forall x \ . \ p(x, f_{\mathit{sk}}(x)),\ F}{\theta} \]
\[ \state{p \lor (q \land r),\ F}{\theta} \Longrightarrow \state{p \lor s_{q \land r},\ \neg s_{q \land r} \lor q,\ \neg s_{q \land r} \lor r,\ F}{\theta} \]

$\nameUpdate$ - Generic rule

\[ \state{F,\ \Psi}{\theta} \Longrightarrow \state{F,\ \Phi}{\theta\flexSubst{x}{t}{\Psi}} \quad \textrm{if } \modelEquiv{F,\ \Psi}{x}{t}{F,\ \Phi} \]
  • $F, \Psi$ - initial formula, $F, \Phi$ simplified formula
  • $\theta$ - model converter
  • $\flexSubst{x}{t}{\Psi}$ - updated model converter, replay $\Psi$ to undo simplification

$\nameFlex$ - Special case

\[ \state{\Phi[t + x],\ F}{\theta} \Longrightarrow \state{\Phi[y],\ F}{\theta\flexSubst{x}{y - t}{\Phi[t + x]}} \]
  • $x$ occurs uniquely in the sub-term $t + x$, $x \not\in F$.

$\nameFlex$:

\[ \state{F,\ \Psi}{\theta} \Longrightarrow \state{F,\ \Psi[t/x]}{\theta\flexSubst{x}{t}{\Psi}} \]
  • if $x \in \Psi, x \not\in F$ and $\modelEquiv{\Psi}{x}{t}{\Psi[t/x]}$.

$\nameInvert$ - Special case

\[ \state{F[x + t]}{\theta} \Longrightarrow \state{F[y]}{\theta\flexSubst{x}{y - t}{y \simeq x + t}} \]
  • if $x$ occurs uniquely in $F$, $y$ is fresh

$\nameRigid$ - Solvable variables

\[ \state{F}{\theta} \Longrightarrow \state{F[t/x]}{\theta\rigidSubst{x}{t}{\Psi}} \]
  • If $\Psi\subseteq F, x \not\in t$, and $\Psi \Rightarrow \exists y \ . \ x \simeq t[y]$

Incrementally adding constraints

  • $\state{F}{\theta}$ is a state after pre-processing of $F_0$

  • Add formula $\Phi$

  • How should $F$, $\theta$, $\Phi$ be adjusted to provide state

    • $\state{F'}{\theta'}$
    • such that $\modelEquivTheta{F_0 \land \Phi}{\theta'}{F'}$?

Solution

  • Convert $\theta$ to a clean substitution w.r.t. new formula $\Phi$.

    • Removes model updates that are no longer sound when adding $\Phi$.
    • Adds side constraints from $\theta$ back to formula $F$.
  • Apply clean substitution $\theta'$ to $\Phi$,

    • Return $\state{F \land \Phi\theta'}{\theta'}$

Clean substitutions

A formula $\Phi$ is clean w.r.t. a substitution sequence $\theta$ iff

  • $\theta = \varepsilon$, or
  • $\theta = \subst{x}{t}{\Psi}{\mathbb{B}}\theta'$, $x\not\in\Phi$ and $\Phi$ is clean with respect to $\theta'$, or
  • $\theta = \rigidSubst{x}{t}{\Psi}\theta'$ and $\Phi[t/x]$ is clean with respect to $\theta'$.

Update

\[\begin{array}{lllcl} \multicolumn{2}{l}{\nameAdd:} & & & \\ &\state{F}{\theta} & \Longrightarrow & \state{F,\Phi\theta}{\theta} & \ \ \textrm{if } \Phi \textrm{ is clean w.r.t. } \theta \\ \multicolumn{2}{l}{\nameUndo:} & & & \\ &\state{F}{\theta_0\subst{x}{t}{\Psi}{\mathbb{B}}\theta} & \Longrightarrow & \state{F,\Psi\theta}{\theta_0 \theta} & \ \ \textrm{if } \Psi \textrm{ is clean w.r.t. } \theta \\ \\ \end{array} \]
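
The two rules and the cleanness test can be sketched in a few lines. Everything concrete here is an assumption made for illustration: formulas are nested tuples, each entry carries a hypothetical `rigid` field to distinguish the two non-empty cases of the cleanness definition, and the entry to undo is selected by its index in $\theta$.

```python
from dataclasses import dataclass

# Terms and formulas as nested tuples: ("var", "x"), ("const", 3), (op, lhs, rhs).
def var(name):
    return ("var", name)

def free_vars(e):
    if e[0] == "var":
        return {e[1]}
    if e[0] == "const":
        return set()
    return free_vars(e[1]) | free_vars(e[2])

def substitute(e, x, t):
    """Return e[t/x]."""
    if e[0] == "var":
        return t if e[1] == x else e
    if e[0] == "const":
        return e
    return (e[0], substitute(e[1], x, t), substitute(e[2], x, t))

@dataclass(frozen=True)
class Entry:
    var: str
    term: tuple
    side: tuple      # side constraints Psi, as a tuple of formulas
    rigid: bool      # hypothetical field: True for solved (rigid) variables

def is_clean(phi, theta):
    """Cleanness of phi w.r.t. theta, read left to right as in the definition."""
    for e in theta:
        if e.rigid:
            phi = substitute(phi, e.var, e.term)
        elif e.var in free_vars(phi):
            return False
    return True

def apply_theta(phi, theta):
    for e in theta:
        phi = substitute(phi, e.var, e.term)
    return phi

def add(F, theta, phi):
    """Add rule: requires phi to be clean w.r.t. theta."""
    if not is_clean(phi, theta):
        raise ValueError("phi is not clean w.r.t. theta")
    return F + [apply_theta(phi, theta)], theta

def undo(F, theta, i):
    """Undo rule for the i-th entry: its side constraints must be clean w.r.t. the tail."""
    entry, tail = theta[i], theta[i + 1:]
    if not all(is_clean(psi, tail) for psi in entry.side):
        raise ValueError("side constraints are not clean w.r.t. the tail")
    return F + [apply_theta(psi, tail) for psi in entry.side], theta[:i] + tail

# Replaying the example: F, x <= y, x <= z, y <= u  ~>  F, then add z > y + x.
x, y, z, u = var("x"), var("y"), var("z"), var("u")
theta = [
    Entry("x", ("min", y, z), (("<=", x, y), ("<=", x, z)), rigid=False),
    Entry("y", u, (("<=", y, u),), rigid=False),
]
phi = (">", z, ("+", y, x))
F1, theta1 = undo([], theta, 1)      # add back y <= u
F2, theta2 = undo(F1, theta1, 0)     # add back x <= y, x <= z
F3, theta3 = add(F2, theta2, phi)    # phi is now clean
```

In this run both entries have to be undone before `phi` can be added, which reproduces the earlier incremental example where all three inequalities return alongside $z > y + x$. Which entries to undo, and in what order, is left to the caller in this sketch.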

Summary and Outlook

In this paper:

  • A notion of model preservation modulo $\theta$ to capture conditions for incrementality.
  • Semantic rules for incremental pre-processing.
  • Exhibited how mainstream SMT pre-processing rules can be captured by semantic rules.

Future:

  • FOL liftings of SAT pre-processing rules are justified by preservation of proofs
    • Instead justify by model updates
  • Integrate notions of redundant clauses with calculus