Ecne: Automated Verification of ZK Circuits

May 12, 2022

Ecne: Automated Verification of ZK Circuits

We present Ecne, the first verifier for testing R1CS soundness.

Prelude: Contextualizing the Project

First, we give a brief overview of the Circom and Snarkjs zk-SNARK compilation toolchains.

The central object of a zk-SNARK is a function $f$ which takes in an input $x$ and yields an output $y$ . Circom source code files contain code to generate a witness (a set of intermediate computation values, analogous to the "computation trace" of function execution on $x$ ), as well as a set of quadratic constraint equations on these witness values that must be satisfied in order for a valid proof to be generated. The constraint equations are mapped into a rank-1 constraint system (R1CS, equivalent to a QAP), which is a set of degree- $2$ polynomial equations (represented in a .r1cs file). The witness generation program in circom is converted to WebAssembly code, so that it can be run at proof time.

We would like to make various claims about the R1CS/QAP in relation to an abstract specification of the function we seek to implement (for example, an abstract specification may be $f$ in a more "vanilla" programming language like Python). Our ultimate goal--formal verification--is to verify the equivalence of the QAP and an abstract specification. Ecne's contribution is to prove that the QAP is indeed equivalent to some function--that the output variables are uniquely determined by the input variables given an QAP, or (more plainly stated) that our QAP is not "missing" any constraints. As a next step, if one could show that the compiled witness generation code always produces a witness satisfying the QAP, this would show equivalence between the compiled witness generation and the QAP, which would give us great confidence in the QAP. Note that this does not require trusting the Circom compiler!

Motivation

At a high level, the first step in a zk-SNARK is converting standard functions (in the sense of computer science) into a quadratic-arithmetic program, a system of equations in the following form.

(a_{1} x_{1} + \dots + a_{n} x_{n}) (b_{1} x_{1} + \dots + b_{n} x_{n}) = (c_{1} x_{1} + \dots + c_{n} x_{n}) in F_{p} .

This conversion is useful because the conversion preserves the properties of the function, and QAPs are particularly amenable to zero-knowledge proofs with elliptic curve pairings.

To this end, QAPs play a crucial role in the Groth16 proving system and are widely used in many zk-SNARK proving systems.

QAP examples

First, consider a simple example, the function $y = x^{3}$ in $F_{p}$ , which converts to

y = x^{3} \overset{QAP}{\leftrightarrow} \exists z so that {\begin{cases} input x \\ output y \\ z = x \cdot x \\ y = z \cdot x \end{cases}

For a slightly more nontrivial example, consider a program that indexes an array at a variable position -- in order to express this in QAP form, we'd use the following conversion.

int x [2]; y = x [j] \overset{QAP}{\leftrightarrow} \exists i_{0}, i_{1}, y_{0}, y_{1} so that {\begin{cases} input x [0], x [1], j \\ output y \\ i_{0} \cdot (j - 0) = 0 \\ i_{1} \cdot (j - 1) = 0 \\ i_{0} + i_{1} = 1 \\ y_{0} = i_{0} \cdot x [0] \\ y_{1} = i_{1} \cdot x [1] \\ y = y_{0} + y_{1} \end{cases}

To see why this works, observe that $i_{t} = 0$ if $t \neq j$ . As you can see, even simple programs can have quite complicated conversions into QAP form.

Significance

One major difficulty with zk-SNARK development is that the conversion of a function into an QAP is a manual and very time consuming process, and if a constraint is missing it leads to the zk-SNARK becoming invalid and potentially hackable. To underscore the importance of this problem, note that function-to-QAP conversion is the only manual step in the process of creating a zk-SNARK.

Ecne addresses this issue by verifying that a given set of QAP constraints uniquely identifies a solution by operating on the QAP directly.

Goals of Ecne

There are several things we might hope for when checking the correctness of a conversion between a function and an QAP. We define the following terms:

Weak Verification: This tests if, given the input variables in a QAP, the output variables have uniquely determined values.
Witness Verification: This tests if all the witness variables that appear in all equations, and not just input and output variables, collectively are uniquely determined.
Strong Verification: This tests if the QAP is exactly equivalent to a formal mathematical specification. For a function like "multiply two BigIntegers", this could be useful, because there's a specific specification we want to compute (multiplication). This one is the most similar to formal verification.

Ecne is primarily used to establish weak verification and witness verification (sometimes when assuming weak / strong verification of smaller circuit subcomponents), but in the future it could be extended to help establish strong verification as well.

Why weak verification and witness verification?

It may not be immediately clear why we should care about weak verification and witness verification. While these are natural concepts, what guarantees do they give us about our circuits?

Firstly, these notions of verification ensure that our circuits are not underconstrained. An underconstrained circuit admits valid proofs for multiple different outputs, given the same input. In the worst case, an attacker can generate a valid proof for an underconstrained circuit for any output--meaning that an attacker would be able to convince a verifier who (incorrectly) believes the circuit to be properly-constrained that the attacker knows the pre-image of arbitrary outputs. This is a severe security issue: a privacy-preserving digital currency system based on underconstrained circuits would allow an attacker to withdraw other people's money. Weak verification at least ensures that such attacks are not possible.

Secondly, weak verification and witness verification, in combination with testing of witness satisfaction, can help us build confidence in the equivalence of the QAP with an abstract function specification.

Let $A (x, y, z)$ be the statement that our QAP constraints are satisfied with input $x$ , output $y$ , and auxiliary variables (witness variables) $z$ . We use the term witness satisfaction for $x_{0}$ to refer to the property that there exists some $y_{0}, z_{0}$ such that $A (x_{0}, y_{0}, z_{0})$ .

Taking a bird's eye view, we are trying to show that, for all $x, y$ ,

f (x) = y ⟺ \exists z so that A (x, y, z)

Weak verification of $A (x, y, z)$ (resp. witness verification) means that $y$ (resp. $y$ and $z$ ) are uniquely determined given a value of $x$ . If we use the witness generation procedure for a specific $x_{0}$ to compute values of $y_{0}, z_{0}$ so that $f (x_{0}) = y_{0}$ and $A (x_{0}, y_{0}, z_{0})$ , then the uniqueness of $y_{0}$ given $x_{0}$ (via weak verification) combined with the existence of a $z_{0}$ satisfying $A (x_{0}, y_{0}, z_{0})$ implies that $f (x_{0}) = y ⟺ \exists z s.t. A (x_{0}, y, z)$ .

In conclusion, if a QAP $A$ is weakly verified and we know of the existence of a $y_{0}, z_{0}$ so that $A (x_{0}, y_{0}, z_{0})$ is satisfied and $f (x_{0}) = y_{0}$ , then $f (x_{0}) = y$ is equivalent to $\exists z, A (x_{0}, y, z)$ .

Thus, if we have weak verification and witness satisfaction for all $x$ we can show that $f (x) = y$ is equivalent to $\exists z, A (x, y, z)$ .

Witness satisfaction is hard to prove (see the limitations at the end), but one benefit is that since we already specify a witness generation procedure, we can repeatedly generate $y, z$ for randomly chosen values of $x$ and see if they satisfy $A (x, y, z)$ -- indeed, this check is already a well-used step in the zk workflow. If this check works for many different inputs, we can have some confidence in witness satisfaction, and by extension in the equivalence of the QAP and our function specification.

Ecne Example

Consider the following constraint system, which implements the Num2Bits functionality for $n = 2$ . (checking if a number $x$ has two bits in binary, and that it lies between $0$ and $3$ inclusive)

{\begin{cases} input x \\ b_{0} \cdot (b_{0} - 1) = 0 \\ b_{1} \cdot (b_{1} - 1) = 0 \\ b_{0} + 2 b_{1} = x \end{cases}

This yields a unique solution for $b_{0}$ and $b_{1}$ (by the uniqueness of binary), and Ecne will tell you that the value of $x$ is guaranteed to be between $0$ and $3$ .

Now, consider a case in which we don't constrain $b_{0}$ , perhaps because of an off-by-one error.

{\begin{cases} input x \\ b_{1} \cdot (b_{1} - 1) = 0 \\ b_{2} \cdot (b_{2} - 1) = 0 \\ b_{0} + 2 b_{1} = x \end{cases}

Here, Ecne will tell you that, while $x$ is uniquely determined (due to being an input variable), it cannot verify any bounds on $x$ . This feedback may suggest to you that $b_{0}$ and $b_{1}$ are improperly constrained.

Ecne will print this type of data out for every single variable, which can be helpful if you want to figure out where the equations first go awry.

Ecne Technical Details

Ecne follows a rule-based system and, at any given point in execution, maintains and updates a set of information about each variable in the system of equations. In the course of the algorithm's execution, we apply various inference rules to update variable states. These rules and the variable state schema are described below.

State Information

index: An identifier for the variable; $i$ , if this is variable $x_{i}$ .
unique: Whether we have proven that this variable is uniquely determined given the currently known variables.
lb (optional): A provable lower bound on this variable.
ub (optional): A provable upper bound on this variable.
values (optional): A set of possible values that we can prove this variable must be contained within.
abz (optional): A special field--a pointer $j$ for which, if referenced variable $x_{j}$ is not equal to a specific value (which is different across all numbers which have $abz = j$ ), then $x_{i} = 0$ . As an example for why this field is useful, consider a system of equations
${\begin{cases} x_{1} \cdot x_{j} = 0 \\ x_{2} \cdot (x_{j} - 1) = 0 \\ x_{3} \cdot (x_{j} - 2) = 0 \end{cases}$
Then the $abz$ of each of $x_{1}, x_{2}, x_{3}$ is given by the index of $x_{j}$ , because $x_{1}$ is zero as long as $x_{j}$ is not zero, $x_{2}$ is zero as long as $x_{j}$ is not one, etc.

Inference Rules

Then, we need to produce a set of rules to use while solving the equation. Here are a few examples of rules; our appendix at the end lists all inference rules.

If all variables belong to one ABZ group and have a known sum, and the abz variable is known (so in the above, if we knew that $y_{1}$ 's value was uniquely determined and that $x_{1} + x_{2} + x_{3} = 5$ ) then $x_{1}, x_{2}, x_{3}$ have uniquely determined values.
If $z = x_{0} + a_{1} x_{1} + \dots + a_{n} x_{n}$ , and $a_{i} ∣ a_{i + 1}$ , and $x_{i} < a_{i + 1} / a_{i}$ (and $x_{n} < p / a_{n} - 1$ , where $p$ is the prime of the finite field (i.e. the BabyJubJub prime) then as long as $z$ is uniquely determined, $x_{0}, x_{1}, \dots x_{n}$ are uniquely determined too.
If $A x = b$ , where $A$ is a constant square matrix with nonzero determinant, and $b$ is a uniquely determined vector, then $x$ is uniquely determined as well.

Rule-Following Algorithm

Most of the rules can be inferred with a single equation (as seen above, the first and second rules depend only on VariableStates and a single equation afterwards). This suggests an algorithm of:

Try each rule on each equation over and over again, and stop when no progress is made.

However, this algorithm is slow, because it fails to exploit the fact that if no progress has been made on any of the variables in the equation, then there's no point in considering that equation again. Thus, we use the algorithm

Begin: Enqueue all equations that one can make progress on (generally, ones with few unknown variables)

Step: Given an equation, attempt to apply all of the rules to it. If progress is made on any variable, enqueue all equations with that variable.

The benefit of this algorithm is that we can bound the number of times any equation is enqueued. An equation in, say, $x_{1}, x_{2}, x_{3}$ , will only be enqueued when we obtain more information about any of $x_{1}, x_{2}, x_{3}$ . Suppose that we can only obtain information about a variable $M$ times -- then each equation in $k$ variables will be enqueued at most $k M$ times, and thus if the combined set of equations has $N$ variable appearances altogether, the run-time of the algorithm is $O (N M)$ , where $M$ is a constant, so the algorithm runs in linear time.

Then if all output variables are uniquely determined, we obtain weak verification. If all intermediate variables total are uniquely determined, we obtain witness verification.

Prior Knowledge

There are certain claims that we cannot easily resolve with the methods described above. To give an explicit example, say that we had equations which established that (in base $b$ )

{\begin{cases} (x_{1} x_{2} x_{3})_{b} \cdot (y_{1} y_{2} y_{3})_{b} \equiv (z_{1} z_{2} z_{3})_{b} (\mod p_{1} p_{2} p_{3}) \\ (y_{1} y_{2} y_{3})_{b} \leq (p_{1} p_{2} p_{3})_{b} \\ isPrime ((p_{1} p_{2} p_{3})_{b}) \\ (x_{1} x_{2} x_{3})_{b} \neq 0 (\mod p_{1} p_{2} p_{3}) \end{cases} .

Combined, these conditions indicate that if $x_{1}, x_{2}, x_{3}, z_{1}, z_{2}, z_{3}, p_{1}, p_{2}, p_{3}$ were uniquely determined, then the values of $y_{1}, y_{2}, y_{3}$ are uniquely determined as well. However, this is not an inference that could easily fit into the VariableState framework above, because these conditions on the variables involve many variables simultaneously.

This then begs the question: How do we represent information like the equations above? For this, we'll need a trick: Trusted Functions.

Suppose we know that a function is strongly verified. Strongly verified functions are (so far) all verified by hand -- the functionality for strongly verifying functions programmatically would require a theorem prover. However, strong verification is needed if we want to use mathematical properties of functions to make deductions like above.

After strongly verifying a function, we compile it to an R1CS. Then, we run what is an essentailly a find-and-replace for that function inside another R1CS. For example, the following circuit constrains the isZero function (i.e. $y = isZero (x)$ ):

{\begin{cases} y = 1 + x \cdot z \\ 0 = x \cdot y \end{cases}

If within another R1CS we see the constraints

{\begin{cases} y^{'} = 1 + x^{'} \cdot z^{'} \\ 0 = x^{'} \cdot y^{'} \end{cases}

then we can safely fold these constraints into $y^{'} = isZero (x^{'})$ .

Now, we have an R1CS along with some special constraints that look like $(x_{j_{1}}, x_{j_{2}}, \dots x_{j_{n}}) = f (x_{i_{1}}, x_{i_{2}}, \dots x_{i_{m}})$ for various $f$ , which now lets us capture more complicated constraints like the ones shown above.^[1]

In practice, this means that you can feed trusted R1CS to the solver to help the prover show weak verification of an untrusted R1CS.

Results

Applying Ecne, we have been able to weakly verify 49/72 of the functions in circomlib. Most of the other circuits pertain to elliptic curves, and are likely much more challenging to weakly verify without far more mathematical tricks -- in addition, some circuits, like the BabyAdd function, do not have formally stated preconditions, meaning that they are impossible to verify by themselves. ^[2]

With these techniques, we are able to weakly verify the unaudited v0.0.0 ECDSAPrivToPub circuit (a complicated circuit with over $1, 000, 000$ constraints) when assuming strong verification of BigMultModP and BigLessThan. This only takes about $20$ minutes, which is far faster than could be obtained with Sage's solver, and likely performs better than standard SMT solvers as well (which will need several hundred million SAT clauses to represent the function).

This was actually useful in practice, because when verifying ECDSAPrivToPub we knew to just focus on each of BigMultModP and BigLessThan, and BigMod (a subroutine of BigMultModP) was lacking a range check. Fixed here.

We also considered two of the most widely deployed circuits, namely Polygon Hermez and TornadoCash.

For the case of TornadoCash, we are able to obtain witness verification* of the withdraw circuit.
In the case of Polygon Hermez, we are able to obtain witness verification* of withdraw and weak verification* of rollup-main.

The asterisk is explained in the section on partial functions.

Limitations of Ecne

Ecne is a tool for showing weak / witness verification, but there is still a gap between what Ecne can prove and what it takes to have full confidence in our circuits. Here, we describe some of the limitations.

What if Ecne can't show weak / witness verification?

There exist QAPs for which Ecne cannot prove weak or witness verification, but whose outputs are indeed uniquely determined by inputs. Oftentimes this is because Ecne does not have enough inference rules to prove that the system of equations has a unique solution for either the witness or output. For example, the BabyAdd circuit from circomlib (even with preconditions) is not within the scope of Ecne to weakly verify, because it requires very complicated mathematical reasoning to prove--specifically Theorem 3.3 of this paper.

Note that when Ecne does terminate with a positive result, we can be confident in the weak verification. In other words, Ecne will never indicate that an underconstrained circuit is indeed weakly verified, though for some circuits that are indeed properly constrained, Ecne may be unable to perform weak verification.

OK, now suppose that we do have weak verification. We showed earlier that weak verification, combined with witness satisfaction, was enough to prove equivalence. How might witness satisfaction fail? There are two main reasons: there could be bugs in the witness generation process, or $f$ might be a partial function, for which witness satisfication is impossible to prove.

Errors in the witness generation process

It could be the case that the witness generation procedure does not always create $y$ and $z$ so that $A (x, y, z)$ is satisfied. This could either because of bugs in the witness generation procedure, or because $A$ is incorrect-- $A$ may be equivalent to some function $f^{'}$ that is not actually the same thing as $f$ !

Partial Functions

It could also be the case that the witness generation procedure does not create $y$ , $z$ so that $A (x, y, z)$ is satisfied, even if there are no bugs in the witness generation procedure. The only way for this to happen would be if there was no $y$ so that $f (x) = y$ , which would seem to contradict the definition of a function itself!

These functions $f$ are known as partial functions, implying that they do not produce outputs for all inputs $x$ . This poses a difficulty in the argument earlier which showed that weak verification in combination with witness satisfaction on all inputs gives us the QAP-function equivalence, because even if $y$ is uniquely determined given $x$ for $A (x, y, z)$ to hold, there may be no valid $y$ and $z$ .

One way this plays out is that we may think that we have solved for a value of $y$ when there is actually no solution for $y$ , like in the system of equations

{\begin{cases} (y - 2)^{2} = 0 \\ (y - 3)^{2} = 0 \end{cases}

For a more practical example, consider a function

updateBalance (start, withdrew, auth) = {\begin{cases} start - withdrew & if auth is valid \\ error & otherwise \end{cases}

If I had the QAP $A_{0}$ given by $y = start - withdrew$ . Ecne would correctly show that this is weakly verified, but because the correct witness generation procedure sometimes does not generate $y$ 's at all, we can't have witness satisfaction, and indeed it would not be correct to conclude that $updateBalance$ is equivalent to $A_{0}$ .

Roadmap and Future Directions

What lies ahead? A few things:

Some QAPs in the pipeline use methods in which witness verification does not hold (by design), but eventually lead to uniquely determined outputs; these may be slightly beyond what we can verify currently.
General performance improvements.
More practical examples like TornadoCash and Hermez. If you're interested in running Ecne on your circuits, please reach out to us!

Acknowldgements

This work would not be possible without several people, who I thank here.

Thanks to gubsheep for helping with many drafts of this document, and to Yi Sun, Wen-Ding Li, and the rest of the 0xPARC community for many helpful suggestions.
Thanks to Antonio de la Piedra for many helpful pull requests.
Thanks to Andrew Sutherland for showing me websites that contain formal verifications of elliptic curve rules.

Appendix: All Rules

(Rule 1) If $c x = (unique)$ , and $c \neq 0$ is a constant, then $x$ is also uniquely determined.
(Rule 2a) If $(x - a) * (x - b) = 0$ , then $x \in {a, b}$ . Namely, if ${a, b} = {0, 1}$ , then $0 \leq x \leq 1$ as well.
(Rule 2b) If $a * x + b = 0$ , then $x = - b / a$ .
(Rule 3) If $z = 2^{0} x_{0} + 2^{1} x_{1} + \dots + 2^{n} x_{n}$ , and $x_{0}, x_{1}, . . . x_{n}$ are all in $[0, 1]$ , then $z \in [0, 2^{n + 1})$ . Furthermore, if $z$ is uniquely determined, so are $x_{0}, x_{1}, . . . x_{n}$ .
(Rule 4a) If $x = y$ , then all properties from $x$ are copied to $y$ .
(Rule 4b) If $1 = x + y$ , and $x \in [0, 1]$ (or $y \in [0, 1]$ ), then the other variable is also in $[0, 1]$ . Furthermore, if $x$ is uniquely determined, $y$ is uniquely determined as well.
(Rule 5) If $z = x_{0} + a_{1} x_{1} + \dots + a_{n} x_{n}$ , and $a_{i} ∣ a_{i + 1}$ , and $x_{i} < a_{i + 1} / a_{i}$ (and $x_{n}$ is less than $p / a_{n} - 1$ , where $p$ is the prime of the finite field), then one can bound $z$ similarly to Rule 3, and if $z$ is uniquely determined, then $x_{0}, x_{1}, . . . x_{n}$ are also uniquely determined.
(Rule 6) This is a rule called the AllButOneZero rule, and is described in the main text.
(Rule 7) If $A x = b$ , where $A$ is a constant square matrix with nonzero determinant, and $b$ is a uniquely determined vector, then $x$ is uniquely determined as well.

A somewhat unexpected property of the above procedure is that in its current incarnation, it often depends on using unoptimized R1CS (compiled with O0) instead of R1CS, because only then can the find-and-replace can work -- otherwise, the constraints in a R1CS that computes a function which uses $isZero$ as a subroutine may not include the exact constraints for $isZero$ as a substring due to various optimizations. ↩︎
Websites like http://hyperelliptic.org/EFD/g1p/auto-twisted.html contain some level of verification of elliptic curve group laws, which serve as evidence that verification of ZK circuits for elliptic curve operations is likely tractable. ↩︎

Ecne: Automated Verification of ZK Circuits ​

Prelude: Contextualizing the Project ​

Motivation ​

QAP examples ​

Significance ​

Goals of Ecne ​

Why weak verification and witness verification? ​

Ecne Example ​

Ecne Technical Details ​

State Information ​

Inference Rules ​

Rule-Following Algorithm ​

Prior Knowledge ​

Results ​

Limitations of Ecne ​

What if Ecne can't show weak / witness verification? ​

Errors in the witness generation process ​

Partial Functions ​

Roadmap and Future Directions ​

Acknowldgements ​

Appendix: All Rules ​

Ecne: Automated Verification of ZK Circuits

Prelude: Contextualizing the Project

Motivation

QAP examples

Significance

Goals of Ecne

Why weak verification and witness verification?

Ecne Example

Ecne Technical Details

State Information

Inference Rules

Rule-Following Algorithm

Prior Knowledge

Results

Limitations of Ecne

What if Ecne can't show weak / witness verification?

Errors in the witness generation process

Partial Functions

Roadmap and Future Directions

Acknowldgements

Appendix: All Rules