zkPairing (Part 2): Technical Explainer

August 24, 2022

by Jonathan W., Vincent H., and Yi Sun

In a previous post, we presented zkPairing, a proof-of-concept implementation of elliptic curve pairings in Circom. This post explains some of the techniques and optimizations which went into our work.

What are elliptic curve pairings?

Pairing-based cryptography relies on the construction of a special function known as a bilinear pairing. A pairing is a function e(P,Q) which takes two points P and Q on an elliptic curve and outputs a scalar. To be useful for cryptography, we ask for two additional properties on the pairing which mean it is compatible with arithmetic on the elliptic curve:

  • bilinearity: e(uP, vQ) = e(P, Q)^{uv} for elliptic curve points P, Q and non-zero integers u, v.
  • non-degeneracy: e(P, Q) ≠ 1 unless P = Q = O.

These properties imply the decisional Diffie-Hellman property

e(uP, vP) = e(P, Q) if and only if Q = uvP.

Most elliptic curves used in cryptography are assumed to satisfy the computational Diffie-Hellman assumption: given any point P and the points uP and vP, it is hard to compute uvP. This means that pairings can be used to verify knowledge of uvP, though uvP itself remains hard to compute.
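
As a quick sanity check outside of circuits, the bilinearity property can be verified numerically with an off-the-shelf pairing library. The sketch below assumes the Python py_ecc library (not something zkPairing itself uses) and its BLS12-381 module; it checks that e(uP, vQ) = e(P, uvQ) for a pair of small scalars.

```python
# A minimal bilinearity check, assuming the py_ecc library is installed.
# Note: py_ecc's pairing() takes its G2 argument first and its G1 argument second.
from py_ecc.bls12_381 import G1, G2, multiply, pairing, curve_order

u, v = 7, 11
lhs = pairing(multiply(G2, v), multiply(G1, u))         # e(uP, vQ)
rhs = pairing(multiply(G2, (u * v) % curve_order), G1)  # e(P, uvQ)
assert lhs == rhs  # both equal e(P, Q)^(uv) by bilinearity
```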

Luckily, pairings satisfying these properties exist on many elliptic curves used in cryptography. In this post, we will be interested in the BLS12-381 curve and the Tate and optimal Ate pairings on it. The construction of these pairings originates in algebraic geometry (the study of solutions to polynomial equations) and is quite involved. You can find the gory details here, but at a high level the optimal Ate pairing involves the following operations:

  • Miller loop: Using a recursive algorithm to compute a function f_{λ,Q}(P) of the two input points.
  • Final exponentiation: Raising f_{λ,Q}(P) to the large power (q^12 - 1)/r.

How do BLS signatures work?

The BLS signature scheme is a signature scheme based on elliptic curve pairings. BLS signatures have several nice properties. First, unlike RSA or ECDSA signatures, they are unique and deterministic: given a key and message, there is only one possible valid signature. Second, BLS signatures can be efficiently adapted to create aggregate and threshold signatures, which involve checking that some number of people from a group have signed messages.

On an elliptic curve with group of points of order r, a BLS private key is simply a number x between 0 and r, and the corresponding public key is xG for some base point G. To sign a message m with this private key, the signer computes a hash H(m) of the message onto the elliptic curve and takes s=xH(m) to be the signature.

To verify a signature s against a public key xG, the verifier checks the pairing equality

e(G,s)=e(xG,H(m)).

By the decisional Diffie-Hellman property discussed earlier, this implies that s=xH(m). Because BLS signature verification only requires computing two pairings, SNARK-ing pairings naturally extends to SNARK-ing BLS signature verification.
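
To make the verification equation concrete, here is a minimal sketch of BLS signing and verification in Python, assuming the py_ecc library (not part of zkPairing). The hash-to-curve step is replaced by an insecure stand-in, since proper hashing to the curve is discussed later in this post.

```python
# A toy BLS sign/verify sketch, assuming py_ecc is installed.
# WARNING: toy_hash_to_G2 is NOT a real hash-to-curve and is insecure;
# it only serves to produce a deterministic G2 point for the demo.
import hashlib
from py_ecc.bls12_381 import G1, G2, multiply, pairing, curve_order

def toy_hash_to_G2(m: bytes):
    h = int.from_bytes(hashlib.sha256(m).digest(), "big") % curve_order
    return multiply(G2, h)

x = 0x1234567890ABCDEF % curve_order   # private key
pk = multiply(G1, x)                   # public key xG

msg = b"zkPairing"
h_m = toy_hash_to_G2(msg)
sig = multiply(h_m, x)                 # signature s = x * H(m)

# Verification: e(G, s) == e(xG, H(m)).
# py_ecc's pairing() takes the G2 point first, so both sides are written that way.
assert pairing(sig, G1) == pairing(h_m, pk)
```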

Circuits and Techniques

Finite field extensions

Earlier in the post, we mentioned that the arguments to an elliptic curve pairing are points (x, y) on an elliptic curve y^2 = x^3 + ax + b. Though x and y are often taken to be integers modulo a prime q, for pairing-based cryptography we will need to choose x and y to lie in a more complicated mathematical object called a finite field extension.

The easiest way to understand finite field extensions is with an example. Consider the integers modulo 3, which form the set {0, 1, 2}. This is denoted by F_3 and called a finite field. For any elements x, y in F_3, we can add or multiply them to produce another element of F_3, e.g. 2 + 2 = 1, meaning F_3 is self-contained under addition and multiplication.

However, there is no element x ∈ F_3 with x^2 + 1 = 0; we can easily check this explicitly: 0^2 + 1 = 1, 1^2 + 1 = 2, 2^2 + 1 = 2. So to solve the equation x^2 + 1 = 0, we must define a new element i outside of F_3 with i^2 + 1 = 0, and then we can extend the field F_3 to a larger finite field F_{3^2} consisting of the nine elements a + bi for 0 ≤ a, b < 3. (It's referred to as F_{3^2} because there are 3^2 = 9 elements.) This is analogous to how, within the real numbers R, there is no solution to x^2 + 1 = 0, so we augment R with some element i satisfying i^2 + 1 = 0 to generate the complex numbers C consisting of the elements a + bi for a, b ∈ R.
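
A minimal sketch of this F_{3^2} example in Python (plain integers, nothing curve-specific) shows the arithmetic rules: an element a + bi is stored as the pair (a, b), and multiplication uses i^2 = -1.

```python
# F_{3^2} arithmetic on pairs (a, b) representing a + b*i, with i^2 = -1 and all
# coefficient arithmetic done mod 3.
P = 3

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    a, b = x
    c, d = y
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return ((a * c - b * d) % P, (a * d + b * c) % P)

i = (0, 1)
assert mul(i, i) == (2, 0)            # i^2 = -1 = 2 in F_3
assert add((2, 0), (2, 0)) == (1, 0)  # 2 + 2 = 1 in F_3
assert len([(a, b) for a in range(P) for b in range(P)]) == 9  # nine elements
```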

The above example shows how, for a prime q, we can go from the finite field F_q to the field extension F_{q^2}. For pairings on BLS12-381, we must consider points on the curve with coordinates in an even larger field F_{q^12}, which is itself an extension of F_{q^2}. As usual, q differs from the baby jubjub prime p native to Circom, so we must extend the BigInteger arithmetic circuits from zk-ECDSA to apply to these field extensions. Because of the complexity of this arithmetic, we also adopt the technique of lazy reduction from zk-ECDSA.
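
As a rough illustration of what this non-native arithmetic looks like, the sketch below represents a big integer as limbs of n bits and multiplies limb-wise without carrying, which is the flavor of "lazy" computation the circuits rely on before a single final carry and reduction. The limb sizes here are illustrative choices, not necessarily the parameters used in zkPairing.

```python
# Illustrative limb representation for non-native field arithmetic.
N_BITS, K_LIMBS = 55, 7  # enough limbs for a 381-bit value; illustrative only

def to_limbs(x, n=N_BITS, k=K_LIMBS):
    return [(x >> (n * i)) & ((1 << n) - 1) for i in range(k)]

def from_limbs(limbs, n=N_BITS):
    return sum(limb << (n * i) for i, limb in enumerate(limbs))

def mul_no_carry(a, b):
    # Multiply limb vectors without carrying or reducing: output limbs may
    # overflow n bits, and carries / reduction mod q are deferred ("lazy").
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

x = 0x1234567890ABCDEF << 200
y = 0xFEDCBA0987654321 << 100
assert from_limbs(mul_no_carry(to_limbs(x), to_limbs(y))) == x * y
```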

Miller loop

Let q be the 381-bit prime order of the coordinate field of BLS12-381, and let r be the 255-bit order of the group of points on BLS12-381. The goal of the Miller loop is to compute an element f_{λ,Q}(P) of F_{q^12} which is the evaluation of a certain rational function f_{λ,Q} at the point P, where λ is the remainder when q is divided by r.

The rational function f_{λ,Q} is defined recursively via Miller's algorithm, which constructs rational functions f_{i,Q} for 0 < i ≤ λ by the recursion

f_{i+j,Q} = f_{i,Q} · f_{j,Q} · l_{i,j,Q} / v_{i+j,Q},

where l_{i,j,Q} denotes the equation of a line passing through iQ, jQ and v_{i+j,Q} denotes the equation of a vertical line passing through (i+j)Q. If we start with f_{1,Q} = 1, this allows us to iteratively compute larger functions f_{i,Q}. This can be done most efficiently by considering the binary representation of i: for instance, to compute f_{11,Q} = f_{1011_2,Q}, we can use the recursive relation to compute f_{1_2,Q}, f_{10_2,Q}, f_{100_2,Q}, f_{101_2,Q}, f_{1010_2,Q}, f_{1011_2,Q} in order.
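
The order in which the f_{i,Q} get built up is just the square-and-multiply schedule on the bits of λ. A small sketch (plain Python, indices only, no line functions) reproduces the f_{11,Q} example above:

```python
# Square-and-multiply schedule for Miller's algorithm: walk the bits of lambda
# from most to least significant, doubling the index at each bit (apply the
# recursion with j = i) and adding 1 when the bit is set (apply it with j = 1).
def miller_index_chain(lam):
    bits = bin(lam)[2:]
    chain, i = [1], 1
    for bit in bits[1:]:
        i *= 2
        chain.append(i)
        if bit == "1":
            i += 1
            chain.append(i)
    return chain

# For lambda = 11 = 1011 in binary, this is exactly 1, 10, 100, 101, 1010, 1011:
assert [bin(i)[2:] for i in miller_index_chain(11)] == ["1", "10", "100", "101", "1010", "1011"]
```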

Our Miller loop circuit performs this process with a few optimizations also present in non-ZK implementations:

  • Instead of computing the points iQ directly, which requires elliptic curve arithmetic over F_{q^12}, we use a twist to an elliptic curve over F_{q^2}.
  • Because we will raise f_{λ,Q}(P) to the (q^12 - 1)/r power in the final exponentiation, we omit all of the vertical line evaluations v_{i+j,Q}(P), as they end up becoming 1 when exponentiated.

Final exponentiation

The optimal Ate pairing is given by raising the outcome f_{λ,Q}(P) ∈ F_{q^12} of the Miller loop to the (q^12 - 1)/r power, namely

e(P, Q) = f_{λ,Q}(P)^{(q^12 - 1)/r}.

Because q is 381-bit and r is 255-bit, (q^12 - 1)/r is over 4000 bits, so naive exponentiation via repeated multiplication is extremely inefficient. Instead, we employ various optimizations that are again present in non-ZK implementations of the optimal Ate pairing.

We break the computation of f^{(q^12 - 1)/r} into two steps, based on the factorization

(q^12 - 1)/r = (q^6 - 1) · (q^2 + 1) · (q^4 - q^2 + 1)/r,

where r divides q^4 - q^2 + 1.
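
These identities are easy to check numerically. The sketch below derives q and r from the standard BLS12-381 generator parameter z = -0xd201000000010000 and verifies the factorization, the divisibility claim, and the size of the exponent.

```python
# Sanity-check the final-exponentiation factorization with the actual
# BLS12-381 parameters (derived from the standard curve parameter z).
z = -0xD201000000010000
r = z**4 - z**2 + 1                  # 255-bit subgroup order
q = (z - 1) ** 2 * r // 3 + z        # 381-bit prime of the coordinate field

assert q.bit_length() == 381 and r.bit_length() == 255
# (q^6 - 1) * (q^2 + 1) * (q^4 - q^2 + 1) = q^12 - 1
assert (q**6 - 1) * (q**2 + 1) * (q**4 - q**2 + 1) == q**12 - 1
# r divides q^4 - q^2 + 1, so (q^12 - 1)/r splits as in the text
assert (q**4 - q**2 + 1) % r == 0
# the full exponent really is over 4000 bits
assert ((q**12 - 1) // r).bit_length() > 4000
```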

Easy part (Frobenius operator)

In the easier first step, we compute f_easy = f^{(q^6 - 1)(q^2 + 1)}. Computing f_easy is "easy" because it turns out that computing f^q is relatively cheap, due to certain special properties of working with finite fields. This operator f ↦ f^q is known as the Frobenius operator. By precomputing certain fixed parameters, the cost of a Frobenius operator is the same as a single F_{q^12} multiplication, as opposed to log_2(q) of them. This allows us to constrain f_easy at the cost of only a few multiplications.
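
The cheapness of Frobenius is easiest to see one level down, in a quadratic extension: there, x ↦ x^q is just conjugation. Below is a toy check with a small prime, using the same u^2 = -1 construction as the F_{3^2} example earlier; the same phenomenon, with precomputed constants, is what makes Frobenius on F_{q^12} cost about one multiplication.

```python
# In F_{p^2} = F_p[u]/(u^2 + 1) with p = 3 mod 4, the Frobenius map x -> x^p
# is just conjugation a + bu -> a - bu. Check this exhaustively for p = 7.
p = 7

def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def naive_pow(x, e):
    out = (1, 0)
    for _ in range(e):
        out = mul(out, x)
    return out

for a in range(p):
    for b in range(p):
        cheap_frobenius = (a, (-b) % p)              # one negation, no multiplications
        assert naive_pow((a, b), p) == cheap_frobenius
```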

Hard part (cyclotomic subgroup)

We complete the exponentiation by computing f_hard = f_easy^{(q^4 - q^2 + 1)/r}. The number (q^4 - q^2 + 1)/r is still 1268 bits, so a direct computation is suboptimal. The main idea behind the optimization here is again to use the Frobenius operator:

  • We write (q^4 - q^2 + 1)/r = λ_0 + λ_1 q + λ_2 q^2 + λ_3 q^3 for λ_i small using [Section 5, HHT]. Then we compute the f_easy^{λ_i} and use the Frobenius operator to get f_hard = (f_easy^{λ_0}) · (f_easy^{λ_1})^q · (f_easy^{λ_2})^{q^2} · (f_easy^{λ_3})^{q^3}.

We use one further optimization to speed up computation of the f_easy^{λ_i}. Because of how we defined f_easy, f_easy belongs to the cyclotomic subgroup (meaning that f_easy^{q^4 - q^2 + 1} = 1). By [Karabina], elements in the cyclotomic subgroup can be compressed into a format that uses less data, and squaring can be done more cheaply in the compressed format. We use this compressed format to perform optimized exponentiation via the ordinary square-and-multiply method.

BLS Signatures: Glued Miller loop and hashing to curve

Remember from earlier in the post that BLS signature verification requires checking that

e(G,s)=e(xG,H(m)).

The obvious way to do this is to compute two pairings: one on the left and one on the right. However, we can do better: the bilinearity property of pairings tells us that e(xG, H(m))^{-1} = e(-xG, H(m)), so we can equivalently check whether

e(G, s) · e(-xG, H(m)) = 1.

The left hand side is equal to (f_{λ,s}(G) · f_{λ,H(m)}(-xG))^{(q^12 - 1)/r}, so we only need to perform one final exponentiation at the end. Moreover, in the recursion step of the Miller loop, we do not need to keep track of f_{i,s}(G) and f_{i,H(m)}(-xG) individually; we only care about their product. This saves us an F_{q^12} multiplication at each step of the "glued" Miller loop.
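
Continuing the earlier py_ecc sketch, the rearranged check looks like this (assuming py_ecc also exposes FQ12 and neg from its bls12_381 module); in the circuits, the two Miller loops inside this product are the ones that get glued together.

```python
# Rearranged BLS verification: e(G, s) * e(-xG, H(m)) == 1 in F_{q^12}.
# Assumes py_ecc; the toy hash below is NOT a real hash-to-curve.
import hashlib
from py_ecc.bls12_381 import FQ12, G1, G2, multiply, neg, pairing, curve_order

def toy_hash_to_G2(m: bytes):
    h = int.from_bytes(hashlib.sha256(m).digest(), "big") % curve_order
    return multiply(G2, h)

x = 0xC0FFEE % curve_order
pk = multiply(G1, x)
h_m = toy_hash_to_G2(b"zkPairing")
sig = multiply(h_m, x)

# One product of two pairings compared against 1, instead of two separate
# pairings compared against each other.
assert pairing(sig, G1) * pairing(h_m, neg(pk)) == FQ12.one()
```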

Hashing to curve

Up to now we have described how to verify BLS signatures given a hashed message H(m). However we have not talked about how to get from the message m, which is typically a byte string, to H(m). It turns out this is quite complicated -- there is an entire Internet Draft about how to do it. We refer to BLS12-381 For The Rest Of Us for an in-depth overview, but we will provide a summary for convenience:

  1. hashToField: Starting with m, one hashes it to two elements u[0], u[1] of F_{q^2} using a hash like SHA-256. This step depends on a DST (domain separation tag), which is highly implementation dependent, so be sure to check you are using the right one!
  2. Optimized simplified SWU map for BLS12-381: [Section 4, Wahby-Boneh] provide an optimized method to go from elements u in F_{q^2} to E_2(F_{q^2}), where E_2 is the twist of BLS12-381. This is based on a simplified version of the Shallue-van de Woestijne-Ulas (SWU) mapping. (Footnote: the sgn0 function described in Wahby-Boneh is different from the one in the Internet Draft.)
  3. Cofactor clearing: We now have an element P in E_2(F_{q^2}). However, it does not necessarily lie in the subgroup G_2 of elements of order r. The size of E_2(F_{q^2}) is h_2 · r, where the cofactor h_2 is quite large. To map P to G_2, the naive thing to do is to scalar multiply P by h_2. Since h_2 is huge, this is extremely expensive. Instead, there is a different, faster method to map P to G_2. This method uses an endomorphism ψ by Budroni-Pintore which is closely related to the Frobenius operator.

Subgroup checks

For full security, we also need to check that the public key xG belongs to the subgroup G_1 and the signature s belongs to the subgroup G_2. The naive ways to do this are expensive, and there are various optimizations involving the endomorphism ψ in the same spirit as in hashing to curve. We use the latest (2021) optimizations of Scott.

Outlook and roadmap

Adapting to other elliptic curves

Although our circuits target the BLS12-381 curve, they can be easily adapted to other curves. We have recently modified our circuits to work for the BN254 curve, which Ethereum supports with precompiles for elliptic curve arithmetic and pairings. This required the following modifications (a parameter sketch follows the list):

  • Field extension representation: The base field F_q of BN254 differs from that of BLS12-381, so we must change our representation for the field extensions F_{q^2} and F_{q^12} accordingly.
  • Curve coefficients: BLS12-381 and BN254 have different coefficients a and b in the Weierstrass normal form y^2 = x^3 + ax + b, so we must update these.
  • Optimal Ate definition: BN254 lies in the Barreto-Naehrig family of curves, for which we must modify the λ used for f_{λ,Q}(P) in the definition of the optimal Ate pairing.
  • Final exponentiation optimization: Because of the smaller size of the base field F_q and the higher Hamming weight of a certain parameter involved in defining q and the curve order r for BN254, we must reassess whether compressing to the cyclotomic subgroup still offers an advantage over the direct square-and-multiply method.
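
For reference, the relevant BN254 parameters can be derived from the Barreto-Naehrig polynomial family. The sketch below assumes the standard BN254 parameter z = 4965661367192848881 and checks the facts the modifications above rely on; the comparison against Circom's native prime is what makes the recursion discussed next possible.

```python
# BN254 parameters from the Barreto-Naehrig family q(z) = 36z^4 + 36z^3 + 24z^2 + 6z + 1,
# r(z) = 36z^4 + 36z^3 + 18z^2 + 6z + 1, with the standard BN254 parameter z.
z = 4965661367192848881
q = 36 * z**4 + 36 * z**3 + 24 * z**2 + 6 * z + 1   # base field prime
r = 36 * z**4 + 36 * z**3 + 18 * z**2 + 6 * z + 1   # curve order

assert q.bit_length() == 254 and r.bit_length() == 254
# r still divides q^4 - q^2 + 1, so the same final exponentiation structure applies
assert (q**4 - q**2 + 1) % r == 0
# the curve order of BN254 is exactly the prime field native to Circom
assert r == 21888242871839275222246405745257275088548364400416034343698204186575808495617
# for BN curves the optimal Ate Miller loop length is governed by 6z + 2 rather
# than the BLS loop parameter
loop_param = 6 * z + 2
```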

Recursive verification / proof aggregation

Implementing the optimal Ate pairing on BN254 opens up the possibility of implementing zkSNARK verification within a zkSNARK. For the Groth16 proof system, verifying a proof π=(A,B,C) involves checking that

e(A, B) = e(g^α, h^β) · e(∏_{i=1}^{k} S_i^{a_i}, h^γ) · e(C, h^δ),

where e is the optimal Ate pairing on BN254, (g^α, h^β, h^γ, h^δ, {S_i}_{i=1}^{k}) form the verification key, and {a_i}_{i=1}^{k} are the public inputs. With the help of Nalin Bhardwaj, we have recently implemented ZK circuits for Groth16 verification by checking this equality. In a future blog post, Nalin will explain the implementation in more detail.
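
Outside of circuits, this check is straightforward to express with a pairing library. The sketch below is a structural outline against py_ecc's bn128 module (an assumption, and not how our circuits are written); the verification key and proof points would come from an actual proving system such as snarkjs.

```python
# Structural sketch of the Groth16 pairing check, assuming py_ecc is installed.
from py_ecc.bn128 import add, multiply, pairing

def groth16_verify(vk, proof, public_inputs):
    """vk = (alpha_g1, beta_g2, gamma_g2, delta_g2, S) with S a list of G1 points,
    proof = (A, B, C) with A, C in G1 and B in G2,
    public_inputs = [a_1, ..., a_k] (assumed non-empty)."""
    alpha_g1, beta_g2, gamma_g2, delta_g2, S = vk
    A, B, C = proof

    # acc = sum_i a_i * S_i, i.e. the product of the S_i^{a_i} in additive notation
    acc = None
    for a_i, S_i in zip(public_inputs, S):
        term = multiply(S_i, a_i)
        acc = term if acc is None else add(acc, term)

    # e(A, B) == e(alpha, beta) * e(acc, gamma) * e(C, delta)
    # (py_ecc's pairing() takes the G2 argument first)
    lhs = pairing(B, A)
    rhs = pairing(beta_g2, alpha_g1) * pairing(gamma_g2, acc) * pairing(delta_g2, C)
    return lhs == rhs
```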

Recursive verification and aggregate verification with techniques like SnarkPack allow us to construct zkSNARKs in a modular fashion. To give an example, suppose that Alice wants to generate a zkSNARK for the composition h(x) = f(g(x)) of two function calls. She can either (1) generate a SNARK for the large combined function h(x), or (2) generate a SNARK proof π for the statement y = g(x) and a second SNARK proof for the two statements z = f(y) and "π is a valid proof for y = g(x)". The latter approach, enabled by recursion, allows Alice to prove that z = h(x) while only proving SNARKs smaller than the one for h(x). This is particularly useful in size-constrained settings where the composed SNARK has more constraints than the current largest trusted setup for Groth16 (~256M constraints).

PLONK and KZG commitment verification

Many of the most efficient modern ZKPs such as PLONK are based on the KZG polynomial commitment. Batch opening the values of a committed polynomial at several points in the KZG commitment scheme requires checking a single equality involving elliptic curve pairings. Our techniques for pairings thus open a path to verifying KZG commitments, and hence to verifying PLONK-type SNARKs based on batch openings of KZG commitments.
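
To show how a KZG opening reduces to a single pairing equality, here is a toy end-to-end example of a single opening over BN254, again assuming py_ecc (batch opening combines several such openings into one pairing check). The "trusted setup" secret τ is generated in the clear, which is of course only acceptable for illustration.

```python
# Toy KZG commit / open / verify over BN254, assuming py_ecc is installed.
from py_ecc.bn128 import G1, G2, add, multiply, neg, pairing, curve_order

def eval_poly(coeffs, x):
    # Horner evaluation mod the curve order; coeffs are low-degree-first.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % curve_order
    return acc

# Insecure toy setup: powers of tau in G1 and [tau] in G2.
tau = 123456789
srs_g1 = [multiply(G1, pow(tau, i, curve_order)) for i in range(4)]
tau_g2 = multiply(G2, tau)

# Commit to p(X) = 3 + 2X + X^2.
p = [3, 2, 1]
commitment = None
for i, c in enumerate(p):
    term = multiply(srs_g1[i], c)
    commitment = term if commitment is None else add(commitment, term)

# Open at z = 5: y = p(z), and the quotient (p(X) - y)/(X - z) = X + (z + 2).
z = 5
y = eval_poly(p, z)
quotient = [(z + 2) % curve_order, 1]
proof = add(multiply(srs_g1[0], quotient[0]), multiply(srs_g1[1], quotient[1]))

# Verify the single pairing equality: e(C - [y]_1, [1]_2) == e(proof, [tau - z]_2).
lhs = pairing(G2, add(commitment, neg(multiply(G1, y))))
rhs = pairing(add(tau_g2, neg(multiply(G2, z))), proof)
assert lhs == rhs
```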

Circuit auditing and verification

As discussed in earlier posts on zk-ECDSA, these circuits are unaudited and not ready for production use. Due to the complexity and size of these circuits, ensuring their correctness is particularly difficult. We have implemented some basic unit tests for witness correctness, and a few further directions we are exploring are:

  • Testing for witness uniqueness using automated tools like Ecne. Our circuits are substantially larger than previous circuits Ecne has been applied to, and we are excited to see whether such tools can scale to our use case.
  • We have created extensive documentation for each component of our circuits, including specifying pre- and post-conditions on each template. In addition to facilitating a human audit, this can form the first step towards a specification for formal verification of our circuits.

Conclusion

We invite you to check out our repo, and we'd love to hear your feedback or ideas for improvement. We are also actively exploring further applications of this work -- please reach out if you have an application of zkSNARKs for pairings!

This project was built during 0xPARC's ZK Identity Working Group with the support of the ZKxZK Gitcoin grant.

We use many optimization techniques from circom-ecdsa and xJsnark for big integer and elliptic curve arithmetic. We would also like to thank Jordi Baylina for his crucial help in teaching us how to use the Circom / snarkjs toolstack for large ZK circuits.