PLONK (Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge) was introduced by Gabizon, Williamson, and Ciobotaru in 2019. In the years since, it has become the de facto foundation for production zero-knowledge proof systems. zkSync Era, Polygon zkEVM, Aztec, and Scroll all build on PLONK or its derivatives. Understanding its internals is no longer optional for engineers working in this space.
This post is a deep technical walkthrough of PLONK's arithmetization — how computations are encoded into polynomials, how the constraint system works, and how modern extensions like custom gates and lookup arguments push performance to practical limits. We assume familiarity with elliptic curve cryptography and basic polynomial commitment schemes.
Why Arithmetization Matters
A ZK proof system proves that a prover knows a witness w such that a statement C(x, w) = true, without revealing w. The key challenge is transforming an arbitrary computation C into a form that a proof system can handle. That transformation is arithmetization — encoding the computation as polynomial constraints over a finite field.
The choice of arithmetization dominates the performance characteristics of the resulting proof system. Poor arithmetization leads to enormous circuit sizes, long proving times, and large proofs. Efficient arithmetization is what separates practical ZK systems from theoretical ones. PLONK introduced a flexible, universal arithmetization that can be instantiated with different proving backends — SNARKs, STARKs, or hybrid systems.
The PLONK Circuit Model
PLONK represents computation as an arithmetic circuit over a finite field F_p, where p is a large prime (typically 254 bits for BN254 or 255 bits for BLS12-381). Unlike earlier systems like Groth16 that used R1CS (Rank-1 Constraint Systems), PLONK introduces a more general gate model.
The PLONK Gate Equation
Each gate in a PLONK circuit is defined by the equation: q_L * a + q_R * b + q_O * c + q_M * a * b + q_C = 0. Here, a, b, c are the left input, right input, and output wire values. The selector polynomials q_L, q_R, q_O, q_M, q_C are fixed at circuit construction time and encode what type of gate this is.
- Addition gate: q_L = 1, q_R = 1, q_O = -1, q_M = 0, q_C = 0. Enforces a + b - c = 0
- Multiplication gate: q_L = 0, q_R = 0, q_O = -1, q_M = 1, q_C = 0. Enforces a*b - c = 0
- Constant gate: q_L = 1, q_R = 0, q_O = 0, q_M = 0, q_C = -k. Enforces a = k
- Public input gate: q_L = 1, q_R = 0, q_O = 0, q_M = 0, q_C = -pub. Enforces a equals a public input
This single parameterized gate equation is the core of PLONK's flexibility. By choosing selector values, any constraint expressible as a degree-2 polynomial over the wire values can be implemented as a PLONK gate. The system is 'universal' in the sense that the circuit structure (selector polynomial values) can be fixed before the prover's witness is known.
The Copy Constraint Problem
The gate equation handles local constraints within a single gate. But computations are not isolated gates — output values flow from one gate to another as inputs. Enforcing that the output wire of gate i equals the input wire of gate j is called a 'copy constraint' or 'wiring constraint.'
PLONK encodes copy constraints via a permutation argument. All wire values (a_1, b_1, c_1, a_2, b_2, c_2, ..., a_n, b_n, c_n) for n gates form a single vector. A permutation sigma maps indices such that if wire position i should equal wire position j, then sigma(i) = j. The prover must demonstrate that the wire vector is consistent with sigma — equivalently, that values at positions i and sigma(i) are identical.
PLONK's central insight is encoding both gate constraints and copy constraints as polynomial identities over a multiplicative subgroup of F_p. Both can be checked simultaneously with a single polynomial evaluation at a random point, enabling a universal and updatable structured reference string.
The Polynomial Encoding
PLONK works over a multiplicative subgroup H of F_p of order n, where n is the number of gates. For BN254, n can be up to 2^28 (268 million gates) with acceptable performance. The subgroup H = {omega^0, omega^1, ..., omega^{n-1}} where omega is an n-th root of unity.
Each row of the execution trace becomes a coefficient in a polynomial. The wire assignment a_i is encoded as the polynomial a(X) such that a(omega^i) = a_i. Similarly for b(X), c(X), and the selector polynomials. Gate satisfaction becomes: q_L(X)*a(X) + q_R(X)*b(X) + q_O(X)*c(X) + q_M(X)*a(X)*b(X) + q_C(X) = 0 for all X in H.
The vanishing polynomial Z_H(X) = X^n - 1 vanishes at every point in H. So the gate equation holds for all X in H if and only if Z_H(X) divides the left-hand side. This translates proving a constraint holds everywhere into a single polynomial divisibility check — the core trick that makes PLONK efficient.
The Permutation Argument in Detail
The permutation argument proves that a set of values f = (f_0, ..., f_{n-1}) is a permutation of another set g = (g_0, ..., g_{n-1}). PLONK uses the 'grand product' technique: construct the polynomial z(X) where z(omega^0) = 1 and z(omega^{i+1}) = z(omega^i) * prod((f_j + beta * j + gamma) / (g_j + beta * sigma(j) + gamma)) for j from 0 to i.
Here beta and gamma are random challenges from the verifier (or Fiat-Shamir hash of the transcript). The final value z(omega^n) = z(1) must equal 1. This can only happen if f is indeed a permutation of g. The entire permutation check reduces to: (a) z(omega^0) = 1, (b) z(omega^{n-1}) = 1, and (c) a polynomial identity relating consecutive values of z.
- All three conditions are polynomial constraints, checkable via a single evaluation protocol
- The challenge values beta and gamma ensure the check is sound — a cheating prover cannot manipulate them
- Copy constraints across 3n wire positions (a, b, c for each gate) are handled with a single permutation over a 3n-element domain
- The grand product polynomial z(X) has degree n, requiring an n-point evaluation commitment
TurboPLONK and UltraPLONK: Custom Gates
Vanilla PLONK with its degree-2 gate equation requires large circuits for complex operations. A 256-bit integer multiplication needs roughly 2000 gates. An elliptic curve point addition over BN254 embedded in BN254 arithmetic needs hundreds more. Custom gates are the solution.
TurboPLONK (Aztec, 2019) extends the gate equation to arbitrary degree: q_1 * f_1(a,b,c,d) + q_2 * f_2(a,b,c,d) + ... = 0, where each gate can use up to 4 wire values (adding a fourth wire d) and the f_i can be arbitrary polynomials of arbitrary degree. This enables encoding entire cryptographic operations in a single gate.
Examples of Custom Gates
- Poseidon hash gate: Encodes a single S-box evaluation (x^5) and the linear MDS layer as a single gate, reducing a full Poseidon hash to ~30 gates instead of ~3000 in vanilla PLONK
- EC point addition gate: Encodes the full elliptic curve addition formula in ~4 gates instead of ~200
- Range check gate: Proves a value lies in [0, 2^k - 1] without a decomposition into individual bit constraints
- Boolean constraint gate: Enforces a^2 - a = 0 (i.e., a is 0 or 1) as a single gate
- SHA-256 round gate: Encodes entire SHA-256 compression rounds for zkEVM applications
UltraPLONK (Aztec, 2022) further extends TurboPLONK with lookup tables — precomputed tables of valid input-output pairs for functions like XOR, AND, range checks, and bitwise operations. The prover proves that its wire values appear in the lookup table, which is far more efficient than computing the function gate-by-gate.
Lookup Arguments: PlonkUp and Caulk
A lookup argument proves that a set of values f = (f_0, ..., f_{m-1}) are all contained in a table T = (t_0, ..., t_{n-1}), without revealing which specific entries they are. For zkEVMs, this is essential — the EVM uses many bitwise operations and range checks that are expensive in pure arithmetic circuits.
The naive lookup argument in vanilla UltraPLONK requires O(n) prover work per lookup, where n is the table size. For large tables like the full 16-bit range check table (65536 entries), this becomes expensive. Caulk (EPFL, 2022) reduced prover time to O(m log m) independent of table size, a dramatic improvement for multi-column tables with hundreds of thousands of entries.
The Caulk Construction
Caulk uses a KZG polynomial commitment scheme where the table T is committed as a polynomial t(X) of degree n. A lookup of m values requires the prover to produce an opening proof that the queried values are contained in t(X) at some subset of positions. The key optimization is that the subset proof is O(m log m) regardless of n, using a subprotocol based on structured subsets of roots of unity.
Proving System Backends: KZG vs. FRI
PLONK's arithmetization is agnostic to the polynomial commitment scheme used. The two dominant choices are KZG (Kate-Zaverucha-Goldberg) and FRI (Fast Reed-Solomon IOP of Proximity).
- KZG commitments: Require a trusted setup ceremony (structured reference string). Produce constant-size proofs (48 bytes per commitment on BN254). Verification is fast but setup is a centralization risk. Used by Groth16, PLONK on zkSync Era, Polygon Hermez
- FRI-based commitments: No trusted setup required. Proofs scale logarithmically with circuit size (~100-200KB for typical circuits). Verification is slower but trust assumptions are minimal. Used by StarkWare, Polygon Miden
- The tradeoff is proof size and verification cost vs. trust assumptions. For L2 rollups submitting proofs to Ethereum, KZG proofs verify for ~250k gas while FRI proofs verify for ~5M gas — a 20x cost difference
- EIP-4844 (danksharding) added KZG commitments to Ethereum L1, making KZG the de facto standard for L2 proofs submitted to L1
The Prover Performance Picture
For a PLONK circuit of n gates, proving time is dominated by two operations: multi-scalar multiplication (MSM) over the elliptic curve, and fast Fourier transforms (FFTs) over F_p. Both are O(n log n) in asymptotic complexity, but the constants matter enormously in practice.
Modern PLONK provers achieve roughly 1-5 million gates per second on GPU hardware. A full Ethereum block translated to a zkEVM circuit has approximately 30-100 million gates, implying 10-100 seconds of proving time on a high-end GPU cluster. This is why zkEVM block times are typically 1-3 minutes — the proving latency is a fundamental constraint.
- Parallel FFT: Circuit evaluation uses FFTs that are embarrassingly parallelizable across GPU cores
- Batched MSM: Multi-scalar multiplication (committing to polynomials) uses Pippenger's algorithm, which achieves near-linear performance with careful batching
- Proving key caching: Selector polynomials and permutation data are fixed per circuit; caching them in GPU memory eliminates repeated computation
- Recursive proving: Proofs of proofs enable aggregation — a single proof verified on L1 can attest to hundreds of batch proofs, amortizing verification cost
The zkEVM proving bottleneck today is not algebraic — it is hardware. The next 5x improvement in zkEVM throughput will come from ASIC provers and GPU kernel optimization, not from new arithmetization techniques. The arithmetization layer is largely solved.
Practical Implications for Protocol Designers
For engineers building ZK applications on top of PLONK-based systems, the key insight is that circuit size is the primary cost driver. Every operation your application performs translates to a circuit gate count, and gate count drives proving time, proof size, and verification cost.
- Prefer ZK-friendly hash functions: Poseidon over SHA-256 saves 100x in gate count. This matters for Merkle trees, commitment schemes, and nullifiers
- Use lookup tables for bitwise operations: XOR, AND, and range checks are 10-100x cheaper as lookups than as arithmetic gate sequences
- Avoid arbitrary branching: Conditional logic in ZK circuits must evaluate all branches. Use multiplexers (a * b + (1-a) * c) and structure computation to minimize branch count
- Batch verification: A single proof can attest to many statements. Recursive PLONK proofs (proof of proofs) amortize the fixed overhead of proof generation across many operations
- Measure gate counts early: The performance envelope of a ZK application is set during design. Refactoring for circuit efficiency after implementation is expensive
Build Production ZK Systems with Accelar
Accelar designs and implements zero-knowledge proof systems for enterprise and protocol applications — from zkEVM circuit optimization to custom PLONK constraint systems and ZK-native application architectures. If you're building on ZK technology and need deep cryptographic engineering expertise, let's talk.
