Getting to Know zkMIPS Proving Architecture

*TL;DR: zkMIPS proves the correct execution of a MIPS program in five steps: it (1) divides the program in segments, (2) divides the instructions of each segment in four module tables, (3) proves instructions from each module table independently, (4) proves instructions from each segment is contained in one of its tables, and (5) recursively proves that the sequence of segment match the program execution. Step 3 is written as a STARK, step 4 is a logUp proof written as a STARK, and step 5 as a PLONK proof. All proof steps are implemented with the Plonky2 library. Optionally, one can generate a final Groth16 proof to verify the program execution on-chain.*

The concept of Zero-Knowledge (ZK) Proofs has gained prominence in blockchain for their ability to provide transparent and verifiable validation of critical properties, e.g. to prove a reserve or validate a new rollup state. In the last few years, two specific cases of ZK proofs have become especially popular: zkVM (ZK Virtual Machine) and zkEVM (ZK Ethereum-VM). These concepts are very similar as both verify the correct execution of a program; the main difference between them is whether the program was meant to run on (zkEVM) or off-chain (zkVM), i.e. whether it was written in a smart contract language or not.

ZKM developed a zkVM, called zkMIPS, capable of verifying on-chain that a MIPS program was correctly executed off-chain. Since most off and on-chain programming languages can be easily compiled to MIPS, zKMIPS supports a wide range of programs, including those that verify the correctness of blockchain transactions, smart contract function calls and rollups.

This post explains what all of this means technically, and the steps required to implement this system. To do so, we have to go back a few steps to explain how a program and its execution are interpreted in a ZK proof.

For *any* computational problem there is a verification algorithm that checks whether a given solution is valid. ZK proofs model verification algorithms as secure multi-party protocols in a way to guarantee:

- (Completeness) the proof will succeed if the verification algorithm succeeds;
- (Soundness) the proof will fail if the verification algorithm fails;
- (Zero-Knowledge) the proof will reveal no information about the solution.

These three properties are implemented by two related algorithms, namely prover and verifier (not to be confused with verification - the verification algorithm verifies a solution, the verifier algorithm verifies a proof). We require the prover *to know* the solution to the target problem, while the verifier does not (though prover and verifier are algorithms, we can also use these terms to refer to the parties running them). In other words, if the prover is honest, they will help the verifier to check the validity of the solution. Under this terminology, a ZK proof is expected to be successful only when the prover is honest (completeness) and to fail only when it cheats (soundness).

Nowadays, ZK proofs are ultimately about making the verifier algorithm run faster than the verification algorithm. Though this property is often referred to as zero-knowledge, technically it is *called* succinctness. One way to increase succinctness is to move verification complexity from the verifier to the prover, implying a sophisticated prover and a simple verifier.

zkMIPS is composed of a sophisticated prover for the execution of MIPS programs, along with a simple verifier algorithm and an equivalent verifier smart-contract. The computational problem here is verifying the correctness of the execution of some MIPS program. The solution, i.e. the proof of correctness, is the sequence of overall CPU states representing the program’s execution. The naive verification algorithm is to simply rerun the program, but this approach is not succinct because the sequence of states is too large for the smart contract.

Using state-of-the-art succinct proofs, this verification of a MIPS program execution can be done much faster and cheaper. How much faster and how much cheaper depends on sophisticated techniques that are constantly evolving. Until very recently, these techniques relied mostly on polynomial and probability properties: the polynomial properties involve advanced math and basic (sometimes advanced) crypto; while the probability properties involve basic math and hash functions.

The underlying idea is to encode the steps of the verification algorithm as polynomials, as well as the values involved in these steps. We call polynomials encoding verification steps ‘constraint polynomials’, and polynomials encoding values involved in them ‘witness polynomials’. If constraint and witness polynomials are defined properly, we can express the whole algorithm execution as a single polynomial through a thoughtful polynomial composition.

At the end of this step, the solution is verified through certain polynomial evaluations on the composed polynomial (this is where probability comes in). One of the most common methods to define which evaluations to perform is the FRI protocol, which verifies whether the composition polynomial has proper degree. If the polynomials being evaluated are composed properly, this is enough to verify witness and constraint polynomials match.

Though the FRI protocol is the core of zkMIPS proofs, it is not succinct enough to allow on-chain verification. One reason is that polynomial representations are typically as large as the polynomial degree, which, in the case of the polynomial input to FRI, is as large as the MIPS program. The first thing we do to improve FRI succinctness is to divide the program into segments, prove each of these segments in parallel using the FRI protocol, and then recursively combine proofs of subsequent segments. We call this method continuation, because each new proof shows that its first input segment is continued by the second, i.e. the initial state of the second input proof matches the final state of the first.

To optimize proving time, we set the number of segments to a power of 2 and make all segments the same size (segments of size equal to a power of 2 are optimal for FRI, so we also take this into consideration). The final continuation proof is roughly of the size of one segment proof but proves the correct execution of the whole program, which guarantees a logarithmic succinctness-factor.

Another important technique for succinct zkMIPS proving is lookups. Since some operations cannot be easily expressed as polynomials (e.g. logic operations), some projects create fixed evaluation tables for them and require *lookups* into these tables whenever these operations appear. The trick here is to encode these tables' entries as polynomials, and the lookups into these tables as another type of polynomial, allowing composition like in the rest of the protocol. This solution works in practice because looked-up tables are fixed and can be checked by anyone.

Last year, Jolt and Lasso (J&L) proposed lookups for *all* operations. This idea seems to require huge tables to implement all possible input and output combinations for some operations (e.g. 32-bit arithmetic operations). However, J&L do not really create these tables, they just simulate them using fancy polynomial techniques. zkMIPS utilizes lookups in a way similar to J&L: it divides operations into four groups (arithmetic, logic, memory and control), proves each group of operations individually (using FRI), and uses lookups to show that every entry in the original set of internal CPU states appears in the table of one of these operation groups.

To create constraints and witness polynomials, combine them into polynomials that will be FRI-tested, and write continuation and lookup proofs, we use the well-known Plonky2 library. This library is a state-of-art framework to design two types of ZK proof, namely STARK and PLONK proofs. For technical reasons, we chose to use the STARK paradigm in the FRI and the lookup phases, and the PLONK paradigm in the continuation phase. The implementation of both paradigms in the Plonky2 library uses the FRI protocol, which makes their proofs very similar and guarantees that they can be verified by the same algorithm.

This property allows us to incorporate one additional proving layer into the zkMIPS architecture to make the final proof easier to verify on Layer 1. We chose the widely used Groth16 proving paradigm as this final layer since it uses bilinear functions over elliptic curves that are natively implemented in Ethereum's EVM. Since the proof that will be converted to this EVM-friendly format is a continuation proof which, in fact, is an FRI proof, we need to write a Groth16 proof for the FRI verification algorithm.

One final detail about zkMIPS architecture is a proving optimization that takes advantage of a special feature from the FRI protocol. This protocol uses polynomial commitments (an advanced cryptography topic that I will not cover here) to evaluate certain properties in each of its iterations. One of the most distinctive properties of the polynomial commitment scheme employed by Plonky2 in both STARK and PLONK implementations is its variable size.

The size of these commitments is inversely proportional to the time that the prover takes to compute them. To accelerate proving time, we chose to go with fast and large commitments in every Plonky2-based proving phase, except for the last one, where we chose to use slow and small commitments to improve proof size. This guarantees that proving time will be as minimal as possible, and that the final proof will be significantly smaller than the intermediary ones. The resulting proving architecture is illustrated below.

In the next part of this blog series I will compare our zkVM solution to those from other companies.

*Lucas Fraga is a senior researcher at ZKM Research.*

More articles

Hello World - August Newsletter

House of ZK held ‘ZK Day’ at The Science of Blockchain Conference in New York City. This event, consisting of technical workshops and debates centered around the latest innovations in ZK, was co-hosted by ZKM and Aleo, and co-organized by IC3, Stanford CBR, and Berkeley RDI. Notable experts like Jeroen van de Graaf, Lucas Fraga, and Alex Pruden shared their insights about topics such as Entangled Rollups, zkMIPS architecture, and BitcoinL2 zkVM-Powered Native Security, before the day concluded with a lively Happy Hour, offering a perfect opportunity to network and relax after a day of intensive discussions.