Offline memory checking in zkVM

Share on

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);
More articles
The ZKM March Update
Last month at ZKM, all attention was on one thing: performance.
以太坊扩容后的生活
在我们讨论混合汇总技术之路的文章中,我们谈到了以太坊扩容斗争的演变。在那次讨论中,我们看到了一个优雅而实用的解决方案是如何最终出现的:分层的区块链设计,将第 1 层作为主链,仅处理共识和数据可用性,第 2 层作为汇总,为所有区块链用户提供卸载的计算验证,从而创建可扩展的区块链解决方案。
Offline memory checking in zkVM

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);