Offline memory checking in zkVM
Share on

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);
More articles
The ZKM June Update
A month of system-level upgrades, design-level clarity, and a coordinated signal toward the next big release.
July 2023 ZKM Newsletter
And We Are Off🚀 ZKM officially launched 13 July 2023. A zero-knowledge virtual machine (zkVM) solution, incubated by the MetisDAO Foundation, aims to establish Ethereum as the universal
Offline memory checking in zkVM

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);