Multivariate Sumcheck Protocol

Share on

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X

More articles
你好世界-八月时事通讯 2.0
本月取得了许多进展和战略合作,这些合作继续影响着我们的技术,因此,我们为您带来了第二版《八月通讯》。要了解第一版,请参阅 zkm.io/blog/hello-world-八月时事通讯。
Offline memory checking in zkVM
Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.
Multivariate Sumcheck Protocol

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X