Multivariate Sumcheck Protocol

Share on

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X

More articles
The ZKM March Update
Last month at ZKM, all attention was on one thing: performance.
Why ZKM Chose MIPS32r2 Over RISC-V for zkMIPS
When designing a zkVM, choosing the right instruction set architecture (ISA) is foundational. While most new zkVM projects default to RISC-V due to its simplicity and growing ecosystem, ZKM deliberately took a more difficult route: building on MIPS32r2.
Multivariate Sumcheck Protocol

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X