Multivariate Sumcheck Protocol

Share on

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X

More articles
当前举措概述
ZKM 引领着 ZK 技术的发展,开发了一套创新工具和教育资源,以实现 ZK 与区块链的无缝集成。本文全面概述了我们正在进行的举措,展示了我们工作的广度,展望了未来。
传统 STARK vs Circle
STARK(可扩展的透明知识论证)是埃利·本·萨森及其同事在2018年推出的一种证明系统,与传统的SNARK系统相比,它具有更好的可扩展性和透明度。STARK 的工作原理是将复杂的计算转换为算术电路,然后将其表示为多项式评估问题。为了在计算过程中隐藏中间结果,使用了多项式承诺,同时允许验证者对这些结果进行采样和检查。通过应用低度扩展,将复杂的计算简化为验证低度多项式,然后使用高效的交互式证明协议 FRI 来检查多项式是否为低度多项式。该技术在实现隐私保护和可验证计算方面具有广泛的应用。
Multivariate Sumcheck Protocol

The Multivariate Sumcheck Protocol is an important PIOP (Polynomial IOP) component in zero-knowledge proofs. It mainly proves the correctness of the following equation:

\[ c = \sum_{\vec{x} \in \{0,1\}^n} f(X_1, ..., X_n) \]

Here, $c$ is the sum result, and $X_1, ..., X_n$ are $d$ variables, each taking the value 0 or 1.Unlike traditional univariate polynomial operations, multiplication of multivariate polynomials does not require complex FFT operations — only pointwise multiplication is needed. We will now introduce the basics of multivariate polynomials, including their representation and operations.

1. Preliminaries

1.1 From Univariate to Multivariate Polynomials (Evaluation Form)

Let $\vec{g} = g^0, g^1, ..., g^7$ be an order-8 multiplicative subgroup in a large prime field, where:

\[ g^8 = 1 \]

We can encode position $i$ in the Boolean domain $g^i$ as its binary representation $000, 001, ..., 111$ to construct all evaluation points for a multivariate Boolean function. A univariate polynomial $f(X)$ can thus be represented as a multivariate polynomial $f(\vec{X}) = f(X_0, X_1, ..., X_7)$:

\[ (f(g^0), f(g^1), ..., f(g^7)) = (f(0,0,0), f(0,0,1), ..., f(1,1,1)) \]

Each value corresponds one-to-one, effectively flattening the high-dimensional Boolean function into a vector.

1.2 Vector Storage Form (Evaluation Form)

The multilinear extension can be stored as a vector in little-endian order:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

This is similar to storing point values using their binary representation in decimal indexing.

1.3 Low-degree Extension (Coefficient Form)

We want to extend $f$ from Boolean points to a function $\tilde{f}$ over the whole domain $F^n$.

Definition:\[ \chi_w(x) = \begin{cases} 1 & w = x \\ 0 & \text{otherwise} \end{cases} \]

The multivariate polynomial\[ \chi_w(x) = \prod_{i=1}^n [(1 - x_i)(1 - w_i) + x_i w_i] \]

is analogous to the Lagrange basis for univariate polynomials and is usually precomputed during system setup.

To convert from evaluation form to coefficient form, more generally for multivariates:\[ \tilde{f}(x_1, ..., x_n) = \sum_{w \in \{0,1\}^n} f(w) \cdot \chi_w(x_1, ..., x_n) \]

Here, $\tilde{f}$ is also known as the Multilinear Extension (MLE).

1.4 Vector Storage Form (Coefficient Form)

The multilinear extension can also be stored in vector form using little-endian ordering, e.g.:

\[ E[1] = \tilde{f}(0,0,0,1), \quad E[5] = \tilde{f}(0,1,0,1), \quad E[8] = \tilde{f}(1,0,0,0) \]

If the polynomial is $f(X_3, X_2, X_1, X_0)$, then $E[1]$, $E[5]$, $E[8]$ store the coefficients for the terms $X_0$, $X_2 \cdot X_0$, and $X_3$, respectively.

1.5 Using Evaluation Form for Computation at a Specific Point

Suppose we need to compute $x_1 = r$. Let:

\[ \tilde{f}(x_2, ..., x_n) = \tilde{f}(r, x_2, ..., x_n) \]

let n = 2^v;
let half = n / 2;
for i in 0..half {
    let low = E[i];
    let high = E[half + i];
    E[i] = low + r * (high - low);
}

This works because:

\[ \tilde{f}(r, x_2, ..., x_n) = (1 - r) \cdot \tilde{f}(0, x_2, ..., x_n) + r \cdot \tilde{f}(1, x_2, ..., x_n) \]

This method computes the value at $r$ directly in evaluation form, without converting to coefficient form. Both time and space complexity are $O(n)$.

1.6 Using Coefficient Form for Computation at a Specific Point

  1. Convert from evaluation form to coefficient form using the method in Section 3.
  2. Substitute the specific point into the corresponding variable.

1.7 Multiplication of Multivariate Polynomials

In multivariate polynomial systems, MLE $\times$ MLE (Multilinear Extension $\times$ Multilinear Extension) operations are widely used in zkSNARK protocols based on the multivariate sumcheck.

Let $f,g$ be two MLE functions over the Boolean hypercube $0,1^v$ (each polynomial is linear in each variable). They can be represented by their evaluation vectors over all $2^v$ points.

Using evaluation form, we can efficiently compute:

\[ f(r_1, ..., r_v) \cdot g(r_1, ..., r_v) \]

— i.e., pointwise multiplication at the same point.

If the evaluation form length is insufficient (e.g., degree $d$ not reached), it must be padded to length $d$ first. This is because multiplication increases degree (up to double), and keeping length $d$ ensures correct subsequent IFFT or Lagrange interpolation.

This is analogous to univariate polynomial handling in FFT: pointwise multiplication after FFT, then IFFT to return to coefficient form. Multivariate MLE pointwise multiplication follows the same principle, but on the high-dimensional Boolean hypercube.

2. Sumcheck Protocol

The goal is to prove:\[ \sum_{x \in \{0,1\}^n} f(x_1, ..., x_n) = c \]

Protocol steps:

  1. Prover constructs the univariate polynomial:\[ g_1(X_1) = \sum_{(x_2, ..., x_n) \in \{0,1\}^{n-1}} f(X_1, x_2, ..., x_n) \]and sends it to the <strong>Verifier</strong> (computed in coefficient form).
  2. Verifier checks $g_1(0) + g_1(1) = c$, then sends random $r_1$.
  3. Prover constructs: \[ g_2(X_2) = \sum_{(x_3, ..., x_n) \in \{0,1\}^{n-2}} f(r_1, X_2, ..., x_n) \]and Verifier checks $g_1(r_1) = g_2(0) + g_2(1)$.
  4. Repeat: In each round, Prover sends $g_i(X)$, and Verifier checks $g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)$.
  5. Final round: Verifier checks $g_n(0) + g_n(1) = g_{n-1}(r_{n-1})$, and additionally computes $f(r_1, r_2, ..., r_n)$ for the final verification.

The last-round verification can be done by querying a polynomial oracle or by having the Prover compute it directly in evaluation form.

3. Conclusion

The Multivariate Sumcheck Protocol provides an efficient and modular method to verify large multilinear polynomial sums over the Boolean hypercube without explicitly evaluating the function on all 2n points. By iteratively reducing an n-variate sum to a sequence of univariate checks, it minimizes communication and computational costs for both prover and verifier. Its reliance on multilinear extensions and evaluation-form computations eliminates the need for heavy FFTs in high dimensions, making it especially suitable for zero-knowledge proof systems such as zkSNARKs and GKR. The protocol’s structure not only ensures soundness through random challenges but also integrates cleanly with other PIOP components, enabling scalable verification in modern cryptographic protocols.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM.
Explore the ZKM Docs
Start building with
ZKM GitHub
Follow ZKM on
X