ZKM Prover: Arithmetic Operations and CPU Operations

Home

Blog

News Tech Vision

ZKM

Apr 4, 2025

1. Arithmetic Operations

1.1 Operation Instructions

mul: Verifies binary multiplication operations, such as ab=c, suitable for simple two-input, one-output multiplications.
mult: Verifies multi-input or complex multiplication operations, handling higher precision or special cases of multiplication.
addcy: Verifies addition with carry operations, such as a+b+carry=c, used in multi-word additions or high-precision calculations.
slt: Verifies “less than” comparison operations a<b, typically used for conditional branching or logical evaluations.
lui: Verifies the “load upper immediate” operation, loading an immediate value into the high bits of a register, e.g., reg=imm16.
div: Verifies binary division operations ab=c, possibly handling remainders and ensuring the legality of the division.
shift: Verifies logical shift operations, including left or right shifts, e.g., ab or ab.
sra: Verifies arithmetic right shifts, preserving the sign bit for signed numbers.‍
lo_hi: Verifies high-low bit decomposition or merging operations, such as extracting the high bits a16 or low bits a&0xFFFF.

1.2 Execution Flow

Arithmetic operations involve the following steps:

Step 1: Initialize Arithmetic Table

Step 2: Generate Trace Data

Step 3: Verify Range Checks

Step 4: Validate Constraints

Step 5: Perform Circuit-Level Validation

1.2.1 Step 1: Initialize Arithmetic Table

Initialize the structure of the arithmetic table, mapping columns (e.g., operands, operation types, results) and linking them with corresponding fields in the CPU table.

1. Define Operation Types and Opcodes:

Use the COMBINED_OPS array to define 26 arithmetic operations with their corresponding columns and opcodes, e.g.:

columns::IS_ADD represents the addition operation with an opcode of 0b100000.
Opcodes follow the binary encoding of the MIPS instruction set.

2. Define Register Mappings:

REGISTER_MAP includes three register ranges:

INPUT_REGISTER_0 and INPUT_REGISTER_1 store two input operands.
OUTPUT_REGISTER stores the computation result.

3. Generate Filters:

Use Filter::new_simple to create filters marking valid arithmetic operation rows.
Filters activate only when a row corresponds to arithmetic operations defined in COMBINED_OPS.

4. Link Arithmetic and CPU Tables:

Call the cpu_arith_data_link method to link columns in the arithmetic table (e.g., registers and operands) with corresponding fields in the CPU table.

pub fn ctl_arithmetic_rows<F: Field>() -> TableWithColumns<F> {
    const COMBINED_OPS: [(usize, u32); 26] = [
        // Map arithmetic operations and their corresponding opcodes.
        (columns::IS_ADD, 0b100000 * (1 << 6)),
        (columns::IS_ADDU, 0b100001 * (1 << 6)),
        (columns::IS_ADDI, 0b001000),
        // ... omit ...
    ];
const REGISTER_MAP: [Range<usize>; 3] = [
        columns::INPUT_REGISTER_0, // input register0
        columns::INPUT_REGISTER_1, // input register1
        columns::OUTPUT_REGISTER,  // output register
    ];
let filter = Some(Filter::new_simple(Column::sum(
        COMBINED_OPS.iter().map(|(c, _v)| *c),
    )));
    TableWithColumns::new(
        Table::Arithmetic,
        cpu_arith_data_link(&COMBINED_OPS, &REGISTER_MAP), //Link the arithmetic table to the CPU table
        filter,
    )
}

1.2.2 Step 2: Generate Trace Data

Generate trace data row by row based on the input operations (Operation), supplement empty rows, and perform range checks.

1. Allocate Trace Rows:

max_rows defines the maximum number of rows in the trace table:

If each operation requires at most two rows, the row count is 2operations.len().
Range checks require at least RANGE_MAX rows, so the final row count is the maximum of the two.

2. Generate Trace Rows for Each Operation:

Convert each Operation to one or two rows of data using the to_rows method.
Append a second row for complex arithmetic operations as needed.

3. Fill Empty Rows:

The trace table’s length must be a power of two (to optimize later calculations).
Add zero-filled rows to reach the next power of two, ensuring the row count is no less than RANGE_MAX.

4. Transpose to Column Data:

Convert row data to columnar format for easier verification and polynomial calculations.

5. Perform Range Checks:

Call generate_range_checks to validate column data against specified ranges.

pub(crate) fn generate_trace(&self, operations: Vec<Operation>) -> Vec<PolynomialValues<F>> {
    let max_rows = std::cmp::max(2 * operations.len(), RANGE_MAX);
    let mut trace_rows = Vec::with_capacity(max_rows);
    for op in operations {
        let (row1, maybe_row2) = op.to_rows(); // Converts an operation to one or two rows of data.
        trace_rows.push(row1);
        if let Some(row2) = maybe_row2 { // If the operation requires two lines, add them to the Trace.
            trace_rows.push(row2);
        }
    }
    // Fill blank lines
    let padded_len = trace_rows.len().next_power_of_two(); // Make sure the number of rows is a power of two to facilitate the interpolation and commitment operation of FRI.
    for _ in trace_rows.len()..std::cmp::max(padded_len, RANGE_MAX) {
        trace_rows.push(vec![F::ZERO; columns::NUM_ARITH_COLUMNS]); // Add all zero rows.
    }
    // Transpose to column data
    let mut trace_cols = transpose(&trace_rows);
    // range check
    self.generate_range_checks(&mut trace_cols);
    // Convert to a polynomial representation
    trace_cols.into_iter().map(PolynomialValues::new).collect()
}

1.2.3 Step 3: Verify Range Checks

Validate specific columns in the trace to ensure values lie within valid ranges (e.g., ([0, 2^{16}))).

fn generate_range_checks(&self, cols: &mut [Vec<F>]) {
    // Initialize the RANGE COUNTER column with the range [0, 2^16).
    for i in 0..RANGE_MAX {
        cols[RANGE_COUNTER][i] = F::from_canonical_usize(i);
    }
    for i in RANGE_MAX..cols[0].len() {
        cols[RANGE_COUNTER][i] = F::from_canonical_usize(RANGE_MAX - 1);
    }
    // Frequency checks are performed on SHARED COLS
    for col in SHARED_COLS {
        for i in 0..cols[0].len() {
            let x = cols[col][i].to_canonical_u64() as usize;
            assert!(
                x < RANGE_MAX,
                "column value {} exceeds the max range value {}",
                x,
                RANGE_MAX
            );
            cols[RC_FREQUENCIES][x] += F::ONE; // statistical frequency
        }
    }
}

‍1.2.4 Step 4: Validate Constraints

Verify constraints for different arithmetic operations to ensure inputs, outputs, and intermediate states conform to definitions. This module ensures offline validation of filled values.

1. Initialize the RANGE_COUNTER Column:

Fill the first RANGE_MAX rows of the range counter column RANGE_COUNTER with sequential values ([0, 2^{16})).
Fill remaining rows with the maximum value 2^{16}-1 if the column exceeds RANGE_MAX.

2. Count Frequencies of Shared Columns:

Traverse all shared columns (SHARED_COLS) and count the frequency of values within the range ([0, 2^{16})).
Trigger an assertion error if any value exceeds the range.

fn eval_packed_generic<FE, P, const D2: usize>(
    &self,
    vars: &Self::EvaluationFrame<FE, P, D2>,
    yield_constr: &mut ConstraintConsumer<P>,
) where
    FE: FieldExtension<D2, BaseField = F>,
    P: PackedField<Scalar = FE>,
{
    let lv: &[P; NUM_ARITH_COLUMNS] = vars.get_local_values().try_into().unwrap();
    let nv: &[P; NUM_ARITH_COLUMNS] = vars.get_next_values().try_into().unwrap();
    // Range checking constraints
    let rc1 = lv[RANGE_COUNTER];
    let rc2 = nv[RANGE_COUNTER];
    yield_constr.constraint_first_row(rc1); // The first row must be 0
    let incr = rc2 - rc1;
    yield_constr.constraint_transition(incr * incr - incr); // Range increasing constraint
    let range_max = P::Scalar::from_canonical_u64((RANGE_MAX - 1) as u64);
    yield_constr.constraint_last_row(rc1 - range_max); // The last line is RANGE MAX-1
    // Invocation of constraint validation for individual arithmetic operations
    mul::eval_packed_generic(lv, yield_constr);
    mult::eval_packed_generic(lv, yield_constr);
    addcy::eval_packed_generic(lv, yield_constr);
    slt::eval_packed_generic(lv, yield_constr);
    lui::eval_packed_generic(lv, nv, yield_constr);
    div::eval_packed(lv, nv, yield_constr);
    shift::eval_packed_generic(lv, nv, yield_constr);
    sra::eval_packed_generic(lv, nv, yield_constr);
    lo_hi::eval_packed_generic(lv, yield_constr);
}

1.2.5 Step 5: Perform Circuit-Level Validation

Extend circuit validation for all operations in the arithmetic table, ensuring constraints are satisfied. This module builds circuit constraints to construct proofs for the STARK system.

fn eval_ext_circuit(
    &self,
    builder: &mut CircuitBuilder<F, D>,
    vars: &Self::EvaluationFrameTarget,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
) {
    let lv: &[ExtensionTarget<D>; NUM_ARITH_COLUMNS] =
        vars.get_local_values().try_into().unwrap();
    let nv: &[ExtensionTarget<D>; NUM_ARITH_COLUMNS] =
        vars.get_next_values().try_into().unwrap();
    // Range Check Constraints (circuit)
    let rc1 = lv[RANGE_COUNTER];
    let rc2 = nv[RANGE_COUNTER];
    yield_constr.constraint_first_row(builder, rc1);
    let incr = builder.sub_extension(rc2, rc1);
    let t = builder.mul_sub_extension(incr, incr, incr);
    yield_constr.constraint_transition(builder, t);
    let range_max =  builder.constant_extension(F::Extension::from_canonical_usize(RANGE_MAX - 1));
    let t = builder.sub_extension(rc1, range_max);
    yield_constr.constraint_last_row(builder, t);
    // Circuit verification that invokes individual arithmetic operations    mul::eval_ext_circuit(builder, lv, yield_constr);
    mult::eval_ext_circuit(builder, lv, yield_constr);
    addcy::eval_ext_circuit(builder, lv, yield_constr);
    slt::eval_ext_circuit(builder, lv, yield_constr);
    lui::eval_ext_circuit(builder, lv, nv, yield_constr);
    div::eval_ext_circuit(builder, lv, nv, yield_constr);
    shift::eval_ext_circuit(builder, lv, nv, yield_constr);
    sra::eval_ext_circuit(builder, lv, nv, yield_constr);
    lo_hi::eval_ext_circuit(builder, lv, yield_constr);
}

2. CPU Operations

2.1 Basic Introduction to the CPU Module

The CPU’s execution primarily includes the following components:

Instruction Fetch and Decode The CPU fetches instructions from memory and sends them to the decode module. The module parses the instruction into opcode, function code, source registers, destination registers, and other fields. It validates the instruction’s legality and determines its type (e.g., arithmetic, jump, or memory operation). Purpose: Ensure the CPU executes only valid instructions and provides a clear instruction structure for subsequent modules.
Jumps and Branches For jump or branch instructions, the jump module calculates the target address and updates the program counter (PC). Unconditional jumps directly modify the PC, while conditional jumps depend on the status of condition registers. Branch instructions evaluate specific conditions (e.g., equal, greater, or less). Purpose: Control the program’s flow flexibly, enabling function calls, loops, and more.
Memory Access and Bus Operations Instructions requiring memory access (e.g., load and store) interact with memory via the memory bus module. The module ensures the memory channel’s status (e.g., availability) and context flags (e.g., kernel/user) are valid. Specific operations like loading words or storing data are handled by the memory input-output (Memio) module. Purpose: Bridge the CPU and memory, ensuring safe and correct data read/write operations.
Arithmetic and Logical Operations The CPU performs arithmetic or logical computations based on the instruction type, such as addition, subtraction, shifts, or conditional moves. Specific modules handle operations like bit shifts, sign extensions, or bit-level transformations. Purpose: Execute core computation tasks for the instruction.
System Calls and Privileged Operations System call instructions switch the CPU to kernel mode for privileged operations (e.g., memory mapping, thread management). The Syscall module validates the legality of operations. After completing tasks, the CPU exits kernel mode and resumes user mode. Purpose: Enable safe transitions between user and kernel modes for system-level services.
Result Writeback and State Update After executing an instruction, the result is written back to the destination register or memory, and the program counter is updated for the next instruction. Purpose: Save computation results and advance the instruction flow.

2.2 Execution Flow

2.2.1 Step 1: Structure Definition

1. Define the CpuStark structure, which is a generic structure containing the following parameters:

F: Field type, used to support finite field operations.
D: Degree of extension, used to represent the depth of the extended field.

2. The PhantomData<F> is used as a placeholder to indicate that this structure is associated with a field type.

#[derive(Copy, Clone, Default)]
pub struct CpuStark<F, const D: usize> {
    pub f: PhantomData<F>,
}

CpuStark is the core structure for CPU simulation, responsible for managing and implementing STARK-specific operations, including trace verification and constraint generation.

2.2.2 Step 2: Local Validation Method

fn eval_packed_generic<FE, P, const D2: usize>(
    &self,
    vars: &Self::EvaluationFrame<FE, P, D2>,
    yield_constr: &mut ConstraintConsumer<P>,
) where
    FE: FieldExtension<D2, BaseField = F>,
    P: PackedField<Scalar = FE>,

‍

Verifies whether the CPU’s execution trace satisfies logical constraints.
Uses packed fields for batch validation, improving the efficiency of local validation.

Step 2.1: Extract CPU State

Extract the current row (local_values) and the next row (next_values) from the input vars.
Convert these into CpuColumnsView, which allows easy access to the CPU’s fields.

let local_values: &[P; NUM_CPU_COLUMNS] = vars.get_local_values().try_into().unwrap();
let local_values: &CpuColumnsView<P> = local_values.borrow();
let next_values: &[P; NUM_CPU_COLUMNS] = vars.get_next_values().try_into().unwrap();
let next_values: &CpuColumnsView<P> = next_values.borrow();

‍

CpuColumnsView: Encapsulates CPU trace data, such as instructions and register values.
NUM_CPU_COLUMNS: The total number of columns in the CPU table.

Step 2.2: Call Module Validation

bootstrap_kernel::eval_bootstrap_kernel_packed(local_values, next_values, yield_constr);
decode::eval_packed_generic(local_values, yield_constr);
jumps::eval_packed(local_values, next_values, yield_constr);
membus::eval_packed(local_values, yield_constr);
memio::eval_packed(local_values, next_values, yield_constr);
shift::eval_packed(local_values, yield_constr);
count::eval_packed(local_values, yield_constr);
syscall::eval_packed(local_values, yield_constr);
bits::eval_packed(local_values, yield_constr);
misc::eval_packed(local_values, yield_constr);
exit_kernel::eval_exit_kernel_packed(local_values, next_values, yield_constr);

Modules are called to validate different CPU functionalities.
Each module corresponds to a specific validation logic (e.g., decoding, jumping, memory operations).

Example: decode::eval_packed_generic

fn eval_packed_generic<P: PackedField>(
    lv: &CpuColumnsView<P>,
    yield_constr: &mut ConstraintConsumer<P>,
) {
    let filter = lv.op.decode_op; // Filtering condition for instruction decoding
    let opcode_check = lv.opcode_bits[0]; // Get the first bit of the opcode
    yield_constr.constraint(filter * opcode_check); // Add constraint: verify the opcode is correct
}

‍Purpose: Verifies that the opcode is correct.

Process:

Checks whether the current row corresponds to an instruction decoding operation (via filter).
Verifies whether the first bit of the opcode matches expectations.

2.2.3 Step 3: Circuit Generation Method

fn eval_ext_circuit(
    &self,
    builder: &mut CircuitBuilder<F, D>,
    vars: &Self::EvaluationFrameTarget,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
)

Generates circuit constraints for each operation step of the CPU.
These constraints are used by the zk-STARK proof generator.

Step 3.1: Extract State

let local_values: &[ExtensionTarget<D>; NUM_CPU_COLUMNS] =
    vars.get_local_values().try_into().unwrap();
let local_values: &CpuColumnsView<ExtensionTarget<D>> = local_values.borrow();
let next_values: &[ExtensionTarget<D>; NUM_CPU_COLUMNS] =
    vars.get_next_values().try_into().unwrap();
let next_values: &CpuColumnsView<ExtensionTarget<D>> = next_values.borrow();

Extract ExtensionTarget data for circuit generation.
This process is similar to the state extraction in eval_packed_generic.

Step 3.2: Call Module Constraint Generation

bootstrap_kernel::eval_bootstrap_kernel_ext_circuit(builder, local_values, next_values, yield_constr);
decode::eval_ext_circuit(builder, local_values, yield_constr);
jumps::eval_ext_circuit(builder, local_values, next_values, yield_constr);
membus::eval_ext_circuit(builder, local_values, yield_constr);
// ...
exit_kernel::eval_exit_kernel_ext_circuit(builder, local_values, next_values, yield_constr);

‍Functionality:

Each module generates corresponding circuit constraints.
Each module provides an eval_ext_circuit method to generate these constraints.

Example: decode::eval_ext_circuit

fn eval_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
    builder: &mut CircuitBuilder<F, D>,
    lv: &CpuColumnsView<ExtensionTarget<D>>,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
) {
    let filter = lv.op.decode_op; // Filtering condition
    let opcode_check = lv.opcode_bits[0]; // First bit of the opcode
    let constr = builder.mul_extension(filter, opcode_check); // Generate multiplication constraint
    yield_constr.constraint(builder, constr); // Add to the constraints
}

The CircuitBuilder is used to build multiplication constraints, creating the circuit constraint for instruction decoding.
Verifies that the opcode is correct.

2.2.4 Step 4: Testing and Verification

#[test]
fn test_stark_degree() -> Result<()> {
    const D: usize = 2;
    type C = PoseidonGoldilocksConfig;
    type F = <C as GenericConfig<D>>::F;
    type S = CpuStark<F, D>;
    let stark = S {
        f: Default::default(),
    };
    test_stark_low_degree(stark)
}

‍Functionality:

Tests whether the polynomial constraints satisfy the low-degree property.
Ensures the validity of the STARK proof.

#[test]
fn test_stark_circuit() -> Result<()> {
    const D: usize = 2;
    type C = PoseidonGoldilocksConfig;
    type F = <C as GenericConfig<D>>::F;
    type S = CpuStark<F, D>;
    let stark = S {
        f: Default::default(),
    };
    test_stark_circuit_constraints::<F, C, S, D>(stark)
}

‍Functionality:

Tests whether the generated circuit conforms to the expected constraints.
Verifies the correctness of the STARK circuit generation.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM. If you have any questions about the article, you can contact the ZKM team on discord: discord.com/zkm

你好世界：十一月通讯

The FRI Principle and Its Application Analysis

Fast Reed-Solomon Interactive Oracle Proof of Proximity (FRI) is an interactive protocol used to check if the degree of a committed polynomial is less than a specific value . The general idea is as follows:

1. Arithmetic Operations

1.1 Operation Instructions

mul: Verifies binary multiplication operations, such as ab=c, suitable for simple two-input, one-output multiplications.
mult: Verifies multi-input or complex multiplication operations, handling higher precision or special cases of multiplication.
addcy: Verifies addition with carry operations, such as a+b+carry=c, used in multi-word additions or high-precision calculations.
slt: Verifies “less than” comparison operations a<b, typically used for conditional branching or logical evaluations.
lui: Verifies the “load upper immediate” operation, loading an immediate value into the high bits of a register, e.g., reg=imm16.
div: Verifies binary division operations ab=c, possibly handling remainders and ensuring the legality of the division.
shift: Verifies logical shift operations, including left or right shifts, e.g., ab or ab.
sra: Verifies arithmetic right shifts, preserving the sign bit for signed numbers.‍
lo_hi: Verifies high-low bit decomposition or merging operations, such as extracting the high bits a16 or low bits a&0xFFFF.

1.2 Execution Flow

Arithmetic operations involve the following steps:

Step 1: Initialize Arithmetic Table

Step 2: Generate Trace Data

Step 3: Verify Range Checks

Step 4: Validate Constraints

Step 5: Perform Circuit-Level Validation

1.2.1 Step 1: Initialize Arithmetic Table

Initialize the structure of the arithmetic table, mapping columns (e.g., operands, operation types, results) and linking them with corresponding fields in the CPU table.

1. Define Operation Types and Opcodes:

Use the COMBINED_OPS array to define 26 arithmetic operations with their corresponding columns and opcodes, e.g.:

columns::IS_ADD represents the addition operation with an opcode of 0b100000.
Opcodes follow the binary encoding of the MIPS instruction set.

2. Define Register Mappings:

REGISTER_MAP includes three register ranges:

INPUT_REGISTER_0 and INPUT_REGISTER_1 store two input operands.
OUTPUT_REGISTER stores the computation result.

3. Generate Filters:

Use Filter::new_simple to create filters marking valid arithmetic operation rows.
Filters activate only when a row corresponds to arithmetic operations defined in COMBINED_OPS.

4. Link Arithmetic and CPU Tables:

Call the cpu_arith_data_link method to link columns in the arithmetic table (e.g., registers and operands) with corresponding fields in the CPU table.

pub fn ctl_arithmetic_rows<F: Field>() -> TableWithColumns<F> {
    const COMBINED_OPS: [(usize, u32); 26] = [
        // Map arithmetic operations and their corresponding opcodes.
        (columns::IS_ADD, 0b100000 * (1 << 6)),
        (columns::IS_ADDU, 0b100001 * (1 << 6)),
        (columns::IS_ADDI, 0b001000),
        // ... omit ...
    ];
const REGISTER_MAP: [Range<usize>; 3] = [
        columns::INPUT_REGISTER_0, // input register0
        columns::INPUT_REGISTER_1, // input register1
        columns::OUTPUT_REGISTER,  // output register
    ];
let filter = Some(Filter::new_simple(Column::sum(
        COMBINED_OPS.iter().map(|(c, _v)| *c),
    )));
    TableWithColumns::new(
        Table::Arithmetic,
        cpu_arith_data_link(&COMBINED_OPS, &REGISTER_MAP), //Link the arithmetic table to the CPU table
        filter,
    )
}

1.2.2 Step 2: Generate Trace Data

Generate trace data row by row based on the input operations (Operation), supplement empty rows, and perform range checks.

1. Allocate Trace Rows:

max_rows defines the maximum number of rows in the trace table:

If each operation requires at most two rows, the row count is 2operations.len().
Range checks require at least RANGE_MAX rows, so the final row count is the maximum of the two.

2. Generate Trace Rows for Each Operation:

Convert each Operation to one or two rows of data using the to_rows method.
Append a second row for complex arithmetic operations as needed.

3. Fill Empty Rows:

The trace table’s length must be a power of two (to optimize later calculations).
Add zero-filled rows to reach the next power of two, ensuring the row count is no less than RANGE_MAX.

4. Transpose to Column Data:

Convert row data to columnar format for easier verification and polynomial calculations.

5. Perform Range Checks:

Call generate_range_checks to validate column data against specified ranges.

pub(crate) fn generate_trace(&self, operations: Vec<Operation>) -> Vec<PolynomialValues<F>> {
    let max_rows = std::cmp::max(2 * operations.len(), RANGE_MAX);
    let mut trace_rows = Vec::with_capacity(max_rows);
    for op in operations {
        let (row1, maybe_row2) = op.to_rows(); // Converts an operation to one or two rows of data.
        trace_rows.push(row1);
        if let Some(row2) = maybe_row2 { // If the operation requires two lines, add them to the Trace.
            trace_rows.push(row2);
        }
    }
    // Fill blank lines
    let padded_len = trace_rows.len().next_power_of_two(); // Make sure the number of rows is a power of two to facilitate the interpolation and commitment operation of FRI.
    for _ in trace_rows.len()..std::cmp::max(padded_len, RANGE_MAX) {
        trace_rows.push(vec![F::ZERO; columns::NUM_ARITH_COLUMNS]); // Add all zero rows.
    }
    // Transpose to column data
    let mut trace_cols = transpose(&trace_rows);
    // range check
    self.generate_range_checks(&mut trace_cols);
    // Convert to a polynomial representation
    trace_cols.into_iter().map(PolynomialValues::new).collect()
}

1.2.3 Step 3: Verify Range Checks

Validate specific columns in the trace to ensure values lie within valid ranges (e.g., ([0, 2^{16}))).

fn generate_range_checks(&self, cols: &mut [Vec<F>]) {
    // Initialize the RANGE COUNTER column with the range [0, 2^16).
    for i in 0..RANGE_MAX {
        cols[RANGE_COUNTER][i] = F::from_canonical_usize(i);
    }
    for i in RANGE_MAX..cols[0].len() {
        cols[RANGE_COUNTER][i] = F::from_canonical_usize(RANGE_MAX - 1);
    }
    // Frequency checks are performed on SHARED COLS
    for col in SHARED_COLS {
        for i in 0..cols[0].len() {
            let x = cols[col][i].to_canonical_u64() as usize;
            assert!(
                x < RANGE_MAX,
                "column value {} exceeds the max range value {}",
                x,
                RANGE_MAX
            );
            cols[RC_FREQUENCIES][x] += F::ONE; // statistical frequency
        }
    }
}

‍1.2.4 Step 4: Validate Constraints

Verify constraints for different arithmetic operations to ensure inputs, outputs, and intermediate states conform to definitions. This module ensures offline validation of filled values.

1. Initialize the RANGE_COUNTER Column:

Fill the first RANGE_MAX rows of the range counter column RANGE_COUNTER with sequential values ([0, 2^{16})).
Fill remaining rows with the maximum value 2^{16}-1 if the column exceeds RANGE_MAX.

2. Count Frequencies of Shared Columns:

Traverse all shared columns (SHARED_COLS) and count the frequency of values within the range ([0, 2^{16})).
Trigger an assertion error if any value exceeds the range.

fn eval_packed_generic<FE, P, const D2: usize>(
    &self,
    vars: &Self::EvaluationFrame<FE, P, D2>,
    yield_constr: &mut ConstraintConsumer<P>,
) where
    FE: FieldExtension<D2, BaseField = F>,
    P: PackedField<Scalar = FE>,
{
    let lv: &[P; NUM_ARITH_COLUMNS] = vars.get_local_values().try_into().unwrap();
    let nv: &[P; NUM_ARITH_COLUMNS] = vars.get_next_values().try_into().unwrap();
    // Range checking constraints
    let rc1 = lv[RANGE_COUNTER];
    let rc2 = nv[RANGE_COUNTER];
    yield_constr.constraint_first_row(rc1); // The first row must be 0
    let incr = rc2 - rc1;
    yield_constr.constraint_transition(incr * incr - incr); // Range increasing constraint
    let range_max = P::Scalar::from_canonical_u64((RANGE_MAX - 1) as u64);
    yield_constr.constraint_last_row(rc1 - range_max); // The last line is RANGE MAX-1
    // Invocation of constraint validation for individual arithmetic operations
    mul::eval_packed_generic(lv, yield_constr);
    mult::eval_packed_generic(lv, yield_constr);
    addcy::eval_packed_generic(lv, yield_constr);
    slt::eval_packed_generic(lv, yield_constr);
    lui::eval_packed_generic(lv, nv, yield_constr);
    div::eval_packed(lv, nv, yield_constr);
    shift::eval_packed_generic(lv, nv, yield_constr);
    sra::eval_packed_generic(lv, nv, yield_constr);
    lo_hi::eval_packed_generic(lv, yield_constr);
}

1.2.5 Step 5: Perform Circuit-Level Validation

Extend circuit validation for all operations in the arithmetic table, ensuring constraints are satisfied. This module builds circuit constraints to construct proofs for the STARK system.

fn eval_ext_circuit(
    &self,
    builder: &mut CircuitBuilder<F, D>,
    vars: &Self::EvaluationFrameTarget,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
) {
    let lv: &[ExtensionTarget<D>; NUM_ARITH_COLUMNS] =
        vars.get_local_values().try_into().unwrap();
    let nv: &[ExtensionTarget<D>; NUM_ARITH_COLUMNS] =
        vars.get_next_values().try_into().unwrap();
    // Range Check Constraints (circuit)
    let rc1 = lv[RANGE_COUNTER];
    let rc2 = nv[RANGE_COUNTER];
    yield_constr.constraint_first_row(builder, rc1);
    let incr = builder.sub_extension(rc2, rc1);
    let t = builder.mul_sub_extension(incr, incr, incr);
    yield_constr.constraint_transition(builder, t);
    let range_max =  builder.constant_extension(F::Extension::from_canonical_usize(RANGE_MAX - 1));
    let t = builder.sub_extension(rc1, range_max);
    yield_constr.constraint_last_row(builder, t);
    // Circuit verification that invokes individual arithmetic operations    mul::eval_ext_circuit(builder, lv, yield_constr);
    mult::eval_ext_circuit(builder, lv, yield_constr);
    addcy::eval_ext_circuit(builder, lv, yield_constr);
    slt::eval_ext_circuit(builder, lv, yield_constr);
    lui::eval_ext_circuit(builder, lv, nv, yield_constr);
    div::eval_ext_circuit(builder, lv, nv, yield_constr);
    shift::eval_ext_circuit(builder, lv, nv, yield_constr);
    sra::eval_ext_circuit(builder, lv, nv, yield_constr);
    lo_hi::eval_ext_circuit(builder, lv, yield_constr);
}

2. CPU Operations

2.1 Basic Introduction to the CPU Module

The CPU’s execution primarily includes the following components:

Instruction Fetch and Decode The CPU fetches instructions from memory and sends them to the decode module. The module parses the instruction into opcode, function code, source registers, destination registers, and other fields. It validates the instruction’s legality and determines its type (e.g., arithmetic, jump, or memory operation). Purpose: Ensure the CPU executes only valid instructions and provides a clear instruction structure for subsequent modules.
Jumps and Branches For jump or branch instructions, the jump module calculates the target address and updates the program counter (PC). Unconditional jumps directly modify the PC, while conditional jumps depend on the status of condition registers. Branch instructions evaluate specific conditions (e.g., equal, greater, or less). Purpose: Control the program’s flow flexibly, enabling function calls, loops, and more.
Memory Access and Bus Operations Instructions requiring memory access (e.g., load and store) interact with memory via the memory bus module. The module ensures the memory channel’s status (e.g., availability) and context flags (e.g., kernel/user) are valid. Specific operations like loading words or storing data are handled by the memory input-output (Memio) module. Purpose: Bridge the CPU and memory, ensuring safe and correct data read/write operations.
Arithmetic and Logical Operations The CPU performs arithmetic or logical computations based on the instruction type, such as addition, subtraction, shifts, or conditional moves. Specific modules handle operations like bit shifts, sign extensions, or bit-level transformations. Purpose: Execute core computation tasks for the instruction.
System Calls and Privileged Operations System call instructions switch the CPU to kernel mode for privileged operations (e.g., memory mapping, thread management). The Syscall module validates the legality of operations. After completing tasks, the CPU exits kernel mode and resumes user mode. Purpose: Enable safe transitions between user and kernel modes for system-level services.
Result Writeback and State Update After executing an instruction, the result is written back to the destination register or memory, and the program counter is updated for the next instruction. Purpose: Save computation results and advance the instruction flow.

2.2 Execution Flow

2.2.1 Step 1: Structure Definition

1. Define the CpuStark structure, which is a generic structure containing the following parameters:

F: Field type, used to support finite field operations.
D: Degree of extension, used to represent the depth of the extended field.

2. The PhantomData<F> is used as a placeholder to indicate that this structure is associated with a field type.

#[derive(Copy, Clone, Default)]
pub struct CpuStark<F, const D: usize> {
    pub f: PhantomData<F>,
}

CpuStark is the core structure for CPU simulation, responsible for managing and implementing STARK-specific operations, including trace verification and constraint generation.

2.2.2 Step 2: Local Validation Method

fn eval_packed_generic<FE, P, const D2: usize>(
    &self,
    vars: &Self::EvaluationFrame<FE, P, D2>,
    yield_constr: &mut ConstraintConsumer<P>,
) where
    FE: FieldExtension<D2, BaseField = F>,
    P: PackedField<Scalar = FE>,

‍

Verifies whether the CPU’s execution trace satisfies logical constraints.
Uses packed fields for batch validation, improving the efficiency of local validation.

Step 2.1: Extract CPU State

Extract the current row (local_values) and the next row (next_values) from the input vars.
Convert these into CpuColumnsView, which allows easy access to the CPU’s fields.

let local_values: &[P; NUM_CPU_COLUMNS] = vars.get_local_values().try_into().unwrap();
let local_values: &CpuColumnsView<P> = local_values.borrow();
let next_values: &[P; NUM_CPU_COLUMNS] = vars.get_next_values().try_into().unwrap();
let next_values: &CpuColumnsView<P> = next_values.borrow();

‍

CpuColumnsView: Encapsulates CPU trace data, such as instructions and register values.
NUM_CPU_COLUMNS: The total number of columns in the CPU table.

Step 2.2: Call Module Validation

bootstrap_kernel::eval_bootstrap_kernel_packed(local_values, next_values, yield_constr);
decode::eval_packed_generic(local_values, yield_constr);
jumps::eval_packed(local_values, next_values, yield_constr);
membus::eval_packed(local_values, yield_constr);
memio::eval_packed(local_values, next_values, yield_constr);
shift::eval_packed(local_values, yield_constr);
count::eval_packed(local_values, yield_constr);
syscall::eval_packed(local_values, yield_constr);
bits::eval_packed(local_values, yield_constr);
misc::eval_packed(local_values, yield_constr);
exit_kernel::eval_exit_kernel_packed(local_values, next_values, yield_constr);

Modules are called to validate different CPU functionalities.
Each module corresponds to a specific validation logic (e.g., decoding, jumping, memory operations).

Example: decode::eval_packed_generic

fn eval_packed_generic<P: PackedField>(
    lv: &CpuColumnsView<P>,
    yield_constr: &mut ConstraintConsumer<P>,
) {
    let filter = lv.op.decode_op; // Filtering condition for instruction decoding
    let opcode_check = lv.opcode_bits[0]; // Get the first bit of the opcode
    yield_constr.constraint(filter * opcode_check); // Add constraint: verify the opcode is correct
}

‍Purpose: Verifies that the opcode is correct.

Process:

Checks whether the current row corresponds to an instruction decoding operation (via filter).
Verifies whether the first bit of the opcode matches expectations.

2.2.3 Step 3: Circuit Generation Method

fn eval_ext_circuit(
    &self,
    builder: &mut CircuitBuilder<F, D>,
    vars: &Self::EvaluationFrameTarget,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
)

Generates circuit constraints for each operation step of the CPU.
These constraints are used by the zk-STARK proof generator.

Step 3.1: Extract State

let local_values: &[ExtensionTarget<D>; NUM_CPU_COLUMNS] =
    vars.get_local_values().try_into().unwrap();
let local_values: &CpuColumnsView<ExtensionTarget<D>> = local_values.borrow();
let next_values: &[ExtensionTarget<D>; NUM_CPU_COLUMNS] =
    vars.get_next_values().try_into().unwrap();
let next_values: &CpuColumnsView<ExtensionTarget<D>> = next_values.borrow();

Extract ExtensionTarget data for circuit generation.
This process is similar to the state extraction in eval_packed_generic.

Step 3.2: Call Module Constraint Generation

bootstrap_kernel::eval_bootstrap_kernel_ext_circuit(builder, local_values, next_values, yield_constr);
decode::eval_ext_circuit(builder, local_values, yield_constr);
jumps::eval_ext_circuit(builder, local_values, next_values, yield_constr);
membus::eval_ext_circuit(builder, local_values, yield_constr);
// ...
exit_kernel::eval_exit_kernel_ext_circuit(builder, local_values, next_values, yield_constr);

‍Functionality:

Each module generates corresponding circuit constraints.
Each module provides an eval_ext_circuit method to generate these constraints.

Example: decode::eval_ext_circuit

fn eval_ext_circuit<F: RichField + Extendable<D>, const D: usize>(
    builder: &mut CircuitBuilder<F, D>,
    lv: &CpuColumnsView<ExtensionTarget<D>>,
    yield_constr: &mut RecursiveConstraintConsumer<F, D>,
) {
    let filter = lv.op.decode_op; // Filtering condition
    let opcode_check = lv.opcode_bits[0]; // First bit of the opcode
    let constr = builder.mul_extension(filter, opcode_check); // Generate multiplication constraint
    yield_constr.constraint(builder, constr); // Add to the constraints
}

The CircuitBuilder is used to build multiplication constraints, creating the circuit constraint for instruction decoding.
Verifies that the opcode is correct.

2.2.4 Step 4: Testing and Verification

#[test]
fn test_stark_degree() -> Result<()> {
    const D: usize = 2;
    type C = PoseidonGoldilocksConfig;
    type F = <C as GenericConfig<D>>::F;
    type S = CpuStark<F, D>;
    let stark = S {
        f: Default::default(),
    };
    test_stark_low_degree(stark)
}

‍Functionality:

Tests whether the polynomial constraints satisfy the low-degree property.
Ensures the validity of the STARK proof.

#[test]
fn test_stark_circuit() -> Result<()> {
    const D: usize = 2;
    type C = PoseidonGoldilocksConfig;
    type F = <C as GenericConfig<D>>::F;
    type S = CpuStark<F, D>;
    let stark = S {
        f: Default::default(),
    };
    test_stark_circuit_constraints::<F, C, S, D>(stark)
}

‍Functionality:

Tests whether the generated circuit conforms to the expected constraints.
Verifies the correctness of the STARK circuit generation.

Subscribe to the ZKM Blog to stay updated with the latest research by ZKM. If you have any questions about the article, you can contact the ZKM team on discord: discord.com/zkm