Fix pointer difference scaling for non-char pointer types #298

@jserv

Description

Pointer subtraction currently loses element type information and returns an unscaled result. According to C99 §6.5.6 (paragraph 9), when two pointers into the same array are subtracted, the result is the difference of the subscripts of the two elements. The current implementation returns the raw byte difference instead of dividing it by sizeof(element_type), so the result is wrong for int *, struct pointers, and any other pointer whose element is larger than one byte.

Current Bug

// src/parser.c - handle_pointer_arithmetic
// Current implementation loses type and doesn't scale properly
int arr[10];
int *p = &arr[0];
int *q = &arr[5];
int diff = q - p;  // Should be 5, but returns 20 (5 * sizeof(int))
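
For comparison, the expected behaviour can be checked on any conforming host compiler. The sketch below is only an illustration: `elem_diff` and `byte_diff` are hypothetical names, not functions from the code base. It shows that the value reported by the bug is the byte difference, and that dividing it by the element size gives the subscript difference.

```c
#include <stddef.h>

/* Element difference: what `q - p` must yield per C99 §6.5.6. */
ptrdiff_t elem_diff(const int *q, const int *p)
{
    return q - p;
}

/* Byte difference: what the subtraction yields when scaling is skipped. */
ptrdiff_t byte_diff(const int *q, const int *p)
{
    return (const char *) q - (const char *) p;
}
```

With `int arr[10]`, `elem_diff(&arr[5], &arr[0])` is 5, while `byte_diff(&arr[5], &arr[0])` is `5 * sizeof(int)`, which is 20 on a target with 4-byte int.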

Root Cause

The implementation in src/parser.c doesn't preserve operand type information through pointer arithmetic expressions and fails to divide by the element size.

Proposed Fix

1. Preserve Type Information

// Add to src/defs.h
typedef struct {
    var_t *var;
    type_t *type;  // Preserve original type
    int ptr_level;
    size_t element_size;  // Size of pointed-to type
} typed_var_t;

// Modify expression handling to track types
typedef struct {
    opcode_t op;
    typed_var_t *left;
    typed_var_t *right;
    type_t *result_type;
} typed_expr_t;
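
To make the role of `element_size` concrete, here is a minimal standalone sketch of how it might be derived from a type record. `mock_type_t` and `mock_element_size` are hypothetical stand-ins invented for this example; the real `type_t` layout is not shown in this issue, and the host's `sizeof(void *)` stands in for the target's pointer size.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a compiler type record. */
typedef struct {
    size_t base_size; /* size of the base type in bytes; 0 if incomplete (e.g. void) */
    int ptr_level;    /* 0 = not a pointer, 1 = T*, 2 = T**, ... */
} mock_type_t;

/* Size of the pointed-to type, or 0 if `t` is not a pointer
 * or points to an incomplete type. */
size_t mock_element_size(const mock_type_t *t)
{
    if (t->ptr_level == 0)
        return 0;              /* not a pointer */
    if (t->ptr_level > 1)
        return sizeof(void *); /* T** points to a pointer, whatever T is */
    return t->base_size;       /* T* points to T */
}
```

The multi-level case matters: for `char **` the element is a `char *`, so the element size is the pointer size, not 1.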

2. Fix Pointer Difference Calculation

// src/parser.c - Fix handle_pointer_arithmetic
void handle_pointer_diff(block_t *parent, basic_block_t **bb,
                         typed_var_t *left, typed_var_t *right) {
    // Verify both pointers point to compatible types
    if (!types_compatible(left->type, right->type)) {
        error("Subtracting pointers to incompatible types");
        return;
    }
    
    // Get element size
    size_t elem_size = get_element_size(left->type);
    
    if (elem_size == 0) {
        error("Cannot subtract pointers to incomplete type");
        return;
    }
    
    // Generate subtraction
    var_t *diff = add_tmp_var(parent, "ptr_diff", TYPE_int);
    add_insn(parent, bb, OP_SUB, diff, left->var, right->var, 
             PTR_SIZE, NULL);
    
    // Scale by element size (divide)
    if (elem_size > 1) {
        var_t *size_var = add_constant(elem_size);
        var_t *result = add_tmp_var(parent, "scaled_diff", TYPE_int);
        add_insn(parent, bb, OP_DIV, result, diff, size_var, 
                 TYPE_size(TYPE_int), NULL);
        push_var(result);
    } else {
        push_var(diff);  // char* doesn't need scaling
    }
}

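// Helper to get the size of the pointed-to type (0 if not a pointer).
// Must be declared before handle_pointer_diff, which calls it.
size_t get_element_size(type_t *ptr_type) {
    if (ptr_type->is_ptr == 0) {
        return 0;  // Not a pointer
    }
    
    // Dereference one level
    type_t *elem_type = deref_type(ptr_type);
    
    // Pointer to pointer (e.g. char **): the element is itself a pointer,
    // so its size is the pointer size regardless of the base type
    if (elem_type->is_ptr > 0) {
        return PTR_SIZE;
    }
    
    switch (elem_type->base_type) {
    case TYPE_char:
        return 1;
    case TYPE_int:
        return 4;
    case TYPE_struct:
        return elem_type->size;
    case TYPE_void:
        // Constraint violation in ISO C: void is an incomplete type
        error("Arithmetic on void* is not permitted");
        return 1;  // Error recovery
    default:
        return TYPE_size(elem_type);
    }
}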
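
One detail the division step depends on is signedness: `p1 - p2` can be negative, so the divide has to be a signed one. Whether `OP_DIV` is signed is not stated in this issue, so the following standalone sketch only illustrates the arithmetic; `scale_ptr_diff` is a hypothetical helper, not part of the code base.

```c
#include <stddef.h>

/* Hypothetical helper: convert a raw byte difference into an element count.
 * Sets *ok to 0 and returns 0 when elem_size is 0 (non-pointer or
 * incomplete type); otherwise sets *ok to 1. */
long scale_ptr_diff(long byte_diff, size_t elem_size, int *ok)
{
    if (elem_size == 0) {
        *ok = 0;
        return 0;
    }
    *ok = 1;
    /* Cast the size to a signed type first: dividing directly by an
     * unsigned size_t would convert a negative byte_diff to a huge
     * unsigned value before the division. */
    return byte_diff / (long) elem_size;
}
```

For example, a byte difference of -20 with a 4-byte element gives -5, matching the `p1 - p2 == -5` case in the tests.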

3. Update Expression Parser

// Modify read_expr_operand to preserve types
void read_expr_operand(block_t *parent, basic_block_t **bb) {
    // ... existing code ...
    
    if (peek_token(lex_peek()).type == '-') {
        lex_expect('-');
        
        // Check if both operands are pointers
        var_t *right_var = pop_var();
        var_t *left_var = pop_var();
        
        type_t *left_type = get_var_type(left_var);
        type_t *right_type = get_var_type(right_var);
        
        if (left_type->is_ptr && right_type->is_ptr) {
            // Pointer difference - needs special handling
            typed_var_t left_typed = {left_var, left_type, 
                                     left_type->is_ptr,
                                     get_element_size(left_type)};
            typed_var_t right_typed = {right_var, right_type,
                                      right_type->is_ptr,
                                      get_element_size(right_type)};
            
            handle_pointer_diff(parent, bb, &left_typed, &right_typed);
        } else {
            // Regular subtraction
            push_var(left_var);
            push_var(right_var);
            read_expr_add_sub(parent, bb, OP_SUB);
        }
    }
}

Test Cases

// tests/pointer_diff_test.c

void test_int_pointer_diff() {
    int arr[10];
    int *p1 = &arr[2];
    int *p2 = &arr[7];
    
    assert(p2 - p1 == 5);  // Currently fails, returns 20
    assert(p1 - p2 == -5); // Currently fails, returns -20
}

void test_struct_pointer_diff() {
    struct point { int x, y; } points[10];
    struct point *p1 = &points[3];
    struct point *p2 = &points[8];
    
    assert(p2 - p1 == 5);  // Currently fails, returns 40
}

void test_char_pointer_diff() {
    char str[100];
    char *p1 = &str[10];
    char *p2 = &str[30];
    
    assert(p2 - p1 == 20); // Should work correctly (no scaling)
}

void test_void_pointer_diff() {
    int arr[10];
    void *p1 = &arr[0];
    void *p2 = &arr[5];
    
    // Should be a compilation error in C99 (void is an incomplete type)
    // int diff = p2 - p1;  // Error: arithmetic on void*
}

C99 Standard Reference

C99 §6.5.6 Additive operators, paragraph 9:

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
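
The "one past the last element" clause is what makes the usual end-pointer idiom well defined. A small sketch follows; `remaining` and `N` are names made up for this example.

```c
#include <stddef.h>

#define N 10

/* Number of elements between `p` and the end of an N-element array
 * starting at `base`. */
ptrdiff_t remaining(const int *base, const int *p)
{
    /* One past the last element: valid to form and to subtract,
     * but not to dereference. */
    const int *end = base + N;
    return end - p;
}
```

With `int arr[N]`, `remaining(arr, &arr[7])` is 3 and `remaining(arr, arr + N)` is 0. Subtracting pointers into two different arrays is outside this rule and is undefined behaviour.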

Impact

  • High severity: Incorrect pointer arithmetic breaks many C programs
  • Common pattern: Used in array processing, string manipulation
  • Silent corruption: Wrong values without errors
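
Two typical uses of pointer difference are sketched below as an illustration of the patterns mentioned above; `my_strlen` and `index_of` are hypothetical examples, not code from the project. The first works even with the bug because the element size is 1; the second would silently return a byte offset instead of an index.

```c
#include <stddef.h>

/* String length via pointer difference (char *, element size 1). */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p)
        p++;
    return (size_t) (p - s);
}

/* Index of the first element equal to `key`, or -1 if absent
 * (int *, so the difference must be scaled by sizeof(int)). */
ptrdiff_t index_of(const int *arr, size_t n, int key)
{
    const int *p;
    for (p = arr; p < arr + n; p++) {
        if (*p == key)
            return p - arr;
    }
    return -1;
}
```

For `int a[] = {3, 1, 4, 1, 5}`, `index_of(a, 5, 4)` should be 2. Without scaling, and assuming a 4-byte int, it would come back as 8, which then indexes past the end of the array if used as a subscript.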

Implementation Plan

  1. Add type tracking to expression evaluation
  2. Fix pointer difference calculation
  3. Add comprehensive test cases
  4. Verify bootstrap still works
  5. Update documentation

Success Criteria

  • All pointer difference operations return correct scaled values
  • void * arithmetic produces a compile-time error
  • Bootstrap compilation succeeds
  • No regression in existing tests

Metadata

  • Assignees: none
  • Labels: bug