Back to Home

About the Project

Kernelized Vector Storage: Two approaches to memory-efficient pointer representation — Tiny Pointers and Kernelized Squashing.

Tiny Pointers

An offset-based compression technique for contiguous memory layouts. By storing relative offsets from a base address instead of full 64-bit pointers, Tiny Pointers achieve significant memory savings with O(1) compression and decompression operations. Ideal for array-based data structures, compact graphs, and memory pools.

Kernelized Squashing

A kernel function-based compression approach using hash table lookup for flexible memory layouts. Supports Linear, Polynomial, RBF, and Sigmoid kernel functions to transform pointers into compact identifiers. Provides O(1) average-case compression with up to 60% memory savings in pointer-dense applications.

Hybrid Approach

Together, Tiny Pointers and Kernelized Squashing cover a broad spectrum of use cases — from contiguous arrays (Tiny Pointers) to fragmented memory graphs (Kernelized Squashing). The choice depends on memory layout, pointer density, and performance requirements.

Research Context

This research addresses the growing challenge of memory efficiency in modern computing. With 64-bit architectures, each pointer consumes 8 bytes, leading to substantial overhead in pointer-dense applications. These techniques are particularly relevant for AI/ML workloads, distributed systems, graph processing, and embedded computing.

Algorithm Design

Both techniques emphasize O(1) operations, minimal memory overhead, and cache-friendly access patterns. Tiny Pointers use offset arithmetic for speed; Kernelized Squashing uses hash-based lookup for flexibility. Together they demonstrate that pointer compression can be both fast and memory-efficient.

Algorithm Details

Tiny Pointers

  • Offset-based: pointer = base + offset
  • O(1) compress: ptr - base
  • O(1) decompress: base + offset
  • Requires contiguous memory
  • 2× pointer density in cache lines
  • Up to 50% memory savings

Kernelized Squashing

  • Hash table-based lookup
  • Kernel functions: Linear, Poly, RBF, Sigmoid
  • O(1) avg-case compression via hashing
  • Works with non-contiguous layouts
  • Flexible memory region support
  • Up to 60% savings in pointer-dense apps

Common Properties

  • Reduce 64-bit pointers to 32-64 bits
  • Improve cache hit rates by 3-5%
  • ~12% energy reduction from less memory traffic
  • Pure C implementation, -O2 optimized
  • Open addressing with linear probing
  • 75% load factor threshold

When To Use Each

  • Tiny Pointers: Arrays, memory pools, contiguous graphs
  • Kernel Squashing: Fragmented data, sparse graphs, distributed systems
  • Both: High pointer density, cache-sensitive workloads
  • Compression ratio depends on data structure
  • Table overhead amortized at >10K pointers

Kernel Functions Reference

Linear

K(x,y) = x · y

Fastest, lowest overhead. Best for general use.

Polynomial

K(x,y) = (x·y+c)ᵈ

Higher-dimensional mapping. Tunable degree d.

RBF

K(x,y) = exp(−γ‖x−y‖²)

Gaussian-style distance weighting. Good for clustering.

Sigmoid

K(x,y) = tanh(αx·y+c)

Neural network-inspired activation.

Implementation

  • Pure C implementation with C11 support
  • GCC compiler with POSIX threads
  • Make build system
  • Comprehensive benchmark suite

Performance

  • O(1) compression and decompression
  • Up to 60% memory savings
  • 5-15% performance improvement
  • ~12% energy reduction