About the Project

Kernelized Vector Storage: Two approaches to memory-efficient pointer representation — Tiny Pointers and Kernelized Squashing.

Tiny Pointers

An offset-based compression technique for contiguous memory layouts. By storing relative offsets from a base address instead of full 64-bit pointers, Tiny Pointers achieve significant memory savings with O(1) compression and decompression operations. Ideal for array-based data structures, compact graphs, and memory pools.

Kernelized Squashing

A kernel function-based compression approach using hash table lookup for flexible memory layouts. Supports Linear, Polynomial, RBF, and Sigmoid kernel functions to transform pointers into compact identifiers. Provides O(1) average-case compression with up to 60% memory savings in pointer-dense applications.

Hybrid Approach

Together, Tiny Pointers and Kernelized Squashing cover a broad spectrum of use cases — from contiguous arrays (Tiny Pointers) to fragmented memory graphs (Kernelized Squashing). The choice depends on memory layout, pointer density, and performance requirements.

Research Context

This research addresses the growing challenge of memory efficiency in modern computing. With 64-bit architectures, each pointer consumes 8 bytes, leading to substantial overhead in pointer-dense applications. These techniques are particularly relevant for AI/ML workloads, distributed systems, graph processing, and embedded computing.

Algorithm Design

Both techniques emphasize O(1) operations, minimal memory overhead, and cache-friendly access patterns. Tiny Pointers use offset arithmetic for speed; Kernelized Squashing uses hash-based lookup for flexibility. Together they demonstrate that pointer compression can be both fast and memory-efficient.

Algorithm Details

Tiny Pointers

Offset-based: pointer = base + offset
O(1) compress: ptr - base
O(1) decompress: base + offset
Requires contiguous memory
2× pointer density in cache lines
Up to 50% memory savings

Kernelized Squashing

Hash table-based lookup
Kernel functions: Linear, Poly, RBF, Sigmoid
O(1) avg-case compression via hashing
Works with non-contiguous layouts
Flexible memory region support
Up to 60% savings in pointer-dense apps

Common Properties

Reduce 64-bit pointers to 32-64 bits
Improve cache hit rates by 3-5%
~12% energy reduction from less memory traffic
Pure C implementation, -O2 optimized
Open addressing with linear probing
75% load factor threshold

When To Use Each

Tiny Pointers: Arrays, memory pools, contiguous graphs
Kernel Squashing: Fragmented data, sparse graphs, distributed systems
Both: High pointer density, cache-sensitive workloads
Compression ratio depends on data structure
Table overhead amortized at >10K pointers

Kernel Functions Reference

Linear

K(x,y) = x · y

Fastest, lowest overhead. Best for general use.

Polynomial

K(x,y) = (x·y+c)ᵈ

Higher-dimensional mapping. Tunable degree d.

RBF

K(x,y) = exp(−γ‖x−y‖²)

Gaussian-style distance weighting. Good for clustering.

Sigmoid

K(x,y) = tanh(αx·y+c)

Neural network-inspired activation.

Implementation

Pure C implementation with C11 support
GCC compiler with POSIX threads
Make build system
Comprehensive benchmark suite

Performance

O(1) compression and decompression
Up to 60% memory savings
5-15% performance improvement
~12% energy reduction