About the Project
Kernelized Vector Storage: Two approaches to memory-efficient pointer representation — Tiny Pointers and Kernelized Squashing.
Tiny Pointers
An offset-based compression technique for contiguous memory layouts. By storing relative offsets from a base address instead of full 64-bit pointers, Tiny Pointers achieve significant memory savings with O(1) compression and decompression operations. Ideal for array-based data structures, compact graphs, and memory pools.
Kernelized Squashing
A kernel function-based compression approach using hash table lookup for flexible memory layouts. Supports Linear, Polynomial, RBF, and Sigmoid kernel functions to transform pointers into compact identifiers. Provides O(1) average-case compression with up to 60% memory savings in pointer-dense applications.
Hybrid Approach
Together, Tiny Pointers and Kernelized Squashing cover a broad spectrum of use cases — from contiguous arrays (Tiny Pointers) to fragmented memory graphs (Kernelized Squashing). The choice depends on memory layout, pointer density, and performance requirements.
Research Context
This research addresses the growing challenge of memory efficiency in modern computing. With 64-bit architectures, each pointer consumes 8 bytes, leading to substantial overhead in pointer-dense applications. These techniques are particularly relevant for AI/ML workloads, distributed systems, graph processing, and embedded computing.
Algorithm Design
Both techniques emphasize O(1) operations, minimal memory overhead, and cache-friendly access patterns. Tiny Pointers use offset arithmetic for speed; Kernelized Squashing uses hash-based lookup for flexibility. Together they demonstrate that pointer compression can be both fast and memory-efficient.
Algorithm Details
Tiny Pointers
- Offset-based: pointer = base + offset
- O(1) compress: ptr - base
- O(1) decompress: base + offset
- Requires contiguous memory
- 2× pointer density in cache lines
- Up to 50% memory savings
Kernelized Squashing
- Hash table-based lookup
- Kernel functions: Linear, Poly, RBF, Sigmoid
- O(1) avg-case compression via hashing
- Works with non-contiguous layouts
- Flexible memory region support
- Up to 60% savings in pointer-dense apps
Common Properties
- Reduce 64-bit pointers to 32-64 bits
- Improve cache hit rates by 3-5%
- ~12% energy reduction from less memory traffic
- Pure C implementation, -O2 optimized
- Open addressing with linear probing
- 75% load factor threshold
When To Use Each
- Tiny Pointers: Arrays, memory pools, contiguous graphs
- Kernel Squashing: Fragmented data, sparse graphs, distributed systems
- Both: High pointer density, cache-sensitive workloads
- Compression ratio depends on data structure
- Table overhead amortized at >10K pointers
Kernel Functions Reference
Linear
K(x,y) = x · yFastest, lowest overhead. Best for general use.
Polynomial
K(x,y) = (x·y+c)ᵈHigher-dimensional mapping. Tunable degree d.
RBF
K(x,y) = exp(−γ‖x−y‖²)Gaussian-style distance weighting. Good for clustering.
Sigmoid
K(x,y) = tanh(αx·y+c)Neural network-inspired activation.
Implementation
- Pure C implementation with C11 support
- GCC compiler with POSIX threads
- Make build system
- Comprehensive benchmark suite
Performance
- O(1) compression and decompression
- Up to 60% memory savings
- 5-15% performance improvement
- ~12% energy reduction