Started reading an interesting paper by Jesper Holm Olsen and Søren Skov on cache-oblivious algorithms. When I first heard "cache" I was thinking memcache, ehcache, etc., but they're referring to CPU caches (L1, L2, etc.). Most of us are used to developing and analyzing algorithms under the RAM (random access machine) model, in which memory is unbounded and every access takes constant time. For example, if you have n inputs that fit neatly into memory, a comparison-based sort is going to take you n log n, with some constant factors thrown in.
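To see that RAM-model accounting in action, here's a quick sketch that counts the comparisons a plain merge sort actually makes and puts the tally next to the n log₂ n bound (the counter plumbing and sizes are mine, not from the paper):

```python
import math
import random

def merge_sort(a, counter):
    """Merge sort that tallies element comparisons in counter[0]."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], counter)
    right = merge_sort(a[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

n = 1024
counter = [0]
data = random.sample(range(n * 10), n)
result = merge_sort(data, counter)
# The count lands under n * log2(n) = 10240; the gap is the constant factor.
print(counter[0], math.ceil(n * math.log2(n)))
```

Under the RAM model that comparison count is the whole story; the cache models below exist because on real hardware it isn't.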
While this model is simple and sufficient for many problems, for others we need better models and an understanding of the underlying hardware. For example, we see this in database systems that make use of B-trees, whose node size is typically tied to the disk block size because disk access is relatively slow. Such algorithms, built on an I/O model, are cache-aware. Analysis of cache-aware algorithms depends on the block and internal memory sizes. For example, a scan would take something like n/B + 1 I/Os, where B is the block size. Even if you take the disk out of the picture, today's processors have hierarchical memory: L1 cache is a lot faster than L2, which is faster than L3, which is faster than main memory. To develop optimal algorithms you may need to be aware of these caches and their structure.
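To make the n/B + 1 figure concrete, here's a toy cost function for a sequential scan under the I/O model (the helper name is my own, for illustration); the point is that the dominant term shrinks linearly as the block size grows:

```python
def scan_ios(n, B):
    """I/O-model cost of scanning n contiguous items with block size B.

    Every B consecutive items share one block transfer, plus up to one
    extra block when the scan doesn't start on a block boundary.
    """
    return n // B + 1

# Same scan, three block sizes: bigger blocks, far fewer transfers.
for B in (8, 64, 512):
    print(B, scan_ios(10_000, B))
```

This is exactly why B-tree nodes are sized to the disk block: each node visit costs one transfer regardless of how much of the block you use, so you might as well fill it.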
Obviously this makes it difficult to move a cache-aware algorithm's implementation from one architecture to another, since each architecture will have different memory sizes and structure, and even a different number of levels in the memory hierarchy. Enter cache-oblivious algorithms. As the name implies, these algorithms don't care about the underlying memory structure and size. The analysis is still based on some block size B and internal memory size M, but these values don't need to be tweaked to get optimal performance.
An algorithm is cache oblivious if no program variables dependent on hardware configuration parameters, such as cache size and cache-line length, need to be tuned to minimize the number of cache misses.
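The textbook example of this (not from this particular paper, but the standard one) is divide-and-conquer matrix transpose: keep splitting along the larger dimension and eventually every sub-block fits in cache, whatever B and M happen to be, with no tuning knobs anywhere. A minimal Python sketch, with a base-case cutoff chosen arbitrarily rather than to match any cache:

```python
def transpose(A, B, r0, c0, n_rows, n_cols, cutoff=16):
    """Cache-oblivious transpose: B[j][i] = A[i][j] for the
    n_rows x n_cols block of A with top-left corner (r0, c0).

    Halving the larger dimension recursively means the subproblems
    shrink to fit each level of the cache hierarchy without the code
    ever knowing the cache or line sizes.
    """
    if n_rows <= cutoff and n_cols <= cutoff:
        for i in range(r0, r0 + n_rows):
            for j in range(c0, c0 + n_cols):
                B[j][i] = A[i][j]
    elif n_rows >= n_cols:
        half = n_rows // 2
        transpose(A, B, r0, c0, half, n_cols, cutoff)
        transpose(A, B, r0 + half, c0, n_rows - half, n_cols, cutoff)
    else:
        half = n_cols // 2
        transpose(A, B, r0, c0, n_rows, half, cutoff)
        transpose(A, B, r0, c0 + half, n_rows, n_cols - half, cutoff)

# Usage: transpose a 64x64 matrix.
n = 64
A = [[i * n + j for j in range(n)] for i in range(n)]
B = [[0] * n for _ in range(n)]
transpose(A, B, 0, 0, n, n)
```

The cutoff here only bounds recursion overhead; correctness and the cache analysis don't depend on it, which is the whole point of the definition above.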
The authors go on to discuss some cache-oblivious algorithms in detail. I haven't read the rest of the paper, but thought I'd pass it along in case you're looking for something fun to read.
file under: interesting stuff I'll probably never use.