The opposite of a bloom filter

by Anilm3on 10/7/2017, 10:27 AMwith 25 comments

by pents90on 10/7/2017, 7:20 PM

I would argue that the opposite of a bloom filter doesn't really exist, at least not in a satisfying way. A bloom filter's size is dependent only on the desired false positive rate, whereas its opposite must be dependent on the size of the data. (And don't be fooled by data that can be represented by a primary key, that's not as general as a bloom filter.) I tried, with limited success, to explain my point of view in this answer on StackExchange: https://cstheory.stackexchange.com/questions/6596/a-probabil...

by DenisMon 10/7/2017, 8:03 PM

TLDR: a cache with LRU eviction, but only storing the keys, not the values.

by ww520on 10/7/2017, 5:19 PM

"A Bloom filter is a data structure that may report it contains an item that it does not (a false positive), but is guaranteed to report correctly if it contains the item (“no false negatives”)."

I'm afraid that is not how it works. A Bloom filter can tell whether an item may be in the set (false positive) but can definitely tell an item is NOT in the set (no false negative).

by ahazred8taon 10/7/2017, 5:58 PM

"It's a cache." The use case is deduplication in an event-stream environment. This calls for exact matching without hash collisions.

by supermatton 10/7/2017, 10:08 PM

Low memory version:

`return false;`