lp://staging/~jameinel/+junk/pybloom
- Get this branch:
- bzr branch lp://staging/~jameinel/+junk/pybloom
Branch information
- Owner:
- John A Meinel
- Status:
- Development
Recent revisions
- 42. By John A Meinel
-
Remove an !=, which also caught an accidental access to the original python object
instead of the character buffer.
- 41. By John A Meinel
-
Play around with some tweaks. When benchmarking murmur, use 20x the number of keys.
- 40. By John A Meinel
-
Lots of little tweaks.
Include a larger ancestry set. Now using 24k bzr revs instead of 8k.
Spend some time optimizing the insert side of blooms.
Improves insert time from 200+ms down to 50ms.
Add BloomMurmur to the benchmark suite, so it is actually being benchmarked.
Overall BloomMurmur is about 200 => 30ms for check times (not quite 10x faster).
Insert times are approx 250 => 50ms (about 5x).
Using the multi-way & aligned path shaves about 10% off. 56ms => 49ms.
We really need a longer-running test, ideally ~1s rather than .05s.
- 38. By John A Meinel
-
Add the framework for a multi-hash version.
The main win here is that we can factor out some of the common work
rather than making multiple passes over the data.
Also, we allow for using the fully optimized code when on a
little-endian device with aligned input.
- 37. By John A Meinel
-
Minor optimization of the python implementation of murmur hash.
The big change is to just use struct.unpack() across everything in one pass,
rather than lots of ord() calls and bit shifts.
- 35. By John A Meinel
-
Push the __contains__ check down into pyrex.
Gives us another ~10% on indexbench.
For some reason get_components_positions still loses.
- 34. By John A Meinel
-
Add a filter_by_presence function, to allow filtering multiple nodes in one go.
Customize the BloomMurmur.__contains__ so that it doesn't have to compute all hashes
when it can know right away that something doesn't hit.
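The early-exit membership check and batch filtering described in revision 34 might look roughly like the sketch below. It is not the branch's code: the class name SketchBloom, its constructor parameters, and the signature of filter_by_presence are assumptions made for illustration, and a salted BLAKE2b digest stands in for the murmur hash.

```python
import hashlib


class SketchBloom:
    """Toy Bloom filter (hypothetical; not pybloom's BloomMurmur)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _position(self, key, seed):
        # Stand-in hash: salted BLAKE2b rather than murmur (an assumption).
        digest = hashlib.blake2b(
            key, digest_size=8, salt=bytes([seed])
        ).digest()
        return int.from_bytes(digest, "little") % self.num_bits

    def add(self, key):
        # Set one bit per hash function.
        for seed in range(self.num_hashes):
            pos = self._position(key, seed)
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, key):
        # Compute hashes one at a time and stop at the first unset bit,
        # so a definite miss never pays for the remaining hashes.
        for seed in range(self.num_hashes):
            pos = self._position(key, seed)
            if not self.bits[pos >> 3] & (1 << (pos & 7)):
                return False
        return True


def filter_by_presence(bloom, keys):
    """Return the keys the filter reports as possibly present, in order."""
    return [key for key in keys if key in bloom]
```

Keys are byte strings here. A Bloom filter never reports a false negative, so every key that was added survives filter_by_presence; keys that were never added are usually dropped, but may occasionally pass as false positives.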
Branch metadata
- Branch format:
- Branch format 5
- Repository format:
- Bazaar pack repository format 1 (needs bzr 0.92)