This is on Riak 1.4 with LevelDB as the storage backend, which is what enables secondary indexing. The backend is set in app.config via:
{riak_kv, [{storage_backend, riak_kv_eleveldb_backend},...]}
{eleveldb, [
    {data_root, "/var/lib/riak/leveldb"}
]}
Data:
from riak import RiakClient, RiakPbcTransport
ds = RiakClient('127.0.0.1', port=1030, transport_class=RiakPbcTransport)
[ipython 11]: for x in xrange(1, 100000):
    ds.bucket('stuff').new('bob_'+str(x),
                           data={'test_obj':x}).store()
Setup:
from riak.mapreduce import RiakMapReduce, RiakKeyFilter
mr = RiakMapReduce(ds)  # ds is the RiakClient created above
mr.add('stuff')  # the bucket loaded above
mr.add_key_filters(RiakKeyFilter().starts_with('bob_'))  # id_tag is 'bob_' in this test
b = ds.bucket('stuff')
Results:
[ipython 21]: %time max([int(_id.get_key().split('bob_')[1]) for _id in mr.run()])
CPU times: user 4.09 s, sys: 0.31 s, total: 4.40 s
Wall time: 40.94 s
Out[21]: 99999
[ipython 22]: %time max([int(_id.get_key().split('bob_')[1]) for _id in ds.index('stuff', "$key", ' ', '~').run()])
CPU times: user 4.16 s, sys: 0.32 s, total: 4.48 s
Wall time: 41.80 s
Out[22]: 99999
[ipython 23]: %time max([int(_id.split('bob_')[1]) for _id in ds.bucket('stuff').get_keys()])
CPU times: user 0.48 s, sys: 0.01 s, total: 0.49 s
Wall time: 1.39 s
Out[23]: 99999
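Putting the three wall times above side by side as ratios (plain arithmetic on the numbers already reported, nothing measured anew; the labels are mine):

```python
# Wall-clock seconds copied from the three %time runs above.
wall_seconds = [
    ("key-filter MapReduce", 40.94),
    ("$key index query", 41.80),
    ("get_keys", 1.39),
]

# Use the plain key listing as the baseline.
baseline = dict(wall_seconds)["get_keys"]

# How many times slower each approach is than the baseline.
slowdown = dict((name, secs / baseline) for name, secs in wall_seconds)

for name, secs in wall_seconds:
    print("%-22s %6.2f s  %5.1fx" % (name, secs, slowdown[name]))
```

Both filtered queries come out at roughly 30 times the wall time of listing every key in the bucket.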
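Granted, it's a super simple test, but the results are still confusing: the key-filter MapReduce and the $key index query each take about 41 seconds of wall time, while a plain get_keys() over the whole bucket finishes in 1.39 seconds. Could it be a misconfigured LevelDB? Turning it on seems pretty basic.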
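One way to narrow this down would be to time the fetch and the client-side parsing separately, to see whether the 40 seconds go to Riak or to Python. A rough sketch only: `timed_max_suffix` and `fake_fetch` are names made up for illustration, and the fake fetcher stands in for a real query so the snippet runs without a cluster.

```python
import time


def timed_max_suffix(fetch, prefix="bob_"):
    """Run fetch(), then parse the keys, timing the two steps separately.

    fetch is any zero-argument callable returning an iterable of key strings.
    Returns (max_suffix, fetch_seconds, parse_seconds).
    """
    t0 = time.time()
    keys = list(fetch())  # force the whole result set to arrive here
    t1 = time.time()
    largest = max(int(k.split(prefix)[1]) for k in keys)
    t2 = time.time()
    return largest, t1 - t0, t2 - t1


# Hypothetical stand-in for a Riak call; returns keys shaped like the test data.
def fake_fetch():
    return ["bob_%d" % x for x in range(1, 1000)]


largest, fetch_s, parse_s = timed_max_suffix(fake_fetch)
print("max=%d fetch=%.4fs parse=%.4fs" % (largest, fetch_s, parse_s))
```

Against a live node the callable could be `lambda: ds.bucket('stuff').get_keys()`, or a small wrapper that calls `get_key()` on each result of `mr.run()`, as in the timings above.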