Looks can be deceiving

Written by Yaniv Kessler

lexic order !== numeric order (dah!)


lexic order !== numeric order (dah!)

Don’t do this:

var level = require('levelup')
var db = level('db')
db.put(Date.now(), 'somevalue1')
db.put(Date.now(), 'somevalue2')

or this:

var level = require('levelup')
var db = level('db')
var counter = 0
db.put(counter++, 'somevalue1')
db.put(counter++, 'somevalue2')

If you care about order of iteration being related to the context of the keys.

I know I did… and a quick search on github demonstrates I’m not the only one too.

Your keys will get converted into strings and LevelDB orders its keys in a lexicographical order, and lexic order != numeric order (dah!) although sometimes it looks like it is.

example

The following numerically ordered set [1, 2, 3, 4] will look exactly the same when ordered lexicographically, but [9, 10, 11, 12] will look like this: [10, 11, 12, 9].

Epoch timestamps are even more deceiving, since you need to look at a much wider range to see the problem:

var now = Date.now() // 2016 when I wrote this post
var aLongTimeAgo = 5e11 // circa 1985
var farIntoTheFuture = 5e12 // circa 2128

db.put(now, '1')
db.put(aLongTimeAgo, '2')
db.put(farIntoTheFuture, '3')
db.createReadStream().on('data', console.log)

// prints: 
// { key: '1461844195056', value: '1' }
// { key: '500000000000', value: '2' }
// { key: '5000000000000', value: '3' }

Again not the desired result.

solution

Probably the best solution for the problem at hand is to use bytewise to store the keys. Numeric indexing is the first use case described in the [README](https://github.com/deanlandolt/bytewise# numeric-indexing)

Another solution is to pad with zeros from the left, so [9, 10, 11, 12] will be [09, 10, 11, 12] but the downside of this approach is that once your range increases, all the keys have to be “repadded”. E.g if we add 100 to our set then we need to repad: [009, 010, 011, 012, 100]

Unlike node.js some environments expose a feature of leveldb called custom comparators, which will also solve the problem at hand. For example [plylev](http://plyvel.readthedocs.io/en/latest/user.html# custom-comparators)

Lastly, a neat trick I learned from a friend (@avishay) is to negate (i.e turn negative) the numbers before storing them. Of course this is only good for situations where the environment does not convert your integer into a string first. But still it’s such a cool simple trick, it is worth mentioning.

What was/is your solution?

updates: