Difference between Python and Ruby when it comes to hashes with default values
Having worked with Python for a while, I am trying to pick up Ruby, especially for some of my work with logstash. While trying out a small program in Ruby, I got stumped by a peculiar trait of Ruby hashes with default values. It cost me an hour of my life that I am not going to get back. :(
In Python, dictionaries with default values are not part of the core language; you need to import defaultdict from the collections module in the standard library. For some reason I still don’t know, you cannot set a simple default value; you need to supply a function object that creates the value for you. :/
So if you want to create a default dictionary in Python, you need to do something like this:
>>> from collections import defaultdict
>>> d = defaultdict(lambda: 5)
Now when you access any non-existent key in this dictionary, it gets magically populated with that key and the default value. See below.
>>> if d["one"]: print("not empty")
...
not empty
>>>
>>> print(dict(d))
{'one': 5}
>>>
Ruby hashes support default values in the core language. So you can do something like:
>> d = Hash.new(5)
=> {}
>> puts d["a"]
5
=> nil
>> d
=> {}
Wait! See the difference? Evaluating a non-existent key in Ruby returns the default value but, unlike Python, it doesn’t set the value!
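If you do want Python's store-on-access behaviour in Ruby, one option is the block form of Hash.new: the block receives the hash and the missing key, and can assign into the hash itself. Here is a minimal sketch; the variable names are mine, purely for illustration.

```ruby
# Plain default value: returned for missing keys, but never stored.
plain = Hash.new(5)
plain_value = plain["a"]
p plain_value        # 5
p plain.key?("a")    # false

# Block form: the block is called with the hash and the missing key.
# Assigning inside the block stores the default, much like defaultdict.
storing = Hash.new { |hash, key| hash[key] = 5 }
storing_value = storing["a"]
p storing_value      # 5
p storing.key?("a")  # true
```

Note that the block only runs for keys that are missing, so values you set explicitly are left alone.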
After I drilled down to this quirk, I looked around and found some interesting articles.
This article from 2008 by David Black was particularly interesting. He points out in this article how this Hash behaviour breaks a very popular Ruby idiom.
Normally, the idiomatic Ruby way to initialize or return a variable goes like this:
>> d = Hash.new
=> {}
>> d["name"] ||= "Skye"
=> "Skye"
>> d
=> {"name"=>"Skye"}
However, if you use a Hash with a default value, it breaks this idiom, because evaluating a non-existent key returns the default value. You would intuitively expect the ||= operator to at least set the missing key to the default value! But that doesn’t happen, because of the peculiar way Ruby treats that operator. Ruby evaluates A ||= B NOT as A = A || B, but as A || A = B. So whenever the default value is truthy, the assignment never runs and the key is never set in a default hash.
>> d = Hash.new("May")
=> {}
>> d["name"] ||= "Skye"
=> "May"
>> d
=> {}
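To watch the A || A = B expansion happen, and to show one way around it, here is a small sketch. RecordingHash is a hypothetical helper of mine that logs every call to []=; the key? guard at the end is just one possible workaround, not the only one.

```ruby
# Hypothetical helper: a Hash that records every key written via []=.
class RecordingHash < Hash
  attr_reader :writes

  def initialize(*args, &block)
    super
    @writes = []
  end

  def []=(key, value)
    @writes << key
    super
  end
end

d = RecordingHash.new("May")

# d["name"] returns the truthy default "May", so []= is never called.
result = d["name"] ||= "Skye"
p result            # "May"
p d.writes          # []
p d.key?("name")    # false

# Workaround: test for the key itself, not the truthiness of its value.
d["name"] = "Skye" unless d.key?("name")
p d["name"]         # "Skye"
p d.writes          # ["name"]
```

With a nil or false default the left side is falsy, so ||= does assign; the surprise only shows up with truthy defaults.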
Some gotcha lurking there in an otherwise beautiful language.
Another nice ref: http://www.rubyinside.com/what-rubys-double-pipe-or-equals-really-does-5488.html