Riff - ActiveRecord diff plugin
The other day I went looking for a quick way to get a diff between two revisions of an ActiveRecord record.
This led to the Riff plugin for Ruby on Rails.
There are two uses of this plugin:
- object1.diff?(object2) #=> true | false
- object1.diff(object2) #=> hash of differences
Here is an example:
>> requests = Request.find(:all)
>> request1 = requests[0]
=> #<Request:0x2a9c93f360 @attributes={.....}>
>> request2 = requests[1]
=> #<Request:0x2a9c93f310 @attributes={.....}>
>> request1.diff?(request2)
=> true
>> request1.diff(request2)
=> {:last_email_alert_at=>[Wed Apr 11 15:36:03 MDT 2007, nil],
:description=>["hello", "goodbye"],
:due_date=>[#<Date: 4908383/2,0,2299161>,
#<Date: 4908421/2,0,2299161>],
:summary=>["Do this for me.....", "Test"],
:created_at=>[Mon Apr 09 10:58:24 MDT 2007,
Mon Apr 09 11:08:17 MDT 2007],
:last_email_alert_by=>[2, nil],
:version=>[30, 1],
:updated_at=>[Wed Apr 11 15:36:03 MDT 2007,
Mon Apr 09 11:08:17 MDT 2007]}
Yeah, so that is cool, but with a little more time on my hands I decided to figure out what is going on behind the scenes.
Here is how you would normally get a diff between hashes. Unfortunately, as you will see later, this does not work so well.
First, create the hashes:
>> r1 = requests[0].attributes
=> {...}
>> r2 = requests[1].attributes
=> {...}
Next, show that the records contain the same keys:
>> r1.keys - r2.keys
=> []
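One caveat: `r1.keys - r2.keys` returning `[]` only shows that r1 has no keys missing from r2, not the other way around. Here is a minimal sketch with plain hashes that checks both directions. The sample data and the `same_keys?` helper are made up for illustration and are not part of Riff.

```ruby
# Hypothetical helper: true only when both hashes have exactly the same keys.
def same_keys?(a, b)
  (a.keys - b.keys).empty? && (b.keys - a.keys).empty?
end

# Made-up sample data; r2 carries one extra key.
r1 = { "id" => 3, "summary" => "Do this for me....." }
r2 = { "id" => 4, "summary" => "Test", "extra" => true }

one_way   = r1.keys - r2.keys   # empty, even though r2 has an extra key
both_ways = same_keys?(r1, r2)  # the two-way check notices the extra key
```

For two records of the same model the keys will normally match, so the one-way check is usually enough in practice.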
Let's get the diffs, one for first - second and one for second - first:
>> diff_2_1 = r2.values - r1.values
=> [Mon Apr 09 11:08:17 MDT 2007, 4, "Test", "goodbye",
#<Date: 4908421/2,0,2299161>, Mon Apr 09 11:08:17 MDT 2007]
>> diff_1_2 = r1.values - r2.values
=> [Wed Apr 11 15:36:03 MDT 2007, Wed Apr 11 15:36:03 MDT 2007, 3, 30,
"Do this for me.....", "Hello", #<Date: 4908383/2,0,2299161>,
Mon Apr 09 10:58:24 MDT 2007]
This would be perfect if the results of the hash diffs were the same size:
>> diff_1_2.size
=> 8
>> diff_2_1.size
=> 6
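To see why the two sizes can differ, here is a small self-contained sketch using plain hashes instead of real records. The attribute names and values are invented for illustration.

```ruby
# Made-up sample records as plain hashes (not real ActiveRecord attributes).
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_by" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_by" => nil }

# Array#- removes every element that appears anywhere in the other array,
# so a value is dropped even when it belongs to a different key.
diff_1_2 = r1.values - r2.values
diff_2_1 = r2.values - r1.values

# Three attributes differ (id, version, alert_by), but the nil for
# "alert_by" in r2 is dropped because r1 has a nil under "closed_by".
sizes = [diff_1_2.size, diff_2_1.size]
```

Once the values are separated from their keys, there is no way to tell which attribute a surviving value came from.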
Well, this means we can't assume the indexes match. So here is how we really need to get the hash differences:
>> ra1 = r1.to_a #=> converted to an array [[key,value],[key2,value2]...]
>> ra2 = r2.to_a #=> converted to an array
>> diff_a_1_2 = ra1 - ra2
=> [["updated_at", Wed Apr 11 15:36:03 MDT 2007],
["last_email_alert_at", Wed Apr 11 15:36:03 MDT 2007],
["id", 3], ["version", 30],
["summary", "Do this for me....."],
["last_email_alert_by", 2], ["description", "Hello"],
["due_date", #<Date: 4908383/2,0,2299161>],
["created_at", Mon Apr 09 10:58:24 MDT 2007]]
>> diff_a_2_1 = ra2 - ra1
=> [["updated_at", Mon Apr 09 11:08:17 MDT 2007],
["last_email_alert_at", nil],
["id", 4], ["version", 1],
["summary", "Test"],
["last_email_alert_by", nil], ["description", "goodbye"],
["due_date", #<Date: 4908421/2,0,2299161>],
["created_at", Mon Apr 09 11:08:17 MDT 2007]]
>> diff_a_2_1.size
=> 9
>> diff_a_1_2.size
=> 9
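Well, that's odd: there appear to be nine differences, whereas before we saw a maximum of eight. The difference in size is explained by the values missing from the prior example, such as the nils: subtracting one array of values from another drops every value that appears anywhere in the other array, whichever key it belonged to.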
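The same effect can be reproduced without a database. This sketch uses invented plain hashes and shows that subtracting [key, value] pairs keeps each value tied to its key, so both directions report the same number of differences.

```ruby
# Made-up sample records as plain hashes (not real ActiveRecord attributes).
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_by" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_by" => nil }

# Hash#to_a yields [[key, value], ...]; a pair is only removed when the
# other record has the same value under the same key.
pairs_1_2 = r1.to_a - r2.to_a
pairs_2_1 = r2.to_a - r1.to_a

same_size = pairs_1_2.size == pairs_2_1.size
```

Here `["alert_by", nil]` survives in `pairs_2_1`, because r1 has no matching pair, even though r1 does contain a nil under another key.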
Now we create the diff hash, to have results like riff:
>> diff = {}
=> {}
>> diff_a_1_2.each_with_index do |d,index|
?> diff[d[0]] = [d[1],diff_a_2_1[index][1]]
>> end #=> populates the diff
>> diff
=> {"updated_at"=>[Wed Apr 11 15:36:03 MDT 2007,
Mon Apr 09 11:08:17 MDT 2007],
"last_email_alert_at"=>[Wed Apr 11 15:36:03 MDT 2007, nil],
"id"=>[3, 4],
"description"=>["hello", "goodbye"],
"last_email_alert_by"=>[2, nil],
"summary"=>["Do this for me.....", "Test"],
"version"=>[30, 1],
"due_date"=>[#<Date: 4908383/2,0,2299161>,
#<Date: 4908421/2,0,2299161>],
"created_at"=>[Mon Apr 09 10:58:24 MDT 2007,
Mon Apr 09 11:08:17 MDT 2007]}
>> diff.size
=> 9
Now if you compare the results above with the results I showed for riff, they are almost identical. Riff just takes the extra step of ignoring the difference between the ids, since that is implied.
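Putting the pieces together, a riff-like helper for plain hashes could look roughly like the sketch below. This is not Riff's actual implementation: the `hash_diff` and `hash_diff?` names, the `ignore` parameter, and the sample data are all assumptions made for illustration.

```ruby
# Hypothetical helper, not Riff's real code: returns
# { key => [value_in_a, value_in_b] } for every key whose values differ,
# skipping the keys listed in `ignore`.
def hash_diff(a, b, ignore = ["id"])
  (a.keys | b.keys).each_with_object({}) do |key, diff|
    next if ignore.include?(key)
    diff[key] = [a[key], b[key]] unless a[key] == b[key]
  end
end

# Hypothetical companion: true when anything other than the ignored keys differs.
def hash_diff?(a, b, ignore = ["id"])
  !hash_diff(a, b, ignore).empty?
end

# Made-up sample records as plain hashes.
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_by" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_by" => nil }

result  = hash_diff(r1, r2)
changed = hash_diff?(r1, r2)
```

Because this version looks each value up by key, it does not rely on the two arrays of differences listing their keys in the same order, which the index-based loop earlier assumes.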
