Riff – ActiveRecord diff plugin
The other day I went looking for a quick way to get a diff between revisions of a record for ActiveRecord.
This led to the Riff plugin for Ruby on Rails.
There are two uses of this plugin:
- object1.diff?(object2) #=> true | false
- object1.diff(object2) #=> hash of differences
Here is an example:
    >> requests = Request.find(:all)
    >> request1 = requests[0]
    => #<Request:0x2a9c93f360 @attributes={.....}>
    >> request2 = requests[1]
    => #<Request:0x2a9c93f310 @attributes={.....}>
    >> request1.diff?(request2)
    => true
    >> request1.diff(request2)
    => {:last_email_alert_at=>[Wed Apr 11 15:36:03 MDT 2007, nil],
        :description=>["hello", "goodbye"],
        :due_date=>[#<Date: 4908383/2,0,2299161>, #<Date: 4908421/2,0,2299161>],
        :summary=>["Do this for me.....", "Test"],
        :created_at=>[Mon Apr 09 10:58:24 MDT 2007, Mon Apr 09 11:08:17 MDT 2007],
        :last_email_alert_by=>[2, nil],
        :version=>[30, 1],
        :updated_at=>[Wed Apr 11 15:36:03 MDT 2007, Mon Apr 09 11:08:17 MDT 2007]}
Yah, so that is cool, but with a little more time on my hands I decided to figure out what is going on behind the scenes.
Here is how you might normally try to get a diff between two hashes. Unfortunately, as you will see, it does not work so well.
First, create the hashes:
    >> r1 = requests[0].attributes
    => {...}
    >> r2 = requests[1].attributes
    => {...}
Next, show that the records contain the same keys:
    >> r1.keys - r2.keys
    => []
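One caveat: an empty `r1.keys - r2.keys` only shows that every key of r1 is also a key of r2. For two records of the same model that is enough, but in general the check has to go both ways. A minimal sketch, using small made-up hashes as stand-ins for the `attributes` hashes (the key names and values here are assumptions for illustration):

```ruby
# Hypothetical stand-ins for the attributes hashes; not the real records.
r1 = { "id" => 3, "summary" => "Do this for me" }
r2 = { "id" => 4, "summary" => "Test", "extra" => true }

# One direction only: every key of r1 is in r2, so this comes back empty...
one_way = r1.keys - r2.keys

# ...even though r2 has a key that r1 lacks. Comparing the sorted key
# lists catches that case as well.
same_keys = r1.keys.sort == r2.keys.sort

puts one_way.inspect
puts same_keys
```

Here the one-way subtraction is empty while the two-way comparison reports that the key sets differ.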
Let's get the diffs, one for first - second and one for second - first:
    >> diff_2_1 = r2.values - r1.values
    => [Mon Apr 09 11:08:17 MDT 2007, 4, "Test", "goodbye",
        #<Date: 4908421/2,0,2299161>, Mon Apr 09 11:08:17 MDT 2007]
    >> diff_1_2 = r1.values - r2.values
    => [Wed Apr 11 15:36:03 MDT 2007, Wed Apr 11 15:36:03 MDT 2007, 3, 30,
        "Do this for me.....", "Hello", #<Date: 4908383/2,0,2299161>,
        Mon Apr 09 10:58:24 MDT 2007]
This would be perfect if the two results were the same size, but they are not:
    >> diff_1_2.size
    => 8
    >> diff_2_1.size
    => 6
Well, this means we can't assume the indexes match. So here is how we really need to get the hash differences:
    >> ra1 = r1.to_a #=> converted to an array [[key,value],[key2,value2]...]
    >> ra2 = r2.to_a #=> converted to an array
    >> diff_a_1_2 = ra1 - ra2
    => [["updated_at", Wed Apr 11 15:36:03 MDT 2007],
        ["last_email_alert_at", Wed Apr 11 15:36:03 MDT 2007],
        ["id", 3], ["version", 30], ["summary", "Do this for me....."],
        ["last_email_alert_by", 2], ["description", "Hello"],
        ["due_date", #<Date: 4908383/2,0,2299161>],
        ["created_at", Mon Apr 09 10:58:24 MDT 2007]]
    >> diff_a_2_1 = ra2 - ra1
    => [["updated_at", Mon Apr 09 11:08:17 MDT 2007],
        ["last_email_alert_at", nil],
        ["id", 4], ["version", 1], ["summary", "Test"],
        ["last_email_alert_by", nil], ["description", "goodbye"],
        ["due_date", #<Date: 4908421/2,0,2299161>],
        ["created_at", Mon Apr 09 11:08:17 MDT 2007]]
    >> diff_a_2_1.size
    => 9
    >> diff_a_1_2.size
    => 9
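Well, that's odd: there appear to be nine differences, whereas before we saw a maximum of eight. The gap comes from how Array#- behaves on bare values. It drops any value that also appears anywhere in the other array, no matter which key it belonged to, which is how the nils went missing in the prior example. With [key, value] pairs, an entry is only dropped when both the key and the value match.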
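The effect is easy to reproduce without a database. The sketch below uses two small made-up hashes (the keys and values are assumptions chosen to trigger the problem, not the real Request attributes) and compares the value-based subtraction with the pair-based one:

```ruby
# Made-up attribute hashes, chosen so that some values in one hash
# also occur under a different key in the other hash.
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "owner_id" => nil, "priority" => 1 }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "owner_id" => 2,   "priority" => 1 }

# Array#- removes every element that appears anywhere in the other
# array, so 2, nil and 1 vanish even where the keys actually differ.
values_1_2 = r1.values - r2.values
values_2_1 = r2.values - r1.values

# Subtracting [key, value] pairs only removes exact key-and-value matches.
pairs_1_2 = r1.to_a - r2.to_a
pairs_2_1 = r2.to_a - r1.to_a

puts values_1_2.size
puts values_2_1.size
puts pairs_1_2.size
puts pairs_2_1.size
```

The two value-based results come out with different sizes, while the two pair-based results have one entry per differing key and therefore the same size.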
Now we create the diff hash, to have results like riff:
    >> diff = {}
    => {}
    >> diff_a_1_2.each_with_index do |d,index|
    ?>   diff[d[0]] = [d[1],diff_a_2_1[index][1]]
    >> end #=> populates the diff
    >> diff
    => {"updated_at"=>[Wed Apr 11 15:36:03 MDT 2007, Mon Apr 09 11:08:17 MDT 2007],
        "last_email_alert_at"=>[Wed Apr 11 15:36:03 MDT 2007, nil],
        "id"=>[3, 4],
        "description"=>["Hello", "goodbye"],
        "last_email_alert_by"=>[2, nil],
        "summary"=>["Do this for me.....", "Test"],
        "version"=>[30, 1],
        "due_date"=>[#<Date: 4908383/2,0,2299161>, #<Date: 4908421/2,0,2299161>],
        "created_at"=>[Mon Apr 09 10:58:24 MDT 2007, Mon Apr 09 11:08:17 MDT 2007]}
    >> diff.size
    => 9
Now if you compare the results above with the results I showed for Riff, they are almost identical. Riff just takes the extra step of ignoring the difference in id, since that is implied, and its keys are symbols where the attributes hash uses strings.
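For comparison, here is a minimal sketch of the same idea done by key lookup instead of by array index, which avoids relying on both pair arrays coming back in the same order. This is not Riff's actual code: `attribute_diff`, `attributes_differ?` and the `ignore` parameter are hypothetical names, and the hashes are made-up stand-ins for the `attributes` hashes.

```ruby
# Hypothetical helper, not Riff's implementation: build a hash of
# key => [value_in_a, value_in_b] for every key whose values differ,
# skipping any key listed in `ignore`.
def attribute_diff(a, b, ignore = ["id"])
  keys = (a.keys | b.keys) - ignore
  keys.each_with_object({}) do |key, diff|
    diff[key] = [a[key], b[key]] unless a[key] == b[key]
  end
end

# Hypothetical counterpart to diff?: true when any non-ignored key differs.
def attributes_differ?(a, b, ignore = ["id"])
  !attribute_diff(a, b, ignore).empty?
end

# Made-up stand-ins for two records' attributes.
r1 = { "id" => 3, "summary" => "Do this for me.....", "version" => 30, "last_email_alert_by" => 2 }
r2 = { "id" => 4, "summary" => "Test", "version" => 1, "last_email_alert_by" => nil }

p attribute_diff(r1, r2)
p attributes_differ?(r1, r2)
p attributes_differ?(r1, r1)
```

Because each value is looked up by its key, a nil on one side is reported like any other difference, and id is left out of the result.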