PullMonkey Blog

02 May

Riff – ActiveRecord diff plugin


The other day I went looking for a quick way to get a diff between two revisions of an ActiveRecord record.
This led me to the Riff plugin for Ruby on Rails.
The plugin has two uses:

  1. object1.diff?(object2) #=> true | false
  2. object1.diff(object2) #=> hash of differences

Here is an example:


    >> requests = Request.find(:all)
    >> request1 = requests[0]
    => #<Request:0x2a9c93f360 @attributes={.....}>
    >> request2 = requests[1]
    => #<Request:0x2a9c93f310 @attributes={.....}>  
    >> request1.diff?(request2)
    => true
    >> request1.diff(request2)
    => {:last_email_alert_at=>[Wed Apr 11 15:36:03 MDT 2007, nil], 
           :description=>["hello", "goodbye"],
           :due_date=>[#<Date: 4908383/2,0,2299161>, 
                       #<Date: 4908421/2,0,2299161>], 
           :summary=>["Do this for me.....", "Test"], 
           :created_at=>[Mon Apr 09 10:58:24 MDT 2007, 
                         Mon Apr 09 11:08:17 MDT 2007], 
           :last_email_alert_by=>[2, nil], 
           :version=>[30, 1], 
           :updated_at=>[Wed Apr 11 15:36:03 MDT 2007, 
                         Mon Apr 09 11:08:17 MDT 2007]}

Yah, so that is cool, but with a little more time on my hands I decided to figure out what is going on behind the scenes.
Here is how you might first try to get a diff between two hashes. Unfortunately, as you will see later, this does not work so well.
First, create the hashes:

  >> r1 = requests[0].attributes
  => {...}
  >> r2 = requests[1].attributes
  => {...}

Next, show that the records contain the same keys:

  >> r1.keys - r2.keys
  => [] 
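
One caveat: `r1.keys - r2.keys` only shows that every key of r1 is also present in r2, not the reverse. For two records of the same model that is enough, but a two-way check is easy to write. A minimal sketch, using made-up hashes in place of the real attributes:

```ruby
# Hypothetical stand-ins for requests[0].attributes and requests[1].attributes.
r1 = { "id" => 3, "summary" => "Do this for me....." }
r2 = { "id" => 4, "summary" => "Test", "extra" => true }

# One-way check: passes even though r2 carries a key that r1 lacks.
one_way = (r1.keys - r2.keys).empty?

# Two-way check: compares the full key sets, so the extra key is caught.
same_keys = r1.keys.sort == r2.keys.sort

puts one_way    # true
puts same_keys  # false
```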

Let's get the diffs, one for first - second and one for second - first:

  >> diff_2_1 = r2.values - r1.values
  => [Mon Apr 09 11:08:17 MDT 2007, 4, "Test", "goodbye", 
    #<Date: 4908421/2,0,2299161>, Mon Apr 09 11:08:17 MDT 2007]
  >> diff_1_2 = r1.values - r2.values
  => [Wed Apr 11 15:36:03 MDT 2007, Wed Apr 11 15:36:03 MDT 2007, 
    3, 30, "Do this for me.....", "Hello", #<Date: 4908383/2,0,2299161>, 
    Mon Apr 09 10:58:24 MDT 2007]

This would be perfect if the results of the hash diffs were the same size:

  >> diff_1_2.size
  => 8
  >> diff_2_1.size
  => 6
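
The mismatch can be reproduced on a small scale. The hashes below are invented for illustration (they are not the real Request attributes); the point is that Array#- removes every element that occurs anywhere in the other array, regardless of which key it came from:

```ruby
# Made-up attribute hashes; only the shape matters here.
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_at" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_at" => nil }

# r1's nil (closed_at) is dropped because nil occurs in r2's values.
diff_1_2 = r1.values - r2.values

# Both of r2's nils are dropped, including alert_by, which really differs.
diff_2_1 = r2.values - r1.values

p diff_1_2  # [3, 30, 2]
p diff_2_1  # [4, 1]
```

The changed "alert_by" value disappears from the second result, which is why the two arrays cannot be lined up index by index.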

Well, this means we can't assume the indexes match. So here is how we really need to get the hash differences:

  >> ra1 = r1.to_a  #=> converted to an array [[key,value],[key2,value2]...]
  >> ra2 = r2.to_a  #=> converted to an array 
  >> diff_a_1_2 = ra1 - ra2 
  => [["updated_at", Wed Apr 11 15:36:03 MDT 2007], 
    ["last_email_alert_at", Wed Apr 11 15:36:03 MDT 2007], ["id", 3], 
    ["version", 30], ["summary", "Do this for me....."], 
    ["last_email_alert_by", 2], ["description", "Hello"], 
    ["due_date", #<Date: 4908383/2,0,2299161>], 
    ["created_at", Mon Apr 09 10:58:24 MDT 2007]]
  >> diff_a_2_1 = ra2 - ra1
  => [["updated_at", Mon Apr 09 11:08:17 MDT 2007], 
    ["last_email_alert_at", nil], ["id", 4], ["version", 1], 
    ["summary", "Test"], ["last_email_alert_by", nil], 
    ["description", "goodbye"], 
    ["due_date", #<Date: 4908421/2,0,2299161>], 
    ["created_at", Mon Apr 09 11:08:17 MDT 2007]]
  >> diff_a_2_1.size
  => 9
  >> diff_a_1_2.size
  => 9

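Well, that's odd: there appear to be nine differences, whereas before we saw at most eight. The gap comes from how Array#- works: it removes every element that appears anywhere in the other array, so a value such as nil that also occurs under a different key in the other record was silently dropped from the values-only diffs. Comparing [key, value] pairs avoids that, because a pair is only removed when both the key and the value match.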
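
A small sketch of the pair-based subtraction, again with invented hashes rather than the real Request attributes:

```ruby
# Made-up attribute hashes; only the shape matters here.
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_at" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_at" => nil }

# Hash#to_a yields [key, value] pairs, so a nil only cancels out
# when it sits under the same key on both sides.
pairs_1_2 = r1.to_a - r2.to_a
pairs_2_1 = r2.to_a - r1.to_a

p pairs_1_2  # [["id", 3], ["version", 30], ["alert_by", 2]]
p pairs_2_1  # [["id", 4], ["version", 1], ["alert_by", nil]]
```

Both results now have the same size, and the ["alert_by", nil] pair survives.
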
Now we create the diff hash, to have results like riff:

  >> diff = {}
  => {}
  >> diff_a_1_2.each_with_index do |d,index|
  ?> diff[d[0]] = [d[1],diff_a_2_1[index][1]]
  >> end  #=> populates the diff
  >> diff
  => {"updated_at"=>[Wed Apr 11 15:36:03 MDT 2007, 
                     Mon Apr 09 11:08:17 MDT 2007], 
    "last_email_alert_at"=>[Wed Apr 11 15:36:03 MDT 2007, nil], 
    "id"=>[3, 4], 
    "description"=>["Hello", "goodbye"], 
    "last_email_alert_by"=>[2, nil], 
    "summary"=>["Do this for me.....",   "Test"], 
    "version"=>[30, 1], 
    "due_date"=>[#<Date: 4908383/2,0,2299161>, 
                 #<Date: 4908421/2,0,2299161>], 
    "created_at"=>[Mon Apr 09 10:58:24 MDT 2007, 
                   Mon Apr 09 11:08:17 MDT 2007]}
  >> diff.size
  => 9 

Now if you compare the results above with the results I showed for riff, they are almost identical. Riff's output uses symbol keys instead of strings, and it takes the extra step of ignoring the difference in id, since that is implied.
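
The index-based loop above assumes both pair arrays come out in the same key order. A key-based version avoids that assumption and can skip id in the same pass. This is a minimal sketch, not riff's actual implementation; the method name `attribute_diff` and its `ignore` parameter are hypothetical, and the hashes are invented:

```ruby
# Hypothetical helper, NOT riff's real code: builds {key => [a_value, b_value]}
# for every key whose values differ, skipping the keys listed in `ignore`.
def attribute_diff(a, b, ignore = ["id"])
  diff = {}
  (a.keys | b.keys).each do |key|
    next if ignore.include?(key)
    diff[key] = [a[key], b[key]] unless a[key] == b[key]
  end
  diff
end

# Made-up attribute hashes standing in for two records' attributes.
r1 = { "id" => 3, "version" => 30, "alert_by" => 2,   "closed_at" => nil }
r2 = { "id" => 4, "version" => 1,  "alert_by" => nil, "closed_at" => nil }

diff = attribute_diff(r1, r2)
p diff          # {"version"=>[30, 1], "alert_by"=>[2, nil]}
p !diff.empty?  # true, the equivalent of a diff? check
```

Looking values up by key means the result does not depend on hash ordering, and a key present on only one side shows up with nil on the other.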


Filed under: development, Home, rails
