{"id":15,"date":"2007-05-02T17:55:00","date_gmt":"2007-05-02T17:55:00","guid":{"rendered":"\/2007\/09\/18\/riff-activerecord-diff-plugin"},"modified":"2007-05-02T17:55:00","modified_gmt":"2007-05-02T17:55:00","slug":"riff-activerecord-diff-plugin","status":"publish","type":"post","link":"http:\/\/pullmonkey.com\/2007\/05\/02\/riff-activerecord-diff-plugin\/","title":{"rendered":"Riff – ActiveRecord diff plugin"},"content":{"rendered":"

The other day I went looking for a quick way to get a diff between revisions of a record for ActiveRecord.
\nThis led to the Riff plugin for Ruby on Rails.
\nThere are two uses of this plugin:<\/p>\n

    \n
  1. object1.diff?(object2) #=> true | false<\/li>\n
  2. object1.diff(object2) #=> hash of differences<\/li>\n<\/ol>\n

    Here is an example:<\/p>\n\n\n
<pre>
>> requests = Request.find(:all)
>> request1 = requests[0]
=> #<Request:0x2a9c93f360 @attributes={.....}>
>> request2 = requests[1]
=> #<Request:0x2a9c93f310 @attributes={.....}>
>> request1.diff?(request2)
=> true
>> request1.diff(request2)
=> {:last_email_alert_at=>[Wed Apr 11 15:36:03 MDT 2007, nil],
    :description=>["hello", "goodbye"],
    :due_date=>[#<Date: 4908383\/2,0,2299161>,
                #<Date: 4908421\/2,0,2299161>],
    :summary=>["Do this for me.....", "Test"],
    :created_at=>[Mon Apr 09 10:58:24 MDT 2007,
                  Mon Apr 09 11:08:17 MDT 2007],
    :last_email_alert_by=>[2, nil],
    :version=>[30, 1],
    :updated_at=>[Wed Apr 11 15:36:03 MDT 2007,
                  Mon Apr 09 11:08:17 MDT 2007]}
<\/pre>\n

    Yah, so that is cool, but with a little more time on my hands I decided to figure out what is going on behind the scenes.
    \nHere is how you would normally try to get a diff between two hashes. Unfortunately, as you will see later, this does not work so well.
    \nFirst, create the hashes:<\/p>\n\n\n
<pre>
>> r1 = requests[0].attributes
=> {...}
>> r2 = requests[1].attributes
=> {...}
<\/pre>\n

    Next, show that the records contain the same keys:<\/p>\n\n\n
<pre>
>> r1.keys - r2.keys
=> []
<\/pre>\n
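One caveat on the check above: `r1.keys - r2.keys` only shows that r1 has no keys that r2 lacks. A minimal sketch of a check in both directions, using two made-up hashes (not the Request records from this post):

```ruby
# Hypothetical attribute hashes; b has one key that a does not.
a = { "id" => 3, "summary" => "Do this" }
b = { "id" => 4, "summary" => "Test", "extra" => true }

# One direction only: comes back empty even though the key sets differ.
only_in_a = a.keys - b.keys

# The other direction reveals the extra key.
only_in_b = b.keys - a.keys

# The key sets match only if both differences are empty.
same_keys = only_in_a.empty? && only_in_b.empty?

p only_in_a  # keys present in a but not in b
p only_in_b  # keys present in b but not in a
p same_keys
```

For two records of the same model the key sets should match anyway, so the one-directional check is usually enough there.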

    Let's get the diffs, one for first - second and one for second - first:<\/p>\n\n\n
<pre>
>> diff_2_1 = r2.values - r1.values
=> [Mon Apr 09 11:08:17 MDT 2007, 4, "Test", "goodbye",
    #<Date: 4908421\/2,0,2299161>, Mon Apr 09 11:08:17 MDT 2007]
>> diff_1_2 = r1.values - r2.values
=> [Wed Apr 11 15:36:03 MDT 2007, Wed Apr 11 15:36:03 MDT 2007,
    3, 30, "Do this for me.....", "Hello", #<Date: 4908383\/2,0,2299161>,
    Mon Apr 09 10:58:24 MDT 2007]
<\/pre>\n

    This would be perfect if the results of the hash diffs were the same size:<\/p>\n\n\n
<pre>
>> diff_1_2.size
=> 8
>> diff_2_1.size
=> 6
<\/pre>\n
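The size mismatch can be reproduced without a database. The sketch below uses two made-up hashes (the keys and values are assumptions chosen for illustration) in which some values happen to appear under different keys, which is enough to throw off a values-only subtraction:

```ruby
# Hypothetical attribute hashes. Four attributes differ: id, version,
# owner_id and alert_by. Only "note" is the same in both.
r1 = { "id" => 3, "version" => 30, "owner_id" => 1, "alert_by" => 2,   "note" => nil }
r2 = { "id" => 4, "version" => 1,  "owner_id" => 2, "alert_by" => nil, "note" => nil }

# Array#- removes every element that occurs anywhere in the other array,
# so 1, 2 and nil vanish even though they belong to differing attributes.
diff_1_2 = r1.values - r2.values
diff_2_1 = r2.values - r1.values

p diff_1_2
p diff_2_1
p [diff_1_2.size, diff_2_1.size]
```

Neither result has four entries, and the two results do not even have the same length, so pairing them up by position would match unrelated attributes.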

    Well, this means we can't assume the indexes match, so pairing the two arrays by position would compare unrelated attributes. Here is how we really need to get the hash differences:<\/p>\n\n\n
<pre>
>> ra1 = r1.to_a  #=> converted to an array [[key,value],[key2,value2]...]
>> ra2 = r2.to_a  #=> converted to an array
>> diff_a_1_2 = ra1 - ra2
=> [["updated_at", Wed Apr 11 15:36:03 MDT 2007],
    ["last_email_alert_at", Wed Apr 11 15:36:03 MDT 2007], ["id", 3],
    ["version", 30], ["summary", "Do this for me....."],
    ["last_email_alert_by", 2], ["description", "Hello"],
    ["due_date", #<Date: 4908383\/2,0,2299161>],
    ["created_at", Mon Apr 09 10:58:24 MDT 2007]]
>> diff_a_2_1 = ra2 - ra1
=> [["updated_at", Mon Apr 09 11:08:17 MDT 2007],
    ["last_email_alert_at", nil], ["id", 4], ["version", 1],
    ["summary", "Test"], ["last_email_alert_by", nil],
    ["description", "goodbye"],
    ["due_date", #<Date: 4908421\/2,0,2299161>],
    ["created_at", Mon Apr 09 11:08:17 MDT 2007]]
>> diff_a_2_1.size
=> 9
>> diff_a_1_2.size
=> 9
<\/pre>\n
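To see in isolation why subtracting [key, value] pairs behaves better than subtracting bare values, here is a small sketch with made-up hashes (the attribute names are assumptions, not the real Request columns). Because each pair carries its key, a value that merely coincides with one under a different key no longer cancels out:

```ruby
# Hypothetical attribute hashes with the same keys in the same order.
r1 = { "id" => 3, "version" => 30, "owner_id" => 1, "alert_by" => 2,   "note" => nil }
r2 = { "id" => 4, "version" => 1,  "owner_id" => 2, "alert_by" => nil, "note" => nil }

# Hash#to_a yields [[key, value], ...]; only pairs identical in both
# key and value are removed by the subtraction.
pairs_1_2 = r1.to_a - r2.to_a
pairs_2_1 = r2.to_a - r1.to_a

p pairs_1_2
p pairs_2_1

# Both results list the same keys in the same order, so they line up by index.
p pairs_1_2.map(&:first) == pairs_2_1.map(&:first)
```

The alignment still depends on both hashes listing their keys in the same order, which holds for this example but is worth verifying for other inputs.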

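    Well, that's odd: there appear to be nine differences, whereas before we saw a maximum of eight. The difference in size comes from the earlier example subtracting bare values: Array subtraction removes any value that also appears anywhere in the other array, so the nils (and values such as 1 and 2 that presumably also occur under another attribute) were dropped even though their attributes differ.
    \nNow we create the diff hash, to have results like riff:<\/p>\n\n\n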
<pre>
>> diff = {}
=> {}
>> diff_a_1_2.each_with_index do |d,index|
?>   diff[d[0]] = [d[1],diff_a_2_1[index][1]]
>> end  #=> populates the diff
>> diff
=> {"updated_at"=>[Wed Apr 11 15:36:03 MDT 2007,
                   Mon Apr 09 11:08:17 MDT 2007],
    "last_email_alert_at"=>[Wed Apr 11 15:36:03 MDT 2007, nil],
    "id"=>[3, 4],
    "description"=>["hello", "goodbye"],
    "last_email_alert_by"=>[2, nil],
    "summary"=>["Do this for me.....", "Test"],
    "version"=>[30, 1],
    "due_date"=>[#<Date: 4908383\/2,0,2299161>,
                 #<Date: 4908421\/2,0,2299161>],
    "created_at"=>[Mon Apr 09 10:58:24 MDT 2007,
                   Mon Apr 09 11:08:17 MDT 2007]}
>> diff.size
=> 9
<\/pre>\n
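The index pairing above only works while both arrays list the same keys in the same order. A key-based version avoids that assumption. This is a minimal sketch, not Riff's actual source: `hash_diff` and its `ignore` parameter are names made up for illustration, and the sample hashes are invented.

```ruby
# Hypothetical helper (not part of Riff): compare two attribute hashes
# key by key and return { key => [value_in_a, value_in_b] } for each
# key whose values differ. Keys listed in `ignore` are skipped.
def hash_diff(a, b, ignore: ["id"])
  keys = (a.keys | b.keys) - ignore
  keys.each_with_object({}) do |key, diff|
    diff[key] = [a[key], b[key]] unless a[key] == b[key]
  end
end

# Made-up records; note that the key order differs between the two.
r1 = { "id" => 3, "version" => 30, "description" => "Hello", "last_email_alert_by" => 2 }
r2 = { "last_email_alert_by" => nil, "description" => "goodbye", "version" => 30, "id" => 4 }

result = hash_diff(r1, r2)
p result
```

Looking values up by key means ordering no longer matters, and skipping "id" mirrors the behaviour observed in Riff's output. One limitation of this sketch is that a missing key and a key whose value is nil are treated the same.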

    If you compare the results above with the results I showed for riff, they are almost identical. Riff returns symbol keys rather than strings, and it takes the extra step of ignoring the difference in id, since that is implied.<\/p>\n","protected":false},"excerpt":{"rendered":"

    The other day I went looking for a quick way to get a diff between revisions of a record for ActiveRecord. This led to the Riff plugin for Ruby on Rails. There are two uses of this plugin: object1.diff?(object2) #=> true | false object2.diff(object2) #=> hash of differences Here is an example: 1 2 3 […]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[6,3,5],"tags":[],"_links":{"self":[{"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/posts\/15"}],"collection":[{"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/comments?post=15"}],"version-history":[{"count":0,"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/posts\/15\/revisions"}],"wp:attachment":[{"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/media?parent=15"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/categories?post=15"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/pullmonkey.com\/wp-json\/wp\/v2\/tags?post=15"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}