21 Apr
Found a need for this information while answering questions on railsforum.
So, let's say that we want to use two databases and let's even say that we want to use an Oracle database and a MySQL database. How can this be done? To start, we must decide which database will be our default database. In this scenario, I chose MySQL. Let's see what that looks like:
1
2
3
4
5
6
7
|
# in your database.yml file
development:
adapter: mysql
username: root
password:
database: example_development
|
So we have all seen that before. Now all of our active record models will use this mysql connection.
But, I need to use data from an oracle database, so let's setup that connection:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
# in your database.yml file
development:
adapter: mysql
username: root
password:
database: example_development
oracle_development:
adapter: oracle
username: root
password:
database: example_oracle_development
|
Neat, we have two development connections. Now we have to tell which models to use which connection. Well, actually, we just need to tell the oracle models to connect to oracle_development, all the other models will default to development. I have a model named user, and it's records are kept in an oracle database. Our user model looks like this initially:
1
2
3
4
|
#RAILSROOT/app/models/user.rb
class User < ActiveRecord::Base
end
|
The line we need to add is this:
1
2
|
establish_connection :oracle_development
|
Even better, we can make this dynamic, so when we are in the test or production environment, we don't need to change the establish_connection line.
1
2
3
|
# use RAILS_ENV where RAILS_ENV is generally development, test or production
establish_connection "oracle_#{RAILS_ENV}"
|
So the final product could look something like this:
database.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
dev_basics: &dev_basics
username: root
password:
<% %w(development test production).each do |env| %>
<%= env %>:
<<: *dev_basics
adapter: mysql
database: example_<%= env %>
oracle_<%= env %>:
<<: *dev_basics
adapter: oracle
database: example_oracle_<%= env %>
<% end %>
|
Oracle model example
1
2
3
4
|
class User < ActiveRecord::Base
establish_connection "oracle_#{RAILS_ENV}"
end
|
That should be it.
05 Apr
Ok, great news. I have fixed a fairly large bug, Even better news, you probably never noticed it unless you were trying to output graphs via javascript, I.e., using graph.set_output_type("js").
Thanks go to Brandon, who provided an example of using javascript as the output type:
View Source Code
So if this was affecting you, then you will want to get the latest version of the plugin - Instructions.
02 Apr
After my last article about intersection and arrays, I got to thinking how this could apply elsewhere.
I decided to see if I could write a faster include?() method for Array using the intersection operator. So here is what I did:
1) Tested via irb (of course)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
|
[pullmonkey]$ irb
# build the array for conceptual testing
irb(main):003:0> a = [1,2,3]
=> [1, 2, 3]
# see how include?() does it
irb(main):004:0> a.include?(2)
=> true
irb(main):005:0> a.include?(4)
=> false
# see what could be done with intersect
irb(main):006:0> a & [2]
=> [2]
irb(main):007:0> (a & [2]).empty?
=> false
# this should return true, so negate it
irb(main):008:0> !(a & [2]).empty?
=> true
irb(main):009:0> !(a & [4]).empty?
=> false
|
2) Extended the Array class with the faster_include?() method:
1
2
3
4
5
6
|
class Array
def faster_include?(n)
!(self & [n]).empty?
end
end
|
3) Wrote a performance test comparing elapsed time for true and false results:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
# size of the array
x = 4000000
# build the array
a = (1..x).to_a
p "Finding #{x/2} ..."
t1 = Time.now
p "include?() - returns true: #{a.include?(x/2)}"
t2 = Time.now
p "faster_include?() - returns true: #{a.faster_include?(x/2)}"
t3 = Time.now
p "include?() - returns false: #{a.include?(-50)}"
t4 = Time.now
p "faster_include?() - returns false: #{a.faster_include?(-50)}"
t5 = Time.now
p "Elapsed time for include?() returning true: #{t2 - t1}."
p "Elapsed time for faster_include?() returning true: #{t3 - t2}."
p "Elapsed time for include?() returning false: #{t4 - t3}."
p "Elapsed time for faster_include?() returning false: #{t5 - t4}."
|
4) Just to see if there was a difference I got results and here they are:
Attempt 1:
1
2
3
4
5
6
7
8
9
10
|
"Finding 2000000 ..."
"include?() - returns true: true"
"faster_include?() - returns true: true"
"include?() - returns false: false"
"faster_include?() - returns false: false"
"Elapsed time for include?() returning true: 0.308243."
"Elapsed time for faster_include?() returning true: 0.271465."
"Elapsed time for include?() returning false: 0.629375."
"Elapsed time for faster_include?() returning false: 0.242673."
|
Attempt 2:
1
2
3
4
5
6
7
8
9
10
|
"Finding 2000000 ..."
"include?() - returns true: true"
"faster_include?() - returns true: true"
"include?() - returns false: false"
"faster_include?() - returns false: false"
"Elapsed time for include?() returning true: 0.323652."
"Elapsed time for faster_include?() returning true: 0.272212."
"Elapsed time for include?() returning false: 0.594751."
"Elapsed time for faster_include?() returning false: 0.277882."
|
Note that faster_include?() is consistent in elapsed time for a result of true and a result of false. However, include?() takes much longer in both attempts to return false than it does to return true, well actually about twice as long. include?() goes through each element, and I am using the midway point of the array in all tests, so it makes sense that include?() would return true in half the time it would take to return false. Very interesting, so now that I see performance improvement with faster_include?(), I need to do a wider spread of tests and for more attempts:
5) Better testing:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
|
results = {}
n = 5
# size of the array
[1,50,500,5000,5000000].each do |x|
a = (1..x).to_a
n.times do |t|
# build the array
t1 = Time.now
a.include?(x/2)
t2 = Time.now
a.faster_include?(x/2)
t3 = Time.now
a.include?(-50)
t4 = Time.now
a.faster_include?(-50)
t5 = Time.now
results[x] ||= {}
results[x]["true"] ||= []
results[x]["false"] ||= []
results[x]["true"] << (t2 - t1) - (t3 - t2)
results[x]["false"] << (t4 - t3) - (t5 - t4)
end
detailed_results = results
results[x] = {"true" => detailed_results[x]["true"].sum / n, "false" => detailed_results[x]["false"].sum / n}
end
pp results
|
My new results:
The results are based on 5 attempts for each array for both true and false for a total of 50 attempts.
1
2
3
4
5
6
|
{5000000=>{"true"=>0.0921126, "false"=>0.4374644},
5000=>{"true"=>5.84e-05, "false"=>0.0004404},
500=>{"true"=>1.4e-06, "false"=>4.16e-05},
50=>{"true"=>-3.4e-06, "false"=>1.0e-06},
1=>{"true"=>-4.8e-06, "false"=>-3.8e-06}}
|
Note that arrays with 50 items or less seem to perform better using include?(), but barely. On the other hand with an array of 5M items, you shave off half a second using faster_include?().
So there you have it, faster_include?() can keep its name 🙂