No Comments »

Hello there,

it’s late in Bremen, so I will keep this thing as short as possible: I just wasted the complete day’s work at Tansporter due to a bad preparation… dumb me! (Remember: Benchmark before any optimization!)

I tried to optimize some of our ActiveRecord find statements, installed a whole new column on a table solely for that reason, changed the model to fill this new column right and then changed the actual method I would like to spice up.

It was a 70 line monster of very ugly code. After refactoring it, there were just 40 lines left and it was very cool, clean and beautiful code. The main change: I replaced a find statement which returned a very huuuuge result set, which I had to iterate and do some computing on every element, with 4 very cool queries with just one result, each.

Sounds cool, but unfortunately it isn’t.

After I had a short “wow, that’s great stuff!” experience, I thought it might not be unclever to benchmark the new method and compare it to the old one.
And here comes the trouble: The old method finished in about 0.018 seconds, the new one needed somewhat between 0.54 and 0.4 seconds… What a pitty!

So I switched back to the ugly version, to my rescue the added column could be used elsewhere so my effort was not completely useless.

What I’ve learned: One query with a large result set apparently performs MUCH better than 4 (not absolute trivial) queries with a very small result set, even if you have to iterate through the whole big result set.

Anyway I stumbled upon two cool, small things today that are worth mentioning them:

1st: Get a random “thing” out of the database.

Until today we managed this with a method at the particular model looking somehow like this:

1
2
3
4
5
6
7
8
9
10
  def self.random_good
    good = nil
    while (!good)
      begin
        good = Good.find(rand(Good.maximum(:id)))
        rescue ActiveRecord::RecordNotFound => e
      end
    end
    good
  end

Today I received my copy of The Rails Way and one of the first things I found was the native, MySQL “ORDER BY RAND()” function. Pretty cool. Because database portability doesn’t really matter for us, the same function as above now looks like this:

1
2
3
  def self.random_good
    Good.find(:first, :order => 'RAND()')
  end

2nd: Don’t forget about the power of ActiveRecord’s :include option!

The :include option of a find statement is very powerful! I read about this maybe a year ago, never needed it again and forgot about it. Today I rediscovered the veeeeery nice nesting capability:

1
2
  tickets = Ticket.find(:all, :include => [:user,
      {:operator => {:open_tickets => :user}}])

I can’t really believe it, but after this find statement you can do this:

1
  tickets.first.operator.open_tickets.last.user.nickname

…without ANY extra query! It’s all joined together in the one find query and still runs at a reasonable pace.

Ok, now it’s really time for me to go to bed.

Stay tuned,

Thorben
FEtMab-Team

No Comments »

While it is always good to have attr_accessible or attr_protected set in a ActiveRecord model class to protect it from abuse when updating any object of that class from the params of a Rails controller.

But what to do when you have that attr_accessible but you won’t let a certain action in your controller even edit all of those accessible fields. Maybe because you’re paranoid or just a think that an action should only be allowed to edit what they was invented for, there might be some good reasons.

We faced this problem lately and came up with that solution:

1
2
3
4
5
6
7
8
9
10
11
12
class ActiveRecord::Base
  def update_by_params(params, *fields)
    self.attributes = params.reject
        {|key,value| !fields.index key.to_sym}
    self.save
  end
  def update_by_params!(params, *fields)
    self.attributes = params.reject
        {|key,value| !fields.index key.to_sym}
    self.save!
  end
end

The update_by_params instance method on the ActiveRecord::Base class let you do something like this:

1
2
3
4
5
6
# ... in a controller ...
def change_password
  @current_user.update_by_params(params[:user],
      :password, :password_confirmation)
  redirect_to frontpage_url
end

And then the user won’t be able to edit his address with a hacked password change form, even if he is allowed to do so in his profile.

The method will save the object after the update. To handle errors on saving there are (as usual in Rails) two specifications of this method available, one with a ! at the end which just uses save! to save the object and one without the ! which uses save. So update_by_params! will throw an error if the save fails while update_by_params just returns false.

Have fun with that.

Yours,

Thorben
FEtMab-Team

No Comments »

As you are reading this blog, I assume you love to play this web development game as much as we do! But blinded by this love we forget too often about a very important fact: This business is a highly concurrent one and we should care about this!

98% of all tasks we cope with in our every day coding we could just neglect this concurrency and treat our controllers like there are very loooonely guys. But what about that other 2 percent? What about these racing conditions which could happen with no need for AJAX or other complicated stuff.

Let’s stay simple: Imagine your customer asks you to implement a stupid counter which counts the views of ads. You could do this:

1
2
3
4
5
  # let's assume you always want to edit this first ad ;)
  ad = Ad.find(:first)
 
  ad.counter += 1
  ad.save!

But wait a minute! What happens if this occurs: Customer A and customer B gets served with the same ad in nearly the same moment:

  • The ad counter is at 1234
  • Customer A loads the ad
  • Customer B loads the ad
  • Customer A counts the counter up to 1235
  • Customer B counts the counter up to 1235
  • Customer A saves the ad, new count in the db: 1235
  • Customer B saves the ad, new count in the db: 1235

Damn, we just missed a view and got a classic race condition. Ok, let’s forget about all these: It’s not important if we do not count every 10.000th view … Let’s assume it is.

What to do? You could wrap this with a transaction like:

1
2
3
4
5
6
7
  Ad.transaction do
    # let's assume you allways wants to edit this first ad ;)
    ad = Ad.find(:first)
 
    ad.counter += 1
    ad.save!
  end

But what for? As long as you not run your DBMS with an isolation level of ’serializable’ (and I hope you don’t!) there comes no advantage from the use of a transaction. Customer A and B would both open a transaction, read the line, do the update and write it back. There is nothing the transaction could complain about, it’s still atomic, the data is consistent after the commit, the transaction itself is isolated (the race condition did not affect the fulfillment of the transaction! Both transaction can fulfill their work without any technical hassle) and last but not least the commited data is durable. So the only thing you get is a nice, fully working transaction, the race condition is still there.

So what is the solution? Simple: Locking. As I stated before you could (DON’T DO THAT) use an isolation level of ’serializable’. But why would this work? Because is serializes any transaction so that they are executed one by one… Not a performance booster you may think… And you are right! In addition to that this isolation level is tending to produce errors because many DBMS do not really serialize the transactions etc etc etc… Let’s stop talking about this seldom useful stuff and carry on.

In Rails (and in Hibernate and in nearly any other OR mapper I think) there are two ways of doing locks like this explained before in a more ’sensitive’ manner. Pessimistic Locking and Optimistic Locking.

I’m not a database professional, so please be patient when I’m saying something stupid, but I think the main difference between these both is that an optimistic lock is handled by the software you code (or the framework you use of course), while the pessimistic one is handled by the DBMS you use.

So how could you solve the above problem in Rails? It’s as simple as you might expect it to be: Just add an integer column named ‘lock_version’ to the Ads table, initialize it to 0 for any row and then when your are ready try this:

(I will write down the two customer example as it would be concurrently executed, I’ll hope this does not cause any confusion)

1
2
3
4
5
6
7
8
  ad1 = Ad.find(:first)#Customer A
  ad2 = Ad.find(:first)#Customer B
  ad1.counter += 1 #Customer A
  ad2.counter += 1 #Customer B
  ad1.save! #Customer A
 
  # The line below should raise an ActiveRecord::StaleObjectError
  ad2.save! #Customer B

What rails does is simple: It counts up the lock_version field of an row on any update. If there is an update call it checks if the lock_version field is still the same as in the fetched row and if not, it raises the error. Then it’s your part to handle it.

Ok, you just got an Optimistic Locking alá Rails. Congratulations.

The other way is to use Pessimistic Locking and let the DBMS do all the work. This is how you could do that in Rails:

1
2
3
4
5
  Ad.transaction do
    ad = Ad.find(:first, <strong>:lock => true</strong>)
    ad.counter += 1
    ad.save!
  end

Now there are no raised exceptions when you run into a race condition, because you never run into one. The DBMS (hopefully) will read-lock the selected rows as soon as you are selecting them (Rails uses the SQL-92 ‘SELECT … FOR UPDATE’ statement for that and MySQL’s InnoDB 5, Postgres and all other major players do support such a row wise locking. There might be some DBMS that lock the whole table instead of a few rows (like MySQL did in previous versions) which would be a hard performance impact as you could imagine. So read the manual of the DBMS of your choice for that issue) and any other select, which would like to select one of the locked rows has to wait until the transaction that encapsulates the ‘FOR UPDATE’ statement is commited (that’s the reason why we need a transaction for the Pessimistic Locking).

You could easily play with these locking a little bit if you open up two terminals, connect to your DB, begin a transaction in terminal 1, then do a SELECT … FOR UPDATE statement in terminal 1 and try to do the same SELECT statement (with or without FOR UPDATE) in terminal 2. You will see, that terminal 2 did not return from it’s ‘waiting state’ until you do not close the transaction in terminal 1.

Ok, I know that many of you just know about that since ages but my intention for this post is not that I would like to write something no one has done ever before! A few days ago I had to think about this issue again and realized how easy it is to forget about it. So if you have not forgot it and knew everything: perfect. If not this hopefully is a tiny reminder.

Cheers,

Thorben
FEtMab-Team