Measure Ruby on Rails Optimizations

by Alan Da Costa

ruby ruby-on-rails

Recently, a client asked me to look into the source of slow loading pages for a Ruby on Rails 2.x application. After identifying the database as the culprit, mostly due to mixed I/O and fragmentation, a plan for optimization was devised.

How do we know if we’re optimizing? Simple, we collect measurements and compare change. If we do a good job, our measurements improve for whatever metric is important to us.

Introducing request-log-analyzer

request-log-analyzer is a really nifty log analyzing tool that will give us a plethora of metrics from various log types, including Ruby on Rails logs.

For Rails, it’ll give us information such as: number of requests, hourly request distribution, page load times, page database times, and much more.

Establishing our metric before change

Before we make a change, what we want to do, is pick an important metric. For our case, looking at the 95th percentile of requests for Database time - by sum, makes sense. request-log-analyzer gives us this value in a range form, like 3ms-27ms.

Selecting logs for analysis

I compared like traffic days for the change, one week apart. Our first full day of collecting log data after the change was Monday, so I picked the previous Monday to ensure a similar traffic pattern. Total requests were also similar.

I’ve removed some columns from the output so we can focus on the 95th percentile. Here’s a sample from the old log (~140K requests).

Database time - by sum                                  ┃ 95 %tile
AssessmentsController#save.html [POST]                  ┃ 2ms-1.15s
AssessmentsController#practice.html [POST]              ┃ 0ms-1.06s
AssessmentsController#save_all.html [POST]              ┃ 22ms-4.31s
SessionsController#portal.html [GET]                    ┃ 0ms-71ms
AssessmentsController#task2.html [POST]                 ┃ 3ms-2.83s
AssessmentsController#rr_practice.html [POST]           ┃ 2ms-2.02s
AssessmentsController#speed_screening.html [POST]       ┃ 3ms-2.70s
SessionsController#create.html [POST]                   ┃ 1ms-5.14s
AssessmentsController#intro.html [POST]                 ┃ 3ms-2.05s
ManagementController#evaluee_data_grid.html [GET]       ┃ 1ms-3.02s

Notice the wide ranges, like 1ms-5.14s? That’s not good for the end user.

Applying change

Ideally, you want to measure one change at a time. In the real world, because of available time windows and difficulties in gathering metrics, that might not be possible. At any rate, keep the amount of changes you apply as small as possible.

Comparing change

How did we do? Here’s a look at our log with one full day of data after changes have been applied (~145K requests).

Database time - by sum                                   ┃ 95 %tile
AssessmentsController#save.html [POST]                   ┃ 3ms-27ms
AssessmentsController#task2.html [POST]                  ┃ 3ms-150ms
AssessmentsController#save_all.html [POST]               ┃ 26ms-1.08s
EvalueesController#registration.html [GET]               ┃ 1ms-6ms
SessionsController#portal.html [GET]                     ┃ 0ms-10ms
AssessmentsController#practice.html [POST]               ┃ 0ms-17ms
AssessmentsController#speed_screening.html [POST]        ┃ 3ms-29ms
AssessmentsController#intro.html [POST]                  ┃ 3ms-21ms
SessionsController#create.html [POST]                    ┃ 1ms-1.01s
AssessmentsController#survey.html [GET]                  ┃ 306ms-465ms

Look at those comparatively narrow ranges! The controller actions aren’t exactly the same between the two request-log-analyzer samples, but the numbers paint a clear picture. The 95th percentile results show significant improvement.