We recently upgraded our 8-year-old legacy FinTech application from Ruby on Rails version 3.2 to 4.2. Here are the key lessons learned and challenges we encountered during the process.
From 3.2 to 4.0
The upgrade process was already underway when I joined the company and took over. I continued with a few improvements, following this simplified workflow:
- Listen to podcasts about how others did it.
- Go through the 4.0 Release notes.
- Go through the official upgrade guide.
- Go through the unofficial Fast Ruby upgrade guide.
- Make all tests green.
- General test on staging.
- Stress test on staging for a night.
- Deploy to one production server and gradually route more web traffic to it.
- Deploy to the remaining servers, leaving the queue system and the background jobs on the old version.
- Enable the new version for the queue system and the background jobs.
- Disco
Organizing the work
To manage the upgrade process effectively, I divided it into three branches:
- pre_rails4
- rails4/master
- post_rails4
We were fortunate enough to release half of the required changes before deploying Rails 4 itself. This allowed us to release changes frequently in small increments, making debugging easier when issues arose. The pre_rails4 branch served this purpose. Changes that required Rails 4 stability were held in the post_rails4 branch. The remaining changes were kept in the rails4/master branch, with frequent rebasing.
This branching strategy was not a good idea after all. I would recommend reaching for dual booting instead.
Unexpected challenges
These minor changes were not advertised in any upgrade guide, and yet they rewarded me with some quality debugging time.
Rails 4 enables strict mode for MySQL by default
This means MySQL will no longer truncate your data silently and automatically but instead raise a loud exception. Fortunately we spotted this issue on staging before deploying to production.
More details about the problem and a hotfix can be found here.
Thanks to Moomaka for the clarification.
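One way to live with strict mode is to make the length decision in application code instead of leaving it to the database. The sketch below is not the hotfix mentioned above; both helper names are hypothetical, and the column limit is something you would look up for your own schema.

```ruby
# Hypothetical helper: truncate explicitly, so the data loss that MySQL used to
# perform silently becomes a visible decision in the code.
def fit_to_column(value, limit)
  return value if value.nil? || value.length <= limit

  value[0, limit]
end

# Hypothetical stricter variant: refuse over-long values up front, which
# mirrors what strict mode does at the database level.
def ensure_fits!(value, limit)
  if value && value.length > limit
    raise ArgumentError, "value has #{value.length} characters, column allows #{limit}"
  end

  value
end
```

Which of the two fits depends on the column: truncation may be acceptable for a free-text note, but probably not for an identifier.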
Ruby’s Logger is not extended by Rails 4
The following code won’t set the datetime_format of the MyLogFormatter instance:
# config/application.rb
config.logger.formatter = MyLogFormatter.new
config.logger.datetime_format = '...'
Solution:
# config/application.rb
config.log_formatter = MyLogFormatter.new
config.log_formatter.datetime_format = '...'
Tip: To verify whether Rails overrides the datetime_format method:
Rails.logger.method(:datetime_format).source_location
# with Rails 3 it returns
=> ["...activesupport/lib/active_support/core_ext/logger.rb", 65]
# with Rails 4 it returns
=> ["...ruby/lib/logger.rb", 285]
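The same behaviour can be reproduced with Ruby's standard Logger alone, without Rails. In this minimal sketch the output layout of MyLogFormatter and the format strings are assumptions made for illustration.

```ruby
require 'logger'
require 'stringio'

# Custom formatter, named after the one in the post. The output layout is an
# assumption made for this sketch.
class MyLogFormatter < Logger::Formatter
  DEFAULT_FORMAT = '%Y-%m-%d %H:%M:%S'

  def call(severity, time, _progname, msg)
    "#{time.strftime(datetime_format || DEFAULT_FORMAT)} #{severity} #{msg}\n"
  end
end

output = StringIO.new
logger = Logger.new(output)
formatter = MyLogFormatter.new
logger.formatter = formatter

# Plain Ruby forwards this setter to the logger's built-in default formatter,
# so the custom formatter never sees the value.
logger.datetime_format = '%H:%M'
format_after_logger_setter = formatter.datetime_format # still nil

# Setting the format on the formatter itself is what takes effect.
formatter.datetime_format = '%Y'
logger.info('hello')
```

After this runs, the line written to output starts with the four-digit year, which shows that only the value set on the formatter was used.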
Database connection leaks in threads
The first stress test on staging triggered a lot of ActiveRecord::ConnectionTimeoutError errors and the application stopped working completely.
It turned out that any database command running in a thread checks out a database connection from the pool and neglects to put it back. This is true for both Rails versions. The difference is that Rails 3.2 calls the clear_stale_cached_connections! method when a checkout finds no available connection, which reclaims connections still held by threads that are no longer alive. Rails 4 doesn’t, so leaked connections pile up until the pool is exhausted.
The solution was to call ActiveRecord::Base.clear_active_connections! manually or use ActiveRecord::Base.connection_pool.with_connection with a block.
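Thread.new do
  begin
    # Any query inside the thread checks out a connection from the pool.
    User.last
  ensure
    # Return the connection to the pool once the thread is done with it.
    ActiveRecord::Base.clear_active_connections!
  end
end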
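To see why a leaked checkout is fatal and why the block form helps, here is a toy pool in plain Ruby. It is a stand-in written for this post, not ActiveRecord's implementation; the ToyPool class and its method names are assumptions.

```ruby
# Toy connection pool for illustration only; not how ActiveRecord works inside.
class ToyPool
  class ConnectionTimeoutError < StandardError; end

  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Hands out a connection, or fails right away when none is left.
  def checkout
    @available.pop(true) # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    raise ConnectionTimeoutError, 'all connections are checked out'
  end

  def checkin(conn)
    @available << conn
  end

  # Block form: the connection goes back even if the block raises.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  def free
    @available.size
  end
end

# Leaky threads: each one checks out a connection and never returns it.
leaky_pool = ToyPool.new(2)
2.times.map { Thread.new { leaky_pool.checkout } }.each(&:join)

# Well-behaved threads: the block form returns the connection every time.
healthy_pool = ToyPool.new(2)
5.times { Thread.new { healthy_pool.with_connection { |conn| conn } }.join }
```

The leaky pool ends up with no free connections after just two threads, while the healthy pool still has both after five.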
Lessons learned
Keeping the dependencies up-to-date
Unfortunately, this had been badly neglected: nothing had been upgraded for over two years. As a result, upgrading to Rails 4 required upgrading a large number of gems. We organized them into groups based on their risk level and deployed them group by group. This allowed us to catch any issues before upgrading Rails itself.
Later, I instituted a policy to upgrade the dependencies every Monday morning.
Waiting for management to allocate time officially
Initially, I did everything by the book, meaning I worked on the upgrade only when I had a ticket for it in the sprint. This was a big mistake, because such tickets were rare. As many of you have surely experienced, management tends to be less interested in non-functional requirements than in functional ones, such as new features. This approach led to a significant delay, stretching a four-week task to over a year.
This taught me a valuable lesson: working on the non-functional requirements is part of our job as engineers. And we don’t ask permission to do our job. Management has enough problems without listening to our ramblings about technical debt. They hire us to run the business smoothly on the technical level.
After this revolutionary realization, I influenced the engineering culture to include the non-functional requirements in the estimation process. This way we paid off a significant amount of technical debt without bothering management.
With this approach (and a few extra technical ones), we upgraded Rails from 4.2 to 6.0 in a month without interrupting the feature development.
From 4.0 to 4.2
Leveraging the experience gained from the previous upgrade, the transition from 4.0 to 4.2 was relatively smooth and quick.
From 4.2 to 6.0
I introduced dual boot to the application. And, my God, it was a game changer!
We didn’t need any long-running branches. We could release changes frequently in small batches, imitating trunk-based development.
We hooked the new version into our CI. Once all tests were green on the new version, we made passing them mandatory on both versions.
I put the new version behind a feature flag and turned it on for staging early, which gave us free testing while we developed other features.
Then I turned it on for only one production server without any DevOps help.
Then for the queue system and for the background jobs.
This upgrade was completed within a month, and surprisingly, management didn’t even notice that we spent time on it.
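For readers who have not seen dual booting, the core of it is a Gemfile that can resolve to either Rails version. The sketch below shows one possible shape and is not necessarily how we wired it up: the RAILS_NEXT variable name, the helper, and the exact version constraints are assumptions.

```ruby
# Hypothetical Gemfile helper: choose the Rails constraint from an environment
# variable. The variable name and version strings are assumptions.
def rails_requirement(env = ENV)
  env['RAILS_NEXT'] == '1' ? '~> 6.0.0' : '~> 4.2.0'
end

# In a Gemfile this would be used as:
#   gem 'rails', rails_requirement
```

Each Rails version also needs its own resolved set of dependencies, so a setup like this usually comes with a second lockfile. That part is left out here.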
Was it worth it?
Absolutely! The cost of NOT upgrading can outweigh the cost of upgrading in various ways:
- Security issues. What would we tell our insurance company if we got hacked?
- Bugs.
- Implementing and maintaining features ourselves that newer versions already provide.
- Outdated versions are less attractive to top talent in the hiring process.
- Etc…
Happy upgrading!
Currently doing this same thing now (which makes this the third Rails 3 => 4 upgrade I've had to do). I found your rebellion story interesting and relatable ;) I wasn't aware of the thread issue so thanks for pointing that out
I'm happy to hear that more people are upgrading. I think it is important! Regarding rebelling, we have tuned our workflow since the first Rails upgrade. Now we include the cleanup / tech debt part in the time estimations. Against all my expectations this works like a charm, and now the engineers are just as happy as the business people :D As for the thread issue, I discovered it only because I ran some stress tests, which are really valuable, as we can see.