The other day I saw a funky error in our error log: Error at Server
. Unfortunately this was it, no backtrace, no exception class, nothing just a timestamp. It proved to be difficult to identify the root of it without any more info. So I looked up the error message in the code base and found this little snippet:
rescue StandardError
Rollbar.error('Error at Server', extra_params: {...})
end
I know I know, this makes the hair stand up on the back of my neck. This error handler is swallowing generic exceptions and without re-raising them it can effect logic flow. Unfortunately this and its buddies have been in our code base for ages and refactoring is beyond the scope of this debugging mission.
Let’s take a deep breath, calm down and continue.
If you don’t know Rollbar, don’t worry, it is for error logging. So the problem with this part is, we don’t provide the exception object to the Rollbar call so it can’t get the backtrace, the exception class, the exception message, etc…
The fix seemed to be obvious and quick:
rescue StandardError => exception
# send the exception object along
Rollbar.error(exception, 'Error at Server', extra_params: {...})
end
But against my high expectations, this only added the exception class to the error message: EOFError: Error at Server
, but still no backtrace.
So all we knew is we had an end of file reached
error somewhere deep in the jungle. And naturally we failed to replicate it locally.
I thought, clearly we do something wrong. So I added some extra log to have a better insight:
Rails.logger.error("Exception: #{exception.backtrace.inspect} Caller: #{caller.inspect}")
The result of exception.backtrace.inspect
was surprisingly nil
. The caller
had the expected backtrace showing the origin of the exception was from a Net::HTTP::Post
call.
But how is it possible that an exception exists without backtrace. A little bit of research revealed the following Ruby source code:
# callers do not care about backtrace, so avoid allocating for it
raise EOFError, 'end of file reached', []
Providing an empty array as a 3rd argument to raise
makes the call stack magically disappear. Fortunately it has been changed in the 2.5 series of Ruby.
But until we can upgrade to that version, let’s reopen the class and monkey patch around.
# config/initializers/monkey_patches.rb
module Net
class BufferedIO
def rbuf_fill
case rv = @io.read_nonblock(BUFSIZE, exception: false)
when String
return @rbuf << rv
when :wait_readable
@io.to_io.wait_readable(@read_timeout) or raise Net::ReadTimeout
# continue looping
when :wait_writable
# OpenSSL::Buffering#read_nonblock may fail with IO::WaitWritable.
# http://www.openssl.org/support/faq.html#PROG10
@io.to_io.wait_writable(@read_timeout) or raise Net::ReadTimeout
# continue looping
when nil
# !change here: callers do care about backtrace
raise EOFError, 'end of file reached'
end while true
end
end
end
Final thoughts
We managed to attach exception name, message and backtrace to the mysterious Error at Server
error, but the root of it is still in the mist.
The story continues in the next article.
Comments
Post a Comment