Rookout transforms Flytrex's ability to debug in production
The Flytrex Debugging Challenge
Debugging in production without the right tools can be the stuff of nightmares. That’s exactly what Flytrex found out when they came across a bug in one of their drones while it was in flight.
Flytrex faces a few core challenges when it comes to debugging their aircraft. First, their technology is extremely complex. Second, they need to abide by stringent FAA regulations. And third, when a bug appears in production, it may be a bug in an airborne system.
Their drones are complex entities that combine software, hardware, embedded code, and operations. Before Rookout, Flytrex had tried a myriad of methods to debug them: reproducing bugs in local or staging environments, reproducing them with changes that added logging, or redeploying changes to production with the aim of collecting more information about the bug that needed to be fixed. All of these methods took a lot of time and, in most cases, proved to be nearly impossible. This was because a good number of Flytrex’s issues can only be resolved in the field, as that is where the relevant set of conditions occurs. Reproducing these bugs locally therefore wasn’t the right solution.
Flytrex also faced a unique challenge in being regulated by the FAA. Because the production system is regulated, FAA rules do not allow a company to deploy to production without approval; switching to a version that hasn’t yet been tested, verified, and certified by the FAA is not permitted. Flytrex therefore needed a solution that would let them find the source of a bug quickly, assess the situation, and either rectify it or inform the FAA about it as soon as possible.
Rookout In-Flight: An example
There’s no better way to illustrate how a live debugging tool works than when your code is running. Or in Flytrex’s case, when a drone is in flight.
Rookout gave us complete confidence in being able to handle any bugs we were faced with, especially in prod. This confidence was due to our ability to gain a full understanding of the bug we were encountering. We therefore knew how to work around it, as we understood the issue completely, and could conclude whether it was safe or not safe to fly in that particular situation, even if the actual fix came a few days later. We were able to quickly understand the source of the bug, thanks to Rookout.
During an internal test, with a drone up in the air, Flytrex found an issue with session counting. The issue was caused by a surprising race condition, a kind of bug that is very difficult to catch. The bug wasn’t even in Flytrex’s own code, but rather in a library it was using. Flytrex used Rookout to pinpoint the bug and, while the drone was in flight and their people were in the field, was able to provide a workaround and ultimately a solution.
When in a test, while our people are in the field, having a certain component or scenario not working is quite uncomfortable, to say the least. Rookout gave us the confidence that we needed in knowing that we understand what’s happening in our systems, that it’s something we can and will fix. The certainty that we know how to solve whatever issue we’re facing helps us both internally, and with the FAA.
Rookout is most effective in locating, pinpointing, and understanding bugs that happen in production.
Confidence to handle any bugs we face
Dev’s force multiplier
Instant production insight