Rookout's Dudi Cohen - How To Live In The Debugging Moment
Liran Haimovitch: Welcome to The Production-First Mindset, a podcast where we discuss the world of building code from the lab all the way to production. We explore the tactics, methodologies, and metrics used to drive real customer value by the engineering leaders actually doing it. I’m your host, Liran Haimovitch, CTO and Co-Founder of Rookout. Today, we have a very special guest for you. One of my favorite debug lovers, who just happens to be Rookout’s VP of R&D, Dudi Cohen. Thank you for joining us and welcome to the show.
Dudi Cohen: Thanks for inviting me.
Liran Haimovitch: So, Dudi can you tell us a little bit more about yourself – about who you are?
Dudi Cohen: Yeah, my name is Dudi. My real name is David, but everybody calls me Dudi. I’ve been managing Rookout’s R&D for about the past two and a half years. Prior to that, I worked for the government for 10 years, doing all sorts of fun things that I can’t tell you about. And prior to that, I worked for Cipia Vision, which used to be called Eyesight Mobile Technologies. I was one of the first employees. And prior to that, I worked at Intel Corporation. So in my career, I’ve had my fair share of working for startups, working for enterprises, and also working for the government, where I had the experience of being in a sort of startup that moved on to being a full-blown enterprise. I have a wife, a child, three dogs, one cat, and I think that’s pretty much it.
Liran Haimovitch: So today, you’re working at a startup. What are your favorite parts about that?
Dudi Cohen: I think my favorite part about working in a startup is the ability to do everything, or actually the requirement to do everything. Working for a startup, as opposed to an enterprise, pretty much has no boundaries: you don’t need to request permission to touch another piece of code, you don’t need to find out whether someone needs your help, you can just give your help. And you can also, of course, innovate, which is something that most enterprises are lacking. And of course, the people. When you recruit people to a startup, especially a small startup like Rookout, you can’t compromise. You can’t recruit someone and say, okay, he’s not that good in this area, not that good in that area, but he’s pretty good at what we’re looking for. Because of the small nature of startups, everybody needs to be able to do everything. And that means that if we want to change the language that we’re writing in, we can’t have a developer who says I only write Go or I only write Python; you need to be able to move people. You need everybody to be pretty much a jack of all trades.
Liran Haimovitch: So how have your hiring practices changed over the past couple of years?
Dudi Cohen: I’ve been involved in hiring software developers for, I think, the past 15 years. In the past, when I interviewed a developer, I mainly looked at what the developer’s expertise was and what he was best at, and asked a lot of technical questions on that specific expertise. Nowadays, I do look for experts, but I mainly look for potential: the potential of a developer to learn new things, whether from his co-workers or on his own, or even to go forward and look for something that no one has done before.
Liran Haimovitch: You mentioned potential, and today there is a lot of discussion within the industry about how many people want to join the industry, how many people want to become developers for the first time, and how hard it often is to get a first job. Now, on the one hand, you’re saying that you’re looking for potential. On the other hand, it can be pretty hard to prove that potential without any experience. So, how do you go about estimating potential, especially for junior hires?
Dudi Cohen: I think the first thing that I look for in a potential candidate is the candidate’s passion. When I meet with a candidate, he tells me about his projects and about what he did. Most candidates have some sort of experience, whether it’s from their university studies or from high school. If he talks about those projects without any passion, it usually means that he doesn’t have much potential. If someone has passion, it usually means that he learned things in addition to what he needed to learn for his project. He was curious, he did things not in the standard mainstream way. And usually, developers with passion look beyond the basics; they try to understand how exactly their code is working. For example, let’s say that I interview someone for a front-end position, developing JavaScript. I expect that developer to know in advance how JavaScript works: whether it’s single-threaded or multi-threaded, whether it’s compiled or not compiled. Those are basic things. I think that even if you’re a junior, if you’re not passionate and you don’t want to know how your code works, you don’t have a lot of potential. It is really important to understand how everything is working behind the scenes, and without passion, you won’t be able to move forward in the industry.
Liran Haimovitch: I think software engineering today is a form of craftsmanship. And like any craftsman, it’s important to care about your work, to have passion for it, to try to aim to be better at it, to aim to excel at it so that you can live up to your potential and grow.
Dudi Cohen: I definitely agree. I mean, a lot of developers nowadays, sort of copy-paste things, go to Stack Overflow, see something that works, paste it, try, it doesn’t work, they copy and paste something else. Usually, it works. But it has a very, very low glass ceiling, you can’t move forward without understanding what you’re doing. And even juniors, when you interview juniors, if they don’t understand what they’re doing, they can probably do a pretty good job and they can probably pass, you know, technical challenges for big Corp enterprises. But if they start working in a startup, they will fail pretty fast. Because in a startup, they won’t have the patience of their colleagues to teach them everything from the beginning. And pretty soon they will be required to be very independent and you can’t be independent if you don’t understand what you’re doing.
Liran Haimovitch: Let’s take a step back for a second. I mean, I’ve recorded over 25 episodes of this podcast by now and I haven’t had the chance to chat about Rookout. So, what is Rookout to you?
Dudi Cohen: Rookout to me, I think from the get-go, is about developing for developers. During many years of my career, I’ve been developing tools for developers in some way or another, whether it was in my first job at Intel, or even in the government. And Rookout gives me the possibility to develop really amazing stuff for developers. I think that’s something very unique about Rookout, the ability to develop for developers. As a developer myself, I use Rookout. And as someone who manages developers, and someone who also works with our product team, the ability to develop a product for developers is an amazing experience.
Liran Haimovitch: You’ve mentioned tools for developers. But what does Rookout actually do?
Dudi Cohen: About four years ago, Rookout decided that developers should stop working with sticks and stones when they’re debugging, or actually, when they’re trying to understand what their code is doing, whether it’s in production, staging, a dev environment, whatever, whether it’s running in the cloud or on their local machine. And the basic building block of that is understanding what data their code is processing. When you look at your software, your application is built from code. And code is pretty static: you understand what your code is doing, and usually you understand who built it and why it was built. What continuously changes is the data that the code is processing, and also the environment, the infrastructure on which the application is running. And we try to push forward a method or a theme which is called dynamic observability. Dynamic observability means that you don’t really need to plan ahead what you want to collect or what you want to observe. For example, I recently wanted to explain dynamic observability to a non-developer. And I remembered a scene from Seinfeld in which Jerry and George go on a trip to Los Angeles. George packs four or five suitcases for the weekend. Jerry asks him why he packed so many things, and George tells him, well, I happen to dress based on mood. And I feel that nowadays, a lot of software developers, or a lot of the software industry, dress according to their mood. They pack in advance.
Liran Haimovitch: Yeah, they pack everything, they pack logs, they pack metrics, they pack spans, they pack from, you know, various verbosity levels, they pack every endpoint, they pack every business logic, they pack everything.
Dudi Cohen: Yeah, exactly. They don’t think about the possibility of doing laundry on their vacation. They pack everything. And the result is that you try to plan everything ahead, you try to future-proof yourself in all sorts of ways. You log every line, and you make excuses to yourself. Maybe there’s a problem with this line, maybe there will be a problem with that line. Maybe someone else from DevOps or from infrastructure will someday finally look at that specific log, and I can’t allow myself not to collect it. Dynamic observability is all about packing light. You don’t need to think in advance about what you want to collect, which log line you want, which traces you need, or which latencies interest you. When there’s something interesting that you want to collect, you simply go to your dynamic observability application and tell it, hey, I want to collect data from that place in the application, I want to understand what the latency is there, I want to profile only this specific function, etc., etc. You don’t need to think of everything in advance.
Liran Haimovitch: And you can even go down to a specific line of code to see variable values, stack traces, everything that’s happened.
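Editor's aside: the idea of collecting variable values at a specific line without pausing the program can be illustrated with a small Python sketch. This is a toy built on the standard `sys.settrace` hook, not a description of how Rookout's agents work; the class name `NonBreakingBreakpoint` and the example function `price_with_tax` are hypothetical names chosen for illustration.

```python
import sys


class NonBreakingBreakpoint:
    """Toy collector: snapshots local variables when a chosen line is about to run."""

    def __init__(self, func, line_offset):
        # The target line is given relative to the function's `def` line.
        self.code = func.__code__
        self.lineno = func.__code__.co_firstlineno + line_offset
        self.snapshots = []

    def _global_trace(self, frame, event, arg):
        # Trace only frames that execute the target function.
        if event == "call" and frame.f_code is self.code:
            return self._local_trace
        return None

    def _local_trace(self, frame, event, arg):
        # A "line" event fires just before the line executes; copy the locals
        # and let the program keep running without pausing.
        if event == "line" and frame.f_lineno == self.lineno:
            self.snapshots.append(dict(frame.f_locals))
        return self._local_trace

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._global_trace)
        return self

    def __exit__(self, *exc_info):
        # Restore whatever trace hook was active before.
        sys.settrace(self._previous)
        return False


def price_with_tax(amount, rate):
    tax = amount * rate
    total = amount + tax
    return total


# Collect data at the `return` line (three lines below `def`) for one call only.
with NonBreakingBreakpoint(price_with_tax, line_offset=3) as bp:
    result = price_with_tax(100, 0.25)

print(result, bp.snapshots)
```

The function itself contains no logging; the decision about what to collect is made from outside, at the moment someone becomes curious about that line.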
Dudi Cohen: Yeah, exactly. And you don’t need to make your coding process all about thinking about the future. Nowadays, a lot of developers, junior and senior alike, need to think about where things will go wrong when they write code. They, of course, handle their errors properly. They try and they catch, but they also think to themselves, maybe something will go wrong here in this line. Okay, let me add a log line. Maybe something will go wrong in that line, so let me add another log line, whether it’s an info, a warning, an error, or whatever. But usually, you don’t really need all of those things.
Liran Haimovitch: Yeah, you should still pack, you should still add some logs. You don’t want to go naked for the weekend. But you don’t have to take everything. Take jeans, take a t-shirt, and figure out the rest along the way.
Dudi Cohen: Yeah, you don’t need to plan ahead for a fancy evening in a fancy restaurant if you don’t know whether you’re going to have one.
Liran Haimovitch: Yeah, exactly.
Dudi Cohen: And the really fun thing about Rookout is the ability to deliver these sorts of capabilities to our customers. As I mentioned, Rookout is all about developing for developers. So our day-to-day as an R&D team also includes meeting our customers. And our customers are just like us. Whether they are developers or DevOps personnel, we meet with them and help them understand how to better dynamically collect data from their applications.
Liran Haimovitch: Now, what are some of the challenges of making this dream a reality?
Dudi Cohen: Well, the challenges are pretty big. I mean, one of the major challenges that we have is gaining the trust of our customers, and we gain that trust by making sure that our product is secure. Security is one of our biggest concerns. As I mentioned, we allow our customers to collect data wherever and whenever they want. And that means that you can imagine a developer placing a breakpoint somewhere in their production environment, or...
Liran Haimovitch: A non-breaking breakpoint.
Dudi Cohen: –and yeah, and then a non-breaking breakpoint. And that breakpoint will be able, in theory, to collect credit card information, private health information, etc, etc. And we need to, first and foremost, give our customers the capability to limit which data they’re able to collect or to have the proper provisioning in their security systems, to audit and monitor who is using our dynamic tools. And of course, making sure that our customers know that their data is kept private, whether it’s on-premise on their own network, or whether it’s processed on Rookout’s back end.
Liran Haimovitch: We’ve recently added Golang. I mean, we’ve had dynamic languages for years. We’ve had Java, .NET, Python, Ruby, and Node. And now, all of a sudden, we’ve released support for our first native runtime, debugging Golang applications on the fly in production. How is Golang different from everything else?
Dudi Cohen: Apart from the fact that it’s a compiled language, there were several new challenges that we had to deal with. I think the first one is that Rookout is a more mature and a bigger company. That means that this is a language that we developed support for with a bigger team. And not only was it a bigger team, but we also had a lot of experience with Go development, not within the team that developed the support for the language, but more on the side of our backend team. That allowed us to build a really, really mature product, because we had the experience of using Go in our own backend. Now, putting all of this aside, the challenge was phenomenal. I mean, when you look at the technology that you need in order to change code, or to dynamically instrument a runtime or a language like Golang, I think it’s nearly impossible unless you start breaking down all of the required tasks into very, very small building blocks. When you look at a language like Golang, it’s very dependent on the platform you’re building on, it’s very dependent on the platform you’re running on, it’s very dependent on your build parameters, and it’s very dependent on all of the dependencies that you’re building with. And in addition to all of these, I know it’s not a very popular thing to say, but Golang, as opposed to other languages like Java or C#, is not yet as mature. It is still evolving, it is still changing very, very rapidly. Take, for example, Golang’s calling convention. Just recently, Golang added...
Liran Haimovitch: Fast call?
Dudi Cohen: — Yeah, fast call and a lot of other optimizations. And even in the area of package management, in Go models is very, very young. If you look at things like, I don’t know, even in node, as opposed to NPM as a package manager. And when we wanted to build all of these things in Go, we had to build a lot of building blocks, whether it’s our own native instrumentation library, and then we had to pretty much understand how all of the Go, runtime works. We didn’t find any alternative to dynamic instrumentation or dynamic code loading for Go. So we have to understand all of the bits and bytes of the Go runtime. And you can say, we’ve learned how to fool the Golang runtime in order to make the Golang runtime, run our own code that we’ve just created in real-time. So it was a real, real challenge. We are using it in production. Rookout’s backend, about three years ago, was built with Python, and until three years ago was built with Python, and we used Rookout’s Python agent to debug our own code. And about three years ago, we started moving to Golang. Because Golang is better than Python.
Liran Haimovitch: That’s crazy…
Dudi Cohen: Yeah, the really horrible thing about moving to Golang was not being able to use Rookout to debug our own backend. I mean, we do have microservices in JavaScript, in Node, and we have some internal microservices in Python, so we kept dogfooding our own code. But we weren’t able to debug our own full-blown backend with Rookout. And along the way, we had several bugs where, every time we encountered them, we asked ourselves, where is our Golang support? Why can’t we use our own tools? And once we added the Go language support, it was amazing. All of the bugs that we had lying around in our backlog were simply a non-breaking breakpoint away from being solved. The joy of using your own tool to find your own bugs is pretty much amazing. Sometimes it’s a bit mind-bending, when you place a breakpoint in your own code and that data is sent back to your own backend. And then you start trying to understand in which environment you placed your breakpoint and which environment is giving you the information from that breakpoint. But we have our own way of understanding how those things go.
Liran Haimovitch: So you’ve mentioned bugs. What’s the single bug that you remember the most?
Dudi Cohen: I try to hide away most of the bugs that I’ve solved in a very, very dark room and forget about them. In Rookout, you can say that we have two sorts of bugs. The standard bugs are in our own backend. And as I mentioned, we use Rookout to solve them, so they’re nothing much to write home about. The most challenging bugs are bugs in our own agents. I mentioned earlier that the most important thing for us in Rookout’s product is our customers’ data and our customers’ security. But the other thing that is most important to us is our customers’ production environment. Our bread and butter in Rookout is the fact that we don’t interfere with our customers’ applications. A bug in our own agent is something that might interfere with our customers’ production environment. So we have very, very high standards in our own CI/CD and our own code to make sure that we don’t interrupt our customers in any way. I mean, the worst-case scenario that we aim for when there is a bug is that Rookout will stop working, but our customers’ application will continue working. I think the most challenging bugs, maybe the most interesting bugs that I’ve had the pleasure of dealing with, are bugs in open-source runtimes. For example, when we wrote our debugger for Node, we decided to use some of the APIs that Node supplies, which eventually use some APIs that the V8 engine supplies. And finding bugs in V8 and finding bugs in the Node runtime is very, very challenging. It’s usually not only understanding whether we have a bug, but understanding that the runtime itself has a bug. Sometimes it’s a bug that was already fixed in a recent version, but our customers use a legacy version of Node. It’s very interesting, because usually when you go and look at these sorts of bugs, you really need to find a workaround for them. And a workaround for a buggy API in V8 or Node is usually one that requires you to be very, very creative.
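Editor's aside: the worst-case goal described above, where the agent stops working but the application keeps running, corresponds to a general fail-safe pattern. The Python sketch below illustrates that pattern under stated assumptions; `FailSafeAgent`, `buggy_collector`, and `handle_request` are hypothetical names, and nothing here reflects the internals of Rookout's agents.

```python
import logging

logger = logging.getLogger("toy_agent")
logger.addHandler(logging.NullHandler())  # keep the example quiet by default


class FailSafeAgent:
    """Hypothetical agent wrapper: a collection error disables the agent, never the app."""

    def __init__(self, collect):
        self._collect = collect
        self.enabled = True

    def on_hit(self, snapshot):
        if not self.enabled:
            return
        try:
            self._collect(snapshot)
        except Exception:
            # Contain the failure and switch the agent off for good.
            logger.exception("agent failure; disabling collection")
            self.enabled = False


def buggy_collector(snapshot):
    # Stand-in for a bug inside the agent itself.
    raise RuntimeError("simulated agent bug")


def handle_request(agent, order_id):
    # Application code: the instrumentation hook must never break it.
    agent.on_hit({"order_id": order_id})
    return f"order {order_id} processed"


agent = FailSafeAgent(buggy_collector)
first = handle_request(agent, 1)   # the agent fails here and disables itself
second = handle_request(agent, 2)  # the application keeps serving requests
print(first, second, agent.enabled)
```

The design choice is that the agent absorbs its own failures and turns itself off, trading lost observability for an untouched application.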
Liran Haimovitch: And also, quite often, verify with every version that the buggy API is still buggy, and that it’s buggy in the same way.
Dudi Cohen: Yeah, still buggy in the same way. Sometimes, they make sure to add some more colors to that bug. But the challenging thing about Rookout is that we do things that are not supposed to be done. I mean, we change code in real-time. Usually, no one expects you to change–
Liran Haimovitch: Just because nobody expects you to do something doesn’t mean it shouldn’t be done.
Dudi Cohen: Yeah, I mean, obviously it should be done, we’re doing it. But no one expected you to do that. And you need to make sure that when you do that, you do that as safely as possible and you don’t want to awaken any sleeping demons in the runtime.
Liran Haimovitch: I see how testing and quality are imperative to making sure everything works as perfectly as possible. Any parting words for our audience?
Dudi Cohen: Yeah, I think that even as developers, you need to understand that you’re also responsible for production. And in recent years, more and more developers started looking at production, understanding how their code is running in production. And when you develop your code, you need to understand that if someone is responsible for how that code is behaving in production, it’s you.
Liran Haimovitch: Thank you very much for being here. It’s been a pleasure.
Dudi Cohen: Thanks.
Liran Haimovitch: So that’s a wrap on another episode of the Production-First Mindset. Please remember to like, subscribe, and share this podcast. Let us know what you think of the show and reach out to me on LinkedIn or Twitter at @productionfirst. Thanks again for joining us.