3 - How to be a Tech Lead (TL)
Michael Berk (00:00.908)
Welcome to another episode of Freeform AI. My name is Michael Berk and I do data engineering and machine learning at Databricks. I'm joined by my cohost. His name is Ben Wilson and he
Ben (00:12.099)
And I try to teach myself, with the assistance of GenAI, React, so I can build web pages at Databricks.
Michael Berk (00:19.545)
That's truly fascinating. What webpage are you building these days?
Ben (00:23.679)
We're doing a customization of one of our onboarding flows, so I got to figure out how to do that. And full disclosure, I am not a front-end developer, so yeah, it's an exciting new thing to learn.
Michael Berk (00:41.55)
Cool. So today we're going to be talking about something that is very important to me at the current moment. I'm working on a project that I'm very proud of, which is called Databricks for Good. We're going to see where that lands in a few months. But right now we have been approved, at the senior VP level by one of our co-founders, to give pro bono professional services to this specific organization that we've been working with for upwards of a year now.
Their mission statement is to, basically, collect a bunch of hospital data so that they can pair volunteer doctors with hospitals in need. So it's a really great mission. It's really fun. And the thing that we're building is also really sexy and really cool. And I am sort of the TL for this. I have to interface with the stakeholders. I have to design everything. I have to read between the lines for what they actually need and what the organization gets value from.
And then I have to manage the team, onboard them, delegate, do work myself, and a bunch of other random crap. And so I think today would be really cool to go through this case study because, Ben, this is what you do day to day. So before we kick it off, do you mind giving some background about your path through Databricks and what you do now and why you are potentially qualified to have opinions on this?
Ben (02:02.599)
Potentially qualified. I got hired at Databricks seven years ago or so into a field position as effectively a consultant doing coding for customers. Then eventually, over time, I moved away from doing that full-time into more traditional consulting about nerd stuff. So, working with teams that are trying to do something in data science and want somebody who's done that before to come in and provide advice and also help them code it.
I did that for a number of years and got high enough in the field that they asked what I wanted to do at this new level. And I said, I want to go play with engineering. And they're like, what? Sure, go ahead. If you can find a team that wants you, go and do that. So I asked around and got plunked into working with, but not working in, the AI open source ecosystem team.
And that's, you know, the maintainers of MLflow and a couple other packages and stuff. And then, after about two years of doing that, they just asked, like, hey, the guy that's running the team right now needs to go on and do bigger and better things. He needs to take on more product-related responsibility and help drive a lot of big initiatives.
I was like, yeah, that makes a lot of sense. He should go do that, because he's really qualified to do that. And they're like, you're going to take his place. Like, no. Okay. So yeah, I got, you know, gently moved over into that role over a period of like six months and then took over officially.
Michael Berk (03:58.882)
Nice. So to add a little bit of context about the structure of that team: there's a bunch of engineers, there's a manager, and then there's a tech lead. And Ben is currently the tech lead. And on the Virtue Foundation project I am serving a very similar role to a tech lead. Arguably, it's exactly a tech lead. And I am loving it, but I've got some questions. There's a lot of complexity to it and...
especially while working on two other customer projects and some open source contributions and other things of that nature. Just a lot to manage. So prior to recording, Ben and I just sort of sat down and brain vomited a few topics that we would like to cover. And I'm very excited because at the end of this, I will get a free consulting call with the Ben Wilson. So it's going to be pretty fun. So Ben, you want to kick us off?
Ben (04:49.733)
Yeah, the first question that we came up with: how do you exude confidence while maybe not knowing all the details?
Michael Berk (04:58.562)
Yeah. So this was a real pain point like two weeks ago. Basically what we're looking to do is build a GenAI ETL pipeline. And so we're going to take a bunch of disparate sources, namely LinkedIn, Facebook, Google, and Overture. Overture is basically maps. And do key information extraction on unstructured data to make it structured. And then upsert that data into a website-serving layer.
So there's a lot of, it sounds pretty simple at face value, but there's a lot of edge cases, a lot of complexity. How do you resolve conflicting information? How do you know that you're upserting to the correct row? So a row represents a single hospital. So throughout all this, we've been working on it for, I don't know, nine, 12 months, and I'm very onboarded to the problem. And two weeks ago, it sort of hit me that this is hard. Like, web scraping a bunch of data,
is a solved problem, but it's solved by tons of money and tons of engineers at a variety of companies.
My solution for exuding confidence is just hiding the fact that I don't really know how this is gonna turn out. I think it's gonna turn out well. I have a lot of evidence to believe that, but I'm also very objective and understand that we're tackling a challenging problem. So I guess my answer is: repress and make jokes.
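A minimal sketch of the keyed upsert Michael describes, assuming Delta Lake on Databricks; the `hospitals` table name and the `hospital_id` key are hypothetical stand-ins, not confirmed details of the project:

```python
from delta.tables import DeltaTable

def upsert_hospitals(spark, extracted_df):
    """Upsert freshly extracted records into the table (one row per hospital)."""
    target = DeltaTable.forName(spark, "hospitals")  # hypothetical table name
    (
        target.alias("t")
        .merge(extracted_df.alias("s"), "t.hospital_id = s.hospital_id")
        .whenMatchedUpdateAll()     # known hospital: refresh its fields
        .whenNotMatchedInsertAll()  # new hospital: add a new row
        .execute()
    )
```

The hard parts Michael mentions — resolving conflicting information and matching to the correct row — live in whatever produces `extracted_df` and the merge key, not in the merge itself.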
Ben (06:31.099)
Yeah, homework. So every so often there's a project that comes along that is company focused. Like, this is an initiative that we want to do, and the product side of it makes a ton of sense. It's really logical. And we will go through that process of: what is the component or series of components that we are responsible for here? What is the product side of that? Like, what
Michael Berk (06:31.182)
But how do you do it?
Ben (07:00.155)
are we building, why are we building it? And we go through that process to formulate sort of those ideas in our own heads and make sure that everybody else is on board and aligned with what we're thinking. And it protects us from anything that we missed. Or if we're going in a direction that is like, this isn't what we're trying to do. So that alignment is super important and it's absolutely critical to get that done upfront.
before we start thinking about what is the scope of this? What is the technical direction? Like what should we implement? That's all stuff we'll figure out, right?
And after we get that product alignment, you go into that next phase of design. And part of design is: if it's something you've done before, you know how to do it. It's straightforward. You have experience, or people on the team have experience. You don't need to go and build a prototype for that. You know, like, we're building an API that, you know, communicates to a database. We've done that thousands of times.
So we already know how much effort that is. There's no need to prototype it. But if somebody came to us and was like, hey, you guys need to scrape these 10,000 web pages every day, then we would be like, no idea how long this is going to take. And we would go and build prototypes as part of the design to prove it out. We probably wouldn't just build one. We'd probably do like three or four different ones, selected at random. How
much time does it take to do one of these? And are they consistent? Do they all take a day or two days or something? Or is it like, well, one took half a day, it was super straightforward, and this other one took five days because they're doing something weird on their site that we have to work around. So that's why we go through all of that, and that informs scoping: how many resources do I need? How many do I have?
Ben (09:07.793)
Who has experience in this domain? As a tech lead, you're trying to assign that work to people, and there's a bunch of considerations that go into that. And I think we have other questions that are related to that later on. But yeah, you want to put in that pre-work.
Michael Berk (09:30.764)
Yeah, exactly. We've been approaching it in that style where we need product alignment first and foremost. And then we want to create minimum viable products of the key components. And minimum and viable are subject to people's opinions. But generally, I've been gatekeeping what I determine minimal and what I determine viable. And once we have each of those, we can be relatively confident that we can scale it out and then stitch them together.
This week, we're starting to complete all of our MVPs, and it's actually looking really good. So a really important lesson, I think, to take away is have faith in the process of this prototyping. Because as long as you're doing the minimum to determine whether a solution is possible, you're going to be very efficient with time, and you're probably going to get it done. But like two weeks ago, we didn't even know if it was theoretically possible. So that's kind of a scary thing when you ask for.
Ben (10:07.388)
Mm-hmm.
Michael Berk (10:30.626)
tons of money from a co-founder and you're like, we'll deliver.
Ben (10:34.181)
Yeah, but it's also having faith, like, okay, other people have solved this. This is doable. We just need to teach ourselves — do that homework. So, to the original question of when we don't know anything: everybody's at that phase if you've never done it before. And you shouldn't ever try to give an answer immediately when asked a question about that. If somebody's like,
Michael Berk (10:40.184)
True. Yeah.
Ben (11:02.511)
So tell me how you're going to build this. Be like, I don't know. I'll go research it and do my homework and figure out how hard this is.
Michael Berk (11:07.395)
Yeah.
Ben (11:13.253)
And there are times when somebody will ask a question — I think consultants are very, very guilty of this — and they've never done the thing before. They don't even know if what somebody's asking them is possible or not, and they just agree to it. Or they have so much confidence in themselves, they're like, I'll figure that out. And then six, eight, ten weeks later, still no progress, because they think they now have to build this new framework that does this thing.
And if they had done their homework, they would have been like, yeah, nobody does this thing, because the cost is too high, or implementing it would take a massive company's worth of R&D focused on this.
Michael Berk (12:00.92)
So a follow-up question to this: what do you tend to communicate to stakeholders in terms of progress? And I'll just say, for this project, we have weekly meetings with the big founders of this organization. And I try to be very, very transparent, but also showcase little tangible wins that a potentially non-technical person could enjoy. So each week we have a variety of feature developments, and I will just cherry-pick one for someone on the team to present for 10 minutes.
And that's been a really great way to build up rapport and just get people excited. But for you, reporting to internal Databricks folks — do you have to, quote unquote, please stakeholders and give status updates with fun wins along the way? How do you think about that?
Ben (12:47.587)
We most certainly do. But it's more of, you know — we're an OKR shop. So we have quarterly goals. We have already gotten approval prior to the quarter starting that this is what the team is going to focus on. It is mutable — things can change throughout the quarter — but those change under the cognizance of leadership, because they're the ones who are green-lighting it. Like, what is this team that's full of
very expensive software engineers doing? How are they spending their time? What are they building? And product is involved as well to make sure that — we thought we knew what we needed to build, but now the priorities have shifted. So product will work with the TLs and the managers and be like, can your team shift over to this other thing? Because I have evidence showing that this is more important. And if we're all in alignment, which most of the time we are,
yeah, we shift the focus and then let our leadership chain know. So we have biweekly meetings with just leadership, tech leads, and managers, where you present the state of your team.
Michael Berk (14:01.26)
And it's all OKRs, very data-driven. It's not "showcase a feature" or "do a demo."
Ben (14:06.009)
yeah.
There are other avenues for that. And that's more within the engineering organization or within your department. If your team built something super cool and you want to kind of celebrate it, you go and do like a five-minute demo in front of 400 people and showcase, hey, we built this cool thing. And people start asking questions. And that's more to let people know that this exists, because they might want to interface with it:
Michael Berk (14:24.963)
Hmm.
Ben (14:35.599)
like that's super cool, those APIs are awesome, that'll help me on this project I'm working on, that sort of thing.
Michael Berk (14:43.288)
Okay, cool. What question is next?
Ben (14:48.141)
Onboarding new members to a team, or to a team that's working on a project: how much time do you spend with that person? What context do you need to give them? And how do you start giving them work?
Michael Berk (15:01.688)
Yeah, I think this one and the following question — determining someone's competence, specifically doing that quickly — are very related. So onboarding is defined as getting someone up to speed so that they can make meaningful contributions. And I think it's subject to where they come from. So A, if they have the technical expertise or similar project expertise, onboarding is often very simple. But if they really don't know what the hell is going on and you have to teach them skills along the way,
I think onboarding is a lot harder. And then there's a subcomponent of onboarding that's just niche to the project. So we're working with doctors and hospitals and things like that. A lot of Databricks employees don't have a medical background. So for instance, we need to resolve medical specialties and subspecialties — cardiothoracics, pulmonology, you name it. If you don't know those things — such as myself; I'm learning them along the way — that's just a learning curve.
So I think step one is determine competence, and then step two is reverse engineer from competence to what they need to be doing. But I really, really struggle with determining competence out of the gate. It's been like one of the biggest pain points at Databricks so far. How do you think about doing that? Like, what's the least amount of time you can put in to get a robust signal on how good someone is at a given task?
Ben (16:22.331)
Yeah, so for a project, if you follow the process that we do where you have the product design doc, and then you have the engineering design doc of the project and a project tracker, maybe there's a bunch of design docs that have been created, that's their homework when they onboard to the team. Like, hey, your first two days, just read all this. And then if you have questions, just ask me. That's your first gauge of competence.
And also the gauge of how good your docs were. That doc should be intelligible by any professional engineer. And it should be clear enough that they can understand what it is that's being built and why it's being built. And then the engineering design, they might have questions or they might challenge certain things based on what their own personal history is and like what they like to build or how they like to build it. Totally fine. That's for a one-on-one discussion with the tech lead.
There could be reasons why we're doing it one way or not another, like, we're not building this the most robust way. It's like, yeah, we don't know if this is going to be a final product. So we're going to go fast and loose at the start. And then if this takes off and we have like a hundred thousand downloads a day of this package, then we need to go and do phase two, which is harden up that implementation and build like the full system. But if it gets 10 downloads a day on average,
we're not going to sink the engineering resources into building something that is robust and scales and stuff. So you have those conversations with them and you're assessing their ability to grok the project independently. Like can this person do something without needing their hand to be held? And if they're asking tons of questions and they're not really understanding what's going on, maybe they're not good for this project because it's not clicking in their head.
Or, if they're asking too many questions about the technical implementation and challenging too many things: is this person going to be able to work within the team? Are they going to culturally fit with the rest of the people that are doing work, or are they just going to be a pain in everybody's ass the entire time? So you're assessing them during that intro phase. And then once they get past that, you assign them
Ben (18:47.897)
easy stuff, like quick wins. So they just get comfortable with the process of contributing code to a code base that other people are using. Not critical features, not something that requires a lot of context, not something that's super complex — just something quick and easy. You know, if you have docs or something, have them do a docs PR so they understand the PR process. If you don't have docs and you're just working on a project, it's like, we have this cool P1 that we wanted to do, but we're bringing somebody new in and I need to figure out how good they are at banging code out. Give them one of the P1s — not a critical-path thing, because you don't want them to be a blocker to other things on the project's roadmap. But you also don't want them working on a P2 where nobody cares if it gets built or not. So it has to be important, but not time critical.
Michael Berk (19:47.79)
Hmm.
Ben (19:47.847)
Give them a little bit of buffer room, and you're going to have extra time for alignment when you're doing PRs with them. And the tech lead should be the one doing the first couple of PRs to make sure that you're doing that adjudication. Are they writing the same level of code as everybody else? Are they over-implementing something? Are they under-implementing something? Do they not know how to write tests properly? There are all sorts of things you're
kind of checking there.
Michael Berk (20:22.68)
Cool. How do you balance upskilling with compromising on quality to ship on time?
Ben (20:31.239)
So, upskilling with the project context so that somebody understands why and what we're building — that's just part of project management. You have to do that, get everybody on board. And provided that you spent your time effectively during your homework phase, that should be self-explanatory in documentation. People should be able to read that in 20 minutes and be like, yep, got it.
Michael Berk (20:44.258)
Yeah, that's essential, right?
Michael Berk (20:59.628)
I mean, like, core technical skills — not onboarding to the project. Let's say I need someone to write a LangChain whatever, and they've never done that before. How do you think about giving something that's in someone's wheelhouse versus giving something that would make them learn and grow?
Ben (21:18.243)
Excellent question. It depends on how critical timing is. So say you have something where you're like, hey, I've agreed to ship this six weeks from now, and I know this part of the implementation takes three or four days to do. Somebody's really jazzed about it: hey, I really want to do this, can you please give me this ticket? I want to learn this. While you're executing a project with a deadline is not the time to
give somebody the ability to learn something from scratch.
You know, maybe have them do that in parallel, or you do like a mini hackathon or something — everybody gets to build one, and then you share learnings amongst everybody. But it's not something that's critical path, impeding the delivery date, just because somebody thinks it's super cool and they want to work on it. It depends on their technical competency too. Pretty much every engineer on the team that I'm on would be capable of doing that even if they've never done it before,
and probably get it done an entire day early. But that's because of the hiring standards that we adhere to.
Michael Berk (22:33.197)
What's the amount of, I guess, percent time that you hold out for learning-based projects? Because theoretically, you can always allocate 100% of everyone's time to shipping features. But do you try to create a 5% buffer for fun stuff or growth? Does that not matter? Or does it just happen organically?
Ben (22:54.255)
You can solve that problem by not dictating what to build. You dictate the requirements of what this thing is supposed to do. Like, hey, I need an API that does X, Y, and Z — design it. And that might be an extra couple of days tacked onto that work item, but in that whole process, they're going to have to go out and prototype stuff. They're going to have to learn it, figure it out, break it,
fix it, know what works and what doesn't work. And they're just going to test a bunch of hypotheses in order to get that design doc written. That's the whole process of it. So that's how we do it. We give somebody a task — you've seen it in our sprint boards before, it's just "[design]". Sometimes that's one day or two days of design work. Sometimes it's an entire sprint.
Like, here's two weeks, go figure this out, because none of us know how to do this. And there are so many open questions, we just don't know what we don't know. So go figure out the questions and the answers to them. And somebody will go off and write a crapload of code, and they'll figure it out, and they'll come up with some salient design that other people get to pick apart and ask questions about. And that hopefully aligns us to something that is doable.
And then right when that design's done, that person just rolls right over into building it, because now they have all of the context. They know where all the bodies are buried in it. And you're not going to be like, well, thanks for doing that design, but somebody else really wants to work on this, so I'm going to give it to them. You're going to piss both of those people off. One person's like, I'm just implementing something that somebody else designed — that's not fun. And then the other person's like,
I just spent two weeks designing this. What the hell? Why is somebody else going to go and implement this? I already have it mostly done, sort of thing.
Michael Berk (24:58.414)
Yeah. And there's a really nice component where people have end-to-end ownership and become SMEs — subject matter experts — in that area. So that makes a lot of sense.
Ben (25:01.701)
Yes. Exactly.
Michael Berk (25:12.712)
What are you thinking is next?
Ben (25:14.555)
Well, you have that sub-bullet question: how do you know when you should do it versus someone else, if you're the only one with context on how to do this?
Michael Berk (25:25.718)
Yeah, so this came out of something I was working on literally an hour ago, where I was like, this will literally take me 30 minutes to do, and if I assign it to someone else, it will take them probably upwards of a day. The reason is not related to competence whatsoever. It's simply that I have onboarded to the specific problem already. I've already done all that prep work of trying things and designing things, and I've settled on how this should be done.
So I'm just going to do it. I think it's a better use of my time than someone else's time. But actually, a very directed question to you: your salary is sometimes higher than the people on your team's. So that's one angle for a multiplier. But also, at least for me, I value my time a lot. And so if someone else can do it and I don't have to do it, that's great.
What's your thought process around a multiplier on your time versus someone else's time? And how do you think about who that person is? Like, Ali Ghodsi, the CEO of Databricks — his time is probably a little bit more valuable than my time, according to Databricks. But someone else's may be less valuable, in theory. So how do you think about that?
Ben (26:49.031)
So we have an actual process in place for stuff like this. And it's part of TL onboarding. Anybody who's in that role — at least at Databricks — should theoretically be able to do any work item that the team is working on. They should be able to just cut a branch, write a PR, write the tests, file it, and then see it through to merging and release.
That's kind of like a base requirement. You might not be the fastest at it. You might not be the person who understands all the nuance of the users using it, or you might not have the prettiest code, but it's gonna work. And that's like a requirement. There are exceptions to that, if you're on a team of multidiscipline engineers.
I can't do what our front-end people can do. I just don't have that skill set, and it would take years to get to their level, because that's what they did in previous roles — like, a front-end dev for almost a decade. So the way that you think through this is: there are times where, as a
tech lead, you should be taking on some of the more challenging work that the team is responsible for in a given quarter. Like, hey, take this really complex thing; or, you have this idea for this new part of the product — go do the design and go work on that, and write some PRs and get some code out there. Because a TL is a senior IC person within the team. But you can't do that a hundred percent of the time,
because you have all these other responsibilities that nobody else on the team has: you're covering for the rest of the team. You're the point of contact for any incidents that happen. You're also devoted to planning for the team, and resourcing, and figuring out who's the best person to work on this thing. Sometimes you're almost doing secretary-type work — you're taking notes on behalf of other people so that they have context, so you can unblock their work,
Ben (29:17.607)
or you're in meetings representing the team. And there are a lot of them at Databricks. So you're basically serving as this liaison on behalf of the team to ensure that everybody's time is not wasted. It's not efficient to have eight engineers in a meeting that one person would have sufficed for. So, keeping that in mind as a project slash tech lead:
you have a bunch of other responsibilities that are taking up your time that nobody else has to deal with. They just have to bang out features or do designs or whatever. So you delegate some of the responsibilities of implementation to the right people. And sometimes that might be, yeah, maybe the TL should do this design because it's super complicated, or it's something that they've done before and they have a lot of context. No — just delegate that. Get that other person
up to speed so that they're the expert in this domain. And the way our team used to work, a year and a half or two years ago, was the TL doing a lot of the designs and a lot of product decisions. I have dozens and dozens of documents in my folder of all of that stuff. And I would do all of that, we'd get it approved, and I'd hand it off to somebody to go and implement. And then some of them I would take and do myself,
but it doesn't really scale, and it doesn't allow you to do enough IC work if you're just constantly designing stuff for other people and speccing things out. So when we started really doubling down on the delegation part, all of a sudden it's a force multiplier for yourself, and it's good for those people too, because they now own this thing. They feel emotionally attached to it. And it's like, this is mine.
Like, I'm designing this, I'm building this, this is my thing. I own this. I'm the point of contact. And from my perspective, that's awesome, because now I have one or two experts in this domain on the team who know everything about it. They know all the aspects of it. And if issues come up, or new features, or anything, they can just crank those out really fast, because they own that part of the code base. So it allows the team to scale
Ben (31:44.901)
much more efficiently.
Michael Berk (31:47.734)
Yeah, I think this is really subject to a lot of variables, because I'm doing consulting-like contract work, which is part-time. And so creating a subject matter expert out of someone who's going to be here for three weeks isn't really beneficial to anyone. But going back to the question: how do you think about the multiplier of time? Here, one component is creating subject matter experts, giving people ownership so that they're happy and thereby retained,
and scalability of your time. But can you break it down to a multiplier? How do you think about it?
Ben (32:24.775)
No. There's plenty of stuff, if you look at what I've done in the last six to nine months — previous to nine months ago, a lot of my time was meetings and design. And there are a bunch of projects that our team has done where you look at the original design doc, and it's mine. And I didn't even remember doing it, because I didn't implement it.
And then I look back and I'm like, I guess I was the one that designed that. Weird. Okay. But nowadays it's more like, no — I free myself up to do that same role that they're doing, which is taking on complex feature implementation and design for my own stuff, and then allow them to do that as well. But I'm still coordinating messaging
from the larger organization to them directly. So that scales time a lot better, more effectively. Because if you take a tech lead and you're just like, well, their responsibility is project management and making sure that all the people on the team are doing what they need to be doing.
I feel like if you do that for too long as somebody still in tech — it's different for managers; for managers, that's their whole thing, right? Although they have a lot more responsibility for other things as well. But if you do that as a tech lead, you're going to find you get super rusty on the tech side, where you're like, well, I'm not writing code anymore. I'm just reviewing code
and doing designs and delegating actual implementation details. And that's boring. Like, really boring.
Michael Berk (34:26.99)
Yeah. And the inverse is just a lack of doing design or tech. I feel like both get boring. And owning it end to end is the most fun.
Ben (34:38.417)
I mean, banging out implementations, yeah. The insulting term that I know of is "code monkey" — somebody's handing you requirements or handing you designs and saying, go build this exact thing, here's your spec. There are people out there that love that, like, love it. And then there's me. I've done it. I've done it for
a long time, but over time you just kind of feel like it's never-ending monotony. You're like, okay, we've got to build this feature, and I have the full spec in front of me — time to just go type. And it gets boring over time.
Michael Berk (35:25.143)
Yeah.
Yeah, I've actually really enjoyed the design aspect of this. Building the features has been cool, but it's been kind of eye-opening how I don't really care how it's built. I care that it solves the problem. And building it to solve a problem is rewarding, I guess. But yeah, I'm definitely learning a little bit about myself through this project. It's been cool. But yeah, OK, shifting a little bit back to the TL world, you are delegating.
And let's say we have standups with check-ins with tickets and everything, and you know what everybody's doing. How do you identify someone who's going in a bad direction? And what do you do?
Ben (36:09.563)
Bad direction from an implementation perspective.
Michael Berk (36:13.229)
Yes.
Ben (36:15.047)
Always private, never public. So even if it's brought up by somebody else who's questioning something that they saw, you redirect the conversation to say that you're going to follow up offline. But don't make it negative, like, "we're going to talk later." Never do that in public — you shouldn't do that, period. It's
more like, hey, I have some ideas for this that I'd like to discuss with you; if you have some time after this meeting, let's just meet and we can geek out about it. And in that meeting, you can start asking Socratic questions. Like, did you think of this? Did you think of that? Just out of curiosity, what do you think would happen if a user did this, or the data looked like this? And if they're good,
which hopefully they are, because you brought them into the project and they passed muster, then they figure it out just by virtue of you asking that question. They don't need any other prompting. They're like, totally didn't think of that, yup, going to fix that right after this. Problem solved. And the natural reaction of an intelligent human is,
they're going to think about it in the next thing that they do. They're like, I've got to think about that one thing that Ben asked about, just to make sure that I build this right or that I have good test coverage here.
Michael Berk (37:53.186)
Yeah. Yeah, and another thing is customizing it to the person. It's definitely a good idea for big stuff to bring it up in private. But for some of this team, I have a lot of rapport with them — I've worked with them for a while — and I'm fine with being like, no, you're wrong. Actually, I wouldn't say exactly that; I guess I would come at it from a Socratic angle and be like, why did you do that?
And I think doing that in a public space actually has some really nice externalities, where it makes people's opinions less important and lets the facts come to the surface. Because I typically am really, really direct, for better or for worse. And I have to tone it down a lot, especially with people that I don't know that well. But the flip side of being so direct is that it's just facts. We're just solving problems. It's not about
who's right or who got the credit. It's just like, let's figure out the solution. And I've noticed that a really effective way to do this is be really direct and then be really quick to change your opinion as soon as there are facts that outweigh your prior perspective.
Ben (39:08.101)
Right. We just don't do that in standup, because standup is not the place for it. Standup is status updates: what you're doing, and whether you need assistance or advice from either the tech lead or other people on the team who have greater context. That's what that's for. You want to minimize the amount of time that those people are having to sit in a meeting room. That meeting should be
Michael Berk (39:10.99)
Mm-hmm.
It's more like a brainstorming call.
Ben (39:37.399)
as efficient as possible. What incidents are we dealing with? What's the plan for fixing these? What's the status? Okay, now on to feature development work — is anybody blocked on anything that they're working on? And if nobody brings anything up, it's like, yep, I'm all good. You've been in our standups before. It's usually like, yeah, I'm working on this, all good, we'll get a PR tomorrow. We're not hashing through details of what's
being implemented or how it's being implemented. Every so often, if there's something where, hey, we're up against a deadline — we've got to release next week and we need to discuss this as a team right now in order to unblock it — that's sometimes super helpful for that person, to get the feedback of all seven members all at once. But yeah, if it's something that can live in a PR comment, leave it in a PR comment.
If it's something that could be construed wrong in a PR comment, have a one-on-one meeting, because they might be thinking of something that you're not thinking of. That happens, and it has happened to me. That's why I approach it from a questioning manner: because I want them to be able to provide that justification to me in the most conversational and friendly way possible,
Michael Berk (40:35.982)
Mm-hmm.
Michael Berk (40:44.396)
Yeah.
Michael Berk (40:48.238)
That's super true.
Ben (41:03.751)
because we're respecting each other as peers. If what they're doing seems reasonable, I'm like, yeah, thanks for explaining that to me, it sounds good. Or I ask a question and they're like, yeah, I didn't think of that — thanks for asking that question, I'll fix that.
Michael Berk (41:14.616)
Yeah.
Michael Berk (41:23.214)
I think this is a product of culture, but how many times in the past six months have you just been like, no, we're not —
despite other people disagreeing?
Ben (41:33.831)
I can count on one hand the number of times there was a proposal to do something in the team. And it's not like, oh, it's a bad idea — I don't think we've ever had somebody on the team suggest something that's off the rails, like, crazy. It's more like, yeah, I hear you, and I think that's important, but we don't have time to do that right now, because it's
such a large undertaking to do this thing. That's quarterly-planning territory. And at that point it becomes engineering management's responsibility to determine if that's a key priority for the team or not. So we've had tons of great ideas, things that we proposed — sometimes it's been proposed many quarters, over and over — and we've had more pressing priorities that we needed to get done,
and it just gets deferred until we have capacity or it reaches critical mass. Like, we have to fix this now, or we have to implement this now because it's a maintenance burden, or whatever it is. Then we actually get to do it.
Michael Berk (42:45.73)
Got it. Yeah, that makes sense. Cool, we're flying through these. You mind teeing up the next one?
Ben (42:55.131)
What is the boundary between prototype slash design and implementation?
Michael Berk (43:02.828)
Yeah, so we're currently there with the Virtue Foundation project, with Databricks for Good. We're wrapping up proving out every component, probably by mid next week, maybe end of next week. And then we're productionizing a lot of the ETL. That's just — you've got to write it; there's not a ton of design involved in that. But a lot of the GenAI pieces — that's relatively new. Organizations, no doubt, have done this very effectively.
But to my knowledge, at Databricks we haven't done any projects of this nature. I'm sure I'm missing a few, but it's pretty cutting edge. And so we need to explore, and quickly fail, and then figure out the best path forward, and from there, productionize. So we're at that stage right now. The stopping criteria that I've been using is: does it handle 80% of the cases that we can think of right now?
So.
Actually, I'm definitely not using that as a stopping criteria. I don't know the stopping criteria. What is the stopping criteria, Ben?
Ben (44:12.913)
for a project?
Michael Berk (44:14.542)
Yeah, it's sort of, as I'm talking through it, it's definitely not 80 % of the cases. It's like a gut feeling, like, ah, this will work. Yeah, let's do it. That's my stopping criteria, right?
Ben (44:25.703)
So we operate in binary mode in engineering. On the product design doc that you wrote up, you have some must-haves, should-haves, could-haves, and won't-haves. So you'd better not be implementing won't-haves — there are reasons for those.
Michael Berk (44:30.446)
Mm-hmm.
Ben (44:55.919)
Your stopping criteria is: all must-haves are done by the deadline.
If you hit that and you're set to release next week, we're done with phase one.
Michael Berk (45:12.942)
Okay.
Ben (45:14.085)
Now, if you have extra time — let's assume a normal project, and we don't trim that down to, like, okay, maximum efficiency, we're going to scope this so that we only have time to do must-haves. That never happens. There's always some wiggle room in there for testing, bug bashing, validation. And during that,
while testing this out, we might find that some of those should-haves probably are more must-haves. It's like, yeah, we said that we should build this, and it's not a blocker to release, but it sucks so bad without this one API — let's just create that interface. So we'll do that. And a lot of the should-haves generally are lighter weight. They're things that are fairly trivial to
Michael Berk (46:10.466)
Mm-hmm.
Ben (46:12.987)
add on. Every so often there is a should-have that could be, like, three weeks of work. But we do that scoping during design time, with estimates.
Michael Berk (46:22.829)
Mm-hmm.
Michael Berk (46:28.12)
Yeah, we covered that in episode one, so check out episode one if that's unfamiliar as a concept. But that's objectively right. I guess the hard part, in theory, for this project is that all of these components, when stitched together, need to create the must-have. And we don't know what the must-have is for each of the components. So right now, it's sort of a spaghetti soup of intertwined functionality
Ben (46:36.743)
Hmm.
Michael Berk (46:56.482)
that should all sum up to be the must-haves. So with that, do you have advice?
Ben (47:01.819)
Got it.
Yeah. So you're talking about a complex end-to-end application that you need to build — which isn't to say the code has to be complex or the final product has to be complex. It's more that you've been doing piecemeal subsystem implementations to figure out: can we do this, yes or no? Okay, we have some prototype code here, we have some prototype code there. So your next stage,
Michael Berk (47:07.182)
Yeah, complexity, that.
Michael Berk (47:22.112)
Exactly.
Ben (47:31.939)
immediately after all the prototypes are done, the next thing you do is prototype assembly into the full product. And then as you're doing that, you're working on hardening up all of those implementations. And that can be distributed work once everything works end to end. It might not be perfect. That first run-through might be like, yeah, this kind of sucks, but we know what's broken.
That's how you determine what you need to fix: running it and seeing what sucks.
Michael Berk (48:06.862)
That's a really good point. Yeah. It's just, you've got to be able to be smart — I guess that's the answer. Be able to hold all the disparate pieces of the components in your brain, as well as get a 50-foot view and see how they would all work together end to end. Do you have tips for that? Of course, yeah.
Ben (48:23.249)
But then you've got to glue it together. Because if you don't, and you wait till the end to stitch all those together, you're going to be doing rework, which is a huge waste of time. Learn early — because somebody may have implemented something thinking they know the best way to do it: okay, I'm going to work so hard to get this input data to be so perfect.
Michael Berk (48:30.178)
Yeah.
Ben (48:51.793)
And then when you stitch it into the rest of the components, you realize that the schema is incompatible here, or the data structure needs to be completely redone, so now we need to add on all this additional processing code, which is going to take a week to fix. So you want to avoid that. But if you learn that early, before somebody's already finished all of that stuff, then it'll just be like, yeah, I can change that, and this will all work. So integration is important.
Michael Berk (49:08.803)
Yeah.
Michael Berk (49:22.254)
Yeah, it's funny. Whenever I think about integration, it reminds me of a Mark Rober video. Do you know Mark Rober? Yeah. He did the egg-drop-from-space video. And he delegated some dude to do the software, and he did the hardware. And they didn't properly do the stitching-together testing. And it just would spin infinitely, because, like —
Ben (49:34.055)
Uh-huh.
Michael Berk (49:50.862)
Yeah, the connection didn't work. And they had tons of money, tons of time, many people coming out to watch the launch. And they just should have tested it earlier. So your advice for that component is to do it early. What else?
Ben (50:10.405)
You do it when you can safely do it. You don't just patch things together and be like, here's some dummy data. You could do that, theoretically, but then you have to build to that spec of whatever you're expecting. So there could be things that change; it could introduce complexity. So when you're building something like an application, you can work on all the subcomponents asynchronously and, you know, iteratively develop and get better.
Michael Berk (50:13.677)
Mm-hmm.
Michael Berk (50:24.462)
Mm.
Ben (50:39.963)
But there should be some goal in the project plan of, like, we're integrating all these modules together on this day, and then for a week we're fixing everything that's broken with that. There's just probably going to be stuff broken. At least when we do stuff like that, there are things where we're like, we didn't think of that. Or, hey, what's with the exception handling here? This is kind of broken. Or, did we forget a field in this API? What happened to the data? You know?
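A minimal sketch of the early end-to-end smoke test this suggests; `scrape_source`, `extract_fields`, and `upsert` are hypothetical stand-ins for the real prototype components, not names from the project:

```python
def test_end_to_end_smoke():
    raw = scrape_source("https://example.org/hospital-page")  # component 1
    records = extract_fields(raw)                             # component 2
    assert records, "extractor produced no records"
    # Catch schema drift between components before it becomes a week of rework.
    missing = {"name", "address"} - set(records[0])
    assert not missing, f"extractor output is missing fields: {missing}"
    upsert(records)                                           # component 3
```

The point is not coverage; it's that the seams between components get exercised on integration day, while changing an interface is still cheap.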
Michael Berk (51:10.082)
Yeah. Noted. OK, cool. Damn, we are flying. Next one. You want to tee it up?
Ben (51:22.427)
How do you enforce consistency in a group of people who don't work together for a really long period of time?
Michael Berk (51:30.252)
Yeah, we've actually talked about this a lot in prior episodes and Adventures in Machine Learning episodes.
Linters are the best. The annoying thing for me is that we develop both in the Databricks UI and in a local IDE, and a lot of people are not comfortable with both, unfortunately. And Databricks just doesn't have the same functionality as an IDE. It's a very different development experience. Like, running pytest still sucks. And —
Ben (52:03.291)
I've never tried it. Not even once.
Michael Berk (52:05.504)
It's very doable, yeah, but it's weird. It's really, really, really, really weird. You can't do Makefiles. You can't do linters of any sort. It's just a different tool.
Ben (52:18.065)
I don't even know if you could test — you wouldn't be able to test async APIs on Databricks.
Michael Berk (52:24.078)
because of the Python kernel, or like IPython kernel, right? Mm-hmm.
Ben (52:27.237)
Yeah, because you have a running event loop. pytest-asyncio wouldn't even allow you to do that. But yeah, we don't do that in engineering. We do use notebooks for testing, but we use them for integration testing, to make sure that we're simulating a customer using a notebook. But that's just "execute notebook as code," effectively, and make sure that it passes.
Michael Berk (52:36.374)
Exactly.
Michael Berk (52:43.278)
Mm-hmm.
Michael Berk (52:47.683)
Yeah.
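For context on the async-testing point, a minimal sketch of the kind of test pytest-asyncio enables; `fetch_status` is a hypothetical stand-in for a real async API:

```python
import asyncio
import pytest

async def fetch_status() -> str:
    # Hypothetical stand-in for a real async API call.
    await asyncio.sleep(0)
    return "ok"

@pytest.mark.asyncio
async def test_fetch_status():
    assert await fetch_status() == "ok"

# Inside a notebook, IPython already runs an event loop, so a plain
# asyncio.run(fetch_status()) raises:
# RuntimeError: asyncio.run() cannot be called from a running event loop
```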
Michael Berk (52:55.138)
Yeah, so to answer this question specifically for this project: I don't. I just review PRs, make sure everything generally looks good, and ask people to format their cells — their Databricks cells — if they're primarily working in Databricks. And then maybe once every couple of weeks or once a month, I'll go in and just lint everything locally and push that. But I don't enforce it. I just care that it works.
Ben (53:24.667)
Yeah, personally, I'm not — and nobody else on our team is — a stickler for code syntax. There are a couple of exceptions to that. We choose to adhere to comprehensions instead of for loops if we can. We really like the walrus operator, because it deletes one or two lines of code.
Michael Berk (53:47.34)
I've noticed.
Ben (53:50.949)
So that consistency — we enforce that amongst ourselves, because we don't want to see mismatched usage of the same thing in the same code base. So we just all agree: use the walrus operator, because we can. And if somebody's writing a for loop, that usually triggers something in whoever's looking at the code: is there external state that needs to be accessed here, or
do I need to mutate external state within this? If so, that had better not be done in a comprehension, because the code to do that correctly and make it extensible is super complex. If there's a bunch of lambda expressions, we hate that. There are times and places to use lambda in Python — it's effectively like an apply command — but
if you want to standardize on using lambda everywhere, then do that. And if you only want to use it where you need to use it, then only use it where you need to use it. We choose the latter. But yeah, if you're talking about
things that the Python interpreter doesn't care about, that are just human-readable: it drives me insane to see a project being written where somebody's not using a modern linter. You choose a language — there's a linter out there for it. Just choose the version and the config that makes sense for you, and use that before every PR is filed. It saves so much
annoyance with people. Because if you look at code that just fundamentally looks different — provided that you're writing in a language where syntax formatting doesn't really matter, like Python; there are other languages where it really doesn't matter — yeah, it's just a recipe for people getting kind of confused, or making it look like
Ben (56:01.933)
seven different people wrote this and all of them have a different code style. So it's just jarring to read.
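A minimal sketch of the convention Ben describes, with a hypothetical `normalize` helper; the loop and the comprehension produce the same result, but the walrus operator keeps it to one expression with one call per row:

```python
def normalize(row):
    # Hypothetical helper: clean a record, or return None to drop it.
    return row.strip().lower() or None

rows = ["  Cardiology ", "", "Pulmonology"]

# Plain for loop: external list mutation, four lines.
results = []
for row in rows:
    value = normalize(row)
    if value is not None:
        results.append(value)

# Team convention: comprehension plus walrus, one expression, no repeated call.
results = [value for row in rows if (value := normalize(row)) is not None]
```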
Michael Berk (56:10.53)
Yeah. Cool.
Ben (56:13.447)
But when you're talking about data engineering stuff, there are certain things that you can kind of enforce with that. When you're talking about, oh, I'm writing Spark code: are you doing code repetition instead of using maps? Are you operating on a collection? If you're writing in Scala, for instance, are you doing df.withColumn 37 times, or are you processing a map
and mapping over that or folding over that? You should have some consistency in how code is written, and you should be going for conciseness when you're talking about a language like Scala.
Yeah, but Python — it should be concise, but predominantly it should be readable.
Michael Berk (57:05.826)
Mm-hmm. Yeah.
Ben (57:08.625)
And that's the TL's role: to set that standard and to set that expectation of, these are the rules we're going with.
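One way to pin that standard — a sketch of a shared, checked-in lint config, assuming ruff (any modern linter works; the point is one config the whole team runs before filing a PR):

```toml
# pyproject.toml
[tool.ruff]
line-length = 100
target-version = "py310"

[tool.ruff.lint]
select = ["E", "F", "I"]  # pycodestyle errors, pyflakes, import sorting
```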
Michael Berk (57:16.974)
That makes sense, because then if everybody gets onboarded to the same set of standards, you're just a lot more efficient. The initial calibration and correction into those rules take some time, but once you've adopted those rules, you can make a lot of assumptions and thereby work faster.
Ben (57:36.859)
Yep. And I've got some other nits — things that just piss me off. Don't write comments in code if the code is self-documenting. That drives me nuts — where it's, you know, pound sign, "I'm doing this for this reason," or just explaining what the code is doing right below it. And anybody who has experience in the language,
Michael Berk (57:41.72)
hit me.
Ben (58:06.597)
or doing this sort of thing, reads it like, yeah, dude, I got it. That's what the code says it's doing. That's a huge pet peeve of mine. And then there's the converse of that, which is: hey, there's this crazy complex data manipulation going on, where I'm using some low-level internal library to do something weird.
If I'm not using the typical public-facing, high-level API that everybody would use to do this thing in Python — for instance, I'm going to be reading this file, but also putting a lock on it, and I can't use the stuff that's in the os module for doing that, so I have to write my own custom implementation — you'd better have an NB comment in there. Nota bene, which means "note well":
to future developers and future you, why the hell you built this. Because otherwise somebody's going to look at that and be like, what idiot wrote this? Why didn't you just use, you know, the read function from the os library to read this file? But with a nota bene, you can look at that and be like, oh, that's clever. I hope I don't need to modify this code ever again. So that's super important too.
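A sketch of both nits at once, using a hypothetical file-locking scenario (fcntl is Unix-only): no comment where the code already says it, and an NB where the code looks wrong but isn't:

```python
import fcntl

def read_settings(path: str) -> bytes:
    # No comment needed for the obvious part: open the file, read the bytes.
    with open(path, "rb") as f:
        # NB: a writer process appends to this file concurrently, and the
        # high-level helpers we'd normally use don't take the advisory lock,
        # so we lock the descriptor ourselves. Don't "simplify" this away.
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            return f.read()
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```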
Michael Berk (59:33.56)
Yeah, explaining your decisions at the time of development is so valuable. I just pushed a PR to a customer code base that was built by like six different Databricks people. And it's just a shit show. There's so much outdated stuff — flow control for branches that just can't ever happen, utilities that are just never used. Yeah, it's just dead code. That's the word for it.
Ben (59:58.737)
dead code.
Yeah, you should delete it.
Michael Berk (01:00:04.748)
I should, shouldn't I? But... I'm-
Ben (01:00:06.462)
Yeah, dead code is dead code, man. Get it out of there.
Michael Berk (01:00:09.666)
Yeah, but I'm on this project for another day, so... oops. But no, I think we're going to go back and discuss a bigger refactor in the coming months. Because it's fine if there's a bit of broken-window syndrome where stuff drifts to some degree. A lot of these code bases are just not going to be perfect, by proxy of having independent consultants. But man, this is a new level of just, like,
Come on, colleagues. Be an adult.
Ben (01:00:39.975)
Yeah. Or the other thing that drives me nuts is when somebody files a PR and there's commented-out code in it.
Michael Berk (01:00:51.781)
There's a lot of that — don't worry.
Ben (01:00:53.489)
Stuff like that — if you're going to do that, file a draft PR, because you just want CI to kick off. Fine. Whatever. There are some to-dos in there. But an implementation that's written and then commented out isn't like a to-do. A to-do is, hey, reminder to do this thing — we still have to have this. Anybody who's looking at the PR would be like, yeah, that's pending development work. Totally understandable. We do that all the time.
But we don't write an implementation, then comment out 75 lines of code and submit it. Why would you write that before you're ready to test it? Just don't write it yet. Or if it's broken, then why are you including it?
Michael Berk (01:01:44.174)
Mm-hmm.
Ben (01:01:46.213)
Yeah, that drives me nuts. And then: public APIs that aren't properly documented in the API docs. You know that somebody's going to look at this, or that within an interactive terminal a code completer is going to expose this method or this function. And if they
do what I know a lot of fellow software engineers do — I'm not going to open up a browser window and look up what the signature is on this thing; I'm just going to wrap it in a help command, and that prints the docstring right there to standard out. There are my docs. And it's blank. And I'm like, who wrote this? Why would you do that? Now I don't
know what the heck this signature is, or what any of these variables or these arguments are. So, great.
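The fix is just a docstring. A sketch with a hypothetical function, checked the way Ben describes — by wrapping it in help():

```python
def extract_entities(text: str, min_confidence: float = 0.5) -> list[dict]:
    """Extract structured entities from free text.

    Args:
        text: Raw input to scan.
        min_confidence: Drop entities scored below this threshold.

    Returns:
        One dict per entity, with "type", "value", and "score" keys.
    """
    ...

# In an interactive session, this prints the signature and docstring above
# to standard out -- no browser window required.
help(extract_entities)
```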
Michael Berk (01:02:46.755)
Yeah.
Yeah. If you're using libraries — I mean, code is sort of built to be shared, right? A lot of implementation code is a one-person ordeal, but your future self probably needs to know how it works. And I have a terrible memory, personally. And then your teammates probably also care. And then if it's actually open source, the world cares. So I just don't understand it.
Ben (01:03:09.447)
Mm-hmm.
Michael Berk (01:03:14.06)
And the main way I learned that is, I have looked back at old code and been like, what the fuck is going on? Just one comment would explain all this, but nope. So yeah, I completely agree.
Ben (01:03:28.945)
That's a weekly thing for me. In fact, last night I was working on something, looking at this crazy module in MLflow that had this weird edge-case bug that could occur. I'm like, who wrote this, anyway? I'm reading through it, like, man, there's some funky stuff going on in here — complex things. Why would you need to patch that and then release the patch? What the hell?
I've got to know. I'm pretty sure this was Haru that wrote this. I go back into the GitHub commit, and I'm like, nope, that's my name. What the hell? Why? I mean, I wasn't surprised that I caused this weird edge bug — that happens — but I had zero memory of working on this module. But there were notes throughout it. I was like, hmm.
Michael Berk (01:04:19.566)
Yeah.
Ben (01:04:26.193)
The notes sound like how I type. Weird. Maybe I added comments into this later? Nope. Built it.
Michael Berk (01:04:37.004)
Yeah, when you're in a deep flow state, you can come up with some freaky stuff.
Michael Berk (01:04:44.11)
Cool, we're at time, so let me summarize. Today we talked about the Databricks for Good project that I'm working on. We used it as a case study for being a tech lead, and again, I got free consulting from Ben, which is always fun. A bunch of questions were asked; here are some core takeaways. When you're starting, product alignment is absolutely necessary. That will guide a lot of the direction of the project.
And you need to work with stakeholders to iteratively define those product requirements. There's a difference between learning how to do a solved thing and learning how to invent something, and those have very different timelines. So typically, if it's a thing unsolved by the world, you want to do a research spike. But if you know that there are libraries out there that can probably do this, just go upskill: Google around, maybe ask someone. Product specs should be really clear and robust,
to easily communicate project context offline, thereby making you and your knowledge redundant. If you have to have an hour-long call with someone every single time they join the project, that can be really time-consuming. And to that effect, for onboarding, give them an offline document that you've created — that's hopefully really good — and look for good questions. See if they immediately grok it, or if they're going off at weird angles. That's a quick way to gauge whether someone is skilled.
Ben (01:05:55.089)
Yup.
Michael Berk (01:06:09.674)
And then try to give them quick wins to start, so that they can develop some confidence, but also so you can see whether they can do those quick wins. One important thing when delegating for learning: you should dictate the requirements of the output; you should not dictate how it should be built. Sometimes, if someone is very new, it's actually more efficient to say, these are the things that I would recommend doing — but don't require it. Allow them to go off in their own creative directions. It's a lot more fun.
If someone is going in an incorrect direction on an implementation, bring it up privately, be Socratic, and also be humble — typically you work with smart people, and you might be wrong. For stopping design and building the production solution: basically, you can do this when the must-have feature requirements for the end-to-end project that meets the business need are satisfied — when you know that you can actually go and build this.
Until then, you're still in the prototyping phase. And to that end, make sure you stitch everything together early. Don't save that for the last step, because when you combine disparate components, often just due to the nature of complexity, stuff acts weird. So try to do that as early as possible.
Ben (01:07:28.187)
That's an understatement, man. I've never had a complex implementation — and this includes the most recent one that we've been working on, the Unity Catalog AI one — where multiple people working on it put everything together, we write an integration test, and it passes the first time. I've never seen it happen. Shit's always broken.
Michael Berk (01:07:54.36)
No kidding. Wow.
Ben (01:07:57.361)
Yeah, sometimes everything's broken, and you're like, okay, let's diagnose what the hell just happened. But usually now, with most projects I work on, it's more like, all right, there's this one weird bug that happened, or, huh, this one unit test is now failing that didn't used to fail — let's go figure out what's going on there. But yeah, if you don't do that and wait till the last minute, boy.
Michael Berk (01:08:27.576)
I've done it many times. It sucks.
Ben (01:08:29.894)
Yeah.
Michael Berk (01:08:32.43)
Well, until next time, it's been Michael Berk and my co-host. And have a good day, everyone.
Ben (01:08:36.187)
Ben Wilson, we'll catch you next time.