8 - Microservices with Mark Fussell

Michael Berk (00:00.808)
Welcome back to another episode of Freeform AI. My name is Michael and I do data engineering and machine learning at Databricks and I'm joined by my lovely and wonderful and beautiful co-host.

Ben (00:09.817)
Ben Wilson, I overhaul documentation for open source projects at Databricks.

Michael Berk (00:15.326)
Yeah, hell yeah. Today we're speaking with Mark Fussell. He started his career at Vodafone as a hardware engineer, then moved into engineering management focusing on security and mission critical applications. Then over a near 22 year span, he worked at Microsoft as a product manager, eventually holding several principal positions. And currently he works at Diagrid as CEO. It is a series A startup whose mission is to accelerate microservice development by turning

commonly used patterns into API calls. So Mark, you're headed to Microsoft Build in like a week to staff a booth. Is that actually a good use of your time?

Mark (00:52.802)
Yeah. It is. Well, first, it's fantastic to be here and happy to be on the show.

Ben (00:57.519)
You

Mark (01:03.118)
And yes, next Monday, we're going to Microsoft Build. I live in Seattle, so Build is just down the road from me. And we have a booth there. Yeah, it's a good use of the time in terms of getting to chat with a bunch of developers and see what's happening in their world around these things and find out what's happening inside the Microsoft ecosystem. I mean, I'm pretty familiar with it because of my background there, but excited to chat with a bunch of people and see some real faces.

Michael Berk (01:30.12)
Nice. What are you guys typically chatting about? What are some pain points? What are some interests that your software solves?

Mark (01:36.526)
Well, you know, I'm very much focused on this open source project called Dapr, or Distributed Application Runtime. Um, bit of a history, you know, I worked at Microsoft for 21 plus years, built many developer technologies there, including the platform that runs Azure called Service Fabric. Uh, but we recognized that all these developers who build on these distributed systems platforms like Kubernetes (we used to have things called Mesosphere and Docker Swarm, but, uh, you know, they're no longer really around),

um, but, you know, they needed to build with common patterns. And that means like, how do you do request-reply messaging between services? How do you do service-to-service discovery and communication? So, you know, what I focus on particularly is talking to those developers who are building these sort of mission critical distributed apps, particularly running on things like Kubernetes, and the challenges and the pains that they encounter. Because, you know, what happens is everyone goes, hey,

go and build that microservices architecture app on that Kubernetes thing, and off you go. And then developers are just left struggling, figuring out what does that actually mean and what do I actually have to do? So that's the problem space I work in.

Michael Berk (02:50.484)
Cool. How hard are those things to just go and build?

Mark (02:55.17)
Well, they are a real challenge. I mean, if we think about what the business problem space is more than anything, what's happened over the last 15 years, and is kind of the well-established paradigm now, is, you know, you're building an application that, whether you want to call it microservices architecture or cloud native, is basically separated with networking rather than compiled into a giant binary now.

You know, compiling into a giant binary was great when you wanted to scale to a certain level and have that single database backend. But the organization wanted to ship things faster and scale to different demands. That's what the organization wanted. So nowadays you have, you know, applications that run across multiple machines. And so it seems easy on the surface, but you get into all sorts of difficulties around networking and communication,

resiliency, observability, security across all of that. And those are sort of just the cross-cutting concerns before you have to do things like, how do I find that other service running on that other machine? What if that machine fails and it pops up somewhere else? How do I hook up to it again? And, you know, all of a sudden I've got this purchase order application talking to an email application, talking to an inventory application. And, you know, I've got a networking, discoverability, resiliency issue around this, at the cost of the fact that I can ship these things differently.

Ben (04:27.683)
And also like operationally where you're like, Hey, we just shipped some bad code. The service is live. How do I refresh that and roll that back and make sure that everything starts talking to each other properly? Like these problems are very challenging to get right.

Mark (04:44.078)
Exactly. And, you know, inside all of this, you know, it looks easy to begin with. You know, well, I come up with a port here and I talk to these two things. But, you know, before you know it, you're into this sort of networking world and developers don't want to be there. You know, they're trying to write some business app to solve some business problem and not have to solve communication and discovery and failures and retries and security and these other things. So those are the things that

we work on. And those are the things that we recognized, that if we took common patterns... let's take a very common one here. A lot of people talk about event driven systems. Event driven being that, you know, it's been around since the dawn of computing, 50 plus years ago, you know, something has a message queue between something else. And it does that because, you know,

durable messages sent between systems are a very nice way to communicate. You know, today that surfaces itself as your pub/sub mechanisms where you do eventing between things. And so you want to be able to say, I've just got this service A over here and it's done some processing, and I want to tell these three other services and send messages to them. So, you know, that's a very common pattern. But, you know, invariably what happens is someone says, well, let's use Kafka.

And, uh, they make a choice, and then, you know, they sort of bake Kafka into their code, and it's, you know, bound to their code and they use it everywhere. And then they're sort of locked into this. And, you know, before they even get started, they have to build a pub/sub API on top of Kafka, because it doesn't actually exist. You know, Kafka is not that. They have to build that sort of API, and then they have to get other behaviors in there. Like, well, you know, I want

Ben (06:27.812)
Mm-hmm.

Mark (06:37.39)
these services to be able to receive messages and not those ones. So there's a bit of security in there. You know, I don't want anyone to be able to subscribe to any message. You know, um, and before you know it, they're building this whole platform. And, you know, what we recognized is that there's two problem spaces here. One is that wouldn't it be nice to have a common API that everyone could subscribe with and use in a consistent way, that had features inside it all. And secondly, wouldn't it be nice if I could use that API

but plug any different type of broker behind it all. I could swap it out and have Kafka, or I could have RabbitMQ, or I could have MQTT, or one of the cloud service providers, and swap them in and out with no change to my code. You know, that way it gives me the ability to make design choices in the future, gives me code portability, and also gives me code evolution. So those are sort of the problems that you start to run into.
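The swap-out Mark describes is essentially an interface seam between application code and broker. Here is a minimal sketch of the idea in Python; all names are illustrative (this is not Dapr's actual API), and an in-memory broker stands in for Kafka or RabbitMQ so the sketch runs without any infrastructure:

```python
from abc import ABC, abstractmethod
from collections import defaultdict

class Broker(ABC):
    """Minimal broker contract: any backend (Kafka, RabbitMQ, MQTT...)
    would be adapted to these two calls."""
    @abstractmethod
    def publish(self, topic: str, message: dict) -> None: ...
    @abstractmethod
    def subscribe(self, topic: str, handler) -> None: ...

class InMemoryBroker(Broker):
    """Stand-in backend so the sketch runs with no infrastructure."""
    def __init__(self):
        self._subs = defaultdict(list)
    def publish(self, topic, message):
        for handler in self._subs[topic]:
            handler(message)
    def subscribe(self, topic, handler):
        self._subs[topic].append(handler)

class PubSubAPI:
    """The application codes against this API only; the broker behind
    it can be swapped without touching application code."""
    def __init__(self, broker: Broker):
        self._broker = broker
    def publish(self, topic, message):
        self._broker.publish(topic, message)
    def subscribe(self, topic, handler):
        self._broker.subscribe(topic, handler)

received = []
pubsub = PubSubAPI(InMemoryBroker())  # swap in a Kafka adapter here instead
pubsub.subscribe("orders", received.append)
pubsub.publish("orders", {"id": 1})
```

The point of the seam is that `PubSubAPI` never changes when the backing broker does, which is the portability and evolution property Mark is describing.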

Ben (07:37.719)
And also language agnostic implementations as well, right? If you have that interface, if you're like, hey, I just need to define my remote procedure call of like this thing going out to this other service. So long as that language has the ability to craft that in the standard format,

that makes it a whole lot easier instead of like, okay, we're a house that just does this one language, so for all use cases we have to force that into that language. And sometimes you're like, this would be so much easier if I could just do this in like Python, or I could do this in Scala or C sharp. Yeah.

Mark (08:14.646)
Yes. Yeah. I mean, let me tell you how many times we come across, you know, the Java shop that's running a bunch of Spring, and, you know, they're running a whole bunch of Spring code in there and they're doing things, and the Python developers come along and say, here's my AI code, and I'm writing all of this AI modeling code and things like that. And how do we make these two work nice and play nice together? Because I want your Java legacy code that you built 10 or 20 years ago to start sending messages,

Ben (08:29.775)
Yeah.

Mark (08:43.948)
publishing events to my Python app that's running over here, because I'm doing a bunch of ML logic over here. How do we do that? And wow, you know, on the surface of that, you know, it doesn't seem too hard. But when you actually get down to it all, you know, deciding how you do event-based messaging between two different teams who want to use two different languages, for all the different reasons that they want, you know, you start to get into those sorts of challenges. And, you know,

very much the legacy Java, if I want to call it that, to the modern Python AI side of the world is a classic example of that.

Ben (09:19.481)
Yeah, people don't even really think about something like, how is a float defined? And like these different languages and their frameworks handle that computationally differently. Precision is different and stuff. So if you really need that, particularly in AI... and I can't even tell you how many times I've been in a room discussing what we're discussing right now. Somebody's like, well, my source system is like Java.

Mark (09:29.603)
Yeah.

Mark (09:32.887)
Yes.

Ben (09:48.109)
I'm having this problem. Like my model can't really figure out what to do with this particular feature. Like, can we just display the feature on screen real quick? I just want to see what data this is. You show it, and it's like a floating point value that's just a zero with 37 zeros behind it, and then like five numbers. Like, when you load that into Python from that system, it's not going to handle it the same way.

So they're like, oh, so it's always zero. Yeah, don't try to directly interface this language with this. Or you need some translation layer. Well, what does that code look like? Here's the abomination that I'm going to show you. But here's what something like RPC will do for you. And like, that's amazing.

Mark (10:35.798)
Yes. Yeah. Yeah. Well, I mean, yes. Data formats and data serialization is a whole topic in itself for certain things like this. But yeah, the whole point here is that multi-language polyglot development is a real thing now in organizations. And so, you know, what Dapr does, you know, is it provides common patterns around pub/sub messaging, direct service to service invocation, secrets management, because that's a common thing to do.

Increasingly, and I think one of my favorites, is also that it contains a durable, long-running workflow execution engine. And workflow, I think, is the core of business now, where you have, you know, some state. It's a classic state machine you're running, effectively a forward-only state machine, in that, you know, you don't just jump back to previous states. So, you know, it's a forward-only state machine. It goes from state A to state B and does a set of activities on the way. And if it fails,

it's fully recoverable from where it last left off. So in other words, if you're familiar with state machines, sorry, workflow engines, like Camunda and Airflow and these other ones, you know, Dapr has a workflow engine built into it all, but it's a code-first one. And so you write your code as a workflow class and a set of activities. And what this means is, as a developer, you know, you can

debug it with all the tools that you're familiar with today. And where this comes into play is because when you build one of these distributed applications, invariably you're saying, I want to call this service A here, I want to send a message to this service B here, I want to save something to this database here. And then the machine dies. And then, you know, your app starts up again somewhere else. Well, how does it know what it's already done? What was it doing? It forgot; it recovered, because failures happen.

You don't want to have to sit and figure that out. Well, workflow engines just do that all for you. It basically replays all of its state: here's where it was before, here's what it did, and it carries on where it left off. And so this allows you to do so-called saga patterns, you know, where you basically, effectively, coordinate a set of services, and in the event of failures you can recover and carry on where you left off. So

Mark (13:00.684)
that basically is what describes most business processes. And so Dapr has a workflow engine built into it all that allows you to do coordination patterns across the communication it provides, such as discovering other services and sending the messages.
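The replay behavior Mark describes can be sketched as a journal of completed activity results: on recovery, journaled steps return their recorded results instead of re-running side effects, and execution continues from the first un-journaled step. This is a toy illustration of durable execution under those assumptions, not Dapr's actual workflow engine:

```python
class DurableWorkflow:
    """Journals each completed activity's result; after a crash,
    the workflow function is re-run and journaled steps are replayed
    rather than re-executed."""
    def __init__(self, journal=None):
        # In a real engine the journal is persisted to durable storage.
        self.journal = journal if journal is not None else []
        self._step = 0
    def call_activity(self, fn, *args):
        if self._step < len(self.journal):
            result = self.journal[self._step]  # replay: no side effect
        else:
            result = fn(*args)                 # first time: actually run it
            self.journal.append(result)        # record the outcome
        self._step += 1
        return result

calls = []  # tracks real side effects, to show replay skips them
def charge_card(amount):
    calls.append(("charge", amount))
    return "charged"
def send_email(to):
    calls.append(("email", to))
    return "sent"

def order_workflow(wf):
    a = wf.call_activity(charge_card, 42)
    b = wf.call_activity(send_email, "a@b.com")
    return a, b

wf = DurableWorkflow()
order_workflow(wf)  # first run: both activities execute for real

# "Machine dies"; a new instance loads the persisted journal and replays.
recovered = DurableWorkflow(journal=wf.journal)
result = order_workflow(recovered)  # no activity re-executes
```

Note that neither `charge_card` nor `send_email` runs a second time during recovery, which is exactly the "carries on where it left off" property of the saga pattern.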

Ben (13:17.263)
And with that stateful layer, you don't have to do like the sledgehammer approach, which is replaying the entire queue of things that I was about to call. Say I have like 37 services involved in this one invocation. We don't have to replay from the start, right? You just say, check your last state, and it just recovers almost instantly.

Mark (13:35.576)
Correct. Yes.

Yeah. Think of it as like all the variables and everything else it was running before, and everything else it did, and all the conclusions from it all, were all done, and it just loads up from where it was before and carries on running. So, you know, that's effectively, you know, what it does. And in fact, that's exactly what it does. It replays all the state and gets back to where it was before. So that's why it's often called durable execution, because it's as if you're running a program and executing, and then it dies, and then it recovers on another machine. Because, you know,

machines die all the time, particularly in the Kubernetes distributed world, where restarts, reboots, cloud providers, you know, pulling plugs on machines, power failures, lightning strikes, whatever you want to call it... power outages in Spain, you know, happen. And, you know, you want the program to recover around these things.

Ben (14:17.263)
Yup.

Mark (14:34.03)
What's super cool about the Dapr one that I like is it's built on this concept of an actor model. I don't know if you're familiar with the actor model. It's a very classic 1970s design pattern that came out when a lot of people were theorizing about actors in computing systems. But the concept of an actor is it's a durable, long running object that lives forever, and it has identity associated with it.

So if you're familiar with like functional programming today, you know, there's a lot of stuff around Lambda and this sort of stuff and GCP functions, and they're all great, you know, if you want to run one little bit of code that runs for 15 minutes and does something. But it doesn't have identity, and it doesn't remember what it was doing, and it doesn't have state associated with it. So Dapr has a super cool usage of this. I'll get into some of where the actor model is useful, but you have this concept of an actor, which is a little piece of code

that has functions associated with it. So you might have a piece of code that says, here is my thermometer code. Let's take a classic example here: an IoT device. They get used. We have a big customer today who runs millions and millions of lighting systems across Portugal, like controls millions of lights in parking lots and buildings and in

stations and everything else like that. And every little light is an actor, and every one of those little lights has a method on it for temperature, and whether it's working or not, and luminosity. And then you can control it, like send messages to it, on and off. And every little actor has its own identity, so there's light number one, two, three, four, five, six, seven, eight, and another one over here, and another over here. And then they create millions of these things, and they send messages between them all. So, you know,

one actor can call another actor, can send messages. I can turn this on and off, and I can ask for its state. And this, you know, this light can be around forever. So that actor model, whether you're having like an IoT device, or whether you have like a gaming session state, or it's a shopping cart with some data, it's a very useful thing. And Dapr effectively can create tens of thousands, hundreds of thousands of these, distribute them all across your environment,

Mark (16:58.678)
and manage each one of those. And you don't think about any of this. You just go, actor one, two, three, create one for me. And it's a super cool model because of what you can represent with it. We have another customer who represents this as a transaction. A request comes in and this is a business transaction. It'll do a transaction, it'll save some state to a database, and that transaction completes, and then it carries on,

Ben (17:13.55)
Hmm.

Mark (17:25.803)
doing tens of millions of transactions, each one of which has state and identity. And that's what workflow is built on.
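A toy version of the lighting example: actors are addressed by identity and activated on demand, each carrying its own private state and methods. This is a sketch of the concept only; Dapr's actual actor SDK distributes these across machines and handles activation, placement, and persistence for you:

```python
class LightActor:
    """An actor: identity plus private state plus methods you invoke
    by sending it messages."""
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.on = False
        self.temperature = 20.0
    def turn_on(self): self.on = True
    def turn_off(self): self.on = False
    def get_state(self):
        return {"id": self.actor_id, "on": self.on, "temp": self.temperature}

class ActorRuntime:
    """Activates an actor the first time its id is used, and returns
    the same instance afterwards; that is what gives actors durable
    identity. (A real runtime spreads these across machines.)"""
    def __init__(self):
        self._actors = {}
    def get(self, actor_id):
        if actor_id not in self._actors:
            self._actors[actor_id] = LightActor(actor_id)  # activate on demand
        return self._actors[actor_id]

runtime = ActorRuntime()
runtime.get("light-7").turn_on()            # address by identity, not location
state = runtime.get("light-7").get_state()  # same actor, state preserved
```

The caller never knows or cares where "light-7" physically lives; it just names it, which is the key difference from a stateless function invocation.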

Ben (17:33.487)
It's actually what Michael, it's what our company's technology is built on as well. Like Spark uses Scala's actors in order to handle that resilient execution. That was the reason why Scala was chosen back in the day is because it has like robust support for actors. So when you're saying like, hey, I have this driver node, which is like the thing that you're running your code on and it's connecting to N number of worker nodes that are within an ephemeral cluster.

Mark (17:38.126)
you

Mark (17:42.567)
it does?

Yeah, yeah, okay.

Yeah, yeah, it is, yeah.

Mark (18:02.028)
Yes.

Ben (18:02.275)
particularly with like how we have it now with serverless, which is we're directly interfacing with Kubernetes pods that can spin up, spin down, like midway through your execution. You have no understanding as a user of like how many physical machines are involved in this. You shouldn't care, right? We manage that for you. But we're using that for like, okay, this actor actually needs to live until its entire execution chain is done, so we have to hold onto that pod.

Mark (18:21.548)
Yes.

Mark (18:31.116)
Yes.

Ben (18:31.329)
And if that pod dies, we can resume from last known good condition. But yeah, that statefulness is super important so that you're not like, could you build Apache Spark with Lambda? Sure, I guess. Would you want to? No.

Mark (18:44.142)
Yes, yeah, I love Lambda. I love Lambda for what it is, which is like attaching a little bit around other things, but don't build your whole app with Lambda. You know, it becomes this kind of craziness. I mean, I love it when you see this picture of like 10,000 Lambda functions all stitched together. You're like, how the hell do you reason about this thing? Like, where do you even start? You know, but, you know, if you go to the AWS world, you know,

Ben (18:56.174)
No.

Mark (19:12.184)
it's a lot of Lambda. Lambda is okay to stitch things around these things, but building your entire architecture with Lambda? No. I don't think you should ever go there, in my opinion. Bad idea.

Ben (19:24.959)
I've actually seen companies deploy microservices like that where like a couple of people get together in a room and you're like, we should go full statelessness. Like just go full microservices. And they go off into a room for three to six months and they take their first like, let's take just these two parts of our monolithic deployment and break those up into microservices. And they get those kind of working. And then

Mark (19:26.254)
Yeah.

Ben (19:52.277)
As time goes on, every new month is like, bring this service on. You go and walk by that room and look in and everybody looks like they've just endured some sort of like natural disaster.

Mark (20:03.264)
Yeah. Well, microservices can get a bad rap. It gets a bad rap for the very reason that, because you split this all up, you know, you then have to deal with all these problems, and then it gets a bad rap. And I think there's also massive overuse of the term. You know, is a Lambda a microservice? Yes, it is, you know, but, you know, it's a kind of a stateless example of them around these things. You know, you have to combine Lambda, in the case of AWS, with Step Functions if you want to do something

Ben (20:13.965)
Yeah.

Mark (20:33.166)
reasonable, because Step Functions is the durable execution thing that they have. But, you know, those are the things. But yes, the world of microservices can get very touchy just because you have to deal with these problems. But that's where Dapr comes in, and it eases all those design patterns for you. Whether you're just doing messaging, whether you're managing a secret, the actor pattern is sort of layered on top of all of that. It has APIs as well now.

We added an API in the last release called the conversation API. We saw a lot of people coming to us and saying, hey, look, I want to talk to this language model and this language model and this language model, and things like this. And they all had their different APIs in different ways. So we put an API in called conversation. And behind all of this, and this is probably where I should get to, this is going back to my conversation earlier about the abstraction from the underlying infrastructure.

Dapr has this concept of a component, where you plug in the underlying infrastructure or the underlying backing thing. So in the case of the conversation API, you can plug in a component for OpenAI or Anthropic or DeepSeek or AWS Bedrock or any of those other ones around those things. But your API for the conversation remains the same.

Ben (21:55.503)
Hmm.

Mark (21:59.598)
And so Dapr does all the translation from just your API call to that underlying language provider in this case. Or, going back to the pub/sub one, you know, you can have the pub/sub API, but you can swap out Kafka or RabbitMQ or any of these other ones. And this gives you that separation of concerns. It actually takes you into the world of platform engineering, if we ever want to go down that angle. Because what also has happened in the industry is everyone's got very, very excited

that they have more than one application now in a company, and that they should have an application platform team that provides, you know, consistent services to all those application teams. But what usually happens is the platform team goes, just use Kafka, and they bake it all in, and then things go horribly wrong when one team wants Kafka, another one wants RabbitMQ, another one wants MQTT, and they're all fighting for it, and the platform team's going like, whoa, stop, stop, I can't give you all of these.

Ben (22:55.587)
Hahaha

Mark (22:58.482)
And so Dapr solves that problem, where you have a clear interface contract between what the application team wants for a pub/sub API and what the platform team can serve up, which is, we can give you these things. And it doesn't matter: the application team just uses the pub/sub API, and you can swap out these underlying components and say, oh, you want to switch to Kafka? Yep, we're done within 10 minutes, off you go. Oh, you want to switch to that. And we did the same with language models.

We saw a lot of people using language models in their code with a conversation API, and you can swap out these backing language models between them all, whatever you want to use. So you have this mix and match in the background with the code independence and the language independence around things.
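The component model Mark describes, one stable API in front of swappable providers, looks roughly like this. All class and method names here are illustrative (this is not the actual Dapr conversation API or component contract), and a stub model stands in for a real provider so the sketch runs without network calls or API keys:

```python
class LLMComponent:
    """Backend adapter contract; a real component would wrap OpenAI,
    Anthropic, Bedrock, etc."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class EchoModel(LLMComponent):
    """Stub backend that just tags the prompt with its provider name."""
    def __init__(self, name):
        self.name = name
    def complete(self, prompt):
        return f"[{self.name}] {prompt}"

class ConversationAPI:
    """One stable converse() call for the application; which provider
    answers is a configuration choice, not a code change."""
    def __init__(self, component: LLMComponent):
        self._component = component
    def converse(self, prompt):
        return self._component.complete(prompt)

api = ConversationAPI(EchoModel("provider-a"))  # swap the component, keep the call
reply = api.converse("hello")
```

Swapping `EchoModel("provider-a")` for a different component changes which model answers without touching any code that calls `converse()`, which is the same separation of concerns as the pub/sub example.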

Michael Berk (23:44.69)
Mark, I have a question. So you have a ton of experience building these systems. And when you develop these systems, you need to figure out what is going to be the standard interface that users call. And Ben and I have chatted about API design and figuring out, once it's out there in the world, people start building with it. You can't really roll it back because then it breaks stuff. How do you think about?

Mark (23:46.126)
Yes.

Mark (24:10.541)
Yes.

Michael Berk (24:14.398)
from a philosophical and then also a tangible perspective for software. How do you think about building the MVP so that users get the value that they need?

Mark (24:23.846)
Well, in terms of when you say the MVP and what they need... you know, I mean, I fundamentally believe that, you know, once you build an API, you've got to maintain that API. And you have to go through a testing phase. I mean, if you look at an API in Dapr, it goes through these alpha, beta stages. It gets lots of feedback and gets to a point where it becomes a stable API. And at that point it then becomes a non-breaking API. It's fixed at that point,

and then we have a versioning scheme over that, so we version those things. So, I mean, I think the MVP is that, you know, you've got to test out that API with a set of users first. You've got to get enough feedback. You've got to feel confident that it solves a real problem around these things. You know, Dapr has, you know, alpha APIs in it at the moment. In fact, the conversation API is in alpha right now. It's a very simple API. There's not a lot to it. It just says:

here's a prompt, you send it, and it responds. And actually there's a couple of other features you can do on it. You can do prompt caching and you can actually do data optimization, but that's pretty straightforward. So, I think you have to have a term, a time, of alpha or beta feedback. You have to get to stability, but you have to maintain it. I mean, Dapr has had no breaking changes in its APIs since day one, and it maintains that, in fact.

Backward compatibility is a key thing for an API. If you keep breaking APIs and breaking people's code, they don't like it. So you have to do good testing, you certainly maintain backward compatibility, and you have a good versioning scheme that you think about and that you stick to, if you want to do correct API design.

Ben (26:15.447)
So coming from your background, everything you just said effectively distilled software engineering best practices and product development best practices, and I am 100% on board with everything you said. When you come from that background and then you look at the rapid pace of evolution that's happening in the GenAI space, this new tech bubble that's been going on for the last three years, what are your candid thoughts

of how people are moving fast and breaking stuff. And what do you think is the forecast for this within the next five years?

Mark (26:52.216)
Well, I like to break the AI space into kind of three areas. One is the whole models side, and let's just break that out. First, there's machine learning generally, and then there's language models specifically. Because there's plenty of non-language models out there, vision models and all sorts of other things, and we sort of forget those are there, and they're incredibly useful. I mean, you know, imagine writing some procedural code to solve vision

detection, as we used to try to do. You know, it wasn't possible. You know, we tried it for 30 years and never got it right. And now vision models, you know, have solved the problem. We don't know why, but they do. And then language models: you see a lot of acceleration in the language models in terms of their capabilities. And you see a lot of these language models now fighting over, you know, the length of their reasoning time inside them, or how long it takes to reason, how long it takes them to think,

and the accuracy around these things. And I think if you plot some of these charts... there's a chart I saw the other day that was a bit like a Moore's law equivalence in the language model world, in terms of the rate of understanding and interpretation of these things, which was going at about, I think it was like a seven or eight month schedule for doubling what they're able to reason about at their level of intelligence, which was pretty phenomenal. So I watch all of that and I think that's great. But the thing that interests me more than anything,

I think, is the systems engineering that's going on around using those models, which is really where I'm getting to agentic AI systems, and how you're using those in a way to build systems and combine those models together with procedural code. I mean, I know there's a lot of hype around the agentic things, but it is quite phenomenal when you can see an agent where you can,

for example... we did a demo the other day where we built an MCP server using Dapr. And it allows you to effectively query through the Dapr API; it'll translate a call through the MCP server. So you could point it at a database, and you could type an English statement: tell me all of

Mark (29:17.474)
the customers in this database that had their orders in the last week. And, you know, you hand this to the language model, and it gives you back a SQL query, perfectly. And so there's an agent that does a SQL query translation and does that all for you. So the systems engineering of, you know, having this MCP protocol coupled with language models that can take English language and interpret it into, you know, code, effectively... we'll see a lot more of this code generation on the fly,

Ben (29:25.231)
Mm-hmm.

Mark (29:47.33)
whether it's SQL code or running code, and stitching them together to achieve tasks dynamically, I think is incredibly interesting. So, maybe trying to answer your question about that move fast and break things side: a lot of people are trying to figure out how these agents run and what they look like. But I think a lot of the fundamental engineering there is probably very short-lived, in the sense that, you know, they're not very

deeply designed, and engineering principles aren't in place.

Ben (30:20.621)
Yeah, when you look at something like, I don't mean to dunk on any of these framework libraries, but let's just choose one of the most famous ones, LangChain, right? And you look at crafting an agent with that framework, the APIs, they're evolving over time. And what you were using seven versions ago is definitely probably not going to work in the latest release. So that's frustrating, of course, for people. But when you're talking about

Mark (30:43.917)
Yes.

Ben (30:48.675)
deploying a service for this, you have to basically wrap that instruction set, that code, that script where you're defining this agent's behavior, and deploy it somewhere. Put it into a container and deploy it somewhere on the cloud. But when we're talking about interfacing with a bunch of different services that need to support large scale infrastructure,

Mark (31:03.596)
Yes.

Ben (31:16.183)
it becomes very challenging. Like, I've talked to customers that are trying to do this, and they've had to move away from those, you know, demo focused libraries. Because they're like, well, in order to get this thing to scale, I have to deploy this to Kubernetes and spin up a hundred pods. And it's a copy of all of these different, you know, air-quote model definitions, like my application. And they're like, well, depending on what burst traffic is, I'm actually overloading my database here.

Mark (31:22.381)
Yes.

Mark (31:32.109)
Yes.

Ben (31:44.939)
It's actually just sending effectively duplicated data over and over to this thing, and it's crashing. So now we have to think about sharding out our database in order to support this. It seems like something like Dapr says: no, no, no, this is how you do that. You just need that one copy of this code, and we'll be able to spin up these services to handle this massive infrastructure load.

Mark (32:07.138)
Yeah, yeah. I mean, going back to your question, I do think there's a lot of this throw-together agent that's come from the chatbot world: it's all good, the agent is a chatbot, and doesn't this thing work? And that's very different from building a long-running, durable agent that has correct messaging semantics between services. It has durability. It has...

the ability to recover its state, and it has some hard principles of being able to run in a cloud native environment as well. If you take many of these frameworks and try to run them on Kubernetes, on a distributed systems platform, there's no bearing on this. They run great on your local machine, or you deploy them to a VM and they do some things like that, but they're not cloud native in any way. I'd also argue that they haven't actually built in

protocols that conform to any standards. For example, there's a standard called CloudEvents. That is a very well-defined standard around these things. I don't think they integrate deeply with existing message brokers. I don't think they've had to think about how they scale on demand or scale down, because, you know, you're up and down. And I don't think they've built on

kind of a long-running, durable, going back to the durable workflow execution story around these things. And I think these are all problems that they just hope will resolve themselves at some point. I mean, you could argue that, yes, they'll get there eventually, but I think they'll get there in a very hacky way, and you'll end up with some frameworks that have a lot of things bolted on all around them. In Dapr, by the way, we have done a thing called Dapr Agents,

which I think is super cool myself, and it's layered on top of workflows. It is using all of Dapr's distributed systems principles, particularly workflow, to create long-running, durable agents. And so we have this agentic framework. You give it a goal. You say, you're an agent of this type, like you're a PR-reviewing agent. You can review code PRs. Here's your tools that you can use.

Mark (34:29.346)
Yeah, you can call onto this GitHub repository, and you can use this language model around these things. And then you can give it a prompt and say, here's some code that you can review. And you can create an agent that way. It's a higher-level agent concept. And then it uses durable execution to run the workflow and send messaging, and the same if you're doing multi-agent design and things like this. So I certainly think Dapr Agents

fits a lot better, being built on those distributed systems principles.
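The durable-execution idea Mark describes can be illustrated in plain Python: each step's result is checkpointed before the workflow moves on, so a replay after a crash skips completed work instead of redoing it. This is a conceptual sketch, not the Dapr SDK; the in-memory `checkpoints` dict stands in for a durable state store:

```python
# Minimal illustration of durable execution: step results are checkpointed,
# so re-running the workflow after a failure resumes where it left off.

checkpoints: dict = {}  # stands in for a durable state store

def durable_step(name, fn):
    if name in checkpoints:          # already completed in a prior run: replay
        return checkpoints[name]
    result = fn()
    checkpoints[name] = result       # persist before moving on
    return result

calls = []  # records which steps actually executed

def workflow():
    a = durable_step("fetch", lambda: calls.append("fetch") or 10)
    b = durable_step("enrich", lambda: calls.append("enrich") or a + 5)
    return b

first = workflow()
second = workflow()   # simulated restart: both steps are replayed from state
print(first, second, calls)  # → 15 15 ['fetch', 'enrich']
```

The second run produces the same result without re-executing either step, which is the property that lets a long-running agent survive process restarts.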

Ben (35:05.059)
Yeah, what you just described actually sounds like Copilot, at that level of infrastructure. I know that we have bots set up on several of our repos where you can just basically ask for a PR review from Copilot. We don't typically use them, but we like the suggestions, and sometimes the style of suggestions. It's like, okay, cool.

Mark (35:08.814)
I

Mark (35:27.854)
Yeah. Yes. The challenge with those PR-review bots and things is the underlying language model, and what you do as you switch between this one and this one and this one. And then I still think the biggest challenge is that, you know, it's the data in the end. In terms of what they're doing, I still feel it's a tough time for those Java developers, because there's so much legacy Java code

that it got trained on, you know, that it still comes back with reviews of Java code from, you know, three Java versions ago. But anyway, that's a whole topic in itself. But if we do take these agent designs, though, an agent that can help you, there are consumer ones as well, in terms of, you know, help me with a travel booking agent. I can imagine ones will come along, or things that will actually

be sitting there in the background, checking systems conformance around these things. And, you know, the dynamic nature of it, or, quite often, I think there'll be one that will just be looking across your infrastructure and seeing if it's compliant in some way, you know, checking your Terraform files and seeing if they have any issues inside them all, and handing that back to a language model.

So you can imagine there's a security-checking agent or compliance agent that's taking your Terraform files on a regular basis, sending them to those language models, and giving back suggestions around these things. I mean, I guess I'm picking developer scenarios here, but any world where you want some sort of dynamic checking on the fly, something thinking on your behalf about things a human would probably tend to do, and then giving suggestions, is open to this. And so that's why it's going to...

fundamentally change so many businesses.

Ben (37:25.839)
Yeah, I mean, when I talk to customers of ours who are looking to solve some of these problems, their big reason for choosing the services we provide, because our backend is very similar to what your company provides, is like, hey, it's this resilient layer where you have observability baked into everything that's going on.

So there's metrics and telemetry about everything, so you can really see what's going on. Versus what they initially tried to do, which is like, we'll do a DIY, we'll do a hackathon, we'll try to get this thing working. And it just blows up in their face, because they weren't aware of the complexity of doing it in the proper engineering way. I think organizations that are

Mark (38:13.676)
Yes.

Ben (38:23.769)
taking the fundamental principles of the best way of doing something, and then adding on, through their existing framework, the ability for people to easily do it the right way, that is really exciting. That's why I'm super stoked to see the future of this project.

Mark (38:40.62)
Yeah. And I think you're going to see a lot of, I mean, the emergence of the agent-to-agent protocol that came out of Google, which has just been endorsed by Microsoft, in combination with MCP servers now, which, by the way, are still broken in terms of authentication and authorization. There's a whole big hole there, and it feels like Swagger and WSDL all over again, but, you know, that's a different topic. We love to create new protocols in this industry.

But MCP has gained traction, and I do think it's super cool that you're going to put these things together. I think you have to be careful of the multi-agent world, in the sense of every agent knowing about everything and talking to everything, but I think it's creating more automation. But yes, going back to Dapr and what we do: Dapr is a set of APIs that help you build these distributed systems, whether you're doing a deterministic system with a workflow.

Ben (39:08.151)
Yes.

Mark (39:36.248)
You're just doing simple communication with pub/sub. You're doing some secrets management around these things. And the point here is: don't reinvent the wheel. Take this amazing platform, go and build your applications with it, and be productive, particularly running on Kubernetes. And, you know, the fact that we have this concept of Actors, if you're building sort of IoT devices, or an agent, if you're doing an automation where you want to bring in a language model,

these are just artifacts of those types of distributed systems patterns. And I think that's really cool. What's great about it is that it's an open source project, so anyone can come in and jump in, and you can contribute, whether that's using some of the samples around these things, actually contributing to the core part of the project itself, or building what we see as just new components to plug into infrastructure. And where we, Diagrid, come into this is that we're core maintainers of the project.

Today, we're the main drivers of it. We love to work with the community; we love hearing people's stories. We did a survey earlier in the year where we interviewed about 250 Dapr engineers, and consistently 95% of them said that Dapr had reduced their time to market for getting their application into production by 30 to 40%, because they didn't have to reinvent the patterns.

And so that's where, you know, you really should check out Dapr as a platform play. Particularly if you're a platform engineering team, I would say that Dapr fits in perfectly, as well as for an application design team. And then we, Diagrid, come into it where we provide commercial support around this.

We also provide a platform ourselves, called Catalyst, where we just take Dapr and host it for you, on your behalf, so you don't have to host and run it yourself. You can just use the APIs. That way, whether you want to use the Workflow API or the Pub/Sub API, you just use it in a similar way that you use other platforms. And we provide that both as a service and as a standalone deployment into your infrastructure.
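For a flavor of the API shape Mark is describing, Dapr's building blocks are exposed as plain HTTP endpoints by a sidecar process. The sketch below only constructs the requests rather than sending them, so it runs anywhere; the port, pubsub name, store name, and topic are assumptions for illustration:

```python
import json

# Dapr sidecars default to HTTP on localhost:3500 (assumed here).
DAPR_BASE = "http://localhost:3500/v1.0"

def publish_request(pubsub: str, topic: str, data: dict):
    # Pub/sub building block: POST /v1.0/publish/{pubsub}/{topic}
    return ("POST", f"{DAPR_BASE}/publish/{pubsub}/{topic}", json.dumps(data))

def save_state_request(store: str, key: str, value):
    # State building block: POST /v1.0/state/{store} with key/value pairs
    return ("POST", f"{DAPR_BASE}/state/{store}",
            json.dumps([{"key": key, "value": value}]))

method, url, body = publish_request("pubsub", "orders", {"id": 1})
print(method, url)
```

The point of this shape, as Mark notes, is that the application talks one local HTTP (or gRPC) API while the sidecar handles whichever broker or store is plugged in underneath.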

Mark (42:00.428)
That's where we come into this world.

Michael Berk (42:04.1)
So, a question as it relates to AI. This microservice facilitation makes a lot of sense, and I'm curious: what are the bets and trends in the industry that you're expecting? To go back to a prior topic, we were discussing how the industry is sort of moving fast and breaking things, and stuff is not always done to standard, and we're going to have to pay for that later, whether

via bloated features in all these open source repositories, or because someone re-architects and actually gets it right the first time. And Dapr would fall very nicely into either of those trends. But I'm curious, what are the other trends that you're seeing that you think will propel this to be a successful company?

Mark (42:48.782)
You mean within Diagrid? Oh, yes. Well, I mean, there's kind of no doubt now that the distributed microservices architecture is the de facto architecture to build your applications with, in many ways. I think there's just a continued adoption of that, um, that we see, and, you know, that's one trend: you're just continuing to

Michael Berk (42:51.39)
Specifically within AI ideally, but

Mark (43:19.578)
educate developers so that they can use that. I think I would go back to the workflow example inside that again. In terms of the AI side of things, the agentic systems are also driving things a lot and bringing in language models, and I expect we'll bring in other kinds of models as part of all of that. And I think you'll see a continued rise of dynamic applications combining any form of model

with procedural code, a combination of those two. The tooling has certainly accelerated, and you'll see different types of tooling accelerate. One thing that we're working on that I think is really interesting: I think there's an entire world of using AI to help you generate your entire service itself from architecture diagrams. In fact, we've just released,

just this week I think, something called Workflow Composer. It's super cool. You can draw a state machine diagram of a workflow on a piece of paper, on a napkin, on a whiteboard, use any modeling tool you want, but draw your workflow out, and then you give it to our Workflow Composer service. It will look at that picture, look at that diagram, give it to a language model, and it will generate the code.

So say you've drawn a workflow with like 20 steps inside it, and some of them go in parallel and some come back together again. It'll then generate all the workflow backend code for you, in a project that you can run. If you go to workflow.diagrid.io, you'll see it take a diagram and generate all this code for you. So in an instant, this language model has interpreted a diagram that I drew and sketched on a whiteboard with my engineering team and a business person, and created all my workflow code for me.
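One way to imagine the artifact a tool like Workflow Composer might emit is a declarative description of the drawn diagram that a small engine can execute: sequential steps in order, with parallel branches that fan out and re-join. Everything below (the step names, the data shape, the tiny engine) is a made-up illustration, not the tool's actual output format:

```python
# A drawn workflow as data: each entry is either a single step, or a list
# of steps the diagram showed running in parallel before re-joining.

workflow = ["validate", ["charge_card", "reserve_stock"], "confirm"]

def run(spec):
    log = []
    for stage in spec:
        if isinstance(stage, list):    # parallel branch in the diagram
            log.extend(sorted(stage))  # order within the branch is arbitrary
        else:                          # ordinary sequential step
            log.append(stage)
    return log

print(run(workflow))
```

The value of generating this kind of structure from a picture is that the diagram stays the source of truth: redraw a box, regenerate, and the execution order follows.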

I think that is super cool. Let's take that a step further. You see a lot of people creating AI code today for web applications, building things, this whole vibe coding, whether you want to go down that route or not, and building apps fast and all, but they tend to be web apps. I think there's a whole world of how you generate a fully fledged backend POC

Mark (45:43.822)
system, where you can draw an architecture diagram and have it well-architected, like the well-architected diagrams that come out of AWS and things like this, and generate a POC from that, using a language model that generates all of the dependencies, the code, the running artifacts, and gets you 80% of the way there: a running system, from diagram to code. I think that's a trend that's going to pick up,

because no one has actually solved, in any way so far, generating backend solutions from architecture diagrams in a way that allows you to accelerate like this. And I think you're going to see a rise of developers being more kind of designers and architects around these things. They give it to language models to help generate the initial code, and even start to go back to the diagram and say, just change this thing here,

and some of the code gets changed for you at some point. Saying that, I don't want to go too far down the vibe coding thing, because I think that, you know, if you don't understand your code... who was it, the person who did thermodynamics? Lord Kelvin. As Kelvin said, you can't affect what you don't understand.

Ben (47:06.575)
Hmm.

Mark (47:09.73)
Yes. So I see this in vibe coding: if you can't affect what you don't understand, if you don't understand that code, then I don't know how much security stuff you have inside there that's going to kill you later on. But anyway, I do think that diagram-to-code generation can greatly accelerate backend service systems around these things. This is something that we're very focused on, and I think it's something that will continue as a trend in the industry.

Ben (47:10.873)
Yes.

Ben (47:38.275)
I mean, I for one will use it whenever it's mature enough. Not unlike the way I currently use GenAI, and I use it every day: it's for that first prototype. Like, I had to do a bunch of front-end stuff over the last month, and before I even think about filing a PR, or even starting on a PR, I'll open up a completely different scratch pad,

Mark (47:55.426)
Yes. Yeah.

Ben (48:07.535)
or sometimes I'll create a branch in that local repo. I'm a big fan of Claude 3.7 now, so I give it as much context as it needs. Like, hey, here's all the modules that are related to this particular component, and also here's a copy of my design spec, here's what I can use in my design system. I do not want to reinvent the wheel here, so use these as needed. And I need this, this, you know, webpage to do the following 37 things.

Mark (48:30.818)
Yes.

Ben (48:37.889)
And in no way, shape, or form does the code that it generates relate to the final product. But that just saved me an entire day of hammering out a bunch of boilerplate that I don't want to write. And it allows me to see: am I thinking about this design slightly wrong? And then I'll just have that in the window, in another monitor, while I'm actually doing the implementation.

Mark (48:50.35)
That's right. Yeah.

Mark (48:55.874)
Yes.

Mark (49:03.121)
Yeah, exactly. So imagine the world you're going to: you've got a drawing tool in front of you, or even just a piece of paper. And you're like, well, I've got service A and service B and service C here, and I want this one to call this one over here and coordinate these things. And, you know, this one does messaging and this stuff, and this one's called service ABC, so they've got actual names. Then it's: okay, build this thing for me right now. Generate me the code.

Boom, off I go, deployed onto Kubernetes or onto my local machine where I can test it all. Okay, push this thing onto Kubernetes or wherever it's running. And it is literally your picture, and it's got the names and everything else. I mean, imagine that, that would be amazing. And then you're like, well, okay, let me just erase this line here, scribble it out, let me draw another box down here, draw a line here. Okay, just add that into it all. Boom, there it is, and you're up and running with all those names. It just saves you hours. I mean,

that, I think, is the way this should be done. And now you're testing designs and things like this. My analogy is this: if I go back to my engineering days, when I was at college, I did electronic engineering. And it was a very time-consuming task, because you had to design your circuit on a piece of paper, you had to print a circuit board, you had to solder everything onto the circuit board. I then had to, you know,

Ben (50:01.043)
yeah.

Mark (50:28.27)
hook it all up, and invariably something was wrong, and then you'd have to print a new circuit board and things like this, and buy all your components. Fifteen years after I left college, they had circuit simulators on your computer, and I could go to a circuit simulator on my computer and simulate it. Well, imagine now that your diagram is basically your circuit simulator, but you're generating the live code there and then, and it's running, and you can see it, and you can change it all. It's the circuit simulator concept,

Ben (50:41.187)
Yes.

Ben (50:51.289)
Mm-hmm.

Mark (50:57.548)
as opposed to you hand crafting it all and putting it all together yourself and doing a lot of

Ben (51:03.191)
Or just being able to pass in a whiteboard representation of my data model, like, here's my contract between these two services, can you please write all this RPC code for me, and actually have it be correct? Man, that would save person-years of effort at a company.
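Ben's idea, generating the RPC plumbing from a declared contract, can be sketched as code generation over a simple data model. The contract shape, the `OrdersClient` name, and the `_call` hook in the emitted stub are all illustrative assumptions, not any real framework's format:

```python
# Generate client-stub source code from a service contract described as data.

contract = {
    "service": "Orders",
    "methods": {"get_order": ["order_id"], "list_orders": ["customer_id"]},
}

def generate_stub(spec: dict) -> str:
    # Emit one client class with one method per contract entry; each method
    # delegates to a transport hook (_call) that a real runtime would supply.
    lines = [f"class {spec['service']}Client:"]
    for name, params in spec["methods"].items():
        args = ", ".join(["self"] + params)
        lines.append(f"    def {name}({args}):")
        lines.append(f"        return self._call({name!r}, locals())")
    return "\n".join(lines)

source = generate_stub(contract)
print(source)
```

Because the contract is data, both sides of the wire (client stubs, server handlers, even docs) can be regenerated from the same whiteboard-derived source of truth.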

Mark (51:14.604)
Yeah, exactly.

Yes. Well, that's what I think is the direction we're going, and we've started it with workflow: you can take a workflow diagram and generate that workflow code. I think you can do that for services too. Draw service A, service B, and just generate me the RPC code between these two. A, B, boom, done, and I'm off to go test it all. That, I think, is something that is going to be very inspirational there. So yes, that's, I think,

one of the directions, to answer your question, Michael; who else knows? I think the agentic thing, going back to that, is going to be transformational as well. I think the language models will continue to get more and more capable, more knowledgeable. I mean, great, they get more powerful, they get better each week, slightly better at these things. It's a trend; we want that to happen.

Ben (51:51.663)
you

Michael Berk (51:53.106)
Yeah.

Mark (52:16.888)
But to me, that's not the most exciting thing. It's the systems around them that are being built, that are using those models, and how they're going to use them, that I think is the most exciting thing happening here. And the crazy usage that you're going to get from it all. Is it going to happen that, inside a factory now, there's going to be a model making decisions about what it does as part of the manufacturing process? I mean, you already have...

vision models in factories detecting flaws in manufacturing, in order to replace people who literally looked at, particularly in car manufacturing, you know, they'd look at and inspect the whole car. Now, when a car comes off a line, instead of a bunch of people looking at it, there are literally 50 cameras all around it, everything from x-ray cameras looking at the engine to visual cameras looking at the outside, that just inspect the entire car before it leaves the factory.

Yeah, but I mean, will there be something deciding now more intelligently and dynamically? I think there will be, yes. So we'll see.

Michael Berk (53:24.38)
Yeah, it's so interesting: when gold is discovered, there are all these industry forces that basically work to support mining of that gold, or crude oil, or in this case AI, and microservices are a very logical supporting piece of infrastructure. So it makes sense.

Mark (53:41.578)
Exactly. Yeah. I mean, getting back to the fundamentals of this, what we're talking about here is that Dapr is a runtime and a project for building these distributed applications, of which agentic systems are distributed applications with smarts. And, you know, I am very excited by this direction. I think a lot of the other frameworks out there for agentic systems have a lot to learn in building real distributed paradigms into them all. I think they'll

capture some level of the market, because it's a big market. I mean, certainly the LangChains and the Semantic Kernels have taken a big slice of the attention. It's just whether they can evolve to handle some of these really hard, or let's just say more scalable, enterprise-scale applications that take them into that space,

and particularly the cloud native space. I think they have a long, long, long way to go in order to do that yet.

Ben (54:48.409)
Yes, they do.

Mark (54:50.158)
So, yeah, there's a lot happening. But, you know, beyond the topic we jumped on today, I think other futures inside the Dapr project as a whole are that we'll continue to invest in new APIs. One of the areas that we still need to put a bit more time into is data access, which is pretty close to your heart. You know, today Dapr has a sort of key-value access. We still...

Michael Berk (54:54.772)
Cool. Yeah.

Mark (55:18.348)
don't have a sort of direct SQL access; we tell people to just use SQL for that, and you still have to bind to the SDK of your choice. But we're putting in sort of blob data access and a document access as well, so you can just pull back documents via an API. It's a general-purpose document API that can sit over any underlying store. Those are things that are coming; it's just a matter of getting to them in time. But that's why I strongly encourage, and we have, community, community, community. We love community.

You know, you can jump in here. It's a CNCF project. The more the merrier.

Michael Berk (55:53.648)
Yeah, so I think that's a great transition. We're about at time. If people want to learn more about you, Mark, or Dapr or Diagrid, where should they go?

Mark (56:02.434)
Yeah, so first off, go to dapr.io; it's just the website, and that's a good place to start. But, you know, that rapidly points you to the Dapr Discord channel. Join the Dapr Discord channel, come in there, say hi. And then I would say, if you want to learn Dapr, go to our university courses that we do at Diagrid: if you search for Dapr University, you'll find it. There's a Dapr 101 course.

We just released a Dapr workflow course, and hopefully in the next month we'll have a Dapr Agents course in there too. So that's the best place to get started, to learn and get going. And then, you know, once that piques your interest, go to diagrid.io; we are the commercial arm of Dapr, effectively. And we can help you design your system, support you in production, install Catalyst,

which is our Dapr platform, into your environment, and take you on anything from your POC all the way through to production and greatly accelerate your project. Our engineering team is your engineering team, as we like to say. And we've been very successful with enterprise: financial, manufacturing, retail, airlines. And that's our...

our core tenet: we know distributed systems well, we know mission-critical applications well; these things literally run the business. Dapr is running in 40,000 companies today that we track. Yeah, come and find us, and we can make you successful building that cloud native microservices architecture using Dapr,

on any platform of your choice. And hopefully you'll also build some of that agentic stuff as well, if you want to go down that path. We're hoping to do a bake-off between Dapr Agents and other frameworks, performance-wise, because I actually think that Dapr Agents, though we haven't proven this yet, is one of the most efficient and cost-effective,

Michael Berk (58:07.336)
Yeah.

Mark (58:28.95)
as well as resilient platforms out there. But we've yet to test our theory. But yes, come and find us at Diagrid. We're more than happy to help you on your journey, either as a platform team or as an application team. Deliver great tech and build great mission critical applications.

Michael Berk (58:50.831)
Amazing. Cool. So I'll quickly summarize some additional points that I heard on this episode. This has been a fascinating conversation. In short, Dapr will help you build better microservices, faster, so please check it out, as Mark alluded to. A few high-level points that stood out: the AI industry is moving fast at the cost of durability, resiliency, et cetera. We're not always following engineering best practices, and this will have two

core outcomes. The first is that people are going to have to redesign things from scratch; the second is that we're going to bolt a lot of extra features onto these open source resources, and there's going to be bloat. So this is a great opportunity where Dapr is looking to redesign from the beginning and do it right the first time. And then throughout this rush of AI, microservices are going to be powering a ton of stuff, so it's a great industry to be in, and it's a great thing to have figured out from the outset of your project.

So yeah, until next time, it's been Michael Berk and my co-host, and have a good day, everyone.

Ben (59:50.223)
Ben Wilson. We'll catch you next time.
