Transforming a 25-year logistics incumbent into an AI-native platform.
David Walters is the Director of Product at CXT Software, a 25-year incumbent in last-mile logistics. After Ionic Partners acquired the company, it tasked him with a specific mission: transform CXT into an AI-native leader for the next 25 years. CXT builds the shipment management software that courier and delivery businesses use to automate dispatching, optimize routes, and track packages in real time across regulated industries like healthcare, pharmaceuticals, and auto parts, where a missed delivery window can mean a wasted organ or a factory line shut down.
Before CXT, David worked at a DOD-funded startup building autonomous systems for U.S. intelligence agencies — greenfield work with a blank slate. Moving to a company with 25 years of infrastructure was a deliberate shift: he wanted the challenge of modernizing a system with deep operational data but aging architecture. His job now is to take a transactional, database-heavy platform and rebuild it as an event-driven autonomous system that can make real-time decisions — like rerouting a hundred orders when a truck breaks down on a highway, while respecting the service windows on a blood specimen pickup.
David has been thinking hard about how AI is reshaping the product function. He sees the role of product managers shifting upstream toward strategy and intent, while engineers are moving downstream toward code curation rather than code creation. His team runs a weekly Discovery Workshop to stay current with the pace of change — evaluating new tools, testing workflows, and coaching engineers through the transition from writing code to reviewing and curating AI-generated output.
His sharpest observation: the bottleneck in software development has moved from writing code to reviewing it. With engineers running ten parallel agents, organizations can now accumulate technical debt at an exponential rate. The engineers who thrive aren't the ones who can type fast — they're the ones who want to own the outcome and understand the business problem behind the code.
All right, we're having the same conversation that we basically had last week, although I would say that one of the major changes is that Claude Mythos came out. Is it hype? Is this PR, or is this the thing that discovered 14,000 vulnerabilities in all of our software and is going to murder us all?
I think it's a little bit of both.
The foundation labs have a habit of doing this every time a big new model release happens. You saw it with GPT-5, right? OpenAI's like, this is going to absolutely change the world. Entire industries are going to be shut down. And that's not exactly what we saw happen. Anthropic has a similar playbook, except that they always model themselves as the adults in the room, right? They're the responsible custodians. They're going to make sure that it's safe first, but it still has that same hype feel, that same get-excited, this-is-going-to-change-everything sort of framing.
But when you look at the model evals, right, they released a 244-page system card for this model. And when you look at the evals, you'll see that on SWE-bench Verified, for example, the score jumped from 81% to 94%, and it completely saturated Cybench, 100% on that. And then on a couple of other benchmarks you see these significant changes. Again, these foundation labs are grading their own homework, but that does indicate that there is a step change here. And I think that, again, this isn't something that's going to revolutionize every industry and everyone's life, because it was trained to do a specific set of tasks. It's really good at cybersecurity because that's what they were going after, but that's a big deal, right?
So I think the approach here is most likely warranted. They were able to find a number of zero-day exploits in tons of open source software and lots of operating systems. So they put together this coalition, Project Glasswing, of 40 of the top tech companies to get a defender's head start on patching some of those vulnerabilities. So I think that once we start to see which of those vulnerabilities are actually getting patched, and we start to see if it's had the effect that they've claimed, then we can start to make some more inferences on whether this is something that's real or if it's a lot of hype.
They're very good at the PR side of this game, because what they essentially just did was go to all the biggest companies in the world and say, "We just built a thing that will absolutely destroy you. You should pay us not to."
Yep, it's a good playbook, especially when you don't get a large government contract.
That's a good point. Sam Altman got that, but has the instincts to do an interview with Ronan Farrow, who specializes in destroying pedophiles? That's just weird instincts and all that.
Yeah.
All right, so let's go back to some of the questions that we were asking last week, which really come down to that there are all of these new AI tools. Everything is changing very quickly. You're someone who's in the thick of it right now. How have these AI tools kind of affected how you approach your job, running product and building things?
Yeah, there's definitely been a shift both upstream and downstream on the work that's getting done. I feel like it's moved everyone upstream towards strategy and intent.
On the engineering side specifically, it's also moved people downstream to being custodians of the code. Because when you're generating code now in the age of AI, it's incredibly cheap just to have code. But the bar's still really high for high-quality, high-impact code, and the last mile problem still exists. You'll notice that when you're working with these models day in and day out, they're really good at getting you 90% of the way there. But there's still 10% of the code that needs polishing. You need to resolve edge cases. You need to make sure that your user interfaces are pixel perfect, and AI is just not there yet.
So what I'm seeing is that there's a move away from mediocrity. We don't need engineers that can just type a lot of code really fast. That's now all outsourced to the cloud. But what we do need are engineers that want to own the outcome, that want to understand business problems, that want to make sure that the thing that they're delivering actually solves a real need. Because the goal is the outcome for the user, not just getting a PR merged.
How has it changed your day to day?
So for product, it's been a lot of acceleration in areas that used to be areas of drudgery. For example, synthesizing thousands of feedback points from users, from the app store or from user interviews. AI can handle a lot of that. Great for summarization, great for pulling out key details. So now instead of synthesizing transcripts for hours, we're just making sure we ask the right questions during interviews so that they can get pulled out properly by the AI systems. And the same is true for writing specs. When you're writing out a specification for a new feature that you want to build, it used to take hours of, you know, really thinking hard about every edge case, going deep into maybe other parts of the software that you're not familiar with to see how it has downstream impacts on things that might be touched by what you're building. But if you have a really good setup where your AI understands the context of your software and of your business, a lot of that can be done upfront for you, or at least give you a good head start. One of the things that used to be hard for newer PMs was defining good exit criteria. So given some feature that you're specifying, what does the engineer need to test to make sure that this is everything that you asked for?
Some of that requires a bit of a technical mind. Some of it is just product stuff. When you click a button, things should happen, right? But also, are you making sure that this database is modified but not that one? Or, for the APIs connected to that interface that are also used by third parties purely as an API and not as an interface, is all of that data still intact for their use case? And AI is really good at noticing those things and pointing them out. And even if you only get five out of the 30 exit criteria that you need, it's a good starting point to help you think of the rest.
What is it about these new tools that get you excited, like hopeful either for the work you do day to day or society in general?
For the work that I do day to day, for me personally, I'm excited to see PMs becoming more technical. I think we're at a point right now where a lot of the front end work can be done, to be cliche, via vibes, right? So if you have a well-defined set of functionality in your application and all you need to do is a UI rewrite to have a better experience for the users, but the data points are generally the same, you just want to improve the workflows. That's something that can be done almost autonomously by someone in product through prototyping tools. Or, as you build better scaffolding and better guardrails around your Claude Code instance, for example, with it being grounded in your design context and the context of the existing code, you can get really good output from a non-technical person, in code that you can actually implement in your app. A lot of people are still using Lovable and Replit and some of the other prototyping-specific tools, which are great for testing with users, doing brainstorming, proving out ideas right away.
But there's still a gap in going from prototype to reality, especially if your reality is Angular and it has to support Internet Explorer 10 or some weird set of requirements, whereas with the prototyping tool, it's almost always React. It almost always uses Tailwind, right? They're very opinionated because those are things that AI does really well. So there's a gap in moving that from intent to production, where if you have a really good setup in your organization, you can just kind of vibe in production and then work with an engineer to kind of finish that last mile. But the path to get there is greatly reduced. And it's interesting to see the shift for product managers from people that just think about problems to people who also think about systems and data models and things like that.
Yeah, AI is great. An analogy is like, I have a supercar. It's great on the racetrack, but if you're not on a racetrack, it's the worst. Like, try to take it down a dirt road.
What, if anything, about these tools is kind of keeping you up at night right now? We were talking about Mythos before.
I would say the race to make them better with a total disregard for the consequences of what that means to the rest of us. I think that when you look at the current political and business landscape, there's a lot of focus on how can we make these tools better? How can we make our margins better, where can efficiencies be gained, what new things can we build? And those are all awesome, but there are a lot of layoffs happening, right? There are a lot of changes in responsibilities, there are a lot of positions that won't exist in five years at all, right? And there are very few people thinking about that problem. As I've gotten older, I've gotten a lot more empathy for people. When I was younger, I was a very stark capitalist: hey, the people that move the fastest and break the most things are gonna make the most money and they deserve it, right? But there are a lot of people with families that just go to work because they value what they do, but they value their time at home more. And I feel like as we focus on hyper-productivity, which we're gaining with these tools, a lot of those people are starting to get left behind.
Anything to be done about that?
I don't know. That is a problem that's too complex for me to think of a solution to offhand, but...
Let's shift gears a little bit.
And one thing that I'm trying to understand more and more from these conversations is this: [the SDLC was a] strong 30-year-old framework. Now, for everyone working with this new AI and agentic coding and everything that's involved in it, there is no set framework. There are no best practices yet. So did you just build one for your team and that's what you're following? Or are you following someone else's model? How are you building your new version of the SDLC?
So we're doing it iteratively and it changes weekly, if not daily.
We're at a point now where it's almost faster to just have a PM and an engineer sit together and then build a thing live, rather than spend days or weeks thinking up a really good spec. And the days of writing the perfect spec and throwing it over the fence for an engineer to implement in a silo and just decide if the PR is ready to merge or not are completely dead, right? The line is blurring very quickly between the people who are thinking about the product that they wanna ship and the outcomes that they want and the people who are building the thing, right? It's almost the same thing. There are different specialties still involved, right? We talked about the last mile problem, for example. But as that line blurs, I feel like people should be working even closer together.
And I think that the tenets of Agile, the original well-intentioned tenets of Agile, become more important, right? Like communication over documentation. That was a core tenet of Agile that we seem to have completely forgotten as the spec and requirements and everything became so important. But now it's just text. Text generation is cheap with AI, right? But the intent, and getting to the goal together as quickly as possible and as completely as possible, is kind of the new paradigm, I'd say, between product and engineering and building software.
I wonder if that was because the same people that were good at generating the code were not great at communicating, so they doubled down on something else.
Yeah, absolutely. And I think the people that just took pride in being able to solve really hard LeetCode problems, right, they're quickly losing their place to the people who wanna solve really hard business problems.
How do you, as someone in a leadership role running a team, keep them motivated or upskilled? Like, they're not all gonna make it. Are you seeing that in real time, and what are you trying to do about that?
Yeah, and it's hard. So every Friday we have a discovery workshop. We call it Discovery Workshop Weekly, where we just look at all of the advancements that are being made, because we're at a point now where it is weekly, right? We just saw Claude Mythos. The week before that, we saw all of Claude Code's scaffolding code get leaked, right? Google Stitch just completely revamped the way that they think about vibe design, right? It used to be a very Figma-like tool, but now it's kind of this amalgamation of Figma and vibe coding and prototyping tools. So every week we look at those and we try to keep everybody up to date. And the best way to see how it's affecting people is their ability to adapt to these new tools, their willingness to try these tools out, and then what kind of outcomes they get from that. And then, depending on the results, it's kind of just a coaching and mentoring thing from there. Like, "Hey, did you check this out? How did you feel about it? Why doesn't it work for you?" And then we can also use that to bring best practices back into the organization. Like, "Hey, these tools fit our workflow. These are too hard."
One of the things that we noticed with building our own internal knowledge repository for agents to reason about and make good decisions on document or code generation is that it took a lot of discrete pieces of tech being coupled together to make a good system, because things are moving so fast. And we found that that was a barrier to entry for more of our non-technical PMs to adopt, because it required, hey, we need you to have a GitHub account. We need you to have a Claude Code account. We need you to download Obsidian so you can look at the knowledge graph and all the markdown files in an editor. And there's definitely space for new startups to build tools around all these processes that we're doing, but at the same time, the processes are changing pretty rapidly. So there's always the bitter lesson to think about.
I'm gonna ask you a selfish question now, because I can and because you're a product person. I have 70 transcripts from all these interviews that I've done. I want to create essentially a database of all of those transcripts to pull out insights and interesting things, to then generate article briefs or even potential video pieces that I wanna cut, since the time code is all built into something like that. How many discrete pieces of tech would that take? How would you go about building something like that?
It depends. If you want to build it yourself, I would... So you have a database, you need your large language model, and some interface for you to interact with it directly. I think the tools are good enough that you wouldn't need too many things. Even something like ChatGPT Projects would work.
Like Claude and, you know, Claude [unclear], ChatGPT.
Yep, same idea.
NotebookLM would do a good job.
It depends, it depends. There's a lot of good tools for it already. Like we use Dovetail, which is a user interview insights tool. It's built exactly for that. So you just upload your videos from whatever source. It'll do a transcript. It'll identify any of the contacts in the video based on its historical knowledge of what their voice patterns look like. So if I had an interview with you before and I mark you in there as Josh in our catalog of contacts, then the next time I have an interview with you, it will, based on your voice, identify you as Josh for all the transcripts, right? Which is nice for when you have repeat users.
And then from that, it can identify highlights automatically for you. You can choose which categories of highlights you want.
Yeah, we found that.
I'd say that Breakout and Dovetail.
Yeah, Dovetail is an incredible tool for this.
Rather than me rebuilding this with Claude Code, which I've experimented with several times. I'm no engineer, but I'm someone who believes that if they made the thing, great, I'm gonna use the thing.
Yeah.
I don't wanna rebuild something.
Yeah, and that's the path I started thinking down, because I love to build things first, right? But then for exactly that reason, once you get to a certain level of complexity, you're like, "Ah, I see why this has been productized. Let's check out the product." But yeah, I'd recommend Dovetail.
In the product thing, you talk about how you can iterate much, much faster now. Where's the new bottleneck?
The new bottleneck is very much in code review.
There's so much code being generated that organizations now have the ability to accumulate technical debt at an exponential rate. So what we've seen is that previously, it used to take a lot of time to define tickets. What exactly do you want? What is everything that you want? Nothing more, nothing less. That took a lot of time. That's now a lot faster. Writing the code, making sure that it doesn't have bugs, making sure that there are no stylistic red flags for the rest of your development team, making sure it's maintainable: Claude does a great job of that too. But once you have all of that, making sure that that code is congruent with the rest of your code base, making sure that that code has all of the edge cases considered, making sure that it's performant to whatever your standards are, that's still a problem. And people only have so much capacity to read and understand code that they're becoming increasingly detached from.
Previously, when you were developing a feature, you were staring at that code for hours, thinking about it. Doing UI in particular, you spent hours changing the padding on a button because it just didn't quite look right. And then through that process, you gained a really deep understanding of the code. With agents, you have engineers running 10 parallel agents working on different things. And that's awesome, right? They can go off, they can do other things, their agents are taking care of the code generation. But when you come back, you have 10 engineers' and maybe even junior engineers' worth of code that you have to sift through, understand, build a mental model of what it all means, and then judge it. And in a lot of cases, for more junior engineers or weaker engineers, it's writing better code than they would, code that they don't understand. So they just approve it. And that's a problem too.
How are you addressing that?
Well, right now it's a bottleneck for us. That's why it's the first thing that came to mind.
Our engineering group is always iterating on what best practices are and how they can improve what they're doing. One approach is using more AI. So while you have Claude generating all the code that you want to make it into the code base, once the PR is up, we have Devin AI running all of the PR reviews. And what's interesting is that, as good as Claude is at writing code that works, where you go in and it seems functionally correct and it looks good and there are no errors and you pass it along, it takes way longer to get a passing PR with Devin than it does to get working code in the first place, because it'll find inconsistent patterns. It'll say, "Hey, you probably shouldn't have done this here. This is implicit when it should be explicit. You shouldn't have used a magic number here. Just make it a constant." These are all things that certainly you should do, and maybe you would have even done them had you written the code yourself, because you know those are the standards. But when you're running 10 parallel agents that are all coding in whatever they think is the best practice given the context that they have, it creates a huge review problem at the end.
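To make the "magic number" and "implicit versus explicit" review comments concrete, here is a minimal Python sketch of the kind of change such a review might ask for. The function names, the constant, and the 15-minute value are all hypothetical and are not taken from CXT's code base.

```python
# Flagged version: the meaning of 15 is implicit in the comparison.
def is_late_flagged(minutes_past_window):
    return minutes_past_window > 15


# Revised version: the threshold is a named constant and an explicit parameter.
LATE_THRESHOLD_MINUTES = 15  # hypothetical value, for illustration only


def is_late(minutes_past_window: float,
            threshold_minutes: float = LATE_THRESHOLD_MINUTES) -> bool:
    """Return True when a delivery is later than the allowed threshold."""
    return minutes_past_window > threshold_minutes
```

Both versions behave the same for the default threshold. The second one simply states the assumption where a reviewer, human or agent, can see and change it.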
How big is your team?
Our team is about 30 engineers.
Growing, shrinking, staying the same?
Staying about the same. We grew and then we shrank again because of all the changes in AI. We moved away from the mediocrity that we talked about. We don't need more hands on keyboards. We need more people that can curate the code that's actually making it into the code base. So it's kind of a move away from programmers towards architects.
And are you finding that you are finally at the point where you're not just measuring productivity gains, but you're measuring outcomes, positive outcomes from all of this change?
Yeah, I think in the early days it was really all about measuring how many more lines of code have been written, how many more PRs are up, how many more of those PRs got merged. But now we're looking a lot more at how many defects came out of this development, how many things did we catch in QA, how many things made it all the way out to customers. And then how many things have we accelerated on the roadmap? Are we seeing increased velocity, not just in the number of tickets that are being resolved, but in the number of features that are actually making it downstream to end users?
What about actual business outcomes?
Business outcomes for our business or for our customers?
I think it's both, potentially, but this is the big question that I run into a lot. It's like, OK, productivity gains are up. Are we making more money or less money? Is our business doing better or worse? Because on the one hand you've got companies saying, man, productivity is way up and now we find out we're doing better. Great, I don't need to hire more people. We can let these people go. I don't need them anymore. Or conversely, oh, all this productivity is up. Oh shit, we're doing worse. I need to let all these people go. I think it's self-fulfilling to a certain extent. But to a certain extent, I feel like we've only been in this era, to this point, for like legitimately eight months. So we're only now starting to get to, okay, is it working? And so what is your sense?
I think it's working in a lot of aspects, but I think that there's also a large element of productivity theater going on. One of the things that I saw, especially in the early days, is managers all of a sudden were generating 300 times as much documentation as before. Here's what everybody should be doing. We should all be doing this process and that process. And this is the new best practice, because that's what they were getting from AI, right? In a lot of cases, those things were great suggestions and maybe even things that should be adopted in the business, but in practice, they just didn't work. They didn't match the operational reality of what people were doing day to day. And generating documents isn't necessarily being productive, just like generating more code isn't necessarily being productive. But you do hear, you know, Claude Code is writing 99% of all of the code in Claude. That's awesome. AI is writing all the code. Okay, but people are reviewing it. And then they're also revising the specs to have the agent run the same generation with slightly updated parameters, right? To then reject it, to have them run it again. So it's writing a lot more code, and that feels productive. It would have been productive were it still people writing that code, right? Because we used to measure productivity by output instead of outcome. But I think, to your point, as we shift more towards looking at the outcomes, it's not as clear if this increased productivity is leading to better business outcomes.
I guess we'll find out.
Sure, so I work at CXT Software. We were acquired by Ionic Partners about a year and a half ago. And Ionic Partners brought me in and put me on a specific mission: to transform a 25-year-old incumbent in the last-mile logistics space into an AI-native leader for the next 25 years, let's say, in the space.
What is it like taking an old-line company and turning it into an AI-native one? Like, that's kind of the thing. That's the place where the biggest productivity gains are gonna be made.
Yep.
What's that experience been like so far?
It's been interesting. I came from being able to do greenfield work. So before joining CXT, I worked for a DOD-funded startup where we were building autonomous systems for the US intelligence agencies. And there we had a blank slate to kind of start fresh and do whatever we wanted. When you're joining a company that has 25 years of history, on one hand you have this rich source of data. There's 25 years of operational expertise baked into these databases that we have. But you also have a lot of old infrastructure that you have to account for. Today's autonomous systems need to make decisions in real time. So they should be event-driven, and there should be milliseconds between an action happening and it getting downstream to systems that are listening. And we have a system that is more old school, more transactional. It's very database-heavy.
Talk to me a little bit about what does that look like in the real world? What is the thing, the autonomous system in this case?
So when you're building an autonomous system, the first thing that you need to start with is building a world model of all the operational...
Start with this: when you talk about last-mile delivery, what are we literally talking about here?
All right, so in the context of supply chains, when you're moving goods from point A to point B, 80 to 90% of that journey is one very big leg. So if I'm moving something from California to Philadelphia, 80 to 90% of that journey is probably in a plane. But once it lands, there's a truck that needs to get to the airport and take it to a depot where it can get sorted and segregated to different routes, where another, probably smaller, vehicle is gonna pick that up and then drive through all of the complicated windy roads and traffic to get to your front door. That's the most complicated and most expensive part of the entire supply chain. And that's where we specialize.
Perfect. People have been delivering stuff to people for a long time. How has that changed from the old-school way? What are you improving here?
Yeah, so the first systems were all about just moving people away from paper to digital. So just the digitization of operations. So all of your ledgers, all of your inventory, all of that is now in a digital system that you can share amongst your many locations. Can't share paper from here to Phoenix, that's tough.
From there, we moved to systems that could analyze and model that data better. So now you have dashboards, and you can see how things are changing in real time.
Really good systems, in the last 10 to 20 years, became predictive. Now you can forecast what your order volume is gonna look like. We can do a little bit of modeling of costs and staff head counts, things of that nature.
In the last 10 or so years, you've gotten systems that are really good at being prescriptive. So they're giving you recommendations on what they think you should do. People do this all the time with current AI systems, right? Adding a chatbot to your product will sometimes get you some pretty good recommendations. But the systems of tomorrow take that next best action, so instead of just saying, "Hey, here's what I think you should do," it's, "Here's what I did. Here were the things I considered and the constraints under which I made this decision. Here's why, and here's a log that you can audit."
Is the dream of this ultimately, okay, something comes from China and gets into a port. From that port it gets put on a train that gets delivered to a city, and once in the city, it needs to get to a factory, a depot or something, or even to someone's house. Autonomous systems could be, like, Amazon's experimenting with drone delivery, that level of autonomy, or Amazon drivers and fetch people. What level are you trying to automate to?
So that's where we'd love to get to. We wanna integrate with those systems. We're a software provider, so we won't have anything in the hardware layer where we would have robots picking from shelves and doing any of the placement on trucks or vehicles, but we wanna automate the entire software process, all the decisions that are made during that. So let's think of one of these high-value operational exceptions that we'd have to handle. If one of those trucks breaks down in the middle of a highway, and it's unrecoverable, we can't get that thing started, the system needs to look at that truck, identify the closest 20 drivers within whatever radius, decide who is the best fit to absorb some of those hundred orders that are on that truck, and work out what downstream impacts that has on all of the ETAs that are already on their route. It needs to consider if it violates any of the service windows, because when you're dealing with regulated industries and you're picking up blood specimens or organs, there are very specific time windows that you need to operate within. And then given all of that, what are the best choices that you could make? Can it make those choices automatically? Where does it need to involve a human? And what are the downstream consequences to everything involved? And to do that, you need a very specific system, which is what we're building.
Autonomous logistical triage. Yeah.
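The truck-breakdown scenario described above can be sketched as a small greedy reassignment routine. This is only an illustrative sketch under stated assumptions, not CXT's system: every name here (`Order`, `Driver`, `Decision`, `triage`), the straight-line distance model, the 40 km/h speed, and the single "deadline" standing in for a service window are hypothetical simplifications. It picks the nearest drivers within a radius, handles regulated orders first, checks capacity, existing-route slack, and the deadline, and records a reason for each decision so that the result can be audited or escalated to a human.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional


@dataclass
class Order:
    order_id: str
    deadline_min: float      # hypothetical: minutes from now by which a driver must reach the truck
    regulated: bool = False  # e.g. blood specimens; considered first


@dataclass
class Driver:
    driver_id: str
    x_km: float
    y_km: float
    free_capacity: int       # extra orders the vehicle can absorb
    slack_min: float         # detour the existing route tolerates before its ETAs slip


@dataclass
class Decision:
    order_id: str
    driver_id: Optional[str]  # None means "escalate to a human dispatcher"
    reason: str               # audit trail entry


def triage(truck_xy, orders, drivers, radius_km=20.0, max_candidates=20,
           speed_kmh=40.0, handling_min=5.0):
    """Greedily reassign orders from a broken-down truck to nearby drivers."""
    tx, ty = truck_xy
    # Nearest drivers inside the radius, capped at max_candidates.
    by_distance = sorted(
        ((hypot(d.x_km - tx, d.y_km - ty), d) for d in drivers),
        key=lambda pair: pair[0],
    )
    candidates = [(dist, d) for dist, d in by_distance if dist <= radius_km][:max_candidates]
    # Working copy of capacities so the caller's Driver objects are not mutated.
    capacity = {d.driver_id: d.free_capacity for _, d in candidates}

    decisions = {}
    # Regulated orders first, then tightest deadline first.
    for order in sorted(orders, key=lambda o: (not o.regulated, o.deadline_min)):
        decisions[order.order_id] = Decision(
            order.order_id, None,
            "no nearby driver satisfies capacity, route slack, and deadline")
        for dist, d in candidates:
            # Simplification: the detour is travel to the truck plus a fixed handling time.
            detour_min = dist / speed_kmh * 60 + handling_min
            if (capacity[d.driver_id] > 0
                    and detour_min <= d.slack_min
                    and detour_min <= order.deadline_min):
                capacity[d.driver_id] -= 1
                decisions[order.order_id] = Decision(
                    order.order_id, d.driver_id,
                    f"{dist:.1f} km away, detour {detour_min:.1f} min fits deadline and slack")
                break
    return decisions


# Hypothetical example data: truck at the origin, three drivers, three orders.
drivers = [
    Driver("d1", 3.0, 4.0, free_capacity=1, slack_min=30.0),
    Driver("d2", 6.0, 8.0, free_capacity=5, slack_min=60.0),
    Driver("d3", 100.0, 0.0, free_capacity=5, slack_min=60.0),  # outside the radius
]
orders = [
    Order("parts-1", 120.0),
    Order("blood-1", 20.0, regulated=True),
    Order("parts-2", 10.0),
]
decisions = triage((0.0, 0.0), orders, drivers)
for decision in decisions.values():
    print(decision)
```

A greedy pass like this is easy to audit but is not optimal. A production system would more plausibly use real routing times and a proper assignment or vehicle-routing solver, and would recompute downstream ETAs rather than compare against a single slack value.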