Signal to Noise: Episode 8

Why 95% of AI Projects Fail and How to Be the 5% with Jon Krohn


Transcript

[00:00:04] Jon: We’re in an interesting time, right? Thanks to built-in Gen AI models in a lot of people’s favorite tools, like Claude Code and Cursor, the data scientist no longer needs to be incredible at writing accurate Python code, for example. Data scientists still need to have a lot of experience with experimentation: being able to come up with ideas for teasing some signal from noise, for example, to run multiple different experiments, and to make sure that there aren’t going to be confounders, you know, unexpected things in the data that are really leading to some result.

[00:00:52] Intro: Welcome to Signal to Noise by Riviera Partners, the podcast where leading executives share how they cut through the noise and act on what matters most. We go beyond the headlines to explore the pivotal decisions, opportunities, and inflection points that define their careers and shape the future of the companies they lead. It’s time to cut through the noise and get to the signal.

[00:01:15] Michael: Welcome to Signal to Noise. I’m Michael Newcomer, CEO at Riviera Partners, where I focus on connecting leading tech innovators with the executive talent that drives transformation. With experience spanning executive search, financial services, and technology, I help companies build leadership teams that can scale and thrive in rapidly evolving markets. Today, we’re joined by Dr. Jon Krohn, a leading AI educator, author, and entrepreneur. Jon is the co-founder and CEO of Y Carrot, a data science consultancy delivering real-world AI solutions to Fortune 50 companies, startups, and government agencies. He’s also the host of the globally recognized Super Data Science podcast, author of the number one best-selling book, Deep Learning Illustrated and a machine learning practice fellow at Lightning AI. With a rich background in teaching, AI content creation, and practical machine learning applications, Jon brings a unique perspective on how AI is reshaping industries and empowering professionals. Jon, it’s a pleasure to have you here today. Let’s dive in. So what’s the biggest signal you’re paying attention to right now, and what noise are you tuning out?

[00:02:20] Jon: Well, the big thing for me in my life right now is I’m trying to figure out, as much as possible, “How can I get away from digital experiences to real ones?” Because so much of what I encounter is digital. Since the pandemic, I have not been back full-time in an office environment around people. I have a fellowship at a great company called Lightning AI in New York that allows me to spend as much time as I want working from their office. But I also still have this practical thing of filming podcast episodes and doing keynotes where I have to be traveling a lot, so I am remote most of the time anyway. And since I left being in an office day-to-day, I’ve felt like something is missing from my life. Sitting at a computer screen, in front of a camera, holding a microphone: this is obviously not real life. And so the signal that I’m trying to elevate in my life is, “How can I just spend more time doing a bike ride around Central Park, or being at a yoga class around people, or delivering a live lecture where people can come up and ask me questions afterward?” Those kinds of experiences, that’s the kind of signal that I’m looking for in my life.

[00:03:36] Michael: So one of the guys I know who’s most into AI and data science is like, I’m getting away from AI and data science. Is that the lesson here?

[00:03:44] Jon: It’s all about balance. You know, my livelihood is a lot of digital content creation. We have 104 episodes a year of the Super Data Science podcast. In addition to that, I create a lot of tutorials for YouTube, and trainings, and I write books, and you can’t really write a book today with a pen and paper. I’m writing it in LaTeX on a computer. And so, professionally, I’ve got to spend a lot of my time sitting at a screen. That has been out of balance for me. It was even worse, obviously, in the pandemic, and it’s starting to get better. But the more that I am able to read the physical Economist that comes to me every week, reading physical books, turning those pages, feeling the pages, having the experience of moving them around, that, to me, is a much more valuable experience than reading the same book on my screen.

[00:04:38] Michael: Makes sense. One of the things I love talking to you about, it always reminds me a bit of my past when we get in conversations. At some point, you always bring me back to my PhD, and when you brought up that you’re writing a book in LaTeX, I was like, “Wait a minute. What? I haven’t used that for fifteen years at this point.” So are people commonly writing books in LaTeX now, or is this –

[00:04:58] Jon: Most people do not write books in LaTeX these days. So for example, if you write a book for O’Reilly, which is a very popular tech publisher these days, my understanding, and don’t quote me on this, is that they don’t support it. They have an in-house platform that they built because, in the same way that I create 104 podcast episodes a year, O’Reilly publishes 104 books a year. And so they need a very streamlined process, and part of that process is their platform. It’s WYSIWYG (what you see is what you get) in terms of typing things in and putting in figures, and it allows them to move through the publication process very quickly. I published my books with Pearson, and I’m deeply grateful that they allow me to use LaTeX. I could instead submit each chapter as a Word document, which I think is one of the most common ways for publishers to work. But I love LaTeX because you don’t need to track figure numbers, for example. It’s like programming: you can refer to a figure, a table, or even a whole chapter by a name as opposed to a number. And so if you end up later reordering your book, all of those references, chapter references, figure references, table references, update automatically. It also gets me out of my head in terms of formatting the book properly, because I’m focused 100% on content, and the formatting just happens automatically, or the copy editor can handle it later.
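For readers unfamiliar with LaTeX, the named-reference mechanism Jon describes looks roughly like this (a minimal sketch; the label names and chapter title are invented for illustration):

```latex
\documentclass{book}
\begin{document}

\chapter{Deep Learning Basics}\label{ch:basics}

\begin{figure}
  % figure content would go here
  \caption{A simple neural network.}\label{fig:simple-net}
\end{figure}

% References use names, not numbers, so if chapters or figures are
% reordered, every number updates automatically at compile time:
As shown in Figure~\ref{fig:simple-net} of Chapter~\ref{ch:basics}, \ldots

\end{document}
```

This is exactly why an author can reorder a whole book without touching a single figure or chapter number by hand.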

[00:06:25] Michael: Very much why we used it when I was writing scientific papers back then. It just made everything much easier. Well, cool. I’m going to give a little context on how we know each other, and then I’ll ask the next question that’s in our bank. So Jon and I met, what, like, three years ago? Roughly three, four years ago, something like that.

[00:06:39] Jon: Post-pandemic.

[00:06:40] Michael: Post-pandemic. Jon was working at a company called Nebula at the time. I now know one of his former colleagues, Ed, who you still do a little bit of work with, from the start of that company ten, fifteen years ago, whatever it was. And we were talking about using data to identify great talent, and ways to do that in a systematic way. You can correct me where I’m wrong there. But one of our questions is around, how do you find great talent in your career? And so I’m interested both from a hiring perspective, but also in what you did with Nebula, how you’d approach that question, and what you’ve learned about that over the last handful of years. My guess is, like you, I didn’t do my PhD to go identify great talent, but that is where I’m spending a lot of my time now.

[00:07:22] Jon: For sure. Yeah. We do have that overlap for sure. Finding great talent for companies is the signal-to-noise question, I guess, that you and I are dealing with a lot in our regular lives, Michael. Certainly, that was the case when I was working at Nebula, and at Untapt before that, with Ed Donner, whom you mentioned. If people are interested in agentic AI, either as a hands-on developer or data scientist or as a more general business person, Ed has, in the past year, been creating some of the most popular agentic AI content on the internet. He has hundreds of thousands of students on Udemy, for example. So go check out his Udemy courses; he does an amazing job. I had been working with Ed for ten years on this problem of, how can we use data and machine learning to improve the experience of finding the right talent, finding the signal amongst all of the talent noise out there for employers? And so in 2015, I think, ten years ago, I joined a company that Ed founded and was CEO of, called Untapt, U-N-T-A-P-T. And I was the Chief Data Scientist there up until the pandemic, when we were acquired, and that led to Ed and me forming a new company that you mentioned there, Nebula. We co-founded that with a third individual named Steven Talbot. In either case, with Nebula or Untapt, the fundamental thesis was that there is a big opportunity to build a SaaS platform that leverages data and machine learning to find more relevant talent. It’s a huge pain point for so many hiring managers, so many organizations. How do I create a pipeline of the right talent? How do I interview them effectively?
And a really big thing for us at Nebula in recent years was going beyond keyword-based search to using encoding large language models in order to, quote-unquote, understand the meaning of language. This is kind of a silly example, but one of the most popular programming languages in the world right now is Python. It’s a toy example, but it illustrates the idea: you wouldn’t want to be searching for a software developer and have a snake handler show up in your results just because they have “Python” and “Boa constrictor” written on their resume. And it’s amazing to me that all of the big incumbent hiring platforms out there use keyword-based searching that is that simple, and, at best, have these curated ontologies of relationships between different kinds of terms, but it’s still very black and white. Whereas with Nebula, and probably other companies out there that have had this idea, we could be using modern AI technologies to really understand all the language, pull up relevant information in context, and get way more relevant results than any of the big existing incumbent hiring platforms can.
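A minimal sketch of the failure mode Jon describes, with made-up resume snippets and no real embedding model involved: a naive keyword search cannot tell a Python developer from a python handler, because both documents literally contain the word.

```python
# Two made-up resume snippets: a software developer and a reptile keeper.
resumes = {
    "developer": "Senior software developer with 8 years of Python and Django.",
    "handler":   "Reptile keeper experienced with python and boa constrictor care.",
}

# Naive keyword search, the approach Jon says incumbent platforms rely on:
# any document containing the literal term counts as a "match."
keyword_hits = [name for name, text in resumes.items()
                if "python" in text.lower()]

print(keyword_hits)  # both resumes match, snake handler included
```

An encoding model would instead map each document to a point in a high-dimensional space, where these two resumes land far apart despite sharing the keyword.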

[00:10:24] Michael: Makes sense. One question I have for you is, where do you think the use of this kind of ends, and where do the people kind of intervene? So how much can we get around data, and where do we still need people as the interface to learn about who we should hire and who’s a great talent?

[00:10:39] Jon: There are literally, in a lot of jurisdictions, legal requirements around this question. So what you can do in the EU might vary from what you can do in the United States in general, but then it might also vary by region. So where I live in New York City, there are specific regulations around how automated hiring can work. For us at Nebula, and I’ll use this as a general recommendation, I suppose, for how I think people should be doing this: I think the right idea is having a ranked list provided to you by AI. A lot of hiring managers have probably had the experience of opening a job posting up on LinkedIn. That job posting can, within hours, get hundreds or even thousands of applications, and after it’s been out there for a few days or a few weeks, it’s this impossibly large heap of noise to look through. And a lot of those resumes aren’t relevant. You can do things like a keyword filter in that LinkedIn recruiter experience, but it’s not giving you a ranking, an order of these are the best people to be speaking to. And so I think one of the most useful areas for AI is taking this huge amount of noise that we get in terms of applications for a given role and being able to rank it. And then a human should, and in a lot of jurisdictions it is a legal requirement that a human go through, starting at the top of the list, and say, “Okay. The algorithm thinks this first person is most likely to be the best fit for the role, but there could be all kinds of reasons why you as a human know that this actually isn’t a good choice, or is a good choice, because of something specific that you know the hiring manager is looking for, from a conversation you had with them, that isn’t included in the job description.” There can be all those kinds of things.
So, basically, I think anytime you’re moving somebody further along in the recruitment pipeline, the final decision should be made by a human, but use AI tools to make that decision a lot easier, to make it a lot easier to find the signal in the noise.

[00:12:48] Michael: I’m curious how you think about the use of data and probability distributions in this, right? The example I’ll use, from something we’ve done at Riviera that’s probably interesting to you, is looking at how good of a fit somebody is versus a job description, right? And like you said, “Hey. The machine’s gonna do it. A human’s gonna review it,” but it’s gonna come out with some sort of numerical score. And that numerical score will often look, to the layperson, really bad, right? It’ll be like, that’s a 72% fit or a 63% fit. And for a data scientist, that’s a great outcome, right? That’s a great correlation. How do you think about that, and, more broadly, the limitations on data science being used by the general populace? How do you bridge the gap between really good algorithms and people using them and understanding what they are doing and what they aren’t doing?

[00:13:40] Jon: Yeah. So there are different philosophies here, and countless hours of arguments happened at Nebula, for example, around the right thing to do with that. A very common thing for us as data scientists, so I talked earlier about encoding large language models. You can use an encoding large language model to turn any natural language document, like a resume or a job description, into a series of numbers. That series of numbers represents a location in space. And so you could think about a map as a two-dimensional representation. You have latitude and longitude, and you could literally have your encoding large language model, your AI system, take a document and put it onto a two-dimensional map like that, and think of its coordinates as latitude and longitude. Or, nearing the limits of what a human brain can imagine, you could go one dimension further and imagine that you have a surface on this map, mountains and valleys. So now you have a third dimension: latitude, longitude, and altitude, three dimensions in which you can describe the meaning of a given document. Now, the way that AI models work, they will typically work in hundreds or thousands of dimensions. So let’s say you have a 1,000-dimensional space to represent the meaning of a job description; then you’d have a thousand numbers that you need in order to describe the location of that particular document. And when you have lots and lots of dimensions like that, it allows for a lot of nuance in your results. So when you have that long string of numbers, one of the most common things that data scientists will do is use this very simple, very fast mathematical trick called cosine similarity. And this cosine similarity score, very quick to compute, is the kind of thing that gives you the result that you’re describing, where a 63 or a 72 can actually be a really good result.
In fact, if you got something like a 99 score, I’d say that’s probably just a duplicate of the same document.
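To make the scoring concrete, here is a minimal cosine-similarity sketch. The four-dimensional vectors are toy numbers invented for illustration; real encoder models emit hundreds or thousands of dimensions, as Jon notes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (invented numbers, not real model output).
job       = [0.9, 0.1, 0.3, 0.0]
good_fit  = [0.8, 0.2, 0.4, 0.1]  # similar direction: high score
weak_fit  = [0.1, 0.9, 0.0, 0.7]  # different direction: low score
duplicate = [0.9, 0.1, 0.3, 0.0]  # identical document: score of ~1.0

print(round(cosine_similarity(job, good_fit), 2))   # 0.98
print(round(cosine_similarity(job, weak_fit), 2))   # 0.16
print(round(cosine_similarity(job, duplicate), 2))  # 1.0
```

Note that even a strong match scores well below 1.0, and only a near-duplicate approaches it, which is exactly why a raw 0.63 or 0.72 can be a good result.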

[00:15:41] Michael: Something’s wrong with the model. That’s right.

[00:15:43] Jon: Yeah. And so that presents a communication problem for the data scientist. Because when you look at scores in the kinds of platforms we’re used to, you want a match that’s quite high. The person that you’re going to hire, oh, they should be in the 90s. So, you know, we talked a lot at Nebula about having some kind of reranking, and that’s definitely a way you can go, where you remap the scores so that anything 70 or above shows as 99 or something like that. That’s a bit of a clunky example, but you could do something like that. My actually preferred solution to this, Mike, is to just not show the score at all. If you think about Google results, Amazon results, Netflix results, they don’t show you the score. They just say, “These are the films we think you should watch. These are the products we think you should buy. These are the pages we think you’re interested in visiting as a Google user.” And so, I think that’s the way to go. Just don’t show the numbers at all.

[00:16:34] Michael: It’s interesting. You talk to a lot of data scientists out there and probably various levels of communication and working on lots of different problems. What do you think makes a great data scientist? If you could put some metrics around that.

[00:16:46] Jon: So we’re in an interesting time, right? Thanks to built-in Gen AI models in a lot of people’s favorite tools, like Claude Code and Cursor, the data scientist no longer needs to be incredible at writing accurate Python code, for example. Data scientists still need to have a lot of experience with experimentation: being able to come up with ideas for teasing some signal from noise, for example, to run multiple different experiments, and to make sure that there aren’t going to be confounders, you know, unexpected things in the data that are really leading to some result. It’s these kinds of methodological questions. If somebody does a PhD in a quantitative science like you have, then you will invariably develop this skill set of working with a lot of quantitative data, figuring out the kinds of things that can go wrong in interpreting results, and making sure that you’re looking out for the biggest likely issues. That’s a key thing, just having this level of scientific inquiry, scientific understanding. And then the big thing: in the same way that you on this podcast have a set list of questions, I used to as well. On my Super Data Science podcast, we’re coming up on a thousand episodes now, and some time ago I used to have set questions that I asked most guests, or a lot of guests. I don’t do that at all anymore, but I used to always ask, “What is the number one thing that you look for in the data scientists that you hire?” And there was one thing that came up constantly. Almost everyone said there was one thing that they were looking for. Can you guess what that was?

[00:18:41] Michael: You’re putting me on the spot. I would guess communication ability.

[00:18:43] Jon: Communication. Exactly. That’s it. That’s the key thing. Being able to take the potentially complex solutions and approaches that you used to solve some problem, communicate them across the organization, get buy-in, and have your solution make its way into production, that’s a really big thing that we’re seeing these days. There was a recent study that came out of NANDA, a research lab at the Massachusetts Institute of Technology. Some people have been a bit critical of their research methodology, actually, but what they found is that 95% of AI projects never make an impact in production. They’re never profitable, or they never have whatever other kind of successful result in production the company was looking for. 95% of projects fail, basically. And that is staggering, and I think a big part of why that happens is this communication gap in organizations between executives and the people on the front line developing the AI solutions.

[00:19:47] Michael: One of the things we recently released is around organization readiness for AI. I think that’s part of what you’re getting to here, is that where you see a lot of gaps, right? Because there’s change management in that. There’s adoption. There’s probably some fear in that as well. When organizations fail at adopting AI, what do you see being the reasons?

[00:20:07] Jon: So how do we get in the 5%? One of the key things is to have the KPIs, the success metrics, the key performance indicators, for the AI project defined up front, and to have the organization aligned on what those are. It could be cost, it could be accuracy, it could be speed, or it could be some combination of those metrics, where we say, “This is the way that the process works today.” Let’s take an example of a process that was until recently entirely done by humans: public records. People will remember how, up until a few years ago, when entering the U.S., entering a lot of countries, you filled out on a piece of paper your passport number, your nationality, what your address would be when you landed, whether you were declaring anything. And some time ago, people were hand-entering those data into a digital system so that there would be some kind of permanent record. Today, it’s a fully automated system. And so when you’re making that transition, you’ve got to say, “Okay. What is the level of accuracy today when humans are typing these things up?” Maybe have that as a baseline for the accuracy of the AI system, but then say, “Okay. But the AI system, we expect it to cost only 10% as much.” These kinds of constraints, the KPIs for the AI project: have those defined up front. I think that’s the most important thing. And then, strategically, just as a general tip for enterprises that I see today: because their experience of cutting-edge AI is through chat interfaces like Claude and ChatGPT, a lot of executives in organizations have this intuition that the lowest-hanging fruit, the best opportunities in their organization, are to have some kind of automated conversational agent over top of some traditional process. And there are some places where that is valid.
So customer service, for example: you could potentially have a fully automated chatbot handling some queries, and it only escalates to a human, or maybe to a more expensive LLM first. There are definitely use cases like that, but executives see that opportunity throughout their organization. Whereas, in fact, vastly more opportunity in organizations today exists in much less sexy, much less visible automation of back-office operations. Some of that could involve state-of-the-art LLMs, but a lot of it also just involves getting the data into one place, having appropriate sharing and permissions, and having maybe even some kind of simplistic machine learning model, like a regression model, in place to be doing some kind of classification or some kind of triage of information. And so there’s way, way, way more opportunity with non-generative AI applications than with generative AI, and way more opportunity in a typical organization in invisible back-office operations relative to things that are impacting your clients directly.
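As one illustration of the kind of “simplistic machine learning model” Jon mentions for back-office triage, here is a tiny logistic-regression sketch in pure Python. The features, labels, and training examples are all invented for the example; a real system would train on its own historical records.

```python
import math

def predict(w, b, x):
    """Logistic regression: probability that a record needs manual review."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train(data, labels, lr=0.5, epochs=2000):
    """Fit weights by simple stochastic gradient descent on log loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Invented features per record: [form completeness, anomaly flag strength].
# Label 1 = route to manual review, 0 = safe to auto-process.
X = [[0.9, 0.1], [0.8, 0.0], [0.2, 0.9], [0.1, 0.8], [0.7, 0.2], [0.3, 0.7]]
y = [0, 0, 1, 1, 0, 1]

w, b = train(X, y)
print(predict(w, b, [0.85, 0.05]) < 0.5)  # complete, unflagged: auto-process
print(predict(w, b, [0.15, 0.95]) > 0.5)  # incomplete, flagged: manual review
```

Nothing generative here: a decades-old classifier, routing records so humans only touch the ones that need them, which is exactly the unglamorous back-office win being described.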

[00:23:19] Michael: Got it. Makes a ton of sense. And I know you’re doing some of that work now. And so when folks come to you for that work, are they looking for the cool, sexy, front-office, generative AI stuff, and you’re trying to steer them to these back-office things? Like, what’s the mix? How do you guide folks? What is that conversation like?

[00:23:37] Jon: For sure. Yeah. So we talked earlier in this episode about a company that I co-founded in the past, Nebula. My most recent thing: since the Northern Hemisphere spring of 2025, I have been co-founder and CEO of a company called Y Carrot, so the letter Y and then the vegetable, carrot. It’s a bit of a machine learning joke, a play on ŷ (“y-hat”), the symbol for the output of every single machine learning or AI model. The idea with our consultancy is that we want to help organizations, exactly as you suggest there, figure out, “Where can I get ROI from AI? How can I be in the 5% of successful AI projects that actually deliver value in production, as opposed to the 95% that don’t?” And I don’t really have a one-size-fits-all answer in terms of what we do for clients, because there are lots of different organizations. Some of these are huge U.S. government departments. Some of these are small startups with ten people or, you know, a hedge fund with thirty people, very different kinds of businesses with very different kinds of needs. I would say that I’ve been impressed by how often folks come to us with a very specific problem where they say, “This is the way that we’re doing things today. We maybe already have some automation in place, or we already have some conversational agent in place.” So, for example, to give you a real-life example of something that we’ve recently done, a tech company came to us and said, “We have this chatbot that works in a regulated environment. So there are all kinds of constraints around what it can output; we can’t just have your typical kind of LLM back and forth. We have all these kinds of guardrails around what happens.
And the problem is that with all these guardrails in place, with the way that we’re doing it today, our users have to wait more than five seconds to even get the beginning of any response from this chatbot.” And they said, “Would you be able to work with us to get this under a second so that people can have more of a real-time conversation?” And so a lot of folks come to us with that kind of very specific question, that specific kind of use case, and by the way, we were able to deliver that. But, yeah, in other situations, organizations will come to us and say, “We’ve done a survey across all of our executives or our frontline users, and we’ve created this spreadsheet with hundreds of different potential use cases in our organization. Can you help us figure out which projects are going to give us the best ROI on the shortest timeline at the lowest cost? What are the low-hanging fruit for us? Where are the easy wins? So can we help prioritize those?” And then we can pick one, two, three of those projects to POC to do a proof of concept and assuming those POCs go well, which so far for us, they’ve all been on the mark, then we can help our clients with a high-volume production deployment. So we can work with their engineering team or do it entirely ourselves. So kind of end-to-end from figuring out what projects you should be doing to getting those into production, making an impact.

[00:26:32] Michael: We’ll talk about Y Carrot a little bit more, and also Super Data Science a bit in this as well, which you’re doing a good job advertising for, but we’ll come back to that. The next question is around where AI has surprised you in its capabilities, and then, conversely, where it has disappointed you, where you thought it could do something it’s not able to do now.

[00:26:51] Jon: When people ask me about AI and how I feel about it, I am completely blown away almost all the time by what’s going on. Like you, I did a quantitative PhD. I specialized in applying machine learning to medical sciences use cases, and I finished that PhD thirteen years ago. And so I’m coming out of twenty years of working in the AI space. Back when I was doing my PhD, in fact, as recently as five years ago, if you had described to me the kinds of AI capabilities that we have today, the Sora 2 model taking text and converting it into compelling, almost cinema-grade video, the kinds of things that GPT-5 can do, and by the way, I am a big GPT-5 fan. I use Claude for most of my day-to-day use cases, but I do see a lot of value in GPT-5, and I’m surprised at how people have been throwing shade on it. But we can get into that later in the episode if you want to. If somebody had told me five years ago, or certainly while I was doing my PhD, that we would have the kinds of capabilities we get with GPT-5 today, I would have said that sounds impossible, you know? I don’t know how we could do that in our lifetimes. It sounds like science fiction that you’re describing.

[00:28:09] Michael: The problem I always think about, Jon, and this gets to, you know, both of us doing quantitative PhDs. It was fifteen years ago for my PhD, so I’ve got a couple more years on you since finishing. One of the most intractable problems when I was doing my PhD was the protein-folding problem. It’s computation power, it’s all these things, and you probably remember when they were trying to run it on everyone’s screensavers, trying to figure out the protein-folding problem; that’s the way they tried to solve it. And then AI solved it basically instantly once it got to a certain level. And so it’s crazy that that problem was solved. And you think about it: I spent all my time developing large-scale biological macromolecular modeling tools that had lots of problems, the quantum-classical barrier, the size, compute power, all those things. And what you’re really trying to do is drug docking, right? Like, that’s the most obvious usage of this. All useless, because AI can figure out 99% of that in one one-thousandth of the time, right? And so, like, that’s super interesting to me. And you think about all the applications in very basic science stuff where you and I spent some time, and then all the other applications now. So it’s moving very rapidly, and I’m not surprised you had quite a bit on the positive side. But is there a place where it’s disappointed you, or where you’re surprised it hasn’t advanced?

[00:29:29] Jon: I’m struggling to think of examples. The thing that has surprised me, I suppose, is how slow organizations are to adapt to this opportunity. There is a surprising amount of organizational resistance to making changes that would allow businesses to be more profitable, to run better, to be able to have more growth. The thing that has surprised me is how slow people move or how resistant they are to change.

[00:29:59] Michael: Humans don’t like change. That’s for sure. You mentioned before in GPT-5, and so one of the questions I had was around how do you cut through the hype cycle, all kinds of the noise around all the AI innovations and GPT-5 is a good example where I think you have a different viewpoint than maybe the Reddit consensus or the online consensus? So maybe we could use that as an example, but what do you think about this, especially with what your job is? I think part of it is figuring that out.

[00:30:27] Jon: I think the reason quite a lot of people were disappointed with GPT-5 is that the shift from GPT-3.5 to GPT-4 was a very tangible, big increase. With GPT-3.5, tasks that would take humans several seconds, maybe even tens of seconds, could all of a sudden be done by an AI system with a relatively high rate of accuracy. With GPT-4, it became tasks that would take a human minutes that could now be reliably done. And with that kind of jump from seconds to minutes, all of a sudden a huge range of everyday tasks, everyday kinds of questions or ideas that we might have to ask a conversational agent, way more things could now be handled. So a lot of people were blown away by that jump from GPT-3.5 to GPT-4. Now, I don’t know what kind of magic people were expecting at that level of, you know, human tasks that take seconds or minutes. I mean, how much better can they get? If people have experience using OpenAI’s deep research, it is astounding to me how accurate, how well-cited, and how spot-on the responses are to tasks that would take humans minutes or potentially even hours, even with GPT-4 underneath that kind of deep research agentic framework. I don’t know what people thought GPT-5 would be able to do on those kinds of human task timelines. Where GPT-5 has shown a lot of strength is that it can now very reliably handle tasks that would have taken humans hours.
So when you’re talking about complex problems, mathematical problems, software development problems, machine learning problems that would have taken a human hours, these can now be reliably and accurately done by GPT-5, and surely soon by other models like Google’s Gemini. And we’ve recently seen a lot of press, at the time that you and I are recording, around Claude Sonnet 4.5, where there’s some stat claiming it can do a task that would take a human up to thirty hours. I haven’t dug into that specific claim very much, so don’t quote me on that, but I did see it used in their marketing materials. These timelines for the human tasks that you can be replacing with a machine are getting really insane.

[00:32:53] Michael: I want to talk about Super Data Science a bit now. I think the first question is how did you get involved? Why did you do it? Tell me a little bit of the story of how that started. And for people who don’t know, this is one of, if not the, most popular podcasts on this topic, with millions of downloads a year. So you should definitely check out that podcast and the various guests that Jon has on. It is on my subscription list. I won’t say I listen to all the episodes, but I will say I listen to more than 50%, which is a high podcast hit rate for me, at least. But back to, you know, what made you get into it? Tell me the story.

[00:33:27] Jon: Yeah. It means a lot to have you listening so regularly, Michael, after all these years, because I know you’ve been listening for a while, so that’s really cool. So, yeah, the Super Data Science podcast, we’re definitely the most listened-to podcast in data science for hands-on practitioners. So data scientists, AI engineers, software developers who are doing AI work, even product people, and a lot of folks like yourself who are leading organizations and want to stay on top of what’s happening in AI. We do 104 episodes a year, every Tuesday and every Friday. We’ve recently made a big push in video and added lots of bells and whistles to what we’re doing there. Soon we’ll be available in Spotify video, maybe by the time this episode comes out. As a result, our YouTube channel has also been growing a lot. Just a few months ago, we had something like 20,000 subscribers on YouTube, but now we’re up at 100,000 and growing really quickly. So for all you folks out there who love the video format, that’s something that’s really coming along. We are really picky about the guests that we have on the show. I think that’s part of the quality. Essentially all of the guests are great communicators. They’re really deep into what they do, and we have an amazing team. I have a researcher named Serg Masís, who’s an outstanding data scientist in his own right. He does research on our guests for the show, and so with the kinds of questions and topics that come up, we get into way more depth. We cover a range of topics in a way that you won’t get on any other show that I’m aware of. That’s what we’re doing today, and that’s why a lot of people like it. We’re coming up on episode one thousand soon, so almost ten years of running the show.
I’ve been doing it for five or six years as host, and the way that I fell into this was that just before the pandemic started, I created a podcast called The Artificial Neural Network News Network, A4N. It was a deliberately cheesy name, we had deliberately cheesy music, and we did one episode in February 2020 that fit the vision that I had for the podcast. I didn’t want to be hosting another show that’s just guest after guest. I wanted it to be a news show, to be comprehensive, to be funny. So myself and three other data scientists were all gathered around one large table, and it was supposed to have this newsroom vibe. I was the anchor of the show, and the other data scientists I had on the panel would address specific topics, but we’d all have a laugh together. And it was a hoot. But then a week later, the pandemic hit. Nobody wanted to meet in person to do this. I tried to make the format work remotely, but it didn’t. You know, with four or five people all just online, you can’t play off each other in the same way. So it didn’t work. We moved to having guests, and we only did five or six episodes of the show, but one of the few guests that we had was a guy named Kirill Eremenko, who founded the SuperDataScience company. He’s taught millions of people machine learning and data science principles, he’s a really well-known data science instructor, and he had been hosting the Super Data Science podcast for four years. We had him as a guest on the show, and I guess he had a great experience, because about six months later he said, “You know, I’m taking a step back from my businesses. I just want to enjoy some time to myself, and for the podcast business, I’d like you to co-own it with me and take the reins as the host of the show.” So I got really lucky falling into that. And compared to that, A4N was the amateur effort.
You know, we did our best, but we weren’t really publishing on a schedule. We were just getting started. We might have gotten there at some point. But to inherit this show that was already one of the biggest podcasts in data science and AI, that was amazing. And the key thing was that Kirill is this amazing businessperson. He’s got such an eye for operations and procedure, and he had put together this outstanding team: operations manager, researcher, editing, and production. All of those pieces were in place, and I was just, you know, inserted into this bigger machine.

[00:37:24] Michael: That’s really cool. Last time we talked before we got interrupted by some technical difficulties, you asked me the question, like, what makes someone good at data science or some version of that? And we talked about communication. And you’re obviously world-class at communicating complex data science topics, mathematical models, all those things in your podcast and other venues I’ve seen you, and in personal conversation as well. But you mentioned in your answer there that it’s really important to get great communicators on the podcast for it to work. What have you learned about identifying that, etcetera? I think about this. You know, some of the folks who’ll be listening to this podcast are leaders of organizations now. Whether they want to or not, they need more data science in their organization, more AI, and getting a leader who is a great communicator is really important. But sussing that out is pretty difficult. So I’m wondering if you have any advice for those folks.

[00:38:12] Jon: That is an interesting question. I mean, luckily, of all of the kinds of skills that you can interview for, communication is one of the few where, if it isn’t obvious right off the bat in an interview, you probably don’t have someone who’s a great communicator. There are all kinds of other things that are so difficult to interview for: somebody’s grit, somebody’s level of innovation, dedication. Those kinds of things are also absolutely critical to somebody being successful on the job, but you have no idea until they’re a month or two in. You can have all kinds of people who interview really well, who are great communicators. In fact, I think that’s probably the bigger problem. Assessing who’s a good communicator in interviews isn’t all that hard; you should have a sense of it right off the bat. What’s trickier is somebody who is a great communicator but isn’t actually passionate about what they’re doing with you. That has become even more important now that a lot of organizations, post-pandemic, are completely remote or at least hybrid, and it’s really hard to know. Until I founded that A4N podcast, I had always worked in person. Whether it was during my PhD or in the decade after it, I was there in person, every day, Monday to Friday or more, with the teams I was working with. And when I became a manager, my management style was hugely dependent on being able to see, you know, what is somebody’s mood like today? What’s their momentum like? And having whiteboards all around the room. On the data science teams that I led pre-pandemic, when one person ran into a problem, it was very easy for us all to swivel around in our chairs and listen to the difficulty that one individual on the team was running into. And we’d say, alright, let’s explain the problem in more detail. Let’s dig into this.
And I haven’t found a way to replicate that in hybrid or remote working at all. And I think it becomes more difficult than ever to tell whether somebody is really dedicated to what they’re doing, whether they’re putting the effort in, and it can take months into somebody being in a job before you realize, you know what, actually, this person, they’re not carrying their weight like other people on the team are.

[00:40:26] Michael: That’s a great answer. Given that you spend a lot of your time doing podcasts, communicating about AI, we talked a little bit about our mutual friend, Ed Donner, and some of the stuff he’s doing in education in this space. It seems very important to you. Makes a ton of sense, but how do you think that’s going to shape the agenda? Like, what impact do you have on that, kind of your role in the ecosystem and others who are spending, you know, lots of valuable time doing that to help educate the masses?

[00:40:52] Jon: I think that the platforms of today, YouTube, podcasting platforms, Udemy, there are so many great venues. LinkedIn has emerged as well, both in terms of the long format, the LinkedIn Learning format, as well as the shorter-form stuff that people post in the feed. There’s lots of great content out there available for free on the web, and I think it’s a good example of a response to people’s concern about AI automating away all jobs. Maybe that will happen. Maybe this time is different, but we’ve had more than 200 years of industrial revolution and automation. Until recently it was mechanical automation, and now, with the GPT-3.5 moment, all of a sudden a huge amount of cognitive work can be automated as well. All of the previous waves of automation led to more job creation than job loss. Yes, particular skills, even some particular jobs, do end up being automated away, but far more are created than lost. And I think what you’re describing is one of the things that’s happening. For anyone involved in the AI space, or in whatever their niche is, it doesn’t need to be AI, it could be whatever your expertise is, more and more of our time is being freed up to be creative. More and more of the drudgery can be handled by machines. And so if machines are providing more and more value, we can be looking for new things that we can be doing. If you enjoy creating, if you enjoy teaching, that can then be something you do in addition to having a regular day job. So, yeah, I think that’s something that both Ed Donner and I have been interested in.
And I think for both Ed and me, there’s some kind of positive reinforcement when people enjoy the content, and you see, “Oh, more people are watching this video than the previous one.” It’s reinforcing and rewarding to get that. But I think a big part of it for me, and I can’t speak for Ed on this, is that I pick topics where I think, “Wow, I need to know that, or I’m going to be left behind.” It started with deep learning. Around 2015, deep learning had already been making a big impact academically and was starting to make a big impact in industry. Weekend after weekend would go by and I’d think, “I really wanted to learn about deep learning this weekend, finally get into neural networks,” but, you know, I’d just end up having another fun weekend. And then I had this brainwave: if I commit to having some materials ready for the public by some date, I’m just going to have to do it. So, yeah, for me, a lot of the content creation I do is just about making sure that I am staying up to date on skills.

[00:43:43] Michael: Hold on. If you can teach it, you know it.

[00:43:46] Jon: Yeah. Although, also, interestingly, if you can’t do it, teach. 

[00:43:50] Michael: Right. Also fair. Although I will say, as someone who took over an advanced quantum class in grad school, I’m still not sure I’m the best at that, but that’s a different conversation. I want to go back to Y Carrot a bit. We did bring it up before; you founded this company relatively recently, as you said. Maybe talk a little bit more about the impetus behind that and the team you’ve built. We’d love to learn a little bit more about it, partly because you and I haven’t talked a lot about it outside of this podcast, so I’m curious.

[00:44:18] Jon: Yeah. You’re going to get to hear a lot of this for the first time right now, Michael. There’s a company, they actually started in the UK, but they became a big Bay Area startup, called Onfido, O-N-F-I-D-O. Have you heard of them?

[00:44:29] Michael: I have not. No. 

[00:44:31] Jon: So Onfido last year became the largest acquisition ever of a British startup. So they are considerably larger than DeepMind, which held that title beforehand. They were acquired for something like $650 million. And the co-founder and CEO of Onfido for the first decade of that business, which ran for about fifteen years before it was acquired last year, was a gentleman named Husayn Kassai. Husayn and I led the Oxford Entrepreneurship Society together back when I was doing my PhD at Oxford, and we stayed in touch. Earlier this year, I met up with him in London just to catch up on how he’s been doing and what life has been like post-acquisition. You know, he’s got this insane motor. He’s so great at identifying priorities and executing on them, which is how you can have the kind of success that he’s had. When I met up with him in February, he had just opened the doors that week on something called the London AI Hub. It’s a physical space in London. If you’re based in London and you have an AI company, I highly recommend looking at it as a place to work, because all of a sudden you are surrounded by other AI companies at varying levels of growth. But at the time that Husayn and I met for coffee, it was brand new. He had just opened the doors, so there was no one there. It was just this empty space. So we’re having coffee, just the two of us, catching up on what we’re doing. And I mentioned to him in passing, as kind of the fifth or sixth of the different things I was doing, that I was thinking of starting a consulting firm. His eyes lit up, and he took me into one of the many empty offices with a whiteboard, and he started furiously writing out his thesis as to why it’s never been a better time to have an AI consulting firm. He has an agentic AI platform that he’s building right now, a company called Quench.
And he said if he wasn’t already doing that, if he could photocopy himself, he would be running an AI consulting business, because there’s never been a better time than now to be doing it. So that made this AI consulting idea jump for me from a back-burner thing I was doing, kind of just to keep my toes warm on what’s happening in the AI space and make sure I’m still involved in production deployments, to all of a sudden becoming my number one priority. Since then, we’ve had a number of different clients, ranging from publicly listed tech companies to small startups, hedge funds, and very large U.S. government departments. On all of the projects we’ve completed so far, our clients have been delighted, and it’s led to more work with them and expanding contracts. So, yeah, it’s been a really fun adventure. There’s so much opportunity. Like I was alluding to earlier in this episode, the AI capabilities that we have today are so powerful. And about every seven months, the length of a human task that can be accurately handled by an AI model doubles. So if today it’s about two hours of human work that a machine can replace with 90% accuracy, you can expect that in seven months it will be four hours, and seven months after that, eight hours. This has been happening for years. It’s a trajectory that you can map very reliably, and GPT-5 fell perfectly onto that curve when it came out in August. That means there’s unprecedented opportunity today, and that opportunity is doubling every seven months; the range of things you can be automating in your organization is vastly increasing. So what can you be doing today to set up your infrastructure and your governance, for data as well as for the humans in your organization, to take advantage of this? I think there is an unprecedented opportunity.
I think that anyone who’s listening out there who has experience building and deploying AI systems, I assume you’re having a huge amount of success. If you’re not, figure out how to make some tweaks because every conversation that I have leads to next steps.
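[Editor’s note] The doubling arithmetic Jon walks through is ordinary compound growth, and you can sketch it in a few lines. The numbers below (a two-hour starting point, a seven-month doubling period) are just the illustrative figures from his example, not anyone’s published methodology:

```python
def projected_task_hours(current_hours: float,
                         months_ahead: float,
                         doubling_months: float = 7.0) -> float:
    """Task length an AI can reliably handle `months_ahead` from now,
    assuming the trend Jon describes: it doubles every `doubling_months`."""
    return current_hours * 2 ** (months_ahead / doubling_months)

# Starting from a hypothetical 2-hour task today:
for months in (0, 7, 14, 21, 28):
    print(f"{months:>2} months out: ~{projected_task_hours(2.0, months):.0f} hours")
# → 2, 4, 8, 16, 32 hours
```

Two hours becomes four in seven months and eight in fourteen, which is Jon’s point: under a fixed doubling period, the automatable task length grows exponentially, not linearly.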

[00:48:32] Michael: That’s awesome. Congrats on launching it. It seems like it’s going very well. So congrats on, you know, keeping the journey going. At this point, we have these rapid-fire questions that we like to do. The first one is, what’s the most impactful lesson you’ve learned in your career that you wish more leaders understood?

[00:48:47] Jon: Something that I wish I’d known way earlier in my career, something I’ve only discovered in recent years that I wish I’d known during my PhD. It’s a set of principles. I’ve been friends for a decade now with James Clear, who wrote the book Atomic Habits. That book captures so many outstanding aspects of how to be successful. If I were to summarize the most valuable lesson I learned from that book, and this isn’t very rapid-fire all of a sudden, it’s that anything you want to start, anything you want to be good at, you can’t start off doing well. You just need to get the reps in. So if you want to build a software product, a business, or a following online, you just have to start, and it’s going to be bad, and you need to stick to a schedule. Those kinds of principles have pervaded the things I’ve done in recent years, and pursuit after pursuit, they invariably succeed, because if you just keep making efforts, take feedback on what you’re doing, and adjust, it is basically guaranteed that you will be successful at whatever you want to be successful at. I wish I had known that back when I was doing my PhD, because there are all kinds of things, like, if I had started a YouTube channel back then, yeah, it would have been bad. But think of the growth over time. You get compounding interest in any habit or pursuit, just like you do in your investment account.

[00:50:17] Michael: That’s a great answer, and it kind of goes to our previous conversation. Fitness is a perfect example of that, right? Like, whatever new adventure you do in sports, you’re going to be terrible at. But if you keep doing it, you have genetic limitations just like you do in everything, but you’ll get as good at it as you possibly can. But I think people often don’t do sports. They don’t do business adventures. They don’t do any of this stuff because of that fear, right? That fear of not being good at something and the reality is everyone was terrible at it at some point.

[00:50:48] Jon: I think sports are a really good playground for getting a sense of this because it can be so quantitative right off the bat. With basically anything in sports, you know, you can measure how you’re doing from the beginning. With some things in business, that can be a lot harder. You know? If you want to be a good CEO, you need to be making lots of sales. And that once you have some traction, then, okay, you can have monthly targets. You can kind of have a sense of how you’re doing. But before you have that traction, how do you measure how well you’re doing on sales calls before you’ve made the first sale? You know, those first 100 calls, you might not have a sale. Are you getting better or not?

[00:51:22] Michael: Totally. Totally. That makes sense. Second rapid-fire question. What’s one AI trend you think people are underestimating right now?

[00:51:29] Jon: You know, I already said this. The specific chart that people need to look up is created by METR, M-E-T-R. You can search something like “METR GPT-5” and the chart will come up right away. It shows visually this thing I’ve described a number of times in this episode: how the length of a human task that AI systems can replace, at some level of accuracy, doubles every seven months. That is insane. The way that is going to transform the world in the coming years is mind-blowing, so wrap your head around that.

[00:52:03] Michael: Maybe we’ll include that in the comments. I haven’t gotten to say that before, but maybe we can stick it down there on YouTube for folks. Next question. If you could instantly solve one major problem in tech, what would it be?

[00:52:14] Jon: One problem in tech or one problem with tech?

[00:52:18] Michael: I would say either way. Either way, you choose.

[00:52:22] Jon: From my perspective, there are lots of problems in this world that we need to deal with, but I don’t think any problem is more pressing than just having a planet that we can live on that is hospitable, that can provide high-quality nutrition. You know, there’s probably going to be about 11 billion people on this planet later this century. In order to be able to feed all of those people with high-quality nutrition, I think that’s the most pressing problem that we have. We’re at a point where, you know, biodiversity is dramatically reducing, less and less arable land is available, and, you know, that kind of thing leads to conflict. And it happens to be the same areas. You know, climate change is already making places that were already dry even drier, and that leads to conflicts in places like the Middle East. There are a lot of people out there that are tackling this problem, and I am very optimistic that we will have, you know, that the world will continue to be a better and better place as it has been over the past century, but it is something that we can’t lose sight of.

[00:53:24] Michael: Great answer. Alright. The next question is: outside of Super Data Science, since you can’t appeal to your own podcast, what’s your favorite resource for staying current on new AI and machine learning developments?

[00:53:37] Jon: It’s so easy. And it is a little bit of a cheat, because a few times a year I do actually co-host this show, but it’s called Last Week in AI. 95% of the time I am not in any way involved in the show, so I think it’s a fair answer. Last Week in AI is a fantastic podcast. It is funny. The hosts, Jeremie and Andrey, are so knowledgeable. Unlike my show, which is guest-focused and deep-dive focused, Super Data Science deep dives into particular topics, with Last Week in AI they are deliberately giving you a general overview of all of the news that has happened in the past week. So for anyone who wants to stay up to date on what’s going on in AI, no technical background required, Last Week in AI is the show.

[00:54:19] Michael: Cool. Well, we’ll also listen when you are the co-host. We’ll make sure that we’re on the call.

[00:54:24] Jon: I’m recording with them tomorrow, so it could be around the same time as my Signal to Noise episode coming out. I could be co-hosting Last Week in AI as well.

[00:54:31] Michael: Nice. Last question. In one word, what does great leadership mean to you?

[00:54:38] Jon: It’s empathy. I want everyone that I work with, whether we’re at the same level, whether they’re senior to me, whether they’re junior to me, I want them to feel heard. I want their ideas, their impact, to be felt, to know that it’s valued. That’s the most important thing to me.

[00:54:57] Michael: Awesome. Well, Jon, one other question that I have for you. You have Y Carrot, you have Super Data Science, you’re doing all this work in content, as you were mentioning before we started. How important do you think this content engine you’ve built is to your ability to do Y Carrot and to do other things in AI?

[00:55:13] Jon: This is a really interesting question. It’s something I actually wanted to get into earlier in the episode when we were talking about the podcast and how I got started in it, or why I do it: why would Ed and I create content? I think that today, creating content can really help you stand out as an individual or as a leader, especially in fields like software development and AI engineering, where you can be creating real things. You can really cut your teeth on meaningful projects, publish those on GitHub, create a YouTube tutorial, write a blog post, summarize it in a LinkedIn post giving people a link to the GitHub repo and the YouTube video. That kind of thing. You are providing real value to people, and it shows prospective employers or prospective clients that you are up to date on the latest things and that you can communicate them in an effective way. So I do think it’s important, and it’s no accident that at Y Carrot my title is CEO, because I am responsible for using this content creation machine that I have to help bring business into the organization. Using it as lead generation is really what I’m responsible for. Managing and executing the projects day to day falls to other folks on the team.

[00:56:41] Michael: Very similar role. Well, thank you, Jon. You know, I’m super appreciative of you doing this and spending the time, so thank you so much.

[00:56:47] Jon: For you, Michael, I have so much time. I’ve always really enjoyed any conversation we’ve ever had. And yeah, for your listeners out there, follow me on LinkedIn. That’s my primary social medium. Subscribe to the Super Data Science podcast wherever you listen to podcasts or on our quickly growing YouTube channel.

[00:57:02] Michael: I have checked out the YouTube channel. Thank you, Jon. That wraps up today’s episode with Dr. Jon Krohn, co-founder and CEO of Y Carrot and host of the Super Data Science podcast. We’ve explored his journey in AI education, practical applications of machine learning, and insights into building a human-centered data science consultancy. Jon, thank you for sharing your expertise and experience with us. If you enjoyed this conversation, make sure to subscribe, share, and leave a review. Stay tuned for more episodes of Signal to Noise, where we continue to explore the intersection of leadership, technology, and innovation.

[00:57:36] Outro: Signal to Noise is brought to you by Riviera Partners, leaders in executive search and the premier choice for tech talent. To learn more about how Riviera helps people and companies reach their full potential, visit rivierapartners.com. And don’t forget to search for Signal to Noise by Riviera Partners on Apple Podcasts, Spotify, or anywhere you listen to podcasts.

About the guest

Jon Krohn, Co-Founder & CEO, Y Carrot

Jon Krohn is the Co-Founder and CEO of Y Carrot, a data science consultancy delivering AI solutions to Fortune 50 companies, startups, and government agencies. As the host of the globally recognized Super Data Science podcast and author of the #1 bestselling book “Deep Learning Illustrated,” he brings vast expertise in AI education and practical machine learning applications. Currently serving as a Machine Learning Practice Fellow at Lightning AI, Jon combines his academic background with hands-on industry experience to help organizations implement effective AI solutions.
