Value Driven Data Science

Episode 54: The Hidden Productivity Killer Most Data Scientists Miss

23 min • 5 March 2025

Why do some data scientists produce results at a rate 10X that of their peers?

Many data scientists believe that better technologies and faster tools are the key to accelerating their impact. But the highest-performing data scientists often succeed through a different approach entirely.

In this episode, Ben Johnson joins Dr Genevieve Hayes to discuss how productivity acts as a hidden multiplier for data science careers, and shares proven strategies to dramatically accelerate your results.

This episode reveals:

  1. Why lacking clear intention kills productivity — and how to ensure every analysis drives real decisions. [02:11]
  2. A powerful “storyboarding” framework for turning vague requests into actionable projects. [09:51]
  3. How to deliver results faster using modern data architectures and raw data analysis. [13:19]
  4. The game-changing mindset shift that transforms data scientists from order-takers into trusted strategic partners. [17:05]

Guest Bio

Ben Johnson is the CEO and Founder of Particle 41, a development firm that helps businesses accelerate their application development, data science and DevOps projects.

[00:00:00] Dr Genevieve Hayes: Hello and welcome to Value Driven Data Science, the podcast that helps data scientists transform their technical expertise into tangible business value, career autonomy, and financial reward. I’m Dr. Genevieve Hayes, and today I’m joined by Ben Johnson, CEO and founder of Particle 41, a development firm that helps businesses accelerate their application development, data science, and DevOps projects.

[00:00:30] In this episode, we’ll discuss strategies for accelerating your data science impact and results without sacrificing technical robustness. So get ready to boost your impact, earn what you’re worth, and rewrite your career algorithm. Ben, welcome to the show.

[00:00:48] Ben Johnson: Yeah, thank you for having me.

[00:00:50] Dr Genevieve Hayes: One of the most common misconceptions I see about data scientists is the mistaken belief that their worth within a business is directly linked to the technical complexity of the solutions they can produce.

[00:01:04] And to a certain extent, this is true. I mean, if you can’t program, fit a model, or perform even the most basic statistical analysis, realistically, your days as a data scientist are probably numbered. However, while technical skills are certainly necessary to land a data science job, the data scientists I see making the biggest impact are the ones who are not necessarily producing the most complex solutions, but who can produce solutions to the most pressing business problems in the shortest possible time.

[00:01:41] So in that sense, productivity can be seen as a hidden multiplier for data science careers. Ben, as the founder of a company that helps businesses accelerate their data science initiatives, it’s unsurprising that one of your areas of interest is personal productivity. Based on your experience, what are some of the biggest productivity killers holding data scientists back?

[00:02:11] Ben Johnson: I don’t know for others. I know for myself that what kills my productivity is not having an intention or a goal or a direct target that I’m trying to go for. So when we solve the science problems, we’re really trying to figure out, like, what is that hunt statement, that key question, you know, the question that will bring the answer.

[00:02:33] And also, what is the right level of information that would handle that at the asker’s level? So the ask is coming from a context or a person. And so we can know a lot. If that person is a fellow data scientist, then obviously we want to give them data. We want to answer them with data. But if that’s a results oriented business leader, then we need to make sure that we’re giving them information.

[00:02:57] And we are the managers of the data. But to answer your question, I think that the biggest killer of productivity is not being clear on what question we are trying to answer.

[00:03:08] Dr Genevieve Hayes: That resonates with my own experience. One of the things I encountered early in my data science career was, well, to take a step back, I originally trained as an actuary and worked as an actuary, and I was used to the situation where your boss would effectively tell you what to do. So, go calculate

[00:03:28] premiums for a particular product. So when I moved into data science, I think I expected the same from my managers. And so I would ask my boss, okay, what do you want me to do? And his answer would be something like, oh, here’s some data, go do something with it. And you can probably imagine the sorts of solutions that we got. Myself and my team would come up with something that was a model that looked like a fun fit

[00:03:59] and those solutions tended to go down like a lead balloon. And it was only after several failures along those lines that it occurred to me, maybe we should look at these problems from a different point of view and figure out what it is that the senior management actually want to do with this data before starting to build a particular model from it.

[00:04:24] Ben Johnson: Yeah. What decision are you trying to make? Just kind of starting with like the end in mind or the result in mind, I find in any kind of digital execution there are people who speak results language and there are people who speak solutions language. And when we intermix those two conversations,

[00:04:41] it’s frustrating. It’s frustrating for the solution people to be like, okay, great. When are you going to give it to me? And it’s frustrating for the business folks, like, hey, when am I going to get that answer, when we want to talk about the solution? So I found like bifurcating, like, okay, let’s have a results or planning discussion separate from a solution, and asking for that right to proceed

[00:05:02] in the way that we communicate is super helpful. What you shared reminds me of some of the playbooks that we have around data QA, because in those playbooks, we’re doing analysis just for analysis’ sake. I feel like we’re looking for the outliers.

[00:05:18] Okay. So if we look at this metric, these are the outliers. And really what we’re doing is we’re going back to the originators of the data and saying, like, sanity check this for us. We want to run through a whole set of sanity checks to make sure that the pipeline that we’re about to analyze makes sense.

[00:05:34] Are there any other exterior references that we can compare this to? And I do know that the first time we were participating in this concept of data QA, not having that playbook was a problem, right? Like, well, okay. Yeah, the data is there. It’s good. It’s coming in, but you know, to really grind on that and make sure that it was reflective of the real world was an important step.

[00:05:57] Dr Genevieve Hayes: So by QA, I take it you mean quality assurance here? Is that right?

[00:06:02] Ben Johnson: Yes. That’s the acronym quality assurance, but testing and doing QA around your data pipelines.

[00:06:09] Dr Genevieve Hayes: Okay, so I get it. So actually making sure the pipelines work. And if you don’t understand what is it that you’re looking for with regard to performance, then you can end up going off in the wrong direction. Is that correct?

[00:06:23] Ben Johnson: So if you were analyzing sales data, you would want to make sure that your totals reflected the financial reports. You just want to make sure that what you’ve accumulated in your analysis environment is reflective of the real world. There’s nothing missing. It generally makes sense. We just haven’t introduced any problem in just the organizing and collection of the data.
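
Editor’s sketch: the reconciliation check described above could look something like the following. This is a minimal illustration only; the function name, the record layout, the sample figures and the 0.5% tolerance are all assumptions made for the example, not anything described in the episode.

```python
from collections import defaultdict


def reconcile_totals(sales_rows, reported_totals, tolerance=0.005):
    """Return periods whose analysed total disagrees with the reported total.

    sales_rows: iterable of (period, amount) tuples from the analysis environment.
    reported_totals: dict mapping period -> total taken from the financial report.
    tolerance: allowed relative difference (hypothetical default of 0.5%).
    """
    analysed = defaultdict(float)
    for period, amount in sales_rows:
        analysed[period] += amount

    mismatches = {}
    for period, reported in reported_totals.items():
        computed = analysed.get(period, 0.0)
        # Relative difference; guard against a reported total of zero.
        denominator = abs(reported) if reported else 1.0
        if abs(computed - reported) / denominator > tolerance:
            mismatches[period] = (computed, reported)
    return mismatches


# Hypothetical sample data: Q2 is missing a row, so it should be flagged.
rows = [("Q1", 100.0), ("Q1", 50.0), ("Q2", 80.0)]
report = {"Q1": 150.0, "Q2": 120.0}
print(reconcile_totals(rows, report))
```

A non-empty result is a prompt to go back to the originators of the data before any modelling starts.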

[00:06:45] Dr Genevieve Hayes: Yeah, yeah. From my background in the insurance industry, those were all the sorts of checks that we used to have to do with the data as well.

[00:06:52] Ben Johnson: Well, and oftentimes the folks that are asking these hard questions, they’re not asking the questions because they have any idea how clean the data they’ve collected is. They just think there might be a chance. It’s like Dumb and Dumber, you know, okay, so we think we have a chance, you know. Anyways, awful movie reference, but they think that there might be a possibility that the answer to all of their questions or this hard decision that they need to make regularly is somewhere in that pile of stuff.

[00:07:21] What we call a QA analysis also is checking the data’s integrity, if it’s even capable of solving the problem. So I think that’s a great first step, and sometimes that’s just kind of analysis for analysis’ sake, or feels that way.

[00:07:37] Dr Genevieve Hayes: One of the things you’ve touched on several times is the idea of the results oriented people and the solutions oriented people, and I take it with the solutions oriented people, you’re talking about people like the data scientists. When the data scientists are talking to those results oriented people, is there a framework that they can follow for identifying what sorts of results those results oriented people are looking for?

[00:08:08] Ben Johnson: It’s very similar to the way that you approach like a UI/UX design. We’ve taken kind of a storyboard approach to what they want to see. Like, okay, what is the question? What are you expecting the answer to be? Like, what do you think would happen?

[00:08:25] And then what kind of decisions are you going to do as a result of that? And you had some of those things as well. But kind of storyboarded out what’s the journey that they’re going to take, even if it’s just a logical journey through this data to go affect some change.

[00:08:41] Dr Genevieve Hayes: So do you actually map this out on a whiteboard or with post it notes or something? So literally building a storyboard?

[00:08:48] Ben Johnson: Most of the time, it’s bullets. It’s more of like written requirements. But when we think of it, we think of it in a storyboard, and often it’ll turn into like a PowerPoint deck or something because we’re also helping them with their understanding of the funding of the data science project, like connecting ROI and what they’re trying to do.

[00:09:10] So yeah. Yeah, our firm isn’t just staff augmentation. We want to take a larger holistic ownership approach of the mission that we’re being attached to. So this is critical to like, okay, well, we’re going to be in a data science project together. Let’s make sure that we know what we’re trying to accomplish and what it’s for.

[00:09:29] Because, you know, if you’re working on a complex project and six months in everybody forgets why they’ve done this, like why they’re spending this money, oftentimes you need to remind them and show them where you are in the roadmap to solving those problems.

[00:09:44] Dr Genevieve Hayes: With the storyboard approach, can you give me an example of that? Cause I’m still having a bit of trouble visualizing it.

[00:09:51] Ben Johnson: Yeah, it’s really just a set of questions. What are you trying to accomplish? What do you expect to have happen? Where are you getting this data? It’s just a discovery survey that we are thinking about when we’re establishing the ground rules of the particular initiative.
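
Editor’s sketch: one way to picture this discovery survey is as a small record whose unanswered questions block the project from starting. The class name, field names and sample request below are hypothetical, chosen only to mirror the questions listed in the conversation.

```python
from dataclasses import dataclass, fields


@dataclass
class Storyboard:
    question: str = ""          # What are you trying to accomplish or answer?
    expected_outcome: str = ""  # What do you expect to have happen?
    decision: str = ""          # What decision will the answer drive?
    data_source: str = ""       # Where are you getting this data?

    def missing(self):
        """Names of the discovery questions that are still unanswered."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical request with two of the four questions still open.
request = Storyboard(
    question="How much inventory should each shop hold in December?",
    data_source="Purchase orders from each store",
)
print(request.missing())
```

In this framing, an empty list from `missing()` is the signal that the request is specific enough to start work on.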

[00:10:08] Dr Genevieve Hayes: And how do you go from that storyboard to the solution?

[00:10:12] Ben Johnson: That’s a great question. So the solution will end up resolving in whatever kind of framework we’re using, Databricks or whatever. It’ll talk about the collection, the organization and the analysis. So we’ll break down how are we going to get this data, is the data already in a place where we can start messing with it.

[00:10:32] What we’re seeing is that a lot of... and I’m kind of going deep on the collection piece because I feel like that’s like 60 percent of the work. We prefer a kind of a lake house type of environment where we’ll just leave a good portion of the data in its raw original format, analyze it.

[00:10:52] Bring it into the analysis. And then, of course, we’re usually comparing that to some relational data. But all that collection, making sure we have access to all of that, and it’s in a methodology and pipelines that we can start to analyze, is kind of the critical first step. So we want to get our hands around that.

[00:11:10] And then the organization. So is there, you know, anything we need to organize or that is a little bit messy? And then what are those analyses? Like, what are those reports that are going to be needed or the visualizations that would then be needed on top of that? And then what kind of decisions are trying to be made?

[00:11:28] So that’s where the ML and the predictive analytics could come in to try to help assist with the decisions. And we find that most data projects follow those centralized steps, and we need to have answers for those.

[00:11:43] Dr Genevieve Hayes: So a question that might need to be answered is, how much inventory should we have in a particular shop at a particular time? So that you can satisfy Christmas demand. And then you’d go and get the data about

[00:11:59] Ben Johnson: Yeah. The purchase orders or yeah. Where’s the data for your purchase orders? Do you need to collect that from all your stores or do you already have that sitting in some place? Oh, yeah. It’s in all these, you know, disparate CSVs all over the place. We just did a project for a leading hearing aid manufacturer.

[00:12:18] And most of the data that they wanted to use was on a PC in the clinics. So we had to devise a collection mechanism in the software that the clinics were using to go collect all that and regularly import that into a place where we could analyze it, see if it was standardized enough to go into a warehouse or a lake.

[00:12:39] And there were a lot of standardization problems. Oddly, some of the clinics had kind of taken matters into their own hands and started to add custom fields and whatnot. So we had to rationalize all of that. So collection, I feel like, is 60 percent of the problem.
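
Editor’s sketch: the custom-field problem described here can be illustrated by splitting each incoming record into the fields a shared schema expects and whatever extras a site has added. The schema, field names and sample records are invented for the example and are not the client’s actual data.

```python
STANDARD_FIELDS = {"clinic_id", "test_date", "result"}  # hypothetical schema


def split_record(record):
    """Separate a raw record into standard fields and site-specific extras."""
    # Normalise header spelling so "Clinic_ID" and "clinic_id" are treated alike.
    normalised = {key.strip().lower(): value for key, value in record.items()}
    standard = {name: normalised.get(name) for name in sorted(STANDARD_FIELDS)}
    extras = {k: v for k, v in normalised.items() if k not in STANDARD_FIELDS}
    return standard, extras


def custom_field_report(records):
    """Count how often each non-standard field appears across all records."""
    counts = {}
    for record in records:
        _, extras = split_record(record)
        for name in extras:
            counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical records as they might arrive from different sites.
raw = [
    {"Clinic_ID": "A1", "test_date": "2024-01-05", "result": "pass"},
    {"clinic_id": "B2", "test_date": "2024-01-06", "result": "refer", "Notes ": "recheck"},
    {"clinic_id": "B2", "test_date": "2024-01-07", "result": "pass", "notes": "ok"},
]
print(custom_field_report(raw))
```

A report like this gives a quick view of how far the collected data has drifted from the expected layout before deciding whether it is standardized enough for a warehouse or a lake.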

[00:12:54] Dr Genevieve Hayes: So, we’ve got a framework for increasing productivity by identifying the right problem to solve, but the other half of this equation is how do you actually deliver results in a rapid fashion, because, as you know, a result today is worth far more than a result next year. What’s your advice around getting to those final results faster?

[00:13:19] Ben Johnson: So that’s why I like the lake house architecture. We’re also finding new mechanisms and methodologies, some I can’t talk about, where rather than taking this time to take some of the raw data and kind of continuously summarize it... So maybe you’re summarizing it and data warehousing it, but we like the raw data to stay there and just ask it the questions, but it takes more time and more processing power.

[00:13:47] So what I’m seeing is we’re often taking that and organizing it into like a vector database or something that’s kind of right for the analysis. We’re also using vector databases in conjunction with AI solutions. So we’re asking the, we’re putting, we’re designing the vector database around the taxonomy, assuming that the user queries are going to match up with that taxonomy, and then using the LLM to help us make queries out of the vector database, and then passing that back to the LLM to test.

[00:14:15] Talk about it to make rational sense about the story that’s being told from the data. So one way that we’re accelerating the answer is just to ask questions of the raw data and pay for the processing cost. That’s fast, and that also allows us to say, okay, do we have it?

[00:14:32] Like, are we getting closer to having something that looks like the answer to your question? So we can be iterative that way, but at some point we’re starting to get some wins. In that process. And now we need to make those things more performant. And I think there’s a lot of innovation still happening in the middle of the problem.

[00:14:51] Dr Genevieve Hayes: Okay, so you’re starting by questioning the raw data. Once you realize that you’re asking the right question and getting something that the results oriented people are looking for, would you then productionize this and start creating pipelines and asking questions of processed data? Yeah.

[00:15:11] Ben Johnson: Yeah. And we’d start figuring out how to summarize it so that the end user wasn’t waiting forever for an answer.

[00:15:17] Dr Genevieve Hayes: Okay, so by starting with the raw data, you’re getting them answers sooner, but then you can make it more robust.

[00:15:26] Ben Johnson: That’s right. Yes. More robust. More performant and then, of course, you could then have a wider group of users on the other side consuming that it wouldn’t just be a spreadsheet. It would be a working tool.
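
Editor’s sketch: the two stages discussed here, asking questions of the raw data first and summarizing later for performance, can be contrasted in a few lines. The event log, function names and product names are hypothetical, and a real system would use a query engine rather than Python lists.

```python
from collections import Counter

# Hypothetical raw event log: (day, product) pairs kept in their original form.
RAW_EVENTS = [
    ("2024-12-01", "widget"),
    ("2024-12-01", "gadget"),
    ("2024-12-02", "widget"),
    ("2024-12-02", "widget"),
]


def ask_raw(events, product):
    """Exploratory step: scan every raw record each time a question is asked."""
    return sum(1 for _, name in events if name == product)


def build_summary(events):
    """Production step: precompute counts once so later lookups are cheap."""
    return Counter(name for _, name in events)


summary = build_summary(RAW_EVENTS)
# Both routes give the same answer; only the cost per question differs.
print(ask_raw(RAW_EVENTS, "widget"), summary["widget"])
```

The raw scan needs no pipeline and is enough to learn whether the data can answer the question at all; the summary is worth building only once that answer is yes.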

[00:15:37] Dr Genevieve Hayes: Yeah, it’s one of the things that I was thinking about. I used to have a boss who would always say fast, cheap and good, pick two. Meaning that you can have a solution now and it can be cheap, but it’s going to come at the cost of quality. And it sounds like you focus on fast and cheap first, with some sacrifice of quality because you are dealing with raw data.

[00:16:00] But then, once you’ve got something locked in, you improve the quality of it, so then technical robustness doesn’t take a hit.

[00:16:09] Ben Johnson: Yeah, for sure. I would actually say in the early stage, you’re probably sacrificing the cheap for good and fast because you’re trying to get data right off the logs, right off your raw data, whatever it is. And to get an answer really quickly on that without having to set up a whole lot of pipeline is fast.

[00:16:28] And it can be very good. It can be very powerful. We’ve seen many times where it, like, answers the question. You know, the question of, is that data worth mining further and summarizing and keeping around for a long time? So in that way, I think we addressed the ROI of it on the failures, right.

[00:16:46] Being able to fail faster. Oh yeah. That data is not going to answer the question that we have. So we don’t waste all the time of what it would have been to process that.

[00:16:55] Dr Genevieve Hayes: And what’s been the impact of taking this approach for the businesses and for the data scientists within your organisation who are taking this approach?

[00:17:05] Ben Johnson: I think it’s the feeling of, like, partnership with us around their data, where we’re taking ownership of the question and they’re giving us access to whatever they have. And there’s a feeling of partnership and the kind of like immediate value. So we’re just as curious about their business as they are.

[00:17:27] And then we’re working shoulder to shoulder to help them determine the best way to answer those questions.

[00:17:32] Dr Genevieve Hayes: And what’s been the change in those businesses between, before you came on board and after you came on board?

[00:17:39] Ben Johnson: Well, I appreciate that question. So with many of the clients, they see that, oh, this is the value of the data. It has unlocked this realization that I, in the case of the hearing aid manufacturer that we work with, they really started finding that they could convert more clients and have a better brand relationship by having a better understanding of their data.

[00:18:03] And they were really happy that they kept, you know, 10 years’ worth of hearing test data around to be able to understand their audience better and then turn that into... So they’ve seen a tremendous growth in brand awareness and that’s resulted in making a significant dent in maintaining and continuing to grow their market share.

[00:18:26] Dr Genevieve Hayes: So they actually realize the true value of their data.

[00:18:30] Ben Johnson: That’s right. And then they saw when they would take action on their data they were able to increase market share because they were able to affect people that truly needed to know about their brand. And like we’re seeing after a couple of years, their brand is like, you don’t think hearing aids unless you think of this brand.

[00:18:48] So it’s really cool that they’ve been able to turn that data by really talking to the right people and sending their brand message to the right people.

[00:18:56] Dr Genevieve Hayes: Yeah, because what this made me think of was one of the things I kept encountering in the early days of data science was a lot of senior decision makers would bring in data scientists and see data science as a magic bullet. And then because the data scientists didn’t know what questions to answer, they would not be able to create the value that had been promised in the organization.

[00:19:25] And the consequence after a year or two of this would be the senior decision makers would come to the conclusion that data science is just a scam. But it seems like by doing it right, you’ve managed to demonstrate to organizations such as this hearing aid manufacturer, that data science isn’t a scam and it can actually create value.

[00:19:48] Ben Johnson: Absolutely. I see data science as anytime that loop works, right? Where you have questions. So even I have a small client, small business, he owns a glass manufacturing shop. And the software vendor he uses doesn’t give him an inexpensive way to mark, like, who his salespeople are,

[00:20:09] so he needs a kind of a salesperson dashboard. What’s really cool is that his software gives them, they get full access to a read only database. So putting a dashboard on top of his data to answer these salesperson activities and commissions and just something like that. That’s data science.

[00:20:28] And now he can monitor his business. He’s able to scale using his data. He’s able to make decisions on how many salespeople should I hire, which ones are performing, which ones are not performing. How should I pay them? That’s a lot of value. To us as data scientists, it just seems like we just put a dashboard together.

[00:20:46] But for that business, that’s a significant capability that they wouldn’t have otherwise had.

[00:20:52] Dr Genevieve Hayes: So with all that in mind, what is the single most important change our listeners could make tomorrow to accelerate their data science impact and results?

[00:21:02] Ben Johnson: I would just say, be asking that question, like what question am I trying to answer? What do you expect the outcome to be? Or what do you think the outcome is going to be? So that I’m not biased by that, but I’m sanity checking around that. And then what decisions are you going to make as a result?

[00:21:19] I think always having that like in the front of your mind would help you be more consultative and help you work according to an intention. And I think that’s super helpful. Like don’t let the client or the customer in your case, whether that be an internal person, give you that assignment, like, just tell me what’s there.

[00:21:38] Right. I just want insights. I think we have to push our leaders to give us a little more than that.

[00:21:46] Dr Genevieve Hayes: The way I look at it is, don’t treat your job as though you’re someone in a restaurant who’s just taking an order from someone.

[00:21:53] Ben Johnson: Sure.

[00:21:54] Dr Genevieve Hayes: Look at it as though you’re a doctor who’s diagnosing a problem.

[00:21:58] Ben Johnson: Yeah. And the data scientists that I worked with that have that like in their DNA, like they just can’t move forward unless they understand why they’re doing what they’re doing, have been really impactful in the organization. They just ask great questions and they quickly become an essential part of the team.

[00:22:14] Dr Genevieve Hayes: So for listeners who want to get in contact with you, Ben, or to learn more about Particle 41, what can they do?

[00:22:21] Ben Johnson: Yeah, I’m on LinkedIn. In fact I love talking to people about data science and DevOps and software development. And so I have a book appointment link on my LinkedIn profile itself. So I’m really easy to get into a call with, and we can discuss whatever is on your mind. I also offer fractional CTO services.

[00:22:42] And I would love to help you with a digital problem.

[00:22:45] Dr Genevieve Hayes: And there you have it. Another value packed episode to help turn your data science skills into serious clout, cash, and career freedom. If you enjoyed this episode, why not make it a double? Next week, catch Ben’s value boost, a quick five minute episode where he shares one powerful tip for getting real results real fast.

[00:23:10] Make sure you’re subscribed so you don’t miss it. Thanks for joining me today, Ben.

[00:23:16] Ben Johnson: Thank you. It was great being here. I enjoyed it.

[00:23:19] Dr Genevieve Hayes: And for those in the audience, thank you for listening. I’m Dr. Genevieve Hayes, and this has been Value Driven Data Science.

The post Episode 54: The Hidden Productivity Killer Most Data Scientists Miss first appeared on Genevieve Hayes Consulting and is written by Dr Genevieve Hayes.
