
Cloud Native Testing Podcast
The Cloud Native Testing Podcast, sponsored by Testkube, brings you insights from engineers navigating testing in cloud-native environments.
Hosted by Ole Lensmar, it explores test automation, CI/CD, Kubernetes, shifting left, scaling right, and reliability at scale through conversations with testing and cloud native experts.
Learn more about Testkube at http://testkube.io
AI’s Impact on Testing: Enhancing Quality Without Losing Control with Laurent Py
In this episode, Ole and Laurent Py of Packmind explore how AI is reshaping software development, particularly in quality assurance and testing. The discussion delves into AI’s dual role in testing—both as a tool for generating tests and as a complement to human critical thinking. They also examine the ongoing importance of exploratory testing, the impact of AI on test execution efficiency, and the challenges junior developers face in an AI-driven landscape. The conversation wraps up with insights into the future of AI in software development, emphasizing the need for a balance between automation and human oversight.
--
This podcast is proudly sponsored by Testkube, the cloud-native, vendor-agnostic test execution and orchestration platform that enables teams to run any type of test automation directly within their Kubernetes infrastructure. Learn more at www.testkube.io
Ole Lensmar (00:01.184)
Okay, hello everyone. Welcome to today's episode. I'm so thrilled and happy to be joined by Laurent Py, or Py depending on your background. Laurent, I've been fortunate to have been able to work with you at SmartBear. Now I know you're at Packmind doing some amazing stuff there, so I'll hand it over to you for an intro. Please tell us what you're up to.
Laurent (00:30.836)
Hi everyone. It has also been my pleasure to work with you at SmartBear. I'm an entrepreneur, and I've spent the last 20 years of my career in software development and software tools for developers, especially with a focus on quality and testing.
So yeah, this podcast is right to the point. And as you mentioned, I joined Packmind about a year ago, a bit more than a year ago, focused again on tools for developers, but this time leveraging AI, like a lot of companies nowadays.
Ole Lensmar (01:10.21)
Yes, definitely. I feel there's a trend going on out there with AI. Okay, super interesting. And do you want to share a little more detail about how Packmind works, and how AI and testing are converging?
Laurent (01:13.726)
Yep.
Laurent (01:25.884)
Yeah, so Packmind is the tech lead copilot. Basically, we scale tech leads by automatically capturing all the technical decisions that are made and stored. They can be in wikis where you have coding guidelines, they can be ADRs, architecture decision records, in your git repo, but even more importantly they live in code review. When you make a comment in a code review, most of the time you share a technical decision with the developer and may ask them to rework their part of the code.
So you have some level of documentation, but it's never centralized, and it remains documentation; it's not actionable. Packmind basically captures all of this and, like a linter on steroids, makes sure that whenever a line of code is written in the IDE, whether by you or by GitHub Copilot, that line of code adheres to your standards and your specific coding guidelines.
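As an aside for readers, here is a minimal Python sketch of what a "linter on steroids" style guideline check could look like in spirit. The rule, module layout, and names are assumptions for illustration only, not Packmind's actual implementation.

```python
# Minimal sketch of a team-guideline check in the spirit of "a linter on steroids".
# The rule and module names are hypothetical, not Packmind's actual implementation.
import ast
import sys

GUIDELINE = "HTTP calls must go through shared.http_client, not raw 'requests'."

def check_file(path: str) -> list[str]:
    """Flag direct imports of 'requests' outside the approved client module."""
    violations = []
    tree = ast.parse(open(path, encoding="utf-8").read(), filename=path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if "requests" in names and not path.endswith("shared/http_client.py"):
            violations.append(f"{path}:{node.lineno}: {GUIDELINE}")
    return violations

if __name__ == "__main__":
    problems = [v for f in sys.argv[1:] for v in check_file(f)]
    print("\n".join(problems) or "No guideline violations found.")
    sys.exit(1 if problems else 0)
```

A check like this could run in the IDE or as a pre-commit hook; the point is that a captured decision becomes actionable instead of sitting in a wiki.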
Ole Lensmar (02:41.462)
Okay, so it kind of extrapolates the guidelines and standards from your documentation, your comments, and anything it can learn from in your existing code base.
Laurent (02:54.394)
Exactly, yeah, exactly. And the problem, you know, with documentation, especially when it's stored in Notion or Confluence, is that no one reads it. Especially if you are new to the team, that's not necessarily the first place you'll go. So when you are alone facing your IDE and trying to develop something, you're not necessarily going to remember all the hundreds of technical decisions that have been made on purpose. In my background, I've been a developer, but I would define myself more as a product person, a product leader, and a product is the result of
Ole Lensmar (03:34.926)
Okay.
Laurent (03:39.998)
thousands of decisions. There are business decisions and trade-offs that you make, but there are also technical decisions: because we're going to have millions of users, we want to optimize for performance. Or, for example, at SmartBear I was running a portfolio of 10 products, so we wanted to optimize for reusability
Ole Lensmar (03:59.65)
Mm. Mm. Mm.
Laurent (04:00.196)
over performance, because it's really important to be able to reuse components. For example, BDD: we have three products that do BDD, and we're not going to redevelop that behavior again and again in all three products. So that's a technical decision that you make. The problem is that we make these decisions over time; we may document them at best, but then no one is going to remember them or apply them. So the great thing with AI is that we now have the power to capture these decisions, formal or informal, and turn them into action. And that's exactly what Packmind does. The reason why it's critical nowadays is that, with the rise of Copilot, Cursor, and all these AI coding assistants, not to mention the agents that we may talk about in this podcast, the rate of code creation dramatically accelerates. So you have way more lines of code produced, often with more technical debt, and you start to see bottlenecks in code review. So it's important to help tech leads, when they do the code review, to make sure that all these lines of code, whether created by AI, influenced by AI, or created by humans, really align with these hundreds, if not thousands, of technical decisions. That's where Packmind can help and scale them.
Ole Lensmar (05:34.958)
That's super interesting. I mean, I think AI is obviously permeating all parts of software engineering. But specifically when it comes to quality and testing, where I've heard it most is around generating tests for you. And that can be both unit tests, which might be harder, but also, you know, Playwright tests or
Ole Lensmar (06:02.656)
API tests, where you say, hey, can you create a test for me for this OpenAPI definition? Or, here I have this web app, could you create a Playwright test for me that tests it? I always feel like this could work pretty well, but I'm always a little bit skeptical. How good is it at capturing edge cases and unusual behavior, those kinds of things? Maybe it's really good at that, maybe I'm just ignorant there, and I'm sure it can be a good help. Maybe that's just how to see it, just as with a coding copilot, a test creation copilot, which, considering shift left, makes sense since a lot of tests are code in the end. Do you see it similarly, or do you also see people using AI to create tests themselves?
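For illustration, here is a rough Python sketch of the kind of API test an assistant might scaffold when pointed at an OpenAPI definition. The base URL, endpoint, and fields are hypothetical; the happy path comes easily, while deciding which edge cases actually matter still falls to a human reviewer.

```python
# A sketch of the kind of API test an AI assistant might scaffold from an
# OpenAPI definition. The /users endpoint and fields are hypothetical.
import requests

BASE_URL = "https://api.example.com"  # assumption: replace with your service

def test_list_users_returns_expected_shape():
    # Happy path: the endpoint responds and each item has the documented fields.
    response = requests.get(f"{BASE_URL}/users", timeout=10)
    assert response.status_code == 200
    body = response.json()
    assert isinstance(body, list)
    for user in body:
        assert {"id", "email"} <= user.keys()

def test_unknown_user_returns_404():
    # An edge case an assistant may or may not propose unprompted;
    # a reviewer still has to judge which of these are worth keeping.
    response = requests.get(f"{BASE_URL}/users/does-not-exist", timeout=10)
    assert response.status_code == 404
```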
Laurent (06:42.686)
Mm-hmm.
Laurent (06:46.45)
Yeah.
Laurent (06:54.3)
Yeah, no, I definitely don't see a big difference between tests and code. They are the same thing, especially for modern and mature teams where developers own quality rather than dedicated QA teams. They use the same tools, so you'll use Cursor or Copilot to create your unit tests, and the same with functional tests.
And in the same way, when you code you have to be extremely careful about the suggestions. Yes, it accelerates you, but like a car, when you go faster you need good reflexes, you need to be a good driver with some experience.
We've clearly seen issues, and that's true both for developing a new feature and for testing a new feature, with code generated by AI and accepted by more junior developers. And I have a kind of scary metric there, released a couple of months ago by Google's DORA team. They used a very scientific approach and showed that it's more than a correlation, there's causation there: for every 25% increase in AI use, there is a 7.2% decrease in delivery stability. So the more you leverage AI coding assistants and agents to create your tests and your code, the less stable your delivery will be. So to your point, yes, you have to think about the edge cases, you have to carefully review the suggestions. And when you start to just accept and hit enter, enter, enter, because at some point it becomes so easy, that's the trap you shouldn't fall into.
Ole Lensmar (08:58.094)
Yeah, it's super interesting, because at a high level you could think that also applies to testing. One of the big challenges with testing is regressions, right? What if I make a change over here and forget that it has an effect somewhere else? You could potentially think AI could keep track of those things, automatically detect changes to a UI and update the tests. But I just feel, spontaneously,
Laurent (09:21.074)
Yep.
Ole Lensmar (09:23.266)
maybe I'm just too old, I feel I'm just skeptical. I'm repeating myself a little bit here, but I'm sure it can help you get the basic scaffolding in place, the core, the meat of your testing, but you're still going to need some oversight. And maybe over time AI can do that oversight, and maybe that's an AGI thing or whatever is coming further down the line. But at least today it feels like I wouldn't put my entire testing effort in the hands of an AI agent or anything like that. I would use it as a tool, just as I would for code.
Laurent (09:59.412)
Yeah, so testing is also an interesting topic, because usually a good tester is someone with great critical thinking. And when you think about it, the most important skill when using AI is having that critical ability to assess,
Ole Lensmar (10:10.19)
Mm.
Laurent (10:22.708)
critique, accept, or refuse the suggestions, not just accepting everything. And I think testers have the right mindset to be augmented and empowered by AI. But yeah, this critical thinking mindset is definitely important.
And I would maybe play the contrarian here, but I think in testing AI will be way better than humans at dealing with complex dependencies, or with a lot of data and patterns. When you think about it, you mentioned impact analysis, for example, right? I've updated this part of the code and I'm going to test it, but I don't want to test everything. So you make some trade-offs there, leveraging your own experience, but you don't have all the data or the relationships between the past history of these tests, their fail or pass rate, and the code changes that have been made. Dealing with this massive amount of data, trying to understand the correlations between code changes, critical modules, non-critical modules, tests that failed but where we said, that's fine, let's deploy, or a code change in a critical module where a test failed and we said, no, we shouldn't deploy. You have multiple stages of decisions that have been made there, and they're stuck somewhere in your database, your Jenkins history, or Testkube; I'm sure you have this data available in your product. Well, at connecting these dots, AI is really, really great.
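To make the "connecting the dots" idea concrete, here is a toy Python sketch that ranks tests by how often they have failed alongside changes to the same files. The history records and file names are invented; a real system would mine this from CI history in Jenkins, Testkube, or similar.

```python
# A toy test-impact sketch: score tests against a code change using past
# co-failure data. The HISTORY records below are hypothetical.
from collections import defaultdict

# For each past change: which files changed, and which tests failed.
HISTORY = [
    ({"billing/invoice.py"}, {"test_invoice_totals", "test_tax_rounding"}),
    ({"billing/invoice.py", "api/routes.py"}, {"test_invoice_totals"}),
    ({"ui/theme.css"}, set()),
]

def rank_tests(changed_files: set[str]) -> list[tuple[str, int]]:
    """Rank tests by how often they failed alongside changes to these files."""
    score: dict[str, int] = defaultdict(int)
    for files, failed_tests in HISTORY:
        if files & changed_files:
            for test in failed_tests:
                score[test] += 1
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    print(rank_tests({"billing/invoice.py"}))
    # [('test_invoice_totals', 2), ('test_tax_rounding', 1)]
```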
Ole Lensmar (12:23.758)
Hmm.
really good at mining that data.
Laurent (12:27.86)
Yes, exactly. So it's really about asking: what are my strengths as a tester, as a developer, where is the AI really strong, and how can we play together? It's like a tango, but you drive the tango; you're the one driving.
Ole Lensmar (12:33.87)
Hmm.
Ole Lensmar (12:40.642)
Yeah.
But I mean, yeah, for once. What I really love is that you brought up manual or exploratory testing, because it's something I'm always very passionate about: having a good tester who can think critically about your system and the application you're building is a lifesaver, right? It's something you should always have on your team. And I wonder
Laurent (13:03.796)
Yes.
Ole Lensmar (13:13.23)
to what extent AI can really think critically. I mean, could you prompt an AI to test this thing under test in a critical way? Would it be able to come up with edge cases or use cases that you didn't think of, just based on the material it's been trained on? I haven't tried it myself, I'm just thinking out loud now.
Ole Lensmar (13:40.394)
Or is this actually one of those areas where you say AI is really constrained, because it's learned from happy path data? So it can't really think outside the box it's been constrained by in its learning, and that's why exploratory testing and human testing become just as important. Maybe thanks to AI you'll have more time for exploratory testing, because it'll get better at covering the things an exploratory tester or a manual tester would otherwise have to do. But we'll see, I guess, over time where that line will be drawn. I have a hard time believing that you could entirely replace good exploratory, critically thinking manual testers with AI. But I don't know.
Laurent (14:20.382)
Yeah.
Laurent (14:28.606)
Yeah, likewise, but is that a bias because we still want to have our skin in the game? I don't know. I think we have to be open and see where it goes. But for sure, in the short and mid term, I would definitely agree with that claim. Exploratory testing is a very interesting point. Again, back to what I was mentioning before: know and use your strengths. Where are you good? Where is AI good? And how can we get the best of both? Analyzing tons of data is great and will help you do predictive analytics, for example. By the way, Google and Facebook already use AI for that. When they do a code change, they can, even before they build, use AI to see
Ole Lensmar (15:03.726)
Mm.
Laurent (15:20.66)
which test, which feature, or which unit test is most likely to fail, so they can look at it directly. And once you have run the build and executed the tests, it can help with flaky tests: analyze them, try to do root cause analysis. Is it a real issue? Is it an environment issue? Try to auto-heal the CI. All of this already exists, and there are solutions and companies, the most advanced companies, that leverage it. But back to exploratory testing: I wouldn't say AI can think outside the box, but it can be very creative in terms of edge cases. It's not that it will only do the nominal path and skip the edge cases; it will also think about edge cases, maybe different edge cases than the ones you can think of.
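As a tiny illustration of one flaky-test signal, the sketch below flags tests whose outcome flips on the same commit. The run records are made up; real input would come from CI history.

```python
# A small sketch of one flaky-test signal: the same test passing and failing
# on the same commit. The RUNS records below are invented for illustration.
from collections import defaultdict

RUNS = [  # (test_name, commit_sha, passed)
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),   # same commit, different outcome
    ("test_checkout", "abc123", True),
    ("test_checkout", "def456", True),
]

def flaky_tests(runs: list[tuple[str, str, bool]]) -> list[str]:
    """Return tests that have both passed and failed on at least one commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for test, sha, passed in runs:
        outcomes[(test, sha)].add(passed)
    return sorted({test for (test, _sha), seen in outcomes.items() if len(seen) > 1})

if __name__ == "__main__":
    print(flaky_tests(RUNS))  # ['test_login']
```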
Laurent (16:18.834)
But when it comes to doing exploratory testing, typically the person will not just think critically about the requirement itself; they have the knowledge. An AI will only be as good as the knowledge or the context it has. And the exploratory tester, a good tester, knows the business. They know that this feature is really critical because it's used a lot. Or even if it's not used a lot, when it is used,
Ole Lensmar (16:33.827)
Mm.
Laurent (16:48.54)
if it doesn't work, it breaks the promise. You may have another feature, a button, that is used a lot but isn't that critical, right? The product can still function without that specific feature. So the tester has this knowledge. And having the business...
Ole Lensmar (16:50.622)
Mm. Mm.
Ole Lensmar (17:05.806)
You could give an AI that knowledge too, couldn't you? I mean, if you give it the context of business priorities and requirements, et cetera, I'm guessing that even an AI could then use that to think critically, or maybe think is the wrong word, but generate critically around
Laurent (17:25.736)
Yes.
Ole Lensmar (17:31.938)
those things. And maybe it's also that if you feed it not just the application, the visual UI, but also, to your point, business requirements, functional requirements, maybe business data. I guess what will be interesting to see over time is what data you can feed it to help it be a good tester in the context of this specific application.
Laurent (17:59.624)
Yeah, no, but that's exactly it. We were having this discussion with a customer this morning, who said: in order to generate exactly what I want, I had to tweak the prompt for half an hour to give the AI all the context. Well, it would have taken me half an hour to do it myself anyway,
Ole Lensmar (18:21.774)
Okay.
Laurent (18:22.552)
so maybe it's better to just do the thing yourself. I think you have to look at that trade-off. Without spending a month giving the AI all the customer feedback, right? You can look at the verbatims from customers, the production data: last time you had downtime, here is the impact, it cost the company a million dollars because this specific feature was down, related to this part of the code. There are all these small signals, the context that we absorb as humans, because we also hear things; there are things that are not in data or in systems, we just had a conversation with a customer. So some things are implicit, others explicit, and that's how we navigate as humans when we think critically, and then we come up with the best strategies for exploratory testing. So I'm not saying ten years from now, because no one knows exactly where we'll be in ten years, but in the short term I don't see AI being able to replace exploratory testing. But definitely being able to do better with five testers instead of ten testers, really augmenting the best testers. Yeah, I can definitely see that.
Ole Lensmar (19:50.168)
Yeah, also just relating to our space, cloud native testing, I think what people ask us about is how they can optimize test execution costs, right? And how can I run just the tests that I need to run for a specific change? And if you could use AI to
Ole Lensmar (20:18.316)
understand the dependencies and the correlations between components and changes and history, it could probably present you an optimized testing plan or test execution plan related to specific changes in your code, maybe. I don't know, over time it could learn and that could have an impact on your time to delivery because you won't be running as many tests which both will get things out faster and then...
Laurent (20:30.408)
Mm-hmm.
Ole Lensmar (20:46.764)
Also, you'll have lower costs related to test execution. So I think the problem to solve is how to deliver faster and how to lower the cost of test execution. It feels like that's where AI could also help you create dynamic test plans, and then, more tactically, for individual tests, decide how to parallelize or utilize resources best, or provision resources best for a specific type of test based on how that test has run before. But those are very operational optimizations where AI can help. They're not really about the tests themselves; it's more about test execution and finding the best path.
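As a rough illustration of such a dynamic test plan, here is a small Python sketch that greedily picks test suites under a time budget by estimated failures caught per minute. The suite names, durations, and failure probabilities are invented; in practice they would be estimated from execution history.

```python
# A rough sketch of a "dynamic test plan": pick suites under a time budget,
# favouring those most likely to catch a regression for this change.
# All numbers below are made up for illustration.
from dataclasses import dataclass

@dataclass
class Suite:
    name: str
    minutes: float
    failure_probability: float  # estimated for the current change

SUITES = [
    Suite("unit-core", 4, 0.30),
    Suite("api-contract", 9, 0.20),
    Suite("e2e-checkout", 25, 0.15),
    Suite("e2e-full-regression", 90, 0.05),
]

def plan(budget_minutes: float) -> list[Suite]:
    """Greedy pick by expected value per minute until the budget runs out."""
    ranked = sorted(SUITES, key=lambda s: s.failure_probability / s.minutes, reverse=True)
    chosen, used = [], 0.0
    for suite in ranked:
        if used + suite.minutes <= budget_minutes:
            chosen.append(suite)
            used += suite.minutes
    return chosen

if __name__ == "__main__":
    print([s.name for s in plan(budget_minutes=30)])
    # ['unit-core', 'api-contract']; the long regression suite waits for a nightly run.
```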
Laurent (21:24.851)
Yes.
Laurent (21:33.362)
Yeah, and in my experience, I mentioned the 20 years in software quality, whether you release every three months or 10 or 100 times a day, testing has always been a pain and a struggle. So optimizing that is definitely critical.
Predictive analytics for test planning is definitely critical, because if you release every three months, and I had a product, unfortunately, that was releasing every three months, it was a pain: big changes, all these long QA phases that you're familiar with, and all the back and forth, we have an issue, we have to fix it, we have to retest everything. Not having that dependency map, we were really losing days. And because we had a release date, we were chasing days. If you release 10 times a day, which was the case with another of my products, Cucumber for Jira, you're chasing seconds or minutes, because every commit goes through the CI/CD pipeline and then to production. And I remember, someone you've met, Séverine, was my engineering lead at the
Ole Lensmar (22:41.759)
Mm. Mm.
Laurent (22:56.982)
time, and I remember that when the lead time from code commit to production got longer than 10 minutes, she told all the developers: we stop, we have to fix this.
Ole Lensmar (23:12.471)
Okay.
Laurent (23:13.384)
So that's why being able to optimize your pipeline matters, and AI is a great help there in order to, as you said, understand these correlations and make sure you don't necessarily re-execute all the tests. That's a product, by the way, that I wanted to develop and started to develop at SmartBear, but we had to stop, and that was before AI. So there were also some
Ole Lensmar (23:35.103)
Okay.
Laurent (23:41.918)
kind of technical challenges, hard problems to crack. But today, with GenAI, I think that's way easier. I would say the biggest problem I see, and that's probably why you guys are really well positioned, is that you need the data. And usually, when it comes to testing, the data is all over the place; it's fragmented, because you have manual tests in one tool, you have CI, you use different frameworks, so you have different test results in different places. So, back to our previous discussion, you need that single pane of glass in order for AI to have all the context and to optimize not locally but globally. You need that single pane of glass, you need that data in one place.
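To illustrate what that single pane of glass implies at the data level, here is a minimal Python sketch that normalizes results from two hypothetical sources into one common record before any analysis. Field names and source formats are assumptions, not any particular product's schema.

```python
# A minimal sketch of normalizing fragmented test results into one common
# record so they can be analyzed together. Source formats are hypothetical.
from dataclasses import dataclass

@dataclass
class TestResult:
    source: str
    name: str
    status: str       # "passed" | "failed" | "skipped"
    duration_s: float

def from_junit(case: dict) -> TestResult:
    # e.g. parsed from a JUnit XML <testcase> element produced in CI
    status = "failed" if case.get("failure") else "passed"
    return TestResult("junit", case["name"], status, float(case.get("time", 0)))

def from_manual_run(row: dict) -> TestResult:
    # e.g. exported from a manual test management tool
    return TestResult("manual", row["title"], row["outcome"].lower(), 0.0)

if __name__ == "__main__":
    results = [
        from_junit({"name": "test_checkout", "time": "3.2", "failure": "timeout"}),
        from_manual_run({"title": "Exploratory: refunds", "outcome": "Passed"}),
    ]
    failed = [r for r in results if r.status == "failed"]
    print(f"{len(failed)} failed out of {len(results)} results from all sources")
```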
Ole Lensmar (24:13.742)
Well, especially, yeah.
Ole Lensmar (24:31.854)
Especially, I was going to say, tying back to the discussion we had a couple of weeks ago about cloud native deployment pipelines, which are just much more fragmented today and often event driven. It's not orchestrated by one monolithic pipeline that does the CI and the CD, like Jenkins or whatever; now it's more that one tool builds and pushes to a repository, which kicks off an event, which does this and that, and then, a couple of async events later, something is deployed. And you have that going on for a bunch of different components at the same time, so it becomes ultra fragmented, and there's not just one place to go. I think this is one of the challenges of adopting cloud native delivery pipelines, and then, to your point, having one single pane of glass to see all these things going on and collecting that data to make informed decisions on testing and anything else is probably one of the challenges that
Laurent (25:00.604)
Yep.
Ole Lensmar (25:30.062)
organizations face as they move to a more cloud native or asynchronous software delivery process.
Laurent (25:36.498)
Yeah, yeah, because things are definitely more fragmented, and today AI is probably good at helping you make this decision. But tomorrow AI can probably make the decision itself and learn and adapt, right? So we made this deployment, it was a bad decision, we had to roll back: okay, I can learn from that. So what are the criteria for deciding, in this
Ole Lensmar (25:47.127)
Hmm?
Laurent (26:05.012)
ocean of data that we have, things that have failed, things that have passed? I mean, for complex applications, complex cloud native applications, you always deploy despite having a lot of reds. But some reds are more or less important than others, and learning that matters too. AI
Ole Lensmar (26:12.814)
Okay.
Ole Lensmar (26:21.667)
Yeah.
Laurent (26:32.434)
can be good at making this type of decision too at some point, and at learning and adapting when it makes a mistake. So yeah, that would definitely have been a great help for Séverine and the team.
Ole Lensmar (26:48.14)
Okay, I'll get back to her when we have something, for sure. Well, I'm fascinated by the idea that, if you look at the software delivery life cycle from idea to something in production, AI is starting to show up in different spots along this pipeline. And I'm guessing that over the next one, two, three, four, five years it'll be more and more AI, and more and more governance by humans just to make sure the AI isn't doing anything wrong. At some point I would guess that the majority of the process will be AI driven, and we'll have a different... I think humans will still be super important, but it will just be a different way of building software. So we'll see what our place is there. I'm sure testing will still be important there as well, but it'll be really interesting to see how that plays out.
Laurent (27:45.224)
Well, I wouldn't want to fly on a plane where the software has been fully built by AI. Same with a car. Influenced by AI, or with some parts built by AI but under the control and governance of humans, yes; but fully built by AI, today I wouldn't be confident. And same with my bank, for example. So I think for the next three to five years, we can confidently
Ole Lensmar (28:07.63)
Hmm.
Laurent (28:17.629)
say, yeah, there will still be a place for developers, experienced developers with critical thinking. I think it will be tougher for juniors or mediocre developers. But juniors especially, that's something that I wouldn't say necessarily worries me, but I think about it a lot. So
Ole Lensmar (28:40.398)
Hmm.
Laurent (28:40.698)
AI can really augment senior developers, but how will we create the next generation of senior developers?
Ole Lensmar (28:47.028)
I was telling your developers.
Laurent (28:50.068)
If juniors now... and there have already been plenty of studies showing that juniors don't learn while using AI. They don't learn, they just apply things. So you're no longer familiar with design patterns, with all the tough concepts and things you had to learn before coding. Now you just ask AI, it provides a few lines of code, and it
Ole Lensmar (29:01.006)
Mm. Mm-hmm.
And they don't.
Laurent (29:20.022)
works 80% of the time. So why should you learn all that difficult stuff? So yeah, I think we have to really be cognizant of that, and the same for the investors, by the way, and continue to train these people. Otherwise, we'll have a gap issue.
Ole Lensmar (29:21.902)
Hmm.
Ole Lensmar (29:39.554)
Yeah. Okay. And on that happy note, Laurent, thank you so much for joining. This was, as always, amazing talking to you. Really appreciate you sharing all your thoughts and insights. And thanks to everyone listening as well. Thank you, Laurent. Thank you. Bye bye.
Laurent (29:57.972)
Thank you, it has been my pleasure. Bye bye.