Cracking vibe coding with AI-powered software test management

Software development is entering a new era. Vibe coding lets anyone create working applications by describing what they want in plain English – no traditional programming needed. But without proper testing, it’s a recipe for chaos.

If you haven’t heard or read about vibe coding yet, then you’ve possibly been living under a rock. In case you need a definition, it’s a new phenomenon where developers, hobbyists, and even people who don’t have any grounding in software development use LLMs such as ChatGPT or Claude to write programs by simply explaining their ideas.

Thanks to advancements in AI and LLMs, programs created by vibe coding often work as expected. And if they don’t quite give the intended results, the ‘programmer’ can describe what isn’t working and what they would like to be different to make improvements. There’s no need for them to read through the script and identify the issues causing errors; they can simply issue some plain-language instructions.

For now, vibe coding is being used for relatively basic use cases. Pieter Levels created a flight simulator in just three hours using Cursor. Another example is the packed-lunch idea-generating tool that New York Times journalist Kevin Roose came up with. It’s not yet being trusted to write mission-critical programs, but the mere existence of vibe coding is leading many people to ask when – not if – this will happen.

In my view, there is a key problem that needs to be solved before we will see its true potential – but there is already an answer out there.  

No skills, just vibes

One of the reasons why this particular buzzword du jour is so buzzy is that it has captured the imagination of non-technical people, excited about being able to create software quickly with no technical skills at all. For businesses, there’s the tempting prospect of being able to quickly prototype products and services so they can be tested in the real world by genuine customers.

But there are also legitimate fears among software developers about the impact vibe coding could have on their job security. As AI improves and LLMs get better at delivering exactly what’s asked for, there could be an assumption that software developers will become largely redundant.

There are several reasons, however, why vibe coding isn’t an immediate threat. For a start, no self-respecting business could deploy software that has entirely opaque inner workings. If nobody in the organisation understands the code that’s been created and how exactly it functions, then it’s just a black box.

The success of vibe coding depends on how well a problem is described and communicated. And like all things AI, there is a certain element of randomness. You could ask it to do the same thing twice and get two entirely different programs, with varying degrees of effectiveness. You don’t know whether a change will make things better or worse, and there may also be entirely unintended and unforeseeable side effects.

And while vibe coding is seeing some success with small tasks, working with limited sets of data in restricted contexts, applying its principles to large projects makes the probability of mistakes dramatically higher.

Software development is all about managing complexity, and undertaking a large project with vibe coding would require the project to be carefully carved up into smaller areas, with the LLM told what was required in each area without negatively impacting the others. Everything needs to be reviewed and tested, and that means someone has to understand the code that’s been written.

The role of AI in software test management

Testing simple apps doesn’t require a test management process – they can be tested quite quickly and easily, whether we do so manually or even ask AI to write a test script and carry out automated tests. There are some aspects that AI isn’t capable of testing, specifically around usability and accessibility, but that’s a separate article in itself.

But when it comes to larger applications, systematic testing is required and this means there must be some kind of test management process. One of the biggest challenges in developing large, complex software systems is that exhaustive testing simply isn’t feasible. It’s impossible to account for every potential use case, input, and environment the application might encounter. As a result, QA teams must make informed decisions about what to test and what to leave out. This requires careful planning, prioritisation, and risk assessment – with key trade-offs acknowledged and accepted by relevant stakeholders to balance quality and release timelines.

While AI can support this process with data and recommendations, it cannot make these decisions on its own. Determining acceptable risk and test coverage involves accountability, domain knowledge, and alignment with business priorities – responsibilities that rest with human teams.
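To make the prioritisation step concrete, here is a minimal sketch of risk-based test selection under a time budget. The test cases, scoring scales, and the risk-per-minute formula are illustrative assumptions for this article, not a prescribed method or a real tool’s API:

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str
    failure_likelihood: int  # 1-5: how likely this area is to break
    impact: int              # 1-5: business cost if it breaks
    cost_minutes: int        # estimated effort to run the test

def prioritise(cases: list[TestCase], budget_minutes: int) -> list[TestCase]:
    """Greedily pick the highest risk-per-minute cases that fit the budget."""
    ranked = sorted(
        cases,
        key=lambda c: (c.failure_likelihood * c.impact) / c.cost_minutes,
        reverse=True,
    )
    chosen, spent = [], 0
    for case in ranked:
        if spent + case.cost_minutes <= budget_minutes:
            chosen.append(case)
            spent += case.cost_minutes
    return chosen

suite = [
    TestCase("checkout payment flow", 4, 5, 30),
    TestCase("profile avatar upload", 2, 2, 10),
    TestCase("search autocomplete", 3, 3, 15),
]
for case in prioritise(suite, budget_minutes=45):
    print(case.name)
```

Note that the scores themselves – how likely an area is to break, and how much a failure would cost the business – are exactly the human judgement calls the paragraph above describes; AI can suggest them, but someone accountable has to sign them off.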

AI has the potential to increase productivity and boost efficiency in QA. An AI assistant can help streamline the creation of test case descriptions and proofread them, for example. It can then analyse and select the relevant test cases for an application much more quickly than a human tester could.

AI assistants can also help to file bug reports. Rather than a QA tester having to manually compile all the information about what they did, what went wrong, what they were expecting to happen and so on, the AI assistant knows the full context in terms of the test that was being performed and can file the bug report in seconds. As with test case creation, the process still needs to be overseen and verified by humans, but AI allows for substantial time savings.
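The structure of such an auto-filed report can be sketched as below. This is an invented illustration of the idea, not the output of any real test management product; in practice an AI assistant would also draft the prose, while this sketch only assembles the context it already holds:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class TestContext:
    test_name: str
    steps: list[str]      # what the tester did, captured during the run
    expected: str         # what should have happened
    actual: str           # what actually happened

def draft_bug_report(ctx: TestContext) -> str:
    """Assemble a bug report from context the assistant already has on hand."""
    lines = [
        f"Title: {ctx.test_name} – expected '{ctx.expected}', got '{ctx.actual}'",
        f"Filed: {datetime.now(timezone.utc).isoformat(timespec='seconds')}",
        "Steps to reproduce:",
        *[f"  {i}. {step}" for i, step in enumerate(ctx.steps, 1)],
        f"Expected: {ctx.expected}",
        f"Actual: {ctx.actual}",
    ]
    return "\n".join(lines)

ctx = TestContext(
    test_name="Login form validation",
    steps=["Open the login page", "Submit an empty form"],
    expected="validation error shown",
    actual="page crashed",
)
print(draft_bug_report(ctx))
```

The time saving comes from the assistant never having to ask the tester what they did: the steps, expected result, and actual result are all captured as part of running the test.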

AI-assisted testing for vibe coding – with human oversight

Vibe coding may accelerate software creation, but it also brings significant risks. AI-generated code often looks correct on the surface, but without proper validation, it’s impossible to know where it might fail, how it will perform under pressure, or whether it’s genuinely safe to deploy.

Relying on the same LLMs to test the code they just wrote isn’t a solution. They might “hallucinate” passing results, tweak tests to pass, or miss edge cases entirely. That’s why systematic, human-guided testing is still essential – even in this new paradigm.
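A small sketch shows why independent checks matter. The discount function below is an invented stand-in for vibe-coded output, not from any real tool; the point is that human-written edge-case tests treat it as a black box and probe inputs an LLM grading its own work might never try:

```python
# Stand-in for AI-generated code (illustrative only).
def apply_discount(price: float, percent: float) -> float:
    return round(price * (1 - percent / 100), 2)

# Independent, human-written checks on the boundaries.
assert apply_discount(100.0, 0) == 100.0    # no discount leaves price unchanged
assert apply_discount(100.0, 100) == 0.0    # full discount reaches zero
assert apply_discount(50.0, 10) == 45.0     # an ordinary case

# An edge case the generated code never guarded against: a discount
# over 100% produces a negative price instead of raising an error.
assert apply_discount(100.0, 150) == -50.0
```

The last assertion passes, and that is exactly the problem: the code happily charges the customer a negative amount. A human reviewer decides whether that is acceptable behaviour; the code alone cannot.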

But in the era of AI, it doesn’t make sense to go through lengthy, manual testing processes for software programs that have been created in just minutes. AI-assisted testing with human oversight could greatly increase the utility of vibe coding and dramatically boost the efficiency of the software creation process.

Vibe coding might remain an experimental or niche approach for now, but robust, AI-enhanced testing is already proving its value. Whether testing code created traditionally or through LLMs, it’s clear that the future of quality assurance will be increasingly automated, but not fully autonomous. AI can assist – but judgement, accountability, and oversight will remain human for the foreseeable future.
