QA testing should do these 4 things:

  1. Make sure software works
  2. Make sure software does what it should
  3. Make sure software doesn’t do what it shouldn’t do
  4. Make sure software doesn’t break
    when you (the user) do something wrong, for example

  5. Make sure software delights the user

Most of the time, test automation is really only doing #1 — making sure that it works, by navigating around the application, and performing basic functions.

This is ok.

This is what automation is good at.

But you also need to do other things. Things that are harder. Things that are difficult for humans to figure out how to do, and even harder for computers, despite the hype around “AI”.

Sidebar: Artificial intelligence and artificial sweeteners

Artificial intelligence is like artificial sweetener. It tricks your brain without providing any of the benefits, like:

  • Energy (in the form of calories)
  • Pleasure (to the taste)
  • Tooth decay

Artificial sweeteners only simulate the pleasure from taste, which is really an anticipation of energy. It’s like the dopamine from a drug, or a video game that tricks your brain into thinking you’ve accomplished something.
Likewise, AI only simulates thinking, and it’s outputs give you a false confidence that someone has thought about the implications.
A large language model (LLM) like ChatGPT literally has no idea what it has written, whether it is bad or good, right or wrong, self contradictory or repetitive, or if it makes any sense at all.
The generative AI models don’t know how many fingers a human should have, whether a picture is a cate or a dog, or the thing it draws is representational at all, much less if it is possible to exist in any real space or follows any rules of logic or consistency.
The idea of leaving testing up to generative computer “AI” models is preposterous, given that testing is supposed to answer exactly these types of questions.

1. Does it work?

As I said, making sure something works is the easy part. Does an application start when I launch it? Can I login? Can I see a report when I click a button?

But does it work right?

B. Does it do what it should?

This is the area of functional testing. How can I tell software does what it should unless I know what it should do?

For that you need requirements.

Some requirements can be inferred. A tester can often figure out if software that is working, is doing the right thing using common sense and their knowledge about a topic.

Sidebar: Shopping cats and usability dark patterns

Wouldn’t it be nice if your shopping cart at the grocery store could scan the items as you take them off the shelf and tell you how much you’re going to spend?They discovered that you’re less likely to buy as much stuff if you realize how much you’re spending.

Look for this feature in online shopping carts as competition vanishes. When it’s hard to figure out your total, you can be pretty sure we are in an effective monopoly.

But some requirements are subtle. Does this item calculate price per item or per weight? What taxes are applied to which people and what products?

And some requirements require specialized knowledge. Domain knowledge. Knowledge about the business developing the software, or knowledge about the how and what software will be used for. Medical diagnostics, or aeronautic controls for example.

Sidebar: Agile considered as a helix of combinatorial complexity

If you have the requirements, that is, you can perhaps test them — assuming you understand them. But in this day and age of big a “Agile” top down bureaucracy and time filling meaningless ceremonies and complex processes and un-user-friendly tools, requirements are less clear than ever.
But I digress. Again.

If you have tests (and automation) that is going to make sure software does what it should, you’re going to need to

1. Know what it should do, and

2. Map your tests to those requirements

That is, assuming your tests (which are software), are actually doing what they should do.

Oh yeah, and you also need to

3. Know how to verify that those requirements are being met.

Because it’s virtually impossible for anyone to understand, much less enumerate, ***all requirements***, it stands to reason, you won’t be able to automate them all, or track what they are doing.

Combine this with the many numerous ways you can accomplish something:

- click a button
- click back
- no click it again
- hit refresh
- and so on

And you have a nearly infinite variety of ways that requirements can be met.

Not to mention the potential number of ways software can break along the way.

Actually, I am going to mention it. Because that is our next point:

4. Will it break?

Using the software will tell you if it works, and more than likely, as a tester you will discover novel and interesting 

(some may say “perverse”, or “diabolical”, but we’re not developers or project managers here.)

ways the software can break.

In fact, good testers relish in it. They love finding bugs. They love breaking things. They love seeing the smoke come out and the server room catch on fire.

Cloud data centers have really made this less fun, but there are additional benefits (*ahem* risks) to running your code over the network to servers halfway across the world controlled by someone else. And additional ways things can go wrong.

And they (I should say “we”, because I am myself a tester) get even more satisfaction, the more esoteric or bizarre ways they can trigger these bugs.

Perhaps nothing gives us more delight than hearing a developer scratch there head and say “It worked for me!” when we can clearly prove that’s not the case in all cases. Or for all definitions of “work”.

Breaking things is the delight of the tester, and while there are tools that can put stress on software with high loads and random inputs, nothing beats a dumb human for making dumb mistakes.

And finally, since I like to do things out of order (to see if something breaks) we need to see what else software can do (that it probably shouldn’t do):

# X. Bugs

Have you ever come across an unexpected behavior in software and been told,

“That’s not a bug, that’s a feature”

No, dear developer, that’s a bug, not a feature.
If it’s doing what it’s not supposed to, it’s not supposed to do it.
So it stands to reason that any undocumented feature should be considered a bug.
But as we pointed out earlier, not every feature requirement can be documented, and the effort probably isn’t even worth it, because, let’s be honest: no one will ever read comprehensive documentation, much less test it all

  • Every.
  • Single.
  • Time.
  • Something changes.

What’s the difference between a bug and a defect?

Some would have you say spelling is the only difference. I disagree. I think the pronunciation is also different.

A defect is when something you wanted in the system isn’t in it. Something is missing. A requirement isn’t met.

A bug (as Admiral Grace Hopper *allegedly* found out the hard way) is when something you didn’t want gets into the system.

Whether it’s a moth or Richard Pryor doesn’t matter. The point is, it’s not supposed to be there. But it is.

Sometimes this breaks the system (like in Admiral Hopper’s case) other times, it doesn’t break the system (as in Richard Pryor’s case).

It could be a security issue, but it doesn’t have to be. It could just live there happily, taking up bits, and burning cycles and nobody ever notices anything is wrong (except whoever pays the AWS bill).

Anyway, it shouldn’t be there if it isn’t intended to be there, even if it’s beneficial. If you discover it, and it turns out useful, you can document it, and then it becomes a documented feature.

No, adding a note to the bug report “Working as intended” does not count as documenting a feature.

But, it’s very hard to prove a negative. That is that it doesn’t have a feature it shouldn’t have.


So to reiterate, there are 4 things that testing (or Quality Assurance) should be accomplishing:

1. Making sure it works

Automation, or any random user, can see that this is the case. However, just because something works, doesn’t mean that it does what it’s supposed to, that it doens’t do what it shouldn’t, that it will keep working when it comes into contact with the enemy — I mean users.

Smoke tests fit well into this category, but it should go beyond just making sure it doesn’t burst into flames when you first turn it on.

B. Making sure it does what it’s supposed to

You need to know what it’s supposed to do to test this. Some things are obvious, but in some cases, requirements are needed. But comprehensive documentation is not practical.

This is often considered functional testing. Some of this can be automated, but due to many factors (including the reasons above), it’s not practical to automate everything.

 4. Making sure it doesn’t break

This can be harder to prove. But it’s important. Just because something is working at one point, doesn’t mean it always will be.

Load & Stress testing are a part of this. But so is “monkey testing” or “chaos testing” which as the names imply, are unguided.

Testers with their pernicious creativity and reasoning abilities can go beyond random behavior and deliberately try to break things.

The goal here is to make the system stable.

X. Making sure it doesn’t do what it’s not supposed to do

This is the hardest part, but the funnest part of testing. Often when something breaks, (e.g. a buffer overrun), it can also have unexpected behavior.

It can have serious security implications, but also may cause usability issues.

Which brings us to our bonus point:

# Bonus: Making sure it delights the user.

Something can work flawlessly, be perfectly secure, fulfill all requirements, and still be an unmitigated pain in the neck to use.

In actuality, trying to make something robust, reliable, secure, and complete ***usually*** ends up harming usability.

Add to this the simple principle that someone who created the system is ***probably*** going to understand the system better than someone who didn’t, means that they may make assumptions about how to use it that are either not valid, or obvious to the intended user.

Usability testing is an important part of testing and pretty much can’t be automated (although I’d be interested to hear ideas about how you think it could.)

Usability testing is also often neglected, or not done from the user perspective.

Anyway, that’s all I have to say about that, for now.