Cloudflare outage and testing resiliency

Not to pile on the recent Cloudflare outage, but I want to talk about it from a testing perspective, with respect to a couple of different aspects.

We are seeing an increase in cloud outages lately, affecting all the major infrastructure service providers — Amazon, Microsoft, Google, and now Cloudflare. Cloudflare is unusual in that it is a relatively small company (compared to the others) with an outsized impact on internet infrastructure.

There are probably several factors contributing to the recent spike in outages, with our increasing reliance on these providers probably the biggest. You just didn’t notice outages as much previously; the increasingly digitized, interconnected, and subscription-based model of services makes them more noticeable. But there are also two major factors that I think underpin the problem: increased use of AI, and the outsourcing of core technical responsibilities.

The two are related, but what it boils down to is a lack of responsibility, and an unwillingness to prepare for (and invest in) contingencies. And that’s where it ties into testing.

You’ve got to give Cloudflare props for their transparency and the amount of detail they are willing to share in their postmortem, which will no doubt lead to improved engineering at the company.

The Cloudflare issue was specifically related to a rewrite and replacement of some of their core infrastructure (memory pre-allocation), which may have been needed to increase the performance & scale of their services, but was not properly tested.

This type of thing is notoriously difficult to test, because it commonly requires infrastructure at scale. And there are a lot of moving parts. You can’t find infrastructure scaling issues using Playwright or Selenium to click buttons on a website — which is, unfortunately, where too much QA testing effort goes.

But this was something that could have been anticipated with boundary testing. And a clear test strategy is described in their postmortem.

The bot management feature configuration file is updated frequently and processed, and rules are inserted into the database to prevent malicious bots. There had been a fixed number of features (rules), and a bug in the configuration processing caused an increasing number of rows to be inserted.

So this couldn’t have been caught in a unit test, but it could easily have been tested on small-scale infrastructure that did the following (a sketch follows the list):

  1. Generate the feature config
  2. Process it
  3. Check the database
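
Here’s a minimal sketch of what that could look like in Python. Everything in it is hypothetical: generate_feature_config, process_config, and FEATURE_LIMIT are stand-ins for Cloudflare’s actual internals, and an in-memory SQLite database stands in for their database.

```python
# Hypothetical sketch of a small-scale boundary test for config processing.
# generate_feature_config, process_config, and FEATURE_LIMIT are illustrative
# stand-ins, not Cloudflare's actual internals.
import sqlite3

FEATURE_LIMIT = 200  # assumed fixed pre-allocation limit


def generate_feature_config(num_features):
    """Step 1: generate a feature config with the given number of rules."""
    return [{"name": f"feature_{i}", "rule": f"score > {i}"} for i in range(num_features)]


def process_config(db, config):
    """Step 2: process the config, inserting one row per rule."""
    db.executemany(
        "INSERT INTO features (name, rule) VALUES (?, ?)",
        [(f["name"], f["rule"]) for f in config],
    )


def test_feature_rows_stay_within_limit():
    """Step 3: check the database at the boundaries of the fixed limit."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE features (name TEXT, rule TEXT)")
    # Boundary cases: empty, just under, at, and just over the limit.
    for count in (0, FEATURE_LIMIT - 1, FEATURE_LIMIT, FEATURE_LIMIT + 1):
        db.execute("DELETE FROM features")
        process_config(db, generate_feature_config(count))
        (rows,) = db.execute("SELECT COUNT(*) FROM features").fetchone()
        assert rows <= FEATURE_LIMIT, f"{rows} rows exceeds the pre-allocated limit"
```

Against a processor that blindly inserts every row, like this sketch’s process_config, the over-limit case fails — which is exactly the boundary the real defect lived at.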

The exact scenario (test data) which caused it to pre-allocate and overwhelm their proxy is not described, but it appears to be a typical memory allocation overrun defect, which is what edge case boundary checking is good at finding.

People have argued about whether Rust (a theoretically memory-safe language) was at fault, or rather, whether the assumption holds that because Rust checks against buffer overruns you don’t have to worry about memory allocation. That might be true at the micro scale, but memory allocation problems (and leaks) can happen at the macro scale too (see Java).

But the point I want to make is that this highlights:

  1. The need for resiliency, failover, recovery, and monitoring as part of your system architecture.
  2. The need for testing at this higher level. (Or lower level, if you picture infrastructure at the bottom of the stack; “broader” as opposed to “narrower” testing is probably what I mean.)

Organizations need to be sure that not only does the user-centric functionality of their software work as expected, but that the infrastructure-focused aspects of it do as well.

At a dinner meeting with executives at a company several years ago, I was asked why the R&D team (comprising brilliant data scientists) could create prototypes so rapidly, and yet it took so many months for development to produce a working version 1 for customers. Implicit in the question were the assumption that the product engineers were less skilled and the suggestion that we should seek to improve the talent of our developers.

I started off by explaining that when you expose a system to your customers (and the wide-open internet), there are a lot more requirements for scalability, security, and usability than a proof of concept demands, and that meeting them takes a lot of time. I also pointed out that testing takes time, and highlighted some bugs that were uncovered that led to expanded feature scope.

While this company did have an extremely slow and ineffective testing routine (that I had been brought in to help fix), testing wasn’t the primary bottleneck.

The nods of approval from IT and product leadership helped convince the executive team, but I knew that we still had a lot of work to do in QA to remove the friction to help increase development velocity — because we weren’t without blame.

But that conversation did lead to an initiative to spend resources building out a more robust test environment that, thanks to the infrastructure engineers, could be spun up quickly. That test infrastructure became the basis of their multi-cloud resiliency strategy for production systems.

Simplifying Test Management Strategies

Here’s a somewhat unstructured rant, prompted by a question on Quora:
https://www.quora.com/unanswered/What-are-the-test-management-simplified-strategies-for-better-QA-outcomes

Structuring tests and maintaining test cases can be a difficult and complex aspect of testing. Here are a few strategies that can be used to simplify test management.

First, you can group tests by feature and use tags to organize them. A well-thought-out collection of tags allows you to group tests through several cross-cutting aspects. For example, one aspect could describe the layer of a test: unit, component, integration, system, UI, or API. Another aspect could describe the component of the system being tested: login, purchase, customers, orders, etc. A third aspect could describe the phase in which the tests are run: smoke, regression, etc.

So you can organize your tests (for both execution and reporting) by multiple aspects. These are the “login” “api” “smoke” tests, for example.
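
As a concrete illustration, here is roughly what that tagging looks like with pytest markers. The tag names are just the examples above, not a prescription.

```python
# test_login_api.py - one test tagged along three cross-cutting aspects.
import pytest

@pytest.mark.api      # layer: api
@pytest.mark.login    # component: login
@pytest.mark.smoke    # phase: smoke
def test_login_returns_token():
    # Call the login endpoint and assert on the response (details omitted).
    ...
```

Running pytest -m "login and api and smoke" then selects exactly those tests; registering the markers in pytest.ini keeps pytest from warning about unknown marks.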

Another important strategy for test management is to separate “tests” from “requirements”. Then you can measure which requirements have adequate coverage and which have gaps. But that means having clear requirements.

Often tests are tied to ephemeral artifacts like “stories” or “tasks”, which become out of date as the product grows and evolves. So separating the requirements from the tasks defined to implement them is difficult. Despite over-documented project management processes with tools like Jira, the practice of capturing and cataloging requirements has fallen into disuse. People follow complex ceremony without understanding its original purpose.

So if you don’t have clear requirements, what you can do is take a little time to define a list of features, and then, using tags to group your tests by feature, see how many tests cover each feature without a large maintenance overhead. Then decide whether it feels like you have enough test coverage. For example: we have 10 tests covering login regression; does that sound sufficient? (A sketch of that count follows.)
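
Here’s a rough sketch of how that per-feature tally might be collected with a pytest hook, assuming the tagging scheme above; the feature list is illustrative.

```python
# conftest.py - report how many collected tests carry each feature tag.
FEATURES = ["login", "purchase", "customers", "orders"]  # your features here

def pytest_collection_finish(session):
    counts = {feature: 0 for feature in FEATURES}
    for item in session.items:
        for feature in FEATURES:
            # Count the test once per feature tag it carries.
            if item.get_closest_marker(feature) is not None:
                counts[feature] += 1
    for feature, n in counts.items():
        print(f"{feature}: {n} tests")
```

Running pytest --collect-only with this conftest.py prints the tally without executing any tests.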

Of course, this metric, like any other, can be manipulated, and you can have a multiplicity of low value tests. So it’s not a complete solution. Focusing on test value rather than test coverage or test quantity is important, and that requires more active management than a chart or a spreadsheet or tool can give you.

BrowserStack Icons of Quality Q&A

I was nominated for the BrowserStack Icons of Quality. They interviewed me and, to my surprise, published it (with a few edits):

See the highlights on LinkedIn: https://lnkd.in/p/gBgksTnZ


Or read the published interview on BrowserStack: https://www.browserstack.com/blog/honoring-iconsofquality-aaron-evans-one-shore/

Here are my full original answers:

  1. What are the most exciting aspects of your role as a Test Automation Strategist at OneShore?

I’d say the most exciting aspect of my job is the opportunity to work on many different projects, with different technologies. I love to learn. And meeting lots of new people and seeing their different perspectives on (among other things) testing.

  2. What’s a testing trend/innovation that’s got you excited these days?

I think the ability of LLMs to act as a better search engine, with working code examples, is really cool. Again, I love learning, and being able to delve more confidently into a new codebase and explore things outside of my domain expertise is fun and not nearly as much of a time sink. (I’m easily distracted.)

Also, seeing how other testers are leveraging LLMs to improve their skills, level up with better coding patterns, and tackle things that only senior engineers could do in the past is great. When I see someone who might have been intimidated by such tasks a couple of years ago write better API tests or build continuous delivery workflows for test automation, it makes me excited.

  3. What’s your hot take on AI in testing?

I used to be pretty down on AI. I still think there’s way too much hype. And I can’t stand using agents for coding (like Cursor, Copilot, or Claude Code). But part of that is because I’m experienced and opinionated. I might not remember a function name or argument order, so an autocomplete/intellisense/LSP is handy, and all I need most of the time. But when I can’t see my code to edit it myself, it drives me mad. I totally understand why people vibe code (hands off) with an LLM, because you can’t really go halfway. It’s either accept everything it slops out, or nothing.

Maybe there’s a future where you turn on the LLM to churn out boilerplate and then turn it off to get real work done.

But for me, the value is really in the power of searching with context and restructuring. At least half of that might be just because advertising-driven SEO has shittified the normal search vector. I remember being amazed by Google, and then Stack Overflow, in the past too.

But the real hot take now is that I don’t think AI will damage junior coders. I think it’s a huge multiplier for anyone worth their salt: people able to research and think for themselves. But the number of people who behave like energy-efficient, slower LLMs is too high. That’s my pessimism, my misanthropy showing.

  4. What’s one piece of advice you’d give to someone just starting their career in testing?

Find a mentor, but don’t be a follower. Explore new ideas on your own. Make mistakes, create ugly code, develop bad patterns. But refactor, iterate, learn from that. The hardest thing in technology these days is that there are so many established, entrenched patterns, frameworks, and processes, some so stupefyingly bad that it’s hard to see the good intentions in them, or to see past the bloat for the value.

When you come up with an idea on your own, and realize it’s trash (maybe by seeing a better version), it’s a golden opportunity to learn and course correct — with wisdom instead of blind obedience. But when you see your unique idea validated, confirmed, or reinvented (before or after you) it’s golden. And gives you the confidence to not just hope for something better, but see the potential of being able to make the improvement.

The best thing you can do nowadays is try to understand the intent of some hopelessly broken process or overly complex framework and show others how it can be simplified. You’ll probably be ignored, at least for now, but keep at it.

  5. How do you keep up with all the new trends and tools in software testing?

I just sit back on my porch and yell at all the kids to get off my lawn. And shake my stick at everyone and talk about how hard we had it back in my day, uphill both ways.

Seriously, I investigate trends and tools that interest me, and don’t worry about the rest. With experience comes the understanding that you don’t have to be an expert in everything, and trying to be can stretch you too thin, or wear you out. Be exceptional at one thing, or pretty good at two or three things, and be willing to adapt when forced. But don’t always feel like you have to bend to the latest fashion.

For example, if you know Selenium well, you don’t have to learn Playwright, or Cypress, or whatever comes next. There are still plenty of jobs that need Selenium skills. If the job you have declares you have to change, you can learn it if you want — or move on. I know that’s easy to say when you’re old and stubborn, but it applies the other way too. If you’re passionate about a new tool or trend — go ahead and jump in. Cool, now you know two tools, and your perspective is enhanced by knowing the differences and similarities between the old and new.

For me, learning about what’s outside of testing is more exciting these days. I’ve worked with so many different languages, platforms, tools, frameworks, etc. that one more isn’t a big deal. But getting better at devops or platform engineering, trying to understand what makes sales guys and girls tick, or how scrum masters sleep at night without nightmares of Jira, or even experiencing what it’s like to twist your mind in a knot developing a React app: that’s all new territory. If nothing else, it gives you empathy for people who have different problems in life.

  6. What are the things you wish you knew about software testing when you started your career?

I wish I knew what a typecasting move it was to accept a job in testing. When I started testing, nobody (I mean no one) had even heard of software testing, except maybe Cem Kaner. Sometimes I feel like Mark Hamill. For those of you under 50, that’s Luke Skywalker. He was so typecast that he never got another job, except in a horrible D-rate horror movie called “Guyver”. But he learned to become happy as a voice actor. Even poor Alec Guinness (O.B. 1 Kenobi), a prestigious Shakespearean actor who had won an Oscar, never got past it. But he rolled with the punches, went to conventions, and shook hands with Star Wars fans (and laughed all the way to the bank; he made a fortune getting 2.5% of Star Wars gross profits).

I’ve been lucky and learned to embrace testing, but only after literally running away to a remote island to get away from it. And now I get to speak at testing conferences and was honored to be asked to give these rambling answers.

  7. Outside of the tech world, what’s a hobby or activity you’re really passionate about?

I used to have hobbies and passions, but I’ve really mellowed out and given up on my dreams. I wanted to be a rockstar or an author or a painter or an extreme sports athlete (surfing, snowboarding, etc) and then I wanted to build a big successful company, but now I just want to water my garden, walk in the woods, and watch my kids grow up and see their dreams crushed too.

Here’s hoping you all can find contentment in the small things in life too.

Rubber Duck Testing

Rubber Duck Productivity

Are you easily distracted?
Have a tendency to “go down the rabbit hole” or “get lost in the weeds”?
I know I do.

I find that when I have someone to work with, even if they’re just watching and occasionally nodding their head when I ask questions that are directed more at my own mental process than at anyone else, I work better, faster, and most importantly, stay focused on the task at hand, especially if there is a time limit.

I’ve tried pomodoro timers, task lists, and focus apps, but none of them seem to work quite as well as just having someone there. And it’s great if that someone is invested, or at least interested, in the work.

That’s one reason I love pair programming & testing.

I even like to return the favor, and see how other people think, code, and solve problems — and occasionally offer my advice, whether asked for or not.

Working together is just more fun. And it keeps me focused on getting the task done.

I don’t know who invented the concept of “Rubber Ducking” — but the general idea is that if you’re stumped on a problem, talking to a rubber duck (or some other inanimate object) will help you solve your problem as you explain it.

But some of us are too self-conscious (or as I like to say “self-aware”) to talk to a toy duck, or even a real live pet. (We have real life ducks that quack hello to me every time I go outside.)

I’ve tried talking to AI, or sometimes my wife, but they get impatient with me and too readily point out the flaws in my reasoning. So I decided to create a tool, initially for myself, but maybe I’ll open it up for others to use as well.

Rubber ducks as a service.

The idea is to find someone to pair with and pay them a nominal fee for their time. It doesn’t have to be someone who knows how to solve your problem; in fact, it might be better if it’s someone who doesn’t know, but is interested (or at least good at pretending to be interested) in listening while you solve your problem, and maybe learning a bit while they listen.

So if you’d like to act like a rubber duck while I write automated tests or build software tools for testing software or learn how to do sales and marketing for my small consulting company, let me know.

And if you have something you’d like to bounce off of me and could use an ear to listen to you and help you keep focused while you’re working on something, give me a call, I’ll be your rubber duck too.

Chaotic Neutral – QA Roles and Alignment

Here are the slides for the talk I gave at the Innovate QA conference in Seattle in August 2024.

It was a blast and I met a lot of cool people. The hit of the talk was giving away stickers and polyhedral dice to the audience.

Pair Testing and Other Radical Ideas

This is a link to the talk I gave at QA or the Highway conference in Columbus, Ohio back in June 2024.



https://www.youtube.com/watch?v=KdEL2bSHUEU

Mayonnaise Cake

My daughter came home from a church youth activity the other day with an interesting story to tell. She couldn’t wait to share it with us.

They made small chocolate cakes and a delicious butter cream frosting and brought sprinkles, chocolate syrup, and candies to decorate the cakes. And then they ate them.

When they were all done there was one plain cake left over, and plenty of decorations, but they were all out of frosting. She went to put the leftover cake in the fridge to take home later. There was nothing in the refrigerator except some old condiments from the summer church picnic — ketchup, mayonnaise, and mustard.

Some clever girl suggested they decorate the extra cake with those condiments for fun.

My daughter had a more sinister take. So instead, they spread the mayonnaise on the cake, drizzled it with chocolate sauce and sprinkles, and offered a slice to the boys, who had been playing basketball.

The first young man took a bite, struggled to maintain his composure and said “Thanks!” with a forced smile. The second young man came over and accepted a slice.

The first boy praised the delicious cake to his friend (and made exaggerated chewing motions), as he walked away to get a drink of water. “It’s good!”, he reassured the other boy, who eyed the cake skeptically.

He couldn’t imagine the girls would be this nice to him, and when he sniffed the cake he grinned, recognizing the scent of mayonnaise.

A third boy came over, and the second boy passed a slice to him. This last boy slowly and methodically ate the whole piece of mayonnaise cake without a grimace and thanked the girls afterward.

It was my son. And as I learned later, he just assumed the girls had made bad frosting, and he wanted to be polite.

Have you ever been given a slice of mayonnaise cake at work, and how did you react?

Mainframe testers

I’d love to talk with someone who does mainframe testing.

What does mainframe testing entail?
How did you get into it?
What tools do you use?
How much do you work with COBOL, z/OS, JCL, RPG, PL/I, etc.?
How do you interface with the mainframe & software: terminals, virtualization, etc.?
What challenges are unique to testing mainframe systems, and what do they share with other kinds of testing?

Testing Strategy and Supply Chain Data Pipelines

Reading “Your Data Supply Chains Are Probably a Mess. Here’s How to Fix Them.”

A good software testing strategy is like a supply chain data pipeline.

The purpose should be to get the relevant data in the right hands so they can make decisions about quality and risk.

Here are the common challenges:

1. The actual technical process of developing automation can be overwhelming, and you can lose sight of the big picture in the implementation details.

2. The right data to perform meaningful tests is often locked away in different silos, whether developer knowledge of APIs or business understanding of requirements.

3. A common data communication language is necessary to communicate business priorities to QA and for QA to communicate their findings in a way that can provide contextual meaning to various stakeholders.

4. Different parts of the organization have different goals, and they all need to be aligned with the customers’ need for quality and the business’s need for the bottom line.

Solutions:

1. Leadership should drive the need for QA by communicating their priorities and assigning test resources according to business goals and value.

2. QA should be demand-driven. Developers, product owners, and leadership should seek the knowledge they need to make decisions from testers, and should enable testers with the information needed to accomplish the testing.

3. Testers and developers should understand the business domain language and common communication channels should be open. If test reports and continuous delivery jobs are ignored, find out why: is the data accurate, relevant, timely, and meaningful?

4. QA leadership should align testing with customer and business needs and approach testing as a source of information for product decision makers, not as a gateway or “check” on software quality.

The data you need to perform tests and the data you need to make decisions about quality are inter-related but not identical.

The ability to share (and transform) data for testing is critical, but I don’t think a unified tool or process is the solution. It’s why complex ERP deployments fail and why everyone hates Jira.

Recipes and Receipts

Say you’re a wedding planner. An important part of weddings is the wedding cake. And the Smiths want a Dutch chocolate cake for their wedding, with white meringue frosting.

So you hire a baker to bake their cake. He asks for the recipe, and you hand him a box of receipts.

“What am I supposed to do with this?”, he asks.

You explain to him that inside the box are the receipts for last year. After sorting through it you pull out a folder with the word “Jones” written on it. Inside it are all the receipts, invoices, and billing hours from the Jones’ 50th anniversary where they renewed their vows. A very touching ceremony, you remember.

“My accountant makes me keep everything for taxes,” you explain.

After rifling through the folder, you pull out a receipt and hand it to the baker.

“Here you go, this is for a cake we made last year.” On the receipt are all the ingredients for the Jones’ cake:

10 lbs. flour
1 dozen eggs
1 bottle red food coloring
200 paper plates

When you see the confused look on the baker’s face, you elaborate: “It was a red velvet cake. Red velvet is really just chocolate with food coloring, you know. Don’t add the food coloring.”

“Or the paper plates and plastic forks?”

“Naturally,” you reply. “This is a fancy wedding. Don’t worry about the tableware, just bake the cake.”

But the baker has further objections. “How am I supposed to know how much of each ingredient?”

“Oh, there’s an invoice in the folder detailing the correspondence between the Smiths and the previous baker. You’ll find the number of servings and can just calculate the difference. The Joneses had a big sheet cake, I remember, so they really could have just made smaller slices.”

The baker makes a face, but decides he can estimate the proportions for the Smiths’ 3-layer cake.

“And the meringue?” he asks.

“The Joneses had buttercream frosting. Personally, I like that better.”

The baker gets to work on the cake, only realizing after he has purchased the ingredients that Dutch chocolate is totally different from normal cocoa powder.

He does a great job, and the cake looks great and tastes delicious. You don’t know why the bride made that face when she took the first bite, but Mr. Smith didn’t seem to care and was happy with it, and even happier when he saw the bill came in under estimate.

And business is booming. One of the guests at the Smith wedding loved the cake so much that she hired you to plan her daughter’s wedding. She wants a 7-layer white chocolate sponge cake with raspberry filling.

You call up the baker with the good news.

“Do you have a recipe for this cake?” he asks.

“What happened to the recipe for the last cake?” you ask. “Didn’t you write it down?”

“Hello?”

I guess I’ll have to find another new baker, you think. No worries though, I’ve got all the receipts for the Smiths’ cake too. I’m building up quite a recipe collection.


The analogy I’m trying to make here is between test cases and user stories. Tests are like recipes. User stories are like receipts, or at best, todo lists.

Tests should represent the system as it is now. Stories represent the tasks that needed to be done to get the system to the state it was in at some point in the past.

The most recent user stories should represent the system as it exists right now, but they don’t describe the whole system. They describe a change set: the difference between the system as it was before and as it is now. And where it is now will not be where it is later.

You cannot build functional requirements from user stories or tasks.

A story can reference the tests needed to verify that the system is working after the story is complete — and that’s a good thing. But tests cannot reference a story without the risk of the test becoming outdated or incomplete. You need something different to describe the system as a whole as it is in the current state, or at any particular state in the past.

If you “add an abstraction” — requirements — you now have tests that verify the requirements, and stories that describe the work done to fulfill the requirements.
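
As a rough sketch of what that abstraction can look like in practice, tests can carry a stable requirement ID while stories come and go. The requirement IDs and the marker scheme here are hypothetical, not a standard.

```python
# A hypothetical scheme: each test declares the stable requirement it
# verifies. Story tickets can churn; REQ-LOGIN-001 should not.
import pytest

def requirement(req_id):
    """Tag a test with the ID of the requirement it verifies."""
    return pytest.mark.requirement(req_id)

@requirement("REQ-LOGIN-001")  # "Users can log in with valid credentials"
def test_login_with_valid_credentials():
    ...

@requirement("REQ-LOGIN-002")  # "Login locks out after repeated bad attempts"
def test_login_lockout_after_failed_attempts():
    ...
```

A report that groups test results by requirement ID then shows which requirements are covered and which have gaps, independent of whichever story last touched the feature.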

However, I’m not advocating for complex requirements documentation. That’s a large part of the resistance to specifying requirements. But another large part is that people feel like they are already specifying requirements when they document them in a task management system like Jira.

Can’t tests be the requirements? That avoids duplication. Double plus if the tests are also executable specifications. Automated tests and requirements in one go.

Technically that’s possible, but it’s very difficult in practice. It’s actually more difficult to make executable specifications than to link automated test results to a static specification. And it only works if the specification used to execute tests is the same specification designed by the product team. And practically, that never happens.