Cloudflare outage and testing resiliency

Not to pile on the recent Cloudflare outage, but I want to talk about it from a testing perspective, with regard to a couple of different aspects.

We are seeing an increase in cloud outages lately, affecting all the major infrastructure service providers — Amazon, Microsoft, Google, and now Cloudflare. Cloudflare is somewhat unique in that it is a relatively small company (compared to the others) with an outsized impact on internet infrastructure.

There are probably several factors contributing to the recent spike in outages, with our increasing reliance on these providers being the biggest. Outages just weren't as noticeable before, and the increasingly digitized, interconnected, and subscription-based model of services makes them harder to miss. But there are also two major factors that I think underpin the problem: increased use of AI and the outsourcing of core technical responsibilities.

Both of these are related, but what it boils down to is a lack of responsibility and an unwillingness to prepare for (and invest in) contingencies. And that's where it ties in to testing.

You've got to give Cloudflare props for their transparency and the amount of detail they are willing to share in their postmortem, which will no doubt lead to improved engineering at the company.

The Cloudflare issue was specifically related to a rewrite and replacement of some of their core infrastructure — memory pre-allocation that may have been needed to increase the performance and scale of their services, but was not properly tested.

This type of thing is notoriously difficult to test because it commonly requires infrastructure at scale, and there are a lot of moving parts. You can't find infrastructure scaling issues by using Playwright or Selenium to click buttons on a website — which is, unfortunately, where too much QA testing effort goes.

But this was something that could have been anticipated with boundary testing. And a clear test strategy is described in their postmortem.

The bot management feature configuration file is updated frequently and processed, and rules are inserted into the database to block malicious bots. There had been a fixed number of features (rules), and a bug in the configuration processing caused an increasing number of rows to be inserted.

So this couldn't have been caught in a unit test, but it could easily have been tested on small-scale infrastructure that did the following (a rough sketch follows the list):

  1. Generate the feature config
  2. Process it
  3. Check the database
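
Here's a rough sketch of what such a boundary test might look like. To be clear, this is not Cloudflare's actual pipeline: the helper functions and the 200-feature limit below are hypothetical stand-ins. The shape is the point, though: push the feature count across its assumed limit and check that processing stays within bounds instead of blowing up.

import pytest

FEATURE_LIMIT = 200  # assumed fixed pre-allocation size (hypothetical)

def generate_feature_config(num_features):
    # stand-in for the real config generator
    return [{"feature": f"rule_{i}"} for i in range(num_features)]

def process_config(config):
    # stand-in for the real processing step: it should reject (or truncate)
    # anything over the limit rather than overrunning its allocation
    if len(config) > FEATURE_LIMIT:
        raise ValueError("too many features")
    return config

@pytest.mark.parametrize("count", [0, 1, FEATURE_LIMIT - 1, FEATURE_LIMIT,
                                   FEATURE_LIMIT + 1, FEATURE_LIMIT * 10])
def test_feature_config_boundaries(count):
    config = generate_feature_config(count)       # 1. generate the feature config
    if count > FEATURE_LIMIT:
        with pytest.raises(ValueError):           # over-limit input should fail loudly and safely
            process_config(config)
    else:
        rows = process_config(config)             # 2. process it
        assert len(rows) <= FEATURE_LIMIT         # 3. check what ends up in the database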

The exact scenario (test data) that caused it to pre-allocate and overwhelm their proxy is not described, but it appears to be a typical memory allocation overrun defect, which is exactly what edge-case boundary checking is good at finding.

People have argued about whether Rust (a theoretically memory-safe language) was at fault, or rather, whether the fault lies in the assumption that because Rust checks against buffer overruns, you don't have to worry about memory allocation. That might be true at the micro scale, but memory allocation problems (and leaks) can happen at the macro scale too (see Java).

But the point I want to make is that this highlights:

  1. The need for resiliency, failover, recovery, and monitoring as part of your system architecture.
  2. The need for testing at this higher level. (Or lower level, if you consider infrastructure to be at the bottom; what I really mean is "broader" as opposed to "narrower" testing.)

Organizations need to be sure not only that the user-centric functionality of their software works as expected, but that the infrastructure-focused aspects of it do as well.

At a dinner meeting with executives at a company several years ago, I was asked why the R&D team (comprising brilliant data scientists) could create prototypes so rapidly, and yet it took so many months for development to produce a working version 1 for customers. Implicit in this question was the assumption that the product engineers were less skilled, and the question of whether we should seek to improve the talent of our developers.

I started off by explaining that when you expose a system to your customers (and the wide-open internet), there are a lot more requirements for scalability, security, and usability than a proof of concept has, and that meeting them takes a lot of time. I also pointed out that testing takes time, and highlighted some bugs that were uncovered that led to expanded feature scope.

While this company did have an extremely slow and ineffective testing routine (that I had been brought in to help fix), testing wasn’t the primary bottleneck.

The nods of approval from IT and product leadership helped convince the executive team, but I knew that we still had a lot of work to do in QA to remove friction and help increase development velocity — because we weren't without blame.

But that conversation did lead to an initiative to invest in building out a more robust test environment that, thanks to the infrastructure engineers, could be spun up quickly, and that test infrastructure became the basis of their multi-cloud resiliency strategy for production systems.

Simplifying Test Management Strategies

Here's a somewhat unstructured rant prompted by a question on Quora:
https://www.quora.com/unanswered/What-are-the-test-management-simplified-strategies-for-better-QA-outcomes

Structuring tests and maintaining test cases can be a difficult and complex aspect of testing. Here are a few strategies that can be used to simplify test management.

First, you can group tests by feature and use tags to organize them. A well-thought-out collection of tags allows you to group them along several cross-cutting aspects. For example, one aspect could describe the layer a test belongs to — unit, component, integration, system, UI, or API. Another could describe the component of the system being tested — login, purchase, customers, orders, etc. A third could describe the phase in which the tests are run — smoke, regression, etc.

So you can organize your tests (for both execution and reporting) by multiple aspects: these are the “login” “api” “smoke” tests, for example.
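
In pytest, one way to implement this is with markers. A minimal sketch, assuming pytest and using made-up test names:

import pytest

@pytest.mark.login
@pytest.mark.api
@pytest.mark.smoke
def test_login_returns_token():
    ...  # the actual test body goes here

@pytest.mark.orders
@pytest.mark.ui
@pytest.mark.regression
def test_order_history_page():
    ...

Running pytest -m "login and api and smoke" then selects exactly the “login” “api” “smoke” tests, and registering the markers in pytest.ini keeps pytest from warning about unknown marks.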

Another important test management strategy is to separate "tests" from "requirements". Then you can measure which requirements have adequate coverage and which have gaps. But that means having clear requirements.

Often tests are tied to ephemeral artifacts like "stories" or "tasks", which become out of date as the product grows and evolves, so separating the requirements from the tasks defined to implement them is difficult. Despite over-documented project management processes with tools like Jira, the practice of capturing and cataloging requirements has fallen into disuse. People follow complex ceremony without understanding its original purpose.

So if you don't have clear requirements, what you can do is take a little time to define a list of features. Then, using tags to group your tests by feature, you can see how many tests cover each feature without a large maintenance overhead and decide whether it feels like you have enough test coverage. For example: we have 10 tests covering login regression — does that sound sufficient?
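
As a toy illustration (hypothetical test names and tags, no particular test framework assumed), the bookkeeping can be as simple as this:

FEATURES = ["login", "purchase", "customers", "orders"]

# tags attached to each test, however your framework exposes them
TEST_TAGS = {
    "test_login_happy_path": {"login", "ui", "smoke"},
    "test_login_bad_password": {"login", "api", "regression"},
    "test_checkout_total": {"purchase", "api", "regression"},
}

for feature in FEATURES:
    count = sum(1 for tags in TEST_TAGS.values() if feature in tags)
    flag = "  <-- no coverage!" if count == 0 else ""
    print(f"{feature}: {count} test(s){flag}")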

Of course, this metric, like any other, can be manipulated, and you can end up with a multiplicity of low-value tests. So it's not a complete solution. Focusing on test value rather than test coverage or test quantity is important, and that requires more active management than a chart or a spreadsheet or tool can give you.

Make sure your tests fail

When you write automated tests, it's important to make sure that those tests can fail. This can be done by mutating the test so that its expected conditions are not met and the test fails (not errors). When you then satisfy those conditions by correcting the inputs, you can have more confidence that your test is actually testing what you think it is — or at least that it is testing something.
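
A trivial illustration (a made-up shipping calculator, assuming pytest as the runner):

def shipping_cost(weight_kg):
    # stand-in for the real code under test
    return 5.00 if weight_kg <= 1 else 5.00 + 2.50 * (weight_kg - 1)

def test_heavy_package_costs_more():
    assert shipping_cost(3) == 10.00
    # Mutate the expectation to something wrong (say, == 999.00) and run it:
    # the test should FAIL with an assertion error, not ERROR with an exception.
    # Then restore the correct value and watch it pass again.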

It's easy to make a test fail and then change it to make it pass, but testing your tests can be more complex than that — which is a good reason why tests should be simple. You'll only catch the error you explicitly tested for (or completely bogus tests like assert true == true).

Not to say those simple sanity checks don't have value, but an even better check is to write a test that fails before a change but passes after the change is applied to the system under test. This is easier to do with unit tests, but for system tests there is great value in seeing a test fail before a feature (or bug fix) is deployed and then seeing it succeed afterwards.

It can still lead to bogus tests (or at least partially bogus tests), but a few tests of this type run after a deployment are extremely valuable: they can catch all kinds of issues and give greater confidence that what you added actually works. This is especially useful when moving code (and configuration) through a delivery pipeline across multiple environments — from dev to test to stage to production.

Having (and tracking) these sorts of tests — which pass only when the change is applied — makes the delivery pipeline much more valuable.

Also don’t forget the other tests — those that make sure what you changed didn’t break anything else — although these are the more common types of automated regression tests.

Originally posted in response to Bas Dijkstra on LinkedIn:

Never trust a test you haven’t seen fail

What can I do to expand my skills beyond testing?

Someone asked about self-improvement after 10 years as a tester and wanting to expand their knowledge into software development.

I can sympathize with this attitude because I went through a similar mindset — which led to my eventual burnout and move to Fiji, which I've written about previously.

Here is my response:

After 10 years as a tester you probably have pretty good testing skills.

Adding development skills can only increase your value as a tester because it will allow you to communicate better with developers and understand the architecture to find, anticipate, and troubleshoot bugs better.

And if you want to move into development (creative vs destructive role) your skills as a tester will help you and maybe influence other developers to think of testing first.

Other areas you could branch out into and expand your knowledge include operations/devops, understanding system architecture, project / product management, team leadership, or specialized domain knowledge (such as healthcare or machine learning) which can all benefit your work in testing or provide alternate career paths if you’re looking for change.

See the original post on LinkedIn here:

https://www.linkedin.com/posts/andrejs-doronins-195125149_softwaretesting-testautomation-activity-7031913802388398080-gsFB

Are companies getting worse at QA testing?

Melissa Perri posed this question on Twitter.

Aaron Hodder had a great response on LinkedIn.

He talks about how companies are giving up on manual testing in favor of automation. Definitely worth the read.

My response about the ramifications of automation vs. manual testing (it doesn't have to be either/or):

There are two mistakes I often see around this:

  1. Attempting to replace manual testing with automation
  2. Attempting to automate manual tests

Both are causes for failure in testing.

People often think they will save money by eliminating manual QA tester headcount. But it turns out that effective automation is *more expensive* than manual testing. You have to look to automation for benefits, not cost cutting. Not only does someone experienced in developing automation cost more than someone doing manual testing, but automated tests take time to develop and even more time to maintain.

That gets to my second point. You can’t just translate manual tests to automation. Automation and manual testing are good at different things. Automated tests that try to mimic manual tests are slower, more brittle, and take more effort. Use automation for what it’s good for — eliminating repetitive, slow manual work, not duplicating it.

Manual testing has an exploratory aspect that can’t be duplicated by automation. Not until AI takes over. (I don’t believe in AI.) And automation doesn’t have to do the same things a manual tester has to do – it can invoke APIs, reset the database, and do all sorts of things an end user can’t.

Looking for a Tester with GoLang experience

I was just talking with a recruiter looking for a QA engineer with experience using the Go programming language for testing. While Go is gaining popularity – especially among systems developers (Docker, for example, is written in Go) and for developing microservices – not a lot of testers have much experience with it.

That's because Go is relatively new, and if you're testing something as a black box (as QA typically does), it doesn't matter what programming language you write your tests in.

Go does have testing capabilities — primarily the built-in testing package and go test for unit testing, at least one general-purpose test framework I know of (Testify), a couple of BDD-style frameworks (Ginkgo, GoConvey), and an HTTP client testing library (httpexpect). But for UI-driven automation, while it is technically possible (WebDriver client libraries exist — tebeka/selenium), it is less complete and less user-friendly than in other programming languages (which testers may already know).

This post on Speedscale.com by Zara Cooper has a great reference for testing tools in Go.

The main reason to choose Go for writing tests is if you are already writing all your other code in Go — which means you're writing developer tests (even if not strictly unit tests), not user-focused tests (like QA typically writes).

By all means, if you’re writing microservices in Go, write tests for those services in Go too. (I recommend using Testify and go-resty or httpexpect.)

But there is no real advantage to using Go for writing test automation, especially if you're testing a user interface (which is not written in Go).

I suggested that if you are set on finding people to write automated tests in Go, one option is to look for people with Go experience: find projects that are written in Go (like Docker) and look for people who have worked on those projects. But in the case of Docker, unless you're developing the core of Docker itself you probably aren't using Go, so extensive experience using Docker is no indication of Go experience. This approach would be hard, and may still not turn up anyone with QA experience.

Rather, you should look for someone with QA aptitude and experience who knows at least one other programming language (Java, C#, Python, C, etc.), preferably more than one, and then allow some time for them to learn Go and get up to speed.

Good developers love learning new things, especially new programming languages, and good testers are good developers. If you're hiring someone based only on their experience with a particular programming language, or not looking for people who are comfortable learning more than one, then you lose out on people who are adaptive and curious: two traits that are far more important for testers than knowing any particular tool.

#golang #testing

An Asynchronous Test Runner?

Here's a conversation on LinkedIn talking about which programming language you should choose for a test framework — including comments about how automated tests are inherently synchronous (which I agree with) and why someone would write an asynchronous test framework in, for example, JavaScript.

https://www.linkedin.com/posts/vikas-mathur_qa-testing-testautomation-activity-6871687300267474944-1MUL

Vikas Mathur

Typically, programming language for test automation can be any language, irrespective of which language the developers are using. However, to allow for more and effective collaboration with the developers, using the same language as them might make sense. Of course, there might be situations where it does make sense to use a different language – but in most cases, it makes more sense to use the same language. In case the language is different than the developers, it might make sense to use a language that has a good ecosystem for test automation. Something to think about while selecting a programming language for test automation.

(I agree with this)

Gabriele Cafiero 

I can’t still figure out why someone chose Javascript to test things. It’s asynchronous, and tests should be synced by definition. It has a weak typing so you have to double check any assertion of yours

(I agree with this also)


But here is where it sparked my own thought:

90% of the time, JavaScript doesn't need to be async, but library writers have a fetish for it because it was difficult for them to learn and so they want to show off that hard-won knowledge.

But… there is a reason for asynchronous code, and that's efficient resource usage through task switching (i.e., the event loop).

Tests also have a need for this — although it’s not really utilized in any framework:

1. Tests need to run concurrently so you can get faster results
2. Tests take different times to complete, so one slow test shouldn't block the others
3. Tests interact with asynchronous events (e.g. network, UI)
4. Test runs shouldn't have to be discrete sequential loops.

Wouldn’t it be nice to have tests run continuously?
A test runner that listens for events to trigger tests, so you don't have to wait for (or kill) the previous test run to start another.

Imagine a continuous test queue that doesn't depend on a specific job to run.

Imagine failing tests that dynamically rerun for stability.

Imagine commits that trigger specific tests, but don’t worry about full regression or smoke testing because that’s happening in the background.

So while an individual test needs to run synchronously (but also sometimes needs to await asynchronous events), having a test runner that operates in an asynchronous, event-driven style is a great opportunity that hasn't (to my knowledge) really been explored.
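
As a thought experiment, here's a minimal sketch of that idea in Python with asyncio. The test functions and the event-to-test mapping are made up; the point is that events go onto a queue, workers pull them off, and the matching tests run concurrently, so one slow run never blocks the next trigger.

import asyncio
import random

async def run_test(name):
    # stand-in for real test work, which would await network calls, UI events, etc.
    await asyncio.sleep(random.uniform(0.1, 0.5))
    print(f"{name} passed")

TESTS_BY_EVENT = {
    "commit": ["test_login", "test_checkout"],
    "nightly": ["test_login", "test_checkout", "test_reports"],
}

async def worker(queue):
    while True:
        event = await queue.get()  # e.g. a commit webhook or a schedule tick
        await asyncio.gather(*(run_test(t) for t in TESTS_BY_EVENT[event]))
        queue.task_done()

async def main():
    queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(queue)) for _ in range(2)]
    for event in ["commit", "nightly", "commit"]:  # simulated triggers
        await queue.put(event)
    await queue.join()        # wait for the queue to drain
    for w in workers:
        w.cancel()            # shut the workers down

asyncio.run(main())

A real runner would replace the simulated triggers with webhooks or file watchers and stream results out as they arrive instead of printing them.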

When should you use JavaScriptExecutor in Selenium?

When you want to execute JavaScript on the browser :)

This was my answer to a question on Quora

https://www.quora.com/When-should-I-use-JavaScriptExecutor-in-Selenium-WebDriver/answer/Aaron-Evans

JavascriptExecutor is an interface that defines two methods:

in Java (and similarly in C#):

Object executeScript(String script, Object... args)

and

Object executeAsyncScript(String script, Object... args)

Both take a string containing the JavaScript code you want to execute in the browser and (optionally) one or more arguments. If an argument is a WebElement, it is converted to the corresponding HTML element. Arguments are exposed to your script through the JavaScript arguments variable, just like the values passed to a function. If the executed code returns a value, that value is returned to your Selenium code.

Each driver is responsible for implementing it for the browser.

RemoteWebDriver implements it as well.

But the case where *you* as a Selenium user need to use JavascriptExecutor explicitly is when you assign your driver to the base type WebDriver, which does not declare it.

In this case, you cast your driver instance (which really does implement executeScript() and executeAsyncScript()).

For example

WebDriver driver = new ChromeDriver();

// The base type 'WebDriver' does not define executeScript(), although our instance
// (which extends RemoteWebDriver) actually does implement it.
// So we need to cast it to 'JavascriptExecutor' to let the Java compiler know.

JavascriptExecutor js = (JavascriptExecutor) driver;

js.executeScript("alert('hi from Selenium');");

If you keep your instance's type, you do not need to cast to JavascriptExecutor.

RemoteWebDriver driver = new RemoteWebDriver(url, capabilities);

// Type information is not lost, so the Java compiler knows our object implements executeScript().

WebElement element = driver.findElement(By.id("mybutton"));

driver.executeScript("arguments[0].click();", element);

// In the above case the element is added to arguments, and a click() event is performed
// (in JavaScript, in the browser) on our element.

String htmlSnippet = (String) driver.executeScript("return document.querySelector('#myid').outerHTML");

// This time we use native JavaScript in the browser to find an element and return its HTML,
// bypassing Selenium's ability to do so.

The above two examples illustrate ways you can accomplish in JavaScript what you would normally use Selenium for.

Why would you do this?

Well, sometimes the driver has a bug, or it can be more efficient (or reliable) to do something in JavaScript, or you might want to combine multiple actions in one WebDriver call.

Scheduling tests to monitor websites

If you have access to your crontab, you can set a Selenium script to run periodically. If you don't have cron, you can use a VM (with Vagrant) or a container (with Docker) to get it.

Cron is available on Linux and Unix systems. On Windows, you can use Task Scheduler. On macOS, there is launchd, but cron is also included (it runs under launchd).

You could also set up a job to run on a schedule using a continuous integration server such as Jenkins. Or write a simple, long running script that runs in the background and sleeps between executions.

I have a service that runs Selenium tests and monitoring for my clients, and I use both cron and Jenkins for executing test runs regularly. I also have event-driven tasks that can be kicked off by a check-in or a user request.

Each line in a crontab represents a scheduled task in the following format:

#minute   #hour     #day      #month    #weekday  #command

# perform a task every weekday morning at 7am
0         7         *         *         1-5       wakeup.sh

# perform a task every hour
@hourly python selenium-monitor.py

You can edit crontab to create a task by typing crontab -e

You can view your crontab by typing crontab -l

If you just want to repeat your task within your script while it’s running, you can add a sleep statement and loop (either over an interval or until you kill the script).

#!/usr/bin/env python

from time import sleep
from selenium import webdriver

sites = ['https://google.com', 'https://bing.com', 'https://duck.com']

interval = 60 #seconds
iterations = 10 #times

def poll_site(url):
	driver = webdriver.Chrome()
	driver.get(url)
	title = driver.title
	driver.quit()
	return title

while (iterations > 0):
	for url in sites:
		print(poll_site(url))
	sleep(interval)
	iterations -= 1

See the example code on GitHub.


Originally posted on Quora:

https://www.quora.com/How-can-I-schedule-simple-website-test-scripts-Selenium-to-run-regularly-like-Cron-jobs-and-notify-me-if-it-fails-for-free/answer/Aaron-Evans-56

Acceptance Criteria Presentation

A few weeks ago I gave a presentation about acceptance criteria and agile testing to a team of developers I’m working with.

Some of the developers were familiar with agile processes and test-driven development, but some were not. I introduced the idea of behavior-driven development, with both RSpec "it should" and Gherkin "given/when/then" style syntax. I stressed that the exact syntax is not important, but consistency helps with understanding and can also help avoid "tester's block".

It's a Java shop, but I didn't get into the details of JBehave, Cucumber, or any other framework. I pointed out that you can write tests this way without implementing the automation steps and still get value — with the option of completing the automation later. This is particularly valuable in a system that is difficult to test, or has external dependencies that aren't easily mocked.

Here are the slides:

Acceptance Criteria Presentation [PDF] or [PPTX]

And a rough approximation below:


Acceptance Criteria

 

how to make it easier to know if what you’re doing is what they want you to do


What are Acceptance Criteria?



By any other name…

● Requirements
● Use Cases
● Features
● Specifications
● User Stories
● Acceptance Tests
● Expected Results
● Tasks, Issues, Bugs, Defects, Tickets…


What are Acceptance Criteria?



…would smell as sweet

● A way for business to say what they want
● A way for customers to describe what they need
● A way for developers to know when a feature is done
● A way for testers to know if something is working right


The “Agile” definition


User Stories

As a … [who]
I want to … [action]
So that I can … [result]


Acceptance Criteria

Given … [some precondition]
When … [action is performed]
Then … [expected outcome]

(Gherkin style)


Acceptance Criteria

Describe [the system] … [some context]

It (the system) should … [expected result]

(“should” syntax)


Shh…don’t tell the business guys

it’s programming


but can be compiled by humans…and computers!


Inputs and Outputs

if I enter X + Y
then the result should be Z

f(x,y) = z

 


 Not a proof

or a function
or a test
or a requirement
or …

It’s just a way to help everyone understand


It should

  1. Describe “it”
    (feature/story/task/requirement/issue/defect/whatever)
  2. Give steps to perform
  3. List expected results

Show your work


● Provide examples
● List preconditions
● Specify exceptions


A conversation not a specification

Do

● use plain English
● be precise
● be specific

Don’t…

● worry about covering everything
● include implementation details
● use jargon
● assume system knowledge


Thanks!

If you're interested in learning how to turn your manual testing process into an agile automated test suite, I can help.

contact me

Aaron Evans
aarone@one-shore.com

425-242-4304