Defining App Tranches to Drive Your App Compat Testing

This post has been republished via RSS; it originally appeared at: Windows Blog Archive articles.

First posted to MSDN on Apr 30, 2018

Way back in 2013 (wow, was it really that long ago?) I introduced a key taxonomy for testing apps.

In this post, I discussed how you should crisply define which apps you are assuming personal responsibility for testing, versus apps that merely happen to exist.

As a refresher, defining an app as a Managed App is, essentially, an insurance policy. You are making a commitment to invest in app testing, and taking on responsibility for all managed apps working correctly or remediating the apps that don’t. Is “someone, somewhere managed to get an app installed one time” really a high enough bar for you to promise to keep it working between that initial discovery point and the end of time? Shouldn’t you have some sort of say on the quality of apps that you allow in, and the maintenance of those apps over time? I’ve seen entirely too many cases where an IT Pro has been trying to keep really old versions of non-IT-sanctioned software running, with the business unit providing no resources for upgrades but delivering an expectation that the status quo be maintained. This isn’t good for user experience, budgets, or security.

So, if you haven’t passed that step yet, I would encourage you to continue to work the Managed App list and distill it to what you and the business truly agree is part of the core deal between IT and the business.

As many organizations are finding, the next stage is further segmenting the managed apps. I had suggested a Platinum / Gold / Silver / Bronze taxonomy before, but want to start driving a little further into what we’re actually seeing in terms of organizational behavior and what is proving most effective in today’s faster-moving technology environment.

Always Test

First, we are seeing that just about every organization has a collection of applications which will always be tested. This number should be really small – truly just a handful. These are the apps where there is no room for failure, and where we will even test as much as we can before quality updates. If in doubt, you can always apply Chris Jackson’s Law – compute the cost of testing to include the deferred value of either quality or feature updates in the maths!

Canary Test

The next tier in the test matrix is Canary Testing, which is the best operationalization I’ve seen so far of how to meaningfully reduce the amount of testing in a risk-managed way. The key to this approach is separating your apps into App Tranches, based on similarities which predict that their success (or failure) will tend to be grouped. Apps developed by the same developers might be grouped together, or apps developed using the same technologies, such as ASP.NET MVC, Visual Basic 6, or Windows Forms. In general, we are seeing apps start to cluster in their failure patterns. Over time, you may start to see a trend in related software brittleness, and you should continue to align your App Tranches based on your findings – but most customers will find that app failure rates are sufficiently low that you won’t have enough data to justify much learning until a fair bit of time has passed.

Within each App Tranche, you’ll identify a subset of apps to represent the entire group.

If all of the apps pass, you stop – the canaries all survived, and that serves as a statistical representative of the App Tranche. You’re done!

If one or more apps fail, you’ll test the rest of the App Tranche, as you now have real data (and not unsubstantiated fear) indicating that this app category is uniquely at risk, so your additional testing is justified.
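The canary logic above can be sketched in a few lines. This is only an illustration of the decision flow – the `canary_test` function, tranche contents, and `run_compat_test` callback are all placeholders I’ve made up, not a real tool or API:

```python
import random

def canary_test(tranche: list[str], canary_count: int, run_compat_test) -> dict:
    """Test a canary subset of an App Tranche; expand to the full
    tranche only if a canary fails. run_compat_test(app) -> bool."""
    canaries = random.sample(tranche, min(canary_count, len(tranche)))
    results = {app: run_compat_test(app) for app in canaries}

    if all(results.values()):
        # The canaries all survived - they stand in statistically
        # for the whole tranche, so we stop here.
        return results

    # A canary failed: we now have real data that this tranche is at
    # risk, so the additional testing of the remaining apps is justified.
    for app in tranche:
        if app not in results:
            results[app] = run_compat_test(app)
    return results
```

For a five-app tranche with two canaries, a clean run tests only two apps, while any canary failure escalates to all five – which is exactly the trade the taxonomy is making.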

Pilot Test

The final tier is Pilot Testing. These are apps where the risk is managed well enough (you have a defined failover plan, for example) or the app is complex enough (an untrained person wouldn’t be able to do any meaningful testing, for example) that you want to wait until you get to the first Ring of your deployment. The first term I heard used to describe this was Department Technical Coordinator (thanks, Joel!) – someone who ended up being the first to experience coming changes in the technology stack. I’ve heard quite a few terms since then, but they all revolve around the same concept – you have pilot users that you incentivize to participate as partners. You don’t want this to be a tax, but something that this select group of users is eager to participate in.

Wait – did I just say that people can be eager to participate in app testing?

Well, yes, there are some strategies I’ve seen used quite effectively to make pilot testing a key part of validating the entire technology stack (not just Windows, or even software releases)! Here are some of the strategies I’ve seen work really well here:

  • Make it opt-in – some people genuinely love getting new technology first; it’s like a new toy for them!
  • Help users make their participation a differentiator for their annual review – not only did they do their regular job well, but they also advanced the technology of the entire company!
  • Give people the first experience on the fancy new hardware you’re buying – who doesn’t like a fancy new laptop!
  • Provide white glove service to these users in exchange for their key role – people love feeling special!

Visualizing the Taxonomy

Here is a visualization of the taxonomy as I’ve seen it, so we can walk through it and discuss what happens:

First, we have the Always Test apps – and in this example, we happened to find that all of them were working correctly (hooray!).

Next, we looked at the Canary Testing apps. In App Tranche 1, we had 5 apps, and decided that we were going to test two of them every time. Here, we found that both of them worked, so we went ahead and stopped testing.

After that, we looked at Tranche 2. Here, we had 3 apps we always tested (the tranche was deemed riskier), and during testing we found that one of the apps failed. Uh oh! Because of that failure, we went ahead and tested the rest of the tranche, as we had tangible data suggesting more risk. In this example, we didn’t find any more failures, but we feel much better about that App Tranche having done that additional risk management.

Finally, we released the Pilot once testing was completed for the apps those users needed, and in this example only one incident was called in (which ended up not being a compat problem at all). From there, we felt comfortable proceeding to broader deployment.

What you can start to see here is a way to get smarter about what to test and what not to test. Because “test less” isn’t very actionable, and “take on more risk” is something nobody likes to do, this methodology allows you to transform your testing strategy from fear-driven to data-driven – in a trust-but-verify kind of way!


The next thing to be thinking about is: how do you start to automate the app compat testing? We’re not going to go into details in this blog post, but from a strategy perspective, the idea is to keep pushing your way down the stack. Once you have automated testing lined up, you can manage the risk of ongoing change (from every layer of the stack) much more inexpensively and quickly, so this should always be considered part of your Enterprise Architecture maintenance strategy.
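As a minimal sketch of what the bottom of that stack might look like – assuming nothing beyond a launch-and-exit smoke check; the app names, paths, and flags in `APPS` are entirely hypothetical – an automated pass could be as simple as verifying each managed app starts cleanly:

```python
import subprocess

# Hypothetical inventory: app name -> command line for a self-check.
# These paths and flags are illustrative placeholders, not a real product.
APPS = {
    "LineOfBusinessApp": [r"C:\Apps\lob\app.exe", "--selftest"],
    "ReportingTool": [r"C:\Apps\reports\tool.exe", "--healthcheck"],
}

def smoke_test(cmd: list[str], timeout_s: int = 30) -> bool:
    """Return True if the command runs to completion with exit code 0."""
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
        return proc.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # Hung, missing, or unlaunchable apps all count as failures.
        return False

for name, cmd in APPS.items():
    print(name, "OK" if smoke_test(cmd) else "FAIL")
```

Even something this shallow, run on every update, catches the “app no longer launches” class of failure for free – and it gives you a harness to deepen with real functional checks over time.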

