This post has been republished via RSS; it originally appeared at: Windows Blog Archive articles.
First posted to MSDN on Jul, 05 2013
It looks like, back in January, Converter Technologies decided to post a rebuttal to my July 2012 article on The Mathematics of Office Compatibility. So, apparently we’re having a conversation with a 6-month cadence! Well, 6 months is up so now it’s my turn again. :-)
Apparently the folks at Converter Technologies are trying to tear down the very concept of the Modern Office Compatibility Process we spent the better part of our Fiscal Year 2012 developing, piloting, and communicating. I’m not entirely sure why that would be the case, and I’m happy to pursue that conversation further should they want to chat. Or perhaps I am misinterpreting their intent in having a blog post rather than sending me questions directly (since my email address is hyperlinked directly on the bottom of my article I’m not too hard to track down). But I’m fine with them disagreeing with me and having their own ideas, because it turns out I’m wrong rather a lot – and it’s only by being wrong that you can learn anything so I’m quite content with that.
So, let’s start addressing the questions and assertions raised.
“…we see a lower critical failure rate, 2 percent than the 5 percent noted.”
Yes – I actually see that as well. Just a little bit north of 2 percent, statistically. I quite deliberately exaggerated the risk because, even if the statistical mean is around 2%, that doesn’t mean that you, or anyone, actually hits the mean. Roughly speaking, half of people land higher and half land lower. I’m accounting for the case where you happen to be a couple of standard deviations above the mean because of sheer bad luck. I’m also exaggerating the risk because, let’s face it – Microsoft has a track record of telling you that everything is low risk when it comes to app compat. We even told you that most things should work on Windows Vista that worked on Windows XP! (While technically true, if you define “most” as > 50%, I think anyone who has actually done that migration would suggest that they had a very different interpretation of “most” than would make this statement factually consistent.) So, I went out of my way to paint the picture a touch more bleakly than I needed to so you don’t under-manage the risk, while trying to be somewhat cautious about not over-managing it. I just want you to be sensible about it. And, as you proceed with your testing, I encourage you to reflect on your own findings and statistically project them – don’t just take my, our, or anyone’s word for it!
“We question if the hourly rate is accurate since, even if it’s a weighted cost, the employee would be earning in the range of $250K (I’ve attached my resume at the end of this document for any prospective employers).”
Yep, $250/hour is a pretty high cost – again, my intent is to exaggerate the cost, which actually exaggerates the risk and causes you to test more. My experience is that most customers prefer to manage more rather than less risk. And the point I was hoping to make was that, even with an exaggerated failure rate and an exaggerated cost, you would still need to have your testing costs get extremely low in order to cost-justify the testing for each and every app. But I’ll also point out that this rebuttal misunderstands the intent of this figure. The value of a user’s time to the organization is not equivalent to the amount that you pay that employee. And weighting the salary to include benefits still doesn’t allow us to compute that value. There are all kinds of other factors. That employee needs to contribute enough value to justify whatever percentage of their time everyone above them in the management chain spends on them. We have all kinds of overhead in nearly every organization which we need to fund using that user’s productivity. And we also need to have them be generating business value above and beyond what we pay them, in order to generate a profit. So, while a deliberately generous number, it’s not ridiculous as the author appears to suggest. But, if you have a better estimate, I encourage you to use it. This math varies for every customer, so my goal was to be pessimistic, but not unnecessarily so.
“Mr. Jackson indicates that reactive fixing is what you should plan for based on cost assumptions for proactive testing.”
Actually, I don’t say that. Rather, I say that you should absolutely, emphatically (do not pass go, do not collect $200) test everything where the cost of failure times the probability of failure exceeds the cost of testing. And, where the cost of failure multiplied by the probability of failure is less than the cost of testing, it honestly doesn’t make mathematical sense to test your apps.
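If it helps to see that rule written down, here is a minimal sketch in Python. The function name `should_test` is mine, the $125 and 5% are the figures used elsewhere in this post, and the $2 and $20 test costs are made-up illustrations, so plug in your own numbers.

```python
def should_test(cost_of_failure: float,
                probability_of_failure: float,
                cost_of_testing: float) -> bool:
    """Test proactively only when the expected loss exceeds the cost of testing."""
    expected_loss = cost_of_failure * probability_of_failure
    return expected_loss > cost_of_testing


# Hypothetical test costs, purely for illustration.
cheap_test = should_test(cost_of_failure=125, probability_of_failure=0.05, cost_of_testing=2.00)
pricey_test = should_test(cost_of_failure=125, probability_of_failure=0.05, cost_of_testing=20.00)
print(cheap_test, pricey_test)
```

The same function covers the scary cases too: raise `cost_of_failure` far enough and almost any testing cost becomes justified.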
“…you are going to have to test those files at some point.”
That’s not necessarily true. And you can try this for yourself independently. Between documents stored in my email archives, SharePoint sites, and hard drives on various devices, I have over 25,000 Office documents I am personally storing. Not that I am claiming to be “typical” in any way! But if I think about what needs to work, and what I personally test against new Office suites, I have a couple of dozen that “have” to work. I have probably a hundred in total that I could tell you today, with fairly high certainty, I’m going to need to open again someday. The rest are for archival purposes. I may have to open them up again, which is why I’m keeping them around, but in all likelihood I’m honestly never going to at any point between now and the end of time. So, 99% of my Office docs will never be opened again by any human ever. Of course, the plural of “anecdote” is not “data”, but I’m sure you would project that many of your archival documents won’t ever “need” to be tested at any point. But, you don’t really know which ones these are, so we have to once again use statistical projections.
“The problem is how Mr. Jackson applies the 5 percent factor, to the per file cost - perhaps it’s new math.”
And I guess now it’s time to get emotional and bring out the insults to prove a point? Let’s just stick with the facts. We are dealing with statistical projections. If the cost of failure is $125, and the probability is 5%, then you need the cost of testing to remain below $6.25 per file in order to make the investment statistically reasonable. You don’t project the total cost of failure on every single document or app. You can think of this as insurance. Say you have a $50,000 whole life insurance policy. Eventually, you’re guaranteed to die, so they know they’re going to have to pay someday. But, they don’t know if they are going to have to pay THIS month or not. So you pay a monthly premium based on the cost of failure ($50,000) multiplied by the probability of failure (varies based on age), factor in the investment returns for that money, pay your staff, and add some profit – voila, you have a premium. And you can be pretty sure that if the premium is $50,000 a month, your customers are pretty quickly going to spot that this is not such a good deal. In that scheme, of course, some people get “unlucky” and live well beyond mean lifespans, while others will get “lucky” and be struck by a bus immediately after they finish signing the policy, but that’s the nature of statistics. The same thing is true here. This is a cold, heartless, actuarial task.
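For anyone who wants to check the arithmetic, here is a small Python sketch of the break-even computation. The helper name `break_even_test_cost` is hypothetical; the $125 cost of failure and the 5% and 2% failure rates are the ones discussed in this post.

```python
def break_even_test_cost(cost_of_failure: float, probability_of_failure: float) -> float:
    """Largest per-file testing cost that the expected loss can justify."""
    return cost_of_failure * probability_of_failure


# Compare the deliberately pessimistic 5% rate with the observed ~2% rate.
for p in (0.05, 0.02):
    limit = break_even_test_cost(125, p)
    print(f"{p:.0%} failure rate: testing pays off only below ${limit:.2f} per file")
```

At 5% the threshold comes out at $6.25 per file, and at 2% it falls to $2.50, which is why the exaggerated rate pushes you toward testing more rather than less.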
“…the cost to repair a given file by his estimations has to be $125…”
I didn’t at any point in my article include remediation costs as a factor. Because, remediation costs aren’t relevant in determining which apps you should test or not – remediation costs are something you have to pay as soon as something is broken, no matter whether you or the user found it. The question isn’t “should I fix it”, it’s “should I, the technology team, test it before I deploy”.
“Typically, our rule of thumb used to factor how many files are likely to be in an organization is this: simply take the number of Office seats and multiply by 1,000 (in the article he uses 500 files per user). Using the article numbers, for an organization of 50,000 office users, this would result in 25 million files, or a total cost of $156 million (25 million * 6.25). We feel it’s simpler to just apply the true cost per file ($125 as above) multiplied by the number of likely problematic files, which, based on 5 percent would be 1.25 million files in a 50,000 user organization. From our experience, it would likely be closer to 500,000 files.”
This is where things really get confusing. The author from Converter Tech is taking the math for a single file in a particular scenario, the scenario where you can resolve the issue with an SLA of 30 minutes, and applying it to every single application in a customer’s estate. And this baseline assumption does not hold true. There are some files where the cost of failure is not limited to 30 minutes of high-cost downtime. Rather, there are some documents where a single failure can lead to millions of dollars in lost revenues, lawsuits, or even congressional inquiries! Concomitantly, there are some documents where the user will shrug their shoulders and just close something if it doesn’t work right, or perhaps not even notice. And there are a huge number of files which will never be opened at any point between now and the end of time. So, we can’t project the cost of a single document in a single failure case, which happens to be opened, to all files. That is a nonsensical computation, so I honestly don’t get the point. Perhaps it’s a misunderstanding? It then applies the potential downtime loss against an SLA of 30 minutes ($125) across all failing apps, which is the more sensible approach, so I’m not sure why the author chose to bother doing the nonsense math above that. However, what this author doesn’t incorporate is the probability that the failure matters. You can, of course, pay to have all of the documents you have fixed up just in case, but most reasonable people with budgets to manage would likely choose not to fix up an old document which nobody is ever going to look at again. So, if we want to do this multiplication, we need to factor in that variable.
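To get a feel for how much that extra variable moves the total, here is a rough Python sketch. The 50,000 users, 500 files per user, 5% failure rate, and $125 per failure are the figures quoted above. The `p_matters` values (the chance that a failing file is ever opened and the failure actually matters) are purely hypothetical placeholders, not measurements, and the function name is mine.

```python
def expected_downtime_cost(users: int, files_per_user: int, failure_rate: float,
                           p_matters: float, cost_per_failure: float) -> float:
    """Expected loss across an estate, counting only failures that someone actually hits."""
    total_files = users * files_per_user
    return total_files * failure_rate * p_matters * cost_per_failure


# p_matters is an assumption: try a few values and see how the total scales.
for p_matters in (1.0, 0.10, 0.01):
    cost = expected_downtime_cost(50_000, 500, 0.05, p_matters, 125)
    print(f"p_matters={p_matters:.0%}: ${cost:,.0f}")
```

Even this treats every failure as a flat $125, which the paragraph above argues is not true of the real estate; a fuller model would give each class of document its own cost and its own probability.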
“And, it’s worth reiterating – there’s no avoiding that at some point they are going to have to test the files.”
And, it’s worth reiterating – you are not going to have to test all of your files.
“Failure, while in production, is never an option for any of our clients.”
I thought I addressed this fear mongering in my article, so let’s just quote that here: “There will be fear that your upgrade will cause some mission-critical business failure. You might apply the following equation: The cost of failure can equal infinity. You must also contend with the fear of change and the insistence that nothing can be broken.” This is precisely the point of this article – dispelling that fear-based irrational behavior. First of all, unless you are an organization that has no technical help-desk, you have already accepted that failure, while in production, is a fact of life while using technology developed by humans. Second, it assumes that there is something you can do to avoid it. It’s an unfortunate mathematical reality that predicting rare events is hard! So I certainly applaud those who try, while also trying to prepare my customers for an adequate interpretation of their data. You can compute this using Bayes’ theorem. Let’s say you have a process or a tool which is 99% accurate. That’s really terrific for helping you build a budget! But now, use that same, really good data to predict for any one of your individual documents or applications whether it’s going to work or not. Your 99% average accuracy doesn’t mean that you’re 99% sure a “red” document is broken. Rather, if you assume 5% of docs will fail, then your posterior probability for that particular app ends up being 84%. Or, if you assume 2% of docs will fail, then your posterior probability drops to 67%. You’ll have erroneous data nearly 1/3 of the time, even with an exceptionally high quality tool! And if that quality drops to 90%, then your posterior probability on an individual doc (with a 2% failure rate) drops all the way down to 16%! (Which is still a whole lot better than 2%, so let’s not get discouraged here!) You just need to be careful when interpreting your data and planning your “failure-proof” solution. 
Make sure you ask about accuracy statistics, and how they were computed.
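Here is a minimal Python sketch of the Bayes’ theorem computation behind those percentages. It assumes that “accuracy” means the tool is equally good at flagging broken docs and at clearing working ones (sensitivity equals specificity); that is my simplifying assumption, and real tools may differ. The function name `posterior_broken` is hypothetical.

```python
def posterior_broken(prior: float, accuracy: float) -> float:
    """P(doc is really broken | tool flags it red), via Bayes' theorem.

    Assumes accuracy applies equally to broken and working docs.
    """
    true_positive = accuracy * prior               # broken and flagged red
    false_positive = (1 - accuracy) * (1 - prior)  # working but flagged red
    return true_positive / (true_positive + false_positive)


# (assumed failure rate, tool accuracy) pairs from the paragraph above.
for prior, accuracy in [(0.05, 0.99), (0.02, 0.99), (0.02, 0.90)]:
    p = posterior_broken(prior, accuracy)
    print(f"failure rate {prior:.0%}, accuracy {accuracy:.0%}: posterior {p:.0%}")
```

Running it reproduces the 84%, 67%, and 16% figures. If a vendor reports separate false-positive and false-negative rates, swap them in for the two `accuracy` terms.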
“We believe that manually scoping and testing (let alone repairing) the 5 percent of your files manually is a hugely daunting task.”
I don’t believe I asserted that everything (or even anything) should be done manually? I am a fan of tools, and encourage you to automate any efficient task you have in order to magnify that efficiency. In the world of app compat, I caution you to be careful to ensure that you have something which is truly efficient. I’ve invested a lot of political capital in getting some of our least accurate tools removed from our products because they just didn’t make sense once you did the math. OMPM – gone. ACT down-level evaluators – gone. And I encourage everyone to do the math and understand the accuracy of any tools they incorporate. Because it’s impossible for that accuracy to be 100%, through no fault of the tool vendors but due to that pesky Alan Turing.
“The ROI, that Mr. Jackson argues for does not make sense.”
I’m not sure what doesn’t make sense, so it’s hard to elaborate on why I believe it makes sense. Again, I could totally be wrong, but the risk-balanced approach has been one that’s worked out really well for my customers. Particularly now, in the world of impending Windows XP and Office 2003 End of Life, where the most critical factor is velocity! Avoiding perfectionism and embracing calculated risk is working. Calculated is a critical term here!
“Leaving this to chance, and rolling out the latest version of Office without some consideration is a very risky approach indeed.”
Becoming more agile has never been more important than in today’s rapidly changing world. "Office 365 significantly advances technology and engineering. Because of the cloud, we can develop, test and deliver technology with more accuracy and precision. As soon as we build a release, we can deploy it and customers can consume the technology immediately. That’s good for customers, but it’s instrumental for our engineering." Tools have gotten way better, but our most successful customers are starting their approach with a mathematically sound process which drives the sensible use of tools anywhere and everywhere appropriate.