Assessment is how the military determines “the overall effectiveness of employing capabilities during military operations.” That said, the military does not practice this dark art well. If an objective is clear-cut, measuring progress toward it is also straightforward. Broader objectives cause problems. This was the challenge in Iraq. It was also at the heart of the ineffective military engagement in Vietnam.
This diagram shows the operation assessment steps set out in the February 7, 2020 joint publication designated as ATP 5-0.3/MCRP 5-10.1/NTTP 5-01.3/AFTTP 3-2.87.
Operation assessment is not well understood. The COVID-19 pandemic—where the nation as a whole has engaged in a very public struggle concerning the problem of the “right” metric by which to measure progress—provides an excellent opportunity to highlight the challenges of operation assessment in the military as well as to demonstrate pitfalls in its application. The goal is not to critique the use of particular metrics by our civilian government, but to illustrate useful principles.
Similarly, there is no desire to geek out on math. In fact, joint doctrine does not require significant mathematical or statistical ability to undertake operational assessment. The Army does have experts especially for this, but joint doctrine envisages assessment tasks being undertaken by ad hoc teams without specialized skills. The math gets you something (better predictive tools based on statistical modelling), but it is a “nice to have” and not a “need to have”. The essential question does not involve math, but rather finding logical connections between things that can be observed and the commander’s desired end-state.
Where to begin? Before we can ask ourselves whether or not we are making progress towards our objective, we have to ask ourselves what that objective is.
Define the objective
What is it we are trying to do, exactly? And why? Doctrine tells us that “[p]oorly defined objectives or end states typically result in ineffective planning, as well as increase the risks of wasting resources and opportunities…” Moreover, only once an objective is understood can an effective assessment approach be developed.
Identifying the objective may become complex as we get higher up in echelon. Performing a systems analysis of the problem and determining the “centers of gravity” that affect it could lead to the development of “lines of effort” that create very distinct sub-tasks. Each of these sub-tasks may require a different assessment approach. The objective itself may also change over time. Changing circumstances may require revisiting the assessment approach (e.g., an increase in the total number of COVID-19 cases may be rendered unimportant by the development of an effective therapy).
Intellectual honesty in defining the objective—and in the entire assessment process—is of the utmost importance. With COVID-19, in some quarters of civil society there was a suspicion that the goal of “flattening the curve” gave way to the medical profession’s default goal of minimizing harm. The perception that the goalposts were being moved undermined public support for social distancing restrictions. A clear goal enables one to choose relevant metrics to demonstrate progress or a lack thereof.
Intellectual dishonesty can be introduced into this preliminary step for political or organizational advantage (there is ample literature concerning how data can be misleadingly reported at the tail end). In the military, if funding for a particular mission or task is premised on a certain logic, the answer to the “why are we doing this?” question can become intentionally clouded. Do we need to measure both progress towards our actual goal as well as towards the goal that we pretended to adopt in order to get the funding? In our COVID-19 example, the nation’s economic health is tied to the electoral prospects of civilian leaders, providing a significant incentive to spin the progress of anti-pandemic measures.
Examining the question of what, exactly, state and federal governments were trying to accomplish at various points during the ongoing COVID-19 fight could, perhaps, suggest better metrics to guide policy makers. Were they simply trying to slow the spread of the disease? Detect outbreaks? Safeguard vulnerable populations? Re-open the economy? Carry on business as usual?
Find the right metric for the objective
It is ideal if some extant set of reliable data will provide the needed information. Always consider the purpose for which the data was originally collected and how that might affect the quality of the data. For example, COVID-19 testing primarily serves a diagnostic or safety function, not necessarily an epidemiological one, so testing is often weighted towards those who present with symptoms or who form part of an essential workforce. Of the CDC’s five recently promulgated categories for testing in non-healthcare workplaces, only one relates to public health surveillance. If needed, collecting fresh data is possible, but this results in a commitment of resources (recent military examples from Afghanistan indicate that opinion polling requires costly in-person interviews in unstable security environments). Non-standard data parameters can affect the quality of the data.
Easing lockdown restrictions (originally imposed as a way to reduce transmission of the novel coronavirus that causes COVID-19) is perceived as vital for re-starting economic activity. In that context, the increase or decrease in infections (or deaths) is perceived as crucial to determining whether or not the pandemic is under control. That, in turn, drives public confidence in support of re-opening. Success is fewer infections today than yesterday; failure is a greater number. Hospitalization metrics—which arguably provide a better reflection of the disease’s effect on the population than potentially asymptomatic infections—have also featured prominently alongside infection and death metrics.
As anyone following the news will be aware, the limited testing capabilities and differing strategies concerning to whom tests were administered created some controversy over the new infections data. The testing strategy itself is not uniform, skewing the new infections numbers further. In some situations, everyone is tested (U.S.S. Theodore Roosevelt, prisons, hospital staff). In others, the focus is on testing only those who present with symptoms (hospital patients). Can those two types of results be appropriately combined into a single “new infections” metric? Similarly, the ways in which causes of death were assigned to fatalities caused some concern with the numbers regarding COVID-19 deaths.
The “new confirmed infections” metric does appear to be the current standard when discussing success or failure in slowing the progress of the disease. While a perfect daily snapshot of the number of new COVID-19 infections may indeed be useful, a truly accurate set of numbers cannot be achieved—some feared—until there is widespread testing (to move beyond confirming diagnoses to understand the number of “hidden” cases).
Endless refinement is possible here, and some may be necessary. Should we only be interested in new symptomatic cases (if everyone in a prison is tested and 18% are infected, is that what matters, or is it the fact that only a very small subset of those infected will fall ill)? Should we only be concerned with cases among the most vulnerable population, who are more likely to die? Should new cases be backdated to the estimated date of onset?
If the goal is to “flatten the curve” (stabilize the rate of infection so that the medical facilities needed to treat the pandemic and provide other necessary emergency care are not overwhelmed), perhaps a metric focused on availability of treatment is preferable. The delta between available medical facilities and equipment and the population needing them might work. So long as sufficient equipment is available to deal with medical emergencies, new cases might not matter. Collecting this data appears relatively straightforward (counting empty ICU beds and unused and available medical equipment).
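To make the "delta" idea concrete, here is a minimal sketch in Python. The record layout and field names (icu_beds_total, icu_beds_occupied) are hypothetical and are not drawn from any real reporting system; the point is only that the metric is a subtraction once the counts are in hand.

```python
from dataclasses import dataclass


@dataclass
class CapacityReport:
    """One day's snapshot for a single region (hypothetical layout)."""
    icu_beds_total: int     # staffed ICU beds available in principle
    icu_beds_occupied: int  # beds in use by any patient, COVID-19 or not


def icu_headroom(report: CapacityReport) -> int:
    """Beds available minus beds in use; negative means demand exceeds capacity."""
    return report.icu_beds_total - report.icu_beds_occupied


def headroom_fraction(report: CapacityReport) -> float:
    """Share of total capacity still free; negative when overwhelmed."""
    if report.icu_beds_total <= 0:
        raise ValueError("icu_beds_total must be positive")
    return icu_headroom(report) / report.icu_beds_total
```

The same subtraction could be repeated for ventilators or any other constrained resource. What level of headroom counts as "enough" is a judgment for the decision maker, not something the arithmetic supplies.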
If the objective is to increase public confidence to encourage a return to economic activity, perhaps all that is necessary is to demonstrate that the number of people dying from all causes today has been brought down to “normal” levels. This is excess mortality or the “excess deaths” metric. This avoids the problem with testing distortions and the cause of death controversy, but is susceptible to distortion by unrelated phenomena or secondary effects of the pandemic fight (e.g., lower accidental deaths during lockdown). It is also straightforward to compile: counting corpses rather than arguing about death determination methodology. Success on this metric would demonstrate that the medical situation was “returning to normal”; the downside is that excess deaths recorded in one time period could be offset by fewer excess deaths later, as the most vulnerable members of society are “weeded out” early.
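A rough sketch of the excess deaths arithmetic, in Python, may help. It assumes the analyst already has an expected ("normal") death count for each period, for instance an average of the same weeks in prior years; how that baseline is built is a separate question not addressed here, and the function names are illustrative only.

```python
from itertools import accumulate
from typing import List, Sequence


def excess_deaths(observed: Sequence[float],
                  expected: Sequence[float]) -> List[float]:
    """Per-period excess: observed all-cause deaths minus the expected baseline."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same periods")
    return [o - e for o, e in zip(observed, expected)]


def cumulative_excess(observed: Sequence[float],
                      expected: Sequence[float]) -> List[float]:
    """Running total of excess deaths across the periods."""
    return list(accumulate(excess_deaths(observed, expected)))
```

The running total is what exposes the offsetting effect described above: a period with fewer deaths than expected yields a negative excess and pulls the cumulative figure back down, even though no earlier death has been undone.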
Understand the chosen metric
Some metrics are complex, combining several measures to create a more complete picture. Others combine surveys, Google searches, and other “signals” to try to present a useful composite. Similarly, data on the effective reproduction number Rt (the average number of people infected by each infectious person) requires an algorithm for projecting the rate of spread into the future using known delay times between infection and a positive test result.
Variations can smooth out rough numbers. For “new infections” take instead a seven-day average of new cases compared to the number of tests being conducted. Moving averages seek to correct for daily variation in the chosen underlying metric when those daily variations are not, of themselves, indicative of a trend (e.g., the very real differences between weekday activity versus weekend activity). Linking the number of new cases to the number of tests administered seeks to correct for variations in test availability.
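As an illustration only, the two adjustments just described can be sketched in Python. The function names and the choice of a trailing (rather than centered) window are assumptions made for this sketch, not a description of how any health agency computes its published figures.

```python
from typing import List, Sequence


def trailing_average(values: Sequence[float], window: int = 7) -> List[float]:
    """Moving average over the most recent `window` days.

    Returns one value per full window, so the output is shorter than the
    input by window - 1 entries.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def windowed_positivity(new_cases: Sequence[float],
                        tests: Sequence[float],
                        window: int = 7) -> List[float]:
    """New cases divided by tests administered, pooled over each window."""
    if len(new_cases) != len(tests):
        raise ValueError("new_cases and tests must cover the same days")
    if window <= 0:
        raise ValueError("window must be positive")
    rates = []
    for i in range(window - 1, len(new_cases)):
        cases_in_window = sum(new_cases[i - window + 1 : i + 1])
        tests_in_window = sum(tests[i - window + 1 : i + 1])
        if tests_in_window == 0:
            raise ValueError("no tests administered in window")
        rates.append(cases_in_window / tests_in_window)
    return rates
```

Pooling cases and tests across the window before dividing keeps a single low-volume day (a Sunday with few tests, say) from dominating the result. Note that neither adjustment addresses who is being tested, so the skew from differing testing strategies remains.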
Without a robust understanding of how the underlying data works, complex metrics can be misunderstood. Understanding the data at this level of detail is difficult (for a cavalry scout this is the equivalent of realizing that your GPS and someone else’s map might not match, depending on the map datum used by each). In the military, the simpler solution is usually preferred. Similarly, relying on straightforward data is preferable, as it is easier to both communicate and understand. Even so, misleading oversimplifications must be avoided. Recourse to subject matter expertise can help navigate complex data.
The limits of metrics
COVID-19 illustrates that metrics do not offer a precise guide to action. The goal of assessment is to facilitate a deeper, shared understanding. If metrics are used mechanically as automatically-executing decision points, the temptation to skew the metrics to match the desired outcome can be overwhelming. Such a course of action amounts to “boss pleasing,” undercuts the rationale behind assessment, and creates deeper ethical concerns. A commander is not obliged to be guided strictly by the data. But ignoring the data is a risk he or she must explicitly accept as a consequence of exercising military judgment.
The COVID-19 example demonstrates an important caveat for military practitioners of operation assessment. Leadership and decision-making are more than just math or logic problems. Success on a medically-focused metric may be insufficient in the face of countervailing political or social concerns (economic collapse due to restrictions on productive activity). Leaders need to stay focused on the big picture.
Assessment should, instead, drive adjustments in operations (“[O]bserved and reported actions are of little value unless they can serve as a basis for future decisions and actions.”), but only in combination with a robust understanding of the constantly-changing overall situation. Assessment metrics only have value in context and in the hands of thoughtful commanders. “[H]uman military judgment is required to make sense of the collection of indicators and analytic output.”
About the Author: Garri is a Cavalry Scout currently assigned to the 28th Infantry Division staff, most recently as an assessment officer. He occasionally writes military stuff, examples of which can be found on the U.S. Army War College’s War Room blog and at West Point’s Modern War Institute, but he’s part time, so take it with a grain of salt. The views expressed are those of the author and do not necessarily reflect the official policy or position of anyone at any time.