As I have watched programs come through for Milestone Decisions and other reviews, I have gained the impression that our processes for risk management may have focused too much on the process and not enough on the substance of identifying and controlling risk. What I think I may be seeing is risk identification: categorization in the “risk matrix” showing likelihood and consequence, with risk burn-down schedules tied to program events. From my perspective, this by itself isn’t risk management; it is risk watching. We need to do what we can to manage and control risk, not just observe it.
All programs, but particularly all development programs, involve risk. There is risk in doing anything for the first time, and all new product developments involve doing something for the first time. The Department of Defense (DoD) has a good tool that lays out in detail the process of identifying, evaluating, categorizing and planning for risk in programs. Recently updated to version 7.0 by our Chief Systems Engineer Dr. Steve Welby, it is called the Department of Defense Risk Management Guide for Defense Acquisition Programs and is available online at https://acc.dau.mil/rm-guidebook. I don’t want to duplicate that material here, but I would like to make some comments on the substance of risk identification and risk mitigation and how it drives—or should drive—program structure and content.
I think of every development program primarily as a problem of risk management. Each program has what I call a risk profile that changes over time. Think of the risk profile as a graph of the amount of uncertainty about a program’s outcomes. As we progress through the phases of a program—defining requirements, conducting trade studies, defining concepts and preliminary designs, completing detailed designs, building prototypes and conducting tests—what we really are doing is removing uncertainty from the program. That uncertainty encompasses the performance of the product, its cost and how much time is needed to develop and produce the product. We can be surprised at any point in this process. Some surprises can be handled in stride, and some may lead to major setbacks and a restructuring or even cancellation of the program. It is our job to anticipate those surprises, assess their likelihood and their impacts and, most of all, do something either to prevent them or, if they do occur, to limit their impacts. All this effort is risk management.
As managers, we can take a number of proactive measures to mitigate risk. These measures all tend to have one thing in common: They are not free. In our resource-constrained world, we can’t do everything possible to mitigate risk. The things we can do cover a wide spectrum: We can carry competitors through risk reduction or even development for production, we can pursue multiple technical approaches to the same goal, we can provide alternative lower-performance solutions that also carry lower risks, we can stretch schedule by slowing or delaying some program activities until risk is reduced and we can provide strong incentives to industry to achieve our most difficult program challenges. Our task as managers involves optimization—what are the highest-payoff risk-mitigation investments we can make with the resources available? I expect our managers to demonstrate that they have analyzed this problem and made good judgments about how best to use the resources they have to mitigate the program’s risk. This activity starts when the program plan is just beginning.
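The optimization described above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not a method taken from the Risk Management Guide: the option names, costs and "exposure reduced" figures are all hypothetical, and the sketch assumes each option's benefit can be expressed as a single number and that benefits simply add. Real programs rarely have numbers this clean; the point is only that a fixed budget forces a choice among mitigation options.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mitigation:
    name: str
    cost: int              # investment required, in $M (hypothetical)
    exposure_reduced: int  # expected loss avoided, in $M (hypothetical)


def best_portfolio(options, budget):
    """Return the affordable subset with the largest total exposure reduction.

    Exhaustive search over all subsets; adequate for the handful of
    options a program office would realistically weigh.
    Assumes benefits are additive and independent (a simplification).
    """
    best, best_value = (), 0
    for size in range(1, len(options) + 1):
        for subset in combinations(options, size):
            cost = sum(m.cost for m in subset)
            value = sum(m.exposure_reduced for m in subset)
            if cost <= budget and value > best_value:
                best, best_value = subset, value
    return list(best), best_value


# Hypothetical options loosely modeled on the kinds of measures listed above.
options = [
    Mitigation("carry a second competitor", cost=60, exposure_reduced=90),
    Mitigation("parallel technical approach", cost=40, exposure_reduced=70),
    Mitigation("lower-performance fallback", cost=25, exposure_reduced=40),
    Mitigation("industry incentive fee", cost=15, exposure_reduced=20),
]

chosen, value = best_portfolio(options, budget=80)
for m in chosen:
    print(f"{m.name}: cost {m.cost}, exposure reduced {m.exposure_reduced}")
print(f"total exposure reduced: {value}")
```

With these made-up figures, the single most valuable option (carrying a second competitor) is not part of the best affordable mix, because three cheaper measures together reduce more exposure for the same money. The judgment about what the inputs should be remains the manager's.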
The most important decisions to control risk are made in the earliest stages of program planning. Very early in our planning, we determine the basic program structure, whether we will have a dedicated risk reduction phase, what basic contract types we will use, our criteria for entering design for production and for entering production itself, and how much time and money we will need to execute the program. Once these decisions are in place, the rest is details—important but much less consequential. As I’ve written before, these decisions should be guided not by an arbitrary process or best practice but by the nature of the specific product we intend to design and build.
What we call “requirements” determines a great deal (almost everything) about the risks we need to manage. Do the requirements call for a product like a Mine-Resistant Ambush Protected vehicle, which is basically a heavy truck built from existing off-the-shelf components? Or do they call for a Joint Strike Fighter built from all new design subsystems and much greater capability and complexity than anything we have ever built? In the first case, we probably can go directly into detailed design for production. In the second case, we need to spend years maturing the highest risk elements of the design, and it would be wise to build prototypes to reduce integration and performance risk before our performance requirements are made final and we start designing for production.
The contracting approach, fixed price or cost plus, is driven by risk considerations. We need to be careful about the illusion that all risk can be transferred to industry. This is never the case, even in a firm fixed-price contract. The risk that the contractor will not deliver the product is always borne by the government. We are the ones who need the product. Industry’s risk is always limited to the costs a firm can absorb—a very finite parameter. There certainly are cases where we should use fixed-price contracts for product development (the Air Force’s new KC-46 refueling and transport tanker is an example), but we should limit such contracts to situations where we have good reason to believe industry can perform as expected and where the risk is not more than the contractor can reasonably bear.
Cost-plus development has a very attractive feature from the risk-management perspective: its flexibility. In a fixed-price environment, the government should have defined the deliverables clearly and should not make changes or direct the contractor about how to do the work. In a fixed-price world, we have chosen to transfer that responsibility to the contractor. In a cost-plus environment, the government can be (and should be) involved in cost-effectiveness trades that affect requirements and in decisions about investments in risk-mitigation measures. These decisions affect cost and schedule, and in a cost-plus environment the government has the flexibility to make those trade-offs without being required to renegotiate or modify the contract.
At certain points in programs, we make decisions to commit both time and funding to achieving certain goals. Sometimes the commitments include several years of work and require spending billions of dollars. These are the milestones and decision points we are all familiar with in the acquisition process. These milestones and decision points are critical risk-management events.
At each of these points, we need a thorough understanding of the risks we face and a clear plan to manage those risks. Understanding these risks is rooted in a deep understanding of the nature of the product we are building.
The nature of the product should determine whether a dedicated technology maturation and risk-reduction phase is needed and what will have to be accomplished in that phase. Although they can be useful indicators, we can’t rely solely on metrics like Technology Readiness Levels (TRLs) to make these decisions for us. A bureaucrat can determine if something meets the definition of TRL 6 or not. It takes a competent engineer (in the right discipline) to determine if a technology is too immature and risky to be incorporated into a design for production. The nature of the product also should determine whether system-level prototypes are necessary to reduce integration risk prior to making the commitment to design for production. We did not need those prototypes on the new Marine One helicopter. We did need them on the F-22 and the F-35 fighter aircraft.
One risk-mitigation rule of thumb for program planning is to do the hard things first. In the Comanche helicopter program during the 1990s, the Army didn’t have enough funding to mature both the mission equipment package and the airframe. The choice was made to build prototype airframes—the lower-risk and less ambitious part of the program. This was done (over my objections at the time), because it was believed that, without flying prototypes, the program risked cancellation for political reasons. In other words, political risk trumped development risk. It didn’t work, and the program ultimately was canceled anyway. I do not advocate this approach; there are other ways to deal with political risk. In general, we should do the hardest things as early as we can in acquisition program planning. Eat your spinach first; it makes the rest of the meal taste much better.
Preferably, we should do the hardest (most risky) things in a Technology Maturation and Risk Reduction (TMRR) phase where the risk can be reduced with a lower financial commitment and with less severe consequences. Once Engineering and Manufacturing Development (EMD) begins, a program quickly has a marching army moving forward in a broad synchronized plan of work. When something goes wrong, that marching army often will mark time while it waits for the problem to be solved, an expensive proposition. We recently had a problem with the F-35 engine that led first to grounding the fleet and then to a restricted flight envelope. All this delayed the test program, and the effects rippled through much of the EMD effort. It would have been much better to have found this problem before it could disrupt the entire flight test program.
Within either a TMRR or EMD phase, we should structure workflow to reduce or realize as early as possible the likelier and more consequential risks. Risk should influence program planning details. We can use internal “knowledge points” to inform commitments within phases. Our chief developmental tester, Dave Brown, emphasizes “shifting left” in test planning. The benefits of this are that technical performance uncertainty is reduced as early as possible and that the consequences of realized risks are less severe in terms of lost work, rework or program disruption.
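One simple way to picture "likelier and more consequential risks first" is to score each risk and sort the work accordingly. The sketch below is illustrative only. It assumes exposure can be approximated as likelihood multiplied by consequence, which is a common convention but an assumption here, and the risk names and scores are hypothetical rather than drawn from any program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: float  # probability the risk is realized, 0 to 1 (hypothetical)
    consequence: int   # impact if realized, on a 1-5 scale (hypothetical)

    @property
    def exposure(self) -> float:
        # Assumed scoring rule: likelihood times consequence.
        return self.likelihood * self.consequence


def hardest_first(risks):
    """Order risks so the highest-exposure items are addressed earliest."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Hypothetical risk register for illustration.
risks = [
    Risk("airframe integration", likelihood=0.2, consequence=3),
    Risk("mission equipment software", likelihood=0.6, consequence=5),
    Risk("engine durability", likelihood=0.3, consequence=5),
    Risk("supplier tooling", likelihood=0.5, consequence=2),
]

ordered = hardest_first(risks)
for position, r in enumerate(ordered, start=1):
    print(f"{position}. {r.name} (exposure {r.exposure:.1f})")
```

A ranking like this is only a starting point for planning. Deciding which tests and knowledge points to pull earlier in the schedule still takes engineering judgment about the specific product.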
The major commitment to enter production should be driven primarily by achieving confidence in the stability of the product’s design, at least as regards any major changes. The key risk to manage here is that of discovering major design changes are required after the production line is up and running. This always is a trade-off; time to market does matter and our warfighters need the product we are developing. How much overlap is acceptable in development and production (concurrency) is a judgment call, but it is driven by an assessment of the risks of a major design problem that will require correction—and the consequences of such a discovery.
We recently had a fatigue failure in an F-35 bulkhead, a major structural member. We are in our eighth year of production. Fortunately, in this case, a reasonable cost fix seems viable, and we should be able to modify at modest cost the aircraft we already have built. I say “should be” because the fix will take time to verify through testing, and there remains some risk that the fix will be ineffective.
For all our major commitments, but particularly for exiting TMRR and for entering production, I demand specific accomplishments as criteria and I put them in Acquisition Decision Memoranda. The pressures are very high in our system to move forward, to spend the money appropriated and to preserve the appearance of progress. I recommend that this practice of setting specific criteria for work package initiation (or other resource, work-scope expansion or contractual commitments) be used internally throughout our programs. By setting these criteria objectively and in the absence of the pressure of the moment, I believe we can make better decisions about program commitments and better control the risks we face.
Delaying a commitment has impacts now; gambling that things will work out has impacts in the future. It often is tempting for managers under cost and schedule pressures to accept risk and continue as planned. We are paid to get these judgments right—and to have the courage to make the harder decision when we believe it is the right decision.
A source of risk nearly all programs face is uncertainty about external dependencies, often in the form of interfaces with other programs that may not themselves be defined or stable. In other cases, a companion program (user equipment for the satellite Global Positioning System, for example) may be needed to make the system itself viable or useful, but that program experiences its own risks that affect schedule and performance. We often expect program managers to coordinate with each other, but in many cases this isn’t enough. Controlling potential cyber vulnerabilities across program interfaces is a good example of an area in which we have problems. No affected program manager may be willing to change or have any incentive to adjust his or her program to bring it into synchronization with the other programs.
If there is a negative cost or schedule impact, the question always is, “Who will change and who will bear the cost of any needed adjustments?” I’m of the view that the DoD could do a better job at managing this type of risk. We can do so by establishing an appropriate technical authority with directive control over interfaces and program synchronization.
The sources of some of our greatest risks can go unnoticed and unchallenged. Gary Bliss, director of my Program Assessment and Root Cause Analysis Office, has introduced the concept of “framing assumptions” into our lexicon. One example of a framing assumption, again on the F-35, was that modeling and simulation were so good that actual physical testing wasn’t necessary to verify performance prior to the start of production. In the case of the Littoral Combat Ship, the assumption was that commercial construction standards were adequate to guide the design. Gary’s point, and it’s a good one, is that programs often get into trouble when framing assumptions prove invalid. However, these assumptions are so ingrained and established in our thinking that they are not challenged or fully appreciated as risks until reality rears its ugly head in a very visible way. This type of risk can be mitigated by acknowledging that the assumptions exist and by providing avenues for us to become aware of sources of evidence that the assumptions may not be valid. Our human tendency is to reject evidence that doesn’t agree with our preconceptions.
Gary found several cases where program management failed to recognize as early as it should have that core framing assumptions were false. The best way to manage this source of uncertainty is to take the time and effort during early program planning to identify a program’s framing assumptions, to understand that they are a source of risk and then to actively reexamine them for validity as more information becomes available. Again, “knowledge points” can be helpful, but we shouldn’t merely be passive about this. In our planning, we should create knowledge points as early as possible. If we do so, we can respond to any problems that emerge sooner rather than later.
I’ll conclude by reiterating two key points: Risk management is not a passive activity, and proactive risk-management investments are not free. Those investments, however, can be the most important resource allocations we make in our programs. As managers, we need to attack risk the way we’ve been attacking cost. Understand risk thoroughly, and then go after the risk items with the highest combined likelihoods and consequences and bring them under control.
Allocate your scarce resources so you achieve the highest possible return for your investments in risk reduction. Do this most of all at the very start of program planning. The course set then will determine the direction of the balance of the program and whether it succeeds or fails.