This article advocates the use of simple technology readiness metrics that focus on system-wide technological maturity. Current DoD practice is to set guidelines for the maturity of individual system components, but the statistical evidence provided in this article demonstrates that more holistic metrics should be adopted. A simple system technology readiness metric is proposed and evaluated based on historical cost and schedule performance, and is shown to be potentially quite useful in avoiding poor acquisition outcomes. Finally, the policy implications of implementing a decision rule based on the metric are explored in depth, and the DoD is advised to pursue and encourage applied research for the development of more comprehensive technology readiness metrics.
The discovery process in defense acquisition is expensive and time-consuming. If the Department of Defense (DoD) is too optimistic in what technologies it believes U.S. defense contractors can master in a timely fashion, taxpayers will be subject to large cost overruns, and warfighters will go without more effective weaponry for much longer than expected. Congress has attempted to reduce this risk by codifying minimum technology maturity levels for program elements before they can be included in a program of record. To further reduce these risks, minimum technology maturity levels should be enforced for the program in its entirety rather than focusing exclusively on its individual components. Indeed, empirical evidence gleaned from an evaluation of the effect of Technology Readiness Levels (TRL) on system cost overruns and schedule slips strongly supports taking a more holistic view of technology maturity. To put the magnitude of possible savings in perspective, a potentially avoidable additional cost overrun of 40 percent through the procurement phase for a single ‘typical’ $2.5 billion Major Defense Acquisition Program (MDAP) results in $1 billion in additional outlays. The DoD should stop reaching for elusive technologies in its programs of record, but it should continue to retain and even enhance its technology leader status through strengthening its science and technology programs.
The Services face a problem familiar to commercial interests in high-technology sectors: the development and fielding of appropriate technologies to satisfy customers’ demands. The connection of technology to system requirements is clear—higher levels of technology generally allow the satisfaction of more demanding requirements. However, more advanced technologies can take much more time and money to develop, so trade-offs have to be made in a resource-constrained world.
TRLs describe the state of a critical technology element’s development. The National Aeronautics and Space Administration developed the TRL methodology in 1974 (Banke, 2011). The DoD adopted the TRL framework, and it is now partially codified in regulations applying to all MDAPs. Title 10 United States Code (U.S.C.) Section 2366b requires the Milestone Decision Authority to verify that “the technology in the [major defense acquisition] program has been demonstrated in a relevant environment” before receiving Milestone B approval. A technology readiness assessment consists of classifying each critical technology element into one of nine technology readiness categories according to its technological maturity. The DoD uses the following scale (DoD, 2009):
- TRL 1: Basic principles observed and reported
- TRL 2: Technology concept and/or application formulated
- TRL 3: Analytical and experimental proof of concept
- TRL 4: Component validation in a laboratory
- TRL 5: Component validation in a relevant environment
- TRL 6: Subsystem model or prototype demonstration in a relevant environment
- TRL 7: System prototype demonstration in an operational environment
- TRL 8: Actual system completed and qualified through test and demonstration
- TRL 9: Actual system proven through successful mission operations
As is clear from Title 10’s language, system components must achieve TRL 6 before reaching Milestone B. Although not codified in the U.S.C., TRL 7 or higher is the expected state of technology maturity at Milestone C. In addition, some programs use technology readiness assessments as an integral part of their risk assessment and risk reduction strategy (DoD, 2009). Since production begins after the Milestone C decision, a strong argument can be made that testing and integration should be complete—rendering TRL 8 the more appropriate standard.
The technological maturity of a system’s critical components has long been recognized as a key determinant of weapon systems outcomes (General Accounting Office, 1999). If significant technological advances are required during design and manufacturing development, the program will be very susceptible to extended cycle times, higher unit costs, management changes, and funding volatility. In addition, these outcomes will provoke deleterious second-order effects such as smaller buys and technological obsolescence of more mature system components. Technological obsolescence can then lead to requirements creep. It is easy to see how a vicious cycle could take hold and even lead to a program’s cancellation. Expanding technology readiness metrics by including a simple indicator of system readiness will allow program leadership to effectively control risk at both the individual technology element level and at the system level.
As mentioned previously, the U.S.C. requires an overall minimum TRL of 6 for critical technology elements. The Government Accountability Office (GAO) has recommended that technologies included in a product’s design reach TRL 7 before being turned over to the product development manager (GAO, 2009). In addition to reinforcing the GAO’s position, empirical analyses of Selected Acquisition Reports (SAR) and a system’s TRLs support including an additional metric that takes a more comprehensive view of technological maturity.
The quantities of interest for this analysis are the percentage change from the earliest available estimated Program Acquisition Unit Cost (PAUC) to the current estimated PAUC and the actual schedule slippage, in months, as of the last published SAR. TRLs were taken from Defense Acquisition Executive Risk Summaries when these numbers became available starting in March 2007. All systems included in the sample had reached Milestone B by the time TRLs were collected. To exploit recently available technology readiness data, programs were selected for inclusion in the sample if Milestone B approval occurred after January 1, 2000. Programs reaching Milestone B before 2000 were very likely to have TRLs of 8 or better for every critical technology element by the time TRL data became available in 2007, so the additional data would not be particularly informative.
While a minimum TRL is prescribed by law at Milestone B, the raw number of critical technology elements with TRLs below 8 has not been identified as a major risk factor. For the purposes of this analysis, when a critical technology element’s TRL reaches 8, technology risk has been effectively eliminated. To keep the focus on shortfalls in technology and because of mathematical necessity, the candidate explanatory variables were stated with the difference between a critical technology element’s TRL and a TRL of 8 as the basic building block. In mathematical notation, this shortfall can be found quite simply by computing:
$$\text{TRL shortfall}_i = 8 - \mathrm{TRL}_i$$
For example, the minimum TRL of 6 that is codified in the U.S.C. is equivalent to requiring that the largest shortfall among the critical technology elements not exceed 2:
$$\max_i \,(8 - \mathrm{TRL}_i) \le 2$$
The same principle can be applied for all critical technology elements with TRLs less than or equal to 8. Armed with the insight that both the number and magnitude of technology shortfalls are important, the candidate explanatory variables containing TRL information that were considered are listed in Table 1.
Table 1. Candidate Explanatory Variables
The sum of the squared TRL shortfalls, or SS, merits additional explanation. Clearly, advancing from a TRL of 5 to a TRL of 6 may not require the same amount of effort as advancing from a TRL of 6 to TRL 7. That is, there is no reason to believe TRLs were designed as a linear scale. Squaring the TRL shortfall allows more serious shortfalls to be weighted much more heavily than minor shortfalls. The effect of squaring is pronounced—a shortfall of two units in the TRL scale is weighted four times more heavily than a shortfall of one. A shortfall of three is considered to be a major weakness and is given a weight nine times higher than a shortfall of one to reflect its relative seriousness.
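The shortfall-based quantities discussed above can be illustrated with a minimal Python sketch. It is not the author's code: the function names, the dictionary keys, and the example TRL values are hypothetical, and flooring the shortfall at zero for a TRL of 9 is an assumption made here (the article applies the shortfall only to TRLs of 8 or below).

```python
from typing import Dict, List

# The article treats TRL 8 as the point at which technology risk is
# effectively eliminated, so shortfalls are measured against 8.
TARGET_TRL = 8


def trl_shortfalls(trls: List[int]) -> List[int]:
    """Return 8 - TRL for each critical technology element.

    Assumption: a TRL of 9 is given a shortfall of 0 rather than -1.
    """
    return [max(TARGET_TRL - trl, 0) for trl in trls]


def shortfall_metrics(trls: List[int]) -> Dict[str, int]:
    """Summarize a system's shortfalls by number and magnitude."""
    shortfalls = trl_shortfalls(trls)
    return {
        "count_below_8": sum(1 for s in shortfalls if s > 0),
        "max_shortfall": max(shortfalls, default=0),
        "sum_shortfalls": sum(shortfalls),
        # SS: squaring weights a shortfall of 2 four times, and a
        # shortfall of 3 nine times, as heavily as a shortfall of 1.
        "sum_squared_shortfalls": sum(s * s for s in shortfalls),
    }


# Hypothetical system with five critical technology elements.
example = shortfall_metrics([6, 7, 8, 5, 6])
print(example)
```

For this made-up system the individual shortfalls are 2, 1, 0, 3, and 2, so the element at TRL 5 contributes 9 of the 18 points in the sum of squares even though it contributes only 3 of the 8 points in the simple sum.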
These candidate variables and combinations of them were regressed against the number of months the system is or is projected to be behind Acquisition Program Baseline (APB) schedule and the ratio of current estimated PAUC to the earliest available comparable estimate of PAUC. To preserve the simplicity of the model and to keep the number of decision rules to a minimum, a variable or combination of variables was evaluated according to its ability to explain both schedule slippages and the percentage change in estimated PAUC. In addition to evaluating the effects of various technology maturity metrics, several other explanatory variables were incorporated in models to assess their relative value in explaining acquisition outcomes. These covariates, and whether each variable helped explain acquisition outcomes, are listed in Table 2.
Table 2. Other Variables Considered and Their Explanatory Values
|Independent Variable||Explanatory Value|
|Months Since Milestone B||Yes|
|Type of Commodity||No|
|Cost Variance Causes||Yes|
The rationale for the consideration of most of these covariates is straightforward. The lead Service, the type of commodity being acquired, and the prime contractor were included to determine whether acquisition outcomes differed systematically because of the personnel involved in running the program or the fundamental nature of the acquisition itself. None of these were found to be useful in explaining acquisition outcomes when any TRL variable was also included in the model. A program’s size, measured by its estimated total acquisition cost, was considered because expenditure could be a good proxy for complexity, so larger programs may be intrinsically more difficult to gauge. No statistical evidence was found to support this conjecture. Finally, sources of cost variance identified in the SAR were introduced in various models to determine whether one particular type of error was particularly influential in explaining acquisition outcomes. Even though information on these variables is obviously collected after the fact, they can still shed light on the consequences of misjudging a program early in its development. One of these indicator variables was found to be useful in explaining each acquisition outcome. Obviously, the presence of these variables compromises the predictive value of the statistical models presented in the next section, and for this reason they were not included once a decision rule was formulated. The number of months that have passed since a program satisfied Milestone B has been included to determine whether overruns intensify or dissipate as more of the program is executed.
In explaining schedule slips, the model that explained the most variance while maintaining parsimony included the sum of the TRL shortfalls, the number of months since the program passed Milestone B, and an indicator variable that denotes that estimation error contributed to the cost variance that has occurred to date. The statistical results for this model are displayed in Table 3.
Table 3. Schedule Slippage Regression Results
|Variable||Estimate (Standard Error)|
|Months Since MS B||0.125*|
Estimated Schedule Slippage Model
Slippage = a(Sum) + b(Months Since MS B) + c(Estimating) + d
The model explains more than half of the variation in schedule slips. The model’s explanatory value strongly suggests that any decision rule regarding minimum technology readiness should incorporate the sum of technology readiness shortfalls. The results also reveal that schedule slippage worsens as the time since Milestone B increases, and that cost estimating errors that understate program expense contribute to schedule overruns. This probably occurs because unpleasant cost surprises lead to the stretching of program timelines to meet each calendar year’s budget targets. To evaluate the proposed technology readiness rule’s value, the model’s efficacy in explaining cost ratio was also investigated. As the results in Table 4 illustrate, a similar model also works well in explaining cost overruns.
Table 4. Cost Overrun Regression Results
|Variable||Estimate (Standard Error)|
|Months Since MS B||(0.01)|
Estimated Cost Overrun Model
Overrun = a(Sum) + b(Months Since MS B) + c(Sum × Months Since MS B) + d(Schedule) + f
In addition to explaining more than half of the variation in schedule slippage, the model also explains over half of the variation in cost overruns. Once again, the sum of TRL shortfalls is highly influential in explaining the acquisition outcome. Cost overruns tend to increase in severity as time passes, but strangely, the interaction between the sum of TRL shortfalls and the number of months since Milestone B has a negative sign. This means that the effects of the sum of TRL shortfalls diminish somewhat as time passes. However, evaluating the results from both regressions, clearly the sum of TRL shortfalls is the most useful of the variables considered in explaining cost and schedule overruns.
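The role of the negative interaction term can be made explicit with a short derivation. This is a sketch that assumes only the functional form of the estimated cost overrun model stated above, with "Months" abbreviating Months Since MS B; no coefficient values are assumed.

```latex
\begin{align*}
\text{Overrun} &= a\,\text{Sum} + b\,\text{Months}
                 + c\,(\text{Sum}\times\text{Months})
                 + d\,\text{Schedule} + f \\[4pt]
\frac{\partial\,\text{Overrun}}{\partial\,\text{Sum}} &= a + c\,\text{Months}
\end{align*}
```

Under this form, the effect of one additional unit of total TRL shortfall is not constant: it equals a at Milestone B and changes by c for each month that passes. A negative c therefore means the contribution of the shortfall sum shrinks over time, which is the diminishing effect described in the text.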
To illustrate the results in more concrete terms, a notional decision rule can be specified and applied to the systems in the sample. The most discriminating decision rule in this sample predicts that systems with a sum of TRL shortfalls above 10 or a maximum technology shortfall of 3 will cost significantly more than expected and will experience a longer cycle time than was previously expected. The currently codified standard, which rules out a maximum technology shortfall of 3 or higher, was included because it is useful and highly unlikely to be eliminated. The average results for this rule are summarized in Table 5. It is especially noteworthy that the mean overrun for both acquisition outcomes was found to be substantially higher in violating programs than in nonviolating programs. The differences were found to be statistically significant at the 1 percent level for schedule and at the 5 percent level for cost.
Table 5. Application of the Proposed Decision Rule
|Quantity of Interest||No Violation||Violation|
|Mean Months Behind Schedule**||7.7 mos||31.2 mos|
|Mean Percentage Cost Overrun*||3.2%||35.5%|
The impact of the rule is stark: violating programs are, on average, 23.5 months further behind schedule and show cost overruns roughly 32 percentage points higher. In fact, it is reasonable to conclude that systems that do not violate this decision rule perform, on average, almost as expected with respect to cost, and systems that violate the rule generally have problems that will lead to multiple Nunn-McCurdy and APB breaches.
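The notional decision rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than an official screening tool: the function name and parameter names are hypothetical, the thresholds are the ones quoted in the text (a shortfall sum above 10, or any single shortfall of 3 or more), and a TRL of 9 is assumed to carry a shortfall of 0.

```python
from typing import List


def violates_decision_rule(
    trls: List[int],
    sum_threshold: int = 10,        # rule fires when the sum is ABOVE this
    max_allowed_shortfall: int = 2,  # equivalent to a minimum TRL of 6
) -> bool:
    """Return True if a system's critical technology elements violate
    the notional rule: sum of (8 - TRL) above 10, or any shortfall >= 3.
    """
    # Assumption: TRL 9 is treated as having zero shortfall.
    shortfalls = [max(8 - trl, 0) for trl in trls]
    if not shortfalls:
        return False
    too_much_in_total = sum(shortfalls) > sum_threshold
    one_element_too_immature = max(shortfalls) > max_allowed_shortfall
    return too_much_in_total or one_element_too_immature


# Hypothetical systems, for illustration only.
print(violates_decision_rule([6, 6, 6, 6, 6]))     # sum = 10, max = 2
print(violates_decision_rule([6, 6, 6, 6, 6, 6]))  # sum = 12, max = 2
print(violates_decision_rule([5, 8, 8]))           # sum = 3,  max = 3
```

The second hypothetical system shows why the aggregate test adds information: every element individually meets the codified minimum of TRL 6, yet the system as a whole exceeds the total shortfall threshold.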
Possible Mechanisms of Cost and Schedule Growth
Although a causal link between large technology gaps and cost and schedule overruns has not been established statistically, identifying an intuitively appealing mechanism that could be causing these effects would help satisfy us that these are probably not spurious results. Dan Davis speculated that low TRLs might cause more scope growth during development, and showed that scope growth was highly statistically significant in explaining development cost growth (Davis, 2010). Furthermore, a quick analysis of a recently published GAO report reveals that increases in Research and Development (R&D) costs are strongly correlated with rising estimated procurement costs (GAO, 2011). Complexity could be driving cost growth throughout a program’s development and production phases. This mechanism may be equally valid for explaining systems that have multiple schedule overruns. Using a sample of 70 systems with current SARs, development cost increases led to statistically significant increases in the number of APB schedule breaches, the probability of multiple breaches, and the probability of an APB breach after Milestone C. Therefore, there is statistical evidence that a large technology gap could, through the mechanism of scope growth during the development phase, eventually contribute to systemic cost and schedule issues throughout a system’s acquisition cycle. A program that gets into trouble early stays in trouble, and one way to virtually ensure a troubled system is to tackle too large a technology gap. The figure shown here summarizes the mechanism graphically.
Figure. Hypothesized Mechanism of Systemic Cost and Schedule Growth
Now that the predictive power of this decision rule for recent systems has been established, what general guidelines for implementing it should apply? If the rule is violated, the decision maker has three options: (a) take measures to bring the program into compliance with the rule, (b) cancel or delay the program, or (c) assume the risks associated with large technology gaps without mitigating them. The implications of the last two options are clear, so the emphasis in this article will be on the first option.
If the system technology readiness gap exceeds upper tolerance levels such as the one specified in the previous section, partially closing the gap by substituting more mature technologies could pay considerable dividends. Of course, for some systems such substitution would not be possible, but when feasible, it is highly recommended. Experienced systems engineers would work with science and technology professionals and cost estimators to make the most cost-effective trades. As the cost and schedule performance of the DoD acquisition system improves, these collaborative efforts could potentially be instrumental in demonstrating the worth of systems engineering. As the DoD learns more about cost and schedule performance at various technology readiness gaps, the technology readiness decision rule can be modified to reflect actual experience.
Analyses of Alternatives (AoA) could play an important role in identifying potentially useful technology substitutions. Through DoD Instruction 5000.02, defense leaders have mandated the performance of an AoA before Milestone A. An AoA is specifically intended to be the analytical foundation for arriving at the correct materiel solution when one is required. In addition, AoAs are to offer an assessment of the critical technology elements that make up each potential materiel solution. However, as the GAO has found, many AoAs do not provide a robust set of alternatives at the system level—much less at the critical technology element level. The DoD concurred with the GAO’s recommendations for improvement, so it is possible that AoAs are improving as this is being written (GAO, 2009). The DoD must prioritize the funding of AoAs, provide useful feedback to those who perform the analysis, and emphasize analysis at the critical technology element level. Perhaps most importantly, the DoD should require that an “80 percent solution” and a low-budget option are identified and analyzed at both the system and critical technology element level (Defense Science Board, 2009). The DoD must deter the use of an AoA merely to “support a predetermined solution” (GAO, 2007).
System Technology Readiness Gaps
Including an additional metric in technology readiness assessments could help facilitate the efforts of systems engineers and program managers in making technology trade-offs. Toward that end, John Mankins has proposed adding Research and Development Degree of Difficulty (R&D3) to technology readiness assessments (Mankins, 2002). Mankins created five R&D3 classifications that depend on how many parallel paths of discovery researchers believe are necessary to ensure a reasonable probability of successful discovery. It is likely that the number of parallel paths required will be positively correlated with total early development costs. Therefore, this approach could also facilitate and standardize the participation of cost estimators. By requiring researchers to submit R&D3 classifications for each critical technology element and then asking cost estimators to provide a ballpark estimate for development costs based on the size of the total technology gap and the R&D3 classification, systems engineers and program managers will be armed with the information they need to make reasonably informed technology trade-offs. Of course, this metric could be helpful in cost estimation without trade-offs (option (c) at the end of the previous section) and could be used to help understand the risks involved in developing systems with relatively large technology gaps. Although this recommendation almost certainly requires additional cost-estimating staff, it is likely these additional staff members will pay for themselves by providing useful information on how to save money. Because these estimates would be ballpark estimates, they would not require the level of analysis needed later in the cycle, and the number of new estimators would probably not be prohibitive. Where possible, we have refrained from making recommendations that require an up-front investment of increasingly scarce resources, but here it is highly advisable to take the long view.
Although evaluating TRLs in isolation would be a substantial improvement over existing practice, technology readiness assessments that consider the interfaces between components have greater long-term potential and should be pursued. Recently, researchers have made progress in defining technology maturity metrics that incorporate the interface between critical technology elements (Sauser, Ramirez-Marquez, Verma, & Gove, 2008). Two technology maturity metrics have been advanced in the systems engineering literature: the interface readiness level (IRL) and the system readiness level (SRL). As their names suggest, IRLs measure the readiness of technology that enables interoperability of two critical technology elements, and the SRL is an aggregate number that incorporates both TRLs and IRLs. Theoretically, SRLs should be the ideal end state, allowing systems engineers to evaluate the contribution of each critical technology element to the functioning of the entire system at the press of a button. However, the SRL concept is in its infancy, and much work needs to be done before transitioning to this technology readiness metric.
Multiple measurement issues arise when attempting to calculate an SRL (Kujawski, 2010). In the end, objections to the calculation of SRLs amount to concerns over mixing arbitrary subjective rating scales and then aggregating the results. While these objections are well-founded, there is no reason why IRLs cannot be considered in assessing system technological readiness with the eventual goal being more comprehensive metrics for technology readiness assessment.
IRLs could be used in several ways without mixing the two rating scales. First, minimum IRL guidelines could be set and considered when making technology trades. In addition, overall IRL technology gaps could be assessed and used to motivate technical substitution as we proposed with TRLs. Finally, IRLs should play an important role in system cost estimation. With all of these potentially worthwhile applications associated with IRLs, the DoD should devote the necessary resources to fully understanding the contribution of the interfaces to overall system technology readiness and applying the IRL concept to its complex systems.
Although calculation of credible overall SRLs may be quite a few years off, useful information on a system’s interfaces could be reported in a modest amount of time. While guidelines that incorporate IRLs are being developed, the DoD can use decision rules similar to those proposed in the preceding discussion. Eventually, the DoD should be able to report a single SRL and be able to make trades based on contributions to overall system technological readiness.
Implications for Science and Technology
The most obvious and potentially important implication of deferring the use of relatively immature technologies in system development relates to the allocation of funding between Science and Technology (S&T) and programs of record. If less R&D funding is to be spent after a system acquisition becomes a program of record, more funding should naturally be diverted to S&T. This is not a new idea—as far back as 2000, the Defense Science Board recommended that S&T budget requests be increased to almost 3 percent of the total DoD budget submission (Morrow, 2000). In 2003, Under Secretary of Defense for Acquisition, Technology, and Logistics Pete Aldridge set the same target for S&T funding; and in 2007, the Pentagon’s Director, Defense Research and Engineering John Young argued that the department’s S&T funding should be 3 percent of the DoD’s total budget (Davey, 2003; Chow, Silberglitt, & Hiromoto, 2009).
An increase in S&T’s share of DoD funding flies in the face of current trends. In fiscal year 2000, S&T accounted for approximately 3 percent of DoD’s budget (Morrow, 2000). In the proposed fiscal year 2013 budget, S&T has slipped to 2.26 percent of DoD’s allocation (DoD, 2011). In the extraordinarily tight budget climate expected for the foreseeable future, reversing this trend of decreasing S&T funding as a percentage of the DoD budget will be very challenging. To gradually raise S&T’s share of total funding, the DoD could implement a ‘reinvestment’ policy. That is, as cost performance improves from insisting on smaller technology maturity gaps in programs of record, the DoD could earmark a portion of the realized savings for use in basic and applied research. Since total defense S&T funding is less than 12 percent of DoD procurement funding, demonstrating the plausibility of such a reinvestment strategy is trivial (Office of the Under Secretary, 2012). If the DoD has enough discipline to accomplish this ramp-up in S&T funding, the Services’ technological edge will not be diminished to an unacceptable degree over the long haul despite obvious short-term sacrifices implied by some of these recommendations.
If the DoD increases the intensity of its S&T efforts, the Services must decide how to prioritize new R&D projects that will become possible with the newly available resources. RAND researchers working on behalf of the Army have made progress in this area, and the Air Force and Navy could profit from adopting some of their recommendations. In the most general terms, the algorithm they developed maximizes the number of technology objectives prospective projects satisfy while minimizing total life-cycle costs (Chow et al., 2009). While the algorithm considers neither cycle time nor discovery risk at this time, the general approach holds promise and has been under development and revision since 2002.
A new way of assessing a system’s TRLs has been demonstrated to be quite useful in explaining cost and schedule overruns. In particular, this research shows that larger total technology gaps, as measured by the sum of the TRL shortfalls, are associated with drastically increased costs throughout the system’s useful life and with more frequent and sizable schedule slips. Since the size of the total technology gap is a key driver, reducing the technology gap by making technology substitutions where possible can mitigate these risks. First and foremost, the DoD should enforce present guidelines and stop allowing critical technology elements with TRLs below 6 to be included in a program after Milestone B. Further, using a simple rule of thumb that aggregates a system’s critical technology element readiness gaps could significantly improve acquisition outcomes, but DoD should eventually develop more comprehensive technology readiness metrics that have the potential to reduce technology risk even further. To implement these recommendations, the DoD must put the right professionals in place: skilled systems engineers, cost estimators, and researchers are vital to this roadmap’s success. Finally, the DoD should increase S&T funding as savings are realized from informed technology substitutions.
Dr. Chad Dacus is a Defense Analyst-Economist at the Air Force Research Institute (AFRI). Before joining the AFRI staff, Dr. Dacus worked for the Center for Naval Analyses as a field representative to U.S. Fleet Forces Command. His current research interests include defense acquisition, economics and strategy, and cyberspace risk modeling. He holds a PhD in Economics from Rice University and an MS in Statistics from Texas A&M University.
References
Banke, J. (2011). Technology readiness levels demystified. Retrieved from http://www.nasa.gov/aeronautics/features/trl_demystified
Chow, B., Silberglitt, R., & Hiromoto, S. (2009). Toward affordable systems-portfolio analysis and management for army science and technology programs. Santa Monica, CA: RAND Corporation.
Davey, M. (2003). Federal research and development funding: FY 2004 (CRS Issue Brief IB10117). Washington, DC: The Library of Congress.
Davis, D. (2010). Incorporating system complexity to improve development cost estimates. Alexandria, VA: CNA Corporation.
Defense Science Board. (2009). Creating a DoD strategic acquisition platform: The report of the defense science board. Washington, DC: Department of Defense.
Department of Defense. (2009). Technology readiness assessment (TRA) deskbook. Washington, DC: Office of the Director, Defense Research and Engineering (DDR&E).
Department of Defense. (2011). Summary of the DoD fiscal 2012 budget proposal. Retrieved from http://www.defense.gov/pdf/SUMMARY_OF_THE_DoD_FISCAL_2012_BUDGET_PROPOSAL_(3).pdf
General Accounting Office. (1999). Best practices: Better management of technology development can improve weapon systems outcomes (Report No. GAO/NSIAD-99-162). Washington, DC: Author.
Government Accountability Office. (2007). DoD force modernization: Military personnel: The Navy has not provided adequate justification for its decision to invest in MCTFS (Report No. GAO-07-1139R). Washington, DC: Author.
Government Accountability Office. (2009). Defense acquisitions: Many analyses of alternatives have not provided a robust assessment of weapon systems options (Report No. GAO-09-665). Washington, DC: Author.
Government Accountability Office. (2011). Defense acquisitions: Assessments of selected weapon programs (Report No. GAO-11-233SP). Washington, DC: Author.
Kujawski, E. (2010). The trouble with the system readiness level (SRL) index for managing the acquisition of defense systems. Presentation delivered at National Defense Industrial Association, 13th Annual Systems Engineering Conference. San Diego, CA.
Mankins, J. (2002). Approaches to strategic research and technology (R&T). Acta Astronautica, 51(1–9), 3–21.
Morrow, W. (2000). Defense Science Board letter report on DoD science and technology program. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics.
Office of the Under Secretary of Defense (Comptroller)/Chief Financial Officer. (2012). United States Department of Defense fiscal year 2013 budget request overview. Washington, DC: Department of Defense.
Sauser, B., Ramirez-Marquez, J., Verma, D., & Gove, R. (2008). From TRL to SRL: The concept of systems readiness levels. Paper (No. 126) presented at the Conference on Systems Engineering Research, Los Angeles, CA.