The Hidden Biases in Production Planning: Why Our Estimates Are Flawed from the Start

Overview

A good project plan is no guarantee of success, but a bad project plan is always a guarantee of failure. Effective project planning relies on a combination of activity-based (work breakdown structures, critical path analyses, etc.) and object-based (flow, critical chain, advanced work packaging, etc.) approaches. However, a critical yet often-overlooked challenge in either approach is inherent estimation error, which leads to flawed models, unrealistic schedules, and erratic workflows. Systematically, we underestimate both task durations and the variability of those durations, resulting in overly optimistic projections that fail to capture the complexities of the real world.

Three key factors drive this error. First, inherent bias, most notably the planning fallacy: the cognitive tendency to be overly optimistic when estimating task durations and their variability. Bias has been extensively studied and is now embedded within most project models and estimation processes. Second, variability, which exists independently of bias, represents the natural uncertainty in task durations that cannot be precisely predicted. Third, noise, whether in the data or created by unpredictable or unknown factors beyond the estimator's control, further distorts our ability to forecast accurately. We analyze the decomposition of estimation error into the sum of bias, variance, and noise in pipeline projects to support our argument that variance and noise matter more than bias in project planning, yet are often overlooked.

Using evidence from ammonia plant construction, we demonstrate how bottom-up Monte Carlo simulations often fail to account for long-tail risks, resulting in a significant underestimation of actual outcomes. These errors are not solely the result of bias but are compounded by variability and noise, which introduce additional layers of uncertainty. By recognizing and addressing these three factors, we can refine estimation models to more accurately reflect the full range of possible outcomes. This approach enables the development of more resilient and reliable project plans, ultimately improving project execution and reducing the disruptions caused by flawed estimates. Finally, we outline ten approaches to reduce variability and noise during planning and execution of projects.

Keywords: Planning fallacy; Optimism bias; Variability in project execution; Noise; Lean Construction

Authors

Edward Zaayman

Accenture

Alexander Budzier

Oxford University

Paper

The Hidden Biases in Production Planning: Why Our Estimates Are Flawed from the Start

Introduction

Effective planning is critical for the success of large-scale projects [6]. The dominant view of project management, however, is that of transforming inputs into outputs through the execution of processes [28]. The most commonly used planning tools, such as the work-breakdown structure and the critical path method, are core to the planning process, as they break down processes into sub-processes and ultimately into tasks with a view to minimizing the cost of each task to achieve the lowest cost possible for the entire project. However, the results are lacking: cost overruns and schedule delays are the norm, not the exception [14]. Naturally, alternative approaches based on production planning methodologies have been championed to deliver construction projects on budget, on time, safely, and to the full scope at quality [4]. These methods, including critical chain, last planner, flow lines, and advanced work packaging, focus on how project work is actually done by concentrating on both tangible and intangible objects or deliverables being produced. Planning is about optimizing the flow of materials, not minimizing activity-based processes [4, 28, 40].

Today, most projects use a mix of both planning approaches, combining activity-based conventional project planning methods with object-based production planning methods. However, both are prone to underestimation of cost and schedule, which in project execution results in cost overruns and schedule delays [11, 12]. If a planner optimistically estimates the time it takes to complete a task in the work-breakdown structure, delays and cost overruns are inevitable. Equally, if a planner overestimates productivity, the flow is slower than planned and inventory builds up, leading to longer lead times and higher costs than anticipated.

The unrealistic plans stem from model limitations [4], estimation errors [38], and biases in human judgment [15]. Strong arguments have been made that bias, rather than error, is the primary root cause [1, 14]. However, the Nobel prize laureate Daniel Kahneman and his co-authors write, “When we began our research, we were focusing on the relative weights of bias and noise in total error. We soon concluded that noise is often a larger component of error than bias is” [26:211].

Noise is underappreciated in the planning debate. This paper addresses the gap. We examine the underlying factors that contribute to these errors, specifically cognitive bias, inherent variability, and data noise. We illustrate their impact through a case study, the construction of a large-scale ammonia plant. By examining these elements, we aim to refine estimation models to better reflect real-world complexities, thereby improving project execution and reducing disruptions caused by flawed estimates.

Statisticians decompose error as: Prediction error = bias² + variance + noise [5, 18, 39]. In the following sections, we examine each component in turn before integrating the three strands of thinking in our case study.

Bias in Projects

Bias refers to the systematic deviation of a forecast from what a statistical, data-driven model would predict; it is caused by the cognitive limitations of the estimator when making forecasts intuitively and/or from incomplete data, such as memory [36]. The concept of bias stems from the research into human judgment under uncertainty [37] and was first applied to the planning problem of engineering projects in 1979 [27]. This work was seminal for the exploration and explanation of overconfidence, overoptimism, and other human biases that lead to decision errors [31]. An early application was the underestimation of task completion times, now known as the planning fallacy [2].

The planning fallacy refers to the tendency of project planners to underestimate the time and obstacles involved in a project [3]. The classic inside-outside model posits that adopting an inside view in planning leads to prediction optimism, whereas an outside view reduces this optimism [27]. Much of the psychology research has focused on point estimates without much appreciation of the probabilistic nature of most forecasts. Hubbard [23] has shown that range estimates are too narrow, and estimators are too confident in their estimates. This, however, can be improved through better training of the estimators. 

In practice, however, when we work with teams on project sites, they still use estimates in the planning system that are prone to optimism bias and the planning fallacy. We know that production control measures, such as the percentage of promises completed, rarely come close to 100%; typical performance is 66-76% [29]. Why bias occurs is not the topic of this paper, as it has been extensively explored in the planning literature; for a discussion, see, for example, [15]. We recognize the planning fallacy and the optimistic forecasts it leads to.

When decomposing the total forecasting error, bias is statistically measured as the mean or median of the differences between a set of predictions and reality [5, 18, 39]. For example, consider the data on pipelines [17], which indicate that pipeline projects (n = 437) have a mean schedule overrun of 17% and a median schedule overrun of 0%. Typically, the mean is preferred as the measure of bias unless outliers are present in the data. The gap between the mean and the median suggests the presence of long tails in these projects. This demonstrates a typical scenario in which the median (or, in the language of planners, the P50) aligns well with historical data while the mean does not.
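The mean-versus-median pattern is easy to reproduce in a few lines of simulation. The sketch below uses synthetic, right-skewed overruns (assumed numbers, not the 437-project pipeline dataset) to show how the median can sit near 0% while the long tail drags the mean well above it.

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic schedule overruns (assumed numbers, not the 437-project dataset):
# roughly half the projects finish near on-time, the rest carry long-tailed delays.
overruns = np.where(
    rng.random(437) < 0.55,
    rng.normal(0.0, 0.05, 437),                     # near on-time projects
    rng.lognormal(mean=-1.5, sigma=1.2, size=437),  # long-tailed delays
)

print(f"median overrun: {np.median(overruns):+.0%}")  # sits close to 0%
print(f"mean overrun:   {np.mean(overruns):+.0%}")    # pulled up by the long tail
```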

The same plays out at the more granular level. When we examine the estimates for individual tasks on the schedule, we find that while the P50 durations are reasonable, the long-tail risk for individual tasks has not been adequately accounted for, and frequently, the distributions are symmetrical. 

For example, the welding output on site is often measured in Welding Diameter Inches (WDI). Hence, productivity is measured as WDI per welder per day. Typical benchmarks for planners suggest a productivity of 7 WDI/welder/day, with a range of 5-9. In reality, we see far more variation with a rate ranging from 0-10 WDI/welder/day. 

If this is entered into a Monte Carlo model of the total project duration, the model will produce an unbiased estimate (if indeed the median is 7 WDI/welder/day). However, it will underestimate the tail risk and thus understate the mean. To make matters worse, many planning tools use triangular or BetaPERT distributions for activity planning. These are specified by a minimum, most likely, and maximum value; the choice of distribution means that no value below the minimum or above the maximum can ever be drawn. A lognormal distribution might be better, as its draws are not confined to a pre-specified range. Even that may not capture the worst cases: a bad welder might do worse than 0 WDI/day because they create rework, so net productivity could even be negative, a value no bounded triangular distribution would ever produce.
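As a rough illustration (all numbers are hypothetical: a 10,000 WDI scope, a 20-welder crew, and invented distributions), the sketch below compares a Monte Carlo run using the benchmark triangular assumption with one in which productivity has a similar median but a heavy left tail down to zero. The truncated model reports a comparable P50 but a much thinner tail.

```python
import numpy as np

rng = np.random.default_rng(42)

SCOPE_WDI = 10_000   # hypothetical total welding scope
WELDERS = 20         # hypothetical crew size
N_SIMS = 2_000

def simulate_durations(draw_daily_rate):
    """Monte Carlo of project duration: count the days needed to finish the scope,
    given a sampler of one day's productivity per welder (WDI/welder/day)."""
    durations = np.empty(N_SIMS)
    for i in range(N_SIMS):
        done, days = 0.0, 0
        while done < SCOPE_WDI:
            done += draw_daily_rate(WELDERS).sum()
            days += 1
        durations[i] = days
    return durations

# Benchmark view: symmetric triangular 5-7-9 WDI/welder/day
tri = simulate_durations(lambda n: rng.triangular(5, 7, 9, size=n))

# Observed view: similar median (~7) but a heavy left tail down to zero,
# approximated here by clipping a shifted lognormal
obs = simulate_durations(
    lambda n: np.clip(10 - rng.lognormal(mean=1.0, sigma=0.6, size=n), 0, 10)
)

for name, d in [("triangular", tri), ("heavy tail", obs)]:
    print(f"{name:>10}: P50={np.percentile(d, 50):.0f}d  "
          f"P90={np.percentile(d, 90):.0f}d  mean={d.mean():.0f}d")
```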

Variability in Projects

The impact of variability on project progress is well-documented [19, 33]. Modig and Åhlström outline the efficiency paradox and the trade-off between the point productivity and the flow productivity of a resource [33]. Take, for example, a crane: its high cost means that a production planner might want to optimize its utilization, so that it is lifting something all the time. Only if the crane is the rate-limiting factor on site will optimizing its utilization keep things flowing. If the crane is not the rate-limiting factor, maximizing its utilization will not speed up delivery of the object under construction; instead, establishing buffer capacity ahead of the crane to increase its utilization lengthens task durations and, in effect, decreases throughput through the system.

This effect increases with variability. In the statistical sense, variability is measured as variance: the mean squared deviation from the mean prediction. In a world with little variability, the trade-off is not required. However, most construction sites are highly variable, and given that resources are limited, this variability keeps adding time because it slows the flow down. One indicator of a project's complexity is how frequently the critical path changes, i.e., how often something else becomes the rate-limiting factor.
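To make the interplay between utilization and variability concrete, the sketch below simulates a single busy resource (think of the crane) at an assumed 85% utilization and varies only the coefficient of variation of task durations. All parameters are illustrative assumptions; the point is simply that, at the same utilization, higher variability inflates lead times.

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_lead_time(service_cv, n_jobs=50_000, utilization=0.85):
    """Single-server queue (think of one crane serving lifts): exponential arrivals,
    service times with mean 1 and the given coefficient of variation (gamma-distributed)."""
    mean_service = 1.0
    arrivals = rng.exponential(mean_service / utilization, n_jobs).cumsum()
    shape = 1.0 / service_cv**2                  # gamma distribution: cv = 1/sqrt(shape)
    service = rng.gamma(shape, mean_service / shape, n_jobs)
    finish = np.empty(n_jobs)
    prev_finish = 0.0
    for i in range(n_jobs):
        start = max(arrivals[i], prev_finish)    # wait until the resource is free
        prev_finish = finish[i] = start + service[i]
    return (finish - arrivals).mean()            # lead time = waiting + service

for cv in (0.1, 0.5, 1.0, 1.5):
    print(f"duration CV={cv:>3}: mean lead time = {mean_lead_time(cv):.1f}x the work content")
```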

Resourcing strategies aside, the authors' observation is that this variability is not accounted for in practice, particularly when resourcing strategies are based on median or mean values. 

Using averages rather than ranges that account for variability introduces further optimism into the modelled schedule, even if the schedule leverages benchmark information and is modelled to an appropriate level of detail. Counterintuitively, the reliance on averages often worsens as the level of detail increases, because the data requirements grow with every additional breakdown.

Returning to our pipeline data: the mean schedule overrun was 17%, with a standard deviation of 39.9%. Under a normal approximation, roughly 68% of the observations would fall within the range of 17% ± 40% (see Figure 1). If we instead take the median of 0%, the corresponding measure of variability is the interquartile range of 24.4%, meaning that half of the observations lie within a band of about 24 percentage points around the median (see Figure 2). The figures show that the data are influenced by outliers, namely large delays on three of the pipeline projects (the dots in Figure 2 denote the outliers). They also demonstrate that the mean is not a good measure to characterize what delays should be expected (median versus mean), that these projects experienced delays in the past, and that those delays show considerable variability. Next, we explore the concept of noise as the second component of variability (variability = variance + noise), before decomposing these data further.

Figure 1: Variability measured by standard deviation in the pipeline data (n=437)
Figure 2: Variability measured by median and interquartile range in the pipeline data (n=437)

Noise in Projects

The third element of the prediction error is noise. Noise is the least well-understood element and can often be larger than the more commonly problematized issue of bias [26, 30]. Statisticians often treat noise as random error: if you build a regression model for a parametric estimate, e.g., the cost of constructing a building = square meters × cost per square meter, the model includes a random error term. Statisticians call it random because it is unrelated to the number of square meters or to the cost per square meter. Researchers and practitioners of lean construction have long understood the importance of variability in projects; they refer to noise as external variability (as opposed to self-induced variability, which is closer to statistical variance and could be modelled, forecasted, and planned for by including more parameters in models). However, and this is the key argument in Kahneman et al. [26], noise should not be seen as purely random. Noise has patterns. Noise can be studied.

Different levels of noise arise from different individuals making plans. One planner might be more optimistic than another, another might have better information to inform a forecast, and they might use different methodologies. We also know that humans overestimate their own internal consistency, which is another source of noise: an estimator can give you two entirely different estimates within the space of an hour.

When planning, the external events that spring to mind are often a source of noise: rare, high-impact risks from the world of wild randomness [32, 35]. Indeed, our analysis has shown that noise due to extreme values can be so severe that it renders other measures of bias and variability (mean, standard deviation, kurtosis, skew) meaningless [16]. This is due to the statistical distribution of extreme tail risks, which typically follow a Pareto distribution. In a project portfolio, these are the individual projects that overrun dramatically. In more granular analyses, these risks include earthquakes, wildfires, cybersecurity incidents, terrorism, and flooding, among others. Why? The tail events dominate, and if the tail is sufficiently thick, the moments of the distribution are undefined, i.e., the mean and standard deviation (or bias and variability) do not make sense. You can measure them, of course, but the measurements will be very noisy and converge only slowly with huge sample sizes, if they converge at all.
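The non-convergence is easy to see numerically. The sketch below draws overrun multipliers from Pareto distributions with different tail indices (illustrative values, not fitted to any dataset) and tracks the running sample mean: for a thick enough tail, the mean never settles.

```python
import numpy as np

rng = np.random.default_rng(1)

# Overrun "multipliers" drawn from a Pareto distribution with tail index alpha.
# For alpha <= 2 the variance is undefined; for alpha <= 1 even the mean is undefined,
# so sample statistics never settle down.
for alpha in (3.0, 1.5, 0.9):
    draws = 1 + rng.pareto(alpha, size=1_000_000)     # support [1, inf)
    running_mean = np.cumsum(draws) / np.arange(1, draws.size + 1)
    print(f"alpha={alpha}: mean after 1e3 draws={running_mean[999]:.2f}, "
          f"1e5={running_mean[99_999]:.2f}, 1e6={running_mean[-1]:.2f}")
```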

The Bertha Tunnel Project in Seattle exemplifies the pitfalls of megaproject planning. The world’s largest tunnel-boring machine, Bertha, broke down after only four months and a thousand feet of digging because of damage caused by a steel pipe. The event delayed the completion of the two-billion-dollar underground highway project by four years [7]. This incident highlights the impact of extreme events, or tail risks, that are often overlooked in estimates.

Project teams typically limit estimates to the information they have or to reference data with the “outliers” excluded. This “clean” data is used to develop benchmarks that then become the basis for more sophisticated models, in essence suppressing the noise and building bias into the data.

In practice, many more projects will be subject to noise. The time it will take to develop a new technology or commission a first-of-a-kind plant is subject to significant uncertainty, with far too many factors to rely on simple bottom-up estimates.

Decomposing Parametric Forecasting of Pipeline Cost

What does this mean in practice for production planning? Our dataset contains 30 pipelines. We hypothesize that the unit cost used in estimation and the length of the pipeline are associated with the actual cost:

Actual cost = intercept + unit cost used in estimation * length of the pipeline.

The small dataset available to us comprises thirty freshwater and sewerage pipelines. The data were collected from major capital projects of water suppliers in Hong Kong, the UK, the US, Australia, and European countries, and were provided by the owner/operators of the pipelines. The unit cost used in estimation is measured in GBP millions per km in 2022 prices, PPP adjusted; the mean unit cost is GBP 2.6m per km (median 2.9m, min 0.24, max 7.8). The length of the pipeline is the length of the mainline measured in km; the mean length is 25.7 km (median 6.1, min 1.1, max 228.5). The actual (outturn) cost is measured in GBP millions in 2022 prices, PPP adjusted; the mean actual cost is GBP 99.8m (median 17.9m, min 2.3m, max 1.4bn).

Unsurprisingly, the model (see Table 1) is highly statistically significant (p < 0.001, adjusted R² = 0.9965) even though the sample is small. This is because most projects simply use parametric estimates of this kind for their estimation. The model is not fully predictive, however: the difference between the estimated cost and the actual cost is the cost overrun typically encountered in projects (see Figure 3 and the 95% confidence interval of the predictions shown there).

Table 1: Model Results for a Simple Parametric Estimate of the Actual Cost of Pipelines
Figure 3: Marginal Means of the Predicted Actual Cost (GBP, millions, 2022 prices, PPP adjusted)

Using the total prediction error decomposition, we can break the total expected error down as

Total Expected Error = Bias² + Variance + Noise

Or

Let y be the observed actual cost on a log scale, let f(x) be the underlying actual cost as a function of the predictors, and let f̂(x) be the predictor from the model above. Then:

Total Expected Error = E[(y − f̂(x))²] = (f(x) − E[f̂(x)])² + E[(f̂(x) − E[f̂(x)])²] + E[(y − f(x))²] = Bias² + Variance + Noise

We bootstrapped 100 samples of the same size as the available data for the estimate of bias and variance. 
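To show the mechanics of this decomposition, the sketch below runs a stylized version on synthetic data, where the true relationship and the noise level are known by construction (the paper instead bootstraps its 30 pipeline observations; the resampling logic is the same). None of the numbers below are from the pipeline dataset.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Stylized bias^2 + variance + noise decomposition on synthetic data.
def f(x):                                   # "true" log actual cost vs. log estimate
    return 0.3 + 1.05 * x

def sample_dataset(n=30):
    x = rng.normal(3.0, 1.0, n)             # log parametric estimate
    y = f(x) + rng.normal(0.0, 0.5, n)      # observed log actual cost, noise sd = 0.5
    return x, y

x_eval = np.linspace(1.0, 5.0, 50)
B = 100                                     # number of resampled model fits
preds = np.empty((B, x_eval.size))
for b in range(B):
    x, y = sample_dataset()
    beta = np.polyfit(x, y, deg=1)          # the simple parametric (linear) model
    preds[b] = np.polyval(beta, x_eval)

f_bar = preds.mean(axis=0)                  # average prediction across fits
bias2 = ((f_bar - f(x_eval)) ** 2).mean()   # systematic gap to the true function
variance = preds.var(axis=0).mean()         # spread between re-fitted models
noise = 0.5 ** 2                            # irreducible error, known by construction
print(f"bias^2={bias2:.4f}  variance={variance:.4f}  noise={noise:.4f}")
print(f"total expected error ≈ {bias2 + variance + noise:.4f}")
```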

The results of the decomposition gave us:

  • Total expected error = 152,170
  • Bias² = 79,542.11
  • Variance = 0.0098
  • Noise = 72,627.9

This indicates that, for this model of actual cost, the variance is very low compared to bias and noise (see also Figure 4). Low variance is typical for a simple linear model such as the one we used. Although simple, parametric models of this kind are widely used in forecasting practice, and we interpret the very high statistical significance and very high adjusted R² of the model as a consequence of that practice: because projects are estimated this way, actual costs track the parametric estimate closely.

Figure 4: Geometric visualization of the error decomposition, in the style of the Pythagorean theorem (Bias² + Noise ≈ Total Error; Variance is omitted because its contribution is too small to visualize meaningfully)

Noise and bias contribute almost equally to the total expected error; the variance is too small to visualize meaningfully. This means that if we examine the problem solely through the lens of forecasting bias (following Kahneman and Tversky [27]), we miss roughly half of the total expected prediction error.

Ammonia Plant Case Study

By differentiating between these inherent sources of uncertainty, we can explain the patterns observed in the data from the real-world design of a large-scale ammonia project.

Context

The project required the design and construction of a large-scale (greater than one million tonnes per annum) ammonia plant in a remote geography. The team developed a detailed, bottom-up estimate for the project duration and then applied benchmark ranges collected from multiple EPC contractors to the model. This was followed by a Monte Carlo simulation to develop the final estimate for the business case.

Aware of the typical biases in scheduling, the team conducted a final test. The outputs from the Monte Carlo simulation were compared to reference class data, as suggested by Kahneman and Tversky [27]. The result was a reasonable estimate for the P20 and P50 points of the schedule, but a complete underestimate of the long-tail risk and, therefore, of the average. While the project team had attempted to avoid the typical biases, these were inadvertently carried back into the estimate through both the model and the data used.

The project teams had failed to fully capture the long-tail estimates, and the durations drawn from benchmark sets were based on “typical” projects, with outliers excluded. Even when longer-tailed distributions were intended, the teams typically relied on simple triangular distributions, effectively truncating whatever tail was present.

When estimates were developed from bottom-up productivity rates and resourcing models, the assumptions relied on averages, leading to optimistic durations as the impact of variability was removed. Variability in productivity is particularly relevant in remote locations where skilled workers are scarce and equipment cannot be easily increased or maintained. To account for this, lower averages were used, again based on neutral benchmarks, but this did not sufficiently compensate for the impact of variability.

What the team implicitly assumed was that Jensen’s inequality [24] does not apply. In simple terms, Jensen’s inequality states that f(E(x)) ≠ E(f(x)) unless f is linear: f(E(x)) is smaller than E(f(x)) when f is convex and larger when f is concave. Suppose productivity is a nonlinear (typically concave) function of resource levels. This means that productivity does not increase linearly with more workers on site; it shows diminishing marginal improvements (a concave function).

The team, using the average resource level, estimated productivity at that average. By ignoring Jensen’s inequality, they will always get the wrong answer: they will overestimate productivity and thus underestimate time. Jensen’s inequality tells us that, for a concave function, productivity at average resources ≥ average productivity across resource levels. Plainly speaking, using average resources to estimate average productivity overstates how much work gets done. Any team that uses average productivity and average resource levels invariably creates an optimistic schedule and an underestimated duration, and will end up with a delay.
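A minimal numerical check of this effect, assuming a hypothetical concave productivity curve and a fluctuating daily crew size (all values invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concave productivity curve: diminishing returns to crew size.
def productivity(crew):                     # WDI/day as a function of crew size
    return 40 * np.sqrt(crew)

crew_per_day = rng.integers(5, 65, size=10_000)   # fluctuating daily crew on site

plan = productivity(crew_per_day.mean())          # f(E[x]): plan built on the average crew
reality = productivity(crew_per_day).mean()       # E[f(x)]: average of actual daily output

print(f"productivity at the average crew:  {plan:7.1f} WDI/day")
print(f"average of the daily productivity: {reality:7.1f} WDI/day")
# For a concave f, f(E[x]) >= E[f(x)]: planning on averages overstates output.
```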

To make matters worse, there was considerable noise in the information, and to build usable data sets, teams needed to “clean the high-risk geographies”; for emergent technologies, this noise is even more common. The project team tended to remove extreme events, or not allow for these one-off events in estimating durations, suppressing noise while increasing the bias in the estimates. As we showed earlier, this is a “robbing Peter to pay Paul” fallacy.

Beyond the themes already discussed, there were also concerns about building the Monte Carlo model. Critical parameters, such as the impact of the owner's team capabilities, shifts in regulation, or the availability of skills, could not be easily modeled, and the initial models disregarded them. These factors would not only increase the tail risk but also increase the probability of poor performance across other dimensions.

Interventions

The team underwent technical training, which was twofold. First, the team used calibration training to help de-bias the estimates and to include potential long-tail risks by extending the ranges of estimates. When uncalibrated estimators are asked to provide a 90% confidence range (P5-P95), they at best produce a 70% range (P15-P85); more commonly, they produce only a 50% range (P25-P75) [22]. Second, the team learned about the different distributions available for modelling, shifting from simple symmetrical triangular distributions (wrong) to asymmetric distributions that allow for long tails (better) and lognormal distributions (best).
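A calibration check of this kind can be as simple as scoring stated ranges against realized outcomes. The sketch below uses hypothetical task estimates (none of the numbers come from the project) to show how overconfident P5-P95 ranges reveal themselves as low realized coverage.

```python
import numpy as np

# Hypothetical task estimates: stated P5-P95 ranges versus realized durations (days).
p5     = np.array([ 8, 12,  5, 20, 15, 30, 10,  6, 25, 18])
p95    = np.array([12, 18,  9, 30, 22, 45, 16, 10, 38, 26])
actual = np.array([14, 17, 11, 33, 21, 60, 15, 12, 41, 24])

hits = (actual >= p5) & (actual <= p95)
print(f"claimed coverage: 90%, realized coverage: {hits.mean():.0%}")
# Realized coverage far below 90% signals overconfident, too-narrow ranges.
```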

When the work was subject to high degrees of variability, the teams used “outside” actual data sets for durations instead of relying on a purely internal view. The use of these reference class forecasts expanded the sample to include longer tail risks and extreme events, while also replacing theoretically assumed distributions with empirical, observed distributions.

Naturally, some modelling inputs lacked outside information, e.g., the impact of executing the project in a very remote location. In these cases, data from analogous projects were used to build a view of the bias and noise. Bias meant an average uplift: for example, the team analyzed data from this and similarly risky locations to determine that projects there take, on average, 20% longer. Some planners in the team had already used the heuristic that “every calendar year here has only 10 months”. The reference class forecasts were then modified using quantile mapping to adjust them for the project's location risk profile. This process not only resonated with the team’s intuition about averages but also introduced the full distributional information (bias² + noise) and thus allowed the team to model the tail risk at both the good and the bad end of the distribution.
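A minimal sketch of the quantile-mapping step, with invented distributions standing in for the global reference class and the riskier analogue locations (none of these numbers come from the project):

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented distributions of schedule-overrun ratios: a broad "global" reference class
# and a smaller set of analogue projects in similarly remote, high-risk locations.
reference = rng.lognormal(mean=0.00, sigma=0.30, size=5_000) - 1
analogues = rng.lognormal(mean=0.18, sigma=0.45, size=400) - 1

def quantile_map(values, source, target):
    """Map each value through its quantile in `source` to the same quantile in `target`."""
    q = np.searchsorted(np.sort(source), values) / len(source)
    return np.quantile(target, np.clip(q, 0.0, 1.0))

adjusted = quantile_map(reference, reference, analogues)

for p in (50, 80, 95):
    print(f"P{p}: reference={np.percentile(reference, p):+.0%}  "
          f"location-adjusted={np.percentile(adjusted, p):+.0%}")
```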

Proxy data and applied information economics were utilized to develop a more comprehensive decision-making model. Leveraging the work by Hubbard [22], the team developed a model that included “softer” metrics of uncertainty, such as the speed of decision-making or policy decisions.

Outcome

Firstly, the project's risk profile changed. The P50 estimates remained aligned with initial thinking, but the P80 scenario, i.e., the realistic downside case, showed durations roughly 40% longer. This could have a material impact on the investment case.

Secondly, the shift in approach fundamentally changed the focus on which risks were material. The new focus of the risk management approach led to a significant change in the project development plan, and consequently, there was significantly less cost at risk. 

Note: This approach also has implications for managing schedules. The above approach provides valuable insights for developing the investment case and high-level management of contracts and resources; however, the degree of uncertainty and noise in the plan would render daily management of the plan impossible. At a tactical level, there is a need for specificity. The implication for future research is the separation of the methodology used for setting the baseline used in the business case from the tactical management of the project and the contracts.

Conclusions

As reliance on predictive models in planning increases, it is imperative to ensure the accuracy and reliability of the data feeding these models. As our case study illustrates, the classic trade-off between bias, variance, and noise is complex. As our pipeline data show, simple parametric estimation models (i.e., regressions) produce relatively little variance; the key components of error are bias and noise. However, with the general hype surrounding AI, we see planners using more and more machine learning algorithms, specifically regression trees and neural networks. These have higher variance: two such models trained on the same dataset will not be identical, unlike conventional regression models.

Planners must recognize and address the impacts of cognitive bias, inherent variability, and data noise on their models and decisions to develop more resilient and reliable plans. How can this be done? Our analysis and case experience highlight six approaches to improve plans:

  1. Calibration of estimators: Calibration training improves the accuracy of range estimates, even where data are too sparse to meaningfully de-bias the plan, and supports planning with more realistic ranges for variability and noise.
  2. Data-driven methods for planning: Approaches like reference class forecasting enable the interrogation of historical data from similar projects (or any level of project decomposition) to estimate bias, variability, and noise, thereby improving the accuracy of planning.
  3. Triangulate forecasts: Use multiple forecasts and use multiple independent estimators to understand bias, variability and noise in forecasts and identify their sources.
  4. Shift the project methodology: Shifting from construction to assembly has been successfully executed, for example, in Laing O’Rourke’s Schwarzman Centre for the Humanities in Oxford and the Grange Hospital in Wales, as well as Sir Robert McAlpine’s The Forge in London (and numerous others in China). At the Schwarzman Centre, the crane lifting schedule did not have to be adjusted during any of the excavation and civils work, which exemplifies how a shift to assembly can significantly reduce variability and noise.
  5. Shift the planning approach: Shifting from activity-based conventional planning to object-based production planning not only elevates the importance of understanding and managing variability, but also brings a toolset (lean, statistical process control, Six Sigma for quality, etc.) for reducing variance and noise.
  6. Consider scenarios: All plans are based on expectations of the future, even if this is a single scenario that is implicitly assumed and never spoken about, the so-called “ghost scenario”. Using multiple scenarios in planning helps teams interpret data from different perspectives, using different mental models. This has been shown to reduce noise and to help identify weak signals for projects.

In addition, our case study highlights four additional interventions outside the purview of planning to decrease variability and noise:

  1. Improve the fitness of the production system: Lean is one useful approach and has demonstrated that removing non-value-added activities also removes the variability and noise those activities generate, leading to an overall reduction in variability and noise.
  2. During execution, the classic problem-solving tools (cause-and-effect matrices, fishbone diagrams, five whys, value stream maps, and process mapping) help identify sources of variability and noise.
  3. Improve capabilities: Rework is a source of variability in construction projects, so investing in the capabilities of operatives is a logical step to reduce variability and noise. Additionally, investing in multi-trade operatives increases flexibility; it has been shown to reduce the buffers needed in high-variability or high-noise projects, thereby directly decreasing the resources required. At the project level, organizational capabilities such as dynamic team capabilities, information technology, and team-based learning (including retrospectives and pre-mortems) have been shown to reduce variability and noise.
  4. Increase the robustness of processes: Just-in-time and lean process optimization show limited improvements in high-variability, high-noise environments. A process organization that utilizes a bowl shape for operational task durations, combined with an inverse bowl shape for variability and noise, leads to the best achievable lead times.

Adopting some or all of these methods to reduce bias, variability, and noise upfront also has material implications for the methods advocated in typical governance processes, particularly when the goal is to approve business cases through a gated process that aims to reduce all three components of error during the front-end development phase of projects.

References

[1] Bertisen, J., & Davis, G. A. (2008). Bias and error in mine project capital cost estimation. The Engineering Economist, 53(2), 118-139

[2] Buehler, R., Griffin, D., & Ross, M. (1994). Exploring the "planning fallacy": Why people underestimate their task completion times. Journal of Personality and Social Psychology, 67(3), 366.

[3] Buehler, R., & Griffin, D. (2015). The planning fallacy: When plans lead to optimistic forecasts. The Psychology of Planning in Organizations, 31-57. Routledge.

[4] Dallasega, P., Marengo, E., & Revolti, A. (2021). Strengths and shortcomings of methodologies for production planning and control of construction projects: A systematic literature review and future perspectives. Production Planning & Control, 32(4), 257-282.

[5] Domingos, P. (2000). A unified bias-variance decomposition. In Proceedings of 17th International Conference on Machine Learning, 231-238. Morgan Kaufmann Stanford.

[6] Dvir, D., & Lechler, T. (2004). Plans are nothing, changing plans is everything: The impact of changes on project success. Research Policy, 33(1), 1-15.

[7] ENR. (2015, March 17). Highway 99 tunnel builder rescued in Seattle.

[8] Flyvbjerg, B. (2006). Curbing optimism bias and strategic misrepresentation in planning: Reference class forecasting in practice. European Planning Studies, 16(1), 3-21.

[9] Flyvbjerg, B. (2006). From Nobel Prize to project management: Getting risks right. Project Management Institute.

[10] Flyvbjerg, B. (2009). Delusion and deception in large infrastructure projects: Two models for explaining and preventing executive disaster. California Management Review, 51(2), 170-193.

[11] Flyvbjerg, B. (2013). From Nobel Prize to project management: Getting risks right. arXiv preprint arXiv:1302.3642.

[12] Flyvbjerg, B. (2014). What you should know about megaprojects and why: An overview. Project Management Journal, 45(2), 6-19.

[13] Flyvbjerg, B. (2017). Reference class forecasting for Hong Kong’s major roadworks projects. arXiv preprint arXiv:1710.09419.

[14] Flyvbjerg, B., Skamris Holm, M. K., & Buhl, S. L. (2003). How common and how large are cost overruns in transport infrastructure projects? Transport Reviews, 23(1), 71-88.

[15] Flyvbjerg, B. (2021). Top ten behavioral biases in project management: An overview. Project Management Journal, 52(6), 531-546.

[16] Flyvbjerg, B., Budzier, A., Lee, J. S., Keil, M., Lunn, D., & Bester, D. W. (2022). The empirical reality of IT project cost overruns: discovering a power-law distribution. Journal of Management Information Systems, 39(3), 607-639.

[17] Flyvbjerg, B., Budzier, A., & Aaen, J. (2025). The uniqueness of IT cost risk: A cross-group comparison of 23 project types. Project Management Journal, (in press).

[18] Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1-58.

[19] Goldratt, E. M., & Cox, J. (1984). The Goal: A Process of Ongoing Improvement. North River Press.

[20] Hedvall, L., & Mattsson, S. A. (2024). Causes and actions related to self-induced variations in manufacturing companies. Production Planning & Control, 35(10), 1081-1098.

[21] Hubbard, D. W. (2009). The Failure of Risk Management: Why It's Broken and How to Fix It. Wiley.

[22] Hubbard, D. W. (2014). How to Measure Anything: Finding the Value of Intangibles in Business (3rd ed.). Wiley.

[23] Hubbard, D. W. (2020). The Failure of Risk Management: Why It's Broken and How to Fix It. John Wiley & Sons.

[24] Jensen, J. L. W. V. (1906). Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta mathematica, 30(1), 175-193.

[25] Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

[26] Kahneman, D., Sibony, O., & Sunstein, C. R. (2021). Noise: A Flaw in Human Judgment. Hachette UK.

[27] Kahneman, D., & Tversky, A. (1979). Intuitive prediction: Biases and corrective procedures. In Studies in the Management Sciences: Forecasting, Amsterdam.

[28] Koskela, L. (2000). An exploration towards a production theory and its application to construction. VTT Technical Research Centre of Finland.

[29] Lappalainen, E., Reinbold, A., & Seppänen, O. (2023). Planned Percentage Completed in Construction–a Quantitative Review of Literature. In Annual Conference of the International Group for Lean Construction (pp. 1104-1115). International Group for Lean Construction (IGLC).

[30] Lee, C.-Y., & Morewedge, C. K. (2021). Noise Increases Anchoring Effects. Psychological Science, 33(1), 60-75.

[31] Lovallo, D., & Kahneman, D. (2003). Delusions of success. Harvard Business Review, 81(7), 56-63.

[32] Mandelbrot, B. B., & Hudson, R. L. (2010). The (Mis)behaviour of Markets: A Fractal View of Risk, Ruin and Reward. Profile Books.

[33] Modig, N., & Åhlström, P. (2015). This is Lean: Resolving the Efficiency Paradox. Rheologica Publishing.

[34] Park, J. E. (2021). Curbing cost overruns in infrastructure investment: Has reference class forecasting delivered its promised success? European Journal of Transport and Infrastructure Research, 21(2), 120–136.

[35] Taleb, N. N. (2010). The Black Swan: The Impact of the Highly Improbable (Vol. 2). Random House Trade Paperbacks.

[36] Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5(2), 207-232.

[37] Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.

[38] Vanston, J. H., & Vanston, L. K. (2004). Testing the tea leaves: Evaluating the validity of forecasts. Research-Technology Management, 47(5), 33-39.

[39] Wolpert, D. H. (1997). On bias plus variance. Neural Computation, 9(6), 1211-1243.

[40] Zabelle, T. R. (2024). Built to Fail: Why Construction Projects Take So Long, Cost Too Much, And How to Fix It. Forbes Books.

© 2025 Project Production Institute. All Rights Reserved.