Project Management math

Content:

  1. Introduction
  2. Typical assumptions
  3. Ideal scenario
  4. Realistic scenario
  5. Conclusions
  6. Code

Introduction:

Estimating the effort needed to develop software is a well-known hard problem.

Lots of books and articles exist on that topic.

And I am not even a project manager.

But I still think it is relevant to try and take a mathematical approach to project estimation.

Given the model, the results follow directly from the math.

The model is a simplified version of the real world, but I hope that it is close enough to the real world for the results to be relevant.

Typical assumptions:

Everybody knows that there is uncertainty on project estimates.

A key part of good project management is to try and minimize the risk of actuals significantly exceeding estimates.

The general assumptions are that this can be achieved by:

The first 3 are under the project manager's control.

The accuracy of the task estimates depends on the estimator's general development experience and knowledge of the specific domain. Typically that is outside the project manager's control.

Ideal scenario:

Model:

This is the very simple model that is behind the typical assumptions.

total size = total project effort in hours

number tasks = number of tasks the project is split into

task size = total size / number tasks

task estimate = task size + a * N(0, 1) * task size

task actual = task size

developer buffer = b * a * task size

project manager buffer = c * a / sqrt(number tasks) * total size

a=0.20 means estimates are within +/- 40% (with 95% probability)
a=0.10 means estimates are within +/- 20% (with 95% probability)
a=0.05 means estimates are within +/- 10% (with 95% probability)

b=1.667 means the risk of a task going over its estimate is 5%

c=1.667 means the risk of the project going over its estimate is 5%

N(0, 1) is a standard normal (Gaussian) distributed random component

It is of course not realistic that all tasks have exactly the same size, but typical projects are split up into tasks of approximately the same size.
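Why the project manager buffer can be so much smaller than the sum of the developer buffers follows from how independent errors add up: variances add, so the standard deviation of the total only grows with sqrt(number tasks). The following is a minimal sketch of that arithmetic under the ideal model. The function names and the constant Z_95 are made up for this illustration and are not part of the simulation code at the end of the article.

```kotlin
import kotlin.math.sqrt

// Hypothetical name for the 1.667 factor (b and c in the model above).
const val Z_95 = 1.667

// Standard deviation of one task estimate: a * task size.
fun taskSd(totalSize: Double, nTasks: Int, a: Double): Double = a * totalSize / nTasks

// Standard deviation of the sum of nTasks independent task estimates.
// Variances add, so this is a * total size / sqrt(number tasks).
fun projectSd(totalSize: Double, nTasks: Int, a: Double): Double =
    taskSd(totalSize, nTasks, a) * sqrt(nTasks.toDouble())

// Total buffer when every task carries its own developer buffer.
fun developerBufferTotal(totalSize: Double, nTasks: Int, a: Double): Double =
    nTasks * Z_95 * taskSd(totalSize, nTasks, a)

// One single buffer on top of the summed task estimates.
fun projectManagerBuffer(totalSize: Double, nTasks: Int, a: Double): Double =
    Z_95 * projectSd(totalSize, nTasks, a)

fun main() {
    for (n in intArrayOf(10, 100, 1000)) {
        val dev = developerBufferTotal(10000.0, n, 0.20)
        val pm = projectManagerBuffer(10000.0, n, 0.20)
        println("tasks=%d developer buffer=%.0f project manager buffer=%.0f".format(n, dev, pm))
    }
}
```

For 100 tasks and a=0.20 this gives a total developer buffer of about 3334 hours against a project manager buffer of about 333 hours. The developer buffer does not depend on the number of tasks, while the project manager buffer shrinks as the project is split into more tasks.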

Simulation results:

We will simulate a 10000 hour project.

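We will vary:

  1. the number of tasks (10 to 1000)
  2. the uncertainty on task estimates (40%, 20% and 10%)
  3. the buffer (none, developer buffer or project manager buffer)

We will look at various metrics:

task = number of tasks
avgest = average project estimate in hours (including any buffer), rounded to nearest 100
avgact = average project actual in hours, rounded to nearest 100
> = fraction of simulated projects where the actual reaches or exceeds the estimate
>5% = fraction of simulated projects where the actual exceeds the estimate by at least 5%
>20% = fraction of simulated projects where the actual exceeds the estimate by at least 20%
>100% = fraction of simulated projects where the actual exceeds the estimate by at least 100%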

Assumption: 40% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10000 0.49 0.22 0.00 0.00
25 10000 10000 0.50 0.12 0.00 0.00
50 10000 10000 0.49 0.04 0.00 0.00
100 10000 10000 0.51 0.01 0.00 0.00
250 10000 10000 0.49 0.00 0.00 0.00
500 10000 10000 0.49 0.00 0.00 0.00
1000 10000 10000 0.49 0.00 0.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 13300 10000 0.00 0.00 0.00 0.00
25 13300 10000 0.00 0.00 0.00 0.00
50 13300 10000 0.00 0.00 0.00 0.00
100 13300 10000 0.00 0.00 0.00 0.00
250 13300 10000 0.00 0.00 0.00 0.00
500 13300 10000 0.00 0.00 0.00 0.00
1000 13300 10000 0.00 0.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 11000 10000 0.05 0.01 0.00 0.00
25 10700 10000 0.05 0.00 0.00 0.00
50 10500 10000 0.04 0.00 0.00 0.00
100 10300 10000 0.05 0.00 0.00 0.00
250 10200 10000 0.04 0.00 0.00 0.00
500 10100 10000 0.05 0.00 0.00 0.00
1000 10100 10000 0.04 0.00 0.00 0.00

Assumption: 20% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10000 0.49 0.06 0.00 0.00
25 10000 10000 0.50 0.01 0.00 0.00
50 10000 10000 0.50 0.00 0.00 0.00
100 10000 10000 0.50 0.00 0.00 0.00
250 10000 10000 0.50 0.00 0.00 0.00
500 10000 10000 0.49 0.00 0.00 0.00
1000 10000 10000 0.48 0.00 0.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 11700 10000 0.00 0.00 0.00 0.00
25 11700 10000 0.00 0.00 0.00 0.00
50 11700 10000 0.00 0.00 0.00 0.00
100 11700 10000 0.00 0.00 0.00 0.00
250 11700 10000 0.00 0.00 0.00 0.00
500 11700 10000 0.00 0.00 0.00 0.00
1000 11700 10000 0.00 0.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 10500 10000 0.05 0.00 0.00 0.00
25 10300 10000 0.05 0.00 0.00 0.00
50 10200 10000 0.05 0.00 0.00 0.00
100 10200 10000 0.05 0.00 0.00 0.00
250 10100 10000 0.05 0.00 0.00 0.00
500 10100 10000 0.04 0.00 0.00 0.00
1000 10100 10000 0.05 0.00 0.00 0.00

Assumption: 10% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10000 0.50 0.00 0.00 0.00
25 10000 10000 0.50 0.00 0.00 0.00
50 10000 10000 0.49 0.00 0.00 0.00
100 10000 10000 0.49 0.00 0.00 0.00
250 10000 10000 0.49 0.00 0.00 0.00
500 10000 10000 0.48 0.00 0.00 0.00
1000 10000 10000 0.47 0.00 0.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 10800 10000 0.00 0.00 0.00 0.00
25 10800 10000 0.00 0.00 0.00 0.00
50 10800 10000 0.00 0.00 0.00 0.00
100 10800 10000 0.00 0.00 0.00 0.00
250 10800 10000 0.00 0.00 0.00 0.00
500 10800 10000 0.00 0.00 0.00 0.00
1000 10800 10000 0.00 0.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 10300 10000 0.05 0.00 0.00 0.00
25 10200 10000 0.05 0.00 0.00 0.00
50 10100 10000 0.05 0.00 0.00 0.00
100 10100 10000 0.04 0.00 0.00 0.00
250 10100 10000 0.05 0.00 0.00 0.00
500 10000 10000 0.05 0.00 0.00 0.00
1000 10000 10000 0.04 0.00 0.00 0.00

We note that:

We see all the initial assumptions confirmed.

Realistic scenario:

Model:

The above looks pretty neat.

But is the model really realistic?

Probably not!

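Two additional factors come in:

  1. The uncertainty on a task estimate does not keep shrinking with the task size; below a certain task size d it stays constant.
  2. A task never takes less time than its estimate; the actual is the larger of the real task size and the estimate.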

So now the model looks like:

total size = total project effort in hours

number tasks = number of tasks the project is split into

task size = total size / number tasks

task estimate = task size + a * N(0, 1) * MAX(task size, d)

task actual = MAX(task size, task estimate)

developer buffer = b * a * task size

project manager buffer = c * a / sqrt(number tasks) * total size

a=0.20 means estimates are within +/- 40% (with 95% probability)
a=0.10 means estimates are within +/- 20% (with 95% probability)
a=0.05 means estimates are within +/- 10% (with 95% probability)

b=1.667 means the risk of a task going over its estimate is 5%

c=1.667 means the risk of the project going over its estimate is 5%

d=40 means that task uncertainty does not decrease below what it is for a task size of 40 hours

N(0, 1) is a standard normal (Gaussian) distributed random component
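The effect of the MAX in task actual can be worked out by hand for the case without buffer. With sd = a * MAX(task size, d), the task actual is task size + sd * MAX(0, N(0, 1)), and the mean of MAX(0, N(0, 1)) is 1 / sqrt(2 * pi), approximately 0.399. The following is a small sketch of that calculation; the names expectedActual and MEAN_POSITIVE_PART are invented for this illustration and do not appear in the simulation code.

```kotlin
import kotlin.math.PI
import kotlin.math.max
import kotlin.math.sqrt

// Mean of MAX(0, Z) for a standard normal Z: 1 / sqrt(2 * pi), about 0.399.
val MEAN_POSITIVE_PART = 1.0 / sqrt(2.0 * PI)

// Expected total actual effort in the realistic model with no buffer.
// Each task overruns its real size by sd * MAX(0, Z) on average,
// where sd = a * MAX(task size, d).
fun expectedActual(totalSize: Double, nTasks: Int, a: Double, d: Double): Double {
    val taskSize = totalSize / nTasks
    val sd = a * max(taskSize, d)
    return totalSize + nTasks * sd * MEAN_POSITIVE_PART
}

fun main() {
    for (n in intArrayOf(10, 250, 500, 1000)) {
        println("tasks=%d expected actual=%.0f".format(n, expectedActual(10000.0, n, 0.20, 40.0)))
    }
}
```

With a=0.20 and d=40 the expected actual is just under 10800 hours as long as the task size is 40 hours or more, and it climbs to roughly 11600 and 13200 hours for 500 and 1000 tasks. Once tasks are smaller than d, splitting further adds more tasks without making each task's uncertainty any smaller.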

Simulation results:

We will simulate a 10000 hour project.

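We will vary:

  1. the number of tasks (10 to 1000)
  2. the uncertainty on task estimates (40%, 20% and 10%)
  3. the buffer (none, developer buffer or project manager buffer)

We will look at various metrics:

task = number of tasks
avgest = average project estimate in hours (including any buffer), rounded to nearest 100
avgact = average project actual in hours, rounded to nearest 100
> = fraction of simulated projects where the actual reaches or exceeds the estimate
>5% = fraction of simulated projects where the actual exceeds the estimate by at least 5%
>20% = fraction of simulated projects where the actual exceeds the estimate by at least 20%
>100% = fraction of simulated projects where the actual exceeds the estimate by at least 100%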

Assumption: 40% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10800 1.00 0.77 0.01 0.00
25 10000 10800 1.00 0.88 0.00 0.00
50 10000 10800 1.00 0.96 0.00 0.00
100 10000 10800 1.00 0.99 0.00 0.00
250 10000 10800 1.00 1.00 0.00 0.00
500 10000 11600 1.00 1.00 0.00 0.00
1000 10000 13200 1.00 1.00 1.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 13300 13400 0.38 0.00 0.00 0.00
25 13300 13400 0.70 0.00 0.00 0.00
50 13300 13400 0.90 0.00 0.00 0.00
100 13300 13400 0.99 0.00 0.00 0.00
250 13300 13400 1.00 0.00 0.00 0.00
500 13300 13800 1.00 0.00 0.00 0.00
1000 13300 15100 1.00 1.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 11000 10800 0.24 0.03 0.00 0.00
25 10700 10800 0.69 0.06 0.00 0.00
50 10500 10800 0.98 0.13 0.00 0.00
100 10300 10800 1.00 0.34 0.00 0.00
250 10200 10800 1.00 0.83 0.00 0.00
500 10200 11600 1.00 1.00 0.00 0.00
1000 10100 13200 1.00 1.00 1.00 0.00

Assumption: 20% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10400 1.00 0.29 0.00 0.00
25 10000 10400 1.00 0.20 0.00 0.00
50 10000 10400 1.00 0.12 0.00 0.00
100 10000 10400 1.00 0.06 0.00 0.00
250 10000 10400 1.00 0.01 0.00 0.00
500 10000 10800 1.00 1.00 0.00 0.00
1000 10000 11600 1.00 1.00 0.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 11700 11700 0.38 0.00 0.00 0.00
25 11700 11700 0.69 0.00 0.00 0.00
50 11700 11700 0.89 0.00 0.00 0.00
100 11700 11700 0.98 0.00 0.00 0.00
250 11700 11700 1.00 0.00 0.00 0.00
500 11700 11900 1.00 0.00 0.00 0.00
1000 11700 12600 1.00 1.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 10500 10400 0.23 0.00 0.00 0.00
25 10300 10400 0.70 0.00 0.00 0.00
50 10200 10400 0.98 0.00 0.00 0.00
100 10200 10400 1.00 0.00 0.00 0.00
250 10100 10400 1.00 0.00 0.00 0.00
500 10100 10800 1.00 1.00 0.00 0.00
1000 10100 11600 1.00 1.00 0.00 0.00

Assumption: 10% uncertainty on task estimates.

No buffer.

task avgest avgact > >5% >20% >100%
10 10000 10200 1.00 0.00 0.00 0.00
25 10000 10200 1.00 0.00 0.00 0.00
50 10000 10200 1.00 0.00 0.00 0.00
100 10000 10200 1.00 0.00 0.00 0.00
250 10000 10200 1.00 0.00 0.00 0.00
500 10000 10400 1.00 0.00 0.00 0.00
1000 10000 10800 1.00 1.00 0.00 0.00

Developer buffer:

task avgest avgact > >5% >20% >100%
10 10800 10800 0.37 0.00 0.00 0.00
25 10800 10800 0.67 0.00 0.00 0.00
50 10800 10800 0.86 0.00 0.00 0.00
100 10800 10800 0.97 0.00 0.00 0.00
250 10800 10800 1.00 0.00 0.00 0.00
500 10800 10900 1.00 0.00 0.00 0.00
1000 10800 11300 1.00 0.00 0.00 0.00

Project manager buffer:

task avgest avgact > >5% >20% >100%
10 10300 10200 0.22 0.00 0.00 0.00
25 10200 10200 0.70 0.00 0.00 0.00
50 10100 10200 0.98 0.00 0.00 0.00
100 10100 10200 1.00 0.00 0.00 0.00
250 10100 10200 1.00 0.00 0.00 0.00
500 10000 10400 1.00 0.00 0.00 0.00
1000 10000 10800 1.00 1.00 0.00 0.00

We note that:

So of the original 4 assumptions, 2 are true and good, 1 is true but bad, and 1 is false.

Conclusions:

The conclusions for the project manager are:

And remember: telling developers that it is unacceptable to exceed an estimate is the same as asking them to add a buffer to their estimates. An estimate without buffer means that the actual is below the estimate 50% of the time and above it 50% of the time. So as a project manager you should tell your developers to exceed their estimates 50% of the time!
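The 50% claim can be checked with a few lines of simulation. This is only a sketch under the ideal model (the actual equals the real task size), with a fixed seed so that it is repeatable; the function overrunFraction and its parameters are hypothetical and separate from the full simulation code below.

```kotlin
import java.util.Random

// Fraction of tasks where the actual (here: the real task size) is above the estimate.
// estimate = size + a * Z * size + k * a * size, where k = 0.0 means no buffer
// and k = 1.667 is the buffer factor used in the article.
fun overrunFraction(a: Double, k: Double, runs: Int, seed: Long): Double {
    val rng = Random(seed)
    val size = 100.0 // arbitrary task size in hours; the fraction does not depend on it
    var over = 0
    for (i in 1..runs) {
        val estimate = size + a * rng.nextGaussian() * size + k * a * size
        if (size > estimate) over++
    }
    return over.toDouble() / runs
}

fun main() {
    println("no buffer:   %.3f".format(overrunFraction(0.20, 0.0, 100000, 42L)))
    println("with buffer: %.3f".format(overrunFraction(0.20, 1.667, 100000, 42L)))
}
```

Without buffer the fraction should come out close to 0.5, and with the 1.667 buffer close to 0.05. So a developer who almost never exceeds an estimate is, in this model, a developer who adds a buffer.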

Code:

And for those that want to see the code and maybe try tweaking the calculations themselves:

package pmmath

import kotlin.math.*

import java.util.Random

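// RNG functions

val rng = Random()

// uniform random number in [0, 1)
fun uniform_rng(): Double = rng.nextDouble()

// standard normal random number via the Box-Muller transform
// (1 - u is used inside ln so that the argument is never 0)
fun normal_rng(): Double = sqrt(-2 * ln(1 - uniform_rng())) * cos(2 * PI * uniform_rng())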

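// task functions

// random estimation error; the uncertainty is based on a task size of at least minsize hours
fun task_uncertainty(tasksize: Double, eps: Double, minsize: Double): Double = eps * max(tasksize, minsize) * normal_rng()

// developer buffer on a single task
fun task_buffer(tasksize: Double, eps: Double): Double = 1.667 * eps * tasksize

// task estimate = real size + random error (+ developer buffer if tasknorisk)
fun task_estimate(tasksize: Double, eps: Double, minsize: Double, tasknorisk: Boolean): Double = tasksize + task_uncertainty(tasksize, eps, minsize) + if(tasknorisk) task_buffer(tasksize, eps) else 0.0

// task actual = real size, but never less than minutil times the estimate
fun task_actual(tasksize: Double, est: Double, minutil: Double): Double = max(tasksize, minutil * est)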

// project functions

fun proj_estimate(projsize: Double, ntask: Int, eps: Double, minsize: Double, tasknorisk: Boolean): DoubleArray = DoubleArray(ntask, { _ -> task_estimate(projsize / ntask, eps, minsize, tasknorisk) })

fun proj_actual(projsize: Double, ntask: Int, est: DoubleArray, minutil: Double): DoubleArray = DoubleArray(ntask, { ix -> task_actual(projsize / ntask, est[ix], minutil) })

fun proj_buffer(projsize: Double, ntask: Int, eps: Double): Double = 1.667 * eps * projsize / sqrt(ntask.toDouble())

// simulation

fun xround(x: Double, scale: Int): Int = scale * round(x / scale).toInt()

data class Result(var avgest: Int, var avgact: Int, var risk0: Double, var risk5: Double, var risk20: Double, var risk100: Double)

val SIM_IT = 10000
val ROUND_SCALE = 100

fun simulate(projsize: Double, ntask: Int, eps: Double, minsize: Double, minutil: Double, tasknorisk: Boolean, projnorisk: Boolean): Result {
    var avgest = 0.0
    var avgact = 0.0
    var risk0 = 0.0
    var risk5 = 0.0
    var risk20 = 0.0
    var risk100 = 0.0
    for(i in 1..SIM_IT) {
        val taskest = proj_estimate(projsize, ntask, eps, minsize, tasknorisk)
        val taskact = proj_actual(projsize, ntask, taskest, minutil)
        // note: without a project manager buffer a margin of 1.0 hour is added to the estimate
        val projest = taskest.sum() + if(projnorisk) proj_buffer(projsize, ntask, eps) else 1.0
        val projact = taskact.sum()
        avgest += projest
        avgact += projact
        if(projact >= 1.00 * projest) risk0++
        if(projact >= 1.05 * projest) risk5++
        if(projact >= 1.20 * projest) risk20++
        if(projact >= 2.00 * projest) risk100++
    }
    avgest /= SIM_IT
    avgact /= SIM_IT
    risk0 /= SIM_IT
    risk5 /= SIM_IT
    risk20 /= SIM_IT
    risk100 /= SIM_IT
    return Result(xround(avgest, ROUND_SCALE), xround(avgact, ROUND_SCALE), risk0, risk5, risk20, risk100)
}

// test

val PROJ_SIZE = 10000.0

val TXT_FMT_HEADER = "%4s %6s %6s %6s %6s %6s %6s"
val TXT_FMT_RECORD = "%4d %6d %6d %6.2f %6.2f %6.2f %6.2f"
val TXT_FMT_FOOTER = ""

val HTML_FMT_HEADER = "<table>\n<tr>\n<th>%s</th>\n<th>%s</th>\n<th>%s</th>\n<th>%s</th>\n<th>%s</th>\n<th>%s</th>\n<th>%s</th>\n</tr>"
val HTML_FMT_RECORD = "<tr>\n<td>%d</td>\n<td>%d</td>\n<td>%d</td>\n<td>%.2f</td>\n<td>%.2f</td>\n<td>%.2f</td>\n<td>%.2f</td>\n</tr>"
val HTML_FMT_FOOTER = "</table>"

fun test(eps: Double, minsize: Double, minutil: Double, projnorisk: Boolean, tasknorisk: Boolean, fmt_head: String, fmt_rec: String, fmt_foot: String): Unit {
    println("projnorisk=%b tasknorisk=%b".format(projnorisk, tasknorisk))
    println(fmt_head.format("task", "avgest", "avgact", ">", ">5%", ">20%", ">100%"))
    for(ntask in intArrayOf(10, 25, 50, 100, 250, 500, 1000)) {
        val res = simulate(PROJ_SIZE, ntask, eps, minsize, minutil, tasknorisk, projnorisk)
        println(fmt_rec.format(ntask, res.avgest, res.avgact, res.risk0, res.risk5, res.risk20, res.risk100))
    }
    println(fmt_foot)
}

fun test(eps: Double, minsize: Double, minutil: Double, projnorisk: Boolean, tasknorisk: Boolean) {
    test(eps, minsize, minutil, projnorisk, tasknorisk, HTML_FMT_HEADER, HTML_FMT_RECORD, HTML_FMT_FOOTER)
}

fun test(lbl: String, eps: Double, minsize: Double, minutil: Double): Unit {
    println("-- %s scenario: minsize=%.1f minutil=%.2f (est < %d are useless, actual never < %d%% of est)".format(lbl, minsize, minutil, minsize.toInt(), (100 * minutil).toInt()))
    test(eps, minsize, minutil, false, false) // no buffer
    test(eps, minsize, minutil, false, true)  // developer buffer
    test(eps, minsize, minutil, true, false)  // project manager buffer
}

fun test_ideal(eps: Double): Unit {
    test("Ideal", eps, 0.0, 0.0)
}

fun test_realistic(eps: Double): Unit {
    test("Realistic", eps, 40.0, 1.0)
}

fun main(): Unit {
    for(eps in doubleArrayOf(0.20, 0.15, 0.10, 0.05)) {
        println("**** Estimation quality: eps=%.2f (95%% prob for estimate +/- %d%%) ****".format(eps, (200 * eps).toInt()))
        test_ideal(eps)
        test_realistic(eps)
    }
}

Article history:

Version Date Description
1.0 April 1st 2022 Initial version

Comments:

Please send comments to Arne Vajhøj