Modelling forced ranking

Forced ranking is a contentious way of segregating performance groups within a firm, often by fitting the rankings to a distribution curve – usually a normal distribution. In many implementations the bottom x percent of performers each year are asked to leave.

Two goals are usually stated for launching such a system:

(a) That the capability of the overall population increases over time
(b) That forcing a ranking makes it easier to identify and reward the highest performers.

This article considers such a system purely by modelling its effect over time using a simple Monte Carlo-type simulation.

For the first model we assumed there was no selection at hiring and that managers could accurately measure the full economic value each employee could produce.

We built a population of 100 ‘employees’. We gave each a performance capability (think producing widgets) using a random function distributed to a normal curve (we used widgets=norminv(rand(),100,30), where 100 is the mean and 30 is the standard deviation).
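
For readers who prefer code to spreadsheet formulas, the same population can be sketched in a few lines of Python using numpy (a minimal sketch; the variable names are ours, not part of the original model):

    import numpy as np

    rng = np.random.default_rng()

    # 100 'employees', performance ~ Normal(mean=100, sd=30),
    # the equivalent of widgets = norminv(rand(), 100, 30)
    employees = rng.normal(loc=100, scale=30, size=100)
    print(f"mean={employees.mean():.1f}, sd={employees.std():.1f}")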

Next we ranked the employees and removed the bottom 10%. We then replaced them with ten new hires created using the original process. We repeated this over five years.
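
A minimal sketch of this trim-and-replace loop follows; printing the yearly mean and standard deviation is our addition, to make the effects discussed below visible:

    import numpy as np

    rng = np.random.default_rng()
    employees = rng.normal(100, 30, size=100)

    for year in range(1, 6):
        employees = np.sort(employees)[10:]          # rank and drop the bottom 10%
        new_hires = rng.normal(100, 30, size=10)     # replace with fresh random hires
        employees = np.concatenate([employees, new_hires])
        print(f"year {year}: mean={employees.mean():.1f}, sd={employees.std():.1f}")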


The green shows the top 90% and the orange the bottom 10%, who are removed and replaced.

As you can see, the effects of running this process are:

  • The lower tail of the distribution is removed, raising the base level
  • The performance of the 100 employees is bunched up.

Therefore, on the face of it, we are achieving goal (a) but probably making goal (b) harder: the narrower the spread of performance, the harder it becomes to distinguish the highest performers.

Given this is overly simplistic, we ran a second test. Starting with the same 100 people, we assumed that managers couldn’t perfectly identify performance. A good way of thinking about this: if you see someone on their own against a blank background and are asked to state their height, you’ll probably get it right plus or minus an error amount, and that error is probably distributed normally.

We therefore ‘blurred’ the real performance capability by adding such an error value, in the form perceived value = real value + (100 − norminv(rand(),100,5)) – i.e. we drew a value from a normal distribution with mean 100 and standard deviation 5 and subtracted it from 100, giving an error centred on zero with standard deviation 5. This assumes the error does not depend on the actual performance.
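
In Python the blurring can be sketched by mirroring the spreadsheet formula exactly (again, the names are ours):

    import numpy as np

    rng = np.random.default_rng()
    real = rng.normal(100, 30, size=100)     # real capability, as in the first model
    # perceived = real + (100 - Normal(100, 5)), i.e. real + Normal(0, 5)
    perceived = real + (100 - rng.normal(100, 5, size=100))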

You then get the following result:


Here the orange and green show the people who should be exited based on their real performance value, yet the distribution shows the perceived performance – the 10% cut was made on this perceived value.

Each year we saw that between one and three (10–30%) of the people assigned to the bottom 10% were ‘wrong’ – on real performance they should not have been cut.
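
This misclassification is easy to count in the sketch (the checking logic is our own, not the original spreadsheet):

    import numpy as np

    rng = np.random.default_rng()
    real = rng.normal(100, 30, size=100)
    perceived = real + rng.normal(0, 5, size=100)

    real_bottom = set(np.argsort(real)[:10])            # truly the 10 weakest
    perceived_bottom = set(np.argsort(perceived)[:10])  # the 10 actually cut

    wrong = perceived_bottom - real_bottom              # cut, but not truly bottom-10%
    print(f"{len(wrong)} of the 10 exits were 'wrong'")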

The effect of allowing for some subjectivity in assessment was to slightly reduce the speed at which the lower tail was removed, but the overall pattern was similar.

What else do we see?

First, we need to consider the average after each year to see the overall effect this process is creating:

As you can see, the average increases year on year, but the rate of change is falling – it’s flattening out. This matches what academic studies have found from actual use of this method. As Dave Ulrich notes:

“It’s a terrific idea for companies in trouble, done over one or two years, but to do it as a long-term solution is not going to work.”

Studies have shown it works well for roughly the first three years; after that the advantages decrease because of this flattening effect.
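
To see the flattening above the sampling noise, the first model can be extended to ten years and averaged over many runs (our own extension, not part of the studies cited):

    import numpy as np

    rng = np.random.default_rng()
    runs, years = 1000, 10
    means = np.zeros((runs, years + 1))

    for r in range(runs):
        employees = rng.normal(100, 30, size=100)
        means[r, 0] = employees.mean()
        for year in range(1, years + 1):
            employees = np.sort(employees)[10:]
            employees = np.concatenate([employees, rng.normal(100, 30, size=10)])
            means[r, year] = employees.mean()

    avg = means.mean(axis=0)                 # average trajectory across runs
    for year in range(1, years + 1):
        print(f"year {year:2d}: mean={avg[year]:.1f}, gain={avg[year] - avg[year - 1]:+.2f}")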

Secondly, we explore who is being exited:

What you see is that, as time progresses, recent hires increasingly form the population of exited people: once the incumbents’ scores have bunched above the cut-off, a fresh random hire is the most likely to fall into the bottom 10%.
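
Tagging each employee with a hire year makes this visible (the bookkeeping is our own addition to the model):

    import numpy as np

    rng = np.random.default_rng()
    perf = rng.normal(100, 30, size=100)
    hire_year = np.zeros(100, dtype=int)     # everyone starts in year 0

    for year in range(1, 6):
        bottom = np.argsort(perf)[:10]       # indices of the bottom 10%
        print(f"year {year}: exits were hired in years {sorted(hire_year[bottom].tolist())}")
        perf = np.concatenate([np.delete(perf, bottom), rng.normal(100, 30, size=10)])
        hire_year = np.concatenate([np.delete(hire_year, bottom), np.full(10, year)])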

What about selection?

Of course, few companies hire by random selection. How does selection change the model?

There are two differing approaches to selection (or, more usually, a hybrid of the two):

  • The first person who reaches X level is hired
  • The shortlist is ranked and the top person is hired.

Both produce similar types of results, albeit the second selection method is likely to produce a higher average. Either way you end up with a distribution in which the spread of abilities is narrower, and the average higher, than with random selection. This is similar to the latter stages of the model.
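
Both schemes are straightforward to sketch (our illustration; the threshold of 110 and the shortlist size of five are arbitrary assumptions):

    import numpy as np

    rng = np.random.default_rng()

    def hire_first_above(threshold=110.0):
        # interview candidates one at a time; hire the first to clear the bar
        while True:
            candidate = rng.normal(100, 30)
            if candidate >= threshold:
                return candidate

    def hire_best_of(shortlist_size=5):
        # rank a shortlist and hire the top person
        return rng.normal(100, 30, size=shortlist_size).max()

    pool_a = np.array([hire_first_above() for _ in range(100)])
    pool_b = np.array([hire_best_of() for _ in range(100)])
    print(f"first-above-threshold: mean={pool_a.mean():.1f}, sd={pool_a.std():.1f}")
    print(f"best-of-shortlist:     mean={pool_b.mean():.1f}, sd={pool_b.std():.1f}")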

As noted earlier, with this tighter distribution the average performance gain from each year’s cut is smaller. Therefore we can state that selection reduces the effectiveness of forced ranking / trimming.

Conclusion

This article set out to examine a typical HR policy using a simplified dynamic model. Even with such a simplistic approach, we were able to produce results similar to those seen in real examples.

There are many instances where such models can be used in HR to explore policies or organizational changes – we feel that they are currently under-utilized.

In a future article we’ll explore some other uses of models, and how using observed data we can improve their accuracy.

Andrew Marritt