The greatest mistake for many in People Analytics?

Douglas Hubbard, in his excellent book ‘How to Measure Anything’, cites three economically valuable reasons for measuring (and thus analysing):

To make better decisions
To influence behaviours
To sell that information.

All three are applicable to HR departments, especially if you consider that participating in a salary survey is selling information (because it costs less to acquire the aggregated data if you’ve contributed your own). However, the primary reason for most of the work we see in People Analytics today is the first – making better decisions.

Machine learning takes over HR

So-called ‘predictive modelling’ has made a huge impact on HR, or at least on the ambitions of HR departments, over the last 24 months.

As a firm, we’re less inclined to focus purely on the predictive nature of models because, as I’ve previously mentioned, there are several other good uses of models.

Models can be used for:

understanding what is going on
communicating what is happening
predicting / forecasting a future state.

Many people I speak to in HR who talk about and work in ‘predictive analytics’ are actually using machine learning approaches predominantly for the first two of these reasons. (A good check on the primary purpose of a piece of work is to understand how it’s intended to be moved into production.)

Machine learning approaches tend to try to do two things:

They try to optimise a particular variable or feature
They search for patterns in data.

The loss function

If we look at a simple model of decision making under uncertainty (all decision making is under uncertainty, as decisions are made about future events and all events in the future are uncertain), we need:

the probability that an event is going to happen
the cost / value that we’ll face if that event occurs.

Furthermore we often have to account for the cost / value of the action we’ll take on the back of a decision if we think a particular event will happen. In general there is hopefully a benefit from making a good decision. However, there is also usually a loss from making a wrong decision.

In HR (as in goalkeeping in football / soccer) this loss / value is often asymmetric – i.e. the cost of making a wrong decision can be much larger than the benefit of making a right decision. In most instances it’s good enough to build something which amounts to a grid summarising the cost / value of each of the key options.

The function that defines the various costs / values is the loss function. Multiply the probability of the event happening by the loss and you get the expected value.
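As a minimal sketch of this calculation, the Python below builds a small loss grid and computes the expected loss of each action. All figures, the action names and the `expected_loss` helper are hypothetical, chosen purely for illustration.

```python
# Hypothetical loss grid for one employee: keys are (action, outcome),
# values are the cost to the firm in an arbitrary currency unit.
LOSS = {
    ("intervene", "leaves"): 60_000,   # intervention cost plus replacement cost
    ("intervene", "stays"): 10_000,    # intervention cost only
    ("do_nothing", "leaves"): 50_000,  # replacement cost only
    ("do_nothing", "stays"): 0,
}

# Assumed probability that the employee leaves under each action.
P_LEAVE = {"intervene": 0.10, "do_nothing": 0.40}


def expected_loss(action):
    """Probability-weighted loss of taking `action`."""
    p = P_LEAVE[action]
    return p * LOSS[(action, "leaves")] + (1 - p) * LOSS[(action, "stays")]


losses = {action: expected_loss(action) for action in P_LEAVE}
best_action = min(losses, key=losses.get)

for action, loss in losses.items():
    print(f"{action}: expected loss {loss:,.0f}")
print("lowest expected loss:", best_action)
```

With these assumed numbers, intervening carries an expected loss of around 15,000 against around 20,000 for doing nothing, so the grid favours intervening even though the intervention itself has a cost.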

(In this article I’ll focus on the firm acting rationally. It’s worth noting that individual managers might not share this incentive.)

Optimising on the wrong thing

This brings us to the key issue we see with many modelling projects in HR: they focus on optimising a certain workforce metric rather than optimising expected loss / value.

In most attrition models that we see, the model tries to reduce the number of people leaving. This can cause significant issues.

In one recent project we found that fixing resourcing issues would minimise the number of people that the firm lost. However, it did so by disproportionately reducing attrition of low / medium performers. In general, if your sole objective is to reduce your turnover figures, do a better job of keeping your low performers!

In this example the best approach, accounting for the loss function, was to focus on another set of factors, which were most influential in driving attrition of high performers. Because our loss function valued high performers more highly than average employees, the optimal solution focussed on issues that increased the likelihood of this group leaving.
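To make the contrast concrete, here is a rough sketch that ranks two interventions first by leavers avoided and then by expected loss avoided. The segments, headcounts, costs and effect sizes are invented for illustration and are not the figures from the project described above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    headcount: int
    loss_per_leaver: float  # hypothetical cost to the firm when one person leaves


# Hypothetical workforce segments.
SEGMENTS = [
    Segment("high performers", 100, 200_000),
    Segment("low / medium performers", 900, 20_000),
]

# Hypothetical reduction in each segment's probability of leaving.
INTERVENTIONS = {
    "fix resourcing": {
        "high performers": 0.01,
        "low / medium performers": 0.05,
    },
    "fix high-performer drivers": {
        "high performers": 0.10,
        "low / medium performers": 0.00,
    },
}


def leavers_avoided(effect):
    """Expected number of leavers avoided, ignoring who they are."""
    return sum(s.headcount * effect[s.name] for s in SEGMENTS)


def loss_avoided(effect):
    """Expected loss avoided, weighting each leaver by the loss function."""
    return sum(s.headcount * effect[s.name] * s.loss_per_leaver for s in SEGMENTS)


by_headcount = max(INTERVENTIONS, key=lambda k: leavers_avoided(INTERVENTIONS[k]))
by_loss = max(INTERVENTIONS, key=lambda k: loss_avoided(INTERVENTIONS[k]))

print("best by leavers avoided:", by_headcount)
print("best by expected loss avoided:", by_loss)
```

Under these assumptions the two objectives disagree: fixing resourcing keeps more people, but addressing the drivers of high-performer attrition avoids the larger expected loss.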

Defining a loss function can take time

In their eagerness to start building a great model many analytics teams overlook the resource and effort needed to build a good, reliable loss function.

In some areas the loss function is relatively easy to build – if, for example, you have individual sales figures for each employee. However in most instances it is more difficult and requires a long process of investigation plus some realistic assumptions.

A suggested approach

We use CRISP-DM as a methodology to guide all our analytics projects. The iterative nature of this approach is valuable when creating a loss function as well as when building the model. Building the loss function fits well into the Business Understanding and Data Understanding stages of CRISP-DM.

It’s worth socialising the loss function widely across the firm, not only in HR but in other functions that might have a stake. We like including finance in such a conversation, as agreement from finance that the approach is realistic tends to be very powerful when convincing sceptical HR managers that it is sensible.

Monte Carlo simulation can be powerful

At the end of the process we’re often exploring questions such as ‘if we change x, what will be the overall benefit / loss?’

In reality we now have two forms of uncertainty:

We have uncertainty for each prediction, on a row-by-row basis
We have uncertainty in the loss function (because we cannot predict with 100% certainty what the real costs / values associated with each outcome will be).

Suddenly even the mathematics of understanding expected losses / gains becomes difficult. Fortunately, simulations can help.

If we explore our probability curves using simulation we can start to understand not only the most likely outcome of any decision but also the certainty we should have in the predicted loss or gain.
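A minimal sketch of such a simulation, using only the Python standard library, might look like the following. The leave probabilities, the triangular cost distribution and the `simulate_total_loss` helper are all assumptions made for illustration; in particular, the cost of each leaver is drawn independently, which may not hold if the errors in your loss function are systematic.

```python
import random
import statistics


def simulate_total_loss(p_leave, cost_low, cost_mode, cost_high,
                        n_runs=5_000, seed=42):
    """Simulate the total attrition loss across a group of employees."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    totals = []
    for _ in range(n_runs):
        total = 0.0
        for p in p_leave:
            # Uncertainty 1: does this person leave in this run?
            if rng.random() < p:
                # Uncertainty 2: what does losing them actually cost?
                total += rng.triangular(cost_low, cost_high, cost_mode)
        totals.append(total)
    return totals


# Hypothetical row-by-row predictions for 100 employees.
p_leave = [0.05, 0.10, 0.20, 0.40, 0.15] * 20

totals = sorted(simulate_total_loss(p_leave, 20_000, 40_000, 90_000))
mean_loss = statistics.mean(totals)
p05 = totals[int(0.05 * len(totals))]  # approximate 5th percentile
p95 = totals[int(0.95 * len(totals))]  # approximate 95th percentile

print(f"mean loss: {mean_loss:,.0f}")
print(f"90% of runs fall between {p05:,.0f} and {p95:,.0f}")
```

The spread between the two percentiles is the useful part: it shows how much confidence the predicted loss deserves, not just its most likely value. Rerunning the simulation with the probabilities you expect after an intervention gives a comparable distribution for the alternative.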

Thoughts when building your People Analytics function

Building loss functions is a skill that many analysts won’t have naturally. If you’re building a People Analytics function it’s worth thinking of how you are going to resource this need.

It’s likely that people with this skill will have come from a more business-focussed background. You might have them in the HR function. You might have them in other parts of the business, for example operations or finance. Alternatively some parts of management consultancy will have developed such skills.

One of the key underlying assumptions of People Analytics is that not everyone is equal, and therefore if we focus on valuable segments we can improve our organisations.

The only logical implication of this thinking is to include loss functions in all your analysis work.

People Analytics | Andrew Marritt | September 14, 2016