In the previous article we focussed on the importance of understanding what the user needs, especially the information that will inspire them to take action when needed.
Unfortunately raw data rarely has this power. Data in itself is a complex subject and not the focus of these articles. Variables need to be understood, not only for what information they hold, but also the statistically correct ways in which they can be transformed and presented.
To create a simple picture it is possible to think of HR data as falling into two big buckets:
Manipulating the data
Much of the information that you need to show won’t be residing in the databases and will need to be calculated. It’s always worth starting with what is needed to drive action and then seeing how this can be constructed, rather than seeing what you have and working out what to show.
For most of our reporting applications we find that the number of these calculated fields exceeds the number of underlying data fields.
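As a minimal sketch of a calculated field, the example below derives length of service from a stored hire date. The field names, the sample records and the 365.25-day year are illustrative assumptions, not a description of any particular HR system.

```python
from datetime import date

# Hypothetical raw fields as they might sit in an HR database.
employees = [
    {"id": 1, "hire_date": date(2019, 3, 1)},
    {"id": 2, "hire_date": date(2023, 9, 15)},
]


def tenure_years(hire_date: date, as_of: date) -> float:
    """Calculated field: approximate length of service in years."""
    # 365.25 is a simplifying assumption that averages out leap years.
    return (as_of - hire_date).days / 365.25


# Reporting date, fixed here so that the result is repeatable.
as_of = date(2024, 3, 1)

# Tenure is not stored anywhere; it is built from hire_date when reporting.
tenures = {e["id"]: round(tenure_years(e["hire_date"], as_of), 1) for e in employees}
print(tenures)
```

The point of the sketch is the direction of travel: decide that tenure is what the user needs, then work back to the raw field that lets you build it.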
Averaging and distributions
There are three common types of average: mean, median and mode. In most instances in HR the mean is used, mainly because it is computationally easy. Far too often it’s used for data where a mean is inappropriate, such as heavily skewed data. Medians, which show the middle point, are far more robust, meaning that they are resilient to outliers. Modes show the most frequently occurring value.
All three averages involve throwing away data and in many instances that will obscure valuable information. In most instances we therefore prefer to show the distribution of the data in addition to the average, often a median. You might also be interested in showing other quantiles.
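To illustrate how the three averages can diverge, here is a small sketch using Python's standard `statistics` module. The salary figures are invented for the example, and one deliberately large value stands in for an outlier.

```python
import statistics

# Hypothetical salaries in thousands; the last value is a deliberate outlier.
salaries = [28, 30, 30, 32, 35, 38, 41, 45, 52, 250]

mean_salary = statistics.mean(salaries)      # pulled upwards by the outlier
median_salary = statistics.median(salaries)  # middle point, robust to the outlier
mode_salary = statistics.mode(salaries)      # most frequently occurring value

# Quartiles give a compact view of the distribution alongside the average.
quartiles = statistics.quantiles(salaries, n=4, method="inclusive")

print(f"mean={mean_salary}, median={median_salary}, mode={mode_salary}")
print(f"quartiles={quartiles}")
```

With this data the mean lands above nine of the ten salaries, which is the kind of distortion that showing the median and the quartiles helps to expose.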
Change and rates of change
In many instances, what is important to drive action is not the underlying state but the change that has occurred or is needed.
You might want to show how a variable has changed over time (often as a percentage increase). Headcount growth is one example.
You might also want to show whether that growth itself is getting faster or slower.
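Both kinds of change can be sketched in a few lines. The headcount figures below are invented, and the function names are illustrative rather than taken from any library.

```python
# Hypothetical year-end headcount for four consecutive years.
headcount = [400, 440, 506, 531]


def percentage_change(values):
    """Percentage change between each pair of consecutive values."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(values, values[1:])]


def differences(values):
    """Simple difference between each pair of consecutive values."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Change: how much headcount grew each year, in percent.
growth = percentage_change(headcount)

# Rate of change: whether growth is speeding up (positive) or slowing down
# (negative), in percentage points.
growth_change = differences(growth)

print([round(g, 1) for g in growth])
print([round(c, 1) for c in growth_change])
```

Here headcount rises every year, yet the final figure in `growth_change` is negative: growth is still positive but slowing, which may call for a different action than the headline number suggests.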
Standardised HR metrics
HR bodies, whether commercial or professional, seem to love standardised metrics. A metric is usually a calculation based on a number of underlying variables. If you plan to use a metric then the default should be to use an industry-standard one. However, a standard shouldn’t be followed slavishly: if a different construction is more appropriate for your purpose, use it. Just make sure that you highlight the differences from the standard.
A standard metric can be useful, but care is needed to present it in a way that leads the user to accurate conclusions and therefore to the correct action. It is often useful to show the changes in the underlying variables to aid comprehension.
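The sketch below shows why the underlying variables matter. It uses a common construction of turnover (leavers divided by average headcount) purely as an assumption for illustration; the standard you follow may define it differently, and the figures are invented.

```python
def turnover_rate(leavers: int, average_headcount: float) -> float:
    """Turnover as a percentage (assumed construction: leavers / average headcount)."""
    return leavers / average_headcount * 100


# Hypothetical figures for two consecutive years.
periods = [
    {"period": "Year 1", "leavers": 30, "average_headcount": 300},
    {"period": "Year 2", "leavers": 30, "average_headcount": 250},
]

for p in periods:
    p["turnover_pct"] = turnover_rate(p["leavers"], p["average_headcount"])
    # Report the metric together with the variables it is built from.
    print(
        f'{p["period"]}: turnover {p["turnover_pct"]:.1f}% '
        f'(leavers {p["leavers"]}, average headcount {p["average_headcount"]})'
    )
```

Shown alone, the metric rises from 10% to 12% and suggests that more people are leaving. Shown with its components, it is clear that the number of leavers is unchanged and the movement comes from a smaller workforce.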
Information that provides guidance to make decisions
A subject that is only just emerging in HR reporting is forecasting or prediction. In the simplest sense these are a set of techniques which take the data residing in the database and estimate the likely future state.
An estimate will almost always have some form of confidence interval associated with it: in the near term you can be reasonably confident, and the range is likely to widen the further ahead you look. If you’re using forecasting (and this is a very good thing) then showing confidence intervals can vastly increase comprehension. Where data is sampled, confidence intervals should also be shown.
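For the sampled-data case, a confidence interval can be sketched as follows. This uses the simple normal-approximation interval for a proportion, which is only one of several possible methods and is a poor fit for very small samples or proportions near 0 or 1. The survey numbers and the function name are invented for the example.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, sample_size, confidence=0.95):
    """Normal-approximation confidence interval for a sampled proportion."""
    p = successes / sample_size
    # z-value for a two-sided interval, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    # Clamp to the valid range for a proportion.
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical survey: 78 of 120 sampled employees responded favourably.
estimate, lower, upper = proportion_confidence_interval(78, 120)
print(f"{estimate:.0%} favourable (95% interval {lower:.1%} to {upper:.1%})")
```

Reporting the range alongside the point estimate tells the user how much weight the figure can bear before they act on it.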
Importance of context
“And there’s also labels in supermarkets; you’ve got labels on the food stuff now, so you can- it says “Four grams of protein,” you go, “Ah!” Is that good? Is that far too little protein? Is it you’re gonna die of protein shortage, or you’re gonna overdose on it? “0.02 milligrams of sodium.” Sodium explodes in water. Do I need 0.02 milligrams of that?”
Eddie Izzard, Definite Article.
Much information is meaningless without context – the context gives it meaning. There are a few things that you’re likely to want to do:
Identifying what you need
When you’ve identified what you need to show, you then need to identify what raw information is needed to construct the variables in a way that is useful. This involves breaking aggregates and calculations up into their component parts, identifying what data is needed and where it could be stored.
If you’re interested in this subject in more detail, a good place to start is ‘Show Me the Numbers’ by Stephen Few.