How to start a People Analytics project

As one of the earliest People Analytics practices, we have extensive experience of working with clients to help them build great People Analytics organizations, either by helping them work through pilot projects or through our regular analytics training.

In most instances, improving the quality of analysis is likely to require better-quality data, not better algorithms. Our Workometry product was initially built to provide high-quality perception data for use in predictive models. Our experience is that this data is often the most valuable source of insight and supplies the most predictive variables when model-building.

There are a few simple things to consider when starting analytics projects in HR. The most important thing is to do this in a systematic manner, not just grab the easiest-to-get dataset and start modelling.

What is your business objective? Really??

Possibly the hardest challenge of any analytics project is to accurately define what you want to analyse.

This might come as a shock but, as with so many things in employee management, commonly used concepts are poorly defined. A good example is employee engagement – there is no common definition of engagement, which makes statistical analysis more difficult.

One of the ways that we recommend clarifying such topics is to add the words ‘as measured by X’ to the end. So if the project is defined as improving employee experience, then the definition could be ‘to improve employee experience as measured by the employee experience rating on this survey.’ Socialising such definitions is important to ensure that all the key stakeholders agree with your definition.

Another useful technique is a version of the ‘5 Whys’. Here the purpose is to challenge the initial problem description by repeatedly asking ‘why?’ until the actual causal issue is identified. To give a simple example:

Manager: We need to understand how to reduce employee attrition.

Analyst: Why do you want to do that?

Manager: Because it is causing us a lot of unnecessary cost.

Analyst: Why is it so costly?

Manager: Because it is disruptive to our business and we have to pay recruitment costs.

Analyst: Why is it disruptive?

Manager: Because it takes so much management time, both in recruitment and in bringing people up to speed.

Analyst: Why do you think we need to focus on reducing attrition rather than reducing the management time it takes for each hire?

Manager: You’re right. We need to investigate both and understand what possibilities would yield more value from the investment.

How are you going to realise value from this analysis?

It’s important at an early stage to consider how you are going to implement the results of any analysis. Will the findings drive policy changes? Will you be creating a predictive model that gives you an individual risk score for each employee?

This is important because it has implications for what data you can bring into your model. Because data needs to be captured for a predefined purpose, providing a personal ‘score’ restricts what data can be used in your model. Working with anonymous data sets at an aggregate level may enable you to do far more with the data and give you far more flexibility in your model building.
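As a rough illustration of aggregate-level working, the sketch below averages a survey score per team and suppresses any team too small to report safely. The minimum group size of 5, the function name and the records are all assumptions made for the example, not a recommended standard.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 5  # assumed threshold; agree the real value with privacy and legal teams


def aggregate_scores(records, min_group_size=MIN_GROUP_SIZE):
    """Average a score per group, dropping groups too small to report."""
    by_group = defaultdict(list)
    for group, score in records:
        by_group[group].append(score)
    return {
        group: round(mean(scores), 2)
        for group, scores in by_group.items()
        if len(scores) >= min_group_size
    }


# Made-up (team, survey score) pairs; no employee identifier is kept.
records = [("Sales", 4), ("Sales", 3), ("Sales", 5), ("Sales", 4), ("Sales", 4),
           ("Legal", 2), ("Legal", 3)]
summary = aggregate_scores(records)
print(summary)
```

Here the two-person Legal team is left out of the summary because a result for so small a group could point back to individuals.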

How you will realise value will also drive which types of modelling you want to do. Does your model need to be easily interpretable (needed for policy, process or training changes) or could a black-box model be sufficient? If you use a black-box model, how are you going to determine that you’re not at risk under discrimination regulations? (Hint: simply leaving a gender variable out of your model won’t prevent it from being discriminatory, because other variables can act as proxies for gender.)
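One simple way to check a model's outputs, sketched here under stated assumptions, is to compare how often each group is flagged even though the group variable was never a model input. The function names and data are invented for illustration, and what ratio counts as acceptable is a legal question this sketch does not answer.

```python
def flag_rates(flags, groups):
    """Share of people flagged by the model within each group."""
    totals, flagged = {}, {}
    for flag, group in zip(flags, groups):
        totals[group] = totals.get(group, 0) + 1
        flagged[group] = flagged.get(group, 0) + int(flag)
    return {g: flagged[g] / totals[g] for g in totals}


def rate_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means identical rates)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Made-up model outputs: True = flagged as "high attrition risk".
# Gender was NOT a model input; it is held back only for this audit.
flags = [True, True, True, False, True, False, False, False, True, False]
gender = ["F", "F", "F", "F", "F", "M", "M", "M", "M", "M"]

rates = flag_rates(flags, gender)
ratio = rate_ratio(rates)
print(rates, round(ratio, 2))
```

A ratio well below 1.0, as in this invented data, suggests that some other input is acting as a proxy for the protected characteristic and that the model needs closer review.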

What could be driving this behaviour?

The next important action is to identify a series of explanations which could be causing or influencing what you’re studying. There are three main sources that we tend to use:

Desk research: What have others identified as causes or correlations?
Brainstorm: Get together a group of key stakeholders to identify what they believe the causes are. This also helps socialise the problem.
Ask employees: Short open-question questionnaires (like Workometry) sent to as wide a population as you can will help you get an extensive list of possible causes. (We want to do an exploratory analysis at this stage.) Our experience is that there will be a significant difference between this list and the stakeholder list.

What data do you need to test each possible explanation?

Now that you’ve identified the potential causes, you need to identify the variables you could use in your model to test each potential relationship. Again, this is an instance where being clear about what you are actually trying to measure is critical.

Some of the information that you need you will have in traditional employee systems, but it’s not likely to be enough. You may have data in other business systems but you might need to acquire new data.
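A lightweight way to keep this step systematic is to write down, for each explanation, the variables needed to test it and then check them against what existing systems hold. The explanations, variable names and system names below are all hypothetical placeholders.

```python
# Hypothetical explanations mapped to the variables that would test them.
hypotheses = {
    "Long commute drives attrition": ["commute_minutes", "left_company"],
    "Poor onboarding drives attrition": ["onboarding_rating", "left_company"],
    "Pay below market drives attrition": ["salary", "market_salary", "left_company"],
}

# Assumed inventory: which variables each existing source can supply.
available = {
    "HR system": {"salary", "left_company"},
    "Facilities system": {"commute_minutes"},
}


def data_gaps(hypotheses, available):
    """For each explanation, list the variables no existing source provides."""
    have = set().union(*available.values())
    return {h: [v for v in needed if v not in have]
            for h, needed in hypotheses.items()}


gaps = data_gaps(hypotheses, available)
for hypothesis, missing in gaps.items():
    print(hypothesis, "->", missing or "covered by existing data")
```

In this made-up inventory, the onboarding rating would have to be captured from employees and the market salary acquired from an external source.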

Lots of data is available online from various credible data sources. Numerous governments and organizations like the UN publish great databases which can help you understand what is going on outside the organization with things such as the labour market or populations.

What new data will you need to capture?

It’s highly likely that you will need to capture new information to validate some of your ideas. In many instances you’ll have to ask people directly.

There are numerous data capture methods that you can use. However, the process by which you solicit information is often at least as important as the questions you ask. You need to identify approaches which require low input from both the organization and the individuals concerned. If you will need to understand this on an ongoing basis, you need to make sure the approach is sustainable.

How will you measure the success of any changes?

Finally, before implementing any changes it’s important to identify how you are going to measure the impact of your changes.

It’s likely in most situations that this will have to be a hybrid approach – some measurements will need to be quantitative. Others are likely to be perception-based.
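For the quantitative part, one common option (an assumption here, not the only valid design) is to compare the change in a pilot group with the change in a comparison group over the same period, often called a difference-in-differences. The function name and figures below are invented for illustration.

```python
from statistics import mean


def diff_in_diff(pilot_before, pilot_after, comparison_before, comparison_after):
    """Change in the pilot group minus the change in the comparison group."""
    pilot_change = mean(pilot_after) - mean(pilot_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return pilot_change - comparison_change


# Made-up monthly attrition rates (%) for three months before and after a change.
effect = diff_in_diff(
    pilot_before=[2.0, 2.5, 3.0], pilot_after=[1.5, 2.0, 2.5],
    comparison_before=[2.0, 2.0, 2.0], comparison_after=[1.5, 2.0, 2.5],
)
print(round(effect, 2))
```

Subtracting the comparison group's change helps separate the effect of your intervention from whatever else moved the measure over the same months, provided the two groups would otherwise have followed similar trends.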

Unfortunately, all changes within an organization are likely to have unintended consequences. Also, given the complexity of organizations, it’s unlikely that your model will be stable over time, so you need to identify when the model will need reviewing.

The use of exploratory, open-text questions on a regular basis will enable you to monitor when new reasons emerge.

People Analytics | Andrew Marritt | March 8, 2018