More R in HR

This article was first published on 14 August 2012 on the HR Tech Europe Blog

Look at most HR departments’ analyst teams and you’ll probably see extensive use of Excel for a wide variety of tasks, from analysis and basic statistics to visualisation. Excel can certainly handle these tasks, but it can be hard work, and maintaining quality is very hard. Some analysts use other tools, especially SPSS (probably given its popularity in university psychology departments), which certainly does many things better but has weaknesses of its own, cost being just one of them.

When I was starting OrganizationView I knew that getting an easy-to-use and powerful analytic platform was critical, and during my conversations with many statisticians and analysts about their choices one tool kept coming up: R.

R is a free statistical programming language that is rapidly becoming the standard in university departments and in the statistical teams of many big firms (especially ‘Big Data’ firms and those in the life-sciences and finance sectors).

Is R really suitable for the HR analyst? To find out, I spoke with Ajay Ohri, whose forthcoming book ‘R for Business Analytics’ aims to answer this question.

“R is a statistical software but I’m not a statistician. I was an engineer who did an MBA and had a variety of commercial roles before starting my own consulting practice.”

We started by discussing the current landscape in HR.

“There is a lot of data out there and it’s stored in different formats. Spreadsheets have their uses but they’re limited in what they can do. The spreadsheet is bad when getting over 5000 or 10000 rows – it slows down. It’s just not designed for that. It was designed for much higher levels of interaction.

“In the business world we really don’t need to know every row of data. We need to summarise it, we need to visualise it and put it into a PowerPoint to show to colleagues or clients.”

Ajay and I discussed our initial experiences, common I guess to many, of downloading and starting R and being struck by the initial learning curve.

“My first thoughts were ‘this is R, it’s open source but it’s difficult to learn’. Then I started to use Graphical User Interfaces: RCommander and Rattle.

“My book is written from the perspective of an ordinary guy who just wants to finish his project and go home on time. In every software there are tips and tricks that you learn over time. I’m just exposing these tricks. R isn’t too tough, you just need to use the right interface.

“For those analysts used to a tool like SPSS, Deducer offers a GUI solution which is very close to the SPSS experience. Deducer is also meant for visualising using ggplot, meaning that you can do some very sophisticated things without learning a line of code.”

Deducer wasn’t a tool I was familiar with before speaking to Ajay, my previous GUI of choice being the development environment RStudio. One area where I will certainly use it in future is designing and producing visualisations for print. In a matter of minutes I had loaded a big survey dataset, created a very good-looking and sophisticated visualisation and saved it as a PDF which I could perfect in a tool like Illustrator.

Whilst R can handle the routine tasks that an analyst needs to do with ease, and often with a fair degree of elegance, it really starts to come into its own when you want to extend it via the available packages. As Ajay notes:

“The beauty of R is that there are 3500 different packages. It’s like the top 500 universities in the world giving you software for free.”

R is one of the tools we use for building models and seeing populations in new ways. For example, we’re firm believers in segmenting the workforce based on similarities in their behaviour or opinions using clustering techniques. Ajay recommends the GUI Rattle for clustering.
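
For readers curious what such a segmentation looks like in code rather than in a GUI, here is a minimal sketch using the k-means function that ships with base R. Everything in it is an assumption made for illustration: the survey data is simulated, the column names (engagement, workload) are invented, and the choice of three segments is arbitrary. It is not a description of how we or Rattle actually build segments.

```r
# Illustrative sketch only: simulated survey scores, not real HR data.
set.seed(42)  # make the simulated data and the cluster starts reproducible

n <- 150
# Three made-up groups of employees with different typical scores
survey <- data.frame(
  engagement = c(rnorm(n / 3, 4.2, 0.3), rnorm(n / 3, 3.0, 0.3), rnorm(n / 3, 1.8, 0.3)),
  workload   = c(rnorm(n / 3, 2.0, 0.3), rnorm(n / 3, 3.5, 0.3), rnorm(n / 3, 4.5, 0.3))
)

# Standardise so each question contributes equally to the distance measure
scaled <- scale(survey)

# k-means with 3 segments; nstart = 25 tries 25 random starts and keeps the best
fit <- kmeans(scaled, centers = 3, nstart = 25)

# Attach the segment label and look at the average scores per segment
survey$segment <- fit$cluster
aggregate(cbind(engagement, workload) ~ segment, data = survey, FUN = mean)
```

In practice the number of segments is a judgment call, and it is worth trying several values and checking whether the resulting groups make sense to the business.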

After speaking to Ajay, I wish that I had known some of his tips and tricks when I started my R journey a few years ago. As he notes:

“The great thing about R is that it cuts down the software budget to zero, and GUIs like Deducer cut down training to weeks. You can quickly move from spreadsheet world to being able to build models and predict outcomes.”