Improvements to coding quality: autumn 2022

Our goal at OrganizationView is to help our clients understand, and take effective action on, large volumes of multilingual employee feedback. We see our role as helping employees be heard and leadership hear what is really happening.

To that end we spend a lot of time trying to understand how executives can use the text data their organizations are capturing to surface insight that drives effective decision making. Much of the really useful insight doesn’t come from the most common themes - good executives are probably already aware of those views - but from finding answers that help inform key business questions.

Our latest set of innovations has been focused on helping clients achieve that goal.

Increasing granularity of key themes

An important aspect of any coding model used for employee survey comments is that it needs to be granular enough. As an analyst you want to be able to differentiate between comments, ideally sorting them into ‘buckets’ that share an appropriate management response.

We continue to identify new themes, at the moment often by taking larger themes and making them more granular. This isn’t always easy. In analysing verbatims it’s a lot easier to merge granular themes than to separate larger topics into subcomponents.

There are a few key principles that we apply with our coding model:

  • Themes should be mutually exclusive

  • Themes should be objective (we favour themes with low inter-reviewer disagreement - see the sketch after this list)

  • You need a process for managing theme evolution
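
As a rough illustration of the second principle, inter-reviewer agreement can be checked with a statistic such as Cohen’s kappa. The sketch below is a minimal, hypothetical example using scikit-learn; the theme labels are illustrative rather than drawn from our production model.

    # Minimal sketch: measuring inter-reviewer agreement on theme labels.
    # Assumes two reviewers have independently coded the same sample of comments.
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical labels from two reviewers for the same five comments
    reviewer_a = ["Pay", "Career growth", "Benefits", "Pay", "Workload"]
    reviewer_b = ["Pay", "Career growth", "Pay", "Pay", "Workload"]

    kappa = cohen_kappa_score(reviewer_a, reviewer_b)
    print(f"Cohen's kappa: {kappa:.2f}")  # values near 1 suggest an 'objective' theme set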

At the theme level we’re trying to define unique topics; later steps in the analysis then illustrate how each of these is being used.

For example, in our model over the last year or so we’ve had over 20 different themes that describe various parts of the benefits package, from the pension contribution through company cars to gyms and sports clubs. In many organizations different individuals might be responsible for managing each of these so differentiation will help channel feedback to the relevant team.

We’re currently using over 250 different themes that can be successfully identified with the base model, before any client-specific fine-tuning. We continue to expand this set, often by splitting larger themes into subcomponents.

On top of these themes we can provide higher-level groupings. We believe these should relate to stakeholders, and ideally reflect the organization structure, enabling clients to direct feedback to where it can be resolved.
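
As a simple illustration of such a grouping (the theme and team names here are hypothetical), this can be as little as a mapping from granular themes to the team responsible, so coded feedback can be routed accordingly:

    # Sketch: rolling granular themes up into stakeholder-oriented groups.
    # Theme and group names are illustrative only, not our actual model.
    THEME_TO_GROUP = {
        "Pension contribution": "Reward & Benefits",
        "Company car": "Reward & Benefits",
        "Gym and sports clubs": "Reward & Benefits",
        "WiFi reliability": "IT Infrastructure",
        "Workday usability": "HR Systems",
    }

    def route(theme: str) -> str:
        """Return the team responsible for acting on a coded comment."""
        return THEME_TO_GROUP.get(theme, "Unassigned")

    print(route("Company car"))  # -> Reward & Benefits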

Even more granularity - identifying products, people, competitors, abbreviations etc.

Whilst we don’t see these as themes in their own right, our clients often want more detail about what is being mentioned. For example, the technology team might want to understand which technologies are causing issues (is it the WiFi or Workday?) or to identify which answers relate to challenges with Slack.

Our new approach first builds a list of these entities. It then queries a knowledge base to provide more detail. For example, it links a business to any holding company, determines its sector (useful for identifying whether it’s a competitor) and adds other metadata. It also enables easy disambiguation by linking different names to the same entity.
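
As a rough sketch of the idea - not our production pipeline - an off-the-shelf named-entity recogniser such as spaCy can surface candidate entities, which are then enriched from a knowledge base, stubbed here as a small dictionary:

    # Sketch: extract candidate entities, then enrich them from a knowledge base.
    # The knowledge base here is a stub dictionary; in practice it would be a far
    # larger store that also records holding companies, sectors and aliases.
    import spacy

    nlp = spacy.load("en_core_web_sm")  # small general-purpose English model

    KNOWLEDGE_BASE = {  # illustrative entries only
        "Slack": {"type": "Product", "owner": "Salesforce", "sector": "Software"},
        "Workday": {"type": "Product", "owner": "Workday, Inc.", "sector": "HR software"},
    }

    def extract_and_enrich(text):
        doc = nlp(text)
        results = []
        for ent in doc.ents:
            if ent.label_ in {"ORG", "PRODUCT"}:
                results.append({"mention": ent.text, **KNOWLEDGE_BASE.get(ent.text, {})})
        return results

    print(extract_and_enrich("The Workday approval flow is much slower than Slack."))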

This information is useful for us when we’re fine-tuning models. The algorithm identifies abbreviations and links them to our knowledge of several thousand common business and society terms. This can then be used to increase accuracy in the coding models.
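
A much-simplified view of that abbreviation step (the glossary entries below are purely illustrative) is a lookup from detected acronyms to known expansions:

    # Sketch: detect acronym-like tokens and expand them from a glossary.
    # The glossary stands in for a much larger list of business and society terms.
    import re

    GLOSSARY = {"PIP": "performance improvement plan", "L&D": "learning and development"}

    def expand_abbreviations(text):
        def replace(match):
            token = match.group(0)
            return GLOSSARY.get(token, token)
        # crude pattern for short upper-case tokens, optionally containing '&'
        return re.sub(r"\b[A-Z][A-Z&]{1,5}\b", replace, text)

    print(expand_abbreviations("My manager put me on a PIP with no L&D support."))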

This approach is also powering our anonymisation methods for those clients who want cleaned text returned. We’ve found it is important to avoid blanket-cleaning of feedback. For example, John in accounts might be anonymised, but you might want to assign answers about Jane the CEO to a leadership theme. Leaving in mentions of famous people outside the organization can also help comprehension of the answer.
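
A heavily simplified sketch of that selective redaction, using spaCy again and a hypothetical allowlist of leadership and public figures, might look like this:

    # Sketch: redact person names unless they are on an allowlist of people
    # (e.g. named executives or public figures) whose mentions carry meaning.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    ALLOWLIST = {"Jane"}  # hypothetical: the CEO from the example above

    def anonymise(text):
        doc = nlp(text)
        redacted = text
        for ent in reversed(doc.ents):  # work backwards so character offsets stay valid
            if ent.label_ == "PERSON" and ent.text not in ALLOWLIST:
                redacted = redacted[:ent.start_char] + "[PERSON]" + redacted[ent.end_char:]
        return redacted

    print(anonymise("John in accounts was great, but Jane never replies to my emails."))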

Answering new questions

In the analysis of any survey (or arguably any dataset) the findings will often raise many new questions. In the past many organizations would run a series of post-survey workshops to add more richness to their findings.

It’s also likely that business changes or management interest will result in new business questions being raised after a survey is completed. Is it possible to use survey data to inform these new requests?

What we wanted to do was to provide new ways of analysing the text answers with the aim of surfacing insight to answer these new questions. You can think of the challenge as finding needles in haystacks - there is a good chance there is information on other topics in the data but we need reliable ways of identifying it.

Our approach is to treat the text answers from recent surveys as a knowledge base which we can then question, in much the same way that we all use Google to answer our questions on the web. In the past we might have done this via keyword searches, but as search engines have improved we have increasingly been able to ask natural-language questions and have semantically relevant answers surfaced.
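
A minimal sketch of this kind of semantic querying, using the open-source sentence-transformers library (the model choice and the answers are illustrative, and this is not our production algorithm):

    # Sketch: treat survey answers as a searchable knowledge base and pose a
    # natural-language question against it via embedding similarity.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

    answers = [  # stand-ins for real survey verbatims
        "Good people keep leaving because promotions take years.",
        "The canteen food has improved a lot recently.",
        "I'm considering leaving as my work is never recognised.",
    ]

    query = "Why do employees leave?"
    query_emb = model.encode(query, convert_to_tensor=True)
    answer_embs = model.encode(answers, convert_to_tensor=True)

    scores = util.cos_sim(query_emb, answer_embs)[0]
    ranked = sorted(zip(answers, scores), key=lambda pair: float(pair[1]), reverse=True)
    for answer, score in ranked:
        print(f"{float(score):.2f}  {answer}")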

Take the example below. The question that our client had asked their employees was how to improve working for the firm - a pretty standard question for an employee survey. Facing a tight labour market and increased attrition, executives wanted to understand what was causing people to leave. Whilst employees hadn’t been directly asked about that topic, it is a question that we can pose to the recent survey answers via our new algorithm:

Why do employees leave

These are just a handful of the answers returned by this new query. Every one of these answers can be traced back to where in the business, and to which employee groups, it originated. We can use the answers to look for other answers which reference a suggested cause without mentioning attrition: “Who else is challenged with this issue?” So in the answers below we might want to explore topics around lack of promotions or recognition.

Using this iterative approach we’re able, within a short period of time, to build a rich qualitative picture to inform these challenges.

Is it something they’ve experienced, a statement or a suggestion?

One of the downstream tasks of text analysis is finding answers that inform your goal. In many instances, to inform decision making it’s useful to determine what the answer is showing - whether it is a suggestion, an experience, a statement or a wish. Classifying each sentence of each answer against these and other sentence types helps clients identify the answers they’re looking for and push results to those who have responsibility to take action.
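
As a rough, hypothetical illustration of sentence-type classification - our production models are trained specifically for this task, so the zero-shot pipeline below is only a stand-in:

    # Sketch: label a sentence as a suggestion, experience, statement or wish.
    # A general-purpose zero-shot classifier stands in for a purpose-trained model.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification",
                          model="facebook/bart-large-mnli")  # illustrative model choice

    SENTENCE_TYPES = ["suggestion", "experience", "statement", "wish"]

    sentence = "It would be great if managers gave feedback more often."
    result = classifier(sentence, candidate_labels=SENTENCE_TYPES)
    print(result["labels"][0])  # highest-scoring type, e.g. "suggestion"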

For example, let’s consider that an analyst might want to find answers that describe employees’ experiences of HR dealing with bullying. This becomes a simple query to show sentences where these three conditions apply. Clients can access these details via a few simple dashboard filters.
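
Behind the dashboard this is essentially a conjunction of three filters; a hypothetical pandas version (column names and values are illustrative, not our actual schema) might look like:

    # Sketch: combine theme, entity and sentence-type filters on coded sentences.
    import pandas as pd

    sentences = pd.DataFrame([
        {"text": "HR ignored my bullying complaint for months.",
         "theme": "Bullying", "entities": ["HR"], "sentence_type": "experience"},
        {"text": "HR should publish how bullying cases are handled.",
         "theme": "Bullying", "entities": ["HR"], "sentence_type": "suggestion"},
    ])

    mask = (
        (sentences["theme"] == "Bullying")
        & sentences["entities"].apply(lambda ents: "HR" in ents)
        & (sentences["sentence_type"] == "experience")
    )
    print(sentences[mask]["text"].tolist())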

Alternatively you might want to find suggestions for how to improve SAP. Here we can filter the answers to show only the relevant ones.

Summary

We believe the purpose of text analysis is to structure large volumes of feedback so that executives can make more effective and reliable decisions. Even a single text question can provide richer and more insightful data than a large number of Likert questions and can continue to provide value long after the original survey analysis.

We continue to explore decision making and identify how ‘AI’ can be used to guide and support analysts in making data-driven recommendations. We’re looking forward to sharing some of the improvements that we’ll be bringing to clients in the next quarter.

Previous updates

Summer 2022 coding improvements

News · Andrew Marritt