Addressing Rater Bias In Assessments
Performance appraisals have long been considered to tell us far more about the appraisER than about the appraisEE. Multirater feedback (or 360) is no exception. For years HR has been plagued by issues arising from poor-quality data, and actual or potential rater bias has been at the center of those issues.

Our client, a major retail bank, was no exception either. For years, annual performance appraisals had been collected and were known to reflect a range of rater biases, from what the bank referred to as “Attila the Hun” ratings, which were always harsh, to “Cuddly teddy bear” ratings, which were always at the top of the scale. As is common in financial services, pay decisions were largely based on performance appraisal ratings. Complaints were common, and the bank had been forced to introduce budgetary controls and to break the original mathematical link between appraisal rating and merit award. Managers had to distribute the budget based on their ratings. This, of course, led to individuals with the same ratings receiving dramatically different awards with little constructive explanation, and to managers abdicating from the debate by explaining that they had awarded a high rating but “The system forced [them to only give a small award]”, etc.

For years, the client had tried the classic solutions of redesigning the form, changing the rating scale, and running “Performance Appraisal” training, all with little effect.

The client chose to work with Pilat to address the underlying issues and put in place a process that would engineer the desired behavior. At this stage the client did not have a web-based performance management system and wanted a paper-based process that could be extended to its many remote and often small branches.

Pilat worked with the client over a 5-year period on the following issues:

  • Defining Good Performance
    While individuals had goals and competency models, the processes did not truly help them to define these in day-to-day terms. Language was vague and behavioral indicators were described in academic terms. Pilat worked with the client to define these in terms that employees would use every day, making the link with rating definitions much clearer.
  • Defining Poor Performance
    As above, but also asking “What does very bad performance actually look like?” Most appraisers believe that none of their staff ever display poor performance. Often this is because, in HR, we always define everything in positive terms; we don’t help appraisers to see how many day-to-day behaviors truly are poor. The lower performance ratings have to be defined in terms that ensure they do get used (e.g., “Always starting meetings a little late because you are waiting for one or two people who are [always!] late” is also poor performance).
  • Collecting more elemental and hierarchical assessments
    Pilat developed an appraisal that combined the best of focused narrative questions (seeking pithy, evidenced comments) and a substantial number of ratings. The ratings were multi-layered, enabling the appraiser to build up their overall assessment.
  • Collecting the data and completing statistical analyses

The appraisals were accumulated and the data automatically captured and analyzed. From this a number of outputs were produced:

  • Potential rater bias was detected (there was now sufficient data from each appraiser to do this) and, using a process of standardization, an individual computed performance index was produced for each factor: goal achievement, display of each competency, display of cumulative competency, and overall performance. These were effectively de-biased scores that could then be used for processes such as merit awards, without the need for budget capping.
  • Using regression analysis, the relative value being attached to each elemental factor was computed by division, by department, and for the bank overall (e.g., to what extent is “quality” impacting overall perceptions of performance, or to what extent is “customer service” impacting overall perceptions of performance?). This revealed that many “values” were only being given lip service. Despite millions having been spent on customer service and quality training and processes, the analysis showed that those who (a) played the internal politics and (b) worked exceedingly hard on pure output issues were the most valued.
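
The regression step described above can be illustrated with a small sketch. The case study does not say which regression model was used, so this assumes an ordinary least-squares fit on standardized ratings, where each coefficient indicates how strongly a factor rating tracks the overall rating. The factor names, the ratings, and the helper functions `zscores`, `solve` and `factor_weights` are all invented for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of ratings to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("a factor with identical ratings cannot be standardized")
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    rows = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                f = rows[r][col] / rows[col][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def factor_weights(factors, overall):
    """Return a standardized regression weight for each factor.

    factors: dict mapping a factor name to its list of ratings
    overall: list of overall ratings, in the same employee order
    """
    names = list(factors)
    cols = [zscores(factors[name]) for name in names]
    y = zscores(overall)
    k = len(names)
    # Normal equations (X'X) beta = X'y; no intercept is needed because
    # every variable has been centered by the standardization.
    xtx = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(cols[i], y)) for i in range(k)]
    return dict(zip(names, solve(xtx, xty)))


# Invented ratings for eight employees (not data from the case study).
ratings = {
    "output": [1, 2, 3, 4, 5, 2, 4, 3],
    "customer_service": [3, 3, 4, 2, 5, 1, 2, 4],
}
overall = [1, 2, 3, 4, 5, 2, 4, 4]

weights = factor_weights(ratings, overall)
for name, beta in weights.items():
    print(f"{name}: {beta:.2f}")
```

In this invented data set the weight for "output" comes out far larger than the weight for "customer_service", which is the kind of pattern that would suggest a stated value is receiving only lip service. Running the same fit separately per division or department would give the breakdown described above.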

  • Providing appraisers with rater feedback

All of the data was analyzed, and individual appraisers were provided with a detailed report on their rating patterns and perceived rating skills. They were told about apparent harshness/leniency, inter-factor differentiation, inter-appraisee differentiation, possible halo/horns effects, etc.
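
As a rough illustration of how harshness/leniency, differentiation, and the standardization mentioned earlier might be computed, here is a minimal sketch. It assumes a simple within-rater z-score, which is only one possible standardization and is not confirmed as the method Pilat used. It also assumes that each rater's group of appraisees performs similarly on average, so that differences in rater means reflect the rater and not the staff. The rater names, the ratings, and the functions `rater_stats` and `standardize` are invented.

```python
from statistics import mean, pstdev


def rater_stats(ratings):
    """Summarize each rater's use of the scale.

    ratings: list of (rater, appraisee, score) tuples
    Returns, per rater: the mean score, the spread (standard deviation,
    a crude measure of differentiation) and the leniency (rater mean minus
    the mean of all ratings; negative values suggest a harsh rater).
    """
    grand_mean = mean([score for _, _, score in ratings])
    by_rater = {}
    for rater, _appraisee, score in ratings:
        by_rater.setdefault(rater, []).append(score)
    return {
        rater: {
            "mean": mean(scores),
            "spread": pstdev(scores),
            "leniency": mean(scores) - grand_mean,
        }
        for rater, scores in by_rater.items()
    }


def standardize(ratings):
    """Re-express every score relative to its rater's own mean and spread."""
    stats = rater_stats(ratings)
    result = []
    for rater, appraisee, score in ratings:
        s = stats[rater]
        # A rater who gives everyone the same score carries no information
        # about relative performance, so those scores are mapped to 0.
        z = (score - s["mean"]) / s["spread"] if s["spread"] else 0.0
        result.append((rater, appraisee, z))
    return result


# Invented example: one harsh rater and one lenient rater.
ratings = [
    ("attila", "A", 1), ("attila", "B", 2), ("attila", "C", 3),
    ("teddy", "D", 3), ("teddy", "E", 4), ("teddy", "F", 5),
]

stats = rater_stats(ratings)
indexes = {appraisee: z for _, appraisee, z in standardize(ratings)}
print(stats["attila"]["leniency"], stats["teddy"]["leniency"])
print(round(indexes["C"], 3), round(indexes["F"], 3))
```

Here appraisee C (rated 3 by the harsh rater) and appraisee F (rated 5 by the lenient rater) end up with the same standardized index, because each is the top-rated person in their rater's group. Halo/horns detection would need ratings per factor for each appraisee and is not covered by this sketch.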

The net results were:

  • Immediate cessation and major redesign of corporate Customer Service and Quality training
  • Redefinition of the corporate Values
  • Refining of the corporate Competency model
  • Substantial improvement in appraisal completion (process compliance)
  • Substantial increase in differentiation in appraisal ratings (within and across appraisals)
  • Substantial decrease in overall performance ratings
  • Dramatic time saving in determination and finalizing of merit awards
  • Increased employee satisfaction with the appraisal process, as measured through a post-implementation study. This was also measured independently and confirmed by means of the annual employee survey.
  • Increased focus in requests for training, with employees and their managers making greater use of the competency model to articulate their needs.

As the bank extends into web-based systems, Pilat solutions for addressing rater bias can also be expanded to include:

  • Real-time rater feedback. On entering a set of feedback (self-assessment or provider assessment), the pattern of ratings can be automatically analyzed by comparing the data to the known use of rating scales within the organisation. An array of possible messages is then passed back to the person submitting the data. These all start with:

“Thank you for this submission. The data will be stored and used at the appropriate time.”

and end with:

“Click HERE if you wish to the feedback you have just submitted.”

Depending on the patterns detected in the data, a number of additional messages can be included such as:

“The average rating is relatively high and potentially places the individual in the top 10% of all employees. Is he/she really that good?”

“You have made relatively narrow use of the rating scale. This will make it hard for the recipient to identify her/his perceived relative strengths and limitations. Are you sure that he/she does not have any more clearly identifiable relative strengths and limitations?”

  • Rating standard monitoring: giving senior managers the ability to monitor the distributions of ratings being contemplated by their subordinates when rating their staff (before disclosure to the appraisees), and allowing them to initiate corrective action with the appraisers if appropriate.
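
The real-time rater feedback described above could work along the following lines. This is a sketch only: the two conditional messages are taken from the examples quoted earlier, but the trigger rules, the `min_range` threshold, the reference list `org_averages`, and the function name `pattern_messages` are assumptions made for illustration. The fixed opening and closing lines are left to the caller.

```python
from statistics import mean

HIGH_AVERAGE_MSG = (
    "The average rating is relatively high and potentially places the "
    "individual in the top 10% of all employees. Is he/she really that good?"
)
NARROW_RANGE_MSG = (
    "You have made relatively narrow use of the rating scale. This will make "
    "it hard for the recipient to identify her/his perceived relative "
    "strengths and limitations. Are you sure that he/she does not have any "
    "more clearly identifiable relative strengths and limitations?"
)


def pattern_messages(ratings, org_averages, min_range=2):
    """Return the conditional messages triggered by one submission.

    ratings: the scores entered in this submission
    org_averages: average ratings of earlier submissions across the
        organisation (a hypothetical reference data set)
    min_range: smallest gap between highest and lowest score that is not
        treated as narrow use of the scale (an assumed threshold)
    """
    messages = []
    average = mean(ratings)
    # "Top 10%" rule: the average beats at least 90% of the reference set.
    beaten = sum(1 for a in org_averages if a < average)
    if beaten >= 0.9 * len(org_averages):
        messages.append(HIGH_AVERAGE_MSG)
    # Narrow-range rule: highest and lowest scores are too close together.
    if max(ratings) - min(ratings) < min_range:
        messages.append(NARROW_RANGE_MSG)
    return messages


# Invented reference data and two invented submissions.
org_averages = [2.0, 2.5, 3.0, 3.0, 3.2, 3.5, 3.5, 3.8, 4.0, 4.2]
lenient = pattern_messages([5, 5, 4, 5, 5], org_averages)
balanced = pattern_messages([2, 4, 3, 5, 3], org_averages)
print(len(lenient), len(balanced))
```

The first submission triggers both messages, while the second triggers none. The same per-rater summaries, aggregated by manager, would give senior managers the distribution view needed for rating standard monitoring.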

© 2011 Pilat HR Solutions