Handle the Quirks of Upward Evaluation Surveys
Online surveys give law departments and law firms a flexible way to obtain upward evaluations or up-the-chain evaluations. An upward evaluation covers someone’s direct manager, whereas an up-the-chain evaluation also includes that manager’s manager. Bear in mind the following points about this potent kind of survey.
Time Period: When you ask employees to evaluate their managers, narrow the period over which they are to reflect on their manager’s behavior. The past six months is probably too short: it limits the number of evaluations you receive, because some employees might have worked with only a single manager during that period. The past twelve or eighteen months would be a solid choice and would enlarge the pool of managers eligible for evaluation.
Knowledge Depth: The survey should ask how much interaction a respondent had with the person they are evaluating. You might ask, “How often have you interacted with the manager you are evaluating?” on a scale of quarterly, monthly, bi-weekly, weekly, and daily. Those answers let you weight more heavily the respondents who show greater familiarity, as in the sketch below. As a side note, you might want the survey instructions to explain that respondents should take into account only direct knowledge, not what they have heard from other people (no hearsay).^[It is a completely different kind of question, albeit one that is both legitimate and able to sidestep difficult topics, to ask, “What would be your colleagues’ consensus view on the sensitive topic?”]
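To make that weighting concrete, here is a minimal Python sketch. The frequency-to-weight mapping is an assumption chosen for illustration, not a standard; any scheme that rises with contact frequency would serve.

```python
# Hypothetical weights: more frequent contact earns a larger weight.
FAMILIARITY_WEIGHTS = {
    "quarterly": 1.0,
    "monthly": 1.5,
    "bi-weekly": 2.0,
    "weekly": 2.5,
    "daily": 3.0,
}

def weighted_average_rating(responses):
    """Average ratings, weighting each respondent by contact frequency.

    `responses` is a list of (rating, frequency) pairs, e.g. (4, "weekly").
    """
    total = sum(r * FAMILIARITY_WEIGHTS[f] for r, f in responses)
    weight_sum = sum(FAMILIARITY_WEIGHTS[f] for _, f in responses)
    return total / weight_sum

# A daily collaborator's 4 counts three times as much as a quarterly one's 2.
print(weighted_average_rating([(4, "daily"), (2, "quarterly"), (3, "weekly")]))
```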
Unsure Penetration: If you leave it to the invitees to decide which manager(s) they will evaluate, you cannot precisely fix the total potential survey population. Whether to evaluate someone is partly a subjective judgment by each employee. How can you know whether you have collected data on 30 percent of manager-supervisee relationships or on 80 percent?
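If an HR roster lets you count manager-supervisee pairs, you can at least estimate penetration after the responses arrive. A hypothetical calculation (both counts are invented):

```python
# Coverage of manager-supervisee relationships; counts are invented.
total_relationships = 240   # pairs on the HR roster
evaluations_received = 96   # evaluations actually submitted

coverage = evaluations_received / total_relationships
print(f"Coverage: {coverage:.0%}")  # Coverage: 40%
```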
Insufficient Knowledge: Here is a potentially ambiguous evaluation criterion that illustrates a difficulty with upward evaluations: “Click 1 for Unacceptable, 2 for Needs improvement, 3 for Meets standard, 4 for Exceeds standard, 5 for Outstanding, or the right column if you have not experienced or observed the behavior.” It might have been better to offer “deny knowledge or information sufficient to form a belief” or “not applicable.” As worded, “have not experienced or observed the behavior” might refer to a positive action the respondent has never seen, and so read as a backhanded criticism (“Never saw him praise anyone”). The instructions clarified the evaluation standard somewhat: “If you lack a basis for giving an assessment rating, click the right column.” But not seeing behavior that a good manager should exhibit is tantamount to a downvote.
Grade Inflation: Despite lavish doses of reassurance, employees still worry that their replies will somehow leak and that their manager will learn who gave which evaluations. One reaction to that risk is to award grand ratings, a.k.a. grade inflation. Flattery on a survey shows up in answers heavily skewed toward the top of the scale (e.g., 7 out of 7 or “Outstanding”).
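One crude screen for grade inflation is the share of ratings in the top boxes of the scale. The data and the 60 percent threshold below are invented for illustration:

```python
from collections import Counter

ratings = [5, 5, 4, 5, 5, 5, 4, 5, 3, 5, 5, 4]  # hypothetical 1-5 ratings
counts = Counter(ratings)
top_share = (counts[4] + counts[5]) / len(ratings)

# An assumed rule of thumb: more than 60 percent in the top two boxes
# suggests flattery rather than candor.
if top_share > 0.60:
    print(f"{top_share:.0%} of ratings are 4s or 5s; suspect inflation")
```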
Confidentiality: You need to work overtime to convince people that you will cloak their answers in secrecy, which requires clear and consistent communication. Retaining a third party to collect and protect the data and then prepare the report helps to a degree. Not asking for identifying demographic information also preserves anonymity. Of course, aggregating the response data obscures individual values, but you have to demonstrate that to naïve invitees. Early on, before a survey has amassed responses, individuals might worry that they will be the lone respondent and somehow thereby outed. Once they learn that many of their similarly situated colleagues have taken the survey, they can relax.
Scrub Comments: You don’t want anyone who is evaluated to be able to figure out who left a verbatim comment about them. That risk arises only if you provide managers with the pertinent verbatim text rather than coding the comments into categories, since coding erases incriminating details.
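If you do pass verbatims along, a scrubbing pass can redact the most obvious giveaways. The redaction list below is hypothetical; in practice it would come from your roster of names, offices, and team labels:

```python
import re

# Hypothetical terms to redact before verbatims reach the manager.
REDACT = ["Jane", "Chicago office", "securities team"]

def scrub(comment: str) -> str:
    """Replace each listed term with a neutral placeholder."""
    for term in REDACT:
        comment = re.sub(re.escape(term), "[redacted]", comment,
                         flags=re.IGNORECASE)
    return comment

print(scrub("Jane told the Chicago office that our deadlines slipped."))
# -> [redacted] told the [redacted] that our deadlines slipped.
```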
Weights: Without demographic information, you cannot weight a lawyer’s ratings, for example, more heavily than a paralegal’s. Yet it is demographic information that exposes the respondent to identification. You can’t be anonymous if someone knows you have been with the law department 14 years, report directly to the general counsel, and specialize in labor and employment.
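If you do collect coarse demographics, a pre-publication check can flag combinations rare enough to identify someone. A sketch with invented data, assuming a minimum cell size of three:

```python
from collections import Counter

respondents = [  # hypothetical (role, tenure, specialty) tuples
    ("lawyer", "10+ yrs", "labor & employment"),
    ("lawyer", "10+ yrs", "labor & employment"),
    ("lawyer", "10+ yrs", "labor & employment"),
    ("paralegal", "0-5 yrs", "litigation"),
]
MIN_CELL_SIZE = 3  # assumed threshold

# Any combination shared by fewer than three respondents invites
# re-identification and should be reported only in aggregate.
for combo, n in Counter(respondents).items():
    if n < MIN_CELL_SIZE:
        print(f"risky combination ({n} respondent(s)): {combo}")
```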
Multiple Responses: If the data collected by the questionnaire contains nothing that identifies the respondent, and if the survey allows more than one response from the same IP address (which you need to permit when a respondent has the experience to justify more than one evaluation), you run a risk of duplication. The survey-hosting software may allow people to submit more than one response, so a particularly disgruntled person, or a strong champion of an issue, might submit several, including several for the same manager. What keeps someone from doing so repeatedly for the same person, for a manager they have never worked for, or even for themselves?
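One partial defense is to screen for repeated (IP address, manager) pairs, assuming the survey host logs an address even when the answers themselves stay anonymous. A sketch with invented submissions:

```python
from collections import Counter

submissions = [  # (ip_address, manager_evaluated) pairs, invented
    ("10.0.0.7", "Manager A"),
    ("10.0.0.7", "Manager A"),  # same IP, same manager: flag for review
    ("10.0.0.7", "Manager B"),  # same IP, different manager: legitimate
    ("10.0.0.9", "Manager A"),
]

for pair, n in Counter(submissions).items():
    if n > 1:
        print(f"possible duplicate ({n} submissions): {pair}")
```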
Multiple Managers: Employees who have the familiarity to evaluate more than one manager need a seamless way to end one survey and start another, and they need to know that their views on multiple managers are desired. That transition might be achieved with a conditional-logic question, with text instructions, or with the thank-you message the survey generates upon submission.
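In code, the transition amounts to little more than the loop below; real survey platforms express the same idea through their conditional-logic settings. The two callback functions are hypothetical stand-ins for survey pages:

```python
def run_survey(collect_evaluation, ask_yes_no):
    """Collect evaluations until the respondent declines to continue.

    `collect_evaluation` and `ask_yes_no` are placeholders for survey pages.
    """
    evaluations = []
    while True:
        evaluations.append(collect_evaluation())
        if not ask_yes_no("Would you like to evaluate another manager?"):
            return evaluations

# Demo with canned answers in place of real survey pages.
canned = iter(["rating for Manager A", "rating for Manager B"])
answers = iter([True, False])
print(run_survey(lambda: next(canned), lambda prompt: next(answers)))
# -> ['rating for Manager A', 'rating for Manager B']
```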
Patterns over Time: Can we detect patterns in evaluations over time? It is plausible to me that the earliest responders hold more extreme views than later responders. It is also plausible that people with critical views wait to see how many responses have come in, so negativity would increase over time.
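Both hypotheses are testable if each response carries a timestamp. A sketch that splits responses into early and late halves and compares their means (the data are invented):

```python
responses = [  # (day_received, rating) pairs, invented
    (1, 5), (1, 1), (2, 5), (3, 4), (5, 3), (8, 2), (9, 2), (10, 2),
]

midpoint = len(responses) // 2
early = [rating for _, rating in responses[:midpoint]]
late = [rating for _, rating in responses[midpoint:]]

# If late responders are more critical, the late mean should be lower.
print(f"early mean: {sum(early) / len(early):.2f}")  # 3.75
print(f"late mean:  {sum(late) / len(late):.2f}")    # 2.25
```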