Detail the Credibility of Changes in Metrics Over Time

The raison d’être of a serial survey is for the sponsor to discover and proclaim changes in meaningful metrics over time. Based on the responses to their survey a year ago and the one this year, they want to shout from the rooftops that “Total legal spending as a percentage of revenue in the telecom industry climbed from 0.66 to 0.75 during the past year!” Setting aside quibbles about margin of error, readers who want to evaluate the legitimacy of such claims should focus on how similar this year’s respondents are to last year’s.

Ideally, a serial survey ought to have a stable group of respondents answering the same questions over comparable periods of time. If there is churn in the participant group – last year 150 took part, but this year only 100 of them returned, joined by 60 new ones – the claim becomes more problematic. If the phrasing of the key question, or its definitions or instructions, shifts – last year the survey asked about gross revenue and this year it asks only about revenue – the purported change becomes questionable. If the interval moved – annual revenue one year and fiscal-year revenue the next – year-over-year comparisons become specious.
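To make the churn arithmetic concrete, here is a minimal sketch in Python; the function name and the participant identifiers are hypothetical, invented only to reproduce the 150-to-160 example above.

```python
# Illustrative only: quantify churn between two waves of a serial survey.

def churn_summary(year1_ids, year2_ids):
    """Count carry-overs, dropouts, and new entrants between two waves."""
    y1, y2 = set(year1_ids), set(year2_ids)
    carryovers = y1 & y2   # answered both waves
    dropouts = y1 - y2     # answered only the first wave
    newcomers = y2 - y1    # answered only the second wave
    return {
        "year1_total": len(y1),
        "year2_total": len(y2),
        "carryovers": len(carryovers),
        "dropouts": len(dropouts),
        "new_entrants": len(newcomers),
        # Share of this year's respondents who also answered last year.
        "carryover_share_of_year2": len(carryovers) / len(y2),
    }

# Hypothetical identifiers matching the example above: 150 respondents last
# year, 100 of whom returned this year, joined by 60 new ones.
year1 = [f"co{i}" for i in range(150)]
year2 = [f"co{i}" for i in range(100)] + [f"new{i}" for i in range(60)]
print(churn_summary(year1, year2))
# {'year1_total': 150, 'year2_total': 160, 'carryovers': 100,
#  'dropouts': 50, 'new_entrants': 60, 'carryover_share_of_year2': 0.625}
```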

Doubt about the reliability of the asserted change in total legal spending deepens when we consider the echo chamber effect and the context of the question: from the invitation email, to the title of the survey, to the order of questions in the online questionnaire. It’s impossible to achieve ceteris paribus. On top of all that, were a significant external shock to have swept over the industry or group – perhaps a recession, a massive bankruptcy, a profound shaking from a new law or court decision – that too could call into question the purported change in a key metric.

A scrupulous survey report would not only highlight the change in the metric for the consistent, core group of respondents (and for the full group) but would also report honestly on any methodological challenges.

Even if the community of participants shows substantial similarity – say, 90 AMLAW 200 law firms answered last year, but only 65 of them returned this year and 35 new firms took the place of the dropouts – the sponsor should not shout about shifts in a benchmark metric. If the absolute number of participants declines but everyone in the second year also took part in the first, that poses fewer methodological problems (though a form of response bias or the echo chamber effect might still mar the data).
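Worked out in the same spirit, the hypothetical above looks like this (the numbers come from the paragraph; the variable names are mine):

```python
# Hypothetical AMLAW 200 example from the paragraph above.
answered_last_year = 90
returned_this_year = 65
new_this_year = 35

total_this_year = returned_this_year + new_this_year      # 100 firms answered this year
carryover_share = returned_this_year / total_this_year    # 0.65 of this year's group answered both
retention_rate = returned_this_year / answered_last_year  # ~0.72 of last year's group came back
print(total_this_year, round(carryover_share, 2), round(retention_rate, 2))  # 100 0.65 0.72
```

Even with roughly two-thirds of this year’s group carried over, the dropouts and newcomers can move a benchmark metric for reasons that have nothing to do with real change.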

These methodological difficulties confound all serial surveys. The touted “continuity” of a benchmark survey may be a Potemkin village. Yes, nearly 50 years ago Equitable’s legal department commissioned Price Waterhouse to do a consulting project, from which emerged the distant ancestor of one of today’s staffing and spending surveys. Later, the survey trundled over to Hildebrandt, continued when Thomson Reuters acquired that firm, and kept plugging away when BakerRobbins merged in. Now, a survey that prides itself on its lineage perches with a fourth company. Continuity of a survey’s name doesn’t matter; continuity of its participants and questions does, so reports on serial surveys ought to disclose turnover in the ranks.

I have not found a way to capture in a single number the notion discussed here. To say that 62 percent of the second survey’s respondents also answered the first (100 of 160) goes part way, but that calculation does not bring out the change in the total number of respondents from one survey to the next. Nor am I clear whether to take the median of the carry-overs’ second-year values and compare it to the median of their first-year values, or to compute each company’s year-over-year difference and take the median of that set of numbers.
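To show that those two calculations can diverge, here is a minimal sketch with made-up spending-as-a-percentage-of-revenue figures for five hypothetical carry-over companies; none of the values come from a real survey.

```python
import statistics

# Made-up metric values (e.g., legal spending as a percentage of revenue) for
# the carry-over companies that answered in both years; keys are hypothetical IDs.
year1 = {"co1": 0.60, "co2": 0.70, "co3": 0.55, "co4": 0.80, "co5": 0.65}
year2 = {"co1": 0.72, "co2": 0.68, "co3": 0.66, "co4": 0.90, "co5": 0.75}

carryovers = sorted(set(year1) & set(year2))

# Approach 1: median of the carry-overs' second-year values minus the median
# of the same companies' first-year values.
difference_of_medians = (statistics.median(year2[c] for c in carryovers)
                         - statistics.median(year1[c] for c in carryovers))

# Approach 2: each company's year-over-year difference, then the median of
# those paired differences.
median_of_differences = statistics.median(year2[c] - year1[c] for c in carryovers)

print(round(difference_of_medians, 3), round(median_of_differences, 3))  # 0.07 0.1
```

With these invented numbers the first approach reports a change of 0.07 and the second 0.10, which is one more reason a survey report should say exactly which calculation it used.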