The U.S. Agency for International Development (USAID) recently released its first-ever “NGO Sustainability Index” for sub-Saharan Africa. While the research that went into the index shines some useful light on the operating environment facing non-governmental organizations (NGOs) in Africa, the index is unfortunately even more useful as an example of how not to build an index of fuzzy social concepts like “NGO sustainability.” We look here at some problematic practices employed by the index creators while also calling out the positives where merited.
Much of the critique that follows builds on our own experience over many years of building, and sometimes suffering through, refinements to the annual Global Integrity Report assessments and our Local Integrity Initiative diagnostics, as well as the research that went into our joint publication with the United Nations Development Programme, “A Users’ Guide to Measuring Corruption.” That guide provides a more detailed analysis of many of the concerns outlined below.
First, the good news…
What’s there to like about USAID’s index? A couple of things.
First, the authors do a reasonable job of defining what they mean by “NGO sustainability” in Section 1 of the report. They define it as a positive legal environment in which NGOs can operate; sufficient organizational capacity; financial viability; NGOs’ track record in influencing policy making through advocacy; NGOs’ effectiveness in service delivery; their infrastructure; and whether NGOs in the country boast a “positive public image.”
Now, you might raise objections over their definition of “NGO sustainability” (I, for one, have big problems with the “image,” “service delivery,” and “advocacy” dimensions, which I’ll skip here for brevity’s sake) but at least the authors are transparent about what they are trying to measure. That’s a plus, and something not all index creators do well.
Second, an attempt was made to tap local sources of information to “grade” countries included in the index rather than have a team based in Washington come up with a spreadsheet based on third-party data. This is good. While we still have significant concerns with the research methodology (see below), at least USAID’s index is not just another “my aggregation formula is better than yours” attempt at averaging the same old third-party data into a “new” index of…something.
Now, the bad news…
The bottom line is that, in 2010, the process used to create this index simply falls short of the current state of the art.
The first problem is the scoring committees. It took us a while to figure out who actually scored each country; there’s very little discussion in the index’s background materials as to how scores were generated for covered countries across the seven dimensions of “sustainability.”
To quote USAID:
With a few region-specific modifications, the methodology of this Index replicates that of the USAID NGO Sustainability Index for Central and Eastern Europe and Eurasia, an established tool used by NGOs, governments, donors, academics and others to better understand the sustainability of the NGO sector in Europe and Eurasia…The methodology relies on a panel of NGO practitioners and experts in each country who assess the NGO sector in terms of seven interrelated dimensions: legal environment, organizational capacity, financial viability, advocacy, service provision, infrastructure, and public image. An editorial committee of technical and regional experts reviews each country panel’s findings.
OK, using local talent, we like that.
So a reasonable observer (including your humble author) would next ask, “So who is actually on the magic scoring panels?” Although this isn’t stated explicitly (and USAID, we’d welcome a correction here if warranted), the reader can infer that the roughly 70 names listed on page iv under Acknowledgments were the panelists in the target countries. Two to five experts are listed for each country covered.
What’s our beef? We counted the names and organizations listed, and 29 out of 70 are USAID or US embassy officials; in other words, more than 40% of the panelists are working for USAID or the US government. The other 60% are working with NGOs, many of whom – and we don’t think this requires a huge leap of faith – are likely financially and programmatically supported by the same local USAID missions and/or embassies. Are we really comfortable with the idea that we’re getting unbiased “practitioners and experts in each country” generating these scores? We doubt it.
Second problem: lack of transparency around the scoring. Nowhere do the index’s authors make transparent how they arrived at Country X’s score for any of the seven dimensions of Country X’s NGO sustainability rating. You can download a spreadsheet from the USAID website that contains nothing more than a few columns and rows with the final dimensional scores for each country, but again, the public has no idea where 5.8 came from for Ethiopia’s “advocacy” rating, for example. This approach reminds me of the criticism Freedom House has come under historically for failing to fully disclose the process behind their somewhat opaque scoring committees that come up with the annual “Freedom in the World” index. (For a wonderful critique of the Freedom House approach, we recommend Gerry Munck’s Measuring Democracy.)
This is a truly cardinal sin. Publishing an index that others cannot recreate might have been fine 10 or 15 years ago when no one was talking about these issues and the goal was simple awareness raising, but that era has long since passed. This lack of transparency renders the index virtually useless for actual policymaking or advocacy efforts. Put yourself in a government’s shoes: how do you improve the climate for, say, NGOs’ financial sustainability in your country if you don’t know where the negative rating came from or what particular aspect of “financial sustainability” is dragging down the score?
A third major problem with the index is the implied statistical precision of the numerical scores. Without disclosing how the scoring committees arrived at scores for each country’s seven dimensions of NGO sustainability, the reader is also supposed to accept that, for example, Senegal’s 3.9 score for “service provision” is materially weaker than its 4.0 for “advocacy.” We’re not statisticians, but we’re pretty sure that the questionable and opaque methodology for generating those scores does not justify publishing results to a tenth of a point. At the very least, margins of error would be useful here for encouraging readers not to obsess over tiny and very likely meaningless differences in some of these numerical results. There is unfortunately an entire generation of economists and quantitative political scientists out there who would love to run regression analyses on these results by (inappropriately) treating them as empirical, cardinal data.
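To see why a 3.9-versus-4.0 gap is likely meaningless, consider a back-of-the-envelope sketch. All of the numbers below are invented for illustration (they are not USAID’s raw data), and we simply assume each dimension score is the mean of a small panel’s individual ratings, as a five-person panel might plausibly produce:

```python
import statistics

# Hypothetical panelist ratings for two dimensions of one country.
# These numbers are invented; they only illustrate the arithmetic.
service_provision = [3.5, 4.0, 4.5, 3.5, 4.0]   # panel mean 3.9
advocacy          = [4.5, 3.5, 4.0, 4.5, 3.5]   # panel mean 4.0

def mean_and_moe(scores, t=2.776):
    """Panel mean and a rough 95% margin of error.

    t defaults to the critical t-value for n=5 panelists (4 degrees
    of freedom), since these panels had only 2-5 members.
    """
    n = len(scores)
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / n ** 0.5  # standard error of the mean
    return round(mean, 1), round(t * se, 1)

for name, scores in [("service provision", service_provision),
                     ("advocacy", advocacy)]:
    mean, moe = mean_and_moe(scores)
    print(f"{name}: {mean} \u00b1 {moe}")
```

Under these assumptions the two dimensions come out at roughly 3.9 ± 0.5 and 4.0 ± 0.6: the confidence intervals overlap almost entirely, so the one-tenth-of-a-point difference tells us nothing. With panels this small, only publishing the underlying dispersion lets a reader judge which gaps are real.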
How to do it better
We personally and professionally know a number of people involved in the creation of this index, so there’s no fun for us in pointing out some of these deficiencies. But we’re frustrated with what appears to be a never-ending stream of poorly conceived “indices” and data sets that simply don’t pass the common sense test of reliability and credibility. We can do better as a community of practice.
Without starting from scratch, here are three things USAID could quickly do to improve the next iteration of this research:
1. Publish the raw scores that went into generating each country’s seven dimensional scores of NGO sustainability. We assume the country scorers/panelists were each given the opportunity to score the seven dimensions individually. Let’s see those numbers, even if they’re not personally attributed.
2. Better caveat the numerical results. Even if unintended, publishing data of this sort without appropriate margins of error invites abuse by third-party researchers and analysts. Two similar approaches for generating margins of error that might be useful in this context are the techniques employed by Global Integrity and the Revenue Watch Institute, both of which conduct fieldwork via an expert assessment (non-survey) methodology.
3. Diversify the scoring teams. This seems obvious, but having nearly half your scorers be employees of the agency paying for the research doesn’t strike us as an ideal way to do this. Find an extra $30,000 somewhere and contract with other local in-country experts to provide unvarnished, third-party input to the results.
— Nathaniel Heller