3.5.6. What Constitutes a Rigorous Impact Assessment?
An impact assessment is generally considered rigorous to the extent that it establishes a credible counterfactual. This quality is also called "internal validity." The more rigorous the assessment methodology, the more credible the counterfactual and the greater the internal validity. By the standard of internal validity, experimental methods, which randomly assign study subjects to a treatment group that receives project services and a control group that does not, are considered the most rigorous. (Experimental methods are also known as Randomized Controlled Trials, or RCTs.) Quasi-experimental methods, which create a control group from pre-existing groups by matching them to the treatment group, are less rigorous than experimental methods but are still considered rigorous. Methods that do not follow sound principles in creating the control group are less rigorous still, to a degree that depends on the credibility of the control group, while methods that use no control group at all are not considered rigorous.
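The logic of random assignment can be illustrated with a minimal Python sketch. It is not drawn from any particular assessment: the function names (`assign_randomly`, `estimate_impact`), the subject identifiers, and the outcome figures are all hypothetical, and a simple difference in group means is assumed as the impact estimate.

```python
import random
import statistics


def assign_randomly(subject_ids, seed=0):
    """Randomly split subjects into treatment and control groups.

    Hypothetical helper: a fixed seed is used only so the split is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimate_impact(outcomes, treatment, control):
    """Difference in mean outcomes between the two groups.

    The control group's mean stands in for the counterfactual: what the
    treatment group's outcome would have been without project services.
    """
    treated_mean = statistics.mean(outcomes[s] for s in treatment)
    control_mean = statistics.mean(outcomes[s] for s in control)
    return treated_mean - control_mean


# Illustrative data only: ten hypothetical subjects, with outcomes constructed
# so that every treated subject ends up exactly 20 units higher.
subjects = [f"farm_{i}" for i in range(10)]
treatment_group, control_group = assign_randomly(subjects)
demo_outcomes = {s: 100 + (20 if s in treatment_group else 0) for s in subjects}
print(estimate_impact(demo_outcomes, treatment_group, control_group))
```

Because assignment is random, the two groups are expected to be similar on average in everything except receipt of project services, which is what makes the control mean a credible counterfactual. In a quasi-experimental design, the shuffle would be replaced by a matching procedure, and the credibility of the estimate would depend on how well that matching works.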
Methodological rigor, however, is not determined by internal validity alone. Rigor is also a function of "external validity," "construct validity," and "statistical conclusion validity." External validity is the extent to which the impact assessment findings are generalizable to other value chain projects. Value chain projects generally operate in unique environments, with unique actors and conditions, and are subject to a wide variety of external forces beyond project control, all of which make generalizations tentative at best. External validity therefore depends on the extent to which the assessment methodology considers these factors and incorporates them into the analysis and conclusions.
Construct validity is the extent to which the assessment design and data collection instruments accurately measure the project's causal model. Failure to measure the causal model accurately means that the assessment is not measuring what it purports to measure and thus the findings cannot be linked back to the project design, nor can the assessment findings be used to assess the validity of the causal model itself.
Statistical conclusion validity means that the researchers have applied statistical methods correctly and have accurately identified the statistical strength and certainty of the results.
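As one illustration of reporting the certainty of a result, the sketch below computes a difference in means together with an approximate confidence interval. It is a simplified example under stated assumptions: the function name `impact_confidence_interval` and the sample figures are hypothetical, the two groups are assumed to be independent samples, and a large-sample normal approximation is used, which is too optimistic for samples as small as the ones shown (a t-distribution would be more appropriate there).

```python
import math
import statistics
from statistics import NormalDist


def impact_confidence_interval(treated, control, confidence=0.95):
    """Return the difference in means and an approximate confidence interval.

    Assumes independent samples and relies on a normal approximation, so it
    is only a rough guide when samples are small.
    """
    diff = statistics.mean(treated) - statistics.mean(control)
    # Standard error of the difference, allowing unequal group variances.
    std_error = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, (diff - z * std_error, diff + z * std_error)


# Hypothetical outcome measurements for the two groups.
treated_outcomes = [12, 14, 16, 18, 20]
control_outcomes = [10, 11, 12, 13, 14]
estimate, (low, high) = impact_confidence_interval(treated_outcomes, control_outcomes)
print(f"estimated impact: {estimate}, interval: ({low:.2f}, {high:.2f})")
```

An interval that excludes zero suggests the estimated impact is unlikely to be due to chance alone, while a wide interval signals that the finding should be stated with corresponding caution.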
Impact assessment rigor further depends on several other factors that need to be incorporated into the assessment design, implementation, and analysis:
- Triangulation: The evidence for impact (and the counterfactual) is stronger to the extent that it is supported by multiple sources of evidence. Mixed method designs using different combinations of quantitative and qualitative research methodologies in particular allow researchers to triangulate toward more credible impact assessment findings.
- Methodological Transparency: Research methodologies are well documented and their weaknesses and related implications are identified.
- Sound Data Collection Methods: Data collection methods follow accepted good practice, including the use of competent researchers and the implementation of sound quality control measures.
- Methodological Appropriateness: The research methodology is appropriate to answer the research question(s). This principle incorporates the fundamental concept that the selection of the research methodology is driven by the research questions. Impact assessment is not a pre-determined research methodology in search of applications but the matching of research methodologies to the questions asked, as well as to the political, resource, and field constraints faced by researchers. Starting with the question rather than the methodology and taking into account relevant constraints will often point researchers toward methodologies outside their typical realm of preference or experience.