Rigorous impact evaluations – more than just art for art’s sake

    Two decades of rigorous impact evaluation in development cooperation

    Researchers have used rigorous impact evaluations (RIEs) as an important tool for impact assessment for nearly 20 years. Rigorous evaluation approaches are also gaining ground in international organisations and development banks. But what are they exactly? What are the advantages and disadvantages, and what role do RIEs play at KfW Development Bank?

    Since the turn of the century, rigorous impact evaluations have become increasingly common in international development cooperation (DC). Inspired by the scientific community, they are now an integral part of many projects run by organisations working in international cooperation.

    The World Bank, for instance, has a Development Impact Evaluation (DIME) unit. The World Food Programme – winner of the 2020 Nobel Peace Prize – has been pursuing the WFP Impact Evaluation Strategy since 2019, while the International Initiative for Impact Evaluation (3ie) has been supporting and synthesising rigorous evidence from development projects since 2008.

    This led to a massive increase in the absolute number of RIEs carried out in development cooperation. While only about 50 RIEs of projects or policies in countries of the Global South had been published worldwide by 2000, the next 15 years saw a boom with more than 4,000 RIEs (Sabet and Brown 2018).

    Figure 1

    The growth of RIEs was fuelled by the convergence of two trends. On the one hand, since the turn of the century political actors have actively pursued a stronger focus on impact in development cooperation. This was manifested in the Millennium Development Goals (MDGs) and in the Aid Effectiveness Agenda, in which the Federal Ministry for Economic Cooperation and Development (BMZ) played a significant role.

    On the other hand, researchers’ interest in analysing the causes of poverty and especially in possible mechanisms to alleviate it grew. They increasingly used improved statistical and econometric methods to evaluate projects. Researchers began to apply experimental methods – which were already common in natural science and medical research – to questions related to development economics.

    The Royal Swedish Academy of Sciences recognised these efforts in 2019 by awarding the Prize in Economic Sciences in Memory of Alfred Nobel to development economists Abhijit Banerjee, Esther Duflo and Michael Kremer. The prize committee commented on the award: “Millions of people today benefit from effective interventions developed and tested with the new experimental approach for which they [the laureates] have laid the foundation.”

    Why evaluate?

    There are many good reasons for performing robust evaluations. Among the most important is accountability to the public and civil society in our partner countries as well as in Germany. Evaluations make it possible to identify particularly effective approaches, to modify them early on if necessary and to quantify the cost-effectiveness of a project. Moreover, reliable impact measurements also enable institutional learning. Finally, the findings contribute to external learning and the global evidence base.

    Evaluation is particularly important in the context of global DC for two reasons. First, DC projects do not have to compete the way private companies do: traditional market mechanisms, such as bankruptcy when companies are poorly managed or crowded out by better products, do not exist. Second, limited financial resources stand in contrast to a large number of urgently needed investments. A solid understanding of effectiveness is therefore extremely important.

    RIEs are only one of several methods of evaluation or monitoring. For example, KfW Development Bank has successfully employed ex post evaluations since 1990 to systematically observe and assess projects as a whole and over time. However, if one is particularly interested in effects at the impact level, the most rigorous way to measure them is – as the name suggests – by means of an RIE.

    But what exactly are rigorous impact evaluations?

    RIEs describe a toolbox of experimental and semi-experimental methods that measure the causal effects of a project. The emphasis is on causality, in other words on identifying those effects that can be attributed exclusively to the project and isolating them from concurrent developments or other connections between projects and target indicators. In addition to measuring specific impacts on the projects’ target groups, RIEs also analyse impacts on subgroups or mechanisms underlying the impacts. For example, a healthcare project may have significantly greater effects for women than for men, or a new connection to the electrical grid may only lead to productive electricity uses in areas that have access to markets.

    The most rigorous methods in the IE toolbox are fully experimental methods, such as randomised controlled trials (RCTs), which are also known as the “gold standard”. In RCTs, a project – or even parts of the project – is randomly assigned to one group of individuals, schools, communities or other units (the “intervention group”). A second group receives access to the project later or – as is the case with a placebo – not at all (the “control group”). The principle of random assignment, similar to medical research, ensures the comparability of the two groups: depending on the measure, they are on average the same age and similarly healthy, ambitious, vulnerable or wealthy, for example. This means that post-intervention differences between the groups, beyond chance variation, can be attributed to the project itself. A well-known example is cash transfers, which are disbursed to households in the target group if their children attend school.
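
    The logic of random assignment can be sketched with a small simulation. This is only an illustration under stated assumptions: the data are simulated, and the function name `simulate_rct`, the baseline index, the noise levels and the effect size are all hypothetical rather than taken from any KfW project.

```python
import random
import statistics


def simulate_rct(n_units=10_000, true_effect=5.0, seed=42):
    """Simulate a two-arm RCT; return (effect_estimate, baseline_gap).

    All numbers are hypothetical and serve illustration only.
    """
    rng = random.Random(seed)

    # Hypothetical pre-project characteristic (e.g. a wealth index) per unit.
    baseline = [rng.gauss(50.0, 10.0) for _ in range(n_units)]

    # Random assignment: shuffle the unit ids, first half = intervention group.
    ids = list(range(n_units))
    rng.shuffle(ids)
    half = n_units // 2
    intervention_ids, control_ids = ids[:half], ids[half:]

    # Outcome after the project: baseline plus noise, plus the effect if treated.
    def outcome(i, treated):
        return baseline[i] + rng.gauss(0.0, 2.0) + (true_effect if treated else 0.0)

    y_intervention = [outcome(i, True) for i in intervention_ids]
    y_control = [outcome(i, False) for i in control_ids]

    # Balance check: randomisation should make the groups similar at baseline.
    baseline_gap = (
        statistics.fmean(baseline[i] for i in intervention_ids)
        - statistics.fmean(baseline[i] for i in control_ids)
    )

    # Impact estimate: simple difference in mean outcomes between the groups.
    effect_estimate = statistics.fmean(y_intervention) - statistics.fmean(y_control)
    return effect_estimate, baseline_gap


estimate, gap = simulate_rct()
print(f"estimated effect: {estimate:.2f}, baseline gap: {gap:.2f}")
```

    Because assignment is random, the baseline gap between the groups should be close to zero, and the difference in mean outcomes should land near the effect built into the simulation. Real evaluations would add standard errors and typically adjust for baseline data.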

    If a purely experimental (random) assignment is not reasonable or feasible, semi-experimental methods are often a useful alternative. For example, comparison groups can be defined along threshold values of certain selection criteria (Regression Discontinuity Design, RDD). If a project targets children under two years of age, participants who are almost two years old can be compared with participants who are just over two years old. This approach is shown in an example from Burkina Faso.
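
    The threshold comparison can be sketched as follows. This is a minimal sketch on simulated data, assuming a cutoff at 24 months and a 6-month window on either side; the function names, the outcome scale and the effect size are hypothetical and are not the design used in Burkina Faso. A straight line is fitted on each side of the cutoff, and the gap between the two lines at the cutoff is taken as the estimate.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_estimate(ages, outcomes, cutoff=24.0, bandwidth=6.0):
    """Jump in the outcome at the cutoff, using children close to it only."""
    # Centre age on the cutoff so each intercept is the fitted value at the cutoff.
    below = [(a - cutoff, y) for a, y in zip(ages, outcomes)
             if cutoff - bandwidth <= a < cutoff]
    above = [(a - cutoff, y) for a, y in zip(ages, outcomes)
             if cutoff <= a <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    # Children below the cutoff are the eligible (participating) side.
    return intercept_below - intercept_above


def simulate_children(n=20_000, true_effect=4.0, seed=7):
    """Hypothetical data: age in months and an outcome score per child."""
    rng = random.Random(seed)
    ages = [rng.uniform(12.0, 36.0) for _ in range(n)]
    outcomes = [
        100.0
        + 0.5 * (age - 24.0)                     # outcome drifts smoothly with age
        + (true_effect if age < 24.0 else 0.0)   # only children under two take part
        + rng.gauss(0.0, 3.0)                    # unexplained variation
        for age in ages
    ]
    return ages, outcomes


ages, outcomes = simulate_children()
print(f"estimated jump at the cutoff: {rdd_estimate(ages, outcomes):.2f}")
```

    Fitting a line on each side, rather than comparing raw averages, keeps the smooth age trend from being mistaken for a project effect. The choice of bandwidth is a judgment call in practice.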

    RCTs and RDDs are only two examples from the IE toolbox. Depending on the type of project, the level of implementation and the criteria for selecting beneficiaries, the toolbox provides a range of methodological options. One thing is certain, however: the earlier an impact evaluation is integrated into a project’s implementation, the more likely it is that reliable conclusions can be drawn about its impacts. Collecting data before the start of the project (baseline), for example, can greatly improve evaluations. The lessons learned can then be transferred to similar or follow-up projects to increase their effectiveness.

    A water project in Pristina, Kosovo shows that, in addition to the questions commonly addressed in an impact evaluation, it can also be worthwhile to evaluate behaviour.

    The project aimed to build effective structures for water supply and sanitation. The goal was to improve the drinking water supply and, with it, living conditions.

    To examine the payment behaviour of customers, so-called “nudges”, i.e. incentives to change behaviour, were applied and tested for their impact. These incentives include, for example, attaching the bill to the front door (instead of leaving it in the letterbox) or sending letters appealing to the customers’ sense of responsibility.

    The different incentives were randomly assigned. Depending on the type of incentive and the wording of the message, on-time payments increased by up to 62%. According to the responsible water supplier, the approaches will be adopted beyond the originally planned two-month period.
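
    The mechanics of such a multi-arm test can be sketched as follows. The arm names, customer ids, payment probabilities and results below are invented for illustration and are not data from the Pristina evaluation. The sketch also assumes that a percentage increase is reported relative to the control group's on-time payment rate, which is one common convention.

```python
import random

# Hypothetical arms: a control group plus two nudges.
ARMS = ("control", "bill_on_door", "appeal_letter")


def assign_arms(customer_ids, seed=0):
    """Randomly assign customers to arms of (almost) equal size."""
    rng = random.Random(seed)
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled customers out to the arms in turn.
    return {cid: ARMS[i % len(ARMS)] for i, cid in enumerate(shuffled)}


def on_time_rates(assignment, paid_on_time):
    """Share of customers in each arm who paid on time."""
    totals = {arm: 0 for arm in ARMS}
    paid = {arm: 0 for arm in ARMS}
    for cid, arm in assignment.items():
        totals[arm] += 1
        paid[arm] += 1 if paid_on_time[cid] else 0
    return {arm: paid[arm] / totals[arm] for arm in ARMS}


def relative_increase(control_rate, arm_rate):
    """Increase of an arm's rate, expressed as a share of the control rate."""
    return (arm_rate - control_rate) / control_rate


# Illustration with made-up payment behaviour.
rng = random.Random(1)
customers = list(range(3000))
assignment = assign_arms(customers)
assumed_prob = {"control": 0.40, "bill_on_door": 0.50, "appeal_letter": 0.55}
paid = {cid: rng.random() < assumed_prob[arm] for cid, arm in assignment.items()}
rates = on_time_rates(assignment, paid)
for arm in ARMS[1:]:
    print(arm, round(relative_increase(rates["control"], rates[arm]), 2))
```

    Because every customer has the same chance of landing in each arm, differences in payment rates between arms reflect the nudges rather than differences between customers.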

    This evaluation example shows how relevant results can be achieved without spending a lot of time and money, and how valuable it can be to test new and creative approaches using randomised methods.

    Nudging users to pay their bills – An impact assessment in Kosovo

    This video shows what the local water supplier has learned and implemented following the impact analysis by Sebastian Tonke.

    RIEs also have critics, and the debate about the pros and cons has been waged passionately for years. The main criticisms are:

    • Ethical reservations: participation in projects is not assigned based on needs but based on randomisation. This criticism is valid and important. The right to participate in a project must always follow fair and reasonable criteria. However, RCTs can (and must) adhere to high ethical standards, for example when their design makes use of existing regional, budgetary or time limits.
    • Results of RIEs are difficult to generalise across contexts, populations or timeframes: this criticism is valid, as it is for any other method of evaluating individual projects. Existing possibilities to increase generalisability must therefore be fully exploited in the implementation of RIEs, and the transferability to other projects must be scrutinised on a case-by-case basis. An increasing number of meta-evaluations and systematic reviews of RIEs are also seeking to lower this hurdle.
    • RIEs are not suitable for all projects: even if an RIE can theoretically be carried out for every project, it is not always the most expedient method. It is therefore important to weigh the advantages and disadvantages of various evaluation methods on a case-by-case basis.

    RIE at KfW Development Bank

    KfW Development Bank’s Evaluation Unit increasingly provides institutional and methodological knowledge to support the implementation of RIEs. You can find impressions of KfW Development Bank’s evaluation designs in the projects in Yemen, Burkina Faso and Tanzania. The Evaluation Unit adjusts the use of RIEs – taking into account the methodological possibilities and limits consistent with the principle of form follows function – to the relevant content-specific question, the context and the needs and capacities of its partners. Depending on needs, households can be surveyed, and analyses with satellite or other secondary data can be conducted. Ideally, RIEs are implemented in cooperation with other development banks such as the World Bank or the Agence Française de Développement as well as local or academic partners. This allows synergies in learning, both between development banks and between partners.

    RIEs have long been established in DC. They make an important contribution to greater effectiveness and institutional learning in development cooperation. Alongside ex post evaluations, RIEs will thus gain importance for KfW Development Bank's Evaluation Unit as a further instrument for evidence building in the new decade.

    Experiences with experimental evaluations at KfW Development Bank by Alina Sennewald

    Since 2005, KfW’s multisectoral Reintegration and Reconstruction programme has sought to improve living conditions in Liberia and contribute to consolidating the ongoing peace process. The programme is being carried out in cooperation with Deutsche Welthungerhilfe and other non-governmental organisations (NGOs).

    When preparing the fifth programme phase, our team took the opportunity to initiate a rigorous impact evaluation in the form of a Randomised Controlled Trial (RCT). Our aim was to verify the impact logic, understand causal relationships and measure the actual impact of the project. We also wanted to better understand the implications of specific implementation aspects in order to incorporate them into the design of follow-up projects and ultimately achieve greater effectiveness.

    The RCT is currently being conducted by experienced researchers in cooperation with the implementing NGO. The initial results of the RCT already offer exciting lessons learned for an effective continuation of the programme. For example, the RCT has shown that, despite the strong role of NGOs in project implementation, more trust can be placed in the government. We are already learning a lot about the impacts of our project on the social, health and economic situation of the programme participants. Our experience so far motivates me to continue conducting rigorous impact evaluations in the future – whenever possible.

    More on the topic

    Cash-for-Work – a rigorous analysis of the impacts in Yemen

    Burkina Faso – the first 1,000 days count for a lifetime

    Simiyu Climate Resilience in Tanzania – an impact evaluation

    Impact evaluations

    Evaluations worldwide