
From raw numbers to robust evidence: finding fact, avoiding fiction

Sam Desiere (UGent)
(2015)
Abstract
Data are key to empirical research. But data by themselves are not yet information. Raw numbers need to be transformed into measurements and, finally, into robust evidence, which can be used to help design evidence-based policies. In this thesis, three different steps in this transformation are examined: (i) collecting good-quality data; (ii) quantifying concepts; and (iii) accounting for the imperfections in quantified concepts to obtain robust evidence. Different challenges are encountered at every step.

This thesis focuses on household survey data from developing countries collected by universities, NGOs or (inter)national institutions with the explicit objective of 'enhancing the evidence base'. Household surveys are still the most important source of information in developing countries, where administrative data are often incomplete and where 'big data', such as data from mobile phones, are still in their infancy. This is unlikely to change in the near future. Monitoring the implementation of the Sustainable Development Goals is likely to increase the demand for household surveys even further. More awareness of the process of transforming raw numbers from household surveys into robust evidence is therefore indispensable.

The first critical step towards robust evidence is collecting high-quality data, since using 'wrong numbers' will lead to 'wrong results'. It is often argued that the lack of data in developing countries impedes the design of sensible policies. Perhaps even more critical, however, are data of poor quality that are used to design policies or to support far-reaching reforms. The first case study in this thesis illustrates that this is indeed a real threat. Different datasets that purport to measure the impact of large-scale and controversial agricultural reforms on yields in Rwanda provide very different results.
However, only the most positive estimates have been incorporated into the international data management system of the FAO, amplifying the risk that these numbers will be accepted as the 'truth' and possibly used for policy design elsewhere.

The second step in the transformation of raw numbers into robust evidence requires quantifying theoretical concepts. The difficulty here is that these concepts are often not directly observable. Household surveys, for instance, are frequently designed to measure the concepts of poverty or food security. Yet these concepts are not directly observable and require the development of measurement instruments. These measurement instruments are based on a set of rules that define how observable household characteristics should be translated into the unobservable concept. The development of such measurement instruments is challenging and involves making many different assumptions. Moreover, one can always question whether the final measurement instrument measures the concept it is intended to measure, and under what circumstances it measures the concept precisely and accurately. Addressing these questions in the social sciences is notoriously difficult because of the lack of gold standards or benchmarks against which a newly developed measurement instrument can be assessed. Moreover, the validity of measurement instruments should ideally be tested in many different contexts. In practice, however, social scientists work outside of a laboratory and cannot manipulate the context in which they operate.

In this thesis, the challenge of quantifying concepts is illustrated by evaluating the validity of four measurement instruments: GPS to measure the directly observable concept of land area, and three poverty and food insecurity indicators, which quantify unobservable concepts. The evaluation of GPS measurement of land area is straightforward, as it can be assessed against the gold standard of compass-and-rope measurement.
The evaluation of food security and poverty indicators requires more creativity, since gold standards are unavailable. The three case studies of poverty and food security indicators are used to illustrate three different aspects of validity: cross-sectional validity, inter-temporal validity and internal validity. The first indicator, the Progress out of Poverty Index (PPI) in Rwanda, is benchmarked against expenditure data. It turns out that this indicator is cross-sectionally valid; that is, it consistently distinguishes poor from non-poor households. The second indicator, the Household Food Insecurity Access Scale (HFIAS), is benchmarked against total agricultural production. This indicator is cross-sectionally valid, but its inter-temporal validity is questionable. While total food production decreased over a period of five years, the HFIAS pointed towards an improved food security situation over the same period. This implies that the indicator cannot be used to monitor the evolution of food security over time. The third food security indicator, the Household Dietary Diversity Score (HDDS), is not assessed against an external benchmark. Instead, its internal validity is evaluated using Rasch models. In other words, it is analyzed whether the different food groups included in the HDDS measure a single underlying concept. This turns out not to be the case, raising the question of what the HDDS actually measures.

Even with good-quality data and excellent measurement instruments, concepts may still be imprecisely or inaccurately measured. Hence, the third and final step of the transformation of raw numbers into robust evidence consists of accounting for these imperfections when establishing (causal) relations between two (or more) imperfectly measured concepts. To illustrate the relevance of accounting for measurement error, it is shown that imprecise measurement of the harvest at plot level can generate a spurious, negative correlation between productivity and plot size.
This has implications for the stylized fact of the inverse productivity-size relationship.

The transformation of raw numbers into robust evidence is a long journey with several steps along the way, all of which are decisive for the final outcome. At every step, new challenges need to be tackled. This requires skilful interventions by researchers and an open discussion about the minimum set of assumptions needed to overcome the challenges. These steps also hold implications for the interpretation of the final outcome of the journey: robust evidence. A first policy implication is that the academic community should pay more attention to the issue of data quality. The compulsory publication of data alongside journal articles would be an important first step in this process. In addition, studying systematic measurement error can help to limit bias in empirical work and to improve survey design. A second implication concerns the development of measurement instruments, and in particular poverty and food security indicators. There is definitely a demand for indicators that can quickly estimate the prevalence of poverty and food insecurity at a regional level in order to monitor development programmes, target the most vulnerable households and design policies. Yet, with so many indicators in existence, choosing the one that is most useful for the purpose at hand is complicated, since every indicator has its own strengths and weaknesses. More validation exercises of existing indicators could help to clarify the circumstances under which a particular indicator works and is useful. An important advantage of these 'validity exercises' is that researchers remain keenly aware of the shortcomings of a particular indicator, which are likely to be context-specific. Given the existence of so many indicators, one can argue that the validation of existing indicators should be prioritized over the development of yet more indicators.
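A validation exercise of the kind described above can be sketched in a few lines: benchmark a scorecard-style poverty indicator against expenditure data and check whether a score cut-off reproduces the poor/non-poor split implied by an expenditure-based poverty line. Everything below (the score formula, noise levels, the 40% poverty line) is an illustrative assumption, not a value from the thesis:

```python
import random

random.seed(2)

N = 1000
households = []
for _ in range(N):
    log_exp = random.gauss(0, 1)                 # latent log expenditure (benchmark)
    # Hypothetical scorecard: correlated with expenditure, measured with noise
    score = 50 + 10 * log_exp + random.gauss(0, 5)
    households.append((log_exp, score))

# Expenditure-based poverty line at the 40th percentile (illustrative choice)
line = sorted(e for e, _ in households)[int(0.4 * N)]
# Score cut-off chosen so the same share of households falls below it
cutoff = sorted(s for _, s in households)[int(0.4 * N)]

# Cross-sectional validity check: how often does the score-based
# poor/non-poor classification agree with the expenditure-based one?
agree = sum((e < line) == (s < cutoff) for e, s in households)
print(f"classification agreement: {agree / N:.0%}")
```

A high agreement rate indicates cross-sectional validity in the sense used above: the indicator consistently distinguishes poor from non-poor households at a point in time. It says nothing about inter-temporal validity, which is why the HFIAS case is assessed separately.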
Finally, we should remain aware that the principal driver for funding the collection and interpretation of raw numbers is the call for more 'evidence-based policy'. The main, and perhaps unexpected, lesson of this thesis is that 'quantitative evidence' should not be considered the gold standard for the design of evidence-based policies. Quantitative evidence is man-made and needs to be complemented by other kinds of evidence when designing policies. Researchers should be at the forefront of weighing the quality of different evidence bases and of attempting to synthesize them.
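The measurement-error mechanism behind the spurious inverse productivity-size relationship can be illustrated with a small simulation. The thesis does not specify this exact error structure; the sketch below assumes a mean-reverting reporting error (small harvests over-reported, large harvests under-reported), with all parameter values chosen purely for illustration. True yield is held constant by construction, so any correlation between measured yield and plot size is spurious:

```python
import random

random.seed(1)

N = 5000
TRUE_YIELD = 1000            # kg/ha, identical for every plot by construction
LAMBDA = 0.4                 # assumed degree of mean reversion in reporting

plots = []
for _ in range(N):
    area = random.uniform(0.1, 2.0)              # plot size in hectares
    true_harvest = TRUE_YIELD * area             # kg
    # Mean-reverting reporting error: reported harvests are pulled towards
    # the average harvest (~1050 kg for this area distribution), plus noise.
    reported = (1 - LAMBDA) * true_harvest + LAMBDA * 1050 + random.gauss(0, 50)
    plots.append((area, reported / area))        # measured yield, kg/ha

# Pearson correlation between plot size and measured yield
xs, ys = [a for a, _ in plots], [y for _, y in plots]
mx, my = sum(xs) / N, sum(ys) / N
cov = sum((x - mx) * (y - my) for x, y in plots) / N
sx = (sum((x - mx) ** 2 for x in xs) / N) ** 0.5
sy = (sum((y - my) ** 2 for y in ys) / N) ** 0.5
print(f"correlation(plot size, measured yield) = {cov / (sx * sy):.2f}")
```

Although every plot has the same true yield, the simulated correlation comes out strongly negative, mimicking the stylized inverse relationship purely through the assumed reporting error.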

Downloads

  • SAMDESIEREphd.pdf (full text | open access | PDF | 9.38 MB)

Citation

Please use this URL to cite or link to this publication:

MLA
Desiere, Sam. From Raw Numbers to Robust Evidence: Finding Fact, Avoiding Fiction. Ghent University. Faculty of Bioscience Engineering, 2015.
APA
Desiere, S. (2015). From raw numbers to robust evidence: finding fact, avoiding fiction. Ghent University. Faculty of Bioscience Engineering, Ghent, Belgium.
Chicago author-date
Desiere, Sam. 2015. “From Raw Numbers to Robust Evidence: Finding Fact, Avoiding Fiction.” Ghent, Belgium: Ghent University. Faculty of Bioscience Engineering.
Chicago author-date (all authors)
Desiere, Sam. 2015. “From Raw Numbers to Robust Evidence: Finding Fact, Avoiding Fiction.” Ghent, Belgium: Ghent University. Faculty of Bioscience Engineering.
Vancouver
1.
Desiere S. From raw numbers to robust evidence: finding fact, avoiding fiction. [Ghent, Belgium]: Ghent University. Faculty of Bioscience Engineering; 2015.
IEEE
[1]
S. Desiere, “From raw numbers to robust evidence: finding fact, avoiding fiction,” Ghent University. Faculty of Bioscience Engineering, Ghent, Belgium, 2015.
@phdthesis{7005097,
  abstract     = {Data are key to empirical research. But data by themselves are not yet information. Raw numbers need to be transformed into measurements and, finally, into robust evidence, which can be used to help designing evidence-based policies. In this thesis, three different steps in this transformation are examined: (i) collecting good-quality data; (ii) quantifying concepts and (iii) accounting for the imperfections in quantified concepts to obtain robust evidence. Different challenges are encountered at every step.
This thesis focuses on household survey data from developing countries collected by universities, NGOs or (inter)national institutions with the explicit objective of `enhancing the evidence base'. Household surveys are still the most important source of information in developing countries where administrative data are often incomplete and where `big data', such as data from mobile phones, are still in their infancy. This is unlikely to change in the near future. Monitoring the implementation of the Sustainable Development Goals is likely to increase the demand for household surveys even further. More awareness about the process of transforming raw numbers from household survey into robust evidence is therefore indispensable.
The first critical step towards robust evidence is collecting high quality data since using `wrong numbers' will lead to the `wrong results'. It is often argued that the lack of data in developing countries impedes the design of sensible policies. Perhaps even more critical, however, are data of poor quality that are used to design policies or to support far-reaching reforms. The first case study in this thesis illustrates that this is indeed a real threat. Different datasets that purport to measure the impact of large-scale and controversial agricultural reforms on yields in Rwanda provide very different results. However, only the most positive estimates have been incorporated into the international data management system of the FAO, amplifying the risk that these numbers will be accepted as the `truth' and possibly used for policy design elsewhere.
The second step in the transformation of raw numbers into robust evidence requires quantifying theoretical concepts. The difficulty here is that these concepts are often not directly observable. Household surveys, for instance, are frequently designed to measure the concepts of poverty or food security. Yet, these concepts are not directly observable and require the development of measurement instruments. These measurement instruments are based on a set of rules that define how observable household characteristics should be translated into the unobservable concept. The development of such measurement instruments is challenging and involves making many different assumptions. Moreover, one can always question whether the final measurement instrument measures the concept it is intended to measure and under what circumstances it measures the concept precisely and accurately. Addressing these questions in the social sciences is notoriously difficult because of the lack of gold standards or the absence of benchmarks against which a newly developed measurement instrument can be assessed. Moreover, the validity of measurement instruments should ideally be tested in many different contexts. However, in practice, social scientists work outside of a laboratory and cannot manipulate the context in which they operate.
In this thesis, the challenge of quantifying concepts is illustrated by evaluating the validity of four measurement instruments: GPS to measure the directly observable concept of land area and three poverty and food insecurity indicators, which quantify unobservable concepts. The evaluation of GPS measurement of land area is straightforward as it can be assessed against the gold standard of compass and rope measurement. The evaluation of food security and poverty indicators requires more creativity since gold standards are unavailable. The three case studies of poverty and food security indicators are used to illustrate three different aspects of validity: cross-sectional validity, inter-temporal validity and internal validity. The first indicator, the Progress out of Poverty Index (PPI) in Rwanda, is benchmarked against expenditure data. It turns out that this indicator is cross-sectionally valid, that is, it consistently distinguishes poor from non-poor households. The second indicator, the Household Food Insecurity Access Scale (HFIAS), is benchmarked against total agricultural production. This indicator is cross-sectionally valid, but its inter-temporal validity is questionable. While total food production decreased over a period of five years, the HFIAS pointed towards an improved food security situation over the same period. This implies that the indicator cannot be used to monitor the evolution of food security over time. The third food security indicator, the Household Dietary Diversity Score (HDDS), is not assessed against an external benchmark. Instead, its internal validity is evaluated using Rasch models. In other words, it is analyzed whether the different food groups included in the HDDS measure a single underlying concept. This is not the case, raising the question of what the HDDS actually measures.
Even with good-quality data and excellent measurement instruments, concepts may still be imprecisely or inaccurately measured. Hence, the third and final step of the transformation of raw numbers into robust evidence consists of accounting for these imperfections when establishing (causal) relations between two (or more) imperfectly measured concepts. To illustrate the relevance of accounting for measurement error, it is shown that imprecise measurement of the harvest at plot level can generate a spurious, negative correlation between productivity and plot size. This has implications for the stylized fact of the inverse productivity-size relationship.
The transformation of raw numbers into robust evidence is a long journey with several steps along the way, all of which are decisive for the final outcome. At every step, new challenges need to be tackled. This requires skilful interventions by researchers and an open discussion about the minimum set of assumptions needed to overcome the challenges. These steps also hold some implications for the interpretation of the final outcome of the journey: robust evidence. A first policy implication is that the academic community pays more attention to the issue of data quality. The compulsory publication of the data alongside journal articles would be an important first step in this process. In addition, studying systematic measurement error can help to limit bias in empirical work and to improve survey design. A second implication has to do with the development of measurement instruments, and in particular, poverty and food security indicators. There is definitely a demand for indicators that can quickly estimate the prevalence of poverty and food insecurity at a regional level in order to monitor development programmes, target the most vulnerable household and design policies. Yet, with so many indicators in existence, choosing the one that is most useful for the purpose at hand is complicated since every indicator has its own strengths and weaknesses. More validation exercises of existing indicators could help to clarify the circumstances under which a particular indicator works and/or is useful. An important advantage of these `validity exercises' is that researchers will remain keenly aware of the shortcomings of a particular indicator, which are likely to be context-specific. Given the existence of so many indicators one can argue that the validation of existing indicators should be prioritized over the development of yet more indicators. 
Finally, we should remain aware that the principal driver for funding the collection and interpretation of raw numbers is the call for more `evidence-based policy'. The main -- and perhaps unexpected -- lesson of this thesis is that `quantitative evidence' should not be considered the gold standard for the design of evidence-based policies. Quantitative evidence is man-made and needs to be complemented by other sets of evidence when designing policies. Researchers should be at the forefront of weighing the quality of different evidence bases and of attempting to synthesize them.},
  author       = {Desiere, Sam},
  isbn         = {9789059898486},
  language     = {eng},
  pages        = {V, 209},
  publisher    = {Ghent University. Faculty of Bioscience Engineering},
  school       = {Ghent University},
  title        = {From raw numbers to robust evidence: finding fact, avoiding fiction},
  year         = {2015},
}