Handbook on Data Collection / Phase Three: Conduct Data Research

From Akvopedia

In phase two of the Handbook, we discussed the design of a project, which helps you determine what data you need in order to inform stakeholders and help them make data-based decisions. Before rushing out into the field to collect this data, however, it is advisable to take a step back and assess what data is already out there, how reliable it is, and how you can actively involve your stakeholders to make sure you collect relevant data and communicate it in the right way.

Data research is a method that helps you systematically assess existing data, so that you can identify where the gaps are and where your data collection project can add value. At the same time, data research gives you the tools to think about your stakeholders and audience. This article gives an overview of four consecutive steps for conducting data research. They build on the two steps described in phase two: ‘Defining a clear and answerable question/problem you are trying to solve’, and ‘mapping all stakeholders involved in the problem’. The four steps are:

Data research in four steps

Step one: Make an inventory of existing data/evidence
Avoid double work and explore what data is readily available for use in your organisation.

Step two: Evaluate existing data
Assess the reliability of existing data and determine its usefulness.

Step three: Perform a gap analysis
Identify the data that is not readily available and will need to be collected.

Step four: Understand who will use your data
Think about which stakeholders to actively involve in your data collection and how to present the results to them.

Step one: Make an inventory of existing data/evidence

Once you have identified what data needs you have within your project, you will need to start gathering it. Some data may be readily available, while other data may still need to be captured. You can start off by making an inventory of existing data.

First of all, look into the data resources of your own organisation, including what is gathered in reports and stored in databases. Consider both quantitative data, expressing a certain quantity, amount or range, and qualitative data, which is more descriptive, resulting from small scale surveys, focus group discussions, observations and interviews. You can then think about what data may be available and easily accessible outside of your organisation. Are there any data sharing platforms or other organisations that deal with the same problem or try to answer the same question? What data do they have on this problem? Is it open access? Even if data is not openly accessible, it might be possible to persuade this organisation to share its data.

Step two: Evaluate existing data

Once you’ve created an inventory of existing data sources, evaluate the data they contain on its accessibility, granularity, credibility and relevance. The following questions can help you judge whether the existing data is available for use, detailed enough and at the right scale, and reliable and relevant enough to use in your programme:

  • Is the data openly available, or does it require special permission to access? (Accessibility)
  • Is the data structured in a way that is useful for your project? (Relevance)
  • How often is the data collected? (Granularity)
  • How granular or detailed is the data geographically? (Granularity)
  • How granular or detailed is the data demographically? (Granularity)
  • When was the data collected? How long has it been retained? (Relevance and Granularity)
  • Do the people currently working on the problem use it for decision making, evaluation, or something else? (Credibility)
  • Who collected the data? What was the purpose of their data collection? Has the data been cleaned and/or analysed? And if so, in what way? (Credibility)
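
The questions above can be recorded per dataset as a simple checklist. The following is a minimal sketch in Python and not part of the Handbook itself: the `DatasetReview` class, the dataset names and the pass/fail answers are all hypothetical, and it assumes you reduce your answers for each criterion to a single yes or no.

```python
from dataclasses import dataclass
from typing import Dict, List

# The four criteria named in step two.
CRITERIA = ("accessibility", "granularity", "credibility", "relevance")


@dataclass
class DatasetReview:
    """Hypothetical record of one dataset's evaluation."""
    name: str
    checks: Dict[str, bool]  # criterion -> did the dataset pass your review?

    def failed_criteria(self) -> List[str]:
        # A criterion that was never assessed counts as not passed.
        return [c for c in CRITERIA if not self.checks.get(c, False)]

    def is_usable(self) -> bool:
        return not self.failed_criteria()


# Illustrative entries only; these are not real datasets.
reviews = [
    DatasetReview(
        "internal monitoring reports",
        {"accessibility": True, "granularity": True,
         "credibility": True, "relevance": True},
    ),
    DatasetReview(
        "partner household survey",
        {"accessibility": False, "granularity": True, "relevance": True},
    ),
]

for review in reviews:
    status = "usable" if review.is_usable() else "needs follow-up"
    print(f"{review.name}: {status} {review.failed_criteria()}")
```

Treating an unanswered criterion as a failure is a design choice in this sketch: it keeps a dataset out of the "usable" pile until every question has actually been looked at.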

Step three: Perform a gap analysis

Now that you have identified which data sources are available and which data you can use for your project, think about what data you still need to collect to answer your questions. It helps to ask: “What data do I need to answer my questions or describe my indicators?” At first, set aside any restrictions that might apply to collecting this data. Only after identifying the data you need should you consider restrictions such as time, (financial) resources and feasibility. Data you would otherwise have dismissed as infeasible may turn out to be less difficult to gather than expected.

Once you have identified all the data gaps, take a critical look at the data you have marked as necessary. Do you really need to collect all of it? What will you use each element for? Although it is tempting to collect data that might be useful in the future, a general rule of thumb is that less is more: focus on the things that really matter and minimise complexity. A smaller, essential dataset is cheaper and quicker to collect, and it reduces the risk of collecting the wrong data.
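
The two parts of the gap analysis can be sketched as a short script: first list what is needed but not yet available, then keep only the gaps that have a stated use. This is an illustration under assumptions, not a prescribed method; the data items, the purposes and the rule "no stated purpose means not essential" are hypothetical.

```python
# Each data need is paired with the question or indicator it serves.
# An empty purpose means nobody has yet said what the data is for.
needs = {
    "water point locations": "indicator: coverage per village",
    "functionality status": "indicator: share of working water points",
    "household income": "",
}

# Data judged usable in step two (illustrative).
available = {"water point locations"}

# Part 1: everything needed that existing data does not cover.
gaps = [item for item in needs if item not in available]

# Part 2: keep only gaps with a stated purpose ("less data is more").
essential_gaps = [item for item in gaps if needs[item]]

print("All gaps:", gaps)
print("Essential gaps:", essential_gaps)
```

Restrictions such as time and budget are deliberately absent here, in line with the advice to consider them only after the needs are clear.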

Step four: Understand who will use your data

If you are collecting data to help solve a problem, or to underline the importance of addressing it, it is crucial to involve all relevant stakeholders from the beginning of the data research process. This creates ownership of the data, ensures that the data is relevant and useful, helps communities feel represented by it, and reduces the risk that decision makers turn a blind eye or question its credibility. Start your data collection exercise with an inventory of what the different stakeholders want to know and how you are going to reach them.

Sharing the data with the people directly involved in the problem empowers them to take action. However, this requires thinking about how to share the data in an understandable and accessible way. In remote communities, accessing data online may be difficult, and radio stations or offline materials may be a better mode of dissemination. Consider making a data dissemination plan, in which you identify your stakeholders and their respective communication channels. For more information on how to reach your target audience, refer to phase eight of the Handbook.
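
A data dissemination plan can start as a simple table of stakeholders and channels. The sketch below is one possible way to hold such a plan and check it for stakeholders who cannot be reached. The stakeholder names, the channel names and the `unreachable_stakeholders` helper are hypothetical, and the list of offline channels is an assumption for illustration.

```python
# Channels assumed to work without internet access (illustrative list).
OFFLINE_CHANNELS = {"radio", "printed materials", "community meeting"}

# One entry per stakeholder: can they get online, and how will you reach them?
plan = [
    {"stakeholder": "remote community", "online_access": False,
     "channels": ["radio", "printed materials"]},
    {"stakeholder": "district officials", "online_access": True,
     "channels": ["online dashboard"]},
    {"stakeholder": "village water committee", "online_access": False,
     "channels": ["online dashboard"]},
]


def unreachable_stakeholders(plan):
    """Return stakeholders for whom no planned channel is usable."""
    missed = []
    for entry in plan:
        # A channel is usable if it is offline, or if the stakeholder is online.
        usable = [ch for ch in entry["channels"]
                  if ch in OFFLINE_CHANNELS or entry["online_access"]]
        if not usable:
            missed.append(entry["stakeholder"])
    return missed


print(unreachable_stakeholders(plan))
```

Any stakeholder the check returns needs a different channel before data collection starts.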


Data research is an approach that helps you create focus in your project. Thinking about data gathering in this structured way avoids duplicate data collection and encourages everyone involved to assess the quality and usefulness of available data. It also lets you check whether the data you are collecting is truly relevant to your project and to the stakeholders involved, and it forces you to think about how to disseminate the data to them before data collection starts.


Authors: Annabelle Poelert (Akvo.org), Karolina Sarna (Akvo.org)
Contributors: Anita van der Laan (Akvo.org), Rajashi Mukherjee (Akvo.org), Rob Lemmers (Faculty of Geo-Information Science and Earth Observation (ITC) of the University of Twente)


The Africa-EU Innovation Alliance for Water and Climate (AfriAlliance) is a 5-year project funded by the European Union’s H2020 Research and Innovation Programme. It aims to improve African preparedness for climate change challenges by stimulating knowledge sharing and collaboration between African and European stakeholders. Rather than creating new networks, the 16 EU and African partners in this project will consolidate existing ones, consisting of scientists, decision makers, practitioners, citizens and other key stakeholders, into an effective, problem-focused knowledge sharing mechanism.
AfriAlliance is led by the IHE Delft Institute for Water Education (Project Director: Dr. Uta Wehn) and runs from 2016 to 2021. The project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 689162.