Agency means that people have the power to play active roles in data systems and to influence decisions about their data and about the ways that data use affects them. Top-down approaches to data design and collection limit people’s exercise of agency and exacerbate existing power asymmetries in society. Inclusive approaches can expand it.

Who controls the design of data and statistical concepts and definitions has implications for how people are represented and included in data processes and resulting decisions. Inclusive approaches are important even beyond data production. Fundamental issues such as the structuring of questions, the decisions about who will ask those questions, and how the data is collected, analyzed, interpreted, and presented affect what data gaps are prioritized and ultimately how data systems are designed. Data in this way becomes an instrument that either reinforces or rebalances unequal power relationships in society. When people—especially those who have been historically excluded from decision making—actively participate in decisions about data collection, design, analysis, and use, they gain greater access to the benefits of data.

The statistics community plays an important role in producing data and promoting inclusive approaches to data. By designing data and statistical concepts, definitions, methodologies, and quality assurance frameworks, this community influences how people are represented and included in data processes and the resulting decisions.[xl] The statistics community has made great strides in developing inclusive approaches to data in areas such as governance, gender, poverty, and aging, and in using non-traditional data sources such as big data. But statisticians in the public sector are also often constrained by political priorities and by limited budgets and capacity. As the custodians of global statistical principles, statisticians have an important role to play in maintaining standards of autonomy and confidentiality to foster inclusion.

Building on this work, this chapter breaks down how data production and use affect power relationships in society. It highlights several promising approaches for increasing individual and community data agency, and it showcases how this agency contributes to a future centered around more equitable decision making and outcomes.

 

2.1 Unpacking data agency

Gwen Phillips is an Indigenous data advocate and member of the Ktunaxa Nation, one of Canada’s First Nations, who argues that the Canadian government’s data collection has historically focused on negative characteristics of societies like hers instead of on community assets, strengths, and abilities. Gwen says this historical focus is not by accident. “As long as others are controlling the agenda, data, and investments, we’re always going to be subject to being beggars in our homeland,” she explained.[xli] In Gwen’s view, data can be a means of oppression and of liberation.[xlii]

The government of Canada, through Statistics Canada, has been working with First Nations communities and other marginalized communities to address this. Statistics Canada is putting people at the center by analyzing the interactions between different sector outcomes to understand the factors that exacerbate exclusion and to capture the lived experiences of these communities. As a data steward, Statistics Canada is also ensuring that data is based on consistent standards and classifications that allow international comparison to guide decision making.[xliii]

Like other historically marginalized groups, Indigenous communities around the world have experienced the adverse consequences of being excluded from data, of having no say in how they will be measured, and of having their lived experience ignored. As a result of long-standing systems of historical oppression and marginalization, many groups have been excluded from taking part in decision making processes, resulting in missed opportunities to share in the benefits and value of data.

When people and communities have agency in the production, governance, and use of data, they can influence the choices that are made about that data.

Agency is 'the capacity of people to actively and independently choose and affect change.'

For this paper, we apply this definition to data: agency means having control over one's data and being able to choose whether, when, and with whom to share it, as well as whether and how one is counted.[xliv]

Agency differs at personal and community levels. At the individual level, agency includes control over one's personal data (such as identification number, medical records, and location data) and the ability to choose when, with whom, and for what purposes to share it. But simply understanding agency at the individual level is not enough. The design, collection, and use of personal data can have broad impacts on groups and community members.[xlv][xlvi] Collective agency refers to the need for groups and communities to take part in data design, collection, analysis, interpretation, and presentation. A lack of agency at both levels means that people are excluded and unable to participate in decisions that affect their lives. It also means that their views and experiences may not be accurately reflected in data.

 

2.2 How data reinforces unequal power relationships in society

At the onset of the SDGs, the LNOB agenda was the central, transformative promise to reach the furthest behind, to combat discrimination and inequalities within and among countries, and to address their root causes.[xlvii] The LNOB agenda has emphasized and advanced important efforts toward identifying inequalities and discrimination through the generation of evidence, data collection, and data disaggregation. As Box 1 explains, disaggregating data by sex, disability status, and other factors is a first step towards agency in data, because inequalities are often obscured in aggregate-level data. But disaggregation is not sufficient on its own.

Box 1. The importance of looking beyond data disaggregation

Data disaggregation is the process of ensuring that data used to generate statistics and indicators for population groups can be further broken down into one or more dimensions or characteristics (commonly sex, geographic area, age, race, ethnicity, and disability). Data disaggregation allows data users to compare population groups and to understand the situations of specific groups.

Policy makers have used disaggregated data to identify at-risk populations and establish policies, programs, and legislation to protect them. For example, data from the Demographic and Health Survey revealed that, in the majority of sub-Saharan African countries, women in their teens and early twenties were disproportionately at risk of contracting HIV/AIDS. Governments responded by creating specific curricula on HIV transmission for young women and by prioritizing this population in the fight against infection.[xlviii]

Sometimes disaggregated data is not enough. Disaggregation cannot improve the visibility of those who are excluded from original data collection. It is also not possible to disaggregate data sets by every relevant dimension, meaning that some inequalities will remain invisible. Decision makers and statisticians who decide which disaggregation dimensions are prioritized therefore have power over which disparities will be analyzed, yet their perspectives may be biased or incomplete.[xlix] As such, disaggregation is not enough to ensure that people’s agency in data leads to greater access to resources, decision making, or existing levers of power.

An intersectional approach to data identifies inequalities within and between groups of people based on how an individual’s multiple identities (such as race, gender, and disability status) intersect. This helps ensure that these factors are not intentionally or unintentionally obscured, which would lead to underestimating the roles and contributions of each person in society. Important concepts relevant to disaggregation may lack internationally agreed upon definitions or require activities beyond just data collection.
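The difference between an aggregate figure, a single-dimension breakdown, and an intersectional breakdown can be illustrated with a minimal sketch. The records below are invented for illustration only and do not come from any survey cited in this paper; the function name `access_rate` and the service-access indicator are likewise assumptions.

```python
from collections import defaultdict

# Hypothetical survey records: (sex, disability status, has access to a service).
# All values are invented for illustration.
records = (
    [("female", "with disability", True)] * 2
    + [("female", "with disability", False)] * 8
    + [("female", "without disability", True)] * 36
    + [("female", "without disability", False)] * 4
    + [("male", "with disability", True)] * 6
    + [("male", "with disability", False)] * 4
    + [("male", "without disability", True)] * 36
    + [("male", "without disability", False)] * 4
)


def access_rate(rows, key=lambda row: "all"):
    """Return the share of rows with access, grouped by key(row)."""
    totals = defaultdict(int)
    with_access = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        with_access[group] += row[2]  # True counts as 1, False as 0
    return {group: with_access[group] / totals[group] for group in totals}


# Aggregate figure: one number for everyone.
overall = access_rate(records)

# Standard disaggregation: one dimension at a time.
by_sex = access_rate(records, key=lambda row: row[0])

# Intersectional view: two dimensions combined.
by_sex_and_disability = access_rate(records, key=lambda row: (row[0], row[1]))

for label, result in [
    ("overall", overall),
    ("by sex", by_sex),
    ("by sex and disability", by_sex_and_disability),
]:
    print(label, {group: round(rate, 2) for group, rate in result.items()})
```

In this invented data set the overall access rate is 80 percent and the gap between women and men looks modest, yet the rate for women with disabilities is 20 percent. The sketch also shows the limits noted above: a group that was never sampled cannot appear in any breakdown, and someone still has to choose which dimensions to combine.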

The Institute of Global Homelessness, through its ‘A Place to Call Home Initiative,’ took an intersectional approach to data.[l] Their approach ranged from developing a Global Framework for Understanding Homelessness, which can be easily adapted to different contexts while allowing comparable definitions between countries, to ensuring that people with lived experience informed the design of data collection and took part in data collection, analysis, and use.

A key way that data reinforces unequal power relationships is by rendering people or groups invisible in data, undermining their agency and exacerbating inequalities. When people are not counted or are not appropriately represented in data, they are invisible to decision makers in government and development organizations.[li] Approaches that prevent people and communities from shaping data design, collection and analysis efforts based upon their own lived experiences also exacerbate their invisibility.

People may be excluded from data for a range of reasons. First, people who live in hard-to-reach locations, who are illiterate, who lack access to digital technology, or who have a particular life situation or belong to a specific group of the population are often excluded from data sampling and data collection. Second, asking one household member to answer questions on behalf of the others (particularly on sensitive issues related to health, financial decision making, time use, and exposure to risk or violence) does not accurately capture differing constraints and opportunities within households. Household-level surveys have significant implications for people whose contributions are more likely to be underreported. Likewise, failure to register the births of children may prevent enrollment in school, and failure to gather data on children with disabilities, for example, hinders provision of accessible schooling, thus denying children with disabilities their right to quality education.

Some people may choose not to be counted because they lack trust in institutions or decision makers or because they perceive no benefit to being counted. At times people choose not to be counted for fear of the consequences of being recognized by governments or watchdog groups, such as businesses being deregistered or taxed or the loss of privacy.[lii],[liii] In countries where civic and digital rights are not well-protected, being included in data can pose a serious threat to people, as it gives governments the means to surveil and control populations.[liv]

In other cases, people are misrepresented or rendered invisible in data, resulting in information that does not accurately reflect the priorities or characteristics that are important to their communities. This is true particularly in settings such as humanitarian operations involving refugees and displaced people.[lv] In these cases, data is collected for service provision, but when people are not consulted on what data should be collected and how it should be used or shared, decision makers may wield their power to manipulate priorities. This erodes people’s agency and access to resources and opportunities, particularly because the policies that are then enacted may not meet people’s needs.

Structural inequalities are reinforced when data design, collection, disaggregation, and analysis are top-down processes that measure levels of deprivation or assimilation, i.e., “How much poorer are these people in comparison with the majority?” instead of providing a more holistic picture of people's situation, reflecting their resilience and strengths, as well as needs. Inclusive and participatory approaches ensure that people and communities are actively involved and can shape these data processes.

The international statistical community has developed statistical methodologies to guide countries in producing statistics that actively involve people and their communities. The Fundamental Principles of Official Statistics give clear guidelines to National Statistical Offices to ensure impartiality, confidentiality, and adherence to standards and methods, among other principles in producing statistics.[lvi]

Capturing robust, disaggregated, and intersectional data may require collecting larger samples or testing innovative approaches to capture the experiences of relatively small groups of people amongst larger populations and improving the availability of relevant data. Statistical agencies and other data-gathering organizations may face practical constraints to producing such data including a lack of financial resources, capacity, or adequate methodologies. As the custodians of statistical standards, National Statistical Offices (NSOs) face difficult trade-offs between producing robust statistics and avoiding exclusion. Nevertheless, examples in this chapter highlight how trailblazing data producers are experimenting with new methods, data sources, and approaches to foster inclusion and promote agency.
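A back-of-the-envelope calculation illustrates why small groups drive up sample sizes. The sketch below uses the standard sample size formula for estimating a proportion and assumes simple random sampling, a large population, roughly 95 percent confidence (z = 1.96), and the most conservative proportion of 0.5. The function names, the 2 percent group share, and the 5-point margin of error are hypothetical choices for illustration, not figures from any statistical office.

```python
import math


def required_group_sample(margin_of_error, proportion=0.5, z=1.96):
    """Respondents needed from one group to estimate a proportion within
    +/- margin_of_error, assuming simple random sampling and a large population."""
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2)


def required_total_sample(group_share, margin_of_error):
    """Total sample needed so that a group making up group_share of the
    population is expected to supply enough respondents on its own."""
    return math.ceil(required_group_sample(margin_of_error) / group_share)


# Respondents needed within any single group for a 5-point margin of error.
group_n = required_group_sample(0.05)

# Total sample needed if the group is 2 percent of the population (hypothetical).
total_n_for_2_percent_group = required_total_sample(0.02, 0.05)

print(group_n, total_n_for_2_percent_group)
```

Under these assumptions a few hundred respondents are enough for any one group, but reaching that number for a group that is 2 percent of the population through an unweighted random sample would take a total sample in the tens of thousands. This is one reason survey designers often turn to oversampling or to complementary data sources rather than simply enlarging the whole sample.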

In a recent example from the United Kingdom (UK), advocates pointed out how nationwide inflation measures failed to factor in the experiences of low-income people for whom prices of basic food products had increased at rates several times higher than the average rate estimated by the government. “The system by which we measure the impact of inflation is fundamentally flawed—it completely ignores the reality and the REAL price rises for people on minimum wages, zero hour contracts, food bank clients, and millions more,” anti-poverty campaigner Jack Monroe argued on Twitter.[lvii] Such gaps increase the risk of enacting policies that further harm people whose experiences were not factored into inflation estimates.[lviii] In response, the UK Office for National Statistics announced ongoing plans to develop a more accurate and expansive measure of household inflation.[lix]
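The mechanism behind this critique can be sketched with a simple expenditure-weighted price index. This is not the method used by the UK statistics office; it is a minimal illustration, and every price change and budget share below is hypothetical.

```python
# Hypothetical year-on-year price changes by category (1.0 means no change).
price_relatives = {"basic food": 1.20, "energy": 1.15, "other goods": 1.03}

# Hypothetical budget shares; each set of weights sums to 1.
average_household_weights = {"basic food": 0.10, "energy": 0.05, "other goods": 0.85}
low_income_household_weights = {"basic food": 0.30, "energy": 0.15, "other goods": 0.55}


def inflation_rate(weights, relatives):
    """Expenditure-weighted average price change (a Laspeyres-type index) minus 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    index = sum(weights[item] * relatives[item] for item in weights)
    return index - 1.0


average_rate = inflation_rate(average_household_weights, price_relatives)
low_income_rate = inflation_rate(low_income_household_weights, price_relatives)

print(f"average household: {average_rate:.1%}")
print(f"low-income household: {low_income_rate:.1%}")
```

With identical price changes, the household that spends a larger share of its budget on food and energy experiences an inflation rate close to twice the average in this invented example (about 9.9 percent against about 5.3 percent). A single headline figure built on average spending patterns would not show that difference.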

The increase in production and use of privately held data has led to practices that risk further erosion of individual and community agency in data.[lx] When decision making is contracted out to artificial intelligence (AI) without involving groups whose lives are affected by these algorithms, the consequences can be devastating in terms of bad decisions, unintended consequences, and missed opportunities. Misuse of historical data (resulting from built-in bias and stereotypes affecting the datasets) as well as automatic classification can harm people who are already vulnerable. Take, for example, the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) system in the United States, which has been found to be biased against Black people. Judges use the program to help decide whether defendants should be detained or released on bail pending trial. It assigns each defendant a risk score based on the estimated likelihood of committing a future offense, which can guide judges to give longer detention periods to defendants with higher risk scores.[lxi] Such systems exacerbate structural and systemic inequalities. Organizations like the Center for Policing Equity work with American police departments to minimize racial bias in data-driven systems.[lxii]

 

2.3 How data challenges power relationships in society

Data can also be a means of enhancing people’s and communities’ agency in decision making and resource allocation, increasing their visibility to decision makers in government and development organizations, and creating pathways for transparency and accountability. For example, foundational public data systems such as birth, marriage, divorce, identity, and death registration systems enable people to access services and exercise their civic duties. Information from these systems guides governments in allocating resources and deciding where to prioritize efforts and investments.[lxiii] Collecting data that reflects societal inequities among people based on race, gender, and other intersecting factors also enables policy makers to address disparities.

Big data and artificial intelligence can also be harnessed by NSOs to improve the efficiency, timeliness, granularity, and comprehensiveness of data collection and statistical production.[lxiv] For example, to ensure COVID-19 vaccines reached people with the greatest need in Guatemala, geospatial mapping software provider Fraym worked with the government and other actors to design an equitable vaccine allocation model to guide the national vaccination plan. The model identified population characteristics at the hyperlocal level, prioritizing people based on risk factors such as age and socioeconomic status and indicators such as utilization of health services.[lxv]

Analytical approaches beyond standard disaggregation can surface intersecting inequalities and reveal social norms and structural inequalities that may present themselves in data.[lxvi] On a global scale, the Multidimensional Poverty Index (Global MPI) uses traditional survey data to analyze intersecting experiences of poverty, such as housing, nutrition, and cooking fuel, to identify “the poorest among the poor.”[lxvii] For the first time in 2021, the Global MPI report looked at poverty data disaggregated by race and ethnicity, uncovering “stark inequalities” that had previously been obscured by aggregated data.[lxviii] Similarly, the 2022 SDG Gender Index developed by Equal Measures 2030 with support from the Tableau Foundation applies a gender lens to the 17 SDGs. This index uncovers areas in which women lag behind men—for example, in access to education and digital banking—to enable policy makers to target programs that help close the gender divide in key development outcomes.[lxix] The international statistical community has also increased efforts to provide leadership on intersectional approaches to data, particularly for gender. This is reflected in the work of the Interagency and Expert Group on Gender Statistics (IAEG-GS).[lxx]

Beyond data disaggregation and intersectional analyses, it is critical to explore ways for people and communities, especially those who are marginalized, to participate at every stage of data creation, analysis, and use. For example, Statistics Canada recently established a disaggregated data action plan which prioritizes the voices of diverse groups and communities to better reflect their experiences and meet their data needs.[lxxi] Through direct involvement in data processes, people can surface different perspectives and influence decision making and implementation. In some instances, the voices of these diverse groups may be captured through qualitative methods such as storytelling. In India, the Poverty and Human Development Monitoring Agency (PHDMA) of the government of Odisha has a network of 6,700 field officers trained to capture stories of change and lived experiences in their communities.[lxxii]

When the Centre for Internet & Society (CIS) undertook a project to build digital platforms in the domestic and care work sectors in India, researchers initially planned to ask direct questions about how caste discrimination impacted women from Dalit and Indigenous communities.[lxxiii] But members of the Domestic Workers Union who were included as project co-researchers cautioned against asking specific types of questions based on their personal experiences of domestic work and the sensitivity of the subject. As a result, CIS researchers adjusted the questions. The answers they received brought the realities of domestic workers' experiences to the forefront, enabling more robust data collection and project design. Such person-focused and inclusive approaches lead to better data and research design and consequently better policies and outcomes.[lxxiv]

Through more participatory and inclusive data and data processes, people and communities can build their data literacy skills and their capacity to use data to create and advocate for change. Such data approaches also create incentives and mechanisms for people to access data and provide feedback on the quality of services. Efforts to publish data or make data open and accessible, while safeguarding privacy, ensure that people can interrogate, influence, and even lead decision making. These are the foundations for transparency and accountability, which strengthen individuals’ and communities’ agency and trust in data systems and decision makers. For example, as part of the Innovation to Inclusion (i2i) program, Organizations for Persons with Disabilities in Bangladesh and Kenya implemented data-driven advocacy strategies to strengthen digital and tech-based solutions for disability inclusion. Through this project, the organizations learned that clear advocacy goals, backed by data and relationships, were key ingredients for concrete progress. By applying this learning, they were able to influence physical changes in government offices to enable accessibility.[lxxv]

 

2.4 Rebalancing unequal power dynamics: adapting features of inclusion

This section highlights practical applications of inclusion that support people to gain agency in data. The features of inclusive approaches are broadly termed representation, co-creation, and review, and are explained in further detail in Figure 1. These approaches enable people to engage directly in data production and/or participate in co-creation and decision making around what data is collected and how it should be collected and analyzed, building their data skills in the process. No single approach is sufficient, and each approach involves trade-offs that may compromise people’s agency in data.

Figure 1. Features of inclusive data systems

Representation

Standard disaggregation methods aligned with SDG target 17.18 and the LNOB agenda surface group-level inequalities and differences by “income, gender, age, race, ethnicity, migratory status, disability, geographic location” and more. Representation through disaggregation is a prerequisite to data agency.

Example: The Wa Community in Myanmar (located in the northern, non-government-controlled region) was included in the national census for the first time in 2014. This facilitated a development process to reach people in this remote location, women and girls in particular.[lxxvi]


Co-creation

In co-creation, data is created with rather than for or about people. The result is that people can influence the data that is produced, and they can produce data that they deem relevant for their needs. The key feature of co-creation is that, in deciding what matters to them, people take part in defining data concepts, classifications, and standards and informing decision making.[lxxvii]

Sometimes these efforts are led by governments working with communities to shape how they are defined and how data is collected, and at times these efforts are led by non-state actors.

Example: The Central Bureau of Statistics of Nepal and the National Human Rights Commission among others are working with youth and women to generate data on their situations.[lxxviii] Citizen-generated data methods such as Open Mapping (e.g. HOT), citizen science, sub-national data collection by citizens, and disability data collection enable citizens to decide what issues are important to them, collect the data and engage their leaders with the data.[lxxix][lxxx]


Review

Working arrangements such as committees or task forces convene experts and community representatives, often from different disciplines, to lead assessments of data gaps, biases, and related issues.

Examples: The Washington Group on Disability Statistics was established twenty years ago to develop internationally comparable disability measures. The development of these measures has been an inclusive process that has brought together government and non-government stakeholders. The international statistical community, through the UN Statistics Division, has also established city groups on statistical methodologies in which communities who are directly affected review data, for example, on governance and aging.[lxxxi] Some committees or task forces may operate within a specific country, as in the UK through the Inclusive Data Taskforce.

When people are represented in data, efforts are made to ensure that they are visible in data collection, design, analysis, and presentation. Increasing representation often results from collective advocacy among different stakeholders including human rights groups and advocates. When people care deeply about issues and are willing to advocate for change, data producers can respond by expanding definitions and data collection efforts. For example, the Kenya National Bureau of Statistics added a third gender option (intersex) to the national census in 2019 after working closely with human rights groups.[lxxxii] This doesn’t mean that representation is easy or straightforward, a fact that is especially evident when establishing definitions of individual and group identity such as race and ethnicity.[lxxxiii] Again, people may not wish to share their data or be visible in data out of fear of reprisal.

People are more likely to care about data when they are involved in creating it. This is co-creation, when the views, lived experiences, and perceptions of communities are incorporated into the design phase of data-focused projects. This can happen directly, as in the Open Mapping projects through Humanitarian OpenStreetMap Team (HOT), or through representative consultation, such as the example from Colombia in Box 2.[lxxxiv] In both cases, people’s views are factored in and they receive feedback from decision makers at every step of data design, collection, and use. More broadly, when people are involved in co-creation, their stake in the data will also increase. However, co-creation can be time- and resource-intensive, especially in settings that require quick action. Co-creation also requires a level of knowledge of the issues and a culture of willingness among citizens to engage in sharing views and experiences. Additionally, co-created data may not meet the criteria for official statistics or be regionally (or internationally) comparable, but it can supplement or complement official data by adding granularity and nuance that highlights people's lived experiences.

Finally, review provides a means by which people can provide feedback and contribute to how data is created, processed, and used based on specific regional or community priorities. An example of this occurred in mid-2020, when the GovLab held a series of consultations on reusing personal data to respond to COVID-19. Policy makers, citizens, and advocates shared their expectations and concerns.[lxxxv] Through this approach, committees or groups are tasked with ensuring that people’s needs and priorities are included and protected in data and with allowing for extensive consultation with communities. As the custodians of data quality, NSOs can systematically adopt review mechanisms to ensure that inclusion is prioritized alongside statistical rigor, making this approach both scalable and sustainable. Review processes run the risk of tokenism, however, and require people to select trusted intermediaries to steward and represent their communities.

Creating avenues for participation in the design, collection, analysis, and use of data is critical to fostering agency. The next chapter unpacks participation, building on the Ada Lovelace Institute’s framework of participatory data stewardship with a focus on participation in data governance.[lxxxvi] This framework is applicable because it highlights the need to ensure that data design, production, use, and analysis are inclusive and meet the needs of communities, which ultimately builds trust in the system.

Box 2. Counting race in the Colombian census

More than ten years after the last formal population count, Colombia’s national statistics office (Departamento Administrativo Nacional de Estadística or DANE) faced intense scrutiny ahead of the 2018 census. Previous censuses had asked questions about race but faced challenges of “poor wording and inadequate geographic representation” as well as “longstanding, culturally embedded discrimination” that resulted in “gross undercounting” among populations that historically lacked access to levers of power.[lxxxvii]

Community leaders, recognizing the importance of being counted, actively sought to shape the 2018 census. In the context of decades of conflict and historic undercounting of marginalized communities, “the risks of omission are very high,” a researcher from Colombia’s National University told reporters in 2016. “A very strong relationship between DANE and these organizations is needed for the logistics of this operation.”

In response, from 2015, Afro-Colombian and Indigenous community members and organizations consulted with officials from DANE to develop better measurements for race and to train enumerators to be sensitive when asking questions about race. An example of this was not assuming someone’s ethnicity because of skin color or clothing. For the first time, Indigenous communities were responsible for the census operations (transport and staff) in their territories. Collaboration led to a public education campaign to increase Colombians’ understanding and willingness to participate in the census.[lxxxviii]

While the census results were initially contested by Afro-Colombians, DANE has responded by combining census data with identification and georeferenced data and with other data sources such as administrative records to identify populations omitted from the census.[lxxxix] With AI, DANE has also been able to scale up existing poverty estimates from 1,123 data points to 78,000 data points, a roughly 70-fold increase.[xc][xci]

 

2.5 Setting our sights on data agency

Examples in this section have demonstrated that there is no one-size-fits-all approach. A combination of these features should be applied to maximize benefits and expand people’s agency through data. Leaders must take strategic and institutional approaches to prioritize ways to increase individual and collective agency and promote inclusion.

Approaches that build agency take a deliberate investment of time and skills, as they are about changing and challenging mindsets and shifting power. The work of the Washington Group on Disability Statistics has been ongoing for twenty years. Partnerships between NSOs and Indigenous communities in Colombia and Peru to revise how the censuses captured data on Indigenous people spanned more than three years. Implementing the resulting methodologies required the statistical offices to navigate sensitive issues like racial self-identification. Statistical offices and other data-producing institutions often experience resource and capacity constraints, making the task even more difficult. However, examples in this section have shown what is possible, even in low-resource settings.

Driving systematic change to rebalance power and promote agency should be a core goal of data stewardship. Across public and private sectors, data stewardship has been described as a function or set of functions to facilitate the production, management, sharing, and use of data within and between organizations in a responsible and trustworthy manner.[xcii],[xciii],[xciv] Trust is fundamental to stewarding data in the public interest and therefore requires considering the power imbalances that exist in data systems and how they can be addressed through greater inclusion and participation.

This chapter has highlighted how data can reinforce or rebalance unequal power dynamics in society. The negative effects of this inequality are felt most by people and communities that are marginalized. Inclusive approaches highlight ways to increase people’s agency in data, and applying a combination of those features enables collective agency and expands the shared benefits of data.[xcv][xcvi]

Individuals’ and communities’ agency in whether and how data is collected, analyzed, and presented is not enough on its own to alleviate injustice. How data is controlled, managed, and used, and who decides how this happens, can be a means of wielding power or of balancing and diffusing it. If the structures and mechanisms set up to govern data are accountable to the public and trade-offs are well managed, then data is more likely to be used for public good and less likely to cause harm. Fostering accountable data governance requires mechanisms for people to directly participate or have their interests represented in decisions about how their data is controlled and used. It also requires that the actions of decision makers are transparent and able to be questioned and changed if necessary.

The COVID-19 pandemic has illuminated many examples of public and private entities using personal data without adequate public engagement. When the UK’s National Health Service embarked on a contract with Palantir, a U.S.-based software firm, people were outraged that the contract could allow the company to use the health data of millions of Britons for non-COVID-19 response purposes. Handing this power to a company known for its work on defense and national security significantly undermined public trust. The government’s failure to consult the public on this contract and similar arrangements was at the heart of a lawsuit brought by Foxglove and openDemocracy that eventually caused the UK government to back out of the deal.[xcvii]

In the scandal over Palantir in the UK, recourse came through the legal system, which acted to safeguard rights and establish checks and balances. The legal system on its own, though, wasn’t enough to prevent harm. Civil society activists and members of the public who spoke out played a critical role by holding the government accountable. This example demonstrates that formal remedies and “after-the-event” enforcement might not even be triggered in the absence of participatory monitoring of decisions. Furthermore, retroactive enforcement solutions do not necessarily lead to more accountable data practices over time. In this case, the UK’s National Health Service had already been involved in a similar scandal in 2015 when it collaborated with the Google-owned AI company DeepMind to develop a health data-tracking app.[xcviii]

Accountability is far too important to be left to the realm of retroactive scrutiny and enforcement. It must be established at the outset to shape data-related decisions as they are taken. Accountability should be embedded at all stages of governance, starting with involving people in decision making. This can include mediating between conflicting interests and establishing penalties for bad behavior, creating the space for ongoing scrutiny of decisions and actions as they are taken, and, finally, integrating the outcomes of this scrutiny into new decisions. New participatory data governance mechanisms, such as the “learning data governance” approach established by Understanding Patient Data, an initiative of the UK-based foundation Wellcome Trust, reflect this cyclical view of accountability. It allows people to participate in decisions about data, to scrutinize the execution of decisions, impose remedies if needed, and learn from previous decisions to improve decision making outcomes over time.[xcix]

Formal data governance mechanisms such as laws, policies, and institutions provide frameworks for accountability at local, national, and international levels of governance.

However, formal mechanisms are necessary but not sufficient to shift power to the people whose data they are designed to protect.

Participatory mechanisms of data governance are essential for accountability because they provide spaces for deliberation, consensus-building, and continuous public scrutiny as a complement to and sometimes a check on formal mechanisms. These informal mechanisms are no less important than formal laws, policies, and institutions to ensure that data systems are accountable to people.

 

3.1 Accountability requires action at all levels and stages

The concept of data governance has its roots in the private sector, where it refers to the practices and systems used by corporations to manage data. Understandings of data governance in public policy have recently expanded to describe “the laws and policies governments enact to govern the use of data in society.”[c] The World Bank, in its 2021 World Development Report, argues that data governance is “the tangible expression of a country’s social contract around data.”[ci]

The World Bank’s report focuses on four core components of national and international data governance: 1) infrastructure policies; 2) data laws and regulations; 3) economic policies; and 4) governmental institutions, as well as other institutional actors, that set standards and increase data access and reuse. Efforts to strengthen data governance within and among countries over the last decade have focused heavily on the laws, policies, and institutions described by the World Bank.

Between 2010 and 2020, 62 countries enacted data privacy laws, more than in any other decade, bringing the total number of countries with such legislation to 142 at the end of 2019.[cii] Many countries and regions are exploring bilateral and multilateral agreements that address cross-border data flows, while organizations and projects are establishing or refreshing their policies, protocols, and data sharing agreements. The pandemic has intensified the spotlight on the role and function of these laws, policies, and institutions, as well as the urgency of establishing or improving them in all corners of the world.[ciii]

The important work happening at the highest political levels must be extended and supported, particularly in low- and middle-income countries where legal frameworks and the institutions required to implement data laws and policies may be weak or non-existent.[civ] But establishing and strengthening these laws, policies, and institutions is only part of the story. While formal structures and top-down mechanisms of accountability are required for effective data governance, they are often designed and decided upon by a relatively small number of actors in each country or organization. On their own, they rarely provide the space for affected communities to shape decisions, or even to know or understand what those decisions are, let alone to hold leaders accountable for operating within the framework that they establish.

Formal mechanisms of data governance can have participatory dimensions built in. For example, the EU General Data Protection Regulation (GDPR) and GDPR-inspired laws establish parameters for informing data subjects about how their data will be used. They also foresee remedies and enforcement mechanisms to hold those making decisions about data accountable.

However, informing people and providing legal remedies that can only be activated after harms are incurred meets the bare minimum for standards of participation and rarely leads to people or communities being able to influence the outcome of data use through increased knowledge or understanding.

Participatory data governance mechanisms that enable people to influence decisions or outcomes provide an essential complement to formal mechanisms. These include a range of approaches, institutions and forums designed to foster transparency and participation or create space for people’s interests to be represented in data governance processes. Furthermore, they extend well beyond retroactive scrutiny of decisions and provide avenues for continuous involvement and oversight.

Participatory mechanisms can operate inside, outside, or alongside formal mechanisms of data governance to strengthen accountability in practice.

By creating pathways for accountability, participatory mechanisms give people and communities more power in data governance. They can bring a diversity of perspectives together to balance competing interests and shift power asymmetries. They can foster greater transparency through open communication and information exchange, which creates space for continuous scrutiny. They can create opportunities for learning among all stakeholders—experts and laypeople, data producers and users, government officials and community members. This builds trust, increases data literacy, and demystifies technology and data governance. Most importantly, participatory mechanisms can operate on an ongoing basis that allows them to be agile and evolve. In contrast, legislation, regulation, and institutions are slow to adapt to change and struggle to keep up with the pace of technological development. However, when accompanied by participatory mechanisms, they become better equipped to adapt to the modern fast-moving digital world.

Box 3. The problem with individual consent

Much of the discourse around data governance focuses on privacy and protection and places the emphasis on individual consent for companies or institutions to collect and use personal data. While consent is an important cornerstone of data governance, it is increasingly viewed as insufficient on its own to foster accountability.[cv]

First, it places the burden on individuals and requires them to be fully informed, skilled, and equipped to make decisions about their data. In practice, evidence suggests that very few people read privacy notices before accepting them, which indicates that the perfectly informed individual who has time to read and consent to multiple notices every time an entity wishes to collect or use data does not exist.[cvi] Second, consent relegates individuals to a passive “assent or dissent” role, without allowing them to articulate their needs and aspirations in terms of data collection and use.[cvii] It prompts people to decide whether they want to participate by forcing them to either accept a given set of conditions or be left out or denied services, without any possible intermediate or third option.

Furthermore, individual consent mechanisms don’t address the way that personal data can impact people at the community or societal level.[cviii] They also don’t speak to the way big data is used in automated decision making where the goal is to derive population-level insights. This can lead to collective harms that are felt well beyond the individuals who provided consent.[cix] In other words, obtaining community consent for data collection, sharing, and use by ensuring that affected people and groups have outlets to have their views heard is equally if not more important than obtaining individual consent.[cx]

 

3.2 Pathways to accountability

A central feature of participatory mechanisms is that they enable people to engage directly or indirectly in data governance. This section describes what this looks like in practice and how these mechanisms contribute to accountability.

The Ada Lovelace Institute has created a useful model for understanding participatory data stewardship by adapting Sherry Arnstein’s ladder of citizen participation.[cxi],[cxii] The ladder’s steps in Figure 2 represent levels of participation by how much power affected people or communities have and how much is ceded by decision makers. The ladder begins at informing people how their data will be governed. The next steps are: consulting them and providing feedback on their concerns, involving them to ensure their concerns are reflected, collaborating with them in the design of data governance models, and empowering people by supporting their decisions about their own governance models. Moving up the ladder toward greater participation fosters greater transparency and trust and ultimately leads to redistributing power to people.

Figure 2. Ladder of participation in data governance (adapted from the Ada Lovelace framework), with examples


Inform

“We will keep you informed about how your data is being governed.”

Example: Most privacy and data protection regimes established in recent years follow the example of the GDPR in that they lay out clear rights of data subjects.[cxiii] In Uruguay, data subjects have the right to be informed about why their data is collected, who will be able to access it, what the effects are of not providing the data and how they can exercise other rights concerning data access, deletion, and modification.[cxiv] Data subjects must also be notified of any change in the governance of the data following its collection.


Consult

“We will listen to, acknowledge, and provide feedback on concerns and aspirations for the governance of your data.”

Example: In Ghana, where the Statistical Service (GSS) obtains mobile data to produce official statistics based on an agreement with Vodafone Ghana, Vodafone Foundation, and Flowminder, GSS established a Steering Committee to address requests for data from parties other than those in the agreement.[cxv] The Steering Committee includes representatives from civil society organizations that work to protect digital rights. This ensures that groups that bring a digital rights perspective can weigh in on ethical considerations in such decisions, and can hold government and private actors accountable through the decision making process.


Involve

“We will work with you to ensure that your concerns and aspirations are directly reflected in data governance.”

Example: Restore Data Rights is a grassroots movement campaigning for African governments to respect and protect fundamental human rights—particularly those exercised in cyberspace and over personal data—during and after the COVID-19 pandemic. Launched in November 2020, the movement is centered around a declaration that commits signatories and endorsers to transparency, inclusivity, and accountability around data governance in Africa during the pandemic.[cxvi] To date, 62 institutions and individuals have signed on, and organizers are additionally working with data protection offices in Kenya and Mauritius. The movement also established a civil society organization working group looking at long-term accountability on COVID-19 data use, ran a data protection awareness campaign in Kenya, and conducted research on how the provisions of the declaration are translated into law and practice in Kenya, South Africa, Nigeria, and Ghana, which will provide a way for the movement to assess government policies and actions against the declaration.[cxvii]


Collaborate

“We will look to you for advice and innovation in the design of data governance models and incorporate your advice and recommendations where possible.”

Example: Data-Pop Alliance’s Councils for the Orientation of Development and Ethics (CODE) are advisory groups of independent and local stakeholders who provide ethical guidance for data collection and use.[cxviii] In a project focused on gender-based violence during COVID-19 in South America, concerns from CODE members about stigmatization of victims led organizers to abandon plans to create maps of violent hotspots. Instead, “no stigmatization” became the primary ethical principle to ensure the project did not violate other data-related concerns related to harm, confidentiality, and privacy. This resulted in a shift to focus on factors that affect reporting rates among domestic violence victims.[cxix]


Empower

“We will advise and assist in line with your decisions about your own data governance model.”

Example: The First Nations principles of OCAP—which stands for ownership, control, access, and possession—informed the First Nations Regional Health Survey, the only First Nations-governed national health survey in Canada.[cxx] Since its launch 20 years ago, it has undergone three survey cycles in over 250 First Nations communities in Canada using both Western and traditional understandings of health and well-being. Its results have been used by numerous public agencies in Canada across health, economic, and public safety domains to assess the effectiveness of programs and design policies in a way that is responsive to First Nations’ needs and aspirations.[cxxi]

Fostering participation in data governance in one or several of the ways described by the ladder is already happening around the world and leading to greater accountability as a result, as Figure 2 explains. Councils and committees made up of local stakeholders can scrutinize a project or an organization’s data management processes to ensure it is responsive to local needs at the design and implementation stages, similar to what CODE does. Another approach is for communities to establish and implement their own data governance principles. Indigenous communities, as Figure 2 shows, have been at the forefront of establishing practical and ethical principles to govern data about their communities, starting with the recognition that accurate and timely information is key to addressing the long-lasting impacts of colonization and systemic racism. Many other innovative participatory approaches to data governance are currently being tested and researched around the world.[cxxii]

Fostering participation in data governance is not only the responsibility of public sector and civil society organizations. Private companies, too, can and should be engaged. Dozens of corporations, including data platforms and intermediaries such as 1001 Lakes, DataCave, and Meeco, have signed onto the MyData Declaration and joined the MyData Global movement since its founding in 2018. As a global network of entrepreneurs, activists, academics, corporations, public agencies, and developers, MyData aims to empower individuals to give, deny, or revoke their consent to share data based on a clear understanding of why, how, and for how long their data will be used. Likewise, software companies played a key role in embedding accountability in the adoption of GDPR in Europe. Making it possible for companies to easily buy GDPR-compliant data management software accelerated uptake of the new data protection regulations and, for the largest companies, enabled them to set their global data systems to standards set by GDPR.

There is no ideal approach for participatory data governance mechanisms. They adapt to the situations for which they are developed to enable accountability in national, local, or community contexts. However limited or expansive particular participatory mechanisms may be, they all provide important complements to formal governance mechanisms by shifting power to affected communities and creating pathways for accountability.

Box 4. Types of participatory data governance mechanisms

Recent years have witnessed an evolution in thinking and experimentation with mechanisms that shift power to data subjects and affected communities by enabling people to participate or have their interests represented in data governance.

The World Bank refers to these as multi-stakeholder governance mechanisms, which it defines as “participatory solutions which enable trust, value and equity in data use by adopting an approach that is informed by all people.”[cxxiii] The Open Data Institute has explored the concept of data institutions, or “organizations that steward data on behalf of others.”[cxxiv] Data institutions are a broad category that includes traditional organizations such as NSOs and newer constructs that enable greater participation through data trusts and data cooperatives.

Data trusts and data cooperatives are legal entities with statutes, rules, or mandates.[cxxv][cxxvi] They foster the emergence of trustworthy data practices by establishing structures where delegation and accountability mechanisms empower data subjects and affected communities that are not directly involved in daily decision making.

Data intermediaries are structures or organizations that facilitate the exchange of information between data rights holders (such as people or businesses) by “encapsulating, communicating and enacting the shared interests of the relevant parties and safeguarding their interests.”[cxxvii] Some data intermediaries offer technology-based solutions for data sharing that ensure decision making power remains entirely in the hands of data subjects. In other cases, data intermediaries assume decision making, including on behalf of people.

Multi-stakeholder fora, citizens’ juries, and assemblies aim to convene stakeholders with diverse and sometimes divergent interests around data to reach an agreement that is accepted by all stakeholders. They lead to the establishment of more trustworthy data practices by offering methods for building consensus and resolving conflicts, and they tend to be more informal in nature. The New York Data Assembly and Data Collaboratives are examples of initiatives that balance individual and collective as well as public and private interests around data sharing and use.[cxxviii][cxxix]

What these all have in common is that they create space to broaden participation in data governance by bringing interested and affected people together or creating a binding requirement to represent those who are most affected by data governance decisions.

 

3.3 Accountability in practice

If increasing participation is the gold standard in responsive and accountable data governance, then we’d be remiss not to also confront the challenges and enablers inherent to it. Numerous examples make it clear that participatory governance is not only possible but already widespread, even in low-capacity settings. Challenges and enablers will be context-specific. Nonetheless, organizations aiming to increase participation will often face similar trade-offs related to practical constraints and balancing individual and collective interests, as this section describes.

First, pure democracy is messy and complex. It’s a relatively simple task to gather three people together to create an agreement for how to manage and use their data. But these are not the situations where participatory governance presents a challenge. Instead, most governance questions arise at national, regional, and international levels, creating a trade-off between the possibility for direct involvement in decision making and the number of people who can be directly consulted. In such cases, individuals and communities can delegate to a representative who can advance their interests and participate in decision making on their behalf. However, this approach is also replete with the challenges of tokenism and the generalization of the views of a complex community.

To avoid tokenism, participatory mechanisms must respect the inherent diversity of views within communities, understanding that people have different priorities.

This diversity of views, however, might fail to emerge even when participatory mechanisms are well conceived, because communities have internal power dynamics that disempower some members, or leaders who privilege their own personal interests ahead of collective needs. Furthermore, participatory data governance approaches can be time- and resource-intensive and are often at odds with the pace of project implementation and technological innovation.

Second, we should not expect people who have been historically marginalized and disempowered to have the same values, priorities, or resources for data governance as the people and institutions that currently hold power.[cxxx] Additionally, people who have faced marginalization might be disillusioned and skeptical about what is achievable by engaging in initiatives launched by those who have power. If powerful players consistently set the agenda and define the rules of engagement for participatory initiatives, buy-in from marginalized communities may be low.

Within the Indigenous peoples’ data rights movement, for example, the emphasis has been on data sovereignty and self-determination, framing agency, privacy, and data sharing as issues of community—not personal—power and autonomy.[cxxxi][cxxxii] Where the focus is on addressing historical oppression, the balance between individual and collective rights in questions of data governance must be resolved through thoughtful participatory processes. Consultations must also highlight the resilience and strengths of communities—not only their needs and obstacles. Underpinning all of this is a critical question: How do we engage people in ways that address power asymmetries when the organizations and governments collecting data often have immense power and resources relative to local communities?

Finally, in seeking to increase participation in data governance, we must consider how to ensure that people have the knowledge, skills, abilities, resources, time, and willingness to take part in these processes. Certain forms of participatory data governance (e.g., those involving direct representation) require higher levels of engagement, knowledge, and skills than others (for instance, those involving delegation). Not all individuals need to become data experts. However, a general increase in levels of data literacy in society is desirable to enable participatory data governance mechanisms to flourish. Research shows that people at all levels of decision making have lower-than-necessary levels of data literacy, and that individuals may be unaware of the need to protect their own data or unwilling to invest time in doing so.[cxxxiii] Although interest in personal data governance appears to be increasing, there is still a general lack of awareness and knowledge of data governance as it appears at local, organizational, and international levels, making it unlikely that participants will come to the table fully prepared to participate without investment in training and education.[cxxxiv],[cxxxv],[cxxxvi] Data governance institutions can also do more to make processes accessible and understandable for non-experts. We must also consider how to compensate people fairly for their time and insights to ensure that participatory processes do not further exacerbate inequalities.[cxxxvii] This includes avoiding subjecting people to repetitive and costly requests for information.

There are limits to the extent to which people can genuinely participate in data governance and to what can be achieved through participation. These constraints notwithstanding, in places where formal data governance mechanisms are fragmented or weak, participatory approaches can lead to the adoption of more trustworthy data practices and increase accountability in how public and private institutions and organizations collect and manage data. We can enable people to engage by providing tangible resources such as compensation or childcare at meetings to ensure that parents can attend. The goal is to adopt approaches that challenge the status quo and force us to question underlying assumptions about who has a say and what matters in data governance.

Creating true participation in data governance is only possible through intentional, well-planned, and flexible efforts.

Creating avenues for participation must also account for and balance complex community dynamics and the day-to-day constraints that may hold people back from getting involved. It is also critical to manage expectations by creating the space for meaningful contributions while being transparent about the limitations and practicalities of projects and organizations. Above all else, participatory mechanisms must protect people and not put them at risk.

Institutions and individuals charged with stewarding data have an important role to play in engaging with communities and adopting or developing participatory approaches to data governance. Data stewards are uniquely positioned to consider how formal and participatory mechanisms of data governance may interact to foster greater trust and accountability in decision making around data.

For accountability to work, rules need to be enforced, decisions and actions need to be inclusive and transparent, and people need to be able to verify that those in power are doing what they said they would do. This requires robust data governance that is built on a solid foundation of laws, policies, and institutions and is buttressed by participatory mechanisms that allow affected communities to be informed and have a say in how their data will be managed and used. When accountability is continuous, data governance becomes more trustworthy. Numerous examples of this exist already in both the policy and development spheres. Nonetheless, there is a need to continue to create space for more innovation and experimentation to improve participatory approaches to data governance. New and evolving models are needed to push the boundaries of what participatory mechanisms look like and to broaden the range of participants.

As data transforms society, all people, especially those who have been marginalized, should have the means to hold the powerful accountable for decisions that determine how their data can be managed and used.

Data is ubiquitous in today’s world, embedded in the social, cultural, and political contexts of every country. Humans have never produced so much information so quickly, but increases in the quantity of data have not translated equally into our ability to address collective challenges. A wide range of incentives determine whether decision makers seek out data or willfully ignore or manipulate it. Ensuring that data is used is a complex business. Data is only one of several inputs when making a decision. This section focuses on the factors influencing whether decision makers seek out data and use it in the public interest. This is important because, almost a decade after the publication of A World That Counts, much valuable data remains untapped and underutilized.[cxxxviii] This failure fuels bad policy and inefficient programs, benefits the most powerful in society who profit by perpetuating the status quo, and leaves people who are marginalized behind.

Data that is collected with agency and governed with accountability must still be used effectively to drive actions that improve people’s lives. Collaboration and partnerships can help to deliver these outcomes. The uses and applications of different types of data (i.e. personal or non-personal, quantitative or qualitative, and publicly or privately held) vary and therefore require different levels of protection and openness. In recent years, the development community has adopted more nuanced approaches to data availability and data use, going beyond an “open by default” mentality and toward a culture of openness focusing on sharing and use of data in specific contexts to address specific challenges.

This is what happened in Togo during COVID-19-related shutdowns, when 138,000 people living in poverty received mobile cash transfers through their phones. No application process, survey, questionnaire, enumerator, or social worker was involved. Instead, four data-holding partners came together behind the scenes, using phone records, satellite data, and population data to develop MobileAid. MobileAid’s cash delivery program demonstrates that existing data can be shared and leveraged through innovative partnerships to make meaningful improvements in people’s lives. Putting data into action, while promoting agency and accountability, is an essential component of more equitable data systems. 

 

4.1 Factors that impact data use

Evidence-based decision making requires high-quality, timely data to be accessible to decision makers. This involves wide-ranging technical considerations including methodology, standards, infrastructure, data interoperability, format, and more. Discourse in data for development has largely focused on these considerations. Human factors that impact data use, such as people’s motivations, incentives, and opportunities to collaborate, as well as their capacity, skills, and institutional and organizational cultures and constraints, receive much less attention, although they appear to have greater influence on whether and to what extent data is used.[cxxxix] These human factors are more difficult and complex to identify and slower and trickier to fix. But, as the following sections demonstrate, they’re far from intractable.

4.1.1 Data use suffers amidst a landscape of declining trust

Trust is both an enabler and an outcome of data use.

For decision makers to use data, they must trust in its validity and reliability. Likewise, the public must be able to trust not only the data itself but also the credibility of the data producers and the willingness of public institutions and decision makers to put that data to use. A 2021 paper, Towards a Framework for Governing Data Innovation: Fostering Trust in the Use of Non-Traditional Data Sources in Statistics Production, highlights that “you cannot have trust in the usability of statistics if the data that underpin them are of poor quality and those producing them lack integrity.”[cxl] As discussed in the preceding chapter, building more trustworthy data practices starts with establishing participatory governance approaches, which should also provide a venue for people to hold decision makers accountable for effective data use, for instance, by monitoring how evidence is leveraged for public policies over time.

Yet all too often decision makers do not use data for public benefit. Data is often used in ways that concentrate power in the hands of the already powerful.[cxli] Ignoring data, using it to harm or surveil people without their consent, using it selectively or taking it out of context, or intentionally misrepresenting data to sway people’s opinions or mislead them are uses of data that disempower people. Data in public policy and private decision making is part of a larger landscape of ongoing social and political events and personal motivations and biases. Ensuring that timely, high-quality data exists and is accessible provides no guarantee that decision makers will use it to address inequalities.

Failure to use data and misuse of data have devastating results for communities, especially those that are marginalized, and for society at large. Misusing or ignoring data leads to bad policy outcomes and to declining levels of trust in public authorities. Failure to respond to people’s needs over time leads to disillusioned citizens who, in turn, are increasingly reluctant to trust their governments with their data. The consequences of declining levels of trust have been particularly visible during the COVID-19 pandemic, as seen in the adoption of contact tracing apps, which suffered from limited popularity and buy-in.[cxlii]

The spread of misinformation is also a sign of declining trust in official institutions. Initiatives like the CoronaVirusFacts Alliance provide independent fact checking aimed at rebuilding citizens’ trust in the context of what is now called an “infodemic.”[cxliii] However, these initiatives alone are insufficient to rebuild trust in institutions in the absence of better policy outcomes that demonstrate the benefits of putting data into action for all of society.

4.1.2 A patchy record of public use of privately held data

Public trust in responsible data use has become particularly important because the data landscape has shifted away from governments and toward private sector companies as the primary producers and holders of data. Box 5 reflects on the benefits of public access to privately held data.

The absence of strong frameworks for data sharing and protection between governments and companies erodes public confidence in, and use of, data. The public often becomes aware of data sharing and use by companies and governments only when a scandal breaks. The revelation that Israeli cybersecurity company NSO Group had shared personal data with governments that spied on citizens around the world is one example of this.[cxliv] Scandals erode public confidence in the public sector's ability to responsibly use and manage personal data, and they increase skepticism among policy makers about the benefits of sharing and using data from the private sector.

It doesn’t have to be this way, as many examples of public-private data-sharing partnerships born during the pandemic demonstrate. In one example, when the government of Argentina issued a call for data and analysis to respond to COVID-19, Telefonica Argentina responded by collaborating with the National University of San Martín to create a hub with up-to-date mobility data. “Privacy by design” to protect users' data was a critical feature of the program.[cxlv] The company signed agreements with national and local government agencies that used the hub on an ongoing basis to make policy decisions.[cxlvi]

Box 5. The benefits of public access to privately held data

Large companies today have access to more and better data on people compared to many governments. Expanding the state’s access to privately held data is complicated, but it is seen as essential for many governments given the volume and reach of privately held data.[cxlvii] Yet sufficient legal and regulatory frameworks for accessing privately held data may not exist, and the public may not trust either side with their personal data.

Meanwhile, governments are frequently excluded from accessing information that is largely available to other players based on pay-for-data solutions. Mobile Network Operators, for instance, sell customers’ aggregated and anonymized data to companies in finance, tourism, and retail that are willing to pay for insights.

This problem can’t be resolved simply by asking governments to buy data from companies. Nor can it be solved by companies universally giving customers’ data away for free. Initiatives like the European Commission’s recent Data Act are new attempts to redress the balance between public and private sectors by granting public authorities access to privately held datasets by law (and prescribing the circumstances under which such access is required), while establishing safeguards against misuse of data by the public sector.[cxlviii]

There is no one-size-fits-all approach for data sharing, but a useful menu of options around access to privately held data is starting to develop. This includes regulatory measures, contractual partnerships, procurement solutions, reciprocity models, and more. Such approaches hold promise to align incentives and allow governments to safely access and use privately held data.

 
4.1.3 Human interoperability and partnerships as important parts of the puzzle

Effective data use requires human interoperability—the idea that data doesn’t come together on its own but requires people working together across different parts of government, sectors, and communities.[cxlix] Individuals, not platforms or technical data pipelines, are at the heart of data sharing and use. At the most basic level, breakdowns in communication and coordination can leave data untapped to address public challenges. Beyond data interoperability—the ability to join up and merge data without losing meaning or context—the people engaged in designing, providing, collecting, analyzing, interpreting, and using data are crucial factors in enabling data use that empowers people at the bottom.

At the organizational level, multi-stakeholder partnerships can foster human interoperability and address human barriers to data use, ultimately putting data into action. Numerous examples show how partnerships between governments, private sector, academia, non-governmental organizations, and citizens lead to more informed decisions and help embed sustained data use in local and national contexts. Such partnerships are beneficial for all stakeholders involved, from traditional data institutions like NSOs to citizens’ groups and private sector players.

This is what happened when stakeholders in Kenya’s agricultural sector agreed that the government lacked reliable information on food stocks to guide policy and action amid COVID-19. Farmers, grocers, producers, and other stakeholders worked with government officials across departments and companies including Microsoft and ESRI to create the Food Staples Dashboard to monitor prices and availability of food stocks. The information was used in the context of Kenya’s Food Security War Room with more than 50 partners from development agencies, civil society, international organizations, government, and the private sector. The project enabled government officials to respond to the impact of COVID-19 lockdowns on food insecurity by strengthening food supplies, targeting food distribution, providing accurate information to citizens and media, and communicating directly with producers and consumers.[cl]

Work to break down barriers to data use in Senegal is another powerful illustration of human interoperability and the impact of multi-stakeholder partnerships, outlined below in Box 6.

Box 6. Breaking down barriers to data use in Senegal by building human interoperability

In Senegal, data use was stymied for years until an investment in human relationships and partnerships opened agricultural data for public use.[cli]

Agricultural activities account for the majority of economic activity in Senegal. For years, the national agriculture ministry produced regular data on farmers in the country that wasn’t used by other government ministries or non-governmental organizations. Instead, individual ministries, development agencies, and civil society organizations produced datasets for their own use, often duplicating efforts and multiplying inconsistencies, which led to poor policy outcomes.

Through the Agridata project, led by IPAR (Initiative Prospective Agricole et Rurale), a Senegalese think tank, and supported by Development Gateway, more than 50 agricultural data stakeholders from the public sector, private sector, and civil society came together to identify data sources and to build trust over more than two years.

Only by working together to align interests and resolve conflicts over who had ownership of this data were stakeholders able to agree to a common data platform to inform better decision making. Establishing a partnership between these actors was the first essential step to increasing data use among relevant decision makers.

 
4.1.4 Shifting organizational culture

Widespread shifts in organizational culture within governments, companies, and the nonprofit sector are needed to realize the potential of both public and private data use for public benefit.

The Open Data Institute (ODI) has created a useful model for breaking down attitudes that affect whether data is used to its maximum benefit by companies, communities, organizations, and governments. ODI’s Theory of Change, included in Figure 3, distinguishes between treating data like a precious and proprietary commodity (data hoarding) and shying away from data use altogether because of legitimate concerns of how it may be used or who has access to it (data fearing). In both scenarios, the power of data is left untapped without cultures “of openness and trust around data.”[clii]

Figure 3: ODI's theory of change

This is not simply letting good data go to waste. Poor data use cultures, especially within governments and companies, exacerbate power asymmetries and prevent the establishment of coalitions and partnerships.[cliii] When data is hoarded by organizations, its benefits go to the few instead of the many, and society cannot access its full value. Likewise, not collecting, analyzing, or using data out of fear of negative effects squanders the enormous potential of data, much of which already exists and is held by other actors. Only by transforming organizational culture, exploiting the power of partnerships, and holding leadership accountable can we fully unlock the power of data to effect positive social change.

 

4.2 Building data skills and literacy

People’s ability and confidence to understand, analyze, and make decisions about data, or data literacy, are the practical bedrock of effective data use. Data literacy was once seen as a technical concern for business leaders and public servants, but the proliferation of data and software platforms has extended it to the wider public sphere. Now, individuals across organizations, particularly in management roles, must feel empowered to assess and make decisions based on data, and the broader public needs to develop the knowledge and confidence to hold decision makers accountable for data use. Building skills around data is central to increasing data use by individuals, organizations, and governments.[cliv],[clv]

People need to feel confident in their ability to engage with and think critically about data to hold decision makers accountable.[clvi] Increasing people’s engagement with and use of data is a two-way street. Not only do people need to see value in paying attention to data and official statistics, they must also be able to access and understand them. Responsibility for these factors falls largely on data producers, who often communicate about data in ways that obscure its meaning and use. The onus is on these data producers to ensure the data is accessible, understandable, and usable. Data intermediaries and other organizations that stand between data producers and data users can also help bridge the knowledge gap and increase participation in data decision making. A Nigerian civil society organization, BudgIT, offers a useful illustration of how increasing people’s understanding of and access to official data can shift power dynamics and increase government accountability, outlined in Box 7.

Box 7. Increasing transparency around public data use in Nigeria

BudgIT launched in 2011 to make the federal budget more transparent to Nigerians by using simplified explanations and visual representations of data. BudgIT’s campaign reached 2.5 million people and engaged 25,000 people in the budget review process in 2017, leading to exposure of fraudulent projects and a cap on pay for civil servants. SDSN TReNDS authors note that this illustrates the ways in which “data openness, accessibility, and literacy can build trust in public institutions and improve efficiency in public spending.”[clvii]

Data literacy should take a community-centered approach. Communities need to collectively care about the promise and the peril of data. Data literacy should enable communities to hold governments accountable and empower them to address problems in their own ways.

 

4.3 Data use for public benefit

While political challenges to data use are large and complex, in many cases, human and relational barriers are the biggest obstacles to effective data use. The factors that enable us to address these barriers are inevitably linked. A culture in which data is protected appropriately and shared openly requires trust, incentives, relationships, and new partnerships among stakeholders. Likewise, increasing people’s trust in governments and organizations’ responsible use of data requires accountability and transparency, enabled by building the public’s skills in data literacy and creating participatory mechanisms through which people can make decisions about how their data is used. None of these factors can be improved without the others, and all are key means of addressing power imbalances.

Data use is deeply embedded in our lives: We use data every day to make decisions about travel, work, shopping, education, and much more. At a larger scale, data gives decision makers immense power to take informed action for public benefit. To materialize these benefits, leaders and data stewards in the public and private sectors need to go beyond the mechanics of data access and sharing to create trust, build relationships and partnerships, invest in data skills, and create incentives to use data for good.