What is effective multiparty data sharing?
Multiparty data sharing in the development sector refers to situations in which two or more organizations collaborate to collect, share, and/or analyze data to address societal challenges.
Multiparty data sharing often happens among stakeholders from different sectors, including governmental agencies, private companies, and nongovernmental organizations, who come together to establish collaborative partnerships. This document focuses on data sharing initiatives that involve partners from across sectors, excluding business-to-business (B2B) and government-to-government (G2G) data sharing.
There is no formal or broadly accepted consensus on what makes a data sharing partnership effective in general, much less in the development sector. In this cookbook, we draw on input from development practitioners to characterize effective data sharing as a reduction of friction in data sharing over the long term that addresses a societal challenge or seeks to improve public wellbeing without producing negative externalities, and which is accountable and transparent toward key stakeholders and the broader public.
Why do we need a cookbook on multiparty data sharing?
Multiparty data sharing initiatives in the development sector have proliferated in the past decade as efforts to better leverage data for achieving the Sustainable Development Goals have intensified.
However, few spaces exist to exchange knowledge on best practices or for partners engaged in (more or less effective) initiatives to consolidate learning and draw out common insights. This deficit persists in spite of demands from members of the data for development community for recommendations and actionable insights on what enables effective data sharing.
The objective of this cookbook is to provide development practitioners and organizations with actionable, evidence-based insights, recommendations, and examples to create successful data sharing initiatives. This cookbook aspires to be a user-friendly, pragmatic, concrete, and concise tool for our community.
Understanding basic cooking
The data sharing food pyramid
The content in this cookbook is organized around five groups of key ingredients for effective data sharing initiatives:
1. Mechanisms for building and sustaining trust (carbohydrates): Any healthy diet starts with a foundation of trust among data sharing partners, data users, and other stakeholders.
2. Shared value and benefits (fruits and vegetables): When all stakeholders share in the value and benefits of data sharing, they get the necessary vitamins and minerals (like you’d find in fruits and vegetables) to reduce the risks of initiative-threatening “diseases.”
3. Dependable data and infrastructure (proteins): Just like healthy proteins, data and infrastructure occupy the center of the data sharing plate and must always be accompanied by shared value and benefits.
4. Mechanisms for supporting knowledge and strengthening skills (dairy): Dairy helps develop and strengthen human bones. Similarly, knowledge and skills support data sharing functioning and operations.
5. Flexibility and adaptability (healthy fats): No diet is balanced without a healthy dose of fats or—in this case—flexibility and adaptability, which facilitate storing energy for when it is most needed.
These five key food groups are found across all successful data sharing initiatives. They form the basis of a varied and balanced data sharing diet. The quantity of (or emphasis on) each of these varies according to individual data sharing initiatives, as do the recipes needed to create them.
Each of the five sections of this cookbook contains “recipes,” i.e., tested methods and tools that can be replicated by others.
Data sharing in the development sector is a relatively new domain, and there are some clear gaps in knowledge and experience. For this reason, the cookbook points out where recipes or information on ingredients are missing. Highlighting these gaps helps aspiring chefs to understand what still needs to be done and paves the way for more thinking and experimentation.
Factors influencing choice and quantity of ingredients
There is no one-size-fits-all approach to effective multiparty data sharing. Instead, this cookbook offers suggestions for key ingredients and recipes to inspire creative approaches.
The importance (or quantity) of each food group, as well as the recipes to use, depends on two broad categories of factors related to:
- The characteristics of the specific data sharing initiative, such as stakeholders or sectors involved, types of data (i.e., personal, nonpersonal) being used, the stage the initiative is in, the objectives of the partnership, the number of stakeholders involved, and the openness of the data or initiative.
- The context in which the data sharing initiative operates, including, for instance, the regulatory and policy environment, other stakeholders within a particular data ecosystem, and other contextual factors such as whether data sharing happens during an emergency situation (e.g., a pandemic or natural disaster) or during “business as usual.”
Mechanisms for building and sustaining trustworthiness (carbohydrates)
Trust is the most important ingredient for effective multiparty data sharing and, precisely like carbohydrates, it provides initiatives with the energy they need to function and achieve their objectives. Recent research highlights the robust quantitative evidence that greater trust is associated with increased data sharing and that the impact of trust is particularly significant where initial levels of trust are low. However, the same research also reveals that, to reach optimal levels of data sharing, increasing trust needs to be coupled with other strategies or actions, which is why trust must be combined with other ingredients.
Simply saying that trust is important does not help organizations or individuals working on data sharing initiatives to clearly lay out how or what they need to do. For this reason, and in order to foster more effective data sharing, this section focuses on understanding effective mechanisms for building and sustaining trustworthiness in data sharing initiatives.
What are we making?
For data sharing initiatives to succeed, data partners need to build and maintain the trustworthiness of the initiative and embed core values and ethical considerations in its functioning.
The trustworthiness of a data sharing initiative is the result of three key elements:
- Trust among partners,
- Trust from the general community, and
- Accountability measures.
It follows that recipes for building and maintaining trust and for embedding values and ethical guidelines in data sharing initiatives can be grouped into three categories, which are described in the next section.
Shared value and benefits (fruits and vegetables)
Fruits and vegetables contain essential nutrients for people to thrive. Likewise, a sense that all partners are benefiting proportionately is essential to creating effective partnerships and reducing the risk of threats to the initiative. While humans can live for some time without fruits and vegetables, their dietary absence in the long run results in significant health problems. Unless organizations see the value in partnering for data sharing, they won’t stay at the table for long. Data sharing fails when the perceived value and benefits are unequally distributed for extended periods of time. When benefits consistently go to a few instead of many, incentives to participate decline.
Benefits encompass more than mere economic value. Data sharing partners can profit from an initiative in terms of cost reduction, reputation, knowledge, skills, or even access to data generated by others. Therefore, the emphasis should be not only on the monetary value of data sharing but on understanding the broader scope of added value of the initiative for data partners. Tools such as the Data Ecosystem Mapping Tool and Guide developed by the Open Data Institute in the context of the Microsoft Open Data Campaign can help analyze potential value that can be exchanged between actors within a data ecosystem.
Sharing value in a fair way is already complicated in the context of bilateral data sharing between the public and private sector, as the literature suggests, because reconciling diverging interests and expectations can be hard, irrespective of the number of stakeholders involved. In the context of multiparty data sharing, it can become even more complex to balance competing interests.
What are we making?
Success in data sharing depends on establishing and maintaining shared value for all partners. It means that all partners benefit from data sharing to some extent and no partner benefits disproportionately more than others.
That said, value distribution is not set in stone and can change over time, either because the needs and expectations of data partners change or because the data sharing initiative evolves in terms of focus, activities, and level of input needed from the different partners. Just as adjusting the value distribution might be required, it is also important for initiatives to be clear about their initial approach and be able to monitor whether the promised benefits materialize for the various partners and to what extent. Doing so allows them to correct imbalances and change value propositions as needs arise.
Like other aspects of data sharing, there is no one-size-fits-all approach to creating and distributing value. Recipes to ensure that benefits and value are distributed as fairly as possible can take many forms, often customized to the needs of specific partners or to differing contexts and ecosystems. Three, in particular, emerged from the landscape analysis in preparation for this cookbook: lowering costs, innovation and the provision of data products, and the delivery of tailored services.
Understanding value in data sharing*
Value in economic terms is generally used to refer to “added value,” meaning the difference between inputs and outputs for a certain product. In the context of multiparty data sharing, this often translates into a focus on financial benefits that are relatively easy to quantify. At the macroeconomic level, the Organisation for Economic Co-operation and Development (OECD), for instance, suggests that improved public and private “data access and sharing can help generate social and economic benefits worth between 1 and 2.5 percent of Gross Domestic Product - GDP (in few studies up to 4 percent of GDP)”. At the microeconomic level, a well-known study by Deloitte on the impacts of Transport for London’s (TfL) practice of openly sharing non-personal data showed that companies using TfL data generated a gross value added between GBP 12 million and GBP 15 million per year, including directly supporting around 500 jobs.
Many benefits from sharing data are not so easily measured. For the public sector, diverse and numerous social benefits may be attained through data sharing partnerships. The OECD highlights in particular positive impacts on transparency and accountability and increased user empowerment as associated with greater data sharing. For companies, reputational or knowledge benefits, as discussed above, may also serve as motivation to engage in data partnerships. The OECD also points to the opportunity for private sector data providers to crowdsource new insights and exploit user-driven innovation linked to the emergence of a community that creates additional value that an organization on its own would not be able to create.
*More research, experimentation, and/or knowledge exchange are needed.
Dependable data and infrastructure (proteins)
High-quality data and infrastructure (defined as the software and hardware underpinning data exchange) lie at the heart of successful data sharing. They can be compared to proteins, as they are generally put at the center of the plate and seen as the main course of a meal.
What is less obvious is that it is not the quantity of data shared nor the technological features of the infrastructure that ultimately lead to the success of data sharing initiatives, but rather their dependability.
- Trustworthy data and infrastructure are safe, unbiased, and regularly monitored.
- Quality and appropriateness are linked to the security and interoperability of datasets.
These two concepts reinforce each other and ensure that data partnerships result in dependable products that organizations can use to develop new applications, services, policies, and operations.
What are we making?
Data sharing initiatives may take different approaches to ensure their data and infrastructure are dependable, but what they generally have in common is their emphasis on:
- Ensuring technical safeguards are in place.
- Embedding quality assurance mechanisms in data collection and data analysis.
- Adopting appropriate interoperability and data format approaches.
A stronger focus on one or another of these elements depends mainly on the sensitivity of the data at hand, the characteristics of the data themselves (e.g., in terms of variety of sources or granularity), the targeted use cases and objectives of the initiatives, and the resources available. Levels of trust between partners also influence the attention paid to technical safeguards and quality assurance in particular.
For example, data aggregators such as the Humanitarian Data Exchange, which is an open-source and open-access web platform enabling humanitarian organizations to share data, spend more time and resources on ensuring interoperability of datasets, whereas initiatives such as Global Fishing Watch, which seek to inform government response, focus their efforts on ensuring data quality. This is because the former collects a wide variety of data from different sources whose interoperability is key to making the data usable, while the latter needs very precise data to create accurate maps of illegal fishing activities.
Mechanisms for supporting knowledge and strengthening skills (dairy)
A common characteristic of successful data sharing initiatives is their emphasis on knowledge, skills, and capacity building. Sharing knowledge and skills strengthens data sharing agreements, just as the calcium in dairy products helps build and maintain strong bones.
It is widely recognized within the development community that it is not sufficient to make data available and expect organizations to access or use it. Instead, there is a consensus on the need to build data use skills among communities of users and on the importance of increasing data literacy to achieve sustained data use.
Often, however, users are not alone in lacking data skills. The landscape analysis, for instance, pointed out that partners of successful data sharing initiatives are concerned about their own organization’s lack of data skills and the difficulty of finding and hiring the right people. Furthermore, the Global Partnership’s work has highlighted that gaps in skills such as data governance are among the major obstacles to establishing public-private partnerships.
Finally, it emerged from this work that limited attention is paid to building capacity at the community level. The landscape study indicates that almost none of the data sharing partnerships reviewed engage directly with the communities from which the data originate. Some models with limited engagement mechanisms included participation from civil society organizations and national representatives. However, these did not aim explicitly at building the data capacity of these communities.
What are we making?
Successful data sharing requires building the skills of the user community alongside the skills of key stakeholders and partners.
Identifying the set of skills needed for a data sharing partnership to achieve its objectives and staffing it with individuals who possess these skills is crucial to achieving success.
Furthermore, data sharing partnerships need to know their users very well and to understand their barriers to and requirements for accessing the data. This requires partnerships to focus on the capacity of external stakeholders who are expected to use the data.
Finally, making data sharing fairer and more sustainable requires strategies to increase the data confidence of the communities whose data are being used. However, examples of how this can be done in practice are limited, in part due to the high costs of large data literacy programs that are perceived as falling outside the scope of specific data sharing initiatives.
To increase the capacity of external and internal stakeholders, approaches can range from producing formal training tools to establishing more informal mechanisms for transferring knowledge. A few successful recipes for building institutional capacity are provided below.
Flexibility and adaptability (healthy fats)
Data ecosystems in which the data sharing initiatives operate are never static, but rather continuously evolve, particularly from technological, stakeholder, and cultural perspectives. It follows that governance, architectural, and operational choices cannot be set in stone, and the possibility of adapting or changing must be envisaged at the outset.
Financial and funding approaches also cannot be immutable. Most data sharing initiatives start with specific project funding but need to transition to something else at a later stage when the project’s funding ends. Changing the funding model can mean diversifying sources of cash or adopting subscription or fee-based models. Whichever approach is chosen, this transition from one funding approach to another remains a delicate moment that can break even very successful (from a user perspective) initiatives.
Healthy doses of flexibility and adaptability are therefore as important in data sharing as healthy fats are for human diets. Reasonable doses of flexibility and adaptability are critical for data sharing to succeed in the long term and for partners to remain involved.
What are we making?
Like trust, flexibility and adaptability are generic terms that do not help organizations to figure out what exactly they should be doing to ensure success in data sharing. More concretely, what data sharing initiatives should be paying attention to is how to embed flexibility in their governance, operations, and data architecture and how to identify suitable sustainable financial models that allow them to evolve over time.
Unfortunately, the development sector does not provide many examples of flexibility embedded in operations, governance, architecture, and financing. On the contrary, this area remains under-explored and represents a topic for which successful recipes are missing.
This is not to say that no initiative has successfully included flexibility and adaptability, but rather that there is limited knowledge and research on how it can be done. Even successful initiatives have not explicitly theorized how to implement an approach that evolves over time. The sections below highlight a few findings and examples from the landscape analysis, but gaps remain concerning concrete steps for data sharing partners in this area.
The burgeoning number of multiparty data sharing initiatives within the development sector testifies to the importance of data collaboration for the achievement of the Sustainable Development Goals. While increased data sharing is good news, new data platforms on their own won’t ensure we meet the 2030 Agenda. Data sharing initiatives like the ones described in this cookbook will not support the objectives of the development community unless they are effective. Effective data sharing within this sector requires more knowledge sharing to identify best practices that can be replicated across the world.
The recipes in this cookbook aim to do just that. This cookbook lays the groundwork for discussions about the key ingredients for effective data sharing and provides a number of recipes for inspiration. This is only the beginning. Much remains to be done to close gaps in the existing knowledge, due in part to the limits of this work and the relative novelty of many data sharing initiatives, especially in the development sector.
This cookbook draws on the landscape analysis commissioned by the Global Partnership and carried out by Athena Infonomics between March and September 2022. The final report of the analysis, its executive summary, and summaries of the case studies are available to interested readers.
The Global Partnership is grateful to the Athena Infonomics and Atlas AI team (Shruti Viswanathan, Deepa Karthykeyan, Vivek Sakhrani, and Saiyed Kamil) for their relevant and insightful work.
The table below contains the list of members composing the Reference Group for effective data sharing, in alphabetical order.
Position and organization
Data Privacy Specialist / UN Global Pulse
Project Assistant Professor & Co-lead of the Spatial Data Commons / Center for Spatial Information Science, University of Tokyo
Senior Product Strategist, Land & Carbon Lab / World Resources Institute
Calderon Machicado, Claudia: Partnerships Lead, Development Data Partnership / World Bank
Lead for Collaborative Data Ecosystems / Capgemini Invent
Digital Climate Smart Agriculture Manager / Mercy Corps AgriFin
Assistant General Counsel, Open Innovation Team / Microsoft
Senior Manager, AI for Social Good & Crisis Response / Google.org
Co-founder & Chair, AI Initiative / The Future Society
Former Project Lead, Data for Common Purpose Initiative / World Economic Forum
Assistant Director & Chief of Data Innovation and Capacity Branch / United Nations Statistics Division
Research Scientist Data Science / African Population and Health Research Center
Program Manager and Founder, Development Data Partnership / World Bank
Program Officer, Gender Equity and Governance Program / William and Flora Hewlett Foundation
Director / DataReady
Chief Executive Officer / Development Gateway: An IREX Venture
Executive Director / Open Data Latin American Initiative
Former Fellow, AI Ethics and Digital Governance / UN Global Pulse, now UNICEF
Affiliate Fellow / Stanford Institute for Human-Centered Artificial Intelligence
Statistician / United Nations Office for the Coordination of Humanitarian Affairs
Director of Fair Tech Institute / AccessPartnership
Head of the National Digital Platform / Executive Secretariat of the National Anticorruption System, Mexican Government
Director of Mobile Data Partnerships / Flowminder