How open are official statistics?

This is a guest post from the World Bank's Data Blog by Shaida Badiee and Eric Swanson. They are co-founders of Open Data Watch, an NGO that works with a particular focus on the intersection of open data and official statistics.  The Open Data Inventory (ODIN) is a commitment of Open Data Watch to the Global Partnership. 

Although "open data" has been a popular rallying cry, and many countries, states, and even cities have announced open data initiatives, open access to the important data produced by national statistical agencies remains, at best, limited.

To get a baseline measurement, Open Data Watch conducted in-depth assessments of the statistics commonly produced by national statistical systems in 125 mostly low- and middle-income countries. The results of this assessment, called the Open Data Inventory (ODIN), are now available online. Global results are shown in Figure 1. In 2015, ODIN found only 10 national statistical offices (NSOs) that satisfied more than 50 percent of the criteria for data coverage and openness. Mexico, at 68 percent, was the highest-scoring country, followed by Mongolia, Moldova, and Rwanda. Uzbekistan, at 3 percent, was the lowest.

An interactive table of all country scores is available here:

Data coverage and openness

Before there can be open data, there must be data. ODIN assessments proceed in two phases: coverage and openness. In the coverage phase, the availability of key indicators by age, sex, and other functional disaggregations in 20 data categories is evaluated for frequency (data available in the previous 5- and 10-year periods) and for geographic disaggregation (first and second administrative levels). A maximum of 100 points can be awarded for data coverage.

Given the data available in each category, openness is scored using the criteria of the Open Definition. The openness assessment has five elements:

  1. Data are available in a machine-readable format (not PDF files or images)
  2. Data are available in non-proprietary formats such as CSV, text, or XML
  3. Users have options for selecting data and for bulk download
  4. Metadata are available
  5. Terms of use are provided that are no more restrictive than a Creative Commons Attribution (CC-BY) license

Each element can be scored as 0 (not satisfied), ½ (partially satisfied), or 1 (fully satisfied). The maximum openness score is 100.
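The element scoring described above can be sketched in a few lines of Python. This is only an illustration, not ODIN's published method: the text does not say how element scores are combined into the 100-point total, so the sketch assumes that every element and every data category carries equal weight. The function name, the element keys, and the sample categories are all hypothetical.

```python
from fractions import Fraction

# The five openness elements listed above (key names are hypothetical).
ELEMENTS = (
    "machine_readable",
    "non_proprietary",
    "download_options",
    "metadata",
    "terms_of_use",
)

# Each element is scored 0 (not satisfied), 1/2 (partially), or 1 (fully).
ALLOWED = {Fraction(0), Fraction(1, 2), Fraction(1)}


def openness_score(category_scores):
    """Scale element scores to a 0-100 openness score.

    ASSUMPTION: all elements and all data categories are weighted
    equally. The source does not state the actual weighting.
    """
    total = Fraction(0)
    count = 0
    for category, scores in category_scores.items():
        for element in ELEMENTS:
            value = Fraction(scores[element])
            if value not in ALLOWED:
                raise ValueError(
                    f"{category}/{element}: score must be 0, 0.5, or 1"
                )
            total += value
            count += 1
    if count == 0:
        return 0.0
    return float(100 * total / count)


# Made-up example: one fully open category and one that offers
# only partially machine-readable data.
example = {
    "population": {name: 1 for name in ELEMENTS},
    "poverty": {**{name: 0 for name in ELEMENTS}, "machine_readable": 0.5},
}
print(openness_score(example))
```

Under the equal-weight assumption, a country reaches the maximum of 100 only when every element is fully satisfied in every category.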

Figure 2 shows the distribution of country scores by deciles. Openness scores are skewed toward the lower end of the distribution. Mexico, the highest-scoring country, is an exception. Measured on openness alone, Mexico scored 74 out of a possible 100 points, but it ranked only sixth in data coverage with a score of 61. Cuba ranked first in coverage, scoring 67, but ranked 21st on openness with a score of 36. Cuba's pattern of stronger coverage than openness was repeated by most of the 125 countries. The full set of ODIN scores for countries included in the 2015 assessment is available for download.

Elements of openness

ODIN scores can be further disaggregated. A decomposition of the openness scores is shown in Figure 3. The provision of metadata had the highest average score across all data categories, but scores on the other elements drop off rapidly. Many countries publish data online only in PDF files or as images from print publications. Users have little choice of what data they will download, and machine-readable data are often available only in Excel files. Most surprising is the very low score for terms of use. Almost every country, including Mexico, could make a significant improvement in its openness score simply by adopting and displaying CC-BY or equivalent terms of use. Publishing data files in CSV format would cost just as little and would further raise the scores of many countries. The World Bank's Open Data Readiness Assessments might want to stress the immediate payoff of these simple steps.
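To illustrate how little effort CSV publication takes, here is a minimal sketch using only Python's standard csv module. The column names and figures are invented for illustration and do not come from any statistical office, and the helper name rows_to_csv is hypothetical.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample table; a real release would carry the NSO's own data.
header = ["region", "year", "indicator", "value"]
rows = [
    ("North", 2015, "population", 1200),
    ("South, Coastal", 2015, "population", 950),
]

csv_text = rows_to_csv(header, rows)
print(csv_text)
```

The writer quotes the field that contains a comma, so the file stays machine readable without any manual cleanup. The same text could be saved to a file and posted next to the existing PDF or Excel release.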

Scores by Data Category

ODIN scores can also be disaggregated by statistical categories. Average scores for 125 countries are shown in Figure 4. ODIN includes eight categories of social statistics, shown in blue; seven categories of economic statistics, shown in yellow; and five categories of environmental statistics, shown in green. Economic statistics predominate, with international trade statistics achieving the highest score, reflecting the decades-long effort to develop internationally comparable measures of trade. Although population and vital statistics hold second place, most social and environmental statistics fall in the bottom half. Of particular note for the World Bank are the low scores achieved for gender and poverty statistics. These may be attributable to their infrequent publication and the lack of subnational data. Countries also tend to publish reports in PDF format without making the underlying data available in machine-readable formats.



The World Bank launched its open data initiative in 2010 as a centerpiece of the reform of its access to information policies. The World Development Indicators database was in the first wave of open datasets and remains among the most sought after by visitors to the World Bank's website. The Bank's leadership has been influential. Other international agencies have followed its example by opening their databases and by encouraging member countries to adopt open data policies. The Open Data Readiness Assessments now provide a useful tool to guide the planning process for open data. But in most countries there remain large gaps between planning, policy, and implementation of open data. If NSOs are going to fulfill their responsibilities for providing the indicators to inform the public and monitor the Sustainable Development Goals, they, and their development partners, must move quickly to increase their capacity to produce and disseminate open data.
