From Texts to Tweets to Satellites: The Power of Big Data to Fill Gender Data Gaps

By Rebecca Furst-Nichols, Deputy Director, Data2X

Twitter posts, credit card purchases, phone calls, and satellites are all part of our day-to-day digital landscape.

Detailed data, known broadly as “big data” because of the massive amounts of passively collected and high-frequency information that such interactions generate, are produced every time we use one of these technologies. These digital traces have great potential and have already developed a track record for application in global development and humanitarian response.

Data2X has focused particularly on what big data can tell us about the lives of women and girls in resource-poor settings. Our research, released today in a new report, Big Data and the Well-Being of Women and Girls, demonstrates how four big data sources can be harnessed to fill gender data gaps and inform policy aimed at mitigating global gender inequality. Big data can complement traditional surveys and other data sources, offering a glimpse into dimensions of girls’ and women’s lives that have otherwise been overlooked, and providing a level of precision and timeliness that policymakers need to make actionable decisions.

Here are three findings from our report that underscore the power and potential offered by big data to fill gender data gaps:

1. Social media data can improve understanding of the mental health of girls and women.

Mental health conditions, from anxiety to depression, are thought to be significant contributors to the global burden of disease, particularly for young women, though precise data on mental health is sparse in most countries. However, research by the Georgia Institute of Technology, commissioned by Data2X, finds that social media provides an accurate barometer of mental health status.

Algorithms can not only detect genuine self-disclosures of mental illness on Twitter but also disaggregate these tweets by sex and gauge characteristics like tone and affect to track positive or negative expressions. Across the world, these tools can serve as an early first step in assessing the prevalence of mental health conditions. And for women and girls themselves, they may be used to direct information on treatment and resources to groups with high prevalence levels.

These methodologies still have limitations, including bias toward literate (and tech-literate) women and girls, dominant-language Twitter users, and those with access to the internet. However, as more women, and particularly young women, come online, these methodologies are likely to be increasingly valuable, especially given the severity of these issues and the challenges associated with collecting mental health information through other means.

2. Cell phone and credit card records can illustrate women’s economic and social patterns – and track impacts of shocks in the economy.

Our spending priorities and social habits often indicate economic status, and these activities can also expose economic disparities between women and men.

By compiling cell phone and credit card records, our research partners at MIT traced patterns of women’s expenditures, spending priorities, and physical mobility. The research found that women have less mobility diversity than men, live farther away from city centers, and report less total expenditure per capita.

Because these data are continuously generated, this type of analysis can be performed over longer time spans to capture the impacts of economic and environmental shocks, stressors, and policy changes on women’s lives in real time.

It is critical to note that, despite the promise of this approach, data access and privacy remain key challenges for institutionalizing these real-time surveillance systems in country statistical offices. And, as with social media information, any analysis performed on cell phone and credit card data must be complemented with other ‘ground truthing’ surveys to ensure that researchers know which women are included in, and which are left out of, the dataset for reasons of access, affordability, literacy, and other barriers.

The 61st Commission on the Status of Women, taking place this week, highlights women’s economic empowerment and their roles in both paid and unpaid work. Big data holds great promise for measuring empowerment and shaping our understanding of women’s economic needs and priorities.

3. Satellite imagery can map rivers and roads, but it can also measure gender inequality.

Satellite imagery has the power to capture high-resolution, real-time data on everything from natural landscape features, like vegetation and river flows, to human infrastructure, like roads and schools. Research by our partners at the Flowminder Foundation finds that it is also able to measure gender inequality.

Satellite imagery can fill gaps in traditional surveys by providing more frequent and higher-resolution information about girls’ and women’s lives. Our research piloted methods of correlating geospatial variables (like distance to roads) with well-being indicators (like literacy) to infer patterns of social and health phenomena.

Mapping these phenomena using this method can reveal pockets of gender inequality that are typically masked by averages at the country or district level. This use of big data for more frequent and higher-resolution information on the well-being of women and girls offers huge potential for helping policymakers direct resources more effectively to where they are needed most.

The release of this report is just the first step. Data2X is excited to explore future possibilities for using digital data sources, and this year we will announce a new opportunity for researchers interested in using big data, along with other sources, to capture multiple dimensions of girls’ and women’s lives, inform policies, and improve outcomes.