Web Archives for Social Sciences Datathon

We had a blast!

We organised a Web Archives for Social Sciences Datathon on 27–28 November 2025 at the University of Bristol. The aim was to increase capacity among social scientists to use large-scale web archived data for policy-relevant, socio-economic research.

We partnered with our friends from the Common Crawl and the UK Web Archive at The British Library, and we used the amazing BDFI Neutral Lab.

We were so lucky to have 25 brilliant participants joining us from as far away as Chile and Belgium. We set up five problems – from identifying Fintech and CreaTech firms to tracking policy changes at national and local levels – and we built relevant datasets using data from the Common Crawl, based on the work we are doing for the Atlas of Economic Activities project. Over the two days, the participants worked in teams to tackle these problems with the support of our expert facilitators: Emmanouil Tranos, Leonardo Castro Gonzalez, Laurie Burchell and Thom Vaughan. Special thanks to our guest star facilitator, Jon Reades.

Don’t get me wrong, though: the problems were only the excuse. The real benefit was the journey to Ithaca: the collective capacity we built as a team. While I wouldn’t be surprised if these 25 web data warriors could save the world in just 36 hours, the real impact will come later, when they apply these data and methods to their own research questions. And that is why we are so excited.

All Datathon materials, including the actual problems, the data, the group presentations and the code the teams developed, are openly available at: data.commoncrawl.org.