We launched the new www.oldWeather.org a month ago, which means that the volunteers using the site have provided quite a bit of new data, and we can start to analyse it. This is one of my favourite moments in any project – first blood, when we get the initial sense of what we’ve got, how it’s going to work, what we can learn from it.
One of the golden rules of statistical analysis is “first plot the data” – always start by making a simple visualisation, so you can be sure you understand what you’ve got, and you’re not missing anything obvious. But the oldWeather data is not easy to plot: the database contains records from hundreds of people making thousands of annotations on dozens of different logbook pages; what, exactly, should we look at?
So I’ve taken inspiration from Listen to Wikipedia, and asked ‘what would it look like if we could see (and hear) the data as it came in – in (accelerated) real time?’ The video below shows every contribution to www.oldWeather.org over a three-hour period on December 3rd 2015. The number of pages shown is the number of volunteers contributing at each point in time. Each box drawn, and each sound played, is one annotation, a contribution to the project. Blue boxes contain weather data, yellow boxes ship positions, orange boxes dates, and red boxes other events; pages that have moved on to the transcription phase have grey boxes.
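For anyone wanting to try a replay like this on their own data, here is a minimal Python sketch of the two ideas described above: compressing real timestamps into video time, and mapping each kind of annotation to a box colour. It is not the code behind the video. The `Annotation` record, the kind names, the `speedup` factor and the five-minute activity window are all assumptions made for illustration, and treating transcription as just another kind is a simplification.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical record layout; the real oldWeather database schema is not shown here.
Annotation = namedtuple("Annotation", "time volunteer kind")

# Box colours as described in the text (the kind names themselves are assumed).
BOX_COLOURS = {
    "weather": "blue",
    "position": "yellow",
    "date": "orange",
    "event": "red",
    "transcription": "grey",
}


def replay_schedule(annotations, speedup=60.0):
    """Return (seconds_into_video, volunteer, colour) per annotation, in time order."""
    ordered = sorted(annotations, key=lambda a: a.time)
    if not ordered:
        return []
    start = ordered[0].time
    return [
        ((a.time - start).total_seconds() / speedup, a.volunteer, BOX_COLOURS[a.kind])
        for a in ordered
    ]


def active_volunteers(annotations, at, window=timedelta(minutes=5)):
    """Count distinct volunteers with an annotation in the window ending at `at`."""
    return len({a.volunteer for a in annotations if at - window < a.time <= at})
```

With `speedup=60.0`, one minute of real time becomes one second of video, so a three-hour period plays back in three minutes. The count returned by `active_volunteers` is one possible way to decide how many pages to draw at a given moment.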
December 3rd was when we launched the new site, so we can see a large change in the number of people participating as they learn about the launch. It’s instantly clear that it’s working – we are collecting annotations and transcriptions in quantity, as we hoped. There is much to be learned from careful examination of visualisations like this, but mostly I think it shows the power of the project – the awesome capability of collective public science.