Getting oldWeather data ship-shape for science

Author of Post: Larry Spencer

Date of Post: Tuesday, December 04, 2018


     The oldWeather science team is using ship logbook observations to help build a three-dimensional global reconstruction of the Earth’s weather, in order to understand the Earth’s past climate and better predict its future. This work uses the ship observations in exciting and fascinating ways, but it requires them to be processed into a precise format and to be as accurate as possible. We know that the logbook transcriptions made by our volunteers are very accurate, but we still face plenty of challenges in using the data. Some entries were incorrect when originally recorded in the logs, and some information we need is not stated explicitly in the logbook. For example, we have to calculate exact positions and standardized dates from the local dates and location names provided in the logs. We need to get the observations ready to sail!

     You might be wondering how all of that transcribed historical weather data is processed, and how it ends up being used by the scientists, once the volunteers have completed their work. As one member of the science team, I work on this extensively, and I can tell you that it is a very enjoyable process. For each oldWeather ship, I analyze and quality-control the transcribed latitude and longitude positions and the weather observations, checking for and removing errors, in order to produce the highest-quality datasets possible. To illustrate just how important this process is, two maps are displayed below. The first is a “Before Map” that shows the Thetis’s positions and tracks (represented by the gold dots and lines) as automatically generated by Philip’s software from the place names transcribed from the logbooks. The second is an “After Map” that shows the ship’s positions and tracks after I analyzed and quality-controlled the ship’s geographical position data. The difference between the two maps is significant: the “After Map” shows a much more “polished-up” and realistic version of the ship’s voyage (for instance, the ship no longer travels over any land masses).


Figure #1. – “Before Map” of the oldWeather Thetis ship’s voyage positions/tracks throughout the time period of May 01, 1884 – December 31, 1908 before quality-control on the geographical positions data from the ship.


Figure #2. – “After Map” of the oldWeather Thetis ship’s voyage positions/tracks throughout the time period of May 01, 1884 – December 31, 1908 after quality-control on the geographical positions data from the ship.


     The transcribed data produced by oldWeather are all stored in one big database. From this database, Philip’s software makes two separate data files for each ship: one for the latitude and longitude positions and one for the weather observations. I then apply a careful analysis and quality-control process to the data in both files. To make error detection a bit easier, I import the weather data into a spreadsheet and sort it twice, once in ascending and once in descending order, so that the most extreme values appear at the top. This is a quick and easy way to detect and correct the most obvious errors. I do this for three meteorological variables: surface pressure, 2-meter air temperature, and sea surface temperature. In addition, I manually edit the data wherever errors exist throughout both the geographical positions file and the weather observations file. I use a standard set of criteria as a guide for locating errors and for deciding how to handle the values where I detect problems. For example, for surface pressure, if any value in the original file is lower than 28.00 inches of mercury or higher than 32.00 inches of mercury, I remove that value and replace it with “NA” (“Not Available”), because it is classified as a “meteorologically-unrealistic” value. This criterion applies whether the out-of-range value appears only in the data file or was recorded that way in the original ship logbook.
Another example is a value that was transcribed correctly by the volunteers but was obviously recorded incorrectly in the original ship logbook. Depending on the exact circumstances, I either replace the value in the file with “NA” or change it to the obviously-correct value. In those cases, I carefully compare the “obviously-incorrect” value with the values recorded directly above and below it in the original logbook and then edit it accordingly. Once this process is complete, I convert the ship’s quality-controlled weather and position data into a standard format called the International Maritime Meteorological Archive (IMMA) format. Each oldWeather ship ends up with one final file that contains the combined weather and position data in IMMA format. The overarching goal of the entire process is to produce standardized data, as accurate as possible, that can be easily used by professional scientists in major research projects.
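
     The surface pressure criterion and the sort-both-ways check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software actually used in the project: the function names `qc_pressure` and `extremes` and the sample readings are hypothetical, and only the 28.00 and 32.00 inch limits and the “NA” marker come from the description above.

```python
import math

# Limits quoted in the post, in inches of mercury.
PRESSURE_MIN_INHG = 28.00
PRESSURE_MAX_INHG = 32.00


def qc_pressure(values, low=PRESSURE_MIN_INHG, high=PRESSURE_MAX_INHG):
    """Return a new list in which out-of-range or unreadable
    pressure values are replaced by the string "NA"."""
    cleaned = []
    for raw in values:
        try:
            pressure = float(raw)
        except (TypeError, ValueError):
            # Not a number at all (for example an existing "NA").
            cleaned.append("NA")
            continue
        if math.isnan(pressure) or pressure < low or pressure > high:
            cleaned.append("NA")
        else:
            cleaned.append(pressure)
    return cleaned


def extremes(values, n=3):
    """Imitate sorting a spreadsheet column both ways: return the
    n lowest and the n highest numeric values for a quick visual check."""
    numeric = sorted(v for v in values if isinstance(v, float))
    return numeric[:n], numeric[-n:]


# Hypothetical sample readings, for illustration only.
readings = ["29.92", "27.62", "32.50", "NA", "30.10"]
cleaned = qc_pressure(readings)
print(cleaned)                 # [29.92, 'NA', 'NA', 'NA', 30.1]
print(extremes(cleaned, n=1))  # ([29.92], [30.1])
```

     The limits are ordinary parameters, so a different threshold can be tried without changing the function. A range check like this only catches the most obvious problems; the comparison with neighbouring logbook entries described above still has to be done by hand.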

     This IMMA-formatted data is being used in several ways, notably for assimilation into the Twentieth Century Reanalysis Project (20CR) and for distribution through international databases such as the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) and the International Surface Pressure Databank (ISPD). For example, the IMMA-formatted data for the OldWeather3 ships will be archived as an ICOADS auxiliary dataset in the Research Data Archive (RDA) at the National Center for Atmospheric Research (NCAR) in Boulder, Colorado. The overarching purpose of this work is to develop weather and climate models and to better understand and predict extreme, high-impact weather and climate phenomena on a global scale. We would be much less able to accomplish this without our oldWeather volunteers! So, I want to extend a huge “thank-you” to all of our volunteers for contributing so much time, effort, and dedication to this project, because it all starts with the very critical step of getting the observations and positions contained in the original ship logbooks accurately transcribed!

6 responses to “Getting oldWeather data ship-shape for science”

  1. helenj55 says :

    Thank you Larry, this is fascinating. It’s so good to know how all the information we extracted is being cleaned up and used by the scientists. I’m glad you find it interesting too!

  2. AvastMH says :

    And thank you from me too, Larry. I absolutely second helenj’s words. It’s a fascinating blog. It certainly encourages me to do more 😀

  3. Alexandra Pierce-Smith says :

    Interested in volunteering but having problems getting access to “volunteer page” 😦

    This project fascinates me on many fronts …. Both as a Math graduate, granddaughter and daughter of MN Captains, and retired civil servant. Let me know what I can do to assist!

  4. Michael Purves says :

    I’m surprised that you cut pressures off at 28.00 inches. The following are the lowest station pressures recorded at St. John’s Airport in Newfoundland.

    1977 01 20 13 92.56 kPa
    1977 01 20 14 92.66 kPa
    1977 01 20 12 92.90 kPa
    1971 01 17 02 93.04 kPa
    1971 01 17 01 93.08 kPa
    1971 01 17 03 93.12 kPa
    1977 01 20 15 93.17 kPa
    1971 01 17 00 93.23 kPa
    1971 01 17 04 93.27 kPa

    8403506 St John’s A NFLD 47’37N 52’45W 141m

    Element Description From To Count Misg % Tot

    77 Station Pressure 195301 201112 21522 27 100

    That station pressure of 92.56 is 27.33 inches.

    Bear, for 1912-11-16 and 17, had a series of pressures below 28 inches, the lowest being 27.62.

    These pressures are quite rare, but they do occur. The lowest sea-level pressure on record is 870 mb, in Typhoon Tip.

  5. Michael Purves says :

    Hi Larry. I suggest that you set a lower limit of 27.00 inches, not 28.00 inches, for your pressure checks. Bear had several pressures below 28.00 on November 16, 1912. You might also consider checking that hourly or two-hourly pressure changes are less than 0.50 inches.

    1912-11-16 11 27.9
    1912-11-16 12 27.8
    1912-11-16 13 27.62
    1912-11-16 14 27.62
    1912-11-16 15 27.64
    1912-11-16 16 27.65
    1912-11-16 17 27.82
    1912-11-16 18 27.83
    1912-11-16 19 27.86
    1912-11-16 20 27.86
    1912-11-16 21 27.89
    1912-11-16 22 27.9
    1912-11-16 23 27.9
    1912-11-16 24 27.9
    1912-11-17 1 27.9
    1912-11-17 2 27.92
    1912-11-17 3 27.93
    1912-11-17 4 27.94
    1912-11-17 5 27.94
    1912-11-17 6 27.97
    1912-11-17 7 27.99
    1912-11-17 8 27.99

    Yours truly,

    Michael Purves
