18 Comments

Have you seen Connolly and Connolly’s papers on screen temps across China, the USA and Ireland? Also, Bill Johnston at Bomwatch.com.au provides a good look at the effect of station site changes on rainfall and max temps.

Not yet!

How do you deal with the "---" missing data entries?

And when the weather gets lousy and keeps me indoors I might have a go at batch scooping the big data with Python or something.

I start with a decent cafetière of Colombian and treat myself to four squares of flapjack before grumbling about the crazy coding they've adopted! If I'm importing the CSV directly into SPSS it sets them as system missing. Importing into Excel is the pain and for this I use the find & replace function, which rips through them like a hot knife through butter. If you want a copy of my efforts for all 37 stations in xls format I'm happy to pop you a link, though you'd need to update the file when the August data becomes available.
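For anyone who'd rather script the clean-up than run find & replace, here is a minimal Python sketch. It assumes the missing-data marker is a literal run of three hyphens (`---`) and that the file is a plain CSV with a header row. The column names in the sample (`year`, `month`, `tmax`, `tmin`) are made up for illustration and may not match the real station files.

```python
import csv
import io

MISSING = "---"  # assumed missing-data marker; adjust if the files use something else


def parse_value(cell):
    """Return a float, or None when the cell is blank or holds the missing marker."""
    if cell is None:
        return None
    cell = cell.strip()
    if cell == "" or cell == MISSING:
        return None
    return float(cell)


def read_rows(handle):
    """Read a CSV with a header row into a list of dicts of floats/None."""
    reader = csv.DictReader(handle)
    return [{name: parse_value(value) for name, value in row.items()} for row in reader]


# Illustrative sample only; not real station data.
sample = io.StringIO("year,month,tmax,tmin\n1959,1,4.5,---\n1959,2,---,0.2\n")
rows = read_rows(sample)
```

Missing entries come back as `None`, so they can be skipped or counted later rather than silently turning into zeros.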

thanks - I'll do some cross-checking in due course

Could probably write a script to download all the MIDAS data in one go, assuming there's some logic to their naming convention.

Wowie! Scripting is deffo not my thing despite being a Fortran 77 bod back in the day. I've not had a glimpse of their ftp server as yet so can't comment on convention and coding.

Haha - I started with Fortran 77 too. I was looking into pulling the MIDAS data down with VBA or Python, when I went poking around the CEDA site and found the “bulk data” options. Basically FTP/FileZilla or WGET. Both need extra access keys to be added to your CEDA account. Despite using FileZilla extensively in a previous life, I just couldn’t get it to connect, so tried WGET. Once the really ugly access key was factored into the WGET command it just worked. Left it running overnight and ended up with about 2GB of data. However, I know the process has dropped at least one file, so it ain’t faultless. Now the fun starts. Huge amounts of missing data, incompatible manual and automatic readings etc.
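One way to catch dropped files after an overnight run is to compare what landed on disk against a list of what should be there. A rough Python sketch follows. The expected list would have to come from the archive's own directory listing, and the file names below are purely hypothetical placeholders, not the real MIDAS naming convention.

```python
import tempfile
from pathlib import Path


def find_missing(expected_names, download_dir):
    """Return the expected file names that are not present anywhere under download_dir."""
    present = {p.name for p in Path(download_dir).rglob("*") if p.is_file()}
    return sorted(set(expected_names) - present)


# Demo with a throwaway directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "station_a_1959.csv").write_text("dummy")
    (Path(tmp) / "station_a_1960.csv").write_text("dummy")
    expected = ["station_a_1959.csv", "station_a_1960.csv", "station_a_1961.csv"]
    missing = find_missing(expected, tmp)

print(missing)
```

Anything reported as missing can then be fetched again individually instead of rerunning the whole download.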

Fantastic effort! I really do despair sometimes over the quality of meteorological data. I’ve been downloading daily wind speed from Irish stations via ECA&D/KNMI and it’s a good job I’ve eyeballed a load of raw record charts before turning the handle. Sure makes you wonder what quality control the Met Office/Hadley Centre employ. This bit of fun will be published over the next few weeks.

I have been in correspondence with both the MO and CEDA about missing data. CEDA’s stance is that they are a “receiving archive” and have no control over what is sent. They also have a complex QC procedure which I haven’t got to the bottom of, but it doesn’t include noticing that a whole year or month is missing. The MO’s response was basically “dunno - try our archive people”. The rather helpful archivist sent me a PDF of an original Metform3208 for a missing month of data and referred me to similar scans online. Currently trying to audit / reconcile their “monthly averages” page, which is when all these discrepancies emerged, e.g. treblicated data when they move from DLY3208 to AWSDLY, with dates not directly comparable due to “throwback”. Bit of a challenge! I am trying to get my “ingestion” process to handle all these wrinkles. Currently I can pull in an entire county’s data in one easy move, separate partial data from clean data, and summarise it all. If I can automate the treblication, we’re nearly there. This is all in Excel (Power Query) btw. Oh and we have a problem with old data - Excel can’t do dates before 1900. I should write it all up I guess.
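The "separate partial data from clean data" step could look something like this outside Excel. It is only a sketch in Python, not the Power Query process itself, and it assumes monthly records arrive as `(year, month, value)` tuples with `None` for a missing value. It also assumes "clean" means a year with all twelve months present, which is one possible definition rather than the one necessarily used here.

```python
from collections import defaultdict


def split_years(records):
    """Split years into (clean, partial) lists.

    A year counts as clean when all 12 months have a non-missing value;
    any other year that appears in the records counts as partial.
    """
    valid_months = defaultdict(set)
    for year, month, value in records:
        valid_months[year]  # register the year even if every value is missing
        if value is not None:
            valid_months[year].add(month)
    clean = sorted(y for y, months in valid_months.items() if len(months) == 12)
    partial = sorted(y for y, months in valid_months.items() if len(months) < 12)
    return clean, partial


# Made-up values: 1960 is complete, 1961 has July missing.
records = [(1960, m, 10.0) for m in range(1, 13)]
records += [(1961, m, None if m == 7 else 11.0) for m in range(1, 13)]
clean, partial = split_years(records)
```

Because months are held in a set, a month that turns up two or three times only counts once, so duplicated rows don't make a patchy year look complete.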

Crikey Dave, that’s a mammoth job! Responses are pretty much as I had expected. Yep, the 1900 Excel limit is a right pain in the Aga - I resort to using an index that permits exchange with SPSS’ date format. This works fine unless you wish to go back further than 15 October 1582.
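For what it's worth, the index idea can be sketched in Python, whose `datetime.date` has no trouble with dates before 1900. The sketch assumes SPSS stores dates as seconds counted from midnight on 14 October 1582, which is my understanding of its date format. The function names are invented for illustration.

```python
from datetime import date, timedelta

# Assumed SPSS date origin: midnight at the start of 14 October 1582.
SPSS_EPOCH = date(1582, 10, 14)
SECONDS_PER_DAY = 86400


def to_spss_seconds(d):
    """Convert a calendar date to seconds since the assumed SPSS origin."""
    return (d - SPSS_EPOCH).days * SECONDS_PER_DAY


def from_spss_seconds(seconds):
    """Convert seconds since the assumed SPSS origin back to a calendar date."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


# A pre-1900 date survives the round trip.
old_reading = date(1899, 12, 31)
index = to_spss_seconds(old_reading)
assert from_spss_seconds(index) == old_reading
```

Dividing the index by 86400 gives a plain day count, which Excel can hold as an ordinary number even when it refuses to treat the value as a date.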

I've written up the start of this little exercise here: https://open.substack.com/pub/davesdata/p/met-office-temperature-data-cawood

One of the issues with many temperature data-sets is the corrections that have been applied to the quoted values. The 'very handy resource' that you link to does not state whether the values are 'raw' or 'corrected'. Does the Met Office publish the data anywhere such that we can see and compare what was originally recorded versus what appears in their table?

Absolutely so. As far as I am aware these are raw readings but you can check them against data held in MIDAS...

https://catalogue.ceda.ac.uk/uuid/dbd451271eb04662beade68da43546e1

Back in August I checked my own temp data against those for a station a few miles from me, a summary of which can be found in this newsletter...

https://jdeeclimate.substack.com/p/my-garden-part-3

Hi John

I have actually scooped the whole of the MIDAS bunch of CSV files and am working my way through "ingesting" it into a usable format in XL ... Working with Ray Sanders (off the Tallbloke blog). Need some advice on ARIMA and Fourier tools and all the good stuff that you do. The data is pretty funky in places!

My goodness that is fabulous! As it so happens I jotted 'ARIMA article' in my black book for Private Passion only yesterday in order to pass on hints and tips gathered over the years.
