CEDA Technical Blog

Technical blog by the Centre for Environmental Data Analysis (CEDA). These posts are written by members of the development and data management teams about work being undertaken in CEDA. Projects described here may be experimental and unfinished.


Climate Forecast Aggregation (CFA) Conventions

CEDA and JASMIN face two interrelated challenges affecting both Archive and user data: data volumes are growing rapidly year on year, and the introduction of new storage technology may require users to change their workflows. Several software packages already exploit this new storage technology, but some of them involve rewriting the data into a new file format, which we believe is not suitable for archiving data. This article presents a data format, developed in conjunction with CEDA and NCAS CMS, that we believe can exploit the new storage technology while remaining suitable as an archival format.


What is a user? Removing anomalous behaviour from anonymous access logs

The goal of the Climate Change Initiative (CCI) project is to provide open, registration-free access to essential climate variables (ECVs). CEDA runs the open data portal, a suite of services, including download and metadata services, that provides access to the CCI datasets held in the CEDA Archive. Dataset usage is an important metric for understanding the uptake of the different datasets; however, because users are not required to register, it is difficult to identify distinct users. Recent changes in access patterns have led to spurious user counts under the assumption that 1 IP = 1 user. This article looks at methods for determining “normal” thresholds to reduce the impact of these access patterns on our usage statistics.

Tags: CCI, download stats
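
To make the idea of a “normal” threshold concrete, here is a minimal sketch. The summary above does not say which method the article uses, so the rule below (a Tukey upper fence on the number of requests per IP) is purely an assumption for illustration, as are the function names and the example addresses. It only covers the case where a single IP makes far more requests than a typical user; it does not address one user appearing under many IPs.

```python
from collections import Counter
from statistics import quantiles


def upper_fence(counts, k=1.5):
    """Return Q3 + k * IQR for a list of per-IP request counts.

    This is the Tukey fence, chosen here only as an illustrative rule.
    """
    if len(counts) < 2:
        # Too few IPs to estimate quartiles; treat everything as normal.
        return max(counts, default=0)
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def count_users(request_ips, k=1.5):
    """Count IPs whose request volume falls at or below the threshold.

    request_ips: one entry per logged request (hypothetical input shape).
    Returns (number of "normal" IPs, set of anomalous IPs, threshold).
    """
    per_ip = Counter(request_ips)
    threshold = upper_fence(list(per_ip.values()), k)
    anomalous = {ip for ip, n in per_ip.items() if n > threshold}
    return len(per_ip) - len(anomalous), anomalous, threshold


# Example with made-up addresses from the documentation range 192.0.2.0/24.
log = (
    ["192.0.2.1"] * 3
    + ["192.0.2.2"] * 4
    + ["192.0.2.3"] * 5
    + ["192.0.2.4"] * 4
    + ["192.0.2.5"] * 500  # one IP behaving very differently
)
users, flagged, limit = count_users(log)
print(users, flagged, limit)
```

A rule like this is sensitive to the choice of `k` and to the time window over which requests are counted, so it should be read as a starting point rather than a recommended setting.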

Search Futures

We have been looking for a flexible, scalable standard that would allow us to expose the bulk of the CEDA Archive via faceted search. This could then be used to build user interfaces and enhance search services at CEDA. Here, we consider the feasibility and suitability of STAC and discuss progress towards an Elasticsearch-based implementation.
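
As a rough illustration of what faceted search over an Elasticsearch index involves, the sketch below builds a request body that filters on already-selected facet values and asks for bucket counts on each facet using `terms` aggregations. The helper name, the facet names and the field paths (`properties.platform` and so on) are hypothetical and are not taken from the CEDA implementation; only the query DSL structure (`bool`/`filter`, `term`, `aggs`/`terms`) is standard Elasticsearch.

```python
import json


def build_facet_query(selected, facets, bucket_limit=10):
    """Build an Elasticsearch search body for a faceted search.

    selected: mapping of field path -> exact value already chosen by the user.
    facets:   mapping of facet name -> field path to aggregate on.
    Both mappings use made-up field names for illustration.
    """
    if selected:
        query = {
            "bool": {
                "filter": [{"term": {field: value}} for field, value in selected.items()]
            }
        }
    else:
        query = {"match_all": {}}

    aggs = {
        name: {"terms": {"field": field, "size": bucket_limit}}
        for name, field in facets.items()
    }

    # size 0: return only the facet counts, not the matching documents.
    return {"size": 0, "query": query, "aggs": aggs}


body = build_facet_query(
    selected={"properties.platform": "sentinel-2a"},
    facets={
        "platform": "properties.platform",
        "instrument": "properties.instruments",
    },
)
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the body of a search request; each aggregation in the response would then list the most common values for that facet together with their document counts, which is what a user interface needs to render facet checkboxes.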