The eupp package provides access to a variety of gridded data sets made available within the scope of the European Post-Processing benchmark project.
The gridded data sets consist of (re)analysis data used as the gridded ground truth in some scenarios, deterministic and ensemble forecasts for the training and test periods defined within the project, as well as hindcasts (or reforecasts). While the data sets differ in form and extent, the eupp package provides a uniform interface to download and, to some extent, process the data.
This article (Getting started with gridded data) shows the main use of the eupp_*_gridded() functionality with some minimal examples of how to work with the data. To this end, different types of gridded data sets are used in different situations. Dedicated articles are available that highlight specific characteristics and give explicit examples for the different types of data.
Note that these articles will often refer back to this ‘getting started’ article, as most functions and procedures work the same way regardless of the type of gridded data.
The data set has been designed and prepared by colleagues at the RMI in Brussels (part of the R&D Department of the Royal Meteorological Institute of Belgium). The gridded data set consists of different ECMWF products (see LICENSE), with access granted via the europeanweather.cloud S3 bucket.
All gridded data sets are stored as GRIB version 1 files, each alongside a GRIB index file. These files can technically be accessed directly; however, this may be inconvenient for many users. Thus, the eupp package provides an interface to download the data.
Rough scheme of the download and processing procedure.
Independent of the product or subset, the procedure is the same:
1. Specify the data to be retrieved (eupp_config()).
2. Download the data and, if requested, process it, e.g., into a stars object.
This article contains a series of links to the article “Gridded data: Advanced”. Casual users do not need to follow them, but they may offer some insight to more advanced users, programmers, and supporters.
Under the hood, the eupp package performs a series of intermediate steps for (2) to achieve the goal:
- Downloads the required GRIB messages (using curl) and stores the requested messages in a new GRIB version 1 file.
- If a stars object has been requested: reads the NetCDF file. This goes through the intermediate step of creating a NetCDF file; thus ecCodes is necessary.
Step one
Before downloading any data, a configuration object must be created using eupp_config(), which contains the specification of the data to be retrieved.
# Loading the package
library("eupp")
# Create custom configuration
conf <- eupp_config(product   = "forecast",
                    level     = "surf",
                    type      = "ens",
                    date      = "2017-07-01",
                    parameter = c("cp", "2t"),
                    steps     = c(24L, 240L), # +1 and +10 days ahead
                    cache     = "_cache")     # optional; caching grib index
Although typically not needed by end users, inspecting the GRIB inventory is a handy way to see which messages will be downloaded, or to browse the available messages before downloading the data itself.
inv <- eupp_get_inventory(conf)
head(inv)
## path domain
## 529 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g
## 546 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g
## 2729 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g
## 2746 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g
## 115371 data/fcs/surf/EU_forecast_ens_surf_params_2017-07-01_0.grb g
## 115388 data/fcs/surf/EU_forecast_ens_surf_params_2017-07-01_0.grb g
## levtype step_char param class type stream expver leg_number offset
## 529 sfc 24 2t od cf enfo 0001 1 12191040
## 546 sfc 24 cp od cf enfo 0001 1 12602400
## 2729 sfc 240 2t od cf enfo 0001 1 63022200
## 2746 sfc 240 cp od cf enfo 0001 1 63433560
## 115371 sfc 24 2t od pf enfo 0001 1 609540720
## 115388 sfc 24 cp od pf enfo 0001 1 609952080
## length param_id number init step valid
## 529 23412 167 0 2017-07-01 24 2017-07-02
## 546 23412 143 0 2017-07-01 24 2017-07-02
## 2729 23412 167 0 2017-07-01 240 2017-07-11
## 2746 23412 143 0 2017-07-01 240 2017-07-11
## 115371 23412 167 1 2017-07-01 24 2017-07-02
## 115388 23412 143 1 2017-07-01 24 2017-07-02
dim(inv)
## [1] 204 17
In this case the configuration (conf) defines a set of 204 messages to be processed/downloaded. To see which messages are available, one can simply set up a configuration for a specific product/level/type/date without specifying steps or parameter. This will return the full inventory with all available parameters and steps.
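The number of messages can be cross-checked with a little arithmetic. The following is a minimal sketch in base R (it does not call eupp); it assumes the ensemble consists of one control run (type cf) plus 50 perturbed members (type pf), which is consistent with the inventory shown above but not stated explicitly there.

```r
# Assumption: 1 control run ("cf") plus 50 perturbed members ("pf")
members    <- c(cf = 1L, pf = 50L)
parameters <- c("cp", "2t")      # as requested in the configuration
steps      <- c(24L, 240L)       # as requested in the configuration

# One GRIB message per member, parameter, and forecast step
n_messages <- sum(members) * length(parameters) * length(steps)
n_messages
## [1] 204
```

The result matches the number of rows returned by dim(inv).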
From eupp_get_inventory() we know that there are 204 fields matching our configuration. eupp_download_gridded() allows us to retrieve the data in the original GRIB version 1 file format by specifying output_format = "grib".
The function first downloads and parses the GRIB index file (using the cache if specified) to determine which GRIB messages are required given the configuration (conf), and then downloads the required messages. All messages matching the configuration are stored in one single file specified by output_file (GRIB version 1 file format).
eupp_download_gridded(conf, output_file = "_test.grb", overwrite = TRUE)
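To illustrate why the index makes partial downloads possible, here is a minimal sketch of how the offset and length columns of the inventory could be turned into HTTP byte ranges. This is not the package's actual implementation: byte_range() is a hypothetical helper, and the sketch assumes that offset is the zero-based position of the first byte of a message.

```r
# Two index entries copied from the inventory shown earlier
idx <- data.frame(param  = c("2t", "cp"),
                  offset = c(12191040, 12602400),
                  length = c(23412, 23412))

# Hypothetical helper (not part of eupp): build the value of an HTTP
# 'Range' header. Assumption: 'offset' is the zero-based first byte.
byte_range <- function(offset, length) {
    sprintf("bytes=%.0f-%.0f", offset, offset + length - 1)
}

idx$range <- byte_range(idx$offset, idx$length)
idx[, c("param", "range")]
```

A client such as curl can send ranges like these to fetch only the requested messages instead of the entire GRIB file.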
Alongside the GRIB file ("_test.grb"), an .rds file "_test.grb.rds" is stored containing the GRIB inventory (meta information about the fields). While not strictly required, this makes it possible to interpolate the GRIB file without having ecCodes installed (see next section).
The eupp package allows GRIB data to be interpolated directly. Commonly this is done using additional libraries that are able to read the GRIB meta information (index), such as ecCodes. stars can also read GRIB files directly (via rgdal); it does not, however, return this meta information. eupp_interpolate_grib() thus does the following:
- Checks whether an .rds file exists alongside the GRIB file to be interpolated (see previous section). If so, this information is used to perform the interpolation (does not require ecCodes).
- If the .rds file does not exist, grib_ls (ecCodes) is called to create the inventory/index from the GRIB file.
Currently, eupp_interpolate_grib() only supports interpolation to one or multiple points (POINT features). The interpolation is performed via stars; the result is then rearranged into a ‘more usable’ form.
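The choice between the two inventory sources described above can be sketched in a few lines of base R. The helper inventory_source() is hypothetical and only mimics the decision; the package's internal logic may differ. It assumes the .rds file is named by appending ".rds" to the GRIB file name, as in "_test.grb.rds".

```r
# Hypothetical helper (not part of eupp): decide where the inventory comes from
inventory_source <- function(grib_file) {
    rds_file <- paste0(grib_file, ".rds")  # e.g. "_test.grb" -> "_test.grb.rds"
    if (file.exists(rds_file)) "rds" else "grib_ls"
}

# Demonstration with temporary (empty) files
grb <- tempfile(fileext = ".grb")
file.create(grb)
before <- inventory_source(grb)            # no .rds file yet: ecCodes needed

saveRDS(data.frame(), paste0(grb, ".rds"))
after <- inventory_source(grb)             # .rds file present: no ecCodes needed
c(before = before, after = after)
```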
First, an sf object containing the target locations has to be created. Only point locations are allowed, and the object must have a valid coordinate reference system (CRS).
library("sf")
locations <- data.frame(name = c("Innsbruck", "Brussels"),
                        lon  = c(11.39, 4.35),
                        lat  = c(47.27, 50.85))
(locations <- st_as_sf(locations, coords = c("lon", "lat"), crs = 4326))
## Simple feature collection with 2 features and 1 field
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 4.35 ymin: 47.27 xmax: 11.39 ymax: 50.85
## Geodetic CRS: WGS 84
## name geometry
## 1 Innsbruck POINT (11.39 47.27)
## 2 Brussels POINT (4.35 50.85)
Once available, the GRIB file can be interpolated.
ip <- eupp_interpolate_grib("_test.grb", at = locations,
atname = "name", bilinear = TRUE)
Any warnings come from readGDAL() (rgdal) and can be ignored at this point. By default, a wide format is returned, but a long format can be retrieved if needed.
head(ip[, 1:11]) # First 11 columns only
## init valid step geometry name cp_0
## 1 2017-07-01 2017-07-02 24 POINT (11.39 47.27) Innsbruck 0.008243402
## 2 2017-07-01 2017-07-11 240 POINT (11.39 47.27) Innsbruck 0.024329524
## 3 2017-07-01 2017-07-02 24 POINT (4.35 50.85) Brussels 0.001813469
## 4 2017-07-01 2017-07-11 240 POINT (4.35 50.85) Brussels 0.012095566
## cp_1 cp_10 cp_11 cp_12 cp_13
## 1 0.006312119 0.006833686 0.009337233 0.004342669 0.0068883202
## 2 0.035008109 0.038468658 0.036461267 0.056630942 0.0335263306
## 3 0.002549515 0.002692184 0.002620182 0.001578808 0.0008776283
## 4 0.006306648 0.015253906 0.006379013 0.038313293 0.0048063660
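In the wide format, each ensemble member appears to get its own column named <parameter>_<member> (e.g., cp_0 for the control run). As a small usage sketch under that assumption, the member columns can be selected by pattern and summarized per row. The example uses a plain data.frame with a few values copied from the output above rather than the actual return object.

```r
# Small stand-in for the wide-format result (values copied from above)
wide <- data.frame(name = c("Innsbruck", "Brussels"),
                   step = c(24L, 24L),
                   cp_0 = c(0.008243402, 0.001813469),
                   cp_1 = c(0.006312119, 0.002549515))

# Select all member columns of parameter "cp" (assumed pattern: cp_<member>)
member_cols <- grep("^cp_[0-9]+$", names(wide), value = TRUE)

# Ensemble mean across the selected members, one value per row
wide$cp_mean <- rowMeans(wide[, member_cols])
wide[, c("name", "step", "cp_mean")]
```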
# Long format; contains more extensive information
# (differs between rds/grib_ls).
head(eupp_interpolate_grib("_test.grb", at = locations,
atname = "name", wide = FALSE), n = 3)
## path domain levtype
## 1 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g sfc
## 2 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g sfc
## 3 data/fcs/surf/EU_forecast_ctr_surf_params_2017-07_0.grb g sfc
## step_char param class type stream expver leg_number offset length param_id
## 1 24 t2m_0 od cf enfo 0001 1 12191040 23412 167
## 2 24 cp_0 od cf enfo 0001 1 12602400 23412 143
## 3 240 t2m_0 od cf enfo 0001 1 63022200 23412 167
## number init step valid geometry value name
## 1 0 2017-07-01 24 2017-07-02 POINT (11.39 47.27) 2.778642e+02 Innsbruck
## 2 0 2017-07-01 24 2017-07-02 POINT (11.39 47.27) 8.243402e-03 Innsbruck
## 3 0 2017-07-01 240 2017-07-11 POINT (11.39 47.27) 2.799946e+02 Innsbruck
Please see the documentation of eupp_interpolate_grib() for details on its arguments, including those not demonstrated here.
The eupp package contains some additional functionality to download/process gridded data sets. These functions, however, all go through grib_to_netcdf (ecCodes), which comes with a series of benefits and drawbacks. A separate article demonstrates this functionality; when using it, keep in mind that it must be seen as ‘experimental’.