climepi.climdata.get_climate_data

climepi.climdata.get_climate_data(data_source, frequency='monthly', subset=None, save_dir=None, download=True, force_remake=False, subset_check_interval=10, max_subset_wait_time=20, **kwargs)

Retrieve and download climate projection data from a remote server.

Currently available data sources are CESM2 LENS (data_source='lens2'), CESM2 ARISE (data_source='arise'), CESM1 GLENS (data_source='glens'), and ISIMIP (data_source='isimip'). CESM2 LENS and ARISE data are taken from AWS servers (https://registry.opendata.aws/ncar-cesm2-lens/ and https://registry.opendata.aws/ncar-cesm2-arise/), while CESM1 GLENS data are taken from the NSF NCAR Research Data Archive (https://rda.ucar.edu/datasets/d651064/). Terms of use for CESM data can be found at https://www.ucar.edu/terms-of-use/data. ISIMIP data are taken from the ISIMIP repository (https://data.isimip.org/), and terms of use can be found at https://www.isimip.org/gettingstarted/terms-of-use/terms-use-publicly-available-isimip-data-after-embargo-period/.

Note that for GLENS data, no server-side subsetting is performed, and the full downloaded data files are deleted only after all data have been downloaded and processed. Splitting the data retrieval into multiple calls to this function may therefore reduce peak disk usage.
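One way to split a GLENS retrieval into several calls is to batch the requested years. The sketch below is illustrative only: `year_batches` and `fetch_glens_in_batches` are hypothetical helpers, not part of climepi, and the batch size is an arbitrary choice. Only the `data_source` and `subset={'years': ...}` arguments come from the documentation on this page.

```python
from typing import Iterator, List


def year_batches(years: List[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `years`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(years), batch_size):
        yield years[start:start + batch_size]


def fetch_glens_in_batches(years, batch_size=10, **kwargs):
    """Hypothetical helper: one get_climate_data call per batch of years.

    Not executed here; it needs climepi installed and network access.
    """
    from climepi import climdata  # imported lazily on purpose

    return [
        climdata.get_climate_data("glens", subset={"years": batch}, **kwargs)
        for batch in year_batches(years, batch_size)
    ]
```

Each call returns its own dataset, so the results would still need to be combined afterwards, for example with `xarray.concat` along the time dimension.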

Parameters:
  • data_source (str) – Data source to retrieve data from. Currently supported sources are ‘lens2’ (for CESM2 LENS data), ‘arise’ (CESM2 ARISE data), ‘glens’ (CESM1 GLENS data), and ‘isimip’ (ISIMIP data).

  • frequency (str, optional) – Frequency of the data to retrieve. Should be one of ‘daily’, ‘monthly’ or ‘yearly’ (default is ‘monthly’).

  • subset (dict, optional) –

    Dictionary of data subsetting options. The following keys/values are available:
    years : list of int, optional

    Years for which to retrieve data within the available data range. If not provided, all years are retrieved.

    scenarios : list of str, optional

    Scenarios for which to retrieve data. If not provided, all available scenarios are retrieved.

    models : list of str, optional

    Models for which to retrieve data. If not provided, all available models are retrieved.

    realizations : list of int, optional

    Realizations for which to retrieve data (indexed from 0). If not provided, all available realizations are retrieved.

    locations : list of str, optional

    Names of one or more locations for which to retrieve data. If provided without the ‘lons’ and ‘lats’ keys, OpenStreetMap data (https://openstreetmap.org/copyright) are used to look up the corresponding longitudes and latitudes, and data for the grid point nearest to each location are retrieved. If ‘lons’ and ‘lats’ are also provided, these are used to retrieve the data (the ‘locations’ values are still used as a dimension coordinate in the output dataset). If not provided, the ‘lon_range’ and ‘lat_range’ keys are used instead.

    lons : list of float, optional

    Longitude(s) for which to retrieve data. If provided, both ‘locations’ and ‘lats’ should also be provided, and must be lists of the same length.

    lats : list of float, optional

    Latitude(s) for which to retrieve data. If provided, both ‘locations’ and ‘lons’ should also be provided, and must be lists of the same length.

    lon_range : tuple of two floats, optional

    Longitude range for which to retrieve data. Should comprise two values giving the minimum and maximum longitudes. Ignored if ‘locations’ is provided. If not provided, and ‘locations’ is also not provided, all longitudes are retrieved.

    lat_range : tuple of two floats, optional

    Latitude range for which to retrieve data. Should comprise two values giving the minimum and maximum latitudes. Ignored if ‘locations’ is provided. If not provided, and ‘locations’ is also not provided, all latitudes are retrieved.

  • save_dir (str or pathlib.Path, optional) – Directory in which downloaded data are saved and from which they are read. If not provided, a directory within the OS cache directory is used.

  • download (bool, optional) – For CESM2 LENS and ARISE data only; whether to download the data to the save_dir directory if not found locally (default is True). If False and the data are not found locally, a lazily opened dataset linked to the remote data is returned. For ISIMIP data, the data must be downloaded if not found locally.

  • force_remake (bool, optional) – Whether to force re-downloading and re-formatting of the data even if they are found locally (default is False). Can only be used if download is True.

  • subset_check_interval (float, optional) – For ISIMIP data only; time interval in seconds between checks for server-side data subsetting completion (default is 10).

  • max_subset_wait_time (float, optional) – For ISIMIP data only; maximum time to wait for server-side data subsetting to complete, in seconds, before timing out (default is 20). Server-side subsetting will continue to run after this function times out, and this function can be re-run to check if the subsetting has completed and retrieve the subsetted data.

  • **kwargs – Additional keyword arguments to pass to xarray.open_mfdataset() when opening downloaded data files.
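The description of max_subset_wait_time says the function can be re-run after a timeout to check whether the server-side subsetting has finished. A minimal sketch of that pattern follows. `retry_call` and `fetch_isimip_with_retries` are hypothetical helpers, not part of climepi, and the exception raised on timeout is not stated on this page, so `TimeoutError` is an assumption that should be checked against the installed version.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    fetch: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 5,
    pause: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(pause)  # wait before re-running the request
    raise AssertionError("unreachable")


def fetch_isimip_with_retries(subset, **kwargs):
    """Hypothetical wrapper; not executed here (needs climepi and network)."""
    from climepi import climdata  # imported lazily on purpose

    return retry_call(
        lambda: climdata.get_climate_data("isimip", subset=subset, **kwargs),
        retry_on=(TimeoutError,),  # assumption: actual exception type may differ
    )
```

Passing the exception types as an argument keeps the helper usable once the real timeout exception is known.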

Returns:

xarray.Dataset – Formatted climate projection dataset.
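As an illustration of the subset options described above, the sketch below builds three subset dictionaries and checks the documented rule that ‘lons’, ‘lats’ and ‘locations’ must be given together as lists of equal length. `check_point_subset` is a hypothetical helper, not part of climepi, and the place names, coordinates and years are made-up values. The longitude convention (for example, -180 to 180 versus 0 to 360) is not stated on this page and is assumed here.

```python
def check_point_subset(subset: dict) -> None:
    """Raise ValueError if 'lons'/'lats'/'locations' are inconsistent."""
    lons = subset.get("lons")
    lats = subset.get("lats")
    locations = subset.get("locations")
    if lons is None and lats is None:
        return  # names only, or a lon/lat range: nothing to cross-check
    if lons is None or lats is None or locations is None:
        raise ValueError("'lons', 'lats' and 'locations' must be given together")
    if not (len(lons) == len(lats) == len(locations)):
        raise ValueError("'lons', 'lats' and 'locations' must have equal length")


# Named locations only: coordinates are looked up via OpenStreetMap.
by_name = {"years": [2030, 2040], "locations": ["London", "Cape Town"]}

# Explicit coordinates: 'locations' then only labels the output dimension.
by_coords = {
    "realizations": [0, 1],
    "locations": ["Site A", "Site B"],
    "lons": [-0.13, 18.42],  # assumed -180..180 convention
    "lats": [51.51, -33.92],
}

# Bounding box: used only when 'locations' is absent.
by_box = {"lon_range": (-10.0, 5.0), "lat_range": (45.0, 60.0)}

for candidate in (by_name, by_coords, by_box):
    check_point_subset(candidate)

# With climepi installed, a call might then look like:
# from climepi import climdata
# ds = climdata.get_climate_data("lens2", frequency="monthly", subset=by_name)
```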