Gridded Data Station Subsetting on closest model point

This program is a replacement for the suite of programs like HROISST_Station.py and NARR_daily_WindsSFCtemp_Station. Previous iterations of this program would use netcdf files from archive centers to first, locate the closest gridpoint, then extract that point from all successive files and combine the results into a single EPIC flavored netcdf file.

Common Datasets

  • NARR 3hrly or daily data

  • NCEP/NCAR Reanalysis data

  • erddap hosted JPL MUR sst data

  • etopo5 or other bathymetry

Primary functions to replace and motivation for this notebook

  • Stop supporting python 2.7 (support 3.8 or greater)

  • Stop outputting anything EPIC related files

  • use native “find nearest” for xarray… may not truly take into account lat/lon is not consistent km as you go furthur north like great circle would. Not sure this is actually an issue though unles your grid is super coarse…

Tasks that are addressed below:

  • read from multiple files (often parameters are in seperate files for seperate years)

  • read from erddap if available

  • filter 3hr winds (3pt triangle filter shown), but you can use any other filter on any other parameter

  • take daily, hourly or monthly input

Inputs:

  • station name

  • start/end time

  • location (lat,lon)

  • filter information

import xarray as xa
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Explore using data from netcdf files

Commonly PJS asks for NARR winds/temp from certain locations (C2, M2)

For netcdf files with parameters that span mulitple files (like NARR or NCEP Reanalysis files where each file is usually just a single parameter plus location/time) you can read them as a multi-file dataset aggregated on similar years (so loop over years for a multi year analysis)

If you only want a single parameter, you can go straight to looping over years or even just use the mfdataset method to open all files.

year = 2022
lat = 71.230
lon = -164.105
site = 'C2'
netcdf_dir = {'datapath':'/Users/bell/in_and_outbox/data_sets/reanalysis_data/NARR/', #NARR Netcdf example
              'dataset':'NARR'}
# netcdf_dir = {'datapath':'/Volumes/External/in_and_outbox/data_sets/reanalyis_data/NCEP-NCAR/daily/', #NCEP Netcdf example
#               'dataset':'NCEP'}

#load muliple files, one year - or multiple years, one measurement type
ds = xa.open_mfdataset(netcdf_dir['datapath']+f'*{year}.nc').load() #<-- pass year in - 
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'air' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
  new_vars[k] = decode_cf_variable(
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'uwnd' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
  new_vars[k] = decode_cf_variable(
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'vwnd' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
  new_vars[k] = decode_cf_variable(
# the following is a least distance calculator using the units specified.  
# As one deg of lon and one deg of latitude are not consistantly equal over the entire globe (like one unit of km is) this approach
#  has its numerical limitations and should only be used when a measure of ambiguity is ok.  otherwise, stick to consistent units (km or x/y)
#
# You may need to change the lat/lon variables though for the independent dataset

if netcdf_dir['dataset'] == 'NARR':
    ''' for when grid is in x/y coords'''
    d_lat = ds.lat - lat
    d_lon = ds.lon - lon
    r2_requested = d_lat**2 + d_lon**2
    i_j_loc = np.where(r2_requested == np.min(r2_requested))

    nearest_point = ds.isel(x=i_j_loc[1], y=i_j_loc[0]) #x/y order may need reversal
else:
    if ds.lon.all() >=0:
        ''' for when its 0:360'''
        nearest_point = ds.sel(lat=lat,lon=lon+360,method='nearest')
    else:
        ''' for when its -180:180'''
        pass
ods.AT_21.isel(lat=0,lon=0,depth=0).values
array([-25.797241 , -23.142609 , -23.471329 , -22.580551 , -22.1232   ,
       -21.782013 , -22.937912 , -24.244415 , -24.639267 , -22.822906 ,
       -24.151901 , -23.558151 , -23.051193 , -22.685837 , -22.852142 ,
       -25.176117 , -26.748383 , -27.441772 , -27.982346 , -27.701141 ,
       -27.684387 , -26.920288 , -27.034958 , -26.36586  , -25.174927 ,
       -22.801147 , -21.335037 , -20.32814  , -20.090027 , -19.014496 ,
       -18.656036 , -18.33432  , -18.312744 , -18.867462 , -19.123322 ,
       -19.530334 , -19.808167 , -20.566345 , -19.852066 , -18.34375  ,
       -21.006226 , -22.135834 , -24.501404 , -25.845093 , -27.237137 ,
       -27.84909  , -27.634018 , -26.053299 , -25.37236  , -24.274063 ,
       -24.02945  , -23.893768 , -23.797607 , -24.200073 , -25.256744 ,
       -26.08464  , -26.45224  , -27.006073 , -27.184387 , -27.360825 ,
       -28.49202  , -29.554688 , -30.18161  , -28.604752 , -28.504517 ,
       -27.216522 , -27.655838 , -27.569717 , -30.631851 , -30.50978  ,
       -29.884338 , -30.630112 , -30.294113 , -29.409927 , -29.023758 ,
       -28.70839  , -28.579788 , -29.44629  , -29.007843 , -27.992172 ,
       -28.955826 , -28.396881 , -29.64447  , -30.68576  , -29.868164 ,
       -26.324219 , -24.067413 , -22.775467 , -23.02095  , -22.389984 ,
       -23.037903 , -23.350128 , -24.453735 , -24.55095  , -24.663025 ,
       -24.21341  , -24.628922 , -25.36383  , -29.119385 , -28.724533 ,
       -25.916534 , -24.26091  , -24.387695 , -25.535263 , -27.925217 ,
       -28.968918 , -29.526382 , -30.571579 , -30.792603 , -30.929962 ,
       -30.75     , -30.460037 , -30.713226 , -30.992676 , -30.997833 ,
       -31.066498 , -28.498413 , -25.762604 , -24.499802 , -23.450424 ,
       -22.92743  , -22.535599 , -22.30931  , -21.966705 , -22.258926 ,
       -22.057129 , -22.19011  , -22.510681 , -22.884918 , -23.304062 ,
       -23.495773 , -23.606567 , -23.762451 , -23.091904 , -23.052628 ,
       -23.139679 , -23.091766 , -23.003769 , -23.120255 , -22.884995 ,
       -22.91307  , -22.882202 , -22.764969 , -22.304047 , -22.441101 ,
       -21.888702 , -22.027435 , -22.827454 , -23.215454 , -22.14981  ,
       -21.637299 , -21.417328 , -21.410492 , -21.087234 , -21.2892   ,
       -24.152466 , -25.597122 , -23.37622  , -25.003525 , -26.941193 ,
       -27.99472  , -29.55513  , -28.690567 , -29.73584  , -30.543549 ,
       -31.182251 , -30.239868 , -30.199387 , -30.147125 , -29.973907 ,
       -29.152802 , -28.82785  , -28.714828 , -28.56578  , -28.102356 ,
       -27.78598  , -27.907959 , -26.756393 , -25.438782 , -23.034515 ,
       -22.078812 , -21.082474 , -20.860596 , -20.186462 , -20.442932 ,
       -19.857346 , -19.891113 , -18.790176 , -17.843277 , -17.215332 ,
       -16.122345 , -16.008514 , -16.461517 , -17.011932 , -18.471573 ,
       -19.928268 , -21.855621 , -23.028519 , -24.142868 , -24.31601  ,
       -25.166443 , -24.833649 , -25.89003  , -26.81691  , -27.513916 ,
       -27.969193 , -28.218216 , -26.56868  , -25.843903 , -24.785019 ,
       -24.69136  , -24.421722 , -24.548645 , -24.300842 , -24.445587 ,
       -24.653534 , -25.154327 , -25.782394 , -29.256134 , -31.366928 ,
       -32.069214 , -31.644287 , -31.165802 , -30.421448 , -28.939972 ,
       -26.889282 , -26.629623 , -24.763763 , -24.12938  , -23.443787 ,
       -22.235413 , -21.286896 , -20.552994 , -21.059296 , -22.329514 ,
       -22.56543  , -22.992813 , -22.750381 , -23.367355 , -23.765656 ,
       -23.619095 , -23.77452  , -25.652878 , -28.085388 , -28.053345 ,
       -24.4012   , -23.27214  , -22.236664 , -22.313324 , -22.345596 ,
       -22.788803 , -22.694275 , -22.397995 , -20.139328 , -19.77629  ,
       -18.892273 , -19.037079 , -19.142685 , -19.934753 , -19.736954 ,
       -20.600937 , -21.314468 , -21.114014 , -21.541702 , -21.557663 ,
       -21.075119 , -21.647797 , -22.544678 , -22.597672 , -23.998398 ,
       -25.892075 , -27.64206  , -28.01155  , -28.710999 , -30.233063 ,
       -30.732376 , -30.65538  , -30.358414 , -30.567978 , -28.30066  ,
       -24.61174  , -23.48262  , -23.120346 , -22.473541 , -22.937866 ,
       -22.869202 , -23.101883 , -22.697525 , -22.47847  , -22.861786 ,
       -24.58139  , -26.317322 , -26.460815 , -24.416367 , -22.85318  ,
       -24.899445 , -25.288467 , -27.482422 , -31.53064  , -32.615097 ,
       -31.33786  , -31.542831 , -32.125366 , -30.1053   , -27.458801 ,
       -27.96733  , -28.320557 , -29.414047 , -29.701645 , -30.007126 ,
       -30.251907 , -31.723068 , -31.082932 , -33.120483 , -34.22223  ,
       -34.603516 , -34.738205 , -33.91574  , -33.86606  , -34.095886 ,
       -32.517807 , -32.65146  , -30.152786 , -28.534409 , -27.475998 ,
       -26.156952 , -25.499695 , -27.151672 , -24.961609 , -24.06836  ,
       -24.23845  , -24.484772 , -24.841751 , -24.781387 , -26.522827 ,
       -27.95723  , -28.119919 , -28.081818 , -29.066391 , -27.479263 ,
       -26.937897 , -27.716187 , -28.326889 , -26.515274 , -26.329483 ,
       -25.697235 , -25.886505 , -25.807343 , -25.990204 , -25.288193 ,
       -25.085754 , -24.947067 , -24.516388 , -24.075897 , -24.311798 ,
       -24.652664 , -24.867401 , -25.174896 , -25.498001 , -25.814758 ,
       -24.698608 , -24.08107  , -25.033005 , -24.264801 , -24.205963 ,
       -23.332916 , -23.031738 , -22.224075 , -21.425903 , -20.529556 ,
       -20.591873 , -20.736252 , -20.631866 , -22.365433 , -24.274597 ,
       -24.68509  , -23.109085 , -22.182663 , -22.154587 , -22.593597 ,
       -22.449814 , -22.943527 , -23.566284 , -22.110062 , -20.648743 ,
       -20.14357  , -19.773956 , -19.791626 , -19.817581 , -19.584122 ,
       -19.382904 , -19.696075 , -19.038757 , -20.176804 , -21.547012 ,
       -23.678772 , -25.396011 , -27.070496 , -28.253357 , -27.779785 ,
       -25.577118 , -25.259216 , -26.042984 , -28.683655 , -30.060562 ,
       -30.522995 , -31.110916 , -30.452148 , -28.249252 , -28.871002 ,
       -30.133163 , -30.17183  , -29.098495 , -28.961777 , -26.251526 ,
       -21.490768 , -16.297882 , -12.163116 ,  -9.480072 ,  -7.9664   ,
        -7.4937134,  -6.978485 ,  -7.0210876,  -7.1669006,  -8.823486 ,
       -16.56137  , -20.950134 , -22.349564 , -23.6819   , -23.662216 ,
       -25.077988 , -24.94011  , -22.697327 , -23.575699 , -25.591263 ,
       -25.702133 , -25.645813 , -25.136978 , -23.754333 , -21.871902 ,
       -19.539062 , -18.932098 , -18.59349  , -18.323685 , -18.290253 ,
       -17.921967 , -18.129883 , -17.944199 , -17.282715 , -17.48265  ,
       -17.883896 , -18.607956 , -19.586166 , -20.137115 , -20.627487 ,
       -18.382889 , -16.06015  , -15.256409 , -15.630157 , -14.454559 ,
       -14.310272 , -14.125763 , -15.2005005, -16.877472 , -14.247833 ,
       -11.060699 ,  -9.455933 ,  -8.198761 ,  -7.917389 ,  -8.124695 ,
        -8.517944 ,  -7.0194397], dtype=float32)
#compare to old method which used python=2.7, EPIC formatted NetCDF data and a Great Circle calculator for nearest point

ods = xa.load_dataset(netcdf_dir['datapath']+'NARR_C2_2022.cf.nc')

fig, ax = plt.subplots(2,figsize=(12, 4), sharex=True)

(nearest_point.air-273.15).plot(ax=ax[0])
ods.AT_21.plot(ax=ax[0])

((nearest_point.air-273.15)-ods.AT_21).plot(ax=ax[1])

assert ((nearest_point.air-273.15).isel(x=0,y=0).values == ods.AT_21.isel(lat=0,lon=0,depth=0).values).all(), 'Methods not equivalent' 
../../../_images/FindNearestGriddedData_8_0.png
# often a triangle filter was applied to the 3hr data... you can do the same with the following numpy call
fig, ax = plt.subplots(1,figsize=(12, 4), sharex=True)
ods.AT_21.plot(ax=ax)
ax.plot(ods.time,np.convolve(ods.AT_21[:,0,0,0],[0.25, 0.5, 0.25],'same')) #triangle filter
start_date = '2022-01-01'
end_date = '2022-02-26'
lat = 56.867
lon = -164.07
site = 'M2'
## from erddap (SST and ice concentration)

# Could also use this to retrieve bathymetry from erddap hosted files

def NearestFromErddap(server=None,dataset=None,variablename=None,latitude=0,longitude=0,startdate='',enddate=''):
    ''' Given an erddap url and a datasetID, provide the latitude and longitude to get the nearest datavalue within a time window.  
    You will also need to specify the name of the parameter you are interested in, some examples are below.
    
    date format : yyyy-mm-dd
    latitude : -90 to 90
    longitude: try -180:180 or 0:360
    
    NOAA - HROISST: https://coastwatch.pfeg.noaa.gov :: ncdcOisst21Agg_LonPM180 or ncdcOisst21Agg
            sst: err: ice (also all as a 4d function :'( )
    JPL - MUR: https://coastwatch.pfeg.noaa.gov :: jplMURSST41 or jplMURSST41_Lon0360
            analysed_sst; sea_ice_fraction; analysis_error
    UKMET - OSTIA: https://upwell.pfeg.noaa.gov :: nasa_jpl_2fa2_cbe7_cd13.html
    
    OSISAF - EUMET: polarwatch.noaa.gov :: eumetsat_osisaf_ice_conc_nh_ps
    (this outputs and x/y dataset that needs to be merged on to the NH PolarStereographic Grid)
    '''
    
    try:
        if dataset in ['ncdcOisst21Agg_LonPM180','ncdcOisst21Agg']:
            url=f"{server}/erddap/griddap/{dataset}.csvp?{variablename}[({startdate}):1:({enddate})][(0.0):1:(0.0)][({latitude}):1:({latitude})][({longitude}):1:({longitude})]"
            return pd.read_csv(url,skiprows=0,parse_dates=True,index_col='time (UTC)')
        else:
            url=f"{server}/erddap/griddap/{dataset}.csvp?{variablename}[({startdate}):1:({enddate})][({latitude}):1:({latitude})][({longitude}):1:({longitude})]"
            return pd.read_csv(url,skiprows=0,parse_dates=True,index_col='time (UTC)')
    except:
        return url    
    
jpl_sst = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','jplMURSST41','analysed_sst',lat,lon,f'{start_date}',f'{end_date}')
ukmet_sst = NearestFromErddap('https://upwell.pfeg.noaa.gov','nasa_jpl_2fa2_cbe7_cd13','analysed_sst',lat,lon,f'{start_date}',f'{end_date}')
noaaoi_sst = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','ncdcOisst21Agg_LonPM180','sst',lat,lon,f'{start_date}',f'{end_date}')
fig, ax = plt.subplots(1,figsize=(16, 2))

jpl_sst['analysed_sst (degree_C)'].plot(ax=ax,label='MUR')
ukmet_sst['analysed_sst (degree_C)'].plot(ax=ax,label='UKMet')
noaaoi_sst['sst (degree_C)'].plot(ax=ax,label='NOAA')
ax.set_ylim([-2,5])
ax.legend()

We can use the same code above to retrieve “ice concentration” information at the same location as well… some even from the same datasource

jpl_ice = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','jplMURSST41','sea_ice_fraction',lat,lon,f'{year}-01-01',f'{end_date}')
ukmet_ice = NearestFromErddap('https://upwell.pfeg.noaa.gov','nasa_jpl_2fa2_cbe7_cd13','sea_ice_fraction',lat,lon,f'{year}-01-01',f'{end_date}')
noaaoi_ice = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','ncdcOisst21Agg_LonPM180','ice',lat,lon,f'{year}-01-01',f'{end_date}')
fig, ax = plt.subplots(1,figsize=(16, 2))

jpl_ice['sea_ice_fraction (1)'].plot(ax=ax,label='MUR')
ukmet_ice['sea_ice_fraction (1)'].plot(ax=ax,label='UKMet')
noaaoi_ice['ice (1)'].plot(ax=ax,label='NOAA')
ax.set_ylim([-.01,1.01])
ax.legend()

Todo:

  • Show that the gridded datapoint is near the initial datapoint and include some simple bathymetry or coastlines

  • Ouput to csv is straight forward (netcdf -> pandas -> csv)… any other specific formats need descriptions

  • Tether into Database for MooringID’s or CTD Locations?