Gridded Data Station Subsetting on closest model point¶
This program is a replacement for the suite of programs like HROISST_Station.py
and NARR_daily_WindsSFCtemp_Station
. Previous iterations of this program would use netcdf files from archive centers to first, locate the closest gridpoint, then extract that point from all successive files and combine the results into a single EPIC flavored netcdf file.
Common Datasets
NARR 3hrly or daily data
NCEP/NCAR Reanalysis data
erddap hosted JPL MUR sst data
etopo5 or other bathymetry
Primary functions to replace and motivation for this notebook
Stop supporting python 2.7 (support 3.8 or greater)
Stop outputting anything EPIC related files
use native “find nearest” for xarray… may not truly take into account lat/lon is not consistent km as you go furthur north like great circle would. Not sure this is actually an issue though unles your grid is super coarse…
Tasks that are addressed below:
read from multiple files (often parameters are in seperate files for seperate years)
read from erddap if available
filter 3hr winds (3pt triangle filter shown), but you can use any other filter on any other parameter
take daily, hourly or monthly input
Inputs:
station name
start/end time
location (lat,lon)
filter information
import xarray as xa
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Explore using data from netcdf files¶
Commonly PJS asks for NARR winds/temp from certain locations (C2, M2)
For netcdf files with parameters that span mulitple files (like NARR or NCEP Reanalysis files where each file is usually just a single parameter plus location/time) you can read them as a multi-file dataset aggregated on similar years (so loop over years for a multi year analysis)
If you only want a single parameter, you can go straight to looping over years or even just use the mfdataset method to open all files.
year = 2022
lat = 71.230
lon = -164.105
site = 'C2'
netcdf_dir = {'datapath':'/Users/bell/in_and_outbox/data_sets/reanalysis_data/NARR/', #NARR Netcdf example
'dataset':'NARR'}
# netcdf_dir = {'datapath':'/Volumes/External/in_and_outbox/data_sets/reanalyis_data/NCEP-NCAR/daily/', #NCEP Netcdf example
# 'dataset':'NCEP'}
#load muliple files, one year - or multiple years, one measurement type
ds = xa.open_mfdataset(netcdf_dir['datapath']+f'*{year}.nc').load() #<-- pass year in -
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'air' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
new_vars[k] = decode_cf_variable(
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'uwnd' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
new_vars[k] = decode_cf_variable(
/Users/bell/miniconda3/envs/py38/lib/python3.8/site-packages/xarray/conventions.py:516: SerializationWarning: variable 'vwnd' has multiple fill values {9.96921e+36, -9.96921e+36}, decoding all values to NaN.
new_vars[k] = decode_cf_variable(
# the following is a least distance calculator using the units specified.
# As one deg of lon and one deg of latitude are not consistantly equal over the entire globe (like one unit of km is) this approach
# has its numerical limitations and should only be used when a measure of ambiguity is ok. otherwise, stick to consistent units (km or x/y)
#
# You may need to change the lat/lon variables though for the independent dataset
if netcdf_dir['dataset'] == 'NARR':
''' for when grid is in x/y coords'''
d_lat = ds.lat - lat
d_lon = ds.lon - lon
r2_requested = d_lat**2 + d_lon**2
i_j_loc = np.where(r2_requested == np.min(r2_requested))
nearest_point = ds.isel(x=i_j_loc[1], y=i_j_loc[0]) #x/y order may need reversal
else:
if ds.lon.all() >=0:
''' for when its 0:360'''
nearest_point = ds.sel(lat=lat,lon=lon+360,method='nearest')
else:
''' for when its -180:180'''
pass
ods.AT_21.isel(lat=0,lon=0,depth=0).values
array([-25.797241 , -23.142609 , -23.471329 , -22.580551 , -22.1232 ,
-21.782013 , -22.937912 , -24.244415 , -24.639267 , -22.822906 ,
-24.151901 , -23.558151 , -23.051193 , -22.685837 , -22.852142 ,
-25.176117 , -26.748383 , -27.441772 , -27.982346 , -27.701141 ,
-27.684387 , -26.920288 , -27.034958 , -26.36586 , -25.174927 ,
-22.801147 , -21.335037 , -20.32814 , -20.090027 , -19.014496 ,
-18.656036 , -18.33432 , -18.312744 , -18.867462 , -19.123322 ,
-19.530334 , -19.808167 , -20.566345 , -19.852066 , -18.34375 ,
-21.006226 , -22.135834 , -24.501404 , -25.845093 , -27.237137 ,
-27.84909 , -27.634018 , -26.053299 , -25.37236 , -24.274063 ,
-24.02945 , -23.893768 , -23.797607 , -24.200073 , -25.256744 ,
-26.08464 , -26.45224 , -27.006073 , -27.184387 , -27.360825 ,
-28.49202 , -29.554688 , -30.18161 , -28.604752 , -28.504517 ,
-27.216522 , -27.655838 , -27.569717 , -30.631851 , -30.50978 ,
-29.884338 , -30.630112 , -30.294113 , -29.409927 , -29.023758 ,
-28.70839 , -28.579788 , -29.44629 , -29.007843 , -27.992172 ,
-28.955826 , -28.396881 , -29.64447 , -30.68576 , -29.868164 ,
-26.324219 , -24.067413 , -22.775467 , -23.02095 , -22.389984 ,
-23.037903 , -23.350128 , -24.453735 , -24.55095 , -24.663025 ,
-24.21341 , -24.628922 , -25.36383 , -29.119385 , -28.724533 ,
-25.916534 , -24.26091 , -24.387695 , -25.535263 , -27.925217 ,
-28.968918 , -29.526382 , -30.571579 , -30.792603 , -30.929962 ,
-30.75 , -30.460037 , -30.713226 , -30.992676 , -30.997833 ,
-31.066498 , -28.498413 , -25.762604 , -24.499802 , -23.450424 ,
-22.92743 , -22.535599 , -22.30931 , -21.966705 , -22.258926 ,
-22.057129 , -22.19011 , -22.510681 , -22.884918 , -23.304062 ,
-23.495773 , -23.606567 , -23.762451 , -23.091904 , -23.052628 ,
-23.139679 , -23.091766 , -23.003769 , -23.120255 , -22.884995 ,
-22.91307 , -22.882202 , -22.764969 , -22.304047 , -22.441101 ,
-21.888702 , -22.027435 , -22.827454 , -23.215454 , -22.14981 ,
-21.637299 , -21.417328 , -21.410492 , -21.087234 , -21.2892 ,
-24.152466 , -25.597122 , -23.37622 , -25.003525 , -26.941193 ,
-27.99472 , -29.55513 , -28.690567 , -29.73584 , -30.543549 ,
-31.182251 , -30.239868 , -30.199387 , -30.147125 , -29.973907 ,
-29.152802 , -28.82785 , -28.714828 , -28.56578 , -28.102356 ,
-27.78598 , -27.907959 , -26.756393 , -25.438782 , -23.034515 ,
-22.078812 , -21.082474 , -20.860596 , -20.186462 , -20.442932 ,
-19.857346 , -19.891113 , -18.790176 , -17.843277 , -17.215332 ,
-16.122345 , -16.008514 , -16.461517 , -17.011932 , -18.471573 ,
-19.928268 , -21.855621 , -23.028519 , -24.142868 , -24.31601 ,
-25.166443 , -24.833649 , -25.89003 , -26.81691 , -27.513916 ,
-27.969193 , -28.218216 , -26.56868 , -25.843903 , -24.785019 ,
-24.69136 , -24.421722 , -24.548645 , -24.300842 , -24.445587 ,
-24.653534 , -25.154327 , -25.782394 , -29.256134 , -31.366928 ,
-32.069214 , -31.644287 , -31.165802 , -30.421448 , -28.939972 ,
-26.889282 , -26.629623 , -24.763763 , -24.12938 , -23.443787 ,
-22.235413 , -21.286896 , -20.552994 , -21.059296 , -22.329514 ,
-22.56543 , -22.992813 , -22.750381 , -23.367355 , -23.765656 ,
-23.619095 , -23.77452 , -25.652878 , -28.085388 , -28.053345 ,
-24.4012 , -23.27214 , -22.236664 , -22.313324 , -22.345596 ,
-22.788803 , -22.694275 , -22.397995 , -20.139328 , -19.77629 ,
-18.892273 , -19.037079 , -19.142685 , -19.934753 , -19.736954 ,
-20.600937 , -21.314468 , -21.114014 , -21.541702 , -21.557663 ,
-21.075119 , -21.647797 , -22.544678 , -22.597672 , -23.998398 ,
-25.892075 , -27.64206 , -28.01155 , -28.710999 , -30.233063 ,
-30.732376 , -30.65538 , -30.358414 , -30.567978 , -28.30066 ,
-24.61174 , -23.48262 , -23.120346 , -22.473541 , -22.937866 ,
-22.869202 , -23.101883 , -22.697525 , -22.47847 , -22.861786 ,
-24.58139 , -26.317322 , -26.460815 , -24.416367 , -22.85318 ,
-24.899445 , -25.288467 , -27.482422 , -31.53064 , -32.615097 ,
-31.33786 , -31.542831 , -32.125366 , -30.1053 , -27.458801 ,
-27.96733 , -28.320557 , -29.414047 , -29.701645 , -30.007126 ,
-30.251907 , -31.723068 , -31.082932 , -33.120483 , -34.22223 ,
-34.603516 , -34.738205 , -33.91574 , -33.86606 , -34.095886 ,
-32.517807 , -32.65146 , -30.152786 , -28.534409 , -27.475998 ,
-26.156952 , -25.499695 , -27.151672 , -24.961609 , -24.06836 ,
-24.23845 , -24.484772 , -24.841751 , -24.781387 , -26.522827 ,
-27.95723 , -28.119919 , -28.081818 , -29.066391 , -27.479263 ,
-26.937897 , -27.716187 , -28.326889 , -26.515274 , -26.329483 ,
-25.697235 , -25.886505 , -25.807343 , -25.990204 , -25.288193 ,
-25.085754 , -24.947067 , -24.516388 , -24.075897 , -24.311798 ,
-24.652664 , -24.867401 , -25.174896 , -25.498001 , -25.814758 ,
-24.698608 , -24.08107 , -25.033005 , -24.264801 , -24.205963 ,
-23.332916 , -23.031738 , -22.224075 , -21.425903 , -20.529556 ,
-20.591873 , -20.736252 , -20.631866 , -22.365433 , -24.274597 ,
-24.68509 , -23.109085 , -22.182663 , -22.154587 , -22.593597 ,
-22.449814 , -22.943527 , -23.566284 , -22.110062 , -20.648743 ,
-20.14357 , -19.773956 , -19.791626 , -19.817581 , -19.584122 ,
-19.382904 , -19.696075 , -19.038757 , -20.176804 , -21.547012 ,
-23.678772 , -25.396011 , -27.070496 , -28.253357 , -27.779785 ,
-25.577118 , -25.259216 , -26.042984 , -28.683655 , -30.060562 ,
-30.522995 , -31.110916 , -30.452148 , -28.249252 , -28.871002 ,
-30.133163 , -30.17183 , -29.098495 , -28.961777 , -26.251526 ,
-21.490768 , -16.297882 , -12.163116 , -9.480072 , -7.9664 ,
-7.4937134, -6.978485 , -7.0210876, -7.1669006, -8.823486 ,
-16.56137 , -20.950134 , -22.349564 , -23.6819 , -23.662216 ,
-25.077988 , -24.94011 , -22.697327 , -23.575699 , -25.591263 ,
-25.702133 , -25.645813 , -25.136978 , -23.754333 , -21.871902 ,
-19.539062 , -18.932098 , -18.59349 , -18.323685 , -18.290253 ,
-17.921967 , -18.129883 , -17.944199 , -17.282715 , -17.48265 ,
-17.883896 , -18.607956 , -19.586166 , -20.137115 , -20.627487 ,
-18.382889 , -16.06015 , -15.256409 , -15.630157 , -14.454559 ,
-14.310272 , -14.125763 , -15.2005005, -16.877472 , -14.247833 ,
-11.060699 , -9.455933 , -8.198761 , -7.917389 , -8.124695 ,
-8.517944 , -7.0194397], dtype=float32)
#compare to old method which used python=2.7, EPIC formatted NetCDF data and a Great Circle calculator for nearest point
ods = xa.load_dataset(netcdf_dir['datapath']+'NARR_C2_2022.cf.nc')
fig, ax = plt.subplots(2,figsize=(12, 4), sharex=True)
(nearest_point.air-273.15).plot(ax=ax[0])
ods.AT_21.plot(ax=ax[0])
((nearest_point.air-273.15)-ods.AT_21).plot(ax=ax[1])
assert ((nearest_point.air-273.15).isel(x=0,y=0).values == ods.AT_21.isel(lat=0,lon=0,depth=0).values).all(), 'Methods not equivalent'
# often a triangle filter was applied to the 3hr data... you can do the same with the following numpy call
fig, ax = plt.subplots(1,figsize=(12, 4), sharex=True)
ods.AT_21.plot(ax=ax)
ax.plot(ods.time,np.convolve(ods.AT_21[:,0,0,0],[0.25, 0.5, 0.25],'same')) #triangle filter
start_date = '2022-01-01'
end_date = '2022-02-26'
lat = 56.867
lon = -164.07
site = 'M2'
## from erddap (SST and ice concentration)
# Could also use this to retrieve bathymetry from erddap hosted files
def NearestFromErddap(server=None,dataset=None,variablename=None,latitude=0,longitude=0,startdate='',enddate=''):
''' Given an erddap url and a datasetID, provide the latitude and longitude to get the nearest datavalue within a time window.
You will also need to specify the name of the parameter you are interested in, some examples are below.
date format : yyyy-mm-dd
latitude : -90 to 90
longitude: try -180:180 or 0:360
NOAA - HROISST: https://coastwatch.pfeg.noaa.gov :: ncdcOisst21Agg_LonPM180 or ncdcOisst21Agg
sst: err: ice (also all as a 4d function :'( )
JPL - MUR: https://coastwatch.pfeg.noaa.gov :: jplMURSST41 or jplMURSST41_Lon0360
analysed_sst; sea_ice_fraction; analysis_error
UKMET - OSTIA: https://upwell.pfeg.noaa.gov :: nasa_jpl_2fa2_cbe7_cd13.html
OSISAF - EUMET: polarwatch.noaa.gov :: eumetsat_osisaf_ice_conc_nh_ps
(this outputs and x/y dataset that needs to be merged on to the NH PolarStereographic Grid)
'''
try:
if dataset in ['ncdcOisst21Agg_LonPM180','ncdcOisst21Agg']:
url=f"{server}/erddap/griddap/{dataset}.csvp?{variablename}[({startdate}):1:({enddate})][(0.0):1:(0.0)][({latitude}):1:({latitude})][({longitude}):1:({longitude})]"
return pd.read_csv(url,skiprows=0,parse_dates=True,index_col='time (UTC)')
else:
url=f"{server}/erddap/griddap/{dataset}.csvp?{variablename}[({startdate}):1:({enddate})][({latitude}):1:({latitude})][({longitude}):1:({longitude})]"
return pd.read_csv(url,skiprows=0,parse_dates=True,index_col='time (UTC)')
except:
return url
jpl_sst = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','jplMURSST41','analysed_sst',lat,lon,f'{start_date}',f'{end_date}')
ukmet_sst = NearestFromErddap('https://upwell.pfeg.noaa.gov','nasa_jpl_2fa2_cbe7_cd13','analysed_sst',lat,lon,f'{start_date}',f'{end_date}')
noaaoi_sst = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','ncdcOisst21Agg_LonPM180','sst',lat,lon,f'{start_date}',f'{end_date}')
fig, ax = plt.subplots(1,figsize=(16, 2))
jpl_sst['analysed_sst (degree_C)'].plot(ax=ax,label='MUR')
ukmet_sst['analysed_sst (degree_C)'].plot(ax=ax,label='UKMet')
noaaoi_sst['sst (degree_C)'].plot(ax=ax,label='NOAA')
ax.set_ylim([-2,5])
ax.legend()
We can use the same code above to retrieve “ice concentration” information at the same location as well… some even from the same datasource
jpl_ice = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','jplMURSST41','sea_ice_fraction',lat,lon,f'{year}-01-01',f'{end_date}')
ukmet_ice = NearestFromErddap('https://upwell.pfeg.noaa.gov','nasa_jpl_2fa2_cbe7_cd13','sea_ice_fraction',lat,lon,f'{year}-01-01',f'{end_date}')
noaaoi_ice = NearestFromErddap('https://coastwatch.pfeg.noaa.gov','ncdcOisst21Agg_LonPM180','ice',lat,lon,f'{year}-01-01',f'{end_date}')
fig, ax = plt.subplots(1,figsize=(16, 2))
jpl_ice['sea_ice_fraction (1)'].plot(ax=ax,label='MUR')
ukmet_ice['sea_ice_fraction (1)'].plot(ax=ax,label='UKMet')
noaaoi_ice['ice (1)'].plot(ax=ax,label='NOAA')
ax.set_ylim([-.01,1.01])
ax.legend()
Todo:
Show that the gridded datapoint is near the initial datapoint and include some simple bathymetry or coastlines
Ouput to csv is straight forward (netcdf -> pandas -> csv)… any other specific formats need descriptions
Tether into Database for MooringID’s or CTD Locations?