Extracting data from a netCDF file by station
February 20, 2019

I have read other solutions for NetCDF data, but my data is a bit different, and I don't know how to extract it from the NetCDF file and save it to CSV files by station. The data contains maximum temperature values for stations. I only need the stations located at latitudes from 25.74 to 49.05 and longitudes from -93.44 to -116.0. The time format is unusual, and I only need time[7518:43947190], which includes the data for 1948. I want to create several CSV files, one per station, each containing the time, tmax, and the data quality flag. I would really appreciate any help.

from netCDF4 import Dataset

# Open the file (read-only is the default mode)
dataset = Dataset("D:/ushcn_tmax.nc")

#### Print dimensions ####
print(dataset.file_format)
print(dataset.dimensions.keys())
print(dataset.dimensions['name_strlen'])
print(dataset.dimensions['obs'])
print(dataset.dimensions['station'])

#### Print variables ####
print(dataset.variables.keys())
print(dataset.variables['LON'])
print(dataset.variables['LAT'])
print(dataset.variables['ELEVATION'])
print(dataset.variables['STATION_NAME'])
print(dataset.variables['STATION_INDEX'])
print(dataset.variables['TIME'])
print(dataset.variables['TMAX'])
print(dataset.variables['TMAX_MFLAG'])
print(dataset.variables['TMAX_QFLAG'])
print(dataset.variables['TMAX_SFLAG'])

The dimensions and variables of my data are shown here:

NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50

<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189

<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218

[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
    standard_name: longitude
    long_name: station longitude
    units: degrees_east
unlimited dimensions: 
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
    standard_name: latitude
    long_name: station latitude
    units: degrees_north
unlimited dimensions: 
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
    long_name: elevation above the sea level
    standard_name: elevation
    units: m
    positive: up
    axis: Z
unlimited dimensions: 
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
    long_name: USHCN station name
    cf_role: timeseries_id
unlimited dimensions: 
current shape = (1218, 50)
filling off

<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
    long_name: which station this obs is for
    instance_dimension: station
unlimited dimensions: 
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
    standard_name: time
    long_name: Time
    units: decimal day
    _FillValue: -9999.0
    comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions: 
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
    standard_name: TMAX
    long_name: maximum temperature
    units: degrees F
    coordinates: time lat lon elevation
    _FillValue: -9999
unlimited dimensions: 
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
    standard_name: TMAX_MFLAG
    long_mane: measurement flag for TMAX
    flag_values:  BDLT
    flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions: 
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
    standard_name: TMAX_QFLAG
    long_mane: quality flag for TMAX
    flag_values:  ADGIKMNORSTWX
    flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions: 
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
    standard_name: TMAX_SFLAG
    long_mane: source flag for TMAX
    flag_values:  0126ABFGHIMQRSX
    flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions: 
current shape = (43947189,)
filling off
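
The TIME variable in the dump above is described as year + day_of_year/days_in_year, so it is a decimal year rather than a standard "days since" time axis. A minimal sketch of turning such a value into a calendar date is given below. The function names are hypothetical, and whether January 1 is stored as day 1 or day 0 is an assumption the file does not settle, so it is exposed as a parameter. Fill values (-9999.0) would need to be filtered out before calling it.

```python
import calendar
import math
from datetime import date, timedelta


def days_in_year(year):
    """Number of days in a Gregorian calendar year."""
    return 366 if calendar.isleap(year) else 365


def decimal_year_to_date(t, one_based=True):
    """Convert year + day_of_year/days_in_year to a calendar date.

    one_based=True assumes January 1 is day 1 (an assumption; the file
    does not say). Pass one_based=False if January 1 is stored as day 0.
    """
    year = int(math.floor(t))
    # Round to the nearest whole day to absorb floating-point error.
    doy = int(round((t - year) * days_in_year(year)))
    if not one_based:
        return date(year, 1, 1) + timedelta(days=doy)
    if doy == 0:
        # With 1-based days, December 31 is year + n/n, which is the
        # next whole number, so it belongs to the previous year.
        year -= 1
        doy = days_in_year(year)
    return date(year, 1, 1) + timedelta(days=doy - 1)
```

Comparing a few converted values against a station record with known dates should show which of the two day conventions the file actually uses.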

I tried reading the data with:

import xarray as xr

dataset = xr.open_dataset("D:/ushcn_tmax.nc")
df = dataset.sel(lon=-99.30, lat=32.73, method='nearest')  # raises KeyError: 'lat'

The latitude and longitude given there belong to a single station, and I got the error "KeyError: 'lat'". Is there a way to convert the variables (latitude, longitude, and time) into dimensions so that they are easier to work with? Or can I somehow extract the data using station as a dimension? Any help is greatly appreciated.
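
One possible approach, sketched under assumptions: the dump shows LAT and LON as plain variables on the station dimension (not dimensions or coordinates named lat and lon, which is a likely reason for the KeyError), and every observation is tied to its station through STATION_INDEX. The sketch below selects the stations inside the bounding box and writes one CSV per station. It assumes STATION_INDEX is a zero-based index into the station arrays, which the file itself does not state. The function names and the output file naming are hypothetical. Loading all 43.9 million observations and sorting them needs a few gigabytes of memory.

```python
import csv
import os

import numpy as np


def stations_in_box(lat, lon, lat_range=(25.74, 49.05), lon_range=(-116.0, -93.44)):
    """Return the indices of stations whose coordinates fall inside the box."""
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    inside = ((lat >= lat_range[0]) & (lat <= lat_range[1]) &
              (lon >= lon_range[0]) & (lon <= lon_range[1]))
    return np.flatnonzero(inside)


def write_station_csvs(station_ids, station_index, time, tmax, qflag, out_dir):
    """Write one CSV (time, tmax, qflag) per selected station; return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    station_index = np.asarray(station_index)
    time, tmax, qflag = np.asarray(time), np.asarray(tmax), np.asarray(qflag)
    # Sort once so that each station's observations form one contiguous run.
    order = np.argsort(station_index, kind="stable")
    sorted_index = station_index[order]
    paths = []
    for sid in station_ids:
        lo = np.searchsorted(sorted_index, sid, side="left")
        hi = np.searchsorted(sorted_index, sid, side="right")
        rows = order[lo:hi]
        flags = qflag[rows]
        if flags.dtype.kind == "S":
            flags = flags.astype("U1")  # bytes -> text so the CSV has no b'...'
        path = os.path.join(out_dir, "station_%04d.csv" % sid)  # hypothetical naming
        with open(path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["time", "tmax", "qflag"])
            writer.writerows(zip(time[rows].tolist(), tmax[rows].tolist(), flags.tolist()))
        paths.append(path)
    return paths


def export_from_file(nc_path, out_dir, obs=slice(None)):
    """Read the variables named in the dump and export the selected stations."""
    from netCDF4 import Dataset  # imported here so the helpers need only numpy

    with Dataset(nc_path) as ds:
        ds.set_auto_mask(False)  # keep raw values; -9999 still marks missing data
        ids = stations_in_box(ds.variables["LAT"][:], ds.variables["LON"][:])
        return write_station_csvs(
            ids,
            ds.variables["STATION_INDEX"][obs],
            ds.variables["TIME"][obs],
            ds.variables["TMAX"][obs],
            ds.variables["TMAX_QFLAG"][obs],
            out_dir,
        )
```

With the file from the question, a call such as export_from_file("D:/ushcn_tmax.nc", "D:/stations", obs=slice(7518, 43947190)) would restrict the export to the observation range mentioned above. Sorting STATION_INDEX once avoids scanning the full obs array separately for each of the 1218 stations.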

...