Horel Group
John Horel group - MesoWest, TRAX/eBUS, etc.
This module contains classes and functions for working with the Horel group data in the CHPC UATAQ filesystem.
Module Attributes
HOREL_DIR | Horel group directory
UUTRAX_DIR | UUTRAX directory
UUTRAX_PILOT_DIR | UUTRAX pilot directory
PILOT_PHASE | Pilot phase time ranges for UUTRAX data
lvl_data_dirs | Data levels for UUTRAX data
column_mapping | Horel to UATAQ column mapping
Classes
HorelCSVFile(path, instrument) | Class for parsing CSV files from the Horel group.
HorelCSVFinalizedFile(path, instrument) | Class for parsing finalized CSV files from the Horel group.
HorelFile(path, instrument) | Abstract base class for Horel data files.
HorelGroup() | A class representing the Horel group data space in the CHPC UATAQ filesystem.
HorelH5File(path, instrument) | Class for parsing H5 files from the Horel group.
- lair.uataq.filesystem.groupspaces.horel.HOREL_DIR: str = '/uufs/chpc.utah.edu/common/home/horel-group'
Horel group directory
- lair.uataq.filesystem.groupspaces.horel.UUTRAX_DIR: str = '/uufs/chpc.utah.edu/common/home/horel-group/uutrax'
UUTRAX directory
- lair.uataq.filesystem.groupspaces.horel.UUTRAX_PILOT_DIR: str = '/uufs/chpc.utah.edu/common/home/horel-group/uutrax_pilot'
UUTRAX pilot directory
- lair.uataq.filesystem.groupspaces.horel.PILOT_PHASE: dict[str, TimeRange] = {'TRX01': TimeRange(start=2014-11-11 00:00:00, stop=2018-11-19 20:03:58), 'TRX02': TimeRange(start=2016-02-04 00:00:00, stop=2018-11-19 18:53:52)}
Pilot phase time ranges for UUTRAX data
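The pilot-phase ranges above can be used to decide whether a timestamp belongs to the pilot deployment of a site. The following is a minimal sketch, not the library's implementation: the `TimeRange` class here is a hypothetical stand-in that only models `start` and `stop`, the helper `in_pilot_phase` is invented for illustration, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime
from typing import NamedTuple


class TimeRange(NamedTuple):
    """Hypothetical stand-in for lair's TimeRange; only start/stop are modeled."""
    start: datetime
    stop: datetime


# Values copied from the PILOT_PHASE attribute documented above.
PILOT_PHASE = {
    "TRX01": TimeRange(datetime(2014, 11, 11), datetime(2018, 11, 19, 20, 3, 58)),
    "TRX02": TimeRange(datetime(2016, 2, 4), datetime(2018, 11, 19, 18, 53, 52)),
}


def in_pilot_phase(sid: str, when: datetime) -> bool:
    """Return True if `when` falls inside the pilot phase recorded for `sid`."""
    phase = PILOT_PHASE.get(sid)
    if phase is None:
        return False  # site has no recorded pilot phase
    # Inclusive bounds are an assumption of this sketch.
    return phase.start <= when <= phase.stop


print(in_pilot_phase("TRX01", datetime(2015, 6, 1)))
print(in_pilot_phase("TRX02", datetime(2015, 6, 1)))
```

Note that the two sites entered the pilot phase at different times, so the same timestamp can be inside the range for one site and outside it for the other.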
- lair.uataq.filesystem.groupspaces.horel.lvl_data_dirs: dict[str, list] = {'final': ['/uufs/chpc.utah.edu/common/home/horel-group/uutrax'], 'qaqc': ['/uufs/chpc.utah.edu/common/home/horel-group/uutrax'], 'raw': ['/uufs/chpc.utah.edu/common/home/horel-group/uutrax_pilot', '/uufs/chpc.utah.edu/common/home/horel-group/uutrax']}
Data levels for UUTRAX data
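As the value above shows, the `raw` level spans both the pilot and the current UUTRAX directories, while `qaqc` and `final` live only under the current one. A small sketch of a lookup over a copy of that dictionary follows; the helper `dirs_for_level` and its error handling are hypothetical and not part of the module.

```python
HOREL_DIR = "/uufs/chpc.utah.edu/common/home/horel-group"
UUTRAX_DIR = f"{HOREL_DIR}/uutrax"
UUTRAX_PILOT_DIR = f"{HOREL_DIR}/uutrax_pilot"

# Copy of the lvl_data_dirs value documented above.
lvl_data_dirs = {
    "final": [UUTRAX_DIR],
    "qaqc": [UUTRAX_DIR],
    "raw": [UUTRAX_PILOT_DIR, UUTRAX_DIR],
}


def dirs_for_level(lvl: str) -> list:
    """Return the directories to search for a data level (hypothetical helper)."""
    try:
        return lvl_data_dirs[lvl]
    except KeyError:
        valid = ", ".join(sorted(lvl_data_dirs))
        raise ValueError(f"Unknown level {lvl!r}; expected one of: {valid}") from None


for level in ("raw", "qaqc", "final"):
    print(level, dirs_for_level(level))
```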
- lair.uataq.filesystem.groupspaces.horel.column_mapping: dict[str, dict[str, str]] = {'2b_205': {'2B_Air_Flow_Rate': 'Flow_Lpm', '2B_Internal_Air_Pressure': 'Internal_P_hPa', '2B_Internal_Air_Temperature': 'Internal_T_C', '2B_Ozone_Concentration': 'O3_ppb', 'FL2B': 'Flow_Lpm', 'OZNE': 'O3_ppb', 'Ozone_Data_Flagged': 'QAQC_Flag', 'PS2B': 'Internal_P_hPa', 'TC2B': 'Internal_T_C'}, '2b_405': {'2B405_Air_Flow_Rate': 'Flow_Lpm', '2B405_Cell_O3_Flow_Rate': 'O3_Flow_mLpm', '2B405_Internal_Air_Pressure': 'Internal_P_hPa', '2B405_Internal_Air_Temperature': 'Internal_T_C', '2B405_NO2_Concentration': 'NO2_ppb', '2B405_NOX_Concentration': 'NOx_ppb', '2B405_NO_Concentration': 'NO_ppb', 'FLNO': 'Flow_Lpm', 'FO3N': 'O3_Flow_mLpm', 'NO1C': 'NO_ppb', 'NO2C': 'NO2_ppb', 'NOXC': 'NOx_ppb', 'PSNO': 'Internal_P_hPa', 'TCNO': 'Internal_T_C'}, 'cr1000': {'Battery_Voltage': 'Battery_Voltage_V', 'Bus_Box_Temperature': 'Logger_T_C', 'Bus_Top_Relative_Humidity': 'Ambient_RH_pct', 'Bus_Top_Temperature': 'Ambient_T_C', 'TICC': 'Logger_T_C', 'TRNR': 'Ambient_RH_pct', 'TRNT': 'Ambient_T_C', 'Train_Box_Temperature': 'Logger_T_C', 'Train_Top_Relative_Humidity': 'Ambient_RH_pct', 'Train_Top_Temperature': 'Ambient_T_C', 'VOLT': 'Battery_Voltage_V'}, 'gps': {'Elevation': 'Altitude_msl', 'GELV': 'Altitude_msl', 'GLAT': 'Latitude_deg', 'GLON': 'Longitude_deg', 'GPS_Data_Flagged': 'QAQC_Flag', 'GPS_Direction': 'Course_deg', 'GPS_RMC_Valid': 'Status', 'GPS_Speed': 'Speed_kt', 'GTIM': 'Instrument_Time', 'Latitude': 'Latitude_deg', 'Longitude': 'Longitude_deg', 'NSAT': 'N_Satellites', 'RDIR': 'Course_deg', 'RSPD': 'Speed_kt', 'RSTS': 'Status'}, 'metone_es405': {'ERRR': 'Status', 'ES405_Air_Flow_Rate': 'Flow_Lpm', 'ES405_Error_Code': 'Status', 'ES405_Internal_Air_Pressure': 'Internal_P_hPa', 'ES405_Internal_Air_Temperature': 'Internal_T_C', 'ES405_Internal_Relative_Humidity': 'Internal_RH_pct', 'ES405_PM10_Concentration': 'PM10_ugm3', 'ES405_PM1_Concentration': 'PM1_ugm3', 'ES405_PM2.5_Concentration': 'PM2.5_ugm3', 'ES405_PM4_Concentration': 'PM4_ugm3', 'FLOW': 'Flow_Lpm', 'INRH': 'Internal_RH_pct', 'ITMP': 'Internal_T_F', 'PM01': 'PM1_ugm3', 'PM04': 'PM4_ugm3', 'PM10': 'PM10_ugm3', 'PM2.5_Data_Flagged': 'QAQC_Flag', 'PM25': 'PM2.5_ugm3', 'PRES': 'Internal_P_hpa'}, 'metone_es642': {'ERRR': 'Status', 'ES642_Air_Flow_Rate': 'Flow_Lpm', 'ES642_Error_Code': 'Status', 'ES642_Internal_Air_Pressure': 'Ambient_P_hPa', 'ES642_Internal_Air_Temperature': 'Ambient_T_C', 'ES642_Internal_Relative_Humidity': 'Internal_RH_pct', 'ES642_PM2.5_Concentration': 'PM2.5_ugm3', 'FLOW': 'Flow_Lpm', 'INRH': 'Internal_RH_pct', 'ITMP': 'Ambient_T_F', 'PM2.5_Data_Flagged': 'QAQC_Flag', 'PM25': 'PM2.5_ugm3', 'PRES': 'Ambient_P_hpa'}}
Horel to UATAQ column mapping
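The mapping is keyed first by instrument and then by Horel column name. Both the short four-character codes (for example `OZNE`) and the long descriptive names (for example `2B_Ozone_Concentration`) resolve to the same UATAQ name. The sketch below illustrates that lookup on a small excerpt of the `2b_205` entry; the helper `rename_columns` is hypothetical, and passing unmapped names through unchanged is an assumption rather than documented behavior.

```python
# Excerpt of column_mapping['2b_205'] copied from the value above.
COLUMN_MAPPING_EXCERPT = {
    "2b_205": {
        "2B_Ozone_Concentration": "O3_ppb",
        "OZNE": "O3_ppb",
        "2B_Air_Flow_Rate": "Flow_Lpm",
        "FL2B": "Flow_Lpm",
        "Ozone_Data_Flagged": "QAQC_Flag",
    }
}


def rename_columns(instrument: str, columns: list) -> list:
    """Translate Horel column names to UATAQ names (hypothetical helper)."""
    mapping = COLUMN_MAPPING_EXCERPT[instrument]
    # Names without a mapping are passed through unchanged (an assumption).
    return [mapping.get(col, col) for col in columns]


short_names = rename_columns("2b_205", ["Time", "OZNE", "FL2B"])
long_names = rename_columns("2b_205", ["Time", "2B_Ozone_Concentration", "2B_Air_Flow_Rate"])
print(short_names)
print(long_names)
```

Both calls produce the same UATAQ column names, which is what lets older and newer Horel file formats be combined downstream.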
- class lair.uataq.filesystem.groupspaces.horel.HorelFile(path: str, instrument: str)
Abstract base class for Horel data files.
Attributes
path
(str) The file path.
period
(pd.Period) The period of the data file.
logger
(str) The logger name.
date_slicer
(slice) A slice object to extract the date from the file name.
file_freq
(str) The file frequency.
ext
(str) The file extension.
time_col
(str) The time column name.
instrument
(str) The instrument name.
Methods
usecols(col)
Check if a column should be used based on the instrument.
convert_nodata(data, nodata=-9999.0)
Convert NoData values to NaN.
coerce_numeric(data, exclude='Time_UTC')
Coerce columns to numeric.
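To illustrate what the two cleaning methods listed above are for, here is a plain-Python sketch that works on a single record (a `dict`) instead of a DataFrame. It is not the library's code: the real methods operate on pandas objects, and the exact coercion rules shown here (unparseable values become NaN, the excluded column is left untouched) are assumptions.

```python
import math

NODATA = -9999.0  # default NoData value from the convert_nodata signature


def coerce_numeric(row: dict, exclude=("Time_UTC",)) -> dict:
    """Convert every value except the excluded keys to float; bad values become NaN."""
    out = {}
    for key, value in row.items():
        if key in exclude:
            out[key] = value  # leave the time column as-is
            continue
        try:
            out[key] = float(value)
        except (TypeError, ValueError):
            out[key] = math.nan
    return out


def convert_nodata(row: dict, nodata: float = NODATA) -> dict:
    """Replace the NoData sentinel with NaN."""
    return {k: (math.nan if v == nodata else v) for k, v in row.items()}


row = {
    "Time_UTC": "2019-01-01 00:00:00",
    "O3_ppb": "41.2",
    "Flow_Lpm": "-9999",
    "Status": "bad",
}
cleaned = convert_nodata(coerce_numeric(row))
print(cleaned)
```

Coercing before replacing the sentinel matters in this sketch, because the sentinel arrives as the string `"-9999"` and only compares equal to `-9999.0` once it has been converted to a float.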
- __init__(path: str, instrument: str)
Initialize a HorelFile subclass object.
The instrument parameter is used to filter columns based on the instrument name.
- Parameters:
- path : str
The file path.
- instrument : str
The instrument name, used to filter columns.
- usecols(col: str) -> bool
Check if a column should be used based on the instrument.
- Parameters:
- col : str
The column name.
- Returns:
- bool
True if the column should be used, False otherwise.
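A per-column predicate like this is the kind of function that can be handed to a reader to drop unwanted columns while parsing. The sketch below shows the idea with the standard `csv` module. The selection rule (keep the time column plus any column that appears as a key in the instrument's `column_mapping` entry), the time column name `Timestamp`, and the extra `instrument` argument are all assumptions made for illustration; the actual rule used by `HorelFile.usecols` is not documented here.

```python
import csv
import io

# Short-code keys copied from column_mapping['metone_es642'] above.
WANTED = {"metone_es642": {"ERRR", "FLOW", "INRH", "ITMP", "PM25", "PRES"}}


def usecols(col: str, instrument: str, time_col: str = "Timestamp") -> bool:
    """Return True if `col` should be kept for `instrument` (assumed rule)."""
    return col == time_col or col in WANTED[instrument]


# A file that mixes columns from several instruments; OZNE belongs to the 2b_205.
text = "Timestamp,PM25,OZNE,FLOW\n2019-01-01 00:00:00,5.1,30.2,2.0\n"

reader = csv.DictReader(io.StringIO(text))
rows = [
    {key: value for key, value in row.items() if usecols(key, "metone_es642")}
    for row in reader
]
print(rows)
```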
- format_time(data: DataFrame, **kwargs) -> DataFrame
Format the time column in the data DataFrame.
- Parameters:
- data : pd.DataFrame
The data DataFrame.
- **kwargs : dict
Additional keyword arguments to pass to pd.to_datetime.
- Returns:
- pd.DataFrame
The data DataFrame with the time column formatted as Time_UTC.
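A rough sketch of this behavior with pandas is shown below. It is written as a free function with an explicit `time_col` argument, whereas the documented method reads the column name from the instance; the column name `Timestamp` and the sample values are hypothetical. Keyword arguments are forwarded to `pd.to_datetime`, as described above.

```python
import pandas as pd


def format_time(data: pd.DataFrame, time_col: str, **kwargs) -> pd.DataFrame:
    """Rename `time_col` to Time_UTC and parse it as datetimes (sketch)."""
    data = data.rename(columns={time_col: "Time_UTC"})  # returns a new frame
    data["Time_UTC"] = pd.to_datetime(data["Time_UTC"], **kwargs)
    return data


raw = pd.DataFrame(
    {
        "Timestamp": ["2019-01-01 00:00:00", "not a time"],
        "PM25": [5.1, 6.3],
    }
)

# errors="coerce" is a standard pd.to_datetime option: bad values become NaT.
out = format_time(raw, "Timestamp", errors="coerce")
print(out)
```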
- class lair.uataq.filesystem.groupspaces.horel.HorelH5File(path: str, instrument: str)
Class for parsing H5 files from the Horel group.
Attributes
path
(str) The file path.
period
(pd.Period) The period of the data file.
logger
(str) The logger name.
date_slicer
(slice) A slice object to extract the date from the file name.
file_freq
(str) The file frequency.
ext
(str) The file extension.
time_col
(str) The time column name.
instrument
(str) The instrument name.
Methods
usecols(col)
Check if a column should be used based on the instrument.
convert_nodata(data, nodata=-9999.0)
Convert NoData values to NaN.
coerce_numeric(data, exclude='Time_UTC')
Coerce columns to numeric.
- class lair.uataq.filesystem.groupspaces.horel.HorelCSVFile(path: str, instrument: str)
Class for parsing CSV files from the Horel group.
Attributes
path
(str) The file path.
period
(pd.Period) The period of the data file.
logger
(str) The logger name.
date_slicer
(slice) A slice object to extract the date from the file name.
file_freq
(str) The file frequency.
ext
(str) The file extension.
time_col
(str) The time column name.
instrument
(str) The instrument name.
Methods
usecols(col)
Check if a column should be used based on the instrument.
convert_nodata(data, nodata=-9999.0)
Convert NoData values to NaN.
coerce_numeric(data, exclude='Time_UTC')
Coerce columns to numeric.
parse()
Parse the CSV file and return a DataFrame.
- class lair.uataq.filesystem.groupspaces.horel.HorelCSVFinalizedFile(path: str, instrument: str)
Class for parsing finalized CSV files from the Horel group.
Attributes
path
(str) The file path.
period
(pd.Period) The period of the data file.
logger
(str) The logger name.
date_slicer
(slice) A slice object to extract the date from the file name.
file_freq
(str) The file frequency.
ext
(str) The file extension.
time_col
(str) The time column name.
final_patterns
(list[str]) A list of patterns to filter columns.
instrument
(str) The instrument name.
Methods
usecols(col)
Check if a column should be used based on the instrument.
convert_nodata(data, nodata=-9999.0)
Convert NoData values to NaN.
coerce_numeric(data, exclude='Time_UTC')
Coerce columns to numeric.
parse()
Parse the CSV file, finalize the data, and return a DataFrame.
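The `final_patterns` attribute is described as a list of patterns used to filter columns. One way such a filter could work is sketched below with regular expressions. The patterns themselves are hypothetical (the real list is not shown in this page), as is the helper `matches_final`; the column names are taken from the long-form keys of `column_mapping`.

```python
import re

# Hypothetical patterns; the actual final_patterns values are not documented here.
FINAL_PATTERNS = [r"_Concentration$", r"_Data_Flagged$", r"^(Latitude|Longitude)$"]


def matches_final(col: str, patterns=FINAL_PATTERNS) -> bool:
    """Return True if the column name matches any of the patterns."""
    return any(re.search(pattern, col) for pattern in patterns)


columns = [
    "ES642_PM2.5_Concentration",
    "ES642_Air_Flow_Rate",
    "PM2.5_Data_Flagged",
    "Latitude",
    "Longitude",
    "GPS_Speed",
]
kept = [col for col in columns if matches_final(col)]
print(kept)
```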
- class lair.uataq.filesystem.groupspaces.horel.HorelGroup
A class representing the Horel group data space in the CHPC UATAQ filesystem.
Attributes
name
(str) The group name.
datafiles
(dict[str, Type[DataFile]]) A dictionary mapping datafile keys to DataFile classes.
Methods
get_highest_lvl(SID, instrument)
Get the highest data level for a given site and instrument.
get_files(SID, instrument, lvl, logger)
Get list of file paths for a given site, instrument, and level.
get_datafile_key(instrument, lvl, logger)
Get the datafile key based on the instrument, level, and logger.
get_datafiles(SID, instrument, lvl, logger, time_range, pattern=None)
Returns a list of data files for a given level and time range.
- static get_highest_lvl(SID: str, instrument: str) -> str
Get the highest data level for a given site and instrument.
- Parameters:
- SID : str
The site ID.
- instrument : str
The instrument name.
- Returns:
- str
The highest data level.
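The level names that appear in `lvl_data_dirs` are `raw`, `qaqc`, and `final`. Assuming they rank in that order from lowest to highest (an assumption, since the ordering is not stated in this page), choosing the highest available level could look like the sketch below. The helper `highest_level` and its `available` argument are hypothetical; how the real method decides which levels exist for a site and instrument is not shown here.

```python
# Assumed ordering, lowest to highest.
LEVEL_ORDER = ("raw", "qaqc", "final")


def highest_level(available) -> str:
    """Return the highest-ranked level present in `available` (sketch)."""
    for lvl in reversed(LEVEL_ORDER):
        if lvl in available:
            return lvl
    raise ValueError("No known data level available")


print(highest_level({"raw", "qaqc"}))
print(highest_level({"raw", "qaqc", "final"}))
```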
- get_files(SID: str, instrument: str, lvl: str, logger: str = 'campbellsci') -> List[str]
Get list of file paths for a given site, instrument, and level.
- Parameters:
- SID : str
The site ID.
- instrument : str
The instrument name.
- lvl : str
The data level.
- logger : str
The logger name.
- Returns:
- list[str]
A list of file paths.
- get_datafile_key(instrument: str, lvl: str, logger: str) -> str
Get the datafile key based on the instrument, level, and logger.
- Parameters:
- instrument : str
The instrument name.
- lvl : str
The data level.
- logger : str
The logger name.
- Returns:
- str
The datafile key.
- get_datafiles(SID: str, instrument: str, lvl: str, logger: str, time_range: TimeRange, pattern: str | None = None) -> list[DataFile]
Returns a list of data files for a given level and time range. Extends DataFile.get_datafiles by supplying the instrument name to the DataFile subclass.
- Parameters:
- SID : str
The site ID.
- instrument : str
The instrument name.
- lvl : str
The data level.
- logger : str
The logger name.
- time_range : TimeRange
The time range.
- pattern : str | None
The pattern to match file names.
- Returns:
- list[DataFile]
A list of data files.
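Selecting files for a time range typically combines two things documented on this page: a `date_slicer` that pulls the date out of the file name, and a comparison of the file's period against the requested range. The sketch below shows that idea on bare file names. Everything specific in it is hypothetical: the file naming scheme `SID_YYYYMM.csv`, the slice positions, the monthly file frequency, and treating `pattern` as a regular expression.

```python
import re
from datetime import datetime

DATE_SLICER = slice(6, 12)  # hypothetical: characters holding YYYYMM


def month_bounds(name: str):
    """Return (start, stop) of the month encoded in the file name."""
    start = datetime.strptime(name[DATE_SLICER], "%Y%m")
    if start.month == 12:
        stop = start.replace(year=start.year + 1, month=1)
    else:
        stop = start.replace(month=start.month + 1)
    return start, stop


def select_files(names, start: datetime, stop: datetime, pattern=None):
    """Keep files matching `pattern` whose month overlaps [start, stop)."""
    selected = []
    for name in names:
        if pattern is not None and not re.search(pattern, name):
            continue
        f_start, f_stop = month_bounds(name)
        if f_start < stop and f_stop > start:  # half-open interval overlap
            selected.append(name)
    return selected


names = [
    "TRX01_201811.csv",
    "TRX01_201812.csv",
    "TRX01_201901.csv",
    "TRX02_201812.csv",
]
picked = select_files(
    names, datetime(2018, 12, 15), datetime(2019, 1, 10), pattern=r"^TRX01"
)
print(picked)
```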
- static standardize_data(instrument: str, data: DataFrame) -> DataFrame
Convert the data to a standard format shared across research groups by renaming columns, converting units, and mapping values as needed.
- Parameters:
- instrument : str
The instrument model.
- data : pd.DataFrame
The data to standardize.
- Returns:
- pd.DataFrame
The standardized data.
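As a concrete illustration of "renaming columns and converting units", the sketch below standardizes a single record from a Met One ES-405. The mapping excerpt is copied from `column_mapping['metone_es405']`, where the short code `ITMP` maps to `Internal_T_F`. Converting that Fahrenheit column to `Internal_T_C` is an assumption about what the standardization step does, and the function works on a plain `dict` instead of a DataFrame to stay self-contained.

```python
# Excerpt of column_mapping['metone_es405'] copied from the value above.
MAPPING = {"ITMP": "Internal_T_F", "PM25": "PM2.5_ugm3", "FLOW": "Flow_Lpm"}


def standardize(record: dict) -> dict:
    """Rename Horel columns and convert Fahrenheit temperatures to Celsius (sketch)."""
    out = {}
    for key, value in record.items():
        name = MAPPING.get(key, key)
        if name.endswith("_T_F"):
            # Assumed unit conversion: degrees F -> degrees C, with a matching rename.
            name = name[:-1] + "C"
            value = (value - 32.0) * 5.0 / 9.0
        out[name] = value
    return out


standardized = standardize({"ITMP": 77.0, "PM25": 5.1, "FLOW": 2.0})
print(standardized)
```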