Sites
Classes
| Site | A class representing a site where atmospheric measurements are taken. |
| MobileSite | A class representing a mobile site where atmospheric measurements are taken. |
- class lair.uataq.sites.Site(SID: str, config: dict, instruments: InstrumentEnsemble)
A class representing a site where atmospheric measurements are taken.
Attributes
SID
(str) The site identifier.
config
(dict) A dictionary containing configuration information for the site.
instruments
(InstrumentEnsemble) An instance of the InstrumentEnsemble class representing the instruments at the site.
groups
(set of str) The research groups that collect data at the site.
loggers
(set of str) The loggers used by research groups that record data at a site.
pollutants
(set of str) The pollutants measured at the site.
Methods
read_data(instruments='all', group=None, lvl=None, time_range=None, num_processes=1, file_pattern=None)
Read data for each instrument for the specified level.
get_obs(pollutants='all', format='wide', group=None, time_range=None, num_processes=1)
Get observations for each pollutant, combining instruments by pollutants.
get_recent_obs(recent=dt.timedelta(days=10), pollutants='all', format='wide', group=None)
Get recent observations from site instruments.
- __init__(SID: str, config: dict, instruments: InstrumentEnsemble)
Initializes a Site object with the given site ID, configuration, and instruments.
- Parameters:
- SID : str
The site identifier.
- config : dict
A dictionary containing configuration information for the site:
{
    name: str,
    is_active: bool,
    is_mobile: bool,
    latitude: float,
    longitude: float,
    zagl: float,
    loggers: dict,
    instruments: {
        instrument: {
            loggers: dict,
            installation_date: str,
            removal_date: str,
        }
    }
}
- instruments : InstrumentEnsemble
An instance of the InstrumentEnsemble class representing the instruments at the site.
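As a rough illustration of the schema above, the following sketch builds a configuration dictionary and checks that the documented top-level keys are present. Every value (site name, coordinates, instrument name, dates) and the helper `missing_config_keys` are hypothetical and not part of lair. The date string format, the meaning and units of `zagl`, and the contents of the `loggers` dictionaries are assumptions.

```python
# Hypothetical site configuration following the documented schema.
# Every value below is made up for illustration.
example_config = {
    "name": "Example Site",
    "is_active": True,
    "is_mobile": False,
    "latitude": 40.0,
    "longitude": -111.0,
    "zagl": 10.0,  # assumed to be a height above ground level; units not documented
    "loggers": {},  # contents are not documented on this page
    "instruments": {
        "example_instrument": {
            "loggers": {},
            "installation_date": "2020-01-01",  # date format is an assumption
            "removal_date": "2021-01-01",
        }
    },
}

# Top-level keys listed in the documented schema.
REQUIRED_KEYS = {
    "name", "is_active", "is_mobile", "latitude",
    "longitude", "zagl", "loggers", "instruments",
}


def missing_config_keys(config):
    """Return the documented top-level keys absent from config, sorted."""
    return sorted(REQUIRED_KEYS - set(config))


print(missing_config_keys(example_config))
```

A check like this can catch a misspelled key before the dictionary is handed to `Site`, though lair may well perform its own validation.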
- read_data(instruments: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', group: str | None = None, lvl: str | None = None, time_range: TimeRange | str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1, file_pattern: str | None = None) -> dict[str, DataFrame]
Read data for the specified instruments and level.
- Parameters:
- instruments : str or list of str or 'all'
The instrument(s) to read data from. If 'all', read data from all instruments. Default is 'all'.
- group : str, optional
The research group to read data from. Default is None, which uses the default group.
- lvl : str, optional
The data level to read. Default is None, which reads the highest level available.
- time_range : TimeRange | str | list[Union[str, dt.datetime, None]] | tuple[Union[str, dt.datetime, None], Union[str, dt.datetime, None]] | slice | None
The time range of data to read. Default is None, which reads all available data.
- num_processes : int or 'max'
The number of processes to use for reading data. Default is 1.
- file_pattern : str, optional
The file pattern to use for filtering files. Default is None.
- Returns:
- dict[str, pandas.DataFrame]
A dictionary containing the data for each instrument.
- Raises:
- ReaderError
If no data is found for the specified instruments.
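The calling patterns implied by the signature above can be sketched as follows. Because constructing a real `Site` requires lair and its data files, the example uses `StandInSite`, a hypothetical stand-in that only mimics the documented `read_data` signature and returns empty lists where lair returns DataFrames. The instrument names and the date string format are assumptions.

```python
import datetime as dt


class StandInSite:
    """Hypothetical stand-in mimicking the documented read_data signature.

    This is NOT lair's Site; it returns empty lists instead of DataFrames.
    """

    def __init__(self, instruments):
        self.instrument_names = list(instruments)

    def read_data(self, instruments="all", group=None, lvl=None,
                  time_range=None, num_processes=1, file_pattern=None):
        # Resolve the instruments argument to a list of names.
        if instruments == "all":
            names = self.instrument_names
        elif isinstance(instruments, str):
            names = [instruments]
        else:
            names = list(instruments)
        # One entry per instrument, as in the documented return type.
        return {name: [] for name in names}


site = StandInSite(["instrument_a", "instrument_b"])

# Read every instrument at the default level.
everything = site.read_data()

# Read one instrument at the 'qaqc' level for a bounded time range.
one = site.read_data("instrument_a", lvl="qaqc",
                     time_range=["2024-01-01", "2024-02-01"])

# Open-ended range: a tuple whose end is None; datetimes are accepted too.
since = site.read_data(("instrument_a", "instrument_b"),
                       time_range=(dt.datetime(2024, 1, 1), None))

# The result is keyed by instrument name.
for name, data in everything.items():
    print(name, len(data))
```

With a real `Site`, each value would be a `pandas.DataFrame`, and a `ReaderError` would be raised if no data were found.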
- get_obs(pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None, time_range: str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1) -> DataFrame
Get observations for each pollutant, combining instruments by pollutants.
- Parameters:
- pollutants : str or list of str, optional
Pollutants to read. If 'all', read all pollutants. Default is 'all'.
- format : str, optional
Format of the data to return, either 'wide' or 'long'. Default is 'wide'.
- group : str, optional
Research group to read data from. Default is None, which uses the default group.
- time_range : str | list[Union[str, dt.datetime, None]] | tuple[Union[str, dt.datetime, None], Union[str, dt.datetime, None]] | slice | None
The time range of data to read. Default is None, which reads all available data.
- num_processes : int or 'max', optional
Number of processes to use for reading data. Default is 1.
- Returns:
- pandas.DataFrame
A dataframe containing observations for each pollutant, in the requested format.
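This page does not describe what the 'wide' and 'long' layouts look like in lair, so the following is only a generic sketch of the distinction, written in plain Python rather than pandas. The column names `pollutant` and `value`, the pollutant names, and the sample numbers are all hypothetical; `wide_to_long` is not a lair function.

```python
def wide_to_long(rows, id_key="Time_UTC"):
    """Turn one-column-per-pollutant rows into one-row-per-measurement rows."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every long row
            long_rows.append(
                {id_key: row[id_key], "pollutant": key, "value": value}
            )
    return long_rows


# Wide layout: one row per timestamp, one column per pollutant.
wide = [
    {"Time_UTC": "2024-01-01T00:00", "CO2": 420.1, "CH4": 1.95},
    {"Time_UTC": "2024-01-01T00:01", "CO2": 420.3, "CH4": 1.96},
]

# Long layout: one row per (timestamp, pollutant) pair.
long_rows = wide_to_long(wide)
for row in long_rows:
    print(row)
```

A wide layout is usually easier to plot or compare across pollutants, while a long layout is easier to group and filter by pollutant.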
- get_recent_obs(recent: str | timedelta = datetime.timedelta(days=10), pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None) -> DataFrame
Get recent observations from site instruments.
- Parameters:
- recent : str or datetime.timedelta, optional
How far back from the present to get observations. Default is 10 days.
- pollutants : str or list of str, optional
Pollutants to read. If 'all', read all pollutants. Default is 'all'.
- format : str, optional
Format of the data to return, either 'wide' or 'long'. Default is 'wide'.
- group : str, optional
Research group to read data from. Default is None, which uses the default group.
- Returns:
- pandas.DataFrame
A dataframe containing recent observations from site instruments.
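One plausible reading of `recent` is a look-back window ending at the current time. The sketch below computes such a window from a `timedelta`. The helper `recent_window` is hypothetical, and both the use of UTC and the handling of string values for `recent` are assumptions, since neither is specified here.

```python
import datetime as dt


def recent_window(recent=dt.timedelta(days=10), now=None):
    """Return (start, end) for a look-back window of length `recent`."""
    if now is None:
        # Assumption: the window ends at the current UTC time.
        now = dt.datetime.now(dt.timezone.utc)
    return now - recent, now


# Fixed reference time so the example is reproducible.
reference = dt.datetime(2024, 1, 11, tzinfo=dt.timezone.utc)
start, end = recent_window(now=reference)
print(start.isoformat(), end.isoformat())
```

The resulting pair has the same shape as the tuple form of `time_range` accepted by `get_obs`.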
- class lair.uataq.sites.MobileSite(SID: str, config: dict, instruments: InstrumentEnsemble)
A class representing a mobile site where atmospheric measurements are taken.
- Parameters:
- SID : str
The site identifier.
- config : dict
A dictionary containing configuration information for the site:
{ ... is_mobile: True, instruments: { instrument: {...} } ... }
- static merge_gps(obs: DataFrame, gps: DataFrame, on: str | None = None, obs_on: str | None = None, gps_on: str | None = None) -> DataFrame
Merge observation data with location data from GPS.
- Parameters:
- obs (pd.DataFrame): The observation data.
- gps (pd.DataFrame): The GPS location data.
- on (str, optional): The column name to merge on. Defaults to ‘Time_UTC’.
- obs_on (str, optional): The column name in the observation data to merge on. If not specified, it will use the value of ‘on’.
- gps_on (str, optional): The column name in the GPS data to merge on. If not specified, it will use the value of ‘on’.
- Returns:
- pd.DataFrame: The merged data with added location information.
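The way the three key arguments fall back on one another can be sketched in plain Python. The key resolution follows the parameter descriptions above. The join itself (an exact-match left join over lists of dicts) is an assumption, since the matching strategy used by `merge_gps` is not stated. The helper names, the `Latitude`/`Longitude` columns, and the sample values are hypothetical.

```python
def resolve_merge_keys(on=None, obs_on=None, gps_on=None):
    """Apply the documented fallbacks: `on` defaults to 'Time_UTC',
    and obs_on / gps_on default to the value of `on`."""
    on = on or "Time_UTC"
    return obs_on or on, gps_on or on


def merge_rows(obs, gps, on=None, obs_on=None, gps_on=None):
    """Left-join GPS rows onto observation rows by exact key match.

    Assumption: exact matching; the real method may match differently.
    """
    obs_key, gps_key = resolve_merge_keys(on, obs_on, gps_on)
    gps_by_key = {row[gps_key]: row for row in gps}
    merged = []
    for row in obs:
        match = gps_by_key.get(row[obs_key], {})
        # Copy location fields, skipping the GPS join key itself.
        location = {k: v for k, v in match.items() if k != gps_key}
        merged.append({**row, **location})
    return merged


obs = [
    {"Time_UTC": "2024-01-01T00:00", "CO2": 420.1},
    {"Time_UTC": "2024-01-01T00:01", "CO2": 420.3},
]
gps = [
    {"time": "2024-01-01T00:00", "Latitude": 40.76, "Longitude": -111.85},
]

# The GPS data uses a different column name, so only gps_on is overridden.
merged = merge_rows(obs, gps, gps_on="time")
for row in merged:
    print(row)
```

Observation rows without a matching GPS row are kept without location fields, which mirrors a left join.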
- get_obs(pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None, time_range: str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1) -> DataFrame
Get mobile site observations for each pollutant, combining instruments by pollutants and merging in location data from GPS.
- Parameters:
- pollutants : str or list of str, optional
Pollutants to read. If 'all', read all pollutants. Default is 'all'.
- format : str, optional
Format of the data to return, either 'wide' or 'long'. Default is 'wide'.
- group : str, optional
Research group to read data from. Default is None, which uses the default group.
- time_range : str | list[Union[str, dt.datetime, None]] | tuple[Union[str, dt.datetime, None], Union[str, dt.datetime, None]] | slice | None
The time range of data to read. Default is None, which reads all available data.
- num_processes : int or 'max', optional
Number of processes to use for reading data. Default is 1.
- Returns:
- pandas.DataFrame
A dataframe containing mobile site observations for each pollutant with location data merged.