Sites

Classes

Site(SID, config, instruments)

A class representing a site where atmospheric measurements are taken.

MobileSite(SID, config, instruments)

A class representing a mobile site where atmospheric measurements are taken.

class lair.uataq.sites.Site(SID: str, config: dict, instruments: InstrumentEnsemble)

A class representing a site where atmospheric measurements are taken.

Attributes

SID

(str) The site identifier.

config

(dict) A dictionary containing configuration information for the site.

instruments

(InstrumentEnsemble) An instance of the InstrumentEnsemble class representing the instruments at the site.

groups

(set of str) The research groups that collect data at the site.

loggers

(set of str) The loggers used by research groups that record data at a site.

pollutants

(set of str) The pollutants measured at the site.

Methods

read_data(instruments='all', group=None, lvl=None, time_range=None, num_processes=1, file_pattern=None)

Read data for each instrument for specified level.

get_obs(pollutants='all', format='wide', group=None, time_range=None, num_processes=1)

Read observations for each pollutant, combining instruments by pollutants.

get_recent_obs(recent=dt.timedelta(days=10), pollutants='all', format='wide', group=None)

Get recent observations from site instruments.

__init__(SID: str, config: dict, instruments: InstrumentEnsemble)

Initializes a Site object with the given site ID, configuration, and instruments.

Parameters:
SID : str

The site identifier.

config : dict

A dictionary containing configuration information for the site:

{
    name: str,
    is_active: bool,
    is_mobile: bool,
    latitude: float,
    longitude: float,
    zagl: float,
    loggers: dict,
    instruments: {
        instrument: {
            loggers: dict,
            installation_date: str,
            removal_date: str,
        }
    }
}
instruments : InstrumentEnsemble

An instance of the InstrumentEnsemble class representing the instruments at the site.
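As an illustration of the config schema above, here is a minimal sketch of a dictionary with that shape. Only the key names come from the schema; every value (site name, coordinates, dates, the instrument key, the empty logger dictionaries) is a hypothetical placeholder, and `SCHEMA_KEYS` and `missing_config_keys` are helpers invented for this example, not part of `lair`.

```python
# Hypothetical values throughout; only the key names come from the schema.
example_config = {
    "name": "Example Site",
    "is_active": True,
    "is_mobile": False,
    "latitude": 40.0,
    "longitude": -111.0,
    "zagl": 10.0,          # height above ground level; units are not stated in the schema
    "loggers": {},         # contents of the logger mapping are not specified here
    "instruments": {
        "example_instrument": {
            "loggers": {},
            "installation_date": "2020-01-01",  # date format is an assumption
            "removal_date": "2021-01-01",
        }
    },
}

# Top-level keys listed in the documented schema.
SCHEMA_KEYS = {
    "name", "is_active", "is_mobile", "latitude",
    "longitude", "zagl", "loggers", "instruments",
}


def missing_config_keys(config: dict) -> set:
    """Return the documented top-level keys that are absent from `config`."""
    return SCHEMA_KEYS - set(config)
```

Whether every key is actually required by `Site` is not stated in the documentation, so a check like this is only a sanity test against the documented shape.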

read_data(instruments: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', group: str | None = None, lvl: str | None = None, time_range: TimeRange | str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1, file_pattern: str | None = None) -> dict[str, DataFrame]

Read data for the specified instruments and level.

Parameters:
instruments : str or list of str or 'all'

The instrument(s) to read data from. If ‘all’, read data from all instruments. Default is ‘all’.

group : str, optional

The research group to read data from. Default is None which uses the default group.

lvl : str, optional

The data level to read. Default is None which reads the highest level available.

time_range : str | list[Union[str, dt.datetime, None]] | tuple[Union[str, dt.datetime, None], Union[str, dt.datetime, None]] | slice | None

The time range to read data. Default is None which reads all available data.

num_processes : int or 'max'

The number of processes to use for reading data. Default is 1.

file_pattern : str, optional

The file pattern to use for filtering files. Default is None.

Returns:
dict[str, pandas.DataFrame]

A dictionary containing the data for each instrument.

Raises:
ReaderError

If no data is found for the specified instruments.
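Because `read_data` is documented to return a dictionary keyed by instrument, a common next step is to inspect or stack the per-instrument frames. The sketch below uses a hand-built stand-in for that dictionary; the instrument names and the `value` column are made up for illustration, and the real frames will have whatever columns each instrument provides.

```python
import pandas as pd

# Stand-in for the mapping read_data is documented to return
# (instrument name -> DataFrame). Names and columns are hypothetical.
data = {
    "instrument_a": pd.DataFrame({"value": [1.0, 2.0, 3.0]}),
    "instrument_b": pd.DataFrame({"value": [4.0]}),
}

# Number of rows read per instrument.
row_counts = {name: len(df) for name, df in data.items()}

# Stack all frames; the dictionary keys become the outer index level.
combined = pd.concat(data, names=["instrument", "row"])
```

Passing the dictionary straight to `pd.concat` keeps track of which instrument each row came from without adding an extra column by hand.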

get_obs(pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None, time_range: str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1) -> DataFrame

Get observations for each pollutant, combining instruments by pollutants.

Parameters:
pollutants : str or list of str, optional

Pollutants to read. If ‘all’, read all pollutants. Default is ‘all’.

format : str, optional

Format of the data to return. Default is ‘wide’.

group : str, optional

Research group to read data from. Default is None which uses the default group.

time_range : str | list[Union[str, dt.datetime, None]] | tuple[Union[str, dt.datetime, None], Union[str, dt.datetime, None]] | slice | None

The time range to read data. Default is None which reads all available data.

num_processes : int or 'max', optional

Number of processes to use for reading data. Default is 1.

Returns:
pandas.DataFrame

A dataframe containing observations for each pollutant, with data from instruments that measure the same pollutant combined, in the requested format (‘wide’ or ‘long’).
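The `format` parameter of `get_obs` chooses between a ‘wide’ and a ‘long’ layout. The exact columns `lair` produces are not described here, so the following is only a sketch of the general distinction using plain pandas: the pollutant column names, the values, and the `pollutant`/`value` column labels are assumptions made for illustration.

```python
import pandas as pd

# Wide layout: one row per timestamp, one column per pollutant.
# Column names and values are hypothetical.
wide = pd.DataFrame(
    {
        "Time_UTC": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:01"]),
        "CO2": [420.1, 420.4],
        "CH4": [1.95, 1.96],
    }
)

# Long layout: one row per (timestamp, pollutant) pair.
long = wide.melt(id_vars="Time_UTC", var_name="pollutant", value_name="value")
```

A wide frame is convenient for comparing pollutants side by side at the same timestamp, while a long frame is easier to group or filter by pollutant.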

get_recent_obs(recent: str | timedelta = datetime.timedelta(days=10), pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None) -> DataFrame

Get recent observations from site instruments.

Parameters:
recent : str or datetime.timedelta, optional

Length of the time window, counted back from the present, for which to get observations. Default is 10 days.

pollutants : str or list of str, optional

Pollutants to read. If ‘all’, read all pollutants. Default is ‘all’.

format : str, optional

Format of the data to return. Default is ‘wide’.

group : str, optional

Research group to read data from. Defaults to None which uses the default group.

Returns:
pandas.DataFrame

A dataframe containing recent observations from site instruments.
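Since `recent` may be given as a string or a `datetime.timedelta`, one plausible way to turn it into the start of a time window is shown below. This is a sketch under assumptions, not the method's actual implementation: `recent_start` is a hypothetical helper, and it assumes string values are in a form `pandas.to_timedelta` can parse (for example "10 days"), which the documentation does not confirm.

```python
import datetime as dt

import pandas as pd


def recent_start(recent, now: pd.Timestamp) -> pd.Timestamp:
    """Return the start of a window of length `recent` ending at `now`.

    `recent` may be a datetime.timedelta or a string that
    pandas.to_timedelta understands (an assumption of this sketch).
    """
    delta = pd.to_timedelta(recent)
    return now - delta


# Both spellings of "the last 10 days" give the same start time.
now = pd.Timestamp("2024-01-11")
start_from_str = recent_start("10 days", now)
start_from_td = recent_start(dt.timedelta(days=10), now)
```

The resulting start time could then serve as the lower bound of an open-ended time range.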

class lair.uataq.sites.MobileSite(SID: str, config: dict, instruments: InstrumentEnsemble)

A class representing a mobile site where atmospheric measurements are taken.

Parameters:
SID : str

The site identifier.

config : dict

A dictionary containing configuration information for the site:

{
    ...
    is_mobile: True,
    instruments: {
        instrument: {...}
    }
    ...
}
static merge_gps(obs: DataFrame, gps: DataFrame, on: str | None = None, obs_on: str | None = None, gps_on: str | None = None) -> DataFrame

Merge observation data with location data from GPS.

Parameters:
obs (pd.DataFrame): The observation data.
gps (pd.DataFrame): The GPS location data.
on (str, optional): The column name to merge on. Defaults to ‘Time_UTC’.
obs_on (str, optional): The column name in the observation data to merge on. If not specified, it will use the value of ‘on’.
gps_on (str, optional): The column name in the GPS data to merge on. If not specified, it will use the value of ‘on’.
Returns:
pd.DataFrame: The merged data with added location information.
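To make the roles of `on`, `obs_on`, and `gps_on` concrete, here is a minimal sketch of a time-based merge with the same parameters. It is not the library's implementation: `merge_gps_sketch` is a hypothetical function, the nearest-timestamp matching via `pandas.merge_asof` is an assumed strategy, and the `CO2`, `lat`, and `lon` columns are invented example data.

```python
import pandas as pd


def merge_gps_sketch(obs, gps, on=None, obs_on=None, gps_on=None):
    """Attach the nearest-in-time GPS row to each observation row."""
    on = on or "Time_UTC"      # documented default merge column
    left_on = obs_on or on     # fall back to `on` when not given
    right_on = gps_on or on
    if left_on == right_on:
        keys = {"on": left_on}
    else:
        keys = {"left_on": left_on, "right_on": right_on}
    # merge_asof requires both frames to be sorted by their merge keys.
    return pd.merge_asof(
        obs.sort_values(left_on),
        gps.sort_values(right_on),
        direction="nearest",
        **keys,
    )


# Hypothetical example data.
obs = pd.DataFrame(
    {
        "Time_UTC": pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:00:10"]),
        "CO2": [420.1, 420.4],
    }
)
gps = pd.DataFrame(
    {
        "Time_UTC": pd.to_datetime(["2024-01-01 00:00:01", "2024-01-01 00:00:09"]),
        "lat": [40.0, 40.1],
        "lon": [-111.0, -111.1],
    }
)
merged = merge_gps_sketch(obs, gps)
```

Nearest-timestamp matching is used here because observation and GPS records rarely share exact timestamps; an exact-match join would be an equally valid reading of the documentation.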
get_obs(pollutants: Literal['all'] | str | list[str] | tuple[str, ...] | set[str] = 'all', format: Literal['wide', 'long'] = 'wide', group: str | None = None, time_range: str | list[str | datetime | None] | tuple[str | datetime | None, str | datetime | None] | slice | None = None, num_processes: int | Literal['max'] = 1) -> DataFrame

Get mobile site observations for each pollutant, combining instruments by pollutants, and merging location data from GPS.

Parameters:
pollutants : str or list of str, optional

Pollutants to read. If ‘all’, read all pollutants. Default is ‘all’.

format : str, optional

Format of the data to return. Default is ‘wide’.

group : str, optional

Research group to read data from. Default is None which uses the default group.

time_range : str, list, tuple, slice, or None, optional

Time range to read data. Default is None, which reads all available data.

num_processes : int or 'max', optional

Number of processes to use for reading data. Default is 1.

Returns:
pandas.DataFrame

A dataframe containing mobile site observations for each pollutant with location data merged.