lair.utils.records

Utilities for working with files and directories.

Functions

ftp_download(host, paths, download_dir[, ...])

Recursively download files from an FTP server.

list_files([path, pattern, ignore_case, ...])

Returns a list of files in the specified directory that match the specified pattern.

parallelize_file_parser(file_parser[, ...])

Parallelizes a file parser function to read multiple files in parallel.

read_kml(path)

Read a KML file from path.

unzip(zf[, dir_path])

Unzip a file into dir_path, if given.

wget_download(urls, download_dir[, prefix, ...])

Download multiple files from given URLs using wget and extract them if they are ZIP files.

Classes

Cacher(func, cache_file[, reload])

A class that caches function results to a file for future use.

lair.utils.records.unzip(zf: str, dir_path: str | None = None)

Unzip a file into dir_path, if given.

Parameters:
zf : str

Path to the ZIP file.

dir_path : str, optional

Path to the directory to unzip into. Defaults to None.
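
The extraction step can be illustrated with the standard library alone. The following is a minimal sketch, not the lair implementation: the name unzip_sketch is hypothetical, and extracting next to the archive when dir_path is None is an assumption, since the behavior for None is not documented here.

```python
import os
import zipfile
from typing import Optional


def unzip_sketch(zf: str, dir_path: Optional[str] = None) -> str:
    """Extract a ZIP archive and return the directory it was extracted into."""
    # Assumption: with no dir_path, extract into the archive's own directory.
    if dir_path is None:
        dir_path = os.path.dirname(os.path.abspath(zf))
    os.makedirs(dir_path, exist_ok=True)
    with zipfile.ZipFile(zf) as archive:
        archive.extractall(dir_path)
    return dir_path
```

Returning the target directory is a convenience of this sketch only; the documented function does not state a return value.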

lair.utils.records.list_files(path: str = '.', pattern: str | None = None, ignore_case: bool = False, all_files: bool = False, full_names: bool = False, recursive: bool = False) -> list[str]

Returns a list of files in the specified directory that match the specified pattern.

Parameters:
path : str, optional

The directory to search for files. Defaults to the current directory.

pattern : str, optional

The glob-style pattern to match against file names. Defaults to None, which matches all files.

ignore_case : bool, optional

Whether to ignore case when matching file names. Defaults to False.

all_files : bool, optional

Whether to include hidden files (files that start with a dot). Defaults to False.

full_names : bool, optional

Whether to return the full path of each file. Defaults to False.

recursive : bool, optional

Whether to search for files recursively in subdirectories. Defaults to False.

Returns:
list[str]

A list of file names or full paths that match the specified pattern.
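
To make the interaction of these options concrete, here is a rough standard-library sketch of the documented behavior. It is not the lair implementation: list_files_sketch is a hypothetical name, and the sorted output, the skipping of hidden directories, and the use of paths relative to path in recursive mode are all assumptions.

```python
import fnmatch
import os
from typing import List, Optional


def list_files_sketch(path: str = ".", pattern: Optional[str] = None,
                      ignore_case: bool = False, all_files: bool = False,
                      full_names: bool = False,
                      recursive: bool = False) -> List[str]:
    matches = []
    for root, dirs, files in os.walk(path):
        if not all_files:
            # Assumption: hidden directories are skipped along with hidden files.
            dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if not all_files and name.startswith("."):
                continue
            if pattern is not None:
                # fnmatchcase never normalizes case, so lower both sides ourselves.
                if ignore_case:
                    ok = fnmatch.fnmatchcase(name.lower(), pattern.lower())
                else:
                    ok = fnmatch.fnmatchcase(name, pattern)
                if not ok:
                    continue
            full = os.path.join(root, name)
            matches.append(full if full_names else os.path.relpath(full, path))
        if not recursive:
            break  # os.walk yields the top directory first; stop after it
    return sorted(matches)
```

For example, list_files_sketch("data", "*.csv", recursive=True) would return the CSV files under a hypothetical data directory as paths relative to it.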

lair.utils.records.read_kml(path: str) -> KML

Read a KML file from path.

Parameters:
path : str

The path to the KML file.

Returns:
KML

The KML object.

lair.utils.records.wget_download(urls: str | list[str], download_dir: str | None, prefix: str | None = None, num_threads: int = 1, unzip: bool = True)

Download multiple files from given URLs using wget and extract them if they are ZIP files.

Parameters:
urls : str | list[str]

A URL or list of URLs to download.

download_dir : str | None

The local directory to download files to.

prefix : str, optional

The common prefix to use for the local directory structure. Defaults to None. If None, the files will be downloaded directly into the download_dir. If an empty string, the entire structure will be recreated.

num_threads : int, optional

Maximum number of threads to use for downloading. Defaults to 1.

unzip : bool, optional

Whether to unzip the downloaded files if they are ZIP files. Defaults to True.
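
The effect of prefix on the local layout is easiest to see with a small path-mapping example. The helper below is purely illustrative: local_path_sketch is a hypothetical function, not part of lair, and treating a non-empty prefix as a leading portion of the remote path to strip is an assumption. Only the None and empty-string cases are described in the documentation above.

```python
import os
from typing import Optional
from urllib.parse import urlparse


def local_path_sketch(url: str, download_dir: str,
                      prefix: Optional[str] = None) -> str:
    """Map a URL to the local path a download might be written to."""
    remote_path = urlparse(url).path.lstrip("/")
    if prefix is None:
        # Documented: files go directly into download_dir.
        relative = os.path.basename(remote_path)
    else:
        stripped = prefix.strip("/")
        if stripped and remote_path.startswith(stripped + "/"):
            # Assumption: a matching prefix is removed from the remote path.
            relative = remote_path[len(stripped) + 1:]
        else:
            # Documented for '': the entire remote structure is recreated.
            relative = remote_path
    return os.path.join(download_dir, *relative.split("/"))
```

With the hypothetical URL https://example.com/data/2020/file.zip and download_dir "dl", prefix=None gives dl/file.zip, prefix="" gives dl/data/2020/file.zip, and prefix="data" gives dl/2020/file.zip under this sketch's assumptions.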

lair.utils.records.ftp_download(host: str, paths: str | list[str], download_dir: str, username: str = 'anonymous', password: str = '', prefix: str | None = None, pattern: str | None = None)

Recursively download files from an FTP server.

Parameters:
host : str

The FTP server host.

paths : str | list[str]

The path(s) to download from the FTP server.

download_dir : str

The local directory to download files to.

username : str, optional

The username to use for the FTP server. Defaults to 'anonymous'.

password : str, optional

The password to use for the FTP server. Defaults to '' (an empty string).

prefix : str, optional

The common prefix to use for the local directory structure. Defaults to None.

pattern : str, optional

The pattern to match against file names. Defaults to None.

Returns:
bool

True if the download was successful, False otherwise.

lair.utils.records.parallelize_file_parser(file_parser: Callable, num_processes: int | Literal['max'] = 1)

Parallelizes a file parser function to read multiple files in parallel.

Parameters:
file_parser : function

The function to be parallelized. Must be picklable.

num_processes : int | 'max', optional

The number of processes to use for parallelization. Defaults to 1.

Returns:
function

A parallelized version of the file parser function.
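
The general pattern behind such a wrapper can be sketched with multiprocessing.Pool. This is not the lair implementation: parallelize_sketch is a hypothetical name, and returning a plain list of per-file results, interpreting 'max' as os.cpu_count(), and falling back to a serial loop for a single process are assumptions.

```python
import multiprocessing
import os
from typing import Callable, List, Sequence, Union


def parallelize_sketch(file_parser: Callable,
                       num_processes: Union[int, str] = 1) -> Callable:
    """Wrap file_parser so it can be applied to many files at once."""

    def parallel_parser(files: Sequence[str]) -> List:
        files = list(files)
        # Assumption: 'max' means one worker per available CPU.
        if num_processes == "max":
            n = os.cpu_count() or 1
        else:
            n = int(num_processes)
        n = min(n, len(files))
        if n <= 1:
            # Serial fallback avoids process start-up cost.
            return [file_parser(f) for f in files]
        # Pool.map pickles file_parser, hence the picklability requirement.
        with multiprocessing.Pool(processes=n) as pool:
            return pool.map(file_parser, files)

    return parallel_parser
```

When more than one process is requested, the wrapped function should be called from under an `if __name__ == "__main__":` guard, because worker processes started with the spawn or forkserver methods re-import the main module.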

class lair.utils.records.Cacher(func: Callable, cache_file: str, reload=False)

A class that caches function results to a file for future use.

Attributes

func

(function) The function to be cached.

cache_file

(str) The name of the file to cache results to.

reload

(bool) Whether to reload the cache index from the index file.

pkl = <module 'pickle'>

__init__(func: Callable, cache_file: str, reload=False)

Initializes a Cacher object.

Parameters:
func : function

The function to be cached.

cache_file : str

The name of the file to cache results to.

reload : bool, optional

Whether to reload the cache index from the index file. Defaults to False.

load_cache_index()

Loads the cache index from a file.

Returns:
dict

A dictionary of cached results and their corresponding file positions.

save_cache_index()

Saves the cache index to a file.
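
One way to picture a file-backed cache with a position index is the sketch below. It is an illustration under stated assumptions, not the lair implementation: CacherSketch, the ".index" file suffix, the pickled-arguments cache key, and the callable interface are all hypothetical. Only the method names load_cache_index and save_cache_index and the constructor parameters come from the documentation above.

```python
import os
import pickle
from typing import Callable, Dict


class CacherSketch:
    def __init__(self, func: Callable, cache_file: str, reload: bool = False):
        self.func = func
        self.cache_file = cache_file
        # Assumption: the index lives beside the cache file.
        self.index_file = cache_file + ".index"
        # Assumption: reload=True reuses a saved index, otherwise start empty.
        self.index: Dict[bytes, int] = self.load_cache_index() if reload else {}

    def load_cache_index(self) -> Dict[bytes, int]:
        if os.path.exists(self.index_file):
            with open(self.index_file, "rb") as f:
                return pickle.load(f)
        return {}

    def save_cache_index(self) -> None:
        with open(self.index_file, "wb") as f:
            pickle.dump(self.index, f)

    def __call__(self, *args, **kwargs):
        # Assumption: arguments are picklable and identify the result.
        key = pickle.dumps((args, sorted(kwargs.items())))
        if key in self.index:
            with open(self.cache_file, "rb") as f:
                f.seek(self.index[key])  # jump to the stored result
                return pickle.load(f)
        result = self.func(*args, **kwargs)
        with open(self.cache_file, "ab") as f:
            f.seek(0, os.SEEK_END)
            self.index[key] = f.tell()  # record where this result starts
            pickle.dump(result, f)
        self.save_cache_index()
        return result
```

Storing byte offsets in the index means a cached result can be read back with a single seek, without unpickling every earlier entry in the cache file.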