paidiverpy.utils.data
Helper functions to download and load datasets.
Classes
PaidiverpyData | A class to download and load datasets.
Module Contents
- class paidiverpy.utils.data.PaidiverpyData
A class to download and load datasets.
- copy_files_docker(extract_dir: pathlib.Path, dataset_name: str) -> None
Copy files from the extract directory to the appropriate location in the Docker container.
- Parameters:
extract_dir (Path) – The directory where the dataset has been extracted.
dataset_name (str) – The name of the dataset.
- save_persistent_paths(paths: dict[str, str]) -> None
Save the persistent paths to the cache directory.
- Parameters:
paths (dict[str, str]) – The paths to save.
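How the paths are stored is not described here. As a minimal sketch of the general idea, assuming a JSON file inside a cache directory, the following standalone helpers persist and reload a `dict[str, str]`. The names `save_paths`, `load_paths`, and `persistent_paths.json` are hypothetical and are not part of paidiverpy.

```python
import json
import tempfile
from pathlib import Path


def save_paths(paths: dict[str, str], cache_dir: Path,
               filename: str = "persistent_paths.json") -> Path:
    """Write a name-to-path mapping as JSON inside cache_dir.

    Hypothetical helper: the file name and JSON format are assumptions.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    target.write_text(json.dumps(paths, indent=2, sort_keys=True), encoding="utf-8")
    return target


def load_paths(cache_dir: Path,
               filename: str = "persistent_paths.json") -> dict[str, str]:
    """Read the mapping back, returning an empty dict if nothing was saved."""
    target = cache_dir / filename
    if not target.exists():
        return {}
    return json.loads(target.read_text(encoding="utf-8"))


# Round trip in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    save_paths({"example_dataset": "/data/example_dataset"}, cache)
    restored = load_paths(cache)
```

Storing plain strings rather than `Path` objects keeps the mapping directly JSON-serializable, which matches the `dict[str, str]` type in the signature.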
- download_file(url: str, dataset_name: str, cache_dir: pathlib.Path = CACHE_DIR) -> pathlib.Path
Download a dataset file from the given URL.
The file is cached locally to avoid redundant downloads, and a progress bar is displayed while the download runs.
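The caching behaviour can be illustrated with a small standalone sketch; it is not paidiverpy's implementation. It assumes the cached file is named `<dataset_name>.zip`, omits the progress bar, and takes a hypothetical `opener` argument so the example runs without network access.

```python
import io
import tempfile
import urllib.request
from pathlib import Path


def cached_download(url: str, dataset_name: str, cache_dir: Path,
                    opener=urllib.request.urlopen,
                    chunk_size: int = 1 << 16) -> Path:
    """Download url into cache_dir once and reuse the file afterwards.

    Hypothetical helper: the ".zip" file name is an assumption.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{dataset_name}.zip"
    if target.exists():
        return target  # cache hit: skip the download entirely

    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete cached file.
    partial = target.with_name(target.name + ".part")
    with opener(url) as response, partial.open("wb") as out:
        while chunk := response.read(chunk_size):
            out.write(chunk)
    partial.replace(target)
    return target


# Demonstration with a fake opener instead of a real HTTP request.
calls = []


def fake_opener(url):
    calls.append(url)
    return io.BytesIO(b"fake archive bytes")


with tempfile.TemporaryDirectory() as tmp:
    first = cached_download("https://example.org/data.zip", "example_dataset",
                            Path(tmp), opener=fake_opener)
    second = cached_download("https://example.org/data.zip", "example_dataset",
                             Path(tmp), opener=fake_opener)
    downloaded_bytes = first.read_bytes()
```

The second call returns the same path without invoking the opener again, which is the "avoid redundant downloads" behaviour the description refers to.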
- unzip_file(zip_path: pathlib.Path, dataset_name: str, extract_dir: pathlib.Path = CACHE_DIR) -> None
Unzip the file to the specified directory.
- Parameters:
zip_path (Path) – The path to the zip file.
extract_dir (Path) – The directory to extract the contents to. Defaults to CACHE_DIR.
dataset_name (str) – The name of the dataset.
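For orientation, extraction of this kind can be sketched with the standard library's `zipfile` module. The helper name `unzip_to` is hypothetical, and placing the contents in a subdirectory named after the dataset is an assumption, not documented behaviour of `unzip_file`. The sketch returns the destination for convenience, whereas `unzip_file` returns `None`.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_to(zip_path: Path, dataset_name: str, extract_dir: Path) -> Path:
    """Extract zip_path into extract_dir / dataset_name.

    Hypothetical helper: the per-dataset subdirectory is an assumption.
    """
    dest = extract_dir / dataset_name
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


# Build a tiny archive in a throwaway directory and extract it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    archive_path = root / "example.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("images/readme.txt", "hello")
    out_dir = unzip_to(archive_path, "example_dataset", root / "cache")
    extracted_text = (out_dir / "images" / "readme.txt").read_text()
```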