paidiverpy.utils.data

Helper functions to download and load datasets.

class paidiverpy.utils.data.PaidiverpyData

A class to download and load datasets.

load(dataset_name: str) → dict[str, str]

Download, unzip, and load the specified dataset.

Parameters: dataset_name (str) – The name of the dataset (for example, ‘sample_image’).
Returns: A dictionary containing the input path, metadata path, metadata type, and image type.
Return type: dict
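
A minimal usage sketch, based only on the signatures shown on this page: it assumes that PaidiverpyData can be instantiated without arguments and that paidiverpy is installed when load_dataset is called. The helper names load_dataset and format_dataset_info are hypothetical and are not part of paidiverpy; the exact keys of the returned dictionary are not listed here, so the formatter simply prints whatever keys it receives.

```python
def load_dataset(dataset_name: str) -> dict[str, str]:
    """Download (if needed) and load a dataset through PaidiverpyData.load."""
    # Imported lazily so that this snippet can be defined without paidiverpy installed.
    from paidiverpy.utils.data import PaidiverpyData

    # Assumption: the class takes no constructor arguments, as its signature suggests.
    return PaidiverpyData().load(dataset_name)


def format_dataset_info(info: dict[str, str]) -> str:
    """Render the returned dictionary as one 'key: value' line per entry, sorted by key."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(info.items()))
```

With paidiverpy installed, print(format_dataset_info(load_dataset("sample_image"))) would list the input path, metadata path, metadata type, and image type that load returns.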

copy_files_docker(extract_dir: pathlib.Path, dataset_name: str) → None

Copy files from the extract directory to the appropriate location in the Docker container.

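Parameters:

extract_dir (pathlib.Path) – The directory containing the extracted dataset files.
dataset_name (str) – The name of the dataset.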

load_persistent_paths() → dict[str, str]

Load the persistent paths from the cache directory.

save_persistent_paths(paths: dict[str, str]) → None

Save the persistent paths to the cache directory.
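
The page does not say how the persistent paths are stored. As an illustration only, the sketch below round-trips a dict[str, str] through a JSON file in a cache directory. The function names save_paths and load_paths, the explicit cache_dir argument, and the file name persistent_paths.json are all assumptions made for this example; they are not taken from paidiverpy.

```python
import json
from pathlib import Path

# Hypothetical file name; the real storage format and location are not documented here.
PATHS_FILE = "persistent_paths.json"


def save_paths(paths: dict[str, str], cache_dir: Path) -> None:
    """Write the mapping to a JSON file inside cache_dir, creating the directory if needed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / PATHS_FILE).write_text(json.dumps(paths, indent=2), encoding="utf-8")


def load_paths(cache_dir: Path) -> dict[str, str]:
    """Read the mapping back, returning an empty dict when nothing has been saved yet."""
    paths_file = cache_dir / PATHS_FILE
    if not paths_file.exists():
        return {}
    return json.loads(paths_file.read_text(encoding="utf-8"))
```

Returning an empty dictionary for a missing file lets a first run proceed without special-casing an empty cache.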

download_file(url: str, dataset_name: str, cache_dir: pathlib.Path = CACHE_DIR) → pathlib.Path

Download a dataset file from the given URL.

The file is cached locally to avoid redundant downloads. A progress bar is displayed while the download runs.

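Parameters:

url (str) – The URL of the dataset file.
dataset_name (str) – The name of the dataset.
cache_dir (pathlib.Path) – The directory in which the downloaded file is cached. Defaults to CACHE_DIR.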

Returns:

The path to the downloaded file.

Return type:

pathlib.Path
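
The download-and-cache behaviour described above can be sketched with the standard library alone. This is not paidiverpy's implementation: the function name download_cached, the .zip target name, the temporary .part file, and the plain-text percentage readout (standing in for the progress bar) are assumptions made for illustration.

```python
import urllib.request
from pathlib import Path


def download_cached(url: str, dataset_name: str, cache_dir: Path) -> Path:
    """Download url into cache_dir once and reuse the cached copy on later calls."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{dataset_name}.zip"  # hypothetical naming scheme
    if target.exists():
        # Cache hit: skip the network entirely.
        return target

    # Write to a temporary name first so an interrupted download is never
    # mistaken for a complete cached file.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, partial.open("wb") as out:
        total = int(response.headers.get("Content-Length") or 0)
        done = 0
        while chunk := response.read(64 * 1024):
            out.write(chunk)
            done += len(chunk)
            if total:
                # Simple stand-in for a progress bar.
                print(f"\r{dataset_name}: {100 * done // total}%", end="", flush=True)
    print()
    partial.replace(target)
    return target
```

Calling the function a second time with the same dataset_name and cache_dir returns the existing file without contacting the server.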

unzip_file(zip_path: pathlib.Path, dataset_name: str, extract_dir: pathlib.Path = CACHE_DIR) → None

Unzip the file to the specified directory.

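Parameters:

zip_path (pathlib.Path) – The path to the zip file.
dataset_name (str) – The name of the dataset.
extract_dir (pathlib.Path) – The directory to extract the files into. Defaults to CACHE_DIR.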
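
As a rough illustration of the extraction step, the sketch below unpacks an archive with the standard zipfile module. It assumes the files land in a subdirectory named after the dataset, which this page does not confirm, and unlike unzip_file it returns the extraction directory for convenience. The name unzip_dataset is hypothetical.

```python
import zipfile
from pathlib import Path


def unzip_dataset(zip_path: Path, dataset_name: str, extract_dir: Path) -> Path:
    """Extract zip_path into extract_dir/dataset_name and return that directory."""
    # Assumption: one subdirectory per dataset keeps extracted files separate.
    target = extract_dir / dataset_name
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```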

calculate_information(dataset_name: str, extract_dir: pathlib.Path, dataset_information: dict[str, Any]) → dict[str, str]

Calculate the information for the dataset.

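Parameters:

dataset_name (str) – The name of the dataset.
extract_dir (pathlib.Path) – The directory containing the extracted dataset files.
dataset_information (dict[str, Any]) – Information describing the dataset.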

Returns:

Information about the dataset.

Return type:

dict