gigl.common.utils.GcsUtils

class gigl.common.utils.gcs.GcsUtils(project: str | None = None)

Bases: object

Utility class for interacting with Google Cloud Storage (GCS).

Methods

__init__

Initialize the GcsUtils instance.

add_bucket_lifecycle_rule_with_prefix

close_upload_delete_and_push_to_gcs

copy_gcs_path

count_blobs_in_gcs_path

delete_files

delete_files_in_bucket_dir

delete_gcs_file_if_exist

does_gcs_file_exist

download_file_from_gcs

download_file_from_gcs_to_temp_file

download_files_from_gcs_paths_to_local_dir

download_files_from_gcs_paths_to_local_paths

Downloads files from GCS path to local path.

get_bucket_and_blob_path_from_gcs_path

list_uris_with_gcs_path_pattern

List GCS URIs with a given suffix or pattern.

read_from_gcs

upload_files_to_gcs

Upload files from local paths to their corresponding GCS paths.

upload_from_filelike

Uploads a file-like object to GCS.

upload_from_string

__init__(project: str | None = None) -> None

Initialize the GcsUtils instance.

Args:

project (Optional[str]): The GCP project ID. Defaults to None.

__weakref__

list of weak references to the object (if defined)

download_files_from_gcs_paths_to_local_paths(file_map: Dict[GcsUri, LocalUri])

Downloads files from GCS paths to local paths.

Args:

file_map (Dict[GcsUri, LocalUri]): Mapping of GCS path -> local path.

list_uris_with_gcs_path_pattern(gcs_path: GcsUri, suffix: str | None = None, pattern: str | None = None) -> List[GcsUri]

List GCS URIs with a given suffix or pattern.

Ex: given the following files in GCS:

gs://bucket-name/dir/file1.txt
gs://bucket-name/dir/foo.txt
gs://bucket-name/dir/file.json

list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, suffix=".txt") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/foo.txt]

list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, pattern="file.*") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/file.json]

Args:

gcs_path (GcsUri): The GCS path to list URIs from.

suffix (Optional[str]): The suffix to filter URIs by. If None (the default), no filtering on suffix is done.

pattern (Optional[str]): The regex to filter URIs by. If None (the default), no filtering on pattern is done.

Returns:

List[GcsUri]: A list of GCS URIs that match the given suffix or pattern.
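The filtering described above can be sketched locally, without touching GCS. This is only an illustration of the documented example, not the library's implementation: `filter_uris` is a hypothetical helper, and two details are assumptions that this page does not confirm, namely that the regex is applied with `re.search` against the full URI string and that, when both `suffix` and `pattern` are given, a URI must satisfy both.

```python
import re
from typing import List, Optional


def filter_uris(
    uris: List[str],
    suffix: Optional[str] = None,
    pattern: Optional[str] = None,
) -> List[str]:
    """Hypothetical stand-in for the suffix/pattern filtering step."""
    compiled = re.compile(pattern) if pattern is not None else None
    kept = []
    for uri in uris:
        # Assumption: a provided suffix must match the end of the URI.
        if suffix is not None and not uri.endswith(suffix):
            continue
        # Assumption: the regex may match anywhere in the URI (re.search).
        if compiled is not None and compiled.search(uri) is None:
            continue
        kept.append(uri)
    return kept


uris = [
    "gs://bucket-name/dir/file1.txt",
    "gs://bucket-name/dir/foo.txt",
    "gs://bucket-name/dir/file.json",
]

txt_uris = filter_uris(uris, suffix=".txt")
file_uris = filter_uris(uris, pattern="file.*")
all_uris = filter_uris(uris)  # no filters: everything is kept
```

Under these assumptions the sketch reproduces the two results in the example: the `.txt` suffix keeps `file1.txt` and `foo.txt`, and the `file.*` pattern keeps `file1.txt` and `file.json`.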

upload_files_to_gcs(local_file_path_to_gcs_path_map: Dict[LocalUri, GcsUri], parallel: bool = True) -> None

Upload files from local paths to their corresponding GCS paths.

Args:

local_file_path_to_gcs_path_map (Dict[LocalUri, GcsUri]): A dictionary mapping local file paths to GCS paths.

parallel (bool): Flag indicating whether to upload files in parallel. Defaults to True.
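One way to prepare the mapping is to walk a local directory and mirror its layout under a GCS prefix. The sketch below builds that mapping with plain strings so it runs without GCS access; `build_upload_map` is a hypothetical helper, not part of GiGL. To pass the result to `upload_files_to_gcs`, the keys and values would need to be wrapped in `LocalUri` and `GcsUri`; that these types can be constructed directly from strings is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Dict


def build_upload_map(local_dir: Path, gcs_prefix: str) -> Dict[str, str]:
    """Map every file under local_dir to a path under gcs_prefix (hypothetical helper)."""
    prefix = gcs_prefix.rstrip("/")
    return {
        # Keep the relative layout; as_posix() gives "/" separators on any OS.
        str(path): f"{prefix}/{path.relative_to(local_dir).as_posix()}"
        for path in sorted(local_dir.rglob("*"))
        if path.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("a")
    (root / "sub").mkdir()
    (root / "sub" / "b.txt").write_text("b")

    upload_map = build_upload_map(root, "gs://bucket-name/dir/")
    gcs_targets = sorted(upload_map.values())

    # Assumed usage with GiGL (not executed here):
    # GcsUtils().upload_files_to_gcs(
    #     {LocalUri(src): GcsUri(dst) for src, dst in upload_map.items()},
    #     parallel=True,
    # )
```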

upload_from_filelike(gcs_path: GcsUri, filelike: IO, content_type: str = 'application/octet-stream') -> None

Uploads a file-like object to GCS.

A “filelike” object is one that satisfies the typing.IO interface, i.e. it provides read(), write(), etc. The prototypical example is the object returned by open(), but an io.BytesIO in-memory buffer also satisfies the typing.IO interface and can be used the same way.

Args:

gcs_path (GcsUri): The GCS path to upload the file to.

filelike (IO[AnyStr]): The file-like object to upload.

content_type (str): The content type of the file. Defaults to "application/octet-stream".
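As a rough illustration of the in-memory case, the sketch below serializes a small payload into an `io.BytesIO` buffer that could then be passed as `filelike`. The helper `to_json_buffer` and the payload are hypothetical. The buffer is rewound before use as a precaution, since this page does not say whether the method reads from the current position or from the start. The upload call itself is left as a comment because it needs GCS credentials.

```python
import io
import json


def to_json_buffer(payload: dict) -> io.BytesIO:
    """Serialize payload to UTF-8 JSON in an in-memory, file-like buffer."""
    buffer = io.BytesIO()
    buffer.write(json.dumps(payload, sort_keys=True).encode("utf-8"))
    buffer.seek(0)  # rewind so a reader starts from the first byte
    return buffer


buffer = to_json_buffer({"num_nodes": 3})

# The buffer behaves like a binary file opened with open(..., "rb").
first_read = buffer.read()
buffer.seek(0)  # rewind again before handing the buffer to an uploader

# Assumed usage with GiGL (not executed here):
# GcsUtils().upload_from_filelike(
#     gcs_path=GcsUri("gs://bucket-name/dir/stats.json"),
#     filelike=buffer,
#     content_type="application/json",
# )
```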