gigl.common.utils.GcsUtils
- class gigl.common.utils.gcs.GcsUtils(project: str | None = None)
Bases: object
Utility class for interacting with Google Cloud Storage (GCS).
Methods
__init__: Initialize the GcsUtils instance.
add_bucket_lifecycle_rule_with_prefix
close_upload_delete_and_push_to_gcs
copy_gcs_path
count_blobs_in_gcs_path
delete_files
delete_files_in_bucket_dir
delete_gcs_file_if_exist
does_gcs_file_exist
download_file_from_gcs
download_file_from_gcs_to_temp_file
download_files_from_gcs_paths_to_local_dir
download_files_from_gcs_paths_to_local_paths: Downloads files from GCS path to local path.
get_bucket_and_blob_path_from_gcs_path
list_uris_with_gcs_path_pattern: List GCS URIs with a given suffix or pattern.
read_from_gcs
upload_files_to_gcs: Upload files from local paths to their subsequent provided GCS paths.
upload_from_filelike: Uploads a file-like object to GCS.
upload_from_string
- __init__(project: str | None = None) -> None
Initialize the GcsUtils instance.
- Args:
project (Optional[str]): The GCP project ID. Defaults to None.
- __weakref__
list of weak references to the object (if defined)
- download_files_from_gcs_paths_to_local_paths(file_map: Dict[GcsUri, LocalUri])
Downloads files from GCS paths to local paths.
- Args:
file_map (Dict[GcsUri, LocalUri]): Mapping of GCS path -> local path.
- list_uris_with_gcs_path_pattern(gcs_path: GcsUri, suffix: str | None = None, pattern: str | None = None) -> List[GcsUri]
List GCS URIs with a given suffix or pattern.
Ex: given the following objects:
    gs://bucket-name/dir/file1.txt
    gs://bucket-name/dir/foo.txt
    gs://bucket-name/dir/file.json
list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, suffix=".txt") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/foo.txt]
list_uris_with_gcs_path_pattern(gcs_path=gs://bucket-name/dir, pattern="file.*") -> [gs://bucket-name/dir/file1.txt, gs://bucket-name/dir/file.json]
- Args:
gcs_path (GcsUri): The GCS path to list URIs from.
suffix (Optional[str]): The suffix to filter URIs by. If None (the default), no filtering on suffix is done.
pattern (Optional[str]): The regex to filter URIs by. If None (the default), no filtering on the pattern is done.
- Returns:
List[GcsUri]: A list of GCS URIs that match the given suffix or pattern.
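The filtering semantics described above can be sketched on plain strings, without touching GCS. This is only an illustration, not the library's implementation: `filter_uris` is a hypothetical helper, and it assumes that the regex is applied with `re.search` against the full URI and that, when both `suffix` and `pattern` are given, a URI must satisfy both. Neither assumption is confirmed by the documentation above, although both reproduce its two examples.

```python
import re
from typing import List, Optional


def filter_uris(
    uris: List[str],
    suffix: Optional[str] = None,
    pattern: Optional[str] = None,
) -> List[str]:
    """Hypothetical stand-in for the documented suffix/regex filtering."""
    # Assumption: the regex is searched anywhere in the full URI string.
    regex = re.compile(pattern) if pattern is not None else None
    matched = []
    for uri in uris:
        # A filter set to None is skipped entirely, as the Args section describes.
        if suffix is not None and not uri.endswith(suffix):
            continue
        if regex is not None and regex.search(uri) is None:
            continue
        matched.append(uri)
    return matched


uris = [
    "gs://bucket-name/dir/file1.txt",
    "gs://bucket-name/dir/foo.txt",
    "gs://bucket-name/dir/file.json",
]
txt_uris = filter_uris(uris, suffix=".txt")
file_uris = filter_uris(uris, pattern="file.*")
both = filter_uris(uris, suffix=".txt", pattern="file.*")
```

Note that `pattern` is a regular expression rather than a shell glob, so `file.*` means "file" followed by any characters.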
- upload_files_to_gcs(local_file_path_to_gcs_path_map: Dict[LocalUri, GcsUri], parallel: bool = True) -> None
Upload files from local paths to their corresponding GCS paths.
- Args:
local_file_path_to_gcs_path_map (Dict[LocalUri, GcsUri]): A dictionary mapping local file paths to GCS paths.
parallel (bool): Flag indicating whether to upload files in parallel. Defaults to True.
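The upload map runs in the opposite direction from the map taken by download_files_from_gcs_paths_to_local_paths (local -> GCS here, GCS -> local there). A minimal sketch of deriving one from the other is shown below. It uses plain strings in place of LocalUri and GcsUri, and `invert_path_map` is a hypothetical helper, not part of GcsUtils.

```python
from typing import Dict


def invert_path_map(path_map: Dict[str, str]) -> Dict[str, str]:
    """Swap keys and values, refusing maps where two sources share a destination."""
    inverted: Dict[str, str] = {}
    for source, destination in path_map.items():
        if destination in inverted:
            raise ValueError(f"duplicate destination: {destination}")
        inverted[destination] = source
    return inverted


# Plain strings stand in for LocalUri keys and GcsUri values.
upload_map = {
    "/tmp/data/a.csv": "gs://bucket-name/data/a.csv",
    "/tmp/data/b.csv": "gs://bucket-name/data/b.csv",
}
# Same pairs, keyed by GCS path, in the shape the download method expects.
download_map = invert_path_map(upload_map)
```

With real LocalUri and GcsUri objects the same inversion applies, provided both types are hashable, which their use as dictionary keys in the signatures above suggests.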
- upload_from_filelike(gcs_path: GcsUri, filelike: IO, content_type: str = 'application/octet-stream') -> None
Uploads a file-like object to GCS.
A "filelike" object is one that satisfies the typing.IO interface, i.e., it provides read(), write(), etc. The prototypical example is the object returned by open(), but an io.BytesIO in-memory buffer also satisfies the typing.IO interface.
- Args:
gcs_path (GcsUri): The GCS path to upload the file to.
filelike (IO[AnyStr]): The file-like object to upload.
content_type (str): The content type of the file. Defaults to "application/octet-stream".
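As a rough illustration of the in-memory buffer case, the sketch below serializes a record into an io.BytesIO and passes it to upload_from_filelike using only the documented keyword arguments. `to_json_buffer` and `upload_record` are hypothetical helpers. The documentation does not say whether the method rewinds the buffer, so the sketch assumes it does not and calls seek(0) itself. The `gcs_utils` and `gcs_path` arguments are expected to be an existing GcsUtils instance and a GcsUri.

```python
import io
import json
from typing import Any, Dict


def to_json_buffer(record: Dict[str, Any]) -> io.BytesIO:
    """Serialize a record into an in-memory, file-like buffer."""
    buffer = io.BytesIO()
    buffer.write(json.dumps(record, sort_keys=True).encode("utf-8"))
    # After write() the position is at the end; rewind so a reader sees the data.
    buffer.seek(0)
    return buffer


def upload_record(gcs_utils: Any, gcs_path: Any, record: Dict[str, Any]) -> None:
    """Hypothetical wrapper around the documented upload_from_filelike call."""
    gcs_utils.upload_from_filelike(
        gcs_path=gcs_path,
        filelike=to_json_buffer(record),
        content_type="application/json",
    )


buffer = to_json_buffer({"b": 2, "a": 1})
payload = buffer.read()
```

Overriding content_type is optional. Leaving it out keeps the documented default of "application/octet-stream".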