gigl.common.data.export

Utility functions for exporting embeddings to Google Cloud Storage and BigQuery.

Note that we use Avro files here because, in our testing, they were quicker to generate and upload than Parquet files.

However, if we switch to an online upload scheme, where embeddings are uploaded as they are generated, we should investigate whether Parquet or ORC files perform better in that mode.

Functions

load_embeddings_to_bigquery

Loads multiple Avro files containing GNN embeddings from GCS into BigQuery.
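
As a rough illustration of what such a load can look like, the sketch below uses the public google-cloud-bigquery client to load every Avro file under a GCS folder into one table. It is not the implementation of load_embeddings_to_bigquery; the helper names avro_wildcard_uri and load_avro_folder, their parameters, and the choice of WRITE_APPEND are assumptions made for this example.

```python
def avro_wildcard_uri(gcs_folder: str) -> str:
    """Return a wildcard URI matching every .avro file directly under a GCS folder.

    Hypothetical helper for this sketch; not part of gigl.
    """
    if not gcs_folder.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URI, got {gcs_folder!r}")
    return gcs_folder.rstrip("/") + "/*.avro"


def load_avro_folder(gcs_folder: str, project_id: str, dataset_id: str, table_id: str):
    """Load all Avro files under gcs_folder into a single BigQuery table.

    Hypothetical function for this sketch; requires google-cloud-bigquery
    and valid Google Cloud credentials.
    """
    # Imported lazily so the URI helper works without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.AVRO,
        # Assumption: append to the table rather than overwrite it.
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(
        avro_wildcard_uri(gcs_folder),
        f"{project_id}.{dataset_id}.{table_id}",
        job_config=job_config,
    )
    load_job.result()  # Block until the load job finishes.
    return load_job
```

A single wildcard URI lets BigQuery ingest all matching files in one load job, which is usually simpler than submitting one job per file.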

Classes

EmbeddingExporter