gigl.src.inference.v1.lib.BaseInferenceBlueprint

class gigl.src.inference.v1.lib.base_inference_blueprint.BaseInferenceBlueprint(inferencer: BaseInferencer)

Bases: ABC, Generic[RawSampleType, BatchType]

Abstract base class that must be implemented so that inference Dataflow pipelines can correctly compute and save inference results for GBML tasks such as Supervised Node Classification, Node Anchor-Based Link Prediction, and Supervised Link-Based Task Split.

Generic type parameters:

- RawSampleType: The raw sample type parsed by the coder returned from get_tf_record_coder.
- BatchType: The batch type needed for model inference (forward pass) for the specific task at hand (e.g. RootedNodeNeighborhoodBatch).
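As a rough illustration of the contract described above, the sketch below uses a hypothetical stand-in base class (BlueprintSketch, which is not part of GiGL) that mirrors only the three abstract methods documented on this page. The node type name, the gs:// prefix, and the placeholder coder are invented for the example; a real subclass would extend BaseInferenceBlueprint and return a beam.coders.ProtoCoder.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Generic, List, TypeVar

RawSampleType = TypeVar("RawSampleType")
BatchType = TypeVar("BatchType")


class BlueprintSketch(ABC, Generic[RawSampleType, BatchType]):
    """Hypothetical stand-in that mirrors the abstract methods listed on this page."""

    def __init__(self, inferencer: object) -> None:
        self.inferencer = inferencer

    @abstractmethod
    def get_inference_data_tf_record_uri_prefixes(self) -> Dict[str, List[str]]:
        ...

    @abstractmethod
    def get_tf_record_coder(self) -> object:
        ...

    @abstractmethod
    def get_batch_generator_fn(self) -> Callable:
        ...


class ToyBlueprint(BlueprintSketch[bytes, List[bytes]]):
    """Toy subclass: raw samples are bytes, a batch is a list of raw samples."""

    def get_inference_data_tf_record_uri_prefixes(self) -> Dict[str, List[str]]:
        # Made-up node type and URI prefix, for illustration only.
        return {"user": ["gs://example-bucket/inference/user/"]}

    def get_tf_record_coder(self) -> object:
        # Placeholder; the documented return type is a ProtoCoder.
        return bytes

    def get_batch_generator_fn(self) -> Callable:
        # Puts every sample it receives into a single batch.
        return lambda samples: [list(samples)]


blueprint = ToyBlueprint(inferencer=None)
print(blueprint.get_inference_data_tf_record_uri_prefixes())
```

Because all three methods are abstract, instantiating the base class directly (or a subclass that omits one of them) raises TypeError.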

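Methods

- __init__
- get_batch_generator_fn: Returns the function specific to the batch type needed for the inference task at hand.
- get_emb_table_schema: Returns the schema for the BQ table that will house embeddings.
- get_inference_data_tf_record_uri_prefixes: Returns a dictionary of node type to the list of URI prefixes where the TFRecord files used for inference are found.
- get_inferer: Returns a function that takes a BatchType instance as input and yields TaggedOutputs with tags of either PREDICTION_TAGGED_OUTPUT_KEY or EMBEDDING_TAGGED_OUTPUT_KEY.
- get_pred_table_schema: Returns the schema for the BQ table that will house predictions.
- get_tf_record_coder: Returns the coder used to parse the TFRecords into raw data samples of type RawSampleType.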

__init__(inferencer: BaseInferencer)

classmethod __init_subclass__(*args, **kwargs)

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__weakref__: list of weak references to the object (if defined)

abstract get_batch_generator_fn() → Callable

Returns: Callable: The function specific to the batch type needed for the inference task at hand.
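The source does not specify the signature of the returned callable, so the following is only a minimal sketch under assumptions: a hypothetical factory, make_batch_generator_fn, returns a function that groups an iterable of raw samples into fixed-size lists. A real implementation would build task-specific batch objects (for example RootedNodeNeighborhoodBatch) rather than plain lists.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def make_batch_generator_fn(batch_size: int) -> Callable[[Iterable[T]], Iterator[List[T]]]:
    """Hypothetical factory: returns a function that chunks samples into batches."""

    def generate_batches(samples: Iterable[T]) -> Iterator[List[T]]:
        iterator = iter(samples)
        while True:
            # Pull up to batch_size samples; the last batch may be smaller.
            batch = list(islice(iterator, batch_size))
            if not batch:
                return
            yield batch

    return generate_batches


batch_fn = make_batch_generator_fn(batch_size=2)
print(list(batch_fn(range(5))))  # [[0, 1], [2, 3], [4]]
```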

static get_emb_table_schema(should_run_unenumeration: bool = False) → InferenceOutputBigqueryTableSchema

Returns the schema for the BQ table that will house embeddings.

Returns: InferenceOutputBigqueryTableSchema: Instance containing the schema and registered node field. See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema

Example schema:

    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]

abstract get_inference_data_tf_record_uri_prefixes() → Dict[NodeType, List[Uri]]

Returns: Dict[NodeType, List[Uri]]: Dictionary of node type to the list of URI prefixes where the TFRecord files used for inference are found.

get_inferer() → Callable[[BatchType], Iterable[TaggedOutput]]

Returns a function that takes a BatchType instance as input and yields TaggedOutputs with tags of either PREDICTION_TAGGED_OUTPUT_KEY or EMBEDDING_TAGGED_OUTPUT_KEY. The value is a Dict that can be written directly to BQ, following the schema defined in get_emb_table_schema for outputs tagged EMBEDDING_TAGGED_OUTPUT_KEY and the schema defined in get_pred_table_schema for outputs tagged PREDICTION_TAGGED_OUTPUT_KEY.

For example, the following will be mapped to the predictions table:

    pvalue.TaggedOutput(
        PREDICTION_TAGGED_OUTPUT_KEY,
        {
            'source': 'Mahatma Gandhi',
            'quote': 'My life is my message.'
        }
    )

Note that the output follows the schema presented in get_pred_table_schema.
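To make the tagging behaviour described above more concrete, here is a minimal sketch under stated assumptions. It does not import Beam or GiGL: TaggedOutput is a small stand-in for apache_beam.pvalue.TaggedOutput, the string values of the two tag constants are assumed, and make_inferer, model_fn, and the row field names (node_id, emb, pred) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List, NamedTuple, Tuple

# Assumed values; the real constants are defined by the library.
PREDICTION_TAGGED_OUTPUT_KEY = "predictions"
EMBEDDING_TAGGED_OUTPUT_KEY = "embeddings"


class TaggedOutput(NamedTuple):
    """Stand-in for apache_beam.pvalue.TaggedOutput (a tag plus a value)."""

    tag: str
    value: Dict[str, object]


Batch = Iterable[Tuple[int, List[float]]]  # hypothetical batch: (node_id, features) pairs


def make_inferer(
    model_fn: Callable[[List[float]], Tuple[List[float], bool]],
) -> Callable[[Batch], Iterator[TaggedOutput]]:
    """Returns a function that yields one embedding row and one prediction row per node."""

    def infer(batch: Batch) -> Iterator[TaggedOutput]:
        for node_id, features in batch:
            embedding, prediction = model_fn(features)
            # Each value is a plain dict shaped like a row of the matching output table.
            yield TaggedOutput(EMBEDDING_TAGGED_OUTPUT_KEY, {"node_id": node_id, "emb": embedding})
            yield TaggedOutput(PREDICTION_TAGGED_OUTPUT_KEY, {"node_id": node_id, "pred": prediction})

    return infer


# Toy "model": the embedding is the feature sum, the prediction is a threshold on it.
infer = make_inferer(lambda feats: ([sum(feats)], sum(feats) > 1))
for output in infer([(7, [1.0, 2.0])]):
    print(output.tag, output.value)
```

In a Beam pipeline, outputs tagged this way can be split into separate collections and routed to the embeddings and predictions tables respectively.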

static get_pred_table_schema(should_run_unenumeration: bool = False) → InferenceOutputBigqueryTableSchema

Returns the schema for the BQ table that will house predictions.

Returns: InferenceOutputBigqueryTableSchema: Instance containing the schema and registered node field. See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema

Example schema:

    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]
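The example schema marks one field NULLABLE and one REQUIRED. As a small illustration of what that implies for the rows an inferer emits, the sketch below checks a row dict against the example schema before it would be written. The helper missing_required_fields is hypothetical and not part of GiGL or Beam.

```python
from typing import Any, Dict, List

# The example schema from this page, written as a plain Python dict.
schema: Dict[str, List[Dict[str, str]]] = {
    "fields": [
        {"name": "source", "type": "STRING", "mode": "NULLABLE"},
        {"name": "quote", "type": "STRING", "mode": "REQUIRED"},
    ]
}


def missing_required_fields(row: Dict[str, Any], schema: Dict[str, List[Dict[str, str]]]) -> List[str]:
    """Hypothetical helper: names of REQUIRED fields that are absent or None in the row."""
    return [
        field["name"]
        for field in schema["fields"]
        if field["mode"] == "REQUIRED" and row.get(field["name"]) is None
    ]


good_row = {"source": "Mahatma Gandhi", "quote": "My life is my message."}
bad_row = {"source": "Mahatma Gandhi"}

print(missing_required_fields(good_row, schema))  # []
print(missing_required_fields(bad_row, schema))   # ['quote']
```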

abstract get_tf_record_coder() → ProtoCoder

Returns: beam.coders.ProtoCoder: The coder used to parse the TFRecords to raw data samples of type RawSampleType.
