gigl.src.inference.v1.lib.BaseInferenceBlueprint
- class gigl.src.inference.v1.lib.base_inference_blueprint.BaseInferenceBlueprint(inferencer: BaseInferencer)
Bases: ABC, Generic[RawSampleType, BatchType]
Abstract base class that inference dataflow pipelines must implement in order to correctly compute and save inference results for GBML tasks such as Supervised Node Classification, Node Anchor-Based Link Prediction, Supervised Link-Based Task Split, etc.
Generic type parameters:
- RawSampleType: The raw sample type produced by parsing TFRecords with the coder returned by get_tf_record_coder.
- BatchType: The batch type needed for model inference (forward pass) for the specific task at hand (e.g. RootedNodeNeighborhoodBatch).
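The subclassing pattern described above can be sketched without GiGL installed. The following is a minimal stand-in, not the library's implementation: BlueprintSketch and ToyBlueprint are hypothetical names, the node type "user" and the gs:// prefix are made up for illustration, and the real get_tf_record_coder returns a beam ProtoCoder rather than None.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Generic, List, TypeVar

RawSampleType = TypeVar("RawSampleType")
BatchType = TypeVar("BatchType")


class BlueprintSketch(ABC, Generic[RawSampleType, BatchType]):
    """Hypothetical stand-in mirroring the three abstract methods on this page."""

    def __init__(self, inferencer: object) -> None:
        self.inferencer = inferencer

    @abstractmethod
    def get_inference_data_tf_record_uri_prefixes(self) -> Dict[str, List[str]]:
        ...

    @abstractmethod
    def get_tf_record_coder(self) -> object:
        ...

    @abstractmethod
    def get_batch_generator_fn(self) -> Callable:
        ...


class ToyBlueprint(BlueprintSketch[dict, list]):
    """Hypothetical subclass: raw samples are dicts, batches are lists of dicts."""

    def get_inference_data_tf_record_uri_prefixes(self) -> Dict[str, List[str]]:
        # Made-up node type and URI prefix, for illustration only.
        return {"user": ["gs://example-bucket/inference/user/"]}

    def get_tf_record_coder(self) -> object:
        # Placeholder; the real method returns a beam ProtoCoder.
        return None

    def get_batch_generator_fn(self) -> Callable:
        def to_batch(samples):
            # Collect raw samples into a single list-shaped batch.
            return list(samples)

        return to_batch
```

Because all three methods are abstract, instantiating the base class directly raises TypeError, while a subclass that implements all of them (such as ToyBlueprint(inferencer=None)) can be constructed normally.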
- __init__(inferencer: BaseInferencer)
- classmethod __init_subclass__(*args, **kwargs)
This method is called when a class is subclassed.
The default implementation does nothing. It may be overridden to extend subclasses.
- __weakref__
list of weak references to the object (if defined)
- abstract get_batch_generator_fn() -> Callable
- Returns:
Callable: The function specific to the batch type needed for the inference task at hand.
- static get_emb_table_schema(should_run_unenumeration: bool = False) -> InferenceOutputBigqueryTableSchema
Returns the schema for the BQ table that will house embeddings.
Returns:
InferenceOutputBigqueryTableSchema: Instance containing the schema and registered node field. See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema
Example schema:
    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]
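To illustrate how a schema of this shape relates to the rows written to BQ, here is a small sketch using plain dicts. It reuses the example fields from the docstring above; the helper missing_required_fields is hypothetical and is not part of GiGL or Beam.

```python
from typing import Any, Dict, List

# Same field list as the example schema in the docstring.
example_schema: Dict[str, List[Dict[str, str]]] = {
    "fields": [
        {"name": "source", "type": "STRING", "mode": "NULLABLE"},
        {"name": "quote", "type": "STRING", "mode": "REQUIRED"},
    ]
}


def missing_required_fields(
    schema: Dict[str, List[Dict[str, str]]], row: Dict[str, Any]
) -> List[str]:
    """Return names of REQUIRED fields that are absent or None in the row."""
    return [
        field["name"]
        for field in schema["fields"]
        if field["mode"] == "REQUIRED" and row.get(field["name"]) is None
    ]
```

A row such as {"quote": "My life is my message."} satisfies this schema because only "quote" is REQUIRED; a row containing only "source" would be reported as missing "quote".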
- abstract get_inference_data_tf_record_uri_prefixes() -> Dict[NodeType, List[Uri]]
- Returns:
Dict[NodeType, List[Uri]]: Mapping from each node type to the list of URI prefixes under which the TFRecord files used for inference can be found.
- get_inferer() -> Callable[[BatchType], Iterable[TaggedOutput]]
Returns a function that takes a BatchType instance as input and yields TaggedOutputs tagged with either PREDICTION_TAGGED_OUTPUT_KEY or EMBEDDING_TAGGED_OUTPUT_KEY. Each value is a Dict that can be written directly to BQ, following the schema defined in get_emb_table_schema for outputs with tag "embeddings" and in get_pred_table_schema for outputs with tag PREDICTION_TAGGED_OUTPUT_KEY.
For example, the following will be mapped to the predictions table:
    pvalue.TaggedOutput(
        PREDICTION_TAGGED_OUTPUT_KEY,
        {'source': 'Mahatma Gandhi', 'quote': 'My life is my message.'}
    )
Note that the output follows the schema presented in get_pred_table_schema.
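The shape of such an inferer function can be sketched as a generator. This is an illustration under stated assumptions, not GiGL's implementation: TaggedOutput below is a stand-in dataclass for apache_beam.pvalue.TaggedOutput so the sketch runs without Beam, the two key values are assumed, and make_inferer, toy_model, and the field names node_id, emb, and pred are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List


@dataclass
class TaggedOutput:
    """Stand-in for apache_beam.pvalue.TaggedOutput (tag plus value)."""

    tag: str
    value: Dict[str, Any]


# Assumed values; the real constants are defined in GiGL.
PREDICTION_TAGGED_OUTPUT_KEY = "predictions"
EMBEDDING_TAGGED_OUTPUT_KEY = "embeddings"

Sample = Dict[str, Any]


def make_inferer(
    model: Callable[[Sample], Dict[str, Any]],
) -> Callable[[List[Sample]], Iterator[TaggedOutput]]:
    """Build a function that maps one batch to a stream of tagged rows."""

    def infer(batch: List[Sample]) -> Iterator[TaggedOutput]:
        for sample in batch:
            result = model(sample)
            # One row for the embeddings table.
            yield TaggedOutput(
                EMBEDDING_TAGGED_OUTPUT_KEY,
                {"node_id": sample["node_id"], "emb": result["emb"]},
            )
            # One row for the predictions table.
            yield TaggedOutput(
                PREDICTION_TAGGED_OUTPUT_KEY,
                {"node_id": sample["node_id"], "pred": result["pred"]},
            )

    return infer


def toy_model(sample: Sample) -> Dict[str, Any]:
    """Hypothetical model: a fixed-size embedding and a parity prediction."""
    return {"emb": [float(sample["node_id"]), 0.0], "pred": sample["node_id"] % 2}
```

In a Beam pipeline, the tags are what allow downstream steps to route each row to the embeddings table or the predictions table; in this sketch they are simply carried on the stand-in objects.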
- static get_pred_table_schema(should_run_unenumeration: bool = False) -> InferenceOutputBigqueryTableSchema
Returns the schema for the BQ table that will house predictions.
Returns:
InferenceOutputBigqueryTableSchema: Instance containing the schema and registered node field. See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema
Example schema:
    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]
- abstract get_tf_record_coder() -> ProtoCoder
- Returns:
beam.coders.ProtoCoder: The coder used to parse the TFRecords into raw data samples of type RawSampleType.