gigl.src.inference.v1.lib.NodeAnchorBasedLinkPredictionInferenceBlueprint

Bases: BaseInferenceBlueprint[RootedNodeNeighborhood, RootedNodeNeighborhoodBatch]

Concrete NodeAnchorBasedLinkPredictionInferenceBlueprint class that implements the functions needed to correctly compute and save inference results for the NodeAnchorBasedLinkPrediction task.

Implements Generics:

RawSampleType = training_samples_schema_pb2.RootedNodeNeighborhood
BatchType = RootedNodeNeighborhoodBatch

Note that this blueprint does inference on RootedNodeNeighborhood pbs, as these are the full node neighborhoods required for inference. The NodeAnchorBasedLinkPrediction samples contain more information, which is useful for training but unnecessary for inference.

Methods

__init__

get_batch_generator_fn

Returns a Callable specific to the batch type needed for the inference task at hand.

get_emb_table_schema

Returns the schema for the BQ table that will house embeddings.

get_inference_data_tf_record_uri_prefixes

Returns a dictionary mapping each node type to the URI prefixes under which the TFRecord files used for inference can be found.

get_inferer

Returns a function that takes a DigestableBatchType object instance as input and yields TaggedOutputs with tags of either PREDICTION_TAGGED_OUTPUT_KEY or EMBEDDING_TAGGED_OUTPUT_KEY.

get_pred_table_schema

Returns the schema for the BQ table that will house predictions.

get_tf_record_coder

Returns the beam.coders.ProtoCoder used to parse the TFRecords into raw data samples.


get_batch_generator_fn

Returns:

Callable: The function specific to the batch type needed for the inference task at hand.
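A minimal usage sketch, assuming the returned callable collates a list of RootedNodeNeighborhood pbs into a RootedNodeNeighborhoodBatch; the batch sizes, transform names, and the existence of a blueprint instance and a raw_samples PCollection are assumptions for illustration, not part of this API:

    import apache_beam as beam

    # Assumes `blueprint` is an instantiated NodeAnchorBasedLinkPredictionInferenceBlueprint
    # and `raw_samples` is a PCollection of RootedNodeNeighborhood pbs.
    batch_fn = blueprint.get_batch_generator_fn()

    batches = (
        raw_samples
        | "BatchElements" >> beam.BatchElements(min_batch_size=32, max_batch_size=512)
        | "CollateBatch" >> beam.Map(batch_fn)  # assumed to yield RootedNodeNeighborhoodBatch objects
    )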

get_emb_table_schema

Returns the schema for the BQ table that will house embeddings.

Returns:

InferenceOutputBQTableSchema: Instance containing the schema and registered node field.
See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema

Example schema:

    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]
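As a rough illustration only (not the library's actual wiring), the dict-style example above has the same shape that Beam's BigQuery sink accepts as a schema; the table spec below is hypothetical:

    import apache_beam as beam

    # Mirrors the example schema above; field names are illustrative.
    emb_schema = {
        "fields": [
            {"name": "source", "type": "STRING", "mode": "NULLABLE"},
            {"name": "quote", "type": "STRING", "mode": "REQUIRED"},
        ]
    }

    write_embeddings = beam.io.WriteToBigQuery(
        table="my_project:my_dataset.embeddings",  # hypothetical table spec
        schema=emb_schema,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    )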

get_inference_data_tf_record_uri_prefixes

Returns:

Dict[NodeType, List[Uri]]: Dictionary mapping each node type to the list of URI prefixes under which the TFRecord files used for inference can be found.
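For intuition only, the returned mapping has roughly this shape; the node types and GCS paths are purely hypothetical, and the NodeType keys and Uri entries are shown as plain strings for readability:

    # Hypothetical shape of the return value (Dict[NodeType, List[Uri]]).
    uri_prefixes = {
        "user": ["gs://example-bucket/inference/user/samples/"],
        "item": ["gs://example-bucket/inference/item/samples/"],
    }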

get_inferer

Returns a function that takes a DigestableBatchType object instance as input and yields TaggedOutputs with tags of either PREDICTION_TAGGED_OUTPUT_KEY or EMBEDDING_TAGGED_OUTPUT_KEY. The value is a Dict that can be written directly to BQ, following the schema defined in get_emb_table_schema for outputs tagged EMBEDDING_TAGGED_OUTPUT_KEY and in get_pred_table_schema for outputs tagged PREDICTION_TAGGED_OUTPUT_KEY.

For example, the following will be mapped to the predictions table:

    pvalue.TaggedOutput(
        PREDICTION_TAGGED_OUTPUT_KEY,
        {
            'source': 'Mahatma Gandhi',
            'quote': 'My life is my message.'
        }
    )

Note that the output follows the schema presented in get_pred_table_schema.
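A minimal sketch of the kind of function get_inferer returns, assuming a hypothetical model object whose infer_batch call produces per-node embeddings and scores; the model call and the row field names are assumptions and, in practice, the dicts must match the schemas from get_emb_table_schema and get_pred_table_schema:

    from typing import Iterator

    from apache_beam import pvalue

    def example_inferer(batch) -> Iterator[pvalue.TaggedOutput]:
        # `model.infer_batch` and the row fields below are illustrative assumptions.
        for node_id, embedding, scores in model.infer_batch(batch):
            yield pvalue.TaggedOutput(
                EMBEDDING_TAGGED_OUTPUT_KEY,
                {"node_id": node_id, "emb": list(embedding)},  # must match get_emb_table_schema
            )
            yield pvalue.TaggedOutput(
                PREDICTION_TAGGED_OUTPUT_KEY,
                {"node_id": node_id, "scores": list(scores)},  # must match get_pred_table_schema
            )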

get_pred_table_schema

Returns the schema for the BQ table that will house predictions.

Returns:

InferenceOutputBQTableSchema: Instance containing the schema and registered node field.
See: https://beam.apache.org/documentation/io/built-in/google-bigquery/#creating-a-table-schema

Example schema:

    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'}
    ]
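Putting the pieces together, a hedged sketch of how the two tagged streams could be split and written against the two schemas; the pipeline wiring, table specs, and the emb_schema_dict / pred_schema_dict variables (filled from the schemas above) are assumptions, not the library's actual inference pipeline:

    import apache_beam as beam

    inferer_fn = blueprint.get_inferer()

    # Split the inferer's TaggedOutputs into one stream per BQ table.
    tagged = batches | "RunInference" >> beam.FlatMap(inferer_fn).with_outputs(
        EMBEDDING_TAGGED_OUTPUT_KEY, PREDICTION_TAGGED_OUTPUT_KEY
    )

    _ = tagged[EMBEDDING_TAGGED_OUTPUT_KEY] | "WriteEmbeddings" >> beam.io.WriteToBigQuery(
        table="my_project:my_dataset.embeddings", schema=emb_schema_dict  # hypothetical
    )
    _ = tagged[PREDICTION_TAGGED_OUTPUT_KEY] | "WritePredictions" >> beam.io.WriteToBigQuery(
        table="my_project:my_dataset.predictions", schema=pred_schema_dict  # hypothetical
    )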

get_tf_record_coder

Returns:

beam.coders.ProtoCoder: The coder used to parse the TFRecords to raw data samples of type RawSampleType.
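For reference, such a coder can be constructed for the RawSampleType and plugged into a TFRecord read roughly as follows; the proto module path, file pattern, and the use of ReadFromTFRecord here are assumptions for illustration:

    import apache_beam as beam

    # Module path for the generated proto is an assumption; import
    # training_samples_schema_pb2 from wherever it lives in your build.
    from snapchat.research.gbml import training_samples_schema_pb2

    coder = beam.coders.ProtoCoder(training_samples_schema_pb2.RootedNodeNeighborhood)

    raw_samples = pipeline | "ReadTFRecords" >> beam.io.ReadFromTFRecord(
        file_pattern="gs://example-bucket/inference/user/samples/*.tfrecord",  # hypothetical
        coder=coder,
    )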