# Model Inference

The Inferencer component is responsible for running inference of a trained model on samples generated by the Subgraph Sampler component. At a high level, it works by applying a trained model in an embarrassingly parallel and distributed fashion across these samples, and persisting the output embeddings and/or predictions.

## Input

- **job_name** (AppliedTaskIdentifier): Uniquely identifies an end-to-end task.
- **task_config_uri** (Uri): Path which points to a "template" `GbmlConfig` proto yaml file.
- **resource_config_uri** (Uri): Path which points to a `GiGLResourceConfig` yaml file.
- **Optional: custom_worker_image_uri**: URI of a Docker image to be used as the Dataflow worker harness image.

## What does it do?

The Inferencer undertakes the following actions:

- Reads the frozen `GbmlConfig` proto yaml. This proto contains a pointer to a class instance which implements the `BaseInferencer` protocol (see the `inferencerClsPath` field of `inferencerConfig` in `GbmlConfig`). This class houses the logic which dictates how to run inference for a batch of samples (see `infer_batch` in [modeling task spec](../../python/gigl/src/common/modeling_task_specs/node_anchor_based_link_prediction_modeling_task_spec.py)); the types of these samples are determined by the `taskMetadata` in the frozen `GbmlConfig`. Custom arguments can also be passed into the class instance by including them in the `inferencerArgs` field inside the `inferencerConfig` section of `GbmlConfig`. Several standard configurations of this instance are already implemented at the GiGL platform level; for example, the `NodeAnchorBasedLinkPredictionModelingTaskSpec` instance referenced in the sample frozen `GbmlConfig` can be used with no or minimal changes for other node anchor-based link prediction tasks.
- Reads the trained model asset from the `trainedModelUri` field in the `sharedConfig.trainedModelMetadata` section of the frozen `GbmlConfig`, and uses it to initialize the `BaseInferencer` class instance above.
- Instantiates a Dataflow job to read samples produced by the Subgraph Sampler component, which are stored at URIs referenced inside the `sharedConfig.flattenedGraphMetadata` section of the frozen `GbmlConfig`. Note that depending on the `taskMetadata` in the `GbmlConfig`, the URIs will be housed under different keys in this section. Upon reading the outputs from Subgraph Sampler, the pipeline follows logic housed in a platform-level `BaseInferenceBlueprint` class, which decodes and collates individual samples into batches, and then runs the inference logic specified in `infer_batch` of the `BaseInferencer` class instance referenced above. Subsequently, the pipeline writes the embeddings and/or predictions (in classification scenarios) out to BigQuery.
- Finally, the component "un-enumerates" all the assets in BigQuery (to revert the "enumeration" conducted by the Data Preprocessor component).

## How do I run it?
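Before running, the "template" `GbmlConfig` that `task_config_uri` points to should name the inferencer class (and any custom arguments) described above. The snippet below is only an illustrative sketch assuming the field names described on this page; the class path and argument values are placeholders, and fields such as `trainedModelUri` and `embeddingsBqPath` live under `sharedConfig` in the frozen config rather than here.

```yaml
# Illustrative sketch of the inferencer-related section of a GbmlConfig yaml.
# Field names follow the description above; values are placeholders.
inferencerConfig:
  inferencerClsPath: gigl.src.common.modeling_task_specs.node_anchor_based_link_prediction_modeling_task_spec.NodeAnchorBasedLinkPredictionModelingTaskSpec
  inferencerArgs:
    # Custom arguments forwarded to the class instance (hypothetical example):
    batch_size: "512"
```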
**Import GiGL**

```python
from gigl.src.inference.v1.gnn_inferencer import InferencerV1
from gigl.common import UriFactory
from gigl.src.common.types import AppliedTaskIdentifier

inferencer = InferencerV1()
inferencer.run(
    applied_task_identifier=AppliedTaskIdentifier("my_gigl_job_name"),
    task_config_uri=UriFactory.create_uri("gs://my-temp-assets-bucket/task_config.yaml"),
    resource_config_uri=UriFactory.create_uri("gs://my-temp-assets-bucket/resource_config.yaml"),
    custom_worker_image_uri="gcr.io/project/directory/dataflow_image:x.x.x",  # Optional
)
```

**Command Line**

```
python -m gigl.src.inference.v1.gnn_inferencer \
  --job_name my_gigl_job_name \
  --task_config_uri "gs://my-temp-assets-bucket/task_config.yaml" \
  --resource_config_uri="gs://my-temp-assets-bucket/resource_config.yaml"
```

## Output

The Inferencer outputs embedding and/or prediction assets, based on the `taskMetadata` in the frozen `GbmlConfig`. Specifically, for node anchor-based link prediction tasks, as in the sample MAU config, the embeddings are written to the BigQuery table specified at the `embeddingsBqPath` field in the `sharedConfig.inferenceMetadata` section.

## Custom Usage

None of the logic in this component should require changes for currently supported tasks; the inference logic specified in the provided `NodeAnchorBasedLinkPredictionModelingTaskSpec` is fairly standard. However, for custom tasks that are not yet supported, you may implement `infer_batch` in your own class that satisfies the `BaseInferencer` protocol (a rough sketch of what such a class could look like is included at the end of this page).

## Other

- **Design:** Currently, all inference happens on CPU. This is because the component can be scaled easily by adding more worker machines in Dataflow, and CPU compute is cheap. Dataflow does support GPU instances, but they appear to require more care and attention to monitor utilization, given the cost implications and limited benefits.
- **Debugging:** The core logic of this component executes in Dataflow. A link to the Dataflow job is printed in the component's logs; it can be used to navigate to the Dataflow console and see fine-grained logging of the Dataflow pipeline.
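As a companion to the Custom Usage note above, here is a minimal, illustrative sketch of the shape such a custom class could take. It is not the exact GiGL interface: copy the precise `infer_batch` signature (batch and return types) from the `BaseInferencer` protocol or from the provided `NodeAnchorBasedLinkPredictionModelingTaskSpec`; the class and method names other than `infer_batch` are assumptions made here for illustration.

```python
import torch
import torch.nn as nn


class MyCustomInferencer:
    """Illustrative sketch of a custom inferencer spec (not the exact GiGL interface)."""

    def __init__(self, model: nn.Module):
        # Assumption: the spec holds the trained model that the Inferencer
        # component loads from `trainedModelUri` in the frozen `GbmlConfig`.
        self.model = model
        self.model.eval()

    @torch.no_grad()
    def infer_batch(self, batch):
        # `batch` is a collated batch of Subgraph Sampler samples, assembled by
        # the platform-level `BaseInferenceBlueprint`.
        embeddings = self.model(batch)  # illustrative forward pass
        return embeddings  # and/or predictions, for classification scenarios
```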