gigl.src.data_preprocessor.lib.ingest.DataReference#

class gigl.src.data_preprocessor.lib.ingest.reference.DataReference(reference_uri: str)#

Bases: ABC

Holds a URI string that points to the underlying data, and provides a means of yielding instance dicts via a Beam PTransform.

A single DataReference is currently assumed to hold data for exactly one node type or one edge type; it cannot currently house mixed-type data.

Methods

__init__

yield_instance_dict_ptransform

Returns a PTransform whose expand method returns a PCollection of InstanceDicts, which can be subsequently ingested and transformed via TensorFlow Transform.

__delattr__(name)#

Implement delattr(self, name).

__eq__(other)#

Return self==value.

__hash__()#

Return hash(self).

__init__(reference_uri: str) → None#

__repr__()#

Return repr(self).

__setattr__(name, value)#

Implement setattr(self, name, value).

__weakref__#

list of weak references to the object (if defined)

abstract yield_instance_dict_ptransform(*args, **kwargs) → InstanceDictPTransform#

Returns a PTransform whose expand method returns a PCollection of InstanceDicts, which can be subsequently ingested and transformed via TensorFlow Transform.

TODO: Extend to support multiple edge types in the same table.
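The abstract-method contract described above can be illustrated with a minimal, self-contained sketch. It does not import GiGL or Apache Beam: `DataReferenceSketch`, `InstanceDictPTransformSketch`, `InMemoryNodeReference`, the `InstanceDict` alias, and the `memory://users` URI are all hypothetical stand-ins introduced here, and a plain list stands in for a PCollection. Only the overall shape (a URI held by the base class plus one abstract method that returns an object with an expand method) is taken from the documentation above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List

# Assumption: an instance dict is one record, mapping column name -> value.
InstanceDict = Dict[str, Any]


class InstanceDictPTransformSketch:
    """Stand-in for a Beam PTransform; only the expand() contract is mimicked."""

    def __init__(self, rows: Iterable[InstanceDict]) -> None:
        self._rows = list(rows)

    def expand(self, pbegin: Any = None) -> List[InstanceDict]:
        # A real PTransform would return a PCollection; a list stands in here.
        return [dict(row) for row in self._rows]


class DataReferenceSketch(ABC):
    """Mirrors the documented shape: a URI plus one abstract factory method."""

    def __init__(self, reference_uri: str) -> None:
        self.reference_uri = reference_uri

    @abstractmethod
    def yield_instance_dict_ptransform(
        self, *args: Any, **kwargs: Any
    ) -> InstanceDictPTransformSketch:
        """Subclasses decide how the data behind the URI becomes instance dicts."""


class InMemoryNodeReference(DataReferenceSketch):
    """Hypothetical subclass holding rows for a single node type."""

    def __init__(self, reference_uri: str, rows: List[InstanceDict]) -> None:
        super().__init__(reference_uri)
        self._rows = rows

    def yield_instance_dict_ptransform(
        self, *args: Any, **kwargs: Any
    ) -> InstanceDictPTransformSketch:
        return InstanceDictPTransformSketch(self._rows)


# Usage: one reference, one node type, one stream of instance dicts.
ref = InMemoryNodeReference("memory://users", [{"node_id": 1, "age": 30}])
instance_dicts = ref.yield_instance_dict_ptransform().expand()
```

Because the base class declares the method abstract, instantiating `DataReferenceSketch` directly raises `TypeError`; each concrete subclass has to say how its own storage is read. In this sketch the single-type assumption shows up as one subclass instance per node or edge type.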