Advanced Topics
Object instantiation
This section describes the object that are to be instantiated from the YAML definition.
If the component is a literal, then it is returned as is:
3
will result in
3
If the component definition is a mapping with a "type" field, the factory will lookup the CLASS_TYPES_REGISTRY and replace the "type" field by "class_name" -> CLASS_TYPES_REGISTRY[type] and instantiate the object from the resulting mapping
If the component definition is a mapping with neither a "class_name" nor a "type" field, the factory will do a best-effort attempt at inferring the component type by looking up the parent object's constructor type hints. If the type hint is an interface present in [DEFAULT_IMPLEMENTATIONS_REGISTRY](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/default_implementation_registry.py, then the factory will create an object of its default implementation.
If the component definition is a list, then the factory will iterate over the elements of the list, instantiate its subcomponents, and return a list of instantiated objects.
If the component has subcomponents, the factory will create the subcomponents before instantiating the top level object
{
  "type": TopLevel
  "param":
    {
      "type": "ParamType"
      "k": "v"
    }
}
will result in
TopLevel(param=ParamType(k="v"))
More details on object instantiation can be found here.
$parameters
Parameters can be passed down from a parent component to its subcomponents using the $parameters key. This can be used to avoid repetitions.
Schema:
  "$parameters":
    type: object
    additionalProperties: true
Example:
outer:
  $parameters:
    MyKey: MyValue
  inner:
    k2: v2
This the example above, if both outer and inner are types with a "MyKey" field, both of them will evaluate to "MyValue".
These parameters can be overwritten by subcomponents as a form of specialization:
outer:
  $parameters:
    MyKey: MyValue
  inner:
    $parameters:
      MyKey: YourValue
    k2: v2
In this example, "outer.MyKey" will evaluate to "MyValue", and "inner.MyKey" will evaluate to "YourValue".
The value can also be used for string interpolation:
outer:
  $parameters:
    MyKey: MyValue
  inner:
    k2: "MyKey is {{ parameters['MyKey'] }}"
In this example, outer.inner.k2 will evaluate to "MyKey is MyValue"
References
Strings can contain references to previously defined values. The parser will dereference these values to produce a complete object definition.
References can be defined using a #/{arg} string.
key: 1234
reference: "#/key"
will produce the following definition:
key: 1234
reference: 1234
This also works with objects:
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs: "#/key_value_pairs"
will produce the following definition:
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  k1: v1
  k2: v2
The $ref keyword can be used to refer to an object and enhance it with addition key-value pairs
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  $ref: "#/key_value_pairs"
  k3: v3
will produce the following definition:
key_value_pairs:
  k1: v1
  k2: v2
same_key_value_pairs:
  k1: v1
  k2: v2
  k3: v3
References can also point to nested values.
Nested references are ambiguous because one could define a key containing with /
in this example, we want to refer to the limit key in the dict object:
dict:
  limit: 50
limit_ref: "#/dict/limit"
will produce the following definition:
dict
limit: 50
limit-ref: 50
whereas here we want to access the nested/path value.
nested:
  path: "first one"
nested.path: "uh oh"
value: "ref(nested.path)
will produce the following definition:
nested:
  path: "first one"
nested/path: "uh oh"
value: "uh oh"
To resolve the ambiguity, we try looking for the reference key at the top-level, and then traverse the structs downward until we find a key with the given path, or until there is nothing to traverse.
More details on referencing values can be found here.
String interpolation
String values can be evaluated as Jinja2 templates.
If the input string is a raw string, the interpolated string will be the same.
"hello world" -> "hello world"
The engine will evaluate the content passed within {{...}}, interpolating the keys from context-specific arguments.
The "parameters" keyword see ($parameters) can be referenced.
For example, some_object.inner_object.key will evaluate to "Hello airbyte" at runtime.
some_object:
  $parameters:
    name: "airbyte"
  inner_object:
    key: "Hello {{ parameters.name }}"
Some components also pass in additional arguments to the context.
This is the case for the record selector, which passes in an additional response argument.
Both dot notation and bracket notations (with single quotes ( ')) are interchangeable.
This means that both these string templates will evaluate to the same string:
- "{{ parameters.name }}"
- "{{ parameters['name'] }}"
In addition to passing additional values through the $parameters argument, macros can be called from within the string interpolation.
For example,
"{{ max(2, 3) }}" -> 3
The macros and variables available in all possible contexts are documented in the YAML Reference.
Additional information on jinja templating can be found at https://jinja.palletsprojects.com/en/3.1.x/templates/#
Component schema reference
A JSON schema representation of the relationships between the components that can be used in the YAML configuration can be found here.
Custom components
Please help us improve the low code CDK! If you find yourself needing to build a custom component,please create a feature request issue. If appropriate, we'll add it directly to the framework (or you can submit a PR)!
If an issue already exist for the missing feature you need, please upvote or comment on it so we can prioritize the issue accordingly.
Any built-in components can be overloaded by a custom Python class.
To create a custom component, define a new class in a new file in the connector's module.
The class must implement the interface of the component it is replacing. For instance, a pagination strategy must implement airbyte_cdk.sources.declarative.requesters.paginators.strategies.pagination_strategy.PaginationStrategy.
The class must also be a dataclass where each field represents an argument to configure from the yaml file, and an InitVar named parameters.
For example:
@dataclass
class MyPaginationStrategy(PaginationStrategy):
  my_field: Union[InterpolatedString, str]
  parameters: InitVar[Mapping[str, Any]]
  def __post_init__(self, parameters: Mapping[str, Any]):
    pass
  def next_page_token(self, response: requests.Response, last_records: List[Mapping[str, Any]]) -> Optional[Any]:
    pass
  def reset(self):
    pass
This class can then be referred from the yaml file by specifying the type of custom component and using its fully qualified class name:
pagination_strategy:
  type: "CustomPaginationStrategy"
  class_name: "my_connector_module.MyPaginationStrategy"
  my_field: "hello world"
Custom Components that pass fields to child components
There are certain scenarios where a child subcomponent might rely on a field defined on a parent component. For regular components, we perform this propagation of fields from the parent component to the child automatically. However, custom components do not support this behavior. If you have a child subcomponent of your custom component that falls under this use case, you will see an error message like:
Error creating component 'DefaultPaginator' with parent custom component source_example.components.CustomRetriever: Please provide DefaultPaginator.$parameters.url_base
When you receive this error, you can address this by defining the missing field within the $parameters block of the child component.
  paginator:
    type: "DefaultPaginator"
    <...>
    $parameters:
      url_base: "https://example.com"
How the framework works
- Given the connection config and an optional stream state, the PartitionRoutercomputes the partitions that should be routed to read data.
- Iterate over all the partitions defined by the stream's partition router.
- For each partition,
- Submit a request to the partner API as defined by the requester
- Select the records from the response
- Repeat for as long as the paginator points to a next page