Refdata loaders
dve.core_engine.backends.base.reference_data.BaseRefDataLoader
Bases: Generic[EntityType], Mapping[EntityName, EntityType], ABC
A reference data mapper which lazy-loads requested entities.
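The lazy-loading behaviour can be illustrated with a minimal sketch. It is not the real `BaseRefDataLoader`: `LazyMapping`, `load_countries`, and the zero-argument loader functions are hypothetical names used only to show how a `Mapping` can defer loading until an entity is first requested and then serve it from a cache.

```python
from collections.abc import Iterator, Mapping
from typing import Callable, Dict


class LazyMapping(Mapping):
    """Hypothetical stand-in for a lazy-loading reference data mapper."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]) -> None:
        # Maps each entity name to a zero-argument function that loads it.
        self._loaders = loaders
        # Entities that have already been loaded, keyed by name.
        self.entity_cache: Dict[str, object] = {}

    def __getitem__(self, name: str) -> object:
        # Load on first access only; later lookups hit the cache.
        if name not in self.entity_cache:
            self.entity_cache[name] = self._loaders[name]()
        return self.entity_cache[name]

    def __iter__(self) -> Iterator[str]:
        return iter(self._loaders)

    def __len__(self) -> int:
        return len(self._loaders)


calls = []


def load_countries() -> list:
    # Record each call so repeated loads would be visible.
    calls.append("countries")
    return ["GB", "FR"]


refdata = LazyMapping({"countries": load_countries})
first = refdata["countries"]
second = refdata["countries"]
```

In this sketch the loader function runs once, and both lookups return the same cached object.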
Source code in src/dve/core_engine/backends/base/reference_data.py
__entity_type__
class-attribute
The entity type used for the reference data.
This will be populated from the generic annotation at class creation time.
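One way a class attribute could be filled in from a generic annotation at class creation time is sketched below. This is an assumption about the general technique, not the library's actual code: `Loader` and `DictLoader` are hypothetical classes, and the sketch relies on the standard `__orig_bases__` attribute together with `typing.get_origin` and `typing.get_args`.

```python
from typing import Generic, Optional, TypeVar, get_args, get_origin

EntityType = TypeVar("EntityType")


class Loader(Generic[EntityType]):
    """Hypothetical base class; not the real BaseRefDataLoader."""

    __entity_type__: Optional[type] = None

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # __orig_bases__ keeps the parameterised bases, e.g. Loader[dict].
        for base in getattr(cls, "__orig_bases__", ()):
            if get_origin(base) is Loader:
                (entity_type,) = get_args(base)
                # Skip unresolved type variables such as Loader[EntityType].
                if not isinstance(entity_type, TypeVar):
                    cls.__entity_type__ = entity_type


class DictLoader(Loader[dict]):
    """Subclass whose entity type is taken from the annotation above."""
```

After the class statement runs, `DictLoader.__entity_type__` is `dict`, while the unparameterised `Loader` keeps its default of `None`.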
__reader_functions__ = {}
class-attribute
A mapping between file extensions and the functions that load file URIs into reference data entities.
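Dispatching on a file extension can be sketched as follows. Everything here is illustrative: `READERS`, `read_parquet`, `read_arrow`, and `load_by_extension` are hypothetical, the readers return strings rather than backend entities, and the key format (a lowercase suffix with a leading dot) is an assumption rather than the format the library uses.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict
from urllib.parse import urlparse


# Hypothetical readers; real ones would return backend entities.
def read_parquet(uri: str) -> str:
    return f"parquet:{uri}"


def read_arrow(uri: str) -> str:
    return f"arrow:{uri}"


READERS: Dict[str, Callable[[str], str]] = {
    ".parquet": read_parquet,
    ".arrow": read_arrow,
}


def load_by_extension(uri: str) -> str:
    # Take the suffix from the path component so query strings are ignored.
    suffix = PurePosixPath(urlparse(uri).path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"No reader registered for {suffix!r}") from None
    return reader(uri)


result = load_by_extension("s3://bucket/refdata/countries.parquet")
```

An unknown extension raises a `ValueError` in this sketch instead of falling back to a default reader.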
__step_functions__ = {}
class-attribute
A mapping between refdata config types and the functions called to load those configs into reference data entities.
dataset_config_uri = dataset_config_uri
instance-attribute
Configuration options for the reference data. These are likely to vary from backend to backend (e.g. locations and file types for some backends, table names for others).
entity_cache = {}
instance-attribute
A cache for already-loaded entities.
__init_subclass__(*_, **__)
When this class is subclassed, create and populate the __step_functions__
class variable for the subclass.
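A per-subclass registry built in `__init_subclass__` might look like the sketch below. How the real loader identifies its step functions is not shown on this page, so the `handles` decorator, the `_config_type` marker, the `"file"` and `"table"` keys, and the config field names are all assumptions made for illustration.

```python
from typing import Callable, Dict


def handles(config_type: str) -> Callable:
    """Hypothetical decorator tagging a method with the config type it loads."""

    def mark(func: Callable) -> Callable:
        func._config_type = config_type
        return func

    return mark


class Base:
    __step_functions__: Dict[str, Callable] = {}

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Give each subclass its own dict so registries are not shared.
        cls.__step_functions__ = {}
        # Walk the MRO from base to subclass so overrides win.
        for klass in reversed(cls.__mro__):
            for attr in vars(klass).values():
                config_type = getattr(attr, "_config_type", None)
                if config_type is not None:
                    cls.__step_functions__[config_type] = attr

    @handles("file")
    def load_file(self, config: dict) -> str:
        return f"file:{config['filename']}"


class TableLoader(Base):
    @handles("table")
    def load_table(self, config: dict) -> str:
        return f"table:{config['table_name']}"


loader = TableLoader()
step = TableLoader.__step_functions__["table"]
loaded = step(loader, {"table_name": "countries"})
```

Because the registry is rebuilt for every subclass, `TableLoader` sees both the inherited `"file"` step and its own `"table"` step, while the registry on `Base` itself stays empty.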
Source code in src/dve/core_engine/backends/base/reference_data.py
load_entity(entity_name, config)
Load a reference entity given its reference config.
Source code in src/dve/core_engine/backends/base/reference_data.py
load_file(config)
Load a reference entity from a relative file path.
Source code in src/dve/core_engine/backends/base/reference_data.py
load_table(config)
abstractmethod
Load a reference entity from a database table.
Source code in src/dve/core_engine/backends/base/reference_data.py
load_uri(config)
Load a reference entity from an absolute URI.
Source code in src/dve/core_engine/backends/base/reference_data.py
dve.core_engine.backends.implementations.duckdb.reference_data.DuckDBRefDataLoader
Bases: BaseRefDataLoader[DuckDBPyRelation]
A reference data loader using existing DuckDB tables.
Source code in src/dve/core_engine/backends/implementations/duckdb/reference_data.py
connection
instance-attribute
The DuckDB connection for the backend.
dataset_config_uri = None
class-attribute
instance-attribute
The location of the dischema file.
load_arrow_file(uri)
Load an Arrow IPC file into a DuckDB relation.
Source code in src/dve/core_engine/backends/implementations/duckdb/reference_data.py
load_parquet_file(uri)
Load a Parquet file into a DuckDB relation.
Source code in src/dve/core_engine/backends/implementations/duckdb/reference_data.py
load_table(config)
Load a reference entity from a database table.
Source code in src/dve/core_engine/backends/implementations/duckdb/reference_data.py
dve.core_engine.backends.implementations.spark.reference_data.SparkRefDataLoader
Bases: BaseRefDataLoader[DataFrame]
A reference data loader using existing Apache Spark tables.
Source code in src/dve/core_engine/backends/implementations/spark/reference_data.py
dataset_config_uri = None
class-attribute
instance-attribute
The location of the dischema file defining the business rules.
spark
instance-attribute
The Spark session for the backend.
load_parquet_file(uri)
Load a Parquet file into a Spark DataFrame.
Source code in src/dve/core_engine/backends/implementations/spark/reference_data.py