معرفی شرکت ها


dbtvault-generator-0.2.3


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

A little util to help you generate DBTVault SQL files from metadata, while we wait for DBTLabs to add meta to the execution context
ویژگی مقدار
سیستم عامل -
نام فایل dbtvault-generator-0.2.3
نام dbtvault-generator
نسخه کتابخانه 0.2.3
نگهدارنده []
ایمیل نگهدارنده []
نویسنده -
ایمیل نویسنده Alex Stenlake <alex.stenlake@gmail.com>
آدرس صفحه اصلی -
آدرس اینترنتی https://pypi.org/project/dbtvault-generator/
مجوز Apache-2.0
# dbtvault-generator Generate DbtVault files from external `yml`! ## Why this exists A good dev is a lazy dev. Let's do less work, the right way! DBTVault is a great tool, but due to quirks of DBT (looking at you, compile context) we don't necessarily have access additional variables we want to use...at least, without overloading the `meta` or `vars` attributes. See [this issue](https://github.com/Datavault-UK/dbtvault/issues/106) for some details about the limitations and workarounds for the build context. The `dbtvault` docs used `yml` specified inside DBT `sql` files to specify its variables, or otherwise directly set the variables using `jinja`. While this is a workable solution, it's still a workaround to the fact that additional attributes can't be consumed by the `dbt compile` operation. Using env vars as a stopgap isn't a problem on small projects, but on more complicated codebases it can quickly lead to confusion. It becomes difficult to trace which variables belong where. Further, it hinders the development of other data vault management tools being built on top of DBT. Data Vault is a metadata-heavy method of architecting a data warehouse, as the modelling discipline associated with data vault makes it ripe for automation and abstraction. A standardised modelling output format independent of env vars permits the building of GUI tools to streamline the development and expansion of the data vault. This tool is a Python-based templating library designed to act around DBT and `dbtvault`, and allow external `dbtvault` parameterisation. It also seeks to leverage the uniformity of data vault specifications to automate certain low-level maintenance tasks such as code documentation. The primary purpose is to generate Data Vault architectures (trhough `dbtvault`), generating SQL and `schema.yml` directly from specialised `dbtvault.yml` files. Watch this space for a GUI assistance tool in future. ## Requirements - `python>=3.9` - `dbt>=1.0.0` available in the Python environment - A DBT project initialised and ready to go. The `dbtv-gen` CLI tool needs to execute from the same directory as you would normally execute DBT from. - The DBT package `dbtvault>0.9.0` installed in the DBT project ## Overview Of Features Depending on your set-up, install `dbtvault-generator` into your Python environment. A shim will be installed that adds `dbtv-gen` to your CLI, and you're ready to go. The following assumes you have `dbtvault.yml` files preconfigured (see below for details). ![image](./static/images/folder-setup.png) The `dbtvault.yml` file contains the metadata required for templating the `dbtvault` library ![image](./static/images/dbtvault-yml.png) You can also use a root-level `dbtvault.yml` to specify global defaults. ![image](./static/images/root-dbtvault.png) ### Generate SQL Templates In your command line of choice, run the following command: ```bash dbtv-gen sql ``` By default, you will see the SQL files appear in the same directory as the `dbtvault.yml` file. Files can be prefixed automatically (shown), can be manually specified per-model, or ignored. ![image](./static/images/sql-files-created.png) The contents of the file will match the details specified in `dbtvault.yml`. ![image](./static/images/sql-file-details.png) ### Generate `schema.yml` Doc Config Once the sql has been generated, execute `dbt run` to instantiate them as views in your development environment. From here, `dbt docs generate` can be executed to build DBT's `catalog.json` artifact. `dbtvault-generator` can then use this to pre-populate a `schema.yml` file with basic documentation. ```bash dbtv-gen docs ``` ![image](./static/images/schema-file-created.png) The schema file will infer certain properties about the columns. This includes the data types of the columns, read from the information schema. Unfortunately we can't yet describe the columns automatically. ![image](./static/images/schema-file-details.png) In additionFor example, primary keys will automatically have `not_null` and `unique` tests added for alerting of clashes. Foreign key columns will have a `relationships` test added, but this test is conditional on a `where 1 != 1` condition to prevent it from triggering alerts. What this does do, however, is play nicely with other tools and packages that use the "relationship" test to automatically create foreign keys or similar metadata (e.g. on Snowflake) for downstream use by other tools. Instantiating the models is currently required to build the docs. In future versions, the docs will be built directly from the config options. ### BONUS! Use `dbterd` To Generate ER Diagrams! The wonderful [Dat Nguyen](https://github.com/datnguye) has built [a cool library called `dbterd`](https://github.com/datnguye/dbterd). `dbterd` picks up DBT's catalog and manifest artifacts and, using the relationship test, creates a `.dbml` file detailing the core relationships within your DBT catalog. The `.dbml` format is an open-source entity-relation spec that can either be consumed directly by a number of SaaS tools or by open-source tools like [DBML Renderer](https://github.com/softwaretechnik-berlin/dbml-renderer). Importantly, `dbterd` uses the `relationships` test to identify foreign key relations...which you will have out of the box. You will need to rerun `dbt docs generate` to add in the new relationships, but Go check out [his page on Github](https://github.com/datnguye/dbterd) for further details...and leave him a star while you're there. ## Installation Instructions ### From Pip ```bash pip install --upgrade dbtvault-generator ``` ### Direct From Git Many package managers allow installation of Python packages direct from version control repositories. Add this repo directly with e.g. ```bash poetry add git@github.com:Oracen/dbtvault-generator.git ``` ### From Source For now, using PDM as dependency manager as it seems to be better supported than Poetry. As such, give PDM a try! It's like Yarn for Python, or Poetry with a faster resolver and Poe built in. ```bash git clone https://github.com/Oracen/dbtvault-generator cd dbtvault-generator pdm build ``` ## Usage On command execution, `dbtv-gen` scans your project directory for files named `dbtvault.yml`, containing a root-level key called `dbtvault`. The location of these files specify where any associated models will be generated, unless the target path is overriden. The only special file is an optional `dbtvault.yml` located at the project root. This file's default attributes will be treated as the defaults for the entire project, and so serves as a good place for specifying prefixes etc. Each `dbtvault.yml` contains 2 keys: - `defaults`: the default options for each model. These default options can be overwritten on a per-model basis - `models`: the specification for each model to be generated in the parent folder of the current file. The `defaults` element has the following structure: ```yaml version: 2 dbtvault: defaults: use_prefix: true prefixes: table: stage: stg_ hub: hub_ link: lnk_ t_link: t_lnk_ sat: sat_ eff_sat: eff_sat_ ma_sat: ma_sat_ xts: xts_ pit: pit_ bridge: bridge_ custom_macros: stage: custom_staging_macro target_path: "" ``` - `use_prefix`: indicates whether to prepend the designated string to each table name as the model generates. Defaults to `false` - `prefixes`: a mapping of prefixes to use per supported model type. Defaults are shown above, and are aligned with `dbtvault`'s examples - `custom_macros`: sometimes the default `dbtvault` macros don't do exactly what you want and a custom macro needs to be substituted. For example, you may want to restrict your data to a small time range in the development environment. If specified, the custom macro will be substituted in for particular model structures. Defaults to `{}` (empty) - `target_path`: The location where files will be created. It is advised to not set this on the default settings, but can be overriden on the objects attribute per-model. Defaults to `""` (empty). The `models` element is a list of `model` elements, each of whic has the following structure: ```yaml version: 2 dbtvault: models: - name: clients model_type: stage options: use_prefix: false target_path: "raw_vault/staging" dbtvault_arguments: include_source_columns: false source_model: main_db: clientData derived_columns: SOURCE: "!1" LOAD_DATETIME: "CRM_INGESTION_TIME" null_columns: LAST_TICKET_DATE: "lastTicketDate" hashed_columns: CLIENT_HASHKEY: "CLIENT_ID" CUSTOMER_NATION_HK: - "CUSTOMER_ID" - "NATION_ID" ranked_columns: one : "two" ``` - `name`: specifies the base name of the generated model. If `use_prefix = false`, the generated file will have this name, otherwise it will have the appropriate prefix prepended depending on its type - `model_type`: the `dbtvault`-compliant model type. This field drives the yml schema validation, and so should throw errors if the overall model object is incorrectly specified. - `options`: this specifies the generation options for the model, and inherits values from the defaults. The choices are the same. - `dbtvault_arguments`: Implements the `dbtvault` macro API. Choices depend on the type of model implemented. You can also use `!include relative/path/to/file.yml` syntax to push config out to other files. This is handy in e.g. keeping all DBTVault config in separate ymls to avoid overloading a single `.yml` file, streamlining PRs and maintenance. All included files will still require the `version` KV-pair DBT looks for, but it will be removed from any included templates. Please note; to use this functionality you'll need to add `dbtvault.yml` to your `.dbtignore` file: ```bash # .dbtignore dbtvault.yml ``` DBT does not support the `!include` constructor pattern. For further information on how to specify `dbtvault` models, please see [the DBTVault macros documentation](https://dbtvault.readthedocs.io/en/latest/macros).


نیازمندی

مقدار نام
- dbt-artifacts-parser>=0.2.4
- pydantic>=1.10.5
- schema>=0.7.5
- typer[all]>=0.7.0


زبان مورد نیاز

مقدار نام
>=3.9 Python


نحوه نصب


نصب پکیج whl dbtvault-generator-0.2.3:

    pip install dbtvault-generator-0.2.3.whl


نصب پکیج tar.gz dbtvault-generator-0.2.3:

    pip install dbtvault-generator-0.2.3.tar.gz