apache-airflow-providers-fastetl-0.0.9



Description

FastETL custom package Apache Airflow provider.

| Attribute        | Value |
|------------------|-------|
| Operating system | - |
| Filename         | apache-airflow-providers-fastetl-0.0.9 |
| Name             | apache-airflow-providers-fastetl |
| Library version  | 0.0.9 |
| Maintainer       | [] |
| Maintainer email | [] |
| Author           | Time de Dados CGINF |
| Author email     | seges.cginf@economia.gov.br |
| Homepage         | https://github.com/economiagovbr/FastETL |
| PyPI URL         | https://pypi.org/project/apache-airflow-providers-fastetl/ |
| License          | Apache License 2.0 |
![FastETL's logo. It's a Swiss army knife with some open tools](docs/images/logo.svg)

<p align="center">
  <em>FastETL framework, modern, versatile, does almost everything.</em>
</p>

This text is also available in Portuguese: 🇧🇷 [LEIAME.md](LEIAME.md).

---

[![CI Tests](https://github.com/economiagovbr/FastETL/actions/workflows/ci-tests.yml/badge.svg)](https://github.com/economiagovbr/FastETL/actions/workflows/ci-tests.yml)

**FastETL** is a package of Airflow plugins for building data pipelines for a number of common scenarios. Main features:

* Full or incremental **replication** of tables in SQL Server, Postgres and MySQL databases
* Loading data from **GSheets** and from spreadsheets on **Samba/Windows** networks
* Extracting **CSV** files from SQL
* Cleaning data using custom data patching tasks (e.g. for messy geographical coordinates, mapping canonical values for columns, etc.)
* Querying the Brazilian National Official Gazette's (**DOU**'s) API
* Using an [Open Street Routing Machine](https://project-osrm.org/) service to calculate route distances
* Using [CKAN](https://docs.ckan.org/en/2.10/api/index.html) or dados.gov.br's API to update dataset metadata
* Using Frictionless [Tabular Data Packages](https://specs.frictionlessdata.io/tabular-data-package/) to write data dictionaries in the OpenDocument Text format

<!-- Tell the story of FastETL's origin -->

This framework is maintained by a network of developers from many teams at the Ministry of Management and Innovation in Public Services, and is the cumulative result of using [Apache Airflow](https://airflow.apache.org/), a free and open source tool, since 2019.

**For government:** FastETL is widely used for replication of data queried via Quartzo (DaaS) from Serpro.

# Installation in Airflow

FastETL implements the standards for Airflow plugins. To install it, simply add the `apache-airflow-providers-fastetl` package to your Python dependencies in your Airflow environment.

Or install it with:

```bash
pip install apache-airflow-providers-fastetl
```

To see an example of an Apache Airflow container that uses FastETL, check out the [airflow2-docker](https://github.com/economiagovbr/airflow2-docker) repository.

To ensure appropriate results, please make sure to install the `msodbcsql17` and `unixodbc-dev` libraries on your Apache Airflow workers.

# Tests

The test suite uses Docker containers to simulate a complete usage environment, including Airflow and the databases. For that reason, to execute the tests, you first need to install Docker and docker-compose.

For people using Ubuntu 20.04, you can just type on the terminal:

```bash
snap install docker
```

For other versions and operating systems, see the [official Docker documentation](https://docs.docker.com/get-docker/).

To build the containers:

```bash
make setup
```

To run the tests, use:

```bash
make setup && make tests
```

To shut down the environment, use:

```bash
make down
```

# Usage examples

The main FastETL feature is the `DbToDbOperator` operator. It copies data between `postgres` and `mssql` databases. MySQL is also supported as a source.

Here is an example:

```python
from datetime import datetime

from airflow import DAG
from fastetl.operators.db_to_db_operator import DbToDbOperator

# Placeholder values -- replace these with your own Airflow connection
# ids, schema names and table name.
airflow_source_conn_id = "my_source_conn"
airflow_dest_conn_id = "my_dest_conn"
source_schema = "dbo"
dest_schema = "public"
table_name = "my_table"

default_args = {
    "start_date": datetime(2023, 4, 1),
}

dag = DAG(
    "copy_db_to_db_example",
    default_args=default_args,
    schedule_interval=None,
)

t0 = DbToDbOperator(
    task_id="copy_data",
    source={
        "conn_id": airflow_source_conn_id,
        "schema": source_schema,
        "table": table_name,
    },
    destination={
        "conn_id": airflow_dest_conn_id,
        "schema": dest_schema,
        "table": table_name,
    },
    destination_truncate=True,
    copy_table_comments=True,
    chunksize=10000,
    dag=dag,
)
```

More detail about the parameters and the workings of `DbToDbOperator` can be seen in the following files:

* [fast_etl.py](fastetl/custom_functions/fast_etl.py)
* [db_to_db_operator.py](fastetl/operators/db_to_db_operator.py)

# How to contribute

To be written in the `CONTRIBUTING.md` document (issue [#4](/economiagovbr/FastETL/issues/4)).
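The `chunksize` parameter in the `DbToDbOperator` example above controls how many rows are read and written per batch, so large tables can be copied without loading everything into memory at once. The batching idea can be sketched with the standard library's `sqlite3` module; note this is an illustration of the pattern only, not FastETL's actual implementation, and the `copy_table` helper and table layout here are hypothetical:

```python
import os
import sqlite3
import tempfile


def copy_table(source_db, dest_db, table, chunksize=10000, truncate=True):
    """Copy all rows of `table` from source_db to dest_db in batches,
    mirroring the truncate-then-load behaviour of destination_truncate."""
    src = sqlite3.connect(source_db)
    dst = sqlite3.connect(dest_db)
    try:
        cur = src.execute(f"SELECT * FROM {table}")
        placeholders = ", ".join("?" * len(cur.description))
        if truncate:
            dst.execute(f"DELETE FROM {table}")
        while True:
            # Fetch at most `chunksize` rows at a time, then insert them.
            rows = cur.fetchmany(chunksize)
            if not rows:
                break
            dst.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", rows
            )
        dst.commit()
    finally:
        src.close()
        dst.close()


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    src_path = os.path.join(tmp, "src.db")
    dst_path = os.path.join(tmp, "dst.db")
    with sqlite3.connect(src_path) as con:
        con.execute("CREATE TABLE t (id INTEGER, name TEXT)")
        con.executemany(
            "INSERT INTO t VALUES (?, ?)",
            [(i, f"row{i}") for i in range(25)],
        )
    with sqlite3.connect(dst_path) as con:
        con.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    copy_table(src_path, dst_path, "t", chunksize=10)
    with sqlite3.connect(dst_path) as con:
        print(con.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # prints 25
```

A smaller `chunksize` lowers peak memory at the cost of more round trips; the 10 000 used in the DAG example above is a common middle ground.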


Requirements

| Name | Version |
|------|---------|
| apache-airflow | >=2.3 |
| apache-airflow-providers-microsoft-mssql | - |
| apache-airflow-providers-mysql | - |
| apache-airflow-providers-postgres | - |
| apache-airflow-providers-common-sql | - |
| alembic | >=1.8.1 |
| beautifulsoup4 | >=4.1.11 |
| ckanapi | >=4.6 |
| frictionless | >=5.11.1 |
| markdown | >=3.4.1 |
| odfpy | >=1.4.1 |
| pandas | <2,>=1.5.2 |
| psycopg2 | >=2.9.5 |
| pygsheets | >=2.0.5 |
| pyodbc | >=4.0.35 |
| pysmb | >=1.2.6 |
| python-slugify | >=7.0.0 |
| pytz | >=2022.6 |
| requests | >=2.28.1 |
| SQLAlchemy | >=1.4.44 |


Required language

| Name | Version |
|------|---------|
| Python | ~=3.8 |


How to install


Install the apache-airflow-providers-fastetl-0.0.9 whl package:

    pip install apache-airflow-providers-fastetl-0.0.9.whl


Install the apache-airflow-providers-fastetl-0.0.9 tar.gz package:

    pip install apache-airflow-providers-fastetl-0.0.9.tar.gz