跳转到主要内容

标签(标签)

资源精选(342) Go开发(108) Go语言(103) Go(99) angular(82) LLM(75) 大语言模型(63) 人工智能(53) 前端开发(50) LangChain(43) golang(43) 机器学习(39) Go工程师(38) Go程序员(38) Go开发者(36) React(33) Go基础(29) Python(24) Vue(22) Web开发(20) Web技术(19) 精选资源(19) 深度学习(19) Java(18) ChatGTP(17) Cookie(16) android(16) 前端框架(13) JavaScript(13) Next.js(12) 安卓(11) 聊天机器人(10) typescript(10) 资料精选(10) NLP(10) 第三方Cookie(9) Redwoodjs(9) LLMOps(9) Go语言中级开发(9) 自然语言处理(9) PostgreSQL(9) 区块链(9) mlops(9) 安全(9) 全栈开发(8) ChatGPT(8) OpenAI(8) Linux(8) AI(8) GraphQL(8) iOS(8) 软件架构(7) Go语言高级开发(7) AWS(7) C++(7) 数据科学(7) whisper(6) Prisma(6) 隐私保护(6) RAG(6) JSON(6) DevOps(6) 数据可视化(6) wasm(6) 计算机视觉(6) 算法(6) Rust(6) 微服务(6) 隐私沙盒(5) FedCM(5) 语音识别(5) Angular开发(5) 快速应用开发(5) 提示工程(5) Agent(5) LLaMA(5) 低代码开发(5) Go测试(5) gorm(5) REST API(5) 推荐系统(5) WebAssembly(5) GameDev(5) CMS(5) CSS(5) machine-learning(5) 机器人(5) 游戏开发(5) Blockchain(5) Web安全(5) Kotlin(5) 低代码平台(5) 机器学习资源(5) Go资源(5) Nodejs(5) PHP(5) Swift(5) 智能体(4) devin(4) Blitz(4) javascript框架(4) Redwood(4) GDPR(4) 生成式人工智能(4) Angular16(4) Alpaca(4) 编程语言(4) SAML(4) JWT(4) JSON处理(4) Go并发(4) kafka(4) 移动开发(4) 移动应用(4) security(4) 隐私(4) spring-boot(4) 物联网(4) nextjs(4) 网络安全(4) API(4) Ruby(4) 信息安全(4) flutter(4) 专家智能体(3) Chrome(3) CHIPS(3) 3PC(3) SSE(3) 人工智能软件工程师(3) LLM Agent(3) Remix(3) Ubuntu(3) GPT4All(3) 软件开发(3) 问答系统(3) 开发工具(3) 最佳实践(3) RxJS(3) SSR(3) Node.js(3) Dolly(3) 移动应用开发(3) 低代码(3) IAM(3) Web框架(3) CORS(3) 基准测试(3) Go语言数据库开发(3) Oauth2(3) 并发(3) 主题(3) Theme(3) earth(3) nginx(3) 软件工程(3) azure(3) keycloak(3) 生产力工具(3) gpt3(3) 工作流(3) C(3) jupyter(3) 认证(3) prometheus(3) GAN(3) Spring(3) 逆向工程(3) 应用安全(3) Docker(3) Django(3) R(3) .NET(3) 大数据(3) Hacking(3) 渗透测试(3) C++资源(3) Mac(3) 微信小程序(3) Python资源(3) JHipster(3) 大型语言模型(2) 语言模型(2) 可穿戴设备(2) JDK(2) SQL(2) Apache(2) Hashicorp Vault(2) Spring Cloud Vault(2) Go语言Web开发(2) Go测试工程师(2) WebSocket(2) 容器化(2) AES(2) 加密(2) 输入验证(2) ORM(2) Fiber(2) Postgres(2) Gorilla Mux(2) Go数据库开发(2) 模块(2) 泛型(2) 指针(2) HTTP(2) PostgreSQL开发(2) Vault(2) K8s(2) Spring boot(2) R语言(2) 深度学习资源(2) 半监督学习(2) semi-supervised-learning(2) architecture(2) 普罗米修斯(2) 嵌入模型(2) productivity(2) 编码(2) Qt(2) 前端(2) Rust语言(2) NeRF(2) 神经辐射场(2) 元宇宙(2) CPP(2) 数据分析(2) spark(2) 流处理(2) Ionic(2) 人体姿势估计(2) human-pose-estimation(2) 视频处理(2) deep-learning(2) kotlin语言(2) kotlin开发(2) burp(2) Chatbot(2) npm(2) quantum(2) OCR(2) 游戏(2) game(2) 内容管理系统(2) MySQL(2) python-books(2) pentest(2) opengl(2) IDE(2) 漏洞赏金(2) Web(2) 知识图谱(2) PyTorch(2) 数据库(2) reverse-engineering(2) 数据工程(2) swift开发(2) rest(2) robotics(2) ios-animation(2) 知识蒸馏(2) 安卓开发(2) nestjs(2) solidity(2) 爬虫(2) 面试(2) 容器(2) C++精选(2) 人工智能资源(2) Machine Learning(2) 备忘单(2) 编程书籍(2) angular资源(2) 速查表(2) cheatsheets(2) SecOps(2) mlops资源(2) R资源(2) DDD(2) 架构设计模式(2) 量化(2) Hacking资源(2) 强化学习(2) flask(2) 设计(2) 性能(2) Sysadmin(2) 系统管理员(2) Java资源(2) 机器学习精选(2) android资源(2) android-UI(2) Mac资源(2) iOS资源(2) Vue资源(2) flutter资源(2) JavaScript精选(2) JavaScript资源(2) Rust开发(2) deeplearning(2) RAD(2)

Contents

Vital links

Airflow deployment solutions

Introductions and tutorials

Airflow Summit 2020 videos

The first Airflow Summit 2020 was held in July 2020. It was a truly global, fully online event that was co-hosted by 9 Airflow Meetups from all over the world (MelbourneTokyoBangaloreWarsawAmsterdamLondonNYCBayArea).

It featured 40+ talks and three workshops. You can check out the talk recordings as a YouTube Airflow Summit 2020 Playlist or see the individual talks here:

Best practices, lessons learned and cool use cases

Books, blogs, podcasts, and such

Slide deck presentations and online videos

Libraries, Hooks, Utilities

  • AirFly - Auto generate Airflow's dag.py on the fly.
  • DEAfrica Airflow - Airflow libraries used by Digital Earth Africa, an humanitarian effort to utilize satellite imagery of Africa.
  • Airflow plugins - Central collection of repositories of various plugins for Airflow, including mailchimp, trello, sftp, GitHub, etc.
  • fileflow - Collection of modules to support large data transfers between Airflow operators through either local file system or S3. This addresses a gap where data is too large for XCOMs but too small or inconvenient for loading directly in the operator. Built by Industry Dive.
  • fairflow - Library to abstract away Airflow's Operators with functional pieces that transform the data from one operator to another.
  • airflow-maintenance-dags - Clairvoyant has a repo of Airflow DAGs that operator on Airflow itself, clearing out various bits of the backing metadata store.
  • test_dags - a more complete solution for DAG integrity tests (first Circle of Data’s Inferno are the first.
  • dag-factory - A library for dynamically generating Apache Airflow DAGs from YAML configuration files.
  • whirl - Fast iterative local development and testing of Apache Airflow workflows.
  • airflow-code-editor - A plugin for Apache Airflow that allows you to edit DAGs in browser.
  • Pylint-Airflow - A Pylint plugin for static code analysis on Airflow code.
  • afctl - A CLI tool that includes everything required to create, manage and deploy airflow projects faster and smoother.
  • Dag Dependencies viewer - A plugin which creates a view to visualize dependencies between the Airflow DAGs
  • Airflow ECR Plugin - Plugin to refresh AWS ECR login token at regular intervals. This is helpful where DockerOperator needs to pull images hosted on ECR.
  • AirflowK8sDebugger - A library for generate k8s pod yaml templates from an Airflow dag using the KubernetesPodOperator.
  • Oozie to Airflow - A tool to easily convert between Apache Oozie workflows and Apache Airflow workflows.
  • Airflow Ditto - An extensible framework to do transformations to an Airflow DAG and convert it into another DAG which is flow-isomorphic with the original DAG, to be able to run it on different environments (e.g. on different clouds, or even different container frameworks - Apache Spark on YARN vs Kubernetes). Comes with out-of-the-box support for EMR-to-HDInsight-DAG transforms.
  • gusty - Create a DAG using any number of YAML, Python, Jupyter Notebook, or R Markdown files that represent individual tasks in the DAG. gusty also configures dependencies, DAGs, and TaskGroups, features support for your local operators, and more. A fully containerized demo is available here.
  • Meltano - Open source, self-hosted, CLI-first, debuggable, and extensible ELT tool that embraces Singer for extraction and loading, leverages dbt for transformation, and integrates with Airflow for orchestration.
  • DAG checks - The dag-checks consist of checks that can help you in maintaining your Apache Airflow instance.
  • Airflow DVC plugin - Plugin for open-source version-control system for data science and Machine Learning pipelines - DVC.

Meetups

Commercial Airflow-as-a-service providers

  • Google Cloud Composer - Google Cloud Composer is a managed service built atop Google Cloud and Airflow.
  • Qubole - Qubole is mainly known as a service-and-support company for Apache Hive, but also provides Airflow as a component of its platform.
  • Astronomer.io - Astronomer provides complete ETL lifecycle solutions and appears to be entirely focused on providing Airflow-based products.
  • AWS MWAA - Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale.

Cloud Composer resources

This section contains articles that apply to Cloud Composer — a service built by Google Cloud based on Apache Airflow. Tricks and solutions are described here that are intended for Cloud Composer, but may be applicable to vanilla Airflow.

Non-English resources

Sample projects

原文:https://github.com/jghoman/awesome-apache-airflow