About This Page

Kestra is the open-source, declarative, event-driven orchestration platform for data, AI, and infrastructure workflows.

GitHub: kestra-io/kestra · Website: kestra.io · Docs: kestra.io/docs
1400+ plugins · SOC 2 Certified · Built for enterprise scale


What Is Kestra?

  • Kestra is a universal workflow orchestration engine — write workflows in YAML, run tasks in any language, trigger by events, schedule by cron, and observe everything from one place.

Core Philosophy

| Philosophy | What It Means |
| --- | --- |
| Declarative | Write WHAT you want (YAML), not HOW to do it |
| Language-agnostic | Python, Bash, Node.js, Go, or containers; no lock-in |
| Event-driven | Cron, webhooks, messages, and API triggers, all in one engine |
| API-first | Everything is controllable via the REST API |
| GitOps-native | Workflows are deployed through CI/CD pipelines |

Three Core Use Cases

graph TD
    Kestra["⚡ Kestra Platform"]
    D["📊 Data Workflows\nIngestion · dbt · Airbyte · Spark\n10x faster pipeline delivery\n90% fewer manual backfills"]
    I["🔧 Infrastructure Automation\nTerraform · Ansible · CI/CD\n6x faster infra delivery\n90% lower legacy tooling cost"]
    A["🤖 AI Workflows\nAgents · RAG · Eval · Retraining\n50x less pipeline maintenance\n3x faster AI delivery cycles"]
    Kestra --> D
    Kestra --> I
    Kestra --> A

Core Concepts

Workflow = Flow in Kestra

  • A flow is the fundamental unit in Kestra — a YAML file defining tasks, triggers, and their relationships.
id: my-first-workflow
namespace: company.team
description: "Basic ETL pipeline example"
 
# Triggers — when to run this flow
triggers:
  - id: daily-schedule
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * *"    # Every day at 9am
 
# Tasks — what to do
tasks:
  - id: extract-data
    type: io.kestra.plugin.scripts.python.Script
    script: |
      import requests
      data = requests.get("https://api.example.com/data").json()
      print(data)
 
  - id: transform-data
    type: io.kestra.plugin.scripts.python.Script
    script: |
      # Transform logic here
      print("Transforming...")
 
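  - id: load-data
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ secret('POSTGRES_URL') }}"    # JDBC URL kept out of the flow file
    # "table" is a reserved word in SQL, so use a real table name
    sql: "INSERT INTO target_table SELECT * FROM staging"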

Key Concepts Table

| Concept | Description |
| --- | --- |
| Flow | A workflow definition: a YAML file with tasks and triggers |
| Task | A single unit of work (run a script, call an API, query a database) |
| Trigger | What starts a flow: schedule, webhook, event, or message queue |
| Namespace | Logical grouping for flows, like folders (e.g., company.team) |
| Execution | A single run of a flow, with a unique ID and full trace |
| Plugin | Integrations; 1400+ available (AWS, GCP, dbt, Airbyte, Slack, etc.) |
| Blueprint | Pre-built flow templates for common use cases |
| Tenant | Isolated environment for multi-tenancy (Enterprise) |

Task Execution Model

graph LR
    Trigger["⚡ Trigger\nSchedule / Event / Webhook / API"]
    Executor["🧠 Executor\nTask scheduling + coordination"]
    Worker["⚙️ Worker\nActual task execution\n(isolated, scalable)"]
    Storage["📦 Internal Storage\nPass outputs between tasks"]
    Logs["📋 Execution Logs\nFull trace per task"]
    Trigger -->|"Start execution"| Executor
    Executor -->|"Dispatch task"| Worker
    Worker -->|"Output data"| Storage
    Worker -->|"Logs"| Logs
    Storage -->|"Inputs for next task"| Worker
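
The diagram above can be read as a simple loop. The following is a toy, single-process sketch in Python and not Kestra's implementation: the names `worker`, `run_flow`, and the dict-based `storage` are hypothetical and only illustrate how an executor dispatches tasks in order while a worker writes outputs and logs that later tasks can read.

```python
from typing import Callable, Dict, List, Tuple

# A task is a name plus a function that receives the outputs stored so far.
Task = Tuple[str, Callable[[Dict[str, object]], object]]


def worker(task: Task, storage: Dict[str, object], logs: List[str]) -> None:
    """Run one task, record its output in storage, and append log lines."""
    name, fn = task
    logs.append(f"start {name}")
    storage[name] = fn(storage)  # this output becomes input for later tasks
    logs.append(f"end {name}")


def run_flow(tasks: List[Task]) -> Tuple[Dict[str, object], List[str]]:
    """Executor role: dispatch tasks one by one and collect outputs and logs."""
    storage: Dict[str, object] = {}
    logs: List[str] = []
    for task in tasks:
        worker(task, storage, logs)
    return storage, logs


flow = [
    ("extract", lambda outputs: [1, 2, 3]),
    ("transform", lambda outputs: [x * 10 for x in outputs["extract"]]),
]
storage, logs = run_flow(flow)
print(storage["transform"])  # [10, 20, 30]
```

In a real deployment the executor and workers are separate processes, which is what lets workers scale independently; the sketch collapses them into one process to keep the data flow visible.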

Triggers — How Flows Start

Trigger Types

| Trigger Type | YAML Type | Use Case |
| --- | --- | --- |
| Schedule (Cron) | io.kestra.plugin.core.trigger.Schedule | Run daily ETL at 2am |
| Webhook | io.kestra.plugin.core.trigger.Webhook | GitHub push triggers CI flow |
| Message Queue | io.kestra.plugin.kafka.Trigger | Process Kafka messages |
| Flow completion | io.kestra.plugin.core.trigger.Flow | Chain flows together |
| File detection | io.kestra.plugin.aws.s3.Trigger | New file in S3 triggers pipeline |
| API call | None (REST API request) | Manual or external system trigger |

Schedule Examples

triggers:
  # Every day at midnight
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 0 * * *"
 
  # Every 15 minutes
  - id: frequent
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/15 * * * *"
 
  # First Monday of each month
  - id: monthly
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * 1#1"
 
  # Webhook — POST to trigger
  - id: webhook
    type: io.kestra.plugin.core.trigger.Webhook
    key: "secret-key-here"
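
The two less obvious cron expressions above can be spelled out in plain Python. This is only a sketch of what the expressions mean, not how Kestra evaluates them; the helper names are hypothetical, and the `#` (nth weekday) syntax is an extension to classic cron, so it is worth confirming that your Kestra version accepts it.

```python
from datetime import datetime


def matches_every_15_minutes(dt: datetime) -> bool:
    """Mirror of "*/15 * * * *": the minute is 0, 15, 30, or 45."""
    return dt.minute % 15 == 0


def matches_first_monday_9am(dt: datetime) -> bool:
    """Mirror of "0 9 * * 1#1": 09:00 on the first Monday of the month."""
    is_monday = dt.weekday() == 0       # in Python, Monday is 0
    in_first_week = 1 <= dt.day <= 7    # the first Monday falls on days 1 to 7
    return is_monday and in_first_week and dt.hour == 9 and dt.minute == 0


print(matches_every_15_minutes(datetime(2024, 1, 1, 10, 45)))  # True
print(matches_first_monday_9am(datetime(2024, 1, 1, 9, 0)))    # True (a Monday)
print(matches_first_monday_9am(datetime(2024, 1, 8, 9, 0)))    # False (second Monday)
```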

Task Types & Patterns

Running Code in Any Language

tasks:
  # Python script
  - id: python-task
    type: io.kestra.plugin.scripts.python.Script
    script: |
      import pandas as pd
      df = pd.read_csv("data.csv")
      print(df.head())
 
  # Bash / Shell
  - id: bash-task
    type: io.kestra.plugin.scripts.shell.Script
    script: |
      echo "Hello from Bash"
      curl -X POST https://api.example.com/webhook
 
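  # Node.js
  - id: node-task
    type: io.kestra.plugin.scripts.node.Script
    beforeCommands:
      - npm install axios
    script: |
      const axios = require('axios');
      // Top-level await is not available in CommonJS, so wrap the call
      (async () => {
        const response = await axios.get('https://api.example.com');
        console.log(response.data);
      })();

  # Docker container: run shell commands inside a chosen image
  - id: docker-task
    type: io.kestra.plugin.scripts.shell.Commands
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: "python:3.11"
    commands:
      - python -c "print('Running in Docker')"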

Parallel Execution

tasks:
  # Run multiple tasks in parallel
  - id: parallel-ingest
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: ingest-source-a
        type: io.kestra.plugin.scripts.python.Script
        script: |
          print("Ingesting from source A")
 
      - id: ingest-source-b
        type: io.kestra.plugin.scripts.python.Script
        script: |
          print("Ingesting from source B")
 
      - id: ingest-source-c
        type: io.kestra.plugin.scripts.python.Script
        script: |
          print("Ingesting from source C")
 
  # All 3 run at the same time, next task waits for all to finish
  - id: merge-results
    type: io.kestra.plugin.scripts.python.Script
    script: |
      print("All sources ingested, merging...")
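
The fan-out and fan-in behaviour of the Parallel task resembles a thread pool followed by a join. The sketch below uses Python's standard library to show that shape; `ingest` is a hypothetical stand-in for the three ingestion tasks, and none of this reflects how Kestra schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest(source: str) -> str:
    """Stand-in for one ingestion task."""
    return f"ingested {source}"


sources = ["A", "B", "C"]

# Fan out: the three ingest calls may run at the same time.
with ThreadPoolExecutor(max_workers=len(sources)) as pool:
    results = list(pool.map(ingest, sources))  # blocks until all have finished

# Fan in: the merge step only runs once every branch has completed.
merged = ", ".join(results)
print(merged)  # ingested A, ingested B, ingested C
```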

Sequential Flow with Error Handling

tasks:
  - id: risky-task
    type: io.kestra.plugin.scripts.python.Script
    script: |
      # This might fail
      raise Exception("Simulated failure")
 
errors:
  # On failure, create a Jira ticket
  - id: alert-on-failure
    type: io.kestra.plugin.jira.issues.Create
    baseUrl: "yourcompany.atlassian.net"
    username: "{{ secret('JIRA_USER') }}"
    password: "{{ secret('JIRA_TOKEN') }}"    # Jira API token
    projectKey: "OPS"
    summary: "Workflow {{ flow.id }} failed"
    description: "Execution {{ execution.id }} of flow {{ flow.namespace }}.{{ flow.id }} failed"

Conditional Execution

tasks:
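  - id: check-data-quality
    type: io.kestra.plugin.scripts.python.Script
    beforeCommands:
      - pip install kestra
    script: |
      from kestra import Kestra
      row_count = 1000
      # print() alone does not create an output; publish the value explicitly
      Kestra.outputs({"row_count": row_count})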
 
  - id: branch-on-quality
    type: io.kestra.plugin.core.flow.If
    condition: "{{ outputs['check-data-quality'].vars.row_count > 500 }}"
    then:
      - id: proceed-to-load
        type: io.kestra.plugin.scripts.python.Script
        script: print("Data quality OK — proceeding")
    else:
      - id: alert-bad-data
        type: io.kestra.plugin.slack.IncomingWebhook
        url: "{{ secret('SLACK_WEBHOOK') }}"
        payload: |
          { "text": "Data quality failed — row count too low" }

Plugin Ecosystem

Plugin Categories

| Category | Key Plugins |
| --- | --- |
| Cloud (AWS) | S3 · Redshift · Lambda · Glue · Athena · ECR · ECS · EventBridge · Batch |
| Cloud (GCP) | BigQuery · Dataflow · Cloud Storage · Pub/Sub · Dataproc |
| Cloud (Azure) | Blob Storage · Data Factory · Event Hub · Synapse |
| Data | dbt · Airbyte · Spark · Flink · DuckDB · Polars · Pandas |
| Databases | PostgreSQL · MySQL · MongoDB · Redis · Elasticsearch · Snowflake |
| Messaging | Kafka · RabbitMQ · NATS · AWS SQS · Azure Service Bus · Pulsar |
| Dev Tools | GitHub · GitLab · Jira · Slack · PagerDuty · Zendesk |
| Infra | Terraform · Ansible · Kubernetes · Docker · Helm |
| AI / ML | OpenAI · LangChain · Pinecone · Vertex AI · Hugging Face |
| Scripts | Python · Bash · Node.js · Go · R · Julia |

Plugin Usage Example — dbt + Airbyte + Postgres

id: full-etl-pipeline
namespace: data.production
 
tasks:
  # Step 1 — Sync data from source using Airbyte
  - id: airbyte-sync
    type: io.kestra.plugin.airbyte.connections.Sync
    connectionId: "your-airbyte-connection-id"
    url: "http://airbyte:8006"
 
  # Step 2 — Run dbt transformations
  - id: dbt-run
    type: io.kestra.plugin.dbt.cli.DbtCLI
    commands:
      - dbt run --project-dir dbt_project
      - dbt test --project-dir dbt_project
 
  # Step 3 — Verify row counts in Postgres
  - id: verify-load
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ secret('POSTGRES_URL') }}"
    fetchType: FETCH    # Needed so rows are returned as outputs (older releases: fetch: true)
    sql: "SELECT COUNT(*) as row_count FROM public.transformed_data"
 
  # Step 4 — Alert Slack on completion
  - id: notify-success
    type: io.kestra.plugin.slack.IncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    payload: |
      { "text": "ETL pipeline completed: {{ outputs['verify-load'].rows[0]['row_count'] }} rows loaded" }

Governance & Reliability

Reliability Features

| Feature | How to Use |
| --- | --- |
| Retries | Set retry with maxAttempts: 3 on a task to retry automatically on failure |
| Timeouts | Set timeout: PT30M to stop a task after 30 minutes |
| SLAs | Alert if a flow takes longer than expected |
| Concurrency limits | Cap how many executions of a flow run at the same time |
| Backfill | Run a scheduled flow for past intervals that were missed |

tasks:
  - id: reliable-task
    type: io.kestra.plugin.scripts.python.Script
    retry:
      type: constant
      maxAttempts: 3
      interval: PT30S    # Wait 30s between retries
    timeout: PT10M       # Fail task if it runs > 10 minutes
    script: |
      # Your resilient task code here
      print("Running with retry + timeout protection")
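
The `interval` and `timeout` values are ISO 8601 durations, and a constant retry is a fixed-wait loop. The sketch below illustrates both ideas in plain Python. It assumes `maxAttempts` counts the first run as attempt one, handles only hour, minute, and second components, and uses hypothetical helper names; it is not Kestra's retry code.

```python
import re
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def parse_duration(value: str) -> int:
    """Convert a simple ISO 8601 time duration (PT30S, PT10M, PT1H30M) to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def run_with_constant_retry(
    fn: Callable[[], T],
    max_attempts: int,
    interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to max_attempts times, waiting a fixed interval after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(interval_seconds)
    raise ValueError("max_attempts must be at least 1")


attempts = []


def flaky() -> str:
    """Fails twice, then succeeds, to exercise the retry loop."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = run_with_constant_retry(
    flaky,
    max_attempts=3,
    interval_seconds=parse_duration("PT30S"),
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
print(result, len(attempts))  # ok 3
```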

Enterprise Governance Features

graph TD
    SSO["🔐 SSO Integration\nOkta · Azure AD · LDAP"]
    RBAC["👥 RBAC\nRole-based access control\nNamespace-level permissions"]
    Audit["📋 Audit Logs\nAll user actions logged\nImmutable history"]
    MT["🏢 Multi-Tenancy\nIsolated environments per team\nSeparate quotas + namespaces"]
    IW["🔒 Isolated Workers\nDedicated task runners\nAir-gapped environments"]
    SSO --> RBAC --> Audit
    MT --> IW

Security Best Practices

# WRONG — never do this
tasks:
  - id: bad-example
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "postgresql://user:PASSWORD123@host:5432/db"
 
# CORRECT — use secrets
tasks:
  - id: good-example
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ secret('POSTGRES_URL') }}"
 
# Secrets stored in:
#  → Kestra's built-in secrets store
#  → AWS Secrets Manager
#  → Azure Key Vault
#  → HashiCorp Vault
#  → GCP Secret Manager
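
In the open-source edition, secrets are typically supplied as environment variables whose names start with `SECRET_` and whose values are base64-encoded, which is what `secret('POSTGRES_URL')` then resolves. The helper below is a small sketch assuming that convention; `secret_env_line` is a hypothetical name, and the exact mechanism should be checked against the documentation for your version.

```python
import base64


def secret_env_line(name: str, value: str) -> str:
    """Build a KEY=VALUE line with a SECRET_ prefix and a base64-encoded value."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"SECRET_{name.upper()}={encoded}"


# Short demo value; in practice, read the real value from a secure source.
line = secret_env_line("POSTGRES_URL", "x")
print(line)  # SECRET_POSTGRES_URL=eA==
```

The resulting line can go into an env file that is excluded from version control, so the flow YAML only ever contains the `secret()` reference.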

Deployment Options

Editions Comparison

| Edition | Deployment | Key Features | Price |
| --- | --- | --- | --- |
| Open Source | Self-hosted (Docker / K8s) | Full core platform, all 1400+ plugins | Free forever |
| Enterprise | Self-hosted / Hybrid / Air-gapped | SSO, RBAC, audit logs, multi-tenancy, isolated workers, SLA support | Contact sales |
| Cloud | Fully managed by Kestra | Fastest time to value, production-ready, auto-scaling | Request access |

Docker Compose Quick Start

# docker-compose.yml — Kestra + Postgres
version: "3"
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    volumes:
      - postgres-data:/var/lib/postgresql/data
 
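  kestra:
    image: kestra/kestra:latest
    pull_policy: always
    user: "root"
    command: server standalone
    volumes:
      - kestra-data:/app/storage
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driverClassName: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          repository:
            type: postgres    # Keep flows and executions in Postgres
          queue:
            type: postgres    # Use Postgres as the internal queue
          storage:
            type: local
            local:
              basePath: /app/storage
    ports:
      - "8080:8080"    # Kestra UI and REST API (/api/v1)
      - "8081:8081"    # Management endpoints (health, metrics)
    depends_on:
      - postgres

volumes:
  postgres-data:
  kestra-data:

# Start Kestra
docker compose up -d

# Access the UI at http://localhost:8080
# Access the API at http://localhost:8080/api/v1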

Kubernetes Deployment

# Add Kestra Helm chart
helm repo add kestra https://helm.kestra.io
helm repo update
 
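# Install with custom values
# (value paths differ between chart versions; check the chart's values.yaml)
helm install kestra kestra/kestra \
  --set configuration.kestra.repository.type=postgres \
  --set configuration.kestra.queue.type=postgres \
  --set configuration.datasources.postgres.url=jdbc:postgresql://postgres:5432/kestra \
  --set configuration.datasources.postgres.username=kestra \
  --set configuration.datasources.postgres.password=your-password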
 
# Scale workers independently from the executor
kubectl scale deployment kestra-worker --replicas=5

CI/CD & GitOps

Git-Driven Workflow Deployment

# .github/workflows/deploy-kestra.yml
name: Deploy Kestra Flows
 
on:
  push:
    branches: [main]
 
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
 
      - name: Deploy flows to Kestra
        uses: kestra-io/deploy-action@master
        with:
          resource: flow
          directory: ./flows
          namespace: company.production
          server: ${{ secrets.KESTRA_HOSTNAME }}
          user: ${{ secrets.KESTRA_USER }}
          password: ${{ secrets.KESTRA_PASSWORD }}

API-First Control

# The REST API is served on the same port as the UI (8080 by default).
# Recent releases may also expect a tenant in the path, e.g. /api/v1/main/...;
# check the API reference for your version.

# Trigger a flow execution via API (inputs are sent as multipart form fields)
curl -X POST \
  http://localhost:8080/api/v1/executions/company.team/my-first-workflow \
  -F key=value

# Search executions in a namespace
curl "http://localhost:8080/api/v1/executions/search?namespace=company.team"

# Get execution details
curl http://localhost:8080/api/v1/executions/{executionId}

# Pause a running execution
curl -X POST \
  http://localhost:8080/api/v1/executions/{executionId}/pause
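
The same calls can be prepared from Python with only the standard library. This sketch builds the requests without sending them, so it runs without a Kestra server; the base URL is an assumption (adjust host and port to your deployment), the helper names are hypothetical, and the flow is the `my-first-workflow` example from earlier in this page.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: adjust host and port to wherever your instance serves its API.
BASE_URL = "http://localhost:8080"


def trigger_request(namespace: str, flow_id: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) the POST request that starts an execution."""
    url = f"{base_url}/api/v1/executions/{quote(namespace)}/{quote(flow_id)}"
    return Request(url, method="POST")


def details_request(execution_id: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) the GET request for one execution."""
    url = f"{base_url}/api/v1/executions/{quote(execution_id)}"
    return Request(url, method="GET")


req = trigger_request("company.team", "my-first-workflow")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), adding authentication if your instance requires it.
```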

Kestra vs Alternatives

Kestra vs Apache Airflow

| Dimension | Kestra | Apache Airflow |
| --- | --- | --- |
| Workflow language | YAML (declarative) | Python (imperative) |
| Learning curve | Low: any YAML editor works | High: Python + Airflow concepts |
| Code/UI | Both: YAML + full UI in sync | Code-first, limited UI |
| Language support | Any (Python, Bash, Node.js, Go, containers) | Python-centric |
| Event-driven | Native | Requires plugins/workarounds |
| Plugin ecosystem | 1400+ built-in | 3rd party providers, inconsistent |
| Scaling workers | Independent horizontal scaling | Complex worker scaling |
| Multi-tenancy | Built-in (Enterprise) | External solutions needed |
| Self-hosted | Yes: Docker or K8s | Yes: more complex setup |

Kestra vs n8n

| Dimension | Kestra | n8n |
| --- | --- | --- |
| Primary user | Data/Platform engineers | Non-technical users |
| Workflow definition | YAML (code) | Visual drag-and-drop |
| Scale | Enterprise, millions of executions | Mid-scale automation |
| Data pipelines | First-class (ETL, dbt, Spark) | Limited |
| Infrastructure automation | First-class (Terraform, Ansible) | Limited |
| Observability | Deep execution tracing, SLAs | Basic logs |

Real-World Use Case Patterns

Pattern 1 — Daily ETL Pipeline

graph LR
    T["⏰ Cron Trigger\n2:00 AM daily"]
    E["📥 Extract\nAPI → raw S3 bucket"]
    Tr["🔄 Transform\nPython/Polars → clean data"]
    L["📤 Load\nPostgres / Snowflake / BigQuery"]
    V["✅ Validate\nRow counts · null checks"]
    N["🔔 Notify\nSlack success / failure"]
    T --> E --> Tr --> L --> V --> N

Pattern 2 — Event-Driven Incident Response

graph LR
    Alert["🚨 PagerDuty Alert\nor monitoring webhook"]
    Assess["🔍 Assess Impact\nQuery metrics · check dashboards"]
    Decision["🤔 Decision\nSeverity level?"]
    Auto["🤖 Auto-remediate\nRestart service · scale up"]
    Escalate["📢 Escalate to Human\nCreate Jira · Page on-call"]
    Alert --> Assess --> Decision
    Decision -->|"Low severity"| Auto
    Decision -->|"High severity"| Escalate

Pattern 3 — AI Pipeline (RAG System)

id: rag-pipeline
namespace: ai.production
 
tasks:
  # Step 1 — Scrape new documents
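  - id: fetch-documents
    type: io.kestra.plugin.scripts.python.Script
    beforeCommands:
      - pip install kestra
    script: |
      from kestra import Kestra
      # Placeholder IDs; replace with a real query against your knowledge base
      doc_ids = ["doc-1", "doc-2"]
      # Publish the IDs so later tasks can read vars.doc_ids
      Kestra.outputs({"doc_ids": doc_ids})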
 
  # Step 2 — Chunk + Embed (parallel)
  - id: embed-parallel
    type: io.kestra.plugin.core.flow.EachParallel
    value: "{{ outputs['fetch-documents'].vars.doc_ids }}"
    tasks:
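      - id: embed-doc
        type: io.kestra.plugin.scripts.python.Script
        beforeCommands:
          - pip install openai
        env:
          OPENAI_API_KEY: "{{ secret('OPENAI_API_KEY') }}"
        script: |
          from openai import OpenAI
          client = OpenAI()  # reads OPENAI_API_KEY from the environment
          # Placeholder: embeds the document ID; pass the document text instead
          embedding = client.embeddings.create(
              model="text-embedding-3-small", input="{{ taskrun.value }}")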
 
  # Step 3 — Store in vector DB
  - id: store-embeddings
    type: io.kestra.plugin.scripts.python.Script
    script: |
      # Store in Pinecone / Qdrant / pgvector
      store_embeddings(embeddings)
 
  # Step 4 — Evaluate retrieval quality
  - id: evaluate-rag
    type: io.kestra.plugin.scripts.python.Script
    script: |
      # Run RAGAS evaluation metrics
      score = evaluate_retrieval()
      print(f"RAG score: {score}")
