datarootsio/terraform-aws-ecs-dagster

Maintained by dataroots Terraform 0.13

Terraform module Dagster on AWS ECS

This is a module for Terraform that deploys Dagster in AWS.

Setup

  • An ECS Cluster with:
    • Sidecar injection container
    • Dagit webserver container
    • Dagster daemon
  • An ALB
  • An S3 bucket
  • An RDS instance (optional but recommended)
  • A DNS Record (optional but recommended)

Average cost of a minimal setup with RDS: ~$60/month

Intent

The Dagster setup provided by this module is intended to manage your runs, schedules, and so on. If you want Dagster to have access to other services such as AWS EMR or AWS Glue, use the IAM role returned as an output (dagster_task_iam_role) and grant it permissions to those services through IAM.
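As a rough sketch of granting the task role extra permissions, the snippet below attaches an inline policy for AWS Glue. It assumes the module is instantiated as module.dagster and that the dagster_task_iam_role output exposes the role resource, so that a name attribute is available; if the output turns out to be a plain name or ARN string, adjust the role argument accordingly. The policy name and the chosen Glue actions are illustrative only.

```hcl
# Policy document allowing the Dagster task to start and inspect Glue job runs.
data "aws_iam_policy_document" "dagster_glue" {
  statement {
    effect    = "Allow"
    actions   = ["glue:StartJobRun", "glue:GetJobRun"]
    resources = ["*"] # illustrative; narrow this to specific job ARNs in practice
  }
}

# Inline policy attached to the role that the module outputs.
resource "aws_iam_role_policy" "dagster_glue" {
  name = "dagster-glue-access" # hypothetical name

  # Assumption: the output is the role resource, so `.name` can be referenced.
  role   = module.dagster.dagster_task_iam_role.name
  policy = data.aws_iam_policy_document.dagster_glue.json
}
```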

Usage

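module "dagster" {
    source = "datarootsio/ecs-dagster/aws"

    # Prefix and suffix used when naming the created resources
    resource_prefix = "my-awesome-company"
    resource_suffix = "env"

    # Network placement (input names as listed under Inputs);
    # at least two public subnets should be provided
    vpc           = "vpc-123456"
    public_subnet = ["subnet-456789", "subnet-098765"]

    # Example value only; do not commit real credentials
    rds_password = "super-secret-pass"
}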

Adding a new pipeline

To add new pipelines to Dagster:

  • add the new pipeline's file name and its path in the mounted volume of the ECS instance to the workspace.yml file:
load_from:
  - python_file:
      relative_path: new_pipeline.py
      working_directory: /path/to/mounted/volume
  • add the pipeline Python file to the pipeline folder of the created S3 bucket,
  • run the syncing pipeline in Dagit to pick up the new pipeline and the updated workspace.yml file.
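If you prefer to manage the upload step with Terraform as well, something along these lines may work. This is only a sketch: the bucket name in the local value, the pipeline/ key prefix, and the local pipelines/ directory are all assumptions, so replace them with the actual bucket and folder created by your deployment. The aws_s3_object resource requires AWS provider 4.0 or later; older provider versions use aws_s3_bucket_object instead.

```hcl
locals {
  # Hypothetical: set this to the name of the bucket created for your deployment.
  dagster_bucket_name = "dagster-bucket"
}

# Uploads the pipeline file; re-uploads whenever the local file content changes.
resource "aws_s3_object" "new_pipeline" {
  bucket = local.dagster_bucket_name
  key    = "pipeline/new_pipeline.py" # assumed location of the pipeline folder
  source = "${path.module}/pipelines/new_pipeline.py"
  etag   = filemd5("${path.module}/pipelines/new_pipeline.py")
}
```

The syncing pipeline still has to be run in Dagit afterwards so that the new file reaches the mounted volume.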

Security

This module supports HTTPS; however, RBAC is not (yet) supported by Dagster. As a result, Dagit is publicly available once the module is deployed. Users must implement and manage authentication themselves, for example by implementing SSO.
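Until authentication is in place, the exposure can at least be narrowed with the inputs listed below. The following is a sketch rather than a verified configuration: the domain names, the CIDR range (taken from a documentation-only address block), and the rds_password variable are placeholders, and how use_https, dns_name and route53_zone_name interact is inferred from their descriptions in the Inputs table.

```hcl
# Hypothetical variable so the password is not hardcoded in the configuration.
variable "rds_password" {
  type      = string
  sensitive = true
}

module "dagster" {
  source = "datarootsio/ecs-dagster/aws"

  resource_prefix = "my-awesome-company"
  resource_suffix = "env"

  vpc           = "vpc-123456"
  public_subnet = ["subnet-456789", "subnet-098765"]

  rds_password = var.rds_password

  # Serve Dagit over HTTPS; the zone is used for certificate validation.
  use_https         = true
  dns_name          = "dagster.example.com" # placeholder
  route53_zone_name = "example.com"         # placeholder

  # Restrict access to a known range instead of the default 0.0.0.0/0.
  ip_allow_list = ["203.0.113.0/24"] # placeholder range
}
```

An IP allow list is not a substitute for authentication; it only reduces who can reach the load balancer.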

Requirements

Name Version
terraform >= 0.14

Providers

Name Version
aws n/a

Inputs

Name Description Type Default Required
aws_availability_zone The availability zone of the resource. string "eu-west-1a" no
aws_region The region of the AWS account. string "eu-west-1" no
certificate_arn The ARN of the certificate that will be used. string "" no
dagster-container-home n/a string "/opt" no
dagster_config_bucket Dagster bucket containing the config files. string "dagster-bucket" no
dagster_file The config file needed to use database and daemon with dagit. string "dagster.yaml" no
dns_name The DNS name that will be used to expose Dagster. It will be auto-generated if not provided. string "" no
ecs_cpu The amount of CPU to give to the ECS instance. number 1024 no
ecs_memory The amount of memory to give to the ECS instance. number 2048 no
ip_allow_list A list of IP ranges that are allowed to access the Dagit webserver (default: full access). list(string) ["0.0.0.0/0"] no
log_retention The number of days that the logs should be retained. number 7 no
private_subnet The private subnets where the RDS and ECS reside. list(string) [] no
public_subnet The public subnets where the load balancer should reside. ECS and RDS will also use these if no private subnets are defined. At least two should be provided. list(string) [] no
rds_deletion_protection n/a bool false no
rds_instance_class The type of instance class for the RDS. string "db.t2.micro" no
rds_password The password to access the RDS instance. string "" no
rds_skip_final_snapshot Whether or not to skip the final snapshot before deleting (mainly for tests) bool true no
rds_username The username to access the RDS instance. string "" no
resource_prefix The prefix of the resource to be created string "ps" no
resource_suffix The suffix of the resource to be created string "sp" no
route53_zone_name The name of the route53 zone that will be used for the certificate validation. string "" no
tags Tags to add to the created resources. map(string) { "Name": "Terraform-aws-dagster" } no
use_https Expose traffic using HTTPS or not bool false no
vpc The id of the virtual private cloud. string "" no
workspace_file The config file needed to run dagit. string "workspace.yaml" no

Outputs

Name Description
dagster_alb_dns The DNS name of the ALB; use this to access the Dagster webserver
dagster_connection_sg The security group with which you can connect other instances to Dagster, for example EMR Livy
dagster_dns_record The created DNS record (only if "use_https" = true)
dagster_task_iam_role The IAM role of the dagster task, use this to give dagster more permissions
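These outputs can be re-exported from a root configuration in the usual Terraform way. The short example below assumes the module is instantiated as module.dagster; the output name dagster_url is hypothetical.

```hcl
# Expose the ALB DNS name so it is printed after `terraform apply`.
output "dagster_url" {
  description = "DNS name of the load balancer in front of Dagit"
  value       = module.dagster.dagster_alb_dns
}
```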

Makefile Targets

Available targets:

  tools                             Pull Go and Terraform dependencies
  fmt                               Format Go and Terraform code
  lint/lint-tf/lint-go              Lint Go and Terraform code
  test/testverbose                  Run tests

Contributing

Contributions to this repository are very welcome! Found a bug or do you have a suggestion? Please open an issue. Do you know how to fix it? Pull requests are welcome as well! To get you started faster, a Makefile is provided.

Make sure to install Terraform, Go (for automated testing) and Make (optional, if you want to use the Makefile) on your computer. Install tflint to be able to run the linting.

  • Set up tools & dependencies: make tools
  • Format your code: make fmt
  • Linting: make lint
  • Run tests: make test (or go test -timeout 2h ./... without Make)

Make sure you branch from the 'open-pr-here' branch, and submit a PR back to the 'open-pr-here' branch.

License

MIT license. Please see LICENSE for details.