Changelog#

0.13.11#

New#

  • [dagit] Made performance improvements to the Run page.
  • [dagit] Highlighting a specific sensor / schedule tick is now reflected in a shareable URL.

Bugfixes#

  • [dagit] On the Runs page, when filtering runs with a tag containing a comma, the filter input would incorrectly break the tag apart. This has been fixed.
  • [dagit] For sensors that do not target a specific job (e.g. run_status_sensor), we now hide potentially confusing job details.
  • [dagit] Fixed an issue where some graph explorer views generated multiple scrollbars.
  • [dagit] Fixed an issue with the Run view where the Gantt view incorrectly showed in-progress steps when the run had exited.
  • [dagster-celery-k8s] Fixed an issue where setting a custom Celery broker URL but not a custom Celery backend URL in the helm chart would produce an incorrect Celery configuration.
  • [dagster-k8s] Fixed an issue where Kubernetes volumes using list or dict types could not be set in the Helm chart.

Community Contributions#

  • [dagster-k8s] Added the ability to set a custom location name when configuring a workspace in the Helm chart. Thanks @pcherednichenko!

Experimental#

  • [dagit] Asset jobs now display spinners on assets that are currently in progress.
  • [dagit] Asset jobs that are in progress now display a dot icon on all assets that are not yet running but will be re-materialized in the run.
  • [dagit] Fixed broken links to the asset catalog entries from the explorer view of asset jobs.
  • The AssetIn input object now accepts an asset key so upstream assets can be explicitly specified (e.g. AssetIn(asset_key=AssetKey("asset1"))).
  • The @asset decorator now has an optional non_argument_deps parameter that accepts AssetKeys of assets that do not pass data but are upstream dependencies.
  • ForeignAsset objects now have an optional description attribute.
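Taken together, a minimal sketch of these experimental asset additions might look like the following (the asset names other than asset1 are hypothetical, and the import path for the experimental asset APIs may differ in your version):

from dagster import AssetKey
from dagster.core.asset_defs import AssetIn, ForeignAsset, asset  # experimental; import path may vary

# An asset defined outside this repository, now with a description.
raw_events = ForeignAsset(key=AssetKey("raw_events"), description="Raw event log loaded elsewhere")

@asset(
    ins={"upstream": AssetIn(asset_key=AssetKey("asset1"))},  # explicit upstream asset key
    non_argument_deps={AssetKey("raw_events")},               # upstream dependency that passes no data
)
def downstream_asset(upstream):
    return upstream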

Documentation#

  • “Validating Data with Dagster Type Factories” guide added.

0.13.10#

New#

  • run_id, job_name, and op_exception have been added as parameters to build_hook_context.
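For example, a context for unit-testing a failure hook might be built as follows (a minimal sketch; the run id, job name, and exception are placeholder values):

from dagster import build_hook_context

# Build a synthetic hook context for a test; all values below are placeholders.
context = build_hook_context(
    run_id="example-run-id",
    job_name="my_job",
    op_exception=Exception("op failed"),
)
assert context.job_name == "my_job"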
  • You can now define inputs on the top-level job / graph. Those inputs can be configured as an inputs key on the top level of your run config. For example, consider the following job:
from dagster import job, op

@op
def add_one(x):
    return x + 1

@job
def my_job(x):
    add_one(x)

You can now add config for x at the top level of your run_config like so:

run_config = {
  "inputs": {
    "x": {
      "value": 2
    }
  }
}
  • You can now create partitioned jobs and reference a run’s partition from inside an op body or IOManager load_input or handle_output method, without threading partition values through config. For example, where previously you might have written:
from dagster import job, op, static_partitioned_config

@op(config_schema={"partition_key": str})
def my_op(context):
    print("partition_key: " + context.op_config["partition_key"])

@static_partitioned_config(partition_keys=["a", "b"])
def my_static_partitioned_config(partition_key: str):
    return {"ops": {"my_op": {"config": {"partition_key": partition_key}}}}

@job(config=my_static_partitioned_config)
def my_partitioned_job():
    my_op()

You can now write:

from dagster import StaticPartitionsDefinition, job, op

@op
def my_op(context):
    print("partition_key: " + context.partition_key)

@job(partitions_def=StaticPartitionsDefinition(["a", "b"]))
def my_partitioned_job():
    my_op()
  • Added op_retry_policy to @job. You can also specify op_retry_policy when invoking to_job on graphs.
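For example (a minimal sketch; the retry settings are illustrative):

from dagster import RetryPolicy, job, op

@op
def flaky_op():
    ...

# Every op in the job inherits this retry policy unless it defines its own.
@job(op_retry_policy=RetryPolicy(max_retries=3, delay=2))
def my_retrying_job():
    flaky_op()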
  • [dagster-fivetran] The fivetran_sync_op will now be rendered with a fivetran tag in Dagit.
  • [dagster-fivetran] The fivetran_sync_op now supports producing AssetMaterializations for each table updated during the sync. To this end, it now outputs a structured FivetranOutput containing this schema information, instead of an unstructured dictionary.
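For example, a minimal sketch of wiring up fivetran_sync_op (the connector id and environment variable names below are placeholders):

from dagster import job
from dagster_fivetran import fivetran_resource, fivetran_sync_op

# Placeholder credentials and connector id.
my_fivetran = fivetran_resource.configured(
    {"api_key": {"env": "FIVETRAN_API_KEY"}, "api_secret": {"env": "FIVETRAN_API_SECRET"}}
)
sync_my_connector = fivetran_sync_op.configured({"connector_id": "my_connector_id"}, name="sync_my_connector")

@job(resource_defs={"fivetran": my_fivetran})
def fivetran_sync_job():
    sync_my_connector()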
  • [dagster-dbt] AssetMaterializations produced from the dbt_cloud_run_op now include a link to the dbt Cloud docs for each asset (if docs were generated for that run).
  • You can now use the @schedule decorator with RunRequest-based evaluation functions. For example, you can now write:
from dagster import RunRequest, schedule

@schedule(cron_schedule="* * * * *", job=my_job)
def my_schedule(context):
    yield RunRequest(run_key="a", ...)
    yield RunRequest(run_key="b", ...)
  • [dagster-k8s] You may now configure instance-level python_logs settings using the Dagster Helm chart.
  • [dagster-k8s] You can now manage the secret that contains the Celery broker and backend URLs yourself, rather than having the Helm chart manage it for you.
  • [dagster-slack] Improved the default messages in make_slack_on_run_failure_sensor to use Slack layout blocks and include a clickable link to Dagit. Previously, it sent a plain text message.
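For example, a minimal sketch of installing the sensor (the channel name, token environment variable, and Dagit URL are placeholders; the dagit_base_url argument supplies the link target):

import os

from dagster_slack import make_slack_on_run_failure_sensor

# Placeholder channel, token, and Dagit URL.
slack_on_run_failure = make_slack_on_run_failure_sensor(
    channel="#dagster-alerts",
    slack_token=os.environ["SLACK_BOT_TOKEN"],
    dagit_base_url="http://localhost:3000",
)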

Dagit#

  • Made performance improvements to the Run page.
  • The Run page now has a pane control that splits the Gantt view and log table evenly on the screen.
  • The Run page now includes a list of succeeded steps in the status panel next to the Gantt chart.
  • In the Schedules list, execution timezone is now shown alongside tick timestamps.
  • If no repositories are successfully loaded when viewing Dagit, we now redirect to /workspace to quickly surface errors to the user.
  • Increased the size of the reload repository button.
  • Repositories that had been hidden from the left nav became inaccessible when loaded in a workspace containing only that repository. Now, when loading a workspace containing a single repository, jobs for that repository will always appear in the left nav.
  • In the Launchpad, selected ops were incorrectly hidden in the lower right panel. This has been fixed.
  • Repaired asset search input keyboard interaction.
  • In the Run page, the list of previous runs was incorrectly ordered based on run ID, and is now ordered by start time.
  • Fixed the behavior of keyboard commands that use the / key (e.g. toggling commented code) in the config editor.

Bugfixes#

  • Previously, if an asset in a software-defined assets job depended on a ForeignAsset, the repository containing that job would fail to load. This has been fixed.
  • Fixed an issue where global search was incorrectly triggered.
  • Fixed the type of the tags field in the EMR cluster config (thanks Chris!).
  • Fixed the tests in dagster new-project, which were previously using an outdated result API (thanks Vašek!).

Experimental#

  • You can now mount AWS Secrets Manager secrets as environment variables in runs launched by the EcsRunLauncher.
  • You can now specify the CPU and Memory for runs launched by the EcsRunLauncher.
  • The EcsRunLauncher now dynamically chooses between assigning a public IP address or not based on whether it’s running in a public or private subnet.
  • The @asset and @multi_asset decorators now return AssetsDefinition objects instead of OpDefinitions.

Documentation#

  • The tutorial now uses get_dagster_logger instead of context.log.
  • In the API docs, most configurable objects (such as ops and resources) now have their configuration schema documented in-line.
  • Removed a typo from the CLI readme (thanks Kan (https://github.com/zkan))!

0.13.9#

New#

  • Memoization can now be used with the multiprocess, k8s, celery-k8s, and dask executors.

0.13.8#

New#

  • Improved the error message for situations where you try a, b = my_op() inside @graph or @job, but my_op only has a single Out.
  • [dagster-dbt] A new integration with dbt Cloud allows you to launch dbt Cloud jobs as part of your Dagster jobs. This comes complete with rich error messages, links back to the dbt Cloud UI, and automatically generated Asset Materializations to help keep track of your dbt models in Dagit. It provides a pre-built dbt_cloud_run_op, as well as a more flexible dbt_cloud_resource for more customized use cases. Check out the api docs to learn more!
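For example, a minimal sketch of launching a dbt Cloud job from Dagster (the account id, job id, and environment variable name are placeholders):

from dagster import job
from dagster_dbt import dbt_cloud_resource, dbt_cloud_run_op

# Placeholder credentials and ids.
my_dbt_cloud = dbt_cloud_resource.configured(
    {"auth_token": {"env": "DBT_CLOUD_API_TOKEN"}, "account_id": 11111}
)
run_dbt_cloud_job = dbt_cloud_run_op.configured({"job_id": 33333}, name="run_dbt_cloud_job")

@job(resource_defs={"dbt_cloud": my_dbt_cloud})
def dbt_cloud_job():
    run_dbt_cloud_job()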
  • [dagster-gcp] Pinned the google-cloud-bigquery dependency to <3, because the new 3.0.0b1 version was causing some problems in tests.
  • [dagit] Verbiage update to make it clear that wiping an asset means deleting the materialization events for that asset.

Bugfixes#

  • Fixed a bug with the pipeline launch / job launch CLIs that would spin up an ephemeral dagster instance for the launch, then tear it down before the run actually executed. Now, the CLI will enforce that your instance is non-ephemeral.
  • Fixed a bug with re-execution when an upstream step skips some outputs. Previously, it mistakenly tried to load inputs from parent runs. Now, if an upstream step doesn’t yield outputs, the downstream step will skip.
  • [dagit] Fixed a bug where configs for unsatisfied inputs weren’t properly resolved when an op selection was specified in the Launchpad.
  • [dagit] Restored local font files for Inter and Inconsolata instead of using the Google Fonts API. This allows correct font rendering for offline use.
  • [dagit] Improved initial workspace loading screen to indicate loading state instead of showing an empty repository message.

Breaking Changes#

  • The pipeline argument of the InitExecutorContext constructor has been changed to job.

Experimental#

  • The @asset decorator now accepts a dagster_type argument, which determines the DagsterType for the output of the asset op.
  • build_assets_job accepts an executor_def argument, which determines the executor for the job.
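A minimal sketch combining the two (the type and asset names are hypothetical, and the import path for the experimental asset APIs may differ in your version):

from dagster import DagsterType, multiprocess_executor
from dagster.core.asset_defs import asset, build_assets_job  # experimental; import path may vary

# A hypothetical DagsterType applied to the asset's output.
NonEmptyList = DagsterType(name="NonEmptyList", type_check_fn=lambda _, value: len(value) > 0)

@asset(dagster_type=NonEmptyList)
def my_asset():
    return [1, 2, 3]

# The executor_def argument determines which executor the asset job uses.
assets_job = build_assets_job("assets_job", assets=[my_asset], executor_def=multiprocess_executor)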

Documentation#

  • A docs section on context manager resources has been added. Check it out here.
  • Removed the versions of the Hacker News example jobs that used the legacy solid & pipeline APIs.

0.13.7#

New#

  • The Runs page in Dagit now loads much more quickly.

Bugfixes#

  • Fixed an issue where Dagit would sometimes display a red "Invalid JSON" error message.

Dependencies#

  • google-cloud-bigquery is temporarily pinned to be prior to version 3 due to a breaking change in that version.