
Changelog#

1.0.9 (core) / 0.16.9 (libraries)#

New#

  • The multi_asset_sensor (experimental) now has improved capabilities to monitor asset partitions via a latest_materialization_records_by_partition method.
  • Performance improvements for the Partitions page in Dagit.

Bugfixes#

  • Fixed a bug that caused the op_config argument of dagstermill.get_context to be ignored.
  • Fixed a bug that caused errors when loading the asset details page for assets with time window partition definitions.
  • Fixed a bug where assets sometimes didn’t appear in the Asset Catalog while in Folder view.
  • [dagit] Opening the asset lineage tab no longer scrolls the page header off screen in some scenarios.
  • [dagit] The asset lineage tab no longer attempts to materialize source assets included in the upstream / downstream views.
  • [dagit] The Instance page Run Timeline no longer commingles runs with the same job name in different repositories.
  • [dagit] Emitting materializations with JSON metadata that cannot be parsed as JSON no longer crashes the run details page.
  • [dagit] Viewing the assets related to a run no longer shows the same assets multiple times in some scenarios.
  • [dagster-k8s] Fixed a bug with timeouts causing errors in k8s_job_op.
  • [dagster-docker] Fixed a bug with Op retries causing errors with the docker_executor.

Community Contributions#

  • [dagster-aws] Thanks @Vivanov98 for adding the list_objects method to S3FakeSession!

Experimental#

  • [dagster-airbyte] Added an experimental function to automatically generate Airbyte assets from project YAML files. For more information, see the dagster-airbyte docs.
  • [dagster-airbyte] Added the forward_logs option to AirbyteResource, allowing users to disable forwarding of Airbyte logs to the compute log, which can be expensive for long-running syncs.
  • [dagster-airbyte] Added the ability to generate Airbyte assets for basic normalization tables generated as part of a sync.

1.0.8 (core) / 0.16.8 (libraries)#

New#

  • With the new cron_schedule argument to TimeWindowPartitionsDefinition, you can now supply arbitrary cron expressions to define time window-based partition sets.
  • Graph-backed assets can now be subsetted for execution via AssetsDefinition.from_graph(my_graph, can_subset=True).
  • RunsFilter is now exported in the public API.
  • [dagster-k8s] The dagster-user-deployments.deployments[].schedulerName Helm value for specifying custom Kubernetes schedulers will now also apply to run and step workers launched for the given user deployment. Previously it would only apply to the grpc server.

Bugfixes#

  • In some situations, default asset config was ignored when a subset of assets were selected for execution. This has been fixed.
  • Added a pin to grpcio in dagster to address an issue with the recent 0.48.1 grpcio release that was sometimes causing Dagster code servers to hang.
  • Fixed an issue where the “Latest run” column on the Instance Status page sometimes displayed an older run instead of the most recent run.

Community Contributions#

  • In addition to a single cron string, cron_schedule now also accepts a sequence of cron strings. If a sequence is provided, the schedule will run for the union of all execution times for the provided cron strings, e.g., ['45 23 * * 6', '30 9 * * 0'] for a schedule that runs at 11:45 PM every Saturday and 9:30 AM every Sunday. Thanks @erinov1!
  • Added an optional boolean config install_default_libraries to databricks_pyspark_step_launcher. It allows running Databricks jobs without installing the default Dagster libraries. Thanks @nvinhphuc!

Experimental#

  • [dagster-k8s] Added additional configuration fields (container_config, pod_template_spec_metadata, pod_spec_config, job_metadata, and job_spec_config) to the experimental k8s_job_op that can be used to add additional configuration to the Kubernetes pod that is launched within the op.

1.0.7 (core) / 0.16.7 (libraries)#

New#

  • Several updates to the Dagit run timeline view: your time window preference will now be preserved locally, there is a clearer “Now” label to delineate the current time, and upcoming scheduled ticks will no longer be batched with existing runs.
  • [dagster-k8s] ingress.labels is now available in the Helm chart. Any provided labels are appended to the default labels on each object (helm.sh/chart, app.kubernetes.io/version, and app.kubernetes.io/managed-by).
  • [dagster-dbt] Added support for two types of dbt nodes: metrics, and ephemeral models.
  • When constructing a GraphDefinition manually, InputMapping and OutputMapping objects should now be constructed directly.

Bugfixes#

  • [dagster-snowflake] Pandas is no longer imported when dagster_snowflake is imported. Instead, it’s only imported when using functionality inside dagster-snowflake that depends on pandas.
  • Recent changes to run_status_sensors caused sensors that only monitored jobs in external repositories to also monitor all jobs in the current repository. This has been fixed.
  • Fixed an issue where "unhashable type" errors could be spawned from sensor executions.
  • [dagit] Clicking between assets in different repositories from asset groups and asset jobs now works as expected.
  • [dagit] The DAG rendering of composite ops with more than one input/output mapping has been fixed.
  • [dagit] Selecting a source asset in Dagit no longer produces a GraphQL error.
  • [dagit] Viewing “Related Assets” for an asset run now shows the full set of assets included in the run, regardless of whether they were materialized successfully.
  • [dagit] The Asset Lineage view has been simplified and lets you know if the view is being clipped and more distant upstream/downstream assets exist.
  • Fixed erroneous experimental warnings being thrown when using with_resources alongside source assets.

Breaking Changes#

  • [dagit] The launchpad tab is no longer shown for Asset jobs. Asset jobs can be launched via the “Materialize All” button shown on the Overview tab. To provide optional configuration, hold shift when clicking “Materialize”.
  • The arguments to InputMapping and OutputMapping APIs have changed.

Community Contributions#

  • The ssh_resource can now accept configuration from environment variables. Thanks @cbini!
  • Spelling corrections in migrations.md. Thanks @gogi2811!

1.0.6 (core) / 0.16.6 (libraries)#

New#

  • [dagit] nbconvert is now installed as an extra in Dagit.
  • Multiple assets can be monitored for materialization using the multi_asset_sensor (experimental).
  • Run status sensors can now monitor jobs in external repositories.
  • The config argument of define_asset_job now works if the job contains partitioned assets.
  • When configuring sqlite-based storages in dagster.yaml, you can now point to environment variables.
  • When emitting RunRequests from sensors, you can now optionally supply an asset_selection argument, which accepts a list of AssetKeys to materialize from the larger job.
  • [dagster-dbt] load_assets_from_dbt_project and load_assets_from_dbt_manifest now support the exclude parameter, allowing you to specify more precisely which resources to load from your dbt project (thanks @flvndh!)
  • [dagster-k8s] schedulerName is now available for all deployments in the Helm chart for users who use a custom Kubernetes scheduler

Bugfixes#

  • Previously, types for multi-assets would display incorrectly in Dagit when specified. This has been fixed.
  • In some circumstances, viewing nested asset paths in Dagit could lead to unexpected empty states. This was due to incorrect slicing of the asset list, and has been fixed.
  • Fixed an issue in Dagit where the dialog used to wipe materializations displayed broken text for assets with long paths.
  • [dagit] Fixed the Job page to change the latest run tag and the related assets to bucket repository-specific jobs. Previously, runs from jobs with the same name in different repositories would be intermingled.
  • Previously, if you launched a backfill for a subset of a multi-asset (e.g. dbt assets), all assets would be executed on each run, instead of just the selected ones. This has been fixed.
  • [dagster-dbt] Previously, if you configured a select parameter on your dbt_cli_resource, this would not get passed into the corresponding invocations of certain context.resources.dbt.x() commands. This has been fixed.

1.0.4 (core) / 0.16.4 (libraries)#

New#

  • Assets can now be materialized to storage conditionally by setting output_required=False. If this is set and no result is yielded from the asset, Dagster will not create an asset materialization event, the I/O manager will not be invoked, downstream assets will not be materialized, and asset sensors monitoring the asset will not trigger.
  • JobDefinition.run_request_for_partition can now be used inside sensors that target multiple jobs (Thanks Metin Senturk!)
  • The environment variable DAGSTER_GRPC_TIMEOUT_SECONDS now allows for overriding the default timeout for communications between host processes like dagit and the daemon and user code servers.
  • Import time for the dagster module has been reduced by approximately 50% in initial measurements.
  • AssetIn now accepts a dagster_type argument, for specifying runtime checks on asset input values.
  • [dagit] The column names on the Activity tab of the asset details page no longer reference the legacy term “Pipeline”.
  • [dagster-snowflake] The execute_query method of the snowflake resource now accepts a use_pandas_result argument, which fetches the result of the query as a Pandas dataframe. (Thanks @swotai!)
  • [dagster-shell] Made the execute and execute_script_file utilities in dagster_shell part of the public API (Thanks Fahad Khan!)
  • [dagster-dbt] load_assets_from_dbt_project and load_assets_from_dbt_manifest now support the exclude parameter. (Thanks @flvndh!)

Bugfixes#

  • [dagit] Removed the x-frame-options response header from Dagit, allowing the Dagit UI to be rendered in an iframe.
  • [fully-featured project example] Fixed the duckdb IO manager so the comment_stories step can load data successfully.
  • [dagster-dbt] Previously, if a select parameter was configured on the dbt_cli_resource, it would not be passed into invocations of context.resources.dbt.run() (and other similar commands). This has been fixed.
  • [dagster-ge] An incompatibility between dagster_ge_validation_factory and dagster 1.0 has been fixed.
  • [dagstermill] Previously, updated arguments and properties to DagstermillExecutionContext were not exposed. This has since been fixed.

Documentation#

  • The integrations page on the docs site now has a section for links to community-hosted integrations. The first linked integration is @silentsokolov’s Vault integration.