materialize and materialize_to_memory now both accept a selection argument that allows specifying a subset of assets to materialize.
MultiPartitionsDefinition is no longer marked experimental.
Context methods to access time window partition information now work for MultiPartitionsDefinitions with a time dimension.
Improved the performance of the asset reconciliation sensor when a non-partitioned asset depends on a partitioned asset.
load_assets_from_package_module and similar methods now accept a freshness_policy, which will be applied to all loaded assets.
When the asset reconciliation sensor is scheduling based on freshness policies, and there are observable source assets, the observed versions now inform the data time of the assets.
build_sensor_context and build_multi_asset_sensor_context can now take a Definitions object in place of a RepositoryDefinition.
[UI] Performance improvement for loading asset partition statuses.
[dagster-aws] s3_resource now accepts use_ssl and verify configurations.
Fixed a bug that caused an error to be raised when passing a multi-asset into the selection argument on define_asset_job.
Fixed a GraphQL error that displayed on Dagit load when an asset’s partitions definition was changed from a single-dimensional partitions definition to a MultiPartitionsDefinition.
Fixed a bug that caused backfills to fail when spanning assets that live in different code locations.
Fixed an error that displays when a code location with a MultiPartitionsMapping (experimental) is loaded.
Fixed a bug that caused errors with invalid TimeWindowPartitionMappings to not be bubbled up to the UI.
Fixed an issue where the scheduler would sometimes incorrectly handle spring Daylight Saving Time transitions for schedules running at 2AM in a timezone other than UTC.
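To illustrate why 2AM schedules are an edge case, here is a standard-library sketch. It is not Dagster's scheduler code, and the America/Chicago zone and the 2023-03-12 date are chosen purely for illustration. On a spring-forward day, 2:00 AM local time does not exist:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Chicago")

# On 2023-03-12 local clocks jump from 02:00 straight to 03:00,
# so this wall-clock time falls in a gap and never actually occurs.
nominal = datetime(2023, 3, 12, 2, 0, tzinfo=tz)

# Round-tripping through UTC yields the wall-clock time that really exists.
actual = nominal.astimezone(timezone.utc).astimezone(tz)

print(nominal.strftime("%H:%M"), "->", actual.strftime("%H:%M"))
```

A scheduler has to decide what to do with the missing 2:00 AM tick, which is where handling can go wrong.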
Fixed an issue introduced in the 1.2.4 release where running pdb stopped working when using dagster dev.
Fixed an issue where it was possible to create AssetMaterialization objects with a null AssetKey.
Previously, if you had a TimeWindowPartitionsDefinition with a non-standard cron schedule and also provided a minute_of_hour or similar argument in build_schedule_from_partitioned_job, Dagster would silently create the wrong cron expression. It now raises an error.
The asset reconciliation sensor no longer fails when the event log contains materializations for partitions that aren’t in the asset’s PartitionsDefinition. These partitions are now ignored.
Fixed a regression that prevented materializing dynamically partitioned assets from the UI (thanks @planvin!)
[UI] On the asset graph, the asset health displayed in the sidebar for the selected asset updates as materializations and failures occur.
[UI] The asset partitions page has been adjusted to make materialization and observation event metadata more clear.
[UI] Large table schema metadata entries now display within a modal rather than taking up considerable space on the page.
[UI] Launching a backfill of a partitioned asset with unpartitioned assets immediately upstream no longer shows the “missing partitions” warning.
[dagster-airflow] Fixed a bug in the PersistentAirflowDatabase where versions of Airflow from 2.0.0 up to 2.3.0 would not use the correct connection environment variable name.
[dagster-census] Fixed a bug with the poll_sync_run function of dagster-census that prevented polling from working correctly (thanks @ldincolasmay!)
The run_request_for_partition method on JobDefinition and UnresolvedAssetJobDefinition is now deprecated and will be removed in 2.0.0. Instead, directly instantiate a run request with a partition key via RunRequest(partition_key=...).
When using build_asset_reconciliation_sensor, in some cases duplicate runs could be produced for the same partition of an asset. This has been fixed.
When using Pythonic configuration for resources, aliased field names would cause an error. This has been fixed.
Fixed an issue where context.asset_partitions_time_window_for_output threw an error when an asset was directly invoked with build_op_context.
[dagster-dbt] In some cases, use of ephemeral dbt models could cause the dagster representation of the dbt dependency graph to become incorrect. This has been fixed.
[celery-k8s] Fixed a bug that caused JSON deserialization errors when an Op or Asset emitted JSON that doesn't represent a DagsterEvent.
Fixed an issue where launching a large backfill while running dagster dev would sometimes fail with a connection error after running for a few minutes.
Fixed an issue where dagster dev would sometimes hang when running Dagster code that attempted to read in input via stdin.
Fixed an issue where runs that take a long time to import code would sometimes continue running even after they were stopped by run monitoring for taking too long to start.
Fixed an issue where AssetSelection.groups() would simultaneously select both source and regular assets and consequently raise an error.
Fixed an issue where BindResourcesToJobs would raise errors encapsulating jobs which had config specified at definition-time.
Fixed Pythonic config objects erroring when omitting optional values rather than specifying None.
Fixed Pythonic config and resources not supporting Enum values.
DagsterInstance.local_temp and DagsterInstance.ephemeral now use object instance scoped local artifact storage temporary directories instead of a shared process scoped one, removing a class of thread safety errors that could manifest on initialization.
Improved direct invocation behavior for ops and assets which specify resource dependencies as parameters, for instance:
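    from dagster import ConfigurableResource, op

    class MyResource(ConfigurableResource):
        pass

    @op
    def my_op(x: int, y: int, my_resource: MyResource) -> int:
        return x + y

    # Direct invocation: the resource is passed as a keyword argument.
    my_op(4, 5, my_resource=MyResource())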
[dagster-azure] Fixed an issue with an AttributeError being thrown when using the async DefaultAzureCredential (thanks @mpicard)
[ui] Fixed an issue introduced in 1.2.3 in which no log levels were selected by default when viewing Run logs, which made it appear as if there were no logs at all.
The environment_vars argument to ScheduleDefinition is deprecated (the argument is currently non-functional; environment variables no longer need to be whitelisted for schedules)
Jobs defined via define_asset_job now auto-infer their partitions definitions if not explicitly defined.
Observable source assets can now be run as part of a job via define_asset_job. This allows putting them on a schedule/sensor.
Added an instance property to the HookContext object that is passed into Op Hook functions, which can be used to access the current DagsterInstance object for the hook.
(experimental) Dynamic partitions definitions can now exist as dimensions of multi-partitions definitions.
[dagster-pandas] New create_table_schema_metadata_from_dataframe function to generate a TableSchemaMetadataValue from a Pandas DataFrame. Thanks @AndyBys!
[dagster-airflow] New option for setting dag_run configuration on the integration’s database resources.
[ui] The asset partitions page now links to the most recent failed or in-progress run for the selected partition.
[ui] Asset descriptions have been moved to the top in the asset sidebar.
[ui] Log filter switches have been consolidated into a single control, and selected log levels will be persisted locally so that the same selections are used by default when viewing a run.
[ui] You can now customize the hour formatting in timestamp display: 12-hour, 24-hour, or automatic (based on your browser locale). This option can be found in User Settings.
In certain situations a few of the first partitions displayed as “unpartitioned” in the health bar despite being materialized. This has now been fixed, but users may need to run dagster asset wipe-partitions-status-cache to see the partitions displayed.
Starting in 1.1.18, users whose gRPC server could not access the Dagster instance on user code deployments would see an error when launching backfills, because the instance could not be instantiated. This has been fixed.
Previously, incorrect partition status counts would display for static partitions definitions with duplicate keys. This has been fixed.
In some situations, having SourceAssets could prevent the build_asset_reconciliation_sensor from kicking off runs of downstream assets. This has been fixed.
The build_asset_reconciliation_sensor is now much more performant in cases where unpartitioned assets are upstream or downstream of static-partitioned assets with a large number of partitions.
[dagster-airflow] Fixed an issue where the persistent Airflow DB resource required the user to set the correct Airflow database URI environment variable.
[dagster-celery-k8s] Fixed an issue where run monitoring failed when setting the jobNamespace field in the Dagster Helm chart when using the CeleryK8sRunLauncher.
[ui] Filtering on the asset partitions page no longer results in keys being presented out of order in the left sidebar in some scenarios.
[ui] Launching an asset backfill outside an asset job page now supports partition mapping, even if your selection shares a partition space.
[ui] In the run timeline, date/time display at the top of the timeline was sometimes broken for users not using the en-US browser locale. This has been fixed.
Users can now opt in to have resources provided to Definitions bind to their jobs. Opt in by wrapping your job definitions in BindResourcesToJobs. This will become the default behavior in the future.
Added dagster asset list and dagster asset materialize commands to Dagster’s command line interface, for listing and materializing software-defined assets.
build_schedule_from_partitioned_job now accepts jobs partitioned with a MultiPartitionsDefinition that have a time-partitioned dimension.
Added SpecificPartitionsPartitionMapping, which allows an asset, or all partitions of an asset, to depend on a specific subset of the partitions in an upstream asset.
load_asset_value now supports SourceAssets.
[ui] Ctrl+K has been added as a keyboard shortcut to open global search.
[ui] Most pages with search bars now sync the search filter to the URL, so it’s easier to bookmark views of interest.
[ui] In the run logs table, the timestamp column has been moved to the far left, which will hopefully allow for better visual alignment with op names and tags.
[dagster-dbt] A new node_info_to_definition_metadata_fn to load_assets_from_dbt_project and load_assets_from_dbt_manifest allows custom metadata to be attached to the asset definitions generated from these methods.
[dagster-celery-k8s] The Kubernetes namespace in which runs using the CeleryK8sRunLauncher are launched can now be configured by setting the jobNamespace field in the Dagster Helm chart under celeryK8sRunLauncherConfig.
[dagster-gcp] The BigQuery I/O manager now accepts timeout configuration. Currently, this configuration will only be applied when working with Pandas DataFrames, and will set the number of seconds to wait for a request before using a retry.
[dagster-gcp] [dagster-snowflake] [dagster-duckdb] The BigQuery, Snowflake, and DuckDB I/O managers now support self-dependent assets. When a partitioned asset depends on a prior partition of itself, the I/O managers will now load that partition as a DataFrame. For the first partition in the dependency sequence, an empty DataFrame will be returned.
[dagster-k8s] k8s_job_op now supports running Kubernetes jobs with more than one pod (Thanks @Taadas).
Fixed a bug that caused backfill tags that users set in the UI to not be included on the backfill runs when launching an asset backfill.
Fixed a bug that prevented resume from failure re-execution for jobs that contained assets and dynamic graphs.
Fixed an issue where the asset reconciliation sensor would issue run requests for assets that were targeted by an active asset backfill, resulting in duplicate runs.
Fixed an issue where the asset reconciliation sensor could issue runs more frequently than necessary for assets with FreshnessPolicies having intervals longer than 12 hours.
Fixed an issue where AssetValueLoader.load_asset_value() didn’t load transitive resource dependencies correctly.
Fixed an issue where constructing a RunConfig object with optional config arguments would lead to an error.
Fixed the type annotation on ScheduleEvaluationContext.scheduled_execution_time to not be Optional.
Fixed the type annotation on OpExecutionContext.partition_time_window (thanks @elben10).
InputContext.upstream_output.log is no longer None when loading a source asset.
An input resolution bug that occurred in certain conditions when composing graphs with same named ops has been fixed.
Invoking an op with collisions between positional args and keyword args now throws an exception.
async def ops are now invoked with asyncio.run.
TimeWindowPartitionsDefinition now throws an error at definition time, instead of at runtime, when passed an invalid cron schedule.
[ui] Previously, using dynamic partitions with assets that required config would raise an error in the launchpad. This has been fixed.
[ui] The lineage tab loads faster and flickers less as you navigate between connected assets in the lineage graph.
[ui] The config YAML editor no longer offers incorrect autocompletion context when you’re beginning a new indented line.
[ui] When viewing the asset details page for a source asset, the button in the top right correctly reads “Observe” instead of “Materialize”.
[dagster-dbt] Previously, setting a cron_schedule_timezone inside of the config for a dbt model would not result in that property being set on the generated FreshnessPolicy. This has been fixed.
[dagster-gcp] Added a fallback download url for the GCSComputeLogManager when the session does not have permissions to generate signed urls.
[dagster-snowflake] In a previous release, functionality was added for the Snowflake I/O manager to attempt to create a schema if it did not already exist. This caused an issue when the schema already existed but the account did not have permission to create the schema. We now check if a schema exists before attempting to create it so that accounts with restricted permissions do not error, but schemas can still be created if they do not exist.
validate_run_config no longer accepts pipeline_def or mode arguments. These arguments refer to legacy concepts that were removed in Dagster 1.0, and since then there have been no valid values for them.
Added experimental support for resource requirements in sensors and schedules. Resources can be specified using required_resource_keys and accessed through the context or specified as parameters: