astronomer airflow upgrade

The new pool config option allows users to choose a different pool, and some of these changes may be breaking. Additionally, set enable_xcom_pickling = True in the [core] section of your Airflow config. (#23183)

- Fix dag_id extraction for dag level access checks in web ui (#23015)
- Fix timezone display for logs on UI (#23075)
- Change trigger dropdown left position (#23013)
- Don't add planned tasks for legacy DAG runs (#23007)
- Add dangling rows check for TaskInstance references (#22924)
- Validate the input params in connection CLI command (#22688)
- Fix trigger event payload is not persisted in db (#22944)
- Drop airflow moved tables in command db reset (#22990)
- Add max width to task group tooltips (#22978)
- Add template support for external_task_ids
- [AIRFLOW-1764] The web interface should not use the experimental API
- [AIRFLOW-1771] Rename heartbeat to avoid confusion
- [AIRFLOW-1769] Add support for templates in VirtualenvOperator
- [AIRFLOW-1763] Fix S3TaskHandler unit tests
- [AIRFLOW-1315] Add Qubole File & Partition Sensors
- [AIRFLOW-1018] Make processor use logging framework
- [AIRFLOW-1695] Add RedshiftHook using boto3
- [AIRFLOW-1706] Fix query error for MSSQL backend
- [AIRFLOW-1711] Use ldap3 dict for group membership
- [AIRFLOW-1757] Add missing options to SparkSubmitOperator
- [AIRFLOW-1734] Sqoop hook/operator enhancements
- [AIRFLOW-1731] Set pythonpath for logging
- [AIRFLOW-1641] Handle executor events in the scheduler
- [AIRFLOW-1744] Make sure max_tries can be set
- [AIRFLOW-1732] Improve dataflow hook logging
- [AIRFLOW-1736] Add HotelQuickly to Who Uses Airflow
- [AIRFLOW-1657] Handle failing qubole operator
- [AIRFLOW-1677] Fix typo in example_qubole_operator
- [AIRFLOW-1716] Fix multiple __init__ def in SimpleDag
- [AIRFLOW-1432] Charts label for Y axis not visible
- [AIRFLOW-1743] Verify ldap filters correctly
- [AIRFLOW-1745] Restore default signal disposition
- [AIRFLOW-1741] Correctly hide second chart on task duration page
- [AIRFLOW-1728] Add networkUri, subnet, tags to Dataproc operator
- [AIRFLOW-1726] Add copy_expert psycopg2 method to PostgresHook
- [AIRFLOW-1330] Add conn_type argument to CLI when adding connection
- [AIRFLOW-1698] Remove SCHEDULER_RUNS env var in systemd
- [AIRFLOW-1692] Change test_views filename to support Windows
- [AIRFLOW-1722] Fix typo in scheduler autorestart output filename
- [AIRFLOW-1723] Support sendgrid in email backend
- [AIRFLOW-1718] Set num_retries on Dataproc job request execution
- [AIRFLOW-1727] Add unit tests for DataProcHook
- [AIRFLOW-1631] Fix timing issue in unit test
- [AIRFLOW-1631] Fix local executor unbound parallelism

To upgrade to the latest Astronomer Certified 2.3.0 patch fix, for example, you would: Note: If you're pushing code to an Airflow Deployment using the Astro CLI and install a new Astronomer Certified image for the first time without pinning a specific hotfix version, the latest available version is automatically pulled. The change aims to unify the format of all options that refer to objects in the airflow.cfg file. This may help users achieve better concurrency performance. contrib.hooks.gcp_dataflow_hook.DataFlowHook now uses --runner=DataflowRunner instead of DataflowPipelineRunner, which was removed from the google-cloud-dataflow-0.6.0 package. [AIRFLOW-2380] Add support for environment variables in Spark submit operator. If you configure a backend secret, it also means the webserver doesn't need to connect to it. Use the option in the [scheduler] section to achieve the same effect. Read each Connection field individually. [AIRFLOW-2905] Switch to regional dataflow job service. Internally, the providers manager will still use a prefix to ensure each custom field is globally unique, but the absence of a prefix in the returned widget dict will signal to the Web UI to read and store custom fields without the prefix. The current webserver UI uses the Flask-Admin extension. E.g. tasks for all DAG runs of the DAG.
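One way to pin an Astronomer Certified hotfix is in the Astro project's Dockerfile. The sketch below is illustrative only: the repository and tag are examples of the naming scheme, not guaranteed current images, so check Astronomer's release notes for real tags.

```dockerfile
# Illustrative sketch: pin a full hotfix tag rather than a bare minor
# version so deploys do not silently pull a newer patch.
# (Repository and tag shown are assumptions, verify against release notes.)
FROM quay.io/astronomer/ap-airflow:2.3.0-2-onbuild
```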
Google Cloud Connection. by the experimental REST API. This is configurable at the DAG level with max_active_tasks, and a default can be set in airflow.cfg. The label has also been changed to Running Slots. The behavior has been changed to return an empty list instead of None in this case. airflow.providers.google.cloud.hooks.bigquery.BigQueryBaseCursor.create_empty_dataset raises AirflowException instead of ValueError. Due to security concerns, the new webserver will no longer support the features in the Data Profiling menu of the old UI, including Ad Hoc Query, Charts, and Known Events. If the log level of the message meets or exceeds the log level of the logger itself, the message will undergo further processing. This inconsistency in behavior made the API less intuitive to users. See the latest API. This means that users now have access to the full Kubernetes API, defaulting to the default_timezone in the global config.
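As an illustration of the concurrency settings just mentioned: the global default lives in airflow.cfg, while an individual DAG can override it by passing max_active_tasks to its DAG object. The values below are examples, not recommended defaults.

```ini
# airflow.cfg -- example values, not recommended defaults
[core]
# Default cap on concurrently running task instances per DAG.
# A single DAG can override this via DAG(..., max_active_tasks=...).
max_active_tasks_per_dag = 16
```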

To restore the previous behavior, the user must consciously set an empty key in the fernet_key option of airflow.cfg. For example: Now if you resolve a Param without a default and don't pass a value, you will get a TypeError. Follow the hotfix upgrade process to pin your image to a particular version. If you want to remove these warning messages from the Airflow UI, reach out to Astronomer support. To upgrade the schema, issue airflow upgradedb. A logger is the entry point into the logging system. implicit dependency to BaseOperator. This changes the default for new installs to deny all requests by default. Replace parameter max_retires with max_retries to fix a typo. some bugs. When you set it to false, the header was not added, so Airflow could be embedded in an iframe. The experimental REST API is disabled by default. It is no longer required to set one of the environment variables to avoid Airflow support for (#14827), Fix used_group_ids in dag.partial_subset (#13700) (#15308), Further fix trimmed pod_id for KubernetesPodOperator (#15445), Bugfix: Invalid name when trimmed pod_id ends with hyphen in KubernetesPodOperator (#15443), Fix incorrect slots stats when TI pool_slots > 1 (#15426), Fix sync-perm to work correctly when update_fab_perms = False (#14847), Fixes limits on Arrow for plexus test (#14781), Fix AzureDataFactoryHook failing to instantiate its connection (#14565), Fix permission error on non-POSIX filesystem (#13121), Fix get_context_data doctest import (#14288), Correct typo in GCSObjectsWtihPrefixExistenceSensor (#14179), Fix critical CeleryKubernetesExecutor bug (#13247), Fix four bugs in StackdriverTaskHandler (#13784), func.sum may return Decimal that break rest APIs (#15585), Persist tags params in pagination (#15411), API: Raise AlreadyExists exception when the execution_date is same (#15174), Remove duplicate call to sync_metadata inside DagFileProcessorManager (#15121), Extra docker-py update to resolve docker op issues (#15731), Ensure executor's end method is called
(#14085), Prevent clickable bad links on disabled pagination (#15074), Acquire lock on db for the time of migration (#10151), Skip SLA check only if SLA is None (#14064), Print right version in airflow info command (#14560), Make airflow info work with pipes (#14528), Rework client-side script for connection form. a previously installed version of Airflow before installing 1.8.1. Operators that require two connections are not changed. By default pickling is still enabled until Airflow 2.0. Used slot has been renamed to running slot to make the name self-explanatory. If you access Airflow's metadata database directly, you should rewrite the implementation to use the run_id column instead. If your DAG folder was /var/dags/ and your airflowignore contained /var/dag/excluded/, you should change it. The default_queue configuration option has been moved from the [celery] section to the [operators] section to allow for re-use between different executors. For example: These warnings have no impact on your tasks or DAGs and can be ignored. Use kerberos_service_name = hive as standard instead of impala. It falls back to the connection schema attribute. As part of this change, the hide_sensitive_variable_fields option in the [admin] section has been replaced by the hide_sensitive_var_conn_fields option in the [core] section. Please read through the new scheduler options; defaults have changed since 1.7.1. If your module was in the path my_acme_company.executors.MyCustomExecutor and the plugin was Only changes unique to this provider are described here. faster for larger tables.
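The logger-level filtering described earlier in this section (a message is processed further only if its level meets or exceeds the logger's own level) can be demonstrated with Python's standard logging module. This is a general illustration of the logging framework Airflow builds on, not Airflow-specific code.

```python
import logging

# A logger is the entry point into the logging system.
logger = logging.getLogger("example")
logger.setLevel(logging.WARNING)  # messages below WARNING are dropped

records = []

class ListHandler(logging.Handler):
    """Collect the messages that pass the logger's level check."""
    def emit(self, record):
        records.append(record.getMessage())

logger.addHandler(ListHandler())

logger.info("ignored: INFO is below the logger's WARNING level")
logger.error("processed: ERROR meets or exceeds WARNING")

print(records)  # only the ERROR message survives
```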
Previously, not all hooks and operators related to Google Cloud used the Google Cloud Connection. The previous default was an empty string, but the code used 0 if it was empty. If you want to install the integration for Microsoft Azure, then instead of, you should run pip install 'apache-airflow[microsoft.azure]'. If you want to install the integration for Amazon Web Services, then instead of. MySqlToGoogleCloudStorageOperator now exports TIMESTAMP columns as UTC. It only signals to Astronomer your intent to upgrade at a later time. package was supported by the community. (#24399), Task log templates are now read from the metadata database instead of, Minimum kubernetes library version bumped from. As a result, the python_callable argument was removed. in [sentry] section to "True". files, so it is impossible to determine the correct file path in every case, e.g. While DAG Serialization is a strict requirement since Airflow 2, we allowed users to control This will bring up a login page; enter the recently created admin username and password. to have specified explicit_defaults_for_timestamp=1 in your my.cnf under [mysqld]. which apply to most services. possible, but the configuration has changed. have been made to the core (including core operators) as they can affect the integration behavior. SFTPOperator is added to perform secure file transfer from server A to server B. Any directory may be added to the PYTHONPATH; this might be handy when the config is in another directory or a volume is mounted in the case of Docker.
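For instance, a config directory can be put on the import path like this (the directory /opt/airflow-config below is hypothetical; substitute your own):

```shell
# /opt/airflow-config is a hypothetical directory holding custom modules.
export PYTHONPATH=/opt/airflow-config:$PYTHONPATH
# Python (and therefore Airflow) will now search it for importable modules.
python3 -c "import sys; print('/opt/airflow-config' in sys.path)"  # prints: True
```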
- [AIRFLOW-780] Fix dag import errors no longer working
- [AIRFLOW-783] Fix py3 incompatibility in BaseTaskRunner
- [AIRFLOW-810] Correct down_revision dag_id/state index creation
- [AIRFLOW-807] Improve scheduler performance for large DAGs
- [AIRFLOW-798] Check return_code before forcing termination
- [AIRFLOW-139] Let psycopg2 handle autocommit for PostgresHook
- [AIRFLOW-776] Add missing cgroups devel dependency
- [AIRFLOW-777] Fix expression to check if a DagRun is in running state
- [AIRFLOW-785] Don't import CgroupTaskRunner at global scope
- [AIRFLOW-624] Fix setup.py to not import airflow.version as version
- [AIRFLOW-779] Task should fail with specific message when deleted
- [AIRFLOW-778] Fix completely broken MetastorePartitionSensor
- [AIRFLOW-739] Set pickle_info log to debug
- [AIRFLOW-771] Make S3 logs append instead of clobber
- [AIRFLOW-773] Fix flaky datetime addition in api test
- [AIRFLOW-219][AIRFLOW-398] Cgroups + impersonation
- [AIRFLOW-683] Add Jira hook, operator and sensor
- [AIRFLOW-762] Add Google DataProc delete operator
- [AIRFLOW-759] Use previous dag_run to verify depend_on_past
- [AIRFLOW-757] Set child_process_log_directory default more sensible
- [AIRFLOW-692] Open XCom page to super-admins only
- [AIRFLOW-747] Fix retry_delay not honoured
- [AIRFLOW-558] Add Support for dag.catchup=(True|False) Option
- [AIRFLOW-489] Allow specifying execution date in trigger_dag API
- [AIRFLOW-738] Commit deleted xcom items before insert
- [AIRFLOW-729] Add Google Cloud Dataproc cluster creation operator
- [AIRFLOW-728] Add Google BigQuery table sensor
- [AIRFLOW-741] Log to debug instead of info for app.py
- [AIRFLOW-731] Fix period bug for NamedHivePartitionSensor
- [AIRFLOW-663] Improve time units for task performance charts
- [AIRFLOW-734] Fix SMTP auth regression when not using user/pass
- [AIRFLOW-717] Add Cloud Storage updated sensor
- [AIRFLOW-695] Retries do not execute because dagrun is in FAILED state
- [AIRFLOW-673] Add operational metrics test for SchedulerJob
[AIRFLOW-727] try_number is not increased. The previous imports will continue to work until Airflow 2.0. To fix it, change ctx to context. The webserver.X_FRAME_ENABLED configuration works according to its description now (#23222). Previously, a task instance with wait_for_downstream=True would only run if the downstream task of. Hence, the default value for master_disk_size in DataprocCreateClusterOperator has been changed from 500GB to 1TB. Note that Airflow's metadata database definition, on both the database and ORM levels, is considered an implementation detail without strict backward compatibility guarantees. GoogleCloudStorageToBigQueryOperator now supports schema auto-detection when you load data into BigQuery. (#24519), Upgrade to react 18 and chakra 2 (#24430), Refactor DagRun.verify_integrity (#24114), We now need at least Flask-WTF 0.15 (#24621), Run the check_migration loop at least once, Icons in grid view for different DAG run types (#23970), Disallow calling expand with no arguments (#23463), Add missing is_mapped field to Task response. For example. with different values. to get/view Configurations. in SubDagOperator. By default, Airflow could not be embedded in an iframe. If you already have duplicates in your metadata database, you will have to manage those duplicate connections before upgrading the database.
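The iframe-embedding behavior discussed above is controlled by an option in the [webserver] section of airflow.cfg. A minimal sketch (the value shown is an example, not necessarily the setting you want):

```ini
[webserver]
# Controls whether the Airflow UI may be embedded in an iframe
# (implemented via the X-Frame-Options response header).
x_frame_enabled = True
```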
It uses a lot of dependencies that are essential to run the webserver and integrate it.

- [AIRFLOW-2377] Improve Sendgrid sender support
- [AIRFLOW-2331] Support init action timeout on dataproc cluster create
- [AIRFLOW-1835] Update docs: Variable file is json
- [AIRFLOW-1781] Make search case-insensitive in LDAP group
- [AIRFLOW-2042] Fix browser menu appearing over the autocomplete menu
- [AIRFLOW-XXX] Remove wheelhouse files from travis not owned by travis
- [AIRFLOW-2336] Use hmsclient in hive_hook
- [AIRFLOW-2041] Correct Syntax in Python examples
- [AIRFLOW-74] SubdagOperators can consume all celeryd worker processes
- [AIRFLOW-2365] Fix autocommit attribute check
- [AIRFLOW-2068] MesosExecutor allows optional Docker image
- [AIRFLOW-1652] Push DatabricksRunSubmitOperator metadata into XCOM
- [AIRFLOW-2234] Enable insert_rows for PrestoHook
- [AIRFLOW-2208] Link to same DagRun graph from TaskInstance view
- [AIRFLOW-1153] Allow HiveOperators to take hiveconfs
- [AIRFLOW-775] Fix autocommit settings with Jdbc hook
- [AIRFLOW-2364] Warn when setting autocommit on a connection which does not support it
- [AIRFLOW-2357] Add persistent volume for the logs
- [AIRFLOW-766] Skip conn.commit() when in Auto-commit
- [AIRFLOW-2351] Check for valid default_args start_date
- [AIRFLOW-1433] Set default rbac to initdb
- [AIRFLOW-2270] Handle removed tasks in backfill
- [AIRFLOW-2344] Fix connections -l to work with pipe/redirect
- [AIRFLOW-2300] Add S3 Select functionality to S3ToHiveTransfer
- [AIRFLOW-1314] Polish some of the Kubernetes docs/config
- [AIRFLOW-1999] Add per-task GCP service account support
- [AIRFLOW-1314] Small cleanup to address PR comments (#24)
- [AIRFLOW-1314] Add executor_config and tests
- [AIRFLOW-1314] Use VolumeClaim for transporting DAGs
- [AIRFLOW-1314] Create integration testing environment
- [AIRFLOW-1314] Git Mode to pull in DAGs for Kubernetes Executor
- [AIRFLOW-1314] Add support for volume mounts & Secrets in Kubernetes Executor
- [AIRFLOW-2326][AIRFLOW-2222] Remove contrib.gcs_copy_operator
- [AIRFLOW-2328] Fix empty GCS blob in S3ToGoogleCloudStorageOperator
- [AIRFLOW-2350] Fix grammar in UPDATING.md
- [AIRFLOW-2345] pip is not used in this setup.py
- [AIRFLOW-2347] Add Banco de Formaturas to Readme
- [AIRFLOW-2346] Add Investorise as official user of Airflow
- [AIRFLOW-2330] Do not append destination prefix if not given

There is no need to explicitly provide or not provide the context anymore, due to historical reasons. Previously the command line option num_runs was used to let the scheduler terminate after a certain amount of. Now you don't have to create a plugin to configure a. So the effective timeout of a sensor is timeout * (retries + 1). The chain and cross_downstream methods have been moved to the airflow.models.baseoperator module. But to achieve that, the try/except clause was removed from create_empty_dataset and create_empty_table.

- [AIRFLOW-2763] No precheck mechanism in place during worker initialisation for the connection to metadata database
- [AIRFLOW-2789] Add ability to create single node cluster to DataprocClusterCreateOperator
- [AIRFLOW-2797] Add ability to create Google Dataproc cluster with custom image
- [AIRFLOW-2854] kubernetes_pod_operator add more configuration items
- [AIRFLOW-2855] Need to Check Validity of Cron Expression When Process DAG File/Zip File
- [AIRFLOW-2904] Clean an unnecessary line in airflow/executors/celery_executor.py
- [AIRFLOW-2921] A trivial incorrectness in CeleryExecutor()
- [AIRFLOW-2922] Potential dead-lock bug in CeleryExecutor()
- [AIRFLOW-2932] GoogleCloudStorageHook - allow compression of file
- [AIRFLOW-2949] Syntax Highlight for Single Quote
- [AIRFLOW-2951] dag_run end_date Null after a dag is finished
- [AIRFLOW-2956] Kubernetes tolerations for pod operator
- [AIRFLOW-2997] Support for clustered tables in Bigquery hooks/operators
- [AIRFLOW-3006] Fix error when schedule_interval=None
- [AIRFLOW-3008] Move Kubernetes related example DAGs to contrib/example_dags
- [AIRFLOW-3025] Allow to specify dns and dns-search parameters for DockerOperator
- [AIRFLOW-3067] (www_rbac) Flask flash messages are not displayed properly (no background color)
- [AIRFLOW-3069] Decode output of S3 file transform operator
- [AIRFLOW-3072] Assign permission get_logs_with_metadata to viewer role
- [AIRFLOW-3112] Align SFTP hook with SSH hook
- [AIRFLOW-3119] Enable loglevel on celery worker and inherit from airflow.cfg
- [AIRFLOW-3137] Make ProxyFix middleware optional
- [AIRFLOW-3173] Add _cmd options for more password config options
- [AIRFLOW-3177] Change scheduler_heartbeat metric from gauge to counter
- [AIRFLOW-3193] Pin docker requirement version to v3
- [AIRFLOW-3195] Druid Hook: Log ingestion spec and task id
- [AIRFLOW-3197] EMR Hook is missing some parameters to valid on the AWS API
- [AIRFLOW-3232] Make documentation for GCF Functions operator more readable
- [AIRFLOW-3262] Can't get log containing Response when using SimpleHttpOperator
- [AIRFLOW-3265] Add support for unix_socket in connection extra for Mysql Hook
- [AIRFLOW-1441] Tutorial Inconsistencies Between Example Pipeline Definition and Recap
- [AIRFLOW-2682] Add how-to guide(s) for how to use basic operators like BashOperator and PythonOperator
- [AIRFLOW-3104] .airflowignore feature is not mentioned at all in documentation
- [AIRFLOW-3187] Update Airflow.gif file with a slower version
- [AIRFLOW-3159] Update Airflow documentation on GCP Logging
- [AIRFLOW-3030] Command Line docs incorrect subdir
- [AIRFLOW-2990] Docstrings for Hooks/Operators are in incorrect format
- [AIRFLOW-3127] Celery SSL Documentation is out-dated
- [AIRFLOW-2779] Add license headers to doc files
- [AIRFLOW-2779] Add project version to license
- [AIRFLOW-839] docker_operator.py attempts to log status key without first checking existence
- [AIRFLOW-1104] Concurrency check in scheduler should count queued tasks as well as running
- [AIRFLOW-1163] Add support for x-forwarded-* headers to support access behind AWS ELB
- [AIRFLOW-1195] Cleared tasks in SubDagOperator do not trigger Parent dag_runs
- [AIRFLOW-1508] Skipped state not part of State.task_states
- [AIRFLOW-1762] Use key_file in SSHHook.create_tunnel()

To see if you have any connections that will need to be updated, you can run this command: This will catch any warnings about connections that are storing something other than a JSON-encoded Python dict in the extra field. One of the reasons was that settings should be rather static than store. [AIRFLOW-2112] Fix svg width for Recent Tasks on UI. In a web browser, access the Airflow UI at http://localhost:8080 and click About > Version.

- (#17078)
- Update chain() and cross_downstream() to support XComArgs (#16732)
- When a task instance fails with exception, log it (#16805)
- Set process title for serve-logs and LocalExecutor (#16644)
- Rename test_cycle to check_cycle (#16617)
- Add schema as DbApiHook instance attribute (#16521, #17423)
- Add transparency for unsupported connection type (#16220)
- Replace deprecated dag.sub_dag with dag.partial_subset (#16179)
- Treat AirflowSensorTimeout as immediate failure without retrying (#12058)
- Marking success/failed automatically clears failed downstream tasks (#13037)
- Add close/open indicator for import dag errors (#16073)
- Always return a response in TI's action_clear view (#15980)
- Add cli command to delete user by email (#15873)
- Use resource and action names for FAB permissions (#16410)
- Rename DAG concurrency ([core] dag_concurrency) settings for easier understanding (#16267, #18730)
- Refactor: SKIPPED should not be logged again as SUCCESS (#14822)
- Remove version limits for dnspython (#18046, #18162)
- Accept custom run ID in TriggerDagRunOperator (#18788)
- Make REST API patch user endpoint work the same way as the UI (#18757)
- Properly set start_date for cleared tasks (#18708)
- Ensure task_instance exists before running update on its state (REST API) (#18642)
- Make AirflowDateTimePickerWidget a required field (#18602)
- Retry deadlocked transactions on deleting old rendered task fields (#18616)
- Fix retry_exponential_backoff divide by zero error when retry delay is zero (#17003)
- Improve how UI handles datetimes (#18611, #18700)
- Bugfix: dag_bag.get_dag should return None, not raise exception (#18554)
- Only show the task modal if it is a valid instance (#18570)
- Fix accessing rendered {{ task.x }} attributes from within templates (#18516)
- Add missing email type of connection (#18502)
- Don't use flash for same-page UI messages
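The sensor-timeout rule quoted earlier (effective timeout = timeout * (retries + 1), since each retry gets a fresh timeout window) can be checked with a small sketch; the helper name and values are illustrative, not an Airflow API.

```python
# Hypothetical helper illustrating the rule: each retry gets a fresh
# `timeout` window, so the total wall-clock budget is timeout * (retries + 1).
def effective_sensor_timeout(timeout_s: int, retries: int) -> int:
    return timeout_s * (retries + 1)

# A sensor with timeout=300 and retries=2 may run for up to 900 seconds total.
print(effective_sensor_timeout(300, 2))  # prints: 900
```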
