
airflow - Getting the date of the most recent successful DAG execution

Posted by 菜鸟教程小白 on 2022-6-1 20:09:36

I am looking to create a transform in Airflow, and I want to make sure I get all the data from my source since the last time the DAG ran, in order to update my target table. To do this, I need to be able to get the most recent execution that was successful.

I found this question, "Apache airflow macro to get last dag run execution time", which gets me partway there. However, it only returns the last time the DAG executed, regardless of whether that run succeeded.

SELECT col1, col2, col3
FROM schema.table
WHERE table.updated_at > '{{ last_dag_run_execution_date(dag) }}';

If an execution fails (due to a connectivity problem or something similar), last_dag_run_execution_date(dag) still advances, so the data that the failed run should have picked up is missed.

Ideally, this would return the most recent non-failed execution. If anyone has other ideas for how to achieve this, please let me know.



Best Answer


I ended up changing the function in the referenced question to use latest_execution_date, which is a built-in property of the DAG object in Airflow, like so:

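def get_last_dag_run(dag):
    # Latest execution date recorded for this DAG, or None if it has never run.
    last_dag_run = dag.latest_execution_date
    if last_dag_run is None:
        # Fallback start date for the very first run.
        return '2013-01-01'
    return last_dag_run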

Seems to be working for me at the moment.
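
One caveat: as far as I can tell, latest_execution_date does not filter on run state, so it may still return a failed run. Below is a minimal sketch of the "most recent successful run" selection on plain data, not a definitive Airflow implementation. The function name, the `(execution_date, state)` tuples, and `DEFAULT_START` are hypothetical. In a real DAG you would need to build the list from Airflow's run history, for example with `DagRun.find(dag_id=..., state="success")`; check that call against your Airflow version.

```python
from datetime import datetime

# Hypothetical fallback for a DAG with no successful run yet
# (mirrors the '2013-01-01' default used above).
DEFAULT_START = datetime(2013, 1, 1)


def last_successful_execution_date(runs, default=DEFAULT_START):
    """Return the newest execution date whose run state is 'success'.

    `runs` is an iterable of (execution_date, state) tuples; this shape is
    an assumption for illustration, not an Airflow data structure.
    """
    successful = [date for date, state in runs if state == "success"]
    # max() with default= handles the "no successful run yet" case.
    return max(successful, default=default)


# Example history: the newest run failed, so the earlier one is returned.
runs = [
    (datetime(2022, 5, 30), "success"),
    (datetime(2022, 5, 31), "failed"),
]
print(last_successful_execution_date(runs))
```

With this selection, a failed run does not move the watermark, so the next run's WHERE clause still covers the rows that the failed run missed.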
