Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Snowflake Cloud Data Platform - SQL query to find IDs in the same table with events at different timestamps (cohorts)

I need to write a query that gives me counts with the following logic. For example, ACCOUNT_ID 123 signed up on 2020-02-21, so M0 is 1; the same ACCOUNT_ID then had an event in the following month, so M1 is 1.

M0 is the signup date
M1 is signup date + 1 month
M2 is signup date + 2 consecutive months
M3 is signup date + 3 consecutive months
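
One possible reading of these definitions, sketched in plain Python rather than SQL: each Mk is a half-open window that starts k months after the signup date. The helper names `add_months` and `cohort_windows` are hypothetical, and clamping to the last day of a shorter month is an assumption about how month arithmetic should behave here.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift d by whole months, clamping to the last day of the target month."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def cohort_windows(signup, periods=4):
    """Return half-open [start, end) windows for M0 .. M(periods - 1)."""
    return {
        f"M{k}": (add_months(signup, k), add_months(signup, k + 1))
        for k in range(periods)
    }


# Signup date taken from the ACCOUNT_ID 123 example.
windows = cohort_windows(date(2020, 2, 21))
for name, (start, end) in windows.items():
    print(name, start, end)
```

Under this reading, an event on 2020-03-25 for an account that signed up on 2020-02-21 would fall into M1, because M1 covers 2020-03-21 up to, but not including, 2020-04-21.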


WITH M_O AS (
  SELECT
    parsed_data."ACCOUNT_ID" AS "parsed_data.account_id",
    MIN(TO_CHAR(TO_DATE(parsed_data."TIMESTAMP"::timestamp_ntz), 'YYYY-MM-DD')) AS "SIGNUP",
    COUNT(DISTINCT parsed_data."ACCOUNT_ID") AS "COUNT_USERS_O"
  FROM "PUBLIC"."PARSED_DATA" AS parsed_data
  WHERE parsed_data."ACCOUNT_ID" IS NOT NULL
    AND parsed_data."EVENT" = 'Started'
    AND TO_CHAR(TO_DATE(parsed_data."TIMESTAMP"::timestamp_ntz), 'YYYY-MM-DD') >= '2020-02-21'
    AND parsed_data."TIMESTAMP"::timestamp_ntz < CURRENT_DATE()
  GROUP BY 1
),
M_1 AS (
  SELECT
    parsed_data."ACCOUNT_ID" AS "parsed_data.account_id",
    TO_CHAR(TO_DATE(parsed_data."TIMESTAMP"::timestamp_ntz), 'YYYY-MM-DD') AS "parsed_data.timestamp_date",
    COUNT(DISTINCT parsed_data."ACCOUNT_ID") AS "COUNT_USERS_1"
  FROM "PUBLIC"."PARSED_DATA" AS parsed_data
  INNER JOIN M_O ON parsed_data.account_id = M_O."parsed_data.account_id"
  WHERE parsed_data."ACCOUNT_ID" IS NOT NULL
    AND parsed_data."EVENT" = 'Started'
    AND TO_CHAR(TO_DATE(parsed_data."TIMESTAMP"::timestamp_ntz), 'YYYY-MM-DD') >= DATEADD('MONTH', 1, SIGNUP)
    AND parsed_data."TIMESTAMP"::timestamp_ntz < CURRENT_DATE()
  GROUP BY 1, 2
)


1 Reply

It looks like you want to create cohorts? As in "establish the first event date for each id, and then look at how its behavior changes every month thereafter".

This code should work:

-- Sample data: one row per event
with events as (
    select 1 id, '2020-01-01'::date e_date
    union all select 1, '2020-02-03'
    union all select 2, '2020-03-01'
    union all select 2, '2020-05-08'
    union all select 3, '2020-08-01'
    union all select 3, '2020-09-02'
    union all select 3, '2020-09-22'
    union all select 3, '2020-09-30'
    union all select 3, '2020-10-10'
),
-- First event date per id, which marks the start of its cohort
first_per_id as (
    select id, min(e_date) first_date
    from events
    group by id
)

-- Count each id's events in consecutive one-month windows after its first date
select a.id
    , count_if(e_date >= dateadd(month, 0, first_date) and e_date < dateadd(month, 1, first_date)) m0
    , count_if(e_date >= dateadd(month, 1, first_date) and e_date < dateadd(month, 2, first_date)) m1
    , count_if(e_date >= dateadd(month, 2, first_date) and e_date < dateadd(month, 3, first_date)) m2
from events a
join first_per_id b
    on a.id = b.id
group by 1
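
As a rough cross-check of the windowing logic, here is a minimal plain-Python sketch that buckets the same sample events per id. It is not Snowflake code: `add_months` and `events_per_month` are hypothetical helpers, and the month arithmetic (clamping to the last day of a shorter month) is only assumed to mirror what `dateadd(month, ...)` does. The last step, counting how many ids were active in each month, is an assumption about what the question means by "count".

```python
import calendar
from collections import defaultdict
from datetime import date

# Same sample rows as the `events` CTE: (id, e_date)
EVENTS = [
    (1, date(2020, 1, 1)), (1, date(2020, 2, 3)),
    (2, date(2020, 3, 1)), (2, date(2020, 5, 8)),
    (3, date(2020, 8, 1)), (3, date(2020, 9, 2)), (3, date(2020, 9, 22)),
    (3, date(2020, 9, 30)), (3, date(2020, 10, 10)),
]


def add_months(d, months):
    """Shift d by whole months, clamping to the last day of the target month."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def events_per_month(events, periods=3):
    """Count each id's events in the windows [first + k months, first + k + 1 months)."""
    first_date = {}
    for id_, e_date in events:
        first_date[id_] = min(first_date.get(id_, e_date), e_date)

    counts = defaultdict(lambda: [0] * periods)
    for id_, e_date in events:
        for k in range(periods):
            start = add_months(first_date[id_], k)
            end = add_months(first_date[id_], k + 1)
            if start <= e_date < end:
                counts[id_][k] += 1
    return dict(counts)


per_id = events_per_month(EVENTS)
# Number of distinct ids with at least one event in each month window.
active = [sum(1 for row in per_id.values() if row[k] > 0) for k in range(3)]

print(per_id)  # {1: [1, 1, 0], 2: [1, 0, 1], 3: [1, 3, 1]}
print(active)  # [3, 2, 2]
```

Each list holds m0, m1, m2 for one id. Id 2 has no event in its second month, so its m1 is 0, while id 3 has three events in its second month. If only presence per month matters, the `active` list shows one way to collapse the per-id counts into cohort-level totals.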


