Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
935 views
in Technique[技术] by (71.8m points)

postgresql - How to `sum( DISTINCT <column> ) OVER ()` using window function?

I have next data:

enter image description here

Here I already calculated total for conf_id. But want also calculate total for whole partition. eg:
Calculate total suma by agreement for each its order (not goods at order which are with slightly different rounding)

How to sum 737.38 and 1238.3? eg. take only one number among group

(I can not sum( item_suma ), because it will return 1975.67. Notice round for conf_suma as intermediate step)

UPD
Full query. Here I want to calculate rounded suma for each group. Then I need to calculate total suma for those groups

SELECT app_period( '2021-02-01', '2021-03-01' );


WITH
target_date AS ( SELECT '2021-02-01'::timestamptz ),
target_order as (
  SELECT
    tstzrange( '2021-01-01', '2021-02-01') as bill_range,
    o.*
  FROM ( SELECT * FROM "order_bt" WHERE sys_period @> sys_time() ) o
  WHERE FALSE
    OR o.agreement_id = 3385 and o.period_id = 10
),
USAGE AS ( SELECT
  ocd.*,


  o.agreement_id                  as agreement_id,
  o.id                            AS order_id,
  
  (dense_rank() over (PARTITION BY o.agreement_id       ORDER BY o.id                     )) as zzzz_id,
  (dense_rank() over (PARTITION BY o.agreement_id, o.id ORDER BY (ocd.ic).consumed_period )) as conf_id,

  
   sum( ocd.item_suma     ) OVER( PARTITION BY (ocd.o).agreement_id                 ) AS agreement_suma2,

 
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_cost,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_suma,
  (sum( ocd.item_cost )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_cost,
  max((ocd.ic).consumed) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )                   AS consumed,
  (sum( ocd.item_suma )  OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id                           )) AS order_suma2
FROM target_order o
LEFT JOIN order_cost_details( o.bill_range ) ocd
  ON (ocd.o).id = o.id  AND  (ocd.ic).consumed_period && o.app_period
)

SELECT 
  *,
  (conf_suma/6) ::numeric( 10, 2 ) as group_nds,
  (SELECT sum(x) from (SELECT  sum( DISTINCT conf_suma )                       AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_suma,
  (SELECT sum(x) from (SELECT (sum( DISTINCT conf_suma ) /6)::numeric( 10, 2 ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_nds
FROM USAGE
WINDOW w AS ( PARTITION BY usage.agreement_id ROWS CURRENT ROW EXCLUDE TIES)
ORDER BY
  order_id,
  conf_id

My old question

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I found solution. See dbfiddle.

To run window function for distinct values I should get first value from each peer. To complete this I

  1. aggregate IDs of rows for this peer
  2. lag this aggregation by one
  3. Mark rows that are not aggregated yet (this is first row at peer) as _distinct
  4. sum( ) FILTER ( WHERE _distinct ) over ( ... )

Voila. You get sum over DISTINCT values at target PARTITION
which are not implemented yet by PostgreSQL

with data as (
  select * from (values 
      ( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
      ( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057), 
      ( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
 ) t (id, agreement_id, order_id, suma)
),
intermediate as (select 
 *,
 sum( suma ) over ( partition by agreement_id, order_id ) as fract_order_suma,
 sum( suma ) over ( partition by agreement_id           ) as fract_agreement_total,
 (sum( suma::numeric(10,2) ) over ( partition by agreement_id, order_id )) as wrong_order_suma,
 (sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma,
 (sum( suma ) over ( partition by agreement_id           ))::numeric( 10, 2) as wrong_agreement_total,
 id as xid,
 array_agg( id ) over ( partition by agreement_id, order_id ) as agg
from data),

distinc as (select *,
  lag( agg ) over ( partition by agreement_id ) as prev, 
  id = any (lag( agg ) over ()) is not true as _distinct, -- allow to match first ID from next peer
  order_suma as xorder_suma, -- repeat column to easily visually compare with _distinct
  (SELECT sum(x) from (SELECT  sum( DISTINCT order_suma ) AS x FROM intermediate sub_q WHERE sub_q.agreement_id = intermediate.agreement_id GROUP BY agreement_id, order_id) t) as correct_total_suma
from intermediate
)
select 
*,
sum( order_suma ) filter ( where _distinct ) over ( partition by agreement_id ) as also_correct_total_suma
from distinc

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...