Input and code:
with data as (
select 1 id, 'A' name, 'fruit' r_group, '2007' year, '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007' year, '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008' year, '05' month, 10 sales from dual union all
select 4 id, 'B' name, 'vegetable' r_group, '2008' year, '07' month, 20 sales from dual
)
select year,
month,
r_group,
sum(sales) sales,
sum(opening) opening,
sum(closing) closing
from (
select t.*,
(sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) -sales ) as opening,
sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) as closing
from data t
)
group by year, month, r_group
order by year, month
Output:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 5 | 15 |
2008 | 07 | vegetable | 20 | 0 | 20 |
I want the output to be like the following:
year | month | r_group | sales | opening | closing |
2007 | 04 | fruit | 104 | 0 | 104 |
2008 | 05 | fruit | 10 | 104 | 114 |
2008 | 07 | vegetable | 20 | 0 | 20 |
I can achieve the desired output only by adding a zero-valued record in the data for month=05 and for name = 'Z' like this:
select 1 id, 'A' name, 'fruit' r_group, '2007', year '04' month, 5 sales from dual union all
select 2 id, 'Z' name, 'fruit' r_group, '2007', year '04' month, 99 sales from dual union all
select 3 id, 'A' name, 'fruit' r_group, '2008', year '05' month, 10 sales from dual union all
select 4 id, 'Z' name, 'fruit' r_group, '2008', year '05' month, 0 sales from dual union all
select 5 id, 'B' name, 'vegetable' r_group, '2008', year '07' month, 20 sales from dual ))
However, I want to know if I can do this as part of the select query without having to edit the data itself.
EDIT
The inner select statement will input into a database table the detailed version: year, month, name, r_group, opening, closing. In other words the result of this query will be used to populate the db table and then aggregation using the outer query will happen afterwards:
select t.*,
(sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) -sales ) as opening,
sum(sales) over (partition by name, r_group
order by year, month
rows between unbounded preceding and current row
) as closing
from data t
then I'll use an aggregate on that using an analytical tool (3rd party) to aggregate on r_group only without including the name. But the year, month, name, r_group detail must exist in the background.
EDIT 2
In other workds, I'm trying to dynamically add missing data. For instance, if name = 'Z' exists in 2007,04 but DOES NOT in 2008,05 then the cumulative function will fail once it gets to 2008. Because, it does not have a name ='Z' in 2008 to start with it fails.
See Question&Answers more detail:
os