Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


sql - Grouping of top 80% categories

I need a SQL query that groups by some category and shows quantities only for the groups that together contain at least 80% of all rows. The remaining rare categories (together up to 20% of the total) should be lumped together as "other".

For example, grouping apples by color should produce a result like this:

RED    1118 44% )
YELLOW  711 28% > at least 80%
GREEN   229  9% )
other   482 19%

How can I do that?


1 Reply


I would do this with a combination of aggregation and analytic (window) functions. A color goes into the "other" category when the running total of counts, accumulated from the rarest color upward, is still under 20% of all rows:

select (case when cum_cnt_asc < total_cnt * 0.2 then 'other'
             else color
        end) as color, sum(cnt) as cnt
from (select color, count(*) as cnt,
             -- running total of counts, from the rarest color upward
             sum(count(*)) over (order by count(*) asc) as cum_cnt_asc,
             -- grand total across all colors
             sum(count(*)) over () as total_cnt
      from t
      group by color
     ) c
group by (case when cum_cnt_asc < total_cnt * 0.2 then 'other'
               else color
          end)

Here is a SQL Fiddle.
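
To sanity-check the approach without a database server, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. Several things in it are my assumptions, not part of the original answer: the table `t` has a single `color` column, the four rare colors and their counts are invented so that they add up to the 482 in the question, and the query is restructured into CTEs (`per_color`, `ranked`) instead of nesting `count(*)` inside the window function. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Hypothetical data: RED/YELLOW/GREEN counts come from the question;
# the four rare colors are invented and sum to 482.
counts = {"RED": 1118, "YELLOW": 711, "GREEN": 229,
          "BLUE": 200, "PINK": 150, "BROWN": 100, "WHITE": 32}

conn = sqlite3.connect(":memory:")
conn.execute("create table t (color text)")
for color, n in counts.items():
    # one row per apple of that color
    conn.executemany("insert into t (color) values (?)", [(color,)] * n)

query = """
with per_color as (
    select color, count(*) as cnt
    from t
    group by color
),
ranked as (
    select color, cnt,
           -- running total of counts, from the rarest color upward
           sum(cnt) over (order by cnt asc) as cum_cnt_asc,
           -- grand total across all colors
           sum(cnt) over () as total_cnt
    from per_color
)
select (case when cum_cnt_asc < total_cnt * 0.2 then 'other'
             else color
        end) as color_group,
       sum(cnt) as n
from ranked
group by color_group
"""

rows = conn.execute(query).fetchall()
total = sum(n for _, n in rows)

# Show the kept colors by descending count, with 'other' last.
rows.sort(key=lambda r: (r[0] == "other", -r[1]))
for color_group, n in rows:
    print(f"{color_group:<7}{n:>5} {round(100 * n / total):>3}%")
```

With this sample data the four rare colors hold 482 of 2540 rows, which is under the 20% threshold of 508, so they are folded into "other" while GREEN stays on its own line.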

