Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
121 views
in Technique[技术] by (71.8m points)

sql - Aggregating based on GROUPING of multiple columns

I am trying to subquery and aggregate in SQL after doing an initial query with multiple joins. My ultimate goal is to get a count (or a sum) of specimens tested based on a grouping of multiple columns. This is slightly different from SQL Server query - Selecting COUNT(*) with DISTINCT and SQL Server: aggregate error on grouping.

The three tables that I use (PERSON, SPECIMEN, TEST), have 1-many relationships. So PERSON has many SPECIMENS and those SPECIMENS have many TESTS. I did three inner joins to combine these tables plus an additional table (ANALYSIS).

WITH TALLY as (
SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
 ANALYSIS.ANALYSIS_TYPE, SPECIMEN.SPECIMEN_NUMBER
    FROM DB.TEST
    INNER JOIN
    DB.SAMPLE ON
    TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN 
    DB.PRODUCT ON
    SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN
    DB.ANALYSIS ON
    TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
    AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <='01-APR-18'
    AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED)
    
    SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
    FROM TALLY 
    GROUP BY DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
    ORDER BY DATE_STARTED;

This gives me the repeated columns: first grouping repeated 4 times

first grouping repeated 4 times

What I am trying to see is: aggregated first grouping with total count

aggregated first grouping with total count

Any thoughts as to what is missing? SUM instead of COUNT or in addition to COUNT creates an error. Thanks in advance!

9/17/2020 Update: I have tried adding a subquery because I also need to use a new column of metadata (ANALYSIS_TYPE_ALIAS) which is created in the first query through a CASE STATEMENT(...). I have also tried using another subquery with inner join to count based on those conditions to a temp table, but still cannot seem to aggregate to flatten the table. Here is my current attempt:

WITH TALLY as (
SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
 ANALYSIS.ANALYSIS_TYPE...
    FROM DB.TEST
    INNER JOIN
    DB.SAMPLE ON
    TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN 
    DB.PRODUCT ON
    SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN
    DB.ANALYSIS ON
    TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
    AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <='01-APR-18'
    AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED),
SUMMARY_COMBO AS (SELECT DISTINCT(CONCAT(CONCAT(CONCAT(CONCAT(ANALYSIS, DATE_STARTED),STATUS), GROUP), ANALYSIS_TYPE_ALIAS))AS UUID,
TALLY.NAME, TALLY.PHASE, TALLY.DATE_STARTED, TALLY.ANALYSIS, TALLY.GROUP, TALLY.STATUS, TALLY.ANALYSIS_TYPE_ALIAS
FROM TALLY)
SELECT SUMMARY_COMBO.NAME, SUMMARY_COMBO.PHASE, SUMMARY_COMBO.DATE_STARTED, SUMMARY_COMBO.ANALYSIS,SUMMARY_COMBO.GROUP, SUMMARY_COMBO.STATUS, SUMMARY_COMBO.ANALYSIS_TYPE_ALIAS,
COUNT(SUMMARY_COMBO.ANALYSIS) OVER (PARTITION BY SUMMARY_COMBO.UUID) AS SPECIMEN_COUNT
FROM SUMMARY_COMBO 
ORDER BY SUMMARY_COMBO.DATE_STARTED;

This gave me the following table Shows aggregated counts, but doesn't aggregate based on unique UUID. Is there a way to take the sum of the count? I've tried to do this by storing count to a subquery and then referencing that count variable, but I am missing something in how to group the 8 columns of data that I want to show + the count of that combination of columns.

Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Just remove analysis from the group by clause, since that's the column whose distinct values you want to count. Otherwise, the query generates more groups than what you need (and the count of distinct analysis values in each group is always 1).

WITH TALLY as ( ...)
SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY 
GROUP BY DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...