sql - Aggregating based on GROUPING of multiple columns

Posted Jan 31, 2022 in Technique by 深蓝 (71.8m points)

I am trying to run a subquery and aggregate in SQL after an initial query with multiple joins. My ultimate goal is to get a count (or a sum) of specimens tested, based on a grouping of multiple columns. This is slightly different from the questions "SQL Server query - Selecting COUNT(*) with DISTINCT" and "SQL Server: aggregate error on grouping".

The three tables that I use (PERSON, SPECIMEN, TEST) have one-to-many relationships: a PERSON has many SPECIMENs, and each SPECIMEN has many TESTs. I used three inner joins to combine these tables with one additional table (ANALYSIS).

WITH TALLY AS (
    SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
           ANALYSIS.ANALYSIS_TYPE, SPECIMEN.SPECIMEN_NUMBER
    FROM DB.TEST
    INNER JOIN DB.SPECIMEN ON TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN DB.PERSON ON SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN DB.ANALYSIS ON TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
      AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <= '01-APR-18'
      AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED
)
SELECT COUNT(DISTINCT ANALYSIS) AS SPECIMEN_COUNT, DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;

This gives me repeated rows instead of one row per grouping. (Image: first grouping repeated 4 times.)

What I am trying to see is one row per grouping with its total count. (Image: aggregated first grouping with total count.)

Any thoughts as to what is missing? SUM instead of COUNT or in addition to COUNT creates an error. Thanks in advance!

9/17/2020 Update: I have tried adding a subquery because I also need to use a new column of metadata (ANALYSIS_TYPE_ALIAS) which is created in the first query through a CASE STATEMENT(...). I have also tried using another subquery with inner join to count based on those conditions to a temp table, but still cannot seem to aggregate to flatten the table. Here is my current attempt:

WITH TALLY AS (
    SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
           ANALYSIS.ANALYSIS_TYPE...
    FROM DB.TEST
    INNER JOIN DB.SPECIMEN ON TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN DB.PERSON ON SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN DB.ANALYSIS ON TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
      AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <= '01-APR-18'
      AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED
),
SUMMARY_COMBO AS (
    -- UUID is the concatenation of the columns that define one grouping
    SELECT DISTINCT
           CONCAT(CONCAT(CONCAT(CONCAT(ANALYSIS, DATE_STARTED), STATUS), GROUP), ANALYSIS_TYPE_ALIAS) AS UUID,
           TALLY.NAME, TALLY.PHASE, TALLY.DATE_STARTED, TALLY.ANALYSIS, TALLY.GROUP, TALLY.STATUS,
           TALLY.ANALYSIS_TYPE_ALIAS
    FROM TALLY
)
SELECT SUMMARY_COMBO.NAME, SUMMARY_COMBO.PHASE, SUMMARY_COMBO.DATE_STARTED, SUMMARY_COMBO.ANALYSIS,
       SUMMARY_COMBO.GROUP, SUMMARY_COMBO.STATUS, SUMMARY_COMBO.ANALYSIS_TYPE_ALIAS,
       COUNT(SUMMARY_COMBO.ANALYSIS) OVER (PARTITION BY SUMMARY_COMBO.UUID) AS SPECIMEN_COUNT
FROM SUMMARY_COMBO
ORDER BY SUMMARY_COMBO.DATE_STARTED;

This gave me a table that shows aggregated counts but is not collapsed to one row per unique UUID. (Image: shows aggregated counts, but doesn't aggregate based on unique UUID.) Is there a way to take the sum of the count? I've tried storing the count in a subquery and then referencing it, but I am missing something in how to group the 8 columns of data that I want to show plus the count for that combination of columns.

Thanks!
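One point that may explain the second attempt: a window function such as COUNT(...) OVER (PARTITION BY ...) returns one output row per input row, so it can attach a count to each row but cannot flatten the table; only GROUP BY collapses rows. The sketch below contrasts the two. It is a toy illustration, not the original schema: it uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), a hypothetical four-column TALLY table, and made-up rows.

```python
import sqlite3

# Hypothetical, simplified stand-in for the TALLY CTE (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TALLY ("
    "DATE_STARTED TEXT, ANALYSIS TEXT, STATUS TEXT, ANALYSIS_TYPE_ALIAS TEXT)"
)
conn.executemany(
    "INSERT INTO TALLY VALUES (?, ?, ?, ?)",
    [
        ("2017-01-05", "PH", "DONE", "CHEM"),
        ("2017-01-05", "PH", "DONE", "CHEM"),
        ("2017-01-05", "PH", "DONE", "CHEM"),
        ("2017-02-10", "ZINC", "DONE", "METAL"),
    ],
)

# Window function: every input row survives and carries its partition's count.
windowed = conn.execute("""
    SELECT DATE_STARTED, ANALYSIS, STATUS, ANALYSIS_TYPE_ALIAS,
           COUNT(*) OVER (
               PARTITION BY DATE_STARTED, ANALYSIS, STATUS, ANALYSIS_TYPE_ALIAS
           ) AS SPECIMEN_COUNT
    FROM TALLY
    ORDER BY DATE_STARTED
""").fetchall()

# GROUP BY on the same columns: one row per combination, with its count.
grouped = conn.execute("""
    SELECT DATE_STARTED, ANALYSIS, STATUS, ANALYSIS_TYPE_ALIAS,
           COUNT(*) AS SPECIMEN_COUNT
    FROM TALLY
    GROUP BY DATE_STARTED, ANALYSIS, STATUS, ANALYSIS_TYPE_ALIAS
    ORDER BY DATE_STARTED
""").fetchall()

print(len(windowed), "rows from the window query")
print(len(grouped), "rows from the GROUP BY query")
```

Grouping directly on the displayed columns also makes a concatenated UUID key unnecessary, since the column list itself identifies each combination.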

1 Reply

深蓝 · Answer 1 · 2022-01-31T07:21:32+0000

Just remove ANALYSIS from the GROUP BY clause, since that's the column whose distinct values you want to count. Otherwise, the query generates more groups than you need (and the count of distinct ANALYSIS values in each group is always 1). ANALYSIS also has to come out of the SELECT list, because a column that is neither aggregated nor listed in GROUP BY raises an error in standard SQL (for example in Oracle or SQL Server).

WITH TALLY AS ( ...)
SELECT COUNT(DISTINCT ANALYSIS) AS SPECIMEN_COUNT, DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;
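To see the effect on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up, and the GROUP column is renamed to a hypothetical SPECIMEN_GROUP because GROUP is a reserved word; the real TALLY CTE is not reproduced.

```python
import sqlite3

# Hypothetical rows standing in for the TALLY result (one row per test).
rows = [
    # (DATE_STARTED, ANALYSIS, STATUS, SPECIMEN_GROUP, ANALYSIS_TYPE)
    ("2017-01-05", "PH", "DONE", "A", "CHEM"),
    ("2017-01-05", "IRON", "DONE", "A", "CHEM"),
    ("2017-01-05", "ZINC", "DONE", "A", "CHEM"),
    ("2017-01-05", "ZINC", "DONE", "A", "CHEM"),  # same analysis, second test
    ("2017-02-10", "PH", "DONE", "B", "CHEM"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TALLY (DATE_STARTED TEXT, ANALYSIS TEXT, STATUS TEXT, "
    "SPECIMEN_GROUP TEXT, ANALYSIS_TYPE TEXT)"
)
conn.executemany("INSERT INTO TALLY VALUES (?, ?, ?, ?, ?)", rows)

# ANALYSIS is both grouped and counted: each group holds exactly one
# distinct ANALYSIS value, so the count is 1 on every row.
with_analysis = conn.execute("""
    SELECT COUNT(DISTINCT ANALYSIS) AS SPECIMEN_COUNT, DATE_STARTED, ANALYSIS
    FROM TALLY
    GROUP BY DATE_STARTED, ANALYSIS, STATUS, SPECIMEN_GROUP, ANALYSIS_TYPE
    ORDER BY DATE_STARTED, ANALYSIS
""").fetchall()

# ANALYSIS removed from GROUP BY and from the SELECT list: one row per
# remaining combination, counting the distinct analyses inside it.
without_analysis = conn.execute("""
    SELECT COUNT(DISTINCT ANALYSIS) AS SPECIMEN_COUNT,
           DATE_STARTED, STATUS, SPECIMEN_GROUP, ANALYSIS_TYPE
    FROM TALLY
    GROUP BY DATE_STARTED, STATUS, SPECIMEN_GROUP, ANALYSIS_TYPE
    ORDER BY DATE_STARTED
""").fetchall()

for row in with_analysis:
    print("grouped by ANALYSIS:", row)
for row in without_analysis:
    print("ANALYSIS not grouped:", row)
```

Note that COUNT(DISTINCT ANALYSIS) counts distinct analyses, so the duplicate ZINC row is counted once. If the goal is the number of specimens or tests per combination, COUNT(DISTINCT SPECIMEN_NUMBER) or COUNT(*) may be closer to what is wanted.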
