Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
395 views
in Technique[技术] by (71.8m points)

How to improve the query speed of clickhouse

In clickhouse, I want to do a query operation. The query contains group by QJTD1, but QJTD1 is obtained by querying the dictionary. The statement is as follows:

`SELECT
IF(
    sale_mode = 'owner',
    dictGetString(
        'dict.dict_sku',
        'dept_id_1',
        toUInt64OrZero(sku_id)
    ),
    dictGetString(
        'dict.dict_shop',
        'dept_id_1',
        toUInt64OrZero(shop_id)
    )
) AS QJTD1,
brand_cd,
coalesce(
    uniq(sd_deal_ord_user_num),
    0
) AS sd_deal_ord_user_num,
0 AS item_uv,
dt
FROM app.test_all
WHERE dt >= '2020-11-01'
AND dt <= '2020-11-30'
and IF(
    sale_mode = 'owner',
    dictGetString(
        'dict.dict_sku',
        'bu_id',
        toUInt64OrZero(sku_id)
    ),
    dictGetString(
        'dict.dict_shop',
        'bu_id',
        toUInt64OrZero(shop_id)
    )
)= '1727' GROUP BY
QJTD1,
brand_cd,
dt
ORDER BY item_pv desc limit 0,
100`

, QJTD1 has serious data skew, resulting in slow query speed. I have tried to optimize the index to improve the query speed. The index is as follows: sku_id,shop_id....but it has no effect. How can I improve the query efficiency?

question from:https://stackoverflow.com/questions/65880597/how-to-improve-the-query-speed-of-clickhouse

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

CH calculates both branches of IF (then & else) always.

You can use two-stage group by

select IF( sale_mode ='owner', ... as QJTD1
from (  
  select owner, sku_id, dept_id_1, ....
  ...
  group by owner, sku_id, dept_id_1
  )
group by QJTD1

Or define dictionary <injective>true

https://clickhouse.tech/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure/

Flag that shows whether the id -> attribute image is injective.
If true, ClickHouse can automatically place after the GROUP BY 
clause the requests to dictionaries with injection. Usually it 
significantly reduces the amount of such requests.

Default value: false.

If they are injective.

And I would test Union all then to calculate IF branches only one time.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...