amazon web services - Redshift. Convert comma delimited values into rows

Question

Welcome To Ask or Share your Answers For Others

amazon web services - Redshift. Convert comma delimited values into rows

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

amazon web services - Redshift. Convert comma delimited values into rows

I am wondering how to convert comma-delimited values into rows in Redshift. I am afraid that my own solution isn't optimal. Please advise. I have table with one of the columns with coma-separated values. For example:

I have:

user_id|user_name|user_action
-----------------------------
1      | Shone   | start,stop,cancell...

I would like to see

user_id|user_name|parsed_action 
------------------------------- 
1      | Shone   | start        
1      | Shone   | stop         
1      | Shone   | cancell      
....

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:10:21+0000

A slight improvement over the existing answer is to use a second "numbers" table that enumerates all of the possible list lengths and then use a cross join to make the query more compact.

Redshift does not have a straightforward method for creating a numbers table that I am aware of, but we can use a bit of a hack from https://www.periscope.io/blog/generate-series-in-redshift-and-mysql.html to create one using row numbers.

Specifically, if we assume the number of rows in cmd_logs is larger than the maximum number of commas in the user_action column, we can create a numbers table by counting rows. To start, let's assume there are at most 99 commas in the user_action column:

select 
  (row_number() over (order by true))::int as n
into numbers
from cmd_logs
limit 100;

If we want to get fancy, we can compute the number of commas from the cmd_logs table to create a more precise set of rows in numbers:

select
  n::int
into numbers
from
  (select 
      row_number() over (order by true) as n
   from cmd_logs)
cross join
  (select 
      max(regexp_count(user_action, '[,]')) as max_num 
   from cmd_logs)
where
  n <= max_num + 1;

Once there is a numbers table, we can do:

select
  user_id, 
  user_name, 
  split_part(user_action,',',n) as parsed_action 
from
  cmd_logs
cross join
  numbers
where
  split_part(user_action,',',n) is not null
  and split_part(user_action,',',n) != '';

Categories

amazon web services - Redshift. Convert comma delimited values into rows

amazon web services - Redshift. Convert comma delimited values into rows

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags