I am trying to extract just numbers from string and separate them by comma in spark sql , so I am using REGEXP_REPLACE function to convert all character to space then use the same function to replace all spaces by comma but it did not work ,
what I try :
/// first try :
val resultDF = hiveContext.sql("select MMS,WAP,EMAIL ,case when MMS=1 and WAP=1 and INTERNET=1 and EMAIL=1 then 'SMS,MMS,WAP' when MMS=1 and WAP=1 then 'SMS,MMS' end as supported_services,LTE_FREQUENCIES, REGEXP_REPLACE ( REGEXP_REPLACE (LTE_FREQUENCIES, '[A-Z]', ' '), '\s+', ',') from test.tttt where limit 1")
/// second try :
val resultDF = hiveContext.sql("select MMS,WAP,EMAIL,supported_services,REGEXP_REPLACE ( network_filed, '\s+', ',') from (select MMS,WAP,EMAIL ,case when MMS=1 and WAP=1 and INTERNET=1 and EMAIL=1 then 'SMS,MMS,WAP' when MMS=1 and WAP=1 then 'SMS,MMS' end as supported_services, REGEXP_REPLACE (LTE_FREQUENCIES, '[A-Z]', ' ') as network_filed from test.tttt limit 1)")
///third try :
val resultDF = hiveContext.sql("select MMS,WAP,EMAIL,supported_services,REGEXP_REPLACE ( network_filed, '[[:space:]]+', ',') from (select MMS,WAP,EMAIL ,case when MMS=1 and WAP=1 and INTERNET=1 and EMAIL=1 then 'SMS,MMS,WAP' when MMS=1 and WAP=1 then 'SMS,MMS' end as supported_services, REGEXP_REPLACE (LTE_FREQUENCIES, '[A-Z]', ' ') as network_filed from test.tttt limit 1)")
sample of input (LTE_FREQUENCIES value):
LTE FDD BAND 1 LTE FDD BAND 2 LTE FDD BAND 3 LTE FDD BAND 4 LTE FDD BAND 5 LTE TDD BAND 6 LTE FDD BAND 7 LTE FDD BAND 9 LTE FDD BAND 10 LTE FDD BAND 11 LTE FDD BAND 12 LTE FDD BAND 13 LTE FDD BAND 14 LTE FDD BAND 17 LTE FDD BAND 18 LTE FDD BAND 19 LTE FDD BAND 20 LTE FDD BAND 21 LTE FDD BAND 23 LTE FDD BAND 24 LTE FDD BAND 25 LTE FDD BAND 26 LTE FDD BAND 33 LTE FDD BAND 34 LTE FDD BAND 35 LTE FDD BAND 36 LTE FDD BAND 37 LTE TDD BAND 38 LTE TDD BAND 39 LTE TDD BAND 40 LTE TDD BAND 41 LTE TDD BAND 42 LTE TDD BAND 43 LTE FDD BAND 8 LTE FDD BAND 28 LTE FDD BAND 29 LTE FDD BAND 31 LTE TDD BAND 33 LTE TDD BAND 34 LTE TDD BAND 35 LTE TDD BAND 36 LTE TDD BAND 37 LTE TDD BAND 61 LTE TDD BAND 62 LTE TDD BAND 60
sample of excepted output :
1,2,3,4,5,6,7,9,10,11,12,13,14,17,18,19,20,21,23,24,25,26,33,34,35,36,37,38,39,40,41,42,43,8,28,29,31,33,34,35,36,37,61,62,60
non of above work always the result when I run resultDF.show(20, false) is :
is there are any way to replace those spaces by comma ?