Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
223 views
in Technique[技术] by (71.8m points)

String combination in Haskell

I'm writing an algorithm in Haskell that simplifies context-free grammars and I've been struggling with the removal of null productions, more specifically with the "substitution" of nullable non-terminals in the other productions.

Given a string, let's say "ASA", I would like to return a list of strings built by removing the character "A" one, two, ... every time it appears. To be clear, given "ASA" I'd like to return this: ["SA", "AS", "S"].

In Python I did it quite easily, but in Haskell I don't know how to iterate over the string and manipulate it how I'd like to. Probably because I'm still not used tu functional programming.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

A library-based approach:

A given input character may or may not be in any of the output partial strings. So it seems natural to involve the Haskell Maybe type transformer. It is similar to std::optional in C++.

We can have an expand function that associates to each input character a list of the corresponding possibilities:

$ ghci
 λ> 
 λ> st = "ASA"
 λ> 
 λ> expand ch = if (ch == 'A') then [ Just ch, Nothing ] else [ Just ch ]
 λ> 
 λ> map expand st
 [[Just 'A',Nothing],[Just 'S'],[Just 'A',Nothing]]
 λ> 

What we need is basically the Cartesian product of the above lists of possibilites. A list Cartesian product can be obtained by using the highly polymorphic sequence library function:

 λ> 
 λ> sequence (map expand st)
[[Just 'A',Just 'S',Just 'A'],[Just 'A',Just 'S',Nothing],[Nothing,Just 'S',Just 'A'],[Nothing,Just 'S',Nothing]]
 λ> 

Next, we need to change for example [Just 'A', Just 'S', Nothing] into ['A', 'S'], which in Haskell is exactly the same thing as "AS". The required function would have as its type signature:

func :: [Maybe α] -> [α]

If we submit this candidate type signature into Hoogle, we readily get library function catMaybes:

 λ> 
 λ> import qualified Data.Maybe as Mb
 λ> 
 λ> Mb.catMaybes [Just 'A',Just 'S',Nothing]
 "AS"
 λ> 
 λ> map Mb.catMaybes (sequence (map expand st))
 ["ASA","AS","SA","S"]
 λ> 

and we just have to remove the full string "ASA" from that last list.

Of course, there is no need to restrict this to the Char data type. Any type with a proper equality test can do. And the privileged character 'A' should be made into a variable argument. Overall, this gives us the following code:

import qualified Data.Maybe as Mb

multiSuppressor :: Eq α => α -> [α] -> [[α]]
multiSuppressor e xs =
    let  expand e1 = if (e1 == e) then [ Just e1, Nothing ] else [ Just e1 ]
         maybes  = sequence  (map expand xs)
         res1    = map  Mb.catMaybes  maybes
    in
         -- final massaging as the whole list is normally unwanted:
         if (null xs)  then  [[]]  else  filter (/= xs) res1

A note on efficiency:

Function sequence is polymorphic. Being the list cartesian product is not its sole role in life. Unfortunately, this happens to have the sad side effect that its memory consumption can become quite large if you go beyond toy-sized examples.

If this becomes a problem, one can use the following replacement code instead, which is based on an idea by K. A. Buhr:

cartesianProduct :: [[α]] -> [[α]]
cartesianProduct xss =
    map reverse (helper (reverse xss))
        where
            helper [] = [[]]
            helper (ys:zss) =  [y:zs | zs <- helper zss, y <- ys]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...