Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
996 views
in Technique[技术] by (71.8m points)

haskell - IO/Monadic assign operator causing ghci to explode for infinite list

Consider the following program. It runs forever and does nothing useful, but the memory consumption in ghci is constant :

--NoExplode.hs
module Main (main) where

test :: [Int] -> IO()
test lst = do
  print "test"
  rList lst

rList :: [Int] -> IO ()
rList [] = return ()
rList (x:xs) = do
  rList xs

main = do
  test [1..]

Now consider the following trivially modified version of the above. When this program is run in ghci the memory explodes. The only difference is that print "test" is now assigned to x in the do block of test.

--Explode.hs
module Main (main) where

test :: [Int] -> IO()
test lst = do
  x <- print "test"
  rList lst

rList :: [Int] -> IO ()
rList [] = return ()
rList (x:xs) = do
  rList xs

main = do
  test [1..]

Why does changing print "test" to x <- print "test" cause ghci to blow up?

p.s. I came across this when trying to understand Memory exploding upon writing a lazy bytestring to file in ghci, and the problem there (I think) essentially distils to the above. Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Disclaimer: I'm not a GHCi expert, and also not that good with GHC core. Now that I've lost my credibility, lets try to understand what happens:

GHCi and CAFs

GHCi retains all evaluated CAFs:

Normally, any evaluation of top-level expressions (otherwise known as CAFs or Constant Applicative Forms) in loaded modules is retained between evaluations.

Now you might wonder why there's such a big difference between both versions. Lets have a look at the core with -ddump-simpl. Note that you might want to drop -dsuppress-all when you dump the programs yourself.

Dumps of your programs

Non-exploding version:

? ghc SO.hs -ddump-simpl -fforce-recomp -O0 -dsuppress-all
[1 of 1] Compiling Main             ( SO.hs, SO.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 29, types: 28, coercions: 0}

$dShow_rq2
$dShow_rq2 = $fShow[] $fShowChar

Rec {
rList_reI
rList_reI =
   ds_dpU ->
    case ds_dpU of _ {
      [] -> return $fMonadIO ();
      : x_aho xs_ahp -> rList_reI xs_ahp
    }
end Rec }

main
main =
  >>
    $fMonadIO
    (print $dShow_rq2 (unpackCString# "test"))
    (rList_reI (enumFrom $fEnumInt (I# 1)))

main
main = runMainIO main

The important part is the location of [1..], almost at the end:

enumFrom $fEnumInt (I# 1))

As you can see, the list isn't a CAF. But what happens if we instead use the exploding version?

Exploding version

? ghc SO.hs -ddump-simpl -fforce-recomp -O0 -dsuppress-all
[1 of 1] Compiling Main             ( SO.hs, SO.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 32, types: 31, coercions: 0}

$dShow_rq3
$dShow_rq3 = $fShow[] $fShowChar

Rec {
rList_reI
rList_reI =
   ds_dpV ->
    case ds_dpV of _ {
      [] -> return $fMonadIO ();
      : x_ahp xs_ahq -> rList_reI xs_ahq
    }
end Rec }

lst_rq4
lst_rq4 = enumFrom $fEnumInt (I# 1)

main
main =
  >>=
    $fMonadIO
    (print $dShow_rq3 (unpackCString# "test"))
    ( _ -> rList_reI lst_rq4)

main
main = runMainIO main

There's suddenly a new top-level expression, namely lst_rq4, which generates the list. And as seen before, GHCi retains the evaluations of top-level expressions, so lst_rq4 will also be retained.

Now there is an option to discard the evaluations:

Turning on +r causes all evaluation of top-level expressions to be discarded after each evaluation (they are still retained during a single evaluation).

But since "they are still retained during a single evaluation" even :set +r won't help you in this case. Unfortunately I cannot answer why GHC introduces a new top-level expression.

Why does this even happen in optimized code?

The list is still a top-level expression:

main2
main2 = eftInt 1 2147483647

Funny enough, GHC actually doesn't create an infinite list, since Int is bounded.

How can one get rid of the leak?

In this case you can get rid of it if you place the list in test:

test = do
   x <- print "test"
   rList [1..]

This will prevent GHC from creating a top-level expression.

However, I can't really give a general advise on this. Unfortunately, my Haskell-fu isn't yet good enough.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...