Environment: Ubuntu x86_64 (14.10), Oracle JDK 1.8u25
I am trying to use a parallel stream of Files.lines(), but I want to .skip() the first line (it's a CSV file with a header). So I tried this:
try (
    final Stream<String> stream = Files.lines(thePath, StandardCharsets.UTF_8)
        .skip(1L).parallel();
) {
    // etc
}
But then one column failed to parse to an int...
So I tried some simpler code. The file in question is dead simple:
$ cat info.csv
startDate;treeDepth;nrMatchers;nrLines;nrChars;nrCodePoints;nrNodes
1422758875023;34;54;151;4375;4375;27486
$
And the code is equally simple:
public static void main(final String... args)
    throws IOException
{
    final Path path = Paths.get("/home/fge/tmp/dd/info.csv");
    Files.lines(path, StandardCharsets.UTF_8).skip(1L).parallel()
        .forEach(System.out::println);
}
And I systematically get the following result (OK, I have only run it around 20 times):
startDate;treeDepth;nrMatchers;nrLines;nrChars;nrCodePoints;nrNodes
What am I missing here?
EDIT: It seems that the problem, or misunderstanding, is much more deeply rooted than that (the two examples below were cooked up by a fellow on FreeNode's ##java):
public static void main(final String... args)
{
    // First example: lines read from a BufferedReader
    new BufferedReader(new StringReader("Hello\nWorld")).lines()
        .skip(1L).parallel()
        .forEach(System.out::println);

    // Second example: parallel stream over an ORDERED spliterator of unknown size
    final Iterator<String> iter
        = Arrays.asList("Hello", "World").iterator();
    final Spliterator<String> spliterator
        = Spliterators.spliteratorUnknownSize(iter, Spliterator.ORDERED);
    final Stream<String> s
        = StreamSupport.stream(spliterator, true);
    s.skip(1L).forEach(System.out::println);
}
This prints:
Hello
Hello
Uh.
@Holger suggested that this happens for any stream which is ORDERED and not SIZED, with this other sample:
Stream.of("Hello", "World")
    .filter(x -> true)
    .parallel()
    .skip(1L)
    .forEach(System.out::println);
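To check which of these sources are ORDERED and which are SIZED, the spliterator's characteristics can be inspected directly. The following is only a minimal sketch; the class CharacteristicsDemo and its describe() helper are hypothetical names introduced for illustration and are not part of the original samples.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original question.
class CharacteristicsDemo {

    // Reports whether the stream's spliterator is ORDERED and/or SIZED.
    // Calling spliterator() is a terminal operation, so the stream is consumed.
    static String describe(final Stream<String> stream) {
        final Spliterator<String> spliterator = stream.spliterator();
        return "ORDERED=" + spliterator.hasCharacteristics(Spliterator.ORDERED)
            + ", SIZED=" + spliterator.hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(final String... args) {
        // Array-backed source: the size is known up front.
        System.out.println(describe(Stream.of("Hello", "World")));
        // filter() may drop elements, so the size is no longer known.
        System.out.println(describe(Stream.of("Hello", "World").filter(x -> true)));
        // BufferedReader.lines() cannot know the number of lines in advance.
        System.out.println(describe(
            new BufferedReader(new StringReader("Hello\nWorld")).lines()));
    }
}
```

Only the plain Stream.of() source should report SIZED; the filtered stream and the BufferedReader.lines() stream remain ORDERED but lose (or never had) a known size, which matches the category @Holger describes.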
Also, from all the discussion which has already taken place, it appears that the problem (if it is one?) lies with .forEach() (as @SotiriosDelimanolis first pointed out).
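For comparison, the Javadoc of forEach() says that for parallel pipelines it does not guarantee to respect the encounter order of the stream, whereas forEachOrdered() and collect() with an ordered collector do. The sketch below is an assumption-laden illustration, not a confirmed fix: OrderedSkipDemo and skipFirst() are hypothetical names, and the code simply swaps the terminal operation on the same non-SIZED sources to see which element survives skip(1L).

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original question.
class OrderedSkipDemo {

    // Skips the first element of a parallel stream and collects the rest.
    // Collectors.toList() is not an UNORDERED collector, so encounter order is kept.
    static List<String> skipFirst(final Stream<String> stream) {
        return stream.parallel().skip(1L).collect(Collectors.toList());
    }

    public static void main(final String... args) {
        // Same kinds of non-SIZED sources as in the samples above.
        System.out.println(skipFirst(
            new BufferedReader(new StringReader("Hello\nWorld")).lines()));
        System.out.println(skipFirst(Stream.of("Hello", "World").filter(x -> true)));

        // forEachOrdered() is the order-respecting counterpart of forEach().
        Stream.of("Hello", "World")
            .filter(x -> true)
            .parallel()
            .skip(1L)
            .forEachOrdered(System.out::println);
    }
}
```

With an order-respecting terminal operation, skip(1L) should drop "Hello" and keep "World" in each case, which is consistent with the suspicion that the surprising output is tied to .forEach() rather than to .skip() alone.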