Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

perl - Error: Wide character in print at X line 35, <$fh> line 1 (reading text files from the command line)

I am new to Perl, and this is my second assignment. I need to write a program that parses n files and prints m sentences using an n-gram model. Long story short, I wrote this script, which takes n arguments: the first and second are numeric and the rest are file names. However, I am getting this error: Wide character in print at ngram.pl line 35, <$fh> line 1.

Steps to reproduce:

Input from the command line: perl ngram.pl 5 10 tale-cities.txt bleak-house.txt papers.txt
Output: Wide character in print at ngram.pl line 35, <$fh> line 1.

#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Scalar::Util qw(looks_like_number);
use utf8;
use Encode;
#Charles Dickens


sub checkIfNumberic
{
    my ($inp)=@_;
    if  (looks_like_number($inp)){
       return "True";
    }
    else{
        return "False" ;
    }
}
sub main
{
    my $correctInput=", your input must be something like this 5 10 somefile.txt somefile2.txt ";
    my @inputs= @ARGV;
    if (checkIfNumberic($inputs[0]) eq "False"){
        die "first argument must be numeric $correctInput\n";
    }
    if (checkIfNumberic($inputs[1]) eq "False"){
        die "second argument must be numeric $correctInput\n";
    }
    for (my $i=2;  $i< scalar @inputs ;$i++)
    {
        if (open(my $fh, '<:encoding(UTF-8)', $inputs[$i])) {
            while (my $line = <$fh>) {
                chomp $line;
                print "$line \n";
            }
        }
    }
}

main();

1 Reply

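You decoded your inputs (the script source, with use utf8; and the file contents, with :encoding(UTF-8)), but you didn't encode your output. Perl therefore warns when print receives a character above code point 255 on a handle that has no encoding layer. Add the following near the top of the script: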

use open ':std', ':encoding(UTF-8)';

For the three standard handles, this is equivalent to

BEGIN {
   binmode STDIN,  ':encoding(UTF-8)';
   binmode STDOUT, ':encoding(UTF-8)';
   binmode STDERR, ':encoding(UTF-8)';
}

It also sets the default encoding for file handles opened in its lexical scope, so you can remove the existing :encoding(UTF-8) from the open call if you want.
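To see the decode-on-input, encode-on-output round trip in isolation, here is a minimal sketch. It assumes an in-memory file handle as a stand-in for STDOUT so the written bytes can be inspected, and the helper name printed_bytes is hypothetical (it is not part of the original script). The sample string uses curly quotes, which are the kind of character above code point 255 that triggers the warning.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: print $text to an in-memory handle that has an
# :encoding(UTF-8) layer, and return the bytes that were written.
sub printed_bytes {
    my ($text) = @_;
    my $buffer = '';
    open(my $out, '>', \$buffer) or die "Cannot open in-memory handle: $!\n";
    binmode($out, ':encoding(UTF-8)');   # same layer the answer puts on STDOUT
    print {$out} $text;
    close($out);
    return $buffer;
}

# UTF-8 bytes as they would be stored on disk: a sentence in curly quotes.
my $raw = "\xE2\x80\x9CIt was the best of times\xE2\x80\x9D";

# Decoding turns each 3-byte quote into a single character (U+201C, U+201D).
my $text = decode('UTF-8', $raw);

# Printing through the encoding layer turns the characters back into bytes.
my $bytes = printed_bytes($text);

printf "%d bytes in, %d characters decoded, %d bytes out\n",
    length($raw), length($text), length($bytes);
```

Without the binmode call, the same print would emit the "Wide character in print" warning, because the handle would have no layer telling Perl how to turn characters above 255 into bytes.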

