Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


string - How can I sort an array or table by column in Perl?

I've been looking everywhere for an answer to this, and I just can't get it to work.

I have an input file that is read into an array using Perl. The file is a text file containing a table. Perl reads it in as an array, with each element being a full line (including all five columns). This is what the array looks like:

0__len__340      16    324       0    0.0470588235294118
1__len__251      2     249       0    0.00796812749003984
2__len__497      0     497       0    0
3__len__55       7     48        0    0.127272727272727
4__len__171      0     171       0    0
5__len__75       0     75        0    0
6__len__160      75    85        0    0.46875
7__len__285      1     284       0    0.00350877192982456
8__len__94       44    50        0    0.468085106382979

I need to sort this table by the last column in descending order. So my output should be:

6__len__160     75    85       0    0.46875
8__len__94      44    50       0    0.468085106382979
3__len__55      7     48       0    0.127272727272727
0__len__340     16    324      0    0.0470588235294118
1__len__251     2     249      0    0.00796812749003984
7__len__285     1     284      0    0.00350877192982456
2__len__497     0     497      0    0
4__len__171     0     171      0    0
5__len__75      0     75       0    0

I've tried a few approaches, but none have worked. Here's the code I've tried:

@input = <FILENAME>;

#Close the file
close FILENAME;

my @fractions;
my $y = 0;
for (my $x = 1; $x <= $#input; ++$x) {
    $fractions[$y] = (split (/\s/, $input[$x]))[4];
    ++$y;
}
my @sorted = sort {$b <=> $a} @fractions;
my $e = 1;
my $z = 0;
my $f = 0;
my @final;

do {
    do {
        if ((split (/\s/, $input[$e]))[4] == $sorted[$z]){
            $final[$f] = $input[$e];
            ++$e;
            ++$f;
        } else {
            ++$e;
        }
    } until ($e > $#input);

    do {
        ++$z;
    } until ($sorted[$z] != $sorted[$z - 1]);

    $e = 0;
} until ($z > $#sorted);

for (my $h = 0; $h <= $#final; ++$h) {
    print $final[$h] . "\n\n";
}

With this one, I basically tried to put the column 5 numbers into their own array, sort them, and then go back through the original array and pull out the elements that match the sorted array, and put them into the final array.

This may work if I keep working on it, but it takes so long to run that it's impractical. Even the small table I'm using for testing took a long time to process, and once the code is working it will have to handle a table with millions of rows.

I also tried applying the sort command to the table itself, but my output is exactly the same as my input; it doesn't get sorted.

@input = <FILENAME>;
close FILENAME;
my @sorted = sort { $b->[4] <=> $a->[4] } @input;
for (my $h = 0; $h <= $#sorted; ++$h) {
    print $sorted[$h] . "\n\n";
}
exit;

Lastly, I tried to put the array into a hash where each key was the first four columns (the first column is unique) and each value was the fifth column.

Then I hoped I could sort the hash by the values and the keys would stay with their assigned values. I couldn't get this to work either, though unfortunately it was a couple days ago and I erased the code.

One problem was that I couldn't figure out how to split the string only before the fifth column, so that I would end up with two strings: one containing the first four columns and one containing the fifth.

What am I doing wrong with the sort command? Is there a better way to do this?


1 Reply


In your last code example you can replace

my @sorted = sort { $b->[4] <=> $a->[4] } @input;

with

my @sorted = sort { (split(' ', $b))[4] <=> (split(' ', $a))[4] } @input;

or even

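my @sorted = sort { (split(/\s+/, $b))[4] <=> (split(/\s+/, $a))[4] } @input;

if the input data has no lines with leading whitespace. For such lines split(/\s+/, ...) returns an empty first field, which shifts every column by one, whereas split(' ', ...) skips leading whitespace.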
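Since the question mentions tables with millions of rows, it may also be worth avoiding a split of both lines on every comparison. One common way to do that is a Schwartzian transform, which extracts the sort key once per line. The following is only a sketch under assumptions: the sample rows are copied from the question and stand in for the lines read from the file, and split ' ' is used so that leading whitespace does not matter.

```perl
use strict;
use warnings;

# Sample rows copied from the question; in real use @input would hold
# the lines read from the file (each line keeps its trailing newline).
my @input = (
    "0__len__340      16    324       0    0.0470588235294118\n",
    "6__len__160      75    85        0    0.46875\n",
    "2__len__497      0     497       0    0\n",
    "8__len__94       44    50        0    0.468085106382979\n",
);

# Schwartzian transform:
#   1. pair every line with its fifth column (split happens once per line),
#   2. sort the pairs numerically on that column, descending,
#   3. unwrap the original lines.
my @sorted =
    map  { $_->[1] }
    sort { $b->[0] <=> $a->[0] }
    map  { [ (split ' ', $_)[4], $_ ] }
    @input;

print @sorted;
```

The trade-off is memory: every line gets a small temporary array next to it while sorting, so for very large files this trades extra space for fewer split calls.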

