Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


hash - Perl Column comparison in two files

I have two files "1.txt" and "2.txt".

Both files have the following structure:

Main_File1 Start End 
1           200   250
2           310   340

Main_File2  Start End 
1           200   250
2           350   370

I want to write Perl code that produces two output files: one listing the "Start" and "End" positions common to both files, and another listing the "Start" and "End" positions that are unique to one file.



1 Reply


I'm going to guess you already have code for reading each of these files, and just need to parse out the values and organize them.

The design question is choosing a hash key that is unique for each start/end pair but still easy to work with. Since start and end are both numbers, joining them with an underscore is enough:

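our %matchups ;

# Record every start/end pair found in the lines of one file.
sub process
{
  my ($lst_)= @_ ;                # reference to an array of lines
  for ( @$lst_ ) {
    # Skip the ID column, then capture Start and End.
    my ($strt,$endn)= /^\s*\d+\s+(\d+)\s+(\d+)/ ;
    next unless defined $strt && defined $endn ;   # header or blank line
    my $key= "${strt}_$endn" ;
    $matchups{$key}[0]= $_ ;      # the original line
    $matchups{$key}[1] ++ ;       # how many times this pair was seen
  }
}

# Write the pairs seen more than once ($multi true) or exactly once ($multi false).
sub outputmatch
{
  my ($dest,$multi)= @_ ;
  open( my $OUT, '>', $dest ) or die "Cannot open $dest: $!" ;
  for ( values %matchups ) {
    next if ( $_->[1] > 1 ) xor $multi ;
    print $OUT $_->[0] ;
  }
  close( $OUT ) or die "Cannot close $dest: $!" ;
}

{
  # @listfrom1txt and @listfrom2txt are assumed to hold the lines
  # already read from 1.txt and 2.txt; process() takes array references.
  process(\@listfrom1txt) ;
  process(\@listfrom2txt) ;

  outputmatch( "common.txt", 1 ) ;
  outputmatch( "uniq.txt", 0 ) ;
}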

So here we make a key which is start_end, and then we build a data structure inside the hash which is an array of two elements. The first element is the original line, the second is the count of how many times we've seen this entry.
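
As an illustration, here is a minimal sketch of what that structure holds for the sample data in the question. The function name build_matchups and the inlined sample lines are assumptions made for this example, and the regex assumes an integer ID column followed by whitespace-separated Start and End columns.

```perl
use strict;
use warnings;

# Build a hash of "start_end" => [ original line, times seen ].
# build_matchups is a hypothetical helper name, not part of the answer's code.
sub build_matchups {
    my (@lines) = @_;
    my %matchups;
    for my $line (@lines) {
        # Header lines such as "Main_File1 Start End" do not match and are skipped.
        next unless $line =~ /^\s*\d+\s+(\d+)\s+(\d+)/;
        my $key = "${1}_$2";
        $matchups{$key}[0] = $line;
        $matchups{$key}[1]++;
    }
    return \%matchups;
}

# Sample lines copied from the two files in the question.
my $demo = build_matchups(
    "Main_File1 Start End \n",
    "1           200   250\n",
    "2           310   340\n",
    "Main_File2  Start End \n",
    "1           200   250\n",
    "2           350   370\n",
);

for my $key (sort keys %$demo) {
    printf "%s seen %d time(s)\n", $key, $demo->{$key}[1];
}
# Prints:
# 200_250 seen 2 time(s)
# 310_340 seen 1 time(s)
# 350_370 seen 1 time(s)
```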

If a line is unique, the count will be 1; if it's not, then it will be greater than 1.
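
One caveat: the count does not record which file a pair came from, so a pair listed twice in 1.txt alone would be reported as common. The following is a rough end-to-end sketch that tracks the source file instead, under a few assumptions: the helper names (pair_key, classify, read_lines, write_lines) are made up for this example, the input layout is ID, Start, End separated by whitespace, and the output file names are the ones used in the answer above.

```perl
use strict;
use warnings;

# Return the "start_end" key for a data line, or undef for header/blank lines.
sub pair_key {
    my ($line) = @_;
    return $line =~ /^\s*\d+\s+(\d+)\s+(\d+)/ ? "${1}_$2" : undef;
}

# Split the lines of two files into pairs found in both files
# and pairs found in only one of them, keeping first-seen order.
sub classify {
    my ($lines1, $lines2) = @_;
    my (%seen_in, %line_for, @order);
    my $file_no = 0;
    for my $lines ($lines1, $lines2) {
        $file_no++;
        for my $line (@$lines) {
            my $key = pair_key($line);
            next unless defined $key;
            unless (exists $line_for{$key}) {
                push @order, $key;
                $line_for{$key} = $line;    # keep the first line seen for this pair
            }
            $seen_in{$key}{$file_no} = 1;   # remember which file(s) had this pair
        }
    }
    my (@common, @unique);
    for my $key (@order) {
        if (scalar(keys %{ $seen_in{$key} }) == 2) { push @common, $line_for{$key} }
        else                                       { push @unique, $line_for{$key} }
    }
    return (\@common, \@unique);
}

# Read a file into an array reference of chomped lines.
sub read_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    chomp(my @lines = <$in>);
    close($in);
    return \@lines;
}

# Write one line per element to the given path.
sub write_lines {
    my ($path, $lines) = @_;
    open(my $out, '>', $path) or die "Cannot open $path: $!";
    print {$out} "$_\n" for @$lines;
    close($out) or die "Cannot close $path: $!";
}

# Only run against real files when both inputs are present.
if (-e '1.txt' && -e '2.txt') {
    my ($common, $unique) = classify(read_lines('1.txt'), read_lines('2.txt'));
    write_lines('common.txt', $common);
    write_lines('uniq.txt',   $unique);
}
```

With the sample files from the question, common.txt would contain the 200/250 line, and uniq.txt the 310/340 and 350/370 lines.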

