Here is an example of how you could approach it in Perl:
use feature qw(say);
use strict;
use warnings;

my $fn = 'file1.tsv';
open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
my $header = <$fh>;    # header line, kept with its newline for the output
my @pos;               # positions in first-seen order
my %info;              # pos => fields collected for that position
while( my $line = <$fh> ) {
    chomp $line;
    my ($nbr, $pos, $pvalue, $percentage, $samplename) = split /\t/, $line;
    if ( !exists $info{$pos} ) {
        # First row for this pos: store all fields and remember the order
        $info{$pos} = {
            nbr        => $nbr,
            pvalue     => [$pvalue],
            percentage => $percentage,
            samplename => [$samplename],
        };
        push @pos, $pos;
    }
    else {
        # Repeated pos: only pvalue and samplename are accumulated
        push @{$info{$pos}{pvalue}}, $pvalue;
        push @{$info{$pos}{samplename}}, $samplename;
    }
}
close $fh;

print $header;
for my $pos (@pos) {
    my $data = $info{$pos};
    say join "\t", $data->{nbr}, $pos,
        (join ",", @{$data->{pvalue}}), $data->{percentage},
        (join ",", @{$data->{samplename}});
}
Output:
rownbr pos pvalue percentage samplename
1 chr1_12000 0.05 5.6 S1
1 chr1_12500 0.04,0.03 15.9 S1,S3
3 chr1_12570 0.9 45.3 S2
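
Note that the script keeps rownbr and percentage from the first row seen for each pos, and drops those two values from any later row with the same pos. If percentage can differ between such rows, a variation along the following lines might be closer to what you want. This is only a sketch under assumptions: the merge_rows helper is a hypothetical name, and the sample rows in @rows are made up for illustration (the rownbr and percentage of the S3 row in particular are guesses, not taken from the real input).

```perl
use feature qw(say);
use strict;
use warnings;

# Hypothetical helper: groups tab-separated rows by pos and collects
# pvalue, percentage and samplename into lists, so that no value from
# a repeated pos is silently dropped. rownbr is taken from the first row.
sub merge_rows {
    my @rows = @_;
    my ( @order, %info );
    for my $row (@rows) {
        my ( $nbr, $pos, $pvalue, $percentage, $samplename ) = split /\t/, $row;
        if ( !exists $info{$pos} ) {
            $info{$pos} = { nbr => $nbr };
            push @order, $pos;    # remember first-seen order
        }
        push @{ $info{$pos}{pvalue} },     $pvalue;
        push @{ $info{$pos}{percentage} }, $percentage;
        push @{ $info{$pos}{samplename} }, $samplename;
    }
    # Build one output line per pos, in first-seen order.
    return map {
        my $d = $info{$_};
        join "\t", $d->{nbr}, $_,
            join( ",", @{ $d->{pvalue} } ),
            join( ",", @{ $d->{percentage} } ),
            join( ",", @{ $d->{samplename} } );
    } @order;
}

# Made-up sample rows (assumed input, tab-separated, header already removed).
my @rows = (
    "1\tchr1_12000\t0.05\t5.6\tS1",
    "1\tchr1_12500\t0.04\t15.9\tS1",
    "2\tchr1_12500\t0.03\t16.2\tS3",
    "3\tchr1_12570\t0.9\t45.3\tS2",
);
say for merge_rows(@rows);
```

With these made-up rows, the chr1_12500 line carries both percentages (15.9,16.2) instead of only the first one. Reading from the file works the same way as in the script above: read and print the header first, then pass the chomped remaining lines to merge_rows.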