Here is another one that tries to minimize the lookups in hash b, for the case where edge is much larger than input (edge >> input):
$ awk '
NR==FNR && !($1 in a) {    # node not in hash a yet, i.e. skip duplicates in input
    for (i in a) {         # e.g. new node "c" with a, b in a[]: add ca, ac, cb, bc to b
        b[$1, i]           # the comma joins the names with SUBSEP, so that
        b[i, $1]           # "ab" "c" and "a" "bc" cannot collide
    }
    a[$1]                  # new entries go to a as well
    next
}
(($1, $2) in b) {          # a single lookup per edge line
#   delete b[$1, $2]       # uncomment these to remove duplicates,
#   delete b[$2, $1]       # i.e. "a b 0.8" vs. "b a 0.8"
    print
}' input edge              # if $1 and $2 are two distinct nodes in a, ($1, $2) is in b
Output:
a b 0.8
b c 0.1
c b 0.1
b a 0.8
a c 0.1
With duplicates removed:
a b 0.8
b c 0.1
a c 0.1
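
To try the idea without the original files, here is a minimal self-contained sketch. The sample node list and edge list are hypothetical (the real input and edge files are not shown above), and the script joins the two node names with awk's comma subscript (SUBSEP), an assumption made so that multi-character names cannot collide. It also counts the entries of b, to show what the single lookup per edge line costs in memory.

```shell
#!/bin/sh
# Hypothetical sample data: the original input and edge files are not shown.
tmp=$(mktemp -d)
printf 'a\nb\nc\na\n' > "$tmp/input"    # node list, with "a" duplicated
printf 'a b 0.8\nb c 0.1\nc d 0.3\nc b 0.1\nd a 0.5\n' > "$tmp/edge"

out=$(awk '
NR==FNR {                      # first file: the node list
    if (!($1 in a)) {          # skip nodes that were already seen
        for (i in a) {         # pair the new node with every known node
            b[$1, i]           # referencing an element creates it
            b[i, $1]
        }
        a[$1]
    }
    next
}
(($1, $2) in b)                # second file: print edges whose pair is stored
END {
    n = 0
    for (k in b) n++           # count the stored ordered pairs
    print "pairs stored: " n
}' "$tmp/input" "$tmp/edge")

rm -rf "$tmp"
printf '%s\n' "$out"
```

With this sample data the script prints the three edges a b 0.8, b c 0.1 and c b 0.1, followed by "pairs stored: 6". For n distinct nodes b holds n*(n-1) ordered pairs, so the approach trades memory for a single lookup per edge line, which only pays off when input is small compared to edge.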