You can use a recursive form of breadth-first search:
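def overlap(a, b) -> bool:
    # True when interval a ends inside interval b: b[0] <= a[-1] < b[-1]
    return a[-1] >= b[0] and a[-1] < b[-1]

def group(d, _c, seen):
    # Return _c together with the intervals in d that _c runs into;
    # an interval already in seen is expanded recursively
    return [_c,
            [i if i not in seen else group(d, i, seen+[i]) for i in d if overlap(_c, i)]]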
r = {'5ykw.pdb': [[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]}
new_data = [group(r['5ykw.pdb'], i, []) for i in r['5ykw.pdb'] if not any(overlap(c, i) for c in r['5ykw.pdb'])]
final_data = [a if not b else [a[0], max(h for _, h in b)] for a, b in new_data]
Output:
[[10, 22], [33, 51], [63, 71], [94, 105]]
This also works when several intervals overlap the same starting interval:
r = {'5ykw.pdb':[[15, 20], [18, 21], [19, 30]]}
new_data = [group(r['5ykw.pdb'], i, []) for i in r['5ykw.pdb'] if not any(overlap(c, i) for c in r['5ykw.pdb'])]
final_data = [a if not b else [a[0], max(h for _, h in b)] for a, b in new_data]
Output:
[[15, 30]]
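For comparison, here is a minimal sketch of a different technique, a sort-then-sweep merge, rather than the recursive grouping above. The name `merge_intervals` is hypothetical. It assumes every interval is a `[start, end]` pair and that intervals which overlap or touch should be merged. Under those assumptions it is also meant to cover chains where each interval overlaps only its neighbour, and intervals nested inside a larger one.

```python
def merge_intervals(intervals):
    """Merge overlapping [start, end] pairs (hypothetical helper, not part of the answer above)."""
    merged = []
    # After sorting by start, an interval can only overlap the most recently merged one
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend its end if this one reaches further
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # No overlap: start a new merged interval (a fresh list, so the input is not mutated)
            merged.append([start, end])
    return merged

r = {'5ykw.pdb': [[10, 22], [33, 40], [39, 51], [63, 71], [94, 105]]}
print(merge_intervals(r['5ykw.pdb']))                    # [[10, 22], [33, 51], [63, 71], [94, 105]]
print(merge_intervals([[15, 20], [18, 21], [19, 30]]))   # [[15, 30]]
```

Sorting first keeps the sweep to a single pass, so the cost is dominated by the sort.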