Splitting text by percentage at punctuation marks
def split_text(s):
    """Partition text into three parts in roughly
    30%, 40%, 30% proportions, cutting only at non-letter characters."""
    i1 = int(0.3 * len(s))  # first part runs from 0 to i1
    i2 = int(0.7 * len(s))  # second from i1 to i2, third from i2 onward
    # isalpha() is False for spaces and punctuation
    # (. ; , ? " ' etc.), so it marks a word boundary.
    # Back up i1 while it sits on a letter; checking i1 > 0 first
    # avoids indexing into an empty string.
    while i1 > 0 and s[i1].isalpha():
        i1 -= 1
    # Same for the second cut, never moving before the first one
    while i2 > i1 and s[i2].isalpha():
        i2 -= 1
    # Return the three parts
    return s[:i1], s[i1:i2], s[i2:]
for s in list_strings:
    # Loop over the list, reporting the lengths of the parts
    # The three parts are a, b, c
    a, b, c = split_text(s)
    print(f'{s}\nLengths: {len(a)}, {len(b)}, {len(c)}')
    print()
Output
I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best
Lengths: 52, 86, 63
So many books, so little time.
Lengths: 7, 10, 13
In three words I can sum up everything I've learned about life: it goes on.
Lengths: 20, 31, 24
if you tell the truth, you don't have to remember anything.
Lengths: 15, 25, 19
Always forgive your enemies; nothing annoys them so much.
Lengths: 14, 22, 21
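The lengths above can be checked with a short script. This is only a sketch: it assumes list_strings holds the five quotes exactly as they are printed in the output (the original definition of the list is not shown), and it repeats split_text, with the bounds check placed first, so the block runs on its own.

```python
def split_text(s):
    """Copy of the function above so this block is self-contained."""
    i1 = int(0.3 * len(s))
    i2 = int(0.7 * len(s))
    while i1 > 0 and s[i1].isalpha():
        i1 -= 1
    while i2 > i1 and s[i2].isalpha():
        i2 -= 1
    return s[:i1], s[i1:i2], s[i2:]


# Assumed input: the quotes as they appear in the output above
list_strings = [
    "I'm selfish, impatient and a little insecure. I make mistakes, "
    "I am out of control and at times hard to handle. But if you can't "
    "handle me at my worst, then you sure as hell don't deserve me at my best",
    "So many books, so little time.",
    "In three words I can sum up everything I've learned about life: it goes on.",
    "if you tell the truth, you don't have to remember anything.",
    "Always forgive your enemies; nothing annoys them so much.",
]

# Collect the part lengths for each quote
lengths = [tuple(len(p) for p in split_text(s)) for s in list_strings]

for s in list_strings:
    a, b, c = split_text(s)
    # Slicing never drops characters, so the parts rejoin to the original
    assert a + b + c == s
```

Because the cuts are plain slices, no characters are lost; the percentages are only approximate, since each cut moves left to the nearest non-letter character.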
Output of split_text
Code
for s in list_strings:
    a, b, c = split_text(s)
    print(a)
    print(b)
    print(c)
    print()
Result
I'm selfish, impatient and a little insecure. I make
mistakes, I am out of control and at times hard to handle. But if you can't handle me
at my worst, then you sure as hell don't deserve me at my best
So many
books, so
little time.
In three words I can
sum up everything I've learned
about life: it goes on.
if you tell the
truth, you don't have to
remember anything.
Always forgive
your enemies; nothing
annoys them so much.
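Each cut lands on the non-letter character itself, so the second and third parts begin with that character, usually a space. A minimal sketch, assuming trimmed text is wanted for display; the helper name split_text_trimmed is hypothetical and not part of the original answer:

```python
def split_text(s):
    """Copy of the function above so this block is self-contained."""
    i1 = int(0.3 * len(s))
    i2 = int(0.7 * len(s))
    while i1 > 0 and s[i1].isalpha():
        i1 -= 1
    while i2 > i1 and s[i2].isalpha():
        i2 -= 1
    return s[:i1], s[i1:i2], s[i2:]


def split_text_trimmed(s):
    """Hypothetical helper: split as above, then strip surrounding whitespace."""
    return tuple(part.strip() for part in split_text(s))


raw = split_text("So many books, so little time.")
trimmed = split_text_trimmed("So many books, so little time.")
print(raw)      # ('So many', ' books, so', ' little time.')
print(trimmed)  # ('So many', 'books, so', 'little time.')
```

The trimmed parts no longer concatenate back to the original string, so this variant is suited to display rather than to further processing.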
Capturing the parts
result_a, result_b, result_c = [], [], []
for s in list_strings:
    # Loop over the list, collecting each part in its own list
    # The three parts are a, b, c
    a, b, c = split_text(s)
    result_a.append(a)
    result_b.append(b)
    result_c.append(c)
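The same three collections can also be built in one step by transposing the results with zip. This is an alternative sketch, not the approach used above; the two-item list_strings here is an assumed sample taken from the quotes shown earlier.

```python
def split_text(s):
    """Copy of the function above so this block is self-contained."""
    i1 = int(0.3 * len(s))
    i2 = int(0.7 * len(s))
    while i1 > 0 and s[i1].isalpha():
        i1 -= 1
    while i2 > i1 and s[i2].isalpha():
        i2 -= 1
    return s[:i1], s[i1:i2], s[i2:]


# Assumed sample input: two of the quotes shown earlier
list_strings = [
    "So many books, so little time.",
    "Always forgive your enemies; nothing annoys them so much.",
]

# zip(*...) transposes the sequence of (a, b, c) tuples into three tuples
result_a, result_b, result_c = zip(*(split_text(s) for s in list_strings))

print(result_a)  # ('So many', 'Always forgive')
```

Two differences from the explicit loop: the results are tuples rather than lists, and the unpacking raises ValueError when list_strings is empty, so the loop remains the safer choice if the input may be empty.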