I went through the code and fixed it.
This should work:
import re

# read the input file content
with open('input.txt') as inputFile:
    inputText = inputFile.read()

regx = r'^(\d{4})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})|(^\d{4})'
parsedText = re.findall(regx, inputText, flags=re.M)

rows = []
# organize the data to write to the file
for line in parsedText:
    if line[0]:
        # first alternative matched: start time, patient, procedure, surgeon
        rows.append(list(line))
    elif rows:
        # second alternative matched: store the end time in the last slot of the previous row
        rows[-1][-1] = line[-1]

# write to the file as: patient | start | end | procedure | surgeon
with open('output.txt', 'w') as csvfile:
    for row in rows:
        csvfile.write("{} | {} | {} | {} | {}\n".format(row[1], row[0], row[4], row[2], row[3]))
You can find the regular expression I used, along with an explanation, here:
https://regex101.com/r/mHWcTD/1
1st Alternative ^(\d{4})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})
  ^ asserts position at start of a line
  1st Capturing Group (\d{4}) # captures the start time
    \d{4} matches a digit (equal to [0-9])
      {4} Quantifier: matches exactly 4 times
  \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
    {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
  2nd Capturing Group (\D+?) # captures the patient name
    \D+? matches any character that's not a digit (equal to [^0-9])
      +? Quantifier: matches between one and unlimited times, as few times as possible, expanding as needed (lazy)
  Positive Lookahead (?=\s{2,})
    Assert that the regex below matches
    \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
      {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
  \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
    {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
  3rd Capturing Group (\D+?) # captures the operation details
    \D+? matches any character that's not a digit (equal to [^0-9])
      +? Quantifier: matches between one and unlimited times, as few times as possible, expanding as needed (lazy)
  Positive Lookahead (?=\s{2,})
    Assert that the regex below matches
    \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
      {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
  \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
    {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
  4th Capturing Group (\D+?) # captures the surgeon's name
    \D+? matches any character that's not a digit (equal to [^0-9])
      +? Quantifier: matches between one and unlimited times, as few times as possible, expanding as needed (lazy)
  Positive Lookahead (?=\s{2,})
    Assert that the regex below matches
    \s{2,} matches any whitespace character (equal to [\r\n\t\f\v ])
      {2,} Quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative (^\d{4})
  5th Capturing Group (^\d{4}) # captures the end time
    ^ asserts position at start of a line
    \d{4} matches a digit (equal to [0-9])
      {4} Quantifier: matches exactly 4 times
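
Two parts of this explanation are easier to see in a small experiment: how the lazy \D+? works together with the lookahead, and what re.findall returns when a pattern has alternatives. The snippet below is only a sketch. The pattern in it is a shortened, hypothetical variant of the real one, and the test strings are made up with an assumed column spacing.

```python
import re

# Lazy \D+? grows one character at a time and stops as soon as the lookahead
# sees two or more whitespace characters, so single spaces stay inside the field.
field = re.match(r'(\D+?)(?=\s{2,})', "Morgan Freeman     Replace Root   ")
print(field.group(1))  # Morgan Freeman

# Shortened, hypothetical variant of the full pattern (two fields instead of four).
pattern = r'^(\d{4})\s{2,}(\D+?)(?=\s{2,})|(^\d{4})'
# Made-up text; the spacing between columns is an assumption.
text = "0745   Morgan Freeman     ...\n1305   RC02654289\n"

# With alternation, re.findall returns one tuple per match; groups that belong
# to the alternative that did not match come back as empty strings.
matches = re.findall(pattern, text, flags=re.M)
print(matches)  # [('0745', 'Morgan Freeman', ''), ('', '', '1305')]
```

This is why the loop in the script checks the first group: a non-empty first element means a full row was matched, and an empty one means only an end time was captured.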
Sample input:
Run on: 10/07/19 - 1444 Hospital PAGE 1
Run by: H Final Slate For: 11/07/19 THU
PIR Patient Name R/L/B Proposed Procedure Surgeon Path Reg'd Dur
POR Time Unit Number PHN Assist Bld Req'd PIR-POR
Pri DOB Age/S Med Imaging
Loc Bed Type Req'd Staff
Ward
OR Room - 1 Room End Time: 1730 Anaesthetist: S,A T
OHS 0900-2000
0745 Morgan Freeman Replace Root and Ascending Dr. Henry Cavail GENERAL
1305 RC02654289 96985693 Aorta/Hemiarch (Tissue), Amputate Left 4 UNITS
3A 21/12/1943 75/M Atrial Appendage Perfusionist
SDA ICU
RC-T2S
Weeks on Waitlist: 5 (36 days) 320
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1400 Alicia Cuthbart Coronary Artery Bypass Graft Dr. Denzel Washington GENERAL
1730 RC00968458 906854959 SCREEN
2B 18/06/1958 61/M Perfusionist
INPT ICU
RC-T2S
Weeks on Waitlist: 2 (17 days) 210
Other Comments: DM Type 2
Run on: 10/07/19 - 1444 Hospital PAGE 2
Run by: H Final Slate For: 11/07/19 THU
PIR Patient Name R/L/B Proposed Procedure Surgeon Path Reg'd Dur
POR Time Unit Number PHN Assist Bld Req'd PIR-POR
Pri DOB Age/S Med Imaging
Loc Bed Type Req'd Staff
Ward
OR Room - 2 Room End Time: 1825 Anaesthetist: K,N S
OHS 0900-1930
0745 John van-Damn Aortic Valve Replacement (Mechanical) Dr. Bon Jovi GENERAL
1205 RC00584564 9095681571 4 UNITS
3A 13/04/1955 64/F Perfusionist
SDA ICU
RC-T2S
Weeks on Waitlist: 14 (98 days) 260
Other Comments: DM Type 2
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sample output:
Morgan Freeman | 0745 | 1305 | Replace Root and Ascending | Dr. Henry Cavail
Alicia Cuthbart | 1400 | 1730 | Coronary Artery Bypass Graft | Dr. Denzel Washington
John van-Damn | 0745 | 1205 | Aortic Valve Replacement (Mechanical) | Dr. Bon Jovi
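
To try the approach without creating input.txt, the same logic can be run on an inline string. This is a minimal sketch: the helper name parse_rows and the excerpt are illustrative, and the excerpt assumes that columns are separated by two or more spaces, which is what the pattern relies on.

```python
import re

REGX = r'^(\d{4})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})\s{2,}(\D+?)(?=\s{2,})|(^\d{4})'

# Hypothetical excerpt of the report; the column spacing is an assumption.
SAMPLE = (
    "0745   Morgan Freeman     Replace Root and Ascending     Dr. Henry Cavail     GENERAL\n"
    "1305   RC02654289  96985693   Aorta/Hemiarch (Tissue), Amputate Left   4 UNITS\n"
    "  3A   21/12/1943  75/M   Atrial Appendage   Perfusionist\n"
    "1400   Alicia Cuthbart     Coronary Artery Bypass Graft     Dr. Denzel Washington     GENERAL\n"
    "1730   RC00968458  906854959   SCREEN\n"
)

def parse_rows(text):
    """Return [start, patient, procedure, surgeon, end] lists (illustrative helper)."""
    rows = []
    for match in re.findall(REGX, text, flags=re.M):
        if match[0]:
            # Full row: the end-time slot (last element) is still empty here.
            rows.append(list(match))
        elif rows:
            # End-time line: fill the last slot of the most recent row.
            rows[-1][-1] = match[-1]
    return rows

rows = parse_rows(SAMPLE)
for row in rows:
    print("{} | {} | {} | {} | {}".format(row[1], row[0], row[4], row[2], row[3]))
# Morgan Freeman | 0745 | 1305 | Replace Root and Ascending | Dr. Henry Cavail
# Alicia Cuthbart | 1400 | 1730 | Coronary Artery Bypass Graft | Dr. Denzel Washington
```

Keeping the parsing in a function separates it from file handling, so the pattern can be checked against a few lines before running it on a full report.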