Я использовал 'html.parser'
вместо 'lxml'
и смог отобразить весь скрипт с правильным форматированием:
import requests
from bs4 import BeautifulSoup
website_url = requests.get("https://www.imsdb.com/scripts/Thor-Ragnarok.html").text
soup = BeautifulSoup(website_url, 'html.parser')
text = soup.pre
, то есть начало раздела 5 отображалось как:
<b> BLUE DRAFT 05/20/16 5.
</b>
THE WALLS COME ALIVE! A seemingly infinite swarm of FIRE
DEMONS rally to Surtur's aid.
<b> THOR
</b> I make grave mistakes all the time.
Everything seems to work out.
In the shadows, a massive FIRE DRAGON ROARS.
The fire demons SURGE FORWARD. Thor backs up, HAMMERING
AWAY. He then leaps back, SPRINGBOARDS off the wall, and-