How to read the text content of a string after converting HTML to text in Java
0 votes
/ April 20, 2020

I have an HTML form in the body of an email. How can I read the text content of the string after converting the HTML form to text? Can anyone help me?

Email body, HTML form: OldImage NewImage

Email body, HTML form content:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:Helvetica;
        panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Century Gothic";
        panose-1:2 11 5 2 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle17
        {mso-style-type:personal-compose;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri",sans-serif;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><img width="809" height="364" style="width:8.427in;height:3.7916in" id="Picture_x0020_4" src="cid:image001.jpg@01D609B1.5BB77760"><o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">Non-Contacted/Non-Qualified Leads from Ateco:&nbsp; <o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" width="1553" style="width:1165.0pt;margin-left:-.15pt;border-collapse:collapse">
<tbody>
<tr style="height:15.0pt">
<td width="167" nowrap="" valign="bottom" style="width:125.0pt;border:solid #8EA9DB 1.0pt;border-right:none;background:#4472C4;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Name<o:p></o:p></span></b></p>
</td>
<td width="111" nowrap="" valign="bottom" style="width:83.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;background
;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Mobile<o:p></o:p></span></b></p>
</td>
<td width="259" nowrap="" valign="bottom" style="width:194.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;backgroun
4;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Email<o:p></o:p></span></b></p>
</td>
<td width="103" nowrap="" valign="bottom" style="width:77.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;background
;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Postal Code<o:p></o:p></span></b></p>
</td>
<td width="109" nowrap="" valign="bottom" style="width:82.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;background
;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Enquiry Date<o:p></o:p></span></b></p>
</td>
<td width="239" nowrap="" valign="bottom" style="width:179.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;backgroun
4;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Lead Source<o:p></o:p></span></b></p>
</td>
<td width="261" nowrap="" valign="bottom" style="width:196.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;backgroun
4;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Dealer<o:p></o:p></span></b></p>
</td>
<td width="93" nowrap="" valign="bottom" style="width:70.0pt;border-top:solid #8EA9DB 1.0pt;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:none;background:
padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Date Sent
<o:p></o:p></span></b></p>
</td>
<td width="212" nowrap="" valign="bottom" style="width:159.0pt;border:solid #8EA9DB 1.0pt;border-left:none;background:#4472C4;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><b><span style="color:white">Preferred Model<o:p></o:p></span></b></p>
</td>
</tr>
<tr style="height:15.0pt">
<td width="167" nowrap="" valign="bottom" style="width:125.0pt;border-top:none;border-left:solid #8EA9DB 1.0pt;border-bottom:solid #8EA9DB 1.0pt;border-right:none;backgroun
2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">Test Justin<o:p></o:p></span></p>
</td>
<td width="111" nowrap="" valign="bottom" style="width:83.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">&#43;61 420 888 999<o:p></o:p></span></p>
</td>
<td width="259" nowrap="" valign="bottom" style="width:194.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">testmail@hotmail.com<o:p></o:p></span></p>
</td>
<td width="103" nowrap="" valign="bottom" style="width:77.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">4218<o:p></o:p></span></p>
</td>
<td width="109" nowrap="" valign="bottom" style="width:82.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">31-03-20<o:p></o:p></span></p>
</td>
<td width="239" nowrap="" valign="bottom" style="width:179.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">LDV Facebook - Book a Test Drive<o:p></o:p></span></p>
</td>
<td width="261" nowrap="" valign="bottom" style="width:196.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">QLD - Von Bibra Gold Coast - 554216<o:p></o:p></span></p>
</td>
<td width="93" nowrap="" valign="bottom" style="width:70.0pt;border:none;border-bottom:solid #8EA9DB 1.0pt;background:#D9E1F2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">03-04-20<o:p></o:p></span></p>
</td>
<td width="212" nowrap="" valign="bottom" style="width:159.0pt;border-top:none;border-left:none;border-bottom:solid #8EA9DB 1.0pt;border-right:solid #8EA9DB 1.0pt;backgroun
2;padding:0in 5.4pt 0in 5.4pt;height:15.0pt">
<p class="MsoNormal" align="center" style="text-align:center"><span style="color:black">T60 4WD Diesel Dual Cab Ute<o:p></o:p></span></p>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">Thank you,<o:p></o:p></p>
<p class="MsoNormal">Anna<o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><b><span lang="EN-AU" style="color:black;mso-fareast-language:EN-AU">Anna Tupou</span></b><span lang="EN-AU" style="font-family:&quot;Helvetica&quot;,s
f;color:black;mso-fareast-language:EN-AU"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="color:black;mso-fareast-language:EN-AU">Call Centre Supervisor – Lead Management</span><span lang="EN-AU" style="font-famil
Helvetica&quot;,sans-serif;color:black;mso-fareast-language:EN-AU"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.5pt;color:black;mso-fareast-language:EN-AU"><br>
</span><b><span lang="EN-AU" style="font-family:&quot;Century Gothic&quot;,sans-serif;color:black;mso-fareast-language:EN-AU"><img width="294" height="34" style="width:3.06
ght:.3541in" id="_x0038_11B48E0-2644-4F0E-A8FF-F2DD7ECD462F" src="cid:image002.jpg@01D609B1.5BB77760" alt="cid:BD091752-D740-4B3A-B050-FF52A328E5C8"></span></b><b><span lan
" style="font-family:&quot;Century Gothic&quot;,sans-serif;color:black;mso-fareast-language:EN-AU"><o:p></o:p></span></b></p>
<p class="MsoNormal"><span lang="EN-AU" style="mso-fareast-language:EN-AU"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.5pt;color:black;mso-fareast-language:EN-AU">2A Hill Rd Lidcombe NSW 2141 Australia</span><span lang="EN-AU" styl
size:10.5pt;color:#005CFB;mso-fareast-language:EN-AU">
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.5pt;color:#005CFB;mso-fareast-language:EN-AU">P</span><span lang="EN-AU" style="font-size:10.5pt;color:black;mso
-language:EN-AU"> &ensp;&#43;61 2 8577 8097&ensp;</span><span lang="EN-AU" style="font-size:10.5pt;color:#005CFB;mso-fareast-language:EN-AU">|</span><span lang="EN-AU" style="fon
0.5pt;color:black;mso-fareast-language:EN-AU">&ensp;</span><span lang="EN-AU" style="font-size:10.5pt;color:#005CFB;mso-fareast-language:EN-AU">
 E</span><span lang="EN-AU" style="font-size:10.5pt;color:black;mso-fareast-language:EN-AU">&ensp;</span><u><span lang="EN-AU" style="font-size:10.5pt;color:blue;mso-fareast-l
EN-AU">atupou@ateco.com.au<o:p></o:p></span></u></p>
<p class="MsoNormal"><span lang="EN-AU" style="font-size:10.5pt;color:#005CFB;mso-fareast-language:EN-AU">M&nbsp;
</span><span lang="EN-AU" style="font-size:10.5pt;mso-fareast-language:EN-AU">0407 588 506<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p><b><span style="font-size:13.5pt;font-family:webdings;color:green">P</span> <span style="font-size: 7.5pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:gree
<i>: Please consider the environment before printing this e-mail. </span></i></b></p>
<p id="disclaimer-input" style="font-family: Helvetica,Arial,sans-serif; color: gray; font-size: 7.5pt;" class="txt">
IMPORTANT NOTICE: If this e-mail is received by other than the named addressee, please notify us immediately by telephone or return e-mail and delete all copies from your c
system. This document contains information proprietary to Ateco Group and its
 affiliates or third parties to which Ateco may have a legal obligation to protect such information from unauthorised disclosure, use or duplication. Any disclosure, use or
tion of this document or the information contained herein for other than the
 specific purpose for which it was disclosed by Ateco is expressly prohibited. It is the recipient's responsibility to check this message and attachments for viruses.
</p>
<div></div>
</div>
</body>
</html>

<br>
<p> </p>

<p align="center" style="text-align:center">**********Disclaimer**********</p>

<p style="text-align:justify">&quot;This email and any
attachments are confidential and are for the intended addressee[s] only.
Unauthorised use of this communication is prohibited. If you have received this
communication in error, please notify the sender and remove them from your
system. Confidentiality is not waived or lost by reason of the mistaken
delivery to you. Please scan this email and any attachment(s) for viruses. It
is your responsibility to check them before opening&quot; </p>

<p align="center" style="text-align:center">********End of
Disclaimer*********</p>

String content after conversion (email body):

Non-Contacted/Non-Qualified Leads from Ateco:

Name
Mobile
Email
Postal Code
Enquiry Date
Lead Source
Dealer
Date Sent
Preferred Model
Test Justin
+61 420 888 999
testmail@hotmail.com
4218
31-03-20
LDV Facebook - Book a Test Drive
QLD - Von Bibra Gold Coast - 554216
03-04-20
T60 4WD Diesel Dual Cab Ute

Thank you,
Anna


Anna Tupou
Call Centre Supervisor – Lead Management


2A Hill Rd Lidcombe NSW 2141 Australia
P +61 2 8577 8097 | E atupou@ateco.com.au
M 0407 588 506


P : Please consider the environment before printing this e-mail.
IMPORTANT NOTICE: If this e-mail is received by other than the named addressee, please notify us immediately by telephone or return e-mail and delete all copies from your computer system. This document contains information proprietary to Ateco Group and its affiliates or third parties to which Ateco may have a legal obligation to protect such information from unauthorised disclosure, use or duplication. Any disclosure, use or duplication of this document or the information contained herein for other than the specific purpose for which it was disclosed by Ateco is expressly prohibited. It is the recipient's responsibility to check this message and attachments for viruses.


**********Disclaimer**********
"This email and any attachments are confidential and are for the intended addressee[s] only. Unauthorised use of this communication is prohibited. If you have received this communication in error, please notify the sender and remove them from your system. Confidentiality is not waived or lost by reason of the mistaken delivery to you. Please scan this email and any attachment(s) for viruses. It is your responsibility to check them before opening"
********End of Disclaimer*********

Note: I need to build exact key-value pairs, e.g. Postal Code: 4218.

Answers [ 2 ]

0 votes
/ April 20, 2020

You can use any DOM parser library to parse your HTML. You can simply assign an ID to any HTML tag and then retrieve that element by ID. I suggest the Jsoup library.

Use the code below, which relies on the Jsoup library.

Add an ID to the HTML

<p id="POSTAL_CODE">4218</p>

Java code

Document doc = Jsoup.parse(htmlString);

Element elPostalCode = doc.getElementById("POSTAL_CODE");
String postalCode = elPostalCode.text();

You can also use attribute extraction on your HTML; for more information about extracting attributes with Jsoup, see this page.

For more information, you can refer to this article, which lists several HTML parsing libraries for several programming languages.

Code for your exact task

NOTE: This code works only when every row, including the header, has the same number of p tags.

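import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.parse(htmlString);

List<String> keys = new ArrayList<>();                    // header texts from the first row
List<Map<String, String>> dataPairs = new ArrayList<>();  // one map per data row

Elements trElements = doc.getElementsByTag("tr");

for (int i = 0; i < trElements.size(); i++) {
    Elements pElements = trElements.get(i).getElementsByTag("p");

    Map<String, String> map = new HashMap<>();
    for (int j = 0; j < pElements.size(); j++) {
        Element p = pElements.get(j);
        if (i == 0) {
            keys.add(p.text());              // first row: collect the column names
        } else {
            map.put(keys.get(j), p.text());  // data row: pair each cell with its header
        }
    }
    // Skip the header row so dataPairs holds only data rows, not an empty map.
    if (i > 0) {
        dataPairs.add(map);
    }
}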
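
The note about matching cell counts can also be enforced in code. Below is a minimal sketch, assuming the header texts and one row's cell texts have already been extracted into lists (for example by the Jsoup loop above). The class name `TableRows` and the method `toRecord` are hypothetical and not part of Jsoup; a `LinkedHashMap` is used only so that the pairs keep the column order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TableRows {

    // Hypothetical helper: pairs each header with the cell at the same index.
    // A row whose cell count differs from the header is rejected instead of being mis-paired.
    public static Map<String, String> toRecord(List<String> headers, List<String> cells) {
        if (headers.size() != cells.size()) {
            throw new IllegalArgumentException(
                "Expected " + headers.size() + " cells but got " + cells.size());
        }
        Map<String, String> record = new LinkedHashMap<>(); // preserves column order
        for (int i = 0; i < headers.size(); i++) {
            record.put(headers.get(i).trim(), cells.get(i).trim());
        }
        return record;
    }

    public static void main(String[] args) {
        // Sample values taken from the email in the question.
        List<String> headers = Arrays.asList("Name", "Mobile", "Postal Code");
        List<String> cells = Arrays.asList("Test Justin", "+61 420 888 999", "4218");
        // Prints one "key: value" line per column, e.g. "Postal Code: 4218".
        toRecord(headers, cells).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Failing fast on a mismatched row is one possible design choice; skipping such rows or logging them would work as well, depending on how the emails are processed.
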
0 votes
/ April 20, 2020

I came up with this idea.

Map<String, String> map = new HashMap<>();
String[] lines = text.split("\n");
int numberOfElements = lines.length / 2;

for (int i = 0; i < numberOfElements; i++) {
    map.put(lines[i], lines[i + numberOfElements]);
}

Try it and let me know if it works for you.
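
The snippet above pairs the first half of the lines with the second half, so it only works when `text` contains nothing but the header and value lines. The converted email in the question also has an intro line, a signature, and disclaimers. A possible variation, sketched under the assumptions that the first and last header names are known in advance, that the table has a single data row, and that there are no blank lines inside the header or value block (the class `LeadTextParser` and its parameters are hypothetical):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LeadTextParser {

    // Hypothetical helper: firstHeader and lastHeader mark the header block in the converted text.
    // The values are assumed to follow immediately, one per line, in the same order.
    public static Map<String, String> parse(String text, String firstHeader, String lastHeader) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\\R")) {   // \R matches any line break
            lines.add(line.trim());
        }

        int start = lines.indexOf(firstHeader);
        int end = lines.indexOf(lastHeader);
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("Header block not found");
        }

        int count = end - start + 1;              // number of columns
        if (end + count >= lines.size()) {
            throw new IllegalArgumentException("Not enough value lines after the headers");
        }

        Map<String, String> map = new LinkedHashMap<>(); // keeps column order
        for (int i = 0; i < count; i++) {
            map.put(lines.get(start + i), lines.get(end + 1 + i));
        }
        return map;
    }

    public static void main(String[] args) {
        String text = "Non-Contacted/Non-Qualified Leads from Ateco:\n\n"
                + "Name\nMobile\nPostal Code\n"
                + "Test Justin\n+61 420 888 999\n4218\n\n"
                + "Thank you,\nAnna";
        Map<String, String> map = parse(text, "Name", "Postal Code");
        System.out.println("Postal Code: " + map.get("Postal Code")); // prints "Postal Code: 4218"
    }
}
```

For the full email in the question, the call would be `parse(text, "Name", "Preferred Model")`. With several data rows, parsing the HTML table directly is likely the more reliable route.
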
