I need to extract the address and telephone number from my HTML page using
XPath. The address is sometimes in a single `<p>` and sometimes split across
two `<p>` elements. I have 11 stores.
This is what the `<p>` tags look like in my HTML (just an example):
<div class="info-block-value">
<p>36 rue de la Verrerie 75004 PARIS</p>
<p>Tél : 0111 222 222</p>
</div>
<div class="info-block-value">
<p>11 rue des archives</p>
<p>75004 PARIS</p>
<p>Tél : 01 11 11 11 11</p>
</div>
Shop 1: P1 = address, P2 = tel
Shop 2: P1 = address, P2 = tel, P3 = fax
Shop 3: P1 = address line 1, P2 = address line 2, P3 = tel
Shop 4: P1 = address, P2 = tel
Shop 5: P1 = address, P2 = tel
Shops 6, 7, 8, 9, 11: P1 = address line 1, P2 = address line 2 (they have no telephone)
Shop 10: P1 = address line 1, P2 = address line 2, P3 = tel, P4 = space, P5 = email
I tried with:
{
  "name": "store Addr",
  "key": "address",
  "xPath": "(//div[@class='info-block-value']/p)[1] | (//div[@class='info-block-value']/p)[2]",
  "level": 0,
  "enabled": true,
  "values": []
},
{
  "name": "Tel No",
  "key": "phone number",
  "xPath": "(//div[@class='info-block-value']/p)[2] | (//div[@class='info-block-value']/p)[3]",
  "regex": "Tél: ((\d+\s*)+)+",
  "level": 0,
  "enabled": true,
  "values": []
}
But I'm not getting the correct results. Can someone help me with this?
Actual results:
id  name  address                              phone
1   a     36 rue de la Verrerie 75004 PARIS    0111 222 222
2   b     11 rue des archives                  01 11 11 11 11
Expected results:
id  name  address                              phone
1   a     36 rue de la Verrerie 75004 PARIS    0111 222 222
2   b     11 rue des archives 75004 PARIS      01 11 11 11 11
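To make the expected output precise, here is a minimal Python sketch of the rule I am after: every `<p>` before the one starting with "Tél" is an address line, and the "Tél" paragraph is the phone. It uses only the standard-library `xml.etree.ElementTree`. The `<root>` wrapper, the function name `extract_store`, and the assumption that the phone paragraph always starts with "Tél" are hypothetical and only for illustration; they are not part of the scraping tool's config.

```python
import re
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root> element
# so that it parses as a single well-formed document.
SAMPLE = """<root>
<div class="info-block-value">
<p>36 rue de la Verrerie 75004 PARIS</p>
<p>Tél : 0111 222 222</p>
</div>
<div class="info-block-value">
<p>11 rue des archives</p>
<p>75004 PARIS</p>
<p>Tél : 01 11 11 11 11</p>
</div>
</root>"""

# Assumption: the phone paragraph starts with "Tél", an optional space, a colon.
PHONE_RE = re.compile(r"^Tél\s*:\s*((?:\d+\s*)+)")


def extract_store(div):
    """Return (address, phone) for one info-block-value <div>."""
    address_lines = []
    phone = None
    for p in div.findall("p"):
        text = "".join(p.itertext()).strip()
        match = PHONE_RE.match(text)
        if match:
            phone = match.group(1).strip()
            break  # paragraphs after the phone (fax, email) are ignored
        if text:
            address_lines.append(text)
    # Shops without a phone paragraph keep phone = None.
    return " ".join(address_lines), phone


root = ET.fromstring(SAMPLE)
stores = [
    extract_store(div)
    for div in root.findall(".//div[@class='info-block-value']")
]
for address, phone in stores:
    print(address, "|", phone)
```

In XPath 1.0 terms the same rule would roughly be "each `p` that does not satisfy `starts-with(normalize-space(.), 'Tél')` and has no preceding sibling `p` that does". Whether a given scraping tool supports `starts-with` and `preceding-sibling` is an assumption that would need checking.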