Scraping attrs and anchor text with BeautifulSoup

1 Answer

June 18, 2020

The data is loaded dynamically via JavaScript, so it is not in the HTML that BeautifulSoup receives. You can use the requests module to fetch it directly from the endpoint the page calls.

For example:

import json
import requests


page = 1
search_link = 'https://www.*********/GetDrugs.php?page={page}'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}

data = requests.get(search_link.format(page=page), headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# print some data to screen:
print('Page {}/{}'.format(data['currentPage'], data['pageCount']))
for r in data['results']:
    print('{:<10} {:<10} {:<40} {:<40} {}'.format(r['id'], r['registerNumber'], r['tradeName'], r['scientificName'], r['agent']))

Output:

Page 1/714
6912       3-5286-19  ATOXIA 120 mg Film-coated Tablet         ETORICOXIB                               SAUDI INTERNATIONAL TRADING COMPANY LTD (SITCO)
7162       27-271-17  EPIVAL 200MG\5ML SYRUP                   VALPROATE SODIUM                         Dallah Health Care Company
5688       43-271-19  SENERGY 10 MG/160 MG F.C. TABLET         AMLODIPINE ,   VALSARTAN                 SAJA-SAUDI ARABIAN JAPANESE PHARMACEUTICAL CO
8341       33-271-18  LEROXO 8 MG FILM COATED TABLET           LORNOXICAM                               Alkamal Import Office
8812       1-939-18   FEFOL SPANSULES                          FERROUS SULFATE, FOLIC ACID              TABUK PHARMACEUTICAL MANUFACTURING CO.
2531       4-271-98   CLODEARM 0.05% OINTMENT                  CLOBETASOL PROPIONATE                    ALNAGHI COMPANY
2532       5-271-98   CLODEARM 0.05% CREAM                     CLOBETASOL PROPIONATE                    ALNAGHI COMPANY
4531       1-271-96   DICLOFEN 1% CREMOGEL                     DICLOFENAC SODIUM                        SALEHIYA TRADING EST.
321        18-271-03  PROFILAR 1MG/5ML SYRUP                   KETOTIFEN                                SALEHIYA TRADING EST.
1268       13-271-01  UNIFED SYRUP                             TRIPROLIDINE, PSEUDOEPHEDRINE            SALEHIYA TRADING EST.
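
The column layout above comes from the width specs in the format string. One thing to watch for: if a field arrives as null in the JSON, it becomes None in Python, and None does not accept a spec such as {:<10} (it raises TypeError). Below is a minimal sketch of a None-safe row formatter. The helper name format_row and the '-' placeholder are assumptions for illustration, and the sample record reuses values from the output above with 'agent' set to None on purpose.

```python
def format_row(r, placeholder='-'):
    """Format one result dict as a fixed-width line.

    Any key that is missing or None is shown as the placeholder,
    so the width specs never receive None.
    """
    keys = ('id', 'registerNumber', 'tradeName', 'scientificName', 'agent')
    values = [placeholder if r.get(k) is None else r[k] for k in keys]
    return '{:<10} {:<10} {:<40} {:<40} {}'.format(*values)


# Illustrative record: values taken from the output above, agent deliberately None.
sample = {
    'id': '321',
    'registerNumber': '18-271-03',
    'tradeName': 'PROFILAR 1MG/5ML SYRUP',
    'scientificName': 'KETOTIFEN',
    'agent': None,
}
line = format_row(sample)
print(line)
```

In the loop you could then write print(format_row(r)) instead of the long format call.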

EDIT: To print pages 1 through 99:

# the URL template and headers do not change, so define them once outside the loop
search_link = 'https://**********/GetDrugs.php?page={page}'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}

for page in range(1, 100):
    print('Page', page)

    data = requests.get(search_link.format(page=page), headers=headers).json()

    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))

    # print some data to screen:
    print('Page {}/{}'.format(data['currentPage'], data['pageCount']))
    for r in data['results']:
        print('{:<10} {:<10} {:<40} {:<40} {}'.format(r['id'], r['registerNumber'], r['tradeName'], r['scientificName'], r['agent'] or '-'))
...
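
Instead of hard-coding the range, the loop could read the page count from the first response, since each response carries a pageCount field (714 in the output above). This is only a sketch under that assumption. The names iter_results, fetch and delay are hypothetical, and the fake_fetch function exists only so the paging logic can be run offline with made-up data.

```python
import time


def iter_results(fetch, delay=0.0):
    """Yield every result dict across all pages.

    fetch is any callable that takes a page number and returns the decoded
    JSON for that page: a dict with 'pageCount' and 'results' keys, as in
    the response used in the answer.
    """
    first = fetch(1)
    yield from first['results']
    for page in range(2, int(first['pageCount']) + 1):
        time.sleep(delay)  # optional pause between requests
        yield from fetch(page)['results']


# Offline demonstration with a fake three-page API (made-up data).
def fake_fetch(page):
    return {
        'currentPage': page,
        'pageCount': 3,
        'results': [{'id': page * 10}, {'id': page * 10 + 1}],
    }


ids = [r['id'] for r in iter_results(fake_fetch)]
print(ids)
```

With the real site, fetch would be a small function wrapping the requests.get(search_link.format(page=page), headers=headers).json() call from the answer, and a non-zero delay keeps the load on the server modest.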