I have been trying for a long time to log in to this site and download some files. I can't figure out what is wrong with my code below. I am new to Python:
When you land on the login page, https://www.targetsite/members/login.php, it makes the following calls:
Login page:
[General]
Request URL: https://www.targetsite/members/login.php
Request Method: GET
Status Code: 302 Found
Remote Address: 0.0.0.0:443
Referrer Policy: no-referrer-when-downgrade
[Response Headers]
Connection: keep-alive
Content-Length: 425
Content-Type: text/html; charset=iso-8859-1
Date: Thu, 13 Dec 2018 14:37:02 GMT
Keep-Alive: timeout=5
Location: https://www.targetsite/auth.form?bWFmYkpTbjhNN0J4bWM2S2NwaCtlNTkydDJJV0xiMk1aTWRKa0kwVldWZ29hZjdaMEUweFJuRWl3a3NqOVUwTwpSbmtyRnptekp6Wm40VlM2MDF5dWVVSDR2V1FXMG5JU1ZNQVUrZ0lvYTlXTWQ4T2ZEVzJVY1ZSWW4wZk1NVHZhCm9uZWZZeTI2V1JNPQo=
Server: nginx/1.14.2
Set-Cookie: pcar%5fUkVTVFJJQ1RFRA%3d%3d=; path=/; domain=.targetsite; expires=Wed 13-Dec-2017 14:37:02 GMT
X-Vegas-No-Cache: YES
[Request Headers]
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: pcah=SXlOK1VkSytiMTl0VllvbDk4N2tVaXR5bmZFZmNNVUsK
Host: www.targetsite
Referer: http://www.targetsite/tour/index.php
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36
AuthForm:
[General]
Request URL: https://www.targetsite/auth.form?bWFmYkpTbjhNN0J4bWM2S2NwaCtlNTkydDJJV0xiMk1aTWRKa0kwVldWZ29hZjdaMEUweFJuRWl3a3NqOVUwTwpSbmtyRnptekp6Wm40VlM2MDF5dWVVSDR2V1FXMG5JU1ZNQVUrZ0lvYTlXTWQ4T2ZEVzJVY1ZSWW4wZk1NVHZhCm9uZWZZeTI2V1JNPQo=
Request Method: GET
Status Code: 200 OK
Remote Address: 0.0.0.0:443
Referrer Policy: no-referrer-when-downgrade
[Response Headers]
Cache-Control: max-age=1, must-revalidate
Connection: keep-alive
Content-Type: text/html
Date: Thu, 13 Dec 2018 14:37:03 GMT
Expires: Thu, 13 Dec 2018 14:37:03 GMT
Keep-Alive: timeout=5
Server: nginx/1.14.2
Set-Cookie: pcar%5fUkVTVFJJQ1RFRA%3d%3d=; path=/; domain=.targetsite; expires=Wed 13-Dec-2017 14:37:03 GMT
Transfer-Encoding: chunked
[Request Headers]
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: pcah=SXlOK1VkSytiMTl0VllvbDk4N2tVaXR5bmZFZmNNVUsK
Host: www.targetsite
Referer: http://www.targetsite/tour/index.php
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36
bWFmYkpTbjhNN0J4bWM2S2NwaCtlNTkydDJJV0xiMk1aTWRKa0kwVldWZ29hZjdaMEUweFJuRWl3a3NqOVUwTwpSbmtyRnptekp6Wm40VlM2MDF5dWVVSDR2V1FXMG5JU1ZNQVUrZ0lvYTlXTWQ4T2ZEVzJVY1ZSWW4wZk1NVHZhCm9uZWZZeTI2V1JNPQo:
So, based on this behavior observed with Chrome Inspector, I wrote the following code to emulate the requests triggered by accessing https://www.targetsite.com/members/login.php:
# Libraries
import requests
import json
from lxml import html

# URLs
primeiraUrl = 'https://www.targetsite.com/members/login.php'
urlPost = 'https://ams.targetsite.com/auth.form'

# Credentials
userd = 'user'
passwd = 'pass'

session = requests.Session()
#session.verify = False

# Get token
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Referer': 'http://www.targetsite.com/tour/index.php',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36'
}
# requests follows the 302 by default, so .url is the URL after redirects
get1stContact = session.get(primeiraUrl, headers=headers)
segundaUrl = get1stContact.url
get2ndContact = session.get(segundaUrl, headers=headers)
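Since `requests` follows redirects by default, `get1stContact.url` should already be the final `auth.form?...` URL, and the intermediate 302 should be kept in `get1stContact.history`. A minimal sketch for listing that chain so it can be compared with the Chrome capture; `redirect_chain` and `format_chain` are hypothetical helper names, and they rely only on the `history`, `status_code`, and `url` attributes of a response:

```python
def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def format_chain(response):
    """Render the chain one hop per line, for example '302 https://...'."""
    return "\n".join(f"{status} {url}" for status, url in redirect_chain(response))
```

With the session above this could be called as `print(format_chain(get1stContact))`. If the last line shows a 200 for the `auth.form` URL, the second `session.get` is probably redundant.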
And then, when you log in on the website, this is what you get:
[General]
Request URL: https://ams.targetsite.com/auth.form
Request Method: POST
Status Code: 302 Found
Remote Address: 1.1.1.1:443
Referrer Policy: no-referrer-when-downgrade
[Response Headers]
Connection: Keep-Alive
Content-Length: 239
Content-Type: text/html; charset=iso-8859-1
Date: Thu, 13 Dec 2018 14:47:57 GMT
Keep-Alive: timeout=20, max=94
Location: http://www.targetsite.com/members/index.php
Server: Apache/2.2.15 (CentOS)
Set-Cookie: pcar%5fUkVTVFJJQ1RFRA%3d%3d=cS90NW9XLzVVeVNIeElMOUpFaHlCb2hGWkZveVUrdTFiK0dad0FYVDN2UT0K; path=/; domain=.targetsite.com; expires=Thu 13-Dec-2018 20:47:57 GMT
[Request Headers]
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 147
Content-Type: application/x-www-form-urlencoded
Cookie: pcah=Q3BLRnpDUGRhcnJFMmg1OGI0LzBrLzNhYWM5cjBVV2IK
Host: ams.targetsite.com
Origin: https://www.targetsite.com
Referer: https://www.targetsite.com/auth.form?N2dFMUIwaGFIc1BWQ3BoRTd2NVBWayt5ZE91UnZsa2xCcmNUU1VtVG8yNW54WUhjNFBYblE3STJwK2xrRWhNawpNRWtmMjJtUFF2Y0xSL2t1N2xIc2pmSk4wZG5uRVdmbkEyRUpxdnVDODI4UmVhMjlId1h6dVZIeFRtWGZuUGd5CkoySHEwZXg5RnRVPQo=
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36
[Form-Data]
rlm: RESTRICTED
for: http%3a%2f%2fwww%2etargetsite%2ecom%2fmembers%2findex%2ephp
rmb: y
uid: user
pwd: pass
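The `for` value in the capture is shown percent-encoded, as it went over the wire. When a dict is passed to `data=`, `requests` form-encodes the values itself, so the decoded URL is what belongs in the dict. A small sketch using only the standard library to illustrate the difference; the variable names are illustrative, and `urlencode` is used here as a stand-in for the form encoding:

```python
from urllib.parse import unquote, urlencode

# Value exactly as it appears under [Form-Data] in the capture.
captured_for = 'http%3a%2f%2fwww%2etargetsite%2ecom%2fmembers%2findex%2ephp'

# Decoded form: this is what should go into the dict passed to data=.
decoded_for = unquote(captured_for)

# Encoding the decoded value escapes it once.
encoded_once = urlencode({'for': decoded_for})

# Encoding the captured value escapes its '%' signs a second time.
encoded_twice = urlencode({'for': captured_for})

print(decoded_for)
print(encoded_once)
print(encoded_twice)
```

The second and third lines of output make a double-encoded body easy to spot: every `%` from the capture turns into `%25`.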
And here is the code I wrote to make this POST request:
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Origin': 'https://www.targetsite.com',
    'Referer': segundaUrl,
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36'
}
# Form fields as seen under [Form-Data] in the capture
body = {
    'rlm': 'RESTRICTED',
    'for': 'http://www.targetsite.com/members/index.php',
    'rmb': 'y',
    'uid': userd,
    'pwd': passwd
}
#session.post(url)
r = session.post(urlPost, headers=headers, data=body)
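One way to tell whether the POST actually logged in is to look at the cookie the server sets. In the capture, the successful login sets `pcar%5fUkVTVFJJQ1RFRA%3d%3d` to a non-empty value, while the earlier responses set it to an empty value with an expiry date in the past. The sketch below decodes that cookie name and checks for a non-empty value; the helper names are hypothetical, and reading the suffix as a Base64-encoded realm is an inference from the capture, not documented behavior of the site:

```python
import base64
from urllib.parse import unquote


def decode_cookie_name(raw_name):
    """Split a name like 'pcar%5f<base64>' into (prefix, decoded suffix)."""
    prefix, _, encoded = unquote(raw_name).partition('_')
    return prefix, base64.b64decode(encoded).decode('ascii')


def looks_logged_in(cookies, raw_name='pcar%5fUkVTVFJJQ1RFRA%3d%3d'):
    """True if the mapping of cookie names to values has a non-empty realm cookie."""
    return bool(cookies.get(raw_name))
```

After the POST, `looks_logged_in(session.cookies.get_dict())` could be printed alongside `r.status_code` and `r.url`, assuming `requests` keeps the cookie name exactly as the server sent it.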
With all that said, is there anyone who can help me figure this out? Thanks in advance!
Edit: full code, as requested:
# Help
# http://kazuar.github.io/scraping-tutorial/

# Libraries
import requests
import json
from lxml import html

# URLs
primeiraUrl = 'https://www.targetsite.com/members/login.php'
urlPost = 'https://ams.targetsite.com/auth.form'

# Credentials
userd = 'user'
passwd = 'pass'

session = requests.Session()
#session.verify = False

# Get token
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Referer': 'http://www.targetsite.com/tour/index.php',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36'
}
# requests follows the 302 by default, so .url is the URL after redirects
get1stContact = session.get(primeiraUrl, headers=headers)
segundaUrl = get1stContact.url
get2ndContact = session.get(segundaUrl, headers=headers)

# Log in
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Origin': 'https://www.targetsite.com',
    'Referer': segundaUrl,
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36 Vivaldi/2.1.1337.36'
}
# Form fields as seen under [Form-Data] in the capture
body = {
    'rlm': 'RESTRICTED',
    'for': 'http://www.targetsite.com/members/index.php',
    'rmb': 'y',
    'uid': userd,
    'pwd': passwd
}
#session.post(url)
r = session.post(urlPost, headers=headers, data=body)
print(r.text)
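The `auth.form` page fetched by the second GET presumably contains the login `<form>` itself, so the POST target and any hidden fields could be read from it instead of being hard-coded. A minimal sketch using the standard library's `html.parser` (the script imports `lxml`, which would work as well); the `FormFieldParser` class and the sample HTML are hypothetical and only mirror the field names seen in the capture:

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the first form's action URL and every named <input> on the page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form' and self.action is None:
            self.action = attrs.get('action')
        elif tag == 'input' and attrs.get('name'):
            # Inputs without a value attribute are recorded as empty strings.
            self.fields[attrs['name']] = attrs.get('value') or ''


# Hypothetical markup; the real page may differ.
sample_html = """
<form action="https://ams.targetsite.com/auth.form" method="post">
  <input type="hidden" name="rlm" value="RESTRICTED">
  <input type="hidden" name="for" value="http://www.targetsite.com/members/index.php">
  <input type="text" name="uid">
  <input type="password" name="pwd">
</form>
"""

parser = FormFieldParser()
parser.feed(sample_html)
print(parser.action)
print(parser.fields)
```

In the script, `parser.feed(get2ndContact.text)` would take the place of the sample, and `parser.fields` could then be updated with `uid` and `pwd` before posting to `parser.action`.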