You should pass a string to BeautifulSoup:
import re
import sys
from BeautifulSoup import BeautifulSoup  # assumes BeautifulSoup 3 on Python 2

# parse bookmarks.html
with open(sys.argv[1]) as bookmark_file:
    soup = BeautifulSoup(bookmark_file.read())

# extract youtube video urls
video_url_regex = re.compile('http://www.youtube.com/watch')
urls = [link['href'] for link in soup('a', href=video_url_regex)]
Separate the very fast URL parsing from the much slower download of the statistics:
import urlparse

# extract video ids from the urls
ids = []  # you could use `set()` and `ids.add()` to avoid duplicates
for video_url in urls:
    url = urlparse.urlparse(video_url)
    video_id = urlparse.parse_qs(url.query).get('v')
    if not video_id:
        continue  # no video_id in the url
    ids.append(video_id[0])
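For readers on Python 3, where `urlparse` lives in `urllib.parse`, the same step might look roughly like the sketch below. The function name `extract_video_ids` and the sample URLs are made up for illustration; the duplicate removal follows the `set()` hint in the comment above while keeping first-seen order.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_ids(urls):
    """Return the `v` query parameter of each URL, without duplicates,
    in the order the ids were first seen."""
    seen = set()
    ids = []
    for video_url in urls:
        values = parse_qs(urlparse(video_url).query).get("v")
        if not values:
            continue  # no video id in this URL
        video_id = values[0]
        if video_id not in seen:
            seen.add(video_id)
            ids.append(video_id)
    return ids


if __name__ == "__main__":
    # Hypothetical sample input: one duplicate and one URL without an id.
    sample = [
        "http://www.youtube.com/watch?v=Gg81zi0pheg",
        "http://www.youtube.com/watch?v=Gg81zi0pheg&feature=related",
        "http://www.youtube.com/watch",
    ]
    print(extract_video_ids(sample))
```

Because this step does no network I/O, it can be run and checked on the whole bookmark file before any statistics are requested.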
You don't need to authenticate for read-only requests:
from gdata.youtube.service import YouTubeService

# get some statistics for the videos
yt_service = YouTubeService()
yt_service.ssl = True  # NOTE: it works for read-only requests
yt_service.debug = True  # show requests
Save some statistics to a CSV file given on the command line. Don't stop if a video causes an error:
import csv

writer = csv.writer(open(sys.argv[2], 'wb'))  # save to csv file
for video_id in ids:
    try:
        entry = yt_service.GetYouTubeVideoEntry(video_id=video_id)
    except Exception as e:
        print >>sys.stderr, "Failed to retrieve entry video_id=%s: %s" % (
            video_id, e)
    else:
        title = entry.media.title.text
        print "Title:", title
        view_count = entry.statistics.view_count
        print "View count:", view_count
        writer.writerow((video_id, title, view_count))  # write it
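The "keep going on errors" pattern can be tried without any network access. The following is only a Python 3 sketch: `write_stats`, `fetch_stats`, and `fake_fetch` are hypothetical names, and `fetch_stats` stands in for whatever client call actually retrieves the title and view count.

```python
import csv
import io
import sys


def write_stats(video_ids, fetch_stats, out_file):
    """Write one (video_id, title, view_count) row per video.

    `fetch_stats` is a hypothetical callable that returns
    (title, view_count) for a video id or raises on failure.
    Returns the list of ids that could not be retrieved.
    """
    writer = csv.writer(out_file)
    failed = []
    for video_id in video_ids:
        try:
            title, view_count = fetch_stats(video_id)
        except Exception as e:  # one bad video must not abort the run
            print("Failed to retrieve entry video_id=%s: %s" % (video_id, e),
                  file=sys.stderr)
            failed.append(video_id)
        else:
            writer.writerow((video_id, title, view_count))
    return failed


def fake_fetch(video_id):
    # Stand-in for a real lookup; fails for one id to exercise the error path.
    if video_id == "bad":
        raise ValueError("video not found")
    return ("Title of " + video_id, 42)


if __name__ == "__main__":
    buf = io.StringIO()
    failed = write_stats(["a1", "bad", "b2"], fake_fetch, buf)
    print(buf.getvalue(), end="")
    print("failed:", failed)
```

When writing to a real file on Python 3, open it with `open(path, "w", newline="")` so the `csv` module controls the line endings. Returning the failed ids makes it easy to retry just those videos later.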
Here's the full script; press the play button to watch how it was written.
Output
$ python download-video-stats.py neudorfer.html out.csv
send: u'GET https://gdata.youtube.com/feeds/api/videos/Gg81zi0pheg HTTP/1.1\r\nAcc
ept-Encoding: identity\r\nHost: gdata.youtube.com\r\nContent-Type: application/ato
m+xml\r\nUser-Agent: None GData-Python/2.0.15\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: X-GData-User-Country: RU
header: Content-Type: application/atom+xml; charset=UTF-8
header: Expires: Thu, 10 Nov 2011 19:31:23 GMT
header: Date: Thu, 10 Nov 2011 19:31:23 GMT
header: Cache-Control: private, max-age=300, no-transform
header: Vary: *
header: GData-Version: 1.0
header: Last-Modified: Wed, 02 Nov 2011 08:58:11 GMT
header: Transfer-Encoding: chunked
header: X-Content-Type-Options: nosniff
header: X-Frame-Options: SAMEORIGIN
header: X-XSS-Protection: 1; mode=block
header: Server: GSE
Title: Paramore - Let The Flames Begin [Wal-Mart Soundcheck]
View count: 27807
out.csv
Gg81zi0pheg,Paramore - Let The Flames Begin [Wal-Mart Soundcheck],27807
pP9VjGmmhfo,Paramore: Wal-Mart Soundcheck,1363078
yTA1u6D1fyE,Paramore-Walmart Soundcheck 7-CrushCrushCrush(HQ),843
4v8HvQf4fgE,Paramore-Walmart Soundcheck 4-That's What You Get(HQ),1429
e9zG20wQQ1U,Paramore-Walmart Soundcheck 8-Interview(HQ),1306
khL4s2bvn-8,Paramore-Walmart Soundcheck 3-Emergency(HQ),796
XTndQ7bYV0A,Paramore-Walmart Soundcheck 6-For a pessimist(HQ),599
xTT2MqgWRRc,Paramore-Walmart Soundcheck 5-Pressure(HQ),963
J2ZYQngwSUw,Paramore - Wal-Mart Soundcheck Interview,10261
9RZwvg7unrU,Paramore - 08 - Interview [Wal-Mart Soundcheck],1674
vz3qOYWwm10,Paramore - 04 - That's What You Get [Wal-Mart Soundcheck],1268
yarv52QX_Yw,Paramore - 05 - Pressure [Wal-Mart Soundcheck],1296
LRREY1H3GCI,Paramore - Walmart Promo,523