Here is one way to do it. As you loop over the pages, build a data frame for each page with two columns: the page URL and the story titles. map_dfr() then row-binds the per-page data frames into a single one.
library(rvest)
library(tidyverse)

# Build one URL per page (note the "=" in "?p="), scrape the titles
# from each page, and row-bind the per-page tibbles.
map_dfr(.x = paste("https://news.ycombinator.com/news?p=", 1:2, sep = ""),
        .f = function(x) {
          tibble(url = x,
                 title = read_html(x) %>%
                   html_nodes("a.storylink") %>%
                   html_text())
        })

   url                                   title
   <chr>                                 <chr>
 1 https://news.ycombinator.com/news?p=1 1k True Fans? Try 100
 2 https://news.ycombinator.com/news?p=1 Critical Bluetooth Vulnerability in Android (CVE-2020-0022)
 3 https://news.ycombinator.com/news?p=1 FLIF – Free Lossless Image Format
 4 https://news.ycombinator.com/news?p=1 The Rapid Growth of Io_uring
 5 https://news.ycombinator.com/news?p=1 Show HN: Building an open-source language-learning platform
 6 https://news.ycombinator.com/news?p=1 Why Google Might Prefer Dropping a $22B Business
 7 https://news.ycombinator.com/news?p=1 TV Backlight Compensation
 8 https://news.ycombinator.com/news?p=1 This person does not exist
 9 https://news.ycombinator.com/news?p=1 Angular 9.0
10 https://news.ycombinator.com/news?p=1 Before the DNS: how yours truly upstaged the NIC's official HOSTS.TXT (2004)
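If the row-binding step is unclear, here is a minimal offline sketch of what map_dfr() does. The fake_scrape() function and its titles are made up for illustration and stand in for the real scraping function, so nothing is downloaded.

```r
library(purrr)
library(tibble)

# Hypothetical stand-in for a scraper: returns three made-up titles per page
fake_scrape <- function(page) {
  tibble(
    page  = page,
    title = paste0("story ", page, "-", 1:3)
  )
}

# map_dfr() calls fake_scrape() once per element of .x
# and stacks the resulting tibbles row-wise
result <- map_dfr(.x = 1:2, .f = fake_scrape)
result
```

Each call returns a 3-row tibble, so the combined result has 6 rows. In purrr 1.0.0 and later, map_dfr() is marked as superseded; map() followed by list_rbind() is the suggested equivalent.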
If you also want hnuser, add one more column. Put simply, you can do the following.
map_dfr(.x = paste("https://news.ycombinator.com/news?p=", 1:2, sep = ""),
        .f = function(x) {
          page <- read_html(x)  # download and parse each page only once
          tibble(url = x,
                 title = page %>%
                   html_nodes("a.storylink") %>%
                   html_text(),
                 hnuser = page %>%
                   html_nodes("a.hnuser") %>%
                   html_text())
        })
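One caveat: tibble() fails when title and hnuser have different lengths, which can happen if some entry on a page has no a.hnuser link (a job posting might not, for example). A possible workaround is to select one container node per story and then use html_node() (singular) inside each container, which yields NA where nothing matches. The sketch below runs on a small inline HTML string; the div.item wrapper and the names are hypothetical and do not reflect the real page markup, so the container selector would need adjusting.

```r
library(rvest)
library(tibble)

# Hypothetical markup: the second item has no a.hnuser link
html <- '<html><body>
  <div class="item"><a class="storylink" href="#">First story</a> <a class="hnuser" href="#">alice</a></div>
  <div class="item"><a class="storylink" href="#">Job posting</a></div>
  <div class="item"><a class="storylink" href="#">Third story</a> <a class="hnuser" href="#">bob</a></div>
</body></html>'

page  <- read_html(html)
items <- html_nodes(page, "div.item")  # one node per story

# html_node() returns exactly one result per item (missing if no match),
# so both columns always have the same length
stories <- tibble(
  title  = items %>% html_node("a.storylink") %>% html_text(),
  hnuser = items %>% html_node("a.hnuser") %>% html_text()
)
stories
```

The row for the item without a user gets NA in hnuser instead of breaking the whole tibble.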