Wrangling data in R

1 vote
/ April 10, 2020

I'm trying to break up this data (specifically, to extract the graduation rate) so that I can analyze it in a useful way. I think I need str_split (in R), but I don't understand what type of data this is or what it all means. I scraped it from the website using the rvest package and the code below:

library(rvest)  # supplies read_html(), html_nodes(), html_text() and the %>% pipe

url <- "https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/"

grad_rate <- read_html(url) %>%
html_nodes("script") %>%
html_text() %>%
purrr::pluck(9)

grad_rate

    "{\"title\":\"College readiness\",\"anchor\":\"College_readiness\",\"analytics_id\":\"CollegeReadiness\",\"subtitle\":\"Learn more about how to help your child graduate ready for college. \\u003ca href=\\\"/gk/articles/jump-start-college-planning/\\\" target=\\\"_blank\\\"\\u003eSee how.\\u003c/a\\u003e\",\"icon_classes\":\"icon-graduation\",\"info_text\":\"\\u003cp\\u003eThis rating shows how well students at this school are prepared for college compared to students at other schools in this state, based on key measures, like graduation rates, college entrance tests and advanced coursework when available.\\u003c/p\\u003e\\u003cp\\u003e\\u003ca href=\\\"/gk/ratings/#collegereadinessrating\\\" target=\\\"_blank\\\"\\u003eLearn more about this rating.\\u003c/a\\u003e\\u003c/p\\u003e\\n\",\"rating\":9,\"sources\":\"\\u003cdiv class=\\\"sourcing\\\"\\u003e\\u003ch1\\u003eGreatSchools profile data sources \\u0026amp; information\\u003c/h1\\u003e\\u003cdiv\\u003e\\u003ch4 \\u003eGreatSchools College Readiness Rating\\u003c/h4\\u003e\\u003cp\\u003eThe College Readiness Rating uses this high school's graduation rates, college entrance exam participation and performance, or AP, IB, or Dual Enrollment participation and AP performance to determine how well schools are preparing students for success in college and beyond. 
The College Readiness Rating was created using 2015 4-year high school graduation rate data from MSDE, using 2016 demographic data from NCES, and the following data from the 2016 Civil Rights Data Collection: percentage of students enrolled in IB, AP or Dual Enrollment classes in grades 9-12, and percentage of students passing 1 or more AP exams grades 9-12.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: GreatSchools; this rating was calculated in 2019 | \\u003cspan class=\\\"emphasis\\\"\\u003eSee more\\u003c/span\\u003e: \\u003ca href=\\\"/gk/ratings/#collegereadinessrating\\\"; target=\\\"_blank\\\"\\u003eAbout this rating\\u003c/a\\u003e\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003e4-year high school graduation rate\\u003c/h4\\u003e\\u003cp\\u003eGraduation rates reflect how many students graduate from this school on time.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: MSDE, 2015\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003eAP course participation\\u003c/h4\\u003e\\u003cp\\u003eAdvanced Placement classes are college-level courses students can take in high school. The percentage of students taking AP classes may reflect whether the school culture is focused on college.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: Civil Rights Data Collection, 2016\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003ePercentage of students passing 1 or more AP exams grades 9-12\\u003c/h4\\u003e\\u003cp\\u003eThe AP exam pass rate reflects how many students at this school earned a passing score on at least one AP exam. 
Students who do well on AP exams (passing with a score of 3, 4, or 5) may receive college credit.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: Civil Rights Data Collection, 2016\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003ePercentage of students enrolled in Dual Enrollment classes grades 9-12\\u003c/h4\\u003e\\u003cp\\u003eThe Dual Enrollment participation rate reflects the percentage of students at this school who are taking college courses while in high school. Credits for these courses apply both to high school diploma requirements and college graduation requisites.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: Civil Rights Data Collection, 2016\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003ePercentage of students enrolled in IB grades 9-12\\u003c/h4\\u003e\\u003cp\\u003eInternational Baccalaureate (IB) is an internationally recognized, high-standards program that emphasizes creative and critical thinking. A high school may have specific IB classes students can take, or a school-wide IB program that affects all classes. Some colleges give college credit for IB courses. 
\\u003ca href='/gk/articles/what-is-ib-international-baccalaureate/' target='_blank'\\u003eMore about IB\\u003c/a\\u003e\\n\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: Civil Rights Data Collection, 2016\\u003c/p\\u003e\\u003c/div\\u003e\\u003cdiv\\u003e\\u003ch4\\u003eSAT/ACT participation rate\\u003c/h4\\u003e\\u003cp\\u003eThe SAT/ACT participation rate shows the percentage of eligible students in grades 11 or 12 at this school who took the SAT or ACT.\\u003c/p\\u003e\\u003cp\\u003e\\u003cspan class=\\\"emphasis\\\"\\u003eSource\\u003c/span\\u003e: Civil Rights Data Collection, 2014\\u003c/p\\u003e\\u003c/div\\u003e\\u003c/div\\u003e\",\"feedback\":{\"feedback_cta\":\"Did you find the information about college success useful? What can we do better?\",\"feedback_link\":\"https://s.qualaroo.com/45194/cb0e676f-324a-4a74-bc02-72ddf1a2ddd6?school=115\\u0026state=MD\",\"button_text\":\"Answer\"},\"share_content\":\"\\u003cdiv class=\\\"sharing-modal\\\"\\u003e\\u003cdiv class=\\\"sharing-row js-emailSharingLinks js-slTracking\\\" data-url=\\\"https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/?utm_source=profile\\u0026utm_medium=Email\\u0026subject=Severna+Park+High+School+-+College+readiness\\u0026body=Check+out+the+Severna+Park+High+School+-+College+readiness%250D%250A#College_readiness\\\" data-type=\\\"Email\\\" data-module=\\\"College_readiness\\\" data-link=\\\"mailto:?subject=Severna Park High School - College readiness\\u0026body=Check out the Severna Park High School - College readiness%0D%0Ahttps://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School//?utm_source=profile%26utm_medium=email#College_readiness\\\"\\u003e\\u003cdiv class=\\\"sharing-icon-box\\\"\\u003e\\u003cspan class=\\\"icon-mail\\\"\\u003e\\u003c/span\\u003e\\u003c/div\\u003e\\u003cspan class=\\\"sharing-row-text\\\"\\u003eEmail\\u003c/span\\u003e\\u003c/div\\u003e\\u003cdiv 
class=\\\"sharing-row js-sharingLinks js-slTracking\\\" data-url=\\\"https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/?utm_source=profile\\u0026utm_medium=Facebook#College_readiness\\\" data-siteparams=\\\"\\u0026t=Severna Park High School - College readiness\\\" data-type=\\\"Facebook\\\" data-module=\\\"College_readiness\\\" data-link=\\\"https://www.facebook.com/sharer/sharer.php?u=\\\"\\u003e\\u003cdiv class=\\\"sharing-icon-box\\\"\\u003e\\u003cspan class=\\\"icon-facebook\\\"\\u003e\\u003c/span\\u003e\\u003c/div\\u003e\\u003cspan class=\\\"sharing-row-text\\\"\\u003eFacebook\\u003c/span\\u003e\\u003c/div\\u003e\\u003cdiv class=\\\"sharing-row js-sharingLinks js-slTracking\\\" data-url=\\\"https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/?utm_source=profile\\u0026utm_medium=Twitter#College_readiness\\\" data-siteparams=\\\"\\u0026via=GreatSchools\\u0026text=Severna Park High School - College readiness\\\" data-type=\\\"Twitter\\\" data-module=\\\"College_readiness\\\" data-link=\\\"https://twitter.com/share?url=\\\"\\u003e\\u003cdiv class=\\\"sharing-icon-box\\\"\\u003e\\u003cspan class=\\\"icon-twitter\\\"\\u003e\\u003c/span\\u003e\\u003c/div\\u003e\\u003cspan class=\\\"sharing-row-text\\\"\\u003eTwitter\\u003c/span\\u003e\\u003c/div\\u003e\\u003cdiv class=\\\"sharing-row\\\"\\u003e\\u003cdiv class=\\\"sharing-icon-box\\\"\\u003e\\u003cspan class=\\\"icon-link\\\"\\u003e\\u003c/span\\u003e\\u003c/div\\u003e\\u003cspan class=\\\"sharing-row-text\\\"\\u003ePermalink\\u003c/span\\u003e\\u003cdiv\\u003e\\u003cinput class=\\\"permalink js-permaLink js-slTracking\\\" type=\\\"text\\\" value=\\\"https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/?utm_source=profile\\u0026utm_medium=Permalink#College_readiness\\\" /\\u003e\\u003cspan class=\\\"acknowledgement\\\"\\u003eCopied to 
clipboard\\u003c/span\\u003e\\u003c/div\\u003e\\u003c/div\\u003e\\u003c/div\\u003e\",\"data\":[{\"title\":\"College readiness\",\"anchor\":\"College_readiness\",\"data\":[{\"narration\":\"\\u003cdiv class=\\\"auto-narration\\\"\\u003e \\u003ch3 class=\\\"positive\\\"\\u003eGood news!\\u003c/h3\\u003e \\u003cp\\u003eThis school is \\u003cspan class=\\\"emphasis\\\"\\u003efar above\\u003c/span\\u003e the state average in key measures of college and career readiness.\\u003c/p\\u003e \\u003cp\\u003eEven at schools with strong college and career readiness, there may be students who are not getting the opportunities they need to succeed.\\u003c/p\\u003e \\u003chr /\\u003e \\u003cp class=\\\"parent-tip\\\"\\u003e\\u003cimg src='/assets/school_profiles/owl.png' /\\u003e\\u003cspan class=\\\"speech-bubble left\\\"\\u003eParent tip\\u003c/span\\u003e\\u003c/p\\u003e \\u003cp class=\\\"footnote\\\"\\u003eAsk the school what it’s doing to help all students succeed in advanced classes and prepare for \\u003ca href=\\\"/gk/articles/improving-sat-scores/\\\"\\u003ecollege entrance tests\\u003c/a\\u003e.\\u003c/p\\u003e \\u003c/div\\u003e\\n\",\"title\":\"College readiness\",\"values\":[{\"label\":\"94\",\"score\":93,\"breakdown\":\"4-year high school graduation rate\",\"state_average\":86,\"state_average_label\":\"87\",\"display_type\":\"person\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"Graduation rates reflect how many students graduate from this school on time.\"},{\"label\":\"51\",\"score\":50,\"breakdown\":\"AP course participation\",\"state_average\":26,\"state_average_label\":\"27\",\"display_type\":\"person\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"Advanced Placement classes are college-level courses students can take in high school. 
The percentage of students taking AP classes may reflect whether the school culture is focused on college.\"},{\"label\":\"73\",\"score\":72,\"breakdown\":\"Percentage of students passing 1 or more AP exams grades 9-12\",\"state_average\":62,\"state_average_label\":\"63\",\"display_type\":\"bar\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"The AP exam pass rate reflects how many students at this school earned a passing score on at least one AP exam. Students who do well on AP exams (passing with a score of 3, 4, or 5) may receive college credit.\"},{\"label\":\"6\",\"score\":5,\"breakdown\":\"Percentage of students enrolled in Dual Enrollment classes grades 9-12\",\"state_average\":2,\"state_average_label\":\"3\",\"display_type\":\"person\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"The Dual Enrollment participation rate reflects the percentage of students at this school who are taking college courses while in high school. Credits for these courses apply both to high school diploma requirements and college graduation requisites.\"},{\"label\":\"\\u003c1\",\"score\":0,\"breakdown\":\"Percentage of students enrolled in IB grades 9-12\",\"state_average\":2,\"state_average_label\":\"2\",\"display_type\":\"person\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"International Baccalaureate (IB) is an internationally recognized, high-standards program that emphasizes creative and critical thinking. A high school may have specific IB classes students can take, or a school-wide IB program that affects all classes. Some colleges give college credit for IB courses. 
\\u003ca href='/gk/articles/what-is-ib-international-baccalaureate/' target='_blank'\\u003eMore about IB\\u003c/a\\u003e\\n\"},{\"label\":\"93\",\"score\":93,\"breakdown\":\"SAT/ACT participation rate\",\"state_average\":57,\"state_average_label\":\"57\",\"display_type\":\"person\",\"lower_range\":0,\"upper_range\":100,\"tooltip_html\":\"The SAT/ACT participation rate shows the percentage of eligible students in grades 11 or 12 at this school who took the SAT or ACT.\"}]}]}],\"showTabs\":false,\"faq\":{\"cta\":\"Notice something missing or confusing?\",\"content\":\"\\u003cp\\u003eCollege readiness information comes from state or national education agencies (click on the \\\"Sources\\\" link for details).\\u003c/p\\u003e \\u003cp\\u003eWhen information is missing in our display, it's most likely because this school did not offer an AP course, IB or dual enrollment classes, or participate in one of the two college readiness tests, the ACT or SAT (some states mandate which college readiness test schools use). It's also possible that the missing data was not included in the data we received from the state.\\u003c/p\\u003e \\u003cp\\u003eDid you find the information about college readiness useful? What can we do better? \\u003ca href=\\\"https://s.qualaroo.com/45194/34aea707-ec71-4130-b6bb-2864e0528c64\\\" target=\\\"_blank\\\"\\u003eShare your feedback.\\u003c/a\\u003e\\u003c/p\\u003e \\u003cp\\u003e\\u003ca href=\\\"/gk/ratings/#collegereadinessrating\\\" target=\\\"_blank\\\"\\u003eLearn more about this rating.\\u003c/a\\u003e\\u003c/p\\u003e \\u003cp\\u003eStill have questions? 
\\u003ca href=\\\"https://greatschools.zendesk.com/hc/en-us\\\" target=\\\"_blank\\\"\\u003eVisit our FAQ page.\\u003c/a\\u003e\\u003c/p\\u003e\\n\",\"element_type\":\"faq\"},\"no_data_summary\":\"This section includes information about this school’s graduation rates, SAT/ACT tests, and AP coursework.\\n\",\"qualaroo_module_link\":\"https://s.qualaroo.com/45194/34aea707-ec71-4130-b6bb-2864e0528c64?state=MD\\u0026school=115\"}"

Thanks for any help!

1 Answer

0 votes
/ April 10, 2020

The good news is that you can easily pull this value out of the response text with a regular expression:

library(rvest)
library(magrittr)
library(stringr)

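# Read the page text as a single string (the embedded JSON is part of it)
p <- read_html('https://www.greatschools.org/maryland/severna-park/115-Severna-Park-High-School/') %>% html_text()
# Capture the first "label" after "College readiness","values":[{ which is the 4-year graduation rate
rate <- str_match_all(p,'"College readiness","values":\\[\\{"label":"(.*?)"')[[1]][,2][1]
print(as.numeric(rate))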
...
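
On the "what type of data is this" part of the question: the scraped string looks like JSON, so it could also be parsed rather than split. Below is a minimal sketch, assuming the jsonlite package is installed. Here json_txt is a hand-trimmed stand-in for the scraped string (only a few fields from the output above are kept), and the names parsed, values, measures and grad_rate_value are purely illustrative.

```r
library(jsonlite)

# Hand-trimmed stand-in for the scraped string (an assumption for illustration);
# the nesting mirrors the "data" -> "data" -> "values" path seen in the output.
json_txt <- '{
  "title": "College readiness",
  "rating": 9,
  "data": [{
    "title": "College readiness",
    "data": [{
      "title": "College readiness",
      "values": [
        {"label": "94", "score": 93, "breakdown": "4-year high school graduation rate", "state_average": 86},
        {"label": "51", "score": 50, "breakdown": "AP course participation", "state_average": 26}
      ]
    }]
  }]
}'

# Keep the result as nested lists so the path to "values" is explicit
parsed <- fromJSON(json_txt, simplifyVector = FALSE)
values <- parsed$data[[1]]$data[[1]]$values

# Flatten the list of records into a data frame, one row per measure
measures <- do.call(rbind, lapply(values, function(v) {
  data.frame(breakdown = v$breakdown,
             label = v$label,
             score = v$score,
             state_average = v$state_average,
             stringsAsFactors = FALSE)
}))

# Pick the graduation rate by its "breakdown" text rather than by position
grad <- measures[measures$breakdown == "4-year high school graduation rate", ]
grad_rate_value <- as.numeric(grad$label)
print(grad_rate_value)
```

With the real page you would pass the scraped string (grad_rate in the question) to fromJSON() instead of json_txt. Note that some labels in the full output are not plain numbers (the IB row has the label "<1"), so as.numeric() would return NA for those.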