How do I insert the mean value in the middle of my existing ggplot2 lollipop charts?
0 votes
/ July 13, 2020

I am using R in RStudio. I have the following R code, which creates lollipop charts from my data frame:

library(tidyr) 
library(dplyr)
library(ggplot2)
library(ggthemes)
library(scales)

df2 <- read.csv("hotel_list.csv", as.is=TRUE, header = TRUE, fileEncoding="UTF-8-BOM")

df3<-(subset(df2,Mth %in% c("Jan-20", "Feb-20", "Mar-20", "Apr-20", "May-20", "Jun-20")))

p1<-ggplot(df3,aes(x=Day, y=Rank)) + 
  geom_point(size=4, color="tomato3") + 
  geom_segment(aes(x=Day, 
                   xend=Day, 
                   y=0, 
                   yend=Rank)) + 
  #print value for each bar as well
  geom_text(color="purple", size=3, vjust=-1.0, 
            aes(label=sprintf("%0.0f", round(Rank, digits = 2))))+
  labs(title="Hotel Ranking (by Day and by Month)",
  subtitle="ABC Ltd") + 
  coord_cartesian(ylim = c(39, 65)) +
  theme(axis.text.x = element_text(angle=90, vjust=0.7, color="tomato3")) +
  facet_wrap(~Mth, scales='free', ncol=3)

p1+ theme(strip.text = element_text(size=10, face="bold")) +
  scale_y_continuous(labels = scales::number_format()) +
  scale_x_continuous(breaks=c(5,10,15,20,25,30)) +
  geom_hline(yintercept=c(40), linetype="dotted", color = "red", size=1.2)

This is the output I get when I run the code above:

Actual output

I would like to calculate the monthly average of the Rank values and to overlay that average value in the middle of my existing lollipop charts (using a low opacity level). Here is the expected output (screenshot shown only for the month of Feb-20):

Expected output

How can I achieve this?

Below is the dput of the df3 data frame:

structure(list(Hotel = c("ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd", 
"ABC Ltd", "ABC Ltd", "ABC Ltd", "ABC Ltd"), Date = c("2020-01-01", 
"2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05", "2020-01-06", 
"2020-01-07", "2020-01-08", "2020-01-09", "2020-01-10", "2020-01-11", 
"2020-01-12", "2020-01-13", "2020-01-14", "2020-01-15", "2020-01-16", 
"2020-01-17", "2020-01-18", "2020-01-19", "2020-01-20", "2020-01-21", 
"2020-01-22", "2020-01-23", "2020-01-24", "2020-01-25", "2020-01-26", 
"2020-01-27", "2020-01-28", "2020-01-29", "2020-01-30", "2020-01-31", 
"2020-02-01", "2020-02-02", "2020-02-03", "2020-02-04", "2020-02-05", 
"2020-02-06", "2020-02-07", "2020-02-08", "2020-02-09", "2020-02-10", 
"2020-02-11", "2020-02-12", "2020-02-13", "2020-02-14", "2020-02-15", 
"2020-02-16", "2020-02-17", "2020-02-18", "2020-02-19", "2020-02-20", 
"2020-02-21", "2020-02-22", "2020-02-23", "2020-02-24", "2020-02-25", 
"2020-02-26", "2020-02-27", "2020-02-28", "2020-02-29", "2020-03-01", 
"2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05", "2020-03-06", 
"2020-03-07", "2020-03-08", "2020-03-09", "2020-03-10", "2020-03-11", 
"2020-03-12", "2020-03-13", "2020-03-14", "2020-03-15", "2020-03-16", 
"2020-03-17", "2020-03-18", "2020-03-19", "2020-03-20", "2020-03-21", 
"2020-03-22", "2020-03-23", "2020-03-24", "2020-03-25", "2020-03-26", 
"2020-03-27", "2020-03-28", "2020-03-29", "2020-03-30", "2020-03-31", 
"2020-04-01", "2020-04-02", "2020-04-03", "2020-04-04", "2020-04-05", 
"2020-04-06", "2020-04-07", "2020-04-08", "2020-04-09", "2020-04-10", 
"2020-04-11", "2020-04-12", "2020-04-13", "2020-04-14", "2020-04-15", 
"2020-04-16", "2020-04-17", "2020-04-18", "2020-04-19", "2020-04-20", 
"2020-04-21", "2020-04-22", "2020-04-23", "2020-04-24", "2020-04-25", 
"2020-04-26", "2020-04-27", "2020-04-28", "2020-04-29", "2020-04-30", 
"2020-05-01", "2020-05-02", "2020-05-03", "2020-05-04", "2020-05-05", 
"2020-05-06", "2020-05-07", "2020-05-08", "2020-05-09", "2020-05-10", 
"2020-05-11", "2020-05-12", "2020-05-13", "2020-05-14", "2020-05-15", 
"2020-05-16", "2020-05-17", "2020-05-18", "2020-05-19", "2020-05-20", 
"2020-05-21", "2020-05-22", "2020-05-23", "2020-05-24", "2020-05-25", 
"2020-05-26", "2020-05-27", "2020-05-28", "2020-05-29", "2020-05-30", 
"2020-05-31", "2020-06-01", "2020-06-02", "2020-06-03", "2020-06-04", 
"2020-06-05", "2020-06-06", "2020-06-07", "2020-06-08", "2020-06-09", 
"2020-06-10", "2020-06-11", "2020-06-12", "2020-06-13", "2020-06-14", 
"2020-06-15", "2020-06-16", "2020-06-17", "2020-06-18", "2020-06-19", 
"2020-06-20", "2020-06-21", "2020-06-22", "2020-06-23", "2020-06-24", 
"2020-06-25", "2020-06-26", "2020-06-27", "2020-06-28", "2020-06-29", 
"2020-06-30"), Day = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 
24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 
18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 
31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 
14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 
27L, 28L, 29L, 30L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 
24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L), Mth = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L), .Label = c("Jan-20", "Feb-20", "Mar-20", 
"Apr-20", "May-20", "Jun-20"), class = c("ordered", "factor")), 
    Rank = c(59L, 59L, 59L, 59L, 59L, 59L, 60L, 60L, 59L, 61L, 
    61L, 61L, 61L, 61L, 62L, 62L, 63L, 62L, 62L, 62L, 62L, 62L, 
    62L, 62L, 62L, 62L, 62L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 
    61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 
    61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 
    61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 60L, 60L, 60L, 
    59L, 59L, 59L, 59L, 59L, 59L, 59L, 59L, 59L, 59L, 59L, 59L, 
    59L, 59L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 
    60L, 60L, 60L, 60L)), row.names = c(NA, 182L), class = "data.frame")

1 Answer

3 votes
/ July 13, 2020

I usually find it easier to create a separate, small helper data frame:

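# Helper frame: one row per Hotel/Mth holding the formatted monthly mean.
# Day = 15 fixes the label's x position near the middle of each panel.
df4 <- df3 %>% 
  group_by(Hotel, Mth) %>% 
  summarise(Rank = sprintf("%0.0f", round(mean(Rank), digits = 2)), Day = 15)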
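
To see what such a helper frame looks like, here is a minimal sketch on a toy data frame. The names `toy` and `toy_means` and all values are made up for illustration; they are not taken from the question's data.

```r
library(dplyr)

# Hypothetical toy data standing in for df3 (values invented for illustration)
toy <- data.frame(
  Hotel = "ABC Ltd",
  Mth   = rep(c("Jan-20", "Feb-20"), each = 3),
  Day   = rep(1:3, times = 2),
  Rank  = c(59, 60, 61, 61, 61, 62)
)

# One row per Hotel/Mth: the formatted monthly mean plus a fixed x position
toy_means <- toy %>%
  group_by(Hotel, Mth) %>%
  summarise(Rank = sprintf("%0.0f", mean(Rank)), Day = 15)

print(toy_means)
```

Note that `Rank` in the helper frame is a character string, not a number. That only works in the plot because the extra text layer sets `y` to a constant instead of inheriting `y = Rank`.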

This makes it easy to add an extra geom_text call:

p1 <- ggplot(df3, aes(x = Day, y = Rank)) + 
        geom_point(size = 4, color="tomato3") + 
        geom_segment(aes(xend = Day, y = 0, yend = Rank)) +
        geom_text(color = "purple", size = 3, vjust = -1.0, 
                  aes(label = sprintf("%0.0f", round(Rank, digits = 2)))) +
        geom_text(aes(label = Rank, y = 50), data = df4, check_overlap = TRUE,
                  size = 30, colour = "gray50", alpha = 0.3) +
        labs(title = "Hotel Ranking (by Day and by Month)",
             subtitle = "ABC Ltd") + 
        coord_cartesian(ylim = c(39, 65)) +
        theme(axis.text.x = element_text(angle = 90, vjust = 0.7, color = "tomato3")) +
        facet_wrap(~Mth, scales = 'free', ncol = 3)

This gives you the desired result:

p1 + 
  theme(strip.text = element_text(size = 10, face = "bold")) +
  scale_y_continuous(labels = scales::number_format()) +
  scale_x_continuous(breaks = c(5, 10, 15, 20, 25, 30)) +
  geom_hline(yintercept = 40, linetype = "dotted", color = "red", size = 1.2)
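
The answer hard-codes the label position (Day = 15, y = 50). A possible variant, sketched here on invented toy data, derives the position from each month's range of days and from the visible y range. The names `toy`, `toy_labels` and `y_limits` are assumptions made for this illustration and are not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for df3 (values invented for illustration)
set.seed(1)
toy <- data.frame(
  Mth  = factor(rep(c("Jan-20", "Feb-20"), times = c(31, 29)),
                levels = c("Jan-20", "Feb-20")),
  Day  = c(1:31, 1:29),
  Rank = sample(59:62, 60, replace = TRUE)
)

y_limits <- c(39, 65)  # same visible range as coord_cartesian() in the answer

# One label per month, centred on that month's days and on the visible y range
toy_labels <- toy %>%
  group_by(Mth) %>%
  summarise(label = sprintf("%0.0f", mean(Rank)),
            Day   = mean(range(Day)),
            Rank  = mean(y_limits))

p <- ggplot(toy, aes(x = Day, y = Rank)) +
  geom_segment(aes(xend = Day, y = 0, yend = Rank)) +
  geom_point(size = 4, color = "tomato3") +
  # The label layer inherits x = Day and y = Rank from toy_labels
  geom_text(aes(label = label), data = toy_labels,
            size = 30, colour = "gray50", alpha = 0.3) +
  coord_cartesian(ylim = y_limits) +
  facet_wrap(~Mth, scales = "free", ncol = 3)

print(p)
```

Because the helper frame keeps `Day` and `Rank` numeric and stores the text in a separate `label` column, the text layer can inherit both position aesthetics from the main `ggplot()` call.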

