Another option is to join the data frame to itself, with the origin and destination fields swapped.
Edit: added "trip_num" to better handle repeat trips by the same person.
library(dplyr)
# First, convert date field to Date type
df <- df %>%
  mutate(Date = lubridate::mdy(Date)) %>%
  # update with M-M's suggestion in comments
  mutate_at(.vars = vars(origin_city, Destination_city), .funs = toupper) %>%
  # EDIT: adding trip_num to protect against extraneous joins for repeat trips
  group_by(Account_number, origin_city, Destination_city) %>%
  mutate(trip_num = row_number()) %>%
  ungroup()
df2 <- df %>%
  left_join(df, by = c("Account_number", "trip_num",
                       "origin_city" = "Destination_city",
                       "Destination_city" = "origin_city")) %>%
  mutate(days = (Date.x - Date.y)/lubridate::ddays(1))
> df2
# A tibble: 6 x 7
  Account_number origin_city Destination_city Date.x     trip_num Date.y      days
           <int> <chr>       <chr>            <date>        <int> <date>     <dbl>
1              1 LONDON      CHICAGO          2018-07-22        1 2018-08-22   -31
2              2 MILAN       LONDON           2018-07-23        1 2018-07-28    -5
3              2 LONDON      MILAN            2018-07-28        1 2018-07-23     5
4              1 CHICAGO     LONDON           2018-08-22        1 2018-07-22    31
5              2 MILAN       LONDON           2018-08-23        2 2018-08-28    -5
6              2 LONDON      MILAN            2018-08-28        2 2018-08-23     5
Data: (added a repeat trip for Account_number 2)
df <- read.table(
  header = TRUE,
  stringsAsFactors = FALSE,
  text = "Account_number origin_city Destination_city Date
1 London chicago 7/22/2018
2 Milan London 7/23/2018
2 London Milan 7/28/2018
1 chicago london 8/22/2018
2 Milan London 8/23/2018
2 London Milan 8/28/2018")