It's difficult to demonstrate what's going wrong here without the data, so I'll try to create something roughly similar to yours:
set.seed(69)
m <- rgamma(5000, 2, 2) * 30000
p <- 3e4 * log((rnorm(5e3, 1e4, 1e3) + m)/(m + rnorm(5e3, 5e3, 5e2)) + rgamma(5000, 2, 2)/8)
c <- data.frame(Mileage = m, Price = p)
plot(c$Mileage, c$Price,
     xlab = "Mileage",
     ylab = "Price")
This is close enough for demonstration purposes.
Now we can add the linear regression line using your code:
regrPM1
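I don't have your exact call in front of me, so here is a minimal sketch that assumes it was a plain `lm` of price on mileage followed by `abline`. The formula, colour and line width are my assumptions; only the name `regrPM1` comes from your code.

```r
# Simulated data, generated the same way as in the snippet above
set.seed(69)
m <- rgamma(5000, 2, 2) * 30000
p <- 3e4 * log((rnorm(5e3, 1e4, 1e3) + m)/(m + rnorm(5e3, 5e3, 5e2)) + rgamma(5000, 2, 2)/8)
c <- data.frame(Mileage = m, Price = p)

plot(c$Mileage, c$Price, xlab = "Mileage", ylab = "Price")

# Assumed form of the original call: ordinary linear regression of Price on Mileage
regrPM1 <- lm(c$Price ~ c$Mileage)

# abline() reads the intercept and slope straight from the fitted model
abline(regrPM1, col = "red", lwd = 3)
```

Because both the response and the y-axis are in raw price units here, `abline` draws the line where you would expect it.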
Now, if we regress the log of the price on the mileage and just plot the result using abline, we get the same flat green line as you did:
regrPM2
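As a sketch of that step, assuming the log model was fitted the same way with `log()` wrapped around the price (the formula and the styling are my guesses, not your actual code):

```r
# Simulated data, generated the same way as in the snippet above
set.seed(69)
m <- rgamma(5000, 2, 2) * 30000
p <- 3e4 * log((rnorm(5e3, 1e4, 1e3) + m)/(m + rnorm(5e3, 5e3, 5e2)) + rgamma(5000, 2, 2)/8)
c <- data.frame(Mileage = m, Price = p)

plot(c$Mileage, c$Price, xlab = "Mileage", ylab = "Price")

# Assumed form of the log model: the response is log(Price), not Price
regrPM2 <- lm(log(c$Price) ~ c$Mileage)

# The coefficients, and therefore the line, are in log-price units
abline(regrPM2, col = "green", lwd = 3)

# Compare the scale of the fitted values with the scale of the y-axis
range(fitted(regrPM2))
range(c$Price)
```

The fitted values are tiny compared with the raw prices, so on this y-axis the line hugs the bottom of the plot and looks flat.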
That's because we are plotting the log of the price on the (non-logged) plot. We want to take the anti-log of the result of our regression and plot that.
Note that it's better to use the data argument in our lm call, so let's do:
regrPM3
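A minimal sketch of what I mean, assuming the same log model as before but written with column names and `data = c`. The practical benefit is that `predict` can then match the `Mileage` column in `newdata` to the model term; with `c$Mileage` in the formula it cannot, and falls back to the original 5000 rows with a warning.

```r
# Simulated data, generated the same way as in the snippet above
set.seed(69)
m <- rgamma(5000, 2, 2) * 30000
p <- 3e4 * log((rnorm(5e3, 1e4, 1e3) + m)/(m + rnorm(5e3, 5e3, 5e2)) + rgamma(5000, 2, 2)/8)
c <- data.frame(Mileage = m, Price = p)

# Column names in the formula, data frame passed via the data argument
regrPM3 <- lm(log(Price) ~ Mileage, data = c)

# Predictions (still on the log scale) at three new mileage values
new_miles <- data.frame(Mileage = seq(0, 1e5, 5e4))
log_pred  <- predict(regrPM3, newdata = new_miles)
log_pred
```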
Now instead of trying to plot this as a straight line, let's take the anti-log of its predictions at fixed intervals and plot them:
lines(seq(0, 2e5, 1e3),
      exp(predict(regrPM3, newdata = list(Mileage = seq(0, 2e5, 1e3)))),
      col = "blue", lty = 2, lwd = 4)
So the blue dashed line is what the log regression looks like on the original price scale.
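To see why the back-transformed fit is a curve rather than a straight line: if log(Price) = a + b * Mileage, then Price = exp(a) * exp(b * Mileage), which is an exponential in mileage. The sketch below checks that identity numerically on the simulated data; the names `a`, `b`, `grid`, `curve_predict` and `curve_manual` are just illustrative.

```r
# Simulated data, generated the same way as in the snippet above
set.seed(69)
m <- rgamma(5000, 2, 2) * 30000
p <- 3e4 * log((rnorm(5e3, 1e4, 1e3) + m)/(m + rnorm(5e3, 5e3, 5e2)) + rgamma(5000, 2, 2)/8)
c <- data.frame(Mileage = m, Price = p)

regrPM3 <- lm(log(Price) ~ Mileage, data = c)

# Intercept and slope on the log scale
a <- coef(regrPM3)[["(Intercept)"]]
b <- coef(regrPM3)[["Mileage"]]

# Same grid of mileage values as used for the blue line
grid <- seq(0, 2e5, 1e3)

# Route 1: anti-log of the model's predictions
curve_predict <- exp(predict(regrPM3, newdata = data.frame(Mileage = grid)))

# Route 2: the exponential curve written out by hand
curve_manual <- exp(a) * exp(b * grid)

# The two routes agree up to floating-point tolerance
all.equal(unname(curve_predict), curve_manual)
```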