Возьмем очень простой пример. Предположим, у меня есть такие точки данных:
library(ggplot2)
df <- data.frame(x = 1:10, y = (1:10)^2)
(p <- ggplot(df, aes(x, y)) + geom_point())
I want to try to fit a model to them, but don't know what form this should take. I try a linear regression first and plot the resultant prediction:
mod1 <- lm(y ~ x, data = df)
(p <- p + geom_line(aes(y = predict(mod1)), color = "blue"))
Next I try a linear regression on log(y)
. Whatever results I get from predictions from this model will be predicted values of log(y)
. But I don't want log(y)
predictions, I want y
predictions, so I need to take the 'anti-log' of the prediction. We do this in R by doing exp
:
mod2 <- lm(log(y) ~ x, data = df)
(p <- p + geom_line(aes(y = exp(predict(mod2))), color = "red"))
But we can see that we have different regression lines. That's because when we took the log of y, we were effectively fitting a straight line on the plot of log(y) against x. When we transform the axis back to a non-log axis, our straight line becomes an exponential curve. We can see this more clearly by drawing our plot again with a log-transformed y axis:
p + scale_y_log10(limits = c(1, 500))
Created on 2020-08-04 by the репозиторий (v0.3.0)