I() equivalent (used in R), what is the Python equivalent?

The I() function in R is used to create a new predictor in a linear regression, such as X^2. For example:

lm.fit2 = lm(medv ~ lstat + I(lstat^2))

A good explanation is given here (What does the capital letter "I" in R linear regression formula mean?).

I'm trying to fit the same linear regression in Python with the same formula, but I can't find the equivalent. This code works for a single variable:

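import statsmodels.formula.api as smf

# data is assumed to be a pandas DataFrame with medv and lstat columns
fit3 = smf.ols('medv~lstat', data=data).fit()
print(fit3.summary())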

But if I try the snippet below, it doesn't work correctly:

fit3 = smf.ols('medv~lstat + lstat**2', data=data).fit()
print(fit3.summary())

Trying the ^ operator also doesn't make sense, as Python interprets this symbol as bitwise XOR. Does anyone know if there is an equivalent of R's I() function in Python?

There is 1 answer

user42 On BEST ANSWER

I found the answer. It seems to be as simple as wrapping the term in I() inside the formula string:

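import statsmodels.formula.api as smf

# I() makes the formula parser evaluate lstat**2 as ordinary Python arithmetic
f = 'medv~lstat + I(lstat**2)'
fit3 = smf.ols(f, data=data).fit()
print(fit3.summary())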
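
To see why the version without I() misbehaves, here is a small self-contained sketch. It uses synthetic data as a stand-in for the original dataset (the numbers, the seed, and the lstat_sq column name are my own assumptions, not from the question). As far as I understand it, the formula parser that statsmodels uses (patsy) treats ** as a formula operator rather than arithmetic, so lstat**2 collapses back to plain lstat and the model is silently fitted without a squared term. Wrapping the expression in I(), or precomputing the squared column, gives the quadratic fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the real data (assumption: made-up coefficients and noise).
rng = np.random.default_rng(0)
lstat = rng.uniform(2, 38, size=200)
medv = 40.0 - 2.0 * lstat + 0.04 * lstat**2 + rng.normal(0, 1.0, size=200)
data = pd.DataFrame({"lstat": lstat, "medv": medv})

# Without I(): ** is read as a formula operator, so no squared term is added.
fit_plain = smf.ols("medv ~ lstat + lstat**2", data=data).fit()

# With I(): the expression inside is evaluated as ordinary Python arithmetic.
fit_quad = smf.ols("medv ~ lstat + I(lstat**2)", data=data).fit()

# Alternative: build the squared column yourself (lstat_sq is a hypothetical name).
data["lstat_sq"] = data["lstat"] ** 2
fit_col = smf.ols("medv ~ lstat + lstat_sq", data=data).fit()

# Compare which terms each model actually estimated.
print(list(fit_plain.params.index))
print(list(fit_quad.params.index))
print(list(fit_col.params.index))
```

Checking the list of fitted parameters like this is a quick way to confirm that the formula was parsed the way you intended before reading the full summary.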