Logistic regression model using statsmodels:
log_reg = st.logit(formula = 'label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age', data=pima).fit()
Is there a shorter way to write the right-hand side of the formula (pregnant + glucose + bp + insulin + bmi + pedigree + age)? Here every column has to be listed explicitly. With more than 100 columns, this would be tedious to write and the statement would become very long.
There is no specific shortcut for this in the formulas.
You can use Python string manipulation to build the formula, e.g. from the column names of the pandas DataFrame.
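As a minimal sketch of the string-manipulation approach: the helper `build_formula` below is a hypothetical name, not part of statsmodels, and the column list simply mirrors the names in the question. Wrapping names that are not valid Python identifiers in `Q("...")` relies on patsy's quoting function and assumes the name itself contains no double quotes.

```python
def build_formula(columns, response):
    """Return a formula string of the form 'response ~ col1 + col2 + ...'."""
    terms = []
    for col in columns:
        if col == response:
            continue  # the response must not appear on the right-hand side
        # Names that are not valid Python identifiers (spaces, dashes, ...)
        # are wrapped in patsy's Q("...") so the formula parser accepts them.
        terms.append(col if str(col).isidentifier() else f'Q("{col}")')
    return f"{response} ~ " + " + ".join(terms)


# Column names taken from the question; in practice this would be pima.columns.
columns = ["pregnant", "glucose", "bp", "insulin", "bmi", "pedigree", "age", "label"]
formula = build_formula(columns, "label")
print(formula)
# label ~ pregnant + glucose + bp + insulin + bmi + pedigree + age
```

With the data from the question this would be used as `st.logit(formula=build_formula(pima.columns, 'label'), data=pima).fit()`. Every column other than the response ends up in the model, so drop any columns you do not want as predictors first.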
Or you can work directly with arrays or DataFrames. But even then you need a list of names if you want human-readable output, for example in summary(). If you only need prediction, then arrays without variable names are useful.