I have a dataset with the following columns:
Acres, FamilyType, NumBedrooms, NumChildren, NumPeople, NumRooms, NumUnits, NumVehicles, NumWorkers, OwnRent, YearBuilt, HouseCosts, ElectricBill, FoodStamp, HeatingFuel, Insurance, Language, above_150K
I ran:
fit <- glm(above_150K ~ Acres + FamilyType + NumBedrooms + NumChildren + NumPeople + NumRooms + NumUnits + NumVehicles + NumWorkers + OwnRent + YearBuilt + HouseCosts + ElectricBill + FoodStamp + HeatingFuel + Insurance + Language, data = df)
summary(fit)
The summary splits some of the columns further into sub-columns, as shown below:
Term                      Abbreviation
Acres10+                  A
AcresSub 1                A1
FamilyTypeMale Head       FH
FamilyTypeMarried         FT
NumBedrooms               NB
NumChildren               NC
NumPeople                 NP
NumRooms                  NR
NumUnitsSingle attached   Na
NumUnitsSingle detached   Nd
NumVehicles               NV
NumWorkers                NW
OwnRentOutright           ORO
OwnRentRented             ORR
YearBuilt1940-1949        YB194
YearBuilt1950-1959        YB195
YearBuilt1960-1969        YB196
YearBuilt1970-1979        YB197
YearBuilt1980-1989        YB198
YearBuilt1990-1999        YB199
YearBuilt2000-2004        YB2000
YearBuilt2005             YB2005
YearBuilt2006             YB2006
YearBuilt2007             YB2007
YearBuilt2008             YB2008
YearBuilt2009             YB2009
YearBuilt2010             YB201
YearBuiltBefore 1939      Y1
HouseCosts                HC
ElectricBill              E
FoodStampYes              FS
HeatingFuelElectricity    HFE
HeatingFuelGas            HFG
HeatingFuelNone           HFN
HeatingFuelOil            HtngFlOl
HeatingFuelOther          HtngFlOt
HeatingFuelSolar          HFS
HeatingFuelWood           HFW
Insurance                 I
LanguageEnglish           LnE
LanguageOther             LO
LanguageOther European    LOE
LanguageSpanish           LS
As you can see, the single column HeatingFuel is split into:
HeatingFuelElectricity    HFE
HeatingFuelGas            HFG
HeatingFuelNone           HFN
HeatingFuelOil            HtngFlOl
HeatingFuelOther          HtngFlOt
HeatingFuelSolar          HFS
HeatingFuelWood           HFW
Why does this happen?
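For reference, a minimal sketch of what this kind of splitting looks like on a toy example. It assumes the column is stored as a factor (or character), in which case R's default treatment contrasts create one 0/1 indicator column per level except the first. The data frame toy and its values are invented for illustration and are not taken from the dataset above.

```r
# Toy data: column names mirror the question, values are made up.
toy <- data.frame(
  HeatingFuel = factor(c("Coal", "Gas", "Oil", "Gas", "Coal", "Oil")),
  NumRooms    = c(4, 6, 5, 7, 3, 8)
)

# Factor levels are sorted alphabetically; the first one is the baseline.
levels(toy$HeatingFuel)

# Design matrix R builds for this formula:
# one indicator column per non-baseline level, named <column><level>.
X <- model.matrix(~ HeatingFuel + NumRooms, data = toy)
colnames(X)
```

Here colnames(X) contains HeatingFuelGas and HeatingFuelOil but no column for the first level, which is absorbed into the intercept as the baseline. A numeric column such as NumRooms stays a single term.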
I wanted to select variables for predicting above_150K. I tried stepwise and all-subsets automatic variable selection, and both suggest using all the variables. Could someone clarify, please?
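On the variable-selection part, a hedged sketch of how term-level selection interacts with factors. It assumes above_150K is a 0/1 indicator and therefore passes family = binomial (the call above omits family, so glm defaults to gaussian). The simulated toy data is hypothetical. drop1() from the stats package evaluates each model term as a whole, so a factor is kept or dropped with all of its levels together.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; not the real dataset.
toy <- data.frame(
  HeatingFuel = factor(sample(c("Coal", "Gas", "Oil"), n, replace = TRUE)),
  NumRooms    = rpois(n, 5)
)
# Made-up binary response that depends on NumRooms only.
toy$above_150K <- rbinom(n, 1, plogis(-3 + 0.5 * toy$NumRooms))

# Logistic regression (assumption: the response is 0/1).
fit <- glm(above_150K ~ HeatingFuel + NumRooms, data = toy, family = binomial)

# Coefficients are reported per indicator column...
names(coef(fit))

# ...but single-term deletions work per model term:
# HeatingFuel appears as one row with 2 degrees of freedom.
tab <- drop1(fit, test = "Chisq")
print(tab)
```

The coefficient table lists HeatingFuelGas and HeatingFuelOil separately, while the drop1() table has a single HeatingFuel row. step() relies on the same term-level comparisons, so it never removes individual levels of a factor.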