I believe you are confusing objective with the objective function (passed as the obj parameter); the xgboost documentation is quite confusing sometimes.
In short, you just need to change your code to this:
m = XGBClassifier(obj=brier, seed=42)
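The `brier` function itself comes from the question and is not reproduced here. For readers who land on this answer without it, here is a minimal sketch of the math such a custom objective would typically compute: the gradient and Hessian of the per-sample Brier loss `(p - y)^2` with respect to the raw margin, where `p = sigmoid(margin)`. The name `brier_grad_hess` and the plain-list interface are my own assumptions for illustration; a real xgboost objective would work on NumPy arrays and follow the callback signature of whichever API you use.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def brier_grad_hess(y_true, margins):
    """Hypothetical helper: first and second derivatives of (p - y)^2
    with respect to the raw margin z, where p = sigmoid(z)."""
    grad, hess = [], []
    for y, z in zip(y_true, margins):
        p = sigmoid(z)
        s = p * (1.0 - p)  # dp/dz
        grad.append(2.0 * (p - y) * s)
        # d/dz of 2 * (p - y) * s, using ds/dz = s * (1 - 2p)
        hess.append(2.0 * s * (s + (p - y) * (1.0 - 2.0 * p)))
    return grad, hess


grad, hess = brier_grad_hess([1, 0], [0.0, 0.0])
print(grad, hess)
```

Note that this Hessian can go negative for confidently wrong predictions (for example y = 1 with p = 0.1), so in practice you may need to clamp it to a small positive value.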
In a bit more depth, objective determines how xgboost optimizes for a given objective function. Usually xgboost infers the objective from the number of classes in your y vector.
I took a snippet from the source code; as you can see, whenever you have only two classes, objective is set to binary:logistic:
class XGBClassifier(XGBModel, XGBClassifierBase):
    def __init__(self, objective="binary:logistic", **kwargs):
        super().__init__(objective=objective, **kwargs)

    def fit(self, X, y, sample_weight=None, base_margin=None,
            eval_set=None, eval_metric=None,
            early_stopping_rounds=None, verbose=True, xgb_model=None,
            sample_weight_eval_set=None, callbacks=None):
        evals_result = {}
        self.classes_ = np.unique(y)
        self.n_classes_ = len(self.classes_)

        xgb_options = self.get_xgb_params()  # <-- obj function is set here

        if callable(self.objective):
            obj = _objective_decorator(self.objective)  # <----- here is the mismatch of the names, if you pass objective as your brier func it will become "binary:logistic"
            xgb_options["objective"] = "binary:logistic"
        else:
            obj = None

        if self.n_classes_ > 2:
            xgb_options['objective'] = 'multi:softprob'  # <----- objective is being set here if n_classes > 2
            xgb_options['num_class'] = self.n_classes_

        # +-- 35 lines: feval = eval_metric if callable(eval_metric) else None
        self._Booster = train(xgb_options, train_dmatrix,  # <----- objective is being passed in xgb_options dictionary
                              evals_result=evals_result, obj=obj, feval=feval,  # <----- obj function is being passed to lower level api here
                              verbose_eval=verbose, xgb_model=xgb_model,
        # +-- 12 lines: self.objective = xgb_options["objective"]
        return self
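To make the branching in that excerpt easier to follow, here is a stripped-down restatement of just the selection logic as a standalone function. This is a sketch, not xgboost code: `resolve_objective` and the stand-in `brier` are hypothetical names, and the real code wraps the callable in `_objective_decorator` instead of passing it through unchanged.

```python
def resolve_objective(objective, n_classes):
    """Return (objective_name, custom_obj), mirroring the excerpt's branches."""
    if callable(objective):
        custom_obj = objective      # real code: _objective_decorator(objective)
        name = "binary:logistic"    # placeholder name stored in the params dict
    else:
        custom_obj = None
        name = objective
    if n_classes > 2:
        name = "multi:softprob"     # overrides whatever was set before
    return name, custom_obj


def brier(y_true, y_pred):
    """Hypothetical stand-in for the custom objective from the question."""
    raise NotImplementedError


print(resolve_objective("binary:logistic", 2))  # ('binary:logistic', None)
print(resolve_objective(brier, 2)[0])           # binary:logistic
print(resolve_objective("binary:logistic", 3))  # ('multi:softprob', None)
```

The point to take away is that the params dictionary always carries a string name, while the callable travels separately to the lower-level `train` call.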
There is a fixed list of objectives that you can set:
objective [default=reg:squarederror]
reg:squarederror: regression with squared loss.
reg:squaredlogerror: regression with squared log loss $\frac{1}{2}[\log(pred + 1) - \log(label + 1)]^2$. All input labels are required to be greater than -1. Also, see metric rmsle for possible issue with this objective.
reg:logistic: logistic regression
binary:logistic: logistic regression for binary classification, output probability
binary:logitraw: logistic regression for binary classification, output score before logistic transformation
binary:hinge: hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.
count:poisson: Poisson regression for count data, output mean of Poisson distribution
    max_delta_step is set to 0.7 by default in Poisson regression (used to safeguard optimization)
survival:cox: Cox regression for right censored survival time data (negative values are considered right censored). Note that predictions are returned on the hazard ratio scale (i.e., as HR = exp(marginal_prediction) in the proportional hazard function h(t) = h0(t) * HR).
multi:softmax: set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class(number of classes)
multi:softprob: same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata * nclass matrix. The result contains predicted probability of each data point belonging to each class.
rank:pairwise: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
rank:ndcg: Use LambdaMART to perform list-wise ranking where Normalized Discounted Cumulative Gain (NDCG) is maximized
rank:map: Use LambdaMART to perform list-wise ranking where Mean Average Precision (MAP) is maximized
reg:gamma: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
reg:tweedie: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
Just to confirm that objective cannot be your brier function, I manually set objective to the brier function inside the source code, right before the call to the lower-level API:
class XGBClassifier(XGBModel, XGBClassifierBase):
    def __init__(self, objective="binary:logistic", **kwargs):
        super().__init__(objective=objective, **kwargs)

    def fit(self, X, y, sample_weight=None, base_margin=None,
            eval_set=None, eval_metric=None,
            early_stopping_rounds=None, verbose=True, xgb_model=None,
            sample_weight_eval_set=None, callbacks=None):
        # +-- 54 lines: evals_result = {}
        xgb_options["objective"] = xgb_options["obj"]
        self._Booster = train(xgb_options, train_dmatrix,
                              evals_result=evals_result, obj=obj, feval=feval,
                              verbose_eval=verbose, xgb_model=xgb_model,
        # +-- 14 lines: self.objective = xgb_options["objective"]
This throws the following error:
raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [10:09:53] /private/var/folders/z5/mchb9bz51cx3h97nkw9v0wkr0000gn/T/pip-install-kh801rm0/xgboost/xgboost/src/objective/objective.cc:26: Unknown objective function: `<function brier at 0x10b630d08>`
Objective candidate: binary:hinge
Objective candidate: multi:softmax
Objective candidate: multi:softprob
Objective candidate: rank:pairwise
Objective candidate: rank:ndcg
Objective candidate: rank:map
Objective candidate: reg:squarederror
Objective candidate: reg:squaredlogerror
Objective candidate: reg:logistic
Objective candidate: binary:logistic
Objective candidate: binary:logitraw
Objective candidate: reg:linear
Objective candidate: count:poisson
Objective candidate: survival:cox
Objective candidate: reg:gamma
Objective candidate: reg:tweedie
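The message suggests that the objective value is looked up by name among the built-in objectives, and that a Python function placed there simply gets stringified. A toy imitation of that failure mode in pure Python follows; `lookup_objective` is a hypothetical helper, the candidate names are copied from the error output above, and none of this is xgboost's actual registry code.

```python
# Names copied from the "Objective candidate" lines in the error output.
KNOWN_OBJECTIVES = {
    "binary:hinge", "multi:softmax", "multi:softprob",
    "rank:pairwise", "rank:ndcg", "rank:map",
    "reg:squarederror", "reg:squaredlogerror", "reg:logistic",
    "binary:logistic", "binary:logitraw", "reg:linear",
    "count:poisson", "survival:cox", "reg:gamma", "reg:tweedie",
}


def lookup_objective(value):
    """Hypothetical lookup: accept only names from the fixed candidate list."""
    key = str(value)  # a function becomes "<function brier at 0x...>"
    if key not in KNOWN_OBJECTIVES:
        raise ValueError(f"Unknown objective function: `{key}`")
    return key


def brier(y_true, y_pred):
    """Hypothetical stand-in for the custom objective from the question."""
    raise NotImplementedError


print(lookup_objective("binary:logistic"))
try:
    lookup_objective(brier)
except ValueError as err:
    print(err)
```

So a string from the list goes in the objective entry, and the callable has to reach the booster through the separate obj argument.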