I wrote my own estimator to automatically clean a particular dataset. I think I followed the scikit-learn conventions correctly:
    from sklearn.base import BaseEstimator, TransformerMixin
    import pandas as pd
    from pathlib import Path

    class cleaning(BaseEstimator, TransformerMixin):
        def __init__(self, to_drop = [], ins_threshold=0.6,
                     corr_threshold=0.7, attribute_filepath='attribute.xlsx'):  # no *args or **kwargs, provides methods get_params() and set_params()
            """
            Parameters:
            -----------
            to_drop (list) : columns to be dropped
            ins_threshold (float) : [0.0 - 1.0] insignificance threshold; columns whose proportion of NaN exceeds it get dropped
            corr_threshold (float) : [0.0 - 1.0] correlation threshold above which correlated columns get dropped (first one is kept)
            attribute_filepath (str or pathlib.Path) : path to the Excel file containing attribute information
            """
            self.attribute_filepath = Path(attribute_filepath)
            self.ins_threshold = ins_threshold
            self.corr_threshold = corr_threshold
            self.to_drop = to_drop
            self.ins_col = None
            self.correlated_col = None
But I still get this error:
RuntimeError: Cannot clone object cleaning(attribute_filepath=PosixPath('MyFile.xlsx')), as the constructor either does not set or modifies parameter attribute_filepath
Why does this happen, given that self.attribute_filepath is clearly set in my __init__?
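For reference, here is a reduced sketch that seems to reproduce the behaviour outside my real class. The names ConvertsInInit, StoresAsGiven and clone_error are made up for this illustration and are not part of scikit-learn. As far as I can tell, sklearn.base.clone rebuilds the estimator from get_params() and then checks that each stored parameter is the very same object that was passed to the constructor, so wrapping the argument in Path(...) inside __init__ may be what trips the check. The second class is only one possible way around it, assuming the conversion can wait until fit.

```python
from pathlib import Path

from sklearn.base import BaseEstimator, clone


class ConvertsInInit(BaseEstimator):
    """Hypothetical reduced version of the class above: converts in __init__."""

    def __init__(self, attribute_filepath="attribute.xlsx"):
        # Path(...) builds a new object, so the stored value is not the
        # object that was passed in.
        self.attribute_filepath = Path(attribute_filepath)


class StoresAsGiven(BaseEstimator):
    """Hypothetical variant: stores the argument untouched, converts in fit."""

    def __init__(self, attribute_filepath="attribute.xlsx"):
        self.attribute_filepath = attribute_filepath

    def fit(self, X, y=None):
        # Derived state is computed here and gets a trailing underscore.
        self.attribute_filepath_ = Path(self.attribute_filepath)
        return self


def clone_error(estimator):
    """Return the message of the RuntimeError raised by clone, or None."""
    try:
        clone(estimator)
    except RuntimeError as exc:
        return str(exc)
    return None


failing = clone_error(ConvertsInInit("MyFile.xlsx"))
working = clone_error(StoresAsGiven("MyFile.xlsx"))
working_with_path = clone_error(StoresAsGiven(Path("MyFile.xlsx")))

print("converts in __init__:", failing)
print("stores as given:", working)
```

In this sketch only the first class makes clone raise; the second one clones whether it receives a str or a Path, because __init__ assigns the argument without touching it.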