Grid search error: ValueError: Expected
Source: 4-6 Grid Search and More Hyperparameters in the k-Nearest Neighbors Algorithm

哈哈笑笑9632300
2021-02-06
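I wrote a function that applies kNN to stock data, but it raises an error when running the grid search and I can't tell why. Could the instructor please take a look? The error message is as follows: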
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\externals\loky\process_executor.py", line 431, in _process_worker
r = call_item()
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\externals\loky\process_executor.py", line 285, in __call__
return self.fn(*self.args, **self.kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
return self.func(*args, **kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\parallel.py", line 262, in __call__
return [func(*args, **kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\parallel.py", line 262, in <listcomp>
return [func(*args, **kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 560, in _fit_and_score
test_scores = _score(estimator, X_test, y_test, scorer)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 607, in _score
scores = scorer(estimator, X_test, y_test)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 90, in __call__
score = scorer(estimator, *args, **kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 372, in _passthrough_scorer
return estimator.score(*args, **kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\base.py", line 499, in score
return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\neighbors\_classification.py", line 175, in predict
neigh_dist, neigh_ind = self.kneighbors(X)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\neighbors\_base.py", line 616, in kneighbors
raise ValueError(
ValueError: Expected n_neighbors <= n_samples, but n_samples = 35, n_neighbors = 36
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\18211\Desktop\股票\knn\myknn.py", line 34, in my_knn_gp
grid_search.fit(X_train_standard, y_train)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 72, in inner_f
return f(**kwargs)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 736, in fit
self._run_search(evaluate_candidates)
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 1188, in _run_search
evaluate_candidates(ParameterGrid(self.param_grid))
File "C:\Users\18211\anaconda3\lib\site-packages\sklearn\model_selection\_search.py", line 708, in evaluate_candidates
out = parallel(delayed(_fit_and_score)(clone(base_estimator),
self.retrieve()
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\parallel.py", line 940, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "C:\Users\18211\anaconda3\lib\site-packages\joblib\_parallel_backends.py", line 542, in wrap_future_result
return future.result(timeout=timeout)
File "C:\Users\18211\anaconda3\lib\concurrent\futures\_base.py", line 439, in result
return self.__get_result()
File "C:\Users\18211\anaconda3\lib\concurrent\futures\_base.py", line 388, in __get_result
raise self._exception
ValueError: Expected n_neighbors <= n_samples, but n_samples = 35, n_neighbors = 36
The function I wrote:
from cxgp.cxgp import cxgp, cxjg  # provides the training data set
from sklearn.model_selection import train_test_split  # split into training and test data
from sklearn.neighbors import KNeighborsClassifier  # classifier
from sklearn.model_selection import GridSearchCV  # find the best hyperparameters
from sklearn.preprocessing import StandardScaler  # mean/variance standardization

def my_knn_gp(gpdm, k=11, p=6):
    # Take a stock code and return that stock's best hyperparameters and accuracy
    t = cxgp(gpdm)
    T = cxjg(t)
    X = t[:-1]
    y = T[1:]
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    skl = StandardScaler()
    skl.fit(X_train)
    X_train_standard = skl.transform(X_train)
    X_test_standard = skl.transform(X_test)
    param_grid = [
        {
            'weights': ['uniform'],
            'n_neighbors': [i for i in range(1, k)]
        },
        {
            'weights': ['distance'],
            'n_neighbors': [i for i in range(1, k)],
            'p': [i for i in range(1, p)]
        }
    ]
    knn_clf = KNeighborsClassifier()
    grid_search = GridSearchCV(knn_clf, param_grid, n_jobs=-1, verbose=2)
    grid_search.fit(X_train_standard, y_train)
    sj = {
        "超参数": grid_search.best_params_,  # best hyperparameters
        "准确度": grid_search.best_score_,  # accuracy
        '测试数据x': X_test_standard,  # test data X
        '测试数据y': y_test  # test data y
    }
    return sj
1 answer
ValueError: Expected n_neighbors <= n_samples, but n_samples = 35, n_neighbors = 36
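This error means that k in kNN (n_neighbors) cannot be larger than the number of samples the model was fitted on. Here n_neighbors is 36, but only 35 samples are available. Keep in mind that GridSearchCV fits each candidate on the training portion of a cross-validation split, so n_samples is smaller than the full training set you pass to fit. The upper end of your n_neighbors range has to stay at or below that size.
Keep it up! :)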
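One way to avoid the error is to cap the n_neighbors range at the size of the smallest training fold. The sketch below is only an illustration under some assumptions: `max_safe_k` is a hypothetical helper, the data is synthetic rather than your stock data, and an explicit `KFold(n_splits=5)` replaces the default splitter so that the fold sizes are easy to compute by hand.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier


def max_safe_k(n_train, n_splits=5):
    """Largest n_neighbors that fits every training fold of a KFold split.

    Hypothetical helper, not part of scikit-learn.
    """
    # KFold's largest test fold holds ceil(n_train / n_splits) rows,
    # so the smallest training fold holds the remaining rows.
    largest_test_fold = -(-n_train // n_splits)  # ceiling division
    return n_train - largest_test_fold


# Synthetic stand-in for the standardized training data (an assumption).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(44, 4))
y_train = np.array([0, 1] * 22)

k = 40  # requested upper bound, too large for 44 rows split into 5 folds
k_cap = min(k - 1, max_safe_k(len(X_train), n_splits=5))
param_grid = {"n_neighbors": list(range(1, k_cap + 1))}

# Explicit KFold so the fold sizes match the arithmetic in max_safe_k.
cv = KFold(n_splits=5)
grid_search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=cv)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)
```

The 44-row synthetic set is chosen so that the smallest training fold has 35 rows, the same figure as n_samples in the error message. In your function the same cap could be applied to both 'n_neighbors' lists before building param_grid.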
2021-02-06