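A Detailed Look at the trainControl Function in R's caret Package
Updated: 2022-08-08 09:37:12 Author: 嘛里嘛里哄
This article explains the trainControl function from the R caret package. It walks through the function's source code, describes its parameters, and ends with a worked example.
trainControl parameters in detail
Source code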
# Source of caret::trainControl
trainControl <- function(method = "boot",
                         number = ifelse(grepl("cv", method), 10, 25),
                         repeats = ifelse(grepl("[d_]cv$", method), 1, NA),
                         p = 0.75,
                         search = "grid",
                         initialWindow = NULL,
                         horizon = 1,
                         fixedWindow = TRUE,
                         skip = 0,
                         verboseIter = FALSE,
                         returnData = TRUE,
                         returnResamp = "final",
                         savePredictions = FALSE,
                         classProbs = FALSE,
                         summaryFunction = defaultSummary,
                         selectionFunction = "best",
                         preProcOptions = list(thresh = 0.95, ICAcomp = 3, k = 5,
                                               freqCut = 95/5, uniqueCut = 10,
                                               cutoff = 0.9),
                         sampling = NULL,
                         index = NULL,
                         indexOut = NULL,
                         indexFinal = NULL,
                         timingSamps = 0,
                         predictionBounds = rep(FALSE, 2),
                         seeds = NA,
                         adaptive = list(min = 5, alpha = 0.05, method = "gls",
                                         complete = TRUE),
                         trim = FALSE,
                         allowParallel = TRUE) {
  # Argument validation
  if (is.null(selectionFunction))
    stop("null selectionFunction values not allowed")
  if (!(returnResamp %in% c("all", "final", "none")))
    stop("incorrect value of returnResamp")
  if (length(predictionBounds) > 0 && length(predictionBounds) != 2)
    stop("'predictionBounds' should be a logical or numeric vector of length 2")
  if (any(names(preProcOptions) == "method"))
    stop("'method' cannot be specified here")
  if (any(names(preProcOptions) == "x"))
    stop("'x' cannot be specified here")
  if (!is.na(repeats) & !(method %in% c("repeatedcv", "adaptive_cv")))
    warning("`repeats` has no meaning for this resampling method.",
            call. = FALSE)
  if (!(adaptive$method %in% c("gls", "BT")))
    stop("incorrect value of adaptive$method")
  if (adaptive$alpha < 1e-07 | adaptive$alpha > 1)
    stop("incorrect value of adaptive$alpha")
  if (grepl("adapt", method)) {
    num <- if (method == "adaptive_cv") number * repeats else number
    if (adaptive$min >= num)
      stop(paste("adaptive$min should be less than", num))
    if (adaptive$min <= 1)
      stop("adaptive$min should be greater than 1")
  }
  if (!(search %in% c("grid", "random")))
    stop("`search` should be either 'grid' or 'random'")
  if (method == "oob" & any(names(match.call()) == "summaryFunction")) {
    warning("Custom summary measures cannot be computed for out-of-bag resampling. ",
            "This value of `summaryFunction` will be ignored.",
            call. = FALSE)
  }

  # The control object is a plain named list of the (validated) arguments
  list(method = method, number = number, repeats = repeats, search = search,
       p = p, initialWindow = initialWindow, horizon = horizon,
       fixedWindow = fixedWindow, skip = skip, verboseIter = verboseIter,
       returnData = returnData, returnResamp = returnResamp,
       savePredictions = savePredictions, classProbs = classProbs,
       summaryFunction = summaryFunction, selectionFunction = selectionFunction,
       preProcOptions = preProcOptions, sampling = sampling, index = index,
       indexOut = indexOut, indexFinal = indexFinal, timingSamps = timingSamps,
       predictionBounds = predictionBounds, seeds = seeds, adaptive = adaptive,
       trim = trim, allowParallel = allowParallel)
}
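Two defaults in the signature, `number` and `repeats`, are computed from the method string. The sketch below evaluates those same two expressions in base R for a few method names. The helper name `default_resampling` is made up for illustration and is not part of caret.

```r
# Hypothetical helper (not part of caret): evaluates the two default
# expressions copied from the trainControl signature for a given method.
default_resampling <- function(method) {
  list(
    number  = ifelse(grepl("cv", method), 10, 25),
    repeats = ifelse(grepl("[d_]cv$", method), 1, NA)
  )
}

# Print the defaults that each method name would receive.
for (m in c("boot", "cv", "repeatedcv", "LOOCV", "adaptive_cv")) {
  d <- default_resampling(m)
  cat(m, ": number =", d$number, ", repeats =", d$repeats, "\n")
}
```

The pattern match is case-sensitive, so "LOOCV" does not match "cv" and receives 25 for `number`, while `repeats` defaults to 1 only for method names ending in "dcv" or "_cv".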
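Parameter details
Parameter | Description |
---|---|
method | Resampling method: "boot" (bootstrap, i.e. random sampling with replacement), "boot632" (the 0.632 bootstrap extension), "LOOCV" (leave-one-out cross-validation), "LGOCV" (leave-group-out, i.e. Monte Carlo, cross-validation), "cv" (k-fold cross-validation), "repeatedcv" (repeated k-fold cross-validation), "optimism_boot" (see Efron, B., & Tibshirani, R. J. (1994), "An Introduction to the Bootstrap", pages 249-252, CRC Press), "none" (no resampling; a single model is fitted to the entire training set), "oob" (out-of-bag estimates; only for random forest, bagged trees, bagged MARS, bagged flexible discriminant analysis, and conditional tree forest models) |
number | Number of folds for k-fold cross-validation, or number of resampling iterations for the bootstrap and LGOCV |
repeats | Number of repetitions for repeated k-fold cross-validation; per the source above, it has no meaning for methods other than "repeatedcv" and "adaptive_cv" |
p | For LGOCV: the proportion of the data used for training |
verboseIter | Logical; whether to print a training log |
returnData | Logical; whether to save the training data in the trainingData element of the object returned by train() (inspect it with str()) |
search | How tuning parameter candidates are generated: "grid" (grid search) or "random" (random search) |
returnResamp | Character string, one of "final", "all" or "none", setting how much of the resampled performance metrics is saved |
classProbs | Logical; whether to compute class probabilities |
summaryFunction | Function that computes model performance metrics from the resamples |
selectionFunction | Function that selects the optimal tuning parameters |
index | Specifies the resampling samples explicitly, as a list of training-row indices with one element per resample (so that different algorithms and models can be evaluated on identical resamples) |
allowParallel | Logical; whether to allow parallel processing when a parallel backend is registered |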
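To make the table concrete, here is a minimal sketch that combines several of these parameters. The outcome vector `y` is made-up data used only to build fold indices; `createFolds` and `twoClassSummary` are caret functions. The same `index` list can be passed to several train() calls so that different models are scored on identical resamples.

```r
library(caret)

# Made-up two-class outcome, used only to generate fold indices.
set.seed(1)
y <- factor(rep(c("M", "R"), each = 50))

# returnTrain = TRUE returns, for each fold, the rows used for fitting,
# which is the form the `index` argument expects.
folds <- createFolds(y, k = 5, returnTrain = TRUE)

ctrl <- trainControl(
  method = "cv",
  number = 5,
  index = folds,                     # fixed resamples, reusable across models
  classProbs = TRUE,                 # required by twoClassSummary
  summaryFunction = twoClassSummary, # ROC, sensitivity, specificity
  returnResamp = "all"               # keep metrics for every resample
)

# The control object is a plain list, so its fields can be inspected directly.
str(ctrl[c("method", "number", "returnResamp", "classProbs")])
```

Because trainControl only validates and stores its arguments, nothing is fitted here; the settings take effect when `ctrl` is passed to train() through `trControl`.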
Example
> library(mlbench)  # provides the Sonar data set
Warning message:
package 'mlbench' was built under R version 4.1.3
> data(Sonar)
> str(Sonar[, 1:10])
'data.frame': 208 obs. of 10 variables:
 $ V1 : num 0.02 0.0453 0.0262 0.01 0.0762 0.0286 0.0317 0.0519 0.0223 0.0164 ...
 $ V2 : num 0.0371 0.0523 0.0582 0.0171 0.0666 0.0453 0.0956 0.0548 0.0375 0.0173 ...
 $ V3 : num 0.0428 0.0843 0.1099 0.0623 0.0481 ...
 $ V4 : num 0.0207 0.0689 0.1083 0.0205 0.0394 ...
 $ V5 : num 0.0954 0.1183 0.0974 0.0205 0.059 ...
 $ V6 : num 0.0986 0.2583 0.228 0.0368 0.0649 ...
 $ V7 : num 0.154 0.216 0.243 0.11 0.121 ...
 $ V8 : num 0.16 0.348 0.377 0.128 0.247 ...
 $ V9 : num 0.3109 0.3337 0.5598 0.0598 0.3564 ...
 $ V10: num 0.211 0.287 0.619 0.126 0.446 ...
Data splitting:
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]  # training set
testing  <- Sonar[-inTraining, ]  # test set
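createDataPartition samples within each class of the outcome, so the class balance of the full data is roughly preserved in both subsets. The following sketch repeats the split above and checks the subset sizes and class proportions. It assumes the mlbench and caret packages are installed.

```r
library(mlbench)
library(caret)

data(Sonar)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]
testing  <- Sonar[-inTraining, ]

# Row counts of the two subsets
print(c(training = nrow(training), testing = nrow(testing)))

# Class proportions in the full data and in each subset
print(round(rbind(
  full     = prop.table(table(Sonar$Class)),
  training = prop.table(table(training$Class)),
  testing  = prop.table(table(testing$Class))
), 3))
```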
Model fitting:
fitControl <- trainControl(## 10-fold cross-validation
                           method = "repeatedcv",
                           number = 10,
                           ## repeated 10 times
                           repeats = 10)

set.seed(825)
gbmFit1 <- train(Class ~ ., data = training,
                 method = "gbm",  # boosted trees
                 trControl = fitControl,
                 verbose = FALSE)
gbmFit1

Stochastic Gradient Boosting

157 samples
 60 predictor
  2 classes: 'M', 'R'

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 141, 142, 141, 142, 141, 142, ...
Resampling results across tuning parameters:

  interaction.depth  n.trees  Accuracy   Kappa
  1                   50      0.7935784  0.5797839
  1                  100      0.8171078  0.6290208
  1                  150      0.8219608  0.6383173
  2                   50      0.8041912  0.6027771
  2                  100      0.8296176  0.6544713
  2                  150      0.8283627  0.6520181
  3                   50      0.8110343  0.6170317
  3                  100      0.8301275  0.6551379
  3                  150      0.8310343  0.6577252

Tuning parameter 'shrinkage' was held constant at a value of 0.1
Tuning parameter 'n.minobsinnode' was held constant at a value of 10
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were n.trees = 150, interaction.depth = 3, shrinkage = 0.1 and n.minobsinnode = 10.
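The resampled accuracy is an estimate obtained from the training set only; the held-out `testing` rows give an independent check. The sketch below is one way to do that and is not part of the original example. It assumes the mlbench, caret and gbm packages are installed, and it uses plain 5-fold cross-validation (the name `quickControl` is made up here) instead of 10 x 10 repeated cross-validation, purely to keep the run short.

```r
library(mlbench)
library(caret)

data(Sonar)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]
testing  <- Sonar[-inTraining, ]

# Assumption: a lighter resampling scheme than in the article, for speed.
quickControl <- trainControl(method = "cv", number = 5)

set.seed(825)
gbmFit <- train(Class ~ ., data = training,
                method = "gbm",
                trControl = quickControl,
                verbose = FALSE)

# Tuning values chosen by resampling
print(gbmFit$bestTune)

# Predict the held-out rows and compare with the true classes
testPred <- predict(gbmFit, newdata = testing)
print(confusionMatrix(data = testPred, reference = testing$Class))
```

Results will differ from the table above because the resampling scheme is different.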