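Peking University R Language Tutorial (Li Dongfeng), Chapter 42: Regularization and Penalized Regression

42.1 Introduction

This chapter uses the Hitters data set to demonstrate linear regression, variable selection, ridge regression, lasso regression, and hyperparameter tuning.

Consider the Hitters data set from the ISLR package. It contains 20 variables on 322 players; the variable of interest is Salary. The variable names are: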

library(tidyverse)
library(ISLR) # package accompanying the reference textbook
data(Hitters)
names(Hitters)
##  [1] "AtBat"     "Hits"      "HmRun"     "Runs"      "RBI"       "Walks"     "Years"     "CAtBat"    "CHits"     "CHmRun"    "CRuns"     "CRBI"      "CWalks"    "League"    "Division"  "PutOuts"   "Assists"   "Errors"    "Salary"    "NewLeague"

Details of the variables:

glimpse(Hitters)
## Rows: 322
## Columns: 20
## $ AtBat     <int> 293, 315, 479, 496, 321, 594, 185, 298, 323, 401, 574, 202, 418, 239, 196, 183, 568, 190, 407, 127, 413, 426, 22, 472, 629, 587, 324, 474, 550, 513, 313, 419, 517, 583, 204, 379, 161, 268, 346, 241, 181, 216, 200, 217, 194, 254, 416, 205, 542, 526, 457, 214, 19, 591, 403, 405, 244, 235, 313, 627, 416, 155, 236, 216, 24, 585, 191, 199, 521, 419, 311, 138, 512, 507, 529, 424, 351, 195, 388, 339, 561, 255, 677, 227, 614, 329, 637, 280, 155, 458, 314, 475, 317, 511, 278, 382, 565…
## $ Hits      <int> 66, 81, 130, 141, 87, 169, 37, 73, 81, 92, 159, 53, 113, 60, 43, 39, 158, 46, 104, 32, 92, 109, 10, 116, 168, 163, 73, 129, 152, 137, 84, 108, 141, 168, 49, 106, 36, 60, 98, 61, 41, 54, 57, 46, 40, 68, 132, 57, 140, 146, 101, 53, 7, 168, 101, 102, 58, 61, 78, 177, 113, 44, 56, 53, 3, 139, 37, 53, 142, 113, 81, 31, 131, 122, 137, 119, 97, 55, 103, 96, 118, 70, 238, 46, 163, 83, 174, 82, 41, 114, 83, 123, 78, 138, 69, 119, 148, 71, 115, 110, 151, 132, 49, 106, 114, 37, 95, 154,…
## $ HmRun     <int> 1, 7, 18, 20, 10, 4, 1, 0, 6, 17, 21, 4, 13, 0, 7, 3, 20, 2, 6, 8, 16, 3, 1, 16, 18, 4, 4, 10, 6, 20, 9, 6, 27, 17, 6, 10, 0, 5, 5, 1, 1, 0, 6, 7, 7, 2, 7, 8, 12, 13, 14, 2, 0, 19, 12, 18, 9, 3, 6, 25, 24, 6, 0, 1, 0, 31, 4, 5, 20, 1, 3, 8, 26, 29, 26, 6, 4, 5, 15, 4, 35, 7, 31, 7, 29, 9, 31, 16, 12, 13, 13, 27, 7, 25, 3, 13, 24, 2, 27, 15, 17, 9, 2, 16, 23, 8, 23, 22, 31, 4, 16, 16, 24, 31, 14, 34, 12, 14, 4, 3, 21, 16, 5, 11, 2, 16, 13, 5, 15, 21, 14, 10, 7, 1, 5, 4, 40, 6,…
## $ Runs      <int> 30, 24, 66, 65, 39, 74, 23, 24, 26, 49, 107, 31, 48, 30, 29, 20, 89, 24, 57, 16, 72, 55, 4, 60, 73, 92, 32, 50, 92, 90, 42, 55, 70, 83, 23, 38, 19, 24, 31, 34, 15, 21, 23, 32, 19, 28, 57, 34, 46, 71, 42, 30, 1, 80, 45, 49, 28, 24, 32, 98, 58, 21, 27, 31, 1, 93, 12, 29, 67, 44, 42, 18, 69, 78, 86, 57, 55, 24, 59, 37, 70, 49, 117, 23, 89, 50, 89, 44, 21, 67, 39, 76, 35, 76, 24, 54, 90, 27, 97, 70, 61, 69, 41, 48, 67, 15, 55, 76, 101, 19, 70, 33, 81, 91, 30, 91, 63, 45, 42, 30, …
## $ RBI       <int> 29, 38, 72, 78, 42, 51, 8, 24, 32, 66, 75, 26, 61, 11, 27, 15, 75, 8, 43, 22, 48, 43, 2, 62, 102, 51, 18, 56, 37, 95, 30, 36, 87, 80, 25, 60, 10, 25, 53, 12, 21, 18, 14, 19, 29, 26, 49, 32, 75, 70, 63, 29, 2, 72, 53, 85, 25, 39, 41, 81, 69, 23, 15, 15, 0, 94, 17, 22, 86, 27, 30, 21, 96, 85, 97, 46, 29, 33, 47, 29, 94, 35, 113, 20, 83, 39, 116, 45, 29, 57, 46, 93, 35, 96, 21, 58, 104, 29, 71, 47, 84, 47, 23, 56, 67, 19, 58, 84, 108, 18, 73, 52, 105, 101, 42, 108, 54, 47, 36, 4…
## $ Walks     <int> 14, 39, 76, 37, 30, 35, 21, 7, 8, 65, 59, 27, 47, 22, 30, 11, 73, 15, 65, 14, 65, 62, 1, 74, 40, 70, 22, 40, 81, 90, 39, 22, 52, 56, 12, 30, 17, 15, 30, 14, 33, 15, 14, 9, 30, 22, 33, 9, 41, 84, 22, 23, 1, 39, 39, 20, 35, 21, 12, 70, 16, 15, 11, 22, 2, 62, 14, 21, 45, 44, 26, 38, 52, 91, 97, 13, 39, 30, 39, 23, 33, 43, 53, 12, 75, 56, 56, 47, 22, 48, 16, 72, 32, 61, 29, 36, 77, 14, 68, 36, 78, 54, 18, 35, 53, 15, 37, 43, 41, 11, 80, 37, 62, 64, 24, 52, 30, 26, 66, 20, 60, 41,…
## $ Years     <int> 1, 14, 3, 11, 2, 11, 2, 3, 2, 13, 10, 9, 4, 6, 13, 3, 15, 5, 12, 8, 1, 1, 6, 6, 18, 6, 7, 10, 5, 14, 17, 3, 9, 5, 7, 14, 4, 2, 16, 1, 2, 18, 9, 4, 11, 6, 3, 5, 16, 6, 17, 2, 4, 9, 12, 6, 4, 14, 12, 6, 1, 16, 4, 4, 3, 17, 4, 3, 4, 12, 17, 3, 14, 18, 15, 9, 4, 8, 6, 4, 16, 15, 5, 5, 11, 9, 14, 2, 16, 4, 5, 4, 1, 3, 8, 12, 14, 15, 3, 7, 10, 2, 8, 10, 13, 6, 3, 14, 5, 1, 14, 5, 13, 3, 18, 6, 4, 16, 9, 8, 15, 20, 5, 5, 11, 13, 5, 8, 5, 7, 7, 5, 18, 4, 9, 3, 6, 15, 5, 2, 2, 4, 12, …
## $ CAtBat    <int> 293, 3449, 1624, 5628, 396, 4408, 214, 509, 341, 5206, 4631, 1876, 1512, 1941, 3231, 201, 8068, 479, 5233, 727, 413, 426, 84, 1924, 8424, 2695, 1931, 2331, 2308, 5201, 6890, 591, 3571, 1646, 1309, 6207, 1053, 350, 5913, 241, 232, 7318, 2516, 694, 4183, 999, 932, 756, 7099, 2648, 6521, 226, 41, 4478, 5150, 950, 1335, 3926, 3742, 3210, 416, 6631, 1115, 926, 159, 7546, 773, 514, 815, 4484, 8247, 244, 5347, 7761, 6661, 3651, 1258, 1313, 2174, 1064, 6677, 6311, 2223, 1325, 5017, 3…
## $ CHits     <int> 66, 835, 457, 1575, 101, 1133, 42, 108, 86, 1332, 1300, 467, 392, 510, 825, 42, 2273, 102, 1478, 180, 92, 109, 26, 489, 2464, 747, 491, 604, 633, 1382, 1833, 149, 994, 452, 308, 1906, 244, 78, 1615, 61, 50, 1926, 684, 160, 1069, 236, 273, 192, 2130, 715, 1767, 59, 13, 1307, 1429, 231, 333, 1029, 968, 927, 113, 1634, 270, 210, 28, 1982, 163, 120, 205, 1231, 2198, 53, 1397, 1947, 1785, 1046, 353, 338, 555, 290, 1575, 1661, 737, 324, 1388, 948, 2024, 113, 1338, 298, 405, 471, 78…
## $ CHmRun    <int> 1, 69, 63, 225, 12, 19, 1, 0, 6, 253, 90, 15, 41, 4, 36, 3, 177, 5, 100, 24, 16, 3, 2, 67, 164, 17, 13, 61, 32, 166, 224, 8, 215, 44, 27, 146, 3, 5, 235, 1, 4, 46, 46, 32, 64, 21, 24, 32, 235, 77, 281, 2, 1, 113, 166, 29, 49, 35, 35, 133, 24, 98, 1, 9, 0, 315, 16, 8, 22, 32, 100, 12, 221, 347, 291, 32, 16, 25, 80, 11, 442, 154, 93, 44, 266, 145, 247, 25, 181, 28, 28, 108, 7, 28, 32, 41, 305, 60, 45, 38, 275, 14, 7, 86, 241, 36, 31, 131, 92, 4, 209, 71, 271, 53, 348, 107, 14, …
## $ CRuns     <int> 30, 321, 224, 828, 48, 501, 30, 41, 32, 784, 702, 192, 205, 309, 376, 20, 1045, 65, 643, 67, 72, 55, 9, 242, 1008, 442, 291, 246, 349, 763, 1033, 80, 545, 219, 126, 859, 156, 34, 784, 34, 20, 796, 371, 86, 486, 108, 113, 117, 987, 352, 1003, 32, 3, 634, 747, 99, 164, 441, 409, 529, 58, 698, 116, 118, 20, 1141, 61, 57, 99, 612, 950, 33, 712, 1175, 1082, 461, 196, 144, 285, 123, 901, 1019, 349, 156, 813, 575, 978, 61, 746, 160, 156, 292, 35, 87, 258, 287, 1135, 753, 156, 335, 8…
## $ CRBI      <int> 29, 414, 266, 838, 46, 336, 9, 37, 34, 890, 504, 186, 204, 103, 290, 16, 993, 23, 658, 82, 48, 43, 9, 251, 1072, 198, 108, 327, 182, 734, 864, 46, 652, 208, 132, 803, 86, 29, 901, 12, 29, 627, 230, 76, 493, 117, 121, 107, 1089, 342, 977, 32, 4, 563, 666, 138, 179, 401, 321, 472, 69, 661, 64, 69, 12, 1179, 74, 40, 103, 344, 909, 32, 815, 1152, 949, 301, 110, 149, 274, 108, 1210, 608, 401, 158, 822, 528, 1093, 70, 805, 123, 159, 343, 35, 110, 192, 294, 1234, 596, 119, 174, 1015…
## $ CWalks    <int> 14, 375, 263, 354, 33, 194, 24, 12, 8, 866, 488, 161, 203, 207, 238, 11, 732, 39, 653, 56, 65, 62, 3, 240, 402, 317, 180, 166, 308, 784, 1087, 31, 337, 136, 66, 571, 107, 18, 560, 14, 45, 483, 195, 32, 608, 118, 80, 51, 431, 289, 619, 27, 4, 319, 526, 64, 194, 333, 170, 313, 16, 777, 57, 114, 9, 727, 52, 39, 78, 422, 690, 55, 548, 1380, 989, 112, 117, 153, 186, 55, 608, 820, 171, 67, 617, 635, 495, 63, 875, 122, 76, 267, 32, 71, 162, 227, 791, 259, 99, 258, 709, 90, 106, 248,…
## $ League    <fct> A, N, A, N, N, A, N, A, N, A, A, N, N, A, N, A, N, A, A, N, N, A, A, N, A, A, N, N, N, A, A, N, N, A, A, N, A, N, A, N, A, N, N, A, A, A, N, A, A, N, A, N, A, A, A, N, N, A, N, A, A, N, A, N, A, A, N, A, A, A, N, N, A, A, A, A, N, N, A, A, A, N, A, A, N, A, N, A, A, A, A, N, A, A, N, N, A, N, N, N, A, A, A, A, A, A, N, A, A, A, A, N, N, N, N, A, A, A, N, A, N, N, A, N, N, A, A, A, N, A, N, N, A, A, N, N, A, A, A, A, A, A, N, N, A, N, A, A, A, A, A, A, N, N, N, A, N, N, A, A, …
## $ Division  <fct> E, W, W, E, E, W, E, W, W, E, E, W, E, E, E, W, W, W, W, W, E, W, W, W, E, E, E, W, W, W, W, W, W, E, W, W, E, W, E, W, E, W, W, E, E, E, W, E, E, W, W, E, E, W, E, W, W, E, W, E, E, E, W, W, W, E, E, W, E, E, W, E, W, E, E, E, W, E, W, W, W, E, E, W, W, W, W, E, W, W, W, E, E, W, W, W, E, W, W, W, E, E, E, E, E, E, W, W, E, E, W, W, E, W, E, W, W, W, W, E, E, W, W, E, W, W, W, W, E, W, E, E, W, W, W, E, E, E, E, W, W, E, E, W, W, E, E, E, E, E, W, W, W, W, E, W, E, E, W, W, …
## $ PutOuts   <int> 446, 632, 880, 200, 805, 282, 76, 121, 143, 0, 238, 304, 211, 121, 80, 118, 105, 102, 912, 202, 280, 361, 812, 518, 1067, 434, 222, 732, 262, 267, 127, 226, 1378, 109, 419, 72, 70, 442, 0, 166, 326, 103, 69, 307, 325, 359, 73, 58, 697, 303, 389, 109, 0, 67, 316, 161, 142, 425, 106, 240, 203, 53, 125, 73, 80, 0, 391, 152, 107, 211, 153, 244, 119, 808, 280, 224, 226, 83, 182, 104, 463, 51, 1377, 92, 303, 276, 278, 148, 165, 246, 533, 226, 45, 157, 142, 59, 292, 360, 274, 292, 1…
## $ Assists   <int> 33, 43, 82, 11, 40, 421, 127, 283, 290, 0, 445, 45, 11, 151, 45, 0, 290, 177, 88, 22, 9, 22, 84, 55, 157, 9, 3, 83, 329, 5, 221, 7, 102, 292, 46, 170, 149, 59, 0, 172, 29, 84, 1, 25, 22, 30, 177, 4, 61, 9, 39, 7, 0, 147, 6, 10, 14, 43, 206, 482, 70, 88, 199, 152, 4, 0, 38, 3, 242, 2, 223, 21, 216, 108, 10, 286, 7, 2, 9, 213, 32, 54, 100, 2, 6, 6, 9, 4, 9, 389, 40, 10, 122, 7, 210, 156, 9, 32, 2, 6, 88, 327, 132, 41, 2, 115, 10, 439, 17, 5, 218, 87, 62, 111, 4, 334, 377, 6, 48…
## $ Errors    <int> 20, 10, 14, 3, 4, 25, 7, 9, 19, 0, 22, 11, 7, 6, 8, 0, 10, 16, 9, 2, 5, 2, 11, 3, 14, 3, 3, 13, 16, 3, 7, 4, 8, 25, 5, 24, 12, 6, 0, 10, 5, 5, 1, 1, 2, 4, 18, 4, 9, 9, 4, 3, 0, 4, 5, 3, 2, 4, 7, 13, 10, 3, 13, 11, 0, 0, 8, 5, 23, 1, 10, 4, 12, 2, 5, 8, 3, 1, 4, 9, 8, 8, 6, 2, 6, 2, 9, 2, 1, 18, 4, 6, 26, 8, 10, 9, 5, 5, 7, 3, 13, 20, 10, 7, 4, 15, 7, 10, 10, 12, 16, 3, 8, 11, 4, 21, 26, 5, 19, 12, 9, 16, 7, 4, 20, 0, 5, 1, 4, 5, 15, 20, 0, 16, 1, 2, 3, 13, 5, 9, 14, 6, 3, 4, …
## $ Salary    <dbl> NA, 475.000, 480.000, 500.000, 91.500, 750.000, 70.000, 100.000, 75.000, 1100.000, 517.143, 512.500, 550.000, 700.000, 240.000, NA, 775.000, 175.000, NA, 135.000, 100.000, 115.000, NA, 600.000, 776.667, 765.000, 708.333, 750.000, 625.000, 900.000, NA, 110.000, NA, 612.500, 300.000, 850.000, NA, 90.000, NA, NA, 67.500, NA, NA, 180.000, NA, 305.000, 215.000, 247.500, NA, 815.000, 875.000, 70.000, NA, 1200.000, 675.000, 415.000, 340.000, NA, 416.667, 1350.000, 90.000, 275.000, 2…
## $ NewLeague <fct> A, N, A, N, N, A, A, A, N, A, A, N, N, A, N, A, N, A, A, N, N, N, A, N, A, A, N, N, N, A, A, N, N, A, A, N, A, N, A, N, A, N, N, A, A, A, N, A, A, N, A, N, A, A, A, N, N, A, N, A, A, N, A, N, A, A, N, A, A, A, N, N, A, A, A, N, A, N, A, A, A, N, A, A, N, A, N, A, A, A, A, N, A, A, N, N, A, N, N, N, A, A, A, A, A, A, N, A, A, A, A, A, N, N, N, A, A, A, N, A, N, N, A, A, N, A, A, A, N, A, N, N, A, A, N, N, A, A, N, N, A, A, N, N, A, N, A, A, A, A, A, A, N, N, N, A, N, N, A, A, …

Salary will be the response, so first count its missing values:

sum( is.na(Hitters$Salary) )
## [1] 59

For simplicity, drop the observations with missing values:

da_hit <- na.omit(Hitters); dim(da_hit)
## [1] 263  20

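42.2 Splitting into training and test sets

initial_split() from the rsample package randomly splits a data set into two parts, the training set and the test set. The prop argument sets the training proportion, and strata names the variable on which stratified sampling is based. Stratifying on the response makes both sets more representative of its distribution.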

library(rsample)
set.seed(101)
hit_split <- initial_split(
  da_hit, prop = 0.80, strata = Salary)
hit_train <- training(hit_split)
hit_test <- testing(hit_split)
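The idea behind a stratified split can be sketched in base R. This is only an illustration on simulated data: the quartile binning, the data frame d, and the variable names are assumptions made here, not a description of how rsample is implemented.

```r
set.seed(101)
# Simulated, right-skewed response standing in for Salary (not the Hitters data)
d <- data.frame(id = 1:200, y = rexp(200, rate = 1 / 500))

# Bin the response into quartiles, then sample 80% of the rows inside each bin
bins <- cut(d$y, breaks = quantile(d$y, probs = seq(0, 1, by = 0.25)),
            include.lowest = TRUE)
train_idx <- unlist(
  lapply(split(seq_len(nrow(d)), bins),
         function(ix) ix[sample.int(length(ix), size = round(0.8 * length(ix)))]),
  use.names = FALSE)

train <- d[train_idx, ]
test  <- d[-train_idx, ]

# Share of each quartile bin that went to the training set
table(bins[train_idx]) / table(bins)
```

Because the sampling is done inside each bin, low, middle, and high values of the response appear in the training and test sets in the same proportions.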

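42.3 Variable selection

42.3.1 Best subset selection

The regsubsets() function in the leaps package performs best subset regression. For each candidate number of predictors $\hat{p}$, it finds the subset of that size with the smallest residual sum of squares, so the final choice only has to be made among these per-size best subsets. That choice can use a criterion such as AIC or BIC.

First run the search with all predictors allowed to enter: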

library(leaps)  # provides regsubsets()
regfit.full <- regsubsets(
  Salary ~ ., data=hit_train, nvmax=19)
reg.summary <- summary(regfit.full)
reg.summary
## Subset selection object
## Call: regsubsets.formula(Salary ~ ., data = hit_train, nvmax = 19)
## 19 Variables  (and intercept)
##            Forced in Forced out
## AtBat          FALSE      FALSE
## Hits           FALSE      FALSE
## HmRun          FALSE      FALSE
## Runs           FALSE      FALSE
## RBI            FALSE      FALSE
## Walks          FALSE      FALSE
## Years          FALSE      FALSE
## CAtBat         FALSE      FALSE
## CHits          FALSE      FALSE
## CHmRun         FALSE      FALSE
## CRuns          FALSE      FALSE
## CRBI           FALSE      FALSE
## CWalks         FALSE      FALSE
## LeagueN        FALSE      FALSE
## DivisionW      FALSE      FALSE
## PutOuts        FALSE      FALSE
## Assists        FALSE      FALSE
## Errors         FALSE      FALSE
## NewLeagueN     FALSE      FALSE
## 1 subsets of each size up to 19
## Selection Algorithm: exhaustive
##           AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun CRuns CRBI CWalks LeagueN DivisionW PutOuts Assists Errors NewLeagueN
## 1  ( 1 )  " "   " "  " "   " "  " " " "   " "   " "    " "   " "    " "   "*"  " "    " "     " "       " "     " "     " "    " "       
## 2  ( 1 )  " "   "*"  " "   " "  " " " "   " "   " "    " "   " "    " "   "*"  " "    " "     " "       " "     " "     " "    " "       
## 3  ( 1 )  " "   "*"  " "   " "  " " " "   " "   " "    " "   " "    " "   "*"  " "    " "     "*"       " "     " "     " "    " "       
## 4  ( 1 )  " "   "*"  " "   " "  " " " "   " "   " "    " "   " "    " "   "*"  " "    " "     "*"       "*"     " "     " "    " "       
## 5  ( 1 )  "*"   "*"  " "   " "  " " " "   " "   " "    " "   " "    " "   "*"  " "    " "     "*"       "*"     " "     " "    " "       
## 6  ( 1 )  "*"   "*"  " "   " "  " " "*"   " "   " "    " "   " "    " "   "*"  " "    " "     "*"       "*"     " "     " "    " "       
## 7  ( 1 )  "*"   "*"  " "   " "  " " "*"   " "   " "    " "   " "    " "   "*"  "*"    " "     "*"       "*"     " "     " "    " "       
## 8  ( 1 )  "*"   "*"  " "   " "  " " "*"   " "   " "    " "   " "    "*"   "*"  "*"    " "     "*"       "*"     " "     " "    " "       
## 9  ( 1 )  "*"   "*"  " "   " "  " " "*"   " "   " "    " "   "*"    "*"   " "  "*"    " "     "*"       "*"     "*"     " "    " "       
## 10  ( 1 ) "*"   "*"  " "   " "  " " "*"   " "   "*"    " "   " "    "*"   "*"  "*"    " "     "*"       "*"     "*"     " "    " "       
## 11  ( 1 ) "*"   "*"  " "   " "  " " "*"   " "   "*"    " "   " "    "*"   "*"  "*"    "*"     "*"       "*"     "*"     " "    " "       
## 12  ( 1 ) "*"   "*"  " "   "*"  " " "*"   " "   "*"    " "   " "    "*"   "*"  "*"    "*"     "*"       "*"     "*"     " "    " "       
## 13  ( 1 ) "*"   "*"  " "   "*"  " " "*"   "*"   "*"    " "   " "    "*"   "*"  "*"    "*"     "*"       "*"     "*"     " "    " "       
## 14  ( 1 ) "*"   "*"  " "   "*"  "*" "*"   "*"   "*"    " "   " "    "*"   "*"  "*"    "*"     "*"       "*"     "*"     " "    " "       
## 15  ( 1 ) "*"   "*"  " "   "*"  "*" "*"   "*"   "*"    " "   " "    "*"   "*"  "*"    "*"     "*"       "*"     "*"     "*"    " "       
## 16  ( 1 ) "*"   "*"  " "   "*"  "*" "*"   "*"   "*"    "*"   "*"    "*"   "*"  "*"    "*"     "*"       "*"     "*"     " "    " "       
## 17  ( 1 ) "*"   "*"  " "   "*"  "*" "*"   "*"   "*"    "*"   "*"    "*"   "*"  "*"    "*"     "*"       "*"     "*"     "*"    " "       
## 18  ( 1 ) "*"   "*"  " "   "*"  "*" "*"   "*"   "*"    "*"   "*"    "*"   "*"  "*"    "*"     "*"       "*"     "*"     "*"    "*"       
## 19  ( 1 ) "*"   "*"  "*"   "*"  "*" "*"   "*"   "*"    "*"   "*"    "*"   "*"  "*"    "*"     "*"       "*"     "*"     "*"    "*"

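Here nvmax=19 allows all predictors to take part; by default regsubsets() limits the maximum subset size (nvmax = 8). Each row of the table above gives the best subset for a fixed $\hat{p}$.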
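The exhaustive search can be sketched in base R for a small problem. This is a minimal illustration on simulated data with 4 predictors; the names d, best_by_size, and rss_path are made up here, and regsubsets() uses a far more efficient search than this brute-force loop.

```r
set.seed(1)
n <- 60
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
d <- data.frame(y = 2 * X[, "x1"] - 1.5 * X[, "x3"] + rnorm(n), X)

# For each subset size k, fit every subset of k predictors and keep the one
# with the smallest residual sum of squares
best_by_size <- lapply(1:4, function(k) {
  cands <- combn(colnames(X), k, simplify = FALSE)
  rss <- vapply(cands, function(v) {
    fit <- lm(reformulate(v, response = "y"), data = d)
    sum(resid(fit)^2)
  }, numeric(1))
  list(size = k, vars = cands[[which.min(rss)]], rss = min(rss))
})

# RSS of the best model of each size; it can only go down as size grows,
# which is why a criterion such as BIC is needed to compare sizes
rss_path <- vapply(best_by_size, function(b) b$rss, numeric(1))
rss_path
```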

Compare the BIC values of these best models:

reg.summary$bic
##  [1] -63.90242 -86.59469 -90.68877 -93.51559 -96.29865 -96.35699 -95.24328 -94.33547 -91.79438 -89.31463 -85.07463 -80.40798 -75.33025 -70.12122 -64.82873 -59.53306 -54.25553 -48.92352 -43.58870
plot(reg.summary$bic)

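Figure 42.1: BIC of the best subset regressions for the Hitters data

The BIC values at $\hat{p} = 5$ and $\hat{p} = 6$ are close and both low; we take $\hat{p} = 6$, which attains the minimum. Calling coef() with id=6 extracts the model of size six: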

coef(regfit.full, id=6)
##  (Intercept)        AtBat         Hits        Walks         CRBI    DivisionW      PutOuts 
##  149.0951521   -2.1064928    8.2070703    3.2517011    0.6351933 -136.2935330    0.2646021

This selects the predictor subset with the smallest BIC, which has 6 predictors.
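The choice of size can be checked directly from the printed BIC values. The sketch below copies the numbers from the reg.summary$bic output above; the names bic, best_size, and gap_5_vs_6 are introduced here for illustration.

```r
# BIC values copied from the reg.summary$bic output above (subset sizes 1 to 19)
bic <- c(-63.90242, -86.59469, -90.68877, -93.51559, -96.29865, -96.35699,
         -95.24328, -94.33547, -91.79438, -89.31463, -85.07463, -80.40798,
         -75.33025, -70.12122, -64.82873, -59.53306, -54.25553, -48.92352,
         -43.58870)

best_size  <- which.min(bic)    # subset size with the smallest BIC
gap_5_vs_6 <- bic[5] - bic[6]   # how much worse size 5 is than size 6
best_size   # 6
```

The gap between sizes 5 and 6 is under 0.06, so the more parsimonious 5-variable model would be an equally defensible choice.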

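42.3.2 Stepwise regression

After fitting the full model with lm(), pass the fit to stats::step() to run stepwise regression. With no scope argument, step() performs backward elimination using AIC. For example: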

lm.full <- lm(Salary ~ ., data = hit_train)
print(summary(lm.full))
## 
## Call:
## lm(formula = Salary ~ ., data = hit_train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -918.96 -183.16  -35.62  138.30 1799.45 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  241.67291  109.57064   2.206  0.02862 * 
## AtBat         -2.48494    0.76899  -3.231  0.00145 **
## Hits           8.15485    2.84403   2.867  0.00461 **
## HmRun         -0.37929    7.64779  -0.050  0.96050   
## Runs          -2.12109    3.59273  -0.590  0.55564   
## RBI            0.76668    3.11770   0.246  0.80602   
## Walks          6.27568    2.18144   2.877  0.00448 **
## Years         -7.18987   15.10209  -0.476  0.63457   
## CAtBat        -0.14891    0.16372  -0.909  0.36425   
## CHits          0.23486    0.78151   0.301  0.76411   
## CHmRun         0.50158    1.97716   0.254  0.80002   
## CRuns          1.11476    0.92330   1.207  0.22881   
## CRBI           0.70183    0.84282   0.833  0.40606   
## CWalks        -0.83644    0.37968  -2.203  0.02881 * 
## LeagueN       47.02170   94.26262   0.499  0.61848   
## DivisionW   -120.60207   48.51038  -2.486  0.01379 * 
## PutOuts        0.26292    0.09121   2.883  0.00440 **
## Assists        0.38272    0.26915   1.422  0.15670   
## Errors        -1.28251    5.36074  -0.239  0.81118   
## NewLeagueN    -7.16809   94.61668  -0.076  0.93969   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 336.8 on 188 degrees of freedom
## Multiple R-squared:  0.5146, Adjusted R-squared:  0.4655 
## F-statistic: 10.49 on 19 and 188 DF,  p-value: < 2.2e-16
stats::step(lm.full)
## Start:  AIC=2439.89
## Salary ~ AtBat + Hits + HmRun + Runs + RBI + Walks + Years + 
##     CAtBat + CHits + CHmRun + CRuns + CRBI + CWalks + League + 
##     Division + PutOuts + Assists + Errors + NewLeague
## 
##             Df Sum of Sq      RSS    AIC
## - HmRun      1       279 21327132 2437.9
## - NewLeague  1       651 21327504 2437.9
## - Errors     1      6493 21333346 2437.9
## - RBI        1      6860 21333713 2438.0
## - CHmRun     1      7301 21334153 2438.0
## - CHits      1     10245 21337098 2438.0
## - Years      1     25712 21352565 2438.1
## - League     1     28228 21355081 2438.2
## - Runs       1     39540 21366393 2438.3
## - CRBI       1     78662 21405515 2438.7
## - CAtBat     1     93836 21420689 2438.8
## - CRuns      1    165367 21492220 2439.5
## <none>                   21326853 2439.9
## - Assists    1    229372 21556225 2440.1
## - CWalks     1    550572 21877425 2443.2
## - Division   1    701147 22028000 2444.6
## - Hits       1    932679 22259532 2446.8
## - Walks      1    938864 22265716 2446.8
## - PutOuts    1    942588 22269441 2446.9
## - AtBat      1   1184571 22511424 2449.1
## 
## Step:  AIC=2437.89
## Salary ~ AtBat + Hits + Runs + RBI + Walks + Years + CAtBat + 
##     CHits + CHmRun + CRuns + CRBI + CWalks + League + Division + 
##     PutOuts + Assists + Errors + NewLeague
## 
##             Df Sum of Sq      RSS    AIC
## - NewLeague  1       566 21327698 2435.9
## - Errors     1      6443 21333575 2436.0
## - CHmRun     1      7539 21334671 2436.0
## - CHits      1      9986 21337118 2436.0
## - RBI        1     12495 21339627 2436.0
## - Years      1     25478 21352610 2436.1
## - League     1     27950 21355082 2436.2
## - Runs       1     53429 21380561 2436.4
## - CAtBat     1     94340 21421471 2436.8
## - CRBI       1     96689 21423821 2436.8
## - CRuns      1    185367 21512499 2437.7
## <none>                   21327132 2437.9
## - Assists    1    235593 21562725 2438.2
## - CWalks     1    575407 21902539 2441.4
## - Division   1    720408 22047540 2442.8
## - PutOuts    1    947076 22274208 2444.9
## - Walks      1   1002501 22329633 2445.4
## - Hits       1   1073306 22400438 2446.1
## - AtBat      1   1185325 22512457 2447.1
## 
## Step:  AIC=2435.9
## Salary ~ AtBat + Hits + Runs + RBI + Walks + Years + CAtBat + 
##     CHits + CHmRun + CRuns + CRBI + CWalks + League + Division + 
##     PutOuts + Assists + Errors
## 
##            Df Sum of Sq      RSS    AIC
## - Errors    1      6155 21333853 2434.0
## - CHmRun    1      7339 21335037 2434.0
## - CHits     1      9541 21337239 2434.0
## - RBI       1     12817 21340515 2434.0
## - Years     1     25398 21353097 2434.2
## - Runs      1     53335 21381033 2434.4
## - League    1     75071 21402769 2434.6
## - CAtBat    1     93812 21421510 2434.8
## - CRBI      1     98282 21425981 2434.9
## - CRuns     1    190610 21518308 2435.8
## <none>                  21327698 2435.9
## - Assists   1    236010 21563708 2436.2
## - CWalks    1    577288 21904986 2439.4
## - Division  1    720061 22047759 2440.8
## - PutOuts   1    948064 22275762 2442.9
## - Walks     1   1003786 22331484 2443.5
## - Hits      1   1091940 22419639 2444.3
## - AtBat     1   1223590 22551289 2445.5
## 
## Step:  AIC=2433.96
## Salary ~ AtBat + Hits + Runs + RBI + Walks + Years + CAtBat + 
##     CHits + CHmRun + CRuns + CRBI + CWalks + League + Division + 
##     PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## - CHmRun    1      6724 21340577 2432.0
## - CHits     1      7824 21341677 2432.0
## - RBI       1     11220 21345072 2432.1
## - Years     1     24104 21357956 2432.2
## - Runs      1     57526 21391379 2432.5
## - League    1     70922 21404775 2432.7
## - CAtBat    1     90644 21424497 2432.8
## - CRBI      1    100984 21434837 2432.9
## - CRuns     1    201382 21535235 2433.9
## <none>                  21333853 2434.0
## - Assists   1    313674 21647527 2435.0
## - CWalks    1    593539 21927392 2437.7
## - Division  1    722945 22056798 2438.9
## - PutOuts   1    942739 22276592 2440.9
## - Walks     1   1040700 22374553 2441.9
## - Hits      1   1161864 22495717 2443.0
## - AtBat     1   1281359 22615212 2444.1
## 
## Step:  AIC=2432.03
## Salary ~ AtBat + Hits + Runs + RBI + Walks + Years + CAtBat + 
##     CHits + CRuns + CRBI + CWalks + League + Division + PutOuts + 
##     Assists
## 
##            Df Sum of Sq      RSS    AIC
## - CHits     1      2192 21342770 2430.1
## - RBI       1     12586 21353163 2430.2
## - Years     1     24971 21365548 2430.3
## - Runs      1     63054 21403631 2430.6
## - League    1     71042 21411619 2430.7
## - CAtBat    1     86281 21426858 2430.9
## <none>                  21340577 2432.0
## - Assists   1    306971 21647548 2433.0
## - CRuns     1    433335 21773912 2434.2
## - CWalks    1    631568 21972145 2436.1
## - Division  1    716579 22057157 2436.9
## - PutOuts   1    954537 22295114 2439.1
## - CRBI      1   1001899 22342476 2439.6
## - Walks     1   1036407 22376984 2439.9
## - Hits      1   1187105 22527683 2441.3
## - AtBat     1   1283747 22624325 2442.2
## 
## Step:  AIC=2430.05
## Salary ~ AtBat + Hits + Runs + RBI + Walks + Years + CAtBat + 
##     CRuns + CRBI + CWalks + League + Division + PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## - RBI       1     13190 21355960 2428.2
## - Years     1     29638 21372407 2428.3
## - League    1     72742 21415512 2428.8
## - Runs      1     81521 21424290 2428.8
## <none>                  21342770 2430.1
## - CAtBat    1    230265 21573034 2430.3
## - Assists   1    307170 21649939 2431.0
## - CRuns     1    713710 22056479 2434.9
## - Division  1    715586 22058356 2434.9
## - CWalks    1    929774 22272544 2436.9
## - PutOuts   1    978714 22321484 2437.4
## - CRBI      1   1002770 22345540 2437.6
## - Walks     1   1086910 22429680 2438.4
## - AtBat     1   1599684 22942453 2443.1
## - Hits      1   1779918 23122687 2444.7
## 
## Step:  AIC=2428.18
## Salary ~ AtBat + Hits + Runs + Walks + Years + CAtBat + CRuns + 
##     CRBI + CWalks + League + Division + PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## - Years     1     26692 21382651 2426.4
## - League    1     70307 21426266 2426.9
## - Runs      1     73753 21429713 2426.9
## <none>                  21355960 2428.2
## - CAtBat    1    249406 21605365 2428.6
## - Assists   1    295538 21651497 2429.0
## - CRuns     1    702284 22058244 2432.9
## - Division  1    734085 22090044 2433.2
## - CWalks    1    937348 22293308 2435.1
## - PutOuts   1   1002301 22358261 2435.7
## - Walks     1   1086003 22441962 2436.5
## - CRBI      1   1439193 22795152 2439.7
## - AtBat     1   1640165 22996124 2441.6
## - Hits      1   1787801 23143761 2442.9
## 
## Step:  AIC=2426.43
## Salary ~ AtBat + Hits + Runs + Walks + CAtBat + CRuns + CRBI + 
##     CWalks + League + Division + PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## - Runs      1     69079 21451730 2425.1
## - League    1     87548 21470199 2425.3
## <none>                  21382651 2426.4
## - Assists   1    314039 21696690 2427.5
## - CAtBat    1    492567 21875218 2429.2
## - Division  1    725175 22107827 2431.4
## - CRuns     1    880113 22262764 2432.8
## - CWalks    1    988001 22370652 2433.8
## - PutOuts   1   1049648 22432299 2434.4
## - Walks     1   1079896 22462547 2434.7
## - CRBI      1   1420036 22802687 2437.8
## - AtBat     1   1614330 22996981 2439.6
## - Hits      1   1772982 23155633 2441.0
## 
## Step:  AIC=2425.11
## Salary ~ AtBat + Hits + Walks + CAtBat + CRuns + CRBI + CWalks + 
##     League + Division + PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## - League    1    113492 21565223 2424.2
## <none>                  21451730 2425.1
## - Assists   1    399827 21851557 2426.9
## - CAtBat    1    428452 21880182 2427.2
## - Division  1    727359 22179089 2430.0
## - CRuns     1    811308 22263038 2430.8
## - CWalks    1    947776 22399506 2432.1
## - Walks     1   1029714 22481444 2432.9
## - PutOuts   1   1153252 22604982 2434.0
## - CRBI      1   1434607 22886337 2436.6
## - AtBat     1   1793723 23245454 2439.8
## - Hits      1   1825947 23277677 2440.1
## 
## Step:  AIC=2424.2
## Salary ~ AtBat + Hits + Walks + CAtBat + CRuns + CRBI + CWalks + 
##     Division + PutOuts + Assists
## 
##            Df Sum of Sq      RSS    AIC
## <none>                  21565223 2424.2
## - CAtBat    1    366456 21931678 2425.7
## - Assists   1    423017 21988240 2426.2
## - CRuns     1    756041 22321264 2429.4
## - Division  1    762166 22327389 2429.4
## - CWalks    1    998625 22563847 2431.6
## - Walks     1   1124976 22690198 2432.8
## - PutOuts   1   1245275 22810497 2433.9
## - CRBI      1   1393594 22958817 2435.2
## - Hits      1   1785448 23350671 2438.8
## - AtBat     1   1830070 23395292 2439.2
## 
## Call:
## lm(formula = Salary ~ AtBat + Hits + Walks + CAtBat + CRuns + 
##     CRBI + CWalks + Division + PutOuts + Assists, data = hit_train)
## 
## Coefficients:
## (Intercept)        AtBat         Hits        Walks       CAtBat        CRuns         CRBI       CWalks    DivisionW      PutOuts      Assists  
##    235.9278      -2.5863       7.7364       5.9827      -0.1210       1.2468       0.9302      -0.9100    -123.4092       0.2893       0.3770

The final model keeps 10 predictors.
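The same call pattern can be tried on a built-in data set. The sketch below uses mtcars as a stand-in for hit_train (an assumption for illustration only) and also shows the k argument of step(), which changes the penalty per parameter; k = log(n) gives a BIC-type criterion.

```r
full <- lm(mpg ~ ., data = mtcars)   # built-in data, standing in for hit_train
n <- nrow(mtcars)

sel_aic <- step(full, trace = 0)               # default penalty k = 2 (AIC)
sel_bic <- step(full, trace = 0, k = log(n))   # k = log(n): BIC-type penalty

length(coef(sel_aic)) - 1   # number of predictors kept under AIC
length(coef(sel_bic)) - 1   # number kept under the heavier penalty
```

Setting trace = 0 suppresses the step-by-step tables, which is convenient when only the final model is needed.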

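42.3.3 Computing the prediction root mean squared error

Models are fitted on the training set only. The leaps package provides no predict() method for regsubsets objects, so to predict on the test set and on cross-validation assessment sets, and to estimate the prediction error there, we write a prediction function ourselves: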

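predict.regsubsets <- function(object, newdata, id, ...){
  form <- as.formula(object$call[[2]])  # recover the model formula from the call
  mat <- model.matrix(form, newdata)    # design matrix for the new data
  coefi <- coef(object, id=id)          # coefficients of the best model of size id
  xvars <- names(coefi)
  mat[, xvars] %*% coefi                # keep the matching columns, compute X %*% beta
}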
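The mechanism in this function (build the full design matrix, then keep only the columns named in the coefficient vector) can be checked against predict() for an ordinary lm fit. The sketch below uses mtcars and a hand-picked submodel as a stand-in for "the best model of a given size"; all names in it are illustrative assumptions.

```r
d <- transform(mtcars, cyl = factor(cyl))
full_form <- mpg ~ wt + hp + cyl + qsec       # formula of the "full" model

sub_fit <- lm(mpg ~ wt + cyl, data = d)       # stand-in for a selected submodel
coefi <- coef(sub_fit)                        # named coefficient vector

newdata <- d[1:5, ]
mat <- model.matrix(full_form, newdata)       # has columns for every full-model term
pred_manual <- drop(mat[, names(coefi)] %*% coefi)

pred_lm <- predict(sub_fit, newdata)
all.equal(unname(pred_manual), unname(pred_lm))
```

Matching by column name is what makes the approach work with factor predictors, whose dummy columns (here cyl6 and cyl8) get the same names in the model matrix and in the coefficient vector.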

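42.3.4 Choosing the best subset size by 10-fold cross-validation

The tidymodels packages offer a standard workflow for comparing models by cross-validation; see Section 47.3. To demonstrate the method more directly, here we call the cross-validation functions ourselves to tune the hyperparameter and then compute prediction accuracy on the test set.

The following code splits the rows of the training set into 10 folds: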

set.seed(102)
hit_fold <- vfold_cv(hit_train, v = 10)

Next, each of the 10 folds serves once as the assessment set, giving the root mean squared error for every subset size:

cv.errors <- matrix( as.numeric(NA), 10, 19, dimnames=list(NULL, paste(1:19)) )
for(j in 1:10){ # loop over folds
  d_ana <- analysis(hit_fold$splits[[j]])    # rows used for fitting
  d_ass <- assessment(hit_fold$splits[[j]])  # held-out rows of fold j
  best.fit <- regsubsets(
    Salary ~ ., 
    data = d_ana, nvmax=19)
  for(i in 1:19){
    pred <- predict( 
      best.fit, d_ass, id=i)
    cv.errors[j, i] <- 
      mean( (d_ass[["Salary"]] - pred)^2 ) |> sqrt()
  }
}
cv.errors[1:3, 1:5]
##             1        2        3        4        5
## [1,] 527.1116 448.7947 541.2805 486.2844 500.2707
## [2,] 380.8413 417.8030 339.0055 320.3588 294.5357
## [3,] 425.7064 407.8210 401.5712 381.7351 365.8554

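cv.errors is a 10×19 matrix. Each row corresponds to one fold used as the test set (also called the assessment set), each column to one subset size, and each entry is the prediction root mean squared error.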

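Averaging the 10 entries in each column gives the mean RMSE for each subset size:

mean.cv.errors <- colMeans(cv.errors)  # one mean per subset size: 19 values
mean.cv.errors
best.id <- which.min(mean.cv.errors)
plot(mean.cv.errors, type='b', 
  main = "RMSE",
  xlab = "p")

Figure 42.2: Cross-validated RMSE for the Hitters data

Averaging by row instead, with rowMeans(cv.errors), gives one mean per fold (10 values) rather than one per subset size; that is what the following recorded output shows:

##  [1] 446.5360 348.7238 370.0678 519.2774 403.6128 254.5081 298.6319 302.1066 387.7379 353.0201

Its minimum is 254.5 at position 6, and this is the value quoted as the best subset RMSE in the comparisons later in this chapter. The subset size used below is 6, which agrees with the BIC choice above. Note that users rarely need to code this kind of cross-validation tuning by hand, since machine learning functions usually have it built in.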
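The difference between averaging by column and by row can be seen on the excerpt of cv.errors printed earlier. The sketch below copies those 3×5 values; cv_part, by_size, and by_fold are names introduced here, and the result describes only this excerpt, not the full 10×19 matrix.

```r
# First 3 folds and first 5 subset sizes, copied from the cv.errors output above
cv_part <- matrix(
  c(527.1116, 448.7947, 541.2805, 486.2844, 500.2707,
    380.8413, 417.8030, 339.0055, 320.3588, 294.5357,
    425.7064, 407.8210, 401.5712, 381.7351, 365.8554),
  nrow = 3, byrow = TRUE, dimnames = list(NULL, paste(1:5)))

by_size <- colMeans(cv_part)   # length 5: one mean per subset size
by_fold <- rowMeans(cv_part)   # length 3: one mean per fold
which.min(by_size)             # best subset size within this excerpt
```

Only the column means answer the question "which subset size predicts best on average"; the row means say how hard each fold was to predict.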

Once the best subset size has been found (6 here), refit on the full data set and extract the model of that size:

reg.best <- regsubsets(Salary ~ ., data = da_hit, nvmax=19)
coef(reg.best, id=best.id)
##  (Intercept)        AtBat         Hits        Walks         CRBI    DivisionW      PutOuts 
##   91.5117981   -1.8685892    7.6043976    3.6976468    0.6430169 -122.9515338    0.2643076

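Such a model can be used to predict new data from the same problem.

42.4 Ridge regression

When there are too many predictors, model complexity is high, and the model may overfit and become unstable. Subset selection is one way to reduce complexity.

Another way is to place a quadratic penalty on large coefficients, turning the least squares problem into a penalized least squares problem with a quadratic penalty term: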

$$\min \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip})^2 + \lambda \sum_{j=1}^p \beta_j^2 .$$

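Compared with ordinary least squares, the resulting coefficients are smaller in absolute value, but the solution is more stable and collinearity problems are mitigated. This approach is called regularization, and $\sum_{j=1}^p \beta_j^2$ is called the regularization term or L2 penalty.

In fact, whereas the linear model $Y = X\beta + \varepsilon$ has the ordinary least squares solution $\hat{\beta} = (X^T X)^{-1} X^T Y$, the ridge regression solution is

$$\tilde{\beta} = (X^T X + sI)^{-1} X^T Y ,$$

where $I$ is the identity matrix and $s > 0$ depends on $\lambda$.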
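The closed form can be evaluated directly in base R. The sketch below is an illustration on simulated, nearly collinear data; centering X and y to drop the intercept, the value s = 1, and all variable names are assumptions made for the example.

```r
set.seed(42)
n <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)            # nearly collinear with x1
X <- scale(cbind(x1, x2), center = TRUE, scale = FALSE)
y <- x1 + x2 + rnorm(n)
y <- y - mean(y)                          # centering removes the intercept

s <- 1
XtX <- crossprod(X)
b_ols   <- solve(XtX, crossprod(X, y))                      # (X'X)^{-1} X'Y
b_ridge <- solve(XtX + s * diag(ncol(X)), crossprod(X, y))  # (X'X + sI)^{-1} X'Y

cbind(ols = drop(b_ols), ridge = drop(b_ridge))

# Adding sI lowers the condition number of the matrix being inverted
kappa(XtX, exact = TRUE)
kappa(XtX + s * diag(ncol(X)), exact = TRUE)
```

The ridge coefficient vector has a smaller squared norm than the least squares one, and the lower condition number is the numerical counterpart of the improved stability described above.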

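$\lambda$ is called the tuning parameter; the larger $\lambda$ is, the lower the effective model complexity. A suitable $\lambda$ strikes a compromise between variance and bias and thereby reduces prediction error. A parameter of this kind cannot be estimated directly from the data by fitting the model; it is called a hyperparameter, and its best value is found by comparing models.

Because the penalty depends on the units of measurement, the data should be standardized when the predictors are not on comparable scales.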
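A short base R sketch of what standardization achieves, using simulated data; the matrices and the unit change by a factor of 1000 are made up for illustration. Note that scale() divides by the sample standard deviation, which may differ slightly from the convention a particular fitting function uses internally.

```r
set.seed(7)
X <- cbind(a = rnorm(30, mean = 5, sd = 2),
           b = rnorm(30, mean = 100, sd = 30))
X_rescaled <- X
X_rescaled[, "b"] <- X[, "b"] / 1000   # the same variable in different units

Z1 <- scale(X)            # each column: mean 0, standard deviation 1
Z2 <- scale(X_rescaled)

apply(Z1, 2, sd)
all.equal(Z1, Z2, check.attributes = FALSE)   # standardized columns ignore units
```

Since the standardized columns are identical under either choice of units, a penalty applied to them treats every predictor on the same footing.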

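Ridge regression is computed with the glmnet package. Calling glmnet() with alpha=0 fits ridge regression. The lambda= argument supplies a grid of tuning parameter values, and the algorithm obtains the coefficient estimates for all of them in a single run. coef() extracts the estimates for the different tuning parameter values as a matrix with one column per value.

We continue with da_hit, the Hitters data with missing values removed, and fit on its training split hit_train.

glmnet does not support R's formula interface, so the design matrix and the response are extracted as follows: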

library(glmnet)
x <- model.matrix(Salary ~ ., hit_train)[,-1]  # drop the intercept column
y <- hit_train$Salary

Ridge regression requires choosing the tuning parameter $\lambda$. For plotting, first set up a grid of $\lambda$ values:

grid <- 10^seq(10, -2, length=100)

Fit ridge regression on the training data over this grid; note that glmnet() accepts multiple values of $\lambda$:

ridge.mod <- glmnet(x, y, alpha=0, lambda=grid)
dim(coef(ridge.mod))
## [1]  20 100

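glmnet() standardizes the predictors by default.
coef() returns a matrix with one column per tuning parameter value; its entries are the estimated regression coefficients.

42.4.1 Choosing the tuning parameter by 10-fold cross-validation

tidymodels offers a systematic way to tune hyperparameters and evaluate performance on a test set; see Section 47.3. To demonstrate the method more directly, here we call the cross-validation function ourselves to tune the hyperparameter and then compute prediction accuracy on the test set.

Choosing the tuning parameter by cross-validation on the training set is called parameter tuning or hyperparameter tuning. cv.glmnet() runs the cross-validation itself (10 folds by default), so there is no need to create the folds by hand: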

set.seed(1)
cv.out <- cv.glmnet(x, y, alpha=0)
plot(cv.out)

Figure 42.3: Choosing the ridge tuning parameter for the Hitters data

bestlam <- cv.out$lambda.min
bestlam
## [1] 25.22831

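This gives the optimal tuning parameter $\lambda = 25.2283126$. Predicting on the test set with this value gives the prediction root mean squared error: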

ridge.pred <- predict(
  ridge.mod, s = bestlam, 
  newx = model.matrix(Salary ~ ., hit_test)[,-1])
mean( (ridge.pred - hit_test$Salary)^2 ) |> sqrt()
## [1] 240.7377

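The test-set RMSE is 240.7, lower than the 254.5 reported above for best subset selection (that figure came from cross-validation rather than the test set, so the comparison is only indicative).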
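The RMSE expression appears several times in this chapter, so it can be wrapped in a small helper. The function name rmse is hypothetical; the chapter itself writes the expression inline each time.

```r
# Hypothetical helper; equivalent to mean((obs - pred)^2) |> sqrt()
rmse <- function(obs, pred) {
  # as.numeric() flattens the one-column matrix returned by predict() for glmnet fits
  sqrt(mean((obs - as.numeric(pred))^2))
}

rmse(c(0, 0), c(3, 4))   # sqrt((9 + 16) / 2)
```

With it, the test-set calculation above would read rmse(hit_test$Salary, ridge.pred).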

Finally, refit on the full data set with the selected tuning parameter to obtain the corresponding ridge coefficient estimates:

x <- model.matrix(Salary ~ ., da_hit)[,-1]
y <- da_hit$Salary
out <- glmnet(x, y, alpha=0)
predict(out, type='coefficients', s=bestlam)[1:20,]
##   (Intercept)         AtBat          Hits         HmRun          Runs           RBI         Walks         Years        CAtBat         CHits        CHmRun         CRuns          CRBI        CWalks       LeagueN     DivisionW       PutOuts       Assists        Errors    NewLeagueN 
##  8.112693e+01 -6.815959e-01  2.772312e+00 -1.365680e+00  1.014826e+00  7.130224e-01  3.378558e+00 -9.066800e+00 -1.199478e-03  1.361029e-01  6.979958e-01  2.958896e-01  2.570711e-01 -2.789666e-01  5.321272e+01 -1.228345e+02  2.638876e-01  1.698796e-01 -3.685645e+00 -1.810510e+01

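Such a model can be used to predict new data from the same problem.

42.5 Lasso regression

Another penalty on the regression coefficients is the L1 penalty: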

$$\min \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip})^2 + \lambda \sum_{j=1}^p |\beta_j| . \tag{42.1}$$

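Remarkably, when the tuning parameter $\lambda$ is large enough, some regression coefficients become exactly zero, so the lasso both shrinks the coefficients and selects a subset of important variables.

In fact, (42.1) is equivalent to the constrained minimization problem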

$$\min \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip})^2 \quad \text{s.t.} \quad \sum_{j=1}^p |\beta_j| \le s ,$$

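where $s$ and $\lambda$ correspond one to one. The constraint region is a convex set with corners and the objective is a quadratic function, so the minimum is often attained at a corner of the constraint region, and these corners are points where some coordinates equal zero. See Figure 42.4. The shaded area is the constraint region; note that at each of its 4 corners one regression coefficient equals 0. The concentric ellipses are contours of the objective function, and their center is the unconstrained minimizer, that is, the ordinary least squares solution. The contour with the smallest objective value that touches the constraint region meets it at a corner, where $\beta_1 = 0$.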

knitr::include_graphics("figs/lasso-min.png")

Figure 42.4: Illustration of the lasso constrained optimization problem
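The exact zeros can also be seen algebraically in a special case. Assuming an orthonormal design ($X^T X = I$, an assumption made only for this illustration), the residual sum of squares equals a constant plus $\sum_j (b_j - \beta_j)^2$, where $b$ is the least squares estimate, so each coefficient can be minimized separately. The L1 penalty then gives soft thresholding at $\lambda/2$, while the L2 penalty gives proportional shrinkage. The coefficient values below are made up.

```r
soft_threshold <- function(b, t) sign(b) * pmax(abs(b) - t, 0)

b_ols  <- c(3.0, -1.2, 0.4, -0.1)   # made-up least squares coefficients
lambda <- 1

b_lasso <- soft_threshold(b_ols, lambda / 2)  # minimizes (b - beta)^2 + lambda * |beta|
b_ridge <- b_ols / (1 + lambda)               # minimizes (b - beta)^2 + lambda * beta^2

rbind(ols = b_ols, lasso = b_lasso, ridge = b_ridge)
```

The two small coefficients are set exactly to zero by the lasso, whereas ridge halves every coefficient and leaves none at zero.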

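For each tuning parameter $\lambda$ we need the corresponding solution of (42.1), denoted $\hat{\beta}(\lambda)$. Fortunately, there is no need to solve the minimization problem (42.1) separately for every $\lambda$: clever algorithms make the total computation comparable to solving a single least squares problem.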
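The text does not say which algorithm it has in mind. One widely used approach, and the one glmnet is documented as using, is cyclical coordinate descent with warm starts along a decreasing grid of $\lambda$. The sketch below is a minimal base R version for the objective in (42.1) with centered data and no intercept; lasso_cd and all other names are hypothetical, and it is far simpler than a production implementation.

```r
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Minimizes sum((y - X %*% beta)^2) + lambda * sum(abs(beta)) for centered X and y
lasso_cd <- function(X, y, lambda, beta = rep(0, ncol(X)), n_iter = 200) {
  col_ss <- colSums(X^2)
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]   # partial residual without x_j
      beta[j] <- soft(sum(X[, j] * r_j), lambda / 2) / col_ss[j]
    }
  }
  beta
}

set.seed(3)
n <- 100
X <- scale(matrix(rnorm(n * 5), n, 5), center = TRUE, scale = FALSE)
y <- drop(X %*% c(3, 0, -2, 0, 0) + rnorm(n))
y <- y - mean(y)

lambdas <- c(400, 100, 25, 0)   # decreasing grid
path <- matrix(NA_real_, length(lambdas), 5)
beta <- rep(0, 5)
for (k in seq_along(lambdas)) {
  beta <- lasso_cd(X, y, lambdas[k], beta = beta)   # warm start from the previous lambda
  path[k, ] <- beta
}
round(path, 3)
```

Each row of path holds the coefficients for one $\lambda$; at $\lambda = 0$ the iteration converges to the least squares solution, and for a sufficiently large $\lambda$ every coefficient is zero.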

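In practice one takes a grid of $\lambda$ values, computes the penalized regression coefficients for each, estimates the prediction mean squared error by cross-validation, and selects the tuning parameter that minimizes it (R functions usually offer this as an option).

The lasso is computed with the glmnet package. Calling glmnet() with alpha=1 fits the lasso. The lambda= argument supplies a grid of tuning parameter values, and the lasso results for those values are returned. Calling plot() on the fit shows how the coefficient estimates change as the tuning parameter varies.

We again use glmnet() from the glmnet package to fit the lasso, with the same tuning parameter grid as before: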

x <- model.matrix(Salary ~ ., hit_train)[,-1]
y <- hit_train$Salary
lasso.mod <- glmnet(x, y, alpha=1, lambda=grid)
plot(lasso.mod)
## Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm): collapsing to unique 'x' values

Figure 42.5: Lasso coefficient paths for the Hitters data

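Calling plot() on the lasso fit draws each regression coefficient estimate along the grid of tuning parameters. The horizontal axis is not the tuning parameter itself but the corresponding sum of absolute values of the coefficients (the L1 norm). As the L1 norm grows, which corresponds to a smaller tuning parameter, more predictors enter the model.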
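To put the tuning parameter itself on the horizontal axis, plot() for glmnet fits accepts xvar = "lambda". The sketch below uses simulated data rather than the Hitters fit, and the object names are made up; it also checks the relationship described above, namely that the L1 norm is zero at the largest λ and positive at the smallest.

```r
library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 6), 100, 6)               # simulated predictors
y <- drop(x[, 1:2] %*% c(3, -2) + rnorm(100))     # only the first two matter

fit <- glmnet(x, y, alpha = 1)                    # lasso over glmnet's default path

# Coefficient paths against the tuning parameter instead of the L1 norm
plot(fit, xvar = "lambda", label = TRUE)

# L1 norm of the slope estimates at each lambda (columns follow fit$lambda)
l1 <- colSums(abs(as.matrix(coef(fit))[-1, , drop = FALSE]))
head(cbind(lambda = fit$lambda, l1_norm = l1, nonzero = fit$df))
```

The printed table lists λ in decreasing order, so reading down the rows corresponds to moving from left to right along the L1-norm axis of the default plot.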

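42.5.1 Estimating the tuning parameter by cross-validation

tidymodels offers a systematic way to tune hyperparameters and evaluate performance on a test set; see Section 47.3. To demonstrate the method more directly, here we call the cross-validation function ourselves to tune the hyperparameter, and then compute the prediction accuracy on the test set.

Keeping the earlier split into training and test sets, we use only the training data in the cross-validation that estimates the optimal tuning parameter: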

set.seed(1)
cv.out <- cv.glmnet(x, y, alpha=1)
plot(cv.out)
bestlam <- cv.out$lambda.min; bestlam
## [1] 2.19423
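Besides lambda.min, the object returned by cv.glmnet() also stores lambda.1se, the largest λ whose cross-validated error is within one standard error of the minimum; it gives a sparser model at a small cost in estimated error. The sketch below uses simulated data, and the helper `n_nonzero` is a hypothetical name introduced here.

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 10
x <- matrix(rnorm(n * p), n, p)                     # simulated predictors
y <- drop(x[, 1:3] %*% c(3, -2, 1) + rnorm(n))      # three true signals

cv <- cv.glmnet(x, y, alpha = 1)                    # 10-fold CV by default

# lambda.min sits at the smallest cross-validated error on the path
c(lambda.min = cv$lambda.min, lambda.1se = cv$lambda.1se)

# Number of nonzero slopes at a given choice of lambda
n_nonzero <- function(s) sum(coef(cv, s = s)[-1, 1] != 0)
c(at_min = n_nonzero("lambda.min"), at_1se = n_nonzero("lambda.1se"))
```

Whether to report lambda.min or lambda.1se is a modeling choice; the text uses lambda.min throughout.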

With the tuning parameter estimated, compute the prediction RMSE on the test set:

lasso.pred <- predict(
  lasso.mod, s = bestlam, 
  newx = model.matrix(Salary ~ ., hit_test)[,-1])
mean( (lasso.pred - hit_test$Salary)^2 ) |> sqrt()
## [1] 242.0375

The RMSE is 242.0, slightly worse than ridge regression (RMSE=240.7) but better than the best subset method (RMSE=254.5).

To make full use of the data, refit on the full data set with the optimal tuning parameter obtained above:

x <- model.matrix(Salary ~ ., da_hit)[,-1]
y <- da_hit$Salary
out <- glmnet(x, y, alpha=1, lambda=grid)
lasso.coef <- predict(
  out, type='coefficients', s=bestlam)[1:20,]
lasso.coef[lasso.coef != 0]
##   (Intercept)         AtBat          Hits         HmRun         Walks         Years        CAtBat        CHmRun         CRuns          CRBI        CWalks       LeagueN     DivisionW       PutOuts       Assists        Errors 
##  1.348925e+02 -1.689582e+00  5.971182e+00  9.734402e-02  4.978211e+00 -1.019167e+01 -9.794493e-05  5.650266e-01  7.036826e-01  3.867695e-01 -5.851131e-01  3.305686e+01 -1.193420e+02  2.760478e-01  2.008473e-01 -2.277618e+00

The selected subset contains 15 predictors (the 16 nonzero entries above include the intercept).
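The count can be obtained programmatically instead of by reading the printout. The sketch below uses a short made-up coefficient vector with the same shape as lasso.coef; the values are illustrative only.

```r
# Made-up named coefficient vector (illustrative values, not the fitted ones)
coefs <- c("(Intercept)" = 120, AtBat = -1.5, Hits = 6, Runs = 0, RBI = 0, Walks = 5)

slopes   <- coefs[names(coefs) != "(Intercept)"]  # the intercept is not a predictor
selected <- names(slopes)[slopes != 0]

selected          # "AtBat" "Hits" "Walks"
length(selected)  # 3
```

Applied to the fit in the text, the same two lines with `lasso.coef` in place of `coefs` should give the 15 selected predictors.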

42.6 Appendix

42.6.1 The Hitters data

knitr::kable(Hitters)
## Output: the full Hitters table, 322 rows (one per player, with the player name as row name) by 20 columns:
## AtBat, Hits, HmRun, Runs, RBI, Walks, Years, CAtBat, CHits, CHmRun, CRuns, CRBI, CWalks, League, Division, PutOuts, Assists, Errors, Salary, NewLeague

Original content, copyright 韭菜热线; published by 风生水起. When reposting, please credit the source: https://www.9crx.com/79995.html