R comes with functions for ordinary linear regression (lm
), for generalized linear models (glm
), and time-to-event models (package survival
- do not need to install - comes with R).
However, a more powerful package for these common models is rms
. You’ll see some messages after your load the package.
## install.packages("rms")
library(rms)
We’ll use data in the adm
package:
# library(devtools); install_github("advdatamgmt/adm")
library(adm)
Let’s start with a linear regression of the optic nerve head data using the “built-in” version:
lm(mean ~ age, onhlong)
##
## Call:
## lm(formula = mean ~ age, data = onhlong)
##
## Coefficients:
## (Intercept) age
## 2.0065 0.1399
You don’t get much back that way. It turns out that R returns an object that has a lot of the useful data. You can access many aspects of the model. For a quick summary start with summary
:
summary(lm(mean ~ age, onhlong))
##
## Call:
## lm(formula = mean ~ age, data = onhlong)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8121 -0.6617 0.2338 0.6715 1.2797
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.00651 0.10378 19.335 < 2e-16 ***
## age 0.13995 0.01827 7.661 7.12e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7968 on 112 degrees of freedom
## Multiple R-squared: 0.3439, Adjusted R-squared: 0.338
## F-statistic: 58.7 on 1 and 112 DF, p-value: 7.125e-12
That is you usual summary… it will get tedious to keep typing the model so save it in a variable:
m <- lm(mean ~ age, onhlong)
summary(m)
##
## Call:
## lm(formula = mean ~ age, data = onhlong)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8121 -0.6617 0.2338 0.6715 1.2797
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.00651 0.10378 19.335 < 2e-16 ***
## age 0.13995 0.01827 7.661 7.12e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7968 on 112 degrees of freedom
## Multiple R-squared: 0.3439, Adjusted R-squared: 0.338
## F-statistic: 58.7 on 1 and 112 DF, p-value: 7.125e-12
# coefficients
coef(m)
## (Intercept) age
## 2.0065072 0.1399477
# diagnostic plots
plot(m)
# influence measures
influence.measures(m)
## Influence measures of
## lm(formula = mean ~ age, data = onhlong) :
##
## dfb.1_ dfb.age dffit cov.r cook.d hat inf
## 1 -0.08896 0.032659 -0.09777 1.011 4.78e-03 0.00987
## 2 -0.03687 0.012971 -0.04085 1.025 8.41e-04 0.00976
## 3 -0.15262 0.097382 -0.15309 1.005 1.17e-02 0.01473
## 4 0.11563 -0.072672 0.11613 1.016 6.75e-03 0.01442
## 5 0.00468 0.002013 0.00870 1.027 3.81e-05 0.00927
## 6 -0.11027 0.073173 -0.11037 1.020 6.10e-03 0.01565
## 7 -0.15860 0.105248 -0.15875 1.005 1.25e-02 0.01565
## 8 0.01792 0.000322 0.02524 1.026 3.21e-04 0.00877
## 9 0.05986 -0.019993 0.06697 1.020 2.25e-03 0.00963
## 10 -0.08797 0.047602 -0.08996 1.019 4.06e-03 0.01218
## 11 0.00201 -0.001082 0.00206 1.031 2.13e-06 0.01213
## 12 0.04087 -0.016278 0.04421 1.025 9.84e-04 0.01015
## 13 -0.21844 0.149166 -0.21847 0.984 2.35e-02 0.01643
## 14 -0.16493 0.111371 -0.16499 1.004 1.35e-02 0.01611
## 15 -0.01895 0.012050 -0.01901 1.033 1.82e-04 0.01466
## 16 -0.13828 0.060978 -0.14665 0.993 1.07e-02 0.01061
## 17 -0.13974 0.068785 -0.14519 0.996 1.05e-02 0.01131
## 18 -0.10898 0.069536 -0.10932 1.019 5.99e-03 0.01473
## 19 -0.13417 0.084323 -0.13474 1.010 9.06e-03 0.01442
## 20 0.13026 -0.078226 0.13138 1.009 8.61e-03 0.01359
## 21 -0.05403 -0.048891 -0.13179 0.998 8.63e-03 0.01017
## 22 -0.07187 0.045861 -0.07210 1.027 2.61e-03 0.01473
## 23 -0.13490 0.092184 -0.13492 1.015 9.09e-03 0.01645
## 24 0.22695 -0.704426 -0.79343 0.830 2.81e-01 0.04142 *
## 25 0.15272 -0.078901 0.15734 0.992 1.23e-02 0.01172
## 26 -0.03797 0.023194 -0.03823 1.031 7.37e-04 0.01388
## 27 -0.21508 0.146971 -0.21510 0.985 2.28e-02 0.01645
## 31 0.09021 -0.059861 0.09029 1.025 4.09e-03 0.01565
## 32 0.02051 0.018561 0.05003 1.024 1.26e-03 0.01017
## 33 0.02518 0.034182 0.07615 1.020 2.91e-03 0.01099
## 34 0.03412 -0.098001 -0.10880 1.063 5.96e-03 0.04649 *
## 35 0.02317 0.050774 0.09583 1.017 4.60e-03 0.01220
## 36 0.04934 -0.011033 0.05899 1.020 1.75e-03 0.00909
## 37 0.13359 -0.071933 0.13671 1.003 9.30e-03 0.01213
## 38 0.08252 -0.020225 0.09733 1.009 4.74e-03 0.00917
## 40 0.00891 0.019534 0.03687 1.029 6.85e-04 0.01220
## 41 0.00314 0.070223 0.10074 1.025 5.09e-03 0.01706
## 43 0.03845 -0.108445 -0.11998 1.064 7.24e-03 0.04793 *
## 44 0.04581 -0.030933 0.04582 1.032 1.06e-03 0.01611
## 46 0.03843 -0.011564 0.04382 1.024 9.67e-04 0.00943
## 47 0.03416 0.026261 0.07747 1.017 3.01e-03 0.00991
## 48 -0.00677 0.029589 0.03526 1.048 6.27e-04 0.02965
## 49 0.13717 -0.086208 0.13776 1.009 9.46e-03 0.01442
## 50 0.01204 0.010053 0.02831 1.027 4.04e-04 0.01004
## 51 0.05893 0.004702 0.08663 1.012 3.76e-03 0.00880
## 53 0.06817 -0.017967 0.07949 1.015 3.17e-03 0.00924
## 54 0.04164 0.048082 0.11491 1.007 6.59e-03 0.01063
## 55 -0.00766 0.073678 0.09536 1.033 4.57e-03 0.02176
## 60 0.01069 0.043985 0.07229 1.026 2.63e-03 0.01393
## 61 0.14216 -0.064527 0.14993 0.992 1.11e-02 0.01077
## 62 0.07257 -0.000685 0.10025 1.006 5.02e-03 0.00877
## 63 0.11438 -0.243451 -0.25510 1.117 3.27e-02 0.09830 *
## 64 0.13189 -0.087523 0.13202 1.014 8.71e-03 0.01565
## 65 0.01828 0.075209 0.12361 1.013 7.63e-03 0.01393
## 66 0.08134 -0.178334 -0.18791 1.109 1.78e-02 0.08831 *
## 67 0.01336 0.054972 0.09035 1.022 4.10e-03 0.01393
## 68 0.17438 -0.109592 0.17512 0.995 1.52e-02 0.01442
## 110 0.07135 -0.026193 0.07841 1.017 3.08e-03 0.00987
## 28 0.11471 -0.040353 0.12710 0.998 8.03e-03 0.00976
## 39 -0.16203 0.103389 -0.16254 1.001 1.31e-02 0.01473
## 42 -0.05466 0.034352 -0.05489 1.029 1.52e-03 0.01442
## 52 -0.03463 -0.014879 -0.06428 1.019 2.08e-03 0.00927
## 69 -0.11668 0.077431 -0.11679 1.018 6.83e-03 0.01565
## 71 -0.11989 0.079561 -0.12001 1.018 7.21e-03 0.01565
## 81 -0.07473 -0.001342 -0.10523 1.004 5.52e-03 0.00877
## 91 -0.04655 0.015548 -0.05208 1.023 1.37e-03 0.00963
## 101 -0.10452 0.056562 -0.10689 1.014 5.72e-03 0.01218
## 111 0.02107 -0.011346 0.02156 1.030 2.34e-04 0.01213
## 121 -0.11990 0.047754 -0.12970 0.999 8.36e-03 0.01015
## 131 -0.18145 0.123908 -0.18148 0.999 1.63e-02 0.01643
## 141 -0.09194 0.062086 -0.09197 1.025 4.25e-03 0.01611
## 151 -0.09573 0.060878 -0.09605 1.022 4.63e-03 0.01466
## 161 -0.12070 0.053226 -0.12801 1.001 8.15e-03 0.01061
## 171 -0.07930 0.039035 -0.08239 1.019 3.41e-03 0.01131
## 181 -0.12140 0.077464 -0.12178 1.015 7.42e-03 0.01473
## 191 -0.10958 0.068869 -0.11005 1.018 6.06e-03 0.01442
## 201 0.15416 -0.092579 0.15548 1.000 1.20e-02 0.01359
## 211 -0.05403 -0.048891 -0.13179 0.998 8.63e-03 0.01017
## 221 -0.08422 0.053737 -0.08448 1.024 3.59e-03 0.01473
## 231 -0.22693 0.155074 -0.22696 0.980 2.53e-02 0.01645
## 241 0.03131 -0.097176 -0.10945 1.057 6.03e-03 0.04142 *
## 251 -0.08361 0.043193 -0.08614 1.019 3.72e-03 0.01172
## 261 -0.17102 0.104459 -0.17218 0.994 1.47e-02 0.01388
## 271 -0.23203 0.158557 -0.23206 0.978 2.64e-02 0.01645
## 311 0.06788 -0.045043 0.06794 1.029 2.32e-03 0.01565
## 321 0.02785 0.025204 0.06794 1.020 2.32e-03 0.01017
## 331 0.02254 0.030593 0.06816 1.022 2.33e-03 0.01099
## 341 0.05015 -0.144042 -0.15992 1.058 1.28e-02 0.04649 *
## 351 0.02317 0.050774 0.09583 1.017 4.60e-03 0.01220
## 361 0.05136 -0.011486 0.06140 1.020 1.90e-03 0.00909
## 371 0.11972 -0.064468 0.12252 1.008 7.49e-03 0.01213
## 381 0.07423 -0.018193 0.08755 1.012 3.84e-03 0.00917
## 401 0.01568 0.034377 0.06489 1.024 2.12e-03 0.01220
## 411 0.00345 0.077229 0.11079 1.023 6.15e-03 0.01706
## 431 0.06162 -0.173801 -0.19228 1.055 1.85e-02 0.04793 *
## 441 0.07490 -0.050580 0.07493 1.028 2.82e-03 0.01611
## 461 0.03412 -0.010268 0.03891 1.025 7.63e-04 0.00943
## 471 0.04312 0.033146 0.09778 1.011 4.78e-03 0.00991
## 481 -0.00762 0.033311 0.03970 1.048 7.95e-04 0.02965
## 491 0.12178 -0.076533 0.12230 1.014 7.48e-03 0.01442
## 501 0.01312 0.010952 0.03084 1.027 4.79e-04 0.01004
## 511 0.06545 0.005222 0.09620 1.008 4.63e-03 0.00880
## 531 0.07027 -0.018520 0.08194 1.014 3.37e-03 0.00924
## 541 0.04452 0.051410 0.12287 1.003 7.52e-03 0.01063
## 551 -0.00706 0.067825 0.08778 1.034 3.88e-03 0.02176
## 601 0.00936 0.038503 0.06328 1.027 2.02e-03 0.01393
## 611 0.14728 -0.066849 0.15533 0.989 1.19e-02 0.01077
## 621 0.06910 -0.000652 0.09546 1.008 4.56e-03 0.00877
## 631 0.12617 -0.268550 -0.28140 1.114 3.97e-02 0.09830 *
## 641 0.12224 -0.081119 0.12236 1.017 7.49e-03 0.01565
## 651 0.01694 0.069676 0.11452 1.015 6.56e-03 0.01393
## 661 0.09907 -0.217219 -0.22888 1.106 2.63e-02 0.08831 *
## 671 0.01069 0.043985 0.07229 1.026 2.63e-03 0.01393
## 681 0.19628 -0.123358 0.19712 0.985 1.91e-02 0.01442
After you have loaded the rms
package you can use the ols
function. OLS is ordinary least squares:
m <- ols(mean ~ age, onhlong)
m # no need for summary
## Linear Regression Model
##
## ols(formula = mean ~ age, data = onhlong)
##
## Model Likelihood Discrimination
## Ratio Test Indexes
## Obs 114 LR chi2 48.04 R2 0.344
## sigma0.7968 d.f. 1 R2 adj 0.338
## d.f. 112 Pr(> chi2) 0.0000 g 0.597
##
## Residuals
##
## Min 1Q Median 3Q Max
## -2.8121 -0.6617 0.2338 0.6715 1.2797
##
##
## Coef S.E. t Pr(>|t|)
## Intercept 2.0065 0.1038 19.33 <0.0001
## age 0.1399 0.0183 7.66 <0.0001
##
# you can make nice plots with rms, but first do the following to let
# rms know the distribution of your dataset
options(datadist = "dd")
dd <- datadist(onhlong)
# now you can plot the actual line
plot(Predict(m, age))
# and if you do summary you get your CIs and an effect size based on the
# IQR of the different variables
summary(m)
## Effects Response : mean
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## age 50.167 62.378 12.211 1.7089 0.22305 1.2669 2.1508
# let's add in the case variable
m2 <- ols(mean ~ age + case, onhlong)
m2
## Linear Regression Model
##
## ols(formula = mean ~ age + case, data = onhlong)
##
## Model Likelihood Discrimination
## Ratio Test Indexes
## Obs 114 LR chi2 151.36 R2 0.735
## sigma0.5087 d.f. 2 R2 adj 0.730
## d.f. 111 Pr(> chi2) 0.0000 g 0.928
##
## Residuals
##
## Min 1Q Median 3Q Max
## -1.47290 -0.28740 -0.03718 0.24526 1.70644
##
##
## Coef S.E. t Pr(>|t|)
## Intercept 2.9292 0.0979 29.91 <0.0001
## age 0.0647 0.0131 4.95 <0.0001
## case=1 -1.3711 0.1072 -12.80 <0.0001
##
plot(Predict(m2, case))
summary(m2)
## Effects Response : mean
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## age 0.67 5.83 5.16 0.3336 0.067406 0.20003 0.46717
## case - 1:0 1.00 2.00 NA -1.3711 0.107150 -1.58340 -1.15880
plot(m2, which = 1) # check the help for plot.lm
which.influence(m2)
## Error in residuals.ols(fit, "dfbetas"): did not specify x=TRUE in fit
Sometimes rms
will ask you to specify x
or y
in the fit. Do it like this:
m2 <- ols(mean ~ age + case, onhlong, x = TRUE)
It just saves extra info about the data which makes the model object bigger but doesn’t change anything else:
which.influence(m2)
## $Intercept
## [1] 24 39 81
##
## $age
## [1] 24 81 95
##
## $case
## [1] 5 8 9 24 25 58 59 81
show.influence(which.influence(m2), onhlong)
## Count age case
## 5 1 4.92 *1
## 8 1 4.00 *1
## 9 1 2.67 *1
## 24 3 *11.83 *1
## 25 1 1.58 *1
## 39 1 0.58 1
## 81 3 * 4.00 *1
If you want to specify interaction (R keeps things HWF) use the *
for all the terms:
(m3 <- ols(mean ~ age * case , onhlong))
## Linear Regression Model
##
## ols(formula = mean ~ age * case, data = onhlong)
##
## Model Likelihood Discrimination
## Ratio Test Indexes
## Obs 114 LR chi2 154.29 R2 0.742
## sigma0.5045 d.f. 3 R2 adj 0.735
## d.f. 110 Pr(> chi2) 0.0000 g 0.932
##
## Residuals
##
## Min 1Q Median 3Q Max
## -1.89925 -0.28756 -0.02924 0.22494 1.72189
##
##
## Coef S.E. t Pr(>|t|)
## Intercept 2.9922 0.1040 28.78 <0.0001
## age 0.0535 0.0145 3.68 0.0004
## case=1 -1.5176 0.1370 -11.08 <0.0001
## age * case=1 0.0543 0.0320 1.69 0.0930
##
Finally, one useful function for removing or adding terms as you model is update
. The .
means keep everything on that side and the -
will take terms away while +
would add:
update(m2, . ~ . - age)
## Linear Regression Model
##
## ols(formula = mean ~ case, data = onhlong, x = TRUE)
##
## Model Likelihood Discrimination
## Ratio Test Indexes
## Obs 114 LR chi2 128.63 R2 0.676
## sigma0.5596 d.f. 1 R2 adj 0.674
## d.f. 112 Pr(> chi2) 0.0000 g 0.806
##
## Residuals
##
## Min 1Q Median 3Q Max
## -1.06340 -0.36089 -0.05339 0.30661 1.69660
##
##
## Coef S.E. t Pr(>|t|)
## Intercept 3.2934 0.0711 46.34 <0.0001
## case=1 -1.6100 0.1052 -15.30 <0.0001
##
update(m2, . ~ . - case)
## Linear Regression Model
##
## ols(formula = mean ~ age, data = onhlong, x = TRUE)
##
## Model Likelihood Discrimination
## Ratio Test Indexes
## Obs 114 LR chi2 48.04 R2 0.344
## sigma0.7968 d.f. 1 R2 adj 0.338
## d.f. 112 Pr(> chi2) 0.0000 g 0.597
##
## Residuals
##
## Min 1Q Median 3Q Max
## -2.8121 -0.6617 0.2338 0.6715 1.2797
##
##
## Coef S.E. t Pr(>|t|)
## Intercept 2.0065 0.1038 19.33 <0.0001
## age 0.1399 0.0183 7.66 <0.0001
##
You can also add additional valid arguements (like x = TRUE
):
which.influence(update(m3, x = TRUE))
## $Intercept
## [1] 39 85 96
##
## $age
## [1] 39 95
##
## $case
## [1] 4 24 25 39 81
##
## $`age * case`
## [1] 5 24 81
For logistic regression use rms
’s lrm
instead of ols
.
(m <- lrm(case ~ age, onhlong))
## Logistic Regression Model
##
## lrm(formula = case ~ age, data = onhlong)
##
## Model Likelihood Discrimination Rank Discrim.
## Ratio Test Indexes Indexes
## Obs 114 LR chi2 29.36 R2 0.304 C 0.793
## 0 62 d.f. 1 g 1.517 Dxy 0.586
## 1 52 Pr(> chi2) <0.0001 gr 4.560 gamma 0.594
## max |deriv| 4e-08 gp 0.268 tau-a 0.293
## Brier 0.185
##
## Coef S.E. Wald Z Pr(>|Z|)
## Intercept 1.0033 0.3149 3.19 0.0014
## age -0.3556 0.0844 -4.21 <0.0001
##
plot(Predict(m, age)) # remember that the model forces the log-odds to be linear
summary(m)
## Effects Response : case
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## age 0.67 5.83 5.16 -1.8351 0.43574 -2.689100 -0.98103
## Odds Ratio 0.67 5.83 5.16 0.1596 NA 0.067942 0.37493
summary(m, age = c(4,5))
## Effects Response : case
##
## Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
## age 4 5 1 -0.35563 0.084446 -0.52114 -0.19012
## Odds Ratio 4 5 1 0.70073 NA 0.59384 0.82686
For time-to-event you can use the built-in functions from the survival
package.
library(survival)
# the Surv specifies the follow-up time and the
# status of the subject; survfit is K-M
survfit(Surv(futime, fustat) ~ rx, data = ovarian)
## Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
##
## n events median 0.95LCL 0.95UCL
## rx=1 13 7 638 268 NA
## rx=2 13 5 NA 475 NA
survdiff(Surv(futime, fustat) ~ rx, data = ovarian)
## Call:
## survdiff(formula = Surv(futime, fustat) ~ rx, data = ovarian)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## rx=1 13 7 5.23 0.596 1.06
## rx=2 13 5 6.77 0.461 1.06
##
## Chisq= 1.1 on 1 degrees of freedom, p= 0.303
# stratify us strata in the survival package
survdiff(Surv(time, status) ~ pat.karno + strata(inst), data=lung)
## Call:
## survdiff(formula = Surv(time, status) ~ pat.karno + strata(inst),
## data = lung)
##
## n=224, 4 observations deleted due to missingness.
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## pat.karno=30 2 1 0.692 0.13720 0.15752
## pat.karno=40 2 1 1.099 0.00889 0.00973
## pat.karno=50 4 4 1.166 6.88314 7.45359
## pat.karno=60 30 27 16.298 7.02790 9.57333
## pat.karno=70 41 31 26.358 0.81742 1.14774
## pat.karno=80 50 38 41.938 0.36978 0.60032
## pat.karno=90 60 38 47.242 1.80800 3.23078
## pat.karno=100 35 21 26.207 1.03451 1.44067
##
## Chisq= 21.4 on 7 degrees of freedom, p= 0.00326
# plot it
plot(survfit(Surv(futime, fustat) ~ rx, data = ovarian))
# Cox model
(m <- coxph(Surv(futime, fustat) ~ rx + age, data = ovarian))
## Call:
## coxph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
##
## coef exp(coef) se(coef) z p
## rx -0.8040 0.4475 0.6320 -1.27 0.2034
## age 0.1473 1.1587 0.0461 3.19 0.0014
##
## Likelihood ratio test=15.9 on 2 df, p=0.000355
## n= 26, number of events= 12
summary(m)
## Call:
## coxph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
##
## n= 26, number of events= 12
##
## coef exp(coef) se(coef) z Pr(>|z|)
## rx -0.80397 0.44755 0.63205 -1.272 0.20337
## age 0.14733 1.15873 0.04615 3.193 0.00141 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## rx 0.4475 2.234 0.1297 1.545
## age 1.1587 0.863 1.0585 1.268
##
## Concordance= 0.798 (se = 0.091 )
## Rsquare= 0.457 (max possible= 0.932 )
## Likelihood ratio test= 15.89 on 2 df, p=0.0003551
## Wald test = 13.47 on 2 df, p=0.00119
## Score (logrank) test = 18.56 on 2 df, p=9.341e-05
# plot(m) - doesn't work
With the rms
package:
# Cox model
## only need once per session and we did above
# options(datadist = "dd")
dd <- datadist(ovarian)
(m <- cph(Surv(futime, fustat) ~ rx + age, data = ovarian))
## Cox Proportional Hazards Model
##
## cph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
##
## Model Tests Discrimination
## Indexes
## Obs 26 LR chi2 15.89 R2 0.490
## Events 12 d.f. 2 Dxy 0.596
## Center 7.0687 Pr(> chi2) 0.0004 g 1.734
## Score chi2 18.56 gr 5.661
## Pr(> chi2) 0.0001
##
## Coef S.E. Wald Z Pr(>|Z|)
## rx -0.8040 0.6320 -1.27 0.2034
## age 0.1473 0.0461 3.19 0.0014
##
# need x and y
m <- update(m, x = TRUE, y = TRUE)
survplot(m, rx)
survplot(m, age)
rms
functionsMore powerful than summary:
describe(ovarian)
## ovarian
##
## 6 Variables 26 Observations
## ---------------------------------------------------------------------------
## futime
## n missing distinct Info Mean Gmd .05 .10
## 26 0 26 1 599.5 390.3 125.2 212.0
## .25 .50 .75 .90 .95
## 368.0 476.0 794.8 1117.5 1186.8
##
## lowest : 59 115 156 268 329, highest: 1040 1106 1129 1206 1227
## ---------------------------------------------------------------------------
## fustat
## n missing distinct Info Sum Mean Gmd
## 26 0 2 0.747 12 0.4615 0.5169
##
## ---------------------------------------------------------------------------
## age
## n missing distinct Info Mean Gmd .05 .10
## 26 0 26 1 56.17 11.64 40.23 43.13
## .25 .50 .75 .90 .95
## 50.17 56.85 62.38 69.40 73.95
##
## lowest : 38.8932 39.2712 43.1233 43.1370 44.2055
## highest: 64.4247 66.4658 72.3315 74.4932 74.5041
## ---------------------------------------------------------------------------
## resid.ds
## n missing distinct Info Mean Gmd
## 26 0 2 0.733 1.577 0.5077
##
## Value 1 2
## Frequency 11 15
## Proportion 0.423 0.577
## ---------------------------------------------------------------------------
## rx
## n missing distinct Info Mean Gmd
## 26 0 2 0.751 1.5 0.52
##
## Value 1 2
## Frequency 13 13
## Proportion 0.5 0.5
## ---------------------------------------------------------------------------
## ecog.ps
## n missing distinct Info Mean Gmd
## 26 0 2 0.747 1.462 0.5169
##
## Value 1 2
## Frequency 14 12
## Proportion 0.538 0.462
## ---------------------------------------------------------------------------
Really awesome for univariate analysis:
summary(rx ~ age + futime + fustat, ovarian, method = "reverse", test = TRUE)
##
##
## Descriptive Statistics by rx
##
## +------+---------------------------+---------------------------+------------------------------+
## | |1 |2 | Test |
## | |(N=13) |(N=13) |Statistic |
## +------+---------------------------+---------------------------+------------------------------+
## |age | 43.1370/56.4301/66.4658| 53.9068/57.0521/59.6301| F=0.14 d.f.=1,24 P=0.709 |
## +------+---------------------------+---------------------------+------------------------------+
## |futime| 268/448/803 | 421/563/770 | F=1.48 d.f.=1,24 P=0.236 |
## +------+---------------------------+---------------------------+------------------------------+
## |fustat| 54% (7) | 38% (5) |Chi-square=0.62 d.f.=1 P=0.431|
## +------+---------------------------+---------------------------+------------------------------+
Read more about the function above in the help for summary.formula
.
Look at the edfun
dataset in the adm
package. It has four variables representing four presenting complaints (headache, neurologic, vision, and elevated blood pressure), phase of the study (1 or 2), sex, race, age, body mass index, and mean arterial pressure (MAP). Is there a linear relationship between higher MAP and older age?
summary(lm(map ~ age, edfun))
##
## Call:
## lm(formula = map ~ age, data = edfun)
##
## Residuals:
## Min 1Q Median 3Q Max
## -47.884 -13.180 -3.214 7.627 76.770
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 95.60066 2.14556 44.56 < 2e-16 ***
## age 0.27045 0.04348 6.22 8.54e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.38 on 702 degrees of freedom
## Multiple R-squared: 0.05223, Adjusted R-squared: 0.05088
## F-statistic: 38.69 on 1 and 702 DF, p-value: 8.536e-10