Modeling

Exercise

R comes with functions for ordinary linear regression (lm), for generalized linear models (glm), and time-to-event models (package survival - do not need to install - comes with R).
However, a more powerful package for these common models is rms. You’ll see some messages after your load the package.

## install.packages("rms")
library(rms)

We’ll use data in the adm package:

# library(devtools); install_github("advdatamgmt/adm")
library(adm)

## 
## Attaching package: 'adm'

## The following object is masked from 'package:Hmisc':
## 
##     html

Linear regression

Let’s start with a linear regression of the optic nerve head data using the “built-in” version:

lm(mean ~ age, onhlong)

## 
## Call:
## lm(formula = mean ~ age, data = onhlong)
## 
## Coefficients:
## (Intercept)          age  
##      2.0065       0.1399

You don’t get much back that way. It turns out that R returns an object that has a lot of the useful data. You can access many aspects of the model. For a quick summary start with summary:

summary(lm(mean ~ age, onhlong))

## 
## Call:
## lm(formula = mean ~ age, data = onhlong)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8121 -0.6617  0.2338  0.6715  1.2797 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.00651    0.10378  19.335  < 2e-16 ***
## age          0.13995    0.01827   7.661 7.12e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7968 on 112 degrees of freedom
## Multiple R-squared:  0.3439, Adjusted R-squared:  0.338 
## F-statistic:  58.7 on 1 and 112 DF,  p-value: 7.125e-12

That is you usual summary… it will get tedious to keep typing the model so save it in a variable:

m <- lm(mean ~ age, onhlong)
summary(m)

## 
## Call:
## lm(formula = mean ~ age, data = onhlong)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8121 -0.6617  0.2338  0.6715  1.2797 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.00651    0.10378  19.335  < 2e-16 ***
## age          0.13995    0.01827   7.661 7.12e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7968 on 112 degrees of freedom
## Multiple R-squared:  0.3439, Adjusted R-squared:  0.338 
## F-statistic:  58.7 on 1 and 112 DF,  p-value: 7.125e-12

# coefficients
coef(m)

## (Intercept)         age 
##   2.0065072   0.1399477

# diagnostic plots
plot(m)

# influence measures
influence.measures(m)

## Influence measures of
##   lm(formula = mean ~ age, data = onhlong) :
## 
##       dfb.1_   dfb.age    dffit cov.r   cook.d     hat inf
## 1   -0.08896  0.032659 -0.09777 1.011 4.78e-03 0.00987    
## 2   -0.03687  0.012971 -0.04085 1.025 8.41e-04 0.00976    
## 3   -0.15262  0.097382 -0.15309 1.005 1.17e-02 0.01473    
## 4    0.11563 -0.072672  0.11613 1.016 6.75e-03 0.01442    
## 5    0.00468  0.002013  0.00870 1.027 3.81e-05 0.00927    
## 6   -0.11027  0.073173 -0.11037 1.020 6.10e-03 0.01565    
## 7   -0.15860  0.105248 -0.15875 1.005 1.25e-02 0.01565    
## 8    0.01792  0.000322  0.02524 1.026 3.21e-04 0.00877    
## 9    0.05986 -0.019993  0.06697 1.020 2.25e-03 0.00963    
## 10  -0.08797  0.047602 -0.08996 1.019 4.06e-03 0.01218    
## 11   0.00201 -0.001082  0.00206 1.031 2.13e-06 0.01213    
## 12   0.04087 -0.016278  0.04421 1.025 9.84e-04 0.01015    
## 13  -0.21844  0.149166 -0.21847 0.984 2.35e-02 0.01643    
## 14  -0.16493  0.111371 -0.16499 1.004 1.35e-02 0.01611    
## 15  -0.01895  0.012050 -0.01901 1.033 1.82e-04 0.01466    
## 16  -0.13828  0.060978 -0.14665 0.993 1.07e-02 0.01061    
## 17  -0.13974  0.068785 -0.14519 0.996 1.05e-02 0.01131    
## 18  -0.10898  0.069536 -0.10932 1.019 5.99e-03 0.01473    
## 19  -0.13417  0.084323 -0.13474 1.010 9.06e-03 0.01442    
## 20   0.13026 -0.078226  0.13138 1.009 8.61e-03 0.01359    
## 21  -0.05403 -0.048891 -0.13179 0.998 8.63e-03 0.01017    
## 22  -0.07187  0.045861 -0.07210 1.027 2.61e-03 0.01473    
## 23  -0.13490  0.092184 -0.13492 1.015 9.09e-03 0.01645    
## 24   0.22695 -0.704426 -0.79343 0.830 2.81e-01 0.04142   *
## 25   0.15272 -0.078901  0.15734 0.992 1.23e-02 0.01172    
## 26  -0.03797  0.023194 -0.03823 1.031 7.37e-04 0.01388    
## 27  -0.21508  0.146971 -0.21510 0.985 2.28e-02 0.01645    
## 31   0.09021 -0.059861  0.09029 1.025 4.09e-03 0.01565    
## 32   0.02051  0.018561  0.05003 1.024 1.26e-03 0.01017    
## 33   0.02518  0.034182  0.07615 1.020 2.91e-03 0.01099    
## 34   0.03412 -0.098001 -0.10880 1.063 5.96e-03 0.04649   *
## 35   0.02317  0.050774  0.09583 1.017 4.60e-03 0.01220    
## 36   0.04934 -0.011033  0.05899 1.020 1.75e-03 0.00909    
## 37   0.13359 -0.071933  0.13671 1.003 9.30e-03 0.01213    
## 38   0.08252 -0.020225  0.09733 1.009 4.74e-03 0.00917    
## 40   0.00891  0.019534  0.03687 1.029 6.85e-04 0.01220    
## 41   0.00314  0.070223  0.10074 1.025 5.09e-03 0.01706    
## 43   0.03845 -0.108445 -0.11998 1.064 7.24e-03 0.04793   *
## 44   0.04581 -0.030933  0.04582 1.032 1.06e-03 0.01611    
## 46   0.03843 -0.011564  0.04382 1.024 9.67e-04 0.00943    
## 47   0.03416  0.026261  0.07747 1.017 3.01e-03 0.00991    
## 48  -0.00677  0.029589  0.03526 1.048 6.27e-04 0.02965    
## 49   0.13717 -0.086208  0.13776 1.009 9.46e-03 0.01442    
## 50   0.01204  0.010053  0.02831 1.027 4.04e-04 0.01004    
## 51   0.05893  0.004702  0.08663 1.012 3.76e-03 0.00880    
## 53   0.06817 -0.017967  0.07949 1.015 3.17e-03 0.00924    
## 54   0.04164  0.048082  0.11491 1.007 6.59e-03 0.01063    
## 55  -0.00766  0.073678  0.09536 1.033 4.57e-03 0.02176    
## 60   0.01069  0.043985  0.07229 1.026 2.63e-03 0.01393    
## 61   0.14216 -0.064527  0.14993 0.992 1.11e-02 0.01077    
## 62   0.07257 -0.000685  0.10025 1.006 5.02e-03 0.00877    
## 63   0.11438 -0.243451 -0.25510 1.117 3.27e-02 0.09830   *
## 64   0.13189 -0.087523  0.13202 1.014 8.71e-03 0.01565    
## 65   0.01828  0.075209  0.12361 1.013 7.63e-03 0.01393    
## 66   0.08134 -0.178334 -0.18791 1.109 1.78e-02 0.08831   *
## 67   0.01336  0.054972  0.09035 1.022 4.10e-03 0.01393    
## 68   0.17438 -0.109592  0.17512 0.995 1.52e-02 0.01442    
## 110  0.07135 -0.026193  0.07841 1.017 3.08e-03 0.00987    
## 28   0.11471 -0.040353  0.12710 0.998 8.03e-03 0.00976    
## 39  -0.16203  0.103389 -0.16254 1.001 1.31e-02 0.01473    
## 42  -0.05466  0.034352 -0.05489 1.029 1.52e-03 0.01442    
## 52  -0.03463 -0.014879 -0.06428 1.019 2.08e-03 0.00927    
## 69  -0.11668  0.077431 -0.11679 1.018 6.83e-03 0.01565    
## 71  -0.11989  0.079561 -0.12001 1.018 7.21e-03 0.01565    
## 81  -0.07473 -0.001342 -0.10523 1.004 5.52e-03 0.00877    
## 91  -0.04655  0.015548 -0.05208 1.023 1.37e-03 0.00963    
## 101 -0.10452  0.056562 -0.10689 1.014 5.72e-03 0.01218    
## 111  0.02107 -0.011346  0.02156 1.030 2.34e-04 0.01213    
## 121 -0.11990  0.047754 -0.12970 0.999 8.36e-03 0.01015    
## 131 -0.18145  0.123908 -0.18148 0.999 1.63e-02 0.01643    
## 141 -0.09194  0.062086 -0.09197 1.025 4.25e-03 0.01611    
## 151 -0.09573  0.060878 -0.09605 1.022 4.63e-03 0.01466    
## 161 -0.12070  0.053226 -0.12801 1.001 8.15e-03 0.01061    
## 171 -0.07930  0.039035 -0.08239 1.019 3.41e-03 0.01131    
## 181 -0.12140  0.077464 -0.12178 1.015 7.42e-03 0.01473    
## 191 -0.10958  0.068869 -0.11005 1.018 6.06e-03 0.01442    
## 201  0.15416 -0.092579  0.15548 1.000 1.20e-02 0.01359    
## 211 -0.05403 -0.048891 -0.13179 0.998 8.63e-03 0.01017    
## 221 -0.08422  0.053737 -0.08448 1.024 3.59e-03 0.01473    
## 231 -0.22693  0.155074 -0.22696 0.980 2.53e-02 0.01645    
## 241  0.03131 -0.097176 -0.10945 1.057 6.03e-03 0.04142   *
## 251 -0.08361  0.043193 -0.08614 1.019 3.72e-03 0.01172    
## 261 -0.17102  0.104459 -0.17218 0.994 1.47e-02 0.01388    
## 271 -0.23203  0.158557 -0.23206 0.978 2.64e-02 0.01645    
## 311  0.06788 -0.045043  0.06794 1.029 2.32e-03 0.01565    
## 321  0.02785  0.025204  0.06794 1.020 2.32e-03 0.01017    
## 331  0.02254  0.030593  0.06816 1.022 2.33e-03 0.01099    
## 341  0.05015 -0.144042 -0.15992 1.058 1.28e-02 0.04649   *
## 351  0.02317  0.050774  0.09583 1.017 4.60e-03 0.01220    
## 361  0.05136 -0.011486  0.06140 1.020 1.90e-03 0.00909    
## 371  0.11972 -0.064468  0.12252 1.008 7.49e-03 0.01213    
## 381  0.07423 -0.018193  0.08755 1.012 3.84e-03 0.00917    
## 401  0.01568  0.034377  0.06489 1.024 2.12e-03 0.01220    
## 411  0.00345  0.077229  0.11079 1.023 6.15e-03 0.01706    
## 431  0.06162 -0.173801 -0.19228 1.055 1.85e-02 0.04793   *
## 441  0.07490 -0.050580  0.07493 1.028 2.82e-03 0.01611    
## 461  0.03412 -0.010268  0.03891 1.025 7.63e-04 0.00943    
## 471  0.04312  0.033146  0.09778 1.011 4.78e-03 0.00991    
## 481 -0.00762  0.033311  0.03970 1.048 7.95e-04 0.02965    
## 491  0.12178 -0.076533  0.12230 1.014 7.48e-03 0.01442    
## 501  0.01312  0.010952  0.03084 1.027 4.79e-04 0.01004    
## 511  0.06545  0.005222  0.09620 1.008 4.63e-03 0.00880    
## 531  0.07027 -0.018520  0.08194 1.014 3.37e-03 0.00924    
## 541  0.04452  0.051410  0.12287 1.003 7.52e-03 0.01063    
## 551 -0.00706  0.067825  0.08778 1.034 3.88e-03 0.02176    
## 601  0.00936  0.038503  0.06328 1.027 2.02e-03 0.01393    
## 611  0.14728 -0.066849  0.15533 0.989 1.19e-02 0.01077    
## 621  0.06910 -0.000652  0.09546 1.008 4.56e-03 0.00877    
## 631  0.12617 -0.268550 -0.28140 1.114 3.97e-02 0.09830   *
## 641  0.12224 -0.081119  0.12236 1.017 7.49e-03 0.01565    
## 651  0.01694  0.069676  0.11452 1.015 6.56e-03 0.01393    
## 661  0.09907 -0.217219 -0.22888 1.106 2.63e-02 0.08831   *
## 671  0.01069  0.043985  0.07229 1.026 2.63e-03 0.01393    
## 681  0.19628 -0.123358  0.19712 0.985 1.91e-02 0.01442

After you have loaded the rms package you can use the ols function. OLS is ordinary least squares:

m <- ols(mean ~ age, onhlong)
m # no need for summary

## Linear Regression Model
##  
##  ols(formula = mean ~ age, data = onhlong)
##  
##                 Model Likelihood     Discrimination    
##                    Ratio Test           Indexes        
##  Obs     114    LR chi2     48.04    R2       0.344    
##  sigma0.7968    d.f.            1    R2 adj   0.338    
##  d.f.    112    Pr(> chi2) 0.0000    g        0.597    
##  
##  Residuals
##  
##      Min      1Q  Median      3Q     Max 
##  -2.8121 -0.6617  0.2338  0.6715  1.2797 
##  
##  
##            Coef   S.E.   t     Pr(>|t|)
##  Intercept 2.0065 0.1038 19.33 <0.0001 
##  age       0.1399 0.0183  7.66 <0.0001 
##

# you can make nice plots with rms, but first do the following to let
# rms know the distribution of your dataset
options(datadist = "dd")
dd <- datadist(onhlong)

# now you can plot the actual line
plot(Predict(m, age))

# and if you do summary you get your CIs and an effect size based on the
# IQR of the different variables
summary(m)

##              Effects              Response : mean 
## 
##  Factor Low  High Diff. Effect  S.E.     Lower 0.95 Upper 0.95
##  age    0.67 5.83 5.16  0.72213 0.094256 0.53537    0.90889

# let's add in the case variable
m2 <- ols(mean ~ age + case, onhlong)
m2

## Linear Regression Model
##  
##  ols(formula = mean ~ age + case, data = onhlong)
##  
##                 Model Likelihood     Discrimination    
##                    Ratio Test           Indexes        
##  Obs     114    LR chi2    151.36    R2       0.735    
##  sigma0.5087    d.f.            2    R2 adj   0.730    
##  d.f.    111    Pr(> chi2) 0.0000    g        0.928    
##  
##  Residuals
##  
##       Min       1Q   Median       3Q      Max 
##  -1.47290 -0.28740 -0.03718  0.24526  1.70644 
##  
##  
##            Coef    S.E.   t      Pr(>|t|)
##  Intercept  2.9292 0.0979  29.91 <0.0001 
##  age        0.0647 0.0131   4.95 <0.0001 
##  case=1    -1.3711 0.1072 -12.80 <0.0001 
##

plot(Predict(m2, case))

summary(m2)

##              Effects              Response : mean 
## 
##  Factor     Low  High Diff. Effect  S.E.     Lower 0.95 Upper 0.95
##  age        0.67 5.83 5.16   0.3336 0.067406  0.20003    0.46717  
##  case - 1:0 1.00 2.00   NA  -1.3711 0.107150 -1.58340   -1.15880

plot(m2, which = 1) # check the help for plot.lm

which.influence(m2)

## Error in residuals.ols(fit, "dfbetas"): did not specify x=TRUE in fit

Sometimes rms will ask you to specify x or y in the fit. Do it like this:

m2 <- ols(mean ~ age + case, onhlong, x = TRUE)

It just saves extra info about the data which makes the model object bigger but doesn’t change anything else:

which.influence(m2)

## $Intercept
## [1] 24 39 81
## 
## $age
## [1] 24 81 95
## 
## $case
## [1]  5  8  9 24 25 58 59 81

show.influence(which.influence(m2), onhlong)

##    Count    age case
## 5      1   4.92   *1
## 8      1   4.00   *1
## 9      1   2.67   *1
## 24     3 *11.83   *1
## 25     1   1.58   *1
## 39     1   0.58    1
## 81     3 * 4.00   *1

If you want to specify interaction (R keeps things HWF) use the * for all the terms:

(m3 <- ols(mean ~ age * case , onhlong))

## Linear Regression Model
##  
##  ols(formula = mean ~ age * case, data = onhlong)
##  
##                 Model Likelihood     Discrimination    
##                    Ratio Test           Indexes        
##  Obs     114    LR chi2    154.29    R2       0.742    
##  sigma0.5045    d.f.            3    R2 adj   0.735    
##  d.f.    110    Pr(> chi2) 0.0000    g        0.932    
##  
##  Residuals
##  
##       Min       1Q   Median       3Q      Max 
##  -1.89925 -0.28756 -0.02924  0.22494  1.72189 
##  
##  
##               Coef    S.E.   t      Pr(>|t|)
##  Intercept     2.9922 0.1040  28.78 <0.0001 
##  age           0.0535 0.0145   3.68 0.0004  
##  case=1       -1.5176 0.1370 -11.08 <0.0001 
##  age * case=1  0.0543 0.0320   1.69 0.0930  
##

Finally, one useful function for removing or adding terms as you model is update. The . means keep everything on that side and the - will take terms away while + would add:

update(m2, . ~ . - age)

## Linear Regression Model
##  
##  ols(formula = mean ~ case, data = onhlong, x = TRUE)
##  
##                 Model Likelihood     Discrimination    
##                    Ratio Test           Indexes        
##  Obs     114    LR chi2    128.63    R2       0.676    
##  sigma0.5596    d.f.            1    R2 adj   0.674    
##  d.f.    112    Pr(> chi2) 0.0000    g        0.806    
##  
##  Residuals
##  
##       Min       1Q   Median       3Q      Max 
##  -1.06340 -0.36089 -0.05339  0.30661  1.69660 
##  
##  
##            Coef    S.E.   t      Pr(>|t|)
##  Intercept  3.2934 0.0711  46.34 <0.0001 
##  case=1    -1.6100 0.1052 -15.30 <0.0001 
##

update(m2, . ~ . - case)

## Linear Regression Model
##  
##  ols(formula = mean ~ age, data = onhlong, x = TRUE)
##  
##                 Model Likelihood     Discrimination    
##                    Ratio Test           Indexes        
##  Obs     114    LR chi2     48.04    R2       0.344    
##  sigma0.7968    d.f.            1    R2 adj   0.338    
##  d.f.    112    Pr(> chi2) 0.0000    g        0.597    
##  
##  Residuals
##  
##      Min      1Q  Median      3Q     Max 
##  -2.8121 -0.6617  0.2338  0.6715  1.2797 
##  
##  
##            Coef   S.E.   t     Pr(>|t|)
##  Intercept 2.0065 0.1038 19.33 <0.0001 
##  age       0.1399 0.0183  7.66 <0.0001 
##

You can also add additional valid arguements (like x = TRUE):

which.influence(update(m3, x = TRUE))

## $Intercept
## [1] 39 85 96
## 
## $age
## [1] 39 95
## 
## $case
## [1]  4 24 25 39 81
## 
## $`age * case`
## [1]  5 24 81

Logistic regression

For logistic regression use rms’s lrm instead of ols.

(m <- lrm(case ~ age, onhlong))

## Logistic Regression Model
##  
##  lrm(formula = case ~ age, data = onhlong)
##  
##                        Model Likelihood     Discrimination    Rank Discrim.    
##                           Ratio Test           Indexes           Indexes       
##  Obs           114    LR chi2      29.36    R2       0.304    C       0.793    
##   0             62    d.f.             1    g        1.517    Dxy     0.586    
##   1             52    Pr(> chi2) <0.0001    gr       4.560    gamma   0.594    
##  max |deriv| 4e-08                          gp       0.268    tau-a   0.293    
##                                             Brier    0.185                     
##  
##            Coef    S.E.   Wald Z Pr(>|Z|)
##  Intercept  1.0033 0.3149  3.19  0.0014  
##  age       -0.3556 0.0844 -4.21  <0.0001 
##

plot(Predict(m, age)) # remember that the model forces the log-odds to be linear

summary(m)

##              Effects              Response : case 
## 
##  Factor      Low  High Diff. Effect  S.E.    Lower 0.95 Upper 0.95
##  age         0.67 5.83 5.16  -1.8351 0.43574 -2.689100  -0.98103  
##   Odds Ratio 0.67 5.83 5.16   0.1596      NA  0.067942   0.37493

summary(m, age = c(4,5))

##              Effects              Response : case 
## 
##  Factor      Low High Diff. Effect   S.E.     Lower 0.95 Upper 0.95
##  age         4   5    1     -0.35563 0.084446 -0.52114   -0.19012  
##   Odds Ratio 4   5    1      0.70073       NA  0.59384    0.82686

Time-to-event

For time-to-event you can use the built-in functions from the survival package.

library(survival)

# the Surv specifies the follow-up time and the 
# status of the subject; survfit is K-M
survfit(Surv(futime, fustat) ~ rx, data = ovarian)

## Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
## 
##       n events median 0.95LCL 0.95UCL
## rx=1 13      7    638     268      NA
## rx=2 13      5     NA     475      NA

survdiff(Surv(futime, fustat) ~ rx, data = ovarian)

## Call:
## survdiff(formula = Surv(futime, fustat) ~ rx, data = ovarian)
## 
##       N Observed Expected (O-E)^2/E (O-E)^2/V
## rx=1 13        7     5.23     0.596      1.06
## rx=2 13        5     6.77     0.461      1.06
## 
##  Chisq= 1.1  on 1 degrees of freedom, p= 0.303

# stratify us strata in the survival package
survdiff(Surv(time, status) ~ pat.karno + strata(inst), data=lung)

## Call:
## survdiff(formula = Surv(time, status) ~ pat.karno + strata(inst), 
##     data = lung)
## 
## n=224, 4 observations deleted due to missingness.
## 
##                N Observed Expected (O-E)^2/E (O-E)^2/V
## pat.karno=30   2        1    0.692   0.13720   0.15752
## pat.karno=40   2        1    1.099   0.00889   0.00973
## pat.karno=50   4        4    1.166   6.88314   7.45359
## pat.karno=60  30       27   16.298   7.02790   9.57333
## pat.karno=70  41       31   26.358   0.81742   1.14774
## pat.karno=80  50       38   41.938   0.36978   0.60032
## pat.karno=90  60       38   47.242   1.80800   3.23078
## pat.karno=100 35       21   26.207   1.03451   1.44067
## 
##  Chisq= 21.4  on 7 degrees of freedom, p= 0.00326

# plot it
plot(survfit(Surv(futime, fustat) ~ rx, data = ovarian))

# Cox model
(m <- coxph(Surv(futime, fustat) ~ rx + age, data = ovarian))

## Call:
## coxph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
## 
##        coef exp(coef) se(coef)     z      p
## rx  -0.8040    0.4475   0.6320 -1.27 0.2034
## age  0.1473    1.1587   0.0461  3.19 0.0014
## 
## Likelihood ratio test=15.9  on 2 df, p=0.000355
## n= 26, number of events= 12

summary(m)

## Call:
## coxph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
## 
##   n= 26, number of events= 12 
## 
##         coef exp(coef) se(coef)      z Pr(>|z|)   
## rx  -0.80397   0.44755  0.63205 -1.272  0.20337   
## age  0.14733   1.15873  0.04615  3.193  0.00141 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## rx     0.4475      2.234    0.1297     1.545
## age    1.1587      0.863    1.0585     1.268
## 
## Concordance= 0.798  (se = 0.091 )
## Rsquare= 0.457   (max possible= 0.932 )
## Likelihood ratio test= 15.89  on 2 df,   p=0.0003551
## Wald test            = 13.47  on 2 df,   p=0.00119
## Score (logrank) test = 18.56  on 2 df,   p=9.341e-05

# plot(m) - doesn't work

With the rms package:

# Cox model
## only need once per session and we did above
# options(datadist = "dd")
dd <- datadist(ovarian)
(m <- cph(Surv(futime, fustat) ~ rx + age, data = ovarian))

## Cox Proportional Hazards Model
##  
##  cph(formula = Surv(futime, fustat) ~ rx + age, data = ovarian)
##  
##                      Model Tests       Discrimination    
##                                           Indexes        
##  Obs        26    LR chi2     15.89    R2       0.490    
##  Events     12    d.f.            2    Dxy      0.596    
##  Center 7.0687    Pr(> chi2) 0.0004    g        1.734    
##                   Score chi2  18.56    gr       5.661    
##                   Pr(> chi2) 0.0001                      
##  
##      Coef    S.E.   Wald Z Pr(>|Z|)
##  rx  -0.8040 0.6320 -1.27  0.2034  
##  age  0.1473 0.0461  3.19  0.0014  
##

# need x and y
m <- update(m, x = TRUE, y = TRUE)
survplot(m, rx)

survplot(m, age)

Extra `rms` functions

More powerful than summary:

describe(ovarian)

## ovarian 
## 
##  6  Variables      26  Observations
## ---------------------------------------------------------------------------
## futime 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##       26        0       26        1    599.5    390.3    125.2    212.0 
##      .25      .50      .75      .90      .95 
##    368.0    476.0    794.8   1117.5   1186.8 
## 
## lowest :   59  115  156  268  329, highest: 1040 1106 1129 1206 1227
## ---------------------------------------------------------------------------
## fustat 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##       26        0        2    0.747       12   0.4615   0.5169 
## 
## ---------------------------------------------------------------------------
## age 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##       26        0       26        1    56.17    11.64    40.23    43.13 
##      .25      .50      .75      .90      .95 
##    50.17    56.85    62.38    69.40    73.95 
## 
## lowest : 38.8932 39.2712 43.1233 43.1370 44.2055
## highest: 64.4247 66.4658 72.3315 74.4932 74.5041
## ---------------------------------------------------------------------------
## resid.ds 
##        n  missing distinct     Info     Mean      Gmd 
##       26        0        2    0.733    1.577   0.5077 
##                       
## Value          1     2
## Frequency     11    15
## Proportion 0.423 0.577
## ---------------------------------------------------------------------------
## rx 
##        n  missing distinct     Info     Mean      Gmd 
##       26        0        2    0.751      1.5     0.52 
##                   
## Value        1   2
## Frequency   13  13
## Proportion 0.5 0.5
## ---------------------------------------------------------------------------
## ecog.ps 
##        n  missing distinct     Info     Mean      Gmd 
##       26        0        2    0.747    1.462   0.5169 
##                       
## Value          1     2
## Frequency     14    12
## Proportion 0.538 0.462
## ---------------------------------------------------------------------------

Really awesome for univariate analysis:

summary(rx ~ age + futime + fustat, ovarian, method = "reverse", test = TRUE)

## 
## 
## Descriptive Statistics by rx
## 
## +------+---------------------------+---------------------------+------------------------------+
## |      |1                          |2                          |  Test                        |
## |      |(N=13)                     |(N=13)                     |Statistic                     |
## +------+---------------------------+---------------------------+------------------------------+
## |age   |    43.1370/56.4301/66.4658|    53.9068/57.0521/59.6301|   F=0.14 d.f.=1,24 P=0.709   |
## +------+---------------------------+---------------------------+------------------------------+
## |futime|          268/448/803      |          421/563/770      |   F=1.48 d.f.=1,24 P=0.236   |
## +------+---------------------------+---------------------------+------------------------------+
## |fustat|           54%  (7)        |           38%  (5)        |Chi-square=0.62 d.f.=1 P=0.431|
## +------+---------------------------+---------------------------+------------------------------+

Read more about the function above in the help for summary.formula.

Modeling

Beau B. Bruce

Exercise

Linear regression

Logistic regression

Time-to-event

Extra `rms` functions

Explore and Extend

Modeling

Beau B. Bruce

Exercise

Linear regression

Logistic regression

Time-to-event

Extra rms functions

Explore and Extend

Extra `rms` functions