Advanced Table Example
Introduction
Here we create a more advanced table that would be suitable for inclusion in a final report or manuscript. The table is formatted entirely in HTML, the main language used to create webpages, so that it looks the way you would expect in a published manuscript or official report.
The amount of work that generally goes into creating a table like this is substantial, so I would only suggest you use this approach for a report that you expect to be publishing on a regular basis or for a report or manuscript in development which has frequent changes to the underlying data or analysis.
We will use the included lbw dataset. This dataset contains data on low birth weight (lbw) babies and various maternal risk factors and was used in the book Applied Logistic Regression by Hosmer and Lemeshow. Take a look at the dataset.
We need to create some derived variables for the table.
Create the formatted table data
I’ve determined that I’m going to need eight columns in my table, so I’ll start by creating a data.frame with eight character columns. Because I’m going to be creating the entire table instead of using a table shell in Word, I’ll just go ahead and put my column headers in, noting that some are blank.
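As a rough illustration of this step, here is a minimal sketch. The header labels are hypothetical placeholders (the real ones are not shown here), and the object name tbl is an assumption.

```r
# Hypothetical header labels; the empty strings are the intentionally blank headers.
headers <- c("", "n", "%", "", "n", "%", "Test", "p-value")

# A one-row data.frame with eight character columns. The column names are
# throwaway defaults (V1..V8) because the displayed headers live in row 1.
tbl <- as.data.frame(matrix(headers, nrow = 1), stringsAsFactors = FALSE)

str(tbl)
```

Keeping every column as character means the formatted counts, percentages, and p-values can later be stored exactly as they should be displayed.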
Add in the table rows one at a time, but first add a couple of helper functions.
A function to format p-values with an appropriate number of decimal places. You’ll definitely want to keep this function handy for future use! The special comments are a standard way to document your functions in your R code, and roxygen2 can turn them into help files later if you wish.
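One possible shape for such a helper is sketched below. The name format_p, the digits argument, and the "<0.001" convention for very small values are assumptions for illustration, not necessarily the rules used for the table in this example.

```r
#' Format p-values for display
#'
#' @param p Numeric vector of p-values.
#' @param digits Number of decimal places to show.
#' @return Character vector; values below 10^-digits are shown as "<0.001"
#'   (for digits = 3), and missing values stay missing.
format_p <- function(p, digits = 3) {
  threshold <- 10^(-digits)
  ifelse(
    is.na(p),
    NA_character_,
    ifelse(
      p < threshold,
      paste0("<", formatC(threshold, format = "f", digits = digits)),
      formatC(p, format = "f", digits = digits)
    )
  )
}

format_p(c(0.2, 0.0312, 0.00004))
```

The lines starting with #' are the roxygen2-style comments: a title, then @param and @return tags describing the arguments and the result.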
A function to figure out automatically if the chi-square test is appropriate or if we need to use Fisher’s exact test, reporting the correctly formatted result for our table. Note how we use the internals of the summary function’s results to decide how to proceed.
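The idea can be sketched as follows. Calling summary on a contingency table returns a list that includes approx.ok (TRUE when all expected cell counts are at least 5) and the chi-square p.value. The function name test_association and the list it returns are assumptions for illustration, and the toy vectors are made up.

```r
# Choose between the chi-square test and Fisher's exact test for two
# categorical vectors, based on the expected cell counts.
test_association <- function(x, y) {
  tab <- table(x, y)
  s <- summary(tab)  # chi-square test of independence plus approx.ok flag
  if (isTRUE(s$approx.ok)) {
    list(test = "Chi-square", p.value = s$p.value)
  } else {
    list(test = "Fisher's exact", p.value = fisher.test(tab)$p.value)
  }
}

# Small made-up sample: expected counts fall below 5, so Fisher's test is used.
x_small <- c("a", "a", "a", "b", "b", "b", "b", "a")
y_small <- c("y", "n", "y", "n", "n", "y", "n", "y")
test_association(x_small, y_small)$test

# Larger made-up sample: every expected count is 25, so chi-square is used.
x_big <- rep(c("a", "b"), each = 50)
y_big <- rep(c("y", "n"), times = 50)
test_association(x_big, y_big)$test
```

Note that the chi-square statistic reported by summary on a table does not apply a continuity correction, so its p-value can differ slightly from the default chisq.test result for a 2 x 2 table.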
Now the main rows. Start with a vector that holds the formatted variable names and associated units as appropriate. Use row_names so you don’t clash with the built-in rownames function.
Create a function to make each formatted row.
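A row-building function might look something like the sketch below. It assumes a TRUE/FALSE risk factor and a TRUE/FALSE outcome, and an eight-column layout of label, n and % for each group, one blank spacer, then test name and p-value. The name make_row, its arguments, and the toy data are all hypothetical.

```r
# Build one formatted table row (a character vector of length eight)
# for a binary risk factor, split by outcome group.
make_row <- function(label, factor_present, low, test_text = "", p_text = "") {
  n_low    <- sum(factor_present[low])
  n_normal <- sum(factor_present[!low])
  pct <- function(n, denom) sprintf("%.1f", 100 * n / denom)
  c(label,
    as.character(n_low), pct(n_low, sum(low)),
    "",                                   # blank spacer column
    as.character(n_normal), pct(n_normal, sum(!low)),
    test_text, p_text)
}

# Made-up toy data for illustration only.
low   <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
smoke <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
make_row("Smoker", smoke, low)
```

Returning a plain character vector keeps each row easy to bind onto the character data.frame that holds the headers.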
Now put this together with some fancy footwork.
That looks great! We just need to put in the headers that span across the low birth weight and non-low birth weight columns. To make those, we need to get the sample sizes for each group. See how I pull them from the data rather than hard-coding them.
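A minimal sketch of pulling the group sizes from the data, using a made-up logical outcome vector in place of the real variable (the names low, span_low, and span_normal are assumptions):

```r
# Toy outcome indicator, made up for illustration (TRUE = low birth weight).
low <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

# Count each group directly from the data instead of typing the numbers in.
n_low    <- sum(low)
n_normal <- sum(!low)

span_low    <- sprintf("Low birth weight (n = %d)", n_low)
span_normal <- sprintf("Non-low birth weight (n = %d)", n_normal)

span_low
span_normal
```

If the underlying data change, the header text updates itself the next time the report is rendered.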
Create the HTML table
Now we have all the formatted data we need to create the HTML table. Let’s load a package that will help us create HTML elements, htmltools.
In HTML, the <table> element is made up of rows (<tr> elements), which are made up of cells. There are two types of cells: header cells (<th>) and data cells (<td>). These have different default formatting in HTML, which allows us to change the look of the header cells independently of the data cells if we wish. We will also specify some formatting attributes in the cells using CSS (cascading style sheets), the language used to format HTML elements.
We will make <th> elements for the first row and put a border on the bottom of that <tr> using CSS. For the remaining rows, we will use <td> elements, and will also put a border on the last <tr>. We will use a for loop to create the rows because we want to treat different rows differently, whereas lapply works best when we want to do the same thing to each element.
However, we will use lapply inside the for loop to apply <th> or <td> to the individual cells within each row, since there we are doing the same thing to each cell.
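The loop described above can be sketched like this, assuming the htmltools package is installed. The three-column toy data, the object names, and the exact CSS are assumptions for illustration. The border-collapse setting is there because borders set on <tr> elements are only drawn when the table uses collapsed borders.

```r
library(htmltools)

# Made-up formatted data for illustration; row 1 holds the header text.
tbl <- data.frame(V1 = c("", "Smoker", "Hypertension"),
                  V2 = c("n", "10", "5"),
                  V3 = c("%", "50.0", "25.0"),
                  stringsAsFactors = FALSE)

rows <- vector("list", nrow(tbl))
for (i in seq_len(nrow(tbl))) {
  # unname() so the cell values are passed as children, not as attributes.
  cells <- unname(unlist(tbl[i, ]))
  if (i == 1) {
    # Header row: <th> cells with a border underneath.
    rows[[i]] <- tags$tr(style = "border-bottom: 1px solid black;",
                         lapply(cells, tags$th))
  } else if (i == nrow(tbl)) {
    # Last row: <td> cells with a closing border.
    rows[[i]] <- tags$tr(style = "border-bottom: 1px solid black;",
                         lapply(cells, tags$td))
  } else {
    rows[[i]] <- tags$tr(lapply(cells, tags$td))
  }
}

html_table <- tags$table(style = "border-collapse: collapse;", rows)
cat(as.character(html_table))
```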
Build our fixed spanning headers. The &nbsp; entities below are likely new to you. Each one represents a non-breaking space in HTML. If we don’t use them, HTML will collapse multiple spaces into a single space when it renders the table. Their use here is simply to add some padding to the empty column in the middle of the table for readability. We only have to do it in one place because it will widen that whole column.
One could cut-and-paste the Unicode character for the non-breaking space, but because it looks like an ordinary space (here’s one now in the quotes “ ”), using it that way is a bit opaque.
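A spanning header row along these lines could be built as sketched below, assuming htmltools is installed. The group sizes are hypothetical placeholders, as are the label text and object names. HTML() marks the string as raw HTML so the entity is not escaped, and colspan makes a header cell stretch over several columns.

```r
library(htmltools)

# Hypothetical placeholder group sizes; in practice compute these from the data.
n_low    <- 10L
n_normal <- 20L

# Cells span 1 + 2 + 1 + 2 + 2 = 8 columns in total.
span_row <- tags$tr(
  tags$th(""),
  tags$th(colspan = 2, sprintf("Low birth weight (n = %d)", n_low)),
  tags$th(HTML("&nbsp;&nbsp;&nbsp;")),  # padding for the empty middle column
  tags$th(colspan = 2, sprintf("Non-low birth weight (n = %d)", n_normal)),
  tags$th(colspan = 2, "")
)

cat(as.character(span_row))
```

Without the HTML() wrapper, htmltools would escape the ampersand and the page would show the literal text of the entity instead of a space.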
Finally, some footnotes.
And all together now, plus a caption! Line break tags <br> go between the footnotes, so we don’t apply the function to each footnote.
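Putting the pieces together might look roughly like the sketch below, assuming htmltools is installed. The caption, the two-column placeholder table, and the footnote text are all hypothetical stand-ins for the real content.

```r
library(htmltools)

final <- tagList(
  tags$table(
    style = "border-collapse: collapse;",
    tags$caption("Table 1. Hypothetical caption text"),
    tags$tr(style = "border-bottom: 1px solid black;",
            tags$th("Characteristic"), tags$th("p-value")),
    tags$tr(style = "border-bottom: 1px solid black;",
            tags$td("Placeholder row"), tags$td("<0.001"))
  ),
  # Footnotes share one paragraph, separated by <br> line breaks.
  tags$p(
    style = "font-size: small;",
    "Hypothetical footnote one.",
    tags$br(),
    "Hypothetical footnote two."
  )
)

cat(as.character(final))
```

In an R Markdown document, printing final in a chunk should render the table in the HTML output. In an interactive session, wrapping it in browsable() is one way to preview it. Note that htmltools escapes the "<" in "<0.001" for you, so the cell displays correctly.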