Branching Out & Feeling Loopy
Exposition
In the last lesson we had an introduction to functions, a way to encapsulate pieces of R code such that we can substitute different values (arguments) into that code and apply that “recipe” to a set of arbitrary values. However, what if we need the program to “think”, i.e., make decisions about what to do? What if we need to do something many, many times: ten times, one hundred times, a million times?
Yes, our function might be easier to write than the code it runs, but you still don’t want to write your function out that many times. This lesson begins to address both of these problems.
Control structures guide the flow of your program. The first group of control structures we will consider are branch control structures. They are called branch structures because they place a “fork in the road” for your program and provide a way for your program to “think”.
Branch structures
If-else
The first control structure to look at is if. Start by assigning 5 to x so that we have something to work with.
x <- 5
Now, read the code carefully below and then run it to see what happens.
if (x > 3) { "x is greater than 3" }
Now look at this code; it is similar but not exactly the same as the code above. What do you think will happen when you run it? Try it after you have thought about it.
if (x > 10) { "x is greater than 10" }
Did R return anything? No, because x is 5, which is not greater than 10, so R just skipped the block of code in those curly braces. You can add an else to provide an alternative block of code to execute if the expression in the parentheses following the keyword if is not TRUE.
Examine the following code, think about what it should do, and when you are done, run it to see if you are correct!
if (x > 10) {
  "x is greater than 10"
} else {
  "x is not greater than 10"
}
The output should make sense, I hope. You can also chain if-else statements together. Examine the following code. Think about what will happen, then run it:
if (x > 10) {
  "x is greater than 10"
} else if (x > 4) {
  "x is greater than 4"
} else {
  "x is less than or equal to 4"
}
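Because an if-else chain yields the value of whichever branch runs, it can be wrapped in a function and reused with different inputs. The sketch below assumes a hypothetical helper named describe_x (it is not part of this lesson) that applies the same thresholds as above to any single number.

```r
# describe_x() is a hypothetical helper, not part of the original lesson.
# It assumes `x` is a single number (a length-one numeric vector).
describe_x <- function(x) {
  if (x > 10) {
    "x is greater than 10"
  } else if (x > 4) {
    "x is greater than 4"
  } else {
    "x is less than or equal to 4"
  }
}

describe_x(5)   # takes the second branch
describe_x(12)  # takes the first branch
describe_x(4)   # falls through to the final else
```

Note that the condition inside if ( ) is meant to be a single TRUE or FALSE. For decisions on every element of a vector, the ifelse function covered later in this lesson is the better fit.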
Switch
Now, let’s try out a different branching structure, the switch statement, which I placed in a function called center. As you can see, the function center takes two arguments, x and type. The argument x is the numeric vector to be summarized, and type is a length-one character vector that indicates which summary statistic to compute, with a default of "mean". Examine the code below and think about what it does. When you are ready, run it.
center <- function(x, type = "mean") {
  switch(type,
    mean = mean(x),
    median = median(x),
    trim = mean(x, trim = 0.1)
  )
}
Hopefully you aren’t surprised that the code above doesn’t produce any output, as we were merely assigning a function to the name center. Let’s take the new function for a spin.
I’ve defined d as follows:
Try it using the following R code:
center(d, "mean")
Now try the median:
center(d, "median")
Next, try the trimmed mean:
center(d, "trim")
Finally, try your function on d without specifying the second argument:
center(d)
Did you get the answer you expected, i.e., the same as center(d, "mean")? Hopefully so!
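One thing the calls above don’t show is what happens when type matches none of the names. As far as I know, switch then quietly returns NULL (invisibly), which can hide a typo such as "meen". A common remedy is to add a final unnamed argument, which acts as the default. The sketch below uses a hypothetical variant called center_strict and a made-up vector vals, since neither appears in this lesson.

```r
# center_strict() is a hypothetical variant of center(), shown only as a sketch.
center_strict <- function(x, type = "mean") {
  switch(type,
    mean = mean(x),
    median = median(x),
    trim = mean(x, trim = 0.1),
    # An unnamed final argument is used when no name matches `type`.
    stop("Unknown type: ", type)
  )
}

vals <- c(1, 2, 3, 4, 100)     # made-up example data
center_strict(vals, "median")  # 3
# center_strict(vals, "mode") would stop with an error rather than return NULL
```

Whether to stop with an error or fall back to a sensible statistic is a design choice; failing loudly tends to make typos easier to spot.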
Loop structures
For loops
The for loop control structure allows you to iterate over a vector or list. Type in and try this R code:
for (x in 10:1) { print(x) }
The print function prints the value in x at that time. In the case of a loop, nothing is returned (unless you assign it somewhere, as we’ll see later), so we need the print function to show us what is going on.
Read the for loop as “for each element in the vector 10:1, call it x temporarily and do what is inside the curly brackets.” Recall that 10:1 is shorthand for c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1).
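Since the loop itself returns nothing, a common pattern is to create a container before the loop and fill it in one element at a time. The sketch below is only an illustration of that idea; the names values and squares are made up for this example.

```r
values <- 10:1
# Pre-allocate a numeric vector with one slot per element of `values`.
squares <- numeric(length(values))

# seq_along(values) gives the positions 1, 2, ..., 10.
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

squares  # the squared values, in the same order as `values`
```

Looping over positions rather than over the values themselves is what makes it possible to store each result in the matching slot.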
Now, I’ve already defined the list l as:
See what happens when you iterate over the object l like this (be careful with the “ell” vs. “one” distinction!):
for (x in l) { print(x[1]) }
As you see, it took each element of l, i.e., a and then b, assigned it temporarily to x, and printed the first value of each element.
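If you also need the name of each element, one option is to loop over names() and look each element up with [[ ]]. The list below, my_list, is a made-up stand-in with elements a and b; it is not necessarily the same as l.

```r
# my_list is a hypothetical example list, not the lesson's `l`.
my_list <- list(a = c(1, 2, 3), b = c("x", "y"))

for (nm in names(my_list)) {
  element <- my_list[[nm]]  # [[ ]] extracts the element itself
  cat(nm, "starts with", element[1], "\n")
}
```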
While loops
A while loop keeps doing something while the condition is TRUE. Study and run this code:
i <- 10
while (i > 0) {
  print(i)
  i <- i - 1
}
If you aren’t careful, you could write a loop that never ends. Study and try this code to see what happens:
i <- 1
while (i > 0) {
  print(i)
  i <- i + 1
}
I stopped that code after about a tenth of a second because otherwise it would have run forever! (Computers count really fast!) The condition i > 0 is always TRUE when I start from 1 and keep adding 1 to it! When you are in RStudio, you can stop the execution of code by clicking the red stop button in the upper right corner of the Console pane or by pressing the Esc key.
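While experimenting, one defensive habit is to cap the number of iterations so that a mistaken condition cannot run forever. This is just a sketch; n_iter and max_iter are names I made up for the illustration, and the cap of 5 is arbitrary.

```r
i <- 1
n_iter <- 0
max_iter <- 5  # arbitrary cap chosen for this illustration

# The second condition ends the loop even though i > 0 never becomes FALSE.
while (i > 0 && n_iter < max_iter) {
  print(i)
  i <- i + 1
  n_iter <- n_iter + 1
}
```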
Two reserved words permit further control of loops: break and next. When used within a loop (either a for or while loop), they allow you to completely break out of the loop and continue further along your program (i.e., break) or to skip the rest of the current iteration and start the next one (i.e., next). They can help with tricky situations.
Try running this example using break:
i <- 10
while (TRUE) {
  if (i == 0) break
  print(i)
  i <- i - 1
}
Above is a trivial example of using break. Here we just used TRUE as the while condition. Thus, it would be an infinite loop were it not for the use of break.
Now study this example, which also uses next in a trivial way just to illustrate the concepts we are learning. See if you can figure out what it will do. Go step by step, writing down what i is as you go, and figure out what will be printed. Then, run the code and see if you were right!
i <- 20
while (TRUE) {
  if (i < 5) break
  if (i %% 2) {  # i %% 2 is 1 (treated as TRUE) when i is odd
    i <- i - 1
    next  # go to next iteration if odd
  }
  print(i)
  i <- i - 1
}
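For comparison, once you have worked through the exercise, the same output can be produced with a for loop, where next alone does the skipping and no manual countdown is needed. This is an alternative sketch, not a replacement for the code above; the vector printed_values is a made-up name that is only there to collect what gets printed.

```r
printed_values <- c()  # hypothetical container for the printed numbers

for (i in 20:5) {
  if (i %% 2 != 0) next  # skip odd numbers
  print(i)
  printed_values <- c(printed_values, i)
}
```

Because the for loop advances i on its own, forgetting to decrement i before next (which would create an infinite loop in the while version) is not a risk here.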
The ifelse function
The ifelse function is a vectorized way of applying if-else logic. It is technically not a control structure because it does not change the flow of your program, but it is similar in concept to if-else and very useful for making decisions on each item in a vector or each row in a data.frame.
I’ve defined y as follows:
Now type and run this R code:
ifelse(y == 7, "seven", "not seven")
Try this R code to check for even numbers:
ifelse(y %% 2 == 0, "even", "odd")
You don’t have to return a character vector. Try this R code to return 0 for even numbers and 1 for odd numbers:
ifelse(y %% 2 == 0, 0, 1)
Here is a fun one using a new operator, the %in% operator. You will find this operator useful in many situations, and you can use it in places other than inside an ifelse function call:
ifelse(y %in% c(7, 4, 0), "in", "out")
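On its own, %in% returns a logical vector with one TRUE or FALSE per element on its left-hand side, which is exactly what ifelse needs as its first argument. The sketch below uses a made-up vector v, which is not necessarily the same as y.

```r
v <- c(7, 4, 2, 9, 0)        # hypothetical example vector

in_set <- v %in% c(7, 4, 0)  # TRUE TRUE FALSE FALSE TRUE
in_set

v[in_set]                    # %in% also works for subsetting: 7 4 0
ifelse(in_set, "in", "out")  # one label per element of v
```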
Hopefully, that made sense. If not, we can discuss more in class. You’ve learned a lot! That’s enough for now! Take a break!