---
title: "Writing Functions in R"
params:
  category: 6
  number: 121
  time: 60
  level: beginner
  tags: "r,primer,loops"
  description: "Explains the concept of functions and their implementation
                in R. Demonstrates some of the unique mechanisms for
                writing and calling functions in R."
date: "<small>`r Sys.Date()`</small>"
author: "<small>Martin Schedlbauer</small>"
email: "m.schedlbauer@neu.edu"
affilitation: "Northeastern University"
output: 
  bookdown::html_document2:
    toc: true
    toc_float: true
    collapsed: false
    number_sections: false
    code_download: true
    theme: spacelab
    highlight: tango
---

---
title: "<small>`r params$category`.`r params$number`</small><br/><span style='color: #2E4053; font-size: 0.9em'>`r rmarkdown::metadata$title`</span>"
---

```{r code=xfun::read_utf8(paste0(here::here(),'/R/_insert2DB.R')), include = FALSE}
```

## Prerequisites

-   [6.100 ┆ Beginning R](http://artificium.us/lessons/06.r/l-6-100-beginning-r/l-6-100.html)
-   [6.104 ┆ Quick Guide to R For Programmers](http://artificium.us/lessons/06.r/l-6-104-r4progs/l-6-104.html)

## Introduction

Functions are an important programming construct. They help structure large programs and minimize duplication of code. Additionally, they make code easier to read and debug. R, just like all other modern programming languages, supports functions.

A quick note on terminology: a function returns a value while a procedure does not. R, like many other languages, does not support procedures directly, because every R function returns a value. A function that is called only for its side effects (such as printing) and whose return value is ignored plays the role of a procedure. This is similar to functions returning *void* in C/C++ and Java.

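R is often written in a procedural style similar to C, although at its core it is a functional language. R does support object-oriented programming, including classes, inheritance, and polymorphism, through several systems (S3, S4, and Reference Classes), but these work quite differently from *class* in C++ or Java and are beyond the scope of this lesson. There is no direct equivalent of *struct* in C/C++; lists and data frames are commonly used to group related data.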

Ordinary functions in R are free-standing and are not bound to an object or a class the way methods are in Java or C++. This is similar to module-level functions in Python or functions in C.

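Every function in R returns a value, whether or not it contains a *return* statement. Without one, the function returns the value of the last expression it evaluated, which may be *NULL*. So, R does not distinguish between functions and procedures and there is no *void* return type as in C, C++, and Java.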
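
As a small illustration of this point, the sketch below uses a hypothetical function *logMessage()* (it is not part of the lesson's code) that is called only for its side effect of printing. It still returns a value, in this case *NULL*, because that is what *cat()* returns.

```{r}
# logMessage() is a hypothetical helper used only for illustration
logMessage <- function(msg) {
  cat("LOG:", msg, "\n")   # cat() prints its arguments and returns NULL
}

result <- logMessage("starting")  # prints the log line
print(is.null(result))            # the "procedure" still returned a value: NULL
```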

Programs in R are scripts. There is no "main function" or similar. An R program starts executing at the first statement of the script and proceeds from top to bottom, so a function must be defined before it is called. Functions can be placed into packages for reuse and inclusion by other programs, but that is beyond the scope of this lesson.
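
The effect of top-to-bottom execution can be sketched as follows. The function name *greetUser()* is hypothetical and chosen only for this example; *tryCatch()* is used so that the failed call does not stop the script.

```{r}
# greetUser() is a hypothetical function that has not been defined yet,
# so the first call fails; tryCatch() captures the error message instead
msg <- tryCatch(greetUser("Ada"), error = function(e) conditionMessage(e))
print(msg)

# after the definition has been executed, the same call succeeds
greetUser <- function(name) {
  paste("Hello,", name)
}

greeting <- greetUser("Ada")
print(greeting)
```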

## Defining a Function

The generic template for defining a function is:

```{r eval=FALSE}
function_name <- function(arg_1, arg_2, ...) {
   # function body
}
```

### Simple Program with Function

The program below is complete and defines a function that is then called. The function must be defined *before* it can be called.

```{r simpleFuncInProg}
# Full R Program 

addNums <- function(n1, n2) 
{
  if (is.numeric(n1) && is.numeric(n2))
  {
    r <- n1 + n2
  
    return(r)
  } else {
    n1 <- as.numeric(n1)
    n2 <- as.numeric(n2)
    
    if (is.na(n1) || is.na(n2)) {
      return(NA)
    } else {
      return(addNums(n1,n2))
    }
  }
}

a <- 10
b <- 20

c <- addNums(a,b)
print(c)

d <- addNums(a,50)
print(d)

e <- addNums(a*10,b*45)
print(e)

f <- addNums('1', "20.3")
print(f)
```

The second program sums the elements of a vector that can be converted to a number and skips the rest. Because a vector can hold only one type, including *'x'* turns *vec.nums* into a character vector, so each element is converted with *as.numeric()*, which issues a warning for *'x'*.

```{r simpleVectorFuncInProg}
# Full R Program 

addVecNums <- function(v) 
{
  if (!is.vector(v) || length(v) < 1)
    return(NA)
  
  s <- 0
  for (i in 1:length(v))
  {
    n <- as.numeric(v[i])
    if (!is.na(n))
      s <- s + n
  }
  
  return(s)
}


vec.nums <- c(23,55,34,87,65,'x',11,98)

e <- addVecNums(vec.nums)
print(e)
```
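
The loop above mirrors how the task would be written in most other languages. As a hedged alternative, the same result can be obtained with R's vectorized functions. The name *addVecNumsVectorized()* is made up for this sketch, and *suppressWarnings()* is used here only to silence the expected coercion warning.

```{r}
# addVecNumsVectorized() is a hypothetical variant of addVecNums()
addVecNumsVectorized <- function(v) 
{
  if (!is.vector(v) || length(v) < 1)
    return(NA)
  
  # convert all elements at once; non-numeric elements become NA
  n <- suppressWarnings(as.numeric(v))
  
  # sum the converted values and ignore the NAs
  return(sum(n, na.rm = TRUE))
}

total <- addVecNumsVectorized(c(23,55,34,87,65,'x',11,98))
print(total)
```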

### Example Function

The example below defines a function called *findSmallest()* which takes a vector of positive integers as an argument and returns the smallest element in the vector. While it can be solved in several ways, we will show a design that uses loops and should be familiar to programmers of most other languages.

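Note that we are using the predefined value *Inf*, which represents positive infinity and is larger than any finite number. There is also *-Inf*, negative infinity, which is smaller than any finite number. Both are floating-point (double) values rather than integers.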

```{r functionDef}
findSmallest = function(v)
{
  s = Inf
  for (i in 1:length(v))
  {
    if (v[i] < s) {
      s = v[i]
    }
  }
  return (s)
}
```
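
The special values *Inf* and *-Inf* can be inspected directly. The short sketch below only uses base R and shows that they are floating-point values that compare as larger (or smaller) than any finite number.

```{r}
print(typeof(Inf))                   # Inf is stored as a double
print(is.finite(Inf))                # FALSE
print(Inf > .Machine$double.xmax)    # TRUE: larger than the largest finite double
print(-Inf < -.Machine$double.xmax)  # TRUE: smaller than the smallest finite double
print(.Machine$integer.max)          # the largest representable integer
```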

While you can use *=* to define a function, you should really get used to using the more common *\<-* syntax. So, let's try again:

```{r functionDefBetter}
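findSmallest <- function(v)
{
  # start with a value that is larger than any element
  s <- Inf
  # seq_along(v) yields 1, 2, ..., length(v) and nothing for an empty vector
  for (i in seq_along(v))
  {
    if (v[i] < s) {
      s <- v[i]
    }
  }
  return (s)
}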
```

Just to be clear, in practice you would use the <code>min()</code> function to find the smallest element rather than writing it yourself.
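
For comparison, here is a brief sketch of the built-in functions, together with one pitfall of loops written as `1:length(v)`: for an empty vector the range counts down from 1 to 0 instead of being empty, which is why `seq_along()` is often preferred.

```{r}
x <- c(3, 1, 9, 7, 3, 6)

print(min(x))        # the smallest element
print(which.min(x))  # the position of the smallest element

empty <- numeric(0)
print(1:length(empty))   # the two values 1 and 0, so a loop would run twice
print(seq_along(empty))  # an empty sequence, so a loop would not run at all
```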

While there are several ways to return a value from a function, the way that is most congruent with other languages is the use of the *return* statement.
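
One of the other ways is to rely on the fact that a function returns the value of the last expression it evaluates. The two hypothetical functions below, written only for this illustration, are equivalent.

```{r}
# explicit return statement
squareExplicit <- function(x) {
  return(x * x)
}

# implicit return: the value of the last evaluated expression is returned
squareImplicit <- function(x) {
  x * x
}

print(squareExplicit(4))
print(squareImplicit(4))
```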

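Note that the type of the return value and the types of the arguments are not declared. R is dynamically typed, so no type checking is performed until run-time. In addition, R evaluates arguments lazily: an argument is only evaluated when the function first uses it.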
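
Both behaviors can be observed in a small sketch. The functions *twice()* and *firstOnly()* are hypothetical and exist only for this example. A type problem shows up only when the offending call is executed, and an argument that is never used is never evaluated.

```{r}
# no types are declared, so twice() accepts anything that supports "*"
twice <- function(x) {
  x * 2
}

print(twice(21))     # works for a single number
print(twice(1:3))    # works for a vector as well

# the type error is only detected at run-time, when the call is executed
msg <- tryCatch(twice("a"), error = function(e) conditionMessage(e))
print(msg)

# lazy evaluation: b is never used, so stop() is never called
firstOnly <- function(a, b) {
  a
}

print(firstOnly(1, stop("this argument is never evaluated")))
```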

### Calling a Function

To call a function, you would invoke it with its name and its required arguments.

```{r callFunc}
x = c(3,1,9,7,3,6)

w = findSmallest(x)
print(w)
```

### Function Parameters

If a function takes several arguments you generally pass them in the order declared, which is the approach used by most other languages. However, in R you can pass the arguments in any order as long as you specify the name of each argument.

Argument matching is a bit different in R compared to many other languages. Firstly, R does all argument checking at run-time. Secondly, while arguments can be matched positionally like in other languages, arguments can also be matched by parameter name, a syntax that some languages such as Python also offer but that C and Java do not.

For example, the built-in function <code>seq</code> generates a sequence of numbers and returns those numbers in a vector. The main arguments of the function are as follows: <code>seq(from, to, by, length.out, along.with)</code>.

Here are examples of using it. Note that *by*, *length.out*, and *along.with* have default values and are therefore optional.

```{r seqParmPassing}
v = seq(1, 10, 2)    # integers from 1 to 10 in increments of 2
w = seq(1, 5)        # integers from 1 to 5 (by default in increments of 1)

# pass arguments in a different order but specify by name
w = seq(from = 5, by = -0.5, to = 1)
```
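
The same matching rules apply to functions you write yourself. The sketch below uses a hypothetical function *power()* to show positional matching, matching by name, and a mix of both, where the unnamed argument fills the remaining parameter.

```{r}
# power() is a hypothetical function used only for illustration
power <- function(base, exponent) {
  base ^ exponent
}

print(power(2, 3))                    # positional: base = 2, exponent = 3
print(power(exponent = 3, base = 2))  # by name, in a different order
print(power(exponent = 3, 2))         # mixed: 2 is matched to base

# named arguments also make it easy to skip optional arguments such as by
s <- seq(length.out = 5, to = 1, from = 0)
print(s)
```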

R also supports variable numbers of arguments but that is beyond the scope of this tutorial.

### Default Arguments

R functions can have default values for arguments, which are then optional when the function is called. When the argument is omitted from the call, the default value is used. In the example below, the *start* argument is the position at which the search for the smallest element will start.

```{r functionDefArg}
findSmallest <- function(v, start = 1)
{
  s = Inf
  for (i in start:length(v))
  {
    if (v[i] < s) {
      s = v[i]
    }
  }
  return (s)
}
```

```{r}
x = c(3,1,9,7,3,6)

w = findSmallest(x, 3)
print(w)

w = findSmallest(x)
print(w)
```
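
A default value does not have to be a constant; it can be an expression that refers to other arguments. The variant below is a sketch under that assumption: the function name *findSmallestInRange()* and its *end* parameter are made up for this example and are not part of the lesson's code.

```{r}
# hypothetical variant: end defaults to the length of v
findSmallestInRange <- function(v, start = 1, end = length(v))
{
  s <- Inf
  for (i in start:end)
  {
    if (v[i] < s) {
      s <- v[i]
    }
  }
  return (s)
}

x <- c(3,1,9,7,3,6)

print(findSmallestInRange(x))           # whole vector
print(findSmallestInRange(x, 3))        # positions 3 to 6
print(findSmallestInRange(x, 3, 4))     # positions 3 to 4
print(findSmallestInRange(x, end = 1))  # position 1 only
```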

### Local Variables

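As in most other programming languages, R functions can define local variables that are not known outside the function. Unlike in C, C++, or Java, the scope boundary in R is the function rather than a block enclosed in curly braces: a variable created inside an *if* or *for* block remains visible in the rest of the enclosing function.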

In the example below, *local.var* is local to the function and thus is not visible outside of the function. The code below produces the error: "Error in print(local.var) : object 'local.var' not found".

```{r locaVars, echo=TRUE, eval=FALSE}
findSmallest <- function(v, start = 1)
{
  local.var = Inf
  for (i in start:length(v))
  {
    if (v[i] < local.var) {
      local.var = v[i]
    }
  }
  return (local.var)
}

x = c(3,1,9,7,3,6)

w = findSmallest(x, 3)

# we cannot echo or access the local variable "local.var"
print(local.var)
```
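
The scoping rule can be checked without triggering an error. In the sketch below, the hypothetical function *scopeDemo()* creates a variable inside a braced block and uses it after the block has ended; *exists()* then confirms that the variable is not visible outside the function.

```{r}
# scopeDemo() and innerValue are hypothetical names used only for illustration
scopeDemo <- function() {
  if (TRUE) {
    innerValue <- 42   # created inside a braced block
  }
  innerValue           # still visible: the function, not the block, is the scope
}

print(scopeDemo())

# FALSE, assuming no variable named innerValue was defined elsewhere
print(exists("innerValue"))
```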

### Recursion

R functions can be called recursively. The example below calculates factorial using recursion rather than a loop.

```{r recursiveFuncs}
fac <- function(x)
{
  # base case: 0! and 1! are both 1; using <= also stops the recursion for x = 0
  if (x <= 1) 
    return (1)
  else 
    return (x * fac(x-1))
}

print(fac(8))
```
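
A recursive function needs a base case that every valid input eventually reaches. The variant below is only a sketch: the name *facSafe()* and the decision to return *NA* for invalid input are assumptions made for this example. The result is compared against the built-in <code>factorial()</code> function.

```{r}
# facSafe() is a hypothetical variant with input checks
facSafe <- function(x)
{
  # reject anything that is not a single non-negative whole number
  if (!is.numeric(x) || length(x) != 1 || is.na(x) || x < 0 || x != floor(x))
    return (NA)
  
  # base case: 0! and 1! are both 1
  if (x <= 1)
    return (1)
  
  # recursive case
  return (x * facSafe(x - 1))
}

print(facSafe(8))
print(facSafe(0))
print(facSafe(-3))

# compare with the built-in function
print(isTRUE(all.equal(facSafe(8), factorial(8))))
```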

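> If it hasn't been obvious yet, just like in other languages, the placement of curly braces is largely a matter of style. One exception: at the top level of a script, *else* must appear on the same line as the closing brace of the *if* block, otherwise R treats the *if* statement as complete. For single statement blocks, the curly braces can be omitted.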

> The parentheses around the value for *return* are required because *return()* is a function in R.

As an exercise, try writing the above function to calculate factorial using a loop.

## Conclusion

Functions are an important code structuring mechanism and any R program with more than a few lines of code can benefit from functions. Functions are first-class objects in R and can be passed to functions as parameters.
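
As a closing sketch of what "first-class" means, the example below passes functions as arguments. The names *applyTwice()* and *addTen()* are hypothetical, while <code>sapply()</code> is a base R function that applies a function to each element of a vector.

```{r}
# applyTwice() takes a function f and applies it to x two times
applyTwice <- function(f, x) {
  f(f(x))
}

addTen <- function(n) {
  n + 10
}

print(applyTwice(addTen, 1))             # a named function as argument
print(applyTwice(function(n) n * 3, 2))  # an anonymous function as argument
print(sapply(c(1, 2, 3), addTen))        # a function passed to a built-in function
```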

------------------------------------------------------------------------

## Files & Resources

```{r zipFiles, echo=FALSE}
zipName = sprintf("LessonFiles-%s-%s.zip", 
                 params$category,
                 params$number)

textALink = paste0("All Files for Lesson ", 
               params$category,".",params$number)

# downloadFilesLink() is included from _insert2DB.R
knitr::raw_html(downloadFilesLink(".", zipName, textALink))
```

------------------------------------------------------------------------

## References

No references.

## Errata

None collected yet. Let us know.

```{=html}
<script src="https://form.jotform.com/static/feedback2.js" type="text/javascript">
  new JotformFeedback({
    formId: "212187072784157",
    buttonText: "Feedback",
    base: "https://form.jotform.com/",
    background: "#F59202",
    fontColor: "#FFFFFF",
    buttonSide: "left",
    buttonAlign: "center",
    type: false,
    width: 700,
    height: 500,
    isCardForm: false
  });
</script>
```
```{r code=xfun::read_utf8(paste0(here::here(),'/R/_deployKnit.R')), include = FALSE}
```
