- Description: The course is designed primarily for two audiences: those who are already familiar with programming in another language (e.g. Python, C, Java) and want to understand how R works, and those who already know the basics of R programming and want a more in-depth understanding of the language in order to improve their code. The focus is on the underlying paradigms in R, such as atomic and non-atomic vectors, functional programming, environments, and object systems (time permitting). The goal of this course is to better understand programming principles in R and to write better R code that capitalizes on the language's design. Topics include (not necessarily covered in this order):
- Data types and data structures in R (e.g. vectors, arrays, lists, data frames)
- 🔨 Tools for data manipulation
- 📊 Tools for data visualization
- 📥 Data input/output
- 🔀 Control flow structures (e.g. conditionals, iterations)
- 📝 Writing simple functions
- 📢 Function calls
- 📑 Argument matching
- ➕ The formula language of R (time permitting)
- 🎁 Building simple R packages
- Instructor: Gaston Sanchez
- Lecture: 1 hour of lecture per week
- Lab: 1 hour of laboratory per week
- Assignments: biweekly HW assignments
- Exams: one midterm exam and one final test
- Texts and Notes:
- Breaking the Ice with R
- R Coding Basics
- Rolling Dice
- Tidy Hurricanes
- Handling Strings with R
- Pack YouR Code
- Prof. Sanchez's slides
- LMS: the specific learning resources of a given semester are shared in the Learning Management System (LMS) approved by campus authorities (e.g. bCourses, Canvas).
- Policies:
📇 ABOUT:
We begin with the usual review of the course policies, logistics, overall expectations, topics in a nutshell, etc. At the computational level, you will be introduced to RStudio and R, as well as Markdown and its use in dynamic computational documents (e.g. Rmd and qmd files).
📖 READING:
- Slides
- Breaking the Ice with R
✏️ TOPICS:
- Introduction
- About the course
- First contact with R and RStudio
- Markdown syntax
📇 ABOUT:
This week we describe data types and how they are implemented in R vectors, the most fundamental data object in R.
📖 READING:
✏️ TOPICS:
- Data Types
- atomic types (e.g. logical, integer, double, character)
- coercion
- vectorization
- recycling
- subsetting
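The topics above can be illustrated with a minimal sketch in base R. The variable names are made up for illustration and do not come from the course materials.

```r
# Atomic types: every element of a vector shares a single type
lgl <- c(TRUE, FALSE, TRUE)
int <- c(1L, 2L, 3L)
dbl <- c(1.5, 2.5, 3.5)
chr <- c("a", "b", "c")
types <- c(typeof(lgl), typeof(int), typeof(dbl), typeof(chr))

# Coercion: mixing types converts all elements to the most flexible type
mixed <- c(TRUE, 2L, 3.5)   # logical and integer are coerced to double
mixed_chr <- c(1, "two")    # the number is coerced to character

# Vectorization: arithmetic is applied element by element
doubled <- int * 2

# Recycling: the shorter vector is repeated to match the longer one
recycled <- c(1, 2, 3, 4) + c(10, 20)

# Subsetting: by position, by exclusion, and by logical condition
first_two <- dbl[1:2]
without_first <- dbl[-1]
above_two <- dbl[dbl > 2]
```

Inspecting `types`, `mixed`, and `recycled` in the console is a quick way to see the coercion and recycling rules in action.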
📇 ABOUT:
We continue describing more atomic objects such as arrays (N-dimensional objects) and matrices (2-dimensional arrays).
📖 READING:
✏️ TOPICS:
- More atomic objects
- Creation of simple matrices with matrix()
- How R internally stores matrices
- Why matrices are atomic objects
- In what sense a matrix is a 2-dimensional object
- Matrix subsetting (subscripting, indexing)
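As a rough sketch of these ideas, the example below builds a small matrix and shows that it is an atomic vector with a dimension attribute, stored column by column. The object names are illustrative only.

```r
# matrix() fills the values column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)

# A matrix is an atomic vector plus a "dim" attribute
dims <- dim(m)
flat <- as.vector(m)     # the underlying vector, in column-major order

# The same object can be built by attaching dimensions to a vector
v <- 1:6
dim(v) <- c(2, 3)

# Subsetting with [row, column] and with a single vector position
elem_rc <- m[2, 3]       # row 2, column 3
elem_pos <- m[6]         # 6th element of the underlying vector
col2 <- m[, 2]           # whole second column (returned as a vector)
row1 <- m[1, , drop = FALSE]  # first row, kept as a 1 x 3 matrix
```

Because storage is column-major, `m[2, 3]` and `m[6]` refer to the same element.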
📇 ABOUT:
This week we describe non-atomic objects such as lists and data frames, and we also review how to manipulate (subset) these kinds of objects.
📖 READING:
✏️ TOPICS:
- Lists
- Manipulation of lists and data frames
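A short sketch of list and data frame subsetting in base R follows. The data are invented for illustration.

```r
# A list can hold elements of different types and lengths
student <- list(name = "Ana", scores = c(90, 85), enrolled = TRUE)

sub_list <- student["scores"]     # single brackets return a list
scores <- student[["scores"]]     # double brackets return the element itself
same_scores <- student$scores     # $ is shorthand for [[ ]] with a name

# A data frame is a list of equal-length columns
df <- data.frame(
  name = c("Ana", "Ben", "Cy"),
  score = c(90, 72, 85)
)

score_col <- df$score             # extract one column as a vector
high <- df[df$score > 80, ]       # rows that satisfy a condition
first_name <- df[1, "name"]       # a single cell by row and column name
df_is_list <- is.list(df)
```

The difference between `[` and `[[` is the key point: the first keeps the container, the second extracts its content.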
📇 ABOUT:
This week we talk about concepts and functions related to input/output (I/O), that is, importing and exporting operations. For example:
- how to import a data table
- how to export a data table
- how to export a graphic to an image file
- how to export function outputs to external files
📖 READING:
✏️ TOPICS:
- Imports/Exports
- read.table() and derived functions, e.g. read.csv(), read.delim()
- Importing text with readLines()
- Importing code with source()
- Exporting output to external files with sink()
- Exporting images with png(), jpeg(), pdf(), etc.
- Mechanism used by R for reading-in data
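The functions listed above can be tried out with temporary files, as in the following sketch. It uses the built-in mtcars data set and the pdf() device (chosen here because it is available in every R installation); the file names are generated by tempfile() and are not part of the course materials.

```r
# Export a data table, then import it again
csv_file <- tempfile(fileext = ".csv")
write.csv(head(mtcars), csv_file, row.names = FALSE)
cars <- read.csv(csv_file)

# Import the same file as raw lines of text
raw_lines <- readLines(csv_file)

# Import (run) code stored in an external script
script_file <- tempfile(fileext = ".R")
writeLines("sourced_value <- 1 + 1", script_file)
source(script_file, local = TRUE)

# Redirect printed output to an external file
log_file <- tempfile(fileext = ".txt")
sink(log_file)
print(summary(cars$mpg))
sink()                      # stop redirecting output

# Export a graphic to a file by opening and closing a graphics device
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(cars$wt, cars$mpg, xlab = "weight", ylab = "mpg")
invisible(dev.off())        # closing the device writes the file
```

The same open-plot-close pattern applies to png() and jpeg(), provided the R installation supports those devices.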
📇 ABOUT:
In addition to learning about the "classic" way to work with data frames, we will briefly touch on an "alternative" approach for working with tables provided by the tidy data framework and the ecosystem of packages known as the "tidyverse": https://www.tidyverse.org
This week we start with the tidyverse package "dplyr". Simply put, "dplyr" comes with functions to manipulate data tables (e.g. data frames and other 2-dimensional objects) using a modern and consistent syntax.
📖 READING:
✏️ TOPICS:
- dplyr verbs
- slice()
- filter()
- select()
- arrange()
- group_by()
- summarise()
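A minimal sketch of the verbs listed above, applied to the built-in mtcars data set. It assumes the "dplyr" package is installed, which is why the code is wrapped in a requireNamespace() check; the object names are illustrative.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))

  first_rows <- slice(mtcars, 1:3)            # pick rows by position
  efficient <- filter(mtcars, mpg > 25)       # pick rows by condition
  two_cols <- select(mtcars, mpg, cyl)        # pick columns by name
  by_mpg <- arrange(mtcars, desc(mpg))        # sort rows, highest mpg first

  # Group rows by number of cylinders, then summarise each group
  grouped <- group_by(mtcars, cyl)
  mpg_by_cyl <- summarise(grouped, avg_mpg = mean(mpg))
}
```

Each verb takes a data frame as its first argument and returns a new data frame, which is what makes them easy to chain.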
📇 ABOUT:
Last week we discussed the basics of "dplyr". This week we move on to "ggplot2", another tidyverse package that lets you create graphics, also following the tidy data framework.
📖 READING:
✏️ TOPICS:
- ggplot verbs
- the grammar of graphics
- geometric objects and visual attributes
- building a graphic with layers
- supporting graphical elements
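The idea of building a graphic with layers can be sketched as follows. The example assumes the "ggplot2" package is installed and uses the built-in mtcars data set; the plot itself is only an illustration.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Data and aesthetic mappings come first; layers are added with +
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point(aes(color = factor(cyl))) +        # geometric object: points
    geom_smooth(method = "lm", se = FALSE) +      # second layer: fitted line
    labs(                                         # supporting elements
      title = "Fuel efficiency versus weight",
      x = "Weight (1000 lbs)",
      y = "Miles per gallon",
      color = "Cylinders"
    )
}
```

Printing `p` in an interactive session renders the graphic; until then it is just an object describing the plot.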
📇 ABOUT:
This week we introduce the notion of R expressions, and we cover the syntax used by R to handle if-else statements and related conditional constructs.
📖 READING:
✏️ TOPICS:
- If-else statements
- R compound expressions and the use of curly braces { ... }
- Anatomy of an if-else statement in R
- Vectorized if-else function ifelse()
- switch() construct
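The constructs above can be sketched in a few lines of base R. The function describe_op() is a hypothetical helper written only for this illustration.

```r
# Compound expression: the value of { ... } is its last expression
total <- {
  a <- 2
  b <- 3
  a + b
}

# if-else works on a single logical condition
x <- 7
if (x %% 2 == 0) {
  parity <- "even"
} else {
  parity <- "odd"
}

# ifelse() is the vectorized version: one test per element
sizes <- ifelse(c(1, 5, 10) > 4, "big", "small")

# switch() selects one branch by name; the unnamed value is the default
describe_op <- function(op) {
  switch(op,
    add = "addition",
    sub = "subtraction",
    "unknown"
  )
}
known <- describe_op("add")
fallback <- describe_op("xyz")
```

Note that if() expects a condition of length one, whereas ifelse() returns a vector as long as its test.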
📇 ABOUT:
This week we introduce the syntax used by R to handle iteration constructs such as for() loops, while() loops, repeat loops, and the apply family of functions.
📖 READING:
✏️ TOPICS:
- Loops
- Anatomy of a for() loop in R
- Anatomy of a while() loop in R
- Anatomy of a repeat loop in R
- break statement to stop a loop
- next statement to skip an iteration
- apply() family functions
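A compact sketch of each iteration construct in base R follows; the variable names are illustrative.

```r
# for loop: iterate over a known set of values
squares <- numeric(5)
for (i in seq_len(5)) {
  squares[i] <- i^2
}

# while loop: iterate as long as a condition holds
n <- 1
while (n < 100) {
  n <- n * 2
}

# repeat loop: iterate until break is reached explicitly
count <- 0
repeat {
  count <- count + 1
  if (count >= 3) break
}

# next: skip the rest of the current iteration
odds <- c()
for (i in seq_len(6)) {
  if (i %% 2 == 0) next
  odds <- c(odds, i)
}

# apply family: iteration without writing the loop by hand
squares_sapply <- sapply(seq_len(5), function(i) i^2)
col_means <- apply(matrix(1:6, nrow = 2), 2, mean)  # 2 means "by column"
```

The sapply() call produces the same values as the first for loop, which is a common way to compare both styles.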
📇 ABOUT:
This week we review the syntax used by R for writing functions. We also take a look at auxiliary functions such as return(), stop(), and warning().
📖 READING:
✏️ TOPICS:
- Functions
- Main parts of a function (i.e. anatomy of a function)
- Examples for creating a function
- Difference between positional arguments, and named arguments
- Binary operator functions
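As a sketch of these topics, the hypothetical function rescale() below shows the parts of a function (arguments with defaults, body, return value), the use of stop() and warning(), and calls with positional and named arguments. The operator %+% is likewise invented for illustration.

```r
# Arguments (with defaults), body, and an explicit return value
rescale <- function(x, from = 0, to = 1) {
  if (!is.numeric(x)) {
    stop("x must be numeric")              # signal an error and abort
  }
  rng <- range(x)
  if (rng[1] == rng[2]) {
    warning("all values of x are equal")   # signal a problem but continue
  }
  scaled <- (x - rng[1]) / (rng[2] - rng[1])
  return(from + scaled * (to - from))
}

by_default <- rescale(c(1, 2, 3))               # defaults for from and to
by_position <- rescale(c(1, 2, 3), 0, 10)       # positional matching
by_name <- rescale(to = 10, x = c(1, 2, 3))     # named matching, any order

# A user-defined binary operator is a function whose name is %something%
`%+%` <- function(a, b) paste(a, b)
greeting <- "hello" %+% "world"
```

Named arguments can be supplied in any order, while positional arguments are matched in the order the function declares them.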
📇 ABOUT:
This week we review more technical aspects of functions in R. Specifically, we will focus on the scoping mechanisms used by R to find the value of a variable.
📖 READING:
- Slides
✏️ TOPICS:
- Environments
- What is an environment?
- Creating environments
- Types of environments
- The search list
- Scoping principles
- Name masking
- Functions vs Variables
- Fresh start
- Dynamic lookup
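The scoping principles listed above can be sketched with a few small functions. All names here are hypothetical, and the labels in the comments follow the topic list.

```r
# Creating an environment and storing a value in it
e <- new.env()
assign("x", 10, envir = e)
x_in_e <- get("x", envir = e)

# The search list: where R looks for names attached to the session
search_path <- search()

# Name masking: a name defined inside a function hides the outer one
x <- 1
f <- function() {
  x <- 2
  x
}
inner_x <- f()      # the outer x is left untouched

# Functions vs variables: in a call, R looks for a function with that name
mean <- 5
still_works <- mean(c(1, 2, 3))
rm(mean)

# Fresh start: each call gets a new environment, so n never persists
counter <- function() {
  if (!exists("n", inherits = FALSE)) n <- 0
  n + 1
}
first_call <- counter()
second_call <- counter()

# Dynamic lookup: values are looked up when the function runs
g <- function() y_outer
y_outer <- 5
before <- g()
y_outer <- 6
after <- g()
```

Comparing `before` and `after` shows that g() sees whatever value y_outer has at call time, not at definition time.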
📇 ABOUT:
This week we review more of R's underlying principles, focusing on those that affect performance.
📖 READING:
- Slides
✏️ TOPICS:
- R's behavior
- R's motto
- Copy-on-modify policy
- What things make R slow
- Performance
- Measuring performance in a "quick-and-dirty" way with system.time()
- Profiling code with Rprof() and "profvis"
- Alternative way to measure performance with "microbenchmark"
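A rough sketch of copy-on-modify and of quick timing with system.time() follows. The three functions are hypothetical examples written for this comparison, and actual timings depend on the machine, so none are shown.

```r
# Copy-on-modify: modifying y does not change x
x <- c(1, 2, 3)
y <- x
y[1] <- 100

# Three ways to compute the first n squares
grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, i^2)   # grows the vector at each step
  out
}
prealloc <- function(n) {
  out <- numeric(n)                          # allocates the vector once
  for (i in seq_len(n)) out[i] <- i^2
  out
}
vectorized <- function(n) seq_len(n)^2       # no explicit loop

# Quick-and-dirty timing; compare the "elapsed" entries
n <- 10000
t_grow <- system.time(grow(n))
t_prealloc <- system.time(prealloc(n))
t_vectorized <- system.time(vectorized(n))
```

Growing a vector inside a loop is typically the slowest option because each c() call creates a new, larger copy.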
📇 ABOUT:
This week we review the anatomy of an R package.
📖 READING:
- Slides
✏️ TOPICS:
- Anatomy of an R package
- DESCRIPTION file
- NAMESPACE file
- R/ folder
- man/ folder
- Roxygen comments and Rd files
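To make the last item concrete, here is a sketch of what a source file in the R/ folder might contain. The function fahrenheit_to_celsius() is a made-up example; the lines starting with #' are Roxygen comments, from which the corresponding Rd file in man/ would be generated.

```r
#' Convert Fahrenheit to Celsius
#'
#' @param temp_f Numeric vector of temperatures in degrees Fahrenheit.
#' @return Numeric vector of temperatures in degrees Celsius.
#' @examples
#' fahrenheit_to_celsius(c(32, 212))
#' @export
fahrenheit_to_celsius <- function(temp_f) {
  (temp_f - 32) * 5 / 9
}
```

The @export tag marks the function to be listed in the NAMESPACE file so that users of the package can call it.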
📇 ABOUT:
This week we go through the first steps for creating a simple R package.
📖 READING:
- Slides
✏️ TOPICS:
- Building an R package
- devtools functions
- Create Rd files with document()
- Check content of Rd files with check_man()
- Build a bundle with build()
- Install a package locally with install()
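The steps above could be wrapped in a small helper, sketched below. The function run_package_workflow() and its pkg_path argument are hypothetical; the sketch assumes that "devtools" is installed and that pkg_path points to the root folder of a package under development.

```r
# Hypothetical helper that runs the four devtools steps in order
run_package_workflow <- function(pkg_path = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("the 'devtools' package is required")
  }
  devtools::document(pkg_path)    # create Rd files from Roxygen comments
  devtools::check_man(pkg_path)   # check the content of the Rd files
  devtools::build(pkg_path)       # build a bundle (a .tar.gz file)
  devtools::install(pkg_path)     # install the package locally
  invisible(pkg_path)
}
```

A call such as run_package_workflow("path/to/mypackage") would run all four steps, where the path is a placeholder for a real package folder.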