Introduction to R & RStudio
Setting Up Your Environment
Installing R and RStudio
R is the language; RStudio is the IDE.
- Download R from cran.r-project.org — choose your OS.
- Download RStudio Desktop (free) from posit.co/downloads.
- Open RStudio — four panes: Console, Source, Environment, Files/Plots.
- Type version in the Console and press Enter to confirm R is working.
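RStudio keyboard shortcuts:
- Ctrl+Enter: run the current line or selection
- Ctrl+Shift+Enter: run the entire script
- Alt+-: insert the <- assignment arrow
- Ctrl+Shift+M: insert the pipe (%>% by default; |> if the native pipe option is enabled under Tools > Global Options > Code)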
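To make the last two shortcuts concrete, here is a small sketch of what the inserted operators do. It assumes R 4.1 or later, where the native pipe |> is available; the variable names are illustrative only.

```r
# Alt+- inserts the assignment arrow
values <- c(1, 4, 9)

# The native pipe passes the left-hand side as the first argument
# of the call on the right (needs R >= 4.1)
total_piped <- values |> sqrt() |> sum()

# Equivalent nested call without the pipe
total_nested <- sum(sqrt(values))

print(total_piped)                     # 6
identical(total_piped, total_nested)   # TRUE
```

The piped form reads left to right in the order the steps happen, which is the main reason to prefer it for longer chains.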
Type R.version.string in the Console. What does it return?
It returns a string like 'R version 4.3.1 (2023-06-16)'. This confirms R is installed and shows the version number.
Which keyboard shortcut inserts the <- assignment operator in RStudio?
R Basics: Arithmetic and Assignment
R can be used as a calculator. The assignment operator <- (or =) stores values in named variables.
# Arithmetic
2 + 3 * 4 # 14 (operator precedence applies)
sqrt(144) # 12
log(exp(1)) # 1
# Assignment
x <- 42
name <- "Vidaara"
is_ok <- TRUE
# Check type
class(x) # "numeric"
class(name) # "character"
class(is_ok) # "logical"
# Multiple assignment
a <- b <- c <- 0 # all three equal 0
R is case-sensitive: Name and name are different variables.
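A short sketch of two points from this section: at the top level <- and = both assign, while inside a function call = names an argument; and names that differ only in case are separate variables. All variable names below are made up for illustration.

```r
# At the top level, <- and = both assign
count <- 10
total = 10
identical(count, total)    # TRUE

# Inside a function call, = names an argument instead of assigning
rounded <- round(x = 3.14159, digits = 2)
rounded                    # 3.14

# Names are case-sensitive: these are two separate variables
Name <- "Vidaara"
name <- "vidaara"
identical(Name, name)      # FALSE
```

Using <- for assignment and keeping = for function arguments is a common convention because it keeps the two roles visually distinct.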
x_int <- 42L
x_dbl <- 3.14
x_chr <- 'hello'
x_lgl <- TRUE
cat(class(x_int), class(x_dbl), class(x_chr), class(x_lgl))
# integer numeric character logical
The L suffix forces integer type instead of numeric (double).
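The difference between integer and double can be inspected with typeof(). The sketch below uses only base R; the variable names are illustrative.

```r
t_int  <- typeof(42L)        # "integer"
t_dbl  <- typeof(42)         # "double" (class() reports "numeric")

# Integer arithmetic stays integer only while every operand is an integer
t_sum  <- typeof(42L + 1L)   # "integer"
t_mix  <- typeof(42L + 1)    # "double"

# The / operator always returns a double, even for two integers
t_div  <- typeof(42L / 2L)   # "double"

print(c(t_int, t_dbl, t_sum, t_mix, t_div))
```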
17 %% 5 # 2 (remainder)
17 %/% 5 # 3 (integer quotient)
R uses %% for modulo and %/% for integer division, unlike Python's % and //.
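With negative numbers the two operators are worth checking by hand. A minimal sketch, using hypothetical names n and d: %/% rounds the quotient down (toward negative infinity), and %% takes the sign of the divisor, so the two always recombine into the original number.

```r
n <- -17
d <- 5

remainder <- n %% d     # 3  (not -2)
quotient  <- n %/% d    # -4 (rounded down, not toward zero)

# The pair always satisfies: quotient * d + remainder == n
recombined <- quotient * d + remainder
print(c(remainder, quotient, recombined))   # 3 -4 -17
```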
What does class(42L) return in R?
What does 5^3 compute in R?
Vectors and Functions
Vectors — R's Core Data Structure
In R, the basic data values are vectors. A single value like 42 is a length-1 vector. Atomic vectors are homogeneous: all elements must be the same type, and c() silently converts mixed inputs to a common type.
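A minimal sketch of what "length-1 vector" and "same type" mean in practice. It uses only base R; the name mixed is illustrative.

```r
length(42)        # 1
is.vector(42)     # TRUE

# Mixing types in c() coerces everything to the most general type
mixed <- c(1, "a", TRUE)
mixed             # "1" "a" "TRUE"
mixed_class <- class(mixed)           # "character"

# Logical values become numbers when combined with numbers
num_lgl_class <- class(c(1, TRUE))    # "numeric"
num_lgl_sum   <- sum(c(1, TRUE))      # 2
```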
# Create with c() — combine
scores <- c(85, 92, 78, 95, 88)
names_v <- c("Aarav", "Kavya", "Rohan")
# Vectorised operations — no loops needed!
scores + 5 # adds 5 to each
scores * 1.1 # 10% bonus to each
scores > 88 # logical vector: FALSE TRUE FALSE TRUE FALSE
# Subsetting (1-indexed!)
scores[1] # 85
scores[c(1,3)] # 85 78
scores[scores > 88] # 92 95 — logical indexing
scores[-1] # drop first element: 92 78 95 88
# Named vectors
setNames(scores, c("Math","English","Science","R","Stats"))
# Useful functions
length(scores) # 5
sum(scores) # 438
mean(scores) # 87.6
range(scores) # 78 95
which.max(scores) # 4 (index of max)
Recycling rule: when operating on vectors of different lengths, R recycles the shorter one: c(1,2,3,4) + c(10,20) gives c(11,22,13,24).
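Recycling is silent when the longer length is a multiple of the shorter one, and raises a warning otherwise. The sketch below shows both cases in base R; the variable names are illustrative.

```r
# Lengths 4 and 2: the shorter vector is reused exactly twice, no warning
even_fit <- c(1, 2, 3, 4) + c(10, 20)
even_fit                                   # 11 22 13 24

# Lengths 3 and 2: R still computes a result but warns about the mismatch
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))
uneven                                     # 11 22 13

# Capture the warning text instead of printing it
msg <- tryCatch(
  c(1, 2, 3) + c(10, 20),
  warning = function(w) conditionMessage(w)
)
msg
```

Treating that warning as a bug is usually the safer habit, since an unintended length mismatch often means the wrong vectors are being combined.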
# Try it yourself — run in RStudio console
scores <- c(85, 92, 78, 95, 88)
cat("Sum:", sum(scores), "\n")
cat("Mean:", mean(scores), "\n")
cat("Max:", max(scores), "\n")
scores > 88
Sum: 438
Mean: 87.6
Max: 95
[1] FALSE TRUE FALSE TRUE FALSE
scores <- c(72,85,90,65,95,88,76,91)
high <- scores[scores > 80]
mean(high) # mean(c(85, 90, 95, 88, 91)) = 89.8
evens <- seq(2, 20, by = 2)
# or
evens <- seq(from=2, to=20, length.out=10)
# 2 4 6 8 10 12 14 16 18 20
What does c(1,2,3)[c(TRUE,FALSE,TRUE)] return?
What is length(1:10)?
Packages and the Tidyverse
R has 20,000+ packages on CRAN. The tidyverse is a collection of packages sharing a consistent design philosophy.
# Install (once)
install.packages("tidyverse")
# Load (every session)
library(tidyverse)
# Check what's loaded
search()
# Key tidyverse packages:
# dplyr — data manipulation
# ggplot2 — visualisation
# tidyr — reshaping
# readr — fast CSV reading
# purrr — functional programming
# stringr — string manipulation
# forcats — factor handling
# lubridate— dates
Install only once per machine; library() loads the package into the current session.
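One common way to keep the install step out of everyday scripts is to install only when the package is missing. The helper below is a hypothetical sketch, not part of any package: ensure_installed is a made-up name, and it relies on the base function requireNamespace() to check for the package without attaching it.

```r
# Hypothetical helper: install a package only if it is not already available
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can now be loaded
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call never triggers an install
ok <- ensure_installed("stats")
print(ok)   # TRUE
```

In a real script the call would be ensure_installed("tidyverse") followed by library(tidyverse), so the download happens only on a machine that lacks the package.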
packageVersion("dplyr") # e.g. '1.1.3'
# or
installed.packages()["dplyr", "Version"]