Data Import & Export
Reading Data
Reading CSV with readr
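readr's read_csv() is usually faster than base R's read.csv() on large files. It returns a tibble, guesses each column's type from the first rows of the file (up to 1,000 by default, controlled by guess_max), and never converts strings to factors.
Key arguments: col_types (declare column types instead of guessing), skip (lines to skip before reading), na (strings treated as missing), locale (decimal mark, encoding, date formats), comment (prefix marking text to ignore).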
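library(readr)
# Let readr guess the column types
df <- read_csv('students.csv')
# Declare column types and the strings that mean "missing"
df <- read_csv('data.csv', col_types = cols(score = col_double(), name = col_character()), na = c('', 'NA', 'N/A'))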
# See the code example above and adapt it to your data.
# Always check your output with str() and head().
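The sketch below shows how skip, comment, na, and col_types can work together on a messy file. The file contents, the preamble line, and the name students are made up for illustration; the CSV is written to a temporary file so the example runs without any data of your own.

```r
library(readr)

# Build a small throwaway CSV (hypothetical contents) so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Exported by a hypothetical gradebook tool",
  "# scores below are provisional",
  "name,score",
  "Ada,91.5",
  "Lin,N/A"
), path)

students <- read_csv(
  path,
  skip = 1,                      # drop the one-line preamble
  comment = "#",                 # ignore lines that start with '#'
  na = c("", "NA", "N/A"),       # treat these strings as missing
  col_types = cols(
    name = col_character(),
    score = col_double()
  )
)

str(students)        # confirm the column types
problems(students)   # lists any values that failed to parse
```

Declaring col_types up front means a value that does not fit the declared type is reported by problems() instead of silently changing the guessed type of the whole column.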
Importing Excel with readxl
readxl imports .xlsx and .xls files without needing Excel installed.
Use sheet= to pick a sheet and range= to read a specific cell range.
library(readxl)
df <- read_excel('report.xlsx', sheet='Q3', range='B2:F50')
excel_sheets('report.xlsx') # list all sheet names
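A runnable sketch of the same calls, using the datasets.xlsx workbook that ships with readxl (located with readxl_example()) in place of report.xlsx. The variable names are illustrative only.

```r
library(readxl)

# Example workbook bundled with readxl, so no file of your own is needed.
path <- readxl_example("datasets.xlsx")

# List the sheets first, then read each one into a named list of tibbles.
sheet_names <- excel_sheets(path)
all_sheets <- lapply(
  setNames(sheet_names, sheet_names),
  function(s) read_excel(path, sheet = s)
)

# Read only a block of cells: the header row plus five data rows, three columns.
mtcars_part <- read_excel(path, sheet = "mtcars", range = "A1:C6")

str(mtcars_part)
head(all_sheets[["iris"]])
```

Because the first row of the range is treated as the header, range = "A1:C6" yields five data rows.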
APIs and Databases
JSON and REST APIs
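jsonlite parses JSON and httr makes HTTP requests. Together they handle most REST APIs.
Always confirm the request succeeded before parsing. status_code(resp) == 200 covers the common case; http_error(resp) returns TRUE for any 4xx or 5xx response.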
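library(jsonlite); library(httr)
# Replace TOKEN with your own API token
resp <- GET('https://api.example.com/data', add_headers(Authorization = 'Bearer TOKEN'))
if (status_code(resp) == 200) data <- fromJSON(content(resp, as = 'text', encoding = 'UTF-8')) else stop('Request failed: HTTP ', status_code(resp))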
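To see what parsing produces without calling a live API, the sketch below feeds fromJSON() a literal JSON string standing in for a response body. It also wraps the request pattern in a helper. The helper name fetch_json, the API_TOKEN environment variable, and the sample records are assumptions made for illustration; the helper is defined but not called here.

```r
library(jsonlite)
library(httr)

# Stand-in for a response body: a JSON array of objects (made-up records).
body <- '[{"id": 1, "name": "Ada", "score": 91.5},
          {"id": 2, "name": "Lin", "score": null}]'

# An array of objects is simplified to a data frame; JSON null becomes NA.
records <- fromJSON(body)
str(records)

# Hypothetical helper: request, fail loudly on HTTP errors, then parse.
# The token is read from an environment variable rather than hard-coded.
fetch_json <- function(url, token = Sys.getenv("API_TOKEN")) {
  resp <- GET(
    url,
    add_headers(Authorization = paste("Bearer", token)),
    timeout(10)                       # give up after 10 seconds
  )
  stop_for_status(resp)               # raises an R error on 4xx/5xx
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}
```

Reading the body as text and passing it to fromJSON() keeps the parsing step explicit, so options such as flatten = TRUE can be added when the API returns nested objects.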
Writing Data
write_csv() (readr), write_json() (jsonlite), and write.xlsx() (openxlsx) export data frames. Prefer write_csv() over base R's write.csv(): it is faster, always writes UTF-8, and never adds a row-names column.
library(readr); library(jsonlite)
# CSV
write_csv(df, 'output.csv')
# JSON
write_json(df, 'output.json', pretty = TRUE)
# Excel
library(openxlsx)
write.xlsx(df, 'output.xlsx', sheetName = 'Data')
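One way to check an export is to read the file straight back and compare it with the original. The sketch below does this for CSV and JSON using temporary files; the data frame contents and variable names are made up for illustration.

```r
library(readr)
library(jsonlite)

# Small made-up data frame with a non-ASCII name and a missing value.
df <- data.frame(
  name = c("Zo\u00eb", "Lin"),
  score = c(91.5, NA),
  stringsAsFactors = FALSE
)

# CSV round trip: write missing values as empty cells, then read them back as NA.
csv_path <- tempfile(fileext = ".csv")
write_csv(df, csv_path, na = "")
back_csv <- read_csv(
  csv_path,
  col_types = cols(name = col_character(), score = col_double())
)

# JSON round trip: fromJSON() accepts a file path as well as a string.
json_path <- tempfile(fileext = ".json")
write_json(df, json_path, pretty = TRUE)
back_json <- fromJSON(json_path)

# Inspect what came back.
str(back_csv)
str(back_json)
```

Comparing the re-imported data with the original catches encoding problems and lost missing values before the file is shared.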