Whether we like it or not Excel is a common tool for working with data, so we often find ourselves wanting to load Excel files into R. In this post I show you a simple method for how to import data from different sheets of an Excel workbook into R.
The way I like to load Excel files into R is with the gdata library. This library has a number of functions for manipulating data, including Excel files.
Simple Import
In its simplest form, you can import data with the “read.xls” function:
library(gdata) my_data <- read.xls("MyExcelFile.xlsx")
This will load in data from the first sheet in “MyExcelFile.xlsx”. Behind the scenes gdata first extracts the sheet as a csv file. This means the data you get in R should be the same what you would get from a “save as csv” operation in Excel
Also note: read.xls requires that perl is installed and set up correctly. If this isn’t the case on your system you will probably get this error: Error in file.exists(tfn) : invalid ‘file’ argument.
Specifying Which Sheet to Import
You can also specify which sheet you want to import:
library(gdata) my_data <- read.xls("MyExcelFile.xlsx", sheet="My First Sheet")
Be careful though, as the sheet name you specify is case-sensitive.
Skipping Rows
Sometimes your excel file will have header, or labelling lines that you are not interested in – this can be a bit of a problem for the read.xls function. Instead we can use the gdata xls2csv function to load the sheet into R as a csv file object.
my_data_to_csv <- xls2csv("MyExcelFile.xlsx", sheet="My Sheet")
The read.csv function then has more flexibility in allowing us to ignore lines when we import:
my_data_from_csv <- read.csv(my_data_to_csv, skip=1)