Load Excel Files Into R

By | October 14, 2016

Whether we like it or not Excel is a common tool for working with data, so we often find ourselves wanting to load Excel files into R. In this post I show you a simple method for how to import data from different sheets of an Excel workbook into R.

The way I like to load Excel files into R is with the gdata library. This library has a number of functions for manipulating data, including Excel files.

Simple Import

In its simplest form, you can import data with the “read.xls” function:

my_data <- read.xls("MyExcelFile.xlsx")

This will load in data from the first sheet in “MyExcelFile.xlsx”. Behind the scenes gdata first extracts the sheet as a csv file. This means the data you get in R should be the same what you would get from a “save as csv” operation in Excel

Also note: read.xls requires that perl is installed and set up correctly. If this isn’t the case on your system you will probably get this error: Error in file.exists(tfn) : invalid ‘file’ argument.

Specifying Which Sheet to Import

You can also specify which sheet you want to import:

my_data <- read.xls("MyExcelFile.xlsx", sheet="My First Sheet")

Be careful though, as the sheet name you specify is case-sensitive.

Skipping Rows

Sometimes your excel file will have header, or labelling lines that you are not interested in – this can be a bit of a problem for the read.xls function. Instead we can use the gdata xls2csv function to load the sheet into R as a csv file object.

my_data_to_csv <- xls2csv("MyExcelFile.xlsx", sheet="My Sheet")

The read.csv function then has more flexibility in allowing us to ignore lines when we import:

my_data_from_csv <- read.csv(my_data_to_csv, skip=1)