
How to Create a DataFrame in R?

The R programming language comes with a number of useful functions and data structures to work with data. A dataframe in R is used to store two-dimensional data (in the form of rows and columns). In this tutorial, we will look at how to create a dataframe in R with the help of some examples.

How do I create a dataframe in R?

You can use the data.frame() function to create a dataframe in R. Pass the columns that you want to have in the dataframe as vectors to the data.frame() function.

The following is the syntax –

# create a dataframe
df <- data.frame(col_name = col_data)

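You specify the name of each column and pass the corresponding column values as a vector. To create more than one column, pass additional name and vector pairs separated by commas. Every column must have the same number of rows, so the vectors are normally of equal length.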
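As a minimal sketch of this syntax with more than one column, the snippet below builds a small dataframe and then shows what happens when the vectors differ in length. The names scores_df, id, score and result are made up for illustration and are not part of the tutorial's dataset.

```r
# hypothetical columns, used only to illustrate the syntax
scores_df <- data.frame(
  id = c(1, 2, 3),
  score = c(90.5, 82.0, 77.5)
)

# number of rows and columns in the dataframe
print(dim(scores_df))

# vectors of mismatched lengths (3 and 2 values) raise an error,
# which is caught here and replaced by a short message
result <- tryCatch(
  data.frame(id = c(1, 2, 3), score = c(90.5, 82.0)),
  error = function(e) "error: columns differ in length"
)
print(result)
```

Catching the error with tryCatch() is only a convenience for the demonstration; in a normal script the failed call would simply stop with an error.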

Examples

Let’s now look at some examples of creating a dataframe using the above syntax –

Let’s say we have the following two vectors representing the name and age of some students in a university.

# create vectors
name <- c("Ben", "Quinton", "Virat", "Smriti", "Jos")
age <- c(23, 27, 24, 21, 26)

Note that the values in the name and the age vectors are related. That is, values at each index correspond to a particular student. For example, student 1’s name is name[1] and age is age[1].

Let’s now create a dataframe to capture this related information better.

# create a dataframe
students_df <- data.frame(
  Name = name,
  Age = age
)
# display the dataframe
print(students_df)

Output:

     Name Age
1     Ben  23
2 Quinton  27
3   Virat  24
4  Smriti  21
5     Jos  26

Here, we create a dataframe with two columns “Name” and “Age”. We use the name vector as values for the “Name” column and the age vector as values for the “Age” column.

Each row of the dataframe now holds the record of one student, so we can read directly from it that Ben’s age is 23.
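After creating a dataframe, it can help to confirm its shape and column types. The sketch below rebuilds the same students_df and inspects it with the base R functions nrow(), ncol(), names() and str(); it is one possible way to check the result, not a required step.

```r
# rebuild the dataframe from the example above
name <- c("Ben", "Quinton", "Virat", "Smriti", "Jos")
age <- c(23, 27, 24, 21, 26)
students_df <- data.frame(Name = name, Age = age)

# number of rows and number of columns
print(nrow(students_df))
print(ncol(students_df))

# column names
print(names(students_df))

# compact summary of each column's type and first few values
str(students_df)
```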

You can also create the vectors directly inside the data.frame() call instead of defining them first. Here, we pass the name and age values inline.

# create a dataframe
students_df <- data.frame(
  Name = c("Ben", "Quinton", "Virat", "Smriti", "Jos"),
  Age = c(23, 27, 24, 21, 26)
)
# display the dataframe
print(students_df)

Output:

     Name Age
1     Ben  23
2 Quinton  27
3   Virat  24
4  Smriti  21
5     Jos  26

We get the same result as above.
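Rather than comparing the printed output by eye, the two approaches can be compared in code. The sketch below builds the dataframe both ways and checks them with identical() from base R; the variable names df_from_vectors and df_inline are made up for illustration.

```r
# approach 1: define the vectors first, then pass them in
name <- c("Ben", "Quinton", "Virat", "Smriti", "Jos")
age <- c(23, 27, 24, 21, 26)
df_from_vectors <- data.frame(Name = name, Age = age)

# approach 2: create the vectors inside the data.frame() call
df_inline <- data.frame(
  Name = c("Ben", "Quinton", "Virat", "Smriti", "Jos"),
  Age = c(23, 27, 24, 21, 26)
)

# TRUE when both dataframes have the same columns, values and types
print(identical(df_from_vectors, df_inline))
```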

Accessing values in an R dataframe

Rows and columns in an R dataframe are indexed starting from 1. You can use the [] notation to access values from a dataframe in R. The following is the syntax –

# access value at row r and column c in dataframe df
df[r, c]

Here, r is the row index and c is the column index of the value to retrieve from the dataframe df.

Let’s look at an example.

In the dataframe created above, let’s get the value at row 3 and column 1.

# dataframe value at row 3, column 1
print(students_df[3, 1])

Output:

[1] "Virat"

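We get “Virat”, which is the value of the “Name” column in row 3. Note that in R versions before 4.0.0, data.frame() converts character vectors to factors by default, so the printed result also lists the factor levels unless you pass stringsAsFactors = FALSE.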
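The same [] notation also accepts column names and can return whole rows or columns. The sketch below shows a few common variations on the students_df example: indexing by column name, extracting a column with the $ operator, selecting a full row, and filtering rows by a condition. The age cutoff of 24 is an arbitrary value chosen for illustration.

```r
# rebuild the dataframe from the examples above
students_df <- data.frame(
  Name = c("Ben", "Quinton", "Virat", "Smriti", "Jos"),
  Age = c(23, 27, 24, 21, 26)
)

# same cell as students_df[3, 1], addressed by column name
print(students_df[3, "Name"])

# an entire column as a vector
print(students_df$Age)

# an entire row, returned as a one-row dataframe
print(students_df[3, ])

# only the rows where Age is greater than 24
print(students_df[students_df$Age > 24, ])
```

Leaving the row or column position empty, as in students_df[3, ], selects everything along that dimension.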

Authors

  • Piyush

    Piyush is a data scientist passionate about using data to understand things better and make informed decisions. In the past, he's worked as a Data Scientist for ZS and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

  • Gottumukkala Sravan Kumar