Extract year from datetime column in pandas

Pandas – Extract Year from a datetime column

A column of datetime type in pandas is a rich source of temporal information that could be useful. For example, you may want to see your data at a yearly or monthly level. Or, you may want to find whether it was a weekend or not. Knowing how to extract such information quickly can come quite handy when working with dataframes with datetime column(s). In this tutorial, we’ll look at how to extract the year, month, day, etc. information from a datetime column in pandas.

Pandas datetime columns have information like year, month, day, etc as properties. To extract the year from a datetime column, simply access it by referring to its “year” property. The following is the syntax:

df['Month'] = df['Col'].dt.year

Here, ‘Col’ is the datetime column from which you want to extract the year.

For example, you have the following dataframe of sales of an online store.

import pandas as pd

# sales data of a shop
dates = ["2020-12-29", "2020-12-30", "2020-12-31", "2021-01-01", "2021-01-02", "2021-01-03"]
units_sold = [10, 8, 14, 21, 16, 12]

# create the dataframe
df = pd.DataFrame({
    "Purchase Date": dates,
    "Sales": units_sold
})

# print the dataframe
print(df)

Output:

  Purchase Date  Sales
0    2020-12-29     10
1    2020-12-30      8
2    2020-12-31     14
3    2021-01-01     21
4    2021-01-02     16
5    2021-01-03     12

To extract the year from the “Purchase Date” column and add it as a separate column:

# convert the Purchase Date to datetime
df['Purchase Date'] = pd.to_datetime(df['Purchase Date'])
# add a column for Year
df['Year'] = df['Purchase Date'].dt.year

# print the dataframe
print(df)

Output:

  Purchase Date  Sales  Year
0    2020-12-29     10  2020
1    2020-12-30      8  2020
2    2020-12-31     14  2020
3    2021-01-01     21  2021
4    2021-01-02     16  2021
5    2021-01-03     12  2021

Here, we first converted the “Purchase Date” column to datetime format using the pandas to_datetime() function and then created the “Year” column by accessing its “year” property. Let’s look at the type of each column using the pandas info() function.

📚 Data Science Programs By Skill Level

Introductory

Intermediate ⭐⭐⭐

Advanced ⭐⭐⭐⭐⭐

🔎 Find Data Science Programs 👨‍💻 111,889 already enrolled

Disclaimer: Data Science Parichay is reader supported. When you purchase a course through a link on this site, we may earn a small commission at no additional cost to you. Earned commissions help support this website and its team of writers.

df.info()

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 3 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   Purchase Date  6 non-null      datetime64[ns]
 1   Sales          6 non-null      int64         
 2   Year           6 non-null      int64         
dtypes: datetime64[ns](1), int64(2)
memory usage: 272.0 bytes

You can see that “Purchase Date” is of type datetime whereas the “Year” is of integer type.

You can also extract other date-related properties like the month, week, day, hour, minute, etc. from a datetime column as we did above. For example, let’s add columns showing the month, day, and day of the year in the above dataframe.

# add a column for Month
df['Month'] = df['Purchase Date'].dt.month
# add a column for Day
df['Day'] = df['Purchase Date'].dt.day
# add a column for Day of the Year
df['Day of Year'] = df['Purchase Date'].dt.dayofyear

# show the dataframe
print(df)

Output:

  Purchase Date  Sales  Year  Month  Day  Day of Year
0    2020-12-29     10  2020     12   29          364
1    2020-12-30      8  2020     12   30          365
2    2020-12-31     14  2020     12   31          366
3    2021-01-01     21  2021      1    1            1
4    2021-01-02     16  2021      1    2            2
5    2021-01-03     12  2021      1    3            3

In the above example, we created columns “Month”, “Day”, and “Day of Year” by accessing the “Purchase Date” datetime column’s month, day, and dayofyear properties respectively.

There are a number of other properties (all the way up to nanoseconds). You can also get some boolean properties as well like whether the date is a starting date of a month, or whether it’s a leap year or not, etc.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having pandas version 1.0.5


Subscribe to our newsletter for more informative guides and tutorials.
We do not spam and you can opt out any time.


Author

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

Scroll to Top