A column of datetime type in pandas is a rich source of temporal information that could be useful. For example, you may want to see your data at a yearly or monthly level. Or, you may want to find whether it was a weekend or not. Knowing how to extract such information quickly can come quite handy when working with dataframes with datetime column(s). In this tutorial, we’ll look at how to extract the year, month, day, etc. information from a datetime column in pandas.
Extract Year from a datetime column
Pandas datetime columns have information like year, month, day, etc as properties. To extract the year from a datetime column, simply access it by referring to its “year” property. The following is the syntax:
df['Month'] = df['Col'].dt.year
Here, ‘Col’ is the datetime column from which you want to extract the year.
For example, you have the following dataframe of sales of an online store.
import pandas as pd # sales data of a shop dates = ["2020-12-29", "2020-12-30", "2020-12-31", "2021-01-01", "2021-01-02", "2021-01-03"] units_sold = [10, 8, 14, 21, 16, 12] # create the dataframe df = pd.DataFrame({ "Purchase Date": dates, "Sales": units_sold }) # print the dataframe print(df)
Output:
Purchase Date Sales 0 2020-12-29 10 1 2020-12-30 8 2 2020-12-31 14 3 2021-01-01 21 4 2021-01-02 16 5 2021-01-03 12
To extract the year from the “Purchase Date” column and add it as a separate column:
# convert the Purchase Date to datetime df['Purchase Date'] = pd.to_datetime(df['Purchase Date']) # add a column for Year df['Year'] = df['Purchase Date'].dt.year # print the dataframe print(df)
Output:
Purchase Date Sales Year 0 2020-12-29 10 2020 1 2020-12-30 8 2020 2 2020-12-31 14 2020 3 2021-01-01 21 2021 4 2021-01-02 16 2021 5 2021-01-03 12 2021
Here, we first converted the “Purchase Date” column to datetime format using the pandas to_datetime() function and then created the “Year” column by accessing its “year” property. Let’s look at the type of each column using the pandas info()
function.
Introductory ⭐
- Harvard University Data Science: Learn R Basics for Data Science
- Standford University Data Science: Introduction to Machine Learning
- UC Davis Data Science: Learn SQL Basics for Data Science
- IBM Data Science: Professional Certificate in Data Science
- IBM Data Analysis: Professional Certificate in Data Analytics
- Google Data Analysis: Professional Certificate in Data Analytics
- IBM Data Science: Professional Certificate in Python Data Science
- IBM Data Engineering Fundamentals: Python Basics for Data Science
Intermediate ⭐⭐⭐
- Harvard University Learning Python for Data Science: Introduction to Data Science with Python
- Harvard University Computer Science Courses: Using Python for Research
- IBM Python Data Science: Visualizing Data with Python
- DeepLearning.AI Data Science and Machine Learning: Deep Learning Specialization
Advanced ⭐⭐⭐⭐⭐
- UC San Diego Data Science: Python for Data Science
- UC San Diego Data Science: Probability and Statistics in Data Science using Python
- Google Data Analysis: Professional Certificate in Advanced Data Analytics
- MIT Statistics and Data Science: Machine Learning with Python - from Linear Models to Deep Learning
- MIT Statistics and Data Science: MicroMasters® Program in Statistics and Data Science
🔎 Find Data Science Programs 👨💻 111,889 already enrolled
Disclaimer: Data Science Parichay is reader supported. When you purchase a course through a link on this site, we may earn a small commission at no additional cost to you. Earned commissions help support this website and its team of writers.
df.info()
Output:
<class 'pandas.core.frame.DataFrame'> RangeIndex: 6 entries, 0 to 5 Data columns (total 3 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Purchase Date 6 non-null datetime64[ns] 1 Sales 6 non-null int64 2 Year 6 non-null int64 dtypes: datetime64[ns](1), int64(2) memory usage: 272.0 bytes
You can see that “Purchase Date” is of type datetime whereas the “Year” is of integer type.
Extract Month, Day, Hour, Minute, Second, etc from datetime column
You can also extract other date-related properties like the month, week, day, hour, minute, etc. from a datetime column as we did above. For example, let’s add columns showing the month, day, and day of the year in the above dataframe.
# add a column for Month df['Month'] = df['Purchase Date'].dt.month # add a column for Day df['Day'] = df['Purchase Date'].dt.day # add a column for Day of the Year df['Day of Year'] = df['Purchase Date'].dt.dayofyear # show the dataframe print(df)
Output:
Purchase Date Sales Year Month Day Day of Year 0 2020-12-29 10 2020 12 29 364 1 2020-12-30 8 2020 12 30 365 2 2020-12-31 14 2020 12 31 366 3 2021-01-01 21 2021 1 1 1 4 2021-01-02 16 2021 1 2 2 5 2021-01-03 12 2021 1 3 3
In the above example, we created columns “Month”, “Day”, and “Day of Year” by accessing the “Purchase Date” datetime column’s month
, day
, and dayofyear
properties respectively.
There are a number of other properties (all the way up to nanoseconds). You can also get some boolean properties as well like whether the date is a starting date of a month, or whether it’s a leap year or not, etc.
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having pandas version 1.0.5
Subscribe to our newsletter for more informative guides and tutorials.
We do not spam and you can opt out any time.