A column of datetime type in pandas is a rich source of temporal information that could be useful. For example, you may want to see your data at a yearly or monthly level. Or, you may want to find whether it was a weekend or not. Knowing how to extract such information quickly can come quite handy when working with dataframes with datetime column(s). In this tutorial, we’ll look at how to extract the year, month, day, etc. information from a datetime column in pandas.
Extract Year from a datetime column
Pandas datetime columns have information like year, month, day, etc as properties. To extract the year from a datetime column, simply access it by referring to its “year” property. The following is the syntax:
df['Month'] = df['Col'].dt.year
Here, ‘Col’ is the datetime column from which you want to extract the year.
For example, you have the following dataframe of sales of an online store.
import pandas as pd # sales data of a shop dates = ["2020-12-29", "2020-12-30", "2020-12-31", "2021-01-01", "2021-01-02", "2021-01-03"] units_sold = [10, 8, 14, 21, 16, 12] # create the dataframe df = pd.DataFrame({ "Purchase Date": dates, "Sales": units_sold }) # print the dataframe print(df)
Output:
Purchase Date Sales 0 2020-12-29 10 1 2020-12-30 8 2 2020-12-31 14 3 2021-01-01 21 4 2021-01-02 16 5 2021-01-03 12
To extract the year from the “Purchase Date” column and add it as a separate column:
# convert the Purchase Date to datetime df['Purchase Date'] = pd.to_datetime(df['Purchase Date']) # add a column for Year df['Year'] = df['Purchase Date'].dt.year # print the dataframe print(df)
Output:
Purchase Date Sales Year 0 2020-12-29 10 2020 1 2020-12-30 8 2020 2 2020-12-31 14 2020 3 2021-01-01 21 2021 4 2021-01-02 16 2021 5 2021-01-03 12 2021
Here, we first converted the “Purchase Date” column to datetime format using the pandas to_datetime() function and then created the “Year” column by accessing its “year” property. Let’s look at the type of each column using the pandas info()
function.
df.info()
Output:
<class 'pandas.core.frame.DataFrame'> RangeIndex: 6 entries, 0 to 5 Data columns (total 3 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Purchase Date 6 non-null datetime64[ns] 1 Sales 6 non-null int64 2 Year 6 non-null int64 dtypes: datetime64[ns](1), int64(2) memory usage: 272.0 bytes
You can see that “Purchase Date” is of type datetime whereas the “Year” is of integer type.
Extract Month, Day, Hour, Minute, Second, etc from datetime column
You can also extract other date-related properties like the month, week, day, hour, minute, etc. from a datetime column as we did above. For example, let’s add columns showing the month, day, and day of the year in the above dataframe.
# add a column for Month df['Month'] = df['Purchase Date'].dt.month # add a column for Day df['Day'] = df['Purchase Date'].dt.day # add a column for Day of the Year df['Day of Year'] = df['Purchase Date'].dt.dayofyear # show the dataframe print(df)
Output:
Purchase Date Sales Year Month Day Day of Year 0 2020-12-29 10 2020 12 29 364 1 2020-12-30 8 2020 12 30 365 2 2020-12-31 14 2020 12 31 366 3 2021-01-01 21 2021 1 1 1 4 2021-01-02 16 2021 1 2 2 5 2021-01-03 12 2021 1 3 3
In the above example, we created columns “Month”, “Day”, and “Day of Year” by accessing the “Purchase Date” datetime column’s month
, day
, and dayofyear
properties respectively.
There are a number of other properties (all the way up to nanoseconds). You can also get some boolean properties as well like whether the date is a starting date of a month, or whether it’s a leap year or not, etc.
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having pandas version 1.0.5
Subscribe to our newsletter for more informative guides and tutorials.
We do not spam and you can opt out any time.