In this tutorial, we will look at how to remove non-alphanumeric characters from a string in Python with the help of some examples.
What are alphanumeric characters?
A character is alphanumeric if it is either a letter (a to z, A to Z) or a digit (0 to 9). For example, the string striker123 contains only alphanumeric characters, whereas the string striker_123 contains one non-alphanumeric character ('_').
You can use the string isalnum() function to check whether a character is alphanumeric.
# check if character is alphanumeric
print('a'.isalnum())
print('A'.isalnum())
print('7'.isalnum())
print('_'.isalnum())
print('#'.isalnum())
Output:
True
True
True
False
False
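The same function also works on whole strings, which is a quick way to test the striker123 and striker_123 examples mentioned earlier. The short sketch below is only illustrative; the dictionary name whole_string_checks and the extra sample strings are assumptions added here, not part of the original tutorial.

```python
# isalnum() on a whole string is True only when the string is non-empty
# and every character in it is alphanumeric.
whole_string_checks = {
    "striker123": "striker123".isalnum(),    # letters and digits only
    "striker_123": "striker_123".isalnum(),  # contains an underscore
    "": "".isalnum(),                        # empty string
    "caf\u00e9": "caf\u00e9".isalnum(),      # non-ASCII letter (e with acute accent)
}

for text, result in whole_string_checks.items():
    print(repr(text), result)
```

Note that isalnum() follows Unicode character properties, so accented letters count as alphanumeric, and an empty string returns False.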
Removing non-alphanumeric characters from a string is a common text preprocessing step. There are a number of ways to do this in Python. Let's look at them with the help of some examples.
Using string isalnum() and string join() functions
You can use the string isalnum() function along with the string join() function to create a string with only alphanumeric characters.
# string with non-alphanumeric characters
s = "Striker@#$_123"
# remove non-alphanumeric characters
new_s = ''.join(c for c in s if c.isalnum())
print(new_s)
Output:
Striker123
You can see that the resulting string doesn't have any non-alphanumeric characters. Here we iterate over the characters in the original string and keep each one only if it is alphanumeric, which we check using the string isalnum() function. We then use the string join() function to concatenate the characters that remain.
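If you need this in more than one place, the same idea can be wrapped in a small helper. The sketch below is one possible way to do it; the function name keep_alnum is hypothetical, and it uses the built-in filter() with str.isalnum in place of the generator expression, which should behave the same way.

```python
def keep_alnum(text):
    """Return text with every non-alphanumeric character removed."""
    # filter() keeps only the characters for which str.isalnum is True;
    # join() then concatenates them back into a single string.
    return ''.join(filter(str.isalnum, text))


print(keep_alnum("Striker@#$_123"))  # Striker123
print(keep_alnum("hello, world!"))   # helloworld
```

Spaces are not alphanumeric, so they are removed along with the punctuation.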
Using a regular expression to remove non-alphanumeric characters
We can also use regular expressions to remove such characters. For example, we can write a regular expression that matches all the non-alphanumeric characters in the string and then replace each match with an empty string. You can use the re module in Python to implement regular expression pattern matching.
import re

# string with non-alphanumeric characters
s = "Striker@#$_123"
# remove non-alphanumeric characters
new_s = re.sub(r'[^a-zA-Z0-9]', '', s)
print(new_s)
Output:
Striker123
We get the same result as above.
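The two approaches agree here because the sample string contains only ASCII letters and digits. With non-ASCII text they can differ: the pattern [^a-zA-Z0-9] removes accented letters, while isalnum() keeps them. The sketch below illustrates this with a made-up sample string; the variable names and the alternative pattern [\W_] are assumptions added for comparison, not something the original tutorial uses.

```python
import re

# Sample string with one non-ASCII letter (\u00e9 is e with an acute accent).
s = "Caf\u00e9@#$_123"

# ASCII-only pattern: removes everything except a-z, A-Z and 0-9.
ascii_only = re.sub(r'[^a-zA-Z0-9]', '', s)

# \W matches non-word characters; the underscore counts as a word
# character, so it is added to the class explicitly.
unicode_aware = re.sub(r'[\W_]', '', s)

# The isalnum() approach from the earlier section, for comparison.
via_isalnum = ''.join(c for c in s if c.isalnum())

print(ascii_only)     # accented letter is removed
print(unicode_aware)  # accented letter is kept
print(via_isalnum)    # accented letter is kept
```

Which behavior is preferable depends on the data: the ASCII-only pattern is stricter, while the other two keep letters and digits from other scripts.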
For more on regular expressions in Python, refer to this guide.