How to Fix - NameError: name 'gutenberg' is not defined

Gutenberg is a corpus, or a collection of written texts, included in the Natural Language Toolkit (NLTK) library for Python. It contains a diverse set of literary works in English, including novels, essays, and plays, from various time periods. The Gutenberg corpus is often used for natural language processing tasks such as text classification, language modeling, and information retrieval.

The gutenberg module, available from `nltk.corpus`, provides functions for accessing the Gutenberg corpus. When you try to use it, you may run into the error “NameError: name ‘gutenberg’ is not defined”.

Why does the `NameError: name 'gutenberg' is not defined` occur?

This error occurs when your code refers to the name `gutenberg` but Python cannot find that name in the current namespace. The usual cause is that `gutenberg` was never imported: `import nltk` binds only the name `nltk`, and `nltk.download('gutenberg')` only copies the corpus files to disk, so neither of them makes `gutenberg` available as a name.
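The mechanism can be reproduced without NLTK at all. The sketch below is only an illustration: `trigger_name_error` is a hypothetical helper that refers to `gutenberg` without ever importing or assigning it, and returns the message Python raises.

```python
def trigger_name_error():
    """Use the name 'gutenberg' without binding it first."""
    try:
        # 'gutenberg' was never imported or assigned, so the name lookup fails
        gutenberg.raw("austen-sense.txt")
    except NameError as err:
        return str(err)
    return None


message = trigger_name_error()
print(message)  # name 'gutenberg' is not defined
```

The error is raised by the name lookup itself, before any NLTK code runs, which is why installing or downloading anything does not make it go away.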

How to correctly import `gutenberg`?

The correct way to import the gutenberg module is as follows –

1. Make sure that you have the nltk module installed. Run `pip show nltk` in the command prompt or terminal to check whether it is installed. If it is not, run `pip install nltk` to install it.
2. Import the nltk module.
3. Download the gutenberg corpus with `nltk.download('gutenberg')`. This downloads the gutenberg corpus to your computer.
4. After downloading the gutenberg corpus, import the gutenberg module in your Python code with `from nltk.corpus import gutenberg`.
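The installation check in the first step can also be done from inside Python. This is a minimal sketch, not part of NLTK: `is_installed` is a hypothetical helper built on the standard library's `importlib.util.find_spec`, which returns `None` when a top-level module cannot be found.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("nltk"):
    print("nltk is installed")
else:
    print("nltk is missing; run: pip install nltk")
```

This only confirms that the package is importable. It says nothing about whether the gutenberg corpus has been downloaded.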

The above steps will make sure that you have correctly imported gutenberg from the nltk module. Let’s now look at an example of importing and using the gutenberg module from the nltk library.

The example below assumes that the nltk module is already installed.

import nltk

# download the gutenberg corpus
nltk.download('gutenberg')

# import gutenberg
from nltk.corpus import gutenberg

# load the text of Jane Austen's "Sense and Sensibility"
sense_and_sensibility = gutenberg.raw('austen-sense.txt')

# print the first 400 characters of the text
print(sense_and_sensibility[:400])

Output:

[nltk_data] Downloading package gutenberg to
[nltk_data]     /Users/piyush/nltk_data...

[Sense and Sensibility by Jane Austen 1811]

CHAPTER 1

The family of Dashwood had long been settled in Sussex.
Their estate was large, and their residence was at Norland Park,
in the centre of their property, where, for many generations,
they had lived in so respectable a manner as to engage
the general good opinion of their surrounding acquaintance.
The late owner of this estate was a single ma

[nltk_data]   Unzipping corpora/gutenberg.zip.

In the above example, we load the text of Jane Austen’s “Sense and Sensibility” using the gutenberg corpus from nltk and print its first 400 characters. Because we followed the steps mentioned earlier, we didn’t get an error.

Common Errors when importing `gutenberg`

Let’s now look at some common scenarios that could result in errors while importing the gutenberg module.

`NameError: name 'gutenberg' is not defined`

A common mistake is to download the gutenberg corpus but forget to import it, as in the example below –

import nltk

# download the gutenberg corpus
nltk.download('gutenberg')

# load the text of Jane Austen's "Sense and Sensibility"
sense_and_sensibility = gutenberg.raw('austen-sense.txt')

# print the first 400 characters of the text
print(sense_and_sensibility[:400])

Output:

[nltk_data] Downloading package gutenberg to
[nltk_data]     /Users/piyush/nltk_data...
[nltk_data]   Unzipping corpora/gutenberg.zip.

---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

Cell In[1], line 7
      4 nltk.download('gutenberg')
      6 # load the text of Jane Austen's "Sense and Sensibility"
----> 7 sense_and_sensibility = gutenberg.raw('austen-sense.txt')
      9 # print the first 400 characters of the text
     10 print(sense_and_sensibility[:400])

NameError: name 'gutenberg' is not defined

Using `nltk.download('gutenberg')` downloads the gutenberg corpus to your computer, but to use it in your Python code you still have to import the gutenberg module with `from nltk.corpus import gutenberg`.

`LookupError: resource 'gutenberg' was not found`

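If the gutenberg corpus is not downloaded on your machine, the import statement itself still succeeds, because NLTK loads corpora lazily. The LookupError is raised when you first access the corpus, for example by calling `gutenberg.raw()`.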

import nltk

# import gutenberg
from nltk.corpus import gutenberg

# load the text of Jane Austen's "Sense and Sensibility"
sense_and_sensibility = gutenberg.raw('austen-sense.txt')

# print the first 400 characters of the text
print(sense_and_sensibility[:400])

Output:

LookupError: 
**********************************************************************
  Resource gutenberg not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('gutenberg')
**********************************************************************

To avoid this error, make sure that the gutenberg corpus is downloaded before you use it in your code.
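One possible way to guard against the missing corpus is to check for it first and download only when the check fails. The sketch below is an assumption-laden illustration: `ensure_resource` is a hypothetical helper, and the lookup and download functions are passed in as arguments so the logic can be tried without NLTK. With NLTK, `nltk.data.find` (which raises a LookupError for a missing resource) and `nltk.download` would fill those roles.

```python
def ensure_resource(resource_path, package, find, download):
    """Download `package` only if `find(resource_path)` raises LookupError.

    Returns True if a download was triggered, False if the resource
    was already available.
    """
    try:
        find(resource_path)
    except LookupError:
        download(package)
        return True
    return False


# Intended use with NLTK (assumes nltk is installed):
#
#   import nltk
#   ensure_resource("corpora/gutenberg", "gutenberg",
#                   nltk.data.find, nltk.download)
#   from nltk.corpus import gutenberg
```

Calling the helper before the first use of `gutenberg` keeps the download step from being forgotten when the code runs on a new machine.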

Conclusion

In conclusion, the “NameError: name ‘gutenberg’ is not defined” error means that the name `gutenberg` has not been imported into your code. To fix it, make sure that nltk is installed, download the corpus with `nltk.download('gutenberg')`, and import it with `from nltk.corpus import gutenberg` before you use it. With these steps in place, you can continue with your NLP project.

Author

Piyush Raj

Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.
