GutenTag download instructions

Most users of GutenTag will wish to follow these steps

First, download the GutenTag zip file corresponding to your operating system. (Older versions are also available, as is the Python source code on GitHub.)

Unzip this file and move the GutenTag folder to the desired location on your machine.

(If you are upgrading from an older version, we recommend that you first make copies of any lexicons or lists you have created, and then delete the old folder.)

There are several options for downloading the datasets that work with GutenTag. Earlier versions of GutenTag were based on the 2010 DVD image of (US) Project Gutenberg. That image is still supported, and it offers more low-bandwidth-friendly download options such as BitTorrent. However, if possible, we recommend that you download our version of the corpus, which also includes texts released between 2010 and June 2016 and excludes non-text media (both files are about 8 GB). We also have separate corpora for texts from Project Gutenberg Canada and Project Gutenberg Australia; these corpora are much smaller, but they include many well-known modern texts which are still in copyright in the US. Each of these corpora is hosted in the corresponding country; users should make sure that their download and use of these files comply with the laws of their jurisdiction.

Unzip the downloaded corpora and, for simplest use, place the resulting folder in the same directory as the main GutenTag folder (they should be parallel, not one inside the other).

To run GutenTag, navigate to the folder where you have placed GutenTag. If you are using Windows, you can just double-click on the executable file. Mac and Linux users will have to run GutenTag.py, which may require opening a terminal window and typing "python GutenTag.py". A browser window should open automatically. The first time you run GutenTag, it will ask for the location of the Gutenberg corpus. If you have downloaded our version of the original Project Gutenberg corpus and followed the above instructions, simply enter "../PGUS/" (without quotation marks).
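The folder layout and the path entered at the first-run prompt can be checked with a small Python sketch like the one below. It is only an illustration and is not part of GutenTag: the function names are hypothetical, and the example folders (a GutenTag folder and a PGUS corpus folder under the same parent) are assumed from the instructions above.

```python
import os


def are_parallel(gutentag_dir, corpus_dir):
    """Return True if the two folders sit side by side (same parent folder).

    Hypothetical helper for illustration; not part of GutenTag.
    """
    g = os.path.abspath(gutentag_dir)
    c = os.path.abspath(corpus_dir)
    return g != c and os.path.dirname(g) == os.path.dirname(c)


def corpus_prompt_path(gutentag_dir, corpus_dir):
    """Return the corpus location relative to the GutenTag folder.

    The result uses forward slashes and a trailing slash, matching the
    "../PGUS/" form given in the instructions.
    """
    rel = os.path.relpath(corpus_dir, start=gutentag_dir)
    return rel.replace(os.sep, "/") + "/"


if __name__ == "__main__":
    # Assumed example locations; substitute your own folders.
    gutentag = "/home/user/tools/GutenTag"
    corpus = "/home/user/tools/PGUS"
    print(are_parallel(gutentag, corpus))
    print(corpus_prompt_path(gutentag, corpus))
```

If the corpus folder is placed somewhere else, the second function still gives a relative path from the GutenTag folder, although whether GutenTag accepts every such path is not something this sketch verifies.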

For detailed instructions on how to use GutenTag, see our Readme (readme.md; .md files can be opened in a text editor or a web browser).

For users wishing to run GutenTag as a Python script or Python API

GutenTag can also be accessed through a Python API. For this use, download the Mac/Linux version above or get all the files directly from the GutenTag GitHub page, and see the Readme.