Dictionaries

Several dictionaries are available that can be loaded into TAPA. If you have your own dictionary that you would like to share with others, please send me an e-mail to have it uploaded to this site.

A couple of brief notes about making your own dictionary file:

  • The first column on the left should always have the header “Symbol”
  • The second column should always have the header “Type”
    • Each entry in the “Type” column takes one of two values, depending on whether the symbol being coded is a word or a character:
      • word
      • char
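
As a rough illustration of the layout described above, here is a minimal Python sketch that writes a small dictionary file. Only the “Symbol” and “Type” headers and the word/char values come from the notes above. The tab delimiter, the “Valence” score column, the example entries, and the function name are all assumptions made for illustration, so check one of the reference files below for the exact layout before relying on it.

```python
import csv

# Hypothetical example entries: (symbol, type, score). The scores and the
# "Valence" column name are made up for illustration only.
ROWS = [
    ("happy", "word", 8.0),
    ("sad", "word", 2.0),
    ("!", "char", 5.0),
]

# The two symbol types named in the notes above.
VALID_TYPES = {"word", "char"}


def write_dictionary(path, rows, value_header="Valence", delimiter="\t"):
    """Write rows to a delimited text file with Symbol and Type as the first two columns.

    The delimiter is an assumption; the notes above only specify the headers.
    """
    for symbol, symbol_type, _ in rows:
        if symbol_type not in VALID_TYPES:
            raise ValueError(
                f"Type must be 'word' or 'char', got {symbol_type!r} for {symbol!r}"
            )
    # newline="" lets the csv module control line endings.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        writer.writerow(["Symbol", "Type", value_header])
        writer.writerows(rows)
```

Calling write_dictionary("my_dictionary.txt", ROWS) would produce a three-column text file whose first line is the header row. Any entry with a type other than word or char is rejected before the file is written.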

If you have any questions about making a dictionary file, please use the files available below as a reference. Feel free to send me an e-mail if I can be of any assistance.


*Note: The Warriner et al. ratings found below are the same ones that are loaded into the software by default.

**Note: For all of the dictionary files on this page, you may want to set TAPA’s “Text Encoding” option to utf-8 when reading the files in. Your system’s default encoding may also be fine.
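
If you are unsure whether a downloaded file really is UTF-8, a quick check outside of TAPA can help. The following Python sketch is one possible way to do that and is not part of TAPA itself; the function name is hypothetical. It reads the file line by line, so it should also cope with the larger files further down this page.

```python
def decodes_as(path, encoding="utf-8"):
    """Return True if the whole file at path can be decoded with the given encoding."""
    try:
        with open(path, "r", encoding=encoding) as handle:
            # Iterate line by line so the whole file is never held in memory.
            for _ in handle:
                pass
    except UnicodeDecodeError:
        return False
    return True
```

For example, decodes_as("my_dictionary.txt") returns False if the file contains bytes that are not valid UTF-8, in which case a different “Text Encoding” setting may be needed.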


Dictionary Files

Right-click and choose “Save Link As…” to download a dictionary file. Alternatively, click the “download” link, then copy and paste the dictionary contents into a .txt file on your hard drive.

Bestgen & Vincze (2012) – DIC-LSA norms (link) (download)

Buchanan et al. (2012) – Single word norms (link) (download)

Brysbaert, Warriner, & Kuperman (2014) – Concreteness norms (link) (download) (rescaled version)

Brysbaert et al. (2014) – Concreteness norms for Dutch (link) (download)

Engelthaler & Hills (2017) – Humor norms (link) (download)

Kovács, Carroll, & Lehman (2013) – Authenticity norms (link) (download)

Kuperman et al. (2012) – Age of Acquisition norms (link) (download)

Kwan et al. – Embodiment ratings for 687 English verbs (link) (download)

Lynott & Connell (2009) – Adjective Modality norms (link) (download)

Lynott & Connell (2013) – Noun Modality norms (link) (download)

Lynott & Connell – Adjective and Noun Modality norms combined (download)

Paetzold & Specia (2016) – Psycholinguistic Properties of Words (link) (download)

Stadthagen-Gonzalez et al. (2016) – Spanish Emotional Norms (link) (download)

Warriner et al. (2013) – Affective rating norms (link) (download)


Pre-trained Word Vectors

Note: These dictionaries contain very large numbers of words and dimensions, so all of the following files are very large and memory-intensive to load. You will most likely need at least 32 GB of RAM in your computer to use any of them. Some may require 64 GB of RAM or more.

You may also consider downloading the “first 100K” or “first 500K” versions of these dictionary files. They are shortened versions of the full dictionaries and may be more practical on less powerful systems.
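
If you need a different cut-off than the shortened versions offered here, a full file can be trimmed without loading it into memory. The Python sketch below shows one way this might be done; it is not how the shortened files on this page were produced. It assumes one entry per line and a UTF-8 encoded file, and the function name is hypothetical. If the file begins with a header row, that row counts as one of the kept lines.

```python
from itertools import islice


def keep_first_lines(source_path, target_path, n_lines, encoding="utf-8"):
    """Copy the first n_lines lines of source_path to target_path.

    Lines are streamed one at a time, so memory use stays small even
    for very large files. Returns the number of lines written.
    """
    written = 0
    # newline="" preserves the original line endings unchanged.
    with open(source_path, "r", encoding=encoding, newline="") as src, \
         open(target_path, "w", encoding=encoding, newline="") as dst:
        for line in islice(src, n_lines):
            dst.write(line)
            written += 1
    return written
```

For example, keep_first_lines("vectors_full.txt", "vectors_first_250k.txt", 250_000) would write a 250,000-line version. Both file names here are placeholders.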

Pennington et al. (2014) – GloVe pre-trained vectors (link)

⇒ Wikipedia 2014 + Gigaword 5, 100-dimensional version, de-duplicated (download)

⇒ Twitter, 25 through 200 dimension versions, cleaned (download) (first 500K words version) (first 100K words version)

⇒ Common Crawl, 42B token version, 300 dimensions, uncased, cleaned (download) (first 500K words version) (first 100K words version)

Salle (2016) – LexVec pre-trained vectors (link)

⇒ Common Crawl, 58B tokens, 300 dimensions, word vectors (download)