The proper citation for this software is as follows:
Boyd, R. L. (2016). RIOT Scan: Recursive Inspection of Text Scanner (Version 2.0.21) [Software]. Available from https://www.ryanboyd.io/software/riot
If you publish an article that cites RIOT Scan, let me know and I would be happy to list it on this website.
2016-03-26 — Uploaded version 2.0.21. Added the ability to output processed text with the rest of the output.
2016-03-25 — Uploaded version 2.0.2. Added additional segmentation options.
2015-05-20 — Uploaded version 2.0.11. Added the Global Citizen dictionary. Special thanks to Nicholas Faulkner for suggesting this addition.
2015-03-17 — Uploaded version 2.0.1 and 2.0.1 XP. Fixed a bug with the custom dictionary engine that could cause text analysis to halt if your custom dictionary contained multi-word phrases.
2014-09-07 — Uploaded version 2.0.04 and 2.0.04 XP. Added the Aging Stereotypes dictionary. Special thanks to Dr. Allison Chasteen and Dr. Jessica Remedios for contributing the original dictionaries and additional information.
2014-08-14 — Uploaded version 2.0.03 and 2.0.03 XP. Added the Mindfulness dictionary. Special thanks to Dr. Susan E. Collins for this contribution to the project.
2014-08-10 — Uploaded version 2.0.02 and 2.0.02 XP. Fixed a minor bug with the “Not” word option of the custom dictionary feature that could lead to overcounting in some cases.
2014-08-10 — Uploaded version 2.0.01 and 2.0.01 XP. Updated the Content Coding Output guide to include these dictionaries. Added the Moral Foundations Dictionary.
2014-08-10 — Uploaded version 2.0.0 and 2.0.0 XP. Updated the Content Coding Output guide to include these dictionaries. Version 2.0.0 constitutes a major update to the software. Restructured core content coding engine for considerable performance gains. Depending on the content coding systems that you select and the particulars of your dataset, you can expect each coding system to run anywhere from about 5% to 25% more quickly. This update also includes gains in accuracy for some content coding systems. Notably, this applies to the Forestry Values Dictionary, Common Nouns, PANAS-X dictionary, and Lexicoder Sentiment Dictionary, among others. As an additional note, these changes may provide minimally different output when compared to older versions of RIOT Scan. Across several corpora, output from the old and new versions still correlates at ~.98. Fixed a minor typo in the previous implementation of the Lexicoder Sentiment Dictionary (was not detecting the word “mar”). Restructured the order of dictionary searches for multiple systems for higher accuracy. Added “Not” words to custom dictionary features (see Creating Custom Dictionaries page for more details).
2014-08-10 — Added the “Self-talk” dictionary and the Privacy Dictionary. I would like to extend a special thanks to both Dr. Emily J. Oliver and Dr. Asimina Vasalou for their respective contributions to the RIOT Scan project and for providing the full dictionaries used in their research.
2014-08-08 — Uploaded version 1.8.711 and 1.8.711 XP. Added the Agitation-Dejection dictionary and the Home Perception dictionary. Updated the Content Coding Output guide to include these dictionaries.
2014-07-16 — Uploaded version 1.8.71 and 1.8.71 XP. Minor internal changes to how file information is passed around within the system. Fixed minor bugs related to these changes that involved RIOT Scan skipping a file when running very small numbers (i.e., 1 or 2) of files.
2014-06-17 — Special thanks to Dr. David Walker for his help in tracking down bugs for the 1.8.7 release! Updating to 1.8.7 is recommended.
2014-06-17 — Uploaded version 1.8.7 and 1.8.7 XP. Fixed an issue with the Harvard General Inquirer that could, in specific circumstances, cause the program to stop processing text. This new version is also lighter on its feet and no longer loads all text into memory prior to analysis. Rather, 1.8.7 reads text in on an “as needed” basis. This not only makes RIOT far less greedy with system resources, but also allows for the analysis of extremely large corpora. Essentially, the size of the corpus that can be processed by RIOT is no longer limited by whether you can fit the entire thing into RAM all at once.
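The “as needed” reading described above can be illustrated with a short sketch. This is not RIOT Scan's actual code; the function names, the ".txt" filter, and the token-counting stand-in for analysis are all assumptions made for illustration. The point is only that a generator hands over one file at a time, so memory use depends on the largest single file rather than on the whole corpus.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_texts(folder: str) -> Iterator[Tuple[str, str]]:
    """Yield (file name, text) pairs one at a time instead of loading every file up front."""
    for path in sorted(Path(folder).glob("*.txt")):
        # Only the current file's text is held here; it can be released
        # as soon as the caller moves on to the next file.
        yield path.name, path.read_text(encoding="utf-8", errors="replace")


def word_counts(folder: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical stand-in for analysis: count whitespace-separated tokens per file."""
    for name, text in iter_texts(folder):
        yield name, len(text.split())
```

Iterating over word_counts("some_folder") processes the files in name order and never holds more than one file's text at once.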
2014-04-25 — Uploaded version 1.8.6 and 1.8.6 XP. Added the Lexicoder Topic Dictionary. Fixed an issue where part of the Vowel/Consonant and Value Lexicon output would be mixed with each other. Updating to the latest version is recommended.
2014-03-14 — Uploaded version 1.8.5 and 1.8.5 XP. Added Bardi et al.’s (2008) Value Lexicon.
2014-03-02 — Uploaded version 1.8.4 and 1.8.4 XP. Fixed the European-formatted output. Please let me know if there are any issues with this output.
2014-01-12 — Uploaded version 1.8.3 and 1.8.3 XP. Fixed a bug that could cause analysis to halt on rare occasions when using a custom dictionary that contains phrases.
2014-01-04 — Uploaded version 1.8.2 and 1.8.2 XP. Fixed a minor display issue and added “Preparing Text” status display. Functionality is identical to version 1.8.1.
2013-12-17 — Uploaded version 1.8.1 and 1.8.1 XP. Worked out a bug where extra white space (e.g., tabs, spaces, linebreaks) surrounding the “%%%” breaks in custom dictionaries could cause issues with “Dict_Percent” output and potentially cause user-defined categories to not be recognized. If you still encounter this bug, please let me know. Changed the “Variability Indices” checkbox label to be a bit more accurate. By request, added “SixLetter” category to this output, reflecting percentage of words in an observation that are greater than six letters long.
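As a rough illustration of the “SixLetter” category described above, the sketch below computes the percentage of words longer than six letters. The tokenization rule (runs of letters only) and the function name are assumptions; RIOT Scan's own word splitting may differ.

```python
import re


def six_letter_percent(text: str) -> float:
    """Percentage of words in the text that are longer than six letters."""
    # Assumption: a "word" is any unbroken run of letters (digits and punctuation split words).
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    long_words = sum(len(word) > 6 for word in words)
    return 100 * long_words / len(words)
```

For example, "Analysis of remarkable text" contains four words, two of which are longer than six letters, so the function returns 50.0.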
2013-12-05 — Updated a good deal of information on the Content Coding Output page.
2013-12-04 — Uploaded version 1.8.0 and 1.8.0 XP. Added the Brysbaert et al. concreteness norms.
2013-11-30 — Uploaded version 1.7.91 and 1.7.91 XP. Fixed a tiny dictionary bug that, in some cases, could cause the DICLSA coding scheme to prematurely halt analysis. Uploaded a slightly updated version of the R and SAS prep scripts. The previous versions had a slight error in calculating the “Making Distinctions” category.
2013-11-20 — Uploaded version 1.7.9 and 1.7.9 XP. Added the Forest Value dictionary coding system. Slight changes to improve resource efficiency.
2013-11-19 — Uploaded version 1.7.8 XP. This version is identical to 1.7.8, although built around the .NET 4.0 framework. This will allow Windows XP users to still have full access to RIOT Scan.
2013-10-10 — Uploaded version 1.7.8. Updated to .NET 4.5. RIOT Scan has access to a greater amount of RAM, allowing for the processing of much larger quantities of text.
2013-10-10 — Uploaded version 1.7.7. Added the option of European .csv output formatting.
2013-10-08 — Uploaded version 1.7.6. Fixed a small bug that would halt analysis under an isolated circumstance.
2013-10-03 — Uploaded version 1.7.5. Added a method for abstraction detection inspired by Mergenthaler’s work labelled as “Abstraction (Broad Method)”.
2013-10-02 — Uploaded version 1.7.4. Converted Lexicoder Sentiment Dictionary and vowel/consonant detection to the newer system. Removed extraneous code, which should streamline the process a bit more. Tests under the new engine suggest that Lexicoder Sentiment Dictionary is more accurate now, especially in terms of negations. Altered the order of content processing for internal consistency.
2013-09-30 — Uploaded version 1.7.3. Changed code so that words with hanging hyphens are no longer merged. This was done in order to accommodate certain transcription styles and will not impact most users.
2013-09-24 — Uploaded version 1.7.2. Fixed a bug that would cause certain special characters in a text file to halt analysis.
2013-09-18 — Uploaded version 1.7.1. Small tweak made to allow for custom dictionaries encoded in UTF-8 to be used. This allows for dictionaries to code in multiple languages.
2013-09-18 — Uploaded version 1.7.0. Fixed a problem that would allow for poorly constructed dictionaries to overemphasize some categories. Fixed an issue that was preventing phrases in custom dictionaries from being properly detected. Updating to this recent version is highly recommended.
2013-09-18 — Uploaded version 1.6.9. Fixed a small bug that, in a few select circumstances, could cause the newer content coding engine to be excessively conservative. Speed and resource allocation slightly (but not noticeably) improved.
2013-09-17 — Uploaded version 1.6.8. Added some code that should help prevent odd output when a custom dictionary contains extra white space around the delimiters (%%%). Considerable UI overhaul and usability improvements. Special thanks to Dr. Yi-Tai Seih for his suggestions.
2013-09-05 — Uploaded version 1.6.5. Added a small snippet of code that should help RIOT Scan be slightly less greedy with system resources, freeing up small chunks of RAM as files finish processing. No change in functionality. Only users processing very large amounts of text may notice a difference.
2013-09-03 — Uploaded version 1.6.4. Wrote a small number of patches to improve lemmatization.
2013-08-21 — Uploaded version 1.6.3. Fixed a small bug that would allow poorly-built dictionaries to facilitate multiple detections of the same word via wildcards. Fixed a long-standing bug that would cause RIOT Scan to trip and halt on empty files. Converted even more systems over to new engine.
2013-08-19 — Uploaded version 1.6.2. Multiple instances are now allowed. Fixed minor issue that would occasionally prevent sequential wildcards from being detected.
2013-08-17 — Uploaded version 1.6.1. Small optimizations to the upgraded engines. Updates should help reduce memory demands, especially for large files. Added more frequent cancellation checks to upgraded engines.
2013-08-16 — Uploaded version 1.6.0. Complete overhaul of the majority of coding systems, including the custom coding system. The new system is much more thorough when processing text and has new built-in safeguards that are more rigorous than in previous versions. New systems are more flexible and sensitive to detection. Custom dictionary support is now extremely customizable and allows for different combinations of wildcards. Please report any bugs that you may come across.
2013-07-13 — Uploaded version 1.5.5. Added PANAS-X markers as a coding scheme (for reference, see Pressman & Cohen, 2012). Significantly reworked small parts of the engine, removed redundant output (e.g., WhirlAll dictionary percentage and WhirlAll output), and fixed a small bug that prevented a very small number of specific phrases that are coded in dictionaries from being detected. Upgrading to the latest version is recommended.
2013-06-12 — Uploaded updated R script. This version includes calculation of certain composite variables that were already done in the SAS script.
2013-06-01 — Uploaded a better R script. This version is much smarter about creating new variables based on what is actually in the input data file. Additionally, the script is much smaller.
2013-05-31 — Uploaded a preliminary R script to be parallel to the SAS script already provided. It works fine, but is likely to be somewhat inefficient.
2013-05-22 — Uploaded version 1.5.2. Added LabMT Norms. Removed the stripping of certain characters (i.e., “#”, “@”, “[”, “]”, “=”) so that the LabMT system is able to detect words that use these characters.
2013-05-22 — Uploaded version 1.5.1. Added AFINN Norms.
2013-05-13 — Uploaded version 1.5.0. Added “DictPercent” output for relevant content coding systems. See “Content Coding System Output” for information on this.
2013-05-04 — Uploaded version 1.4.95r. The LIWC 2007 coding system is no longer included in RIOT Scan. A RIOT-compatible dictionary file will be available via LIWC.net. Minor code restructuring with regard to completion of text processing (no change in system functionality). Fixed the citation for the Gilhooly & Logie (1980) norms. The system is the same; however, the citation that was output when running the system was incorrect in version 1.4.9.
2013-05-02 — Uploaded version 1.4.9. Added the 1,944 Gilhooly & Logie (1980) word norms. Added a break feature for user-based errors in file writing (i.e., in case the user deletes the file during writing, etc.). Replaced the “Clear List” function button with a “Check/Uncheck all” feature for more convenient user interactions.
2013-05-02 — Uploaded version 1.4.8. Added the complete version of the Harvard General Inquirer / IV-4 system. The previous version had approximately 99 of the 182 categories; this version does all 182.
2013-04-30 — Uploaded version 1.4.7. Custom dictionary system now automatically detects $MAXCATS:. Inclusion of this in your custom dictionary will not harm anything, but no longer does anything either.
2013-04-25 — Uploaded version 1.4.6. Updated the Noun List to the most recent version. Slight speed / efficiency improvement made to the DIC-LSA norms system (no change in functionality).
2013-04-25 — Uploaded version 1.4.5. Added the Bestgen & Vincze (2012) DIC-LSA norms.
2013-04-24 — Uploaded version 1.4.4. Added Pressman & Cohen’s (2007) Social Ties Scoring Dictionary. Updated SAS prep script accordingly.
2013-03-28 — Uploaded version 1.4.3, which includes a small piece of code optimization (functionality is identical to previous versions). Site redesign for better navigability and information presentation. Uploaded an updated version of the SAS Prep Script that includes some new composites as well as the newer SDM categories.
2013-02-19 — Uploaded version 1.4.2. Does some new calculations (Standard deviations as derived from means, or SDM, which is different from the Mean of Standard Deviations, or MSD, as derived from the norm standard deviations) for the ANEW and Toronto Word Pool systems.
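The difference between SDM and MSD can be made concrete with a small sketch. It reflects one reading of the entry above rather than RIOT Scan's confirmed formulas: SDM is taken to be the standard deviation of the matched words' norm means, and MSD the mean of their published norm standard deviations. The norm values are made-up toy numbers, not real ANEW or Toronto Word Pool data, and the choice of population standard deviation is also an assumption.

```python
from statistics import mean, pstdev

# Made-up toy norms for illustration only: word -> (norm mean, norm standard deviation).
TOY_NORMS = {"happy": (8.0, 1.0), "war": (2.0, 2.0), "table": (5.0, 1.5)}


def score(text, norms=TOY_NORMS):
    """Return Mean, SDM, and MSD for the words in the text that appear in the norms."""
    matched = [norms[word] for word in text.lower().split() if word in norms]
    if not matched:
        return None
    means = [m for m, _ in matched]
    sds = [s for _, s in matched]
    return {
        "Mean": mean(means),   # average of the matched words' norm means
        "SDM": pstdev(means),  # spread of those norm means within this text
        "MSD": mean(sds),      # average of the norms' own standard deviations
    }
```

Under this reading, SDM describes how much the words in a given text differ from one another, while MSD describes how much the original raters disagreed about each word on average.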
2013-02-18 — Uploaded version 1.4.1. Added Hu & Liu’s Opinion Lexicon.
2013-02-14 — Uploaded version 1.4.0. By popular request, I added the ability to use your own custom dictionaries. Your dictionary file must be organized by (somewhat) stringent standards; however, the system has been rigorously tested to ensure that it is fully functional and error-free if your dictionary file is formatted properly.
2013-02-13 — Uploaded version 1.3.8. Added Laver & Garry’s Dictionary of Policy Position. This version has 418 words/phrases (as opposed to the listed 415 on the Provalis site), where “civil war”, “people”, and “war in Iraq” are added to the “Culture” category.
2013-02-11 — Uploaded version 1.3.7. Fixed a bug where the math was not cleared when calculating values for the Toronto Word Pool Norms across multiple files. If you downloaded version 1.3.6 earlier today, please update to the latest version.
2013-02-11 — Uploaded version 1.3.6. Added Martindale’s (1990) “Being” and “Analogy” categories, detailed in The Clockwork Muse. They are really only useful for looking at poetry, it would seem, and should be thought about with care before use as they are very general and simple. Also added a system to provide means for the Toronto Word Pool scores, including Imagery and Concreteness means (and mean standard deviations), FOA, SOA, and ONR, among others. Cleaned up the routines for the ANEW systems, so they are *ever so slightly* more efficient now. However, the code is tidier.
2012-12-07 — Uploaded version 1.3.5. Added Quintan’s noun list to the content coding system. Did some small tweaks for optimization. Uploaded a SAS script to prep and give descriptive statistics for all of the content categories and variation indices.
2012-12-06 — Uploaded version 1.3.4. Fixed a bug that reported incorrect processing times when files were segmented. Did some small optimizations. The functionality and accuracy of all text analysis remains identical.
2012-11-30 — Uploaded version 1.3.3. A couple of slight interface tweaks and fixed the spacing in a couple of the citations to match APA format. No change in functionality.
2012-11-19 — Uploaded version 1.3.2. Added the ability to process each file as multiple bodies of text. Changed output file encoding to make it more generally compatible (some people reported issues with the previous UTF-8 encoding method).
2012-11-15 — Uploaded version 1.3.1. Added the LemmaGen open-source lemmatization classes. Retooled Porter’s stemming algorithm to make it significantly more efficient. Retooled a couple of core parts of the engine to make it slightly cleaner and more efficient. Thoroughly retested and optimized sentence-level recursive inspection with stemming / lemmatization.
2012-11-12 — Uploaded version 1.3.0. Made variation index calculations optional. For larger bodies of text, people who are not interested in these indices might not want to wait the extra time for these calculations to get processed. But, in all seriousness, who isn’t enraptured by the coefficient of variation? Also, added a catch to ensure that hyphens are stripped out for the calculations of vowels/consonants. There was a possibility in version 1.2.9 that they would not always get cut.
2012-11-12 — Uploaded version 1.2.9. Added many more cancellation catches for better usability. Added a file write catch so that the user is notified to close file instead of RIOT Scan simply dropping further analyses. RIOT Scan no longer automatically strips hyphens from text, as multiple content coding systems pick up hyphenated words. Hyphens are, however, still stripped out when coding for the Lexicoder Sentiment categories, as this system includes only non-hyphenated forms of such words.
2012-11-10 — Uploaded version 1.2.7. Implemented a modified version of Christos Attikos’ VB7 port of Martin Porter’s stemming algorithm. Added some catches to file writing so that it doesn’t cut out of loop if there is an access error.
2012-11-09 — Uploaded version 1.2.6. Fixed an overly-thorough loop for the Lexicoder Sentiment Dictionary that was causing massively excessive processing for each text file.
2012-11-08 — Uploaded version 1.2.5. Added the Lexicoder Sentiment Dictionary, which is very exciting. This is the first coding system in the software that can catch phrases in addition to single words. This implementation does the dictionary, its negation dictionary, and the calculations for final numbers. This system does not do preprocessing. That may or may not be added in a future version of RIOT Scan.
2012-11-08 — Added a page with other text tools in order to make them publicly available.
2012-11-07 — Uploaded version 1.2.4. Added a list of “Whirlall” words, which was brought to my attention while reading The Psychology of Adaptation To Absurdity by Fisher and Fisher (1993). Added a system that strips out everything that is not an alpha character, then determines the percentage of Vowels (and Vowels + Y) along with Consonants (and Consonants + Y).
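The vowel/consonant calculation described above might look something like the following sketch. Several details are assumptions rather than RIOT Scan's documented behavior: only the ASCII letters a-z are treated as alpha characters, “Y” is left out of the plain Vowels and Consonants counts and added back for the “+ Y” variants, percentages are taken over all remaining letters, and the output names are hypothetical.

```python
import re

VOWELS = set("aeiou")


def vowel_consonant_percentages(text: str) -> dict:
    """Percentages of vowels and consonants among the alphabetic characters of a text."""
    letters = re.sub(r"[^a-z]", "", text.lower())  # strip everything that is not a-z
    total = len(letters)
    if total == 0:
        return {"Vowels": 0.0, "VowelsY": 0.0, "Consonants": 0.0, "ConsonantsY": 0.0}
    vowels = sum(ch in VOWELS for ch in letters)
    ys = letters.count("y")
    consonants = total - vowels - ys  # every remaining letter that is neither a vowel nor "y"
    return {
        "Vowels": 100 * vowels / total,
        "VowelsY": 100 * (vowels + ys) / total,
        "Consonants": 100 * consonants / total,
        "ConsonantsY": 100 * (consonants + ys) / total,
    }
```

With these assumptions, Vowels and ConsonantsY always sum to 100, as do VowelsY and Consonants.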
2012-11-02 — Uploaded version 1.2.3. RIOT Scan now calculates 8 more categories for Whissell’s DAL. I substantially cleaned up the tracking and processing code for all of the ANEW systems and the WDAL system. Cleaner code, *slightly* more efficient now.
2012-11-02 — Uploaded version 1.2.2. Added Male and Female subject norms for Bradley & Lang’s (1999) ANEW system.
2012-11-02 — Uploaded version 1.2.1. Added Bradley & Lang’s (1999) Affective Norms for English Words system. The current version applies the norms from ALL subjects. Separate coding systems for Female and Male norms will be added in the very near future. I also tweaked the scoring for Whissell’s DAL such that it is more precise.
2012-11-01 — Uploaded version 1.2.0. Contains an early working version of the Whissell Dictionary of Affect in Language. Currently calculates means, but only for Pleasantness, Activation, and Imagery, based upon Whissell’s listed norms.
2012-09-03 — Uploaded version 1.1.7. Added the Dolch Sight Words coding scheme. Added a self-constructed list of 300+ onomatopoeia words.
2012-08-31 — Uploaded version 1.1.6. Added a “cancel” button so that progress may be halted while processing, preventing the need to close and restart the application if a cancellation is desired. Made structural tweaks to the content coding engine for (very minor) efficiency gains, as well as to make it easier to add dictionaries in the future.
2012-08-31 — Uploaded version 1.1.5. Added the Financial Sentiment Dictionary coding scheme.
2012-08-31 — Uploaded version 1.1.4. A number of optimizations implemented. Changed checkboxes to checkedlistbox for ease of adding coding schemes in the future. Added citation output when text analysis is executed. Other minor and not-so-minor tweaks to the content coding system in order to make it more efficient.
2012-08-30 — Uploaded version 1.1.3. Fixed a broken word for the Body Type system under the “Transparency” category.
2012-08-30 — Uploaded version 1.1.2. Added “processing time” to the output. Keeps track of how long each file takes to process through RIOT Scan. Helpful for debugging and user estimation of the time it takes to do different things based upon file size. Additionally, added a function to disable the content coding checkbox if no coding schemes are selected. This is for the sake of efficiency (multiple conditionals will not be executed now due to content coding, but no schemes, being selected).
2012-08-29 — Uploaded version 1.1.1. Added an English Prime violations dictionary. The ability to select which scoring systems you would like to use has also been added.
2012-08-28 — Uploaded version 1.1.0. A number of parsing anomalies were fixed (non-default characters are now converted to default). Added content coding systems.