CRAN

textclean 0.9.3

Text Cleaning Tools

Released Jul 23, 2018 by Tyler Rinker

This package cannot yet be used with Renjin because it depends on other packages that are not available: lexicon 1.2.1, textshape 1.6.0, and data.table 1.12.0. An older version of this package is more compatible with Renjin.

Dependencies

data.table 1.12.0, lexicon 1.2.1, textshape 1.6.0, english 1.2-3, stringi, glue 1.3.1, mgsub 1.7.1, qdapRegex 0.7.2

Tools to clean and process text. The tools are geared toward checking for substrings that are not optimal for analysis and then either normalizing them (replacing them with more analysis-friendly substrings, or removing them; see Sproat, Black, Chen, Kumar, Ostendorf, & Richards (2001)) or extracting them into new variables. For example, emoticons are often used in text but are not always easily handled by analysis algorithms. The replace_emoticon() function replaces emoticons with word equivalents.
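The emoticon normalization described above can be illustrated with a minimal sketch. It is written in Python for self-containment and is not the package's R implementation: the function name `replace_emoticons`, the `EMOTICONS` table, and the word equivalents in it are all hypothetical stand-ins, whereas the package itself draws on a much larger emoticon lexicon.

```python
import re

# Hypothetical, tiny lookup table for illustration only; the actual package
# relies on a far larger emoticon lexicon with its own word equivalents.
EMOTICONS = {
    ":)": "smiley",
    ":(": "frown",
    ";)": "wink",
    ":D": "laughing",
}

# Try longer emoticons first so a longer one is never split by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def replace_emoticons(text: str) -> str:
    """Replace each known emoticon with its word equivalent."""
    # Pad replacements with spaces so they do not fuse with adjacent words.
    replaced = _PATTERN.sub(lambda m: " " + EMOTICONS[m.group(0)] + " ", text)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join(replaced.split())


print(replace_emoticons("Great job :) but the ending was sad :("))
# prints: Great job smiley but the ending was sad frown
```

The same pattern (detect a substring that analysis tools handle poorly, then substitute a plain-word equivalent) applies to the other kinds of normalization the description mentions.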