| cleanNLP-package | cleanNLP: A Tidy Data Model for Natural Language Processing |
| annotate | Run the annotation pipeline on a set of documents |
| cleanNLP | cleanNLP: A Tidy Data Model for Natural Language Processing |
| combine_documents | Combine a set of annotations |
| dep_frequency | Universal Dependency Frequencies |
| doc_id_reset | Reset document ids |
| download_core_nlp | Download Java files needed for CoreNLP |
| extract_documents | Extract documents from an annotation object |
| from_CoNLLU | Reads a CoNLL-U or CoNLL-X File |
| get_coreference | Access coreferences from an annotation object |
| get_dependency | Access dependencies from an annotation object |
| get_document | Access document metadata from an annotation object |
| get_entity | Access named entities from an annotation object |
| get_sentence | Access sentence-level annotations |
| get_tfidf | Construct the TF-IDF Matrix from Annotation or Data Frame |
| get_token | Access tokens from an annotation object |
| get_vector | Access word embedding vector from an annotation object |
| init_coreNLP | Interface for initializing the coreNLP backend |
| init_spaCy | Interface for initializing the spaCy backend |
| init_tokenizers | Interface for initializing the tokenizers backend |
| obama | Annotation of Barack Obama's State of the Union Addresses |
| pos_frequency | Universal Part of Speech Code Frequencies |
| print.annotation | Print a summary of an annotation object |
| read_annotation | Read annotation files from disk |
| tidy_pca | Compute Principal Components and store as a Data Frame |
| word_frequency | Most frequent English words |
| write_annotation | Write annotation files to disk |