CRAN

ngram 3.0.4

Fast n-Gram 'Tokenization'

Released Nov 21, 2017 by Drew Schmidt

This package can be loaded by Renjin, but 10 out of 14 tests failed.

An n-gram is a sequence of n "words" taken, in order, from a body of text. This is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The 'tokenization' and "babbling" are handled by very efficient C code, which can even be built as its own standalone library. The babbler is a simple Markov chain. The package also offers a vignette with complete example 'workflows' and information about the utilities offered in the package.
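To make the definition concrete, here is a minimal Java sketch that extracts word n-grams from a string. It is an illustration only: the class name `NgramSketch` and the method `ngrams` are hypothetical, splitting on whitespace is an assumed simplification, and none of this reflects the package's C tokenizer or its R API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NgramSketch {

    /**
     * Splits the text on whitespace and returns every run of n
     * consecutive words, in order of appearance.
     */
    public static List<String> ngrams(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        // Slide a window of n words across the text.
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [a b, b a, a c, c a, a b, b b]
        System.out.println(ngrams("a b a c a b b", 2));
    }
}
```

A Markov-chain "babbler" of the kind described above would then pick each next word at random from the words observed to follow the current n-gram; that step is omitted here.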

Installation

Maven

This package can be included as a dependency in a Java or Scala project by adding the following to your project's pom.xml file. Read more about embedding Renjin in JVM-based projects.

<dependencies>
  <dependency>
    <groupId>org.renjin.cran</groupId>
    <artifactId>ngram</artifactId>
    <version>3.0.4-b8</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>bedatadriven</id>
    <name>bedatadriven public repo</name>
    <url>https://nexus.bedatadriven.com/content/groups/public/</url>
  </repository>
</repositories>
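With the dependency in place, the package could be loaded from Java through the standard javax.script API. The sketch below rests on several assumptions: that Renjin's script engine is also on the classpath and is registered under the name "Renjin", and that the library() call shown for the Renjin CLI works the same way in an embedded engine. The class and method names are hypothetical. Because most of the package's tests failed under Renjin, the load is wrapped so that a failure is reported rather than thrown.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class NgramEmbedding {

    /** Returns the Renjin engine, or null if none is registered on the classpath. */
    public static ScriptEngine findRenjin() {
        // Assumption: Renjin registers its JSR-223 engine under the name "Renjin".
        return new ScriptEngineManager().getEngineByName("Renjin");
    }

    /** Tries to load the ngram package; returns false if the engine is missing or the load fails. */
    public static boolean tryLoadNgram(ScriptEngine engine) {
        if (engine == null) {
            return false;
        }
        try {
            // Assumption: same call as in the Renjin CLI.
            engine.eval("library('org.renjin.cran:ngram')");
            return true;
        } catch (ScriptException e) {
            System.err.println("Could not load ngram: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        boolean loaded = tryLoadNgram(findRenjin());
        System.out.println(loaded ? "ngram loaded" : "ngram not available");
    }
}
```

Checking the return value before calling any package function keeps the host application running when Renjin or the package is unavailable.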


Renjin CLI

If you're using Renjin from the command line, you can load this library by invoking:

library('org.renjin.cran:ngram')

Test Results

This package was last tested against Renjin 0.9.2644 on Jun 1, 2018.

Source

R
C


Release History