Extract Text from Microsoft Word Documents
Released Nov 9, 2018 by Jeroen Ooms
This package can be loaded by Renjin but all tests failed.
Wraps the 'AntiWord' utility to extract text from Microsoft Word documents. The utility only supports the old 'doc' format, not the newer XML-based 'docx' format. Use the 'xml2' package to read the latter.
For more about embedding Renjin in JVM-based projects, see the Renjin documentation.
To use this package as a dependency from a Java or Scala project, add the following to your project's Maven pom.xml:
<dependencies>
  <dependency>
    <groupId>org.renjin.cran</groupId>
    <artifactId>antiword</artifactId>
    <version>1.3-b6</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>bedatadriven</id>
    <name>bedatadriven public repo</name>
    <url>https://nexus.bedatadriven.com/content/groups/public/</url>
  </repository>
</repositories>
If you're using Renjin from the command line, you load this library by invoking:
This package was last tested against Renjin 0.9.2724 on Mar 2, 2019.