tokenizers.bpe (0.1.0)

Byte Pair Encoding Text Tokenization.

https://github.com/bnosac/tokenizers.bpe
http://cran.r-project.org/web/packages/tokenizers.bpe

Unsupervised text tokenizer focused on computational efficiency. Wraps the 'YouTokenToMe' library, which is a fast implementation of Byte Pair Encoding (BPE).
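
For readers unfamiliar with Byte Pair Encoding, the idea can be sketched in a few lines of Python. This is a toy illustration only: it is not the YouTokenToMe implementation and not this package's R interface, and the function names (learn_bpe, merge_pair, encode) are hypothetical. It assumes character-level starting symbols and no special handling of spaces or unknown tokens.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each distinct word becomes a tuple of single characters with a frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken by comparing the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the merge to every word, accumulating frequencies.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Split `word` into subwords by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = learn_bpe(["abab", "abab", "ab"], num_merges=2)
    print(rules)
    print(encode("ababab", rules))
```

Frequent character sequences end up as single tokens, while rare words fall back to smaller pieces. Production implementations such as the one this package wraps use far more efficient data structures than the repeated full recount shown here.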

Maintainer: Jan Wijffels
Author(s): Jan Wijffels [aut, cre, cph] (R wrapper), BNOSAC [cph] (R wrapper), VK.com [cph], Gregory Popovitch [ctb, cph] (Files at src/parallel_hashmap (Apache License, Version 2.0)), The Abseil Authors [ctb, cph] (Files at src/parallel_hashmap (Apache License, Version 2.0)), Ivan Belonogov [ctb, cph] (Files at src/youtokentome (MIT License))

License: MPL-2.0

Uses: Rcpp

Released 10 months ago.


Related packages: corpora, gsubfn, kernlab, languageR, lsa, tm, wordnet, zipfR, RWeka, RKEA, openNLP, skmeans, tau, tm.plugin.mail, lda, textcat, topicmodels, tm.plugin.dc, textir, movMF (20 best matches, based on common tags).

