Support alphanumeric and hyphenated words #45
Comments
I have the same issue, picked up with ordinal indicators. It looks like this is a problem with the hunspell parser:
hunspell::hunspell_parse(c("1st", "RNA-seq", "EIF4G1"))
#> [[1]]
#> [1] "st"
#>
#> [[2]]
#> [1] "RNA" "seq"
#>
#> [[3]]
#> [1] "EIF" "G"
Created on 2021-02-06 by the reprex package (v0.3.0)
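For contrast, here is a minimal base R sketch of the tokenization this issue is asking for. It is not how hunspell parses text, and `tokenize_keep_alnum` is a hypothetical helper name, not part of hunspell or spelling. It assumes a "word" is one or more alphanumeric runs joined by single hyphens, and it does not try to handle apostrophes.

```r
# Hypothetical helper (not part of hunspell or spelling): tokenize text while
# keeping digits and internal hyphens inside a word.
tokenize_keep_alnum <- function(text) {
  # One or more alphanumeric runs joined by single hyphens, e.g. "RNA-seq".
  pattern <- "[[:alnum:]]+(-[[:alnum:]]+)*"
  # gregexpr() finds every match per string; regmatches() extracts them,
  # returning one character vector per input element.
  regmatches(text, gregexpr(pattern, text))
}

tokens <- tokenize_keep_alnum(c("1st", "RNA-seq", "EIF4G1"))
print(tokens)
```

Under these assumptions each of the three inputs comes back as a single token, so an exact match against full words in a word list would succeed.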
Implementing a pre-filter right before the parse here could work: Lines 118 to 123 in a2b5f29
It feels like more of a quick-fix because it parses with
ignore_words <- c("1st", "RNA-seq", "EIF4G1")
lines <- c(
"This is the 1st line. It has first written in it.",
"The second has RNA-seq inside. But does not use RNAseq -- without the '-'",
"EIF4G1 but not EIF4G1fdsadf is used",
"This line's words are fine!"
)
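# Drop any whitespace-delimited token that exactly matches an ignore word.
pre_filter_plain <- function(lines, ignore = character()) {
  # Split on anything that is not a hyphen, alphanumeric, or punctuation,
  # so tokens keep their hyphens and adjacent punctuation.
  word_list <- strsplit(lines, "([^-[:alnum:][:punct:]])")
  vapply(
    word_list,
    function(i) {
      # Rejoin the remaining tokens into a single line.
      paste(i[!i %in% ignore], collapse = " ")
    },
    character(1)
  )
}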
pre_filter_plain(lines, ignore_words)
#> [1] "This is the line. It has first written in it."
#> [2] "The second has inside. But does not use RNAseq -- without the '-'"
#> [3] "but not EIF4G1fdsadf is used"
#> [4] "This line's words are fine!"
Created on 2021-02-06 by the reprex package (v0.3.0)
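One limitation of the pre-filter above is that it compares whole tokens, so an ignore word directly followed by punctuation (for example "1st." at the end of a sentence) is not removed. A possible variant, sketched here under assumptions, removes ignore words with a regular expression instead. `pre_filter_regex` is a hypothetical name, and the sketch assumes no ignore word contains the literal sequence `\E`.

```r
# Hypothetical variant of the pre-filter: remove ignore words wherever they
# appear as whole words, even when punctuation follows them.
pre_filter_regex <- function(lines, ignore = character()) {
  if (length(ignore) == 0) {
    return(lines)
  }
  # \Q...\E (PCRE) quotes each ignore word so it is matched literally.
  alternatives <- paste0("\\Q", ignore, "\\E", collapse = "|")
  # The lookarounds require that the match is not preceded or followed by an
  # alphanumeric character or hyphen, so "EIF4G1fdsadf" is left untouched.
  pattern <- paste0(
    "(?<![[:alnum:]-])(?:", alternatives, ")(?![[:alnum:]-])"
  )
  gsub(pattern, "", lines, perl = TRUE)
}

pre_filter_regex(
  c("Ends with the 1st.", "EIF4G1 but not EIF4G1fdsadf"),
  c("1st", "RNA-seq", "EIF4G1")
)
```

Unlike the token-based version, this leaves the surrounding whitespace in place, so removed words may leave double or leading spaces behind. That should not matter if the result is only passed on to a spell checker.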
This is meant to be a quick fix; the issue should probably be resolved in the hunspell parser instead.
* remove "ignore" words from WORDLIST before parsing in hunspell
* replace the complex if ... else if ... statement with a simpler switch()
I am using the following words in my package:
After inserting these words in inst/WORDLIST and running spelling::spell_check_package(), the function reports that the words seq, st, nd and EIF are misspelled.
Currently, my WORDLIST includes the words seq, st, nd and EIF to avoid triggering the spell checker, but I would prefer to include the full words. Thanks.