Error in `rf.modelSel` #7

jcarlis3 · 2021-07-01T16:17:21Z

I ran rf.modelSel back in 2017 in my sage-grouse research with Melanie et al. I'm dusting that analysis off, and Melanie pointed out that my MIR values should be 0-1, but several were negative. I see v2.2-0 has incorporated some bug fixes in that function, so I installed and attempted to re-run, but now get an error when running the updated rf.modelSel. I can reproduce the same error using the example in the help doc for rf.modelSel:

require(randomForest)
require(rfUtilities)

sessionInfo()
# R version 4.1.0 (2021-05-18)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 19042)
# 
# Matrix products: default
# 
# locale:
#   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
# [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
# [5] LC_TIME=English_United States.1252    
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] rfUtilities_2.2-0   randomForest_4.6-14
# 
# loaded via a namespace (and not attached):
#   [1] compiler_4.1.0 tools_4.1.0 


# Example from rf.modelSel
data(airquality)
airquality <- na.omit(airquality)

xdata = airquality[,2:6]
ydata = airquality[,1]

#### Regression example

#### Using Breiman's original Fortran code from randomForest package
( rf.regress <- rf.modelSel(airquality[,2:6], airquality[,1], 
                            imp.scale="se") )

#### Using Wright's C++ code from ranger package
( rf.regress <- rf.modelSel(airquality[,2:6], airquality[,1], 
                            method="Wright") )

#### Classification example
ydata = as.factor(ifelse(ydata < 40, 0, 1))

#### Using Breiman's original Fortran code from randomForest package
( rf.class <- rf.modelSel(xdata, ydata, ntree=1000) )


# The above statement returns this error:
# [1] "ntree"      "y"          "x"          "importance"
# Error in is.nan(errors) : default method not implemented for type 'list'
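
For readers hitting the same message: it is what base R's `is.nan()` reports when it is handed a list rather than an atomic vector. The sketch below only reproduces that message in isolation. The `errors` object here is a made-up stand-in, and nothing is assumed about how `rf.modelSel` builds its own `errors` object internally.

```r
# Hypothetical stand-in for a list of per-model error values
# (NOT the object that rf.modelSel creates internally).
errors <- list(0.21, NaN, 0.18)

# is.nan() has no default method for lists, so this call fails;
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  {
    is.nan(errors)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Flattening the list to a numeric vector first makes the check well defined.
errors_vec <- unlist(errors)
nan_flags <- is.nan(errors_vec)
print(nan_flags)
```

Whether flattening is the right fix inside the package depends on what the list actually holds there, so this is an illustration of the message rather than a patch.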


jeffreyevans · 2021-07-02T14:21:44Z

Jason, I am out until July 19th. I will take a look then. Best, Jeff

jcarlis3 · 2021-10-18T17:51:31Z

Brief recap of an off-thread discussion (in case others are having this same issue):

Negative MIR values are caused by the original importance values being negative. Suggestion is to use the CRAN version of the package (currently 2.1-5) and manually remove variables with negative MIR. Future developments may build in this sort of screening, but no fixes are imminent.
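
A minimal sketch of that manual screening, using `randomForest` directly on the `airquality` example from the help doc. It assumes MIR is the permutation importance divided by the maximum importance; that definition, the seed, and the object names are assumptions made for illustration, and the calculation inside `rf.modelSel` may differ in detail.

```r
library(randomForest)

set.seed(42)  # fixed seed so the permutation importance is repeatable

aq <- na.omit(airquality)
x <- aq[, 2:6]
y <- aq[, 1]

rf <- randomForest(x = x, y = y, ntree = 500, importance = TRUE)

# Permutation importance (%IncMSE for regression); values can be negative.
imp <- importance(rf, type = 1, scale = TRUE)[, 1]

# Assumed MIR definition: importance divided by the maximum importance.
mir <- imp / max(imp)

# Manual screening: keep only variables whose MIR is non-negative.
keep <- names(mir)[mir >= 0]
x_screened <- x[, keep, drop = FALSE]

print(round(mir, 3))
print(keep)
```

Because the ratio is taken against the largest importance, a negative MIR can only come from a negative importance value, which is why dropping those variables is enough to bring the remaining values back into the 0-1 range.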

jeffreyevans · 2021-10-18T18:41:56Z

I am wondering what you would consider reasonable behavior for negative importance values. Negative importance values are a valid possible outcome, simply denoting that the random permutation worked better and resulted in no feature contribution to the model. The magnitude of the negative value is irrelevant: it represents random variation and is mostly controlled by the stochastic element in the bootstrap. As such, should negative importance simply be truncated at zero? I can add an argument that controls the behavior of negative values.

One huge advantage of the new implementation of the importance function is that it now allows for a significance test, so one can evaluate the p-value of the feature contribution and screen out insignificant parameters. It also supports a much faster and more flexible implementation of random forests (through the ranger package).

If you want to replicate a previous analysis and produce results consistent with your original model, please use the CRAN version and specify a consistent random seed to control stochasticity.

I am still planning on getting back to the development version and submitting to CRAN, but life got in the way, so this will not happen until the beginning of the year. If there is a specific function that you would like me to address, I can find bandwidth to accommodate you; just let me know.

Best, Jeff
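
One way to picture the proposed argument for negative values is sketched below. The helper `handle_negative_importance()` and its `negative` option are hypothetical names invented for this illustration; they are not part of rfUtilities, and the importance values are made up.

```r
# Hypothetical helper, NOT part of rfUtilities: shows three possible
# behaviors for negative importance values.
handle_negative_importance <- function(imp, negative = c("keep", "zero", "drop")) {
  negative <- match.arg(negative)
  switch(negative,
    keep = imp,            # leave the values untouched
    zero = pmax(imp, 0),   # truncate negatives at zero
    drop = imp[imp >= 0]   # remove variables with negative importance
  )
}

# Made-up importance values, for illustration only.
imp <- c(Temp = 30.2, Wind = 18.5, Solar.R = 9.1, Month = 1.4, Day = -2.3)

imp_zero <- handle_negative_importance(imp, "zero")
imp_drop <- handle_negative_importance(imp, "drop")

# Ratio to the maximum, computed after the negatives are handled.
mir_zero <- imp_zero / max(imp_zero)
print(round(mir_zero, 3))
print(names(imp_drop))
```

Truncating keeps every variable in the table with a ratio of zero, while dropping removes it outright; either choice keeps the resulting ratios between 0 and 1.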
