Validation of exploration capability of BO #1842
-
Hi, community! Is there a known way to validate the quality of Bayesian optimization's recommendations on a historic dataset? BO is normally used in a loop with black-box function evaluations. However, if I have no access to the black-box function and only have a significant historic record of samples, is it possible to somehow assess the exploration capability of my specific implementation (with specific hyperparameters such as the lengthscale and the UCB beta)? I could run k-fold cross-validation on the dataset, but that would only assess the quality of the GP fit. The difficulty is with the acquisition function: how do I assess that it is correct and that the UCB beta is well chosen? There is no way to query the acquisition function maximizer, but I can check the GP values and the UCB values on a hold-out set consisting of the top 10% of the dataset samples by the (scalar) objective. However, I am puzzled how to combine these values into a single robust metric. I'd appreciate your help!
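For concreteness, here is a minimal sketch of the hold-out check I have in mind, using BoTorch. The data (`X`, `y`) and `beta` are placeholders, and I compute the UCB score by hand (note that BoTorch's `UpperConfidenceBound` applies `beta` under a square root, so the scaling convention here differs):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from gpytorch.mlls import ExactMarginalLogLikelihood

# Placeholder historic dataset; in practice, load your own records.
X = torch.rand(300, 4, dtype=torch.double)
y = -(X - 0.5).pow(2).sum(dim=-1, keepdim=True)
beta = 2.0  # placeholder UCB hyperparameter, on the standard-deviation scale

# Hold out the top 10% of samples by objective value.
k = max(1, len(y) // 10)
top = y.squeeze(-1).argsort(descending=True)[:k]
mask = torch.ones(len(y), dtype=torch.bool)
mask[top] = False

# Fit the GP on the remaining 90% of the data.
model = SingleTaskGP(X[mask], y[mask])
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

# Inspect posterior mean and UCB values on the held-out top decile.
with torch.no_grad():
    post = model.posterior(X[top])
    mean = post.mean
    std = post.variance.clamp_min(1e-12).sqrt()
    ucb = mean + beta * std
print(ucb.squeeze(-1))
```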
-
Hi @Obs01ete. To compare various methods on a given dataset, we typically fit a surrogate model on the dataset and run full BO loops to optimize the surrogate model. The surrogate can be any statistical model that interpolates the dataset to the full search space. Using a GP model is a convenient choice, but it may not be great for comparing various length-scales (since the surrogate itself has a length-scale).

Is there a particular reason that you're using UCB (and trying to tune beta) rather than some other acquisition function that doesn't require tuning a hyper-parameter? We find that EI and its variants generally work much better than UCB. @SebastianAment recently introduced LogEI, which eliminates some numerical issues and performs quite a bit better.
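For illustration, here is a minimal sketch of this surrogate-benchmark setup in BoTorch. The historic data (`X_hist`, `y_hist`), the bounds, the evaluation budget, and `beta` are all placeholder assumptions; rerunning the loop with different hyperparameters (or a different acquisition function) and comparing the best value found is the comparison we have in mind:

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import UpperConfidenceBound
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.manual_seed(0)
d = 4
bounds = torch.stack(
    [torch.zeros(d, dtype=torch.double), torch.ones(d, dtype=torch.double)]
)
X_hist = torch.rand(200, d, dtype=torch.double)             # placeholder data
y_hist = -(X_hist - 0.5).pow(2).sum(dim=-1, keepdim=True)   # placeholder objective

# 1) Fit a surrogate on the full historic dataset; its posterior mean stands
#    in for the inaccessible black-box function.
surrogate = SingleTaskGP(X_hist, y_hist)
fit_gpytorch_mll(ExactMarginalLogLikelihood(surrogate.likelihood, surrogate))

def synthetic_objective(X: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        return surrogate.posterior(X).mean

# 2) Run a full BO loop against the surrogate with the implementation under
#    test (here: UCB with a fixed beta), starting from a few random points.
beta = 2.0
train_X = torch.rand(5, d, dtype=torch.double)
train_y = synthetic_objective(train_X)
for _ in range(20):
    model = SingleTaskGP(train_X, train_y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
    acqf = UpperConfidenceBound(model, beta=beta)
    cand, _ = optimize_acqf(acqf, bounds=bounds, q=1, num_restarts=5, raw_samples=64)
    train_X = torch.cat([train_X, cand])
    train_y = torch.cat([train_y, synthetic_objective(cand)])

print(f"best value found with beta={beta}: {train_y.max().item():.4f}")
```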