Add ProximityArchive for novelty search #472
Hi @gresavage, thank you for your PR! I appreciate that it is well-documented and tested and that you clearly put a lot of time into it. I have left comments requesting a couple of changes.
Overall, I think the UnstructuredArchive is great. The UnstructuredGridArchive seems rather specialized, so I do not think we should add it to pyribs. Could you remove from this PR the UnstructuredGridArchive and the roll() method in ArrayStore?
I know these things can take quite a bit of time; if you need to work on other things, just let me know and I'm happy to help make the changes I mentioned.
change `GridArchive` to `GridUnstructuredArchive` in `roll` docstring
narrow pylint disable scope
upper and lower bound properties return measure max/min without checking archive size
update docstrings: UnstructuredArchive, index_of
remove learning_rate, threshold_min arguments from __init__
rename "sparsity" to "novelty"
remove boundaries property and `List` typehints
…bs into 468-unstructured-archives
Hi @gresavage, just wanted to give a quick update on things. I'm going through and modifying the UnstructuredArchive -- I've changed it a decent amount to align more with the rest of the library and how we implement things. I removed GridUnstructuredArchive and ArrayStore.roll as per my earlier comments, but do consider taking a look at my comment here #472 (comment), as I think it reveals an unexpected behavior. I'll keep working on this bit by bit, and it should be ready soon. Thanks again!
Hi @btjanaka, I'm eager to see what you've done. I haven't pushed any of my changes w.r.t. NS + local competition yet, as they are still in flux and I'd like to know your thoughts on handling non-dominated sorting... I also want to fold my changes into what you've got so the two strategies are homogeneous if you decide to include NS + local competition. I was also thinking more about the objective threshold and learning rate. Are we sure we want to do away with this behavior entirely if we implement NS + local competition? I removed the keyword arguments in my last commit, but I think it could be useful to have CMA-MAE behavior with the local competition variant.
RE: Emitters
I think an emitter could work for NSLC so long as the ranking doesn't play a role in what gets added to the archive. The only real hurdle I see with that route is that NSLC doesn't bother with how candidates are generated, so an NSLC emitter could essentially wrap any of the other emitters and just leave the
RE: Replacement & Use of NDS Rank
I looked over the papers as well and I think you're right that solutions are not replaced, which greatly simplifies the issue. After thinking about it even more, I think the way the archive works in NSLC is that essentially every member of the current population is added to the archive if it survives the genetic selection. A solution only survives selection based on NSGA-II optimizing the morphological novelty and local competition objectives, with novelty also playing the role of crowding distance. So I think the archive in NSLC doesn't actually do any determination of what gets added - if it is sent to the
I also think whether the "selection" based on non-domination rank and novelty for NSLC is done outside of PyRibs by a user algorithm or in the
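To make the two NSLC objectives discussed above concrete, here is a minimal sketch and not pyribs code. It assumes the usual reading of NSLC: novelty is the mean distance to the k nearest neighbors in measure space, and local competition is the number of those neighbors that a solution outperforms. The function names `nslc_objectives` and `non_dominated_mask` are hypothetical, and only the first non-dominated front is computed rather than a full NSGA-II ranking.

```python
import numpy as np


def nslc_objectives(measures, objective, k):
    """Return (novelty, local_competition) for every solution in a population.

    Hypothetical helper for illustration; both outputs are "larger is better".
    """
    n = len(measures)
    # Pairwise Euclidean distances in measure space, shape (n, n).
    dists = np.linalg.norm(measures[:, None, :] - measures[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # A solution is not its own neighbor.
    k = min(k, n - 1)
    neighbors = np.argsort(dists, axis=1)[:, :k]  # Indices of the k nearest.
    novelty = np.take_along_axis(dists, neighbors, axis=1).mean(axis=1)
    # Count how many of the k nearest neighbors each solution outperforms.
    local_competition = (objective[:, None] > objective[neighbors]).sum(axis=1)
    return novelty, local_competition


def non_dominated_mask(scores):
    """True for rows of `scores` (n, m) that no other row dominates."""
    mask = np.ones(len(scores), dtype=bool)
    for i in range(len(scores)):
        no_worse = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        mask[i] = not np.any(no_worse & strictly_better)
    return mask


measures = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 1.5]])
objective = np.array([1.0, 2.0, 3.0, 0.5])
novelty, local_competition = nslc_objectives(measures, objective, k=1)
first_front = non_dominated_mask(np.stack([novelty, local_competition], axis=1))
```

In this toy population, only the third solution is both the most novel and locally the best, so it alone forms the first front. Whether such a ranking lives in user code or inside the library is exactly the open question in the comment above.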
## Description

We are working on supporting diversity optimization algorithms such as novelty search (see #472). One difference from other algorithms is that such algorithms do not require objectives. However, the current API to Scheduler.tell() is:

```
tell(objective, measures, **fields)
```

i.e., it assumes objective and measures will be provided. This PR seeks to make it possible to call `scheduler.tell(measures)` in addition to the current `scheduler.tell(objective, measures)`.

### Use Cases

There are a couple of use cases that the scheduler must satisfy:

1. QD optimization (i.e., current behavior should not break):
   - Positional arguments: `scheduler.tell(objective, measures)`
   - Keyword arguments: `scheduler.tell(objective=objective, measures=measures)`
   - Mixed (a rather extreme case): `scheduler.tell(objective, measures=measures)`
2. Diversity optimization:
   - Positional arguments: `scheduler.tell(measures)`
   - Keyword arguments: `scheduler.tell(measures=measures)`
3. Diversity optimization but where an objective has been added as an extra field:
   - Positional arguments: `scheduler.tell(measures, objective=objective)`
   - Keyword arguments: `scheduler.tell(measures=measures, objective=objective)`

Ideally, we would also be robust to a case where we need to only have objectives in the future:

4. Objective optimization:
   - Positional arguments: `scheduler.tell(objective)`
   - Keyword arguments: `scheduler.tell(objective=objective)`

### Potential Solutions

1. Change scheduler API to `tell(*args, **kwargs)` and add a `problem_type` to each archive.
   - args and kwargs are interpreted based on the type of the archive. If the problem type is quality_diversity, then the args are interpreted as objectives and measures. If the problem type is diversity_optimization, then the args are interpreted as measures. If the problem type is single_objective, the args are interpreted as objective. kwargs will be treated the same as fields in all cases.
   - Pros: Backwards-compatible and handles all 4 use cases above.
   - Cons: Makes the archives a bit more complex, may be a bit confusing to users because the signature will not be very informative. However, I think we can get away with this with good docstrings and documentation.
2. Allow the current objective argument to be treated as measures, and assign a default argument of None to the current objective and measures -> i.e., `tell(objective=None, measures=None, **fields)`
   - Inspired by how numpy's `integers` can be either `rng.integers(high)` or `rng.integers(low, high)` -> see [here](https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.integers.html)
   - Pros: Minimal changes to current API, and still quite interpretable for users.
   - Cons: On the other hand, it may be confusing to see that objective can be set to measures. This will also fail case 3 above, i.e., `scheduler.tell(measures, objective=objective)` will throw an error because `measures` is actually `objective` due to the positional argument, so `objective` is being passed in twice.
3. **Require the user to pass in `objective=None` when performing diversity optimization and maintain the same `scheduler.tell` API.**
   - Pros: This requires the fewest changes to the API. It also provides a good model for what EvolutionStrategyEmitter and other emitters should do -- all these classes can now implement behavior for `objective=None`. It also does not require modifying the archives to have a `problem_type` or `archive_type` attribute. Furthermore, `objective` can still be provided if that is a field in the archive.
   - Cons: It is a bit verbose to have `scheduler.tell(None, measures)` or `scheduler.tell(objective=None, measures=measures)` but I think users can understand that.
   - Optionally, we can also set a default of `objective=None` and `measures=None` so that one can pass just measures, e.g., `scheduler.tell(measures=...)` or just objectives `scheduler.tell(objective=...)`. However, I think we may not want to add this feature for now as it requires adding default values to a lot of places.
4. Add a new parameter to `scheduler.tell` called `mode` or something similar to indicate diversity optimization is in effect, i.e., `scheduler.tell(measures, mode="diversity")`
   - Pros: This is very explicit and makes it clear that diversity optimization is happening without objectives.
   - Cons: Rather verbose.

### Decision

I believe **Solution 3** is the best solution, since it involves the fewest changes to the API and also contains the changes to the scheduler.

## TODO

- [x] Implement behavior for when `objective=None` in `Scheduler.tell` -- specifically, the scheduler will now allow any field to be None, and when the field is None, the scheduler will simply pass None down to the emitter tell functions.
- [x] Write test for passing `objective=None` to the scheduler, both with positional and keyword arguments. This test will be updated once the UnstructuredArchive is implemented.

## Status

- [x] I have read the guidelines in [CONTRIBUTING.md](https://github.com/icaros-usc/pyribs/blob/master/CONTRIBUTING.md)
- [x] I have formatted my code using `yapf`
- [x] I have tested my code by running `pytest`
- [x] I have linted my code with `pylint`
- [x] I have added a one-line description of my change to the changelog in `HISTORY.md`
- [x] This PR is ready to go
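The trade-off between Solutions 2 and 3 can be seen in a small, self-contained sketch. Nothing here is the pyribs implementation: `tell_solution_2`, `RecordingEmitter`, and `ToyScheduler` are hypothetical stand-ins, and the toy scheduler forwards data to every emitter unchanged instead of splitting it per emitter.

```python
def tell_solution_2(objective=None, measures=None, **fields):
    """Hypothetical Solution 2 signature: one positional argument means measures."""
    if measures is None:
        objective, measures = None, objective
    return objective, measures, fields


class RecordingEmitter:
    """Toy emitter that only records what it was told."""

    def __init__(self):
        self.received = None

    def tell(self, objective, measures, **fields):
        self.received = (objective, measures, fields)


class ToyScheduler:
    """Toy scheduler for Solution 3: None is allowed and forwarded as-is."""

    def __init__(self, emitters):
        self.emitters = emitters

    def tell(self, objective, measures, **fields):
        for emitter in self.emitters:
            emitter.tell(objective, measures, **fields)


measures = [[0.1, 0.2], [0.3, 0.4]]
objective = [1.0, 2.0]

# Use case 3 under Solution 2: the positional `measures` binds to the
# `objective` parameter, so the keyword `objective` collides with it.
try:
    tell_solution_2(measures, objective=objective)
    case_3_error = None
except TypeError as err:
    case_3_error = str(err)

# Solution 3: the caller states explicitly that there is no objective.
emitter = RecordingEmitter()
scheduler = ToyScheduler([emitter])
scheduler.tell(None, measures)  # Positional form.
positional = emitter.received
scheduler.tell(objective=None, measures=measures)  # Keyword form.
keyword = emitter.received
```

With Solution 3, each emitter decides for itself what `objective=None` means, which is why the scheduler in this sketch contains no special-case logic.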
@gresavage I see what you are saying about using the NSLC ranking. I think a good path would be to implement something in For now, I've updated the code to just be a regular Novelty Search archive. I added a TODO to add local competition on issue #474. Would you mind taking a look at the code? Thanks!
@btjanaka AHHHH, rankers, of course! Clever :) I will take a look.
@btjanaka Do you think we need to implement visualization for the archive?
This reverts commit 66f2d44.
Description
Implement the ProximityArchive (also known as unstructured archive or novelty archive) from Novelty Search (Lehman, 2011) http://eplex.cs.ucf.edu/papers/lehman_ecj11.pdf
Resolves #468
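As a rough illustration of the novelty search archive described above, the sketch below scores each candidate by its mean distance to the k nearest archive members in measure space and keeps it when that score reaches a threshold. It is a toy written for this description, not the ProximityArchive implementation; the class name `ToyProximityArchive` and the parameters `k_neighbors` and `novelty_threshold` are assumptions made for the example.

```python
import numpy as np


def novelty_scores(candidates, archive, k):
    """Mean Euclidean distance from each candidate to its k nearest archive members."""
    if len(archive) == 0:
        # With nothing to compare against, every candidate is maximally novel.
        return np.full(len(candidates), np.inf)
    # Pairwise distances, shape (n_candidates, n_archive).
    dists = np.linalg.norm(candidates[:, None, :] - archive[None, :, :], axis=2)
    k = min(k, len(archive))
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]  # k smallest per row.
    return nearest.mean(axis=1)


class ToyProximityArchive:
    """Toy archive that stores only measures (no solutions or objectives)."""

    def __init__(self, measure_dim, k_neighbors, novelty_threshold):
        self.measures = np.empty((0, measure_dim))
        self.k_neighbors = k_neighbors
        self.novelty_threshold = novelty_threshold

    def add(self, measures):
        """Add candidates one at a time; return a boolean mask of what was added."""
        added = []
        for m in np.asarray(measures, dtype=float):
            score = novelty_scores(m[None, :], self.measures, self.k_neighbors)[0]
            is_novel = score >= self.novelty_threshold
            if is_novel:
                self.measures = np.vstack([self.measures, m])
            added.append(bool(is_novel))
        return np.array(added)


archive = ToyProximityArchive(measure_dim=2, k_neighbors=2, novelty_threshold=0.5)
added = archive.add([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
```

Here the second candidate lies only 0.1 away from the first, so it is rejected, while the third is far from everything stored so far and is kept. Processing candidates sequentially is a simplification chosen for the sketch; a batched implementation may behave differently.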
Naming
ProximityArchive is a specific name that reflects how the archive operations are based on how close solutions are to each other in measure space. We also considered NearestNeighborArchive, but that name is a bit long. We find that NoveltyArchive and UnstructuredArchive are rather imprecise terms.
TODO
add_mode="single")
Status
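- I have read the guidelines in CONTRIBUTING.md
- I have formatted my code using yapf
- I have tested my code by running pytest
- I have linted my code with pylint
- I have added a one-line description of my change to the changelog in HISTORY.md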