
Dataloader sampler support #6

Open
Lime-Cakes opened this issue Nov 21, 2022 · 1 comment

Comments


Lime-Cakes commented Nov 21, 2022

Is it possible to use the dataloader with a custom sampler/batch_sampler? At the moment, I cannot find any useful information on using poptorch's dataloader with a custom sampler. Are there plans to support it, or is a custom sampler impossible due to the IPU design?

Edit: At the moment, using a custom batch_sampler results in the following error:

Traceback (most recent call last):
  File "train-ipu.py", line 488, in <module>
    main()
  File "train-ipu.py", line 446, in main
    train_dataloader = poptorch.DataLoader(opts,train_dataset,collate_fn=collate_fn, batch_sampler=box_sampler)
  File "/usr/local/lib/python3.8/dist-packages/poptorch/__init__.py", line 356, in __init__
    super().__init__(dataset,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 251, in __init__
    raise ValueError('batch_sampler option is mutually exclusive '
ValueError: batch_sampler option is mutually exclusive with batch_size, shuffle, sampler, and drop_last
@AnthonyBarbier
Contributor

batch_sampler is currently not supported by poptorch.DataLoader, but you could use one with a stock torch.utils.data.DataLoader. However, you need to make sure that each batch the sampler returns matches the combined batch size expected by the PopTorch model.

Here is how the combined batch size is computed:

            self._combined_batch_size = batch_size * \
                options.device_iterations * \
                options.replication_factor * \
                options.Training.gradient_accumulation

Source: https://github.com/graphcore/poptorch/blob/sdk-release-3.0/python/__init__.py#L278
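For illustration, the sizing requirement can be sketched in plain Python. The helper names below are hypothetical, and the specific option values (device iterations, replication factor, gradient accumulation) are made-up examples, not values from this issue; only the multiplication itself mirrors the PopTorch snippet above.

```python
# Sketch: how a batch_sampler fed to a stock torch.utils.data.DataLoader
# would have to size its batches to match PopTorch's combined batch size.

def combined_batch_size(batch_size, device_iterations,
                        replication_factor, gradient_accumulation):
    """Mirror of the combined-batch-size formula quoted above."""
    return (batch_size * device_iterations *
            replication_factor * gradient_accumulation)

def fixed_size_batches(indices, size):
    """Yield index lists of exactly `size` elements, dropping any short
    remainder, the way a custom batch_sampler would need to."""
    for start in range(0, len(indices) - size + 1, size):
        yield indices[start:start + size]

# Assumed example: micro-batch 2, 4 device iterations, 2 replicas,
# 8 gradient-accumulation steps -> every batch must hold 2*4*2*8 = 128 indices.
size = combined_batch_size(2, 4, 2, 8)
batches = list(fixed_size_batches(list(range(300)), size))
print(size, [len(b) for b in batches])  # 128 [128, 128]
```

A real batch sampler could group indices however it likes (e.g. by sequence length), as long as every list it yields has exactly this combined length, since the stock DataLoader passes each list straight to collate_fn.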
