`pipeline` AttributeError with `torch.nn.DataParallel` #35747

kerem-coemert · 2025-01-17T08:36:16Z

System Info

transformers version: 4.48.0
Platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Python version: 3.11.5
Huggingface_hub version: 0.27.1
Safetensors version: 0.4.3
Accelerate version: 0.33.0
Accelerate config: not found
PyTorch version (GPU?): 2.1.1+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: yes
Using GPU in script?: yes
GPU type: NVIDIA RTX A6000

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Hello,

I am fine-tuning a `BertForSequenceClassification` model and would then like to test it with a `pipeline`.
Since I have multiple GPUs, I wrap the model in `torch.nn.DataParallel` as follows:

self.model = torch.nn.DataParallel(
    module=BertForSequenceClassification.from_pretrained(
        pretrained_model_name_or_path=self.config.embedding_model_file.model_name,
        cache_dir=Path(self.config.embedding_model_file.cache_dir),
        num_labels=len(self.datasets.train.unique_classes),
        id2label={
            idx: label
            for idx, label in enumerate(self.datasets.train.unique_classes)
        },
        label2id={
            label: idx
            for idx, label in enumerate(self.datasets.train.unique_classes)
        },
        torch_dtype=self.config.training_params.torch_dtype,
    ).to(self.device)
)

and then try to use it for inference via:

pipeline(
    task="text-classification",
    model=self.model,
    tokenizer=self.datasets.test.tokenizer,
    device=self.device,
    top_k=self.config.training_params.top_k,
    torch_dtype=self.config.training_params.torch_dtype,
)

This worked when I passed the plain `BertForSequenceClassification` instance, but with the `DataParallel` wrapper around it I get:

```
  File "/home/xx/miniconda3/envs/xxx/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 950, in pipeline
    model_config = model.config
                   ^^^^^^^^^^^^
  File "/home/xx/miniconda3/envs/xxx/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DataParallel' object has no attribute 'config'
```

What is the recommended approach in this case? Do I have to unwrap the model from `DataParallel` for inference?
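
For context, `torch.nn.DataParallel` keeps the wrapped model in its `.module` attribute and does not forward other attributes such as `config`, which matches the `AttributeError` in the traceback. One possible workaround is to unwrap the model before handing it to `pipeline`. The following is a minimal sketch, not a confirmed recommendation from the maintainers: `unwrap_model` is a hypothetical helper written for illustration, and it assumes the underlying model has no attribute of its own named `module`.

```python
def unwrap_model(model):
    """Return the innermost model by following `.module` links.

    Wrappers such as torch.nn.DataParallel store the wrapped model under
    `.module`; a model without that attribute is returned unchanged.
    Assumption: the underlying model defines no attribute named `module`.
    """
    while hasattr(model, "module"):
        model = model.module
    return model
```

In the reproduction above, this would mean passing `model=unwrap_model(self.model)` to `pipeline(...)`. Inference would then run on the single device given by `device` rather than being split across GPUs by `DataParallel`.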

### Expected behavior

The `pipeline` call should not raise an exception.

kerem-coemert added the bug label Jan 17, 2025
