Extend usability of calculate_offload_device_map #768

Draft

kylesayrs wants to merge 13 commits into main

Conversation

@kylesayrs (Collaborator) commented Oct 2, 2024

Purpose

  • Allow calculate_offload_device_map to be used in environments with non-homogeneous and/or non-sequential GPUs

Changes

  • Default to using all GPUs if num_gpus is not specified
  • Add a gpu_ids argument to allow users to choose which devices to use
  • Fix GPU memory calculation in non-homogeneous GPU setups (see the sketch after this list)
    • Previously, the calculation assumed that every GPU had the same memory capacity as the first GPU
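
For illustration, a minimal sketch of how per-device memory might be queried (the helper name and exact form here are assumptions for illustration, not this PR's code):

import torch

def get_gpu_memory(gpu_ids=None):
    # Query each device's capacity individually instead of assuming
    # every GPU matches the first one
    if gpu_ids is None:
        gpu_ids = range(torch.cuda.device_count())  # default: all GPUs
    return {
        gpu_id: torch.cuda.get_device_properties(gpu_id).total_memory
        for gpu_id in gpu_ids
    }

max_memory = get_gpu_memory(gpu_ids=[1, 2])  # non-sequential subset of devices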

Testing

  • Grepped for all uses of the function and confirmed that the changes are compatible
  • TODO: Test with a simulated non-homogeneous GPU setup

@kylesayrs kylesayrs changed the title calculate_offload_device_map default to all GPUs [WIP] Increase usability of calculate_offload_device_map Oct 2, 2024
@kylesayrs kylesayrs changed the title [WIP] Increase usability of calculate_offload_device_map [WIP] Extend usability of calculate_offload_device_map Oct 2, 2024
@vllm-project vllm-project deleted a comment from github-actions bot Oct 2, 2024
@kylesayrs kylesayrs changed the title [WIP] Extend usability of calculate_offload_device_map Extend usability of calculate_offload_device_map Oct 4, 2024
@kylesayrs kylesayrs self-assigned this Oct 17, 2024
@kylesayrs kylesayrs removed their assignment Nov 15, 2024
@kylesayrs (Collaborator, Author) commented Feb 4, 2025

In hindsight, I think I'd prefer to give users helper functions which they can use to compute their own device maps. For example:

# infer_auto_device_map comes from accelerate; the llmcompressor helpers below
# are proposed and do not exist yet
from accelerate import infer_auto_device_map
from llmcompressor import hessian_memory_requirements, quantization_memory_requirements, batch_memory_requirements

model_skeleton = load_model_skeleton(model_stub)  # proposed helper: loads a meta-device model
reserved_memory = (
    hessian_memory_requirements(model_skeleton)
    + quantization_memory_requirements(model_skeleton)
    + batch_memory_requirements((bs, seq_len), attention_mask=False)
    # + whatever junk or padding the user thinks is relevant
)
device_map = infer_auto_device_map(
    model_skeleton,
    max_memory=get_max_memory(reserved_memory, gpu_ids=[1, 2]),  # proposed helper
    no_split_module_classes=model_skeleton._no_split_modules,
)

# or, as a single convenience call:
device_map = get_uniform_device_map(model_skeleton, reserved_memory, gpu_ids=[1, 2])

I believe this is a preferable user experience to hiding too many things behind a function API that the user has to learn.

@dsikka (Collaborator) left a comment

This is somewhat difficult to review at the moment.

Can you summarize what each of the helper functions you're suggesting does, and how the interface is expected to change before and after?

Generally speaking, having the helper functions is nice, but we should maintain a higher-level API that most users can just use and that does the necessary memory calculations for them. For most people right now that would not include batching memory, but we could always expand to include it.
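
For concreteness, such a higher-level wrapper might compose the proposed helpers roughly like this (every name below is hypothetical, taken from the discussion above rather than from existing code):

def calculate_offload_device_map(model_stub, gpu_ids=None):
    # Hypothetical convenience wrapper: does the memory accounting for the
    # user (omitting batching memory for now, though it could be added later)
    model_skeleton = load_model_skeleton(model_stub)
    reserved_memory = (
        hessian_memory_requirements(model_skeleton)
        + quantization_memory_requirements(model_skeleton)
    )
    return get_uniform_device_map(model_skeleton, reserved_memory, gpu_ids=gpu_ids)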
