
Adding a total_iodepth option that can be used with librbdfio #324

Open
harriscr wants to merge 3 commits into master from ch_wip_total_iodepth
Conversation

@harriscr (Contributor) commented Jan 30, 2025

Description

Add an option to the CBT configuration YAML file called total_iodepth. The total_iodepth is split evenly among the number of volumes used for the test. If the total_iodepth for a run does not divide evenly by the number of volumes, the remainder is assigned 1 iodepth at a time to the volumes, starting at volume 0.
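As a hedged sketch of how the new option might appear in a CBT configuration (the surrounding keys are illustrative, based on typical librbdfio configurations rather than this PR's diff; like iodepth, total_iodepth can also take an array of values):

```yaml
benchmarks:
  librbdfio:
    volumes_per_client: 5
    # New option: total iodepth shared across all volumes,
    # instead of a fixed per-volume iodepth.
    total_iodepth: [18]
    mode: ['randwrite']
    op_size: [4096]
```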

For a simple example, with a total_iodepth of 18 and volumes_per_client of 5, the following iodepth allocations would occur:

| volume | iodepth |
|--------|---------|
| 0      | 4       |
| 1      | 4       |
| 2      | 4       |
| 3      | 3       |
| 4      | 3       |

If more volumes are specified than there is iodepth for 1 per volume, then the number of volumes for that test is reduced so that each remaining volume gets an iodepth of 1.

Example:
For volumes_per_client=5 and total_iodepth=4, the benchmark would run with 4 volumes, each with an iodepth of 1.
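A minimal sketch of the allocation logic described above (split_total_iodepth is a hypothetical name used for illustration only; the PR implements this inside the librbdfio benchmark):

```python
def split_total_iodepth(total_iodepth: int, volumes_per_client: int) -> list[int]:
    # If there is not enough iodepth for 1 per volume, reduce the
    # number of volumes so each remaining volume gets at least 1.
    volumes = min(volumes_per_client, total_iodepth)
    base, remainder = divmod(total_iodepth, volumes)
    # Volumes 0..remainder-1 each take one extra iodepth.
    return [base + 1 if index < remainder else base for index in range(volumes)]

print(split_total_iodepth(18, 5))  # [4, 4, 4, 3, 3]
print(split_total_iodepth(4, 5))   # [1, 1, 1, 1]
print(split_total_iodepth(32, 3))  # [11, 11, 10]
```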

Testing

Manual testing was done for a number of scenarios, with and without using workloads. The output below is from debug statements added for testing purposes.

Regression testing: iodepth=32, volumes_per_client=3

CHDEBUG: fio_cmd for volume 0 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=32 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.0 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.0

CHDEBUG: fio_cmd for volume 1 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-1 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=32 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.1 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.1 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.1 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.1

CHDEBUG: fio_cmd for volume 2 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-2 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=32 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.2 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.2 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.2 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-032/randwrite/output.2

total_iodepth=32, volumes_per_client=3

CHDEBUG: fio_cmd for volume 0 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=11 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.0 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.0

CHDEBUG: fio_cmd for volume 1 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-1 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=11 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.1 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.1 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.1 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.1

CHDEBUG: fio_cmd for volume 2 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-2 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=10 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.2 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.2 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.2 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-003/iodepth-016/randwrite/output.2

total_iodepth=2, volumes_per_client=3

11:13:28 - WARNING - cbt - The total iodepth requested: 2 is less than 1 per volume.
11:13:28 - WARNING - cbt - Number of volumes per client will be reduced from 3 to 2

CHDEBUG: fio_cmd for volume 0 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=1 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.0 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.0

CHDEBUG: fio_cmd for volume 1 is /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=cbt-librbdfio --rbdname=cbt-librbdfio-hostname -f-1 --invalidate=0 --rw=randwrite --output-format=json,normal --runtime=300 --numjobs=1 --direct=1 --bs=4096B --iodepth=1 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.1 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.1 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.1 --log_avg_msec=101 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/osd_ra-00004096/op_size-00004096/concurrent_procs-002/iodepth-016/randwrite/output.1

I also tested with an array of total_iodepth values, but the output is too long to include here, so instead I am attaching the file containing the results:
total_iodepth_testing.txt

I'll update with teuthology results once they have run.

@harriscr self-assigned this Jan 30, 2025
@harriscr (Contributor, Author) commented Jan 31, 2025

Python unit tests:

============================= slowest 5 durations ==============================
0.01s setup tests/test_bm_kvmrbdfio.py::TestBenchmarkkvmrbdfio::test_valid_archive_dir
0.01s setup tests/test_bm_nullbench.py::TestBenchmarknullbench::test_valid_archive_dir
0.01s setup tests/test_bm_rawfio.py::TestBenchmarkrawfio::test_valid_archive_dir
0.01s setup tests/test_bm_radosbench.py::TestBenchmarkradosbench::test_valid_archive_dir
0.01s setup tests/test_bm_fio.py::TestBenchmarkfio::test_valid_archive_dir
======================== 326 passed, 3 skipped in 0.47s ========================
Finished running tests!

I haven't run black, ruff, or mypy against the file to check for PEP 8 compliance, as the unchanged file currently gives too many errors.

@harriscr (Contributor, Author) commented Feb 6, 2025

Testing with workloads:

precondition: iodepth=[2]

13:45:31 - INFO - cbt - Running rbd fio precondition test, mode randwrite
13:45:31 - WARNING - cbt - CHDEBUG: Using iodepth

13:45:31 - INFO - cbt - CHDEBUG: fio command for iodepth 2 and vol 0 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=randwrite --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=65536B --iodepth=2 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.0 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.0

13:45:31 - INFO - cbt - CHDEBUG: fio command for iodepth 2 and vol 1 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-1 --invalidate=0 --rw=randwrite --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=65536B --iodepth=2 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.1 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.1 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.1 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.1

13:45:31 - INFO - cbt - CHDEBUG: fio command for iodepth 2 and vol 7 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-7 --invalidate=0 --rw=randwrite --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=65536B --iodepth=2 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.7 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.7 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.7 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/randwrite_65536/iodepth-002/numjobs-001/output.7

The precondition runs on all 8 volumes with an iodepth of 2, as expected.

total_iodepth=16, volumes=8

13:45:31 - INFO - cbt - Running rbd fio seq32kwrite test, mode write
13:45:31 - WARNING - cbt - CHDEBUG: Using total_iodepth

13:45:31 - INFO - cbt - CHDEBUG: fio command for iodepth 16 and vol 0 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=write --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=32768B --iodepth=2 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.0 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.0

13:45:31 - INFO - cbt - CHDEBUG: fio command for iodepth 16 and vol 1 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-1 --invalidate=0 --rw=write --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=32768B --iodepth=2 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.1 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.1 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.1 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-016/numjobs-001/output.1

total_iodepth=7, volumes=8

13:45:32 - WARNING - cbt - The total iodepth requested: 7 is less than 1 per volume (8)
13:45:32 - WARNING - cbt - Number of volumes per client will be reduced from 8 to 7

13:45:32 - INFO - cbt - CHDEBUG: fio command for iodepth 7 and vol 0 is
sudo /usr/local/bin/fio --ioengine=rbd --clientname=admin --pool=rbd_replicated --rbdname=cbt-librbdfio-hostname -f-0 --invalidate=0 --rw=write --output-format=json --runtime=60 --time_based --ramp_time=30 --numjobs=1 --direct=1 --bs=32768B --iodepth=1 --end_fsync=0 --norandommap --write_iops_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-007/numjobs-001/output.0 --write_bw_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-007/numjobs-001/output.0 --write_lat_log=/tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-007/numjobs-001/output.0 --log_avg_msec=100 --name=cbt-librbdfio-hostname -f-file-0 > /tmp/cbt/00000000/LibrbdFio/write_32768/iodepth-007/numjobs-001/output.0

@harriscr force-pushed the ch_wip_total_iodepth branch from 3ce5687 to 1510b6b on February 6, 2025
@harriscr force-pushed the ch_wip_total_iodepth branch from 1510b6b to dc1a795 on February 7, 2025
@harriscr marked this pull request as ready for review on February 11, 2025
@lee-j-sanders previously approved these changes on Feb 11, 2025 and left a comment:


I've tested this change out and it is doing what we expect.
Replace iodepth with total_iodepth in the workloads section of the YAML and it just works :)

Functionally this works great; I'm not sure whether we need to add unit tests for total_iodepth, though.

@lee-j-sanders commented:
I'm wondering if there should be a test added to test_bm_librbdfio.py to cover:

  • test_valid_total_iodepth()

and I'm wondering why the existing tests didn't fail with the new total_iodepth?
Maybe the unit tests don't exercise the workloads section enough?
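For illustration, a hedged sketch of such a test, written against the hypothetical split_total_iodepth helper sketched in the description (the real test would exercise LibrbdFio's actual allocation code and import it from wherever the PR placed it):

```python
import pytest

# Hypothetical import path; adjust to wherever the allocation logic lives.
from benchmark.librbdfio import split_total_iodepth


@pytest.mark.parametrize(
    "total_iodepth, volumes_per_client, expected",
    [
        (18, 5, [4, 4, 4, 3, 3]),  # uneven split, remainder assigned from volume 0
        (32, 3, [11, 11, 10]),     # matches the manual test output above
        (4, 5, [1, 1, 1, 1]),      # volume count reduced so each volume gets iodepth 1
    ],
)
def test_valid_total_iodepth(total_iodepth, volumes_per_client, expected):
    assert split_total_iodepth(total_iodepth, volumes_per_client) == expected
```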
