Below is a comparison with Google's official bindings for a variety of datasets. The performance depends a lot on the specific data format and content, so the results may not be representative for your use case. All tests were run on OpenJDK 17 using JMH on an AMD Ryzen 9 3900X. The protobuf-java benchmarks used version 3.21.12.
The first benchmark was copied from Simple Binary Encoding's (SBE) Car (140 byte) and MarketData (64 byte) throughput benchmarks. It tests manual creation of messages and encodes and decodes them to and from a byte array, which is representative of sending and receiving individual messages over a network.
Protobuf Binary | Size [bytes] | QuickBuffers [msg/s] | Protobuf-Java [msg/s] | Ratio |
---|---|---|---|---|
Car | ||||
Encode | 140 | 3.40M (454 MB/s) | 1.25M (167 MB/s) | 2.7 |
Decode | 140 | 3.36M (449 MB/s) | 1.1M (145 MB/s) | 3.1 |
Market Data (varint) | ||||
Encode | 64 | 12.48M (761 MB/s) | 5.62M (342 MB/s) | 2.2 |
Decode | 64 | 9.20M (561 MB/s) | 2.46M (150 MB/s) | 3.7 |
Note that the throughput is heavily impacted by the chosen data types. Variable length integers are comparatively expensive to work with, so using fixed-width types can often significantly increase throughput at the cost of a larger message size. In the market data case, changing the number encoding from varint to fixed-width types could more than double the decoding throughput. In some cases it may even make sense to use Groups as they are encoded with a start and end tag and can be streamed without requiring computing the size ahead of time.
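As an illustration of that tradeoff, switching from varint to fixed-width encoding is a per-field type change in the schema; the message and field names below are hypothetical:

```proto
syntax = "proto3";

message MarketSample {
  // Varint encoding: 1-10 bytes depending on magnitude. Large values
  // (e.g. nanosecond timestamps) take 9-10 bytes and are parsed byte by byte.
  int64 transact_time_varint = 1;

  // Fixed-width encoding: always 8 bytes on the wire, decoded with a
  // single read instead of a varint loop.
  sfixed64 transact_time_fixed = 2;

  // double is inherently fixed-width (8 bytes), so it is already cheap.
  repeated double prices = 3 [packed = true];
}
```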
Protobuf Binary | Size [bytes] | QuickBuffers [msg/s] | Protobuf-Java [msg/s] | Ratio |
---|---|---|---|---|
Market Data (fixed) | ||||
Encode | 84 | 17.0M (1.3 GB/s) | 7.2M (576 MB/s) | 2.4 |
Decode | 84 | 20.0M (1.6 GB/s) | 4.8M (384 MB/s) | 4.2 |
We also benchmarked the built-in JSON encoding against Protobuf-Java's `JsonFormat` printer. This is an unfair comparison of generated code against a reflection-based approach, so a large speedup is to be expected.
Protobuf JSON | Size [bytes] | QuickBuffers [msg/s] | Protobuf-Java [msg/s] | Ratio |
---|---|---|---|---|
Car Encode | 559 | 1.44M (765 MB/s) | 0.12M (62 MB/s) | 12.3 |
Market Data Encode | 435 | 3.60M ( 1.5 GB/s) | 0.16M (67 MB/s) | 22.2 |
The second benchmark reads and writes streams of delimited protobuf messages with varying contents, which is representative of streaming log files. This benchmark does not trigger lazy parsing of strings, so it is primarily indicative of forwarding use cases. This is a best-case scenario for protobuf-java serialization as it omits all the overhead related to building the objects. All datasets were loaded into memory and decoded from a byte array.
Dataset | Content | Size [bytes/msg] | QuickBuffers [ms/log] | Java [ms/log] | Ratio |
---|---|---|---|---|---|
Read | |||||
1 | sensor feedback | 220 | 118 ms (773 MB/s) | 432 ms (211 MB/s) | 3.7 |
2 | sensor info | 650 | 67 ms (892 MB/s) | 225 ms (266 MB/s) | 3.4 |
3 | car data | 140 | 23 ms (456 MB/s) | 70 ms (150 MB/s) | 3.0 |
4 | market data | 64 | 18 ms (583 MB/s) | 68 ms (154 MB/s) | 3.8 |
5 | packed doubles | 64M | 5.8 ms (11.6 GB/s) | 68 ms ( 1.0 GB/s) | 11.6 |
Write | |||||
1 | sensor feedback | 220 | 89 ms ( 1.0 GB/s) | 137 ms (666 MB/s) | 1.5 |
2 | sensor info | 650 | 49 ms ( 1.2 GB/s) | 75 ms (797 MB/s) | 1.5 |
3 | car data | 140 | 19 ms (552 MB/s) | 23 ms (466 MB/s) | 1.2 |
4 | market data | 64 | 14 ms (749 MB/s) | 20 ms (524 MB/s) | 1.4 |
5 | packed doubles | 64M | 5.6 ms (12.0 GB/s) | 40 ms ( 1.7 GB/s) | 7.1 |
Read + Write | |||||
1 | sensor feedback | 220 | 207 ms (441 MB/s) | 569 ms (160 MB/s) | 2.7 |
2 | sensor info | 650 | 116 ms (515 MB/s) | 300 ms (199 MB/s) | 2.6 |
3 | car data | 140 | 42 ms (250 MB/s) | 93 ms (113 MB/s) | 2.2 |
4 | market data | 64 | 32 ms (328 MB/s) | 88 ms (119 MB/s) | 2.8 |
5 | packed doubles | 64M | 11.4 ms ( 5.9 GB/s) | 108 ms (621 MB/s) | 9.5 |
Dataset contents
- Dataset 1 contains a series of delimited ~220 byte messages containing sensor measurements from production data. Messages consist mostly of scalar data types and a relatively small amount of nesting. No strings, repeated, or unknown fields. Only a small subset of defined fields is populated. (87 MB)
- Dataset 2 contains a series of delimited ~650 byte messages containing sensor data and hardware information from production data. The messages are a superset of dataset 1 with additional strings (mostly small and ascii) and more nesting. No unknown or repeated fields. About half of the defined fields are populated. (57 MB)
- Dataset 3 contains ~147k car messages generated by the SbeBenchmark. (10 MB)
- Dataset 4 contains ~73k market messages generated by the SbeBenchmark. (10 MB)
- Dataset 5 contains a single artificial message with one packed double field (`repeated double values = 1 [packed=true]`). It only encodes a repeated type with fixed size, so it should be representative of the best-case scenario for memory throughput (on little-endian systems this can map to memcpy). (64 MB)
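The memcpy-like best case for dataset 5 can be sketched in plain Java: once the packed field's payload bytes have been located, the doubles can be bulk-copied without per-element parsing. The class and method names here are illustrative, not part of any library:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PackedDoubles {
    // Decodes a packed double payload (wire format: consecutive 8-byte
    // little-endian doubles) into an array with a single bulk transfer.
    static double[] decodePacked(byte[] payload, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(payload, offset, length)
                .order(ByteOrder.LITTLE_ENDIAN);
        double[] values = new double[length / Double.BYTES];
        buf.asDoubleBuffer().get(values); // bulk copy, no per-element parsing
        return values;
    }
}
```

On little-endian hardware the JIT can compile this bulk `get` down to a plain memory copy, which is why this dataset reaches multi-GB/s throughput.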
The benchmark code can be found in the `benchmarks` module. The Write results are derived as `Write = ((Read + Write) - Read)`, which is not necessarily accurate since the individual costs may not compose linearly.
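On the wire, a delimited stream is simply a varint length prefix followed by the message bytes, the same framing that protobuf-java's `writeDelimitedTo`/`parseDelimitedFrom` produce and consume. A minimal sketch of that framing over raw byte arrays (class and method names are illustrative):

```java
import java.io.*;

public class DelimitedStream {
    // Writes a varint length prefix followed by the payload bytes.
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        long len = payload.length;
        while ((len & ~0x7FL) != 0) {
            out.write((int) ((len & 0x7F) | 0x80)); // 7 bits + continuation flag
            len >>>= 7;
        }
        out.write((int) len);
        out.write(payload);
    }

    // Reads one length-prefixed payload, or returns null at end of stream.
    static byte[] readDelimited(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) return null; // clean end of stream
        long len = 0;
        int shift = 0;
        while (true) {
            len |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // no continuation bit: varint complete
            shift += 7;
            b = in.read();
            if (b < 0) throw new EOFException("truncated length prefix");
        }
        byte[] payload = new byte[(int) len];
        int n = 0;
        while (n < payload.length) {
            int r = in.read(payload, n, payload.length - n);
            if (r < 0) throw new EOFException("truncated payload");
            n += r;
        }
        return payload;
    }
}
```

In the benchmark, the payload of each frame is a serialized message produced by the respective library's generated code.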
For the last benchmark we compared QuickBuffers against the Java bindings of Google's FlatBuffers project and ported its official C++ benchmark.
Contrary to the official C++ benchmark, which shows tremendous performance benefits over Protobuf-Lite (itself much slower than the regular version), the Java bindings are not nearly as optimized and actually perform slower for most use cases. Recent JDKs have improved `ByteBuffer` performance, but the remaining overhead is still enough to offset any benefits gained by removing the decoding step.
Moreover, the benchmark was deliberately set up to favor FlatBuffers by structuring the content as a worst case for Protobuf. It uses deep levels of nesting and inappropriate varint types for very large numbers. For example, an `int64` type is chosen for a field whose values always map to the largest varint size (10 bytes) and go through the most expensive parsing path. Choosing a flatter hierarchy with appropriate data types would speed things up considerably.
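The 10-byte worst case follows directly from the varint format: each byte carries 7 payload bits, so a full 64-bit value needs ceil(64/7) = 10 bytes. A standalone sketch of the size computation (protobuf-java exposes the same logic as `CodedOutputStream.computeInt64SizeNoTag`):

```java
public class VarintSize {
    // Number of bytes a value occupies as a protobuf varint
    // (7 payload bits per byte, least significant group first).
    static int varintSize(long value) {
        int size = 1;
        while ((value & ~0x7FL) != 0) {
            size++;
            value >>>= 7; // unsigned shift: negative values need all 10 bytes
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(varintSize(127L));           // 1 byte
        System.out.println(varintSize(Long.MAX_VALUE)); // 9 bytes
        System.out.println(varintSize(-1L));            // 10 bytes: the worst case
    }
}
```

Note that every negative `int64` hits the 10-byte case, which is why `sint64` (zigzag) or `sfixed64` is recommended for such fields.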
That being said, it may be worth exploring for use cases that require random access to small subsets of data.
QuickBuffers (1.0.0/jdk17) | FlatBuffers (2.0.0/jdk17) | FlatBuffers (1.11.0/jdk8) | FlatBuffers (1.10.0/jdk8) | Ratio | |
---|---|---|---|---|---|
DirectByteBuffer | [ns/op] | [ns/op] | [ns/op] | [ns/op] | |
Decode | 185 | 0 | 0 | 0 | 0.0 |
Traverse | 31 | 223 | 234 | 321 | 7.2 |
Decode + Traverse | 216 | 223 | 234 | 321 | 1.0 |
Encode | 264 | 467 | 457 | 649 | 1.8 |
Encode + Decode + Traverse | 480 | 690 | 691 | 970 | 1.4 |
HeapByteBuffer | [ns/op] | [ns/op] | [ns/op] | [ns/op] | |
Decode | 166 | 0 | 0 | 0 | 0.0 |
Traverse | 33 | 211 | 381 | 427 | 6.4 |
Decode + Traverse | 199 | 211 | 381 | 427 | 1.1 |
Encode | 259 | 512 | 626 | 821 | 2.0 |
Encode + Decode + Traverse | 458 | 723 | 1007 | 1248 | 1.6 |
Other | |||||
Serialized Size | 228 bytes | 344 bytes | 344 bytes | 344 bytes | 1.5 |
Transient memory allocated during decode | 0 bytes | 0 bytes | 0 bytes | 0 bytes | 1 |