From e3feacf74170b1bb4cb36cf5b0b1db72b83a31f2 Mon Sep 17 00:00:00 2001 From: Abseil Team Date: Fri, 22 Mar 2024 15:37:18 -0400 Subject: [PATCH] Project import generated by Copybara. PiperOrigin-RevId: 618258885 Change-Id: I8782cc9206b935ab84bc65f7adaf39b5641a95a3 --- _posts/2018-02-22-totw-93.md | 4 +- _posts/2018-05-03-totw-148.md | 2 +- _posts/2018-05-03-totw-149.md | 6 +- _posts/2018-09-28-totw-144.md | 6 + _posts/2019-10-01-totw-180.md | 2 +- _posts/2019-12-12-totw-146.md | 45 ++++- _posts/2019-12-12-totw-166.md | 12 +- _posts/2019-12-19-totw-108.md | 26 +-- _posts/2020-04-06-totw-163.md | 15 +- _posts/2020-04-06-totw-172.md | 4 +- _posts/2020-04-06-totw-173.md | 2 +- _posts/2020-04-06-totw-175.md | 10 +- _posts/2020-04-06-totw-176.md | 2 +- _posts/2020-04-06-totw-177.md | 2 +- _posts/2020-09-01-totw-140.md | 92 +++++----- _posts/2020-09-11-totw-76.md | 2 +- _posts/2020-11-11-totw-186.md | 3 + _posts/2022-11-16-totw-18.md | 2 +- _posts/2022-11-16-totw-215.md | 2 +- _posts/2022-11-16-totw-3.md | 3 + _posts/2023-01-19-totw-218.md | 57 +++--- _posts/2024-03-21-totw-224.md | 190 ++++++++++++++++++++ _posts/2024-03-21-totw-227.md | 126 +++++++++++++ _posts/2024-03-21-totw-229.md | 330 ++++++++++++++++++++++++++++++++++ 24 files changed, 814 insertions(+), 131 deletions(-) create mode 100644 _posts/2024-03-21-totw-224.md create mode 100644 _posts/2024-03-21-totw-227.md create mode 100644 _posts/2024-03-21-totw-229.md diff --git a/_posts/2018-02-22-totw-93.md b/_posts/2018-02-22-totw-93.md index 6089269e..e2a3ee1f 100644 --- a/_posts/2018-02-22-totw-93.md +++ b/_posts/2018-02-22-totw-93.md @@ -42,8 +42,8 @@ Some of the benefits of using `Span` as a function parameter are similar to those of using `string_view`. The caller can pass a slice of the original vector, or pass a plain array. It is -also compatible with other array-like containers, like `InlinedVector`, -`FixedArray`, `google::protobuf::RepeatedField`, etc. 
+also compatible with other array-like containers, like `absl::InlinedVector`, +`absl::FixedArray`, `google::protobuf::RepeatedField`, etc. As with `string_view`, it is usually better to pass `Span` by value when used as a function parameter - this form is slightly faster, and produces smaller code. diff --git a/_posts/2018-05-03-totw-148.md b/_posts/2018-05-03-totw-148.md index c80fcdf6..a372fdac 100644 --- a/_posts/2018-05-03-totw-148.md +++ b/_posts/2018-05-03-totw-148.md @@ -10,7 +10,7 @@ order: "148" Originally posted as TotW #148 on May 3, 2018 -*By [Titus Winters](mailto:titus@google.com)* +*By [Titus Winters](mailto:titus@cs.ucr.edu)* Updated 2020-04-06 diff --git a/_posts/2018-05-03-totw-149.md b/_posts/2018-05-03-totw-149.md index 0168d908..393af6e7 100644 --- a/_posts/2018-05-03-totw-149.md +++ b/_posts/2018-05-03-totw-149.md @@ -10,7 +10,7 @@ order: "149" Originally posted as TotW #149 on May 3, 2018 -*By [Titus Winters](mailto:titus@google.com)* +*By [Titus Winters](mailto:titus@cs.ucr.edu)* Updated 2020-04-06 @@ -188,8 +188,8 @@ expressed in those types, not in the APIs that operate on them. ### Ref-qualification As a side note: it's possible to apply the same reasoning for ref-qualifiers on -destructive accessors. Consider a class like `std::stringbuf` - in C++20 it will -gain an accessor to consume the contained string, presented as an overload set +destructive accessors. Consider a class like `std::stringbuf` - in C++20 it +gained an accessor to consume the contained string, presented as an overload set with the existing accessor:
diff --git a/_posts/2018-09-28-totw-144.md b/_posts/2018-09-28-totw-144.md
index 2056ea40..eb60b116 100644
--- a/_posts/2018-09-28-totw-144.md
+++ b/_posts/2018-09-28-totw-144.md
@@ -134,5 +134,11 @@ types require explicit opt-in from the user.
 The [B-Tree][btree] containers (`absl::btree_{set,map,multiset,multimap}`) also
 support heterogeneous lookup.
 
+[Protocol Buffers'][protobuf] associative map implementation,
+`google::protobuf::Map`, supports heterogeneous lookup when the map is keyed
+with `std::string` using string-like keys (any type that is convertible to
+`absl::string_view`).
+
 [swisstables]: https://abseil.io/docs/cpp/guides/container
 [btree]: https://abseil.io/docs/cpp/guides/container
+[protobuf]: https://protobuf.dev/
diff --git a/_posts/2019-10-01-totw-180.md b/_posts/2019-10-01-totw-180.md
index dd5dcf8e..4bcec7cc 100644
--- a/_posts/2019-10-01-totw-180.md
+++ b/_posts/2019-10-01-totw-180.md
@@ -10,7 +10,7 @@ order: "180"
 
 Originally posted as TotW #180 on June 11, 2020
 
-*By [Titus Winters](mailto:titus@google.com)*
+*By [Titus Winters](mailto:titus@cs.ucr.edu)*
 
 Updated 2020-06-11
 
diff --git a/_posts/2019-12-12-totw-146.md b/_posts/2019-12-12-totw-146.md
index 6aa05439..6e6d8af8 100644
--- a/_posts/2019-12-12-totw-146.md
+++ b/_posts/2019-12-12-totw-146.md
@@ -35,9 +35,8 @@ uninitialized is not trivial.
 The first thing to understand is if the type under construction is scalar,
 aggregate, or some other type. A *scalar* type can be thought of as a simple
 type: an integral or floating point arithmetic object; a pointer; an enum; a
-pointer-to-member; `nullptr_t`. An *aggregate* type is an array or trivial class
-(one with only public, non-static data members, no user-provided constructors,
-no base classes, and no virtual member functions).
+pointer-to-member; `nullptr_t`. An *aggregate* type is an array or a class with
+nothing virtual, no non-public fields or bases, and no constructor declarations.
 
 Another factor affecting whether an instance has been initialized to a value
 that is safe to read is whether it has an explicit *initializer*. That is, the
@@ -81,6 +80,33 @@ constructor, so `default_foo.v` is also zero-initialized and is safe to read.
 Note that `Foo::s` has a user-provided constructor, so it is value-initialized
 in either case, and safe to read.
 
+## User-Declared vs User-Provided Constructors
+
+It is possible for the user to *declare* a constructor while asking the compiler
+to *provide* the definition via `=default`. For example:
+
+
+struct Foo {
+  Foo() = default; // "User-declared", NOT "user-provided".
+
+  int v;
+};
+
+int main() {
+  Foo default_foo;
+  Foo value_foo = {};
+}
+
+ +In this case, `Foo` defines a *user-declared*, but not *user-provided*, +constructor. While this type will not be an aggregate, members will be +initialized as if for an aggregate. This means that `default_foo.v` will be +uninitialized, while `value_foo.v` will be *zero-initialized*. Note that +"user-declared" only applies to a default constructor which is defaulted (`= +default`) *at its point of declaration*. A defaulted out-of-line +*implementation* (`Foo::Foo() = default`) is considered user-provided and will +behave equivalently to a definition of `Foo::Foo() {}`. + ### Uninitialized Members in User-Provided Constructors
@@ -195,13 +221,14 @@ Many developers would reasonably assume that this may affect code generation
 quality, but otherwise is a style preference. As you might have guessed, because
 I'm asking, this is not the case.
 
-The reason goes back to the first section above on User-provided Constructors.
+The reason goes back to the section above on
+[User-Declared vs User-Provided Constructors](#user-declared-vs-user-provided-constructors).
 As the constructor for `Foo` is defaulted on declaration, it is not
-user-provided. This means that `Foo` is an aggregate type, and `f.v` is
-zero-initialized. However, `Bar` has a user-provided constructor, albeit created
-by the compiler as a defaulted constructor. As this constructor does not
-explicitly initialize `Bar::v`, `b.v` will be default-initialized and unsafe to
-read.
+user-provided (but it *is* user-declared). This means that while `Foo` is not an
+aggregate type, `f.v` is still zero-initialized. However, `Bar` has a
+user-provided constructor, albeit created by the compiler as a defaulted
+constructor. As this constructor does not explicitly initialize `Bar::v`, `b.v`
+will be default-initialized and unsafe to read.
 
 ## Recommendations
 
diff --git a/_posts/2019-12-12-totw-166.md b/_posts/2019-12-12-totw-166.md
index d3a216ce..c50aab06 100644
--- a/_posts/2019-12-12-totw-166.md
+++ b/_posts/2019-12-12-totw-166.md
@@ -17,10 +17,10 @@ Updated 2020-04-06
 Quicklink: [abseil.io/tips/166](https://abseil.io/tips/166)
 
 
-*Entia non sunt multiplicanda praeter necessitatem." ("Entities should not be
+*"Entia non sunt multiplicanda praeter necessitatem." ("Entities should not be
 multiplied without necessity") -- William of Ockham*
 
-*If you don't know where you're going, you're probably going wrong.” -- Terry
+*"If you don't know where you're going, you're probably going wrong." -- Terry
 Pratchett*
 
 ## Overview
@@ -60,10 +60,10 @@ In practice, however, the object was always constructed "in place" in the
 variable `thing`, with no moves being performed, and the C++ language rules
 permitted these move operations to be "elided" to facilitate this optimization.
 
-In C++17, this code is guaranteed to perform zero copies or moves. In fact, the
-above code is valid even if `BigExpensiveThing` is not moveable. The constructor
-call in `BigExpensiveThing::Make` directly constructs the local variable `thing`
-in `UseTheThing`.
+Since C++17, this code is guaranteed to perform zero copies or moves. In fact,
+the above code is valid even if `BigExpensiveThing` is not moveable. The
+constructor call in `BigExpensiveThing::Make` directly constructs the local
+variable `thing` in `UseTheThing`.
 
 So what's going on?
 
diff --git a/_posts/2019-12-19-totw-108.md b/_posts/2019-12-19-totw-108.md
index 5ac1bf42..1ce1ae76 100644
--- a/_posts/2019-12-19-totw-108.md
+++ b/_posts/2019-12-19-totw-108.md
@@ -51,17 +51,17 @@ std::bind(&MyClass::OnDone, this, std::placeholders::_1)
 
Ugh, that's ugly. Is there a better way? Why yes, use -[`absl::bind_front()`](https://github.com/abseil/abseil-cpp/blob/master/absl/functional/bind_front.h) -instead. +[`std::bind_front()`](https://en.cppreference.com/w/cpp/utility/functional/bind_front) +instead. (For code that cannot yet use C++20, there's `absl::bind_front`.)
-absl::bind_front(&MyClass::OnDone, this)
+std::bind_front(&MyClass::OnDone, this)
 
Remember partial function application -- the thing that `std::bind()` *does not -do*? Well, `absl::bind_front()` does exactly that: it binds the first N -arguments and perfect-forwards the rest: `absl::bind_front(F, a, b)(x, y)` -evaluates to `F(a, b, x, y)`. +do*? Well, `std::bind_front()` does exactly that: it binds the first N arguments +and perfect-forwards the rest: `std::bind_front(F, a, b)(x, y)` evaluates to +`F(a, b, x, y)`. Ahhh, sanity is restored. Want to see something truly terrifying now? What does this code do? @@ -107,7 +107,7 @@ void ProcessAsync(std::unique_ptr<Request> req) { Good old passing `std::unique_ptr` across async boundaries. Needless to say, `std::bind()` isn't a solution -- the code doesn't compile, because `std::bind()` doesn't move the bound move-only argument to the target function. -Simply replacing `std::bind()` with `absl::bind_front()` fixes it. +Simply replacing `std::bind()` with `std::bind_front()` fixes it. The next example regularly trips even C++ experts. See if you can find the problem. @@ -139,7 +139,7 @@ which case it evaluates to `F(arg())`. If `arg` is converted to **Applying std::bind() to a type you don't control is always a bug**. `DoStuffAsync()` shouldn't apply `std::bind()` to the template argument. Either -`absl::bind_front()` or lambda would work fine. +`std::bind_front()` or lambda would work fine. The author of `DoStuffAsync()` might even have entirely green tests because they always pass a lambda or `std::function` as the argument but never the result of @@ -167,7 +167,7 @@ vs
**Calls to std::bind() that perform partial application are better off as -absl::bind_front().** The more placeholders you have, the more obvious it gets. +std::bind_front().** The more placeholders you have, the more obvious it gets.
 std::bind(&MyClass::OnDone, this, std::placeholders::_1)
@@ -176,11 +176,11 @@ std::bind(&MyClass::OnDone, this, std::placeholders::_1)
 vs
 
 
-absl::bind_front(&MyClass::OnDone, this)
+std::bind_front(&MyClass::OnDone, this)
 
-(Whether to use `absl::bind_front()` or a lambda when performing partial -function application is a judgement call; use your discretion.) +(Whether to use `std::bind_front()` or a lambda when performing partial function +application is a judgement call; use your discretion.) This covers 99% of all calls `std::bind()`. The remaining calls do something fancy: @@ -197,7 +197,7 @@ of a few characters or lines of code are worth it. ### Conclusion -Avoid `std::bind`. Use a lambda or `absl::bind_front` instead. +Avoid `std::bind`. Use a lambda or `std::bind_front` instead. ### Further Reading diff --git a/_posts/2020-04-06-totw-163.md b/_posts/2020-04-06-totw-163.md index ce2e6ffb..0a3944d6 100644 --- a/_posts/2020-04-06-totw-163.md +++ b/_posts/2020-04-06-totw-163.md @@ -63,10 +63,11 @@ void MyFunc(std::optional<Foo> foo); Otherwise, skip the `std::optional` altogether. -You can pass it by `const Foo*` and let `nullptr` indicate "does not exist." +You can pass it by `absl::Nullable` and let `nullptr` indicate "does +not exist."
-void MyFunc(const Foo* foo);
+void MyFunc(absl::Nullable<const Foo*> foo);
 
This will be just as efficient as passing by `const Foo&`, but supports null @@ -80,10 +81,10 @@ class members and function return values often work well with `std::optional`. ### Exception -If you expect all callers of your function to already have an object inside of -an `std::optional`, then you may take a `const std::optional&`. However, this is -rare; it usually only occurs if your function is private within your own -file/library. +If you expect all callers of your function to already have a +`std::optional` and never pass in a `Foo`, then you may take a `const +std::optional&`. However, this is rare; it usually only occurs if your +function is private within your own file/library. ### What about std::reference_wrapper? @@ -102,4 +103,4 @@ However, we don't recommend this: the standard library special case it in ways that make it act differently from a normal value or reference. * `std::optional>` is cumbersome and - verbose, compared to simply `const Foo*`. + verbose, compared to `absl::Nullable`. diff --git a/_posts/2020-04-06-totw-172.md b/_posts/2020-04-06-totw-172.md index 3d6966f6..04800f35 100644 --- a/_posts/2020-04-06-totw-172.md +++ b/_posts/2020-04-06-totw-172.md @@ -18,8 +18,8 @@ Quicklink: [abseil.io/tips/172](https://abseil.io/tips/172) [Designated initializers](https://en.cppreference.com/w/cpp/language/aggregate_initialization#Designated_initializers) -are a syntax in the draft C++20 standard for specifying the contents of a struct -in a compact yet readable and maintainable manner. Instead of the repetitive +are a syntax in the C++20 standard for specifying the contents of a struct in a +compact yet readable and maintainable manner. Instead of the repetitive
 struct Point {
diff --git a/_posts/2020-04-06-totw-173.md b/_posts/2020-04-06-totw-173.md
index 13c528cb..5be94de5 100644
--- a/_posts/2020-04-06-totw-173.md
+++ b/_posts/2020-04-06-totw-173.md
@@ -280,7 +280,7 @@ class DoublePrinter {
 But then if you need to allow the option struct to be skipped entirely, such as
 when it's being added to an existing class, and the nested struct has a default
 member initializer (the `= 8` after the field name `precision`, for example),
-you cannot use have a
+you cannot have a
 [default argument](https://google.github.io/styleguide/cppguide.html#Default_Arguments)
 whose value leaves the field implicit.
 
diff --git a/_posts/2020-04-06-totw-175.md b/_posts/2020-04-06-totw-175.md
index d91f4311..3333098a 100644
--- a/_posts/2020-04-06-totw-175.md
+++ b/_posts/2020-04-06-totw-175.md
@@ -63,8 +63,14 @@ hex digits are present).
 Hex floating point literals are indicated by writing a `p` (or `P`) to separate
 the significand from the exponent—where decimal floating point literals would
 use `e` (or `E`). For example, `0x2Ap12` is another way to write the value `0x2A
-<< 12`, i.e., 0x2A000. The exponent is always written in decimal, denotes a
-power of 2, and may be negative: `0x1p-10` is (exactly) `1.0/1024`.
+<< 12`, i.e., 0x2A000, except that it is a floating point value, not an integer. As
+a result, our style guide
+[requires](https://google.github.io/styleguide/cppguide.html#Floating_Literals)
+it to be written as `0x2A.0p12` to be explicit that it is a floating point value
+and not just another way to write an integer.
+
+The exponent is always written in decimal, denotes a power of 2, and may be
+negative: `0x1p-10` is (exactly) `1.0/1024`.
 
 ## Recommendations
 
diff --git a/_posts/2020-04-06-totw-176.md b/_posts/2020-04-06-totw-176.md
index 4455e629..72cddcfb 100644
--- a/_posts/2020-04-06-totw-176.md
+++ b/_posts/2020-04-06-totw-176.md
@@ -121,7 +121,7 @@ composable; that is, it can easily be used as part of a wider expression, e.g.
     consistent with the
     [style guide](https://google.github.io/styleguide/cppguide.html#Output_Parameters).
 -   **Use generic wrappers** like `std::optional` to represent a missing return
-    value. Consider returning `absl::variant`if you need a more flexible
+    value. Consider returning `std::variant` if you need a more flexible
     representation with multiple alternatives.
 -   **Use a struct** to return multiple values from a function.
     -   Feel free to write a new struct specifically to represent the return
diff --git a/_posts/2020-04-06-totw-177.md b/_posts/2020-04-06-totw-177.md
index 62f430f8..5bc1ec7b 100644
--- a/_posts/2020-04-06-totw-177.md
+++ b/_posts/2020-04-06-totw-177.md
@@ -10,7 +10,7 @@ order: "177"
 
 Originally posted as TotW #177 on April 6, 2020
 
-*By [Titus Winters](mailto:titus@google.com)*
+*By [Titus Winters](mailto:titus@cs.ucr.edu)*
 
 Updated 2020-04-06
 
diff --git a/_posts/2020-09-01-totw-140.md b/_posts/2020-09-01-totw-140.md
index 140b9492..5a4e962f 100644
--- a/_posts/2020-09-01-totw-140.md
+++ b/_posts/2020-09-01-totw-140.md
@@ -99,38 +99,38 @@ inline constexpr absl::string_view kMyString = "Hello";
 
 
 // Declared in foo.h
-ABSL_CONST_INIT extern const int kMyNumber;
-ABSL_CONST_INIT extern const char kMyString[];
-ABSL_CONST_INIT extern const absl::string_view kMyStringView;
+extern const int kMyNumber;
+extern const char kMyString[];
+extern const absl::string_view kMyStringView;
 
The above example **declares** *one* instance of each object. The `extern` -keyword ensures external linkage. The `const` keyword helps prevent accidental -mutation of the value. This is a fine way to go, though it does mean the -compiler can't "see" the constant values. This limits their utility somewhat, -but not in ways that matter for typical use cases. It also requires **defining** -the variables in the associated `.cc` file. +keyword ensures external linkage, while the `const` keyword helps prevent +accidental mutation of the value. This is a fine way to go, though it does mean +the compiler can't "see" the constant values. This limits their utility +somewhat, but not in ways that matter for typical use cases. It also requires +**defining** the variables in the associated `.cc` file.
 // Defined in foo.cc
-const int kMyNumber = 42;
-const char kMyString[] = "Hello";
-const absl::string_view kMyStringView = "Hello";
+constexpr int kMyNumber = 42;
+constexpr char kMyString[] = "Hello";
+constexpr absl::string_view kMyStringView = "Hello";
 
-The `ABSL_CONST_INIT` macro ensures each constant is compile-time initialized, -but that is all it does. It *does not* make the variable `const` and it *does -not* prevent declarations of variables with non-trivial destructors that violate -the style guide rules. See mention of the macro in -[the style guide](https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables). +The `constexpr` keyword ensures each variable is a constant, is compile-time +initialized, and has a trivial destructor. This is a convenient way to ensure it +meets the +[style guide rules](https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables) +for globals. -You might be tempted to define the variables in the `.cc` file with `constexpr`, -but this is [not a portable approach at the moment](#non-portable-mistake). +You should define the variables in the `.cc` file with `constexpr`, unless you +need to [support an old toolchain](#non-portable-mistake). NOTE: `absl::string_view` is a good way to declare a string constant. The type -has a constexpr constructor and a trivial destructor, so it is safe to declare -them as global variables. Because a string view knows its length, using them -does not require a runtime call to `strlen()`. +has a `constexpr` constructor and a trivial destructor, so it is safe to declare +instances of it as global variables. Because a `string_view` knows its length, +using them does not require a runtime call to `strlen()`. ### A constexpr Function @@ -414,35 +414,23 @@ static before returning it. This fixes its address. ### Mistake #3: Non-Portable Code {#non-portable-mistake} -Some modern C++ features are not yet supported by some major compilers. 
+For the `extern const` +[variables declared in header files](#extern-const-variable) the following +approach to defining their values is valid according to the standard C++, and is +generally preferable to C++20's +[`constinit`](https://en.cppreference.com/w/cpp/language/constinit) (or the +older `ABSL_CONST_INIT`), but runs afoul of a bug with at least one common +compiler: -1. In both Clang and GCC, the `static constexpr char kHello[]` array in the - `MyString` function [above](#string-view-mistake) can be a `static constexpr - absl::string_view`. But this won't compile in Microsoft Visual Studio. If - portability is a concern, avoid `constexpr absl::string_view` until we get - the `std::string_view` type from C++17. - -
-    inline absl::string_view MyString() {
-      // Visual Studio refuses to compile this.
-      static constexpr absl::string_view kHello = "Hello";
-      return kHello;
-    }
-    
- -2. For the `extern const` - [variables declared in header files](#extern-const-variable) the following - approach to defining their values is valid according to the standard C++, - and would in fact be preferrable to ABSL_CONST_INIT, but it is not yet - supported by some compilers. - -
-    // Defined in foo.cc -- valid C++, but not supported by MSVC 19.
-    constexpr absl::string_view kOtherBufferName = "other example";
-    
+
+// Defined in foo.cc -- valid C++, but not supported by MSVC 19 by default.
+constexpr absl::string_view kOtherBufferName = "other example";
+
- As a workaround for a `constexpr` variable in a `.cc` file you can provide - its value to other files through functions. +Unfortunately, MSVC++19 incorrectly gives a C2370 error for this code unless the +`/Zc:externConstexpr` option is used. If code needs to compile with MSVC++19 and +cannot rely on `/Zc:externConstexpr`, as a workaround you can provide its value +to other files through functions instead of as a global variable. ### Mistake #4: Improperly Initialized Constants @@ -512,10 +500,10 @@ Here is a super-quick constant initialization cheat sheet (not in header files): (trivial) destruction. Any `constexpr` variable is entirely fine when defined in a `.cc` file, but is problematic in header files for reasons explained earlier. -2. `ABSL_CONST_INIT` guarantees safe constant initialization. Unlike - `constexpr`, it does not actually make the variable `const`, nor does it - ensure the destructor is trivial, so care must still be taken when declaring - static variables with it. See again +2. `constinit` (`ABSL_CONST_INIT` prior to C++20) guarantees safe constant + initialization. Unlike `constexpr`, it does not actually make the variable + `const`, nor does it ensure the destructor is trivial, so care must still be + taken when declaring static variables with it. See again https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables. 3. Otherwise, you're most likely best off using a static variable within a function and returning it. 
See the "ordinary function" example shown diff --git a/_posts/2020-09-11-totw-76.md b/_posts/2020-09-11-totw-76.md index 93af91dc..ec1de287 100644 --- a/_posts/2020-09-11-totw-76.md +++ b/_posts/2020-09-11-totw-76.md @@ -10,7 +10,7 @@ order: "076" Originally posted as TotW #76 on May 4, 2014 -*By [Titus Winters](mailto:titus@google.com)* +*By [Titus Winters](mailto:titus@cs.ucr.edu)* Updated 2020-02-06 diff --git a/_posts/2020-11-11-totw-186.md b/_posts/2020-11-11-totw-186.md index c7285477..f8e9cdae 100644 --- a/_posts/2020-11-11-totw-186.md +++ b/_posts/2020-11-11-totw-186.md @@ -57,6 +57,9 @@ Benefits over private methods include: private methods may make it difficult to find inheritance-related private declarations or declarations after the class. +Most of these benefits remain even if there is no relevant header file, such as +for a `*_test.cc` or a `*_main.cc` file. + ## Reasons to Look Elsewhere Sometimes a non-member local function does not make sense. For example: diff --git a/_posts/2022-11-16-totw-18.md b/_posts/2022-11-16-totw-18.md index 21e4c501..83eaa05b 100644 --- a/_posts/2022-11-16-totw-18.md +++ b/_posts/2022-11-16-totw-18.md @@ -10,7 +10,7 @@ order: "018" Originally posted as TotW #18 on October 4, 2012 -*By [Titus Winters](mailto:titus@google.com)* +*By [Titus Winters](mailto:titus@cs.ucr.edu)* Updated 2022-11-16 diff --git a/_posts/2022-11-16-totw-215.md b/_posts/2022-11-16-totw-215.md index 1f2e9601..9b88142f 100644 --- a/_posts/2022-11-16-totw-215.md +++ b/_posts/2022-11-16-totw-215.md @@ -143,7 +143,7 @@ LOG(INFO) << p; This code will produce a message in the logs like: -
+
 I0926 09:00:00.000000   12345 main.cc:10] (10, 20)
 
diff --git a/_posts/2022-11-16-totw-3.md b/_posts/2022-11-16-totw-3.md index 789654e4..6842b649 100644 --- a/_posts/2022-11-16-totw-3.md +++ b/_posts/2022-11-16-totw-3.md @@ -94,3 +94,6 @@ and `string_view`, like this:
 std::string foo = absl::StrCat("The year is ", year);
 
+ +For additional information, see +[absl::StrCat() and absl::StrAppend() for String Concatenation](https://abseil.io/docs/cpp/guides/strings#abslstrcat-and-abslstrappend-for-string-concatenation). diff --git a/_posts/2023-01-19-totw-218.md b/_posts/2023-01-19-totw-218.md index 47febbdc..72b1e74d 100644 --- a/_posts/2023-01-19-totw-218.md +++ b/_posts/2023-01-19-totw-218.md @@ -34,20 +34,17 @@ several considerations worth weighing: * Readability -- How easy is it for an engineer to understand the relationship between your library and the extension? - -* Maintainability -- How easy will it be change the extension point as the +* Maintainability -- How easy will it be to change the extension point as the needs of your library and your library's users change? - * Dependency Hygiene -- Does your extension point require your library to be linked in to a user's binary? We want to make sure extension points play nicely with [IWYU](https://google.github.io/styleguide/cppguide.html#Include_What_You_Use), so if a header needs to be included for the extension mechanism to work, the extended types should actually use something from that header. - -* Lack of [ODR violations](http://go/odr-violation) -- Some mechanisms make it - easy to have different portions of your program have contradictory views - about what a program means. ODR violations are always a bug. +* Lack of [ODR violations][odr-violations] -- Some mechanisms make it easy to + have different portions of your program have contradictory views about what + a program means. ODR violations are always a bug. ### FTADLE: A Good Pattern With A Great Name @@ -63,11 +60,9 @@ without namespace qualification (i.e., no `::`s). ADL is explained in detail in 1. Pick a name for your extension point and prefix it with your project's namespace. Our extension is for drawing, and our project lives in the `sketchy` namespace, so we'll call our extension `SketchyDraw`. - 1. 
Design a type to be passed in to `SketchyDraw` that has all the behavior your users will need. In our case, this is the `sketchy::Canvas` on which users can draw their types. - 1. Implement your functionality as an overload set. One member of that overload set will be a template and will call your extension point. The non-template functions in the overload set should be the basic building blocks; the @@ -88,7 +83,7 @@ without namespace qualification (i.e., no `::`s). ADL is explained in detail in template <typename T> void Draw(Canvas& c, const T& value) { // Called without namespace qualifiers. We rely on ADL to find the correct - // overload. See [Tip #49]([Tip #49](/tips/49)) for details on ADL. + // overload. See [Tip #49](/tips/49) for details on ADL. SketchyDraw(c, value); } @@ -102,20 +97,25 @@ a friend function template in their type named `SketchyDraw` with the appropriate signature. the template overload above will use ADL to find the `SketchyDraw` function. For example, -
+
 class Triangle {
  public:
   explicit Triangle(Point a, Point b, Point c) : a_(a), b_(b), c_(c) {}
 
-template <typename SC> friend void SketchyDraw(SC& canvas, const Triangle&
-triangle) { // Note: This is a template, even though the only type we ever
-expect to be // passed in for `SC` is `sketchy::Canvas`. Using `sketchy::Canvas`
-directly // works, but pulls in an extra dependency that may not be used by all
-users // of `Triangle`. sketchy::Draw(canvas, sketchy::Line(triangle.a_,
-triangle.b_)); sketchy::Draw(canvas, sketchy::Line(triangle.b_, triangle.c_));
-sketchy::Draw(canvas, sketchy::Line(triangle.c_, triangle.a_)); }
+  template <typename SC>
+  friend void SketchyDraw(SC& canvas, const Triangle& triangle) {
+    // Note: This is a template, even though the only type we ever expect to be
+    // passed in for `SC` is `sketchy::Canvas`. Using `sketchy::Canvas` directly
+    // works, but pulls in an extra dependency that may not be used by all users
+    // of `Triangle`.
+    sketchy::Draw(canvas, sketchy::Line(triangle.a_, triangle.b_));
+    sketchy::Draw(canvas, sketchy::Line(triangle.b_, triangle.c_));
+    sketchy::Draw(canvas, sketchy::Line(triangle.c_, triangle.a_));
+  }
 
-private: Point a_, b_, c_; };
+ private:
+  Point a_, b_, c_;
+};
 
 // Usage:
 void DrawTriangles(sketchy::Canvas& canvas, absl::Span<const Triangle> triangles) {
@@ -135,7 +135,6 @@ The FTADLE pattern has been used with several other common libraries.
 
 *   The `AbslHashValue` extension point allows you to make your type hashable by
     any of Abseil's hash containers. See [Tip #152](/tips/152) for details.
-
 *   The `AbslStringify` extension point allows you to print your type with many
     many Abseil libraries, including logging, `absl::StrCat`, `absl::StrFormat`,
     and `absl::Substitute`.
@@ -212,12 +211,12 @@ struct hash<MyType> {
 
Aside from requiring more boilerplate, this technique is ripe for -[ODR violations](http://go/odr-violation). While not terribly common, providing -different specializations for this type in different translation units, or even -the same definition twice is an ODR violation. More commonly, if such a -specialization is available only in some translation units but not others, -metaprogramming techniques will produce different answers to the question "is -there a hash function available?" which is also an ODR-violation. +[ODR violations][odr-violations]. While not terribly common, providing different +specializations for this type in different translation units, or even the same +definition twice is an ODR violation. More commonly, if such a specialization is +available only in some translation units but not others, metaprogramming +techniques will produce different answers to the question "is there a hash +function available?" which is also an ODR-violation. Beyond that, it is generally bad practice to open up a namespace you do not own (amongst other reasons, because it leads to ODR violations). We should design @@ -230,7 +229,11 @@ The FTADLE extension point pattern is readable, maintainable, mitigates against ODR violations, and avoids adding dependencies. If your library needs an extension point, FTADLE comes highly recommended. 
-[^1]: C++ has a rich tradition of almost-pronouncable +[odr-violations]: https://en.cppreference.com/w/cpp/language/definition + + + +[^1]: C++ has a rich tradition of almost-pronounceable [acronyms](https://en.cppreference.com/w/cpp/language/acronyms), including [RAII](https://en.cppreference.com/w/cpp/language/raii), [IFNDR](https://en.cppreference.com/w/cpp/language/ndr), diff --git a/_posts/2024-03-21-totw-224.md b/_posts/2024-03-21-totw-224.md new file mode 100644 index 00000000..070c05e2 --- /dev/null +++ b/_posts/2024-03-21-totw-224.md @@ -0,0 +1,190 @@ +--- +title: "Tip of the Week #224: Avoid vector.at()" +layout: tips +sidenav: side-nav-tips.html +published: true +permalink: tips/224 +type: markdown +order: "224" +--- + +Originally posted as TotW #224 on August 24, 2023 + +*By [Titus Winters](mailto:titus@cs.ucr.edu)* + +Updated 2024-01-24 + +Quicklink: [abseil.io/tips/224](https://abseil.io/tips/224) + + +There is no good use of `vector::at()` in google3, and fairly few good uses +in other C++ environments. The same reasoning applies to `at()` on other +random-access sequences like `RepeatedPtrField` in protobuf, as well as to +`value()` on wrapper types like `optional` and `absl::StatusOr`. + +## What Does at() Do? + +The specification of `at(size_type pos)` is as follows: + +> Returns a reference to the element at specified location `pos`, with bounds +> checking. If `pos` is not within the range of the container, an exception of +> type std::out_of_range is thrown. + +This means we could view the contract of this method as two distinct behaviors: + +- Check whether `pos >= size()`, and if so then throw a `std::out_of_range` + exception. +- Otherwise, return the element at index `pos`. 
+ +Note: The specification does not directly address the case of code passing a +negative index, but `std::out_of_range` will be thrown for that case too – +because `size_type` is an *unsigned* integral type, a call to `at(-5)` will +yield a very large positive value for `pos`. + +## When Would We Use at()? + +Since the contract of `at()` depends on the bounds-checking logic, we can break +this into two cases: either we know by construction that the index is valid, or +we don't. + +If we already know that the sequence is sufficiently large and the lookup will +succeed, the extra bounds check is overhead. Most `vector` accesses, for +instance, are as part of a loop from `0` to `size()` and we already know the +operation will succeed. Therefore, in cases where we already know the bounds +check will be successful, it's likely that we want the more common +`operator[]()`. + +
 {.bad}
+for (int i = 0; i + 1 < vec.size(); ++i) {
+  ProcessPair(vec.at(i), vec.at(i + 1));
+}
+
+
+becomes
+
+
 {.good}
+for (int i = 0; i + 1 < vec.size(); ++i) {
+  ProcessPair(vec[i], vec[i + 1]);
+}
+
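To make the two-part contract of `at()` concrete, here is a minimal sketch. It assumes a build with exceptions enabled, and the helper names `AtThrows` and `NegativeFiveAsIndex` are hypothetical, introduced only for illustration. It shows that an in-range index returns normally, while an out-of-range index (including a "negative" one converted to the unsigned `size_type`) throws `std::out_of_range`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether v.at(pos) throws std::out_of_range.
// Assumes exceptions are enabled in this build.
bool AtThrows(const std::vector<int>& v, std::size_t pos) {
  try {
    (void)v.at(pos);  // Bounds-checked access; the result is discarded.
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

// Hypothetical helper: -5 converted to an unsigned index type wraps around
// to a very large positive value, so at() treats it as out of range.
inline std::size_t NegativeFiveAsIndex() {
  return static_cast<std::size_t>(-5);
}
```

In a build where exceptions are disabled, the throwing path would terminate the process instead of reaching the `catch`.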
+
+If we do **not** know that the sequence is sufficiently large, is throwing an
+exception the right way to handle that? Usually not. In google3 builds, throwing
+an exception will terminate the program, messily. Many (perhaps most) readers
+won't necessarily spot an innocuously named method like `at()` as a process
+termination risk.
+
+
 {.bad}
+std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(','));
+LOG(INFO) << "Got leading token " << tokens.at(0);
+
+
+is probably better as
+
+
 {.good}
+std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(','));
+if (tokens.empty()) {
+  return absl::InvalidArgumentError("Invalid user_string, expected ','");
+}
+LOG(INFO) << "Got leading token " << tokens[0];
+
+
+or if aborting the program is preferable
+
+
 {.good}
+std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(','));
+CHECK(!tokens.empty()) << "Invalid user_string "
+                       << std::quoted(user_string)
+                       << ", expected at least one ','";
+LOG(INFO) << "Got leading token " << tokens[0];
+
+ +So at least in a google3 context, none of the uses of `at()` are really useful — +for any given use case, there is a more preferred alternative. + +## What About UB? + +Unfortunately, reality is hardly so clean as "we know or we don't": we make +mistakes and code can change over time, invalidating originally correct +assumptions. Given that humans are fallible, we can *imagine* a use-case for +`at()`. Specifically, if we are completely consistent in using `at()` instead of +`operator[]`, we might ensure that even if we're crashing messily (*bad*), we +don't trigger [undefined behavior (UB)](/tips/labs/ub-and-you) (*worse*). + +While we believe "avoid UB" is a very legitimate goal, we still don't endorse +the use of `at()`, specifically, because of its exception-entangled semantics, +discussed above. The ideal future solution is a hardened-by-default +`operator[]`, with compiler optimizations to remove bounds checking, when +provably safe. The `at()` method is a bad approximation of this solution. + +Instead, we encourage users to stick with `operator[]` and reduce exposure to UB +by other means, including: + +* If your project can afford it, we recommend enabling bounds check in + production using + [HARDENING_ENABLE_SAFE_LIBCXX](http://go/cc_hardened_binary). Current + measurements suggest the [macrobenchmark](https://abseil.io/fast/39) cost of + this hardening is just barely statistically significant. +* If you run your code with [ASAN][asan] you'll *also* get diagnostics if you + access an element out of range. + +In fact, your project is likely already relying on some of these protections! + +## What About Maps? + +In [Tip #202](/tips/202) we discussed the use of `at()` on associative +containers like maps and sets. In general, the error-handling logic above +applies: it's likely the case that a missing key should be handled by logging or +returning an error, rather than messily crashing the process. 
+ +However, the "bounds checking" overhead logic is different for these containers. +In the `std::vector` case, the compute cost of doing the bounds check is similar +to the cost of doing the actual work (returning the indicated reference). For +associative containers, the "bounds check" equivalent is doing the (necessary) +lookup, whether that is tree traversal, hashing, etc. + +Following that reasoning, we might use `at()` when we know the key is present +already (no exception throwing) but were unable to keep an iterator or +reference, so it is necessary to perform the lookup again. This is pretty rare: +see [Tip #132](/tips/132) for ways to avoid redundant map lookups. + +In the end, there's some minor room for usage of `at()` in associative +containers. There is more room for nuance in those cases than there is for +`vector`. + +## What About C++ With Exceptions? + +In an exceptions-enabled environment, opinions may differ a bit more when it +comes to `at()`. It's still broadly the case that explicit bounds checking is +likely better performance (and harder to mess up) than relying on exceptions. An +argument could be made for defense-in-depth prevention of UB, but it's fairly +clear that the idiom is (and will continue to be) `operator[]()` rather than +`at()`. + +Ideally, code should make as few assumptions as it can about the environment in +which it will work. Reasoning about code based on which toolchains will be used +to compile it is often fragile. For code that uses `at()` (or another +exception-based API) to be correct, it needs to be correct for two different +build modes: it must be acceptable to terminate the entire process *and* it must +be acceptable for code at a higher level to catch the exception and continue +execution, so the library code must preserve all invariants. In practice that +means that the code must be exception-safe *and* that it must be OK for any +out-of-bounds use of `at()` to terminate the process. 
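As a rough illustration of handling a missing key without `at()`, the sketch below performs the single lookup that is needed anyway and reports absence to the caller. It uses only the standard library; the helper name `FindValue` and the choice of `std::optional` as the error channel are assumptions made for this example, not something the tip prescribes.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: returns the value stored under `key`, or std::nullopt
// if the key is absent, instead of throwing as map::at() would.
std::optional<int> FindValue(const std::map<std::string, int>& m,
                             const std::string& key) {
  const auto it = m.find(key);  // The one lookup we have to do anyway.
  if (it == m.end()) return std::nullopt;
  return it->second;
}
```

The caller can then decide whether a missing key should be logged, turned into an error status, or treated as fatal.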
+ +The best advice we can give about use of `at()` in an exception-enabled +environment is perhaps that it trades a reduction in potential UB for hidden and +often unnecessary error handling. That isn't always a clear tradeoff, but it +still seems unlikely to be commonly worth the cost. + +## Closing Thoughts + +When indexing into a container, be mindful of which case we are in: is the index +"correct by construction", or does the code need to detect and handle invalid +indexes? In both cases we can do better than using the exception-based +`std::vector::at()` API. + +Similar thinking applies to other exception-based APIs such as +`std::optional::value()` and `absl::StatusOr::value()` (See +[Tip #181](/tips/181)). For error handling in non-concurrent C++ code, prefer to +"look before you leap" – and then, having checked that things are in order, +avoid APIs that include their own checking. + +[asan]: https://github.com/google/sanitizers/wiki/AddressSanitizer diff --git a/_posts/2024-03-21-totw-227.md b/_posts/2024-03-21-totw-227.md new file mode 100644 index 00000000..6c7367f6 --- /dev/null +++ b/_posts/2024-03-21-totw-227.md @@ -0,0 +1,126 @@ +--- +title: "Tip of the Week #227: Be Careful with Empty Containers and Unsigned Arithmetic" +layout: tips +sidenav: side-nav-tips.html +published: true +permalink: tips/227 +type: markdown +order: "227" +--- + +Originally posted as TotW #227 on November 16, 2023 + +*By [James Dennett](mailto:jdennett@google.com)* + +Updated 2024-03-11 + +Quicklink: [abseil.io/tips/227](https://abseil.io/tips/227) + + +## Index-Based Loops (Still Have Their Uses) + +Modern C++ code uses index-based `for` loops much less often now that +range-based `for` loops are available, but there are still times when we need to +use an index while we iterate. Parallel iteration over multiple containers is +one such case; another is when we want to process multiple adjacent elements +from a single container. 
In this tip we’ll look at a pitfall for the second of +these. + +## Plausible Code + +Let’s start by looking at some code that might be correct: + +
+for (int64_t i = 0; i < v.size() - 1; ++i) {
+  ProcessPair(v[i], v[i+1]);
+}
+
+ +This code wisely takes some care to check for valid indexes before calling +`ProcessPair()`, so why do we say that it only “might” be correct? A careful +unit test (or almost any [fuzz test](https://en.wikipedia.org/wiki/Fuzzing)) +will cover the case where `v` is empty. If the code surrounding our `for` loop +ensures that the `for` loop is never reached in that case, all is well. But if +we execute our loop with an empty `v`, C++ makes trouble for us. + +## Unsigned Types to the Unrescue + +Recall that +[our style guide warns](https://google.github.io/styleguide/cppguide.html#on_unsigned_integers) +against use of `unsigned` types in C++. The style guide also says + +> Because of historical accident, the C++ standard also uses unsigned integers +> to represent the **size of containers** - many members of the standards body +> believe this to be a mistake, but it is effectively impossible to fix at this +> point. + +(emphasis added) + +Looking carefully, we can see that our example falls afoul of the exact problem +discussed in the style guide. When checking whether we have valid `v[i]` and +`v[i+1]` elements, we are seemingly correct in checking whether `i` is less than +`v.size() - 1` given that both elements need to be valid. However, for an empty +container `v.size()` is zero (so far, so good!), but because the type of +`v.size()` is unsigned, when we subtract one from that zero, we don't get the +value `-1`, but instead we get the *maximum* value of the given type. Then the +check for whether `i` is less than `v.size() - 1` evaluates as `true` for any +small value of `i`, and so the code will use out-of-bounds indexes for `v` - +yielding undefined behavior. + +## How Should We Fix This? + +Interestingly, if we make the code express its intent a little more directly, +our problem goes away. + +What do we mean by “express its intent a little more directly”? What is the +intent here? 
The purpose of the loop condition here is to ensure that the +indexes `i` *and* `i + 1` used to index into `v` are valid. + +Given that indexes in C++ are zero-based, we test whether a (non-negative) index +`i` into a container is valid by checking `i < v.size()`. It would be redundant +to check validity of both indexes (though we could do so if we wished): if `i + +1` is valid then we know that `i` is (because `i` is never negative here). “`i + +1` is valid” translates directly into C++ as `i + 1 < v.size()`. Our original +code `i < v.size() - 1` does not have such a direct translation as a statement +about the validity of an index. + +The rewritten code `i + 1 < v.size()` looks almost the same as `i < v.size() - +1`, but it is crucially different in that we never subtract, so we avoid the +danger of wrapping around to a huge positive value. Did we swap this for a risk +of overflowing when we calculate `i + 1`? Only if `i` is already the largest +value of its type (`int64_t`) – so we are safe. This difference is sometimes +characterized by noting that the common, useful values of `int64_t` are a long +way away from overflowing, whereas with unsigned types such as `uint64_t`, the +very common value 0 is the smallest value of the type, so it’s much easier to +unintentionally wrap around. + +## Fixed Code, Fuzz Free + +With this one small change, our now-robust code looks like this: + +
+for (int64_t i = 0; i + 1 < v.size(); ++i) {
+  ProcessPair(v[i], v[i+1]);
+}
+
+ +The indexes into `v` are clearly safe, without a need to look further afield to +know whether `v` might be empty. + +Now we can let our fuzzer loose on the fixed code, and feel warm fuzzy feelings +that our `for` loop is bug-free and reviewer-friendly. + +Note: This is just one (robust) way to write this loop; there are many others. + +## Summary + +While our fix doesn’t change many bytes of source code, it touches on a number +of ideas: + +* As the style guide + [says](https://google.github.io/styleguide/cppguide.html#on_unsigned_integers), + be wary of arithmetic on unsigned types in C++. +* Remember that `container.size()` yields an unsigned type. +* Prefer code where correctness can be verified as locally as possible. +* Try to make code correspond as directly as possible to the underlying + intent. diff --git a/_posts/2024-03-21-totw-229.md b/_posts/2024-03-21-totw-229.md new file mode 100644 index 00000000..aeee1b0c --- /dev/null +++ b/_posts/2024-03-21-totw-229.md @@ -0,0 +1,330 @@ +--- +title: "Tip of the Week #229: Ranked Overloads for Template Metaprogramming" +layout: tips +sidenav: side-nav-tips.html +published: true +permalink: tips/229 +type: markdown +order: "229" +--- + +Originally posted as TotW #229 on February 5, 2024 + +*By [Miguel Young de la Sota](mailto:mcyoung@mit.edu) and [Matt Kulukundis](mailto:kfm@google.com)* + +Updated 2024-02-20 + +Quicklink: [abseil.io/tips/229](https://abseil.io/tips/229) + + +Warning: This is an advanced tip for folks doing template metaprogramming. In +general, avoid template metaprogramming unless you have a very, very good +reason. If you are reading this, it's because you need to do some template +metaprogramming or just want to learn something nifty. + +## One Cool Trick + +Ordinarily, C++ requires every function invocation to resolve to a single "best" +function or it produces an ambiguity error. 
The exact definition of "best" is +more complex than we want to go into, but involves things like implicit +conversions and type qualifiers. + +In situations that would produce ambiguity errors, we can use explicit class +hierarchies to force the definition of "best" to be what we prefer. This "ranked +overloads" technique uses structures with a class hierarchy so they have a +priority ordering and the compiler will select the highest priority method +first. We'll define a family of empty [tag types](/tips/198) `Rank0`, `Rank1`, +etc., that are related by inheritance, and use those to guide the overload +resolution process.[^rank] + +
+// Public API with good comments.
+template <typename T>
+size_t Size(const T& t);
+
+// Everything below here is a working example of ranked overloads, that
+// you can copy and paste to get you started!
+namespace internal_size {
+
+// Use go/ranked-overloads for dispatching.
+struct Rank0 {};
+struct Rank1 : Rank0 {};
+struct Rank2 : Rank1 {};
+struct Rank3 : Rank2 {};
+
+template <typename T>
+size_t SizeImpl(Rank3, const std::optional<T>& x) {
+  return x.has_value() ? Size(*x) : 0;
+}
+
+template <typename T>
+size_t SizeImpl(Rank3, const std::vector<T>& v) {
+  size_t res = 0;
+  for (const auto& e : v) res += Size(e);
+  return res;
+}
+
+template <typename T>
+size_t SizeImpl(Rank3, const T& t)
+  requires std::convertible_to<T, absl::string_view>
+{
+  return absl::string_view{t}.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank2, const T& x)
+  requires requires { x.length(); }
+{
+  return x.length();
+}
+
+template <typename T>
+size_t SizeImpl(Rank1, const T& x)
+  requires requires { x.size(); }
+{
+  return x.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank0, const T& x) { return 1; }
+
+}  // namespace internal_size
+
+template <typename T>
+size_t Size(const T& t) {
+  // Start with the highest rank, Rank3.
+  return internal_size::SizeImpl(internal_size::Rank3{}, t);
+}
+
+auto i = Size("foo");                      // hits the string_view overload
+auto j = Size(std::vector<int>{1, 2, 3});  // hits the vector overload
+auto k = Size(17);                         // hits the lowest rank "catch all"
+
+
+Note that the `absl::string_view`, `std::optional`, and `std::vector` overloads
+use `Rank3`. When overloads are mutually incompatible (the call *can't* be
+ambiguous by construction), the same rank type can be used. You can think of all
+overloads with the same rank as being tried in parallel.
+
+NOTE: The astute reader may wonder why the `absl::string_view` overload is
+declared as a template. Doing so ensures that no implicit conversions will take
+place in the signature other than the one for the rank structs. If this overload
+were declared with an `absl::string_view` parameter then the call would be
+ambiguous: `Rank3{}` -> `Rank0{}` would count as a conversion, but so would
+`const char[]` to `absl::string_view`, and the call would become
+[ambiguous again][godbolt-link].
+
+## Detailed Example
+
+Let's suppose we want `Size(x)` to return `x.length()`, `x.size()`, or `1`
+depending on what the passed type `x` implements. The naive approach does not
+work:
+
+
+template <typename T>
+size_t Size(const T& x)
+  requires requires { x.length(); }
+{
+  return x.length();
+}
+
+template <typename T>
+size_t Size(const T& x)
+  requires requires { x.size(); }
+{
+  return x.size();
+}
+
+template <typename T>
+size_t Size(const T& x) { return 1; }
+
+auto i = Size(std::string("foo"));  // Ambiguous.
+
+ +Because the size and length overloads are of equal rank, they are equally good +matches for the call. Because overload resolution does not eliminate all but one +candidate, the compiler declares the callsite ambiguous. There are clever tricks +using variadic functions or `int`/`long` promotion to create an ordering for two +options, but these do not scale to having N descending ranks of overloads. + +Using ranked overloads as we've suggested attaches *explicit ranks* in the form +of inheritance to specific overloads. This rank is based on the following rule: +overloads with more derived classes have higher rank than overloads with less +specific ones. That is, if two overloads differ by a single argument's type, and +both are bases of the argument, the type closest in the inheritance hierarchy +has higher rank and is a better match. + +This means we can build a tower of empty structs, each deriving from the +previous, to put an *explicit, numeric rank* on each overload. Using this trick, +we would write `Size` like this: + +
+// Public API with good comments.
+template <typename T>
+size_t Size(const T& t);
+
+namespace internal_size {
+
+// Use go/ranked-overloads for dispatching.
+struct Rank0 {};
+struct Rank1 : Rank0 {};
+struct Rank2 : Rank1 {};
+
+template <typename T>
+size_t SizeImpl(Rank2, const T& x)
+  requires requires { x.length(); }
+{
+  return x.length();
+}
+
+template <typename T>
+size_t SizeImpl(Rank1, const T& x)
+  requires requires { x.size(); }
+{
+  return x.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank0, const T& x) { return 1; }
+
+}  // namespace internal_size
+
+template <typename T>
+size_t Size(const T& t) {
+  // Start with the highest rank
+  return internal_size::SizeImpl(internal_size::Rank2{}, t);
+}
+
+auto i = Size(std::string("foo"));  // 3
+
+
+The overloads can now be read as an `if`/`else` chain. First we try the `Rank2`
+overload; if its constraint is not satisfied, we fall back to the next rank,
+`Rank1`, and then `Rank0`. Of course, this particular method will treat
+`Size(std::string("foo"))` differently from `Size("foo")`: a string literal is a
+`const char` array with neither `length()` nor `size()`, so it falls through to
+`Rank0` and yields `1`. This highlights some of the dangers of generic
+programming, though the fix is relatively straightforward: add an explicit rank
+to handle strings, as below.
+
+
+// Public API with good comments.
+template <typename T>
+size_t Size(const T& t);
+
+namespace internal_size {
+// Use go/ranked-overloads for dispatching.
+struct Rank0 {};
+struct Rank1 : Rank0 {};
+struct Rank2 : Rank1 {};
+struct Rank3 : Rank2 {};
+
+template <typename T>
+size_t SizeImpl(Rank3, const T& t)
+  requires std::convertible_to<T, absl::string_view>
+{
+  return absl::string_view{t}.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank2, const T& x)
+  requires requires { x.length(); }
+{
+  return x.length();
+}
+
+template <typename T>
+size_t SizeImpl(Rank1, const T& x)
+  requires requires { x.size(); }
+{
+  return x.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank0, const T& x) { return 1; }
+
+}  // namespace internal_size
+
+template <typename T>
+size_t Size(const T& t) {
+  // Start with the highest rank
+  return internal_size::SizeImpl(internal_size::Rank3{}, t);
+}
+
+auto i = Size("foo");  // 3
+
+
+Now extending this to `vector` and `optional` is quite straightforward!
+
+
+// Public API with good comments.
+template <typename T>
+size_t Size(const T& t);
+
+namespace internal_size {
+// Use go/ranked-overloads for dispatching.
+struct Rank0 {};
+struct Rank1 : Rank0 {};
+struct Rank2 : Rank1 {};
+struct Rank3 : Rank2 {};
+
+template <typename T>
+size_t SizeImpl(Rank3, const std::optional<T>& x) {
+  return x.has_value() ? Size(*x) : 0;
+}
+
+template <typename T>
+size_t SizeImpl(Rank3, const std::vector<T>& v) {
+  size_t res = 0;
+  for (const auto& e : v) res += Size(e);
+  return res;
+}
+
+template <typename T>
+size_t SizeImpl(Rank3, const T& t)
+  requires std::convertible_to<T, absl::string_view>
+{
+  return absl::string_view{t}.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank2, const T& x)
+  requires requires { x.length(); }
+{
+  return x.length();
+}
+
+template <typename T>
+size_t SizeImpl(Rank1, const T& x)
+  requires requires { x.size(); }
+{
+  return x.size();
+}
+
+template <typename T>
+size_t SizeImpl(Rank0, const T& x) { return 1; }
+
+}  // namespace internal_size
+
+template <typename T>
+size_t Size(const T& t) {
+  // Start with the highest rank
+  return internal_size::SizeImpl(internal_size::Rank3{}, t);
+}
+
+auto i = Size("foo");                      // hits the string_view overload
+auto j = Size(std::vector<int>{1, 2, 3});  // hits the vector overload
+auto k = Size(17);                         // hits the lowest rank "catch all"
+
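The examples above rely on C++20 `requires` clauses. As a rough sketch only, assuming a toolchain limited to C++17, the same ranking idea can be combined with expression SFINAE in a trailing return type. The namespace `ranked_sketch` is hypothetical and chosen for this illustration, and the sketch covers just the `length()` / `size()` / catch-all ranks.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace ranked_sketch {  // Hypothetical namespace for this example.

// Tower of empty tag types; more derived means higher priority.
struct Rank0 {};
struct Rank1 : Rank0 {};
struct Rank2 : Rank1 {};

// Viable only if `x.length()` is a valid expression (expression SFINAE).
template <typename T>
auto SizeImpl(Rank2, const T& x)
    -> decltype(static_cast<std::size_t>(x.length())) {
  return x.length();
}

// Viable only if `x.size()` is a valid expression.
template <typename T>
auto SizeImpl(Rank1, const T& x)
    -> decltype(static_cast<std::size_t>(x.size())) {
  return x.size();
}

// Catch-all: always viable, but the lowest rank.
template <typename T>
std::size_t SizeImpl(Rank0, const T&) {
  return 1;
}

template <typename T>
std::size_t Size(const T& t) {
  // Start with the highest rank; overload resolution prefers the overload
  // whose tag parameter is closest to Rank2 in the inheritance chain.
  return SizeImpl(Rank2{}, t);
}

}  // namespace ranked_sketch
```

Here `std::string` selects the `length()` overload, `std::vector<int>` falls to the `size()` overload because it has no `length()`, and `int` lands on the catch-all.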
+ +## Parting Thoughts + +Now that you have learned this awesome power, please remember to use it +sparingly. As we saw with the `absl::string_view` overload, generic programming +is subtle and can lead to unexpected results. + +[godbolt-link]: https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGe1wAyeAyYAHI%2BAEaYxCBmAJykAA6oCoRODB7evnrJqY4CQSHhLFEx8baY9vkMQgRMxASZPn5cFVXptfUEhWGR0bEJCnUNTdmtQ109xaUDAJS2qF7EyOwc5gDMwcjeWADUJuseidViB9gmGgCCG1s7mPuHQ8TBwAD6AG54mADuZxfXZk2DG2Xj2BzcTxefyuN2BdwebgIAE9EphXgRiExCApoQCgSCwYd3pgHCRcf8APQU3bKLwRWh4ZC7S7KACSu2%2BhAQu2AqAwuzQLDYggUADp/gRMCxEgZJQjkajmGxdgAVXGpABeaIIuyEeC1EDQDCGqvMADZdgRZgcrDCrkrMApEkwVrtgpLiMxaK9NfcTAB2W2XKm7RT3XkUzEMADWmHQAFpUMTiLRUEx0ApdvxiLt8E6mARkAgXuKrk8vA5dgAlQzRjT7QMBgAiNv%2B5crNZjXF2IGrtfrAas/pb6yD7Z1nejZh7fa7DaHI7HGIrE9r6xnk%2B7g%2Bbrbtl0l0tlfsOCsYrHuavW5zL%2Bu1utvrMPEEn61IAoEJqG6BAIFQx3SpyHJe5xmBaqizPO/y7LsxCYAQSwMLsqiiggTAKB8YheJgEAQQcABi94GgAVOBM4aLu1zDv8EpSjKBbHoiKJnsqwFtre6KEZgj4ys%2Ba5vkan4EN%2BIDEqSxDgqxoG7O8uGNlc0G%2BhxsGZgcTa7ORo5QVmJC7IaH46kwXhEOauz3L2MkwY6%2ByWKpnEQJg1qafJlnwZ6lk4k5lEtnuik6nqWrcbQvExq%2Buxfj%2BkJGB8XzfJaslBtBsGuYhBCir6OEUTue4HnRcrgqeDqmlebFahx/lcU%2Bk5mPx%2BmmlJ4FabBACOXh4MplktW1VmDkhor0EYBAIBlo4Nt5lEJS5CG9f1wCDcNQZZdcVw5Ue8pMYVrE3qVfkPpVtatO%2Bxo6mq9XWs5zWte1F1dSpga9eljkWKN1FyZciVwVNyEPZlVHZbRq35et55Fdely%2BZxgXBXWNVHXVYHxZNblcDaz17s20Ehg6%2Bauu60Rej6t7Uct/30WtirA5tYPsTtBoCcdJlWpBzkhp0DQclyloIPcxbAFzJpRtGjUfW5uOemIBNaj%2B5WQ6L%2BO%2Bj%2BL7bsOb5Wj9Y3/IZRBug8anlRA5hmPwqAG490Fm%2BbFuWxj1LFgQmaDfckVvJ8Py7Em0SpumGtGaguzaDrdnhSJJJEOJhzun8gYHdVuzrM2pu7CGtv21z0khzp7spmm6De1r0YB3rXD%2BgnVul6XSfYpz9ypt8jo6gL1lmKIhbcmItAGxw8y0JwACsvB%2BBwWikKgnBuNY1hhYsyx%2BoCPCkAQmid/M0YgD3XCimYPc9/EAAc6zrP6XAH2Y/r6Jwkj94vw%2BcLwCggBo8%2BL/McCwDAiAoKg0p0NE5CUIKiTfxiNsQwbxlwxj4HQD0d8IARCvhEYI9QkScDnvA5gxAkQAHkIjaBDsg3ggphQEAwQwWgSDB68CwBELwwA3BtzvtwChUoQHiAYaQfAsEHB4GJPQoemBVAkiMqsOeuNu7kP0HgCI
mJ0EeCwFfDEeAWB4PmFQAwwAFAADUYoYMVHgmQggRBiHYFIPR8glBqCvroVoBgjAoHHpYcREQ76wAdCAJgjjMB0FIMmEAYDoyzHmH%2Bao9D4xflUqYGylgzD1l4JnZ4WAnE4TaCHdILgGDuE8M0fwaSph9BiK0XIaQBCjBaEkFIhSGA5JKP0cYlRkkCFZo0DJYwkmcPqcMbowRehVLybYdpxS9ATAaJUmYXB5gKCnisCQXde6XzESPDgSEd5mnjGaSQAprHAF0r4iCEBcCEB0hsUZvAF7kP8aQFePcH6iIvqQRRlzSADyHvM2%2B99H6nNIC/d%2BiwCCJCMr/PSX96DEFCOeTgqglkrLWcAowWziBeBjLMXgcZ9lxL0PwfRohxDGPRaYlQ6gxGWNIN8TEiQlFnw4H3B5V95kYKMr8nUqAqCLOWas9ZIDYXwr8bpDwgLojWXWEct5Wgzlc3TP0RJ1zeB3Ifo8mJN9bCvJOcK5eIBJA71FP6SQkh/QaC4GaM0Zglkn1fKI9YsynnyqVUvclZhzVyo4Mcp%2B8xkypGcJIIAA%3D%3D + + + +[^rank]: Note that "rank" here is used in a colloquial sense and is unrelated to + [conversion ranks](https://en.cppreference.com/w/cpp/language/usual_arithmetic_conversions) + for integers and floating point values.