Occasionally, I see std::string_view in someone else’s code, passed into a function by value and sometimes even by r-value reference. I had never needed it myself, but I got curious and looked into what it is and how it should (and should not) be used.
So what is std::string_view?
Introduced in C++17, a string view is a non-owning, read-only view of a char array: a pointer that also tracks the length of the pointed-to array and behaves like a string. Feed it a char* and you can use string operations like comparison, hashing, or substring.
However, the non-owning part is crucial, and it is exactly what makes a string view easy to misuse. Before going any deeper down the rabbit hole, let’s examine how a string view can be used:
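To make the "feed it char*" idea concrete, here is a minimal sketch. The helper names c_str_starts_with and hash_c_string are hypothetical, invented for illustration only; the calls they use (substr, operator==, std::hash<std::string_view>) are standard C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Hypothetical helper: true when the C string `raw` begins with `prefix`.
// Both views live only inside this function, next to the data they observe.
inline bool c_str_starts_with(const char* raw, const char* prefix)
{
    assert(raw != nullptr && prefix != nullptr); // a view over nullptr is undefined behavior
    const std::string_view text(raw);   // finds the terminator once, copies nothing
    const std::string_view head(prefix);
    // substr clamps the count to what is available, so a short `raw` is fine
    return text.substr(0, head.size()) == head;
}

// Hypothetical helper: hashes the characters of a C string, not the pointer value.
inline std::size_t hash_c_string(const char* raw)
{
    assert(raw != nullptr);
    return std::hash<std::string_view>{}(std::string_view(raw));
}
```

Neither helper allocates: the views only record where the characters are and how many there are.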
const std::string ss = "hello;world;of;harry;potter";
std::string_view sv{ss};
sv.remove_prefix(sv.find(';') + 1); // sv is now "world;of;harry;potter"
sv.remove_suffix(sv.size() - sv.find_last_of(';')); // sv is now "world;of;harry"
You might be confused by this bit of code. Didn’t I say that string view is a non-owning reference? How can it change the source data? The answer is: it doesn’t. The string view only holds a pointer to the first character and the size of the array. remove_prefix and remove_suffix merely advance that pointer and shrink that size, which means I can do various substring operations without ever modifying the source data. What’s better, no strings are allocated or copied during the process!
I can postpone the copy to the point when I am done with the transformations and simply call:
// Conversion from string_view to string is explicit
std::string substring = std::string(sv);
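The "pointer plus size" claim can be checked directly. The sketch below repeats the trimming steps from above and reports where the view ended up relative to the source string. ViewLayout and layout_after_trimming are hypothetical names used only for this illustration, and the npos guards are my addition so that inputs without a ';' stay well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical type: where a view starts inside its source, and how long it is.
struct ViewLayout {
    std::size_t offset;
    std::size_t length;
};

// Hypothetical helper: drops the first and the last ';'-separated field
// through a local view and reports what is left. `source` is never modified.
inline ViewLayout layout_after_trimming(const std::string& source)
{
    std::string_view sv{source};
    const auto first = sv.find(';');
    if (first != std::string_view::npos) {
        sv.remove_prefix(first + 1);               // pointer moves right, size shrinks
        const auto last = sv.find_last_of(';');
        if (last != std::string_view::npos) {
            sv.remove_suffix(sv.size() - last);    // only the size shrinks
        }
    }
    assert(sv.data() >= source.data());            // the view still points into `source`
    return {static_cast<std::size_t>(sv.data() - source.data()), sv.size()};
}
```

For the string used above, the result should be offset 6 and length 14, which is exactly "world;of;harry" inside the untouched original.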
String view is so simple that it can be used for compile-time text processing (the example uses consteval and constinit, which require C++20):
consteval unsigned getLengthOfFirstWord()
{
    constexpr std::string_view sv("hello world");
    // Keep everything before the first space: "hello"
    return sv.substr(0, sv.find(' ')).size();
}
constinit const unsigned LENGTH = getLengthOfFirstWord(); // Equals 5
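If C++20 is not available, a similar effect can be sketched in C++17 with constexpr and static_assert. The function name countFields is hypothetical; it takes a const char* and keeps the view local to the function.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts `separator`-delimited fields in a C string.
// An empty string counts as one (empty) field.
constexpr std::size_t countFields(const char* text, char separator)
{
    const std::string_view sv(text); // the view never leaves this function
    std::size_t fields = 1;
    for (const char c : sv) {
        if (c == separator) {
            ++fields;
        }
    }
    return fields;
}

// Evaluated by the compiler; a wrong count would be a build error.
static_assert(countFields("hello;world;of;harry;potter", ';') == 5);
static_assert(countFields("", ';') == 1);
```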
Pitfalls of std::string_view
We established that string view can optimize certain string operations by eliminating extra copies. It can also cause you a lot of headaches because, as with everything in C++, it is optimized for speed, not safety. Consider the following program. What do you think will happen?
#include <exception>
#include <iostream>
#include <string_view>
int main(){
    std::string_view sv("hello world");
    try {
        auto sv2 = sv.substr(1, 2); // sv2 is "el"
        sv2.remove_prefix(4); // asks to drop 4 characters from a 2-character view
        std::cout << "OUTPUT: " << sv2 << std::endl;
    }
    catch (const std::exception& e)
    {
        std::cerr << "ERROR: " << e.what() << std::endl;
    }
}
I would expect one of the following:
- remove_prefix throws an exception, because it went past the string size
- sv2 behaves as an empty string because all characters were removed from it
With MSVC 2022 in Debug mode, I got a runtime assertion. In Release mode, the code seems to just crash somewhat silently. In clang (trunk), much more interesting stuff happened: instead of reporting an error, the program started printing memory contents.
But why? Calling remove_prefix with a count larger than size() is undefined behavior, so the library is not required to check it. In all cases, the size() of sv2 ends up equal to -2 (2 minus 4). Since it is a std::size_t, it wraps around to a really huge number. The internal pointer points to ‘ world’ because it started at index 1 and we moved it 4 more characters to the right, 5 in total (past ‘hello’).
Despite what you might think, clang is not even triggering the exception. It is actively trying to read the original string up to the number of characters provided by size(). Since the string literal is stored in a static part of memory, it leaks the other string literals stored there (the exception text is there as well). Then it starts leaking whatever memory follows until it eventually crashes.
Constructing a std::string from this view would indeed throw an exception, in practice std::length_error, because the bogus size exceeds what a string can hold.
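Both behaviors I expected earlier are easy to get with a thin wrapper. The two functions below are hypothetical helpers, not part of the standard library: one throws when asked to remove too much, the other clamps the count so the view simply becomes empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical helper: like remove_prefix, but throws instead of invoking
// undefined behavior when `count` exceeds the view's size.
inline void remove_prefix_checked(std::string_view& view, std::size_t count)
{
    if (count > view.size()) {
        throw std::out_of_range("remove_prefix_checked: count exceeds view size");
    }
    view.remove_prefix(count);
}

// Hypothetical helper: removes at most size() characters, so asking for
// too many just leaves an empty view.
inline void remove_prefix_clamped(std::string_view& view, std::size_t count)
{
    view.remove_prefix(std::min(count, view.size()));
}

// Usage sketch replaying the example above with the clamping variant.
inline void clamped_usage()
{
    std::string_view sv("hello world");
    auto sv2 = sv.substr(1, 2);     // "el"
    remove_prefix_clamped(sv2, 4);  // asks for more than is there
    assert(sv2.empty());            // behaves as an empty string
}
```

Which variant fits better depends on whether running past the end is a bug in the caller (throw) or an expected edge case (clamp).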
Should we ever use it?
Although it might sound like a data type from hell, string view has its (limited) place in your C++ toolbox. For one, it is useful for string processing in compile-time environments. It is also useful for performing basic string operations over raw char arrays:
bool custom_strcmp(const char* str1, const char* str2)
{
return std::string_view(str1) == std::string_view(str2);
}
It can also help you optimize code with certain string transformations — just make a copy into std::string before you try printing it.
In all cases, remember to keep your view close to your data source: once the source data is destroyed, the view dangles, and reading through it will cause nasty things to happen.
When not to use std::string_view?
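- For passing data into functions when the caller already owns a std::string: const std::string& costs no more there and carries no lifetime surprises (with other sources, such as string literals, it may construct a temporary std::string)
- For returning string data from functions: the underlying data may well cease to exist when the function returns
- For passing data between classes, threads, etc.: always use the view locally, where you can ensure the lifetime of the referenced string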
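The rules above can be sketched as a pattern: take the owning string by const reference, do the slicing through a local view, and hand back an owning std::string. The function extension_of is a hypothetical example, not taken from any library.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns the text after the last '.', or an empty
// string when `path` contains no '.'.
inline std::string extension_of(const std::string& path)
{
    std::string_view view{path};          // local view; `path` outlives it
    const auto dot = view.rfind('.');
    if (dot == std::string_view::npos) {
        return {};
    }
    view.remove_prefix(dot + 1);          // dot < size(), so dot + 1 <= size()
    return std::string(view);             // the only copy, made at the very end
}

// Usage sketch.
inline void extension_of_usage()
{
    assert(extension_of("archive.tar.gz") == "gz");
    assert(extension_of("README").empty());
}
```

The caller receives its own string, so nothing it holds depends on the lifetime of the argument.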
Conclusion
String view is a useful tool for certain string operations and can help optimize code by eliminating extra copies during string transformations. However, it is important to be aware of its limitations and pitfalls, such as the risk of accessing data that no longer exists or reduced safety when going out of bounds. Keep it close to the source data, and you should be just fine.