Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::string and std::string_view support in the same way for setters and constructors as for const char * #73

Open
biodranik opened this issue Dec 17, 2015 · 21 comments

Comments

@biodranik
Copy link

biodranik commented Dec 17, 2015

Hi and thanks for amazing library!

Do you plan to add std::string and std::string_view overloads for most setters/constructors? It makes code usage much more convenient.

@DasNaCl
Copy link

DasNaCl commented Jan 5, 2016

This might become much more interesting when the standard library will have std::string_view

@zeux
Copy link
Owner

zeux commented Jan 10, 2016

Here's some previous discussion on the topic: https://code.google.com/p/pugixml/issues/detail?id=49

That thread also has different implementations of a possible solution.

@zeux
Copy link
Owner

zeux commented Apr 3, 2016

I experimented a bit with string_arg approach (the link above appears to be dead, but basically one way to implement this is to replace all existing const char_t* arguments with string_arg which would be a custom wrapper type for a char_t pointer with implicit ctor from const char_t* and const std::basic_string<char_t>&.

The biggest issue apart from breaking binary compatibility (which would mean I'd like to defer this until pugixml 2.0 to avoid potential issues with existing applications that link to pugixml dynamically) is the overload issues with xml_attribute::set_value / xml_text::set, where bool overload is now selected instead of string_arg one (which is probably fixable by keeping the const char_t* overload...)

Overall this seems like a good solution. There's obviously extra overhead in debug builds but there is no cost if compiler inlines the helper (which it should when optimizations are enabled).

@arturbac
Copy link

arturbac commented Nov 3, 2019

Do you plan to add std::string overloads for most setters/constructors? It makes code usage much more convenient.

It would be better to support std::string_view, it will cover any strings and will solve problem with using unterminated strings.
Actualy with modern project which uses std::string_view or boost::string_view instead of raw char const * every time i need to pass data to pugi i have to create temporary string for null termination

Btw - great library i love projects which have stl compatibile interface

@davidhunter22
Copy link

A smaller but still useful change would be to have a
bool xml_node::set_value(const char_t* rhs, size_t strlength );
and just not do the impl::strlength(rhs) call in the implementation. The existing set_value can just be
bool xml_node::set_value(const char_t* rhs ) { return set_value( rhs, impl::strlength(rhs) ); }

@air2
Copy link

air2 commented Nov 28, 2019

I think std::(w)string_view is much more desirable as std::(w)string. And constructing a string_view from a std::string is very cheap, however constructing a string from a string_view is very expensive, so I would skip the std::string implementation and favor the std::string_view implementation.

@Sailanarmo
Copy link

@air2 The question you would need to ask yourself is if you need a null-terminated string or not. Because if that is the case then std::string_view would not be the way to go per-say. It's a difficult question. However I think in this case I can't think of a reason why std::string_view couldn't be used. However, this topic may be long dead as of now since it hasn't been updated for a while.

The other issue would require those who use this library to require a compliant C++17 compiler. With const std::string& it would only require C++11 which I feel a lot more compilers have as their default now.

@matt77hias
Copy link

matt77hias commented Feb 13, 2020

The other issue would require those who use this library to require a compliant C++17 compiler. With const std::string& it would only require C++11 which I feel a lot more compilers have as their default now.

It is not required to use C++17. Libraries such as fmt and spdlog (which uses fmt) use a custom tailored std::string_view for pre-C++17, because std::string_view uses/exploits no C++17 specific features and could thus be used pre-C++17.

@Sailanarmo
Copy link

@matt77hias But that would require the dependencies of an external library which I feel defeats the purpose of this library having to work on it's own without any dependencies except for the C headers that it uses.

@matt77hias
Copy link

matt77hias commented Feb 14, 2020 via email

@ytimenkov
Copy link

Is there any progress on this? As mentioned in #73 (comment) there is no need to depend on any new functionality, just provide an overload which accepts size parameter (which already is done internally).
This doesn't break any binary compatibility either: in worst case (app built with newer version load old version of pugixml shared library) there will be a dynamic linker error.

It is not always input value has terminating null and adding one means creating an unnecessary copy on heap.

@ytimenkov
Copy link

The biggest issue ... is the overload issues with xml_attribute::set_value / xml_text::set, where bool overload is now selected instead of string_arg one (which is probably fixable by keeping the const char_t* overload...)

That's why you need std::enable_if overloads to narrow down the set. But this probably better to postpone to 2.0 where you can just straight to concepts 😄

@halx99
Copy link

halx99 commented Dec 23, 2021

This one halx99#2 done setter/getter with string_view and other improvements, but the PUGIXML_COMPACT was broken.

@biodranik biodranik changed the title std::string support in the same way for setters and constructors as for const char * std::string and std::string_view support in the same way for setters and constructors as for const char * Feb 2, 2023
@biodranik
Copy link
Author

biodranik commented Feb 2, 2023

7 years passed :) Is it feasible to propose a pull request with the necessary changes? Will it be accepted?

pugixml performance will be better with string and string_view, why it wasn't considered yet?

@eugeneko
Copy link

eugeneko commented Apr 6, 2023

@zeux maybe there should be separate branch for possibly breaking string_view-friendly API for people who need it?

I am geniunely consider dropping or forking pugixml due to being unable to use string_view in my own API that calls pugixml under the hood.

@biodranik
Copy link
Author

A C++ library should have a modern and fast C++ interface...

@zeux
Copy link
Owner

zeux commented Apr 6, 2023

Please keep the comments on topic and substantive.

In a recent change, several methods (like set_value and xml_text::set) gained support for size_t argument. To what extent adding string_view support just for these, and keeping functions that deal with attribute/element names as they are, would satisfy the need here?

It's of course possible to add string_view overloads for every method. That doubles the interface size for the library, and requires alternate implementations in certain cases where currently the internals assume null terminated inputs - this is obviously possible, but also not ideal.

The original idea for this issue has been to replace all methods with something that uses a wrapper, but given the binary compatibility concerns and uncertain timeline for 2.0 it's not clear that this is actually optimal.

The reason why names might be special is that code is generally more likely to use string literals than dynamic strings for them.

@eugeneko
Copy link

eugeneko commented Apr 7, 2023

In a recent change, several methods (like set_value and xml_text::set) gained support for size_t argument. To what extent adding string_view support just for these, and keeping functions that deal with attribute/element names as they are, would satisfy the need here?

@zeux For me personally, I don't really care about supporting std::string_view per se because I use custom string_view anyway. My issue is about using non-0-terminated strings (coming from my code) in pugixml API in general.

Perfect solution for me would be to have optional size_t parameter for each function consuming const char*, or any functionally equivalent solution (like having wrapper class, or just having std::string_view)

@zeux
Copy link
Owner

zeux commented Apr 7, 2023

As mentioned above, certain functions like xml_node::set_value / xml_text::set have overloads that accept size_t, see #490. This is not part of a versioned release yet. Do you have examples of the code for which these APIs do not suffice?

@eugeneko
Copy link

eugeneko commented Apr 7, 2023

Do you have examples of the code for which these APIs do not suffice?

I have XMLElement::GetChild(name), where XMLElement is a wrapper on top of pugixml that provides utility functions useful in my code.
Then I have virtual Archive::Serialize(name, value) which may or may not use XML in the implementation.
Both Archive and XMLElement are public API in my library.

My issue is that I am effectively forbidden from using string_view in any API built on top of pugixml. I am forced to use const char* in my public API because otherwise I cannot work with pugixml hidden deep in the implementation details w/o doing string allocation.

@zeux
Copy link
Owner

zeux commented Aug 24, 2023

For the wrapper library use case it feels fairly easy to implement GetChild(name) as something like

for (pugi::xml_node child : node)
    if (child.name() == name)
         return wrap(child);

(I do think that for direct use for people who are used to string_view it would be more ergonomic to support it directly as right now you need to use .data()/.size() and call the overload that supports size which isn’t ideal)

edit: the above doesn’t cover set_name, so that probably does need an extra overload (now part of master).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests