New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add takestring! and make String(::Memory)
not truncate
#54372
base: master
Are you sure you want to change the base?
Conversation
Also, check for uses of julia/stdlib/FileWatching/src/pidfile.jl Lines 283 to 288 in f3561ae
#53962 |
The name makes me think that |
I would be in favor of that. Every time I have used an IOBuffer I have had to go to the docs to figure out how to get the data out of it correctly. |
So what's the specification of |
It does not truncate - except when the argument is a |
Do we need an underscore here? |
Triage had a long talk about this, #54369, #54156 and #54273. The conclusions we came to are
|
The style guide says not to use underscores "when readable" without, which I think applies here. |
String(::Memory)
not truncateString(::Memory)
not truncate
Can this be refined to "mutating an Array after its alias has been turned into a String is UB"? edit: For the record , I don't think it's right for us to put this contract on an API that's not marked |
Yes. observing it is not great but not UB. the only UB is if you mutate it. |
|
After triage, this PR now contains the following changes:
|
This commit builds upon JuliaLang#54438, re-using the old String(::Vector{UInt8}) impl, which now no longer truncates. Also remove takestring!(::Memory) - that function probably should not be defined
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
And use it unstead of takestring!(::Memory)
This should be good to go as soon as tests pass. See this comment for a complete list of changes in this PR: #54372 (comment) |
The current behavior is already approved by triage (#24388 (comment)) so it feels a bit odd that there is a new triage later that just goes and overrides this without any reference (at least from what I can see) to the original decision. This also lacks review from people heavily invested in the original design like @JeffBezanson. |
@KristofferC It is useful context and was not discussed in the previous triage call. That said, I think none of what triage said in this call contradicts the previous call. Almost all of the decision points for triage here were about the behavior when The one part where the current triage call somewhat diverged from the previous is the decision that a new function that does the combination of removing data from an |
Is the problem here then that we don't have any way to do a safe runtime "move" out of a Memory? I'm aware of three options for a copy/performance tradeoff like this:
I think our policy on the ownership theft of I'm assuming that's not an option for |
|
On @KristofferC 's recommendation, I have now read through the old decisions in #24388, #25241, #25846 and #26093. In #24388, the two main points are: The other PRs are less relevant - they pertain to
Which essentially suggests encapsulating this whole sketchy thing with creating strings from mutable data in a single function, as done here. So, I just don't see how the old triage decisions conflict with the PR here, and given that triage has approved this already, I think we should move forward with it. To motivate it a little more, I think the status quo is not acceptable because it leads to memory bugs too easily: Here is a way that works on released versions of Julia unless this PR or #54452 is applied to mutate strings b = IOBuffer()
write(b, "abc")
v1 = take!(b)
v2 = view(v1, :)
s = String(v1) # "abc"
fill!(v2, 0)
s # now all zeros Here is a bug on master where a vector now is able to read from out of bounds (fixed by this PR): v1 = Memory{UInt8}(codeunits("abc"))
v2 = view(v1, 1:3)
s = String(v1)
println(isempty(v2.ref.mem)) # true
v2[1] # accesses v2.ref.mem OOB More broadly speaking, I'm not suggesting that we compromise on our APIs in order to categorically prevent unsafe behaviour. We're not Rust, and need to be practical about what is unsafe. However:
|
@jakobnissen Thank you for your tireless efforts here. I do like From the implementation here the difference between |
Thanks for the feedback. I've now changed both The Edit: I think the 32-bit Windows test failure is spurious. Tests do seem to pass. Edit2: With these changes, function foo(data)
buf = IOBuffer()
write(buf, data)
String(take!(buf)) # or takestring!(buf) or unsafe_takestring!(buf)
end
@benchmark foo("abc") Takes the following time:
|
This function creates a string from a byte vector, truncating the vector and reusing its memory where possible.
This also removes the truncating behaviour of
String(::Memory)
- see #54369Still needs to be done
take_string!(::Memory)
should be used instead ofString(::Memory)
for efficiency [edit: couldn't find any]Closes #54369