Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] XContentBuilder.toString() will close the builder itself #13731

Open
chishui opened this issue May 17, 2024 · 2 comments · May be fixed by #13809
Open

[BUG] XContentBuilder.toString() will close the builder itself #13731

chishui opened this issue May 17, 2024 · 2 comments · May be fixed by #13809
Labels
bug Something isn't working Libraries Lucene Upgrades and Libraries, Any 3rd party library that Core depends on, ex: nebula; team is respo

Comments

@chishui
Copy link
Contributor

chishui commented May 17, 2024

Describe the bug

We are using XContentBuilder in ml-commons plugin https://github.com/opensearch-project/ml-commons/blob/0483d14f1ce39af546633ee460e0ec7afa54ccbd/common/src/main/java/org/opensearch/ml/common/connector/functions/preprocess/DefaultPreProcessFunction.java#L45.

When I debug code linked above, I found mlInput.toXContent(builder, EMPTY_PARAMS); would throw an NPE exception

java.lang.NullPointerException: Cannot store to byte/boolean array because "this._outputBuffer" is null
	at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeStartObject(UTF8JsonGenerator.java:389)
	at org.opensearch.common.xcontent.json.JsonXContentGenerator.writeStartObject(JsonXContentGenerator.java:168)
	at org.opensearch.core.xcontent.XContentBuilder.startObject(XContentBuilder.java:286)
	at org.opensearch.ml.common.input.MLInput.toXContent(MLInput.java:139)

However, if I set breakpoint not right at mlInput.toXContent(builder, EMPTY_PARAMS); but the line below, there is no NPE.

With help of @zane-neo, I realized that XContentBuilder.toString() will call BytesReference.bytes(this) and BytesReference will close the XContentBuilder before making bytes.

The issue I saw was because the debugger would call XContentBuilder.toString() which close the instance which was very unexpected. This seems to be a very weird behavior and could cause some serious issue when people accidentally call toString(). And since XContentBuilder already extends Closeable, it's a contract that people should call close explicitly or put builder in a try clause, we shouldn't use toString() as an implicit signal to close XContentBuilder.

Related component

Libraries

To Reproduce

try (XContentBuilder builder = XContentFactory.jsonBuilder()) {
       mlInput.toXContent(builder, EMPTY_PARAMS); // add a break point at this line

or

XContentBuilder builder = XContentFactory.jsonBuilder()
String str = builder.toString();
mlInput.toXContent(builder, EMPTY_PARAMS);

Expected behavior

Implicit or explicit call of toString() of XContentBuilder should not unexpected close it.

Additional Details

Plugins
The issue was found in ml-commons plugin, but it's a general issue.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@chishui chishui added bug Something isn't working untriaged labels May 17, 2024
@github-actions github-actions bot added the Libraries Lucene Upgrades and Libraries, Any 3rd party library that Core depends on, ex: nebula; team is respo label May 17, 2024
@andrross
Copy link
Member

@chishui Thanks for reporting this! I think it is a pretty common with the builder pattern that a builder isn't necessarily mutable after "building" it. It's a fair question as to whether toString-ing the builder should implicitly "build" it. I don't know exactly the specifics here, but in order to avoid this problem you may well have to make a deep copy of the builder state when toString-ing it, which would have performance implications. Would you care to put up a PR attempting to fix this? I would also probably be okay with just explicitly documenting the toString() method as a terminal operation on the builder that closes it. The BytesReference::bytes method is properly documented.

@chishui
Copy link
Contributor Author

chishui commented May 24, 2024

Many implementations neither explicitly call close() nor use try (XContentBuilder buildre = ...), they just return builder.toString(). Seems not feasible to fix all of them, will add a comment for now.

@chishui chishui linked a pull request May 24, 2024 that will close this issue
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Libraries Lucene Upgrades and Libraries, Any 3rd party library that Core depends on, ex: nebula; team is respo
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants