Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OpenSearch NullPointerException for "scope" UUID that is not a community or collection #9482

Conversation

dsteelma-umd
Copy link
Contributor

@dsteelma-umd dsteelma-umd commented Apr 18, 2024

References

Fixes #9481

Description

Fixes a NullPointerException that occurs when the OpenSearch endpoint receives a "scope" parameter that is a valid UUID, but which does not correspond to a community or collection.

While it is not believed to be possible to trigger this NullPointerException through the GUI, on our production system we have seen badly-behaved crawlers make thousands of requests an hour to the OpenSearch endpoint with UUIDs that do not correspond to a community or collection.

This pull request fixes the issue in the same manner as when an invalid UUID, or some other error, occurs when retrieving the scope -- the "resolveScope" method in "org.dspace.app.rest.utils.ScopeResolver" returns null, and ultimately an "unscoped" OpenSearch result is returned to the requestor.

Instructions for Reviewers

The NullPointerException occurs due to the following code (lines 44-46)
of the "resolveScope" in the org.dspace.app.rest.utils.ScopeResolver:

  if (scopeObj.getIndexedObject() == null) {
      scopeObj = new IndexableCollection(collectionService.find(context, uuid));
  }

The collectionService.find(context, uuid) call will return null if the
given uuid is not a collection. This causes a NullPointerException when the getID() method is in called on the null object when the OpenSearchController.search method attempts to use it as part of the Solr search.

List of changes in this PR:

  • The pull request adds an additional null check to determine if the collection retrieval request has returned null, and if so, emits a warning to the log, and sets the scopeObj to null, which is then returned. This is consistent with the resolveScope method's behavior of returning a null scopeObj if the UUID has an invalid format, or a SQLException occurs when retrieving the UUID. This results in an "unscoped" OpenSearch response being sent back to the requestor (i.e., an OpenSearch response equivalent to the "scope" parameter not being provided in the request).

  • Added a scopeNotCommunityOrCollectionUUIDTest integration test to the org.dspace.app.opensearch.OpenSearchControllerIT to verify that an "unscoped" response is returned when the "scope" parameter in the request in a valid UUID that is not a community or collection.

Include guidance for how to test or review your PR.

Note: The fix provided by this pull request is a minimal change which is consistent with the existing OpenSearch behavior that when given an invalid "scope" parameter, returns an "unscoped" OpenSearch response.

It could be argued it would be better to return an HTTP Status 400 (Bad Request) response when an invalid "scope" parameter is provided, but that would not be consistent with existing behavior (and changing the existing behavior seems outside the scope of this issue).

Verification Steps

The following steps require:

  1. Run the following command to monitor the DSpace back-end log:
docker logs -f dspace
  1. Run the following "curl" command to access the OpenSearch endpoint with a "scope" parameter that is a valid UUID (b68f0d1c-7316-41dc-835d-46b79b35642e), but is not a community or collection:
curl 'http://localhost:8080/server/opensearch/search?format=atom&scope=b68f0d1c-7316-41dc-835d-46b79b35642e&query=*'

Verify that:

  • an XML response is returned (not an HTTP status 500 error)
  • In the DSpace log, there is a warning similar to the following:
    2024-04-18 14:44:08,424 WARN  unknown 256eec7b-e13c-4ca4-bcb9-1c9c3d960b64 org.dspace.app.rest.utils.ScopeResolver @ The given scope string b68f0d1c-7316-41dc-835d-46b79b35642e is not a collection or community UUID
    

Checklist

This checklist provides a reminder of what we are going to look for when reviewing your PR. You need not complete this checklist prior to creating your PR (draft PRs are always welcome). If you are unsure about an item in the checklist, don't hesitate to ask. We're here to help!

  • My PR is small in size (e.g. less than 1,000 lines of code, not including comments & integration tests). Exceptions may be made if previously agreed upon.
  • My PR passes Checkstyle validation based on the Code Style Guide.
  • My PR includes Javadoc for all new (or modified) public methods and classes. It also includes Javadoc for large or complex private methods.
  • My PR passes all tests and includes new/updated Unit or Integration Tests based on the Code Testing Guide.
  • If my PR includes new libraries/dependencies (in any pom.xml), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.
  • If my PR modifies REST API endpoints, I've opened a separate REST Contract PR related to this change.
  • If my PR includes new configurations, I've provided basic technical documentation in the PR itself.
  • If my PR fixes an issue ticket, I've linked them together.

Fixes a NullPointerException when the "scope" parameter provided to
the OpenSearch endpoint is a valid UUID, but is not a UUID associated
with a Community or Collection.

Instead of throwing a NullPointerException, this change modifies the
code to return a null scope (resulting in an "unscoped" OpenSearch
request), which is the same behavior that occurs when the UUID
is invalid, or otherwise not usable.
@tdonohue tdonohue added bug 1 APPROVAL pull request only requires a single approval to merge. port to main This PR needs to be ported to `main` branch for the next major release labels Apr 18, 2024
@tdonohue tdonohue added this to the 8.0 milestone Apr 18, 2024
@dsteelma-umd
Copy link
Contributor Author

I tried linking this issue with #9481 by putting a "Fixes" keyword in the "References" section of pull request description, but it did not work because the base branch for the pull request is "dspace-7_x", not the default branch (as explained in https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue).

I do not have write permissions to the repository, so I am unable to manually link this pull request to the issue. If someone could do that (or explain how I can do it), I would appreciate it.

Thanks.

@tdonohue
Copy link
Member

@dsteelma-umd : I've manually linked this PR as fixing #9481

dsteelma-umd added a commit to dsteelma-umd/mdsoar that referenced this pull request Apr 22, 2024
Fixes a NullPointerException when the "scope" parameter provided to
the OpenSearch endpoint is a valid UUID, but is not a UUID associated
with a Community or Collection.

Instead of throwing a NullPointerException, this change modifies the
code to return a null scope (resulting in an "unscoped" OpenSearch
request), as if the UUID were invalid, or otherwise not usable.

This fix was provided to DSpace as Pull Request #9482 -
<DSpace/DSpace#9482>. The customization markers
added in this issue can be removed when MD-SOAR is updated to a DSpace
version containing this change.

https://umd-dit.atlassian.net/browse/LIBCIR-386
dsteelma-umd added a commit to dsteelma-umd/DSpace that referenced this pull request Apr 24, 2024
Fixes a NullPointerException when the "scope" parameter provided to
the OpenSearch endpoint is a valid UUID, but is not a UUID associated
with a Community or Collection.

Instead of throwing a NullPointerException, this change modifies the
code to return a null scope (resulting in an "unscoped" OpenSearch
request), as if the UUID were invalid, or otherwise not usable.

This fix was provided to DSpace as Pull Request DSpace#9482 -
<DSpace#9482>. The customization markers
added in this issue can be removed when DRUM is updated to a DSpace
version containing this change.

https://umd-dit.atlassian.net/browse/LIBDRUM-859
Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks @dsteelma-umd ! These changes make sense and the IT proves that it is working properly.

@tdonohue tdonohue merged commit a2b9bf2 into DSpace:dspace-7_x May 3, 2024
22 checks passed
@dspace-bot
Copy link

@tdonohue tdonohue modified the milestones: 8.0, 7.6.2 May 3, 2024
@tdonohue tdonohue removed the port to main This PR needs to be ported to `main` branch for the next major release label May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 APPROVAL pull request only requires a single approval to merge. bug
Projects
Development

Successfully merging this pull request may close these issues.

OpenSearch throws NullPointerException when "scope" UUID is not community or collection
3 participants