Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[STORM-3702] Change thrift exception classes to contain getMessage() method #3337

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

bipinprasad
Copy link
Contributor

What is the purpose of the change

*Make TException classes consistent with Exception class and eventually remove Wrapped exception classes.

How was the change tested

Recompile and run tests

@bipinprasad bipinprasad force-pushed the storm3702 branch 2 times, most recently from 5d0cc69 to a46c9cb Compare September 17, 2020 23:06
Copy link
Contributor

@agresch agresch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer someone with more thrift knowledge to look at this as well, but I like this change. Please file a JIRA to follow through on removing the deprecated Wrapped exceptions.

@Ethanlm
Copy link
Contributor

Ethanlm commented Sep 18, 2020

Since it is possible to break backwards compatibility, I am more conservative on this and would like more tests to be done.

(1) Did we get a chance to test with old clients?
(2) And did we get a chance to test python code (client and topology code written in python) since this PR added storm_exception.py?
(3) Will it break backwards compatibility for server-side plugins if they invoke old get_msg method?
(4) Or is it possible that some of the topology code invokes some "exception.get_msg" method in bolt/spout, and when the topology code runs on the upgraded cluster, throwing NoSuchMethod error?

For (3-4), my question is more like: if there is any case like these cases. I am currently not sure.

Thanks

@bipinprasad
Copy link
Contributor Author

bipinprasad commented Sep 18, 2020

(1) No new tests have been added.
Following code that uses get_msg() was changed as part of this change
- java: 685c817 and
- clojure (test): 5ea0ac6
(2) No new python testing was done - other than what is already part of the test suite (if any)
(3) Not sure I understand what server-side plugin is.
(4) Any topology (old code) should have its own (presumably old) version of TException class from its storm-client jar. Which will have get_msg() method. The new class and the old class are binary compatible (at thrift level). So that code "should" work.

I can do a more comprehensive search of get_msg() usage and see if something was overlooked - especially python code that may not cause compile or test error without proper code coverage.

@Ethanlm
Copy link
Contributor

Ethanlm commented Sep 18, 2020

(1) For old clients, I am thinking of users using old storm client to submit topologies to upgraded cluster.

(2) For old python code, I am thinking of

  • topologies depending on older storm version (with older storm python module)
  • older version of python client connecting to upgraded storm cluster

This can't be tested on unit tests.

(3) By server-side plugin, I meant plugins that storm users can implement for their cluster, for example, nimbus.topology.validator. These are couple of server-side plugins that users can implement. If those plugins invokes get_msg, it will not continue to work on upgraded cluster.

(4) The topology code depends on storm-client.jar as provided dependency. But when it is running on the cluster, it uses the cluster provided storm-client.jar. So a possible issue is that in the topology code, it invokes get_msg somewhere (since it depends on older storm-client.jar), but when it is running on upgraded cluster, storm-client.jar doesn't have get_msg method, and hence NoSuchMethod error.

Theoretically (3-4) could happen. But what I am currently not sure is that if anyone would implement a server-side plugin that invokes get_msg somewhere in their code or if anyone would have their topology invokes get_msg somewhere in their bolt/spout?

I am more confident with it if we can do comprehensive tests first. But this is not urgent. And I will also try to find some time to do some tests too. Thanks

@bipinprasad
Copy link
Contributor Author

bipinprasad commented Sep 18, 2020

Thanks for the explanation. Only way to check item (4) is to do a "binary" check on the topology jar to test for incompatible use (i.e. use of these specific exception.get_msg() call. I need to think about how to do this properly.

@bipinprasad
Copy link
Contributor Author

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

@kishorvpatil
Copy link
Contributor

@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under TException. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.

@Ethanlm
Copy link
Contributor

Ethanlm commented Sep 22, 2020

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

Another question I have related to python is

since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it import storm explicitly which includes exception classes before this PR)

@Ethanlm
Copy link
Contributor

Ethanlm commented Sep 22, 2020

@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under TException. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.

I would like to learn more about this approach. The current way of passing an exact exception class (like NotAliveException) seems fine with me

@bipinprasad
Copy link
Contributor Author

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

Another question I have related to python is

since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it import storm explicitly which includes exception classes before this PR)

Indeed. Good point. Ideally python import should remain unchanged - i.e. just single import.

# Conflicts:
#	storm-client/src/jvm/org/apache/storm/generated/AlreadyAliveException.java
#	storm-client/src/jvm/org/apache/storm/generated/AuthorizationException.java
#	storm-client/src/jvm/org/apache/storm/generated/HBAuthorizationException.java
#	storm-client/src/jvm/org/apache/storm/generated/HBExecutionException.java
#	storm-client/src/jvm/org/apache/storm/generated/IllegalStateException.java
#	storm-client/src/jvm/org/apache/storm/generated/InvalidTopologyException.java
#	storm-client/src/jvm/org/apache/storm/generated/KeyAlreadyExistsException.java
#	storm-client/src/jvm/org/apache/storm/generated/KeyNotFoundException.java
#	storm-client/src/jvm/org/apache/storm/generated/NotAliveException.java
#	storm-client/src/py/storm/ttypes.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants