
Extract common classes from src/scala/microsoft-spark-<version>. #15

Open
imback82 opened this issue Mar 19, 2019 · 5 comments · May be fixed by #819
Labels: good first issue · help wanted

Comments

imback82 (Contributor) commented Mar 19, 2019

We create multiple JARs during our builds to accommodate multiple versions of Apache Spark. In the current approach, the implementation is copied from one version to another and then the necessary changes are made.

An ideal approach would be to create a common directory and extract the shared classes out of the duplicated code. Note that even if a class is exactly the same across versions, it cannot be pulled out into a common class if it depends on Apache Spark.
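For illustration, a minimal sketch of the distinction (class and method names here are hypothetical, not from the repo): a pure helper with no Spark imports can move to a common module, while anything that touches Spark APIs must stay in the per-version projects, because Spark's binary signatures can differ between releases.

```scala
// Can move to a common module: no Spark dependency at all.
object VersionUtils {
  def majorMinor(sparkVersion: String): (Int, Int) = {
    val parts = sparkVersion.split('.')
    (parts(0).toInt, parts(1).toInt)
  }
}

// Must stay in microsoft-spark-<version>: it compiles against Spark
// classes, so it is tied to the Spark version it was built for.
import org.apache.spark.sql.SparkSession

class SessionHelper(spark: SparkSession) {
  def appName: String = spark.sparkContext.appName
}
```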

Success Criteria:

  • PR that refactors all the classes appropriately
  • Documentation for all the classes changed/added
  • Documentation on upgrading versions (if it doesn't already exist)
@rapoth rapoth transferred this issue from another repository Apr 24, 2019
@rapoth rapoth added the good first issue and help wanted labels Apr 24, 2019
spzSource (Contributor) commented

Hi @imback82,
I'm happy to volunteer to work on this ticket.

imback82 (Contributor, Author) commented

That will be great, thanks @spzSource!

spzSource (Contributor) commented

Hi @imback82,

Before starting work, I just want to confirm that I correctly understand the suggested approach.

Am I correct in saying that the intention is to create a separate Maven project (for instance, microsoft-spark-common) that would compile into a separate .jar file?

imback82 (Contributor, Author) commented

Yeah, I think that's one way to do it. But we have to make sure microsoft-spark remains a fat JAR so that we don't break existing customers' pipelines. Alternatively, we could put the common files into a common folder that the different projects build from (I'm not sure whether this is doable). Does this make sense?
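For context, one common way to satisfy the fat-JAR requirement is the maven-shade-plugin: the per-version microsoft-spark module would depend on the common module and shade its classes into the published JAR, so the artifact customers consume stays a single file. The snippet below is only a sketch; the common module's coordinates are assumptions, not taken from the repo.

```xml
<!-- Sketch only: the groupId/artifactId of the common module are assumed. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- bundle the common module into the microsoft-spark jar -->
            <include>com.microsoft.scala:microsoft-spark-common</include>
          </includes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>
```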

@spzSource spzSource linked a pull request Feb 3, 2021 that will close this issue
spzSource (Contributor) commented Feb 27, 2021

Hi @imback82,

It looks like I got stuck right after creating the common Maven module. Almost all classes inherit from org.apache.spark.util.Utils.Logging, which prevents moving them into the common module because it ties them to a specific Spark version.

Am I correct in understanding that removing the Logging inheritance is not an option? In any case, I'm happy to hear any ideas on how to mitigate the problem.
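One possible mitigation, sketched below under stated assumptions (the trait name, package placement, and the choice of SLF4J are illustrative, not from this thread): give the common module its own small logging trait backed by SLF4J, which Spark's own Logging trait also uses, so shared classes keep the same logInfo/logWarning call sites without a compile-time dependency on a specific Spark version.

```scala
// Hypothetical Spark-free stand-in for Spark's Logging trait
// (org.apache.spark.internal.Logging in Spark 2.x/3.x).
import org.slf4j.{Logger, LoggerFactory}

trait CommonLogging {
  // Lazily resolve a logger named after the concrete class.
  @transient private lazy val log: Logger =
    LoggerFactory.getLogger(getClass.getName)

  protected def logInfo(msg: => String): Unit =
    if (log.isInfoEnabled) log.info(msg)

  protected def logWarning(msg: => String): Unit =
    if (log.isWarnEnabled) log.warn(msg)

  protected def logError(msg: => String): Unit =
    if (log.isErrorEnabled) log.error(msg)
}
```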
