Question: How to use DataFrame API to achieve the function equivalent to map/reduce in spark.net #1156

Open
JunweiSUN opened this issue Aug 3, 2023 · 0 comments

Hi,
We have a scenario that requires the map/reduce functions in spark.net.
For example, we want to call:

public IEnumerable<object[]> MapCallback(IEnumerable<Row> input)
{
   // do something with `IEnumerable<Row> input`
}

df.Rdd.MapPartitions(MapCallback, true)

The point is that we need this IEnumerable<Row> input so that we can perform row-level operations.
In Mobius, we could access all the Rdd-related APIs, but according to this issue, it seems that the Rdd-related APIs are no longer accessible.

So we have the following questions:

  1. Is there any API in the current Spark.Net that is exactly equivalent to Rdd.Map, Rdd.Reduce, and the other map/reduce-related functions? Note that we need to handle an IEnumerable with an arbitrary number of elements in each row, i.e., we may not know how many elements (columns) a row has until runtime.
  2. If the answer to question 1 is no, can we download the source code, change the visibility of the Rdd-related APIs to public, and build private bits to use?
  3. Any other related suggestions would be much appreciated.

Looking forward to your answer! Thanks a lot!

@JunweiSUN JunweiSUN changed the title Question: How to using DataFrame API to achieve the function equivalent to map/reduce in spark.net Question: How to use DataFrame API to achieve the function equivalent to map/reduce in spark.net Aug 4, 2023