Spark streaming AvailableNow trigger terminates after first batch #656

Open
seb-emmot opened this issue Oct 12, 2022 · 1 comment
@seb-emmot

I am trying to build a Spark Structured Streaming application that ingests data from Azure Event Hubs and persists it to a Delta table in Databricks.
I'm using the AvailableNow trigger.
According to the Structured Streaming guide, this trigger should process all data available in the source, in multiple batches: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers

Bug Report:

  • Actual behavior
    The stream starts, processes the first batch, and then terminates.
  • Expected behavior
    The stream starts, processes all available data in micro-batches, and then terminates.
  • Spark version
    3.3.0
  • spark-eventhubs artifactId and version
    com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.22
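
To make the gap between the two behaviors concrete, here is a minimal toy model in plain Scala. It is not Spark or connector code: the object `TriggerModel` and both functions are hypothetical names introduced for illustration, and it assumes each batch holds at most `maxEventsPerTrigger` events and that the first observed batch honours that cap.

```scala
// Toy model only: plain Scala, not Spark or spark-eventhubs code.
// All names here are hypothetical and exist only for illustration.
object TriggerModel {
  // Batch sizes the report expects from AvailableNow: everything that is
  // available at start, split into chunks of at most maxEventsPerTrigger.
  def expectedBatches(available: Long, maxEventsPerTrigger: Long): Seq[Long] = {
    require(maxEventsPerTrigger > 0, "maxEventsPerTrigger must be positive")
    val full = Seq.fill((available / maxEventsPerTrigger).toInt)(maxEventsPerTrigger)
    val rest = available % maxEventsPerTrigger
    if (rest > 0) full :+ rest else full
  }

  // Batch sizes the report observes: a single capped batch, then termination.
  // Assumption: the first batch respects the cap.
  def observedBatches(available: Long, maxEventsPerTrigger: Long): Seq[Long] =
    expectedBatches(available, maxEventsPerTrigger).take(1)
}
```

For example, with 2500 events waiting and a cap of 1000, `TriggerModel.expectedBatches(2500L, 1000L)` gives three batches of 1000, 1000 and 500 events, while `TriggerModel.observedBatches(2500L, 1000L)` gives only the first batch of 1000, leaving 1500 events unprocessed until the next run.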

It seems that support for the 'AvailableNow' trigger might not be implemented?

My code:

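import org.apache.spark.eventhubs.{ConnectionStringBuilder, EventHubsConf}
import org.apache.spark.sql.streaming.Trigger

// namespace_str is assumed to hold the Event Hubs connection string
val connectionString = ConnectionStringBuilder(namespace_str)
  .setEventHubName("myhubname")
  .build

// Cap each micro-batch at 1000 events
val ehConf = EventHubsConf(connectionString)
  .setConsumerGroup("myconsumergroup")
  .setMaxEventsPerTrigger(1000)

val inStream = spark.readStream.format("eventhubs").options(ehConf.toMap).load()

// toTable starts the query and returns a StreamingQuery
val outStream = inStream.writeStream
  .outputMode("append")
  .format("delta")
  .option("checkpointLocation", checkpointLocation)
  .trigger(Trigger.AvailableNow())
  .toTable("mytablename")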
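
One way to confirm how many micro-batches actually ran is to inspect the query's progress reports once it has stopped. The following is a minimal sketch, assuming it runs in the same session as the code above so that `outStream` is the `StreamingQuery` returned by `toTable`. The helper name `printBatchSummary` is hypothetical.

```scala
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper, not part of the original report: prints one line per
// progress update that the query reported before it stopped.
def printBatchSummary(query: StreamingQuery): Unit = {
  // Block until the AvailableNow run has finished.
  query.awaitTermination()
  // recentProgress keeps only the most recent progress updates, so a very
  // long run may not show every batch here.
  val progress = query.recentProgress
  progress.foreach { p =>
    println(s"batchId=${p.batchId} numInputRows=${p.numInputRows}")
  }
  println(s"progress updates reported: ${progress.length}")
}

printBatchSummary(outStream)
```

If only a single update with `batchId=0` is listed while the event hub still holds more events, that would match the behavior described in this report.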

I previously asked a related question on Stack Overflow (using PySpark, though):
https://stackoverflow.com/questions/74025485/is-spark-streaming-availablenow-trigger-compatible-with-azure-event-hub

@hmlam hmlam self-assigned this Nov 3, 2022
@hmlam hmlam added the feature label Nov 3, 2022
@hmlam hmlam assigned yamin-msft and unassigned hmlam Nov 4, 2022
@dilisha

dilisha commented Jul 27, 2023

Hi, I am facing the same issue. Is there a fix for this, @yamin-msft @hmlam? If so, when will this feature be available?
