Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

Open
sti0 opened this issue Dec 18, 2023 · 6 comments
Open

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

sti0 opened this issue Dec 18, 2023 · 6 comments

Comments

@sti0
Copy link

sti0 commented Dec 18, 2023

I flashed my ESP-S3-Box-3 by using the esphome.io website. I had connected the box to HA. The waiting house symbol appears on the screen. There is a little snap sound after the startup was successful. After saying "ok nabu" the orange house (error) appears and no more response is possible. I'm using piper, openwakeword and whisper as local addons in my HA supervised installation.

ESPHOME shows the following output:

[22:06:00][D][api.connection:1089]: Home Assistant 2023.12.3 (xx.xx.xx.xx): Connected successfully
[22:06:00][W][component:214]: Component api took a long time for an operation (0.28 s).
[22:06:00][W][component:215]: Components should block for at most 20-30ms.
[22:06:01][D][voice_assistant:422]: State changed from IDLE to START_MICROPHONE
[22:06:01][D][voice_assistant:428]: Desired state set to WAIT_FOR_VAD
[22:06:01][W][component:214]: Component api took a long time for an operation (0.28 s).
[22:06:01][W][component:215]: Components should block for at most 20-30ms.
[22:06:01][D][voice_assistant:159]: Starting Microphone
[22:06:01][D][voice_assistant:422]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[22:06:01][D][esp-idf:000]: I (25378) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[22:06:01][D][esp-idf:000]: I (25384) I2S: I2S0, MCLK output by GPIO2

[22:06:01][D][esp-idf:000]: I (25390) AUDIO_PIPELINE: link el->rb, el:0x3d036c54, tag:i2s, rb:0x3d037068

[22:06:01][D][esp-idf:000]: I (25393) AUDIO_PIPELINE: link el->rb, el:0x3d036dc8, tag:filter, rb:0x3d0390a8

[22:06:01][D][esp-idf:000]: I (25399) AUDIO_ELEMENT: [i2s-0x3d036c54] Element task created

[22:06:01][D][esp-idf:000]: I (25402) AUDIO_THREAD: The filter task allocate stack on external memory

[22:06:01][D][esp-idf:000]: I (25409) AUDIO_ELEMENT: [filter-0x3d036dc8] Element task created

[22:06:01][D][esp-idf:000]: I (25415) AUDIO_ELEMENT: [raw-0x3d036ef8] Element task created

[22:06:01][D][esp-idf:000]: I (25419) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:16586299 Bytes, Inter:50964 Bytes, Dram:50964 Bytes


[22:06:01][D][esp-idf:000]: I (25422) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[22:06:01][D][esp-idf:000]: I (25425) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1

[22:06:01][D][esp-idf:000]: I (25429) RSP_FILTER: sample rate of source data : 16000, channel of source data : 2, sample rate of destination data : 16000, channel of destination data : 1

[22:06:01][D][esp-idf:000]: I (25432) AUDIO_PIPELINE: Pipeline started

[22:06:01][D][esp_adf.microphone:294]: Microphone started
[22:06:01][D][voice_assistant:422]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[22:06:01][D][voice_assistant:176]: Waiting for speech...
[22:06:01][D][voice_assistant:422]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[22:06:06][D][voice_assistant:189]: VAD detected speech
[22:06:06][D][voice_assistant:422]: State changed from WAITING_FOR_VAD to START_PIPELINE
[22:06:06][D][voice_assistant:428]: Desired state set to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:206]: Requesting start...
[22:06:06][D][voice_assistant:422]: State changed from START_PIPELINE to STARTING_PIPELINE
[22:06:06][D][voice_assistant:443]: Client started, streaming microphone
[22:06:06][D][voice_assistant:422]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:428]: Desired state set to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:529]: Event Type: 0
[22:06:06][E][voice_assistant:656]: Error: no_wake_word - No wake word detected
[22:06:06][D][voice_assistant:522]: Signaling stop...
[22:06:06][D][voice_assistant:422]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[22:06:06][D][voice_assistant:428]: Desired state set to IDLE
[22:06:06][D][esp_adf.microphone:256]: Stopping microphone
[22:06:06][D][voice_assistant:422]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[22:06:06][D][esp-idf:000]: W (30571) AUDIO_ELEMENT: IN-[filter] AEL_IO_ABORT

[22:06:06][D][esp-idf:000]: E (30575) AUDIO_ELEMENT: [filter] Element already stopped

[22:06:06][D][esp-idf:000]: W (30603) AUDIO_PIPELINE: There are no listener registered

[22:06:06][D][esp-idf:000]: I (30607) AUDIO_PIPELINE: audio_pipeline_unlinked

[22:06:06][D][esp-idf:000]: W (30612) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:06][D][esp-idf:000]: I (30617) I2S: DMA queue destroyed

[22:06:06][D][esp-idf:000]: W (30624) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:07][D][esp-idf:000]: W (30629) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:07][W][component:214]: Component voice_assistant took a long time for an operation (0.28 s).
[22:06:07][W][component:215]: Components should block for at most 20-30ms.
[22:06:07][D][voice_assistant:529]: Event Type: 2
[22:06:07][D][voice_assistant:619]: Assist Pipeline ended
[22:06:07][D][voice_assistant:529]: Event Type: 1
[22:06:07][D][voice_assistant:532]: Assist Pipeline running
[22:06:07][D][voice_assistant:529]: Event Type: 9
[22:06:07][D][esp_adf.microphone:306]: Microphone stopped
[22:06:07][D][voice_assistant:529]: Event Type: 0
[22:06:07][E][voice_assistant:656]: Error: wake-stream-failed - Unexpected error during wake-word-detection
[22:06:07][D][voice_assistant:522]: Signaling stop...
[22:06:07][D][voice_assistant:422]: State changed from STOPPING_MICROPHONE to STOP_MICROPHONE
[22:06:07][D][voice_assistant:428]: Desired state set to IDLE
[22:06:07][D][voice_assistant:422]: State changed from STOP_MICROPHONE to IDLE
[22:06:07][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:07][W][component:215]: Components should block for at most 20-30ms.
[22:06:07][D][voice_assistant:529]: Event Type: 2
[22:06:07][D][voice_assistant:619]: Assist Pipeline ended
[22:06:07][D][voice_assistant:422]: State changed from IDLE to START_MICROPHONE
[22:06:07][D][voice_assistant:428]: Desired state set to WAIT_FOR_VAD
[22:06:07][D][voice_assistant:159]: Starting Microphone
[22:06:07][D][voice_assistant:422]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[22:06:07][D][esp-idf:000]: I (31287) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[22:06:07][D][esp-idf:000]: I (31292) I2S: I2S0, MCLK output by GPIO2

[22:06:07][D][esp-idf:000]: I (31299) AUDIO_PIPELINE: link el->rb, el:0x3d036c54, tag:i2s, rb:0x3d037068

[22:06:07][D][esp-idf:000]: I (31303) AUDIO_PIPELINE: link el->rb, el:0x3d036dc8, tag:filter, rb:0x3d0390a8

[22:06:07][D][esp-idf:000]: I (31310) AUDIO_ELEMENT: [i2s-0x3d036c54] Element task created

[22:06:07][D][esp-idf:000]: I (31313) AUDIO_THREAD: The filter task allocate stack on external memory

[22:06:07][D][esp-idf:000]: I (31318) AUDIO_ELEMENT: [filter-0x3d036dc8] Element task created

[22:06:07][D][esp-idf:000]: I (31323) AUDIO_ELEMENT: [raw-0x3d036ef8] Element task created

[22:06:07][D][esp-idf:000]: I (31326) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:16586747 Bytes, Inter:51412 Bytes, Dram:51412 Bytes


[22:06:07][D][esp-idf:000]: I (31330) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[22:06:07][D][esp-idf:000]: I (31332) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1

[22:06:07][D][esp-idf:000]: I (31336) RSP_FILTER: sample rate of source data : 16000, channel of source data : 2, sample rate of destination data : 16000, channel of destination data : 1

[22:06:07][D][esp-idf:000]: I (31339) AUDIO_PIPELINE: Pipeline started

[22:06:07][D][esp_adf.microphone:294]: Microphone started
[22:06:07][D][voice_assistant:422]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[22:06:07][D][voice_assistant:176]: Waiting for speech...
[22:06:07][D][voice_assistant:422]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[22:06:08][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:08][W][component:215]: Components should block for at most 20-30ms.
[22:06:08][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:08][W][component:215]: Components should block for at most 20-30ms.

HA assist debug:
image

stage: done
run:
  pipeline: 01hhz9tr3494bh7s6fvc967ncw
  language: en
events:
  - type: run-start
    data:
      pipeline: 01hhz9tr3494bh7s6fvc967ncw
      language: en
    timestamp: "2023-12-18T21:06:06.836622+00:00"
  - type: wake_word-start
    data:
      entity_id: wake_word.openwakeword
      metadata:
        format: wav
        codec: pcm
        bit_rate: 16
        sample_rate: 16000
        channel: 1
      timeout: 5
    timestamp: "2023-12-18T21:06:06.837100+00:00"
  - type: error
    data:
      code: wake-stream-failed
      message: Unexpected error during wake-word-detection
    timestamp: "2023-12-18T21:06:06.847630+00:00"
  - type: run-end
    data: null
    timestamp: "2023-12-18T21:06:06.848318+00:00"
wake_word:
  entity_id: wake_word.openwakeword
  metadata:
    format: wav
    codec: pcm
    bit_rate: 16
    sample_rate: 16000
    channel: 1
  timeout: 5
  done: false
error:
  code: wake-stream-failed
  message: Unexpected error during wake-word-detection
@mackowskim
Copy link

mackowskim commented Jan 18, 2024

Very similar issue here.
The same errors from ESP side:

`
[D][esp-idf:000]: I (14121) AUDIO_PIPELINE: Pipeline started
[D][esp_adf.microphone:264]: Microphone started
[D][voice_assistant:410]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[D][voice_assistant:170]: Waiting for speech...
[D][voice_assistant:410]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[D][voice_assistant:183]: VAD detected speech
[D][voice_assistant:410]: State changed from WAITING_FOR_VAD to START_PIPELINE
[D][voice_assistant:416]: Desired state set to STREAMING_MICROPHONE
[D][voice_assistant:200]: Requesting start...
[D][voice_assistant:410]: State changed from START_PIPELINE to STARTING_PIPELINE
[D][voice_assistant:431]: Client started, streaming microphone
[D][voice_assistant:410]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[D][voice_assistant:416]: Desired state set to STREAMING_MICROPHONE
[D][voice_assistant:517]: Event Type: 0
[E][voice_assistant:644]: Error: no_wake_word - No wake word detected
[D][voice_assistant:510]: Signaling stop...
[D][voice_assistant:410]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[D][voice_assistant:416]: Desired state set to IDLE
[D][esp_adf.microphone:226]: Stopping microphone
[D][voice_assistant:410]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[D][esp-idf:000]: W (72547) AUDIO_ELEMENT: IN-[filter] AEL_IO_ABORT

[D][esp-idf:000]: E (72549) AUDIO_ELEMENT: [filter] Element already stopped

[D][esp-idf:000]: W (72580) AUDIO_PIPELINE: There are no listener registered

[D][esp-idf:000]: I (72583) AUDIO_PIPELINE: audio_pipeline_unlinked

[D][esp-idf:000]: W (72585) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[D][esp-idf:000]: I (72587) I2S: DMA queue destroyed

[D][esp-idf:000]: W (72589) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE

[D][esp-idf:000]: W (72593) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE

[W][component:214]: Component voice_assistant took a long time for an operation (0.21 s).
[W][component:215]: Components should block for at most 20-30ms.
[D][voice_assistant:517]: Event Type: 2
[D][voice_assistant:607]: Assist Pipeline ended
[D][voice_assistant:517]: Event Type: 1
[D][voice_assistant:520]: Assist Pipeline running
[D][esp_adf.microphone:276]: Microphone stopped
[D][voice_assistant:517]: Event Type: 9
[D][voice_assistant:410]: State changed from STOPPING_MICROPHONE to IDLE
[D][voice_assistant:517]: Event Type: 0
[E][voice_assistant:644]: Error: wake-stream-failed - Unexpected error during wake-word-detection
[D][voice_assistant:410]: State changed from IDLE to START_MICROPHONE
[D][voice_assistant:416]: Desired state set to WAIT_FOR_VAD
[W][component:214]: Component voice_assistant took a long time for an operation (0.21 s).
[W][component:215]: Components should block for at most 20-30ms.
[D][voice_assistant:517]: Event Type: 2
[D][voice_assistant:607]: Assist Pipeline ended
[D][voice_assistant:153]: Starting Microphone
[D][voice_assistant:410]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[D][esp-idf:000]: I (72987) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[D][esp-idf:000]: I (72992) I2S: I2S0, MCLK output by GPIO2
[D][esp-idf:000]: I (72996) AUDIO_PIPELINE: link el->rb, el:0x3d036c2c, tag:i2s, rb:0x3d037040

[D][esp-idf:000]: I (72998) AUDIO_PIPELINE: link el->rb, el:0x3d036da0, tag:filter, rb:0x3d039080
[D][esp-idf:000]: I (73003) AUDIO_ELEMENT: [i2s-0x3d036c2c] Element task created0m
[D][esp-idf:000]: I (73005) AUDIO_THREAD: The filter task allocate stack on external memory

[D][esp-idf:000]: I (73008) AUDIO_ELEMENT: [filter-0x3d036da0] Element task created

[D][esp-idf:000]: I (73010) AUDIO_ELEMENT: [raw-0x3d036ed0] Element task created
`

Did you manage to resolve it somehow?

@sti0
Copy link
Author

sti0 commented Jan 20, 2024

Sadly not, its still not working

@sti0
Copy link
Author

sti0 commented Jan 23, 2024

Issue still exists after upgrading to ESPHome 2023.12.9. Doesn't make a difference if I use openwakeword or porcupine1. Both times its crashing after saying something (even if its not the wake word)....

@mackowskim
Copy link

Exactly the same symptoms here.
What's more, I was able to reinstall original firmware and it reacts to "OK E.S.P." with no issues.
Also understands the commands, to it's not a hardware issue.

@sti0
Copy link
Author

sti0 commented Mar 2, 2024

After switching to local wake word the detection works like expected. https://github.com/esphome/firmware/tree/main/wake-word-voice-assistant

@sti0 sti0 closed this as completed Mar 2, 2024
@sti0
Copy link
Author

sti0 commented Mar 2, 2024

When switching to wake word engine location from "on device" to "in Home Assistant" the error still exists. So re-opening this ticket because its solved when using local wake word detection but one may still use the Home Assistant detection...

@sti0 sti0 reopened this Mar 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants