ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

sti0 · 2023-12-18T21:12:41Z

I flashed my ESP-S3-Box-3 by using the esphome.io website. I had connected the box to HA. The waiting house symbol appears on the screen. There is a little snap sound after the startup was successful. After saying "ok nabu" the orange house (error) appears and no more response is possible. I'm using piper, openwakeword and whisper as local addons in my HA supervised installation.

ESPHOME shows the following output:

[22:06:00][D][api.connection:1089]: Home Assistant 2023.12.3 (xx.xx.xx.xx): Connected successfully
[22:06:00][W][component:214]: Component api took a long time for an operation (0.28 s).
[22:06:00][W][component:215]: Components should block for at most 20-30ms.
[22:06:01][D][voice_assistant:422]: State changed from IDLE to START_MICROPHONE
[22:06:01][D][voice_assistant:428]: Desired state set to WAIT_FOR_VAD
[22:06:01][W][component:214]: Component api took a long time for an operation (0.28 s).
[22:06:01][W][component:215]: Components should block for at most 20-30ms.
[22:06:01][D][voice_assistant:159]: Starting Microphone
[22:06:01][D][voice_assistant:422]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[22:06:01][D][esp-idf:000]: I (25378) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[22:06:01][D][esp-idf:000]: I (25384) I2S: I2S0, MCLK output by GPIO2

[22:06:01][D][esp-idf:000]: I (25390) AUDIO_PIPELINE: link el->rb, el:0x3d036c54, tag:i2s, rb:0x3d037068

[22:06:01][D][esp-idf:000]: I (25393) AUDIO_PIPELINE: link el->rb, el:0x3d036dc8, tag:filter, rb:0x3d0390a8

[22:06:01][D][esp-idf:000]: I (25399) AUDIO_ELEMENT: [i2s-0x3d036c54] Element task created

[22:06:01][D][esp-idf:000]: I (25402) AUDIO_THREAD: The filter task allocate stack on external memory

[22:06:01][D][esp-idf:000]: I (25409) AUDIO_ELEMENT: [filter-0x3d036dc8] Element task created

[22:06:01][D][esp-idf:000]: I (25415) AUDIO_ELEMENT: [raw-0x3d036ef8] Element task created

[22:06:01][D][esp-idf:000]: I (25419) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:16586299 Bytes, Inter:50964 Bytes, Dram:50964 Bytes


[22:06:01][D][esp-idf:000]: I (25422) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[22:06:01][D][esp-idf:000]: I (25425) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1

[22:06:01][D][esp-idf:000]: I (25429) RSP_FILTER: sample rate of source data : 16000, channel of source data : 2, sample rate of destination data : 16000, channel of destination data : 1

[22:06:01][D][esp-idf:000]: I (25432) AUDIO_PIPELINE: Pipeline started

[22:06:01][D][esp_adf.microphone:294]: Microphone started
[22:06:01][D][voice_assistant:422]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[22:06:01][D][voice_assistant:176]: Waiting for speech...
[22:06:01][D][voice_assistant:422]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[22:06:06][D][voice_assistant:189]: VAD detected speech
[22:06:06][D][voice_assistant:422]: State changed from WAITING_FOR_VAD to START_PIPELINE
[22:06:06][D][voice_assistant:428]: Desired state set to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:206]: Requesting start...
[22:06:06][D][voice_assistant:422]: State changed from START_PIPELINE to STARTING_PIPELINE
[22:06:06][D][voice_assistant:443]: Client started, streaming microphone
[22:06:06][D][voice_assistant:422]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:428]: Desired state set to STREAMING_MICROPHONE
[22:06:06][D][voice_assistant:529]: Event Type: 0
[22:06:06][E][voice_assistant:656]: Error: no_wake_word - No wake word detected
[22:06:06][D][voice_assistant:522]: Signaling stop...
[22:06:06][D][voice_assistant:422]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[22:06:06][D][voice_assistant:428]: Desired state set to IDLE
[22:06:06][D][esp_adf.microphone:256]: Stopping microphone
[22:06:06][D][voice_assistant:422]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[22:06:06][D][esp-idf:000]: W (30571) AUDIO_ELEMENT: IN-[filter] AEL_IO_ABORT

[22:06:06][D][esp-idf:000]: E (30575) AUDIO_ELEMENT: [filter] Element already stopped

[22:06:06][D][esp-idf:000]: W (30603) AUDIO_PIPELINE: There are no listener registered

[22:06:06][D][esp-idf:000]: I (30607) AUDIO_PIPELINE: audio_pipeline_unlinked

[22:06:06][D][esp-idf:000]: W (30612) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:06][D][esp-idf:000]: I (30617) I2S: DMA queue destroyed

[22:06:06][D][esp-idf:000]: W (30624) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:07][D][esp-idf:000]: W (30629) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE

[22:06:07][W][component:214]: Component voice_assistant took a long time for an operation (0.28 s).
[22:06:07][W][component:215]: Components should block for at most 20-30ms.
[22:06:07][D][voice_assistant:529]: Event Type: 2
[22:06:07][D][voice_assistant:619]: Assist Pipeline ended
[22:06:07][D][voice_assistant:529]: Event Type: 1
[22:06:07][D][voice_assistant:532]: Assist Pipeline running
[22:06:07][D][voice_assistant:529]: Event Type: 9
[22:06:07][D][esp_adf.microphone:306]: Microphone stopped
[22:06:07][D][voice_assistant:529]: Event Type: 0
[22:06:07][E][voice_assistant:656]: Error: wake-stream-failed - Unexpected error during wake-word-detection
[22:06:07][D][voice_assistant:522]: Signaling stop...
[22:06:07][D][voice_assistant:422]: State changed from STOPPING_MICROPHONE to STOP_MICROPHONE
[22:06:07][D][voice_assistant:428]: Desired state set to IDLE
[22:06:07][D][voice_assistant:422]: State changed from STOP_MICROPHONE to IDLE
[22:06:07][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:07][W][component:215]: Components should block for at most 20-30ms.
[22:06:07][D][voice_assistant:529]: Event Type: 2
[22:06:07][D][voice_assistant:619]: Assist Pipeline ended
[22:06:07][D][voice_assistant:422]: State changed from IDLE to START_MICROPHONE
[22:06:07][D][voice_assistant:428]: Desired state set to WAIT_FOR_VAD
[22:06:07][D][voice_assistant:159]: Starting Microphone
[22:06:07][D][voice_assistant:422]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[22:06:07][D][esp-idf:000]: I (31287) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[22:06:07][D][esp-idf:000]: I (31292) I2S: I2S0, MCLK output by GPIO2

[22:06:07][D][esp-idf:000]: I (31299) AUDIO_PIPELINE: link el->rb, el:0x3d036c54, tag:i2s, rb:0x3d037068

[22:06:07][D][esp-idf:000]: I (31303) AUDIO_PIPELINE: link el->rb, el:0x3d036dc8, tag:filter, rb:0x3d0390a8

[22:06:07][D][esp-idf:000]: I (31310) AUDIO_ELEMENT: [i2s-0x3d036c54] Element task created

[22:06:07][D][esp-idf:000]: I (31313) AUDIO_THREAD: The filter task allocate stack on external memory

[22:06:07][D][esp-idf:000]: I (31318) AUDIO_ELEMENT: [filter-0x3d036dc8] Element task created

[22:06:07][D][esp-idf:000]: I (31323) AUDIO_ELEMENT: [raw-0x3d036ef8] Element task created

[22:06:07][D][esp-idf:000]: I (31326) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:16586747 Bytes, Inter:51412 Bytes, Dram:51412 Bytes


[22:06:07][D][esp-idf:000]: I (31330) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[22:06:07][D][esp-idf:000]: I (31332) AUDIO_ELEMENT: [filter] AEL_MSG_CMD_RESUME,state:1

[22:06:07][D][esp-idf:000]: I (31336) RSP_FILTER: sample rate of source data : 16000, channel of source data : 2, sample rate of destination data : 16000, channel of destination data : 1

[22:06:07][D][esp-idf:000]: I (31339) AUDIO_PIPELINE: Pipeline started

[22:06:07][D][esp_adf.microphone:294]: Microphone started
[22:06:07][D][voice_assistant:422]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[22:06:07][D][voice_assistant:176]: Waiting for speech...
[22:06:07][D][voice_assistant:422]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[22:06:08][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:08][W][component:215]: Components should block for at most 20-30ms.
[22:06:08][W][component:214]: Component voice_assistant took a long time for an operation (0.27 s).
[22:06:08][W][component:215]: Components should block for at most 20-30ms.

HA assist debug:

stage: done
run:
  pipeline: 01hhz9tr3494bh7s6fvc967ncw
  language: en
events:
  - type: run-start
    data:
      pipeline: 01hhz9tr3494bh7s6fvc967ncw
      language: en
    timestamp: "2023-12-18T21:06:06.836622+00:00"
  - type: wake_word-start
    data:
      entity_id: wake_word.openwakeword
      metadata:
        format: wav
        codec: pcm
        bit_rate: 16
        sample_rate: 16000
        channel: 1
      timeout: 5
    timestamp: "2023-12-18T21:06:06.837100+00:00"
  - type: error
    data:
      code: wake-stream-failed
      message: Unexpected error during wake-word-detection
    timestamp: "2023-12-18T21:06:06.847630+00:00"
  - type: run-end
    data: null
    timestamp: "2023-12-18T21:06:06.848318+00:00"
wake_word:
  entity_id: wake_word.openwakeword
  metadata:
    format: wav
    codec: pcm
    bit_rate: 16
    sample_rate: 16000
    channel: 1
  timeout: 5
  done: false
error:
  code: wake-stream-failed
  message: Unexpected error during wake-word-detection

The text was updated successfully, but these errors were encountered:

mackowskim · 2024-01-18T16:09:07Z

Very similar issue here.
The same errors from ESP side:

`
[D][esp-idf:000]: I (14121) AUDIO_PIPELINE: Pipeline started
[D][esp_adf.microphone:264]: Microphone started
[D][voice_assistant:410]: State changed from STARTING_MICROPHONE to WAIT_FOR_VAD
[D][voice_assistant:170]: Waiting for speech...
[D][voice_assistant:410]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[D][voice_assistant:183]: VAD detected speech
[D][voice_assistant:410]: State changed from WAITING_FOR_VAD to START_PIPELINE
[D][voice_assistant:416]: Desired state set to STREAMING_MICROPHONE
[D][voice_assistant:200]: Requesting start...
[D][voice_assistant:410]: State changed from START_PIPELINE to STARTING_PIPELINE
[D][voice_assistant:431]: Client started, streaming microphone
[D][voice_assistant:410]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[D][voice_assistant:416]: Desired state set to STREAMING_MICROPHONE
[D][voice_assistant:517]: Event Type: 0
[E][voice_assistant:644]: Error: no_wake_word - No wake word detected
[D][voice_assistant:510]: Signaling stop...
[D][voice_assistant:410]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[D][voice_assistant:416]: Desired state set to IDLE
[D][esp_adf.microphone:226]: Stopping microphone
[D][voice_assistant:410]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[D][esp-idf:000]: W (72547) AUDIO_ELEMENT: IN-[filter] AEL_IO_ABORT

[D][esp-idf:000]: E (72549) AUDIO_ELEMENT: [filter] Element already stopped

[D][esp-idf:000]: W (72580) AUDIO_PIPELINE: There are no listener registered

[D][esp-idf:000]: I (72583) AUDIO_PIPELINE: audio_pipeline_unlinked

[D][esp-idf:000]: W (72585) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[D][esp-idf:000]: I (72587) I2S: DMA queue destroyed

[D][esp-idf:000]: W (72589) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE

[D][esp-idf:000]: W (72593) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE

[W][component:214]: Component voice_assistant took a long time for an operation (0.21 s).
[W][component:215]: Components should block for at most 20-30ms.
[D][voice_assistant:517]: Event Type: 2
[D][voice_assistant:607]: Assist Pipeline ended
[D][voice_assistant:517]: Event Type: 1
[D][voice_assistant:520]: Assist Pipeline running
[D][esp_adf.microphone:276]: Microphone stopped
[D][voice_assistant:517]: Event Type: 9
[D][voice_assistant:410]: State changed from STOPPING_MICROPHONE to IDLE
[D][voice_assistant:517]: Event Type: 0
[E][voice_assistant:644]: Error: wake-stream-failed - Unexpected error during wake-word-detection
[D][voice_assistant:410]: State changed from IDLE to START_MICROPHONE
[D][voice_assistant:416]: Desired state set to WAIT_FOR_VAD
[W][component:214]: Component voice_assistant took a long time for an operation (0.21 s).
[W][component:215]: Components should block for at most 20-30ms.
[D][voice_assistant:517]: Event Type: 2
[D][voice_assistant:607]: Assist Pipeline ended
[D][voice_assistant:153]: Starting Microphone
[D][voice_assistant:410]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[D][esp-idf:000]: I (72987) I2S: DMA Malloc info, datalen=blocksize=512, dma_buf_count=8

[D][esp-idf:000]: I (72992) I2S: I2S0, MCLK output by GPIO2
[D][esp-idf:000]: I (72996) AUDIO_PIPELINE: link el->rb, el:0x3d036c2c, tag:i2s, rb:0x3d037040

[D][esp-idf:000]: I (72998) AUDIO_PIPELINE: link el->rb, el:0x3d036da0, tag:filter, rb:0x3d039080
[D][esp-idf:000]: I (73003) AUDIO_ELEMENT: [i2s-0x3d036c2c] Element task created0m
[D][esp-idf:000]: I (73005) AUDIO_THREAD: The filter task allocate stack on external memory

[D][esp-idf:000]: I (73008) AUDIO_ELEMENT: [filter-0x3d036da0] Element task created

[D][esp-idf:000]: I (73010) AUDIO_ELEMENT: [raw-0x3d036ed0] Element task created
`

Did you manage to resolve it somehow?

sti0 · 2024-01-20T21:05:14Z

Sadly not, its still not working

sti0 · 2024-01-23T20:02:46Z

Issue still exists after upgrading to ESPHome 2023.12.9. Doesn't make a difference if I use openwakeword or porcupine1. Both times its crashing after saying something (even if its not the wake word)....

mackowskim · 2024-01-26T10:17:02Z

Exactly the same symptoms here.
What's more, I was able to reinstall original firmware and it reacts to "OK E.S.P." with no issues.
Also understands the commands, to it's not a hardware issue.

sti0 · 2024-03-02T22:58:11Z

After switching to local wake word the detection works like expected. https://github.com/esphome/firmware/tree/main/wake-word-voice-assistant

sti0 · 2024-03-02T23:04:11Z

When switching to wake word engine location from "on device" to "in Home Assistant" the error still exists. So re-opening this ticket because its solved when using local wake word detection but one may still use the Home Assistant detection...

sti0 closed this as completed Mar 2, 2024

sti0 reopened this Mar 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

sti0 commented Dec 18, 2023

mackowskim commented Jan 18, 2024 •

edited

sti0 commented Jan 20, 2024

sti0 commented Jan 23, 2024

mackowskim commented Jan 26, 2024

sti0 commented Mar 2, 2024

sti0 commented Mar 2, 2024

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

ESP-S3-BOX-3: Unexpected error during wake-word-detection #132

Comments

sti0 commented Dec 18, 2023

mackowskim commented Jan 18, 2024 • edited

sti0 commented Jan 20, 2024

sti0 commented Jan 23, 2024

mackowskim commented Jan 26, 2024

sti0 commented Mar 2, 2024

sti0 commented Mar 2, 2024

mackowskim commented Jan 18, 2024 •

edited