How to Build Your Own Local Voice Assistant with ReSpeaker Lite and Home Assistant
Author: Amrut Prabhu (@amrutprabhu42)
In this guide, we’ll walk through setting up your own local voice assistant using Seeed Studio’s ReSpeaker Lite Voice Assistant Kit and Home Assistant. With this setup, you can enjoy voice command functionality without relying on cloud-based services — all while keeping your data secure and private. Plus, the process is straightforward: no soldering or wiring required! Let’s dive in.
What You’ll Need
To start, you’ll need a few specific components:
- ReSpeaker Lite with XIAO ESP32S3
- 5W Speaker
- Acrylic Enclosure (optional)
You can buy these from here, or alternatively from AliExpress.
Make sure to select the ESP32-S3 version of the ReSpeaker Lite and a 5W speaker for the best compatibility. The acrylic enclosure is optional but recommended if you want a polished look; it costs around $5. Alternatively, you can 3D print a custom case, which I did to personalize my setup.
Step 1: Updating the Firmware
Before starting, you’ll need to update the firmware on the ReSpeaker board. Here’s how:
- Connect the Device: Use the USB port next to the microphone to connect the board to your computer.
- Install dfu-util: Install dfu-util for your operating system (Windows, macOS, or Linux) from here.
- Download the Firmware: Get the latest I2S firmware from Seeed Studio’s website. At the time of writing, the latest version is 1.0.9. Make sure you download the I2S firmware and not the USB firmware. You can download the firmware from here.
- Flash the Firmware: Once downloaded, flash the firmware onto your device using the appropriate command as listed in the documentation.
Step 2: Adding On-Device Wake Word Detection
To enable wake-word detection, we’ll use ESPHome, which supports Micro Wake Word. This feature runs wake-word detection on the device itself, so it can listen continuously for the wake word and respond quickly.
Here’s the ESPHome YAML configuration:
esphome:
  name: respeaker-lite
  friendly_name: ReSpeaker-lite
  platformio_options:
    board_build.flash_mode: dio
    board_build.mcu: esp32s3
  on_shutdown:
    then:
      # Prevent loud noise on software restart
      - lambda: id(respeaker).mute_speaker();

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"

psram:
  mode: octal
  speed: 80MHz

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "r/Sisfr4iti6gYFTiTP0/ip7PWtf59Utc1ILfK92Sco="
  on_client_connected:
    then:
      - delay: 50ms
      - light.turn_off: led
      - micro_wake_word.start:
  on_client_disconnected:
    then:
      - voice_assistant.stop:

ota:
  - platform: esphome
    password: "9f4f17ff79c3f803413a44210d7bfddc"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Respeaker-Lite Fallback Hotspot"
    password: "ojjt3U1DHzKn"

captive_portal:

external_components:
  - source:
      type: git
      url: https://github.com/esphome/voice-kit
      ref: dev
    components:
      - aic3204
      - audio_dac
      - media_player
      - micro_wake_word
      - microphone
      - nabu
      - nabu_microphone
      - voice_assistant
      - voice_kit
    refresh: 0s
  - source: github://pr#7605
    components: [ audio, i2s_audio, speaker ]
    refresh: 0s
  - source:
      type: git
      url: https://github.com/formatBCE/Respeaker-Lite-ESPHome-integration
      ref: main
    components: [ respeaker_lite ]
    refresh: 0s

i2s_audio:
  - id: i2s_output
    i2s_lrclk_pin:
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:
      number: GPIO9
      allow_other_uses: true
  - id: i2s_input
    i2s_lrclk_pin:
      number: GPIO7
      allow_other_uses: true
    i2s_bclk_pin:
      number: GPIO8
      allow_other_uses: true
    i2s_mclk_pin:
      number: GPIO9
      allow_other_uses: true

i2c:
  - id: bus_a
    sda: GPIO5
    scl: GPIO6
    scan: true

respeaker_lite:
  id: respeaker
  i2c_id: bus_a
  reset_pin: GPIO2
  mute_state:
    internal: true
    id: mute_state
  firmware_version:
    icon: mdi:application-cog
    name: XMOS firmware version
    internal: false
    id: firmware_version

microphone:
  - platform: nabu_microphone
    id: xiao_mic
    adc_type: external
    i2s_din_pin: GPIO44
    pdm: false
    sample_rate: 16000
    bits_per_sample: 32bit
    i2s_mode: secondary
    i2s_audio_id: i2s_input
    channel_0:
      id: nabu_mic_mww
    channel_1:
      id: nabu_mic_va

speaker:
  - platform: i2s_audio
    id: xiao_speaker
    dac_type: external
    i2s_dout_pin: GPIO43
    i2s_mode: secondary
    sample_rate: 16000
    bits_per_sample: 32bit
    i2s_audio_id: i2s_output
    channel: mono

media_player:
  - platform: nabu
    id: nabu_media_player
    name: Media Player
    internal: false
    speaker: xiao_speaker
    sample_rate: 16000
    volume_increment: 0.05
    volume_min: 0.4
    volume_max: 0.85
    files:
      - id: timer_audio
        file: https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav

micro_wake_word:
  vad:
  microphone: nabu_mic_mww
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        silence_detection: true
    - light.turn_on:
        id: led
        red: 80%
        green: 0%
        blue: 80%
        brightness: 60%
        effect: fast pulse
  models:
    - model: hey_jarvis

voice_assistant:
  microphone: nabu_mic_va
  noise_suppression_level: 0
  auto_gain: 0dBFS
  volume_multiplier: 1
  id: assist
  media_player: nabu_media_player
  on_stt_end:
    then:
      - light.turn_off: led
  on_error:
    - micro_wake_word.start:
  on_end:
    then:
      - light.turn_off: led
      - wait_until:
          not:
            voice_assistant.is_running:
      - micro_wake_word.start:
  # timer functionality
  on_timer_finished:
    - switch.turn_on: timer_ringing
    - light.turn_on:
        id: led
        effect: "Slow Pulse"
        red: 80%
        green: 0%
        blue: 30%
        brightness: 80%
    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - nabu.play_local_media_file: timer_audio
          - delay: 2s
    - light.turn_off: led
    - micro_wake_word.start:

switch:
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: False
    name: "Timer Ringing"
    restore_mode: ALWAYS_OFF

light:
  - platform: esp32_rmt_led_strip
    id: led
    name: "Led Light"
    pin: GPIO1
    default_transition_length: 0s
    chipset: ws2812
    num_leds: 1
    rgb_order: grb
    rmt_channel: 0
    effects:
      - pulse:
          name: "Slow Pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "Fast Pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%
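The configuration above reads the Wi-Fi credentials through `!secret wifi_ssid` and `!secret wifi_password`, so ESPHome expects a `secrets.yaml` file next to the device configuration. Here is a minimal sketch of that file; the values are placeholders you would replace with your own network details:

```yaml
# secrets.yaml (placeholder values, not real credentials)
wifi_ssid: "MyHomeNetwork"
wifi_password: "replace-with-your-wifi-password"
```

The API encryption key, OTA password, and fallback hotspot password could be moved into the same file in the same way, for example `key: !secret api_encryption_key`, where `api_encryption_key` is a hypothetical secret name chosen for illustration. It is also worth generating your own values for these rather than reusing the ones shown in this post.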
Important Considerations
Before using this setup as your main voice assistant, keep the following points in mind:
- Sound Quality: The ReSpeaker is capped at 32-bit samples at a 16,000 Hz (16 kHz) sample rate. This limitation arises because the microphone and speaker share the same I2S bus, so the microphone’s 16,000 Hz rate dictates the same rate for the speaker. At that rate, frequencies above 8 kHz cannot be reproduced, so the setup is excellent for voice commands but may not be ideal for high-quality music playback.
- Development Stage: The ESPHome code we’re using is still under development, so updates or changes may affect functionality. I plan to use this setup as my daily driver, so stay tuned for updates on Twitter & YouTube related to its stability and improvements.
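Because the external components in the configuration are pulled with `refresh: 0s`, ESPHome fetches them again on every build, so upstream changes reach your device the next time you compile. One way to reduce surprise breakage once you have a working build is to stop the automatic re-fetch. This is a sketch and not part of the original setup, and it assumes your ESPHome version accepts `never` as a `refresh` value:

```yaml
external_components:
  - source:
      type: git
      url: https://github.com/formatBCE/Respeaker-Lite-ESPHome-integration
      ref: main
    components: [ respeaker_lite ]
    # Assumption: "never" keeps the already-downloaded copy instead of
    # re-fetching on every build (the original configuration uses 0s).
    refresh: never
```

The same change would apply to the other `external_components` entries. When you want to pick up upstream fixes, switch `refresh` back to `0s` for one build.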
Final Thoughts and Support
This DIY local voice assistant is a fantastic way to integrate voice commands into your smart home while keeping your data local. I’ll continue to test and refine this setup, so make sure to subscribe to my YouTube channel for updates.
If you enjoy my content and want to support the channel, consider buying me a coffee.
Thanks for following along, and happy building!