MelKWS_Engine

A simple, resource-efficient hardware accelerator designed specifically for keyword spotting (KWS) applications, using log-mel spectrograms as the audio features.

Architecture

Description

Input:
- The input audio stream is sampled at a specific frequency, such as 16 kHz.
- Each audio frame consists of a fixed number of samples.
Log-Mel Spectrogram Computation:
- A lightweight log-mel spectrogram computation module extracts features from the input audio stream.
Keyword Detection:
- The accelerator should detect the presence or absence of a single predefined keyword or command based on the computed log-mel spectrograms.
Output:
- A binary flag signal indicates the presence or absence of the keyword in the input audio stream.

Architecture Choice

Input Interface:
- Purpose: Handles incoming audio samples, ensuring they are correctly timed and formatted for processing.
- Components:
  - Sample buffer: Temporarily stores incoming audio samples.
  - Control logic: Manages the flow of samples based on system state and input validity.
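
As a behavioral illustration of the sample buffer and its control logic, the following is a minimal software sketch, not the RTL in this repository. The class name `SampleBuffer`, the `depth` parameter, and the `valid`/`ready` handshake naming are assumptions made for the example.

```python
from collections import deque


class SampleBuffer:
    """Bounded FIFO that models a sample buffer with a valid/ready handshake.

    Hypothetical model: names and behavior are illustrative assumptions.
    """

    def __init__(self, depth):
        self.depth = depth
        self._fifo = deque()

    def ready(self):
        """True while the buffer can accept another sample."""
        return len(self._fifo) < self.depth

    def push(self, sample, valid=True):
        """Store the sample only if it is valid and there is space.

        Returns True when the sample was accepted, False otherwise.
        """
        if valid and self.ready():
            self._fifo.append(sample)
            return True
        return False

    def pop_frame(self, frame_len):
        """Remove and return frame_len samples, or None if too few are stored."""
        if len(self._fifo) < frame_len:
            return None
        return [self._fifo.popleft() for _ in range(frame_len)]


buf = SampleBuffer(depth=4)
accepted = [buf.push(s) for s in [10, 20, 30, 40, 50]]  # the fifth push is refused
frame = buf.pop_frame(4)
```

Refusing samples when the buffer is full is one possible policy; a real design might instead apply back-pressure upstream or overwrite the oldest sample.
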
Pre-processing:
- Purpose: Applies necessary pre-processing steps to the audio samples, such as framing and windowing.
- Components:
  - Frame buffer: Segments the continuous audio stream into overlapping frames.
  - Window function: Applies a windowing function to each frame to minimize spectral leakage.
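
The framing and windowing steps can be sketched in software as below. This is an illustrative reference model only: the function names, the periodic Hann window, and the frame length and hop size are assumptions, since the README does not state which window or frame parameters the hardware uses.

```python
import math


def hann_window(n):
    """Periodic Hann window of length n (assumed window type)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames.

    Consecutive frames start hop samples apart, so they overlap by
    frame_len - hop samples. A trailing partial frame is dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def window_frames(frames, window):
    """Multiply every frame element-wise by the window coefficients."""
    return [[x * w for x, w in zip(frame, window)] for frame in frames]


# Toy parameters for illustration: 8-sample frames with 50% overlap.
samples = list(range(16))
frames = frame_signal(samples, frame_len=8, hop=4)
windowed = window_frames(frames, hann_window(8))
```

At a 16 kHz sampling rate, a common choice in speech front ends is a 25 ms frame (400 samples) with a 10 ms hop (160 samples), but those values are not taken from this project.
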
FFT Module:
- Purpose: Converts time-domain audio frames into frequency-domain representations using the Fast Fourier Transform (FFT).
- Components:
  - FFT processor: Computes the FFT of each windowed frame.
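
For reference, a radix-2 FFT and the power spectrum it feeds to the next stage can be modeled as follows. This is a floating-point sketch assuming a power-of-two frame length; the hardware FFT processor may use a different architecture (for example an iterative fixed-point pipeline), and the names `fft` and `power_spectrum` are chosen for this example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 decimation-in-time FFT.

    len(x) must be a power of two; otherwise ValueError is raised.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    """Squared magnitude of the non-negative frequency bins 0..N/2."""
    spectrum = fft(frame)
    return [abs(c) ** 2 for c in spectrum[:len(frame) // 2 + 1]]


# One cycle of a cosine across 8 samples concentrates its energy in bin 1.
frame = [math.cos(2.0 * math.pi * i / 8) for i in range(8)]
ps = power_spectrum(frame)
```

Only bins 0 through N/2 are kept because the input is real, which makes the upper half of the spectrum a mirror image of the lower half.
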
Mel Filterbank Processing:
- Purpose: Applies a set of Mel-scaled filters to the FFT output to extract frequency bands that mimic human auditory perception.
- Components:
  - Filterbank: A collection of band-pass filters corresponding to the Mel scale.
  - Energy computation: Calculates the energy in each Mel band.
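
A software sketch of a triangular Mel filterbank and the per-band energy computation is shown below. It assumes the widely used formula mel = 2595 * log10(1 + f / 700) and triangles that are equally spaced on the Mel axis; the number of filters, the FFT size, and the function names are illustrative assumptions rather than parameters of this design.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters rows of triangular weights over n_fft // 2 + 1 bins."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced in mels from 0 to Nyquist.
    edges = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft          # center frequency of FFT bin k
            rising = (f - lo) / (mid - lo)       # slope up from lo to mid
            falling = (hi - f) / (hi - mid)      # slope down from mid to hi
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def mel_energies(power, bank):
    """Weighted sum of the power spectrum under each triangular filter."""
    return [sum(w * p for w, p in zip(row, power)) for row in bank]


# Illustrative sizes only: 8 filters over a 64-point FFT at 16 kHz.
bank = mel_filterbank(n_filters=8, n_fft=64, sample_rate=16000)
energies = mel_energies([1.0] * 33, bank)
```

In hardware the weights would typically be precomputed and stored in a small ROM, since they depend only on the fixed filterbank parameters.
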
Feature Extraction:
- Purpose: Optionally extracts additional features from the Mel spectrogram, such as Mel-frequency cepstral coefficients (MFCCs), if required by the keyword detection logic.
- Components:
  - Feature extractor: Calculates MFCCs or other features from the Mel spectrogram.
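
MFCCs are conventionally obtained by applying a discrete cosine transform (DCT-II) to the log-Mel energies of a frame. The sketch below shows that step under the assumption of an unnormalized DCT-II; the function name and the number of coefficients are chosen for illustration and are not taken from this project.

```python
import math


def mfcc_from_log_mel(log_mel, n_coeffs):
    """Unnormalized DCT-II of one frame of log-Mel energies.

    Returns the first n_coeffs cepstral coefficients.
    """
    n = len(log_mel)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(log_mel))
        for k in range(n_coeffs)
    ]


# A flat log-Mel frame has all of its content in coefficient 0.
coeffs = mfcc_from_log_mel([1.0] * 8, n_coeffs=4)
```

Coefficient 0 tracks the overall level of the frame, while higher coefficients describe progressively finer spectral shape.
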
Dynamic Precision Adjustment:
- Purpose: Adjusts the precision of the FFT or Mel spectrogram data to optimize for computational efficiency or resource usage.
- Components:
  - Precision control: Dynamically adjusts data bit-width based on configurable criteria.
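
One way to picture bit-width adjustment is as signed fixed-point quantization with saturation, followed by optional dropping of least significant bits. The sketch below is an assumption about how such a stage could behave; the README does not specify the number formats or the criteria used, and the function names are hypothetical.

```python
def quantize(value, total_bits, frac_bits):
    """Convert a real value to a signed fixed-point integer code.

    The code has total_bits bits, frac_bits of which are fractional.
    Values outside the representable range saturate. Python's round()
    rounds exact halves to the nearest even integer.
    """
    code = int(round(value * (1 << frac_bits)))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, code))


def dequantize(code, frac_bits):
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


def reduce_precision(code, drop_bits):
    """Drop drop_bits least significant bits with an arithmetic right shift."""
    return code >> drop_bits


# 8-bit format with 4 fractional bits: range -8.0 to 7.9375, step 0.0625.
half = quantize(0.5, total_bits=8, frac_bits=4)
clipped = quantize(100.0, total_bits=8, frac_bits=4)
coarse = reduce_precision(half, drop_bits=2)
```

Dropping bits trades resolution for narrower datapaths, which is the kind of efficiency trade-off this stage is described as making.
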
Logarithmic Compression:
- Purpose: Applies logarithmic compression to the Mel spectrogram to better match the non-linear perception of loudness in the human auditory system.
- Components:
  - Logarithmic function: Computes the logarithm of Mel spectrogram values.
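
Hardware implementations often avoid a full logarithm and approximate log2 from the position of the leading one bit. The following sketch uses Mitchell's approximation as an example of such a technique; it is an assumption for illustration, not a description of the logarithm unit in this repository, and the names `log2_fixed` and `frac_bits` are hypothetical.

```python
def log2_fixed(x, frac_bits=8):
    """Approximate log2(x) for an integer x >= 1 (Mitchell's approximation).

    The result is a fixed-point integer with frac_bits fractional bits.
    The integer part is the index of the leading one bit; the fractional
    part linearly interpolates between neighboring powers of two.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    msb = x.bit_length() - 1          # floor(log2(x))
    mantissa = x - (1 << msb)         # bits below the leading one
    frac = (mantissa << frac_bits) >> msb
    return (msb << frac_bits) + frac


def log_mel(energies, frac_bits=8):
    """Apply log2_fixed to integer Mel energies, clamping zeros up to 1."""
    return [log2_fixed(max(1, e), frac_bits) for e in energies]


# 12 lies halfway between 8 and 16, so the approximation gives 3.5
# (the exact value of log2(12) is about 3.585).
approx = log2_fixed(12) / 256
```

The approximation never overestimates the true logarithm, and its error stays below roughly 0.09 in log2 units, which is usually acceptable for spectrogram features.
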
Keyword Detection Logic:
- Purpose: Analyzes the log-Mel spectrogram (and possibly additional features) to detect the presence of specific keywords.
- Components:
  - Detection algorithm: Implements simple thresholding or a more complex pattern-matching or machine-learning algorithm to identify keywords.
  - Keyword selector: Allows dynamic selection of the keyword(s) to be detected.
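
The simple-thresholding option can be sketched as a per-frame score compared against a threshold, with a run-length condition to suppress isolated spikes. Everything here is a hypothetical illustration: the linear score, the `weights`, the `threshold`, and the `min_consecutive` parameter are assumptions, not the detection logic implemented in this project.

```python
def frame_score(features, weights):
    """Weighted sum of one frame's features (a simple linear score)."""
    return sum(f * w for f, w in zip(features, weights))


def detect_keyword(frames, weights, threshold, min_consecutive=3):
    """Return True once enough successive frames score above the threshold.

    The run counter resets whenever a frame falls to or below the threshold.
    """
    run = 0
    for features in frames:
        if frame_score(features, weights) > threshold:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0
    return False


# Toy data: two features per frame, equal weights.
weights = [1.0, 1.0]
hit = detect_keyword([[1, 1], [0, 0], [1, 1], [1, 1], [1, 1]], weights, threshold=1.5)
miss = detect_keyword([[1, 1], [0, 0], [1, 1], [1, 1]], weights, threshold=1.5)
```

Selecting a different keyword in this model would amount to swapping in a different set of weights and threshold.
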
Output Interface:
- Purpose: Indicates the detection result, such as the presence of a keyword.
- Components:
  - Detection output: Signals when a keyword has been detected.
  - Status indicators: Provide additional information about the detection process, such as confidence levels.
Integration and Control (Top):
- System Controller: Coordinates the operation of all stages, managing state transitions, processing flow, and synchronization.
- Clock and Reset Management: Ensures all components operate synchronously and can be reset to a known state.

❗ Important Note

Forked from the Caravel User Project

Refer to README for a quickstart of how to use caravel_user_project

Refer to README for this sample project documentation.

Refer to the following readthedocs for how to add cocotb tests to your project.