With Audio Toolbox you can import, label, and augment audio data sets, as well as extract features and transform signals for machine learning and deep learning. You can prototype audio processing algorithms in real time by streaming low-latency audio while tuning parameters and visualizing signals. You can also validate your algorithm by turning it into an audio plugin to run in external host applications such as Digital Audio Workstations. Plugin hosting lets you use external audio plugins like regular objects to process MATLAB® arrays. Sound card connectivity enables you to run custom measurements on real-world audio signals and acoustic systems.

Audio Streaming with Sound Cards

Connect to standard laptop and desktop sound cards for streaming low-latency multichannel audio between any combination of files and live inputs and outputs.

Connectivity to Standard Audio Drivers

Read and write audio samples from and to sound cards (such as USB or Thunderbolt™ devices) using standard audio drivers (such as ASIO, WASAPI, CoreAudio, and ALSA) across Windows®, Mac®, and Linux® operating systems.
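As a rough illustration of frame-based streaming, the sketch below reads from the default input device, applies a placeholder gain, and plays the result on the default output device. It assumes a working sound card with both an input and an output; the sample rate, frame length, gain, and five-second duration are arbitrary values chosen for the example.

```matlab
fs = 44100;          % assumed sample rate in Hz
frameLen = 512;      % assumed frame length; smaller frames lower latency

% Create stream objects for the default input and output devices.
reader = audioDeviceReader('SampleRate', fs, 'SamplesPerFrame', frameLen);
writer = audioDeviceWriter('SampleRate', fs);

% List the input devices visible through the current driver.
disp(getAudioDevices(reader));

tic
while toc < 5                 % stream for about five seconds
    x = reader();             % read one frame from the sound card
    y = 0.5 * x;              % placeholder processing: halve the amplitude
    writer(y);                % play the processed frame
end

% Free the audio devices when done.
release(reader);
release(writer);
```

Replacing the placeholder gain with your own processing keeps the same read, process, write structure.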

Audio and Speech Feature Extraction

Extract low-level features for speech and audio analytics, including mel-frequency cepstral coefficients (MFCC), gammatone cepstral coefficients (GTCC), pitch, harmonicity, and spectral descriptors. Feed the extracted feature sequences to deep learning architectures that operate on time series, such as those based on LSTM layers.
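A minimal sketch of feature extraction on a synthetic signal follows. The test tone, noise level, and sample rate are assumptions made for the example; with real data you would read the signal from a file instead.

```matlab
fs = 16000;                                 % assumed sample rate in Hz
t = (0:fs-1)' / fs;                         % one second of time stamps
x = sin(2*pi*220*t) + 0.01*randn(fs, 1);    % synthetic 220 Hz tone plus light noise

% Cepstral coefficients, one row per analysis frame (default settings).
coeffs = mfcc(x, fs);

% Fundamental frequency estimate in Hz, one value per analysis frame.
f0 = pitch(x, fs);

fprintf('MFCC matrix: %d frames x %d coefficients\n', size(coeffs, 1), size(coeffs, 2));
fprintf('Median pitch estimate: %.1f Hz\n', median(f0));
```

Each row of `coeffs` corresponds to one analysis frame, so the matrix can be treated as a sequence of feature vectors for a time-series model.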

Time-Frequency Transformations

Transform signals into time-frequency representations using a modified discrete cosine transform (MDCT), short-time Fourier transform (STFT), or the more compact mel-spaced spectrogram. Decompose signals into perceptually spaced frequency bands using gammatone filter banks. Feed the resulting representations to deep learning models that operate on two-dimensional data, such as those based on CNN layers.
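To illustrate why the mel-spaced spectrogram is described as more compact, the sketch below computes an STFT and a mel spectrogram of the same synthetic tone and compares the number of frequency rows. The window length, overlap, and band count are assumed values chosen for the example, and `stft` and `hann` come from Signal Processing Toolbox.

```matlab
fs = 16000;                       % assumed sample rate in Hz
t = (0:fs-1)' / fs;               % one second of time stamps
x = sin(2*pi*440*t);              % synthetic 440 Hz tone

% Short-time Fourier transform: complex values, one row per FFT bin.
S = stft(x, fs, 'Window', hann(512, 'periodic'), ...
    'OverlapLength', 384, 'FFTLength', 512);

% Mel spectrogram: real values, one row per mel band.
[M, centerFreqs] = melSpectrogram(x, fs, 'NumBands', 40);

fprintf('STFT rows (frequency bins): %d\n', size(S, 1));
fprintf('Mel spectrogram rows (bands): %d\n', size(M, 1));
```

The mel spectrogram summarizes the spectrum in far fewer rows, which reduces the input size for an image-style network.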

Ingest Large Audio Data Sets

Index and read from large collections of audio recordings using audioDatastore. Randomly split lists of audio files according to labels. Parallelize processing tasks using tall arrays for data augmentation, time-frequency transformations, and feature extraction.
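The following sketch shows one way to index a labeled collection and split it by label. It builds a small synthetic data set in a temporary folder so that it runs on its own; the folder layout (one subfolder per label), the label names, the file names, and the 75/25 split are all assumptions made for the example.

```matlab
% Build a tiny synthetic data set: one subfolder per (hypothetical) label.
root = tempname;                  % unique temporary folder
fs = 16000;                       % assumed sample rate in Hz
t = (0:fs-1)' / fs;               % one second of time stamps
labels = {'tone', 'noise'};
for k = 1:numel(labels)
    mkdir(fullfile(root, labels{k}));
    for n = 1:4
        if strcmp(labels{k}, 'tone')
            y = 0.5 * sin(2*pi*220*n*t);    % tone at a multiple of 220 Hz
        else
            y = 0.1 * randn(fs, 1);         % low-level white noise
        end
        audiowrite(fullfile(root, labels{k}, sprintf('clip%d.wav', n)), y, fs);
    end
end

% Index the files and derive labels from the subfolder names.
ads = audioDatastore(root, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Randomly assign 75% of the files of each label to the first datastore.
[adsTrain, adsTest] = splitEachLabel(ads, 0.75, 'randomized');
disp(countEachLabel(adsTrain));

% Read one file; info carries the sample rate and label.
[x, info] = read(adsTrain);
fprintf('Read %d samples at %d Hz, label: %s\n', numel(x), info.SampleRate, char(info.Label));
```

Because the datastore stores only file locations and labels, the audio itself is loaded one file at a time as you call `read`.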