audio-processing

v1.0.15

Published

2 years ago

Audio processing, such as pitch detection, FFT, MFCC, and converting AMR NB/WB to WAV PCM data.


fengzhang2011

audio processing pitch fft silence

Audio-Processing

A handy Node.js package for audio processing.

For example, it can extract frequencies from the audio and compute the pitch.

The source code can be found on https://github.com/fengzhang2011/audio-processing.

The npm package is on https://www.npmjs.com/package/audio-processing.

1. HOW TO USE

1.1 A simple test on the library

Run the following commands to build and run the test program.

$ cd build
$ cmake ..
$ make
$ ./audio_processing

1.2 Use it in JavaScript

This package has been published to the npm registry, so it can be installed via npm.

$ mkdir your_project
$ cd your_project
$ npm init
$ npm install audio-processing

NOTE:

If you encounter issues such as "permission denied" while installing the package, especially in a Docker container, try the following command. Reason: setting the unsafe-perm boolean to true suppresses the UID/GID switching when running package scripts.

# npm config set unsafe-perm true

Now you can use the package. Example code follows.

const ap = require('audio-processing');

console.log(ap.hello());

async function test() {
  // Read an audio file; the result carries the sample data and its format metadata.
  let audio = await ap.readAudio("./wav/female.wav");
  console.log(audio.samplerate);

  // Write the same audio back out to a new file.
  ap.saveAudio("haha.wav", audio.wavdataL, audio.wavdataR, audio.samplerate, audio.bitdepth, audio.channels);

  // Detect the pitch of the wavdataL channel with different algorithms.
  console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'acorr'));
  console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'yin'));
  console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'mpm'));
  // console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'goertzel'));
  // console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'dft'));

  let ampfreq = await ap.ampfreq(audio.wavdataL, audio.samplerate);
  // console.log('ampfreq=', ampfreq);

  // Forward FFT of a small test signal, followed by the inverse FFT.
  let data = new Float32Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]);
  // console.log(data);

  let freq_data = await ap.fft(data);
  // console.log(freq_data);

  let td_data = await ap.ifft(freq_data.real, freq_data.imag);
  // console.log(td_data);
}

// Report any error (for example, a missing input file) instead of leaving the rejection unhandled.
test().catch(console.error);
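
For intuition about the `real` and `imag` arrays passed between `ap.fft` and `ap.ifft` in the example, here is a minimal pure-JavaScript sketch of a naive discrete Fourier transform and its inverse. It does not call the package: `naiveDft` and `naiveIdft` are hypothetical helper names, and the package's own output layout and scaling may differ.

```javascript
// Naive O(n^2) DFT: returns the real and imaginary parts of every frequency bin.
function naiveDft(samples) {
  const n = samples.length;
  const real = new Float64Array(n);
  const imag = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n;
      real[k] += samples[t] * Math.cos(angle);
      imag[k] += samples[t] * Math.sin(angle);
    }
  }
  return { real, imag };
}

// Inverse DFT: rebuilds the real-valued time-domain signal, scaled by 1/n.
function naiveIdft(real, imag) {
  const n = real.length;
  const out = new Float64Array(n);
  for (let t = 0; t < n; t++) {
    let sum = 0;
    for (let k = 0; k < n; k++) {
      const angle = (2 * Math.PI * k * t) / n;
      // Real part of (real[k] + i*imag[k]) * e^(i*angle).
      sum += real[k] * Math.cos(angle) - imag[k] * Math.sin(angle);
    }
    out[t] = sum / n;
  }
  return out;
}

// Same 0..19 ramp as in the example above.
const data = Float64Array.from({ length: 20 }, (_, i) => i);
const spectrum = naiveDft(data);
const roundTrip = naiveIdft(spectrum.real, spectrum.imag);
console.log(spectrum.real[0]); // 190, the sum of 0..19 (the DC bin)
console.log(Array.from(roundTrip, (v) => Math.round(v)));
```

The round trip recovers the input up to floating-point error, which is a quick sanity check to run against any FFT/IFFT pair.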

2. CREDITS

This package uses FFTS, Pitch-Detection, AudioFile, Opencore-AMR, MiniMp3, and libsamplerate.

Thanks for their great work.

The detailed versions in use are as follows:

| # | Project | Version | Date |
| --- | --- | --- | --- |
| 1 | FFTS | fe86885 | Jun-17-2017 |
| 2 | Pitch-Detection | 7799a62 | Oct-07-2018 |
| 3 | AudioFile | a6430a0 | Jun-06-2017 |
| 4 | Opencore-AMR | 0.1.5 | Mar-16-2017 |
| 5 | MiniMp3 | 7295650 | Sep-26-2018 |
| 6 | libsamplerate | 313685a | Mar-07-2019 |

3. THIRD-PARTY LIBRARIES

3.1 Supported Audio Format

3.1.1 Format: .wav (WAV)

Clone the AudioFile source code and compile it into a static library.

$ git clone https://github.com/adamstark/AudioFile
$ cd AudioFile
$ mkdir build
$ cd build
$ g++ -ansi -pedantic -Werror -O3 -std=c++17 -fPIC -fext-numeric-literals -ffast-math -c ../*.cpp
$ ar rcs libaudiofile.a *.o

Once these commands finish, libaudiofile.a is generated in the build folder.

Copy the header file src/AudioFile.h and libaudiofile.a into the ./include and ./lib folders of this repository, respectively.

3.1.2 Format: .amr (AMR)

Opencore-AMR (AMR-NB/AMR-WB)

https://sourceforge.net/projects/opencore-amr/files/opencore-amr/

3.1.3 Format: .mp3 (MP3)

MP3 decoder

https://github.com/lieff/minimp3

3.2 Resample

3.2.1 Sample Rate Converter

libsamplerate

$ git clone https://github.com/libsndfile/libsamplerate.git
$ cd libsamplerate
$ echo "set(CMAKE_C_FLAGS \"\${CMAKE_C_FLAGS} -fPIC\")" >> CMakeLists.txt
$ mkdir build
$ cd build
$ cmake ..
$ make

3.3 FFT and MFCC

3.3.1 Compile the FFTS static library

$ git clone https://github.com/anthonix/ffts.git
$ cd ffts
$ echo "set(CMAKE_C_FLAGS \"\${CMAKE_C_FLAGS} -fPIC\")" >> CMakeLists.txt
$ mkdir build
$ cd build
$ cmake ..
$ make ffts_static

NOTE: The -fPIC flag must be enabled when compiling the ffts static library so that it is built as position-independent code. Otherwise, linking it into a shared object fails with the following error:

/usr/bin/ld: ../lib/libffts.a(ffts.c.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
../lib/libffts.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make: *** [Release/obj.target/ssrc.node] Error 1

Once these commands finish, libffts.a is generated in the build folder.

Copy the header file include/ffts.h and libffts.a into the ./include and ./lib folders of this repository, respectively.

3.3.2 MFCC computation

The code is written based on these great documents. Thanks to the authors.
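
As a rough illustration of one step in a typical MFCC pipeline, the sketch below converts between hertz and the mel scale and spaces filter center frequencies evenly in mel. It uses the common 2595 * log10(1 + f / 700) formula. Whether this package uses that exact variant is an assumption, and `hzToMel`, `melToHz`, and `melCenters` are hypothetical names, not part of the package API.

```javascript
// Convert a frequency in Hz to mels (common 2595 * log10(1 + f / 700) variant).
function hzToMel(hz) {
  return 2595 * Math.log10(1 + hz / 700);
}

// Inverse of hzToMel.
function melToHz(mel) {
  return 700 * (Math.pow(10, mel / 2595) - 1);
}

// Center frequencies (Hz) of `count` filters spaced evenly on the mel scale
// strictly between lowHz and highHz.
function melCenters(count, lowHz, highHz) {
  const lowMel = hzToMel(lowHz);
  const highMel = hzToMel(highHz);
  const step = (highMel - lowMel) / (count + 1);
  const centers = [];
  for (let i = 1; i <= count; i++) {
    centers.push(melToHz(lowMel + i * step));
  }
  return centers;
}

// 26 filters up to 8 kHz (the Nyquist frequency of 16 kHz audio).
const centers = melCenters(26, 0, 8000);
console.log(centers.map((hz) => Math.round(hz)));
```

The centers are dense at low frequencies and sparse at high ones, which is the point of the mel warping: it allocates more filters where hearing resolves pitch more finely.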

3.4 Audio features

3.4.1 Pitch

Compile the Pitch-Detection static library

$ git clone https://github.com/sevagh/pitch-detection.git
$ cd pitch-detection
$ mkdir build
$ cd build
$ g++ -I../include -ansi -pedantic -Werror -Wall -O3 -std=c++17 -fPIC -fext-numeric-literals -ffast-math -c ../src/*.cpp
$ ar rcs --plugin $(gcc --print-file-name=liblto_plugin.so) libpitch_detection.a *.o

Once these commands finish, libpitch_detection.a is generated in the build folder.

Copy the header file include/pitch_detection.h and libpitch_detection.a into the ./include and ./lib folders of this repository, respectively.
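
For intuition about what an autocorrelation-based detector (the 'acorr' option in the usage example) is doing, here is a minimal pure-JavaScript sketch. It is not the Pitch-Detection library's implementation, which is likely more refined. `autocorrPitch` and its `minHz`/`maxHz` parameters are hypothetical and chosen only for illustration.

```javascript
// Estimate the fundamental frequency by finding the lag, within the range
// implied by [minHz, maxHz], where the signal best matches a shifted copy of itself.
function autocorrPitch(samples, sampleRate, minHz = 50, maxHz = 1000) {
  const minLag = Math.floor(sampleRate / maxHz);
  const maxLag = Math.min(Math.floor(sampleRate / minHz), samples.length - 1);
  let bestLag = -1;
  let bestScore = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let score = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      score += samples[i] * samples[i + lag];
    }
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  // Return -1 when no positive correlation peak was found.
  return bestLag > 0 ? sampleRate / bestLag : -1;
}

// Synthetic 200 Hz sine sampled at 8 kHz.
const sampleRate = 8000;
const tone = new Float32Array(2048);
for (let i = 0; i < tone.length; i++) {
  tone[i] = Math.sin((2 * Math.PI * 200 * i) / sampleRate);
}
const pitch = autocorrPitch(tone, sampleRate);
console.log(pitch); // close to 200
```

Because the estimate is quantized to whole-sample lags, resolution degrades at high pitches; practical detectors usually interpolate around the peak.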

3.5 Noise reduction

3.5.1 Wiener filter for noise reduction and speech enhancement

Pascal Scalart. Wiener Noise Suppressor based on Decision-Directed method with TSNR and HRNR algorithms.
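
To make the decision-directed idea concrete, the sketch below computes per-bin Wiener gains from an a priori SNR estimate that blends the previous frame's clean-speech estimate with the current frame's a posteriori SNR. It covers only this basic step, not the TSNR or HRNR refinements, and it is not taken from this package's source. The function name `wienerGains`, its parameters, and the smoothing factor 0.98 are assumptions for illustration.

```javascript
// Wiener gains for one frame of power-spectrum bins.
//   noisyPower[k]     : power of the noisy signal in bin k (current frame)
//   noisePower[k]     : estimated noise power in bin k
//   prevCleanPower[k] : estimated clean-speech power in bin k (previous frame)
//   alpha             : decision-directed smoothing factor (assumed value)
function wienerGains(noisyPower, noisePower, prevCleanPower, alpha = 0.98) {
  const gains = new Float64Array(noisyPower.length);
  for (let k = 0; k < noisyPower.length; k++) {
    const noise = Math.max(noisePower[k], 1e-12); // guard against division by zero
    const posteriorSnr = noisyPower[k] / noise;
    // Decision-directed a priori SNR estimate.
    const prioriSnr =
      alpha * (prevCleanPower[k] / noise) +
      (1 - alpha) * Math.max(posteriorSnr - 1, 0);
    // Wiener gain: SNR / (1 + SNR), always in [0, 1).
    gains[k] = prioriSnr / (1 + prioriSnr);
  }
  return gains;
}

// One strong speech bin followed by two noise-only bins.
const gains = wienerGains([100, 1, 0.5], [1, 1, 1], [99, 0, 0]);
console.log(Array.from(gains));
```

In this example the first bin keeps a gain near 0.99 while the two noise-only bins are driven to 0. Multiplying each noisy spectral bin by its gain gives the enhanced spectrum for the frame.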