The idea is to approximate .wav files with sums of sine waves. The fitted sine waves then serve as basis functions, and their coefficients (amplitude, frequency, and phase) are used to describe and recognize speech contained in new .wav files.
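As a rough sketch in symbols: the notation below ($a_k$, $\omega_k$, $\varphi_k$, $K$, $t_i$, $y_i$) is my own shorthand, introduced only for illustration. The signal samples $y_i$ taken at times $t_i$ are approximated by a sum of $K$ sines, and the parameters are chosen by nonlinear least squares.

```latex
% Model: a sum of K sine waves, each with amplitude a_k,
% angular frequency \omega_k and phase \varphi_k.
f(t;\theta) = \sum_{k=1}^{K} a_k \sin(\omega_k t + \varphi_k),
\qquad
\theta = (a_1, \omega_1, \varphi_1, \dots, a_K, \omega_K, \varphi_K)

% Fit: choose \theta to minimize the sum of squared residuals
% over the N samples (t_i, y_i).
\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{N} \bigl( f(t_i;\theta) - y_i \bigr)^2
```

With $K = 12$ this gives $3K = 36$ parameters, which matches the 36-element parameter vector in the script: x(3k-2) plays the role of $a_k$, x(3k-1) of $\omega_k$, and x(3k) of $\varphi_k$.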
Math summary as a MATLAB script (I’ll be using Octave):
clear; clc;
xdata = 1:0.1:10;   % X, the independent data
ydata = sin(xdata+0.2) + 0.5*sin(0.3*xdata+0.3) + 2*sin(0.2*xdata+23) + ...
        0.7*sin(0.34*xdata+12) + 0.76*sin(0.23*xdata+0.3) + 0.98*sin(0.76*xdata+0.56) + ...
        0.34*sin(0.87*xdata+0.123) + 0.234*sin(0.234*xdata+23);   % Y, the dependent data (a sum of 8 sines)
x0 = randn(36,1);   % Initial guess: 12 sines x 3 parameters each
fun = @(x,xdata) x(1)*sin(x(2)*xdata+x(3)) + ...
                 x(4)*sin(x(5)*xdata+x(6)) + ...
                 x(7)*sin(x(8)*xdata+x(9)) + ...
                 x(10)*sin(x(11)*xdata+x(12)) + ...
                 x(13)*sin(x(14)*xdata+x(15)) + ...
                 x(16)*sin(x(17)*xdata+x(18)) + ...
                 x(19)*sin(x(20)*xdata+x(21)) + ...
                 x(22)*sin(x(23)*xdata+x(24)) + ...
                 x(25)*sin(x(26)*xdata+x(27)) + ...
                 x(28)*sin(x(29)*xdata+x(30)) + ...
                 x(31)*sin(x(32)*xdata+x(33)) + ...
                 x(34)*sin(x(35)*xdata+x(36));   % Goal function: a sum of 12 sines
options = optimoptions('lsqcurvefit','Algorithm','trust-region-reflective');   % Options for fitting
x = lsqcurvefit(fun, x0, xdata, ydata, [], [], options)   % The main instruction (no bounds; options passed explicitly)
times = linspace(xdata(1), xdata(end));   % Evaluation grid for the fitted curve
plot(xdata, ydata, 'ko', times, fun(x,times), 'r-')
legend('Data', 'Fitted Sum of 12 Sines')
title('Data and Fitted Curve')

Running this in Octave requires octave-dev and the Optimization package.