Principal Component Analysis To Decompose Signals and Reduce Dimensionality (codes included)
We will learn the basics of Fourier analysis and implement it to remove noise from the synthetic and real signals
Principal Component Analysis is a technique that can extract the dominant pattern in the data matrix in terms of a set of new orthogonal variables called principal components and the corresponding set of factor scores based on its dominance (e.g. Kutz 2013; Abdi & L. J. Williams 2010). Some of its applications includes data compression, image processing, exploratory data analysis, pattern recognition, time series prediction etc.
Similar posts
The data matrix contains the observations that are described by several inter-correlated quantitative dependent variables. The main goals of PCA are to extract the most important information (may not be most dominant) from the data matrix, compress the size of data matrix by removing orthogonal components that explains less variance for the data. The PCA computes principal components, the new set of variables, that are linear combinations of original variables. The principal components are required to have estimated variance in descending order.
Apply PCA on a time-space function
Let us apply PCA on a time-space function borrowed from Kutz (2013).
Create a function
\begin{equation}
f(x, t) = [1 - 0.5 \cos 2t] sech x + [1-0.5 \sin 2t] sech x tanh x
\end{equation}
Solutions of the form equation above are often obtained in myriad of contexts by full numerical simulations of underlying physical system.
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg
plt.style.use('seaborn')
x = np.linspace(-10,10,100)
t = np.linspace(0,10,30)
[X,T]=np.meshgrid(x,t)
def sech(X):
return 1/np.cosh(X)
f=sech(X)*(1-0.5*np.cos(2*T))+(sech(X)*np.tanh(X))*(1-0.5*np.sin(2*T))
vmin = np.amin(f)
vmax = np.amax(f)
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=plt.figaspect(0.8))
ax = fig.gca(projection='3d')
# Plot the surface.
surf = ax.plot_surface(T, X, f, cmap='summer',linewidth=0, antialiased=True, rstride=1, cstride=1, alpha=None)
ax.set_ylabel(r'$X$')
ax.set_xlabel(r'$T$')
surf.set_clim(vmin,vmax)
plt.colorbar(surf)
plt.tight_layout()
plt.savefig('PCA_space_time_func.png',dpi=300,bbox_inches='tight')
plt.close('all')
Compute the PCA
We can apply PCA to investigate the dynamics of this system.
u,s,v = linalg.svd(f, full_matrices=False,check_finite=False)
eigen_values = s
s = np.diag(s)
pca_modes = []
for j in range(1,3):
ff=u[:,0:j].dot(s[0:j,0:j]).dot(v[0:j,:])
pca_modes.append(ff)
vmin = np.min([np.amin(pca_modes[0]),np.amin(pca_modes[1])])
vmax = np.max([np.amax(pca_modes[0]),np.amax(pca_modes[1])])
Plot the Spatial Pattern
## Spatial behaviour
fig = plt.figure(figsize=plt.figaspect(0.5))
for jj in range(len(pca_modes)):
ax = fig.add_subplot(1, 2, jj+1, projection='3d')
surf = ax.plot_surface(T, X, pca_modes[jj], cmap='summer',linewidth=0, antialiased=True, rstride=1, cstride=1, alpha=None)
surf.set_clim(vmin,vmax)
ax.set_ylabel(r'$X$')
ax.set_xlabel(r'$T$')
ax.set_title('Var explained: {:.2f}%'.format(eigen_values[jj]/np.sum(eigen_values)*100))
plt.colorbar(surf)
plt.subplots_adjust(hspace=0.5,wspace=0.05)
plt.tight_layout()
plt.savefig('PCA_space_time_func_modes.png',dpi=300,bbox_inches='tight')
plt.close('all')
The first two modes capture 100% of the total surface energy in the system. Hence, the third mode is completely unnecessary. This suggests that system can be easily and accurately represented by a simple two-mode expansion.
Figure 2 shows the spatial behavior of the first two modes.
Plot the temporal pattern
## temporal behaviour
fig, ax = plt.subplots(2,1,figsize=plt.figaspect(0.5))
ax[0].plot(x,v[0,:],'k-',label='mode 1')
ax[0].plot(x,v[1,:],'k--',label='mode 2')
ax[0].set_xlabel('x')
ax[0].set_ylabel('PCA modes')
ax[0].legend()
ax[1].plot(t,u[:,0],'k-',label='mode 1')
ax[1].plot(t,u[:,1],'k--',label='mode 2')
ax[1].set_xlabel('x')
ax[1].set_ylabel('PCA modes')
ax[1].legend()
plt.subplots_adjust(hspace=0.5,wspace=0.05)
plt.tight_layout()
plt.savefig('PCA_space_time_behaviour.png',dpi=300,bbox_inches='tight')
plt.close('all')
In Figure 3, top panel shows the spatial behaviour of the system for the two modes, and bottom panel shows the temporal evolution of the system for two modes.
Complete Code
References
- Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459. https://doi.org/10.1002/wics.101
- Kutz, J. N. (2013). Data-driven modeling & scientific computation: methods for complex systems & big data. Oxford University Press.
Disclaimer of liability
The information provided by the Earth Inversion is made available for educational purposes only.
Whilst we endeavor to keep the information up-to-date and correct. Earth Inversion makes no representations or warranties of any kind, express or implied about the completeness, accuracy, reliability, suitability or availability with respect to the website or the information, products, services or related graphics content on the website for any purpose.
UNDER NO CIRCUMSTANCE SHALL WE HAVE ANY LIABILITY TO YOU FOR ANY LOSS OR DAMAGE OF ANY KIND INCURRED AS A RESULT OF THE USE OF THE SITE OR RELIANCE ON ANY INFORMATION PROVIDED ON THE SITE. ANY RELIANCE YOU PLACED ON SUCH MATERIAL IS THEREFORE STRICTLY AT YOUR OWN RISK.
Leave a comment