Introduction to Principal Component Analysis (PCA)
In machine learning, an algorithm needs features to find the patterns that differentiate classes of data. The more features we have, the more variance (variation in the data) there is, and the easier it is for the model to draw ‘splits’ or ‘boundaries’. But not all features provide useful information; some of them carry noise. If our model starts fitting this random noise, it loses its robustness. So here’s the deal: we want to compose m features from the available feature space that give us maximum variance. Note that we want to compose new features, not just select m of the existing features as they are.
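To make "compose, not select" concrete, here is a minimal NumPy sketch of one common way to build such features: take the eigenvectors of the covariance matrix and project the centered data onto the m directions with the largest variance. The function name `compose_features` and the toy data are assumptions made for illustration, not code from the original article.

```python
import numpy as np


def compose_features(X, m):
    """Project X (n_samples x n_features) onto its m highest-variance directions.

    Hypothetical helper for illustration: each output column is a linear
    combination of the original features, not a copy of any single one.
    """
    # Center every feature so variance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance between features (columns are the variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the m directions with the largest eigenvalues (variances).
    order = np.argsort(eigvals)[::-1][:m]
    components = eigvecs[:, order]
    return X_centered @ components, eigvals[order]


# Toy data (assumed): two features share one underlying signal, the third is noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=200)
X = np.column_stack([
    signal + 0.1 * rng.normal(size=200),
    2.0 * signal + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

Z, variances = compose_features(X, m=1)
print(Z.shape)    # (200, 1)
print(variances)  # variance captured by the composed feature
```

The single composed feature mixes the two correlated columns, so it captures at least as much variance as any one original feature could on its own.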
Source: Introduction to Principal Component Analysis (PCA) — with Python code, an article by Dhruvil Karani.