# Curse Of Dimensionality

Curse of Dimensionality: The curse of dimensionality refers to the difficulty of finding patterns in data that lies in a high-dimensional space. The more features (dimensions) we have, the more data points we need in order to identify patterns within the data.
The reason is that as the number of dimensions increases, the volume of the space grows so fast that the available data become sparse. This sparsity becomes a problem when we try to arrive at a statistically significant result: to keep the same density of data points, the amount of data needed grows exponentially with the number of dimensions.
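One way to see this sparsity is to keep the number of points fixed and watch how far apart they end up as dimensions are added. The following is a minimal sketch, not a rigorous experiment: the function name `mean_nearest_neighbor_distance`, the sample size of 200, and the chosen dimensions are all assumptions made for illustration.

```python
import math
import random


def mean_nearest_neighbor_distance(n_points, n_dims, seed=0):
    """Average distance from each point to its closest neighbor.

    Points are drawn uniformly at random from the unit hypercube [0, 1]^n_dims.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    total = 0.0
    for i, p in enumerate(points):
        # Distance to the closest *other* point in the sample.
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / n_points


# Same number of points, increasing number of dimensions.
for d in (1, 2, 10, 50):
    print(d, round(mean_nearest_neighbor_distance(200, d), 3))
```

With the sample size held constant, the typical gap between neighboring points widens as the dimension grows, which is the sparsity described above.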
Let us try to understand this with a simple example. Say we dropped a coin somewhere on a 100 meter line. It would not be difficult to find: we simply walk along the line, and finding the coin takes a few minutes. Now say we have to find the coin in a 100 meter by 100 meter field; that could easily take a few hours. If we add another dimension, we have to find the coin inside a cube with 100 meter sides, and the search might take a few days.
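The arithmetic behind the coin example can be sketched as follows, under the assumption (not stated in the text) that the search proceeds one 1 meter cell at a time. The helper `cells_to_search` is a hypothetical name introduced only for this illustration.

```python
def cells_to_search(side_m, n_dims, cell_m=1):
    """Number of cells of side cell_m in a space with side_m along each dimension."""
    return (side_m // cell_m) ** n_dims


for d in (1, 2, 3):
    print(d, cells_to_search(100, d))
# 1 100
# 2 10000
# 3 1000000
```

Each added dimension multiplies the search space by 100, so the effort grows exponentially rather than linearly with the number of dimensions.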
Similarly, as the number of dimensions increases, the mathematical computation not only becomes more complex but also more time-consuming.

Problems with high-dimensional data:
1. Processing time increases
2. Overfitting
3. The required amount of data increases exponentially

Principal Component Analysis (PCA) is one of the common methods used to reduce dimensionality. The idea behind PCA is to find the directions (principal components) that account for most of the variance within the data, and to keep only those.
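To make the PCA idea concrete, here is a minimal sketch using NumPy's singular value decomposition. It is one common way to compute principal components, not the only one; the function name `pca` and the synthetic data (10 observed features generated from 2 hidden factors plus a little noise) are assumptions made for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top n_components directions of variance.

    Returns the projected data, the component directions, and the
    fraction of total variance explained by each kept component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio


# Synthetic data: 10 features that are really driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

Z, components, ratio = pca(X, n_components=2)
print(Z.shape)      # shape of the reduced data
print(ratio.sum())  # share of variance kept by the 2 components
```

Because the synthetic features depend on only two underlying factors, two components retain nearly all of the variance, and the data can be reduced from 10 dimensions to 2 with very little loss. Real data rarely compress this cleanly, so the explained-variance ratio is worth checking before choosing how many components to keep.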