## Thursday, May 18, 2006

### Curse of Dimensionality and intuition

There are some counter-intuitive things happening as dimension increases.

For instance consider unit sphere (radius 1)

If you go from 2 dimensions to 3 dimensions, the volume increases (Pi to 4/3 pi). Similar increase happens when you go to 3,4,5 dimensions. But then the volume starts to decrease. Eventually it decreases all the way to 0.

Now consider a cube of width 1. As dimension increases, the volume stays the same. But the length of the main diagonal grows as sqrt(dimension). So the corners become situated further and further away from the center, and eventually almost all the mass is concentrated in the corners (meaning outside of the inscribed sphere)

Here's a plot of the fraction of the mass concentrated in the corners as a function of dimension

For a 7 dimensional cube, about 96% of the mass is concentrated in one of it's 128 "corners", so if we had 7 dimensional vision, we might see that cube like this

Finally consider what happens with a d-dimensional standard normal (unit covariance matrix, centered at 0). If we plot probability mass contained around a thin shell of radius r as a function r we get the following graphs (d=1,d=4,d=16)

The mean of this distribution is

which can be approximated as sqrt(d)

Furthermore you can derive that variance of this distribution converges to 1/2 (see notebook). Applying Chebyshev's inequality, we get that for any dimension, at least 0.75 of the mass is contained in the shell of thickness 1 centered at about sqrt(d). In numerical integration this seems to converge to a slightly higher number, for instance for 10^6 dimensions, it's about 0.84

Here's the Mathematica notebook with some derivations (web version)

damien francois said...

Very interesting read. May I recommend the nice overview of Mario Koppen as complementary reading. damien

Anonymous said...

This is the math post of the year!

Anonymous said...

Nice post.

I have some sort of knee-jerk reaction against comparing the volume of spheres in different dimensions. What does it mean to compare the volume of something in 3d with units [L^3] and 4d with units [L^4]? I wouldn't like to say that one is ‘bigger’ than the other. Of course one may compare dimensionless quantities, which I appreciate is what you are doing by comparing the ratio of the volume of the sphere and the enclosing cube. One could also compare to an inscribed cube and see how all of the mass of the sphere gets pushed out to its surface, just as the cube’s mass goes to the corners.

A pratical application that spends some time working out how spheres and cubes fit together in high dimensions is http://tinyurl.com/s8xms

Yaroslav Bulatov said...

Thanks for the link Damien, thanks for the comment Daniel, Iain -- I agree, units are not really meaningful.

Consider a cube with side 1 meter long. As dimensions increase, cube volume stays constant, goes to infinity or to zero depending on whether you use meters, millimeters or kilometers to measure volume.

Relative volume of cube inscribed in a sphere is another interesting calculation -- since diagonal^2 has length dx^2, where x is the length of the side, inscribed cube's "volume" goes down as d^{-d/2} which is faster than the rate at which sphere's volume decreases (about (d/2)^(-d/2))

Anonymous said...

Really nice !
As regards the gaussian part of this post, some graphs are missing ("following graphs d=1,d=4,d=16") and that makes "The mean of this distribution is..." a little confusing because you actually refer to E[||X||] and not to E[X] (which is always 0). This is obvious, but, at first glance I was confused...
Nevertheless, are there consequences on technics using d-normal (like Parzen estimator) ?
And what about the mass repartition of other distributions, with lower/bigger tails ?

damien francois said...

Dear all, high-dimensional data analysis is the topic of my PhD thesis, which I am currently writing. I have gathered quite a few references about 'strange' properties of high dimensional spaces. If you are interested, just send me an email. damien

Anonymous said...

Nice introduction of simple math, high dimensional space seems to be challenging for we 3D human being. Anyway, really nice post and thanks.

draj said...

Excellent machine learning blog,thanks for sharing...
Seo Internship in Bangalore
Smo Internship in Bangalore
Digital Marketing Internship Program in Bangalore

john said...
Mike Smith said...

Our Webroot Geek Squad Agents provide repair, installation and setup services on all kinds of tech at more than 1,100 Best Buy stores – including computer & tablet repair, setup and support, TV & home theater repair, car stereo & GPS installation, cell phone repair and home appliance repair. We fix most makes and models, no matter where you bought them and can show you how to get the most out of your technology. Want to fix any gadget issue? Our Trend Micro Geek Squad team is here to provide you with seamless services for your faulty devices. Geek Squad has set the benchmark when it comes to office or home appliances repairing. Our experts get acquainted with some exclusive technologies for hardware or electronic device issues. We can fix all the hindrances that can’t be solved by a typical customer support approach. The specialized touch of our technicians resolves the problem immediately.

DP said...

Pharm Envee is a global biotechnology and pharmaceutical marketing company that brings targeted products to a global audience. We specialize in utilizing innovative strategies to introduce groundbreaking biotechnology, pharmaceutical agents, and medical devices to market. Our team of marketing consultants and product specialists are dedicated to identifying and, if needed, creating opportunities for our clients’ products to launch and sustain in their targeted marketplaces.

help with my math assignment said...

Your blogs are great. Are you also searching for Help with my Math assignment? we are the best solution for you. We are best known for delivering quality essay writing services to students without having to break the bank

cheap assignment help said...

Good blog. Keep sharing. I love them Are you also searching for Cheap assignment help? we are the best solution for you. We are best known for delivering cheap assignments to students without having to break the bank