The Nikon D3200 has a huge number of pixels on its sensor, which means photos with huge resolution. I just finished rendering a 450-megapixel VR panorama. That's huge.
But 450 megapixels refers to the size of the equirectangular map. As we know, the map is a simple mapping of the angles theta (the left-right angle) and phi (the up-down angle) onto a section of a 2D plane. Among other things, this mapping smears the zenith and nadir points out over the top and bottom pixel rows, respectively. If the map image is 1000 pixels wide, then we are using 2000 pixels - the two pixel rows - to store what are really just two VR panorama pixels. That's a 999:1 junk-to-data ratio.
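The mapping itself is easy to sketch in code. Here is a minimal Python version (the function name and the exact rounding are my own choices for illustration, not taken from any particular panorama tool); note how every theta at the zenith lands on the same top row:

```python
from math import pi

def equirect_pixel(theta, phi, w):
    """Map sphere angles to equirectangular pixel coordinates.

    theta: left-right angle in [0, 2*pi)
    phi:   up-down angle in [-pi/2, pi/2]
    w:     map width in pixels (the map height is w/2)
    """
    h = w // 2
    x = int(theta / (2 * pi) * w) % w
    # phi = +pi/2 (zenith) maps to row 0, phi = -pi/2 (nadir) to the bottom row
    y = min(int((pi / 2 - phi) / pi * h), h - 1)
    return x, y

# Ten different thetas at the zenith: ten different columns, all in row 0.
# The single zenith point is smeared across the whole top row.
print({equirect_pixel(t * 2 * pi / 10, pi / 2, 1000) for t in range(10)})
```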
1. The Problem
How much image data is there, really, in an equirectangular image map? Let's think about it this way: the equirectangular map covers the surface of a sphere, and we assume that the map was produced in a way that gives an approximately equal amount of detail to every part of the sphere. If we figure out the area of a single pixel, we can divide the area of the sphere by the area of a pixel and get a fairly good estimate of the actual image data content.
2. Image Data
To figure out the area of a pixel, we start by figuring out the angle a single pixel subtends. When the map was rendered, the size was chosen so that each pixel along the horizon would correspond to a single input pixel from the photos making up the panorama. Therefore, we can divide a full circle by the width, w, of the image in pixels:

theta_(p) = 2 pi / w
This gives us theta_(p), the number of radians per pixel. For the area, we then approximate the pixel as a square whose side is twice the sine of the pixel's half-angle:

a_(p) = (2 sin(theta_(p) / 2))^2
For small x, sin(x) ~~ x. At 450 megapixels, x is indeed small, so we approximate a bit:

a_(p) ~~ (2 * theta_(p) / 2)^2
...which then simplifies to:

a_(p) = theta_(p)^2
The area of a sphere is well-known:

A = 4 pi r^2
Now we just divide A by a_(p) to get N, the number of image pixels that fit on the panosphere (the pixel angle is measured on the unit sphere, so r = 1):

N = A / a_(p) = 4 pi / theta_(p)^2
Since theta_(p) is a function of the map width, we can get a more interesting equation by substituting the definition of theta_(p) back into the definition of N:

N = 4 pi / (2 pi / w)^2
...which, in the end, simplifies down to:

N = w^2 / pi
For a 30,000 x 15,000 equirectangular map, then, the real amount of image data is 287 megapixels.
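Plugging the numbers in is a quick sanity check. A small Python sketch of the derivation so far, with w = 30,000 as in the example above (the ~286.5 it prints rounds to the figure quoted):

```python
from math import pi

w = 30_000                # equirectangular map width in pixels
theta_p = 2 * pi / w      # radians subtended by one pixel
a_p = theta_p ** 2        # small-angle pixel area on the unit sphere
A = 4 * pi                # area of the unit sphere
N = A / a_p               # image pixels that fit on the panosphere

print(f"{N / 1e6:.1f} megapixels")   # -> 286.5 megapixels
```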
3. Efficiency

To figure out the efficiency, E, of storing a VR panorama as an equirectangular map, we note that there are wh pixels in the map, and since 0 <= theta < 2 pi but -pi/2 <= phi <= pi/2, we know that h = w/2. Substituting gives N_(m), the number of pixels in the map, as:

N_(m) = wh = w^2 / 2
Then we just divide N by N_(m) and get:

E = N / N_(m) = (w^2 / pi) / (w^2 / 2) = 2 / pi ~~ 0.64
Sixty-four percent. I thought it'd be worse, so I'm actually happy seeing that number.
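That 2/pi is easy to confirm numerically; a quick check (again using w = 30,000, though the width cancels out of the ratio):

```python
from math import pi

w = 30_000
N = w ** 2 / pi       # effective image pixels on the panosphere
N_m = w ** 2 / 2      # pixels actually stored in the w x (w/2) map
E = N / N_m           # storage efficiency

print(f"E = {E:.4f}") # -> E = 0.6366, i.e. about 64%
```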
4. Perceptual Values
Some may object to the above calculation giving equal weight to pixels close to the zenith or nadir points. As I've written before about the making of VR panoramas, the interesting parts of a panorama tend to be within 33 degrees of the horizon.
If we really want to figure out the efficiency, then, we should only count the pixels within this band. Fortunately, there is a simple formula for this. The area of a spherical cap above x radians of latitude is 2 pi r^2 (1 - sin(x)). There are two such caps, one at each pole, so we subtract this twice from the area of the sphere and end up with A_(i):
A_(i) = 4 pi r^2 - 4 pi r^2 (1 - sin(x))
      = 4 pi r^2 - 4 pi r^2 + 4 pi r^2 sin(x)
      = 4 pi r^2 sin(x)
Dividing again by a_(p):
N_(i) = A_(i) / a_(p)
      ~~ 4 pi r^2 sin(x) / theta_(p)^2
Substituting the definition of theta_(p) back in (with r = 1, as before) and tidying up the equation:

N_(i) = w^2 sin(x) / pi
Putting this into the efficiency formula as above leads to:
E = N_(i) / N_(m)
  = (w^2 sin(x) / pi) / (w^2 / 2)
  = 2 sin(x) / pi
Evaluating this at x = 33° gives:
E = 2 sin(0.5759) / pi ~~ 0.35 = 35%
That's just 35%. 156 megapixels of interesting stuff, from a 450 megapixel image. Ouch.
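Both numbers fall straight out of that formula; a quick check:

```python
from math import pi, radians, sin

x = radians(33)       # half-height of the interesting band, in radians
E = 2 * sin(x) / pi   # band efficiency from the derivation above

print(f"x = {x:.4f} rad")                  # -> x = 0.5760 rad
print(f"E = {E:.2f}")                      # -> E = 0.35
print(f"{450 * E:.0f} of 450 megapixels")  # -> 156 of 450 megapixels
```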
5. A Final Twist
But what does the JPEG encoding do with this? Given that the image gets more and more smeared out toward the poles, you would expect the JPEG encoder to compress that data more efficiently. It turns out that it does: the "interesting section" of the panorama is stored as a 112 MB JPEG by Photoshop at quality 12, while the full panorama is 236 MB at the same quality level. So, in terms of actual on-disk storage needs, the efficiency, even in this worst case, is 47%, which, given that someone just might be interested in the zenith and nadir views, is "not good but acceptable" in my mind. Maybe future versions of the Java JPEG encoder will allow a different quality setting per macroblock.
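The 47% figure is just the ratio of the two file sizes quoted above:

```python
# On-disk sizes quoted above, in megabytes
interesting_mb = 112   # JPEG of the band within +/- 33 degrees of the horizon
full_mb = 236          # JPEG of the full equirectangular map

print(f"{interesting_mb / full_mb:.0%}")   # -> 47%
```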