What is the resolution of the human eye in megapixels? originally appeared on Quora: the knowledge sharing network where compelling questions are answered by people with unique insights.
What is the resolution of the human eye in megapixels? Well, it wouldn't directly match a real-world camera ... but read on.
On most digital cameras, the pixels are laid out orthogonally: they're distributed uniformly across the sensor (in fact, in a nearly perfect grid), and there's a filter (usually the "Bayer" filter, named after Bryce Bayer, the scientist who came up with the usual color array) that delivers red, green, and blue pixels.
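To make the color array concrete, here is a minimal Python sketch of a Bayer-style mosaic. It assumes the common RGGB ordering (red in the top-left corner); real sensors can start the pattern on a different corner, and the function names are made up for this illustration.

```python
def bayer_color(row, col):
    """Return the filter color over the photosite at (row, col).

    Assumes an RGGB layout: even rows alternate R, G and odd rows alternate G, B.
    """
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def bayer_mosaic(rows, cols):
    """Build the mosaic as one string per sensor row."""
    return ["".join(bayer_color(r, c) for c in range(cols)) for r in range(rows)]


def color_counts(mosaic):
    """Count how many photosites sit under each filter color."""
    counts = {"R": 0, "G": 0, "B": 0}
    for line in mosaic:
        for ch in line:
            counts[ch] += 1
    return counts


mosaic = bayer_mosaic(4, 4)
for line in mosaic:
    print(line)
# RGRG
# GBGB
# RGRG
# GBGB
print(color_counts(mosaic))  # {'R': 4, 'G': 8, 'B': 4}
```

Every photosite records only one color, and in this layout half of them are green, so a full RGB value per pixel has to be interpolated from neighbors.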
So, for the eye, imagine a sensor with a huge number of pixels, about 120 million. There's a higher density of pixels in the center of the sensor, and only about 6 million of those pixels are filtered to enable color sensitivity. And of course, only about 100,000 sense blue! Oh, and by the way, this sensor isn't made flat but is, in fact, roughly hemispherical, so that a very simple lens can be used without distortions; real camera lenses have to project onto a flat surface, which is less natural given the spherical nature of a simple lens (in fact, better lenses usually contain a few aspherical elements).
The eye's sensor is about 22mm across the diagonal on average, just a bit larger than a micro four-thirds sensor, but its spherical shape means the surface area is around 1100mm^2, a bit larger than a full-frame 35mm camera sensor. The highest pixel resolution on a 35mm sensor is on the Canon 5Ds, which stuffs 50.6Mpixels into about 860mm^2.
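For a rough feel of those numbers, this small Python sketch divides sensor count by surface area for both the eye and the camera. It uses only the figures quoted above and treats the density as uniform, which is a simplification: the eye actually packs far more photoreceptors into the center than the edges.

```python
# Figures quoted in the text above; the uniform-density assumption is ours.
EYE_PHOTORECEPTORS = 120e6   # about 120 million "pixels"
EYE_AREA_MM2 = 1100.0        # approximate retinal surface area
CAMERA_PIXELS = 50.6e6       # Canon 5Ds
CAMERA_AREA_MM2 = 860.0      # approximate full-frame sensor area


def density_per_mm2(count, area_mm2):
    """Average number of sensing elements per square millimeter."""
    return count / area_mm2


eye_density = density_per_mm2(EYE_PHOTORECEPTORS, EYE_AREA_MM2)
camera_density = density_per_mm2(CAMERA_PIXELS, CAMERA_AREA_MM2)

print(f"eye:    {eye_density:,.0f} per mm^2")
print(f"camera: {camera_density:,.0f} per mm^2")
print(f"ratio:  {eye_density / camera_density:.2f}x")
```

On these averages the eye comes out a little under twice as dense as the camera sensor, before any of the brain's processing is taken into account.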
So that's the hardware. But that's not the limiting factor on effective resolution. The eye seems to see "continuously," but it's actually cyclical: there's a kind of frame rate, and it's really fast, but that's not the cycle that matters here. The eye is in constant motion from ocular microtremors that occur at around 70-110Hz. Your brain is constantly integrating the output of your eye as it moves around into the image you actually perceive, and the result is that, unless something's moving too fast, you get an effective resolution boost from 120MP to something more like 480MP as the image is constructed from multiple samples.
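One way to picture a fourfold gain from multiple samples is a toy model in which the same coarse sensor samples the scene four times, shifted by half a pixel horizontally, vertically, and both. This is an assumption made for illustration, not a description of what the brain actually does, and the helper names below are hypothetical. The Python sketch interleaves four such samplings onto a grid with twice the resolution in each direction.

```python
def sample(scene, dy, dx):
    """Sample every second point of the fine scene, starting at offset (dy, dx).

    One step on the fine grid stands in for a half-pixel shift of the coarse sensor.
    """
    return [row[dx::2] for row in scene[dy::2]]


def interleave(samples):
    """Combine four H x W grids taken at half-pixel offsets into one 2H x 2W grid.

    `samples` maps an offset (dy, dx), each 0 or 1, to the grid taken at that offset.
    """
    height = len(samples[(0, 0)])
    width = len(samples[(0, 0)][0])
    fine = [[None] * (2 * width) for _ in range(2 * height)]
    for (dy, dx), grid in samples.items():
        for y in range(height):
            for x in range(width):
                fine[2 * y + dy][2 * x + dx] = grid[y][x]
    return fine


# A toy 4 x 4 "scene"; each coarse sampling sees only a 2 x 2 subset of it.
scene = [[10 * r + c for c in range(4)] for r in range(4)]
samples = {(dy, dx): sample(scene, dy, dx) for dy in (0, 1) for dx in (0, 1)}

rebuilt = interleave(samples)
coarse_count = len(samples[(0, 0)]) * len(samples[(0, 0)][0])
fine_count = len(rebuilt) * len(rebuilt[0])
print(coarse_count, "samples per pass,", fine_count, "after interleaving")
print(rebuilt == scene)
```

The model assumes ideal point samples. Real photoreceptors and pixels cover a finite area, so shifted samples overlap and the usable gain is smaller than the raw sample count suggests.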
Which makes perfect sense: our brains can handle this kind of problem because they work as a parallel processor with performance comparable to the fastest supercomputers we have today. When we perceive an image, there's this low-level image processing, plus specialized processes that work on higher-level abstractions. For example, we humans are really good at recognizing horizontal and vertical lines, while our friendly frog neighbors have specialized processing in their relatively simple brains that looks for a small object flying across the visual field: that fly the frog just ate. We also do constant pattern matching of what we see back to our memories of things. So we don't just see an object, we instantly recognize an object and call up a whole library of information on that thing we just saw.
Another interesting aspect of our in-brain image processing is that we don't demand any particular resolution. As our eyes age and we can't see as well, our effective resolution drops, and yet, we adapt. Over a relatively short time, we adapt to what the eye can actually see, and you can experience this at home. If you're old enough to have spent lots of time in front of Standard Definition television, you have already experienced this. Your brain adapted to the fairly terrible quality of NTSC television (or the slightly less terrible but still bad quality of PAL television), and then perhaps jumped to VHS, which was even worse than what you could get via broadcast. When digital started, between VideoCD and early DVRs like the TiVo, the quality was really terrible, but if you watched lots of it and didn't dwell on it, you stopped noticing the quality over time. An HDTV viewer of today, going back to those old media, will be really disappointed, mostly because their brain moved on to the better video experience and dropped those bad-TV adaptations over time.
Back to the multi-sampled image for a second: cameras do this, too. In low light, many cameras today can average several different photos on the fly, which boosts the signal relative to the noise; your brain does this, too, in the dark. We're even doing the "microtremor" thing in cameras. The recent Olympus OM-D E-M5 Mark II has a "hi-res" mode that takes eight shots with 1/2 pixel adjustment, to deliver what's essentially two 16MP images in full RGB (because full pixel steps ensure every pixel is sampled at R, G, B, G), one offset by 1/2 pixel from the other. Interpolating these interstitial images as a normal pixel grid delivers 64MP, but the effective resolution is more like 40MP, still a big jump up from 16MP. Hasselblad showed a similar thing in 2013 that delivered a 200MP capture, and Pentax is also releasing a camera with something like this built-in.
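The frame-averaging idea is easy to try out. The Python sketch below simulates several noisy exposures of a flat gray patch and averages them pixel by pixel. The Gaussian noise model, the parameter values, and the function names are all assumptions chosen for illustration; real sensor noise is more complicated, and no particular camera is claimed to work this way.

```python
import random
import statistics


def noisy_frame(true_value, noise_sigma, n_pixels, rng):
    """Simulate one exposure of a flat patch: the true value plus Gaussian noise."""
    return [true_value + rng.gauss(0.0, noise_sigma) for _ in range(n_pixels)]


def average_frames(frames):
    """Average several frames pixel by pixel."""
    return [sum(pixel_values) / len(pixel_values) for pixel_values in zip(*frames)]


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_VALUE = 100.0       # brightness of the patch (arbitrary units)
NOISE_SIGMA = 10.0       # per-frame noise level (assumed)
N_PIXELS = 5000
N_FRAMES = 8

frames = [noisy_frame(TRUE_VALUE, NOISE_SIGMA, N_PIXELS, rng) for _ in range(N_FRAMES)]

single_noise = statistics.pstdev(frames[0])
stacked_noise = statistics.pstdev(average_frames(frames))

print(f"noise in one frame:     {single_noise:.2f}")
print(f"noise after averaging:  {stacked_noise:.2f}")
print(f"improvement:            {single_noise / stacked_noise:.2f}x")
```

With independent noise, averaging N frames should cut the noise by roughly the square root of N, so eight frames give an improvement a little under 3x, while the underlying brightness stays the same.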
We're doing simple versions of the higher-level brain functions, too, in our cameras. All kinds of current-model cameras can do face recognition and tracking, follow-focus, etc. They're nowhere near as good at it as our eye/brain combination, but they do ok for such weak hardware.
They're only a few hundred million years late...