Hardware and software setup

dynamic compression. Synthesis and speech recognition

© 2014 website

Dynamic range, or the photographic latitude of a photographic material, is the ratio between the maximum and minimum exposure values that can be correctly captured in a picture. As applied to digital photography, dynamic range is effectively the ratio of the maximum to the minimum possible value of the useful electrical signal generated by the photosensor during exposure.

Dynamic range is measured in exposure steps (EV). Each step corresponds to a doubling of the amount of light. So, for example, if a camera has a dynamic range of 8 EV, the maximum possible value of the useful signal of its sensor relates to the minimum as 2^8:1, which means the camera can capture, within one frame, objects that differ in brightness by no more than 256 times. More precisely, it can capture objects of any brightness, but objects brighter than the maximum allowable value will come out pure white in the picture, and objects darker than the minimum value will come out pure black. Details and texture will be distinguishable only on objects whose brightness fits within the camera's dynamic range.
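The relationship between exposure steps and brightness ratio is plain powers of two, and can be sketched in a few lines. The function names here are illustrative, not taken from any library.

```python
import math

def ev_to_ratio(ev: float) -> float:
    """Brightness ratio covered by a dynamic range of `ev` exposure steps."""
    return 2.0 ** ev

def ratio_to_ev(ratio: float) -> float:
    """Number of exposure steps (doublings) needed to cover a brightness ratio."""
    return math.log2(ratio)

print(ev_to_ratio(8))    # 256.0
print(ratio_to_ev(256))  # 8.0
```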

To describe the relationship between the brightness of the lightest and darkest of the subjects being photographed, the not quite correct term "dynamic range of the scene" is often used. It would be more correct to talk about the range of brightness or the level of contrast, since the dynamic range is usually a characteristic of the measuring device (in this case, the matrix of a digital camera).

Unfortunately, the brightness range of many beautiful real-life scenes can significantly exceed the dynamic range of a digital camera. In such cases, the photographer has to decide which objects should be rendered in full detail and which can be left outside the dynamic range without compromising the creative intent. To make the most of your camera's dynamic range, you sometimes need not so much a thorough understanding of how the photosensor works as a developed artistic flair.

Factors limiting dynamic range

The lower limit of the dynamic range is set by the intrinsic noise level of the photosensor. Even an unlit sensor generates a background electrical signal called dark noise. Interference also arises when the charge is transferred to the analog-to-digital converter (ADC), and the ADC itself introduces a certain error into the digitized signal, the so-called quantization noise.

If you take a picture in complete darkness or with a lens cap on, the camera will only record this meaningless noise. If a minimum amount of light is allowed to hit the sensor, the photodiodes will begin to accumulate electric charge. The magnitude of the charge, and hence the intensity of the useful signal, will be proportional to the number of captured photons. In order for any meaningful details to appear in the picture, it is necessary that the level of the useful signal exceed the level of background noise.

Thus, the lower limit of the dynamic range or, in other words, the sensor sensitivity threshold can be formally defined as the output signal level at which the signal-to-noise ratio is greater than one.

The upper limit of the dynamic range is determined by the charge capacity of a single photodiode. If during exposure a photodiode accumulates the maximum charge it can hold, the image pixel corresponding to the saturated photodiode will turn out completely white, and further irradiation will not affect its brightness in any way. This phenomenon is called clipping. The higher the capacity of the photodiode, the more signal it can deliver at the output before it reaches saturation.

For greater clarity, let's turn to the characteristic curve, which is a graph of the dependence of the output signal on the exposure. The horizontal axis is the binary logarithm of the irradiation received by the sensor, and the vertical axis is the binary logarithm of the magnitude of the electrical signal generated by the sensor in response to this irradiation. My drawing is largely arbitrary and is for illustrative purposes only. The characteristic curve of a real photosensor has a slightly more complex shape, and the noise level is rarely so high.

Two critical turning points are clearly visible on the graph: at the first, the useful signal level crosses the noise threshold, and at the second, the photodiodes reach saturation. The exposure values between these two points constitute the dynamic range. In this abstract example it equals, as you can easily see, 5 EV, i.e. the camera can handle five doublings of exposure, which is equivalent to a 32-fold (2^5 = 32) difference in brightness.
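The same reasoning can be written down as a tiny calculation: the dynamic range is the number of doublings that fit between the noise floor and the saturation level. The sensor figures below are hypothetical and chosen only to reproduce the 5 EV of the abstract example.

```python
import math

def dynamic_range_ev(saturation_level: float, noise_floor: float) -> float:
    """Dynamic range in EV: doublings between the noise floor and saturation."""
    if saturation_level <= 0 or noise_floor <= 0:
        raise ValueError("both levels must be positive")
    return math.log2(saturation_level / noise_floor)

# Hypothetical sensor: saturation at 32000 units, noise floor at 1000 units.
print(dynamic_range_ev(32000, 1000))  # 5.0
```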

The exposure zones that make up the dynamic range are not equivalent. The upper zones have a higher signal-to-noise ratio, and therefore look cleaner and more detailed than the lower ones. As a result, the upper limit of the dynamic range is very real and noticeable - clipping cuts off the light at the slightest overexposure, while the lower limit is inconspicuously drowned in noise, and the transition to black is not as sharp as to white.

The linear dependence of the signal on exposure, as well as a sharp plateau, are unique features of the digital photographic process. For comparison, take a look at the conditional characteristic curve of traditional photographic film.

The shape of the curve, and especially the angle of inclination, strongly depend on the type of film and on the procedure for its development, but the main, conspicuous difference between the film graph and the digital one remains unchanged - the non-linear nature of the dependence of the optical density of the film on the exposure value.

The lower limit of the photographic latitude of negative film is determined by the fog density, and the upper limit by the maximum achievable optical density of the emulsion layer; for reversal films, the opposite is true. In both the shadows and the highlights the characteristic curve bends smoothly, indicating a drop in contrast toward the boundaries of the dynamic range, because the slope of the curve is proportional to the contrast of the image. Thus, exposure zones lying in the middle of the graph have maximum contrast, while contrast is reduced in the highlights and shadows. In practice, the difference between film and a digital sensor is especially noticeable in the highlights: where the highlights in a digital image are burned out by clipping, on film the details are still distinguishable, albeit with low contrast, and the transition to pure white looks smooth and natural.

In sensitometry, two separate terms are even used: photographic latitude proper, limited to the relatively linear section of the characteristic curve, and useful photographic latitude, which in addition to the linear section also includes the toe and shoulder of the curve.

It is noteworthy that when processing digital photographs, as a rule, a more or less pronounced S-shaped curve is applied to them, increasing the contrast in midtones at the cost of reducing it in shadows and highlights, which gives the digital image a more natural and pleasing look to the eye.

Bit depth

Unlike the sensor of a digital camera, human vision has, let's say, a logarithmic view of the world. We perceive successive doublings of the amount of light as equal changes in brightness. Exposure steps can even be compared with musical octaves, because two-fold changes in sound frequency are perceived by the ear as one and the same musical interval. Other sense organs work on the same principle. The non-linearity of perception greatly expands the range of human sensitivity to stimuli of varying intensity.

When a RAW file containing linear data is converted (whether in the camera or in a RAW converter), a so-called gamma curve is applied to it. The curve non-linearly increases the brightness of the digital image, bringing it into line with the characteristics of human vision.

With linear conversion, the image is too dark.

After gamma correction, the brightness returns to normal.

The gamma curve, as it were, stretches the dark tones and compresses the light tones, making the distribution of gradations more uniform. The result is a natural-looking image, but the noise and quantization artifacts in the shadows inevitably become more noticeable, which is only made worse by the small number of brightness levels in the lower zones.

Linear distribution of gradations of brightness.
Uniform distribution after applying the gamma curve.
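Why the lower zones are short of brightness levels is easy to see with a little arithmetic. In linear encoding each step down halves the signal, so it also halves the number of code values available. The sketch below counts them for a hypothetical 12-bit file and applies a plain power-law gamma of 2.2, which is an assumption for illustration; real cameras and color spaces use their own curves.

```python
def levels_per_stop(bit_depth: int, stops: int) -> list:
    """Count the linear code values that fall into each stop, brightest stop first."""
    top = 2 ** bit_depth
    counts = []
    for _ in range(stops):
        counts.append(top - top // 2)  # the upper half of the remaining range
        top //= 2
    return counts

def gamma_encode(linear: float, gamma: float = 2.2) -> float:
    """Map a linear value in [0, 1] to a gamma-encoded value in [0, 1]."""
    return linear ** (1.0 / gamma)

print(levels_per_stop(12, 5))  # [2048, 1024, 512, 256, 128]
print(round(gamma_encode(0.18), 2))  # mid-gray is lifted well above 0.18
```

The top stop alone takes half of all available levels, while the fifth stop down is left with only 128.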

ISO and dynamic range

Despite the fact that digital photography uses the same concept of the photosensitivity of photographic material as in film photography, it should be understood that this happens solely due to tradition, since the approaches to changing the photosensitivity in digital and film photography differ fundamentally.

Increasing the ISO speed in traditional photography means changing from one film to another with coarser grain, i.e. there is an objective change in the properties of the photographic material itself. In a digital camera, the light sensitivity of the sensor is rigidly set by its physical characteristics and cannot be literally changed. When increasing the ISO, the camera does not change the actual sensitivity of the sensor, but only amplifies the electrical signal generated by the sensor in response to irradiation and adjusts the algorithm for digitizing this signal accordingly.

An important consequence of this is that the effective dynamic range decreases as ISO increases, because noise is amplified along with the useful signal. If at ISO 100 the entire range of signal values is digitized, from zero to the saturation point, then at ISO 200 only half of the photodiode capacity is taken as the maximum. With each doubling of ISO, the top stop of the dynamic range is in effect cut off, and the remaining stops are pulled up into its place. That is why using ultra-high ISO values makes little practical sense: you could just as well brighten the photo in the RAW converter and get a comparable noise level. The difference between increasing the ISO and brightening the image afterwards is that when the ISO is increased, the signal is amplified before it enters the ADC, so quantization noise is not amplified, unlike the sensor's own noise, whereas in the RAW converter everything is amplified, including the ADC errors. In addition, reducing the digitized range means the remaining values of the input signal are sampled more finely.
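The rule of thumb above (one stop lost per doubling of ISO) can be sketched as follows. This is the simplified model described in the text, not a measurement: real cameras deviate from it, and the function name and parameters are illustrative.

```python
import math

def effective_dr_ev(base_dr_ev: float, base_iso: float, iso: float) -> float:
    """Estimate dynamic range at `iso`, losing one stop per doubling above base ISO."""
    if iso <= base_iso:
        # Settings below base ISO do not extend the range in this model.
        return base_dr_ev
    return base_dr_ev - math.log2(iso / base_iso)

print(effective_dr_ev(13, 100, 200))  # 12.0
print(effective_dr_ev(13, 100, 800))  # 10.0
```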

By the way, lowering the ISO below the base value (for example, to ISO 50) available on some devices does not expand the dynamic range at all, but simply attenuates the signal by half, which is equivalent to darkening the image in the RAW converter. This function can even be considered as harmful, since using a sub-minimum ISO value provokes the camera to increase the exposure, which, with the sensor saturation threshold remaining unchanged, increases the risk of clipping in the highlights.

True value of dynamic range

There are a number of programs (DxO Analyzer, Imatest, RawDigger, etc.) that allow you to measure the dynamic range of a digital camera at home. In principle this is not really necessary, since data for most cameras can be freely found on the Internet, for example at DxOMark.com.

Should we believe the results of such tests? Quite. With the only caveat that all these tests determine the effective or, so to speak, the technical dynamic range, i.e. the relationship between saturation level and matrix noise level. For the photographer, the useful dynamic range is of primary importance, i.e. the number of exposure zones that really allow you to capture some useful information.

As you remember, the lower threshold of the dynamic range is set by the noise level of the photosensor. The problem is that, in practice, the lower zones, which technically already fall within the dynamic range, still contain too much noise to be of real use. Here much depends on individual tolerance - everyone determines the acceptable noise level for themselves.

My subjective opinion is that the details in the shadows begin to look more or less decent at a signal-to-noise ratio of at least eight. On that basis, I define useful dynamic range for myself as technical dynamic range minus about three stops.

For example, if an SLR camera has, according to reliable tests, a dynamic range of 13 EV, which is very good by today's standards, then its useful dynamic range will be about 10 EV, which, in general, is also quite good. Of course, we are talking about shooting in RAW, at minimum ISO and maximum bit depth. When shooting in JPEG, the dynamic range depends heavily on the contrast settings, but on average another two to three stops should be discarded.

For comparison: color reversal films have a useful photographic latitude of 5-6 stops; black-and-white negative films give 9-10 stops with standard development and printing procedures, and with certain manipulations up to 16-18 stops.
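The "minus about three stops" figure follows from the chosen signal-to-noise ratio: the technical range starts where the ratio is one, and each stop up roughly doubles it, so demanding a ratio of eight costs log2(8) = 3 stops. A minimal sketch, assuming the noise floor stays constant so that the ratio grows in proportion to the signal:

```python
import math

def useful_dr_ev(technical_dr_ev: float, min_snr: float = 8.0) -> float:
    """Dynamic range left after discarding zones whose signal-to-noise ratio is below min_snr."""
    return technical_dr_ev - math.log2(min_snr)

print(useful_dr_ev(13))  # 10.0
```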

Summarizing the above, let's try to formulate a few simple rules, following which will help you get the most out of your camera sensor:

  • The dynamic range of a digital camera is fully available only when shooting in RAW.
  • Dynamic range decreases as ISO increases, so avoid high ISO unless absolutely necessary.
  • Using a higher bit depth for RAW files does not increase the true dynamic range, but improves tonal separation in the shadows thanks to the larger number of brightness levels.
  • Expose to the right. The upper exposure zones always contain the most useful information with the least noise and should be used as fully as possible. At the same time, do not forget about the danger of clipping - pixels that have reached saturation are absolutely useless.

And most importantly, don't worry too much about your camera's dynamic range. It is perfectly adequate. Your ability to see the light and manage the exposure properly is much more important. A good photographer will not complain about the lack of photographic latitude, but will wait for more comfortable lighting, or change the angle, or use the flash; in a word, will act according to the circumstances. I'll tell you more: some scenes only benefit from not fitting into the dynamic range of the camera. Often an unnecessary abundance of detail just needs to be hidden in a semi-abstract black silhouette, which makes the photo both more concise and richer.

High contrast is not always bad - you just need to be able to work with it. Learn to exploit the equipment's weaknesses as well as its strengths, and you'll be surprised at how much your creativity expands.

Thank you for your attention!

Vasily A.

post scriptum

If the article turned out to be useful and informative for you, you can kindly support the project by contributing to its development. If you did not like the article, but you have thoughts on how to make it better, your criticism will be accepted with no less gratitude.

Do not forget that this article is subject to copyright. Reprinting and quoting are permissible provided there is a valid link to the original source, and the text used must not be distorted or modified in any way.

Dynamic compression (dynamic range compression, DRC) is the narrowing (or, in the case of an expander, the widening) of the dynamic range of a recording. Dynamic range is the difference between the quietest and the loudest sound. Sometimes the quietest sound in a recording is only a little louder than the noise level, and sometimes only a little quieter than the loudest sound. Hardware devices and programs that perform dynamic compression are called compressors, and four main groups are distinguished among them: compressors proper, limiters, expanders and gates.

Tube analog compressor DBX 566

Down and up compression

Downward compression reduces the volume of a sound when it exceeds a certain threshold, leaving quieter sounds unchanged. An extreme version of downward compression is the limiter. Upward compression, on the contrary, increases the volume of a sound if it is below the threshold value, without affecting louder sounds. Both types of compression narrow the dynamic range of the audio signal.

Downward compression

Upward compression

Expander and Gate

If the compressor reduces the dynamic range, the expander increases it. When the signal level gets above the threshold level, the expander increases it even more, thus increasing the difference between loud and soft sounds. Such devices are often used when recording a drum set to separate the sounds of one drum from another.

The type of expander used not to amplify loud sounds but to mute quiet sounds that do not exceed a threshold level (for example, background noise) is called a noise gate. In such a device, as soon as the sound level falls below the threshold, the signal stops passing. Typically, a gate is used to suppress noise in pauses. On some models you can arrange for the sound not to stop abruptly when the threshold is reached, but to fade out gradually. In this case, the decay rate is set by the Decay control.
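A noise gate with a decay setting can be sketched as below. This is a deliberately simplified illustration: it looks at the absolute value of each sample rather than at a smoothed level as real gates do, and the `decay` parameter (a per-sample multiplier applied while the gate is closed) is a hypothetical stand-in for the Decay control.

```python
def noise_gate(samples, threshold, decay=0.0):
    """Mute samples below the threshold; decay=0.0 cuts abruptly, values near 1.0 fade slowly."""
    gain = 1.0
    out = []
    for x in samples:
        if abs(x) >= threshold:
            gain = 1.0     # gate open: pass the signal unchanged
        else:
            gain *= decay  # gate closed: fade the gain toward zero
        out.append(x * gain)
    return out

print(noise_gate([0.5, 0.01, 0.02, 0.6], threshold=0.1))  # [0.5, 0.0, 0.0, 0.6]
```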

A gate, like other types of compressors, can be frequency dependent (i.e. treat certain frequency bands differently) and can operate in side-chain mode (see below).

The principle of operation of the compressor

The signal entering the compressor is split into two copies. One copy is sent to an amplifier whose gain is controlled by an external signal; the second copy forms this control signal. It enters a circuit called the side chain, where the signal is measured, and from this measurement an envelope is created that describes the change in its volume.
This is how most modern compressors are built; it is the so-called feed-forward type. In older devices (the feedback type), the signal level is measured after the amplifier.

There are various analog technologies for variable-gain amplification, each with its own advantages and disadvantages: tube, optical (using photoresistors) and transistor. Software that works with digital audio (a sound editor or DAW) can use its own mathematical algorithms or emulate the behavior of analog technologies.

Main parameters of compressors

Threshold

The compressor reduces the level of the audio signal if its amplitude exceeds a certain threshold value (threshold). It is usually specified in decibels, with a lower threshold (e.g. -60 dB) meaning that more of the sound will be processed than with a higher threshold (e.g. -5 dB).

Ratio

The amount of level reduction is determined by the ratio parameter: a ratio of 4:1 means that if the input level is 4 dB above the threshold, the output level will be 1 dB above the threshold.
For instance:
Threshold = -10dB
Input signal = -6 dB (4 dB above threshold)
Output signal = -9 dB (1 dB above threshold)

It is important to keep in mind that the signal level continues to be reduced for some time after it falls below the threshold, and this time is determined by the Release parameter.

Compression with a maximum ratio of ∞:1 is called limiting. This means that any signal above the threshold level is attenuated to the threshold level (except for a short period after a sudden increase in the input volume). See "Limiter" below for details.
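The static behavior described by Threshold and Ratio fits in one small function. The sketch below works on levels in decibels and ignores attack and release; the function name is illustrative.

```python
def compress_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level of a downward compressor for a steady input level, all in dB."""
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal is left alone
    # Every `ratio` dB above the threshold at the input becomes 1 dB at the output.
    return threshold_db + (input_db - threshold_db) / ratio

print(compress_db(-6, -10, 4))            # -9.0
print(compress_db(0, -10, float("inf")))  # -10.0 (limiting)
```

With an infinite ratio the excess over the threshold is divided away entirely, which is exactly the limiting case.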

Examples of different Ratio values

Attack and Release

The compressor provides some control over how quickly it responds to changes in signal dynamics. The Attack parameter determines the time it takes for the compressor to reduce the gain to the level specified by the Ratio parameter. Release determines the time it takes for the compressor to bring the gain back up to normal once the input level drops below the threshold.

Attack and Release phases

These parameters indicate the time (usually in milliseconds) it takes for the gain to change by a certain number of decibels, typically 10 dB. For example, in this case, if Attack is set to 1ms, it will take 1ms to decrease the gain by 10dB, and 2ms by 20dB.

In many compressors, the Attack and Release parameters can be adjusted, but in some they are preset and are not adjustable. Sometimes they are referred to as "automatic" or "program dependent", i.e. change depending on the input signal.
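One way to picture Attack and Release is as speed limits on how fast the gain may move. The sketch below assumes, as in the example above, that both times are quoted per 10 dB of gain change, and moves the gain linearly in decibels; real compressors often use exponential curves instead. All names are illustrative.

```python
def smooth_gain_db(target_gain_db, sample_rate_hz, attack_ms, release_ms, ref_db=10.0):
    """Follow a target gain (dB, one value per sample) at limited attack/release speeds."""
    samples_per_ms = sample_rate_hz / 1000.0
    attack_step = ref_db / (attack_ms * samples_per_ms)    # max dB drop per sample
    release_step = ref_db / (release_ms * samples_per_ms)  # max dB rise per sample
    current = 0.0
    out = []
    for target in target_gain_db:
        if target < current:
            current = max(target, current - attack_step)   # attack: gain goes down
        else:
            current = min(target, current + release_step)  # release: gain recovers
        out.append(current)
    return out

# 1 kHz sample rate (one sample per ms), attack 1 ms and release 2 ms per 10 dB.
print(smooth_gain_db([-20.0] * 3 + [0.0] * 4, 1000, 1.0, 2.0))
# [-10.0, -20.0, -20.0, -15.0, -10.0, -5.0, 0.0]
```

The gain takes 2 ms to fall by 20 dB and 4 ms to recover, matching the arithmetic in the text.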

Knee

Another compressor parameter is the hard/soft Knee. It determines whether compression sets in abruptly (hard) or gradually (soft). A soft knee makes the transition from the uncompressed to the compressed signal less audible, especially at high Ratio values and with sudden volume increases.

Hard Knee and Soft Knee Compression

Peak and RMS

The compressor can respond to peak (short-term maximum) values or to the average level of the input signal. Using peak values can lead to large fluctuations in the degree of compression, and even to distortion. Therefore, compressors apply an averaging function (usually RMS) to the input signal when comparing it to the threshold value. This gives a more comfortable compression that is closer to the human perception of loudness.

RMS is a parameter that reflects the average loudness of a recording. From a mathematical point of view, RMS (Root Mean Square) is the square root of the mean of the squared amplitudes of a certain number of samples x_1, ..., x_n:

$$x_{\mathrm{RMS}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^{2}}$$
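The difference between peak and RMS detection shows up clearly on a short spike. The sketch below computes both over a block of samples; the block itself is made up for illustration.

```python
import math

def peak(samples):
    """Largest absolute sample value in the block."""
    return max(abs(x) for x in samples)

def rms(samples):
    """Root mean square: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

spike = [0.0, 0.0, 0.0, 1.0]  # one loud sample in an otherwise silent block
print(peak(spike))  # 1.0
print(rms(spike))   # 0.5
```

A peak detector reacts to the spike at full strength, while the RMS value stays much lower.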

Stereo linking

A compressor in stereo linking mode applies the same gain to both stereo channels. This avoids shifting the stereo pan that can result from processing the left and right channels individually. Such an offset occurs if, for example, any loud element is panned off-center.
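The effect of linking can be sketched by comparing the gain each channel receives with and without it. Driving both channels from the louder of the two levels is one common choice and is an assumption here, as are the function names.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Gain change (zero or negative, in dB) a compressor applies at a given level."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0
    return over / ratio - over

def unlinked_gains(left_db, right_db, threshold_db, ratio):
    """Each channel is compressed according to its own level."""
    return (gain_reduction_db(left_db, threshold_db, ratio),
            gain_reduction_db(right_db, threshold_db, ratio))

def linked_gains(left_db, right_db, threshold_db, ratio):
    """Both channels get the gain computed from the louder channel."""
    g = gain_reduction_db(max(left_db, right_db), threshold_db, ratio)
    return (g, g)

# A loud element panned left: -2 dB on the left, -20 dB on the right.
print(unlinked_gains(-2, -20, -10, 4))  # (-6.0, 0.0)
print(linked_gains(-2, -20, -10, 4))    # (-6.0, -6.0)
```

Without linking, only the left channel is turned down by 6 dB and the stereo balance shifts to the right; with linking the balance is preserved.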

Makeup gain

Since the compressor reduces the overall signal level, a fixed output gain control is usually provided, which lets you bring the signal back to the optimal level.

Look-ahead

The look-ahead function is intended to solve the problems associated with both too large and too small Attack and Release values. Too long an attack time does not allow effective interception of transients, and too short an attack time may not be comfortable for the listener. When using the look-ahead function, the main signal is delayed relative to the control signal, this allows compression to begin in advance, even before the signal reaches the threshold value.
The only drawback of this method is the time delay of the signal, which is undesirable in some cases.
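The look-ahead idea comes down to delaying the main path while the control path sees the signal undelayed. A minimal sketch, with the delay expressed in samples and zero-padding at the start as an assumption:

```python
def delay_main_path(signal, lookahead):
    """Delay the main signal by `lookahead` samples, keeping the original length."""
    return ([0.0] * lookahead + list(signal))[:len(signal)]

signal = [0.0, 0.0, 1.0, 0.0, 0.0]
control = signal                      # the side chain sees the peak at index 2
main = delay_main_path(signal, 2)     # the audible peak now arrives at index 4
print(main)  # [0.0, 0.0, 0.0, 0.0, 1.0]
```

The compressor therefore has two samples of warning before the peak reaches its output, at the cost of those two samples of latency.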

Using Dynamic Compression

Compression is used everywhere, not only in music recordings, but wherever the overall volume needs to be increased without increasing peak levels, or where inexpensive playback equipment or a limited transmission channel is used (public address and communication systems, amateur radio, etc.).

Compression is applied when playing background music (in shops, restaurants, etc.) where any noticeable volume changes are undesirable.

But the most important application of dynamic compression is music production and broadcasting. Compression is used to give the sound "thickness" and "drive", to better match instruments with each other, and especially when processing vocals.

Vocals in rock and pop music are usually compressed to make them stand out from the accompaniment and add clarity. A special kind of compressor, tuned only to certain frequencies - a de-esser, is used to suppress hissing phonemes.

In instrumental parts, compression is also used for effects that are not directly related to volume, for example, quickly fading drum sounds can become longer.

Electronic dance music (EDM) often uses side-chaining (see below) - for example, the bass line can be driven by a kick or similar to prevent bass/drum conflict and create dynamic pulsation.

Compression is widely used in broadcast (radio, TV, internet) to increase the perceived loudness while reducing the dynamic range of the original audio (usually a CD). Most countries have legal limits on the instantaneous maximum volume that can be broadcast. Usually these limitations are implemented by permanent hardware compressors in the on-air circuit. In addition, increasing the perceived loudness improves the "quality" of the sound from the point of view of most listeners.

See also: Loudness war.

Sequential increase in the volume of the same song, remastered for CD from 1983 to 2000.

Side chaining

Another common compressor switch is the "side chain". In this mode, the sound is compressed not depending on its own level, but depending on the level of the signal coming to the connector, which is usually called side chain.

There are several uses for this. For example, a vocalist is lisping and every "s" stands out from the overall picture. You pass the voice through the compressor and feed the same signal into the side chain input, but through an equalizer. On the equalizer, you remove all frequencies except those the vocalist produces when pronouncing "s": usually around 5 kHz, but anywhere from 3 kHz to 8 kHz. If you then switch the compressor to side chain mode, the voice will be compressed at exactly those moments when an "s" is pronounced. The result is the device known as a de-esser. This way of working is called frequency dependent.

Another application of this function is called "ducker". For example, at a radio station, the music goes through the compressor, and the words of the DJ go through the side chain. When the DJ starts chatting, the volume of the music will automatically decrease. This effect can also be successfully applied in recording, for example, to reduce the volume of keyboard parts while singing.
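A ducker can be sketched in its crudest form as follows: the music is turned down whenever the side chain (the voice) exceeds a threshold. The switch here is instantaneous, with no attack or release smoothing, and the `duck_gain` value is a hypothetical setting.

```python
def duck(music, voice, threshold, duck_gain=0.3):
    """Attenuate music samples wherever the voice in the side chain is above the threshold."""
    return [m * (duck_gain if abs(v) >= threshold else 1.0)
            for m, v in zip(music, voice)]

music = [1.0, 1.0, 1.0]
voice = [0.0, 0.5, 0.0]  # the DJ speaks during the middle sample only
print(duck(music, voice, threshold=0.1, duck_gain=0.5))  # [1.0, 0.5, 1.0]
```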

Brick wall limiting

The compressor and the limiter work in much the same way; one can say that a limiter is a compressor with a high Ratio (from 10:1) and usually a short attack time.

There is the concept of Brick wall limiting - limiting with a very high Ratio (from 20:1 and above) and a very fast attack. Ideally, it does not allow the signal to exceed the threshold level at all. The result will be unpleasant to the ear, but it will prevent damage to sound-reproducing equipment or exceeding the bandwidth of the channel. Many manufacturers integrate limiters into their devices for this very purpose.

Clipper vs. Limiter, soft and hard clipping
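A clipper is the crudest way to enforce a ceiling: instead of turning the gain down as a limiter does, it acts on the waveform itself. The sketch below contrasts hard clipping with one possible soft-clipping curve. The choice of tanh is an assumption made for illustration; many other curves are used in practice.

```python
import math

def hard_clip(x: float, ceiling: float = 1.0) -> float:
    """Cut off anything beyond the ceiling; the waveform gets flat tops."""
    return max(-ceiling, min(ceiling, x))

def soft_clip(x: float, ceiling: float = 1.0) -> float:
    """Bend the waveform smoothly toward the ceiling without ever exceeding it."""
    return ceiling * math.tanh(x / ceiling)

for x in (0.1, 0.5, 1.5):
    print(x, hard_clip(x), round(soft_clip(x), 3))
```

Hard clipping leaves quiet samples untouched and flattens loud ones abruptly; soft clipping starts rounding the signal well before the ceiling is reached.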


Records, especially older records that were recorded and made before 1982, were much less likely to be mixed to make the record louder. They reproduce natural music with a natural dynamic range that is retained on the record and lost in most standard or high-definition digital formats.

Of course, there are exceptions: listen to the recently released Steven Wilson album, or to releases from MA Recordings or Reference Recordings, and you will hear how good digital sound can be. But this is rare; most modern recordings are loud and compressed.

Music compression has come under a lot of criticism lately, but I'm willing to bet that almost all of your favorite recordings are compressed. Some of them less, some more, but still compressed. Dynamic range compression is a scapegoat that gets blamed for bad-sounding music, but highly compressed music is not a new trend: listen to Motown albums from the 60s. The same can be said for Led Zeppelin classics or the more recent Wilco and Radiohead albums. Dynamic range compression reduces the natural ratio between the loudest and the quietest sound on a recording, so a whisper can be as loud as a scream. It's pretty hard to find pop music from the last 50 years that hasn't been compressed.

I recently had a nice chat with Tape Op magazine founder and editor Larry Crane about the good, bad, and "evil" aspects of compression. Larry Crane has worked with such bands and artists as Stephen Malkmus, Cat Power, Sleater-Kinney, Jenny Lewis, M. Ward, The Go-Betweens, Jason Lytle, Elliott Smith, Quasi and Richmond Fontaine. He also runs the recording studio Jackpot! in Portland, Oregon, which has been home to The Breeders, The Decemberists, Eddie Vedder, Pavement, R.E.M., She & Him and many, many more.

As an example of surprisingly heavily compressed but still great-sounding music, I cited Spoon's "They Want My Soul", released in 2014. Crane laughs and says he listens to it in the car because it sounds great there. Which brings us to yet another answer to why music is compressed: because the compression and extra "clarity" make it easier to hear in noisy places.

Larry Crane at work. Photo by Jason Quigley

When people say they like the sound of an audio recording, I consider that they like the music, as if sound and music were inseparable terms. But for myself, I differentiate these concepts. From a music lover's point of view, the sound may be rough and raw, but that won't matter to most listeners.

Many are in a hurry to accuse mastering engineers of abusing compression, but compression is applied directly during recording, during mixing, and only then during mastering. Unless you were personally present at each of these stages, you will not be able to tell how the instruments and vocals sounded at the very beginning of the process.

Crane was emphatic: "If a musician wants to intentionally make the sound crazy and distorted like the Guided by Voices records, then there is nothing wrong with that - the artist's wishes always outweigh the quality of the sound." The performer's voice is almost always compressed, and the same goes for bass, drums, guitars and synthesizers. With compression, the volume of the vocals is kept at the right level throughout the song or made to stand out a little from the rest of the sounds.

Properly done compression can make drums sound more lively or intentionally strange. To make music sound great, you need to be able to use the necessary tools for this. That's why it takes years to figure out how to use compression and not overdo it. If the mix engineer compresses the guitar part too much, then the mastering engineer will no longer be able to fully restore the missing frequencies.

If musicians wanted you to listen to music that had not gone through the stages of mixing and mastering, then they would release it on store shelves straight from the studio. Crane says that the people who create, edit, mix and master music recordings are not there to get in the way of musicians - they have been helping performers from the very beginning, that is, for more than a hundred years.

These people are part of the creative process that results in amazing works of art. Crane adds, "You don't want a version of 'Dark Side of the Moon' that hasn't been mixed and mastered." Pink Floyd released the album the way they wanted it to be heard.

The second part of the series is devoted to functions for optimizing the dynamic range of images. In it, we will explain why such solutions are needed and consider various ways of implementing them, along with their advantages and disadvantages.

Embrace the immensity

Ideally, the camera should capture the image of the surrounding world as it is perceived by a person. However, due to the fact that the mechanisms of "vision" of the camera and the human eye are significantly different, there are a number of limitations that do not allow this condition to be met.

One of the problems previously faced by users of film cameras, and now by owners of digital ones, is the inability to adequately capture scenes with large differences in light without special devices and/or special shooting techniques. The human visual system perceives the details of high-contrast scenes equally well in brightly lit and in dark areas. Unfortunately, the camera sensor is not always able to capture the image as we see it.

The greater the difference in brightness in the photographed scene, the higher the likelihood of losing detail in the highlights and/or shadows. As a result, instead of a blue sky with lush clouds, the picture shows only a whitish spot, and objects located in the shadows turn into indistinct dark silhouettes or even merge with the surroundings.

Classical photography uses the notion of photographic latitude (see sidebar for details). In theory, the photographic latitude of a digital camera is determined by the bit depth of its analog-to-digital converter (ADC). For example, with an 8-bit ADC, taking the quantization error into account, the theoretically achievable photographic latitude is 7 EV, for a 12-bit ADC it is 11 EV, and so on. However, in real devices the dynamic range of images falls below the theoretical maximum because of various kinds of noise and other factors.

A large difference in brightness levels is a serious problem in photography. In this case, the capabilities of the camera were not enough to adequately convey the brightest areas of the scene, and as a result, instead of a blue area of sky (marked with a stroke), there is a white "patch"

The maximum brightness value that a photosensitive sensor can detect is determined by the saturation level of its cells. The minimum value depends on several factors, including the amount of thermal noise of the matrix, charge transfer noise, and ADC error.
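The relationship between these two limits can be sketched in a few lines. The snippet below uses the common engineering shorthand of taking the base-2 logarithm of the saturation level over the noise floor. The function name and the sample figures are hypothetical and chosen only for illustration; they are not measurements of any camera.

```python
import math

def sensor_dynamic_range_ev(saturation_level: float, noise_floor: float) -> float:
    """Dynamic range in EV: log2 of the largest usable signal over the smallest."""
    if saturation_level <= 0 or noise_floor <= 0:
        raise ValueError("both levels must be positive")
    return math.log2(saturation_level / noise_floor)

# Hypothetical signal levels in arbitrary units (assumption, not real data).
low_noise = sensor_dynamic_range_ev(40000, 10)
high_noise = sensor_dynamic_range_ev(40000, 40)

print(f"noise floor 10: {low_noise:.2f} EV")
print(f"noise floor 40: {high_noise:.2f} EV")
# A noise floor four times higher costs two steps of dynamic range.
print(f"difference: {low_noise - high_noise:.2f} EV")
```

With the saturation level fixed, every doubling of the noise floor removes one step from the usable range.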

It is also worth noting that the photographic latitude of the same digital camera may vary depending on the sensitivity value set in the settings. The maximum dynamic range is achieved at the so-called base sensitivity (the lowest numerical value available). As the sensitivity is raised, the dynamic range decreases because of the increased noise level.

The photographic latitude of modern digital cameras equipped with large sensors and 14- or 16-bit ADCs ranges from 9 to 11 EV, which is significantly higher than that of 35 mm color negative films (4 to 5 EV on average). Thus, even relatively inexpensive digital cameras have enough photographic latitude to adequately capture most typical amateur photography scenes.

However, there is a problem of a different kind. It stems from the restrictions imposed by the existing standards for recording digital images. With the JPEG format at 8 bits per color channel (which has become the de facto standard for recording digital images in the computer industry and digital technology), it is impossible, even in theory, to save a picture with a photographic latitude of more than 8 EV.

Let's assume that the ADC of the camera allows you to get an image with a bit depth of 12 or 14 bits, containing distinguishable details in both highlights and shadows. However, if the photographic latitude of this image exceeds 8 EV, then in the process of converting to a standard 8-bit format without any additional steps (that is, simply by discarding "extra" bits), part of the information recorded by the photosensitive sensor will be lost.
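A minimal sketch of what "simply discarding extra bits" does to shadow detail. It assumes linear 12-bit samples and drops the four least significant bits to get 8-bit values; the function name and the sample values are ours.

```python
def truncate_to_8bit(value_12bit: int) -> int:
    """Drop the 4 least significant bits of a 12-bit sample (0..4095)."""
    if not 0 <= value_12bit <= 4095:
        raise ValueError("expected a 12-bit value in the range 0..4095")
    return value_12bit >> 4

# Five distinct shadow levels recorded by a hypothetical 12-bit sensor.
shadow_samples = [1, 3, 7, 12, 15]
truncated = [truncate_to_8bit(v) for v in shadow_samples]

print(truncated)                 # every sample collapses to the same level
print(truncate_to_8bit(4095))    # the brightest level still maps to 255
```

All twelve-bit values from 0 to 15 end up as the same 8-bit level, so the tonal differences the sensor recorded in the deepest shadows are gone after conversion.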

Dynamic Range and Photographic Latitude

In simple terms, dynamic range is defined as the ratio of the maximum brightness value of an image to its minimum value. In classical photography, the term photographic latitude is traditionally used, which, in fact, means the same thing.

The width of the dynamic range can be expressed as a ratio (for example, 1000:1, 2500:1, etc.), but a logarithmic scale is most commonly used. In this case, the decimal logarithm of the ratio of the maximum brightness to the minimum is calculated, and the number is followed by a capital letter D (from the English density) or, less often, the abbreviation OD (from the English optical density). For example, if the ratio of the maximum brightness value to the minimum value for some device is 1000:1, then the dynamic range will be 3.0 D:

$$D = \log_{10}\frac{1000}{1} = 3.0$$

To measure photographic latitude, so-called exposure units are traditionally used, denoted by the abbreviation EV (from the English exposure values; professionals often refer to them as "stops" or "steps"). It is in these units that the exposure compensation value is usually set in the camera settings. Increasing the photographic latitude by 1 EV is equivalent to doubling the ratio between the maximum and minimum brightness levels. Thus, the EV scale is also logarithmic, but in this case a base-2 logarithm is used to calculate the numerical values. For example, if the ratio of the maximum brightness to the minimum is 256:1, the photographic latitude will be 8 EV:

$$\log_{2}\frac{256}{1} = 8\ \text{EV}$$
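The two scales described in this sidebar differ only in the base of the logarithm, which makes conversion straightforward. The helper names below are our own and serve only to illustrate the arithmetic.

```python
import math

def ratio_to_density(ratio: float) -> float:
    """Brightness ratio (max:min) expressed in D units (decimal logarithm)."""
    return math.log10(ratio)

def ratio_to_ev(ratio: float) -> float:
    """Brightness ratio (max:min) expressed in EV (base-2 logarithm)."""
    return math.log2(ratio)

def ev_to_ratio(ev: float) -> float:
    """Inverse conversion: each EV doubles the ratio."""
    return 2.0 ** ev

print(f"1000:1 is about {ratio_to_density(1000):.1f} D")
print(f"256:1 is about {ratio_to_ev(256):.1f} EV")
print(f"1000:1 is about {ratio_to_ev(1000):.2f} EV")
```

One D unit corresponds to roughly 3.32 EV, since that is the base-2 logarithm of 10.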

Compression is a reasonable compromise

The most effective way to preserve all the image information captured by the camera's light sensor is to record pictures in RAW format. However, this function is not available in all cameras, and not every amateur photographer is prepared to take on the painstaking work of selecting individual settings for each picture.

To reduce the risk of losing detail in high-contrast images converted inside the camera to 8-bit JPEG, many manufacturers have added special functions to their devices (not only compact cameras, but SLRs as well) that compress the dynamic range of the saved images without user intervention. By reducing the overall contrast and sacrificing a small part of the information in the original image, such solutions make it possible to preserve in an 8-bit JPEG the highlight and shadow details recorded by the light-sensitive sensor, even if the dynamic range of the original image is wider than 8 EV.

One of the pioneers in this field was HP. The HP Photosmart 945 digital camera, launched in 2003, was the world's first to implement HP Adaptive Lighting technology, which automatically compensates for the lack of light in dark areas of images and thus preserves shadow detail without the risk of overexposure (which is very important when shooting high-contrast scenes). The HP Adaptive Lighting algorithm is based on the principles set out by the American scientist Edwin Land in his Retinex theory of human visual perception.

HP Adaptive Lighting Feature Menu

How does Adaptive Lighting work? After a 12-bit image of the frame is obtained, an auxiliary monochrome image is extracted from it, which is essentially a light map. When the image is processed, this map is used as a mask that controls how strongly a rather complex digital filter acts on the image. Thus, in areas corresponding to the brightest points of the map, the effect on the future picture is minimal, and in the darkest areas it is strongest. This approach brings out details in the shadows by selectively brightening these areas and, accordingly, reducing the overall contrast of the resulting image.
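The general idea of a light map used as a mask can be sketched as follows. This is not HP's algorithm, which is not described in detail here. It is a simplified illustration on a single row of 12-bit grayscale samples, assuming that the correction is strongest where the map is dark (as the shadow brightening described above implies) and using a plain square-root curve as a stand-in for the "rather complex digital filter". All names and parameters are hypothetical.

```python
def box_blur(row, radius):
    """Smooth a list of values with a moving average (windows shrink at the edges)."""
    blurred = []
    for i in range(len(row)):
        window = row[max(0, i - radius): i + radius + 1]
        blurred.append(sum(window) / len(window))
    return blurred

def brighten_shadows(row, max_value=4095, strength=1.0, radius=1):
    """Lift dark areas of a 12-bit row, leaving bright areas almost untouched."""
    # Light map: smoothed brightness scaled to 0 (dark) .. 1 (bright).
    light_map = [v / max_value for v in box_blur(row, radius)]
    result = []
    for value, light in zip(row, light_map):
        # Stand-in filter: a square-root curve that never exceeds max_value.
        lifted = max_value * (value / max_value) ** 0.5
        # Mask weight: close to `strength` in shadows, close to 0 in highlights.
        weight = strength * (1.0 - light)
        result.append(round(value + weight * (lifted - value)))
    return result

row = [100, 120, 110, 3900, 4000, 3950]   # three shadow samples, three highlights
print(brighten_shadows(row))
```

Because the lifted value never exceeds the maximum level and the blend weight stays between 0 and 1 (for strength up to 1), the highlights cannot be pushed into clipping, while the shadow samples are raised several times over.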

It should be noted that when the Adaptive Lighting function is enabled, the captured image is processed in the manner described above before the final image is written to a file. All the described operations are performed automatically, and the user can only select one of the two Adaptive Lighting modes in the camera menu (low or high level of correction) or disable the feature.

Generally speaking, many of the specific functions of modern digital cameras (including the face recognition systems discussed in the previous article) are by-products or spin-offs of research projects originally carried out for military customers. As far as image dynamic range optimization is concerned, one of the best-known providers of such solutions is Apical. The algorithms created by its employees underlie, in particular, the SAT (Shadow Adjustment Technology) function implemented in a number of Olympus digital cameras. Briefly, the SAT function works as follows: a mask corresponding to the darkest areas is created from the original image of the frame, and then the exposure level is automatically corrected for these areas.

Sony has also acquired a license to use Apical's developments. Many compact cameras in the Cyber-shot series and SLR cameras in the alpha series have a function called Dynamic Range Optimizer (DRO).

Photos taken with the HP Photosmart R927 with Adaptive Lighting turned off (top) and turned on

When DRO is activated, the image is corrected during the initial image processing (that is, before the finished JPEG file is written). In the basic version, DRO has two settings (the standard or advanced mode can be selected in the menu). In standard mode, the exposure is adjusted based on an analysis of the image, and then a tone curve is applied to even out the overall balance. Advanced mode uses a more complex algorithm that makes corrections in both shadows and highlights.

Sony developers are constantly working on improving the DRO algorithm. For example, in the a700 SLR camera, when the advanced DRO mode is activated, it is possible to select one of five correction options. In addition, it is possible to save three variants of one image at once (a kind of bracketing) with different DRO settings.

Many Nikon digital cameras have D-Lighting, which is also based on Apical algorithms. However, unlike the solutions described above, D-Lighting is implemented as a filter for processing previously saved images using a tone curve whose shape makes the shadows lighter while keeping the rest of the image unchanged. But since in this case finished 8-bit images are processed (and not the original image of the frame, which has a higher bit depth and, accordingly, a wider dynamic range), the possibilities of D-Lighting are very limited. The user can get the same result by processing the image in a graphics editor.
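A tone curve of this kind is easy to express as a lookup table for 8-bit values. The sketch below is not Nikon's or Apical's curve. It is an assumed shape (a power curve below a "knee" level, identity above it) with hypothetical parameter names, shown only to illustrate why working on finished 8-bit data is limiting.

```python
def build_shadow_lut(knee=96, gamma=0.6):
    """256-entry lookup table: lift levels below `knee`, keep the rest unchanged."""
    lut = []
    for level in range(256):
        if level >= knee:
            lut.append(level)                       # midtones and highlights untouched
        else:
            lut.append(round(knee * (level / knee) ** gamma))  # shadows lifted
    return lut

def apply_lut(pixels, lut):
    """Apply the tone curve to a sequence of 8-bit pixel values."""
    return [lut[p] for p in pixels]

lut = build_shadow_lut()
print(apply_lut([0, 8, 24, 64, 128, 250], lut))

# Only 96 input levels exist below the knee, so stretching the darkest ones
# leaves gaps between output levels and merges some of the levels near the knee.
print(len(set(lut[:96])), "distinct output levels for 96 shadow input levels")
```

The curve cannot create tonal steps that the 8-bit file does not contain, which is the limitation noted above for filters applied after the image has been saved.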

When comparing enlarged fragments, it is clearly seen that the dark areas of the original image (left) become lighter when the Adaptive Lighting function is turned on

There are also a number of solutions based on other principles. For example, many cameras of Panasonic's Lumix family (in particular, the DMC-FX35, DMC-TZ4, DMC-TZ5, DMC-FS20, DMC-FZ18, etc.) implement the Intelligent Exposure function, which is an integral part of the iA intelligent automatic shooting control system. Intelligent Exposure is based on automatic analysis of the frame and correction of dark areas of the image to avoid loss of detail in the shadows, as well as (if necessary) compression of the dynamic range of high-contrast scenes.

In some cases, dynamic range optimization involves not only processing the original image of the frame but also adjusting the shooting settings. For example, new Fujifilm digital cameras (in particular, the FinePix S100FS) implement a Wide Dynamic Range (WDR) function, which, according to the developers, increases the photographic latitude by one or two steps (in terms of settings - 200 and 400%).

When the WDR function is activated, the camera takes pictures with an exposure compensation of -1 or -2 EV (depending on the selected setting). Thus, the image of the frame is underexposed - this is necessary in order to preserve the maximum information about the details in the highlights. Then the resulting image is processed using a tone curve, which allows you to even out the overall balance and adjust the black level. The image is then converted to 8-bit format and recorded as a JPEG file.
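The sequence described above (underexpose, then restore brightness with a tone curve, then convert to 8 bits) can be sketched as follows. This is not Fujifilm's processing. It assumes linear scene brightness values relative to the sensor's saturation level at normal exposure, and uses a simple hypothetical curve that restores the shadows by the full compensation amount while compressing the highlights.

```python
def capture(scene, ev_compensation=0):
    """Simulate exposure: scale by 2**EV and clip at sensor saturation (1.0)."""
    scale = 2.0 ** ev_compensation
    return [min(1.0, value * scale) for value in scene]

def restore_curve(x, ev_compensation):
    """Stand-in tone curve: gain of 2**(-EV) near black, easing to 1.0 at white."""
    gain = 2.0 ** (-ev_compensation)
    return gain * x / (1.0 + (gain - 1.0) * x)

def to_8bit(values):
    """Quantize 0..1 values to 8-bit levels."""
    return [round(255 * v) for v in values]

# Hypothetical scene: values above 1.0 exceed saturation at normal exposure.
scene = [0.05, 0.4, 1.2, 1.8]

normal = to_8bit(capture(scene))
wdr = to_8bit([restore_curve(v, -1) for v in capture(scene, -1)])

print("normal exposure:", normal)   # the two brightest areas clip to the same level
print("-1 EV + curve:  ", wdr)      # the brightest areas stay distinguishable
```

The price is visible in the middle of the range: the compressed highlights occupy fewer output levels than they would in a scene that fit the sensor without compensation.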

Dynamic range compression allows more detail to be retained in highlights and shadows, but the inevitable consequence of such processing is a decrease in overall contrast. In the bottom image the texture of the clouds is rendered much better; however, due to the lower contrast, this version of the image looks less natural

A similar function called Dynamic Range Enlargement is implemented in a number of Pentax compact and SLR cameras (Optio S12, K200D, etc.). According to the manufacturer, the use of the Dynamic Range Enlargement function allows you to increase the photographic latitude of images by 1 EV without losing details in highlights and shadows.

A comparable function called Highlight Tone Priority (HTP) is implemented in a number of Canon SLR models (EOS 40D, EOS 450D, etc.). According to the user manual, activating HTP improves detail in the highlights (more specifically, in the range from 18% gray to the brightest highlights).

Conclusion

Let's summarize. Built-in dynamic range compression makes it possible to convert an original image with a large dynamic range to an 8-bit JPEG file with minimal losses. When the camera cannot save frames in RAW, the dynamic range compression mode lets the photographer use more of the camera's potential when shooting high-contrast scenes.

Of course, keep in mind that dynamic range compression is not a miracle cure, but rather a compromise. Preserving detail in highlights and/or shadows comes at the price of increased noise in the dark areas of the image, reduced contrast, and some coarsening of smooth tonal transitions.

Like any automatic function, the dynamic range compression algorithm is not a universal solution that can improve absolutely any picture. Therefore, it makes sense to activate it only when it is really needed. For example, to shoot a silhouette against a well-rendered background, the dynamic range compression function must be disabled - otherwise a striking shot will be hopelessly spoiled.

Concluding the consideration of this topic, it should be noted that the use of dynamic range compression functions does not allow you to “pull out” details in the resulting image that were not captured by the camera sensor. To obtain a satisfactory result when shooting high-contrast scenes, it is necessary to use additional devices (for example, gradient filters for photographing landscapes) or special techniques (such as taking several exposure bracketed shots and then combining them into one image using Tone Mapping technology).

The next article will focus on the burst shooting feature.

To be continued
