Implement image zoom with a linear interpolation algorithm



Raptor [Mental Studio]

Http://eental.mentsu.com

Anyone who has worked with images under Windows probably knows that the Windows GDI has an API function, StretchBlt, which corresponds to the StretchDraw method of the TCanvas class in VCL. It makes scaling an image very simple. The problem is that it uses the fastest and simplest, but also the worst-looking, algorithm: "nearest neighbor". In most cases that is good enough, but it falls short when the requirements are higher.

Not long ago I wrote a small tool (see "My Album of the Personal Information Assistant") to manage the pile of photos I took with my digital camera. One of its features is zooming. The current version uses StretchDraw, and the result is sometimes unsatisfactory, so I had always wanted to add two better methods: linear interpolation and cubic splines. After some study I found that cubic splines require too much computation to be practical, so I decided to implement only linear interpolation.

From the basic theory of digital image processing, we know that a geometric transformation of an image is a coordinate transformation from the source image to the target image. The naive idea is to convert the coordinates of each source point to the corresponding target point, but this causes two problems: the resulting target coordinates are usually not integers, and for operations such as enlargement some target points have no source point mapped onto them at all. These are the shortcomings of the so-called "forward mapping" method, so the "reverse mapping" method is generally used instead.
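To make the difference concrete, here is a small one-dimensional sketch. The function names countHolesForward and countHolesReverse are made up for this illustration, and the coordinates are simply truncated to integers. It counts how many target positions are never written when a row is scaled with each kind of mapping.

```cpp
#include <cassert>
#include <vector>

// Forward mapping: visit every source position u and mark the target
// position it lands on. Returns how many targets were never written.
int countHolesForward(int srcW, int dstW) {
    assert(srcW > 0 && dstW > 0);
    std::vector<bool> written(dstW, false);
    for (int u = 0; u < srcW; ++u) {
        int x = u * dstW / srcW;  // x = u * su, truncated to an integer
        written[x] = true;
    }
    int holes = 0;
    for (bool w : written) {
        if (!w) ++holes;
    }
    return holes;
}

// Reverse mapping: visit every target position x and look up the source
// position it comes from. Every target is written exactly once.
int countHolesReverse(int srcW, int dstW) {
    assert(srcW > 0 && dstW > 0);
    std::vector<bool> written(dstW, false);
    for (int x = 0; x < dstW; ++x) {
        int u = x * srcW / dstW;  // u = x / su, truncated to an integer
        assert(u >= 0 && u < srcW);
        written[x] = true;
    }
    int holes = 0;
    for (bool w : written) {
        if (!w) ++holes;
    }
    return holes;
}
```

Enlarging a row of 3 pixels to 6 with forward mapping leaves half of the target positions empty, while reverse mapping leaves none.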

However, with reverse mapping the coordinates mapped back into the source image are usually not integers either. This is where a "resampling filter" is needed. The term sounds very professional, but only because it borrows the usual vocabulary of electronic signal processing (in most cases its function is similar to a band-pass filter in signal processing). The idea itself is not complicated: it is the question of how to decide what color a non-integer coordinate should have. The three methods mentioned earlier, nearest neighbor, linear interpolation and cubic spline, are all "resampling filters".

The "nearest neighbor" method rounds the non-integer coordinate and takes the color of the nearest integer coordinate. The "linear interpolation" method takes the few nearest points (four for a planar image) and estimates the color by linear interpolation (two-dimensional linear interpolation for a planar image). In most cases it is more accurate than nearest neighbor, and it looks much better too; the most obvious improvement is that jagged edges are far less visible. It does have a drawback of its own: the image becomes somewhat soft. In professional terms (heh, showing off a little), this filter has good stop-band behavior but some pass-band loss, and the shape factor of its pass-band curve is not high. As for cubic splines, I will not go into them here; they are more complicated, and you can consult a professional book on digital image processing, such as the one in the references.
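As a minimal sketch of the nearest neighbor rule described above: the GrayImage type and the sampleNearest function below are hypothetical helpers for an 8-bit grayscale image, not part of VCL or of my program.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, stored row by row.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // width * height entries

    std::uint8_t at(int x, int y) const { return pixels[y * width + x]; }
};

// Nearest neighbor: round (u, v) to the closest integer coordinate and
// return the value stored there, unchanged.
std::uint8_t sampleNearest(const GrayImage& img, double u, double v) {
    assert(u >= 0.0 && u <= img.width - 1);
    assert(v >= 0.0 && v <= img.height - 1);
    int x = static_cast<int>(std::lround(u));
    int y = static_cast<int>(std::lround(v));
    return img.at(x, y);
}
```

Because the result is always a copy of one existing pixel, the output jumps abruptly from one source pixel to the next as (u, v) moves, which is where the jagged edges come from.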

Let's discuss the algorithm for coordinate transformation. Simple spatial transformation can be represented by a transformation matrix:

[x', y', w'] = [u, v, w] * T

Where: x', y' are the target image coordinates, u, v are the source image coordinates, w and w' are the homogeneous coordinates, usually set to 1, and T is a 3x3 transformation matrix.

This notation is very mathematical, but it makes it convenient to express many different transformations, such as translation, rotation and scaling. For scaling it amounts to:

                        | su  0   0 |
[x, y, 1] = [u, v, 1] * | 0   sv  0 |
                        | 0   0   1 |

Where su and sv are the scaling ratios along the x axis and the y axis: a value greater than 1 enlarges, a value between 0 and 1 shrinks, and a value less than 0 mirrors the image. Does the matrix make you dizzy? In fact, multiplying out the formula above gives:

{x = u * su

{y = v * sv

It's that simple. ^ _ ^
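For readers who want to check the matrix form themselves, here is a small sketch. The names Vec3, Mat3, transform and scaling are made up for this illustration. It multiplies a row vector by a 3x3 matrix, so a scaling matrix can be compared against the two simple equations.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Row vector times 3x3 matrix: r = p * T.
Vec3 transform(const Vec3& p, const Mat3& T) {
    Vec3 r{0.0, 0.0, 0.0};
    for (int col = 0; col < 3; ++col) {
        for (int row = 0; row < 3; ++row) {
            r[col] += p[row] * T[row][col];
        }
    }
    return r;
}

// Scaling matrix with ratio su along the x axis and sv along the y axis.
Mat3 scaling(double su, double sv) {
    assert(su != 0.0 && sv != 0.0);
    return Mat3{{{su, 0.0, 0.0}, {0.0, sv, 0.0}, {0.0, 0.0, 1.0}}};
}
```

Calling transform with the vector [3, 4, 1] and scaling(2.0, 0.5) yields [6, 2, 1], the same as x = u * su and y = v * sv. Passing scaling(1 / su, 1 / sv) instead gives the inverse transform that reverse mapping needs.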

With these three pieces of groundwork in place, we can start writing code. The idea is simple: use two nested loops to traverse every point of the target image, and obtain the source coordinates through the transform above (note: with reverse mapping the transform to use is u = x / su and v = y / sv). Because the source coordinates are generally not integers, a two-dimensional linear interpolation is required:

P = n * b * PA + n * (1 - b) * PB + (1 - n) * b * PC + (1 - n) * (1 - b) * PD

Where: n is the difference between the y coordinate of the nearest source row below v and v itself (v is the y coordinate of the mapped point in the source image and is generally not an integer), so n = 1 when v falls exactly on a row; b is defined the same way along the x axis. PA to PD are the colors of the four source pixels nearest to (u, v) (top left, top right, bottom left, bottom right), read with TCanvas's Pixels property. P is the interpolated color at (u, v), that is, the approximate color of the target point (x, y).
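For reference only, the unoptimized floating-point version can be sketched in portable C++ as follows. It is not the code from my program: GrayImage and resizeBilinearFloat are hypothetical names, the image is assumed to be 8-bit grayscale, and the scaling ratio is assumed to be (target size - 1) / (source size - 1) so that the corner pixels map onto each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, stored row by row.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const { return pixels[y * width + x]; }
};

GrayImage resizeBilinearFloat(const GrayImage& src, int dstW, int dstH) {
    assert(src.width >= 1 && src.height >= 1);
    assert(dstW >= 2 && dstH >= 2);
    GrayImage dst{dstW, dstH, std::vector<std::uint8_t>(dstW * dstH)};
    // Reverse mapping factors: u = x / su, v = y / sv.
    const double invSu = static_cast<double>(src.width - 1) / (dstW - 1);
    const double invSv = static_cast<double>(src.height - 1) / (dstH - 1);
    for (int y = 0; y < dstH; ++y) {
        const double v = y * invSv;
        const int y0 = std::min(static_cast<int>(std::floor(v)), src.height - 1);
        const int y1 = std::min(y0 + 1, src.height - 1);  // clamp at the bottom edge
        const double n = 1.0 - (v - y0);                  // weight of the upper row
        for (int x = 0; x < dstW; ++x) {
            const double u = x * invSu;
            const int x0 = std::min(static_cast<int>(std::floor(u)), src.width - 1);
            const int x1 = std::min(x0 + 1, src.width - 1);  // clamp at the right edge
            const double b = 1.0 - (u - x0);                 // weight of the left column
            const double p = n * b * src.at(x0, y0)
                           + n * (1.0 - b) * src.at(x1, y0)
                           + (1.0 - n) * b * src.at(x0, y1)
                           + (1.0 - n) * (1.0 - b) * src.at(x1, y1);
            dst.pixels[y * dstW + x] = static_cast<std::uint8_t>(std::lround(p));
        }
    }
    return dst;
}
```

Enlarging the 2x2 image {0, 100, 100, 200} to 3x3 puts 100 in the center, the average of all four source pixels, and 50 halfway along the top edge.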

I will not write that code out, because it is extremely inefficient: it performs a long string of floating-point operations on each RGB component of every point in the target image. So it has to be optimized. For VCL applications there is a relatively simple optimization: use TBitmap's ScanLine property to process the image line by line and avoid the pixel-level access of Pixels, which improves performance greatly. This is basic knowledge for image processing with VCL. However, the method does not always apply so directly; rotating an image, for example, requires more tricks.

In any case, floating-point operations cost much more than integer operations, so this also has to be optimized. As can be seen above, floating-point numbers enter during the transform, because the parameters su and sv are usually floating-point numbers, so that is where the optimization starts. In general, su and sv can be written as fractions:

su = (double)DW / SW; sv = (double)DH / SH

Where DW, DH are the width and height of the target image and SW, SH are the width and height of the source image (since these are all integers, a type cast is needed to get a floating-point result).

Substituting the new su, sv into the earlier transform and interpolation formulas gives new interpolation formulas. Because:

b = 1 - x * SW % DW / (double)DW; n = 1 - y * SH % DH / (double)DH

let:

B = DW - x * SW % DW; N = DH - y * SH % DH

then:

b = B / (double)DW; n = N / (double)DH

Using the integers B, N in place of the floating-point b, n, the interpolation formula becomes:

P = (B * N * (PA - PB - PC + PD) + DW * N * PB + DH * B * PC + (DW * DH - DH * B - DW * N) * PD) / (double)(DW * DH)

Here the final result P is still a floating-point number and has to be rounded to get the result. To eliminate floating point completely, the rounding can be folded into the integer division:

P = (B * N ... * PD + DW * DH / 2) / (DW * DH)

In this way P comes out directly as the rounded integer value, and every step of the calculation is an integer operation.
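The integer formula can be checked against the floating-point one with a small sketch like this. The function names interpolateInt and interpolateFloat are made up for the comparison; the parameters follow the symbols used above.

```cpp
#include <cassert>
#include <cmath>

// Integer-only interpolation of one color channel.
// B in [0, DW] and N in [0, DH] are the integer weights defined above.
int interpolateInt(int B, int N, int DW, int DH,
                   int PA, int PB, int PC, int PD) {
    assert(DW > 0 && DH > 0);
    assert(B >= 0 && B <= DW && N >= 0 && N <= DH);
    return (B * N * (PA - PB - PC + PD) + DW * N * PB + DH * B * PC
            + (DW * DH - DH * B - DW * N) * PD
            + DW * DH / 2)  // adding half the divisor rounds to nearest
           / (DW * DH);
}

// Floating-point reference using b = B / DW and n = N / DH.
int interpolateFloat(int B, int N, int DW, int DH,
                     int PA, int PB, int PC, int PD) {
    assert(DW > 0 && DH > 0);
    const double b = static_cast<double>(B) / DW;
    const double n = static_cast<double>(N) / DH;
    const double p = n * b * PA + n * (1.0 - b) * PB
                   + (1.0 - n) * b * PC + (1.0 - n) * (1.0 - b) * PD;
    return static_cast<int>(std::floor(p + 0.5));
}
```

With B = DW and N = DH the result is exactly PA, and with B = 0 and N = 0 it is exactly PD, which matches the meaning of the weights.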

Simple optimized code is as follows:

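int __fastcall TResizeDlg::Stretch_Linear(Graphics::TBitmap * aDest, Graphics::TBitmap * aSrc)
{
    // Width and height minus one, because coordinates start at 0.
    // dw and dh are used as divisors, so aDest must be at least 2x2 pixels.
    int sw = aSrc->Width - 1, sh = aSrc->Height - 1, dw = aDest->Width - 1, dh = aDest->Height - 1;
    int B, N, x, y;
    // Bytes per pixel; aSrc is assumed to have the same PixelFormat as aDest.
    int nPixelSize = GetPixelSize(aDest->PixelFormat);
    BYTE * pLinePrev, * pLineNext;
    BYTE * pDest;
    BYTE * pA, * pB, * pC, * pD;
    for (int i = 0; i <= dh; ++i)
    {
        pDest = (Byte *)aDest->ScanLine[i];
        y = i * sh / dh;          // source row above the mapped point
        N = dh - i * sh % dh;     // integer weight of that row
        pLinePrev = (Byte *)aSrc->ScanLine[y];
        // N == dh: the point lies exactly on a row, so do not read past the last row.
        pLineNext = (N == dh) ? pLinePrev : (Byte *)aSrc->ScanLine[y + 1];
        for (int j = 0; j <= dw; ++j)
        {
            x = j * sw / dw * nPixelSize;   // byte offset of the left source pixel
            B = dw - j * sw % dw;           // integer weight of the left column
            pA = pLinePrev + x;
            pB = pA + nPixelSize;
            pC = pLineNext + x;
            pD = pC + nPixelSize;
            if (B == dw)    // exactly on a column: do not read past the last pixel
            {
                pB = pA;
                pD = pC;
            }
            // Interpolate each byte (color channel) of the pixel separately.
            for (int k = 0; k < nPixelSize; ++k)
            {
                *pDest++ = (Byte)(int)(
                    (B * N * (pA[k] - pB[k] - pC[k] + pD[k]) + dw * N * pB[k]
                    + dh * B * pC[k] + (dw * dh - dh * B - dw * N) * pD[k]
                    + dw * dh / 2) / (dw * dh)
                );
            }
        }
    }
    return 0;
}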
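The routine above depends on VCL's TBitmap and ScanLine, so it only compiles in C++Builder. As a rough portable sketch of the same loop, the version below works on a plain byte buffer. The Image type and the stretchLinear function are hypothetical stand-ins for illustration, and the rows are assumed to be tightly packed with no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for TBitmap: tightly packed rows,
// pixelSize bytes per pixel (1, 3 or 4).
struct Image {
    int width;
    int height;
    int pixelSize;
    std::vector<std::uint8_t> data;

    Image(int w, int h, int ps)
        : width(w), height(h), pixelSize(ps),
          data(static_cast<std::size_t>(w) * h * ps) {}

    std::uint8_t* row(int y) {
        return data.data() + static_cast<std::size_t>(y) * width * pixelSize;
    }
    const std::uint8_t* row(int y) const {
        return data.data() + static_cast<std::size_t>(y) * width * pixelSize;
    }
};

void stretchLinear(Image& dest, const Image& src) {
    assert(dest.pixelSize == src.pixelSize);
    assert(dest.width >= 2 && dest.height >= 2);  // dw and dh are divisors
    assert(src.width >= 1 && src.height >= 1);
    const int sw = src.width - 1, sh = src.height - 1;
    const int dw = dest.width - 1, dh = dest.height - 1;
    const int ps = src.pixelSize;
    for (int i = 0; i <= dh; ++i) {
        std::uint8_t* pDest = dest.row(i);
        const int y = i * sh / dh;       // source row above the mapped point
        const int N = dh - i * sh % dh;  // integer weight of that row
        const std::uint8_t* prev = src.row(y);
        const std::uint8_t* next = (N == dh) ? prev : src.row(y + 1);
        for (int j = 0; j <= dw; ++j) {
            const int x = j * sw / dw * ps;  // byte offset of the left pixel
            const int B = dw - j * sw % dw;  // integer weight of the left column
            const std::uint8_t* pA = prev + x;
            const std::uint8_t* pC = next + x;
            // On an exact column, reuse the left pixel instead of reading
            // past the end of the row.
            const std::uint8_t* pB = (B == dw) ? pA : pA + ps;
            const std::uint8_t* pD = (B == dw) ? pC : pC + ps;
            for (int k = 0; k < ps; ++k) {
                *pDest++ = static_cast<std::uint8_t>(
                    (B * N * (pA[k] - pB[k] - pC[k] + pD[k]) + dw * N * pB[k]
                     + dh * B * pC[k] + (dw * dh - dh * B - dw * N) * pD[k]
                     + dw * dh / 2) / (dw * dh));
            }
        }
    }
}
```

Filling a 2x2 single-channel Image with {0, 100, 100, 200} and stretching it into a 3x3 Image gives 100 in the center and 50 in the middle of the top row.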

It is still fairly simple, I would say. Because width and height are counted from 0, one is subtracted from each. GetPixelSize determines how many bytes each pixel occupies from the PixelFormat property. This code only supports 24-bit or 32-bit color. For 15-bit or 16-bit color the channels have to be split apart by bit first, because otherwise unwanted carries between channels occur during the calculation and scramble the colors, which is more troublesome to handle. For indexed color of 8 bits or fewer the palette has to be looked up and the result re-indexed, which is also troublesome. So neither is supported, although 8-bit grayscale images can be. In addition, some code was added to prevent out-of-bounds access at the image edges. By comparison, on a PIII-733 machine with a target image no larger than 1024x768, it does not feel noticeably slower than StretchDraw (with floating point the slowdown is clearly noticeable). The result is also quite satisfactory: whether shrinking or enlarging, the image quality is clearly better than with StretchDraw.
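To illustrate what splitting a 16-bit pixel by bit means, here is a sketch that assumes the common 5-6-5 layout (5 bits red, 6 bits green, 5 bits blue). The names Rgb565Channels, unpack565 and pack565 are made up for this illustration; my code above does not do this.

```cpp
#include <cassert>
#include <cstdint>

// Separate channel values of one 16-bit pixel, assuming a 5-6-5 layout.
struct Rgb565Channels {
    int r;  // 0..31
    int g;  // 0..63
    int b;  // 0..31
};

// Split the packed pixel so that each channel can be interpolated on its
// own, without carries spilling from one channel into the next.
Rgb565Channels unpack565(std::uint16_t pixel) {
    return {(pixel >> 11) & 0x1F, (pixel >> 5) & 0x3F, pixel & 0x1F};
}

// Reassemble the interpolated channels into one 16-bit pixel.
std::uint16_t pack565(const Rgb565Channels& c) {
    assert(c.r >= 0 && c.r <= 31);
    assert(c.g >= 0 && c.g <= 63);
    assert(c.b >= 0 && c.b <= 31);
    return static_cast<std::uint16_t>((c.r << 11) | (c.g << 5) | c.b);
}
```

Each of the three channel values would then go through the interpolation formula separately before being packed again.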

However, because integer arithmetic is used, there is one problem to watch out for: overflow. The denominator in the formula is dw * dh and the result must be an 8-bit value, while a signed integer can hold 31 bits of magnitude, so dw * dh must not exceed 23 bits. In other words, for a target image with a 2:1 aspect ratio the resolution must not exceed 4096 x 2048. Of course, the range can be extended by using unsigned numbers (which gains one bit) or by reducing the calculation precision; interested readers can try this themselves.
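Another way around the limit, which is not what my code does, is to carry out the calculation in 64-bit integers. A minimal sketch, with the hypothetical name interpolateInt64 and the same parameters as the formula:

```cpp
#include <cassert>
#include <cstdint>

// Same integer formula, evaluated in 64-bit arithmetic so that the
// products with DW * DH cannot overflow for any realistic image size.
int interpolateInt64(int B, int N, int DW, int DH,
                     int PA, int PB, int PC, int PD) {
    assert(DW > 0 && DH > 0);
    assert(B >= 0 && B <= DW && N >= 0 && N <= DH);
    const std::int64_t b = B, n = N, dw = DW, dh = DH;
    const std::int64_t sum = b * n * (PA - PB - PC + PD) + dw * n * PB
                           + dh * b * PC + (dw * dh - dh * b - dw * n) * PD
                           + dw * dh / 2;  // rounding term
    return static_cast<int>(sum / (dw * dh));
}
```

On a 32-bit machine this costs more than plain int arithmetic, so whether it is worth it depends on how large the target images really are.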

Of course, this code is far from fully optimized, and there are still many issues it does not address, such as anti-aliasing. Interested readers can study the relevant books, and if you come up with any results, you are very welcome to write plugins for my program.

[Mental Studio] Raptor

2004-3-28

References:

Cui Wei, "Digital Image Processing Technology and Application", Electronic Industry Press, 1997
