office

Domestic data encryption standard. Domestic data encryption standard

The algorithm defined by GOST 28147-89 has an encryption key length of 256 bits. It encrypts information in blocks of 64 bits (such algorithms are called block algorithms), which are then divided into two sub-blocks of 32 bits (N1 and N2) (Figure 1). Subblock N1 is processed in a certain way, after which its value is added to the value of subblock N2 (addition is performed modulo 2, i.e., the logical XOR operation is applied - “exclusive or”), and then the subblocks are swapped. This transformation is performed a certain number of times ("rounds"): 16 or 32 depending on the mode of the algorithm. Two operations are performed in each round.

Figure 1. Scheme of the algorithm GOST 28147-89.

The first is keying. The contents of subblock N1 are added modulo 2 to the 32-bit part of the key Kx. The full encryption key is represented as a concatenation of 32-bit subkeys: K0, K1, K2, K3, K4, K5, K6, K7. One of these subkeys is used in the encryption process, depending on the round number and algorithm operation mode.

The second operation is a table replacement. After keying, subblock N1 is divided into 8 parts of 4 bits, the value of each of which is replaced in accordance with the replacement table for this part of the subblock. The subblock is then bitwise left rotated by 11 bits.

Table replacements (Substitution box - S-box) are often used in modern encryption algorithms, so it is worth explaining how such an operation is organized. The output values of the blocks are written to the table. A data block of a certain dimension (in our case, 4-bit) has its own numerical representation, which determines the number of the output value. For example, if the S-box looks like 4, 11, 2, 14, 15, 0, 8, 13, 3, 12, 9, 7, 5, 10, 6, 1 and a 4-bit block "0100" came to the input (value 4), then, according to the table, the output value will be 15, i.e. "1111" (0 a 4, 1 a 11, 2 a 2 ...).

The algorithm defined by GOST 28147-89 provides for four modes of operation: simple replacement, scaling, scaling with feedback, and generation of imitation prefixes. They use the same encryption transformation described above, but since the purpose of the modes is different, this transformation is carried out in each of them differently.

In the simple replacement mode, the 32 rounds described above are performed to encrypt each 64-bit block of information. In this case, 32-bit subkeys are used in the following sequence:

K0, K1, K2, K3, K4, K5, K6, K7, K0, K1, etc. - in rounds 1 to 24;

K7, K6, K5, K4, K3, K2, K1, K0 - in rounds 25 to 32.

Decryption in this mode is carried out in exactly the same way, but with a slightly different sequence of using subkeys:

K0, K1, K2, K3, K4, K5, K6, K7 - in rounds 1 to 8;

K7, K6, K5, K4, K3, K2, K1, K0, K7, K6, etc. - in rounds 9 to 32.

All blocks are encrypted independently of each other, i.e. the result of encryption of each block depends only on its content (the corresponding source block). If there are several identical blocks of the original (plain) text, the corresponding ciphertext blocks will also be the same, which provides additional useful information for a cryptanalyst trying to open the cipher. That's why this mode it is mainly used to encrypt the encryption keys themselves (multi-key schemes are often implemented, in which, for a number of reasons, the keys are encrypted on top of each other). To encrypt the information itself, two other modes of operation are intended - gamma and gamma with feedback.

In gamma mode, each plaintext block is bit-wise added modulo 2 to the 64-bit cipher gamma block. The cipher gamma is a special sequence, which is obtained as a result of certain operations with the N1 and N2 registers.

1. In registers N1 and N2, their initial filling is written - a 64-bit value called a sync message.
2. Encryption of the contents of registers N1 and N2 (in this case, sync messages) is performed in the simple replacement mode.
3. The content of register N1 is added modulo (232 - 1) with the constant C1 = 224 + 216 + 28 + 24, and the result of the addition is written to register N1.
4. The contents of register N2 are added modulo 232 with the constant C2 = 224 + 216 + 28 + 1, and the result of the addition is written to register N2.
5. The contents of registers N1 and N2 are output as a 64-bit cipher gamma block (in this case, N1 and N2 form the first gamma block).

If the next gamma block is needed (i.e. the encryption or decryption needs to continue), return to step 2.

For decryption, the gamma is generated in a similar way, and then the ciphertext and gamma bits are again XORed. Since this operation is reversible, in the case of a correctly generated gamma, the original text is obtained (Table 1).

Table 1. Encryption and decryption in gamma mode

To develop the cipher range required for decryption, the user decrypting the cryptogram must have the same key and the same value of the sync message that were used when encrypting the information. Otherwise, you will not be able to get the original text from the encrypted one.

In most implementations of the GOST 28147-89 algorithm, the sync message is not secret, but there are systems where the sync message is the same secret element as the encryption key. For such systems, the effective length of the algorithm key (256 bits) is increased by another 64 bits of the secret sync message, which can also be considered as a key element.

In the feedback gamma mode, to fill the registers N1 and N2, starting from the 2nd block, not the previous gamma block is used, but the result of encryption of the previous plaintext block (Figure 2). The first block in this mode is generated in exactly the same way as the previous one.

Figure 2. Generation of the cipher gamma in the feedback gamma mode.

Considering the mode of generation of imitation prefixes, it is necessary to define the concept of the subject of generation. A spoof is a cryptographic checksum calculated using an encryption key and designed to check the integrity of messages. When generating a prefix, the following operations are performed: the first 64-bit block of the information array, for which the prefix is calculated, is written to registers N1 and N2 and encrypted in the reduced simple replacement mode (the first 16 rounds out of 32 are performed). The result obtained is summed modulo 2 with the next block of information, saving the result in N1 and N2.

The cycle repeats until the last block of information. The 64-bit contents of the registers N1 and N2, or part of it, resulting from these transformations, is called the imitation prefix. The size of the prefix is chosen based on the required reliability of messages: with a length of prefix r bits, the probability that a change in the message will go unnoticed is 2-r. Most often, a 32-bit prefix is used, i.e. half the contents of registers This is sufficient, since, like any checksum, the imitation prefix is primarily intended to protect against accidental distortion of information. To protect against deliberate modification of data, other cryptographic methods are used - primarily an electronic digital signature.

When exchanging information, the imitation prefix serves as a kind of additional means of control. It is calculated for the plaintext when some information is encrypted and is sent along with the ciphertext. After decryption, a new value of the imitation prefix is calculated, which is compared with the one sent. If the values do not match, it means that the ciphertext was distorted during transmission or incorrect keys were used during decryption. The imitation prefix is especially useful for checking the correct decryption of key information when using multi-key schemes.

The GOST 28147-89 algorithm is considered a very strong algorithm - at present, no more effective methods have been proposed for its disclosure than the "brute force" method mentioned above. Its high security is achieved primarily due to the large key length - 256 bits. When using a secret sync message, the effective key length is increased to 320 bits, and the secret of the substitution table adds additional bits. In addition, cryptographic strength depends on the number of rounds of transformations, which, according to GOST 28147-89, should be 32 (the full effect of input data dispersion is achieved after 8 rounds).

The advantages of GOST 28147-89 are the presence of protection against the imposition of false data (the development of imitation insertion) and the same encryption cycle in all four GOST algorithms.

“While you are alive, do not die, look at this world.
Many here have a dead soul - they are dead inside.
But they walk and laugh, not knowing that they are not there,
Don't rush your death," she told me.

Aria, "Up There"

Introduction

Introduction to block ciphers

2.1 Feistel networks.
2.2 Block cipher GOST 28147-89

Theoretical minimum

3.1 Key information
3.2 The main step of the crypto transformation

3.3 Basic cycles:“ 32-З” , “ 32-R” .

Practice

4.1 Implementation of the main step of crypto transformation
4.2 Increasing the speed of the algorithm
5.
6. List of used literature
7. Thanks.

Introduction.

This document is my attempt to describe the method of simply replacing the GOST 28147-89 encryption algorithm in the simplest, but, nevertheless, technically literate language. How well I succeeded, the reader will give his opinion after reading the first six points.

In order for my work to be more useful, I recommend arming yourself with the works of the authors indicated in the list of used literature. A calculator is also recommended, so that it has a function for calculating the operation XOR, because reading the article assumes that the reader has set out to study this encryption algorithm. Although it is also suitable as a reference tool, I wrote this article precisely as a tutorial.

Introduction to block ciphers.

Before we begin to consider the algorithm, we need to familiarize ourselves with the history of the creation of this kind of cipher. The algorithm belongs to the category of block ciphers, in the architecture of which the information is divided into a finite number of blocks, the final one naturally may not be complete. The encryption process takes place precisely over the full blocks, which form the ciphertext. The final block, if it is incomplete, is supplemented with something (I will talk about the nuances of its addition below) and is encrypted in the same way as full blocks. By ciphertext, I mean - the result of the action of the encryption function on a certain amount of data that the user submitted for encryption. In other words, the ciphertext is the end result of encryption.

The history of the development of block ciphers is associated with the early 70s, when IBM company realizing the need to protect information when transmitting data over computer communication channels, she began to implement her own program scientific research dedicated to the protection of information in electronic networks, including cryptography.

A group of researchers - developers of the IBM company, which began to study encryption systems with a symmetric key scheme, was headed by Dr. Horst Feistel.

2.1 Feistel networks

The architecture of the new encryption method proposed by Feistel in the classical literature was called the “Feistel Architecture”, but at the moment a more established term is used in Russian and foreign literature - “Feistel's network” or Feistel`s NetWork. Subsequently, according to this architecture, the Lucifer cipher was built - which was later published and caused a new wave of interest in cryptography in general.

The idea of the Feistel network architecture is as follows: the input information stream is divided into blocks of n bits in size, where n is an even number. Each block is divided into two parts - L and R, then these parts are fed into an iterative block cipher, in which the result of the j-th stage is determined by the result of the previous stage j-1! This can be illustrated with an example:

Rice. one

Where, function A is the main action of the block cipher. May be a simple action, such as an XOR operation, or may have more complex view be a sequence of a number of simple actions - modulo addition, shift to the left, replacement of elements, etc., together these simple steps form the so-called - the main step of crypto-transformation.

It should be noted that the key elements of the function are the supply of key elements and the XOR operation, and how well thought out the work of these operations is, speaks of the cryptographic strength of the cipher as a whole.

In order for the idea of Feistel networks to be completely clear, consider the simplest case depicted in Fig. rice. one, where in the function A - the operations “mod 2” (“xor”) will perform, but this protozoan case, in a more serious situation, for example, hiding information of national importance, function A can be more complex (as far as I have seen, function A can indeed be very complicated):

Initial data:

L=1110b, R=0101, K=1111b

Get a ciphertext

(R + K) mod 2 4 = Smod, Smod = 0100b
(Smod + L) mod 2 = Sxor, Sxor = 1010b
L=R, R=Sxor

L=0101b, R=1010b

Let's explain our steps:

This operation is addition mod 2 4 . In practice, such an operation boils down to a simple addition, where we must add two numbers and ignore the carry to the 5th digit. Since, if we put exponents over the digits of the binary representation of a number, there will be an indicator of four over the fifth digit, take a look at the figure below, which shows the actions of our operation:

Rice. 2

Here I pointed to the exponents with an arrow, as you can see, the result should have been 10100, but since the carry is ignored during the mod 2 4 operation, we get 0100.

This operation in the literature is called mod 2, in assembly language it is implemented by the command XOR. But its more correct name is mod 2 1 . Without this unique operation, it is hardly possible to build a fast, easy-to-implement encryption algorithm and, at the same time, be quite crypto-resistant. The uniqueness of this operation lies in the fact that it is the reverse of itself! For example, if the number A is XORed with the number B, the result will be C, in the future it is enough to re-XOR the numbers B and C together to get the previous value of A!

In this operation, we got 1010 having the numbers 1110 and 0100, to get back 1110, it is enough to XOR the numbers 0100 and 1010 together! More details about this operation can be found in the article, which is embedded on the site. www.wasm.ru, « Elementary guide toCRC_error detection algorithms» the author who Ross N. Williams. There is a paragraph in this work - 5. Binary arithmetic excluding transfers". It is in this article that the operation is described xor! I exclaim because in this article this operation is described in such a way that the reader not only understands how this operation works, he even begins it see, hear and feel!

This action is necessary so that when decrypting from the ciphertext, you can get the original values.

2.2 Block cipher GOST 28147-89

The encryption algorithm GOST 28147 - 89 belongs to the category of block ciphers operating on the Feistel balanced network architecture, where two parts of the selected information block are of equal size. The algorithm was developed in the depths of the eighth department of the KGB, now transformed into FAPSI, and was fixed as an encryption standard. Russian Federation back in 1989 under the USSR.

For this method of the algorithm to work, it is necessary to break the information into blocks of 64 bits in size. Generate or enter into the encryption system the following key information: key and substitution table. The choice of the key and substitution table for encryption should be taken very seriously, because. this is the foundation of the security of your information. About what requirements are imposed on the key, and the table of substitutions, see the paragraph "Requirements for key information".

When considering the method, we will not focus on this, because. this article, as I said above, was written with the aim of teaching the reader how to encrypt data using the method of simply replacing this encryption algorithm, but we will definitely touch on this issue at the end of the article.

theoretical minimum.

3.1 Key information

As I said above, the following are actively involved in data encryption:

3.1.1. The key is a sequence of eight elements of 32 bits each. Further, we will denote the symbol K, and the elements of which it consists - k1,k2,k3,k4,k5,k6,k7,k8.

3.1.2 The substitution table is a matrix of eight rows and sixteen columns, hereinafter referred to as Hij. Each element at the intersection of row i and column j takes 4 bits.

The main action in the encryption process is the main step of the crypto transformation. This is nothing more than an action to encrypt data according to a certain algorithm, only the name the developers introduced was too cumbersome :).

Before starting to encrypt, the block is divided into two parts L and R, 32 bits each. The key element is selected and only then these two parts of the block are fed, the key element is the substitution table into the main step function, the result of the main step is one iteration of the base cycle, which will be discussed in the next paragraph. The main step consists of the following steps:

The addition part of the block R is added to the key element K mod 2 32 . I described a similar operation above, here it’s the same thing, only the exponent is not “4”, but “32” - the result of this operation will be denoted by Smod in the future.
We divide the previously obtained result Smod into four bit elements s7,s6,s5,s4,s3,s2,s1,s0 and feed it into the replacement function. The replacement is as follows: the element Smod - s i is selected, from the beginning we start from the lowest element, and replace it with the value from the replacement table by i - the row and column indicated by the value of the element s i . We pass to s i +1 element and do the same and continue until we replace the value of the last element Smod - the result of this operation will be denoted as Ssimple.
In this operation, the value of Ssimple is shifted cyclically to the left by 11 bits and we get Srol.
We select the second part of the block L and add mod 2 with Srol, as a result we have Sxor.
At this stage, the L part of the block becomes equal to the value of the R part, and the R part in turn is initialized with the result of Sxor, and this completes the main step function!

3.3 Basic cycles: “32-Z”, “32-R”.

In order to encrypt information, it is necessary to break it into blocks of 64 bits in size, naturally the last block can be less than 64 bits. This fact is the Achilles' heel of this "simple substitution" method. Since its addition to 64 bits is a very important task to increase the cryptographic strength of the ciphertext, this sensitive place, if it is present in the array of information, but it may not be (for example, a file of 512 bytes in size!), should be treated with great responsibility!

After you have broken the information into blocks, you should break the key into elements:

K = k1,k2,k3,k4,k5,k6,k7,k8

Encryption itself consists in using the so-called basic cycles. Which, in turn, include the n -th number of basic crypto-transformation steps.

Basic cycles are, how to say, marked: n - m. Where n is the number of basic crypto-transformation steps in the base loop, and m is the "type" of the base loop, i.e. what is it about, "Z" encryption or "R" encryption of data.

The basic encryption cycle 32–3 consists of 32 basic cryptographic steps. The block N and the element of the key K are fed into the function that implements the actions of the step, and the first step occurs with k1, the second over the result with the element k2, etc. according to the following scheme:

k1,k2,k3,k4,k5,k6,k7,k8,k1,k2,k3,k4,k5,k6,k7,k8,k1,k2,k3,k4,k5,k6,k7,k8k8,k7, k6,k5,k4,k3,k2,k1

The 32-P decryption process proceeds in a similar way, but the key elements are fed in the reverse order:

k1,k2,k3,k4,k5,k6,k7,k8,k8,k7,k6,k5,k4,k3,k2,k1,k8,k7,k6,k5,k4,k3,k2,k1,k8, k7,k6,k5,k4,k3,k2,k1

Practice.

After we got acquainted with the theory of how to encrypt information, it is time to see how encryption works in practice.

Initial data:

Let's take a block of information N = 0102030405060708h, here the parts L and R are equal:

L = 01020304h, R = 05060708h, take the key:

K=' as28 zw37 q839 7342ui23 8e2t wqm2 ewp1' (these are ASCII codes, in order to view the hexadecimal representation, you can open this file in view mode in Total Commander pressing the key " F3” and then the key “ 3 "). In this key, the element values will be:

k1 = 'as28', k2 = 'zw37', k3 = 'q839', k4 = '7342'

k5 = 'ui23', k6 = '8e2t', k7 = 'wqm2', k8 = 'ewp1'

Also take the following substitution table:

Rice. 3

Here the rows are numbered from 0 to 7, the columns from 0 to F.

A warning: All information, including the key with the substitution table, is taken as an example for considering the algorithm!

Using the "Initial data", you need to get the result of the action of the main step of the crypto transformation.

We select the part R = 05060708h and the key element k1 = ‘as28’, in hexadecimal form the key element will look like this: 61733238h. Now we do the operation of summation mod 2 32:

Rice. 4

As you can see in the figure, we did not transfer to 33 bits marked in red and with the exponent " 32 ". And if we had other values of R and the key element, this could well happen, and then we would ignore it, and in the future use only the bits marked in yellow.

I perform such an operation with the assembler command add:

; eax=R, ebx='as28'

The result of this operation is Smod = 66793940h

Now the most intricate operation, but if you look closely, it is no longer as scary as it seems at first. Let's imagine Smod in the following form:

PATTERN NOT SAVE

Rice. five

I tried to visualize the Smod elements in the figure, but I’ll explain anyway:

s0 = 0, s1 = 4, s2 = 9, etc.

Now, starting from the lowest element s0, we make a replacement. Remembering the paragraph 3.2 The main step of crypto transformation» i is a row, s i is a column, we are looking for a value in the zero row and zero column:

Fig.6

So the current value of Smod is not 6679394 0 h, and 6679394 5 h.

We proceed to replace s1, i.e. four. Using the first row and fourth column (s1= 4!). Let's look at the picture:

Rice. 7

Now the value is Smod, not 667939 4 5h, 667939 2 5h. I assume that now the replacement algorithm is clear to the reader, and I can say that after the end result of Ssimple will have the following value - 11e10325h.

I will talk about how this is easiest to implement in the form of assembler commands later in the next paragraph, after I talk about the extended table.

We must shift the resulting Ssimple value 11 bits to the left.

Rice. 8

As you can see, this action is quite simple, and is implemented by one assembly language command - roll and the result of this Srol operation is 0819288Fh.

Now it remains part L of our block of information to XOR with the value Srol. I take calculator from w2k sp4 and get Sxor = 091b2b8bh.
This action is final and we simply assign, clean R the value of the L part, and initialize the L part with the value of Sxor.

Final result:

L=091b2b8bh, R=01020304h

4.2 Increasing the speed of the algorithm

Now let's talk about optimizing the algorithm for speed. When implementing a project, one has to take into account that a program that works with registers more often than with memory works the fastest, and here this judgment is also very important, because. over one block of information as many as 32 encryption steps!

When I implemented the encryption algorithm in my program, I did the following:

Chose part of block L into register eax, and R into edx.
Initialized into the esi register with the address of the extended key, more on that below.
I assigned the value of the address of the extended substitution table to the ebx register, more on this below
Passed the information of points 1,2, 3 to the function of the basic cycle 32 - Z or 32 - R, depending on the situation.

If you look at the scheme for supplying key elements in paragraph " Basic cycles: “32-Z”, “32-R””, then our key for the basic cycle 32 - Z can be represented as follows:

K 32-Z =

'as28','zw37','q839','7342','ui23','8e2t','wqm2','ewp1',

'ewp1','wqm2','8e2t','ui23','7342','q839','zw37','as28'

Those. from the beginning go k1,k2,k3,k4,k5,k6,k7,k8 - ‘ as28', 'zw37', 'q839', '7342', 'ui23', '8e2t', 'wqm2', 'ewp1' This sequence is repeated three times. Then the elements go in reverse order, i.e.: k8,k7,k6,k5,k4,k3,k2,k1 - 'ewp1', 'wqm2', '8e2t', 'ui23', '7342', 'q839', 'zw37', 'as28'.

I pre-arranged the elements in the array in the order they should be served in 32 - Z. Thus, I increased the memory required per key, but saved myself from some thinking processes that I did not need, and increased the speed of the algorithm, for by reducing the memory access time! Here I described only the key for 32 - Z, for the cycle 32 - R I did the same, but using a different scheme for supplying elements, which I also described in paragraph " Basic cycles: “32-Z”, “32-R».

It's time to describe the implementation of the replacement function, as I promised above. I could not describe earlier, because. this requires the introduction of a new concept - an extended substitution table. I can't explain to you what it is. Instead, I will show you it, and you can formulate for yourself what it is - an extended substitution table?

So, in order to understand what an extended substitution table is, we need a substitution table, for example, I will take the one shown in Fig. 3.

For example, we needed to replace the number 66793940h. I will present it in the following form:

PATTERN NOT SAVE

Rice. nine

Now if we take the elements s1,s0, i.e. low byte, then the result of the replacement function will be 25h! After reading the article by Andrey Vinokurov, which I cited in the paragraph “ List of used literature”, you will actually find that if you take two lines, you can get an array that allows you to quickly find replacement elements using an assembler command xlat. They say it is possible in another way, faster, but Andrey Vinokurov spent about four years researching fast algorithms for implementing GOST! I don't think you should reinvent the wheel when it already exists.

So, about the array:

Let's take the first two lines, zero and first, and create an array of 256 bytes. Now we observe one feature that if you need to convert 00h, then the result will be 75h (based on Fig. 3) - put this value in the array at offset 00h. We take the value 01h, the result of the replacement function is 79h, put it in the array at offset 01 and so on until 0FFh, which will give us 0FCh, which we will put in the array at offset 0FFh. So we got an extended table of replacements for the first group of strings: the first and zero. But there are still three groups: the second page 2, page 3, the third page 4, page 5, the fourth page 6, page 7. With these three groups we proceed in the same way as with the first. The result is an extended substitution table!

Now you can implement an algorithm that will perform the replacement. To do this, we take the source codes that Andrey Vinokurov posted on his page, see " Bibliography».

lea ebx,extented_table_simple

mov eax,[put number to replace]

add ebx,100h ;go to the next two nodes

sub ebx,300h ; so that in the future ebx points to the table

Now one more feature, we not only replaced the previous actions, but also shifted the number by 8 bits to the left! We just have to shift the number another 3 bits to the left:

and we get the result of the operation rol eax,11!

There is nothing more I can add on optimization, the only thing I can emphasize what I said above is to use registers more often than memory accesses. I think these words are only for beginners, experienced people understand this very well without my words :).

Key Information Requirements.

As stated in the article by Andrey Vinokurov, the key is selected according to two criteria:

- criterion of equiprobable distribution of bits between values 1 and 0. Usually, Pearson's criterion ("chi-square") acts as a criterion of equiprobable distribution of bits.

This means the key, in principle any number can. That is, when forming the next bit of the key, the probability of its initialization by one or zero is 50/50!

Please note that the key has eight elements, each 32 bits, so there are 32 * 8 = 256 bits in total in the key and the number of possible keys is 2 256! Doesn't it strike you? 🙂

— series criterion.

If we look at our key, which I gave in paragraph " 4.1 Implementation of the main step of crypto transformation”, then you will notice that the following entry is true:

Rice. 10

In one phrase, the value of k 1 should not be repeated not in k 2 , not in any other element of the key.

That is, the key that we have chosen as the consideration of the encryption algorithm, well meets the two criteria above.

Now about the choice of substitution table:

Now let's talk about how to choose the right substitution table. The main requirement for the choice of substitution tables is the phenomenon of "non-repeatability" of elements, each of which is 4 bits in size. As you saw above, each row of the substitution table consists of the values 0h, 1h, 2h, 3h, ..., 0fh. So the main requirement is that each line contains the values 0h, 1h, 2h, ... , 0fh and each such value in one copy. For example, the sequence:

1 2 3 4 5 6 7 8 9 A B C D E F

Fully meets this requirement, but still! It is not recommended to select such a sequence as a string. Because if you supply a value to the input of a function that relies on such a string, then you will get the same value at the output! Don't believe? Then take the number 332DA43Fh and eight such lines as a substitution table. Perform the replacement operation, and I assure you, the output will be the number 332DA43Fh! That is, the same that you submitted to the input of the operation! And this is not a sign of good form in encryption, and was it? 🙂

That was one requirement, the next criterion says that - each bit of the output block must be statistically independent of each bit of the input block!

How does it look easier? And here is how, for example, we chose the element s0 = 0Fh, 01111b from the above number. The probability that we now replace the first bit with a one or a zero is 0.5! The probability of replacing the second, third and fourth bits, each bit, considered separately, by ones or zeros is also 0.5. When choosing s1 = 0Eh, we will replace the probability that we are a zero bit, and this is “0”, by zero or one too equals - 0.5! Thus, according to this criterion, there is no regularity between the replacement of zero bits of elements s0, s1! Yes, you could substitute ones, but you could also put zeros. 🙂

To evaluate the table according to this criterion, you can build a table of correlation coefficients calculated by the formula:

— if p = 1, then the value of bit j at the output is equal to the value of bit i at the input for any combination of bits at the input;

— if p = -1, then the value of bit j at the output is always the inversion of the input bit i;

— if p = 0, then the output bit j takes on the values 0 and 1 with equal probability for any fixed value of the input bit i.

Let's take a single line example:

D	B	4	1	3	F	5	9	0	A	E	7	6	8	2	C

Let's break it down into components:

We calculate one coefficient using the formula above. To make it easier to understand how this is done, I will explain in more detail:

- we take the 0th bit of the 0th number (0) at the input and the 0th bit of the 0th number at the output (1) we perform the operation 0 XOR 1 = 1.

- we take the 0th bit of the 1st number (1) at the input and the 0th bit of the 1st number at the output (1) we perform the operation 1 XOR 1 = 0.

- we take the 0th bit of the 2nd number (0) at the input and the 0th bit of the 2nd number at the output (0) we perform the operation 0 XOR 0 = 0.

- we take the 0th bit of the 3rd number (1) at the input and the 0th bit of the 3rd number at the output (1) we perform the operation 1 XOR 1 = 0.

After sequentially performing XOR operations in such a sequence, we count the number of all non-zero values, we get the value 6. Hence P 00 = 1-(6/2 4-1) = 0.25. So, it turned out that the value of bit 0 at the output is equal to the value of bit 0 at the input in 4 cases out of 16;

Final odds table:

The table of coefficients will be as follows (who is not lazy to recalculate)

	entrance
Output		0	1	2	3
	0	-0,25	0,00	0,00	0,00
	1	0,00	1,00	0,00	0,00
	2	0,00	0,00	1,00	0,00
	3	0,00	0,00	0,00	-0,50

Well, things are even worse in this table - bits 1 and 2 of the group remain unchanged! The cryptanalyst has a place to turn around 🙂 Taking into account all these requirements, a simple enumeration ("head-on") found permutation tables corresponding to the specified theory (today - 1276 combinations) Here are some of them:

09 0D 03 0E-06 02 05 08-0A 07 00 04-0C 01 0F 0B

00 05 0A 07-03 08 0F 0C-0E 0B 04 09-0D 06 01 02

06 0B 0F 00-0C 01 02 0D-08 07 09 04-05 0A 03 0E

04 0E 00 09-0B 01 0F 06-03 0D 07 0A-0C 02 08 05

04 02 08 0E-05 0F 03 09-0B 01 0D 07-0A 0C 06 00

07 03 09 0C-08 00 06 0F-0E 04 01 0A-0D 0B 02 05

06 0F 03 08-0D 04 0A 01-09 02 05 0C-00 0B 0E 07

0C 06 08 01-03 09 07 0E-0B 05 0F 02-04 0A 00 0D

04 0B 09 06-0E 01 00 0F-0A 05 03 0C-0D 02 07 08

00 0E 0F 01-07 08 09 06-04 0B 0A 05-03 0D 0C 02

0F 09 01 07-04 0A 08 06-0E 00 02 0C-05 03 0B 0D

0A 03 04 01-05 0C 0B 0E-08 06 0F 0D-07 09 00 02

0B 06 0F 01-04 0A 08 05-00 0D 0C 02-07 09 03 0E

0C 03 02 08-0D 06 0B 05-07 09 04 0F-0A 00 01 0E

02 0B 0F 04-09 00 06 0D-05 0E 01 08-0C 07 0A 03

List of used literature.

Article by Andrey Vinokurov:

Encryption algorithm GOST 28147-89, its use and implementation

for Intel x86 platform computers.

(can be found at: http://www.enlight.ru/crypto/frame.htm).

Here are the source codes for the implementation of the encryption algorithm.

Article by Horst Feistel:

Cryptography and Computer Security.

(can be found at the same address as the previous article)

Ross N. Williams:

An Elementary Guide to CRC Error Detection Algorithms

Posted on the site www.wasm.en.

Thanks.

I would like to express my gratitude to all visitors of the forum www.wasm.ru. But I would especially like to thank ChS, who in currently known as SteelRat, he helped me understand things that I probably would never understand, as well as helping me write a paragraph: “ Key information requirements”, the main part of this paragraph was written by him. I am also deeply grateful to the employee of KSTU. A.N. Tupolev Anikin Igor Vyacheslavovich and it would be a sin not to mention Chris Kaspersky, for the fact that he is, and Volodya / wasm.ru for his instructions. Oh, and I get from him :). I also want to note Sega-Zero / Callipso for bringing some mathematical jungle to my mind.

This is perhaps all I would like to tell you.

I would be grateful for criticism or questions related to this article or just advice. My contact details: [email protected], ICQ - 337310594.

Sincerely, Evil`s Interrupt.

P.S.: I did not try to outdo anyone with this article. It was written with the intent to make it easier to study GOST, and if you have difficulties, this does not mean that I am to blame for this. Be reasonable and be patient, all the best to you!

Encryption algorithm GOST 28147-89, its use and software implementation for computers of the Intel x86 platform.

Andrey Vinokurov

Description of the algorithm.

Terms and designations.

The description of the encryption standard of the Russian Federation is contained in a very interesting document entitled "GOST 28147-89 Cryptographic Transformation Algorithm". The fact that in its title, instead of the term "encryption", a more general concept " cryptographic transformation ' is no coincidence. In addition to several closely related encryption procedures, the document describes one algorithm for generating imitation inserts . The latter is nothing more than a cryptographic control combination, that is, a code generated from the original data using a secret key in order to imitation protection , or protect data from unauthorized changes.

At various steps of the GOST algorithms, the data they operate on is interpreted and used in different ways. In some cases, data elements are treated as arrays of independent bits, in other cases as an unsigned integer, in others as a complex element having a structure, consisting of several simpler elements. Therefore, in order to avoid confusion, it is necessary to agree on the notation used.

Data elements in this article are denoted by capital Latin letters with italics (for example, X). Through | X| denotes the size of the data element X in bits. Thus, if we interpret the data element X as a non-negative integer, we can write the following inequality:

If the data element consists of several elements of a smaller size, then this fact is indicated as follows: X=(X 0 ,X 1 ,…,X n –1)=X 0 ||X 1 ||…||X n-one . The procedure for combining several data elements into one is called concatenation data and is denoted by the symbol "||". Naturally, the following relation must hold for the sizes of data elements: | X|=|X 0 |+|X 1 |+…+|X n-1 |. When specifying complex data elements and the concatenation operation, the constituent data elements are listed in ascending order of precedence. In other words, if we interpret the compound element and all data elements included in it as unsigned integers, then we can write the following equality:

In an algorithm, a data element can be interpreted as an array of individual bits, in which case the bits are denoted by the same letter as the array, but in lower case, as shown in the following example:

X=(x 0 ,x 1 ,…,x n –1)=x 0 +2 1 x 1 +…+2 n-one · x n –1 .

Thus, if you paid attention, the so-called. "little-endian" numbering of digits, i.e. within multi-bit data words, individual binary digits and their groups with smaller numbers are less significant. This is explicitly stated in clause 1.3 of the standard: "When adding and cyclically shifting binary vectors, the highest bits are considered to be the bits of accumulators with large numbers." Further, paragraphs of the standard 1.4, 2.1.1 and others prescribe to start filling the storage registers of the virtual encryption device with data from the lower ones, i.e. less significant ranks. Exactly the same numbering order is adopted in microprocessor architecture Intel x86, which is why the software implementation of the cipher on this architecture does not require any additional permutations of bits inside the data words.

If some operation is performed on data elements that has a logical meaning, then it is assumed that this operation is performed on the corresponding bits of the elements. In other words A B=(a 0 b 0 ,a 1 b 1 ,…,a n –1 b n–1), where n=|A|=|B|, and the symbol " " denotes an arbitrary binary logical operation; usually refers to the operation exclusive or , which is also the operation of summation modulo 2:

The logic of constructing the cipher and the structure of the key information of GOST.

If you carefully study the original GOST 28147–89, you will notice that it contains a description of the algorithms of several levels. At the very top are practical algorithms designed to encrypt data arrays and develop fake inserts for them. All of them are based on three lower-level algorithms, called GOST in the text cycles . These fundamental algorithms are referred to in this article as basic cycles to distinguish them from all other loops. They have the following names and designations, the latter are given in brackets and their meaning will be explained later:

encryption cycle (32-3);
decryption cycle (32-P);
cycle of generating an imitation insert (16-Z).

In turn, each of the basic cycles is a multiple repetition of a single procedure, called for definiteness further in this paper the main step of crypto transformation .

Thus, in order to understand GOST, you need to understand the following three things:

what's happened basic step crypto transformations;
how basic cycles are formed from the main steps;
like out of three basic cycles add up all the practical algorithms of GOST.

Before proceeding to the study of these issues, we should talk about the key information used by the GOST algorithms. In accordance with the Kirchhoff principle, which is satisfied by all modern ciphers known to the general public, it is its secrecy that ensures the secrecy of the encrypted message. In GOST, key information consists of two data structures. In addition to key required for all ciphers, it also contains substitution table . Below are the main characteristics of the key structures of GOST.

The main step of crypto transformation.

The main crypto conversion step is essentially an operator that defines the conversion of a 64-bit data block. An additional parameter of this operator is a 32-bit block, which is used as any element of the key. The scheme of the main step algorithm is shown in Figure 1.

Figure 1. Scheme of the main step of crypto-transformation of the GOST 28147-89 algorithm.

Below are explanations of the main step algorithm:

Step 0

N– 64-bit data block to be converted, during the execution of the step its least significant ( N 1) and older ( N 2) parts are treated as separate 32-bit unsigned integers. Thus, one can write N=(N 1 ,N 2).
X– 32-bit key element;

Step 1

Addition with a key. The lower half of the converted block is added modulo 2 32 with the key element used in the step, the result is passed to the next step;

Step 2

Block replacement. The 32-bit value obtained in the previous step is interpreted as an array of eight 4-bit code blocks: S=(S 0 , S 1 , S 2 , S 3 , S 4 , S 5 , S 6 , S 7), and S 0 contains the 4 smallest ones, and S 7 - 4 most significant bits S.

Next, the value of each of the eight blocks is replaced by a new one, which is selected from the substitution table as follows: the value of the block Si changes to Si-th element in sequence (numbering from zero) i-of that replacement node (i.e. i-th line of the table of substitutions, numbering also from zero). In other words, as a replacement for the block value, an element is selected from the replacement table with a row number equal to the replacement block number and a column number equal to the replacement block value as a 4-bit non-negative integer. From this, the size of the replacement table becomes clear: the number of rows in it is equal to the number of 4-bit elements in a 32-bit data block, that is, eight, and the number of columns is equal to the number of distinct values of a 4-bit data block, which is known to be 2 4 , sixteen.

Step 3

Rotate left 11 bits. The result of the previous step is shifted cyclically by 11 bits towards the higher bits and transferred to the next step. In the algorithm diagram, the symbol denotes the function of cyclic shift of its argument by 11 bits to the left, i.e. towards the higher levels.

Step 4

Bitwise addition: The value obtained in step 3 is added modulo 2 bit by bit to the high half of the block being converted.

Step 5

Shift along the chain: the lower part of the converted block is shifted to the place of the older one, and the result of the previous step is placed in its place.

Step 6

The resulting value of the block to be converted is returned as a result of the execution of the algorithm of the main crypto-transformation step.

Basic cycles of cryptographic transformations.

As noted at the beginning of this article, GOST belongs to the class of block ciphers, that is, the unit of information processing in it is a data block. Therefore, it is quite logical to expect that it will define algorithms for cryptographic transformations, that is, for encryption, decryption and "accounting" in the control combination of one block of data. These algorithms are called basic cycles GOST, which emphasizes their fundamental importance for the construction of this cipher.

Basic cycles are built from basic steps cryptographic transformation discussed in the previous section. During the execution of the main step, only one 32-bit key element is used, while the GOST key contains eight such elements. Therefore, in order for the key to be fully used, each of the basic loops must repeatedly perform the main step with its various elements. At the same time, it seems quite natural that in each basic cycle all elements of the key should be used the same number of times; for reasons of cipher strength, this number should be more than one.

All the assumptions made above, based simply on common sense, turned out to be correct. Basic loops are repeated execution main step using different elements of the key and differ from each other only in the number of repetitions of the step and the order in which the key elements are used. This order is given below for various cycles.

Encryption cycle 32-Z:

K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 ,K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 ,K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 ,K 7 ,K 6 ,K 5 ,K 4 ,K 3 ,K 2 ,K 1 ,K 0 .

Figure 2a. Scheme of the encryption cycle 32-3

Decryption cycle 32-P:

K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 ,K 7 ,K 6 ,K 5 ,K 4 ,K 3 ,K 2 ,K 1 ,K 0 ,K 7 ,K 6 ,K 5 ,K 4 ,K 3 ,K 2 ,K 1 ,K 0 ,K 7 ,K 6 ,K 5 ,K 4 ,K 3 ,K 2 ,K 1 ,K 0 .

Figure 2b. 32-P Decryption Loop Diagram

Cycle for the production of imitation insert 16-З:

K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 ,K 0 ,K 1 ,K 2 ,K 3 ,K 4 ,K 5 ,K 6 ,K 7 .

Figure 2c. Scheme of the production cycle of the imitation insert 16-З.

Each of the cycles has its own alphanumeric designation, corresponding to the pattern " n-X", where the first element of the designation ( n), specifies the number of repetitions of the main step in the cycle, and the second element of the notation ( X), a letter, specifies the order of encipherment ("З") or decryption ("Р") in the use of key elements. This order needs more explanation:

The decryption cycle must be the reverse of the encryption cycle, that is, the sequential application of these two cycles to an arbitrary block must result in the original block, which is reflected by the following relationship: C 32-R ( C 32-З ( T))=T, where T– arbitrary 64-bit data block, C X( T) is the result of the loop X above the data block T. To fulfill this condition for algorithms similar to GOST, it is necessary and sufficient that the order in which the key elements are used by the corresponding cycles is mutually inverse. It is easy to verify the validity of the written condition for the case under consideration by comparing the above sequences for cycles 32-3 and 32-P. One interesting consequence follows from the foregoing: the property of a cycle to be the inverse of another cycle is mutual, that is, the cycle 32-3 is inverse with respect to the cycle 32-P. In other words, the encryption of a data block could theoretically be done with a decryption loop, in which case the decryption of the data block would have to be done with an encryption loop. Of the two mutually inverse cycles, either one can be used for encryption, then the second must be used to decrypt data, however, the GOST28147-89 standard assigns roles to cycles and does not provide the user with the right to choose in this matter.

The cycle of generating an insertion imitation is half as long as the encryption cycles, the order of using key elements in it is the same as in the first 16 steps of the encryption cycle, which is easy to see by considering the above sequences, therefore this order in the cycle designation is encoded by the same letter "Z".

Schemes of basic cycles are shown in Figures 2a-c. Each of them takes as an argument and returns as a result a 64-bit block of data, indicated on the diagrams N. Symbol Step( N,X) denotes the execution of the main crypto-transformation step for the data block N using key element X. There is one more difference between the cycles of encryption and calculation of the simulation insert, not mentioned above: at the end of the basic cycles of encryption, the high and low parts of the result block are swapped, this is necessary for their mutual reversibility.

Basic encryption modes.

GOST 28147-89 provides for the following three data encryption modes:

simple replacement,
gamming,
feedback gamma,

and one additional mode of generating imitation insertion.

In any of these modes, data is processed in blocks of 64 bits, into which the array is divided, subjected to cryptographic transformation, which is why GOST refers to block ciphers. However, in two gamming modes, it is possible to process an incomplete data block smaller than 8 bytes, which is essential when encrypting data arrays with an arbitrary size, which may not be a multiple of 8 bytes.

Before proceeding to the consideration of specific algorithms for cryptographic transformations, it is necessary to clarify the notation used in the diagrams in the following sections:

T about, T w are arrays of open and encrypted data, respectively;

, – i- 64-bit blocks of open and encrypted data, respectively: , , the last block may be incomplete: ;

n– number of 64-bit blocks in the data array;

C X is the function of converting a 64-bit data block according to the algorithm of the basic cycle "X".

Now let's describe the main encryption modes:

Simple replacement.

Encryption in this mode consists in applying the 32-3 cycle to open data blocks, decryption - the 32-P cycle to encrypted data blocks. This is the simplest of the modes, 64-bit data blocks are processed in it independently of each other. The schemes of encryption and decryption algorithms in the simple replacement mode are shown in Figures 3a and b, respectively; they are trivial and do not need comments.

Picture. 3a. Data encryption algorithm in simple replacement mode

Picture. 3b. Data decryption algorithm in simple replacement mode

The size of the array of open or encrypted data, which is subject to encryption or decryption, respectively, must be a multiple of 64 bits: | T o |=| T w |=64 n , after the operation is performed, the size of the received data array does not change.

Simple replacement encryption mode has the following features:

Since data blocks are encrypted independently of each other and of their position in the data array, when two identical plaintext blocks are encrypted, identical ciphertext blocks are obtained and vice versa. The noted property will allow the cryptanalyst to conclude that the blocks of the original data are identical if he encounters identical blocks in the encrypted data array, which is unacceptable for a serious cipher.
If the length of the encrypted data array is not a multiple of 8 bytes or 64 bits, the problem arises of how and how to pad the last incomplete data block of the array to full 64 bits. This task is not as simple as it seems at first glance. Obvious solutions like “pad an incomplete block with zero bits” or, more generally, “pad an incomplete block with a fixed combination of zero and one bits” can, under certain conditions, give the cryptanalyst the opportunity to determine the contents of this very incomplete block by methods of enumeration, and this fact means a decrease in security cipher. In addition, the length of the ciphertext will change, increasing to the nearest integer multiple of 64 bits, which is often undesirable.

At first glance, the features listed above make it practically impossible to use the simple replacement mode, because it can only be used to encrypt data arrays with a size that is a multiple of 64 bits and does not contain repeated 64-bit blocks. It seems that for any real data it is impossible to guarantee the fulfillment of these conditions. This is almost true, but there is one very important exception: remember that the key size is 32 bytes, and the size of the replacement table is 64 bytes. In addition, the presence of repeated 8-byte blocks in a key or substitution table will indicate their very poor quality, so there cannot be such a repetition in real key elements. Thus, we found out that the simple replacement mode is quite suitable for encrypting key information, especially since other modes are less convenient for this purpose, since they require an additional synchronizing data element - a sync message (see the next section). Our guess is correct, GOST prescribes to use the simple replacement mode exclusively for encrypting key data.

Gambling.

How can you get rid of the shortcomings of the simple replacement mode? To do this, it is necessary to make it possible to encrypt blocks with a size of less than 64 bits and to ensure that the ciphertext block depends on its number, in other words, randomize encryption process. In GOST, this is achieved in two different ways in two encryption modes, providing scaling . Gambling - this is the imposition (removal) of a cryptographic range on open (encrypted) data, that is, a sequence of data elements generated using some cryptographic algorithm to obtain encrypted (open) data. Reciprocally inverse binary operations must be used to overlay gamma during encryption and remove it during decryption, for example, addition and subtraction modulo 2 64 for 64-bit data blocks. In GOST, for this purpose, the operation of bitwise addition modulo 2 is used, since it is the inverse of itself and, moreover, is most simply implemented in hardware. Gamming solves both of the above problems: firstly, all gamma elements are different for real encrypted arrays and, therefore, the result of encryption of even two identical blocks in one data array will be different. Secondly, although the gamma elements are produced in equal portions of 64 bits, a part of such a block with a size equal to the size of the encrypted block can also be used.

Now let's go directly to the description of the gamma mode. The gamma for this mode is obtained as follows: using some algorithmic recurrent number sequence generator (RGCH), 64-bit data blocks are generated, which are then subjected to 32-3 conversion, that is, encryption in the simple replacement mode, resulting in gamma blocks. Due to the fact that the overlay and removal of the gamma is carried out using the same bitwise XOR operation, the encryption and decryption algorithms in the gamma mode are identical, their general scheme is shown in Figure 4.

The RGHR used to generate the scale is a recurrent function: – elements of the recurrent sequence, f is the conversion function. Therefore, the question inevitably arises about its initialization, that is, about the element. In fact, this data element is an algorithm parameter for gamma modes, it is indicated on the diagrams as S, and is called in cryptography sync message , and in our GOST - initial filling one of the encoder registers. For certain reasons, the developers of GOST decided to use for the initialization of the RGHR not directly a sync message, but the result of its transformation according to the 32-3 cycle: . The sequence of elements generated by the RGHR depends entirely on its initial filling, that is, the elements of this sequence are a function of their number and the initial filling of the RGHR: where fi(X)=f(fi –1 (X)), f 0 (X)=X. Taking into account the transformation according to the simple replacement algorithm, a dependence on the key is also added:

where G i– i-th element of the scale, K- key.

Thus, the sequence of gamma elements to be used in the gamma mode is uniquely determined by the key data and the sync message. Naturally, for the encryption procedure to be reversible, the same sync message must be used in the decryption and decryption processes. From the requirement of the uniqueness of the range, the failure of which leads to a catastrophic decrease in the strength of the cipher, it follows that in order to encrypt two different data arrays on the same key, it is necessary to ensure the use of different sync messages. This leads to the need to store or transmit a sync message along communication channels along with encrypted data, although in some special cases it can be predefined or calculated in a special way if encryption of two arrays on the same key is excluded.

Now let's take a closer look at the RGHR used in GOST to generate gamma elements. First of all, it should be noted that it is not required to provide any statistical characteristics of the generated sequence of numbers. RGHR was designed by the GOST developers based on the need to fulfill the following conditions:

the repetition period of the sequence of numbers generated by the RGHR should not differ greatly (in percentage terms) from the maximum possible when given size value block 2 64 ;
neighboring values produced by RGHR must differ from each other in each byte, otherwise the cryptanalyst's task will be simplified;
RGHR should be fairly easy to implement both in hardware and software on the most common types of processors, most of which, as you know, have a 32-bit capacity.

Based on these principles, the creators of GOST designed a very successful RGHR, which has the following characteristics:

Where C 0 =1010101 16 ;

Where C 1 =1010104 16 ;

The subscript in the notation of a number means its number system, so the constants used in this step are written in the hexadecimal number system.

The second expression needs comments, since something else is given in the GOST text: , with the same value of the constant C one . But further in the text of the standard a comment is given that, it turns out, under the operation of taking the remainder modulo 2 32 –1 there is understood not the same as in mathematics. The difference lies in the fact that according to GOST (2 32 -1) mod(2 32 –1)=(2 32 –1), not 0. In fact, this simplifies the implementation of the formula, and the mathematically correct expression for it is given above.

the repetition period of the sequence for the younger part is 2 32, for the older part 2 32 -1, for the entire sequence the period is 2 32 (2 32 -1), the proof of this fact, very simple, you will get yourself. The first formula of the two is implemented in one instruction, the second, despite its apparent cumbersomeness, in two instructions on all modern 32-bit processors - the first instruction is the usual addition modulo 2 32 with the storage of the carry bit, and the second instruction adds the carry bit to the received value.

The scheme of the encryption algorithm in the gamma mode is shown in Figure 4, below are the explanations for the scheme:

Figure 4. Data encryption (decryption) algorithm in gamma mode.

Step 0

Defines the initial data for the main crypto transformation step:

T o(w) - an array of open (encrypted) data of arbitrary size, subjected to the encryption (decryption) procedure, during the procedure, the array is converted in portions of 64 bits;
S – sync message , a 64-bit data element required to initialize the gamma generator;

Step 1

The initial transformation of the sync message, performed to "randomize" it, that is, to eliminate the statistical patterns present in it, the result is used as the initial filling of the RGHR;

Step 2

One step of the work of the RGHR, which implements its recurrent algorithm. During this step, the older ( S 1) and younger ( S 0) parts of the data sequence are generated independently of each other;

Step 3

Gambling. The next 64-bit element produced by the RGHR is subjected to the 32-3 encryption procedure, the result is used as a gamma element to encrypt (decrypt) the next block of open (encrypted) data of the same size.

Step 4

The result of the algorithm is an encrypted (decrypted) data array.

The following are the features of gamma as an encryption mode:

Identical blocks in an open data array will give different ciphertext blocks when encrypted, which will make it possible to hide the fact of their identity.
Because the gamma is performed bit by bit, encryption of the incomplete data block is easily done as encryption of the bits of this incomplete block, for which the corresponding bits of the gamma block are used. So, to encrypt an incomplete block of 1 bit, according to the standard, the least significant bit from the gamma block should be used.
The sync message used in the encryption must somehow be transmitted to be used in the decryption. This can be achieved in the following ways:

store or transmit the sync message together with the encrypted data array, which will increase the size of the data array during encryption by the size of the sync message, that is, by 8 bytes;
use a predefined value of the sync message or generate it synchronously by the source and receiver according to a certain law, in this case there is no change in the size of the transmitted or stored data array;

Both methods complement each other, and in those rare cases where the first, most common of them does not work, the second, more exotic, can be used. The second method is much less useful, since it is possible to make a sync message predefined only if no more than one data array is encrypted on a given set of key information, which happens not so often. It is also not always possible to generate a sync message synchronously at the source and recipient of the data array, since it requires a hard connection to something in the system. So, a seemingly sound idea to use the number of the transmitted message as a synchronization message in the system for transmitting encrypted messages is not suitable, since the message may be lost and not reach the addressee, in which case the encryption systems of the source and receiver will be out of sync. Therefore, in the considered case, there is no alternative to transmitting a sync message along with an encrypted message.

On the other hand, the opposite example can also be given. Suppose data encryption is used to protect information on the disk, and it is implemented at a low level, data is encrypted by sectors to ensure independent access. In this case, it is impossible to store the sync message along with the encrypted data, since the sector size cannot be changed, but it can be calculated as a function of the read head number of the disk, the track (cylinder) number, and the sector number on the track. In this case, the sync message is tied to the position of the sector on the disk, which can hardly change without reformatting the disk, that is, without destroying the data on it.

The gamma mode has another interesting feature. In this mode, the bits of the data array are encrypted independently of each other. Thus, each bit of the ciphertext depends on the corresponding bit of the plaintext and, of course, the ordinal number of the bit in the array: . It follows from this that changing a bit of the ciphertext to the opposite value will lead to a similar change of the bit of the plaintext to the opposite:

where denotes inverted with respect to t bit value ().

This property gives an attacker the ability, by manipulating the ciphertext bits, to make predictable and even targeted changes to the corresponding plaintext obtained after its decryption, without possessing the secret key. This illustrates the well-known fact in cryptology that secrecy and authenticity are different properties cryptographic systems . In other words, the properties of the cryptosystem to provide protection against unauthorized access to the contents of the message and from unauthorized changes to the message are independent and can only intersect in some cases. This means that there are cryptographic algorithms that provide a certain secrecy of encrypted data and at the same time do not protect against changes, and vice versa, ensure the authenticity of the data and do not limit the possibility of getting acquainted with them. For this reason, the considered property of the gamma mode should not be considered as its disadvantage.

Gambling with feedback.

This mode is very similar to the gamma mode and differs from it only in the way gamma elements are generated - the next gamma element is generated as a result of the transformation along the 32-3 cycle of the previous block of encrypted data, and to encrypt the first block of the data array, the gamma element is generated as a result of the transformation according to the same sync cycle. This achieves block linking - each block of ciphertext in this mode depends on the corresponding and all previous blocks of plaintext. Therefore, this mode is sometimes called scaling with meshing blocks . The fact that the blocks are linked has no effect on the security of the cipher.

The scheme of algorithms for decoding and decryption in the feedback gamma mode is shown in Figure 5 and, due to its simplicity, does not need comments.

Figure 5. Algorithm for data encryption (decryption) in the gamma mode with feedback.

Encryption in closed-loop gamma mode has the same features as encryption in normal gamma mode, except for the effect of ciphertext corruption on the corresponding plaintext. For comparison, let's write the block decryption functions for both mentioned modes:

Gamming;

Gamming with feedback;

While in normal scaling mode, changes in certain bits of the ciphertext affect only the corresponding bits of the plaintext, in the feedback scaling mode, the picture is somewhat more complicated. As can be seen from the corresponding equation, when decrypting a data block in the closed-loop gamma mode, the open data block depends on the corresponding and previous encrypted data blocks. Therefore, if we introduce distortions into the encrypted block, then after decryption, two blocks of open data will be distorted - the corresponding one and the one following it, and the distortions in the first case will be of the same nature as in the gamma mode, and in the second case - as in the mode simple replacement. In other words, in the corresponding open data block, the same bits will be corrupted as in the encrypted data block, and in the next open data block, all bits will be independently of each other with a probability of 1 / 2 will change their values.

Development of an insertion simulation to an array of data.

In the previous sections, we have discussed the impact of corruption of encrypted data on the corresponding clear data. We have found that when decrypting in the simple replacement mode, the corresponding block of open data turns out to be distorted in an unpredictable way, and when decrypting the block in the gamma mode, the changes are predictable. In the closed-loop scaling mode, two blocks are distorted, one in a predictable way and the other in an unpredictable way. Does this mean that from the point of view of protection against the imposition of false data, the gamma mode is bad, while the modes of simple replacement and feedback gamma are good? - In no case. When analyzing this situation, it is necessary to take into account the fact that unpredictable changes in the decrypted data block can be detected only in the case of redundancy of these same data, and the greater the degree of redundancy, the more likely it is to detect distortion. A very large redundancy takes place, for example, for texts in natural and artificial languages, in which case the fact of distortion is almost inevitable. However, in other cases, for example, when distorting compressed digitized sound images, we will get just a different image that our ear can perceive. The distortion in this case will remain undetected, unless, of course, there is a priori information about the nature of the sound. The conclusion here is this: since the ability of some encryption modes to detect distortions introduced into encrypted data relies heavily on the presence and degree of redundancy of the encrypted data, this ability is not an immanent property of the corresponding modes and cannot be considered as their advantage.

To solve the problem of detecting distortions in an encrypted data array with a given probability, GOST provides for an additional mode of cryptographic transformation - the generation of imitated insertion. A fake insert is a control combination that depends on open data and secret key information. The purpose of using insertion mimics is to detect all accidental or intentional changes in the array of information. The problem outlined in the previous paragraph can be successfully solved by adding a fake insert to the encrypted data. For a potential attacker, the following two tasks are practically unsolvable if he does not own the key:

calculation of insertion simulation for a given open array of information;
selection of open data for a given simulation insert;

The scheme of the algorithm for generating a simulated insert is shown in Figure 6.

Figure 6. Algorithm for generating a simulated insert for a data array.

The part of the block received at the output is taken as an imitation insert, usually its 32 least significant bits. When choosing the size of the simulated insertion, one should take into account that the probability of successfully imposing false data is equal to 2 –| I | per brute-force attempt, unless the attacker has a more efficient brute-force method than simple guessing. When using a 32-bit fake insert, this probability is

Discussion of GOST cryptographic algorithms.

Cryptographic strength of GOST.

When choosing a cryptographic algorithm for use in a particular development, one of the determining factors is its strength, that is, resistance to attempts by an adversary to reveal it. The issue of cipher strength, on closer examination, boils down to two interrelated questions:

is it possible to open this cipher at all;
if so, how difficult is it in practice;

Ciphers that cannot be broken at all are called absolutely or theoretically secure. The existence of such ciphers is proved by Shannon's theorem, but the price of this strength is the need to use a key no smaller than the message itself to encrypt each message. In all cases, with the exception of a number of special ones, this price is excessive, therefore, in practice, ciphers that do not have absolute security are mainly used. Thus, the most commonly used encryption schemes can be solved in a finite time, or, more precisely, in a finite number of steps, each of which is some operation on numbers. For them paramount importance has the concept of practical stability, expressing the practical difficulty of their disclosure. A quantitative measure of this difficulty can be the number of elementary arithmetic and logical operations that must be performed to solve the cipher, that is, to determine the corresponding plaintext for a given ciphertext with a probability not less than a given value. At the same time, in addition to the decrypted data array, the cryptanalyst can have blocks of open data and the corresponding encrypted data, or even the ability to obtain the corresponding encrypted data for any open data he chooses - depending on the listed and many other unspecified conditions, separate types of cryptanalysis are distinguished.

All modern cryptosystems are built according to the Kirchhoff principle, that is, the secrecy of encrypted messages is determined by the secrecy of the key. This means that even if the encryption algorithm itself is known to the cryptanalyst, he is still unable to decrypt the message if he does not have the corresponding key. A cipher is considered well-designed if there is no way to break it more effective way than exhaustive search over the entire key space, i.e. over all possible key values. GOST probably corresponds to this principle - over the years of intensive research, not a single effective method for its cryptanalysis has been proposed. In terms of strength, it is many orders of magnitude superior to the former American encryption standard, DES.

GOST uses a 256-bit key and the size of the key space is 2256 . None of the currently existing or expected to be implemented in the near future electronic device it is impossible to pick up the key in less than many hundreds of years. This value has become the de facto key size standard for symmetric cryptographic algorithms these days, and the new US encryption standard also supports it. The former American standard, DES, with its real key size of 56 bits and the size of the key space of only 256, is no longer strong enough in the light of the capabilities of modern computing tools. This was demonstrated in the late 1990s by several successful brute-force attacks on DES. In addition, DES was found to be susceptible special ways cryptanalysis, such as differential and linear. In this regard, DES may be of more research or scientific interest than practical interest. In 1998, its cryptographic weakness was officially recognized - the US National Standards Institute recommended the use of triple DES encryption. And at the end of 2001, a new US encryption standard, AES, was officially approved, built on different principles and free from the shortcomings of its predecessor.

Notes on the architecture of GOST.

It is well known that the domestic encryption standard is a representative of a whole family of ciphers built on the same principles. Its most famous "relative" is the former American encryption standard, the DES algorithm. All these ciphers, like GOST, contain algorithms of three levels. The basis is always a certain “basic step”, on the basis of which “basic cycles” are built in a similar way, and practical procedures for encryption and the development of imitation insertion are already built on their basis. Thus, the specificity of each of the ciphers of this family lies precisely in its main step, or rather, even in its part. Although the architecture of classical block ciphers, to which GOST belongs, lies far beyond the scope of this article, it is still worth saying a few words about it.

Algorithms for the "basic steps of crypto-transformation" for ciphers like GOST are built in an identical way, and this architecture is called balanced Feistel network (balanced Feistel network) named after the person who first proposed it. Data transformation scheme on one cycle, or, as it is commonly called, round , shown in Figure 7.

Figure 7. The content of the main crypto-transformation step for block ciphers similar to GOST.

An even-sized block is fed to the input of the main step, the upper and lower halves of which are processed separately from each other. During the transformation, the lower half of the block is placed in place of the older one, and the older one, combined using the bitwise “ exclusive or » with the result of the calculation of some function, in place of the younger one. This function, which takes as an argument the lower half of the block and the element of key information ( X), is the content part of the cipher and is called it encryption function . For various reasons, it turned out to be advantageous to divide the encrypted block into two parts of the same size: | N 1 |=|N 2 | - it is this fact that reflects the word "balanced" in the name of the architecture. However, encryption unbalanced networks are also used from time to time, although not as often as balanced ones. In addition, cipher strength considerations require that the size of the key element be not less than the size of half the block: in GOST, all three sizes are 32 bits. .

If we apply the above to the scheme of the main step of the GOST algorithm, it becomes obvious that blocks 1,2,3 of the algorithm (see Fig. 1) determine the calculation of its encryption function, and blocks 4 and 5 set the formation of the output block of the main step based on the contents of the input block and the value of the encryption function. More details about the architectures of modern block ciphers with a secret key can be found in the classic works, or, in an adapted form, in my works.

In the previous section, we already compared DES and GOST in terms of durability, now we will compare them in terms of functional content and ease of implementation. In the GOST encryption cycles, the main step is repeated 32 times, for DES this value is 16. However, the GOST encryption function itself is much simpler than the similar DES function, in which there are many irregular bit permutations. These operations are extremely inefficiently implemented on modern non-specialized processors. GOST does not contain such operations, so it is much more convenient for software implementation.

None of the DES implementations considered by the author for the Intel x86 platform achieves even half the performance of the GOST implementation offered to your attention in this article, despite a twice shorter cycle. All of the above indicates that the GOST developers took into account both the positive and negative aspects of DES, and also more realistically assessed the current and future possibilities of cryptanalysis. However, taking DES as a basis when comparing the performance of cipher implementations is no longer relevant. The new US encryption standard is doing much better with efficiency - with the same key size as GOST in 256 bits, AES works faster than it by about 14% - this is when compared by the number of "elementary operations". In addition, GOST is practically impossible to parallelize, while AES has much more opportunities in this regard. On some architectures this benefit of AES may be less, on others it may be more. So, on the Intel Pentium processor, it reaches 28%. Details can be found in .

Key information quality requirements and key sources.

Not all keys and substitution tables provide maximum cipher strength. Each encryption algorithm has its own criteria for evaluating key information. So, for the DES algorithm, the existence of the so-called " weak keys ”, in which the connection between open and encrypted data is not sufficiently masked, and the cipher is relatively easy to break.

An exhaustive answer to the question about the quality criteria for keys and GOST substitution tables, if you can get it anywhere at all, is only from the developers of the algorithm. The relevant data was not published in the open press. However, according to the established procedure, key data received from an authorized organization must be used to encrypt information that has a stamp. Indirectly, this may indicate the existence of methods for checking key data for "lice". If the presence of weak keys in GOST is a debatable issue, then the presence of weak replacement nodes is beyond doubt. It is obvious that the "trivial" substitution table, according to which any value is replaced by itself, is so weak that when using it, the cipher is simply broken, no matter what the key is.

As noted above, criteria for assessing key information are not available, but some general considerations can still be made about them.

Key

The key must be an array of statistically independent bits that take on the values 0 and 1 with equal probability. It cannot be completely ruled out that some specific key values may turn out to be "weak", that is, the cipher may not provide a given level of security if they are used. However, presumably, the share of such values in the total mass of all possible keys is negligible. At least, intensive cipher research has not yet revealed any such key for any of the known (i.e. proposed by FAPSI) substitution tables. Therefore, the keys generated with the help of a certain generator of truly random numbers will be of high quality with a probability that differs from unity by a negligibly small amount. If the keys are generated using a pseudo-random number generator, then the generator used must provide the above statistical characteristics, and, in addition, have high cryptographic strength, no less than that of GOST itself. In other words, the task of determining the missing members of the sequence of elements generated by the generator should not be easier than the task of breaking the cipher. In addition, various statistical criteria can be used to reject keys with poor statistical performance. In practice, two criteria are usually sufficient - to check the equiprobable distribution of key bits between values 0 and 1, Pearson's criterion ("chi square") is usually used, and to check the independence of key bits, the series criterion is used. You can read about the mentioned criteria in textbooks or reference books on mathematical statistics.

The best approach for generating keys would be to use hardware MF sensors, but this is not always acceptable for economic reasons. When generating a small array of key information, a reasonable alternative to using such a sensor is and is widely used in practice by the "electronic roulette" method, when the next generated portion of random bits depends on the moment the operator presses a certain key on the computer keyboard. In this scheme, the source of random data is the computer user, more precisely, the temporal characteristics of his reaction. In this case, only a few bits of random data can be generated per keystroke, so the overall speed of generating key information is low - up to several bits per second. Obviously, this approach is not suitable for obtaining large arrays of keys.

In the case when it is necessary to develop a large array of key information, it is possible and very widespread to use various software sensors of pseudo-random numbers. Since such a sensor requires high cryptographic strength, it is natural to use the gamma generator of the cipher itself as it - we simply “cut” the gamma generated by the cipher into “pieces” of the desired size, for GOST - 32 bytes each. Of course, for this approach, we need a “master key”, which we can obtain using the electronic roulette method described above, and with its help, using a cipher in the gamma generator mode, we obtain an array of key information of the volume we need. So these two ways of generating keys - "manual" and "algorithmic" - work in tandem, complementing each other. Key generation schemes in "low-budget" information cryptographic protection systems are almost always built according to this principle.

Substitution table

The replacement table is a long-term key element, that is, it is valid for a much longer period than a single key. It is assumed that it is common to all encryption nodes within the same system. cryptographic protection. Even if the confidentiality of the substitution table is violated, the strength of the cipher remains extremely high and does not decrease below allowable limit. Therefore, there is no particular need to keep the table secret, and in most commercial applications of GOST, this is how it is done. On the other hand, the substitution table is a critical element to ensure the strength of the entire cipher. Choosing the wrong table can result in the cipher being easily broken by known cryptanalysis methods. The criteria for developing substitution nodes is a secret with seven seals and FAPSI is unlikely to share it with the public in the near foreseeable future. Ultimately, in order to say whether this particular substitution table is good or bad, you need to spend a huge amount of work - many thousands of man and machine hours. Once selected and used, the table is subject to replacement if and only if the cipher with its use turned out to be vulnerable to one or another type of cryptanalysis. Therefore, the best choice for the average user of the cipher is to take one of several tables that have become public. For example, from the hash function standard, it is also "central banking"; information about these tables can be found in the open press and even on the Internet, if you search well.

For those who are not used to taking the easy way, below is a general scheme for obtaining quality tables:

Using one method or another, you develop a set of eight replacement nodes with guaranteed non-linearity characteristics. There are several such methods, one of them is the use of so-called bent functions.
You check the implementation of the simplest "quality criteria" - for example, those published for DES replacement nodes. Here are some more general considerations on this score: Each substitution node can be described by four logical functions from four logical arguments. If these functions, written in minimal form(i.e., with the minimum possible expression length) are not complex enough, such a replacement node is rejected. In addition, individual functions within the entire substitution table must differ from each other to a sufficient extent. At this stage, many deliberately low-quality tables are eliminated.
For a cipher with tables of your choice, build different round models corresponding to different types cryptanalysis, and measure the corresponding "profile" characteristics. So, for linear cryptanalysis, build a linear statistical analogue of the encryption round and calculate the "profile" characteristic - the non-linearity index. If it turns out to be insufficient, the substitution table is rejected.
Finally, using the results of the previous paragraph, subject the cipher with the table of your choice to intensive research - an attempt to cryptanalyse by all known methods. This stage is the most difficult and time-consuming. But if it is done with high quality, then with a high degree of probability it can be stated that the cipher with the tables you have chosen will not be opened by mere mortals, and, it is possible, it will be too tough for the special services.

It is possible, however, to do much easier. The thing is that the more rounds in the cipher, the less impact on the security of the entire cipher is the security characteristics of one round. In GOST there are as many as 32 rounds - more than in almost all ciphers with a similar architecture. Therefore, for most domestic and commercial applications, it is sufficient to obtain substitution nodes as independent random permutations of numbers from 0 to 15. This can be practically implemented, for example, by shuffling a deck of sixteen cards, each of which is assigned one of the values of the specified range.

Concerning the table of replacements it is necessary to note one more interesting fact. The reversibility of the 32-3 and 32-R encryption cycles does not require that the replacement nodes be permutations of numbers from 0 to 15. Everything works even if there are duplicate elements in the replacement node, and the replacement determined by such a node , is irreversible - however, in this case, the strength of the cipher is reduced. Why this is so is not considered in this article, but it is not difficult to verify the fact itself. To do this, it is enough to try to first encrypt and then decrypt the data block using such an "inferior" substitution table, the nodes of which contain duplicate values.

Variations on the theme of GOST

Very often, for use in a cryptographic data protection system, an algorithm with a higher implementation speed than that of GOST is required, and such a high cryptographic strength is not required. A typical example of such tasks are various kinds of electronic exchange trading systems that manage trading sessions in real time. Here, the encryption algorithms used are required to make it impossible to decrypt the operational data of the system during the session (data on orders placed, deals made, etc.), but after it, these data, as a rule, are already useless for attackers. In other words, only a few hours of guaranteed perseverance is required, which is the typical length of a trading session. It is clear that the use of a full-fledged GOST in this situation would be firing a cannon at sparrows.

How to proceed in this and similar cases in order to increase the speed of encryption? The answer lies on the surface - use a modification of the cipher with fewer basic steps (rounds) in the basic cycles. How many times we reduce the number of encryption rounds, the performance increases by the same amount. This change can be achieved in two ways - by reducing the length of the key and by reducing the number of "lookup cycles" of the key. Recall that the number of basic steps in basic encryption cycles is N=n m, where n is the number of 32-bit elements in the key, m- the number of cycles of use of key elements, in the standard n=8, m=4. You can reduce any of these numbers, but the simplest option is to reduce the length of the key without affecting the scheme of its use.

It is clear that the price for speeding up the work will be a decrease in the strength of the cipher. The main difficulty lies in the fact that it is quite difficult to more or less accurately estimate the magnitude of this decrease. Obviously, the only possible way to do this is to study variants of a cipher with reduced cycles of cryptographic transformation "by full program". It is clear that, firstly, this requires the use of classified information, which only the developers of GOST own, and, secondly, it is very laborious. Therefore, we will now try to give a very, very rough estimate, based only on general patterns.

As for the resistance of the cipher to cracking by “extensive” methods, that is, to a “brute force” attack, everything is more or less clear here: a 64-bit key is somewhere on the verge of being accessible to this type of attack, a cipher with a key of 96 bits and higher ( remember that the key must contain an integer number of 32-bit elements) is quite robust against it. Indeed, a few years ago, the former US encryption standard, DES, was repeatedly hacked by brute force - first it was hacked by a computer network organized on the basis of the global Internet, and then by a specialized one, i.e. computer designed specifically for this purpose. Let's assume that the standard version of GOST, when implemented in software on modern processors, works four times faster than DES. Then the 8-round "reduced GOST" will work 16 times faster than DES. Let's also assume that over the time since the DES hack, the performance of computing technology, according to Moore's law, has quadrupled. As a result, we get that now the verification of one 64-bit key for the "reduced GOST" with eight cycles is 64 times faster than the verification of one DES key was once performed. Thus, the advantage of this version of GOST over DES in terms of the complexity of the brute force attack is reduced from 2 64–56 = 2 8 = 256 to 256 / 64 = 4 times. Agree, this is a very illusory difference, almost nothing.

It is much more difficult to assess the resistance of weakened modifications of GOST to "intensive" methods of cryptanalysis. However, the general pattern can be traced here as well. The fact is that the "profile" characteristics of many of the currently strongest types of cryptanalysis depend exponentially on the number of encryption rounds. So, for linear cryptanalysis (LCA) this will be the linearity characteristic L :

where C and are constants, R is the number of rounds. A similar relationship exists for differential cryptanalysis. According to their "physical meaning" all characteristics of this kind are probabilities. Usually, the amount of initial data required for cryptanalysis and its complexity are inversely proportional to such characteristics. It follows that these labor intensity indicators grow exponentially with the growth of the number of basic encryption steps. Therefore, if the number of rounds is reduced by several times, the complexity of the most known types of analysis will change as, very approximately and roughly, the root of this power from the initial amount. This is a very large drop in stamina.

On the other hand, GOST was designed with a large margin of safety and today is resistant to all known types of cryptanalysis, including differential and linear. With regard to the LCA, this means that for its successful implementation, more “open block - encrypted block” pairs are required than “exists in nature”, that is, more than 2 64 . In view of the above, this means that for a successful LCA of a 16-round GOST, at least blocks or 2 35 bytes or 32 GB of data will be required, and for an 8-round GOST, at least blocks or 2 19 bytes or 0.5 MB.

The conclusions from all that has been said above are given in the following table, which summarizes the characteristics of the reduced versions of GOST.

Number of rounds	Key size, bit	Quick action index	Likely characteristics of the cipher (very rough estimate)
24	192	1,33	Resistant to most known types of KA, or be on the verge of resistance. The practical implementation of the CA is impossible due to high requirements for initial data and labor intensity.
16	128	2	Theoretically unstable to some types of cryptanalysis, however, their practical implementation in most cases is difficult due to high requirements for initial data and labor intensity.
12	95	2,67	It is not resistant to some known types of cryptanalysis, but it is suitable for ensuring the secrecy of small amounts of data (up to tens or hundreds of Kb) for a short time.
8	64	4	It is not resistant to some known types of cryptanalysis, but it is suitable for ensuring the secrecy of small amounts of data (up to tens of Kbytes) for a short time.

The last two options, with 12 and 8 rounds, are able to provide very, very limited protection in time. Their use is justified only in tasks where only short-term secrecy of closed data is required, on the order of several hours. A possible area of application for these weak ciphers is to close the UDP traffic of electronic exchange trading systems. In this case, each data packet (datagram, middle "D" from UDP abbreviation) is encrypted with a separate 64-bit key, and the key itself is encrypted with a session key (a key whose scope is one communication session between two computers) and transmitted along with data.

Before finishing with the reduced versions of GOST, I will say that all the above considerations are highly speculative. The standard provides durability for only one, 32-round option. And no one can give you guarantees that the resistance of the reduced variants of the cipher to breaking will change in the way indicated above. If you still decide to use them in your developments, remember that you have set foot on very shaky ground, which can slip out from under your feet at any moment. Since encryption speed is critical to you, maybe you should consider using a faster cipher or a more powerful computer? Another consideration for which this is worth doing is that the weakened versions of GOST will be as sensitive as possible to the quality of the replacement units used.

The issue at hand also has a downside. What if the encryption speed is not critical, and the requirements for strength are very strict? There are two ways to increase the resistance of GOST - we will conditionally call them "extensive" and "intensive". The first of these is nothing more than a simple increase in the number of encryption rounds. It’s not entirely clear to me why this might really be needed, because the domestic standard already provides the necessary stability without it. However, if you suffer from paranoia above the required level (and all "information defenders" are simply obliged to suffer from it, this is a condition for professional suitability, the only question is the severity of the case :), this will help you calm down a little. If you are unsure about this KGB cipher or the substitution table you are using, just double, quadruple, etc. number of rounds - select the multiplicity based on the severity of your case. This approach allows you to really increase the strength of the cipher - if earlier cryptanalysis was simply impossible, now it is impossible in the square!

More tricky and interesting is the question of whether it is possible to increase the strength of the cipher without changing the number and structure of the main encryption steps. Surprisingly, the answer to this question is yes, although we are once again treading on the shaky ground of speculation. The fact is that in GOST, at the main conversion step, it is supposed to replace 4 by 4 bits, but in practice (we will talk about this later), all software implementations perform the replacement byte by byte, i.e. 8 by 8 bits - this is done for reasons of efficiency. If we immediately design such a replacement as an 8-bit one, then we will significantly improve the characteristics of one round. First, the “diffusion” characteristic or “avalanche” indicator will increase - one bit of the source data and / or key will affect more bits of the result. Secondly, for larger substitution nodes, lower differential and linear characteristics can be obtained, thereby reducing the cipher's susceptibility to similar types of cryptanalysis. This is especially true for the reduced GOST cycles, and for 8 and 12-round options, such a step is simply necessary. This somewhat compensates for the loss of stamina in them from a decrease in the number of rounds. What makes it difficult to use this technique is that you will have to design such "increased" replacement nodes yourself. And also the fact that larger nodes are generally noticeably more difficult to design than smaller ones.

Non-standard use of the standard.

Of course, the main purpose of GOST crypto algorithms is data encryption and imitation protection. However, they can be found in other applications, naturally related to the protection of information. Let's briefly talk about them:

1. For encryption in the gamma mode, GOST provides for the generation of a cryptographic gamma - a sequence of bits with good statistical characteristics and high cryptographic strength. Further, this gamma is used to modify open data, resulting in encrypted data. However, this is not the only possible application of the cryptographic gamma. The fact is that the algorithm for its development is a pseudo-random number sequence generator (PRNG) with excellent characteristics. Of course, to use such a PRNG where only obtaining statistical characteristics generated sequence, and cryptographic strength is not needed, is not very reasonable - for these cases, there are much more efficient generators. But for various applications related to information security, such a source will be very useful:

As noted above, gamma can be used as a "raw material" for generating keys. To do this, you just need to get a gamma segment of the desired length - 32 bytes. In this way, keys can be made as needed and do not need to be stored - if such a key is needed again, it will be easy enough to generate it again. It will only be necessary to remember on which key it was originally generated, which sync message was used, and from which byte of the generated gamma the key began. All information, except for the key used, is not secret. This approach will make it easy to control a fairly complex and branched system of keys, using just one "master key".
Similarly to the previous one, gamma can be used as the initial "raw material" for generating passwords. Here the question may arise why it is necessary to generate them at all, is it not easier to simply invent them as needed. The failure of this approach was clearly demonstrated by a series of incidents in computer networks, the largest of which was the diurnal paralysis of the Internet in November 1988, caused by the "Morris worm". One of the ways a malicious program penetrated a computer was password guessing: the program tried to enter the system by successively sorting through several hundred passwords from its internal list, and in a significant proportion of cases it was able to do this. The fantasy of man in inventing passwords turned out to be very poor. That is why in those organizations where due attention is paid to security, passwords are generated and distributed to users by a security system administrator. Password generation is a little more complicated than key generation, since in this case the “raw” binary gamma must be converted to character form, and not just “cut” into pieces. In addition, individual values may need to be discarded to ensure that all characters of the alphabet are equally likely to appear in the password.
Another way to use the cryptographic gamut is guaranteed erasure of data on magnetic media. The fact is that even when rewriting information on magnetic media there are traces of previous data that can be restored by the appropriate examination. To destroy these traces, such overwriting must be performed many times. It turned out that it would be necessary to rewrite information to the media fewer times if such a procedure uses random or pseudo-random data that will remain unknown to experts trying to recover the overwritten information. The gamma cipher will come in handy here.

2. Not only the cryptographic gamma, but also the cryptographic transformation itself, can be used for needs not directly related to encryption:

We know that one of such options for using GOST is the development of a simulated insert for data arrays. However, on the basis of any block cipher, including GOST, it is quite easy to build a scheme for calculating a one-way hash function, also called MDC in the literature, which in different sources stands for change detection code / manipulation (M modification/ M anipulation D etection C ode) or message digest (M essay D igest C ode). The first decoding appeared in the literature much earlier, the second, shorter one, I think, was invented by those who were unable to remember the first :) - it was a joke. MDC can be directly used in imitation protection systems as an analogue of imitation insertion, which, however, does not depend on the secret key. In addition, MDC is widely used in electronic digital signature (EDS) schemes, because most of these schemes are designed in such a way that it is convenient to sign a data block of a fixed size. As you know, on the basis of the discussed standard GOST 28147-89, the standard of the Russian Federation for calculating the one-way hash function GOST R34.11-94 is built.
It is less known that on the basis of any block cipher, including GOST, it can be built completely functional diagram EDS, with a secret signature key and an open verification combination. For a number of reasons, this scheme has not received wide practical distribution, but in some cases it can still be considered a very attractive alternative to the "mathematical" EDS schemes that are currently dominant in the world.

Literature

Information processing systems. Cryptographic protection. Cryptographic conversion algorithm GOST 28147-89. State. Com. USSR according to standards, M., 1989. ftp://ftp.wtc-ural.ru/pub/ru.crypt/GOST-28147
Shannon Claude. Mathematical theory of secret systems. In the collection "Works on information theory and cybernetics", M., IL, 1963, p. 333-369. http://www.enlight.ru/crypto/articles/shannon/shann__i.htm
Announcing Approval of Federal Information Processing Standard (FIPS) 197, Advanced Encryption Standard (AES), Federal Register Vol. 66, no. 235 / Thursday, December 6, 2001 / Notices, pp 63369–63371. http://csrc.nist.gov/encryption/aes/
Feistel Horst. Cryptography and computer security. Translation by A. Vinokurov, published by Horst Feistel. Cryptography and Computer Privacy, Scientific American, May 1973, Vol. 228, no. 5, pp. 15-23. http://www.enlight.ru/crypto/articles/feistel/feist_i.htm
Schneier Bruce. Applied cryptography. 2nd ed. Protocols, algorithms and source texts in the C language., M., "Triumph", 2002 http://www.ssl.stu.neva.ru/psw/crypto/appl_rus/appl_cryp.htm
Menezes Alfred, van Oorschot Paul, Vanstone Scott. Handbook of applied cryptography. http://www.cacr.math.uwaterloo.ca/hac/
Vinokurov Andrey. How is a block cipher structured? Manuscript. http://www.enlight.ru/crypto/articles/vinokurov/blcyph_i.htm
Vinokurov Andrey. Issues on cryptography for the electronic journal iNFUSED BYTES online. http://www.enlight.ru/crypto/articles/ib/ib.htm
Vinokurov Andrey, Primenko Eduard. The text of the report "On software implementation Encryption Standards of the Russian Federation and the USA”, Informatization Conference, Moscow, MEPhI, January 28-29, 2001. Published in the conference proceedings.
Information technology. Cryptographic protection of information. Hash function GOST R34.11-94, Gosstandart RF, M., 1994.

Encryption algorithm GOST 28147-89. Simple substitution method. — Archive WASM.RU

“While you are alive, do not die, look at this world.
Many here have a dead soul - they are dead inside.
But they walk and laugh, not knowing that they are not there,
Don't rush your death," she told me.
Aria, "Up There"
2.1 Feistel networks.
2.2 Block cipher GOST 28147-89

3.1 Key information
3.2 The main step of the crypto transformation
3.3 Basic cycles:“ 32-З” , “ 32-R” .

4.1 Implementation of the main step of crypto transformation
4.2 Increasing the speed of the algorithm
5. Key information requirements
6. List of used literature
7. Thanks.

Introduction.

This document is my attempt to describe the method of simply replacing the GOST 28147-89 encryption algorithm in the simplest, but, nevertheless, technically literate language. How well I succeeded, the reader will give his opinion after reading the first six points.

In order for my work to be more useful, I recommend arming yourself with the works of the authors indicated in the list of used literature. A calculator is also recommended, so that it has a function for calculating the operation XOR, because reading the article assumes that the reader has set out to study this encryption algorithm. Although it is also suitable as a reference tool, I wrote this article precisely as a tutorial.

Introduction to block ciphers.

Before we begin to consider the algorithm, we need to familiarize ourselves with the history of the creation of this kind of cipher. The algorithm belongs to the category of block ciphers, in the architecture of which the information is divided into a finite number of blocks, the final one naturally may not be complete. The encryption process takes place precisely over the full blocks, which form the ciphertext. The final block, if it is incomplete, is supplemented with something (I will talk about the nuances of its addition below) and is encrypted in the same way as full blocks. By ciphertext, I mean - the result of the action of the encryption function on a certain amount of data that the user submitted for encryption. In other words, the ciphertext is the end result of encryption.

The history of the development of block ciphers is associated with the beginning of the 70s, when IBM, realizing the need to protect information when transmitting data over computer communication channels, began to carry out its own research program dedicated to the protection of information in electronic networks, including cryptography.

A group of researchers - developers of the IBM company, which began to study encryption systems with a symmetric key scheme, was headed by Dr. Horst Feistel.

2.1 Feistel networks

The architecture of the new encryption method proposed by Feistel in the classical literature was called the "Feistel Architecture", but at the moment in Russian and foreign literature a more established term is used - "Feistel's network" or Feistel`s NetWork. Subsequently, according to this architecture, the "Lucifer" cipher was built - which was later published and caused a new wave of interest in cryptography in general.

The idea of the Feistel network architecture is as follows: the input information stream is divided into blocks of n bits in size, where n is an even number. Each block is divided into two parts - L and R, then these parts are fed into an iterative block cipher, in which the result of the j-th stage is determined by the result of the previous stage j-1! This can be illustrated with an example:

Rice. one

Where, function A is the main action of the block cipher. It can be a simple action, such as the XOR operation, or it can have a more complex form, be a sequence of a number of simple actions - modulo addition, left shift, element replacement, etc., in the aggregate, these simple actions form the so-called - the main step of crypto transformation.

It should be noted that the key elements of the function are the supply of key elements and the XOR operation, and how well thought out the work of these operations is, speaks of the cryptographic strength of the cipher as a whole.

In order for the idea of Feistel networks to be completely clear, consider the simplest case depicted in Fig. rice. one, where in the function A - the operations “mod 2” (“xor”) will perform, but this protozoan case, in a more serious situation, for example, hiding information of national importance, function A can be more complex (as far as I have seen, function A can indeed be very complicated):

Initial data:

L=1110b, R=0101, K=1111b

Get a ciphertext

1. (R + K) mod 2 4 = Smod, Smod = 0100b

2. (Smod + L) mod 2 = Sxor, Sxor = 1010b

3. L=R, R=Sxor

L=0101b, R=1010b

Let's explain our steps:

1. This operation is addition mod 2 4 . In practice, such an operation boils down to a simple addition, where we must add two numbers and ignore the carry to the 5th digit. Since, if we put exponents over the digits of the binary representation of a number, there will be an indicator of four over the fifth digit, take a look at the figure below, which shows the actions of our operation:

Rice. 2

Here I pointed to the exponents with an arrow, as you can see, the result should have been 10100, but since the carry is ignored during the mod 2 4 operation, we get 0100.

2. This operation in the literature is called mod 2, in assembly language it is implemented by the command XOR. But its more correct name is mod 2 1 . Without this unique operation, it is hardly possible to build a fast, easy-to-implement encryption algorithm and, at the same time, be quite crypto-resistant. The uniqueness of this operation lies in the fact that it is the reverse of itself! For example, if the number A is XORed with the number B, the result will be C, in the future it is enough to re-XOR the numbers B and C together to get the previous value of A!

In this operation, we got 1010 having the numbers 1110 and 0100, to get back 1110, it is enough to XOR the numbers 0100 and 1010 together! More details about this operation can be found in the article, which is embedded on the site. www.wasm.ru, « Elementary guide toCRC_error detection algorithms» the author who Ross N. Williams. There is a paragraph in this work - " 5. Binary arithmetic without carries". It is in this article that the operation is described xor! I exclaim because in this article this operation is described in such a way that the reader not only understands how this operation works, he even begins it see, hear and feel!

3. This action is necessary so that when decrypting from the ciphertext, you can get the original values.

2.2 Block cipher GOST 28147-89

The encryption algorithm GOST 28147 - 89 belongs to the category of block ciphers operating on the Feistel balanced network architecture, where two parts of the selected information block are of equal size. The algorithm was developed in the bowels of the eighth department of the KGB, now transformed into FAPSI, and was enshrined as the encryption standard of the Russian Federation back in 1989 under the USSR.

For this method of the algorithm to work, it is necessary to break the information into blocks of 64 bits in size. Generate or enter into the encryption system the following key information: key and substitution table. The choice of the key and substitution table for encryption should be taken very seriously, because. this is the foundation of the security of your information. About what requirements are imposed on the key, and the table of substitutions, see the paragraph "Requirements for key information".

When considering the method, we will not focus on this, because. this article, as I said above, was written with the aim of teaching the reader how to encrypt data using the method of simply replacing this encryption algorithm, but we will definitely touch on this issue at the end of the article.

theoretical minimum.

3.1 Key information

As I said above, the following are actively involved in data encryption:

3.1.1. The key is a sequence of eight elements of 32 bits each. Further, we will denote the symbol K, and the elements of which it consists - k1,k2,k3,k4,k5,k6,k7,k8.

3.1.2 The substitution table is a matrix of eight rows and sixteen columns, hereinafter referred to as Hij. Each element at the intersection of row i and column j takes 4 bits.

3.2 The main step of crypto transformation

The main action in the encryption process is the main step of the crypto transformation. This is nothing more than an action to encrypt data using a certain algorithm, only the name the developers introduced was too cumbersome.

Before starting to encrypt, the block is divided into two parts L and R, 32 bits each. The key element is selected and only then these two parts of the block are fed, the key element is the substitution table into the main step function, the result of the main step is one iteration of the base cycle, which will be discussed in the next paragraph. The main step consists of the following steps:

The addition part of the block R is added to the key element K mod 2 32 . I described a similar operation above, here the same thing is only the exponent is not “4”, but “32” - the result of this operation will be denoted by Smod in the future.

We divide the previously obtained result Smod into four bit elements s7,s6,s5,s4,s3,s2,s1,s0 and feed it into the replacement function. The replacement is as follows: the element Smod - s i is selected, from the beginning we start from the lowest element, and replace it with the value from the substitution table for the i - row and column indicated by the value of the element s i . We pass to s i +1 element and do the same and continue until we replace the value of the last element Smod - the result of this operation will be denoted as Ssimple.

In this operation, the value of Ssimple is shifted cyclically to the left by 11 bits and we get Srol.

We select the second part of the block L and add mod 2 with Srol, as a result we have Sxor.

At this stage, the L part of the block becomes equal to the value of the R part, and the R part in turn is initialized with the result of Sxor, and this completes the main step function!

3.3 Basic cycles: “32-Z”, “32-R”.

In order to encrypt information, it is necessary to break it into blocks of 64 bits in size, naturally the last block can be less than 64 bits. This fact is the Achilles' heel of this "simple substitution" method. Since its addition to 64 bits is a very important task to increase the cryptographic strength of the ciphertext, this sensitive place, if it is present in the array of information, but it may not be (for example, a file of 512 bytes in size!), should be treated with great responsibility!

After you have broken the information into blocks, you should break the key into elements:

K = k1,k2,k3,k4,k5,k6,k7,k8

Encryption itself consists in using the so-called basic cycles. Which, in turn, include the n -th number of basic crypto-transformation steps.

Basic cycles are, how to say, marked: n - m. Where n is the number of basic crypto-transformation steps in the base loop, and m is the "type" of the base loop, i.e. what is it about, "Z" encryption or "R" encryption of data.

The basic encryption cycle 32–3 consists of 32 basic cryptographic steps. The block N and the element of the key K are fed into the function that implements the actions of the step, and the first step occurs with k1, the second over the result with the element k2, etc. according to the following scheme:

k1,k2,k3,k4,k5,k6,k7,k8,k1,k2,k3,k4,k5,k6,k7,k8,k1,k2,k3,k4,k5,k6,k7,k8k8,k7, k6,k5,k4,k3,k2,k1

The 32-P decryption process proceeds in a similar way, but the key elements are fed in the reverse order:

k1,k2,k3,k4,k5,k6,k7,k8,k8,k7,k6,k5,k4,k3,k2,k1,k8,k7,k6,k5,k4,k3,k2,k1,k8, k7,k6,k5,k4,k3,k2,k1

Practice.

4.1 Implementation of the main step of crypto transformation

After we got acquainted with the theory of how to encrypt information, it is time to see how encryption works in practice.

Initial data:

Let's take a block of information N = 0102030405060708h, here the parts L and R are equal:

L = 01020304h, R = 05060708h, take the key:

K=' as28 zw37 q839 7342ui23 8e2t wqm2 ewp1' (these are ASCII codes, in order to view the hexadecimal representation, you can open this file in view mode in Total Commander by pressing the " F3” and then the key “ 3 "). In this key, the element values will be:

k1 = 'as28', k2 = 'zw37', k3 = 'q839', k4 = '7342'

k5 = 'ui23', k6 = '8e2t', k7 = 'wqm2', k8 = 'ewp1'

Also take the following substitution table:

Rice. 3

Here the rows are numbered from 0 to 7, the columns from 0 to F.

A warning: All information, including the key with the substitution table, is taken as an example for considering the algorithm!

Using the "Initial data", you need to get the result of the action of the main step of the crypto transformation.

1. Select part R = 05060708h and key element k1 = ‘as28’, in hexadecimal form the key element will look like this: 61733238h. Now we do the operation of summation mod 2 32:

Rice. 4

As you can see in the figure, we did not transfer to 33 bits marked in red and with the exponent " 32 ". And if we had other values of R and the key element, this could well happen, and then we would ignore it, and in the future use only the bits marked in yellow.

I perform such an operation with the assembler command add:

; eax=R, ebx='as28'

The result of this operation is Smod = 66793940h

2. Now the most intricate operation, but if you look closely, it is no longer as scary as it seems at first. Let's imagine Smod in the following form:

Rice. five

I tried to visualize the Smod elements in the figure, but I’ll explain anyway:

s0 = 0, s1 = 4, s2 = 9, etc.

Now, starting from the lowest element s0, we make a replacement. Remembering the paragraph 3.2 The main step of crypto transformation» i is a row, s i is a column, we are looking for a value in the zero row and zero column:

Fig.6

So the current value of Smod is not 6679394 0 h, and 6679394 5 h.

We proceed to replace s1, i.e. four. Using the first row and fourth column (s1= 4!). Let's look at the picture:

Rice. 7

Now the value is Smod, not 667939 4 5h, 667939 2 5h. I assume that now the replacement algorithm is clear to the reader, and I can say that after the end result of Ssimple will have the following value - 11e10325h.

I will talk about how this is easiest to implement in the form of assembler commands later in the next paragraph, after I talk about the extended table.

We must shift the resulting Ssimple value 11 bits to the left.

Rice. 8

As you can see, this action is quite simple, and is implemented by one assembly language command - roll and the result of this Srol operation is 0819288Fh.

4. Now it remains part L of our block of information to XOR with the value Srol. I take calculator from w2k sp4 and get Sxor = 091b2b8bh.

5. This action is final and we simply assign, clean R the value of the L part, and initialize the L part with the value of Sxor.

Final result:

L=091b2b8bh, R=01020304h

4.2 Increasing the speed of the algorithm

Now let's talk about optimizing the algorithm for speed. When implementing a project, one has to take into account that a program that works with registers more often than with memory works the fastest, and here this judgment is also very important, because. over one block of information as many as 32 encryption steps!

When I implemented the encryption algorithm in my program, I did the following:

1. Selected part of the block L in the eax register, and R in edx.

2. Initialized into the esi register with the address of the extended key, more on that below.

3. I assigned the value of the address of the extended substitution table to the ebx register, more on this below

4. Passed the information of points 1,2, 3 to the function of the basic cycle 32 - Z or 32 - R, depending on the situation.

If you look at the scheme for supplying key elements in paragraph " Basic cycles: “32-Z”, “32-R””, then our key for the basic cycle 32 - Z can be represented as follows:

K 32-Z =

'as28','zw37','q839','7342','ui23','8e2t','wqm2','ewp1',

'as28','zw37','q839','7342','ui23','8e2t','wqm2','ewp1',

'ewp1','wqm2','8e2t','ui23','7342','q839','zw37','as28'

Those. from the beginning go k1,k2,k3,k4,k5,k6,k7,k8 - ‘ as28', 'zw37', 'q839', '7342', 'ui23', '8e2t', 'wqm2', 'ewp1' This sequence is repeated three times. Then the elements go in reverse order, i.e.: k8,k7,k6,k5,k4,k3,k2,k1 - 'ewp1', 'wqm2', '8e2t', 'ui23', '7342', 'q839', 'zw37', 'as28'.

I pre-arranged the elements in the array in the order they should be served in 32 - Z. Thus, I increased the memory required per key, but saved myself from some thinking processes that I did not need, and increased the speed of the algorithm, for by reducing the memory access time! Here I described only the key for 32 - Z, for the cycle 32 - R I did the same, but using a different scheme for supplying elements, which I also described in paragraph " Basic cycles: “32-Z”, “32-R».

It's time to describe the implementation of the replacement function, as I promised above. I could not describe earlier, because. this requires the introduction of a new concept - an extended substitution table. I can't explain to you what it is. Instead, I will show you it, and you can formulate for yourself what it is - an extended substitution table?

So, in order to understand what an extended substitution table is, we need a substitution table, for example, I will take the one shown in Fig. 3.

For example, we needed to replace the number 66793940h. I will present it in the following form:

Rice. nine

Now if we take the elements s1,s0, i.e. low byte, then the result of the replacement function will be 25h! After reading the article by Andrey Vinokurov, which I cited in the paragraph “ List of used literature”, you will actually find that if you take two lines, you can get an array that allows you to quickly find replacement elements using an assembler command xlat. They say it is possible in another way, faster, but Andrey Vinokurov spent about four years researching fast algorithms for implementing GOST! I don't think you should reinvent the wheel when it already exists.

So, about the array:

Let's take the first two lines, zero and first, and create an array of 256 bytes. Now we observe one feature that if you need to convert 00h, then the result will be 75h (based on Fig. 3) - put this value in the array at offset 00h. We take the value 01h, the result of the replacement function is 79h, put it in the array at offset 01 and so on until 0FFh, which will give us 0FCh, which we will put in the array at offset 0FFh. So we got an extended table of replacements for the first group of strings: the first and zero. But there are still three groups: the second page 2, page 3, the third page 4, page 5, the fourth page 6, page 7. With these three groups we proceed in the same way as with the first. The result is an extended substitution table!

Now you can implement an algorithm that will perform the replacement. To do this, we take the source codes that Andrey Vinokurov posted on his page, see " Bibliography».

lea ebx,extented_table_simple

mov eax,[put number to replace]

add ebx,100h ;go to the next two nodes

sub ebx,300h ; so that in the future ebx points to the table

Now one more feature, we not only replaced the previous actions, but also shifted the number by 8 bits to the left! We just have to shift the number another 3 bits to the left:

and we get the result of the operation rol eax,11!

There is nothing more I can add on optimization, the only thing I can emphasize what I said above is to use registers more often than memory accesses. I think these words are only for beginners, experienced people understand this very well without my words.

Key Information Requirements.

As stated in the article by Andrey Vinokurov, the key is selected according to two criteria:

The criterion for the equiprobable distribution of bits between the values 1 and 0. Usually, the criterion for the equiprobable distribution of bits is the Pearson criterion (“chi-square”).

This means the key, in principle any number can. That is, when forming the next bit of the key, the probability of its initialization by one or zero is 50/50!

Please note that the key has eight elements, each 32 bits, so there are 32 * 8 = 256 bits in total in the key and the number of possible keys is 2 256! Doesn't it strike you?

series criteria.

If we look at our key, which I gave in paragraph " 4.1 Implementation of the main step of crypto transformation”, then you will notice that the following entry is true:

Rice. 10

In one phrase, the value of k 1 should not be repeated not in k 2 , not in any other element of the key.

That is, the key that we have chosen as the consideration of the encryption algorithm, well meets the two criteria above.

Now about the choice of substitution table:

Now let's talk about how to choose the right substitution table. The main requirement for the choice of substitution tables is the phenomenon of "non-repeatability" of elements, each of which is 4 bits in size. As you saw above, each row of the substitution table consists of the values 0h, 1h, 2h, 3h, ..., 0fh. So the main requirement is that each line contains the values 0h, 1h, 2h, ... , 0fh and each such value in one copy. For example, the sequence:

1 2 3 4 5 6 7 8 9 A B C D E F

Fully meets this requirement, but still! It is not recommended to select such a sequence as a string. Because if you supply a value to the input of a function that relies on such a string, then you will get the same value at the output! Don't believe? Then take the number 332DA43Fh and eight such lines as a substitution table. Perform the replacement operation, and I assure you, the output will be the number 332DA43Fh! That is, the same that you submitted to the input of the operation! And this is not a sign of good form in encryption, and was it?

That was one requirement, the next criterion says that - each bit of the output block must be statistically independent of each bit of the input block!

How does it look easier? And here is how, for example, we chose the element s0 = 0Fh, 01111b from the above number. The probability that we now replace the first bit with a one or a zero is 0.5! The probability of replacing the second, third and fourth bits, each bit, considered separately, by ones or zeros is also 0.5. When choosing s1 = 0Eh, we will replace the probability that we are a zero bit, and this is “0”, by zero or one too equals - 0.5! Thus, according to this criterion, there is no regularity between the replacement of zero bits of elements s0, s1! Yes, you could substitute ones, but you could also put zeros.

To evaluate the table according to this criterion, you can build a table of correlation coefficients calculated by the formula:

If p = 1, then the value of bit j at the output is equal to the value of bit i at the input for any combination of bits at the input;

If p = -1, then the value of bit j at the output is always the inverse of input bit i;

If p = 0, then the output bit j takes on the values 0 and 1 with equal probability for any fixed value of the input bit i.

Let's take a single line example:

Let's break it down into components:

We calculate one coefficient using the formula above. To make it easier to understand how this is done, I will explain in more detail:

We take the 0th bit of the 0th number (0) at the input and the 0th bit of the 0th number at the output (1) and perform the operation 0 XOR 1 = 1.

We take the 0th bit of the 1st number (1) at the input and the 0th bit of the 1st number at the output (1) and perform the operation 1 XOR 1 = 0.

We take the 0th bit of the 2nd number (0) at the input and the 0th bit of the 2nd number at the output (0) and perform the operation 0 XOR 0 = 0.

We take the 0th bit of the 3rd number (1) at the input and the 0th bit of the 3rd number at the output (1) and perform the operation 1 XOR 1 = 0.

After sequentially performing XOR operations in such a sequence, we count the number of all non-zero values, we get the value 6. Hence P 00 = 1-(6/2 4-1) = 0.25. So, it turned out that the value of bit 0 at the output is equal to the value of bit 0 at the input in 4 cases out of 16;

Final odds table:

As can be seen from the table of correlation coefficients, bit 3 at the input is inverted relative to bit 0 at the output in 14 cases out of 16, which is 87.5% This is no longer acceptable for normal encryption systems. For a change, let's take another example:

The table of coefficients will be as follows (who is not lazy to recalculate)

Well, things are even worse in this table - bits 1 and 2 of the group remain unchanged! The cryptanalyst has a place to turn around Taking into account all these requirements, a simple enumeration (“head-on”) found permutation tables corresponding to the specified theory (today - 1276 combinations) Here are some of them:

09 0D 03 0E-06 02 05 08-0A 07 00 04-0C 01 0F 0B

00 05 0A 07-03 08 0F 0C-0E 0B 04 09-0D 06 01 02

06 0B 0F 00-0C 01 02 0D-08 07 09 04-05 0A 03 0E

04 0E 00 09-0B 01 0F 06-03 0D 07 0A-0C 02 08 05

04 02 08 0E-05 0F 03 09-0B 01 0D 07-0A 0C 06 00

07 03 09 0C-08 00 06 0F-0E 04 01 0A-0D 0B 02 05

06 0F 03 08-0D 04 0A 01-09 02 05 0C-00 0B 0E 07

0C 06 08 01-03 09 07 0E-0B 05 0F 02-04 0A 00 0D

04 0B 09 06-0E 01 00 0F-0A 05 03 0C-0D 02 07 08

00 0E 0F 01-07 08 09 06-04 0B 0A 05-03 0D 0C 02

0F 09 01 07-04 0A 08 06-0E 00 02 0C-05 03 0B 0D

0A 03 04 01-05 0C 0B 0E-08 06 0F 0D-07 09 00 02

0B 06 0F 01-04 0A 08 05-00 0D 0C 02-07 09 03 0E

0C 03 02 08-0D 06 0B 05-07 09 04 0F-0A 00 01 0E

02 0B 0F 04-09 00 06 0D-05 0E 01 08-0C 07 0A 03

List of used literature.

Article by Andrey Vinokurov:

Encryption algorithm GOST 28147-89, its use and implementation

for Intel x86 platform computers.

Here are the source codes for the implementation of the encryption algorithm.

Article by Horst Feistel:

Cryptography and Computer Security.

(can be found at the same address as the previous article)

Ross N. Williams:

An Elementary Guide to CRC Error Detection Algorithms
Posted on the site www.wasm.en.

Thanks.

I would like to express my gratitude to all visitors of the www.wasm.ru forum. But I would especially like to thank ChS, who is currently known as SteelRat, he helped me understand things that I probably would never have understood, as well as helping me write the paragraph: “ Key information requirements”, the main part of this paragraph was written by him. I am also deeply grateful to the employee of KSTU. A.N. Tupolev Anikin Igor Vyacheslavovich and it would be a sin not to mention Chris Kaspersky, for the fact that he is, and Volodya / wasm.ru for his instructions. Oh, and I get from him. I also want to note Sega-Zero / Callipso for bringing some mathematical jungle to my mind.

This is perhaps all I would like to tell you.

I would be grateful for criticism or questions related to this article or just advice. My contact details: [email protected], ICQ - 337310594.

Sincerely, Evil`s Interrupt.

P.S.: I did not try to outdo anyone with this article. It was written with the intent to make it easier to study GOST, and if you have difficulties, this does not mean that I am to blame for this. Be reasonable and be patient, all the best to you!

). At the same time, the number of articles about this algorithm in the Russian media and blogs of Russian users is growing: both covering the results of attacks on the Russian standard with varying degrees of reliability, and containing opinions about its operational characteristics. The authors (and, consequently, readers) of these notes often get the impression that the domestic encryption algorithm is obsolete, slow, and has vulnerabilities that make it much more susceptible to attacks than foreign encryption algorithms with a similar key length. With this series of notes, we would like to tell in an accessible form about the current state of affairs with the Russian standard. The first part will cover all the attacks on GOST 28147-89 known to the international cryptographic community, current estimates of its strength. In future publications, we will also consider in detail the properties of the standard from the point of view of the possibility of building efficient implementations.

Nicolas Courtois - "great and terrible"

Let's start with a story about the activities of Nicolas Courtois, who is the author of a whole series of works on the Russian block encryption standard ().

In October 2010, the process of considering the inclusion of the GOST 28147-89 algorithm into the international standard ISO/IEC 18033-3 was started. Already in May 2011, an article by the famous cryptographer Nicolas Courtois appeared on the ePrint electronic archive, marked by a very ambiguous attitude towards him from the world cryptographic community. Courtois' publications are a sad example of the manipulation of concepts, which does not reveal any new properties of the object in question, but with a claim to sensation provokes the spread of erroneous opinions about its actual properties in an incompetent environment.

Algebraic Method

Courtois' reasoning is built around two classes of cryptanalysis methods: algebraic methods and differential ones. Consider the first class of methods.

In a simplified way, the method of algebraic cryptanalysis can be described as the compilation and solution of a large system of equations, each of the solutions of which corresponds to the goal of the cryptanalyst (for example, if the system is compiled using one pair of plaintext and ciphertext, then all solutions of this system correspond to keys under which this plaintext is converted into given encrypted). That is, in the case of the problem of cryptanalysis of a block cipher, the essence of the algebraic method of cryptanalysis is that the key is found as a result of solving a system of polynomial equations. The main difficulty is to be able to compose as simple a system as possible, taking into account the characteristics of a particular cipher, so that the process of solving it takes as little time as possible. Here the key role is played by the features of each specific analyzed cipher.

The algebraic method exploited by Courtois can be briefly described as follows. At the first stage, such properties of GOST 28147-89 are used as the existence of a fixed point for a part of the encryption transformation, as well as the so-called reflection point. Due to these properties, enough a large number pairs of open-ciphertexts, several pairs are selected that allow us to consider transformations not on 32, but only on 8 rounds. The second stage consists in the fact that, based on the results of 8-round transformations obtained at the first stage, a system of nonlinear equations is constructed, in which the unknowns are the key bits. This system is then solved (this sounds simple, but is actually the most time-consuming part of the method, since the system consists of non-linear equations).

As noted above, nowhere in the work is there a detailed description and analysis of the complexity of the second and main stage of determining the key. It is the complexity of the second stage that determines the complexity of the entire method as a whole. Instead, the author gives the notorious "facts", on the basis of which he makes estimates of labor intensity. These "facts" are said to be based on experimental results. Analysis of the "facts" from the work of Courtois as a whole is given in the work of domestic authors. The authors of this work note that many of Courtois' "facts" presented without any evidence turned out to be false during experimental verification. The authors of the article went further and, for Courtois, analyzed the complexity of the second stage using well-founded algorithms and estimates. The resulting estimates of the complexity show the complete inapplicability of the presented attack. In addition to domestic authors, the big problems that Courtois has with the estimates and justification of his methods were also noted, for example, in the work.

Differential method

Consider the second Courtois method, which is based on differential cryptanalysis.

The general method of differential cryptanalysis is based on the exploitation of the properties of non-linear mappings used in cryptographic primitives, related to the influence of the key value on the dependencies between the differences between pairs of input and output values of these mappings. Let us describe the main idea of the differential method of cryptographic analysis of a block cipher. Usually block ciphers transform the input data in stages using a number of so-called round transformations, and each round transformation does not use the entire key, but only some of it. Consider a slightly "truncated" cipher, which differs from the original one in that it does not have the last round. Let us assume that it has been established that as a result of encrypting two plaintexts that differ in some fixed positions using such a "truncated" cipher, with a high probability, ciphertexts are obtained that also differ in some fixed positions. This property shows that a "truncated" cipher is very likely to leave a dependency between some plaintexts and the results of their encryption. In order to recover part of the key with this apparent disadvantage, it is necessary to be able to encrypt pre-selected plaintexts with the key we want to recover (the so-called "chosen plaintext attack"). At the beginning of the “key opening” procedure, a certain number of pairs of plaintexts are randomly generated that differ in the same fixed positions. All texts are encrypted using a "full" cipher. The resulting ciphertext pairs are used to recover those key bits that are used in the last round transformation, as follows. With the help of some randomly selected value of the desired bits of the key, a transformation is applied to all ciphertexts, the reverse of the last round transformation. In fact, if we guessed the desired value of the key bits, we will get the result of the “truncated” cipher, and if we didn’t guess, we will actually “encrypt the data even more”, which will only reduce the dependence between the blocks noted above (difference in some fixed positions). In other words, if among the results of such “additional processing” of ciphertexts there were quite a lot of pairs that differ in fixed positions known to us, then this means that we have guessed the required key bits. Otherwise, there are significantly fewer such pairs. Since only part of the key is used in each round, the bits to be searched (that is, the bits of the key used in the last round) are not as many as the bits in the full key and can simply be iterated over by repeating the above steps. In this case, we will definitely someday stumble upon the correct value.

It follows from the above description that the most important thing in the differential analysis method is the numbers of those very positions in plaintexts and ciphertexts, the differences in which play a key role in restoring the key bits. The fundamental presence of these positions, as well as the set of their numbers, directly depends on the properties of those non-linear transformations that are used in any block cipher (usually, all "non-linearity" is concentrated in the so-called S-boxes or substitution nodes).

Courtois uses a slightly modified version of the differential method. Immediately, we note that Courtois conducts his analysis for S-boxes that are different from the current ones and from those proposed in ISO. The paper presents differential characteristics (the very numbers in which blocks should differ) for a small number of rounds. The rationale for extending stats for more rounds is usually based on "facts". Courtois expresses, again, with nothing but his authority, an unsubstantiated assumption that changing the S-boxes will not affect the resistance of GOST 28147-89 against its attack (at the same time, for unknown reasons, S-boxes from the 1st working draft of the supplement to the standard ISO/IEC 18033-3 have not been considered). The analysis carried out by the authors of the article shows that even if we accept Courtois's unfounded "facts" and analyze GOST 28147-89 with other S-boxes, the attack again turns out to be no better than a complete enumeration.

A detailed analysis of the works of Courtois with a detailed justification for the groundlessness of all statements about the decrease in the stability of the Russian standard was carried out in [ , ].

At the same time, even Courtois himself recognizes the absolute lack of accuracy in calculations! The following slide is taken from Courtois' presentation at the FSE 2012 Short Announcement section.

It should be noted that the works of Courtois were repeatedly criticized by foreign researchers as well. For example, his work on building attacks on the AES block cipher algorithm using the XSL method contained the same fundamental flaws as the work on the analysis of the Russian standard: most of the estimates of labor intensity appear in the text completely unfounded and unsubstantiated - detailed criticism can be found, for example, in work . In addition, Courtois himself acknowledges widespread refusals to publish his work at major cryptographic conferences and in established peer-reviewed journals, leaving him often only the opportunity to speak at the short announcement section. This, for example, can be read in section 3 of the work. Here are some quotes quoted by Courtois himself and related to his work:

"I think that the audiences of Asiacrypt will not feel it is interesting." Asiacrypt 2011 reviewer.
“… there is a big, big, big problem: this attack, which is the main contribution of the paper has already been published at FSE’11 (it was even the best paper),…”. Crypto Reviewer 2011.

Thus, the professional part of the international cryptographic community treats the quality of Courtois's work with no less doubt than, say, the statements of some Russian specialists about their ability to crack AES for 2,100, or the next "proofs" for two pages of the hypothesis, which are not confirmed by any consistent calculations. on the inequality of complexity classes P and NP.

Isobe and Dinur-Dunkelman-Shamir attacks

The general idea of the Isobe attacks () and Dinur-Dankelman-Shamir (hereinafter: the DDSH attack) () is to construct for a certain (key-dependent) narrow set of plaintexts a transformation equivalent on this set, which has a simpler structure than the encryption transformation itself. . In the case of the Isobe method, this is the set of 64-bit blocks x such that F 8 -1 (Swap(F 8 (z))) = z, where z = F 16 (x), through F 8 (x) and F 16 ( x) are the first 8 and the first 16 rounds of encryption GOST 28147-89, respectively, through Swap - the operation of swapping halves of a 64-byte word. When the plaintext gets into this set, the result of the full 32-round transformation of GOST 28147-89 coincides with the result of the 16-round one, which is exploited by the author of the attack. In the case of the DDS method, this is the set of x such that F 8 (x) = x (the fixed point of the transformation F 8). For any plaintext from this set, the GOST 28147-89 transformation works exactly the same as its last 8 rounds, which simplifies the analysis.

The complexity of the Isobe attack is 2224 encryption operations, the LDS attack is 2192. However, all questions about whether it follows that the Isobe and LDS attacks introduce new restrictions on the conditions for the application of our algorithm are removed by an assessment of the requirements for the amount of material needed to carry out each of the attacks: the Isobe method requires 232 pairs of plaintext and ciphertexts, and for the DDSH - 2 method 64 . Processing such volumes of material without changing the key is a priori unacceptable for any block cipher with a block length of 64: on material with a volume of 2 32 , taking into account the problem of birthdays (see, for example, to the intruder the ability to make certain conclusions about plaintexts from ciphertexts without determining the key. The presence of 264 pairs of open and cipher texts obtained on the same key actually allows the adversary to perform encryption and decryption operations without knowing this key at all. This is due to a purely combinatorial property: the adversary in this case has the entire encryption transformation table. This situation is absolutely unacceptable under any reasonable operational requirements. For example, CryptoPro CSP has technical limitation for the amount of encrypted (without key conversion) material of 4 MB (see). Thus, a strict prohibition on the use of a key on a material of this size is inherent in any block cipher with a block length of 64 bits, and therefore, the Isobe and DDSH attacks in no way narrow the area of use of the GOST 28147-89 algorithm while maintaining the maximum possible security 2 256 .

Of course, it should be noted that the researchers (Isobe and Dinur-Dankelman-Shamir) showed that some properties of the GOST 28147-89 algorithm make it possible to find ways of analysis that were not taken into account by the creators of the algorithm. The simple form of the key schedule, which greatly simplifies the task of constructing efficient implementations, also allows one to construct simpler descriptions of the transformations performed by the algorithm for some rare cases of keys and plaintexts.

The paper demonstrates that this negative property of the algorithm can be easily eliminated with full preservation of operational characteristics, but, unfortunately, it is an integral part of the algorithm in its commonly used form.

We note that certain negligences in the estimates of the average labor intensity are also present in the work of Dinur, Dankelman, and Shamir. Thus, when constructing an attack, due attention is not paid to the following point: for a significant proportion of keys, the set of plaintexts x, such that F 8 (x) = x, is empty: 8 rounds of transformation may simply not have fixed points. The existence of fixed points also depends on the choice of replacement nodes. Thus, the attack is only applicable to certain replacement nodes and keys.

It is also worth mentioning one more work with an attack on GOST 28147-89. In February 2012, an updated version of the article (dated November 2011) appeared on the ePrint electronic archive of the international cryptographic association, which contained a new attack on GOST 28147-89. The characteristics of the presented attack are as follows: the volume of the material is 232 (as with Isobe), and the labor intensity is 2192 (as with DDSh). Thus, this attack improved the record-breaking LDS attack in terms of material volume from 2 64 to 2 32 . We note separately that the authors honestly presented all the calculations with the justification of the complexity and volume of the material. After 9 months, a fundamental error was found in the above calculations, and since November 2012, the updated version of the article in the electronic archive no longer contains any results regarding the domestic algorithm.

Attacks based on the assumption that the attacker knows "something" about the keys

Finally, we note that there are also a number of works in the literature (see, for example, and ) devoted to attacks on GOST 28147-89 in the so-called model with linked keys. This model basically contains an assumption about the possibility of an intruder to gain access for analysis not only to pairs of open and encrypted texts using the desired key, but also to pairs of open and cipher texts obtained using (also unknown) keys that differ from the sought-for known regular way (for example, in fixed bit positions). In this model, it is indeed possible to obtain interesting results about GOST 28147-89, however, in this model, it is possible to obtain no less strong results about, for example, the most widely used in modern networks general use AES standard (see, for example,). Note that the conditions for carrying out such attacks arise when a cipher is used in a certain protocol. It should be noted that the results of this kind, although they are of undoubted academic interest from the point of view of studying the properties of cryptographic transformations, actually do not apply to practice. For example, all cryptographic information protection tools certified by the FSB of Russia fulfill the strictest requirements for encryption key generation schemes (see, for example,). As indicated in the results of the analysis carried out in the presence of 18 associated keys and 2 10 pairs of blocks of open and ciphertext, the complexity of a full opening private key, with a probability of success of 1-10 -4 , is indeed 2 26 . However, if the above requirements for the development of key material are met, the probability of finding such keys is 2 -4352, that is, 24096 times less than if you just try to guess the secret key on the first try.

The works related to the model with linked keys also include the work , which made a lot of noise in 2010 in Russian electronic publications that do not suffer from the habit of carefully checking the material in the process of chasing sensations. The results presented in it were not supported by any rigorous justification, but contained loud statements about the possibility of hacking the state standard of the Russian Federation on a weak laptop in a matter of seconds - in general, the article was written in the best traditions of Nicolas Courtois. But, despite the completely obvious to the reader, more or less familiar with the basic principles of scientific publications, the groundlessness of the article, it was precisely to reassure the Russian public that after the work Rudsky wrote a detailed and detailed text containing a comprehensive analysis of this shortcoming. The article with a telling title "On the zero practical significance of the work "Key recovery attack on full GOST block cipher with zero time and memory"" justifies the fact that the average complexity of the method given in the method is no less than the complexity of a complete enumeration.

Dry residue: what is the resistance in practice?

In conclusion, we present a table containing data on all the results of strictly described and justified attacks on GOST 28147-89 known to the international cryptographic community. Note that the complexity is given in the encryption operations of the GOST 28147-89 algorithm, and the memory and material are specified in the blocks of the algorithm (64 bits = 8 bytes).

Attack	Labor intensity	Memory	Required material
Isobe	2 224	2 64	2 32
Dinur-Dunkelman-Shamir, FP, 2DMitM	2 192	2 36	2 64
Dinur-Dunkelman-Shamir, FP, low-memory	2 204	2 19	2 64
	2 224	2 36	2 32
Dinur-Dunkelman-Shamir, Reflection, 2DMitM	2 236	2 19	2 32
brute force	2 256	1	4
The number of nanoseconds since the beginning of the universe	2 89

Despite a fairly large-scale research cycle in the field of security of the GOST 28147-89 algorithm, at the moment there is not a single attack known, the conditions for which would be achievable with the accompanying operational requirements of a block length of 64 bits. The restrictions on the amount of material that can be processed on one key, arising from the parameters of the cipher (bit length of the key, bit length of the block), are significantly stricter than the minimum amount that is necessary to carry out any of the currently known attacks. Therefore, when the existing operational requirements are met, none of the GOST 28147-89 cryptanalysis methods proposed so far allows you to determine a key with a laboriousness less than exhaustive search.