Hardware and software setup

The concept of file archiving. An overview of data compression methods. What is the file compression ratio

The purpose of archiving is to store information more compactly on disk and to reduce the time, and therefore the cost, of transmitting it over communication channels in computer networks. In addition, archiving greatly simplifies the transfer of information from one computer to another, reduces the time it takes to copy it to external media, helps protect information from unauthorized access, and helps protect against computer viruses.

The main feature of archiving is information compression, i.e. converting information to a form that reduces redundancy in its representation and therefore requires less memory for storage.

One file or several files can be compressed; in compressed form they are placed in a single so-called archive file (archive), from which they can later be extracted in their original form.

An archive file (archive) is a specially organized file that contains one or more files in compressed or uncompressed form, together with service information about their names, the date and time of their creation or modification, their sizes, etc.

The process of writing files to an archive file is called archiving (packing), and the process of extracting files from an archive is called unarchiving (unpacking).
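The definitions above can be illustrated with a short sketch. It uses Python's standard zipfile module as a stand-in for an archiver; the file names and contents are hypothetical, and the archive is kept in memory rather than on disk to keep the example self-contained.

```python
import io
import zipfile

# Hypothetical sample data standing in for files on disk.
files = {
    "notes.txt": b"archiving reduces redundancy " * 50,
    "table.csv": b"id;name;value\n" * 100,
}

# Archiving (packing): write the files into one archive, compressed with Deflate.
buffer = io.BytesIO()  # in-memory stand-in for an archive file
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, data in files.items():
        archive.writestr(name, data)

# Unarchiving (unpacking): read the service information and restore the contents.
with zipfile.ZipFile(buffer) as archive:
    listing = [(info.filename, info.file_size, info.compress_size)
               for info in archive.infolist()]
    restored = {name: archive.read(name) for name in archive.namelist()}

for filename, original, packed in listing:
    print(f"{filename}: {original} -> {packed} bytes")
```

The listing corresponds to the service information stored in the archive (names and sizes), and the restored contents are byte-for-byte identical to the originals.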

The degree to which a file is compressed during archiving depends on its format. Some formats (for example, many graphic formats) are already compressed by the programs that create them, so such files hardly shrink when archived. Text files and database files compress best; executable program files and load modules compress less well. The compression ratio also depends on the compression method.
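A minimal sketch of this effect, assuming Python's zlib module as the compressor: repetitive text shrinks dramatically, while random bytes (used here as a stand-in for data that is already compressed) do not shrink at all. The sample data is hypothetical.

```python
import os
import zlib

def packed_percent(data: bytes) -> float:
    """Size of the compressed data as a percentage of the original size."""
    return 100 * len(zlib.compress(data, 9)) / len(data)

# Hypothetical inputs: repetitive text versus incompressible bytes.
text_like = b"The quick brown fox jumps over the lazy dog. " * 200
random_like = os.urandom(len(text_like))  # stands in for already-compressed data

print(f"text-like data:   {packed_percent(text_like):.1f}% of original size")
print(f"random-like data: {packed_percent(random_like):.1f}% of original size")
```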

In addition to regular archive files, you can create solid, multi-volume and self-extracting archives, as well as their combinations, for example multi-volume self-extracting or multi-volume solid archives.

A solid archive is an archive packed in a special way, in which all the files being compressed are treated as one sequential data stream.

Solid archiving greatly increases the compression ratio, especially when a large number of small, similar files are added. However, it also has disadvantages:

§ existing solid archives are updated more slowly than regular archives;

§ encrypted solid archives cannot be modified;

§ to extract a single file from a solid archive, all the files archived before it must be processed, so extracting individual files from the middle of a solid archive is slower than extracting them from a regular archive. However, if all the files or the first few files are extracted from a solid archive, the extraction speed is almost the same as for regular archives;

§ if any file in a solid archive is damaged, none of the files that follow it can be extracted either. Therefore, when a solid archive is stored on unreliable media, it is recommended to add recovery information.

Solid archives are best used when:

§ the archive is rarely updated;

§ there is no need to frequently extract one or more files from the archive;

§ a single large file is being archived;

§ compression ratio is more important than compression speed.

Files in solid archives are usually sorted by extension, but the sort order can be changed.
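The trade-off described above can be sketched with zlib. This is only an illustration of the idea, not the format used by any real archiver: a "regular" archive compresses each file separately, a "solid" one compresses all files as a single stream, and extracting one file from the solid stream requires decompressing everything stored before it. The file contents and the helper name extract_from_solid are hypothetical.

```python
import zlib

# Hypothetical set of small, similar files (e.g. configuration snippets).
files = [
    f"name=item{i}\ncolor=blue\nsize=medium\nenabled=true\n".encode()
    for i in range(200)
]
sizes = [len(data) for data in files]

# Regular archive: each file is compressed on its own.
regular_size = sum(len(zlib.compress(data, 9)) for data in files)

# Solid archive: all files are compressed as one sequential data stream.
solid_stream = zlib.compress(b"".join(files), 9)
solid_size = len(solid_stream)

def extract_from_solid(stream: bytes, sizes: list[int], k: int) -> bytes:
    """Return file k; everything stored before it must be decompressed first."""
    end = sum(sizes[: k + 1])
    start = end - sizes[k]
    decompressor = zlib.decompressobj()
    prefix = decompressor.decompress(stream, end)  # stop once file k is complete
    return prefix[start:end]

print(f"regular: {regular_size} bytes, solid: {solid_size} bytes")
```

With many small, similar files the solid stream is far smaller, because repeated sequences are found across file boundaries, but reaching file 150 means decompressing files 0 to 149 as well.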

Multi-volume archives are archives that consist of several parts (volumes). Volumes are typically used to store a large archive on several floppy disks or other removable media.

The first volume in the sequence has the archiver's usual extension, while the extensions of subsequent volumes consist of the first letter of the archiver's extension and a serial number.

Files cannot be added to, updated in, or deleted from existing volumes.
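The naming rule can be sketched as follows. The function name volume_names is hypothetical, and the two-digit serial number starting at 01 is an assumption made for illustration; the exact numbering scheme depends on the archiver.

```python
def volume_names(base: str, extension: str, count: int) -> list[str]:
    """Names of the volumes of a multi-volume archive.

    Assumption: serial numbers are two digits wide and start at 01.
    """
    names = [f"{base}.{extension}"]  # the first volume keeps the usual extension
    for number in range(1, count):
        # later volumes: first letter of the extension plus a serial number
        names.append(f"{base}.{extension[0]}{number:02d}")
    return names

print(volume_names("backup", "arj", 4))
# ['backup.arj', 'backup.a01', 'backup.a02', 'backup.a03']
```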

A self-extracting (SFX, from SelF-eXtracting) archive is an archive to which an executable module is attached. This module lets you extract the files simply by running the archive like a regular program, so no additional external programs are required to extract the contents of an SFX archive. SFX archives, like any other executable files, usually have the .EXE extension, but they can still be handled like any other archive.

SFX archives are useful in cases where you need to transfer the archive to someone, but you are not sure that the recipient has the appropriate archiver to extract files.

Multi-volume and self-extracting archives can also be solid.

Programs that archive and unarchive files are called archiving programs (archivers).

Archiving programs can be compared according to the following main parameters: interface, compression methods (determining the degree of file compression), types of archives created, speed, support for other archiver formats.

When creating an archive, the archiver program automatically assigns its own extension to the archive file, for example, zip, rar, etc.

The archiving program is managed in one of the following ways:

1. using the command line;

2. using the built-in shell and dialog panels that allow you to control the program with menus and function keys;

3. using function key combinations in operating shells, which, as a rule, can offer a choice of several DOS archiving programs or the shell's own archiver.

4. using GUI elements.

Despite the many archiving programs, a modern user, as a rule, really works with two archive formats: ZIP and RAR.

The degree of information compression depends on several factors:

First, the type of data being compressed is of great importance. Graphic and text files compress best: for them the compressed file can be as little as five to forty percent of the original size. Executable program files, load modules and multimedia files compress worse.

Secondly, the compression method is of great importance.

Thirdly, it also matters which archiver is used. An archiver is usually chosen so that the compression ratio is as high as possible and the time needed to pack and unpack files is as short as possible.

Information compression programs

Compression is performed by archiving programs. At present the four most common archivers are WinRAR, WinAce, 7-Zip and WinZip. As for the last program, it does not stand up to scrutiny.

Let's take a closer look at the WinRAR archiver. It can be associated with the following file types: RAR, ZIP, CAB, ARJ, LZH, ACE, 7-Zip, TAR, GZip, UUE, BZ2, JAR, ISO.

The program supports files of almost unlimited size (up to 8,589,934,591 GB). However, to work with files larger than 4 GB, you need to use the NTFS file system.

When choosing optimal compression settings, there are a few things to keep in mind:

Although WinRAR supports the ZIP format, it is recommended to choose RAR in most cases, since this provides a higher level of compression. You can compress files to ZIP if you are not sure that the computer on which the files will be unpacked has a program that can unpack RAR archives.

You need to decide which compression method to use. The higher the degree of compression, the longer archiving takes, so consider why the data is being archived. For long-term storage it makes sense to wait and get an archive with maximum compression, but if you just need to send a few documents by mail, the Normal compression method is fine.

If you need to achieve maximum compression, use the Create solid archive option. However, it has its drawbacks. First, extracting files from such an archive takes longer than from a regular one. Imagine that your archive contains two hundred files. If it was created in the usual way, you can easily extract any one of them. If you used a solid archive, it matters where in the archive the file you need is located: if it is in the middle of the second hundred, the program has to unpack about 150 files before it reaches it. Second, solid archives can lead to greater losses: if the archive becomes corrupted, you can lose all the files that were in it, whereas from a damaged archive packed in the usual way you can still extract most of the files, if not all of them.

Creating a large archive can take quite a long time. WinRAR lets you estimate how long a particular task will take; the Benchmark and hardware test option is intended for this. Another reason to use this option is to detect errors that may occur during archiving on a particular computer configuration because of a hardware failure.

Among WinRAR's other settings is the ability to create self-extracting archives with a specified unpacking path. Such files do not require an archiver on the computer where they will be unpacked. Such archives are called SFX archives. Their disadvantage compared with conventional archives is a larger size, since in addition to the packed files they also contain the executable (EXE) module.

The contents of a RAR archive can be made invisible. To do this, in the program settings, in the Archiving with Password window, you need to check the box next to the Encrypt File Names line.

You can also set a password to open the archive. As a result of an error in transferring an archive over a local network or downloading it from the Internet, as well as due to a hardware failure or a virus attack, the archive may be damaged. WinRar allows you to determine the integrity of the data by testing the archive using the Test Archived Files option.

To minimize the chance of data loss, when creating WinRar archives it is recommended to use the Put Recovery Record option (this checkbox can be found on the General tab of the archive creation window).

If this has been done, then in case of damage to the archive, it can be restored.

In addition, in WinRAR you can reduce the risk of losing a RAR archive to damage by specifying the size of the recovery information when creating it. To do this, run the Commands > Protect Archive From Damage command in the WinRAR window. The size of the recovery record cannot exceed ten percent of the total size of the archive.

To repair damaged RAR archives, select the required file in the WinRar window and execute the Tools > Repair command.

WinRAR can be integrated into the context menu, not only of Explorer but also of other programs, such as the popular file manager Total Commander. This makes it possible to archive files quickly with the default settings, without opening the program window. The default settings can be changed to match your requirements for archives. To do this, open the WinRAR window, run the Options > Settings command, go to the Compression tab and click the Create Default button. The settings specified in this window will be used for quick archiving. Archiving settings can also be changed from the context menu: select the Add to Archive… command, where you can set the format and degree of compression, specify the archive name, and select other archiving options.

WinRAR lets you save user-defined settings to a file with the REG extension. Later this file can be imported into the program to reuse the configuration. The file stores information such as the history of recently created archives, the default compression settings, etc.

Another handy WinRAR feature is the ability to create your own bookmarks (Favorites). You often need to back up the same folders on your hard drive regularly. By bookmarking the locations of these folders, you can navigate to them quickly in the program window and back up the required files and subdirectories.

General information about archiving files

The concept of file archiving

Archiving programs, designed to archive (pack) files by compressing the information stored in them, are among the most widely used types of service programs.

Information compression is the process of converting the information stored in a file to a form in which redundancy in its representation is reduced and, accordingly, less memory is required for storage.

Information in files is compressed by eliminating redundancy in various ways, for example by simplifying codes, removing constant bits from them, or representing repeated symbols or a repeating sequence of symbols as a repetition factor and the corresponding symbols. Various algorithms are used for this kind of compression.

One file or several files can be compressed and placed in compressed form in a so-called archive file, or archive.

An archive file is a specially organized file that contains one or more files in compressed or uncompressed form, together with service information about their names, the date and time of their creation or modification, their sizes, etc.

The purpose of packing files is usually to store information more compactly on disk and to reduce the time, and therefore the cost, of transmitting information over communication channels in computer networks. In addition, packing a group of files into one archive file greatly simplifies their transfer from one computer to another, reduces the time it takes to copy files to disks, protects information from unauthorized access, and helps protect against computer viruses.

The file compression ratio is characterized by the coefficient Kc, defined as the ratio of the size of the compressed file Vc to the size of the original file Vo, expressed as a percentage:

Kc = (Vc / Vo) * 100%

The compression ratio varies depending on the program used, the compression method, and the type of the source file.
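To make the "repetition factor" idea and the Kc formula concrete, here is a minimal sketch of run-length encoding. It is an illustration only, not the algorithm of any particular archiver; the function names and the (count, byte) pair layout with counts capped at 255 are assumptions chosen for simplicity.

```python
def rle_encode(data: bytes) -> bytes:
    """Run-length encoding: each run becomes a (count, byte) pair, count <= 255."""
    encoded = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        encoded += bytes([run, data[i]])  # repetition factor, then the symbol
        i += run
    return bytes(encoded)

def rle_decode(encoded: bytes) -> bytes:
    """Restore the original data from (count, byte) pairs."""
    decoded = bytearray()
    for k in range(0, len(encoded), 2):
        decoded += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(decoded)

def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Kc = (Vc / Vo) * 100%, with sizes measured in bytes."""
    return 100 * len(compressed) / len(original)

original = b"A" * 40 + b"B" * 30 + b"C" * 30   # Vo = 100 bytes
packed = rle_encode(original)                  # three pairs, Vc = 6 bytes
print(f"Kc = {compression_ratio(original, packed):.1f}%")  # Kc = 6.0%
```

Note that for data without long runs this scheme produces output larger than the input (Kc above 100%), which is one reason real archivers use more elaborate methods.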
Graphic image files, text files and data files compress best: for them the compression ratio can reach 5 - 40%. Executable program files and load modules compress less well, to 60 - 90%. Archive files are hardly compressed at all. Archiving programs differ in the compression methods they use, which accordingly affects the degree of compression.

Archiving (packing) is the placing (loading) of source files into an archive file in compressed or uncompressed form.

Unarchiving (unpacking) is the process of restoring files from an archive exactly as they were before they were loaded into it. When unpacking, the files are extracted from the archive and placed on disk or in RAM.

Programs that pack and unpack files are called archivers.

Large archive files can be placed on several disks (volumes). Such archives are called multi-volume archives. A volume is a component part of a multi-volume archive. When an archive is created in several parts, the parts can be written to several floppy disks.

The main types of archiving programs

Several dozen archivers are currently in use. They differ in their lists of functions and operating parameters, but the best of them have approximately the same characteristics. Some of the most popular are ARJ, PKPAK, LHA, ICE, HYPER, ZIP, PAK, ZOO and EXPAND, developed abroad, as well as AIN and RAR, developed in Russia. Packing and unpacking are usually performed by the same program, but in some cases by different ones: for example, PKZIP packs files and PKUNZIP unpacks them.

Some archive files can be unpacked without any additional program, since the archive file itself may contain an unpacking program. Such archive files are called self-extracting.

A self-extracting archive file is a loadable executable module that can unpack the files it contains on its own, without an archiver. A self-extracting archive is called an SFX archive (SelF-eXtracting).
In MS DOS, archives of this type are usually created as .EXE files.

Many archivers unpack files by writing them to disk, but some are designed to create a packed executable module (program). Such packing produces a program file with the same name and extension which, when loaded into RAM, unpacks itself and starts immediately. The reverse conversion of the program file into unpacked format is also possible. Archivers of this kind include PKLITE, LZEXE and UNP.

The EXPAND program, which is part of the utilities of the MS DOS operating system and the Windows shell, is used to unpack the files of software products supplied by Microsoft.

The RAR and AIN archivers have, in addition to the usual compression mode, a solid mode, which creates archives with a high degree of compression and a special structure. In such archives all files are compressed as one data stream, i.e. the search area for repeated character sequences is the entire set of files loaded into the archive; as a result, unpacking any file other than the first involves processing other files. Archives of this type are best used for archiving a large number of files of the same type.

Ways to control an archiver

An archiver is controlled in one of two ways:
  • using the MS DOS command line, on which a launch command is formed that contains the name of the archiver, the control command and its configuration switches, as well as the names of the archive and the source files; this kind of control is typical of the ARJ, AIN, ZIP, PAK, LHA and other archivers;
  • using the built-in shell and dialog panels, which appear after the program starts and let you control it with menus and function keys, creating a more comfortable working environment for the user. The RAR archiver is controlled this way.
While performing the requested actions, an archiver usually displays a log of its work. All modern archivers have help screens, which are displayed when you enter just the program name, or the name followed by the /? switch, on the command line. Help can be brief (one screen) or extended (several screens). Many archivers have help screens with examples of commands for various operations. Help is usually displayed in English or another international language.

Since the control principles of most archivers are similar, let's consider the main features of the ARJ program (version 2.42), known as one of the best in terms of the functions it offers, compression ratio and speed. ARJ is especially effective with database files and text files.

ARJ ARCHIVING SOFTWARE

Purpose of the ARJ archiver

The ARJ program allows you to:
  • create archive files from individual or all files of the current directory and its subdirectories, loading up to 32000 files into one archive;
  • add and replace files in the archive;
  • extract files from the archive and delete files from it;
  • protect each of the files placed in the archive with a 32-bit cyclic code, and test the archive to check the integrity of the information in it;
  • get help in three international languages;
  • enter comments to files in the archive;
  • save paths to files in the archive;
  • save several generations (versions) of the same file in the archive;
  • reorder the archive file by file sizes, names, extensions, date and time of modification, compression ratio, etc.;
  • search for strings in archived files;
  • restore files from damaged archives;
  • create self-extracting archives both on one volume and on several volumes;
  • view the contents of text files contained in the archive;
  • ensure the protection of information in the archive and access to files placed in the archive with a password.
COMMAND LINE STRUCTURE FOR WORKING WITH THE ARJ PROGRAM

To display brief help, just enter the program name on the command line: ARJ. For detailed help and examples of commands, enter ARJ -? or ARJ /?

To load the program and perform the required functions, use the following command line format, in which the program name and the parameters are separated by spaces:

ARJ <command> [-<sw1> [-<sw2>...]] <archive_name> [<file_name_list>]

Two command line parameters are required: <command> and <archive_name>.

The <command> parameter is written as a single character following the program name and specifies the archiving function in accordance with Table 11.1.

Table 11.1. The main commands of the ARJ archiver

Group number  Command group              Command  Archive function
1             Placement in the archive   a        add files to the archive
                                         f        replace files in the archive with new versions
                                         u        add only new files to the archive
                                         m        move files to the archive
2             Extraction from the        e        extract files from the archive to the current directory
              archive                    x        extract files from the archive and place them in directories according to the paths stored for them
3             Removal from the archive   d        delete files from the archive
4             Service functions          t        test the whole archive
                                         l        display the contents of the archive without the paths to the files
                                         v        display the contents of the archive with the paths to the files
                                         y        copy the archive with new parameters
                                         w        find a text string in the archive

The <archive_name> parameter specifies the name of the archive file. It is written according to the general MS DOS rules, but without the extension, which is assigned automatically when a new file is created. The archive name can include the path to the file. By default the archiver processes archive files with the .ARJ extension. A self-extracting archive file is created with the .EXE extension; such a file contains an unpacking module, and ARJ is not required to extract files from it.

The optional command line parameters are the switches <swN> and <file_name_list>. Optional parameters are conventionally shown in square brackets.

Switches (keys) refine the action of the archiving command, and there may be several of them. Each switch starts with the "-" character and can be placed anywhere on the command line after the command. Instead of "-", the "/" character can also be used to mark a switch. Table 11.2 lists the most important configuration switches.

Note. ARJ commands and switches can be entered on the command line in either upper or lower case.

The list of file names is given when not all the files in the archive or in the current directory are to be processed. To add, extract or delete several files, write their full names on the command line. Up to 64 file names can be specified in the list. To shorten the list, you can use wildcard patterns according to the MS DOS rules, for example: *.* (all files); *.bat (all files with the extension .BAT); A?.* (files whose names start with A followed by one arbitrary character).

Table 11.2. The most important configuration switches of the ARJ archiver

Switch  Purpose
-r      Adding files from the current directory and all its subdirectories, with the paths to the files
-v      Creating a multi-volume archive file
-g      Protecting the created archive with a password:
          -g<password>  the password is entered on the command line
          -g?           the password is entered invisibly during execution
-x      Adding/replacing files, except for the files whose names are specified after the switch
-q      Request to confirm the operation for each file:
          enter "Y" to confirm
          enter "N" to refuse
-je     Creating a self-extracting archive
-m      Specifying the archiving method:
          -m0  no compression
          -m1  normal compression (default)
          -m2  the highest compression
          -m3  fast compression with a lower degree of compression
          -m4  the fastest compression with the lowest degree of compression
-y      "Yes" is assumed as the answer to all archiver questions
-jp     Pause when viewing the archive contents after each full screen
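The command line format described above can be sketched as a small helper that assembles a command string. The function name arj_command and its parameters are hypothetical and introduced only for illustration; the sketch builds the text of the command and does not run ARJ.

```python
def arj_command(command: str, archive: str, files=(), switches=()) -> str:
    """Assemble a command line in the described format:
    ARJ <command> [-<sw1> [-<sw2>...]] <archive_name> [<file_name_list>]
    """
    if len(command) != 1:
        raise ValueError("the command is written as a single character")
    if not archive:
        raise ValueError("the archive name is required")
    parts = ["ARJ", command]
    parts += [f"-{switch}" for switch in switches]  # each switch starts with "-"
    parts.append(archive)
    parts += list(files)                            # optional list of file names
    return " ".join(parts)

print(arj_command("a", "arhobj", files=[r"obj\*.*"]))
# ARJ a arhobj obj\*.*
print(arj_command("e", "arch", switches=["gDINO", "y"]))
# ARJ e -gDINO -y arch
```

Here the switches are placed directly after the command; as noted above, they may appear anywhere on the command line after the command.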
Putting files in an archive

One of the main operations when working with archive files is placing files in an archive, which is done with the commands a, u, m and f. These commands are most often used together with the switches -r, -g, -q and -je. Here are typical examples of commands for creating and editing archive files.

Example 11.1. Add two files from the current directory, n1.txt and n2.txt, to the archive file arhtxt:

    ARJ a arhtxt n1.txt n2.txt

Example 11.2. Create in the current directory the archive file arhobj.arj containing all the files in the directory OBJ:

    ARJ a arhobj obj\*.*

Note. When files that are already in the archive are added, they are replaced regardless of the date and time of their modification or creation.

Example 11.3. On drive B:, create the archive arhmat.arj containing all the files of the current directory except those with the extension prg. The files are added to the archive with their paths:

    ARJ a b:\arhmat -x*.prg -r

Example 11.4. In the archive arcmat.arj on drive B:, replace the archived files with newer versions and add the files from the current directory that are not yet in the archive:

    ARJ u b:\arcmat

Note. If the source directory contains no new or missing files, the message "no change" is displayed.

Example 11.5. Move all files with the extension bas from the current directory to the archive file bas.arj:

    ARJ m bas *.bas

Note. The m command is similar to the a command, except that after successful completion the moved files are deleted from the source directory. By default the command does not ask for permission to delete.

Example 11.6. Replace files in the archive bas.arj only with newer files with the extension bas from the current directory, asking for confirmation for each file:

    ARJ f bas *.bas -q

Example 11.7. Move all files in the current directory to the archive file arch.arj, protecting them with the password DINO:

    ARJ m arch -gDINO

Example 11.8. Add all files with the extension fox from the current directory to the archive arch.arj, protecting them with a password that will be requested during archiving:

    ARJ a arch -g? *.fox

Example 11.9. Create the self-extracting archive file arxbank.exe containing all the files in the current directory:

    ARJ a arxbank -je

Attention! Passwords are case-sensitive: for example, DINO and Dino are different passwords. It is very important not to forget the password, since without it the files cannot be extracted from the archive.

Extracting files from an archive

Files are extracted from an archive with the commands e and x. The e command extracts files and places them either in the current directory or in the directory specified on the command line. The x command extracts files to the directories from which they were placed in the archive; if such a directory does not exist on the disk, it is created.

If a file with the same name already exists in the directory where an extracted file is to be placed, the program asks the user for permission to replace it. The user must enter "Y" to allow the replacement or "N" to refuse. To avoid this dialogue, add the -y switch to the command line; it corresponds to answering "Y" to all file replacement requests. Files archived with a password can be extracted only with the correct password.

Example 11.10. Extract the two files n1.txt and n2.txt from the archive file arhtxt.arj to the current directory:

    ARJ e arhtxt n1.txt n2.txt

Example 11.11. Extract all files from the archive file arhobj.arj to the current directory:

    ARJ e arhobj

Example 11.12. Extract all files from the archive file arhobj.arj in the directory d:\obj:

    ARJ e d:\obj\arhobj

Example 11.13. Extract all files from the archive file arch.arj to the current directory, using the password DINO and without confirmation of requests to replace existing files:

    ARJ e arch -gDINO -y

Example 11.14. Extract all files from the archive file arhmat.arj on drive B: and write them to directories according to their paths:

    ARJ x b:\arhmat

Deleting files from an archive

The ARJ archiver lets you physically remove a file or a group of files given by a list from an archive file. With the -q switch, a confirmation request is issued before each file in the list is deleted. When all files are deleted from an archive, the archive remains on disk as an empty file, i.e. a file of zero size.

Example 11.15. Delete two files from the archive file arhmat.arj with confirmation for each file:

    ARJ d -q arhmat m_012.fox m_12.prg

Service functions

The service functions of the ARJ archiver are very diverse. The user can test an archive, view its contents on the screen or print them, rename files in the archive, copy the archive with new parameters, find a text string in the text files it contains, and much more.

Archive testing. Archive testing is based on checking the cyclic redundancy check (CRC) code of each file in the archive. The cyclic control code is computed from all the codes that represent the information of a file and is therefore often referred to as the checksum of the file. Its length is usually limited to 16 or 32 bits; ARJ stores a 32-bit code. Unlike a simple sum with end-around carry, in which the carry out of the highest bit is added back to the lowest bit to avoid overflow, a CRC is the remainder of dividing the data, treated as a binary polynomial, by a fixed generator polynomial, so it depends on the order of the bytes as well as on their values. During testing, the newly calculated cyclic control code is compared with the code stored in the archive. When the integrity of a file is broken, its CRC changes and a mismatch occurs.
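The testing principle can be sketched with Python's zlib.crc32, which computes the CRC-32 variant used by ZIP. Whether it matches the values stored by ARJ bit for bit is an assumption not verified here, and the file names, contents and the helper check_entries are hypothetical. The sketch also shows why a plain sum of byte values would be a weaker check: swapping two bytes leaves the sum unchanged but changes the CRC.

```python
import zlib

def crc32(data: bytes) -> int:
    """CRC-32 of the data as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF

def simple_sum(data: bytes) -> int:
    """Naive 32-bit sum of byte values, shown only for comparison."""
    return sum(data) & 0xFFFFFFFF

def check_entries(entries: dict[str, tuple[bytes, int]]) -> dict[str, str]:
    """Compare the recomputed CRC of each file with the stored one."""
    return {
        name: "OK" if crc32(data) == stored_crc else "CRC error"
        for name, (data, stored_crc) in entries.items()
    }

good = b"intact data"
damaged = b"intact dtaa"  # two adjacent bytes swapped

# Hypothetical archive contents: (current data, CRC stored at archiving time).
entries = {
    "n1.txt": (good, crc32(good)),
    "n2.txt": (damaged, crc32(good)),
}

print(check_entries(entries))                     # n1.txt is OK, n2.txt fails
print(simple_sum(good) == simple_sum(damaged))    # True: the plain sum misses the swap
```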
Either the entire archive or a part of it, given by a list of files, can be tested. The check is fairly fast and is accompanied by a log on the screen in which "OK" is displayed for each intact file. Password-protected files cannot be checked without the password.

Archive testing is a check of the integrity of the information in each file contained in the archive.

Example 11.16. Check the integrity of all files in the archive arcmat.arj on drive A:

    ARJ t a:arcmat

Viewing the contents of an archive. Two commands are used to view the contents of an archive: l and v. The contents can be sent to the screen or to standard output. The l command displays the information about each file on one line, the v command on two lines, one of which gives the path to the file. With the -jp switch, output pauses after each full screen.

The contents of the archive are displayed as a table in which the files are listed in the order in which they were placed in the archive; the table is not sorted. It can include information about all the files or only about those in a given list. You can view the contents of both ordinary archive files and self-extracting ones with the EXE extension. To send the file information to a printer, you can redirect the ARJ output.

Fig. 11.1 shows the contents of the archive file QPR4.ARJ. The following command was used to view it:

    ARJ l qpr4

The columns in Fig. 11.1 contain the following information about the files: Filename - file name; Original - original file size; Compressed - compressed file size; Ratio - compression ratio; DateTime modified - date and time of file creation (modification); CRC-32 - 32-bit cyclic control code; Attr - file attributes; BTPMGVX - additional information about the file.

Fig. 11.1. Screen listing of the contents of the archive file qpr4.arj

Processing archive: QPR4.ARJ
Archive created: 1996-02-23 18:41:34, modified: 1996-02-23 18:43:46
Filename      Original  Compressed  Ratio  DateTime modified  CRC-32    Attr  BTPMGVX
ANALYZE.WQ1      13844        2898  0.209  92-10-13 17:34:26  311D59E9  AW    B 1G
MASTER.WQ1       69500       20816  0.300  92-09-12 04:00:00  85B7D6F6  AW    B 1G
OPTIMIZR.WQ1      6491        2556  0.394  92-10-13 17:54:56  F1B958DE  AW    B 1G
REGISTER.WQ1      5537        2001  0.361  92-09-12 04:00:00  3B9A3005  AW    B 1G
SAMPLE.WQ1        5017        1912  0.381  92-12-02 20:51:28  31508CCA  AW    B 1G
ZVUKEFKT.WQ1       205         968  0.439  94-11-01 00:39:54  118CBFC3  AW    B 1G
GRAGRED.WQ1       3437        1306  0.380  94-11-02 22:50:28  55C06C4F  AW    B 1G
COUP.SPO         19862       15243  0.767  92-02-12 04:00:03                  B 1
ASCII>SOR         1637         975  0.596  92-09-12 04:00:00  010C0344  AW    B 1
DUTB.SFO         33228       33176  0.998  92-02-12 04:00:00  1D76197A  AW    B 1
    10 files    160795       81858

The last column of the table (BTPMGVX) contains the following file flags: B - for files with the extension .BAK; T - file type (B - binary, T - text, D - directory); P - the archive contains information about the path to the file, which can be viewed with the v command; M - compression method; G - the file is protected with a password; V - the file continues on the next volume; X - the file begins on a previous volume.

Example 11.17. Display information about the files with the extension bas stored in the archive file bas.arj, pausing after each full screen:

    ARJ l bas *.bas -jp

Example 11.18. Display information about all files contained in the archive arhmat.arj on drive A:, with the file paths:

    ARJ v a:\arhmat -jp

Example 11.19. Display information about the files contained in the self-extracting archive arxbank.exe:

    ARJ l arxbank.exe

Example 11.20. Send information about all files in the archive arhmat.arj to the printer:

    ARJ v a:\arhmat > prn

Copying an archive with new parameters. To change archive parameters, use the y command, with which you can, for example, convert a regular archive file into a self-extracting one.

Example 11.21. Create the self-extracting archive file arhmat.exe from the archive file arhmat.arj:

    ARJ y -je arhmat

Working with multi-volume archives

One of the important advantages of the ARJ archiver is the ability to create multi-volume archives, i.e. archives that span several disks. One archive file is placed on each disk and takes up all its free space. The disk does not have to be cleared beforehand, since other files may be located on it along with the archive file. When the archive is created, the file on the first disk gets the extension .ARJ by default, and the files on subsequent disks get .A01, .A02, etc. The rule for naming extensions can be changed with configuration switches, which practically removes the limit on the number of archive volumes.

The table of contents of each archive file in a multi-volume archive is viewed in the same way as for a single-volume archive. ARJ lets you modify the contents of a multi-volume archive: delete, replace and add files. Files are not redistributed between volumes in the process.

To work with a multi-volume archive, you must specify the -v switch. The command can be refined further with modifiers. A modifier is a Latin letter, in either case, written after the switch. A command can have several modifiers, and they can be written in any order. In addition, a number specifying the size of an archive volume in bytes can be used as a modifier. The purpose of some modifiers is given in Table 11.3.

Table 11.3. Purpose of the modifiers of the ARJ command for working with a multi-volume archive

Modifier - Purpose

a - Specifies that the archive files of a multi-volume archive will take up all the free space on the disks (volumes)
s - Allows you to execute any number of DOS commands before a new volume is created, such as viewing, clearing or formatting the floppy disk on which the next archive file is to be written; after executing the commands, you must enter the EXIT command to continue archiving
w - Forbids splitting archived files between volumes
v - Sounds a beep before the next volume is inserted
r - Reserves free space on the first volume; the number following the r gives the size of this space
360, 720, 1200 - Variants of modifiers for specifying archive volume sizes

Example 11.22. Create a multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks:
ARJ a A:armat -va

Example 11.23. Create a multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks, beeping and allowing MS DOS commands to be entered before the next disk is inserted:
ARJ a A:armat -vvas

Example 11.24. Create a multi-volume archive armat.arj in drive A:, using all the free space on the floppy disks and not splitting archived files between volumes:
ARJ a A:armat -vaw

Example 11.25. Create a multi-volume archive armat.arj in drive A:, each volume of which will occupy 360 KB:
ARJ a A:armat -v360

Files are extracted from a multi-volume archive in the same way as from a single-volume archive, but the -v switch must be specified on the command line.

Example 11.26. Extract all files of the multi-volume archive armat.arj from floppy disks inserted in drive A:
ARJ e A:armat -v

MULTIFUNCTIONAL INTEGRATED RAR ARCHIVER

Main features of the program

The RAR archiver is a powerful tool for creating and maintaining archives. Its distinguishing features are:
  • the ability to work in two modes - full screen interactive interface and conventional command line interface;
  • support for other types of archives; in full-screen mode, RAR provides the ability to work with archives of other types (.ZIP, .ARJ, LZH), view their contents, modify and convert them;
  • using the highly efficient solid compression method to obtain a high compression ratio (10 - 50% higher than usual);
  • the ability to create self-extracting and multi-volume archives;
  • password protection of archives.
RAR also provides a range of service functions:
  • password encryption;
  • adding file and archive comments;
  • the possibility of partial or complete recovery of damaged archives;
  • protection of the archive from changes;
  • the ability to add to the archive information about the creator of the archive, the time and date of the last changes made to the archive.
The advantages of RAR are especially noticeable when archiving executable modules (.EXE), object files (.OBJ), large text files, etc. The RAR archiver can be controlled in two modes:
  • in command line mode;
  • in full screen mode.
Since the control technique and the list of commands and switches in command-line mode are similar to those of the ARJ archiver discussed above, only the features of managing the RAR archiver in the full-screen interface are considered below.

Full-screen mode

To work with the RAR archiver in full-screen mode, load the RAR program from the DOS command line, for example:
C:\>RAR

After the program loads, a window with two panels appears on the screen (Fig. 11.2). The window includes the Memory and Settings sections, which show memory usage, the current default compression method, whether a password is set, the archive backup mode, etc. The left panel contains a list of the files and subdirectories of the current directory, through which the selector can be moved with the cursor keys. Activating a line that holds a directory name, the link to the parent directory (".."), or the name of an archive file enters the subdirectory, goes up to the parent directory, or enters the archive, respectively. The RAR program can work with archives of the following types: RAR, ARJ, ZIP and LZH. After an archive is entered, the list of its files is displayed in the same way as a regular directory. Thus you can navigate through directories and archives and work with files both in archives and in directories.

While you are in a directory, the bottom line of the screen shows a hint about the function key assignments:
1-Help 2-Add 3-View 4-Fresh 5-Volume 6-Move 7-Update 8-Repair 9-Option 0-Quit

When Alt is held down, the hint changes and lists the additional functions called by pressing Alt together with a function key: 2-Solid 3-View ... The purpose of the control keys is given in Table 11.4.

Table 11.4. Assignment of the control keys of the RAR archiver when working with a directory

Function name - Purpose

Add - Add a file to the archive; if the archive does not exist, it is created
View - View a file
Fresh - Update files in the archive: only changed files whose old copies are already in the archive are added
Volume - Create archive volumes
Move - Move files to the archive
Update - Add files that are not in the archive and update those whose old copies are already in it
Repair - Restore a corrupted archive
Quit - Exit RAR

With the Alt key held down:
Solid - Create a continuous (solid) archive
View - View a file
Create an archive split into SFX volumes
Create a solid archive divided into volumes
Create a solid archive split into SFX volumes
When Alt is pressed, the hint line changes and a window appears listing additional archiver functions that are performed by key combinations with letters: Alt-C - switch between color and black-and-white mode; Alt-D - select the current disk; Alt-J - temporary exit to DOS (DOS shell); Alt-M - select the compression method; Alt-P - set a password; Alt-S - save the current options; Alt-W - assign a working directory for temporary files.

If the cursor is on the line with the name of an archive and that line is activated, you enter the archive itself, as you would a directory. The same happens if you start the RAR program with the name of the archive as a parameter.

When you are in an archive, the function key line looks like this:
1-Help 2-Test 3-View 4-Extr 5-Comment 6-ExCurD 7-SFX 8-Delete 9-Option 0-Quit

The additional function line appears when you hold down the Alt key:
1- 2- 3-View 4-ExtrTo 5-FilCmt 6- 7-Lock 8- 9- 0-

The purpose of the keys is given in Table 11.5.

Table 11.5. Assignment of the RAR archiver control keys when working with an archive

Function name - Purpose

Help - Display help information
Test - Test the archive
View - View a file
Extr - Extract files from the archive with full paths
Comment - Add a comment to the archive
ExCurD - Extract files to the current directory
SFX - Convert to an SFX archive
Delete - Delete files from the archive
Option - Configuration / save configuration
Quit - Exit from the archive

With the Alt key held down:
View - View the file with the built-in viewer when an external viewer is defined
ExtrTo - Extract files to a specified directory
FilCmt - Add comments to files
Lock - Lock the archive against changes
Individual files can be marked (highlighted) or unmarked from the keyboard. To select a group of files or cancel the selection, use the keys <Gray +> and <Gray ->. If there are marked files, a service line appears at the bottom of the screen showing the number of marked files and their total size; the size of files in subdirectories is not counted.

When you enter an archive, its contents are shown on the left half of the screen. Files from this list can be viewed, marked, etc. in the same way as files in a regular directory. In the list, an asterisk (*) is placed next to the names of password-encrypted files. The right half of the screen holds an information window with details about the archive: its name and status, whether it has a comment, whether it contains files encrypted with a password, and statistics on the number of files in the archive, their total size, the compression ratio, the minimum RAR version needed to unpack the archive, and the name of the operating system in which the archive was created.

When working with multi-volume archives in full-screen mode, you must start unpacking from the very first volume. The lengths shown for files whose parts ended up in different volumes refer only to the current volume. The symbol <= denotes a file continuing from the previous volume, and the symbol => a file that continues into the next volume.

Setting archiver parameters

To change the parameters of the archiver, press F9 (Option) after starting it to call up the setup menu. A window with the following menu appears on the screen:
Configuration...
Set password
Work directory
Default comment file
External viewer
Change disk
Registration
Save options

The first menu item, Configuration, opens the configuration dialog box for setting the main RAR parameters (Fig. 11.3). The window contains five groups of parameters: Interface options - interface settings; Sort names - file sorting options; Include file mask - the file inclusion mask; Compression - the compression method; Other options - other parameters.

Fig. 11.3. View of the RAR archiver configuration settings window

A parameter marked with a cross is enabled. Move from one parameter to another with the arrow keys and change the value in the current field from the keyboard. When all parameters are set, go to the "OK" field and confirm the selected values. If you decide not to change the settings, go to the "Cancel" field and confirm to discard them. New parameter values can be saved for use by default in subsequent launches.

Interface settings: Color - color or black-and-white full-screen mode; Sound - sound effects; Stdout mode - console mode when performing actions from the command line; Mouse - mouse support in full-screen mode.

Sorting by file names: Unsorted - disable sorting; Name - sort by name; Extension - sort by extension; Size - sort by size.

The file inclusion mask lets you add files to the archive according to their attributes: read-only files; System files - system files; Archive files - files for reading and writing; Hidden files - hidden files.
Default compression method: Store - add files to the archive without compression; Fastest - very fast compression (least efficient); Fast - fast compression; Normal - normal compression (default); Good - good compression (more efficient); Best - best compression (most efficient).

Other parameters: back up the archive before changing it; Add empty directories - add empty directories to the archive; Always make solid - create solid archives by default; (+) Put Authenticity - add the author's verification information; (+) Log errors to file - record critical situations during RAR operation in the RAR.LOG file.

The menu item Set password assigns a password for packing files into an archive and unpacking them from it. You can also assign a password by pressing the key combination Alt and P. The password is not saved for subsequent launches.

The menu item Work directory specifies the directory where the RAR archiver places temporary files. It can also be specified by pressing the key combination Alt and W.

The name of the file holding a general comment for archives can be set with the menu item Default comment file.

The menu item External viewer defines an external program that RAR will call to view the contents of files from the archive. When viewing a file in full-screen mode, RAR uses the built-in viewer if no external one is defined.

The menu item Change disk changes the current disk, whose directory is displayed in the working window.

To save the archiver settings, use the menu item Save setup. After selecting it, you are asked to choose: Save - save the current parameter values for default use; Cancel - do not write the parameters. By default RAR stores the configuration (set of settings) in the RAR.CFG file, located in the same directory as the RAR.EXE program itself.
Settings can also be saved by pressing the key combination Alt and S.

Technology of work with the archiver

Let us consider the sequence of actions for the most frequently performed archiving procedures after the RAR program has been loaded in full-screen mode.

Creating a new archive from multiple files

1. Select a disk by pressing the key combination Alt and D.
2. To change the order of files in the list, press F9 and, in the configuration window that appears, check the required option in the Sort names group.
3. Using the marking key or <Gray +>, select the files to be archived.
4. To protect the files being archived with a password, press the key combination Alt and P and enter the password, then confirm it by entering it again.
5. If you want to change the compression method, press Alt and M and select the required method in the dialog box that appears.
6. To create the archive, press: F2 - for a regular archive; Alt and F2 - for a solid archive; Alt and the corresponding function key - for a self-extracting (SFX) archive, then enter the maximum size of the archive in kilobytes.
7. In the window that appears, enter the name of the archive file and the path to it. To move files into the archive, enable the Move checkbox in the window.
8. After entering the information, confirm it. Two progress bars appear on the screen, showing the progress of archiving each file individually and of the archive as a whole. At the end of the process, brief information about the size of the files before and after archiving is displayed.

Extracting files from an archive

1. Select a disk by pressing the key combination Alt and D.
2. Set the directory containing the archive file on the left panel of the information window.
3. In the information window, set the selector to the line with the name of the archive file and enter the archive. A list of the archive's files appears on the left panel.
4. Using the marking key or <Gray +>, highlight the files to be extracted.
5. Press: F4 - to extract the marked files with their paths, recreating the directory structure; F6 - to extract the marked files to the current directory; Alt and F4 - to extract the marked files to a specified directory. If the files were archived with a password, a password entry window appears.

Creating a multi-volume archive on floppy disks containing all the files of a given hard drive directory

1. Set the directory with the source files on the left panel of the information window.
2. Using the marking key or <Gray +>, highlight the names of the files to be archived.
3. Press F5 and, in the window that appears, set the size of one archive volume or select it from the proposed list. For automatic sizing, i.e. to use all free disk space, select Autodetect. Confirm the choice.
4. Insert a floppy disk for the first volume of the archive into the drive and, at the "Enter archive name" request, enter the full name of the first archive volume: drive name, path, file name. The first file automatically gets the .RAR extension. Confirm the entry to start the archiving process.
5. The archiving log is displayed on the screen (Fig. 11.4); for each archived file it shows the original and packed sizes and the compression level.
6. On request, insert further diskettes for the subsequent archive volumes, which automatically receive the extensions .R00, .R01, .R02, etc. After inserting each diskette, confirm in the request dialog.

Fig. 11.4. View of the information window showing the process of creating a multi-volume archive

Extracting files from a multi-volume archive on floppy disks to a specified directory

1. Select the disk holding the first volume by pressing the key combination Alt and D.
2. In the information window, set the selector to the line with the name of the first archive file and enter the archive. A list of the archive's files appears on the left panel of the window.
3. To extract all archive files to the specified directory on the hard drive, press the key combination Alt and F4 and, in the dialog box that appears, enter the path to that directory. Confirm the entry.
4. A dialog box for selecting the extraction option appears on the screen: Proceed with all volumes from current - extract all files; Proceed with selected files only - extract only the selected files.
5. Select the first option, which extracts all files from the archive, and confirm. A list of the files extracted from the archive is displayed on the screen. A successfully extracted file is marked with Ok.
6. After extracting all the files of the first volume, the program asks you to insert the next disk (volume), with the .R00 extension, and after extracting the files from it, the volumes with the .R01, .R02, etc. extensions, if any.

Note. To work with multi-volume archives, a working directory must be specified for the archiver's temporary files. This directory is created on the hard drive, and the path to it is set by pressing the key combination Alt and W. The path to the working directory should be saved in the archiver configuration file by pressing Alt and S.

Data compression methods have a long history of development, which began long before the advent of the first computer. This article gives a brief overview of the main theories, concepts and their implementations, without claiming to be complete. More detailed information can be found, for example, in the works of R.E. Krichevsky, B.Ya. Ryabko, I.H. Witten, J. Rissanen, D.A. Huffman, R.G. Gallager, D.E. Knuth, J.S. Vitter and others.

Information compression is a problem much older than computer technology; its history has usually run in parallel with the history of information encoding and encryption. All compression algorithms operate on an input stream of information whose smallest unit is a bit and whose largest unit is several bits, a byte, or several bytes. The goal of compression, as a rule, is to obtain a more compact output stream of information units from an initially non-compact input stream by applying some transformation to it. The main technical characteristics of compression processes and of their results are:

Degree of compression (compression ratio) - the ratio of the sizes of the source and resulting streams;

Compression speed - the time spent compressing a given amount of information from the input stream until the equivalent output stream is obtained;

Compression quality - a value showing how densely the output stream is packed, measured by compressing it again with the same or another algorithm.

There are several different approaches to the problem of information compression. Some have a very complex theoretical and mathematical foundation; others are based on properties of the information stream and are algorithmically quite simple. Any approach or algorithm that implements data compression is designed to reduce the size of the output information stream, in bits, by means of a reversible or irreversible transformation. Therefore, depending on the nature or format of the data, all compression methods can first of all be divided into two categories: reversible and irreversible compression.

Irreversible compression is a transformation of the input data stream in which the output stream, interpreted according to a certain information format, represents an object that is quite similar to the input in its external characteristics but differs from it in size. The degree of similarity of the input and output streams is determined by how closely certain properties of the object they represent (i.e. the compressed and uncompressed information, interpreted in a specific data format) correspond. Such approaches and algorithms are used, for example, to compress raster graphics data with a low rate of repeated bytes in the stream. They exploit the structure of the graphic file format and the fact that an image of approximately the same display quality (as perceived by the human eye) can be represented in several ways. Therefore, in addition to the degree of compression, such algorithms have a notion of quality: since the original image changes during compression, quality can be understood as the degree of correspondence between the original and resulting images, assessed subjectively on the basis of the information format. For graphic files this correspondence is judged visually, although there are also algorithms and programs that assess it automatically. Irreversible compression cannot be used where an exact match between the information structure of the input and output streams is required. This approach is implemented in popular formats for video and photo information, known as the JPEG and JFIF algorithms and the JPG and JIF file formats.

Reversible compression reduces the size of the output information stream without changing its information content, i.e. without loss of information structure. The input stream can be recovered from the output stream by a decompression algorithm; this recovery process is called decompression (unpacking), and only after it is the data suitable for processing in accordance with its internal format.

In reversible algorithms, encoding as a process can be considered from a statistical point of view, which is useful not only for constructing compression algorithms but also for evaluating their effectiveness. For all reversible algorithms there is a notion of coding cost: the average length of a code word in bits. The coding redundancy is the difference between the coding cost and the entropy, and a good compression algorithm should minimize the redundancy (recall that the entropy of information is understood as a measure of its disorder). Shannon's fundamental theorem on the encoding of information states that "the cost of encoding is never less than the entropy of the source, although it can be arbitrarily close to it." Therefore, for any algorithm there is always a limit to the degree of compression, determined by the entropy of the input stream.
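The relation between cost, entropy and redundancy can be checked numerically. The following is a minimal sketch, assuming a source whose symbols are independent bytes; the function names and the sample prefix code (a=1 bit, b=2 bits, c=3 bits, d=3 bits) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Shannon entropy of the byte frequencies, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def coding_cost(data: bytes, code_lengths: dict) -> float:
    """Average code word length in bits for the given code lengths."""
    counts = Counter(data)
    total = len(data)
    return sum(n * code_lengths[sym] for sym, n in counts.items()) / total


data = b"aaaabbcd"
# Hypothetical prefix code lengths, keyed by byte value.
lengths = {ord("a"): 1, ord("b"): 2, ord("c"): 3, ord("d"): 3}

h = entropy_bits_per_symbol(data)          # entropy of the source
cost = coding_cost(data, lengths)          # cost of the variable-length code
redundancy = cost - h                      # zero when the code is ideal
fixed_redundancy = 8 - h                   # plain 8-bit bytes waste this much
print(h, cost, redundancy, fixed_redundancy)
```

For this sample the variable-length code reaches the entropy exactly, while storing each symbol in a full byte leaves more than six bits of redundancy per symbol.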

Let us now proceed directly to the algorithmic features of reversible algorithms and consider the most important theoretical approaches to data compression related to the implementation of coding systems and methods of information compression.

Run-length encoding compression

The best-known simple approach to reversible compression is Run Length Encoding (RLE). The essence of these methods is to replace chains (runs) of repeated bytes, or of repeated byte sequences, with one encoding byte and a counter of the number of repetitions. The only problem with all such methods is how the decompressing algorithm can distinguish an encoded run from other, unencoded byte sequences in the resulting byte stream. The problem is usually solved by placing labels at the beginning of the encoded chains. Such labels may be, for example, characteristic bit values in the first byte of an encoded run, particular values of the first byte of an encoded run, and the like. These methods are, as a rule, quite effective for compressing bitmap graphic images (BMP, PCX, TIF, GIF), since these contain many long runs of repeating byte sequences. The disadvantage of the RLE method is a rather low compression ratio, or a high encoding cost, for files with few runs and, even worse, with few repeated bytes in each run.
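The labeling idea can be sketched as follows. This is a minimal illustration, not the scheme of any particular file format: it assumes a reserved label byte (the hypothetical constant MARKER) followed by a repetition count and the repeated byte, and it encodes only runs of four or more bytes, because shorter runs would not get smaller.

```python
MARKER = 0x90  # hypothetical label byte that introduces an encoded run


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        # Measure the run of identical bytes (the counter fits in one byte).
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        if run >= 4 or byte == MARKER:
            # Encoded run: label, counter, byte. A literal MARKER byte is
            # always written this way so the decoder cannot misread it.
            out += bytes([MARKER, run, byte])
        else:
            out += bytes([byte]) * run
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == MARKER:
            count, byte = packed[i + 1], packed[i + 2]
            out += bytes([byte]) * count
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)


sample = b"A" * 10 + b"BC" + bytes([MARKER]) + b"D" * 3
packed = rle_encode(sample)
restored = rle_decode(packed)
no_runs = rle_encode(b"ABCDEFGH")  # data without runs does not shrink
```

The last line illustrates the disadvantage named above: input with no runs comes out the same size, and every literal label byte even costs two extra bytes.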

Compression without using the RLE method

Data compression that does not use the RLE method can be divided into two stages: modeling and the encoding proper. These processes, and the algorithms that implement them, are quite independent of each other and diverse.

The coding process and its methods

Encoding is usually understood as the processing of a stream of characters (in our case, bytes or nibbles) over some alphabet, where the characters occur in the stream with different frequencies. The goal of encoding is to convert this stream into a bit stream of minimum length, which is achieved by reducing the redundancy of the input stream by taking symbol frequencies into account. The length of the code representing a character of the stream alphabet should be proportional to the amount of information that character carries, and the length of the stream characters in bits need not be a multiple of 8 and may even be variable. If the probability distribution of the characters of the input alphabet is known, an optimal coding model can be constructed. However, because a huge number of different file formats exist, the task becomes much more complicated: the frequency distribution of the data symbols is not known in advance. In that case, in general, two approaches are used.

The first consists in scanning the input stream and building the encoding from the collected statistics. This requires two passes over the file, one to scan and collect statistical information and a second to encode, which somewhat limits the scope of such algorithms: it rules out the one-pass, on-the-fly coding used in telecommunication systems, where the amount of data is sometimes not known in advance and retransmitting or re-scanning it may take an unreasonably long time. In this approach the scheme of the entropy coding used is written to the output stream. This technique is known as static Huffman coding.

All compression algorithms operate on the input information stream in order to obtain a more compact output stream using some kind of transformation. The main technical characteristics of compression processes and the results of their work are:

· degree of compression - the ratio of the sizes of the source and resulting streams;

· compression speed - the time spent compressing a given amount of information from the input stream until the equivalent output stream is obtained;

· compression quality - a value showing how densely the output stream is packed, measured by compressing it again with the same or another algorithm.

Algorithms that eliminate the redundancy of data recording are called data compression algorithms, or archiving algorithms. Currently, there are a huge number of data compression programs based on several basic methods.

All data compression algorithms are divided into:

1) lossless compression algorithms, with which the data at the receiving end is restored without the slightest change;

2) lossy compression algorithms, which remove from the data stream information that has little effect on the essence of the data or is not perceptible to a person at all.

There are two main lossless archiving methods:

the Huffman algorithm, suited to compressing sequences of bytes that are not related to one another;

the Lempel-Ziv algorithm, suited to compressing any kind of text, that is, exploiting the repeated occurrence of "words" (sequences of bytes).

Almost all popular lossless archiving programs (ARJ, RAR, ZIP, etc.) use a combination of these two methods, the LZH algorithm.

Huffman algorithm.

The algorithm is based on the fact that, in ordinary text, some characters of the standard 256-character set occur more often than average and others less often. Therefore, if common characters are written with short bit sequences, shorter than 8 bits, and rare characters with longer ones, the total file size decreases.
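The idea can be sketched in a few lines. This is a minimal two-pass (static) illustration that builds a code table from byte counts with a priority queue; the function name huffman_code is hypothetical, and the sketch does not reproduce the bit-level format of any real archiver.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict:
    """Return a mapping byte value -> bit string built from byte frequencies."""
    counts = Counter(data)          # first pass: collect statistics
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table of the subtree).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)   # the two rarest subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


text = b"abracadabra"
code = huffman_code(text)
encoded = "".join(code[b] for b in text)   # second pass: encode
print(len(encoded), "bits instead of", 8 * len(text))
```

The most frequent byte, "a", receives a one-bit code, and the whole sample takes 23 bits instead of 88, not counting the code table that a real archive would also have to store.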

Lempel-Ziv algorithm. The classical Lempel-Ziv algorithm, LZ77, named after the year of its publication, is extremely simple. It is formulated as follows: if the same sequence of bytes has already occurred earlier in the stream, and the record of its length and its offset from the current position is shorter than the sequence itself, then a reference (offset, length) is written to the output file instead of the sequence itself.
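A minimal sketch of this rule is shown below. It assumes a brute-force search over a sliding window and a token list instead of a packed bit stream; the function names and the window and min_match parameters are hypothetical, and real archivers use much faster match search and a compact output encoding.

```python
def lz77_encode(data: bytes, window: int = 4096, min_match: int = 3) -> list:
    """Return a list of tokens: a literal byte value or an (offset, length) pair."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Look for the longest earlier occurrence inside the window.
        for j in range(max(0, i - window), i):
            length = 0
            while (i + length < len(data)
                   and data[j + length] == data[i + length]
                   and length < 255):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            # The reference is assumed to be shorter than the sequence itself.
            tokens.append((best_off, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decode(tokens: list) -> bytes:
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            off, length = tok
            for _ in range(length):      # byte by byte: the copy may overlap
                out.append(out[-off])
        else:
            out.append(tok)
    return bytes(out)


sample = b"abcabcabcabc"
tokens = lz77_encode(sample)
print(tokens)
```

For the sample, three literals are followed by a single reference that copies nine bytes from three positions back; the copy overlaps its own output, which is why the decoder works byte by byte.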

4. File compression ratio

Information in archive files is compressed by eliminating redundancy in various ways, for example by simplifying the codes, removing constant bits from them, or representing repeating symbols or a repeating sequence of symbols as a repetition factor and the corresponding symbols. Algorithms for such compression are implemented in special archiver programs, the best known of which are arj / arjfolder, pkzip / pkunzip / winzip and rar / winrar. One or several files can be compressed and placed in compressed form in a so-called archive file, or archive.

The purpose of file packaging is usually to provide a more compact arrangement of information on a disk, to reduce the time and, accordingly, the cost of transferring information over communication channels in computer networks. Therefore, the main indicator of the effectiveness of a particular archiver program is the degree of file compression.

The degree of file compression is characterized by the coefficient Kc, defined as the ratio of the volume of the compressed file Vc to the volume of the original file Vo, expressed as a percentage (some sources use the inverse ratio):

Kc=(Vc/Vo)*100%
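The formula is easy to apply in practice. The sketch below is only an illustration: the function name is hypothetical, and Python's standard zlib module (a Deflate implementation) stands in for an archiver. The first call reuses the sizes of the 13844-byte file that packs to 2898 bytes in the ARJ listing shown earlier.

```python
import zlib


def compression_ratio_percent(original_size: int, compressed_size: int) -> float:
    """Kc = (Vc / Vo) * 100%, where a smaller value means better compression."""
    return compressed_size / original_size * 100


# Sizes taken from the ARJ listing above (ratio column 0.209).
kc_listing = compression_ratio_percent(13844, 2898)

# The same measure for data compressed in memory with zlib.
text = b"the quick brown fox jumps over the lazy dog " * 200
packed = zlib.compress(text)
kc_text = compression_ratio_percent(len(text), len(packed))
print(round(kc_listing, 1), round(kc_text, 1))
```

With this definition a highly repetitive text yields a Kc of only a few percent, while data that cannot be compressed yields a Kc close to, or slightly above, 100%.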

The amount of compression depends on the program you are using, the compression method, and the type of source file.

Graphic image files, text files and data files compress best: for them the compression ratio Kc can reach 5 - 40%. Files of executable programs and load modules compress less well (Kc = 60 - 90%). Archive files hardly compress at all. This is easy to explain: most archiving programs use variants of the LZ77 (Lempel-Ziv) algorithm, the essence of which is a special encoding of repeating sequences of bytes (that is, characters). Such repetitions are most frequent in texts and bitmap images and are practically absent in archives.

In addition, archiving programs still differ in the implementations of compression algorithms, which accordingly affects the degree of compression.

Some archiving programs additionally include tools aimed at reducing the compression ratio Kc. For example, WinRAR implements a mechanism for continuous (solid) archiving, which can compress 10 - 50% better than conventional methods, especially when a large number of small files with similar content are packed.

The characteristics of archivers pull in opposite directions: the faster the compression, the weaker it is, and vice versa.
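The effect of solid archiving can be observed with any general-purpose compressor. The sketch below is an illustration under stated assumptions: Python's zlib stands in for the archiver (it is not RAR's algorithm), and the 200 small "files" are synthetic records generated for the experiment.

```python
import zlib

# Hypothetical set of small files with similar content.
files = [
    f"record {i}: name=item{i}; status=ok; comment=nothing special\n".encode() * 3
    for i in range(200)
]

# Conventional archiving: each file is compressed on its own.
separate = sum(len(zlib.compress(f)) for f in files)

# Solid archiving: the files are treated as one continuous data stream,
# so repetitions across file boundaries can be exploited.
solid = len(zlib.compress(b"".join(files)))

print(separate, solid)
```

The price of the solid approach is that extracting or replacing one file requires processing the stream from its beginning, which is why it pays off mainly for many small, similar files.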

There are many archivers on the computer market - each has its own set of supported formats, its pros and cons, its own circle of admirers who firmly believe that the archiver they use is the best. We will not dissuade anyone or anything - we will simply try to impartially evaluate the most popular archivers in terms of functionality and efficiency. These include WinZip, WinRAR, WinAce, 7-Zip - they are the leaders in terms of the number of downloads on software servers. It is hardly advisable to consider other archivers, since the percentage of users using them (judging by the number of downloads) is small.
