1.2.2 File Compression, Archiving, and Backup in Linux

File compression, archiving, and backup are essential for managing system storage, transferring data efficiently, and protecting information. Linux provides a variety of powerful command-line tools for compressing, combining, and copying files safely.


1. gzip (GNU Zip)

Purpose:
gzip is used to compress files and reduce their size to save disk space or transfer time. It replaces the original file with a compressed version having the .gz extension.

Key Features:

  • Uses the DEFLATE algorithm for compression.
  • Compresses files one at a time; it does not bundle a directory into a single archive (gzip -r compresses each file in a directory tree separately).
  • Often used in combination with tar for multiple files.

Common Syntax:

gzip [options] filename
gunzip filename.gz

Examples:

  • Compress a file: gzip data.txt → Output: data.txt.gz
  • Decompress a file: gunzip data.txt.gz
  • Compress at the maximum level (levels run from -1, fastest, to -9, best compression; the default is -6): gzip -9 largefile.log
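
The examples above replace the original file. The following is a minimal sketch, assuming GNU gzip and a scratch directory from mktemp (the file names are hypothetical), of compressing while keeping the original and then checking the round trip:

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a sample file (hypothetical name) with repetitive content.
yes "sample log line" | head -n 1000 > data.txt

# -c writes the compressed stream to stdout, so data.txt stays in place.
gzip -9 -c data.txt > data.txt.gz

# -t tests the integrity of the compressed file without extracting it.
gzip -t data.txt.gz && echo "archive OK"

# Decompress to a new name and compare it with the original.
gunzip -c data.txt.gz > restored.txt
cmp data.txt restored.txt && echo "round trip OK"
```

Redirecting the output of -c is one way to keep the source file; recent gzip releases also accept -k for the same purpose.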

Usage in Administration:
Used for log compression, backup scripts, and reducing transmission size.


2. bzip2

Purpose:
bzip2 offers higher compression ratios than gzip using the Burrows-Wheeler block-sorting algorithm. It creates files ending with .bz2.

Key Features:

  • Usually compresses better than gzip, but is slower to compress and decompress.
  • Works on single files only.
  • Decompressed with bunzip2 (equivalent to bzip2 -d).

Common Syntax:

bzip2 [options] filename
bunzip2 filename.bz2

Examples:

  • Compress a file: bzip2 report.txt
  • Decompress: bunzip2 report.txt.bz2
  • Keep original file after compression: bzip2 -k report.txt
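
As a rough illustration of the commands above, this sketch (assuming bzip2 is installed; the file names are hypothetical and everything happens in a mktemp scratch directory) keeps the original, tests the compressed file, and verifies the restored copy:

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a sample file (hypothetical name).
yes "quarterly report row" | head -n 1000 > report.txt

# -k keeps report.txt; without it the original is removed after compression.
bzip2 -k report.txt

# -t checks the integrity of the compressed file without extracting it.
bzip2 -t report.txt.bz2 && echo "archive OK"

# -c decompresses to stdout, leaving report.txt.bz2 untouched.
bunzip2 -c report.txt.bz2 > restored.txt
cmp report.txt restored.txt && echo "round trip OK"
```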

Usage in Administration:
Useful for archiving backups where space efficiency is more important than speed.


3. zip

Purpose:
zip combines compression and archiving in one command, commonly used for cross-platform file sharing.

Key Features:

  • Compresses multiple files or directories into a single .zip archive.
  • Compatible with Windows and macOS systems.
  • Supports password-based encryption (zip -e), although the traditional ZIP encryption scheme is considered weak.

Common Syntax:

zip [options] archive.zip files
unzip archive.zip

Examples:

  • Compress multiple files: zip backup.zip file1.txt file2.txt
  • Compress a directory recursively: zip -r project.zip /home/user/project
  • Extract files: unzip backup.zip

Usage in Administration:
Ideal for sharing compressed files between Linux and non-Linux systems.


4. tar (Tape Archive)

Purpose:
tar is used to archive multiple files or directories into a single file without compression by default. It’s often combined with gzip, bzip2, or xz for compressed archives.

Common File Extensions:

  • .tar → uncompressed
  • .tar.gz → gzip compressed
  • .tar.bz2 → bzip2 compressed
  • .tar.xz → xz compressed

Common Syntax:

tar [options] archive_name.tar files

Common Options:

Option   Description
-c       Create an archive
-x       Extract an archive
-t       List contents
-v       Verbose output
-f       Specify the archive filename

Examples:

  • Create an archive: tar -cvf backup.tar /etc
  • Extract an archive: tar -xvf backup.tar
  • Create a compressed tarball: tar -czvf backup.tar.gz /home/user
  • Extract a .tar.gz file: tar -xzvf backup.tar.gz
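
To try the options above without touching system directories, here is a minimal sketch that uses a hypothetical sample tree in a mktemp scratch directory. It also uses -C, which tells tar to change directory before archiving or extracting, so the archive stores relative paths and can be restored anywhere:

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small sample tree (hypothetical names).
mkdir -p site/conf
echo "port=8080" > site/conf/app.conf
echo "hello" > site/index.html

# -c create, -z gzip, -f archive name; -C sets the directory to archive from.
tar -czf backup.tar.gz -C "$workdir" site

# -t lists the contents without extracting.
tar -tzf backup.tar.gz

# -x extract; -C chooses the destination directory.
mkdir restore
tar -xzf backup.tar.gz -C restore
diff -r site restore/site && echo "round trip OK"
```

Listing with -t before extracting is a cheap way to see which paths an archive will write.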

Usage in Administration:
Used widely for backups, packaging source code, and system migrations.


5. xz

Purpose:
xz provides high compression ratios using the LZMA2 algorithm. It typically compresses more slowly than gzip or bzip2 but produces smaller files.

Key Features:

  • Ideal for large backups or archives.
  • Commonly used with tar to create .tar.xz archives.
  • Supports multithreaded compression with the -T option (for example, -T0 uses as many threads as there are CPU cores).

Common Syntax:

xz [options] filename
unxz filename.xz

Examples:

  • Compress a file: xz largefile.iso
  • Decompress: unxz largefile.iso.xz
  • Keep original file: xz -k logfile.txt
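
A short sketch of these commands, assuming xz 5.2 or later (needed for -T) and a hypothetical sample file in a mktemp scratch directory:

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a sample file (hypothetical name).
yes "kernel: sample message" | head -n 1000 > logfile.txt

# -k keeps the original; -T0 lets xz pick a thread count from the CPU cores.
xz -k -T0 logfile.txt

# -t tests integrity; -l prints compressed and uncompressed sizes and the ratio.
xz -t logfile.txt.xz && echo "archive OK"
xz -l logfile.txt.xz

# -c decompresses to stdout, leaving logfile.txt.xz untouched.
unxz -c logfile.txt.xz > restored.txt
cmp logfile.txt restored.txt && echo "round trip OK"
```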

Usage in Administration:
Used when maximum compression is required for large archives or software distributions.


6. cpio (Copy In and Out)

Purpose:
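cpio copies files into and out of archives. In copy-out and pass-through modes it reads a list of file names from standard input, which is why it is often paired with find in backup scripts; in copy-in mode it reads the archive itself from standard input.

Common Syntax:

cpio -o [options] < filelist > archive
cpio -i [options] < archive
cpio -p [options] destination_directory < filelist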

Modes:

  • Copy-out mode: Create an archive (-o).
  • Copy-in mode: Extract files (-i).
  • Pass-through mode: Copy files from one directory to another (-p).

Examples:

  • Create an archive: find /home/user | cpio -ov > backup.cpio
  • Extract files: cpio -idv < backup.cpio
  • Copy files to another directory: find . -type f | cpio -pdm /backup

Usage in Administration:
Useful for making or restoring backups, especially in bootable system images and initramfs creation.


7. dd (Data Duplicator)

Purpose:
dd is a low-level utility used to copy, clone, and convert data from one source to another. It’s often used for creating bootable disks, backups, and system images.

Key Features:

  • Works at the block level (not just files).
  • Can copy partitions, drives, or device images.
  • Lets you set the block size (bs=) and, in GNU dd, report progress (status=progress).

Common Syntax:

dd if=input_file of=output_file [options]

Examples:

  • Create a disk image: dd if=/dev/sda of=/backup/disk.img bs=4M status=progress
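  • Restore from image (overwrites everything on /dev/sda): dd if=/backup/disk.img of=/dev/sda bs=4M status=progress
  • Create a bootable USB (confirm the device name with lsblk first): dd if=ubuntu.iso of=/dev/sdb bs=4M status=progress
  • Wipe a disk (irreversible): dd if=/dev/zero of=/dev/sdb bs=1M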
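
Because the examples above write to real devices, it can help to rehearse on ordinary files first. The following sketch, assuming GNU dd, uses a 4 MiB regular file as a stand-in for a disk (the file names and the marker string are hypothetical) and images it the same way a device would be imaged:

```shell
#!/bin/sh
# Work in a throwaway directory; no block devices are involved.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 4 MiB file of zeros to stand in for a disk.
dd if=/dev/zero of=disk.raw bs=1M count=4 status=none

# Write a recognizable marker at the start; conv=notrunc keeps the rest intact.
printf 'BOOTSECTOR' | dd of=disk.raw conv=notrunc status=none

# "Image" the stand-in disk, block by block.
dd if=disk.raw of=disk.img bs=1M status=none

# The image should be byte-for-byte identical to the source.
cmp disk.raw disk.img && echo "image matches source"
```

The same if=/of= pattern applies to devices, where a mistaken of= value destroys data, so checking the target first is worthwhile.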

Usage in Administration:
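dd is well suited to system imaging, cloning, and disaster recovery. Because it overwrites the target without asking for confirmation, double-check the of= argument before running it.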
