LZW compression

What is LZW compression?

LZW (Lempel-Ziv-Welch) compression is a lossless, table-based lookup algorithm that replaces repeated sequences of data with shorter codes, compressing an original file into a smaller file. It is best known as the compression method used in Graphics Interchange Format (GIF) files and as a compression option in Tag Image File Format (TIFF) files. LZW compression is also suitable for compressing text and PDF files. The algorithm is a refinement of the LZ78 algorithm that Abraham Lempel and Jacob Ziv published in 1978.

Published by Terry Welch in 1984 as an improvement on the earlier work of Abraham Lempel and Jacob Ziv, the LZW compression algorithm is a type of lossless compression. Lossless algorithms reduce bits in a file by removing statistical redundancy without causing information loss. This makes LZW, along with other lossless methods such as the compression used in ZIP archives, different from lossy compression algorithms, which reduce file size by discarding less important or unnecessary information and therefore cause information loss.

The LZW algorithm is commonly used to compress GIF and TIFF image files and occasionally PDF and TXT files. It is also the algorithm behind the Unix compress utility. The method is simple to implement, versatile and capable of high throughput in hardware implementations. Consequently, LZW has been widely used for general-purpose data compression in PC utilities.

How LZW compression works

The LZW compression algorithm reads a sequence of symbols, groups those symbols into strings and then converts each string into a code. Each code has a bounded width, commonly up to 12 bits, and each entry in the table pairs a string with the code that stands for it. The table is also called a dictionary or codebook. It stores character sequences chosen dynamically from the input and maintains a correspondence between the longest strings encountered so far and a list of code values.

As the input is read, any repeated string is substituted with its code, reducing the total size of the output. The code takes up less space than the string it replaces, resulting in a smaller file. As the number of long, repetitive strings in the input data increases, the algorithm's efficiency also increases. A code can be of any width, but it always has more bits than a single character (for example, 9 to 12 bits for 8-bit characters), so compression occurs only when a single code is output in place of a string of two or more characters.

The LZW algorithm does not analyze the incoming text in advance. It simply adds every new string of characters it sees to the code table. Because at each step it matches the longest string already in the table before emitting a code, LZW is referred to as a greedy algorithm.
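
The encoding loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the exact procedure used by GIF, TIFF or any particular utility: the function name `lzw_encode`, the `max_codes` parameter and the choice to return codes as a plain list of integers (rather than packing them into bits) are assumptions made for readability.

```python
def lzw_encode(data: bytes, max_codes: int = 4096) -> list[int]:
    """Greedy LZW encoder sketch: emit one code per longest known string."""
    # Codes 0-255 stand for the single bytes themselves.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Keep extending the match while the string is already known.
            current = candidate
        else:
            # Emit the code for the longest known string...
            codes.append(table[current])
            # ...and remember the new, longer string if there is room.
            if next_code < max_codes:
                table[candidate] = next_code
                next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


codes = lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT")
print(len(codes), codes)
# 16 [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```

The 24 input bytes become 16 codes. The first nine codes are plain byte values; the savings come from the later codes (256 and above), each of which replaces a string of two or more bytes.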

Code table in LZW compression

Unlike LZ78, which starts with an empty dictionary, the LZW algorithm starts with a lookup table that already contains every possible single-byte value. The table does not need to be stored in the compressed file, because the decoder can rebuild it from the codes. Typically, the number of table entries is 4,096, which corresponds to 12-bit codes. In the code table, codes 0-255 are assigned to represent single bytes from the input file. Before the algorithm starts encoding, the table contains only these first 256 entries, and the rest of the table is blank.

The remaining codes are assigned to strings as the algorithm proceeds with the compression. When encoding starts, the algorithm identifies repeated sequences in the data and adds them to the code table so that it fills up with more entries. For file compression, codes 256 through 4,095 are used to represent sequences of bytes. These codes refer to substrings, while codes 0-255 refer to individual bytes.
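
To make the way the table fills up concrete, the following Python sketch records only the entries that are added above code 255 while a short input is processed. The helper name `lzw_table_entries` and the `max_codes` limit of 4,096 are illustrative assumptions, and the sketch tracks table growth only; it does not produce compressed output.

```python
def lzw_table_entries(data: bytes, max_codes: int = 4096) -> dict[int, bytes]:
    """Return the multi-byte entries (code -> string) added while reading data."""
    # The table starts with the 256 single-byte entries, codes 0-255.
    table = {bytes([i]): i for i in range(256)}
    added = {}
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            if len(table) < max_codes:
                code = len(table)          # next free code: 256, 257, ...
                table[candidate] = code
                added[code] = candidate
            current = bytes([value])
    return added


for code, string in lzw_table_entries(b"ABABABA").items():
    print(code, string)
# 256 b'AB'
# 257 b'BA'
# 258 b'ABA'
```

Each new entry is an existing string extended by one byte, which is why longer and longer repeated sequences become available as the input is read.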

The decoding program that decompresses the file rebuilds the same table by running the same algorithm as it processes the encoded input. It takes each code from the compressed file and translates it through the code table being built to find the string of one or more characters that the code represents.
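
A decoder along these lines might look like the sketch below. It assumes the codes arrive as a list of integers produced by a matching encoder with the same 4,096-entry limit; the name `lzw_decode` and the `max_codes` parameter are hypothetical. One detail worth noting is that the decoder's table is always one entry behind the encoder's, so it can receive a code it has not defined yet. In that case, the missing string must be the previous string plus its own first byte.

```python
def lzw_decode(codes: list[int], max_codes: int = 4096) -> bytes:
    """LZW decoder sketch: rebuild the table while translating codes."""
    if not codes:
        return b""
    # Start from the same 256 single-byte entries as the encoder.
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code not defined yet: it must be previous + its first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output += entry
        # Add the entry the encoder created one step earlier.
        if next_code < max_codes:
            table[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return bytes(output)


encoded = [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
print(lzw_decode(encoded))
# b'TOBEORNOTTOBEORTOBEORNOT'
```

No table is passed to the decoder; everything it needs is reconstructed from the code stream itself.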

Advantages and drawbacks of LZW compression

The LZW algorithm quickly compresses large TIFF or GIF files. It works especially well for files containing a lot of repetitive data, which is common with monochrome images.

One drawback of LZW compression is that files without repetitive information may barely shrink, or may even grow, defeating the purpose of compression. Another issue, now largely historical, is that the algorithm was covered by patents (not copyright), so companies had to pay royalties or licensing fees to use it, and these fees could be added to the product cost. The U.S. patent expired in 2003.
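
The effect of non-repetitive input can be illustrated with a rough size estimate. The sketch below assumes every code is stored in a fixed 12 bits, which is a simplification (GIF, for example, uses variable-width codes), and the function names `count_codes` and `packed_size` are made up for this example.

```python
def count_codes(data: bytes, max_codes: int = 4096) -> int:
    """Count how many codes a basic LZW encoder would emit for data."""
    table = {bytes([i]) for i in range(256)}   # known strings only
    current = b""
    emitted = 0
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            emitted += 1
            if len(table) < max_codes:
                table.add(candidate)
            current = bytes([value])
    return emitted + (1 if current else 0)


def packed_size(data: bytes, code_bits: int = 12) -> int:
    """Estimated output size in bytes, assuming fixed-width codes."""
    return (count_codes(data) * code_bits + 7) // 8


repetitive = b"A" * 256
non_repetitive = bytes(range(256))   # every byte value exactly once

print(len(repetitive), packed_size(repetitive))
print(len(non_repetitive), packed_size(non_repetitive))
# 256 35
# 256 384
```

With no repeated strings to exploit, every byte is emitted as its own 12-bit code, so the output is 1.5 times the size of the input under this fixed-width assumption.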

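Finally, LZW is not the most efficient compression algorithm. Other algorithms are available that compress files faster or produce smaller output. The DEFLATE method used in ZIP archives, for example, typically achieves better compression ratios.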

LZW compression vs. ZIP compression

LZW and ZIP are both lossless compression methods, meaning no data is lost after compression. TIFF files retain their quality after being compressed into smaller files using either LZW or ZIP. That said, compressed TIFF files can be slightly slower to work with because they require more processing effort to open and save.

LZW and ZIP both provide good results with 8-bit TIFF files. For 16-bit TIFF files, the ZIP algorithm generally performs better than LZW, and LZW can even make 16-bit files larger than the uncompressed original. Both algorithms work best when they can group a lot of similar data, so images that are low on detail and contain few tones compress more than images containing lots of detail or many different tones.

This was last updated in January 2023
