From 72da03cc86405d6ef4f529687e95c1a54f577c7d Mon Sep 17 00:00:00 2001
From: !antona
Date: Thu, 22 Aug 2002 18:09:47 +0000
Subject: [PATCH] Add description of compression algorithm.

(Logical change 1.5)
---
 doc/compression.txt | 153 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)

diff --git a/doc/compression.txt b/doc/compression.txt
index e69de29b..90d2f0c1 100644
--- a/doc/compression.txt
+++ b/doc/compression.txt
@@ -0,0 +1,153 @@

Description of the NTFS (de)compression algorithm (based on a modified LZ77
algorithm)

Copyright (c) 2001 Anton Altaparmakov.

This document is published under the GNU General Public License.

Credits: This is based on notes taken from various places (most notably from
Regis Duchesne's NTFS documentation and from various LZ77 descriptions) and
further refined by looking at a few compressed streams to figure out some
uncertainties.

Note: You should also read the run list description with regard to compression
in linux-ntfs/include/layout.h. Just search for "Attribute compression".
FIXME: Should merge the info from there into this document some time.

Compressed data is organized in logical "compression" blocks (cb). Each cb has
a size (cb_size) of 2^compression_unit clusters. In all versions of Windows
NTFS (NT/2k/XP, NTFS 1.2-3.1), the only valid compression_unit is 4, IOW, each
cb is 2^4 = 16 clusters in size.

We detect and warn about a compression_unit != 4, but we try to decompress the
data anyway.

Compression is only supported for cluster sizes between 512 and 4096 bytes.
Thus a cb can be between 8kiB and 64kiB in size.

Each cb is independent of the other cbs and is thus the minimal unit we have
to parse, even if we wanted to decompress only one byte.

Also, a cb can be stored totally uncompressed, and it can be completely
sparse; both cases are indicated in the run list (see the "Attribute
compression" notes in layout.h mentioned above).

Thus, we need to look at the run list of the compressed data stream, starting
at the beginning of the first cb overlapping @page. So we convert the page
offset into units of clusters (vcn), and round the vcn down to a multiple of
cb_size clusters.

We then scan the run list for the appropriate position. Based on what we find
there, we decide how to proceed.

If the cb is not compressed at all, and covers the whole of @page, we pretend
to be accessing an uncompressed file, so we fall back to what we do in
aops.c::ntfs_file_readpage(), i.e. we do:
	return block_read_full_page(page, ntfs_file_get_block);

If the cb is completely sparse, and covers the whole of @page, we can just
zero out @page and complete the i/o (set @page up-to-date, unlock it, and
finally return 0).

In all other cases we initiate the decompression engine, but first some more
on the compression algorithm.

Before compression, the data of each cb is further divided into 4kiB blocks,
which we call "sub compression" blocks (sb). In the compressed stream, each sb
starts with a header specifying its compressed length. So we could just scan
the cb for the first sb overlapping @page and skip the sbs before that, or we
could decompress the whole cb, injecting the superfluous decompressed pages
into the page cache as a form of read-ahead (this is what zisofs does, for
example).

In either case, we then need to read and decompress all sbs overlapping @page,
potentially having to decompress one or more other cbs, too.

As soon as @page is completed, we could either stop or continue until we
finish the current cb, injecting pages as we go along (again following the
zisofs example).
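
As an aside, here is a minimal sketch of the page offset to cb arithmetic
described above (rounding the vcn of @page down to the start of its cb). All
of the names (page_to_cb_start_vcn, page_shift, etc.) are illustrative
assumptions for this document, not the driver's actual variables:

#include <stdint.h>

typedef int64_t VCN;

/*
 * Return the vcn of the first cluster of the cb containing the page with
 * index @page_index.  @cb_clusters is 2^compression_unit, i.e. 16, and is
 * thus a power of two, so rounding down is a simple mask operation.
 */
static VCN page_to_cb_start_vcn(uint64_t page_index, unsigned page_shift,
		unsigned cluster_shift, unsigned cb_clusters)
{
	/* Convert the page offset from bytes into units of clusters. */
	VCN vcn = (VCN)((page_index << page_shift) >> cluster_shift);

	/* Round the vcn down to a multiple of cb_size clusters. */
	return vcn & ~((VCN)cb_clusters - 1);
}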

Because the sbs follow each other directly, we need to read in the whole cb
anyway in order to be able to scan through it and find the first sb
overlapping @page, so it does make sense to follow the zisofs approach of
decompressing the whole cb and injecting pages as we go along. All discussion
from now on will therefore assume that we are going to do that. It might,
however, make sense not to decompress any sbs located before @page, because
this would be a kind of "read-behind", which is probably pointless unless
someone is reading the file backwards. Performing read-ahead by decompressing
all sbs following @page, OTOH, is very likely to be a good idea.

So, we read the whole cb from disk and start at the first sb.

As mentioned above, each sb starts with a header. The header is 16 bits, of
which the lower twelve bits (i.e. bits 0 to 11) hold the length (L) of the sb
minus 3, where L includes the two bytes for the header itself (equivalently,
they hold L minus 1, with L not counting the two header bytes). The higher
four bits are set to 1011 (0xb) by the compressor for a compressed block, or
to 0000 for an uncompressed block, but the decompressor only checks the most
significant bit, taking a 1 to signify a compressed block, and a 0 an
uncompressed block.

So from the header we know how many compressed bytes we need to decompress to
obtain the next 4kiB of uncompressed data, and if we did not want to
decompress this sb, we could just seek to the next one using the length read
from the header. We could then continue seeking until we reach the first sb
overlapping @page.

In either case, we will reach a sb which we want to decompress.

Having dealt with the 16-bit header of the sb, we now have length bytes of
compressed data to decompress. This compressed stream is further split into
tokens, which are organized into groups of eight tokens. Each token group (tg)
starts with a tag byte, which is an eight-bit bitmap whose bits specify the
type of each of the following eight tokens. The least significant bit (LSB)
corresponds to the first token and the most significant bit (MSB) corresponds
to the last token.

The two types of tokens are symbol tokens, specified by a zero bit, and phrase
tokens, specified by a set bit.

A symbol token (st) is a single byte which is to be taken literally and copied
into the sliding window (the decompressed data).

A phrase token (pt) is a pointer back into the sliding window (in bytes),
together with a length (again in bytes), starting at the byte the back pointer
is pointing to. Thus a phrase token defines a sequence of bytes in the sliding
window which need to be copied at the current position into the sliding window
(the decompressed data stream).

Each pt consists of 2 bytes split into the back pointer (p) and the length (l),
each of variable bit width (but the sum of the widths of p and l is fixed at
16 bits). p is at least 4 bits and l is at most 12 bits.

The most significant bits contain the back pointer (p), while the least
significant bits contain the length (l).

l is actually stored as the number of bytes minus 3 (unsigned), as anything
shorter than that would take at least as much space as the 2 bytes needed for
the pt itself, so no compression would be achieved.

p is stored as the positive number of bytes minus 1 (unsigned), as going back
zero bytes is meaningless.
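
To make the header layout concrete, here is a small illustrative sketch in C
of reading an sb header and seeking to the next sb. The function and macro
names are made up for this document; they are not the driver's API:

#include <stdint.h>

/* Read the 16-bit, little endian sb header at @sb. */
static unsigned sb_header(const uint8_t *sb)
{
	return sb[0] | ((unsigned)sb[1] << 8);
}

/* Total length L of the sb, including the two header bytes. */
#define SB_LENGTH(hdr)		(((hdr) & 0xfff) + 3)

/* The decompressor only checks the most significant bit. */
#define SB_IS_COMPRESSED(hdr)	((hdr) & 0x8000)

/* Seek to the next sb without decompressing the current one. */
static const uint8_t *next_sb(const uint8_t *sb)
{
	return sb + SB_LENGTH(sb_header(sb));
}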

Note that decompression has to occur byte by byte, as it is possible that some
of the bytes pointed to by the pt will only be generated in the sliding window
as the byte sequence pointed to by the pt is being copied into it!

To give a concrete example: a block full of the letter A would be compressed
by storing the byte A once as a symbol token, followed by a single phrase
token with back pointer -1 (p = 0, therefore go back by 0 + 1 = 1 byte) and
length 4095 (l = 0xffc, therefore length 0xffc + 3 bytes).

The widths of p and l are determined from the current position within the
decompressed data (cur_pos). We don't actually care about the widths as such,
however; instead, we want the mask (l_mask) with which to AND the pt to obtain
l, and the number of bits (p_shift) by which to right shift the pt to obtain
p. These are determined using the following algorithm:

/*
 * Start with the widest possible l (12 bits).  Each time cur_pos doubles
 * beyond 16 bytes, the back pointer needs one more bit, so l loses one.
 */
for (i = cur_pos, l_mask = 0xfff, p_shift = 12; i >= 0x10; i >>= 1) {
	l_mask >>= 1;
	p_shift--;
}

Note that, as usual in NTFS, the sb header, as well as each pt, is stored in
little endian format.
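
Putting the pieces together, here is a self-contained, illustrative sketch of
decompressing a single compressed sb, following the description above. It is
not the driver's actual decompression engine; all names are invented for this
document and error handling is reduced to returning -1:

#include <stddef.h>
#include <stdint.h>

/*
 * Decompress one compressed sb.  @src points just past the 2-byte sb
 * header, @src_end points just past the last compressed byte (as given by
 * the length in the header), and @dst is the 4kiB output buffer (the
 * sliding window).  Returns the number of bytes produced, or -1 on error.
 */
static int decompress_sb(const uint8_t *src, const uint8_t *src_end,
		uint8_t *dst, size_t dst_size)
{
	uint8_t *const dst_start = dst;
	uint8_t *const dst_end = dst + dst_size;

	while (src < src_end && dst < dst_end) {
		uint8_t tag = *src++;	/* bitmap typing the next 8 tokens */
		unsigned t;

		for (t = 0; t < 8 && src < src_end && dst < dst_end;
				t++, tag >>= 1) {
			unsigned pt, l_mask, p_shift, i, l, p;
			const uint8_t *back;

			if (!(tag & 1)) {
				/* Symbol token: copy the literal byte. */
				*dst++ = *src++;
				continue;
			}
			/* Phrase token: 2 bytes, little endian. */
			if (src_end - src < 2)
				return -1;
			pt = src[0] | ((unsigned)src[1] << 8);
			src += 2;

			/* Derive l_mask and p_shift from cur_pos. */
			for (i = (unsigned)(dst - dst_start), l_mask = 0xfff,
					p_shift = 12; i >= 0x10; i >>= 1) {
				l_mask >>= 1;
				p_shift--;
			}
			l = (pt & l_mask) + 3;	 /* stored as length - 3 */
			p = (pt >> p_shift) + 1; /* stored as distance - 1 */

			if (p > (size_t)(dst - dst_start) ||
			    l > (size_t)(dst_end - dst))
				return -1;
			back = dst - p;
			/* Copy byte by byte: the source may overlap the
			   destination inside the sliding window. */
			while (l--)
				*dst++ = *back++;
		}
	}
	return (int)(dst - dst_start);
}

A caller would read the whole cb into memory, check SB_IS_COMPRESSED() on each
header, and either copy the 4kiB of an uncompressed sb verbatim or feed the
bytes after the header to a routine like the above.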