Compression
Stunts and its ports uses multiple compression formats for code and resource files to save disk space and, more importantly, the number of floppy disks used to distribute the game.
Formats by game version
Game version | Stunts packing | RefPack | RPck | |||||
---|---|---|---|---|---|---|---|---|
RLE | Huffman | |||||||
Supported | Prefix table | Code endianness | ||||||
MS DOS | BB 1.0 | Game | X | X | Big | |||
Loader | X | X | Little | |||||
BB 1.1/MS 1.1 | Game | X | X | X | Little | |||
Loader | X | X | Little | |||||
Amiga | X | X | ||||||
PC98 | X | |||||||
FM Towns |
MS DOS
The original release of Stunts uses a custom data compression format not found in the ports to other platforms. The code may have been hand-crafted for 16-bit x86 since other options were favoured in the later ports.
The data format supports both run-length encoding (RLE) and Huffman coding. They can be combined in multiple passes. The data format and the decompression code don't require any particular order or combination of algorithms. Every compressed file shipped with the game use Huffman coding, and when combined with RLE, the RLE pass is applied before Huffman.
File header
enum CompressionType { RLE = 1 Huffman = 2 } struct Header { union { uint8 passes : 7 uint8 compression_type : 7 } uint8 multi_pass : 1 uint32 uncompressed_size : 24 }
If the most significant bit of the first byte of the file is set, this is a multi-pass file and the remaining 7 bits are the number of passes to be processed recursively. If the multi_pass flag is not set, we are only doing a single pass and the first byte is the compression type. The next three bytes make up the uncompressed size. If the multi_pass flag was set, this is the final uncompressed size. When the multi_pass flag is not set, this is the uncompressed size of the current pass. The remaining data is processed recursively. The compression_type field decides how it is decoded.
Run-length encoding
struct RLE { uint32 compressed_size : 24 uint8 reserved // Always 0x00, value is ignored by the game uint8 escape_length : 7 uint8 no_sequence : 1 uint8 escape_codes[escape_length] uint8 data[...] }
RLE is used to reduce the length of consecutively repeated bytes by using an escape code followed by a counter to signify how many times the next byte should be repeated. It is particularly useful for compressing images, as large areas with a single colour would be represented by many consecutive identical bytes.
The RLE stage can itself have two passes if no_sequence is not set. The first pass will then expand sequences of bytes in the data. escape_codes[1] will mark the beginning and the end of a byte sequence. When the escape code is encountered in the data, the following bytes are considered a sequence until the escape code is hit again. After the end-marking escape code, a one byte length counter tells how many times the preceding byte sequence should be repeated in the output buffer. Because the escape code cannot itself be escaped, sequential byte runs can only be applied in files that have unused byte values to use as the first escape code.
The main RLE pass is a traditional expansion of single-byte runs, but with a twist that allows for multiple types of escape codes. The escape code's position in the header's escape code list affects its properties. When the first escape code is encountered in the source buffer, the following byte has the number of repetitions, and the next byte will be copied that many times to the output buffer. When the third escape code is encountered, the number of repetitions is stored in the following two bytes as a 16-bit integer. Then the following byte is copied that many times. All other escape codes use its position in the escape code list as the number of repetitions, and have no additional counter bytes before the data byte to be repeated. This allows the second escape code reserved for sequences to serve as a no-op without the two passes interfering. Escape codes can themselves be escaped in single-byte runs by having them expanded to only one copy.
All repetition counters have an off-by-one surplus, as the data byte itself counts as the first copy. For optimal compression, escape codes should be chosen from byte values that occur least frequently, or not at all, in the source data.
RLE decoding optimisation
Escape codes are put in an escape[256] array, where the code's byte value is the index and the escape code's position is the entry's value. This array is used to test incoming bytes for escape codes. Raw bytes will have the value 0, while other values corresponds to the escape code's property. This optimisation makes the decoding run O(1), as opposed to O(n), where n is the number of escape codes.
Huffman coding
struct Huffman { uint8 tree_levels : 7 uint8 delta_coding : 1 uint8 symbols_per_level[tree_levels] uint8 alphabet[SUM(symbols_per_level)] uint16 codes : var [...] }
Huffman coding uses a binary tree to store data. When Huffman codes are read, each bit tells which branch to take when traversing the tree. If a leaf node is hit, we have found the code's symbol and can proceed with the next code. Frequently used symbols have shorter paths and thus takes up fewer bits in the source file.
The tree is stored by having two lists, one with symbols per level, and one with the alphabet. The alphabet is every possible symbol in the order they appear in the tree. The tree is built by going through each symbols_per_level. If there are symbols at the current level, consume them from alphabet and create leaf nodes. If there are no symbols, and there still are more levels to create, proceed to create left and right children nodes for the next level.
The remaining data is the Huffman codes. Bit by bit is read to find leaf nodes in the Huffman tree whose symbols are appended to the output buffer. This is repeated until the output buffer has reached uncompressed_size.
The maximum Huffman tree height is 16. While the data format allows for up to 127 levels, Stunts' decompression function uses arrays with a fixed capacity of 16 for its level offset table.
Delta coding
If the flag for delta coding is set, each symbol is treated as the difference from the previous output byte instead of an independent final value. The last byte written to the output buffer is remembered and added to the next expanded symbol. While regular Huffman coding assigns shorter codes to frequently occurring values, this delta variant assigns shorter codes to differences that occurs frequently. This feature is not used by any shipped files, but the code is present in all versions that implements Huffman decoding. One can speculate that delta coding was outperformed by RLE combined with regular Huffman coding, making delta coding unnecessary.
Huffman decoding optimisations
Prefix table
With Huffman codes being paths in a binary tree, they have the property of being prefix-free, meaning that one leaf node's code cannot be the prefix of another node's code. This allows for a decoding optimisation technique known as prefix tables or Huffman tables. By creating arrays of symbols and code widths where the indices are the Huffman codes, the codes can be looked up directly without traversing the tree. Every array index for non-leaf nodes between two leaf node entries are filled with data for the preceding leaf node. This way there's no need to filter out bits that belong to the next Huffman code, at the expense of common symbols taking up multiple entries in the lookup table during decompression.
Stunts utilises this optimisation for Huffman codes that are up to 8 bits wide. The first byte of the code is checked in the widths[256] array. If the code's width is 8 bits or less, the same code is looked up in the symbols[256] array and written to the output buffer. To find the start of the next Huffman code, the source buffer is advanced by the code's bit width found in the widths array.
Offset table
As Stunts limits the prefix tables to 8 bit wide Huffman codes to save RAM, another novel optimisation is applied for codes that exceeds 8 bits. Prefixes for longer codes have the value 0x40 in the widths array, indicating that the current code is a prefix that needs further expansion. For this purpose Stunts uses two additional helper arrays: total_codes[tree_levels], which has the total number of leaf nodes that exists in the tree up to each level, and code_offsets[tree_levels], which for each level has a value that can be added to a code with the same level in order to get its index in the alphabet. This works because the codes are arranged canonically. As a code's bit width corresponds to the code's level in the Huffman tree, the width can be used to compare the current code with the level's total by getting total_codes[current_width] for each extra bit read. If the code is smaller than the total number of codes encountered, it is a leaf node, and we get the symbol by adding the level's code offset to the code with alphabet[code + code_offsets[width]]. This is more laborious than the use of prefix tables for direct lookup, but it only applies to codes that are occurring less frequently in the input file.
Version differences
The Huffman decompression in the game code of Stunts 1.0 differs from the one in its LOAD.EXE and later versions of the game. The Huffman codes are written with the least significant bit first, as opposed to the other versions that has the codes written with the opposite orientation. The game code of Stunts 1.0 also does not use a prefix table, and the bit-stream orientation was likely changed with the introduction of this optimisation.
Amiga
The Amiga port uses two unrelated compression formats.
RefPack
Non-3d shape resource files uses the Electronic Arts in-house LZSS-based compression format RefPack. This may be one of the first known uses of RefPack which since became part of EA's internal engine libraries and has been used in many games through the 90's and early 2000's. The bit stream format is fully documented and has multiple implementations available, but the header flags have either changed, or the unofficial implementations make incorrect assumptions about them, causing the decompressors to wrongfully assume that the 4D Sports Driving files are not compressed due to not having the 0x01 flag set.
RPck
3-d shapes are compressed with RPck, a simple RLE-based format found in some Amiga file archiving software. Compression ratio is inferior to RefPack, but the code is simpler and faster, which may explain why it was chosen for these files.
PC98
The PC98 port of the game uses RefPack to compress some of the bitmap image resource files. See the Amiga section for more details.
FM Towns
The FM Towns and FM Towns Marty ports of the game were distributed on CD-ROM with ample storage and does not apply any compression to the game resource files.