Stanford EE274: Data Compression I 2023 I Lecture 3 - Kraft Inequality, Entropy, Introduction to SCL
18 Apr 2024
Prefix Codes
- Prefix codes simplify the decoding process by ensuring that no code word is a prefix of another code word.
- A good code assigns a symbol x a codeword of length roughly log2(1/P(x)), so more likely symbols get shorter codewords.
- The Shannon code is a construction for prefix codes: symbols are sorted in decreasing order of probability, and each symbol with probability p is assigned a codeword of length ceil(log2(1/p)) without violating the prefix-free condition.
- Kraft's inequality characterizes prefix codes: codeword lengths l_1, ..., l_K are achievable by a prefix code if and only if sum_i 2^(-l_i) <= 1 (a small check is sketched after this list).
- Prefix codes are a subset of uniquely decodable codes, and Kraft's inequality can be extended to uniquely decodable codes as well.
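A minimal Python sketch of the Shannon-code lengths and the Kraft-sum check from the bullets above (illustrative only, not the course's SCL implementation; the helper names kraft_sum and shannon_code_lengths are made up for this example):

```python
import math

def kraft_sum(lengths):
    """Compute sum_i 2^(-l_i); a prefix code with these lengths exists iff the sum is <= 1."""
    return sum(2 ** -l for l in lengths)

def shannon_code_lengths(probs):
    """Shannon code lengths: a symbol with probability p gets length ceil(log2(1/p))."""
    return [math.ceil(math.log2(1 / p)) for p in probs]

probs = [0.5, 0.25, 0.125, 0.125]
lengths = shannon_code_lengths(probs)   # [1, 2, 3, 3]
print(lengths, kraft_sum(lengths))      # Kraft sum is 1.0 <= 1, so a prefix code with these lengths exists
```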
Entropy
- Entropy H(X) = sum over x of P(x) * log2(1/P(x)) measures the uncertainty, or equivalently the information content, of a random variable X.
- Entropy is a function of the probability distribution, not of the random variable itself.
- Entropy is measured in bits.
- Entropy is 0 for a deterministic random variable and log2(K) for a random variable uniformly distributed over K values.
- Entropy is the limit of lossless compression for a given random variable.
- The joint entropy of independent random variables is the sum of the entropies of the individual variables.
- The entropy of a tuple of n independent and identically distributed random variables is n times the entropy of a single one (illustrated in the sketch below).
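These properties are easy to check numerically; here is a small sketch (the entropy helper below is illustrative, not part of SCL):

```python
import math

def entropy(probs):
    """H(X) = sum_x P(x) * log2(1 / P(x)) in bits; terms with P(x) = 0 contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([1.0, 0.0]))      # 0.0 bits: deterministic random variable
print(entropy([0.25] * 4))      # 2.0 bits: uniform over K = 4 values equals log2(K)

px = [0.5, 0.5]
py = [0.9, 0.1]
pxy = [a * b for a in px for b in py]           # joint distribution of independent X and Y
print(entropy(pxy), entropy(px) + entropy(py))  # equal: H(X, Y) = H(X) + H(Y)
```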
KL Divergence
- The KL divergence D(p || q) = sum over x of p(x) * log2(p(x)/q(x)) measures how far apart two probability distributions p and q are (it is not symmetric, so it is not a true distance).
- The KL divergence is non-negative and equals zero if and only if the two distributions are equal (see the sketch after this list).
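A small sketch of the divergence and its non-negativity (again with an illustrative helper, assuming q(x) > 0 wherever p(x) > 0):

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) * log2(p(x) / q(x)), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]
print(kl_divergence(p, q))   # positive, since p != q
print(kl_divergence(p, p))   # 0.0: the divergence vanishes exactly when the distributions match
```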
Shannon's Source Coding Theorem
- The expected code length of any prefix code is greater than or equal to the entropy of the source distribution: E[L] >= H(X).
- Shannon proved this in 1948; entropy is therefore the fundamental limit of lossless compression.
- The result holds for any uniquely decodable code, not just prefix codes (a short derivation is sketched below).
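A sketch of the standard argument, tying together Kraft's inequality and the KL divergence (for a prefix code with lengths l_i on a source with distribution p):

```latex
% Let c = \sum_i 2^{-l_i} \le 1 (Kraft) and define q_i = 2^{-l_i} / c. Then
\mathbb{E}[L] - H(X)
  = \sum_i p_i l_i - \sum_i p_i \log_2 \frac{1}{p_i}
  = \sum_i p_i \log_2 \frac{p_i}{q_i} + \log_2 \frac{1}{c}
  = D(p \| q) + \log_2 \frac{1}{c} \ \ge\ 0,
% since both the divergence and \log_2(1/c) are non-negative.
```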
Huffman Codes
- Huffman codes are the optimal prefix codes for a given distribution, and by encoding blocks of symbols their expected length per symbol can get arbitrarily close to the entropy (a construction sketch follows).
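A compact sketch of the Huffman construction, computing only the codeword lengths (which determine the expected length); this is illustrative and not the SCL implementation:

```python
import heapq
import itertools

def huffman_lengths(probs):
    """Optimal prefix-code lengths: repeatedly merge the two least probable nodes."""
    counter = itertools.count()  # tie-breaker so the heap never compares the symbol lists
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:        # every symbol under the merged node sits one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]
print(huffman_lengths(probs))    # [1, 2, 3, 3]: expected length 1.9 bits vs. entropy ~1.85 bits
```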
Block Codes
- Block codes can get arbitrarily close to the entropy, but the codebook grows exponentially with the block size, so they require huge tables (see the bound below).
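Why larger blocks help, in one line (coding i.i.d. blocks X^n = (X_1, ..., X_n) with a Shannon or Huffman code, whose expected length exceeds the block entropy by less than 1 bit):

```latex
H(X^n) \le \mathbb{E}\big[L(X^n)\big] < H(X^n) + 1
\quad\Longrightarrow\quad
H(X) \le \frac{\mathbb{E}\big[L(X^n)\big]}{n} < H(X) + \frac{1}{n},
% using H(X^n) = n H(X) for i.i.d. symbols: the per-symbol overhead shrinks as 1/n,
% but the code table has |\mathcal{X}|^n entries.
```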
Arithmetic Codes and ANS Codes
- Arithmetic codes and ANS codes achieve the same per-symbol performance as large block codes, but without the exponential complexity of explicit code tables.