Unsigned Integer
n-bit Range: $[0, 2^n - 1]$
Signed Integer
Sign-Magnitude Representation
n-bit Range: $[-(2^{n-1}-1), 2^{n-1}-1]$
Two’s Complement Representation
n-bit Range: $[-2^{n-1}, 2^{n-1}-1]$
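As a quick check of the three ranges, here is a minimal Python sketch. The function names are made up for illustration and are not from any library. Sign-magnitude is symmetric around zero because it spends one bit pattern on a second zero, while two's complement gives that pattern to $-2^{n-1}$.

```python
def unsigned_range(n: int) -> tuple[int, int]:
    """Range of an n-bit unsigned integer: [0, 2^n - 1]."""
    return 0, 2**n - 1


def sign_magnitude_range(n: int) -> tuple[int, int]:
    """Range of n-bit sign-magnitude: 1 sign bit plus (n - 1) magnitude bits."""
    max_magnitude = 2 ** (n - 1) - 1
    return -max_magnitude, max_magnitude


def twos_complement_range(n: int) -> tuple[int, int]:
    """Range of n-bit two's complement: [-2^(n-1), 2^(n-1) - 1]."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


def to_twos_complement(value: int, n: int) -> str:
    """Return the n-bit two's complement bit string of value."""
    lo, hi = twos_complement_range(n)
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {n} bits")
    # Masking with 2^n - 1 wraps negative values into their n-bit pattern.
    return format(value & (2**n - 1), f"0{n}b")


for bits in (4, 8, 16):
    print(bits, unsigned_range(bits), sign_magnitude_range(bits),
          twos_complement_range(bits))
print(to_twos_complement(-1, 8), to_twos_complement(-128, 8))
```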
<aside> ❓
BF16 is commonly used for training these days! (LLM… easier to converge..?)
⇒ Because the exponent bits matter (dynamic range)
</aside>
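To make the dynamic-range point concrete, the sketch below computes the largest finite value and smallest positive normal value from the bit layout alone. It assumes IEEE-754-style encoding (biased exponent, implicit leading 1, top exponent code reserved for inf/NaN). The helper name `float_format_limits` is hypothetical. The layouts used are FP16 = 1 sign / 5 exponent / 10 mantissa bits and BF16 = 1 / 8 / 7.

```python
def float_format_limits(exp_bits: int, man_bits: int) -> tuple[float, float]:
    """Return (max_finite, min_normal) for an IEEE-754-style binary format."""
    bias = 2 ** (exp_bits - 1) - 1
    # Largest finite: all mantissa bits set, largest non-reserved exponent.
    max_finite = (2.0 - 2.0 ** -man_bits) * 2.0 ** bias
    # Smallest positive normal: mantissa 0, exponent field 1.
    min_normal = 2.0 ** (1 - bias)
    return max_finite, min_normal


fp16_max, fp16_min = float_format_limits(exp_bits=5, man_bits=10)
bf16_max, bf16_min = float_format_limits(exp_bits=8, man_bits=7)

print(f"FP16: max={fp16_max:.6g}, min normal={fp16_min:.6g}")
print(f"BF16: max={bf16_max:.6g}, min normal={bf16_min:.6g}")
```

BF16 keeps the same 8 exponent bits as FP32, so it covers roughly the same range of magnitudes. It pays for that with mantissa precision (7 bits versus 10 in FP16).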
Pruning → Quantization → Huffman Encoding
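A toy sketch of the three stages on a flat list of weights, under loud assumptions: pruning is plain magnitude thresholding, quantization is uniform (swapped in for simplicity; clustering-based weight sharing is another common choice), and all function names and the sample weights are made up for illustration. There is no retraining between stages here.

```python
import heapq
from collections import Counter


def prune(weights, threshold):
    """Magnitude pruning: zero out weights with |w| below threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize(weights, num_levels):
    """Uniform quantization: map each weight to the nearest of num_levels
    evenly spaced values; return (indices, codebook)."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    indices = [round((w - lo) / step) for w in weights]
    codebook = [lo + i * step for i in range(num_levels)]
    return indices, codebook


def huffman_code(symbols):
    """Build a Huffman code: frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code so far}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical example weights, chosen only to exercise the pipeline.
weights = [0.02, -0.9, 0.5, -0.01, 0.48, 0.03, -0.88, 0.0]
pruned = prune(weights, threshold=0.1)
indices, codebook = quantize(pruned, num_levels=4)
codes = huffman_code(indices)
encoded = "".join(codes[i] for i in indices)

print("pruned: ", pruned)
print("indices:", indices)
print("codes:  ", codes)
print("bits:   ", len(encoded), "vs fixed 2-bit:", 2 * len(indices))
```

The order matters in this sketch: pruning makes one value (zero) dominate, quantization reduces the weights to a small symbol alphabet, and Huffman coding then exploits the skewed symbol frequencies.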