Welcome to Information Representation!

Ever wondered how a computer, which is essentially just a bunch of tiny electronic switches, can show you a high-definition movie, play your favorite song, or store your homework? It all comes down to Information Representation. In this chapter, we’ll explore how computers "translate" the real world into 1s and 0s. Don't worry if this seems a bit "alien" at first. We'll break it down step by step!


1.1 Data Representation: The Basics

Binary Magnitudes: The "Kilo" vs. "Kibi" Confusion

In the "human" world, we use the decimal system (Base 10). When we say "kilo," we mean 1,000. However, computers use the binary system (Base 2), where things work in powers of 2. This creates a slight difference in how we measure data: a kibibyte (1,024 bytes) is a little bigger than a kilobyte (1,000 bytes).

Decimal Prefixes (Base 10): These are what you use in everyday life.

  • kilo (k): \(10^3\) = 1,000
  • mega (M): \(10^6\) = 1,000,000
  • giga (G): \(10^9\) = 1,000,000,000
  • tera (T): \(10^{12}\) = 1,000,000,000,000

Binary Prefixes (Base 2): These are based on powers of 2, which matches how computer memory is actually organized.

  • kibi (Ki): \(2^{10}\) = 1,024
  • mebi (Mi): \(2^{20}\) = 1,048,576
  • gibi (Gi): \(2^{30}\) = 1,073,741,824
  • tebi (Ti): \(2^{40}\) = 1,099,511,627,776

Memory Aid: If it has a "bi" in it (like kibi), it stands for Binary, so it uses the 1,024 scale!
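To see the difference in practice, here is a minimal sketch in Python. The function name `to_bytes` and the 500 GB drive are made up for illustration; only the multipliers come from the lists above.

```python
# Multipliers taken from the two prefix lists above.
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def to_bytes(value, prefix):
    """Convert a value with a prefix (e.g. 500, 'G') into a plain byte count."""
    multiplier = DECIMAL[prefix] if prefix in DECIMAL else BINARY[prefix]
    return value * multiplier

# Hypothetical example: a drive sold as 500 GB, expressed in GiB.
size_in_bytes = to_bytes(500, "G")
size_in_gib = size_in_bytes / BINARY["Gi"]
print(round(size_in_gib, 2))  # 465.66
```

The same number of bytes looks smaller when measured in GiB, because each GiB is a bigger unit than a GB.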


Number Systems: Binary, Denary, and Hexadecimal

We use three main number systems in Computer Science:

  1. Denary (Base 10): The numbers 0-9 we use every day.
  2. Binary (Base 2): Just 0 and 1. This is the computer's native language.
  3. Hexadecimal (Base 16): Uses 0-9 and then A-F (where A=10, B=11, C=12, D=13, E=14, F=15).

Why use Hexadecimal? It's much shorter than binary and easier for humans to read without making mistakes. One "Hex" digit represents exactly 4 bits (a nibble), so an 8-bit byte is always exactly 2 hex digits.

Quick Review: To convert Binary to Hex, split the binary number into groups of 4 from the right. For example, \(1011 | 0101\) becomes \(B | 5\), or \(B5_{16}\).
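The grouping trick can be sketched in Python like this. The function name `binary_to_hex` is just an illustrative choice, and the input is assumed to be a string of 0s and 1s.

```python
def binary_to_hex(bits):
    """Convert a binary string to hex by grouping bits in fours from the right."""
    # Pad with leading zeros so the length is a multiple of 4.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.zfill(padded_length)
    digits = "0123456789ABCDEF"
    # Split into nibbles, then look up the hex digit for each one.
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(digits[int(group, 2)] for group in groups)

print(binary_to_hex("10110101"))  # B5
```

Padding on the left has the same effect as grouping from the right: a 6-bit input such as 110101 is treated as 0011 | 0101.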


Binary Coded Decimal (BCD)

BCD is a special way of representing numbers where each individual digit of a denary number is stored as its own 4-bit binary code.

Example: To represent the number 85 in BCD:
8 = 1000
5 = 0101
So, 85 in BCD is 1000 0101.
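Here is a small Python sketch of the same idea. The names `to_bcd` and `from_bcd` are illustrative, and the input is assumed to be a non-negative whole number.

```python
def to_bcd(number):
    """Encode a non-negative denary integer as BCD: one 4-bit group per digit."""
    return " ".join(format(int(digit), "04b") for digit in str(number))

def from_bcd(bcd):
    """Decode a space-separated BCD string back into an integer."""
    return int("".join(str(int(group, 2)) for group in bcd.split()))

print(to_bcd(85))             # 1000 0101
print(from_bcd("1000 0101"))  # 85
```

Notice that each digit is converted on its own. This is different from converting 85 as a whole number to binary, which would give 1010101.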

Real-world application: BCD is often used in digital clocks or calculators where each digit on the screen needs to be controlled individually.


Signed Integers: One’s and Two’s Complement

How does a computer know a number is negative? We use Two's Complement.

Step-by-step: How to make a number negative using Two's Complement:

  1. Start with the positive binary version of the number.
  2. Flip all the bits (change 0s to 1s and 1s to 0s). This is called "One's Complement."
  3. Add 1 to the result. This is "Two's Complement."
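The three steps above can be sketched in Python as follows. The function names and the default of 8 bits are assumptions for illustration.

```python
def twos_complement(value, bits=8):
    """Return the bit pattern for -value by following the three steps."""
    # Step 1: the positive binary version, padded to the chosen width.
    positive = format(value, f"0{bits}b")
    # Step 2: flip every bit (one's complement).
    flipped = "".join("1" if b == "0" else "0" for b in positive)
    # Step 3: add 1, keeping only the lowest `bits` bits.
    result = (int(flipped, 2) + 1) % (2 ** bits)
    return format(result, f"0{bits}b")

def decode_signed(pattern):
    """Read a two's complement bit pattern back as a signed integer."""
    value = int(pattern, 2)
    # A leading 1 means the number is negative.
    return value - 2 ** len(pattern) if pattern[0] == "1" else value

print(twos_complement(5))         # 11111011
print(decode_signed("11111011"))  # -5
```

Working it through by hand for 5: 00000101 flips to 11111010, and adding 1 gives 11111011.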

Common Mistake: Don't forget that in a signed 8-bit number, the leftmost bit is the sign bit. If it’s a 1, the number is negative! This means an 8-bit Two's Complement number can hold values from -128 to +127.

Binary Addition & Overflow: When you add binary numbers, sometimes the result is too big to fit in the number of bits allowed (e.g., adding two 8-bit numbers and getting a 9-bit answer). This is called Overflow. The extra bit has nowhere to go, so the stored result is wrong, and this can cause computer errors!
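A minimal sketch of overflow in Python, assuming unsigned 8-bit values (0 to 255). Python's own integers never overflow, so the function `add_8bit` (an illustrative name) imitates an 8-bit register by keeping only the lowest 8 bits.

```python
def add_8bit(a, b):
    """Add two unsigned 8-bit values; return the stored result and an overflow flag."""
    total = a + b
    overflow = total > 0xFF        # True if the answer needs a 9th bit
    return total & 0xFF, overflow  # only the lowest 8 bits are kept

# 200 + 100 = 300, which does not fit in 8 bits.
result, overflow = add_8bit(0b11001000, 0b01100100)
print(format(result, "08b"), overflow)  # 00101100 True
```

The stored answer is 44 instead of 300, because the ninth bit (worth 256) was lost.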

Key Takeaway: Binary is for computers, Hex is for humans to read binary easily, and Two's Complement is how we handle negative numbers.


1.2 Multimedia: Graphics and Sound

Bitmapped Graphics

A bitmap image is made of tiny dots called pixels (picture elements). Each pixel is assigned a binary color code.

  • Image Resolution: The number of pixels wide by the number of pixels high (e.g., 1920 x 1080).
  • Color Depth (Bit Depth): The number of bits used to represent the color of a single pixel. The more bits you use, the more colors you can have (e.g., 8-bit depth allows \(2^8 = 256\) colors).
  • File Header: A small part of the file that stores data about the image, like its resolution and color depth.

Calculating File Size:
\(\text{File Size} = \text{Resolution} \ (\text{Width} \times \text{Height}) \times \text{Color Depth}\)
Example: A 100 x 100 image with a 1-bit color depth needs 100 x 100 x 1 = 10,000 bits (1,250 bytes) of pixel data, not counting the file header.
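The calculation is easy to check with a few lines of Python. The function names are illustrative, and the file header is ignored, so this only counts the pixel data.

```python
def colors_available(color_depth):
    """Number of different colors a given bit depth can represent."""
    return 2 ** color_depth

def bitmap_size_bits(width, height, color_depth):
    """Pixel data size in bits: width x height x color depth (header ignored)."""
    return width * height * color_depth

print(colors_available(8))                   # 256
print(bitmap_size_bits(100, 100, 1))         # 10000
print(bitmap_size_bits(1920, 1080, 8) // 8)  # 2073600 (bytes)
```

Dividing the answer in bits by 8 gives the size in bytes.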


Vector Graphics

Instead of storing pixels, Vector Graphics store a drawing list of mathematical objects (lines, circles, rectangles) and their properties (color, thickness, position).
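As a rough sketch, a drawing list could be modelled in Python as a list of dictionaries. The layout and property names below are made up for illustration and are not taken from any real vector file format.

```python
# A hypothetical drawing list: each object is a shape plus its properties.
drawing_list = [
    {"shape": "circle", "centre": (50, 50), "radius": 20, "color": "red", "thickness": 2},
    {"shape": "line", "start": (0, 0), "end": (100, 100), "color": "black", "thickness": 1},
]

def scale(drawing, factor):
    """Return a new drawing list with every coordinate and length multiplied by factor."""
    scaled = []
    for obj in drawing:
        new_obj = {}
        for key, value in obj.items():
            if isinstance(value, tuple):           # positions such as (x, y)
                new_obj[key] = tuple(v * factor for v in value)
            elif isinstance(value, (int, float)):  # lengths such as radius
                new_obj[key] = value * factor
            else:                                  # text such as shape or color
                new_obj[key] = value
        scaled.append(new_obj)
    return scaled

bigger = scale(drawing_list, 10)
print(bigger[0]["radius"])  # 200
```

Scaling only changes a few numbers. The list still holds two objects, however large the picture is drawn.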

Analogy: A bitmap is like a photo; if you zoom in too much, it gets "blocky" (pixelated). A vector is like a set of instructions to draw a shape; you can make it as big as a house and it will still look perfectly sharp!


Sound

Sound is naturally analogue (a continuous wave), but computers are digital. To store sound, we must sample it.

  • Sampling: Taking "snapshots" of the sound wave's amplitude at regular intervals.
  • Sampling Rate: How many samples we take per second (measured in Hertz, Hz). Higher rate = better quality but larger file.
  • Sampling Resolution: The number of bits used to store each sample. Higher resolution = more accurate representation of the sound wave's amplitude (its loudness at that moment), but a larger file.
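Here is a minimal sketch of sampling in Python. It uses a pure sine wave as a stand-in for real sound, and the function name and the tiny example values (8 samples per second, 3 bits per sample) are chosen only to keep the output readable.

```python
import math

def sample_wave(frequency_hz, sampling_rate_hz, resolution_bits, duration_s):
    """Sample a sine wave at regular intervals and store each sample as an integer level."""
    levels = 2 ** resolution_bits                # how many values each sample can take
    sample_count = int(sampling_rate_hz * duration_s)
    samples = []
    for n in range(sample_count):
        t = n / sampling_rate_hz                 # time of this "snapshot"
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1.0 and 1.0
        # Map the amplitude onto the whole-number levels 0 .. levels - 1.
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples

samples = sample_wave(frequency_hz=1, sampling_rate_hz=8, resolution_bits=3, duration_s=1)
print(samples)
print(len(samples) * 3, "bits of sample data")
```

With 3 bits per sample there are only 8 possible levels, so the smooth wave gets rounded to the nearest one. More samples per second, or more bits per sample, both give a closer match and a bigger file.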

Did you know? A standard CD uses a sampling rate of 44,100 Hz, which means 44,100 samples every second! That's a lot of snapshots.

Key Takeaway: Bitmaps use pixels; Vectors use math. Sound is captured by sampling an analogue wave into digital bits.


1.3 Compression

Compression is the process of making a file smaller. Why do we need it? To save storage space and to make files faster to send over the internet.

Lossy vs. Lossless Compression

  1. Lossy Compression: Permanently removes some data that the human eye or ear likely won't notice. Once it's gone, you can't get it back. (Examples: JPEG, MP3).
  2. Lossless Compression: Reduces file size without losing any original data. When you decompress it, it's identical to the original. (Examples: PNG, ZIP).

Run-Length Encoding (RLE)

RLE is a simple form of lossless compression. It looks for consecutive repeated pieces of data and stores them as a single value and a count.

Example: Imagine a row of pixels: WWWWWBBRRR
In RLE, we store this as: 5W, 2B, 3R
We just turned 10 characters into 6! This works great for simple images with large blocks of the same color.
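The example above can be sketched in Python using `itertools.groupby` from the standard library, which groups consecutive equal items. The function names are illustrative, and storing each run as a (count, value) pair is just one possible layout.

```python
from itertools import groupby

def rle_encode(data):
    """Turn each run of repeated values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]

def rle_decode(pairs):
    """Rebuild the original string from (count, value) pairs."""
    return "".join(value * count for count, value in pairs)

encoded = rle_encode("WWWWWBBRRR")
print(", ".join(f"{count}{value}" for count, value in encoded))  # 5W, 2B, 3R
print(rle_decode(encoded))                                       # WWWWWBBRRR
```

Decoding gives back exactly the original row, which is what makes RLE lossless. If the data has very few repeats (for example WBRWBR), the encoded version can end up longer than the original.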

Quick Review: Use Lossy when you need a very small file and don't mind a tiny drop in quality (like streaming video). Use Lossless when the data must be perfect (like a text document or a program's code).

Key Takeaway: Compression saves space. Lossy throws data away; Lossless keeps everything perfect by being clever with how it's stored.