Introduction to Graph Algorithms

Welcome! Today we are diving into one of the most powerful tools in Computer Science: Graphs. Don't let the name fool you—this isn't about the bar charts or line graphs you see in math class. In Computer Science, a graph is a way of mapping out relationships between things.

Think about a social network like Instagram or a map app like Google Maps. How does the app know who your "suggested friends" are, or how to find the fastest route to a coffee shop? The answer lies in Graph Algorithms. In this chapter, we will learn how to represent these connections and how to "travel" through them efficiently.

1. What is a Graph?

At its simplest, a graph is a collection of "dots" connected by "lines."

Key Terms:

  • Vertex (or Node): A single point or object (like a city or a person). The plural is Vertices.
  • Edge: The link between two vertices (like a road between cities or a friendship between people).

Types of Graphs

Not all connections are the same! We categorize graphs based on how their edges behave:

  • Undirected Graph: The connection works both ways. If Person A is friends with Person B, then Person B is also friends with Person A. Think of it like a two-way street.
  • Directed Graph (Digraph): The connection has a specific direction. For example, on X (formerly Twitter), you might follow a celebrity, but they don't necessarily follow you back. This is like a one-way street.
  • Weighted Graph: The edges have a "cost" or "value" attached to them. For example, an edge between two cities might have a weight of 50 to represent 50 kilometers.
  • Unweighted Graph: All edges are equal. We only care if a connection exists or not.

Quick Review: Think of a graph as a map. Cities are Vertices, and the roads between them are Edges. If the road is a one-way street, it's a Directed graph. If the road tells you the distance, it's a Weighted graph.

2. Representing Graphs in Code

Computers can't "see" a drawing of a graph, so we have to store the data in a way they understand. There are two main ways to do this:

A. Adjacency Matrix

This is a 2D array (a grid) where the rows and columns represent vertices. If there is a connection between Vertex A and Vertex B, we put a 1 in the grid; otherwise, we put a 0.

Example Matrix:
If we have 3 nodes (0, 1, 2) and a connection between 0 and 1:
\( \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \)

  • Pros: Very fast to check if two specific nodes are connected.
  • Cons: Wastes a lot of memory if there are many nodes but very few connections (lots of zeros).

B. Adjacency List

This is like a "contact list." Each vertex has a list of all the other vertices it is directly connected to.

Example List:
Node 0: [1, 2]
Node 1: [0]
Node 2: [0]

  • Pros: Saves space! We only store the connections that actually exist.
  • Cons: Slower to check if a specific connection exists because you have to search through the list.

Memory Aid: Use a Matrix when the graph is "dense" (lots of connections). Use a List when the graph is "sparse" (mostly empty space).

3. Graph Traversal: Exploring the Connections

Traversal is just a fancy word for "visiting every node in the graph." There are two main ways to do this, and they work exactly like exploring a building or a maze.

A. Breadth-First Search (BFS)

Imagine you drop a pebble into a pond. The ripples move outward in circles, hitting the closest points first, then the next circle, and so on. That is BFS!

How it works:

  1. Start at a node and mark it as "visited."
  2. Visit all its immediate neighbors first.
  3. Then visit the neighbors of those neighbors.
  4. Uses a Queue (First-In, First-Out) to keep track of which node to visit next.

Best for: Finding the shortest path in an unweighted graph.

B. Depth-First Search (DFS)

Imagine you are exploring a maze. You pick a path and keep going as deep as possible until you hit a dead end. Then you backtrack and try the next available turn. That is DFS!

How it works:

  1. Start at a node and mark it as "visited."
  2. Follow a path as far as it goes.
  3. When you hit a dead end (or a node already visited), backtrack to the last node that had an unvisited neighbor.
  4. Uses a Stack (Last-In, First-Out) or Recursion to keep track of the path.

Best for: Checking if a path exists between two nodes or solving puzzles where you need to try every possibility.

Don't worry if this seems tricky! Just remember: BFS is for Breadth (spread out wide), and DFS is for Depth (go down deep).

4. Common Pitfalls to Avoid

  • Infinite Loops: If you don't keep track of "visited" nodes, your algorithm might keep going in circles forever. Always mark a node as visited!
  • Wrong Data Structure: Remember that BFS uses a Queue and DFS uses a Stack. Mixing them up will change how the algorithm explores the graph.
  • Directed vs. Undirected: In a directed graph, just because you can go from A to B doesn't mean you can go from B to A. Check the arrows!

Key Takeaways

Vertices are the points, Edges are the lines. Graphs can be Directed (one-way) or Undirected (two-way), and Weighted (costs) or Unweighted (no costs). To store them, use an Adjacency Matrix (for lots of connections) or an Adjacency List (to save space). To search them, use BFS (wide/Queue) for shortest paths or DFS (deep/Stack) to explore everything.

Did you know? The "Six Degrees of Separation" theory (that everyone in the world is connected by six or fewer social connections) is actually a graph problem that can be solved using Breadth-First Search!