Welcome to the World of Databases!
In this chapter, we are going to explore how computers store and manage massive amounts of information. Whether it is your favorite streaming service keeping track of your playlists or a school managing thousands of student records, databases are the engine behind it all. By the end of these notes, you’ll understand how to organize data efficiently and how to "talk" to a database using code!
8.1 Database Concepts
Before we had modern databases, people used a file-based approach. Imagine having dozens of separate spreadsheets or text files. This caused big problems:
• Data Redundancy: The same data is stored in multiple places (e.g., your address is in the "Library" file and the "Tuition" file).
• Data Inconsistency: If you move house and only update one file, the files now disagree!
• Data Dependency: The data is tied to the specific software used to create it.
The Relational Database Solution:
A Relational Database solves these problems by organizing data into linked tables. This reduces redundancy and makes sure data stays accurate (Data Integrity).
Key Terminology (The Language of Databases)
Don't worry if these terms feel like jargon at first; they are just different names for things you already know!
• Entity: An object or concept about which data is stored (e.g., Student, Book, Lesson). Think of it as the "title" of your table.
• Table: Where all the data for an entity is kept.
• Field / Attribute: A single characteristic of an entity (e.g., Student_Name, Date_of_Birth). These are the columns in your table.
• Record / Tuple: A collection of fields about one specific instance (e.g., all the info about one specific student). These are the rows in your table.
• Primary Key: A unique identifier for every record. Just like your ID card number, no two people can have the same one!
• Candidate Key: Any field that could potentially be a primary key because it is unique.
• Foreign Key: A primary key from one table that appears in another table to link them together.
• Secondary Key: A field that is indexed to make searching faster (e.g., searching by Surname instead of Student_ID).
Entity-Relationship (E-R) Diagrams
We use E-R diagrams to visualize how tables connect. The most important part is the Relationship:
• One-to-One (1:1): One husband has one wife (in most systems!).
• One-to-Many (1:M): One mother can have many children.
• Many-to-Many (M:M): Many students study many subjects.
Pro Tip: In a relational database, we cannot easily handle Many-to-Many relationships directly. We usually break them down into two One-to-Many relationships using a third "link" table!
The Normalisation Process
Normalisation is just a fancy word for "organizing your data to avoid mess." We do this in three steps:
1. First Normal Form (1NF):
A table is in 1NF if there are no repeating groups of data and all attributes are atomic (this means only one piece of data per cell—no lists!).
2. Second Normal Form (2NF):
It must be in 1NF first. Then, every non-key field must depend on the entire primary key. This only matters if you have a "Composite Key" (a primary key made of two fields).
3. Third Normal Form (3NF):
It must be in 2NF. Then, there must be no transitive dependencies. This means a non-key field cannot depend on another non-key field.
Example: If "Club_Fees" depends on "Club_Name," but "Club_Name" isn't the primary key, you need to move them to a separate table!
Key Takeaway: Normalisation makes databases efficient, reduces duplicate data, and ensures that if you change data in one place, it stays consistent everywhere.
8.2 Database Management Systems (DBMS)
A DBMS is the software that sits between the user and the data. Think of it like a librarian: you ask the librarian for a book, and they know exactly where it is and how to get it safely.
Features of a DBMS:
• Data Dictionary: A file that stores "metadata" (data about data), like table names, field types, and relationships.
• Data Management: Handling how data is actually stored on the disk.
• Data Integrity: Ensuring rules are followed (e.g., you can't delete a student if they still have library books checked out).
• Data Security: Managing who can see or edit which parts of the database (access rights).
• Query Processor: The part that understands your SQL commands and finds the data.
Software Tools in a DBMS:
• Developer Interface: Allows a programmer to create tables and forms easily.
• Logical Schema: The overall view of the entire database design.
Key Takeaway: The DBMS handles the "heavy lifting" of security and organization so that the user doesn't have to worry about the technical details of storage.
8.3 Structured Query Language (SQL)
SQL is the standard language used to communicate with databases. It is split into two main parts:
1. Data Definition Language (DDL)
Used to create or change the structure of the database.
• CREATE DATABASE: Sets up a new database file.
• CREATE TABLE: Defines a new table and its fields.
Example:
CREATE TABLE Students (
StudentID INTEGER PRIMARY KEY,
FirstName VARCHAR(20),
DateOfBirth DATE
);
• ALTER TABLE: Used to add or delete columns in an existing table.
2. Data Manipulation Language (DML)
Used to manage the data inside the tables.
• SELECT: The most common command. It fetches data.
• INSERT INTO: Adds a new row of data.
• UPDATE: Changes existing data.
• DELETE FROM: Removes data.
Writing a SELECT Query
Think of a SELECT query like a filter. The basic structure is:
SELECT [fields] FROM [tables] WHERE [condition] ORDER BY [field];
Example: Find all students who are in Year 12, sorted by their name:
SELECT FirstName, LastName
FROM Students
WHERE YearGroup = 12
ORDER BY LastName ASC;
Joins and Aggregates
• INNER JOIN: Used to combine data from two tables where they have a matching field (usually Primary Key and Foreign Key).
• SUM, COUNT, AVG: These are "aggregate" functions that perform math on your data.
Example: SELECT COUNT(*) FROM Students; (This counts how many students are in the table).
Quick Review Box:
• DDL = Creating the "bucket" (structure).
• DML = Managing the "water" (data) inside the bucket.
• Use WHERE to filter results.
• Use INNER JOIN to link tables.
Did you know? SQL is one of the most in-demand skills in the job market today. Master this, and you’re already ahead of the game!
Final Summary
Databases are much more powerful than simple files. By using a Relational Model, we keep data organized and safe. Normalisation keeps our tables clean, the DBMS keeps our data secure, and SQL allows us to find exactly what we need in a split second. Don't worry if SQL syntax feels picky at first—computers are very literal! Just remember the "Select-From-Where" pattern, and you'll be a database pro in no time.