Topic outline
-
-
Welcome and thank you for taking the first step into a journey that will span the course of your career in IT. Whether you’re looking to get into software development, networking, cybersecurity or games design, you will encounter databases in one way or another.
Databases are one of the most universally accepted ways of organising data when it comes to any sort of software program, and while you may not exactly be interested in building databases as a career, despite how lucrative it may be, it is very important that you know how they function. Being oblivious as to how they function would be the same as speaking to someone in English without the use of verbs – a hard task indeed.
Each week you’re going to be dipping your toes into a large pool of concepts, ideas and techniques. Learning in this course is going to be split up into bite-sized chunks that should be easily digestible. The topic we’ll be looking at this week is Database Fundamentals – funnily enough, this used to be the old course name. Why has it changed you may ask? Well, Data Driven Web Technologies will not only teach you the fundamentals of databases and how they function but also how they are implemented online.
-
Read
What exactly is a database?- A handful of tables containing data that are used to retrieve information.
- A collection of potentially useful data that models real-world or conceptual objects and their relationships.
- It is managed by a Database Management System
- This system can be physical or computerised.
A typical example of a database is a library, where books are the objects containing data and the cataloguing system is the Database Management System. The next video discusses the New York Public Library as an example.New York Public Library 3m 26s
Problems with traditional databases (physically stored data):Let's take book libraries as an example:- They are limited by physical space - cannot store more books than the location physically allows.
- Compare this to a computerised database which will store ~2 trillion records in a fraction of the space!
- They have limited multi-access.
- A borrowed book can only be viewed by one person.
- Limited search methods (Title, Author, Subject).
- What if the data has 2 authors?
- What if the title starts with "The" or "As"?
- What if we want to search by a single word in the title?
- What if you want to search by keywords, instead of just the first word, author or subject?
- Slow turn-around.
- Borrowing a book requires going to the physical location and searching for it amongst the shelves.
- Alternatively, Google's database can pull up a book in <7 seconds.
- Multiple indexes must be rigorously maintained.
- Anomalies - What if...
- A card is misplaced?
- An item is deleted in one index but not the others (deletion anomaly)
- An item is entered into one index but not the others (insertion anomaly)
- An item is incorrectly entered in all indexes but only corrected in one? (update anomaly)
- There are different versions of the same record?
- Complex query limitations:
- How quickly can you count the number of books on a topic or by an author?
- What if a book is on many subjects - it must appear multiple times in a subject catalogue (prone to the above anomalies).
As you can see, there are a multitude of reasons as to why it would be smart to move from a physical data storage system to a computerised one! Most libraries nowadays have a computerised cataloguing system available to use.
This simple fact should be enough to convince you that storing data virtually is the way to go.
-
Watch
The following video provides a quick introduction to some of the fundamental concepts in database design and use.
Database Tutorial for Beginners 3m 16s -
Read
After watching the video, you should be convinced that moving from basic Excel spreadsheets to relational databases is the way to go.
Effectively, it is quite possible for data to be stored in file types such as text file, Excel spreadsheets or Word documents.
There are advantages to using these file types:
-
They are seemingly low cost and easy to setup
-
No extra software/hardware is required to get them started
-
They are easy to use for anyone with basic computer training
However, there are also disadvantages:
- The sheer size of the files – with large datasets, data retrieval becomes a lengthy process
- Updates/synchronisation – files are hard to share and update across multiple devices
- Accuracy – duplication of data can occur
- Security – file access is restricted by windows/mac user accounts
- Extra programming is required to automate processes.
So how exactly does a database help speed up these processes? Well, we need to use Structured Query Language (SQL) to ask questions to the database to retrieve structured data.
Structured language enables users to interact with databases. A structured language differs from natural, everyday language in that it uses keywords in a logical pattern to perform tasks. This language is easier to understand than other programming languages due to its simplified nature.
SQL is interpreted by what is known as a database management system (DBMS). The DBMS will action the commands it receives and determine what needs to be done to retrieve the data requested.
Furthermore, a DBMS performs all the maintenance and management functions for the data model, which is the relational database in our case.
This includes but is not limited to:
-
Managing disk space
-
Communication of data between humans and the computer
-
Managing the creation of tables
-
Managing the insertion, updating and deletion of data records
-
Performing complex query answering (searching and matching) of records
-
-