Relational (aka SQL) databases are the oldest general purpose database type and remain extremely popular today. Regardless of your industry or profession, if you are in operations or analytics, you will come across RDBMs and need to understand the mechanics and processes of this architecture.
Also termed “NoSQL” this is a varied collection of modern database types that offer approaches that differ from the standard relational pattern. NoSQL is said to stand for either “non-SQL” or “not only SQL” indicating that this may allow for SQL-like querying.
The relational database type uses a structure that allows users to identify and access data in relation to another piece of data in the database. These databases are extremely popular just as they are easy to understand because data is organized into structured, defined, and organized tables.
SQL stands for “structured query language” and was a language developed to perform operations in the relational database system. Because of the relational-nature of this model and its organization of tables, rows, and columns, SQL optimizes the querying of data collected.
Luckily, this is a fairly simple architecture to understand. The basics inform us that a relational database is made up of tables, records, fields, columns, and rows affecting the way we build and query data in this model.
Example: diagram of the relational data model for South African Municipalities uploaded to Research Gate by Jean Vincent Fonou Dombeau.
There are a variety of relational database products available. Selecting one over the other is typically an enterprise decision or a personal one, depending on the work and environment. Open-source databases such as PostgreSQL or MariaDB are free while others requires purchase or subscription plans such as the Oracle-owned MySQL. Cloud-based databases are also available with Amazon and Google, among others.
As an analyst you can ensure that you are relatively agnostic to which database you use over another if you properly understand the fundamental nature of relational database management systems and its model, and are able to communicate/code in the structured querying language . Then your work is a matter of transforming data into valuable information and knowledge.
First, you will learn how to query data from a single table using basic data selection techniques such as selecting columns, sorting a result set, and filtering rows. Your next effort is where you learn more advanced query techniques such as joining multiple tables, using set operations, and constructing subqueries. It is wise to also learn how to manage database tables such as creating new tables or modifying an existing table’s structure.
There are abundant courses online teaching SQL for students. The best experience involves practice. If you are completely new, consider a course where you are also playing with simple tables in a hosted RDMS environment that way you are not stuck downloading and configuring any software. If you have access to a database system you can take courses that provide datasets while teaching more advanced concepts and techniques.
An alternative emerged to legacy relational database systems and were made increasingly popular – from necessity – by tech giants such as Google, Netflix, Facebook, and Amazon all of whom manage and process big data. In contrast to traditional RDBMs, non-relational databases do not store data in structured tables but instead provide a variety of new ways to store data. Each with important benefits and trade-offs. NoSQL databases may be organized as key-value, graph-store, document-based, or column-based.
Modern applications placed new demands on existing database systems that were lacking within the RDBMs architecture. On the one hand, relational databases were not designed to cope with the scale and agility challenges of applications today, nor were they developed with the capacities of today’s storage and processing power. Given the commodity of today’s exploding volume of new and rapidly changing data types — structured, semi-structured, unstructured and polymorphic data, No-SQL databases emerged.
Luckily, this is a fairly simple architecture to understand. The basics inform us that a relational database is made up of tables, records, fields, columns, and rows affecting the way we build and query data in this model.
Example: diagram of a non-relational data model which represents complex and deeply nested document structure. Photo from Ebay engineering.
In terms of building new applications, considerations of alternative data models such as those available in the non-relational paradigm are extremely important. There are multiple examples and products for the various frameworks. It often depends on scalability, latency, data type, performance, querying language, etc etc… But if you want your application to handle a lot of complicated querying, database transactions and routine analysis of data, you’ll probably want to stick with a relational database.
As an analyst working with applications that are heavy into data analytics, you may not depend upon non-relational databases as much as you may assume given their growing popularity. This is because non-relational databases have no joins like there would be in relational databases making it difficult to query data. This means you would need to perform multiple queries and join the data manually within your code – which is not easy or efficient in terms of mean-adjusted error and effort.
First, you will need a very strong understanding of exactly what your data requirements are before you select between a relational or non-relational data model. This requires an audit of data sources, use-cases, storage and processing requirements, application goals, and user experience. Nevertheless you may need to store unstructured data so that you can convert it into meaningful information for your business – the correct database can facilitate this.