Database basics

2015-07-27

Databases these days are quite modular and generally run in a client server architecture. Typically in either a 2-tier or 3-tier structure. The 2-tier has the application and database on a single server, then clients can connect to said server. The 3-tier architecture adds an extra layer where there is a server in between the client and the database server, which is the application server, whereby the client connects to the application server, and then the application server maintains the connection to the database server. By employing this client-server/n-tier architecture, it allows multiple users to access the same data with relative ease. The 3-tier architectures are particularly popular with web-based front ends that interface with the database. The connections to the database server are maintained through the Object Database Connectivity (ODBC) drivers where most database vendors provide native drivers to their particular system.

The modern day database management system doesn’t require you to know the low level details of how the data is stored on the file system. You set up the data model (through entities) in your database, and the database management system handles the lower level detail. Each entity contains specific information - for instance the person entity would contain properties such as name, age, location, gender, data of birth, etc. Within the database, an entity would typically relate to one or more other entities, such as the entity of person and vehicles - one person can have many vehicles.

The representation in the data is typically represented in an entity-relationship diagram. This diagram shows each entity and their relationships. For a more detailed outlook, you could use a schema diagram, which in addition to entities and their relationships, would include other attributes of the schema such as triggers, sequences, procedures, functions and packages - to give a fully detailed outlook of the database schema.

A modern trend in database management systems is the use of an object-relational data model. This concept became a part of the SQL99 standard, where the major vendors support this technology through the use of the SQL structured type. The object/type then become one of the attributes of the entity. This technology has not had widespread uptake, however.

One of the common benefits of a database management system is such that it is self documenting. By observing the database, you can see all the different entities and any relationships they may have. When you create your database, with the appropriate structure to ensure it’s integrity, at any given time you can find out more information about the database by querying the data dictionary - which is typically what SQL management GUI applications use to report the information back to you.

Producing a well structured data model following the desired business rules allows you to adapt your data to different applications over time, as technologies evolve. It should not be necessary to make any major changes to the data model to support a different front-end technology. That is where the benefits of declarative constraints come in - it means no matter what technology you use for your users, the data integrity will persist.

A good architecture to follow for your database is the three-schema architecture. This architecture ensures that the front-end application of the data is well separated from the database. It consists of the low level - detailing how the physical structure of the files on the file system are stored; the conceptual level - which details the actual structure of entities in the database; the view level - hides the full details of how the database was set up at the conceptual level, providing easy to use views that users can report from. These days, the database management system handles the lower level for you, and you just deal with creating a good database structure. By separating the latter two levels, it ensures that any change to the conceptual design shouldn’t have an overbearing impact to the external level - all that would need updating is the specification of the view(s) that have been exposed.

The data languages worth making note of from the perspective of database management systems are data-definition language (DDL) and data-manipulation language (DML). DDL refers to the set of commands used in defining the structure of database - entities, relationships, triggers and procedures. DML is the set of commands used to perform create, remove, update and delete (CRUD) operations. These two language are in fact the same language, but are named differently just to distinguish the types of commands that are being performed on the database.