• Software
  • Leadership
  • Agile
  • Events
  • Other Topics
    • Finance
    • Robotics & AI
    • System Administration
    • Books
    • Life Experiences
    • Environment
  • Write and Earn
  • About Us
    • About Us
    • Our Contributors
    • Contact Us
    • Article Submission Guidelines
    • Logo demystified
  • Follow @MeJaneve
    Janeve.Me
  • Categories

    Browse through following categories for articles written by professionals.
    • Agile
      4
    • Books
      5
    • Environment
      1
    • Events and Conferences
      7
    • Featured
      9
    • Finance
      1
    • Leadership
      6
    • Life Experiences
      8
    • Robotics & AI
      1
    • Software
      30
    • System Administration
      2
  • Software
  • Leadership
  • Agile
  • Events
  • Other Topics
    • Finance
    • Robotics & AI
    • System Administration
    • Books
    • Life Experiences
    • Environment
  • Write and Earn
  • About Us
    • About Us
    • Our Contributors
    • Contact Us
    • Article Submission Guidelines
    • Logo demystified
Home » Software

Database Design Practices

Janeve George Posted On February 26, 2012
0
1.9K Views


0
Shares
  • Share On Facebook
  • Tweet It

This document lists guidelines for designing database schema. Major key concerns to be kept in mind are:

  • Performance
  • Portability
  • Data Integrity
  • Readability

These might have side-effects on each other, for example, avoiding usage of functions and stored procedures increases portability but restricts performance. The DB Designer must make the best of all possible choices to make a very reliable and robust database for an application.

Design for Performance

The database schema should be designed to perform the best. The turnaround time for queries should be minimum. Few steps that we could take in improving the database performance are:

  • Store data in the format it is expected to be used by the end-user. By doing so, we could directly work on the data retrieved and there is rarely any need to convert the format of the data.
  • Avoid the usage of NULLs wherever possible. ‘NULL’ is not a value that we store in the database column, it means that we do not know the value. Having ‘NULL’ values basically violates relational database concepts. It does not provide any relevant information + it adds to the processing of determining is the column is NULLable + other weird issues. Sometimes it might also increase the IO overhead. Another issue is whenever you execute a query with ‘IS NOT NULL’ or ‘IS NULL’ where clause, the database performs a sequential scan on the table because ‘NULL’ cannot be indexed.. ouch… It would be better if we use some explicit value like ‘NA’ or ‘NULL’ (as literal string ofcourse), etc.
  • Use ‘vharchar’ instead of ‘char’ if a column needs to be nullable or if the size of the value is variable. By doing so, we could save on space hence IO time.
  • Every base tables should have a primary key. Primary keys improves the performance (if there is huge number of records). You may use a self incrementing primary key if necessary, but it is advisable to use natural keys wherever possible.
  • Design database for minimal user contention. Almost all applications that uses these databases are multi-threaded. Wiping out user contention will be impossible. but we have to try and minimize the contention as much as possible through proper database design. This would improve the performance considerably.
  • Use indexes wherever possible. Indexes are the best and most commonly used way to improve the performance of select queries. The drawback of indexes is that it turn out to be heavy for modify queries like Insert, Update and Delete. I would recommend to use minimal indexes for transaction tables. In case you are creating multi-key index, it is best to have the most unique column to come first.
  • Archive records in transactional tables often and in small chunks. All records that have served the purpose should be archived so that it gives better indexing and reduces IO wait.

Design for Portability

The database schema should be portable. There are numerous database vendors in the market. At some point in time we might require to migrate the data to a different database platform. The schema should be designed in such a way that the migration would not become a pain at any time in the future if necessary. Few steps that we could take are:

  • Environment and future expansion must be kept in mind. As an extension to develop a scalable application, we should have a scalable platform that supports it. The database schema should be scalable.
  • Use data types supported by all databases. Some database might provide some special data types. It is better to avoid these data types which might be specific to a particular database vendor. Most common data types are: integer, char, char varying, float, decimal.
  • Use size and lengths of entities such that it is supported by all databases. It is best if we use the entity names and data type size that is supported by all database.
    • Max. database name Length = 8
    • Max. table name length = 30
    • Max. column name length = 30
    • Max. index name length = 30
    • Max. char size limit = 254
    • Max. varchar size = 4000
    • Max. columns per table = 1000
  • While designing keep in mind the most commonly used database available in the market. Some of the most popular database used are:
    • Oracle
    • MS SQL Server
    • PostgreSQL
    • MySQL
    • Sybase
  • Using vendor specific features and data types should be documented properly.

Design for Data Integrity

Data integrity is an important aspect that database designers ponder over.

  • Strive for a ‘Star’ schema design when optimizing for retrievals.
  • You should add standard columns to all tables. These columns may be omitted in performance sensitive tables to minimize overhead and performance impact. Following columns supports data integrity of the database tables.
    • updated_by(optional), updated_at
    • created_by(optional), created_at
    • lock_version (for optimistic locking)
  • Always design to third normal form. De-normalize a table only as a last resort.
  • Every base tables should have a primary key. It not only helps in performance, but also in data integrity.
  • Use Foreign key for referential Integrity.
    • Index both sides of foreign key
    • Use integer keys as foreign keys for faster retrieval
    • Foreign keys may slow down transactions
    • Minimize the number of columns that form a foreign key
  • Use History and Audit Tables.
    • For Master tables plan on optional audit or history tables, but not both per master table
    • History Table – History tables record row changes. The current row is inserted into the history table before making any update on the row of the master table. History tables will have exactly identical columns as that of master table.
    • Audit Table – Audit tables record columns changes. Each column changed in master creates one audit record. Columns in audit table are: column_name, before_value, after_value, updated_at, updated_by.
    • Do not use these for tables that are updated very frequently like transaction tables.

Design for Readability

The database schema should document itself. We have to provide appropriate nomenclatures for the entities so that they convey what the represent through their names. Some naming conventions are:

  • Names should be readable, not cryptic
    • e.g: students, teachers, etc
  • Minimize use of abbreviations or use standardized abbreviations
  • Separate words with ‘_’
  • Use only lower case for table names.
  • Do not use any special character
  • Do not use names that are used as special keywords in any of major RDBMS. See attachment DBReservedWords.
  • Table Naming Convention
    • Pluralize – use domains, not domain
    • Suffix history tables with ‘history’ and audit tables with ‘audit’. Example history of the table called application_preferences should be stored in application_preferences_history.
  • Column Naming Convention
    • Do not use different names for columns which are very similar. If you are using state in a particular table say country, you should not use any other alterations, like province, anywhere in the  database.
    • Use ‘id’ as the name of primary key on all tables
    • Use <tablename>_id for foreign key columns
Post Views: 1,911
0
Shares
  • Share On Facebook
  • Tweet It




Author

Janeve George

A Technology Leader, Software Engineer, and Agile Methodologies enthusiast. Currently, working as Lead Software Development with Zeta Suite. He has more than 1.8 decades of experience spanning different verticals predominated by hosting, cloud, and media delivery technologies.

Database Usage Practices
Read Next

Database Usage Practices

  • Follow @MeJaneve
    Janeve.Me
  • Categories

    Browse through following categories for articles written by professionals.
    • Agile
      4
    • Books
      5
    • Environment
      1
    • Events and Conferences
      7
    • Featured
      9
    • Finance
      1
    • Leadership
      6
    • Life Experiences
      8
    • Robotics & AI
      1
    • Software
      30
    • System Administration
      2

  • Popular Posts

  • Recent Posts

    • 3Es to Supercharge your Career Journey and Performance Review Ratings
    • Java Asynchronous Programming using CompletableFuture - Part 1
    • The Java Stream API
    • Functional Interfaces in Java
  • Keep In Touch

    Follow us on social media to get latest articles on Programming, System Architecture, Agile Development Methodologies, Product and Project Management, Personal Development, BigData, Robotics, Upcoming Events and more...


Copyright © 2020 | Janeve.Me. All rights Reserved.
Press enter/return to begin your search