Working With Large Databases

Have you ever noticed that servers with small databases never have performance problems? That’s because SQL Server is such a robust and capable product that it can mask database design and query implementation problems – up to a point.

The point where hidden problems become visible is typically when a database reaches a certain size. That tipping point varies from server to server, database to database, query to query, and application to application. But it is there, in every production environment.

Why?

Because all database and application designs, no matter how scalable, have paradigm limits. Designers may be excellent at anticipating the limits of the database they designed to support their application, but if a database grows large enough, users will eventually experience performance problems.

Here are some of the ways in which problems related to database size can be overcome:

  1. More RAM. This always helps a production environment, but it may not completely solve performance problems.
  2. A reduction in the size of large tables, achieved by trimming or archiving old data (see the T-SQL sketch after this list).
  3. More efficient query and supporting index designs.
  4. A completely new database and application design.
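
To make option 2 concrete, here is a minimal sketch of a batched archive-and-trim pass. The table names (dbo.OrderHistory, dbo.OrderHistoryArchive), the OrderDate column, and the three-year retention cutoff are all hypothetical; the OUTPUT ... INTO clause also requires that the archive table match the source's column layout and have no triggers or foreign keys:

    -- Hypothetical schema and retention period; adjust to your environment.
    -- Deleting in small batches keeps lock escalation and log growth in check.
    DECLARE @Cutoff date = DATEADD(YEAR, -3, GETDATE());

    WHILE 1 = 1
    BEGIN
        DELETE TOP (5000) src
        OUTPUT deleted.* INTO dbo.OrderHistoryArchive  -- archived copy travels with the delete
        FROM dbo.OrderHistory AS src
        WHERE src.OrderDate < @Cutoff;

        IF @@ROWCOUNT = 0 BREAK;  -- no rows older than the cutoff remain
    END;

For pure trimming, drop the OUTPUT ... INTO clause; on a large table, the batching loop is the part that matters.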

I once read an article stating that Google had had to redesign and rescale its database designs and implementations, along with its applications, three times over a period of years to accommodate the massive data growth its databases had experienced. The company literally outgrew every design paradigm it created.

But database and/or application redesign is very expensive and not always an option, so other methods of dealing with large database size should be explored first.

A Case in Point:

I once worked on a database that, in addition to having some serious design flaws, exceeded 1.46 terabytes in size. Every aspect of using and maintaining this database presented challenges, and the daily full database backup alone took over 12 hours.
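
As an aside, long backup windows like this one are commonly attacked with SQL Server's native backup compression (available since SQL Server 2008) and by striping the backup across multiple files. The sketch below is a generic illustration, not necessarily what was used in this case; the database name and file paths are placeholders:

    -- Placeholder database name and paths; the stripe count depends on your I/O layout.
    BACKUP DATABASE [BigDb]
    TO  DISK = N'E:\Backup\BigDb_1.bak',
        DISK = N'F:\Backup\BigDb_2.bak',
        DISK = N'G:\Backup\BigDb_3.bak',
        DISK = N'H:\Backup\BigDb_4.bak'
    WITH COMPRESSION, CHECKSUM, STATS = 5;  -- progress reported every 5 percent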

Ultimately, after a thorough analysis, I recommended that one of the best options for improving server performance was to reduce and manage the size of the largest tables in the database. This was accomplished through a combination of archiving and trimming old data. Other options and solutions were also implemented.
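
For readers who want to run a similar analysis on their own systems, a query along these lines reports the largest tables in a database. It assumes the VIEW DATABASE STATE permission, and the sizes it returns are approximate:

    -- Approximate reserved size and row count per user table, largest first.
    SELECT
        s.name + N'.' + t.name                  AS table_name,
        SUM(ps.reserved_page_count) * 8 / 1024  AS reserved_mb,  -- 8 KB pages -> MB
        SUM(CASE WHEN ps.index_id IN (0, 1)     -- count rows only in the heap or
                 THEN ps.row_count ELSE 0 END)  AS total_rows    -- clustered index
    FROM sys.dm_db_partition_stats AS ps
    JOIN sys.tables  AS t ON t.object_id = ps.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    GROUP BY s.name, t.name
    ORDER BY reserved_mb DESC;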

Today, that database is at a very manageable 700+ GB, even with its design flaws. Every aspect of performance associated with accessing the database is now excellent, and report queries once thought impossible to write and execute now run daily on the server supporting the database. Full database backups now complete in a very manageable 4+ hours.

I have considerable experience solving problems associated with maintaining and accessing large and very large databases, and I have yet to encounter a situation for which I could not provide good solutions.

If you need help with performance tuning your SQL Server, contact me to schedule a free consultation.