Thursday, November 28, 2013

An Overview of the Rebuild Index Task


Over time, as indexes are subjected to data modifications, index fragmentation can occur in the form of:

  • Gaps in data pages – this creates wasted empty space.
  • Logical fragmentation – this is a logical ordering of the data that no longer matches the physical ordering of the data

Gaps in data pages can reduce the number of rows that can be stored in SQL Server's data cache, leading to increased disk I/O. Logical fragmentation can cause extra disk activity as the disk subsystem has to work harder to find the data on disk and move it to the data cache. The only way to remove wasted space and logical fragmentation is to rebuild or reorganize the indexes on a regular basis. This is one of the most useful and powerful maintenance tasks that you can perform on a database, because the steps it performs can greatly boost database performance.

If you configure the Rebuild Index task using all the default settings, then when the task runs, it physically drops and rebuilds every index in your selected databases, removing both wasted empty space and logical fragmentation. As a byproduct of rebuilding all the indexes, index and column statistics are also recreated anew and fully updated.

Free space options


These options assumes that each of your tables has a clustered index, and is not a heap. A heap is a table without a clustered index. As a best practice, all tables should have a clustered index.

The first two choices are listed under Free space options and include "Reorganize pages with the default amount of free space" and "Change free space per page percentage to", You can choose one option or the other, but not both.

These options can have a significant impact on the Rebuild Index task.

The default option of "Reorganize pages with the default amount of free space" is a little confusing. First, it says reorganize, not rebuild. Remember, we are working on the Rebuild Index task, not the Reorganize Index task. Don't let this confuse you into thinking that selecting this option reorganizes indexes, rather than rebuild them. It does the latter, and this is actually a mistake in the user interface. It really should say "rebuild," not "reorganize".

The second part of this first option says "default amount of free space". What does that mean? When creating a SQL Server index, there is an option to create the index with a certain amount of free space on each data page. This setting is known as the fill factor. If an index is created without specifying a fill factor, then the default fill factor is used, which is 100 (actually 0, but 0 means the same thing as a 100% fill factor). This means that no free space is created for the data pages of an index.

The potential problem with a fill factor of 100 arises when data is added to a table as a result of an INSERT or UPDATE, and a new row needs to be added to a data page. If there is no room for it, then SQL Server will reorganize the rows, moving some of the rows onto a new data page, and leaving some on the old data page. This is known as page splitting. While page splitting is a normal SQL Server activity, too much page splitting can cause performance issues because it results in index fragmentation, the very thing we are trying to eliminate with the Rebuild Index task. In order to mitigate this problem, DBAs often decrease the fill factor to, say, 90, meaning that data pages be 90% full, leaving 10% free space.

Advanced Options


The two options under Advanced options are shown in Figure

The Advanced options section of the Define Rebuild Index Task screen

By default, both options are turned off. The first one is "Sort results in tempdb". If you don't choose this option, then when an index is rebuilt, all of the rebuilding activity is performed in the database file itself. If you select the "Sort results in tempdb" option, then some of the activity is still performed in the database, but some of it is also performed in tempdb. The benefit is that this can often speed up the index rebuild process. The drawback is that it also takes up a little more overall disk space, as space in tempdb is required, in addition to some space in the database where the indexes are being rebuilt.

The benefit you get out of this option depends on where tempdb is located on your server. If tempdb is located on the same drive or array as the database file that is having its indexes rebuilt, then the benefit may be minimal, if any. However, if tempdb is located on its own isolated drive spindles, then the benefit will be greater because there is less disk I/O contention.

So, should you use this option? If your databases are small, you probably won't be able to discern much performance benefit, but if you have large databases, with large tables and indexes, and if tempdb is located on its own spindles, then turning this feature on will probably boost index rebuild performance.

The second advanced option is one we've discussed previously: "Keep index online while reindexing". This option is only available if you have the Enterprise Edition of SQL Server. By selecting this option, index rebuilding becomes an online, rather than offline task. If you are using Enterprise Edition, you will probably want to select this option. I say "probably" because there are pros and cons of performing an online index rebuild;

No comments:

Post a Comment