SQL and MSBI Tips and Tricks: Memory and IO in SSIS and SQL

1.SSIS is an in-memory pipeline, so ensure that all transformations occur in memory.

The purpose of having Integration Services within SQL Server features is to provide a flexible, robust pipeline that can efficiently perform row-by-row calculations and parse data all in memory.

While the extract and load phases of the pipeline will touch disk (read and write respectively), the transformation itself should process in memory. If transformations spill to disk (for example with large sort operations), you will see a big performance degradation. Construct your packages to partition and filter data so that all transformations fit in memory.

A great way to check if your packages are staying within memory is to review the SSIS performance counter Buffers spooled, which has an initial value of 0; above 0 is an indication that the engine has started swapping to disk.

2.Plan for capacity by understanding resource utilization.

SQL Server Integration Services is designed to process large amounts of data row by row in memory with high speed. Because of this, it is important to understand resource utilization, i.e., the CPU, memory, I/O, and network utilization of your packages.
CPU Bound

Seek to understand how much CPU is being used by Integration Services and how much CPU is being used overall by SQL Server while Integration Services is running. This latter point is especially important if you have SQL Server and SSIS on the same box, because if there is a resource contention between these two, it is SQL Server that will typically win – resulting in disk spilling from Integration Services, which slows transformation speed.

The perfmon counter that is of primary interest to you is Process / % Processor Time (Total). Measure this counter for both sqlservr.exe and dtexec.exe. If SSIS is not able to drive close to 100% CPU load, this may be indicative of:

Application contention: For example, SQL Server is taking on more processor resources, making them unavailable to SSIS.
Hardware contention: A common scenario is that you have suboptimal disk I/O or not enough memory to handle the amount of data being processed.
Design limitation: The design of your SSIS package is not making use of parallelism, and/or the package uses too many single-threaded tasks.

Network Bound

SSIS moves data as fast as your network is able to handle it. Because of this, it is important to understand your network topology and ensure that the path between your source and target have both low latency and high throughput.

The following Network perfmon counters can help you tune your topology:

Network Interface / Current Bandwidth: This counter provides an estimate of current bandwidth.
Network Interface / Bytes Total / sec: The rate at which bytes are sent and received over each network adapter.
Network Interface / Transfers/sec: Tells how many network transfers per second are occurring. If it is approaching 40,000 IOPs, then get another NIC card and use teaming between the NIC cards.

These counters enable you to analyze how close you are to the maximum bandwidth of the system. Understanding this will allow you to plan capacity appropriately whether by using gigabit network adapters, increasing the number of NIC cards per server, or creating separate network addresses specifically for ETL traffic.
I/O Bound
If you ensure that Integration Services is minimally writing to disk, SSIS will only hit the disk when it reads from the source and writes to the target. But if your I/O is slow, reading and especially writing can create a bottleneck.

Memory bound
A very important question that you need to answer when using Integration Services is: "How much memory does my package use?"

The key counters for Integration Services and SQL Server are:

Process / Private Bytes (DTEXEC.exe): The amount of memory currently in use by Integration Services. This memory cannot be shared with other processes.
Process / Working Set (DTEXEC.exe): The total amount of allocated memory by Integration Services.
SQL Server: Memory Manager / Total Server Memory: The total amount of memory allocated by SQL Server. Because SQL Server has another way to allocate memory using the AWE API, this counter is the best indicator of total memory used by SQL Server.
Memory / Page Reads / sec: Represents to total memory pressure on the system. If this consistently goes above 500, the system is under memory pressure.

3.　Baseline source system extract speed.

Understand your source system and how fast you extract from it. After all, Integration Services cannot be tuned beyond the speed of your source – i.e., you cannot transform data faster than you can read it.

http://blogs.msdn.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-50-01/1856.Figure-1-_2D00_-Baseline-Source-System.jpg

Measure the speed of the source system by creating a very simple package reading data from your source with the a destination of "Row Count":

Execute the package from the command line (DTEXEC) and measure the time it took for it to complete its task. Use the Integration Services log output to get an accurate calculation of the time. You want to calculate rows per second:

Rows / sec = Row Count / TimeData Flow

Based on this value, you now know the maximum number of rows per second you can read from the source – this is also the roof on how fast you can transform your data. To increase this Rows / sec calculation, you can do the following:

• Improve drivers and driver configurations: Make sure you are using the most up-to-date driver configurations for your network, data source, and disk I/O. Often the default network drivers on your server are not configured optimally for the network stack, which results in performance degradations when there are a high number of throughput requests. Note that for 64-bit systems, at design time you may be loading 32-bit drivers; ensure that at run time you are using 64-bit drivers.

• Start multiple connections: To overcome limitations of drivers, you can try to start multiple connections to your data source. As long as the source can handle many concurrent connections, you may see an increase in throughput if you start several extracts at once. If concurrency is causing locking or blocking issues, consider partitioning the source and having your packages read from different partitions to more evenly distribute the load.

• Use multiple NIC cards: If the network is your bottleneck and you’ve already ensured that you’re using gigabit network cards and routers, then a potential solution is to use multiple NIC cards per server. Note that you will have to be careful when you configure multiple NIC environments; otherwise you will have network conflicts.

4. Tune your network.

A key network property is the packet size of your connection. By default this value is set to 4,096 bytes. This means a new network package must be assemble for every 4 KB of data. As noted in

SqlConnection.PacketSize Property in the .NET Framework Class Library, increasing the packet size will improve performance because fewer network read and write operations are required to transfer a large data set.

http://blogs.msdn.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-50-01/8015.Figure-3-_2D00_-tune-your-nextwork.jpg

If your system is transactional in nature, with many small data size read/writes, lowering the value will improve performance.

Since Integration Services is all about moving large amounts of data, you want to minimize the network overhead. This means that the value 32K (32767) is the fastest option. While it is possible to configure the network packet size on a server level using sp_configure, you should not do this. The database administrator may have reasons to use a different server setting than 32K. Instead, override the server settings in the connection manager as illustrated below.

Another network tuning technique is to use network affinity at the operating system level. At high throughputs, you can sometimes improve performance this way.

For the network itself, you may want to work with your network specialists to enable jumbo frames to increase the default payload of 1,500 bytes to 9,000 bytes. By enabling jumbo frames, you will further decrease the amount of network operation required to move large data sets.

SSIS and Memory

One thing that you're bound to run into when doing SSIS is memory issues. Buffer overflows. Virtual memory out of space.

Although I'm no expert in it, I can offer a few tips from things I've ran into.

1) If you're running out of physical memory, increase the pagefile amount. (Windows 7)
Right click on "My Computer" --> "Properties"
Select "Advanced System Settings" option on the left hand side
On the "Advanced" tab, under the "Performance" group box, click [Settings]
Click on the "Advanced" tab
On the "Virtual memory" group box, click [Change...]
Uncheck the "Automatically managed paging files for all drives" option as this puts a smaller page file
Under "Custom Size", input the inital and max size. There is a recommended amount. I'd do 10-50% over the recommended amount. Note: Make sure that you have enough disk space for your file.
Click [Set]
Click [OK]
Restart your system

2) SQL Server likes to eat up memory. It's a smart thing to set the maximum amount that it can consume, as it will consume everything until it's reached it's max - only to be reset when the service is restarted or the computer is rebooted.
Open up SSMS.
Right click on [Server Name] --> Properties
Click on the "Memory" tab
Set your max server memory to something manageable. If you have 8gb in your system, you don't want the max server memory to be 8gb since your os needs memory to also manage and there will be slowdown issues because of the imminent paging that will happen. I set it to 50% of my total memory. Once it reaches 50%, it will stay there regardless if it is currently using the entire space or not. Once it's reached the allocated amount. It stays. There - I've said the same thing 5 times. You get the idea.

What are the different properties which control the number of rows to be stored in the data buffers?

Each Data Flow Task has two properties (DefaultMaxBufferRows and DefaultMaxBufferSize) which control the number of rows which can be stored in single buffer and how many buffers can be created. The size of each row plays a pivotal role in deciding the number of rows which can be stored in a single buffer. DefaultMaxBufferRows property has default value of 10,000 rows and specifies the maximum number of rows that can be stored in each individual buffer.
The DefaultBufferSize property has a default value of 10 MB and specifies the maximum size of each individual buffer. Please note maximum and minimum size of each individual buffer is also constrained by two internal properties: MinBufferSize = 64 KB and MaxBufferSize = 100MB. As a best practice, you should configure these properties in such a way, that SSIS creates a small number of large buffers instead of creating a large number of small buffers. Also keep in mind, you should not make these values too large, especially if you minimal memory on the system, to a point where SSIS starts swapping/spooling data to disk. For more information click here.

How does BLOB data get stored in the data buffer and what happens if SSIS does not find enough buffers to hold the BLOB data?

As long as the BLOB data can fit in the data buffers (created out of memory) it is stored in the data buffers. However, the size of BLOB data is often too large to fit in data buffers, so in this circumstance SSIS spools the BLOB data to the file system at the default location or at a specified location.
Any data, including BLOB data, which cannot be stored in the data buffers, because of memory scarcity or memory pressure, is spooled to the file system.

When and where are data buffers are spooled to disk? How can this issue be avoided?

Often times, the size of BLOB data is too large to fit in the data buffers, so SSIS spools the BLOB data to the file system at the default location or at a specified location. However, anytime data cannot be stored in the data buffers because of memory scarcity or memory pressure, the data is spooled to the file system.
By default SSIS derives the physical locations to store buffer overflow data and BLOB overflow data from the TEMP/TMP environment variable. You can change this default behavior by changing these two properties of data flow task:

BufferTempStoragePath
BLOBTempStoragePath.

Note - sometimes spooling causes your system to run slow because of the input/output involved. As such, you should try to optimize the data in data buffers by adding more RAM or removing asynchronous transformations to the extent possible. If this is not possible, then try to specify the physical location for the spooling to the fastest drives that you have available. You can learn more this here.

What is parallel processing in SSIS and how do you control it?

One of the biggest benefits of using SSIS is it's ability to process data in parallel. You can have multiple tasks run in parallel for your ETL operations and utilize the power of your today's multi-core/multi-processer machines.
Each SSIS package has MaxConcurrentExecutables property which has its default value as -1; which means the maximum number of tasks that can be executed is equal to the total number of processors on the machine plus two. For example, on a four processor machine can execute 6 tasks in parallel in a package with this default property value. You can change this default value to an absolute number to indicate absolute number of tasks that can be executed in parallel.
Each Data flow task has the EngineThreads property which specifies the total number of threads that can be created for executing the data flow task. In SQL Server 2008, the default value for this property is 10, which means 10 threads can be allocated at one time to execute different execution trees of the data flow task in parallel.

SQL and MSBI Tips and Tricks

Thursday, November 28, 2013

Memory and IO in SSIS and SQL