Thursday, December 26, 2019

SSISDB is running out of space: Shrinking the SSISDB


Shrinking the SSISDB

SQL Server Integration Services (SSIS) is a great tool to move data around. Not only is it fast, it has many useful parts that make data migration a simple task. The same is true for production use. Cleaning up its own database is a feature that most applications completely ignore, yet everything is set up so that SSIS can run for a long time without user intervention.
I was therefore surprised when our backup disk ran out of space. The SSIS database grew in a matter of days from ~7 GB to 70 GB. A quick look around showed that it stored 60 million events, which was unexpected and problematic. A bit of googling turned up the idea of reducing the retention time. There are many examples of how to do that. Unfortunately, the one I liked most had all its scripts in screenshots, which makes copy and paste impossible. This post fixes that, explains the reason for the different steps, and points out possible pitfalls and how to prevent them.

Before you Start

Don’t try anything without a backup. If something goes wrong, the backup may be the only thing that stands between you and hours of work to get SSIS running again.
Next, limit the auto-growth of your SSIS database and its log files. The shrinking will create a lot of log entries, and if you try to reduce too much at once you fill up your disk again and the job throws an exception. Set the maximum size for the log files near the maximum your disk can hold. You will shrink them later, but until then every gigabyte helps.
The last preparation step is to disable all the jobs that use SSIS. It’s simpler to shrink the database when you aren’t constantly blocked by a running SSIS job. You can deactivate the jobs in the overview of the jobs section in the SQL Server Agent:
Uncheck all the SSIS jobs and click OK. To turn them back on later, simply check them again and click OK to save those changes.
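If you prefer a script over the dialog, jobs can also be disabled and re-enabled with msdb's sp_update_job; the job name below is a placeholder for your own jobs:

```sql
-- Disable an SSIS job (replace the name with your own job's name)
EXEC msdb.dbo.sp_update_job @job_name = N'Load DWH (SSIS)', @enabled = 0;

-- Turn it back on when the shrinking is done
EXEC msdb.dbo.sp_update_job @job_name = N'Load DWH (SSIS)', @enabled = 1;
```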

Start Shrinking

Right after you have your backup and have limited the auto-growth of the database files, change the recovery mode of the SSISDB to SIMPLE. In SIMPLE mode there are no transaction log backups and you are unable to recover the database to a specific point in time. On the plus side, you don’t need the space for the archived transaction log backups.
Keep a note of the current recovery mode so that you can switch back to the right one when you are done.

USE [master]
GO

SELECT name, recovery_model_desc
FROM sys.databases 
WHERE name = 'SSISDB'
GO

ALTER DATABASE [SSISDB] SET RECOVERY SIMPLE WITH NO_WAIT
GO


Now check what values are set for the catalog properties:

USE SSISDB
GO

SELECT * FROM [catalog].catalog_properties

The two Boolean fields (OPERATION_CLEANUP_ENABLED and VERSION_CLEANUP_ENABLED) must be set to True. If they are False, the clean-up script will not do any work.
The next step is to reduce the retention time, which is controlled by the property RETENTION_WINDOW. This value specifies how many days the data is kept around. The default is to keep data for a year (365 days). If you want to reduce it to 7 days, you may be tempted to do it in one step. That can work, but most likely it will run for hours and then fail. To reduce the risk of spending hours without any progress, try smaller steps. Reduce it in the first attempt to 200 days. If that works, keep reducing it by 50 days. If it isn’t working, reduce the retention time by 1-5 days per step.

UPDATE SSISDB.[catalog].[catalog_properties]
SET property_value = 200
WHERE property_name = 'RETENTION_WINDOW'


With the retention time reduced, it is now time to start the clean-up job: 

EXEC [internal].[cleanup_server_retention_window]

If the clean-up was successful, repeat the process with a lower retention time until the database reaches a size you can work with.
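The reduce-and-clean cycle can be sketched as a loop. This is only a sketch — in practice it is safer to run each step manually and watch the log file size in between:

```sql
-- Step the retention window down in chunks of 50 days until 7 days remain,
-- running the clean-up job after each reduction
DECLARE @days INT;

SELECT @days = CAST(property_value AS INT)
FROM SSISDB.[catalog].[catalog_properties]
WHERE property_name = 'RETENTION_WINDOW';

WHILE @days > 7
BEGIN
    SET @days = CASE WHEN @days - 50 > 7 THEN @days - 50 ELSE 7 END;

    UPDATE SSISDB.[catalog].[catalog_properties]
    SET property_value = @days
    WHERE property_name = 'RETENTION_WINDOW';

    EXEC SSISDB.[internal].[cleanup_server_retention_window];
END
```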

How big is your Database?

Looking at the size of the different database files gives you only half the picture: that is just the space it takes on disk. The database keeps free space allocated to improve overall performance. When you delete a great number of records, that no-longer-needed space isn’t returned to the operating system. SQL Server has built-in reports that show how much of the file size is really used. You can find those reports in SQL Server Management Studio on the context menu of your database:
The report looks something like the next picture, where the free space is marked in green:
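If you prefer a query over the report, the same numbers can be pulled with FILEPROPERTY; a sketch for the current database:

```sql
-- Allocated vs. actually used space per file of the current database
USE SSISDB;
GO
SELECT name,
       size / 128 AS allocated_mb,                        -- size is counted in 8 KB pages
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM sys.database_files;
```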

Shrinking the Database

Now we can finally come to the part where we can give disk space back to the operating system. SQL Server Management Studio offers a simple way to do that.
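The same can be scripted with DBCC SHRINKFILE. The logical file names and target sizes below are examples, not values you can rely on — check sys.database_files for your installation first:

```sql
USE SSISDB;
GO
-- Check the logical file names first; they vary between installations
SELECT name, type_desc, size / 128 AS size_mb FROM sys.database_files;

-- Shrink the data and log files to a target size in MB
DBCC SHRINKFILE (N'data', 8192);
DBCC SHRINKFILE (N'log', 1024);
```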

Go Back to the Initial Recovery Mode

Once the size is successfully reduced, you can go back to the recovery mode you had before you started.

ALTER DATABASE [SSISDB] SET RECOVERY FULL WITH NO_WAIT
GO


Next, make a backup of the now-shrunk database. If something else goes wrong, you won’t need to redo the whole work of reducing the retention time.
Don’t forget to turn your SSIS jobs back on.

Tuesday, December 17, 2019

List primary key, foreign key and constraint name relationship in Database

----// For all tables in a database, below is the query.

-- Note: sysforeignkeys/sysobjects/syscolumns are deprecated compatibility views;
-- the sys.foreign_keys queries further down use the current catalog views.
SELECT o2.name AS Referenced_Table_Name,
       c2.name AS Referenced_Column_As_FK,
       o1.name AS Referencing_Table_Name,
       c1.name AS Referencing_Column_Name,
       s.name AS Constraint_Name
FROM sysforeignkeys fk
INNER JOIN sysobjects o1 ON fk.fkeyid = o1.id
INNER JOIN sysobjects o2 ON fk.rkeyid = o2.id
INNER JOIN syscolumns c1 ON c1.id = o1.id AND c1.colid = fk.fkey
INNER JOIN syscolumns c2 ON c2.id = o2.id AND c2.colid = fk.rkey
INNER JOIN sysobjects s ON fk.constid = s.id
ORDER BY o2.name

----// There's also a command:

exec sp_fkeys @pktable_name ='Product',
        @pktable_owner ='Production'

----// The same information from the newer catalog views, for all tables in a database:

select schema_name(tab.schema_id) + '.' + tab.name as [table],
        col.column_id,
        col.name as column_name,
        case when fk.object_id is not null then '>-' else null end as rel,
        schema_name(pk_tab.schema_id) + '.' + pk_tab.name as primary_table,
        pk_col.name as pk_column_name,
        fk_cols.constraint_column_id as no,
        fk.name as fk_constraint_name
from sys.tables tab
inner join sys.columns col on col.object_id = tab.object_id
left outer join sys.foreign_key_columns fk_cols on fk_cols.parent_object_id = tab.object_id
                                                and fk_cols.parent_column_id = col.column_id
left outer join sys.foreign_keys fk on fk.object_id = fk_cols.constraint_object_id
left outer join sys.tables pk_tab on pk_tab.object_id = fk_cols.referenced_object_id
left outer join sys.columns pk_col on pk_col.column_id = fk_cols.referenced_column_id
                                    and pk_col.object_id = fk_cols.referenced_object_id
order by schema_name(tab.schema_id) + '.' + tab.name,
    col.column_id

----// To list each foreign key column with its referenced primary key column, filtered to a single schema:

select
    schema_name(pk_tab.schema_id) + '.' + pk_tab.name as primary_table,
    pk_col.name as pk_column,
    schema_name(fk_tab.schema_id) + '.' + fk_tab.name as foreign_table,
    '>-' as rel,
    D.[name] as fk_column,
    fk.name as fk_constraint_name
from sys.foreign_keys fk
inner join sys.tables fk_tab on fk_tab.object_id = fk.parent_object_id
inner join sys.tables pk_tab on pk_tab.object_id = fk.referenced_object_id
cross apply
(
    -- one row per column of the foreign key
    select col.[name], fk_c.referenced_column_id, fk_c.referenced_object_id
    from sys.foreign_key_columns fk_c
    inner join sys.columns col on fk_c.parent_object_id = col.object_id
                              and fk_c.parent_column_id = col.column_id
    where fk_c.parent_object_id = fk_tab.object_id
      and fk_c.constraint_object_id = fk.object_id
) D
inner join sys.columns pk_col on pk_col.column_id = D.referenced_column_id
                             and pk_col.object_id = D.referenced_object_id
where schema_name(pk_tab.schema_id) = 'BEY00013'   -- replace with your schema
order by schema_name(fk_tab.schema_id) + '.' + fk_tab.name,
    schema_name(pk_tab.schema_id) + '.' + pk_tab.name


----// Modify foreign key constraints

ALTER TABLE TableB
DROP CONSTRAINT MyForeignKey
GO

ALTER TABLE TableB
ADD CONSTRAINT MyForeignKey FOREIGN KEY (ForeignKeyColumn)
REFERENCES TableA (PrimaryKeyColumn)
ON DELETE CASCADE
GO

Thursday, November 14, 2019

Convert GETDATE() into local time


The statement below returns the server's time zone:

DECLARE @TimeZone VARCHAR(50)

EXEC MASTER.dbo.xp_regread 'HKEY_LOCAL_MACHINE','SYSTEM\CurrentControlSet\Control\TimeZoneInformation','TimeZoneKeyName',@TimeZone OUT

SELECT @TimeZone AS [Server time zone]


SELECT GETDATE(), GETUTCDATE()

The statements below interpret GETDATE() as a time in the named zone and convert it to UTC:

SELECT GETDATE() AT TIME ZONE 'Central Europe Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'W. Australia Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'Cen. Australia Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'AUS Central Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'E. Australia Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'AUS Eastern Standard Time' AT TIME ZONE 'UTC'

SELECT GETDATE() AT TIME ZONE 'India Standard Time' AT TIME ZONE 'UTC'
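To go the other way round — express the current UTC time in a local zone — start from UTC and chain a second AT TIME ZONE:

```sql
SELECT GETUTCDATE() AT TIME ZONE 'UTC' AT TIME ZONE 'India Standard Time'

SELECT GETUTCDATE() AT TIME ZONE 'UTC' AT TIME ZONE 'Central Europe Standard Time'
```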

Wednesday, October 23, 2019

Overview on SQL Server Fragmentation


SQL Server stores data on 8KB pages. When we insert data into a table, SQL Server will allocate one page to store that data unless the data inserted is more than 8KB, in which case it would span multiple pages. Each page is assigned to one table: if we create 10 tables, each gets its own pages.
As you insert data into a table, the data will go to the transaction log file first. The transaction log file is a sequential record meaning as you insert, update, and delete records the log will record these transactions from start to finish. The data file on the other hand is not sequential. The log file will flush the data to the data file creating pages all over the place.
Now that we have an idea of how data is stored, what does this have to do with fragmentation?
There are two types of fragmentation: Internal Fragmentation and External Fragmentation

SQL Server Internal Fragmentation

SQL Server Internal Fragmentation is caused by pages that have too much free space. Let's pretend at the beginning of the day we have a table with 40 pages that are 100% full, but by the end of the day we have a table with 50 pages that are only 80% full because of various delete and insert statements throughout the day. This causes an issue because now when we need to read from this table we have to scan 50 pages instead of 40, which may result in a decrease in performance. Let's see a quick and dirty example.
Let's say I have the following table with a Primary Key and a non-clustered index on FirstName and LastName:
dbo.Person table with a Primary Key and a non-clustered index on FirstName and LastName
I'll talk about ways to analyze fragmentation later in this tip, but for now we can right click on the index, click Properties, and Fragmentation to see fragmentation and page fullness. This is a brand new index so it's at 0% fragmentation.
SQL Server Index Properties and Fragmentation Before Inserts
Let's INSERT 1000 rows into this table:


INSERT INTO Person VALUES
('Brady', 'Upton', '123 Main Street', 'TN', 55555)
GO 1000

Now, let's check our index again:
SQL Server Index Properties and Fragmentation After Inserts
You can see our index becomes 75% fragmented and the average percent of full pages (page fullness) increases to 80%. This table is still so small that 75% fragmentation would probably not cause any performance issues, but as the table increases in size and page counts increase you may see performance degrade. You can also see from the screenshot above that this table went from 0 pages to 4.

SQL Server External Fragmentation

External Fragmentation is caused by pages that are out of order. Let's pretend at the beginning of the day we have a perfectly ordered table. During the day we issue hundreds of update statements possibly leaving some empty space on one page and trying to fit space into other pages. This means our storage has to jump around to obtain the data needed instead of reading in one direction.


sys.dm_db_index_physical_stats – Introduced in SQL Server 2005, this dynamic management view (DMV) returns size and fragmentation information for the data and indexes of the specified table or view.

SELECT OBJECT_NAME(ips.OBJECT_ID)
 ,i.NAME
 ,ips.index_id
 ,index_type_desc
 ,avg_fragmentation_in_percent
 ,avg_page_space_used_in_percent
 ,page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
INNER JOIN sys.indexes i ON (ips.object_id = i.object_id)
 AND (ips.index_id = i.index_id)
ORDER BY avg_fragmentation_in_percent DESC 

Index Rebuild : This process drops the existing Index and Recreates the index.

USE AdventureWorks;
GO
ALTER INDEX ALL ON Production.Product REBUILD
GO

Index Reorganize : This process physically reorganizes the leaf nodes of the index.

USE AdventureWorks;
GO
ALTER INDEX ALL ON Production.Product REORGANIZE
GO


An index should be rebuilt when fragmentation is greater than 40%, and reorganized when fragmentation is between 10% and 40%. The rebuild process uses more CPU and locks database resources. The Developer and Enterprise editions of SQL Server offer the ONLINE option for index rebuilds; with ONLINE = ON the index stays available during the rebuild.
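The thresholds above can be turned into a small helper that generates the right statement per index. This is a sketch — review the generated statements before running them:

```sql
-- Generate REBUILD/REORGANIZE statements based on fragmentation
SELECT 'ALTER INDEX ' + QUOTENAME(i.name)
     + ' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(ips.object_id))
     + '.' + QUOTENAME(OBJECT_NAME(ips.object_id))
     + CASE WHEN ips.avg_fragmentation_in_percent > 40
            THEN ' REBUILD'
            ELSE ' REORGANIZE' END AS maintenance_statement
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
INNER JOIN sys.indexes i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10
  AND i.name IS NOT NULL   -- skip heaps
```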

Tuesday, September 10, 2019

How to check what is in SQL Server plan cache

What happens when a query is issued to SQL Server

In SQL Server, every query requires a query plan before it is executed. When you run a query for the first time, it gets compiled and a query plan is generated. This plan is then saved in the SQL Server plan cache.

The next time we run the same query, the cached query plan is reused. This means SQL Server does not have to create the plan again for the same query, so reusing a plan can improve performance.


How long a query plan stays in the plan cache depends, among other factors, on how often the plan is reused. The more often it is reused, the longer it stays in the cache.
How to check what is in SQL Server plan cache

To see what is in the SQL Server plan cache, we use the following three dynamic management views and functions:
1. sys.dm_exec_cached_plans
2. sys.dm_exec_sql_text
3. sys.dm_exec_query_plan

Use the following query to see what is in the plan cache

SELECT cp.usecounts, cp.cacheobjtype, cp.objtype, st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
ORDER BY cp.usecounts DESC

As you can see, the result set is sorted by the usecounts column in descending order, so the most frequently reused query plans appear at the top. The output of the above query on my machine is shown below.


Sql server query plan cache

The following table explains what each column in the result set contains:

Column        Description
usecounts     Number of times the plan is reused
objtype       Specifies the type of object
text          Text of the SQL query
query_plan    Query execution plan in XML format

To remove all elements from the plan cache use the following command
DBCC FREEPROCCACHE

In older versions of SQL Server, up to SQL Server 6.5, only stored procedure plans were cached. The plans for ad-hoc or dynamic SQL statements were not cached, so they were compiled every time. From SQL Server 7 onwards, plans for ad-hoc and dynamic SQL statements are cached as well.

Things to consider to promote query plan reusability

For example, when we execute the following query for the first time, it is compiled, a plan is created, and the plan is put in the cache.
Select * From Employees Where FirstName = 'Mark'

When we execute the same query again, SQL Server looks it up in the plan cache and, if a plan is available, reuses it instead of creating it again, which can improve the performance of the query. One important thing to keep in mind is that the cache lookup is by a hash value computed from the query text. If the query text changes even slightly, SQL Server will not be able to reuse the existing plan.

For example, if you include an extra space somewhere in the query or change the case, the query text hash will not match; SQL Server cannot find the plan in the cache and ends up compiling the query again and creating a new plan.

Another example: if you want the same query to find an employee whose FirstName is Steve instead of Mark, you would issue the following query
Select * From Employees Where FirstName = 'Steve'

Even in this case, since the query text has changed, the hash will not match; SQL Server cannot find the plan in the cache and compiles the query again, creating a new plan.
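You can watch this happen by filtering the plan-cache query from earlier. After running both SELECT variants you should find two separate cache entries; a sketch:

```sql
SELECT cp.usecounts, st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.text LIKE '%From Employees Where FirstName%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%'   -- exclude this query itself
```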

This is why it is very important to use parameterised queries, so that SQL Server can reuse cached query plans. With parameterised queries, SQL Server does not treat parameter values as part of the query text, so when you change the parameter values, it can still reuse the cached plan.

The following query uses parameters. So even if you change parameter values, the same query plan is reused.

Declare @FirstName nvarchar(50)
Set @FirstName = 'Steve'
Execute sp_executesql N'Select * from Employees where FirstName=@FN', N'@FN nvarchar(50)', @FirstName

One important thing to keep in mind is that, when you have dynamic sql in a stored procedure, the query plan for the stored procedure does not include the dynamic SQL. The block of dynamic SQL has a query plan of its own.

Summary: never concatenate user input values into strings to build dynamic SQL statements. Always use parameterised queries, which not only promote reuse of cached query plans but also prevent SQL injection attacks.

Sunday, September 8, 2019

Query for SQL Server SP Performance Indicator




SELECT TOP (20) 
  CASE WHEN database_id = 32767 THEN 'Resource' 
    ELSE DB_NAME(database_id)
  END AS DBName
      ,OBJECT_SCHEMA_NAME(object_id,database_id) AS [SCHEMA_NAME] 
      ,OBJECT_NAME(object_id,database_id)AS [OBJECT_NAME]
      ,cached_time
      ,last_execution_time
      ,execution_count
   ,total_worker_time
      ,total_worker_time / execution_count AS AVG_CPU
      ,total_elapsed_time / execution_count AS AVG_ELAPSED
      ,total_logical_reads / execution_count AS AVG_LOGICAL_READS
      ,total_logical_writes / execution_count AS AVG_LOGICAL_WRITES
      ,total_physical_reads  / execution_count AS AVG_PHYSICAL_READS
   FROM sys.dm_exec_procedure_stats 
ORDER BY execution_count DESC

Database backup history


We can also use the following script:

1) if you want to get the database file storage usage:
SELECT d.name, f.type_desc, f.size * 8 / 1024 AS [Size MB], f.state_desc  -- size is counted in 8 KB pages
FROM [sys].[master_files] f
JOIN [sys].[databases] d ON d.database_id = f.database_id
2) if you want to get the database backup storage usage:
SELECT SUBSTRING(a.database_name,1,35) AS 'Database',
DATEPART(YEAR,a.backup_start_date) AS 'year',
DATEPART(MONTH,a.backup_start_date) AS 'month' ,
AVG(CAST((a.backup_size /1073741824) AS DECIMAL (9,2)))AS 'Avg Gig' ,
AVG(CAST((a.backup_size /1048576) AS DECIMAL (9,2)))AS 'Avg MB' 
--,a.type
FROM  msdb.dbo.backupset a
JOIN  msdb.dbo.backupset b on a.server_name = b.server_name 
      AND a.database_name = b.database_name 
WHERE a.type = 'D'  
and b.type = 'D'         
  and a.database_name = 'MyDB_Test'
  --and a.backup_size > 1073741824   --   > 1 Gig
GROUP BY a.database_name,DATEPART(YEAR,a.backup_start_date)
  ,DATEPART(MONTH,a.backup_start_date)--,a.type
ORDER BY a.database_name,DATEPART(YEAR,a.backup_start_date) DESC
  ,DATEPART(MONTH,a.backup_start_date) DESC
 
3) if you want to get the latest backup only use the following command:
SELECT sdb.Name AS DatabaseName,
 COALESCE(CONVERT(VARCHAR(12), MAX(bus.backup_finish_date), 101),'-') 
 AS LastBackUpTime
FROM sys.sysdatabases sdb
LEFT OUTER JOIN msdb.dbo.backupset bus ON bus.database_name = sdb.name
GROUP BY sdb.Name
4) if you want to get the backup history for all databases in the last seven days:
SELECT 
   CONVERT(CHAR(100), SERVERPROPERTY('Servername')) AS Server,
   msdb.dbo.backupset.database_name, 
   msdb.dbo.backupset.backup_start_date, 
   msdb.dbo.backupset.backup_finish_date,
   msdb.dbo.backupset.expiration_date,
   CASE msdb.dbo.backupset.type 
       WHEN 'D' THEN 'Database' 
       WHEN 'L' THEN 'Log' 
   END AS backup_type, 
   msdb.dbo.backupset.backup_size, 
   msdb.dbo.backupmediafamily.logical_device_name, 
   msdb.dbo.backupmediafamily.physical_device_name,  
   msdb.dbo.backupset.name AS backupset_name,
   msdb.dbo.backupset.description
FROM   msdb.dbo.backupmediafamily 
INNER JOIN msdb.dbo.backupset 
 ON msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id 
WHERE  (CONVERT(datetime, msdb.dbo.backupset.backup_start_date, 102) >= GETDATE() - 7) 
ORDER BY 
   msdb.dbo.backupset.database_name,
   msdb.dbo.backupset.backup_finish_date

Query to get database backup details

Backup Details.

SELECT  name ,
            recovery_model_desc ,
            state_desc ,
            d AS 'Last Full Backup' ,
            i AS 'Last Differential Backup' ,
            l AS 'Last log Backup'
    FROM    ( SELECT    db.name ,
                        db.state_desc ,
                        db.recovery_model_desc ,
                        type ,
                        backup_finish_date
              FROM      master.sys.databases db
                        LEFT OUTER JOIN msdb.dbo.backupset a ON a.database_name = db.name
            ) AS Sourcetable 
        PIVOT 
            ( MAX(backup_finish_date) FOR type IN ( D, I, L ) ) AS MostRecentBackup
We can extend this to include BackupSize and the backup set name. The query below omits the pivot to keep it simpler:
WITH    backupsetSummary
          AS ( SELECT   bs.database_name ,
                        bs.type bstype ,
                        MAX(backup_finish_date) MAXbackup_finish_date
               FROM     msdb.dbo.backupset bs
               GROUP BY bs.database_name ,
                        bs.type
             ),
        MainBigSet
          AS ( SELECT   db.name ,
                        db.state_desc ,
                        db.recovery_model_desc ,
                        bs.type ,
                        bs.name AS BackupSetName ,
                        bs.backup_size ,
                        bs.backup_finish_date
               FROM     master.sys.databases db
                        LEFT OUTER JOIN backupsetSummary bss ON bss.database_name = db.name
                        LEFT OUTER JOIN msdb.dbo.backupset bs ON bs.database_name = db.name
                                                              AND bss.bstype = bs.type
                                                              AND bss.MAXbackup_finish_date = bs.backup_finish_date
             )
    SELECT  *
    FROM    MainBigSet