Administration

Replication without Snapshot Initialisation

Recently I had a requirement for a customer that was not really “by the book”.
The requirement was to replicate a table, but only replicate anything new coming in, no historical data. The table has close to 7 billion rows and is several terabytes in size.
Replicating all the historical data would kill the target server, which simply doesn’t have the capacity. The target server will only keep a rolling 30 days of data and discard everything older.
So I had to find a way to replicate all new transactions without the need of a snapshot or backup initialisation. This is not as straightforward as it sounds, but then again not really too complicated, once you know how.
After some trial and error on a test lab installation I got it going. Here is an account of how to do it:

Create the schema on the target server

To use replication without snapshot initialisation you need all the data structures set up before you start. Create the target table on the target server. Don’t forget to plan how to handle identity columns. (If you need some data in there, also transfer and – if need be – modify the data to your needs.)
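
For illustration, here is a minimal sketch of what the target table could look like (the column names are made up for this example). The detail that matters is the NOT FOR REPLICATION clause on the identity column, so inserts applied by the distribution agent keep the identity values coming from the source:

create table dbo.[Source Table]
(
     ID       bigint identity(1,1) not for replication not null
    ,LoadDate datetime     not null
    ,Payload  varchar(100) null
    ,constraint [PK_Source Table] primary key clustered (ID)
)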

Create publication on source server

What I like to do is use the wizard to create my publications, but not actually create the publication: just generate the scripts. Then I check and modify these before actually creating the publication. (And the same for subscriptions.)

exec sp_addpublication 
         @publication = N'ReplWoSnapshot'
        ,@description = N'Transactional publication of database ''Source'' from Publisher ''SourceServer''.'
        ,@sync_method = N'concurrent'
        ,@retention = 0
        ,@allow_push = N'true'
        ,@allow_pull = N'true'
        ,@allow_anonymous = N'false'
        ,@enabled_for_internet = N'false'
        ,@snapshot_in_defaultfolder = N'true'
        ,@compress_snapshot = N'false'
        ,@ftp_port = 21
        ,@allow_subscription_copy = N'false'
        ,@add_to_active_directory = N'false'
        ,@repl_freq = N'continuous'
        ,@status = N'active'
        ,@independent_agent = N'true'
        ,@immediate_sync = N'false'
        ,@allow_sync_tran = N'false'
        ,@allow_queued_tran = N'false'
        ,@allow_dts = N'false'
        ,@replicate_ddl = 1
        ,@allow_initialize_from_backup = N'false'
        ,@enabled_for_p2p = N'false'
        ,@enabled_for_het_sub = N'false'
        ,@allow_partition_switch = N'true'
        ,@replicate_partition_switch = N'false'

Important here is that Immediate Synchronisation (@immediate_sync) is off. Also switch off Initialisation from Backup (@allow_initialize_from_backup).
Note: The last two parameters are needed in this case since it is a partitioned table I am replicating and I want to still be able to switch partitions on the source database without generating errors. But I don’t need the partition switch to be replicated. The target database will have its own split-switch-merge logic.

exec sp_addpublication_snapshot 
         @publication = N'ReplWoSnapshot'
        ,@frequency_type = 1
        ,@frequency_interval = 1
        ,@frequency_relative_interval = 1
        ,@frequency_recurrence_factor = 0
        ,@frequency_subday = 8
        ,@frequency_subday_interval = 1
        ,@active_start_time_of_day = 0
        ,@active_end_time_of_day = 235959
        ,@active_start_date = 0
        ,@active_end_date = 0
        ,@job_login = N'login'
        ,@job_password = 'pwd'
        ,@publisher_security_mode = 1

The snapshot agent is still created as usual even though it won’t be used.
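
Since the snapshot agent will never actually be used, you can even disable its job once it has been created. A sketch, assuming a generated job name from my lab; yours will differ (check SQL Server Agent in SSMS):

-- run in msdb on the server hosting the snapshot agent job;
-- the job name below is just an example
exec msdb.dbo.sp_update_job
     @job_name = N'SourceServer-Source-ReplWoSnapshot-1'
    ,@enabled  = 0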

exec sp_addarticle 
          @publication = N'ReplWoSnapshot'
        , @article = N'Source Table'
        , @source_owner = N'dbo'
        , @source_object = N'Source Table'
        , @type = N'logbased'
        , @description = null
        , @creation_script = null
        , @pre_creation_cmd = N'drop'
        , @schema_option = 0x00000000081B509F
        , @identityrangemanagementoption = N'manual'
        , @destination_table = N'Source Table'
        , @destination_owner = N'dbo'
        , @vertical_partition = N'false'
        , @ins_cmd = N'CALL sp_MSins_dboSourceTable'
        , @del_cmd = N'CALL sp_MSdel_dboSourceTable'
        , @upd_cmd = N'SCALL sp_MSupd_dboSourceTable'

Also all the articles are added as usual.

Create subscription

Again, I used the wizard to create the subscription, but only let it generate the scripts. In my case I created a Pull subscription, but it should also work with a Push subscription.

First add it on the distributor (= source server in this case)

exec sp_addsubscription 
         @publication = N'ReplWoSnapshot'
        ,@subscriber = N'Target Server'
        ,@destination_db = N'Source Database'
        ,@sync_type = N'replication support only'
        ,@subscription_type = N'pull'
        ,@update_mode = N'read only'

Here lies the most important part of the process: the synchronisation type. You need to use Replication Support Only. This ensures that no initialisation is attempted; anything coming in from now on is simply replicated, no questions asked.
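
One thing to double-check in your environment: with Replication Support Only, the distribution agent assumes that the custom replication procedures referenced in the article definition (sp_MSins_…, sp_MSupd_…, sp_MSdel_…) already exist in the subscriber database. The publisher can generate them for you:

-- run at the publisher in the publication database;
-- returns the CREATE PROCEDURE statements to execute at the subscriber
exec sp_scriptpublicationcustomprocs @publication = N'ReplWoSnapshot'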

Now add the subscription on the target server.

exec sp_addpullsubscription 
		 @publisher = N'Source Server'
		,@publication = N'ReplWoSnapshot'
		,@publisher_db = N'Source DB'
		,@independent_agent = N'True'
		,@subscription_type = N'pull'
		,@description = N''
		,@update_mode = N'read only'
		,@immediate_sync = 0

Again, Immediate Synchronisation must be switched off.

exec sp_addpullsubscription_agent 
		 @publisher = N'source server'
		,@publisher_db = N'Source DB'
		,@publication = N'ReplWoSnapshot'
		,@distributor = N'Source Server'
		,@distributor_security_mode = 1
		,@distributor_login = N''
		,@distributor_password = null
		,@enabled_for_syncmgr = N'False'
		,@frequency_type = 64
		,@frequency_interval = 0
		,@frequency_relative_interval = 0
		,@frequency_recurrence_factor = 0
		,@frequency_subday = 0
		,@frequency_subday_interval = 0
		,@active_start_time_of_day = 0
		,@active_end_time_of_day = 235959
		,@active_start_date = 20130111
		,@active_end_date = 99991231
		,@alt_snapshot_folder = N''
		,@working_directory = N''
		,@use_ftp = N'False'
		,@job_login = N'login'
		,@job_password = 'pwd'
		,@publication_type = 0

The subscription agent is then added as usual.

With these few steps the replication is fully configured and running. New transactions are replicated to the subscriber from the moment the publication was created, without any need for initialisation, be it from snapshot or from backup.
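
To verify that changes actually flow, tracer tokens are a handy low-impact test (assuming the publication name from the scripts above):

-- run at the publisher in the publication database;
-- the token travels the same route as a regular transaction
exec sp_posttracertoken @publication = N'ReplWoSnapshot'

Then watch the token’s latency in the Tracer Tokens tab of the Replication Monitor.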

But there is one more thing you should do!
You need to be aware that a lot of data is missing in the target database. So every update or delete on a record that exists in the source database but not in the target database will cause an error in the replication and actually stop the replication.
Even when you are sure only inserts happen on the source server, you should take precautions for transactions being executed on data that is not available on the target system.

Skip Consistency Errors in Replication

In order to avoid these errors (or actually tell the replication to ignore them) you need to use Agent Profiles.
Please find a step-by-step guide here on
How to: Work with Replication Agent Profiles (SQL Server Management Studio)
For this particular distribution we need to use an Agent Profile called Continue On Data Consistency Errors.
To use this profile:

  • Open the Replication Monitor
  • Right-Click on the distribution and choose Agent Profiles
  • Tick the box next to Continue On Data Consistency Errors
  • Click the button Change Existing Agents (if you just click OK, only newly created Agents will use this profile).
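
If you prefer T-SQL over the GUI, you can at least inspect the predefined profiles on the distributor; a sketch (agent type 3 stands for the distribution agent). As far as I know, the Continue On Data Consistency Errors profile essentially tells the agent to skip the typical consistency errors 2601, 2627 and 20598:

-- run at the distributor: list all profiles for the distribution agent
exec sp_help_agent_profile @agent_type = 3
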
Caveat 1:

Please be aware that replication without initialisation might not be the best solution for your requirements (or even the requirement described here). In fact, Service Broker might provide a more suitable solution, or some ETL process, or something hand-made.
Please always look into all kinds of solutions and don’t just take the first one that offers itself (we do that often enough anyway). It might not be the best one.

Caveat 2:

The description given here is no full step-by-step guide. It presumes some knowledge of SQL Server replication. Some obvious steps are omitted, others only vaguely described.
Please always refer to the SQL Server documentation for a full description of all parameters and steps.

Capacity Specifications for SQL Server

Here is a useful link for all those who ever wondered about the maximum capacity of SQL Server. You’ll find a table specifying things like the number of partitions a table can have and the number of bytes of code a single stored procedure can have, but also fairly basic (and most important) stuff like the maximum row size, maximum index key size, maximum columns per table and so on.

So add the link to your bookmarks. You’ll need it over and over again:
Maximum Capacity Specifications for SQL Server

Get number of rows per partition

How can you get the number of rows in each partition of your partitioned tables?
Here is a simple query to retrieve that data, including the boundary value for each partition:

    SELECT	    
         p.partition_id                             as PartitionID
        ,t.name                                     as [Table]
        ,ps.name                                    as PartitionScheme
        ,pf.name                                    as PartitionFunction
        ,p.partition_number                         as [Partition]
        ,p.rows                                     as [Rows]
        ,prv.value                                  as Boundary
        --,pf.type_desc                               as BoundaryType
        ,case when pf.boundary_value_on_right = 1 
            then 'Right'
            else 'Left'
            end                                     as BoundarySide
    
    FROM	
        sys.tables t
        inner join sys.partitions p 
            on t.object_id = p.object_id 
            and p.index_id = 1
        inner join sys.indexes i on p.object_id = i.object_id and p.index_id = i.index_id
        inner join sys.data_spaces ds on i.data_space_id = ds.data_space_id
        inner join sys.partition_schemes ps on ds.data_space_id = ps.data_space_id
        inner join sys.partition_functions pf on ps.function_id = pf.function_id
        left outer join sys.partition_range_values prv 
            on pf.function_id = prv.function_id
            and p.partition_number = prv.boundary_id
    WHERE	
        is_ms_shipped = 0   
        and exists (select 1 from sys.partitions p2
                    where t.object_id = p2.object_id
                    and partition_number > 1)
    order by 
         t.name 
        ,p.partition_number desc

This will show you

  • Tables that have at least two partitions. If you also want to see the number of rows for all the other tables, remove the and exists (…) part from the where-clause.
  • Only one entry per partition. If you want to see an entry per partitioned index, remove and p.index_id = 1 from the join-condition.
Error in replication: the subscription(s) have been marked inactive and must be reinitialized

There are several reasons for this error to appear. Mostly they are network connection problems, but also downtime of the subscriber beyond the subscription retention period or simple schema changes.
So the first thing to do is to investigate the reason for the issue. As always, Google is your friend and you’ll find some great sites helping you with this investigation.

Relevant data for this investigation can be found in

    select * from distribution..MSdistribution_agents
    select * from distribution..MSdistribution_history order by time desc
    select top(50) * from distribution..MSrepl_errors order by time desc

    exec distribution..sp_browsereplcmds

    select * from distribution..MSrepl_commands

Reinitializing the replication can be a pain, especially with large data volumes. In some cases reinitializing can be avoided: you can re-activate the subscriptions, but you’ll have to do so directly in the distribution database.

First find the subscriptions that need to be activated.

    select * from distribution..MSsubscriptions

Take a look at the status column:

  • 0 = Inactive
  • 1 = Subscribed
  • 2 = Active

Then you can reactivate them by resetting the status to Active:

    update distribution..MSsubscriptions set status = 2 where status = 0 and publisher_db = '<YourDB>'

After reactivation, check the Replication Monitor and the tables above to see how your replication is doing. There might be several errors you need to solve, but generally this should get the replication going again.
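
After reactivating you can also check how many commands are still waiting to be delivered to the subscriber. A sketch with placeholder server and database names:

    -- run at the distributor
    exec distribution..sp_replmonitorsubscriptionpendingcmds
         @publisher         = N'SourceServer'
        ,@publisher_db      = N'SourceDB'
        ,@publication       = N'MyPublication'
        ,@subscriber        = N'TargetServer'
        ,@subscriber_db     = N'TargetDB'
        ,@subscription_type = 0  -- 0 = push, 1 = pull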

Pitfalls of SQL Server Replication

If you are planning replication for one of your SQL Server databases, there are some pitfalls you will encounter and need to solve. There are little things you need to take care of, which can be hard to find since the error messages SQL Server gives may describe the error but don’t help at all in finding its source or a way to resolve it.

1. Inform yourself about replication.
Before you even start planning replication, read up on the principal concepts: snapshot replication, transactional replication, merge replication. How they work. What resources they need.

2. Plan your strategy for identity columns.
You need to plan your replication especially carefully if you plan to let users update the replicated data. Then you need to take special care of how both databases – master data and replicated data – handle identity columns.

3. Only replicate the data you really need replicated.
Don’t just replicate your whole database. This might seem the easiest solution, especially if your database is small. But databases tend to grow, and what is a small database at one customer’s instance may be a huge database at another customer’s instance. Always plan your databases (and therefore also replication) to be scalable, so they can grow without performance loss.
That said, replicate only the tables, views, functions and stored procedures you need. Replication adds additional stress to the system, so keep that stress to the absolute minimum.

4. Check also what the application needs.
Depending on how you plan to use the replicated data, what you actually need to replicate will differ a lot. It is not only necessary to check all the dependencies within the database, and maybe even with other databases, but also to check the applications you are planning to use on the replicated data.
For instance, the database and its views, procedures and functions may show no dependency at all on some user table or lookup table, while the application you are trying to run on this data absolutely needs that data to work properly (or at all).
So always check with the application developers what data, apart from the obvious database dependencies, you will also need to replicate (or substitute in some other way).

5. Check your objects for their “replication-fitness”.
There are some (simple) prerequisites to take care of before replicating a table:

a. It needs a primary key (for transactional replication).
You might shrug at this point, but there are databases out there, especially legacy databases, where not every table has a primary key. If you need to replicate such a table, you need to create a primary key. In many cases you will find unique columns on such tables, either defined by a unique constraint or index, or undefined but clearly detectable. Talk to the application developers about what might be unique, whether adding a column might work, and what consequences a primary key might have.
If there is no way of adding a primary key, you cannot replicate that table and you might need to find another way to copy the data – some automated ETL process maybe.
b. Take care of identity columns.
When you add a table with an identity column to the replicated articles, SQL Server automatically alters the schema of that column to NOT FOR REPLICATION. This is sufficient if you don’t let the users update the replicated data (read-only database). If you do, you need to take special care of the identity columns (see above).
c. Check for triggers.
Everyone always forgets about triggers. An enabled trigger on the target database can fail the whole replication process. Check the triggers, see what they do, and whether you need them on the target database. You might even need to alter them because you only need parts of what they do.
Tip: If you have a trigger that on insert into Table1 generates data for Table2, there are two ways of handling this (see the sketch after this list):
i. Replicate Table1 and Table2, and disable the trigger on the target database.
ii. Replicate only Table1 and leave the trigger enabled on the target database. Every replicated row in Table1 then fires the trigger and fills Table2.
Method (i) lets all the resources be used on the master database: first the trigger fires, then both tables need to be distributed. Method (ii) balances the load by letting the target database do some of the work. Choose for yourself which to use. It just goes to show there is no straight answer on what to do with triggers. Each one has to be checked on its own.
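
A minimal sketch for both methods, assuming a hypothetical trigger trg_Table1_Insert on Table1:

    -- method (i): Table1 and Table2 are both replicated,
    -- so stop the trigger from firing on the target database
    disable trigger trg_Table1_Insert on dbo.Table1

    -- method (ii): only Table1 is replicated, the trigger stays enabled;
    -- a disabled trigger is switched back on like this
    enable trigger trg_Table1_Insert on dbo.Table1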

6. You need sufficient rights.
When setting up replication you need to have Windows users available that have sufficient rights. We are talking Windows/domain users here, not SQL Server users:
a. Your user needs rights to set the executing user context of a service, create a network share, and maybe install components on the servers.
b. A user for each database that can reach the database snapshot share.
c. A user for each database under which the SQL Agent job can run.
d. A user with SQL Server sysadmin rights.

7. SQL Server Replication needs to be installed on both servers.
Don’t laugh! This happens more often than you think. Everything is prepared, all the objects and user rights are checked, you want to start setting up the replication, and the first message you get is that the replication components are not installed.

So get your setup files ready and install SQL Server Replication. It is a straightforward process and shouldn’t present any problems, as long as you have sufficient rights to do so (and the setup files handy).
Important: even though setup doesn’t tell you to, you must restart the SQL Agent service after the setup is finished. Otherwise replication won’t work. And SQL Server will never tell you why, but throw cryptic error messages that some settings are missing or incorrect.

8. Run SQL Agent with a valid user context.
Database snapshots are distributed via a network share that both servers can access. Depending on your security configuration (and best-practice implementations) you might need to set the SQL Agent service to run under a specific user context.
Important: Don’t change the service’s user context in the computer management screen. Even though this screen automatically gives the user account the Logon As Service right, SQL Server and SQL Agent need some additional rights to work correctly.
Always use SQL Server Configuration Manager to change the service’s user context. This will set all the rights that SQL Server needs.

9. Replication needs TCP/IP name resolution.
The replication setup screens let you connect to the distribution server via its IP address. No problem. You can set up everything, set the user rights and so on. Only when replication tries to start its SQL Agent jobs will it tell you that it can’t reach the distribution server because it has no name. So always use the server name instead of the IP address.
Tip: Depending on your network and name resolution configuration it might not be possible to reach the server by name. In such a case you can edit the hosts file in %WINDIR%\system32\drivers\etc: enter the IP and the name there and fool SQL Server into having a name to use.
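
For example, an entry in the hosts file could look like this (IP address and server name are made up):

    10.10.10.42    SourceServer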

10. The owner of the database has to be valid.
This issue only happens if you have an environment with, let’s say, test and production servers with different SQL users, and databases get copied across the environments. It is a rather obscure and rare issue, but it happens, so I mention it.
Even though everything else works and the users seem to have sufficient rights, replication won’t work if the owner’s SID is not recognized. This can be solved simply by resetting the ownership of the database:

    alter authorization on database::test to mySAUser;

Even though the user name might be the same, the SID will be different, and therefore we need to make this change.
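
To quickly check whether the owner SID still resolves to a login, something like this works (using the database name from the example above):

    -- NULL in owner_login means the SID is not recognized on this server
    select name, suser_sname(owner_sid) as owner_login
    from sys.databases
    where name = N'test'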

Rebuilding fragmented indexes and tables

Over time indexes and tables get fragmented. What that means is that the actual data pages containing the data for the table or the index are more and more scattered around our data files and not neatly side by side – pretty much the same as disk fragmentation.

The effect this has is a loss in performance, since data has to be picked up in different places, indexes must be read from different places, and writes go crisscross all over the file(s). A lot of additional I/O operations.

To prevent that from happening we should regularly check our tables and indexes for fragmentation. You can use either DBCC SHOWCONTIG or the DMV sys.dm_db_index_physical_stats to query the fragmentation level of objects.

    -- set minimum page count the index should have
    -- to be considered for a rebuild.
    -- indexes smaller than 100, 120 pages usually don't gain anything from rebuilding.
    declare @IPageCnt int = 100;
    -- set the minimum fragmentation level of an index
    -- at which threshold it should be rebuilt. (value in % of fragmentation)
    declare @IMinFragmentation int = 30;
    SELECT
         OBJECT_SCHEMA_NAME(FRAG.[object_id]) + '.' + OBJECT_NAME(FRAG.[object_id])  as [table]
        ,SIX.[name] as [index]
        ,FRAG.avg_fragmentation_in_percent as [fragmentation]
        ,FRAG.page_count as [pages]
        ,'alter index ' + SIX.[name]
            + ' on '+OBJECT_SCHEMA_NAME(FRAG.[object_id])
            + '.' + OBJECT_NAME(FRAG.[object_id])
            + ' rebuild;' as [rebuild_sql]
    FROM
        sys.dm_db_index_physical_stats
        (
            db_id(),    --use the currently connected database
            0,          --Parameter for object_id.
            DEFAULT,    --Parameter for index_id.
            0,          --Parameter for partition_number.
            DEFAULT     --Scanning mode. Defaults to "LIMITED", which is good enough
        ) FRAG
        INNER JOIN sys.indexes SIX ON FRAG.[object_id] = SIX.[object_id]
            AND FRAG.index_id = SIX.index_id
    WHERE
        -- consider only those indexes that need treatment
        FRAG.page_count > @IPageCnt
        AND FRAG.avg_fragmentation_in_percent > @IMinFragmentation
    ORDER BY
        FRAG.avg_fragmentation_in_percent DESC
    

The level of fragmentation you are willing to allow is up to you. You should consider the size of the table, the available space, the time it takes to rebuild tables and indexes, and so on. As a rule of thumb: the bigger the table, the lower the threshold. On a huge table with several billion rows, 10% means several hundred million rows scattered around the data file; on a small table with some ten thousand rows, even 40% or 50% fragmentation might be tolerable.

In this example I am interested in all tables that have a fragmentation level of 30% or more.

NOTE: You should always know, or try to find out, why fragmentation is happening, because in many cases it hints at an underlying problem with the database setup and design. Fragmentation can point to disk problems, memory problems or poor clustered key design.

Defragment Indexes

Once you have identified the fragmented indexes and tables, there are a lot of things to take into account before defragmenting. There are several ways to defragment an index.

    alter index IX_myIndex on dbo.myTable reorganize

This will reorganize the leaf-level data of an index, clustered or nonclustered. This operation is always done online. It reorders the index pages and also tries to compact them. The index and the table remain online. It’s better than nothing, but then again not much more than that. With a low level of fragmentation you might want to try a reorganization.

    alter index IX_myIndex on dbo.myTable rebuild

This does what it says on the label: it will rebuild the whole index from scratch. New index pages are created in order and compacted together. This operation can be done online or offline.

Depending on the index size, rebuilding will take up a lot of temporary space:

– the size of the index in the data file
– enough space in tempdb for sorting
– or space for sorting in the data file with sort_in_tempdb = off

When rebuilding a clustered index, be aware that the clustered index IS the table, so you will need free space to move the whole table around!
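
A sketch of a rebuild that sorts in tempdb and keeps the table available (index and table names are placeholders; online rebuilds require Enterprise Edition):

    alter index IX_myIndex on dbo.myTable
        rebuild with (online = on, sort_in_tempdb = on)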

To make room for this operation you can disable other indexes on the table. This will remove all the leaf data from the index, but not the metadata and the index definition. It will also prevent the index from being constantly updated while the rebuild is moving data pages around. But beware: you have to rebuild the indexes afterwards to enable them again (see the sketch below).

Also, if you disable a clustered, unique or primary key index, all associated non-clustered, non-unique indexes AND all foreign key constraints associated with the table will also be disabled and will have to be rebuilt afterwards. The table data will be completely inaccessible until the index is rebuilt. So, be careful with disabling indexes!
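
A sketch of the disable/rebuild cycle for a nonclustered index (names are placeholders):

    -- removes the leaf data but keeps the index definition (metadata)
    alter index IX_myOtherIndex on dbo.myTable disable

    -- ... rebuild the clustered index / the table here ...

    -- a disabled index is brought back by rebuilding it
    alter index IX_myOtherIndex on dbo.myTable rebuild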

Defragment Tables

If the table has a clustered index (which every table should have!) then rebuilding the clustered index IS rebuilding the table, and that’s exactly what you should do.

If there is no clustered index you can use

    alter table dbo.myTable rebuild

Note that this will also rebuild ALL indexes on that table.

Important note: After you rebuild your indexes/tables, recompile your stored procedures!

After rebuilding indexes and/or tables you should recompile the stored procedures that access these tables. The statistics will be up to date and the efficiency of the indexes might have changed.

    exec sp_recompile N'dbo.myTable'

This command marks the table myTable for recompilation. This means that every stored procedure accessing the table will be recompiled on its next execution and receive a new execution plan.

SQL Server healthcheck: sp_blitz by Brent Ozar

Just a quick reminder for everyone who hasn’t yet tried it: if you want to check out an unknown server, or find issues on your well-known server, and learn about the caveats of setting up and maintaining a SQL Server, there’s no better way to start than Brent Ozar’s sp_blitz.

It will check for commonly known issues, some obvious, some not so commonly known, and guide you towards further information so you can understand the issues, fix them and avoid them.
