execution plan

Table Variables vs. Temporary Tables

Many articles have been written already about this topic, many online discussions carried out. I won’t recap all of the arguments here, but merely concentrate on the maybe lesser known facts (at least they were new to me not so long ago) about temp tables and table variables.

Creation

Temporary tables are created like every other table, only the name is prefixed by a hash sign.

create table #tabX
  (ID int not null identity(1,1) primary key clustered
  ,Col1 bit not null
  ,Col2 datetime2 not null
  ,Col3 varchar(20) not null
  ,Col4 tinyint not null);

Temp tables are not only created like other tables, you also can do (almost) everything with them as you would with a normal table, like creating constraints, indexes asf.

Table variables are not created but declared.

declare @tabX as table (ID int not null identity(1,1) primary key clustered
      ,Col1 bit not null default 0
      ,Col2 datetime2 not null default sysdatetime()
      ,Col3 varchar(20) not null unique
      ,Col4 tinyint not null check (Col4 >=1 and Col4 <=5));

As you can see, you can declare any constraint on a table variable, but you cannot create any indexes.

Both tables can have constraints, only temporary tables can have indexes.

Resources

A common myth is that temporary tables are created in tempdb and table variables are created in memory. That’s nonsense. Both are created in tempdb.

Scope

Table variables only exist in the same scope as variables, i.e. within the same block of code, defined by begin… end or ended with go.
Temporary tables exist in the scope of the session. So if you create a temporary table it will exist as long as you drop it or disconnect the session, but only for this session.
So you can create a temporary table, call a stored procedure and use that same table within the stored procedure. This can be handy sometimes, but it also can lead to very nasty behaviour, if you are forgetting to drop tables or where exactly now you are getting your data from.
You can even create global temporary tables by prefixing the name with two hash signs:

create table ##tabX
  (ID int ...

This table is not exclusive to the session, but is available to all sessions until it is dropped or SQL Sevrer stopped.

Table variables only exist in the code block scope, temporary tables in the session scope and global temporary tables in the server scope.

Naming

Another common misbelieve is, that you need to generate unique names for temporary tables, since they are created in the tempdb and no two objects may have the same name.
Yes, objects must have different names, but in case of temp tables, SQL Server takes care of the uniqueness.
To proof this, we open two sessions and in each session we create a temp table #tabX. Then we run

select name, object_id from tempdb.sys.objects where name like '#tabX%'

and have this result:

SQL Server adds a unique number for each session to the name of the temporary table, thus taking care of the uniqueness of names. Our session can still use it as #tabX, not having to care about the suffix. Isn’t that awesome?
Oh, and if we look at global temp tables

No suffix here, so for global temp tables we really have to take care of uniqueness. Makes sense.

Temporary tables don’t need to generate a unique name. SQL Server does that internally.

Transaction scope

A huge difference is the transaction scope of temp tables and table variables. Only for temporary tables transaction logs are recorded, which is essential to rollback transactions.

begin transaction
  insert #tabX (Col3, Col4) values ('New Value', 1);
  insert @tabX (Col3, Col4) values ('New Value', 1);
rollback

So if we run something like this, #tabX will have rolled back and not have the new line, @tabX will still have the new line.

Only temp table write transaction logs and writes can be rolled back.

Compilation scope

A stored procedure that creates and/or drops a temporary table can never have a precompiled execution plan. Since a temporary table behaves like a normal table, only with a different scope, it is also treated as a normal object.
If you were to create a real table and/or drop it, after using it for storage or data aggregation in a stored procedure, the execution plan will also have to be recompiled every time.
If you use a table variable, the execution plan can be precompiled, since not an object is created, but the mere image of an object.

Only table variables can let a stored procedure have precompiled, reusable execution plans.

Performance

There are many, many articles with benchmarks and performance test scripts. As far as I gather, the conclusion is there is no real difference. A common opinion is that temporary tables perform better with large data sets. I don’t really know if that is true, but for now I believe it.
So for everything over 50,000 rows I rather tend towards a temp table, below that to a table variable.
But the main decision for or against one or the other shouldn’t be based on performance but on other facts like scope, transaction handling, execution plans.

No real performance difference. On larger volumes of data temp tables believed to be better.

Unique advantages

Temporary tables

Temporary tables can be used in select… into… statements. Just as real tables can be created with select… into… , temporary tables can also be created with it.

select top(100)
   ID
  ,Col1
  ,Col2
  ,Col3
into
  #tabX
from
  dbo.MyTable
where
  ID > 100
;

This can be very handy at times, but it has the same problems as with a real table (see my post on select… into…).

Table variables

Since SQL Server 2008 you can pass table variables as parameters to stored procedures. This is a great functionality. It has some restrictions (http://www.sqlteam.com/article/sql-server-2008-table-valued-parameters) but it’s a really cool and powerful feature for complex application development.

Conclusion

As always with development, there is no real answer which one has the most advantages. It all depends on what you actually need. Base your decision on the advantages and disadvantages of both options, the result you want to accomplish, the scope and lifetime of the objects, the transactional scope.
I tend to rather use table variables, because I am a developer who does a lot with stored procedures, so I like that the execution plan can still be precompiled with variables and that I don’t need to care to clean up, since at the end of the execution the variable is gone anyway.

select… into…

A pattern I see frequently used, especially in conjunction with temporary tables is this

select
   ID
  ,Col1
  ,Col2
  ,Col3
into
  dbo.tab2
from
  dbo.tab1
where
  ID > 100

This syntax automatically creates a table for the result set and inserts the records of the result set into it. It is fast and efficient. And it can be really handy, especially in complex stored procedures or scripts, storing data from complex queries for other complex queries.
What makes it so popular is the possibility to create not only manifest tables, but also temporary tables:

select
   ID
  ,Col1
  ,Col2
  ,Col3
into
  #tmpTab
from
  dbo.tab1
where
  ID > 100

You need not define or create the table before in fact it must not be defined before.
This is all very neat, and sometimes I use it myself, but only sometimes and never in stored procedures. More likely in some ad-hoc scripts.

And here is why:

Schema Locks

select… into… always creates a Schema Lock on the table it is selecting from. Even with (nolock) it needs to create a schema lock to get the schema for the table it creates. A schema lock may not sound too bad and is actually not that big a deal.
But the problem is not so much the lock itself or what it does, but what it takes to get it. Imagine an OLTP system with a high data load, a table with approx. 1000 records inserted per second. And you have a select.. into… from this table, possibly joint with two or three other tables, and your statement, even with the nolock hint, needs to be able to place a schema lock on all four tables at once.
This means the query has to wait for a window, when it can actually place these locks. Sometimes nanoseconds, sometimes a few milliseconds, in worst case, although unlikely, even seconds.
While this may not seem much it can amount to quite a lot in the long run. And what you get in any case are highly different, unreliable execution times.

Crappy Schema

The created table gets the data from the selected data. So f.i. if for Column X the longest selected string value has 4 characters, the field will be varchar(4). Might be ok, but when I later need space for longer values, I am in trouble.
Example:

select
   1      as ID
  ,'XXX'    as Col1
  ,getdate()  as Col2
  ,200.23   as Col3
  ,N'djjdh' as Col4
into
  tabInto

This gives you this schema

CREATE TABLE dbo.tabInto
  (ID int NOT NULL
  ,Col1 varchar(3) NOT NULL
  ,Col2 datetime NOT NULL
  ,Col3 numeric(5, 2) NOT NULL
  ,Col4 nvarchar(5) NOT NULL)

A heap with no primary key, all the columns are exactly the maximum length the data had, and all columns are not null, just because there were not any nulls in the data.
Of course you can create your keys and indexes afterwards, but why let the engine work twice to create a decent table. And we all know what increasing the size of not null-columns does to a table (basically tear it apart).
If you just leave it as the heap it is, the Query Optimizer will have a hard time working with it.

And speaking of execution plans:

Non-Reuseable Execution Plans

If you have a create table-command in your stored procedure, which you implicitly have with select… into…, the procedure has to be recompiled for every execution. It will never get a reusable execution plan. This increases the execution time of the procedure immensely. And it doesn’t matter if it is a real table or a temporary table, they are treated the same way.

Whereas if you use a well-defined table variable, insert .. select… into it, you will get a reusable execution plan for your procedure, and, if you defined all the data types correctly and used constraints where possible, you will also get the best possible execution plan for your data.

%d bloggers like this: