Generating random numbers in T-SQL

Every once in a while the necessity arises to generate some random numbers in your T-SQL code, be it for some random selection of price winners, be it to randomize a sort order or whatever reason.
SQL Server provides a built-in function rand() which returns a random number. Great, you might say, until you try it.

select rand()

This returns a float value between 0 and 1. To make that useful you have to tweak it. I chose to create a function that returns a random integer in between the boundaries I need.

-- view that returns a random number
begin try
    exec('create view dbo.vw_RandNumber as select 1');
end try begin catch end catch
go

alter view dbo.vw_RandNumber
as
    -- return a random number using the built-in rand() function
    select
        rand() as RandNumber    -- returns a float value between 0 and 1
    ;
go

-- function to return a random integer value
begin try
    exec('create function dbo.fn_RandNumber() returns int');
end try begin catch end catch
go

alter function dbo.fn_RandNumber
    (@Min int   -- lower boundary of desired random value
    ,@Max int)  -- upper boundary of desired random value
    returns int
AS
begin
    declare
         @rand float    -- the built-in rand() function returns a float,
                        -- so we need one here
        ,@ret int       -- our return value
    ;

    while @ret is null  -- just to be sure, we really return sth valid
    begin
        -- rand() returns a float between 0 and 1, so we need to tweak it
        -- to get a random number between our boundaries
        -- f.i. @Min = 1, @Max = 100 => random number between 1 and 100
        select      @rand = @Min
                            + (select RandNumber
                            from dbo.vw_RandNumber)
                            * (@Max-@Min)
        ;
        -- now we convert the float into an integer
        -- simply by assigning it to an int variable, thus cutting of the decimals
        -- Note: this is still an implicit conversion, even if no convert() or
        --       cast() is used meaning it still needs the same resources
        select @ret = @rand   -- 94,56873936663979 => 94
    end
    -- return the result
    return @ret;
end
go

This way I can get a random number within any range I want (as long as it’s integer).

select dbo.fn_RandNumber(1, 100)
select dbo.fn_RandNumber(1, 500000)
select dbo.fn_RandNumber(800, 900)
select dbo.fn_RandNumber(-50, 0)

You might ask why we need this construct with a view. Simple answer, the usage of rand() is not allowed within a user defined function. It will throw an error message:
Invalid use of a side-effecting operator ‘rand’ within a function.

Therefore we need to ‘fool’ SQL Server by creating a view that returns just the one random number and then select from the view within the function.

Caveats

There are two major caveats to consider:

Firstly, rand() returns a float value, we also need a float variable in the function. Using floats is never really a good idea. Normal CPUs are not optimized for floating point calculations, meaning that operations with floats need additional CPU cycles. Unless you have an additional floating point processor in your SQL Server machine (which is very unlikely), this will take up some CPU time.

There are other ways to create random numbers, you’ll find them when you google. Here is a neat example by Jeff Moden.

select
     abs(checksum(newid())) % (@Max - @Min + 1) + 1 as randNum
    ,col1
    ,col2
    ,col3
FROM
    dbo.someTable

I prefer the construct with the view and the function, since I think it’s quite flexible (I can have functions for decimal, money, whatever), easy to read in stored procedure code and easy to maintain.

5 Responses to Generating random numbers in T-SQL

  • Jeff Moden says:

    Please explain to me how you think the ABS(CHECKSUM(NEWID())) method has anything at all to do with RBAR.

    • Heino Zunzer says:

      Hi Jeff,
      first of all thanks for taking interest in my blog and commenting on it.

      To my understanding RBAR happens when I f.i. use functions in the select-clause which have to be evaluated and executed against every row.
      So in this example

      select top(1000) 
           abs(checksum(newid())) as RandNum
          ,ID
          ,SomeColumn
      from 
          dbo.Table1

      while it doesn’t have to evaluate any column value against every row, it still gets executed a 1000 times, doesn’t it?
      If I misunderstand how the engine processes the above and/or what RBAR is, please let me know.

      Cheers, heino

  • Jeff Moden says:

    Apologies for the very late reply, Heino…

    There’s a difference between RBR (Row By Row) and RBAR (Row By Agonizing Row). Every SELECT, for example, uses RBR because, behind the scenes, the machine language code is looping (I call such loops “Pseudo Cursors” with credits to R.Barry Young for the term) like it would in any application. Using the native functions of T-SQL doesn’t make it RBAR because it’s neither agonizingly slow nor does it use an agonizing amount of resources. See the following URL for more on that. ( http://www.sqlservercentral.com/articles/T-SQL/91724/ ). Using scalar **User** Defined Functions can certainly be RBAR although there are some exceptions where such a thing will beat most every attempt to make something set based. Please see the following article for more on that. ( http://blogs.lessthandot.com/index.php/datamgmt/dbprogramming/sql-server-proper-case-function/ )

  • Heino Zunzer says:

    Thanks for the explanation, Jeff.
    I removed the sentences referring to RBAR and I promise to read up on the topic. 😉

  • Jeff Moden says:

    Apologies. I lost track of this, Heino. Thank you very much for the feedback and for what you do. Anyone that tries to help others is aces in my book.

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: