Wicked Engine

Next power of two in HLSL

January 5, 2018

There are many occasions when a programmer would want to calculate the next power of two for a given number. For me it was a bitonic sorting algorithm operating in a compute shader and I had this piece of code be responsible for calculating the next power of two of a number:

uint myNumberPowerOfTwo = 1;

while( myNumberPowerOfTwo < myNumber)

{

myNumberPowerOfTwo <<= 1;

}

It gets the job done, but doesn’t look so nice. For not unusual cases when myNumber is more than 1000 it can already take ten cycles to loop. I recently learned that HLSL has a built in function called firstbithigh. It returns the position of the first non zero bit in a 32-bit number starting from the left to the right (from high order to low). With its help, we can rewrite the algorithm as follows:

uint myNumberPowerOfTwo = 2 << firstbithigh(myNumber);

It does the same thing, so how does it work? Take a random number and write it in binary:

00000001011010011011011111000100

For that number, firstbithigh returns the number of the location of the first non zero bit (from the left), which is 24. The next power of two to that number would be:

00000010000000000000000000000000

That number has seven leading zeroes. All we have to do is zero the whole number but set a single bit to non-zero. To do it, first set up 2 in binary:

00000000000000000000000000000010

Then shift the whole thing left by the position of the first bit in myNumber to get what we are looking for:

00000010000000000000000000000000

Nice one, but we have a few corner cases we need to eliminate. If myNumber is already a power of two, we will get a greater power of two which might not be what we are looking for. In that case we should return myNumber itself. If you subtract one from myNumber before calling firstbithigh on it, it will ensure that power of two numbers will return themselves:

uint myNumberPowerOfTwo = 2 << firstbithigh(myNumber – 1);

Take 1024 for example, it will get the next power of two for 1023 which is 1024. Or take 129 for example, it will get the next power of two for 128 which is 256, so works out quite well. If you try to calculate it for 0 however, there is a potential bug if myNumber is unsigned, as -1 in unsigned would wrap around to 0xFFFFFFFF. If you can potentially call it for zeroes and operate on unsigned, I would suggest doing it like this:

uint myNumberPowerOfTwo = 2 << firstbithigh(max(1, myNumber) – 1);

If it is signed int, then you don’t have to worry about it, as the firstbithigh function will return zero for negative numbers.

Have any question, noticed a mistake? Please post in the comments below!

Thank you for reading!

Posted by

turanszkij

5 responses to “Next power of two in HLSL”

hywelwt

January 7, 2018 at 9:01 am

If you don’t have access to the firstbithigh intrinsic, it can also be done like this:

uint GetNextPower( float i_n )
{
return ((uint)ceil( sqrt( i_n ) ))

Reply
1. hywelwt
  
  January 7, 2018 at 2:56 pm
  
  Oops forgot to quote code:
  [code language=”cpp”]
  uint GetNextPowerOf2( float i_n )
  {
  return ((uint)ceil( sqrt( i_n ) )) << 1u;
  }
  [/code]
  
  Reply
  1. turanszkij
    
    January 7, 2018 at 5:37 pm
    
    Hm, that doesn’t seem right, try that with a few numbers. I think you might have meant using log2 like this, which gives the correct result:
    [code]
    uint GetNextPowerOf2( float i_n )
    {
    return 2 &lt;&lt; (uint)log2( i_n – 1 );
    }
    [/code]
    The difference is that log2 is a ~~half~~ quarter-rate instruction as I know, while firstbithigh is full-rate. Although firstbithigh doesn’t work for floats like this.
    
    Reply
  2. a lemon
    
    April 15, 2021 at 6:19 pm
    
    Where do you get instruction speeds? From https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions? Also, for floats, can’t you just asint(), extract exponent bits(half-rate, idk what function in hlsl) and bitshift?
    
    Reply
Michael Riegger

July 16, 2020 at 12:46 am

Great article. I’m looking at using this myself and noticed that an input value of 1 seems to results in a nextpoweroftwo of 0 on both my AMD machine and in C++. It seems like left shifting by a negative amount is undefined in C++. Does that seem right to you?

Reply

Next power of two in HLSL

5 responses to “Next power of two in HLSL”

Leave a ReplyCancel reply

Discover more from Wicked Engine