<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Random ASCII</title>
	<atom:link href="http://randomascii.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://randomascii.wordpress.com</link>
	<description>Forecast for randomascii: programming, tech topics, with a chance of unicycling</description>
	<lastBuildDate>Thu, 23 Feb 2012 18:39:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='randomascii.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Random ASCII</title>
		<link>http://randomascii.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://randomascii.wordpress.com/osd.xml" title="Random ASCII" />
	<atom:link rel='hub' href='http://randomascii.wordpress.com/?pushpress=hub'/>
		<item>
		<title>64-Bit Made Easy</title>
		<link>http://randomascii.wordpress.com/2012/02/14/64-bit-made-easy/</link>
		<comments>http://randomascii.wordpress.com/2012/02/14/64-bit-made-easy/#comments</comments>
		<pubDate>Wed, 15 Feb 2012 04:50:24 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[Code Reliability]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=436</guid>
		<description><![CDATA[The scariest aspect of porting your ancient 32-bit code to 64-bit is pointer truncation bugs. Any places where you store a pointer in an ‘int’ or a ‘long’ can come back to bite you when you move to 64-bit. The &#8230; <a href="http://randomascii.wordpress.com/2012/02/14/64-bit-made-easy/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The scariest aspect of porting your ancient 32-bit code to 64-bit is pointer truncation bugs. Any places where you store a pointer in an ‘int’ or a ‘long’ can come back to bite you when you move to 64-bit.</p>
<p>The problem is, these bugs can take a while to show up. Memory allocations on Windows default to starting at low addresses, so it takes a while for allocations to work their way up high enough for there to be anything in the top 32 bits to truncate.</p>
<p><span id="more-436"></span></p>
<p>Just as with <a href="http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/">time-math</a> it is really tedious to deal with bugs that may take hours to show up.</p>
<p>At <a href="http://www.valvesoftware.com/">Valve</a> we use a simple technique to solve this problem. We make sure that our allocations <em>start</em> above the 4 GB line. If every allocation has some bits in the high 32 bits then pointer truncation bugs tend to cause warm fuzzy crashes immediately, and 64-bit cleanliness is easy.</p>
<p>Here’s some code:</p>
<p><pre class="brush: cpp; light: true;">
void ReserveBottomMemory()
{
#ifdef _WIN64
    static bool s_initialized = false;
    if ( s_initialized )
        return;
    s_initialized = true;

    // Start by reserving large blocks of address space, and then
    // gradually reduce the size in order to capture all of the
    // fragments. Technically we should continue down to 64 KB but
    // stopping at 1 MB is sufficient to keep most allocators out.

    const size_t LOW_MEM_LINE = 0x100000000LL;
    size_t totalReservation = 0;
    size_t numVAllocs = 0;
    size_t numHeapAllocs = 0;
    size_t oneMB = 1024 * 1024;
    for (size_t size = 256 * oneMB; size &gt;= oneMB; size /= 2)
    {
        for (;;)
        {
            void* p = VirtualAlloc(0, size, MEM_RESERVE, PAGE_NOACCESS);
            if (!p)
                break;

            if ((size_t)p &gt;= LOW_MEM_LINE)
            {
                // We don't need this memory, so release it completely.
                VirtualFree(p, 0, MEM_RELEASE);
                break;
            }

            totalReservation += size;
            ++numVAllocs;
        }
    }

    // Now repeat the same process but making heap allocations, to use up
    // the already reserved heap blocks that are below the 4 GB line.
    HANDLE heap = GetProcessHeap();
    for (size_t blockSize = 64 * 1024; blockSize &gt;= 16; blockSize /= 2)
    {
        for (;;)
        {
            void* p = HeapAlloc(heap, 0, blockSize);
            if (!p)
                break;

            if ((size_t)p &gt;= LOW_MEM_LINE)
            {
                // We don't need this memory, so release it completely.
                HeapFree(heap, 0, p);
                break;
            }

            totalReservation += blockSize;
            ++numHeapAllocs;
        }
    }

    // Perversely enough the CRT doesn't use the process heap. Suck up
    // the memory the CRT heap has already reserved.
    for (size_t blockSize = 64 * 1024; blockSize &gt;= 16; blockSize /= 2)
    {
        for (;;)
        {
            void* p = malloc(blockSize);
            if (!p)
                break;

            if ((size_t)p &gt;= LOW_MEM_LINE)
            {
                // We don't need this memory, so release it completely.
                free(p);
                break;
            }

            totalReservation += blockSize;
            ++numHeapAllocs;
        }
    }

    // Print diagnostics showing how many allocations we had to make in
    // order to reserve all of low memory, typically fewer than 200.
    char buffer[1000];
    sprintf_s(buffer, &quot;Reserved %1.3f MB (%d vallocs, &quot;
                      &quot;%d heap allocs) of low memory.\n&quot;,
            totalReservation / (1024 * 1024.0),
            (int)numVAllocs, (int)numHeapAllocs);
    OutputDebugStringA(buffer);
#endif
}
</pre></p>
<h2>Cool, eh?</h2>
<p>The code is a bit messy but actually fairly simple. Call this as soon as possible when your 64-bit process starts up and you will be able to find and fix your pointer truncation bugs in no time at all.</p>
<p>The code is a bit verbose because it first reserves all of the low-memory address space and then tries to soak up address space that was previously reserved by the CRT and process heaps. Other heaps in your process may still be holding on to low memory, but in practice it shouldn’t be enough to matter.</p>
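<p>For anyone who hasn’t been bitten by one yet, here is a minimal sketch of what a pointer truncation bug boils down to. It is an illustration only: the function name and the two addresses are made up, and a plain 64-bit integer stands in for the pointer so that the example behaves the same way on any platform.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cstdint&gt;

// Simulates storing a pointer-sized value in a 32-bit integer and
// reading it back, which is what legacy code that keeps pointers in
// an 'int' or a 'long' ends up doing on 64-bit Windows.
// (Hypothetical helper, not part of ReserveBottomMemory.)
uint64_t RoundTripThrough32Bits(uint64_t address)
{
    // The top 32 bits are discarded here.
    uint32_t truncated = static_cast&lt;uint32_t&gt;(address);
    // Widening back to 64 bits cannot restore them.
    return truncated;
}
</pre></p>
<p>An address below 4 GB such as 0x12345678 survives the round trip, which is why these bugs hide for so long. An address above the line such as 0x112345678 comes back as 0x12345678, a completely different location.</p>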
<h2>It’s cheap</h2>
<p>The VirtualAlloc calls reserve only address space, which means that the cost of this is very low. The code doesn’t reserve 4 GB of RAM, it just reserves some space and then never uses it. Cheap like borscht.</p>
<h2>(App) Verify this</h2>
<p>This is such a simple and obvious technique that I’m quite surprised that <a href="http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/">Application Verifier</a> doesn’t offer it as an option*. In fact, Application Verifier has a bug that renders it almost incompatible with this technique: if you are using this technique at the same time that you use Application Verifier then it somehow ends up committing 4 GB of RAM! The first time we hit this was when a colleague was running a dozen copies of our asset conversion tool while Application Verifier was enabled for it. The 48 GB of extra RAM consumption did bad things to his computer’s performance.</p>
<p>I hacked around this problem by detecting Application Verifier (just check to see if one of its DLLs is loaded) and disabling the reservation in that case. Another alternative is to make the address space reservation optional, but this won’t find as many bugs.</p>
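<p>The detection can be sketched roughly as follows. Treat it as an illustration under stated assumptions rather than the exact check used at Valve: the DLL names are assumptions about what Application Verifier loads into a process (confirm them against the module list in your debugger), and ‘IsAppVerifierPresent’ is a hypothetical helper name.</p>
<p><pre class="brush: cpp; light: true;">
#ifdef _WIN32
#include &lt;windows.h&gt;
#endif

// Returns true if a DLL that Application Verifier is assumed to inject
// is already loaded into this process. On other platforms it always
// returns false.
bool IsAppVerifierPresent()
{
#ifdef _WIN32
    // Assumed module names; adjust to match what you see in practice.
    const char* const verifierModules[] = { &quot;verifier.dll&quot;, &quot;vrfcore.dll&quot; };
    for (const char* name : verifierModules)
    {
        // GetModuleHandleA returns NULL if the module is not loaded,
        // and does not change the module's reference count.
        if (GetModuleHandleA(name) != NULL)
            return true;
    }
#endif
    return false;
}
</pre></p>
<p>Call this at startup and skip ReserveBottomMemory() when it returns true.</p>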
<h2>It’s a darned good start</h2>
<p>Pointer truncation bugs aren’t the only problem in porting to 64-bit. Indices and offsets can also truncate or wrap, so looking at compiler warnings and auditing likely problem areas is a good idea. It turns out that most integer loop variables should probably be ‘size_t’ or ‘ptrdiff_t’ rather than ‘int’.</p>
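<p>As a rough sketch of the index and offset variety of this bug, consider a byte offset that is computed correctly in 64 bits and then stored in a 32-bit variable. The function names are hypothetical and exist only for this example.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cstdint&gt;

// Byte offset of an element, computed and returned in 64 bits.
uint64_t ByteOffset(uint64_t elementIndex, uint64_t elementSize)
{
    return elementIndex * elementSize;
}

// The same computation squeezed through a 32-bit variable, as happens
// when the result is stored in an 'unsigned int'. The value wraps
// modulo 2^32.
uint32_t ByteOffsetTruncated(uint64_t elementIndex, uint64_t elementSize)
{
    return static_cast&lt;uint32_t&gt;(elementIndex * elementSize);
}
</pre></p>
<p>ByteOffset(0x50000001, 16) is 0x500000010, just over 20 GB into the buffer, whereas ByteOffsetTruncated() returns 0x10 for the same inputs. Compilers will usually warn about the implicit version of this conversion, which is why those warnings are worth reading.</p>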
<h2>I’m here all week, try the steak</h2>
<p>Porting to 64-bit needn’t be scary, and this technique helps make the process more reliable and predictable. By using this technique at Valve I was able to flush out all the critical pointer truncation bugs in a large code base in very little time. This then made it easier to use Application Verifier to check for other memory bugs, and also lets our processes address vast amounts of memory.</p>
<p><font size="2">* If Microsoft had added pointer truncation detection to Application Verifier then they might have caught this bug in their audio APIs. If you use the MIXER_OBJECTF_HWAVEOUT flag to pass a (64-bit) HWAVEOUT to mixerOpen then you find that the uMxId parameter is a (32-bit) UINT. Oops. Try it yourself, before and after calling ReserveBottomMemory(). Bug reported.</font></p>
<p><pre class="brush: cpp; light: true;">
void TestAudio()
{
    WAVEFORMATEX w = {};
    w.wFormatTag = WAVE_FORMAT_PCM;
    w.nChannels = 1;
    w.nSamplesPerSec = 44100;
    w.wBitsPerSample = 16;
    w.nBlockAlign = w.nChannels * (w.wBitsPerSample/8);
    w.nAvgBytesPerSec = w.nSamplesPerSec * w.nBlockAlign;

    HWAVEOUT hWave;
    MMRESULT mmr = waveOutOpen(&amp;hWave, WAVE_MAPPER, &amp;w,
                NULL, 0, CALLBACK_NULL);

    if (mmr == MMSYSERR_NOERROR)
    {
        HMIXER hMixer = NULL;
        // Map the device onto an HMIXER. The flags parameter tells the API
        // to interpret the second parameter as an HWAVEOUT. The
        // mandatory cast truncates the pointer.
        mmr = mixerOpen(&amp;hMixer, UINT(hWave), 0, 0,
                    MIXER_OBJECTF_HWAVEOUT);
    }
}
</pre></p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/02/14/64-bit-made-easy/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>
	</item>
		<item>
		<title>Don&#8217;t Store That in a Float</title>
		<link>http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/</link>
		<comments>http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 04:43:31 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[AltDevBlogADay]]></category>
		<category><![CDATA[Floating Point]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[double]]></category>
		<category><![CDATA[float]]></category>
		<category><![CDATA[floating point]]></category>
		<category><![CDATA[game time]]></category>
		<category><![CDATA[microseconds]]></category>
		<category><![CDATA[precision]]></category>
		<category><![CDATA[time resolution]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=431</guid>
		<description><![CDATA[I promised in my last post to show an example of the importance of knowing how much precision a float has at a particular value. Here goes. As a general rule, this is the type of data that should never be stored &#8230; <a href="http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I promised in my last post to show an example of the importance of knowing how much precision a float has at a particular value. Here goes.</p>
<p>As a general rule, this is the type of data that should never be stored in a float:</p>
<p><span id="more-431"></span></p>
<p>Elapsed game time should never be stored in a float. Use a double instead. I’ll explain why below.</p>
<p>As an extra bonus, because switching to double is not always the best solution, this post demonstrates the dangers of unstable algorithms, and how to use the guarantees of floating-point math to improve them.</p>
<h2>How long has this been going on?</h2>
<p>A lot of games have some sort of GetTime() function that returns how long the game has been running. Often these return a floating-point number because it allows for convenient use of seconds as the units, while allowing sub-second precision.</p>
<p>GetTime() is typically implemented with some sort of high frequency timer such as QueryPerformanceCounter. This allows time resolution of a microsecond or better. However it’s worth looking at what happens to this resolution if the time is returned as a float, or stored in a float. We can do that using one of the TestFloatPrecision functions from the <a href="http://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/">last post</a> – just call them from the watch window of the debugger. In the screen shot below I tested the precision available at one minute, one hour, one day, and one week:</p>
<p><a href="http://randomascii.files.wordpress.com/2012/02/image3.png"><img style="padding-left:0;padding-right:0;padding-top:0;border-width:0;" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/02/image_thumb3.png?w=520&#038;h=157" width="520" height="157"></a></p>
<p>It’s important to understand what this data means. The number ‘60’, like all integers up to 16777216, can be exactly represented in a float. The watch window shows that the next value after 60 that can be represented by a float is about 60.0000038. Therefore, if we use a float to store “60 seconds” then the next time that we can represent is 3.8 microseconds past 60 seconds. If we try to store a value in-between then it will be rounded up or down.</p>
<h2>How long did it take?</h2>
<p>One of the most common things to do with time values is to subtract them. For instance, we might have code like this:</p>
<p><pre class="brush: cpp; light: true;">
double GetTime();

float TimeSomethingBadly()
{
    float fStart = GetTime();
    DoSomething();
    float elapsed = GetTime() - fStart;
    return elapsed;
}
</pre></p>
<p>The implication of the precision calculations above is that if ‘fStart’ is around 60, then ‘elapsed’ will be a multiple of 3.8 microseconds (two to the negative eighteenth seconds). That is the most precision you can get. If less than 3.8 microseconds has elapsed then ‘elapsed’ will either be rounded down to zero, or rounded up to 3.8 microseconds.</p>
<p>Therefore, if our game timer starts at zero and we store time in a float then after a minute the best precision we can get from our timer is 3.8 microseconds. After our game has been running for an hour our best precision drops to 0.24 milliseconds. After our game has been running for a day our precision drops to 7.8 milliseconds, and after a week our precision drops to 62.5 milliseconds.</p>
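<p>If you don’t have the TestFloatPrecision functions handy, roughly the same numbers can be reproduced with the standard library. This is a sketch that swaps in std::nextafter for the integer-increment trick from the last post; ‘FloatPrecisionAt’ is a name made up for this example.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cassert&gt;
#include &lt;cmath&gt;
#include &lt;limits&gt;

// Returns the gap between 'seconds' and the next larger float, which is
// the best time resolution available when that time is stored in a float.
float FloatPrecisionAt(float seconds)
{
    // Stepping past infinity or a NaN would be meaningless.
    assert(std::isfinite(seconds) &amp;&amp; seconds &gt;= 0.0f);
    float next = std::nextafter(seconds,
                    std::numeric_limits&lt;float&gt;::infinity());
    return next - seconds;
}
</pre></p>
<p>FloatPrecisionAt(60.0f) gives 2^-18 (about 3.8 microseconds), 3600.0f gives 2^-12, 86400.0f gives 2^-7 and 604800.0f gives 2^-4, matching the minute, hour, day and week figures above.</p>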
<p>This is why storing time in a float is dangerous. If you use float-time to try calculating your frame rate after running for a day then the only answers above 30 fps that are possible are infinity, 128, 64, 42.6, or 32 (since the possible frame lengths are 0, 7.8, 15.6, 23.4, or 31.2 milliseconds). And it only gets worse if you run longer.</p>
<p>As another example consider this code:</p>
<p><pre class="brush: cpp; light: true;">
double GetTime();

void ThinkBadly()
{
    float startTime = (float)GetTime();
    // Do AI stuff here
    float elapsedTime = GetTime() - startTime;
    assert(elapsedTime &lt; 0.005); // Complain if the AI took more than 5 ms
}
</pre></p>
<p>The purpose of this code is to warn the developers whenever the AI code takes inordinately long. However when the game has been running for a day (actually the problem reaches this level after 65,536 seconds) GetTime() will always be returning a multiple of 0.0078 s, and ‘elapsedTime’ will always be a multiple of that duration. In most cases ‘elapsedTime’ will be equal to zero, but every now and then, no matter how fast the AI code executes, the time will tick over to the next representation during the AI calculations and ‘elapsedTime’ will be 0.0078 s instead of zero. The assert will then trigger even though the AI code is actually still under budget.</p>
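<p>The false alarm can be reproduced without a real timer. The sketch below assumes that both time stamps pass through a float before being subtracted, and it takes the two times as parameters instead of calling GetTime(); ‘ElapsedWithFloatTimes’ is a hypothetical name.</p>
<p><pre class="brush: cpp; light: true;">
// Computes an elapsed time the risky way: both time stamps are rounded
// to float before the subtraction. After 65,536 seconds every float is
// a multiple of 1/128 of a second, so the result is too.
float ElapsedWithFloatTimes(double startTime, double endTime)
{
    float start = static_cast&lt;float&gt;(startTime);
    float end = static_cast&lt;float&gt;(endTime);
    return end - start;
}
</pre></p>
<p>With a start time of 86400.0038 and an end time of 86400.0040, only 0.2 milliseconds apart, the two floats land on opposite sides of a rounding boundary and the function returns 0.0078125, which is enough to fire the assert. With 86400.0010 and 86400.0030, a full 2 milliseconds apart, it returns zero.</p>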
<h2>It’s a catastrophe for base-ten also</h2>
<p>The general term for what is happening with these time calculations is <a href="http://en.wikipedia.org/wiki/Catastrophic_cancellation">catastrophic cancellation</a>. In all of these examples above there are two time values that are accurate to about seven digits. However they are so close to each other that when they are subtracted the result has, in the worst case, zero significant digits.</p>
<p>We can see the same thing happening with decimal numbers. A float has roughly seven decimal digits of precision so the decimal equivalent would be getting a time value of 60.00000 and having the next possible time value be 60.00001. Given a seven-digit decimal float we can’t get better than ten microseconds of precision when dealing with times around 60 seconds. When we subtract 60.00000 from 60.00001 then six of the seven digits cancel out and we end up with just one accurate digit. For times less than ten microseconds we have a complete catastrophe: all seven digits cancel out and we get zero digits of precision, just like with a binary float.</p>
<h2>Double down</h2>
<p>The solution to all of this is simple. GetTime() must return a double, and its result must always be stored in a double. The cancellation still occurs, but it is no longer catastrophic. A double has enough bits in the mantissa that even if your game runs for several millennia your double-precision timers will still have sub-millisecond precision. You can verify this by using the double-precision variation of TestFloatPrecisionAwayFromZero():</p>
<p><pre class="brush: cpp; light: true;">
union Double_t
{
    Double_t(double val) : f(val) {}

    int64_t i;
    double f;
    struct
    {
        uint64_t mantissa : 52;
        uint64_t exponent : 11;
        uint64_t sign : 1;
    } parts;
};

double TestDoublePrecisionAwayFromZero(double input)
{
    union Double_t num(input);
    // Incrementing infinity or a NaN would be bad!
    assert(num.parts.exponent &lt; 2047);
    // Increment the integer representation of our value
    num.i += 1;
    // Subtract the initial value to find our precision
    double delta = num.f - input;
    return delta;
}
</pre></p>
<p>You can see in the screenshot below that if you store time in doubles then after your game has been running for a week you will have sub-nanosecond precision, and after three millennia you will still have sub-millisecond precision.</p>
<p><a href="http://randomascii.files.wordpress.com/2012/02/image4.png"><img style="padding-left:0;padding-right:0;padding-top:0;border-width:0;" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/02/image_thumb4.png?w=608&#038;h=114" width="608" height="114"></a></p>
<p>Clearly a double is overkill for storing time, but since a float is underkill a double is the right choice.</p>
<p>Aside: my initial calculation of the precision remaining after three millennia was wrong because the calculation of the number of seconds was done with integer math, and it overflowed and gave a completely worthless answer. Which proves that integer math can be just as tricky as floating-point math.</p>
<h2>Changing your units doesn’t help</h2>
<p>All along I am assuming that you are storing your time in seconds. However your choice of units doesn’t significantly affect the results. If you decide that your time units are milliseconds, or days, then the precision available after your game has been running for a day will be about the same. It is the ratio between the elapsed time and the time being measured that matters.</p>
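<p>Here is a sketch that puts numbers on that claim. It stores the same elapsed time in a float using different units and converts the resulting precision back to seconds. ‘TimerPrecisionInSeconds’ and its parameters are hypothetical names, and std::nextafter is used to find the next representable float.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cmath&gt;
#include &lt;limits&gt;

// Returns the precision, in seconds, of a float timer that has been
// running for 'elapsedSeconds' and counts in units that are each
// 'secondsPerUnit' long (1.0 for seconds, 0.001 for milliseconds,
// 86400.0 for days).
double TimerPrecisionInSeconds(double elapsedSeconds, double secondsPerUnit)
{
    // The value the timer would actually hold, in its own units.
    float stored = static_cast&lt;float&gt;(elapsedSeconds / secondsPerUnit);
    float next = std::nextafter(stored,
                    std::numeric_limits&lt;float&gt;::infinity());
    // Convert the gap between adjacent floats back to seconds.
    return static_cast&lt;double&gt;(next - stored) * secondsPerUnit;
}
</pre></p>
<p>After one day (86,400 seconds) this gives about 7.8 milliseconds when counting in seconds, 8 milliseconds when counting in milliseconds, and about 10.3 milliseconds when counting in days. The exact figure depends on where the value falls relative to a power of two, but it never strays far.</p>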
<h2>Or use integers</h2>
<p>Tom Forsyth <a href="http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%20matter%20of%20precision%5D%5D">points out</a> that the same issues happen with world coordinates and that switching to integer types can give you greater worst-case precision, as well as consistent precision. The Windows GetTickCount() and GetTickCount64() functions use this technique, using milliseconds as the units. This alternative to using a double for time is quite reasonable, especially if you encapsulate it well. A uint32_t with milliseconds as units will overflow every 50 days or so but you can avoid that by using a uint64_t. However despite Tom’s threats to invoke his <a href="http://home.comcast.net/~tom_forsyth/blog.wiki.html#OffendOMatic">OffendOMatic</a> rule for all who use doubles, I still prefer doubles for game time because of the combination of convenient units (seconds) and more than sufficient precision.</p>
<p>While Tom and I appear to disagree over whether you should use double in situations like this, we agree that ‘float’ won’t work.</p>
<p>Note that while GetTickCount() and GetTickCount64() are millisecond precision they are often actually less accurate than you would expect. Unless you have changed the Windows timer frequency with timeBeginPeriod() the GetTickCount functions will only return a new value every 10-20 milliseconds (<em>insert pithy comment about precision versus accuracy here</em>).</p>
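<p>For comparison, an encapsulated integer clock might look something like this sketch. The ‘TickTime’ type and both function names are hypothetical; the point is only that absolute times stay in integers while differences, which are small, can safely become float seconds.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cstdint&gt;

// Absolute time as an integer count of milliseconds. The precision is
// one millisecond no matter how long the game has been running.
struct TickTime
{
    uint64_t milliseconds;
};

// Differences between nearby times are small, so converting them to
// float seconds does not lose anything that matters.
float SecondsBetween(TickTime start, TickTime end)
{
    return static_cast&lt;float&gt;(end.milliseconds - start.milliseconds) * 0.001f;
}

// How long a 32-bit millisecond counter lasts before wrapping: 2^32
// milliseconds expressed in days.
double DaysUntil32BitWrap()
{
    return 4294967296.0 / (1000.0 * 60.0 * 60.0 * 24.0);
}
</pre></p>
<p>DaysUntil32BitWrap() works out to roughly 49.7 days, which is where the “50 days or so” figure comes from.</p>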
<h2>Four billion dollar question</h2>
<p>Even if you use doubles for time, the precision available will still change as game time marches on from zero to the length of your game. These precision changes – while smaller with doubles than with floats – can still be dangerous. Luckily there is a convenient way to get the consistent precision of an integer, with the convenient units of a double.</p>
<p>If you start your game clock at about 4 billion (more precisely 2^32, or any large power of two) then your exponent, and hence your precision, will remain constant for the next ~4 billion seconds, or ~136 years.</p>
<p>And, when using doubles, this precision is approximately one microsecond.</p>
<p>So there you have it. The one-true answer. Store elapsed game time in a double, starting at 2^32 seconds. You will get constant precision of better than a microsecond for over a century. You read it here first.</p>
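<p>The constant-precision claim is easy to check with a sketch like the one below, which uses std::nextafter to measure the gap between adjacent doubles. ‘DoublePrecisionAt’ is a name made up for this example.</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;cassert&gt;
#include &lt;cmath&gt;
#include &lt;limits&gt;

// Returns the gap between 'seconds' and the next larger double, which
// is the time resolution available at that point on a double clock.
double DoublePrecisionAt(double seconds)
{
    // Stepping past infinity or a NaN would be meaningless.
    assert(std::isfinite(seconds) &amp;&amp; seconds &gt;= 0.0);
    double next = std::nextafter(seconds,
                    std::numeric_limits&lt;double&gt;::infinity());
    return next - seconds;
}
</pre></p>
<p>DoublePrecisionAt(4294967296.0), the proposed starting value, and DoublePrecisionAt(8589934591.0), one second short of 2^33, both return 2^-20, which is a little under one microsecond. A clock that started at zero would instead be at 2^-33 after a week and would keep getting coarser from there.</p>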
<h2>Time <em>deltas</em> fit in a float</h2>
<p>It is important to understand that the limited precision of a float is only a problem if you do an unstable calculation, such as subtracting two nearly equal values so that catastrophic cancellation wipes out most of the digits. The code below, on the other hand, is fine:</p>
<p><pre class="brush: cpp; light: true;">
double GetTime();

float TimeSomethingWell()
{
    double dStart = GetTime(); // Store time in a double
    DoSomething();
    float elapsed = GetTime() - dStart; // Store *result* in a float
    return elapsed;
}
</pre></p>
<p>In TimeSomethingWell() we store the result of the subtraction in a float – <em>after</em> the catastrophic cancellation. Therefore our elapsed time value will have tons of precision.</p>
<p>Similarly, if you are using floats in your animation system to represent short times, such as the location of key-frames in a 60 second animation, then floats are fine. However when you add these to the current time you need to store the result of the addition in a double.</p>
<h2>Tables!</h2>
<p>Forrest Smith made a pretty table showing how the precision of a float changes as the magnitude increases, and I mangled it to suit my needs. Here it is for time:</p>
<table border="2" cellspacing="0" cellpadding="2" width="513">
<tbody>
<tr>
<td valign="top" width="109" align="right"><strong>Float Value</strong></td>
<td valign="top" width="118" align="right"><strong>Time Value</strong></td>
<td valign="top" width="130" align="right"><strong>Float Precision</strong></td>
<td valign="top" width="152" align="right"><strong>Time Precision</strong></td>
</tr>
<tr>
<td valign="top" width="109" align="right">1 </td>
<td valign="top" width="118" align="right">1 second</td>
<td valign="top" width="130" align="right">1.19E-07</td>
<td valign="top" width="152" align="right">119 nanoseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10 </td>
<td valign="top" width="118" align="right">10 seconds</td>
<td valign="top" width="130" align="right">9.54E-07</td>
<td valign="top" width="152" align="right">.954 microsecond</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100 </td>
<td valign="top" width="118" align="right">~1.5 minutes</td>
<td valign="top" width="130" align="right">7.63E-06</td>
<td valign="top" width="152" align="right">7.63 microseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000 </td>
<td valign="top" width="118" align="right">~16 minutes</td>
<td valign="top" width="130" align="right">6.10E-05</td>
<td valign="top" width="152" align="right">61.0 microseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10,000 </td>
<td valign="top" width="118" align="right">~3 hours</td>
<td valign="top" width="130" align="right">0.000977</td>
<td valign="top" width="152" align="right">.977 milliseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100,000 </td>
<td valign="top" width="118" align="right">~1 day</td>
<td valign="top" width="130" align="right">0.00781</td>
<td valign="top" width="152" align="right">7.81 milliseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000,000 </td>
<td valign="top" width="118" align="right">~11 days</td>
<td valign="top" width="130" align="right">0.0625</td>
<td valign="top" width="152" align="right">62.5 milliseconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10,000,000 </td>
<td valign="top" width="118" align="right">~4 months</td>
<td valign="top" width="130" align="right">1</td>
<td valign="top" width="152" align="right">1 second</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100,000,000 </td>
<td valign="top" width="118" align="right">~3 years</td>
<td valign="top" width="130" align="right">8</td>
<td valign="top" width="152" align="right">8 seconds</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000,000,000 </td>
<td valign="top" width="118" align="right">~32 years</td>
<td valign="top" width="130" align="right">64</td>
<td valign="top" width="152" align="right">64 seconds</td>
</tr>
</tbody>
</table>
<p>&nbsp;</p>
<p>And here is the table showing how the precision of a float diminishes when you use it to measure large distances, with meters being the units in this case:</p>
<table border="2" cellspacing="0" cellpadding="2" width="655">
<tbody>
<tr>
<td valign="top" width="109" align="right"><strong>Float Value</strong></td>
<td valign="top" width="132" align="right"><strong>Length Value</strong></td>
<td valign="top" width="123" align="right"><strong>Float Precision</strong></td>
<td valign="top" width="143" align="right"><strong>Length Precision</strong></td>
<td valign="top" width="144" align="right"><strong>Precision Size</strong></td>
</tr>
<tr>
<td valign="top" width="109" align="right">1 </td>
<td valign="top" width="132" align="right">1 meter</td>
<td valign="top" width="123" align="right">1.19E-07</td>
<td valign="top" width="143" align="right">119 nanometers</td>
<td valign="top" width="144" align="right">virus</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10 </td>
<td valign="top" width="132" align="right">10 meters</td>
<td valign="top" width="123" align="right">9.54E-07</td>
<td valign="top" width="143" align="right">.954 micrometers</td>
<td valign="top" width="144" align="right">e. coli bacteria</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100 </td>
<td valign="top" width="132" align="right">100 meters</td>
<td valign="top" width="123" align="right">7.63E-06</td>
<td valign="top" width="143" align="right">7.63 micrometers</td>
<td valign="top" width="144" align="right">red blood cell</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000 </td>
<td valign="top" width="132" align="right">1 kilometer</td>
<td valign="top" width="123" align="right">6.10E-05</td>
<td valign="top" width="143" align="right">61.0 micrometers</td>
<td valign="top" width="144" align="right">human hair width</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10,000 </td>
<td valign="top" width="132" align="right">10 kilometers</td>
<td valign="top" width="123" align="right">0.000977</td>
<td valign="top" width="143" align="right">.977 millimeters</td>
<td valign="top" width="144" align="right">toenail thickness</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100,000 </td>
<td valign="top" width="132" align="right">100 kilometers</td>
<td valign="top" width="123" align="right">0.00781</td>
<td valign="top" width="143" align="right">7.81 millimeters</td>
<td valign="top" width="144" align="right">size of an ant</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000,000 </td>
<td valign="top" width="132" align="right">.16x earth radius</td>
<td valign="top" width="123" align="right">0.0625</td>
<td valign="top" width="143" align="right">62.5 millimeters</td>
<td valign="top" width="144" align="right">credit card width</td>
</tr>
<tr>
<td valign="top" width="109" align="right">10,000,000 </td>
<td valign="top" width="132" align="right">1.6x earth radius</td>
<td valign="top" width="123" align="right">1</td>
<td valign="top" width="143" align="right">1 meter</td>
<td valign="top" width="144" align="right">uh&#8230; a meter</td>
</tr>
<tr>
<td valign="top" width="109" align="right">100,000,000 </td>
<td valign="top" width="132" align="right">.14x sun radius</td>
<td valign="top" width="123" align="right">8</td>
<td valign="top" width="143" align="right">8 meters</td>
<td valign="top" width="144" align="right">4 Chewbaccas</td>
</tr>
<tr>
<td valign="top" width="109" align="right">1,000,000,000 </td>
<td valign="top" width="132" align="right">1.4x sun radius</td>
<td valign="top" width="123" align="right">64</td>
<td valign="top" width="143" align="right">64 meters</td>
<td valign="top" width="144" align="right">half a football field</td>
</tr>
</tbody>
</table>
<h2>Stable algorithms also matter</h2>
<p>Some time ago I investigated some asserts in a particle animation system. Values were going out of range after less than an hour of gameplay and I traced this back to an out-of-range ‘t’ value being passed to the Lerp function, which expected it to always be from 0.0 to 1.0. Clamping was one obvious solution but I first investigated why ‘t’ was going out of range.</p>
<p>One problem with the code was that the three parameters were all floats, so over long periods of time it would inevitably have insufficient precision. However we were getting instability much earlier than expected and it felt like switching to double immediately might just mask an underlying problem.</p>
<p>The parameters to the function, all time values in seconds, corresponded to the end of an animation segment, the length of that segment, and the current time, which was always between the start of the segment (segmentEnd-segmentLength) and ‘segmentEnd’. Because the start time of the segment was not passed in this code calculated it, and then did a straightforward calculation to get ‘t’:</p>
<p><pre class="brush: cpp; light: true;">
float CalcTBad(float segmentEnd, float segmentLength, float time)
{
    float segmentStart = segmentEnd - segmentLength;
    float t = (time - segmentStart) / segmentLength;
    return t;
}
</pre></p>
<p>Straightforward, but unstable. Because ‘segmentLength’ is presumed to be quite small compared to ‘segmentEnd’, there is some rounding during the first subtraction and the difference between ‘segmentStart’ and ‘segmentEnd’ will be a bit larger or smaller than ‘segmentLength’. The resulting difference will always be a multiple of the current precision, so it will degrade over time, but even very early in the game the result will not be perfect. Because the value for ‘segmentStart’ is slightly wrong the value of “time – segmentStart” will be slightly wrong, and occasionally ‘t’ will be outside of the 0.0 to 1.0 range.</p>
<p>This will happen even if you use doubles. The errors will be smaller, but ‘t’ can still go slightly outside the 0.0 to 1.0 range. As the game goes on ‘t’ will range farther outside of the correct range, but from just a few minutes into the game the results will show signs of instability.</p>
<p>The natural tendency is to say “floating-point math is flaky, clamp the results and move on”, but we can do better, as shown here:</p>
<p><pre class="brush: cpp; light: true;">
float CalcTGood(float segmentEnd, float segmentLength, float time)
{
    float howLongAgo = segmentEnd - time;
    float t = (segmentLength - howLongAgo) / segmentLength;
    return t;
}
</pre></p>
<p>Mathematically this calculation is identical to CalcTBad, but from a stability point of view it is greatly improved.</p>
<p>If we assume that ‘time’ and ‘segmentEnd’ are large compared to ‘segmentLength’, then we can reasonably assume that ‘segmentEnd’ is less than twice as large as ‘time’. And it turns out that if two floats are that close (neither is more than twice the other) then their difference will fit exactly into a float. Always. So the calculation of ‘howLongAgo’ is exact. Ponder that for a moment – given a few reasonable assumptions we have <em>exact</em> results for one of our floating-point math operations.</p>
<p>With ‘howLongAgo’ being exact, if ‘time’ is within its prescribed range then ‘howLongAgo’ will be between zero and ‘segmentLength’, and so will ‘segmentLength’ minus ‘howLongAgo’. IEEE floating-point math guarantees correct rounding so when we divide by ‘segmentLength’ we are guaranteed that ‘t’ will be from 0.0 to 1.0. No clamping needed, even with floats.</p>
<p>This real example demonstrates a few things:</p>
<ul>
<li>Any time you add or subtract floats of widely varying magnitudes you need to watch for loss of precision
<li>Sometimes using ‘double’ instead of ‘float’ is the correct solution, but often a more stable algorithm is more important
<li>CalcT should probably use double (to give sufficient precision after many hours of gameplay) </li>
</ul>
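<p>The difference between the two versions can be checked directly. The sketch below repeats both functions (with a precondition assert added) and is meant to be evaluated at the very end of a segment. The inputs suggested afterwards, a segment ending one hour in and lasting 0.1 seconds, are assumptions chosen for illustration, and the sketch presumes that float expressions are evaluated in float precision, as SSE code on x64 typically does:</p>

```cpp
#include <cassert>

// Unstable: segmentStart is rounded to the precision of segmentEnd,
// so it is usually not exactly segmentEnd - segmentLength.
float CalcTBad(float segmentEnd, float segmentLength, float time)
{
    assert(segmentLength > 0.0f);
    float segmentStart = segmentEnd - segmentLength;
    float t = (time - segmentStart) / segmentLength;
    return t;
}

// Stable: the first subtraction is exact when time is close to segmentEnd.
float CalcTGood(float segmentEnd, float segmentLength, float time)
{
    assert(segmentLength > 0.0f);
    float howLongAgo = segmentEnd - time;
    float t = (segmentLength - howLongAgo) / segmentLength;
    return t;
}

// Hypothetical helper: is t in the range that Lerp expects?
bool IsValidT(float t)
{
    return t >= 0.0f && t <= 1.0f;
}
```

<p>With segmentEnd = 3600.0f, segmentLength = 0.1f and time = 3600.0f, the rounded segmentStart is 3599.89990234375 rather than 3599.9, so CalcTBad returns about 1.001 while CalcTGood returns exactly 1.0.</p>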
<h2>Your compiler is trying to tell you something…</h2>
<p>With Visual C++ on the default warning level you will get warning C4244 when you assign a double to a float:</p>
<blockquote>
<p>warning C4244: &#8216;initializing&#8217; : conversion from &#8216;double&#8217; to &#8216;float&#8217;, possible loss of data</p>
</blockquote>
<p>Possible loss of data is not necessarily a problem, but it can be. Suppressing warnings, with #pragma warning or with a cast, is something that should be done thoughtfully, after understanding the issue. Otherwise the compiler might say “I told you so” when your game fails after a twenty-four hour soak test.</p>
<h2>Does it matter?</h2>
<p>For some game types this problem may be irrelevant. Many games finish in less than an hour and a float that holds 3,600 (seconds) still has sub-millisecond accuracy, which is enough for most purposes. This means that for those game types you should be fine storing time in a float, as long as you reset the zero-point of GetTime() at the beginning of each game, and as long as the clock stops running when the game is paused.</p>
<p>For other game types – probably the majority of games – you need to do your time calculations using a double or uint64_t. I’ve seen problems on multiple games that failed to follow this rule. The problems are particularly tedious to track down and fix because they may take many hours to show up.</p>
<p>Store your time values in a double, starting at 2^32 seconds, and then you don’t need to worry, at least not as much, as long as you avoid unstable algorithms.</p>
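<p>One way to see why this matters is to ask whether a clock still advances when a frame time is added to it. This is a minimal sketch, not code from any particular game: the function names are hypothetical and it again assumes float expressions are evaluated in float precision.</p>

```cpp
#include <cassert>

// Hypothetical helper: true if adding one frame time leaves a float clock unchanged.
bool FloatClockStalls(float now, float frameTime)
{
    assert(frameTime > 0.0f);
    float next = now + frameTime;
    return next == now;
}

// Hypothetical helper: the same question for a double clock.
bool DoubleClockStalls(double now, double frameTime)
{
    assert(frameTime > 0.0);
    double next = now + frameTime;
    return next == now;
}
```

<p>At 524,288 seconds (about six days) adjacent floats are 0.0625 apart, so adding a 60 Hz frame time of 1/60 of a second rounds straight back to the old value and FloatClockStalls returns true. A double that starts at 2^32 seconds has a spacing of 2^-20 seconds, so DoubleClockStalls returns false for the same frame time.</p>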
<h2>Next time…</h2>
<p>In the next post I think it might finally be time to start jumping into the delicate subject of how to compare floating-point numbers, with the many subtleties involved. Previous articles in this series, and other posts, can be found <a href="http://randomascii.wordpress.com/category/floating-point/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/02/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/02/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>They sure look equal&#8230;</title>
		<link>http://randomascii.wordpress.com/2012/02/11/they-sure-look-equal/</link>
		<comments>http://randomascii.wordpress.com/2012/02/11/they-sure-look-equal/#comments</comments>
		<pubDate>Sun, 12 Feb 2012 04:07:57 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[Floating Point]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[debuggers]]></category>
		<category><![CDATA[floating point]]></category>
		<category><![CDATA[precision]]></category>
		<category><![CDATA[visual studio]]></category>
		<category><![CDATA[windbg]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=413</guid>
		<description><![CDATA[This is a special bonus extra post in my floating-point series, ranting about an issue that has been a problem for years. Some debuggers don’t display floats with enough precision. We count on our programming tools to be precise, and &#8230; <a href="http://randomascii.wordpress.com/2012/02/11/they-sure-look-equal/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=413&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This is a special bonus extra post in my floating-point series, ranting about an issue that has been a problem for years.</p>
<p>Some debuggers don’t display floats with enough precision.</p>
<p><span id="more-413"></span>
<p>We count on our programming tools to be precise, and accurate, so when they present us with misleading information for no good reason we should be annoyed. Here is an example:</p>
<p><a href="http://randomascii.files.wordpress.com/2012/02/image.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/02/image_thumb.png?w=363&#038;h=106" width="363" height="106"></a></p>
<p>The debugger is plainly showing me that f1 and f2 have the same value, but it is also showing me that they are not equal. This seems confusing.</p>
<p>Even though comparing floating-point numbers for equality is a dark endeavor full of traps we should still expect that two floats (and they are both floats, see the ‘Type’ column) that contain the same value should be equal.</p>
<p>Huh.</p>
<p>An expert programmer knows that there are a few things that one can do in order to resolve this ambiguity. One can cast f1 and f2 to double in order to convince the debugger to print them using more digits, or one can display the underlying integer representation. With that done we see that f1 and f2 are indeed different numbers. Their floating-point values are slightly different, and their integer representations differ by one.</p>
<p><a href="http://randomascii.files.wordpress.com/2012/02/image1.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/02/image_thumb1.png?w=362&#038;h=168" width="362" height="168"></a></p>
<p>It turns out that eight decimal digits of mantissa is not enough to reliably distinguish floats. You need nine.</p>
<h2>But why?</h2>
<p>That may seem surprising. A float mantissa has 24 bits (if we include the implied one) which gives us 16,777,216 different mantissa values, which is representable in eight digits. However it turns out that if the first digit of the decimal mantissa is small enough, and if the stars align properly, then one extra decimal digit may be required.</p>
<p>Numbers from 1,000 to just below 1,024 are perfect for this. It takes ten binary digits to represent the integer portion (four decimal digits) which leaves fourteen binary digits for the fractional portion. Fourteen binary digits is 16,384 different values. 16,384 different values clearly require five digits to uniquely identify, which means that nine decimal digits (4 + 5) are required to uniquely identify all float numbers in this range, and in fact nine decimal digits (for the mantissa) are sufficient to uniquely identify all float numbers.</p>
<p>On the other hand, numbers from 1,024 to just below 10,000 only require eight decimal digits to uniquely identify. They use eleven or more binary digits for the integer portion, which leaves thirteen or fewer binary digits for the fraction.</p>
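<p>The 1,000 to 1,024 case is easy to reproduce. The sketch below uses hypothetical helper names and assumes a C++11 standard library whose printf family rounds correctly. It formats two floats with a chosen number of digits after the decimal point and reports whether the text is identical:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstring>

// Hypothetical helper: the next representable float above f.
float NextFloatUp(float f)
{
    return std::nextafter(f, INFINITY);
}

// Hypothetical helper: true if a and b produce the same text when printed
// in %e notation with the given number of digits after the decimal point.
bool PrintIdentically(float a, float b, int digitsAfterPoint)
{
    assert(digitsAfterPoint >= 0 && digitsAfterPoint <= 20);
    char bufA[40];
    char bufB[40];
    std::snprintf(bufA, sizeof(bufA), "%1.*e", digitsAfterPoint, a);
    std::snprintf(bufB, sizeof(bufB), "%1.*e", digitsAfterPoint, b);
    return std::strcmp(bufA, bufB) == 0;
}
```

<p>The first two floats above 1,000 are 1000.00006103515625 and 1000.0001220703125. With seven digits after the decimal point (an eight-digit mantissa) both round to 1.0000001e+03, so PrintIdentically returns true. With eight digits after the decimal point they print as 1.00000006e+03 and 1.00000012e+03.</p>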
<h2>9 digits does what?</h2>
<p>It’s important to be precise here: nine digits is not enough to fully display the precise value of all float numbers – that actually requires over a hundred digits in some cases (more on that later). However nine digits of mantissa is enough to unambiguously identify which of the ~4 billion floats we are talking about.</p>
<p>Nine digits aren’t needed all the time. Most adjacent floats can be distinguished using just eight digits. Using Float_t and our ability to iterate through all floats it is easy enough to find exactly how many pairs of floats look identical when printed with an eight digit mantissa. Here’s some code:</p>
<p><pre class="brush: cpp; light: true;">
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

union Float_t
{
    Float_t(float f1 = 0.0f) : f(f1) {}

    int32_t i;
    float f;
    struct
    {
        uint32_t mantissa : 23;
        uint32_t exponent : 8;
        uint32_t sign : 1;
    } parts;
};

// Counts adjacent positive floats that print identically with an
// eight-digit mantissa, and so need nine digits to be distinguished.
void Count9PrintFloats()
{
    int matchCount = 0;
    uint32_t lastExponent = 0;
    Float_t allFloats(0.0f);

    char buffer[30];
    sprintf_s(buffer, &quot;%1.7e&quot;, allFloats.f);
    while (allFloats.parts.exponent &lt; 255)
    {
        // Move to the next float and compare its printed form
        // against that of the previous float.
        allFloats.i += 1;
        char buf2[_countof(buffer)];
        sprintf_s(buf2, &quot;%1.7e&quot;, allFloats.f);
        if (strcmp(buffer, buf2) == 0)
            matchCount += 1;
        strcpy_s(buffer, buf2);

        // Report progress each time the exponent changes.
        if (allFloats.parts.exponent != lastExponent)
        {
            printf(&quot;%d matches found up to exponent %u.\n&quot;,
                    matchCount, lastExponent);
            lastExponent = allFloats.parts.exponent;
        }
    }

    printf(&quot;%d matches found.\n&quot;, matchCount);
}
</pre></p>
<p>This program took a bit less than half an hour to iterate through all 2 billion positive floats and it found 32,226,412 pairs of adjacent floats that require nine-digit decimal mantissas to distinguish. In other words, roughly 1.5% of adjacent float pairs look identical when printed with an eight-digit mantissa.</p>
<p>Visual Studio (versions 2005, 2010, and the VS 11 Developer Preview all behave identically) displays floats with eight digits of precision, and this is insufficient. It needs to display them with nine. It is sooo close. I am hopeful that this can be corrected before VS 11 ships. Finally.</p>
<p>In your own development if you want your floating point numbers to round-trip from float to decimal and back I recommend printing them like this:</p>
<blockquote><p>printf_s(&quot;%1.8e&quot;, f);</p>
</blockquote>
<p>If printed in this manner you should get nine digits of mantissa and when you sscanf them back into memory you should get back the exact float that you started with.</p>
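<p>A round trip of this kind can be sketched as follows. The helper name is hypothetical, and the result depends on the C runtime rounding correctly in both directions, which is an assumption about the library rather than something this code can enforce:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <limits>

// The standard library's own statement that nine significant digits
// are needed to round-trip an IEEE single precision float.
static_assert(std::numeric_limits<float>::max_digits10 == 9,
              "expected an IEEE single precision float");

// Hypothetical helper: print a float with a nine-digit mantissa, read the
// text back, and report whether we got the original value.
bool RoundTripsThroughText(float original)
{
    assert(!std::isnan(original));
    char text[40];
    std::snprintf(text, sizeof(text), "%1.8e", original);
    float parsed = 0.0f;
    if (std::sscanf(text, "%f", &parsed) != 1)
        return false;
    return parsed == original;
}
```

<p>Calling RoundTripsThroughText on every float, using the iteration technique from Count9PrintFloats, would be one way to check a particular runtime.</p>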
<h2>Other debuggers</h2>
<p>WinDbg gets this right. It displays floats with 10 digits of precision.</p>
<p><a href="http://randomascii.files.wordpress.com/2012/02/image2.png"><img style="background-image:none;border-bottom:0;border-left:0;padding-left:0;padding-right:0;display:inline;border-top:0;border-right:0;padding-top:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/02/image_thumb2.png?w=417&#038;h=229" width="417" height="229"></a></p>
<h2>Doubles</h2>
<p>Both VS and WinDbg display doubles with 17 digits of precision. There are two reasons for this:</p>
<ol>
<li>17 decimal digits of mantissa are required to uniquely identify a double.</li>
<li>17 decimal digits of mantissa is the most that VC++ will print.</li>
</ol>
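<p>The first of those reasons can be checked with the same kind of comparison used for floats. This is a sketch with a hypothetical helper name, assuming a C++11 library with correctly rounded printf output:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::max_digits10 == 17,
              "expected an IEEE double precision double");

// Hypothetical helper: true if a and b produce the same text when printed
// with the given total number of significant digits.
bool DoublesPrintIdentically(double a, double b, int significantDigits)
{
    assert(significantDigits >= 1 && significantDigits <= 30);
    char bufA[64];
    char bufB[64];
    std::snprintf(bufA, sizeof(bufA), "%1.*e", significantDigits - 1, a);
    std::snprintf(bufB, sizeof(bufB), "%1.*e", significantDigits - 1, b);
    return std::strcmp(bufA, bufB) == 0;
}
```

<p>For example, 1.0 and the next double above it look the same with 16 significant digits and differ only in the 17th.</p>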
<p>If you ask VC++ to print more than 17 digits (printf(&quot;%1.30e&quot;, d1);) then the extra digits will all be zeroes, even though that is usually not the correct value. This is standards compliant, but annoying. We’ll deal with this problem, and explore how many digits it takes to perfectly represent the value of a float, in a subsequent post.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/02/11/they-sure-look-equal/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/02/image_thumb.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/02/image_thumb1.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/02/image_thumb2.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Stupid Float Tricks</title>
		<link>http://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/</link>
		<comments>http://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 06:03:11 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[AltDevBlogADay]]></category>
		<category><![CDATA[Floating Point]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[floating point]]></category>
		<category><![CDATA[integer representation]]></category>
		<category><![CDATA[mantissa]]></category>
		<category><![CDATA[negative zero]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=400</guid>
		<description><![CDATA[Type Punning is Not a Joke I left the last post with a promise to share an interesting property of the IEEE float format. There are several equivalent ways of stating this property, and here are two of them. For &#8230; <a href="http://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=400&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h2>Type Punning is Not a Joke</h2>
<p>I left the <a href="http://altdevblogaday.com/2012/01/05/tricks-with-the-floating-point-format/">last post</a> with a promise to share an interesting property of the IEEE float format. There are several equivalent ways of stating this property, and here are two of them.</p>
<p><span id="more-400"></span>
<p>For floats of the same sign:</p>
<ol>
<li>Adjacent floats have adjacent integer representations
<li>Incrementing the integer representation of a float moves to the next representable float, moving away from zero </li>
</ol>
<p>Depending on your math and mental state these claims will seem somewhere between fantastic/improbable and obvious/inevitable. I think it’s worth pointing out that these properties are certainly not inevitable. Many floating-point formats before the IEEE standard did not have these properties. These tricks only work because of the implied one in the mantissa (which avoids duplicate encodings for the same value), the use of an exponent bias, and the placement of the different fields of the float. The float format was carefully designed in order to guarantee this interesting characteristic.</p>
<p>I could go on at length to explain why incrementing the integer representation of a float moves to the next representable float (incrementing the mantissa increases the value of the float, and when the mantissa wraps to zero that increments the exponent, QED) but instead I recommend you either trust me or else play around with Float_t in your debugger until you see how it works.</p>
<p>One thing to be aware of is the understated warning that this only applies for floats of the same sign. The representation of positive zero is adjacent to the representation for 1.40129846e-45, but the representation for negative zero is about two billion away, because its sign bit is set which means that its integer representation is the most negative 32-bit integer. That means that while positive and negative zero compare equal as floats, their integer representations have radically different values. This also means that tiny positive and negative numbers have integer representations which are about two billion apart. Beware!</p>
<p>Another thing to be aware of is that while incrementing the integer representation of a float normally increases the value by a modest and fairly predictable ratio (typically the larger number is at most about 1.0000012 times larger) this does not apply for very small numbers (between zero and FLT_MIN) or when going from FLT_MAX to infinity. When going from zero to the smallest positive float or from FLT_MAX to infinity the ratio is actually infinite, and when dealing with numbers between zero and FLT_MIN the ratio can be as large as 2.0. However in-between FLT_MIN and FLT_MAX the ratio is relatively predictable and consistent.</p>
<p>Here’s a concrete example of using this property. This code prints all 255*2^23+1 positive floats, from +0.0 to +infinity:</p>
<p><pre class="brush: cpp; light: true;">
union Float_t
{
    int32_t i;
    float f;
    struct
    {
        uint32_t mantissa : 23;
        uint32_t exponent : 8;
        uint32_t sign : 1;
    } parts;
};

void IterateAllPositiveFloats()
{
    // Start at zero and print that float.
    Float_t allFloats;
    allFloats.f = 0.0f;
    printf(&quot;%1.8e\n&quot;, allFloats.f);

    // Continue through all of the floats, stopping
    // when we get to positive infinity.
    while (allFloats.parts.exponent &lt; 255)
    {
        // Increment the integer representation
        // to move to the next float.
        allFloats.i += 1;
        printf(&quot;%1.8e\n&quot;, allFloats.f);
    }
}
</pre></p>
<p>The (partial) output looks like this:</p>
<blockquote>
<p><font color="#000000">0.00000000e+000 <br />1.40129846e-045 <br />2.80259693e-045 <br />4.20389539e-045 <br />5.60519386e-045 <br />7.00649232e-045 <br />8.40779079e-045 <br />9.80908925e-045 <br />… <br />3.40282306e+038 <br />3.40282326e+038 <br />3.40282347e+038 <br />1.#INF0000e+000</font></p>
</blockquote>
<p>For double precision floats you could use <a href="http://msdn.microsoft.com/en-us/library/h0dff77w(v=vs.100).aspx">_nextafter()</a> to walk through all of the available doubles, but I’m not aware of a simple and portable alternative to this technique for 32-bit floats.</p>
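<p>C99 and C++11 standardized nextafterf() and float overloads of std::nextafter(), so on a toolchain that provides them the same walk can be written without touching the integer representation. The function below is a sketch under that assumption, with a hypothetical name, and it counts floats rather than printing them:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Counts the floats in the half-open range [first, last) by stepping
// from each float to the next representable one.
std::uint32_t CountFloatsInRange(float first, float last)
{
    assert(first <= last);  // Also rejects NaNs.
    std::uint32_t count = 0;
    for (float f = first; f < last; f = std::nextafter(f, last))
        ++count;
    return count;
}
```

<p>There are 2^23 (8,388,608) floats from 1.0 up to but not including 2.0, one for each mantissa value, and that is what CountFloatsInRange(1.0f, 2.0f) returns.</p>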
<p>We can use this property and the Float_t union to find out how much precision a float variable has at a particular range. We can assign a float to Float_t::f, then increment or decrement the integer representation, and then compare the before/after float values to see how much they have changed. Here is some sample code that does this:</p>
<p><pre class="brush: cpp; light: true;"> 
float TestFloatPrecisionAwayFromZero(float input)
{
    union Float_t num;
    num.f = input;
    // Incrementing infinity or a NaN would be bad!
    assert(num.parts.exponent &lt; 255);
    // Increment the integer representation of our value
    num.i += 1;
    // Subtract the initial value to find our precision
    float delta = num.f - input;
    return delta;
}

float TestFloatPrecisionTowardsZero(float input)
{
    union Float_t num;
    num.f = input;
    // Decrementing from zero would be bad!
    assert(num.parts.exponent || num.parts.mantissa);
    // Decrementing a NaN would be bad!
    assert(num.parts.exponent != 255 || num.parts.mantissa == 0);
    // Decrement the integer representation of our value
    num.i -= 1;
    // Subtract the initial value to find our precision
    float delta = num.f - input;
    return -delta;
}

struct TwoFloats
{
    float awayDelta;
    float towardsDelta;
};

struct TwoFloats TestFloatPrecision(float input)
{
    struct TwoFloats result =
    {
        TestFloatPrecisionAwayFromZero(input),
        TestFloatPrecisionTowardsZero(input),
    };
    return result;
}
</pre></p>
<p>Note that the difference between the values of two adjacent floats can always be stored exactly in a (possibly subnormal) float. I have a truly marvelous proof of this theorem which the margin is too small to contain.</p>
<p>These functions can be called from test code to learn about the float format. Better yet, when sitting at a breakpoint in Visual Studio you can call them from the watch window. That allows dynamic exploration of precision:</p>
<p><a href="http://randomascii.files.wordpress.com/2012/01/image3.png"><img style="padding-left:0;padding-right:0;padding-top:0;border-width:0;" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/01/image_thumb3.png?w=536&#038;h=195" width="536" height="195"></a></p>
<p>Usually the delta is the same whether you increment the integer representation or decrement it. However if incrementing or decrementing changes the exponent then the two deltas will be different. This can be seen in the example above where the precision at 65536 is twice as good (half the delta) going towards zero compared to going away from zero.</p>
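<p>The deltas can also be predicted from the exponent alone, since a float has 24 bits of mantissa including the implied one. The function below is a sketch with a hypothetical name. It handles positive normal floats only and assumes a C++11 standard library:</p>

```cpp
#include <cassert>
#include <cmath>

// Gap between a positive normal float and the next float away from zero,
// computed from the exponent rather than from the integer representation.
float SpacingAwayFromZero(float input)
{
    assert(std::isnormal(input) && input > 0.0f);
    int exponent = 0;
    // frexp expresses input as m * 2^exponent with 0.5 <= m < 1.
    std::frexp(input, &exponent);
    // The 24-bit mantissa divides that range into steps of 2^(exponent - 24).
    return std::ldexp(1.0f, exponent - 24);
}
```

<p>At 65536 this gives 0.0078125, while for 65535, which sits in the next lower exponent range, it gives 0.00390625. That is the factor of two seen in the watch window.</p>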
<h2>Caveat Incrementor</h2>
<p>Pay close attention to the number of caveats that you have to watch for when you start partying on the integer representation of a float. It’s safe in a controlled environment, but things can quickly go bad in the real world:</p>
<ul>
<li>Some compilers may prohibit the <a href="http://labs.qt.nokia.com/2011/06/10/type-punning-and-strict-aliasing/">type-punning/aliasing</a> used by Float_t (gcc and VC++ allow it)
<li>Incrementing the integer representation of infinity gives you a NaN
<li>Decrementing the integer representation of zero gives you a NaN
<li>Incrementing or decrementing some NaNs will give you zero or infinity
<li>The ratio of the value of two adjacent floats is usually no more than about 1.0000012, but is sometimes much much larger
<li>The representations of positive and negative zero are far removed from each other
<li>The representations of FLT_MIN and -FLT_MIN are as far from each other as the representations of FLT_MAX and -FLT_MAX
<li>Floating-point math is <em>always</em> more complicated than you expect </li>
</ul>
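<p>Several of these caveats are easy to see in code. The sketch below uses std::memcpy instead of the Float_t union, because copying the object representation is well-defined in standard C++ and so sidesteps the aliasing question. The helper names are hypothetical:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Returns the integer representation of a float.
std::uint32_t FloatToBits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Returns the float with the given integer representation.
float BitsToFloat(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Moves one float away from zero. Infinity and NaNs are rejected
// because incrementing them does not give a meaningful neighbor.
float NextFloatAwayFromZero(float f)
{
    assert(std::isfinite(f));
    return BitsToFloat(FloatToBits(f) + 1u);
}
```

<p>With these helpers, FloatToBits(0.0f) is 0 while FloatToBits(-0.0f) is 0x80000000, and both BitsToFloat(FloatToBits(INFINITY) + 1) and BitsToFloat(FloatToBits(0.0f) - 1) are NaNs.</p>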
<h2>Log Rolling</h2>
<p>A related property is that, for floats that are positive, finite, and non-zero, the integer representation of a float is a piecewise linear approximation of its base 2 logarithm. I just like saying that. It sounds cool.</p>
<p>The integer representation is (after appropriate scaling and biasing) a piecewise linear approximation of the base 2 logarithm of a (positive) float because the exponent is logically the base-2 logarithm of a float, and it is in the high bits. The mantissa linearly interpolates between power-of-2 floats. The code below demonstrates the concept:</p>
<p><pre class="brush: cpp; light: true;"> 
void PlotFloatsVersusRep()
{
    // Let's plot some floats and their representations from 1.0 to 32.0
    for (float f = 1.0f; f &lt;= 32.0f; f *= 1.01f)
    {
        Float_t num;
        num.f = f;
        // The representation of 1.0f is 0x3f800000 and the representation
        // of 2.0f is 0x40000000, so if the representation of a float is
        // an approximation of its base 2 log then 0x3f800000 must be
        // log2(1.0) == 0 and 0x40000000 must be log2(2.0) == 1.
        // Therefore we should scale the integer representation by
        // subtracting 0x3f800000 and dividing by
        // (0x40000000 - 0x3f800000)
        double log2Estimate = (num.i - 0x3f800000) /
                    double(0x40000000 - 0x3f800000);
        //printf(&quot;%1.5f,%1.5f\n&quot;, f, log2Estimate);
        double log2 = log(f) / log(2.0);
        printf(&quot;%1.5f,%1.5f,%1.5f,%1.5f\n&quot;, f, log2Estimate, log2, log2Estimate / log2);
    }
}
</pre></p>
<p>If we drop the results into Excel and plot them with the x-axis on a base-2 log scale then we get this lovely chart:</p>
<p><a href="http://randomascii.files.wordpress.com/2012/01/image4.png"><img style="padding-left:0;padding-right:0;padding-top:0;border-width:0;" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/01/image_thumb4.png?w=547&#038;h=381" width="547" height="381"></a></p>
<p>If it’s plotted linearly then the ‘linear’ part of ‘piecewise linear’ becomes obvious, but I like the slightly scalloped straight line better. The estimate is exact when the float is a power of two, and at its worst is about 0.086 too small.</p>
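<p>The size of the worst-case error can be checked numerically. This sketch repeats the scaling from PlotFloatsVersusRep using std::memcpy, then scans 4,096 evenly spaced floats from 1.0 up to 2.0. The function names and the sampling density are assumptions made for illustration:</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Estimates log2(f) from the integer representation of a positive finite float.
double Log2Estimate(float f)
{
    assert(f > 0.0f && std::isfinite(f));
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    const std::uint32_t oneBits = 0x3f800000;  // Representation of 1.0f
    const double binadeSize = 8388608.0;       // 0x40000000 - 0x3f800000
    return (static_cast<double>(bits) - oneBits) / binadeSize;
}

// Largest amount by which the estimate falls short of the true base 2
// logarithm over the sampled floats in [1.0, 2.0).
double WorstLog2EstimateError()
{
    double worst = 0.0;
    for (int step = 0; step < 4096; ++step)
    {
        float f = 1.0f + step / 4096.0f;
        double error = std::log2(static_cast<double>(f)) - Log2Estimate(f);
        if (error > worst)
            worst = error;
    }
    return worst;
}
```

<p>The estimate is exact at powers of two (Log2Estimate(8.0f) is 3.0), and the worst error found by the scan is a little over 0.086, near 1.44.</p>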
<p>In the last of this week’s stupid float tricks I present you with this dodgy code:</p>
<p><pre class="brush: cpp; light: true;"> 
int RoundedUpLogTwo(uint64_t input)
{
    assert(input &gt; 0);
    union Float_t num;
    num.f = (float)input;
    // Increment the representation enough that for any non power-
    // of-two (FLT_MIN or greater) we will move to the next exponent.
    num.i += 0x7FFFFF;
    // Return the exponent after subtracting the bias.
    return num.parts.exponent - 127;
}
</pre></p>
<p>Depending on how you think about such things this is either the simplest and most elegant, or the most complicated and obtuse way of finding out how many bits it takes to represent a particular integer.</p>
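<p>One reason to call it dodgy is that converting a uint64_t to a float can round, so large inputs that are just above a power of two may lose the bit that matters. The sketch below pairs a std::memcpy variant of the trick with a plain integer loop for comparison. Both function names are hypothetical and the default round-to-nearest mode is assumed:</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The float trick, using memcpy instead of the Float_t union.
int RoundedUpLogTwoViaFloat(std::uint64_t input)
{
    assert(input > 0);
    float f = static_cast<float>(input);  // May round for inputs above 2^24.
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    // Push any non power-of-two into the next exponent.
    bits += 0x7FFFFF;
    // Extract the exponent field and subtract the bias.
    return static_cast<int>((bits >> 23) & 0xFF) - 127;
}

// Integer-only reference: counts the bits needed to represent input - 1.
int RoundedUpLogTwoInteger(std::uint64_t input)
{
    assert(input > 0);
    int result = 0;
    for (std::uint64_t remaining = input - 1; remaining > 0; remaining >>= 1)
        ++result;
    return result;
}
```

<p>Both functions return 3 for an input of 5. For 16,777,217 (2^24 + 1) the float conversion rounds down to 2^24, so the float version returns 24 while the integer version returns 25.</p>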
<h2>Random aside</h2>
<p>The IEEE standard guarantees correct rounding for addition, multiplication, subtraction, division, and square-root. If you’ve ever wondered if this is important, then try this: using the Windows calculator (I’m using the Windows 7 version), calculate sqrt(4) &#8211; 2. The answer should, of course, be zero. However the answer that the calculator actually returns is:</p>
<blockquote>
<p>Scientific mode: -8.1648465955514287168521180122928e-39 <br />Standard mode: -1.068281969439142e-19</p>
</blockquote>
<p>This is utterly fascinating. It shows that the calculator is using impressive precision (it did the calculations to <em>40</em> digits of accuracy, 20 digits in standard mode) and yet it got the wrong answer. Because the Windows calculator is using so much precision it will, for many calculations, get a more accurate answer than an IEEE float or double. However, because it fails to correctly round the answer (the last digit cannot be trusted) it sometimes gives answers that are <a href="http://answers.microsoft.com/en-us/windows/forum/windows_7-windows_programs/windows-calculator-gives-wrong-answer/c94c2aa5-03a0-42f7-82ee-899800355613">laughably wrong</a>.</p>
<p>I can just imagine the two calculator modes arguing: the standard mode says “according to my calculations sqrt(4) is 1.9999999999999999999” and the scientific mode says “according to <em>my</em> calculations it’s 1.999999999999999999999999999999999999992 – that is far more accurate”. Meanwhile an IEEE float says “I’ve only got about eight digits of precision but I think sqrt(4) is 2.”</p>
<p>Having lots of digits of precision is nice, but without correct rounding it can just end up making you look foolish.</p>
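<p>For comparison, here is the same calculation with an IEEE float, as a minimal sketch. The function name SqrtFourMinusTwo is made up, and the volatile is only there to discourage the compiler from folding the whole thing away at compile time:</p>

```cpp
#include <cmath>

// The square root of 4.0f is exactly representable, so a correctly
// rounded square root has to return exactly 2.0f.
float SqrtFourMinusTwo()
{
    volatile float four = 4.0f;  // discourages constant folding
    return std::sqrt(four) - 2.0f;
}
```

<p>With correctly rounded arithmetic the result is exactly zero, not merely something very close to it.</p>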
<h2>That’s all folks</h2>
<p>In the next post I’ll discuss some concrete examples that show the value of knowing how much precision is available around certain values. Until next time, <a href="http://www.youtube.com/watch?v=CTAud5O7Qqk">float on</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/01/23/stupid-float-tricks-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/01/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/01/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>SSOPA</title>
		<link>http://randomascii.wordpress.com/2012/01/18/ssopa/</link>
		<comments>http://randomascii.wordpress.com/2012/01/18/ssopa/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 17:35:17 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[Computers and Internet]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=381</guid>
		<description><![CDATA[Stop SOPA SOPA is a bill that Hollywood, the music industry, and others (Big Copyright) are lobbying congress to pass. This bill would allow extra-judicial actions to block web sites and payment processing for web sites that Big Copyright disagrees &#8230; <a href="http://randomascii.wordpress.com/2012/01/18/ssopa/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=381&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h2>Stop SOPA</h2>
<p>SOPA is a bill that Hollywood, the music industry, and others (Big Copyright) are lobbying Congress to pass. This bill would allow extra-judicial actions to block web sites and payment processing for web sites that Big Copyright disagrees with. This is a bill too far. Way too far.</p>
<p>Copyright is a balancing act. Article I, Section 8 of the US Constitution says that one of the powers of Congress is:</p>
<blockquote><p>To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries</p>
</blockquote>
<p>This doesn’t say that copyrights should last forever (<a href="http://en.wikipedia.org/wiki/Copyright_Term_Extension_Act">hi Mickey!</a>), and it doesn’t say that copyrights should be enforced at all costs (<a href="http://en.wikipedia.org/wiki/Fair_use">fair use</a> being one critical exception). It just says that a bit of protection for intellectual property might be kinda nice.</p>
<p>Preventing piracy in the age of the Internet is tricky. Impossible actually. But draconian measures that muzzle legitimate free speech and give excessive power to large content creators are not the answer. Balanced legislation that includes sufficient <a href="http://arstechnica.com/tech-policy/news/2012/01/even-without-dns-provisions-sopa-and-pipa-remain-fatally-flawed.ars">judicial oversight</a> is needed. And, since much piracy is driven by convenience, content providers need to provide <a href="http://arstechnica.com/tech-policy/news/2012/01/forget-sopa-copyright-owners-must-build-a-better-bittorrent.ars">easier ways</a> of <a href="http://store.steampowered.com/">acquiring legal content</a>. It’s a crazy idea I know: instead of suing your customers and blocking their access to websites, why not make it easier for them to purchase your products?</p>
<p>See Wikipedia’s <a href="http://en.wikipedia.org/wiki/Wikipedia:SOPA_initiative/Learn_more">SOPA and PIPA</a> article for more information – it’s still up today.</p>
<p>And if you are a US citizen then make sure that your representatives know that muzzling free speech without judicial oversight so that Eminem can sell a couple more CDs is not acceptable.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/01/18/ssopa/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>
	</item>
		<item>
		<title>Tricks With the Floating-Point Format</title>
		<link>http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/</link>
		<comments>http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 06:11:21 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[AltDevBlogADay]]></category>
		<category><![CDATA[Floating Point]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=370</guid>
		<description><![CDATA[Years ago I wrote an article about how to do epsilon floating-point comparisons by using integer comparisons. That article has been quite popular (it is frequently cited, and the code samples have been used by a number of companies) and &#8230; <a href="http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=370&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Years ago I wrote an article about how to do epsilon floating-point comparisons by using integer comparisons. That article has been quite popular (it is frequently cited, and the code samples have been used by a number of companies) and this worries me a bit, because the article has some flaws. I’m not going to link to the article because I want to replace it, not send people looking for it.</p>
<p>Today I am going to start laying the groundwork for explaining how and why this trick works, while also exploring the weird and wonderful world of floating-point math.</p>
<p><span id="more-370"></span>
<p>This article was originally posted on <a href="http://altdevblogaday.com/2012/01/05/tricks-with-the-floating-point-format/">#AltDevBlogADay</a>.</p>
<p>There are lots of references that explain the layout and decoding of floating-point numbers. In this post I am going to supply the layout, and then show how to reverse engineer the decoding process through experimentation.</p>
<p>The <a href="http://en.wikipedia.org/wiki/IEEE_754">IEEE 754-1985 standard</a> specifies the format for 32-bit floating-point numbers, the type known as ‘float’ in many languages. The <a href="http://en.wikipedia.org/wiki/IEEE_754-2008">2008 version</a> of the standard adds new formats but doesn’t change the existing ones, which have been standardized for over 25 years.</p>
<p>A 32-bit float consists of a one-bit sign field, an eight-bit exponent field, and a twenty-three-bit mantissa field. The union below shows the layout of a 32-bit float. This union is very useful for exploring and working with the internals of floating-point numbers. I don’t recommend using this union for production coding (it is a violation of the aliasing rules for some compilers, and will probably generate inefficient code), but it is useful for learning. These articles on <a href="http://labs.qt.nokia.com/2011/06/10/type-punning-and-strict-aliasing/">aliasing</a> and <a href="http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html">undefined behavior</a> provide some more details.</p>
<p><pre class="brush: cpp; light: true;">
union Float_t
{
    Float_t(float f1 = 0.0f) : f(f1) {}
    // Portable sign-extraction
    bool Sign() const { return (i &gt;&gt; 31) != 0; }

    int32_t i;
    float f;
    struct
    {   // Bitfields for exploration. Do not use in production code.
        uint32_t mantissa : 23;
        uint32_t exponent : 8;
        uint32_t sign : 1;
    } parts;
};
</pre></p>
<p>The format for 32-bit float numbers was carefully designed to allow them to be reinterpreted as an integer, and the aliasing of ‘i’ and ‘f’ should work on most platforms (provided that the compiler, like gcc and VC++, allows aliasing through unions), with the sign bit of the integer and the float occupying the same location.</p>
<p>The layout of bitfields is compiler dependent so the bitfield struct that is also in the union may not work on all platforms. However it works on Visual C++ on x86 and x64, which is good enough for my exploratory purposes. On big endian systems like SPARC and PPC the order in the bitfield struct is reversed.</p>
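<p>If the union or the bitfields don’t behave on your compiler, a more portable way to get at the same three fields is memcpy plus shifts and masks. This is a minimal sketch, assuming 32-bit IEEE floats; FloatFields, FloatToBits, and SplitFloat are hypothetical names and are not used elsewhere in this article:</p>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical container for the three fields of a 32-bit float.
struct FloatFields
{
    uint32_t sign;      // 1 bit
    uint32_t exponent;  // 8 bits, exactly as stored
    uint32_t mantissa;  // 23 bits, exactly as stored
};

// Copies the float's bytes into an integer. memcpy avoids the aliasing
// questions that the union raises.
uint32_t FloatToBits(float f)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Splits the integer representation with shifts and masks, so the result
// does not depend on how the compiler lays out bitfields.
FloatFields SplitFloat(float f)
{
    const uint32_t bits = FloatToBits(f);
    FloatFields fields;
    fields.sign = bits >> 31;
    fields.exponent = (bits >> 23) & 0xFF;
    fields.mantissa = bits & 0x7FFFFF;
    return fields;
}
```

<p>The shifts operate on the integer value rather than on memory, so the field extraction itself is the same on little-endian and big-endian machines.</p>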
<p>In order to really understand floats, it is important to explore and experiment. One way to explore is to write code like this, in a debug build so that the compiler doesn’t optimize it away:</p>
<p><pre class="brush: cpp; light: true;">
void TestFunction()
{
    Float_t num(1.0f);
    num.i -= 1;
    printf(&quot;Float value, representation, sign, exponent, mantissa\n&quot;);
    for (;;)
    {
        // Breakpoint here.
        printf(&quot;%1.8e, 0x%08X, %d, %d, 0x%06X\n&quot;,
            num.f, num.i,
            num.parts.sign, num.parts.exponent, num.parts.mantissa);
    }
}
</pre></p>
<p>Put a breakpoint on the ‘printf’ statement and then add the various components of <em>num</em> to your debugger’s watch window and examine them, like this:</p>
<p><a href="http://randomascii.files.wordpress.com/2012/01/image.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2012/01/image_thumb.png?w=355&#038;h=139" width="355" height="139"></a></p>
<p>You can then start trying interactive experiments, such as incrementing the mantissa or exponent fields, incrementing num.i, or toggling the value of the sign field. As you do this you should watch num.f to see how it changes. Or, assign various floating-point values to num.f and see how the other fields change. You can either view the results in the debugger’s watch window, or hit ‘Run’ after each change so that the printf statement executes and prints some nicely formatted results.</p>
<p>Go ahead. Put Float_t and the sample code into a project and play around with it for a few minutes. Discover the minimum and maximum float values. Experiment with the minimum and maximum mantissa values in various combinations. Think about the implications. This is the best way to learn. I’ll wait.</p>
<p><a href="http://randomascii.files.wordpress.com/2012/01/img_0291-400x210.jpg"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="IMG_0291 (400x210)" border="0" alt="IMG_0291 (400x210)" src="http://randomascii.files.wordpress.com/2012/01/img_0291-400x210_thumb.jpg?w=244&#038;h=130" width="244" height="130"></a></p>
<p>I’ve put some of the results that you might encounter during this experimentation into the table below:</p>
<table border="2" cellspacing="0" cellpadding="2" width="663">
<tbody>
<tr>
<td valign="top" width="181"><strong>Float value</strong></td>
<td valign="top" width="185"><strong>Integer representation</strong></td>
<td valign="top" width="58"><strong>Sign</strong></td>
<td valign="top" width="124"><strong>Exponent field</strong></td>
<td valign="top" width="111"><strong>Mantissa field</strong></td>
</tr>
<tr>
<td valign="top" width="192"><font size="2" face="Courier New"><strong>0.0</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x00000000</strong></font></td>
<td valign="top" width="61"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="128"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="114"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="195"><font size="2" face="Courier New"><strong>1.40129846e-45</strong></font></td>
<td valign="top" width="195"><font size="2" face="Courier New"><strong>0x00000001</strong></font></td>
<td valign="top" width="63"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="115"><font size="2" face="Courier New"><strong>1</strong></font></td>
</tr>
<tr>
<td valign="top" width="195"><font size="2" face="Courier New"><strong>1.17549435e-38</strong></font></td>
<td valign="top" width="195"><font size="2" face="Courier New"><strong>0x00800000</strong></font></td>
<td valign="top" width="64"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="130"><font size="2" face="Courier New"><strong>1</strong></font></td>
<td valign="top" width="116"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="195"><font size="2" face="Courier New"><strong>0.2</strong></font></td>
<td valign="top" width="194"><font size="2" face="Courier New"><strong>0x3E4CCCCD</strong></font></td>
<td valign="top" width="65"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="130"><font size="2" face="Courier New"><strong>124</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x4CCCCD</strong></font></td>
</tr>
<tr>
<td valign="top" width="194"><font size="2" face="Courier New"><strong>1.0</strong></font></td>
<td valign="top" width="194"><font size="2" face="Courier New"><strong>0x3F800000</strong></font></td>
<td valign="top" width="66"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="130"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="194"><font size="2" face="Courier New"><strong>1.5</strong></font></td>
<td valign="top" width="194"><font size="2" face="Courier New"><strong>0x3FC00000</strong></font></td>
<td valign="top" width="67"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x400000</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>1.75</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x3FE00000</strong></font></td>
<td valign="top" width="68"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x600000</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>1.99999988</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x3FFFFFFF</strong></font></td>
<td valign="top" width="69"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>2.0</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x40000000</strong></font></td>
<td valign="top" width="69"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>128</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>16,777,215</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x4B7FFFFF</strong></font></td>
<td valign="top" width="69"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>150</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>3.40282347e+38</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x7F7FFFFF</strong></font></td>
<td valign="top" width="69"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>254</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>Positive infinity</strong></font></td>
<td valign="top" width="193"><font size="2" face="Courier New"><strong>0x7f800000</strong></font></td>
<td valign="top" width="69"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="129"><font size="2" face="Courier New"><strong>255</strong></font></td>
<td valign="top" width="117"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
</tbody>
</table>
<p>With this information we can begin to understand the decoding of floats. Floats use a base-two exponential format so we would expect the decoding to be mantissa * 2^exponent. However in the encodings for 1.0 and 2.0 the mantissa is zero, so how can this work? It works because of a clever trick. Normalized numbers in base-two scientific notation are always of the form 1.xxxx*2^exp, so storing the leading one is not necessary. By omitting the leading one we get an extra bit of precision – the 23-bit field of a float actually manages to hold 24 bits of precision because there is an implied ‘one’ bit with a value of 0x800000.</p>
<p>The exponent for 1.0 should be zero but the exponent field is 127. That’s because the exponent is stored in excess 127 form. To convert from the value in the exponent field to the value of the exponent you simply subtract 127.</p>
<p>The two exceptions to this exponent rule are when the exponent field is 255 or zero. 255 is a special exponent value that indicates that the float is either infinity or a NaN (not-a-number), with a zero mantissa indicating infinity. Zero is a special exponent value that indicates that there is no implied leading one, meaning that these numbers are not normalized. This is necessary in order to exactly represent zero. The exponent value in that case is –126, which is the same as when the exponent field is one.</p>
<p>To clarify the exponent rules I’ve added an “Exponent value” column which shows the actual binary exponent implied by the exponent field:</p>
<table border="2" cellspacing="0" cellpadding="2" width="705">
<tbody>
<tr>
<td valign="top" width="150"><strong>Float value</strong></td>
<td valign="top" width="161"><strong>Integer representation</strong></td>
<td valign="top" width="51"><strong>Sign</strong></td>
<td valign="top" width="113"><strong>Exponent field</strong></td>
<td valign="top" width="123"><strong>Exponent value</strong></td>
<td valign="top" width="103"><strong>Mantissa field</strong></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>0.0</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x00000000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>-126</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.40129846e-45</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x00000001</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>-126</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>1</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.17549435e-38</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x00800000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>1</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>-126</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>0.2</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x3E4CCCCD</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>124</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>-3</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x4CCCCD</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.0</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x3F800000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.5</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x3FC00000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x400000</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.75</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x3FE00000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x600000</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>1.99999988</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x3FFFFFFF</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>2.0</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x40000000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>128</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>1</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>16,777,215</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x4B7FFFFF</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>150</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>23</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>3.40282347e+38</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x7F7FFFFF</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>254</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>127</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0x7FFFFF</strong></font></td>
</tr>
<tr>
<td valign="top" width="150"><font size="2" face="Courier New"><strong>Positive infinity</strong></font></td>
<td valign="top" width="161"><font size="2" face="Courier New"><strong>0x7f800000</strong></font></td>
<td valign="top" width="51"><font size="2" face="Courier New"><strong>0</strong></font></td>
<td valign="top" width="113"><font size="2" face="Courier New"><strong>255</strong></font></td>
<td valign="top" width="123"><font size="2" face="Courier New"><strong>Infinite!</strong></font></td>
<td valign="top" width="103"><font size="2" face="Courier New"><strong>0</strong></font></td>
</tr>
</tbody>
</table>
<p>Although these examples don’t show it, negative numbers are dealt with by setting the sign field to 1, which is called sign-and-magnitude form. All numbers, even zero and infinity, have negative versions.</p>
<p>The numbers in this chart were chosen in order to demonstrate various things:</p>
<ul>
<li>0.0: It’s handy that zero is represented by all zeroes. However there is also a negative zero which has the sign bit set. Negative zero is equal to positive zero.
<li>1.40129846e-45: This is the smallest positive float, and its integer representation is the smallest positive integer.
<li>1.17549435e-38: This is the smallest float with an implied leading one, the smallest number with a non-zero exponent, the smallest normalized float. This number is also FLT_MIN. Note that FLT_MIN is not the smallest float. There are actually about 8 million positive floats smaller than FLT_MIN.
<li>0.2: This is an example of one of the many decimal numbers that cannot be precisely represented with a binary floating-point format. That mantissa wants to repeat ‘C’ forever.
<li>1.0: Note the exponent and the mantissa, and memorize the integer representation in case you see it in hex dumps.
<li>1.5, 1.75: Just a couple of slightly larger numbers to show the mantissa changing while the exponent stays the same.
<li>1.99999988: This is the largest float that has the same exponent as 1.0, and the largest float that is smaller than 2.0.
<li>2.0: Notice that the exponent is one higher than for 1.0, and the integer representation and exponent are one higher than for 1.99999988.
<li>16,777,215: This is the largest odd float. The next larger float has an exponent value of 24, which means the mantissa is shifted enough left that odd numbers are impossible. Note that this means that above 16,777,216 a float has <em>less</em> precision than an int.
<li>3.40282347e+38: FLT_MAX. The largest finite float, with the maximum finite exponent and the maximum mantissa.
<li>Positive infinity: The papa bear of floats.</li>
</ul>
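<p>Several of these observations can be checked directly. The following is a minimal Python sketch, not code from the original post: the helper names <code>to_bits</code> and <code>from_bits</code> are made up for this illustration, and the standard <code>struct</code> module does the conversion between a float and its 32-bit integer representation.</p>

```python
import struct

def to_bits(value):
    """Round value to a 32-bit float and return its integer representation."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

def from_bits(bits):
    """Interpret a 32-bit integer as a float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# The smallest positive float has the smallest positive integer representation.
smallest_positive = from_bits(1)

# FLT_MIN is the first float with a non-zero exponent field, so every bit
# pattern from 1 up to 0x007FFFFF is a positive float smaller than FLT_MIN.
flt_min = from_bits(0x00800000)
floats_below_flt_min = 0x00800000 - 1

# Adjacent floats have adjacent integer representations.
step_to_two = to_bits(2.0) - to_bits(1.99999988)

# Above 16,777,216 odd integers can no longer be represented.
rounded = from_bits(to_bits(16777217.0))

print(hex(to_bits(0.2)), floats_below_flt_min, step_to_two, rounded)
```

<p>The count of floats below FLT_MIN comes out at 8,388,607, which is the “about 8 million” mentioned above.</p>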
<p>We can now describe how to decode the float format:</p>
<ul>
<li>If the exponent field is 255 then the number is infinity (if the mantissa is zero) or a NaN (if the mantissa is non-zero)
<li>If the exponent field is from 1 to 254 then the exponent is between –126 and 127, there is an implied leading one, and the float’s value is:
<ul>
<li>(1.0 + mantissa-field / 0x800000) * 2^(exponent-field-127)</li>
</ul>
<li>If the exponent field is zero then the exponent is –126, there is no implied leading one, and the float’s value is:
<ul>
<li>(mantissa-field / 0x800000) * 2^-126</li>
</ul>
<li>If the sign bit is set then negate the value of the float</li>
</ul>
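<p>The decoding rules above can be written down almost line for line. This is a sketch in Python rather than anything from the original post, and the function name <code>decode_float</code> is an assumption chosen for the example. It takes the 32-bit integer representation and returns the value as a Python float.</p>

```python
import math

def decode_float(bits):
    """Decode a 32-bit float bit pattern using the rules listed above."""
    sign = bits >> 31
    exponent_field = (bits >> 23) & 0xFF
    mantissa_field = bits & 0x7FFFFF

    if exponent_field == 255:
        # Infinity if the mantissa is zero, otherwise a NaN.
        value = math.inf if mantissa_field == 0 else math.nan
    elif exponent_field >= 1:
        # Normalized: implied leading one, excess-127 exponent.
        value = (1.0 + mantissa_field / 0x800000) * 2.0 ** (exponent_field - 127)
    else:
        # Exponent field of zero: no implied leading one, exponent fixed at -126.
        value = (mantissa_field / 0x800000) * 2.0 ** -126

    # A set sign bit negates the value (this also produces negative zero).
    return -value if sign else value

print(decode_float(0x3F800000), decode_float(0x4B7FFFFF), decode_float(0x7F800000))
```

<p>Because a Python float is a double, every finite 32-bit float value is represented exactly by the result.</p>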
<p>The excess-127 exponent and the omitted leading one lead to some very convenient characteristics of floats, but I’ve rambled on too long so those must be saved for the next post.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/01/image_thumb.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2012/01/img_0291-400x210_thumb.jpg" medium="image">
			<media:title type="html">IMG_0291 (400x210)</media:title>
		</media:content>
	</item>
		<item>
		<title>Top Ten Technologies of 2011</title>
		<link>http://randomascii.wordpress.com/2012/01/01/top-ten-technologies-of-2011/</link>
		<comments>http://randomascii.wordpress.com/2012/01/01/top-ten-technologies-of-2011/#comments</comments>
		<pubDate>Mon, 02 Jan 2012 05:40:57 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[AltDevBlogADay]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=358</guid>
		<description><![CDATA[It seemed wrong to do a ‘normal’ blog post on December 23rd, and my programming themed “Night Before Christmas” rap medley never quite came together, so instead I’m doing a top-ten list. Well, not really a “top-ten” list, but a &#8230; <a href="http://randomascii.wordpress.com/2012/01/01/top-ten-technologies-of-2011/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=358&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>It seemed wrong to do a ‘normal’ blog post on December 23rd, and my programming themed “Night Before Christmas” rap medley never quite came together, so instead I’m doing a top-ten list.</p>
<p>Well, not really a “top-ten” list, but a list of ten indispensable programming technologies I’ve used this year. I’ve avoided numbering my top-ten list because I couldn’t come up with any sensible way to compare these disparate programs. They have all been crucial. I’ve blogged about most of them, and if you’re not using them or some equally good equivalent you should fix that. I use every one of these technologies both at work (on projects with dozens of developers and industrial strength infrastructure) and at home (on projects with one developer and with the ‘build machine’ being a different enlistment on my laptop).</p>
<p><span id="more-358"></span>
<p>This article was originally posted on <a href="http://altdevblogaday.com/2011/12/24/top-ten-technologies-of-2011/">#AltDevBlogADay</a>.</p>
<p>I develop on Windows so there is a significant bias towards Microsoft products. The good news is that six of the seven Microsoft products on this list are available for free (and even the seventh <a href="http://altdevblogaday.com/2011/12/03/you-should-be-using-microsoft-bizspark/">can be free</a>), and two of the three non-Microsoft products are also free (and even the third is <a href="http://www.perforce.com/downloads/try_perforce_free">free for limited use</a>).</p>
<p>Without further ado, here is my list:</p>
<ul>
<li>Visual Studio 2010 – I’m enjoying using parts of C++0x including move constructors, static_assert, and the ‘auto’ and ‘override’ keywords. The “Navigate To” functionality (Ctrl+,) lets me explore source code more effectively, and there are other hidden improvements as well.
<li><a href="http://randomascii.wordpress.com/category/code-reliability/">/Analyze</a> – I’ve written about this feature a number of times already, but this is a good place to summarize its effect. By using /analyze for static code analysis I have found and fixed over a thousand serious bugs this year, and it now monitors our code to prevent entire classes of bugs from ever returning.
<li><a href="http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/">App Verifier</a> – I’ve written about this as well and, while it’s only helped me find a few dozen bugs, these were all serious (race conditions and memory corruption) and would have been painfully difficult or impossible to find without such a tool.
<li><a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms681417(v=vs.85).aspx">Symbol server</a> – having symbols show up when I need them is magically essential. I now even put locally built symbols into a local symbol server so that I never again invalidate a crash dump by rebuilding the DLLs that it depends on.
<li><a href="http://randomascii.wordpress.com/2011/11/11/source-indexing-is-underused-awesomeness/">Source indexing</a> – every day I save time because the source file that I need shows up automatically. I love the fact that, no matter what product, branch, or version the code is from, the debugger will automatically find the file instantly. If you ever debug code that is not built on your machine then you need this.
<li><a href="http://randomascii.wordpress.com/category/xperf/">Xperf</a> – knowing how to use xperf is like having x-ray vision while everyone else is blindfolded. This tool has let me quickly find and fix innumerable performance problems, in our software and in the tools that we use.
<li><a href="http://msdn.microsoft.com/en-us/windows/hardware/gg463009">WinDbg</a> – the debugger I love to hate. Its UI is execrable but it does a couple of things that Visual Studio’s debugger doesn’t, so I keep it around for the clarifying second opinion that is occasionally crucial. More on this in the new year.
<li><a href="http://www.perforce.com/">Perforce</a> – change lists, labels, file history that I can believe in, I love this version control software. I haven’t used anything else for more than a decade, so I can’t compare it to any of its worthy competitors, but it is excellent.
<li><a href="http://www.sysinternals.com/">sysinternals</a> – for exploring or monitoring processes these tools are tough to beat. Many problems can be solved by properly applying one of these.
<li><a href="http://www.python.org/">Python</a> – a good scripting language is indispensable and it’s good to be at a company where Python is one of the main choices. Indenting for scope makes me smile. </li>
</ul>
<p>I thought about having a top 10 list with just xperf and /analyze but a top 1010 list seemed more fun.</p>
<p>What’s missing from this list? What deserves to be removed? Let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2012/01/01/top-ten-technologies-of-2011/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>
	</item>
		<item>
		<title>Fractal eXtreme New Version&#8211;Better Zoom Movies</title>
		<link>http://randomascii.wordpress.com/2011/12/14/fractal-extreme-new-versionbetter-zoom-movies/</link>
		<comments>http://randomascii.wordpress.com/2011/12/14/fractal-extreme-new-versionbetter-zoom-movies/#comments</comments>
		<pubDate>Thu, 15 Dec 2011 00:25:00 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[Fractals]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=351</guid>
		<description><![CDATA[While working on the Really Deep Fractal Zoom Movie – Much Faster project (240 times faster!) several updates were made to Fractal eXtreme in order to improve the creating of zoom movies. These improvements are now available as part of &#8230; <a href="http://randomascii.wordpress.com/2011/12/14/fractal-extreme-new-versionbetter-zoom-movies/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=351&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>While working on the <a href="http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/">Really Deep Fractal Zoom Movie – Much Faster</a> project (240 times faster!) several updates were made to Fractal eXtreme in order to improve the creation of zoom movies. These improvements are now available as part of the 2.20 release. Here is a summary of the improvements:</p>
<ul>
<li>A bug related to pixel guessing was fixed. On one test movie this fix gave an overall calculation speedup of over 25%!</li>
<li>Support for variable speed playback when converting zoom movies to AVI movies was added.</li>
<li>A bug that would occasionally cause crashes when saving a zoom movie was corrected.</li>
<li>A bug that would cause images wider than 4,096 pixels to not reload iteration data was fixed.</li>
<li>A bug that caused failures when creating zoom movies from fractals that were rotated 90 or 270 degrees was fixed.</li>
<li>The progress meter that appears when rendering a zoom movie has been improved. It is now horizontally sizeable, it has an option to pause rendering, and it has an option to bring up the status window for more details on what is happening.</li>
<li>The timers that track how much time has been spent rendering an image or a movie now stop counting when your machine goes into hibernation, which makes them a more accurate reflection of the total computing time.</li>
<li>Statistics about each frame rendered are now saved to a .fxz.log file in the same directory as the movie. This records how long each frame took to calculate, the total number of iterations, the percentage of pixels that were guessed, the precision, and more.</li>
<li>When you save a zoom movie as an AVI file it is now optional whether the resulting AVI will be played automatically.</li>
<li>The undo limit in FX was increased from 50 to 500.</li>
</ul>
<p>As always the latest version of Fractal eXtreme is available at <a title="http://www.cygnus-software.com/downloads/downloads.htm" href="http://www.cygnus-software.com/downloads/downloads.htm">http://www.cygnus-software.com/downloads/downloads.htm</a>.</p>
<p><span id="more-351"></span>And now some details…</p>
<h2>Pixel guessing</h2>
<p>As was briefly mentioned <a href="http://randomascii.wordpress.com/2011/11/28/faster-fractals-again/">here</a> a problem was found where one calculation stage was being serialized instead of parallelized. Fractal eXtreme relies heavily on ‘guessing’ of areas of constant iterations in order to improve performance. Sometimes 75-95% of pixels can be guessed and this can give a tremendous performance improvement. However it is important that this process never give incorrect results, so at the end of every frame FX scans over the image to see if any guesses might be wrong. Any guessed pixel that is not surrounded by eight identical pixels is deemed risky and is explicitly calculated.</p>
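<p>The “risky guess” check described above can be sketched as follows. This is a hypothetical Python illustration and not Fractal eXtreme’s actual code: the function names, the list-of-lists image layout, and the decision to treat guessed pixels on the image border as risky (they do not have eight neighbours) are all assumptions.</p>

```python
def is_risky(iterations, row, col):
    """True if the pixel is not surrounded by eight identical pixels."""
    height, width = len(iterations), len(iterations[0])
    centre = iterations[row][col]
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            # Assumption: a neighbour outside the image counts as a mismatch.
            if r not in range(height) or c not in range(width):
                return True
            if iterations[r][c] != centre:
                return True
    return False

def risky_guessed_pixels(iterations, guessed):
    """List the (row, col) of guessed pixels that need explicit calculation."""
    return [(row, col)
            for row in range(len(iterations))
            for col in range(len(iterations[row]))
            if guessed[row][col] and is_risky(iterations, row, col)]

# A 3x3 image where only the centre pixel was guessed.
counts = [[7, 7, 7], [7, 7, 7], [7, 7, 8]]
was_guessed = [[False, False, False], [False, True, False], [False, False, False]]
print(risky_guessed_pixels(counts, was_guessed))
```

<p>Here the centre pixel is flagged because one of its neighbours has a different iteration count.</p>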
<p>Due to a logic flaw it turned out that all processors would recalculate the same pixel, and then move on to the next one, almost in lock step. That meant that this phase would run ‘n’ times slower than it should, where ‘n’ is the number of processor threads you have. That’s an 8x slowdown on my laptop, and a 12x slowdown on my work machine.</p>
<p>The slowdown is only on this one phase, but this phase can involve calculating enough pixels – especially on high resolution images – that it makes a significant difference. A 25% speedup from this fix was measured on one long movie, and on machines with more threads the speedup would be greater. Actual results will vary significantly depending on the images you are calculating and the number of cores on your machine.</p>
<h2>Variable speed playback</h2>
<p>When doing deep zooms it is often the case that some parts of the movie are more interesting than others and it can be nice to slow down the movie at these points – or maybe you just want to have artistic control over the pacing of your movies. Rendering zoom movies to AVI at a constant speed and then adjusting the playback speed afterwards is inefficient and leads to a loss of quality. Therefore FX 2.20 lets you control the zoom speed when saving to AVI.</p>
<p>To enable this feature check the <em>Custom zoom</em> check box on the <em>Save as AVI</em> dialog in the zoom movie player:</p>
<p><a href="http://randomascii.files.wordpress.com/2011/12/image4.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://randomascii.files.wordpress.com/2011/12/image_thumb4.png?w=484&#038;h=235" alt="image" width="484" height="235" border="0" /></a></p>
<p>When this is checked the zoom movie player will look for a file with a .fxz.config extension with the same name and in the same directory as your .fxz file. You just need to create this text file and add appropriate commands. Here is an example .fxz.config file:</p>
<blockquote><p># Blank lines or lines that start with &#8216;#&#8217; are comments.</p>
<p># Set the starting zoom level.<br />
setZoomLevel = 2.0</p>
<p># Pause for half a second<br />
holdTime = 0.5<br />
# Accelerate to 2.0 zooms per second. Hold that speed from 4.0 zooms to<br />
# 5.0 zooms. The acceleration will be a smooth (sinusoidal) acceleration<br />
# curve from the current zoom level to fromZoom, and the speed will<br />
# then be held steady until toZoom is reached.<br />
zoomSpeed = 2.0, fromZoom = 4.0, toZoom = 5.0<br />
# Accelerate to 4.0 zooms per second, holding it from 7.0 to 10.0<br />
# zooms. The acceleration happens between zooms 5.0 and 7.0.<br />
zoomSpeed = 4.0, fromZoom = 7.0, toZoom = 10.0<br />
# Decelerate to 0.0 zooms per second, halting at 12.0 zooms.<br />
zoomSpeed = 0.0, fromZoom = 12.0, toZoom = 12.0</p>
<p># Pause for half a second.<br />
holdTime = 0.5</p>
<p># Go backwards.<br />
zoomSpeed = -3.5, fromZoom = 6.0, toZoom = 3.0<br />
# Stop.<br />
zoomSpeed = 0.0, fromZoom = 0.5, toZoom = 0.5</p></blockquote>
<p>When you click OK the Fractal eXtreme movie player looks for this file and parses it. If there are errors they will be reported. Otherwise it will tell you how long the movie will be and then save it to AVI.</p>
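<p>To make the file format concrete, here is a rough Python sketch of how such a file could be read. It is only a guess based on the example above and is not how Fractal eXtreme parses the file: the function name <code>parse_zoom_config</code> is invented, and the sketch assumes that every command line is a comma-separated list of <code>key = number</code> pairs.</p>

```python
def parse_zoom_config(text):
    """Parse .fxz.config style text into a list of {key: number} commands."""
    commands = []
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        # Blank lines and lines that start with '#' are comments.
        if not line or line.startswith("#"):
            continue
        command = {}
        for part in line.split(","):
            key, separator, value = part.partition("=")
            if not separator:
                raise ValueError(f"line {line_number}: expected 'key = value'")
            command[key.strip()] = float(value)
        commands.append(command)
    return commands

sample = """
# Set the starting zoom level.
setZoomLevel = 2.0
holdTime = 0.5
zoomSpeed = 2.0, fromZoom = 4.0, toZoom = 5.0
"""
for command in parse_zoom_config(sample):
    print(command)
```

<p>A real parser would also check that the keys are known and that the zoom ranges are consistent with the sign of the speed.</p>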
<p>In addition to specifying an initial zoom level and pauses you specify periods where the zoom speed is constant. You specify this by saying what speed you want and what zooms you want this speed held between. In-between these periods (such as from about 6 seconds to 9 seconds in the graph below) the movie player smoothly interpolates between those speeds.</p>
<p><a href="http://randomascii.files.wordpress.com/2011/12/image3.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://randomascii.files.wordpress.com/2011/12/image_thumb3.png?w=622&#038;h=321" alt="image" width="622" height="321" border="0" /></a></p>
<p>A sample movie using the data above can be found <a href="http://youtu.be/HH2d1DMqN70">here</a>.</p>
<p>Specifying periods of constant zoom-speed with the start/stop times given in zoom levels was found to work best because it makes it easy to say that you want to go slowly from zooms 88.3 to 97.2 where there is some cool detail, and then from zoom 101 to 150 you want to be going full speed.</p>
<p>It is important to leave gaps between the periods of constant zoom speed since otherwise there can’t be smooth interpolation and the results will not be as pleasing.</p>
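<p>The smooth interpolation in those gaps can be pictured with a half-cosine ease. The sketch below is an assumption for illustration only: the exact curve used by the movie player is not given here beyond being sinusoidal, the name <code>eased_speed</code> is made up, and <code>progress</code> simply stands for how far through the gap we are, from 0.0 to 1.0.</p>

```python
import math

def eased_speed(start_speed, end_speed, progress):
    """Blend between two zoom speeds with a half-cosine (sinusoidal) ease."""
    progress = min(1.0, max(0.0, progress))
    # blend rises smoothly from 0 to 1 and is flat at both ends.
    blend = (1.0 - math.cos(math.pi * progress)) / 2.0
    return start_speed + (end_speed - start_speed) * blend

# The gap between zooms 5.0 and 7.0 in the example file, where the speed
# changes from 2.0 to 4.0 zooms per second.
for step in range(5):
    print(round(eased_speed(2.0, 4.0, step / 4), 3))
```

<p>If two constant-speed periods touch, there is no gap for a curve like this to occupy, which is why the speed change would be abrupt.</p>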
<p>Note that the speed can go negative, meaning that zooming in and then out as you please is allowed. Just be sure to stop before changing directions, or else the results will be jarring, and when the speed is negative be sure to make toZoom smaller than fromZoom.</p>
<p>The chart above was created in Excel by pasting the data from the .avi.log file that is now created whenever you save a zoom movie as AVI.</p>
<h2>In closing…</h2>
<p>We hope that these new features allow the creation of better fractal zoom movies than ever before.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2011/12/14/fractal-extreme-new-versionbetter-zoom-movies/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2011/12/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2011/12/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Increased Reliability Through More Crashes</title>
		<link>http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/</link>
		<comments>http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/#comments</comments>
		<pubDate>Thu, 08 Dec 2011 05:58:00 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[AltDevBlogADay]]></category>
		<category><![CDATA[Code Reliability]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=346</guid>
		<description><![CDATA[Shipping games that don’t crash is hard, and it’s important to use every tool available to try to find bugs. Static code analysis is one technique that I’ve discussed in the past and for some classes of bugs it is &#8230; <a href="http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=346&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Shipping games that don’t crash is hard, and it’s important to use every tool available to try to find bugs. <a href="http://randomascii.wordpress.com/category/code-reliability/">Static code analysis</a> is one technique that I’ve discussed in the past and for some classes of bugs it is delightfully effective.</p>
<p>Another strategy is to stress your game at runtime by adding additional validation, and by making the runtime environment more hostile so that rare bugs become frequent.</p>
<p>App Verifier is a free tool from Microsoft that adds additional checks to handles and locks, and allocates memory in a way that makes bugs more likely to lead to crashes.</p>
<p>Any Windows developers that are listening to this: if you’re not using App Verifier, you are making a mistake.</p>
<p>This post discusses App Verifier’s heap features.</p>
<p><span id="more-346"></span>
<p>This article was originally posted on <a href="http://altdevblogaday.com/2011/12/08/increased-reliability-through-more-crashes/">#AltDevBlogADay.</a></p>
<h2>Memory Stressing with Page Heap</h2>
<p>One of the main features of App Verifier is Page Heap. This is a feature that puts every allocation on its own page in order to flush out buffer overruns and use-after-free errors.</p>
<h2>Buffer overruns</h2>
<p>Normally if you write beyond the end of an allocated buffer you will corrupt the heap data structures or some other allocation. This will often cause no initial problems, and then a catastrophic failure later on. This delayed failure makes it difficult to track down the problem. You might know which buffer overflowed, but not which code overflowed it.</p>
<p>Page Heap puts each allocation on its own 4-KB page, with the allocated memory aligned to the end of the page. Therefore if you overrun the buffer you will touch the next page. Page Heap ensures that the next page will be unmapped memory so you get a guaranteed access violation at the exact moment that you overrun the buffer.</p>
<p>Buffer overrun crashes with Page Heap are usually on the first byte of a page. That means that the last three digits of the hex address will be zero – watch for that signature in order to categorize the access violations you see.</p>
<p>In the awesomely buggy code below you can see that we crashed when we tried to write to 0x06D8D000 (EDI), and the memory window shows the ‘??’ pattern that indicates a fresh page of non-existent memory.</p>
<p><a href="http://randomascii.files.wordpress.com/2011/12/image.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2011/12/image_thumb.png?w=745&#038;h=368" width="745" height="368"></a></p>
<p>By default Page Heap keeps your allocations aligned to 8 or 16 byte boundaries so if, for instance, you allocate 4 bytes of memory there will be 4 or 12 bytes of mapped memory before the end of the page, which means that overruns will not be instantly caught. Memory corruption in the unused bytes at the end of the page will be checked for when the memory is freed.</p>
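<p>The placement arithmetic is easy to work through. The following Python sketch is a simplified model of the layout described above, not Page Heap’s real implementation: the function name <code>page_heap_layout</code> is hypothetical, and it assumes the allocation is rounded up to the alignment and then pushed against the end of a 4-KB page.</p>

```python
PAGE_SIZE = 4096

def page_heap_layout(size, alignment=8):
    """Return (offset_in_page, slack_bytes) for an end-of-page allocation.

    offset_in_page is where the allocation starts within its first page and
    slack_bytes is the mapped memory between the end of the allocation and
    the end of the page, where a small overrun goes unnoticed until free.
    """
    padded = -(-size // alignment) * alignment  # round size up to the alignment
    offset_in_page = (-padded) % PAGE_SIZE
    slack_bytes = padded - size
    return offset_in_page, slack_bytes

for alignment in (8, 16):
    offset, slack = page_heap_layout(4, alignment)
    print(alignment, hex(offset), slack)
```

<p>For a 4-byte allocation this gives 4 bytes of slack with 8-byte alignment and 12 bytes with 16-byte alignment, matching the numbers above.</p>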
<h2>Use after free</h2>
<p>If you write to memory after freeing it then this will usually corrupt memory. Occasionally you will get lucky and crash immediately, but more often you will merely set the stage for a crash far in the future. Use-after-free memory corruption is usually much harder to investigate than buffer overruns.</p>
<p>Since Page Heap puts each allocation on its own page it can ensure that the memory will be unmapped when it is freed. That means that use-after-free will reliably give an instant access violation. A nightmare memory corruption bug becomes a tame kitten.</p>
<p>Whereas buffer overruns with Page Heap usually cause access violations near the beginning of a page, use-after-free with Page Heap usually causes access violations near the end of a page (assuming small allocations). Watch for the last three digits of the hex address to be near 0xFFF.</p>
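<p>These two signatures can be turned into a quick triage helper. This is a heuristic sketch in Python, not a feature of App Verifier: the function name <code>classify_fault</code> and the <code>near_end</code> threshold for what counts as “near 0xFFF” are assumptions, and the result is only a hint about where to look first.</p>

```python
def classify_fault(address, near_end=0x100):
    """Guess the bug type from the page offset of a faulting address."""
    offset = address & 0xFFF  # last three hex digits of the address
    if offset == 0:
        # First byte of the unmapped page that follows an allocation.
        return "likely buffer overrun"
    if offset >= 0x1000 - near_end:
        # Small allocations sit at the end of their (now unmapped) page.
        return "possible use-after-free"
    return "unclear"

# The two faulting addresses mentioned in this post.
print(classify_fault(0x06D8D000))
print(classify_fault(0x06DECFF8))
```

<p>Large allocations, or overruns that skip past the first byte of the next page, will not match either pattern.</p>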
<p>With use-after-free bugs the challenge may be to figure out who freed the memory. Sometimes, on good days, App Verifier will help with that. The process is a bit convoluted and arcane, but worth knowing about.</p>
<p>Page Heap records call stacks when you allocate and free memory, and WinDbg has an extension that will look up that information for a Page Heap address. If you are debugging with WinDbg then you can just type in “!heap -p -a Address” and see if a call stack for when the memory was freed is available. If you are debugging with Visual Studio then you can save a Minidump with Heap (Debug-&gt;Save Dump As) then load it into WinDbg and type the intuitive !heap command. Whether the free stack is available depends on how long ago the memory was freed. I find that it has worked about half the time for me, and when it works it feels quite magical. If it doesn’t give an answer within ten to twenty seconds then it probably won’t, and Ctrl+Break is your friend. In the screenshot below you can see that we crashed when accessing 06decff8 (ESI), saved a crash dump, loaded it into WinDbg, and then typed in “!heap -p -a 06decff8”. Our reward for this effort was the call stack of when this memory was freed. Shazam!</p>
<p><a href="http://randomascii.files.wordpress.com/2011/12/image1.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2011/12/image_thumb1.png?w=669&#038;h=460" width="669" height="460"></a></p>
<h2>Illegal Reads</h2>
<p>Page Heap will detect illegal <em>reads</em> (buffer overreads and read-after-free) just as easily as illegal writes. These bugs are less serious, but worth finding and fixing since they can still lead to unpredictable behavior and crashes in the field.</p>
<h2>True Tales</h2>
<p>Normally I use App Verifier in a proactive mode – I use it to hunt for bugs that we don’t know about, and to make our unit tests more likely to find problems.</p>
<p>However it also works brilliantly as a reactive tool. A few months ago I got a call from a coworker because our game was hanging – spinning in a busy loop in the gnarliest lockless code that we have. I spent a while (too long) staring at the data structures trying to figure them out before I realized that the problem was probably memory corruption. I turned on App Verifier and instantly hit the crash, in code that was miles away from where the symptoms appeared. An object that owned memory was being returned by value but didn’t have a copy constructor (<a href="http://en.wikipedia.org/wiki/Rule_of_three_(C%2B%2B_programming)">rule of three violation</a>). The memory owned by the object was freed, but the object’s copy still had pointers to it and we were writing through them. Without App Verifier I’m not sure how we would have found the bug, and with App Verifier the bug was trivial.</p>
<p>In less happy news, in some cases App Verifier will perturb the timing of your game so much that certain race conditions no longer occur – so it isn’t guaranteed to find everything.</p>
<h2>Memory Consumption and Performance</h2>
<p>When a 32-bit process on Windows allocates one byte it actually uses up 16 bytes of heap space – the heap granularity plus bookkeeping overhead adds the extra bytes. When using Page Heap a one byte allocation actually uses up 4 KB of memory, and 8 KB of address space. That means that fewer than 256 K allocations will exhaust the default 2 GB address space of a 32-bit program.</p>
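<p>The arithmetic behind that limit is easy to check. Here is a minimal sketch in Python, assuming the 4 KB and 8 KB per-allocation figures above; the constant and function names are mine, purely for illustration.</p>

```python
# Back-of-the-envelope model of Page Heap address space use in a 32-bit process.
# The 8 KB and 4 KB figures come from the text above; the names are illustrative.

KB = 1024
GB = 1024 ** 3

ADDRESS_SPACE_PER_ALLOCATION = 8 * KB   # address space reserved per small allocation
MEMORY_PER_ALLOCATION = 4 * KB          # memory actually used per small allocation


def max_page_heap_allocations(address_space_bytes):
    """Upper bound on small allocations before address space runs out."""
    return address_space_bytes // ADDRESS_SPACE_PER_ALLOCATION


default_limit = max_page_heap_allocations(2 * GB)
large_address_aware_limit = max_page_heap_allocations(4 * GB)

print(default_limit)               # 262144, i.e. 256 K allocations
print(large_address_aware_limit)   # 524288, i.e. 512 K allocations
```

<p>These are upper bounds. The executable, its DLLs, and thread stacks also consume address space, which is why the real limit is fewer than 256 K allocations.</p>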
<p>Marking your program as <a href="http://msdn.microsoft.com/en-us/library/wz223b1z(v=VS.100).aspx">large address aware</a> will give you (when running on 64-bit Windows) a 4 GB address space, which will postpone address space exhaustion, but will probably not avoid it entirely. Porting to 64-bit is the ideal solution. In many cases the more expedient solution is to adjust the heap settings so that only allocations within a certain size range go on their own pages, or else adjust the RandRate so that some percentage of your allocations go on their own pages.</p>
<p>Page Heap significantly reduces your game’s performance. In addition to the greater cost of allocating and freeing memory, individual memory accesses are now more expensive, due to less efficient cache usage. Your mileage may vary, but these are the changes I’m seeing on one recent project:</p>
<table border="2" cellspacing="0" cellpadding="2" width="516">
<tbody>
<tr>
<td valign="top" width="226"><font size="4"></font></td>
<td valign="top" width="101"><font size="4">Normal</font></td>
<td valign="top" width="185"><font size="4">With App Verifier</font></td>
</tr>
<tr>
<td valign="top" width="226"><font size="4">Frame rate</font></td>
<td valign="top" width="101"><font size="4">170 fps</font></td>
<td valign="top" width="185"><font size="4">3.7 fps</font></td>
</tr>
<tr>
<td valign="top" width="226"><font size="4">Memory usage</font></td>
<td valign="top" width="101"><font size="4">0.8 GB</font></td>
<td valign="top" width="185"><font size="4">2.9 GB</font></td>
</tr>
<tr>
<td valign="top" width="226"><font size="4">Address space usage</font></td>
<td valign="top" width="101"><font size="4">1.0 GB</font></td>
<td valign="top" width="185"><font size="4">5.5 GB</font></td>
</tr>
</tbody>
</table>
<p>If I reduced the number of memory allocations per frame then I could probably get performance up to about 10 fps, but even at 3.7 fps it’s okay for running tests.</p>
<p>Note that while we are using less than 4 GB of RAM, we are using more than 4 GB of address space, so we are only able to run in full App Verifier mode because we have a 64-bit build.</p>
<h2>Hooking up to the process heap</h2>
<p>Most game developers don’t use the system heap directly, for all sorts of good reasons. However Page Heap is powerful enough to justify making a conditional exception. On the projects I have worked on it has been relatively straightforward to add code that checks for a command line option on the first allocation, and when it is detected redirects all allocations to the process heap.</p>
<p>It’s worth pointing out that many other components within your game may already be using the Windows heap. D3D, for instance, does a lot of heap allocations which will be redirected to Page Heap by App Verifier.</p>
<h2>Technical details</h2>
<p>After installing App Verifier just run it, add your executable name to the list, and click Save. Don’t forget to click Save after any changes that you make. That’s it. Your game will now be stress tested any time it runs on that machine. Don’t forget to clear the list and hit Save when you are done or you may find your game (or tool) will be running noticeably slower. The settings are stored in the registry (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options) and continue to be in force even when App Verifier isn’t running – think of the potential for practical jokes!</p>
<p>In the App Verifier window right-click on Basics-&gt;Heaps to edit the Page Heap settings, and check View-&gt;Property Window to see descriptions of the settings. You can specify what allocations go to Page Heap, put allocations at the beginning of pages in order to watch for buffer underruns, and configure other settings.</p>
<p>You should probably uncheck “Leak” since otherwise memory leaks will be considered a fatal error, which is a bit too dramatic for my tastes.</p>
<p><a href="http://randomascii.files.wordpress.com/2011/12/image2.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" border="0" alt="image" src="http://randomascii.files.wordpress.com/2011/12/image_thumb2.png?w=710&#038;h=562" width="710" height="562"></a></p>
<p>App Verifier prints debug output that explains some of the problems that it detects, so run your game under a debugger from the start; if you attach a debugger only after App Verifier finds a bug you will miss this valuable information. App Verifier assumes that you will be using WinDbg, but it’s okay to use Visual Studio. App Verifier also prints a message at process startup to let you know that Page Heap is enabled.</p>
<p>The Visual C++ debug CRT puts padding around allocations in debug builds. This makes Page Heap less effective, so you should prefer using Page Heap with the release CRT.</p>
<p>App Verifier tries to ‘add value’ to access violations by catching them with an exception handler and printing out helpful information. That’s redonkulous! All this does is complicate the diagnosis by putting you six levels deeper in the stack. You can disable this on a per-solution basis by going to Debug-&gt;Exceptions-&gt;Win32 Exceptions-&gt;Access violation and checking the ‘Thrown’ box so that your game halts on the offending instruction.</p>
<h2>Getting App Verifier</h2>
<p>App Verifier and WinDbg are both available for free as part of the “<a href="http://www.microsoft.com/download/en/details.aspx?displaylang=en&amp;id=8279">Microsoft Windows SDK for Windows 7 and .NET Framework 4</a>” – don’t you love Microsoft product names? You should already have the Windows SDK installed for <a href="http://randomascii.wordpress.com/category/xperf/">xperf</a> and <a href="http://randomascii.wordpress.com/category/code-reliability/">/analyze</a> and <a href="http://randomascii.wordpress.com/2011/11/11/source-indexing-is-underused-awesomeness/">source indexing</a>. How many reasons do you need to install this thing?</p>
<p>See <a href="http://randomascii.wordpress.com/2011/10/15/try-analyze-for-free/">this post</a> for details on getting the Windows SDK.</p>
<h2>Summary</h2>
<ul>
<li>App Verifier and Page Heap are free goodness</li>
<li>Access violation address that ends near 0x000? Buffer overrun.</li>
<li>Access violation address that ends near 0xFFF? Use after free.</li>
<li>Access violation address of zero? Page Heap induced address space exhaustion (port to 64-bit or configure the Page Heap settings to avoid this)</li>
</ul>
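<p>Those rules of thumb are simple enough to put into a crash triage script. The following is a rough sketch, not a real tool: the function name and the 0x40 byte threshold for “near” are my own assumptions, and treating any address in the first page as a null pointer is a simplification.</p>

```python
PAGE_SIZE = 0x1000   # 4 KB pages
NEAR = 0x40          # hypothetical threshold for "near" a page boundary


def guess_page_heap_bug(address):
    """Guess the likely bug class from a Page Heap access violation address."""
    if PAGE_SIZE > address:
        # Address zero (or a small offset from it): an allocation probably failed.
        return "null pointer, possibly address space exhaustion"
    offset = address % PAGE_SIZE   # the last three hex digits of the address
    if NEAR > offset:
        return "buffer overrun"
    if offset >= PAGE_SIZE - NEAR:
        return "use after free"
    return "unknown"


print(guess_page_heap_bug(0x06DECFF8))   # use after free
print(guess_page_heap_bug(0x06DED000))   # buffer overrun
```

<p>The first address is the one from the WinDbg example earlier in this post, which really was a use-after-free. Treat the answer as a hint about where to look, not a diagnosis.</p>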
<p>I’ve found dozens of serious bugs using App Verifier. Buffer overruns, use after free, invalid handle usage caused by race conditions, and more. Our build machines now use App Verifier on some of the nightly unit tests and it continues to protect us and save us from wasting time.</p>
<p>P.S. The day after posting this on <a href="http://altdevblogaday.com/">AltDevBlogADay</a> I found a long-standing read-overrun bug in some decade-old code, thus showing the value of having App Verifier on as you exercise all code paths.</p>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2011/12/07/increased-reliability-through-more-crashes/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2011/12/image_thumb.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2011/12/image_thumb1.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://randomascii.files.wordpress.com/2011/12/image_thumb2.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Really Deep Fractal Zoom Movie &#8211; Much Faster</title>
		<link>http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/</link>
		<comments>http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/#comments</comments>
		<pubDate>Mon, 05 Dec 2011 00:15:29 +0000</pubDate>
		<dc:creator>brucedawson</dc:creator>
				<category><![CDATA[Fractals]]></category>

		<guid isPermaLink="false">https://randomascii.wordpress.com/?p=336</guid>
		<description><![CDATA[A few weeks ago while browsing YouTube for Fractal movies I came across a video that claimed to be (as of its post date of January 26, 2010) the deepest zoom movie on YouTube. The video was of an interesting &#8230; <a href="http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=randomascii.wordpress.com&amp;blog=18565082&amp;post=336&amp;subd=randomascii&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A few weeks ago while browsing YouTube for Fractal movies I came across a video that claimed to be (as of its post date of January 26, 2010) the <a href="http://www.youtube.com/watch?NR=1&amp;v=0jGaio87u3A">deepest zoom movie on YouTube</a>. The video was of an interesting area, but what really caught my eye was the information that the video took six months to render. Six months on twelve CPU cores.</p>
<p>I’ve worked pretty hard at optimizing Fractal eXtreme’s calculations so I decided to see how long it would take FX to render the exact same movie.</p>
<p>It took 18 hours. That’s 240 times faster.</p>
<p><span id="more-336"></span></p>
<p>I had expected FX to be faster, but 240 times faster was more than I had expected. So where did the speedups come from?</p>
<p><em>Different computers</em>: The original render used three quad-core computers of 2009 vintage. I’m using a four-core eight-thread 2011 Sandy Bridge laptop CPU. Sandy Bridge is really fast, but my laptop CPU has a max clock speed that is a lot lower than on desktop parts, and the hyperthreads aren’t as powerful as real cores, and I’ve got fewer total threads. So, I suspect the total CPU power available is close to identical. FX speedup: 1:1</p>
<p><em>Interpolation</em>: The original render calculated every frame, whereas Fractal eXtreme calculates only key frames, separated by a magnification increase of 2.0, and interpolates the frames in between. The original render ran at 30 frames per second for 312 seconds, for a total of 9,360 frames. Some of them are stationary frames at the beginning, but because the zoom speed drops at the end a higher percentage of the frames are towards the end where rendering takes longer, so the ratio of 9,360 frames to FX’s 916 key frames is probably pretty reasonable and FX does about one tenth of the work. FX speedup: 10:1</p>
<p><em>Guessing</em>: Despite its chaotic nature the Mandelbrot set has many large areas with a constant iteration count. Fractal eXtreme recognizes these and avoids calculating every pixel in these regions (it is very conservative in order to avoid errors). In extreme cases where there isn’t much detail this guessing can avoid calculating 90% or more of the pixels without changing the result. Overall the guessing reduced the workload by about two thirds. FX speedup: 3:1</p>
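<p>To give a feel for how this kind of guessing can work, here is a minimal Python sketch of one well-known approach: calculate the border of a block, and if every border pixel has the same iteration count, fill in the interior without calculating it. This is an illustration under my own assumptions, not Fractal eXtreme’s actual algorithm (which is more conservative), and the function names are made up for the example.</p>

```python
# Sketch of rectangle "guessing" for an escape-time Mandelbrot render.
# This is an illustrative border-check scheme, not Fractal eXtreme's algorithm.


def iterations(cx, cy, max_iter):
    """Escape-time iteration count for the point cx + cy*i."""
    x = y = 0.0
    for n in range(max_iter):
        if x * x + y * y > 4.0:
            return n
        x, y = x * x - y * y + cx, 2.0 * x * y + cy
    return max_iter


def render_block(left, top, step, size, max_iter):
    """Render a size x size block, guessing the interior when the border is flat.

    Returns (grid, calculated) where calculated counts pixels actually iterated.
    """
    grid = [[None] * size for _ in range(size)]
    calculated = 0

    def calc(row, col):
        nonlocal calculated
        if grid[row][col] is None:
            grid[row][col] = iterations(left + col * step, top - row * step, max_iter)
            calculated += 1
        return grid[row][col]

    # Calculate every pixel on the border of the block.
    border = set()
    for i in range(size):
        border.update({calc(0, i), calc(size - 1, i), calc(i, 0), calc(i, size - 1)})

    # Flat border: assume the interior has the same count and fill it in.
    guess = border.pop() if len(border) == 1 else None
    for row in range(1, size - 1):
        for col in range(1, size - 1):
            if guess is not None:
                grid[row][col] = guess
            else:
                calc(row, col)
    return grid, calculated


# A block well inside the main cardioid: only the 60 border pixels are calculated.
flat_grid, flat_calculated = render_block(-0.2, 0.1, 0.2 / 15, 16, 200)
# A block covering the whole set: the border varies, so nothing is guessed.
busy_grid, busy_calculated = render_block(-2.0, 1.0, 0.25, 8, 200)
print(flat_calculated, busy_calculated)   # 60 64
```

<p>A single border check like this can miss thin filaments that slip between sampled pixels, which is one reason a production renderer needs to be much more careful about when it guesses.</p>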
<p><em>Doubling magnification</em>: Each Fractal eXtreme key frame is exactly double the magnification of the previous key frame. That means that one quarter of the pixels in key frame ‘n+1’ were already calculated for key frame ‘n’, so Fractal eXtreme usually only has to calculate 75% of the pixels in a key frame, with no change in the final results. FX speedup: 4:3</p>
<p><em>Great programming</em>: With 40:1 of the 240:1 advantage explained I think we are forced to conclude that Fractal eXtreme is just faster. The additional 6:1 advantage could come from using 64-bit math instead of 32-bit math (good for about a 4:1 advantage) plus using fully unwound math routines (good for perhaps a 1.5:1 advantage), but this is pure speculation. FX speedup: 6:1</p>
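<p>For anyone who wants to check my multiplication, the breakdown above works out as follows. This is just the stated factors multiplied together; treating six months as 180 days is my simplifying assumption.</p>

```python
from fractions import Fraction

# Total speedup: six months (assumed here to be 180 days) versus 18 hours.
original_hours = 6 * 30 * 24
total = Fraction(original_hours, 18)

# Speedup factors as stated in the text above.
factors = {
    "different computers": Fraction(1, 1),
    "interpolation": Fraction(10, 1),
    "guessing": Fraction(3, 1),
    "doubling magnification": Fraction(4, 3),
}

explained = Fraction(1)
for factor in factors.values():
    explained *= factor

unexplained = total / explained

print(total)         # 240
print(explained)     # 40
print(unexplained)   # 6
```

<p>The speculative 4:1 for 64-bit math and 1.5:1 for unwound routines also multiply to 6:1, which is why that guess is at least numerically plausible.</p>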
<p>Being able to calculate the same movie (same maximum iterations, depth, and target location) in 18 hours is nice, but we can do better. And, to be honest, when you interpolate key frames that have aliasing you do lose some quality. However if you antialias the key frames before interpolation then you actually get a higher quality movie, with more of the subtle detail visible, even after interpolation. And if you antialias and render at a higher resolution, you get an even higher quality movie. And, due to the nature of the Fractal eXtreme zoom movie interpolation system, you can play the movie back at a higher resolution than it was rendered and actually increase the quality even more.</p>
<p>So…</p>
<p>Same location, same maximum iterations, slightly higher maximum zoom depth (to finish all the way into the final Mandelbrot set), 960&#215;540 render resolution (up from 640&#215;480), 3&#215;3 antialiasing, and a final output resolution of 1280&#215;720.</p>
<p>The higher resolution and antialiasing make a stunning improvement to the video quality. The moiré patterns and flickering pixels are gone, subtle ribbons of color are consistently visible even when they are less than a pixel wide, and the boundaries between bands are smooth. Because the interpolation is done after the time-consuming render it was possible to experiment with different zoom speeds, which allows the movie to go fast when there is little detail, and slow down whenever a mini-brot or other tourist attraction appears.</p>
<p>The increased resolution and antialiasing means that we need to render 15.1875 times more pixels, which would normally increase the render time to 11.4 days, plus a bit more for the two extra key frames at the end. However at higher resolutions the guessing works even better so this enhanced quality movie actually took less than 8 days of compute time.</p>
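<p>The 15.1875 and 11.4 day figures come straight from the resolutions and the 18 hour baseline. A quick sketch of the calculation, with variable names that are mine:</p>

```python
original_pixels = 640 * 480
new_pixels = 960 * 540
samples_per_pixel = 3 * 3   # 3x3 antialiasing

# How many more points must be calculated per key frame.
pixel_ratio = new_pixels * samples_per_pixel / original_pixels

# Scale the 18 hour render by that ratio and convert to days.
naive_days = 18 * pixel_ratio / 24

print(pixel_ratio)            # 15.1875
print(round(naive_days, 1))   # 11.4
```

<p>Finishing in under 8 days instead of 11.4 suggests that the improved guessing saved roughly another 30% of the work.</p>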
<p>The elapsed time was a bit higher than the render time because I was rendering this on my laptop, which I use for many other tasks. I left the render running when I was doing e-mail, web surfing, and writing this blog post (so FX was running a bit more slowly) but I had to pause it, and stop the clock, when I was taking the bus to work. If I’d done the calculations on a dedicated desktop machine like the one I have at work (six cores and higher frequencies) it could have been rendered in half the time.</p>
<p>8 days. 720p of antialiased Mandelbrot beauty. Thanks to Nosro for finding the location, and sharing its coordinates.</p>
<p>But enough talk. The proof is in the videos. Enjoy.</p>
<div style="display:inline;float:none;margin:0;padding:0;" id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:8fdcaab3-c980-4bd7-8d1a-e8c7f58a5726" class="wlWriterEditableSmartContent">
<div><span style="text-align:center; display: block;"><a href="http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/"><img src="http://img.youtube.com/vi/W79aVuJi1iM/2.jpg" alt="" /></a></span></div>
<div style="width:621px;clear:both;font-size:.8em;">Deep zoom movie at 1280&#215;720</div>
</div>
<p>Original 640&#215;480 non-antialiased zoom movie:</p>
<div style="display:inline;float:none;margin:0;padding:0;" id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:c08325e6-fbf3-40a2-8567-7ada9637d355" class="wlWriterEditableSmartContent">
<div><span style="text-align:center; display: block;"><a href="http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/"><img src="http://img.youtube.com/vi/0jGaio87u3A/2.jpg" alt="" /></a></span></div>
<div style="width:448px;clear:both;font-size:.8em;">Deep zoom movie at 640&#215;480</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://randomascii.wordpress.com/2011/12/04/really-deep-fractal-zoom-movie-much-faster/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d69d2780728dfc033fcc8123f31ef8fa?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">brucedawson</media:title>
		</media:content>
	</item>
	</channel>
</rss>
