Symbols the Microsoft Way

Symbol servers allow developer tools on Windows to automatically find symbols. They do this so well that most developers never have to worry about the internal mechanisms. However when things go wrong it can be helpful to understand how they work, and it turns out that it is all very simple.

Note that elfutils debuginfod may finally be letting Linux catch up to Microsoft and Windows, by allowing automatic retrieval of debug information and source code.

This article should serve as a good comparison to the process of getting symbols on Linux, especially for crashes on customer machines. I documented that process in a four-part series.

My discussion of Windows symbol servers makes use of the symbol server that I have on my laptop, for my own personal projects. Whenever I release a new version of Fractal eXtreme (64-bit optimized, multi-core, fast and fluid exploration of fractals, demo version here) I put the symbols and binaries on my symbol server so that I can trivially investigate any crash reports that I receive. This may seem like overkill for a home project, but in fact a local symbol server is just a copy of the files, arranged in a specific way for easy retrieval, and it is trivial to set up. For executables that I release publicly, like UIforETW, I publish the PE and PDB files to a public symbol server on Google storage – details are here.

Finding PE files

Symbol servers store not just symbols but also PE files (DLLs and EXEs). If these aren’t already available, such as when looking at a minidump or an xperf profile, then they are usually retrieved first, before the symbols. For crash dumps from 64-bit processes these PE files may be necessary in order to do stack walking because they contain the necessary metadata. There are three pieces of information that are needed in order to retrieve a PE file from a symbol server: the file name, link time stamp, and the image size. In order to manually check whether the latest version of FractalX.exe made it into my symbol server I would extract the link time stamp and the image size from the executable like this:

> dumpbin FractalX.exe /headers | find "date stamp"
4FFD0109 time date stamp Tue Jul 10 21:28:57 2012
> dumpbin FractalX.exe /headers | find "size of image"
147000 size of image
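If dumpbin is not at hand, the same two fields can be read straight from the PE headers. The following is a minimal sketch rather than a replacement for dumpbin or the dbghelp APIs. The function name `read_pe_keys` is made up for this example, and it assumes a well-formed PE image with the standard header layout (the PE signature offset stored at 0x3C, the time stamp 8 bytes past the PE signature, and the image size 56 bytes into the optional header).

```python
import struct

def read_pe_keys(data):
    """Return (link time stamp, image size) from the bytes of a PE file.

    Illustrative helper only: no validation beyond the two signatures.
    """
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew, the offset of the PE signature, lives at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # COFF header after the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (time_stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    # The optional header starts after the 20-byte COFF header; SizeOfImage
    # sits at offset 56 within it for both PE32 and PE32+ images.
    (image_size,) = struct.unpack_from("<I", data, pe_offset + 24 + 56)
    return time_stamp, image_size
```

To use it, read the executable in binary mode and pass the bytes in, for example `read_pe_keys(open("FractalX.exe", "rb").read())`. Formatting the two results with `%08X` and `%x` should reproduce the values that dumpbin prints.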

The format for the path to a PE file in a symbol server share is:

"%s\%s\%08X%x\%s" % (serverName, peName, timeStamp, imageSize, peName)

Note that the timeStamp is always printed as eight digits (with leading zeroes as needed) using upper-case for ‘A’ to ‘F’ (important if your symbol server is case sensitive), whereas the image size is printed using as few digits as needed, in lower-case.
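The formatting rule is easy to get subtly wrong, so here is a small sketch of it as a function. The name `pe_store_path` and the `\\myserver\symbols` share mentioned afterwards are placeholders for illustration. As noted later, this is for troubleshooting only, not a substitute for the real APIs.

```python
def pe_store_path(server, pe_name, time_stamp, image_size):
    """Build the expected symbol-store path for a PE file (troubleshooting aid)."""
    # Time stamp: exactly eight upper-case hex digits, zero padded.
    # Image size: as few lower-case hex digits as needed.
    key = "%08X%x" % (time_stamp, image_size)
    return "\\".join([server, pe_name, key, pe_name])
```

Calling `pe_store_path(r"\\myserver\symbols", "FractalX.exe", 0x4FFD0109, 0x147000)` with the values extracted above gives a path whose middle component is `4FFD0109147000`.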

Aside: If you do a pair of builds with minimal differences between them then the size will probably not change, and the peName certainly won’t, so the only thing stopping the two builds from having the same symbol-server address is the timeStamp. This means that reducing the precision of the timeStamp – say, from one second to 24 hours – could cause some pretty serious filename collisions. You could end up overwriting the first entry in the symbol server with the second. You know, hypothetically.

My symbol server is in c:\MySyms (normally it would be on a shared server, but this is my personal laptop) so the full path for the file examined above is:

c:\MySyms\fractalx.exe\4FFD0109147000\FractalX.exe

Simple enough. In my case I use symstore.exe’s /compress option (it saves a lot of space) when I add the files. Compressed files are indicated by replacing the last character with an underscore, so the actual path is this:

c:\MySyms\fractalx.exe\4FFD0109147000\FractalX.ex_
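When checking a store by hand it helps to look for both spellings. This tiny sketch (the helper name `store_file_names` is hypothetical) returns the plain name followed by the compressed one. The order is just for illustration and is not a claim about how symsrv itself searches.

```python
def store_file_names(name):
    """Return the plain file name and its compressed (trailing underscore) form."""
    # symstore /compress replaces the last character of the name with '_'.
    return [name, name[:-1] + "_"]
```

For example, `store_file_names("FractalX.exe")` yields `FractalX.exe` and `FractalX.ex_`.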

This is a good test to make sure that your PE files have been correctly added to your symbol server but it’s not a very realistic use case since we used the PE file to obtain the values needed to retrieve the PE file. The more common scenario is that you would have a minidump or an xperf ETL file and this file would contain a series of module name, link time stamp, image size triplets and these would be used at analysis time to retrieve the PE files. In the case of minidump files there is an array of MINIDUMP_MODULE structures which contain the relevant data. Note that the layout of symbol server shares can be much more complex. You should use the APIs (discussed later) to retrieve PE files – the technique above is purely for troubleshooting.

If you have a link time-stamp and you want to convert it to a date you can use this Python one-liner batch file:

python -c "import time; print(time.ctime(int('%1',16)))"

One extra quirk is that something in Microsoft’s linker/debugger toolchains used to lower-case the names of PE files. This meant that if you used a case-sensitive file system for your symbol server (as Chrome does) then you had to upload your symbols using lower-case file names. The Chrome team discovered this the hard way. The PDB names are extracted from the PE files, with the case intact, so they just have to match. I can no longer reproduce this behavior – UIforETW.exe gets uploaded in mixed-case to the case-sensitive Google Storage and this works.

Finding PDB files

Finding PE files is handy when analyzing customer crash dumps in order to have the assembly instructions but it’s actually more important than that. Minidump files don’t necessarily record enough information to retrieve PDB files. In those cases the tools retrieve the PE files and then look in the PE files to get the information needed to retrieve the PDB files. Once again we can extract this information from a PE file using dumpbin:

> dumpbin FractalX.exe /headers | find "Format:"
4FFD0109 cv           56 000B9308    B7B08    Format: RSDS, {6143E0D1-9975-4456-AC8E-F24C8777336D}, 1, FractalX.pdb

The long hexadecimal number after RSDS is a GUID, and the number after that (a 32-bit decimal number, but in this case just ‘1’) is called the ‘age’. The PDB file name is also listed here. Together these uniquely identify a particular version of a PDB file. The format for the path to a PDB file in a symbol server share is:

"%s\%s\%s%x\%s" % (serverPath, pdbName, guid, age, pdbName)

Funny thing here – notice that I use %x to print the age, but in the previous paragraph I described the age as being a decimal number. Well, the PDB age (just a measure of how many times the same PDB has been reused) is just a 32-bit number, but dumpbin prints it in decimal, and symbol servers expect it in hexadecimal. Hurray for consistency! This means that if you parse the dumpbin output you need to convert the age to an integer and then print it as hexadecimal. If you get this wrong then the bug won’t show up until you encounter a PDB with an age of ten or greater. Wonderful.

As with the PE files a final underscore indicates when a file is compressed by symstore.exe. The path on my symbol server for the PDB listed above looks like this:

c:\MySyms\FractalX.pdb\6143E0D199754456AC8EF24C8777336D1\FractalX.pd_
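The decimal-versus-hexadecimal trap is easier to see in code. The sketch below parses the dumpbin line shown earlier and builds the store path. The names `parse_rsds` and `pdb_store_path` are invented for this example, the regular expression only covers the RSDS line format shown above, and the age is printed with `%x` exactly as in the format string. It also assumes that only the file-name part of the recorded PDB path is used as the key.

```python
import re

# Matches the tail of dumpbin's debug-directory line, for example:
#   Format: RSDS, {6143E0D1-9975-4456-AC8E-F24C8777336D}, 1, FractalX.pdb
_RSDS = re.compile(r"Format: RSDS, \{([0-9A-Fa-f-]+)\}, (\d+), (.+)")

def parse_rsds(line):
    """Extract (guid, age, pdb_name) from a dumpbin 'Format: RSDS' line."""
    m = _RSDS.search(line)
    if m is None:
        raise ValueError("no RSDS record found in line")
    guid, age, pdb_path = m.groups()
    # Assumption: the PE may record a full build path, but the store is
    # keyed on the file name only, so keep just the last path component.
    pdb_name = pdb_path.strip().replace("/", "\\").split("\\")[-1]
    # dumpbin prints the age in decimal, so parse it as base 10.
    return guid, int(age, 10), pdb_name

def pdb_store_path(server, guid, age, pdb_name):
    """Build the expected symbol-store path for a PDB (troubleshooting aid)."""
    # GUID: braces and dashes removed, upper-case. Age: hexadecimal.
    key = guid.strip("{}").replace("-", "").upper() + "%x" % age
    return "\\".join([server, pdb_name, key, pdb_name])
```

Feeding the dumpbin line for FractalX.exe through `parse_rsds` and then `pdb_store_path` with `c:\MySyms` as the server gives the uncompressed form of the path above. An age of 10 would contribute `a` to the key, not `10`.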

Simple enough. The algorithm for generating the GUID and age is that whenever you do a rebuild – whenever a fresh PDB is generated – a new GUID is created and the age is set to one. Whenever you do a partial build the PDB is updated with new debug information and the age is incremented. That’s it – use the PE name, link time stamp, and image size to find the PE (if it isn’t already loaded) and then use the GUID, age, and PDB file name to find the PDB file. Note that the layout of symbol server shares can be much more complex. You should use the APIs (discussed later) to retrieve PDB files – the technique above is purely for troubleshooting.

Adding to a symbol server

If you ship software on Windows then you should have a symbol server. That symbol server should contain the PE files and PDB files for every product you ship. If you don’t do this then you are doing either yourself or your customers a disservice. You should also have a symbol server for all internal builds that anybody at the company might end up running. If a program might crash, and if you want to be able to investigate the crash then put the symbols on the symbol server. If you’re worried about internal builds using up too much space then put them on a separate symbol server and purge the old files occasionally. You should also make sure that your build machines are running source indexing so that when you’re debugging a crash in an old version of your software you will automatically get the right source files. Luckily I wrote about that already. Adding files to a symbol server is the height of simplicity. Set sourcedir to point at a directory containing files to add and set dest to your symbol server directory, which should be accessible to all who need the symbols. Then run these commands:

symstore add /f %sourcedir%\*.dll /s %dest% /t prodname /compress
symstore add /f %sourcedir%\*.exe /s %dest% /t prodname /compress
symstore add /f %sourcedir%\*.pdb /s %dest% /t prodname /compress

If you’re running a case-sensitive file system then you’ll need to lower-case the PE file names afterwards – sorry.

That’s it. Use /r if you want the files recursively added, and see the help for more information. If you have an existing symbol server that was not compressed, or if you want to do the compression step separately from adding to the symbol server, then the makecab command (ships with Windows) does the trick. This is what Chrome used to use, with this type of command line:

makecab /D CompressionType=LZX /D CompressionMemory=21 chrome.dll.pdb

That will generate a compressed chrome.dll.pd_ file which will be automatically decompressed when it is downloaded to your local symbol cache.

Another option is to use compress.exe from the Windows Server 2003 Resource Kit Tools. But be careful. compress /help says that LZX is the default but it isn’t. So be sure to use compress -ZX or you will end up with compressed files that SymSrv cannot decompress.
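A quick way to check which format a compressed file actually uses is to look at its first four bytes. According to the discussion in the comments below, files starting with MSCF (the cabinet signature) come from symstore /compress or compress with -Z or -ZX, whereas SZDD is what compress.exe produces when neither option is given. The helper name `compression_format` is made up for this sketch, and it does nothing beyond comparing the signature.

```python
def compression_format(header):
    """Classify a compressed symbol-store file by its first four bytes."""
    magic = header[:4]
    if magic == b"MSCF":
        return "cab"      # cabinet format
    if magic == b"SZDD":
        return "szdd"     # compress.exe output without -Z or -ZX
    return "unknown"
```

Reading the first few bytes of a `.pd_` file in binary mode and passing them in is enough to tell the two formats apart.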

Apparently makecab, compress.exe, and symstore all have the limitation that their input files have to be less than 2 GiB. This is a problem for Chrome’s unified chrome.dll.pdb which is currently (October 2019) about 2.5 GB, and therefore cannot be compressed. Oops. pigz can handle 4 GiB input files but a version that generates CAB files is not yet open source. We ultimately fixed this by forgoing Microsoft’s compression and instead using HTTP compression – we added -Z to our gsutil command line so that the PDBs would be gzip compressed on upload.

If you make your symbol server available through HTTP be sure to use https, to ensure the integrity of the downloads.

Or, if you are trying to make a publicly accessible symbol server, just follow these simple directions.

Using a symbol server

The precise details of how to get your development tools to use your symbol server vary, but one almost universal method is to set the _NT_SYMBOL_PATH environment variable (advanced usage here and here), to something like this:

_NT_SYMBOL_PATH=SRV*c:\symbols*c:\MySyms;SRV*c:\symbols*https://msdl.microsoft.com/download/symbols;SRV*c:\symbols*https://chromium-browser-symsrv.commondatastorage.googleapis.com

This tells tools to first look in the local cache (c:\symbols) and then look in the symbol server c:\MySyms. If symbols are found in c:\MySyms then they are copied (and decompressed) to c:\symbols. If none of that works then the same process (including the same cache directory) is followed for Microsoft’s web based symbol cache, and then Chrome’s. Note that a local symbol cache is required when dealing with compressed symbols. Some symbol servers, such as Chrome’s and Microsoft’s, can be reached through https as well as http. When https is available as an option you should always use it since otherwise a man-in-the-middle attack could use malformed PDBs or source-indexing commands to execute arbitrary code when you download and use these PDBs.

Microsoft still lists their symbol server using http in some places, but https works and should be preferred.

The SRV* part of _NT_SYMBOL_PATH is important, poorly documented, and apparently poorly understood. It is my understanding, confirmed by discussions on stackoverflow, that SRV* tells symsrv.dll to treat the following paths or URLs as symbol servers instead of just a collection of loose files. So, if _NT_SYMBOL_PATH is c:\symbols then dbghelp or symsrv may recursively search the directory structure for your symbols, but if _NT_SYMBOL_PATH is SRV*c:\symbols then it will search in a very structured and efficient way. If _NT_SYMBOL_PATH is

SRV*c:\symbols*https://msdl.microsoft.com/download/symbols

then symsrv will first look in c:\symbols, using the quick and efficient symbol server algorithm, and will then (if the symbols aren’t found) do the same efficient search in Microsoft’s symbol server. You should prefer using SRV* and symbol server layout rather than unstructured symbols.
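To make the structure of the variable concrete, here is a simplified sketch that splits an _NT_SYMBOL_PATH value into its elements. It only handles the two forms discussed here (loose directories and SRV* entries) and ignores the other syntaxes that dbghelp accepts. The function name `parse_symbol_path` and the dictionary keys are choices made for this example.

```python
def parse_symbol_path(value):
    """Split an _NT_SYMBOL_PATH value into SRV entries and loose directories."""
    entries = []
    for element in value.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        if parts[0].upper() == "SRV" and len(parts) >= 2:
            stores = parts[1:]
            # With two or more stores the first one is the local cache and
            # the last one is the server; a single store has no separate cache.
            cache = stores[0] if len(stores) > 1 else None
            entries.append({"kind": "srv", "cache": cache, "server": stores[-1]})
        else:
            # No SRV* prefix: treated as a plain directory of loose files.
            entries.append({"kind": "dir", "path": element})
    return entries
```

Running this on the example value from earlier in this section produces three SRV entries that share the c:\symbols cache, while a bare c:\symbols comes back as a loose directory.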

Programmatically retrieving symbols

Usually the debuggers and profilers that you use will know how to use symbol servers, but occasionally you may need to write code to download symbols – perhaps you are writing a debugger or profiler. In my case I had a web page that listed GUIDs, ages, and PDB names for dozens of Microsoft DLLs from dozens of versions of Windows for which we needed symbols. Writing code to download all of these symbols was trivial – several orders of magnitude easier than getting symbols for other versions of Linux. The short explanation of what I needed to do was “call SymFindFileInPath”. In order to demonstrate how easy it was I decided to give a slightly longer explanation. The sample code, available on github as part of UIforETW, takes a GUID, age, and pdb name, or a datestamp, size, and pename and downloads the PE or PDB file from a symbol server. The biggest chunk of code is for parsing the GUID – the actual downloading is trivial.

The TESTING define uses a known-good GUID, age, name, and symbol server. Comment out that define to use this to download arbitrary symbols from the symbol servers specified in _NT_SYMBOL_PATH. If you encounter any difficulties then run this program in a debugger – dbghelp will print diagnostics to the debugger output window. The one gotcha is that dbghelp.dll and symsrv.dll have to be in the same directory as your tool – having them in your path does not work reliably.

As previously mentioned, the latest version of RetrieveSymbols (the tool whose source code is listed above) ships in UIforETW – source is here and the latest binary can be found here or in the latest UIforETW release. The symchk tool (ships in the Windows debugger toolkit) also downloads PDB files when passed a PE file – use the /v option to get information about where the PDB file was downloaded to, and other information. symchk is more convenient if you have a .dmp file or a PE file that you need symbols for, whereas RetrieveSymbols is more convenient if you have a GUID, age, and PDB file name.

Diagnosing symbol problems with windbg

If you have a minidump and its symbols are not loading then I recommend loading the minidump into windbg and using its diagnostics:

  • !sym noisy – print verbose information about attempts to get symbols
  • lmv m MyModule – print a record from the crash dump’s module list including its name, time stamp, image size, and where the PDB is located if it was found
  • !lmi MyModule – print a module’s header information – this only works if the PE file has been loaded, which is a prerequisite for having symbols load

Dumpbin summary

  • "%VS120COMNTOOLS%..\..\VC\vcvarsall.bat" – this adds dumpbin’s directory to the path
  • dumpbin FX.exe /headers | find "date stamp" – find the link time stamp of a PE file
  • dumpbin FX.exe /headers | find "size of image" – find the image size of a PE file
  • dumpbin FractalX.exe /headers | find "Format:" – find the GUID, age, and file name of a PE file’s PDB file

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048

74 Responses to Symbols the Microsoft Way

  1. Kdansky says:

    I’m using the windows symbol servers daily, but I have one weird issue: One of our customers has a machine on which our software crashes, and the generated minidumps give me a stack-trace into MFC-dlls of which I cannot get symbols, which is highly irregular. The exact version-numbers elude me right now, but I do not understand how this could happen in the first place. Does MS have holes in their PDB-coverage? Is the minidump faulty? Is it a problem with the client’s windows installation? Magic?

    • brucedawson says:

      That sounds peculiar. You should try loading the crashes into windbg and using “lmv m mfc100” to get more information, and !lmi also. You can at least find out whether the problem is with loading the PE file or the PDB file.

      Are you shipping the MFC DLLs with your application? That’s the recommended thing to do and that would normally mean that they would be running a known version — your version.

      • Kdansky says:

        Thank you for the suggestions. I could load the dumps into Windbg, and they seem to be fine, it’s just that I don’t have the correct versions of the dll’s themselves, (or possibly VS/windbg can’t find them due to 32bit app on 64bit dev machine). It seems shipping the mfc100u.dll ourselves would be the better option to begin with.

        • brucedawson says:

          > VS/windbg can’t find them due to 32bit app on 64bit dev machine

          No. That is never a problem. The symbol lookup algorithm doesn’t give a damn about CPU architecture. It’s all about extracting fields from the PE file and using them as search keys. x86/x64/ARM/PPC does not enter into it.

          Assuming you have your symbol path configured correctly the customer must have a version of mfc100u.dll that is not listed in Microsoft’s symbol server. This is possible, albeit very rare. You should be shipping mfc100u.dll anyway which will probably resolve both the crash and the symbol problems.

  2. Alexander says:

    Been using “srv*shared server*msdl” configuration for ages, and did all sorts of tricks to have a local cache. I did have a backup script to copy shared server to local cache, I had microsoft client-side caching, I even thought to hack its driver to force caching a directory! (The intent was to always write to shared server, but read from local cache). I also have microsoft and our own symbols all messed up in a single directory at shared server. Oh my, i feel so ashamed now that I learned I only had to configure “srv*local cache*shared server*microsoft” to get it working out of the box. Also, didn’t know I can actually put two symbol servers in a row to avoid having a mess of everything in a single directory.

    • brucedawson says:

      The syntax for _NT_SYMBOL_PATH is excessively messy but pretty configurable. Read the three links in the post for various other ways of having different caching policies for different symbol servers.

      I generally cache everything to the same directory and then delete it occasionally. I trust that it will get repopulated as needed.

      • Alexander says:

        I already read them. Your post merely served as a starting point. I actually wanted to learn more about compressing an existing server (the only thing I didn’t know from your post, and I seem to have skipped the part where you config local cache through intermediate), but reading stuff ended in learning all that. Thanks 🙂

  3. Alexander says:

    “Minidump files and most profile files do not actually record enough information to retrieve PDB files”

    Not quite so. I made a debugging tool for our crash handling purposes and I actually load PDB’s by MINIDUMP_MODULE.CvRecord, which is a CV_INFO_PDB70, containing everything you need. I didn’t even save PE files for ages, and it worked just fine.

    • Alexander says:

      Although I vaguely remember I had a case when I was helping some friend with his minidump and he didn’t have CvRecord in it. Probably a very outdated minidump-making tool or something like that.

    • brucedawson says:

      A problem we hit was that internally we would record full minidumps (with heap) and they had enough information to allow loading the PDBs, without finding the PEs. However when Microsoft sent us mini-minidumps (no heap) we couldn’t load the symbols. This is what forced me to learn more about how the process works so that I could configure our symbol publishing so that we could load symbols for *all* minidumps.

      So, there are some cases where the PE files are unnecessary, but I prefer not to risk it 🙂

  4. Alexander says:

    Also, are you aware of SymSrvGetFileIndexes() / SymSrvGetFileIndexInfo() ? This is a programmatical way of what you’re doing with dumpbin

  5. Pingback: Symbols on Linux Part Three: Linux versus Windows | Random ASCII

  6. Pingback: Symbols on Linux Part One: g++ Library Symbols | Random ASCII

  7. Pingback: Symbols on Linux Part Two: Symbols for Other Versions | Random ASCII

  8. Alexander says:

    Finally I got time to re-configure our symbols servers and compress it.
    I downloaded the compress.exe from your link, with md5 a911550b51f759a723f40db3157572f7.
    I compressed the symsrv using some batch script.
    And now it’s all ruined! Many files just can’t be extracted, others can, but with warnings. 7-zip will show absolutely invalid original file sizes for every single file. Having googled the internal format of compress.exe I can confirm that the header contains exactly that incorrect size. To be specific, it will always have byte 0x63 where it shouldn’t be.

    I’m pretty much terrified. Even though I do have a backup, just not too handy.

    Now, an experiment. Let’s make file of exactly 8465408 bytes, compress it and try to expand.
    HANDLE file = CreateFile(_T("Zeroes.pdb"), GENERIC_WRITE, 0, 0, CREATE_ALWAYS, 0, 0);
    SetFilePointer(file, 8465408, 0, FILE_BEGIN);
    SetEndOfFile(file);
    CloseHandle(file);

    Compressing goes fine: compress.exe -R Zeroes.pdb
    Expanding results in a 0 bytes file: expand -R Zeroes.pd_

    • brucedawson says:

      Damn. I don’t know what would have happened. How are you trying to extract the files? The only way we try extracting them is with symbol server and that works. I don’t know what the format is — I don’t know that 7-zip is supposed to be able to decompress them.

      Sorry…

      • Alexander says:

        It all started with debugger acting WEIRD on one pdb. Then it turned out that this PDB can’t be extracted at all. Give my experiment a try.

      • Alexander says:

        By the way, how did you compress your symbol server? There’re two compression types available in compress.exe and I simply used default one (turns out its compression isn’t as good as -Z compression).

        • Alexander says:

          Now it turns out even that is a lie. Compress.exe says -ZX is default, but in fact if neither -ZX nor -Z is specified then it uses some third type of compression (which has caused me damages). It seems that -ZX compresses better than -Z, which compresses better than the real default.

        • Alexander says:

          symstore /compress will compress even better than default / Z / ZX. In my case:
          original = 51mb
          ntfs = 24.2mb
          default = 18.5mb
          Z = 12.9mb
          ZX = 11.1mb
          symsrv = 9.5mb

        • brucedawson says:

          I don’t remember what option we used — it was over a year ago. Now we just use the /compress option to symstore. We probably used the default options, and symsrv.dll is able to decompress those.

          • Alexander says:

            If you’re able to find any PDB compressed back then, what is its signature? SZDD is bad default compression, MSCF is -Z, -ZX and symsrv compression. If it is SZDD, Do you have byte at offset 0x0A == byte 0x09 + 1? That’s what seems to be the bug. 4 bytes from 0xA should form a 4-byte original size.

          • Alexander says:

            MSCF means you’re lucky. I have a theory that it’s not compress.exe but some of windows DLLs are at fault, going to test it. So far Win7 x64 and Win8 x64 both have the problem.

          • Alexander says:

            Also, for quite a while now it looks like we’re both working quite intensively on pretty much the same technologies, and by that I mean general debugging / crash handling / debugging crash dumps / working on arcane faults. I feel that it would be great to make a closer acquaintance. If interested, please send some instant messenger contact to my email.

          • Alexander says:

            Theory about Windows didn’t work out. WinXP SP3 has the same bug. Probably no need to go further on that. What I really wonder is how the bug still exists, it’s been over 10 years now and it’s no good when the file can be compressed, but not expanded.

            You pointed out that it’s symsrv that should be able to expand, so I’d like to clarify that point once again: it all started with symsrv. On one of the PDB’s after compression it would create a 0-byte-long PDB in my cache and fail to load any symbols. Investigating I found that the PDB can’t be decompressed by any means, and expand.exe produces the same 0-byte pdb. The next thing I found is that all of the PDB’s were compressed wrong, but most of them can still be decompressed, even though expand.exe will yield warnings. I think it’s best to incorporate that in your post. Also, compress.exe doesn’t compress as well as symstore with any flags. So it’s probably best to convert an existing store by renaming the symsrv and starting a recursive symstore on it. Transaction history will be lost as a downside, though it seems it can be restored by hand, replacing 000Admin and all .ptr files from the original symbol store.

  9. justarandomguy says:

    Thanks a lot for your script. It saved me a lot of time.
    But when I use it, I get error 2, which is supposed to mean file not found. Any guess why? (I didn’t change a line of your code.)

    • brucedawson says:

      What script are you talking about? The dumpbin commands? The symstore commands? The C++ code for retrieving symbols?

      You need to make sure that the command that you are running is in the path. For the C++ code you need to make sure that dbghelp.dll and symsrv.dll are in the same directory as the executable you created.

      I’m confused as to how my script can have saved you a lot of time if you can’t run it…

      • justarandomguy says:

        Thanks for your fast reply. I am talking about your C++ code.
        Yes both dlls are in the same directory as the executable created.
        And you saved me a lot of time by giving me hope 🙂 .

        • brucedawson says:

          I can’t tell what’s going wrong. The C++ code doesn’t print numeric error codes so you must be having a failure to properly compile and run it. The best thing to do is to create a blank Win32 Command Line project using Visual Studio and then paste the code into the main source file below the #include "stdafx.h", then build and run. But, I can’t help you debug problems with this process.

          • justarandomguy says:

            You print the error number :
            Extract from your code : … %u … GetLastError());

            • brucedawson says:

              Right you are. Well, assuming that TESTING is defined it should work. Maybe make sure that _NT_SYMBOL_PATH is not set so that it uses the symbol path mentioned in the code. And double-check that dbghelp.dll and symsrv.dll are in the executable directory. I just copied them with this syntax:

              c:\temp>xcopy "C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\symsrv.dll" TestSymbols\Debug
              C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\symsrv.dll
              1 File(s) copied

              c:\temp>xcopy "C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\dbghelp.dll" TestSymbols\Debug
              C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\dbghelp.dll
              1 File(s) copied

              I had to fix up the smart quotes to get the code to compile (stupid smart quotes) but then it worked and found the requested kernel32.pdb.

          • justarandomguy says:

            You were right, shame on me … I forgot to define TESTING … shame on me …
            Everything works perfectly. You rock 🙂
            One other thing. How to get the GUID number (PdbSig70) ? Is there a way to calculate it from a known DLL ?

            PS : Sorry, I can not reply to your last post (dont know why).

            • brucedawson says:

              If you have the DLL then use dumpbin — see the Finding PDB Files section. If you don’t have the DLL then you need to retrieve it first — you generally can’t get the information needed to retrieve a PDB unless you have the DLL.

          • justarandomguy says:

            Ok, but dont you think you can generate the GUID from a DLL with the dbghelp API ?

          • justarandomguy says:

            I’m just trying to get the PDB file given a DLL file. 🙂

          • justarandomguy says:

            Do you think I shouldnt use SymFindFileInPath ?

            • brucedawson says:

              If you have a DLL you can find the GUID using dumpbin. That works. Or, you can read the dbghelp help to find out how to get the GUID programmatically. If you figure out how to do it you should share your results.

              • Alexander says:

                The API’s are SymSrvGetFileIndexes() and SymSrvGetFileIndexInfo(), I have already described them above.

          • justarandomguy says:

            @Alexander : Ok, I did not see it. I will give it a try.
            I manage to download the symbol using directly the SymLoadModuleEx function.
            One other thing, to parse GUID, I discover the CLSIDFromString which ease a lot the process :
            GUID guid;
            const wchar_t* hash = L"{57AF0B26-EF63-46C2-BFAB-A652F46CB5F7}";
            HRESULT hr;
            hr = CLSIDFromString(hash, &guid); // needs <objbase.h>; link with ole32.lib
            @brucedawson : Thanks a lot for your help. Keep up the good work 🙂

  10. Clem says:

    Thanks a lot for publishing this article!

    I have a symbol server where PE files and PDB files are pushed for each build. When I open a minidump with windbg the debug symbol is found and the correct callstack is displayed.
    However, when I open the same crash dump with Visual Studio I get an error message saying “no matching binary found”.

    Do you have any idea why Visual Studio would not be able to find the symbols? The same version of SYMSRV.dll seems to be used by both windbg and Visual Studio.

    I use the _NT_SYMBOL_PATH to tell the development tools where the symbol server is located. I am running the 64bit version of windbg 6.12.0002.633 and Visual Studio 2010.

  11. brucedawson says:

    You can try using procmon (sysinternals) to monitor access to the symbol server. Also, you should configure a local symbol server cache. This makes symbol retrieval more reliable and it means that once windbg has retrieved the symbols to the local cache, Visual Studio can retrieve them from there (assuming you specify the same cache directory for both, but it would be foolish not to).

    When I’ve seen this in the past it has usually been boring problems like a misspelled symbol server name, or Visual Studio being launched before _NT_SYMBOL_PATH was set.

    • Clem says:

      It looks like it is a problem with VS2010 ignoring _NT_SYMBOL_PATH. Everything works fine when I open the same minidump with VS2012. I am essentially seeing the issue described here: https://stackoverflow.com/questions/17981030/why-is-vs-2010-ignoring-nt-symbol-path. The workaround is to manually set the symbol server path in VS2010 (for reference I am referring to my own symbol server, not the one from Microsoft).

      I am still interested to know what is happening so I followed your advice and used procmon to monitor access to the symbol server. According to procmon the VS2010 process definitely accesses the symbol server; however, I can’t see any references to the PE file. If I run the same experiment with VS2012 and windbg, I can see the process accessing the PE file first and then the PDB file.

      I am not sure what conclusion to draw from my experiment. VS2010 accesses the symbol server so it means that it knows about _NT_SYMBOL_PATH (it is the only place where it is specified). However, it doesn’t try to open the PE file. If I manually set the symbol server path in VS2010 then the debugger finds and opens the PE file. I don’t understand what the difference is between specifying the symbol server address with _NT_SYMBOL_PATH or manually in VS. In both cases the address seems to be picked up correctly, however, when I use _NT_SYMBOL_PATH the PE file is not loaded so the pdb cannot be found.

      Do VS2010 and VS2012 use a different debug engine? I looked at the callstack that accesses the symbol server and VS2010 uses NatDbgDE.dll while VS2012 uses vsdebugeng.dll.

      • brucedawson says:

        It sounds like you now know more than I do (or, I only know as much as you because I just read your comment and your link).

        The main problem that we have hit is that VS (unsure of which versions, but certainly 2010) will ignore symbol server cache directories and will drop symbol server directories in randomly selected directories at randomly selected times. If VS is running as administrator this can lead to arbitrarily badly corrupted systems since having kernel32.dll directories in your path confuses Windows. If VS is not running as administrator then these files can still end up being dropped in %temp% which can then cause future VS updates to fail to install. Joy. This may be related, or maybe not.

        • Clem says:

          From what I can see the symbol server cache is being used correctly. I am not going to dig any deeper for now. I have a workaround for VS2010 and hopefully I’ll move to VS2012 soon.

          Thanks a lot for your help.

  12. Pingback: Slow Symbol Loading in Microsoft’s Profiler, Take Two | Random ASCII

  14. Pingback: Xperf and Visual Studio: The Case of the Breakpoint Hangs | Random ASCII

  15. Pingback: Visual Studio Single Step Performance Fixes | Random ASCII

  16. Rick Molloy says:

    I know this is an old thread, but here is a comment that is super important for those of us who have build farms and use symstore. Publishing to a symbol server needs to be serialized; unless you manage this yourself, concurrent publishes will ‘corrupt’ the symstore. Corruption usually looks like not being able to find a particular build (at random) in the store.

    https://msdn.microsoft.com/en-us/library/windows/desktop/ms681417(v=vs.85).aspx

    “Note SymStore does not support simultaneous transactions from multiple users. It is recommended that one user be designated “administrator” of the symbol store and be responsible for all add and del transactions.”

    • brucedawson says:

      It seems very odd to me that SymStore can’t handle simultaneous transactions. Isn’t a symbol store just a directory structure? I believe that Chrome builds its symbol store with the Google Storage equivalent of mkdir and xcopy. Maybe that is another option – skip symstore and use mkdir/xcopy instead. Or use symstore to compress the files locally and then mkdir/xcopy. Worth a try.

      It would be great if somebody at Microsoft could comment on what circumstances can trigger symstore corruption, so that we don’t do bizarre workarounds that aren’t generally needed.
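
One way a build farm might serialize its publish step is a lock file next to the store. The sketch below is only an illustration of that idea in Python: the `publish_lock` helper and the `publish.lock` file name are hypothetical, and it assumes that exclusive file creation is atomic on whatever file system hosts the store (worth verifying on a network share). A crashed job would leave a stale lock that has to be removed by hand.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def publish_lock(store_root: str, timeout_s: float = 600.0, poll_s: float = 1.0):
    """Hold an exclusive lock file in store_root for the duration of the with-block."""
    lock_path = os.path.join(store_root, "publish.lock")  # hypothetical file name
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # publisher at a time gets past this call.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_s)
    os.close(fd)
    try:
        yield lock_path
    finally:
        # Release the lock even if the publish step raised.
        os.remove(lock_path)
```

A build job would wrap its publish step, for example a `subprocess.run` call that invokes symstore, in `with publish_lock(store_root):` so that two jobs never add to the store at the same time.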

  17. Pingback: Everything Old is New Again, and a Compiler Bug | Random ASCII

  18. Pingback: Add a const here, delete a const there… | Random ASCII

  19. Pingback: Analyzing a Confusing Crash | Random ASCII

  20. brucedawson says:

    Note that for PE files the time-stamp should be printed with %08X – zero padded to eight digits, upper-case hex (if you use a case-sensitive file system). At least that is what symstore.exe creates.

    The padding to eight hex digits is crucial – I’ve seen bugs before where people have padded the size (don’t) and there are potential bugs for not padding the date – you need to exactly match the path which the debuggers will look for. Or use symstore.exe and let it worry about it.
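
The padding rule above can be sketched in a few lines of Python. The function name `pe_store_path` is made up for illustration, and printing the image size in lower-case hex is an assumption; the comment above only establishes that the time stamp is padded to eight upper-case digits and the size is not padded at all.

```python
def pe_store_path(pe_name: str, link_timestamp: int, image_size: int) -> str:
    """Return the store-relative path <name>/<timestamp><size>/<name> for a PE file."""
    if not 0 <= link_timestamp <= 0xFFFFFFFF:
        raise ValueError("link time stamp must fit in 32 bits")
    if image_size < 0:
        raise ValueError("image size must be non-negative")
    # Time stamp: exactly eight upper-case hex digits, zero padded (%08X).
    # Image size: hex with NO padding. Lower-case is an assumption here.
    index = f"{link_timestamp:08X}{image_size:x}"
    return f"{pe_name}/{index}/{pe_name}"


if __name__ == "__main__":
    # A small time stamp shows the zero padding; the size stays unpadded.
    print(pe_store_path("example.dll", 0x1234, 0x5000))
```

Padding the size, or forgetting to pad the time stamp, produces a different directory name, which the debugger will simply never ask for.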

  21. Thanks for the article, really helpful.

    However, I can’t get the code to work with compressed files. I get a “file not found” message in the SYMSRV debug output:
    “SYMSRV: UNC: d:\SymbolServer\MemPro_Test.pdb\AF2B5AF7FF014075BAAF5488BB8479572\MemPro_Test.pdb – file not found”

    I’m using your code, compiling with VS2017, and compressing using the /compress flag in symstore.exe. If I don’t compress the file it works fine.

    If I change the extension of the file in the symbol server from “.pd_” to “.pdb” it does work. It’s as though SymSrv doesn’t understand the compressed file extension. Any ideas how to get around this?

    • Alexander says:

      I have been using compressed symbols for many years now and everything works as intended.
      What are you loading the symbols with?
      Maybe some super outdated symsrv.dll is being used under the hood?

      You can also try to check symbols with:
      symchk.exe /su "symbol_server_string" /v "exe_path"

      Do you cascade symbol stores? Like this:
      srv*C:\Symbols*d:\SymbolServer
      Maybe a cascaded cache is required to use compressed symbols (I always used it myself, because this way it also works much faster)

      If nothing helps, give us your symbol server string.

      • Thanks Alexander. I found the problem in the end, you were right, it was the cascaded cache that was required. It obviously needs somewhere to decompress to. Using a path like you suggest works, the pdb gets decompressed to c:\symbols.
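
Since a missing cache turned out to be the culprit here, a quick check of the symbol path can save some head scratching. The Python sketch below splits a symbol path into its `srv*` entries and reports the ones with no cache in front of the final store. The names `SrvEntry`, `parse_symbol_path` and `entries_without_cache` are hypothetical, and the parsing is deliberately simplified: it only understands the `srv*cache*server` form shown above and ignores plain directories and other forms.

```python
from typing import List, NamedTuple


class SrvEntry(NamedTuple):
    caches: List[str]  # stores listed before the last one, nearest first
    server: str        # last store in the chain


def parse_symbol_path(symbol_path: str) -> List[SrvEntry]:
    """Split a symbol path on ';' and decode each srv*...*... element."""
    entries = []
    for element in symbol_path.split(";"):
        parts = element.strip().split("*")
        if len(parts) < 2 or parts[0].lower() != "srv":
            continue  # plain directory, or a form this sketch does not handle
        stores = parts[1:]
        entries.append(SrvEntry(caches=stores[:-1], server=stores[-1]))
    return entries


def entries_without_cache(symbol_path: str) -> List[str]:
    """Return the stores that are not fronted by any cache directory."""
    return [e.server for e in parse_symbol_path(symbol_path) if not e.caches]


if __name__ == "__main__":
    print(entries_without_cache(r"srv*d:\SymbolServer"))
    print(entries_without_cache(r"srv*C:\Symbols*d:\SymbolServer"))
```

An entry reported by `entries_without_cache` is a candidate for the "file not found" behavior with compressed files described in this thread.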

  22. yi says:

    I recently did something similar (setting up a symbol server) and “One extra quirk is that Microsoft’s linker/debugger toolchains lower-case the names of PE files. This means that if you use a case-sensitive file system for your symbol server (as Chrome does) then you have to upload your symbols using lower-case file names.” is no longer true for me. PE file names keep their original case.

    • brucedawson says:

      How did you verify this? ’cause if they stopped doing this then I would expect Chrome’s symbol server to stop working, since it remains case sensitive (last I checked)

      • yi says:

        The symbol server I set up is in S3, and the requests look like:
        GET /symbolserver/SHLWAPI.dll/4CE7B9E257000/SHLWAPI.dll HTTP/1.1 404
        SHLWAPI.dll is not lower-cased
        (when I’m using a Visual Studio 2008)
        I’m curious how they are not lower-cased for me but Chrome’s symbol server still works though…

        • brucedawson says:

          I’m not sure. Looking at my mention of this it seems that I wasn’t sure whether the lower-casing was being done by the linker or by the debugger. It may be that one of those has changed so that it no longer forcibly lower-cases the names, but Chrome continues to work because it specifies the case correctly.

          Or, maybe the lower-casing behavior is new – coming after VS 2008. A lot may have changed in the five years since VS 2008 was created and when I published my blog post.

  23. Marc Sherman says:

    “… an xperf ETL file and this file would contain a series of module name, link time stamp, image size triplets…” Is this the purpose of `xperf -merge`, to add the triplets to the ETL file?

    • brucedawson says:

      I’m a bit hazy on the details of what xperf -merge gets, but machine-specific information definitely, and that might be when those triplets are recorded. That is what is suggested by the comments regarding the pre-trace option, which was based on conversations with a Microsoft developer:
      https://github.com/google/UIforETW/blob/master/UIforETW/UIforETWDlg.cpp#L1213

      • Zhentar says:

        The module name, link time, and image size are logged directly by the kernel Image provider events (along with other things needed to interpret address, like what offset it was loaded at). The trace merge reads back out all of those image events and inserts other stuff that you can only get from the executable image, like the PDB signature (unless it’s an NB10 PDB signature, with the Windows 10 trace merge DLLs :salt: ), file version & description, etc. Aside from that, the merge process injects events for any applicable registered ETW manifests, and other system configuration info.

        There’s also an option to generate NGEN PDBs. But all it does is populate your local symbol cache; it doesn’t inject anything into the trace file and it doesn’t tell you which ones it generated, making it only useful for analysis on the same machine, the exact opposite of every other merge option…

        • brucedawson says:

          Thanks for the details!

          I guess once the NGEN PDBs are generated you at least have the option to package up the trace (File-> Export Package) for analysis on another machine.

  24. Pingback: Why don’t Minidumps give good call stacks? – tuatphukien.com

  25. Pingback: Creating a Public Symbol Server, Easily | Random ASCII – tech blog of Bruce Dawson

  26. Daniel robinson says:

    As ever, thanks for a great article Bruce. I wasn’t aware of the compress option, which is really helpful as our symbol server just ran out of disk space 🙂
    Does anyone have a script which will either do an in-place replace/compress or copy/compress to new path for all stored symbols?
    Thanks

    • brucedawson says:

      It should be fairly easy to write a Python script that iterates through a symbol-server directory and compresses files. Note that these compressed files have to be decompressed when they are downloaded locally, so don’t do this on your local symbol-server cache.

      I would probably have the script visit each directory, move the PDB/PE file in question to a temporary location, and then run symstore /compress on the file. If you pass the right parameters then it should reappear in its compressed form in the same directory (with a trailing ‘_’ character). Good luck, and test the script before letting it loose.
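
A rough starting point for the first half of such a script, sketched in Python. It only finds the files that still need compressing; the compression itself is left to symstore. The function name `find_uncompressed` is made up, the store is assumed to use the usual `<root>/<file name>/<index>/<file>` layout, and treating `*.ptr` files as bookkeeping rather than payload is an assumption.

```python
import os
from typing import Iterator


def find_uncompressed(store_root: str) -> Iterator[str]:
    """Yield files in <root>/<name>/<index>/ whose names lack the trailing '_'."""
    for name in sorted(os.listdir(store_root)):
        name_dir = os.path.join(store_root, name)
        # 000Admin holds symstore's own transaction records, not symbols.
        if name.lower() == "000admin" or not os.path.isdir(name_dir):
            continue
        for index in sorted(os.listdir(name_dir)):
            index_dir = os.path.join(name_dir, index)
            if not os.path.isdir(index_dir):
                continue
            for file_name in sorted(os.listdir(index_dir)):
                path = os.path.join(index_dir, file_name)
                if not os.path.isfile(path):
                    continue
                # Assumption: *.ptr files are bookkeeping, not payload.
                if file_name.lower().endswith(".ptr"):
                    continue
                # Compressed files end in '_' (for example .pd_ or .dl_).
                if not file_name.endswith("_"):
                    yield path


if __name__ == "__main__":
    import sys
    for path in find_uncompressed(sys.argv[1]):
        print(path)
```

Printing the list first and eyeballing it is a cheap way to test before letting anything move or rewrite files in the store.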

  27. Daniel robinson says:

    I ended up creating the following script which reads history.txt and then iterates over each index file and adds each symbol file to a new store with the compress option.

    # Root of current symstore folder
    $sourceSymStore = "C:\SymStore"
    # Root of new symstore folder
    $newSymStore = "S:\SymStore"

    $historyFilePath = [String]::Format("{0}\000Admin\history.txt", $sourceSymStore)

    # history.txt has one comma-separated line per transaction
    $symhistory = Import-Csv $historyFilePath -Header "Index","Op","Type","DateTime","Program","Name","Version"

    foreach ($upload in $symhistory)
    {
        # Each transaction has an index file listing the symbols it added
        $indexFilePath = [String]::Format("{0}\000Admin\{1}", $sourceSymStore, $upload.Index)
        $indexInfo = Import-Csv $indexFilePath -Header "Symbol","Path"

        foreach ($symInfo in $indexInfo)
        {
            $file = ($symInfo.Symbol.Split("\"))[0]
            $GUID = ($symInfo.Symbol.Split("\"))[1]

            $symPath = [String]::Format("{0}\{1}\{2}\{1}", $sourceSymStore, $file, $GUID)

            # Check symbol exists
            if (Test-Path $symPath)
            {
                symstore add /f $symPath /s $newSymStore /t $upload.Name /v $upload.Version /c "Script import" /compress

                Write-Host ([String]::Format("Imported {0},index={1},name={2},version={3}", $symPath, $upload.Index, $upload.Name, $upload.Version))
            }
        }
    }

    • Daniel robinson says:

      This didn’t turn out to be the best way, as each symbol in an index/transaction was being added in its own transaction, which was really slow. I’ve modified the script to first collect the list of symbols in the source transaction and then pass them to symstore in one pass. This now uses all cores on the symbol server during the import and seems to be going a lot faster.

      $sourceSymStore = "C:\SymStore"
      $newSymStore = "S:\SymStore"

      $historyFilePath = [String]::Format("{0}\000Admin\history.txt", $sourceSymStore)

      $symhistory = Import-Csv $historyFilePath -Header "Index","Op","Type","DateTime","Program","Name","Version"

      foreach ($upload in $symhistory)
      {
          $indexFilePath = [String]::Format("{0}\000Admin\{1}", $sourceSymStore, $upload.Index)
          $indexInfo = Import-Csv $indexFilePath -Header "Symbol","Path"

          # Collect every symbol of this transaction so symstore runs once per transaction
          $symbols = New-Object System.Collections.ArrayList

          foreach ($symInfo in $indexInfo)
          {
              $file = ($symInfo.Symbol.Split("\"))[0]
              $GUID = ($symInfo.Symbol.Split("\"))[1]

              $symPath = [String]::Format("{0}\{1}\{2}\{1}", $sourceSymStore, $file, $GUID)

              # Check symbol exists
              if (Test-Path $symPath)
              {
                  # [void] discards the index that ArrayList.Add returns
                  [void]$symbols.Add($symPath)

                  Write-Host ([String]::Format("Found Symbol {0},index={1},name={2},version={3}", $symPath, $upload.Index, $upload.Name, $upload.Version))
              }
          }

          if ($symbols.Count -ge 1)
          {
              # Write the list to the same response file that symstore reads below
              $symbols | Out-File C:\temp\symbols.txt -Encoding ascii
              # Quoted so the leading @ (response-file marker) reaches symstore literally
              symstore add /f "@C:\temp\symbols.txt" /s $newSymStore /t $upload.Name /v $upload.Version /c "Script import" /compress
          }
      }
