Can PDBs be off more than 2 or 3 lines?
You state that you have never seen PDBs off by more than a few lines, and that 40 lines seems too much, especially when the decompiled code doesn't look much different.
However, that's not true and can be proven with a two-liner: create a String object, set it to null and call ToString() on it. Compile and run. Next, insert a 30-line comment, save the file, but do not recompile. Run the application again. The application still crashes, but the line number it reports is now 30 lines off from the source file (line 14 vs. 44 in the screenshot).
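The experiment can be sketched as a small console program. The class and method names here (PdbOffsetDemo, Crash) are made up for illustration, and the exception is caught only so that the stack trace can be printed; the original two-liner simply lets the application crash.

```csharp
using System;

public static class PdbOffsetDemo
{
    // Returns the stack trace of the NullReferenceException,
    // or null if (unexpectedly) nothing was thrown.
    public static string Crash()
    {
        try
        {
            string s = null;
            s.ToString(); // throws NullReferenceException
            return null;
        }
        catch (NullReferenceException ex)
        {
            // Line numbers in this text come from the PDB, not from the source file.
            return ex.StackTrace;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Crash());
    }
}
```

Build it once with a PDB and note the reported line. Then insert a comment block above Crash() in the source without rebuilding: the reported line stays the same, while the statement has moved down in the file.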
The offset is not related at all to the code that gets compiled. Such things can easily happen:
- code reformat, which e.g. sorts the methods by visibility, so the method moved up 40 lines
- code reformat, which e.g. breaks long lines at 80 characters, usually this moves things down
- optimize usings (R#) which removes 30 lines of unneeded imports, so the method moved up
- insertion of comments or newlines
- having switched to a branch while the deployed version (matching the PDB) is from trunk (or similar)
How can this happen in your case?
If it's really as you say and you seriously reviewed your code, there are two potential issues:
- the EXE or DLL does not match the PDBs, which can easily be checked
- the PDBs do not match the source code, which is harder to identify
Multithreading can set objects to null when you least expect it, even if they were initialized before. In such a case, the cause of a NullReferenceException may not only be 40 lines away, it can even be in a totally different class and therefore file.
How to continue
Capture a dump
I'd first try to get a dump of the situation. This allows you to capture the state and look at everything in detail without the need of reproducing it on your developer machine.
For ASP.NET, see the MSDN blog Steps to Trigger a User Dump of a Process with DebugDiag when a Specific .net Exception is Thrown or Tess' blog.
In any case, always capture a dump including full memory. Also remember to collect all necessary files (SOS.dll and mscordacwks.dll) from the machine where the crash occurred. You can use MscordacwksCollector (Disclaimer: I'm the author of it).
Check the symbols
See if the EXE/DLL really matches your PDBs. In WinDbg, the following commands are helpful:
!sym noisy
.reload /f
lm
!lmi <module>
Outside WinDbg, but still using debugging tools for Windows:
symchk /if <exe> /s <pdbdir> /av /od /pf
3rd party tool, ChkMatch:
chkmatch -c <exe> <pdb>
Check the source code
If the PDBs match the DLLs, the next step is to check whether the source code belongs to the PDBs. This works best if you commit PDBs to version control together with the source code. If you did that, you can search for the matching PDBs in source control and then get the same revision of source code and PDBs.
If you didn't do that, you're unlucky and you should probably not use source code but work with PDBs only. In case of .NET, this works out pretty well. I'm debugging a lot in 3rd party code with WinDbg without receiving the source code and I can get pretty far.
If you go with WinDbg, the following commands are useful (in this order):
.symfix c:\symbols
.loadby sos clr
!threads
~#s
!clrstack
!pe
Why code is so important on StackOverflow
You wrote: "Also, I looked at the code of the View() method, and there is no way for it to throw a NullReferenceException"
Well, other people have made similar statements before. It's easy to overlook something.
The following is a real-world example, just minimized and in pseudocode. In the first version, the lock statement didn't exist yet and DoWork() could be called from several threads. Quite soon, the lock statement was introduced and everything went well. When leaving the lock, someobj will always be a valid object, right?
    // Pseudocode: all members belong to the same class (e.g. a form).
    private SomeObj someobj = new SomeObj();

    private void OnButtonClick(...)
    {
        DoWork();
    }

    private readonly object a = new object();

    private void DoWork()
    {
        lock (a)
        {
            try
            {
                someobj.DoSomething();
                someobj = null;
                DoEvents();
            }
            finally
            {
                someobj = new SomeObj();
            }
        }
    }
Until one user reported the same bug again. We were sure that the bug was fixed and this was impossible to happen. However, this was a "double-click user", i.e. someone who does a double click on anything that can be clicked.
The DoEvents() call, which was of course not in such a prominent place, caused the lock to be entered again by the same thread (which is legal). This time, someobj was null, causing a NullReferenceException in a place where it seemed impossible to be null.
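The re-entrancy can be reproduced without any UI. The following is a minimal sketch, not the original code: PumpPendingClick() is a hypothetical stand-in for DoEvents() that handles one "second click" on the same thread, and the depth counter exists only to stop the recursion.

```csharp
using System;

public class SomeObj
{
    public void DoSomething() { }
}

public class ReentrancyDemo
{
    private readonly object gate = new object();
    private SomeObj someobj = new SomeObj();
    private int depth;

    // Stand-in for DoEvents(): processes one pending click on the same thread.
    private void PumpPendingClick()
    {
        if (depth < 2)
        {
            DoWork(); // second click arrives while the first is still inside DoWork()
        }
    }

    public void DoWork()
    {
        depth++;
        lock (gate) // the same thread may take the lock again, so this does not block
        {
            try
            {
                someobj.DoSomething(); // throws on the nested call: someobj is null
                someobj = null;
                PumpPendingClick();
            }
            finally
            {
                someobj = new SomeObj();
            }
        }
    }

    // Returns true if the nested call hit the NullReferenceException.
    public static bool Reproduce()
    {
        try
        {
            new ReentrancyDemo().DoWork();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Reproduce());
    }
}
```

The lock protects against other threads only. It does nothing against the same thread arriving a second time, which is exactly what a message pump inside the locked region allows.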
You also wrote: "That second time, it was return boolValue ? RedirectToAction("A1","C1") : RedirectToAction("A2", "C2"). The boolValue was an expression which could not have thrown the NullReferenceException"
Why not? What is boolValue? A property with a getter and setter? Also consider the following (maybe a bit off) case, where RedirectToAction takes only constant parameters, looks like a method, throws an exception but is still not on the callstack. This is why it is so important to see code on StackOverflow...
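One way such a case could look is sketched below. It is purely an assumption for illustration and says nothing about the real ASP.NET MVC RedirectToAction: here, in a hypothetical FakeController class, RedirectToAction is a delegate-typed field that was never assigned.

```csharp
using System;

public class FakeController
{
    // Hypothetical: a delegate field, not a method. Never assigned, so it is null.
    public Func<string, string, string> RedirectToAction;

    public string View(bool boolValue)
    {
        // Reads like a method call with constant arguments,
        // but it actually invokes a null delegate.
        return boolValue ? RedirectToAction("A1", "C1") : RedirectToAction("A2", "C2");
    }

    // Returns the stack trace of the NullReferenceException,
    // or null if nothing was thrown.
    public static string Reproduce()
    {
        try
        {
            new FakeController().View(true);
            return null;
        }
        catch (NullReferenceException ex)
        {
            return ex.StackTrace;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Reproduce());
    }
}
```

The exception is thrown while invoking the delegate, so no RedirectToAction frame appears in the stack trace. Someone reading only the return statement would have no reason to suspect it.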