Debugging an exception obscured by bad coding

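A few days ago our program failed during a longevity test with the error "The file exists". Despite a lot of effort, we could not reproduce it in our local test environment. In the end the logs narrowed the target down to a few DLLs from other projects. After decompiling them with Reflector and reviewing the code, we finally worked out the root cause: Path.GetTempFileName() throws an exception in certain situations. As MSDN puts it: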

The GetTempFileName method will raise an IOException if it is used to create more than 65535 files without deleting previous temporary files.
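The quote implies that callers are expected to delete each temp file once they are done with it. A minimal sketch of that pattern follows; the class name TempFileDemo and the method WriteAndReadBack are hypothetical names made up for illustration and are not taken from the DLL in question.

```csharp
using System;
using System.IO;

static class TempFileDemo
{
    // Hypothetical helper: creates a temp file, uses it, and always deletes it,
    // so repeated calls do not accumulate files in the temp directory.
    public static string WriteAndReadBack(string text)
    {
        // GetTempFileName creates a zero-byte file on disk and returns its path.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllText(path);
        }
        finally
        {
            // Runs on success and on failure alike.
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine(WriteAndReadBack("scratch data"));
    }
}
```

Code that skips the File.Delete step leaves one file behind per call, which is how a long-running test can eventually reach the limit described above.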

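With a limit of 65535 files, it is no wonder we could not reproduce the problem: it would take a huge number of test runs to create that many temp files. Looking back, what made this issue hard to solve was mainly that the DLL handles exceptions in some very poor ways. Here is an example: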

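using System;
using System.IO;
using System.Threading;

class Program
{
    private void Test1()
    {
        try
        {
            // ... lots of other code
            Path.GetTempFileName();
            // ... lots of other code
        }
        catch (Exception ex)
        {
            // Never write it this way: building a new exception from ex.Message
            // alone replaces the original stack trace, which makes the failure
            // hard to locate. At the very least write:
            //     throw new Exception(ex.Message, ex);
            throw new Exception(ex.Message);
        }
    }

    public void Test2()
    {
        try
        {
            Test1();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }

    static void Main(string[] args)
    {
        Program prg = new Program();
        while (true)
        {
            prg.Test2();
            Thread.Sleep(2000);
        }

        // Unreachable: the loop above never exits.
        Console.Read();
    }
}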
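To make the difference concrete, here is a small sketch that compares the bad pattern with two safer alternatives. It is only an illustration: CreateTempFile is a hypothetical stand-in that simulates the failure with a hard-coded IOException instead of actually exhausting the temp directory, and the class and method names are made up for this example.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

static class ExceptionWrappingDemo
{
    // Hypothetical stand-in for the failing library call. NoInlining keeps
    // this frame visible in the stack trace even in optimized builds.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateTempFile()
    {
        throw new IOException("The file exists.");
    }

    // Bad: only the message survives; the original type and trace are lost.
    public static void WrapWithoutInner()
    {
        try { CreateTempFile(); }
        catch (Exception ex) { throw new Exception(ex.Message); }
    }

    // Better: the original exception travels along as InnerException.
    public static void WrapWithInner()
    {
        try { CreateTempFile(); }
        catch (Exception ex)
        {
            throw new Exception("Creating the temp file failed: " + ex.Message, ex);
        }
    }

    // Also fine: a bare "throw;" rethrows the same exception object.
    public static void Rethrow()
    {
        try { CreateTempFile(); }
        catch (Exception) { throw; }
    }

    // Runs an action and returns the text a logger would record for a failure.
    public static string Capture(Action action)
    {
        try { action(); return string.Empty; }
        catch (Exception ex) { return ex.ToString(); }
    }

    static void Main()
    {
        Console.WriteLine("--- without inner exception ---");
        Console.WriteLine(Capture(WrapWithoutInner));
        Console.WriteLine("--- with inner exception ---");
        Console.WriteLine(Capture(WrapWithInner));
        Console.WriteLine("--- rethrow ---");
        Console.WriteLine(Capture(Rethrow));
    }
}
```

In the first case the logged text starts at the catch block and never mentions CreateTempFile, which is the situation we faced with the DLL. In the other two cases the log still shows the IOException and the frame where it was originally thrown.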

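Afterwards I wondered: could I have located this problem with WinDbg? After some experimenting, the answer is yes. The next time you run into an unclear stack trace like this, you can try an approach similar to the one below.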

1. First, a close-up of the program while it is running