$2M prize opportunity scuttled by a C# memory leak

| November 17, 2007

From Slashdot comes a link to this postmortem by the Princeton University team competing in the DARPA Grand Challenge for autonomous vehicles:

It was the closest thing to a memory leak that you can have in a “managed” language. C# manages your memory for you by watching the objects you create. When your code no longer maintains any reference to the object, it automatically gets flagged for deletion without the programmer needing to manually free the memory, as they would need to do in C or C++. Hence, in order to be allocating memory that is never freed in C#, you need to somehow be referencing an object in a way that you don’t know about. Our actual bug occurred in the obstacle detection code. As we detect obstacles in each frame, we store the ones we detect. As the car moves, we call an update function on each of the obstacles that we know about, to update their position in relation to the car. Obviously, once we pass an obstacle, we don’t need keep it in memory, so everything 10 feet behind the car got deleted.

Or so we thought. We kept noticing that the computer would begin to bog down after extended periods of driving. This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles. The computer performance would just gradually slow down until the car just simply stopped responding, usually with the gas pedal down, and would just drive off into the bush until we pulled the plug. We looked through the code on paper, literally line by line, and just couldn’t for the life of us imagine what the problem was. It couldn’t be the list of obstacles: right there was the line where the old obstacles got deleted. Sitting in a McDonald’s the night before the competition, we still didn’t know why the computer kept dying a slow death. Because we didn’t know why this problem kept appearing at 40 minutes, we decided to set a timer. After 40 minutes, we would stop the car and reboot the computer to restore the performance…

We added one line of code to remove the event subscription and, over the next three days, we successfully ran the car for 300 miles through the Mojave desert.

We had vacations coming up a few weeks after the race, so we left the cars in Vegas and returned, two weeks later, to investigate the problem. One of our team members downloaded the 14-day trial of ANTS Profiler and we ran it on our car’s guidance code. We profiled the memory usage and saw the obstacle list blowing up. How could this be? We called “delete” on those old obstacles! To our amazement, it was only minutes before we realized that our list of detected obstacles was never getting garbage collected. Though we thought we had cleared all references to old entries in the list, because the objects were still registered as subscribers to an event, they were never getting deleted.

ANTS Profiler helped us fix a problem in minutes that would have taken us weeks to track down. If only we’d thought of it before the competition, we would most likely have finished the entire race and had a chance at the top prize money.

Ouch, ouch, and ouch. ..bruce w..

Be Sociable, Share!

Category: Information Technology, Main, Project Management

About the Author ()

Webster is Principal and Founder at Bruce F. Webster & Associates, as well as an Adjunct Professor of Computer Science at Brigham Young University. He works with organizations to help them with troubled or failed information technology (IT) projects. He has also worked in several dozen legal cases as a consultant and as a testifying expert, both in the United States and Japan. He can be reached at bwebster@bfwa.com, or you can follow him on Twitter as @bfwebster.

Comments are closed.