Citation :
I ran across a nice little Com Interop gotcha when I was working with Excel automation from C# today. I'm sure that a lot of you with more COM Interop experience than me already know about this but I figured I would share so that those with lesser experience can avoid the heartburn that I had today. The gotcha occurred when I was working with Excel ranges to pull information out of a spreadsheet from C#. When working with Office automation from a managed language, one needs to understand that under-the-hood, there are still reference counters! If you are not careful when writing code, it is VERY easy to expose your program to some basic memory leaks.
Below is a sample of some easy pseudo-code to pull information out of a range. Can you spot the problem? (I know I sure couldn't earlier)
Excel.Range range;
range = GetFirstRange(....);
// Do Something With Range
range = GetSecondRange(....);
// Do Something With New Range
Marshal.ReleaseComObject(range);
range = null;
The problem is that range is not released before the call to GetSecondRange and hence the reference counter after the call is at two, instead of one (I assume). What does this mean? This means that when your program exits, Excel will remain loaded in memory. The correct code looks like:
Excel.Range range;
range = GetFirstRange(....);
// Do Something With Range
Marshal.ReleaseComObject(range);
range = GetSecondRange(....);
// Do Something With New Range
Marshal.ReleaseComObject(range);
range = null;
While this gets pretty easy to spot with the example above, it gets a little more tricky when dealing with loops and test conditions. Sounds easy enough, right? WRONG! The real tricky part comes when you are dealing with temporary variables used by the runtime. For instance, when dealing with opening up a workbook, the workbooks collection will be a temporary variable used by the runtime:
Excel.Workbook myWorkbook = myExcelApplication.Workbooks.Open(....);
// Do something with workbook
Marshal.ReleaseComObject(myWorkbook);
myWorkbook = null;
Can you spot what's wrong with the code above? The problem is that .Workbooks.Open will introduce a memory leak. The answer? Change to the following:
Excel.Workbooks myWorkbooks = myExcelApplication.Workbooks;
Excel.Workbook myWorkbook = myWorkbooks.Open(....);
// Do something with workbook
Marshal.ReleaseComObject(myWorkbook);
Marshal.ReleaseComObject(myWorkbooks);
myWorkbook = null;
myWorkbooks = null;
Now I assume (because I haven't researched for sure yet) that this memory leak is caused because the reference counter on the Workbooks com wrapper will still be increased even though there is not a local variable reference to that object.
Am I wrong here? My feeling is that most of this SHOULD be taken care of for you when using a managed language, but ISN'T. Because of this, one is almost no better off using a managed language with COM interop than using an unmanaged language. In all fairness, I know that it would most likely be quite tricky for the runtime to take care of this under-the-hood, and that if you are programming COM Interop you are most likely a Big Boy programmer and should be able to take care of this yourself. After all, even one of my biggest pet peeves is a lazy programmer. Perhaps I should just file it away in the back of my mind as another lesson learned in order to make me a better programmer. What are all of your thoughts on this situation?
|