Microsoft's Tlbimp Creates Leaky BSTR Signatures


This one confounded me when I first discovered it, and I’ve recently been reminded of it. For the sake of remembering the details, and hopefully helping someone else out, I’m going to document it here.

The problem is this. When you have a COM library that you need to use from a C# app, you import it as a reference. In the background the Microsoft .NET wizards do their magic by running Tlbimp.exe to generate a managed DLL with all of the objects and interfaces from the COM library. You proceed to use the code that Microsoft so conveniently converted for you, fully confident that all is well.

But it’s not. See, suppose your COM library has a method that returns a BSTR via an [out] parameter, or perhaps it defines an interface for a listener your managed code must implement. Suddenly there is the potential for a serious memory leak!

See, a BSTR in unmanaged code isn’t just any normal string. BSTRs are allocated by the system by calling SysAllocString and subsequently released by calling SysFreeString. This poses a problem for managed code if you aren’t careful. Take the following listener interface for example.

*Interface names and GUIDs changed to protect the innocent

interface IImplementMeListener : IDispatch {
    [id(0x000000C9)]
    HRESULT _stdcall notify([in] TEventType eventType, [in] BSTR data);
};

Tlbimp.exe generates a managed assembly with the following signature for this same method.

[ComImport, TypeLibType((short) 0x10c0), Guid("00000000-0000-0000-0000-000000000000")]
public interface IImplementMeListener
{
    [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime), DispId(0xc9)]
    void notify([In, ComAliasName("ExampleLib.TEventType")] TEventType eventType,
        [In, MarshalAs(UnmanagedType.BStr)] string data);
}

This means that when you implement this interface in your managed class, the second parameter is simply a String. An implementation might look like this.

class MyManagedListener : ExampleLib.IImplementMeListener{
    public void notify(ExampleLib.TEventType eventType, String data)
    {
        DoSomethingWithData(data);
    }
}

So this is what happens.

1) Your COM library allocates the string to pass into your listener using SysAllocString.

2) Your COM library passes the newly allocated string into your managed app by calling the notify method of your listener.

3) Your managed app does whatever it’s going to do with the string, then returns.

Normally in a fully managed app this would be no problem: once the string is no longer reachable, the garbage collector sweeps it up and the memory is reclaimed. However, in this case we have a problem. The interop marshaler copies the BSTR’s characters into a new managed String and leaves the original BSTR untouched, so the garbage collector only ever reclaims the copy. The COM library allocated the string and passed it into your managed app, and its responsibility for that string ends there. The expectation is that the client will release the BSTR by calling SysFreeString. Clearly we can’t explicitly do that to the managed String type.
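To make the copy-versus-original distinction concrete, here is a minimal sketch that needs no COM library at all. It assumes Marshal.StringToBSTR is a fair stand-in for the library’s SysAllocString call; the class and method names are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

static class BstrCopyDemo
{
    // Hypothetical helper: returns true when the managed copy still holds
    // the original text after the native BSTR has been freed.
    public static bool ManagedCopySurvivesFree(string text)
    {
        // Stand-in for the COM library allocating the BSTR.
        IntPtr bstr = Marshal.StringToBSTR(text);

        // Copies the characters into a new, garbage-collected String.
        string managed = Marshal.PtrToStringBSTR(bstr);

        // The native allocation is separate from the copy and is invisible
        // to the garbage collector, so it has to be freed explicitly.
        Marshal.FreeBSTR(bstr);

        // The managed copy is unaffected by freeing the native BSTR.
        return managed == text;
    }

    static void Main()
    {
        Console.WriteLine(ManagedCopySurvivesFree("hello from COM"));
    }
}
```

If the FreeBSTR call were left out, the managed copy would still be collected eventually, but nothing would ever release the native allocation.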

So what do we do? We rewrite the part of the assembly that Tlbimp.exe made for us, and adjust our listener implementation slightly.

This is how I did it, though there may be better ways.

1) Use a disassembler to view the code of the Tlbimp.exe generated assembly for your COM library. I used Lutz’s Reflector.

2) Copy the code for the entire library into a *.cs file, then change just the signature of the method you’re concerned with.

The new signature should look like this.

[ComImport, Guid("00000000-0000-0000-0000-000000000000"), TypeLibType((short) 0x10c0)]
public interface IImplementMeListener
{
    [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime), DispId(0xc9)]
    void notify([In, ComAliasName("ExampleLib.TEventType")] TEventType eventType,
        [In] IntPtr data);
}

And your implementing class changes to this.

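using System;
using System.Runtime.InteropServices;

class MyManagedListener : ExampleLib.IImplementMeListener{
    public void notify(ExampleLib.TEventType eventType, IntPtr data)
    {
        // Copy the BSTR's characters into a managed String.
        String dataStr = Marshal.PtrToStringBSTR(data);
        DoSomethingWithData(dataStr);
        // Release the native BSTR now that we are done with it.
        Marshal.FreeBSTR(data);
    }
}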

This is not particularly tricky wizardry. All we’re doing is marshaling the input value from the library as an IntPtr instead of a managed String. This allows us to explicitly release it using the System.Runtime.InteropServices.Marshal.FreeBSTR method, just like the library expects us to do.
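One thing the listener above does not handle is an exception thrown from DoSomethingWithData, which would skip the FreeBSTR call. Below is a rough sketch of a small helper that frees the BSTR in a finally block. BstrInput and Consume are hypothetical names of my own, not part of any library, and Marshal.StringToBSTR stands in for the COM library in the demo.

```csharp
using System;
using System.Runtime.InteropServices;

static class BstrInput
{
    // Hypothetical helper: copies the BSTR into a managed string, runs the
    // handler, and frees the BSTR even if the handler throws.
    public static void Consume(IntPtr bstr, Action<string> handler)
    {
        try
        {
            // PtrToStringBSTR rejects a null pointer, so pass null through.
            string text = bstr == IntPtr.Zero
                ? null
                : Marshal.PtrToStringBSTR(bstr);
            handler(text);
        }
        finally
        {
            if (bstr != IntPtr.Zero)
            {
                Marshal.FreeBSTR(bstr);
            }
        }
    }

    static void Main()
    {
        // Stand-in for the COM library handing us a freshly allocated BSTR.
        IntPtr fromLibrary = Marshal.StringToBSTR("event payload");
        Consume(fromLibrary, s => Console.WriteLine(s));
    }
}
```

With something like this in place, the body of notify could shrink to a single call such as BstrInput.Consume(data, DoSomethingWithData), and any other listener methods that take a BSTR could share the same cleanup.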

Hopefully, this will save you some hassle and help you avoid a potentially large memory leak.
