Author Interface Description Files
- Mark Juras
Overview
This page describes how and why Interface Description files are created during the preparation process.
Interface Description Files
When gmStudio translates your source code, it uses information about all the COM components referenced by your system. These COM components provide detailed information about the interfaces and types they contain and gmStudio uses this information to interpret your code and improve the quality of translations.
Rather than finding, loading, and inspecting external COM components every time it performs a translation, gmStudio uses a cache of pre-built interface description files (IDFs). The IDF cache is typically built at the beginning of your migration project from the COM components referenced by your codebase and registered on the migration machine. By default the IDF cache is located in the workspace\idf folder.
gmStudio creates IDFs for your migration project when you select [Tools/Author Interface Descriptions] from the menu. This operation is included in the default batch process for all new migrations.
Types of IDFs
COM dependencies can be classified as local, external, or custom:
- A Local component (aka, In-House Component, IHC) is based on a migration unit that is part of the migration project.
- Local components are defined by their source code.
- IDFs for local COM components are generated from the source code of those components and saved to the local IDF cache folder (idf\FromCode).
- An External component (aka, Third-Party Component, 3PC) is based on a reference to a COM type library file (TLB, DLL, OCX, etc.) that is external to the migration project.
- External components are binary files (pre-compiled from source code that is external to the Migration Project). In many cases, they are provided only as binary from a third party.
- IDFs for external COM components are generated from the COM typelib files and saved to the system IDF cache folder (idf\FromIDL).
- In order to create IDFs for an external component, the component must be properly registered and ready to load.
Many applications use a mix of .NET and VB6/ASP. Specifically, the VB6/ASP code references .NET classes in .NET assemblies. This is known as COM interop. To do this, the .NET assembly must be referenced through a second file called a COM Callable Wrapper (CCW). A CCW is a COM type library (a .tlb file) created using Visual Studio or the tlbexp tool that is part of the .NET SDK. If your application calls .NET through a CCW, you will have to generate an IDF from the CCW .tlb file.
When using COM Callable Wrapper assemblies, you may find that the classes have no members in the corresponding IDF file. This is a serious deficiency: it prevents gmBasic from recognizing object.member references and results in CallByName calls and partially migrated code. To make the CCW classes contain members, apply the AutoDual attribute to those classes and rebuild the assemblies. For example:
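    using System.Runtime.InteropServices;

    // AutoDual exposes the public members of the class in the exported type library.
    [ClassInterface(ClassInterfaceType.AutoDual)]
    public class Mammal
    {
        public void Eat() { }
        public void Breathe() { }
        public void Sleep() { }
    }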
See more here: https://docs.microsoft.com/en-us/dotnet/framework/interop/com-callable-wrapper
- A Custom IDF is an IDF that you create by hand, typically by modifying a generated IDF.
- Custom IDFs are typically placed in the user folder so they will take precedence over like-named IDFs generated by the tool.
- Custom IDFs may also be explicitly assigned to load instead of a generated IDF, for example, by using the registry-libname command.
Validating your IDFs
If you are working on a machine where the COM components referenced by your migration are properly registered, then creating and using IDFs should be a completely automatic process. However, if the automatic process fails, you can identify the missing IDFs and create them manually.
The Dependency Status field for each migration task indicates the overall status of the IDFs it requires:
- Dependency Status=READY means all the IDFs required for the migration unit are found in the IDF cache.
- Dependency Status=~IDF means at least one IDF required for the migration unit could not be found in the IDF cache.
The meaning of Dependency Status=READY is controlled by the RefStatFlags setting in the application config file.
- RefStatFlags = "IDF"; require IDF only.
- RefStatFlags = "ASM"; require Interop only.
- RefStatFlags = "IDF+ASM"; require both IDF and Interop.
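The three RefStatFlags values can be read as a small decision rule. The following Python sketch only illustrates that rule; it is not gmStudio code, and the function name `is_ready` and its boolean parameters are hypothetical names introduced here.

```python
def is_ready(ref_stat_flags: str, has_idf: bool, has_interop: bool) -> bool:
    """Illustrative only: decide whether a reference meets the RefStatFlags requirement.

    ref_stat_flags is one of "IDF", "ASM", or "IDF+ASM" as listed above.
    has_idf / has_interop are hypothetical inputs saying whether the IDF
    and the Interop assembly were found for the reference.
    """
    required = {flag.strip().upper() for flag in ref_stat_flags.split("+")}
    unknown = required - {"IDF", "ASM"}
    if unknown:
        raise ValueError(f"unsupported RefStatFlags value: {ref_stat_flags!r}")
    if "IDF" in required and not has_idf:
        return False
    if "ASM" in required and not has_interop:
        return False
    return True
```

Under this reading, a reference that has an IDF but no Interop assembly counts as ready with "IDF" and as not ready with "ASM" or "IDF+ASM".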
Quality of IDFs
Note that the quality of an IDF depends on whether the corresponding COM component and its COM dependencies are properly installed and registered on the machine where the IDFs are generated. If they are not properly registered, the IDF will be more weakly typed: it will use Variants and VB_USERDEFINED, and it will be missing coclass/subclass information. These weakly typed IDFs can hinder gmStudio's ability to recognize and interpret symbols defined using types from the COM API, and this will reduce the quality of the generated code.
In addition, some COM APIs are intentionally weakly typed. For example, collection properties or properties for other complex types might simply return type variant or type object. This reduces code quality and can result in extensive use of CallByName and dynamic casts. It also reduces the ability of the tool to apply upgrade transformations. When you see this, you can typically improve things by creating a custom COM description file that has variants and objects replaced with the appropriate stronger types. See this article for an example.
Using The References Panel
The Dependency Status (RefStat) field in the task list will tell you if you are missing an IDF needed by one of your migration tasks, but it cannot tell you which IDF is missing. To see that, look at the References panel.
The References panel provides detailed information about all the components referenced by each migration unit. One piece of this information is the Dependency Status (RefStat) field:
- Dependency Status=READY means the component's IDF file is in the IDF cache.
- Dependency Status=NOTFOUND means the component's IDF is not found in the IDF cache.
Click [Rebuild Interface Description] from the reference context menu to rebuild an IDF.
Click [Edit Interface Description] from the reference context menu to edit an IDF.
IDF Reports and Logs
The Source References Report can provide a high-level survey of all COM dependencies for your entire migration project. Essentially this report is the consolidated References Panel data for a selected set of migration units.
There are also two reports that tabulate the contents of the IDF cache:
- Interface File Headers Report
- Interface File ProgIDs Report
When gmStudio creates an IDF file, it saves a log file in the workspace\log folder. The name of the file is derived from the COM filename with the extension .tli.log.
Generating IDFs Manually
Typically, you will let the COM references in your migration units drive the creation of IDFs; however, if you want to create IDFs for an arbitrary set of COM components, you can do so by performing the following steps:
1. Click [Tools\Author Interface Description(s) from files...] from the menu. An open file dialog will be displayed.
2. Select COM binary file(s). You may also select *.idl files already generated by gmStudio, or a *.lst file containing a list of COM (or IDL) files.
3. Click Open. An IDF will be created for each selected COM component.
When generating IDF files, gmStudio will attempt to locate the COM file for each reference based on your registry and the file location hint in the reference statement in your VBP. If that fails, it will look for the file in the COM Search Folders specified for your project. This is a semicolon-delimited list of folder paths that are searched for the COM binary files.
This value is stored in the gmProj file's TlbSearchPath element.
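To check which search folder would supply a given COM binary, the fallback lookup can be approximated outside the tool. The sketch below assumes that the folders are tried in the order listed and that the first match wins; the source does not state this. The function name `find_in_search_path` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_in_search_path(file_name: str, search_path: str) -> Optional[Path]:
    """Return the first folder entry that contains file_name, or None.

    search_path is a semicolon-delimited list of folders, in the style of
    the TlbSearchPath value. Left-to-right, first-match-wins ordering is
    an assumption of this sketch.
    """
    for folder in search_path.split(";"):
        folder = folder.strip()
        if not folder:
            continue  # tolerate empty entries such as a trailing semicolon
        candidate = Path(folder) / file_name
        if candidate.is_file():
            return candidate
    return None
```

A result of None means none of the listed folders contains the file, which is a hint to extend the search folders or register the component.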
COM Selection Variations
If you wish to process a group of COM components in batch, create a text file that has the full path to a COM component on each line and save as typelibs.lst; then select the typelibs.lst file in step 2 above.
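A list file like this can be produced with a short script. The Python sketch below collects the type library files in one folder and writes their full paths, one per line. The extension set follows the file types named earlier (TLB, DLL, OCX). The function name and the UTF-8 encoding are assumptions of this sketch, not documented gmStudio requirements.

```python
from pathlib import Path
from typing import List

# File types mentioned earlier as COM type library files.
COM_EXTENSIONS = {".tlb", ".dll", ".ocx"}


def write_typelib_list(folder: str, list_path: str = "typelibs.lst") -> List[str]:
    """Write the full path of each COM binary in folder to list_path, one per line."""
    paths = sorted(
        str(p.resolve())
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in COM_EXTENSIONS
    )
    text = "\n".join(paths) + ("\n" if paths else "")
    Path(list_path).write_text(text, encoding="utf-8")  # encoding is an assumption
    return paths
```

Not every DLL in a folder is a COM component, so review the generated list before selecting it in the open file dialog.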
If you have a group of IDL files created by gmStudio, you can select them instead of a COM file in step 2 above.
Dealing with COM module references
There are a handful of COM types that are referenced using a little-known sub-module notation (file.dll\#) where # is the number of a module within the type library. For example, if your VB6 program uses the Regular Expression classes in VBScript.dll you will find a reference in your VBP like the following:
Reference=*\G{...}#5.5#0#..\..\..\WINDOWS\system32\vbscript.dll\3#Microsoft VBScript Regular Expressions 5.5
gmStudio handles this by creating IDFs of the form file_#.dll.xml, in this case vbscript_3.dll.xml.
If you need to process one of these submodules yourself, type the (file.dll\#) path into the open file dialog in step 2.
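The naming rule can be illustrated with a small helper that turns a sub-module reference path into the IDF file name you would expect to find in the cache. This is a sketch inferred from the single vbscript example above, not gmStudio's actual implementation; the function name is hypothetical.

```python
import ntpath
import re


def submodule_idf_name(com_path: str) -> str:
    """Map a 'file.dll\\#' sub-module reference to an IDF name like 'file_#.dll.xml'."""
    match = re.fullmatch(r"(?P<file>.+?)\\(?P<module>\d+)", com_path)
    if match is None:
        raise ValueError(f"not a sub-module reference: {com_path!r}")
    # ntpath handles backslash-separated Windows paths on any platform.
    base = ntpath.basename(match.group("file"))
    stem, ext = ntpath.splitext(base)
    return f"{stem}_{match.group('module')}{ext}.xml"


# Path taken from the VBP reference shown above.
print(submodule_idf_name(r"..\..\..\WINDOWS\system32\vbscript.dll\3"))
# prints: vbscript_3.dll.xml
```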
Dealing with Non-standard ProgIds
COM classes can be identified by a ProgId value of the form Library.Class. Normally, the value of Library and Class correspond directly to symbols stored in the COM binary. However, sometimes non-standard identifiers are used for "branding" purposes. These branded ProgIDs do not correspond directly to the ProgIDs found in the type libraries. In order to deal with this, the gmStudio configuration has a file called registry.xml that provides a cross reference for non-standard ProgIds and actual ProgIds.
Typically you will add registry commands like this to your translation script.
<!--
  These registry commands help the translator resolve non-standard ProgIds
  used by various third party components.
  source = the ProgId that appears in code
  target = the LibName.ClassName that appears in the IDL/IDF
-->
<registry type="ProgId" source="APToolkit.Object" target="APToolkitLib.APToolKit" />
<registry type="ProgId" source="AddressObject.AddressCheck" target="AddressObjectLib.AddressCheck" />
<registry type="progid" source="AddressObject.Parse" target="AddressObjectLib.Parse" />
<registry type="progid" source="AddressObject.ZipCodeData" target="AddressObjectLib.ZipCodeData" />
<registry type="progid" source="DERuntime.DERuntime" target="DERuntimeObjects.DataEnvironment" />
<registry type="progid" source="IXSSO.Query" target="Cisso.CissoQuery" />
<registry type="progid" source="MSWC.BrowserType" target="BrowserType.BrowserCap" />
<registry type="progid" source="MSWC.Counters" target="Counters.Counters" />
<registry type="progid" source="MSXML.DOMDocument" target="MSXML2.DOMDocument" />
<registry type="progid" source="Microsoft.XMLDOM" target="MSXML2.DOMDocument" />
<registry type="progid" source="Microsoft.XMLHTTP" target="MSXML2.XMLHTTP" />
<registry type="progid" source="Msxml2.DOMDocument.4.0" target="MSXML2.DomDocument" />
<registry type="progid" source="Msxml2.ServerXMLHTTP.4.0" target="MSXML2.ServerXMLHTTP" />
<registry type="progid" source="Persits.MailSender" target="ASPEMAILLib.MailSender" />
<registry type="progid" source="SAWZip.Archive.1" target="SAWZIPLib.Archive" />
<registry type="progid" source="SoftArtisans.ExcelWriter" target="SAEXCELLib.SAExcelApplication" />
<registry type="progid" source="SoftArtisans.FileUp" target="UPLDICTIONARYLib.ISAFile" />
<registry type="progid" source="WScript.Network" target="IWshRuntimeLibrary.WshNetwork" />
<registry type="progid" source="WScript.Shell" target="IWshRuntimeLibrary.WshShell" />
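To see what such a cross reference does, the following Python sketch loads registry commands of the form shown above and resolves a ProgId found in code to the LibName.ClassName used in the IDF. It is an illustration only, not how gmBasic is implemented. The function names are hypothetical, and the case-insensitive matching of the type attribute and of the ProgId is an assumption, suggested by the mixed "ProgId"/"progid" spelling in the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def load_progid_map(xml_fragment: str) -> Dict[str, str]:
    """Build a source-ProgId -> target-ProgId map from <registry> elements."""
    # Wrap the fragment so several sibling elements parse as one document.
    root = ET.fromstring(f"<root>{xml_fragment}</root>")
    mapping: Dict[str, str] = {}
    for node in root.iter("registry"):
        if node.get("type", "").lower() == "progid":
            mapping[node.get("source", "").lower()] = node.get("target", "")
    return mapping


def resolve_progid(progid: str, mapping: Dict[str, str]) -> str:
    """Return the mapped ProgId, or the original one when no mapping exists."""
    return mapping.get(progid.lower(), progid)
```

For example, with the commands above loaded, `resolve_progid("Microsoft.XMLDOM", mapping)` yields `MSXML2.DOMDocument`, while a ProgId with no entry is returned unchanged.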
Multiple DLL Versions
Large VB6 and ASP applications sometimes reference multiple versions of the same COM API. For example, there are several versions of the MSXML2 API in circulation – msxml2.dll, msxml3.dll, msxml4.dll, and msxml6.dll.
The registry-libname command in a Translation Configuration File allows multiple versions of a given DLL file to target a single standard as part of the migration process. For example, the two commands shown below tell gmBasic that references to msxml4.dll and msxml6.dll should be processed as if they were references to GM.msxml6.dll. This means that the custom msxml6 IDF will be the only one used in the upgrade process.
The following registry-libname directives can be used to standardize multiple COM components on a single IDF:
<Registry type="libname" source="msxml4.dll" target="GM.msxml6.dll" />
<Registry type="libname" source="msxml6.dll" target="GM.msxml6.dll" />
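The effect of these directives on a project's reference list can be pictured with a short Python sketch. It is illustrative only: the function name is hypothetical, case-insensitive file name matching is an assumption, and scrrun.dll in the usage note is just a sample unmapped reference.

```python
from typing import Dict, List


def standardize_libnames(references: List[str], libname_map: Dict[str, str]) -> List[str]:
    """Replace each referenced library with its libname target and drop duplicates."""
    lowered = {source.lower(): target for source, target in libname_map.items()}
    result: List[str] = []
    for reference in references:
        target = lowered.get(reference.lower(), reference)
        if target not in result:  # several versions collapse to one entry
            result.append(target)
    return result
```

With the two directives above expressed as `{"msxml4.dll": "GM.msxml6.dll", "msxml6.dll": "GM.msxml6.dll"}`, the references `["msxml4.dll", "msxml6.dll", "scrrun.dll"]` reduce to `["GM.msxml6.dll", "scrrun.dll"]`, so only one MSXML IDF is needed.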
Retaining type short for COM interop
COM IDFs sometimes declare API elements as having type short. However, most .NET APIs favor type int (i.e., Int32). You will generally use an integer in a .NET API where you used a short in a COM API. Consequently, the default IDF files prepared by gmBasic will use Integer instead of Short.
However, if you are planning to use interop as the replacement for a COM dependency, you will need to retain type Short in your IDF. To retain a short in an IDF, pass TypeInteger=short on the command line when you generate the IDFs for those particular COM files. The IDF generation process is mostly automated, but you can find the command line string for generating an IDF in the gmStudio application settings file, gmStudio.cfg. It is the variable IdlToXmlCmd:
Default (short in the IDL to an Integer in the IDF)
IdlToXmlCmd = "cmd.exe /C pushd '%IdfFromIdlFolder%' && '%TranToolExe%' '%SrcIdlPath%' >> '%LogPath%' ...
Modified (short in the IDL to a Short in the IDF)
IdlToXmlCmd = "cmd.exe /C pushd '%IdfFromIdlFolder%' && '%TranToolExe%' '%SrcIdlPath%' TypeInteger=short >> '%LogPath%' ...
Hand-Coded IDF
Sometimes you may not have the binary files for a COM component and cannot generate its IDF. In this situation, you may generate a starter IDF and add the API elements you need by hand. You can create a starter document for a "NOT FOUND" IDF by selecting the Edit IDF item from the References context menu. This is normally just a short-term fix that allows you to move forward until you obtain a copy of the COM binary and use it to generate a full and accurate IDF.