Custom VB6 Language Replacement
- Mark Juras
Overview
Our research shows that, with very few exceptions, any algorithm or operation that can be expressed in VB6 can also be expressed in C# or VB.NET. In fact, in most situations there are several reasonable ways to express an algorithm or operation and, as we say, "if you get 10 developers in a room, you will get (at least) 10 different opinions about a 'better' way to code something". gmStudio is designed with this need for variation in mind: it allows migration teams to configure the translation rules according to their unique standards. We call this "configuring gmStudio with custom (a.k.a. project-specific) language replacement rules".
Implementing custom language replacement rules is an iterative four-step process:
- Understand how legacy language elements are used by your application
- Define your .NET language coding standards and conventions
- Design and implement the rules to migrate from legacy language conventions to .NET language standards
- Apply, refine and extend the rules for additional source codes as needed
These four steps are described in more detail below.
1) Understand how legacy language elements are used by your application
The VB6 language and the intrinsic VB6 object model provide many types of services to VB6 applications. A few examples are listed here:
- Intrinsic keywords, functions, and commands
- Object Model / Data Model Definition
- File IO, Database Access, Printing
- User Interface Elements: Forms/Controls and Graphics
- Error Handling
- Interfacing with Operating System Services
The first step in customizing how gmStudio upgrades VB6 to .NET is to identify the specific VB6 language elements that provide services to your application and to understand, at a high level, how the services work.
2) Define your Language replacement strategy
If your organization is already using .NET, you may already have coding standards for new .NET code and you should plan to follow those standards in upgraded legacy code as well.
It is important that the coding standards are objective enough that upon review, you can know with certainty if the standards are being followed in your code, or not. Manual review is critical, but when possible, automated review tools should also be used to help measure and report compliance with standards.
The second step in customizing language replacement is specifying new coding conventions and standards for .NET in a detailed and objective form.
3) Design and implement the rules to rewrite VB6 language in .NET
Designing the rules for using .NET correctly is a manual task regardless of whether you plan to use tools to help you rewrite your code or not. The rules must be considered and implemented at a detailed level based on the specific language elements and platform services used by your application. In our methodology, the rules that direct language replacement are specified in four types of files: Metalanguage Files, gmSL Scripts, Migration Utilities, and gmPL Extensions.
- Metalanguage Files are XML documents that describe how the VB6 language elements map to C# or VB.NET. The standard metalanguage files used by gmStudio are designed and maintained by Great Migrations and shipped with the product in the translator configuration files folder, [installdir]\support\metalang.
- gmSL Scripts are files containing logic implemented in gmSL script code. The logic allows a lower-level inspection of the original source code implementation and a more dynamic transformation of how VB6 language elements map to C# or VB.NET. See the RDO to SQLClient Sample for an example of gmSL. The Windows Common Controls to WinForms sample migration uses gmSL scripts to implement some of the more complex upgrade rules.
- Migration Utilities are .NET console executables that implement complex transformations in C# code using gmAPI.
- gmPL Extensions are .NET DLLs that implement custom processing commands and integrate custom processing into the translation script processing pipeline using gmAPI.
4) Apply, refine and extend the rules to additional source codes as appropriate
The default metalanguage files shipped with gmStudio cover almost every aspect of the popular VB6 dialect, and you only need to customize those elements that you want to translate differently. If you are migrating a large codebase, you can expect to incrementally modify and refine your copy of the metalanguage files as you work through the source code units (VBP/ASP files). The number of modifications should diminish to zero once you have processed a representative sampling of your code.
Balancing manual and tool-assisted techniques
Of course there are limitations to what should be automated in a software re-engineering effort. For example, it may not make sense to automate a particularly complex set of re-engineering operations that impact only one application function. In these cases, the migration team should consider the following balanced approach:
- Leveraging gmStudio’s analytic capabilities, find out where and how the language elements are used in order to scope the redesign work and to plan customization tasks.
- If appropriate, use gmStudio to implement a partial migration by stubbing out, commenting out, or flagging specific code.
- Finish and refine the transformation work manually
- If desired, configure gmStudio to integrate hand-written code into your migration script solution to make the manual re-engineering a repeatable operation.
Appendix A: Example: Upgrading an Intrinsic Function
In this example, we look at how to customize the replacement of VB6’s Len function.
1) Understand how legacy language elements are used by your application
In VB6, the Len function offers a convenient way to compute the length of a string. Documentation for Len from the VB6 object browser says the following:
    Function Len(Expression)
        Member of VBA.Strings
        Returns string length or bytes required to store a variable
gmStudio’s default translation of Len(x) is VBNET.Strings.Len(x). In default gmStudio translation conventions, VBNET is an alias for the Microsoft.VisualBasic namespace. This namespace is distributed in Microsoft.VisualBasic.dll and contains an extensive set of classes that emulate VB6 runtime and VBA.
Decompiling Microsoft.VisualBasic.dll shows that VBNET.Strings.Len is:
    public static int Len(string Expression)
    {
        if (Expression == null)
        {
            return 0;
        }
        return Expression.Length;
    }
You can use gmStudio’s reporting tools, the Source Scan Report or the Analytics References Report, to see how your application uses the Len function. The Source Scan Report can be run from the Search panel or the Reports menu. Simply searching for "Len" will show where you use Len, but may also show some false matches. A more precise Source Scan Report can be produced with a regular expression; for example, "@\bLen\(" (the leading @ indicates case-sensitivity). The most precise report of how and where you use Len is obtained by running the Analytics References Report. See the records in that report having MemLibr=Basic, MemClas=Vb6Function, and Memname=Len.
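To illustrate what the more precise regular expression buys you, here is a minimal Python sketch that applies the same `\bLen\(` pattern to a few lines of VB6. It is not gmStudio code: the function name `find_len_calls` and the sample lines are made up for this illustration, and gmStudio's leading "@" is dropped because Python patterns are case-sensitive by default.

```python
import re

# Same pattern as the Source Scan example, without gmStudio's leading "@";
# Python regular expressions are case-sensitive by default.
LEN_CALL = re.compile(r"\bLen\(")


def find_len_calls(lines):
    """Return (line_number, text) pairs for lines that call Len(...)."""
    return [(n, text) for n, text in enumerate(lines, start=1)
            if LEN_CALL.search(text)]


# Hypothetical VB6 lines used only to exercise the pattern.
sample = [
    "If Len(custName) = 0 Then Exit Sub",
    "totalLen = totalLen + 1",
    "n = StrLen(buffer)",
    "recSize = Len(rec)",
]

for n, text in find_len_calls(sample):
    print(n, text)
```

A plain search for "Len" would also report lines 2 and 3 of the sample (totalLen and StrLen), which are the kind of false matches mentioned above; the word boundary and the opening parenthesis filter them out.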
2) Define your .NET language coding standards and conventions
Let’s assume that your coding standards do not allow use of the Microsoft.VisualBasic.Strings class and you want to use property notation, x.Length, instead of method notation, VBNET.Strings.Len(x).
3) Design and implement the rules to migrate from legacy language conventions to .NET language standards
The next step in the process is to find in the gmStudio metalanguage files the rule for expressing the Len operation and change it. The easiest way to find a rule is to look for the .NET code pattern that you want to change in the metalanguage files. You can do this with the Search Panel in gmStudio. For this example, enter VBNET.Strings.Len in the search box on the left, check the [Lang] checkbox, and click [Run Search]. The results of this search are shown below:
The results show that the rule for Len is in the VBASIC.xml file, which is located in the gmStudio installation folder, [installdir]\support\metalang. The actual text of the rule for Len in VBASIC.xml looks like this:
    <MetaLanguage>
    <patterns>
    . . .
    <pattern id="VBF">
       <subcode id="Len">
          <vbn narg="1" code="Strings.Len(%1d)"/>
          <jvs narg="1" code="len(%1d)"/>
          <csh narg="1" code="VBNET.Strings.Len(%1d)"/>
       </subcode>
In order to make gmStudio author x.Length instead of VBNET.Strings.Len(x) we will edit the VBASIC.xml text file so that
this instruction
<csh narg="1" code="VBNET.Strings.Len(%1d)"/>
becomes
<csh narg="1" code="%1d.Length"/>
Here, %1d is a placeholder for the expression that was passed into Len in the original VB6 code. Internally, it corresponds to the first expression on the operation stack built by gmStudio when it processes and stores the VB6 code.
An additional modification that can be useful when changing a method to a property is to add the status="postfix" attribute. In this example, status="postfix" directs gmStudio to put parentheses around the %1d token if needed.
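The effect of the %1d substitution and of the postfix parentheses can be sketched in a few lines of Python. This is only an emulation for illustration: the function `expand` and the test it uses to decide when parentheses are "needed" are assumptions, deliberately conservative, and not a description of how gmStudio decides.

```python
import re

# Assumption: plain identifiers and dotted member accesses can take
# ".Length" directly; anything else gets wrapped in parentheses.
SIMPLE_OPERAND = re.compile(r"[A-Za-z_][\w.]*\Z")


def expand(code, arg, postfix=False):
    """Substitute arg for the %1d token in a metalanguage code pattern."""
    if postfix and not SIMPLE_OPERAND.match(arg):
        arg = "(" + arg + ")"
    return code.replace("%1d", arg)


print(expand("VBNET.Strings.Len(%1d)", "a + b"))       # method form needs no help
print(expand("%1d.Length", "custName", postfix=True))  # simple operand, no parentheses
print(expand("%1d.Length", "a + b", postfix=True))     # compound operand is wrapped
print(expand("%1d.Length", "a + b"))                   # without postfix: wrong precedence
```

The last call shows why the attribute matters: without the parentheses, .Length would bind to b alone rather than to the whole expression.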
See Appendix X for instructions on how to modify metalanguage files.
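If you prefer to script the change rather than edit the file by hand, the edit to the csh rule might look like the following Python sketch. It works on a trimmed, in-memory stand-in built from the excerpt shown earlier, not on a real VBASIC.xml; the helper name `set_csh_rule` is hypothetical. Note that rewriting a real file with ElementTree may drop comments and change formatting, so treat this as an illustration of the rule change rather than a recommended tool.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for a workspace copy of VBASIC.xml (assumption: the real
# file is much larger; only the excerpted Len rule is reproduced here).
FRAGMENT = """
<MetaLanguage>
  <patterns>
    <pattern id="VBF">
      <subcode id="Len">
        <vbn narg="1" code="Strings.Len(%1d)"/>
        <jvs narg="1" code="len(%1d)"/>
        <csh narg="1" code="VBNET.Strings.Len(%1d)"/>
      </subcode>
    </pattern>
  </patterns>
</MetaLanguage>
"""


def set_csh_rule(root, pattern_id, subcode_id, code, status=None):
    """Rewrite the csh rule for one subcode and return the edited element."""
    path = f".//pattern[@id='{pattern_id}']/subcode[@id='{subcode_id}']/csh"
    rule = root.find(path)
    if rule is None:
        raise LookupError(f"no csh rule for {pattern_id}.{subcode_id}")
    rule.set("code", code)
    if status is not None:
        rule.set("status", status)
    return rule


root = ET.fromstring(FRAGMENT)
rule = set_csh_rule(root, "VBF", "Len", "%1d.Length", status="postfix")
print(ET.tostring(rule, encoding="unicode").strip())
```

Only the csh element is touched; the vbn and jvs rules for Len keep their original code attributes.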
An alternative to modifying the standard language dialects (i.e., vbn, csh) directly is to add your own custom coding standard as an upgrade SubSystem. First a three-character subsystem name must be added to the Dialects element in the enumerations.xml metalanguage file.
Next, the subsystem name is used to implement your custom upgrade rules for the specific operations you want to customize.
For example, the loc subsystem implements an object-oriented C# coding standard. If you search the language files for "<loc" you will see that it contains an extensive set of customizations. Here is the one for Len:
    <subcode id="Len">
       <loc status="postfix" narg="1" code="%1d.Len()"/>
       <vbn narg="1" code="Strings.Len(%1d)"/>
       <jvs narg="1" code="len(%1d)"/>
       <csh narg="1" code="VBNET.Strings.Len(%1d)"/>
    </subcode>
You can load up to 5 subsystems in your migration script to create a hierarchy of custom rules. See the VB.NET versions of the standard samples to see how multiple subsystems can be applied.
The core language dialects (csh, vbn, and all) are always loaded and applied for patterns that are not overridden by a user subsystem.
More sophisticated, dynamic subsystems can be implemented by adding gmSL scripts to the system configuration. See, for example, oocSubsystem.gmsl and wpfSubSystem.gmsl in the system metalanguage folder.
4) Apply and refine those rules for additional source codes as needed
Typically, a large, mature codebase will not meet the assumptions of every migration rule 100% of the time, and you should plan to consider alternatives and to refine or extend the rule to deal with variations. This incremental refinement is an important aspect of our iterative methodology.
In this example, the new rule assumes that the type of the original argument to Len has a Length property in .NET and that the argument is not null at runtime. The first assumption is easy to check using the C# compiler. The second assumption can be checked by static analysis of the code and by runtime testing. We know that the rule will not work where the argument is a struct; in fact, the resulting code will not even build. A more appropriate translation of Len(struct) that still avoids using VBNET is System.Runtime.InteropServices.Marshal.SizeOf().
In this type of situation, a Migration DLL or gmSL function can be used to implement a rule that specifies that x.Length is used for strings and Marshal.SizeOf(x) is used for structs. However, when you encounter exceptions to your coding/upgrade standards, take time to see what the code is actually doing. For example, code that uses Len with a struct will typically also contain Win32 API calls or record-based file IO. Both of these warrant additional redesign as they move to .NET, and it may make more sense to rework that section of your code in a different way. Also beware that the VB6 language frequently provides high-level services that can only be reproduced by runtime routines that integrate several .NET operations. The check for null in VBNET.Strings.Len is a good example of this.
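The decision such a type-aware rule has to make can be sketched as follows. This is plain Python that only returns the C# text each case would produce; it is not gmSL and does not use any gmStudio API. The function name `author_len`, the `kind` labels, and the policy of keeping the null-safe default when the argument might be null are all assumptions made for illustration.

```python
def author_len(arg, kind, maybe_null=False):
    """Return the C# text a per-type Len rule might author (illustrative only)."""
    if kind == "struct":
        # Len on a record asks for its storage size, not a character count.
        return f"Marshal.SizeOf({arg})"
    if kind == "string":
        if maybe_null:
            # x.Length would throw on null, so keep the null-safe default here
            # and leave the occurrence for manual review.
            return f"VBNET.Strings.Len({arg})"
        return f"{arg}.Length"
    raise ValueError(f"no Len rule for argument kind {kind!r}")


print(author_len("custName", "string"))
print(author_len("inputText", "string", maybe_null=True))
print(author_len("rec", "struct"))
```

Raising an error for unknown kinds is a deliberate choice in this sketch: an argument that fits neither case is exactly the sort of exception that deserves a look at what the original code is doing.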
Balancing manual and automated work is a central tenet of the tool-assisted rewrite methodology.
You should apply rules in a manner that fits the needs of your application and use a variety of techniques including using gmStudio to systematically integrate hand-written code with the migration solution.
Appendix X: Activating a Custom Metalanguage Configuration
This appendix describes how to modify the metalanguage configuration so gmStudio will produce translations that use your custom language replacement rules.
Metalanguage Files
Metalanguage files are XML documents that direct gmStudio as it rewrites your VB6 program for .NET.
- The standard metalanguage files used by gmStudio are designed and maintained by Great Migrations and are shipped with the product in the translation configuration folder, [installdir]\support\metalang.
- Do not change the default metalanguage files in the install folder.
- Always make a copy of the metalanguage files in your migration project workspace before modifying them. This ensures that your modified files will not be over-written when you install a gmStudio product update. The specific steps used to activate a project-specific set of metalanguage files are described in this Appendix.
There are two types of Metalanguage files: default Interface Description Files (IDFs) and default Language Files.
- The Default IDFs are processed every time you run the translation process.
The default IDFs are deployed in [installdir]\support\metalang
To list and inspect the default IDFs:
- Open the Settings dialog by clicking the [Settings] button on the toolbar.
- Click the [Configuration Files] tab
- Select the [Metalang Files] radio button
- The Default Language Files must be pre-processed by gmStudio to create the metalanguage information file (VB7Lang.vbi)
The language script (VB7Lang.xml) directs gmStudio in processing default language files to create the metalanguage information file (VB7Lang.vbi)
The translator StartUp file (gmBasic.xml) specifies the location of the metalanguage information file.
The default language files are deployed in [installdir]\support\metalang
To list and inspect the default Language Files:
- Open the Settings dialog by clicking the [Settings] button on the toolbar.
- Click the [Configuration Files] tab
- Select the [Metalang Files] button
Customizing default interface description files
The [Configuration Files] tab on the Settings dialog is designed to help you inspect the metalanguage files and make a working copy in your project workspace. The following instructions explain how this is done for a default IDF.
- Open the Settings dialog by clicking the [Settings] button on the toolbar.
- Click the [Configuration Files] tab
- Copy the desired default IDF to your workspace.
- Click the [Interface Descriptions] button to list the default IDFs
- Click the desired default IDF, for example MigrationSupport.xml
- Click [Save As…] and save a copy of the file in the [workspace]\usr folder.
Once a default IDF file exists in your workspace\usr folder it will take precedence over the default copy. This behavior is governed by gmStudio’s configuration folder search rules: Target before Local before System before Language.
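The precedence rule amounts to "first match wins" over an ordered list of folders. The Python sketch below shows that idea only; the function name `resolve_config` and the two demo folders are placeholders, and the actual mapping of Target, Local, System, and Language to folders on disk is not modeled here.

```python
import tempfile
from pathlib import Path


def resolve_config(filename, search_folders):
    """Return the first existing copy of filename, searching folders in order."""
    for folder in search_folders:
        candidate = Path(folder) / filename
        if candidate.is_file():
            return candidate
    return None


# Demo with throwaway folders standing in for [workspace]\usr and the default
# metalang folder (assumption: usr is searched before the default folder).
with tempfile.TemporaryDirectory() as tmp:
    usr = Path(tmp) / "workspace" / "usr"
    default = Path(tmp) / "support" / "metalang"
    for folder in (usr, default):
        folder.mkdir(parents=True)

    (default / "MigrationSupport.xml").write_text("<default/>")
    print(resolve_config("MigrationSupport.xml", [usr, default]).parent.name)

    (usr / "MigrationSupport.xml").write_text("<custom/>")
    print(resolve_config("MigrationSupport.xml", [usr, default]).parent.name)
```

Before the workspace copy exists the default file is found; once a copy is saved in the usr folder, the same lookup returns that copy instead.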
Customizing default language files
The [Configuration Files] tab on the Settings dialog is designed to help you inspect and manage all the files that play a role in configuring gmStudio. The following instructions explain how to set up gmStudio to do custom language replacement.
Part 1: activating project-specific metalanguage information
- Open the Settings dialog by clicking the [Settings] button on the toolbar
- Click the [Configuration Files] tab
- Click the [Project] option in the [Translator Configuration] group box.
Clicking the [Project] option copies two files into your [workspace]\usr folder:
1) StartUp File (gmBasic.xml)
The StartUp file controls the global defaults for the translator including the location of the metalanguage information file.
The Project option uses a version of the gmBasic startup file that has the metalanguage attribute set so that the translator will use the language information file in your workspace.
<Startup metalanguage="..\usr\VB7Lang.vbi">
2) Language Information Script (VB7Lang.xml)
The Language Information Script indicates which default language files should be processed to create the language information file. The Project option uses a version of the script that can be edited to specify that your custom language files should be used instead of the default files.
Once the modified gmBasic.xml and VB7Lang.xml files are in your workspace, clicking the [Update Translator Configuration] button will create a new metalanguage information file (VB7Lang.vbi) in your workspace. This customized file takes precedence over the default copy that ships with gmStudio and will be used by the translator instead.
Part 2: customizing default language files
- Identify the metalanguage file that you wish to update and make a copy of the file in your workspace, if you have not already done so. For example, assume you want to modify the VBASIC.xml file.
- Click [Language Files] to list the default language files
- Select VBASIC.xml in the file list
- Right-Click and select [Activate Project Specific Copy] to save a copy of the VBASIC.xml in the [workspace]\usr folder
Activating a project specific copy will also modify the VB7Lang.xml file in your workspace so that it will reference the VBASIC.xml in your workspace instead of the one in the default metalanguage folder.
- Modify the VBASIC.xml file in your workspace user folder as desired. For example:
- Select User files on the Settings dialog, then select your copy of the VBASIC.xml file. Then click the [Editor] button to open the file for editing and make changes.
- Rebuild the metalanguage information file by clicking the [Update Translator Configuration] button.
This processes the VB7Lang.xml script and displays the Translator Configuration Build Log in the text box.
If the process is successful, the log will show the directory listing for the new VB7Lang.vbi file in your workspace.
Part 3: comparing your custom files to default
You can compare your customized metalang files to the corresponding default version by selecting the file, right-clicking, and choosing [Compare to Default].
This feature can be helpful for merging default file changes after you have installed a gmStudio upgrade (see Appendix Z).
Appendix Y: Migration DLLs
Migration DLLs are not supported beyond V31.25 (Feb 2021). The advanced customization features formerly offered through Migration DLLs are now offered through Migration Utilities built using the .NET based gmAPI. The content of Appendix Y below is provided for historical reference only.
Most features of migration DLLs can now be accomplished more easily with gmSL scripts. The added benefit is that gmSL scripts are interpreted by the gmBasic translation engine each time they are used, whereas DLLs are compiled at a point in time and may get out of sync with the tool. More on gmSL is presented in gmSLIntroduction.
Migration DLLs extend and alter the behavior of the gmStudio translator. Migration DLLs can manipulate the information about the system at the lowest level: symbol tables and operation streams. Migration DLLs can also be used to extend the gmBasic scripting language, for example, to develop specialized analysis, reporting, and code generation tools. Migration DLLs allow migration teams to make the translator do things that cannot be easily specified using the declarative refactoring statements or the gmStudio scripting language.
Migration DLLs contain "handlers". These are subroutines invoked by the translator when various "migration events" occur during processing. There is a large set of predefined migration events as well as a facility for attaching migration events to the specific types and members in COM libraries and to specific application types and variables. There is also an extensive gmStudio API that facilitates interacting with the translator and the system model in migration event handlers.
Migration DLLs can be coded with Visual Studio using C, Managed C++, C#, and VB.NET. The system programming techniques and meta-programming concepts needed to develop migration DLLs are somewhat advanced. We typically develop the Migration DLLs for our customers; but we also offer an SDK and Training Package for teams who want to develop Migration DLLs in house.
Appendix Z: Upgrading System Defaults
From time to time, a new release of gmStudio will include changes to the default metalanguage scripts. These changes will be "baked" into the system default metalanguage file (vb7lang.vbi) and will be used by default in new translation workspaces. However, if you have a workspace that is using a project-specific metalanguage configuration, you MUST sync up your custom metalanguage scripts to be compatible with the new gmStudio release. Failing to do this will frequently cause the translator to terminate abnormally or produce malformed results.
Syncing up your custom metalanguage scripts with the new system defaults is the same as any other code merging operation:
For each of the new system default scripts, check to see if you are using a custom script in your project workspace. If you are not using a custom script, move on to the next script. But if you have a custom copy of a script, merge the new system default into your custom copy. The easiest way to do this is with a file comparison/merge tool such as Beyond Compare. Once you have synced all of your custom metalanguage scripts with the new system default scripts, you should rebuild your project-specific metalang file using the Settings form.
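The first half of that procedure, finding out which default scripts have a custom counterpart to merge, is easy to script. The following Python sketch is one way to do it under stated assumptions: the function name `scripts_needing_merge` is made up, file names are matched exactly (including case), and the demo uses throwaway folders rather than a real installation or workspace. The merge itself is still done in a comparison tool.

```python
import filecmp
import tempfile
from pathlib import Path


def scripts_needing_merge(default_dir, custom_dir):
    """List default scripts that have a custom copy whose content differs."""
    needs_merge = []
    for default_file in sorted(Path(default_dir).glob("*.xml")):
        custom_file = Path(custom_dir) / default_file.name
        if custom_file.is_file() and not filecmp.cmp(
                default_file, custom_file, shallow=False):
            needs_merge.append(default_file.name)
    return needs_merge


# Demo: one customized script, one script with no custom copy.
with tempfile.TemporaryDirectory() as tmp:
    default_dir = Path(tmp) / "metalang"
    custom_dir = Path(tmp) / "usr"
    default_dir.mkdir()
    custom_dir.mkdir()
    (default_dir / "VBASIC.xml").write_text("<new-default/>")
    (default_dir / "enumerations.xml").write_text("<enums/>")
    (custom_dir / "VBASIC.xml").write_text("<customized/>")
    print(scripts_needing_merge(default_dir, custom_dir))
```

The function only reads both folders, which fits the advice to leave the system default files untouched while merging.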
Be careful not to change the system default files when you are merging them into your project-specific files.
Your solution configuration should always be kept under version control, but you may also want to make a local backup of your project-specific files before starting the merge so you can use it to restart if you make a mistake and to double-check your work.
Example: Changing translation of Val for VB.NET
The upgrade team has the following sample:
    Dim i1 As Integer
    Dim i2 As Variant
    i1 = 9
    i2 = ""
    i1 = Val(i2)
The default VBN translation is this:
    Dim i1 As Integer = 0
    Dim i2 As String = ""
    i1 = 9
    i2 = ""
    i1 = CInt(i2)   ' <--- Val(i2) wanted
The objective here is to migrate Val(x) to Val(x) rather than CInt(x). Normally, changing how a VB6 Intrinsic function is migrated is simply a matter of changing the mapping of that function in the metalang file. It turns out doing this for Val may be a slightly special case because of how VB6 does implicit numeric conversions.
First we need to see that VB6 Intrinsic Function Val has the following declaration:
    Function Val(String As String) As Double
        Member of VBA.Conversion
        Returns the numbers contained in a string
Notice that the VB6 Val function *returns a Double*, not an Integer. So, in this sample, an implicit conversion from Double to Integer is needed to make the assignment to i1.
The tool is aware of the need for the conversion. You can see this by comparing the Compiler audit report and the Analyser audit report:
    Compiler audit
    52 |      |      |         | NEW | 47 i1 = Val(i2)
    55 |      |      |         | LEV | Nest0
    57 |      |      |         | BIF | Component:Val:221210
    62 |      |      |         | LEV | Nest1
    64 | 1.64 | 1.64 | String  | LDA | Variable:i2:66390
    69 | 1.64 |      |         | ARG | Void
    71 | 1.64 | 1.71 | Double  | VBF | Val   <--- before analyser, we have VBF.Val
    73 | 1.64 |      |         | ARG | Integer
    75 | 2.75 | 1.75 | Integer | LDA | Variable:i1:66306
    80 |      |      |         | STR | AssignValue
After the analyser has processed this operation stream, the call to Val operation (VBF.Val) is changed to a conversion to Integer operation (CNV.ToInteger):
    Analyser audit
    52 |      |      |         | NEW | 47 i1 = Val(i2)
    55 |      |      |         | LEV | Nest0
    57 |      |      |         | BIF | Component:Val:221210
    62 |      |      |         | LEV | Nest1
    64 | 1.64 | 1.64 | String  | LDA | Variable:i2:66390
    69 | 1.64 |      |         | ARG | Void
    71 | 1.64 | 1.71 | Integer | CNV | ToInteger   <--- after analyser, we have CNV.ToInteger
    73 | 1.64 |      |         | ARG | Integer
    75 | 2.75 | 1.75 | Integer | LDA | Variable:i1:66306
    80 |      |      |         | STR | AssignValue
This conversion is implicit in VB6 but must be explicit in C#. The tool also makes it explicit for VB.NET, although VB.NET can perform it implicitly with Option Strict Off.
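The practical difference between the two translations shows up in the sample itself: i2 holds an empty string, for which Val returns 0, whereas in VB.NET CInt applied directly to an empty string raises an InvalidCastException. The Python sketch below approximates the two steps, Val followed by the Double-to-Integer conversion, to make the behavior concrete. It is a simplified emulation, not the .NET implementation: the names `vb_val` and `vb_cint_of_double` are hypothetical, and the &H/&O prefixes, exponents, and overflow checks of the real functions are intentionally left out.

```python
import re

# Optional sign, then digits with an optional fraction, or a bare fraction.
LEADING_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def vb_val(text):
    """Approximate Val: parse the leading number in text, else return 0.0."""
    compact = re.sub(r"[ \t\n]", "", text)  # Val skips blanks, tabs, linefeeds
    match = LEADING_NUMBER.match(compact)
    return float(match.group()) if match else 0.0


def vb_cint_of_double(value):
    """Approximate CInt on a Double: round half to even, as Python's round() does."""
    return round(value)


for text in ["", "12abc", "2.5", "3.5", "-7.25 kg"]:
    print(repr(text), vb_val(text), vb_cint_of_double(vb_val(text)))
```

Because Val never fails on non-numeric text, an expression of the form CInt(Val(x)) keeps the VB6 behavior for strings such as "" or "12abc", while still making the Double-to-Integer step visible in the code.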
After that, the tool's author expresses the CNV.ToInteger operation according to the expression specified in the metalang file, VBASIC.XML:
    default VBASIC.XML
    ...
    <pattern id="CNV">
    ...
       <subcode id="ToInteger">
          <vbn role="utility" narg="1" code="CInt(%1d)"/>
          <csh role="utility" status="Extension" narg="1" code="%1d.ToInt()"/>
       </subcode>
There are two ways to change the expression:
Using a custom VBASIC.XML file
Make a project-specific metalang file using a project-specific version of VBASIC.XML. In this case, you would make a copy of VBASIC.XML changed as follows:
    workspace\usr\VBASIC.XML
    ...
    <pattern id="CNV">
    ...
       <subcode id="ToInteger">
          <vbn role="utility" narg="1" code="Val(%1d)"/>
          <csh role="utility" status="Extension" narg="1" code="%1d.ToInt()"/>
       </subcode>
Using a mig.VB7Lang.xml file
Make a project-specific metalang file using a custom mig.vb7lang.xml as follows:
    workspace\usr\mig.VB7lang.xml
    <MetaLanguage>
       <Refactor id="Basic" >
          <Replace id="Patterns">
             <pattern id="CNV">
                <subcode id="ToInteger">
                   <vbn role="utility" narg="1" code="Val(%1d)"/>
                </subcode>
             </pattern>
          </Replace>
       </Refactor>
    </MetaLanguage>
The mig.vb7lang.xml approach is preferred since it more clearly separates and documents the project-specific changes from the system default. This separation simplifies updating to new versions of gmStudio that may also have metalang source file changes.
The process of using the mig.VB7Lang.xml file is similar to the process described above for customizing language files, but the steps are slightly different:
1) Use the project settings form to make a project-specific copy of the Sample Rule Set file, mig.VB7Lang.xml. This will place a copy of the sample in your user folder.
2) Edit the copy of the file in your User folder to reflect your desired language operation mappings.
3) Select the [Project] option button in the [Translator Configuration] group box
4) Click Update Project Metalang File
FWIW: I like making the conversion to Integer explicit:
<vbn role="utility" narg="1" code="CInt(Val(%1d))"/>
Note, the following example shows that if conversion to Integer is not needed, the default translation of Val is to Val, but if a conversion to integer is needed, the CNV.ToInteger pattern is used:
    VB6:
    Dim i1 As Integer
    Dim i2 As Variant
    i1 = 9
    i2 = ""
    i1 = Val(i2)
    Dim d1 As Double
    d1 = Val(i2)

    VB.NET:
    Dim i1 As Integer = 0
    Dim i2 As String = ""
    i1 = 9
    i2 = ""
    i1 = CInt(Val(i2))   ' <-- using custom expression for CNV.ToInteger
    Dim d1 As Double = 0.0
    d1 = Val(i2)