Overview

Large, mature systems often contain symbols that are no longer used. These may be methods that are no longer called, or variables and data structures that are no longer referenced by active logic. These unused symbols are often referred to as "dead code". Dead code is a form of technical debt, but it can be difficult to identify and risky to remove, and removing it requires a lot of testing.

An Upgrade Project is an optimal time to deal with dead code because gmStudio can help you identify and remove it. Removing dead code also makes the overall Upgrade effort proceed more efficiently and can produce more maintainable results. In addition, you can test for problems caused by removing dead code while you are regression testing other features of the platform upgrade. Removing dead code is a structural upgrade feature that teams often integrate with their Custom Upgrade efforts.

Identifying Dead Code

gmStudio's Refactor/Remove command directs the removal of symbol declarations and, optionally, symbol references as well. The Refactor/Remove command requires you to specify the identifier of the symbol to remove. In some cases the Upgrade Team may already know the identifiers of dead code, but typically many cases of dead code are not known, and they are difficult to find and verify. Teams can benefit from an automated analysis that identifies the dead code. This document describes how gmStudio can help automate the process of identifying and removing dead code.

gmStudio is distributed with gmGlobal.exe, a tool that can help you identify and report unused symbols in various ways. gmGlobal is a .NET executable built using gmAPI. It is a console application that takes a gmPL script on its command line. The script tells gmGlobal what code to analyse, how to analyse it, and how to report the results. gmGlobal recognizes and internally processes the following statements:

  • InformationFiles,
  • FindCallByName,
  • RemoveUnused, and
  • ReportRemovals.

gmGlobal can analyse symbol usage across a collection of inter-related components. It identifies dead code both as symbols that are not referenced at all and as symbols that are referenced only from dead code.


Warning

CAUTION: The Unused symbol analysis depends on the detailed symbol reference information gathered by gmBasic: unreferenced symbols are unused symbols. However, this approach fails to identify references made through late calls. You should complete type inference optimization and other techniques to reduce late calls before doing the Unused analysis. You should also plan to take advantage of the FindCallByName and DoNotRemove features to fine-tune the removals. Always take time to consider these limitations and to review the removal scripts generated by gmGlobal as well as the code that has had symbols removed.

Info

Event handlers (e.g. Form and Control event handlers, Class_Terminate, Class_Initialize, Sub Main, etc.) and symbols declaring or implementing interfaces are excluded from removals because they are needed even if they are not referenced.

Reporting

The easiest way to use the Unused Symbols Analysis and the FindCallByName Reports is to invoke them from the Reporting menu.

Using gmGlobal as a Task in gmStudio

You may integrate gmGlobal into your gmStudio project as a special gmGlobal task. Typically this task is inserted into the task list so that it runs after the VBPs have been translated to create the VBI files to be analysed. When you Translate this special task, gmStudio prepares an actual script and runs gmGlobal.exe, passing the actual script as the first argument.

A gmGlobal task has its Source File set to the gmGlobal script template. The task must use the GMTOOL: notation to specify the location of gmGlobal.exe.

GMTOOL:C:\Program Files (x86)\GreatMigrations\gmStudio\gmGlobal.exe

Here are some attributes of a sample gmGlobal task:

Source Name              = [RemoveUnused1]
Source Location          = [C:\gmSpec\Util\UnUsedTest\proj\usr]
Source File Name         = [RemoveUnused1.xml]
Translation Script       = [GMTOOL:C:\Program Files (x86)\GreatMigrations\gmStudio\gmGlobal.exe] 
Task Command Script      = [UserCmds.cmd]
Code Bundle Path         = [C:\gmSpec\Util\UnUsedTest\proj\log\UnUsedTest-RemoveUnused1-std-csh.bnd]

A sample RemoveUnused Script

I will demonstrate the Unused Symbol Analysis and Reporting with a small demo project. The project has default translations set up for two related VBPs:

  • projUnusedDll.vbp
  • UnUsedTestEXE.vbp

The first example script is RemoveUnused1.xml:

<gmGlobal>
   <Select Progress="1" />
   <InformationFiles>
      <Load id="UnUsedTest-projUnusedDll-std-csh.vbi" />
      <Load id="UnUsedTest-UnUsedTestEXE-std-csh.vbi" />
   </InformationFiles>
   <Output Status="New" FileName="%bndPath%" />
   <RemoveUnused />
   <Output Status="Close" />
</gmGlobal>

This script directs gmGlobal.exe to load the VBIs for the two VBP translations and generate a RemoveUnused report. The report is placed in the BndFile associated with the task to facilitate integration with gmStudio progress tracking and reporting.

The InformationFiles statement supplies the list of information files to be processed by downstream statements such as FindCallByName or RemoveUnused. The list is initiated either by the statement <InformationFiles site="folder"> or by a statement <InformationFiles> with no site attribute. When "site" is specified, the list consists of all files with the "vbi" extension in the specified folder. Otherwise, each file is introduced by a statement <Load id="pathname">.
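
The two forms of the InformationFiles list can be pictured with a short Python sketch. This is only an illustration of the rule described above, not gmGlobal's implementation: the function name resolve_information_files is hypothetical, and the sketch assumes a non-recursive folder scan, a case-insensitive match on the "vbi" extension, and sorted output, none of which is confirmed by this document.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def resolve_information_files(site: Optional[str] = None,
                              load_ids: Iterable[str] = ()) -> List[str]:
    """Build the list of information files for either form of the statement.

    site     -- folder given by a site attribute (hypothetical parameter name)
    load_ids -- pathnames given by individual Load statements
    """
    if site is not None:
        # "site" form: every file in the folder with the vbi extension.
        # Assumption: the scan does not descend into subfolders.
        return sorted(
            str(p) for p in Path(site).iterdir()
            if p.is_file() and p.suffix.lower() == ".vbi"
        )
    # Explicit form: one entry per Load statement, in the order given.
    return list(load_ids)
```

With no site argument, the two Load entries from RemoveUnused1.xml are simply returned in the order they were listed.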

The RemoveUnused statement scans a set of loaded information files to find members that are not used. The algorithm proceeds in two phases: Local and Global. The central problem is that once a set of members is removed, any other members that were referenced only by that set become unused as well. These members then form the next set of removals, and so on. However, "members referenced only by the removed members" cannot be identified directly. Instead, all references made by the members that remain must be recomputed. This core operation is performed by the gmAPI Runtime.References() service method, which is at the heart of the algorithm. The loop that identifies unused members and recomputes the references using Runtime.References() until no new unused members are found is performed by the method RescanForUnusedReferences(), which is called by both the Local and Global removal phases. Once all unused members have been identified, the statement authors a set of Registry.RefactorFile statements. These can be used by the same translation scripts that produced the information files to author target code that does not include the unused members.

CAUTION: The RemoveUnused algorithm changes the attributes of the members of the information files. Once it has completed, these information files cannot be used to repeat the operation, for example with different member types or a different DoNotRemove list. Either make copies of the source information files before performing the removals, or refresh them by rerunning the scripts that produced them. (Note: in developing this application I made this mistake many times.)
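
The rescan loop described above can be sketched in Python as a repeat-until-stable scan. This is a minimal sketch under stated assumptions, not the gmGlobal source: find_unused, references, and keep are hypothetical names, and a plain dictionary stands in for the reference information that the real tool obtains through Runtime.References().

```python
from typing import Dict, Set


def find_unused(references: Dict[str, Set[str]], keep: Set[str]) -> Set[str]:
    """Return every member that becomes unused, directly or transitively.

    references -- maps each member to the members it references
                  (stand-in for the data behind Runtime.References())
    keep       -- members never removed, e.g. event handlers or
                  DoNotRemove entries (assumed to be supplied by the caller)
    """
    removed: Set[str] = set()
    while True:
        # Recompute what the surviving members still reference.
        # A member's reference to itself does not keep it alive.
        referenced: Set[str] = set()
        for member, targets in references.items():
            if member not in removed:
                referenced |= {t for t in targets if t != member}
        newly_unused = {
            m for m in references
            if m not in removed and m not in keep and m not in referenced
        }
        if not newly_unused:
            return removed  # a pass found nothing new, so the scan is done
        removed |= newly_unused


if __name__ == "__main__":
    refs = {
        "Main": {"usedSub"},
        "usedSub": set(),
        "notUsedSub": {"UnreachableSub"},
        "UnreachableSub": {"UnreachableFunc"},
        "UnreachableFunc": set(),
    }
    print(sorted(find_unused(refs, keep={"Main"})))
```

In the example, notUsedSub is found in the first pass, UnreachableSub in the second, and UnreachableFunc in the third; a fourth pass finds nothing new and ends the loop. Note that this sketch does not detect groups of members that reference only each other.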

gmGlobal also produces a log of its operations, with verbosity corresponding to Select.Progress.

Select.Progress=1

gmGlobal V40.34x86(09/22/22) System Build(09/22/22 14:48:10)
The InformationFiles list contains 2 files.
The Local Removal Scan of <UnUsedTest-projUnusedDll-std-csh.vbi> required 2 passes.
The Local Removal Scan of <UnUsedTest-UnUsedTestEXE-std-csh.vbi> required 3 passes.
Performing Global Removal pass 1
Performing Global Removal pass 2
Performing Global Removal pass 3
Performing Global Removal pass 4

Select.Progress=2

gmGlobal V40.34x86(09/22/22) System Build(09/22/22 14:48:10)
The InformationFiles list contains 2 files as follows:
   Information File(1): UnUsedTest-projUnusedDll-std-csh.vbi
   Information File(2): UnUsedTest-UnUsedTestEXE-std-csh.vbi
The Local Removal Scan of <UnUsedTest-projUnusedDll-std-csh.vbi> required 2 passes.
The Local Removal Scan of <UnUsedTest-UnUsedTestEXE-std-csh.vbi> required 3 passes.
Performing Global Removal pass 1
The information file <UnUsedTest-projUnusedDll-std-csh.vbi> has 3 unUsed members
   projUnusedDll.Class1.DLLexposedUsedbyClient:42265 has 0 global references
   projUnusedDll.Class1.dllPropUsedbyClient:42327 has 0 global references
   projUnusedDll.Class1.dllPropNotUsedbyClient:42584 has 0 global references
Performing Global Removal pass 2
The information file <UnUsedTest-projUnusedDll-std-csh.vbi> has 1 unUsed members
   projUnusedDll.Class1.DLLexposedNotUsedbyClient:42202 has 0 global references
Performing Global Removal pass 3
The information file <UnUsedTest-projUnusedDll-std-csh.vbi> has 1 unUsed members
   projUnusedDll.Class1.pubOnlyUsedinDLL:42140 has 0 global references
Performing Global Removal pass 4


The run produces the following file of refactoring commands.

<Registry type="RefactorFile" Source="...\UnusedTestDLL.vbp">
<Refactor errorStatus="warn">
   <Remove identifier="projUnusedDll.Class1.privNotUsedinDLL"/>
   <Remove identifier="projUnusedDll.Class1.pubUsedOnlyFromPrivate"/>
   <Remove identifier="projUnusedDll.Class1.dllPropUsedbyClient.Let.val"/>
   <Remove identifier="projUnusedDll.Class1.dllOnlyGetUsedbyClient.Let.val"/>
</Refactor>
</Registry>
<Registry type="RefactorFile" Source="...\UnUsedTest.vbp">
<Refactor errorStatus="warn">
   <Remove identifier="UnUsedTestEXE.modUnUsedTest.NotUsedDecl"/>
   <Remove identifier="UnUsedTestEXE.modUnUsedTest.UnreachableDecl"/>
   <Remove identifier="UnUsedTestEXE.notUsedSub"/>
   <Remove identifier="UnUsedTestEXE.notUsedFunc"/>
   <Remove identifier="UnUsedTestEXE.UnreachableSub"/>
   <Remove identifier="UnUsedTestEXE.UnreachableFunc"/>
</Refactor>
</Registry>

The commands above may be used as a starting point for implementing rules that will remove unused code from your translations. The commands would be integrated with a GlobalSettings script.
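
When reviewing a generated removal file before integrating it, it can help to list the identifiers grouped by source project. The following Python sketch shows one way to do that with the standard library. It is not part of gmStudio: the function name list_removals is hypothetical, and the sketch assumes the file consists only of Registry elements like those shown above, with no XML declaration, so that it can be wrapped in a temporary root element for parsing.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def list_removals(refactor_text: str) -> Dict[str, List[str]]:
    """Group the Remove identifiers in a refactoring file by Source value."""
    # The file holds several top-level Registry elements, so wrap them in a
    # temporary root to obtain a single well-formed document.
    root = ET.fromstring("<root>" + refactor_text + "</root>")
    removals: Dict[str, List[str]] = {}
    for registry in root.iter("Registry"):
        source = registry.get("Source", "")
        for remove in registry.iter("Remove"):
            removals.setdefault(source, []).append(remove.get("identifier", ""))
    return removals
```

The resulting dictionary can then be compared against the list of symbols the team knows to be reached through late calls.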

Reporting

Following the application of <RemoveUnused /> to a set of VBI files, the list of unused symbols may be reported using code such as the following:

   <Output Status="New" FileName="%MigName%-ReportUnused.tab" syntax="Tabbed" />
   <ReportRemovals />
   <Output Status="Close" />

The resulting report is written to the file named in the FileName attribute and looks like this:


Controlling the Types of symbols in the Unused Analysis

The "RemoveUnused" command itself has two mutually exclusive attributes: Include and Exclude. These specify which types of members (methods, fields, constants, properties, declarations, events, enumerations, or structures) are eligible for removal. By default, all of these types are eligible. Automatically removing every unused member of every type from the application code may go too far in many cases. The user may be mainly looking for help with identifying and removing unused subprograms, since these typically reference each other and create a complex web of dependencies that is tedious to unravel. Or a user might want to retain all events rather than singling out particular ones via a DoNotRemove list.

The types of symbols included in the Unused Symbols analysis may be set with the Include attribute of the RemoveUnused command; for example:

<RemoveUnused Include="method, declare, property" /> 

Alternatively, the types of symbols excluded from the Unused Symbols analysis may be set with the Exclude attribute of the RemoveUnused command; for example:

<RemoveUnused Exclude="field, constant, structure, enumeration, event" /> 

The following type names are recognized:

      method
      field
      constant
      property
      declare
      structure
      enumeration
      event
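
The effect of the Include and Exclude attributes can be sketched in Python as follows. This is an illustration of the rules stated above rather than gmGlobal's own logic: the function name eligible_types is hypothetical, and treating type names as case-insensitive and rejecting unknown names are assumptions made for the sketch.

```python
from typing import Optional, Set

# The eight type names listed above.
TYPE_NAMES = ("method", "field", "constant", "property",
              "declare", "structure", "enumeration", "event")


def _parse(value: str) -> Set[str]:
    """Split a comma-separated attribute value into a set of type names."""
    names = {n.strip().lower() for n in value.split(",") if n.strip()}
    unknown = names - set(TYPE_NAMES)
    if unknown:
        # Assumption: unrecognized names are treated as an error.
        raise ValueError("unrecognized type names: " + ", ".join(sorted(unknown)))
    return names


def eligible_types(include: Optional[str] = None,
                   exclude: Optional[str] = None) -> Set[str]:
    """Return the member types eligible for removal."""
    if include is not None and exclude is not None:
        raise ValueError("Include and Exclude are mutually exclusive")
    if include is not None:
        return _parse(include)
    if exclude is not None:
        return set(TYPE_NAMES) - _parse(exclude)
    return set(TYPE_NAMES)  # default: every type is eligible
```

Under this reading, the Include example and the Exclude example shown above select the same three types: method, declare, and property.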

Preventing symbols from being marked as Unused

In addition to its Include and Exclude attributes, the RemoveUnused statement supports a set of DoNotRemove subcommands. The DoNotRemove subcommands store a series of identifiers classified by their scope; gmGlobal then scans the loaded information files to locate and mark any members specified in the list. There are three scope levels: Global = 0, Library = 1, and Member = 2+. The level is computed simply by counting the periods in the identifier. Each subcommand has a single "id" attribute. The example below uses a DoNotRemove element to suppress removal of a specific symbol.

<gmGlobal>
   <InformationFiles>
      <Load id="UnUsedTest-projUnusedDll-std-csh.vbi" />
      <Load id="UnUsedTest-UnUsedTestEXE-std-csh.vbi" />
   </InformationFiles>
   <Output Status="New" FileName="%bndPath%" />
   <RemoveUnused>
      <DoNotRemove id="projUnusedDll.Class1.privNotUsedinDLL"/>
   </RemoveUnused>
   <Output Status="Close" />
</gmGlobal>
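
The scope-level rule for DoNotRemove identifiers (count the periods) is simple enough to show as a short Python sketch. The function names scope_level and scope_name are hypothetical and exist only for this illustration.

```python
def scope_level(identifier: str) -> int:
    """Scope level of a DoNotRemove identifier: the number of periods in it."""
    return identifier.count(".")


def scope_name(identifier: str) -> str:
    """Map the level to the names used above: Global = 0, Library = 1, Member = 2+."""
    level = scope_level(identifier)
    if level == 0:
        return "Global"
    if level == 1:
        return "Library"
    return "Member"


if __name__ == "__main__":
    print(scope_name("projUnusedDll.Class1.privNotUsedinDLL"))
```

The identifier used in the script above, projUnusedDll.Class1.privNotUsedinDLL, contains two periods and therefore falls in the Member scope.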


FindCallByName

As mentioned above, the Unused analysis risks marking late-called symbols as unused. Typically the user has to adjust for this by adding DoNotRemove commands and by carefully reviewing the reported removals for symbols known to be accessed by a late call. This can be a difficult process, so gmGlobal includes an additional analyzer to assist you: the FindCallByName command, illustrated below:

gmAPI_CallByName.xml

<gmGlobal>
   <Storage Action="Create" Identifier="gmAPI_CallByName" />
   <LoadEnvironment />
   <Select MaxOutputWidth="2048" />
   <Output Status="New" Filename="gmAPI_CallByName.out" />
   <InformationFiles>
      <Load id="UnUsedTest-projUnusedDll-std-csh.vbi" />
      <Load id="UnUsedTest-UnUsedTestEXE-std-csh.vbi" />
   </InformationFiles>
   <FindCallByName ShowDetails="on" />
   <Storage Action="Close" />
</gmGlobal>

The C# gmAPI has not yet been integrated with the RemoveUnused logic, but this may be done in a future release.

One of the major strategies for removing unwanted CallByNames is to provide interfaces that can be used either to type the host of the CallByName or to box the CallByName itself. Before actual CallByName refactoring instructions can be authored to use these interfaces, the interfaces themselves have to be defined. The optional "Interfaces" attribute generates an Interface Description File (IDF) based on the unresolved CallByNames encountered; for example:


<FindCallByName ShowDetails="on" interfaces="LateCallInterfaces.dll.xml" />

This IDF can then be referenced by a FindCallByName script that does not specify the Interfaces attribute. In this form, a set of refactoring commands is authored instead.

To be continued...
