Decoupling FindBugs from BCEL

Document history:
DHH 7/3/2006: Created

Goals:

- Eliminate tight coupling between FindBugs and BCEL

- Allow other bytecode frameworks (such as ASM and maybe Soot) to
  be used by detectors

   *** Detectors could even do their own scanning of the raw classfile
       data, without any bytecode framework
       (reading from either a stream or a byte array)

- Allow visitor-based detectors to use information from
  the dataflow analysis framework.
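
As an illustration of the "raw classfile data" idea above, here is a
minimal sketch of reading the fixed classfile header (magic number,
minor version, major version) straight from a byte array, with no
bytecode framework involved. The class name RawClassfileHeader and its
parse method are hypothetical and are not part of FindBugs.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: reads only the first 8 bytes of a classfile. */
final class RawClassfileHeader {
    final int minorVersion;
    final int majorVersion;

    private RawClassfileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Parses the header; throws IllegalArgumentException if the input is not a classfile. */
    static RawClassfileHeader parse(byte[] data) {
        if (data.length < 8) {
            throw new IllegalArgumentException("Too short to be a classfile");
        }
        // ByteBuffer is big-endian by default, which matches the classfile format.
        ByteBuffer in = ByteBuffer.wrap(data);
        int magic = in.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // u2 minor_version
        int major = in.getShort() & 0xFFFF; // u2 major_version
        return new RawClassfileHeader(minor, major);
    }
}
```

A detector working at this level would continue past the header into
the constant pool and method tables; the sketch stops at the version
fields to stay short.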


Issues:

- We are reliant on BCEL's Constants and Visitor classes

- The bytecode analysis (ba) package makes heavy use of
  BCEL's tree-based representation (the generic package) and also
  the BCEL Repository and type classes (both of which have
  significant shortcomings).  The detectors which use
  the bytecode analysis package are also tightly coupled
  to these classes.


Strategy:

- Change the basic visitation strategy to remove dependence
  on the BCEL Repository and JavaClass classes.

  *** We should develop our own CodeBase and Repository abstractions

- The visitor classes and detectors will probably be the easiest
  to convert.  We can develop our own Visitor/DismantleBytecode
  interfaces, which can be concretely implemented using ASM
  (or any other bytecode framework).  Obviously the detectors
  should not be aware of which bytecode framework is being used.

  We will probably need to add our own equivalent of the
  BCEL Constants interface.   We will need some kind of
  general classfile representation package (to hide framework-specific
  details).

- After visitor-based detectors are working using the new visitor
  framework, start work on developing new classes/APIs for more
  sophisticated (e.g., dataflow) analysis.  We should try to separate
  WHAT information is computed (types, values, nullness, etc.) from
  HOW the information is computed.  Clients (Detectors) should only
  be coupled to the classes representing WHAT is computed.
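
The visitor part of the strategy above might look roughly like the
following sketch. All names here (JvmOpcodes, BytecodeVisitor,
MonitorEnterDetector, FakeDriver) are hypothetical and only illustrate
the shape of the idea: the detector depends on a framework-neutral
interface and constants class, while a separate driver (which in
practice could be backed by ASM, BCEL, or a hand-written scanner)
feeds it events. The opcode values are the standard JVM ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the parts of BCEL's Constants that detectors need. */
final class JvmOpcodes {
    static final int INVOKEVIRTUAL = 0xB6;
    static final int MONITORENTER = 0xC2;
    static final int MONITOREXIT = 0xC3;

    private JvmOpcodes() {}
}

/** Hypothetical framework-neutral callbacks; detectors see only this interface. */
interface BytecodeVisitor {
    void visitMethod(String slashedClassName, String methodName, String methodSignature);

    void sawOpcode(int pc, int opcode);
}

/** Example detector: records every MONITORENTER it is shown. */
final class MonitorEnterDetector implements BytecodeVisitor {
    private final List<String> reports = new ArrayList<>();
    private String currentMethod = "<none>";

    @Override
    public void visitMethod(String slashedClassName, String methodName, String methodSignature) {
        currentMethod = slashedClassName + "." + methodName + methodSignature;
    }

    @Override
    public void sawOpcode(int pc, int opcode) {
        if (opcode == JvmOpcodes.MONITORENTER) {
            reports.add(currentMethod + " @" + pc);
        }
    }

    List<String> reports() {
        return reports;
    }
}

/**
 * Toy driver for demonstration only. It takes already-decoded opcodes and
 * uses the array index as the pc; a real driver would decode operands and
 * report true bytecode offsets.
 */
final class FakeDriver {
    static void drive(BytecodeVisitor visitor, String slashedClassName,
                      String methodName, String methodSignature, int[] opcodes) {
        visitor.visitMethod(slashedClassName, methodName, methodSignature);
        for (int pc = 0; pc < opcodes.length; pc++) {
            visitor.sawOpcode(pc, opcodes[pc]);
        }
    }
}
```

Because MonitorEnterDetector never names a framework class, swapping
FakeDriver for an ASM-backed or BCEL-backed driver would not require
touching the detector.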

Notes:

- We should standardize on VM (slashed) type descriptors in all places
  where a JVM type is being referred to.  Dotted classnames should
  go away except when displaying such types to the user.
  
  *** This is how ASM represents types, which is efficient

  *** We should probably develop a richer type abstraction eventually
      for the dataflow-based detectors, etc.
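
A minimal sketch of the naming convention described in the note above,
assuming a hypothetical TypeNames helper (not an existing FindBugs
class): slashed names are used internally, and the dotted form is
produced only at the point where a type is shown to the user.

```java
/** Hypothetical helper for converting between slashed and dotted class names. */
final class TypeNames {
    private TypeNames() {}

    /** "java.lang.String" -> "java/lang/String"; use at input boundaries. */
    static String toSlashed(String dottedName) {
        return dottedName.replace('.', '/');
    }

    /** "java/lang/String" -> "java.lang.String"; use only for display. */
    static String toDotted(String slashedName) {
        return slashedName.replace('/', '.');
    }

    /** "java/lang/String" -> "Ljava/lang/String;" (JVM field descriptor for a class type). */
    static String toFieldDescriptor(String slashedName) {
        return "L" + slashedName + ";";
    }
}
```

Keeping the conversions in one place makes it easier to check that
dotted names do not leak back into analysis code.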