Structuring Extension Code
This tutorial section shows how to create an empty extension using our provided script, and goes over the organization of the generated extension code and how it corresponds to code in the Polyglot library.
Creating an empty extension
Any language extension incurs a fixed startup cost, namely, several packages and
files required to make the extension usable with the base compiler. Creating
these skeletal entities can be tedious, so Polyglot provides a script to
generate them automatically. The skeleton template is located in directory
skel
of the Polyglot code base, but simply copying the directory
is not enough, because the files and directories must also be renamed for the
new extension. The script copies and renames them in one step.
The script is
bin/newext.sh
and has the following usage:
Usage: newext.sh dir package LanguageName ext
  where dir          - name to use for the top-level directory
                       and for the compiler script
        package      - name to use for the Java package
        LanguageName - full name of the language
        ext          - file extension for source files
  package and LanguageName must be legal Java identifiers.
For our
CArray
extension, we will use this command:
newext.sh carray carray CArray car
The first token, newext.sh, may need to be prefixed with the path to the
Polyglot bin directory if that directory is not listed in the PATH
environment variable.
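The usage text requires package and LanguageName to be legal Java identifiers. As a rough sketch of what that constraint means, the following standalone Java class checks candidate names before the script is run. The class NewExtArgs and its methods are hypothetical and are not part of Polyglot; whether newext.sh performs a similar check itself is not stated here.

```java
import javax.lang.model.SourceVersion;

// Hypothetical helper, not part of Polyglot: checks the two arguments that
// the newext.sh usage text says must be legal Java identifiers.
final class NewExtArgs {
    /** True if s is a syntactically valid Java identifier and not a reserved word. */
    static boolean isLegalIdentifier(String s) {
        return SourceVersion.isIdentifier(s) && !SourceVersion.isKeyword(s);
    }

    /** Returns a description of the first problem found, or null if both names are usable. */
    static String check(String pkg, String languageName) {
        if (!isLegalIdentifier(pkg)) {
            return "package is not a legal Java identifier: " + pkg;
        }
        if (!isLegalIdentifier(languageName)) {
            return "LanguageName is not a legal Java identifier: " + languageName;
        }
        return null;
    }

    public static void main(String[] args) {
        // The names used for the CArray extension in this tutorial.
        System.out.println(check("carray", "CArray"));
        // A hyphen is not allowed in a Java identifier.
        System.out.println(check("c-array", "CArray"));
    }
}
```

With this check, the names chosen for CArray pass, while a name such as c-array or a reserved word such as class would be rejected.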
Extension structure
Several directories will be generated for the empty extension. We will focus on
directory
carray/compiler/src
, which contains the extension code.
The remainder of this section provides short descriptions of these extension
components. Later, we will explore some of these components in more detail
while we implement the CArray
extension in full.
Package carray
This package contains general specifications for the extension.
Main.java: The entry point to the extension compiler.
Topics.java: Defines topics used for informational reporting.
Version.java: Defines the version number, which consists of major, minor, and patch-level numbers. The version number is used as a check when extracting extension-specific type information from .class files.
ExtensionInfo.java: Specifies appropriate implementations for the extension. In particular, it defines the default file extension, name of the compiler executable, parser, factory class for creating AST nodes, and type system. Extension information also includes additional compiler options for the extension, and any change to the scheduler, which determines the order of compiler passes to be run.
Package carray.parse
This package contains the lexical and syntactic specifications for the extension.
carray.flex: The lexer specification based on the JFlex fast scanner generator for Java.
carray.ppg: The extensible parser specification that defines the context-free grammar for the language.
Package carray.ast
This package defines the machinery for abstract syntax trees (ASTs) of programs
in the extension. The naming convention is that a name such as
CArrayNodeFactory
denotes an interface, and the suffix
_c
, as in CArrayNodeFactory_c, designates the
implementation of the corresponding interface.
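To make the naming convention concrete, here is a minimal, self-contained sketch. ToyNode, ToyNodeFactory, and the method intLit are invented for illustration and do not appear in Polyglot or in the generated skeleton; only the pairing of an interface name with a _c implementation name mirrors the convention described above.

```java
// Toy illustration only: every name in this block is hypothetical.
class NamingDemo {
    public static void main(String[] args) {
        // Client code depends on the interface; only construction names the _c class.
        ToyNodeFactory nf = new ToyNodeFactory_c();
        System.out.println(nf.intLit(42).describe());
    }
}

// A stand-in for an AST node.
interface ToyNode {
    String describe();
}

// Interface: the plain name, with no suffix.
interface ToyNodeFactory {
    ToyNode intLit(int value);
}

// Implementation of ToyNodeFactory: the same name plus the _c suffix.
class ToyNodeFactory_c implements ToyNodeFactory {
    @Override
    public ToyNode intLit(int value) {
        return () -> "IntLit(" + value + ")";
    }
}
```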
CArrayNodeFactory.java: Defines factory methods for new AST nodes in the language.
CArrayExtFactory.java: Defines factory methods for extension objects, which may override existing operations on ASTs defined for compiler passes in the base language, and may define new operations on ASTs specific to compiler passes in the extended language.
CArrayExt.java: The default extension class for the language. By default, an instance of this extension class simply forwards the operations to the base language. Any other extension class that implements or overrides operations specific to a particular kind of AST node must extend this class. Note that the class name does not follow the convention, but this may change in the future.
CArrayLang.java: The language dispatcher for the extension, which determines the AST node or extension object that contains the appropriate implementation of operations for compiler passes and invokes the operations on that object properly.
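The interplay between the default extension class, specialized extension classes, and the language dispatcher can be pictured with a small toy model. This is only a sketch of the delegation idea under simplifying assumptions: ToyNode, ToyExt, ToyArrayAccessExt, ToyLang, and the typeCheck operation are all hypothetical, and the real Polyglot classes have different and richer interfaces.

```java
// Toy model of extension objects and a dispatcher; all names are hypothetical.
class ExtDemo {
    public static void main(String[] args) {
        ToyNode plain = new ToyNode("x", new ToyExt());
        ToyNode special = new ToyNode("a[0]", new ToyArrayAccessExt());
        System.out.println(ToyLang.typeCheck(plain));
        System.out.println(ToyLang.typeCheck(special));
    }
}

// A base-language node that carries its own implementation of an operation
// and a reference to its extension object.
class ToyNode {
    final String text;
    final ToyExt ext;

    ToyNode(String text, ToyExt ext) {
        this.text = text;
        this.ext = ext;
    }

    String typeCheck() {
        return "base check of " + text;
    }
}

// Default extension object: forwards the operation to the base language.
class ToyExt {
    String typeCheck(ToyNode n) {
        return n.typeCheck();
    }
}

// Specialized extension object: extends the default and overrides one operation.
class ToyArrayAccessExt extends ToyExt {
    @Override
    String typeCheck(ToyNode n) {
        return "extended check of " + n.text;
    }
}

// Dispatcher: routes each operation to the object that implements it.
final class ToyLang {
    static String typeCheck(ToyNode n) {
        return n.ext.typeCheck(n);
    }
}
```

In this model, a node with the default extension object falls through to the base behavior, while a node with a specialized extension object gets the overridden behavior, and callers always go through the dispatcher.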
Package carray.types
This package contains the type objects and defines the type system for the
language.
CArrayTypeSystem.java
: Defines any methods required for the type system of the extension.
Package carray.visit
This package hosts any visitor classes that iterate over abstract syntax trees
specific to the extension.
Building the extension
Like Polyglot itself, extensions can be built using the Ant build tool. The
generated build script, located at
carray/build.xml
, needs no
modifications to build the empty extension. In fact, the full implementation of
CArray
requires no modifications to the build script whatsoever.
Polyglot Test Harness
Polyglot provides a framework called the Polyglot Test Harness (pth)
for conveniently testing the extension implementation. The generated directory
carray/tests
contains a template test script called
pthScript
used by pth. The test script has the following grammar:
# ScriptFile       ::= CompilerTest+
# CompilerTest     ::= ExtClassName ["CmdLineArgs"] { FileTest [; FileTest]* }
# FileTest         ::= CompilationUnits [Description] [FailureSet]
# CompilationUnits ::= Filenames [, Filenames]*
# Filenames        ::= Filename [Filename]*
# Description      ::= LitString
# FailureSet       ::= Failure [, Failure]*
# Failure          ::= ( ErrorKind )
#                    | ( ErrorKind, "RegExp" )
#                    | ( "RegExp" )
#                    | ( )
# ErrorKind   : one of, or a unique prefix of one of the following
#               strings: "Warning", "Internal Error", "I/O Error",
#               "Lexical Error", "Syntax Error", "Semantic Error"
#               or "Post-compiler Error".
# Filename    : the name of a file. Is interpreted from the
#               directory where pth is run.
# LitString   : a literal string, enclosed in quotes.
# RegExp      : a regular expression, as in java.util.regex;
#               is always enclosed in quotes.
# CmdLineArgs : additional command line args for the Polyglot
#               compiler; is always enclosed in quotes.
The generated test script contains one compiler test, which contains one file test that invokes the currently empty
CArray
extension on the
generated file Hello.car
:
carray.ExtensionInfo "-d out" {
    Hello.car;
}
This compiler test also specifies the Polyglot flag
-d
, which
designates the output directory of .class
files generated by the
compiler.
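The grammar above allows an ErrorKind to be written as a unique prefix of one of the listed strings. The sketch below shows one way such a prefix could be resolved. The class ErrorKindPrefix and its resolve method are hypothetical and are not taken from pth; in particular, case-sensitive matching is an assumption, and pth's own matching rules may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of unique-prefix resolution; not code from pth.
final class ErrorKindPrefix {
    // The error kinds listed in the test script grammar.
    static final String[] KINDS = {
        "Warning", "Internal Error", "I/O Error", "Lexical Error",
        "Syntax Error", "Semantic Error", "Post-compiler Error"
    };

    /**
     * Returns the error kind named by the given text: an exact match if there
     * is one, otherwise the single kind that starts with the text. Returns
     * null if no kind or more than one kind matches.
     * Assumption: matching is case-sensitive.
     */
    static String resolve(String text) {
        List<String> matches = new ArrayList<>();
        for (String kind : KINDS) {
            if (kind.equals(text)) {
                return kind;
            }
            if (kind.startsWith(text)) {
                matches.add(kind);
            }
        }
        return matches.size() == 1 ? matches.get(0) : null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("Sem")); // unique prefix of "Semantic Error"
        System.out.println(resolve("S"));   // ambiguous: "Syntax Error" and "Semantic Error"
    }
}
```

Under these assumptions, a failure written as ( Sem ) would name a semantic error, while ( S ) would be ambiguous between the syntax and semantic kinds.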
The entry point for pth is class
polyglot.pth.Main
, located in
directory tools/pth/src
of the Polyglot code base. To run pth,
simply pass in the test script file name as the argument. For each file test
in the test script, pth will invoke the specified compiler and report whether
the compiler succeeds or fails as expected. For Hello.car
, no
failures are listed, so this file test is expected to compile. Running pth with
the generated pthScript
yields the following result:
Test script pthScript
Hello.car: OK
pthScript: 1 out of 1 tests succeeded.
We will populate this test script with more test cases as we implement
CArray
.