This guide describes the steps involved in implementing and using an aParse-generated C++ parser.
Write Grammar | Use the ABNF metalanguage to define the syntax of a protocol. |
Generate Parser | Use aParse to generate the parser source code. |
Test Parser | Use the parser to parse instances of the protocol and verify whether the grammar correctly describes the protocol. |
Produce Visitor | Produce a visitor for instances of the protocol that performs the desired conversion or translation. |
Employ Parser | Make use of the parser and visitor in a program. |
Additional information can be found in the following sections.
External Rules | Using external user defined rules. |
Options | aParse supported options. |
Apache Ant | Invoking aParse from Apache Ant. |
Write Grammar
Using the ABNF metalanguage, define the syntax of a protocol in a text file.
For example, a 24-hour clock could be defined in a file named clock.abnf.
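As an illustration of what such a grammar has to accept, the following sketch hand-codes a recognizer for a 24-hour clock value in C++. It is not aParse output. It assumes an HH:MM:SS layout with ":" separators. The rule names in the comments echo those used elsewhere in this guide (Clock, Hours, Minutes, Separator), while the Seconds rule and the function names twoDigits and isClock are assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Reads the two characters at text[pos] and text[pos + 1] and returns their
// decimal value, or -1 if either character is not a digit.
static int twoDigits(const std::string& text, std::size_t pos)
{
  assert(pos + 1 < text.size());  // the caller guarantees both characters exist
  unsigned char a = static_cast<unsigned char>(text[pos]);
  unsigned char b = static_cast<unsigned char>(text[pos + 1]);
  if (!std::isdigit(a) || !std::isdigit(b)) return -1;
  return (a - '0') * 10 + (b - '0');
}

// Assumed shape: Clock = Hours Separator Minutes Separator Seconds
// with Hours 00-23, Minutes 00-59, Seconds 00-59 and Separator ":".
bool isClock(const std::string& text)
{
  if (text.size() != 8) return false;
  if (text[2] != ':' || text[5] != ':') return false;
  int hours = twoDigits(text, 0);
  int minutes = twoDigits(text, 3);
  int seconds = twoDigits(text, 6);
  return hours >= 0 && hours <= 23
      && minutes >= 0 && minutes <= 59
      && seconds >= 0 && seconds <= 59;
}
```

A grammar that describes the same values would accept "23:59:59" and reject "24:00:00", which makes inputs like these useful when testing the generated parser.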
Generate Parser
Use aParse to generate the Clock parser source code.
java -cp aparse.jar com.parse2.aparse.Parser -language cpp -main -visitors Displayer clock.abnf
Any syntax errors or inconsistencies in the grammar are reported at this point. For example, aParse reports errors if the Separator rule is referenced but not declared.
Once the grammar compiles cleanly, the following C++ header and source files will have been produced.
Parser.hpp Parser.cpp | The parser of instances of Clock. |
ParserMain.cpp | An optional file that contains a main method that may be invoked to run the parser. |
Rule.hpp Rule.cpp | An abstract base class that is inherited by all concrete Rule_* and Terminal_* classes. |
Rule_*.hpp Rule_*.cpp | Classes, one for each of the rules defined in the grammar. For example, Rule_Clock, Rule_Hours and Rule_Minutes. |
Terminal_*.hpp Terminal_*.cpp | Two classes, Terminal_StringValue and Terminal_NumericValue, for the ABNF terminal string and numeric values. |
ParserContext.hpp ParserContext.cpp | A context class that encapsulates the information required by the Rule_* and Terminal_* parsers. |
ParserAlternative.hpp ParserAlternative.cpp | A class that encapsulates the details of an alternative encountered during the parsing of an instance of Clock. |
ParserException.hpp ParserException.cpp | An exception that is raised by the Parser when it attempts to parse an instance of Clock that does not conform to the specified grammar. |
Visitor.hpp | An abstract base class that must be inherited by classes that want to traverse a Parser generated rule tree. |
Displayer.hpp Displayer.cpp | A class that implements the interface defined by the Visitor class and displays the terminal string and numeric values of a Parser generated rule tree. |
XmlDisplayer.hpp XmlDisplayer.cpp | A class that implements the interface defined by the Visitor class and displays the contents of a Parser generated parse tree in XML. |
Compile these source files and build an executable using the relevant tools of an appropriate C++ development environment. From here on, it is assumed that an executable called parser has been built.
Test Parser
To test the parser, create a file, for example clock.txt, containing an instance of Clock.
Parse it and display the contents of the parse tree using the parser and Displayer.
parser -visitor Displayer -file clock.txt
Alternatively, parse it and display the contents of the parse tree in XML using the XmlDisplayer visitor.
parser -visitor XmlDisplayer -file clock.txt
See Clock Parser for a working example.
Produce Visitor
Rather than using the automatically generated XmlDisplayer, it is possible to write and use an alternative visitor.
For example, the following Clock2Xml visitor is identical to the XmlDisplayer except it does not output the ":" separators.
See Clock Parser for a working example.
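To make the idea concrete, here is a minimal, self-contained sketch of a visitor that emits XML for a rule tree but skips the separators. The Node, RuleNode and TerminalNode types, the Visitor signatures and the leaf() helper are hypothetical stand-ins written for this example. They are not the classes or interfaces that aParse generates, which should be taken from the generated Visitor.hpp.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct RuleNode;
struct TerminalNode;

// Stand-in for a visitor base class (signatures are assumptions).
struct Visitor
{
  virtual ~Visitor() {}
  virtual void visit(const RuleNode& rule) = 0;
  virtual void visit(const TerminalNode& terminal) = 0;
};

// Stand-in for a rule tree node.
struct Node
{
  virtual ~Node() {}
  virtual void accept(Visitor& visitor) const = 0;
};

struct TerminalNode : Node
{
  std::string spelling;
  explicit TerminalNode(const std::string& s) : spelling(s) {}
  void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

struct RuleNode : Node
{
  std::string name;
  std::vector<std::unique_ptr<Node>> children;
  explicit RuleNode(const std::string& n) : name(n) {}
  void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

// Emits an XML element for every rule except Separator.
class Clock2Xml : public Visitor
{
public:
  void visit(const RuleNode& rule) override
  {
    if (rule.name == "Separator") return;  // drop ":" from the output
    out_ << "<" << rule.name << ">";
    for (const auto& child : rule.children) child->accept(*this);
    out_ << "</" << rule.name << ">";
  }
  void visit(const TerminalNode& terminal) override { out_ << terminal.spelling; }
  std::string xml() const { return out_.str(); }

private:
  std::ostringstream out_;
};

// Builds a rule that wraps a single terminal, such as Hours containing "23".
std::unique_ptr<Node> leaf(const std::string& name, const std::string& text)
{
  assert(!name.empty());
  std::unique_ptr<RuleNode> rule(new RuleNode(name));
  rule->children.push_back(std::unique_ptr<Node>(new TerminalNode(text)));
  return std::unique_ptr<Node>(std::move(rule));
}
```

Skipping the Separator rule in one place keeps the rest of the traversal identical to a plain XML displayer, which is the only difference the Clock2Xml description calls for.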
Similarly, the following Clock24To12 visitor converts the 24 hour clock values to their 12 hour clock equivalents.
See Clock Parser for a working example.
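The arithmetic behind such a conversion can be sketched independently of the generated classes. The function names to12Hour and format12 and the AM/PM suffix are assumptions made for this example, not part of aParse or of the Clock24To12 visitor itself.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Converts an hour in the range 0-23 to its 12-hour clock value (1-12).
int to12Hour(int hours24)
{
  assert(hours24 >= 0 && hours24 <= 23);
  int h = hours24 % 12;
  return h == 0 ? 12 : h;  // 00 and 12 both map to 12
}

// Formats a 24-hour time as "hh:mm AM" or "hh:mm PM".
std::string format12(int hours24, int minutes)
{
  assert(minutes >= 0 && minutes <= 59);
  char buffer[32];
  std::snprintf(buffer, sizeof buffer, "%02d:%02d %s",
                to12Hour(hours24), minutes, hours24 < 12 ? "AM" : "PM");
  return buffer;
}
```

A visitor could apply a conversion like this when it visits the hours rule and pass all other values through unchanged.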
Employ Parser
Having verified the operation of the parser and visitor, they may be built into a program. The following code segment shows how the parser and Clock24To12 visitor might be used to parse and process the contents of the file clock.txt.
External Rules
aParse supports the use of external code to parse and encapsulate protocol elements that are not directly supported by the ABNF metalanguage.
For example, a protocol may not use separators to identify the boundary between variable length elements. Instead, the length of an element is found within the element. The following shows how a length prefixed string may be supported.
The following ABNF grammar states that a Message is composed of any number of String elements. The use of the $rule directive tells aParse that the user defined LLString rule is to be used for String elements.
The following is the C++ class for the LLString rule. This supports variable length strings where the ASCII format length of the string is located in the first two characters.
This user defined class must provide at least two methods: parse() and accept().
parse()
This is a factory method that determines whether the next sequence of characters in the input being parsed represents a two-character length-prefixed string and, if so, returns an instance of the LLString rule. It must return NULL otherwise.
All the information required by the parse() method is contained in the supplied ParserContext. The context.text string is the stream of characters being parsed and context.index points to the start of the characters, within the context.text string, that the parse() method must attempt to parse. If the parse is successful, the context.index must be advanced by the number of characters taken up by the LLString element. If the parse fails, context.index must not be changed.
The first and last things the parse() method must do are to call the context.push() and context.pop() methods. The call to context.push() tells aParse that the LLString parser has been called; the supplied rulename is what appears in the rule stack output with any ParserException thrown. The call to context.pop() tells aParse that parsing has completed and, most importantly, whether or not the parse was successful.
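The contract described above can be sketched as follows. The Context struct is a hypothetical stand-in for the generated ParserContext: only the names text, index, push() and pop() come from the description, while their types, the pop() signature and the stack member are assumptions. The sketch also assumes the two-character length is written as decimal digits.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated ParserContext.
struct Context
{
  std::string text;                // the characters being parsed
  std::size_t index;               // start of the characters still to be parsed
  std::vector<std::string> stack;  // names of the rules currently being parsed

  explicit Context(const std::string& t) : text(t), index(0) {}
  void push(const std::string& rulename) { stack.push_back(rulename); }
  void pop(const std::string& rulename, bool parsed)
  {
    assert(!stack.empty() && stack.back() == rulename);
    (void)rulename;
    (void)parsed;  // a real context would record the outcome
    stack.pop_back();
  }
};

struct LLString
{
  std::string value;  // the string without its two-character length prefix

  // Returns a new LLString owned by the caller, or a null pointer if the input
  // at context.index is not a two-digit length followed by that many characters.
  static LLString* parse(Context& context)
  {
    context.push("LLString");  // must be the first call
    LLString* result = nullptr;
    const std::string& text = context.text;
    const std::size_t start = context.index;
    if (start + 2 <= text.size()
        && std::isdigit(static_cast<unsigned char>(text[start]))
        && std::isdigit(static_cast<unsigned char>(text[start + 1])))
    {
      const std::size_t length =
          static_cast<std::size_t>((text[start] - '0') * 10 + (text[start + 1] - '0'));
      if (start + 2 + length <= text.size())
      {
        result = new LLString();
        result->value = text.substr(start + 2, length);
        context.index = start + 2 + length;  // advance only on success
      }
    }
    context.pop("LLString", result != nullptr);  // must be the last call
    return result;
  }
};
```

Because context.index is written only after every check has passed, a failed parse leaves it untouched, as the contract requires.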
accept()
This is the accept method of the visitor pattern and it will simply pass the LLString to the specified visitor.
Options
aParse supports the following optional arguments.
-destdir directory | The directory in which the generated C++ files are put. |
-includedirs directories | The directories that are scanned when aParse searches for files included via $include directives. This is a comma- or semicolon-separated list. |
-namespace namespace | The C++ namespace that the generated parser belongs to. An appropriate namespace directive is added to the generated C++ files. |
-main | Instructs aParse to generate a ParserMain.cpp source file containing a main method that provides the means by which an executable can be built to test the parser. |
-visitors visitors | If the -main option has been chosen, this instructs aParse to add code to the main method to support the passing of the parsed rule to a visitor specified in the comma- or semicolon-separated visitors list. |
-trace | Instructs aParse to output a trace showing the rules being compared against the input. |
Apache Ant
aParse provides the com.parse2.aparse.AntTask class that can be invoked directly from Apache Ant.
<taskdef name="aparse" classname="com.parse2.aparse.AntTask" classpath="aparse.jar"/>
<target name="makeparser">
<aparse grammar="grammar.abnf" language="cpp" main="yes"/>
</target>
The com.parse2.aparse.AntTask class supports the setting of the following C++ related attributes.
grammar | The ABNF grammar file. |
language | Must be set to cpp to instruct aParse to generate C++ files. |
destDir | The directory in which the generated C++ files are put. |
includeDirs | The directories that are scanned when aParse searches for files included via $include directives. This is a comma- or semicolon-separated list. |
namespace | The C++ namespace that the generated parser belongs to. An appropriate namespace directive is added to the generated C++ files. |
main | When set to yes, this instructs aParse to generate a ParserMain.cpp source file containing a main method that provides the means by which an executable can be built to test the generated parser. |
trace | When set to on, this instructs aParse to output a trace showing the rules being compared against the input. |