Think Out Loud: Mynx Programming Language Design: Semantic Merge into the Data Traffic Lane

After completing and testing the individual sentence syntax objects of Mynx (with approximately 300-test cases) the next phase is what I term uniqueness and existence semantic checks -- in context within the class or program. These are the usual semantic checks in compiler texts involving the symbol table. The checks are semantic context as they are in the unit of context -- the class or program. Consider the Mynx EBNF for a class body:


CLASS_BODY := ( CLASS_METHOD | CLASS_ATTRIBUTE | CLASS_OVERLOAD | CLASS_OBVIATE 
              | CONSTRUCT | DESTRUCT | DEFAULT)*

PROGRAM_BODY := (PROGRAM_ATTRIBUTE | PROGRAM_METHOD)+

Another important construct is the module and namespace inclusion that is:


MODULE := [ module (default | NAME) ‘;’ ]
WITH   := ( with NAME [‘.’ ‘*’] ‘;’     )

The class body has 0 or more class elements, and a program body at least 1 or more program elements. The namespace elements are 0 or more. The module plus the program or class name creates an implicit namespace statement, but being optional, is again 0 or 1.

The semantic merge phase follows the sentence semantic check, and the semantic merge has two aspects:

Uniqueness - context syntax elements are unique - declaration only once.

Existence - context syntax elements exist before use - declaration before use.

Each syntax object must check as semantically unique and entered into the data structures within the symbol table before the existence checks can be performed as the overall semantic context checks.

Uniqueness - there can only be one

The uniqueness check is that syntax elements are unique. The major question of concern is: unique on what within the syntax object?

For each class element:

Attribute - unique on name identifier

Constructor - unique on type identifiers in parameter list

Default - only one

Destructor - only one

Method - unique on name identifier, parameter list, and if covariant return type identifier

Namespace - unique on sequence of identifiers

Obviate - unique on method name identifier, parameter list; if an all methods obviate, only one unique on method identifier

Overload - unique on operator

For each program element:

Attribute - unique on name identifier

Method - unique on name identifier, parameter list

For each syntax element is unique on identifier, sequence of identifiers, or operator.

After the semantic sentence check then follows the semantic merge, and the semantic context checks. In the process, the context elements are entered in the symbol table.

Symbol Table - Collection of Data in Structures

The symbol table is a composition of data structures, for each of the context elements of a class or program.

Each syntax object implements a specific interface to get the unique string for use in the symbol table. Depending upon the syntax object, and what information is to be stored, a different data structure is used.

Three data structures: ordered vector, ordered map, and ordered multimap.

Ordered Vector - elements are unique, but no association with syntax object

Ordered Map - elements are unique, but association with syntax object

Ordered Multimap - elements are unique, but association in multiple syntax objects

All three data structures are implemented from scratch (to foster portability to C++ and C#...so no specific data structures library), and are linear data structures that maintain ordering -- sorted for O(lg n) access performance (to check for duplicates). The ordered map is an ordered vector, only with a pair key to value association class, and the ordered multi map is with a tuple key to an array of values (dynamically resized if need be).

Many compiler texts spend an inordinate amount of time discussing implementation of the symbol table, but the reason is that the implementation of the symbol table impacts performance. Using object-oriented data structures hides and simplifies many of the details.

Implementation of Semantic Merge

The symbol table is within the context object -- the singleton class that contains context information about the class or program being semantically analyzed.

The merge occurs within the sentence, after the sentence semantic check, the semantic merge checks for duplicates on the sentence itself, and reports errors. If no errors, then the sentence is merged by calling a method on the symbol table.

The sentences are the driving point of the semantic merge and error reporting, the symbol table and context object act as a single point or locus for access and storage.

Existence Semantic Checks

The existence semantic checks will follow once the uniqueness semantic checks are complete. Existence is the declaration before use principle common to many of the strongly typed programming languages such as Algol, Pascal, Java, Oberon, C++.

Labels: context semantic, mynx, mynx semantic check, semantic analysis, semantic merge, symbol table

Think Out Loud: Mynx Programming Language Design

Sunday, June 24, 2007

Semantic Merge into the Data Traffic Lane - Data Structures and the Symbol Table

Uniqueness - there can only be one

Symbol Table - Collection of Data in Structures

Implementation of Semantic Merge

Existence Semantic Checks

About Me

Previous Posts