Saturday, May 19, 2007

Half a Visitor, but Full Object-Oriented Semantic Check

In implementing semantic checks, my original software design was based on the material in many object-oriented compiler texts: the visitor design pattern. I won’t restate the theory or design of the visitor pattern. It is a favorite pattern of mine (next to the singleton), but in my implementation of the Mynx compiler phase for sentence and statement semantic checks, it created more overhead than it resolved.

Once a sentence is created by the parse, it is passed the Context singleton object. The Context object stores information shared among the sentences, including error reports.


     theSentence.check(Context.getContext());


In the original, pure visitor design pattern, the sentence then called a method on the Context object and passed itself for the semantic check, à la:


public void check(Context ctx)
{
     ctx.doSemanticCheckSentence(this);
}


That is the visitor pattern, but inside the method “doSemanticCheckSentence” the sentence that was passed has to expose its information through accessor methods...the infamous getXXX and setXXX.

All the getters and setters move information back and forth from within the sentence -- I was writing more code to move data than to do the semantic checks. Encapsulation is great, but when the information is inside the object and you want to verify semantics from outside, it is a pain because you do much more data movement than semantic checking.

Then I realized that passing a Context object is useful as a central focus for semantic information, but the sentence itself should do its own semantic check. The distinction is between using the sentence as data to be processed (the full visitor design pattern) and letting the sentence check itself while passing semantic information to a central semantic nexus.
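A minimal Java sketch of the half-visitor idea might look like the following. Everything beyond `check(Context)` and `Context.getContext()` is an assumption made for illustration: the `AttributeSentence` class, its fields, the `reportError` method, and the rule that a constant needs an initial value are all hypothetical and are not taken from the Mynx compiler source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Context singleton: a nexus for error reports.
final class Context {
    private static final Context INSTANCE = new Context();
    private final List<String> errors = new ArrayList<>();

    private Context() { }

    static Context getContext() {
        return INSTANCE;
    }

    void reportError(String message) {
        errors.add(message);
    }

    List<String> getErrors() {
        return errors;
    }
}

// Hypothetical sentence: its fields stay private, with no getXXX/setXXX.
final class AttributeSentence {
    private final String identifier;
    private final boolean constant;
    private final boolean hasInitialValue;

    AttributeSentence(String identifier, boolean constant, boolean hasInitialValue) {
        this.identifier = identifier;
        this.constant = constant;
        this.hasInitialValue = hasInitialValue;
    }

    // Half visitor: the sentence checks itself and only reports to the Context.
    // The rule below (a constant needs an initial value) is an invented example.
    void check(Context ctx) {
        if (constant && !hasInitialValue) {
            ctx.reportError("constant attribute '" + identifier + "' has no initial value");
        }
    }
}

final class HalfVisitorDemo {
    public static void main(String[] args) {
        Context ctx = Context.getContext();
        new AttributeSentence("INT", true, true).check(ctx);   // passes, nothing reported
        new AttributeSentence("MAX", true, false).check(ctx);  // fails, one error reported
        System.out.println(ctx.getErrors());
    }
}
```

The point of the sketch is that the Context only ever receives results. It never has to ask the sentence for its identifier or flags, so no accessor methods are needed.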

In the bombastic hyperbole of object-oriented terminology, a message is sent to the sentence stating “sentence, do a semantic check”. The strange twist is that a design pattern, a holy grail of object-oriented development, turns an object that can do its own thing into a data element, which is back to procedural code passing data among different methods.

The Context object is a nexus to store contextual semantic information, such as alerts and error reports. Thus far, there are a total of 109 discrete semantic test cases, each with one test for success and one for failure of a single semantic check. So thus far, a total of 218 semantic checks as discrete Mynx classes and programs.

Moving from the full visitor design pattern, with the instance as data, to a half visitor, with the instance as an object, has made the semantic check implementation go faster. Had I stuck with the full visitor design pattern, I’d still be writing data movement accessor methods -- and still implementing the semantic checks. It is a cautionary tale: blindly using a design pattern can make a design more rigid and require over-decoration of a class with accessor methods just to get at the data.


Tuesday, May 08, 2007

Mynx Context Semantic Constraints on Scope and Uniqueness

In implementing the sentence semantic checks, part of the semantic processing is to verify context of syntax elements. The two primary context semantic constraints that must be verified are:


  1. unique - a syntax element is not duplicated or replicated, since a duplicate creates an ambiguity.

  2. scope - visibility of a syntax element for use within statement, method, or unit.



Unique



Uniqueness is not unexpected, but the question in semantic processing of a syntax element is “Unique on what lexical element?” For example, a class attribute is unique on the attribute identifier, but a class overload is unique on the operator.


  1. class attribute/program attribute - unique on attribute identifier.

  2. class overload - unique on operator used in overload.

  3. class method/program method - unique on method identifier and/or method parameters. For covariant method, unique on method return type identifier.

  4. class obviate - unique on method identifier and/or method parameters.

  5. class constructor - unique on parameters.
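One way to picture “unique on what lexical element?” is a table that stores one key per declaration, where the key is built differently for each kind of syntax element. The Java sketch below is only an illustration under stated assumptions: the `UniquenessTable` class and its methods are hypothetical, each kind of element is assumed to have its own namespace, and a method key is assumed to combine the identifier with the parameter types. Covariant return types and obviates are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical table of declarations already seen in one unit.
final class UniquenessTable {
    private final Set<String> seen = new HashSet<>();

    // Returns true if the key is new for this kind, false if it is a duplicate.
    // Prefixing the kind keeps attributes, methods, etc. in separate namespaces
    // (an assumption of this sketch).
    boolean declare(String kind, String key) {
        return seen.add(kind + ":" + key);
    }

    // Assumed method key: identifier plus parameter types.
    static String methodKey(String identifier, List<String> parameterTypes) {
        return identifier + "(" + String.join(",", parameterTypes) + ")";
    }

    // Constructors are unique on parameters only.
    static String constructorKey(List<String> parameterTypes) {
        return "(" + String.join(",", parameterTypes) + ")";
    }
}

final class UniquenessDemo {
    public static void main(String[] args) {
        UniquenessTable table = new UniquenessTable();

        // Attribute: unique on the attribute identifier.
        System.out.println(table.declare("attribute", "count"));  // first declaration
        System.out.println(table.declare("attribute", "count"));  // duplicate

        // Overload: unique on the operator.
        System.out.println(table.declare("overload", "+"));

        // Method: same identifier, different parameters, so two distinct keys.
        String add1 = UniquenessTable.methodKey("add", Arrays.asList("Int"));
        String add2 = UniquenessTable.methodKey("add", Arrays.asList("Int", "Int"));
        System.out.println(table.declare("method", add1));
        System.out.println(table.declare("method", add2));
    }
}
```

The check itself is the same for every element. Only the choice of key changes, which is where the language rationale shows up.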



Scope



Scope is the visibility of syntax elements within the class. Mynx uses an explicit scope control by name, not syntax position or nesting like other languages.

Mynx has two scopes:


  1. Global - complete visibility within the unit, or class/program. A global syntax element does not necessarily have to be declared before use. Order of declaration is immaterial.


     class doScope is

       public construct is to null;

       public void doSomething is
         var Int myInt to 1;
         myInt is this`INT + 2; //use class attribute although declared afterward
       end doSomething;

       private constant Int INT to 0;

     end class;


  2. Local - visibility within a method, constructor, or destructor (loosely “a method” in terms of syntax); declaration must precede use.



Within a Mynx method, declaration is required before use, but there is no block-nesting of scope. A declaration within a try-statement is therefore not limited to the block scope of the try-statement; it is visible to all statements following the declaration.


public void doMethod is

try
Int x to 0;
... rest of statements...
when(Trap)
... do something ...
end try;

x++; //x is visible throughout the method not just within try-block

end doMethod;


The big difference between global and local scope is that global scope has visibility everywhere within the block, that is, the unit. Local scope has visibility only after the point of declaration.
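The two scope rules can be sketched as two symbol sets with different fill times. The Java below is a hypothetical illustration, not the Mynx compiler's symbol table: `MethodScope`, `declareLocal`, `isLocalVisible`, and `isGlobalVisible` are invented names, and the sketch assumes the global names of a unit are collected before any method body is checked.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical scope tracker for one method body.
final class MethodScope {
    // Global names: assumed to be gathered from the whole unit up front,
    // so the order of declaration within the unit does not matter.
    private final Set<String> globals;

    // Local names: one flat set per method, with no per-block nesting.
    private final Set<String> locals = new HashSet<>();

    MethodScope(Set<String> globals) {
        this.globals = globals;
    }

    // Called when the walk over the statements reaches a declaration.
    // Returns false if the name was already declared in this method.
    boolean declareLocal(String name) {
        return locals.add(name);
    }

    // A local is visible only once its declaration has been reached.
    boolean isLocalVisible(String name) {
        return locals.contains(name);
    }

    // A global is visible anywhere in the unit.
    boolean isGlobalVisible(String name) {
        return globals.contains(name);
    }
}

final class ScopeDemo {
    public static void main(String[] args) {
        Set<String> globals = new HashSet<>(Arrays.asList("INT", "doSomething"));
        MethodScope scope = new MethodScope(globals);

        System.out.println(scope.isGlobalVisible("INT")); // declared later in the unit, still visible
        System.out.println(scope.isLocalVisible("x"));    // not yet declared

        // Entering a try-statement changes nothing: there is no nested scope to push.
        scope.declareLocal("x");
        // Leaving the try-statement changes nothing either: there is nothing to pop.

        System.out.println(scope.isLocalVisible("x"));    // visible for the rest of the method
    }
}
```

Because there is no stack of nested block scopes to push and pop, entering and leaving a try-statement or loop needs no scope bookkeeping at all in this sketch.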

For global scope, visibility everywhere allows declarations to be clustered logically together. For a method, a variable needs to be declared before use but is visible from then onward, so the developer does not have to worry about nesting scope within a statement. Both are flexible, allowing the software developer choice.

A unit (class or program) element is usable by all other unit elements. A trade-off is that some elements such as attributes or methods can be scoped in or out by name, made opaque or invisible, but without nesting the declarations.

A feature from Pascal that is mentioned in Kernighan’s “Why Pascal Is Not My Favorite Programming Language” is the inverted pyramid of declarations. This is helpful to the compiler writer for a one-pass compiler, but a royal pain for the developer. The emphasis of Mynx is to give a developer or user choice, and not to foist any particular approach on them.

Declaration before use in a method, with visibility through the rest of the method, is a trade-off. As a method is the core of making a class function, using variables without declaring them first would be very flexible, but would not be type-safe at compile time.

In short, as Mynx is a strongly typed programming language, types must be checked at compile time, and declaration before use facilitates that. Also, using a variable and then declaring it afterward is counter-intuitive. I’ll avoid a digression into dynamically typed languages or dynamic variable languages, but dynamic typing and usage is powerful for a language that needs those features as part of the language rationale.

The visibility afterward for all statements within a method is to avoid the frustration of scoping within multiline statements, something that drives me nuts in C++ and Java, especially when I’m adding code and the variable is not visible outside the scope of a try-statement or loop statement.

The rules for scope and uniqueness are needed to implement the semantic checks, but the rules come from the rationale: the how’s and why’s of the Mynx language. Some are for compiler design and development, others are from my own experience as a software developer. Implementing the semantic checks has forced me to look at Mynx and ask the universal question of “why?” -- something a language user often does not ask, accepting instead “the way it is.” Now, as a language designer, I am sitting on the opposite side of the programming language table, and simply reusing the uniqueness and scope rules of existing languages would be accepting “the way it is.” Mynx is not to be another Java/C++/C# clone.


Monday, May 07, 2007

Mynx Sentential Semantic Check - I Check, Therefore Semantics - Alone

In implementing the semantic checks for Mynx, particularly for the class attribute, I’ve had to re-do and re-design (refactor) the original code. Code, like prose, is rarely effective the first time, so it requires re-thinking and re-writing.

The effective solution to implement semantic checks is guided by some heuristics, or rules of thumb:

  1. Each semantic check is responsible for determining whether it should check (a generic class attribute semantic check would not execute if the class attribute is an integer constant).


  2. Each semantic check only checks one thing, so one error message is reported.


  3. Each semantic check is a method in the class, called from a central semantic check method after the sentence is parsed.


The third heuristic is the more object-oriented approach: each sentence does its own semantic check. The original implementation passed a sentence via the visitor design pattern to a class with the semantic checks, but that required many query and access methods to be added to the sentence. The difficulty was that information hiding and encapsulation created lots of data movement. It is more effective for the semantic check to be internal to the sentence, and to pass a semantic context object as the locus of information.

In the original semantic checks, I found my original test cases (23 semantic checks overall, each tested for both failure and success, so 46 semantic checks) were causing one semantic error to squelch another, or creating a spurious error: one error causing another through the code tangling within the semantic method. The state of a method can cause anomalous semantic errors.

The simplification into individual methods that are mutually exclusive and single purpose has avoided the problem of spurious and suppressed semantic errors. The order of execution, and the other methods called during the semantic checks, do not affect any individual semantic check or interact with it in any other way. Later, as I implement higher-level semantic checks, it is vital that there are no side effects.
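The three heuristics can be sketched in Java as a sentence with one central `check` method that calls small, independent check methods. This is an illustration under assumptions rather than the actual Mynx code: the `SemanticContext` and `ClassAttributeSentence` classes, their fields, and both example rules are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical semantic context: the locus for error reports.
final class SemanticContext {
    private final List<String> errors = new ArrayList<>();

    void report(String message) {
        errors.add(message);
    }

    List<String> getErrors() {
        return errors;
    }
}

// Hypothetical class attribute sentence that checks itself.
final class ClassAttributeSentence {
    private final String type;
    private final String identifier;
    private final boolean constant;
    private final String initialValue; // null when no value was given

    ClassAttributeSentence(String type, String identifier, boolean constant, String initialValue) {
        this.type = type;
        this.identifier = identifier;
        this.constant = constant;
        this.initialValue = initialValue;
    }

    // Heuristic 3: a central method calls each check after the parse.
    void check(SemanticContext ctx) {
        checkConstantHasValue(ctx);
        checkIntConstantIsNumeric(ctx);
    }

    // Invented rule: a constant must be given a value.
    private void checkConstantHasValue(SemanticContext ctx) {
        if (!constant) {
            return; // Heuristic 1: the check decides it does not apply.
        }
        if (initialValue == null) {
            ctx.report("constant '" + identifier + "' has no value"); // Heuristic 2: one error.
        }
    }

    // Invented rule: an Int constant's value must be an integer literal.
    private void checkIntConstantIsNumeric(SemanticContext ctx) {
        if (!constant || !"Int".equals(type) || initialValue == null) {
            return; // Applies only to Int constants that actually have a value.
        }
        if (!initialValue.matches("-?\\d+")) {
            ctx.report("constant '" + identifier + "' is not an integer literal");
        }
    }
}

final class SemanticCheckDemo {
    public static void main(String[] args) {
        SemanticContext ctx = new SemanticContext();
        new ClassAttributeSentence("Int", "INT", true, "0").check(ctx);   // no error
        new ClassAttributeSentence("Int", "MAX", true, null).check(ctx);  // missing value only
        new ClassAttributeSentence("Int", "MIN", true, "abc").check(ctx); // bad literal only
        System.out.println(ctx.getErrors());
    }
}
```

Each check reads only the sentence's own fields and keeps no state between calls, so reordering the calls in `check` reports the same errors, and a missing value cannot trigger a spurious second error about the literal.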

