Thursday, December 28, 2006

The Turing Tarpit or All Things Great are Small?!? (in a Programming Language Design)

A recent question in an e-mail was indicative of the “less is more” mentality in programming languages. The question: “I've noticed that a large percentage of your commands are various loop types; what is your argument for having so many very specific types of loops?” The implication was that somehow this was wrong or bad. Minimalism seems to have become a de facto “better” approach in programming language design.

Don’t get the wrong impression: I understand and believe that minimalism is important at the right place and the right time -- but not as an end unto itself. For example, I was the lead author of a computer architecture book in which the processor architecture had only one instruction -- the ultimate in “less is more”; but taking minimalism to the ultimate RISC processor had some interesting benefits and features. Minimalism was not the goal in and of itself.

“Less is more” reminds me of a children’s book, “The Secret World of Og” by Pierre Berton (I remember seeing it as a television series way back in 1983 when I was a kid). In the book, the inhabitants of the world of Og had only one word -- “og.” So how og was said, and how frequently, could mean anything from a joke to an insult. The less is more of linguistics, or in Ogish speak... “og...og...og.”

From the converse viewpoint of maximalism, or “more is less,” too many choices preclude any actual choice. That very idea is the theme of the book “The Paradox of Choice: Why More Is Less” by Barry Schwartz. That seemingly makes the justification for fewer choices, not more. But fewer is not necessarily better either. We’ve all heard the adage about a choice between two options, “the lesser of two evils” -- but the lesser is still evil. In short, an insufficient number of choices is a Morton's Fork.

Less is more becomes a Hobson’s choice -- no choice at all. More is less and less is more both ultimately come full circle, leading to insufficient choice and reaching a “diminishing margin of returns” (to borrow a phrase from economics).

More specifically for programming language design, less is more leads to the anathema of the Turing tarpit. To quote from a blog: “What is the Turing tar-pit? It's the place where a program has become so powerful, so general, that the effort to configure it to solve a specific problem matches or exceeds the effort to start over and write a program that solves the specific problem.


(This is especially dangerous for programming language designers. There's an irresistible urge to reduce a language to the smallest, most elegant core of axioms.)”


Where does the notion, and the implicit acceptance, of minimalism as “less is more” originate? A rhetorical question, it seems -- thoughtlessness and shallowness are intellectually lazy. Elegance in a programming language seems to be an academic mindset, almost equating a programming language with a notation for expressing thought, like some concise, neat, tidy, and elegant (there’s that word again) mathematical proof or theorem. (And yes, I know about Ken Iverson’s language APL and his lecture “Notation as a Tool of Thought.” APL was the ultimate write-only language, the 1960s version of modern Perl scripting. The name APL = A Programming Language, a minimalist programming language name, but uninspired, like calling a kitten “cat.”)

So my question in retort is: why not have just a pre-conditional loop (while), and let the programmer codify the specific loops -- counted (for), infinite (loop), and post-conditional (do-while)? One loop, and let the programmer create the specific loop kinds.

Instead of loops or iterative statements, let’s digress into colors for paints. The less is more approach to colors would be black...like Henry Ford said about the colors for the Model T Ford, “You can have any color so long as it’s black.” A painter’s canvas is white, so black is the only color needed. Imagine the great works of art painted in this enlightened philosophy of less is more -- no grey, just black.

Now imagine the other extreme, zillions of colors to choose from. You can get periwinkle blue or sky blue, raven black or midnight black, bone white or bleach white. Even choosing a color would be a major ordeal -- do you pick marigold yellow, or lemon yellow, or sunrise yellow, or...hopefully you see the idea.

The middle road, “enough is enough,” is to provide a basic palette of colors -- red, orange, yellow, blue, indigo, violet, black, and white. Enough colors to paint with, but not so many that the painter is befuddled by hundreds of hues and pigments for one specific color.

Going back to language design, I agree that too much is bad, and so is too little. But the important principles are expressivity, to express codified thought in code, and giving the programmer choice. Both are lost when a language designer forces a programmer to choke on having to implement the other varieties of an entity that is given in only one form. I can remember some of the kludge code needed to get an early exit from a while-loop in Pascal; there was no early exit or escape mechanism like in C. Both expressivity and choice were lost. (Kernighan talks about this in “Why Pascal is Not My Favorite Programming Language,” which I read later as a computer science student...it captured what I’d thought but never formalized...)
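To make the Pascal kludge concrete, here is a rough sketch in Python (standing in for both languages, since this post shows no Pascal or Mynx source). The function names and the "find the first negative number" task are made up for illustration. The first version threads a flag through the loop condition because it has no early exit; the second simply breaks.

```python
def first_negative_with_flag(values):
    """No early exit available: a 'found' flag is carried in the loop condition."""
    index = 0
    found = False
    result = -1  # -1 means "no negative value present" (an assumed convention)
    while index < len(values) and not found:
        if values[index] < 0:
            found = True
            result = index
        else:
            index += 1
    return result


def first_negative_with_break(values):
    """Early exit available: the intent is stated directly at the exit point."""
    result = -1
    index = 0
    while index < len(values):
        if values[index] < 0:
            result = index
            break
        index += 1
    return result
```

Both return the same answer; the difference is how much bookkeeping the reader has to trace to see where the loop stops.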

So in Mynx, as the programming language designer, I give choice to the programmer, and allow some variety to focus on expressivity rather than some elegant minimalism. The iterative statements are:

1. finite: for
2. infinite: loop
3. pre-conditional:
    a. true:  while
    b. false: until
4. post-conditional:
    a. true:  repeat-while
    b. false: repeat-until

All iterative statements in Mynx have the ability to skip (the next statement) or break (the exit statement). Again, it is a choice for the programmer, and it is expressive in code: the programmer explicitly indicates a skip or an exit on a specific condition.
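As a rough illustration of the "one loop only" alternative, here is what a programmer has to write by hand when a bare while is all there is. This is a Python sketch, not Mynx syntax, and the helper names are hypothetical. It rebuilds a counted loop, a pre-conditional until, and a post-conditional repeat-until from while alone.

```python
def counted(n):
    """finite (for): visit 0..n-1 with a hand-managed counter."""
    seen = []
    i = 0
    while i < n:
        seen.append(i)
        i += 1
    return seen


def until_loop(start, limit):
    """pre-conditional, false form (until): test first, run while the condition is not yet true."""
    seen = []
    i = start
    while not (i >= limit):
        seen.append(i)
        i += 1
    return seen


def repeat_until(start, limit):
    """post-conditional (repeat-until): the body runs once before the first test."""
    seen = []
    i = start
    while True:
        seen.append(i)
        i += 1
        if i >= limit:
            break
    return seen
```

The pre- and post-conditional forms differ only when the condition already holds on entry: `until_loop(5, 3)` does nothing, while `repeat_until(5, 3)` still runs its body once. That distinction is exactly what a dedicated statement states up front and a hand-rolled while hides in its structure.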

The programming language designer has to think like the user of their language, like an artist who will paint a canvas. In short, Mynx is designed with a variety of iterative statements to give choice to the programmer and allow for rich expressivity in the language. Rather than provide a chalk pencil for a black or white image, Mynx has a full but finite palette of colors for the programmer to choose from. I as the language designer should not be making the choice for a future Mynx programmer/software developer.


Wednesday, December 20, 2006

Not So Primitive in Types in Programming Languages

The Mynx programming language has been inspired by the design approach of the C (yes C...) programming language. In that approach, the core language itself does not have built-in functionality (like println or readln in Pascal, or tasks in Ada95), but uses libraries written in the language.

C did have one element integrated as part of the language, and that was the basic primitive types (such as int, unsigned, long, char). The Java programming language, an object-oriented descendant of C, has the same concept of the primitive types built into the programming language. Mynx does not. Why?

Primitive types integrated into the programming language cause a problem in object-oriented programming languages -- boxing and unboxing. If a primitive type is also defined in the programming language as a class, a mechanism is needed to convert between object instances and the primitive types. Java, it seemed, had this problem until recently. One of the newer languages, C#, defined primitives in the language together with a mechanism to convert between the primitive and the struct (C# defines primitives as value types or structs, not as classes).

Essentially a primitive type represents naked data, whereas the class definition is the naked data clothed in functionality and state. (Makes you wonder why boxing and unboxing weren't called strip and wrap...)

Another difficulty with primitive types is in parameter passing, or the distinction between primitive (naked data) and reference (wrapped data) types. A nasty "gotcha" in Java is passing a primitive type versus passing an object reference. Everything in Java is passed by value, but for an object the value that is copied is the reference (the address), whereas for a primitive type a copy of the data itself is passed.

Mynx avoids this entire scheme by simply not defining primitives as part of the language. The fundamental data types are:


  1. integer - signed integral numbers from a bit to a long

  2. ordinal - unsigned integral numbers from a bit to a long

  3. real - non-integral numbers, including a float

  4. char - character values

  5. string - sequence of characters

  6. bool - logical true or false

  7. bignum - big decimal and integer numbers



The fundamental data types are defined with Mynx classes, in Mynx, not as "auto-magically" provided types in the programming language. Everything is an object, so no "gotcha" with primitive types, and no boxing or unboxing.

Designing the basic types as classes raises some issues that make the reason for putting types in the language more apparent, but those issues (such as a literal type versus a variable type) are for a future blog entry.


Thursday, December 07, 2006

Doh! Tweaking the XSchema for MOXI files

The original XSchema for a MOXI file needs three tweaks...


  1. Add version attribute

  2. Add default method tag

  3. Remove supers attribute

Version


A version attribute is needed so that different versions of the Mynx compiler doing external semantic checks can be sure they are working with a compatible external XML file -- the MOXI file. The version attribute needs to be a decimal number that is greater than or equal to 1.0. (That is, if the revised version of the XSchema is 1.0 and not 1.1...).
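A minimal sketch of that check, in Python rather than in the XSchema itself. The function name and the accepted text form (plain digits with an optional fractional part) are my assumptions; the only rule taken from above is "a decimal number greater than or equal to 1.0."

```python
import re
from decimal import Decimal

MIN_VERSION = Decimal("1.0")  # floor stated in the text


def parse_moxi_version(text):
    """Return the version as a Decimal, or raise ValueError if it is unusable.

    Hypothetical helper: accepts only forms like '1', '1.0', '2.15'.
    """
    if not re.fullmatch(r"\d+(\.\d+)?", text):
        raise ValueError(f"not a decimal version: {text!r}")
    version = Decimal(text)
    if version < MIN_VERSION:
        raise ValueError(f"version {version} is below {MIN_VERSION}")
    return version
```

Decimal is used instead of float so that a version such as 1.10 keeps its digits exactly as written in the file. How a compiler decides which versions are compatible with it is a separate policy that this sketch does not attempt.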

Default Method

Similar to a destructor tag, a default method tag needs to be incorporated into the XSchema. The default tag specifies the default method of the class, provided there is one in the class. The example syntax:

   <default method="doIt"/>

Supers Attribute

The supers attribute indicates the number of superclasses used by the class. This is redundant, as the clustering of information has a count attribute indicating the number of superclasses.

Doh!


In creating the original XSchema for a MOXI XML semantic information file, some information was missed. How did I realize I'd omitted information? As I'm implementing semantic checks, I'm implementing the semantic information stored in the traditional "symbol table" of a compiler. Part of the symbol table includes information stored about specific elements of a class -- such as superclasses, and a default method.


Monday, December 04, 2006

A scheme with moxie in Mynx - MOXI XSchema

A Mynx Object XML Interface or MOXI file is an XML file that provides a class interface, but by using XML avoids the specific high-level language dependencies.

A Mynx class is compiled into a binary, but during the semantic checks for external classes, information about the class interface is needed, such as its methods, superclasses, attributes, and operator overloads. The binary form that is generated is platform specific; to avoid making the compiler depend on accessing and reading that particular binary form (such as reading the information in a .NET assembly), a neutral markup is used -- a MOXI file. A MOXI file stores the external semantic information about a Mynx class.

As a MOXI file is XML, its structure and organization can be declared and defined using a schema or XSchema. After pondering and careful consideration, I have created and tested the initial version of the XSchema for a MOXI XML file -- moxi.xsd.

XSchema/MOXI File Design



The MOXI XSchema uses the “Venetian Blind” approach to structuring the elements and types used in the MOXI file definition. Essentially each major element is organized into its own cluster of elements, with an attribute in the root element indicating the number.

The “Venetian Blind” approach is used for XSchema flexibility and, if need be, to accommodate later changes to the types and elements in a MOXI file. Each major element is a cluster and has a corresponding type, and the complex types use simple types for more basic information such as attribute kind or class mode.

A simple MOXI file example with each of the major elements clustered:


<?xml version="1.0" encoding="UTF-8"?>
<class supers="0" mode="abstract" module="String" name="String"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <attributes count="1">
    <attribute type="String" mode="default" access="public" name="String"
               rank="0" kind="constant" isgeneric="true"/>
  </attributes>

  <constructs count="1">
    <construct access="public">
      <arg type="String" mode="in" rank="0" isgeneric="true"/>
    </construct>
  </constructs>

  <destructs access="public"/>

  <methods count="1">
    <method type="String" mode="default" isvoid="true" access="public"
            name="doIt" rank="0" isgeneric="false">
      <arg type="String" mode="in" rank="0" isgeneric="false"/>
    </method>
  </methods>

  <overloads count="0"/>

  <superclasses count="1">
    <superclass module="mynx.core" name="Void"/>
  </superclasses>
</class>


The design idea is that when importing the XML MOXI file for a particular class, the information is presented consistently and locally to each cluster of information. Each cluster of elements is read along with its count, so the data structure (such as an array) that holds the information can be created knowing up front how many elements are present. An empty cluster has a count of zero and uses an empty tag -- but the count attribute and the root tag are always present, so the layout is consistent regardless of the information contained in the MOXI file.

A MOXI file is read by a MOXIFileReader into a MOXIObject. Both are classes in the native implementing high-level language (so they are optimal for the particular platform Mynx is implemented on), but they read a platform-neutral MOXI XML file.
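Here is a rough sketch of reading one cluster by its count, using Python's standard xml.etree.ElementTree. The real MOXIFileReader and MOXIObject are not shown in this post, so the function name `read_cluster`, the use of plain dictionaries for the items, and the trimmed-down sample document are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the sample MOXI file (assumed, for illustration).
SAMPLE = """
<class mode="abstract" module="String" name="String">
  <methods count="1">
    <method type="String" mode="default" isvoid="true" access="public"
            name="doIt" rank="0" isgeneric="false">
      <arg type="String" mode="in" rank="0" isgeneric="false"/>
    </method>
  </methods>
  <overloads count="0"/>
  <superclasses count="1">
    <superclass module="mynx.core" name="Void"/>
  </superclasses>
</class>
"""


def read_cluster(root, cluster_tag, item_tag):
    """Read one cluster, sizing the result from its count attribute first."""
    cluster = root.find(cluster_tag)
    if cluster is None:
        raise ValueError(f"missing cluster <{cluster_tag}>")
    count_text = cluster.get("count")
    if count_text is None:
        raise ValueError(f"<{cluster_tag}> has no count attribute")
    declared = int(count_text)
    items = [None] * declared  # storage is allocated before any item is read
    found = cluster.findall(item_tag)
    if len(found) != declared:
        raise ValueError(
            f"<{cluster_tag}> declares {declared} items but contains {len(found)}"
        )
    for position, element in enumerate(found):
        items[position] = dict(element.attrib)
    return items


root = ET.fromstring(SAMPLE)
methods = read_cluster(root, "methods", "method")
overloads = read_cluster(root, "overloads", "overload")
superclasses = read_cluster(root, "superclasses", "superclass")
```

An empty cluster such as overloads goes through the same code path and comes back as an empty list, which is the consistency the count attribute is meant to buy.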

MOXI Tool (File => Tool in Mynx)



MOXI files can be stored and contained in a ZIP (or other) archive file, or in a directory specifically for them. In Mynx, each file type has a corresponding tool. So a .mynx file has a ‘mynx’ tool -- the compiler. For a MOXI file, there is a corresponding tool ‘moxi.’

The MOXI tool will allow a MOXI file to be read, queried for information, and inspected. A visual tool might read in an archive of all the MOXI files it contains and show them in a GUI -- a basic object browser.

A runtime MOXI tool interface can allow for basic reflection and class introspection -- independent of the implementing high-level language. Effectively, the method calls to a reflection library load a MOXI file and programmatically read, query, and inspect information about the class -- albeit in a safe, read-only form. Dynamically querying and invoking a method, though, is one of the powers of reflection.

The MOXI XSchema is also (possibly...) the basis for the Mynx documentation Myxd XSchema, perhaps by extending the XSchema to use comments or annotations. Using XML has the advantage of transformation into HTML, PDF, and other formats from one core document.

Tools Used to Create XSchema



Some of the tools I used include Microsoft XML Notepad (very nice) and the XML Schema Inference (way cool -- helped me to think about the XSchema). Others are a schema quality checker tool from IBM, and a validator for an XML document against an XSchema for the Mac (gasp!...yes I have a Mac).
