21 April 2010

What I Would Like in a Programming Language (Part 1)

Compiler Intrinsics

Intrinsic functions and properties are wonderful things. All the decent C family languages have intrinsic functions like sizeof and alignof, but sometimes you need support for more. While a language designer can try to think of every possible need and expose a good intrinsic for all potential future requirements, this is ultimately a losing battle for pretty obvious reasons. It would be really great if a user could extend the intrinsic properties of the compiler with their own domain-specific needs. I am imagining the compiler gets something like an extension sheet along with the source code so that these properties can be added to the system quite trivially. It would have make it so that people could extend the compiler without actually having to recompile the compiler -- it is the culture of extension that you really care about. Of course, I have no idea how the implementation would work, but it would be nice to have.

Ultimately, this could give a system like C++ type traits that do not feel like a complete hack. Of course, C++ type traits are extremely powerful, but frankly, they were never meant to do what they now do and, of course, just don't feel right. If concepts were not dropped from the C++0x standard, we could be about halfway to a cleaner solution; as it stands, we are stuck with using type traits.

To run completely away with the idea, something on the order of having a Lisp-like macro system where you have an extra program which spits out some abstract syntax tree from the input would be totally awesome. Okay, so this feature already exists in perfect form in Lisp, but I would love to see it in other languages as well.

Unit Testing

QuickCheck

Haskell is a wonderful place to draw examples from. A framework like QuickCheck is just awesome. Because let's face it: Nobody likes writing unit tests. Now, I'm not saying that they are completely unnecessary, just that they are are pain in the ass to write. Say you have a function with the signature sqrt(x : real) : real.

If you wanted to write some unit tests for this function, you would pound away at some known values of various square roots. For brevity, I'll eliminate specifying some range of results that we consider "valid."

assert_equal(sqrt(4), 2)
assert_equal(sqrt(100), 10)
assert_equal(sqrt(2), 1.4142135)
assert_failure(sqrt(-1))

Okay, that's halfway decent and pretty clear what I mean. But let's face it: out of the limited representational power of a real (whatever that may be), I am testing a very pathetic subset of all the possibilities. There might be negative values that somehow work or positive ones that do not - which is especially probable for very small or very large numbers. What would be nice is something with a signature like this:

sqrt(x : real @AcceptableRange(0 .. INFINITY))
: real @ResultCheck(r => (r * r) == x)

So the syntax is not completely readable, but the idea is that we are attaching some annotations to the parameter and the result of the function. These can be processed by whoever might need them, similar to the use of .NET attributes and Java annotations. However, I have stretched the allowable syntax to whatever you want -- in this case, a ranged primitive and a lambda function. The possibilities are endless! Compliers for the language doing static checking could find failures before they happen and IDEs could assist people with their problems.

Better yet, we could use...

Static Type Checking

I am a huge believer in static type checking. From the example above, things like just having a type modifier called unsigned makes exceptional conditions of the sqrt function impossible, which is really nice, since this is what you actually mean. Potential errors due to negative numbers are eliminated at compile time, because you simply cannot compile when there is a chance for an error. Compiler-enforced consistent behavior is an awesome thing.

User-Definable Primitive-looking Types

Let's say you are writing a math function that has the signature rotate(thing : Shape, angle : Real). This function rotates the Shape called thing by angle radians. Oh, you could not tell that I use radians by the method signature? That's a problem...

If we were doing some C++, we might have lines like:
typedef float Radians;
typedef float Degrees;

So the signature would look like: void rotate(Shape* thing, Radians angle). Now the method signature tells you which kind of unit you are using. Of course, the problem here is obvious: since Radians and Degrees are actually the same type, we are free to convert between the two and the compiler will not actually care that there is a difference (because, as far as it is concerned, there is not a difference).

So how can you make the compiler care? In C++, this is notoriously difficult (although possible). Once again, let us pretend that there is something perfect out there for me that looks like this:
type Radians is real range 0 .. 2 * PI
type Degrees is real range 0 .. 360

And then one could specify that there exists a scalar conversion between the two units:
conversion Radians <=> Degrees is implicit direct

That looks a little funny, but stay with me for a second. The <=> token means that the conversion is two-way. After that, I added some words just to show that my dream is really powerful. implicit means that there is an implicit conversion (as opposed to an explicit one) - the compiler is allowed to freely convert the units between each other (assuming it follows the rules of conversion). Then, direct is just a way to specify that there are no fancy conversion rules: the compiler just figures out that 180/PI is a good conversion factor for the number specified.

var angle = 180 degrees
rotate(thing, angle)

This is kind of along the lines of Ada with a little bit of extra conversion logic. A convenient system like this would be great for certain NASA contractors.



I've decided to split this incredibly long post up...

No comments:

Post a Comment