Programming language design: natural != simple
June 11th, 2009 1:41:49 pm pst by Sterling Camden
Or should that be natural <> simple?
Ever since the introduction of the first programming languages that didn’t map directly to machine instructions, the debate has continued over how closely programming languages should mimic human languages, especially English. COBOL was the first language (AFAIK) that attempted to allow programmers to write code as semi-English statements. For example:
SUBTRACT AR-TOTAL-BALANCE FROM AR-CREDIT-LIMIT GIVING AR-REMAINING-CREDIT.
Very English – or rather American (including the shouting). As opposed to the much more concise, mathematical, yet somewhat less self-documenting Fortran equivalent:
Z = X - Y
Note how the COBOL statement ends with a period, just like an English sentence – while the Fortran statement mimics a mathematical formula (though it isn’t — it’s an implied LET) by having no terminal punctuation. Because COBOL statements tend to be very wordy, the period actually comes in handy. It allows you to continue a statement over more than one line, because the period clearly marks the end of a statement. But it’s also a curse. It’s easy to forget, because the casual reader can usually tell by context where the end of a statement should be. So it becomes a Simon Says rule – one that punishes the programmer more than it helps.
Back when I used COBOL, if you forgot a period on a COBOL statement, every statement in the rest of the program would generate a compilation error. At the end of a college semester, after waiting days to get back the printout from a job you submitted to the queue, you’d often find that your 1345 compilation errors were all due to one missing dot. Then you’d resubmit the corrected version, and wait two more days to find where the next one was missing. I once remarked to a female CS student, “Life is like a COBOL program – miss a period and you’ve had it.”
But without the period, the problem remains of how to continue a statement over more than one line. Fortran and other languages (Synergy/DE and VB included) have resorted to the unnatural practice of placing a continuation character in a specific place. In VB, it’s an underscore at the end of the first line. In Synergy/DE (DIBOL), it’s an ampersand as the first non-whitespace character on the second line. Languages in the Algol tradition (including Pascal, C, Java, C#, etc.) require a terminating semi-colon to delimit statements. This not only allows for ease of continuation, but it also enables having more than one statement on a single source line. Lisp requires matching parentheses, which has the same benefits. Unfortunately, all of these punctuation marks are just as easy to omit as the COBOL period.
Scripting languages like Python, Ruby, and JavaScript have taken to the practice of inferring the end of a statement or its continuation, while still allowing punctuation (e.g., semi-colons, matching parentheses, or the explicit \ continuation in Python) for those who want to be explicit. (Perl, by contrast, still requires a semi-colon between statements.) This seems very natural. “Hey, you forgot to punctuate? Don’t worry about it – I know what you meant,” the interpreter seems to say.
Unfortunately, as in human languages, “I know what you meant” doesn’t guarantee that you actually do know what I meant. Consider the following Ruby example:
1: def big(frack,n,deal)
2: n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) + 1
3: end
4:
5: print big(6,2,0.25)
This prints 97.13, given those parameters to the function “big”. Now, being the good command-line-oriented programmer that I am, I’d like to continue line 2 so it doesn’t exceed the 80th column:
1: def big(frack,n,deal)
2: n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal)
3: + 1
4: end
5:
6: print big(6,2,0.25)
Now the result is 1. What the frack? Both line 2 and line 3 are syntactically valid in Ruby, because any complete expression can function as a statement. Because they can both be interpreted as independent statements, Ruby does. And since Ruby functions return the last expression evaluated, we get “+ 1”. Adding a semi-colon after the “+ 1” doesn’t help, because Ruby infers one at the end of line 2 if it can. You have to break the line at a different point, such as immediately following an operator, to make the end of the line an invalid end of statement. (I tried enclosing the entire statement in parentheses, and I still get 1! Go figure. It turns out Ruby allows several newline-separated expressions inside one pair of parentheses, so the line break still ends the first one.)
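Here is a minimal sketch of the difference, using the same expression. The method names big_broken, big_trailing_op, and big_backslash are made up for illustration. The backslash form is not mentioned above; it is Ruby's explicit line continuation, which works much like the Python one.

```ruby
# Broken: the first line is a complete expression, so Ruby ends the statement
# there and treats "+ 1" as a separate statement (unary plus applied to 1).
def big_broken(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal)
  + 1
end

# Trailing operator: the first line cannot be a complete statement,
# so Ruby keeps reading onto the next line.
def big_trailing_op(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) +
    1
end

# Explicit continuation: the backslash tells Ruby the statement is not finished.
def big_backslash(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) \
    + 1
end

puts big_broken(6, 2, 0.25)                 # prints 1
puts big_trailing_op(6, 2, 0.25).round(2)   # prints 97.13
puts big_backslash(6, 2, 0.25).round(2)     # prints 97.13
```

Of the two fixes, the trailing operator is probably the easier one to spot when reading, since the dangling "+" makes it obvious that the statement isn't done yet.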
JavaScript, on the other hand, doesn’t suffer this same symptom even though it also will infer a missing semi-colon. Why? Even though “+ 1” as a statement in JavaScript doesn’t cause an error, it also doesn’t do anything. You have to explicitly “return” a value from a function. So JavaScript can be greedy about combining statements until it can’t, while Ruby non-greedily treats end of line as a statement terminator if it can. Both of these seem to follow the Principle of Least Surprise (POLS) for their syntax, except when they don’t.
This wouldn’t be such a big frack’n deal except for the fact that instead of getting an error you just get an unexpected result. That’s a lot more insidious than 1345 COBOL compilation errors, and it underscores the need for near-complete code coverage in your tests for applications written in these languages.
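As a rough sketch of what such a test might look like, here is a hand-rolled check against the known-good value of 97.13. The close? helper is hypothetical, not part of any library, and a real project would more likely use a test framework. The point is only that the interpreter stays silent, so the test has to do the complaining.

```ruby
# Hypothetical helper: true when two numbers agree within a small tolerance.
def close?(expected, actual, tolerance = 1e-9)
  (expected - actual).abs <= tolerance
end

# The badly continued version of "big": it runs without error but returns 1.
def big(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal)
  + 1
end

result = big(6, 2, 0.25)
if close?(97.13, result)
  puts "ok"
else
  puts "FAIL: expected 97.13, got #{result}"
end
# prints: FAIL: expected 97.13, got 1
```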