Chip's Quips
A tiny spark of wit for a highly flammable world

Programming language design: natural != simple

June 11th, 2009 1:41:49 pm pst by Sterling Camden

Or should that be natural <> simple?

Ever since the introduction of the first programming languages that didn’t map directly to machine instructions, the debate has continued over how closely programming languages should mimic other human languages, especially English.  COBOL was the first language (AFAIK) that attempted to allow programmers to write code as semi-English statements.  For example:

SUBTRACT AR-TOTAL-BALANCE FROM AR-CREDIT-LIMIT GIVING AR-REMAINING-CREDIT.

Very English – or rather American (including the shouting).  As opposed to the much more concise, mathematical, yet somewhat less self-documenting Fortran equivalent:

Z = X - Y

Note how the COBOL statement ends with a period, just like an English sentence – while the Fortran statement mimics a mathematical formula (though it isn’t — it’s an implied LET) by having no terminal punctuation.  Because COBOL statements tend to be very wordy, the period actually comes in handy.  It allows you to continue a statement over more than one line, because the period clearly marks the end of a statement.  But it’s also a curse.  It’s easy to forget, because the casual reader can usually tell by context where the end of a statement should be.  So it becomes a Simon Says rule – one that punishes the programmer more than it helps.

Back when I used COBOL, if you forgot a period on a COBOL statement, every statement in the rest of the program would generate a compilation error.  At the end of a college semester, after waiting days to get back the printout from a job you submitted to the queue, you’d often find that your 1345 compilation errors were all due to one missing dot.  Then you’d resubmit the corrected version, and wait two more days to find where the next one was missing.  I once remarked to a female CS student, “Life is like a COBOL program – miss a period and you’ve had it.”

But without the period, the problem remains of how to continue a statement over more than one line.  Fortran and other languages (Synergy/DE and VB included) have resorted to the unnatural practice of placing a continuation character in a specific place.  In VB, it’s an underscore at the end of the first line.  In Synergy/DE (DIBOL), it’s an ampersand as the first non-whitespace character on the second line.  Languages in the Algol tradition (including Pascal, C, Java, C#, etc.) require a terminating semi-colon to delimit statements.  This not only allows for ease of continuation, but it also enables having more than one statement on a single source line.  Lisp requires matching parentheses, which has the same benefits.  Unfortunately, all of these punctuation marks are just as easy to omit as the COBOL period.

Scripting languages like Perl, Python, Ruby, and JavaScript have taken to the practice of inferring the end of a statement or its continuation, while allowing punctuation (e.g., semi-colons, matching parentheses, or the explicit \ continuation in Python) in order to be explicit.  This seems very natural.  “Hey, you forgot to punctuate?  Don’t worry about it – I know what you meant,” the interpreter seems to say.

Unfortunately, as in other human languages, “I know what you meant” doesn’t guarantee that you actually do know what I meant.  Consider the following Ruby example:

   1: def big(frack,n,deal)
   2:   n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) + 1
   3: end
   4:
   5: print big(6,2,0.25)

This prints 97.13, given those parameters to the function “big”.  Now, being the good command-line-oriented programmer that I am, I’d like to continue line 2 so it doesn’t exceed the 80th column:

   1: def big(frack,n,deal)
   2:   n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal)
   3:       + 1
   4: end
   5:
   6: print big(6,2,0.25)

Now the result is 1.  What the frack?  Both line 2 and line 3 are syntactically valid in Ruby, because any complete expression can function as a statement.  Because they can both be interpreted as independent statements, Ruby does.  And since Ruby functions return the last expression evaluated, we get “+ 1”.  Adding a semi-colon after the “+ 1” doesn’t help, because Ruby infers one at the end of line 2 if it can.  You have to break the line at a different point, such as immediately following an operator, to make the end of the line an invalid end of statement.  (I tried enclosing the entire statement in parentheses, and I still get 1!  Go figure.)
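To make the trap and its two usual workarounds concrete, here is a minimal sketch. The three method names are mine, invented for illustration; the arithmetic is the same as in “big” above. One version reproduces the bad break, one breaks the line after an operator, and one uses an explicit backslash continuation.

```ruby
# The trap: the first line is already a complete expression, so Ruby ends the
# statement there. "+ 1" becomes a separate statement (unary plus) and is
# what the method returns.
def big_broken(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal)
  + 1
end

# Workaround 1: end the line with the operator. A trailing "+" cannot end a
# statement, so Ruby keeps reading onto the next line.
def big_trailing_operator(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) +
    1
end

# Workaround 2: an explicit backslash continuation. The backslash must be the
# very last character on the line.
def big_backslash(frack, n, deal)
  n * ((frack - 1) / deal * 2.2345) + (deal * 12) - (frack / 3) + (frack - deal) \
    + 1
end

puts big_broken(6, 2, 0.25)
puts big_trailing_operator(6, 2, 0.25)
puts big_backslash(6, 2, 0.25)
```

The first call returns 1 and the other two return the full sum. As far as I can tell, wrapping the expression in parentheses doesn’t help because Ruby allows several newline-separated expressions inside one pair of parentheses and uses the value of the last one.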

JavaScript, on the other hand, doesn’t suffer this same symptom even though it also will infer a missing semi-colon.  Why?  Even though “+ 1” as a statement in JavaScript doesn’t cause an error, it also doesn’t do anything.  You have to explicitly “return” a value from a function.  So JavaScript can be greedy about combining statements until it can’t, while Ruby non-greedily treats end of line as a statement terminator if it can.  Both of these seem to follow the Principle of Least Surprise (POLS) for their syntax, except when they don’t.

This wouldn’t be such a big frack’n deal except for the fact that instead of getting an error you just get an unexpected result.  That’s a lot more insidious than 1345 COBOL compilation errors, and it underscores the need for near complete code coverage in your tests for applications written in these languages.

Posted in Coding...OK?, Geek Meditations | 4 Comments » RSS 2.0

My very first C program

October 3rd, 2008 1:46:45 pm pst by Sterling Camden

I’ve been using the C programming language for more than 24 years now.  Despite its occasional Simon Says rules (I’m looking at you, prototypes), it has become a very comfortable language for me.  But it wasn’t that way in the beginning.

My introduction to C coincided with my first use of a Unixish system: Xenix running on an Altos 586.  It was so liberating to be free of the restrictions of DEC operating systems — but I was to learn that freedom comes at a price.

My colleague Dave and I decided to learn the language together.  For our first program in C, we eschewed the unchallenging and pedestrian “hello world”.  No, we decided, only a wuss would start with something that has no side effects.  We’ll write a program to create a file!

I had previously worked with a language that supported pointers (DG/L, an ALGOL 60 derivative), and both Dave and I had worked in assembly language.  But the pointer syntax in C was quite different from either of those, so when fopen called for a pointer to a filename we weren’t quite sure what to pass.  After considerable discussion and numerous compilation errors, we ended up with the following:

#include <stdio.h>
main()
{
    FILE *fp;
    char filename[6] = "xyzzy";
    fp = fopen(filename[0], "w");
    fclose(fp);
}

Can you spot the mistake?  The C compiler that came with Xenix couldn’t.  The program happily compiled and even ran without complaint.

But the subsequent shell command “ls xyzzy” returned nothing.  The broader command “ls”, however, dumped unprintable garbage all over the screen.

An octal dump of the directory file revealed that it contained an entry whose filename was a series of random characters, many of which had special meaning for the CRT terminal we were using.  Then the light bulb came on.

We had erroneously thought that passing filename[0] to fopen would pass the address of the 0th character of filename, when it actually passed the character at that location.  The fopen function interpreted that character’s value (‘x’ = 170 octal) as an address, then merrily looked there and used whatever it found as a filename, up to the first null.

Now we understood our error — but how to delete the file?  Trying to enter the filename at the shell prompt proved impossible.  We tried ‘rm -i *’ to have rm prompt us for each file.  But when we got to the one that messed up the display and we answered ‘y’, it barked back ‘not found’.  Analyzing the garbage that preceded ‘not found’ told us that rm had stripped the eighth bit on each character, so it no longer matched.

Finally, we hit on a solution.  Our second C program initialized its filename array using the numeric values of the characters we could see in the octal dump, then correctly passed the address of that array to unlink().  That second C program was perhaps the last C program I ever wrote that worked on the first try.

Most of today’s C compilers would at least give you a warning for passing a char where a pointer was expected.  Most of today’s operating systems would disallow access to memory addresses in that range.  And I suspect that most of today’s file systems wouldn’t create a filename containing those characters.  But sometimes I miss those old days when you were trusted to know what you were doing and severely punished if you didn’t — you learned so much more along the way.

Posted in Coding...OK?, Tempus fugit | 8 Comments » RSS 2.0

Thelophanic convention

April 23rd, 2008 2:30:25 pm pst by Sterling Camden

It’s difficult to believe that Reg Braithwaite wrote Programming conventions as signals more than five months ago. It’s been haunting my mind ever since I read it, as only newly acquired insights are wont to do. I find myself repeating the idea to colleagues and clients on a daily basis. It’s so simple and obvious that someone should have said it long ago — which is what makes it a great idea.

But I need a name for it, because our industry doesn’t have enough buzzwords. No really, having a label for this philosophy will help me to talk about it. I’d like to call it “thelophanic convention” (from Greek thelo = “intent” + phanos = “apparent”), because Reg’s idea is to use programming conventions in ways that elucidate your intentions (phanos originally meant a torch or lamp, but came to be used for “appearing” — so I like the notion of “illuminating” that this etymology adds).

In a nutshell, thelophanic convention means that when you have more than one way to do something, choose the way that best communicates your intentions. When there isn’t an apparent winner, decide on what each convention should mean and use them consistently. Go read Reg’s post for some great examples. Naturally, your conventions need to be communicated to other members of your team as well as to anyone down the road who might read your code.

An example that occurred to me today has to do with exceptions. When should you throw a standard exception versus a specialized derived class of exception? For instance, if a function within an application accepts an argument that should not be null, should you throw a NullArgumentException? Using thelophanic convention, the answer is “maybe”. It depends on how generic the function is, what the argument means, and under what conditions it might become null. If it’s extremely unlikely that it would ever be null, then maybe you don’t check for null at all and just let the runtime throw whatever exception results from using it as if it weren’t null. When an exception is coughed up from deep inside your framework of choice, that’s a pretty clear signal that something is very wrong and unexpectedly so.

If it’s possible for the client code to inadvertently pass in a null argument, however, then you should throw an exception to let them know exactly what they did wrong. Should it be NullArgumentException, or an application-specific class of exception, though? Again, thelophanic convention asks “what are you trying to say here?” If the client code catches your exception, will it be possible to distinguish this case from any other exception that might occur? In other words, is it possible to generate a NullArgumentException by any other means, thus diluting the meaningfulness of catching that exception? Could the client code want to do something different to handle this specific case? If so, then use an application-specific exception, because the catch block can’t easily distinguish the cases by looking at the traceback. But even if this is the only opportunity for a NullArgumentException, you may still want to throw a more specific exception, depending on the abstraction represented by the function and argument. MyApplication.CustomerMissingException may be more meaningful within a customer maintenance routine that expects a Customer object. On the other hand, if your function operates generically on a wide range of object classes, then NullArgumentException may say all you want to say.
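Here is a rough sketch of that distinction in Ruby, where the standard ArgumentError stands in for NullArgumentException. Everything else is hypothetical and exists only for illustration: the MyApplication module, CustomerMissingError, the Customer struct, and both functions.

```ruby
# All names in this module are made up for the example.
module MyApplication
  # Application-specific exception: client code can rescue exactly this case.
  # It still derives from ArgumentError, so generic handlers keep working.
  class CustomerMissingError < ArgumentError
    def initialize(msg = "a Customer is required")
      super
    end
  end

  Customer = Struct.new(:name)

  # Customer maintenance routine: the argument has a business meaning, so the
  # exception names that meaning.
  def self.rename_customer(customer, new_name)
    raise CustomerMissingError if customer.nil?

    customer.name = new_name
    customer
  end

  # Generic helper that works on any object: the standard exception says all
  # there is to say.
  def self.describe(object)
    raise ArgumentError, "object must not be nil" if object.nil?

    "#{object.class}: #{object.inspect}"
  end
end

begin
  MyApplication.rename_customer(nil, "Acme")
rescue MyApplication::CustomerMissingError => e
  puts "No customer selected: #{e.message}"
end
```

A rescue clause for CustomerMissingError catches only the missing-customer case, while a nil passed to the generic helper surfaces as a plain ArgumentError. That difference is the signal the two conventions are meant to send.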

Thelophanic convention is the best idea I’ve encountered so far for how to handle when TIMTOWTDI (There Is More Than One Way To Do It). But I have seen (and used) many other approaches:

  1. Novices choose the only method that they know about, or the first one they can find.
  2. Newly established programmers follow a strict religion — whether it’s a company policy, an “industry standard”, or “best practices”.
  3. Anal-retentive programmers choose the most efficient solution: least code, leanest on resource usage, or fastest.
  4. Geeks with delusions of grandeur choose the most clever algorithm. After fooling themselves enough times, they move up to…
  5. Experts choose the most readable version. After rewriting that oh so readable code several times, though…
  6. Gurus choose the most extensible/maintainable approach. But that’s only one step towards reaching the next level:
  7. Bodhisattvas choose the thelophanic convention, enlightening themselves and others.

Will there be further stages of enlightenment regarding what to do when TIMTOWTDI? Probably so. But that’s beyond my lights for now.

Posted in Coding...OK?, Geek Meditations | 10 Comments » RSS 2.0

Can I get MySQL to call Ruby?

February 26th, 2008 4:07:47 pm pst by Sterling Camden

MySQL and I are acquainted, but we have yet to establish an intimate relationship. She’s a cheap date who’s always available and doesn’t make a scene. But lately she’s been eluding my desire for something more exciting. Maybe she’s jealous because I want to bring Ruby in on one of her triggers.

Maybe you can help me to be more persuasive. I need to execute a Ruby script whenever a row is inserted or deleted on a specific table. Ideally, this should occur as immediately as possible. It would make sense to do this from within a MySQL trigger function, but there doesn’t seem to be any easy way to get there from here. I searched the MySQL forums and found three similar requests, but no answers.

Guy Naor posted exactly what I need except that it uses Postgres’ LISTEN/NOTIFY mechanism, which is sadly lacking in MySQL. Unfortunately, we don’t have the option of changing database servers.

Baron published an approach using advisory locks as a notification mechanism. But commenter Arjen Lentz said “I’ve tended to advise against using MySQL’s advisory locking functions, there are enough pitfalls to make it undesirable in terms of relying on it for an application” — without specifying exactly what those pitfalls are. Besides, the idea of using something intended as a lock in place of a message seems like it’s bound to get, um, locked up.

We’re currently considering using a polling mechanism instead. The trigger function would insert a row into a new table that acts as a queue, and the polling process would read the queue and handle the notifications. But that does introduce a delay of up to a little more than our polling interval, and it also seems just a bit Mousetrappish. I’m hoping that some of my readers who have taken MySQL and Ruby around the block a few more times than I have (glances with puppy dog eyes in the direction of Assaf, lower lip protruding and trembling slightly) might come up with a better idea (pppplease).
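For what it’s worth, the polling side might look something like the sketch below. It is only a shape, not a working integration: QueuePoller is a made-up class, and the fetch and handle lambdas are stand-ins for “read and remove the rows the trigger queued” and “run the Ruby script”. A plain array plays the part of the queue table so the example runs without a database.

```ruby
# Hypothetical sketch: no MySQL calls here, just the polling loop itself.
class QueuePoller
  def initialize(interval:, fetch:, handle:)
    @interval = interval # seconds to sleep between passes
    @fetch = fetch       # callable returning (and removing) the queued rows
    @handle = handle     # callable invoked once per queued row
  end

  # One polling pass: handle whatever is queued right now.
  def poll_once
    rows = @fetch.call
    rows.each { |row| @handle.call(row) }
    rows.size
  end

  # Poll forever, or for a fixed number of passes (handy for trying it out).
  def run(passes: nil)
    count = 0
    until passes && count >= passes
      poll_once
      count += 1
      sleep @interval
    end
  end
end

# Stand-in for the queue table that the trigger would insert into.
queue_table = [
  { id: 1, action: "insert", row_id: 42 },
  { id: 2, action: "delete", row_id: 17 }
]
handled = []

poller = QueuePoller.new(
  interval: 0.01,
  fetch: -> { queue_table.shift(queue_table.size) },
  handle: ->(row) { handled << row[:id] }
)
poller.run(passes: 2)
```

A row queued just after a pass waits roughly one full interval before the next pass picks it up, which is the delay I’m grumbling about above.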

Posted in Coding...OK?, Wildly popular | 2 Comments » RSS 2.0

Using OO in Business Applications

May 23rd, 2007 11:31:16 pm pst by Sterling Camden

The day before yesterday I presented on OOP in Synergy/DE version 9.  Here’s essentially what I said:

What is OO?

“OO” is the sound people are supposed to make when they behold your beautiful object-oriented code – but in many cases it comes out “OOPs” instead.  In this presentation I hope to provide some insights on how to evoke the desired exclamation.

Object Orientation is both a design philosophy and a set of programming language features to support it.  Languages that provide the latter are often called Object-Oriented Programming (OOP) languages.  Object Orientation’s primary goal: to make the code describe the real-world objects involved in the application, and their interactions and relationships, thereby reducing the translation between business rules and computer instructions.

How does this apply to business applications?

Well, you could begin to model your business processes in code.  Given an order entry system, for instance, you might make “Order” a class that contains all the data related to an order and the methods to do what an order needs to do.  A class defines a type of object.  So, an Order will contain some general information about an order and a collection of line items, along with methods to retrieve, commit, and maybe validate that data.

You might then find that sales orders and purchase orders have a lot in common, but differ somewhat.  So, you make both the SalesOrder and PurchaseOrder class “extend” the Order class, adding data and procedures that are unique to each, and possibly overriding some behaviors of the base Order class.
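To make that concrete, here is a small sketch of the Order family. It is written in Ruby rather than Synergy/DE purely for illustration, and every class, method, and field name is an assumption of mine, not part of any real order entry system.

```ruby
# Base class: data and behavior common to every kind of order.
class Order
  attr_reader :number, :line_items

  def initialize(number)
    @number = number
    @line_items = []
  end

  def add_line(description, quantity, unit_price)
    @line_items << { description: description, quantity: quantity, unit_price: unit_price }
    self
  end

  def total
    @line_items.sum { |item| item[:quantity] * item[:unit_price] }
  end

  # Base rule: an order needs at least one line item.
  def valid?
    !@line_items.empty?
  end
end

# A sales order is an order, plus a customer.
class SalesOrder < Order
  attr_reader :customer

  def initialize(number, customer)
    super(number)
    @customer = customer
  end

  # Overrides the base behavior: also require a customer.
  def valid?
    super && !customer.nil?
  end
end

# A purchase order is an order, plus a vendor.
class PurchaseOrder < Order
  attr_reader :vendor

  def initialize(number, vendor)
    super(number)
    @vendor = vendor
  end

  def valid?
    super && !vendor.nil?
  end
end

order = SalesOrder.new(1001, "Acme").add_line("widget", 2, 5)
puts order.total
puts order.valid?
```

Both derived classes reuse add_line and total untouched, and each overrides valid? only to add the one rule that is unique to it.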

As you apply this principle to more areas of your application, you might start to see patterns emerge.  A lot of what you do to store, retrieve, update, delete, and list orders also applies to other types of tables.  You could then make the Order class extend some even more generic class that provides methods and data for manipulating any type of discrete data item.  If you’re familiar with Design Patterns, you might want to follow the Model-View-Controller pattern and call this class a Model.  Then you could create a View class which represents any type of user interface for the data, and extend that for specific ways of looking at certain types of data (input windows, lists, etc.).  Finally, a Controller class could be used to orchestrate the overall behavior of an application, and extended to create specific types of applications.

“Whoa!  You’re talking about a complete rewrite!”  I hear you cry.

You’re right — that’s a bad idea.  Any of you out there who have ever attempted it can testify that for any application that has taken many years to mature, a complete architectural overhaul is not in the cards.  You simply cannot rewrite from scratch and have any product ready to ship before you go out of business.  You need to incrementally incorporate newer methods and technologies in a way that allows you to continue to release your product on a regular basis.  And you don’t want that pavement-to-dirt feeling when your user navigates between different parts of your application.

So, I expect that most of you will begin using objects without even really thinking about OOP.  First, you’ll start consuming some of the supplied classes.  The new dynamic Array class might prove handy, with which you can size an array at runtime without having to use the obtuse ^m syntax.  You’ll be using classes and not even know it.

Or, quite likely, you’ll see the new try-catch-finally structured exception handling in Synergy 9 and say to yourself, “Self, you’ve been looking for something to replace that goto-by-any-other-name-still-stinks onerror statement for ages.  Time to learn something new.”  As soon as you catch your first exception, you’ll have an Exception object in your hot little hands.  You’ll see how naturally you can access the information contained therein using the new object syntax.  Maybe you’ll create your own derived class of exception in order to add your own custom information about an error encountered in your application.  That might be your very first class definition.

Then you’ll start to discover more of the useful features of objects.  For instance, scope and destructors.  How many times have you wrestled with problems related to cleaning up resources when they’re no longer needed?  The UI Toolkit provides the environment concept to release all resources that were allocated in an environment when that environment level is exited, which works fine for many cases.  But in more complex applications you often find that you need to preserve some resources across environment-level bounds.  So, you promote those to global, but then you have to remember to clean them up when you’re done.  Wouldn’t it be nice if the resource would just get cleaned up automatically as soon as you lose all references to it?  That’s what you can do with a class.

As you begin to think more in terms of objects, it will become more natural for you to create classes that model other aspects of your application.  But don’t rush yourself.  And don’t massively redesign, unless you’ve got a lot of extra time on your hands.

General principles

A class should reflect a single type of object.  That means it’s a noun — some actor within your application.  Without referring to existing code, try to describe, in natural language, how your application works from the user’s perspective.  Whenever you encounter a noun, that’s a candidate for a class.

Methods are verbs that describe some action that a class of object can perform, on other objects or on itself (or intransitively).  If the name of the method doesn’t involve a discrete action, then maybe it needs to be thought out differently.

Properties can be seen as adjectives that describe the object, or members that compose it.  If your class has a property that you can’t think of in one of these ways, perhaps it shouldn’t be a property of this class.

A class should extend another class if it passes the “is a” test.  A Bugatti is a car, so Bugatti extends the class of cars (does it ever – this baby can do 253 mph.  Why from here to Los Angeles — of course, you couldn’t do 253 mph all the way to Los Angeles, because there’s always some jerk hogging the left lane at about 180).  A Bugatti is not a chassis.  A Bugatti contains a chassis, it isn’t derived from it.  That distinction trips up class designers all the time.  For instance, given the Model class we discussed earlier (from which an Order is derived), does it extend or contain a database table?

Often you can get away with inaccurate inheritance models for a while.  But eventually, the practice of deriving classes from things that they logically contain causes conflicts, because you end up wanting to derive them from more than one ancestor class.  Use the “is a” test to avoid that problem.

Signs you should have contained:

–You feel the need for multiple inheritance

–You add way too many methods

–You don’t override any virtual methods

Sign you should have inherited:

–You replicate all of the same methods and just forward or duplicate them
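The “is a” versus “contains” distinction can be sketched as follows, again in Ruby for illustration only. Car, Chassis, and Bugatti are hypothetical classes; the wheelbase and the default top speed are invented numbers, and only the 253 mph comes from the text above. Forwardable is part of Ruby’s standard library.

```ruby
require "forwardable"

class Chassis
  attr_reader :wheelbase_mm

  def initialize(wheelbase_mm)
    @wheelbase_mm = wheelbase_mm
  end
end

class Car
  extend Forwardable

  # Containment: a car has a chassis, and forwards only the one chassis
  # method that callers of Car actually need.
  def_delegator :@chassis, :wheelbase_mm

  def initialize(chassis)
    @chassis = chassis
  end

  def top_speed_mph
    120 # arbitrary default for the sketch
  end
end

# Inheritance: a Bugatti is a car, and overrides a behavior of the base class.
class Bugatti < Car
  def top_speed_mph
    253
  end
end

veyron = Bugatti.new(Chassis.new(2710))
puts veyron.is_a?(Car)
puts veyron.is_a?(Chassis)
puts veyron.wheelbase_mm
```

If Car started forwarding every Chassis method, that would be the sign that inheritance was wanted after all. With a single delegated method, containment stays the better fit.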

What else to avoid

As Ken Lidster has said to me on many occasions, OO rewards a good design handsomely, but punishes a bad design to the end of your days.  Let’s discuss some common pitfalls.

In Execution in the Kingdom of Nouns, Steve Yegge describes how Java’s (and C#’s) insistence on requiring all functions to belong to a class causes programmers to manufacture many needless nouns in order to perform any activity.  You don’t have to fall into that trap in Synergy/DE, because Synergy/DE provides stand-alone functions.  So if your description of a process begins with a verb, just make it a function.  Don’t give it an actor if it doesn’t need one.

Classes whose names include the words “manager”, “broker”, “locator”, or any other verbal noun, should be suspected.  If the noun’s purpose in life is merely to perform some action, then that should be a function instead of a class.

What about singleton classes?  Singleton classes are only instantiated once.  Sometimes that means that they’re really just a set of global data and functions disguised as a class.  Take, for instance, an Application object.  Java and C# virtually require classes like this because otherwise you can’t get anything done.  Because these languages require that you have a class to contain any function, you typically pick yourself up by your bootstraps by creating an Application object that has, at least, some sort of “run” method.  You don’t need that in Synergy/DE, unless it’s useful.  Singleton classes can be useful for encapsulating related data and functions together, creating derivations of similar applications, and for avoiding global naming conflicts, but be careful not to let them become big, miscellaneous messes or just the equivalent of namespaces.

In OOP and the death of modularity, Chad Perrin notices a trend in object-oriented programming in which classes become coupled with one another, reducing modularity.  Sometimes, by thinking in terms of objects rather than starting with processes, you can bundle too much functionality together into one class.  Interfacing with such a complex class soon requires complex knowledge about how that class operates, creating unnecessary dependencies.  By contrast, keeping the design simple and atomic requires less intimate knowledge and maximizes reusability.

Geek24 provides a series of humorous programming quips, including: “The great thing about Object Oriented code is that it can make small, simple problems look like large, complex ones.”

Unfortunately, that often proves true.  But you can avoid doing that, and make OOP work for you instead:

Avoid creating overly ornate inheritance hierarchies.  Focus on the abstractions that you use in the business model, and ignore the rest.  In real life, we always ignore certain layers of abstraction.  It’s useful, for instance, to talk about an overnight envelope as a type of shipment, but we rarely need to state the fact that all shipments are types of molecular collections.  That’s not the level on which we operate.  Likewise within an object hierarchy, don’t bother including ancestor classes that no one will ever need.  Make use of inheritance where it makes sense.

Rule of thumb: simplify.  If adding a class simplifies the code and makes it easier to understand and manipulate, that’s a good design.  Oooo.  If it complicates the code and adds unnecessary layers and dependencies, you would have been better off not using objects at all.  OOPS!  The beauty of Synergy/DE is that it lets you decide.

Posted in Coding...OK?, Geek Meditations, Wildly popular | 6 Comments » RSS 2.0

Quicker cleanup after being POOPed on

March 18th, 2007 1:47:57 pm pst by Sterling Camden

If you’ve ever worked on a medium to large project (no matter how well-managed it was or what version control software you used) then you have probably experienced the Pain Of Obstructive Puts (POOP). Another developer put a change into the archives that makes your code stop working.

In case you were on the giving end of one of those transactions, just get the WOMM certification to deflect any blame. Be sure to complete all four steps.

The biggest POOP problem I regularly experience happens when someone updates a large number of files whose changes are all interdependent. Working remotely from my home with a 1.5 Mbps download speed means pinging remote archives over VPN takes time. On one large project I’m involved with, a full automated update of all archives takes several hours. It’s not something I want to do every day.

When I’m in mad-rush-to-the-coding-deadline mode, I try to avoid doing a full update as long as possible. I probably shouldn’t, especially at that phase, but I hate to eat up all that bandwidth and processing time while I’m in a hurry to get things done.

Nevertheless, sooner or later I’ll have to modify a file that someone else has also changed, and their revision will require a whole boatload of others — usually scattered throughout the project. So I kick off a full update, and then I try to second-guess what files might be required and go after those individually, madly trying to get a clean build without having to wait and wait and wait.

That happened to me yesterday. Then this morning as I was cooking my breakfast, I thought, “Hmm, there really should be a transaction history for the archives so I could know what files had been touched without having to scan every blessed archive.”

There is! We use PVCS Version Manager, which creates a journal file. For every modification to the archives, this file contains a comma-delimited entry with the archive filename, person who modified it, date, time, and the type and version of modification.

So, I wrote a quick and clean Ruby script to scan the log for entries in which a new revision was put on or after a specified date. It weeds out duplicate entries and extracts each file from its archive if not up to date. This script takes minutes to run, rather than hours. So I’ve gained a lot of time – and lost one good excuse.
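The scanning half of such a script could look roughly like this. It is a sketch under loud assumptions, not the script itself: I’m assuming one comma-delimited entry per line with the fields in the order archive, user, date, time, action, revision, a date in YYYY-MM-DD form, and the word “put” as the action label. The real journal layout may well differ, the sample entries are invented, and the extraction step is left out entirely.

```ruby
require "date"

# ASSUMED field order and formats; adjust to the real journal layout.
JournalEntry = Struct.new(:archive, :user, :date, :time, :action, :revision)

# Turn the journal text into one JournalEntry per non-blank line.
def parse_journal(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    archive, user, date, time, action, revision = line.split(",").map(&:strip)
    JournalEntry.new(archive, user, Date.parse(date), time, action, revision)
  end
end

# Archives that received a new revision on or after `since`, duplicates removed.
def archives_put_since(entries, since)
  entries
    .select { |entry| entry.action == "put" && entry.date >= since }
    .map(&:archive)
    .uniq
end

# Invented sample data in the assumed format.
sample = <<~JOURNAL
  src/orders.dbl-arc,alice,2007-03-15,09:12:00,put,1.4
  src/orders.dbl-arc,bob,2007-03-17,16:40:10,put,1.5
  src/util.dbl-arc,bob,2007-03-17,16:41:02,put,2.1
  src/util.dbl-arc,carol,2007-03-17,17:00:00,lock,2.1
  src/old.dbl-arc,alice,2007-03-01,08:00:00,put,1.1
JOURNAL

changed = archives_put_since(parse_journal(sample), Date.new(2007, 3, 16))
puts changed
```

Only the resulting short list of archives would then need to be fetched, which is where the hours-to-minutes saving comes from.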

Posted in Coding...OK? | 3 Comments » RSS 2.0

Wirth the trouble?

February 28th, 2007 3:05:20 pm pst by Sterling Camden

Trying to solve a software problem just now, I had to jump into Delphi for the first time in a long while. Adding an “else” clause to an “if” block, I somehow automatically remembered to remove the semicolon from the “end” of the “if” block before adding the “else”. Scary.

Pascal was never meant to become second nature. You’re supposed to forget to remove that semi-colon, forget to add the semi-colon at the end of function parameter lists, accidentally use equality when you meant assignment, use double-quotes on strings, and test integers for Boolean truth so the compiler can have the pleasure of getting all Catholic School Holy Sister with a ruler across your knuckles.

Posted in Coding...OK? | No Comments » RSS 2.0

What’s worse than GOTO?

February 14th, 2007 2:37:55 pm pst by Sterling Camden

Robert Nyman solicits the worst code you’ve ever seen (thanks, [GAS]).

Most of the examples supplied in the comments are in HTML that actually displays (not that that requires infinite inspiration). But heck, it can’t be truly gruesome code if it works as expected.

The worst code I’ve ever seen was written by a colleague who shall remain nameless, back in the days before DIBOL had begin-end blocks (hint: more than 20 years ago). Rather than applying the usual GOTO leapfrog over a multi-line conditional section, my associate decided to repeat the conditional test for each statement in the block. Perhaps this informed professional had read in some Structured Programming article that GOTO’s are bad.

Besides blatantly violating the DRY principle (which nobody had ever heard of back then), this approach produced a slight side-effect:

IF (CTL .EQ. 1) CTL = 2
IF (CTL .EQ. 1) RETURN

What’s more, our worthy (I hesitate to use the term) programmer repeated the same mistake in dozens of places throughout a production Payroll system. Needless to say, other career opportunities soon became necessary.

Posted in Coding...OK? | No Comments » RSS 2.0

A touch of Krome

October 9th, 2006 1:09:14 pm pst by Sterling Camden

I like to link to users of my published code, as a sort of “thank you for your support”. Even though they don’t pay for any of it.

But sometimes it pays off in other ways. For instance, when I linked to Sergio Longoni (aka “Kromeboy”) who uses my OPML blogroll widget, he responded with some good suggestions for improving the widget, including the necessary source code.

So today I updated the widget to version 1.1 and included Sergio’s enhancements. He also had some other suggestions that I’ll consider for a future version.

It’s called “collaboration”.

Posted in Coding...OK? | 9 Comments » RSS 2.0

Algooglrithms

October 5th, 2006 1:15:52 pm pst by Sterling Camden

Playing around with the new Google Code Search today. Nice to know that my ratrace app comes up on top in a search for “lang:ruby rats”: http://www.google.com/codesearch?hl=en&lr=&q=lang%3Aruby+rats&btnG=Search

Interesting how it searches within ZIP files on the web. A search for my tag cloud widget for WordPress presented more difficulty, because I put most of the description into a readme.txt file instead of in source code comments. So “lang:php” excluded it.

Let’s try something apotheon was searching for today: http://www.google.com/codesearch?hl=en&lr=&q=lang%3Aperl+s-expression+parser&btnG=Search

A Perl S-expression parser. 50 results. Ought to be something in there to do the trick.

The Synergy/DE language seems a bit under-indexed: http://www.google.com/codesearch?hl=en&lr=&q=lang%3Asynergyde&btnG=Search

Even though I’ve posted 42 examples in Synergy/DE on chipstips.com.

DIBOL fares no better: http://www.google.com/codesearch?hl=en&lr=&q=lang%3Adibol&btnG=Search

I like the way you can use regular expressions in your search: http://www.google.com/codesearch?hl=en&lr=&q=lang%3A.*lisp+def%5Bun%2Cmacro%5D&btnG=Search

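In this case we’re searching for languages that end in “lisp” and code that contains “defun” or “defmacro”. Strictly speaking, the square brackets form a character class, so the pattern matches “def” followed by any one of the listed characters; that catches “defun” and “defmacro”, but also things like “default”. Hmm, only 500 results. They need to index more Lisp code too, apparently. A search for just “lang:.*lisp” gives only 600 results.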
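The difference between a character class and an alternation is easy to check with ordinary regular expressions. This sketch uses Ruby’s regex engine, not Google’s, on the assumption that both treat square brackets the same way; the sample strings and constant names are made up.

```ruby
CHARACTER_CLASS = /def[un,macro]/    # the pattern from the search above
ALTERNATION     = /def(?:un|macro)/  # "defun" or "defmacro" and nothing else

samples = [
  "(defun square (x) (* x x))",
  "(defmacro unless-zero (x) x)",
  "default",
  "(defvar *limit* 10)"
]

# Array#grep keeps the elements that match the pattern.
class_hits = samples.grep(CHARACTER_CLASS)
alternation_hits = samples.grep(ALTERNATION)

puts class_hits.size
puts alternation_hits.size
```

The character class also accepts “default”, because “a” is one of the listed characters, while the alternation accepts only the two Lisp forms. Neither pattern matches “defvar”.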

Let’s try some different languages for comparison:

 

Language Search                 Approximate # Results
^c$                             2,870,000
^c\+\+$                         751,000
java                            766,000
php                             302,000
python                          188,000
perl                            186,000
fortran                         73,900
c#                              50,400
.*assembl.*                     41,400
lex                             18,100
yacc                            10,800
.*lisp                          600
.*basic                         600
basic                           500
sql                             500
lisp                            400
scheme                          400
asp                             400
ruby                            200
smalltalk                       200
javascript                      200
erlang                          100
ada                             100
eiffel                          100
vbscript                        0
synergy.*,dibol,dbl             0
progress.*                      0
delphi,pascal                   0
algol                           0
cobol                           0
ocaml,arc,dylan,haskell,logo    0
xml,xslt,.*html                 0

Did I forget anybody?

I would say this tool holds promise. I’ll definitely keep it in my programming weapons arsenal.

Posted in Coding...OK? | 7 Comments » RSS 2.0