The blog of protondonor

SNOBOL hate blog

Several years and several jobs ago, I worked on what I fervently hope was the world's last production SNOBOL system.

A full discussion of what the system was and what it did is out of scope and would probably make my former employer justifiably mad if they ever found this. Suffice it to say that it munged data from the format produced by one bit of COTS software into a format readable by another bit of COTS software, and that my role on the project was to almost single-handedly rewrite it in a modern language.

It's also ancillary to the main point of this post that neither of these pieces of software had gotten a significant update in over a decade, that the server running the SNOBOL code was an old-ass Solaris box, and that the guy who wrote all that code had been retired for years and charged a hefty consulting fee for answering any emails about it.

For those of you who are unfamiliar with SNOBOL, here is a short FAQ:

Q. SNOBOL? You mean the toilet cleaner?
A. I wish I did.

Q. Is that related to COBOL?
A. No, although it is of a similar vintage. Other than their names, their age, and their love of capital letters, they have almost nothing in common.

Q. So what is it?
A. Imagine Perl written by and for '60s programmers. Although it can technically do a lot of other things, SNOBOL has a very strong focus on string processing. It also has a unique structure: all control flow is in the form of goto statements, and a typical statement is a pattern match that either succeeds or fails.

Q. Like regexes?
A. SNOBOL predates regexes as an everyday programming tool, and its pattern matching is very much its own thing. It provides a large number of functions and operators that look for their arguments in a string and report success or failure. Several operators, such as $, can also store the matched text in a variable. Confusingly, the variable comes after the $ operator, and a statement may contain multiple $ operators, allowing potentially unlimited numbers of variables to be assigned in a single statement. This does wonders for readability.
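
If that's hard to picture, here is a rough Python sketch of the idea, not real SNOBOL and definitely not anything from the codebase. The SNOBOL-ish statement in the comment is my own illustration, and the helper names (`break_`, `lit`, `assign`, `match`) are made up for this post. The sketch also cheats: it only matches from the start of the string and never backtracks, which real SNOBOL would do.

```python
# Illustrative SNOBOL-style statement this sketch imitates (an assumption, not source code):
#     LINE  BREAK(',') $ NAME  ','  BREAK(',') $ AGE

def break_(chars):
    """Match everything up to, but not including, the first character in `chars`."""
    def matcher(text, pos, variables):
        for end in range(pos, len(text)):
            if text[end] in chars:
                return end
        return None  # no break character found, so this element fails
    return matcher

def lit(s):
    """Match the literal string `s` at the current position."""
    def matcher(text, pos, variables):
        return pos + len(s) if text.startswith(s, pos) else None
    return matcher

def assign(inner, name):
    """Imitate `inner $ name`: store the matched text as soon as `inner` matches."""
    def matcher(text, pos, variables):
        end = inner(text, pos, variables)
        if end is not None:
            variables[name] = text[pos:end]  # note: the variable comes *after* the pattern
        return end
    return matcher

def match(text, elements, variables):
    """Run the elements left to right; report success or failure like a SNOBOL statement."""
    pos = 0
    for element in elements:
        pos = element(text, pos, variables)
        if pos is None:
            return False
    return True

pattern = [assign(break_(","), "NAME"), lit(","), assign(break_(","), "AGE")]

variables = {}
ok = match("ada,36,london", pattern, variables)
print(ok, variables)

partial = {}
failed = match("ada,36", pattern, partial)
print(failed, partial)
```

The second call is the fun one: the statement as a whole fails, but `NAME` has already been assigned by the time it does. Now imagine five of those in one line, all with single-letter names.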

Q. Wait, what was that thing you said about control flow earlier?
A. That all control flow is in the form of goto statements?

Q. Yeah, what the fuck?
A. At the end of every single statement, in its own little column, there's a spot that may (but doesn't have to) contain one of: :(label), :S(label), or :F(label). These indicate go to label unconditionally, go to label if the pattern is matched, or go to label if the pattern fails to match, respectively. The S and F forms can also be combined in one statement, as in :S(label1)F(label2). This also does wonders for readability.
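
To make the goto column a bit more concrete, here is a small Python sketch of a toy interpreter that does nothing except follow those three kinds of goto. Everything in it is an assumption for illustration: the `run` function, the dictionary layout, and the little SNOBOL-ish program in the comment are mine, not the original system's.

```python
# Illustrative SNOBOL-style program being modeled (an assumption, not source code):
#   LOOP  LINE = INPUT            :F(DONE)
#         LINE 'ERROR'            :F(LOOP)
#         COUNT = COUNT + 1       :(LOOP)
#   DONE  OUTPUT = COUNT

def run(program, env):
    """Execute statements in order, jumping according to each statement's goto field."""
    labels = {stmt["label"]: i for i, stmt in enumerate(program) if stmt.get("label")}
    pc = 0
    while pc < len(program):
        stmt = program[pc]
        succeeded = stmt["do"](env)          # every statement succeeds or fails
        goto = stmt.get("goto", {})
        # Unconditional goto wins; otherwise pick the S or F branch, if there is one.
        target = goto.get("always") or goto.get("S" if succeeded else "F")
        pc = labels[target] if target else pc + 1  # no goto: fall through to the next line
    return env

def read_line(env):
    if not env["input"]:
        return False                          # reading past end of input fails
    env["LINE"] = env["input"].pop(0)
    return True

def has_error(env):
    return "ERROR" in env["LINE"]             # the "pattern match"

def bump(env):
    env["COUNT"] += 1
    return True

def emit(env):
    env["output"].append(env["COUNT"])
    return True

program = [
    {"label": "LOOP", "do": read_line, "goto": {"F": "DONE"}},
    {"do": has_error, "goto": {"F": "LOOP"}},
    {"do": bump, "goto": {"always": "LOOP"}},
    {"label": "DONE", "do": emit},
]

env = run(program, {"input": ["ok", "ERROR: disk", "ok", "ERROR: net"],
                    "COUNT": 0, "output": []})
print(env["output"])  # [2]
```

That is a loop, an if, and a counter, expressed entirely as "where do I jump next". It is readable here because there are four statements and the labels have names.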

Q. I looked up some samples on Rosetta Code and it's really weird but it doesn't look that bad.
A. This codebase didn't have the clean style and helpful variable names used by whoever was writing those samples. The guy who wrote it wrote as though he had to pay for every byte and every line used, so the code was filled with single-letter variables, with most lines allocating or assigning multiple variables at a time. It was also spread across more than 100 files, each adhering to an 8.3 naming convention that made the purpose of the file almost inscrutable. At the top of each file was a comment, with a description of the file's purpose that was never clear enough. The comment also helpfully indicated who to blame for the code (always just that one guy) and that it was last updated when N*SYNC was still a going concern.

Q. Is there any tooling for SNOBOL? Any IDEs or vim plugins to help?
A. At the time, I found a grand total of one IDE with SNOBOL bindings. It was written in an old version of Tcl that neither I nor any of my co-workers could get running on our machines.

Q. I'm sold! How do I learn SNOBOL?
A. The Macro SPITBOL dialect of SNOBOL is available on GitHub, with updates as recent as early 2024. There is also a repo of documentation on the language. The Green Book (The SNOBOL4 Programming Language, by Griswold, Poage, and Polonsky) is the definitive book on SNOBOL4 and is probably the best place to start.