mdlbear: (technonerdmonster)

You may remember from my previous post about Hyperviewer that I’d been plagued by a mysterious bug. The second time the program tried to make a simplex (the N-dimensional version of a triangle (N=2) or tetrahedron (N=3), a whole batch of “ghost edges” appeared and the program (quite understandably) blew up. I didn’t realize it until somewhat later, but there were ghost vertices as well, and that was somewhat more fundamental. Basically, nVertices, the field that holds the number of vertices in the polytope, was wildly wrong.

Chasing ghosts

Eventually I narrowed things down to someplace around here, which is where things stood at the end of the previous post.

1        let vertices = [];
2        /* something goes massively wrong, right here. */
3        for (let i = 0; i < dim; ++i) {
4            vertices.push(new vector(dim).fill((j) => i === j? 1.0 : 0));
5        }

I found this by throwing an error, with a big data dump, right in the middle if nVertices was wrong (it’s supposed to be dim+), or if the length of the list of vertices was different from nVertices.

 1        let vertices = [];
 2        /* something goes massively wrong, right here. */
 3        if (this.nVertices !== (dim + 1) || this.nEdges !== ((dim + 1) * dim / 2) ||
 4            this.vertices.length !== 0 || this.edges.length !== 0 ) {
 5            throw new Error("nEdges = " + this.nEdges + " want " +  ((dim + 1) * dim / 2) +
 6                            "; nVertices = " + this.nVertices + " want " + dim +
 7                            "; vertices.length = " + this.vertices.length +
 8                            ' at this point in the initialization, where dim = ' + dim +
 9                            " in " + this.dimensions + '-D ' + this.name 
10                           );
11        } 
12        for (let i = 0; i < dim; ++i) {

It appeared that nVertices was wildly wrong at that point. If I’d looked carefully and thought about what nVertices actually was, I would probably have found the bug at that point. Or even earlier. Instead, what clinched it was this:

 1        this.vertices = vertices;
 2        this.nVertices = vertices.length;  // setting this to dim+1 FAILS:
 3        // in other words, this.nVertices is getting changed between these two statements!
 4        if (this.vertices.length !== this.nVertices || this.edges.length !== 0) {
 5            throw new Error("expect " + this.nVertices + " verts, have " + this.vertices.length +
 6                            " in " + this.dimensions + '-D ' + this.name +
 7                            "; want " + this.nEdges + " edges into " + this.edges.length
 8                           );
 9        }

The code that creates the list of vertices produces the right number of vertices. If I set nVertices equal to the length of that list, everything was fine.

If instead I set

1        this.nVertices = dim+1;

it was wrong. Huh? For example, in four dimensions, the number of vertices is supposed to be five, and that was the length of the list. When is 4+1 not equal to 5?

At this point a light bulb went off, because it was clear that dim+1 was coming out equal to 41. In three dimensions it was 31. When is 4+1 not equal to 5? When it’s actually "4"+1. In other words, dim was a string. JavaScript “helpfully” converts a string to a number when you do anything arithmetical to it, like multiply it by something or raise it to a power. But + isn’t always an arithmetic operation! In JavaScript (and many other languages) it’s also used for string concatenation.

What went wrong, and a rant

The problem was that, the second time I tried to create a simplex, the number of dimensions was coming from the user interface. From an <input element in a web form. And every value that you get from a web form is a string. HTML knows nothing about numbers, and it has no way to know what you’re going to do with the input you get.

So the fix was simple (and you can see it here on GitHub: convert the value from a string to a number right off before trying to use it as a number of dimensions. But… But… But cubes and octohedrons were right!

That’s because the number of vertices in a N-cube is 2**N, and in an N-orthoplex (octohedron in three dimensions) it’s N*2 (and multiplication is always an arithmetic operator in JavaScript). And it worked when I was creating the simplex’s vertices because it was being compared against in a for loop. And so on.

If I’d been using a strongly-typed language, the compiler would have found this two weeks ago.

There are two main ways of dealing with data in a programming language, called “strong typing” and “dynamic typing”. In a strongly-typed language, both values and variables (the boxes you put values into) have types (like “string” or “integer”), and the types have to match. You can’t put a string into a variable with a type of integer. Java is like that (mostly).

Some people find this burdensome, and they prefer dynamically-typed languages like JavaScript. In JavaScript, values have types, but variables don’t. It’s called “dynamic” typing because a variable can hold anything, and its type is that of the last thing that was put into it.

You can write code very quickly in a language where you don’t have to declare your variables and make sure they’re the right type for the kind of values you want to put into them. You can also shoot yourself in the foot much more easily.

There are a couple of strongly-typed variants on JavaScript, for example CoffeeScript and TypeScript, and a type-checker called “Flow”. I’m going to try one of those next.

There was one more problem with simplexes

(simplices?) … but that was purely geometrical, and just because I was trying to do all the geometry in my head instead of on paper, and wasn’t thinking things through.

If you’re in N dimensions, you can create an N-1 dimensional simplex by simply connecting the points with coordinates like [1,0,0], [0,1,0], and [0,0,1] (in three dimensions – it’s pretty easy to see that that gives you an equilateral triangle). Moreover, all the vertices are on the unit sphere, which is where we want them. The last vertex is a bit of a problem.

A fair amount of googling around (or DuckDuckGoing around, in my case) will eventually turn up this answer on mathoverflow.net, which says that in N dimensions, the last vertex has to be at [x,...,x] where x=-1/(1+sqrt(1+N)). Cool! And it works. Except that it’s not centered – that last vertex is a lot closer to the origin than the others. It took me longer than it should have to get this right, but the center of the simplex is its “center of mass”, which is simply the average of all the vertices. So that’s at y=(1+x)/(N+1) because there are N+1 vertices. Now we just have to subtract y from all the coordinates to shift it over until the center is at the origin.

Then of course we have to scale it so that all the vertices are back on the unit sphere. You can find the code here, on GitHub.


Another fine post from The Computer Curmudgeon.

mdlbear: (technonerdmonster)

(This will be something of an experiment. The original was written in markdown and posted on Computer-Curmudgeon.com. We'll see whether the process made a hash of it. I may have to do some cleaning up.

This post is about Hyperviewer, an update of a very old demo program of mine from 1988 that displays wireframe objects rotating in hyperspace. (Actually, anywhere between four and six dimensions.) Since this is 2018, I naturally decided to write it in JavaScript, using Inferno and SVG, and put it on the web. It was a learning experience, in more ways than one.

Getting started

I had been doing a little work with React, which is pretty good an very popular, and had recently read about Inferno, which is a lighter-weight, faster framework that's almost completely interchangeable with React. Sounded good, especially since I wanted high performance for something that's going to be doing thousands of floating-point matrix multiplies per second. (A hypercube in N dimensions has 2^N vertices, and a rotation matrix has N^2 entries -- do the math). (It turns out I really didn't have to worry -- Moore's Law over three decades gives a speedup by a factor of a million, give or take a few orders of magnitude, so even using an partially-interpreted language speed isn't a problem. Perhaps I'm showing my age.)

To keep things simple -- and make it possible to eventually save pictures -- I decided to use SVG: the web standard for Scalable Vector Graphics, rather than trying to draw them out using an HTML5 Canvas tag. It's a perfect match for something that's nothing but a bunch of vectors. SVG is XML-based, and you can simply drop it into the middle of an HTML page. SVG is also really easy to generate using the new JSX format, which is basically XML tags embedded in a JavaScript file.

Modern JavaScript uses a program called a "transpiler" -- the most common one is Babel -- that compiles shiny new JavaScript constructs (and even some new languages like TypeScript and CoffeeScript, which I want to learn soon) into the kind of plain old JavaScript that almost any browser can understand. (There are still some people using Microsoft Exploiter from the turn of the century millennium; if you're reading this blog it's safe for me to assume that you aren't one of them.)

Anyway, let's get started:

cut tag added to protect your sanity )

(Not too bad of a formatting job, though of course the color didn't come through. Cut tag added because it's over 2000 words.)

Another fine post from The Computer Curmudgeon.
Cross-posted on computer-curmudgeon.com

mdlbear: (technonerdmonster)

Today in my continuing series on programming languages I'm going to talk about "scripting languages". "Scripting" is a rather fuzzy category, because unlike the kinds of languages we've discussed before, scripting languages are really distinguished by how they are used, and they're used in two very different ways. It's also confusing because most scripting languages are interpreted, and people tend to use "scripting" when they should be using "interpreted". In my opinion it's more correct to say that a language is being used as a scripting language, rather than to say that it is a scripting language. As we'll see, this is particularly true when the language is being used to customize some application.

But first, let's define scripts. A script is basically a sequence of commands that a user could type at a terminal[1] -- often called "the command line" -- that have been put together in a file so that they can be run automatically. The script then becomes a new command. In Linux, and before that Unix, the program that interprets user commands is called a "shell", possibly because it's the visible outer layer of the operating system. The quintessential script is a shell script. We'll dive into the details later.

[1] okay, a terminal emulator. Hardly anyone uses physical terminals anymore. Or remembers that the "tty" in /dev/tty stands for "teletype"'.

The second kind of scripting language is used to implement commands inside some interactive program that isn't a shell. (These languages are also called extension languages, because they're extending the capabilities of their host program, or sometimes configuration languages.) Extension languages generally look nothing at all like something you'd type into a shell -- they're really just programming languages, and often are just the programming language the application was written in. The commands of an interactive program like a text editor or graphics editor tend to be things like single keystrokes and mouse gestures, and in most cases you wouldn't want to -- or even be able to -- write programs with them. I'll use "extension languages" for languages used in this way. There's some overlap in between, and I'll talk about that later.

Shell scripting languages

Before there was Unix, there were mainframes. At first, you would punch out decks of Hollerith cards, hand them to the computer operator, and they would (eventually) put it in the reader and push the start button, and you would come back an hour or so later and pick up your deck with a pile of listings from the printer.

Computers were expensive in those days, so to save time the operator would pile a big batch of card decks on top of one another with a couple of "job control" cards in between to separate the jobs. Job control languages were really the first scripting languages. (And the old terminology lingers on, as such things do, in the ".bat" extension of MS/DOS (later Windows) "batch files". Which are shell scripts.)

By far the most sophisticated job control language ran on the Burroughs 5000 and 6000 series computers, which were designed to run Algol very efficiently. (So efficiently that they used Algol as what amounted to their assembly language! Programs in other languages, including Fortran and Cobol, were compiled by first translating them into Algol.) The job control language was a somewhat extended version of Algol in which some variables had files as their values, and programs were simply subroutines. Don't let anyone tell you that all scripting languages are interpreted.

Side note: the Burroughs machines' operating system was called MCP, which stands for Master Control Program. Movie fans may find that name familiar.

Even DOS batch files had control-flow statements (conditionals and loops) and the ability to substitute variables into commands. But these features were clumsy to use. In contrast, the Unix shell written by Stephen Bourne at Bell Labs was designed as a scripting language. The syntax of the control structures was, in fact, derived from Algol 68, which introduced the "if...fi" and "do...done" syntax.

Bourne's shell was called sh in Unix's characteristically terse style. The version of Unix developed at Berkeley, (BSD, for Berkeley System Distribution -- I'll talk about the history of operating systems some time) had a shell called the C shell, csh, which had a syntax derived from the C programming language. That immediately gave rise to the popular tongue-twister "she sells cshs by the cshore".

The GNU (GNU's Not Unix) project, started by Richard Stallman with the goal of producing a completely free replacement for Unix, naturally had its own rewrite of the Bourne Shell called bash -- the Bourne Again Shell. It's a considerable improvement over the original, pulling in features from csh and some other shell variants.

Let's look at shell scripting a little more closely. The basic statement is a command -- the name of a program followed by its arguments, just as you would type it on the command line. If the command isn't one of the few built-in ones, the shell then looks for a file that matches the name of the command, and runs it. The program eventually produces some output, and exits with a result code that indicates either success or failure.

There are a few really brilliant things going on here.

  • Each program gets run in a separate process. Unix was originally a time-sharing operating system, meaning that many people could use the computer at the same time, each typing at their own terminal, and the OS would run all their commands at once, a little at a time.
  • That means that you can pipe the output of one command into the input of another. That's called a "pipeline"; the commands are separated by vertical bars, like | this, so the '|' character is often called "pipe" in other contexts. It's a lot shorter than saying "vertical bar".
  • You can "redirect" the output of a command into a file. There's even a "pipe fitting" command called tee that does both: copies its input into a file, and also passes it along to the next command in the pipeline.
  • The shell uses the command's result code for control -- there's a program called true that does nothing but immediately returns success, and another called false that immediately fails. There's another one, test, which can perform various tests, for example to see whether two strings are equal, or a file is writable. There's an alias for it: [. Unix allows all sorts of characters in filenames. Anyway, you can say things like if [ -w $f ]; then...
  • You can also use a command's output as part of another command line, or put it into a variable. today=`date` takes the result of running the date program and puts it in a variable called today.

This is basically functional programming, with programs as functions and files as variables. (Of course, you can define variables and functions in the shell as well.) In case you were wondering whether Bash is a "real" programming language, take a look at nanoblogger and Abcde (A Better CD Encoder).

Sometime later in this series I'll devote a whole post to an introduction to shell scripting. For now, I'll just show you a couple of my favorite one-liners to give you a taste for it. These are tiny but useful scripts that you might type off the top of your head. Note that comments in shell -- almost all Unix scripting languages, as a matter of fact -- start with an octothorpe. (I'll talk about octothorpe/sharp/hash/pound later, too.)

# wait until nova (my household server) comes back up after a reboot
until ping -c1 nova; do sleep 10; done

# count my blog posts.  wc counts words, lines, and/or characters.
find $HOME/.ljarchive -type f -print | wc -l

# find all posts that were published in January.
# grep prints lines in its input that match a pattern.
find $HOME/.ljarchive -type f -print | grep /01/ | sort

Other scripting languages

As you can see, shell scripts tend to be a bit cryptic. That's partly because shells are also meant to have commands typed at them directly, so brevity is often favored over clarity. It's also because all of the operations that work on files are programs in their own right; they often have dozens of options and were written at different times by different people. The find program is often cited as a good (or bad) example of this -- it has a very different set of options from any other program, because you're trying to express a rather complicated combination of tests on its command line.

Some things are just too complicated to express on a single line, at least with anything resembling readability, so many other programs besides shells are designed to run scripts. Some of the first of these in Unix were sed, the "stream editor", which applies text editing operations to its input, and awk, which splits lines into "fields" and lets you do database-like operations on them. (Simpler programs that also split lines into fields include sort, uniq, and join.)

DOS and Windows look at the last three characters of a program's name (e.g., "exe" for "executable" machine language and "bat" for "batch" scripts) to determine what it contains and how to run it. Unix, on the other hand, looks at the first few characters of the file itself. In particular, if these are "#!" followed by the name of a program (I'm simplifying a little), the file is passed to that program to be run as a script. The "#!" combination is usually pronounced "shebang". This accounts for the popularity of "#" to mark comments -- lines that are meant to be ignored -- in most scripting languages.

The scripting programs we've seen so far -- sh, sed, awk, and some others -- are all designed to do one kind of thing. Shells mostly just run commands, assign variables, and substitute variables into commands, and rely on other programs like find and grep to do most other things. Wouldn't it be nice if one could combine all these functions into one program, and give it a better language to write programs in. The first of these that really took off was Larry Wall's Perl. Like the others it allows you to put simple commands on the command line -- with exactly the same syntax as grep and awk.

Perl's operations for searching and substituting text look just like the ones in sed and grep. It has associative arrays (basically lookup tables) just like the ones in awk. It can run programs and get their results exactly the way sh does, by enclosing them in backtick characters (`...` -- originally meant to be used as left single quotes), and it can easily read lines out of files, mess with them, and write them out. It has has objects, methods, and (more or less) first-class functions. And just like find and the Unix command line, it has a well-earned reputation for scripts that are obscure and hard to read.

You've probably heard Python mentioned. It was designed by Guido van Rossum in an attempt to be a better scripting language Perl, with an emphasis on making programs more readable, easier to write, and easier to maintain. He succeeded. At this point Python has mostly replaced Perl as the most popular scripting language, in addition to being a good first language for learning programming. (Which is the best language for learning is a subject guaranteed to invoke strong opinions and heated discussions; I'll avoid it for now.) I avoided Python for many years, but I'm finally learning it and finding it much better than I expected.

Extension languages

The other major kind of scripting is done to extend a program that isn't a shell. In most cases this will be an interactive program like an editor, but it doesn't have to be. Extensions of this sort may also be called "plugins".

Extension languages are usually small, simple, and interpreted, because nobody wants their text editor (for example) to include something as large and complex as a compiler when its main purpose is defining keyboard shortcuts. There's an exception to this -- sometimes when a program is written in a compiled language, the same language may be used for extensions. In that case the extensions have to be compiled in, which is usually inconvenient, but they can be particularly powerful. I've already written about one such case -- the Xmonad window manager, which is written and configured in Haskell.

Everyone these days has at least heard of JavaScript, which is the scripting language used in web pages. Like most scripting languages, JavaScript has escaped from its enclosure in the browser and run wild, to the point where text editors, whole web browsers, web servers, and so on are built in it.

Other popular extension languages include various kinds of Lisp, Tcl, and Lua. Lua and Tcl were explicitly designed to be embedded in programs. Lua is particularly popular in games, although it has recently turned up in other places, including the TeX typesetting system.

Lisp is an interesting case -- probably its earliest use as an extension language was in the Emacs text editor, which is almost entirely written in it. (To the point where many people say that it's a very good Lisp interpretor, but it needs a better text editor. I'm not one of them: I'm passionately fond of Emacs, and I'll write about it at greater length later on.) Because of its radically simple structure, Lisp is particularly easy to write an interpretor for. Emacs isn't the only example; there are Lisp variants in the Audacity audio workstation and the Autodesk CAD program. I used the one in Audacity for the sound effects in my computer/horror crossover song "Vampire Megabyte".

Emacs, Atom (a text editor written in JavaScript), and Xmonad are good examples of interactive programs where the same language is used for (most, if not all, of) the implementation as well as for the configuration files and the extensions. The boundaries can get very fuzzy in cases like that; as a Mandelbear I find that particularly appealing.

Another fine post from The Computer Curmudgeon.

mdlbear: (technonerdmonster)

In comments on "Done Since 2018-07-15" we started having a discussion of mirroring and cross-posting DW blog entries, and in particular what my plans are for implementing personal blog sites that mirror all or some of a -- this -- Dreamwidth journal.

Non-techie readers might conceivably want to skip this post.

Where I am now:

Right now, my blog posting process is, well, let's just say idiosyncratic. Up until sometime late last year, I was posting using an Emacs major mode called lj-update-mode; it was pretty good. It had only two significant problems:

  1. It could only create one post at a time, and there was no good way to save a draft and come back to it later. I could live with that.
  2. It stopped working when DW switched to all HTTPS. It was using an obsolete http library, and noone was maintaining either of them.

My current system is much better.

  1. I run a command, either make draft or, if I'm pretty sure I'm going to post immediately, make entry. I pass the filename, without the yyyy/mm/dd prefix, along with an optional title. If I don't pass the title I can add it later. The draft gets checked in with git; I can find out when I started by using git log.
  2. I edit the draft. It can sit around for days or months; doesn't matter. It' an ordinary html file except that it has an email-like header with the metadata in it.
  3. When I'm done, I make post. Done. If I'm posting a draft I have to pass the filename again to tell it which draft; make entry makes a symlink to the entry, which is already in a file called yyyy/mm/dd-filename.html. It gets posted, and committed in git with a suitable commit message.

You can see the code in MakeStuff/blogging on GitHub. It depends on a Python client called charm, which I forked to add the Location: header and some sane defaults like not auto-formatting. Charm is mostly useless -- it does almost everything using a terminal-based text editor. Really? But it does have a "quick-post" mode that takes metadata on the command line, and a "sync" mode that you can use to sync your journal with an archive. Posts in the archive are almost, but not quite, in the same format as the MakeStuff archive; the main difference is that the filenames look like yyyy/mm/dd_HHMM. Close, but not quite there.

There's another advantage that isn't apparent in the code: you can add custom make targets that set up your draft using a template. For example, my "Done since ..." posts are started with make done, and my "Computer Curmudgeon" posts are started with make curmudgeon. There are other shortcuts for River and S4S posts. I also have multiple directories for drafts, separated roughly by subject, but all posting into the same archive.

Where I want to go:

Here's what I want next:

  • The ability to post in either HTML or markdown -- markdown has a great toolchain, including the ability to syntax-color your code blocks.
  • The ability to edit posts by editing the archived post and uploading it. Right now it's a real pain to keep them in sync.
  • A unified archive, with actual URLs in the metadata rather than just the date and time in the filename.
  • The ability to put all or part of my blog on different sites. I really want the computer-related posts to go on Stephen.Savitzky.net (usually shortened to S.S.net in my notes), and a complete mirror on steve.savitzky.net (s.s.net).
  • Cross-links in both directions between my sites and DW.

How to get there:

Here's a very brief sketch of what needs to be done. It's only vaguely in sequence, and I've undoubtedly left parts out. But it's a start.

Posting, editing, and archiving

  • Posting in HTML or markdown is a pretty easy one; I can do that just by modifying the makefiles and (probably) changing the final extension from .html to .posted so that make can apply its usual dependency-inference magic.
  • Editing and a unified archive will both require a new command-line client. There aren't any. There are libraries, in Ruby, Haskell, and Javascript, that I can wrap a program around. (The Python code in charm doesn't look worth saving.) I wanted to learn Ruby anyway.
  • The unified archive will also require a program that can go back in time and match up archived posts with the right URLs, reconcile the two file naming conventions, and remove the duplicates that are due to archiving posts both in charm and MakeStuff. Not too hard, and it only has to be done once.
  • It would be nice to be able to archive comments, too. The old ljbackup program can do it, so it's feasible. It's in Perl, so it might be a good place to start.

Mirror, mirror, on the server...

This is a separate section because it's mostly orthogonal to the posting, archiving, etc.

  • The only part of the posting section that really needs to be done first is the first one, changing the extension of archived posts to .posted. (That's because make uses extensions to figure out what rules to apply to get from one to another. Remind me to post about make some time.)
  • The post archive may want to have its own git repository.
  • Templating and styling. My websites are starting to show their age; there's nothing really wrong with a retro look, but they also aren't responsive (to different screen sizes -- that's important when most people are reading websites on their phones), or accessible (screen-reader friendly and navigable by keyboard; having different font sizes helps here, too). Any respectable static site generator can do it -- you may remember this post on The Joy of Static Sites -- but the way I'm formatting my metadata will require some custom work. Blosxom and nanoblogger are probably the closest, but they're ancient. I probably ought to resist the temptation to roll my own.

Yeah. Right.

Another fine post from The Computer Curmudgeon.

mdlbear: (technonerdmonster)
Back when smalltalk was sports and the weather, And an object was what you could see, And we watched "Captain Video" in black-and-white, Before there was color TV. -- "When I Was a Lad"

Even if you're not a programmer (and most of the people I expect to be reading this aren't) you've probably heard one of your geek friends mentioning "object-oriented" programming. You might even have heard them mention "functional" programming, and wondered how it differs from non-functional programming. The computer curmudgeon can help you.

If you are a programmer, you probably know a lot of this. You may find a few items of historical interest, and it might be useful explaining things to the non-programmers in your life.

Imperative languages

As you may recall from last month's post on computer languages, the first programming languages were assembly languages -- just sequences of things for the computer to do, one after another (with a few side-trips). This kind of language is called "imperative", because each statement is a command telling the computer what to do next: "load a into register 2!" "add register 2 to register 3!" "roll over!" "sit!" You get the idea.

Most programmers use "statement" instead of "command" for these things even though the latter might be more correct from a grammatical point of view; a "command" is something one types at a terminal in order to run a program (which is also called a "command").

Most of the earlier programming languages (with one notable exception that we'll get to later) were also imperative. Fortran, Cobol, and Algol were the most prominent of these; most of today's languages descend more-or-less from Algol. (By the way, the Report on the Algorithmic Language Algol 60 is something I always point to as an example of excellent technical writing.)

Imperative languages make a distinction between statements, which do things (like "a := 2", which puts the number 2 into a variable called "a"), and "expressions", which compute results. Another way of saying that is that an expression has a value (like 2+2, which has a value of 4), and a statement doesn't. In earlier languages that value was always a number; later languages add things like lists of symbols and strings of characters enclosed in quotes.

Algol and its descendents lend themselves to what's called "Structured Programming" -- control structures like conditionals ("if this then {that} else {something-else}") and loops (for x in some-collection do {something}) let you build your program out of simple sections that nest together but always have a single entrance and exit. Notice the braces around those sections. Programming languages have different ways of grouping statements; braces and indentation are probably the most common, but some languages use "if ... fi" and "do ... done".

Languages also have ways of packaging up groups of statements so that they can be used in different places in the program. These are called either "subroutines" or "functions"; if a language has both it means that a function returns a value (like "sqrt(2)", which returns the square root of 2) and a subroutine doesn't (like "print('Hello world!')", which prints a familiar greeting.) Subroutines are statements, and functions are expressions. The expressions in parentheses are called either "arguments" or "parameters" -- I was bemused to find out recently that verbs have arguments, too. (A verb's arguments are the subject, and the direct and indirect objects.)

Object-oriented languages

As long as we're on the subject of objects, ...

In the mid 1960s Ole-Johan Dahl and Kristen Nygaard, working at the University of Oslo, developed a language called Simula, designed for simulating objects and events in the real world. Objects are things like cars, houses, people, and cats, and Simula introduced software objects to represent them. Objects have "attributes": my car is blue, my cats are both female, my wife has purple hair. Some of these attributes are objects in their own right: a car has wheels, a cat has four legs and a person has two legs and two arms. Some of these are variables: sometimes my driveway has a car in it, sometimes it doesn't, and sometimes it has two or three.

What this means is that objects have state. A car's speed and direction vary all the time, as does the position of a cat's tail. An object is a natural way to bundle up a collection of state variables.

An object also can do things. In a program, objects rarely do things on their own, some other object tells them what to do. This is a lot like sending the object a message. "Hey, car -- turn left!" "Hey, cat! What color are your eyes?" Software objects don't usually have any of their state on the outside where other parts of the program can see it; they have to be asked politely. That means that the object might be computing something that looks to the outside like a simple state variable -- a car computes its speed from the diameter of its tires and how fast its wheels are turning. All these things are hidden from the rest of the program.

An object has to have a method for handling any message thrown at it, so the documentation of most languages refers to "calling a method" rather than "sending a message". It's the same thing in most cases. As seen from the outside, calling a method is just like calling a function or subroutine, except that there's an extra argument that represents the object. If you think of the message as a command -- an imperative sentence -- it has an implied subject, which is the object you're talking to. Inside the little program that handles the method, the object is represented by a name that is usually either "self" or "this", depending on the language.

One natural thing to do with objects is to classify them, so objects have "classes", and classes tend to be arranged in hierarchies. My cats are cats -- we say that Ticia and Desti are instances of the class "Cat". All cats are animals, so Cat is a subclass of Animal, as are Dog and Person. (In programming languages, classes tend to have their names capitalized. Variables (like "x" or "cat") almost always start with a lowercase letter. Constants (like "Pi", "True", and "Ticia") can go either way.)

An object's class contains everything the computer needs to determine the behavior of its instances: a list of the state variables, a list of the methods it can handle, and so on. Now we get to the fun part.

Simula was just objects and methods tacked on to a "traditional" imperative language -- Algol in that case. Most other "object-oriented" languages work the same way -- C++ is just objects tacked on to C; Python and Perl had objects from the beginning or close to it, but they still include a lot of things -- numbers, strings, arrays, and so on, that aren't objects or (like strings in Java) are pretending to be objects. It doesn't have to be that complicated.

Shortly after Simula was released, Alan Kay at Xerox PARC realized that you could design a programming language where everything was an object. The result was Smalltalk, and it's still the best example of a pure object-oriented language. (It inspired C++ and Java, and indeed almost all existing OO languages, but very few of them went all the way. Ruby comes closest of the languages I'm familiar with.) Smalltalk is turtles objects all the way down.

In Smalltalk numbers are objects -- you can add new methods to them, or redefine existing methods. (Your program will eventually stop working if you redefine addition so that 2+2 is 5, but it will keep going for a surprisingly long time.) So are booleans. True and False are instances of different subclasses of Boolean, and they behave differently when given messages like ifTrue: or ifFalse:. (I'm not going to go into the details here; Smalltalk warrants an article all by itself). Loops are methods, too. The trickiest bit, though, is that classes are objects. A class is just an instance of Metaclass, and Metaclass is an instance of itself.

Someone trying to implement Smalltalk has to cheat in a couple of places -- the Metaclass weirdness is one of them -- but they have to hide their tracks and not get caught.

Functional languages

Smalltalk, for all its object orientation, is still an imperative language underneath. A program is still a sequence of statements, even if they're all wrapped up in methods. Objects have state -- lots of it; they're just a good way of organizing it. Alan Kay once remarked that programming in Smalltalk was a lot like training a collection of animals to perform tricks together. Sometimes it's more like herding cats.

But it turns out that you don't need statements -- or even state -- at all! All you need is functions.

About the same time Alan Turing was inventing the Turing Machine, which is basically a computer stripped-down and simplified to something that will just barely work, Alonzo Church was developing something he called the Lambda Calculus, which did something similar to the mathematical concept of functions.

A mathematical function takes a set of inputs, and produces an output. The key thing about it is that a given set of inputs will always produce exactly the same output. It's like a meat grinder -- put in beef, and you get ground beef. Put in pork, and you get ground pork. A function has no state, and no side-effects. State would be like a secret compartment in the meat grinder, so that you get a little chicken mixed in with your ground beef, and beef in your ground pork. Ugh. A side effect would be like a hole in the side that lets a little of the ground meat escape.

Objects are all about state and side effects. You can call a method like "turn left" on a car (in most languages that would look like car.turn(left)) and as a side effect it will change the car's direction state to something 90 degrees counterclockwise from where it was. That makes it hard to reason about what's going to happen when you call a particular method. It's even harder with cats.

Anyway, in lambda calculus, functions are things that you can pass around, pass to other functions, and return from functions. They're first class values, just like numbers, strings, and lists. It turns out that you can use functions to compute anything that you can compute with a Turing machine.

Lambda calculus gets its name from the way you represent functions: To define a function that squares a number, for example, we say something like:

square = λ x: x*x

Now, this looks suspiciously like assigning a value to a variable, and that's exactly what it's doing. If you like, you can even think of λ as a funny kind of function that takes a list of symbols and turns it into a function. You find the value of an expression like square(3) by plugging the value 3 into the function's definition in place of x. So,

square(3)
  → (λ x: x*x)(3)
  → 3*3
  → 9

Most functional languages don't allow you to redefine a name once you've bound it to a value; if you have a function like square it wouldn't make much sense to redefine it in another part of the program as x*x*x. This is vary different from imperative languages, where variables represent state and vary all over the place. (Names that represent the arguments of functions are only defined inside any one call to the function, so x can have different values in square(2) and square(3) and nobody gets confused.)

Lambda calculus lends itself to recursive functions -- functions that call themselves -- and recursion is particularly useful when you're dealing with lists. You do something to the first thing on the list, and then combine that with the result of doing the same thing to the rest of the list.Suppose you have a list of numbers, like (6 2 8 3). If you want another list that contains the squares of those numbers, you can use a function called map that takes two arguments: a function and a list. It returns the result of applying the function to each element of the list. So

map(square, (6, 2, 8, 3))
 → (36, 4, 64, 9)

It's pretty easy to define map, too. It's just

map = λ (f, l):
	if isEmpty(l)
           then l
	   else cons(f(first(l)), map(f, rest(l)))

The cons function constructs a new list that consists of its first argument, which can be anything, tacked onto the front of its second argument, which is a list. Using first and rest as the names of the functions that get the first item in a list, and the remainder of the list after the first item, is pretty common tese days. Historically, the function that returns the first item in a list is called car, and the function that returns the rest of the list is called cdr (pronounced sort of like "could'r").

I know, the classic introduction to recursive functions is factorial. Lists are more interesting. Besides, I couldn't resist an excuse for mentioning "car". We'll see "cat" in a future post when I discuss scripting languages.

We call map a "higher-level" function because it takes a function as one of its arguments. Other useful higher-level functions include filter, which takes a boolean function (one that returns true or false) and flatmap, which takes a function that returns lists and flattens them out into into a single list. (Some languages call that concatmap.) Higher-level functions are what make functional languages powerful, beautiful, and easy to program in.

Now, Church's lambda calculus was purely mathematical -- Church used it to prove things about what kinds of functions were computable, and which weren't, which is exactly what Alan Turing did with the Turing Machine. We'll save that for another post, except to point out that Church's lambda calculus can compute anything that a Turing machine can compute, and vice versa.

But in 1958 John McCarthy at MIT invented a simple way of representing lambda expressions as lists, so that they could be processed by a computer: just represent a function and its arguments as a list with the function first, followed by its arguments. Trivial? Yes. But brilliant. His paper included a simple function called eval that takes a list representing a function and its arguments, and returns the result. Steve Russell realized that it wouldn't be hard to implement eval in machine language (on an IBM 704).

I'll save the details for another post, but the result was a language called Lisp. It's the second oldest programming language still in use. The post about LISP will explain how it works (elsewhere I've described eval as The most amazing piece of software in the world), and hopefully show you why Kanef's song The Eternal Flame says that God wrote the universe in LISP. Programmers may want to take a look at "Sex and the Single Link".

And finally,

I'd be particularly interested in seeing comments from the non-programmers among my readers, if I haven't lost you all by now. Was this post interesting? Was it understandable? Was it too long? Were my examples sufficiently silly? Inquiring minds...

Another fine post from The Computer Curmudgeon.

mdlbear: (lemming)

I have asked a couple of people for five Things that they associate with me, to ramble on about in my journal. I extend the same offer to anyone who comments here.

Here are the five things I got from [livejournal.com profile] pocketnaomi:

Songwriting

I believe I started making up songs when I was eight or ten years old, but didn't actually write any down until I was in college and found myself rooming with two other guitar players. They would have been classifiable as filksongs if I'd ever heard of such things at the time. I only remember bits of one of them, but was told at the last reunion I went to, a decade ago, that one of those former roommates still sings that and another one, which I had entirely forgotten. I keep them in my computer files now.

It was my involvement with fandom and filk that finally "gave me permission" to write songs, a few of which were worth singing in public. As time goes by I seem to have gotten better at it.

I wrote five songs last year, my most prolific year so far, and more than the previous five years put together. Last year also included my two or three best songs so far.

I have a tendency to write lyrics first; if I start with music it may take years for the tune to attach itself to a suitable lyric.

I've helped teach songwriting at a couple of weekend workshops run by Kathy Mar; I don't claim to be much good at that, but you're welcome to read my notes for the 2007 workshop and draw your own conclusions.

Programming

Programming is, in essence, the art of giving orders to an incredibly fast, incredibly accurate, and moronicly literal-minded demon. As such, it represents a very useful skill for game-players and parents. You will note that I do consider it an art, and in particular a branch of literature. (My degree is in Computer Science, but I feel strongly that any field with the word "science" in its name isn't one.)

Another way of looking at it is to say that the inside of a computer is an alternate universe where magic works: programs are spells, and obey most of the usual laws of magic. They also share with traditional magic the fact that a misspoken spell can wreak untold havoc.

Programming, like reading, is one of those activities I do in a light trance state. When I'm on my game (increasingly rare these days) I occasionally look up from my keyboard after what seems like a short time and wonder why it's suddenly gotten dark outside.

House Parties

Our household has four Saturday parties every year: one in late December or early January celebrating the new year and our anniversary, one in March (the "It's Green!" party, now by long-established tradition the Saturday after Consonance) to celebrate Spring, St. Patrick's Day, and our birthdays, one in June (originally to celebrate the anniversary of Colleen's flower business, but now just for the tradition of it), and one in late October to celebrate Halloween.

We also have an Open House every Wednesday -- these were originally devised by Colleen to make sure that she would have adults to talk to even after our older daughter was born.

The house is also more-or-less open during the entire Winter holiday season; we don't exactly expect guests, but are never surprised if they show up, and occasionally invite them.

Our 25th Anniversary party was remarkable in being the only one for which we hired entertainment -- the members of Golden Bough had been to a few of our previous parties, and we booked them a year in advance to make sure they'd show up. It was also the only one we had to rent chairs for.

Cordwainer Smith

... is/was one of my favorite science fiction authors. The name "Mandelbear" comes in part from a post I made in alt.callahans, and in part from one of my favorite characters, the Middle-Sized Bear, in his story "Mark Elf"; my latest and arguably most autobiographical song is called "A Talk With the Middle-Sized Bear". My first filk song, "The Shores of the Night", was loosely inspired by another of his stories, "The Lady Who SailedThe Soul".

My favorite story of his is probably "The Dead Lady of Clown Town", though it's hard to pick just one. I especially admire him for his imagery and his narrative style; many of his stories are written as if they were popular history, written years -- centuries, in some cases -- after the events they recount. "Drunkboat" is also worthy of mention; its description of the first journey through hyperspace is simply a translation of Arthur Rimbaud's "Le Bateau ivre".

Janis Ian

... is one of my favorite singer-songwriters, and probably the celebrity I would most like to spend a night with -- swapping songs, of course. She's also a long-time science fiction fan, and more recently author. I see from her tour schedule that she's toastmistress at the Nebula Awards banquet this year. I have yet to run into her at a Worldcon; she never goes to the filking.

I have been known to perform one of her songs, "The Last Train", in filk circles and even at a concert or two.

Her website includes lots of good articles on being a songwriter and performer, backed by 40-odd years of experience. Highly recommended.

mdlbear: (hacker glider)

My performance review yesterday seemed to go pretty well. No numbers until $boss goes to $grandboss and things grind through the mill, but... On average, my income has not kept up with inflation; I don't expect that to change. $boss had some good suggestions about my next project; it's starting to come together into something coherent. And useful, which is always good.

There was a group in the lunchroom this afternoon (or maybe it was yesterday; everything runs together these days) discussing code readability. Somebody remarked that my Perl code always looks like I expected it to be read by somebody else. I responded that I know damned well that when it comes back to me in a year or so, I'll be somebody else.

... but that doesn't always help. Our form-handling system was written nearly a decade ago, by $boss.previous, on top of an old version of the world's third-ugliest programming language. Which, to my lasting embarassment, I designed. (Aside: PIA is basically a stripped-down Lisp with XML syntax. HTML, in the beta version. Just for comparison, the fourth ugliest language is INTERCAL. Trust me, you don't want to meet ATLAS or the Advantest IC tester language (the name of which I have mercifully forgotten) which are in second and first place respectively. Syntax by H.P.Lovecraft; architecture by Sarah Winchester. There are things you're better off not knowing.)

Anyway, our intrepid system administrator was trying to get it running on a modern Linux box so he could pull out the historical records. M. is fscking brilliant (fluent in five or six languages, degree in CS, ...) but let's face it: mouldering code, dead language, written by people who should have known better... Took us about an hour to find both places where an absolute pathname was hard-coded in. Just because it was my own stupid design didn't mean I remembered it: thank goodness for find, grep, and man.

Could probably have done it in half the time if it'd been me at the keyboard. Nice to know that, even with my brain mostly turned to mush from old age and chronic caffeine addiction, I can still out-debug a bright young whippersnapper half my age.

mdlbear: the positively imaginary half of a cubic mandelbrot set (Default)

Saw the first rainbow of the season this morning; it was drizzling on my way to work. By noon, fortunately, it was cool and clear: just right for walking.

After my walk (see previous post) I spent the entire afternoon programming. This mostly consisted of taking the quick hack of a Perl program I wrote yesterday and adding some code for, basically, computing the weighted average of a list of vectors. The pre-existing code I was working from was an earlier hack written in Objective C by somebody else. Fun! No, really. I don't get in quite as much programming as I ought to for my sanity and self-respect.

Tomorrow I have to go back, figure out how I really ought to compute that weighted average instead of faking it, and then convert the guts of the code from an egregious hack into a reasonably well-documented Perl module. It will eventually be open source. Meanwhile, one of my coworkers is in Japan using the egregious hack for a demo.

mdlbear: (hacker glider)

Only worked on one track this evening, but it was a fun one: "Vampire Megabyte". You see, it has this line: "...and the modem beeped and twittered as the mainframe lost its mind..." -- and it really wanted a suitable sound effect.

Turns out Audacity plugins are written in this Lisp-based language called Nyquist. "Aha!", says I to myself, says I. "I can play that game..."

It's been a while since I wrote any serious Lisp beyond the occasional Emacs tweak. But it all comes back.

Here's where we get into the serious hacking. )

In other album-related news, I spent some time consolidating my customer data someplace where it can be backed up safely, in an encrypted form, offsite. I hope to finally get around to sending emails sometime early this week. (\me grovels contritely) I am now very close to having something I can send to Oasis, though it's probably going to be the week after Westercon at the rate I'm going.

mdlbear: (hacker glider)

Recently I've been looking into the programming language Ruby. This page has a lot of good resources, including some for people who know nothing at all about programming and want a quick introduction. Ruby turns out to be a particularly good first language, in part because of its closeness to Smalltalk, which was originally designed as a teaching language.

Learn to Program, by Chris Pine is a good introduction to programming, in 12 easy web pages. Reading this and following the examples won't make you a programmer, but it will get you pointed in the right direction.

Why's (Poingnant) Guide to Ruby by why the lucky stiff is a quirky, funny, occasionally (yes) poingnant guide to the language with cartoon foxes and weird sidebars; significantly more complete than Learn to Program. Experienced programmers will probably find it too slow and rambling, but it's an entertaining ramble if you have the time for it.

Programming Ruby (The Pragmatic Programmer's Guide) by Dave Thomas is a more traditional introduction, perhaps not as elegant as the Report on the Algorithmic Language Algol 60, but highly accessible. You could easily use it as an introductory programming textbook.

If you're already familiar with some other programming language, you should head directly to Ruby From Other Languages -- although it doesn't mention Smalltalk, which is arguably the closest match.

If you're a web developer and haven't been living under a rock for the last four years you already know about Ruby on Rails.

mdlbear: the positively imaginary half of a cubic mandelbrot set (Default)
KLone - Embedded Web Server and SDK
KLone is a fully-featured, multiplatform, web application development framework, targeted especially for embedded systems and appliances.

It is a self-contained solution which includes a web server and an SDK for creating WWW sites with both static and dynamic content. When using KLone, there's absolutely no need for any additional component: neither the HTTP/S server (e.g. Apache, Netscape, Roxen), nor the typical active pages engine (PHP, Perl, ASP, Python).

KLone does everything, and does it fast and small.

KLone blends the HTTP/S server application together with its content and configuration into a single executable file. The site developer writes his/her dynamic pages in C/C++ (in usual scripting style: <% /* code */ %>) and uses KLone to transform them into embeddable, compressed native code with the native C/C++ compiler. The result is then linked to the HTTP/S server skeleton to obtain one single, ROM-able, binary file.
They don't call it "C server pages", but that's what it is.

Most Popular Tags

Syndicate

RSS Atom

Style Credit

Page generated 2019-04-22 02:40 pm
Powered by Dreamwidth Studios