mdlbear: (hacker glider)

1: The Turing Machine

So, Wednesday I looked at Wikipedia's front page and saw, under the "On this day" heading:

1936 – English mathematician Alan Turing published the first details of the Turing machine (model pictured), an abstract device that can simulate the logic of any computer algorithm by manipulating symbols.

It was the "model pictured" that grabbed me. The caption was/is "A physical Turing machine model. A true Turing machine would have unlimited tape on both sides, however, physical models can only have a finite amount of tape."

I knew that -- everyone who studies computer science knows that, and a few have dreamed, as I had, of building a physical model. I even figured out how to build one out of wood, minus a few details. But there it was.

(If you're not interested in the details, you can skip this and the other indented blocks. But I digress...)

A Turing Machine is a remarkably simple device. It has a read/write head, a strip of tape that it operates on, and a controller with a finite number of states. It can read what's on the tape -- the classic machine uses blank, "0", and "1". (Some versions use "X" instead of "1", and some dispense with "0" and just have 1 and blank. That makes programming them a little more complicated, but not by much. Some have more symbols. It doesn't matter -- you can program around it.) The machine can move the tape backward and forward. Numbers are usually represented in unary, so you count "1", "11", "111", ..., although with both 1 and 0 you could use binary, and some versions do.

The machine has a "state", which selects a line in the machine's program that tells it what to write, which direction to move the tape, and which state to go to next, depending on what symbol the read head is looking at. (Think of the table as a drum with pegs on it, like a music box.)
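
To make that concrete, here's a minimal sketch in Python -- mine, not the logic of any particular physical model. The hypothetical program table adds one to a unary number: it skips right over the 1's and writes a 1 on the first blank it finds.

    # A minimal sketch of the read/write/move/next-state cycle.  The
    # hypothetical PROGRAM below adds one to a unary number: skip right
    # over the 1's, write a 1 on the first blank, then halt.
    from collections import defaultdict

    # (state, symbol read) -> (symbol to write, head move, next state)
    PROGRAM = {
        ("scan", "1"): ("1", +1, "scan"),
        ("scan", " "): ("1", +1, "halt"),
    }

    def run(tape_str, state="scan", pos=0):
        tape = defaultdict(lambda: " ", enumerate(tape_str))  # "unlimited" tape
        while state != "halt":
            write, move, state = PROGRAM[(state, tape[pos])]
            tape[pos] = write
            pos += move
        return "".join(tape[i] for i in range(min(tape), max(tape) + 1)).strip()

    print(run("111"))   # '1111' -- three becomes four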

That's it. That's all you need to compute any function that can be computed by any kind of mechanical or digital computer. Of course you may need a lot of tape -- so you need to attach it to a tape factory -- and a lot of time.

The critical thing is that it's possible to design a universal Turing machine: it takes a tape, and the state table of a Turing machine (in 1's, 0's and blanks), and it uses that description to do exactly what that machine is programmed to do. Turing's big accomplishment was using the universal Turing machine to prove that there are some things that a computer can't do, no matter how much time and tape you give it.

But of course I was much more fascinated by the machines, starting at the website of the model that first grabbed my attention, and proceeding to a Turing machine made of Lego. I spent some time in the Turing machine gallery. But the rabbit hole went deeper than that.

2: The Universal Constructor

At about that point it occurred to me to look at the Wikipedia page for the Von Neumann universal constructor. Because once you have a kind of machine that can simulate itself, the natural question is whether you can have a machine that can build a copy of itself.

The trivial answer to this question is "Yes, of course. Cells have been reproducing themselves for billions of years." But in the 1940s when von Neumann was thinking about this, the structure of DNA had not yet been determined -- that was 1953 -- and although it had been known since the late 1920s that DNA had something to do with heredity, nobody knew how it worked. So his insight into the machinery of reproduction was pretty remarkable.

Like Turing's insight into the machinery of computation, von Neumann's insight into the machinery of reproduction was to separate the machine -- the Universal Constructor -- from the description of what it was to construct, stored on something simple -- a tape.

Von Neumann's machine was/is a cellular automaton; it "lived" (if you can call it that) on a grid of squares, where each square can be in one of 29 different states, with rules that tell it what to do depending on the states of its neighbors. A completely working machine wasn't simulated until 1995, using a 32-state extension of von Neumann's original rules. Its constructor had 6,329 cells, and a tape with a length of 145,315. It took over 63 billion timesteps to copy itself. (Smaller and faster versions have been constructed since then.)

At, say, 1000 steps/second, that would have taken over two years. It wasn't until 2008 that a program, Golly, became able to simulate it using the hashlife algorithm; it now takes only a few minutes.

Which led me even further down the rabbit hole. Because no discussion of cellular automata would be complete without Conway's Game of Life.

3: The Game of Life

It's not really a game, of course; it's a cellular automaton. Each cell in the square grid is either dead or alive. You start with an arrangement of live cells, and turn them loose according to four simple rules (sketched in code after the list):

  1. If a live cell has fewer than two live neighbors (out of the 8 cells surrounding it), it dies of loneliness.
  2. A live cell with two or three live neighbors, stays alive.
  3. A live cell with more than three live neighbors dies of overpopulation.
  4. A dead cell with exactly three live neighbors becomes live.
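
Here's what those four rules look like in Python -- a minimal sketch of my own, representing a pattern as a set of (x, y) coordinates of live cells. The example is a horizontal row of three cells, which flips to a vertical row in one step.

    # Minimal sketch of the four rules.  A pattern is a set of (x, y)
    # coordinates of live cells; step() returns the next generation.
    from collections import Counter

    def neighbors(cell):
        x, y = cell
        return [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0)]

    def step(live):
        counts = Counter(n for cell in live for n in neighbors(cell))
        # A cell is alive next turn if it has exactly 3 live neighbors,
        # or if it is alive now and has exactly 2.
        return {cell for cell, n in counts.items()
                if n == 3 or (n == 2 and cell in live)}

    blinker = {(0, 1), (1, 1), (2, 1)}   # horizontal row of three
    print(step(blinker))                 # {(1, 0), (1, 1), (1, 2)} -- now vertical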

I first encountered the game in the October 1970 issue of Scientific American, in Martin Gardner's "Mathematical Games" column. The Wikipedia article gives a good introduction.

Patterns in Life evolve in a bewildering variety of ways. Many of them die out quickly -- an isolated cell, for example. Some patterns sit there and do nothing -- they're called "still lifes"; a 2x2 block of cells, for example. Some blow up unpredictably, and may or may not leave a pile of random still lifes behind. Some patterns oscillate: a horizontal row of three cells will become a vertical row in the next turn, and vice versa -- it's called a "blinker".

And some patterns move. The simplest, called a "glider", appears in this post's icon. You can crash gliders into blocks or gliders into gliders, and depending on the timing they will do different interesting things. It didn't take people long to figure out that you can build computers, including a universal Turing machine. Or a machine that prints out the digits of Pi.

Or a universal constructor.

4: The universal constructor

While I was falling into this rabbit hole, I happened to remember a passing mention of a universal constructor that can build anything at all out of exactly 15 gliders. (Strictly speaking, anything that can be constructed by crashing gliders together. Some patterns can't be made that way. But almost all the complicated and interesting ones that people are building these days can.) If this intrigues you, go read the article. Or wait until the next section, where I finally get to the bottom of the rabbit hole.

On the way down I encountered lots of weird things -- the aforementioned universal Turing machine and Pi printer, and a variety of "spaceships" that travel by, in effect, repeatedly constructing a new copy of themselves, then cleaning up the old copy. It took a while for me to get my head around that.

Then, sometime Wednesday evening, I found the book.

5: The Book of Life

It's not called "The Book of Life", of course, it's called Conway's Game of Life: Mathematics and Construction. But you get the idea. You can download the PDF.

The book ends with a pattern that simulates a Life cell. There are several different versions of this; this is the latest. It works by making copies of itself in any neighboring cells that are coming alive, then destroying itself if it's about to die. Wild.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (wtf)

So, a couple of days ago (September 8th, to be exact) Patreon laid off their entire five-person security team. WTF? The linked article goes on to say,

The firm, which is still doing business in Russia, simply calls it “a strategic shift” (which seems to be corporate mumbo-jumbo for “cheaper outsourcing”). But infosec experts call it a “nightmare” caused by an “untrustworthy” company that’s “just put a massive target on its back.”

You can see links to more articles below in the resources.

The minimum reasonable response to this would be to change your password. Done that. It's not unreasonable to delete your account. I'm still supporting a few sites, so I'll leave my account in place until I see what's going to happen. And laying in a supply of popcorn.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

According to this article posted yesterday on Ars Technica, there is a major security hole in Zoom for the Mac. Zoom issued a security bulletin on Saturday. The article suggests that you should download the update directly from Zoom or click on your menu bar options to "Check for updates" rather than waiting for the auto-update, although if you've already updated since Saturday you're probably ok.

The article goes into more detail; tl;dr is that Zoom's installer is owned by and runs as root, and has a major bug that allows unsigned updates to be installed.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

I've been using the same software for doing my taxes for somewhere around 30 years. It was called TaxCut back then; the company that made it was bought by H&R Block in 1993, though they didn't rename the software until 2008. For much, if not all, of that time I've been doing it on a Mac of some sort.

Last year I looked at the system requirements and discovered that it would no longer run on my ageing Mac Mini. It also wouldn't run on Windows 7. It needed either macOS High Sierra or Windows 8.1. So I used their web version, which I remember as rather slow, and different enough from the offline version of previous years to be annoying.

So for this year (which is to say tax year 2021), my options would appear to be:

  1. Use the web version again. Ugh, but at least it would import 2020 without trouble. Maybe. It didn't let me upload a 2019 data file; I had to feed it a PDF and do a lot of fixing up.
  2. Run it on the laptop that has Win 8.1, or put the Win 10 disk that came with (new) Sable back in and use that. Ugh.
  3. Buy a newer Mac Mini. I could get a minimal one for about $100-150, or a more recent one (running Mojave) for around $200-250. (Those are eBay prices, of course.)

(Note that cost of the software is the same for all three options.)

I'm really leaning toward #3. But really that would just be an excuse to buy another computer, and would leave me with two Mac Minis that I'd hardly ever use. More likely I'll dither about it until the end of March and then break down and go use the web version again.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

This post in Krebs on Security describes an unusual and potentially very dangerous attack technique that can be used to sneak evil code past code reviews and into the supply chain. Briefly, it allows evildoers to write code that looks very different to a human and a compiler. It should probably come as no surprise that it involves Unicode, the same coding standard that lets you make blog posts that include inline emoji, or mix text in English and Arabic.

In particular, it's the latter ability that the vulnerability targets, specifically Unicode's "Bidi" algorithm for presenting a mix of left-to-right and right-to-left text. (Read the Bidi article for details and examples -- I'm not going to try plopping random text in languages I don't know into the middle of a blog post.)

Now go read the "Trojan Source Attacks" website, and the associated paper [PDF] and GitHub repo. Observe, in particular, the Warning about bidirectional Unicode text that GitHub now attaches to files like this one in C++. Observe also that GitHub does not flag files that, for example, mix homoglyphs like "H" (the usual ASCII version) and "Н" (the similar-looking Cyrillic letter that sounds like "N"; how similar it looks depends on what font your browser is using). If you're unlucky, you might have clicked on a URL containing one or more of these, that took you someplace unexpected and almost certainly malicious.

The Trojan Source attack works by making use of the control characters U+202B RIGHT-TO-LEFT EMBEDDING (RLE) and U+202A LEFT-TO-RIGHT EMBEDDING (LRE), which change the base direction explicitly.
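
If you want to check your own source tree, a scan for these characters is only a few lines of Python. This is my own quick sketch, not the tooling from the paper; it flags the embedding characters mentioned above along with the other Bidi controls (overrides, isolates, and the "pop" characters that end them).

    # Quick sketch: flag Unicode Bidi control characters in source files.
    BIDI_CONTROLS = {
        "\u202A": "LEFT-TO-RIGHT EMBEDDING (LRE)",
        "\u202B": "RIGHT-TO-LEFT EMBEDDING (RLE)",
        "\u202C": "POP DIRECTIONAL FORMATTING (PDF)",
        "\u202D": "LEFT-TO-RIGHT OVERRIDE (LRO)",
        "\u202E": "RIGHT-TO-LEFT OVERRIDE (RLO)",
        "\u2066": "LEFT-TO-RIGHT ISOLATE (LRI)",
        "\u2067": "RIGHT-TO-LEFT ISOLATE (RLI)",
        "\u2068": "FIRST STRONG ISOLATE (FSI)",
        "\u2069": "POP DIRECTIONAL ISOLATE (PDI)",
    }

    def scan(path):
        """Print the location and name of every Bidi control in a file."""
        with open(path, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                for ch, name in BIDI_CONTROLS.items():
                    if ch in line:
                        print(f"{path}:{lineno}: {name}")

    if __name__ == "__main__":
        import sys
        for path in sys.argv[1:]:
            scan(path)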

And remember: ШYSINAШYG - What You See Is Not Always What You've Got!

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

If you're sensible enough not to use Facebook, WhatsApp, or Instagram, or to have set up "log in with Facebook" on any site you use regularly, you might not have noticed that they all disappeared from the internet for about six hours yesterday. Or if you noticed, you might not have cared. But you might have read some of the news about it, and wondered what the heck BGP and DNS are, and what they had to do with it all.

And if not, I'm going to tell you anyway.

You're more likely to have heard of DNS: that's the Internet's phone book. Your web browser, and every other program that connects to anything over the Internet, uses the Domain Name System to look up a "domain name" like, say, "www.facebook.com", and find the numerical IP address that it refers to. DNS works by splitting the name into parts, and looking them up in a series of "name servers". First it looks in a "root server" to find the address of the Top-Level Domain (TLD) server that holds the lookup table for the last part of the name, e.g., "com". From the TLD server it gets the address of the "authoritative name server" that holds the lookup table for the next part of the name, e.g., facebook, and looks there for any subdomains (e.g. "www").
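
In practice a program almost never walks that hierarchy itself; it hands the whole job to the system resolver. Here's a tiny illustration in Python (my sketch; there's nothing Facebook-specific about it):

    # Ask the system resolver -- which consults a caching name server, and
    # behind the scenes the root, TLD, and authoritative servers -- for the
    # addresses a name currently resolves to.
    import socket

    def lookup(name):
        return sorted({info[4][0] for info in socket.getaddrinfo(name, None)})

    print(lookup("www.facebook.com"))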

(When you buy a "domain name", what you're actually buying is a line in the TLD servers that points to the DNS server for your domain. You also have to get somebody to "host" that server; that's usually also the company that hosts your website, but it doesn't have to be.)

All this takes a while, so the network stack on your computer passes the whole process off to a "caching name server" which remembers every domain name it looks up, for a time which is called the name's "time to live" (TTL). Your ISP has a caching name server they would like you to use, but I'd recommend telling your router (if you have full control over it) to use Cloudflare's or Google's nameserver, at the IP address 1.1.1.1 or 8.8.8.8 respectively. Your router will also keep track of the names of the computers attached to your local network.

Finally, we get to the Border Gateway Protocol (BGP). If DNS is the phone book where you look up street addresses, BGP is the road map that tells your packets how to get there from your house, and in particular what route to take.

The Internet is a network of networks, and it's split up into "autonomous systems" (AS), each of which is a large pool of routers belonging to a single organization. Each AS exchanges messages with its neighbors, using BGP to determine the "best" route between itself and every other AS in the Internet. (The best route isn't always the shortest; the protocol can also take things like the cost of messages into account.) BGP isn't entirely automatic -- there's some manual configuration involved.
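
As a very rough illustration of what "best" means, here's a toy sketch of mine: prefer the route with the higher local preference (the policy/cost part), then the shorter AS path. Real BGP has a much longer list of tie-breakers, and the AS numbers here other than Facebook's 32934 are made up.

    # Toy BGP-style route selection: prefer higher local preference (policy),
    # then the shorter AS path.  Real BGP has many more tie-breakers.
    routes = [
        {"as_path": [64500, 64501, 32934], "local_pref": 100},
        {"as_path": [64502, 32934],        "local_pref": 100},
    ]

    def best_route(candidates):
        return max(candidates, key=lambda r: (r["local_pref"], -len(r["as_path"])))

    print(best_route(routes)["as_path"])   # [64502, 32934]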

What happened yesterday was that somebody at Facebook accidentally gave a command that resulted in all the routes leading to Facebook's data centers being withdrawn. In less than a minute Facebook's DNS servers noticed that their network was "unhealthy", and took themselves offline. At that point Facebook had basically shot themselves in the foot with a cannon.

Normally, engineers can fix server configuration problems like this by connecting to the servers over the internet. But Facebook's servers weren't connected to the internet anymore. To make matters worse, the computers that control access to Facebook's buildings -- offices as well as data centers -- weren't able to connect to the database that told them whose badges were valid.

Meanwhile, computers that wanted to look up Facebook or any of its other domains (like WhatsApp and Instagram), kept getting DNS failures. There isn't a good way for an app or a computer to determine whether a DNS lookup failure is temporary or permanent, so they keep re-trying, sometimes (as Cloudflare's blog post puts it) "aggressively". Users don't usually take an error for an answer either, so they keep reloading pages, restarting their browsers, and so on. "Sometimes also aggressively." Traffic to Facebook's DNS servers increased to 30 times normal, and traffic to alternatives like Signal, Twitter, Telegram, and Tiktok nearly doubled.

Altogether a nice demonstration of Facebook's monopoly power, and great fun to read about if you weren't relying on it.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: blue fractal bear with text "since 2002" (Default)

A rather mixed bag of things that, arguably, I should have written about a week ago.

1: the Let's Encrypt root certificate.

Hopefully this won't affect you, but if your browser starts complaining about websites suddenly being untrusted, you need to upgrade. The problem is that Let's Encrypt's root certificate is expiring, and will be replaced by a new one (see the link above for details). Starting October 1st, browsers and other programs that rely on the old cert will have problems if they haven't been upgraded in the last year.
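
If you'd rather test than guess, a few lines of Python will tell you whether this machine is affected. This is my own sketch, using letsencrypt.org merely as a convenient example of a site with a Let's Encrypt certificate: it asks your local TLS stack and trust store to validate the connection.

    # Can this machine's TLS stack and trust store still validate a site
    # with a Let's Encrypt certificate?  A verification failure raises
    # ssl.SSLCertVerificationError.
    import socket, ssl

    def can_verify(host, port=443):
        ctx = ssl.create_default_context()   # uses the system trust store
        try:
            with socket.create_connection((host, port), timeout=10) as sock:
                with ctx.wrap_socket(sock, server_hostname=host):
                    return True
        except ssl.SSLCertVerificationError:
            return False

    print(can_verify("letsencrypt.org"))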

You keep your OS and your browser up to date, right? There are some old apps and operating systems that are no longer receiving upgrades, and so won't know about the new root cert. Specifically, if you're using one of these products:

  • OpenSSL <= 1.0.2
  • Windows < XP SP3
  • macOS < 10.12.1
  • iOS < 10 (iPhone 5 is the lowest model that can get to iOS 10)
  • Android < 7.1.1 (but >= 2.3.6 will still mostly work)
  • Mozilla Firefox < 50
  • Ubuntu < 16.04
  • Debian < 8
  • Java 8 < 8u141
  • Java 7 < 7u151
  • NSS < 3.26
  • Amazon FireOS (Silk Browser)

Possibly also affected:

  • Cyanogen > v10
  • Jolla Sailfish OS > v1.1.2.16
  • Kindle > v3.4.1
  • Blackberry >= 10.3.3
  • PS4 game console with firmware >= 5.00
  • IIS

(You can probably upgrade to the newest Firefox or switch to a recent version of Chrome, which will restore your ability to browse the web, but a few other things might still fail. For example, Firefox will keep working on my ancient Mac Mini, but Safari probably won't.)

The following articles go into a lot more detail; you can get a good overview from the first two:

  • Smart TVs, fridges and light bulbs may stop working next year: Here's why
  • An Internet of Trouble lies ahead as root certificates begin to expire en masse, warns security researcher • The Register
  • The Impending Doom of Expiring Root CAs and Legacy Clients
  • Let's Encrypt's Root Certificate is expiring!
  • Certificate Compatibility - Let's Encrypt

2. Philips Respironics CPAP recall:

If you're using a CPAP made by Philips Respironics, hopefully you've already seen the Recall Notification [PDF]. I missed it, through my habit of ignoring notifications in the Dreamstation app and website. The email I got from Medicare says:

If you own or rent one of the Philips products that was recalled, talk to your doctor as soon as possible about whether to continue using your recalled equipment. If you would like to replace or repair your equipment, the supplier you bought the equipment from is responsible for replacing or repairing rental equipment at no cost to you when the equipment is less than 5 years old.

If, like me, you insist on continuing to use your facehugger, install an antibacterial filter, which will keep little bits of soundproofing foam out of your lungs. This is probably only necessary if you've been using ozone to clean your device, but I decided not to take chances.

3. Chevrolet Bolt EV recall:

If you own a Bolt, you should have received several letters about this recall. Hopefully you haven't been throwing them away unread, but if you have, you'll want to enable "hilltop reserve" to limit your charging to 90%, don't run your battery down below about 70 miles, park outside immediately after charging, and don't leave your Bolt charging indoors overnight. "Experts from GM and LG have identified the simultaneous presence of two rare manufacturing defects in the same battery cell as the root cause of battery fires in certain Chevrolet Bolt EVs." You don't want to take chances with battery fires. They're nasty; lithium is perfectly capable of burning under water.

Be safe out there.

On a more hopeful(? helpful, at least) note, dialecticdreamer has posted Demifiction: Breaking Omaha!, which despite being set in a fictional universe contains a lot of practical advice for disaster preparedness.

mdlbear: (technonerdmonster)

Note: Despite being posted on a Saturday and a title that includes the name of a character from a well-known musical, this is not a Songs for Saturday post. It doesn't have anything to do with fish, either.

Remarkably, Joseph Weizenbaum's original source code for ELIZA has been rediscovered, after having been missing and believed lost for over half a century, and was made public on May 23rd of this year. ELIZA is probably the oldest and almost certainly the best-known implementation of what is now known as a chatbot.

If you decide to look at the code, start by reading the web page it's embedded in before you dive into the listing. The "Notes on reading the code" section, which comes after the listing, will prevent a lot of confusion. The listing itself is a scan of a 132-column listing, and definitely benefits from being viewed full-screen on a large monitor.

The first thing you see in the listing is the script -- the set of rules that tells the ELIZA program how to respond to input. The program itself starts on page 6. You might be misled by the rules, which are in the form of parenthesized lists, into thinking that the program would be written in LISP. It's not; it's written in MAD, an Algol-like language, with Weizenbaum's SLIP (Symmetric List Processing) primitives embedded in it.

SLIP uses circular, bidirectionally-linked lists. Each list has a header with pointers to the first and last list element; the header of an empty list points to itself. I've lost track of how many times I've implemented doubly-linked lists, in everything from assembly language to Java.
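
For the curious, here's roughly what that structure looks like -- my own minimal Python sketch of the idea, not Weizenbaum's SLIP code: a circular, doubly-linked list whose header points to itself when the list is empty.

    # Circular, doubly-linked list with a header node; an empty list's
    # header points to itself in both directions.
    class Node:
        def __init__(self, value=None):
            self.value = value
            self.prev = self
            self.next = self

    class DList:
        def __init__(self):
            self.head = Node()            # header node

        def append(self, value):
            node, last = Node(value), self.head.prev
            last.next, node.prev = node, last
            node.next, self.head.prev = self.head, node

        def __iter__(self):
            node = self.head.next
            while node is not self.head:
                yield node.value
                node = node.next

    words = DList()
    for w in "men are all alike".split():
        words.append(w)
    print(list(words))   # ['men', 'are', 'all', 'alike']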

ELIZA is the name of the program; "Eliza" usually refers to the combination of an Eliza-like program with a script. The most common script, called "Doctor", is a (rather poor) simulation of a Rogerian psychotherapist. According to the note at the bottom of the Original Eliza page, actual Rogerian therapists have pronounced it a perfect example of how not to do Rogerian therapy. Nevertheless, many people are said to have been helped by ELIZA, and it's possible to have a surprisingly intimate conversation with her as long as you suspend your disbelief and respect her limits.

If you have Emacs installed on your computer, you can access a version of Doctor with M-x doctor. Otherwise, browse to Eliza, Computer Therapist if you don't mind having a potentially intimate conversation with something hosted on a public website. (Or simply download the page -- it's written in Javascript.)

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

If you happen to be the administrator of a Microsoft Exchange Server that can be accessed from the internet, you need to immediately

  1. Apply the patches that Microsoft released on Tuesday: Multiple Security Updates Released for Exchange Server – updated March 5, 2021 – Microsoft Security Response Center
  2. Use this script (on GitHub) to scan your logs, as described in HAFNIUM targeting Exchange Servers with 0-day exploits - Microsoft Security to determine whether you are one of the at least 30,000 organizations that have been hacked via the holes you just patched (see Step 1). (You did patch them, right?) If you are,...
  3. Figure out what it means to your organization that all of your organization's internal email is now sitting on a disk somewhere in China. If that sounds like A Very Bad Thing,...
  4. Panic.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Today I was shocked to read that Fry's Electronics has gone out of business, as of midnight last night (February 24th). Their web page has the announcement:

After nearly 36 years in business as the one-stop-shop and online resource for high-tech professionals across nine states and 31 stores, Fry’s Electronics, Inc. (“Fry’s” or “Company”), has made the difficult decision to shut down its operations and close its business permanently as a result of changes in the retail industry and the challenges posed by the Covid-19 pandemic. The Company will implement the shut down through an orderly wind down process that it believes will be in the best interests of the Company, its creditors, and other stakeholders.

It's a sad, sad day. Their first ad, a full page in the San Jose Mercury-News, was like nothing seen before (or since), listing computer chips and potato chips on the same page. (Its relationship to Fry's Food and Drug, which had recently been sold by the founders' father, was obvious.) As time went by the groceries largely disappeared, but soft drinks and munchies remained, and some of the larger stores included a café.

I (snail) mailed a copy of that first ad to my father, and that first Sunnyvale store was one of the tourist attractions we visited on his next visit to the West Coast. I have no idea how much money I spent there over the years.

After I moved to Washington in 2012 my visits to Fry's became much less frequent, and more of my electronics started coming from Amazon. It's been years since I saw the inside of a Fry's store.

I'll miss it.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

I've always been a little uncomfortable about build systems and languages that start the build by going out to a package repository and pulling down the most recent (minor or patch) version of every one of the package's dependencies. Followed by all of their dependencies. The best-known of these are probably Python's pip package manager, Javascript's npm (node package manager), and Ruby's gems. They're quite impressive to watch, as they fetch package after package from their repository and include it in the program or web page being built. What could possibly go wrong?

Plenty, as it turns out.

The best-known technique for taking advantage of a package manager is typosquatting -- picking a name for a malware package that's a plausible misspelling of a real one, and waiting for someone to make a typo. (It's an adaptation of the same technique from DNS - picking a domain name close to that of some popular site in hopes of siphoning off some of the legitimate site's traffic. These days it's common for companies to typosquat their own domains before somebody else does -- facbook.com redirects to FB, for example.)

A few days ago, Alex Birsan published "Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies", describing a new attack that relies on the way package managers like npm resolve dependencies, by looking for and fetching the most recent compatible version (i.e. with the same major version) of every package, and the fact that they can be made to look in more than one repository.

Fetching the most recent minor version of a package is usually perfectly safe; packages have owners, and only the owner can upload a new version to the repository. (There have been a few cases where somebody has gotten tired of maintaining a popular package, and transferred ownership to someone who turned out to be, shall we say, less than reliable.)

The problem comes if, like most large companies and many small ones, you have a private repository that some of your packages come from. The package manager looks in both places, public and private, for the most recent version. If an attacker somehow gets the name and version number of a private package that doesn't exist in the public repository, they can upload a bogus package with the same name and a later version.
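
Here's the heart of the problem, reduced to a toy Python sketch of my own (no real package manager is this naive, and the package name is made up, but the effect is the same): given the same name in a private and a public index, "take the newest version" happily takes the attacker's.

    # Toy version resolution: same package name in two indexes, take the
    # highest version, wherever it lives.
    private_index = {"acme-internal-utils": ["1.0.0", "1.2.0"]}
    public_index  = {"acme-internal-utils": ["99.0.0"]}   # attacker's upload

    def parse(version):
        return tuple(int(part) for part in version.split("."))

    def resolve(name):
        candidates  = [(parse(v), v, "private") for v in private_index.get(name, [])]
        candidates += [(parse(v), v, "public")  for v in public_index.get(name, [])]
        _, version, source = max(candidates)
        return version, source

    print(resolve("acme-internal-utils"))   # ('99.0.0', 'public') -- oops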

It turns out that the names and versions of private packages can be leaked in a wide variety of ways. The simplest turns out to be looking in your target's web apps -- apparently it's not uncommon to find a copy of a `package.json` left in the app's JavaScript by the build process. Birsan goes into detail on this and other sources of information.

Microsoft has published 3 Ways to Mitigate Risk When Using Private Package Feeds, so that's a good place to look if you have this problem and want to fix it. (Hint: you really want to fix it.) Tl;dr: by far the simplest fix is to have one private repo that includes both your private packages and all of the public packages your software depends on. Point your package manager at that. Updating the repo to get the most recent public versions is left as an exercise for the reader; if I were doing it I'd just make a set of dummy packages that depend on them.

Happy hacking!

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (g15-meters)

Not only is today Boxing Day, it's also the birthday of Charles Babbage: December 26, 1791. He designed the first general-purpose, programmable digital computer, which he called the Analytical Engine. That also makes the Analytical Engine the first unfinished computer project (unless you count Babbage's Difference Engine, but that wasn't a general-purpose computer). Contrary to popular belief, the mechanical precision of the time was quite capable of producing it (proved by the full implementation of the Difference Engine, using 1820s-level technology, in the 1990s), but the machining proved much more expensive than expected, and the project eventually ran out of funding. It's an old story.

But this post isn't about Babbage, or the Difference Engine -- this post is about a song I wrote back in 1985 called Uncle Ernie's [ogg][mp3], and that in turn was directly inspired by Mike Quinn Electronics, a surplus joint located in a run-down old building at the Oakland airport, run by a guy named Mike Quinn. I had to search for the name of the store; everyone just called it "Quinn's". There's a good description of the place in "Mighty Quinn and the IMSAI connection" on The Official IMSAI Home Page. As far as I know there is no connection to "Quinn the Eskimo" by Bob Dylan besides the title.

At one point Quinn's had a Bendix G-15 for sale, with a price tag just short of $1000. Unlike the one I first learned programming on, it had magtape drives as well as paper tape. Somebody eventually bought it; I hope they gave it a good home. That's almost certainly the origin of the line about magtape drives in the second verse. A 7090 would have occupied the entire building.

Almost all of the other computers mentioned -- Altair, Imsai, Apple 3, PC Junior, Heathkit Hero (yes, Heath sold robot kits back in the 1980s) -- were also quite real, and some of the smaller ones almost certainly did show up at Quinn's from time to time, especially the Imsai and Altair, which were sold in kit form. The only thing I made up completely was the temperature controller in verse three. The only one I actually used was the 7090 (or rather its successor, the 7094, but that wouldn't have scanned).

( lyrics, if you don't want to click through )

mdlbear: (technonerdmonster)

You may remember this post about renaming the default branch in Git repositories. Since then I've done some script writing -- they say you don't really understand a process until you can write a program that does it, and this was no exception. (There are lots of exceptions, actually, but that's rather beside the point of this post...)

Anyway, here's what I think is the best way to rename master to main in a clone of a repository where that rename has already been done. (That's a common case anywhere you have multiple developers, each with their own clone, or one developer like me who works on a different laptop depending on the time of day and where the cats are sitting.)

     git fetch
     git branch -m master main
     git branch -u origin/main main
     git remote set-head origin main
     git remote prune origin

The interesting part is why this is the best way I've found of doing it:

  1. It works even if master isn't the current branch, or if it's out of date or diverged from upstream.
  2. It doesn't print extraneous warnings or fail with an error.

Neither of those is a problem if you're doing everything manually, but it can be annoying or fatal in a script. So here it is again, with commentary:

git fetch -- you have to do this first, or the git branch -u ... line will fail because git will think you're setting upstream to a branch that doesn't exist on the origin.

git branch -m master main -- note that the renamed branch will still be tracking master. We fix that with...

git branch -u origin/main main -- many of the pages I've seen use git push -u..., but the push isn't necessary and has several different ways it can fail, for example if the current branch isn't main or if it isn't up to date.

git remote set-head origin main -- This sets main as the default branch, so things like git push will work without naming the branch. You can use -a for "automatic" instead of the branch name, but why make git do extra work? Many of the posts I've seen use the following low-level command, which works but isn't very clear and relies on implementation details you shouldn't have to bother with:

    git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/main

git remote prune origin -- I've seen people suggesting git fetch --prune, but we already did the fetch way back in step 1. Alternatively, we could use --prune on that first fetch, but then git will complain about master tracking a branch that doesn't exist. It still works, but it's annoying in a script.

Just as an aside, because I think it's amusing: my former employer (a large online retailer) used and probably still uses "mainline" for the default branch, and I've seen people suggesting it as an alternative to "main". It is, if anything, more jarring than "master" for someone who has previously encountered "mainlining" only in the context of self-administered street drugs.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Hopefully, this post will become the first of a series about solving various common problems with Git. Note that the grouping in that phrase is intentionally ambiguous – it could be either “(solving various common problems) with Git”, or “solving (various common problems with Git)”, and I expect to cover both meanings. Often there are aspects of both: Git got you into trouble, and you need to use Git to get yourself out of it.

“It is easy to shoot your foot off with git, but also easy to revert to a previous foot and merge it with your current leg.” —Jack William Bell

In many cases, though, this will involve git rebase rather than merge, and I think “rebase it onto your current leg” reads better.

Overcoming your fear of git rebase

Many introductions to Git leave out rebase, either because the author considers it an “advanced technique”, or because “it changes history” and the author thinks that it’s undesirable to do so. The latter is undermined by the fact that they usually do talk about git commit --amend. But, like amend, rebase lets you correct mistakes that you would otherwise simply have to live with, and avoid some situations that you would have a lot of trouble backing out of.

In order to rebase fearlessly, you only need to follow these simple rules:

  • Always commit your changes before you pull, merge, rebase, or check out another branch! If you have your changes committed, you can always back out with git reset if something goes wrong. Stashing also works, because git stash commits your work in progress before resetting back to the last commit.
  • Never rebase or amend a commit that’s already been pushed to a shared branch! You can undo changes that were pushed by mistake with git revert. (There are a few cases where you really have to force-push changes, for example if you foolishly commit a configuration file that has passwords in it. It’s a huge hassle, and everyone else on your team will be annoyed at you. If you’re working on a personal project, you’ll be annoyed at yourself, which might be even worse.)
  • If you’re collaborating, do your work on a feature branch. You can use amend and rebase to clean it up before you merge it. You can even share it with a teammate (although it might be simpler to email a patch set).

That last rule is a lot less important if you’re working by yourself, but it’s still a good idea if you want to keep your history clean and understandable – see Why and How To Keep Your Master Happy. And remember that you’re effectively collaborating if your project is on GitHub or GitLab, even if nobody’s forked it yet.

Push rejected (not fast forward)

One common situation where you may want to rebase is when you try to push a commit and it gets rejected because there’s another commit on the remote repo. You can detect this situation without actually trying to push – just use git fetch followed by git status.

I get into this situation all the time with my to-do file, because I make my updates on the master branch and I have one laptop on my desk and a different one in my bedroom, and sometimes I make and commit some changes without pulling first to sync up. This usually happens before I’ve had my first cup of coffee.

The quick fix is git pull --rebase. Now all of the changes you made are sitting on top of the commit you just pulled, and it’s safe for you to push. If you’re developing software, be sure to run all your tests first, and take a close look at the files that were merged. Just because Git is happy with your rebase or merge, that doesn’t mean that something didn’t go subtly wrong.

Pull before pushing changes

I get into a similar situation at bedtime if I try to pull the day's updates and discover that I hadn't pushed the changes I made the previous night, resulting in either a merge commit that I didn't want, or merge conflicts that I really didn't want. You can avoid this problem by always using git pull --rebase (and you can set the config variable pull.rebase to true to make that the default, but it's a little risky). But you can also fix the problem.

If you have a conflict, you can get back out of it with git merge --abort. (Remember that pull is just shorthand for fetch followed by merge.) If the merge succeeded and made an unwanted merge commit, you can use git reset --hard HEAD^.

Another possibility in this situation is that you have some uncommitted changes. In most cases Git will either go ahead with the merge, or warn you that a locally-modified file will be overwritten by the merge. In the first case, you may have merge conflicts to resolve. In the second, you can stash your changes with git stash, and after the pull has finished, merge them back in with git stash pop. (This combination is almost exactly the same as committing your changes and then rebasing on top of the pulled commit – stash actually makes two hidden commits, one to preserve the working tree, and the other to preserve the index. You can see it in action with gitk --all.)

… and I’m going to stop here, because this has been sitting in my drafts folder, almost completely finished, since the middle of January.

Resources

NaBloPoMo stats:
   5524 words in 11 posts this month (average 502/post)
    967 words in 1 post today

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

If you've been paying attention to the software-development world, you may have noticed a movement to remove racist terms in tech contexts. The most obvious such terms are "master" and "slave", and there are plenty of good alternatives: primary/secondary, main/replica, leader/follower, etc. The one that almost every software developer sees every day is Git's "master" default branch. This issue on GitLab includes some good discussion of what makes "main" the best choice for git. (I've also seen "mainline" used.)

Renaming the default branch on an existing repo is easy. If it has no remotes -- for example, if it's purely local or a shared repo on a server you have an ssh account on -- it's a one-liner:

   git branch -m master main

It's a little more complicated for a clone, but not much more complicated:

   git branch -m master main
   git push -u origin main
   git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/main
   git pull

What you need to do at this point depends on where your origin repo is located. If you've already renamed its default branch, you're done. If you haven't, the git push -u created it. At this point, if your origin repo is on GitHub, you need to log in and change its default branch from master to main, because it won't let you delete its default branch.

Then, delete the old master branch with

   git push --delete master

This works for simple cases. It gets a little more complicated on GitHub because you might have web hooks, pull requests, and so on that still refer to master. GitHub says that renaming master will be a one-step process later in the year, so you may want to wait until then. For less complicated situations, any URLs that reference master will get automatically redirected to main. See this page for details.

I had a slightly different problem: my shared repositories are on my web host, and there are hook scripts that pull from the shared repo into the web directory. My version of the post-update hook only looks for changes on the master branch. Fortunately that's a one-liner, too:

   ssh HOST sed -i -e s/master/main/g REPO/hooks/post-update

 

The next problem is creating a new repo with main as the default branch. GitHub already does this, so if you are starting your project there you're good to go. Otherwise, read on:

The Git project has also added a configuration variable, init.defaultBranch, to specify the default branch for new repositories, but it's probably not in many distributions yet. Fortunately, there's a workaround, so if you don't want to wait for your distribution to catch up, you can take advantage of the way git init works, as described in this article by Leigh Brenecki:

  1. Find out where Git keeps the template that git init copies to initialize a new repo. On Ubuntu, that's /usr/share/git-core/templates, but if it isn't there look at the man page for git-init.
  2. Copy it to someplace under your control; I used .config/git/init-template.
  3. cd to the (new) template and create a file called HEAD, containing ref: refs/heads/main.
  4. Set the init.templateDir config variable to point to the new template.

Now when git wants to create a new repo, it will use HEAD to tell it which branch to create. Putting all that together, it looks like:

   cp -a /usr/share/git-core/templates/ ~/.config/git/init-template
   echo ref: refs/heads/main > ~/.config/git/init-template/HEAD
   git config --global init.templateDir ~/.config/git/init-template

You can actually replace that initial copy with mkdir; git is able to fill in the missing pieces. Alternatively, you can add things like a default config file, hooks, and so on.

(I've already updated my configuration repository, Honu, to set up the modified template along with all the other config files it creates. But that probably doesn't help anyone but me.)

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

NaBloPoMo stats:
   2146 words in 4 posts this month (average 536/post)
    814 words in 1 post today

mdlbear: (technonerdmonster)

It's been a while since I described the way I do backups -- in fact, the only public document I could find on the subject was written in 2006, and things have changed a great deal since then. I believe there have been a few mentions in Dreamwidth and elsewhere, but in this calamitous year it seems prudent to do it again. Especially since I'm starting to feel mortal, and starting to think that some day one of my kids is going to have to grovel through the whole mess and try to make sense of it. (Whether they'll find anything worth keeping or even worth the trouble of looking is, of course, an open question.)

My home file server, a small Linux box called Nova, is backed up by simply copying (almost -- see below) its entire disk to an external hard drive every night. (It's done using rsync, which is efficient because it skips over everything that hasn't been changed since the last copy.) When the disk crashes (it's almost always the internal disk, because the external mirror is idle most of the time) I can (and have, several times) swap in the external drive, make it bootable, order a new drive for the mirror, and I'm done. Or, more likely, buy a new pair of drives that are twice as big for half the price, copy everything, and archive the better of the old drives. Update it occasionally.

That's not very interesting, but it's not the whole story. I used to make incremental backups -- instead of the mirror drive being an exact copy of the main one, it was a sequence of snapshots (like Apple's Time Machine, for example). There were some problems with that, including the fact that, because of the way the snapshots were made (using cp -l to copy directories but leave hard links to the files that hadn't changed), it took more space than it needed to, and made the backup disk very difficult -- not to mention slow -- to copy if it started flaking out. There are ways of getting around those problems now, but I don't need them.

The classic solution is to keep copies offsite. But I can do better than that because I already have a web host, and I have Git. I need to back up a little.

I noticed that almost everything I was backing up fell into one of three categories:

  1. Files I keep under version control.
  2. Files (mostly large ones, like audio recordings) that never change after they've been created -- recordings of past concerts, my collection of ripped CDs, the masters for my CD, and so on. I accumulate more of them as time goes by, but most of the old ones stick around.
  3. Files I can reconstruct, or that are purely ephemeral -- my browser cache, build products like PDFs, executable code, downloaded install CDs, and of course the entire OS, which I can re-install any time I need to in under an hour.

Git's biggest advantage for both version control and backups is that it's distributed -- each working directory has its own repository, and you can have shared repositories as well. In effect, every repository is a backup. In my case the shared repositories are in the cloud on Dreamhost, my web host. There are working trees on Nova (the file server) and on one or more laptops. A few of the more interesting ones have public copies on GitLab and/or GitHub as well. So that takes care of Group 1.

The main reason for using incremental backup or version control is so that you can go back to earlier versions of something if it gets messed up. But the files in Group 2 don't change, they just accumulate. So I put all of the files in Group 2 -- the big ones -- into the same directory tree as the Git working trees; the only difference is that they don't have an associated Git repo. I keep thinking I should set up git-annex to manage them, but it doesn't seem necessary. The workflow is very similar to the Git workflow: add something (typically on a laptop), then push it to a shared server. The Rsync commands are in a Makefile, so I don't have to remember them: I just make rsync. (Rsync doesn't copy anything that is already at the destination and hasn't changed since the previous run, and by default it ignores files on the destination that don't have corresponding source files. So I don't have to have a complete copy of my concert recordings (for example) on my laptop, just the one I just made.)

That leaves Group 3 -- the files that don't have to be backed up because they can be reconstructed from version-controlled sources. All of my working trees include a Makefile -- in most cases it's a link to MakeStuff/Makefile -- that builds and installs whatever that tree needs. Programs, web pages, songbooks, what have you. Initial setup of a new machine is done by a package called Honu (Hawaiian for the green sea turtle), which I described a little over a year ago in Sable and the turtles: laptop configuration made easy.

The end result is that "backups" are basically a side-effect of the way I normally work, with frequent small commits that are pushed almost immediately to a shared repo on Dreamhost. The workflow for large files, especially recording projects, is similar, working on my laptop and backing up with Rsync to the file server as I go along. When things are ready, they go up to the web host. Make targets push and rsync simplify the process. Going in the opposite direction, the pull-all command updates everything from the shared repos.

Your mileage may vary.

Resources and references

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: a rather old-looking spectacled bear (spectacled-bear)

I really need to write my memoirs, preferably before my memory deteriorates to the point where I can't. (I am inspired by my mom, who published the third edition of hers last year.) I have, however, given up on the idea of following the King of Hearts' advice to "begin at the beginning, [...] and go on till you come to the end: then stop". (I note in passing that I haven't come to the end yet.) So I'm just going to dive in at whatever point seems interesting at the moment. I'll tag these by year, so that anyone interested (possibly as many as two of you) can sort them out later.

This particular point was suggested by somebody's mention of their Erdős number, so I suppose I ought to explain that first. Content Warning: contains math, which you can safely skip over if you're math-phobic. Deciding which parts to skip is left as an exercise for the reader.

You have perhaps heard of the parlor game called "Six Degrees of Kevin Bacon", based on the concept of "six degrees of separation". The idea is to start with an actor, and figure out the shortest possible list of movies that links them with Kevin Bacon. The length of that list is the actor's "Bacon number", with Bacon himself having the number zero, anyone who acted in a movie with him having the number one, and so on. As far as I know I don't have a finite Bacon number, but it's not outside the realm of possibility if, as most people do, you include TV shows and so on. I think I've been in at least one brief local TV news item.

But sometime during my senior year at Carleton College, I co-authored a paper with one of my math professors, Ken Wegner, which gave me an Erdős number of 7. The paper, published in 1970 in The American Mathematical Monthly, was "Solutions of Φ(x) = n , Where Φ is Euler's Φ-Function" [Wegner, K., & Savitzky, S. (1970), The American Mathematical Monthly, 77(3), 287-287. doi:10.2307/2317715].

So now I have three things to explain: What is an Erdős number? What is Euler's Φ function? And finally, What was my contribution to the paper?

Erdős number: As you might expect from the introduction about the Bacon Number, a mathematician's Erdős number is the smallest number of co-authored papers connecting them to Paul Erdős (1913–1996), an amazingly prolific (at least 1,525 papers) 20th Century mathematician. He spent the latter part of his life living out of a suitcase, visiting his over 500 collaborators (who thus acquired an Erdős number of 1). The Erdős number was first defined in print in 1969, at about the time I was collaborating with Wegner on Euler's Φ function.

Euler's Φ function, Φ(n), also called the Totient function, is defined as the number of positive integers less than or equal to n that are relatively prime to n; or in other words, the number of integers in the range 1 ≤ k ≤ n for which the greatest common divisor gcd(n,k) = 1. (You will also see it written in lower-case as "φ", or spelled out as "phi".)

The totient function is pretty easy to compute, at least for sufficiently small numbers. The inverse is rather less straightforward, and has been the subject of a considerable number of StackExchange queries. (This answer includes a good set of links.) I was thinking of including some detail about that, and was barely able to keep myself from falling down the usual rabbit-hole, which almost always ends up somewhere in group theory. For example, φ(n) is the order of the multiplicative group of integers modulo n. See what I mean?
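
To make that a little more concrete, here's a quick Python sketch of mine (and nothing like the FORTRAN described below): the totient itself by brute force, and the inverse by even bruter force.

    # Euler's totient by brute force, and its inverse by even bruter force.
    from math import gcd

    def phi(x):
        """Count the integers 1 <= k <= x with gcd(x, k) == 1."""
        return sum(1 for k in range(1, x + 1) if gcd(x, k) == 1)

    def phi_inverse(n, limit=10_000):
        """All x up to limit with phi(x) == n."""
        return [x for x in range(1, limit + 1) if phi(x) == n]

    print(phi(10))          # 4  (namely 1, 3, 7, and 9)
    print(phi_inverse(4))   # [5, 8, 10, 12]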

My contribution to the paper was not very closely related to the actual mathematics of the problem; what I did was write the computer program that computed and printed out the table of results. That involved a hack. A couple of hacks, actually.

In 1969, Carleton College's computer lab contained an IBM 1620 and a couple of keypunches. The 1620 was fairly primitive even by 1960s standards; its memory consisted of 20,000 6-bit words, with a cycle time of 20 microseconds. Each word contained one BCD-coded decimal digit, a "flag" bit, and a parity check bit. It did arithmetic digit-by-digit using lookup tables for addition and multiplication. It was not particularly fast -- about a million times slower than the CPU in your phone. But it was a lot of fun. Unlike a mainframe, it could sit in one corner of a classroom (if it was air-conditioned), it was (comparatively) inexpensive, and it could stand up to students actually getting their hands on it.
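
Just for fun, here's the flavor of that table-driven arithmetic in a toy Python sketch of my own (the real 1620 kept its add table in core, and earned the nickname CADET: "Can't Add, Doesn't Even Try"):

    # Digit-at-a-time decimal addition driven by a lookup table.
    ADD_TABLE = {(a, b): divmod(a + b, 10) for a in range(10) for b in range(10)}

    def add_decimal(x, y):
        """Add two unsigned decimal strings, low-order digit first."""
        xs, ys = x[::-1], y[::-1]
        carry, out = 0, []
        for i in range(max(len(xs), len(ys))):
            a = int(xs[i]) if i < len(xs) else 0
            b = int(ys[i]) if i < len(ys) else 0
            c, s = ADD_TABLE[(a, b)]
            s += carry
            carry = c + s // 10
            out.append(str(s % 10))
        if carry:
            out.append(str(carry))
        return "".join(reversed(out))

    print(add_decimal("19690", "1620"))   # '21310'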

A lot of the fun came from the fact that the 1620's "operating system" was the human operator sitting at the console, which consisted mainly of an electric typewriter and a row of buttons and four "sense switches" that the program could read. If you wanted to run a program, you put a stack of punched cards into the reader and pushed the "load" button, which read a single 80-column card into the first 80 characters of memory, set the program counter to zero, and started running. My program was written in FORTRAN. Not even FORTRAN II. Just FORTRAN.

Computing the table that occupied most of the paper took about a week.

Here's where it gets interesting, because obviously I wasn't the only student who wanted to use the 1620 that week. So I wrote an operating system -- a foreground/background system with my program running in the background, with everyone else's jobs running in the "foreground". That would have been easy except that the 1620 could only run one program at a time. Think about that for a moment.

My "operating system" consisted mainly of a message written on the back of a Hollerith card that said something like: "Flip sense switch 1 and wait for the program to punch out a deck of cards (about a minute). When you're done, put the deck in the reader and press LOAD."

Every time the program went around its main loop, it checked Sense Switch 1, and if it was set, it sent the contents of memory to the card punch. Dumping memory only took one instruction, but it wasn't something you could do from FORTRAN, so I put in a STOP statement (which FORTRAN did have) and changed it to a dump instruction -- by scanning the program's object code (remember, this is a decimal machine; an instruction took up 12 columns on the card) and replacing the HALT instruction with DUMP.

It worked.

    MR: Collaboration Distance
    MR Erdős Number = 7
    S. R. Savitzky        coauthored with   Kenneth W. Wegner      MR0260667
    Kenneth W. Wegner     coauthored with   Mark H. Ingraham       MR1501805
    Mark H. Ingraham      coauthored with   Rudolph E. Langer      MR1025350
    Rudolph E. Langer     coauthored with   Jacob David Tamarkin   MR1501439
    Jacob David Tamarkin  coauthored with   Einar Hille            MR1555331
    Einar Hille           coauthored with   Gábor Szegő            MR0008279
    Gábor Szegő           coauthored with   Paul Erdős             MR0006783

    MR0260667 points to: K. W. Wegner and S. R. Savitzky (1970),
    "Solutions of φ(x) = n, Where φ is Euler's φ-Function",
    The American Mathematical Monthly, 77(3), 287.
    DOI: 10.1080/00029890.1970.11992471.

There are two other numbers of interest: the Shūsaku Number, measuring a Go player's distance from the famous 19th-Century Go player Hon'inbō Shūsaku, and the Sabbath Number, measuring a musician's distance from the band Black Sabbath. I'm pretty sure I have a Sabbath number through filkdom. I definitely have a Shūsaku number of 5 from having lived down the hall from Jim Kerwin, Shūsaku Number 4, my sophomore and junior years at Carleton. That's another story.

And if I expect to write more journal entries about math, I'm going to have to extend my posting software to allow entries written in LaTeX. Hmm.

The Mandelbear's Memoirs

mdlbear: blue fractal bear with text "since 2002" (Default)

Bad week. Continuing the trend set last week, the filk community lost Lindy Laurant. Meanwhile what used to be a free country continues its descent into theocratic dictatorship with kleptocracy. Colleen's nausea and diarrhea also continued, though somewhat improved over the previous two weeks. The USB connector on my old Thinkpad keyboard died while I was in the process of moving the cable to its replacement. Poor little Cygnus suffered a tea spill, so I ordered a replacement keyboard.

It's a good thing that I keep spare laptops in the house. (I'm always happy to take unwanted computers off your hands.) It's a good thing that I don't actually need Bluetooth to work on Sable.

The week to come isn't likely to be any better.

Notes & links, as usual )

mdlbear: (technonerdmonster)

For some time now I've been eyeing Lenovo's ThinkPad Compact Bluetooth Keyboard with TrackPoint with a mixture of gadget lust and skepticism -- most of the reviews I saw said that the Bluetooth connection had a tendency to be laggy. Add the amount of trouble I've been having with Bluetooth on Linux Mint lately, the lack of a USB connection, and the high price, and it's been pretty far down on my list of things to buy.

Anyone who knows my fondness for -- let's be honest, addiction to -- Thinkpad keyboards can figure out what was going to happen when Lenovo came out with the ThinkPad TrackPoint Keyboard II, featuring both Bluetooth and a wireless USB dongle, but otherwise looking almost exactly like my wired KU-1255 keyboard and the keyboards on most of my Thinkpad laptops. I discussed that in "The Curmudgeon Contemplates Keyboards", a couple of weeks ago.

It arrived yesterday, much sooner than I'd expected. It's lovely, and just about what I expected. It's hard to go wrong with a Thinkpad keyboard.

Being nearly icon-blind, I took a while to puzzle out the switches, because the quick-start sheet had nothing but a few pictures to explain them. It didn't say anything at all about the "Android/Windows" switch. So I went looking on their tech support website and found nothing but a PDF of the quick-start. Not helpful. (After a day and a half I found a review that explained that it gives F9-F12 Android-specific functions, and indeed I was eventually able to make out the tiny markings above them on the beveled edge of the bezel.)

The website -- and most of the reviews -- also mentioned its support for "6-point entry for the visually impaired", but DDG and Google found nothing except references to this keyboard. Braille, maybe? Whatever. There's nothing about it on the tech-support site.

There are some things I really appreciate as a cat's minion. It's exactly the right size to sit on top of my laptop (Sable is a Thinkpad X230; the keyboards are almost identical) with the lid closed and an external monitor plugged in. If a cat shows signs of wanting to sit on it, I can set it aside (or close the lid), and pick it up later. (I broke the micro-USB connector on one of my wired Thinkpad keyboards, because I often flip it up behind the laptop with the keys away from me -- and the cat.) If a cat does sit on it, the on-off switch is easily reachable on the right-hand side. Much easier than unplugging the cable.

So let's sum up. On the positive side: the wireless USB, Bluetooth, the classic ThinkPad feel and layout, the TrackPoint nub, and two of the three buttons are exactly as I would expect. (The middle button is in the same plane as the two side buttons, and the raised dots are much lower and are no longer blue.) The charging connector is USB-C. I haven't used it long enough to evaluate battery life, but it's been on since yesterday and claims to be at 99%; Lenovo claims two months, so that's believable. It's just the right size to sit on an ultrabook like a Thinkpad X230 with the lid closed.

I'm not sure whether to count the low-contrast markings on the function keys as positive or negative. I've pretty-much abandoned my old emacs key-bindings for them, and some of the functions indicated by the icons are actually useful. I'll get out my label-maker, or label them with white-out.

On the negative side: the USB cable is just for charging. For goodness' sake, how much circuitry would it have taken to make that a third connection mode? The documentation is sketchy -- the QuickStart page is nothing but icons and arrows, and for an icon-impaired curmudgeon that's a bit of a problem. Nowhere in the documentation does it explain what the Android/Windows switch is for. There's nothing on Lenovo's tech support website, either. There's no backlight, and the function keys are labeled with tiny, low-contrast letters. The dongle is, of course, incompatible with Logitech's, so it uses another USB port. (This is a minor quibble, because I had a port free after unplugging the old keyboard.)

Some people would count the position of the Fn key, to the left of Ctrl, as a problem. They might also complain about the Page Up and Page Down keys' flanking the Up-Arrow in the inverted-T arrangement. Since I've been using Thinkpads since sometime in the last Millennium, and the new page-up/page-down positions for 95% of the last decade, I don't have a problem with either of those -- they're exactly what I want. Some people would miss the trackpad and palm rest; I've been using a wired but otherwise identical keyboard for years, and don't miss them. Your mileage may vary.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Setting up a computer so that it can boot into one of several different Linux distributions is something of a challenge; I haven't done it in quite a long time, and of course things have changed. You may remember the previous post in this series, in which I discuss the proposed partitioning scheme for Sable's new terabyte SSD. So if that didn't interest you, this probably won't either. )

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: blue fractal bear with text "since 2002" (Default)

Not a good week. Not horrible, either, by contemporary standards, but Colleen spent Monday through Thursday in the hospital with another UTI, and the air quality has gotten progressively worse. Up until today Whidbey Island -- or at least the Oak Harbor measuring station -- has had a slightly lower AQI than most of the surrounding measurements, but today it's solidly up into "Unhealthy" (around 185, though it depends somewhat on which map you're looking at and how you interpolate; the Washington Department of Ecology's map has it in the mid-to-high 200s, which is Very Unhealthy). Parts of Seattle are up into the Hazardous range. "Don't breathe anything you can see" -- good advice if you can manage it. I can't.

I've been making progress on upgrading (laptop)Sable to a 1TB SSD and three distributions (Mint/MATE, LMDE/Cinnamon, and UbuntuStudio/Xfce4). The main obstacles are the fact that Mint and UStudio both identify themselves as "ubuntu", so their boot/EFI information clobbers one another, and the fact that a lot of my setup for (window manager)Xmonad was based on Gnome, which doesn't play well with MATE or Xfce. And some of it was based on the (previously-valid) assumption that I would need only one set of config files per machine. Working on it, and (setup manager)Honu will be the better for it when I'm done. Hopefully this week. I also expect to get a few curmudgeon posts out of it.

I have not been singing nearly as much as (I feel that) I should be. A lot less than is good for me. This is, sadly, typical.

Notes & links, as usual )

mdlbear: blue fractal bear with text "since 2002" (Default)

A week. Mostly spent caring for Colleen, doing household chores, Getting A Few Things Done, making sure everything on Sable is ready to be replaced with fresh installs, and researching how to set up a computer to boot multiple OSs using EFI and the GPT partition table format. Documentation for this is thinner on the ground than one would like. (I started actually doing it today.)

Things that Got Done included ordering two new Thinkpad keyboards (the Bluetooth one will ship in "more than five weeks", which is why I ordered the other one), calling a handyman to (finally) make a concrete pad for the end of the ramp to replace the gravel nightmare that's there now, and writing a few posts. Actually, five posts, which is a little more than usual. Between the writing and various computer tasks I was actually able to get into flow a few times, which is good and keeps me from looking at the news.

... and Mom's not doing all that well. I mean, she's doing really well for someone who's 99 years old and on hospice care, but that's not really saying very much.

Notes & links, as usual )

mdlbear: (technonerdmonster)

...about disk partitioning. Content warning: rather specialized geekness. If that's not something you're into, you might want to skip this.
tl;dr )

  Dreamwidth makes an excellent rubber duck -- thanks for listening.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

For the last week or two my external keyboard has been flaking out -- dropping keystrokes, and occasionally barfing out a string of repeats. The cats, of course, know nothing about this. Or will admit to nothing, in any case. So yesterday, after determining that a blast of canned difluoroethane wasn't going to fix it, I finally started to think seriously about replacing it.

The keyboard has only a limited set of plausible replacements, because there are only two types of external keyboard that I can stand: the Model M and the ThinkPad TrackPoint Keyboard. The Model M and the oldest of the Thinkpad keyboards (the marvelous SK-8845 Ultranav) can be dismissed out of hand because they lack a logo key, which I've gotten used to using as Xmonad's Mod key. Most Model Ms lack a trackpoint, although I have one that has it -- and two PS-2 connectors on the cable. Besides, I'm not positive that I can find my Model M at this point, and it takes up a lot of desk space that I don't have anymore.

The second generation of Thinkpad keyboards -- the SK-8855 -- has a logo key and an attached USB cable that stows into a recess on the back, but puts the page-up and page-down keys on the right-hand edge, in what has become, for me, the wrong place. That makes them just different enough from the keyboards on the newer Thinkpads to be annoying. I have one that I'd consider using anyway, but it's broken; my second one is out on loan.

(You might well ask why, since both of the laptops I'm using -- Sable and Raven -- are Thinkpads with the right keyboard, I would be looking at external keyboards. I blame the cats. If I have an external keyboard and an external monitor on my desk, I can close the lid and let Desti sit on it. Come to think of it, that may be why I need a replacement keyboard in the first place.)

There are three Thinkpad keyboards with the new layout -- the KU-1255, which is what I'm looking to replace, the Bluetooth version, and the shiny new ThinkPad TrackPoint Keyboard II. The Bluetooth version has gotten poor reviews -- apparently it tends to be laggy -- and in any case one of the laptops it needs to go with doesn't have Bluetooth. (I know -- dongles. I'm also running out of USB ports.) The Keyboard II has both Bluetooth and a wireless USB dongle. (It would, of course, be ideal if it were compatible with Logitech's, but of course it wouldn't be.)

I was just about to order one when I saw this line on Lenovo's website:

Ships in more than 5 weeks.

So it looks as though I get to spend $60 on a KU-1255 to use while I'm waiting. Or instead. Or maybe an SK-8855, because they have an attached USB cable instead of requiring a (fragile) micro-USB, except that those appear to be made of unobtainium today. And I can get the KU-1255 from Amazon and have it delivered tomorrow.

Just for the record, here's what I like (and some reviewers detest, of course) about the newer Thinkpad keyboards:

  • Page-up and page-down keys. (Many -- perhaps most -- newer compact keyboards require using the function key on the up and down arrows, which makes it hard to hit one-handed. Because cat.)
  • The cursor keys are all in one place on the lower right: the arrows in an inverted-T arrangement, with the page-up and page-down on either side of the up-arrow in what practically every other keyboard leaves as empty space. Huh?
  • Trackpoint -- the little red pointing stick between the G, H, and B keys. I don't always use it, but it's there when I need it. And you can scroll with it.
  • Along with the trackpoint, there are three buttons directly under the space bar. The middle one is what you hold down to scroll with the trackpoint; on Linux it's also "paste selection" in most places, and "download" in browsers.
  • The classic Thinkpad key-feel. A lot like a Model M clicky-key only silent. Less travel than the mechanical keys on the Model M, but I've come to prefer that.

I'm still waffling over the II. It's hard to justify, now that I have a 1255 on order. But not impossible. Meanwhile I'll just sit here listening to The Typewriter (a concerto for orchestra and solo typewriter) by Leroy Anderson. (There's a version that includes a repeat performance using an IBM Selectric, but I can't seem to find it now. It would have been perfect for this post.)

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: blue fractal bear with text "since 2002" (Default)

Hmm. File corruption on (laptop)Sable, political corruption (a given these days), some interesting debugging on (temporary replacement laptop) Raven, singing, no walking to speak of, Colleen doing her exercises, Colleen's usual medical problems and some less usual ones, deli sandwiches by curbside pickup, killer hurricane, killer cops,... FSCK (literally as well as figuratively).

Oh yeah, almost forgot -- C.D.C. Now Says People Without Covid-19 Symptoms Do Not Need Testing (via siderea | How Can You Tell the CDC is Lying? Their Lips Are Moving.)

The usual mixed week, here at this end of the Rainbow Caravan.

Notes & links, as usual )

mdlbear: (technonerdmonster)

Early last Sunday afternoon I noticed that the battery-charge indicator had vanished from (main laptop)Sable's Gnome panel. (That's sort of like the row of icons and such you see along the bottom of the screen on a Mac, except that I've configured it to go vertically down the left-hand edge, where it doesn't reduce the height of my browser window too much.)

Hmm, says I to myself, maybe it will come back after a reboot. So I did that, and logging in presented me with an empty screen background. ??? A little more experimentation showed that only the Gnome-2 desktop was affected; the Ubuntu one (which I detest) worked fine. So did a console terminal, and SSH. The obvious next step was to run fsck, the file-system checker (and many hackers' favorite stand-in for a certain four-letter expletive).

Well, not quite the next step. Since I figured that fixing file-system corruption might possibly make things worse, I moved over to one of my spare laptops, Raven, sat Sable on the shelf next to my desk, and logged in on Sable with SSH. Then I went to the top of my working tree and ran make status to see what needed to be checked in. I think I've mentioned MakeStuff before -- it's basically a multi-function build tool based on GNU Make, and one of the things it can do is find every git repository under the top-level directory, and do things like check its status, or pull. (Commit takes a little more thought, so you don't want to do it indiscriminately.)

Then I ran MakeStuff/scripts/scripts/pull-all on Raven. Done.

Well, almost. There are a few things in my home directory that aren't under my working tree, mostly Desktop, Documents, Downloads, my Firefox bookmarks, and my Gnome Panel configuration. I hauled out a USB stick and fired up tar (like zip, except that it can save everything about a file, not just what DOS knows about) -- and ran straight into the fact that USB sticks are usually formatted with a FAT filesystem, which limits files to 4GB. Growf! Faced with the unappetizing prospect of shipping 17GB of backups over WiFi, I carried Sable over to my server and plugged in the ethernet cable that I leave hanging off the router for just such occasions. The command I ended up using -- because I probably forgot a few things (and should have excluded a few more, like Ruby and Perl) -- was

    rsync -a --exclude vv --exclude ?cache --exclude ?golang . \
          nova:/vv/backups/steve\@sable

After that finished, I fired up Firefox, bookmarked all my tabs, and exported tabs and bookmarks to an HTML file. Should have done that before I backed up everything, but I didn't think of it.

Finally, I was ready to run fsck and find out the bad news. I plugged in the USB stick with the Ubuntu live installer (one does not run fsck on a mounted filesystem!), brought up a terminal, and ran

e2fsck -cfp /dev/sda5 # check for bad blocks, force, preen

(Force means to do a full check even if the disk claims it's okay; "preen" means to make all repairs that can be done without human approval.) Naturally, after turning up a few dozen bad blocks, it told me that I had to run it manually. I could have replaced the -p option with -y, to say "yes" to all requests for approval; instead I left it off and hit Enter a hundred times or so. Almost all the problems were "doubly-claimed blocks", mostly shared between some other file and the swapfile. Of course. Fsck offered to clone those blocks, and I took it up on that offer. Then ran it again to make sure it hadn't missed anything. It hadn't. But it was still broken, no doubt because of all those corrupted files.

So this morning, after a couple of searches, I installed the debsums program, which finds all of the files you've installed, and compares their checksums against the ones in the packages they came from. The following command then takes that list, and re-installs any package containing a file with a bad checksum:

apt-get install --reinstall $(dpkg -S $(debsums -c) \
       | cut -d : -f 1 | sort -u)

Sable now "works" again. I know at one zip file was corrupted (it was a download, and I was able to find it again), and fsck doesn't appear to have kept a log, so broken files will keep turning up for a while. I know there aren't any bad zip files left because there's an option in unzip, -t, that compares checksums, just the way debsums does, so I could loop through all my downloads with:

for f in *.zip; do echo -n $f:\ ; unzip -tqq $f; echo; done

I have two remaining tasks, I think: one is to validate all of my Git working trees (worst case -- just blow them away and re-clone them), and then comes the really hard one: deciding whether I still trust Sable's SSD, or need to get a new one. And if I get a new one, how big? Sable and its 500GB drive were purchased together, used, from eBay, and brand-new 1TB SSDs are pretty cheap right now. So there's that.
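
For the first of those tasks, a loop like this would at least tell me which repositories need attention (a sketch, assuming everything lives somewhere under my home directory; git fsck re-verifies the checksum of every object in a repository, which is exactly the kind of damage I'm worried about):

    # Find every .git directory, then ask git to verify the repository it belongs to.
    find ~ -type d -name .git -prune 2>/dev/null | while read -r gitdir; do
        repo=$(dirname "$gitdir")
        echo "== $repo"
        git -C "$repo" fsck --no-progress || echo "** possible corruption in $repo"
    done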

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

If you happen to be running a Windows DNS server, I hope you have automatic updates enabled. Today's security update fixes CVE-2020-1350, also known as SigRed: A 17-year-old 'wormable' vulnerability for hijacking Microsoft Windows Server. I think that title kind of says it all, doesn't it? For the record, it's a heap-based buffer overflow that can be triggered by a malicious DNS query, and it's described as "wormable", with a CVSS base score of 10.0. Wormable means that it can propagate itself and spread exponentially to other vulnerable servers.

It's not at all inaccurate to describe this as "COVID-19 for Windows DNS server". Go fix.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

Ripple20

2020-06-29 01:02 pm
mdlbear: (technonerdmonster)

This one is pretty wild. Ripple20 is a set of 19 zero-day vulnerabilities in a widely used low-level TCP/IP software library developed by Treck, Inc. It gets its name because its position in the supply chain allowed the library with its vulnerabilities to ripple outward through hundreds of software and hardware vendors, and from there into hundreds of millions (maybe more) of devices. Printers, UPSs, infusion pumps, industrial control devices, ... any kind of thing in the Internet of Things that has a network connection.

It's been rippling outward since 1997.

It's important to note that it's not in Linux, Windows, iOS, or Android. So it's probably not in your phone or your computer. It might well be in your router, printer, WiFi-connected light switches, TV, or internet-connected refrigerator. And devices containing Wind River's VxWorks aren't affected -- that's the URGENT/11 zero-day vulnerabilities from last year.

And there seem to be only somewhere between 10,000 and 100,000 devices that are actually connected to the internet. Chicken feed.

The vulnerabilities have, of course, been patched by Treck, and sent to their customers. And from there to their customers. And so on. But how many people check for software updates for their printer? (I do.) Is it even possible to install a software patch on a light switch? Is the company that made it still in business? You see the problem.

There are ways you can set up a firewall to block these. If your router manufacturer (or open-source OS project) sends you an update, install it.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: portrait of me holding a guitar, by Kelly Freas (freas)

Around the end of 2018 I wrote a Songs for Saturday post about my song "The World Inside the Crystal", which is about my -- I'm not sure what to call it: contention? belief? fantasy? -- that inside of computers is a world where magic works. I wrote the song itself in 1985, about the time my first kid was born. (I don't remember which came first.) Anyway, about four years later I wrote another: "Daddy's World". I sang it for my music teacher last Monday and she had some questions that made me realize that there are a lot of people, some of them no doubt on my reading list, who have no idea what it's referring to. Unless you had a Mac in your house before 1990 or thereabouts, have the definition of the Mandelbrot set memorized, and know a little about complex numbers, four-dimensional geometry, and integrated circuits, that may include you.

Here there be lyrics, and footnotes. )

The recording of "Daddy's World" on YouTube comes from my CD, Coffee, Computers, and Song, and has my kids joining in on the chorus. Neither of them could sing all that well, but I really wanted them there. I don't seem to have it on any concert recordings -- sorry about that.

mdlbear: (technonerdmonster)

If you've ever looked into cloud storage (like for backups -- you do make backups, right?) you will recognize Amazon's Simple Storage Service, otherwise known as S3. It was the first of the Amazon Web Services to be released, in 2006. It's cheap ($0.023 per GB per month for up to 50TB, after which you get a bit of a discount), extremely reliable, and secure.

According to this article on "How to secure an Amazon S3 Bucket",

Here’s what you need to know to lock down an Amazon S3 bucket:

Step one: do nothing. [emphasis theirs]

Yes, do nothing because — like all other AWS services — the default configuration provides a strong security posture right out of the gate.

So when you create an S3 "bucket" (which is what they call the container you store your files in -- bits in a bucket?), only you can do anything with it. After that, if necessary, you can give other people access. Or open it up for everyone to see, for example if you want to host a website on it. (There are better places to host a website.)

If you're storing sensitive information like customer names and addresses, you can have Amazon encrypt it for you. For really sensitive things, like social security numbers and credit card information, you can encrypt it on your end. Amazon gives you some useful tools that make it easy. But this post isn't a tutorial on S3 security -- Amazon has one right here. This is, I don't know, kind of a <rant>.

Because, in spite of the fact that you have to do extra work to make a bucket public, I keep running into articles like Leaky Buckets: 10 Worst Amazon S3 Breaches and, more recently, Adult Site Leaks Extremely Sensitive Data of Cam Models.

Yes, S3 buckets can be used to exchange data with other companies or people -- if you're careful. Encrypted. Multiple times. With strictly limited access. And public buckets can be used for hosting media files and even whole (static) websites (although download bandwidth, while cheap, is not usually free -- a DDOS attack or suddenly going viral can saddle you with an appallingly high bill). But for goodness' sake don't confuse the two!

</rant>

Ask yourself these questions:

  1. Will I be absolutely delighted if thousands of random people on FB saw this file I'm storing? If the answer is "yes", make it public. Otherwise, consider making it private.
  2. Will I have a problem if certain people (my business competitors, my mother, my ex, ...) saw this file? If so, you should make it private, and use at least server-side encryption.
  3. Will I get in trouble (lawsuits, identity theft, public shaming in blog posts like this one, ...)? Encrypt it. Use client-side encryption if you want to be sure. Encrypt the filenames, too. And keep it encrypted when it's stored on your computers, as well. (In many cases there are government regulations that cover exactly how you should handle this data. Some things shouldn't be stored at all, like credit card PINs. But always encrypt.)
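
If you want the belt-and-suspenders version of that advice from the command line, the AWS CLI can do it (a sketch; the bucket name is made up, and it assumes you already have the aws CLI configured with your credentials):

    # Make sure nothing about the bucket can be made public, even by accident:
    aws s3api put-public-access-block --bucket my-private-bucket \
        --public-access-block-configuration \
        BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

    # Turn on server-side encryption by default for everything stored in it:
    aws s3api put-bucket-encryption --bucket my-private-bucket \
        --server-side-encryption-configuration \
        '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'

    # And see exactly who has been granted access:
    aws s3api get-bucket-acl --bucket my-private-bucket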

Resources

Here is the Amazon documentation for securing data on S3. There's more, but these are the basics.

... and here are a few other links, collected here for your convenience.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

If your Windows 10 boxes -- all of them, both desktops and servers -- haven't applied the update that came out Tuesday, go do that now. You can chase the references below while the update is downloading.

Here's what Microsoft said about it when they released the patch. There are more details in the resource list below. Some of those go on to link to proof-of-concept exploits. Good luck!

h/t to @thnidu.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Unlike my post Wednesday, this is one you should do Right Now(TM) if you have Firefox installed and aren't getting automatic updates. And even if you're getting updates automatically, you should check your version if you haven't updated since Wednesday. This vulnerability is being actively exploited in the wild.

The latest version is 72.0.1; you can check this by choosing the "About" item on the "Help" menu. The corresponding Android version is 68.4.1; "About" is the last item on the "Settings" menu. The update doesn't appear to be necessary on iOS (presumably because it's using a different just-in-time (JIT) compiler) -- version 20.0 was released back in October.
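
If you'd rather check from a terminal than hunt through menus (on Linux, at least), this works too; the second line assumes a Debian-style system:

    firefox --version                                    # e.g. "Mozilla Firefox 72.0.1"
    apt list --upgradable 2>/dev/null | grep -i firefox  # anything still waiting to install?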

Links

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

I haven't put on my curmudgeon's hat in way too long. This isn't going to be a very long post, and if you don't know what a hash function is and what it's used for, all you need to do is make sure you're keeping up with security upgrades. Modern web browsers are already safe, and have been for the last three years or so; if you're still using a browser older than that, you should upgrade it. Some other programs in common use are not safe yet, so watch for security upgrades in the coming months.

If you're still with me, I just wanted to point the people who worry about such things at the latest vulnerability-with-a-catchy-name: SHA-1 is a Shambles. Tl;dr: the cost to construct a pair of different messages with the same SHA-1 hash has dropped to well under $100K worth of rented GPU time. Attacks still aren't exactly practical -- they used 900 GPUs for a couple of months -- but it's only a matter of time.

Section 7 of the paper [PDF] goes into detail on current usage of SHA-1 and what is being done about it. SHA-1 has been deprecated since 2011, and is no longer allowed in digital signatures. There are, however, still some older programs and protocols where it's an option, and X.509 certificates are still being issued with SHA-1 signatures (although modern browsers will reject them). They're being fixed, but you should make sure gpg, ssh, and web browsers are up to date, and if you're a developer, please stop using or accepting keys or certificates with SHA-1 signatures.
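
If you're curious about a particular site's certificate, openssl makes the spot check easy (example.com is a placeholder, obviously):

    # What signature algorithm is this site's certificate actually using?
    echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
         | openssl x509 -noout -text | grep 'Signature Algorithm' | head -1
    # The answer you want to see these days is sha256WithRSAEncryption (or an ECDSA equivalent).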

Git uses SHA-1 hashes to identify objects, and indirectly to sign commits. The developers are working on using longer hashes, and it now includes collision-detection; I'll be particularly interested in that work going forward.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

I suppose I could have titled this one "Data-mining the to-DO loG", but since the log is kept in a directory called Dog...

My to.do file is somewhere in between a bullet journal and a logbook. Since its start as a pure to-do list in 2006, it has come to take on an increasingly important role in my life. (Some people might say that that's because my memory is deteriorating; they might be right.)

If you haven't already seen my How to.do it post, you might want to read that first. Or look under the cut in any of my "done since" posts. I mentioned a new tagging convention in this post. I have since extended it to make it easier to extract information, and also written a more general-purpose search tool. Because tool-using bear.

The net effect is that I can now easily answer questions like "when was the last time Colleen was discharged from a hospital" (answer: September 10), "what else did I do that day?" (answer: fix a messed-up fstab on Nova, and start to make a list of things I avoid doing, among other things), and so on.

As long as I can reliably find the search term and the date on the same line, grep works pretty well. The convention is to put the mmdd date in parentheses right in front of one of the words "Admitted", "Discharged", or "Transferred" (or just the first letter), so I can get "the last time Colleen was in the hospital" from:

grep '[0-9])D' 2*/*.done | tail -1

and the number of hospital stays in 2018 with

grep '[0-9])A' 2018/*.done | wc -l

Requiring a digit before the right parenthesis keeps me from getting false positives on things like "(gastroenterologist)Dr.". Other queries are equally simple. With a date somewhere on the line, I can find things like "CPAP" and "litter".

Of course, I had to go back searching for things like "admit" and "hospital" and put them into the correct format. But none of that helps much with queries like "what else was I doing?", because grep just returns a filename and a line number along with the lines that it finds. Then I have to go to emacs or less and navigate down to the line. It's possible to do better.

The solution was a script called dgrep, where the "d" stands for "done" or something like that. It does a couple of things differently:

  • Mainly, it knows that dates are four digits starting in column 1, so it can print them with the hit.
  • It knows where my to-do archive is, so I don't need to tell it what directories to search if I just want to search all of them.

so I can do the following:

 dgrep '[0-9]\)D' | tail -1
2019/09.done:247: 0910:   / (0910)Discharge instructions:

but there's one more trick. The '--less' option prints, not a filename and line number (which emacs and other editors can parse), but a command that you can use to search for that date:

 dgrep --less '[0-9]\)D' | tail -1
less -p ^0910 2019/09.done 247:   / (0910)Discharge instructions:

I just select the command, and click the middle mouse button to paste it into the command line. The help message also tells you the command line you need to look at each of the hits in succession.

The dgrep script is written in Perl and necessarily uses regular expressions, both of which are well into "now you have two problems" territory if you're not careful. But it works.
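
If you want the flavor of it without the Perl, here's roughly the same trick as an awk one-liner (a sketch only -- the real dgrep also knows where the archive lives, handles --less, and so on):

    # Remember the most recent line that starts with a four-digit date,
    # and print that date along with every line that matches the pattern.
    awk -v pat='[0-9][)]D' '
        /^[0-9][0-9][0-9][0-9]/ { date = substr($0, 1, 4) }
        $0 ~ pat                { printf "%s:%d: %s: %s\n", FILENAME, FNR, date, $0 }
    ' 2*/*.done | tail -1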

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

NaBloPoMo stats:
   7582 words in 11 posts this month (average 689/post)
    618 words in 1 post today

mdlbear: (technonerdmonster)

Today the computer curmudgeon talks about how he blogs.

I think it's widely known that I update this journal (and by that I mean both my Dreamwidth journal, which you are probably reading now, and my Computer Curmudgeon website), using a rickety combination of shell scripts and makefiles called MakeStuff. There are several good reasons for that. (Whether "because I can" is a good reason or not is debatable. There are others.)

A little over a year ago I made a planning post, and the "Where I am now" section remains a pretty good, albeit sketchy, description of the process. There's also an even sketchier one in the README file for MakeStuff/blogging. It had a list of what I wanted to do next, but essentially the only thing I've actually done is posting in either HTML or Markdown.

I have, however, reorganized things a bit, so that all of the relevant scripts are in MakeStuff/blogging -- the last part was moving in the script, now called charm-wrapper, that takes the post's metadata out of its email-like header and turns it into a command line for charm, a livejournal/dreamwidth posting client written in Python. It isn't a very good solution, but it works.
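
The header-parsing half of the idea looks something like this sketch (not the actual charm-wrapper -- the field names are just examples, and I'm leaving charm's real options out of it because I'd only have to look them up anyway):

    # Pull a couple of fields out of an email-like header, and split off the body.
    post=2019/11/05--how-to-makestuff.html
    subject=$(sed -n 's/^Subject: *//p' "$post" | head -1)
    tags=$(sed -n 's/^Tags: *//p' "$post" | head -1)
    sed '1,/^$/d' "$post" > /tmp/body.html    # everything after the first blank line
    echo "would post \"$subject\" with tags: $tags"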

And since I'm running out of time to make a post today, I'm going to start this series with the other utilities in MakeStuff/blogging. And then go add this list to the README.

  • check-html is a simple wrapper for html-tidy, a popular HTML syntax checker. Since it's not actually being used to fix syntax, it can play fast-and-loose with things like the header (it's just text) and blog-specific tags like <cut> and <user>. It handles those by putting a space after the "<" character. It would be trivial to add this to the make recipe for post.
  • last-post is a site-scraper that returns the URL of your most recent post. It's useful, because charm doesn't return it. (I eventually put that functionality into charm-wrapper, but it's still useful. Eventually it wants to take a date on the command line.)
  • word-count is what I use to generate the NaBloPoMo statistics at the bottom of posts in November. (In other months, you just get a straight list; you can also get a listing for a whole year.)
NaBloPoMo stats:
     43 2019/11/01--rabbit-rabbit-rabbit.html
   1731 2019/11/02--s4s-memorials.html
   1465 2019/11/03--done-since-1027.html
    145 2019/11/04--i-ought-to-post-something.html
    430 2019/11/05--how-to-makestuff.html
-------
   3814 words in 5 posts this month (average 762/post)
    430 words in 1 post today

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

I recently bought a new-to-me Thinkpad X230, and in keeping with my ongoing theme of naming thinkpads after their color, I called it Sable, which in addition to being the heraldic name for the colour black is also a small dark-brown animal in the mustelid (weasel) family.

I've become quite fond of Sable. It's only an inch wider than my former laptop, Cygnus (named after Cygnus X-1, the first X-ray source to be widely accepted as a black hole), about ten times faster, and the same weight. The extra inch makes it exactly the right size for me to put one of my Thinkpad USB keyboards flat on top of it. One may wonder why I would even want to do this, but I can move the keyboard to my lap when one of the cats wants to sit on my desk. (The cat is almost always Desti, who is also black, but naming a laptop after her would be confusing.)

Anyway.

One of the problems with getting a new computer is getting it configured the way I want it, which usually means "pretty much the same as the last one". Most people do this by copying as much as they can off the old one, (on Linux that's typically their entire home directory), and then installing the same collection of applications (packages, in Linux terminology). It's tedious, and when the architectures or operating system versions are different it leads to a wide range of random glitches and outright bugs that have to be tracked down individually over the course of the next week or so. Even if it doesn't, home directories tend to include a large amount of random stuff, like downloads and the contents of your browser cache, that you don't necessarily want.

And if you're trying to set up a home directory on your web host, or your work machine, or something tiny like a Raspberry Pi, well... What you really want to do is start afresh and just install what you really need.

That's where the turtles (because home is wherever you carry your shell, as it says in the song) come in. Specifically, Honu, which is a collection of makefiles and scripts that does almost all of the setup automagically.

Honu (Hawaiian for the green sea turtle) requires nothing more than a shell (the Linux/Unix command-line processor, and anything Posix-compatible will work), an SSH client, the git version-control system, and make. In fact, if you can install packages on your target system, the first thing Honu's bootstrap script will do is install the ones you don't have.
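
That first step has roughly this flavor (a sketch, not Honu's actual code; the package names assume a Debian-style system):

    # Install the prerequisites that aren't already present.
    for pkg in git make openssh-client; do
        dpkg -s "$pkg" >/dev/null 2>&1 || sudo apt-get install -y "$pkg"
    done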

Make was one of the first programs for building software automatically, and I'm very fond of it. It lets you define "recipes" -- actually, short shell scripts -- for building files out of their "dependencies", and it's clever enough to only build the things that are out of date. It can also follow rules, like the built-in one that tells it how to use the C compiler to turn a .c file into a .o object file, and the linker to turn a .o file into an executable file. (On Windows the executable file would end in an extension of .exe, but Unix and its descendants don't need it.)

Make can also follow chains of rules, so if your source file changes it will rebuild the executable, and (unless you tell it not to) delete the object file after it's sure nothing else needs it. And rules don't have to result in actual files -- if you give it a recipe for a "phony" target it will simply notice that it isn't there, and run the recipe every time. This is good for things like "install-pkgs" and "install", which are Honu's main make targets.
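
If you've never watched make do its thing, here's a toy example you can run in an empty directory (recipes have to start with a tab character, which is why I'm writing the makefile with printf):

    # copy.txt is a real file target, rebuilt only when original.txt changes;
    # "greet" is a phony target, so its recipe runs every time.
    printf 'copy.txt: original.txt\n\tcp original.txt copy.txt\n\n'  > Makefile.toy
    printf '.PHONY: greet\ngreet:\n\t@echo hello from make\n'       >> Makefile.toy
    echo "first draft" > original.txt
    make -f Makefile.toy copy.txt   # runs cp, because copy.txt doesn't exist yet
    make -f Makefile.toy copy.txt   # "copy.txt is up to date" -- nothing to do
    make -f Makefile.toy greet      # no file is ever named "greet", so this runs every time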

Turn it loose with a make command, and Honu's makefiles happily go about installing packages, creating directories, and setting up configuration files; the whole process takes well under an hour.

I wrote Honu to be pretty generic -- it knows a lot of my preferences, but it doesn't know my name, email address, current projects, or hosting service. I also have another package, Mathilda (our name for the particular happy honu who narrates "Windward"). Mathilda sets all of that up, pulling down the Git repositories for my current projects, blog archives, websites, songbooks, build systems, and so on; putting them in the right places so that I can sit down in front of Sable, open the lid, and be right at home.

...Except, as in most moving projects, for tracking down all the little pieces that got left behind, but that only took a couple of days.

...

...And in case you were wondering what happened to this week's Songs for Saturday, you can read more about "Windward" in Songs for Saturday: Travelers and The Bears, from 2015.

Where the wind takes us next year no turtle can tell,
But we'll still be at home come high water or hell,
Because home is wherever you carry your shell.

Or $(SHELL), as the case may be.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Wherein the curmudgeon cogitates out loud, and solicits suggestions.

Recently, my business (HyperSpace Express) checking account's balance has gone up considerably thanks to a writing gig. And the keyboard on my favorite laptop, a Lenovo X120e netbook called Cygnus, has been giving me trouble. It's developed a habit, which I think is thermal, of shutting down mid-boot. And last week random keys stopped working. I fixed that one -- nothing but a loose cable (this time; it's on its second replacement keyboard) -- but not before I started looking at laptops again.

For quite a long time, the laptop I'd been considering -- okay, lusting after -- as an upgrade has been the Lenovo X230. (The pointing stick and middle mouse button are required, along with an easy Linux install, so my choices are somewhat limited.) The X230 is only slightly bigger than Cygnus, but a considerable upgrade: somewhat lighter, up to 16GB of RAM, a faster CPU, USB-3, and longer battery life are the main features I'd like to have. The fact that it has a docking connector that just happens to match a dock I have sitting around is a nice extra. They're available on eBay for anywhere between $150 and $400, depending on features, and can often be obtained defenestrated or with Windows 7. (I also considered the 220, which has the old-style beveled keys, but it has little else to recommend it. Besides, I'm used to the chicklets on Cygnus and I prefer the new layout.)

A few days ago, though, I made the mistake of also looking at the X1 Carbon series. It's tempting. First the negatives: it's bigger, with a 14" screen. It's more expensive -- more like the $300-600 range. It has less I/O -- you need a dongle for ethernet. The RAM is soldered on; you get your choice of 4 or 8 GB -- the 230 is upgradable to 16. Confusingly, it comes in six different "generations" rather than having different model numbers; each generation has a different collection of I/O ports. The touchpad is larger than the one on the 230, which is a negative for a clumsy bear. It uses the M.2 form factor for SSDs, so I can't just take a drive out of any of my other laptops and stick it in. Windows 10 is standard. If you go for the "yoga" variant, which flips over to become a tablet, it's heavier.

On the other hand, it has a larger (16x9) screen, which would be especially nice for lyrics. The "yoga" version would be perfect on a music stand. It has somewhat better battery life than the 230. It comes standard with SSD. And it is, surprisingly, about half a pound lighter in the non-yoga flavor. Even the yoga is lighter than Cygnus. And although it doesn't have all the I/O I'd like to have, it's all I'm likely to need on a day-to-day basis. (The 4th generation, or the equivalent 1st generation yoga, looks like the sweet spot for I/O; it's pretty close to the 230.)

On the gripping hand, can I really justify having yet another laptop? I currently have five thinkpads (admittedly, one is old enough to vote and has the Y2K bug, and the next oldest is also an IBM; the newest is currently out on loan), two other Lenovos, and a Dell netbook. I can't find my Asus Eee, but I think it's around somewhere. (I didn't buy them all; I'm also the household's repair depot -- more accurately, dumping ground -- for old computers.) But still. And I'd have to get new stickers.

There's also the question of what I want to do with yet another laptop I don't use on a daily basis. I already keep one in the bedroom. I could, of course, keep one on the desk and one in my backpack, but I'd have to take the backpack one out to sync it every time I left the house.

Buy one of each? ... ... see above, only doubled.

Sell one? Naaaaaaaaaaaaaaaaaaaaaa...

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

Okay, this one's a bit weird. Apparently A hacker is wiping Git repositories and asking for a ransom (of 0.1 Bitcoin). It was done by scanning the entire web for /.git/config files and mining those for credentials (including access tokens and URLs of the form http://user:password@victim.com). The hacker "replaced" the contents of the repository with a ransom demand.

The perpetrator is apparently hoping that anyone stupid enough to leave their git repo accessible through the web (I admit -- I used to do that) and to put login credentials in it (no, I'm not that stupid -- that's one of the things everyone is warned about multiple times, just in case it wasn't obvious), is probably stupid enough to pay the ransom instead of simply restoring their repo from any clone of it and changing their password.

And of course it turns out that the entire repo is still there after the attack -- the perpetrator is apparently just adding a commit and pointing HEAD at it. This post on StackExchange explains how to recover.

It's even easier, though, if you've actually been using the repo, because then you'll have a clone of it somewhere and all you have to do is

  cd clone
  git push --force origin HEAD:master

There's still the perp's threat to release your code if you don't pay. If your code is in a public repo on GitHub, GitLab, or BitBucket -- who cares? If it's in a private repo, you may have a problem, provided you (1) think it's likely that this threat can be carried out (there is reason to believe that your code hasn't actually been stashed away anywhere) and (2) think that whatever secrets may have been in your private repo are worth more than about $570.

You can see by looking at Bitcoin Address 1ES14c7qLb5CYhLMUekctxLgc1FV2Ti9DA that, so far (4pm today) nobody has paid up.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

In my previous curmudgeon post, Writing Without Distractions, I gave version control only a brief mention, and promised a follow-up post. That would be this one. This post is intended for people who are not in the software industry, including not only poets but other writers, students, people who program as a hobby, and programmers who have been in suspended animation for the last decade or three and are just now waking up.

The Wikipedia article on version control gives a pretty good overview, but it suffers from being way too general, and at the same time too focused on software development. This post is aimed at poets and other writers, and will be using the most popular version control system, git. (That Wikipedia article shares many of the same flaws as the one on version control.) My earlier post, Git: The other blockchain, was aimed at software developers and blockchain enthusiasts.

What is version control and why should I use it?

A version control system, also called a software configuration management (SCM) system, is a system for keeping track of changes in a collection of files. (The two terms have slightly different connotations and are used in different contexts, but it's like "writer" and "author" -- a distinction without much of a difference. For what it's worth, git's official website is git-scm.com/, but the first line of text on the site says that "Git is a free and open source distributed version control system". Then in the next paragraph they use the initialism SCM when they want to shorten it. Maybe it's easier to type? Go figure.)

So what does the ability to "track changes" really get you?

Quite a lot, actually! )

...and Finally

The part you've been waiting for -- the end. This post is already long, so I'll just refer you to the resources for now. Expect another installment, though, and please feel free to suggest future topics.

Resources

Tutorials

Digging Deeper

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: (technonerdmonster)

A few years ago I read an article about how to set up a Mac for distraction-free writing. I can't seem to find it anymore (okay, some rather large value of "a few"), but "there's an app for that" now. Many writers on my reading list are talking about distraction-free writing tools like iA Writer (seems to be the one people are most impressed by at the moment) and FocusWriter (free and cross-platform). There's even an Emacs mode.

These all work roughly the same way: run a text editor in full-screen mode, and write plain text with simplified markup in a fixed-width font. Worry about formatting later, if at all. Grey out everything but the sentence or paragraph you're working on. The article I can't find -- written before specialized writing programs and even before the web -- suggested getting the same effect by taking all of the icons off your screen and setting your default font to Courier.

If you're happily using one of these tools, you may want to skip ahead to the section on formatting, and maybe fill in the gaps later. If you're still using a word processor, or typing into a text field in a browser (even in "rich text" mode), you should probably stick with me.

What You See is All You Can Get

WYSIWYG (What You See Is What You Get) word processors are arguably the worst thing to have happened to writing in the last half-century. They have three huge problems:

The first is that they make a promise they can't deliver on. In fact, they should be called WYSIAYCG -- What You See Is All You Can Get. If your word processor doesn't support kerning, multiple fonts, paragraphs with hanging indents, large initial capitals, mathematical notation, or internal cross-linking, you can't use them. If they make it difficult to use these features, you still won't use them unless you absolutely have to, and then you find yourself wasting time doing clumsy work-arounds. Think about how you'd go about formatting song lyrics with chords over them. Shudder. How about making the space between sentences equal to one-and-a-half times the space between words?

The second is related to the first: word processors target a specific page layout. If you want to make a printed book, a web page, and an eBook, you're going to have to do extra work to accommodate the differences, or settle for something that has about the same level of mediocrity in all of those environments.

At a lower level, word processors use proportional-spaced fonts. That means you have to peer over the tops of your glasses to see whether that character at the end of the sentence is a period or a comma, and if your hands are shaking from too much coffee you'll have trouble selecting it. Or putting the cursor in front of it without selecting it, if you want to add a few words.

The third is that they distract you from actually writing, tempting you to fiddle with fonts, reformat your footers, worry about word-wrapping and hyphenation, and place your page breaks to avoid widows and orphans, at a time when you should be concentrating on content.

There's a fourth, mostly unrelated, problem which is so pervasive these days that most people accept it as The Way Things Are: if you accidentally select something and then type, whatever you selected goes away. In almost all cases, even if your word processing has an "undo" feature, this can't be undone. So let's talk a little more about...

Editing

Anyone who's been hanging around me long enough is expecting me to mention GNU Emacs at some point, and I will. But there are plenty of other text editors, and most of them are perfectly usable. They're often called "programmers' editors".

I'm not going to tell you how to use a text editor here; I'm just going to tell you more about why, and point you at some resources. Michael Hartl's Learn Enough Text Editor to Be Dangerous is a pretty good introduction to most of them, though you may want to skip the chapter on Vim. It gives short shrift to Emacs, but fortunately the first thing on your screen after starting Emacs is the tutorial. Start there.

So, why would you, a writer, want to use a programmer's editor?

One reason is that programmers have been writing on computers for quite a bit longer than writers have, so text editors have a considerable head start. More to the point, programmers use their own programs. This gives them a strong incentive to make their programs fast, efficient, and powerful. Not every programmer who has a problem with their text editor is going to fix it, but enough do to make them improve rapidly.

Word processors, on the other hand, are written by programmers, but they are usually written for ordinary users, not experts, and they're written to be products, not programming tools. As products, they have to appeal to their customers, which means that they have to be easy to learn and easy to use. They don't have to work well for people who spend their entire work day writing -- those are a tiny fraction of the customer base.

Another reason is that text editors use fixed-width fonts and encourage you to use comparatively short lines (typically 72 or 80 characters, for reasons that date back to the late 1880s). Paragraphs are separated by blank lines. Since line breaks inside of paragraphs are ignored by formatters, some authors like to start every sentence on a new line, which makes them particularly easy to move around, and makes it easier to spot differences between versions.

A text editor also makes you more efficient by giving you a wide range of keyboard commands -- you can write an entire book without ever taking your fingers off the keyboard. (This is, in part, due to their long history -- text editors predate graphical user interfaces by several decades.) And most modern text editors are extensible, so that if you want new commands or want them to behave differently for different kinds of markup, they're easy to add. (I have a set that I use for my band's lead sheets, for example, and another for my to-do files.)

Markup

Up until somewhere around 1990, everyone who did any serious writing knew how to edit a manuscript using proofreaders' marks. Manuscripts were typed double-spaced to leave room for insertions, corrections, and cryptic little marks between the lines and in the margins. This was, logically enough, called "marking up" the manuscript. You've probably heard of Markdown. You've certainly heard of HTML, which stands for "HyperText Markup Language". HTML, in turn, is a variant on SGML, "Standard Generalized Markup Language". You may have heard of LaTeX, which is the standard for academic -- especially scientific -- writing.

Markup languages let you separate content writing from formatting. Semantic markup lets you add additional information about what the things you are marking up mean; it's up to a stylesheet to determine what they look like. In HTML, you don't have to <i>italicize</i> something, you can <em>emphasize</em> a talking point, or <cite>cite</cite> a book title. They usually look the same, so most people don't bother, until they decide to turn all the book titles in their two thousand blog posts into links.

You can see how using semantic markup can free you from having to think about formatting while you're writing. There's another, less obvious advantage: you aren't stuck with just one format. By applying different styles to your document you can make it a web page, a printed book, an eBook, a slide show, or an email.

Another advantage of markup languages is that all of the markup is visible. This week's xkcd, "Invisible Formatting", shows how you can accidentally make a boldface space in the middle of normal text, where it can distract you by making an insertion unexpectedly boldface. It may also make subtle changes in line and word spacing that are hard to track down.

There are two main kinds of markup languages: ones like Markdown and Textile, which use simple conventions like **double asterisks** for strong emphasis, and ones that use tags, like <cite>HTML</cite>. LaTeX and reStructuredText are somewhere in the middle, using both methods. You can be a lot more specific with HTML, but Markdown is far easier to type. Markdown and Textile let you mix in HTML for semantic tagging; Markdown, Textile, and reStructuredText all use LaTeX for mathematical notation. Some formatters let you embed code with colored syntax highlighting.
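As a small illustration, here's roughly the same sentence both ways; most Markdown dialects pass the raw HTML tag through untouched:

    Markdown:  This has **strong emphasis**, *emphasis*, and a title
               marked as <cite>The Time Machine</cite>.

    HTML:      This has <strong>strong emphasis</strong>, <em>emphasis</em>,
               and a title marked as <cite>The Time Machine</cite>.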

These days, it looks as though Markdown is the most popular, in part thanks to GitHub; you can find it in static site generators like Hugo and Jekyll, and it's accepted by many blogging platforms (including Dreamwidth). Unfortunately they all accept different dialects of Markdown; there is an enormous number of Markdown-to-whatever converters. But the nice thing about markup languages is that you aren't stuck with just one. That brings us to...

Formatting

Once you have a file that says exactly what you want to say, the next thing you'll want to do is format it. Formatting programs (a category that includes LaTeX, web browsers, and website generators like Jekyll and Hugo) all use some kind of style sheet that describes what each kind of markup is supposed to look like. You probably know about CSS, the "Cascading Style Sheets" used on the web. LaTeX has a different set, written in the typesetting language TeX.

If you wrote your file in HTML and you want to publish it on the web, you're done. You may want to make your own stylesheet or customize one of the thousands that are already out there, but you don't have to. Modern browsers do a perfectly reasonable job of formatting. CSS lets you specify a separate style for print, so when a reader wants a printed copy it actually looks like something you'd want to read on paper.
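A print style can be a small @media block tacked onto your stylesheet; this is just a sketch, and the .menu class name is made up:

    @media print {
      body       { font-family: Georgia, serif; font-size: 11pt; }
      nav, .menu { display: none; }                 /* hide navigation on paper */
      a::after   { content: " (" attr(href) ")"; }  /* print the link targets */
    }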

If you wrote your file in LaTeX and you want to publish it on paper, you're done -- it's safe to assume that LaTeX knows more about formatting and typesetting than you do, so you can just tell LaTeX what size your pages are, pick one of the hundreds of available stylesheets (or write your own), and let it make you a PDF. You can change the page size or layout with just a few keystrokes.
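A complete LaTeX document can be surprisingly small; this sketch asks for A5 pages, and changing that one bracketed option changes the whole layout:

    \documentclass[11pt,a5paper]{book}
    \begin{document}
    \chapter{A Chapter Title}
    Paragraphs are just blocks of text; a blank line starts a new one.
    \end{document}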

If you wrote your file in Markdown or some other markup language, there are dozens of formatting programs that produce HTML, LaTeX, PDF, or some combination of those. (My favorite is Pandoc -- see below.) Markdown is also used by static website generators like Hugo or Jekyll, and accepted by everything from blogging sites to GitHub. If you're publishing on the web or in some other medium your formatter supports, you're done.

The advantage of separating content from format is that you're not stuck with one format. Web? Print? eBook? You don't have to pick one; you have all of them at your fingertips. There are hundreds of conversion programs around: html2latex, latex2html, kramdown (which GitHub uses),... For most purposes I recommend Pandoc. The subtitle of Pandoc's home page calls it "a universal document converter", and it is. It can convert between any of the markup languages I've mentioned here, and more, in either direction. In addition it can output eBook, word processor, wiki, and documentation formats, not to mention PDF. As an example of what it can do, I write these posts in either HTML or Markdown as the mood takes me, and use Pandoc to convert them to HTML for Dreamwidth, and to plain text (stripping out the tags) so that I can get accurate word counts.
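The commands involved are short; mine amount to something like this (the filename here is made up):

    # Markdown in, HTML out -- ready to paste into Dreamwidth
    pandoc 2018/11/14-multitasking.md -f markdown -t html -o post.html

    # strip the markup entirely to get an honest word count
    pandoc 2018/11/14-multitasking.md -t plain | wc -w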

Version Control, etc.

Text files with short lines are ideal for other tools in the Linux (and Unix -- did you know that Unix was originally used by technical writers?) environment. When you compare two files, a line-by-line comparison (which is what diff gives you) is more useful than a paragraph-by-paragraph comparison (which is what diff gives you if you don't hard-wrap your text). Text editors can run formatters, spelling checkers, search tools, and others, and put the cursor on the exact line you're looking for. Want to search nearly 6500 blog posts for your favorite quote from G. K. Chesterton? It took me one line and a little over four seconds:

        time find . -type f -exec grep -nHi -e 'rules of architecture' {} +

Many formatting tools simply ignore single line breaks and use a blank line to separate paragraphs; examples include LaTeX and most (though not all) Markdown translators. HTML ignores line breaks altogether and relies on tags. I take advantage of that to make HTML more readable by indenting the text by four spaces and using 80- or 90-character lines. If you want an example and you're reading this page in a browser, just hit Ctrl-U to look at the page source. Compare that to web pages made without hard-wrapped lines -- you may find yourself scrolling dozens, if not hundreds, of characters to the right, because browsers don't do any wrapping when displaying source. Nor would you want them to.

The biggest advantage (in my not-so-humble opinion) is version control. (Those of you who've been around me were just waiting for me to mention git, weren't you?) Being able to find all the changes you made this week -- and why you made them -- can be incredibly useful. Being able to retrieve a file from three years ago that you accidentally deleted is priceless.
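If your posts or chapters live in a git repository, those two tricks look something like this; the commit hash and path are placeholders:

    # everything that changed in the last week, with the commit messages
    # that say why
    git log --since='1 week ago' --stat

    # pull one file back out of an old commit
    git show a1b2c3d:posts/2015/06/lost-post.html > lost-post.html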

This post is already pretty long, so the next post in this series is going to be about version control (and the other things you can do with git and GitHub; it's not just version control) for writers. Stay tuned.

Resources

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).

mdlbear: (technonerdmonster)

It's getting so that data breaches aren't news anymore unless they're huge. The Gizmodo article calls it The Mother of All Breaches, exposing 773 million email addresses and 21 million passwords. There's a more complete post by Troy Hunt: The 773 Million Record "Collection #1" Data Breach. Hunt is the person behind the Have I Been Pwned website. That should be your next stop -- it lets you check to see which of your email addresses, usernames, and passwords have appeared in any data breach.

If your password shows up in Pwned Passwords, stop using it. Consider enabling two-factor authentication where you can, and getting a password vault. Hunt recommends 1Password. If you want open source, you can try KeePassX.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).

mdlbear: (technonerdmonster)

Some day I ought to put together a comprehensive list of privacy-related links. This is not that list; it's just a few of the links that came my way recently, in no particular order.

I'd suggest starting with the ACLU's What Individuals Should Do Now That Congress Has Obliterated the FCC’s Privacy Protections. It's a good overview.

DuckDuckGo is my current privacy-preserving search engine of choice. The DuckDuckGo Blog has been a good source of additional information. I especially recommend this article on How to Set Up Your Devices for Privacy Protection -- it has advice for iOS, Android, Mac, Windows 10 and 7, and Linux. Also check out a broader range of tips here.

The Electronic Frontier Foundation, as you might expect, is another great source of information. I suggest starting with Tools from EFF's Tech Team. While you're there, install Privacy Badger. It's not exactly an ad blocker; what it does is block trackers.

Here's an article on Which Browser Is Better for Privacy? (Spoiler: it's Firefox.) Then go to Firefox Privacy - The Complete How-To Guide.

For the paranoid among us, there are few things better than Tor Browser. If you use it, you'll probably want to turn off Javascript as well.

The Linux Journal's article on Data Privacy: Why It Matters and How to Protect Yourself has a lot of good advice, most of which isn't Linux-specific at all.

However, if you are running Linux, you'll want to look at How To Encrypt Your Home Folder After Ubuntu Installation, Locking down and securing SSH access to your server, and Own Your DNS Data.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).

mdlbear: (ccs)

In 1985 I wrote a song called "The World Inside the Crystal". At the time there didn't seem to be any songs about computers, or programming, that weren't meant to be funny. (I think there might have been a few about AI or robots that were meant to be scary. It's entirely possible that this was the first serious computer song ever written.)

I also wanted to explore the notion that inside of computers is an alternate universe where magic works. I don't remember whether I came up with that, or somebody else mentioned it to me; it was definitely an idea that I was kicking around at that time. Kick it far enough, and it winds up someplace like this:

Beside the world we live in
Apart from day and night
Is a world ablaze with wonder
Of magic and delight
Like a magic crystal mirror,
My computer lets me know
Of the other world within it
Where my body cannot go.

chorus:
You can only see the shadows
Of electrons on a screen
From the world inside the crystal
That no human eye has seen.

The computer is a gateway
To a world where magic rules
Where the only law is logic
Webs of words the only tools
Where we play with words and symbols
And creation is the game
For our symbols have the power
To become the things they name.

chorus

Now you who do not know this world
Its dangers or its joys
You take the things we build there
And you use them as your toys.
You trust them with your fortunes,
Or let them guard your lives.
From the chaos of creation
Just their final form survives.

chorus

Call us hackers, call us wizards,
With derision or respect,
Still our souls are marked by something
That your labels can't affect.
Though our words are touched by strangeness
There is little we can say.
You would only hear the echo
Of a music far away.

chorus

I can always tell the programmers in the audience -- they've been there. It won a Pegasus Award for "Best Science Song" in 1997, possibly because I mentioned it on Usenet.

There are several different recordings. The one to start with is Kathy Mar's cover here, off of her tape Plus ça Change, with an awesome synth track by Chrys Thorsen. The one on my CD is okay, although I'm not all that happy with it now. It's way too fast, for one thing, and there isn't an instrumental break before the last verse. It's on YouTube courtesy of my distributor, CD Baby.

There have been some good ones in concerts. The one at Consonance 2009, with Tres Gique, is one of the better ones. Here's another, at Baycon 2009. Consonance 2012 appears to be my best (recorded) solo performance. Audio players don't come off all that well on DW, but I'll close with one anyway.

mdlbear: (technonerdmonster)

Most humans multitask rather badly -- studies have shown that when one tries to do two tasks at the same time, both tasks suffer. That's why many states outlaw using a cell phone while driving. Some people are much better than others at switching between tasks, especially similar tasks, and so give the appearance of multitasking. There is still a cost to switching context, though. The effect is much less if one of the tasks requires very little attention, like knitting during a conversation or sipping coffee while programming. (Although I have noticed that if I get deeply involved in a programming project my coffee tends to get cold.) It may surprise you to learn that computers have the same problem.

Your computer isn't really responding to your keystrokes and mouse clicks, playing a video from YouTube in one window while running a word processor in another, copying a song to a thumb drive, fetching pages from ten different web sites, and downloading the next Windows update, all at the same time. It's just faking it by switching between tasks really fast. (That's only partially true. We'll get to that part later, so if you already know about multi-core processors and GPUs, please be patient. Or skip ahead. Like a computer, my output devices can only type one character at a time.)

Back when computers weighed thousands of pounds, cost millions of dollars, and were about a million times slower than they are now, people started to notice that their expensive machines were idle a lot of the time -- they were waiting for things to happen in the "real world", and when the computer was reading in the next punched card it wasn't getting much else done. As computers got faster -- and cheaper -- the effect grew more and more noticeable, until some people realized that they could make use of that idle time to get something else done. The first operating systems that did this were called "foreground/background" systems -- they used the time when the computer was waiting for I/O to switch to a background task that did a lot of computation and not much I/O.

Once when I was in college I took advantage of the fact that the school's IBM 1620 was just sitting there most of the night to write a primitive foreground/background OS that consisted of just two instructions and a sign. The instructions dumped the computer's memory onto punched cards and then halted. The sign told whoever wanted to use the computer to flip a switch, wait for the dump to be punched out, and load it back in when they were done with whatever they were doing. I got a solid week of computation done. (It would take much less than a second on your laptop or even your phone, but we had neither laptop computers nor cell phones in 1968.)

By the end of the 1950s computers were getting fast enough, and had enough memory, that people could see where things were headed, and several people wrote papers describing how one could time-share a large, fast computer among several people to give them each the illusion that they had a (perhaps somewhat less powerful) computer all to themselves. The users would type programs on a teletype machine or some other glorified typewriter, and since it takes a long time for someone to type in a program or make a change to it, the computer had plenty of time to do actual work. The first such systems were demonstrated in 1961.

I'm going to skip over a lot of the history, including minicomputers, which were cheap enough that small colleges could afford them (Carleton got a PDP-8 the year after I graduated). Instead, I'll say a little about how timesharing actually works.

A computer's operating system is there to manage resources, and in a timesharing OS the goal is to manage them fairly, and switch contexts quickly enough for users to think that they're using the whole machine by themselves. There are three main resources to manage: time (on the CPU), space (memory), and attention (all those users typing at their keyboards).

There are two ways to manage attention: polling all of the attached devices to see which ones have work to do, and letting the devices interrupt whatever was going on. If only a small number of devices need attention, it's a lot more efficient to let them interrupt the processor, so that's how almost everything works these days.

When an interrupt comes in, the computer has to save whatever it was working on, do whatever work is required, and then put things back the way they were and get back to what it was doing before. This takes time. So does writing about it, so I'll just mention it briefly before getting back to the interesting stuff.

See what I did there? This is a lot like what I'm doing writing this post, occasionally switching tasks to eat lunch, go shopping, sleep, read other blogs, or pet the cat that suddenly sat on my keyboard demanding attention.

Let's look at time next. The computer can take advantage of the fact that many programs perform I/O to use the time when it's waiting for an I/O operation to finish to look around and see whether there's another program waiting to run. Another good time to switch is when an interrupt comes in -- the program's state already has to be saved to handle the interrupt. There's a bit of a problem with programs that don't do I/O -- these days they're usually mining bitcoin. So there's a clock that generates an interrupt every so often. In the early days that used to be 60 times per second (50 in Britain); a sixtieth of a second was sometimes called a "jiffy". That way of managing time is often called "time-slicing".

The other way of managing time is multiprocessing: using more than one processor at the same time. (Told you I'd get to that eventually.) The amount of circuitry you can put on a chip keeps increasing, but the amount of circuitry required to make a CPU (a computer's Central Processing Unit) stays pretty much the same. The natural thing to do is to add another CPU. That's the point at which CPUs on a chip started being called "cores"; multi-core chips started hitting the consumer market around the turn of the millennium.

There is a complication that comes in when you have more than one CPU, and that's keeping them from getting in one another's way. Think about what happens when you and your family are making a big Thanksgiving feast in your kitchen. Even if it's a pretty big kitchen and everyone's working on a different part of the counter, you're still occasionally going to have times when more than one person needs to use the sink or the stove or the fridge. When this happens, you have to take turns or risk stepping on one another's toes.

You might think that the simplest way to do that is to run a completely separate program on each core. That works until you have more programs than processors, and it happens sooner than you might think because many programs need to do more than one thing at a time. Your web browser, for example, starts a new process every time you open a tab. (I am not going to discuss the difference between programs, processes, and threads in this post. I'm also not going to discuss locking, synchronization, and scheduling. Maybe later.)
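You can watch this happen on a Linux box; the browser name here is an assumption, and the exact process-per-tab behavior varies by browser:

    # how many processes does the browser own right now?
    pgrep -c firefox

    # open a few more tabs, then count again
    pgrep -c firefox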

The other thing you can do is to start adding specialized processors for offloading the more compute-intensive tasks. For a long time that meant graphics -- a modern graphics card has more compute power than the computer it's attached to, because the more power you throw at making pretty pictures, the better they look. Realistic-looking images used to take hours to compute. In 1995 the first computer-animated feature film, Toy Story, was produced on a fleet of 117 Sun Microsystems computers running around the clock. They got about three minutes of movie per week.

Even a mediocre graphics card can generate better-quality images at 75 frames per second. It's downright scary. In fairness, most of that performance comes from specialization. Rather than being general-purpose computers, graphics cards mostly just do the computations required for simulating objects moving around in three dimensions.

The other big problem, in more ways than one, is space. Programs use memory, both for code and for data. In the early days of timesharing, if a program was ready to run that didn't fit in the memory available, some other program got "swapped out" onto disk. All of it. Of course, memory wasn't all that big at the time -- a megabyte was considered a lot of memory in those days -- but it still took a lot of time.

Eventually, however, someone hit on the idea of splitting memory up into equal-sized chunks called "pages". A program doesn't use all of its memory at once, and most operations tend to be pretty localized. So a program runs until it needs a page that isn't in memory. The operating system then finds some other page to evict -- usually one that hasn't been used for a while. The OS writes out the old page (if it has to; if it hasn't been modified and it's still around in swap space, you win), and schedules the I/O operation needed to read the new page in. And because that takes a while, it goes off and runs some other program while it's waiting.

There's a complication, of course: you need to keep track of where each page is in what its program thinks of as a very simple sequence of consecutive memory locations. That means you need a "page table" or "memory map" to keep track of the correspondence between the pages scattered around the computer's real memory, and the simple virtual memory that the program thinks it has.

There's another complication: it's perfectly possible (and sometimes useful) for a program to allocate more virtual memory than the computer has space for in real memory. And it's even easier to have a collection of programs that, between them, take up more space than you have.

As long as each program only uses a few separate regions of its memory at a time, you can get away with it. The memory that a program needs at any given time is called its "working set", and with most programs it's pretty small and doesn't jump around too much. But not every program is this well-behaved, and sometimes even when they are there can be too many of them. At that point you're in trouble. Even if there is plenty of swap space, there isn't enough real memory for every program to get their whole working set swapped in. At that point the OS is frantically swapping pages in and out, and things slow down to a crawl. It's called "thrashing". You may have noticed this when you have too many browser tabs open.
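On Linux you can actually watch thrashing start; this is just a quick way to look, not a diagnosis:

    # memory and paging activity every five seconds; sustained non-zero
    # numbers in the si/so (swap-in/swap-out) columns are a bad sign
    vmstat 5

    # how much memory and swap are in use right now
    free -h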

The only things you can do when that happens are to kill some large programs (Firefox is my first target these days), or re-boot. (When you restart, even if your browser restores its session to the tabs you had open when you stopped it, you're not in trouble again because it only starts a new process when you look at a tab.)

And at this point, I'm going to stop because I think I've rambled far enough. Please let me know what you think of it. And let me know which parts I ought to expand on in later posts. Also, tell me if I need to cut-tag it.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com). If you found it interesting or useful, you might consider using one of the donation buttons on my profile page.

NaBloPoMo stats:
   8632 words in 13 posts this month (average 664/post)
   2035 words in 1 post today

mdlbear: (technonerdmonster)

Recently I started reading this Ruby on Rails Tutorial by Michael Hartl. It's pretty good; very hands-on, and doesn't assume that you know Ruby (that's a programming language; Rails is a web development framework). It does assume that you know enough about software development and web technology to be dangerous. And if you're not dangerous yet,...

It points you at a web site where you can learn enough to be dangerous. Starting from knowing nothing at all.

It's the author's contention that Tech is the new literacy [and] [l]earning the basics of programming is only one piece of the puzzle. Learn Enough to Be Dangerous teaches [you] to code as well as a much more powerful skill: technical sophistication. Part of that technical sophistication is knowing how to look things up or figure things out when you don't know them.

There are seven volumes in the series leading up to the Rails tutorial, giving you an introductory course in software development. I haven't gone to a bootcamp, but I'd guess that this is roughly the equivalent. More importantly, by the end of this series you'll be able to work through and understand just about any of the thousands of free tutorials on the web, and you'll have learned how to think and work like a software developer.

The first three tutorials lay the groundwork: Learn Enough Command Line..., Learn Enough Text Editor..., and Learn Enough Git to Be Dangerous. With just those, you'll know enough to set up a simple website -- and you do, on GitHub Pages. You'll also end up with a pretty good Linux or MacOS development environment (even if you're using Windows).

I have a few quibbles -- the text editor book doesn't mention Emacs, and the author is clearly a Mac user. (You don't need a tutorial on Emacs, because it has one built in -- along with a complete set of manuals. So you'll be able to try it on your own.)

The next three books are Learn Enough HTML to Be Dangerous, Learn Enough CSS & Layout, and Learn Enough JavaScript. The JavaScript book is a real introduction to programming -- you'll also learn how to write tests, and of course you'll already know how to use version control, from the git tutorial.

At this point I have to admit that after starting the Ruby tutorial I went back and skimmed through the others; I'll probably want to take a closer look at the JavaScript tutorial to see if I've missed anything in my somewhat haphazard journey toward front-end web development.

The next book in the series is Learn Enough Ruby to Be Dangerous. (If you skip it on your way to the Rails tutorial, there's a quick introduction there as well.) Ruby seems like a good choice for a second language, and learning a second programming language is important because it lets you see which ideas and structures are fundamental, and which aren't. (There's quite a lot of that about JavaScript -- it's poorly designed in many ways, and some things about it are quite peculiar.)

Another good second or third programming language would be Python. If you'd like to go there next, or start from the beginning with Python, I can recommend Django Girls and their Tutorial. This is another from-the-ground-up introduction to web development, so of course there's a lot of overlap in the beginning.

Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com)

NaBloPoMo stats: 593 words in this post, 1172 words in 3 posts this month.

mdlbear: (technonerdmonster)

Actually two PSAs.

First: Especially if you're running Windows, you ought to go read The Untold Story of NotPetya, the Most Devastating Cyberattack in History | WIRED. It's the story of how a worldwide shipping company was taken out as collateral damage in the ongoing cyberwar between Russia and Ukraine. Three takeaways:

  1. If you're running Windows, keep your patches up to date.
  2. If you're running a version of Windows that's no longer supported (which means that you can't keep it patched, by definition), either never under any circumstances connect that box to a network, or wipe it and install an OS that's supported.
  3. If at all possible, keep encrypted offline backups of anything really important. (I'm not doing that at the moment either. I need to fix that.) If you're not a corporation and not using cryptocurrency, cloud backups encrypted on the client side are probably good enough.

Second: I don't really expect that any of you out there are running an onion service. (If you had to click on that link to find out what it is, you're not.) But just in case you are, you need to read Public IP Addresses of Tor Sites Exposed via SSL Certificates, and make sure that the web server for your service is listening on 127.0.0.1 (localhost) and not 0.0.0.0 or *. That's the way the instructions (at the "onion service" link above) say to set it up, but some people are lazy. Or think they can get away with putting a public website on the same box. They can't.
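On a modern Linux server you can check what your web server is actually bound to; a sketch, with the port being whatever your service uses:

    # list listening TCP sockets along with the owning process
    sudo ss -tlnp | grep ':80'
    # you want to see 127.0.0.1:80 here, not 0.0.0.0:80 or [::]:80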

If you're curious and baffled by all this talk of onion services, Tor (The Onion Router) is a system for wrapping data packets on the internet in multiple layers of encryption and passing them through multiple intermediaries between you and whatever web site you're connecting with. This will protect both your identity and your information as long as you're careful! An onion service is a web server that's only reachable via Tor.

Onion services are part of what's sometimes called "the dark web".

Be safe! The network isn't the warm, fuzzy, safe space it was in the 20th Century.

Another public service announcement from The Computer Curmudgeon.

mdlbear: blue fractal bear with text "since 2002" (Default)

It's been a week. There were some good parts. Music Under the Trees, yesterday at Betsy Tinney's, was one of them, even if we did have to leave early because Colleen was flagging. (I didn't object, because to be honest I'd been dreading driving home in the dark after three nights of not enough sleep. But still.)

Another was getting the rest of the bed box installation done, which happened last Sunday. It's awesome. N got to try it out Saturday night; it was complete enough for sleeping in at that point.

A third good thing was a very nice visit to the southern end of the Rainbow Caravan, to spend a day with N and the kids. We will get our household back together. It may take a while.

And I got a very preliminary version of my "Consulting business" website done, as a GitHub Pages site. Last week N and I had picked a theme: Read Only, by HTML5 Up. It's cool -- big banner across the top of the text, and a neat circular image (which, it turned out, was masked out by setting the enclosing box's border-radius to 100%). Only one problem.

GitHub Pages are a snap to set up; you can do it in five minutes if you accept all the defaults. And it uses a nice static site builder called Jekyll which has themes that are pretty easy to set up. The devil's in the details, as usual. Because although we found "Read Only" through a gallery of Jekyll themes, it turned out that it wasn't a theme at all, just a mock-up. And although I eventually found a Jekyll version, it wasn't particularly usable.

I now know how to roll my own Jekyll theme, and I can consider myself an advanced beginner at the Liquid template language and CSS stylesheets. By the way, the MDN Web Docs (MDN stands for Mozilla Developer Network) are awesome. They have tutorials on all the important web technologies: HTML, JavaScript, and CSS, plus some more obscure ones. And when you get to the edge cases, they have reference docs.

It took me, basically, all week, with a huge amount of frustration along the way.

We appear to be getting into the bad parts of the week, don't we? Right.

I believe I mentioned that I'd been dreading going home from Betsy's late at night (I'm nowhere near as good a driver as I was at 50, and I know it). My guess is that that was at the root of the anxiety attacks I had Saturday and Sunday. (Panic attacks are intense, and supposedly last for only a few minutes to an hour or so. Anxiety attacks can -- and in my case, do -- last all day.)

And I have "Trigger finger" in my left thumb. It's been getting worse, not better, in spite of the brace I'm wearing, which incidentally makes it almost impossible to type because my thumb keeps hitting my laptop's trackpad and left button. Anyway.

Aaaaaaand, I've been spending almost all my time grappling with Jekyll and CSS, and not getting any job applications done. Bletch.


mdlbear: (technonerdmonster)

Today in my continuing series on programming languages I'm going to talk about "scripting languages". "Scripting" is a rather fuzzy category, because unlike the kinds of languages we've discussed before, scripting languages are really distinguished by how they are used, and they're used in two very different ways. It's also confusing because most scripting languages are interpreted, and people tend to use "scripting" when they should be using "interpreted". In my opinion it's more correct to say that a language is being used as a scripting language, rather than to say that it is a scripting language. As we'll see, this is particularly true when the language is being used to customize some application.

But first, let's define scripts. A script is basically a sequence of commands that a user could type at a terminal[1] -- often called "the command line" -- that have been put together in a file so that they can be run automatically. The script then becomes a new command. In Linux, and before that Unix, the program that interprets user commands is called a "shell", possibly because it's the visible outer layer of the operating system. The quintessential script is a shell script. We'll dive into the details later.

[1] okay, a terminal emulator. Hardly anyone uses physical terminals anymore. Or remembers that the "tty" in /dev/tty stands for "teletype".

The second kind of scripting language is used to implement commands inside some interactive program that isn't a shell. (These languages are also called extension languages, because they're extending the capabilities of their host program, or sometimes configuration languages.) Extension languages generally look nothing at all like something you'd type into a shell -- they're really just programming languages, and often are just the programming language the application was written in. The commands of an interactive program like a text editor or graphics editor tend to be things like single keystrokes and mouse gestures, and in most cases you wouldn't want to -- or even be able to -- write programs with them. I'll use "extension languages" for languages used in this way. There's some overlap in between, and I'll talk about that later.

Shell scripting languages

Before there was Unix, there were mainframes. At first, you would punch out decks of Hollerith cards, hand them to the computer operator, and they would (eventually) put them in the reader and push the start button, and you would come back an hour or so later and pick up your deck, with a pile of listings from the printer.

Computers were expensive in those days, so to save time the operator would pile a big batch of card decks on top of one another with a couple of "job control" cards in between to separate the jobs. Job control languages were really the first scripting languages. (And the old terminology lingers on, as such things do, in the ".bat" extension of MS-DOS (later Windows) "batch files". Which are shell scripts.)

By far the most sophisticated job control language ran on the Burroughs 5000 and 6000 series computers, which were designed to run Algol very efficiently. (So efficiently that they used Algol as what amounted to their assembly language! Programs in other languages, including Fortran and Cobol, were compiled by first translating them into Algol.) The job control language was a somewhat extended version of Algol in which some variables had files as their values, and programs were simply subroutines. Don't let anyone tell you that all scripting languages are interpreted.

Side note: the Burroughs machines' operating system was called MCP, which stands for Master Control Program. Movie fans may find that name familiar.

Even DOS batch files had control-flow statements (conditionals and loops) and the ability to substitute variables into commands. But these features were clumsy to use. In contrast, the Unix shell written by Stephen Bourne at Bell Labs was designed as a scripting language. The syntax of its control structures was derived from Algol 68, which introduced the "if...fi" syntax; the shell's "do...done" is a slight variation on Algol's "do...od", since od was already taken by the octal-dump program.

Bourne's shell was called sh in Unix's characteristically terse style. The version of Unix developed at Berkeley (BSD, for Berkeley Software Distribution -- I'll talk about the history of operating systems some time) had a shell called the C shell, csh, which had a syntax derived from the C programming language. That immediately gave rise to the popular tongue-twister "she sells cshs by the cshore".

The GNU (GNU's Not Unix) project, started by Richard Stallman with the goal of producing a completely free replacement for Unix, naturally had its own rewrite of the Bourne Shell called bash -- the Bourne Again Shell. It's a considerable improvement over the original, pulling in features from csh and some other shell variants.

Let's look at shell scripting a little more closely. The basic statement is a command -- the name of a program followed by its arguments, just as you would type it on the command line. If the command isn't one of the few built-in ones, the shell then looks for a file that matches the name of the command, and runs it. The program eventually produces some output, and exits with a result code that indicates either success or failure.
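You can ask the shell what it would actually run; in bash that looks like this:

    # is it a builtin, an alias, a function, or a file on disk?
    type cd grep

    # the list of directories the shell searches, in order
    echo "$PATH"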

There are a few really brilliant things going on here.

  • Each program gets run in a separate process. Unix was originally a time-sharing operating system, meaning that many people could use the computer at the same time, each typing at their own terminal, and the OS would run all their commands at once, a little at a time.
  • That means that you can pipe the output of one command into the input of another. That's called a "pipeline"; the commands are separated by vertical bars, like | this, so the '|' character is often called "pipe" in other contexts. It's a lot shorter than saying "vertical bar".
  • You can "redirect" the output of a command into a file. There's even a "pipe fitting" command called tee that does both: copies its input into a file, and also passes it along to the next command in the pipeline.
  • The shell uses the command's result code for control -- there's a program called true that does nothing but immediately returns success, and another called false that immediately fails. There's another one, test, which can perform various tests, for example to see whether two strings are equal, or a file is writable. There's an alias for it: [. Unix allows all sorts of characters in filenames. Anyway, you can say things like if [ -w $f ]; then...
  • You can also use a command's output as part of another command line, or put it into a variable. today=`date` takes the result of running the date program and puts it in a variable called today.

This is basically functional programming, with programs as functions and files as variables. (Of course, you can define variables and functions in the shell as well.) In case you were wondering whether Bash is a "real" programming language, take a look at nanoblogger and Abcde (A Better CD Encoder).

Sometime later in this series I'll devote a whole post to an introduction to shell scripting. For now, I'll just show you a couple of my favorite one-liners to give you a taste for it. These are tiny but useful scripts that you might type off the top of your head. Note that comments in shell -- almost all Unix scripting languages, as a matter of fact -- start with an octothorpe. (I'll talk about octothorpe/sharp/hash/pound later, too.)

# wait until nova (my household server) comes back up after a reboot
until ping -c1 nova; do sleep 10; done

# count my blog posts.  wc counts words, lines, and/or characters.
find $HOME/.ljarchive -type f -print | wc -l

# find all posts that were published in January.
# grep prints lines in its input that match a pattern.
find $HOME/.ljarchive -type f -print | grep /01/ | sort

Other scripting languages

As you can see, shell scripts tend to be a bit cryptic. That's partly because shells are also meant to have commands typed at them directly, so brevity is often favored over clarity. It's also because all of the operations that work on files are programs in their own right; they often have dozens of options and were written at different times by different people. The find program is often cited as a good (or bad) example of this -- it has a very different set of options from any other program, because you're trying to express a rather complicated combination of tests on its command line.

Some things are just too complicated to express on a single line, at least with anything resembling readability, so many other programs besides shells are designed to run scripts. Some of the first of these in Unix were sed, the "stream editor", which applies text editing operations to its input, and awk, which splits lines into "fields" and lets you do database-like operations on them. (Simpler programs that also split lines into fields include sort, uniq, and join.)

DOS and Windows look at the last three characters of a program's name (e.g., "exe" for "executable" machine language and "bat" for "batch" scripts) to determine what it contains and how to run it. Unix, on the other hand, looks at the first few characters of the file itself. In particular, if these are "#!" followed by the name of a program (I'm simplifying a little), the file is passed to that program to be run as a script. The "#!" combination is usually pronounced "shebang". This accounts for the popularity of "#" to mark comments -- lines that are meant to be ignored -- in most scripting languages.
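So a minimal script is nothing more than a shebang line and a command or two; make the file executable and it becomes a new command (the name is made up):

    #!/bin/sh
    # greet -- a trivial example script; install with: chmod +x greet
    echo "Hello, $USER; today is $(date +%A)."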

The scripting programs we've seen so far -- sh, sed, awk, and some others -- are all designed to do one kind of thing. Shells mostly just run commands, assign variables, and substitute variables into commands, and rely on other programs like find and grep to do most other things. Wouldn't it be nice if one could combine all these functions into one program, and give it a better language to write programs in? The first of these that really took off was Larry Wall's Perl. Like the others, it allows you to put simple commands on the command line -- with much the same syntax as grep and awk.

Perl's operations for searching and substituting text look just like the ones in sed and grep. It has associative arrays (basically lookup tables) just like the ones in awk. It can run programs and get their results exactly the way sh does, by enclosing them in backtick characters (`...` -- originally meant to be used as left single quotes), and it can easily read lines out of files, mess with them, and write them out. It has objects, methods, and (more or less) first-class functions. And just like find and the Unix command line, it has a well-earned reputation for scripts that are obscure and hard to read.
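For instance, the same one-line substitution looks almost identical in sed and in Perl (the filename is made up):

    # replace every "colour" with "color" and print the result
    sed 's/colour/color/g' draft.txt
    perl -pe 's/colour/color/g' draft.txt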

You've probably heard Python mentioned. It was designed by Guido van Rossum in an attempt to make a better scripting language than Perl, with an emphasis on making programs more readable, easier to write, and easier to maintain. He succeeded. At this point Python has mostly replaced Perl as the most popular scripting language, in addition to being a good first language for learning programming. (Which language is best for learning is a subject guaranteed to provoke strong opinions and heated discussions; I'll avoid it for now.) I avoided Python for many years, but I'm finally learning it and finding it much better than I expected.

Extension languages

The other major kind of scripting is done to extend a program that isn't a shell. In most cases this will be an interactive program like an editor, but it doesn't have to be. Extensions of this sort may also be called "plugins".

Extension languages are usually small, simple, and interpreted, because nobody wants their text editor (for example) to include something as large and complex as a compiler when its main purpose is defining keyboard shortcuts. There's an exception to this -- sometimes when a program is written in a compiled language, the same language may be used for extensions. In that case the extensions have to be compiled in, which is usually inconvenient, but they can be particularly powerful. I've already written about one such case -- the Xmonad window manager, which is written and configured in Haskell.

Everyone these days has at least heard of JavaScript, which is the scripting language used in web pages. Like most scripting languages, JavaScript has escaped from its enclosure in the browser and run wild, to the point where text editors, whole web browsers, web servers, and so on are built in it.

Other popular extension languages include various kinds of Lisp, Tcl, and Lua. Lua and Tcl were explicitly designed to be embedded in programs. Lua is particularly popular in games, although it has recently turned up in other places, including the TeX typesetting system.

Lisp is an interesting case -- probably its earliest use as an extension language was in the Emacs text editor, which is almost entirely written in it. (To the point where many people say that it's a very good Lisp interpreter, but it needs a better text editor. I'm not one of them: I'm passionately fond of Emacs, and I'll write about it at greater length later on.) Because of its radically simple structure, Lisp is particularly easy to write an interpreter for. Emacs isn't the only example; there are Lisp variants in the Audacity audio workstation and the Autodesk CAD program. I used the one in Audacity for the sound effects in my computer/horror crossover song "Vampire Megabyte".

Emacs, Atom (a text editor written in JavaScript), and Xmonad are good examples of interactive programs where the same language is used for (most, if not all, of) the implementation as well as for the configuration files and the extensions. The boundaries can get very fuzzy in cases like that; as a Mandelbear I find that particularly appealing.

Another fine post from The Computer Curmudgeon.

mdlbear: (technonerdmonster)

In comments on "Done Since 2018-07-15" we started having a discussion of mirroring and cross-posting DW blog entries, and in particular what my plans are for implementing personal blog sites that mirror all or some of a -- this -- Dreamwidth journal.

Non-techie readers might conceivably want to skip this post.

Where I am now:

Right now, my blog posting process is, well, let's just say idiosyncratic. Up until sometime late last year, I was posting using an Emacs major mode called lj-update-mode; it was pretty good. It had only two significant problems:

  1. It could only create one post at a time, and there was no good way to save a draft and come back to it later. I could live with that.
  2. It stopped working when DW switched to all HTTPS. It was using an obsolete HTTP library, and no one was maintaining either of them.

My current system is much better.

  1. I run a command, either make draft or, if I'm pretty sure I'm going to post immediately, make entry. I pass the filename, without the yyyy/mm/dd prefix, along with an optional title. If I don't pass the title I can add it later. The draft gets checked in with git; I can find out when I started by using git log.
  2. I edit the draft. It can sit around for days or months; it doesn't matter. It's an ordinary HTML file except that it has an email-like header with the metadata in it.
  3. When I'm done, I make post. Done. If I'm posting a draft I have to pass the filename again to tell it which draft; make entry makes a symlink to the entry, which is already in a file called yyyy/mm/dd-filename.html. It gets posted, and committed in git with a suitable commit message.

You can see the code in MakeStuff/blogging on GitHub. It depends on a Python client called charm, which I forked to add the Location: header and some sane defaults like not auto-formatting. Charm is mostly useless -- it does almost everything using a terminal-based text editor. Really? But it does have a "quick-post" mode that takes metadata on the command line, and a "sync" mode that you can use to sync your journal with an archive. Posts in the archive are almost, but not quite, in the same format as the MakeStuff archive; the main difference is that the filenames look like yyyy/mm/dd_HHMM. Close, but not quite there.

There's another advantage that isn't apparent in the code: you can add custom make targets that set up your draft using a template. For example, my "Done since ..." posts are started with make done, and my "Computer Curmudgeon" posts are started with make curmudgeon. There are other shortcuts for River and S4S posts. I also have multiple directories for drafts, separated roughly by subject, but all posting into the same archive.

Where I want to go:

Here's what I want next:

  • The ability to post in either HTML or markdown -- markdown has a great toolchain, including the ability to syntax-color your code blocks.
  • The ability to edit posts by editing the archived post and uploading it. Right now it's a real pain to keep them in sync.
  • A unified archive, with actual URLs in the metadata rather than just the date and time in the filename.
  • The ability to put all or part of my blog on different sites. I really want the computer-related posts to go on Stephen.Savitzky.net (usually shortened to S.S.net in my notes), and a complete mirror on steve.savitzky.net (s.s.net).
  • Cross-links in both directions between my sites and DW.

How to get there:

Here's a very brief sketch of what needs to be done. It's only vaguely in sequence, and I've undoubtedly left parts out. But it's a start.

Posting, editing, and archiving

  • Posting in HTML or markdown is a pretty easy one; I can do that just by modifying the makefiles and (probably) changing the final extension from .html to .posted so that make can apply its usual dependency-inference magic (there's a small sketch after this list).
  • Editing and a unified archive will both require a new command-line client. There aren't any. There are libraries, in Ruby, Haskell, and Javascript, that I can wrap a program around. (The Python code in charm doesn't look worth saving.) I wanted to learn Ruby anyway.
  • The unified archive will also require a program that can go back in time and match up archived posts with the right URLs, reconcile the two file naming conventions, and remove the duplicates that are due to archiving posts both in charm and MakeStuff. Not too hard, and it only has to be done once.
  • It would be nice to be able to archive comments, too. The old ljbackup program can do it, so it's feasible. It's in Perl, so it might be a good place to start.
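For what it's worth, the kind of pattern rule that extension change enables looks something like this -- a sketch only, not the actual MakeStuff code; post-entry stands in for whatever command does the uploading, and the recipe lines have to start with a tab:

    # remake foo.posted whenever foo.html is newer, or foo.posted is missing
    %.posted: %.html
    	post-entry $<    # $< is the prerequisite, the .html file
    	touch $@         # $@ is the target, the .posted marker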

Mirror, mirror, on the server...

This is a separate section because it's mostly orthogonal to the posting, archiving, etc.

  • The only part of the posting section that really needs to be done first is the first one, changing the extension of archived posts to .posted. (That's because make uses extensions to figure out what rules to apply to get from one to another. Remind me to post about make some time.)
  • The post archive may want to have its own git repository.
  • Templating and styling. My websites are starting to show their age; there's nothing really wrong with a retro look, but they also aren't responsive (to different screen sizes -- that's important when most people are reading websites on their phones), or accessible (screen-reader friendly and navigable by keyboard; having different font sizes helps here, too). Any respectable static site generator can do it -- you may remember this post on The Joy of Static Sites -- but the way I'm formatting my metadata will require some custom work. Blosxom and nanoblogger are probably the closest, but they're ancient. I probably ought to resist the temptation to roll my own.

Yeah. Right.

Another fine post from The Computer Curmudgeon.
