I've just had a short break from the constant pace of previous work – 16 months of near-daily updates – in which I caught up on the stack of magazines, articles and books I'd been meaning to get through. I was away from tech for a while, hence no blog post last month.

The book stack never goes down – see Michael Simmons' article – if I could take a yearly two-week reading vacation, I would… but back in the real world…

Now I'm ready to start on Parachute 0.0.2, the rework of the node server protocol to be iServer compatible. This should mean that the emulator could work with other emulators' iServers, and vice versa. However, the link emulation mechanism would need additional variants, to use the mechanisms of other emulators – e.g. that used by Gavin Crate's emulator.

During this work, I’ll update the Hello World assembly program, and start upgrading the C++ code to C++11 as needed.

TL;DR: Frustration, but the end is in sight.

Parachute is composed of several separate projects, with independent versions, held in separate repositories:

  • the Transputer Emulator itself, written in C++, built using Maven/CMake/Make, which requires building and packaging on macOS, CentOS 7, Ubuntu 16.04 and 18.04, Raspbian Stretch, and Windows 10.
  • the Transputer Macro assembler, written in Scala, built using Maven, which requires building and packaging on macOS, Linux (one cross-platform build for all the above Linux variants), and Windows 10.
  • and eventually there will be the eForth build for Transputer, other languages, documentation, etc.

Getting all this to build has been quite the journey!

I use Maven as the overall build tool, since it gives me sane version management, build capability for all the languages I use via plugins, packaging, signing, and deployment to a central repository (I'm serving all build artefacts via Maven Central).

Each project’s build runs across a set of Jenkins instances, with the master on macOS, and nodes on virtual machines, and a physical Raspberry Pi.

Each project deploys a single artefact per target OS, into Maven Central’s staging repository. So there are six build jobs, one on each node, that can sign and deploy on request.

The effect of this is that a single commit can trigger six build jobs for the C++ code, and three for the JVM-based code (since all Linux systems package the same scripts). Deployment is manually chosen at convenient points, with manual closing of the staging repository in Sonatype’s OSSRH service.

The manual deployment choices may be removed once all this is running smoothly. Since I cannot produce all platform-specific artefacts from a single Maven build, I cannot use the Maven Release Plugin.

Once the emulator and assembler are deployed for all their variants, there is a final build job that composes the Parachute distribution archives, signs them and deploys them to Maven Central via Sonatype OSSRH.

There have been several 'gotchas' along the way…

… the GPG signing plugin does not like being run on Jenkins nodes. It gets the config from the master (notably, the GPG home, from which it builds its paths to the various key files). So that had to be parameterised per-node.

… getting the latest build environments for C++ on each of the nodes. I'm not using a single version of a single compiler on everything: there's a variety of clangs (from 3.5.0 to 8.0.0), plus the Microsoft Visual C++ Build Tools.

… Windows. It’s just a world of pain. Everything has to be different.

So this long ‘phase one’ is almost at an end, and I hope to ship the first build very soon.

It would be ‘fun’ to see if I can replicate all the above with a cloud-based build system instead of Jenkins + VMs. However, Windows, macOS and Raspberry Pi will be problematic. Travis CI does not have CentOS or Raspberry Pi hosts; Circle CI does not have Windows, CentOS or Raspberry Pi hosts (Windows is on their roadmap).

Since Feb/Mar 2018, I've been working on a new phase of one of my old projects: Parachute, a modern toolchain for programming the Transputer, and a Transputer Emulator – cross-platform for Mac OS X, Windows 10 and Linux.

The Transputer architecture is interesting since it was one of the first microprocessors to support multitasking in silicon without needing an operating system to handle task switching. Its high level language, occam, was designed to facilitate safe concurrent programming. Conventional languages do not easily represent highly concurrent programs as their design assumes sequential execution. Java has a memory model and some facilities (monitors, locks, etc.) to make parallel programming possible, but is not inherently concurrent, and reasoning about concurrent code in Java is hard. occam was the first language to be designed to explicitly support concurrent (in addition to sequential) execution, automatically providing communication and synchronisation between concurrent processes. If you’ve used go, you’ll find occam familiar: it’s based on the same foundation.

My first goal is to get a version of eForth running on my emulator (as I've long wanted to understand Forth's internals). The eForth that exists is a) untested by its author and b) only buildable with MASM 6, which is hard to obtain (legally). I'm trying to make this project as open and cross-platform as possible, so first I had to write a MASM-like macro assembler for the Transputer instruction set. This is mostly done now, written in Scala, and just requires a little packaging work to run on Mac OS X, Linux and Windows.

I've written up the history of this project at Parachute History, so won't repeat myself here.

I’m not yet ready to release this, since it doesn’t build on Windows or Linux yet, and there are a few major elements missing. Getting it running on Windows will require a bit of porting; Linux should be a cinch.

Once I have a cross-platform build of the emulator, I intend to rewrite my host interface to be compatible with the standard iServer (what I have now is a homebrew experimental ‘getting started’ server).

There are quite a few instructions missing from my emulator – mostly the floating point subset, which will be a major undertaking.

The emulator handles all the instructions needed by eForth. eForth itself will need its I/O code modifying to work with an iServer.

Once eForth is running, I have plans for higher-level languages targeting the Transputer…

… but what I have now is:

… to be continued!

Abstract: Oracle is shutting down Kenai and java.net on April 28, 2017, and as one of the open source projects I'm a member of was formerly hosted there, we needed to move away. This move comprises source code and mailing lists; this post concerns the former, and is a rough note on how I'm migrating svn repositories to git (hosted on GitHub), with the method, and a script that makes it easier.

It's an ideal time to migrate your Subversion repositories to somewhere else, and since git/GitHub are the de facto/fashion standards these days, you may want to consider converting your entire history to git, and hosting it at GitHub. (Disclaimer: I prefer Mercurial and Bitbucket, and some of the below could be used there too…)

To convert an svn repo to git, and push to GitHub, there are several stages. I did this on OS X – I'd do this on some form of UNIX, but you may get results with Windows too.

I’m going to use svn2git, so let’s get that installed:

Following svn2git's installation instructions, I checked my prerequisites.

I made sure I had Ruby, RubyGems, svn, git, and git-svn installed (these package names might not be precise for your system; they're from memory). To check that git-svn is available, I can do:

$ cd /tmp
$ mkdir foo
$ cd foo
$ git init
$ git svn
git-svn - bidirectional operations between a single Subversion tree and git
usage: git svn [options] [arguments]
… blah blah …


$ sudo gem install svn2git

The conversion from svn to git makes a lot of network accesses to the svn repo, so to reduce this, let's "clone" the svn repo onto the local system.
First, initialise a local repo:

$ mkdir svnconversion
$ cd svnconversion
$ svnadmin create svnrepo

Note that git's svn code expects the subversion repo it converts to have a filesystem format version between 1 and 4, that is, up to Subversion 1.6. So if your installed Subversion tools are more recent than that, you'll have to use the command:

$ svnadmin create --compatible-version 1.6 svnrepo

(see the Subversion documentation for details)

$ ls -l svnrepo
total 16
-rw-r--r-- 1 matt staff 246 1 Sep 22:58 README.txt
drwxr-xr-x 6 matt staff 204 1 Sep 22:58 conf
drwxr-sr-x 15 matt staff 510 1 Sep 22:58 db
-r--r--r-- 1 matt staff 2 1 Sep 22:58 format
drwxr-xr-x 12 matt staff 408 1 Sep 22:58 hooks
drwxr-xr-x 4 matt staff 136 1 Sep 22:58 locks

$ cd svnrepo

Now create the pre-revprop-change hook:

$ echo '#!/bin/sh' > hooks/pre-revprop-change
$ chmod 755 hooks/pre-revprop-change

Let’s prepare to sync the svn repo here:

$ svnsync init file:///tmp/svnconversion/svnrepo <url-of-remote-svn-repository>

Now let’s do the actual sync. This is what takes the time on large repositories…

$ svnsync --non-interactive sync file:///tmp/svnconversion/svnrepo
# Make tea…

OK, now we have the "clone" of the svn repo, so let's convert it to git. The first thing you'll need is an author mapping file. This converts the short author names used in svn commits into the longer "Name <email>" form used by git.

Note there are many possible structures for svn repos, with the 'standard' layout having branches/tags/trunk. This page assumes that your svn repo looks like that. If it doesn't, then see the svn2git documentation, where there are many possibilities documented to aid your conversion.

See the svn2git GitHub page for details of how to create this authors.txt file.
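For illustration, an authors.txt has one mapping per line (these usernames and addresses are invented):

```
jbloggs = Joe Bloggs <joe.bloggs@example.com>
asmith = Alice Smith <alice@example.com>
```

Every author appearing in the svn history must have an entry, or the conversion will stop when it meets an unknown name.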

Converting to git is as simple as:

$ cd /tmp/svnconversion
$ mkdir gitrepo
$ cd gitrepo
$ svn2git --authors ../authors.txt file:///tmp/svnconversion/svnrepo

Then create a new repository using the GitHub web UI, add it as a remote, and push, mirroring all branches to the remote:

$ git remote add origin <your-github-repository-url>
$ git push --mirror origin

The following is a script I wrote to make it easier to perform the above steps repeatedly, as I had several repositories to convert. It assumes you have exported the GITORGANISATION environment variable.


#!/bin/sh
# Note: some variable assignments were lost from the original listing;
# those below are plausible reconstructions of them.

usage() {
	echo "svn-to-git-conversion [syncsetup|sync|convert|push] http://url/of/svn/repository local-repo-dir-prefix ~/path/to/authors"
	exit 1
}

# syncsetup, sync, convert or push
PHASE=$1

# URL of the remote svn repository being converted
SVNREPOURL=$2

# prefix of relative folder (eg jxta-c) where repository will be svnsynced to eg jxta-c-svn
# and of relative folder where repository will be converted eg jxta-c-git
REPOPREFIX=$3
SVNREPONAME=${REPOPREFIX}-svn
GITREPONAME=${REPOPREFIX}-git

# path to author mapping file
AUTHORS=$4

if [ "$PHASE" != "syncsetup" -a "$PHASE" != "sync" -a "$PHASE" != "convert" -a "$PHASE" != "push" ]
then
	usage
fi

SVNREPOFILEURL=file://$(pwd)/$SVNREPONAME
echo local svn repository url is $SVNREPOFILEURL

if [ "$PHASE" = "syncsetup" ]
then
	svnadmin create --compatible-version 1.6 $SVNREPONAME
	echo '#!/bin/sh' > $SVNREPONAME/hooks/pre-revprop-change
	chmod 755 $SVNREPONAME/hooks/pre-revprop-change
	svnsync init $SVNREPOFILEURL $SVNREPOURL
fi

if [ "$PHASE" = "sync" ]
then
	svn propdel svn:sync-lock --revprop -r 0 $SVNREPOFILEURL
	svnsync --non-interactive sync $SVNREPOFILEURL
	echo Users in the SVN repository to be added to the $AUTHORS file:
	svn log --quiet $SVNREPOFILEURL | grep -E "r[0-9]+ \| .+ \|" | cut -d'|' -f2 | sed 's/ //g' | sort | uniq
	echo Top-level structure of the SVN repository:
	svn ls $SVNREPOFILEURL
fi

if [ "$PHASE" = "convert" ]
then
	mkdir -p $GITREPONAME
	(cd $GITREPONAME && svn2git --authors $AUTHORS $SVNREPOFILEURL)
fi

if [ "$PHASE" = "push" ]
then
	# GitHub remote URL prefix assumed (SSH form); adjust if you use HTTPS
	(cd $GITREPONAME && git remote add origin git@github.com:$GITORGANISATION/$GITREPONAME.git && git push --mirror origin)
fi

In part 2 of this series, I described the construction of the HF antenna analyser project I’m building, from Beric Dunn’s schematics and Arduino firmware. In this article, I’ll finish some small items of construction, and look at testing and driving the analyser. All resources, pictures and files for this project are available from the project GitHub repository, with driver software available from the driver GitHub repository.


The Scan LED wasn’t working, and this was because R12 was too large, so I replaced it with a 1K Ohm. Sorted. Also, the SIL headers I’d ordered originally were too small for the pins of the Arduino Micro and DDS module. It took some time to locate suitable replacements, and find a supplier who wasn’t going to charge me £4.95 just for placing an order as a private (hobbyist) customer. Fortunately, I discovered Proto-Pic, a UK supplier that could provide 10-pin and 6-pin SIL headers. I ordered 2×10 pin Stackable Arduino Headers (PPPRT-11376) and 6×6 pin Stackable Arduino Headers (PPPRT-09280) for £4.78 including P&P. When fitting the 6-pin headers for the Arduino Micro (three per side), you may find that they are quite tight together, so sand down the inner edges a little. The Arduino Micro was still quite a tight fit, but it’s far more secure than it was.

Boxing it up

I cut a few more tracks on the veroboard near the mounting holes so that the metal spacers and screws I found in my spares box wouldn’t short anything out, then started fitting the board into the enclosure, cutting holes as appropriate. I added a switch into the power line… the result looks like this:

And when the LetraSet goes on:

Software, Firmware

I’ve made a few changes to Beric’s original firmware (see here), but will keep the commands and output format compatible, so if you’re driving my modified firmware with Beric’s Windows driver, everything should still work.

I use Windows 2000 on an old laptop in the Shack: I couldn't get it working with the Arduino drivers, so I couldn't use Beric's Windows driver software. I needed a Linux or Mac OS X solution, so I started writing a Scala GUI driver that would run on Mac, Windows or Linux, and have got this to the point where I need to add serial drivers like RxTx, get the native libraries packaged, etc.

However, that’s on hold, since I was contacted by Simon Kennedy G0FCU, who reports that he’s built an analyser from my layout which worked first time!! He’s running on Linux, and has passed the transformed scan output into gnuplot to yield a nice graph. I hadn’t considered gnuplot, and the results look far better than I could write quickly.

So, I reused the code I wrote several years ago for serial line/data monitoring, and wrote an analyser driver in C that produces nice graphs via gnuplot. So far it builds on Mac OS X. In the near future I'll provide downloadable packages for Debian/Ubuntu/Mint, Red Hat/CentOS and hopefully Raspberry Pi.


The analyser as it stands is not without problems – the first frequency set during a scan usually reports a very high SWR; I don't think the setting of the DDS frequency after a reset is working reliably. From looking at the DDS data sheet timing diagrams, short delays are needed after resetting and after updating the frequency – these are not in the current firmware…

Also, repeated scans tend to show quite different plots – however, there are points in these repeated plots that are similar, hopefully indicating the resonant frequencies.

Beric mentioned (on the k6bez_projects Yahoo! group) that “With the low powers being used, and the germanium diodes being used, it makes sense to take the square of the detected voltages before calculating the VSWR.”…

Simon pointed out that “the variable VSWR is defined as a double. This means that when REV >= FWD and VSWR is set to 999 it causes an overflow in the println command that multiplies VSWR by 1000 and turns it into an int. Making VSWR a long should fix this.” He also suggested some other changes to the VSWR calculation…

… these are changes I’m testing, and hope to commit soon.

I’ll add some options to the software/firmware to plot the detector voltages over time for a few seconds – an oscilloscope probing the FWD/REV detector output shows some digital noise. I had added an LED-fading effect to show that the board is active, and this exacerbates the noise. This noise makes it through to the VSWR measurement. I’ll try taking the mode of several measurements… Once the DDS is generating the relevant frequency, I’m expecting these voltages to be perfectly stable.

I’m investigating these issues, and hope to resolve them in software/firmware – I hope no changes are needed to the hardware to fix the problems I’m seeing, but can’t rule out shielding the DDS, and/or using shielded cable for the FWD/REV connections between the op-amp and Arduino Micro.

In the next article, I’ll show how to drive the analyser with the driver software, and hopefully resolve the noise issue.

Will M0CUV actually find the resonant frequency of his loft-based 20m dog-leg dipole made from speaker wire? Will the analyser show the tight bandwidth of the 80m loop? Stay tuned! (groan)

73 de Matt M0CUV

I’ve recently been building a small set of CentOS server virtual machines with various settings preconfigured, and packages preinstalled. These were built from the ‘minimal’ CentOS-6.5-x86_64-minimal.iso distribution, as you don’t need a GUI to administer a Linux server. Initially these VMs were built manually, following a build document, but after several additions to the VMs, and documenting these updates in the build document, I decided to automate the whole process. This post describes how I achieved this – I had some problems, hope this helps…

UPDATED: The need to specify an IP address for the remote_host property has been fixed in Packer's GitHub repo, and should be in a release coming soon!

I decided to use Mitchell Hashimoto's excellent Packer system. I'm running it on an Ubuntu Linux 12.04 desktop VM. Eventually this will be changed to run under Jenkins, so that changes to the configuration can be checked into source control, and the whole process can be fully automated. Until then, I've automated it using Windows 7 as my main system, with VMware Player 6.0.1 running the Ubuntu Linux desktop. I also have an instance of VMware ESXi 5.5.0 running under VMware Player. The Ubuntu VM with Packer creates the new CentOS VMs inside this nested ESXi. If you haven't seen the film Inception, now might be a good time to watch it… Both the Ubuntu and ESXi VMs use bridged networking, and are on the same IP network.

On the ESXi system, I have:

  • installed the VMware Tools for Nested ESXi
  • configured remote SSH access and the ESXi Shell (under Troubleshooting Mode Options) – Packer currently requires SSH access to ESXi, rather than using VMware’s API; this may change in the future
  • enabled discovery of IP address information via ARP packet inspection. This is disabled by default, and is enabled by SSH using esxcli system settings advanced set -o /Net/GuestIPHack -i 1
  • allowed Packer to connect to the VNC session of the VM being built, so that it can provide the early boot commands to the CentOS installer (specifically, giving KickStart a specific configuration file, served by a small web server – more on this later). To enable VNC access, I used the vSphere client to visit the server’s Configuration/Security Profile settings, and under Firewall/Properties…, enabled gdbserver (which enables ports in the range VNC requires, 5900 etc.) and also SSH Client and SSH Server (I forget some of the other things I tried… sorry!)
  • configured a datastore called ‘vmdatastore’ which is where I want Packer to build the VMs.

On the Ubuntu system, I have a directory containing:

  • The CentOS minimal .ISO
  • A Kickstart file. This was taken from a manual installation’s anaconda-ks.cfg, and modified using a CentOS desktop’s KickStart Configuration tool. See below for its contents.
  • The Packer .JSON script. See below.
  • A script to launch a webserver to serve this directory – Packer needs to get the .ISO and KickStart file over the network, and this is how it’s served. Nothing complex: python has a simple one-line server which I use here.
  • A script to run packer.
  • A script to run on the built VM after the OS has been installed. This isn't the hard part, so for this post it just echoes something; in reality, it installs the packages I need and configures all kinds of stuff.

So let’s see some scripts. They are all in my ~/packertemplatebuilding directory. The Ubuntu desktop VM’s IP address is, and the ESXi VM’s IP address is; root SSH access to ESXi is used, and the password is ‘rootpassword’. (Of course these are not the real settings!)

The webserver launching script:

python -m SimpleHTTPServer &

The Packer launch script:

# export PACKER_LOG=enable
packer build base-packer.json

The Packer script – one of the problems I had was that the IP addresses you see in here were initially given as hostnames, and set in DNS. This didn’t work, as Packer (0.5.1) is using Go’s net.ParseIP(string-ip-addr) on the remote_host setting, which yielded the error “Unable to determine Host IP”. Using IP addresses isn’t ideal, but works for me:

{
  "builders": [
    {
      "type": "vmware-iso",
      "iso_url": "",
      "iso_checksum": "0d9dc37b5dd4befa1c440d2174e88a87",
      "iso_checksum_type": "md5",
      "disk_size": "10240",
      "disk_type_id": "thin",
      "http_directory": "~/packertemplatebuilding",
      "remote_host": "",
      "remote_datastore": "vmdatastore",
      "remote_username": "root",
      "remote_password": "rootpassword",
      "remote_type": "esx5",
      "ssh_username": "root",
      "ssh_password": "rootpassword",
      "ssh_port": 22,
      "ssh_wait_timeout": "250s",
      "shutdown_command": "shutdown -h now",
      "headless": "false",
      "boot_command": [
        "<tab> text ks=<enter><wait>"
      ],
      "boot_wait": "20s",
      "vmx_data": {
        "ethernet0.networkName": "VM Network",
        "memsize": "2048",
        "numvcpus": "2",
        "cpuid.coresPerSocket": "1",
        "ide0:0.fileName": "disk.vmdk",
        "ide0:0.present": "TRUE",
        "ide0:0.redo": "",
        "scsi0:0.present": "FALSE"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": ""
    }
  ]
}

Note that this need for IP addresses has been fixed and will be in a future Packer release.

The script:

echo Starting post-kickstart setup

And finally, the Kickstart file ks.cfg. Note that the hashed value of the VM's root password has been redacted; use the Kickstart Configuration tool to set yours appropriately:

#platform=x86, AMD64, or Intel EM64T
# Firewall configuration
firewall --enabled --ssh --service=ssh
# Install OS instead of upgrade
install
# Use CDROM installation media
cdrom

rootpw  --iscrypted insert-hashed-password-here
authconfig --enableshadow --passalgo=sha512

# System keyboard
keyboard uk
# System language
lang en_GB
# SELinux configuration
selinux --enforcing
# Do not configure the X Window System
skipx
# Installation logging level
logging --level=info

# Reboot after installation
reboot

# System timezone
timezone --isUtc Europe/London
# Network information
network  --bootproto=dhcp --device=eth0 --onboot=on
# System bootloader configuration
bootloader --append="crashkernel=auto rhgb quiet" --location=mbr --driveorder="sda"

# Partition clearing information
clearpart --all  --drives=sda

# Disk partitioning information
part /boot --fstype="ext4" --size=500
part pv.008002 --grow --size=1
volgroup vg_centos --pesize=4096 pv.008002
logvol / --fstype=ext4 --name=lv_root --vgname=vg_centos --grow --size=1024 --maxsize=51200
logvol swap --name=lv_swap --vgname=vg_centos --grow --size=3072 --maxsize=3072

%packages --nobase


And that’s it! You’ll have to adjust the timings of the various delays in the Packer .JSON file to match your system. Have the appropriate amount of fun!

Earlier posts discussed the distributed microblogging system I'm building, why I'm writing it, how you would use it, and how it works. In this post, I'll describe the tools & technologies I'm using to write it, and how you can get involved. It'll take a long time to write, given the limited time I can devote to it amid life, family, study and the day job, so I'd be very happy to receive help!

The software is written in Scala, and its code is currently hosted in a Mercurial repository on Bitbucket. I build it with Maven, write it in IntelliJ IDEA, and use test-driven development as rigorously as possible. It is released under the Apache License, v2.0. Installable software is available for Mac OS X (Snow Leopard or greater), Windows (XP or greater), and Ubuntu Linux (10.04 or greater). I use Software Crafting as my approach – apprentices are always welcome!

The main technologies I'm using are JXTA for the peer-to-peer communications (of which, more later), Play for the client REST API, and Bootstrap, jQuery and HTML5 for the web UI. Storage is handled by an embedded H2 database, with my CommonDb Framework for data access.

The rough architectural plan is that on top of JXTA, I intend to have an anti-corruption messaging/asynchronous RPC layer feeding into the domain model, this being isolated from JXTA. Group membership may be handled by an implementation of the Paxos consensus algorithm. Replication is to be handled by a simple gossip protocol, both for the updates to the directory, and between peers in a message replica set.

Interested in contributing to the project? Contact me via this blog, or via @mattgumbley or @devzendo on Twitter; you can find my mail details on the Contact page.

To be continued…

In which I describe the features I’m hoping to provide in a peer-to-peer social network, run by its users, for its users.

In my previous post, I laid out some arguments behind my wish for a decentralised, peer-to-peer social network. This is a system that I’m building. In this post, I’ll describe the features and usage I’m hoping to provide, in as non-technical manner as I can. It’s a simple set of features, but provides the essentials. A subsequent post will describe the technologies I’m using to implement this.

No central system

I should explain the main difference between this system, and other social networks you may be familiar with.

You might visit facebook.com or twitter.com in your web browser, or might use an app on a smartphone or tablet, but Facebook and Twitter have a large set of servers providing the social network to you.

With this system, there are no central servers. The users of the system each run a piece of software (the ‘node’) on their computers, and this plays its part in building the social network, allowing people’s posts to be distributed to their followers.

Open Source

It's Open Source: free as in speech, and in beer: it costs nothing, the source code is available for you to read and scrutinise, and you are free to join me in developing it, translating it, enhancing it, and discussing its future direction.

Free as in cost: it costs nothing to run a node, although it will increase your Internet connectivity costs. I’ll try to ensure the node isn’t too greedy with your bandwidth!

Getting started

You would download and run your own copy of the node software – available for Mac OS X (Snow Leopard or above), Windows (XP and above), or Ubuntu Linux (10.04 or above). There’s a desktop version with a small GUI, and a version you can run as a service/daemon without a GUI.

Installation is trivial: drag the application icon to Applications on a Mac; run an installer on Windows; some arcane apt-get incantation on Ubuntu that Linux-heads will find soothingly easy.

I said above that there's no central server, but there is a server hosting the project's web site, which is where the software is downloaded from – but that's almost all it's used for. More on that, later.

You run this software whenever and wherever you can – whenever you are using your computer, or when it is idle. The node software would find, and join, the peer-to-peer network, handling replication of some users' messages, and the building of your timeline. It is your home node. You can have more than one home node, say one at home, one at work – and all nodes help to build the network. You can only log into your home nodes, however. Your neighbours might have their own home nodes, that help to build the network, but you can't log into them, unless they grant you access.

You can opt to provide relay facilities for other users (those behind NAT routers) if you are running the node on a publicly-accessible system, and can spare bandwidth, storage and CPU. If you have plenty of these, you can opt to form part of the directory, of which, more later.

Once running, the node software gives you a web site, and you log in to this with your web browser. The desktop node provides a button, which, when clicked, loads the client web site into your web browser.

All setup and operations are then done from your web browser.

To access your home node from the public Internet, you may need to open its web ports on your firewall. Although I’m trying to make all this as easy to use as possible, this step might be problematic for less technical users.

Using the network

After the node is installed and running, you log into it with a web browser, and assign an Administrator password. You can’t do anything else with it until this is done.

Then, as the Administrator, you can create an account on the network.

From the login screen, you can see the Administrator account, and all other users of this home node, including the one you just set up. Now log in using this account.

Logged in as a user, you can set basic bio information from your account settings, and set an avatar picture.

You can search for other users, and follow them. You can see who is following you.

You can post private messages to those you follow, and who also follow you. Such private messages are sent directly to the follower’s node, if it is online. Delivery will be retried if the follower is offline.

You can post a new public message. This is public to your followers, and will be replicated to a small set of peer nodes that hold your posts, to improve availability of your messages, if your home node is offline, or uncontactable. (These peers form your message replica set).

Messages are short – perhaps a little longer than 140 characters though.

Your timeline view will show the posts of those you follow, sorted chronologically. The timeline will show 72 hours of messages; messages older than this disappear. The message expiry time exists since the system relies on the goodwill of its users to host message storage – I don't want to eat your entire disk! Messages replicated to other peers also expire after 72 hours. Maybe more than 72 hours is needed – but there should be some finite expiry.

It’s likely that in building the timeline, there may be replicas that are not online or contactable. You may be viewing part of today’s timeline when these replicas come online, making yesterday’s messages available, so there will need to be some visual indication that there are some older posts from yesterday that you could now read.

A selected message in your timeline can be replied to. Your client would show any replies to your messages.

How the network is built

When you post a message, it is quickly replicated to your replica set (if possible), to improve your messages' availability to your followers. The size of your replica set is dependent on the number of followers you have: a celebrity may have thousands of followers or more – there need to be many replicas available to serve their messages. For new users with few followers, fewer replicas are needed, but there will always be a small number.

Your timeline is built by your home node contacting one of the set of peers that replicate your friends' messages. By contacting a replica of their messages, you will receive a read-only copy of them.

User information, and the graph of followers will be stored in the directory: a replicated set of highly-available, well-connected peers. The directory also records the set of peers that replicate your messages, and which peers are your home nodes.

Search will prove difficult; there may be a need to send all posts to an indexing service, again on a set of high-availability peers, from where search can be effected. This could also be used to provide a “firehose”.

In which I consider how a peer-to-peer social network could be run for its users, by its users, and the benefits and disadvantages of such a scheme.

Warning: this post contains quoted strong language.


I use Facebook and Twitter; they have re-united me with old friends, and provide an engaging means of staying in contact with them – a role I once thought email would fill. I don’t use email so much for social contact these days; spam has made it something I check only rarely, and it feels heavyweight compared to the style of conversation that social networks provide. So, I am indebted to Facebook and Twitter, and will continue to use both.

But there are some disadvantages with them…

The cost of social networks

Social networks such as Facebook and Twitter (and Google+ and whatever Microsoft might build) are not free. You do not pay money to use them, but providing the service is a significant cost to their owning companies. The hardware / infrastructure employed by centralised social networks has to be paid for; engineers need to be paid.


Typically these costs are mitigated through advertising, and since the networks can analyse your graph and conversations, this will tend towards targeted advertising.

I’m starting to see adverts, sponsored stories and sponsored tweets in my timelines – and, in the case of Twitter’s despised trending-topics Quick Bar, in my face with no way to avoid it. Facebook’s tracking of who likes what now provides a regular source of unsolicited dross at the top of my timeline.

I have a peculiar dislike of advertising which won’t be shared by many, I’d admit. I rarely watch commercial TV. I use ad-blockers and anti-tracking plugins. I choose not to be bombarded by these distractions; the false world the advertisers foist on us. I choose not to include the downloading of adverts in my monthly bandwidth allowance. I’d like the choice to block adverts on social networks, although I understand that this is not in their interests.

I’m in agreement with Banksy, who said:

People are taking the piss out of you everyday. They butt into your life, take a cheap shot at you and then disappear. They leer at you from tall buildings and make you feel small. They make flippant comments from buses that imply you’re not sexy enough and that all the fun is happening somewhere else. They are on TV making your girlfriend feel inadequate. They have access to the most sophisticated technology the world has ever seen and they bully you with it. They are The Advertisers and they are laughing at you.

You, however, are forbidden to touch them. Trademarks, intellectual property rights and copyright law mean advertisers can say what they like wherever they like with total impunity.

Fuck that. Any advert in a public space that gives you no choice whether you see it or not is yours. It’s yours to take, re-arrange and re-use. You can do whatever you like with it. Asking for permission is like asking to keep a rock someone just threw at your head.

You owe the companies nothing. Less than nothing, you especially don’t owe them any courtesy. They owe you. They have re-arranged the world to put themselves in front of you. They never asked for your permission, don’t even start asking for theirs.

I think he has a point: advertising could be opt-in, controlled by me: I would follow those companies/organisations in which I have an interest, and they would post adverts and other information for my consumption. If I choose to unfollow, that’s my prerogative, and I should never hear from them again. There should be no way for them to sell my address to other organisations, as typically happens when signing up to less-than-scrupulous sites with your email address. An organisation’s right to put information in my timeline must be granted by me, be revocable by me, and be binding. In building a social network, I would not view my friends as a sea of eyeballs/attentions to be advertised at and provoked.

I don’t begrudge companies wanting to promote themselves and their products, but I find out about so many things by word of mouth/tweet, and recommend things to my followers: if your product is not good enough to go viral because of its inherent quality or your company’s ethics, you may want to reconsider its viability. Natural selection, invisible hands of markets making significant gestures, and whatnot.

Of course, blocking adverts would deprive the social networks of some of their income. The ad-blocker I use, Ad-Block Plus, recently introduced an option to allow or block selected, non-intrusive adverts; I exercised my choice. Facebook have recently blocked one such ad-filter, F. B. Purity, from working, so I’m stuck with their dross.

So if they are not supported by adverts, how are the networks to sustain themselves? Would you pay for access to a social network? I might, depending on cost; I suspect many wouldn’t. It probably wouldn’t yield a sustainable income – take the example of smartphone apps: apps typically have a paid version and a “free” ad-supported version, and I’ve heard that the ad-supported versions typically generate more money than the ones people pay for in order to remove the annoyances. Note: App.Net exists precisely to cater for those who do not want an ad-supported social network. Can it sustain itself?

I think it’s odd, supporting yourself by irritating your users. Perhaps the advertising will become increasingly irritating/invasive to the point of this model being rejected by users: however, the continuing existence of commercial TV is an existence proof that suggests this is unlikely to happen. The BBC exists in its current form because it extracts an entertainment tax from its viewers. Were it not to do this, it would be annoying us with adverts to generate income. So, the advertisers are expecting us to acquiesce, and accept their intrusions.

There is a way that a social network can be built, sustainably, without advertising, but before I discuss this, there are several other aspects to think about…

Privacy and interception

Having a large user base using social networks as a centralised communications platform is also great for regimes seeking to intercept, censor or perform sentiment analysis of messages.

I believe that we are entitled to some privacy. A decentralised system might make such interception slightly harder. Given that our home and mobile Internet traffic is monitored, then even decentralised social network traffic would be visible. It would just be slightly harder to tie all the messages together. The system I’m proposing uses secure communications for all inter-node communications.

In the context of a social network, I think Twitter’s model of privacy is what I’d like to emulate: my posts are all public, but I can send private messages to specific followers if I follow them.

The spooks can see all the public stuff, and if they want to, can throw their weight behind a brute force attack on the private messages. I am under no illusions that any of this would prevent them from reading what I might consider secure: I’m assuming they have the kind of resources and abilities that might appear in a Dan Brown novel.

A decentralised system (see future post describing the mechanisms I’m considering) would necessarily store a user’s messages on multiple systems, encrypted, possibly distributed worldwide.


Centralisation also means that regimes can censor communications. In extreme cases, entire countries are disconnected from the global Internet, as we have seen during the Arab Spring, and recently in Syria. Specific sites such as Facebook or Twitter can be blocked, although proxies and Tor can reduce the impact of this.

A decentralised network communicating over HTTP could only be blocked using some form of deep-packet inspection. I’m sure the system I have in mind could be disrupted significantly; there are points of failure, but once working, a significant network disruption would cause partitioning, not complete failure. When the network is restored, the partition would be automatically recovered. (He says, glibly, waving hands frenetically around the CAP theorem and FLP impossibility result)

Privacy and the trend towards total openness

Social networks have been known to make changes to their settings that led to reduced privacy, without notifying users. Some networks have complex privacy settings that may be more open by default than you may like. There are also attempts to leverage their huge user base to become an identity provider or a messaging provider. Their smartphone apps scour your contacts, upload and modify them.

There is a trend away from private to public, controlled by these networks, without the users’ consent. This is not democracy.

Facebook recently abandoned their voting system, as turnout was very low. I’d like to think that users would want to be able to express views on how the system expands and is developed, so I’m thinking of providing a voting system where the users choose how the system evolves.

The basic simple privacy model should not change, though; it should not be made more complex than it is.


There’s a cynical notion that if you’re not paying for it, then you are the product being sold.

What would it be like if we had a social network that challenged these assumptions?

Something for nothing

To build a social network without incurring costs – without advertising to sustain it, without building server farms – requires a system that users build themselves by their participation; one that self-organises and self-maintains. It must be trivially easy to set up and get started: anything remotely technical will not get used (see Diaspora*). It has to be as easy to use as Skype. It has to be decentralised, using peer-to-peer technologies, as I can’t afford a central server, nor do I want to be the administrator.

So this is my current überproject. It involves peer-to-peer communications (using JXTA as the underlying network transport), embedded databases, gossip protocols, web user interfaces and all that they entail.

It does not preclude the making of money: not everyone will want to run a node; many will not be able to, if they do not have a traditional computer or if they have moved to the cloud, and only use a tablet or smartphone. In these cases, they can pay a small- or micro-fee for a subscription to access the network via a server that’s maintained by a responsible company. (Perhaps there will be a race to the bottom between hosting companies, as we see in the App Stores?!)

You could start a company selling a Raspberry Pi pre-loaded with a Linux distribution, JVM and the node software. Add it to your home network, create your account, and off you go!

It does not preclude companies creating a presence for advertising, but won’t rely on them for funding. Such companies will not be given any special privileges to push advertising at users.

It does not prevent a determined power from intercepting traffic, but should give an element of privacy with user-to-user messages.

You pay for it by running a node. If you can spare the bandwidth, you can run an Internet-facing node, allowing other users to relay through it, improving access. Generous donors and philanthropists could run nodes on high-availability/performance servers in data centres.

Once you have a node, you create an account, then log into your account on the node with a browser. Then you can follow others, read their posts, and post your own messages.

Subsequent posts will describe the features I’m proposing, and the implementation of this system in more detail.

Are you in?

More on these later, but I’ve spent quite some time honing the Cross Platform Launcher plugin, making it work nicely with the Maven NAR (Native Archive) Plugin. You can now build JVM launchers for all three platforms (Mac OS X, Ubuntu Linux and Windows) with one plugin, and those launchers can contain cross-platform JNI code.

This is in preparation for another large-scale project, to replace my hodgepodge of clunky shell/perl backup scripts and Windows XP backup with something a little more… industrial.

I need a decent backup system. Time Machine looks great for the Mac, and I use it… but don’t entirely trust it. I use FileVault, and having to log out just to start a backup of any bands that have changed is absurd.

I could just go out and buy Retrospect, but that’s not my style 😉

It has to back up in the background, notifying the user of activity unobtrusively. I’m thinking of a continuous incremental system, covering the usual trio of operating systems, managing retention periods and disk space. I want to go back to any point in time and see/restore that data. It has to store all file metadata: extended attributes, modes/ACLs, multiple streams, sparse files, long filenames, symlinks, long symlinks to long filenames, ownership, etc. I want to be able to translate backup data into tar streams for verification and off-site backups – e.g. give me a tar stream of the whole data set from last Wednesday, 18:34, onto this disk. It has to have a sparse, Zen-like UI – I’ve seen too many backup programs that were obviously written for developers!
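That metadata wish-list implies a per-file record along these lines. The field names here are mine, purely for illustration; a real implementation would need per-platform variants:

```java
import java.util.List;
import java.util.Map;

// Hypothetical per-file metadata record for the backup store. Every field
// must round-trip losslessly, so that a restore (or a generated tar
// stream) reproduces the original file exactly.
public class FileMetadataSketch {
    String path;                  // long filenames must survive intact
    String owner, group;          // ownership
    int mode;                     // POSIX permission bits
    List<String> aclEntries;      // platform ACLs
    Map<String, byte[]> xattrs;   // extended attributes
    List<String> streamNames;     // multiple (alternate) data streams
    String symlinkTarget;         // null unless this entry is a symlink
    boolean sparse;               // record holes rather than runs of zeros
}
```

Much of this is exactly the detail that pre-Java-7 file APIs can’t reach, hence the need for a cross-platform JNI library.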

I don’t ask much, really…..

The first seeds have been sown; I have the beginnings of a cross-platform library for filesystem access and access to all the above enhanced metadata that’s not available in Java. I know Java 7 has some of this; I can’t wait for Java 7. I wrote quite a bit of this for UNIX some time ago, using Ant and Eclipse, but didn’t do it using TDD. I’m revisiting this, and starting from first principles using TDD.

I also need to reuse some common code from MiniMiser. Mercurial makes transplanting code between repositories quite easy.

I think with this project that I’ll open it as soon as I’ve cut something that builds in my CI server. Shouldn’t be long now.
