Open Source


Ah, the optimism of the 1st of January!

As I reflected on 2018, it became apparent that ‘starting, not finishing’ is a big problem, chez M0CUV. My muse bestows plenty of interesting ideas, but some of them are a bit ambitious. I start, then things grind to a halt. This, coupled with chronic procrastination, means a lot of churn, and a feeling of dissatisfaction, angst, and despair at not being able to find the time to do all this – or to prioritise better. A look back through the log shows a big gap of radio silence from June to October, a page of digital mode contacts, and not a single CW QSO throughout the whole year. On the software side, I shelved my own P2P stack after a baffling test failure wore me down; I do want to return to that. However, I spent most of my hobby development time working on the Parachute project, which, despite being really tricky, is going well. I never thought writing an assembler would be this hard. The devil is in the details.

So, after giving myself a good slap for a lacklustre radio year, 2019’s going to be goal-driven. Hopefully these goals are SMART (Specific/Stretching, Measurable/Meaningful, Agreed-upon/Achievable, Realistic/Rewarding and Time-based/Trackable). There are quite a few, and only the tech/radio-related ones are blogged about here. I’ve been using the Getting Things Done method for a while, and it stresses the importance of defining the Next Action on your activities.

So…

  • 1 blog post/month, at least. Progress updates on projects, etc.
  • 1 CW QSO/month, or 12 in the whole year. This’ll probably be really tough.
  • 1 QSLable QSO/month, or 12 in the whole year, any mode/band. FT8 makes this much easier!
  • Try to contact each month’s special callsign for the Bulgarian Saints award from Bulgarian Club Blagovestnik. I’ve already bagged January’s, LZ1354PM via SSB on 1st Jan 🙂
  • Take the next step on the magnetic loop project: build the frame, house the capacitor. I bought some wood and a food container for this yesterday.
  • Box up the 30m QCX transceiver properly – then use it.
  • Keep up with magazines as they arrive rather than building a pile of them.
  • Fix the current bizarre bug in the Transputer assembler – then ship the first release on Windows, macOS and Ubuntu/Raspbian.
  • Convert the Parachute Node server to use the IServer protocol – then write the IO code for eForth.
  • Build a web application with Elm. I’m thinking of a web front-end to WSJT-X, to allow me to operate remotely.

Let’s see how I get on…!

Some interesting features of cifs-utils in CentOS 7 make mounting Windows shares harder than I’d like, and they need documenting for the next time I (or you) run into them. I had to read the source for mount.cifs and examine its entrails with strace to uncover all this… so I hope this helps.

The actual problem I’m having is that I’m trying to mount CIFS shares as non-root users: this appears to be very hard (if not impossible) in CentOS 7, but was really easy in earlier versions. Along the way to discovering this, I found many suboptimalities that I thought might be useful to fellow travellers.

You can add an entry to /etc/fstab with the default settings for the mount point, or specify them on the command line; an fstab entry isn’t mandatory.
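
For reference, an /etc/fstab entry holding those defaults might look something like this, using the credentials file described below (the mount point and paths are purely illustrative):

//SERVER/SHARE /mount/point cifs credentials=/home/backupuser/credentials.txt,uid=1002,gid=1002,noauto 0 0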

Normally you’d need to give the username, password, and (windows) domain to the mount command, and these are delegated to mount.cifs. However, supplying the password on the command line is not wise from a security perspective, since it’ll be visible via ps. So, you can store the credentials in a file, with appropriate permissions, then give this to mount.cifs. Except that it doesn’t quite work as I expected….

With the credentials.txt file containing:
username=windowsdomain\\windowsusername
domain=windowsdomain
password=windowspassword

(Note, I’ve seen some posts suggesting that the windows domain be prepended to the username as above, and as I’ll show below, this causes problems…)

I used the command:

mount -t cifs //SERVER/SHARE /mount/point -v -o credentials=credentials.txt,uid=1002,gid=1002

With an appropriate Linux UID/GID that should own the files thus mounted (this is my ‘backupuser’ user). It didn’t work. The first problem was the error:

Credential formatted incorrectly: (null)

… which is code for ‘you have a blank line in your credentials.txt file’. Removing the blank lines, I then got:

mount error(13): Permission denied

I checked permissions on the credentials.txt (0440), the mount point, etc., etc… no, it’s not that. It’s parsing the credentials.txt, but seems not to get the username from it. If you give it a gentle hint with:

mount -t cifs //SERVER/SHARE /mount/point -v -o credentials=credentials.txt,uid=1002,gid=1002,username=windowsusername

It works!

Now, if your credentials.txt has the username without the domain, like:

username=windowsusername
domain=windowsdomain
password=windowspassword

You do not need to give the username when calling mount, so this works:
mount -t cifs //SERVER/SHARE /mount/point -v -o credentials=credentials.txt,uid=1002,gid=1002

So, the rules for credentials files are:

  • No blank lines
  • Don’t add the domain to the username

But as for enabling non-root/suid/user mounts of CIFS shares… setting the suid bit (chmod u+s /sbin/mount.cifs), adding ‘user’ to an /etc/fstab entry, and running it gives the very helpful “mount error(22): invalid argument”. I’ve tried everything I can think of, but it just appears to be something that is no longer possible in CentOS 7.

To get this working requires adding an entry to /etc/sudoers, like this:


backupuser ALL=NOPASSWD: /sbin/mount.cifs *, /bin/mount *, /bin/umount

Then mount using sudo mount. I’m not happy about having to jump through these hoops, but there you go…
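
For completeness, the non-root invocation then ends up looking something like this (same options as before, just via sudo; paths are illustrative):

sudo mount -t cifs //SERVER/SHARE /mount/point -o credentials=/home/backupuser/credentials.txt,uid=1002,gid=1002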

I’ve recently been trying to connect to a Microsoft SQL Server from a CentOS 6.5 system, using my tool of choice, Perl 5.10.

I chose to use unixODBC, and Microsoft’s own closed-source driver.

There are a few excellent articles out there on using unixODBC, such as EasySoft’s tutorial. The authors of that article, EasySoft, sell their own ODBC driver, which I didn’t try but probably should have. Or, perhaps FreeTDS would have been easier.

This has been moderately problematic, due to Microsoft’s driver requiring unixODBC 2.3.0, which does not exist in the standard package repositories; they expect you to build it from source. I tried to build it as an .rpm, but this didn’t quite work. I looked on rpmfind and found that there’s a unixODBC 2.3.1 available in Fedora 17; of course, building that on CentOS didn’t help, because the MS driver checks for the existence of unixODBC 2.3.0 exactly during its installation script. Gee, thanks MS!

Now, I could go off on a lengthy rant here about MS being a bit NIH and not understanding package management (well, Windows still has no such concept)… but no. Here’s how I got round it. It’s incomplete, as I had to drop this work due to other commitments, but if it’s of use to others, great…

So, I took the 2.3.1 .spec file, added a patch that renamed all the .so files to .so.2 (as required by perl-DBD-ODBC), and built 2.3.0 with it. It didn’t quite work, as the MS driver is linked against libodbcinst.so.1! So, one quick ln -s libodbcinst.so.2.0.0 libodbcinst.so.1 later, I have the combination of unixODBC 2.3.0, perl-DBD-ODBC 1.48-1 and MS’ libmsodbcsql-11.0.so.2270.0 working together. A bug report about similar problems with unixODBC and the Oracle driver was filed here – it looks like the MS driver could never work with unixODBC out of the box; a new MS driver linked against the correct libraries is what’s required.

Perhaps I should have built unixODBC from source as the MS page suggests, but I thought I’d be better off taking the same packaging instructions as similar RPM-based distros do. Anyway, I’ve got it working.

My .odbc.ini file looks like (data source name and hostname modified):

[MyDataSource]
Driver = ODBC Driver 11 for SQL Server
Server = tcp:mysqlserver.organisation.com
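
For the Driver name above to resolve, unixODBC also needs a matching entry in odbcinst.ini. Mine looked roughly like this – the driver path is simply wherever the MS installer put the library, so treat the path below as an assumption:

[ODBC Driver 11 for SQL Server]
Description = Microsoft ODBC Driver 11 for SQL Server
Driver = /opt/microsoft/msodbcsql/lib64/libmsodbcsql-11.0.so.2270.0
UsageCount = 1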

The perl script I used to test it:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use DBD::ODBC;
my $server = "mysqlserver.organisation.com";
my $db_user = "myuser";
my $db_pass = 'secret';
my $dbh = DBI->connect("DBI:ODBC:MyDataSource", $db_user, $db_pass) or die "$DBI::errstr\n";
print "Connected\n";
my $query = "Select count(*) from [mytable].dbo (NOLOCK)";
my $sth = $dbh->prepare($query) or die "$DBI::errstr\n";
$sth->execute or die "$DBI::errstr\n";
# go do stuff!
# Close the database
$sth->finish;
$dbh->disconnect;

You can find my modified unixODBC.spec here, and the small patch to rename the .so files can be found here.

In 2002, Paul Graham published A Plan For Spam, in which he describes the use of a Bayesian filter to detect spam in email. Today, this is one of the techniques used by most modern mail clients to detect spam, after initial training.

Could the same technique be used to filter your Twitter timeline? This could be either to detect spam, or, as I wanted to use it, to filter out things that I’m just not interested in, e.g. the current discussion of American politics. (Not that this is not worthy of discussion, it’s just that I have very little interest in it, and live in England.)

The idea is that upon reading a tweet, you decide whether it is interesting or not, and mark it accordingly. The filter learns that this particular combination of words is spam or ham, and uses this to adjust its assessment of the spamminess of other tweets. Tweets are then displayed in different colours, depending on the probability being high, low or medium. The record of your choices is stored on disk, and reloaded on startup; the effect being that over time, the program learns your interests.

One problem is the 140-character limit of tweets. This does not provide much with which to train a filter! However, I have had some initial success, though not yet to the extent that I could simply hide low-priority tweets; the filter still needs tuning.

The filter is an extension to Cameron Kaiser’s excellent TTYtter command-line Twitter client, which is written in Perl 5. As TTYtter does not currently provide a mechanism for an extension to indicate the ‘priority’ of tweets, and hence, colour them differently, I have patched TTYtter 1.2.5. In addition to TTYtter’s requirements, you will also need to install Gea-Suan Lin’s Algorithm::Bayesian module from the CPAN. I also recommend installing Term-ReadLine-TTYtter. The filter memory is persisted using Data::Dumper.
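
To give a flavour of what the extension does, here is a minimal sketch (not the actual extension code – the persistence file name and example tweets are made up) of training, classification and persistence with Algorithm::Bayesian and Data::Dumper:

#!/usr/bin/perl
# Sketch: train a Bayesian filter on words from tweets marked ham/spam,
# classify a new tweet, and persist the filter's memory between runs.
use strict;
use warnings;
use Algorithm::Bayesian;
use Data::Dumper;

my $memory  = 'bayes-memory.pl';                      # hypothetical persistence file
my %storage = -e $memory ? %{ do "./$memory" } : ();  # reload previous training
my $filter  = Algorithm::Bayesian->new(\%storage);

sub words { grep { length } split /\W+/, lc shift }   # crude tokeniser

$filter->ham(words('nice 30m FT8 opening into VK this evening'));
$filter->spam(words('yet another hot take on the latest political row'));

printf "spamminess: %.2f\n", $filter->test(words('more politics, again'));

open my $fh, '>', $memory or die $!;                  # persist for next time
print {$fh} Data::Dumper->Dump([ \%storage ], ['storage']);
close $fh;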

To use, invoke TTYtter as:

./ttytter.pl -readline -exts=bayes.pl -ansi

To mark tweets as spam, use the command "/spam xx yy .. zz" where xx, yy, zz are one or more tweet menu IDs, as shown in the incoming timeline. To mark as ham, use "/ham xx yy .. zz". Alternatively, use "/- xx yy .. zz" for spam and "/+ xx yy .. zz" for ham.

Tweets are then coloured according to the filter. Ham or interesting tweets are bold white, spam is yellow, undecided is plain white.

The weighting of the filter’s probability could possibly be better: I’m splitting the probability range [0..1] into three equal parts, [0..1/3), [1/3..2/3) and [2/3..1]. This was a guess 🙂
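
Concretely, the banding is nothing more sophisticated than this (a sketch – the thresholds are the guess mentioned above):

# Map a spam probability from the filter's test() into a display band.
sub band {
    my ($pr) = @_;
    return 'ham'  if $pr < 1 / 3;    # interesting: bold white
    return 'spam' if $pr >= 2 / 3;   # spam: yellow
    return 'undecided';              # plain white
}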

The patched TTYtter, and the module, can be found on the DevZendo Miscellaneous Mercurial repository. Changes to TTYtter are under the same licence as TTYtter itself; the Bayesian filter extension is under the Apache License v2.0.

More on these later, but I’ve spent quite some time honing the Cross Platform Launcher plugin, making it work nicely with the Maven NAR (Native Archive) Plugin. You can now build JVM launchers across all three platforms (Mac OS X, Ubuntu Linux and Windows) with one plugin, that contain cross platform JNI code.

This is in preparation for another large-scale project, to replace my hodgepodge of clunky shell/perl backup scripts and Windows XP backup with something a little more… industrial.

I need a decent backup system. Time Machine looks great for the Mac, and I use it… but don’t entirely trust it. I use FileVault, and having to log out just to start a backup of any bands that have changed is ridiculous.

I could just go out and buy Retrospect, but that’s not my style 😉

It has to back up in the background, notifying the user of activity unobtrusively. I’m thinking of a continuous incremental system, from the usual trio of operating systems, managing retention periods and disk space. I want to go back to any point in time and see/restore that data. It has to store all file metadata, so, extended attributes, modes/ACLs, multiple streams, sparse files, long filenames, symlinks, long symlinks to long filenames, ownership, etc. I want to be able to translate backup data into tar streams for verification and off-site backups. i.e. give me a tar stream of the whole data from last Wednesday, 18:34 onto this disk. It has to have a sparse, Zen-like UI – I’ve seen too many backup programs that were obviously written for developers!

I don’t ask much, really…..

The first seeds have been sown; I have the beginnings of a cross-platform library for filesystem access, and for access to all the above enhanced metadata that’s not available in Java. I know Java 7 will have some of this, but I can’t wait for it. I wrote quite a bit of this for UNIX some time ago, using Ant and Eclipse, but didn’t do it using TDD. I’m revisiting this, and starting from first principles using TDD.

I also need to reuse some common code from MiniMiser. Mercurial makes transplanting code between repositories quite easy.

I think with this project that I’ll open it as soon as I’ve cut something that builds in my CI server. Shouldn’t be long now.

When I started this project (when it was still a single desktop application for home financial management), I hoped that it might attract a community of users and developers.

Writing a software product that has quality is no small task. Anyone can lash together some code quickly, slap it up on SourceForge and hope – and many people do, leading to what David Heinemeier Hansson describes as “two people playing with it for five minutes” (See previous post, with quotes from David taken from the FLOSS Weekly Podcast #79 – http://twit.tv/floss)

Creating well-designed, aesthetic software that meets users’ needs, with all the features they have come to expect from modern desktop applications, plus excellent documentation, community infrastructure (website, forums, mailing lists, etc.), translation and localisation into their language and locale, and a coherent architecture that passes intensive quality metrics – is a lot of work for one developer.

I admit I have given no thought to internationalisation or localisation, at present. I will dedicate a release cycle or two to it later – I hope that the project will be of interest to developers who have more experience of these issues than I do, and who speak languages other than British English.

As noted in earlier posts, the goals of the project have diversified into providing a general-purpose embedded-database-backed desktop application framework. I hope that this might be of broader use to more users and developers than the narrower opportunities afforded by simply a financial application.

All the development to date has been stored and built on a server on a private network – as I open up the project, these facilities will have to move to public hosting facilities. Again, SourceForge and the like offer attractive facilities to developers here, but are not at all usable by end users (See How to successfully compete with Open Source software by Patrick McKenzie for a critical view of SourceForge as being unsuitable for end users.).

I’m currently using Subversion as a version control mechanism; this has been great so far, but the central repository on the private network is a bottleneck to my preferred style of working: rapid commits of discrete pieces of work. I do the majority of work on the project on the train, only connected sporadically via GPRS/HSUPA modem. Since the private network is not accessible remotely, I can’t commit when I’m travelling. All items I’ve been working on during commuting are in my working copy, and require sorting out into discrete commits. Sometimes there are overlaps between items, and if I were to open this to the community, it would make code review of independent pieces of work difficult. Not having the entire history available whilst working disconnectedly is also less than ideal. So before opening up, I’ll convert the projects to a distributed version control system – I’ve already chosen Mercurial: Git is not a consideration since its developers do not seem overly concerned about cross-platform issues; Bazaar was a contender, however Mercurial passed the “Has an O’Reilly Book” quality test. (Mercurial: The Definitive Guide.) I don’t know how easy it would be to set up my own repository hosted on my own public server, so I’ll enquire about hosting the code on BitBucket.
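
For the conversion itself, Mercurial’s bundled convert extension should do the heavy lifting – a sketch (repository paths are illustrative; the --config flag enables the extension for a single invocation):

hg --config extensions.convert= convert file:///path/to/svn/repo/trunk minimiser-hg
cd minimiser-hg
hg update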

Another small barrier to entry for other developers is the domain from where it is available. I have a perception that people might be less likely to contribute to a project that seems like it only belongs to ‘a guy in his bedroom’: The gumbley.me.uk domain and uk.me.gumbley Java package I have been using to date is like a private house: you may come in if invited. A community hall is open to all, more welcoming. Therefore, a “proper” domain has been obtained – which, in accordance with my preference for polishing until ready, will be revealed later. Suffice to say that I intend to develop it as a community website, made highly accessible primarily to potential users of the software, as they shouldn’t have to wade through a developer-centric design. I’m using Drupal to manage it – a thoroughly fantastic CMS; I recommend it highly. There will also be a grand renaming of the Java package structure of all the projects prior to hosting on BitBucket.

The build infrastructure is more than just source code control. I’ll need:

  • a continuous integration server (are there any publicly hosted Hudson servers?)
  • a Maven repository (I am aware of Sonatype’s OSS repository, and will be enquiring about this – there are a few artifacts I’m using that are not available from the central repository, and a few I have deployed in odd ways on my private repository – these issues will have to be addressed first).

I have been reading two excellent books on the community aspects of developing Open Source software: Karl Fogel’s Producing Open Source Software, and Jono Bacon’s Art of Community. I’d thoroughly recommend these to anyone contemplating an Open Source project – especially if they want it to have any longevity. Another series of articles that will help the Open Source craftsman is by Kirill Grouchnikov, and may be found at Pushing Pixels: Party of One: Surviving a Hobby Open Source Project. (Note that I disagree with Kirill’s point about documentation: “If your documentation does not have an immediate answer – it’s your fault. In fact, the documentation is a chicken and egg problem – if your library needs documentation, it’s unfortunately your fault as well.” I believe documentation to be an essential part of software, especially frameworks, languages and libraries: how many people would be Ruby programmers now, if Programming Ruby had not been published?)

The list of technical things I have to provide has ballooned following their advice: bug tracker, mailing lists backed by forums for nontechnical users, IRC (which could be difficult, given my semi-connected, only-on-the-train-and-frequently-going-through-tunnels mode of working).

There are also political/governance aspects that must be addressed.

There’s a lot to do – and at the moment, only me to do it. No wonder then, that only 15% of projects identified by Kirill in his articles achieve viability.

The words in the classes of MiniMiser, from Wordle

A picture really does tell a thousand words. These are the words in the classes of the current version of the MiniMiser framework – only the main code, not the unit tests (it’d look very skewed by the word ‘Test’, if I included those!).

Generated via:

find . -name "*.java" -print |
~/utils/bin/wordleize-java-class-names.pl

With a bit of perl:

#!/usr/bin/perl
# Reads .java file paths on stdin and emits each class name split into words,
# ready for pasting into Wordle.
while (<>) {
  chomp;
  my $class = $_;
  $class =~ s-^.*/--;              # strip the directory path
  $class =~ s-\.java$--;           # strip the .java extension
  $class =~ s/([A-Z])/ $1/g;       # split CamelCase into separate words
  $class =~ s/(^| )[A-Z]( |$)//g;  # drop stray single-letter words
  $class =~ s-^ +--;               # trim leading spaces
  $class =~ s- +$--;               # trim trailing spaces
  print "$class\n";
}

And the output pasted into the very wonderful Wordle.
