Fedora 17 upgrade notes

Here are some personal notes on upgrading a home machine to Fedora 17 (beta).

Preupgrade needs about 165MB of free space on /boot for the upgrade, in addition to whatever is already there to boot your current Fedora. I didn’t have enough space, and the preupgrade process didn’t warn me. This is a bug that wasn’t considered a blocker. I’m glad I trusted the spew of “No space left on device” messages and Python stack traces on the console rather than the preupgrade dialog box that said “ready to reboot” into Beefy Miracle. To free up room, I aggressively removed all kernels except the active one. Moral of the story: if your boot partition is smaller than 200MB (200*2^20 bytes), preupgrade will probably not work.
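Since preupgrade did not warn me, a check like the following could be run beforehand. This is only a minimal sketch: the 165MB threshold is the figure I observed above, not an official requirement, and the function names are made up for illustration.

```python
import shutil

MIB = 2**20
# Figure observed in this post; treat it as an assumption, not a documented limit.
PREUPGRADE_BOOT_BYTES = 165 * MIB


def has_room_for_preupgrade(free_bytes, required_bytes=PREUPGRADE_BOOT_BYTES):
    """Return True if the free space is at least the required amount."""
    return free_bytes >= required_bytes


def free_bytes_on(path):
    """Free bytes on the filesystem holding `path`, as reported by shutil.disk_usage."""
    return shutil.disk_usage(path).free
```

Calling has_room_for_preupgrade(free_bytes_on("/boot")) on the machine to be upgraded gives a quick yes/no answer before starting.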

For the record, I fully support release naming in Fedora and have no problems with Beefy Miracle or Spherical Cow.

Posted in development | Leave a comment

Mystifying edit icon at Launchpad

I work with Launchpad every now and then, and today, I wondered why the little yellow icons that are supposed to mean “edit” look the way they do.

Here is an example of what I mean: "This report is public [edit]"

Many months ago, when I was figuring out how to use Launchpad, I remember hovering my mouse pointer over icons and other screen elements systematically, trying to discover how to change and edit certain things in a project page. And it was a true revelation to discover that the button meant ‘edit’. Actually, I think it looks like a shoe-print.

I had a look around today to see if anyone else had the same confusion, and there is a bug report, ‘“Edit” icon is difficult to recognize’. It turns out the icon is supposed to be a pencil. If I try really hard, I can see the pencil. But otherwise, I still see a shoe. I don’t really see it as an exclamation point (!) as other people noted, since those are vertical, not slanted. It’s a little amusing that the bug has been open 3+ years, and no change has been made. A survey was done on Launchpad icons, which indicated a “high level of understanding” that the shoe meant “edit”, but also “danger”. If the people surveyed were already Launchpad users, this is unsurprising. What do uninitiated users see?

We’ve known about this for three years. Since nobody with the power to change it thinks it’s worth fixing, my hope is that it gets fixed accidentally when Launchpad does a visual refresh, which might happen in a few years. Until then, I can continue to enjoy sounding like a guru when I tell people that it’s an ‘edit’ button and they say ‘What? Really? I wouldn’t have guessed.’


Using git to collaborate on a paper

I’m working with others on a paper written in LaTeX.  It’s stored in a git repository.  I figured it would be easy for all of us to track and merge each other’s changes this way.  So my friend clones the repo, commits a change, and sends me an email to let me know.  At this point, I thought, great, let’s grab his changes, review and merge them.

I’m sure this is obvious and is documented elsewhere, but it wasn’t obvious to me. It’s not conceptually different from two programmers sharing code, which I’m sure is one of the common uses of git.

So I thought, well, I can just track his ‘master’ branch in my repo and merge the changes.

# this doesn't work
git branch --track friend-master/path/to/other/repo master

No, this doesn’t work.  I have to add my friend’s repo as a remote in my local repo first.

#this works
git remote add friend /path/to/other/repo

This says, “Hey, I’m interested in the repo at /path/to/other/repo, and from now on I’m calling it ‘friend’.” Does this mean I can track the branch now?

#not yet
git branch --track friend-master friend/master

No, first I have to fetch the remote so that my local repo is aware of what branches exist there.

#this grabs stuff from my friend's repo
git fetch friend

Specifying “friend” is necessary because git fetches from your default remote (probably “origin”) if none is specified. Once the fetch completes, friend/master exists in my repo as a remote-tracking branch, so I can merge it directly without creating a local tracking branch first.

git merge friend/master

Summary:

git remote add friend /path/to/other/repo
git fetch friend
git merge friend/master
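To try the summary end to end without touching a real paper, the steps can be replayed in throwaway repositories. This is a sketch under assumptions: the helper names, file name, and commit messages are made up, git must be on the PATH, and the `git log master..friend/master` review step is an extra that is not part of the three commands above.

```python
import os
import subprocess
import tempfile


def git(cwd, *args):
    """Run one git command in `cwd` and return its stdout as text."""
    cmd = ["git",
           "-c", "user.name=Example", "-c", "user.email=example@example.invalid",
           "-c", "commit.gpgsign=false", *args]
    return subprocess.run(cmd, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout


def commit_file(repo, name, text, message):
    """Write `text` to `name` inside `repo` and commit it."""
    with open(os.path.join(repo, name), "w") as fh:
        fh.write(text)
    git(repo, "add", name)
    git(repo, "commit", "-m", message)


def demo(workdir):
    """Replay the workflow; return (incoming commit lines, merged file contents)."""
    mine = os.path.join(workdir, "mine")
    friend = os.path.join(workdir, "friend")
    os.mkdir(mine)
    git(mine, "init")
    # Pin the branch name so the demo uses "master" whatever git's default is.
    git(mine, "symbolic-ref", "HEAD", "refs/heads/master")
    commit_file(mine, "paper.tex", "draft\n", "first draft")

    # My friend clones the repo and commits a change.
    git(workdir, "clone", mine, friend)
    commit_file(friend, "paper.tex", "draft\nfriend's edit\n", "friend's edit")

    # The three commands from the summary, with a review step before the merge.
    git(mine, "remote", "add", "friend", friend)
    git(mine, "fetch", "friend")
    incoming = git(mine, "log", "--oneline", "master..friend/master").splitlines()
    git(mine, "merge", "friend/master")

    with open(os.path.join(mine, "paper.tex")) as fh:
        return incoming, fh.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(tmp))
```

The list returned by the log step contains the commits that exist on friend/master but not yet on my master, which is what I would want to read before merging.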

It’s actually pretty simple. It might even be obvious, provided you already understand the way git works. Otherwise, you might find this useful.


Evaluating Kyoto Cabinet

Today, I wanted to see whether Kyoto Cabinet could do a better job with table lookups than MySQL.

I have a 1.7+ billion row table in MySQL that has three columns, a 64-bit int and two 32-bit ints. This yields about 28GB on-disk in MySQL with MyISAM, which is about right if you multiply out 1.7 billion by 16 bytes. The sole reason this table exists is to provide lookups on the two smaller ints from the larger. Naturally, an index is warranted, and in this case, the index takes up about the same amount of space. Okay, maybe a little more (29,478,814,827 bytes for the table and 30,052,789,248 bytes for the index). Did I mention that the index takes about a day to generate on a machine with gobs (128GB) of memory and SSDs?
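The multiplication is easy to check. The sketch below uses only the figures quoted above. It also notes that the data file size divides evenly by 17; reading that as 16 payload bytes plus one byte of per-row overhead is an assumption, not something I verified against the table.

```python
ROWS_APPROX = 1_700_000_000      # "1.7+ billion" rows
PAYLOAD_BYTES = 8 + 4 + 4        # one 64-bit int plus two 32-bit ints
TABLE_BYTES = 29_478_814_827     # data file size quoted above
INDEX_BYTES = 30_052_789_248     # index file size quoted above

GB = 10**9

print(f"payload estimate: {ROWS_APPROX * PAYLOAD_BYTES / GB:.1f} GB")
print(f"table on disk:    {TABLE_BYTES / GB:.1f} GB")
print(f"index on disk:    {INDEX_BYTES / GB:.1f} GB")

# If each row really occupies 17 bytes (assumption), the exact row count follows.
rows_if_17, remainder = divmod(TABLE_BYTES, 17)
print(f"rows at 17 bytes each: {rows_if_17:,} (remainder {remainder})")
print(f"index bytes per row:   {INDEX_BYTES / rows_if_17:.2f}")
```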

I figured, hey, I’m using it like a key-value store, so could I do better? Kyoto Cabinet seems like something to try, since I’ve heard good things about Tokyo Cabinet.

Here’s what I did:

For space efficiency, I stored the 64-bit int as an 8-byte key and packed the two 32-bit ints into an 8-byte value. I used KC’s HashDB and set HashDB::TLINEAR, tune_buckets to 2 billion (2 * 2^30), and tune_map to 16GB (16*2^30). TLINEAR is recommended for space efficiency, tune_buckets should be within a factor of 2 of the expected key count, and tune_map should reflect the expected overall db size. I think my values were in the right ballpark.
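The packing scheme can be sketched with Python’s struct module. The byte order and the choice of unsigned types are assumptions made for the illustration, as are the function names; the resulting bytes would be handed to whatever set and get calls the database binding provides.

```python
import struct

KEY_FORMAT = ">Q"     # one unsigned 64-bit int, big-endian (assumed)
VALUE_FORMAT = ">II"  # two unsigned 32-bit ints, big-endian (assumed)


def pack_key(big):
    """Encode the 64-bit lookup key as exactly 8 bytes."""
    return struct.pack(KEY_FORMAT, big)


def pack_value(first, second):
    """Encode the two 32-bit ints as exactly 8 bytes."""
    return struct.pack(VALUE_FORMAT, first, second)


def unpack_value(raw):
    """Decode an 8-byte value back into the two 32-bit ints."""
    return struct.unpack(VALUE_FORMAT, raw)
```

Fixed-width binary keys and values keep each record at 16 bytes of payload, which matches the row width used in the MySQL table.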

What I found was that KC looked very fast for small data sizes, but its insertion time seemed to increase linearly with the number of keys already inserted. Here is a plot relating insertion time (in seconds) for a batch of 64K keys versus the total number of keys inserted.

You can see that there are a couple bumps in the curve, but the times seem to keep increasing: 222 seconds for 64K when 2M keys have been inserted, versus 1 second (or less) for the first few 64K batches.  This doesn’t seem like it’s going to work for 2 billion.
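To put a number on that, here is a back-of-the-envelope lower bound. It assumes “64K” means 65,536 keys and “2 billion” means 2 * 10^9, and it optimistically pretends every remaining batch takes the 222 seconds measured at 2M keys instead of continuing to slow down.

```python
BATCH_KEYS = 64 * 1024        # "64K" read as 65,536 keys (assumption)
TARGET_KEYS = 2_000_000_000   # "2 billion" read as 2 * 10**9 (assumption)
SECONDS_PER_BATCH = 222       # measured once 2M keys had been inserted

batches = TARGET_KEYS / BATCH_KEYS
# Optimistic: assumes batches never get slower than the 222-second measurement.
optimistic_seconds = batches * SECONDS_PER_BATCH
optimistic_days = optimistic_seconds / 86_400

print(f"{batches:,.0f} batches, at least {optimistic_days:.0f} days")
```

Even this flat-rate estimate comes out at well over two months of loading, and the measured trend suggests the real figure would be far worse.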

MySQL is looking pretty good here. Though the index is large, it works out to roughly 17 bytes per record (30,052,789,248 bytes over 1.7+ billion rows), which doesn’t sound that much bigger than KC’s 10 bytes.

I don’t know if I have any alternatives.  Perhaps MongoDB. I think it has about 12 bytes in overhead per record, just because of BSON, but if it has a more reasonable insertion time, it may be worth a look.

 
