
LogCabin::appendEntry(1, "Hello, world!")

2014-12-16 13:20 | raft logcabin

This is the first in a series of blog posts detailing the ongoing development of LogCabin. This first entry catches up on the developments from when I started working with Scale Computing in November, so it's longer than most of the future updates will be.

The theme of this entry is getting started in a new environment. Up until now, I'd done nearly all of the development of LogCabin on my laptop and on the RAMCloud cluster. Running it somewhere new uncovered a bunch of implicit assumptions about the environment that were baked into the code, so it exposed a new set of issues and bugs. This is fairly inevitable when it comes to low-level systems code, and there's a lot of value in working through it. LogCabin is significantly easier to run now than it was before, and it should be easier for the rest of you to install on your systems, too.

Getting Started

At Scale, we've been working to integrate LogCabin into Scribe, the software that runs Scale's private cloud products. We needed LogCabin to be installable and deployable as a normal system service with an RPM (Scale's software distribution is currently based on RHEL 6). So, I added an option for LogCabin to behave more like a daemon, reparenting to init and writing to a log file instead of stderr. Others at Scale wrote a basic init script and an RPM spec file, which we intend to merge into the upstream repo.

On the client side of things, I moved the Client.h header that clients need to include. Client code used to #include <Client/Client.h>, which is fairly confusing; now it's #include <LogCabin/Client.h>.

I also started removing the need for DNS in clients. Instead of requiring a DNS name that resolves to multiple addresses, the Cluster constructor now accepts a semicolon-delimited list of addresses and will randomly connect to hosts in that list. Left to do are simplifying the README and other scripts to take advantage of this, and finding a way for dynamic membership changes to work without DNS (maybe we should expose a way for running clients to reset the list of hosts?). For now, Scribe still uses a DNS hostname set up by /etc/hosts.

Client API

As I was adding the first lines of code to write to LogCabin from Scribe, I noticed that LogCabin had no way to do a conditional create. In other words, you could write to a key on the condition that it had a particular value, but you couldn't write to it on the condition that it had no value. Now, Tree::setCondition(path, "") will match not only "files" in the replicated state machine with 0-length values but also files that don't exist at all. In the future, we may want to extend the condition system to include matching on hashes of files for large files, or maybe switch to using version numbers.

Internal Improvements

I also fixed two crashes in the RPC system. One was a race condition, in which a socket that was disconnected and closed was still being used to attempt to send outbound messages. An additional mutex protecting access to that file descriptor fixed the problem.

The other bug in the RPC system was harder to track down. This crash happened to Nate when he reconfigured the cluster to include an address that was invalid, so that DNS resolution failed. Instead of retrying periodically, the non-leader servers would PANIC with messages like:

1418256261.050477 RPC/ in refresh() WARNING[2:Peer(1)]: Unknown error from getaddrinfo("", "61023"): Temporary failure in name resolution
1418256261.050639 RPC/ in read() ERROR[2:evloop]: Error while reading from socket: Transport endpoint is not connected Exiting...

But the MessageSocket never should have been instantiated with an invalid address! And strangely, I couldn't reproduce this problem on my laptop, only on Scale's servers.

It turns out that connect() behaves incorrectly on Scale's servers. As far as I understand from the POSIX standard, connect(fd, NULL, 0), with a 0-length sockaddr, should return -1 and set errno to EINVAL. That's what it does on my laptop, and that's what it does on Scale's servers when running under strace or valgrind. However, if I run that on Scale's servers as a normal program, it returns 0 as if everything were ok, and then barfs later that the socket isn't connected! I couldn't find any other reference to this problem after a quick web search, but I suspect it's a glibc issue. I worked around it by making sure never to call connect() with an empty sockaddr.

A couple additional improvements to the client library's internals:

  • If a client ever issues a write request, the client library needs to open a session with the LogCabin cluster, and it'll periodically issue keep-alive RPCs to keep that session open. Unfortunately, if the cluster went down, these RPCs would prevent clients from exiting, as they'd retry their keep-alive RPCs forever. Fixing this required making that RPC cancellable, which caused a fair amount of code churn.
  • Clients that couldn't connect to a cluster were actually very aggressively trying to reconnect, causing 100% CPU usage and probably wasting network bandwidth. They're now rate-limited.

Travis CI


I also set up Travis CI to do automated builds for LogCabin. This started with wanting the code-level documentation (produced by Doxygen) to be available on a web server. It's a simple idea, but the documentation changes as the code changes, so static hosting wouldn't quite work. On each commit, Travis CI will now check out the new version of the code, build it, run the unit tests, and build the documentation. Then, it'll push the docs to a GitHub static page, which GitHub then hosts.

Travis CI runs these automated builds on fairly puny and/or overloaded VMs. I'm not blaming them (it's a free service), but this made some of the unit tests fail intermittently. Unfortunately, a few of the unit tests are just fundamentally sensitive to timing, like making sure that a condition variable waits for about the right amount of time. I renamed such tests to include "TimingSensitive" in the name, and, using a gtest filter, Travis CI will no longer fail the build for such tests.


Next time I'll discuss ongoing work to add timeouts to LogCabin's client library, including a battle against libstdc++ 4.4's broken support for clocks and time. Thanks to Scale Computing for supporting this work.

Grad School

2014-12-11 20:55 | raft logcabin

Ousterhout placing academic hood on Ongaro

Well, I spent the last five years getting my Ph.D. in Stanford's Computer Science department. I won't do that justice here, but I'll fill in the story briefly so that subsequent posts make sense. I was part of Professor John Ousterhout's group, which is primarily focused on RAMCloud, a large-scale in-memory distributed storage system.

I started working on RAMCloud soon after joining Stanford (Ryan made the first commit the month after I started), and I worked on various low-level parts of the system and master recovery. Eventually, I began to look into eliminating the single point of failure that RAMCloud's coordinator once was, and I became interested in using consensus to solve the problem. (To be fair, I never fixed the problem in RAMCloud; Ankita and John deserve the credit for that.)

I wasn't impressed with the existing consensus-based systems, so I started learning about Paxos, the algorithm that's nearly synonymous with consensus. I struggled through how to build a complete system using Multi-Paxos, and meanwhile, John questioned whether Paxos even had the right approach. He kept pushing on the idea of understandability, to find the solution that's the easiest for someone else to understand. He asked: what's the advantage to agreeing on log entries (slots) out of order if they then have to be applied in order? If the game was understandability, I just couldn't defend Paxos on this question.

Eventually, John went off during a weekend and proposed ALPO, the first version of the algorithm that matured into the Raft consensus algorithm. Raft turned into my thesis topic, and I developed an implementation of it in C++ called LogCabin. Meanwhile, Raft gained significant traction in industry, being implemented in a variety of systems in many different languages, and it's also been taught in a few distributed systems courses already.

Early on in RAMCloud's history, in April 2010, we held the RAMCloud Design Review: a group of friendly people from academia and industry came over to listen to our ideas for RAMCloud and give us feedback. As part of this feedback, we were advised to use ZooKeeper for the coordinator (which RAMCloud eventually did use) and were warned of the "danger in believing one should do Paxos from scratch or optimize it". I think that was pretty solid advice when interpreted as: if you start on this path, it will consume your life. Sometimes, though, getting side-tracked to work on an important problem is the right thing, especially in academia.

Now that I've graduated, I plan to continue to stay involved with Raft and help support the Raft community where I can. I've recently announced my plans to continue developing LogCabin with support from Scale Computing, and I'm excited to see LogCabin mature into a stable and production-quality system. I plan to post articles about this development here on a regular basis (RSS).

Tips for Running PowerPoint in CrossOver Office

2014-04-22 17:16

I use Linux exclusively but run Microsoft PowerPoint for some presentations. I won't go into the details of why here, but I wanted to record and share a few tips I use for getting it to run well. Your mileage may vary, obviously.

Most importantly, as of this writing, you want a version no later than PowerPoint 2010.

The first problem I had is that when I drag a selection rectangle or move objects around, the entire drawing area went black. The workaround I use gives Wine a big rectangle to manage and draw on. Run cxsetup, go to "Wine Configuration", under the "Graphics" tab, enable "Emulate a virtual desktop". You'll want to set the resolution (size of the rectangle) based on your monitor (I use 1590x840). I still get a black rectangle covering the contents of my selection (not the whole drawing area as before), but it's fairly usable that way.

The second problem was that some bullets weren't displaying correctly; they were showing an empty rectangle instead. I resolved this by copying a random version of wingdings.ttf (with that exact spelling, I guess) into ~/.cxoffice/Microsoft_Office_2010/drive_c/windows/Fonts. Then I had to quit my bottle in cxsetup; no idea what that means.

I find that it's pretty stable and works pretty well after those changes (though I don't use most of the advanced features or animations). Still, I'd suggest saving frequently. And I always render to PDF and use a native PDF viewer for the actual talks. That's a good idea in general but also necessary due to the fixed size virtual desktop, which won't fit nicely on different projector sizes.

Manual Window Placement in i3 (Part 2)

2013-05-06 21:39

This is the second part of a series on making the i3 window manager work the way I want. I left off last time with the goal of changing the way windows are placed as they are created, and I had a couple of pointers from the i3 hacking howto for where to start looking. This post covers how I've set up my test environment.

I started looking in src/manage.c but soon found my way into src/con.c, which does most of the grunt work surrounding containers. There are a ton of conditional branches in the i3 source code, so running i3 with debug logging on (i3 -d all) was essential in figuring out which code paths were being executed.

One particularly relevant log message,

Inserting con = x after last focused tiling con y
led me to con_attach(), the function in charge of placing a new window in i3's layout tree. I think that's one key function I'll need to change.

I started playing around with changing the layouts of things and creating extra containers in there but quickly got frustrated. The problem was that I was using my buggy version of i3 while iterating on the code and testing. Testing also became difficult, since running the tools to analyze what's going on requires opening new windows, but opening new windows affects i3's state.

A better approach is to run i3 inside a nested X server. This way you can keep your editor, browser, and other tools open outside of the test environment, and keep the test environment minimal, clean, and easy to reset.

I had used Xnest in the past, but I found that i3bar didn't display fonts for me under Xnest. I don't know what the problem was there, but I came across Xephyr, a replacement for Xnest that supports modern X extensions. Fortunately, Xephyr can run i3 and i3bar properly. Xephyr allows the nested server to grab the keyboard (toggled with Ctrl+Shift), which is quite handy for window manager development.

I'm also getting a lot of mileage from i3's contrib/ script, which uses i3's JSON-based IPC interface to show you a graphical representation of the layout tree. This script has been helpful in understanding the tree transformations that occur as I test my changes to i3. For example, it's showing me that I have a bunch of nested containers with only one child (oops). I've made a few minor improvements to the script, and Michael Stapelberg has already accepted a couple of these patches.

To use contrib/ with a nested X server, you need to help it find i3's IPC socket. It uses AnyEvent::I3 internally, whose default constructor finds the i3 running on your current DISPLAY. You don't want to launch the script with the DISPLAY set to the nested X server, since then it would also launch gv inside the nested X server. Instead, construct the AnyEvent::I3 instance as follows, for a nested X server running on DISPLAY=:1:

chomp(my $path = qx(DISPLAY=:1 i3 --get-socketpath));
my $i3 = i3($path);

Now that I can use my editor reliably and query what's going on in a controlled testing environment, I should be able to make some real progress.

Manual Window Placement in i3 (Part 1)

2013-05-01 00:22

I've been using tiling window managers for the past couple of years. I started with awesome, then Notion (a fork of Ion; Ion is no longer maintained), and now I'm in the process of moving to i3. For those of you who aren't familiar with tiling window managers, the screenshots all look the same. They all behave differently, though, and I guess you just have to find one that fits your mental model.

When you open a new window in most tiling window managers, your existing windows get rearranged or resized to make room for it. That's really the main idea of tiling, and it works reasonably well when opening your second or third window. Beyond two or three, depending on the screen size and applications, it starts to suck.

Awesome is a dynamic window manager, meaning it assigns each workspace a layout, and that layout determines how windows are placed as they are opened. A common layout involves a spiral of ever-shrinking window sizes. The first window opened will occupy the entire screen. The second window will take half the real estate from the first. Then the third window will take half the real estate from the second, etc.

The net result of this dynamic approach, however, is that window placement is unstable. When you launch a window (in a large enough tile to see the contents), your other windows get displaced and resized to make room. This is the single reason why I switched from awesome to Notion.

Notion's approach is very simple but powerful. Every tile is actually a tabbed set of windows. If you launch a new window, it creates a new tab in the same container. You get three commands for managing windows: split a container vertically, split a container horizontally, and unsplit. Split containers can be nested arbitrarily.

I really like Notion's basic approach. I just had a few minor gripes with Notion as a whole. Looking over my list, I don't think any are show-stoppers, and most of them I could probably fix with some configuration or minor hacks.

But I happened to come across i3, and I was impressed by the way they're managing their project. I'd recommend checking out their videos, including the lead developer's hour-long Google tech talk. Rare for these sorts of projects, it aims for well-documented code, has automated tests, and has an active community.

I was hoping i3 would work like Notion out of the box, but unfortunately their model is a bit different. i3 and Notion support the same layouts in principle: i3 splits workspaces into nested containers, where each container is either tabbed, split horizontally, or split vertically. However, i3 behaves differently when placing a new window. If you're in a tabbed container, yes, it creates a new tab. But if you're in a split container, it creates a new split, resizing your existing windows in that container. That's not what I want.

I'm not seeing any options to control this behavior, so it looks like I'm going to have to get my hands dirty and hack it up myself. Given their container model, it shouldn't be too hard in principle. I guess what I want is: if you're opening a window as part of a tabbed container, open a new tab (no change from before). If you're opening a window as part of a split container, put the current window in a tabbed subcontainer, and create the new window as a new tab there. Maybe that's it?

So I don't lose my place, the hacking howto has a couple of sections that seem relevant: "8. Manage windows (src/main.c, manage_window() and reparent_window())" and "9. What happens when an application is started?". More on this later once I've jumped into the code...

Update: part 2 explains how I set up my test environment.

Is It Worth the Time?

2013-04-29 16:38

Today's XKCD starts to answer: how much time should I spend making a routine task faster?

XKCD 1205: Is It Worth the Time?

I really like this comic. As a Ph.D. student, I have a lot of control over how I spend my time, and this question comes up a lot. (I'm even writing about analyzing how I spend my time.) Relative to the people I work with, I think I err on the side of spending more time optimizing my workflow, and I think programmers in general tend to do this more than others.

Obviously you shouldn't spend all your time optimizing. We joke that one optimizing friend (Aleks) will only need one keystroke by the time he's done; it'll set off some sort of complex scheme for something-or-other. The details are moot, of course, since he'll never reach that point.

Still, I don't think Randall's chart is the definitive answer. It can be rational to spend more time "optimizing" than you'd naïvely expect to shave off:

  • You may grossly underestimate how much time you shave off. If you're eliminating a frustration, in particular, the expected time should include lost productivity due to becoming frustrated. If these minor frustrations add up, eventually you'll just shut down.

  • Improvements to your workflow can be shared if they apply to other people; this multiplies their effect. This is why I'd rather have my labmates use the same tools as me.

  • Although doing a routine task is usually not interesting, optimizing a routine task often is. So at least for me, the path to getting something done may be shorter through optimization rather than through procrastination.

  • Improving your workflow makes you faster at improving your workflow. The experience and knowledge gained while doing this leads to making faster improvements next time around. It also exposes low-hanging fruit for other workflow improvements you may not have considered. This argument may seem circular, but it's not: you're shaving time off future optimizations, which shaves time off tasks that you wouldn't have otherwise optimized.

These arguments aren't earth-shaking, but at least there's some ammunition to use in defense of over-optimizing.

Update: Cory Doctorow made some good counter-arguments on Boing Boing.

Bullets in Inkscape

2013-04-13 20:12

Inkscape is a good, open-source drawing program for vector graphics. I'm currently using it to make a research poster, but unfortunately, Inkscape doesn't do bullets. This post discusses your options if you want to use bullets in your Inkscape drawing and introduces a simple Inkscape extension that makes this much easier.

Your first option is to use an external program like Scribus or TeX to generate the bullets and text, then import that into Inkscape. This seems like a lot of work to me. I don't want to flip between different programs or files for this.

The second option is to draw the bullets manually next to your text box. This is pretty time-consuming, but it works if you have just a few bullets to place and your text won't change much. A circle is a sane choice, but you can use whatever you want as a bullet.

The third option is to place Unicode bullets manually inside your text box. To do this, you're limited to using Unicode characters such as bullets, triangles, and dashes. You can find these on the Internet and copy-and-paste these into your text boxes. The main drawback with this approach is spacing: if you wrap your line, you need to insert spacing to indent the next line. The best way to get the same indent level as you had on the line above is to insert the same exact Unicode character as before, but this time make it transparent (or white). This is workable, but it gets really tedious if you have a lot of bullets. It's even more tedious if you want bullets of a different color — in my case, I wanted blue bullets and black text.

I created a simple Inkscape extension to make the third approach more tolerable. The basic idea is to just replace strings in text boxes with bullets and spacing. It's easiest if I just show you:

extension transforms special characters into bullets

Every time you run the extension, it applies the following replacements:

Input   Spelled out                              Replaced with
*       asterisk, space, space                   top-level bullet
\       backslash, space, space                  indent same as top-level bullet
   -    space, space, space, dash, space         second-level bullet
   \    space, space, space, backslash, space    indent same as second-level bullet

Here's how to add this extension to Inkscape. You'll need to create two files in ~/.config/inkscape/extensions/. The first file, bullets.inx, describes to Inkscape how to display and run the extension; it's just boilerplate. The second file,, is the code that gets executed when you run the extension:


#!/bin/sh

# top-level bullet and space
bullet='<tspan style="fill:#3465a4;">●<\/tspan> '
bulletnext='<tspan style="fill:none;">●<\/tspan> '

# second-level bullet and space
dash=$bulletnext'<tspan style="fill:#3465a4;"> –<\/tspan> '
dashnext=$bulletnext'<tspan style="fill:none;"> –<\/tspan> '

# the last argument to this script is the filename to read from
shift $(( $# - 1 ))

sed -e "s/\\*  /$bullet/" \
    -e "s/\\\\  /$bulletnext/" \
    -e "s/   - /$dash/"  \
    -e "s/   \\\\ /$dashnext/" \
    "$1"

As you can see, there's not much magic here. The script just runs sed to find-and-replace a few strings with Unicode characters of the desired color.

And that's it. It's not the prettiest thing in the world, but now you can create bullets in Inkscape without tearing your hair out.

Color GDB Prompt

2013-04-07 14:50

To add color to your GDB prompt, place the following in your ~/.gdbinit file:

# keep trailing space on next line
set prompt \033[0;33m(gdb)\033[0m 

This makes it easier to find your place visually.

Leveraging Web Technologies for Local Programs

2013-04-03 21:38

Client-side web programming (HTML, JavaScript, CSS) has improved greatly in recent years. You could make a reasonable argument that this is the best environment for building user interfaces today: it's quick, it looks decent, there are tons of libraries and stylesheets available, and the developer tools are quite good. Firefox even has a 3D Inspector view these days — if that doesn't convince you, I don't know what will.

Meanwhile, node.js has become a reasonable platform for server-side JavaScript that interacts with the outside world. Node provides libraries for accessing the filesystem, making network requests (not just HTTP), and running arbitrary processes. And it has over 25,000 third-party libraries to help you out.

This environment already makes sense for building local applications. You'd run a node.js server locally to handle interaction with the outside world, and you'd load up a web page in your browser to provide the user interface. Your browser and server would communicate over HTTP requests as usual. Hide the browser's toolbars and it doesn't even look out of place.

The node-webkit project takes this one step further: it integrates the node.js libraries into a WebKit environment. So now instead of splitting your local application into a server and a client component, having these communicate over HTTP, and having to launch these separately, you can structure these applications in a much simpler way. There's no HTTP involved: your JavaScript just acts on user input directly, calling into node.js libraries when it needs to.

I've been playing around with node-webkit a bit the last few days, and I must say I'm impressed with how productive of an environment it is to program in. The number of JavaScript libraries out there really makes up for the ugly bits of the language. I feel like I've been shying away from building graphical applications because of the overhead of doing so with something like PyGTK; I think node-webkit might change this. Node-webkit applications require so little boilerplate that this approach makes sense even for one-off scripts where you'd like user interaction.

SQLite Database in Git

2013-04-03 19:16

I store nearly all files of even moderate importance in Git (including this blog post). These are usually plain-text files, but sometimes it's necessary to put binary files under version control. Unfortunately, those are typically difficult to diff and merge, but I recently discovered some features of Git that make this less painful. This blog post focuses on SQLite database files, but at least some of it applies to other binary file types.

My problem specifically involved managing changes to an SQLite database that contained results for a research study. The database was changing as new results arrived and were processed, and it was important to me to track its changes in case of manual or programming errors.

SQLite stores its database in a pretty complex format (described here). While a diff of two SQLite databases can sometimes be human-readable, this entirely depends on the binary data that happens to fall right around the modified values. It's doable but sometimes requires a lot of annoying horizontal scrolling past screenfuls of control characters. Life's too short for that.

SQLite can dump entire databases out as SQL statements, and Git can be configured to do this when generating diffs. In a .gitattributes or .git/info/attributes file, give Git a filename pattern and the name of a diff driver, which we'll define next. In my case, I added:

db.sqlite3 diff=sqlite3

Then in .git/config or $HOME/.gitconfig, define the diff driver. Mine looks like:

[diff "sqlite3"]
    textconv = dumpsqlite3

I chose to define an external dumpsqlite3 script, since this can be useful elsewhere. It just dumps SQL to stdout for the filename given by its first argument:

#!/bin/sh
sqlite3 "$1" .dump

At this point, git diff should show you plain-text diffs, as should browsing Git commits. There's still one problem left: sometimes SQLite's binary database will change, but the actual database contents remain the same. This results in a git status that says the database has changed but a git diff that says it hasn't.

I don't know enough about SQLite to know why this happens. I thought it was because SQLite doesn't compact free space right away in its database files, but I ran into a case where even if I vacuum two database files with identical contents, they still have different binaries.

One brute-force solution would be to dump the database contents to SQL and read them back into a "fresh" SQLite database. This should result in a canonical binary database, since SQLite doesn't seem to store anything like a timestamp in there. I suspect you could have your diff driver do this automatically every time it runs, but I haven't tried it yet.

rlwrap: readline Wrapper Program

2011-12-28 17:56

Dealing with a basic command-line prompt in a loop can be painful, so many programs, such as shells, interactive programming languages and debuggers, provide a more featureful prompt. For example, pressing the up and down arrows in a good prompt will flip through previously entered input lines. Programs will often make use of the GNU readline library for this functionality.

If you need to use a program that only has a basic prompt, you may be able to wrap it with the program rlwrap to get some more advanced features. From the man page:

rlwrap runs the specified command, intercepting user input in order to provide readline's line editing, persistent history and completion.


There are many options to add (programmable) completion, handle multi-line input, colour and re-write prompts. If you don't need them (and you probably don't), you can skip the rest of this manpage.

For example, I recently used rlwrap with jdb, the Java debugger, and Ikarus, a Scheme compiler.

The Cost of Exceptions in C++

2011-11-10 11:25

Most people seem to have an opinion as to whether exceptions in C++ are slow or fast, but very few people have put any useful numbers out there. Here's a lower bound:

#include <inttypes.h>
#include <stdio.h>

const uint64_t count = 1000000;

inline uint64_t
rdtsc()
{
    uint32_t lo, hi;
    asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
    return (((uint64_t) hi << 32) | lo);
}

int
main()
{
    // Measure the cost of throwing and catching an int.
    uint64_t start = rdtsc();
    for (uint64_t i = 0; i < count; i++) {
        try {
            throw 0;
        } catch (int) {
            // do nothing
        }
    }
    uint64_t stop = rdtsc();
    printf("Cycles per exception: %lu\n",
           (stop - start) / count);
    return 0;
}
The code just measures the time it takes to throw the number 0 as an exception and catch it.

Using g++ version 4.4.4, compiled with -O3 in 64-bit mode, and running on an otherwise idle Intel Xeon E5620 CPU at 2.4 GHz, this benchmark takes 2.18 to 2.21 microseconds on average per exception.

So the cheapest exceptions on a modern CPU would cost about 2 microseconds. When you throw an exception in a real project rather than a microbenchmark, this cost is significantly higher. Anecdotally, we typically see times closer to 5 microseconds for exceptions in the context of RAMCloud, the project I work on at school.

Shared Feed Items

2010-03-13 22:34
Google Reader logo

Lately, I've been sharing items from my news feeds on Google Reader that I find interesting. You can view these on the web or as an Atom feed. (You don't need to be a Google Reader user yourself to view my shared items.)

Upgrading to Debian Squeeze

2010-02-21 15:48
Debian logo

I've switched my laptop over from Debian Lenny (stable) to Squeeze (testing). While I made it this far with the aging software available in Lenny by pulling newer packages from backports and unstable, I finally gave in for Python 2.6.

In case it helps anyone else, my laptop is a Lenovo Thinkpad T61 with an Intel graphics chip and wifi card. Much of what broke is related to kernel mode-setting (KMS). Here's the list:

  • I had vga=794 in my /etc/default/grub, which is no longer compatible. On Squeeze's kernel with this setting, the console framebuffer did not display anything. I think the GRUB_GFXMODE variable is supposed to replace it.
  • X failed to start and failed to let me switch back to the consoles on Squeeze's kernel (linux-image-2.6.32-trunk-amd64 2.6.32-5). I think you need an Intel graphics chip and 4GB of RAM to enjoy this bug (which seems at least related to FreeDesktop's Bug #25510). If you're so lucky, the kernel in unstable fixes the problem for me (linux-image-2.6.32-2-amd64 2.6.32-8).
  • I had a residual config file that apt did not purge at /etc/udev/rules.d/z60_xserver_xorg_input_wacom.rules that caused screenfuls of warnings early in the boot process.
  • Something has broken ifconfig wlan0 up on boot, but an easy work-around is to turn the hardware radio kill switch to off and then back to on.
  • Despite my efforts, Bluetooth is enabling itself. I'll have to find a way to turn that off again.
  • Instead of using module-assistant to build the ThinkPad SMAPI module, I installed the tp-smapi-dkms package.
  • IPython doesn't ship with /usr/bin/ipython2.6 yet, but copying /usr/bin/ipython2.5 seems to work fine. (It uses $0 to determine which version of Python to call.)

Book Log

2010-02-13 21:19

In addition to my movie log, I've now started a book log going back to last summer.

100 Movies by May 10th?

In other news, the movie log is now up to 87 films. I hadn't yet cranked out any numbers with it, but this announcement is as good a time as any to start, right?

The movie log seems to grow roughly linearly over time. Assuming that a linear model fits the data and that I am capable of basic statistics (neither of which we should count on), I will have watched 100 films by the 716th day since the start of the log, which comes out to May 10, 2010. Here's a pretty graph: movies watched over time

Email Account Deleted

2009-11-29 23:21

Rice has deleted my undergraduate email account, , since I am no longer a student there. If you tried to send to that address and received a bounce notification, please resend your email to the same username at instead.

Twin Peaks iPhone Panorama

2009-11-07 18:27

I went up to Twin Peaks in San Francisco with Jay a few weeks ago. It was a nice view but kind of a worst-case scenario for a photo: my iPhone camera (VGA), poor lighting as the sun was setting, and strong winds.

That day I took a bunch of overlapping shots with my phone. Then I used the GIMP's automatic white balance correction on each of them. Next I stitched them together with Hugin, and finally I edited the stitched image with the GIMP. The following mediocre image is the result (click for the full 1534x652 image):


The 1 megapixel result really wasn't worth the time. Even in the thumbnail, it's easy to see that multiple source images are contributing their different colors and brightness levels. Maybe Hugin could correct for more of this with the proper settings, but I haven't taken the time to learn it well enough to know how. Although mine has a slightly larger angle, I think the one on Wikipedia still wins.

I also took this surprising shot:

slanted Golden Gate bridge

It turns out the iPhone's cheap camera scans horizontally from top to bottom. As I was in a car moving left, the lines lower on the image were scanned later and appear shifted to the right. Kirk Mastin has an interesting post about this rolling shutter effect and what you can do with it. Jeffrey Erlich also has an awesome album that makes use of this effect.

Xfce Stopwatch Plugin

2009-08-16 22:49 | xfce

I needed an excuse to try Mike's Vala bindings for Xfce, so I created a new little plugin for the panel, the xfce4-stopwatch-plugin.

In the original release announcement on July 28th, I wrote:

This is the first release of the stopwatch panel plugin, which you can use to time yourself on different tasks. It's stable and usable, but quite minimal still.

The functionality is best summarized by the screenshots on the web site.


From their web site,

Vala is a new programming language that aims to bring modern programming language features to GNOME developers without imposing any additional runtime requirements and without using a different ABI compared to applications and libraries written in C.

Instead of having to write tons of boilerplate code to create new GObjects in C and for other common tasks in developing GTK-based applications, Vala builds these features into the language. The Vala code you write passes through the Vala compiler, which produces GObject-based C code. From there, GCC compiles that to a binary as usual. There is no runtime, so Vala-produced code can run as fast as hand-coded C.

Vala makes it easy to write fast, object-oriented code for GTK-based projects. With Mike's Xfce bindings for Vala, you gain access to Xfce's libraries from Vala, letting you write panel plugins or other Xfce projects in Vala. It's a cool idea and something I definitely wanted to try.

Developing the Stopwatch Plugin

In general, Vala is pretty easy to write if you've worked with GObject before. I did hit a few bugs while developing even this simple plugin, so it's evident that Vala and the Xfce bindings aren't mature yet:

  • I filed GNOME Bug 587150, a bug in Vala's POSIX bindings for the time_t type. Vala treats it as a GObject instead of an integer, which makes it unusable in many of the ways you'd want to pass it around a program. This bug hasn't seen any attention yet, but I've worked around it for Stopwatch by not using time_t.

    Update: Evan Nemerson fixed this one.

  • I patched a small bug in Xfce's Vala bindings for the XfceHVBox widget. The Vala compiler was producing calls to xfce_hv_box_new() instead of xfce_hvbox_new(), which of course caused a problem when GCC tried to resolve the symbol.
  • I also filed GNOME Bug 589930, a bug in Vala's generated code for sscanf. It always added an extra NULL argument at the end of the arguments list. Jürg Billeter fixed this one quickly with this commit, which made it into Vala 0.7.5.

Despite these hurdles, writing the Stopwatch plugin in Vala has been a pleasure. Admittedly the plugin doesn't do much, but the code is very short and straightforward.

Stopwatch will probably see just one or two more releases before it's feature-complete. I'd also like to port the Places plugin to Vala at some point, but I'm waiting to see how volume management plays out once ThunarVFS is gone.

Lighttpd Fails to Bind to Localhost

2009-08-05 10:21

I installed the web server lighttpd on my laptop to test some configuration settings. As I didn't want to expose the server on the network, I uncommented server.bind = "localhost" in /etc/lighttpd/lighttpd.conf.

Then, restarting lighttpd failed with the following error:

(network.c.201)getaddrinfo failed:  Name or service not known ' localhost '

This is lighttpd 1.4.19-5 from main on Debian Lenny.

I could still ping localhost, and my /etc/hosts file looked fine. Finally, I checked the line of code the error points to (network.c line 201) and noticed it's part of an IPv6-specific chunk of code.

I found I could work around this issue by disabling IPv6 entirely in /etc/lighttpd/lighttpd.conf. For the uninitiated, comment out this line:

## Use ipv6 only if available.
include_shell "/usr/share/lighttpd/"

Other Reports of This Issue

A couple reports of the same problem can be found on the old lighttpd forums, but no resolution was reached. Unfortunately, I can't reply there because those forums are now locked, and historical threads were not copied to lighttpd's new forums. The first report was from Debian's 1.4.19-1 package, and the second report does not identify the version.

A post on the debian-user-spanish list reports the same problem on Debian Lenny but received no replies.

That mailing list post does point to Debian bug 489063 (which doesn't come up on Google when you search for the error message). There, Pierre Habouzit, one of lighttpd's maintainers on Debian, suggests using server.bind = "::1" instead of server.bind = "localhost" when IPv6 is enabled. This will start up the server without errors, but then I can only access it as http://ip6-localhost/ (not http://localhost/).


This is a pretty annoying little issue, and it hasn't been fully resolved. At a minimum, this:

## bind to localhost only (default: all interfaces)
# server.bind                = "localhost"

should be:

## bind to localhost only (default: all interfaces)
## use ::1 when IPv6 is enabled or localhost for IPv4
## (see Debian bug #489063)
# server.bind                = "::1"
# server.bind                = "localhost"

That would at least point people in the right direction.

I've sunk enough time into this for now, though. I'll post an update here if I pursue this any further.

Chef Roger's Knife List

2009-08-02 12:56 | low-tech

Last semester at Rice, I took the class Cooking with Chef Roger. The man is passionate about his knives, and he gave us a list of brands he recommends. From one of the few scraps of notes that survived, here is Chef Roger Elkhouri's list of quality brands for chef's knives (French knives):

chef's knife

Mid-Range Brands

Top-of-the-Line Brands

Xorg.conf for QEMU/KVM

2009-06-30 22:02

If you want to use a large resolution with a QEMU or KVM virtual machine, you'll need to manually specify a few things in xorg.conf. Out of the box, you can usually only use resolutions up to 800x600, although Fedora and Ubuntu have patched this up to 1024x768.

I created this xorg.conf to work with larger resolutions. With it, I was able to use up to 1280x1024 with the default emulated graphics and up to 1920x1200 when passing the -std-vga option to QEMU or KVM.

To make use of this:

  1. Back up your existing /etc/X11/xorg.conf in your virtual machine, if any.
  2. Save the file to /etc/X11/xorg.conf in your virtual machine:
    sudo wget -O /etc/X11/xorg.conf \
  3. If you want to use a resolution other than 1280x1024, modify the Modes line to suit your needs.
  4. Start or restart your virtual machine's X server.

If you're having problems, try passing QEMU/KVM the -std-vga flag.

Cgit Hacking

2009-06-17 11:45

Last week I hacked a couple new features into cgit, a web interface for Git, since it's the one I use. I added https:// URLs for the Atom feed and also syntax highlighting when viewing files.

HTTPS URLs for Cgit's Atom feed

Cgit generates Atom feeds so that you can keep track of changes from your feed reader. Unfortunately, that requires a full URL, which it assumed started with http://. This obviously didn't work for https://-only installations.

I modified cgit to check the HTTPS CGI variable. If it's set to on, cgit now generates full URLs starting with https://. While this isn't part of the official CGI spec, most servers will set it, including Apache and lighttpd.
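For illustration, here's the idea sketched in Python (cgit itself is written in C, and the helper names below are hypothetical, not cgit's actual code):

```python
import os

def scheme_from_cgi() -> str:
    # CGI gateways such as Apache and lighttpd set HTTPS=on for TLS
    # requests, even though the variable isn't part of the official
    # CGI spec.
    return "https" if os.environ.get("HTTPS") == "on" else "http"

def full_url(host: str, path: str) -> str:
    # Build a full URL using whichever scheme the request came in on.
    return f"{scheme_from_cgi()}://{host}{path}"
```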

Lars Hjemli, the maintainer of cgit, merged in my change, so it should be part of a future cgit release:

This looks good. I've merged it into my wip-branch on where I'll let it cook for a little while before merging to my master.

Syntax Highlighting for Cgit

Cgit is useful for browsing around a project's history, but it didn't do syntax highlighting for source code. This made it unpleasant to use for reading complete files (as opposed to diffs).

I modified cgit to make use of the highlight program, when available, to color source code. If highlight is unavailable or fails, cgit falls back to the old black-and-white view.

While the patch is small and self-contained, it's specific to highlight and just tacked on in the source code. Lars didn't take this one:

I like the result, but I think the implementation has to be more generic. And I'm currently about to add support for a few plugins/hooks to cgit which I think can be used to achieve the same result so lets see how that works out first, ok?

I'll be working with Lars on getting a cleaner solution merged into his tree once he's added support for plugins. In the meantime, feel free to use the code from my repository, which seems to work just fine.

Both of these features can be found on my cgit repo:

  • git clone
  • cgit front-end (running here with both of these changes)
See the https and highlight branches, respectively. Both were branched from cgit's master branch.

Tabs in Vim

2009-05-28 23:51

Version 7 of Vim introduced tabs to the editor, and these are a few of my tab-related tips. If you aren't familiar with tabs in Vim, start with the basics on The Golden Ratio or

Open Files in Tabs

If you want to open multiple files in their own tabs in a new Vim session, use the -p flag on the command line for vim or gvim. For example, to open all files in the current directory, use the following:

vim -p *

When you give Vim multiple files to edit, its default behavior is to use several buffers. If you want to use tabs as the default behavior instead (that is, without typing the -p flag every time), set up a couple shell aliases. For bash, place these in your ~/.bashrc:

alias vim='vim -p'
alias gvim='gvim -p'

Also, Vim will open a maximum of 10 tabs like this by default. To increase that limit to, for example, 50, add the following to your ~/.vimrc:

set tabpagemax=50

Easier Tab Navigation

When you have more than a few tabs open, it can become difficult to navigate them with only the keyboard. You can use {count}gt to go to the count-th tab (starting with 1), but counting them yourself is a waste of time. Placing the tab number on its label solves this problem.

Vim tab labels

You can see how I set a custom tab label in this commit to my Vim configuration repository. The blog post on The Golden Ratio has another custom tab label you could check out.

Overlooked Python Built-Ins

2009-05-22 23:52

So, I just realized that I re-implemented two built-in Python functions on a small project I'm working on for ETSZONE. I just didn't know that these existed, so I'm writing about them here in case you've overlooked them too.

sorted
This is useful if you want to sort a copy of a list. Use sorted() instead of copying the list and then using list.sort().

This was my re-implementation (and I think I still like its name better):

def sort(seq, **args):
    x = list(seq)
    x.sort(**args)
    return x

The sorted function has been available since Python v2.4.
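A quick illustration of the difference:

```python
nums = [3, 1, 2]

# list.sort() sorts in place and returns None, so sorting a copy
# takes two steps:
copy = list(nums)
copy.sort()

# sorted() does both steps at once and leaves the original alone:
assert sorted(nums) == [1, 2, 3]
assert nums == [3, 1, 2]

# It accepts the same keyword arguments as list.sort():
assert sorted(nums, reverse=True) == [3, 2, 1]
```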

enumerate
This is useful when you want a foreach loop, but you also need a loop counter around. Use enumerate() instead of keeping a counter elsewhere.

For example, I was writing out a spreadsheet with ooolib-python. For each spreadsheet cell to write, I had to specify row and column indexes. I could write more natural loops with enumerate, while still having a counter to use as a row or column index.

This was my re-implementation (and its name would have never caught on):

def indexiter(iterable):
    return zip(range(len(iterable)), iterable)

The enumerate function has been available since Python v2.3. Read about the optional start parameter in the docs - it looks useful, but it's new in Python 2.6.
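Here's a sketch of the spreadsheet pattern described above. The ooolib call itself is elided; the (row, column, value) tuples and the header labels are just stand-ins for the actual cell writes:

```python
headers = ["title", "director", "year"]

cells = []
for col, text in enumerate(headers):
    # col counts up from 0 as we walk the headers, so it can serve
    # directly as the column index without a separate counter.
    cells.append((0, col, text))
```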

This shows that it's a good idea to occasionally browse back through the very basic support a language gives you, since you might just find a couple useful tools in there that you had overlooked. If you're into Python, start here.

Stack Overflow DevDays Registration

2009-05-12 21:17


I just read about Stack Overflow DevDays on Joel on Software (post):

It's going to be in October, in five separate cities. In each city, we're planning a one-day event.

We decided to cram as many diverse topics as possible into a single day event. Like a tasting menu at a great restaurant, we'll line up six great speakers in each city.

This is not going to be just a Java conference or a .NET conference or a Ruby conference. This will be completely ecumenical. We'll have somebody to introduce Microsoft's new web framework, ASP.NET MVC, but we'll also get someone to talk about writing code for Google's new mobile operating system, Android. And in each city, we'll find one local computer science professor or graduate student to tell us about something new and interesting in academia.

I picked up one of the $10 student tickets for San Francisco. Now, I'm not a big Stack Overflow user, but, at that price, I had to go for it. I'm especially hoping to hear more about the first 4 topics listed: Android, Objective C and iPhone development, Google App Engine, and Python.

Extract Unique Lines From a File

2009-05-07 23:52

If you want to get rid of duplicate lines from a file or pipe, use either of these equivalent commands:

sort -u
sort | uniq

For example, maybe you're searching for another front-end to libpurple, the library underneath pidgin. You try to use apt-cache rdepends but find the output is cluttered with duplicate entries (bug #335925).

$ apt-cache rdepends libpurple0 | tail -n +3 | sort

Note that I've trimmed off the header (with tail) and sorted the list (with sort) here to make this more obvious.

Using the above tip to see only unique lines, you can easily work around this bug:

$ apt-cache rdepends libpurple0 | tail -n +3 | sort -u

Off-Brand Q-tips

2007-12-28 23:50 | low-tech

To start off this blog, I'm writing about things you stick in your ear. I suspect I'll end up writing about techier subjects soon enough. Nevertheless, it's probably worthwhile to attempt to set a precedent of, at least occasionally, writing about something low-tech.

Q-tips, or rather cotton swabs, always warn you not to insert them into your ear canal. After all, they officially have a variety of legitimate uses. Let's face it though: they were created for ear cleaning, so they work rather well for that.

Warning on Q-tips package

Well, cotton swabs aren't something you need to buy very often. You have to run out of them to realize just how nice they are. My roommate Matt and I ran out of cotton swabs on Monday a couple weeks ago. So, of course, that Wednesday I had a doctor's appointment. The ear thermometer must have had a fun time in there...

Anyway, I was still pretty thankful we ran out. We had the off-brand, wannabe Q-tips before. The ones with a tiny amount of cotton on each end. The ones that will not give until they entirely bend in the center. I've grown to hate the off-brands with a passion, and yet I seem to keep encountering them.

Per unit price of off-brand cotton swabs

It's marketing. The real Q-tips are more expensive at the store. The off-brands cost a couple bucks less, and you get more - the price per unit of the off-brands can't be beat. The problem is, you don't want more. You really don't want more of them. You'll go home with your 300-pack of $1.99 cotton swabs, try to clean your ear with one, and get just a little pissed off at how ineffective it is. The next day after your shower, you'll again be a little pissed off. Even if you share your cotton swabs with someone else, you're still going to be a little pissed off, every single day, for about the next 5 months. Is that really worth saving a couple bucks?