The best engineering interview question I’ve ever gotten, Part 1

It’s been a while since I was on the receiving end of a software engineering interview. But I still remember my favorite interview question. It was at MemSQL circa 2013. (They haven’t even kept their name, so I assume they’re not still relying on this specific interview question. I don’t feel bad for revealing it. It’s a great story that I tell people a lot; I’ve just never blogged it before.)

Okay, so this isn’t a “question” per se; it’s a programming challenge. I forget how much time they gave for it. Let’s say three hours, counting the time to explain the problem. [UPDATE, 2024: I’m reliably informed that it was only one hour, not three; adjust the estimates below accordingly.]

Since MemSQL was a database company, this is a database challenge.

Introducing memcached

You know about memcached? No? Well, it’s an in-memory key-value store. (Its About page on memcached.org has the overview.) Let’s download its code and build it so I can show you what it does.

You may need to brew install libevent and maybe some other stuff in order to build memcached successfully. It won’t be too difficult to figure out; but anyway, wrangling with your environment wasn’t part of the test. You can assume interviewees were given access to a Linux box with all the right dependencies in place already.

For the authentic 2013 experience, let’s bypass the GitHub repo and untar a contemporary source distribution:

curl -O https://memcached.org/files/memcached-1.4.15.tar.gz
tar zxvf memcached-1.4.15.tar.gz
cd memcached-1.4.15
./configure
make

Now we’ve built the memcached executable. Let’s start it running:

./memcached

We can talk to the server via the default memcached port, port 11211. Its protocol is basically plain text, so we can use plain old telnet to talk to it. (If you don’t have telnet anymore, substitute the words nc -c for telnet.)

$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Playing with memcached

memcached is a key-value store. That means we can tell it to remember something (an association between a key and a value), and then later ask for that key again, and it’ll tell us the value it remembered. In memcached, keys are always ASCII strings and values are always arbitrary byte streams (which means you must specify their length manually). For example, type this into your telnet session:

set fullname 0 3600 10
John Smith

This tells memcached to remember an association between the string key fullname and the 10-byte value John Smith. The other numbers on that line are a “flags” value (0) to remember alongside the byte-stream value; and an expiry timeout (3600) measured in seconds, after which memcached will forget this association. Anyway, after you type these two lines, memcached will respond:

STORED
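If you’d rather see that request as bytes than as keystrokes, here’s a minimal Python sketch of what ends up on the wire. The helper name build_set_command is made up for illustration and isn’t part of any memcached client library. The one detail it adds to the telnet picture is that each line of the text protocol ends in \r\n.

```python
def build_set_command(key: str, value: bytes, flags: int = 0, exptime: int = 3600) -> bytes:
    """Build the bytes of a memcached text-protocol `set` request.

    Hypothetical helper for illustration; not taken from any client library.
    """
    # Header line: command, key, flags, expiry in seconds, payload length in bytes.
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    # The payload goes on its own line, terminated by CRLF.
    return header + value + b"\r\n"


print(build_set_command("fullname", b"John Smith"))
# b'set fullname 0 3600 10\r\nJohn Smith\r\n'
```

Computing the length from the value itself is the point: get it wrong by hand in telnet and the server will misread where the payload ends.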

Now you can retrieve the value of fullname by typing into the same telnet session:

get fullname

memcached will respond:

VALUE fullname 0 10
John Smith
END
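That reply is just as easy to take apart as the request was to put together. Here’s a rough Python sketch of a parser for it, assuming the whole reply has already been read into one bytes object. The name parse_get_response is my own, and this is not how any real client library does it.

```python
def parse_get_response(data: bytes) -> dict:
    """Parse a complete text-protocol `get` reply into {key: (flags, value)}.

    Hypothetical helper; assumes `data` holds the entire reply through END.
    """
    results = {}
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        line = data[pos:eol]
        pos = eol + 2
        if line == b"END":
            return results
        # Header line looks like: VALUE <key> <flags> <bytes>
        parts = line.split(b" ")
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError(f"unexpected line: {line!r}")
        length = int(parts[3])
        # Take exactly `length` bytes, since the value may itself contain CRLF.
        results[parts[1].decode("ascii")] = (int(parts[2]), data[pos:pos + length])
        pos += length + 2  # skip the value and its trailing CRLF


reply = b"VALUE fullname 0 10\r\nJohn Smith\r\nEND\r\n"
print(parse_get_response(reply))
# {'fullname': (0, b'John Smith')}
```

This is why the length field matters: the parser never searches the value for a terminator, it just counts bytes.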

You can overwrite the value associated with fullname by issuing another set fullname command. You can also ask memcached to modify the value in certain ways; for example, there are special dedicated commands for append and prepend.

append fullname 0 3600 6
-Jones
STORED
get fullname
VALUE fullname 0 16
John Smith-Jones
END

Of course if you wanted to append -Jones to fullname from within a client program, you could do something like this:

# pip install python-memcached
import memcache
mc = memcache.Client(['127.0.0.1:11211'])
v = mc.get('fullname')    # get the old value from memcached
v += '-Jones'             # append -Jones
mc.set('fullname', v)     # set the new value into memcached

But the advantage of memcached’s dedicated append command is that it’s guaranteed to execute atomically. If you have multiple clients connected to the same memcached server, and they’re all trying to append to the same key at the same time, the get/set version might cause some of those updates to be lost, whereas with append you’re assured they’ll all be accounted for.
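To make the lost-update problem concrete, here’s a small simulation that needs no server at all. FakeStore is a made-up in-process stand-in for memcached, and racy_appends forces the worst-case interleaving (every client reads before any client writes) so the result is deterministic.

```python
class FakeStore:
    """In-process stand-in for a key-value server (not memcached itself)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value

    def append(self, key, suffix):
        # Read-modify-write happens in one step, with no gap for another client.
        self.data[key] += suffix


def racy_appends(store, key, suffixes):
    # Every client reads first, then every client writes back its own result.
    snapshots = [store.get(key) for _ in suffixes]
    for old, suffix in zip(snapshots, suffixes):
        store.set(key, old + suffix)


def atomic_appends(store, key, suffixes):
    for suffix in suffixes:
        store.append(key, suffix)


store = FakeStore()
store.set("fullname", "John Smith")
racy_appends(store, "fullname", ["-Jones", "-Brown"])
print(store.get("fullname"))   # John Smith-Brown

store.set("fullname", "John Smith")
atomic_appends(store, "fullname", ["-Jones", "-Brown"])
print(store.get("fullname"))   # John Smith-Jones-Brown
```

In the racy run, the second client’s set overwrites the first client’s result, so -Jones silently disappears.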

Another dedicated command that executes atomically is incr:

set age 0 3600 2
37
STORED
incr age 1

memcached responds with the value after the increment:

38

This response is useful because other clients may be updating the same key at the same time. If you issued a separate get age command, the value you got back might already include several other clients’ updates, not just your own. If you’re intending to use the value as a serial number, or a SQL primary key, or something like that, then it’s very good that there’s a way to see the incremented value atomically.

memcached remembers the incremented value too, of course:

get age
VALUE age 0 2
38
END

Notice that 37 and 38 are still being stored and returned as byte-strings; they’re decoded from ASCII into integers and back as part of the atomic operation. If you try to incr a non-integer value, you get an error:

incr fullname 1
CLIENT_ERROR cannot increment or decrement non-numeric value

Finally, note that incr is a full-fledged addition operation: you can incr (or decr) by any non-negative integer, not just by 1.

incr age 10
48
decr age 10
38
incr age -1
CLIENT_ERROR invalid numeric delta argument
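Here’s a toy Python model of the behavior in those transcripts: decode the stored bytes from ASCII, do the arithmetic, encode the result back. It is emphatically not memcached’s C code, and apply_delta is a name I made up. It assumes the delta is validated before the stored value, and it clamps decr at zero as memcached’s protocol documentation describes. Overflow past 64 bits is not modeled.

```python
def apply_delta(stored: bytes, delta_text: str, decrement: bool = False) -> bytes:
    """Toy model of incr/decr on an ASCII-encoded value (hypothetical helper)."""
    # The delta must be a plain unsigned decimal, so "-1" is rejected.
    if not (delta_text.isascii() and delta_text.isdigit()):
        raise ValueError("CLIENT_ERROR invalid numeric delta argument")
    # The stored bytes must be decimal digits, so b"John Smith" is rejected.
    if not stored.isdigit():
        raise ValueError("CLIENT_ERROR cannot increment or decrement non-numeric value")
    value = int(stored.decode("ascii"))
    delta = int(delta_text)
    # decr stops at zero rather than going negative; incr overflow is not modeled.
    result = max(value - delta, 0) if decrement else value + delta
    return str(result).encode("ascii")


age = b"38"
age = apply_delta(age, "10")                  # like: incr age 10
print(age)                                    # b'48'
age = apply_delta(age, "10", decrement=True)  # like: decr age 10
print(age)                                    # b'38'
```

In the real server this whole read-modify-write happens as one atomic step. The model only shows the data transformation, not the locking.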

By the way, when you’re done talking to memcached and want to kill the connection, you can type the memcached command quit. (If you’re using nc -c, Ctrl+D also works. Or, just go to the terminal window where the memcached server is running and Ctrl+C to bring it down.)

The challenge: Modifying memcached

Via its incr and decr commands, memcached provides a built-in way to atomically add \(k\) to a number. But it doesn’t provide other arithmetic operations; in particular, there is no “atomic multiply by \(k\)” operation.

Your programming challenge: Add a mult command to memcached.

When you’re done, I should be able to telnet to your memcached server and run commands like

mult age 10
380

You have three hours. Go!


For spoilers and analysis, see “The best engineering interview question I’ve ever gotten, Part 2.”

Posted 2022-01-06