Tv's cobweb
Better CoreOS in a VM experience 2016-03-31
(This was all tested with CoreOS beta 991.2.0; the stable 835.13.0 fails to mount 9p volumes.)
CoreOS comes with instructions for running it in QEMU. After the setup, it comes down to something like
#!/bin/sh
exec ./coreos_production_qemu.sh \
-user-data cloud-config.yaml \
-nographic \
"$@"
with cloud-config.yaml looking like
#cloud-config
hostname: mytest
users:
- name: jdoe
groups:
- sudo
- rkt
ssh_authorized_keys:
- "ssh-ed25519 blahblah [email protected]"
I used that for many experiments, but felt it was less than ideal.
I wanted to move beyond -net user. That just means calling QEMU directly, instead of using coreos_production_qemu.sh, no big deal.
But it meant I would be writing my own QEMU runner, no matter what.
I also wanted more efficient usage of my resources -- after all, the whole point of me running several virtual machines on one physical machine is to make these test setups more economical.
The provided QEMU images waste disk and bandwidth. Every VM stores two copies of the CoreOS /usr image, just like a physical machine would. Copy-on-write trickery on the initial image will not help beyond the first auto-update, as each VM independently downloads and applies the updates. This means if you run a small test cluster of, say, 5 VMs, you'll end up with 10 copies of CoreOS, and 5x the bandwidth usage.
Imitating physical computers with virtual machines is great if you're trying to learn how the CoreOS update mechanism works, but once you're at the point of just wanting to run services, it's simply not needed.
CoreOS does have a supported mode where it does not use the USR-A and USR-B partitions: PXE booting, starting a computer by requesting the software over the network. I could even skip the virtual networking and use this with QEMU by launching the kernel and initrd directly, no need for PXE itself. However, this is wasteful in another way: it holds the complete /usr partition contents in RAM, using about 180MB. Once per each VM. There is also an annoying delay of 15+ seconds in VM startup, presumably related to the large initrd image, and later the kernel spends 1.2 seconds uncompressing it into a tmpfs (measured on an i5-5300U laptop).
Digging into the PXE image, I find that it actually stores the /usr contents as a squashfs -- which is a real filesystem that can be stored on block devices, as opposed to just unpacking a cpio to a tmpfs. The PXE image does what's called a "loopback mount", where a file is treated like a block device. In the PXE scenario, the file is held in RAM in a tmpfs; I can just put those bytes on a block device, and boot that!
(The Live CD seems to also hold /usr contents in tmpfs just like the PXE variant, even though it could fetch them on demand from the ISO. The squashfs image is random-access, unlike the usual cpio.gz that's used for initramfs contents. In later versions, CoreOS could switch their ISO images to use the trick I'll explain below -- at the cost of physical machines needing to spin up a CD more often than once per boot. The live CD has another downside that made me avoid it: to pass kernel parameters, I'd have to resort to kludges like creating a boot floppy image with syslinux and the right parameters on it.)
So, I set about fixing the wasted disk and bandwidth problem. Here's a story of an afternoon project.
Using the /usr image directly
Instead of holding an extra copy of the /usr image data in RAM, we can make it available as a block device, and load blocks on demand. For that, we need the /usr squashfs image as a standalone file, not inside the cpio. It's not available as a separate download, but we can extract it from the PXE image:
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz.sig
gpg --verify coreos_production_pxe.vmlinuz.sig
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz.sig
gpg --verify coreos_production_pxe_image.cpio.gz.sig
zcat coreos_production_pxe_image.cpio.gz \
| cpio -i --quiet --sparse --to-stdout usr.squashfs \
>usr.squashfs
Prepare a root filesystem
We also need to prepare a disk image for storing the root filesystem. CoreOS won't boot right with a fully blank disk. If it did, I would have used qcow2 as the format, but now I need to provide some sort of structure for the root filesystem, so let's go with a raw disk image.
I might have been able to set up the right GPT partition UUIDs for the
initrd to mkfs
things for me, but that seemed too complicated, and I
doubted it'd support my "just the root" scenario as well as their
nine-partition layout.
To keep it simple, we won't bother to use partitions; the whole block device is just one filesystem.
# start with an empty file; chattr +C opts it out of copy-on-write
# (this matters on btrfs hosts, where COW fragments VM images)
>rootfs.img
chattr +C rootfs.img
truncate -s 4G rootfs.img
mkfs.ext4 rootfs.img
Prepare user_data
This was previously done inside coreos_production_qemu.sh
with a
temp dir, but we'll just pass a directory as virtfs
following the
"config drive" convention. Let's move our previous file into the right
place:
mkdir -p config/openstack/latest
mv cloud-config.yaml config/openstack/latest/user_data
Finally, run QEMU
qemu-system-x86_64 \
-name mycoreosvm \
-nographic \
-machine accel=kvm -cpu host -smp 4 \
-m 1024 \
\
-net nic,vlan=0,model=virtio \
-net user,vlan=0,hostfwd=tcp::2222-:22,hostname=mycoreosvm \
\
-fsdev local,id=config,security_model=none,readonly,path=config \
-device virtio-9p-pci,fsdev=config,mount_tag=config-2 \
\
-drive if=virtio,file=usr.squashfs,format=raw,serial=usr.readonly \
-drive if=virtio,file=rootfs.img,format=raw,discard=on,serial=rootfs \
\
-kernel coreos_production_pxe.vmlinuz \
-append 'mount.usr=/dev/disk/by-id/virtio-usr.readonly mount.usrflags=ro root=/dev/disk/by-id/virtio-rootfs rootflags=rw console=tty0 console=ttyS0 coreos.autologin'
You'll be greeted with the Linux bootstrap messages and finally
This is mycoreosvm (Linux x86_64 4.4.6-coreos) 06:14:10
SSH host key: SHA256:t+WkofIWxkARu1hezwPnS/vgTJXUcPidA3UxKr+1uGA (DSA)
SSH host key: SHA256:cT32H33EVCHSnrCRsB+I9GG7AgXQWfyjk7JFuEzAqFU (ECDSA)
SSH host key: SHA256:NFgc7BLbeyS3SslpscSSNHNzc7lXzx6vKqBmUp+5T7Q (ED25519)
SSH host key: SHA256:pK8Dknoib61FnIwMQ6u4F4FxeSMIRq9zYsrJd0N3MPY (RSA)
eth0: 10.0.2.15 fe80::5054:ff:fe12:3456
mycoreosvm login: core (automatic login)
CoreOS stable (991.2.0)
Last login: Fri Apr 1 06:02:25 +0000 2016 on /dev/tty1.
Update Strategy: No Reboots
core@mycoreosvm ~ $
Success!
As usual with QEMU, press C-a x to exit.
Stay tuned for part 2, where we will make the VM even leaner.
posted on 2016-03-31, tagged as coreos virtualization qemu kvm linux howto
Unmarshaling a JSON array into a Go struct 2016-01-13
Sometimes, you see a heterogeneous JSON array like
["Hello world", 10, false]
Dealing with such an array in Go can be very frustrating. A []interface{} hell is just about as painful as the map[string]interface{} hell (see my earlier article about that).
The natural way to deal with data like that in Go would be a struct like
type Notification struct {
Message string
Priority uint8
Critical bool
}
See how much more meaning we've added?
Now, you can't just json.Unmarshal an array into a struct. I'll show you how to make that work.
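A minimal sketch of one approach (my length check and error text, not necessarily the article's): implement json.Unmarshaler on Notification, and decode the array into a temporary []interface{} holding pointers to the struct's fields, in order.
package main

import (
    "encoding/json"
    "errors"
    "fmt"
)

type Notification struct {
    Message  string
    Priority uint8
    Critical bool
}

// UnmarshalJSON accepts ["Hello world", 10, false] by decoding the
// array into pointers to the struct fields, in order.
func (n *Notification) UnmarshalJSON(data []byte) error {
    tmp := []interface{}{&n.Message, &n.Priority, &n.Critical}
    if err := json.Unmarshal(data, &tmp); err != nil {
        return err
    }
    if len(tmp) != 3 {
        return errors.New("wrong number of fields in Notification")
    }
    return nil
}

func main() {
    var n Notification
    if err := json.Unmarshal([]byte(`["Hello world", 10, false]`), &n); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", n) // {Message:Hello world Priority:10 Critical:false}
}
encoding/json decodes each array element into the pointed-to field, so the struct keeps its types; the length check rejects arrays with too few or too many elements.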
posted on 2016-01-13, tagged as go json programming article
Go JSON unmarshaling based on an enumerated field value 2016-01-12
In a previous article , we talked about marshaling/unmarshaling JSON with a structure like
{
"type": "this part tells you how to interpret the message",
"msg": ...the actual message is here, in some kind of json...
}
Last time, we left a repetitive switch statement in the code, where each message type was unmarshaled very explicitly. This time, we'll talk about ways to clean that up.
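As a sketch of the direction (the Greeting type and the registry are my own illustration, not necessarily the article's final form), the repetitive switch can become a lookup table from the type tag to a factory function:
package main

import (
    "encoding/json"
    "fmt"
)

type Envelope struct {
    Type string          `json:"type"`
    Msg  json.RawMessage `json:"msg"`
}

// Greeting is a made-up concrete message type.
type Greeting struct {
    Text string `json:"text"`
}

// msgTypes replaces the switch: each type tag maps to a factory
// that returns a pointer to the matching struct.
var msgTypes = map[string]func() interface{}{
    "greeting": func() interface{} { return new(Greeting) },
}

func decode(data []byte) (interface{}, error) {
    var env Envelope
    if err := json.Unmarshal(data, &env); err != nil {
        return nil, err
    }
    factory, ok := msgTypes[env.Type]
    if !ok {
        return nil, fmt.Errorf("unknown message type %q", env.Type)
    }
    msg := factory()
    if err := json.Unmarshal(env.Msg, msg); err != nil {
        return nil, err
    }
    return msg, nil
}

func main() {
    msg, err := decode([]byte(`{"type": "greeting", "msg": {"text": "hello"}}`))
    if err != nil {
        panic(err)
    }
    fmt.Printf("%#v\n", msg) // &main.Greeting{Text:"hello"}
}
Adding a new message type then means adding one registry entry instead of another case in a switch.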
posted on 2016-01-12, tagged as go json programming article
Dynamic JSON in Go 2015-08-27
Go is a statically typed language. While it can represent dynamic
types, making a nested map[string]interface{}
duck quack leads to
very ugly code. We can do better, by embracing the static nature of
the language.
The need for dynamic, or more appropriately parametric, content in JSON often arises in situations where there are multiple kinds of messages being exchanged over the same communication channel. First, let's talk about message envelopes, where the JSON looks like this:
{
"type": "this part tells you how to interpret the message",
"msg": ...the actual message is here, as some kind of json...
}
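A minimal sketch of handling such an envelope (the Sound message type is made up for illustration): json.RawMessage keeps the msg bytes undecoded until we've looked at type.
package main

import (
    "encoding/json"
    "fmt"
)

// Envelope decodes the type tag eagerly, but leaves msg as raw bytes.
type Envelope struct {
    Type string          `json:"type"`
    Msg  json.RawMessage `json:"msg"`
}

// Sound is a made-up concrete message type.
type Sound struct {
    Description string `json:"description"`
}

func main() {
    data := []byte(`{"type": "sound", "msg": {"description": "dynamite"}}`)
    var env Envelope
    if err := json.Unmarshal(data, &env); err != nil {
        panic(err)
    }
    // Second pass: now that we know the type, decode the payload.
    switch env.Type {
    case "sound":
        var s Sound
        if err := json.Unmarshal(env.Msg, &s); err != nil {
            panic(err)
        }
        fmt.Println(s.Description)
    default:
        panic("unknown message type " + env.Type)
    }
}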
posted on 2015-08-27, tagged as go json programming article
Notes from SCALE12x 2014-02-24
(Sidenote: this little blog engine had bitrotted pretty bad.. I reimplemented it with markdown, go & bootstrap, and it's much more pleasant to work with now. Time for new content!)
I spent Saturday & Sunday at the Southern California Linux Expo (SCALE) , and here's my very personal report of how I experienced it.
SCALE is not your typical tech conference, it brings in very diverse groups of people. The organizers are actively trying to reach out to e.g. kids that are in that "might grow up interested in things" age. Just about every age group, techie background, and personal interest is present -- the common theme really is only Linux (and a few BSD-based vendors trying to sell their gear). Of course this means that SCALE won't ever serve my desires perfectly -- but it serves the community well, and the feel of the conference is very friendly and engaging.
Friday
First of all, I was too busy to go on Friday, and the streaming video had some sort of audio codec trouble, so I won't comment about content of the devops day . What I will say is that I'm impressed by the strength of the devops presence at SCALE. It's becoming a significant backbone of SCALE, year by year. Kudos to the organizers. And they're at it all year long -- the local ops-oriented meetups have a great community going. Heartily recommended, whether you carry a pager or not. Also see hangops .
SCALE also hosted another sub-event on Friday called Infrastructure.Next , @infranext . It looked interesting, though I fear overpresence of Red Hat and vendor agenda. I'm still waiting for slides and/or video of How to Assemble A Cutting Edge Cloud Stack With Minimal Bleeding . (The archived live streams for all three days are useless because of audio problems.)
I also missed Greg Farnum's talk on Ceph . I worked at Inktank for almost two years, and this technology is one of a kind, and a good indicator of where the future lies. If you deal with >20 machines, you should definitely take time to look into Ceph.
Saturday
Saturday started off with a talk about SmartOS vs Linux performance tooling (slides ). There wasn't much new there this time around, but Brendan is a good speaker, and SmartOS is probably the most serious server-side alternative to Linux I'd personally consider these days, so it's good to keep tabs on what they've been working on.
My interests drew me next to the talk about Presto (slides). Takeaways:
- batch and interactive systems have fundamentally different needs, e.g. for monitoring grace periods, how and when maintenance can be performed; they require a different ops culture.
- Dain shared background on Facebook's internal networking challenges, and how data center power limits forced them to essentially trade off other servers for Presto servers, to avoid network bottlenecks.
- Presto is integrating the BlinkDB research on approximate queries, e.g. <10% error for 10-100x faster queries sounds like a very good trade-off.
- many "big data" stores don't store enough statistics about index hit rates to guide query planning
I'm sad I missed Beyond the Hypervisor (slides ) due to a schedule conflict.
The OpenLDAP talk (slides) was really largely about LMDB, and that's
what I came for. LMDB is a library that implements a key-value store,
with an on-disk B-tree where read operations happen purely through a
read-only mmap
. This is a really nice architecture, pretty much as
good as a btree gets -- that is, it's probably happiest with
read-mostly workloads, and probably at its worst with small writes to
random keys. Pretty much the opposite of LevelDB, there. I wish the
benchmarks were less biased, but that seems to be the unavoidable
nature of benchmarking. LMDB has a lot of the kind of mechanical
sympathy that may remind you of
Varnish
: all
aspects of caching are offloaded to the kernel, and data can be
accessed in zero-copy fashion because the read-only mmap prevents
accidents. For Go programmers, Bolt
is a reimplementation of the design in pure Go, avoiding the Cgo
function call overhead, and offering a much nicer api than the
direct wrapper szferi/gomdb
. My
quick microbenchmarks say that, when used from Go, Bolt can be
faster.
Next up was High volume metric collection, visualization and analysis . If I could take back those 20 minutes, I would.
I spent the rest of the day catching up with old friends and making new ones.
Sunday
Clint Byrum is now at HP and working on TripleO, a project that aims to make OpenStack do bare-metal deploys, and then run a public-facing OpenStack on top of that. His talk was a good status report (slides ), but in situations like this I always end up wanting more details.
For the next slot, I bounced between three different talks, not 100% happy with any one of them. First, Hadoop 2 (slides ) was an intro to YARN et al that started off like an apologist "I swear Hadoop and Java don't really suck as much as they seem to". Mark me down as unconvinced.
Second, Configuration Management 101 was a good effort from a Chef developer to be party neutral, and talked about the common things you find in all the common CM frameworks. His references to promise theory are pretty much dead on, and in the 3 years since I fiddled momentarily with https://github.com/tv42/troops , my thoughts have gone more and more into thinking about distributed CM as an eventual consistency problem. With Juju-inspired notifications about config changes, using more gossip & vector clock style communication to update peers on e.g. services provided, this might result in something very nice. That one is definitely on the ever-growing itches to scratch list.
Third, Seven problems of Linux Containers was an OpenVZ-biased look into remaining problems. Some of it was a bit ridiculous -- who says containers must share a filesystem, just mount one for each container if you want to -- and some of it was just too OpenVZ-specific to be interesting. Still, a good topic, and OpenVZ was groundbreaking work.
For the next slot, I returned from lunch too late to fit in the packed rooms, and enjoyed breathing too much to try harder. I watched three talks, mostly from open doorways. The hotel's AC was not really keeping up anymore at this point, and only the main room was pleasant to be in.
Big Data Visualization left me wishing that 1) it wasn't fashionable to say "big data" 2) he'd have shown more visualizations 3) he'd talked about the hard parts.
ZFS 101 (slides ) is interesting to me mostly to see what people think about & want from storage. Btrfs is really promising in this space, feature-wise; it still has implementation trouble like IO stalls, but the integrated snapshots and RAID are just so much more useful and usable than any combination of hardware RAID, software RAID, and LVM. Snapshots really need to be a first-class concern. So far, my troubles with Btrfs are of a magnitude completely comparable to my troubles with the combination of LVM, LVM snapshots, HW-RAID cards dying, and SW-RAID1 sometimes booting the drive that was meant to be disabled. All in all, I find the "not yet stable" argument a bit boring; there's a whole lot of code and complexity in Btrfs, but it also removes the need for a whole lot of other kinds of code and complexity. If nothing else, the ZFS/Btrfs feature set should be a design template for future efforts; I understand e.g. F2FS has a very specific design goal (think devices rather than full computers), but not supporting snapshots in a new filesystem design is a bummer.
And finally, I spent time in Jordan Sissel's fpm talk. fpm is a tool that converts various package formats into other package formats, a lot like Alien. Jordan's viewpoint on this is a frustrated admin who just wants the damn square peg to fit in the round hole, and fpm is the jigsaw & hammer that'll make that happen. I fundamentally disagree with him about the role of packaging; the whole point of packaging is destroyed if the ecosystem has too many bad packages, and the reason e.g. Debian packaging can be a lot of work is not because cramming files in an archive should be hard, but because making all that software work together and upgrade smoothly actually is a difficult problem. But Jordan is an entertaining speaker, and his point is valid; there are plenty of cases where you don't care about the quality of the resulting package. Just.. please don't distribute them, ok?
posted on 2014-02-24
Slides for my recent talks 2012-01-08
I just put up a bunch of slides from talks I've presented lately:
- concurrency-oh-my/ Concurrency, Parallelism, Events, Asynchronicity, Oh My: concepts and Python applications of concurrency, 2011-11-10 at SoCal Piggies.
- ceph-overview/ Ceph Overview for the OpenStack Conference, 2011-10-07 at OpenStack Conference.
- teuthology/ Teuthology: a multi-machine test runner, 2011-07-01 at DreamHost.
- intro-to-boto/ Introduction to Boto, and an even briefer intro to Gevent, 2011-07-01 at DreamHost.
posted on 2012-01-08
Keyboards influenced by touchscreens 2011-04-30
By now, you have probably used a touchscreen keyboard. We've come a long way from the clumsy "kiosk" computers that brought touchscreen keyboards to the mainstream, a decade or two ago. But a classic keyboard with physical keys is still preferable for the tactile feedback we get from pressing the keys. Until touchscreens can provide that, we'll be using traditional keyboards for a while.
But how do touchscreen keyboards differ from physical keyboards, and what ideas could we copy from them to improve the user experience of traditional keyboards? Well, for one, most touchscreen keyboards these days don't do key repeat -- instead, they'll pop up a menu of alternatives, often the same base letter with various accents, diacritics and umlauts. And you slide your finger to pick one of these options.
Now, that sounds good. I for one don't know how to type the various variations of the letters on a US keyboard, and as a Finn I actually need "ä" and "ö" sometimes. I can type them in Emacs, but not in my IM application or web browser.
Here's the idea: we don't really use autorepeat on the A-Z characters. Instead, make a long press of a letter key bring up an on-screen menu with variations of the basic letter.
On a touchscreen, choosing from the menu is immediate and fairly intuitive. While using the mouse for that would be straightforward -- and probably what a first time user will try -- nobody wants to type like that. We're stuck with one finger holding down the original key. We need to make a selection from up to about 9 alternatives. Here are two easy ideas (assuming the initial highlighted alternative is the base letter):
- make space move to the next highlighted alternative, or wrap around to the left
- make the four primary home row keys of the other hand highlight alternatives 1-4; make pressing the same home row key again act like space, above
For example, say I want to type "ä". My options are (assuming US keyboard layout):
- hold "a", wait till menu pops up, while still holding "a" press space 4 times until "ä" is highlighted, release "a"
- hold "a", wait till menu pops up, while still holding "a" press ";", release "a"
I think I'd use that more than auto-repeat. What about you?
posted on 2011-04-30, tagged as keyboard touchscreen human-computer-interface idea wishlist future
Deploy tools 2011-02-27
I've been looking at the world of deployment tools lately. Outside of Puppet and Chef (and ignoring the old beards Bcfg2 & Cfengine), what other things are there?
Fabric lets you write Python functions to describe "tasks" to be run. The Python functions are run on a client machine -- for example, the sysadmin's laptop -- and each task can be directed to operate on hosts or roles (groups of hosts), over SSH. The functions can run remote commands with run("echo hello, world") and sudo("chmod u=rw,go=r /etc/passwd"). Fabric is a very useful piece of the puzzle, but doing more complex operations one shell command at a time gets frustrating. I keep wanting something that can run whole chunks of Python easily, on the target machine. Fabric also does nothing to solve the problems of e.g. multiple admins running a deploy command at the same time.
Kokki is closer to Chef in its style (and literally, "Chef" in Finnish). It's a framework for writing cookbooks, with actions like File("/etc/greeting", content="hello, world") in them. Then a configuration for a machine can invoke certain recipes. Kokki seems to be aimed fully at running things locally; that is, if you're deploying things, you'd run Kokki on the target machine. Kokki is still in fairly early development; its website and source code don't match each other at all, and many of the cookbooks no longer work with the current version. It also still inherits a bit too many non-Pythonic elements from Chef, for my tastes. Still, to this Pythonista, it looks very promising, and I will be exploring it further.
Poni ("Pony" in Finnish -- I want one too!) is another Python project that takes a very different tack. It is built on a command-line tool that lets you define your infrastructure through hierarchical collections of key-value settings; that is, you describe the whole multi-machine service with database servers, load balancers, app servers and all. You can use inheritance much like other deploy tools use cookbooks. The command-line tool seems to be meant to be used for everything; the stored data is not really meant to be edited directly. While I appreciate the polish of the command-line tool, the editor-hostility comes off a bit odd. Especially so when the getting started guide has me "uploading" template files and Python source code to Poni's internal configuration storage. Am I really supposed to have two copies of these files?
Once you have your infrastructure defined, Poni provides you two main methods to actually make changes: you can create files based on templates (that have a strong mechanism for referring to any values from the configuration, including things like sharing the database connection information between the DB server and the client config), or you can run custom functions ("control commands"). The Python functions run locally, but Poni provides a remote execution framework very similar to the one in Fabric, though at least for now it is significantly more verbose. And, to my disappointment, doesn't really allow running full Python functions remotely either.
Somewhat confusing is the difference between the "create a file" and the "run control command" functionality. It is not quite clear how the whole is intended to orchestrate the full deployment, and the examples are both lacking and misleading. For example, right now the Puppet deployment example requires you to run a command to create some files, watch it fail, run another command to install software, then run the first command again to create the rest of the files. (Kind of weird to deploy Puppet with Poni in the first place..)
There is one thing about Poni that I am already starting to dislike. Currently, every identifier you need to refer to on the command line is given as a regexp, and commands act on all matches. This leads to a high risk of operator errors: for example, the documentation itself uses $find("webshop/frontend") as an example; yet that would also match webshop/frontendforsomethingelse. I do hope the author changes his mind about regexps everywhere.
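A quick illustration of the pitfall, sketched in Go for brevity (Poni itself is Python; the names come from the example above): unanchored patterns match substrings.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    // Unanchored, as in the documentation example: matches substrings.
    loose := regexp.MustCompile(`webshop/frontend`)
    fmt.Println(loose.MatchString("webshop/frontendforsomethingelse")) // true: oops

    // Anchored: only the node you actually meant.
    strict := regexp.MustCompile(`^webshop/frontend$`)
    fmt.Println(strict.MatchString("webshop/frontendforsomethingelse")) // false
}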
Much like Kokki, Poni is very early on in its development; its command-line tools and things like variable referencing are top notch, but the picture is very much not complete yet. But this one is definitely a project to watch.
For posterity: I filed a bunch of issues about the things I bumped into:
- https://github.com/melor/poni/issues/2
- https://github.com/melor/poni/issues/3
- https://github.com/melor/poni/issues/4
- https://github.com/melor/poni/issues/5
- https://github.com/melor/poni/issues/6
posted on 2011-02-27, tagged as deploy software fabric kokki poni
EuroPython 2008 videos are up 2008-08-11
Update: Well, they used to be. blip.tv pulled a nasty stunt and removed tons of user content. And EuroPython itself has a nasty habit of not providing archived websites for previous years. This all has linkrotted. Don't you hate the internets?
My talks were:
posted on 2008-08-11, tagged as europython2008 europython python talk video
EuroPython 2008 wrap up 2008-07-19
EuroPython 2008 was fun. I presented two talks (My God, it's Full of Files -- Pythonic filesystem abstractions and Version Control for Du^H^HDevelopers ) and one lightning talk (RST+S5 for your slides), participated in a bunch of open space sessions, listened to about 13 talks, took a bunch of pictures, but most importantly had interesting hallway conversations with interesting people.
As usual, PyPy was heavily represented, and seems to be making nice progress toward being the nice and featureful default Python implementation of the future. I especially liked the restricted execution features and the LLVM backend. The zc.buildout talk made me decide I will try to replace (part of?) one custom deploy mechanism with zc.buildout -- most likely I'll end up rewriting most of the current things as zc.buildout recipes, but hopefully some of the pre-existing recipes will be useful, and hopefully I can then later reuse the recipes I create for this setup.
Personally, I think my talks went ok. I understand videos will be available later, as soon as transcoding etc are finished. I'm anxious to see them myself, as I'm still finetuning my public speaking skills. I'm learning, though -- this year I had no trouble staying within my time slot, even when I was adjusting verbosity on the fly.
For some reason, I felt underprepared for the filesystem API talk , but ultimately people liked the idea of a consistent Pythonic filesystem API enough that we had an open space session on it, and people were enthusiastic about a sprint to prototype the API. Which is what we ended up doing, too -- I'll blog separately about the results of that.
My decentralized version control talk seemed to me to go over more smoothly; I guess that's just because I've been thinking about version control and project management a lot lately, so it was easy to talk about the topic in a relaxed way. On the other hand, it wasn't as much a call to action, and it really was overly generic, so I didn't get as strong audience participation there. We did have an interesting conversation about branch management strategies and such, though. I consciously tried to keep the talk on a generic level, as I felt a pure git talk would have alienated some listeners, but I did end up feeling restricted by that. There was some interest in a Teach me git-style session, but what we ended up doing was just talking one on one about getting started with git, during the sprints. Sorry if I missed any one of you -- grab me on #git to continue, or find me in future conferences ;)
I was requested to organize an open space session for Twisted Q&A, and that is exactly what we did. We went through a bunch of things related to asynchronous programming concepts, Deferreds, working with blocking code and libraries, database interfaces, debugging and unit testing.
I was also pulled in to another Twisted open space session, that was mostly about what greenlets are and how to use them. I tried to explain the differences between classical Deferreds, deferredGenerator/inlineCallbacks, and greenlets, to the best of my understanding. As a summary, with greenlets any function you call can co-operatively yield execution (I mean yield in the scheduling meaning, giving away your turn to run, not in the Python generator meaning -- interestingly inlineCallbacks etc actually make those be the same thing... my kernel instincts make me want to say "sleep"). Yielding in any subroutine means anything you do may end up mutating your objects -- which is the root evil behind threading we wanted to get away from. All the other mechanisms keep the top-level function in explicit control of yielding. Around that time, most people left for lunch, but about three of us stayed and talked about debugging Deferreds and network packet processing with twisted.pair and friends.
One of the interesting hallway conversations was about what happens when upstream web hosting listed on PyPI is failing. It seems PyPI already does some sort of mirroring, but even that might not be enough. Many companies seem to be bundling eggs of their dependencies in their installation package, which sounds like a good setup for commercial click-to-install deployment. But it would still be good to see a CPAN -style mirror network for PyPI, and at least some people seemed even motivated to donating servers and bandwidth. Personally, I'm mostly spoiled by the combination of Debian/Ubuntu and decentralized version control, and my level of paranoia is too high to automatically install unverified software from the internet anyway. My primary motivation in the conversation was to point out that PyPI already has some sort of mirroring/upload setup, and that you'd really want to specify exact versions and SHA-1 hashes of your dependencies. Optionally, you could delegate the known good hash storage to PyPI (assuming you trusted PyPI not to attack you), but that would require a full Debian-style signature chain from a trusted key, or you'd be owned by anyone capable of MITM attacks, DNS forgery, or cracking a PyPI mirror.
posted on 2008-07-19, tagged as europython2008 europython python
EuroPython 2008 talk #1: My God it's Full of Files 2008-07-07
I published slides for my first Europython talk. Note the slides might not be easy to understand without the actual talk -- the video streaming at Ustream will work if the network infrastructure can take it, and a downloadable video should be available later.
My God, it's Full of Files
Pythonic filesystem abstractions: An overview of different filesystem(-like) APIs in Python and attempts for unifying them.
There's a lot of different filesystem(-like) APIs in Python. I intend to provide an overview of existing projects, their status and capabilities, and hopefully inspire you to work on improving things.
posted on 2008-07-07, tagged as europython2008 europython talk python filesystem programming
Incremental mapreduce 2007-11-24
So Google has their MapReduce , and the people behind CouchDB are throwing around their ideas . I spent some time thinking about incremental mapreduce around July, and it's time I type out that page full of scribbles.
First of all: I think the ideas thrown out by Damien above aren't really mapreduce, As Google Intended. The real power of mapreduce is in its inherent combination of parallelism and chainability, output of one mapreduce is input to another, each processing step can run massively in parallel with each other, etc. The proposed design is like a one-iteration retarded cousin of mapreduce.
With that bashing now done (sorry), here's what I was thinking:
The way I imagined building an incremental mapreduce mechanism, without storing the intermediate data and just recomputing chunks that are out-of-date (which would be lame), is to add one extra concept into the system: call it "demap". It will basically create "negative entries" for the old data. This is what Damien did by providing both the old and new data to the map calls, all the time, just said differently, and I think my way might make the average call a lot simpler. And I don't see any reason why my version wouldn't be parallelizable, chainable, and generally yummy.
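To make that concrete, here's a toy word-count sketch of the demap idea (all names are mine, and parallelism and chaining are ignored): updating a document emits negative entries for its old contents, so the reduce output is patched in place rather than recomputed.
package main

import (
    "fmt"
    "strings"
)

// emit applies one document's map output to the reduce result,
// with sign +1 for map and -1 for demap.
func emit(counts map[string]int, doc string, sign int) {
    for _, word := range strings.Fields(doc) {
        counts[word] += sign
        if counts[word] == 0 {
            delete(counts, word)
        }
    }
}

func main() {
    counts := map[string]int{}
    emit(counts, "hello world", +1) // initial map

    // Document changes: demap the old version, map the new one.
    emit(counts, "hello world", -1)
    emit(counts, "hello mapreduce", +1)

    fmt.Println(counts) // map[hello:1 mapreduce:1]
}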
posted on 2007-11-24, tagged as mapreduce couchdb parallel cluster programming database search
KVM, the virtualization mechanism, rocks 2007-10-12
For years now, my primary machine has been a laptop. I've been
avoiding running Xen
1 because the hypervisor
it injects between hardware and Linux hasn't been very friendly for
power saving. Modern laptops are too often hot enough without any
extra help.
Well, I'm happy to say that as of now, KVM is definitely stable enough to replace my use of Xen for test setups. I expect that shortly it will become what I recommend for server use, too. And because it lets Linux be Linux, instead of doing any oddities with the hardware, all the powersaving etc goodness still works perfectly.
So far, I've stumbled on two things:
- The version of KVM in Ubuntu, even gutsy, is just too old. With a bit of fiddling with the patches, KVM v46 built a perfectly working deb.
- The isolinux graphical bootup, used in Ubuntu, crashes KVM (and with v28, it crashes the host machine -- beware!). See the bug report. I got around that by using mini.iso, but you could always fall back to debootstrap; that was all we ever really had with Xen.
And while I have the soapbox: I have a dream. It's KVM running with SDL/VNC graphics in a window that's resizeable all the way to the virtual machine, with XRandR. Please make it happen!
Patches
So, the edited version of the patches:
- 01_use_bios_files_in_usr_share_kvm.patch
- 06_no_system_linux_kvm_h.patch
- from-debian-qemu/04_do_not_print_rtc_freq_if_ok.patch
And just remove from-debian-qemu/62_linux_boot_nasm.patch; it seems to have made it upstream.
-
What's up with http://www.xensource.com/ , by the way? That website is just a pile of horrible non-informative enterprise speak. Utterly useless, and that tends to alienate the techie community pretty fast. At least it's alienating me. ↩︎
posted on 2007-10-12, tagged as kvm xen virtualization xrandr idea
Snakepit and gitosis, things I've been working on 2007-10-12
A brief update of things I've been working on:
Snakepit
Snakepit is a port of (part of) HiveDB to Python and SQLAlchemy. It will help you write database-backed applications that need to scale further than one database server, or even a master-slave setup, can take you. It's MIT licensed, that is, pretty much as free-for-all as it can be. And it's still work in progress, so don't be too harsh yet ;)
See https://github.com/tv42/snakepit for more.
gitosis
gitosis aims to make hosting git repos easier and safer. It manages multiple repositories under one user account, using SSH keys to identify users. End users do not need shell accounts on the server, they will talk to one shared account that will not let them run arbitrary commands. gitosis is licensed under the GPL.
First real release will come as soon as I have to time to go through a couple of really minor nits. It's been self-hosting for a long time now.
See https://github.com/tv42/gitosis for more.
posted on 2007-10-12, tagged as software programming git database scalability
Rotaclock -- a unique clock where the whole wrist displays the time 2007-06-06
(This post didn't originally get published due to technical problems with the graphics. It was written in 2007 and was rescued & inserted into the archive in 2015.)
Imagine a wristwatch with nothing but the wristband. Time flows around your wrist; parts of the wristband light up to indicate what time it is, and you can watch seconds race loops around your wrist.
The concept may need some tuning, as some of the readings are awkwardly behind your wrist. Perhaps there's a gap there that the indicators jump over, around the closing mechanism of the wrist band.
posted on 2007-06-06, tagged as idea gadget
Git for Computer Scientists 2007-03-29
I wrote a brief introduction to git for people who are not scared by words like Directed Acyclic Graph . Read it here: Git for Computer Scientists
posted on 2007-03-29, tagged as git howto
Howto host git on your Linux box 2007-03-23
Warning
Updated to drop --use-separate-remote from git clone; it's the default.
Updated to add --read-only to git-shell-enforce-directory.
I've run repeatedly into cases where I want to provide services to people without really trusting them. I do not want to give them shell access. I don't want to even create separate unix user accounts for them at all. But I do want to make sure the service they use is safe against e.g. password sniffing.
Instead of trying to run the version control system over HTTPS (like Subversion's mod_dav_svn that will only work with Apache, which I don't run), I want to run things through SSH. SSH is the de facto unix tool for securing communications between machines.
Now, I said I don't want to create a unix user account for every developer using the version control system. With SSH, this means using a shared account, usually named by the service it provides: svn, git, etc. To identify different users of that account, do not give the account a password, but use SSH keys instead. To avoid giving people full shell access, use a command="..." when adding their public key to ~/.ssh/authorized_keys.
For Subversion, I submitted an enhancement to add --tunnel-user, to make sure the commit gets identified as the right user, and then used command="..." with the right arguments, like this (all on one line):
command="/srv/example.com/repo/svn/svnserve -t
--root /srv/example.com/repo/svn/view/examplegroup
--tunnel-user jdoe" ssh-rsa ... [email protected]
Where the view
directory is a bunch of symlinks to the actual
repositories, allowing me to do group-based access control.
With git
, the author of the changeset is
recorded way before the SSH connection is opened. Without building
some sort of access control in git
hooks on the server, every
developer can pretty much ruin the repository by overwriting branches
with bogus commits. What they will not have is access outside of the
repository, or a way to actually remove the old commits from the disk
(unless you run git prune
on the server). The distributed nature
of git
makes this reasonably easy to detect, and pretty much
trivial to recover from. For any real trust in the code, you should
look at signed tags anyway. The included wrapper allows you to have
read-only users, but provides no detailed access control against
developers with write access; they just won't be able to escape to the
rest of the filesystem.
So, with that introduction out of the way, let's get to configuring:
Install git on the server
sudo apt-get install git-core git-doc
Create the directory structure to store the repositories and related files
sudo install -d -m0755 \
/srv/example.com/repo/git \
/srv/example.com/repo/git/.ssh \
/srv/example.com/repo/git/repos \
/srv/example.com/repo/git/view
Create the shared user account for this service
sudo adduser \
--system \
--home /srv/example.com/repo/git \
--no-create-home \
--shell /bin/sh \
--gecos 'git version control' \
--group \
--disabled-password \
git
Limit access
Set up a script that makes sure only relevant git commands can be run via SSH, and that limits the visible section of the filesystem to things you actually want to give access to; put this file in /usr/local/bin/git-shell-enforce-directory (download) and chmod a+x it
#!/usr/bin/python
# Copyright (c) 2007 Tommi Virtanen <[email protected]>
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# Enforce git-shell to only serve repositories
# in the given directory. The client should refer
# to them without any directory prefix.
# Repository names are forced to match ALLOW.
import sys, os, optparse, re

def die(msg):
    print >>sys.stderr, '%s: %s' % (sys.argv[0], msg)
    sys.exit(1)

def getParser():
    parser = optparse.OptionParser(
        usage='%prog [OPTIONS] DIR',
        description='Allow restricted git operations under DIR',
        )
    parser.add_option('--read-only',
                      help='disable write operations',
                      action='store_true',
                      default=False,
                      )
    return parser

ALLOW_RE = re.compile("^(?P<command>git-(?:receive|upload)-pack) '[a-zA-Z][a-zA-Z0-9@._-]*(/[a-zA-Z][a-zA-Z0-9@._-]*)*'$")

COMMANDS_READONLY = [
    'git-upload-pack',
    ]

COMMANDS_WRITE = [
    'git-receive-pack',
    ]

def main(args):
    os.umask(0022)
    parser = getParser()
    (options, args) = parser.parse_args()
    try:
        (path,) = args
    except ValueError:
        parser.error('Missing argument DIR.')
    os.chdir(path)
    cmd = os.environ.get('SSH_ORIGINAL_COMMAND', None)
    if cmd is None:
        die("Need SSH_ORIGINAL_COMMAND in environment.")
    if '\n' in cmd:
        die("Command may not contain newlines.")
    match = ALLOW_RE.match(cmd)
    if match is None:
        die("Command to run looks dangerous")
    allowed = list(COMMANDS_READONLY)
    if not options.read_only:
        allowed.extend(COMMANDS_WRITE)
    if match.group('command') not in allowed:
        die("Command not allowed")
    os.execve('/usr/bin/git-shell', ['git-shell', '-c', cmd], {})
    die("Cannot execute git-shell.")

if __name__ == '__main__':
    main(args=sys.argv[1:])
Create your first repository
cd /srv/example.com/repo/git/repos
sudo install -d -o git -g git -m0700 myproject.git
sudo -H -u git env GIT_DIR=myproject.git git init
(with git older than v1.5, use init-db instead of init)
Set up an access control group and give it access to that repository
cd /srv/example.com/repo/git/view
sudo install -d -m0755 mygroup
cd mygroup
sudo ln -s ../../repos/myproject.git myproject.git
You can also use subdirectories of view/mygroup to organize the repositories hierarchically.
Note, one SSH public key will belong to exactly one group, but if necessary you can create a separate group for each account for absolute control.
Note, access to a repository implies write access to that repository, at least for now. You could make a separate group for read-only users and add --read-only to their authorized_keys lines.
Authorize users
Get an SSH public key from a developer and authorize them to access the group
cd /srv/example.com/repo/git
sudo vi .ssh/authorized_keys
How the developer generates their key is out of scope here.
Add a line like this, with the public key in it (all on one line, broken up in the middle of word to make sure there is no misunderstanding about when to use a space and when not to):
command="/usr/local/bin/git-shell-enforce-directory /srv/exampl
e.com/repo/git/view/mygroup",no-port-forwarding,no-X11-forwar
ding,no-agent-forwarding,no-pty ssh-rsa ... [email protected]
Or to allow only read-only access, add --read-only
as an option.
You can now push things to the repository with
git push [email protected]:myproject.git mybranch:refs/heads/master
Note that before the first push, your server-side repository will not contain even an initial commit, and can't really be cloned.
Now the developer can clone the repository
git clone git@myserver:myproject.git
or to avoid some behavior of older git that I consider confusing (needs git v1.5 or newer):
git clone -o myserver git@myserver:myproject.git
They will probably want to set up ssh-agent
to avoid typing the
passphrase all the time.
And you're done! Good luck with your adventures with git, and welcome to the 21st century and to distributed version control systems.
posted on 2007-03-23, tagged as git admin howto ssh
Howto buy a used car in California 2007-02-20
Tomorrow, if everything goes well, I will buy a used car from an individual in California. Here's a checklist of things for that, to help others in similar situations. Some things may have already been omitted because they weren't relevant for me, so you may want to independently browse the websites I'm using as sources. I'm skipping everything related to finances and haggling. I'm not covering cases where the car isn't in good condition. Also, buying from dealerships is different. Good luck.
(Updated to mention REG 262.)
-
Find an ad on craigslist or whatever. I liked http://losangeles.listpic.com/car/
-
Call the seller. Don't just email, start evaluating the seller. Things to ask: http://www.carbuyingtips.com/questions.html (though that list is horribly extensive; I'd rather just pick the three best-looking candidates and do that thing on the spot; you'll need to doublecheck anyway to make sure the seller wasn't lying).
Ask for:
- the VIN of the car (usually 17 characters, on windshield)
- full name of the seller (you'll need it anyway for the check)
-
Check the CARFAX report for the car. Go for the $24.95 30-day option, you should always look at more than one car.
- compare odometer, accident history etc with what seller said (better yet, just avoid accident cars, the risk isn't worth it)
- make sure it's not an old junk car poorly repaired ("salvage" title)
-
Check the smog check history (fails are a sign of trouble). In California, in most of the cases, the seller is required to provide a certificate of check newer than 90 days old -- it seems the information flow to website is slow enough that this latest check does not show up, or something.
-
Check that the car hasn't suffered storm damage . (Annoying website demands cookies for a simple form. Suck.)
-
Make an appointment and go see the car. In good sunlight, you want to see the car.
-
for some things to check, read more, especially On-the-Lot Checklist and Road and Test Checklist
-
you have to drive the car; leave a photocopy of your drivers license if needed for assurance
-
bring a car nut friend who knows what to check; play good cop bad cop (this checklist is not as good, as I am not a car nut)
-
on the lot:
- paint chips
- cracked windows
- accident damage
- signs of water damage
- tires
- check the VIN you were given against windshield, doors, engine, dashboard, major body parts; if they don't match you are dealing with a criminal
-
while driving:
- brake hard, is it even
- rev the engine, does it sound healthy
- make plenty of starts and stops
- go through a manual gearbox, there should be no grinding noises
- see if the car will drive straight with your hands off the wheel
- how does the clutch feel?
- listen for noises throughout the test drive
-
after driving:
- check for leaks
-
to be really careful, you should check a lot of things like
- AC
- windows up/down
- radio/cd/speakers
- etc
-
-
Take the car to a mechanic of your choosing for evaluation. Alternatively, choose to trust on service from brand name vendor.
-
Check the price against Kelley Blue Book etc.
-
Now you're in business. Only bureaucracy and doublechecking left.
-
If you know what you're willing to pay, go to your bank and get a cashiers check. Or two alternatives, if that works out. Otherwise, you will need to go back to the bank after haggling, and without a deposit the seller may have sold the car to someone else. Or something. Tuff.
(Some people say haggling at dealerships is easier if you show up with a cashiers check just a bit short from what they're asking.)
Don't pay with cash, there's even less chance of getting it back than cancelling a check.
If you happen to be a seller reading this, do the actual sale in a bank to be sure you aren't being cheated.
-
Download and print PDF forms from DMV .
-
In California, sellers are required to provide a smog certificate. Make sure you get one. Stuff on smog checks: 1 , 2 , 3 . Frankly, I'm still a bit confused myself when I will need to do a smog check.
-
Check that the registration is current and that the car wasn't repurchased under the California "Lemon law". (Err, how? As I understand it should say so in the
Certificate of Title
paper) -
The seller should find the "pink slip" aka
Certificate of Title
-
if said paper does not have form fields for ownership transfer and odometer reading, you need form "REG 262" from the DMV, which is printed on special paper and not available as PDF. SUCK!
-
check seller name against his drivers license
-
fill it out with both of your info and both sign it; for instructions search for "Where do I sign?" on this page
-
if a bank or something still owns a chunk of the car, their signature is also needed on the pink slip; I'd be inclined to avoid the complexity
-
fill in the odometer value, both sign (read more )
-
seller keeps the Notice of Transfer and Release of Liability part and submits to DMV.
-
buyer fills in the back of the title to transfer ownership
-
-
Things to ask for before leaving
- is a special wheel lug key needed, get it
- are there extra keys
- how does the car alarm work
- on a convertible, how does the roof work
- any "tricks" you should know
-
Now you're ready to leave with your new car! Check that you have
-
Certificate of Title
, signed by both, also odometer section -
Bill of Sale
, signed by both -
maintenance records
-
smog certification
-
owners manuals, repair manuals
-
spare tire, jack
-
-
Seller has 5 days to submit
Notice of Transfer and Release of Liability
to DMV. Do it online . -
Buyer has 10 days to report ownership change to DMV and max 30 days to pay the fees. Read more: checklist , things to send to DMV , more info .
-
Taxes?
General resources:
-
The other side of the story: seller howto . The person you are buying from should have done all of this, and this is what you can demand from him. Also, see the links and actual transaction guidance inside.
-
Used car buying tips at http://www.carbuyingtips.com/used.htm
posted on 2007-02-20, tagged as car california los-angeles howto
SCALE5x: Talk summary of the OpenWengo talk 2007-02-10
More SCALE5x: Dave Neary is talking about OpenWengo. Note to self: Wengo = TelCo, WengoPhone = software, OpenWengo = project developing WengoPhone -- or something. At least it's not just .org for community and .com for services, even if the names are way too close to each other.
Good quote (not his, didn't catch the name):
"People don't want to buy a quarter-inch drill.
They want a quarter-inch hole!"
He recommended this blog for anyone interested in user interface design: http://headrush.typepad.com/
Choice quotes:
"Cross Platform (but sound on Linux is a disaster)"
"Surprisingly, for Microsoft, it's not SIP... pure SIP."
(talking about MSN Messenger)
They intend to implement XMPP-based transport mechanisms. Mentioned
inkboard
, an Inkscape extension(?) for whiteboard-style sharing of
drawing over the internet.
They have games over OpenWengo (I guess XMPP?), like chess.
"Oh did I mention sound on Linux is horrible?"
Heh, we're calling audience members during the talk. From France. And it didn't work ;)
OpenWengo has cross-platform video conferencing. Wow.
posted on 2007-02-10, tagged as voip sip im software scale5x conference talk
SCALE5x: Talk summary of the horribly named Red Hat Xen talk 2007-02-10
More SCALE5x : Sam Folk-Williams is doing a talk called Xen Virtualization in Red Hat Enterprise Linux 5 and Fedora Core 6: An overview for System Administrators (UNGH!). And demonstrates why I hate "big" companies like Red Hat: they sent a non-technical, but well-practised, person to talk about Xen. He sounds convincing, but ended up explaining Xen domU migration without understanding the concept of shared storage. Gah.
Also note how talk summary promises live demos of Xen integration features only Red Hat has, and how the actual talk contained no such thing. If I didn't have wireless right now I'd be annoyed. Thank you SCALE5x organizers, the wifi is just great.
posted on 2007-02-10, tagged as admin redhat xen scale5x conference talk bad
SCALE5x: Talk summary of Admin++, what root never told you 2007-02-10
So I'm at SCALE5x , listening to Ron Gorodetzky talk about what he learned about sysadmining for Digg and Revision3 (who try to be an "Internet television network"; in effect, they distribute loads of big files). Most of the tools he mentioned I already knew, but it was nice to get independent reviews of "hey I think this is good". Here's what I took home from his talk:
-
He really thinks highly of the OSCon 2005 talk
Livejournal's Backend (A history of scaling)
(PDF ). -
Between the lines I understood Revision3 has outsourced their big bandwidth use -- the CDNs he mentioned by name were Cachefly (the color scheme hurts even my eyes and real designers think I'm colorblind), BitGravity (caution hideous flash site) and of course Akamai .
-
He spoke about outsourcing data center operations, using things like Amazon EC2 and S3 . I need to come up with a budget and time to play with EC2.
-
He stressed the importance of setting up KVMs etc properly for the data center.
-
Set up your infrastructure and plan for scaling before you get popular, because you will be too busy to do them afterwards. That's nice, I like building things scalable from scratch.
-
Specific infrastructure management tools:
-
Puppet -- seems pretty much a reimplementation of cfengine
-
Bcfg2 -- smells like academentia to me
-
ISconf -- from the Bootstrapping an Infrastructure people, seems to be based on the idea of a p2p distributed cache that stores pretty much a version control history of commands run.
As usual, I haven't yet seen anything that would actually seem to work in the real world, unless you give up everything you already have (like package management etc), and do things 100% their way.
His suggestion: as the tools are based on very different worldviews, look at everything and try to pick the one that matches your opinions.
-
-
One thing he wouldn't skimp on: "Don't skimp on RAM."
-
At Revision3, they use long-life server hardware and don't upgrade the servers, instead they go for a full new deployment.
posted on 2007-02-10, tagged as admin cluster configuration-management scale5x conference talk livejournal digg
IMAP over SSH Howto 2007-02-09
Tired of managing n+1
passwords? Hate having an extra network port
open on that server box? Want to have automated replication of email
to your laptop in a Unix command line geek-friendly fashion?
Here's how to make OfflineIMAP synchronize mail between local and remote Maildirs .
-
on the client:
-
create an SSH key pair with no passphrase:
$ ssh-keygen -t rsa -N '' -f ~/.ssh/imap-preauth-key
-
-
on the server:
-
install Binc IMAP on the server; no need to have it actually listen for network connections
-
I store my mail as ~/.Mail on the server; create a ~/.bincimap on the server and adjust to fit:
Mailbox {
    depot = "IMAPdir",
    umask = "0077",
    path = ".Mail",
}
-
create a shell script ~/bin/imapd-preauth that'll start the IMAP daemon in a preauthenticated mode; note that OfflineIMAP wants a certain style of handshake bincimapd doesn't know how to do, so we fix that with sed:
#!/bin/sh
set -e
export BINCIMAP_LOGIN=PREAUTH+FAKE
bincimapd | sed --unbuffered '1s/^FAKE OK PREAUTH/* PREAUTH/'
Make the script executable (duh).
-
authorize the previously generated SSH key to run only the above script -- add the following to ~/.ssh/authorized_keys (split here for readability, make it all one line; replace THINGS to fit):
command="/home/USERNAME/bin/imapd-preauth",no-port-forwarding,
no-X11-forwarding,no-agent-forwarding,no-pty SSHPUBLICKEYHERE
-
-
on the client:
-
tell OfflineIMAP about the preauthenticated IMAP connection:
[Account SOMETHING]
localrepository = local-SOMETHING
remoterepository = remote-SOMETHING

[Repository local-SOMETHING]
type = Maildir
localfolders = ~/data/mail/SOMETHING

[Repository remote-SOMETHING]
type = IMAP
remotehost = HOSTNAME
preauthtunnel = env -u SSH_AUTH_SOCK ssh -q -i ~/.ssh/imap-preauth-key %(remotehost)s fake-command
-
That should be it! Have fun.
(And if you just broke it, feel free to give one of the halves to me.)
posted on 2007-02-09, tagged as imap email offline software offlineimap ssh howto
Web 2.0 Explained 2007-02-05
Here's a nice video on what people mean when they say Web 2.0: http://youtube.com/watch?v=6gmP4nk0EOE
I like the style of it, though it is too fast paced if you don't already know what it is talking about.
(Found via Mickipedia )
posted on 2007-02-05
Six Word Scifi 2006-12-02
Sixwordscifi.com is so much fun it has to be wrong, somehow. Here are my favorites so far (yes, one of them is mine):
Back me up before I die.
— boogah
Even at light speed, I wait.
— jefft
All alone in his light cone.
— tv
posted on 2006-12-02
The Phone Killer Phone 2006-12-02
I now know what I want from my next phone. And it'll totally blow the whole phone concept out of the water.
Start with a mostly-open hardware platform like Neo1973 , add Linux (OpenMoko ) on top. And no need to cram in a clumsy qwerty keypad, just carry a one-hand keyboard when you care about it. Less clumsy when you don't want more than a phone, full SSH sweetness when you want it. The phone itself is purely touch-screen, and the keyboard can actually get respectable WPM with real keypress feedback. And the big part is, because the phone is actually Open, plugging all this in is not a big problem! That's just great!
posted on 2006-12-02
I need a bag 2006-11-17
I have a shoulder strap-style bag, manufactured and used by the German army, bought as surplus and dyed black. My 12" thinkpad fits perfectly inside of it. But the strap is sewn in place, and seems to fail every few years -- and now's the time.
The bag is really good, and I am going to get it fixed, but that doesn't mean I can't look at alternatives. So, I need something that fits a 12" thinkpad, isn't too big, preferably comes in black, and is otherwise non-attention grabbing and doesn't look like it'd contain a laptop. According to Lenovo, my laptop is about 268x211x20mm.
My options:
- small messenger bag: 1 , 2
- hard-shell attache case (not really my style, but a 12" one in black might rock it -- though I'm not going to buy a 15" case with internal padding to make a 12" laptop not bounce around, I want a smaller case too; it'd still be plenty big for the obligatory dead tree notes and docs; for pics, see 3 , 4 )
- the manliest purse ever from Maxpedition : I'd be willing to carry that (in black of course;) if it fit a 12".. And I don't think it will.
- more classical military style: M-51 Engineers Field Bag , Urban Explorer Black Canvas Shoulder , etc ..
- something from http://booqbags.com/ but the style doesn't really smack me in the face with want
- something from Timbuk2 , but they really don't seem my style
None of those really work for me. I guess considering the bags I already own, I might go for something I don't have. Mmm, a smallish hard-shell attache case in black, with a good enough shoulder strap that I can sling it over my head. That might just do it. Now where do I get one?
posted on 2006-11-17
A Revver command line video upload tool 2006-11-13
Update: It seems the script had gone missing at some point. It's back .
As you may or may not have noticed, I do a bunch of stuff for Revver. I ended up writing a sort of a tutorial to the Revver API, and as I like to collect all kinds of code samples here, I thought I should crossblog it here. The original is on the Revver developer blog .
One day, I was on a slow internet connection and wanted to upload a few files. I wanted something more batch-oriented than the web-based upload, and I have a personal bias against most current Java runtimes. So I decided to use the cool new API and write a video upload client , and will walk you through what it does in this blog entry. Feel free to "just" use the tool, but hopefully this will also help you in writing your own API clients.
First of all, I wanted to write something that's usable just about
everywhere. I tend to use Python
, so that's what the tool is written
in. The Python standard library didn't seem to be able to do HTTP POST
file upload (think web forms) of large files, so I ended up using
curl
for that. This should work on any Linux/OS X/etc box with Python
and curl installed. All you Ubuntu
/Debian
people just get to say
sudo apt-get install curl
and that's it.
So, let's dive right in. The tool is imaginatively named
revver-upload-video
. The first bit is the command line
parser. Don't be intimidated by the length, this is pretty much
boilerplate code, the actual API-using bits are really small. The
full file is 155 lines, total.
There are basically three kinds of options: mandatory, optional and for developer use. Mandatory options are enforced later on, and developer options are mostly meant for playing with the staging environment and reusing upload tokens from previous, failed, uploads.
#!/usr/bin/python
"""
Guerrilla command line video upload tool.
"""
import optparse, getpass, urlparse, urllib, xmlrpclib, subprocess

def getParser():
    parser = optparse.OptionParser(
        usage='%prog --title=TEXT --age-rating=NUM [OPTIONS] FILE..',
        description='Upload videos to Revver')
    parser.set_defaults(
        api_url='https://api.revver.com/xml/1.0',
        upload_url='http://httpupload.revver.com/',
        login=getpass.getuser(),
        )
    parser.add_option('--login',
                      help='login name to use (default: %s)' %
                      parser.defaults['login'])
    parser.add_option('--passphrase-file',
                      help='read passphrase from (prompt if not given)',
                      metavar='FILE')
    parser.add_option('--age-rating',
                      help='MPAA age rating (mandatory)',
                      type='int')
    parser.add_option('--title',
                      help='title for the video (mandatory)',
                      metavar='TEXT')
    parser.add_option('--tag',
                      help='tags (mandatory, repeat for more tags)',
                      metavar='KEYWORD',
                      action='append')
    parser.add_option('--author',
                      help='author of the video',
                      metavar='FULLNAME')
    parser.add_option('--url',
                      help='website for extra info')
    parser.add_option('--credits',
                      help='extra credits',
                      metavar='TEXT')
    parser.add_option('--description',
                      help='a brief description',
                      metavar='TEXT')
    parser.add_option('--api-url',
                      help='API URL to contact (developers only)')
    parser.add_option('--upload-url',
                      help='Upload URL to send the file to (developers only)')
    parser.add_option('--upload-token',
                      help='use preallocated token (developers only)',
                      metavar='HEX',
                      action='append')
    return parser
If you've used optparse
before, there's not much interesting
there. It just instantiates a parser object and returns it, for the
main function to use. Nothing there touches the Revver API yet.
Next up, we have some utility functions. getPassphrase
will read a
passphrase from the file given to --passphrase-file=
, or prompt
the user for one. getAPI
instantiates an XML-RPC client object
with the login and passphrase, and caches it in options
so if you
call getAPI
more than once, you're still only prompted for the
passphrase at most once. Nothing in revver-upload-video
uses that,
but these are meant to be reusable functions.
def getPassphrase(filename=None):
    if filename is not None:
        f = file(filename)
        passphrase = f.readline().rstrip('\n')
        f.close()
    else:
        passphrase = getpass.getpass('Passphrase for video upload: ')
    return passphrase
def getAPI(options):
    api = getattr(options, 'api', None)
    if api is None:
        passphrase = getPassphrase(filename=options.passphrase_file)
        (scheme, netloc, path, query, fragment) = \
            urlparse.urlsplit(options.api_url,
                              allow_fragments=False)
        query = urllib.urlencode([('login', options.login),
                                  ('passwd', passphrase)])
        url = urlparse.urlunsplit((scheme, netloc, path, query, fragment))
        api = xmlrpclib.Server(url)
        options.api = api
    return api
All right, now we're getting to the actual meat. getToken
calls
the API method video.getUploadTokens
to allocate an upload token
, which lets you upload a file to the Revver archive.
def getToken(api):
    url, tokens = api.video.getUploadTokens(1)
    assert len(tokens) == 1
    token = tokens[0]
    return url, token
createMedia
creates a new video in the archive from your uploaded
file by calling video.create
in the API. It also adds metadata
like your website URL to the video. Finally, it returns the media id
of the newly-created video.
def createMedia(options, token):
    data = {}
    if options.credits is not None:
        data['credits'] = options.credits
    if options.url is not None:
        data['url'] = options.url
    if options.description is not None:
        data['description'] = options.description
    if options.author is not None:
        data['author'] = options.author
    api = getAPI(options)
    media_id = api.video.create(token,
                                options.title,
                                options.tag,
                                options.age_rating,
                                data)
    return media_id
Finally, we have the main
function, and the bits that call it when
you run the tool. Here, we actually parse the command line arguments,
enforce the presence of the mandatory options, and bail out unless you
gave it actual files to upload.
For each file given on the command line, we either use one of the
tokens given to us with --upload-token=
, or get one from the API
with getToken
. Then we join the upload URL and the token to get
the place to upload the file to, and run curl
as a subprocess to
do the actual upload. Checking that curl
worked takes 8 lines, and
then we use createMedia
to actually create the video. And that's
it!
def main(progname, args):
    parser = getParser()
    (options, args) = parser.parse_args(args)
    if options.login is None:
        parser.error('You must pass --login=LOGIN')
    if options.title is None:
        parser.error('You must pass --title=TEXT')
    if not options.tag:
        parser.error('You must pass --tag=KEYWORD')
    if options.age_rating is None:
        parser.error('You must pass --age-rating=NUM')
    if not args:
        parser.error('Pass files to upload on command line')

    for filename in args:
        if options.upload_token:
            token = options.upload_token.pop(0)
            url = options.upload_url
        else:
            api = getAPI(options)
            url, token = getToken(api)
            print '%s: allocated token %s' % (progname, token)
        upload_url = urlparse.urljoin(url, token)
        retcode = subprocess.call(['curl',
                                   '-F', 'file=@%s' % filename,
                                   '--',
                                   upload_url,
                                   ])
        if retcode < 0:
            print >>sys.stderr, '%s: upload aborted by signal %d' % (
                progname, -retcode)
            sys.exit(1)
        elif retcode > 0:
            print >>sys.stderr, '%s: upload failed with code %d' % (
                progname, retcode)
            sys.exit(1)
        print '%s: used token %s for %s' % (progname, token, filename)
        media_id = createMedia(options, token)
        print '%s: created media %r from %r' % (progname,
                                                media_id,
                                                filename)

if __name__ == '__main__':
    import os, sys
    main(progname=os.path.basename(sys.argv[0]),
         args=sys.argv[1:])
At this point, we have all we need to do the same thing as the web-based upload, or the Java upload client. And you can do something similar, by yourself. This file is copyright Revver, Inc, but licensed under the MIT license -- that means you can use it as a base for writing your software, without any real restrictions. Download the whole thing here: revver-upload-video .
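A hypothetical invocation -- the file name and metadata here are made up, with the mandatory options filled in -- would then look like:

$ ./revver-upload-video --title='My first video' --age-rating=13 \
    --tag=demo --tag=commandline myvideo.avi

It prompts for your passphrase, hands the file to curl, and prints the media id when done.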
posted on 2006-11-13
New domain name 2006-09-30
I had a bit of fun thinking up puntastic DNS domain names. I ended
up registering eagain.net ; the old tv.debian.net name will soon
start redirecting there. Need to set up email too.. Vanity domains
are soo much fun.
For the rare non-nerd reading this, EAGAIN is the error code you
get when you are doing asynchronous programming with non-blocking
sockets and an operation would block. Err, let's just say "I write
async code and it's a neat insider joke", ok?
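If you want to see it in the flesh, here's a minimal sketch -- made-up peer, Python of the era -- of the situation that produces it:

import errno
import socket

s = socket.socket()
s.connect(('www.example.com', 80))  # made-up peer
s.setblocking(False)
try:
    data = s.recv(4096)             # nothing has arrived yet
except socket.error, e:
    if e.args[0] == errno.EAGAIN:
        # "try again": the non-blocking socket has no data right now
        pass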
For the nerds out there, here's a bunch of wild ideas I had while figuring out what domain name to register. Many of them are invalid (too short), and most of the good ones are already taken, but in case you need some inspiration:
- asyn.ch
- dot.at
- ex.plo.de / im.plo.de
- plea.se
- fal.se
- belong.us
- chi.hu (as in Chihuahua, our dogs..)
- celci.us
- blo.gr
- co.de
- co.ff.ee
- oh.no
- wh.ee
- wh.at
- sh.it
- stre.am (taken)
- up.stre.am
- down.stre.am
- upstre.am
- em.ploy.ee
- s.tre.am
- 3x2.net (as in triple-double-w)
- tribledouble.net
- 2by4.net (as in clue-by-four)
- be.am (la.ser.be.am, scotty.up.be.am ;-)
- dre.am
- progr.am
- pu.sh
- pu.bli.sh
- form.at
- anon.ymo.us
- pla.net
- mag.net
- carwa.sh
- blo.at
- cave.at
- repe.at
- pho.to
- oct.et
- pron.to
- plu.to
- lot.to
- dit.to
- adju.st
- ang.st
- arti.st
- ava.st (type like a pirate)
- broadca.st
- inner.net
- comm.it
- mis.info
- foc.us
- fung.us
- geni.us
- gur.us
- octop.us
- styl.us
- stat.us
- tor.us
- line.br
- deco.de
- while.do
- un.do
- weir.do
- if.then.fi
- cook.ie
- newb.ie
- zomb.ie
- disk.io
- comm.it
- subm.it
- bat.ch
- bran.ch
- epo.ch
- fet.ch
- zil.ch
- gra.ph
- boo.st
- fibona.cc
- from.to
- justa.com
And for the Finns:
- mi.au (cat lovers)
- vai.nu (dog lovers)
- kir.nu
- masen.nu
- tie.dos.to (taken, by non-Finn)
- hit.to
I'm not going to say anything about the Cook Islands' subdomain for commercial entities.
posted on 2006-09-30, tagged as computer nerd dns
In case your Xen domU's have networking trouble 2006-05-21
If your domUs have networking trouble with TCP, or some other
protocol that ends up needing fragmentation such as large ICMP pings,
you need to read this.
If it seems TCP handshakes complete, but no data is transferred -- especially, no actual data gets sent out from the domU -- you're likely hitting a bug in how Xen interacts with TCP segmentation offload.
The bug seems to depend on the actual network interface card the traffic is going out from. I hear tg3 is one of the cards that triggers it, and I'm seeing it on my home box with 8139too's.
The fix is pretty simple, but hard to figure out unless you know what to look for: inside the domU, run
ethtool -K eth0 tx off
for each interface affected.
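The setting doesn't survive a reboot on its own. On a Debian-style domU, one way to reapply it automatically -- a sketch, assuming ifupdown manages the interface -- is a post-up hook in /etc/network/interfaces:

auto eth0
iface eth0 inet dhcp
    # reapply the offload workaround every time the interface comes up
    post-up ethtool -K eth0 tx off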
See http://wiki.xensource.com/xenwiki/XenFaq#head-4ce9767df34fe1c9cf4f85f7e07cb10110eae9b7 for the very small amount of extra information that is out there.
posted on 2006-05-21, tagged as xen networking computer
iBook--, Thinkpad++ 2006-05-21
Life sucks and then your computer breaks. As soon as a new X release is out, and it seems dual head on iBook is a possibility , the darn thing decides to fry its logic board. Again. Thankfully, Apple may make the repair for free, if the symptoms match the manufacturing problem . Again.
Well, the good news is that after 3 years of using the iBook, I got a new laptop. A Lenovo Thinkpad x60s, weighing just 1.3kg. It's so light I always think I forgot to put it in the backpack.
Ubuntu Dapper (flight 7) seems to work pretty well on the x60s. Trouble spots so far:
- install CD corrupted display during X autoconfig: screen was in text mode, mostly black, with two or three character-size grey rectangles -- hitting enter blindly let it continue and reboot the machine
- suspend and hibernation fail on resume
- wlan hung once, and didn't recover until I rebooted into Windows -- unngh. Look at this:

  ipw3945: Error sending SCAN_ABORT_CMD: time out after 500ms.
  ipw3945: Radio Frequency Kill Switch is On:
  Kill switch must be turned off for wireless networking to work.
  ipw3945: Error sending ADD_STA: time out after 500ms.
  ipw3945: Error sending RATE_SCALE: time out after 500ms.

  After that, any attempt to use the interface ended with:

  ADDRCONF(NETDEV_UP): eth1: link is not ready

- hotplugging the UltraBay docking station does not seem to work in Linux
I especially love the dual headedness, after fighting with the ATI driver in the iBook.
Now I need to see about hooking the fingerprint reader up to PAM.
posted on 2006-05-21, tagged as computer hardware ubuntu
My iBook has two heads 2006-04-23
Finally, after two years of hacks , my iBook 2.2 knows how to multihead! And not just silly clone mode -- a totally different image on the external output, at 1600x1200 at 85Hz. This is nice! Thank you X.org people for version 7, thank you X Strike Force !
Update: well, now suspending fails and booting the machine results in a black screen in over half of the tries. Bah.
posted on 2006-04-23, tagged as ibook computer hardware
render_pattern : Repeat patterns easily in Nevow templates 2005-12-21
After render_fragment
, dialtone
mentioned render_pattern
,
which would get one or many patterns from the page and put them in the
current tag. Well, that's easy to write:
def render_pattern(self, name):
    """
    Find and render a pattern.

    Example:

    <span nevow:pattern="foo">
      I'm very repetititive.
    </span>

    <ul>
      <li nevow:render="pattern foo">
        this text will get removed when rendering
      </li>
      <li nevow:render="pattern foo"/>
    </ul>
    """
    def f(ctx, data):
        doc = self.docFactory.load(ctx)
        patterns = inevow.IQ(doc).allPatterns(name)
        return ctx.tag.clear()[patterns]
    return f
Updated to adapt doc to inevow.IQ
before calling
allPatterns
.
posted on 2005-12-21, tagged as nevow twisted python programming
render_fragment : Reusable fragment embedding in Nevow templates 2005-12-21
This Nevow
renderer came up on #twisted.web. Thanks to rwall
and
dialtone
for input.
def render_fragment(self, name):
    """
    Find and render a fragment, with optional docFactory.

    Find a fragment factory from self via attributes named
    fragment_* and replace content of current tag with said
    fragment.

    If pattern docFactory is found under this tag, pass it as
    docFactory to the fragment factory.

    Example:

    class MyFrag(rend.Fragment):
        ...

    class MyPage(rend.Page):
        fragment_foo = MyFrag
        ...

    and give MyPage a template with

    <!-- no docFactory -->
    <div nevow:render="fragment foo">
      this text will get removed when rendering
    </div>

    <!-- with docFactory -->
    <div nevow:render="fragment foo">
      this text will get removed when rendering
      <span nevow:pattern="docFactory">
        but this whole tag will be passed as docFactory to MyFrag.
      </span>
    </div>
    """
    def f(ctx, data):
        callable = getattr(self, 'fragment_%s' % name, None)
        if callable is None:
            # fallback: render an error message in place of the missing fragment
            callable = lambda **kw: ctx.tag[
                "The fragment named '%s' was not found in %r." % (name, self)]
        kwargs = {}
        try:
            docFactory = ctx.tag.onePattern('docFactory')
        except stan.NodeNotFound:
            pass
        else:
            kwargs['docFactory'] = loaders.stan(docFactory)
        return ctx.tag.clear()[callable(**kwargs)]
    return f
posted on 2005-12-21, tagged as nevow twisted python programming
render_if : Conditional Parts in Nevow Templates 2005-12-17
This Nevow renderer has saved me a lot of time:
def render_if(self, ctx, data):
    r = ctx.tag.allPatterns(str(bool(data)))
    return ctx.tag.clear()[r]
Use it like this:
<nevow:invisible nevow:render="if" nevow:data="items">
  <ul nevow:pattern="True"
      nevow:render="sequence">
    <li nevow:pattern="header">The items are a-coming!</li>
    <li nevow:pattern="item">(the items will be here)</li>
  </ul>
</nevow:invisible>
And now, if the list returned by data_items
is empty, there
will be no <ul>
tag at all in the output.
I just realized non-boolean tests may be wanted -- for example, test
if a string matches a regexp. You could do that by mangling the data
before render_if
, but that's not nice, because then you don't have
access to the original data inside nevow:pattern="True"
. So,
instead let's parametrize the test:
def render_ifparam(self, name):
    tester = getattr(self, 'tester_%s' % name, None)
    if tester is None:
        callable = lambda context, data: context.tag[
            "The tester named '%s' was not found in %r." % (name, self)]
        return callable
    def f(ctx, data):
        r = ctx.tag.allPatterns(str(bool(tester(data))))
        return ctx.tag.clear()[r]
    return f
Note how we still cast the return value of the tester to boolean. You
could avoid that and call the renderer render_switch
. Adding
support for Deferred tests would be quite easy, too. The only ugly
part is I don't know of any way to make the same renderer work nicely
for nevow:render="if"
and nevow:render="ifparam foo"
.
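For instance -- assuming render_ifparam above lives on the same page class, and with a tester name I just made up -- a regexp test could look like:

import re
from nevow import rend

class MyPage(rend.Page):
    # referenced from the template as nevow:render="ifparam hasdigits"
    def tester_hasdigits(self, data):
        # show the True patterns only when the data contains a digit
        return re.search(r'\d', str(data)) is not None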
[Updated to add return f
, also renamed second render_if
to
render_ifparam
to clarify things a bit. Thanks k3mper
.]
posted on 2005-12-17, tagged as nevow twisted python programming
My review of Twisted Network Programming Essentials 2005-11-29
I've just finished my review of the book Twisted Network Programming Essentials by Abe Fettig. Now go read it .
You may also be interested in an earlier review I wrote about Foundations of Python Network Programming .
posted on 2005-11-29, tagged as book-review twisted python programming
turku-dev: Developer gathering in Turku 2005-11-22
This entry is about a local software developer gathering, originally written in Finnish.
What?
An informal get-together for people who make software for a living and/or as a hobby, or are otherwise interested in the topic.
We'll get to know people, have pleasant conversation, eat food. If you want to tell people about the great new software you're just now building, you'll probably find a kindred spirit. If you need help with a tricky problem, somebody has probably done something similar at some point. And knowing people doesn't exactly hurt your career, either.
It doesn't matter whether your tool is C, Perl, Java, Python, Ruby or PHP; or Linux, BSD, OS X or even Windows. Free/open source software is important to many of us, so you won't be able to avoid it entirely, but the point is simply to get like-minded people together.
Where?
Restaurant Harald in central Turku; see the map .
We're mostly interested in activity around the Turku region, but ideas of a "road show" have been floated, so in the future the developer gathering might show up in a pub near you, too.
When?
This Saturday, 26 November, starting around 12:30. For as long as the enthusiasm lasts.
The next one probably sometime in January, and from then on maybe every month or two.
Who?
This event has no official organizer, and it is not affiliated with any association or the like. I started talking to people I know about it; Tero Kuusela has done nearly all of the preparation work.
Backgrounds and interests of the people interested so far: VSTKY, Linux-Aktivaattori, Debian, Python, Linux kernel, Google Summer of Code, etc.
Count me in!
Join the mailing list. The list address is [email protected] and you subscribe by sending a message to [email protected] and replying to the confirmation request.
To help estimate the headcount, please announce in advance that you're coming, to Tero Kuusela [email protected] .
posted on 2005-11-22, tagged as turku-dev finland software-development meeting lang:fi
The MochiKit screencast is very nice 2005-11-21
"It's simply a more convenient syntax."
"MochiKit is full of more convenient syntax."
The MochiKit screencast is great. I think screencasts are a great way to introduce people to new software.
posted on 2005-11-21, tagged as javascript ajax twisted python
I registered at Technorati.com 2005-11-21
I just registered at Technorati , and it requires me to post an entry claiming my Technorati Profile .
..and when I registered an alternate URL for it, they required me to do it all over again: Technorati Profile .
posted on 2005-11-21, tagged as technorati
New website template 2005-11-20
Just finished a new website layout. I'm reasonably pleased with it.
posted on 2005-11-20, tagged as web
Python is confusing 2005-11-02
>>> def simple(): yield 'a'
...
>>> ', '.join(simple())
'a'
>>> def horrible():
...     if ' ' not in False: yield 'a'
...
>>> ', '.join(horrible())
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: sequence expected, generator found
>>>
But it does accept generators!
(Yes, I know what triggers it to say that. It's still horribly misleading.)
posted on 2005-11-02, tagged as python programming
Using nevow.guard the smart way 2005-09-16
<ronwalf> ok, I give up... How do I get the AVATAR_LOGIN stuck
between the SessionWrapped resource url and the current
resource url
Well, we aim to please.
def getActionURL(ctx):
    request = inevow.IRequest(ctx)
    current = url.URL.fromRequest(request).clear()
    root = request.getRootURL()
    root = url.URL.fromString(root)
    assert root is not None
    root = root.pathList()
    me = current.pathList(copy=True)
    diff = len(me) - len(root)
    assert diff >= 0
    action = current
    if diff == 1:
        action = action.curdir()
    else:
        while diff > 1:
            diff -= 1
            action = action.parent()
    action = action.child(guard.LOGIN_AVATAR)
    for element in me[len(root):]:
        action = action.child(element)
    return action
Comment from ronwalf (on IRC) on 2005-09-17T00:06:57:
<ronwalf> Better. after root = root.pathList()
<ronwalf> if root == ['']: root = []
posted on 2005-09-16, tagged as python twisted programming
Turuxi gathering looks a bit too boring :( 2005-09-16
The Turuxi gathering today looks a bit too boring for me, personally. I think I'll go xycle instead.
posted on 2005-09-16