Jochem van der Vorm Musings http://vorm.net/
Jochem van der Vorm <joch3m@vorm.net>
Personal musings and rants on life, culture, music and technology
vorm.net atom-generator 2013-05-26T02:00:54Z

256 color terminals http://vorm.net/256_color_terminals 2008-09-26T15:00:38Z 2008-09-26T16:00:14Z

While the rest of the techworld is getting excited about their aero, quartz or compiz 32bit color 3D desktop, I reached a less impressive milestone today:

From now on I use a 256 color terminal ;-). Actually I looked into this earlier, but since gnome-terminal and screen didn't support it back then (maybe upstream, but not in my distro), I enabled support just today.

Excellent documentation on this topic is already available in multiple places, but to summarize for myself:

.screenrc 
attrcolor b ".I"
termcapinfo xterm 'Co#256:AB=\E[48;5;%dm:AF=\E[38;5;%dm'
defbce "on" 
startup_message off

.bash_profile 
TERM=xterm-256color; export TERM

.vimrc 
set t_Co=256 
colorscheme asudark

For vim you need a special colorscheme (gvim colorschemes do not work). My hacked 256 color vim scheme, which works better with a transparent background than the original, is available here.
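To check that the whole chain (terminal, screen, TERM) really passes 256 colors through, it helps to print the palette. A minimal sketch, assuming a Python 3 interpreter is at hand; it emits the same `48;5;N` background sequence that the termcapinfo line above declares, and the function names are made up for this example:

```python
def color_cell(index: int, text: str = "  ") -> str:
    """Return `text` drawn on background color `index` (0-255), followed by a reset."""
    if not 0 <= index <= 255:
        raise ValueError("color index must be in 0..255")
    # ESC [ 48 ; 5 ; N m selects background color N from the 256 color palette
    return f"\x1b[48;5;{index}m{text}\x1b[0m"


def palette_rows(per_row: int = 16) -> list[str]:
    """Build the full palette as printable rows of colored cells."""
    return [
        "".join(color_cell(i) for i in range(start, start + per_row))
        for start in range(0, 256, per_row)
    ]


if __name__ == "__main__":
    print("\n".join(palette_rows()))
```

If the rows after the first one come out as repeats of the basic 16 colors (or as garbage), something in the chain is still running in 8 or 16 color mode.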

Ubuntu Hardy Heron http://vorm.net/ubuntu_hardy_heron 2008-04-19T22:00:00Z 2008-08-29T16:00:05Z

Today I installed Ubuntu Hardy Heron on a desktop computer. I am pretty distro/desktop agnostic (see my ten year anniversary post), but since Debian Etch is a bit old for a desktop and Hardy Heron has Long Term Support (I do not want to update/fix a computer every half year!!) Ubuntu was my choice.

There are countless gnome improvements which I don't care about, but some things stand out for me in this release.

  • Firefox 3 is much better than 2. The memory improvements and the renewed linux (gtk) focus help. And luckily you can disable the stupid new urlbar in about:config with browser.urlbar.maxRichResults = 0.
  • The way installation of non-free software for media playback, video drivers and browser plugins is handled is sweet! The installation of this software is painless here. I even got a nice 64-bit(!) Java firefox plugin, which I was unable to install in Debian (thanks redhat/icedtea).
  • And most important: the colors in the gnome-terminal are smoothed. Now that is what _I_ call eye candy. Less work and much more satisfaction than the integrated compiz.real ;-).

So thanks again for all free software developers!

Helping the environment http://vorm.net/helping_the_environment 2008-02-02T19:00:00Z 2008-02-02T19:00:00Z

A lot of work is being done on making linux suspend work better. For me it works perfectly (from linux kernel 2.6.20 or so). Therefore I wanted to go a step further and let my class A, heavy power-using receiver switch off when my desktop computer suspends (to ram), and switch back on when my computer wakes up. The motivation is that I only listen to music via my computer.

To achieve this I bought the Gembird Silver Shield, a USB-switchable power adapter. I was prepared to do some nice USB snooping and C programming to get this device working in linux, but (un)fortunately there was already a working utility for this device.

Configuration for suspend/hibernate is not so easy and documentation is sparse for the user mode utilities. Since it took me more than the usual googling I will summarize my conclusions here for later use. First the gnome-screensaver measures the idle time. After this timeout has expired the gnome-power-manager starts to measure its own timeout (so before suspend the two times will stack). When the gnome-power-manager times out it will look at the Inhibit flag. If nothing is inhibiting (for example my rhythmbox sets Inhibit, because I do not want to suspend when music is playing) your computer will suspend.

Since a custom script has to be added to switch the power socket off via usb with sispmctl, the suspend-backend is important. Gnome power manager can use multiple backends to go to suspend mode. This works via hal (the hardware abstraction layer). Configuration for this is in /usr/share/hal/information. HAL is responsible for calling the suspend-backend. The default suspend-backend on my machine is pm-utils (which can be tested with pm-suspend, pm-hibernate, pm-powersave etc.). I also have a package called hibernate (which can also suspend, confusing isn't it?). A third one is suspend2 (which can also hibernate....). These backends have different ways of adding custom hooks.

To add a hook to hibernate I added a file called local in /etc/hibernate/scriptlets.d/. The API is as follows (ugly in my book):

# -*- sh -*-

UsbPowerSocketDown() {
      /usr/bin/sispmctl -f1
}
UsbPowerSocketUp() {
      /usr/bin/sispmctl -o1
}

AddUsbOptions() {
      AddSuspendHook 10 UsbPowerSocketDown
      AddResumeHook 10 UsbPowerSocketUp
      return 0
}

AddUsbOptions

Pm-utils has a much nicer API. To add a custom hook, add a file to /etc/pm/sleep.d. This uses init style ordering, so look in /usr/lib/pm-utils/sleep.d/ for a proper number. I needed to talk to the usb-bus AFTER the modules were loaded, so I needed a number lower than 50. I added /etc/pm/sleep.d/10usbpoweroptions with content like this:

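#!/bin/bash
# pm-utils passes the action (suspend, resume, ...) as the first argument
case "$1" in
    suspend)
        # switch outlet 1 off before going to sleep
        /usr/bin/sispmctl -f1
    ;;
    resume)
        # wait a second before talking to the device again, then switch outlet 1 on
        sleep 1
        /usr/bin/sispmctl -o1
    ;;
esac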

After all this fiddling it works like a charm! Now hopefully one standard will emerge, because I don't know how to achieve the same result with KDE. I had to manually patch rhythmbox to change calling (via dbus) the Inhibit method from org.gnome.powermanager to org.freedesktop.powermanagement (because I used a wrong combination of versions), so this suggests a move in the right direction.

Qemu/kvm and xorg screen resolution http://vorm.net/qemu_and_xorg_screen_resolution 2007-10-01T13:00:00Z 2007-10-01T13:00:00Z

Fast virtualization is cheap nowadays on linux. Just apt-get install kvm on a recent kernel and a VT-enabled processor and you are ready to run all kinds of different os-es on your host. No more recompiling the vmware kernel modules *again* or fiddling with xen-modified kernels and difficult networking setups. Yeah!

One problem I encountered was getting a decent resolution (e.g. 1280x1024 or higher) in the guest xserver (xorg). Google was not really helpful this time, so I post this note for future reference. Qemu emulates two kinds of video cards: the standard one is a cirrus, and -std-vga provides you with a vesa one. Vesa was not able to help me (garbled screen), so using the (default) cirrus emulator is the way to go for a linux guest.

The trick was in changing the monitor sync and refresh rates; autodetection did not work properly (in my centos 4.0 guest). So here is my working snippet of /etc/X11/xorg.conf:

Section "Device"
  Identifier  "Generic Video Card"
  Driver      "cirrus"
EndSection

Section "Monitor"
  Identifier  "Generic Monitor"
  Option    "DPMS"
  HorizSync 28-64
  VertRefresh 43-60
EndSection

10 years GNU/Linux http://vorm.net/10_years_gnu_linux 2007-09-16T20:00:00Z 2007-09-16T20:00:00Z

I have been using GNU/Linux for 10 years! As a celebration I made a white chocolate cake (recipe) with tux on it. Thanks to all the volunteers for this great combination of free software. In the past I have used all major distributions; nowadays I am stuck with Debian. In fact I am rather distro-agnostic: if it runs xterm, screen, vim, mplayer, gcc and mutt I am happy. Pro is the nice Debian philosophy, con the fact that it has not incorporated SELinux yet. So perhaps I will switch again in the next 10 years?

Sidenote for R. Stallman: when I have been using GNU/Linux for 25 years I promise to make a gnu cake :-D.

Alsa versus OSS http://vorm.net/alsa_versus_oss 2007-06-27T22:00:00Z 2007-06-27T22:00:00Z

Last week I read a thread on the mplayer-dev mailinglist. It was about setting the default sound output from Alsa to OSS. It is now OSS for a default compile of the tarball. However, OSS is deprecated in linux AND almost all distributions that ship mplayer (my favorite media player) set the default output to Alsa.

Most of the mplayer developers in charge are pro OSS and do not want to switch. One of the arguments is that OSS is now GPL and is used on other unices as well. There are also arguments about library vs. unix semantics. Alsa has an OSS compatibility mode (actually two, one in-kernel and one in-library). However, using this blocks soft-mixing. This means that another program cannot play sounds while a default compile of mplayer is playing.

Both sides have some good arguments (I truly understand the points for OSS, it is just not realistic), but discussing them in this context is useless and the way it is done on this mailing-list is very childish and out of touch with reality. Mplayer should set Alsa as default output for linux, because 1) all major distributions do this and do this with good reason: 2) OSS in linux is deprecated. Reading this thread made me feel tired of (the immaturity of) the open source scene.

But today I am a happy guy again. Some troll suggested removing Alsa from the linux kernel and I was afraid of another useless thread. However I found a message from Takashi (one of Alsa's most productive driver and patch writers):

Honestly, I'm not fully against changing the current code base (or crap, whatever, any childish name). There are indeed many misdesigns. But, replacing with the above is no option, IMO. The OSS have also many misdesigns, so the same argument would start again. One should learn something from history...

Anyway, if it's going to be more constructive, I'm willing to join in.

Takashi

Hannu (the main OSS developer) then said this:

We have no intention to push OSS back to the kernel or to replace ALSA. That alternative is not realistic any more. In addition OSS is a cross-platform product and staying more or less outside various kernel trees should provide better flexibility.

What we would like to push is that the old "deprecated" OSS/Free are removed from the kernel. OSS/Free is based on about years old OSS API version which was too limited for many applications. Having OSS/Free in the kernel doesn't serve any purpose.

Also we would like to stop the silly OSS vs ALSA war. OSS and ALSA are rather different. Both of them have some good points and bad points. For ordinary users it doesn't matter which API is used by the applications as long as they work. Just the application developers can see the real difference. Some of them prefer OSS while some other prefer ALSA and this should be their "freedom of choice".

I think the ideal solution would be that both ALSA and OSS APIs can co-exist by sharing the same low level drivers (which has already been demonstrated). The low level driver interfaces in both systems are practically identical. This means that ALSA's core can work with OSS' drivers and vice versa.

Today both OSS and ALSA teams have to spend significant amounts of time in emulating the "alien" APIs. Making OSS and ALSA to co-exist will require some work in both sides but that should be nothing when compared to the effort required for emulation.

Just my 2 cents.

Best regards, Hannu

Fortunately there are still a lot of sane people in the community. Thanks!

Dead on http://vorm.net/deadon 2007-04-03T21:00:00Z 2007-04-03T21:00:00Z

Good presentation on captchas by google. See this video. The group that is headed by the prof doing the presentation should easily be able to break the currently used image captchas if their statements are true...

Perhaps I like it because it perfectly reflects my own opinion though ;-).

You've got mail http://vorm.net/youvegotmail 2007-01-16T22:00:00Z 2007-01-16T22:00:00Z

Since I have searched for the maildirmake command on debian now two times (using http://packages.debian.org and apt-cache) with no result, I decided to blog the answer here.

You need to apt-get install courier-base (and then /etc/init.d/courier-authdaemon stop; update-rc.d -f courier-authdaemon remove). This is not logical (I do not need courier at all, only postfix and mutt), so this post will help me and probably others in the future.

[update]
After hitting save I thought of an easier solution (?), use something like
echo 'function maildirmake { mkdir -p "$1"/{cur,new,tmp}; chmod -R 700 "$1";}' >> ~/.bashrc
[/update]
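The same idea without any shell, for scripts that need to create a mailbox: a minimal sketch in Python that makes the three Maildir subdirectories and restricts them to the owner. It mirrors the bash function above rather than the courier tool, and the function name is simply reused for this example.

```python
import os
import tempfile


def maildirmake(path: str) -> None:
    """Create path/{cur,new,tmp} and make every directory under path mode 0700."""
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)
    # makedirs honours the umask and skips existing directories,
    # so set the permissions explicitly afterwards
    for root, _dirs, _files in os.walk(path):
        os.chmod(root, 0o700)


if __name__ == "__main__":
    # Demonstrate in a throwaway directory instead of touching a real mailbox
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "Maildir")
        maildirmake(target)
        print(sorted(os.listdir(target)))
```

Unlike chmod -R this only touches directories, which is enough for a freshly created, empty mailbox.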

Away http://vorm.net/away 2006-12-25T23:00:00Z 2006-12-25T23:00:00Z

Linux and 64 bit computers are a good combination. The first linux 64 bit support dates from 1995. To be able to compile everything from source really helps supporting a platform. Windows mostly runs 32 bit programs at this moment, because proprietary software vendors (which account for most windows software) are not really fast in releasing new versions. Not all is well for linux though. Macromedia Flash is such a proprietary program of which no 64 bit player exists. Now, I do not really like flash (see also a previous entry about this), but for some reason flash is used to display movies on the web. If you think that that is weird, I agree. What has a vector renderer to do with a movie codec? The masters from youtube.com can perhaps tell you; we just have to comply.

I like to watch movies, so now I have a problem... Or not? Fortunately the flash way of bringing movies is just a container trick around a normal mpeg movie, and my favorite audio/movieplayer mplayer can play them just fine.

So if you are on debian, add contrib to /etc/apt/sources.list and apt-get install iceweasel iceweasel-greasemonkey mozilla-mplayer (the mplayer plugin is broken for firefox in etch at this moment). Now you only need a way to let mozilla know how to feed the movies. This can be done with a simple script like Michael Sheldon's.

So finally I can be an ultrahip blagotuber and bring you movies ;-)

Happy Christmas Everyone!

Decipher http://vorm.net/decipher 2006-10-14T15:00:00Z 2006-10-14T15:00:00Z

For some time I have been a user of encfs. I actually wanted to encrypt my whole root filesystem (just for fun, nothing to hide ;-)), but the loopback way is a hack and the old weak initialization vectors make watermark attacks easy. The weak key management was also a showstopper. Now that luks is relatively standard I hoped that with Ubuntu's upcoming Edgy Eft the dm-crypt + luks setup would be well enough integrated.

Unfortunately this is not the case. The installer does not support this yet. There is a myriad of conflicting documentation on how to set this up properly. Most of it is misleading and outdated. There are a thousand ways to do this.. but I wanted to do it in a way that will be supported properly in the future. This guide is the best there is on this topic and following it literally does work for Edgy Eft.

One thing Edgy has changed (compared to the guide, which is actually for Dapper Drake) is that the latest cryptsetup package already has encrypted root initramfs hooks. By adding cryptopts to the kernel line (or via kopts in grub) or creating a /etc/initramfs-tools/conf.d/cryptoroot file, the system _should_ come up automatically with the passphrase question. After a lot of fiddling (every distro seems to have its own way of specifying the parameters) I still did not get this working. If you use the kernel line options, cryptsetup is not installed in the initrd, and if you use the conf file option, the proper kernel modules are not loaded.

Usually I behave like a good open source citizen and file nice bug reports about this, instead of whining in a journal entry. This time the 1000 different ways of doing these + the already very confusing bug reports about this package, left me feeling disqualified to do so. In the end, just following the guide for ubuntu and writing my own initramfs non configurable hook functions (which conflict with the future cryptopts settings !!) seems the best way right now. What makes the situation even more difficult for starters is that google does not list the proper page for ubuntu but a very outdated howto, so hopefully this entry will help the proper guide bubble up.

Hopefully this is something that is going to get better in future! Distributors, please fix this and standardize! On the bright side, the debian future for this looks most promising, clean implementation and support in the beta installer..

Faraway voice http://vorm.net/farawayvoice 2006-06-14T21:00:00Z 2006-06-14T21:00:00Z

Hacked a bit more on audio captchas lately, but the source is not in releasable form right now.. Anyway, I now recognize the audio captchas from microsoft 95% correct and from google (also blogger/blogspot) 60%+ by tweaking the segmentation. captchas.net (35%) and paypal.com (10%) are also doable, but some improvements are still needed.

Time to add some neural network learning.

Forever may not be long enough http://vorm.net/forevermaynotbelongenough 2006-02-21T21:00:00Z 2006-02-21T21:00:00Z

Some rights reserved by nailbender (http://www.flickr.com/photos/nailbender)

For years I have struggled to find a proper music player. Most players are too playlist- and metatag-oriented for my taste. My files are all stored in a nice directory hierarchy with proper filenames. The tags are very incomplete and buggy (a lot of my music files predate the ID3 tag standards, and i18n is a disaster in most tags).

XMMS and winamp were pretty usable and fast (no meta-data reading in advance), but newer versions tried to do the same database building as other new programs (and failed). One feature was particularly missing from these players: the ability to play random albums. This should (imho) be the standard setting of any music player: you are working and want to listen to an album in the background. It would also be nice if you could give a subset of your collection (for example all jazz music) and then the software picks an album for you in this genre. XMMS and winamp lack these features.

Even newer music players (banshee, amarok, rhythmbox, XMMS2, beep, windows media player, mpd, etc.) cannot do this, although they come closer nowadays. However I have lots of trouble trying them. For years they crashed on loading my large (100GB+) collection. Now they usually do not crash anymore, but start crunching for hours (sometimes days), and when they finish loading the library and you restart the application, it just crunches on its index again for minutes. And more often than not, they load the index (sometimes 100MB+) into memory. This is intolerable if you just want to listen to one song.

I realized pretty quick that just whining is not going to help, so I wrote my own player in a few lines of Perl. It is far from perfect: I really want to:

  • NOT start gconf-tool externally
  • use gstreamer-perl bindings, instead of mplayer
  • thread/fork so better key-bindings are possible

Despite these drawbacks I have used it for months already and it works flawlessly. Therefore I release the script to the world. It sets your album cover as the background and just starts playing random directories (=albums in my library).

download randomalbum.

Defeating audio (voice) captchas http://vorm.net/captchas 2006-02-20T17:00:00Z 2012-06-05T09:00:18Z

This article is old and kept here as a historic reference. For more up to date information about breaking audio captchas see for example Elie Bursztein, who builds on my previous work but in a much more academic fashion.

Introduction

For some years semi turing tests under the name of "captchas" can be found on the web, to prevent bots from filling in forms. When I first saw the visual variant I thought recognizing the characters with a computer algorithm should be easy. A bit of surfing and searching on the internet taught me that I was right, most were broken already. Reinventing the wheel is not very useful, so I left the topic alone.

Later I found a post about voice captchas. Since there was not too much information about this on the net and I was bored (ill at home), I decided to give it a shot. I started easy, willing to enhance the algorithms to those used in speech recognition (like hmm, viterbi, baum-welch, entropy coding, etc.) when needed. This proved not to be necessary: the first feature complete (segmentation and matching) code worked relatively well on microsoft's captchas. Later I tweaked it a bit to also work on google captchas.

On this page you can find proof of concept code to break voice captchas. Do not expect advanced software (pattern recognition science is so much further) or code that can be used in other projects; I quit the project when it worked. Initially (february 2006) I kept the code on my harddisk, but later (may 2006) I published it (see disclosure motivation).

How does it work

This is not a complete guide, but some pointers to the source (read it luke). As a starting point, consider the configtype struct:

typedef struct {
    int samplerate;
    int byterate;
    int winsize;
    int band_cnt;
    int word_length;
    int word_overlap;
    int threshold_energy;
    int file_offset;
    char trainfile[255];
} configtype;

The program starts by reading the audio file (it could read the samplerate and byterate from the header, but I am lazy). file_offset bytes are skipped at the beginning of the file, because google starts with a bell. As a first step all samples are treated with a Hamming window (an arbitrary choice, most window types should do). The winsize is in samples (e.g. 512 samples at 8000 Hz gives a 64 ms window). Next the blocks are transformed into the frequency domain with a DFT. After that the frequencies are put in band_cnt bins. These bins are not equally wide: the higher the frequency, the wider the band (this has to do with human hearing (mel/bark scale), but I doubt this is actually useful in the current incarnation of the program).
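To make the window, DFT and band steps concrete, here is a minimal sketch in Python (not the C source of devoicecaptcha). The quadratic band spacing in band_edges is an assumption picked only because it makes higher bands wider; the real program may space its bands differently, and a plain O(n²) DFT stands in for fftw.

```python
import cmath
import math


def hamming(n: int) -> list[float]:
    """Hamming window of n samples."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(block: list[float]) -> list[float]:
    """Window one block and return the power of DFT bins 0 .. n/2 - 1."""
    n = len(block)
    x = [s * w for s, w in zip(block, hamming(n))]
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2)
    ]


def band_edges(n_bins: int, band_cnt: int) -> list[int]:
    """Split n_bins DFT bins into band_cnt bands that get wider with frequency.

    The quadratic spacing is an assumption made for this sketch.
    """
    if not 1 <= band_cnt <= n_bins:
        raise ValueError("need 1 <= band_cnt <= n_bins")
    edges = [0]
    for i in range(1, band_cnt + 1):
        target = round(n_bins * (i / band_cnt) ** 2)
        edges.append(max(edges[-1] + 1, target))  # every band gets at least one bin
    edges[-1] = n_bins
    return edges


def band_energies(block: list[float], band_cnt: int) -> list[float]:
    """Energy per band for one window of samples."""
    spectrum = power_spectrum(block)
    edges = band_edges(len(spectrum), band_cnt)
    return [sum(spectrum[lo:hi]) for lo, hi in zip(edges, edges[1:])]
```

With a 64 sample window at 8000 Hz, a 1000 Hz tone lands in DFT bin 8, so almost all of its energy shows up in the band that contains that bin.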

Now the program looks at the highest frequency bin. Every block that has more energy in a window than threshold_energy is considered a peak, and these peaks are used to segment the input file into the different spoken words. The word_length tells the program how many windows long a word is (so all words are considered the same length, which is a current weakness of devoicecaptcha). word_overlap helps in localizing the peaks. When the locations of the words are known, all frequency bins are written for word_length windows around the peaks. This is called the profile of the word.
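A rough sketch of this segmentation step in Python, under some loud assumptions: the input is simply a list with the energy of the highest band per window, and min_gap is a hypothetical stand-in for whatever word_overlap does in the real source (here it merges peaks that sit too close together to be separate words).

```python
def find_peaks(energy: list[float], threshold: float, min_gap: int) -> list[int]:
    """Window indices above threshold, one per word.

    Peaks closer than min_gap windows to the previous peak are treated as
    the same word; the stronger of the two is kept.
    """
    peaks: list[int] = []
    for i, e in enumerate(energy):
        if e <= threshold:
            continue
        if peaks and i - peaks[-1] < min_gap:
            if e > energy[peaks[-1]]:
                peaks[-1] = i
        else:
            peaks.append(i)
    return peaks


def word_slices(peaks: list[int], word_length: int, n_windows: int) -> list[tuple[int, int]]:
    """(start, end) window ranges of fixed length word_length around each peak."""
    slices = []
    for p in peaks:
        start = max(0, p - word_length // 2)
        if start + word_length > n_windows:
            # keep the slice inside the file
            start = max(0, n_windows - word_length)
        slices.append((start, start + word_length))
    return slices
```

The profile of a word is then just the band energies of all windows inside one of these slices.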

The profiles for known words are put in trainfile. When a guess has to be made, the profiles for the words in the file are subtracted from those in the trainfile and the smallest deviation is chosen as the proper word. That is all.

The algorithms in devoicecaptcha are at this moment really naive. There are a lot of possible improvements. Perhaps in the future I will enhance the program a bit; for now I think the 33% (as on google's captchas) is good enough (and I am too lazy to reimplement htk, which should do the trick also (I guess)).
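The matching step is a plain nearest neighbour lookup. A minimal sketch in Python, assuming a profile is a list of windows with one energy value per band and that "deviation" means the summed absolute difference (the C source may use another measure):

```python
Profile = list[list[float]]


def deviation(a: Profile, b: Profile) -> float:
    """Sum of absolute differences between two profiles of the same shape."""
    return sum(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def classify(profile: Profile, trained: list[tuple[str, Profile]]) -> str:
    """Return the label of the trained profile with the smallest deviation."""
    label, _ = min(trained, key=lambda item: deviation(profile, item[1]))
    return label
```

For example, with trained = [("1", [[1, 0], [0, 1]]), ("2", [[5, 5], [5, 5]])], the unknown profile [[1, 1], [0, 1]] deviates by 1 from "1" and by 17 from "2", so it is classified as "1".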

Proof of concept

The code is rather messy, but since this applies to most p0f code consider that 1337 ;-). Download devoicecaptcha.c and compile it with:

gcc -std=c99 devoicecaptcha.c -lfftw3

As you can see you need fftw, an allround fourier transform library, which is packaged for most distributions, so you can be lazy (apt-get install fftw3-dev or similar).

When started with ./a.out captcha.wav you also need a data set (an msn one and a google one are available). If you have downloaded the same captchas (see links) as I have, it will print a guess on stdout.

As said before, devoicecaptcha works with a comparison to a trained set. To build up a training set and test the effectiveness of various parameters you can start devoicecaptcha with a third bogus argument, e.g. ./a.out captcha.wav --print.

What I did was download a large set of captchas with lwp and transcribed them with the proper words with something like:

for i in google/*.wav; do aplay "$i" &> /dev/null & read j; mv "$i" "google/$j.wav"; done

I ended up with a directory with filenames like "123456.wav" where 123456 are the digits spoken in the captcha. On this directory I unleashed a small ruby script, which splits the files into a training and a testing set, builds a training set and tests the rest. This script can be found under train.rb.
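The split and score logic is small enough to sketch. This is not train.rb but a hypothetical Python outline of the same idea: the label comes from the filename, the files are shuffled with a fixed seed, and a captcha only counts as correct when the whole digit string matches. The guess argument stands for whatever calls the recognizer.

```python
import random
from pathlib import Path
from typing import Callable


def label_from_filename(path: str) -> str:
    """'google/123456.wav' -> '123456'."""
    return Path(path).stem


def split_files(paths: list[str], train_count: int, seed: int = 0) -> tuple[list[str], list[str]]:
    """Shuffle reproducibly and cut into (training set, testing set)."""
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:train_count], shuffled[train_count:]


def accuracy(test_paths: list[str], guess: Callable[[str], str]) -> float:
    """Fraction of test files whose complete digit string is guessed right."""
    if not test_paths:
        return 0.0
    hits = sum(1 for p in test_paths if guess(p) == label_from_filename(p))
    return hits / len(test_paths)
```

Scoring the whole string rather than single digits matches the percentages quoted on this page, which count a captcha as solved only when every digit is right.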

If you have broken other voice captchas with my code (or with an addition to my code), please let me know, so I can update this page.

MSN

MSN (passport) audio captchas are really weak. Only digits are used, there are always ten digits and the noise is weak and constant. The distance between the words is relatively constant. Devoicecaptcha guesses all ten digits correctly in around 75% of all cases, with a training set of about 40 files.

A data set which can be used for the english MSN (aka passport, aka msn live) voice captchas (which I got from passport.net) can be downloaded under the name msn.txt. It is also possible to create your own training data (see above).

Google

Google's voice captchas are more difficult to break than the captchas by microsoft. Google employs different speakers, uses better noise artifacts and a random number of words. The dictionary is (like microsoft's) limited to digits only. The devoicecaptcha program scores around 33% on these voice captchas with a training set of 60 files. This is high enough for use in a bot.

A data set which can be used for the google captchas (in english, google also provides captchas in multiple languages) can be found under google.txt. The files were found at Google new account.

These captchas are also in use by blogger and blogspot for comments.

Others

If you know other voice captcha systems, let me know. Perhaps I will have some time to look into them (and perhaps not). I will at least add them to the links section on this page, so together with the provided source other people can try to beat them.

Disclosure motivation

I did not release the source code on this page without hesitation, because it might help spammers in their goals. And I hate spam. However there are some reasons I released the code anyway:

  • I do not believe captchas actually work. Almost all visual captchas are broken already (read for example Microsoft's paper on visual captchas or ez-gimpy which can defeat the "human insolvable" captchas at yahoo). Or look at the versatile pwntcha. Although I do believe humans can fool computers for quite some time in the future, I suspect that computers can always beat computer generated challenges (without a "secret" as in PKI).
  • Spam is a human problem and I believe human problems should be solved by humans. Legislation and law enforcement are in my opinion the best ways to deal with spammers. If captchas did not harm anyone this would of course not be enough reason for releasing this code, but
  • Captchas make the web a more difficult place for disabled people (see http://www.w3.org/) and more annoying for everyone. I hope the community will be motivated to find other solutions (and I am happy that the w3 organization cares for people with disabilities and a usable web for everyone, including the deaf-blind).
  • I am a proponent of full disclosure.

Some people might ask what kind of solutions I suggest for solving the spam problem. SpamAssassin catches thousands of spam mails for me; it is expensive in cpu cycles (so putting spammers in jail is preferred), but the multi-tiered approach (neural network detection together with several lists of wrong-doers) works relatively well and can be applied to other forms of spam.

Playing the cat/mouse game with more difficult captchas when the previous challenge is broken will work, but is not satisfactory in the end. I encounter more human unsolvable captchas every day. I do understand that corporations play this game however; in the real world thresholds do help.

Links

Information

Broken voice captchas

Working on voice captchas

  • Google.com A slightly better segmentation and better database gives me 60%+ recognition rate, but this is yet unreleased.
  • Paypal audio captcha Relatively well done (best one I found actually), but I expect to release support for this captcha "any time now"
  • Captchas.net HEAD works now. I will release this soon, but I am too lazy to generate a training set, so stuck at 33% recognition rate for now.

Not working on voice captchas

  • Standards Schmandards proposal The use of generated speech is probably weak; a positive point is the use of human parsing ("the three numbers are one, two three"). For now I haven't seen it actually used.

Do you know different implementations of audio captchas? Please contact me.


Promise(d) me http://vorm.net/promisedme 2005-12-29T11:00:00Z 2005-12-29T11:00:00Z

As promised here, I wrote a better version of the quickviewer for the vim calendar plugin. It now properly wraps around year ends.

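#!/usr/bin/env perl
# Print the diary entries for the next 90 days.
use Date::Calc qw(Today Add_Delta_Days Month_to_Text);

$dir = "$ENV{HOME}/diary";  # perl does not expand "~" in file names
for ($i = 0; $i < 90; $i++) {
  # Add_Delta_Days takes care of month and year ends
  ($y,$m,$d) = Add_Delta_Days(Today(),$i);
  $fn = "$dir/$y/$m/$d.cal";
  if (-f $fn) {
    printf "%2d %s", $d, substr(Month_to_Text($m),0,3);
    open IN, $fn; while (<IN>) {print "\t$_" if (! /^\s*$/);}
    close IN;
  }
}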

Compared to last year's entry you could say I was far more efficient in 2005 than in 2004, but also a lot lazier. I leave it to you to assess the correctness of this statement.

Happy 2006 everyone!

Magical Mystery Tour http://vorm.net/magicalmysterytour 2005-11-23T11:00:00Z 2005-11-23T11:00:00Z
By me

perl<enter>

''=~('(?{'.('+[@.[]'^'[))@/}
').'"'.('][@.[]@@@]-[#+-|'^'?>)@<}),,}^.@@^^').',$/})')

<ctrl-d>

Virtual X(I) http://vorm.net/virtualxi 2005-09-24T10:00:00Z 2009-01-31T21:00:25Z

by Sammy

A few days ago I read XHTML, what is the point?. Later that day, someone pointed me to Sending XHTML as text/html considered harmful by Ian Hickson. Especially the last article goes into great detail about why you shouldn't send xhtml pages with mime type text/html.

Usually I do not respond to such articles, but this one is very well-written and a lot of the points Ian makes are absolutely valid. I still do not listen to this expert and serve my XHTML pages as text/html, so I feel the need to explain the reasoning behind this situation further in this entry.

  • I will not be hurt if I switch the content type and things go bad; this is my own responsibility.
  • Escaping the <script> and <style> tags is my own responsibility. I do not mind doing this. In fact I find correcting these kind of errors later on minor work compared to totally re-writing from HTML 4.01 to XHTML.
  • The standard allows sending XHTML/Transitional as text/html.
  • I do understand that my stylesheet (and especially the body tag) is handled wrong when sent as text/html.
  • The points about DOM-scripts and document.write() are not applicable to my site. I consider both bad scripting style anyway.
  • Yep, it does suck that despite delivering clean XML, the browser still has to parse the more difficult tag soup. I believe that if I start writing valid XML right now (as should everyone), the switch will be easier later on, when ie7 works well and nobody uses older browsers anymore ;-). And yes, you could do this in HTML 4.01 too, but then existing validators do not push you enough to write validating xml.
  • Browser writers _have_ to accept the / > tag as the ending delimiter of a html tag, instead of > /. I do understand that browser developers hate this, but there are worse problems with parsing tag soup and internet explorer.... This is just inescapable in a world with 90% ie-users.

In fact the problem is (again) the never updated version of internet explorer, otherwise everybody could happily send application/xhtml+xml for all xhtml pages.

Later that day: Because I had a laptop with ms windows available this weekend I could test the site in iexplore.exe and decided to make the jump now and changed the site to XHTML 1.1 and UTF-8 anyway. It is served properly to browsers that support application/xhtml+xml, others are on their own...

So (with the proper fonts installed), you can enjoy: أنا قادر على أكل الزجاج و هذا لا يؤلمني

Foucault's Pendulum http://vorm.net/foucaultspendulum 2005-05-22T10:00:00Z 2005-05-22T10:00:00Z

Some rights reserved by Feuillu (http://www.flickr.com/photos/feuilllu/)

These weeks I have been reading Umberto Eco's Foucault's Pendulum with much pleasure. The writer is very erudite and truly knows a lot about history, philosophy, literature and different cultures. So besides enjoying the good plot, I learn a lot from reading Foucault's Pendulum. However, on one thing the writer is a bit off. In the beginning of the book the main character tries to break into a computer by writing an (inefficient) computer program which generates anagrams of 'JHVH'.

Coincidentally, two weeks before reading this passage I wrote a small program for my DND group which solves a similar problem in general. Since this group is too lazy to solve puzzles, I put the program online; it is called rotx. Perhaps someone can make good use of it. It finds all words which, rotated by x letters (with x = 1..25), again yield a known word.

So, for example layout is the 'encrypted' version of fusion (rot6, so a->g, b->h, c->i, d->j, e->k, f->l, etc.), curly -> wolfs, arena -> river, etc.

In Dutch some solutions are urnen -> lieve, opaal -> hitte, knijp -> bezag and kerk -> gang.

To use rotx you need a wordlist, for example as generated by aspell:

   aspell --lang=en dump master | ./rotx - > rotated.txt

The output (in the example above copied to rotated.txt) contains all rotated words which can also be found in the original wordlist.

The first incarnation of the program was in bash/sed/tr and awfully slow. (I had to try though, "No premature optimization!"). It would take two weeks to process a 1.5 MB English wordfile. (Eco's Basic script would take what, years??). Enter C++ and STL. The direct approach (rotating every word through the entire alphabet and looking up each result in the original list) would still take around 20 hours. So I cooked up an algorithm which uses more memory, but finishes in approximately 15 seconds on my old and crusty AMD Duron 850!

The source can be found at /downloads/rotx.cc.html.
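
The post does not describe the fast algorithm, so the following is only one possible way to trade memory for speed, sketched in Python rather than the original C++. It assumes lowercase a-z words, and rotate and find_rotations are illustrative names. Two words are rotations of each other exactly when they become identical once each is rotated so that its first letter is 'a', so grouping all words by that key in a dictionary replaces the 25 lookups per word.

```python
from collections import defaultdict

def rotate(word, shift):
    """Rotate every letter of a lowercase a-z word by `shift` positions."""
    return "".join(chr((ord(c) - 97 + shift) % 26 + 97) for c in word)

def find_rotations(words):
    """Yield (word, x, rotated) for every pair of listed words related by rot1..rot25."""
    groups = defaultdict(list)
    for w in set(words):
        # Only plain lowercase a-z words are considered in this sketch.
        if w and w.isascii() and w.isalpha() and w.islower():
            # Canonical key: the word rotated so that its first letter is 'a'.
            groups[rotate(w, -(ord(w[0]) - 97))].append(w)
    for members in groups.values():
        for src in members:
            for dst in members:
                if src != dst:
                    # Distinct words in one group always differ in their first letter.
                    yield src, (ord(dst[0]) - ord(src[0])) % 26, dst
```

For example, list(find_rotations(["fusion", "layout"])) contains ("fusion", 6, "layout").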

Pictures from home II http://vorm.net/picturesfromhomeii 2005-04-22T10:00:00Z 2005-04-22T10:00:00Z

As promised in this post, I added scanned photos to my photo gallery. For this I bought a scanner: the Epson Perfection 2480, which works like a charm on Linux. Thanks to the SANE project, installation was easy (for people living in my strange universe):

find /cdrom -name "*.cab" -exec cabextract -l {} \; 2>/dev/null \
| grep -B 10 Esfw
cabextract -F 'Esfw*' -p ./ESCAN/ModUsd.cab > /etc/sane//Esfw41.bin

I also promised a mod_rewrite patch for o.r.i.g.i.n.a.l., which is now finished. I will release it when the slideshow feature is ready. I also wrote some mod_rewrite voodoo for the main site, which enabled me to provide proper permalinks instead of the previous erroneous ones.

Only two todos left:

  • Complete slideshow javascript for o.r.i.g.i.n.a.l
  • Buy a digital camera

Pictures from home (and further) http://vorm.net/picturesfromhomeandfurther 2005-03-12T11:00:00Z 2005-03-12T11:00:00Z

As you have perhaps noticed, I added a new link under journal in the right menu, called photos. Finally I have consolidated my photo albums. Before, I had several albums by topic (njbg, travel, korvezee, etc.). The main njbg album is now at the njbg site, the korvezee albums are in my album and the travel albums from other people are deleted...

Although Gallery is popular software, I do not like it anymore. It has too much code and too many features I do not need. I prefer to provide comments and do all resizing etc. myself with vim and ImageMagick, and have the gallery software only do the display work. Luckily jimmac wrote a nice piece of software called o.r.i.g.i.n.a.l. It works and is very hackable. I do plan to port Gallery's mod_rewrite (providing useful URLs) and slideshow features to my new photo album, but at this moment I think it is good enough to link.

Now I have a really good excuse to buy a nice digital camera and a scanner. I have lots of good old analog pictures to add, and it is weak to have stolen (but credited) so many pictures from people who already own a digital camera.

The dark color is a bit in contrast with the rest of my site, but photos just look better on a dark background.

I cover the waterfront http://vorm.net/icoverthewaterfront 2005-01-25T11:00:00Z 2005-01-25T11:00:00Z

Some rights reserved by estorde (http://www.flickr.com/photos/estorde)

In a few days, I will be on holiday in Tignes. So no e-mail will be answered, and probably no phone calls either. If I recover quickly ;-), I will get back to you after February 6th.

In the past weeks I have been working on a cleanup of the NJBG site. Some people volunteered to maintain the site. Since we are talking about a youth association, this is probably a good thing.

One of the last hacks I did was a mailing list. Actually multiple mailing lists with a web frontend. Although it is not really solid yet, it basically works. With PHP, programming is actually too easy these days. I needed only one procmail line (to redirect the incoming mail to PHP) and about twenty lines of PHP code. That is it. After working with Python for some projects, it struck me again how good the PHP documentation is. Associative arrays (called dictionaries) in Python are still a small mystery to me. Where PHP has a lot of nice array_ functions, Python provides me with .keys(), .items() etc. methods, which make life hard. I really prefer handling the keys and values at the same level of abstraction. Perhaps this is the reason mailman is such a monstrous beast...
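
For what it is worth, .items() does hand over keys and values together. A small sketch with made-up data (the list names and addresses are hypothetical and not taken from the NJBG site; summarize is an illustrative name):

```python
# Hypothetical mailing lists: list name -> subscriber addresses.
lists = {
    "announce": ["a@example.org", "b@example.org"],
    "board": ["c@example.org"],
}

def summarize(lists):
    """Return one 'name: count' line per list, sorted by list name."""
    return ["%s: %d" % (name, len(members))
            for name, members in sorted(lists.items())]

print("\n".join(summarize(lists)))
```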

Let's party http://vorm.net/letsparty 2004-12-31T11:00:00Z 2004-12-31T11:00:00Z

Where November was the month of moving, December was the month of parties. Today I will combine these events by giving a house-warming/new-year party. After that (hopefully) more time will be available for some good old hacking. As a reminder for myself to fix the damned end-year bug, here is a shell script which shows my calendar, created with this nice vim script in a year/month/day.cal format.

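    # Show the diary entries from today up to the end of the period.
    # Entries live in $dir/<year>/<month>/<day>.cal (no leading zeros).
    dir="$HOME/diary"
    nrofmonths=2

    year=$(date +%Y)
    month=$(date +%m | sed -e 's/^0//')
    day=$(date +%e | tr -d ' ')

    # Note: the month counter runs past 12 instead of rolling over into
    # January of the next year (presumably the end-year bug).
    for ((i = month; i <= month + nrofmonths - 1; i++)); do
        if [[ -d "$dir/$year/$i" ]]; then
            for ((j = 1; j <= 31; j++)); do
                # Skip days of the current month that have already passed.
                if [ "$i" -ne "$month" ] || [ "$j" -ge "$day" ]; then
                    bestand="$dir/$year/$i/$j.cal"
                    # -s: the file exists and is not empty.
                    if [ -s "$bestand" ]; then
                        echo -n "* $(date -d "$i/$j" +'%a %d %b') : "
                        # Indent every line after the first one.
                        sed -e '1!s/^/	       /' "$bestand"
                    fi
                fi
            done
        fi
    done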
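
The post does not say what the end-year bug is. Assuming it is the month counter running past 12 in December, the rollover could be handled as in this small Python sketch (upcoming_months is an illustrative name, not part of the script above):

```python
def upcoming_months(year, month, count):
    """Return `count` (year, month) pairs starting at the given month,
    rolling over into the next year after December."""
    result = []
    for offset in range(count):
        index = (month - 1) + offset  # zero-based month counted from January of `year`
        result.append((year + index // 12, index % 12 + 1))
    return result
```

Called as upcoming_months(2004, 12, 2), this yields December 2004 followed by January 2005, which are the two directories the script would need to look in.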

Happy 2005 everyone!

Left the building http://vorm.net/leftthebuilding 2004-09-30T10:00:00Z 2004-09-30T10:00:00Z
***** Nagios  *****

Notification Type: PROBLEM
Host: jochem
State: DOWN
Address: jochem.pepperstream.nld-int
Info: PING CRITICAL - Packet loss = 100%

Date/Time: Thu Sept 30 16:38:26 CEST 2004 

Yep, I left Pepperstream after some very nice years and projects there. I will miss the Pepperboys and girl...

Feed my head http://vorm.net/feedmyhead 2004-08-28T10:00:00Z 2004-08-28T10:00:00Z

Wrote a fully validating, caching and fast RSS 2.0 generator tonight. I used the spec from Harvard Law, so I am not really sure if my feed has real-world value. However, it looks similar to serendipity's, so it should work. The only thing I don't understand is the difference between the various ways of encoding the description tag. Never mind, happy syndicating!
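
Two encodings of the description element that feeds commonly use are entity-escaped markup and a CDATA section. The Python sketch below (function names are illustrative, and this is not the generator from this post) shows both; an XML parser hands back the same text for either.

```python
from xml.sax.saxutils import escape

def description_escaped(html):
    """Encode the HTML by escaping &, < and > as XML entities."""
    return "<description>" + escape(html) + "</description>"

def description_cdata(html):
    """Wrap the HTML in a CDATA section, which needs no entity escaping."""
    # "]]>" may not appear inside CDATA, so split it across two sections.
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return "<description><![CDATA[" + safe + "]]></description>"
```

The escaped form is the more verbose of the two, while the CDATA form keeps the markup readable in the feed source.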

One step closer http://vorm.net/onestepcloser 2004-05-09T10:00:00Z 2004-05-09T10:00:00Z

Lots of stuff happened lately. After another fight with my provider, I finally gave up and ordered my own dedicated server. (All providers in the Netherlands I've dealt with (quite a lot) have depressingly low technical skills). At this moment almost all services and sites we run are up and functional (and faster and more secure than ever).

I had a large amount of scripting to do and therefore I haven't had time to do a proper release of my mms-package. The album functionality is getting somewhere, so I plan to do this real soon now ;-). By request I have added the obligatory, but useless screenshot.