Categories
Blog

Google search is not a programmer’s best friend

I was playing around with Google this weekend. The original problem I wanted to solve was that last.fm returns strange release dates for albums, so I was writing a small script that would extract the correct release date from various sources. I was aiming at www.metal-archives.com and Wikipedia. Both of these sites have different search pages, and in general I’ve come to rely more on Google’s site:xxx functionality than on individual sites’ own search engines. So I thought, why not just use Google programmatically to search the sites? Seems easy enough.

Failure 1 (I’m feeling lucky):

Google has a very nice feature called “I’m feeling lucky” that will direct you to the first result. If I could specify my queries well enough, I could rely on that and not have to parse Google’s output to get the URL. It’s very simple: you just add &btnI at the end of your query and Google will redirect. It works fine most of the time, but sadly it sometimes just fails to redirect you. I couldn’t find any pattern to this randomness, and a “works sometimes” solution is not a good one 😐
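To make the idea concrete, here is a minimal sketch of how such a query URL could be put together. The base URL, the function name and the example album are my own assumptions for illustration; only the site: operator and the &btnI suffix come from the description above, and nothing here makes the redirect any more reliable.

```python
from urllib.parse import urlencode

# Assumed search endpoint; the post only says to append &btnI to the query.
GOOGLE_SEARCH = "https://www.google.com/search"


def lucky_url(terms: str, site: str) -> str:
    """Build a search URL restricted to `site` that asks for the first hit."""
    query = f"site:{site} {terms}"
    # urlencode escapes the query text; btnI is appended as a bare flag.
    return f"{GOOGLE_SEARCH}?{urlencode({'q': query})}&btnI"


# Hypothetical example query.
print(lucky_url("Opeth Blackwater Park", "www.metal-archives.com"))
```

A script would then request this URL and check where it ends up, which is exactly the step that turned out to be flaky.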

Failure 2 (google ajax):

I then found out that Google has a seemingly very nice API that lets you do queries and get JSON back. JSON is easy to work with, and it also allows one to go through several results in case Google doesn’t return the right one first. After a bit of poking around I found out that the Google AJAX API randomly returns different results from normal Google. It’s like using Yahoo instead of Google. A bit more poking around turned up a two-year-old bug report about this. Furthermore, the TOS directly forbids using the API for this kind of activity. Oh well, it didn’t work anyway.

Failure 3 (parsing google results directly):

After two bitter defeats I thought screw it, I’ll just parse the damn Google result pages, how hard can it be? At least I know that it gives me the right results. So I did that, coded everything up and checked that things were working. Then I let it loose on my collection (2×275 requests), and around the middle it stopped working. I poked around a bit and found out that Google had identified my program as a bad boy and decided to spank it by returning a “Please identify yourself as a human” page instead of the normal result page.

As a side note, after three bitter defeats I was ready to jump ship and try Bing or Yahoo. That was a quick detour though, as neither of them was up to the challenge of returning good results.

Categories
Blog

Channel downmixing in MPlayer

Recently I have been playing with downmixing in MPlayer. When I bought new speakers, I decided to go with stereo instead of surround since I mostly listen to music. As anyone using MPlayer or any “derived” player such as VLC has discovered, there is an incredibly annoying problem: the voices of the actors are very low, and in fact the sound in general is very low. It appears that when mixing down to two speakers, the center channel is put very low in the mix. The same could be said about the subwoofer, although that is naturally not as easy to notice.

A quick search revealed that MPlayer has several tricks (audio filters) that might potentially work: volume, volnorm, pan, hrtf. I quickly discarded volume and volnorm since I don’t want to just boost the sound, I want the channels to be distributed properly. hrtf seemed like a good simple choice, since pan looked very complex. Sadly, in the middle of Harry Potter I had to turn it off because it was causing a lot of clipping. So I was left with pan. It took a while to find a good starting point, but a bit of googling around turned up a decent one. I first just tried turning sub and center up to one, but that introduced the dreaded clipping in one movie or another. So I had to keep them down a bit while still retaining a decent boost of center and sub. After an afternoon of testing I came to the following “magic” formula:

-channels 6 -af pan=2:0.4:0:0:0.4:0.2:0:0:0.2:0.3:0.3:0.1:0.1

Please do note that one needs to add -channels 6 in order for MPlayer to decode all 6 channels so that it can mix them down to two. More about the pan filter can be found in the MPlayer documentation.
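For anyone who wants to tweak the numbers, the filter string can be read as one (left, right) gain pair per input channel. The sketch below rebuilds the string from such pairs and adds up the gain that reaches each speaker. The channel order in the comments (front left, front right, rear left, rear right, center, sub) is my assumption about how MPlayer lays out 6 channels, and the helper names are made up for illustration.

```python
# Assumed input channel order: front left, front right, rear left,
# rear right, center, subwoofer. Each pair is (gain to left, gain to right).
GAINS = [
    (0.4, 0.0),  # front left
    (0.0, 0.4),  # front right
    (0.2, 0.0),  # rear left
    (0.0, 0.2),  # rear right
    (0.3, 0.3),  # center, sent to both speakers
    (0.1, 0.1),  # subwoofer, sent to both speakers
]


def pan_filter(gains):
    """Flatten the gain pairs into a pan=2:... filter string."""
    flat = [f"{g:g}" for pair in gains for g in pair]
    return "pan=2:" + ":".join(flat)


def output_sums(gains):
    """Total gain reaching the left and the right speaker."""
    left = sum(l for l, _ in gains)
    right = sum(r for _, r in gains)
    return round(left, 6), round(right, 6)


print(pan_filter(GAINS))
print(output_sums(GAINS))
```

With these values each speaker receives a total gain of 1.0, which is probably why this mix stopped clipping: even if every channel is at full volume at the same moment, the sum cannot exceed full scale.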

Categories
Programming

mucomp released into the wild

Today I’m happy to announce that the world is one audio player richer! This is a personal project of mine that I have been working on for a little while. It’s written in Clojure and JavaScript (jQuery) and uses alsaplayer as the audio player. I probably won’t have much time to hack on it, so consider this a code dump that hopefully someone else will find useful and play/run/do-whatever with. As for maturity, I use it almost daily and it’s pretty stable. There are some known bugs and kinks (mostly because it’s using alsaplayer and because the Java inotify library is buggy).

Categories
Blog

How far have we come?

Things like these make me wonder: with all the advances in computer science, how far have we really come?

  • 40 years after the invention of relational databases we are still manually defining indexes
  • 40 years after the invention of Unix, the scheduler in Android (= Linux) still does a terrible job at scheduling the tasks that really depend on it (games and audio)
Categories
On the web

AndNav2 is now Open Source

Wow, this is quite a nice surprise. Nicolas gave in and released the source of AndNav2. So I can now spend my time hacking on that instead of reverse-engineering it 🙂 I rooted my HTC Hero so I actually don’t need to do that anymore, but that is really beside the point. With the source available there is no end to what can be done 🙂

As I have said before, I really think this is a killer app for Android. Google released their navigation which is nice, but I’m not sure how well it works in offline mode.

Categories
Programming

Programming nirvana part 4: the repl

A repl is a very important tool in the arsenal of a programmer. It allows one to test bits and pieces of code and then assemble them into a running program. Examples of this include the excellent Firebug utility for Firefox and various shells (like the Python shell). There have even been attempts to bring a repl to C#, but the language is not at all designed for this purpose, so it’s a crude hack at best.

Often, developing with the repl consists of a lot of copy-pasting back and forth between the editor and the repl. Moving the repl into the editor makes this less painful, but I would argue that in order to fully embrace a repl one has to have something like Lisp, or a very good editor (that I haven’t seen yet). In Lisp it’s very easy to take part of a function and evaluate only that part, simply by finding the right parenthesis and evaluating that. Using Emacs and SLIME it’s a simple two-key gesture.

But one can take it one step further. Why not simply start your program as a repl, instead of adding a repl to your program? That way the running state of the program can be inspected as well. But in order for that to work, one has to be able to redefine functions without stopping the program. Clojure allows that, and it’s the key that makes it all work.
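As a rough analogy in Python (not Clojure, and not how Clojure implements it), the trick is that the running code looks a function up by name every time it calls it, so rebinding the name changes the behaviour without a restart. All names below are made up for illustration.

```python
import types

# Stand-in for a long-running program: it holds its functions in a
# namespace and looks them up again on every call.
app = types.SimpleNamespace()


def handler_v1(x):
    return x + 1


app.handler = handler_v1


def step(x):
    # Late lookup: uses whatever app.handler is bound to right now.
    return app.handler(x)


before = step(1)


def handler_v2(x):
    return x * 10


# "Redefine" the function while the program keeps its state.
app.handler = handler_v2
after = step(1)

print(before, after)
```

In a real session the rebinding would be typed into the repl while the program is running, rather than written into the same file.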

Programming in Clojure is a process of organically growing your program in a bottom-up fashion, where the running state of the program allows one to inspect, debug and fix it, all without shutting it down.

The best way to do this is to start up a repl in a separate process and then connect to that process. That way one can always disconnect and reconnect again when the need arises.

Categories
Programming

Programming nirvana part 3: avoid relational databases

This is actually a blog post I have been meaning to write for quite a while now, but I just never got around to it. Relational databases are ubiquitous in programming. Whether it’s mainframes, application programming or even phones, nowadays everyone stores persistent data in a database. Recently there has been some talk about schema-less databases, and there is a myriad of different implementations of them, but why the sudden interest?

  • Databases have at least two properties that make them less than ideal for agile development. The first is types, especially if the types propagate out into the application code (*cough* Microsoft). The second is being based on a schema, which means that one has to make all the choices up front.
  • Databases are really built for a single machine, and even though all the big database vendors have replication support, it still seems like a bad solution if one really wants scalability. There are a lot better ways of achieving scalability nowadays (distributed hash tables come to mind).
  • Databases are meant to be standardized, but in reality they are not. They all have different syntax for doing common things.
  • It has been said that when you build a DSL, sooner or later it will turn into a (bad) full-fledged programming language. What if instead the data was simply part of the program and you could manipulate it the way you are used to? Functional languages make this possible and much easier, especially a programming language that is built with concurrency in mind.
Categories
Programming

Programming nirvana part 2: be agile

In part 1 I made the very brief argument that compiling sucks. I really like that cartoon since there is a lot of truth to it. Compiling sucks mainly because it breaks flow. There have been several attempts to fix it by lowering the time it takes to compile, but in the end it’s still the same loop: write, compile, “debug”.

There is another layer to it as well, one that dynamic languages don’t necessarily give you: developing a program should mean having a running program that can be used while it is being written. This is one of the pillars of agile programming. Step one in achieving this goal is to separate the UI from the backend. The web is a natural way to do this.

Django comes very close to achieving this with its automatic reloading on change, but the biggest problem is that it’s not able to automatically migrate even the most basic model changes. On the other hand, Django has many other things going for it, so it is by no means a bad choice. But there might be a better one lurking in the dark.

Categories
Programming

Programming nirvana part 1: avoid compilation

I have been meaning to write about a new project I have been working on in my spare time for a while now. It’s a project where I wanted to maximize the fun I had doing it, learn something new and to do something that I can use. Sadly I have been putting it off since enumerating all the reasons feels like quite a daunting task.  So instead of writing everything in one big blog post I’ll just do shorter posts. One for each argument. First one is pretty simple.

Categories
spam

Fighting spam in gmail

Everyone hates spam, and like everyone else I get tons of it. So I’ve of course written about it two times before. Earlier this week we migrated over to a better email solution so that we could get SPF working properly. SPF should help Google figure out what is spam and what is not. In the process I went through all my old spam (all 4200 of them) and found the following patterns to be effective in cleaning my spam folder:

Matches: (Pfizer || replica || diploma || pharmacy || meds || pills || male enhancement || BuyCializ || http://www.nobelbarmoebelonline.com || http://www.nobelfahnenmitmast.com || http://www.nobelbarmoebel.com || Bestellseite || http://www.nobeltannenonline.com || http://www.nobelgewerbetools.com || health || Reinigungswirkung || britney spears || http://www.edelebarmoebel.com || http://www.edelefahnenmitmast.com || http://www.alumastmitfahnen.com || 성명 || 주식 || Проверки || Damen und Herren || и || видео || lv-bestbag || lv-bagshop || 밤세상)
Do this: Skip Inbox, Delete it

BIG FAT WARNING: This list is of course highly personal, and you should check how many messages in your inbox it matches before installing it.
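One way to do such a check outside of Gmail is a small dry run over some subject lines. This is only a sketch: the keyword subset, the sample subjects and the function names are made up for illustration, and it uses plain case-insensitive substring matching, which may well differ from how Gmail itself matches filter terms.

```python
# A few of the terms from the filter above; extend as needed.
KEYWORDS = ["Pfizer", "replica", "diploma", "pharmacy", "meds", "pills", "health"]


def matches(text, keywords=KEYWORDS):
    """Return the keywords found in `text`, ignoring case."""
    lowered = text.casefold()
    return [k for k in keywords if k.casefold() in lowered]


def dry_run(messages):
    """Map each message that would be deleted to the keywords that hit."""
    hits = {}
    for message in messages:
        found = matches(message)
        if found:
            hits[message] = found
    return hits


# Hypothetical inbox subjects.
inbox = [
    "Cheap replica watches",
    "Your health insurance renewal",
    "Lunch on Friday?",
]

for subject, found in dry_run(inbox).items():
    print(subject, "->", found)
```

In this made-up inbox the insurance mail would be deleted because of “health”, which is exactly the kind of false positive worth catching before the filter goes live.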

With this list in place I get about 10-15 spam messages a day, and I can actually go through my spam box to look for false positives. Sadly, those still happen way too often 🙁