Machinomics: December 2013

Wednesday, December 25, 2013

Benchmark functions in Python

This is a function for benchmarking functions in Python using decorators, so you can use it non-intrusively with your current code by just adding the decorator, with the @ syntax, to the definition of your function. Feel free to modify it to use higher-resolution time functions.
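
The original code is not reproduced on this archive page, so what follows is only a minimal sketch of the idea, not the function the post refers to. The names benchmark, slow_sum and last_elapsed are made up for illustration, and time.perf_counter assumes Python 3.3 or later.

```python
import functools
import time


def benchmark(func):
    """Print how long each call to `func` takes, then return its result."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # high-resolution clock
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            elapsed = time.perf_counter() - start
            wrapper.last_elapsed = elapsed
            print(f"{func.__name__} took {elapsed:.6f} s")

    wrapper.last_elapsed = None  # hypothetical attribute: last measured time
    return wrapper


@benchmark
def slow_sum(n):
    """Hypothetical function under test: sum of squares below n."""
    return sum(i * i for i in range(n))


result = slow_sum(100000)
```

The function under test is left untouched apart from the @benchmark line, which is the non-intrusive part. Swapping time.perf_counter for another clock only requires changing the two calls inside wrapper.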

Sunday, December 22, 2013

Clojure concurrency and some niceties

I stumbled upon the barber problem on some webpage I don't remember; it was a loose and open approach to it. I then imagined my own version and implemented it in Clojure (I'll put the code up as a Gist, even though the style just does not match this blog's). Basically, I want to set the rate of customers entering the barber shop to be a little faster than the rate at which the barber can dispatch them.

To do that, I create two functions: one that increases the queue of customers (with the shop holding up to 3 customers) and one that dispatches customers, decreasing the queue. I assume customers can take a seat during a 4-hour window (the barber shop closes after 4 hours, and the barber then cuts the hair of the remaining waiting customers).

Notice here that the barber cuts hair only if a random number meets a condition (here making him slower than his customers). I implement this with a watcher function, which gets called whenever the reference is updated. For it to be called every time a customer enters the shop, even when the shop is full, we need to issue an identity change, which does not change the value of the queue but still fires the watcher. This is a very nice feature of Clojure.

The important functional element is this:

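;; Pick the update function up front: inc while there is room, identity otherwise.
(let [f (if (< @queue 3) inc identity)]
  ;; alter runs inside a transaction; even identity fires the watcher.
  (dosync (alter queue f)))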

Notice the conditional binding of f in the let block. This removes the boilerplate that would otherwise be needed in the body of the let: the body is a single alter, with no branching of its own.
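
For readers more at home in Python, the same pattern can be sketched roughly as follows. This is an illustration only: WatchedRef is a made-up stand-in for a Clojure ref with a watcher, it has no transactions or thread safety, and the names customer_arrives and MAX_WAITING are hypothetical.

```python
class WatchedRef:
    """Tiny stand-in for a watched reference (illustration only)."""

    def __init__(self, value):
        self.value = value
        self._watchers = []

    def add_watch(self, callback):
        self._watchers.append(callback)

    def alter(self, f):
        old = self.value
        self.value = f(old)
        # Notify on every update, even when f is identity and nothing changed.
        for callback in self._watchers:
            callback(old, self.value)


def inc(x):
    return x + 1


def identity(x):
    return x


MAX_WAITING = 3  # the shop holds up to 3 customers


def customer_arrives(queue):
    # Choose the update function first, then apply it in one place.
    f = inc if queue.value < MAX_WAITING else identity
    queue.alter(f)


events = []
queue = WatchedRef(0)
queue.add_watch(lambda old, new: events.append((old, new)))

for _ in range(5):
    customer_arrives(queue)
```

After five arrivals the queue stays capped at 3, yet the watcher has been notified five times, the last two with an unchanged value.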

Thursday, December 19, 2013

Why are so many people still using Internet Explorer?

As somebody who knows a little bit of HTML and JavaScript, I appreciate standards very much. Internet Explorer has broken (though they've restrained themselves) and continues to break every single one of them. Luckily I develop for mobile platforms when I have an idea and have time.

Internet Explorer never follows standards; the interfaces it exposes are always Microsoft-y, cumbersome, intricate, impractical. Multimedia and interaction are assured with the latest developments of the triad HTML/JavaScript/CSS3, yet in Explorer they always render badly. I know programmers who have suffered making two versions of their webpage: one for Explorer and another for the rest. These are just bad guys.

And we have the following compelling reasons, which go beyond purely political ones (such as following standards):

You are safer by avoiding software that bad guys target. Mac users benefited from this for years. Windows users can lower their attack surface (be less vulnerable) by avoiding popular software. Internet Explorer is popular, so bad guys exploit known problems with the browser. No thanks.
Microsoft fixes bugs in Internet Explorer on a fixed schedule. But bugs are not discovered on a schedule, which means IE users remain vulnerable to known bugs until the next scheduled bug fix roll-out. Neither Firefox nor Chrome, my preferred browsers, is locked into a schedule.
In addition, Microsoft is just slow in fixing Internet Explorer bugs. The last release of IE patches included a fix to a bug that Microsoft had been told about six months ago. The topic of bugs in popular software brings Adobe's Flash Player to mind. Internet Explorer users with Flash enabled in their browser get notified of new versions of Flash using a very flawed system. And, when they are notified, they need to manually install the new version of Flash.
In this day and age, this is not acceptable; Flash is too popular and too buggy. Firefox fails here too. And speaking of Flash, it exists in Internet Explorer as an ActiveX control. The lack of security in ActiveX is what prompted me to jump on the Firefox bandwagon even prior to version 1.0.
ActiveX may be locked down a bit more than it used to be, but how many Internet Explorer users understand the security-related prompts about running an ActiveX control, let alone the configuration options for ActiveX? To me, a browser that doesn't support ActiveX is safer. ActiveX was the first approach to extending browsers with extra features and functions. Now, both Firefox and Chrome have a huge number of available extensions. Internet Explorer has only a handful.
Buggy browser extensions/plugins are often targeted by bad guys. Both Firefox and Chrome do some checking for outdated extensions. Internet Explorer does none.

Tuesday, December 17, 2013

Damn, Clojure is fast!

Well, I feel positive today. I struggled for the past two days to make my logistic regression in Clojure faster. I even wrote as many as four different implementations of the logistic regression, with none of them giving satisfactory results. It was all even more disappointing when compared to my MyML logistic regression implementation.

Well, I found out what the problem was: Actually I was making two fatal errors.

My input data in Clojure were lists instead of vectors.
My input data in Python were 256 datapoints, instead of 1000 as in the Clojure version.

Both points stemmed from my not being careful enough. In the first case, I already knew that one should use vectors instead of lists when going after performance, but I was assuming that my data was in vector form. Staring at the variable X, I wondered whether it was a vector or a list and, voilà, performance got 5x better. Then I went to the Python interpreter, checked whether the number of iterations in the gradient descent object was the same as in Clojure, and then checked the data... well, my data was smaller (left over from a previous test with logistic regression). I re-generated my data and Python just lagged behind. Here are the figures (notice that I did not bother to include the wrong results I was getting because of my mistakes):

Python: 0.56 sec
Clojure: 0.19 sec (iterative implementation through loop-recur), 0.04 sec (concurrent implementation through agents)

Bear in mind that I am conducting the tests on my girlfriend's borrowed machine and that I installed Christoph Gohlke's NumPy distribution, which ships with Intel's MKL statically linked libraries, so it should be pretty fast in terms of algebraic computations. Perhaps the lack of performance comes from Python's interpreter itself (read-interpret-execute...). This supports Clojure even more, since we are focusing on the infrastructure of both systems.
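
The data-size mismatch described above is the kind of mistake a small guard around the timing code can catch. The following is only a sketch under stated assumptions: it is plain Python rather than the NumPy-based MyML code, the data is synthetic and one-dimensional, and make_data, fit_logistic and timed_fit are hypothetical names.

```python
import math
import random
import time


def make_data(n, seed=0):
    """Generate n one-dimensional points labelled by a noisy threshold at 0."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [1.0 if x + rng.gauss(0.0, 0.5) > 0 else 0.0 for x in xs]
    return xs, ys


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=100, rate=0.1):
    """Batch gradient descent on the mean log-loss; returns (weight, bias)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= rate * grad_w / n
        b -= rate * grad_b / n
    return w, b


def timed_fit(xs, ys, expected_n, iterations):
    """Time one fit, refusing to run if the data size is not the agreed one."""
    if len(xs) != expected_n:
        raise ValueError(f"expected {expected_n} datapoints, got {len(xs)}")
    start = time.perf_counter()
    params = fit_logistic(xs, ys, iterations=iterations)
    return params, time.perf_counter() - start


xs, ys = make_data(1000)
(w, b), elapsed = timed_fit(xs, ys, expected_n=1000, iterations=50)
print(f"w={w:.3f} b={b:.3f} in {elapsed:.3f} s")
```

Passing the expected number of datapoints and iterations explicitly makes it harder to compare two runs that are silently doing different amounts of work.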

I will be putting everything in order, making my logistic regression more idiomatic and building some tests.
