Machinomics: June 2012

Wednesday, June 27, 2012

The Fourier transform as a diagonalization

One of the benefits of using the Fourier transform of a function is that convolutions become multiplications. This is important when solving a differential equation with its Green's function. If the Green's function comes from a differential operator $D^*D$, where $D$ is a differential operator and $D^*$ is its adjoint, then the Green's function is not singular at the origin, and is continuous. It spans a function space called a reproducing kernel Hilbert space (RKHS): all functions in this space can be written as linear combinations of the Green's function with one argument fixed, and the solution to the differential equation $D^*D u = y$ would be of that form. OK, no more digressing... to the cheese...

In the Fourier domain we operate on frequencies $\omega$. For example, to attenuate noise, we decrease the power at high $\omega$, which amounts to a convolution (with a Gaussian, for example) in the time/space domain. If we see this linear operation as a matrix, the convolution operator that has (let's say) one-dimensional Gaussians in its rows (in the time/space domain) becomes diagonal in the Fourier domain.

The page popped up with much to follow up on. In particular, I liked this paragraph:

The moral of the story is that the Fourier Transform may be thought of as a change of basis. The Fourier integral projects a function onto the basis functions of a new coordinate system whose basis functions are the complex exponentials. In this new basis, the convolution operator is diagonal and everything is simple. The convolution operator acts on each Fourier component independently by multiplying the component by an associated magnitude and phase.

In Matlab

C=[4 1 2 3; 3 4 1 2; 2 3 4 1; 1 2 3 4]

C =

     4     1     2     3
     3     4     1     2
     2     3     4     1
     1     2     3     4

F=fft(C)

F =

  10.0000            10.0000            10.0000            10.0000
   2.0000 - 2.0000i  -2.0000 - 2.0000i  -2.0000 + 2.0000i   2.0000 + 2.0000i
   2.0000            -2.0000             2.0000            -2.0000
   2.0000 + 2.0000i  -2.0000 + 2.0000i  -2.0000 - 2.0000i   2.0000 - 2.0000i

F*C*F'

ans =

1.0e+003 *

   4.0000                  0                  0                  0
        0             0.0640 - 0.0640i        0                  0
        0                  0             0.0320                  0
        0                  0                  0             0.0640 + 0.0640i
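
A note on the session above: for a matrix argument, fft(C) returns the DFT of each column of C, that is W*C with W the DFT matrix, rather than W itself. The product F*C*F' still comes out diagonal because C is circulant, and its entries work out to $4|\lambda|^2\lambda$, where $\lambda$ runs over the eigenvalues of C. A more direct check is to conjugate C with the DFT matrix itself. What follows is a minimal sketch in plain Python (standard library only); the helper names are made up for this illustration, and it assumes the usual sign convention $W_{rs} = e^{-2\pi i rs/n}$.

```python
import cmath

def dft_matrix(n):
    """n x n DFT matrix W, with W[r][s] = exp(-2*pi*i*r*s/n)."""
    return [[cmath.exp(-2j * cmath.pi * r * s / n) for s in range(n)]
            for r in range(n)]

def circulant(first_column):
    """Circulant matrix whose (r, s) entry is first_column[(r - s) mod n]."""
    n = len(first_column)
    return [[first_column[(r - s) % n] for s in range(n)] for r in range(n)]

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def conj_transpose(a):
    """Conjugate transpose (what Matlab writes as a')."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

c = [4, 3, 2, 1]           # first column of the matrix C used in the post
n = len(c)
C = circulant(c)           # rebuilds the same 4 x 4 matrix C
W = dft_matrix(n)

# Change of basis W * C * inv(W), using inv(W) = W' / n.
D = [[x / n for x in row] for row in matmul(matmul(W, C), conj_transpose(W))]

# The diagonal of D should be the DFT of the first column of C.
eigenvalues = [sum(W[r][s] * c[s] for s in range(n)) for r in range(n)]

# Diagonal that the Matlab session reports for fft(C)*C*fft(C)'.
blog_diagonal = [n * abs(lam) ** 2 * lam for lam in eigenvalues]

for row in D:
    print([complex(round(x.real, 6), round(x.imag, 6)) for x in row])
```

Up to rounding, D is diagonal with entries 10, 2-2i, 2 and 2+2i: each Fourier component is simply scaled by its own eigenvalue, which is the "convolution becomes multiplication" statement in matrix form.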

Tuesday, June 26, 2012

I am an Analytic Bastard and I will fight Delusional Geometers to death

This post is dedicated to AMG: friend and enemy, mentor and destroyer, wise and fool.

AMG is obsessed with equating generalized functions (as in Schwartz distribution theory) with probability distributions (as in measure theory). According to AMG I am an Analytic Bastard; I agree. And this was the least thing I could take from a Delusional Geometer. Therefore I left AMG.

Schwartz distributions are NOT probability distributions

I can say it louder but not clearer.

It doesn't matter that they are both called distributions; it does sometimes happen in mathematics that two different things share a name. It doesn't matter how hard you try to make them the same, and it doesn't matter how proud you are or how little you think of the people who surround you and who are not Fields Medalists.

The fact that Strichartz's book shows a bell-like $C^{\infty}$, compactly supported function does not imply it is a distribution. What is more, this bell-like $C^{\infty}$, compactly supported function is clearly stated to belong to the set $\mathcal{D}$, the set of test functions. Therefore it is not a distribution itself, but one of the objects to which the linear functional (the Schwartz distribution) is applied. If $\varphi \in \mathcal{D}$ then it is a test function. It looks similar to a DENSITY function (the Gaussian), but the density is not the distribution, nor is the test function the (other kind of) distribution.

Furthermore, even if I force my brain to accept $\varphi \in \mathcal{D}'$ and call it a (Schwartz) distribution, you can't write $\varphi(x)$ outside the integral symbol. It is a functional, which means that it is applied to some $\phi \in \mathcal{D}$, so what makes sense is $\varphi(\phi)$. Don't get angry with me because of this; this is a fact. If $\varphi$ were a linear functional, it could at best be written as $\int \varphi(x) \phi(x) dx$, so why on Earth do you say $\varphi(x)$ anyway?
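
To pin the vocabulary down, here is a short summary of the standard definitions. The notation is chosen for this summary only (it is an assumption, not Strichartz's): $\phi$ is a test function, $T$ a Schwartz distribution, $f$ a locally integrable function and $\mu$ a probability measure.

$$
\phi(x) =
\begin{cases}
e^{-1/(1-x^2)}, & |x| < 1 \\
0, & |x| \ge 1
\end{cases}
\quad \in \mathcal{D},
\qquad
T \in \mathcal{D}' : \ \phi \mapsto \langle T, \phi \rangle \ \text{(linear and continuous)}
$$

$$
\langle T_f, \phi \rangle = \int f(x)\,\phi(x)\,dx,
\qquad
\langle T_\mu, \phi \rangle = \int \phi \, d\mu,
\qquad
\langle \delta, \phi \rangle = \phi(0),
\qquad
\langle \delta', \phi \rangle = -\phi'(0)
$$

The bump $\phi$ is the bell-like object, and it lives in $\mathcal{D}$. A probability measure $\mu$ does define a distribution $T_\mu$, but the converse fails: $\delta'$ is a perfectly good element of $\mathcal{D}'$ and is not a probability measure, since it is not even a positive functional.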

At this point, why the hell do you use $\varphi(x)$ to name a "bell-like" function with $x= \arg \max_y \varphi(y)$?

Then all that remains is to go completely off the rails and try to apply density estimation methods to image analysis following this logic:

David Mumford develops an axiomatic theory that describes images as generalized functions (check)
We have methods that work in the density estimation field fairly well (check)
Since YOU (and only you) say generalized functions = probability distributions, then our methods must be very powerful in image analysis (FAIL)

FAIL! Because the only supporting argument you have is your pride.

So, let me out!

Wednesday, June 20, 2012

Pairs trading and colinearity

We sometimes recognize a couple of assets as being correlated. However, the dependence regime changes over time, making this correlation non-linear and dependent on, let's say, a phase. A more robust concept is multiple co-linearity (close in spirit to what econometricians call cointegration), which means that some linear combination of the returns of those assets has a constant mean and variance.

Let's say that two assets are co-linear and that the returns of one of them have been consistently larger than the other's. It makes sense to sell short the asset with larger past returns and buy the asset with smaller past returns. With this, we would have a quantitative model to measure large discrepancies in the return of the linear combination: for example, execute this strategy when the absolute value of that return exceeds twice its standard deviation. This gives a statistical arbitrage opportunity.
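
The two-standard-deviation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the fixed weight on the second asset, the use of all earlier observations to estimate the mean and standard deviation, and the sample numbers are all hypothetical, and no trading costs or estimation of the weight are considered.

```python
from statistics import mean, stdev

def spread_signal(returns_a, returns_b, weight_b=1.0, threshold=2.0):
    """Signal for the latest observation of the spread a - weight_b * b.

    The mean and standard deviation are estimated on all observations
    except the last one (an assumption made for this sketch).
    """
    spread = [a - weight_b * b for a, b in zip(returns_a, returns_b)]
    history, latest = spread[:-1], spread[-1]
    mu, sigma = mean(history), stdev(history)
    deviation = latest - mu
    if deviation > threshold * sigma:
        # A has outperformed the combination by an unusual amount.
        return "short A, long B"
    if deviation < -threshold * sigma:
        # A has underperformed the combination by an unusual amount.
        return "long A, short B"
    return "no trade"

# Made-up return series, for illustration only.
returns_a = [0.01, -0.01, 0.02, -0.02, 0.00, 0.10]
returns_b = [0.01, -0.01, 0.01, -0.01, 0.00, 0.00]
print(spread_signal(returns_a, returns_b))
```

With these made-up numbers the last spread observation is far outside two standard deviations of its history, so the sketch suggests shorting the first asset and buying the second.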

One example that I like to use is the pair EURUSD and GC (NYMEX Gold 100oz), and I used that to get hold of some coins. Another, perhaps more interesting, application would be corn and wheat. They seem to have periods when one is the loved child of agricultural commodity traders. They are normally worth around the same, wheat being historically more expensive. Corn caught up and had a period when it was more expensive, but wheat recently had a rally and got to be USD100 more expensive per contract. Obviously, the gap closed down to a difference of USD20. It then widened and is now sitting around USD40-USD50. This simple model would have yielded a potential gain of USD80 per contract.

Saturday, June 16, 2012

SQL Server CE Max Database Size

I am crawling an Internet database so I can apply some spectral graph methods to it. I am using my computer at the university. I have just logged in and found out that my crawler was reporting it had (oddly and partially) finished. Then I found out the error was not that the database owners had detected me (I have implemented the protocols to keep a low profile), but that the local database is capped at 256 MB file size.

Long story short, if you want to use more than 256 MB on a desktop SQL Server CE database, add the parameter Max Database Size=1024 (the value is in MB, so this raises the cap to 1 GB) to your connection string. Separate parameters with ';'.

Monday, June 11, 2012

Nested queries with SQL Server Compact Edition

Following the previous post, I keep running into small troubles with SQL Server CE and finding their solutions, and I think they will be useful for anybody working with it.

SQL Server CE does not support nested queries!

The way to circumvent this is by using inner joins, as mentioned in this blog.

My query now reads:

// The inner SELECT is joined as a derived table named t.
String sql = "SELECT Authors.* FROM Authors INNER JOIN " +
             "(SELECT AuthorID FROM SubDomain_Authors " +
             "WHERE (DomainID = " + subdomain.DomainID +
             ") AND (SubDomainID = " + subdomain.SubDomainID +
             ")) AS t ON Authors.ID = t.AuthorID";

SqlCeCommand cmd = new SqlCeCommand(sql, con);

SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Scrollable);
if (rs.HasRows)
{
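
For readers who want to try the join pattern without SQL Server CE at hand, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. That swap is an assumption made for this example: the post itself uses C# and SqlCeCommand, and SQLite's dialect is not identical to SQL Server CE's. The table and column names come from the query above; the Name column, the sample rows and the function name are made up, and an ORDER BY is added so the result order is predictable. The two IDs are passed as query parameters rather than concatenated into the SQL string.

```python
import sqlite3

def authors_in_subdomain(con, domain_id, subdomain_id):
    """Return the Authors rows linked to one (DomainID, SubDomainID) pair."""
    sql = (
        "SELECT Authors.* FROM Authors INNER JOIN "
        "(SELECT AuthorID FROM SubDomain_Authors "
        "WHERE (DomainID = ?) AND (SubDomainID = ?)) AS t "
        "ON Authors.ID = t.AuthorID "
        "ORDER BY Authors.ID"
    )
    # The '?' placeholders are filled in by the driver, not by string pasting.
    return con.execute(sql, (domain_id, subdomain_id)).fetchall()

# In-memory database with hypothetical sample data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Authors (ID INTEGER PRIMARY KEY, Name TEXT)")
con.execute(
    "CREATE TABLE SubDomain_Authors "
    "(AuthorID INTEGER, DomainID INTEGER, SubDomainID INTEGER)"
)
con.executemany(
    "INSERT INTO Authors VALUES (?, ?)",
    [(1, "Ada"), (2, "Emmy"), (3, "Sofia")],
)
con.executemany(
    "INSERT INTO SubDomain_Authors VALUES (?, ?, ?)",
    [(1, 10, 100), (3, 10, 100), (2, 10, 200)],
)

rows = authors_in_subdomain(con, 10, 100)
print(rows)  # [(1, 'Ada'), (3, 'Sofia')]
```

The same idea should carry over to the C# version through command parameters, although that is not shown in the post and is not tested here.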

Sunday, June 10, 2012

Using Visual C# 2008 Express to query a SQL database

I am currently working on building a database that I intend to feed to a graph so that I can explore its spectral properties and draw interesting conclusions about that data.

To do that, I am downloading data from a source, processing it and storing it in a database. Since the data are highly structured, I decided that I would use an SQL-like database. Conveniently, I found out how to use the SQL Server Compact Edition (v 3.5) that Microsoft ships with a number of its products (I know that v 4.0 ships with Visual Studio 2010, but I don't know where the 3.5 I had installed came from). In any case, while working on the project, I found out that Visual C# itself can be used as a SQL query viewer, to query a database.

These are the steps to query a database from Visual C# 2008 Express:

Click on the menu "Data" and then "Add new data source" if you have an open project
Click on the menu "Tools" and then "Connect to database" otherwise
Click on "Database"
Click on "New connection"
Select "Microsoft SQL Server Compact 3.5 (.NET Framework Data Provider for Microsoft SQL Server Compact 3.5)"
Click "Browse" to get to your file. You can create a new file with a new filename if you wish.
Assuming that you have data to query: look for the table, right-click on it and click on "Show table data" or "New query". In the first case, "SELECT * FROM Table" will be executed, whereas the second case will let you modify the SELECT statement.

MS SQL CE stores the database locally in .sdf files, and it is embedded into your program as a SQL engine DLL, so you have access to the objects that interpret statements and that access and control the files. By the way, it has a limit of 4 GB per database file.

It is probably not that big of a deal, but I was pleasantly surprised when I found out I could get this project done with just the Express edition and no supporting tools.

Tuesday, June 5, 2012

Declaration of linear independence

I've been very busy (pinyin for beginners: Wo hen mang, simplified: 我很忙) so let's have some nerdy fun. Via mathbabe's blog, the declaration of linear independence:

IN EUCLIDEAN SPACE, JANUARY 28, 1988
When, in the course of a proof, it becomes necessary for a set to dissolve the argument which has connected it with a theorem, and to assume among the powers of mathematics a position above that of the mathematician, a decent respect for the axioms requires that a rigorous justification be given.
We hold these truths to be self-evident: that all nonzero vectors are created equal; that they are endowed by their definer with certain unalienable rights; that among these are the laws of logic and the pursuit of valid proofs; that to secure these rights, logical arguments are created, deriving their just powers from axioms; that whenever any argument becomes destructive of these ends, it is the right of the vectors to alter or to abolish it, and to institute a new argument, laying its foundation on such principles, and organizing its powers in such form, as to them shall seem most likely to reach the correct conclusion. Prudence, indeed, will dictate that theorems long established should not be changed for light and transient causes, and accordingly all experience hath shown that sets are more disposed to accept the conclusions of arguments than to right themselves by abolishing the arguments. But when a long train of abuses and usurpations, pursuing invariably the same object, evinces a design to reduce them to zero in a non-trivial way, it is their right, it is their duty to throw off such argument, and to provide new proofs for their future security. Such has been the patient sufferance of these vectors, and such is now the necessity which constrains them to alter these arguments. The history of Professor Eigen is a history of repeated injuries and usurpations, all having in direct object the establishment of dependence among these vectors. To prove this, let facts be submitted to a candid world.
He has refused to acknowledge that he obtained a zero matrix only by multiplying our coordinate matrix by a zero matrix.
He has restricted our freedom of movement by requiring us all to live in the same hyperplane, even though we cannot all fit in one.
He has attempted unsuccessfully to invert our coordinate matrix, and, having overlooked the inverse, has concluded that the coordinate matrix is singular.
He has changed bases repeatedly for opposing with manly firmness his attempts to place us in the span of fewer vectors than the dimension of the space.
He has erected a multitude of new formulas and sent hither swarms of new functions to force our directions into a proper subspace of the vector space.
He has kept among us vectors to be orthogonal to all of us without the consent of those of us whose dot product with them is nonzero.
He has abdicated the axioms here by committing mathematical errors in computing a zero determinant for our coordinate matrix.
In every stage of these oppressions we have petitioned for redress in the most humble terms; our repeated petitions have been answered only by repeated injuries.
A mathematician whose arguments are thus marked by every error is unfit to prove the theorem which he attempts to prove.
Nor have we been wanting in attentions to Professor Eigen. We have warned him from time to time of flaws in his arguments. We have reminded him of the circumstances of our definition, we have appealed to his knowledge of the axioms, and we have requested him to disavow these usurpations which would inevitably destroy the validity of his arguments. He has been deaf to the voice of logic. We must therefore acquiesce in the necessity which denounces our separation and hold him as we hold the rest of mathematicians, an enemy when he is wrong, a friend when he is right.
We, therefore, the members of set S in vector space V, appealing to the supreme judge of mathematics for the rectitude of our intentions, do solemnly publish and declare that these vectors are, and of every right ought to be, a free and independent basis; that they are absolved from all subjection to Professor Eigen's theorems, and that all restriction of them to a hyperplane is, and of right ought to be, totally dissolved; and that as a basis, they have full power to span the space, form invertible coordinate matrices, give unique linear combinations equal to a given vector, and to do all other acts and things which a basis may of right do.
And for the support of this declaration, with a firm reliance on the protection of the properties of a vector space, we mutually pledge to each other our magnitudes, our directions, and our sacred honor.
In witness whereof we have signed our coordinates with respect to an appropriate orthonormal basis, and found them to constitute a triangular matrix with nonzero diagonal elements.
