Micro languages in Clojure - part 2

In the previous article we constructed a tiny DSL for generating SQL statements using a more Clojure like syntax.  It works, but it has a couple of problems:  It doesn’t parameterize SQL variables, and nesting doesn’t work properly.  In this article we’ll rectify those issues, and add a bit of addition functionality.

So to parameterize the SQL statements we obviously need to extract the values from our generated SQL and replace them with question marks.

This:

Needs to become this:

The library we are using expects parameterized SQL in the following form:

That should be the eventual result of our SQL generator.

For our simple DSL we need to accumulate the string portion of the statement, and the variables for binding.  Whenever I need to accumulate more then one thing in Clojure, I tend to lean on maps.  You may know them by different names, perhaps dictionary or hash table, etc.  They are a very useful data structure, and I tend to use them quite frequently.  Perhaps too much, but that’s another story for another day.

Let’s create a simple helper function for making maps of SQL fragments:

This function takes multiple arguments and checks to see of the last one is a vector.  If the last one is a vector, it’s assumed to be a variable to be bound to the SQL.  It returns a map with the :stmt key bound to the SQL string, and the :vars key bound to a list of the variables.

Now we need another little helper function to selectively generate paramatized statements.

Let’s modify the sql-exp function to call paramatize:

A call to sql-exp will now return a map resembling this:

Our little SQL DSL is now using bound variables.  The other issue to be dealt with is the nesting of SQL expressions.  To tackle that issue we’ll create another helper function:

All this function does is surround the supplied statement with parentheses.  So nesting should now work correctly.  Albeit at the cost of one superfluous pair of parentheses.

All the maps get merged together using merge-with concat.  Which does exactly what it says, merge the maps by concatenating the values with matching keys.  All that’s left is to take the resulting map and convert it to a form that clojure.contrib.sql can digest.  So one more helper function:

Let’s give it a test run:

Looks good, bound variables are working and nesting is handled correctly now.  Let’s see how much work it would take to add sorting and result set paging.

For sorting we’ll use our macro/let binding trick again to create functions asc and desc.

Paging turns out to be dead simple:

Let’s test and make sure it works as expected:

Looks good, though the nesting is getting a bit extreme.  As luck would have it, Clojure offers a great solution for over nestification.  Say hello to the macro.  With the “threading” macro, we can rewrite the above as:

Much more readable.  Now our tiny DSL meets almost all of its users’ (yours truly) needs.  In the future we might revisit how the SQL is generated to enhance composibility, and perhaps provide a more terse syntax for specifying table columns.

As always the complete source is available on Github.

Micro languages in Clojure via macros and let.

While working on a couple of web projects using moustache I’ve been slowly building up some abstractions for dealing with SQL for data durability.  The abstractions are built on top of the clojure.contrib.sql library which works great.  However I was interested in being able to express my queries in a more “Clojuresque” way. The library that immediately popped to mind was the ClojureQL project.  Alas there were two issues with ClojureQL.  First and foremost, it seems abandoned or at least stagnant at the moment, hopefully that will change as it has lots of potential.  The second issue is that the abstraction it provides seems backwards.  Here’s an example:

In Clojure the collection generally comes last.  So perhaps something like this, would be more idiomatic:

Let’s see if we can build a tiny SQL abstraction layer to provide the above interface.  Let’s  agree before we begin that one of our goals should be to minimize namespace pollution, especially considering we’ll be redefining some clojure.core functions.  Users of the library (aka me) shouldn’t have to prefix every function or resort to the use of (:use).

We don’t want this:

A thought occurred to me the other day for working around this issue.  Perhaps we can locally rebind symbols using let inside a macro to give us the interface we’d like:

Much nicer in my opinion.  That thought in hand, let’s get down to work.  Starting with collect macro we have:

(The sqlize function simply transforms :table into “table” and :table/col into “table.col”.)

Here’s our first use of let to rebind a symbol.  as is bound to an anonymous function but only within the context of collect.  Now a quick test of the new collect macro:

So now we can generate select statements, next up is filtering those “collections”.  Here’s the filter macro:

Again, we use let to bind symbols local to only the macro.  Time to test filtering:

It works!  Here’s our (in)complete SQL DSL:

That’s it for now, in the next article we’ll fix our nesting issue, and implement bound variables in the generated SQL.

If you want to play around, all the code is available on Github.

Clojure simple threaded web server performance test.

After reading David Nolen’s post about Aleph I was curious to see how scalable a simple Clojure threaded web application could be when paired with good old Postgres 8.4.  To make things just slightly more realistic, two queries will be generated per request, and a basic listing of the data will be produced.

I’m running this experiment on an iMac with a 2.8 Ghz Intel Core i7 with 8 gigs of memory.

Here’s the code:

And here’s a screen capture of the output page produced:

scale-html-screen-grab

The tests will be run with apache bench set to make one thousand requests with a concurrency of fifty.  The tests will be run ten times and the results averaged.

ab -n 1000 -c 50 http://localhost:8080/

The first pass will use straight non-pooled connections.  

Results: 668 requests a second.

Not exactly speedy, but that’s expected due to the high cost of making a connection to the database for every request.

Now let’s will switch to connection pooling.  Eliminating the connection overhead should increase performance significantly.

Result: 901 requests a second.

Better, but not nearly as large a difference as I was expecting.  Perhaps changing the maximum connections to the database will improve the performance.  Let’s bump the maximum connections to 3 and try again.

Result: 2203 requests a second.

A large improvement.  Setting the maximum connections to 3 increased performance rather dramatically.  Now let’s try setting the maximum connections to 50 to match the concurrency we are using in apache bench.

Result: 3138 requests a second.

Again a large improvement.  Not a surprise as the optimal connection level for the work load at hand was readily apparent.  Time for one more experiment:  Setting the maximum connections to 1000.  (postgres.conf must be adjusted to allow one thousand connections.)

Result: 3065 requests a second.

Basically the same as the previous run.  So blindly setting the maximum connections higher does not pay off in improved performance in this circumstance.

The simple experiment illustrated here is merely the tip of a very large optimization ice-burg.  I know very little about actually tuning Postgres.  There are other connection pool libraries out there that could potentially be faster.  Operations folks experienced in these matters may be able to squeeze out even more performance for very little complexity cost.

My personal conclusion is that it’s extremely unlikely that any project I’m building will ever need the additional raw speed offered by event based servers.  The complexity is, quite simply, not worth the cost.  Every situation is different but trading complexity for raw speed, unless it’s your only competitive advantage, seems like a fools errand.

If you want to play around yourself, the code is available on Github.

Named routes for Compojure

I’ve been itching for named routes in Compojure for a few weeks now.  I tend to change my site structure often in the beginning of a project, and string literals for links was making my life difficult.

Here’s my, probably naive, solution:

Sending email from Clojure.

I’m working on several small projects in Clojure, and for one of them I needed to be able to send email.  Quick googling revealed this.  Java Mail API?  Just what I wanted.  However since I’m using leiningen, I needed to find a maven repository with javamail.  In my search for javamail I happened across Apache commons-email, the description is as follows: “Commons Email aims to provide a API for sending email. It is built on top of the Java Mail API, which it aims to simplify.“  Simplify you say?  Well that is worth a look…  Long story short, if you are using leiningen.  In project.clj add:


:repositories [["central-proxy" "http://repository.sonatype.org/content/repositories/central/"]]
:dependencies [[org.apache.commons/commons-email "1.2"]]

Now you can simply do the following:

This does add one extra dependency but, considering it can easily send HTML email and do attachments, I think it’s worth it.