Monday, November 20, 2006

Continuations for Web Development and Leaky Abstractions

In the popular online essay The Law of Leaky Abstractions, Joel Spolsky talks about choosing the wrong abstractions for describing programming libraries: abstractions that hide details which influence, in important ways, the programs using those libraries. Since those details are hidden, the programmer has no control over them and, as a result, no control over important aspects of the resulting programs.

Calling something continuation passing when in fact you need to save the whole process memory to disk and then retrieve it again is such a misleading abstraction. Passing and then calling a continuation, in languages that support it, is a primitive operation - something that translates to just a few machine code instructions, something O(1) - while saving process memory is not. Until this memory saving is completely eliminated I will call Catalyst::Continuation and Jifty::Continuation leaky abstractions.
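To make the contrast concrete, here is a minimal sketch of continuation passing written with plain Perl closures. Perl 5 has no first-class continuations, so this only illustrates the style; add_cps and square_cps are names invented for this example. Handing over and invoking the continuation is an ordinary subroutine call, and nothing has to be written to disk.

```perl
use strict;
use warnings;

# Continuation-passing style: instead of returning to the caller,
# each function hands its result to the continuation $k (a code ref).
sub add_cps {
    my ($x, $y, $k) = @_;
    return $k->($x + $y);    # invoking the continuation is one ordinary call
}

sub square_cps {
    my ($x, $k) = @_;
    return $k->($x * $x);
}

# Compute (2 + 3) squared; the last continuation just returns the value.
my $result = add_cps(2, 3, sub {
    my ($sum) = @_;
    return square_cps($sum, sub { return $_[0] });
});

print "$result\n";    # prints 25
```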

Saturday, October 28, 2006

path and args

There is a drive to convert all the addresses with GET parameters to paths.
http://www.domain.com/book?author=Dostoyevsky
becomes
http://www.domain.com/book/Dostoyevsky.
But what if we add a genre to that search?
http://www.domain.com/book/Dostoyevsky/memoir
or
http://www.domain.com/book/memoir/Dostoyevsky
?
We need to decide arbitrarily on the position of the arguments. But what if both of the arguments are optional? Then we need to reserve some string to play the role of 'undef'. We need to add two unnecessary, arbitrary conventions - complicating both the code and the usage.

Perhaps we should rather do
http://www.domain.com/book/author/Dostoyevsky/genre/memoir
with the semantics that everything after /book/ is actually the same old list of key=value pairs and that the order of the pairs has no meaning. Does this really buy us something? I don't know - I would rather see:
http://www.domain.com/book?author=Dostoyevsky;genre=memoir and know from the start how to manipulate it, and that I don't need to care about the order of the pairs.
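As a rough sketch of the key/value path convention discussed above, the following turns everything after a prefix into a hash, so the order of the pairs stops mattering. The helper name path_to_params is made up for this example, and it does no URL decoding; it is not something Catalyst provides under that name.

```perl
use strict;
use warnings;

# Hypothetical helper: treat everything after $prefix as key/value pairs.
sub path_to_params {
    my ($path, $prefix) = @_;
    # Give up if the path does not start with the prefix.
    $path =~ s{^\Q$prefix\E(?:/|$)}{} or return;
    my @parts = split m{/}, $path;
    return if @parts % 2;    # a key without a value
    return { @parts };
}

my $first  = path_to_params('/book/author/Dostoyevsky/genre/memoir', '/book');
my $second = path_to_params('/book/genre/memoir/author/Dostoyevsky', '/book');

# Both orders give the same parameters.
print join(';', map { "$_=$first->{$_}" }  sort keys %$first),  "\n";
print join(';', map { "$_=$second->{$_}" } sort keys %$second), "\n";
```

Both lines print author=Dostoyevsky;genre=memoir, which is exactly the information the plain query string already carried.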

Why on Earth change all named parameters to positional ones?

For Plugger: Catalyst.

Thursday, October 26, 2006

Tags and search and DBIx::Class

Update: Advanced Search in web DBIx::Class based applications (with tags, full text search and searching by location) is a more elaborate version of this article.

Some time ago I had an idea for a bookmarking site - nothing really revolutionary, but with an effective interface. I decided that it needed to combine search, browsing by tags and other properties, ordering and jumping to pages. I have thought a lot about how to make it effective - that is, letting the user find a web page with the least number of operations, assuming that she remembers only random bits of it and that the data we display may remind her of some new detail. These interface ideas are material for another post - here I would like to concentrate on the implementation of tagging in DBIx::Class.

I had the following requirements:
  • tags can be combined together and with other search terms
  • all searches should use indices
  • use database paging of results
  • for tag clouds I need a list of the tags used by the bookmarks matching the search criteria
With only the first three requirements I could use a simple table with the tags concatenated in one field and run a full text search on it. In fact the first implementation used that technique. This meant a tag could only be one word - but that is a reasonable constraint. A search for a combination of three tags looked like WHERE ? @@ tags AND ? @@ tags AND ? @@ tags AND THE_OTHER_CRITERIA where @@ is the PostgreSQL full text match operator and tags is the name of the aggregated tags column.

Unfortunately I did not find any way to meet the fourth requirement with this database schema, so I decided to have separate bookmark and tag tables. The tag table has two columns, bookmark_id and tag_text, with indices on both of them. One way to do the combined search here could be:

SELECT b.*
FROM bookmark b, tag t
WHERE t.bookmark_id = b.id
AND (t.tag_text IN (?, ?, ?))
GROUP BY b.id
HAVING COUNT( b.id )=3

But that would mean a full scan on the bookmark table and I wanted a solution using indices. So I devised another query for that schema. For a combination of many tags I would do as many joins to the tag table as there are tags in the query:

WHERE tag_1.bookmark_id = bookmark.id AND
tag_1.tag_text = ? AND
tag_2.bookmark_id = bookmark.id AND
tag_2.tag_text = ? AND
...

At first glance this looks hard to do in DBIx::Class - but actually it is a lot easier than it seems. The key insight here is: If the same join is supplied twice, it will be aliased to rel_2 (and similarly for a third time) (from the POD for DBIx::Class::ResultSet).

I had to build a hash of search parameters with the proper key names (tag, tag_2, tag_3 ...) and values from the @tags array:

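my %sqlparams;
my $suffix = '';    # the first join keeps the plain alias 'tag'
my $i = 1;
for my $tag (@tags) {
    # keys: tag.tag_text, tag_2.tag_text, tag_3.tag_text ...
    $sqlparams{'tag' . $suffix . '.tag_text'} = $tag;
    $suffix = '_' . ++$i;
}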


And then the search:

my $it = $schema->resultset('Bookmark')->search(
    \%sqlparams,
    {
        # one join to the tag relationship per tag, plus the user
        join     => [ ('tag') x scalar(@tags), 'usr' ],
        order_by => \@order,
        page     => $page,
        rows     => $maxrows,
    },
);
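To see what this produces, here is a small standalone run of the same loop for three tags, with no database involved. The tag values are invented for the example, and the column name tag_text is assumed from the schema described above.

```perl
use strict;
use warnings;

my @tags = ('perl', 'catalyst', 'orm');    # example tags, invented here

# Build one search condition per tag, keyed by the alias of its join.
my %sqlparams;
my $suffix = '';
my $i = 1;
for my $tag (@tags) {
    $sqlparams{'tag' . $suffix . '.tag_text'} = $tag;
    $suffix = '_' . ++$i;
}

# The 'tag' relationship repeated once per tag, followed by 'usr'.
my @join = (('tag') x scalar(@tags), 'usr');

print "$_ => $sqlparams{$_}\n" for sort keys %sqlparams;
print "join: @join\n";
```

This prints the keys tag.tag_text, tag_2.tag_text and tag_3.tag_text with their tags, and the join list tag tag tag usr, which matches the aliasing rule quoted from the POD.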


Tuesday, October 24, 2006

One indirection layer too much?

It is common practice to keep the model completely outside of the Catalyst classes (let's call it MyApp::RealModel). This way you can conveniently use the code in batch jobs and command line tools outside of the web environment. But you still need to create a nearly empty MyApp::Model::EmptyModel class just to connect the outside model to the Catalyst framework. This might not seem like much work - but it requires one more unique name and complicates the documentation. And since this EmptyModel does not add much functionality over MyApp::RealModel, the programmer easily forgets about the differences between them - and eventually gets confused.

This is one indirection layer too much in my opinion - what I would like instead is to be able to specify in the configuration file (myapp.yml) something like:

model: MyApp::RealModel

Is that change feasible? Don't ask me.

Saturday, October 21, 2006

Customisation by deleting

Deleting parts of the generated code is one of the easiest ways of customizing. There is just one cognitive task that you need to do - decide what you need and what you don't. I have no data about that - but I am sure that converting your internal thoughts into programming language syntax is much more complicated.

I think this is one of the factors that make code generation an efficient tool in some circumstances. When you lock the logic away in some library, you cannot modify it so easily. This is one of the reasons that in Catalyst::Example::InstantCRUD I finally decided to generate the templates, and also to use a minimum of logic there - so that deleting some elements, or changing their order, would not break it.

Friday, October 20, 2006

HTML::Widget versus RoR helpers

I've recently deleted lots of code from InstantCRUD - some logic moved to templates, some to HTML::Widget::DBIC, some was just simplified (with the help of uri_with). I feel the code for CRUD is becoming quite 'elegant'.

What I am still not happy with is the widget config - I feel that all the visual things (like the choice between textarea and textfield etc) should be managed in the templates. The Ruby on Rails helpers seem an ideal solution - they do all the validation and filling in of values that needs to be done in the controller, but let the View manage the actual HTML produced. To reach that we would have to be able to produce an HTML::Widget::Element without specifying that it is to be a textfield or a radio box, and leave the HTML generation to be called in the template, something like:

[% result.field_html('fieldname', 'textarea', cols, 30, rows, 5) %]

This would produce the field HTML with the proper value and an error mark if there was an error during validation. This looks like too radical a step for the evolution of HTML::Widget - perhaps what is needed is some completely new module?
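A very rough sketch of what such a helper could look like on the Perl side follows. Everything here is hypothetical: the FormResult class, its constructor arguments and the field_html signature are invented for illustration and are not part of HTML::Widget. The object only knows the submitted values and validation errors, while the caller picks the kind of field and its attributes.

```perl
use strict;
use warnings;

package FormResult;    # hypothetical class, not part of HTML::Widget

sub new {
    my ($class, %args) = @_;
    my $self = { values => $args{values} || {}, errors => $args{errors} || {} };
    return bless $self, $class;
}

# Minimal HTML escaping for values, attributes and error messages.
sub _escape {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# The caller (the template) chooses the field type and its attributes.
sub field_html {
    my ($self, $name, $type, %attrs) = @_;
    my $field    = _escape($name);
    my $value    = _escape($self->{values}{$name});
    my $attr_str = join '',
        map { sprintf ' %s="%s"', $_, _escape($attrs{$_}) } sort keys %attrs;
    my $html = $type eq 'textarea'
        ? qq{<textarea name="$field"$attr_str>$value</textarea>}
        : qq{<input type="text" name="$field" value="$value"$attr_str />};
    my $error = $self->{errors}{$name};
    $html .= '<span class="error">' . _escape($error) . '</span>'
        if defined $error;
    return $html;
}

package main;

my $result = FormResult->new(
    values => { notes => 'a < b' },
    errors => { notes => 'Too short' },
);
print $result->field_html('notes', 'textarea', cols => 30, rows => 5), "\n";
```

The controller would build the result object after validation, and the template would decide between textarea and text field at the point where the HTML is emitted.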

Inspired by "Rails-like form helpers" email discussion.

There is a reply on the mailing list.