Some time ago I had an idea for a bookmarking site - nothing really revolutionary, but with an effective interface. I've decided that it needs to combine search, browsing by tags and other properties, ordering and jumping to pages. I have really thought much how to make it effective - that is letting the user find some web page with least number of operations assuming that she remembers only random bits from it and that the data we display perhaps reminds her about some new info. These interface ideas are material for another post - here I would like to concentrate on implementation of tagging in DBIx::Class.
I had following requirements:
- tags can be combined together and with other search terms
- all search should use indices
- use database paging of results
- for tag clouds I need a list of tags used by bookmarks matching a search criterium
WHERE ? @@ tags AND ? @@ tags AND ? @@ tags AND THE_OTHER_CRITERIA
where the @@
operator is a full text match for PostgreSQL and tags
was the name of the aggregated tags column.Unfortunately I did not found any way to meet the fourth requirement with this database schema, so I decided to have separate
bookmark
and tag
tables. The tag
table has two columns bookmark_id
and tag_text
with indices on both of them. One idea how to make the combined search here could be like:SELECT b.*
FROM bookmark b, tag t
WHERE t.bookmark_id = b.id
AND (t.tag_text IN (?, ?, ?))
GROUP BY b.id
HAVING COUNT( b.id )=3
But that would mean a full scan on the bookmark table and I wanted a solution using indices. So I devised another query for that schema. For a combination of many tags I would do as many joins to the
tag
table as there are tags in the query:
WHERE tag_1.bookmark_id = bookmark.id AND
tag_1.tag_text = ? AND
tag_2.bookmark_id = bookmark.id AND
tag_2.tag_text = ? AND
...
For the first glance this looks hard to do in DBIx::Class - but actually it is a lot easier than it seems, the key learning here is:
If the same join is supplied twice, it will be aliased to rel_2 (and similarly for a third time)
(from POD for DBIx::Class::ResultSet).I had to build hash with search parameters with the proper key names (
tag
, tag_2
, tag_3
...) and values from the @tags
array :
my $suffix = '';
my $i = 1;
for my $tag (@tags){
$sqlparams{'tag' . $suffix . '.tag'} = $tag;
$suffix = '_' . ++$i;
}
And then the search:
my $it = $schema->resultset('Bookmark')->search(
\%sqlparams, {
join => [ ('tag') x scalar(@tags), 'usr' ],
order_by => \@order,
page => $page,
rows => $maxrows,
},
);
2 comments:
Just for the sake of passing on the link, Jay Pipes of MySQL has a nice presentation on tags and folksonomy in MySQL.
http://jpipes.com/presentations/tagging.pdf
Dear Internet, you rock. You're always around to show me someone who already did all the hard work for me. Thanks!
Post a Comment