Perl Alchemy - notes of a programmer

Saturday, March 31, 2012

Plack::Middleware::Auth::Form - some updates and a possible name change

I'll make a new release of Plack::Middleware::Auth::Form soon. There are quite a few fixes in the Plack::Middleware::Auth::Form repository gathered since the last release. It is all from external contributors - thanks a lot!

The bug reported in #75896: Cookie Expiry Date not set for "remember" session is quite interesting. Apparently Plack::Middleware::Session sends the session cookie on each request and if you don't set Expiry Date each time it will happily unset it.

I am thinking about changing the name to WebPrototypes::LoginForm. Some people did not like the name Plack::Middleware::Auth::Form from the start, because it is a bit more high-level than the other Auth middlewares, and now I have two more elements for quick web application prototyping under the WebPrototypes namespace.

Sunday, March 25, 2012

Blog writing and assuming stupidity

Writing a blog is not easy. People have not changed much since the 'bread and circuses' times. You need to spice up your writing with strong statements or you'll not get any audience. On the other hand, ridiculing someone while having a very superficial knowledge of the matter makes you a bully.

What happened there? Again, it is hard to tell anything interesting without some speculation - and possibly I'll have to apologize to Dave for this - but I think Dave has read "You must hate version control systems, we won't be using any" and assumed that this is from a company that superficially rejected version control because they did not want to learn or, in other words, from someone who assumed that version control is useless. Talk about beams and eyes. That's not to say that I vouch for the 'pipelines' system or for replacing version control with it. I still don't know much about these pipelines - but new ideas don't have to work in every possible aspect to be worthwhile and you'll not have a breakthrough idea if you always stick to the accepted wisdom.

It is easy to assume stupidity - on average people are mediocre - but the internet is a big search space - expect to be surprised from time to time :)

Monday, March 19, 2012

Verbs and Nouns

There is a popular, if a bit long and blurry, rant by Steve Yegge: Execution in the Kingdom of Nouns - it is about how we overuse nouns and under-use verbs when programming in Java. Of course it is not different in other object oriented imperative languages. Programs do something, subroutines do something - verbs should be at least as prominent as nouns in programming - but when we need to write an application we build it out of objects. Even if it is a web application - something that translates the HTTP request into the HTTP response - we code it as an object with fields and all that stuff. Even if we code against an API that defines the web application as a subroutine reference, we still write it as an object and then make a closure over it to pass to the backend.
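A minimal sketch of the "object plus closure" arrangement described above. The HelloApp class and its handle method are made-up names for illustration; the only PSGI convention assumed is that an application is a code reference taking the environment hash and returning a three-element array reference.

```perl
use strict;
use warnings;
use 5.014;

# Hypothetical application class: the "noun", with fields and methods.
package HelloApp {
    sub new {
        my ( $class, %args ) = @_;
        return bless { greeting => $args{greeting} // 'Hello' }, $class;
    }

    # Translates the request (the environment hash) into the response.
    sub handle {
        my ( $self, $env ) = @_;
        my $path = $env->{PATH_INFO} // '/';
        return [
            200,
            [ 'Content-Type' => 'text/plain' ],
            ["$self->{greeting} from $path\n"],
        ];
    }
}

my $object = HelloApp->new( greeting => 'Hi' );

# The "verb" the backend expects: a plain code reference closing over the object.
my $app = sub { $object->handle(@_) };

# Calling it by hand, the way a PSGI server would.
my $response = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/dvd' } );
print $response->[2][0];
```

The backend only ever sees the code reference; everything structured lives in the object behind it.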

Do we overuse nouns? Or maybe it is that actions are opaque and unstructured - and when we need to get to the details, the parts that compose them - then it is more natural to treat them as things? Wouldn't it be easier to incorporate streaming in PSGI if the application there was an object with methods and attributes?

Sunday, March 11, 2012

WebNano - code experiments

WebNano is only a few hundred lines - but you can arrange it in many, many ways - and then you need to test it with all kinds of URL schemas and controller architectures. I do a lot of exploratory coding - testing all the possible arrangements. I feel that I keep forgetting the things that led me to choose one design over the others. Maybe I'll keep some notes here. In the past two weeks I tried a few things:

Keeping the parsed path as an attribute in the controller.
In addition to the above, I tried adding three more controller methods: 'action_name', 'action_args' and 'action_postfix'.
I wrote two additional test controllers for the simple url schema, both redirecting handling to DvdDatabase::Controller::Dvd::Record for the case where we have a record to handle: one overriding local_dispatch, the other overriding handle.

The conclusions:

having the path as an attribute is handy for code retrieving the record
the additional controller methods help with writing custom dispatchers
splitting the processing into two controllers - one for the case where you have one object to work on (like viewing, editing, deleting), a second for the case where we don't (like listing, creating) - is very clean - you can have the object as a controller attribute
the additional dispatcher methods are less useful for that cleaner architecture
the biggest problem was always preventing the methods that require the object from being called when we don't have the record id on the path (like '/view' when we assume that it should be '/1/view') - and the best way to do that is having the two controller classes
overriding 'handle' is actually simpler - because it is a very simple method
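The two-controller split from the conclusions above could be sketched roughly like this. It is not WebNano code: the class names, the '_action' suffix and the dispatch function are all assumptions made up for the illustration, and the DVD data is a hard-coded hash.

```perl
use strict;
use warnings;
use 5.014;

my %DVDS = ( 1 => 'Alien', 2 => 'Brazil' );

# Handles requests that have a record: the record is a plain attribute.
package RecordController {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub view_action { my $self = shift; return "Viewing $self->{record}" }

    sub handle {
        my ( $self, $action ) = @_;
        my $method = $self->can( ( $action // 'view' ) . '_action' )
            or return 'Not found';
        return $self->$method;
    }
}

# Handles requests without a record and hands over when the path starts with an id.
package CollectionController {
    sub new { return bless {}, shift }
    sub list_action { return 'Listing: ' . join ', ', map { $DVDS{$_} } sort keys %DVDS }

    sub handle {
        my ( $self, @path ) = @_;
        my $first = shift @path // 'list';
        if ( $first =~ /\A\d+\z/ ) {
            return 'Not found' unless exists $DVDS{$first};
            return RecordController->new( record => $DVDS{$first} )->handle(@path);
        }
        my $method = $self->can("${first}_action") or return 'Not found';
        return $self->$method;
    }
}

# Hypothetical entry point: split the path and start at the collection controller.
sub dispatch {
    my ($path) = @_;
    my @parts = grep { length } split m{/}, $path;
    return CollectionController->new->handle(@parts);
}

print dispatch('/1/view'), "\n";
print dispatch('/view'),   "\n";
```

Because view_action exists only on the record controller, '/view' without an id simply cannot reach it.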

Tuesday, March 06, 2012

Why Bread::Board looks mostly redundant

This is based on two assumptions. The first is that you don't use BB as a kind of Service Locator (I agree with, for example, "Dependency Injection != using a DI container" that this is an anti-pattern). The second, which mostly follows from the first, is that the product of your BB container is just one object - the application class. I believe these are good guidelines for software architecture. With those two assumptions all that BB gives you is the ability to name your partial results and then use them in later computations, but Perl has good support for this - it is called variables.

For example let's take the original example from the Bread::Board synopsis:

Now - let's do the same with just variables:

You can also feel fancy and do it with Moose lazy attributes:

This is not longer than the BB example and it uses generic tools.
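A rough sketch of what wiring with plain variables can look like. The three classes below (FileLogger, StubDBH, MyApplication) are stand-ins defined only for this illustration, and the file name and DSN are made up; this is not the code from the Bread::Board synopsis.

```perl
use strict;
use warnings;
use 5.014;

# Stand-in classes so the sketch runs on its own.
package FileLogger {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub log_file { return $_[0]{log_file} }
}

package StubDBH {
    sub connect { my ( $class, $dsn ) = @_; return bless { dsn => $dsn }, $class }
    sub dsn { return $_[0]{dsn} }
}

package MyApplication {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub logger { return $_[0]{logger} }
    sub dbh    { return $_[0]{dbh} }
}

# Each partial result gets a name (a variable) and is used in later steps.
my $log_file_name = 'logfile.log';
my $logger        = FileLogger->new( log_file => $log_file_name );
my $dbh           = StubDBH->connect('dbi:SQLite:dbname=my-app.db');
my $app           = MyApplication->new( logger => $logger, dbh => $dbh );

print $app->logger->log_file, "\n";
```

The only product that leaves this block is $app; the other variables play the role of named services.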

Friday, March 02, 2012

Mason 2

Mason 2 looks very interesting. First of all it has the one-file-per-page modus operandi that works so well for PHP; then it has all the template inheritance and Moose template candies that look very powerful; finally the page code works in the request scope - i.e. it can access the page parameters and stuff from attributes, which is so much more convenient than passing these values around as method parameters as you do in Catalyst. The only part lacking from my cursory look at the documentation is anything that works in the application scope. Most probably it is just that I did not find anything in the most exposed documents - but this omission still looks ominous.

Saturday, January 14, 2012

Schlep

A schlep is a tedious, unpleasant task. According to Paul Graham, schlep is also what really defines a company - it is doing the tasks that someone finds unpleasant and tedious enough to pay you for.

Narrowing this down to my own Perl web development work - the schlep for me was always getting the basic web app running with user registration, login pages, password reset mechanisms, etc. - in every new project that was the most repeatable, boring work. I think everyone has the feeling that this does not need to be like that. I've started thinking about what could be a solution to this and here are my first experiments about fixing it: Plack-Middleware-Auth-Form, WebPrototypes::ResetPass, WebPrototypes::Registration (I might rename the first one to WebPrototypes as well). The point is to solve it across the multiple web frameworks, templating languages and storage layers - so that it can survive moving from project to project.

What is your schlep?

Saturday, December 10, 2011

A kind of call by name

I often write code like this:

$self->create_user( username => $username, email => $email, pass_token => $pass_token );

I wish I could get rid of the naming redundancy in this call:

$self->create_user( $username, $email, $pass_token );

(without changing 'create_user' of course).

Probably some new syntax would be needed.
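Without new syntax, one partial workaround is to keep the values in a hash from the start and pass a slice of it, so that each name is written only once at the call site. A minimal sketch; UserStore and its create_user are stand-ins invented here, not the code from the post.

```perl
use strict;
use warnings;
use 5.014;

# Stand-in class: create_user takes named arguments, as in the call above.
package UserStore {
    sub new { return bless {}, shift }
    sub create_user { my ( $self, %args ) = @_; return {%args} }
}

# The values live in a hash instead of three separate scalars.
my %form = (
    username   => 'alice',
    email      => 'alice@example.com',
    pass_token => 's3cret',
    csrf       => 'not passed on',
);

# Each name appears once; map builds the key/value list for the call.
my @wanted = qw(username email pass_token);
my $user   = UserStore->new->create_user( map { $_ => $form{$_} } @wanted );

print join( ', ', sort keys %$user ), "\n";
```

This does not help when the values already sit in separate lexical variables, which is the case the wish above is really about.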

Friday, November 18, 2011

'use strict' and cargo cult programming

I've just read mjd's confession "Why I Hate strict". It is from 2003, so he might have changed his views since, but nothing on that web page indicates it. Anyway, his main argument is that the usual advice to use strict is automatic and mindless and that it often does not really prevent the problems that people think it does. In other words it is cargo cult programming, which he contrasts with programming with thinking and deep analysis of everything you do.

I used to program without use strict; use warnings but after exposure to the usual propaganda I switched and I found that the cost of mindlessly adding it is negligible, the cases where I need no strict are very rare, and there are many benefits of doing it, especially when working with old code. This cult is rather effective in luring the cargo planes to land in my atoll. On the other hand I am all for deep analysis and checking your assumptions from time to time. There are many valid points in Marc Lehmann's common::sense and I would like to see them discussed. While we are on the road to having use strict by default we might also try to make it better.
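One of the benefits I have in mind is the plain misspelled-variable case. A small sketch, with made-up variable names, that compiles the same typo with and without strict 'vars' via string eval:

```perl
use strict;
use warnings;

# Without strict 'vars' the misspelled $totl silently becomes a package variable.
my $loose = eval q{
    no strict 'vars';
    no warnings;
    my $total = 1;
    $totl = 5;    # typo for $total
    $total;
};

# With strict the same typo is a compile-time error.
my $strict_result = eval q{
    use strict;
    my $total = 1;
    $totl = 5;    # typo for $total
    $total;
};
my $strict_error = $@;

print "loose result: $loose\n";
print "strict error: $strict_error";
```

The first eval returns the stale value without any complaint; the second never runs and reports the undeclared variable instead.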

Saturday, November 12, 2011

$ primes for money

The thesis above sounds uncontroversial. It is also rather uncontroversial that '$' is relatively frequently used when programming in Perl. Now - what can be the consequences of that?

Money has been said to change people's motivation (mainly for the better) and their behavior toward others (mainly for the worse). The results of nine experiments suggest that money brings about a self-sufficient orientation in which people prefer to be free of dependency and dependents. Reminders of money, relative to nonmoney reminders, led to reduced requests for help and reduced helpfulness toward others. Relative to participants primed with neutral concepts, participants primed with money preferred to play alone, work alone, and put more physical distance between themselves and a new acquaintance.

from one of the first links in the query above. Pretty sad - can that apply to the Perl community? Another link from that list, an entertaining BBC video report suggests also some other effects: hunger and pain insensitivity.

Monday, November 07, 2011

Thesis: simple - antithesis: easy - synthesis: ...

Rich Hickey's Simple Made Easy is a great talk, a must see, with lots of insight, but it also misrepresents what Agile is about. Hickey's main point is that we should try to write simple software, because this is the only way to have reliable software, and he is right of course. He notes that when you encounter a new bug and try to fix it - all the existing tests pass - so they will not help you in finding the cause of it. You need to do the bug analysis on your own and the complexity of your code is your enemy there. He is also right when he talks about how easy means familiar, not simple, and that it is a trap because it drives us away from the simple (in small increments I would add). He is insightful when he talks about the things that are sources of complexity. He is funny, but missing the point in his critique of Agile.

The development sprints he attacks are not about doing the bulk of the work - they are about building a prototype on which we can test our assumptions. Without the understanding that we get from these prototypes we could simplify as much as we want but it would not change the fact that our solution solves the wrong problem. Agile is not an enemy of simple. It puts a lot of weight on doing the easy - but not because this is the goal - rather it uses easy as a means to get to the correct. Agile is the answer to the paradox that we don't know what we should make until we already have a prototype of that thing. I wish more developers cared about simple - but only after they know what is needed.

Tuesday, November 01, 2011

Notes on the Synthesis of Form

According to Wikipedia the origin of Design Patterns lies in the Pattern Language ideas by the unorthodox architect and philosopher Christopher Alexander, but his earlier work also used to be widely read by computer scientists:

Alexander's Notes on the Synthesis of Form was required reading for researchers in computer science throughout the 1960s. It had an influence in the 1960s and 1970s on programming language design, modular programming, object-oriented programming, software engineering and other design methodologies. Alexander's mathematical concepts and orientation were similar to Edsger Dijkstra's influential A Discipline of Programming.

The solution to the design problem that he proposes there does not look too attractive now, but his models, his metaphors, his insight into the design process - it's all still relevant and spot on. I am surprised that the Agile movement does not quote "Notes" as one of their foundation texts.

Saturday, October 15, 2011

Concentration and Flow or Yet Another Dependency Injection Note

Imagine that you need to do some small home improvement or maintenance work and you have all the needed tools, in good quality, clean and well maintained with all cutting blades sharpened and no missing screwdriver heads. A nice feeling - isn't it? When you start work like this you can concentrate on the task at hand instead of thinking where you can borrow that drill tool.

Collaborators in an algorithm are like those tools, having them readily available lets you concentrate on the problem.

Friday, October 14, 2011

Object oriented versus functional interface

I use DateTime::Format::W3CDTF for parsing my dates:

my $w3c = DateTime::Format::W3CDTF->new;
my $dt = $w3c->parse_datetime( $date_string );

I wish it was:

my $dt = DateTime::Format::W3CDTF->parse_datetime( $date_string );

and that the library created the parser on the fly as needed. It's not only less typing - but also a much simpler mental model. This simpler model is sometimes too simple - for example if you parse a lot of dates then sparing the parser creation each time can make a difference.

I think the optimal thing to do is provide two APIs - like JSON - a functional one:

$perl_hash_or_arrayref = decode_json $utf8_encoded_json_text;

and an object oriented one:

$json = JSON->new->allow_nonref;
$perl_scalar = $json->decode( $json_text );

for those that need that extra control.
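A library can offer both styles from one method by checking how it was invoked. A minimal sketch, assuming a made-up My::DateParser class with a single 'separator' option; it is not how DateTime::Format::W3CDTF or JSON are implemented.

```perl
use strict;
use warnings;
use 5.014;

# Hypothetical parser usable both as a class method and on a configured object.
package My::DateParser {
    sub new {
        my ( $class, %args ) = @_;
        return bless { separator => $args{separator} // '-' }, $class;
    }

    sub parse_date {
        my ( $invocant, $string ) = @_;

        # Called on the class name: build a default parser on the fly.
        my $self = ref $invocant ? $invocant : $invocant->new;

        my $sep = quotemeta $self->{separator};
        my ( $year, $month, $day ) =
            $string =~ /\A(\d{4})${sep}(\d{2})${sep}(\d{2})\z/
            or return;
        return { year => $year + 0, month => $month + 0, day => $day + 0 };
    }
}

# Simple usage: no object to create or keep around.
my $simple = My::DateParser->parse_date('2011-10-14');

# Extra control: configure once, reuse for many dates.
my $parser = My::DateParser->new( separator => '/' );
my $custom = $parser->parse_date('2011/10/14');

print "$simple->{year} $custom->{day}\n";
```

The class-method call pays for a new parser each time, so the object form stays the better choice in a tight loop.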

Monday, October 03, 2011

open expects filename as binary data encoded in the system character set

I guess this is not a surprise to anyone who has thought about how this is supposed to work, but for the sake of being systematic, here is the code:

use strict;
use warnings;
use autodie;
use HTML::Entities;
use Encode;

my $a = HTML::Entities::decode( '&ntilde;' );

# File name passed as a character string
open(my $fh, '>', $a );
print $fh "Without encoding\n";
close $fh;

# File name passed as UTF-8 encoded bytes
open(my $fh1, '>', encode( 'UTF-8', $a ) );
print $fh1 "With encoding\n";
close $fh1;

And here is the result when run on a system with UTF-8 locales:

zby@zby:~/myopera/tmp$ ls
?  a.pl  ñ
zby@zby:~/myopera/tmp$ cat ñ
With encoding

'a.pl' is the script itself. The '?' stands for a file whose name is the single byte 0xF1 (the Latin-1 encoding of 'ñ'); that file contains 'Without encoding'.

Friday, September 30, 2011

Courriel::MMS

I think it is time to announce Courriel::MMS - it is an extension for the new email handling library by Dave Rolsky for processing MMS messages forwarded as emails by the mobile operators. This is still just a github link, no CPAN package yet, but it has been running on our production servers since Wednesday - so I bet on this code :)

It is a bit heuristic - for example many operators send a subject like 'You have received an mms' - which is useless for us, so the library tries to find something else that would act as the subject we need. For dealing with the various mobile operators, which each send a slightly different format of these emails, I used the Factory design pattern together with Module::Pluggable - this is a novel design for me so I welcome comments.
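The general shape of such a factory, as a rough sketch. Nothing here is the actual Courriel::MMS code: the Sketch::MMS classes, the match and subject methods and the hash standing in for an email are assumptions, and the plugin list is hard-coded to keep the example self-contained (in the real design Module::Pluggable discovers the plugin classes).

```perl
use strict;
use warnings;
use 5.014;

# Generic class, also used as the fallback when no operator plugin matches.
package Sketch::MMS {
    our @PLUGINS = ('Sketch::MMS::OperatorA');    # hard-coded for the sketch

    sub new { my ( $class, %args ) = @_; return bless { email => $args{email} }, $class }
    sub subject { return $_[0]{email}{subject} }

    # Factory: the first plugin that recognises the email builds the object.
    sub create {
        my ( $class, $email ) = @_;
        for my $plugin (@PLUGINS) {
            return $plugin->new( email => $email ) if $plugin->match($email);
        }
        return $class->new( email => $email );
    }
}

# Hypothetical operator-specific plugin.
package Sketch::MMS::OperatorA {
    use parent -norequire, 'Sketch::MMS';

    sub match {
        my ( $class, $email ) = @_;
        return $email->{from} =~ /\@operator-a\.example\z/ ? 1 : 0;
    }

    # The operator's subject is useless, so fall back to the first body line.
    sub subject {
        my $self    = shift;
        my $subject = $self->{email}{subject};
        return $subject unless $subject =~ /\AYou have received an mms\z/i;
        my ($first_line) = split /\n/, $self->{email}{body};
        return $first_line;
    }
}

my $mms = Sketch::MMS->create(
    {
        from    => 'gateway@operator-a.example',
        subject => 'You have received an mms',
        body    => "Holiday photo\nsecond line",
    }
);
print ref($mms), ': ', $mms->subject, "\n";
```

Adding support for another operator then means adding one more class with its own match and overrides, without touching the factory.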

Tuesday, September 20, 2011

URI->path expects binary data

Update: changed new to path - with new it would be reasonable to require that the uri fed to the parser is already an ASCII string containing the already URI-encoded url.
Consider this code:
use 5.010;
use Encode 'encode';
use URI;

my $uri = URI->new( 'http://example.com/' );

# The path setter returns the previous value, so print the whole URI afterwards
$uri->path( encode("UTF-8", "can\x{00B4}t-make-it-work" ) );
say $uri;
$uri->path( "can\x{00B4}t-make-it-work" );
say $uri;
The output (in perl 5.14.0) is:
http://example.com/can%C2%B4t-make-it-work
http://example.com/can%B4t-make-it-work
If your page is encoded in UTF8 - then the first one is correct: %C2%B4 is the URI encoded UTF8 encoding of Unicode Character 'ACUTE ACCENT' (U+00B4). If your page encoding is Latin1 - then the second one would be correct - but this is only by accident - in that case you should still use encode("iso-8859-1", ...).

There are probably many other string manipulating libs that should document if their input should be binary encoded data or decoded character strings.
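The difference between the two outputs can be reproduced without URI at all: it comes only from which bytes get percent-encoded. A small sketch using the core Encode module; percent_encode_bytes is a helper written for this illustration, not part of any library.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative helper: percent-encode every byte outside the "unreserved" set.
sub percent_encode_bytes {
    my ($bytes) = @_;
    ( my $escaped = $bytes ) =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $escaped;
}

my $text = "can\x{00B4}t";    # a character string containing U+00B4

# Encode to bytes explicitly, then escape the bytes.
my $as_utf8   = percent_encode_bytes( encode( 'UTF-8',      $text ) );
my $as_latin1 = percent_encode_bytes( encode( 'iso-8859-1', $text ) );

print "$as_utf8\n";      # can%C2%B4t
print "$as_latin1\n";    # can%B4t
```

Making the encoding step explicit is what removes the "correct only by accident" case.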

Thursday, September 01, 2011

Names are special

At perl.com Tom Christiansen talks about sorting names, among other things. It appears surprisingly difficult to do properly with many special rules for each language. By coincidence at programming.reddit.com just above a link to that article there is Personal names around the world - a link to a w3c article about even more complications with names. At PhilPapers we had to solve *somehow* a few of these - the result is Text::Names. It is still rather limited: "While it tries to accommodate non-Western names, this module definitely works better with Western names, especially English-style names" - but there is already lots of logic embedded there.

Sunday, August 28, 2011

is_utf8 is useless - can we have is_character?

Consider this code:



$data_structure = utf8::is_utf8($json)
   ? from_json($json)
   : decode_json($json);

taken, together with the is_character suggestion, from an otherwise very informative post: Quick note on using module JSON. I have seen similar code in many places. The idea is to check if the string you have is character data or a string of bytes and treat it appropriately. Unfortunately is_utf8 does not do that check:



use strict;
use warnings;

use utf8;
use HTML::Entities;
use JSON;

my $a = HTML::Entities::decode( '&nbsp;' );
my $json = qq{{ "a": "$a" }};
print 'is_utf8: ' . ( utf8::is_utf8( $json ) ? 'yes' : 'no' ) . "\n";

my $data_structure = utf8::is_utf8( $json )
   ? from_json( $json )
   : decode_json( $json );

This fails (on my machine) with the following output:



is_utf8: no
malformed UTF-8 character in JSON string, at character offset 8 (before "\x{8a0}" }") at a.pl line 12.

If that still is a mystery try this:



use strict;
use warnings;

use HTML::Entities;
use Devel::Peek;

Dump( HTML::Entities::decode( '&nbsp;' ) );

the output (on my machine) is:



SV = PV(0x24f2090) at 0x24f3de8
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK)
  PV = 0x2501620 "\240"\0
  CUR = 1
  LEN = 16

This string is internally encoded as "\240", i.e. "\x{a0}", which is the Latin1 encoding of the non-breaking space. It does not have the utf8 flag set - so the code above tries to treat it as a UTF8 encoded stream of bytes and fails.

I don't know if we can have is_character easily - but the lack of introspection here is surely painful.
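The same point in a form that needs only core Perl: two strings that are equal as character strings can differ in the flag that is_utf8 reports. A minimal sketch:

```perl
use strict;
use warnings;

# One character, U+00A0, which Perl stores here in its compact single-byte form.
my $nbsp = chr(0xA0);
my $flag_before = utf8::is_utf8($nbsp) ? 'yes' : 'no';

# Change only the internal representation of a copy.
my $upgraded = $nbsp;
utf8::upgrade($upgraded);
my $flag_after = utf8::is_utf8($upgraded) ? 'yes' : 'no';

# As character strings the two are still identical.
my $same = $nbsp eq $upgraded ? 'yes' : 'no';

print "before: $flag_before, after: $flag_after, equal: $same\n";
```

So the flag describes how Perl happens to store the string, not whether the string holds characters or encoded bytes.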

Wednesday, August 24, 2011

CPAN, decoupling and Dependency Injection

Consider the code:



sub fetch
{
    my ($self, $uri) = @_;
    my $ua           = LWP::UserAgent->new;
    my $resp         = $ua->get( $uri );

    ...
}

Yes - this is taken from a post by chromatic.

Now imagine that this is code from a CPAN module you installed and that some security concerns require you to replace LWP::UserAgent with LWPx::ParanoidAgent there. Bad luck - you'll probably need to subclass it, override that whole fetch method and pray that it will not change too much with every new release of the original module.

This is really why I am drumming this Dependency Injection drum over and over again - code that uses it is more reusable, more universal:



use Moose;
use LWP::UserAgent;

has 'ua', is => 'ro', default => sub { LWP::UserAgent->new };

sub fetch
{
    my ($self, $uri) = @_;
    my $ua           = $self->ua;
    my $resp         = $ua->get( $uri );

    ...
}

Now you would not have any problem with providing a LWPx::ParanoidAgent object for the fetch method to use.

By the way, with classical DI you'd move that LWP::UserAgent->new completely out of the class; here it stays as a 'default' that can be overridden from outside if you need. The problem with classical DI is that you need to have a place to move that initialization code to - here it is sidestepped for the 'normal' usage and you need to worry about it only in the cases where you really need to. Java probably does not have this 'default' mechanism.
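The same "injectable with a default" idea works without Moose too. A minimal sketch in plain Perl: StubAgent stands in for LWP::UserAgent and for whatever replacement you inject, and Fetcher is a made-up class, so the example runs without network access or CPAN modules.

```perl
use strict;
use warnings;
use 5.014;

# Stand-in for a user agent class; get just reports what it would do.
package StubAgent {
    sub new {
        my ( $class, %args ) = @_;
        return bless { name => $args{name} // 'default agent' }, $class;
    }
    sub get { my ( $self, $uri ) = @_; return "$self->{name} fetched $uri" }
}

# Hypothetical class under discussion: the collaborator can be passed in,
# and a default is created only when the caller does not supply one.
package Fetcher {
    sub new {
        my ( $class, %args ) = @_;
        return bless { ua => $args{ua} // StubAgent->new }, $class;
    }
    sub ua { return $_[0]{ua} }

    sub fetch {
        my ( $self, $uri ) = @_;
        return $self->ua->get($uri);
    }
}

# Normal usage: nothing to wire up.
my $plain = Fetcher->new->fetch('http://example.com/');

# Special case: inject a different agent from outside.
my $paranoid = Fetcher->new( ua => StubAgent->new( name => 'paranoid agent' ) )
    ->fetch('http://example.com/');

print "$plain\n$paranoid\n";
```

The fetch method never names a concrete agent class, which is what makes the replacement possible without subclassing.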