Perl Alchemy - notes of a programmer: 2012

Wednesday, October 31, 2012

Dependency Injection and open-sourcing generic parts of apps

Using DI in CPAN libs makes them more universal - but DI is even more important when you want to open-source some generic part of your application. Your boss agrees and then you encounter code like this:

Saturday, October 06, 2012

A simple quine

I wrote a quine:

Thursday, August 30, 2012

Interactive presentations

At my latest YAPC talk I used questions to the audience to make sure that everything is understood. That does not mean that I asked 'do you understand it' - but rather I asked 'how would you estimate this or that' - and then I extracted from the answers the generic strategies that I had prepared to talk about. This worked spectacularly well and I think I'll add it to all my presentations.

Thursday, June 07, 2012

Why do web frameworks tend to grow into such unwieldy beasts?

Most of the web frameworks I know tend to do at least two things - the web stuff plus the creation and initialization of all the other components. This means that the framework is coupled with all of these components, and this is the root of all evil in web frameworks. Of course you need a place to do that object creation and wiring work - but it is not really related to web stuff - it should have its own place in the program, ideally in a Dependency Injection compartment. That need not be a container based on the available libraries - it can also be coded by hand (I might change my mind some day but for now I don't see any reason to use DI container libraries in a dynamic language like Perl). All the arguments for using Dependency Injection apply here as well, but even for someone rejecting DI it should be pretty obvious that reading config files and initializing objects is not much related to web stuff, and if you buy the single responsibility principle you should split them into separate libraries.

I don't know Django or Rails too deeply - but I've observed this with Catalyst. Now, Catalyst is supposed to be decoupled from the Views or Models and other Plugins; there are many different ones in each category and you can replace them freely in your programs. But the Catalyst code base is still pretty big, and then you have all these Catalyst::Models, Catalyst::Views and Catalyst::Plugins that don't really do any meaningful work - they only map one interface onto another. It could be much simpler if Catalyst only cared about the web-related processing.

Saturday, May 12, 2012

Non compatible changes in WebNano

In my latest commit in WebNano I refactored a lot of code and changed the API in a non-compatible way. I am going to make a new release with those changes soon. I feel that doing an additional release only to warn about this fact before that sounds kind of silly - isn't announcing it here enough?

Monday, May 07, 2012

On the importance of intuitive names.

PARTICIPANT:

[Reading names of classes]

Binary reader, Buffered stream, Reads and writes...

[Pauses, Scrolls through list of classes].

So let me just...stream writer. Implements a text writer writing characters to a stream in a particular encoding. Umm...stores an underlying string. Text...hmmm. Text reader, text writer. Represents a writer that can write a sequential series of characters. This class is abstract.

[Pause]

Ummm...

[scrolls up and down through list of classes]

So it, you know, it sounds to me like it's, you know, it's more low-level kind of stuff. Whereas I just want to create a text file. Umm.

[Points at the description of one of the classes]

Characters to a stream in a particular encoding. I'm not sure what...obviously a text writer for writing characters to a stream.

[Clicks on the link to view more details about the TextWriter class. Then looks at list of classes that derive from TextWriter]

System dot IO dot StringWriter. This seems too low-level to me to be a text writer thing but maybe I am wrong. Umm...

[scrolls through description of the TextWriter class]

Text writer is designed for character output, whereas the stream class is designed for byte input and output.

[Sigh. Clicks on link to view TextWriter members]

Probably going where no man should have gone before here.

This guy did not make it.

Neither did any of the other 7 professional programmers that were participants in that experiment! Their task was "to write a program that would write to and read from text files on disk". They had 2 hours for that and could browse all of the relevant documentation. They were testing the API of the then new programming framework called .NET - the programmers did not know it yet - but they had programmed in VisualBasic. This is an example code fragment using the file writing API that they were expected to write:

After that experiment they added a new 'File' API:

and ran the experiment again. This time all participants were able to complete each task in 20 minutes and without browsing the documentation.

This is the story from the "Chapter 29 How Usable Are Your APIs?" in Making Software.

Fascinating puzzle - isn't it? The article proposes the following solution to it: there are three types of programmers - opportunistic, pragmatic and systematic. The opportunists tend to use high-level abstractions, try things out and experiment with what works, and they intuitively get the File API as opposed to the StreamWriter API. And it just happened that all 8 participants of that study were opportunistic programmers?!

The developers who participated in the file I/O studies were very much in the opportunistic camp. The way they used the FileObject class in the second study was almost a perfect example of opportunistic programming. The contrast with the more abstract classes they had to use in the first study was stark. Indeed, one of the main findings from the file I/O studies was that without such high-level concrete components, opportunistic developers are unlikely to be successful.

Sounds like a weak argument:

Copy pasting examples and playing with the code is the most efficient way to learn a new API - so I suspect that what they call 'opportunistic programming' is actually learned behaviour that would be characteristic of any experienced programmer.
Expecting a file related API sounds quite natural and not really related to being opportunistic or systematic, the task at hand was exactly file related IO and in most programming languages there is such an API, it also makes sense that there is one because file operations are very common.
I don't see anything higher level or more concrete in the FileObject related example code - it looks to be on the same level of abstraction as the StreamWriter code. The authors claim that it is the fact that you have both StreamWriter and StreamReader that makes it lower level than FileObject, which is only one class - but I don't see how that follows.

Phil Karlton wrote:

There are only two hard things in Computer Science: cache invalidation and naming things.

Naming things also comes out as quite important.

Wednesday, May 02, 2012

Tricky problems of the Perl language - a completely arbitrary list

Overloading and parameter types validation.
Clash between overloading hashification and arrayification and the each, keys and values with the new dereferencing semantics.
There is no way to know if you received characters or binary data, lots of libraries and even core functions work differently in these cases - but often it is not documented.
In Perl observing a variable changes it - for example - reading a variable containing a string in a number context will fill in its number slot (as far as I understand it - see perlguts for the details). Normally it does not matter - but it makes threading less efficient (because shared variables need to jump through additional hoops to work).
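The last point can be observed with the core B module. This is a minimal sketch, assuming a perl where a numeric read of an integer-looking string caches the integer value in the scalar; sv_flags is a helper name made up for this example.

```perl
use strict;
use warnings;
use B ();

# Return the internal flag word of a scalar without using its value.
sub sv_flags {
    my ($ref) = @_;
    return B::svref_2object($ref)->FLAGS;
}

my $value = "42";              # starts life as a plain string

my $before = sv_flags(\$value);
my $sum    = $value + 0;       # nothing is assigned to $value here, it is only read
my $after  = sv_flags(\$value);

# SVp_IOK marks that the scalar holds a cached integer value.
printf "integer slot filled before the read: %s\n",
    ($before & B::SVp_IOK) ? "yes" : "no";
printf "integer slot filled after the read:  %s\n",
    ($after & B::SVp_IOK) ? "yes" : "no";
```

The scalar is never written to by the program, yet its flags differ before and after the addition, which is why a shared variable cannot be treated as read-only even by code that only reads it.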

Two bonus points - using too much of the $ character makes

And one more fixed recently - the one making checking $@ after eval unreliable.

Friday, April 27, 2012

If all of your state is on the stack then you are doing functional programming

Consider this:

This is imperative code, it uses variables with state, but from the outside it is a pure mathematical function. This is because all of its variables are created anew when the function is being executed and destroyed after that. It does not matter that the variables contain simple values:

This can also be purely functional provided that Markdent::Simple::Document does not use globals or system calls.
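As a minimal sketch of the first case (the function name sum_of_squares is made up for illustration): the body is imperative, with a mutable accumulator and a loop, but all of its state is created on entry and gone on return.

```perl
use strict;
use warnings;

# Imperative inside: a mutable accumulator and a loop.
# Pure outside: the result depends only on the arguments, and every
# variable lives only for the duration of the call.
sub sum_of_squares {
    my @numbers = @_;
    my $total   = 0;
    for my $n (@numbers) {
        $total += $n * $n;
    }
    return $total;
}

print sum_of_squares(1, 2, 3), "\n";
```

Calling it twice with the same arguments always gives the same result, because no state survives between calls.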

Imagine that this was enforced by the compiler/interpreter - maybe with a new keyword function, or something. I have the feeling that this would be very similar to how use strict works - giving the end user some kind of safety.

Tuesday, April 17, 2012

A Data::Dumper bug - or Latin1 strikes again

I did not believe my boss that he found a bug in Data::Dumper - but run this:

To get a character that has internal representation in Latin1 I could also use HTML::Entities::decode( '&oacute;' ) there with the same result. The output I get on 5.14.2 and 5.10.1 is:

When I check the dumped string - it has the right character encoded in Latin1 - and apparently eval expects UTF8 when use utf8 is set. Without use utf8 eval works OK on it. If the internal representation of the initial character is UTF8 (like when the first line is my $initial = 'ó';) - then the dumped string contains UTF8 (which again might be interpreted incorrectly if the code does not have the use utf8 preamble).

Considering that Data::Dumper is a core module, one of the most commonly used ones, and that its docs say:

The return value can be evaled to get back an identical copy of the original reference structure.

this looks like a serious bug.

Is that a known problem? Should I post it to the Perl RT?

Update: Removed the initial eval - "\x{f3}" is enough to get the Latin1 encoded character. Some editing.
Update: I tested it also on 5.15.9 and it fails in the same way.
Update: I've reported it to the Perl RT - I am not sure about the severity chosen and the subject - this was my first Perl bug report.
Update: In reply to the ticket linked above Father Chrysostomos explains: "The real bug here is that ‘eval’ is respecting the ‘use utf8’ from outside it." and later adds that 'use v5.16' will fix the problem in 5.16.

Saturday, April 14, 2012

Breaking problems down and defaults

In a classic essay Dave Rolsky wrote: Want Good Tools? Break Your Problems Down. I wish more people had read this and applied the advice - CPAN libraries would be more useful then. But stating the goal is probably not enough - we also need to talk about how it can be reached and about the problems encountered on the way there. For example let's take the module that was the result of the process described in the essay linked above:

The problem is that the criticized approach, a unified library that just converts Markdown to HTML, would result in a simpler API - for example something like this:

Maybe the difference does not look very significant - but after a while it can get annoying. In 99% of cases you don't need the extra flexibility that comes with the replaceable parser - so why should you pay for it? If I had to use Markdent frequently I would write a wrapper around it with an API like the one above.

By the way, Text::Markdown already has this wrapper and it presents a double, functional/object oriented API - the simple, functional one presented above does the most common thing, while the object oriented one gives you more control over the choices made. Only that it still couples parsing and generation.

Another way of simplifying the API is providing defaults for function arguments, for example for the arguments of the object constructor. Dependency Injection is all about breaking the problem down and making flexible tools - but it might become unbearable if we do not soften it up a bit with defaults.
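A minimal sketch of a constructor that accepts an injected dependency but falls back to a default. All class names here (StderrLogger, MemoryLogger, Greeter) are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical default collaborator: logs to STDERR.
package StderrLogger {
    sub new  { return bless {}, shift }
    sub info { my ($self, $msg) = @_; print STDERR "$msg\n"; return }
}

# Hypothetical replacement that only records the messages.
package MemoryLogger {
    sub new      { return bless { messages => [] }, shift }
    sub info     { my ($self, $msg) = @_; push @{ $self->{messages} }, $msg; return }
    sub messages { return @{ $_[0]{messages} } }
}

package Greeter {
    sub new {
        my ($class, %args) = @_;
        # The dependency can be injected, but the common case needs no wiring.
        my $logger = $args{logger} // StderrLogger->new;
        return bless { logger => $logger }, $class;
    }

    sub greet {
        my ($self, $name) = @_;
        $self->{logger}->info("greeting $name");
        return "Hello, $name!";
    }
}

my $simple = Greeter->new;                       # the default covers the common case
my $memory = MemoryLogger->new;
my $wired  = Greeter->new(logger => $memory);    # full control when it is needed

print $wired->greet('World'), "\n";
```

The caller who does not care about logging never sees the dependency, while tests or unusual setups can still replace it.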

Programming is always about making trade-offs - here we add some internal complexity (by adding the wrappers or providing the defaults) and in exchange get a simplified API that covers the most common cases while still maintaining the full power under the hood. I think this is a good trade-off in most cases, and especially in the case of libraries published to CPAN that need to be as universal as possible.

Wednesday, April 04, 2012

What if "character != its utf8 encoding" is overengineering?

"You shall not assume anything about the internal representation of characters in Perl" - is a mantra that has been repeated over and over by the Perl pundits for something like a decade. But there are still people who refuse to take that advice and want to peek into the internal representation of characters. What if our sophisticated approach of isolating the 'idea of a character' from its representation is a case of overengineering? People often overreact to past traumas - programming is not an exception - and the conversion from many national 'charsets' to Unicode was a big event. Maybe expecting another conversion soon is such an overreaction?

Getting rid of the Latin1 internal encoding does not look like a big price for improving simplicity and getting rid of all these subtle mistakes. I think it is important that the language is understood by its users and if it is not, then maybe, instead of blaming the programmers, we could make it easier to understand? Sure it is nice to have the possibility to change the internal encoding from UTF8 to UTF16 or maybe something completely different in the future - but I have the feeling that this might be a case of architecture astronautics.

Saturday, March 31, 2012

Plack::Middleware::Auth::Form - some updates and a possible name change

I'll make a new release of Plack::Middleware::Auth::Form soon. There are quite a few fixes in the Plack::Middleware::Auth::Form repository gathered since the last release. It is all from external contributors - thanks a lot!

The bug reported in #75896: Cookie Expiry Date not set for "remember" session is quite interesting. Apparently Plack::Middleware::Session sends the session cookie on each request and if you don't set Expiry Date each time it will happily unset it.

I am thinking about changing the name to WebPrototypes::LoginForm. Some people did not like the name Plack::Middleware::Auth::Form from the start, because it is a bit more high-level than the other Auth middlewares, and now I have two more elements for quick web application prototyping under the WebPrototypes namespace.

Sunday, March 25, 2012

Blog writing and assuming stupidity

Writing a blog is not easy. People have not changed much since the 'bread and circuses' times. You need to spice your writing up with strong statements or you'll not get any audience. On the other hand ridiculing someone while having a very superficial knowledge of the matter makes you a bully.

What happened there? Again, it is hard to tell anything interesting without some speculation - and possibly I'll have to apologize to Dave for this - but I think Dave has read "You must hate version control systems, we won't be using any" and assumed that this is from a company that superficially rejected version control because they did not want to learn or, in other words, from someone that assumed that version control is useless. Talk about beams and eyes. That's not to say that I vouch for the 'pipelines' system or for replacing version control with it. I still don't know much about these pipelines - but new ideas don't have to work in every possible aspect to be worthwhile and you'll not have a break-through idea if you always stick to the accepted wisdom.

It is easy to assume stupidity - on average people are mediocre - but the internet is a big search space - expect to be surprised from time to time :)

Monday, March 19, 2012

Verbs and Nouns

There is a popular, if a bit long and blurry, rant by Steve Yegge: Execution in the Kingdom of Nouns - it is about how we overuse nouns and under-use verbs when programming in Java. Of course it is not different in other object oriented imperative languages. Programs do something, subroutines do something - verbs should be at least as prominent as nouns in programming - but when we need to write an application we build it out of objects. Even if it is a web application - something that translates the HTTP request into the HTTP response - we code it as an object with fields and all that stuff. Even if we code against an API that defines the web application as a subroutine reference, we still write it as an object and then make a closure over it to pass to the backend.
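The last sentence can be sketched as follows. PSGI defines the application as a code reference that takes an environment hash and returns a status, headers and body triple; everything else here (the HelloApp class, its greeting field, the method names call and to_app) is an assumption made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical application class; only the coderef interface comes from PSGI.
package HelloApp {
    sub new {
        my ($class, %args) = @_;
        return bless { greeting => $args{greeting} // 'Hello' }, $class;
    }

    # The "noun": request handling written as a method with access to fields.
    sub call {
        my ($self, $env) = @_;
        my $body = "$self->{greeting} from $env->{PATH_INFO}\n";
        return [ 200, [ 'Content-Type' => 'text/plain' ], [ $body ] ];
    }

    # The "verb": a closure over the object, which is what the backend receives.
    sub to_app {
        my ($self) = @_;
        return sub { $self->call(@_) };
    }
}

my $app = HelloApp->new(greeting => 'Hi')->to_app;

# Call it the way a server would, with a minimal hand-made environment.
my $response = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/dvd' });
print $response->[2][0];
```

The backend only ever sees the subroutine reference, but all the structure lives in the object behind it.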

Do we overuse nouns? Or maybe it is that actions are opaque and unstructured - and when we need to get to the details, the parts that compose them - then it is more natural to treat them as things? Wouldn't it be easier to incorporate streaming in PSGI if the application there was an object with methods and attributes?

Sunday, March 11, 2012

WebNano - code experiments

WebNano is only a few hundred lines - but you can arrange it in many, many ways - and then you need to test it with all kinds of URL schemas and controller architectures. I do a lot of exploratory coding - testing all the possible arrangements. I feel that I keep forgetting the things that led me to choose one design over the others. Maybe I'll keep some notes here. In the past two weeks I tried a few things:

Keeping the parsed path as an attribute in the controller.
Additionally to the above I tried adding three more controller methods: 'action_name', 'action_args' and 'action_postfix'.
I wrote two additional test controllers for the simple url schema, both redirecting handling to DvdDatabase::Controller::Dvd::Record for the case where we have a record to handle: overriding local_dispatch, overriding handle

The conclusions:

having the path as attribute is handy for code retrieving the record
the additional controller methods help with writing custom dispatchers
splitting the processing to two controllers - one for the case where you have one object to work on (like viewing, editing, deleting), second for the case where we don't (like listing, creating) is very clean - you can have the object as controller attribute
the additional dispatcher methods are less useful for that cleaner architecture
the biggest problem was always preventing the methods that require the object to be called when we don't have the record id on the path (like '/view' when we assume that it should be '/1/view') - and the best method to do that is having the two controller classes
overriding 'handle' is actually simpler - because it is a very simple method

Tuesday, March 06, 2012

Why Bread::Board looks mostly redundant

This is based on two assumptions: that you don't use BB as a kind of Service Locator (I agree with, for example, Dependency Injection != using a DI container that this is an anti-pattern) and, what mostly follows from that, that the product of your BB container is just one object - the application class. I believe these are good guidelines for software architecture. With those two assumptions all that BB gives you is that you can name your partial results and then use them in later computations, but Perl has good support for this - it is called variables.

For example let's take the original example from the Bread::Board synopsis:

Now - let's do the same with just variables:

You can also feel fancy and do it with Moose lazy attributes:

This is not longer than the BB example and it uses generic tools.
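A rough sketch of the plain-variables approach, not the listing from the original post: FileLogger and MyApplication are hypothetical stand-in classes, and the file name and DSN are illustrative values only.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the application's real classes.
package FileLogger {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
}

package MyApplication {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }

    sub describe {
        my ($self) = @_;
        return "logging to $self->{logger}{log_file}, storing in $self->{dsn}";
    }
}

# Each named "service" of a container becomes an ordinary variable.
my $log_file = 'logfile.log';
my $dsn      = 'dbi:SQLite:dbname=my-app.db';
my $logger   = FileLogger->new(log_file => $log_file);

# The single product of the wiring: the application object.
my $application = MyApplication->new(logger => $logger, dsn => $dsn);

print $application->describe, "\n";
```

The partial results are named, reused and passed on, which is the part of the job that the container would otherwise do.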

Friday, March 02, 2012

Mason 2

Mason 2 looks very interesting. First of all it has the 'a file, a page' modus operandi that works so well for PHP, then it has all the template inheritance and Moose template candies that look very powerful, finally the page code works in the request scope - i.e. it can access the page parameters and stuff from attributes, which is so much more convenient than passing these values around as method parameters as you do in Catalyst. The only part lacking from my cursory look at the documentation is anything that works in the application scope. Most probably it is just that I did not find anything in the most exposed documents - but this omission still looks ominous.

Saturday, January 14, 2012

Schlep

A schlep is a tedious, unpleasant task. According to Paul Graham schlep is also what really defines a company - it is doing the tasks that are so unpleasant and tedious for someone that they would pay you to do them.

Narrowing this down to my own Perl web development work - the schlep for me was always getting the basic web app running with user registration, login pages, password reset mechanisms, etc. - in every new project that was the most repeatable, boring work. I think everyone has the feeling that this does not need to be like that. I've started thinking about what could be a solution to this and here are my first experiments about fixing it: Plack-Middleware-Auth-Form, WebPrototypes::ResetPass, WebPrototypes::Registration (I might rename the first one to WebPrototypes as well). The point is to solve it across the multiple web frameworks, templating languages and storage layers - so that it can survive moving from project to project.

What is your schlep?