Imagine that this was enforced by the compiler/interpreter - maybe with a new keyword function, or something. I have the feeling that this would be very similar to how use strict works - giving the end user some kind of safety.
Friday, April 27, 2012
If all of your state is on the stack then you are doing functional programming
Tuesday, April 17, 2012
A Data::Dumper bug - or Latin1 strikes again
To get a character that has a Latin1 internal representation I could also use HTML::Entities::decode( '&oacute;' ) there with the same result. The output I get on 5.14.2 and 5.10.1 is:
When I check the dumped string - it has the right character encoded in Latin1 - and apparently eval expects UTF8 when use utf8 is set. Without use utf8 eval works OK on it. If the internal representation of the initial character is UTF8 (like when the first line is my $initial = 'ó';) - then the dumped string contains UTF8 (which, again, might be interpreted incorrectly if the code does not have a use utf8 preamble).
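A minimal sketch of the round trip described above, written as a reconstruction rather than the original listing. The helper name round_trip is hypothetical; $Data::Dumper::Useqq is a real Data::Dumper setting, but using it as a workaround is an assumption of this sketch, not something taken from the post. With Useqq on, the dump should contain only ASCII escapes, so it should not matter how eval decodes the string. What the default case prints depends on the Perl version and on whether the unicode_eval feature is enabled.

```perl
use strict;
use warnings;
use utf8;            # the pragma that the string eval inherits
use Data::Dumper;

# Hypothetical helper: dump $value, eval the dump, and return the dumped
# text plus a flag saying whether the evaled copy equals the original.
sub round_trip {
    my ($value, $useqq) = @_;
    local $Data::Dumper::Useqq = $useqq ? 1 : 0;
    my $dumped = Dumper($value);

    my $VAR1;        # Dumper's output assigns to $VAR1
    my $ok = do {
        local $@;    # a failed eval should not leak its error
        eval $dumped;
        defined $VAR1 && $VAR1 eq $value;
    };
    return ($dumped, $ok ? 1 : 0);
}

# "\x{f3}" is a single character stored internally as Latin1.
for my $useqq (0, 1) {
    my ($dumped, $ok) = round_trip("\x{f3}", $useqq);
    printf "Useqq=%d: round trip %s\n", $useqq, $ok ? 'ok' : 'FAILED';
}
```

Running it on the Perl versions of interest shows whether the default dump survives eval under use utf8 there.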
Considering that Data::Dumper is a core module, one of the most commonly used, and that its docs say:
"The return value can be evaled to get back an identical copy of the original reference structure."
this looks like a serious bug.
Is that a known problem? Should I post it to the Perl RT?
Update: Removed the initial eval - "\x{f3}" is enough to get the Latin1 encoded character. Some editing.
Update: I tested it also on 5.15.9 and it fails in the same way.
Update: I've reported it to the Perl RT - I am not sure about the severity chosen and the subject - this was my first Perl bug report.
Update: In reply to the ticket linked above Father Chrysostomos explains: "The real bug here is that ‘eval’ is respecting the ‘use utf8’ from outside it." and later adds that 'use v5.16' will fix the problem in 5.16.
Saturday, April 14, 2012
Breaking problems down and defaults
The problem is that the criticized approach, a unified library that just converts Markdown to HTML, would result in a simpler API - for example something like this:
Maybe the difference does not look very significant - but after a while it can get annoying. In 99% of cases you don't need the extra flexibility that comes with the replaceable parser - so why should you pay for it? If I had to use Markdent frequently I would write a wrapper around it with an API like the one above.
By the way, Text::Markdown already has this wrapper and presents a dual functional/object-oriented API: the simple functional one presented above does the most common thing, while the object-oriented one gives you more control over the choices made. It still couples parsing and generation, though.
Another way of simplifying the API is to provide defaults for function arguments, for example for the object constructor. Dependency Injection is all about breaking the problem down and making flexible tools - but it might become unbearable if we do not soften it up a bit with defaults.
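A rough sketch of both ideas together: constructor defaults and a functional wrapper. Everything here is hypothetical (My::Parser, My::Converter, markdown_to_html are illustration names, not the Markdent or Text::Markdown API), and the "parser" only understands one heading syntax and does no HTML escaping. The point is only the shape: collaborators can be injected, but nobody has to inject them.

```perl
use strict;
use warnings;

# Hypothetical parser: "# Title" lines become heading events,
# every other non-empty line becomes a paragraph event.
package My::Parser;

sub new { my ($class) = @_; return bless {}, $class }

sub parse {
    my ($self, $text) = @_;
    my @events;
    for my $line (grep { length } split /\n+/, $text) {
        if ($line =~ /^#\s+(.*)/) { push @events, [ heading => $1 ] }
        else                      { push @events, [ para => $line ] }
    }
    return @events;
}

# Hypothetical converter: parser and tag map are injectable,
# but both fall back to defaults when not supplied.
package My::Converter;

sub new {
    my ($class, %args) = @_;
    my $self = {
        parser => $args{parser} || My::Parser->new,
        tags   => $args{tags}   || { heading => 'h1', para => 'p' },
    };
    return bless $self, $class;
}

sub to_html {
    my ($self, $text) = @_;
    my $html = '';
    for my $event ($self->{parser}->parse($text)) {
        my ($type, $content) = @$event;
        my $tag = $self->{tags}{$type};
        $html .= "<$tag>$content</$tag>\n";
    }
    return $html;
}

package main;

# Functional wrapper for the common case: nothing to wire up.
sub markdown_to_html { return My::Converter->new->to_html(shift) }

print markdown_to_html("# Hello\n\nworld");
```

The full control is still there when needed, for example My::Converter->new( tags => { heading => 'h2', para => 'div' } ), or a different object passed as parser.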
Programming is always about making trade-offs - here we add some internal complexity (by adding the wrappers or providing the defaults) and in exchange get a simplified API that covers the most common cases while still maintaining the full power under the hood. I think this is a good trade-off in most cases, and especially in the case of libraries published to CPAN that need to be as universal as possible.
Wednesday, April 04, 2012
What if "character != its utf8 encoding" is overengineering?
Getting rid of the Latin1 internal encoding does not look like a big price for improving simplicity and getting rid of all these subtle mistakes. I think it is important that the language is understood by its users and if it is not, then maybe, instead of blaming the programmers, we could make it easier to understand? Sure it is nice to have the possibility to change the internal encoding from UTF8 to UTF16 or maybe something completely different in the future - but I have the feeling that this might be a case of architecture astronautics.
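For readers who have not run into the distinction, here is a small illustration of what "character != its utf8 encoding" means in practice. It is only a sketch of current behaviour using the core Encode module and the built-in utf8:: functions; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{f3}";           # one character, U+00F3, stored as Latin1

my $upgraded = $char;
utf8::upgrade($upgraded);      # same character, now stored internally as UTF8

my $bytes = encode('UTF-8', $char);   # the UTF-8 encoding: bytes 0xC3 0xB3

# The two internal representations compare equal as strings...
printf "same string: %s\n", $char eq $upgraded ? 'yes' : 'no';

# ...even though the internal flag differs.
printf "internal UTF8 flag: %d vs %d\n",
    utf8::is_utf8($char) ? 1 : 0, utf8::is_utf8($upgraded) ? 1 : 0;

# The encoded form is a different, longer string.
printf "length: %d character, %d encoded bytes\n", length($char), length($bytes);
```

The subtle mistakes mentioned above tend to come from mixing up these three things: the character string, its internal representation, and its encoded bytes.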