Tuesday, September 22, 2009

Optional Dependencies Are Going Out of Fashion

I was reading the Changes document for the latest DBIC release and I spotted:

- Remove the recommends from Makefile.PL, DBIx::Class is not
supposed to have optional dependencies. ever.

I am noting this so that it will not be missed by the general public that optional dependencies are going out of fashion.

The problem here is that if an optional dependency enables some useful feature (and otherwise why would you add it?) - then you should expect that someone will write a library using this feature and upload that library to CPAN. What should that other guy add to his dependencies? How is he supposed to know which of your optional prerequisites are needed for his module? And what if this is not about just two modules in a dependency relation but rather a whole chain of them? Then it becomes a mess.

10 comments:

AdamKennedy said...

They're going out of fashion because it's finally getting through to people that optional dependencies aren't supported by the CPAN :)

john said...

To be honest, I disagree with this. It seems needlessly pedantic and opinionated. I do agree that one should exercise thought in the matter, and not just huck whatever into the Makefile.PL, saying, "Well, I guess I'll just make it optional..." Many times it's a flag telling you to do something else. However, I recently had a case where it seemed the correct thing to do. While working on Test::DBIx::Class I wanted to support SQLite, MySQL and Postgresql, but not everyone needs all three and not everyone wants to install all the bits needed for all three, particularly when installing the DBDs for Postgresql and MySQL fails so often (both really want a running instance of the database and a bunch of %ENV to pass all the tests). So in this case I'm stuck with either publishing a module that's going to fail half the time due to the way tests are written, or I can punt and let the installer decide what she wants, based on her local need. I guess I could write support for these two databases as separate cpan modules? Is that what we are supposed to do now?

zby said...

Hi John - hmm, actually you have a point here. I really don't know what the right answer is. But in my defence - the pedantry comes from my experiences with installing CPAN modules - it fails way too often and there is no one big reason for these failures - it is all small things - but they come in huge quantities. We need some pedantry if we want to have a usable CPAN and continue to build bigger and bigger dependency trees.

Back to your question - if you treat all the databases uniformly - then perhaps you should not require any of the particular DBD drivers and leave the tests optional? What exactly do you gain by adding optional prerequisites here? The toolchain does not do anything with them, and for people it would be more useful if this was in the docs than in META.yml.

dami said...

OK, I agree that in presence of chains of dependencies, an optional dependency in the middle will probably create a mess. But when it comes to "final" applications, optional dependencies can make sense. For example in Pod::POM::Web, I made the dependencies on PPI syntax colouring and on fulltext indexing optional, because the price for these additional features does not have to be necessarily paid by every user.

zby said...

Hi Dami, thanks - this is another good point. I can see two answers here. The first is to add two more packages, one for the colouring and one for the full text indexing. The second answer is another question - what really do optional dependencies buy you? They are not processed automatically by any tool, they can only be interpreted by a human, so do they not fit better in the documentation than in the file interpreted by the installation programs? Or maybe I am wrong and there is something that is done by the automatic tools for optional dependencies? I cannot imagine anything reasonable - maybe for dependencies like in the case described above by John - when you need at least one out of a set - then perhaps the automatic tools could choose one randomly if there is none yet installed.

On the other hand even though this data is not interpreted by the installation tools it still can be interpreted by some meta tools - for creating statistics etc?

john said...

In my case, I considered some possibilities. One was to install the test subsystem for a particular database driver ONLY if the required DBD was already installed. That way someone could build their Makefile.PL with the DBD of choice first, and my Test::DBIx::Class second. But that seemed a bit too much magic in the background, and really I didn't see anyone doing anything similar. I'd rather follow a process that at least has been generally used, if not universally praised.

For Test::DBIx::Class, you can actually use the tool without the mysql or postgresql plugins. This is because sqlite can easily be installed without much effort. Actually, since T::DBIx::Class is only useful when using DBIC, you already have sqlite installed, so that's a no brainer.

The bigger issue for me is the support for testing against mysql and postgresql using the autodeployment option. This needs support code but it makes no sense in the absence of the required DBD. And you can't install the DBD without the required Database (and only then with effort).

I'd break these plugins out into separate distributions, but I'd worry people would not notice them :) I guess it's a documentation issue in that case.

dami said...

zby asks "what really do optional dependencies buy you?". Well, I was convinced that CPAN was proposing the optional dependencies at installation time. I don't remember how I came to this belief; I think I tested it long ago, but maybe my memory is playing tricks on me. Anyway, I tested it again yesterday, and you are right, optional dependencies are just ignored by CPAN, so having them in the META.yml doesn't buy you anything.

zby said...

Yeah - just after switching off my computer I thought: wait, I remember there used to be questions before I started using PERL_MM_USE_DEFAULT. I actually never read them even then, I just pressed Enter, Enter, Enter. I think I am not alone in this and I think all those questions are useless and in fact we should get rid of them by default. It sounds like a good idea at the start when we think about just one module. We want to give the user choice, choice is good after all, right? No - this is not sustainable in the face of the dependency chains that we have right now.

sid22burn said...

When a module has optional dependencies I get asked if I want to install them. I primarily use CPANPLUS, but I think even CPAN always asks me.

Still, I think the same: optional dependencies should be reduced or completely eliminated if possible.

There were two points made here about when optional dependencies are good, but I don't share them.

1) Somebody said they want to support DBD::mysql, DBD::Pg etc.

First, instead of "optional dependencies", write the module so that it uses the other module if it is installed, and otherwise produces an error when you try to do something that needs a module that is not installed.

Think of it like DBI. DBI supports various DBD::* modules. But DBI itself has no optional dependencies. If you try to connect to a mysql database and don't have DBD::mysql installed, it throws (hopefully) a message that says it is not installed.
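A minimal sketch of that load-on-demand idea in plain Perl follows. It is not how DBI itself is implemented; the helper name `load_backend` and the wording of the error message are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: load a backend module only when it is asked for,
# and fail with a readable message if it is not installed.
sub load_backend {
    my ($module) = @_;
    die "Invalid module name '$module'\n"
        unless $module =~ /\A\w+(?:::\w+)*\z/;

    # Turn Some::Module into Some/Module.pm so require can take it at runtime.
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $ok = eval { require $file; 1 };
    die "Backend $module is not installed; install it from CPAN to use this feature\n"
        unless $ok;
    return $module;
}

# A core module always loads; a missing one produces the message above.
print load_backend('File::Spec'), " loaded\n";
my $loaded = eval { load_backend('My::Missing::Backend') };
print "error: $@" unless $loaded;
```

With this shape the distribution lists no DBD-style module as a prerequisite at all; the user only hears about a backend when they actually ask for it.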

If you have tests, then only run the tests if the module is installed.
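As a rough sketch, a test file can check for the driver and skip only the tests that need it, using the `SKIP` block from Test::More. DBD::Pg is used here purely as an example driver, and the test descriptions are made up.

```perl
use strict;
use warnings;
use Test::More;

# True only if the optional driver can be loaded on this machine.
my $has_pg = eval { require DBD::Pg; 1 };

SKIP: {
    # Skip the one driver-specific test below when DBD::Pg is absent.
    skip 'DBD::Pg not installed', 1 unless $has_pg;
    ok( DBD::Pg->VERSION, 'DBD::Pg reports a version' );
}

# Tests that do not depend on the driver run either way.
ok( 1, 'driver-independent tests still run' );

done_testing;
```

Either way the test run passes, so a missing driver never turns into an installation failure.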

This system allows you to be completely independent of any DBD::* as a requirement.

Even the statement "If you have DBIx::Class installed, you have DBD::SQLite installed" is not reliably true.

Because it can change. Probably before Catalyst 5.8 you would say

"If you have Catalyst installed you have Class::Accessor installed."

Now Catalyst 5.8 has switched to Moose, and Class::Accessor is not a requirement anymore. So don't use a module, or leave it off your requirement list, only because another module you depend on uses it.

Write your modules as if you don't know the dependencies of the other modules that you use.

2) The second was for a complete application that has optional functionality that not everyone needs.

If this is the case, a plugin system would make more sense than optional dependencies.

Let's say I want to install "Foobar". At installation time it asks me if I want the optional modules for functionality X. I answer "no".

After using Foobar for a while I decide that I now want the optional functionality X. Now the question is, which modules do I need to install to get functionality X?

If the version has not changed and I try to install "Foobar" again, I only get something like "Foobar already installed at the requested version".

So instead of optional dependencies, it makes more sense to have a plugin system. This allows you to browse the plugins and install them separately.

Write your application so that you can activate the plugins you want, or so that it automatically loads all plugins. I would prefer that you need to specify them and that it fails if they are not installed or could not be loaded.

Because this eliminates the error of using a plugin that is accidentally not installed.
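The explicit-activation variant could look roughly like the sketch below. The `Foobar` class, the `Foobar::Plugin::*` namespace and the plugin name `SyntaxColour` are all hypothetical; the point is only that a requested plugin that is not installed stops the application at startup instead of being silently ignored.

```perl
use strict;
use warnings;

package Foobar;

# Hypothetical application core: plugins live under Foobar::Plugin::*
# and must be requested by name.
sub new {
    my ($class, %args) = @_;
    my $self = bless { plugins => [] }, $class;
    $self->load_plugin($_) for @{ $args{plugins} || [] };
    return $self;
}

sub load_plugin {
    my ($self, $name) = @_;
    die "Invalid plugin name '$name'\n"
        unless $name =~ /\A\w+(?:::\w+)*\z/;

    my $module = "Foobar::Plugin::$name";
    (my $file = "$module.pm") =~ s{::}{/}g;

    # Fail loudly: a requested plugin that cannot be loaded is an error.
    eval { require $file; 1 }
        or die "Plugin '$name' requested but $module is not installed\n";

    push @{ $self->{plugins} }, $module;
    return $module;
}

package main;

my $app  = Foobar->new;    # core only, no plugins requested
my $with = eval { Foobar->new( plugins => ['SyntaxColour'] ) };
print "startup failed: $@" unless $with;
```

Each plugin can then ship as its own distribution with ordinary hard dependencies, so installing it later answers the "which modules do I need for functionality X?" question by itself.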

As an example of this, look at Padre. Padre has a core, and Padre::Plugin for optional functions.

Mark Aufflick said...

Also remember that actually a lot of people do module builds without cpan. In most "corporate" companies the production install process is very tightly controlled and so module dependency chains automatically building themselves via cpan+ is not an option.

One place deployed cpan modules as rpm packages, just like redhat, but we built the packages ourselves. We used helper scripts, so it wasn't too bad, but each new module had to be packaged, approved, and released. You can imagine that if each module in Catalyst's dependency chain had one or two optional modules that you didn't need at that time, you would be very happy not to build them!