Wednesday, July 30, 2014

Plack::App::* namespace is not for apps - so which is the proper CPAN namespace ?

OOps ... I just realized that I had misunderstood the intent of the Plack::App namespace : the top-level Plack doc explicitly says :
DO NOT USE Plack:: namespace to build a new web application or a framework. It's like naming your application under CGI:: namespace if it's supposed to run on CGI and that is a really bad choice and would confuse people badly.
and the 2009 Plack Advent Calendar goes even further with
Think twice before using Plack::App::* namespace. Plack::App namespace is for middleware components that do not act as a wrapper but rather an endpoint. Proxy, File, Cascade and URLMap are the good examples. If you write a blog application using Plack, Never call it Plack::App::Blog, okay? Name your software by what it does, not how it's written.
OK, sorry, I got this wrong when publishing Plack::App::AutoCRUD -- but to my excuse, I'm not alone, several other CPAN authors did the same.

The app is quite young, so it is still time to repair its name (even if this operation will be quite tedious, because it involves changes in all module sources, in the CPAN distribution, in the github repository name, and in the upcoming YAPC::EU::2014 talk). But if I want to be a good citizen and engage into such an operation, what should be the proper name ? The CPAN namespace is becoming a bit crowded, as already noted 2 years ago by Joel Berger. For choosing a name, there seem to be several controversial and perhaps contradictory principles :
  • CPAN is for modules, not for apps  : this was argued in 2008 in a Perlmonk discussion on the same topic ; however, many people replied in disagreement. I disagree too : publishing a Perl app on CPAN fully makes sense because we take advantage of the CPAN infrastructure for tests, dependency management, publication, etc. Furthermore, applications can be extended or forked, just like modules, so CPAN is a perfect environment for sharing.
  • publish under the App::* namespace : this is the PAUSE recommendation. But applications in the App::* namespace are mainly  command-line utilities, which is quite different from Web applications. As a matter of fact, nobody used yet the App::Web namespace -- maybe it's time to start ?
  • use a ::Web or ::WebApp suffix at the end of the module name :  I never saw this as a recommendation, but nevertheless many distributions adopted this approach. This is certainly appropriate if the main goal is to publish a functionality Foo::Bar, and by the way, there is also a web app at Foo::Bar::WebApp. But if the purpose of the whole distribution is just a web app, this approach tends to create a new top-level namespace, which is not considered good practice. Should I choose AutoCRUD::WebApp ? I think not, because other people might want to use the AutoCRUD::* namespace.
  • avoid top-level namespaces : this used to be an important recommendation, but it doesn't seem to be well respected any more :-( -- nowadays I see more and more CPAN distributions taking up top-level names. I won't cite any particular example, not to offend anybody, but it's quite obvious if you look at the list of top-level namespaces .... and unfortunately many of those top-level names give no clue whatsoever about what kind of functionality will be found in the associated distribution.
  • hide the technology underlying your app : the Plack argument above says that the app should be named from its functionality, not from its implementation technology. Well ... I'm not so sure that this is always appropriate. Many modules sit under the Tie::Hash::* namespace, just because they used the tied hash technology, for providing various kinds of functionalities.
    Concerning  "Plack", when I see that keyword in a module name, I know that a) this is Web technology, and b) this will work on any kind of web server (as opposed to modules names containing "Apache" or "Apache2"), and I consider this to be useful information for a potential user. On the opposite, I didn't want to name my module DBIx::DataModel::AutoCRUD, even if it uses DBIx::DataModel quite heavily, because that's not hardwired into the architecture and I could easily imagine a later adaptation for supporting as well DBIx::Class.
So in the end I will probably end up with something like App::Web::AutoCRUD or WebApp::AutoCRUD ... unless somebody comes up with a better suggestion !

PS : see also Catalyst::Plugin::AutoCRUD .. which can be used either as a Catalyst plugin or as an application on its own.





Friday, July 11, 2014

Perl virtual tables for DBD::SQLite : ready to test

Followup to my previous article : a first draft of Perl virtual table support for DBD::SQLite is available at https://github.com/DBD-SQLite/DBD-SQLite/tree/vtab .

This is still alpha software, but it shows the idea; I still don't know when this work will be mature enough for a CPAN publication.

Two examples of virtual table modules are bundled with the distribution :
  • FileContent : implements a virtual column that exposes file contents. This is especially useful
    in conjunction with a SQLite FTS fulltext index; see the doc in Fulltext_search.pod
  • PerlData :  binds a virtual table to a Perl array within the Perl program. This can be used for simple import/export operations, for debugging purposes, for joining data from different
    sources, etc.
I'm currently thinking of one more example , which would be fun to play with : a virtual table that would proxy to another DBI connection. Then we could join data from various sources, using SQLite's features to do the joining work. Sounds quite exciting, but at this point this is just an idea.

Thanks to Salvador FandiƱo who pointed me to https://metacpan.org/pod/SQLite::VirtualTable, which is similar in idea but more meant to embed a Perl interpreter inside a sqlite application, rather than the other way around; this code helped me build the DBD::SQLite version.

Of course any comments/suggestions are welcome.

Friday, July 4, 2014

project : Perl virtual tables for DBD::SQLite

During the year I have more and more management tasks and less time for programming. So for the holidays I wanted a change and decided to engage into a really "hardcore programming"  project, namely to add support for Perl virtual tables within the DBD::SQLite database driver. SQLite has this notion of "virtual tables" which look like regular tables but are implemented through callback routines. This project  implies some C programming, using Perl XS API, and the delicate part is to design some appropriate glue between SQLite's notion of "object-oriented", through extensible C structures and callbacks, and Perl's object-oriented features.

At the beginning I wasn't even sure if such a project would be feasible, but now it is slowly taking shape and I'm pretty confident that it will eventually reach something usable. The concept is quite similar to Perl's tied variables, where a published API is reused for accessing many different kinds of data; except that here the published API is SQL instead of hashes or array operators. As a result, we could have virtual tables bound to the filesystem, to the Win32::Registry, to some configuration data, or any other accessible resource. This will open a new field for lots of creative ideas.

My main motivation for doing this work is to be able to build a framework for collections of documents, using SQLite for the fulltext index, and using the filesystem for storing document content : this will be a much more powerful replacement for my very old File::Tabular::Web::Attachment::Indexed module. That module is still heavily used at Geneva courts of law, but now we have 10 years of data, and the old architecture is clearly showing its limits.

For the virtual tables project I need some test cases, so if anybody has ideas about Perl-accessible data to be published as an SQL table, I'm interested.