Tuesday, October 6, 2009

I don't hate the stash ... but I loathe the session.

A couple of days ago, John Napiorkowski asked: "Does Anyone Else Hate the Stash?"

True enough, the stash in Perl Catalyst is a big bag in global memory. Any method along the chain can read or write into that bag. Then you pass the whole thing to the Template Toolkit (TT), which integrates the Catalyst stash into its own stash, and again any template fragment can read or write into the TT stash. So when studying one particular component along the chain, it is indeed sometimes hard to track what is in the stash at that point and where the data came from. I frequently need to resort to the Perl debugger to sort out such situations.

Nevertheless, I don't hate the stash, because it is sooo convenient for letting various software components collaborate at little cost. Setting up a more controlled way of passing information between components would be quite tedious and would imply more maintenance. It is like having several people in a team: if collaboration is harmonious, it has a multiplicative effect; if not, people start treading on each other's toes, and the overall result is unsatisfactory. A simple step towards harmonious collaboration is to partition the stash namespace through additional levels of hashrefs (the same principle we use on CPAN for avoiding collisions between module names).
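Here is a minimal sketch of that partitioning. A plain hashref stands in for the Catalyst stash (in a real action it would be `$c->stash`), and the component names `auth_component` and `search_component` are hypothetical, chosen only for illustration. Each component writes under its own top-level key, so both can use a `title` entry without collision.

```perl
use strict;
use warnings;

# Plain hashref standing in for Catalyst's $c->stash (assumption: in a real
# action you would write $c->stash->{...} instead).
my $stash = {};

# Hypothetical components: each one writes only under its own top-level key.
sub auth_component {
    my ($stash) = @_;
    $stash->{auth}{user}  = 'alice';
    $stash->{auth}{title} = 'Logged in';
}

sub search_component {
    my ($stash) = @_;
    $stash->{search}{title}   = 'Search results';
    $stash->{search}{results} = [ 'foo', 'bar' ];
}

auth_component($stash);
search_component($stash);

# Both components used a "title" key without clobbering each other.
print "$stash->{auth}{title} / $stash->{search}{title}\n";
```

On the template side, the same namespaces would be reached with TT's dot notation, for example `[% search.title %]`.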

Furthermore, global memory in the stash is not too risky because its lifetime is very limited: at the end of the request the whole stash is cleaned up. Unfortunately, there is something much worse than the stash: the session!

Some colleagues tend to like putting stuff into session storage, because it makes it easy to program sequences of requests without having to propagate state through URL parameters or JSON data. I try to avoid it as much as possible because:
  • data in session storage is likely to produce unwanted "action at a distance"
  • the URL API is no longer RESTful (calling the same URL with the same parameters might yield different results)
  • there is a cost in serializing / deserializing the session data at each request
  • session data is limited in size
  • so session data is not appropriate for storing recursive datastructures of unknown sizes

Programmers are tempted by the convenience of session data, but are not always aware of the limitations above.
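To make the serialization cost from the list above more concrete, here is a rough sketch. It assumes a session store that serializes with the core Storable module; the content of `%session` is made up for illustration. The freeze / thaw pair is the kind of work that has to happen around every request, whether or not the request actually uses the data.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical session content: a small user record plus a result list
# whose size depends on what the user did in earlier requests.
my %session = (
    user    => { id => 42, name => 'alice' },
    results => [ map { +{ id => $_, label => "row $_" } } 1 .. 1000 ],
);

# Serialize the whole structure (end of one request) ...
my $frozen = freeze(\%session);

# ... and rebuild it from scratch (start of the next request).
my $restored = thaw($frozen);

printf "serialized size: %d bytes\n", length $frozen;
printf "rows restored:   %d\n", scalar @{ $restored->{results} };
```

The serialized size grows with the data, so a structure of unknown size means an unknown cost on every request.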


  1. I'm curious how you would handle a large dataset that can't be passed as a URL parameter.

    My first instinct is to put it in the session and pull it back out in the next step (redirecting to the previous step if the session data is not there).

    Any other solutions I can think of just seem a lot like reimplementing the session, and probably not as well.

  2. Datasets are passed through POST requests, either as regular forms or as JSON trees.

    But anyway I tend to avoid multi-step transactions; instead, I use large forms (with sections, navigation facilities and Ajax helpers), so the data is accumulated on the client side, and at the end the whole data tree is submitted in one go.