Thursday, October 29, 2009

Initial design vs. maintenance and refactoring

As previously mentioned in this blog, our team is busy rewriting a complex application from Cobol to Perl.

While studying the old Cobol code, I often discover areas where the initial design was well thought out and well organized, but later became blurred and confused over 20 years of maintenance. So part of our analysis task is to do some archeology, trying to understand the historical layers, and to sort out what is still relevant to the current needs of our users.

Unfortunately, the same phenomenon is also starting to appear in the Perl code! Some parts were written two years ago, and then had to evolve for various reasons ... and sometimes the initial design gets blurred in the process.

One may think that when this happens, it is because the initial design did not have the proper abstraction / parameterization hooks to make it easy to extend. But sometimes when doing the initial design of a component, you don't have a complete picture yet of what is going to surround that component ... or the requirements may have changed because this is a long-term project, and life doesn't stop while we are working on this application. So what is really needed to keep it clean is constant refactoring.

The problem is that maintenance work is not measured like initial development: maintainers are evaluated by how many tickets they solved and how long it took, so there is a natural tendency to just "get it to work". Spending additional time on refactoring brings no immediate reward: users won't see a change, managers won't understand why you need to revisit code that was already written, and there is an additional risk of introducing regressions.

It's easy to understand that at a collective level there would be some rewards in the long term (better maintainability, cleaner architecture, etc.) ... but in the long term the maintainer will probably have moved on to another project!

Saturday, October 17, 2009

How to reconcile audits and agile development?

According to recommendations recently issued by the Swiss working group for IT government audit (Swiss chapter of ISACA, the international Information Systems Audit and Control Association), every important IT project in the Swiss government should have at least 10 documents ready for the auditor (among about a hundred kinds of documents defined by the Swiss project management method Hermes):

1. Feasibility study
2. Specifications
3. Cost effectiveness analysis
4. Integration into the IT environment
5. Requirements
6. Concept for an internal control system (ICS)
7. System architecture
8. Tests (test plan and documentation)
9. Acceptance by the user
10. Final assessment

The recommendations explicitly insist that this list also applies to "new so-called 'agile' development methods".

For our Perl project at the Geneva courts of law, this means that we must produce such documents to be ready for occasional auditors. The problem is that the Hermes method was mainly inspired by good old waterfall development methods on mainframe computers, and some of the documents listed above just do not make much sense in our context; so instead of helping to better structure and organize the project, they just represent an additional burden.

For example, some parts of the application start in an exploratory way, without formal specifications, and are progressively shaped into working functionalities; tests are not planned in a document, but written in a galaxy of test files; etc.

I guess that the pressure for formal deliverables in project management is stronger in government than in private companies, but people doing big Perl projects in any context probably face at least some constraints of this kind. Any experiences to share on that?

Tuesday, October 6, 2009

I don't hate the stash ... but I loathe the session.

A couple of days ago, John Napiorkowski asked: "Does Anyone Else Hate the Stash?"

True enough, the stash in Catalyst is a big bag in global memory. Any method along the chain can read or write into that bag. Then you pass the whole thing to the Template Toolkit (TT), which integrates the Catalyst stash into its own stash, and again any template fragment can read or write into the TT stash. So when studying one particular component along the chain, it is indeed sometimes hard to track what is in the stash at that point, and where the data came from. I frequently need to resort to the Perl debugger to sort out such situations.

Nevertheless, I don't hate the stash, because it is sooo convenient to let various software components collaborate at little cost. Setting up a more controlled way of passing information between components would be quite tedious and would imply more maintenance. It is like having several people on a team: if collaboration is harmonious, it has a multiplying effect; if not, people start treading on each other's toes, and the overall result is unsatisfactory. A simple step for ensuring harmonious collaboration is to partition the stash namespace through additional levels of hashrefs (the same principle we use on CPAN for avoiding collisions between module names).
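To make the namespace idea concrete, here is a minimal sketch. It is written in Python rather than Perl, with a plain dict standing in for the stash hashref; the component names ("auth", "search") and the helper `namespace` are made up for illustration and are not part of Catalyst.

```python
# Hypothetical sketch: a plain dict stands in for the Catalyst stash.
# Component names and the helper below are assumptions for illustration.

def namespace(stash, component):
    """Return the sub-dict reserved for one component, creating it on first use."""
    return stash.setdefault(component, {})

def auth_step(stash):
    # The authentication component only writes under its own key.
    namespace(stash, "auth")["user"] = "alice"

def search_step(stash):
    # Same key name "user" as in auth_step, but in a different namespace,
    # so the two components do not overwrite each other.
    namespace(stash, "search")["user"] = "bob (search criterion)"

stash = {}
auth_step(stash)
search_step(stash)
print(stash)
```

Each component touches only its own sub-structure, so when a value looks wrong it is at least clear which component to look at first.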

Furthermore, global memory in the stash is not too risky because it is very limited in time: at the end of the request the whole stash is cleaned up. Unfortunately, there is something much worse than the stash: the session!

Some colleagues tend to like putting stuff into session storage, because it makes it easy to program sequences of requests without having to propagate state through URL parameters or JSON data. I try to avoid it as much as possible because:
  • data in session storage is likely to produce unwanted "action at a distance"
  • the URL API is no longer RESTful (calling the same URL with the same parameters might yield different results)
  • there is a cost in serializing / deserializing the session data at each request
  • session data is limited in size
  • so session data is not appropriate for storing recursive data structures of unknown size

Programmers are tempted by the convenience of session data, but are not always conscious of the limitations above.
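The point about the same URL yielding different results can be illustrated with a small sketch. It is in Python, with plain dicts standing in for the session and for the request parameters; the result list, page size and function names are all hypothetical, not taken from our application.

```python
# Hypothetical sketch: two ways of paging through a list of results.
# Plain dicts stand in for the session and for the request parameters.
RESULTS = [f"case-{n}" for n in range(1, 11)]
PAGE_SIZE = 3

def next_page_with_session(session):
    """State lives in the session: the same call returns something new each time."""
    start = session.get("offset", 0)
    session["offset"] = start + PAGE_SIZE
    return RESULTS[start:start + PAGE_SIZE]

def page_from_params(params):
    """State travels in the request parameters: same input, same output."""
    start = int(params.get("offset", 0))
    return RESULTS[start:start + PAGE_SIZE]

session = {}
first = next_page_with_session(session)
second = next_page_with_session(session)   # identical call, different answer

a = page_from_params({"offset": "3"})
b = page_from_params({"offset": "3"})       # identical call, identical answer
```

With the second variant the client has to carry the offset along, which is a bit more work, but any request can be replayed or bookmarked and will give the same page.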

Saturday, October 3, 2009

YAPC presentation styles

One personal comment I got from the YAPC::EU::09 Survey results was: "Split slides so they contain less text". Well, while attending several talks, I felt exactly the reverse: I wished the speaker had condensed the slides so that they contained more text!

Finding the right balance is really a difficult question. It is true that I have a tendency to fill slides with a lot of material, in order to exploit complementary channels: while my voice gives the general idea, or emphasizes a particular point, the slide can convey more detailed information, and people in the audience can grab more content if they are especially interested in one particular aspect.

I probably like this style because it corresponds to my own way of learning. When I was at school, at a time when video projectors were rare and expensive, most teachers used physical transparencies. Some of them had the habit of putting up the transparency and immediately hiding it with a piece of paper; then they would progressively uncover the slide, one point at a time. I hated that habit, because I was forced to think at the same speed as the teacher. If I see the global picture at once, I can immediately choose which points seem more important to me, and focus on them, maybe already preparing a question, or think back to what was said before, or anticipate what is probably going to be said next. But if the teacher dictates the rhythm, and decides to pause for 5 minutes on a point which is important to him, but not to me, or decides to quickly skip over a detail which I need to elaborate in my head, then I'm in trouble.

The modern way of uncovering slides one point at a time is the Takahashi style (lots of slides, very few words, huge font), which seems much praised in the Perl community. I must admit that I was quite impressed the first time I saw a presentation in this style: it is quite effective for a lightning talk, or to create some suspense at a particular point in a presentation. However, if many speakers adopt this style just out of fashion, without deliberately thinking about which effect they want to achieve, then at the end of the day it becomes quite boring, and I feel as if I had watched several hours of video clips. After such a day I don't really remember what the highlights were.