Friday, February 27, 2009

Requirements and IT programmes

Requirements in any kind of IT-based engineering programme are difficult. The bigger the programme, the more difficult they are. This may seem paradoxical in a world in which Google, Amazon and eBay are delivering high-performance IT-driven businesses to massive audiences. Why is that?

I think there are three major reasons for this.

The first is the predilection for big-bang approaches in large programmes. This seems to be particularly prevalent in public sector programmes where IT is driving business transformation, such as Aspire, NPfIT and DII. A big-bang approach leaves no room to apply the core idea of agile development: get something working, then improve on it in stages. Conversely, Google et al. typically introduce new products and services to limited audiences as betas (Gmail, for example) and only gradually increase functionality until they are stable enough for full release.

The next reason may be peculiarly British. The UK Office of Government Commerce has mandated the use of output-based specifications (OBS) for the procurement of large programmes. This is fine in itself, since the approach keeps the focus on end-user business benefits. However, output-based specifications are not engineering requirements, so delivery of a service cannot be measured against an OBS. Let’s consider an example to make this clear. The following requirement is from the OBS for NPfIT, available on the Department of Health’s public web site.

The service shall be secure and confidential, and only accessible on a need-to-know basis to authorised users.

In the absence of precise definitions of “secure”, “confidential” and “need-to-know” this is a vacuous statement!

(Note that it may appear that I have selected this requirement as an extreme example in order to demonstrate the point, but in fact I chose this at random.)

I’m not suggesting that the requirements against which programmes are procured should go into engineering levels of detail, but equally such requirements are inadequate as a basis for engineering delivery. The approach that I have seen work successfully at first hand is to use OBS-style requirements as the starting point for requirements elaboration. For a particular release of a service, each requirement should be elaborated to the point where it is testable and implementable, and that elaboration should be agreed between customer and supplier as the basis for determining whether the requirement has been met. In long-term programmes the elaboration may change over time (for example, an encryption algorithm that was secure 10 years ago may not be secure today). The essence is that the OBS expresses the spirit of the requirements rather than the details; procurement requirements are not the same thing as engineering requirements.
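
To make the distinction between procurement and engineering requirements concrete, here is a minimal sketch in Python of how one slice of the “need-to-know” clause might be elaborated into something testable. The rule, the names and the test data are my own illustrative assumptions, not anything from the actual NPfIT specification.

    # A minimal sketch of one possible elaboration of the "need-to-know"
    # clause: the rule is reduced to an explicit check, and unit tests give
    # it a pass/fail interpretation for a particular release. All names and
    # data here are illustrative assumptions, not the actual NPfIT design.
    import unittest

    def may_access_record(user_id, patient_id, care_relationships):
        """Allow access only if the user has a recorded legitimate
        relationship with the patient (the elaborated 'need-to-know' rule)."""
        return (user_id, patient_id) in care_relationships

    class NeedToKnowTest(unittest.TestCase):
        relationships = {("dr_jones", "patient_12345")}

        def test_user_outside_care_team_is_refused(self):
            self.assertFalse(may_access_record(
                "dr_smith", "patient_12345", self.relationships))

        def test_user_in_care_team_is_allowed(self):
            self.assertTrue(may_access_record(
                "dr_jones", "patient_12345", self.relationships))

    if __name__ == "__main__":
        unittest.main()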

The final reason is the use of COTS packages to deliver such programmes. The challenge is that the requirements actually delivered depend entirely on the capability of the chosen COTS package. This is especially an issue for business processes, since these tend to be intrinsic to specific packages. It is another reason why ideas from agile development cannot simply be reused. Also, picking up on the previous point, the vagueness of an OBS means that the customer and the COTS vendor can hold very different (but equally valid) interpretations of what a requirement means.

Is this a real problem? Well, judge for yourself the success of these large programmes...

Wednesday, February 18, 2009

Building Architecture vs Aeronautical Engineering

Building architecture is often used as an example of what IT architecture should aspire to. There are a number of reasons for this: for a start, the term “architecture” is normally associated with buildings, and has been adopted by IT in parallel with the emergence of the new discipline of abstracting large-scale systems in order to understand them. This close relationship with building architecture has been cemented by the work on architecture patterns, which takes as its starting point Christopher Alexander’s work on patterns in building architecture.

There are certainly some similarities between building and IT architecture. Both use abstraction to manage complexity: building architecture uses plan and elevation drawings to understand the structure of buildings, and mathematics to understand how to construct them, while IT uses architecture views (such as those defined in the TOGAF framework) to understand what needs to be built, and a variety of tools and processes to build it.

But what about after construction? How well does the metaphor hold then? I think at this point it falls apart. Buildings are by and large static; maintenance is typically restricted to superficial changes such as painting and decorating, and it is unusual for a building to go through fundamental reconstruction. IT systems, on the other hand, are living beasts, subject to constant change of varying degrees: sometimes the addition of a minor feature or a new interface, other times a fundamental re-architecting of the entire system to accommodate new requirements, or original requirements that were not properly understood.

In practice this means that in the case of building architecture, design documents and blueprints will gather dust post-construction, whereas for IT systems it is critical to have an up-to-date ‘as-maintained’ documentation set. (That said, in my experience organisations that do maintain such documentation are the exception rather than the rule.)

I think a better metaphor is aeronautical engineering, where the discipline involved in maintaining up-to-date documentation for in-life aircraft, their associated systems and components is quite incredible. I was struck by this years ago when I worked on a project with Boeing re-engineering a tool they used - Wiring Illuminator - which helped maintainers understand the individual wires and connected end-points in an aircraft. Subsequently I worked on JSF, where the life history of every single component was being maintained. Note that I am following well-trodden ground here: over 10 years ago Ross Anderson was pointing out that IT could learn much from the way the aircraft industry deals with safety-related issues.

As the discipline of IT architecture develops I fully expect that capturing high-quality ‘as-maintained’ documentation will become critical. Tim O’Reilly shrewdly observed that a key industry differentiator in the future will be service management organisations; I would add that it will be the service management organisations who excel at ‘as-maintained’ documentation and baseline management.

Wednesday, February 11, 2009

What Should Go In The Cloud?

Cloud computing is all the rage. Vendors are falling over themselves to offer services from the cloud. Analysts are proclaiming that the cloud is the next big thing. So, given that cloud-based services provide economies of scale that most businesses can’t dream of, we should be pushing all of our services into the cloud rather than provisioning them ourselves, right?

Let’s consider an example from the last programme I worked on. In that programme a national single sign-on (SSO) solution was provided as an external service (in the cloud, in fashionable parlance). This gave users a single identity across NHS organisations, allowing them to cross organisational boundaries with that one identity. Great idea. One minor problem: if for any reason the external service was unavailable, users were not able to log in to any of their applications, and users already logged in had their sessions terminated. Unavailability of that single service impacted every other business application.
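
To make the shape of that dependency concrete, here is a minimal sketch in Python of what every local application’s session check effectively boiled down to. The endpoint, payload and timeout are purely illustrative assumptions on my part, not the actual national design.

    # Illustrative sketch only: the endpoint, payload and error handling are
    # assumptions, not the actual national SSO design. The point is the hard
    # dependency: with no local fallback, an unreachable external SSO service
    # means no application can log anyone in or keep a session alive.
    import requests

    SSO_ENDPOINT = "https://sso.example.nhs.uk/validate"  # hypothetical URL

    def session_is_valid(token):
        try:
            response = requests.post(SSO_ENDPOINT, json={"token": token},
                                     timeout=5)
            return response.status_code == 200
        except requests.RequestException:
            # External service unreachable: every local application ends up
            # treating the user as logged out.
            return False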

What seemed like a great idea at the time did not really stand up to scrutiny in practice. Of course hindsight is a great tool, so I am not criticising the original design choice, but trying to learn from it. This example makes it obvious that not everything should be in the cloud - operational considerations need to be traded off against financial benefits. So how do we decide what should go in the cloud and what we should deliver ourselves?

There are a number of dimensions to this. The first consideration is the business’s value chain. Any secondary activity in the value chain - HR systems, intranets and so on - is a candidate for delivery via the cloud. What about primary activities? Instinctively these should be delivered internally. But if that is the case, how is it that salesforce.com has been so successful? I think the answer is deeper: a primary activity should be delivered from the cloud if the cloud provider can offer greater quality and reliability than would be possible by delivering it internally. So for a large, mature organisation with a sophisticated IT operation, delivering CRM internally might make sense. For other organisations, CRM via salesforce.com might make sense even though CRM supports a primary activity.
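
As a rough illustration of that heuristic, the decision rule might be sketched as follows. The function and its single ‘quality’ score are a deliberate over-simplification of mine, not a formal method.

    # A deliberately over-simplified sketch of the heuristic above; "quality"
    # stands in for whatever reliability and service-level measures matter
    # to the organisation.

    def deliver_from_cloud(is_primary_activity, cloud_quality, internal_quality):
        """Secondary value-chain activities default to the cloud; primary
        activities go to the cloud only when the provider can offer higher
        quality and reliability than the organisation could internally."""
        if not is_primary_activity:
            return True
        return cloud_quality > internal_quality

    # Example: an organisation whose internal CRM capability is weak may
    # still sensibly buy CRM (a primary activity) from salesforce.com.
    print(deliver_from_cloud(is_primary_activity=True,
                             cloud_quality=0.9, internal_quality=0.6))  # True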

Returning to my SSO example, then: for those NHS organisations for whom SSO is too complicated a task to deliver themselves, it makes sense to take it from the cloud. For larger, more sophisticated NHS organisations, internal delivery of SSO might be appropriate. That just leaves the problem of interoperability... one for a later blog!

Thursday, February 05, 2009

Massive IT Programmes

I have just recently changed jobs, joining Cognizant’s Advanced Solutions Practice after three and a half years working for BT as Chief Architect on the NHS National Programme for IT. Moving on has given me the chance to reflect a little on some of the challenges I faced in that role.

The programme is frequently in the press, and has been labelled a classic example of a failing IT programme. Though the press coverage has in general been ill-informed and inaccurate, there have undoubtedly been problems with delivering the programme, for many reasons which I will not go into here. However, some general observations can be made about massive IT programmes.

One of the greatest challenges in programmes such as this one is the sheer scale of change involved, in terms of both business and technology. The traditional programme and project management response to the complexity this scale brings is a reductionist one: break the overall programme into smaller, manageable parts. The difficulty lies in choosing how to slice and dice the large problem. Executed correctly, this approach allows traditional programme management and systems engineering techniques to be applied to ensure delivery within acceptable parameters of cost, schedule and risk. The downside is that if the overall problem is divided incorrectly, the smaller parts are as difficult to deliver as the overall programme. Moreover, the approach assumes that such a division is possible at all.

What alternatives are there, then? That is a difficult question to answer, since this is still an embryonic and immature field. Historically the approach was to execute a small-scale pilot programme and then scale it up to the size of the full programme, but that takes time and can cause loss of momentum. An alternative is an evolutionary approach, similar to some agile approaches to software development: deliver a solution with acknowledged flaws, and evolve it via a series of small iterations into a solution that is ‘good enough’ to satisfy the key stakeholders of the programme.