Monday, August 22, 2011

The difficult thing about back-ups

 

My back-up drive failed recently.

It was a lucky coincidence that I noticed this; despite grand intentions I’m lazy when it comes to backing things up and only think about it when I am panicking about typing ‘rm -rf’ in the wrong directory.

The drive failed when I was moving it across my study; I powered it down, unplugged it, moved it, plugged it back in and then fired it up again. Except it declined to fire up. It made a few pathetic wheezing sounds and gave up. After a few days of online searching I admitted defeat and contacted LaCie, who eventually acknowledged it was faulty and issued an RMA. Four weeks later I had a repaired working drive in place (albeit having lost all of the original data). I am now able to return to my previous state of blissful ignorance.

The point of this little story? In a nutshell I think this summarises many enterprises’ attitude to back-ups. I know from personal experience two organisations that notionally had a standard back-up policy with regular full and incremental back-ups, yet when the back-ups were needed they could not be retrieved because the back-up hardware had failed.

Then there is the recent case of Amazon. Running infrastructure as complicated and sophisticated as theirs is fraught with risk, so if I were one of their customers I would probably be thinking about having an iron-clad business continuity plan in place, but apparently even here back-up complacency reigns.

Why is this a big deal? Normal production systems are tested thoroughly prior to go-live and then tested on an on-going basis through production use. Any problem will be automatically detected or signalled by a user fairly quickly. However since back-up systems are only invoked by exception, the first time you know there is a problem is when you need them. The answer? Well, regular testing of a full restore from back-up seems like the obvious solution. There are some products on the market which claim to help but I remain somewhat sceptical of their efficacy.
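
The check needn’t be elaborate. Here is a minimal sketch in Python of the kind of thing I have in mind - restore the latest back-up to a scratch area and compare checksums against the live data. (The source path and the restore command are hypothetical placeholders; substitute whatever your back-up tool actually provides.)

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path

SOURCE = Path("/data")  # hypothetical: the directory being backed up
RESTORE_CMD = ["restore-tool", "--latest", "--to"]  # hypothetical restore CLI

def checksum(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore() -> bool:
    """Restore the latest back-up to a scratch area and compare checksums."""
    with tempfile.TemporaryDirectory() as scratch:
        subprocess.run(RESTORE_CMD + [scratch], check=True)
        for original in SOURCE.rglob("*"):
            if not original.is_file():
                continue
            restored = Path(scratch) / original.relative_to(SOURCE)
            if not restored.is_file() or checksum(restored) != checksum(original):
                return False
    return True

if __name__ == "__main__":
    print("restore verified" if verify_restore() else "RESTORE FAILED")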

Am I eating my own dog food? Alas I have regressed to my pre-failure days and have adopted the macho “my hardware never fails and I never type rm -rf by mistake” attitude. Some people never learn.

Wednesday, July 06, 2011

Government ICT Strategy

The recently published government ICT strategy makes interesting reading. The government is clearly trying to learn some of the lessons of previous failures, albeit without really getting to grips with the underlying reasons for these failures.

One of the government's explicit strategy statements makes very interesting reading:
The adoption of compulsory open standards will help government to avoid lengthy vendor lock-in, allowing the transfer of services or suppliers without excessive transition costs, loss of data or significant functionality.
Will open standards really do this?

Let's turn this on its head; what are the typical reasons for vendor lock-in? Here is my starter for 10 in no particular order:
  1. The software offers must-have features which competitors do not have.
  2. The organisation using the software has adapted its business processes to fit with how the software works.
  3. The organisation using the software has made a considerable investment in training its staff in how to use the software.
  4. The software is integrated with other systems that the organisation uses.
  5. The difficulty of migrating data out of the software.

Which of these will open standards help with? As far as I can tell only number 4; use of open standards in interface specifications should in principle allow substitutability of standards-compliant components on either side of the interface. I say in principle because for any reasonably sophisticated enterprise system, an interface will be a key part of a business process, which will one way or another be organisation-specific. It is typically a non-trivial task (in some cases impossible) to substitute another component into this business process without impacting the business process, leading us back to item 2 in the above list.
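
To illustrate the “in principle” part, here is a toy sketch; the interface and the vendor components are invented for illustration. Client code written against a standard interface neither knows nor cares which compliant component sits behind it.

```python
from abc import ABC, abstractmethod

class DocumentStore(ABC):
    """A hypothetical open-standard interface for storing documents."""
    @abstractmethod
    def put(self, doc_id: str, content: bytes) -> None: ...
    @abstractmethod
    def get(self, doc_id: str) -> bytes: ...

class VendorAStore(DocumentStore):
    """One vendor's standards-compliant component (in-memory for brevity)."""
    def __init__(self):
        self._docs = {}
    def put(self, doc_id, content):
        self._docs[doc_id] = content
    def get(self, doc_id):
        return self._docs[doc_id]

class VendorBStore(DocumentStore):
    """A rival's compliant component; the client below cannot tell them apart."""
    def __init__(self):
        self._docs = {}
    def put(self, doc_id, content):
        self._docs[doc_id] = bytes(content)
    def get(self, doc_id):
        return self._docs[doc_id]

def archive_invoice(store: DocumentStore, invoice_id: str, pdf: bytes) -> None:
    """Business-process code written against the standard, not the vendor."""
    store.put(f"invoice/{invoice_id}", pdf)

# In principle either component can be swapped in without touching the client:
for store in (VendorAStore(), VendorBStore()):
    archive_invoice(store, "2011-042", b"%PDF-1.4 ...")
```

The catch, as noted above, is that real interfaces carry organisation-specific business process baggage that no abstract interface definition captures.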

Don't get me wrong: open standards are great and to be applauded - I glory in my ability to choose my browser according to my mood. However let's not kid ourselves that by adopting them we are going to see a public sector IT world free of Oracle and MS Office any time soon.

Tuesday, December 21, 2010

The Assumption of Independence in the Financial Systems Failure

I have been spending a lot of time recently reading some of the plethora of books that have been published which either provide inside accounts of the 2008 failures in the banking sector (Lehman Brothers and Bear Stearns) or have tried to analyse the causes of this failure. Though I’m not an economist by any stretch of the imagination, having studied financial strategy I have developed something of a morbid fascination for this topic, a little on the lines of an episode of Columbo where you know the outcome but the interest comes from finding out how Columbo will prove the perpetrator’s guilt. In the case of the 2008 crash it is common knowledge that it was caused by bankers taking risks purely to maximise their own bonuses, isn’t it?

Going beyond the superficial mass-media level reveals something slightly more interesting. There was certainly unjustifiable risk taking, but this in itself ought not to have caused the systemic failure that occurred. One of the early failures which triggered the collapse of the dominoes was the collapse of two hedge funds dealing in derivatives run by Bear Stearns. Right until the point at which these funds were liquidated the fund managers were maintaining that the funds were diversified, so their exposure to sub-prime mortgages was limited. However subsequent investigation proved that in fact over 70% of the cash invested in these funds had been spent on mortgage-backed derivatives. This is important as it is a key tenet of investment strategy that funds should be diversified so that losses in one area are compensated by gains in other areas. Diversification fails as a strategy when losses in one area trigger losses in another area, i.e. even though a portfolio may be diversified there may be dependencies between its holdings. This is what happened in 2008: losses in sub-prime triggered losses in other areas, leading to large-scale failures in supposedly resilient diversified funds.

So why bring this up in a blog supposedly devoted to technology? Having the memory of an elephant, this reminded me of a couple of papers that were published in the 80s. The first paper, The N-Version Approach to Fault-Tolerant Software, looked at how software risk could be massively reduced by copying the idea of hardware redundancy in software, a technique known as n-version programming. Basically the idea was that for a high-integrity system the software should be independently written several times (say three) and then control logic would execute all of the versions in parallel, following the majority vote at each decision point. This was followed up by another paper, An Experimental Evaluation of the Assumption of Independence in Multi-version Programming, which challenged the hypothesis at the heart of n-version programming - that a failure in one programme would be independent of a failure in another programme. This is a reasonable assumption in hardware, since failures are typically caused by physical characteristics rather than common design flaws. However this latter paper demonstrated empirically that this was not a safe assumption for software, and therefore n-version programming was dead.
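
For the curious, the control logic amounts to little more than this sketch (the three “independently developed” implementations here are trivial stand-ins):

```python
from collections import Counter
from typing import Callable, Sequence

def n_version_execute(versions: Sequence[Callable[[float], float]], x: float) -> float:
    """Run independently written implementations of the same specification
    and follow the majority vote, as n-version programming prescribes."""
    results = [f(x) for f in versions]
    answer, votes = Counter(results).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority - the versions disagree")
    return answer

# Three trivial stand-ins for independently developed implementations.
def team_a(x: float) -> float: return x * x
def team_b(x: float) -> float: return x ** 2
def team_c(x: float) -> float: return pow(x, 2)

print(n_version_execute([team_a, team_b, team_c], 3.0))  # 9.0
```

The vote only buys you anything if faults are uncorrelated: if two of the three teams make the same mistake, the voter happily returns the wrong answer with a comfortable majority - which is precisely the assumption the second paper demolished.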

Fast forward 25 years and what do we see? The 2008 crash was effectively the result of dependent failure in a system which assumed failures were independent. Spooky, eh? Normally software mimics life, but in this instance software technology seems to have got there first!

Friday, July 30, 2010

Open Source Confusion

 

I have dabbled with open source for many years, both as a user and briefly as a developer. I personally like the idea that if there is a problem with the software I can fix it myself rather than having to wait for the vendor to fix it. I have therefore followed with interest some recent thoughts about open source, in particular in two recent forums.

The first is in connection with the Department of Health’s decision not to continue its enterprise-wide agreement with Microsoft. This has triggered some discussion about open source. Coincidentally, at the same time the latest issue of IT Now concentrates on open source.

What I have found interesting reading these articles and posts is the amount of confusion and misinformation about open source. Here I add my own thoughts to these discussions.

  1. Open Source is cheaper than closed source

    This is a classic misconception. While it is often the case that open source does not have the same initial licence cost as closed source solutions, proper comparison of the costs of the two requires analysis of the respective total cost of ownership. For example, in a typical corporate situation key infrastructure components require support in line with the organisation’s business needs. In a closed source situation the software vendor typically provides this as part of their maintenance agreement; in an open source situation, since there often isn’t a software vendor as such, a third-party organisation must provide this support. The organisation procuring such an open source solution must satisfy itself that any such support vendor has sufficient competence and expertise in the software to be able to support it. A good example of such an organisation is Red Hat, who provide support for Red Hat Linux (amongst many other open source products). The key point here is that lifetime costs including training, support, upgrades etc must be included in the TCO calculation (see the sketch at the end of this post).

  2. Open Source is less secure than closed source

    This is somewhat more contentious. I have previously heard this used as an argument (by non-technical people) for not using open source. I would argue that open source solutions are more secure than closed source, since the opportunity for unlimited peer review of open source code significantly reduces the risk of security vulnerabilities persisting, compared to closed source solutions which effectively rely on security through obscurity. The open source approach is similar to the practice in the cryptographic community of peer review of crypto algorithms.

  3. Open Source is easier to modify than closed source

    It is self-evidently true that in principle anyone can modify open source code. However in practice modifying open source code is not for the faint-hearted - these are often complex and sophisticated enterprise applications. It’s fine for Yahoo and Google engineers to modify open source software, since their businesses are based on software. However for organisations for whom software is an enabler rather than a core business asset, such sophisticated software development will not typically be a core competence. For such organisations self-modification of code is not really an option unless there is a desire to diversify the business into software development! For example, this means that most adopters of Open Office are unlikely to modify the code themselves.

  4. Open Source is supported by dedicated individuals who freely give up their time

    There are undoubtedly many dedicated developers who give up their own time to write or modify code for open source applications. However there are also many open source products for which major chunks are developed by large organisations with salaried employees. Red Hat is an example of this. Similarly Yahoo contributes to many open source projects based on the work that their salaried engineers perform. Daniel Pink’s idealised view of open source as being the output of individuals motivated not by normal corporate rewards isn’t totally accurate.
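
To make the TCO point from item 1 concrete, here is a toy calculation. Every figure below is invented purely for illustration; the point is the shape of the sum, not the numbers.

```python
def five_year_tco(licence: int, annual_support: int,
                  training: int, upgrades: int) -> int:
    """Total cost of ownership over five years: up-front licence plus
    recurring support plus one-off training and upgrade costs."""
    return licence + 5 * annual_support + training + upgrades

# Hypothetical closed source option: big licence fee, bundled support.
closed = five_year_tco(licence=100_000, annual_support=20_000,
                       training=10_000, upgrades=15_000)

# Hypothetical open source option: no licence fee, but paid third-party
# support and more training for in-house staff.
open_src = five_year_tco(licence=0, annual_support=35_000,
                         training=25_000, upgrades=5_000)

print(f"closed: £{closed:,}  open: £{open_src:,}")
# closed: £225,000  open: £205,000 - far closer than the licence fees suggest
```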

Thursday, March 04, 2010

Software Patents Gone Mad?

Is it just me, or is Apple trying to claim a patent for pub/sub? See this article. According to this, Apple is claiming a patent over

“A system in which a software module called an event consumer can indicate an interest in receiving notifications about a specific set of events, and it provides an architecture for efficiently providing notifications to the [event] consumer”
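
Strip away the legal language and that claim describes garden-variety publish/subscribe, the sort of thing any programmer might knock up in a dozen lines. A minimal sketch (all names here are mine):

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """A bare-bones publish/subscribe dispatcher."""
    def __init__(self) -> None:
        self._consumers = defaultdict(list)

    def subscribe(self, event_type: str, consumer: Callable[[dict], None]) -> None:
        """An 'event consumer' indicates an interest in a set of events."""
        self._consumers[event_type].append(consumer)

    def publish(self, event_type: str, payload: dict) -> None:
        """Provide notifications to every consumer interested in the event."""
        for consumer in self._consumers[event_type]:
            consumer(payload)

bus = EventBus()
bus.subscribe("file.changed", lambda event: print("notified:", event["path"]))
bus.publish("file.changed", {"path": "/tmp/example"})
```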

What is interesting is that the pretenders to Microsoft’s crown are now exhibiting the same kind of behaviour for which Microsoft used to be criticised.

Friday, February 26, 2010

Are Standards Good for Consumers?

Last week’s Mobile World Congress produced the interesting announcement that a number of industry members are joining together to form an industry association (Wholesale Applications Community, or WAC) dedicated to providing a common application platform for mobile phones. This, combined with Bruno’s thoughtful blog on Apple and Flash/Java, got me thinking...

According to the announcement “The alliance's stated goal is to create a wholesale applications ecosystem that – from day one – will establish a simple route to market for developers to deliver the latest innovative applications and services to the widest possible base of customers around the world.”

It is interesting that this is an operator-led initiative; none of the major platform vendors (Apple, Microsoft, Google or Nokia) are currently involved. A simple interpretation could be that it is an attempt by the operators to reclaim the initiative as the services they provide become effectively commoditised, with industry differentiation coming from the mobile device platforms of Apple et al. These platforms provide the features that enable the rich ecosystem of applications which has created a whole new sub-industry. If the operators don’t get a piece of this, they will be consigned to building masts and sending bills until they go the way of the dinosaurs.

However it also raises an interesting broader technology issue: does the consumer benefit from this standardisation? Discussions about technology standards almost always invoke the example of VHS vs Betamax as the rationale for the consumer benefits of standardisation. However doesn’t standardising the application platform take away the ability of the device manufacturers to differentiate themselves by providing distinctive features? If the argument works for mobile devices, why not for laptops? Desktops? Servers? If it’s such a great idea, why did MSX fail?

To my mind the key difference is whether we are talking about functionality or content. VHS vs Betamax was important to consumers because they wanted a standard content delivery mechanism - as long as the machine could play the content, essentially the functionality of the machine was irrelevant. So standards for defining and delivering content are good for the consumer. (HTML is another good example of this.)

Standards for functionality restrict the features available to consumers, creating monopolies and stifling innovation. This is bad for consumers. Having a diverse market for mobile device platforms is therefore very much to the benefit of consumers. Standardising this platform would be bad for consumers.

By the way, in case no-one told the members of WAC, they are reinventing the wheel - they should check out Java.

Thursday, February 18, 2010

Custom Configuration?

One of the trends that I have noticed recently is the number of firms who are challenging the elements of their IT estate that are custom developed or in some other way non-standard. The reasoning goes that anything non-standard costs more to develop and maintain compared to a vanilla out-of-the-box configuration. This is of course quite true but I think there is a more subtle point here.

Against the cost of any element of the IT estate we need to balance the value generated. In general the standard configuration in a packaged application is merely an aggregation of the most common needs of its existing user base. In many sectors most firms will be using the same set of applications, so by adopting a standard configuration a firm is saying that it is happy to execute its business processes in the same way as most of its competitors.

If this business process constitutes a source of differentiated competitive advantage for the firm, then by adopting the standard configuration this source of differentiation is sacrificed and the value it delivers is lost. In this case firms should think very carefully about the cost of this differentiation versus the value it delivers.

If there is no differentiated competitive advantage associated with this business process, it is either a business overhead or a source of cost advantage. Either way the value is not reduced by standardising the business process. So in this latter case it is quite safe to standardise on an out-of-the-box configuration.

Thursday, February 11, 2010

Enterprise vs Solution Architecture Reprised

I currently work as an architect with Cognizant’s Architecture Practice within the Advanced Solutions Group. Just before Christmas we had a working session where we got on to the subject of Enterprise vs Solution Architecture. Coming out of this discussion I reached a couple of conclusions.

Firstly, is Enterprise Architecture just Solution Architecture on a larger scale? The answer, somewhat unhelpfully, is yes and no. Recall from my previous blog the time element of Enterprise Architecture. Solution Architecture doesn’t become Enterprise Architecture when it just becomes bigger and/or more complex. However humans typically deal with scale and complexity by breaking problems down. In this context, if the Solution Architecture becomes sufficiently large and complex it will most likely not be implemented in one go but will be the target of a programme of change. The target, and the roadmap to achieve that target, then become the Enterprise Architecture for this programme of change.

This led me to a second conclusion, which I have found incredibly helpful when explaining to non-Enterprise Architects the distinction between Enterprise and Solution Architects. Most IT people have no problem understanding the distinction between a project manager, responsible for delivering a tightly scoped piece of work, and a programme manager, responsible for delivering change over a period of time executed as a series of parallel and/or sequential projects. In this context the appropriate analogy is that an Enterprise Architect is to a Solution Architect as a programme manager is to a project manager.

Wednesday, December 16, 2009

Integration

Enterprise integration is one of the most difficult IT problems to crack.

A while back I was lucky enough to be invited to a reception at The Gherkin by a leading vendor of integration software. At the reception the CTO of this company made a speech; though I expected this to be fairly positive and sales-oriented, I was stunned when he announced that “the integration problem is 90% solved”. Having worked continuously on large-scale integration projects for the last 8 years or so, I was filled with a sense of dread that after all there had been an obvious answer that I and the talented people I had been working with had somehow overlooked. Intrigued, I grabbed the CTO as soon as he had finished his speech and challenged him about this. To my opening “You’re wrong, we’re not even at 10% yet” he managed to muster that he was just talking about the technology plumbing underlying integration. That of course is a totally different issue, but this brief interlude highlights everything that is wrong with integration projects.

First and foremost, integration is a business problem, which needs to be owned and led by the business, not the IT shop. Integration boils down to users attempting to conduct business while traversing system boundaries. Before you can understand how to plumb the systems together you need to understand the business processes involved. The problem is that understanding integrated business processes is very difficult. Moreover integration usually goes hand in hand with business transformation, so in parallel new business processes need to be agreed and their integration aspects analysed and documented.

Integration projects go wrong when they are technology driven. A bunch of well-meaning techies (or even worse, mercenary consultants) tell senior managers that they can save £(name your amount) by providing “joined-up solutions”. The explanation is simple in management speak - “by joining up silos of data we make the business more efficient and agile, requiring fewer people to achieve greater productivity”. Who isn’t going to believe that? (Tony Blair did.) Joining up these silos means building business processes that can span the silos (otherwise you won’t get any improvement in efficiency). The net result is techies telling business people how to do their jobs - not renowned as a recipe for success.

Wednesday, November 25, 2009

Systems Integration

I have recently noticed an interesting phenomenon with a number of clients. A project starts life as an application roll-out or an upgrade but morphs over time into a systems integration project. First things first: what do we mean when we talk about systems integration?

Systems integration projects have a number of characteristic properties:
  • There is a single holistic desired business outcome, with an identified business sponsor accountable for achievement of this outcome;
  • The outcome cannot be achieved by delivery of a single isolated application but requires multiple applications integrated in some way, or a single application integrated with other back-end systems in some way;
  • There are multiple organisational entities involved in delivery of the overall system. These could be external entities such as multiple suppliers delivering to the client, or could involve multiple internal entities, particularly in larger organisations.
What is striking about these evolved systems integration projects is that, by accident or design, some of the basics of systems integration get ignored, leading to fundamental (and in some cases irresolvable) problems with systems delivery. Examples of these neglected basics include:
  • Having a clearly defined and baselined set of requirements under configuration and change management;
  • Having a prime who is accountable for end to end systems integration. The National Programme for IT is a classic example of how things can go wrong when there isn’t a prime - you can’t federate systems integration;
  • Putting in place mechanisms which allow individual suppliers to perform partial integration testing of their components. This could be, for example, provision of a sandpit environment or provision of test harnesses that suppliers can use in their own environments (see the sketch below).
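
On that last point, a test harness need not be sophisticated; even a canned stand-in for the system on the far side of an interface lets a supplier exercise their component before the real integration environment exists. A minimal sketch in Python (the endpoint and message shapes are invented for illustration):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

class FakeBackEndHandler(BaseHTTPRequestHandler):
    """A canned stand-in for a back-end system on the far side of an interface."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        # Return a fixed acknowledgement so a supplier can test their
        # component without access to the real shared environment.
        body = json.dumps({"status": "ACK", "received": request}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), FakeBackEndHandler).serve_forever()
```
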
In future blogs I will return to some of these themes and explore them in more detail.

Experts

Someone recently introduced himself to me as an “expert” in his particular area. I was somewhat taken aback by this, as in Britain at least this is not a term we use lightly. As I reflected on this I reached two diametrically opposite interpretations of the word:

  • Someone calling themselves an expert is one of a handful of people in the world who knows all there is to know about this particular subject; if they aren’t able to answer a question (perhaps after going away to think about it on their own) no-one can.
  • Someone calling themselves an expert has such a weak understanding of the subject in question that they aren’t even aware of all of their areas of ignorance. In all likelihood such experts will answer questions inaccurately, if they can answer them at all.
Note that experts in the former category don’t need to describe themselves in this way - it is obvious.

Friday, May 08, 2009

An Audience with Michael Dell

I recently had the pleasure of hearing Michael Dell speak in a briefing session organised by BCS ELITE.

Michael started off by giving his views about industry trends and the current economic situation. In particular he spent quite a lot of time talking about cloud computing and how Dell are working with large technology organisations such as Amazon and Yahoo. He talked about the emergence of private clouds - cloud-based solutions with limited shared infrastructure - to provide some of the benefits of cloud computing while at the same time assuaging concerns about security and integrity of data.

After Michael’s presentation there was a Q&A session. The questions varied from “are you getting too close to Microsoft” to “what would you do if you weren’t running Dell”. Michael answered all of these questions head-on, without any kind of hesitation. What really came across very strongly was his deep passion for technology - he very clearly would rather spend time with his engineers than with his bean counters. In this respect his attitude was similar to Bill Gates’s, when I heard him speak last year (albeit in a more constrained forum). It was great to hear Michael speak - my congratulations and thanks go to BCS ELITE for organising this event.

Friday, March 13, 2009

The Enterprise in EA

One of the areas of contention in EA in anything but the smallest organisation is how to define the enterprise, i.e. the ‘E’ in ‘EA’. This may seem obvious - if you are in the business of manufacturing and selling widgets, then the enterprise is the business of manufacturing and selling widgets. However in these days of outsourcing, off-shoring and reconfiguration of value chains things are not so straightforward. Consider the following examples.

The first is Tesco. As a large and successful retailer its enterprise consists of taking products from suppliers and selling them to consumers via multiple channels (various store formats, online and catalogue). It provides back office functions in support of these activities. So what is the enterprise here? Are suppliers part of the enterprise? Are the different haulage companies used by Tesco part of the enterprise? Is the internal email system part of the enterprise?

As a second example consider a public sector organisation such as an NHS trust. It delivers care to patients, funded by the Department of Health. If it is a larger trust it may also undertake teaching and/or research. So does the enterprise in this case include the research systems? The teaching systems?

Finally, my own pet favourite example: a systems integration programme. Consider a prime contractor for the NHS National Programme for IT. A prime delivers the programme against the customer’s specification to the customer’s stakeholders. So in this case we have a confluence of the NHS enterprise, an individual trust’s enterprise and the prime contractor’s enterprise.

I think the first thing that becomes clear from these examples is the link between the enterprise and governance. In principle I could choose the enterprise to be anything I want, but it is meaningless if I have no ability to measure and influence conformance against my target architecture. This potentially means that the enterprise can be broader than a legal entity. For example Tesco could include elements of EA in the contracts that they agree with suppliers, e.g. use of standardised interfaces for communications, common business processes etc. Conversely it also explains the common situation of organisations creating EA teams but not providing any governance mechanism, which leads to a team that produces lots of good ideas which are then largely ignored by the rest of the organisation.

My second observation is more contentious: if the enterprise picture is highly complicated with multiple powerful stakeholders who have divergent interests, there is no point trying to have anything other than a trivial enterprise architecture, because parochial stakeholder interests will always defeat a federated, consensus-based governance model. In short, if it is a complex stakeholder environment, see what emerges rather than trying to impose a centralised EA.

Friday, February 27, 2009

Requirements and IT programmes

Requirements in any kind of IT-based engineering programme are difficult. The bigger the programme, the more difficult they are. This may seem paradoxical in a world in which Google, Amazon and eBay are delivering high-performance IT-driven businesses to massive audiences. Why is that?

I think there are three major reasons for this.

The first is the predilection for big-bang approaches with large programmes. This seems to be particularly prevalent in public sector programmes where IT is driving business transformation, such as Aspire, NPfIT and DII. A big-bang approach rules out scaling up the agile idea of getting something working and then improving on it in stages. Conversely, Google et al typically introduce new products and services initially to limited audiences as betas (e.g. Gmail) and only gradually increase functionality until sufficient stability for full release has been reached.

The next reason may be peculiarly British. The UK Office of Government Commerce has mandated the use of output-based specifications for procurement of large programmes. This is fine in itself, since this approach ensures focus is maintained on end-user business benefits. However output-based specifications are not engineering requirements, so delivery of a service cannot be measured against an OBS. Let’s consider an example to make this clear. The following requirement is from the OBS for the NPfIT, available on the Department of Health’s public web site.

The service shall be secure and confidential, and only accessible on a need-to-know basis to authorised users.

In the absence of precise definitions of “secure”, “confidential” and “need-to-know” this is a vacuous statement!

(Note that it may appear that I have selected this requirement as an extreme example in order to demonstrate the point, but in fact I chose this at random.)

I’m not suggesting that the requirements against which programmes are procured should go into engineering levels of detail, but conversely these requirements are inadequate as a basis for engineering delivery. The approach that I have seen at first hand, and which worked successfully, is to use such OBS-style requirements as the starting point for requirements elaboration. For a particular release of a service, the requirement should be elaborated to the degree that it is testable and implementable, and that elaboration should be agreed between customer and supplier as the basis for determining whether the requirement has been met or not. In long-term programmes the elaboration may alter over time (for example an encryption algorithm which was secure 10 years ago may not be secure today). The essence is that the OBS expresses the spirit of the requirements rather than the details; procurement requirements are not the same thing as engineering requirements.

The final reason is the use of COTS packages for delivery of such programmes. The challenge is that the delivered requirements depend totally on the capability of the COTS package. This is most acute for business processes, since these tend to be intrinsic to specific packages. This is another reason why ideas from agile development cannot be reused. Also, picking up on the previous point, the vagueness of an OBS means that the customer and COTS vendor could have very different (but equally valid) interpretations of what a requirement means.

Is this a real problem? Well, judge for yourself the success of these large programmes...

Wednesday, February 18, 2009

Building Architecture vs Aeronautical Engineering

Building architecture is often used as an example of what IT architecture should aspire to. There are a number of reasons for this: for a start, the term “architecture” is normally associated with buildings, and has really been adopted by IT in parallel with the emergence of the new discipline of abstracting large-scale systems in order to be able to understand them. This close relationship with building architecture has been cemented by the work on architecture patterns, which takes as its starting point Christopher Alexander’s work on patterns in building architecture.

There are certainly some similarities between building and IT architecture. Both use tools of abstraction to manage complexity; for example building architecture uses plan and elevation drawings to understand the structure of buildings, and mathematics to understand how to construct them. IT uses architecture views (such as those in the TOGAF framework) to understand what needs to be built, and a variety of tools and processes in order to build these systems.

But what about after construction? How well does the metaphor hold then? I think at this point it falls apart; buildings are by and large static, and maintenance is typically restricted to superficial changes such as painting and decorating. It is unusual for buildings to go through fundamental reconstruction. On the other hand, IT systems are living beasts, subject to constant change of varying degrees. Sometimes it is the addition of a minor feature or a new interface; other times it can be fundamental re-architecting of the entire system in order to accommodate new requirements, or original requirements which were not properly understood.

In practice this means that in the case of building architecture, design documents and blueprints will gather dust post-construction, whereas for IT systems it is critical to have an up-to-date ‘as-maintained’ documentation set. (That said, in my experience organisations that do maintain such documentation are the exception rather than the rule.)

I think a better metaphor is aeronautical engineering, where the discipline involved in maintaining up-to-date documentation for in-life aircraft, associated systems and components is quite incredible. I was struck by this years ago when I worked on a project with Boeing re-engineering a tool they use - Wiring Illuminator - which helped maintainers to understand the individual wires and connected end-points in aircraft. Subsequently I worked on JSF, where the life history of every single component was being maintained. Note that I am treading well-trodden ground here: over 10 years ago Ross Anderson was pointing out that IT could learn much from the way that the aircraft industry deals with safety-related issues.

As the discipline of IT architecture develops I fully expect that the need to capture high quality ‘as-maintained’ documentation will be critical. Tim O’Reilly shrewdly observed that a key industry differentiator in the future will be service management organisations; I would add to that: it will be service management organisations who excel at ‘as-maintained’ documentation and baseline management.

Wednesday, February 11, 2009

What Should Go In The Cloud?

Cloud computing is all the rage. Vendors are falling over themselves to offer services from the cloud. Analysts are proclaiming that the cloud is the next big thing. So given that cloud-based services provide economies of scale that most businesses can’t dream of, we should be pushing all of our services into the cloud rather than provisioning them ourselves, right?

Let’s consider an example from the last programme that I worked on. In that programme a national single sign-on (SSO) solution was provided as an external service (in the cloud, in fashionable parlance). This ensured a single identity across NHS organisations, allowing users to cross organisational boundaries seamlessly. Great idea. One minor problem: if for any reason this external service was unavailable, users were not able to log in to any of their applications, and users already logged in had their sessions terminated. Unavailability of that single service impacted all other business applications.

What seemed like a great idea at the time did not really stand up to scrutiny in practice. Of course hindsight is a great tool, so I am not criticising the original design choice but trying to learn from it. This example makes it obvious that not everything should be in the cloud - operational considerations need to be traded off against financial benefits. So how do we decide what should go in the cloud and what we should deliver ourselves?

There are a number of dimensions to this. The first consideration is the business’s value chain. Any secondary activity in the value chain is a candidate for delivery via the cloud - for example, HR systems, intranets etc. What about primary activities? Instinctively these should be delivered internally. But if that is the case, how is it that salesforce.com has been so successful? I think the answer is deeper: primary activities should be delivered from the cloud if the cloud can provide greater levels of quality and reliability than would be possible by delivering them internally. So for a large, mature organisation with a sophisticated IT operation, delivering CRM internally might make sense. For other organisations CRM via salesforce.com might make sense even though this is a primary activity for the organisation.

Returning to my SSO example then, for those NHS organisations for whom SSO is too complicated a task it makes sense to deliver this from the cloud. For larger, more sophisticated NHS organisations, internal delivery of SSO might be appropriate. That just leaves the problem of interoperability...for a later blog!

Thursday, February 05, 2009

Massive IT Programmes

I have just recently changed jobs, joining Cognizant’s Advanced Solutions Practice, having spent the last three and a half years working for BT as Chief Architect on the NHS National Programme for IT. Moving on from that role has given me the chance to reflect a little on some of the challenges that I faced in that role.

The programme is frequently in the press, and has been labelled a classic example of a failing IT programme. Though the press coverage has in general been ill-informed and inaccurate, there have undoubtedly been problems with delivering the programme, for many reasons which I will not get into here. However some general observations can be made about massive IT programmes.

One of the greatest challenges in programmes such as this one is the sheer size of the change involved, in terms of both business and technology. The traditional programme and project management approach to dealing with the complexity that this scale brings is to follow a reductionist strategy, breaking the overall programme into smaller manageable parts. The difficulty with this lies in choosing how to slice and dice the large problem. Executed correctly, this approach allows application of traditional programme management and systems engineering techniques to ensure delivery within acceptable parameters of cost, schedule and risk. The downside is that if the overall problem is divided incorrectly, the small parts so obtained are as difficult to deliver as the overall programme. Moreover this approach assumes that such a division is possible.

What alternatives are there then? That is a difficult question to answer, since this is really an embryonic and immature field. Historically the approach taken was to execute a small-scale pilot programme and then scale this up to the size of the large programme, but that takes time and can cause loss of momentum. An alternative would be to take an evolutionary approach, similar to some agile approaches to software development: execute a solution with acknowledged flaws, and evolve it via a series of small iterations into a solution that is ‘good enough’ to satisfy the key stakeholders of the programme.

Tuesday, January 20, 2009

Enterprise vs Solution Architecture

Following on from a previous blog, one of the things that I often see confused is the difference between enterprise and solution architecture. In particular I often see people confuse solution architecture with technical architecture, and enterprise architecture with solution architecture.

Wikipedia provides the following definition:
Enterprise architecture is a comprehensive framework used to manage and align an organisation's business processes, Information Technology (IT) software and hardware, local and wide area networks, people, operations and projects with the organisation's overall strategy.
Note some of the key elements of this definition. EA is about providing a framework that helps to align business processes, IT and people with the overall strategy of the organisation. This is typically captured by considering business process, applications, information and technology as independent views of the enterprise, and then mapping out how they will evolve over time (e.g. via the use of roadmaps, technical strategies and reference architectures). This can be depicted graphically:


Solution architecture is about delivering a project at a particular point in time. In the happy days scenario governance structures are in place to ensure that solution architectures are perfectly aligned with the enterprise architecture, so we can think of each solution architecture as being a ‘snapshot’ of the enterprise architecture at a particular point in time. Again we can model this graphically:



The reality is rarely as clean as the happy days scenario. Governance is more typically a mechanism for identifying divergence from the enterprise architecture, rather than a means of enforcing it, since delivery projects are normally under massive pressure to do the bare minimum to ensure delivery, rather than think of the longer-term implications of the choices made. Solution architectures are thus often misaligned with the EA at best; in some cases they are totally at odds with the EA. This is shown below:



More on governance in the future...

Tuesday, July 22, 2008

Business vs IT - Enterprise Architecture

I had the pleasure of attending a breakfast seminar this morning, organised by glue:. The subject of the seminar was "Managing IT Enabled Change". The theme was the idea of a common language to allow enterprise architects to talk to the business. The session was kicked off by glue:'s CEO Gareth Lloyd, who gave a very clear presentation of the context and in particular explained the need for a common language. After that Ceri Williams, also of glue:, provided some explanation of the meaning of a common language and how it fits in with EA, business strategy, business transformation and programme management. This was underpinned by Jes McPhee from Troux Technologies, who demonstrated how their tool can be used to support change analysis in support of business objectives. Finally Daren Ward, business architecture principal at Marks and Spencer, provided some real-world feedback on the use and applicability of business architecture.

While it was all very interesting and I enjoyed the presentations and the interchange of ideas, I had, and still have, a fundamental problem with the thesis presented. For me, enterprise architecture is precisely about ensuring that IT supports the overall business in achieving its goals. The glue: presentations, deliberately or unwittingly, talked about IT and 'the business' as though they were separate functions; in today's competitive environment it is not possible to get competitive advantage on any significant scale without having IT at the core of the business - 'the business' and IT should be indivisible.

That said, EA has largely been driven by the IT side of the world, so there is still work to do in evangelising the EA cause in organisations. And better tool support that talks about business problems and business issues, rather than servers and networks, is only to be welcomed.

Sunday, June 15, 2008

The Resurrection of the Power Mac

For the last four years I have used a Power Mac G5 with dual 2.5GHz processors as my main machine at home. During this time the machine has been pretty much bullet-proof; it stays turned on most of the time (though in sleep mode when I am at work); the only time it has been switched off is when I have moved house! The one alteration I have made to it was to add an additional hard disk last year.

Last Tuesday, without warning, the machine stopped - it was not even a graceful shutdown. When I tried to restart it, though I could hear the fans spinning, no video signal was generated. I tried the various boot options, e.g. single-user mode, boot from optical drive, none of which worked. Stumped, I went to bed to think about it overnight. In the morning when I tried again, it booted first time. I put it into sleep mode and went to work. When I returned in the evening I started using it, and after 20 minutes or so it put itself into sleep mode. It repeated this several times until it just stopped again.

I decided to go back to basics so I disconnected all devices except the screen, keyboard and mouse, in case there was a hardware conflict causing the problem. That didn't help. At this point I was starting to get desperate, and concluded that there was probably a hardware fault on the machine. I therefore contacted a local company who are authorised for Apple service, and they suggested I bring the machine in and they could run some diagnostics for me. I couldn't do it until the weekend so I had a couple of days to try to resolve the issue myself.

In a vain attempt to do something useful I opened up the case and was greeted by several thick dust bunnies. There was nothing visibly wrong internally, but embarrassed at the thought of the service engineer seeing the amount of dust in the case, I put a brush nozzle on the end of our vacuum cleaner and sucked out all of the dust. When I looked more carefully I could see that the air ducts onto the CPU cooler (it is liquid-cooled) were totally clogged up, so I used the vacuum cleaner to remove most of this dust. Feeling satisfied that I would not deliver a dust-filled machine to be repaired, I put the case back together and tried one last time to start it up.

Hey presto, it worked! That was 4 days ago, and I have not had a problem with it since then. The conclusion I have reached is that the CPUs were getting too hot, which was causing the machine to shut down and then to refuse to start up until they had cooled down. I am guessing this is a feature built into the chipset to prevent permanent damage. I have therefore installed Temperature Monitor so that I can avoid the problem in the future.

It gives a whole new meaning to the phrase "clean down the machine"!