Monday, August 22, 2011

The difficult thing about back-ups

 

My back-up drive failed recently.

It was a lucky coincidence that I noticed this; despite grand intentions I’m lazy when it comes to backing things up and only think about it when I am panicking about having typed ‘rm -rf’ in the wrong directory.

The drive failed when I was moving it across my study; I powered it down, unplugged it, moved it, plugged it back in and then fired it up again. Except it declined to fire up. It made a few pathetic wheezing sounds and gave up. After a few days of online searching I admitted defeat and contacted LaCie, who eventually acknowledged it was faulty and issued an RMA. Four weeks later I had a repaired, working drive in place (albeit having lost all of the original data). I am now able to return to my previous state of blissful ignorance.

The point of this little story? In a nutshell I think this summarises many enterprises’ attitude to back-ups. I know from personal experience of two organisations that notionally had a standard back-up policy with regular full and incremental back-ups. When the back-ups were needed, they could not be retrieved because the back-up hardware had failed.

Then there is the recent case of Amazon. Running infrastructure as complicated and sophisticated as they do is fraught with risk, so if I were one of their customers I would probably be thinking about having an iron-clad business continuity plan in place, but apparently even here back-up complacency reigns.

Why is this a big deal? Normal production systems are tested thoroughly prior to go-live and then tested on an on-going basis through production use. Any problem will be automatically detected or signalled by a user fairly quickly. However since back-up systems are only invoked by exception, the first time you know there is a problem is when you need them. The answer? Well, regular testing of a full restore from back-up seems like the obvious solution. There are some products on the market which claim to help but I remain somewhat sceptical of their efficacy.
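As a rough illustration of what "testing a full restore" could mean in practice, here is a minimal Python sketch. It assumes you have already restored the back-up into a scratch directory using whatever tool you normally use; the function names `hash_tree` and `verify_restore` are made up for this example. It simply compares SHA-256 checksums of every file in the live directory against the restored copy.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_restore(original_dir, restored_dir):
    """Return (missing, corrupted) file lists for a restored copy.

    missing   -- files present in the original but absent from the restore
    corrupted -- files present in both whose contents differ
    """
    original = hash_tree(original_dir)
    restored = hash_tree(restored_dir)
    missing = sorted(set(original) - set(restored))
    corrupted = sorted(
        name
        for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, corrupted
```

Run on a schedule, a check like this turns "the first time you know there is a problem is when you need them" into an alert. It reads whole files into memory and will flag files that have legitimately changed since the back-up was taken, so treat it as a starting point rather than a finished tool.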

Am I eating my own dog food? Alas I have regressed to my pre-failure days and have adopted the macho “my hardware never fails and I never type rm -rf by mistake” attitude. Some people never learn.

Wednesday, July 06, 2011

Government ICT Strategy

The recently published government ICT strategy makes interesting reading. The government is clearly trying to learn some of the lessons of previous failures, albeit without really getting to grips with the underlying reasons for these failures.

One of the government's explicit strategy statements is particularly striking:
“The adoption of compulsory open standards will help government to avoid lengthy vendor lock-in, allowing the transfer of services or suppliers without excessive transition costs, loss of data or significant functionality.”
Will open standards really do this?

Let's turn this on its head; what are the typical reasons for vendor lock-in? Here is my starter for 10 in no particular order:
  1. The software offers must-have features which competitors do not have.
  2. The organisation using the software has adapted its business processes to fit with how the software works.
  3. The organisation using the software has made a considerable investment in training its staff in how to use the software.
  4. The software is integrated with other systems that the organisation uses.
  5. Migrating data out of the software is difficult.

Which of these will open standards help with? As far as I can tell only number 4; use of open standards in interface specifications should in principle allow standards-compliant components on either side of the interface to be substituted for one another. I say in principle because for any reasonably sophisticated enterprise system, an interface will be a key part of a business process, which will one way or another be organisation specific. It is typically a non-trivial task (in some cases impossible) to substitute another component into this business process without affecting the process itself, which leads back to item 2 in the above list.

Don't get me wrong: open standards are great and to be applauded - I glory in my ability to choose my browser according to my mood. However let's not kid ourselves that by adopting them we are going to see a public sector IT world free of Oracle and MS Office any time soon.

Tuesday, December 21, 2010

The Assumption of Independence in the Financial Systems Failure

I have recently been spending a lot of time reading some of the plethora of books that either provide inside accounts of the 2008 failures in the banking sector (Lehman Brothers and Bear Stearns) or try to analyse their causes. Though I’m not an economist by any stretch of the imagination, having studied financial strategy I have developed something of a morbid fascination for this topic, a little along the lines of an episode of Columbo, where you know the outcome but the interest comes from finding out how Columbo will prove the perpetrator’s guilt. In the case of the 2008 crash it is common knowledge that it was caused by bankers taking risks purely to maximise their own bonuses, isn’t it?

Going beyond the superficial mass media level reveals something slightly more interesting. There was certainly unjustifiable risk taking but this in itself ought not to have caused the systemic failure that occurred. One of the early failures which set the dominoes falling was the collapse of two hedge funds dealing in derivatives run by Bear Stearns. Right until the point at which these funds were liquidated the fund managers were maintaining that the funds were diversified, so their exposure to sub-prime mortgages was limited. However subsequent investigation showed that in fact over 70% of the cash invested in these funds had been spent on mortgage-backed derivatives. This is important as it is a key tenet of investment strategy that funds should be diversified so that losses in one area are compensated by gains in other areas. Diversification fails as a strategy when losses in one area trigger losses in another area, i.e. even though a portfolio may be diversified there may be dependencies between its holdings. This is what happened in 2008: losses in sub-prime triggered losses in other areas, leading to large scale failures in supposedly resilient diversified funds.

So why bring this up in a blog supposedly devoted to technology? Having the memory of an elephant, I was reminded of a couple of papers that were published in the 80s. The first paper, “The N-Version Approach to Fault-Tolerant Software”, looked at how software risk could be massively reduced by copying the idea of hardware redundancy in software, a technique known as n-version programming. Basically the idea was that for a high integrity system the software should be independently written multiple times and then control logic would execute all of the versions in parallel, following the majority vote at each decision point. This was followed up by another paper, “An Experimental Evaluation of the Assumption of Independence in Multi-version Programming”, which challenged the hypothesis at the heart of n-version programming: that a failure in one version would be independent of a failure in another version. This is a reasonable assumption in hardware since failures are typically caused by physical characteristics rather than common design flaws. However this latter paper demonstrated empirically that this was not a safe assumption for software and therefore n-version programming was dead.
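To make the independence point concrete, here is a small Python sketch of majority voting plus a back-of-envelope failure calculation. It is a toy model of my own, not the method from either paper: the parameter `c` (the chance that a shared fault makes every version fail together) and all of the function names are assumptions made purely for illustration, and the arithmetic assumes an odd number of versions.

```python
from collections import Counter
from math import comb


def majority_vote(outputs):
    """Return the value produced by more than half of the versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def p_majority_wrong(p, n=3):
    """Probability that more than half of n versions fail on an input,
    assuming each fails independently with probability p (n odd)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


def p_majority_wrong_common_mode(p, c, n=3):
    """Toy dependent-failure model: with probability c a shared fault makes
    every version fail together; otherwise failures are independent."""
    return c + (1 - c) * p_majority_wrong(p, n)
```

With three versions that each fail on 1% of inputs, `p_majority_wrong(0.01)` comes out at roughly 0.0003, which is the attraction of the technique. Add even a 0.5% chance of a common-mode fault and `p_majority_wrong_common_mode(0.01, 0.005)` is dominated by `c` itself: the voter can do nothing when the versions are wrong together.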

Fast forward 25 years and what do we see? The 2008 crash was effectively the result of dependent failure in a system which assumed failures were independent. Spooky eh? Normally software mimics life but in this instance software technology seems to have gone there first!

Tuesday, September 21, 2010

Offshoring

I have recently changed jobs (more of the new job in a separate post). For the last 18 months I have worked as an Enterprise Architect for Cognizant. I joined in order to help Cognizant grow by delivering large and complex solutions, as a way of differentiating the company from other offshore players such as TCS, Wipro and Infosys. This strategy was also intended to close the gap to the existing SIs such as Accenture and Cap Gemini. Here I give my own opinion about what I experienced over these months.

The reality of the role was somewhat different to the vision described above. Cognizant, like the other Indian offshore players, is extraordinarily good at its core competencies: testing and application support. It is a market leader in these areas, which has brought sustained, significant growth while maintaining a healthy margin. Clients recognise this and associate this value proposition strongly with Cognizant. The problem then is that it is very difficult for these offshore players to reposition their brand in order to move towards the premium end of the market, which would open up much more lucrative consulting opportunities as well as systems integration programmes. Clients do not associate the offshore players with premium services, and half-hearted attempts to educate clients by offering premium services at offshore rates only serve to reinforce the budget branding of offshore players. As an analogy, if Ryanair launched a premium business class service at the same rates as BA, how much market share would they realistically win from BA? In retrospect this strategy is fundamentally flawed unless it permeates the entire organisation so that the culture of the company moves towards the quality levels of the supposed premium players they would like to topple. How many companies are going to risk such a major culture change when they are growing by 10-20% a quarter?

Therein lies the reason for me moving on to a new role: when it comes down to it I can’t get excited about testing or application support.

Friday, July 30, 2010

Open Source Confusion

 

I have dabbled with open source for many years, both as a user and briefly as a developer. I personally like the idea that if there is a problem with the software I can fix it myself rather than having to wait for the vendor to fix it. I have therefore read with interest some recent thoughts about open source, in particular in two forums.

The first is the discussion triggered by the Department of Health’s decision not to continue its enterprise-wide agreement with Microsoft. The second, coincidentally at the same time, is the latest issue of IT Now, which concentrates on open source.

What I have found interesting reading these articles and posts is the amount of confusion and misinformation about open source. Here I add my own thoughts to these discussions.

  1. Open Source is cheaper than closed source

    This is a classic misconception. While it is often the case that open source does not have the same initial license cost as closed source solutions, proper comparison of the costs of the two requires analysis of the respective total cost of ownership. For example in a typical corporate situation key infrastructure components require support in line with the organisation’s business needs. In a closed source situation the software vendor typically provides this as part of their maintenance agreement; in an open source situation, since there often isn’t a software vendor as such, a 3rd party organisation must provide this support. The organisation procuring such an open source solution must satisfy itself that any such support vendor has sufficient competence and expertise in the software to be able to support it. A good example of such an organisation is Red Hat who provide support for Red Hat Linux (amongst many open source products). The key point here is that lifetime costs including training, support, upgrades etc must be included in the TCO calculation.

  2. Open Source is less secure than closed source

    This is somewhat more contentious. I have previously heard this used as an argument (by non-technical people) for not using open source. I would argue that open source solutions are more secure than closed source, since the opportunity for unlimited peer review of open source code significantly reduces the risk of security vulnerabilities persisting, compared to closed source solutions which effectively rely on security through obscurity. The open source approach is similar to the practice in the cryptographic community of peer review of crypto algorithms.

  3. Open Source is easier to modify than closed source

    It is self-evidently true that in principle anyone can modify open source code. However in practice modifying open source code is not for the faint-hearted: these are often complex and sophisticated enterprise applications. It’s fine for Yahoo and Google engineers to modify open source software since their businesses are based on software. However for organisations for whom software is an enabler rather than a core business asset, such sophisticated software development will not typically be a core competence. For such organisations self-modification of code is not really an option unless there is a desire to diversify the business into software development! For example this means that most adopters of Open Office are unlikely to modify the code themselves.

  4. Open Source is supported by dedicated individuals who freely give up their time

    There are undoubtedly many dedicated developers who give up their own time to write or modify code for open source applications. However there are also many open source products for which major chunks are developed by large organisations with salaried employees. Red Hat is an example of this. Similarly Yahoo contributes to many open source projects based on the work that their salaried engineers perform. Daniel Pink’s idealised view of open source as being the output of individuals motivated not by normal corporate rewards isn’t totally accurate.

Thursday, March 04, 2010

Software Patents Gone Mad?

Is it just me or is Apple trying to claim a patent for pub/sub? See this article. According to this Apple is claiming a patent over

“A system in which a software module called an event consumer can indicate an interest in receiving notifications about a specific set of events, and it provides an architecture for efficiently providing notifications to the [event] consumer”
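For anyone who hasn't met pub/sub, the generic pattern fits in a dozen lines of Python. This is purely my own illustrative sketch of publish/subscribe, not a reading of what the patent actually claims; the `EventBroker` class and the event names are invented for the example.

```python
from collections import defaultdict


class EventBroker:
    """Minimal in-process publish/subscribe broker."""

    def __init__(self):
        # Maps an event type to the callbacks interested in it.
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        """An event consumer registers its interest in one kind of event."""
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload):
        """Notify every consumer subscribed to event_type; return how many were notified."""
        callbacks = list(self._subscribers.get(event_type, []))
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


broker = EventBroker()
received = []
broker.subscribe("file.saved", received.append)
broker.publish("file.saved", "report.txt")   # delivered to the subscriber
broker.publish("file.deleted", "old.txt")    # nobody subscribed, so ignored
```

A consumer "indicates an interest" by calling `subscribe`, and the broker is the "architecture for providing notifications".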

What is interesting is that the pretenders to Microsoft’s crown are now exhibiting the same kind of behaviour for which Microsoft used to be criticised.

Friday, February 26, 2010

Are Standards Good for Consumers?

Last week’s Mobile World Congress produced the interesting announcement that a number of industry members are joining together to form an industry association (Wholesale Applications Community, or WAC) dedicated to providing a common application platform for mobile phones. This, combined with Bruno’s thoughtful blog on Apple and Flash/Java got me thinking...

According to the announcement “The alliance's stated goal is to create a wholesale applications ecosystem that – from day one – will establish a simple route to market for developers to deliver the latest innovative applications and services to the widest possible base of customers around the world.”

It is interesting that this is an operator-led initiative; none of the major platform vendors (Apple, Microsoft, Google or Nokia) are currently involved in this. A simple interpretation of this could be that it is an attempt by the operators to reclaim the initiative as the services they provide are effectively commoditised with industry differentiation being provided by the mobile device platforms provided by Apple et al. These mobile device platforms provide the features that enable the rich ecosystem of applications which has created a whole new sub-industry. If the operators don’t get a piece of this, they will be consigned to building masts and sending bills until they go the way of the dinosaurs.

However it also raises an interesting broader technology issue: does the consumer benefit from this standardisation? Discussions about technology standards almost always invoke the example of VHS vs Betamax as the rationale for the consumer benefits of standardisation. However doesn’t standardising the application platform take away the ability of the device manufacturers to differentiate themselves by providing distinctive features? If the argument works for mobile devices, why not for laptops? Desktops? Servers? If it’s such a great idea, why did MSX fail?

To my mind the key difference is whether we are talking about functionality or content. VHS vs Betamax was important to consumers because they wanted a standard content delivery mechanism - as long as the machine could play the content, essentially the functionality of the machine was irrelevant. So standards for defining and delivering content are good for the consumer. (HTML is another good example of this.)

Standards for functionality restrict the features available to consumers, creating monopolies and stifling innovation. This is bad for consumers. Having a diverse market for mobile device platforms is therefore very much to the benefit of consumers. Standardising this platform would be bad for consumers.

By the way, in case no-one told the members of WAC, they are reinventing the wheel - they should check out Java.