Latest Post By Martin

TechCrunch

Wow. T-Mobile and Danger, the Microsoft-owned subsidiary that makes the Sidekick, has just announced that they’ve likely lost all user data that was being stored on Microsoft’s servers due to a server failure. That means that any contacts, photos, calendars, or to-do lists that haven’t been locally backed up are gone. Apparently if you don’t turn off your Sidekick and make sure its battery doesn’t run out you can salvage what’s currently stored on the device, otherwise you’re out of luck: Microsoft/Danger is describing the likelihood of recovering the data from their servers as “extremely low”.

T-Mobile Sidekick users have been suffering from a major outage all week, and that issue apparently hasn’t been resolved either.

This goes beyond FAIL, face-palm, or any of the other internet memes we’ve come to associate with incompetence. The fact that T-Mobile and/or Microsoft Danger don’t have a redundant backup is simply inexcusable, especially given the fact that the Sidekick is totally reliant on the cloud because it doesn’t store its data locally. Microsoft acquired Danger for $500 million in February 2008.

I asked my wife what she thought about this and got the response I expected: “What do you mean they don’t have a backup?” It’s an issue that comes up inside and outside the enterprise. Whether it’s the fault of the project manager or of the guy who signed the ‘risk form’ (“no, we don’t need it backed up”), most people simply expect data to be backed up; to them it’s ‘core’, so to speak. That said, here are a few issues to consider before you agree with my wife:

  • How much data was involved in the backup?
  • How long would the backup take?
  • How long does an actual restore take, and how easy is it to do?
  • Consider that from a service delivery standpoint, there is increasingly no concept of downtime.
  • Internal process – we don’t back up development systems, and then a system gets ‘made production’. Who checks that the backups are in place when that happens?

Colleagues and I have experienced this on many occasions. Consider that even for a few thousand users the amount of data could be very significant, and that the business may have assumed users would not accept the downtime needed to take a proper backup: for example, switching off all services for 48 hours, running the backup and then doing the work. At the same time, we have to understand that it can actually be easier to rebuild the system than to spend forever trying to recover data. It’s the “I can rebuild your server in 3 hours, or we can spend the next 72 trying to restore it, what’s your preference?” conversation.
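To put rough numbers on “how long would the backup take?”, here is a minimal Python sketch. The data volume, throughput figures and the 48 hour window are assumptions chosen for illustration (only the 48 hours comes from the example above), and the function names are hypothetical.

```python
def backup_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream data_gb gigabytes at a sustained rate in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1024 / throughput_mb_s  # 1 GB taken as 1024 MB
    return seconds / 3600


def fits_window(data_gb: float, throughput_mb_s: float, window_hours: float) -> bool:
    """True if a full backup completes inside the agreed downtime window."""
    return backup_hours(data_gb, throughput_mb_s) <= window_hours


if __name__ == "__main__":
    # Assumed figures: 10,000 GB of user data and three candidate throughputs.
    for rate in (50, 100, 200):
        hours = backup_hours(10_000, rate)
        print(f"{rate} MB/s: {hours:.1f} h, fits 48 h window: {fits_window(10_000, rate, 48)}")
```

The same arithmetic applies to the restore, which is often the slower direction, so it is worth running the numbers for both before agreeing a window with the business.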

It’s easy to sit on the outside and ask questions or say what should have happened. Regardless, consider that with a large user base and ever-growing data volumes, it is increasingly challenging to keep the backup infrastructure in line with your data. Everyone wants data and everyone wants service, but just watch the reaction when you say “and what about the backups?” People only appreciate the importance of backups when they’re needed. In many an enterprise or organization, backups are typically a cross-platform, cross-business-line service with no budget and no investment until it stops working. There’s more technical coverage on this site.

With no downtime (the 24 hour business), even simple tasks start to carry more risk. Just look at what the IT guys go through when adding disk space to their cluster: if users were told the actual risks, they might not sign off the risk or accept the change. For example, in allocating more storage to a Windows cluster, we could see the following risks:

  • Data loss or corruption
  • Cluster issues – some resources won’t start or don’t operate as expected
  • Cluster doesn’t work, or won’t operate as a cluster due to issues relating to the resources/quorum
  • Server might not boot when accessing the storage – a known issue with specific fibre card drivers and Windows 2003 service packs.

Green Tech Media

Greenpeace might have waited about a month, and saved itself the paint.

It was in late August that Greenpeace trespassed on Hewlett Packard’s Palo Alto, Calif., headquarters to paint the words “hazardous product” on the roof – an attack on HP’s failure to meet deadlines to reduce polyvinyl chloride (PVC) and brominated flame retardants (BFRs) in their products.

What Greenpeace didn’t know was that HP was about to come out with a nearly BFR and PVC-free notebook, the ProBook 5310m, which launched in mid-September.

That’s one big reason that Greenpeace upped HP’s score on its Guide to Greener Electronics report released Friday. Now Greenpeace is applauding HP for pushing the rest of the industry on a similar course.

Check out this article from the Green Tech Media team. It’s an interesting read and got me thinking again about the green message, both in terms of green IT and the recycling of devices. It’s a worthy conversation to be having, and I’d love to see what the vendors (AND END USERS) can do to achieve more. Could we not have more trade-in schemes and easier recycling on a local basis, but with less of the nonsense? By that I mean: I have a set of headphones with one broken earphone, so who do I send them to? If it’s an iPod I can send it to Apple. But phone up my local recycling place and ask about a mouse, a computer speaker set, power cables or headphones, and you get a bit less of a response.

  • We need to make it outrageously easy to recycle your computer and all your electronic devices
  • We need to encourage users to recycle their old computers rather than collect them (I’m guilty of this). Stop the legacy IT collecting, let us recycle/re-use, and let us reduce the life span of computers. From a support and an energy efficiency standpoint, there is no reason to keep using 10-year-old PCs.
  • We need to be more encouraging to the vendors and highlight where they’re making progress
  • We need to start looking, as difficult as it is, at applications/roles/services. On a very simple level, for example, take bank ATMs: can they not be flatscreen, with a low power footprint and the facility to turn off automatically when not required? Before we start looking at PUE values, can we not evangelize the basics (switch off the lights, switch off the PC, end standby TVs) and look at how we can integrate this not only into our products but into end user best practice?
  • Now for the emotional statement. Retrospective recycling – can we have someone visit those countries, those organizations and enterprises with their legacy working and broken technology, recycle and deploy new technology? Will it cost money? Absolutely, but think of the revenues created, the opportunities created with new technologies, and the environmental benefit of collecting all those 286s and older sitting around the developing world, which might not be useable or in use.
  • Can we start transforming the green IT message throughout the business into the nature of how we do business: video conferencing, solar power where possible, recycling paper and everything else? Can we look at our processes and see how we can do more with less, reduce our environmental impact, reduce our cost and improve revenues?

NetApp

Research Triangle Park, NC—October 7, 2009— Today, NetApp (NASDAQ: NTAP) will hold a grand opening celebration to showcase its new energy-efficient dynamic data center located at the NetApp technology center in Research Triangle Park (RTP). The celebration also marks the 10-year anniversary of NetApp in the RTP region.

The grand opening ceremony event will be attended by Keith Crisco, secretary of the North Carolina department of commerce; Tony Caravano, deputy state director for U.S. Senator Kay R. Hagan; Alexandra Sullivan, technical development manager, U.S. Environmental Protection Agency, ENERGY STAR; and key NetApp executives.

It’s always interesting to read about organizations building green or energy-efficient data centers, and great to see what combinations of hardware, software, best practice and environmental steps they use to minimize data center power consumption and improve efficiency. The article illustrates what is possible with the right combination of activities to reduce power: raising the temperature of the data center and fresh air cooling are examples, as are choosing more efficient hardware and optimizing the operating system and application.

We need to be taking a top-down approach across the platforms and business units. What’s eating our power and our resources? What barriers to success are we hitting in reducing our consumption-to-profit ratio (so to speak)? What changes to operations or process do we need to make?

It’s often the simple things that lead to bigger savings. Because Janet has no trust in the backups, she keeps everything saved on the file share and never deletes files, which means IT has to provide more storage. For a small business storing Word documents that might not be a big deal, but take it across an enterprise and you could find that one aspect of your IT, your backups/restores, is costing you millions in storage. With Snapshot copies and tiering of the data, where we keep only what we need directly online and recover the rest where necessary, we can reduce what you require to a minimum, reducing power and storage cost.

We need to look not only at the data center but at the server, the blade, the database, the network and, crucially, the application. We need to centralize data flow between applications. I was speaking to one IT manager who had two databases doing similar things that had to be kept separate for political reasons, so there were two sets of data, two sets of backups to perform and two sets of storage to provide. There will always be the Chinese Wall scenario, where one set of data can’t sit on the same platform as another, but I wonder whether we could be smarter with our development and reduce unnecessary data duplication before it’s made. Could we not look at the data flows and the application architecture, and reduce our data transfer and data storage to a minimum? Copying the trades to my deal capture system for processing, and then submitting them to my back office team, means that in reality we have three sets of deal/trade data, until one of the systems runs out of space and a guy deletes ‘the old logs’ or runs the ‘clear down script’.
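As a rough way to see how much duplicated data is already sitting on a share, files can be grouped by content hash. The Python sketch below is illustrative only: the function names are hypothetical, and it reads whole files into memory, which is fine for a sample directory but not for terabytes.

```python
import hashlib
import os
from collections import defaultdict


def group_by_digest(blobs):
    """Map SHA-256 digest -> sorted names, keeping only digests seen more than once."""
    groups = defaultdict(list)
    for name, data in blobs.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return {digest: sorted(names) for digest, names in groups.items() if len(names) > 1}


def read_tree(root):
    """Read every file under root into a {path: bytes} dict (sketch only)."""
    blobs = {}
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                blobs[path] = handle.read()
    return blobs


if __name__ == "__main__":
    # "." is a placeholder; point it at a sample of the share you want to inspect.
    for digest, names in group_by_digest(read_tree(".")).items():
        print(digest[:12], names)
```

A report like this does not fix the architecture, but it gives the conversation with the business some numbers to start from.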

http://technet.microsoft.com/en-us/library/cc288335.aspx

Microsoft Windows SharePoint Services provides the ability to restrict certain kinds of files from being uploaded or retrieved, based on the file extension. For example, a file with the .exe file extension could potentially contain code that runs on client computers when it is downloaded. Because it has the .exe file extension, the file can be run on demand when it is downloaded. If files with the .exe file extension are blocked, users can neither upload nor download a file with the .exe extension, and potentially dangerous content in the .exe file cannot be downloaded. This feature does not prevent all exploits based on file types, nor is it designed to do so.

Check this out if you’ve had any problems uploading specific file types to your SharePoint portal; you need to change the list of blocked file types in Central Administration.
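The idea of blocking by file extension is simple enough to sketch in a few lines of Python. This is not SharePoint’s implementation or API, just an illustration of the check described above; the blocked list here is made up for the example and is not the actual SharePoint default.

```python
import os

# Illustrative list only; not the actual SharePoint default blocked list.
BLOCKED_EXTENSIONS = {"exe", "bat", "cmd", "vbs"}


def is_blocked(filename, blocked=BLOCKED_EXTENSIONS):
    """Return True if the file's final extension is in the blocked set."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in blocked


if __name__ == "__main__":
    for name in ("setup.EXE", "report.docx", "notes"):
        print(name, "blocked" if is_blocked(name) else "allowed")
```

As the Microsoft text says, a check like this looks only at the name, so it does not prevent all exploits based on file types; renaming a file changes the answer.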

I was having a conversation with Jim, a ‘common services’ manager at a midsize financial organization in the city. I’d met him for lunch and asked what he was currently working on, and with that I’ll let the story below unfold: the organization is looking at its file services issues and working out what to do next.

It all came as a result of an issue they’d been having with data. You see, the organization had one file cluster: two Windows servers in a cluster with terabytes of data shared over several drives, each up to 800GB, each drive corresponding to a business line or department.

The business had requested more storage. Jim’s response was that they needed to deploy a new cluster with more storage and failover, and then deploy data de-duplication technologies to sift through the data and archive/delete what wasn’t needed.

Below is the conversation we had; I’ve removed any customer/business specific information:

Is the data application or business line specific?

No, it’s shared file data which is hosted on our shared file cluster server, so the J drive might host IT and IT development data. Each share corresponds to a business line, but we don’t run applications off the file server, except for a few specific desktop functions, and not in high volume. It’s static data.

Are there any existing quotas in place restricting data growth/size?

No, we create a share on a drive, and the business units on that drive share the space with whoever else is on that drive. If a share gets too big, we move that share on to its own separate drive.

What archiving do you have in place?

We periodically scan and delete old files, the data is also backed up to tape.

Ok, but what stops me hoarding data on the file server, keeping everything online?

Well nothing really, but we do delete old files when possible. Often the business say no though.

What are the backups like?

They’re ok, we’ve been having issues with capacity but that’s to be expected as the amount of data we have to backup increases.

What’s the lead time on a restore? If I log a call how long will it take?

Well, restores are not an incident, they are dealt with, but not as a priority, the priority is resolving incidents. It’s a lead time of a day or two.

Is there a per-GB cost for putting data on the file server?

Our existing billing process is based around the per-device cost: you pay a fee for your desktop or laptop, and that includes a fixed fee which pays for your file and print. So no, there is no fee per GB, but then I don’t think the business would pay it if there was.

It’s an interesting conversation and highlights a few issues. IT and the business want to fix the problem, and they’re throwing money at it: they’ve been told four new servers plus fibre cards and a lot of storage. However, if we take a few steps back, do we not need to fix our core before we spend money? Fix the causal factors (as you get taught in history), not the secondary or tertiary factors?

The causal factors are:

  • Lack of standards for data archiving
  • Lack of data limits or quotas – I can back up my PC onto the file server
  • The billing system does not encourage people to think about the data they are consuming
  • The restore process is not responsive enough – could this be making people keep everything online?
  • There is no separation between priority and back office data. IT might fill up the drive with restores, stopping development from working on their SourceSafe project.
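Two of the causal factors above, the missing quotas and a billing model that ignores consumption, can be made concrete with a small Python sketch. The share names, sizes, quotas and the per-GB rate are invented for illustration; nothing here reflects Jim’s actual figures.

```python
from dataclasses import dataclass


@dataclass
class Share:
    name: str
    used_gb: float
    quota_gb: float


def monthly_charge(share: Share, rate_per_gb: float) -> float:
    """Charge a business line for what it actually stores."""
    return round(share.used_gb * rate_per_gb, 2)


def over_quota(shares):
    """Names of shares that have grown past their agreed quota."""
    return [share.name for share in shares if share.used_gb > share.quota_gb]


if __name__ == "__main__":
    # Hypothetical shares and a hypothetical rate of 0.50 per GB per month.
    shares = [Share("IT", 750, 500), Share("Finance", 120, 200)]
    for share in shares:
        print(share.name, monthly_charge(share, 0.50))
    print("Over quota:", over_quota(shares))
```

Even if nobody is ever actually billed, showing each business line what its share would cost per month tends to start the conversation about what really needs to stay online.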

The secondary factors are:

  • File server is getting full
  • Drives are getting fragmented and therefore slower at supplying data to the end user
  • Backups are taking longer to run

Do we not in essence simply need to turn it around and have a discussion, a data amnesty if you like with the business sponsors and say, “we can either spend a few hundred thousand pounds in parts and people, or we can re-engineer our existing setup and fix the issue?”

  • When we hear that the users don’t want to delete their data, are we asking the right people?
  • When we say restores are not a priority, do we need to change that, either through giving users opportunities to restore data, or establishing a micro team of two or three people that only do restores, rather than bundling it under whatever department it currently sits?
  • Have we thought about isolating user, business and private data rather than combining all three data forms on the same system?

When you ask Jim, the challenge he has is he knows this:

“I know, I know, we need to implement data archiving, we need some kind of quota system, and we need to re-organize the way we do file services, but we’re at end game. On the one hand I’m being told by the business, ‘we’ve got budget, just fix it’, and on the other there’s a general understanding that I never get funding when it works. It’s easier to take the funding, buy a new system and migrate users than it is to try and fix the existing one and retrospectively install rules and procedures.

Often it can be easier to ‘start again’, blame the new system and say it’s best that way, than to go back and start changing things with people saying that important phrase: ‘But why? It works, leave it alone, what have you done?’

The key point is that we don’t get investment or funding for shared IT services unless they’re broken, and it might take months to sign off funding, so whenever we get the chance we take it.”

With that in mind, are we spending money we don’t need to spend in order to avoid an emotional conversation, to avoid some office politics? How much of the IT budget city wide are we ‘mis-investing’ simply because we are unable/unwilling to communicate with our business sponsors and end user community?

In essence, by making funding a challenge, are we not increasing our costs? I could fix what we have, but because you won’t give me money for future investment, I take the budget while it’s on offer. How do we turn this around and develop trust and understanding?

October 2009 07

Tesco Bank on the way?

http://www.finextra.com/fullstory.asp?id=20578

Tesco has changed the name of its personal finance unit to Tesco Bank, reflecting the UK supermarket’s determination to take on established high street players. It has also selected the core technology platforms for the division.

Last December the retail giant bought out the Royal Bank of Scotland to take complete control of Tesco Personal Finance. It has now renamed the unit Tesco Bank “in recognition of our longer-term objective of creating a full-service retail bank”.

Very cool. I’m genuinely excited about Tesco Bank: it will be interesting to see how it affects the traditional high street banks and, at the same time, what range of technologies it uses to deliver its services. We’ll need to wait and see.

Our next bladewatch drinks are on Wednesday the 14th October from 5:30pm, and we’d love you to attend. This time, we’re asking you to send an email to martin237@gmail.com so we know you’re attending.

There’s no cost to attending, and as before, we’ll deploy the bladewatch credit card for the first hour or two to buy the first round of drinks. There will be no hard sell or request for state secrets, it’s simply a meet up, talk about how the market is doing, what we’re up to and everything else.

The details of the venue are below:

Henry’s bar
4 London Wall, Blomfield Street
EC2M 5NT

http://maps.google.co.uk/maps?hl=en&client=firefox-a&rls=org.mozilla:en-US:official&hs=ckQ&q=4%20London%20Wall%20Blomfield%20Street-&um=1&ie=UTF-8&sa=N&tab=wl

Some frequently asked questions about the meet up

How long does it last?
Don’t worry it’s not an all night event, you’re free to come and go as you please.

Who sponsors the meet up?
Bladewatch.com pays for the event, the venue and puts some money behind the bar.

Can vendors/PR people attend?
Of course, but as ever, it’s a community meet up, so no hard sales please, swap business cards, ask questions, but don’t let’s deploy the brochures, it’s not that kind of event – for that you want a conference.

Is the conversation driven/specific?
Not at all, it’s a meet up of readers, colleagues and people who have participated in the blog, with that in mind, the conversation flows between nuclear physics, the weather (we’re in the UK you see) and IT.

Is attendance free?
Absolutely, there’s none of the £5 ticket stuff (we don’t have the time to mess about), and besides we want people we know to attend, our readers/contributors do mean a lot to us here at bladewatch.

Can I attend independently of my organization?
Absolutely, it’s you, our readers, we’re inviting, not your organization. All conversations are below the radar and we won’t be publishing them; it’s not that kind of event.

Can we sponsor the event?
If you’d like to help out with the bar bill, display a logo or do a short briefing on a product or service before we all start talking, we’d be happy to talk about it, but we’ll need to discuss it beforehand; we don’t want to detract from the nature of the event.

October 2009 07

Cloud webcast

Univa UD

Lisle, IL, Oct. 6, 2009 – A team of industry leaders in cloud computing will be hosting a free webcast to provide insights and best practices on cloud-based IT service delivery. Experts from Intel Corporation, Univa UD and Diamond Management and Technology Consultants will lead the session, which takes place on Thursday, October 8 at 2:00pm ET.

Titled “Cloud as a Service Delivery Platform: Best Practices and Practical Insights,” the webcast will allow participants to get key questions answered about delivering cloud-based application services. Speakers will address lessons learned in both strategic planning and execution phases of cloud service delivery, presenting valuable tips and discussing experiences in a highly interactive forum. The webcast will also present the latest requirements that end users are demanding.

This event is crafted for directors or executives responsible for application delivery and other services at leading service providers, telecommunications companies, hosting providers, or web ISVs looking to offer applications in a SaaS model.

Webcast presenters:
  • Billy Cox, Director, Cloud Strategy and Planning, Software and Services Group, Intel Corporation
  • Chris Curran, Partner and CTO, Diamond Management & Technology Consultants
  • Jason Liu, CEO, Univa UD

The webcast does sound cool and I’ll need to register. It’s always great to see what people are talking about in the cloud space, particularly in service delivery and best practice, so do check it out.

In terms of process and delivery, there are so many examples where organizations have brought in new technologies and new services but have not adapted or modernized the background processes, both in terms of the helpdesk process and in the way the teams work together.

So Mike, (we’ll call him), sent me a wonderful illustration of this. He’d started at a financial organization as a server engineer, and got told, order yourself a Blackberry and a laptop.

He logged a call through the helpdesk, requesting a laptop and a Blackberry. A lady (Janet, we’ll call her) about 30 feet from him, responsible for the organization’s mobile technologies, called and said, “Can you print out the mobile devices request form, sign it and then fax it back to me?”

The process was manual and worked as follows:

  1. Mike faxes to Janet the form signed by him saying he wants a Blackberry and laptop.
  2. Janet puts that in the filing system (an A4 ring binder with the month and Mobile phone requests/laptop requests).
  3. Janet then sends an email to Anthony (Mike’s manager), who sits next to Mike, asking whether they can buy him these devices. Anthony says yes by email; she prints that out and staples it to the purchase request form that Mike faxed.
  4. Janet then raises a call to the laptop team requesting a laptop, then she raises a call to the mobile phone team for a mobile phone. She then waits for the devices to be purchased and configured, then delivered to her desk.
  5. Janet then asks Mike to come over and sign another form stating that he takes ownership and liability for the devices listed. This is then stapled to the original faxed request.
  6. Janet then logs a mobile technologies billing form to the mobile billing team to bill Mike’s mobile to his cost center. She then logs a call to ensure that the laptop is billed to his department sending it to the desktop team.

Janet’s job is to co-ordinate people’s calls, to ensure they are correctly ordered and billed back to the relevant team.

  • The actual delivery time for the Blackberry – 48 hours from the request to the voice team, who simply request a handset and contract from their mobile phone provider.
  • The actual delivery time for the laptop – 48 hours to build and configure – (IT normally has at least three or four pre-ordered laptops ready for supply). New/specific requests would take five business days including build and configuration.

The user perceived lead time:

  • Mobile phone – 5 working days – 1 of which usually involving paperwork being sent back and forward, billing organization and cross charging
  • Laptop – ten working days. There were the back office IT aspects to work out: a survey, a check on whether you want the small or the medium laptop and whether you need an external mouse and a laptop bag, the host name allocation, discussions about the applications you wanted installed, and then the operating system load and configuration.

Issues from listening to Mike:

  • IT had taken the purchase request process, written quite some time ago for purchasing a desktop computer or server, and simply changed the individual tasks for each call. The process was stretched to cover mobile phones and laptops rather than adapted to the different nature of those requests. Laptops and mobile phones are more fluid devices: they change more regularly and are more user specific.
  • IT did not have a budget or drive for process improvement, it worked on the basis of business as usual.
  • There were many systems to manage assets, to organize purchase approvals, but none of them spoke to each other or linked into each other. Therefore a simple process to order a phone required a full time member of staff to co-ordinate and manage call progress.
  • Users did not complain, there was no issue from a user standpoint. IT delivered a service which therefore was deemed satisfactory.

What has to change:

  • Consolidation of operating system builds to two, a laptop and desktop build.
  • Blackberry configuration loading from the mobile phone service provider so IT plugged it in and sync’d it with the exchange server.
  • A back office web-based tool to handle the different aspects of the laptop and mobile provisioning process. When a call got logged, the manager got emailed, signed into the tool and pressed OK, and that then generated each team’s tasks. One view, one tool. Removal of all paperwork into a database-powered web interface.
  • A process change, the asset management team managing devices by phone number or computer name, combined with their asset number. When they allocated an asset number, they would also allocate a host name or mobile number.
  • A combination of the laptop, mobile and desktop asset registers, so that there was one gold source of data, one tool which all the teams used. Avoiding duplication of data and three different sources of data each reporting different information.
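The approval step in the web-based tool described above can be sketched in a few lines of Python. This is an illustration of the workflow idea, not the tool Mike’s organization built: the team names, the `Request` class and the task wording are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from device type to the team that provisions it.
TEAM_FOR_DEVICE = {"laptop": "laptop team", "blackberry": "mobile team"}


@dataclass
class Request:
    user: str
    manager: str
    devices: list
    approved: bool = False
    tasks: list = field(default_factory=list)


def approve(request: Request, approver: str) -> list:
    """Manager presses OK once; every team's task is generated from that."""
    if approver != request.manager:
        raise PermissionError(f"{approver} cannot approve requests for {request.user}")
    request.approved = True
    request.tasks = [
        f"{TEAM_FOR_DEVICE[device]}: provision {device} for {request.user}"
        for device in request.devices
    ]
    request.tasks.append(f"billing: charge devices to {request.user}'s cost center")
    return request.tasks


if __name__ == "__main__":
    request = Request(user="Mike", manager="Anthony", devices=["laptop", "blackberry"])
    for task in approve(request, "Anthony"):
        print(task)
```

The point of the design is that the approval, the team tasks and the billing record all come from one request object, instead of from a fax, a printed email and a ring binder.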

In the previous article, the BAA CIO had mentioned ‘complexity breeds costs’, and in many ways he’s right. There are two main reasons your IT costs are higher than the industry benchmark, and they link back to investment and management. Both can be easily resolved from an operational standpoint. Often it’s an update of processes and making systems communicate, coupled with specific investment in target areas to reduce delays in deployment and support; in essence, an agreement on standards and best practice. We’ll go through what I mean by that in a moment, but first let us examine the management aspect.

In terms of management we’re talking about the way we do business: the cross charging element, the structure of your organization, how things are done both in terms of ‘the process’ and how it works in reality, and how many layers and departments are involved in a specific activity. The more people there are to involve, the slower the process seems to be. What we need to do is attack the challenge, so to speak, from three directions:

  1. Process based change – the actual workflows involved from the point at which the user emails the helpdesk and says order me a desktop to the point at which it is delivered and installed.
  2. Organizational based change – what barriers to success are there and, crucially, which teams are not performing and how we might change that.
  3. Technical based change – what tools/systems prevent or ‘don’t help’ us deliver that request.

The next post illustrates this wonderfully.

In terms of technology we need to be covering the fundamentals: it’s the basics, your core, which underpins everything else. Let me be clear: when we say core, we mean your network, your storage, desktop/server, middleware and database. If we have a fundamental issue with the server build, that it takes too long, that it hasn’t moved on in line with server standards and specifications, that can all hinder your success. A network where I have to set the buffers to high, where multi-casting works, but just not quite right, just not enough for my application guys to be happy with, means that we’re starting off with an uneven, dare I say it, unstable core. It is this core, dear reader, upon which you may be powering your business, running your applications and wondering why that email took 15 minutes to arrive, why we had to reboot the settlements server during the business day, and just WHY OH WHY IS MY DESKTOP SOOOOOOOOOO SLLLLLLOOOOOOOOWWWW.

So when we look at the technology, we should be working in split personality mode. I want my support guys looking at what we have and striving to stabilize the core:

  • Upgrade all drivers and firmware on everything, even if it’s out of support, so that we have our systems optimized regardless of whether they’re a DL380 G6 or DEC 1200A.
  • Upgrade the operating system with the latest service packs and security patches, again even if it’s out of support. I don’t care if it’s NT4, but can we at least consolidate to Service Pack 6a with the roll up hot fix? I don’t care if it’s Solaris 8, just make sure it’s patched as high as the application and the box will go.
  • Identify and resolve configuration type issues. If server19 runs out of disk space due to the pagefile, move the page file and upgrade the memory. We should be configuring on a per application, per server basis: there should be plenty of disk space on the OS drive, and the database dumps/backups and applications should all be configured to work within the constraints of that server, whether it has 4GB or 146GB drives.
  • Defrag, defrag, defrag, why? Because it’s one thing that we can do during the business day (application permitting) to improve/contain performance without impacting operations, let us do what we can.
  • Reboot, reboot, reboot – avoid that manliness nonsense of “my server has been up longer than yours”, I don’t care if mine works. Should a server stay online for months/years without rebooting? Yes. But on the basis that it’s 2009, life is short and my Bentley will not be delivered on the basis of the uptime statistics of my infrastructure, I can either security patch/service my server estate and reboot it once a month or even every three months, or we can, during that key transaction, reboot it, because it’s hung and the anti virus isn’t updating, the server having been up for 700 days.
  • Get on with it – do it. Stop talking about where we are, what the barriers to success are, how that isn’t my specific job, how that’s not a support task, whose budget should fund it, or how middleware should really be doing that under section 7.4 because it’s their remit. We can either do it, fix it and move on, or continue in mediocrity, but let it be known that users and the business don’t tend to reward failure.
  • Understand each business line’s and each user group’s priorities, and change the way you deliver and interact accordingly. We could argue with front office since they don’t like filling in the change request, or we could just fill in the change request ourselves, on the understanding that by doing this we expect assistance and the benefit of the doubt in return.
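The “reboot, reboot, reboot” point in the list above is easy to turn into a routine report. The Python sketch below flags servers that have gone too long without a restart; the server names, boot dates and the 90 day threshold (roughly the “every three months” mentioned above) are assumptions for illustration, and in practice the boot dates would come from your monitoring or inventory system.

```python
from datetime import date


def overdue_for_reboot(last_boot, today, max_days=90):
    """Return sorted names of servers whose uptime exceeds max_days."""
    return sorted(
        name for name, booted in last_boot.items()
        if (today - booted).days > max_days
    )


if __name__ == "__main__":
    # Hypothetical inventory: server name -> date of last boot.
    inventory = {
        "server19": date(2007, 11, 1),
        "web01": date(2009, 9, 20),
    }
    print(overdue_for_reboot(inventory, today=date(2009, 10, 7)))
```

Running something like this monthly, alongside the patch schedule, means the reboot happens in a planned window rather than in the middle of that key transaction.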

There will be a post illustrating this later today.
