Wednesday, February 13, 2008

BlackBerry outage caused by upgrade


The company behind the BlackBerry smart phones said a three-hour e-mail outage Monday was caused by an upgrade designed to increase capacity.
Research in Motion Ltd. said Tuesday that the upgrade was part of "routine and ongoing efforts" and that similar upgrades in the past had caused no problems.

The outage, which started about 3:30 p.m. EST, annoyed subscribers who are used to checking and writing e-mail whenever they're in cellular coverage and able to make voice calls. It affected only some of the BlackBerry users in North America — for others, the service kept working fully.

It was the second major outage for the service in less than a year. In April 2007, a minor software upgrade crashed the system for all users. A smaller disruption in September was also caused by a software glitch.

RIM, which is based in Waterloo, Ontario, has been on a tear, adding 1.65 million subscriber accounts in the quarter that ended Dec. 1, for a total of 12 million subscribers worldwide. It has deals with scores of wireless carriers around the world.

Experts said RIM's system is relatively reliable, but its centralized structure means that when there are problems, they can affect millions of users.

E-mail sent to and from BlackBerry phones in North America all goes through a Network Operations Center. It appears the problem occurred there, when one of two Internet addresses that relay e-mail from corporate servers stopped responding, according to Zenprise, a Fremont, Calif., company that helps companies troubleshoot BlackBerry problems.

"Any time you got a system that's got a NOC, a Network Operations Center, you have the potential for a single point of failure," said Jack Gold, with technology analyst firm J.Gold Associates in Northborough, Mass.

"What's a bit surprising to me is that with all the work they've been doing over time ... that they haven't been able to have enough redundancy in the NOC so that there isn't a single point of failure," said Gold, who has done business with RIM.

Microsoft's competing solution for mobile corporate e-mail doesn't use an equivalent to the NOC. Instead, it sends e-mail directly from a corporate server to users' handsets. That means widespread outages are unlikely, though the system can fail on the local level.

"Those types of issues occur so often ... they slide under the radar," said Chris Ambrosio, director of wireless research at Strategy Analytics.

The analysts agreed that RIM's centralized system had advantages, including strong security.

Ahmed Datoo, vice president of marketing at Zenprise, said RIM had significantly improved its handling of the outage, notifying customers soon after it started.

"That process didn't exist a year ago," Datoo said, noting that most subscribers affected by the April outage learned of it through the news media.
