
BT and Tuesday February 2nd.



Westdene Seagull

aka Cap'n Carl Firecrotch
NSC Patron
Oct 27, 2003
21,526
The arse end of Hangleton
I don't think you can have a clone as such, but as @Westdene Seagull says they should have had better monitoring in place to prevent it, and remove the router from the network. It could have been a "perfect storm" of issues which were hard to pick up.

HSRP would solve the problem on Cisco - all decent vendors have their equivalent.

EDIT - I'll change that to 'might have' as of course we don't know what the real cause was.
 




The Oldman

I like the Hat
NSC Patron
Jul 12, 2003
7,160
In the shadow of Seaford Head
Did anyone lose service on here? I checked and we didn't lose any connection with BT on Tuesday - unlike the regular down times we used to get with Virginmedia

In Seaford our BT Infinity did not go down, but we could not access loads of sites, including NSC and bank websites. However, there was no trouble getting the BT website, which told me there were no problems with their services.
 


Gazwag

5 millionth post poster
Mar 4, 2004
30,730
Bexhill-on-Sea
I reckon the cleaner took the wrong plug out while she did the hoovering, easily done.

Or, like we had two weeks ago, some cowboy electricity workers digging a hole just down the road from the office went through the electricity cable and shut down the whole road. Half an hour later we had power back but then they cut through the broadband cable :facepalm:
 


beorhthelm

A. Virgo, Football Genius
Jul 21, 2003
36,015
Agreed it will have been some datacentre level router but if 2m+ people can be affected by a single device failure then it's a very poorly designed network. Equally there are tools to monitor the health of the routing protocols and that can take self-healing actions to prevent this sort of outage.

It's easy for us to say that, but these sorts of things are often unforeseen, unexpected failures that the tools don't anticipate. The technical view, as I understand it, is a cascade failure brought about by something spaffing bad data. One of the Reg comments was to the effect that there's nothing in the protocols to handle the scenario, and the network techies here are of that opinion: the monitoring could only tell the sysadmin that it's borked, by which time it's already gone. The fact that there was no failover would support this, as the error would be replicated on the backup (hardcore systems design would call for different vendors on alternative networks, but procurement overturns such ideals). Or the failover worked but was saturated; the design might anticipate only a short period of single path/node usage. A major US telco had that problem a few years ago when one of their main pipes got cut and it took longer than expected to locate and fix the problem. It's expensive to maintain completely 200% redundant systems and a backup for that (so do you need 400% capacity?). Networks are supposed to be designed to route around problems to save asking that question, but if the routers go first...
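
A back-of-envelope sketch of the "failover worked but was saturated" point, in Python. It assumes equal paths sharing traffic evenly and total demand staying the same after a failure; the function name and the example figures are made up for illustration and say nothing about how BT's network is actually built.

```python
def surviving_load(per_path_utilisation: float, paths: int, failed: int) -> float:
    """Utilisation of each surviving path after `failed` of `paths` equal paths drop out.

    Assumes traffic was spread evenly and total demand is unchanged (hypothetical model).
    """
    if paths < 1 or not 0 <= failed < paths:
        raise ValueError("need at least one surviving path")
    total_demand = per_path_utilisation * paths
    return total_demand / (paths - failed)


# Two paths, one fails: anything above 50% per path beforehand overloads the survivor.
for util in (0.4, 0.5, 0.7):
    after = surviving_load(util, paths=2, failed=1)
    status = "saturated" if after > 1.0 else "ok"
    print(f"{util:.0%} per path before -> {after:.0%} on the survivor ({status})")
```

So with two paths you can only run each at half capacity if one is meant to carry everything on its own, which is where the "200%" figure comes from.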
 


catfish

North Stand Brighton Boy
Dec 17, 2010
7,677
Worthing
It is unusual as their service is normally very good. I was without a connection for a couple of hours & would be very peed off if it happened on a regular basis.
 




Springal

Well-known member
Feb 12, 2005
24,785
GOSBTS
HSRP would solve the problem on Cisco - all decent vendors have their equivalent.

EDIT - I'll change that to 'might have' as of course we don't know what the real cause was.

Disclaimer, I work for a global networking manufacturer and work with Service Providers.

It's unlikely this will be a case of 'OMG JUST 1 BIG ROUTER!!!!' or something as basic as not having proper failover configuration.

A network like BT's will be extremely complex in terms of MPLS, extensive routing tables and complex routing protocols, all in a multi-vendor situation. What I imagine happened is that some spurious routing table updates went out across the network, wiping out parts of it and making them unreachable. That would fit with routers regularly re-registering during the outage.

This kind of thing has been seen before, see http://www.eweek.com/c/a/IT-Infrast...-Update-Causes-Massive-Internet-Outage-709180

Or, dare I say, it could be those large Huawei (Chinese manufacturer) routers going in to feed DPI information back to GCHQ etc... allegedly
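
To illustrate the spurious-update theory above, here is a toy Python sketch of a bad update flooding through a network where every router trusts its neighbours. The topology, router names and the `accepts` check are all hypothetical; real routing protocols are far more involved, and nobody here knows what actually happened at BT.

```python
from collections import deque


def flood(links, origin, accepts):
    """Return the set of routers that installed an update flooded from `origin`.

    `accepts(router)` decides whether a router installs and re-advertises the update.
    """
    installed = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in links[node]:
            if peer not in installed and accepts(peer):
                installed.add(peer)
                queue.append(peer)
    return installed


# Hypothetical five-router network with two redundant paths from A to D.
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# Everyone trusts the update: it reaches every router, redundant paths included.
everyone = flood(links, "A", accepts=lambda router: True)

# A's neighbours reject the update: it never gets past the origin.
filtered = flood(links, "A", accepts=lambda router: router not in {"B", "C"})

print(sorted(everyone), sorted(filtered))
```

The point of the sketch is that redundant paths do not help if both paths happily accept the same bad data.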
 


portslade seagull

Well-known member
Jul 19, 2003
17,949
portslade
I reckon the cleaner took the wrong plug out while she did the hoovering, easily done.

Or, like we had two weeks ago, some cowboy electricity workers digging a hole just down the road from the office went through the electricity cable and shut down the whole road. Half an hour later we had power back but then they cut through the broadband cable :facepalm:

Not sure BT know what cleaners are so couldn't have been that
 

