frameloss's profile

New problem solver

 • 

8 Messages

Sunday, August 24th, 2014 7:00 AM

Poor uptime percentages, is it the router or the service?

I am getting pretty frustrated, and think at this point I have made a mistake by signing up for Comcast Business service.  One reason I moved to Comcast Business from residential was to drop a bunch of VPS-hosted services that were having availability problems.  Now I have gone from unreliable hosting to unreliable networking!  It has been getting worse the last few days, as shown by my uptime numbers:

 

. . . from uptime robot:

Screen Shot 2014-08-24 at 8.27.37 AM.png

 

< 90% availability?!?!?! I was getting much more consistent and reliable service on residential. Anyone else seeing the same problems? I suspect that the gateway I was provided is to blame. I find that I have to forcibly reboot it to resolve the problem most days.

 

Screen Shot 2014-08-24 at 8.28.00 AM.png

 

Is it my cable modem? I have the Netgear . . . 

 

Initialization Procedure

Vendor Name: Netgear
Hardware Version: 1.04
Firmware Version: V1.34.04
Operating Mode: Residential Gateway
System Uptime: 0 days 00h:45m:30s
Date: 08-24-2014
Time: 07:43:12

 

Anyone else having this problem?

Advocate

 • 

1.4K Messages

10 years ago

Hello frameloss and welcome,

 

Could you please share some additional information as follows:

1. Are you using a Static IP or straight DHCP on your NG3K?

 

2. Do you use a Firewall / Controlling router device within your network?

 

3. How many devices are operated through your NG3K throughout your entire network?

 

4. What is your Internet bandwidth speed?

 

5. What are your specific Uptime Robot monitors? This is not clear from your graphs.

It monitors your websites every 5 minutes and alerts you if your sites are down (actually, it is smarter, details below).

How It Works? The Details

Here are the step-by-step actions of Uptime Robot to understand it better:

  • it asks for your website's headers and gets status codes like "200-ok", "404-not found", etc. every 5 minutes (or more depending on the monitor's settings),
  • if the status code doesn't indicate a problem, we are good
  • if the status code is in the 400s or 500s, then the site is not loading
  • in order to make sure the site is down, Uptime Robot makes several more checks in the next 30 seconds,
  • if the site is still down, it sends an alert.
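The steps above can be sketched in a few lines of Python. This is only an illustration of the described logic, not Uptime Robot's actual code: the `fetch_status` callable and the default of 3 rechecks are assumptions (the description only says "several more checks"), and the 30-second timing between rechecks is left out.

```python
from typing import Callable


def is_down(status: int) -> bool:
    """Treat 4xx and 5xx status codes as 'site not loading'."""
    return status >= 400


def should_alert(fetch_status: Callable[[], int], rechecks: int = 3) -> bool:
    """Alert only if the first check and every follow-up recheck fail.

    fetch_status is a hypothetical callable returning an HTTP status code;
    rechecks=3 is an assumed value, not a documented one.
    """
    if not is_down(fetch_status()):
        return False  # first check looks fine, nothing to do
    for _ in range(rechecks):
        if not is_down(fetch_status()):
            return False  # site recovered during verification
    return True  # still failing after all rechecks


if __name__ == "__main__":
    # Simulated responses: two failures, then a recovery, so no alert.
    codes = iter([500, 500, 200])
    print(should_alert(lambda: next(codes)))
```

Passing the status source in as a callable keeps the sketch testable without touching the network.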

Introduction

Uptime Robot uses a distributed monitoring system to minimize false-positives.

All primary checks are made from the main engines in Dallas-USA. However, once a downtime is detected, secondary requests to verify this downtime are sent from remote nodes that are located in different countries/continents.

Whitelisting

If the monitoring works well for you, there is no need to take any action. But if you get any false positives, there is a strong chance that the IPs used are blocked by your hosting provider.

Please make sure that you whitelist these IPs so that any requests that Uptime Robot sends are not blocked.

 

Look forward to hearing from you.

Problem solver

 • 

305 Messages

10 years ago

What exactly is going on with your service? Are you able to post the signal levels from your modem during an outage so we can take a look?

New problem solver

 • 

8 Messages

10 years ago

What exactly is going on with your service? Are you able to post the signal levels from your modem during an outage so we can take a look?

 

Kraze: Sure, sorry it wasn't clear.  No layer-3 traffic passes during an outage.  The modem is responsive on the LAN side, but does not route; within a few minutes of a hard power cycle it starts working again.

 

It's working right now, so I'll paste in signal strength numbers from the gateway when it happens again.

 

VBSSP-RICH:

 

1. Are you using a Static IP or straight DHCP on your NG3K?

 

Actually both, well, sort of.  The Netgear provides a NAT-hide (but not DHCP/DNS) for internal clients, and is acting as a layer-3 router for the servers too.  Both routed connections and NAT'ed connections can see no further than the gateway.

 

2. Do you use a Firewall / Controlling router device within your network?

 

Yes, but traffic through any routing device or packet filter continues to pass between my internal networks. Also, servers and clients are using different devices.

 

3. How many devices are operated through your NG3K throughout your entire network?

 

Doing an ARP sweep I show 31 hosts up, combined across the internal and public subnets. Interestingly enough, you might have touched on something here: the gateway only shows 16 ARP entries. Is there a limitation on the ARP table size on these devices?
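One way to check the ARP table suspicion is to list which swept hosts never appear in the gateway's table. The sketch below assumes both tables are available as text in the Linux `arp -an` layout; the gateway's admin page may well use a different format, and the addresses in the demo are placeholders, not hosts from this network.

```python
import re
from typing import Dict, List

# Matches complete entries such as:
#   ? (192.0.2.10) at aa:bb:cc:dd:ee:01 [ether] on eth0
# Incomplete entries ("at <incomplete>") are skipped on purpose.
ARP_LINE = re.compile(r"\((\d+\.\d+\.\d+\.\d+)\) at ([0-9a-fA-F:]{17})")


def parse_arp(text: str) -> Dict[str, str]:
    """Map IP -> lowercase MAC for every complete entry in the text."""
    return {m.group(1): m.group(2).lower() for m in ARP_LINE.finditer(text)}


def missing_from_gateway(sweep: Dict[str, str], gateway: Dict[str, str]) -> List[str]:
    """Return IPs seen in the sweep that the gateway table does not list."""
    return sorted(set(sweep) - set(gateway))


if __name__ == "__main__":
    sweep_text = (
        "? (192.0.2.10) at aa:bb:cc:dd:ee:01 [ether] on eth0\n"
        "? (192.0.2.11) at aa:bb:cc:dd:ee:02 [ether] on eth0\n"
    )
    gateway_text = "? (192.0.2.10) at aa:bb:cc:dd:ee:01 [ether] on eth0\n"
    print(missing_from_gateway(parse_arp(sweep_text), parse_arp(gateway_text)))
```

If the hosts missing from the gateway's table are the ones that lose connectivity during an outage, that would support a table-size limit as the cause.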

 

4. What is your Internet bandwidth speed?

 

Was roughly 50/10 when last tested.

 

5. What are your specific Uptime Robot monitors? This is not clear from your graphs.

 

It's a 15-minute monitor against a static web page on one of the servers. Not the greatest measure of availability, and I've since tightened the resolution to a 5-minute window. Generally speaking, a certain level of downtime is acceptable for a mail system, so I didn't see the need for more frequent monitoring. However, graphing out time-series connection data for postfix in Kibana shows pretty much a 1:1 correlation with their alerts (for once the constant noise from spammers proved useful, go figure).  Here is an extract from the last few days of my Uptime Robot logs:

 

Event  Monitor  Date-Time            Reason  Duration
Up     Mail     24-08-2014 23:18:57  OK      0 hrs, 25 mins
Down   Mail     24-08-2014 23:04:17  ---     0 hrs, 14 mins
Up     Mail     24-08-2014 22:48:57  OK      0 hrs, 15 mins
Down   Mail     24-08-2014 21:48:51  ---     1 hrs, 0 mins
Up     Mail     24-08-2014 19:12:11  OK      2 hrs, 36 mins
Down   Mail     24-08-2014 18:58:12  ---     0 hrs, 13 mins
Up     Mail     24-08-2014 15:11:05  OK      3 hrs, 47 mins
Down   Mail     24-08-2014 05:22:29  ---     9 hrs, 48 mins
Up     Mail     24-08-2014 04:44:45  OK      0 hrs, 37 mins
Down   Mail     24-08-2014 04:30:46  ---     0 hrs, 13 mins
Up     Mail     23-08-2014 05:10:26  OK      23 hrs, 20 mins
Down   Mail     23-08-2014 04:25:21  ---     0 hrs, 45 mins
Up     Mail     21-08-2014 14:26:40  OK      37 hrs, 58 mins
Down   Mail     21-08-2014 13:11:35  ---     1 hrs, 15 mins
Up     Mail     20-08-2014 07:45:54  OK      29 hrs, 25 mins
Down   Mail     20-08-2014 06:15:09  ---     1 hrs, 30 mins
Up     Mail     20-08-2014 05:59:48  OK      0 hrs, 15 mins
Down   Mail     20-08-2014 02:44:43  ---     3 hrs, 15 mins
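For reference, the durations in this extract can be totalled to get a rough availability figure for the period. The sketch below just sums the up and down durations copied from the log above; it treats the most recent "Up" entry as complete, which is an assumption, since that state was still ongoing when the log was pulled.

```python
# (state, hours, minutes) copied from the log extract, newest first
LOG = [
    ("Up", 0, 25), ("Down", 0, 14),
    ("Up", 0, 15), ("Down", 1, 0),
    ("Up", 2, 36), ("Down", 0, 13),
    ("Up", 3, 47), ("Down", 9, 48),
    ("Up", 0, 37), ("Down", 0, 13),
    ("Up", 23, 20), ("Down", 0, 45),
    ("Up", 37, 58), ("Down", 1, 15),
    ("Up", 29, 25), ("Down", 1, 30),
    ("Up", 0, 15), ("Down", 3, 15),
]


def availability(log):
    """Return (up_minutes, down_minutes, up_fraction) for the log entries."""
    up = sum(h * 60 + m for state, h, m in log if state == "Up")
    down = sum(h * 60 + m for state, h, m in log if state == "Down")
    return up, down, up / (up + down)


up, down, ratio = availability(LOG)
print(f"up {up} min, down {down} min, availability {ratio:.1%}")
# prints: up 5918 min, down 1093 min, availability 84.4%
```

That works out to roughly 18 hours of downtime in under five days, in line with the sub-90% figure from the original post.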

 

Thanks for all the help so far guys, I really do appreciate it.  Right now the ARP tables on the Netgear make me suspicious. Unfortunately, I will need to do an Ethernet cable run before moving the NAT-hide from the gateway over to the firewall . . . so if there are other likely culprits, I am all ears.

New problem solver

 • 

8 Messages

10 years ago

I recabled last night and moved my SNAT over to a different system. So far, no hiccups for the last 7 hours.

Advocate

 • 

1.4K Messages

10 years ago

Great to hear that recabling and moving your SNAT to another system has helped. Let us know if you need anything else.

 

 

New problem solver

 • 

8 Messages

10 years ago

I am approaching the three-day mark without any problems. I hadn't had an uptime stretch exceed 24 hours since the service was installed, so it seems like I found a fix.