Not a bug
Too many false positives
I have noticed that since July there have been too many false positives, mostly timeouts on ping or HTTP checks while the sites are up and running (Anturis reports a critical timeout, but the site is reachable from a PC or a mobile device).
Any idea what is going on with Anturis? It has become less reliable.
Thanks
Could you please forward several of the false-positive emails to support@anturis.com?
Also, I have just updated the agent on the logmon machine, and now it won't start. It throws this error:
The service Anturis Monitoring Agent could not start
Any clue what is wrong with it?
As far as I can see, the logmon agent is connected now. It has the latest version and is sending data to the server.
As for the "false positives": your site uscib.org was unavailable from four different locations, including your own agent logmon:
Connectivity error on US-Michigan. Error 2: Timeout
[The target host or service is down or unreachable from this location.]
Connectivity error on US-Dallas. Error 2: Timeout
[The target host or service is down or unreachable from this location.]
Connectivity error on CA-Vancouver. Error 2: Timeout
[The target host or service is down or unreachable from this location.]
Connectivity error on logmon.interacthosting.net. Error 2: Timeout
[The target host or service is down or unreachable from this Anturis Agent.]
How could this be a false positive?
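The argument above rests on several independent locations failing at the same time. A minimal sketch of that kind of agreement rule follows; it is not Anturis's actual alerting logic, and `ProbeResult`, `confirmed_down` and the `min_failures` threshold are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeResult:
    location: str  # monitoring location, e.g. "US-Michigan"
    ok: bool       # True if the check succeeded from this location


def confirmed_down(results: List[ProbeResult], min_failures: int = 2) -> bool:
    """Treat the target as down only if at least `min_failures` locations failed."""
    failures = sum(1 for r in results if not r.ok)
    return failures >= min_failures


# The four locations listed in the report above, all timing out.
results = [
    ProbeResult("US-Michigan", ok=False),
    ProbeResult("US-Dallas", ok=False),
    ProbeResult("CA-Vancouver", ok=False),
    ProbeResult("logmon.interacthosting.net", ok=False),
]
print(confirmed_down(results))
```

A single failing location would not satisfy this rule with the default threshold, which is the usual reason for probing from more than one place.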
Right now your site has consistent packet loss, even from our development office:
--- uscib.org ping statistics ---
--- uscib.org ping statistics ---
And from Brasília:
--- uscib.org ping statistics ---
That's what I collected in the last 5 minutes. You do have a problem with your provider.
From same state:
PING uscib.org (69.176.101.68) 56(84) bytes of data.
--- uscib.org ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 7527ms
rtt min/avg/max/mdev = 17.592/21.528/23.491/1.882 ms
From the west coast:
--- uscib.org ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 1501ms
rtt min/avg/max/mdev = 118.066/118.310/118.715/0.487 ms
From the EU (Paris):
--- uscib.org ping statistics ---
WebSitePulse uses ping as well, and it reports no errors.
Right now from Amsterdam, the Netherlands, from the very reliable datacenter https://www.exonet.nl/, from a server that has no network load:
Same state:
--- uscib.org ping statistics ---
100 packets transmitted, 99 received, 1% packet loss, time 100476ms
rtt min/avg/max/mdev = 15.761/35.858/669.807/66.991 ms
West Coast:
100 packets transmitted, 98 received, 2% packet loss, time 100173ms
rtt min/avg/max/mdev = 18.549/32.863/515.623/58.488 ms
Poland:
Packets: Sent = 100, Received = 98, Lost = 2 (2% loss),
Approximate round trip times in milliseconds:
Minimum = 119 ms, Maximum = 137 ms, Average = 124 ms
Packet loss in the range of 1-2% is considered "acceptable" and shouldn't trigger any critical errors, especially not errors reported as a timeout reaching the destination.
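To put rough numbers on this, here is a small sketch of how likely it is that every packet of a short probe is lost at a given loss rate. It assumes losses are independent from packet to packet, which is a simplification (real loss tends to come in bursts, so this understates the risk); `prob_all_lost` is a hypothetical helper, not part of any monitoring tool.

```python
def prob_all_lost(loss_rate: float, probes: int) -> float:
    """Chance that all `probes` packets are lost, assuming independent losses."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    return loss_rate ** probes


# Loss rates mentioned in this thread: 1%, 2% and 6%.
for loss in (0.01, 0.02, 0.06):
    single = prob_all_lost(loss, 1)
    triple = prob_all_lost(loss, 3)
    print(f"{loss:.0%} loss: 1 probe lost {single:.4%}, 3 probes all lost {triple:.6%}")
```

Under this assumption, a check that only fails when several packets in a row are lost should rarely report a timeout at 1-2% loss, while a check that fails on a single lost packet will do so far more often.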
Persistent 1% packet loss is an indication of a problem.
2% packet loss is an indication of a serious problem.
Packet loss of 6%, as I showed you, means that your network simply does not work.
And the Anturis alerts were absolutely correct. But of course it's up to you whether to resolve the problem or leave it as is.
BR,
Konstantin,
CTO Anturis Inc.
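The ping summaries quoted in this thread can be checked against the thresholds from the reply above with a short script. This is only a sketch: it assumes the Linux `ping` summary format shown earlier ("N packets transmitted, M received"), the function names are hypothetical, and treating everything between 2% and 6% as a serious problem is an assumption, since the reply names only the 1%, 2% and 6% points.

```python
import re

# Matches the Linux ping summary line quoted in this thread.
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) received")


def loss_percent(summary_line: str) -> float:
    """Compute packet loss in percent from a ping summary line."""
    match = SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


def classify(loss: float) -> str:
    """Label a loss percentage using the thresholds from the reply above."""
    if loss >= 6:
        return "network effectively not working"
    if loss >= 2:
        return "serious problem"  # assumption: applies up to the 6% point
    if loss >= 1:
        return "problem"
    return "ok"


line = "100 packets transmitted, 99 received, 1% packet loss, time 100476ms"
loss = loss_percent(line)
print(loss, classify(loss))
```

The labels reflect only the support side's reading of the numbers; the customer's position in this thread is that 1-2% is acceptable.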
1% packet loss can be caused by a 100% CPU spike from some process, a disk operation on an array, or a NIC shortage, so it is not necessarily an indication of a problem, let alone 2% being an indication of a strong problem, when most internet providers will tell you that value is acceptable. Anyway, that concerns monitors that use ping. It is not acceptable when a monitor uses an HTTP connection and I get a "timeout": I should see that error only when the site is not available. In that case packet loss will only increase the load time.
Regarding alerts and problems, I can say the same about the Anturis infrastructure: a packet may not make it back (a full round trip means the request was sent and reached the destination, but the response did not get back to the source), or there may be high packet loss (like that 6%, compared with the 1-2% I am getting). An excellent example is the Vancouver monitor, from which I see continuous issues (high ping notices or timeouts), and, a few months back, Dallas, when Anturis even agreed that there was a problem and fixed it.
I'm just saying that everything was OK up until those false positive messages grew to the point of not being acceptable, compared to a competitor like WebSitePulse. What is the sense of renewing a monitoring service which is telling the admins that there is an issue (site down, appliance not available) when it is not true?