CloudatCost Issues
-
As I suspected, it's IOWait problems. Always check SAR first for performance problems.
Linux 3.13.0-32-generic (ubuntu)  03/09/2015  _x86_64_  (1 CPU)

02:05:11 AM  CPU   %user   %nice  %system  %iowait  %steal   %idle
02:15:01 AM  all    0.10    0.00     0.31     7.35    0.00   92.24
02:25:01 AM  all    0.02    0.00     0.21     5.97    0.00   93.81
02:35:01 AM  all    0.00    0.00     0.17     2.01    0.00   97.82
02:45:01 AM  all    0.19    0.00     0.28    20.81    0.00   78.73
02:55:39 AM  all    1.63    0.00     0.46    28.88    0.00   69.03
03:05:01 AM  all    0.22    0.00     0.14     9.61    0.00   90.02
03:15:01 AM  all    0.00    0.00     0.12     1.73    0.00   98.15
03:25:01 AM  all    0.00    0.00     0.14     7.41    0.00   92.45
Average:     all    0.28    0.00     0.23    10.63    0.00   88.86
-
I'm seeing rather large IO delays here too.
-
So far I've only seen the IO issues on the Dev1 instances.
-
@Aaron-Studer said:
@scottalanmiller said:
So far I've only seen the IO issues on the Dev1 instances.
I wonder why?
I'm sure the over-commit rate on those is much higher. And the IO needs are higher as there are so many OSes doing things, even when idle, in the same IO space.
-
What is a "good" average? On my Dev1 box, I'm seeing 8.41%. On my Dev3 box, I'm seeing 4.45%.
-
@Danp said:
What is a "good" average? On my Dev1 box, I'm seeing 8.41%. On my Dev3 box, I'm seeing 4.45%.
Average "should" approach zero. Definitely way under 1%.
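To check a box against that "way under 1%" guideline, a rough sketch like this filters sar CPU output for intervals above a threshold. The function name high_iowait is made up for illustration, and the column position assumes 12-hour sar output (with AM/PM) in the default CPU report; with 24-hour timestamps the fields shift by one.

```shell
# Print sar intervals whose %iowait exceeds a threshold (default 1.0).
# Assumes 12-hour "sar -u" lines, where %iowait is the 7th field, e.g.:
#   02:15:01 AM all 0.10 0.00 0.31 7.35 0.00 92.24
# The header line (3rd field "CPU") and the "Average:" line are skipped.
high_iowait() {
    awk -v limit="${1:-1.0}" \
        '$3 == "all" && $7 + 0 > limit + 0 { print $1, $2, $7 }'
}
# Usage (hypothetical): sar -u | high_iowait 1.0
```

Run against the report at the top of the thread, every interval would be listed, since even the quietest one sits at 1.73%.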
-
For example, the MangoLassi database, which you can imagine is relatively busy, produces only .03% IOWait state average. It almost never spikes above .08% in any ten minute period.
That's point ZERO three. So 8.41 is 280 times higher!!
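The arithmetic behind that comparison is just a ratio of the two averages. A small sketch, with ratio as a made-up helper name:

```shell
# Divide one IOWait average by another and round to a whole number.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", a / b }'
}
ratio 8.41 0.03   # Dev1 average vs. the MangoLassi database average
```

The same helper can be pointed at the Dev3 figure (4.45) to see how far that box is from the baseline.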
-
@scottalanmiller said:
For example, the MangoLassi database, which you can imagine is relatively busy, produces only .03% IOWait state average. It almost never spikes above .08% in any ten minute period.
That's point ZERO three. So 8.41 is 280 times higher!!
Does this then say that the servers C@C are providing are overprovisioned? I understand that they are growing capacity and are working to resolve this.
-
@scottalanmiller said:
@Danp said:
What is a "good" average? On my Dev1 box, I'm seeing 8.41%. On my Dev3 box, I'm seeing 4.45%.
Average "should" approach zero. Definitely way under 1%.
Assuming this is a VM and it's working correctly, would you expect to see these numbers outside their normal range if the host machine is super busy or saturated?
-
@Dashrender said:
Does this then say that the servers C@C are providing are overprovisioned? I understand that they are growing capacity and are working to resolve this.
It states this right on their status page. They know that they overprovisioned IO and are adding capacity.
-
@scottalanmiller said:
@Dashrender said:
Does this then say that the servers C@C are providing are overprovisioned? I understand that they are growing capacity and are working to resolve this.
It states this right on their status page. They know that they overprovisioned IO and are adding capacity.
lol I know that. I was mainly asking the second part of my question: on the assumption that our VM is doing what it should, why does it show a higher than expected IO load?
Is it aware of the other loads around it on the host?
-
@Dashrender said:
lol I know that. I was mainly asking the second part of my question: on the assumption that our VM is doing what it should, why does it show a higher than expected IO load?
Is it aware of the other loads around it on the host?
Oh, I see. It doesn't show IO load. It shows IOWait state - the amount of time in which the CPU is stuck just waiting for IO to respond. So we have no idea what the load is, we only know that the storage can't respond to us.
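For anyone curious where that IOWait number comes from: the kernel keeps a running counter of time spent in iowait in /proc/stat, alongside user, system, idle and so on. A minimal sketch that turns the counters into a percentage follows; iowait_pct is a made-up name, and note this is the share since boot, not the per-interval figure that sar reports.

```shell
# Share of CPU time spent in iowait since boot, from the aggregate "cpu" line.
# Fields after the label are: user nice system idle iowait irq softirq steal ...
# An optional file argument replaces /proc/stat (handy for trying it on a copy).
iowait_pct() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9; i++) total += $i   # guest time is already counted in user
        printf "%.2f\n", 100 * $6 / total
    }' "${1:-/proc/stat}"
}
```

Nothing in those counters says how busy the disks are or who else is using them, which is the point above: the VM only sees that it spent time waiting.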
-
@scottalanmiller said:
@Dashrender said:
lol I know that. I was mainly asking the second part of my question: on the assumption that our VM is doing what it should, why does it show a higher than expected IO load?
Is it aware of the other loads around it on the host?
Oh, I see. It doesn't show IO load. It shows IOWait state - the amount of time in which the CPU is stuck just waiting for IO to respond. So we have no idea what the load is, we only know that the storage can't respond to us.
Ahh... OK, great, nice lightbulb moment for me. Thanks!
-
If the storage was super slow, like it had a carrier pigeon interface, it would show up as having high IOWait but low load. But with SSDs, we assume they are overloaded.
-
What is a SAR report? How do I get one?
-
@Aaron-Studer said:
What is a SAR report? How do I get one?
In Linux the command to get it is simply sar.
Ta da!
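A slightly fuller sketch, for anyone who wants more than the bare command: sar -u is the CPU report that includes %iowait, and passing an interval and a count takes live samples instead of reading the recorded history. The wrapper name cpu_report is made up for illustration.

```shell
# Show CPU utilisation including %iowait; print a hint if sysstat is absent.
cpu_report() {
    if ! command -v sar >/dev/null 2>&1; then
        echo "sar not found: install the sysstat package" >&2
        return 1
    fi
    # No arguments: today's recorded history. With "interval count": live samples.
    sar -u "$@"
}
# cpu_report        # today's history
# cpu_report 5 3    # three live samples, five seconds apart
```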
-
If you are missing the sar utility, it means that you forgot to install the sysstat utilities.
Either...
yum -y install sysstat
Or...
apt-get install sysstat
If you are on Ubuntu 14.04.2 LTS, they do this ridiculous thing where data collection isn't turned on by default. Run sar and it will tell you what to change to enable it. It's just editing one file to change a false to a true.
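As a rough sketch of that one-line change: on Debian and Ubuntu packaging the setting is normally ENABLED="false" in /etc/default/sysstat, but treat that path as an assumption and go by whatever sar itself prints. The function name enable_sysstat is made up for illustration.

```shell
# Flip ENABLED="false" to ENABLED="true" in the sysstat defaults file.
# The default path is an assumption based on Debian/Ubuntu packaging;
# pass a different file as the first argument if sar points elsewhere.
enable_sysstat() {
    conf="${1:-/etc/default/sysstat}"
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$conf"
}
# On a real system this needs root, followed by: service sysstat restart
```

After the restart it takes one collection interval (ten minutes in the report at the top of the thread) before sar has anything to show.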
-
@scottalanmiller I always run CentOS7, per your recommendation