XenServer 6.5 - SR "Run out of space while coalescing."
-
Some interesting stuff in here: http://discussions.citrix.com/topic/347002-how-to-fix-vhd-coalesce-mess-in-xenserver-62
From the article, try some of these when this happens and let's see if there is something happening trying to clean things up:
tail (or cat, if you like) /var/log/SMlog | grep -i "exception" tail (or cat, if you like) /var/log/SMlog | grep -i "coalesced" tail (or cat, if you like) /var/log/SMlog | grep -i "kicking" tail (or cat, if you like) /var/log/SMlog | grep "GC"
-
My guess is that there are many snaps and a ton of space is used during the coalesce phase and it then clears up before you are checking it so that it appears like there is a lot of space but it actually runs out at some point momentarily.
-
Sorry did this ever get addressed? Did you check for any dis-associated vDisk that you can delete?
-
Oh yeah, I forgot that we were waiting for him to get back to us on this.
-
He had a thread on SW that did not get as much traffic but had two responses but went silent too...
http://community.spiceworks.com/topic/1242191-xenserver-6-5-sr-run-out-of-space-while-coalescing
He's not been on here for nine days.
-
Just checked, he went quiet on SW the same day, nine days ago. I sent him a PM, hope that all is okay.
-
I checked in, he's just been too busy with a new rollout to revisit this.
-
Hey all - So sorry for going dead! We implemented a new system that basically "runs the organization" and it's been a mad house. We consolidated 6 separate databases into one system, and even after a year of doing data migration tests, UAT, etc., it's still a bit of a mess (which we anticipated, fortunately). In any rate, I am here to update this thread on the issue...
It seems that the issue was caused by my backup software. Because the backup jobs were taking so long to process, for one magical moment, the software would have snapshots created for every single VM. This was the cause of the issue. After beefing up our backup server and adding SSDs for the metadata database (deduplication), backup windows are MUCH better, jobs are not overlapping as much as before, and this issue has not happened since.
Thanks everyone for the help! I'm hoping to be able to participate a bit more moving forward. We'll see how that goes...
-
Awesome, glad that you managed to find a solution!
-
Looks like it was snapshots as @scottalanmiller was finding. Good to know.
-
@JaredBusch I suspected the same as well, but what was confusing was that they would be cleaned up by the time I would be able to troubleshoot. It seemed to be a very short moment where it would happen, as it seemed to happen long enough for XenCenter to alert, but not long enough to have any impact on the environment.
-
So many things use snapshots under the hood, space can grow and shrink pretty rapidly in confusing ways.