02/23/2010 Update: A few days ago we bumped into an issue where a published icon was not equally distributing load between servers. All inbound connections were going to a single server, no matter which servers we rotated in and out of the list. After some investigation, we found the cause: we had moved licensing around on two of the servers in our XenApp 4.5 farm, which apparently created the problem. In the properties of each server, everything was set to be farm managed except for the licensing. In this case, the license server version differed between the farm default and the license server those two servers were pointed at.
After running the following command: qfarm /ltload -- the same two servers showed a status of Off under "load throttling load". It wasn't until we changed the license server to "Use farm Settings" on the two non-load-balancing servers that server distribution returned to normal levels. So, to the masses we say: if you notice that icon load distribution is not balanced and traffic appears to be going to one server, run the qfarm /ltload command. If you see "Off" next to any of the servers under "load throttling load", check those servers' licensing. Set it to farm managed if you can.
Look at updating your license server to a newer version, especially if you intend to use any of the new XenApp features. --END
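If you would rather not eyeball the output, the check described in this update can be scripted. The following is only a rough Python sketch: the function name, the server names and the sample text are made up for illustration, and it assumes the output is a simple whitespace-separated listing with the status in the last column. Real qfarm /ltload output may be laid out differently, so adjust the parsing to what your farm actually prints.

```python
def servers_with_throttling_off(qfarm_output: str) -> list[str]:
    """Return server names whose last column reads 'Off' (case-insensitive)."""
    flagged = []
    for line in qfarm_output.splitlines():
        fields = line.split()
        # Skip blank or single-column lines and dashed separator rows.
        if len(fields) < 2 or set(line.strip()) <= {"-", " "}:
            continue
        # The header row ends in "Load", so it never matches "off".
        if fields[-1].lower() == "off":
            flagged.append(fields[0])
    return flagged


# Hypothetical sample; the column layout is an assumption, not captured output.
SAMPLE = """\
Server Name          Load Throttling Load
-----------          --------------------
CTXAPP01             0
CTXAPP02             Off
CTXAPP03             Off
"""

if __name__ == "__main__":
    for name in servers_with_throttling_off(SAMPLE):
        print(f"{name}: load throttling is Off, check its license server setting")
```

On a farm server the text could be captured with something like subprocess.run(["qfarm", "/ltload"], capture_output=True, text=True).stdout and passed to the same function.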
11/11/2009 Update: An interesting issue came up the other day about how to use qfarm /load effectively and how best to interpret the oddball results it returns. We recently introduced a new trading app to the farm, where the user base was distributed across the domestic US. After a few days of production runs, we started to receive complaints of "slowness" on the client side. We configured a new load evaluator with several presumably pertinent rules, but nothing appeared in the logs after we fired up logging in the Presentation Server Console. There were no obvious anomalies such as excessive paging, I/O, disk thrashing or processing, yet the qfarm /load command yielded a strikingly high load for the servers in question (in this case 5,000 out of 10,000).
We then looked at what types of incoming and outgoing TCP/UDP connections were in play and found approximately 24 end points worth investigating. We don't use Microsoft Operations Manager here (we run both Unix and Win32 servers), but we do have Perform and Predict as an option. After reviewing the data for each of the 24 servers we found several anomalies, and ultimately one of the servers had a processing load of >90% for a majority of peak production hours. So the reports of Citrix slowness were not caused by the ICA protocol or by a slow Citrix server; they were the result of the UI waiting for data to be processed.
What gets me is that I can recall many situations where Citrix or the network links got blamed and the root cause turned out to be something completely different. My own takeaway notes are as follows: the overall CPU Utilization counter never provides good numbers because it is an average across all processors. This masks any potential affinity issues or excessive processing on one or more processors. To gain more granular insight, look instead at the CPU Utilization - All Processors counters. --END
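To put numbers on the averaging problem, here is a small Python sketch. The per-processor readings and the 90% threshold are invented for illustration and are not taken from the servers discussed above; the point is only that one pegged processor can hide behind a comfortable-looking overall figure.

```python
def summarize_cpu(per_core_percent: list[float], hot_threshold: float = 90.0) -> dict:
    """Compare the overall average with the per-processor picture."""
    overall = sum(per_core_percent) / len(per_core_percent)
    # Indexes of processors running at or above the threshold.
    hot = [i for i, pct in enumerate(per_core_percent) if pct >= hot_threshold]
    return {"overall": overall, "busiest": max(per_core_percent), "hot_cores": hot}


# Hypothetical readings for a four-processor server.
sample = [95.0, 12.0, 8.0, 5.0]
summary = summarize_cpu(sample)

if __name__ == "__main__":
    print(f"overall average: {summary['overall']:.1f}%")
    print(f"busiest processor: {summary['busiest']:.1f}%")
    print(f"processors at or above threshold: {summary['hot_cores']}")
```

With these made-up readings the overall average works out to 30%, which looks healthy, while processor 0 sits at 95%.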
I like to keep most of my notes categorized on this site, both for easy access and for others to use in the event they somehow stumble upon this site in some Google, Bing or Yahoo search. Anybody still use Dogpile?
What you will see below are a few notes on how to use qfarm.exe to gather some very handy farm information on the fly. This section has seriously expanded over time and will continue to do so because I find myself using qfarm more and more frequently as I veer away from the GUI.
11/29/2009 Update: We have scheduled tasks that run every Saturday evening and reboot each Citrix server in a "staggered" manner. For the past two weeks, we have experienced issues with a Citrix server that is used by an outsourcing group and the Taipei office. The problem was only recently revealed when the on-call person was woken up at 2am by the group pager. Using the qfarm /offline command allowed me to find out whether any servers were offline and inaccessible before any production disruptions.
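A post-reboot check along these lines could be automated so nobody finds out from the pager. The Python sketch below is an assumption-heavy illustration: the function names, server names and sample text are hypothetical, and it assumes qfarm /offline prints one server name per line under a "Server Name" header. Check the real output on your farm before relying on it.

```python
def offline_servers(qfarm_offline_output: str) -> set[str]:
    """Collect upper-cased server names from the (assumed) one-per-line listing."""
    names = set()
    for line in qfarm_offline_output.splitlines():
        stripped = line.strip()
        # Skip blank lines and dashed separator rows.
        if not stripped or set(stripped) <= {"-", " "}:
            continue
        # Skip the assumed header row.
        if stripped.lower().startswith("server name"):
            continue
        names.add(stripped.split()[0].upper())
    return names


def still_down(rebooted: list[str], qfarm_offline_output: str) -> list[str]:
    """Return the rebooted servers that are listed as offline."""
    offline = offline_servers(qfarm_offline_output)
    return [name for name in rebooted if name.upper() in offline]


# Hypothetical sample; not captured from a real farm.
SAMPLE_OFFLINE = """\
Server Name
-----------
CTXTPE01
"""

if __name__ == "__main__":
    for name in still_down(["ctxapp01", "ctxtpe01"], SAMPLE_OFFLINE):
        print(f"{name} did not come back after the scheduled reboot")
```

Run after the last staggered reboot finishes, a check like this could feed an email or pager alert well before Monday's production hours.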