Changing farm membership
The admin at the second, questionable server, should go to the command prompt and type “chfarm” – the “Change Farm” utility. Changing farms takes about two minutes, and costs nothing else. As long as we are not on the data store server, we can simply “chfarm” and tell the system we are “creating a new farm”, then we are pulled out of the old farm and launched into the new. Within a couple of minutes, the server says “Farm membership changed successfully” as IMA restarts itself. If the admin is then able to start the management consoles, in the new and empty farm, then there is nothing wrong with the server DLL’s, and the problem was likely just ODBC connectivity. To finish fixing the problem, then, the admin spends two more minutes, changing farm membership again, this time into the existing farm. The IMA data store forgets any ODBC connectivity issues is ever had with this server, and the server re-establishes itself as “connected”.
The same command used above for ODBC connectivity troubleshooting – “chfarm” – is also used in the scenario where we want to permanently change farm membership, possibly migrating a test server from the test farm to the production farm.
When a server changes farms, it abandons the old farm database, and looses along with it any published apps, or any other configuration settings that had been done in the old IMA data store. It comes to the new farm as a Citrix Terminal Server, with apps installed, but nothing published, and now taking all the defaults of the new farm.
When moving servers between farms, documentation is very important, because in Citrix there is no command to ask a server where it thinks the data store is. There is, however, a registry key, at “HKLMsoftwareCitrixIMA”, called “PSSERVER”, and the value of the “PSSERVER” registry key is the DNS name of the data store server. If there is no “PSSERVER” registry key, that server is the IMA data store with an Access DB.
The IMA Zone Data Collector is like the IT manager of it’s “zone”, and the Data Store is like the CIO – just one for the whole company, even if there are multiple “zones” and so multiple “ZDC’s”. If there are two or three servers in the farm, the ZDC is just another production server, serving apps to ICA clients, even though it also has the extra duties of maintaining the “IMA Dynamic Store” of load management information.
After about 50 servers in a zone, (a fuzzy number based on CPU utilization on the ZDC), Citrix recommends a STAND-ALONE ZDC, as well as a stand-alone Citrix License Server. Technically the maximum number of server allowed in a single zone is 512, and there is an IMA registry key to up this number higher if necessary, but if we are using average level machines, the real limit will come long before “512” servers in a zone. the real limit comes when the ZDC just can’t get all it’s work done anymore – first it is being slow for the users connected to apps on that server, then it starts having trouble “enumerating” the apps on the client screens.
At this point we need a second ZDC for the zone, but we have to work around “IMA Law”, which demands there be only ONE ZDC per zone. So just like with an IT department of 50 or more people, and a “stand-alone” manager, we can work around the 1 manager per department stipulation, by splitting the IT department. We break up the 50 people into “help desk”, and “server support”. Now there are two departments, and so two managers, where there used to be one; we use the same strategy with IMA zones. After about a hundred servers in the zone (or somewhere between 100 and 512), we will want to artificially “split the zone”.
Though the default IMA zone configuration is based on IP sub netting, this is only for convenience, and doesn’t have to be the case. So we don’t have to subnet anything differently in order to get two zones, and two data collectors. We simply let “IMA law” and “IP Sub netting” diverge.
We go the PSC farm properties, to the “zones” tab, and use the “New Zone” button to create “zoneB”, then choose some of the servers in the first zone and click “Move Server” to move it to the new zone. (This is an after-hours task because we have to reboot the IMA servers we move.)
After moving some servers into the new zone, we ought to set a “Most Preferred” and “Preferred” Zone Data collector, and document what was configured, for use in client-support. At this point we have two data collectors, where before there was only one, and nothing has changed as far as sub netting.
Citrix recommends MINIMIZING the number of zones in the Citrix farm. They say the number of zones “exponentially increases” the amount of WAN traffic. More recently they have said that more than 25 zones in a farm doesn’t work.
In the diagram above, the admin goes to each location, adds a new server to the farm over port 2512 in the firewall, and because each location is a different subnet, the admin winds up with a ZDC in every location. Still, most of the servers are centralized at headquarters, and only a few servers are distributed across the WAN, in order to serve some back-end DB app.
The issue here is just what kind of traffic, and how frequent, go over the 2512 port to the ZDC’s around the world.
IMA data collectors communicate over the WAN, over port 2512, and transmit any “changes”. Changes could be things like publishing a new app. but changes could also be the fact that a USER LOGGED IN, OR OUT!
So the problem with the IMA defaults for multiple subnets, is that in the scenario above, if any ONE USER simply logs on, or off, there is ZDC communication with every other location in the farm, because all the locations have ZDC’s.
It goes back to the analogy of a ZDC being an IT manager. We have a big IT department, with a manager, at headquarters. Then we start a new location, and staff it with one IT person. The question is whether to make the one IT person a “manager”, or not. If they are made “manager”, then they can’t get any work done because they are on the phone all day in conference calls with their counterparts at the other locations. And so the solution is we do NOT make the lone IT person a manager, but simply an employee who reports to a manager at another location.
So the same might need to be done to the IMA zone configuration, when we have a larger HQ and a bunch of smaller remote “sites”, with Citrix servers. By default, these servers are all “managers”, or “ZDC’s”. Again, without affecting IP sub netting in any way, we want to “collapse” the zones in this case, so that there is NOT a separate ZDC at each location, and then we DON’T have to dial up each location every time someone logs on or off at HQ.
To collapse the zones, we just go back to the PSC farm properties, “zones” tab, and “move servers” into an existing zone. Again, this is for after-hours because we will have to reboot the servers. And one more thing in this case, we have to “delete” the empty zone, according to “IMA law”.
Zone Preference and Failover
Presentation Server 3 and PN Agent 8 introduced a new feature that increases the availability that the Presentation server can provide. ZPF, or Zone Preference and Failover, is a solution to add availability of applications even when an entire site is down – an entire IMA zone. This can also be used for a failover plan to a backup data center, with the backup data center being another zone in the same farm.
Before Presentation Server 3, the only way to publish an application properly was to publish it once for all the servers in one zone, and apply that app to people appropriate to that zone, then to publish the app again, this time on the servers in the other zone, for the users appropriate to that zone, so that there would end up being east coast users accessing an application that ran in NY, and a group west coast users accessing a different published app that ran on servers in LA.
The problem here is that if one site goes down, there may be connectivity and ample server resources on the other side of the WAN, but there was no way to seamlessly start utilizing those resources in a failover scenario.
What some people were doing was publishing the application across multiple zones, and saying everybody could access it. The advantage was that there were servers automatically available even when one whole site was down, automatically. The disadvantage was the huge increase in network traffic, as a user had to span the WAN just to connect to any app, checking if the other side of the WAN didn’t have a less busy server.
Rather than sharing load information across zones, administrators can now create policies, one for each IMA zone, where a preferred zone is defined, and a list of backup zones can be defined, from bkp1 to bkp10. The group east coast users, then, would be applied to a policy in the PSC, with a preferred zone on NY, and a backup 1 zone of LA. The west coast users group would be assigned a different policy, with the LA zone the preferred zone and the NY the backup 1 zone.
While no load information would be shared across zones, in the event that all the Presentation Servers at a particular site are inaccessible, there can be an automatic failover of access to the next geographically convenient set of servers, as defined manually by the Citrix administrator.
The failover plan can be used in a disaster recovery scenario, where there are two data centers, a replicated SAN between them, and some front end SSL/VPN with access to pre-configured policies. If the primary data center were to fail, a strategically placed logon access point could redirect users to a failover zone / data center seamlessly, from the PN Agent client software.
Recovering from a failed data store
One of the objections sometimes to implementing Server Based Computing in the enterprise is the argument against putting all the eggs in one basket, in that if the applications all run centrally; the central server farm is a single point of failure. The Citrix implementation, however, can be put together in such a way that it is reliable, with failover mechanisms to backup data centers in place for catastrophic failure.
The Citrix Presentation Server farm consists of application servers, data collector(s), a license server with a valid license file, a data store, and applications and data. The applications and data are unique in any environment, but the rest is constant and can be supported by best practices.
The application servers are not a single point of failure as long as the N+1 model is used (or the N+25% model in a large implementation), where there is always one more physical server available than is actually required for the user base at maximum load. This way the farm can tolerate losing any one server without noticeable user impact.
The data collector is not a single point of failure either, as a new one will be elected dynamically if the regular one goes down, or cannot be contacted. As long as the clients have a way of getting to the other servers, such as multiple IP addresses in Program Neighborhood, the data collector’s failure should not produce an impact on the functionality of the farm.
The license server is 30-day fault tolerant, and in Enterprise version an alert can be set with Resource Manager to send an email within minutes of License Server Connection Failure. If the license server reconnects at any time in the thirty days the problem resolves itself. If the server is not going to come back up, then the license file, digitally signed with the case-sensitive hostname of the old license server, is the critical component. The license file, a *.lic file, can be backed up to a thumb drive separately, and restored to a new server with the same name of the old license server, and the Citrix License server software installed. The license server can be supported by Microsoft Clustering services. The license file itself is available on the myCitrix website, and can be returned and reallocated to a new license server name as well, in case of an emergency.
The data store is the central repository where almost the entire Citrix implementation is invested. The Administrators of the farm, the license server to point to, the whole farm configuration, the published applications, all their properties, the security of who gets access to what, the custom load evaluators, custom policies, configured printers and print drivers, all this is stored in the central repository called the data store. After an implementation has been around for a while, this repository is extremely unique, and unless it is documented completely down to the hidden detail screens, the farm data store needs to be protected.
The data store is either a SQL, Oracle, or DB2 database on a server outside the actual Presentation Server farm, or else it is an Access or MSDE database on one of the Presentation servers, called mf20.mdb (which showed up with MetaFrame XP, right after MetaFrame 1.8). If it is on the external server, then leaving it up to the DBA’s to back it up isn’t good enough; the data store can become corrupt, and without a solid known good backup of the data store, a series of recent tapes could be suspect or worthless. Since so much is invested in the Citrix data store, a separate copy of the database, from 5 to 20 MB in size, should go on the thumb drive with the license file. That thumb drive, plus a Window server CD and a Citrix server CD, plus the apps and the data, are the Citrix deployment, (apart from any web configuration files in the Web Interface and Secure Gateway or Access Gateway).
The data store itself becomes the single point of failure in some farms, but like the license server it is 30-day fault tolerant, and alerts can be configured in Resource Manager for Data Store Connection Failure as well. As long as the data store is backed up with a known good copy, the data store server in the Presentation server farm is easily replaced.
Unlike the license file on the thumb drive, that is tied to a particular server name, the data store is not tied to a particular server name, and can reside on any Citrix server in the farm, or another server can be built, inserted into the farm, and then host the data store. The SQL Express, SQL, Oracle, or DB2 data stores should be backed up by the utilities that come with them. The Access data store on the first Citrix Presentation server in the farm by default, is backed up with the command dsmaint backup path:, and then can be restored at any time to any Citrix Presentation server in the farm.
When the data store becomes unavailable, the PSC will not launch and the farm will be unmanageable. Users should still be able to access the application servers that are still left, as one of those servers has to be elected the data collector, and that is all the users need in order to connect to already-existing applications. But the farm is also unconfigurable, until the data store is back online.
To restore the data store to a different server, or just to move it to a more convenient place on the network, the procedure is as follows:
- place the mf20.mdb that was backed up in the proper directory: C:ProgramFilesCitrixIndependent Management Architecture;
- create a file dsn to the new data store;
- run dsmaint config /user:user /pwd:password /dsn:path to dsn on the new data store server and restart IMA;
- run dsmaint failover newdatastoreservername on all the other servers in the farm and restart IMA
To create a dsn file, go to the control panel, administrative tools, of the Citrix server that holds the new data store, and go to “Data Sources (ODBC)”. On the tab marked “file dsn”, create a new file, with Access 4.0 drivers, that is in the same directory as the mdb file is, and can be named anything, but for convention should be mf20.dsn. on the final screen, the actual database that the dsn file is supposed to point to must be selected. Under the select button, highlight the proper database, (not the imalhc.mdb but the mf20.mdb) and close the utility.
There should now be a dsn file in the “Program Files/Citrix/Independent Management Architecture” directory of the server that is about to become the new data store server.
When the servers first join the farm, they need to know where the data store is supposed to be. They log this server’s name in their registry, under HKLM, in the IMA key. When the data store moves, even if it moves directly onto a server, the IMA configuration doesn’t automatically discover the new farm location with some kind of broadcast. The servers are still looking for the data store on the old server, and start their 30-day countdown to stop receiving connections. After the data store is moved to a different server, the new data store server needs to be told that it is the new data store, and then all the other servers in the farm need to be told the new name of the data store server, so they can failover.
Although the identity of the Data Collector can be seen with the QFARM tool, and with the QUERYDC tool when the data store is unavailable, the identity of the actual data store, and the value of the key that says where a server thinks the data store is, are not readily available through a Citrix command line utility. Therefore administrators should be very careful to document where they are putting the data store, and to make sure all the servers in the farm are pointing to the correct server.
To tell a server that it is the new data store, there is a simple command line with three switches, that ends up looking something like dsmaint config /user:Administrator /pwd:password /dsn:”C:Program FilesCitrixIndependent Management Architecturemf20.dsn”, but of course this could be prepared for ahead of time and made into a script. Once the command line returns the word “Successful”, the IMA service can be restarted, and the one server is back to having management capability.
But the rest of the servers in the farm are still on their 30-day countdowns. The command to failover the other servers in the farm to the new data store server is dsmaint failover newserver. After that, the IMA service needs to be restarted (net restart imaservice)
Once IMA restarts successfully on all the servers, the Citrix implementation is back into full manageability.
The Access or SQL Express data store can be easily migrated with the dsmaint utility. The option is dsmaint migrate, and instead of three parameters, as in the dsmaint config command, there are six: src user, src pwd, src dsn, and dest user, dest pwd, and dest dsn. A dsn file is set up to the new SQL, Oracle or DB2 database, and the data store is migrated.
If a server needs to be removed from the farm, the proper way is to either chfarm, or to uninstall the Citrix software. If for some reason a server has left the farm unexpectedly, and is gone for good, but still has vestiges hanging around in the PSC, then there is a command line utility to check and if necessary clean the data store. The command is dscheck, and dscheck /clean to actually clean up the inconsistencies.