Cloning Citrix XenApp 4.5 on VMware ESX 3.5

Though running Citrix XenApp 4.5 – formerly Citrix Presentation Server 4.5 – on XenServer 4.1, the new virtualization platform from Citrix, might seem the more likely topic for a Citrix Training Center blog, today's Citrix story is actually about running XenApp 4.5 on VMware's Virtual Infrastructure 3 (VI3), because that's what the customer was doing.

Apparently there's a new industry emerging, called "infrastructure on demand", that is projected to bloom into a hundred-and-sixty-billion-dollar market soon.

The task was to create a "button" to press that would provision Citrix server VMs immediately – apps installed, Citrix software installed, configuration complete and documented. Many of my peers approach this from a scripting point of view, with the Windows Sysprep utility and Enteo software complicating the process, but I decided to see if I could get a Citrix VM clone going. I've been hanging around training centers for the past 15 years, dealing with Ghost issues all that time; I've seen an older MetaFrame XP advanced admin course that filled a full page of bullets with what to watch out for when cloning Citrix servers; and I know the current CCIA books say, several times, that Citrix supports cloning.

I Googled the issue and found very little from Citrix specifically – understandable, given their big push for XenServer. Still, there were a few bloggers who'd tried it, and a couple of PowerPoints by VMware and Citrix from a couple of years ago, so I tried the old tricks on the new software.

I used Citrix XenApp 4.5 Feature Pack 1 (FP1), on a Windows 2003 R2 base server, in a Windows 2003 AD domain. The VMware box was ESX 3.5 on a powerful Dell server, with VirtualCenter 2.5 installed on a VM inside the ESX box, as the cloning feature is exclusive to VirtualCenter and not available on ESX 3.5 managed directly as a standalone host.

After building the domain controller, installing TS Licensing and Citrix Licensing on that DC, installing SQL Express on the DC, and configuring AD appropriately for Citrix (see the blog article on this site…), I built the prototype Citrix server, installing Terminal Services and XenApp 4.5.

That much is standard stuff you'd do on day one of a Citrix class, or day one or two of an implementation. The only modifications here, in creating the "golden template" from which all future Citrix servers will be cloned, are these:

1) The balloon memory driver feature of VMware Tools should be left out of the Citrix server image.


2) The Resource Manager feature is de-selected, as EdgeSight will be used to monitor the performance of the Citrix Servers, as well as load test and optimize them. (If Resource Manager were to be left installed, the local resource manager database would need to be deleted before cloning, while the Resource Manager service was stopped.)

3) Each VM needs a full two gigabytes of RAM, and there should be two VMDKs: one for the system, at about 10 GB, and another for applications. The page file should go on the second VMDK.
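
Moving the page file is normally done in System Properties, but if the golden-template build is being scripted, the same setting can be written straight to the registry. A minimal sketch, assuming the second VMDK shows up as D: and a fixed 2 GB page file (both assumptions; adjust to taste) – the change takes effect at the next reboot:

  rem put a fixed 2 GB page file on the second disk (D: is an assumption)
  reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v PagingFiles /t REG_MULTI_SZ /d "D:\pagefile.sys 2048 2048" /f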

4) The Microsoft utility “UPHClean” should be installed, and any profiles should be cleaned up before cloning – ideally no profiles will ever have been created on the golden template.

5) Several settings in Terminal Services that are relevant to the Citrix implementation can be configured either in GPO's or in the Terminal Services Configuration (TSC) tool on each server. Since in the Windows 2000 days there was no option for Terminal Server GPO's, and since we're cloning anyway, out of nostalgia I set the security in the TSC – specifically locking down the RDP listener to admins only (a big default security hole!), and configuring disconnect timeouts and shadow settings.

6) The applications installed were Microsoft Office 2003, the Microsoft CRM client, and Adobe Reader. They were installed using Microsoft transform ("TRANSFORMS") files, edited in Orca.
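
For reference, an MSI-with-transform install on a terminal server looks something like the sketch below; the file names are placeholders, and the "change user" wrapper puts the server into install mode, which Terminal Services requires:

  change user /install
  rem install with a custom transform, basic UI only
  msiexec /i PRO11.MSI TRANSFORMS=custom.mst /qb
  change user /execute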

7) The Domain Admins need to be set as administrators for the farm, if that hasn't already been done; there will be a problem later if the domain admin can't manage the farm.

 


8) With the Platinum license, the Citrix server has to be patched, the new Access Management Console (AMC) has to be downloaded, and the server edition inside the new AMC has to be set to "Platinum" – about twenty-five minutes of tedious work that the new cloning process washes away, as long as it is endured once during the creation of the golden template.


9) One more thing on the golden template – the "newsid" utility from Microsoft should be easily accessible on the local hard drive, because the server will need to be renamed before it hits the network. The "querydc" utility is also quite useful (see the blog on advanced IMA) and might as well be included in the template as well. Any other custom utilities should be added at this point.

When the golden template is finally perfect, and has been tested, any Web Interface or PNAgent site should be deleted, as it is far easier to re-create one on the template than to delete its copies from every clone.

Finally, in the VMware Infrastructure Client, in the VirtualCenter "datacenter", the XenApp server can be "cloned to template".


To see the template, switch to the "Virtual Machines and Templates" view in the Inventory of the VMware Infrastructure Client (VIC). Right-clicking the template object offers either cloning, or "deploying" a virtual machine from the template. "Deploy" is the common option used in VMware; it requires Sysprep files copied to the VirtualCenter server, and a customization wizard with ten screens full of questions. The purpose of this exercise was to get the job done as fast as possible, so I chose the other option, "clone", which is like the old Ghost technology: it just makes a copy of the original, and you are on your own as far as customizing it properly.


The clock, so to speak, starts ticking when I click "clone", with all the potential issues accounted for. I tested my server in a test lab and then in production, and the test results held up; the design scaled well, with the template converted to a VM temporarily to get updates to the CRM client before going back into the "Templates" view. The clock stops when I have a second server in the farm – apps installed, best practices configured, joined to the domain properly, in a scalable model – which happens about twenty-five minutes later.

First, logged into the console of the new server in VirtualCenter, I disable the NIC. Can't have this thing on the network yet.


Second, I run "newsid" from the local hard drive of the new clone. While generating a new SID, it also renames the computer to whatever we want – here we name it the same as the name in VirtualCenter.


When the new server reboots after newsid, log in as the local administrator and move it from the domain to a workgroup. Then enable the NIC, change and configure the IP address, and re-join the domain.


Log in as the domain admin now, launch a command prompt, and invoke the “change farm” utility, by typing “chfarm”. This will allow the server to be (temporarily) placed in a stand-alone “test” farm with a local MS Access database.


When the IMA service starts successfully – about two minutes later – restart the change farm utility, this time joining the production farm as if for the first time.

 


 

The final catch, though, is the super-critical-for-printing system account that's supposed to be local to every Citrix 4.5 server: the "ctx_cpsvcuser" account comes out of the clone corrupt. One way to fix this is to go into the printers and drivers section of the Windows server, remove the Citrix Universal Printer, and delete the corrupted account from the permissions tab of the ICA listener in the TSC.

 


 

Then go to "Add/Remove Programs", click the "Citrix Presentation Server 4.5" entry, and click "Change/Remove". On the following screen, faced with "Modify, repair, or remove", choose "repair". Browse to the MSI on the Citrix server CD, and wait. Seven or eight minutes can go by, but when it's over, the driver AND the user account are fixed.


The published apps need to be modified to run on the new server as well, and the custom load evaluator has to be applied to the new server manually.


Finally, the Citrix server is moved into the appropriate Organizational Unit (OU) in the domain, applying all security and profile optimization settings.


And we have a new workhorse in the farm, hitting the ground running, about twenty-five minutes after the clock started running.

Was this information helpful? Feel free to share your comments/experiences below.

Advanced IMA – Compatibility Mode – Part 2

Changing farm membership

The admin at the second, questionable server should go to the command prompt and type "chfarm" – the "change farm" utility. Changing farms takes about two minutes, and costs nothing else. As long as we are not on the data store server, we can simply run "chfarm" and tell the system we are "creating a new farm"; we are pulled out of the old farm and launched into the new one. Within a couple of minutes, the server says "Farm membership changed successfully" as IMA restarts itself. If the admin is then able to start the management consoles in the new, empty farm, there is nothing wrong with the server DLL's, and the problem was likely just ODBC connectivity. To finish fixing the problem, the admin spends two more minutes changing farm membership again, this time back into the existing farm. The IMA data store forgets any ODBC connectivity issues it ever had with this server, and the server re-establishes itself as "connected".

The same command used above for ODBC connectivity troubleshooting – “chfarm” – is also used in the scenario where we want to permanently change farm membership, possibly migrating a test server from the test farm to the production farm.

When a server changes farms, it abandons the old farm database, and loses along with it any published apps and any other configuration settings that had been made in the old IMA data store. It comes to the new farm as a Citrix Terminal Server with apps installed but nothing published, now taking all the defaults of the new farm.

When moving servers between farms, documentation is very important, because in Citrix there is no command to ask a server where it thinks the data store is. There is, however, a registry key at "HKLM\Software\Citrix\IMA" called "PSSERVER", whose value is the DNS name of the data store server. If there is no "PSSERVER" value, that server IS the IMA data store, with an Access DB.
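
So the quickest check from the command prompt is a registry query – a sketch, assuming the key path above:

  reg query "HKLM\Software\Citrix\IMA" /v PSSERVER

If the value comes back, it names the data store server; if the query can't find the value, the server being queried is itself the data store.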

Splitting Zones

The IMA Zone Data Collector is like the IT manager of its "zone", and the data store is like the CIO – just one for the whole company, even if there are multiple "zones" and so multiple ZDC's. If there are two or three servers in the farm, the ZDC is just another production server, serving apps to ICA clients, even though it also has the extra duty of maintaining the "IMA dynamic store" of load management information.

After about 50 servers in a zone (a fuzzy number based on CPU utilization on the ZDC), Citrix recommends a STAND-ALONE ZDC, as well as a stand-alone Citrix license server. Technically the maximum number of servers allowed in a single zone is 512, and there is an IMA registry key to raise this number if necessary, but with average machines the real limit comes long before 512 servers in a zone. The real limit comes when the ZDC just can't get all its work done anymore – first it becomes slow for the users connected to apps on that server, then it starts having trouble "enumerating" the apps on the client screens.

At this point we need a second ZDC for the zone, but we have to work around "IMA law", which demands there be only ONE ZDC per zone. Just as an IT department of 50 or more people with one "stand-alone" manager can work around a one-manager-per-department stipulation by splitting the department – break the 50 people into "help desk" and "server support", and now there are two departments, and so two managers, where there used to be one – we use the same strategy with IMA zones. After about a hundred servers in the zone (or somewhere between 100 and 512), we will want to artificially "split the zone".

Though the default IMA zone configuration is based on IP subnetting, this is only for convenience, and doesn't have to be the case. We don't have to subnet anything differently in order to get two zones and two data collectors; we simply let "IMA law" and IP subnetting diverge.

We go to the PSC farm properties, to the "Zones" tab, and use the "New Zone" button to create "zoneB", then choose some of the servers in the first zone and click "Move Server" to move them to the new zone. (This is an after-hours task because we have to reboot the IMA servers we move.)


After moving some servers into the new zone, we ought to set a "Most Preferred" and a "Preferred" Zone Data Collector, and document what was configured, for use in client support. At this point we have two data collectors where before there was only one, and nothing has changed as far as subnetting.


Collapsing Zones

Citrix recommends MINIMIZING the number of zones in the Citrix farm. They say the number of zones “exponentially increases” the amount of WAN traffic. More recently they have said that more than 25 zones in a farm doesn’t work.

If the Presentation servers are spread out in multiple locations, by default there are multiple zones. Usually, in this case, there is room for optimization.


Picture a farm spread across several locations: the admin goes to each location, adds a new server to the farm over port 2512 in the firewall, and because each location is a different subnet, the admin winds up with a ZDC in every location. Still, most of the servers are centralized at headquarters, and only a few servers are distributed across the WAN, in order to serve some back-end DB app.

The issue here is just what kind of traffic, and how frequently, goes over port 2512 to the ZDC's around the world.

IMA data collectors communicate over the WAN, over port 2512, and transmit any "changes". Changes could be things like publishing a new app – but a change could also be the fact that a USER LOGGED IN, OR OUT!


So the problem with the IMA defaults for multiple subnets is that, in this scenario, if any ONE USER simply logs on or off, there is ZDC communication with every other location in the farm, because all the locations have ZDC's.

It goes back to the analogy of a ZDC being an IT manager. We have a big IT department, with a manager, at headquarters. Then we start a new location, and staff it with one IT person. The question is whether to make the one IT person a “manager”, or not. If they are made “manager”, then they can’t get any work done because they are on the phone all day in conference calls with their counterparts at the other locations. And so the solution is we do NOT make the lone IT person a manager, but simply an employee who reports to a manager at another location.

The same might need to be done to the IMA zone configuration when we have a larger HQ and a bunch of smaller remote "sites" with Citrix servers. By default, these servers are all "managers", or ZDC's. Again, without affecting IP subnetting in any way, we want to "collapse" the zones in this case, so that there is NOT a separate ZDC at each location, and we DON'T have to dial up each location every time someone logs on or off at HQ.

To collapse the zones, we just go back to the PSC farm properties, "Zones" tab, and "Move Servers" into an existing zone. Again, this is for after-hours, because we will have to reboot the servers. And one more thing in this case: we have to delete the now-empty zone, according to "IMA law".

Zone Preference and Failover

Presentation Server 3 and PN Agent 8 introduced a new feature that increases the availability the Presentation Server farm can provide. ZPF, or Zone Preference and Failover, is a solution that keeps applications available even when an entire site – an entire IMA zone – is down. This can also be used as a failover plan to a backup data center, with the backup data center being another zone in the same farm.

Before Presentation Server 3, the only way to publish an application properly was to publish it once for all the servers in one zone, and assign that app to the people appropriate to that zone, then publish the app again on the servers in the other zone, for the users appropriate to that zone. The end result would be east coast users accessing an application that ran in NY, and a group of west coast users accessing a different published app that ran on servers in LA.

The problem here is that if one site goes down, there may be connectivity and ample server resources on the other side of the WAN, but there is no way to seamlessly start utilizing those resources in a failover scenario.

What some people were doing was publishing the application across multiple zones and letting everybody access it. The advantage was that servers were automatically available even when one whole site was down. The disadvantage was the huge increase in network traffic, as a user had to span the WAN just to connect to any app, checking whether the other side of the WAN had a less busy server.

Presentation Server 3 introduced ZPF, which is a feature in the PSC policies and only works with PNAgent ICA client software.


Rather than sharing load information across zones, administrators can now create policies, one for each IMA zone, where a preferred zone is defined and a list of backup zones can be defined, from bkp1 to bkp10. The east coast users group, then, would be assigned a policy in the PSC with a preferred zone of NY and a backup 1 zone of LA. The west coast users group would be assigned a different policy, with LA the preferred zone and NY the backup 1 zone.


While no load information is shared across zones, in the event that all the Presentation Servers at a particular site are inaccessible, there can be an automatic failover to the next geographically convenient set of servers, as defined manually by the Citrix administrator.

The failover plan can be used in a disaster recovery scenario, where there are two data centers, a replicated SAN between them, and some front end SSL/VPN with access to pre-configured policies. If the primary data center were to fail, a strategically placed logon access point could redirect users to a failover zone / data center seamlessly, from the PN Agent client software.

Recovering from a failed data store

One of the objections sometimes raised against implementing Server Based Computing in the enterprise is the argument against putting all the eggs in one basket: if the applications all run centrally, the central server farm is a single point of failure. The Citrix implementation, however, can be put together in such a way that it is reliable, with failover mechanisms to backup data centers in place for catastrophic failure.

The Citrix Presentation Server farm consists of application servers, data collector(s), a license server with a valid license file, a data store, and applications and data. The applications and data are unique in any environment, but the rest is constant and can be supported by best practices.

The application servers are not a single point of failure as long as the N+1 model is used (or the N+25% model in a large implementation), where there is always one more physical server available than is actually required for the user base at maximum load. This way the farm can tolerate losing any one server without noticeable user impact.

The data collector is not a single point of failure either, as a new one will be elected dynamically if the regular one goes down, or cannot be contacted. As long as the clients have a way of getting to the other servers, such as multiple IP addresses in Program Neighborhood, the data collector’s failure should not produce an impact on the functionality of the farm.

The license server is 30-day fault tolerant, and in the Enterprise edition an alert can be set with Resource Manager to send an email within minutes of a License Server Connection Failure. If the license server reconnects at any time within the thirty days, the problem resolves itself. If the server is not going to come back up, then the license file, digitally signed with the case-sensitive hostname of the old license server, is the critical component. The license file, a *.lic file, can be backed up to a thumb drive separately, and restored to a new server with the same name as the old license server and the Citrix license server software installed. The license server can also be supported by Microsoft Clustering services, and the license file itself is available on the MyCitrix website, where it can be returned and reallocated to a new license server name in case of an emergency.
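
Grabbing that backup is a one-liner – a sketch, assuming the default licensing path and a thumb drive at E: (both assumptions; check where the *.lic files actually live on your license server):

  xcopy "C:\Program Files\Citrix\Licensing\MyFiles\*.lic" E:\licbackup\ /y /i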

The data store is the central repository where almost the entire Citrix implementation is invested. The Administrators of the farm, the license server to point to, the whole farm configuration, the published applications, all their properties, the security of who gets access to what, the custom load evaluators, custom policies, configured printers and print drivers, all this is stored in the central repository called the data store. After an implementation has been around for a while, this repository is extremely unique, and unless it is documented completely down to the hidden detail screens, the farm data store needs to be protected.

The data store is either a SQL, Oracle, or DB2 database on a server outside the actual Presentation Server farm, or else it is an Access or MSDE database on one of the Presentation Servers, called mf20.mdb (which showed up with MetaFrame XP, right after MetaFrame 1.8). If it is on an external server, then leaving it up to the DBA's to back it up isn't good enough; the data store can become corrupt, and without a solid known-good backup of the data store, a series of recent tapes could be suspect or worthless. Since so much is invested in the Citrix data store, a separate copy of the database, from 5 to 20 MB in size, should go on the thumb drive with the license file. That thumb drive, plus a Windows server CD and a Citrix server CD, plus the apps and the data, are the Citrix deployment (apart from any web configuration files in the Web Interface and Secure Gateway or Access Gateway).

The data store itself becomes the single point of failure in some farms, but like the license server it is 30-day fault tolerant, and alerts can be configured in Resource Manager for Data Store Connection Failure as well. As long as the data store is backed up with a known good copy, the data store server in the Presentation server farm is easily replaced.

Unlike the license file on the thumb drive, which is tied to a particular server name, the data store is not tied to a particular server name; it can reside on any Citrix server in the farm, or another server can be built, inserted into the farm, and then host the data store. The SQL Express, SQL, Oracle, or DB2 data stores should be backed up with the utilities that come with them. The Access data store – on the first Citrix Presentation Server in the farm by default – is backed up with the command "dsmaint backup <path>", and can then be restored at any time to any Citrix Presentation Server in the farm.
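
A sketch of that Access backup, assuming the thumb drive is mounted as E::

  md E:\dsbackup
  dsmaint backup E:\dsbackup

The command places a closed copy of mf20.mdb at the path given, leaving the live database in place.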

When the data store becomes unavailable, the PSC will not launch and the farm will be unmanageable. Users should still be able to access the application servers that are still left, as one of those servers has to be elected the data collector, and that is all the users need in order to connect to already-existing applications. But the farm is also unconfigurable, until the data store is back online.

To restore the data store to a different server, or just to move it to a more convenient place on the network, the procedure is as follows (a consolidated command-line sketch follows the list):

  1. place the mf20.mdb that was backed up in the proper directory: C:\Program Files\Citrix\Independent Management Architecture;
  2. create a file DSN to the new data store;
  3. run "dsmaint config /user:<user> /pwd:<password> /dsn:<path to DSN file>" on the new data store server and restart IMA;
  4. run "dsmaint failover <new data store server name>" on all the other servers in the farm and restart IMA.
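
Strung together, the restore looks roughly like this. The server name CTXNEW, the E: backup path, and the credentials are all placeholders, and the DSN creation in step 2 is done in the ODBC GUI as described below:

  rem on the new data store server:
  copy E:\dsbackup\mf20.mdb "C:\Program Files\Citrix\Independent Management Architecture\"
  dsmaint config /user:Administrator /pwd:password /dsn:"C:\Program Files\Citrix\Independent Management Architecture\mf20.dsn"
  net stop imaservice
  net start imaservice

  rem then, on every OTHER server in the farm:
  dsmaint failover CTXNEW
  net stop imaservice
  net start imaservice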

To create a DSN file, go to Control Panel, Administrative Tools, on the Citrix server that holds the new data store, and open "Data Sources (ODBC)". On the tab marked "File DSN", create a new file, with the Access 4.0 driver, in the same directory as the mdb file; it can be named anything, but by convention should be mf20.dsn. On the final screen, the actual database that the DSN file is supposed to point to must be selected. Under the Select button, highlight the proper database (not the imalhc.mdb but the mf20.mdb) and close the utility.
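
The finished file DSN is just a small text file; the one the wizard produces should look roughly like this (extra driver parameters omitted):

  [ODBC]
  DRIVER=Microsoft Access Driver (*.mdb)
  DBQ=C:\Program Files\Citrix\Independent Management Architecture\mf20.mdb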

There should now be a DSN file in the "C:\Program Files\Citrix\Independent Management Architecture" directory of the server that is about to become the new data store server.

When servers first join the farm, they need to know where the data store is supposed to be, and they log this server's name in their registry, under HKLM, in the IMA key. When the data store moves, even if it moves directly onto a given server, the IMA configuration doesn't automatically discover the new data store location with some kind of broadcast. The servers keep looking for the data store on the old server, and start their 30-day countdown to refusing connections. After the data store is moved to a different server, the new data store server needs to be told that it is the new data store, and then all the other servers in the farm need to be told the new name of the data store server, so they can fail over.

Although the identity of the Data Collector can be seen with the QFARM tool, and with the QUERYDC tool when the data store is unavailable, the identity of the actual data store, and the value of the key that says where a server thinks the data store is, are not readily available through a Citrix command line utility. Therefore administrators should be very careful to document where they are putting the data store, and to make sure all the servers in the farm are pointing to the correct server.

To tell a server that it is the new data store, there is a simple command line with three switches, which ends up looking something like dsmaint config /user:Administrator /pwd:password /dsn:"C:\Program Files\Citrix\Independent Management Architecture\mf20.dsn" – and of course this can be prepared ahead of time and made into a script. Once the command line returns the word "Successful", the IMA service can be restarted, and that one server is back to having management capability.

But the rest of the servers in the farm are still on their 30-day countdowns. The command to fail the other servers in the farm over to the new data store server is "dsmaint failover <new server>". After that, the IMA service needs to be restarted ("net stop imaservice" followed by "net start imaservice").

Once IMA restarts successfully on all the servers, the Citrix implementation is back into full manageability.

The Access or SQL Express data store can be easily migrated with the dsmaint utility. The option is "dsmaint migrate", and instead of three parameters, as in the dsmaint config command, there are six: source user, source password, and source DSN, plus destination user, destination password, and destination DSN. A DSN file is set up pointing to the new SQL, Oracle, or DB2 database, and the data store is migrated.
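
A sketch of the migrate command, with placeholder credentials and DSN file names:

  dsmaint migrate /srcdsn:"C:\Program Files\Citrix\Independent Management Architecture\mf20.dsn" /srcuser:Administrator /srcpwd:password /dstdsn:"C:\Program Files\Citrix\Independent Management Architecture\mf20sql.dsn" /dstuser:sa /dstpwd:password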

If a server needs to be removed from the farm, the proper way is to either chfarm it out, or uninstall the Citrix software. If for some reason a server has left the farm unexpectedly and is gone for good, but still has vestiges hanging around in the PSC, there is a command line utility to check, and if necessary clean, the data store: "dscheck" to report, and "dscheck /clean" to actually clean up the inconsistencies.

Advanced IMA – Compatibility Mode – Part 1

Advanced IMA

In a farm of fewer than (roughly) thirty servers, with all the servers in one location, IMA runs itself, and the defaults are appropriate.

In a larger farm, or a more complex implementation where there are servers in multiple locations, the IMA defaults may need to be modified in order to optimize the implementation.

And even in smaller farms, there are a few important IMA issues to be aware of, such as how to back up and restore the IMA data store, and how to recover from a corrupt "Local Host Cache".

Independent Management Architecture (IMA) is both an architecture and a protocol; as a protocol, it runs over port 2512 and holds the Presentation Server farm together. As an architecture, it is what makes the farm scalable.

Back with MetaFrame 1.8, Citrix servers were like Windows 3.11 machines, in that they BROADCAST to each other in order to be in the farm together. We were meant to have one, or maybe two or three, servers together on one LAN, broadcasting to each other to maintain connectivity. Before its time, Citrix took off, and there were customers with HUNDREDS of MetaFrame 1.8 servers in a farm, BROADCASTING to each other. If servers were on multiple subnets, we had to configure single-point-of-failure "ICA Gateways".

IMA Data Store

With MetaFrame XP, the scalable IMA architecture was released. First of all, there was the new "data store", a static DBMS database which holds all the configuration data for the farm. Smaller POC implementations could use a runtime Access database, and the standard in production is to place the data store on a SQL 2000 or 2005 server. There is only ever ONE data store for the farm. The data store is a 30-day fault-tolerant single point of failure, and we can set up Resource Manager alerts to tell us immediately if a server loses its connection to the data store. When the data store is MS Access or SQL 2005 Express, it is installed DIRECTLY on the first Citrix server in the farm. When it goes on SQL (or Oracle or DB2), the data store should be on a server OUTSIDE the Citrix farm, not on a Presentation Server.

In the case of a remote data store, the first server in the farm is given a DSN – a direct connection to the data store. The rest of the servers in the farm receive an INDIRECT connection to the data store, through that first server; if the first server in the farm is down, no server can access the data store. This is the most subtle single point of failure in Citrix. It is 30-day fault-tolerant, but after thirty days without that one particular Citrix server, the whole farm stops accepting connections.

The Citrix recommendation for this situation is to create DIRECT connections to the data store, by going to some other servers in the farm, even all the servers, and adding a DSN file manually that points to the data store.

The data store can be backed up, and restored to a different server if necessary. To back up a local MS Access data store, there is a Citrix command line utility, "dsmaint backup <path>", which takes the locally stored IMA data store from the "C:\Program Files\Citrix\Independent Management Architecture" directory and places a closed copy of mf20.mdb at the path given at the end of the command line (preferably a thumb drive).

To back up a SQL data store, use Enterprise Manager or Management Studio, to back up the database as a “single file”, and place it in a secure place, as this is the complex heart of the Citrix implementation, and we wouldn’t want to have to recreate it from scratch.

Even if the SQL team is backing up the data store nightly, we still want a recent 'last known good' on a separate, static thumb drive. If the data store becomes corrupt on the SQL server, we can always go back to our 'last known good'. How often should this data store be backed up? Not necessarily nightly, because a corrupt data store could easily overwrite the last good copy. Rather, each time we do significant configuration work – adding more servers, changing policy settings, changing the printer configuration – we want to take a new 'last known good', so we can always bring the implementation back to this point.

Without being diligent about backing up the data store, we can still get a 'last known good', sort of. Every time the IMA service restarts, it renames the Access data store to mf20.bak, and that can also be considered a backup – but we don't want to have to depend on a backup from the last time we rebooted the server.

A strategy for restoring our diligently backed-up data store will follow, at the end of this chapter.

Zone Data Collector

The second new component within the IMA architecture was the IMA "Local Host Cache", a runtime version of the data store information that's relevant to the particular server – when an admin configures an IMA server, or a user connects and launches an app, they are actually contacting the IMA Local Host Cache (IMALHC).

And the IMA LHC on each server is the basis for the "Zone Data Collector" (ZDC). The ZDC is in some ways more critical to the farm than the data store, because we can't go thirty days without a ZDC: no one can connect to the farm if we don't have one.

A Zone Data Collector is elected dynamically, and in the small farm that runs itself, the first server in the farm (whether or not the data store is installed locally) is set as "Most Preferred" to win the ZDC elections that occur every time a server reboots or joins the zone. The rest of the servers in the zone are set to "Default Preference".

Once we have several servers in a farm, Citrix recommends hard-coding a backup ZDC or two. The ZDC is a critical role: answering all client requests, querying the "dynamic store" it maintains in RAM, returning the name of the least busy server, and keeping the dynamic store information updated. If a ZDC goes offline or can't be contacted – and even if another server simply comes online – there is a ZDC "election", and no matter what the preferences, SOME server will be elected the ZDC. The current ZDC can be found by typing "qfarm" at the command prompt of any Citrix server. The "D" to the right of one of the servers means that server is the zone "D"ata collector. (The asterisk just means this is the server we are typing on at the moment.)



By default, the server denoted as "Most Preferred" will always win the zone elections. But if the main ZDC is down, by default the next one elected could be anybody, since all the other servers in the zone are set to "Default Preference". Citrix recommends setting a second-in-command for the important position of ZDC, even a third, rather than leaving it up to the random host ID that got configured dynamically during server install.

To control who gets elected ZDC, we use the PSC farm management tool, go to the properties of the farm, and click the “zones” tab. In the GUI, the blue check means “Most Preferred”. To set another server as “Preferred”, we can right click a default preference server, and add the orange pyramid, which means “preferred”, or “next-in-line to be ZDC”.



Though the "qfarm" command comes back telling us with a "D" who the ZDC is, "qfarm" is not going to be available if the data store is down; "qfarm" queries the data store for who has won the ZDC election. When the data store is down, the cockpit of the implementation is closed, and "qfarm" only works in the cockpit.

There is, however, another command, not installed by Citrix by default but available on the server CD under "support", "debug", "win2K3", which can be copied to the server and used at the command line: "querydc". The querydc command queries the ZDC itself, and so will work in the cabin when the cockpit is closed. Also, "querydc -e" will force a ZDC election (something that would otherwise happen within the next five minutes if left alone).


If the whole point of implementing Citrix is greater centralization, why is it that many implementations involve Citrix servers in multiple locations? Wouldn’t it be better to keep all the servers in one location, and send out ICA to all clients everywhere?

Ideally, that would be the design, with the only ‘other’ location for Citrix servers being a “Disaster Recovery” (DR) site / zone.

But many Citrix implementations are not designed from the ground up, but rather evolve slowly; the first reason to bring Citrix into the enterprise is often because it “fixes” some back-end database application that was running slow over the wire. The application vendor tells the customer that Citrix fixes the problem, so they put a Citrix server in front of the database app, and instead of the application logic going over the WAN to all the user PC’s, the client software is installed on the Citrix server, the app logic runs only on the backbone between the database and the Citrix server, and everything is made better by the lean ICA protocol.

And if we are putting Citrix servers in front of our back-end databases, and we have already spread the databases out at multiple locations, we now have Citrix servers in multiple locations.

With IMA, we can span locations, buildings, countries, and continents, with the Citrix servers in a single farm, and the farm can be managed from anywhere in the world, from any one server.

The default configuration of IMA in the farm that spans locations MAY be ok, but it depends on how the Citrix servers are utilized, and there is often an opportunity to improve things by changing the defaults.

By default the “zone name” is the IP subnet that the server is being added to. The first server on the subnet is the “Most Preferred” ZDC, and all the other servers on the same subnet realize that there is already a “Most Preferred” ZDC, so they join with their elected representative over IMA port 2512, then stay silent as far as the WAN is concerned.

If the admin goes to a DIFFERENT subnet and adds a server to the existing farm (which will only work if 2512 is open between the subnets), then the server is still part of the same FARM; but because it realizes it is on a new subnet and there is not yet a data collector for this subnet, the first server on the new subnet becomes the "Most Preferred" ZDC for a NEW ZONE, and IMA continues to run itself, with one data store per farm and one ZDC per zone, according to "IMA law".


Troubleshooting IMA

With a distributed database-related structure such as IMA, consistency and connectivity issues can arise in a production environment.

If an admin at one WAN location says he published an app or changed a PSC policy, and the admin at the second location says he doesn't see the change, there could be several different reasons for this.

This isn't the default behavior. Ideally, as soon as the first admin made the change to the IMALHC on the server he was connected to, the change would have propagated to the data store, and a change notification would have gone out to all the other IMA servers in the farm to pick up the change immediately. All servers should reflect the change in real time.

If the second admin is across a slow WAN from the first admin, then it is conceivable that the change was lost in the traffic on the wire, and the local IMALHC gave up on getting the update. The solution for this situation is a simple "dsmaint" command: "dsmaint refreshLHC". The admin at the second location types this at the command prompt of the Citrix server, and if that was all that was wrong, then the refresh was all that was needed.

The IMALHC is just an MS Access database, and can also become corrupt. As opposed to the mission-critical IMA data store, the IMALHC is expendable. If the admin does a "refreshLHC" and still isn't happy with what he's seeing, he can go further and type "dsmaint recreateLHC" at the command prompt.

The "recreateLHC" command doesn't actually update or create anything; it simply sets a registry value – "PSREQUIRED", under HKLM\Software\Citrix\IMA – from a zero to a one. The significance is that the next time the Citrix Independent Management Architecture service is RESTARTED, the IMALHC will be completely cast aside, and a brand-new IMALHC will be rebuilt from the IMA data store over the wire. If the data store is big, or across a WAN, expect this process to take a little longer than normal. (The number of servers, published apps, PSC policies, and print drivers in the data store are what make it big.) The admin on the second set of servers has to restart the IMA service manually after entering the "dsmaint recreateLHC" command, in order to get the LHC back to normal.

While the LHC is rebuilding, there is a blank Windows screen with the mouse cursor in the middle. There is no Citrix GUI to tell you which data store the server is pulling from, how far along it is, how long the entire process will take, or even the fact that something important is happening. (In fact, there is no command in Citrix to say where the server THINKS the data store is.)
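
The whole sequence at that second server's command prompt, then – restart included, since recreateLHC by itself only flips the registry flag:

  dsmaint recreateLHC
  net stop imaservice
  net start imaservice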

If the server the second admin was sitting at was NOT across a WAN from the first server, then none of these "dsmaint" commands will help, and there is a much more likely culprit: ODBC connectivity!

When the "Citrix Independent Management Architecture" service starts, it loads the critical DLL's, then the less critical DLL's, then reads the "PSREQUIRED" registry value to figure out whether this will be a refresh (0) or a total rebuild (1) of the IMA LHC from the data store. Finally, the server starting IMA tries to make the actual database connection.

Fortunately the typical data store issue is not internal corruption, but simply loss of connectivity to individual IMA servers. The solution for ODBC connectivity loss is quick, and simple.

Check back for Part 2

Active Directory (AD) Integration – Part 2

GPO’s

Windows 2003 Group Policies can do a lot of things; the list of settings in the default templates is so big that it can be difficult to find one setting among the hundreds of fields and sub-fields in the Group Policy tool in Active Directory. There is a searchable Excel spreadsheet called PolicySettings.xls that contains detailed explanations of each setting in the default Windows 2003 SP1 templates.

But there are a few key settings among all the registry keys that are critical to the success of a Citrix implementation.

  • Delete Cached Roaming Profile – removes the roaming profile from the Terminal Server after the user logs off, and the changes have been successfully copied back to the central roaming profile directory (used in conjunction with two others: "do not detect slow network connections" and "wait for remote profile to load")
  • Folder Redirection (to keep the size of the roaming profile down and keep the user’s data centralized and secure)
  • Lockdown of Internet Explorer / the Desktop (may require significant testing to make sure things aren’t too locked down to get the work done).
  • Loopback – replace / merge
  • Hide Drives on My Computer – determines exactly which drive letters a Citrix client can see, and has to take into account the home drive, the other network drives, and the remapped client drives.

“Loopback-merge / Loopback-replace”

The “loopback merge” or “loopback replace” setting in group policy can be a critical component to getting the control over user access that a Citrix implementation requires:

First of all, the users don't go in the Users container, and the Citrix servers don't go in the Computers container, because containers can't be controlled with group policies; instead, separate Organizational Units ("OU's") are created by the AD administrator.

At the minimum, we require a single OU for the Citrix server we are implementing. As the Citrix integrator, we need to be able to control the type of access the users are getting when logged on to the Presentation Servers, and we use GPO’s, on an OU, to accomplish this.

As for the users, they also don't belong in a folder, but in an Organizational Unit. There are a few different scenarios to look at with the users, though. If we are building the AD from scratch along with the Citrix implementation, we might as well create a "Citrix Users" OU; but more likely we are bringing Citrix into an AD implementation that already exists, designed for something completely different than terminal services, and the users may already have GPO's controlling things like folder redirection and hiding server drives, in ways that conflict with what we need them to do.

In this case the Citrix integrator needs to be able to lock down the Citrix SERVER Organizational Unit, so that already-existing users with conflicting user settings can come in without threatening the stability of the Citrix implementation.

In the case of the "folder redirection" GPO, we have to configure the GPO in the "user" section, since there is no corresponding setting in the "computer" section. But if we set a "user" GPO and put it on our one SERVER organizational unit, by default we get no guarantee that our GPO will work.

By default, user settings are read from the GPO's on the user's OU; a "user" setting in a GPO linked to the computer's OU is normally ignored, so the user OU wins out if there are any conflicts.

The key to controlling what happens is the “User Group Policy loopback processing mode” GPO setting. After setting it to “Enabled”, on the GPO of the OU of the Citrix server, the integrator has the option of setting it to either “merge” or “replace” mode.


In the scenario where the integrator has no control over the GPO’s on the user’s OU’s, and the GPO’s could very well be conflicting, (in terms of which drive letters are hidden or revealed, for example, or how locked down a machine is), the integrator can use “loopback-replace”, to cancel out any user-settings that may have been assigned at the user OU, and then use another GPO to set all the user-settings that the users will have, when logging in to servers in that OU.


In this scenario, any setting under the "user" portion of the GPO will have to be monolithic, meaning every user will get the same setting.

In another scenario, an organization can require differing types of access for different types of users. For instance, a regular branch employee may be restricted to only one mapped client drive, corresponding to their USB audio device files, with the other client drives hidden so that they cannot use the WAN for transferring data from their hard drives to the central "home" drive they are mapped to in the Citrix implementation. The manager at the branch, however, is allowed to access his own client drives in the Citrix environment, and the transfer of files across the WAN is at his discretion.

In order to implement this type of access, Citrix says, the integrator needs to be able to control the GPO's at the users' OU's, and design different user GPO's on those user containers.

With a different OU for each of the three types of users, the integrator can set up different user GPO's; then, at the server container, the integrator places a "loopback-merge" GPO, in order to allow the various user rights to be "merged" into the Citrix Presentation Server environment – provided the user GPO's of any users capable of logging in to these servers have been analyzed for security or incompatibility issues.


A user logging in to Terminal Services and getting this Terminal Services environment has often already logged in to a server or domain and run a different set of GPO's for that environment. If the user is logging in from a Windows workstation, access to that workstation itself is managed by AD as well; in the case of a thin client, there is less of a GPO management issue.

The Citrix integrator needs to provide access to mission-critical applications. Some of these may be web apps, provided by publishing IE on the Citrix Presentation Servers. That doesn't necessitate granting users access to browse the web through the Presentation Server, however. GPO's can be set to restrict users of IE on servers in an OU to only a fixed set of websites. (Those websites can be further enhanced with SSO by adding a Password Manager agent to the Citrix servers.) As long as the functionality of these websites is not impeded, IE on the Citrix servers should be locked down as tightly as possible, with virus scan software installed and internet traffic monitored and filtered to only the fixed set of websites.

On the other hand, users may want or need access to external websites from time to time. This access can be assumed less mission-critical, or it would have been transferred to the Presentation Server environment. Therefore, the client machines should be less restricted, allowing web browsing, with less restricted access to the machine as well, compared to the Citrix environment of the same user account.

To accomplish this, the workstations reside in a different OU, separate from the Citrix servers. A loopback policy is placed on the workstation container as well, and in this scenario, the option would be “loopback-replace”. This way, the user settings that restrict drives, and IE and desktop access, and reside in GPO’s on the user containers, will be discarded, or “replaced”, when the users log in to the workstations, and there will be only minimal restrictions to replace them.

Then, within those varying user GPO's, login scripts are assigned to map home and shared drives, the settings to lock down the server are set, folder redirection of My Documents and Desktop to the user's home drive is configured, and "Hide Drives on My Computer" is configured, to control exactly what the user can see when connected to the Citrix servers.


The disadvantage of this design, though it is the Citrix and Microsoft “Best Practice”, is that it necessitates re-building the entire AD for the Terminal Services implementation, or building two different user accounts for each actual person – one for the terminal services and another for whatever the old AD account was used for.

As an alternative, there is a way to lock down the Citrix server OU with "loopback-REPLACE" while still configuring different types of user access. By adding the multiple, conflicting GPO's for different user types at the one server OU, with "loopback-replace" in place, we can go into "Properties" on each user GPO, go into "Advanced", highlight each group on the menu, de-select "Apply Group Policy", then add a group from AD, like "Managers", and "Apply Group Policy" only to that AD group.


The "Apply Group Policy" permission is in the properties of the GPO, under "Properties", "Security".


Hiding Server Drives

The TSC tool or a PSC policy can be used to enable or disable "client drive mapping", and assuming the "remapping of server drives" has not been done, the client drives are going to be V, U, and T, possibly continuing backward up the alphabet from there for any other client drives that exist at the moment of connection.

The login scripts can get the user an “H” drive for home and an “S” drive for share. By default, the user gets access to the Presentation server’s drives, as they are in its own registry, most likely C, D, and E.

All this can be then filtered by a GPO that reveals, or not, each letter in the alphabet, if there is something there to reveal.

The problem is that the standard Windows template file – the "system.adm" file on a standard domain controller – didn't predict the complexity of this environment.

Among the string of V, U, and T, H and S, and C, D, and E, we want to reveal or hide various combinations. The interface in AD offers six pre-configured variations in a drop-down menu of which drives to hide. The choices are "Restrict A&B", "Restrict C only", "Restrict D only", "Restrict A B & C only", "Restrict A B C & D only", and "Restrict all Drives".

The two more options we might be hoping for are "Restrict all but S & H" and "Restrict S, H & T".

And there is a way to modify the ADM file on the domain controller(s) so that the extra options become available. It involves writing the alphabet backwards as a binary number – a 1 for every letter to hide and a 0 for every letter to reveal – converting the number to decimal, and appending the new option to the system.adm file on the domain controller. MS KB article #231289 details the steps.
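
A worked example of the arithmetic, assuming the bit layout from the KB article (A is the low-order bit, and a 1 hides the drive): H is bit 7 (128), S is bit 18 (262144), and T is bit 19 (524288). "Restrict S, H & T" is therefore 128 + 262144 + 524288 = 786560, while "Restrict all but S & H" is 67108863 (all 26 bits set, the "Restrict all Drives" value) minus 128 and 262144, which comes to 66846591. The new entries appended to the ITEMLIST in system.adm would then look something like:

  NAME "Restrict S, H and T drives" VALUE NUMERIC 786560
  NAME "Restrict all drives but S and H" VALUE NUMERIC 66846591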

One thing to watch out for is that the number of drive letters on each Presentation Server has to be standard and consistent, as this GPO is customized around that particular drive-letter situation, as well as the client drive situation. Without standards in practice, the implementation will become a problem.

Since many people find themselves in the same situation, needing to hide specific drive letters other than the ones in the default Windows policy template, there is also a third-party tool available from http://www.petri.co.il called gpdrivesoptions.


CM, Citrix Training Instructor
Unitek Citrix Training

Active Directory (AD) Integration – Part 1

There are two main areas of Active Directory design that are critical to most Citrix implementations: Profiles, and Group Policies.

Profiles

In the most simplistic, default situation on a Windows Terminal Server, a user exists in Active Directory without any "profile" or "Terminal Services profile" information. When the user first logs in to the Terminal Server, a "local profile" is created under "Documents and Settings" on whichever application server they happen to hit, and this directory structure includes Windows settings, as well as "My Documents", the "Desktop", and Internet Explorer's cookies and bitmaps. If a user then saves data in these locations, logs out, and logs in to a different Presentation Server the next day, a completely different local profile is created, and the user is mystified as to why their data from the day before is "sometimes there, sometimes not", as they log in to different load-balanced desktops or applications.


There are advantages to local profiles on Presentation Servers: no data has to travel across the wire before the profile can load, and because the profiles on the separate servers are not tied together, they are not as likely to become corrupt as roaming profiles are. Logins are faster, and management of profiles is simple.

A public library can publish a load-balanced desktop across several Citrix servers, and the local profile of the librarian, or the "library user", remains the same on each server. The public uses the desktop, browses the internet, types, prints, or saves, but leaves behind nothing, and doesn't expect to see their own data or desktop changes waiting for them the next day. In an environment like this, the local profiles can become locked-down "mandatory" profiles that do not accept changes, and so do not become corrupt: renaming ntuser.dat to ntuser.man was the simple step that made a local profile "mandatory".

But most internal organizations need to provide regular access to personal data, and to personal application and printer settings changes, across multiple load-balanced Presentation Servers. Microsoft's alternative to local profiles is "roaming profiles": the profiles are tied to the individual AD user accounts, and each user owns a private directory share where their profile is centrally stored. This profile is downloaded and cached on each Citrix application server, changes are made, then the profile is copied back up to the central location, to be pulled down by the next application server the next time.

To configure a roaming profile on a Windows 2003 domain controller, the user account in AD is modified on the "Terminal Services" tab, where a UNC path to the share on a central file server can be entered.
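
For example, assuming a file server named FS1 with a hidden share for Terminal Services profiles, the profile path would look something like:

  \\FS1\tsprofiles$\%username%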


But once roaming profiles are implemented, several steps must be taken to keep them from becoming corrupt – leaving the user unable to log in or out, and an administrator having to re-create the profile.

The Microsoft free downloadable utility called “UPHClean” is recommended for any terminal server, to clean up all the user profiles that come and go on a daily basis.

In the PSC printing policies, printer properties can be forced to be stored on the client device instead of in the roaming profile; this can cut down on profile corruption.


Delete cached roaming profile

And if roaming profiles are implemented, Citrix recommends running several Group Policies along with them, to maintain stability in the implementation. By default, these roaming profiles DO copy up to a central location when a user logs out, but they also remain behind on each Citrix server, "cached" for ease of use in the future. The problem is that with the load-managed, published application model, users could conceivably log in to multiple servers at once, and log out of them in a different order, and this would eventually lead to corruption through time-stamp illogic.


Citrix recommends having the cached roaming profiles deleted by a GPO, located in the "computer" section, under "administrative templates", "system", "user profiles"; and when we enable it, we are advised to enable two others: "do not detect slow network connections" and "wait for remote profile to load".


Folder Redirection

The problem now with roaming profiles is that by default they can become large, with "My Documents", IE's cached bitmaps and cookies, and the "Desktop" all part of the profile. In order to avoid sending a user's entire home folder over the wire to be cached on an application server, "folder redirection" can be implemented through Windows GPO's, in the "user" section, under "windows settings", to keep the profile from becoming unmanageably large. Once implemented, large portions of the formerly roaming profile remain stable in the central "home" directory, available for retrieval when needed through whatever methods the Citrix administrator has provided – instead of the default design, which was to download every document a user had ever saved to each server the day they log in to that server.


Another feature of Windows 2003 GPO's is "profile size quotas": users are allowed to grow their profile until it reaches a preset value. The problem with implementing this setting is that when a user passes the quota, they are unable to complete a logout, because they can't finish copying the profile back up to the central location, and they have to call the help desk.

Even with folder redirection in place, the profiles in an enterprise Citrix deployment can become large, due simply to a high volume of applications that all require bits of a profile.


Citrix Consulting has come up with a custom scripting solution called "hybrid profiles", where they look at just how much of the profile actually requires changing, and script a solution that cuts way down on how much of the profile travels back and forth, "roaming", leaving the rest as a permanently cached local mandatory profile on each Presentation Server. "Flex" profiles are a free downloadable tool for developing scripts on your own to render large roaming profiles "hybrid".

Stay tuned for part 2…

CM, Citrix Training Instructor
Unitek Citrix Training