HP Moonshot & Citrix XenDesktop: Lessons Learned From The River Of Big Data
Recently, one of our clients, who is also one of the top three exclusive HP partners in North America, engaged Headwaters Group to assist in the delivery of a Citrix XenDesktop and HP Moonshot Proof of Concept (PoC) for their end-user customer. Below are the lessons learned as outlined by Carl Webster, one of Headwaters’ elite consultants who specializes in Citrix XenDesktop. Carl originally posted this on his own personal blog, and graciously allowed us to use it here and continue to spread the knowledge to our community. If you are a fellow engineer, or if you are seeking a team of senior engineers with specializations in the areas of Citrix or HP Moonshot for your upcoming project, you can contact us here.
Centered around the concept of the “software defined” server, HP Moonshot is a new breed of converged systems that addresses the space, energy, cost, and complexity issues that make today’s computing platforms unsustainable. The HP CS100 has been fully tested and validated to work with Citrix XenDesktop. HP and Citrix have done a wonderful job of taking the guesswork out of how much performance is needed to power your virtual desktop hosting platform. A proof of concept, or limited scale pilot deployment, is also a great idea since it allows your business to “grow” into the solution.
But, as with any mission critical IT initiative, the skill of the deployment team is just as critical as sizing and designing the final architecture. Here are some of the lessons learned and some tips for working with Moonshot.
#1: Physical Delivery is Different
Forget most of what you know about delivering virtual desktops when working with Moonshot. Moonshot desktops are physical and require a new way of thinking if you have not worked with delivering streamed images to physical computers before.
- Moonshot, with XenDesktop, works with Citrix Provisioning Services (PVS) only.
- There is no hypervisor needed for Moonshot desktops.
- No Hosting connection is created in Citrix Studio.
- Neither the XenDesktop Setup Wizard nor the Streaming VM Wizard in PVS is used.
These four concepts were definitely new to the Citrix Support people I worked with. It took quite a bit of patience on my part to instruct them on using PVS and XenDesktop with physical desktops.
#2: Get Organized; Get a Tool
I used Devolutions Remote Desktop Manager to make my life easier. I configured PuTTY sessions for the Moonshot chassis and both Moonshot switches as well as Remote Desktop sessions for many of the nodes and the Citrix and Microsoft infrastructure servers as shown in Figure 1.
This allowed me to have access to all my sessions from a tabbed interface instead of having multiple PuTTY and Remote Desktop sessions scattered over my desktop.
Before We Get to Items #3, 4 and 5
Read items 3, 4 and 5 and then I will explain why they are in this article.
#3: NIC Teaming…Not So Good
If multiple networks and/or VLANs are used and VLAN tagging is used, do not even think of using Broadcom NIC Teaming. If CTX140338 had mentioned that limitation, we would not have lost days of work trying to get NIC teaming to work. CTX140338 simply states, “This is an enhancement to facilitate NIC teaming with the latest Broadcom NICs used in HP Moonshot systems.” If one line had been added that stated “NIC teaming is supported only if a single network with no VLAN tagging is used,” we would have known from the start of the PoC to not use NIC teaming.
HP and Citrix are working to resolve the VLAN tagging issue so please refer to the latest HP Getting Started Guide for the most up-to-date information.
More on this in #5 below.
#4: Don’t Click Cancel
If you are using just a single network with no VLAN tagging and you want to use NIC teaming, from our experience, the end-user experience will not pass the smell test. That means, while NIC teaming “works”, the end-user experience leaves much to be desired. Following the instructions on page 117 in the HP ConvergedSystem 100 for Hosted Desktops Getting Started Guide, the NIC teaming configuration is backed up and then restored after the vDisk is streamed. What I found out is that during the restore, when the team is created and configured, the desktop loses connection to the PVS server. That means the user will see a message saying the connection to the desktop has been lost and will be retried in 30 seconds. The user will also be given two choices: Retry or Cancel. This will be a helpdesk nightmare, especially if the user clicks Cancel. In my testing, I found one of two things will happen if you let StoreFront and Studio handle the process:
- After about two minutes, the desktop starts.
- After about three minutes, you realize nothing has happened, and the user must click the desktop’s icon to get it to launch.
Bottom line, if you are willing to have users and helpdesk staff hate you, use NIC teaming. If you are willing to add confusion to the desktop startup process, use NIC teaming. If you are willing to add two to six minutes to the login and desktop startup process, use NIC teaming.
HP is working on improving the NIC teaming functionality and vast changes are around the corner.
#5: NIC Teaming + HP Velocity
Note: What is HP Velocity? HP Velocity is a software solution that improves the user experience for remote desktop and virtualized applications by addressing common network bottlenecks, such as packet loss, network latency and WiFi congestion.
If you still want to use NIC teaming AND you are using HP thin clients with HP Velocity installed, do NOT install the HP Velocity driver in your image. HP Velocity and the Broadcom NIC teaming software do not play well together. When you configure the NIC team, as soon as you click Commit Changes, your image is unrecoverable. It appears the Velocity driver creates a loop and the Broadcom software cannot recover. The only way to recover is to reload Windows and rebuild your image.
As soon as we figured out there was a conflict between the Velocity software and the Broadcom software, HP Labs, the Velocity team and others at HP immediately became involved in getting the issue resolved and a fix was made available.
The HP Velocity team is aware of the issue and is developing a fix for Moonshot. The issue revolves around the physical and virtual “bond” building a loopback with the Velocity driver. The HP Velocity team has a fix available now, but it requires booting Windows into Safe Mode. Since Moonshot is a headless system, there is no way to have Windows boot into Safe Mode. The HP Labs team was able to provide a quick workaround by providing a custom installer that corrects the issue (albeit beta code); however, a permanent fix is under development.
Explaining Items #3, 4 and 5
Items 3, 4 and 5 exist only because:
- The customer wanted to use HP Velocity, and
- We did not know HP Velocity and Broadcom NIC teaming had not been tested together by HP, and
- We did not know that the PVS Target Device Software did not support how we wanted to use NIC teaming, and
- If we had known we couldn’t use NIC teaming, then we could have used HP Velocity and
- Not bothered with NIC teaming, and then
- Items 3, 4 and 5 would have been nonevents and never written about.
#6: Persistent ≠ Persistent
What the Moonshot documentation calls a Persistent Desktop is not what XenDesktop calls a Persistent Desktop.
With XenDesktop, a Persistent Desktop is one in which a user’s settings and/or user-installed applications persist between reboots. A Moonshot Persistent Desktop is where Windows Deployment Services (WDS) installs a full Windows build on every node in the Moonshot chassis. The Moonshot Persistent Desktops are then managed like regular physical Windows desktops.
#7: Update Passwords
One of the first things to be done after the Moonshot chassis is powered on is to update the firmware. But before you can update the firmware, passwords are needed for the two switches’ Admin and Enable accounts. After those passwords are set, you can run the new HPSum.bat file to update all the firmware. HPSum.bat is the new automated way to update Moonshot’s firmware and is included in the Moonshot Component Pack. If the switch Admin and Enable passwords are not set, the HPSum process will run but at the end will report it was unable to verify the passwords and you will have just lost 1.5 to 5.25 hours of time. Update the passwords first, next run HPSum.bat and then continue on with the Chassis and Switch configurations.
It takes roughly five minutes per device to update the firmware. The minimum number of cartridges in a Moonshot chassis is 15 plus the 2 switches and the Moonshot chassis or 18 devices. 18 * 5 = 90 minutes to update firmware. 30 cartridges plus the 2 switches and the Moonshot chassis (33 * 5 = 165 minutes) will take 2.75 hours to update firmware. A full chassis will take 5.25 hours. Those times are based on what we saw. Your mileage may vary.
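The arithmetic above can be sketched in a few lines of Python. This is a rough linear estimate only, assuming devices are updated one at a time at about five minutes each. The function and constant names are made up for illustration, and the 5.25 hours quoted for a full chassis is the time we saw, not something this sketch derives.

```python
# Rough firmware update estimate, assuming devices are updated one at a time.
MINUTES_PER_DEVICE = 5  # approximate per-device time reported in this article
SWITCHES = 2            # every chassis in this PoC has two switches
CHASSIS = 1             # the chassis itself is also updated


def firmware_update_minutes(cartridges):
    """Return the estimated minutes to update cartridges, switches and chassis."""
    devices = cartridges + SWITCHES + CHASSIS
    return devices * MINUTES_PER_DEVICE


for cartridges in (15, 30):
    minutes = firmware_update_minutes(cartridges)
    print(f"{cartridges} cartridges: {minutes} minutes ({minutes / 60:.2f} hours)")
```

Running it for 15 and 30 cartridges gives the 90 and 165 minute figures worked out above. Treat the result as a planning number, not a guarantee.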
#8: CPU Speed
Once the firmware has been updated, set the CPU Speed for all nodes.
If you are not sure what CPU speed settings are available, set it to a wrong value and the correct values will be displayed as shown in Figure 2.
The Moonshot documentation does not always match the Moonshot help text. I found several misstatements that I was able to report to my Citrix contact who then reported them to the HP Moonshot team. I was told that HP would correct the documentation. For example, Powercap mode 1 is not the default. For us, Mode 0 was the default. If you have four power supplies, use Powercap Mode 2.
#10: Answer Files
Moonshot requires WDS to install the Windows images onto the nodes. Windows System Image Manager (SIM) is used to manage the Client and Image unattended answer files provided by HP. Moonshot is a headless system so there is no video, keyboard or mouse to interact with a full Windows installation process. The two answer files are necessary but troubleshooting them can be a royal PITA. I recommend you use a regular Windows VM to do a test install of the customized Moonshot Windows installation. You do not care if the Windows installation actually runs; you only care about getting past the point where the answer files are no longer used. Once Windows is installed to the VM, you know the answer files “should” work and you can proceed to installing Windows onto the first Moonshot node.
HP recommends using SIM so that any passwords entered are stored encrypted in the answer files.
A few issues we ran across:
- The answer files contain several XXXXXXXX lines that need to be replaced with the registered organization and user name. Even though we had that information entered, the image still had the XXXXXXXX for registered company and user.
- Even though we had the necessary credentials entered and the checkbox selected in WDS to enable joining the image to the domain, the domain was never joined.
- The time zone was never set.
Note: Make sure the Product Key you enter matches the ISO file type. For example, a Volume License ISO file will not work with an MSDN Product Key. If the product key does not match the ISO type, the following error (formatted for this article) is given:
This is an easy error to spot if a regular VM is used to test the answer files.
#11: Registry Hacks for PVS
Again, since Moonshot is a headless system, using PVS Maintenance and Test versions will be a royal PITA unless you use the registry setting described in CTX135299 on every PVS server. It is either that or give every Maintenance and Test user PuTTY access to the Moonshot chassis so they can access the virtual serial port on the node the Maintenance and Test target devices are attached to.
Rather than have to manually set a registry key on every PVS server, I would prefer an option in the PVS console to make this change.
#12: No Storms Here
Two of the main issues XenDesktop architects, engineers and administrators dread with virtual desktops are boot storms and login storms. Those two items are no longer a concern with Moonshot. If you did:
in a Moonshot chassis with 180 nodes, the power-on sequence will not allow all 180 nodes to power on at the same time. Each Moonshot node has a dedicated quad-core processor, dedicated 8GB of RAM, dedicated 32GB or 64GB of SSD local storage and two dedicated network ports. The one thing every node shares is the BIOS. I have been given a description of how the power-on sequence works but since that information is proprietary, I cannot share it. If I get permission, I will update this article with the power-on sequence of events.
#13: Virtual Serial Port
In order to view a node’s boot process, Moonshot provides a Virtual Serial Port (VSP). This allows you to see the node’s power-on sequence and the initial non-GUI Windows boot process. After working for two weeks, the VSP stopped working. Fortunately, the solution is very simple and non-disruptive to any powered-on nodes and desktops.
To view the VSP for a node:
If nothing appears in the VSP, there is a very simple fix. Enter the following command:
Note: “cm” is Chassis Manager.
That command will not affect any cartridges or any nodes currently running. After a few minutes, you can reconnect to the Moonshot chassis using PuTTY and the VSP is back working.
#14: Changing the Drive Letter of the Write Cache Drive
By default, PVS assigns drive D to the write cache drive. For a Moonshot node, its integrated SSD is disk 0 and, as disk drives do in Windows, has a unique ID. If there is an application that requires the use of drive D then the write cache drive letter must be changed. Using a hypervisor, adding an extra drive to a Virtual Machine (VM), creating a template and using that template when running the XenDesktop Setup Wizard makes changing the write cache drive to use a different drive letter fairly easy. Not so when using physical devices.
Since each disk 0 has a unique ID, changing the drive letter of the write cache drive on the node used to create the master image does not make the drive letter change on another node whose disk 0 has its own unique ID. Citrix has an article that explains how to remedy this situation, for the most part. If only one vDisk is ever created using only one node as the master, then the solution Citrix offers will work. Their solution will not work if there are multiple vDisks created and/or nodes need to boot from multiple vDisks whose master images use different drive letters for the write cache drive.
In Moonshot, if I create a master image using the node C1N1, its disk 0 will have a unique ID. If in that master image I change the write cache drive to letter W, the only way any other node will pick up the use of the letter W for the write cache drive is to change the unique ID of its disk 0 to match the unique ID from disk 0 in the master node. Not really a big deal. I can create a script or process that changes the unique ID of disk 0 on, say, C5N1 to match the unique ID of disk 0 from C1N1. Now what if I need to create another master image, for testing purposes, on C16N1? C16N1’s disk 0 will have its own unique ID. Now I cannot boot C5N1 from the vDisk created from C16N1 as the unique ID that was replaced on disk 0 does not match the disk ID of C16N1’s disk 0. The write cache drive is now back to the default letter D and the application will not work because what should have been drive D is now drive E.
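The C1N1, C5N1 and C16N1 scenario can be modeled with a toy lookup. This is only a sketch of the reasoning, not how Windows or PVS actually stores the assignment: it assumes each vDisk carries a simple "disk ID to drive letter" mapping recorded on its master node, and the disk ID strings and function name are hypothetical.

```python
# Toy model: a vDisk remembers the custom write cache letter only for the
# disk ID of the node it was mastered on. All IDs below are hypothetical.
DEFAULT_WRITE_CACHE_LETTER = "D"  # PVS default, as described in this article


def write_cache_letter(vdisk_letters, node_disk_id):
    """Return the letter a node gets when it boots from the given vDisk."""
    return vdisk_letters.get(node_disk_id, DEFAULT_WRITE_CACHE_LETTER)


C1N1_DISK_ID = "id-of-c1n1-disk0"
C16N1_DISK_ID = "id-of-c16n1-disk0"

# Two master images, each changing the write cache drive to W on its own node.
vdisk_from_c1n1 = {C1N1_DISK_ID: "W"}
vdisk_from_c16n1 = {C16N1_DISK_ID: "W"}

# C5N1's disk 0 was re-stamped to match C1N1's disk 0.
c5n1_disk_id = C1N1_DISK_ID

print(write_cache_letter(vdisk_from_c1n1, c5n1_disk_id))   # custom letter applies
print(write_cache_letter(vdisk_from_c16n1, c5n1_disk_id))  # falls back to default
```

In this model, C5N1 picks up W from the first vDisk but falls back to D on the second, which is the mismatch described above.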
I don’t know many places that use only one vDisk for every device or every desktop offered to users. I can see this affecting the ability to drag and drop a vDisk on a device collection. If the new vDisk has a different unique ID for disk 0 than what the target device (Moonshot node) is already configured for, any application that requires specific files on the custom drive D will not work.
Maybe I am making a mountain out of a molehill but this can be an issue when there are many vDisks and thousands or tens of thousands of nodes/desktops.
In the end, for this PoC, the customer decided to let PVS handle the D drive and will change their application configuration to use a different drive letter. It proved far easier to change the application than to change a default behavior of PVS.
You may be wondering what Write Cache option was selected for this PoC. We went with Cache to device RAM with overflow to disk and memory usage was set to 1024MB.
I will end with the two most frustrating things I encountered on this project.
1. Dealing with Citrix support!!!!!
Talk about driving a person to want to drink something stronger than a Ginger Ale!!! Sheez, dealing with Citrix support on Moonshot/PVS/XenDesktop issues is a lesson in futility because they apparently have not been trained in this area. Example:
Me: My desktops are not registering.
Ctx: OK, let’s rerun the XenDesktop Setup Wizard.
Me: My desktops are physical, there is no hypervisor involved.
Ctx: But you said you were using vSphere?
Me: Yes, we are using vSphere for the Citrix infrastructure like the Controller, StoreFront, Director and PVS.
Ctx: So there is a hypervisor involved!
Me: For the infrastructure, yes. For the desktops, no. We are using HP Moonshot which is physical so there is no hypervisor in use for the desktops.
Ctx: How can you use a hypervisor for the infrastructure but not for the desktops? Why would you do that?
Me: Because XenDesktop and PVS work with physical devices.
Ctx: Since you do have vSphere, let’s rerun the XenDesktop Setup Wizard.
Me: You’ve got to be kidding me!!!
By the way, the issue with my desktops not registering was that in Studio only the machine account SID was showing, not the machine account name. To resolve the issue, I:
- Deleted the machines from the Delivery Group and Machine Catalog,
- Deleted the Machine Accounts from Active Directory (AD) via PVS,
- Created new Machine Accounts in AD via PVS, and
- Added the Machines back in to the Machine Catalog and Delivery Group.
I know I should probably cut Citrix Support some slack since Moonshot is so new, right? But, PVS and XenDesktop have supported streaming to physical devices for quite a while. I know PVS has supported streaming to physical devices for a long time. My first three PVS projects five years ago were using PVS 5.x to stream to physical XenApp servers. My contention is that Citrix Support should be very familiar with using PVS to stream to physical devices with no hypervisor involved in the process.
2. NIC Teaming
For now, just say no!
I like Moonshot. I like the hardware. I like the concept. I am amazed that HP can put that many “desktops” in one chassis and not have the chassis melt from all the heat.
For the use cases where it can be used, I believe it is a cost-effective and energy saving solution. Having up to 180 desktops with dedicated CPU, memory, storage and networking in under 5U is quite impressive. HP’s documentation, while not perfect, is very well done. Their documentation (and videos) assumes a lot of knowledge and understanding the reader may not have. I am already looking forward to my next Moonshot PoC.
Carl Webster – Senior Technical Consultant for @HeadwatersGroup