Solution: Google Chrome 45.x Blocks GPU Acceleration for WebGL, Direct2D, DirectWrite with Citrix XenApp on Windows 2008R2/2012/2012R2
With version 45, Google changed the behaviour of Chrome: it now blacklists GPUs, which means hardware acceleration is disabled by default in a remote session. This behaviour is seen in Microsoft RDS with RemoteFX and in Citrix XenApp 6.5 through 7.6 with vGPU or GPU pass-through. It is also seen in Linux environments running Google Chrome.
Issue highlighted in Citrix XenApp 7.6 (2008R2/2012/2012R2) with a GPU
Open Google Chrome and type chrome://gpu in the address bar to see whether the GPU in your RDS/XenApp session is working or not.
With Google Chrome v45, the default behaviour is now that Google disables the GPU.
Google Chrome experience with a GPU (Citrix XenDesktop vs Citrix XenApp)
The above picture confirms that the GPU is not enabled in Google Chrome with RDS/XenApp.
Interestingly, Multiple Raster Threads is disabled in Win7 with a GPU. Why? Let’s find out.
This is probably because you haven’t configured the CPU correctly in the hypervisor for the virtual machine. Hint: I did this on purpose, so you can see how complex troubleshooting becomes if you don’t configure your environment correctly.
Best practice with HDX 3D Pro is 4 vCPUs, and in XenServer 6.5 you configure them as 2:2.
Multiple Raster Threads then works in Win7 once you have changed the CPU configuration of the Win7 VM.
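The 2:2 layout can be sketched in code. This is a minimal illustration, not the author's procedure: it assumes that 2:2 means 2 sockets with 2 cores each, and that the topology is applied with the XenServer `xe vm-param-set` command and its `platform:cores-per-socket` key. The helper name and the VM UUID are hypothetical placeholders.

```python
def xe_topology_commands(vm_uuid, vcpus, cores_per_socket):
    """Build xe CLI commands for a vCPU topology (assumed approach).

    vcpus must be a multiple of cores_per_socket, so that the VM ends
    up with a whole number of virtual sockets.
    """
    if vcpus <= 0 or cores_per_socket <= 0:
        raise ValueError("vcpus and cores_per_socket must be positive")
    if vcpus % cores_per_socket != 0:
        raise ValueError("vcpus must be a multiple of cores_per_socket")
    return [
        f"xe vm-param-set uuid={vm_uuid} VCPUs-max={vcpus}",
        f"xe vm-param-set uuid={vm_uuid} VCPUs-at-startup={vcpus}",
        f"xe vm-param-set uuid={vm_uuid} platform:cores-per-socket={cores_per_socket}",
    ]


def socket_count(vcpus, cores_per_socket):
    """Number of virtual sockets the guest will see."""
    return vcpus // cores_per_socket


if __name__ == "__main__":
    # "<vm-uuid>" is a placeholder; 4 vCPUs as 2 sockets x 2 cores.
    for command in xe_topology_commands("<vm-uuid>", 4, 2):
        print(command)
```

The commands are only printed here, so they can be reviewed before being run on the XenServer host with the VM shut down.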
The below picture confirms that the GPU is not enabled in Google Chrome with RDS/XenApp on Server 2012R2.
Now that we have validated that the GPU is blacklisted in RDS/XA with a GPU, let’s see if we can fix this.
Performance with a WebGL web app in Google Chrome
CPU usage is very high while Google Chrome runs the WebGL application. This is because Chrome uses the CPU to rasterize in software what the GPU would otherwise render, which lets your users run a WebGL application without a GPU. But wait: we actually have a GPU in our RDS/XA server, so why is it not offloading the CPU? That’s because Google decided to blacklist GPUs and turn them off.
Edit the Google Chrome shortcut and add --ignore-gpu-blacklist after chrome.exe in the target.
Open Google Chrome and run chrome://gpu in the browser
Check if Hardware acceleration is now working for all functions in Google Chrome
GPU hardware acceleration is now working in Google Chrome in RDS/XA with a GPU, and the rendering work is offloaded from the CPU to the GPU, as you can see in the below picture.
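The shortcut change can also be scripted, for example when Chrome is published from a script instead of a hand-edited shortcut. The sketch below only builds the command line. The install path is an assumption and should be adjusted to your environment, and the helper names are hypothetical.

```python
# Assumed default install path; adjust to your environment.
CHROME_EXE = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"

# The switch from the step above: two ASCII hyphens, not a dash.
IGNORE_BLACKLIST_FLAG = "--ignore-gpu-blacklist"


def chrome_command(chrome_exe=CHROME_EXE, extra_args=()):
    """Return an argument list that starts Chrome with the flag set once."""
    args = [chrome_exe, IGNORE_BLACKLIST_FLAG]
    # Keep any other arguments, but never repeat the flag.
    args.extend(a for a in extra_args if a != IGNORE_BLACKLIST_FLAG)
    return args


def shortcut_target(chrome_exe=CHROME_EXE):
    """Return the text for the Target field of a Windows shortcut."""
    return f'"{chrome_exe}" {IGNORE_BLACKLIST_FLAG}'


if __name__ == "__main__":
    print(shortcut_target())
```

The list from `chrome_command()` can be passed to `subprocess.Popen`, and the string from `shortcut_target()` is what goes into the shortcut's Target field.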
I am very excited to share this great news with you all. I did a webinar with fellow CTP Trond Eirik Håvarstein from XenAppBlog.com, and we had a surprise special guest, Jeroen Van De Kamp, CTP and CTO of LoginVSI, announcing groundbreaking stuff in the webinar. Over 700 people signed up for the webinar. If you were among those who missed the opportunity to see it, here is your chance: the webinar is now available to everyone for free. There was a lot of Q/A, and over the next couple of days I will reply to all the questions and make the answers available in this article.
The webinar has been remastered, and the audio and graphical demo videos are even better now than in the actual webinar. Make sure to check it out now:
Summary of webinar product announcements from LoginVSI, Lakeside Software, Uberagent for Splunk.
LoginVSI’s upcoming new version supports GPU benchmarking…
LoginVSI is working on a next version that will support benchmarking, capacity planning and stress testing of the “missing component in virtualization”: the GPU. If you are interested, you can write to get access to the beta version of LoginVSI.
Here are some screenshots from the session. Watch it to hear what Jeroen says about the upcoming version.
Note: if you want more info on the next version of LoginVSI that supports GPU, write to email@example.com with the subject GFX.
Another groundbreaking product announcement came from Lakeside Software: they are about to release version 7 of SysTrack, which will support NVIDIA GPU monitoring and assessment.
Application Graphics Benchmarking
The transformation of an existing software portfolio first begins with the identification of all of the actively used software packages in the environment. The added complication in the case of a project to begin advanced application delivery is the need to understand multiple facets of usage: resource consumption, graphics utilization, frequency of use, user access habits, and mobility needs. Because the state of IT is already so complex it only becomes possible to fully understand and plan with a complete set of descriptive information that really characterizes the unique aspects of every environment. Of particular interest is the ability to first identify applications that have GPU demands, and then begin to segment them into tiers of utilization. SysTrack continually collects information about software packages as they’re used and normalizes all data points for cross platform comparison. One of the key performance parameters that’s identified in this process is a graphical intensity measure (Graphics Index) that provides a way to identify those applications in the portfolio that have higher GPU demands than others. With this critical information it becomes possible to segment the portfolio into groupings based on their requirements for specific resources. By tying a general sense of which applications have peak demand to total length of usage it becomes easier to start developing a portfolio made up of different combinations of usage styles. This includes separating applications that may be used by a small set of the population with intense requirements versus widely used applications with a smaller footprint. Of course, this also allows for much deeper analytics centering on the behaviors of users that is quite important in planning the GPU profiles in use in provisioning. 
Figure 1 displays this relationship in a bubble chart format. This format groups applications based on their similar characteristics, presenting clusters of similar applications in larger bubbles. The vast majority of applications exist in the “low graphics demand – Low Time Active” area in the bottom left, while only a select few have either high graphics demand or high time active.
SysTrack tracks graphics usage frequency across physical clients and allows you to group users based on graphics usage and frequency.
A natural expansion of this is grouping users into distinct workload types to understand how best to configure the profile types and GPU assignments for users. Once the target applications and users have been characterized and a plan has been developed it’s critical to begin the process of sizing the environment. This includes determining the architecture, sizing the desktops and servers that will be worked with, and identifying resources that will be required to support the needs of the planned deployment.
Resource Modeling & Capacity Planning
The NVIDIA MarketPlace report from SysTrack’s Virtual Machine Planner (VMP) outlines the number of users that fall into different use cases, making it easier to forecast how many users per board can be allocated.
With a complete portfolio plan it now becomes possible to move into the next phase and start creating a model for what resources will be required for a complete environment. Because each of the users has been fully characterized throughout the assessment data collection interval, it’s possible to use SysTrack’s Virtual Machine Planner (VMP) for powerful mathematical analysis to provide deep insight into infrastructure provisioning. The first component of this involves using the profile information above to help develop a plan for what kind of solution will be provided to the end-users. By segmenting the population into different delivery strategies using Citrix FlexCast options as a guideline, a more complete and accurate picture of how the net new environment will operate can be created. An additional benefit of segmentation is the ability to take advantage of grouping by general graphics consumption to identify the number of GPUs required for the environment based on the user density information for each profile type.
The NVIDIA MarketPlace report from VMP outlines the number of users that fall into the various use cases (e.g. “high” for a designer or higher end power user), making it much easier to forecast how many users per board can be allocated and in turn how many total boards may be needed
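The forecasting step comes down to simple arithmetic: divide the users in each tier by the users one board can carry, then round up. The sketch below illustrates this with made-up numbers. The tier names, user counts and users-per-board densities are hypothetical and do not come from a real VMP report, and it assumes that tiers do not share a board.

```python
import math


def boards_needed(users_by_tier, users_per_board_by_tier):
    """Return the number of boards per tier, rounding up.

    Both arguments are dicts keyed by tier name. Tiers are assumed not
    to share boards, so each tier is rounded up on its own.
    """
    boards = {}
    for tier, users in users_by_tier.items():
        density = users_per_board_by_tier[tier]
        if density <= 0:
            raise ValueError(f"users per board must be positive for tier {tier!r}")
        boards[tier] = math.ceil(users / density)
    return boards


if __name__ == "__main__":
    # Hypothetical example values, not measured data.
    users = {"high": 30, "medium": 100, "low": 250}
    density = {"high": 4, "medium": 16, "low": 32}
    per_tier = boards_needed(users, density)
    print(per_tier, "total boards:", sum(per_tier.values()))
```

Rounding up per tier gives a conservative estimate. If the chosen vGPU profiles allow different tiers to share a board, the real total may be lower.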
This information creates an easy to use design for a set of user profiles, both for the actual desktop delivery and for the vGPU assignment. By ensuring the best possible analysis of the environment prior to the actual deployment the end-user experience is much simpler to forecast and control. This results in higher end-user satisfaction and a shorter transition time.
User Experience Optimization
After the successful implementation of the solution the environment still requires observation to prevent interruption of service and the potential for productivity impact. The best way to ensure optimal end-user service quality is to have a real-time alerting and analytical engine to collect and report instantly on degradation of any aspect of the systems the users interact with. SysTrack provides this in the form of proactive alerting, detailed system analysis in Resolve, and aggregate trending through Enterprise and Site Visualizer. An even more interesting feature is vScape, a tool designed to examine utilization across multiple virtual machines and correlate resource consumption to concurrency of application utilization. vScape provides real-time updates of all of the application usage across all virtual platforms in an enterprise, including information about what applications are currently demanding GPU resources. It also provides insight into other resource demands as well, such as CPU, memory, and I/O. This can help automate the discovery of co-scheduled or highly concurrent applications to pinpoint the root cause of oversubscription issues much more quickly. It also provides key insight into guest health characteristics with trending to correlate precisely which events may lead to service degradation
Another key feature introduced in SysTrack version 7.0 is the result of close collaboration with NVIDIA to leverage APIs presented in the guest operating system. This allows the capture of detailed GPU performance metrics to correlate vGPU consumption to end-user service quality. Specifically, with NVIDIA drivers present in the guest OS or on a physical system, the GPU utilization and key metrics (see table 2 for a sample of selected metrics) from the graphics card can be captured and analyzed in the same way as CPU or other system metrics are currently in SysTrack.
In SysTrack 7, after provisioning users in a VDI environment, IT admins can monitor performance, which makes it possible to optimize density over time.
This completes the set of KPIs used in SysTrack to calculate the end-user experience score, including categories like resource limitation, network configuration, latency, guest configuration, protocol specific data for ICA, and virtual infrastructure. With a complete set of relevant information the proactive and trending health analysis provided in SysTrack yields a thorough analysis in an easy to understand, quantitative score that summarizes performance on an environmental, group based, or individual system level.
NVIDIA GPU monitoring/assessing (works with all NVIDIA GPUs: Quadro, Kepler, GRID)
You will be able to look at the following parameters:
Frame Buffer Usage
Memory Usage (Bytes and Percent)
# of Apps
Temperatures and Fan RPMS
Use this data to accurately plan and size GRID and HDX 3D Pro deployments based on actually observed usage and utilization.
Monitor users post-deployment to provide the best user experience
UberAgent 1.8 for Splunk adds GPU performance monitoring
Helge Klein has developed a new version of uberAgent for Splunk that now supports GPU monitoring. This was a feature request I discussed with Helge Klein in 2013, and I am so happy to see the results of what he has done with uberAgent for Splunk. Let’s dig into what it can do.
GPU compute usage per machine
GPU memory usage per machine
GPU compute usage per process
GPU memory usage per process
uberAgent shows memory usage separately for shared and dedicated memory (dedicated = on the GPU, shared = main system RAM)
uberAgent shows compute usage per GPU engine. The various GPU engines serve different functions, e.g. 2D acceleration, 3D acceleration, video decoding, etc.
You will see more upcoming blog posts from me covering this topic: end-user experience, assessments of GPU workload, scaling/sizing, benchmarking, supported hardware, GPU side-by-side experience, and hypervisor vs bare metal with a GPU. Watch out for cool things…
Citrix 3D Graphics Pack is the new name for the “Citrix Virtual GPU solution” that was introduced as a Tech Preview in October 2013 and consists of NVIDIA vGPU together with XenServer/XenDesktop components. The cool thing about this release is that NVIDIA vGPU is now generally available, no longer a “technical preview/beta”, and the product has been fully built into XenServer. Citrix has created an amazing GUI in XenCenter for XenServer 6.2 and in XenDesktop 7.1. There are also cool new SDK commands to fully automate the GPU configuration if you prefer the CLI over the GUI. On December 16th, Citrix released support for GPU virtualization using XenDesktop 7.1 HDX 3D Pro with XenServer 6.2 SP1 in the Citrix 3D Graphics Pack (see http://www.citrix.com/go/vgpu). This means that multiple users can share a single GPU, overcoming the 1:1 ratio to achieve higher user densities and create a more cost-effective remote 2D/3D virtualization solution.
Kudos to NVIDIA for building the vGPU
Kudos to the Citrix XenServer team for integrating the vGPU into Xen
Kudos to the Citrix XenDesktop team for integrating the vGPU into XenDesktop
Kudos to all the Citrix HDX 3D crew
What is Citrix 3D Graphics Pack
The Citrix 3D Graphics Pack enables true hardware GPU sharing of NVIDIA GRID Graphics cards providing the industry’s highest performance virtualized professional graphics app acceleration. This technology was first unveiled at Citrix Synergy 2013 and allows GPU sharing for Virtual Desktop Infrastructure (VDI) for XenServer, XenDesktop and NVIDIA GRID GPUs.
XenServer GPU pass-through improvements, including XenCenter configuration.
Citrix/NVIDIA XenServer w. vGPU Architecture
Sizing NVIDIA vGPU profile
NVIDIA vGPU profiles differ in the amount of memory, CUDA cores and frame buffer, the number of displays and the display resolution per GRID GPU type.
I have added more information than Citrix and NVIDIA give in their own material. vGPU OS support is one of the important points, where you can clearly see which OS is supported on pass-through profiles vs vGPU profiles.
Which GRID to choose
What’s new in XenServer 6.2 SP1
The 3D Graphics Pack supporting NVIDIA GRID GPUs
Support for Windows 8.1 and Windows Server 2012 R2
Improvements to the Site Recovery wizard for large deployments
GPU pass-through improvements including XenCenter configuration.
New SR wizard allows up to 50 new fibre-channel HBA SRs to be created in a single step.
Security Hotfix and functional Hotfix roll-up.
The new SDK for XenServer 6.2.0 Service Pack 1 is ideal for developers wishing to programmatically access XenServer’s new management features for GPU virtualization (including the new vGPU and GPU pass-through). The five available XenServer SDKs, one for each of C, C#, Java, PowerShell and Python, expose the new XenAPI commands for working with physical GPUs (pGPUs), GPU groups, virtual GPUs (vGPUs) and virtual GPU types. The GPU technologies for XenDesktop and XenServer do of course also come with rich GUI configuration operations and provisioning via XenCenter, XenDesktop and MCS. Over the last few development cycles, Citrix has invested in rewriting XenServer’s PowerShell API to provide developers and administrators with a PoSH alternative to the XenServer command line interface (CLI). In particular, this interface is proving popular with those looking to automate bespoke vGPU and GPU pass-through configuration and benchmarking or auto-test frameworks. The PowerShell API is also a popular choice for XenDesktop and Windows administrators working with XenServer. Read more about the SDK here
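As a rough illustration of the Python flavour of the SDK, the sketch below lists the virtual GPU types a host pool knows about. It is a sketch under assumptions, not official sample code: it assumes the XenAPI `VGPU_type` class with the fields `model_name`, `framebuffer_size` and `max_heads`, and the helper name is hypothetical. The function takes an already logged-in session, so it can be tried with a stand-in object before it is pointed at a real host.

```python
def summarize_vgpu_types(session):
    """Return a sorted list of vGPU type summaries.

    `session` is expected to behave like a logged-in XenAPI session
    (assumption: session.xenapi.VGPU_type.get_all_records() returns a
    dict of opaque reference -> record).
    """
    records = session.xenapi.VGPU_type.get_all_records()
    summary = []
    for record in records.values():
        summary.append({
            "model_name": record["model_name"],
            # XenAPI often returns integers as strings, so convert first.
            "framebuffer_mb": int(record["framebuffer_size"]) // (1024 * 1024),
            "max_heads": int(record["max_heads"]),
        })
    return sorted(summary, key=lambda item: item["model_name"])


# Usage against a real host (assumed, needs the XenAPI module from the SDK):
#   import XenAPI
#   session = XenAPI.Session("https://<xenserver-host>")
#   session.xenapi.login_with_password("<user>", "<password>")
#   for vgpu_type in summarize_vgpu_types(session):
#       print(vgpu_type)
```

Keeping the session as a parameter separates the connection details from the reporting logic, which also makes the function easy to reuse in benchmarking or auto-test frameworks.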
What’s new in NVIDIA GRID vGPU Pack
GA of NVIDIA GRID vGPU Manager + Windows Display Driver
The latest NVIDIA GRID vGPU Manager is version 331.30
Latest NVIDIA GRID vGPU Windows Display Driver (332.07) for Windows 7, Windows 8, Server 2008R2, Server 2012.
Important if you implemented XenServer vGPU tech preview
Customers who have previously installed the vGPU Tech Preview (XS62ETP001) on a host, cannot subsequently install Service Pack 1. Customers wishing to install Service Pack 1 will need to do a fresh installation of XenServer 6.2.0, before installing Service Pack 1.
1. Start with a fresh XenServer 6.2 installation on GRID-supported hardware
2. Install Service Pack 1 on the XenServer 6.2
3. Download the NVIDIA GRID vGPU Pack and unzip the contents; install NVIDIA GRID manager in XenServer from the CLI
4. Create a Windows 7 VM (this will be the base image)
5. From the XenCenter GUI, assign a vGPU type to the base image
6. In the Windows 7 VM:
a. Install NVIDIA GPU guest OS driver (available in the NVIDIA GRID vGPU Pack)
b. Install the XenServer Tools
c. Install the latest version of Citrix HDX 3D Pro VDA 7.1
7. Create a Machine Catalog using MCS to provision new VMs based on the base image or you can also use Citrix Provisioning Services (PVS)
8. Create Delivery Group, assign users, and publish the desktops
9. Access virtual desktops using Citrix Receiver. No GPU is required on the end-point devices
10. Validate GPU sharing by multiple desktops, using monitoring tools like Process Explorer from Microsoft
GUI enhancements in XenCenter for XenServer 6.2 SP1
In Citrix XenCenter, there is a new tab called “GPU” at the host level. The appropriate vGPU types attached to the host are defined in this GUI, and made available to the virtual machines (VMs). Depending on the requirements, one can also define the GPU placement policy here. This tab also makes it very convenient to visualize how many vGPUs are already attached, and the physical GPUs where they get placed.
This makes later troubleshooting simpler.
At the VM level in XenCenter, the vGPU can be selected as part of the VM properties or during New VM creation on GPU-enabled hosts. In the tech preview, this was a laborious step in the CLI. Now, simply determine the suitable vGPU profile for your use case and select it from the drop-down list. Once the VM is created, it boots into the Windows standard 800x600 VGA resolution. The vGPU features are available once the guest driver is installed in the Windows VM.
GPU performance graphs are available under the Performance tab of XenServer host. On first-run, these graphs have to be added to the view. Subsequently, they can be moved up or down and can show one or more of the installed GPUs.
GUI enhancements in XenDesktop 7.1 with XenServer 6.2 SP1
There are a few GPU-related enhancements in the XenDesktop consoles, and automated provisioning of vGPU-enabled VMs using Machine Creation Services (MCS) is the one we’ve been waiting for. Simply attach a vGPU to the base VM, install the virtual delivery agent (VDA for HDX 3D Pro), and install the required graphics apps. Then head over to XenDesktop Studio to create the machine catalog. The one thing to be careful about: do not run Sysprep after creating a vGPU-enabled base image, or it wipes out the vGPU information.
In Studio, the vGPU Type must be defined while creating the host settings to be used as a platform for the MCS machines.
Subsequently, proceed to creation of a machine catalog as usual. The exact steps are outlined in the Reviewer’s Guide. At the step where MCS base image is chosen, hovering over the image name shows information to confirm if you have a valid vGPU-enabled master image.
The remaining process to create machine catalog, create delivery group, and assign users is no different than the usual way of delivering desktops and apps. Use the latest Citrix Receiver to access 3D apps.
Tweak XenServer 6.2.x for GPU intense applications/performance
The articles below are critical to follow, whether you use GPU pass-through or vGPU profiles with your virtual machines.
I have seen many GPU-intensive applications that rely on Turbo mode or the maximum CPU clock frequency, and if you think Turbo mode works out of the box, think again. Many people virtualize their 3D applications, and performance will suffer if this is not configured.
I have seen multiple performance issues. Now that Citrix has officially shared the information, I can help you with how to tweak XenDesktop 7.1 HDX 3D. The following tweaks are for XenDesktop 7.0 and 7.1 VDAs.
With high screen resolutions (such as 2560×1600), a lower than expected Frames per Second (FPS) may be apparent, impacting user experience.
To work around this issue, change the EncodeSpeed registry value from [HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\Graphics]"EncodeSpeed"=dword:00000002 to [HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\Graphics]"EncodeSpeed"=dword:00000001.
When connecting to high-resolution displays (for example, 2560×1600), artifacts of previously opened windows can remain. To ensure that the screen is refreshed, users can add the following registry key: [HKEY_LOCAL_MACHINE\Software\Citrix\HDX3D\BitmapRemotingConfig]"HKLM_EnableDirtyRect"=dword:00000000
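Both registry tweaks can be collected in one .reg file so they are applied the same way on every VDA. The sketch below only renders the file text from the two keys and values given above. The helper and constant names are hypothetical, and importing the result should be tried on a test VDA first.

```python
def render_reg_file(entries):
    """Render DWORD registry values as .reg file text.

    entries is a list of (key_path, value_name, dword_value) tuples.
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key_path, value_name, dword_value in entries:
        if not 0 <= dword_value <= 0xFFFFFFFF:
            raise ValueError(f"{value_name} does not fit in a DWORD")
        lines.append(f"[{key_path}]")
        # .reg files store DWORDs as 8 hexadecimal digits.
        lines.append(f'"{value_name}"=dword:{dword_value:08x}')
        lines.append("")
    return "\r\n".join(lines)


# The two tweaks described above, with their target values.
HDX_3D_TWEAKS = [
    (r"HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\Graphics", "EncodeSpeed", 1),
    (r"HKEY_LOCAL_MACHINE\Software\Citrix\HDX3D\BitmapRemotingConfig",
     "HKLM_EnableDirtyRect", 0),
]

if __name__ == "__main__":
    print(render_reg_file(HDX_3D_TWEAKS))
```

Saving the printed text as a .reg file gives a single artifact that can be reviewed and imported, instead of editing each value by hand.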
This article describes how to enable XML DNS Address Resolution in XenApp 6.
Enabling DNS Address Resolution allows the XenApp 6 server to return the Fully Qualified Domain Name (FQDN) of the XenApp server to the ICA client using the XML Service. This also means that, with this policy enabled, workstations must be able to resolve the FQDN of the XenApp 6 servers.
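Before enabling the policy, it can help to check from a workstation that every XenApp server FQDN actually resolves. The sketch below is one way to do that with the Python standard library. The host names in the example are hypothetical, and the resolver is passed in as a parameter so the check can be exercised without touching real DNS.

```python
import socket


def unresolvable_hosts(fqdns, resolver=socket.gethostbyname):
    """Return the FQDNs that the given resolver fails to resolve."""
    failed = []
    for fqdn in fqdns:
        try:
            resolver(fqdn)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            failed.append(fqdn)
    return failed


if __name__ == "__main__":
    # Hypothetical server names; replace with your XenApp 6 servers.
    servers = ["xa01.example.local", "xa02.example.local"]
    missing = unresolvable_hosts(servers)
    if missing:
        print("Cannot resolve:", ", ".join(missing))
    else:
        print("All XenApp server FQDNs resolve.")
```

Running this on a representative client in each network segment shows whether any workstation would fail to connect once the server starts returning FQDNs.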