Ever wonder what a Sun Ray session would look like at 128 kbps?  How about 64 kbps?  A little-known, not-well-documented feature of Sun Ray is that you can limit the amount of bandwidth used by each session.  Before I explain how to do this, let me set some ground rules and expectations.

While this is useful for testing what-if/worst-case scenarios, it's not advisable to set this in a production environment.  The Sun Ray protocol is adaptive and self-throttling, and it will burst higher when needed.  Preventing the bandwidth from peaking temporarily will result in slow screen draws.

Don't think that because you don't have the bandwidth constrained, the Sun Ray will always be running all out.  Like I mentioned before, it's self-throttling.  If you're not doing anything, with "anything" defined as things that make the screen change, then neither is the protocol (minus some keep-alives).  Therefore this setting should only be used in what-if scenarios.

Latency is much more important to thin clients than G/M/Kbps.  This test does not introduce latency.  So if you test 200 kbps in a LAN environment, don't expect those results to hold true for a high-latency WAN environment.

Requirements:

Your Sun Ray must be able to get vendor class options from a DHCP server somewhere on your network.  This in itself could be a whole other topic, so for this post I'll assume you understand what I mean.  A list of the vendor class options can be found here.  Note that with SRSS 3 Update 1, we are shifting much of the burden of booting a Sun Ray away from DHCP.  You can read about that in the release notes to the beta software.

The vendor class option we are going to add is called NewTBW.  Its value is in bps, so if you wanted to throttle your bandwidth to 200 Kbps, the value of this option would be 200000.

64000 is the lowest that this can be set.  You can specify a lower number, but the Sun Ray is going to use 64Kbps regardless.
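The conversion and the 64000 floor described above can be sketched in a few lines of shell. The function names `kbps_to_newtbw` and `effective_bps` are made up for illustration and are not part of any Sun Ray tooling; the sketch only restates the arithmetic from this post (1 Kbps taken as 1000 bps, anything below 64000 treated as 64000).

```shell
#!/bin/sh
# Hypothetical helper: convert a rate in Kbps to the bps value NewTBW expects.
kbps_to_newtbw() {
    echo $(( $1 * 1000 ))
}

# Hypothetical helper: the rate the Sun Ray would end up using, given the
# 64000 bps floor described in this post.
effective_bps() {
    if [ "$1" -lt 64000 ]; then
        echo 64000
    else
        echo "$1"
    fi
}

kbps_to_newtbw 200     # value to use for a 200 Kbps test
effective_bps 32000    # a request below the floor
```

The first call prints 200000 and the second prints 64000, so asking for 32 Kbps gets you the same session as asking for 64 Kbps.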

The first step is to add the definition of this vendor class option to the DHCP table.  If you've created a Sun Ray interconnect, it should already be defined, so check this first:

# dhtadm -P |grep NewTBW

If NewTBW is not defined, here is how to define it under Solaris:

# dhtadm -A -s NewTBW -d Vendor=SUNW.NewT.SUNW,30,NUMBER,4,1

The next thing to do is to add that option to one of the DHCP network macros.  You can get a list of macros by doing dhtadm -P.  Since you may not want this setting to apply to all Sun Ray connections, pick the Sun Ray network macro that you are connecting to.  In this example I'll add the option for connections coming in on the 192.168.8.0 network.

# dhtadm -M -m SunRay-192.168.8.0 -e NewTBW=200000

Now I'll double check to make sure it took:

# dhtadm -P |grep NewTBW
SunRay-192.168.8.0      Macro           :Include=SunRay:AuthSrvr=192.168.8.10:NewTBW=200000:
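If you have several Sun Ray macros, eyeballing the grep output gets tedious. Here is a rough sketch that pulls out just the macro name and its NewTBW value. The function name `newtbw_settings` is hypothetical, and the sketch assumes macro lines look like the sample output above (name, type, then a colon-delimited value).

```shell
#!/bin/sh
# Hypothetical helper: read `dhtadm -P` output on stdin and print
# "macro-name value" for every line whose value sets NewTBW.
newtbw_settings() {
    sed -n 's/^\([^[:space:]]*\)[[:space:]].*:NewTBW=\([0-9]*\):.*/\1 \2/p'
}

# Sample line copied from the output above; on a real server you would
# pipe in `dhtadm -P` instead.
printf '%s\n' 'SunRay-192.168.8.0      Macro           :Include=SunRay:AuthSrvr=192.168.8.10:NewTBW=200000:' | newtbw_settings
```

Lines that do not contain a `:NewTBW=` setting, such as the symbol definition itself, are skipped.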

You may need to stop and start your DHCP server to have this option take effect.  Now you can power cycle the Sun Ray to get the new setting and start testing.

To change the option value, you can reissue the original command you used to set the option, with a new value.
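If you plan to step through several rates, a small dry-run helper saves retyping. This is only a sketch: `newtbw_cmd` is a made-up name, the macro is the one from my example, and the function just prints the dhtadm command for you to review and run yourself.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the dhtadm command that sets NewTBW
# on the given macro to the given rate in Kbps.
newtbw_cmd() {
    macro=$1
    kbps=$2
    echo "dhtadm -M -m $macro -e NewTBW=$(( kbps * 1000 ))"
}

# Print the commands for a few test rates.
for kbps in 64 128 384; do
    newtbw_cmd SunRay-192.168.8.0 "$kbps"
done
```

Remember that after each change you may still need to restart the DHCP server and power cycle the Sun Ray before the new rate applies.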

Happy testing!

Comments:

This is interesting... wish I had known about it before messing around with Bandwidth Manager when I wanted to see how a SunRay would work in low bw. I got pretty respectable behavior at 384kbps. Thanks.

Posted by Oscar on June 29, 2005 at 07:44 AM PDT #

When you say "little known/not well documented" you really mean "unsupported, might go away or change without notice, use at your own risk, don't expect help if you try to use this in a production environment". This throttle is not intended for customer use, which is why it's not documented.

Posted by ottomeister on June 29, 2005 at 10:45 AM PDT #

Bad Craig? No Donut? Sorry Otto. :( But it's still fun to play with. Only a few things I do are supported anyhow. You know I love making work for the real engineers!

Posted by ThinGuy on June 29, 2005 at 10:58 AM PDT #

I'm with you, just because a novel feature is 'undocumented' doesn't make it undesirable. GREAT news coming from JavaOne! Thought JL 'lost' that mustache at NCQ205? I attended 'DCD' site trials this past morning, a quote from my client 'it works much faster now'. 'Mhz' operation was observed as fast? The 'Ghz' data transport trial should prove noteworthy! Quoting a fine ad - 'This one's for you!'

Posted by William R. Walling on June 29, 2005 at 01:02 PM PDT #

First time I saw him with it off was back in January. I'm pretty sure he had last December @ the Sun Ray 170 launch.

Posted by ThinGuy on June 29, 2005 at 01:09 PM PDT #

Excellent post, was about to ask the alias how one could do this in a lab to play around with different bandwidth scenarios.

Posted by Christopher Saul on June 29, 2005 at 02:43 PM PDT #

"I'm pretty sure he had last December @ the Sun Ray 170 launch."
I think so too. The launch photos, featuring several shots of JohnnyL, are still online at the URL Craig gave at the time (mid-December 2004) but the site has since been borg'ed by Kodak and now you have to register to see more than thumbnails. I'm not willing to deal with the marketing sliminess of the site just to play spot-the-moustache.
"I'm with you, just because a novel feature is 'undocumented' doesn't make it undesirable."
By all means play with it, just don't depend on it. And bear in mind that clamped bandwidth on a lab desk is not at all the same thing as limited bandwidth in a WAN, so don't expect to see the same results in the field as you see in the lab. Round-trip latency and packet loss are important factors.

Posted by ottomeister on June 30, 2005 at 11:13 AM PDT #

We are running sunrays from a server with a gigabit ce interface ... as soon as I start doing heavy screen updates, like scrolling this page really fast: http://cdrecord.berlios.de/ performance drops to a crawl ... my theory is that because the interface of the server is 1 Gb but the sunrays only do 100 Mb there will be lots of packet droppage as soon as screen updates get more intense ... so maybe this option will solve the problem ... can't wait to try when I get to work tomorrow.

Posted by Tobias Oetiker on December 20, 2005 at 11:48 AM PST #

I cannot say that srss3.1 does a good job at self-throttling. We have a fully switched (BigIron) network here. Servers are attached via Gigabit and clients over 100 Mbit. I see up to 50% dropped packets with utcapture, which results in awful screen updates and horrible fake latencies of up to 3 seconds. When I limited the bandwidth of my Sun Rays to below 30 Mbit/sec, all these artifacts vanished. I am very happy you published this "feature". I would be even happier if the SRSS would do a better job at rate throttling.

Posted by Robert on August 07, 2006 at 04:25 AM PDT #

Hi ThinGuy -- great article (though I come to it somewhat late in the day!)

We're seeing terrible performance in the presence of packet loss. Our servers have 1G interfaces, we have a 10G backbone, and 100M DTUs. SunRay performance is *very* bad with some switches with small output buffer sizes, but choosing the right edge switches solves that. However, with sunray@home we are finding that even with a 4M link, screen redraw rates are appalling (random pieces go missing, sometimes for seconds at a time). We can reproduce this in-house by rate-limiting an edge switch port to 4Mb/s. This clearly illustrates that the SRSS rate-throttling algorithm is sub-optimal.

I have to question the logic behind the throttling algorithm used. I have seen it written somewhere that it is not unusual to see sustained 10% packet loss. This may be acceptable in a LAN-based environment where that limit would rarely, if ever, be achieved. But it is bound to create trouble on a WAN link. A sunray@home user's broadband (ADSL/cable) link will just drop random packets, or block for a while or....

However, this is the only place I've ever seen anyone mention the problem. Can we be alone? Is everyone else doing something differently? Right now this hack seems to be the only option for us, but if anyone can shed light on what else might be wrong I'd be very interested to hear about it!!

Posted by FatGuy on January 29, 2008 at 10:36 AM PST #

Hi FatGuy,
No contact info, so I'll have to respond here. I don't think you are seeing an algorithm problem. Any chance you have Cisco switches? There are issues with some of their switches (and those of some other companies that discard rather than buffer UDP packets) that just discard UDP packets going from GigE ingresses to 100 TX egresses. Try putting the server on a 100 TX switch port and test again.

Posted by Sun Ray on January 29, 2008 at 11:33 AM PST #

Thanks for your comments, Sun Ray. Our network is almost entirely Foundry/HP kit. We did see problems with the Foundry switches not buffering sufficiently, but a reconfiguration of buffer sizes fixed that. Generally, as I said, performance on the local net is fine (with GigE server/100M sunrays).

Our main issue here is sunray@home where the target network is much much slower. Hence our experiment in-house with throttled edge ports. I'm assuming that this is a rough simulation of a home user - very bursty traffic is likely to be dropped by ISPs rather than buffered.

But you seem to be saying that the SRSS should cope with this and throttle accordingly. Can you give any hint as to how the algorithm operates in principle? What might break it?

Thanks again for your help

PS: I'd be happy to communicate more privately, but I don't really want to publish my contact details to the blog.

Posted by FatGuy on January 29, 2008 at 01:56 PM PST #

An update: our rather severe sunray@home problem turned out to be very faulty routing with very odd side-effects. And I can confirm without any doubt that the sunray throttling does work over WAN links -- with apologies for doubting in the first place ;-). Interestingly, however, our in-house experiment with throttled edge ports was not affected by routing issues, and the only solution to packet-loss due to (massive server-client speed mismatch) was to throttle the client using the "bandwidth limit" option in the pop-up GUI. This worked extremely well.

Now I'm curious about this. Surely detecting a good rate for a local client should be easier than for a WAN-based client? Perhaps there is some heuristic based on latency which affects the throttling algorithm?

Posted by FatGuy on January 30, 2008 at 02:48 PM PST #

Compression doesn't kick in until about 8Mbps with the most aggressive happening at lower speeds. Were you able to test @ 100 TX on the server side? I've submitted a RFE for a server side bandwidth limiting feature but don't have any customer names to attach to it. You can contact me at firstname.lastname@sun.com Or my blog handle @sun.com.

Posted by Thin Guy on January 30, 2008 at 02:56 PM PST #

Hi Thin Guy and others,

We're testing SunRays at my place of employment and are routinely seeing 30% packet loss using utcapture. I'm very interested in trying the NewTBW option; however, I'm running a Windows 2003 server as the DHCP provider. I've figured out roughly how to create a new vendor class option but so far it hasn't seemed to help. Has anyone tried creating this option on a Windows system? Also, is there a way from the client to determine if it's receiving the vendor class option, a la utquery perhaps?

Posted by landmissle on February 25, 2008 at 05:21 PM PST #

Hi landmissle

First, it is possible to set the NewTBW option on a W2K3 DHCP server (took me a while to get my head around it as I'm a 'nix sort of person rather than Windows). I don't have instructions to hand, but I remember finding something on the net...

Second -- as I suspect others will agree -- 30% packet loss is pretty awful and if this is all LAN based you should look for a cause. Obviously check your network out. And before getting too far into anything else, make sure that you are using the latest patch of the sunray server software. This helped a lot for me. The NewTBW option is NOT required when the software is operating optimally!

Third, there is a much easier way for you to test the bandwidth setting of the client. Upgrade your clients to the latest firmware with the pop-up GUI. One of the options in the GUI is a bandwidth limit. Setting this has the same effect as using the NewTBW DHCP option. And it's a lot easier to play with!

Hope this helps a little!

Posted by FatGuy on February 26, 2008 at 01:10 AM PST #

Thanks for the information posted here. I am new to this area and I saw first hand the power of the sunray solution but it has also evoked some questions. I hope you can help me with some information here. I am trying to determine how ALP is affected over WAN.

Is the adaptive bandwidth only driven by bandwidth? Is the computation done per end point? When this kicks in does it do compression or does it slow down the server end?

How does latency affect this algorithm?

Do customers turn this feature off over WAN and use standard QoS policies to get more consistent performance?

Thanks so much!

Posted by thinclientoverwan on February 29, 2008 at 10:15 PM PST #

Hey all,

Reading through this blog i see a few references to Cisco switches and buffering values causing performance issues within the thin Sun environment. The company i work for is currently rolling out a solution using a SunFire 4150 and a bunch of Sunray 2FS's. We have RHEL 4 installed and are seeing some performance issues too that i'd like to run past readers.

Essentially, we have two Cisco 3750 switches stacked together. One is a 3750-FS (24x 100Mbps fibre ports) and the other a 3750G-24T (24x 10/100/1000 copper ports). We have the server plugged into a gig SFP in the 3750G-24T and the rays hanging off the 100Mbps fibre switch ports.

In this config, we experience poor redraw/refresh rates and window artifacts remain on the thin screens for a second or two when dragging windows about. If we plug the rays into the copper switch instead (and use the copper port of the ray), performance is much better with little to no redraw issues and absolutely no artifacts.

So, has anyone out there had similar experiences utilising the sunray fibre ports? What was the fix (besides changing over to copper)?

I'm gonna try connecting the server into the copper switch and down the speed to 100Mbps as suggested by 'Sun Ray' above. Will let you know how it goes.

Cheers.

Posted by kent on July 03, 2008 at 08:31 PM PDT #

Hi Kent,

There's a known issue with Cisco switches dropping UDP packets at Gig speeds. No workaround according to Cisco besides going to 100 TX. Give that a shot and see if your problem disappears. I'll post the tech note later today from Cisco.

Posted by Thin Guy on July 04, 2008 at 08:12 AM PDT #

Hi All,

Things seem much better. We have moved the server off the gig SFP and onto a copper port forced to 100FD. The sunray's (only 3 running at the moment) are very useable now with no artifacts and good redraw. It will be interesting to see how this 100Mb server connection scales to 40+ sunrays.

Thanks for the help. Looking forward to reading the Cisco tech note on the issue.

Cheers.

Posted by Kent on July 08, 2008 at 06:15 PM PDT #

Hey Kent (sorry for the delay),

http://www.cisco.com/en/US/products/hw/switches/ps5023/prod_release_note09186a00802165c5.html#wp614476

"If a switch is forwarding traffic from a Gigabit ingress interface to a
100 Mbps egress interface, the ingress interface might drop more packets
due to oversubscription if the egress interface is on a Fast Ethernet
switch (such as a Catalyst 3750-24TS or 3750-48TS switch) than if it is
on a Gigabit Ethernet switch (such as a Catalyst 3750G-24T or 3750G-24TS
switch). There is no workaround. (CSCed00328)"

40 Sun Rays on one 100TX switch? No problem.

Posted by Thin Guy on July 08, 2008 at 06:33 PM PDT #


This blog copyright 2008 by ThinGuy