This is a repost of an entry from ThinGuy's blog.

I thought that it was important to capture it here. I have also added methods of testing this from Linux.

The Importance of MTU

<snip> One thing, however, that the Sun Ray does not do is fragmentation reassembly.  There are some good reasons for this, as some WAN links drop fragmented packets, but mostly it's just a resource/feature issue.  But we do offer a work-around by allowing you to set the MTU on the Sun Ray itself, which tells the Sun Ray protocol to throttle back on the MTU.  More on that later.

Bear with me as I step in over my head (with a little help from my friends)

One thing to consider in doing remote deployments is the Maximum Transmission Unit (MTU), which is the largest packet/datagram that can be sent over a network.  In a perfect world, the MTU for Ethernet is 1500 bytes.  In most WAN environments, things are far from perfect.  Due to different types of links, network gear, security features, etc., an MTU of 1500 over a WAN (and sometimes a LAN) connection is a pipe dream.  So some smart people invented Path MTU Discovery (PMTU-D), which is a mechanism for a client and server to determine an agreed-upon MTU.  Then some paranoid people decided it's best to block ICMP packets across WAN routers, firewalls, etc., which basically renders PMTU-D useless.

Having the wrong MTU size is the cause of most of the frustration people deal with in remote Sun Ray scenarios.  "Slow redraws", "jaggy", "black pixels", and "unusable" are just some of the terms I've heard that all go away when we discover the correct MTU.   My MTU journey all started with my own dissatisfaction with my Sun Ray @ Home experience.  People were raving about it and I was just not seeing it.

So how can one find the correct MTU?  First you need a ping program that supports the Don't Fragment (DF) flag.

So what we want to do now is to ping through to the other side to find out the largest packet size you can get through. You can start with 1500 and work your way down until you get to one that returns without fragmenting the packet.

This example uses the Windows ping command:

C:\Documents and Settings\Craig>ping 24.234.0.7 -l 1366 -f

Pinging 24.234.0.7 with 1366 bytes of data:

Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.

Doh!  Santa Clara, we have a problem.  In this case 1366 is too large, and I've found that 1256 is the largest size I can put through.

C:\Documents and Settings\Craig>ping 24.234.0.7 -l 1256 -f
Pinging 24.234.0.7 with 1256 bytes of data:

Reply from 24.234.0.7: bytes=1256 time=13ms TTL=59
Reply from 24.234.0.7: bytes=1256 time=18ms TTL=59
Reply from 24.234.0.7: bytes=1256 time=16ms TTL=59
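
Stepping down by hand gets tedious, so the search can be scripted. What follows is a minimal Python sketch, not something from the original post. The names largest_unfragmented_payload and linux_ping_probe are made up for illustration, and the probe assumes the Linux iputils ping with its -M do, -c, -W and -s options. It also assumes that if one size gets through, every smaller size does too.

```python
import subprocess
from typing import Callable


def largest_unfragmented_payload(
    probe: Callable[[int], bool], low: int = 0, high: int = 1472
) -> int:
    """Binary-search the largest payload size in [low, high] that probe accepts.

    Assumes that if a given size gets through, every smaller size does too.
    The default upper bound of 1472 is 1500 minus 28 bytes of IPv4 + ICMP headers.
    """
    if not probe(low):
        raise ValueError("even the smallest payload did not get through")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid       # mid fits; the answer is mid or larger
        else:
            high = mid - 1  # mid is too big; the answer is smaller
    return low


def linux_ping_probe(host: str) -> Callable[[int], bool]:
    """Hypothetical probe: one DF-flagged ping per size using Linux iputils ping."""
    def probe(size: int) -> bool:
        result = subprocess.run(
            ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(size), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0  # exit status 0 means a reply came back
    return probe
```

You would call it as largest_unfragmented_payload(linux_ping_probe("24.234.0.7")). Since a single lost packet looks the same as a packet that was too big, it may be worth re-running the search or retrying inside the probe before trusting the result.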

Once you find your magic MTU value, add/adjust option 26 on the DHCP server supplying the Sun Ray its information, power cycle the Sun Ray, and you are all set.
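
For reference, DHCP option 26 is the Interface MTU option from RFC 2132: a one-byte option code (26), a one-byte length (2), and the MTU as a 16-bit unsigned integer in network byte order, with a minimum value of 68. If your DHCP server's admin tool wants the raw bytes rather than a decimal number, this small Python sketch shows what they look like. The function name encode_mtu_option is hypothetical, and how your particular server expects the value to be typed in is something to check in its own documentation.

```python
import struct

INTERFACE_MTU_OPTION = 26  # "Interface MTU" option code from RFC 2132


def encode_mtu_option(mtu: int) -> bytes:
    """Encode DHCP option 26 as code, length, and a big-endian 16-bit MTU."""
    if not 68 <= mtu <= 0xFFFF:
        raise ValueError("MTU must be between 68 and 65535")
    return struct.pack("!BBH", INTERFACE_MTU_OPTION, 2, mtu)


print(encode_mtu_option(1366).hex())  # 1a020556
```

The value part alone for 1366 is 0x0556, two bytes long.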

Here's an example for Linux:

#> ping -M do -s 1500 216.16.215.75
From 63.116.179.188 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 63.116.179.188 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 63.116.179.188 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 63.116.179.188 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 63.116.179.188 icmp_seq=1 Frag needed and DF set (mtu = 1500)

#> ping -M do -s 1366 216.16.215.75
PING bradlackey.net (216.16.215.75) 1366(1394) bytes of data.
1374 bytes from 216.16.215.75.static-cm-pool45.pool.hargray.net (216.16.215.75): icmp_seq=1 ttl=49 time=119 ms
1374 bytes from 216.16.215.75.static-cm-pool45.pool.hargray.net (216.16.215.75): icmp_seq=2 ttl=49 time=120 ms
1374 bytes from 216.16.215.75.static-cm-pool45.pool.hargray.net (216.16.215.75): icmp_seq=3 ttl=49 time=96.8 ms
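
A note on the numbers: the size you give to ping is the ICMP payload only. The Linux output above shows this as "1366(1394) bytes of data", where the extra 28 bytes are the 8-byte ICMP header plus a 20-byte IPv4 header (assuming no IP options). Here is that arithmetic as a small Python sketch; the function names are made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request/reply header


def payload_to_packet_size(payload: int) -> int:
    """Size of the full IP packet that carries a ping payload of this size."""
    return payload + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


def max_payload_for_mtu(mtu: int) -> int:
    """Largest ping payload that fits in one unfragmented packet at this MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(payload_to_packet_size(1366))  # 1394
print(max_payload_for_mtu(1500))     # 1472
```

By definition the MTU counts the whole IP packet, so a largest working ping size of 1256 corresponds to a 1284-byte packet on the wire. It is worth checking which of the two figures the device you are configuring expects.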

Comments:

I've worked out my MTU for using a Sun Ray across a WAN connection and I've set my DHCP server to give out the required information. Is there any way of confirming what the Sun Ray is actually using? utquery doesn't show any information for Sun Rays on different networks as far as I can tell.

Thanks
James

Posted by James Legg on February 09, 2007 at 08:10 AM PST #

utquery can be pointed directly at a specific DTU. utquery -d 1.2.3.4 Where 1.2.3.4 is the IP of the DTU. This will not work if the DTU is behind a NAT router.

Posted by bhlackey on February 10, 2007 at 03:03 PM PST #

What I ended up doing, as my DTU is NATed, was connecting to a local Sun Ray server and doing a utquery from there to ensure that my MTU was being set correctly. After that I utswitch to the remote server if I want to.

Thanks James

Posted by James Legg on February 11, 2007 at 08:59 AM PST #

Your article helped me a lot!
Thanks
Erik

Posted by Erik Riedel on February 13, 2007 at 11:58 AM PST #

"Once you find your magic MTU value add/adjust option 26 on the DHCP server supplying the Sun Ray it's information, power cycle the Sun Ray and you are all set."

Ok... I've found my magic MTU value (1366) but I'm having issues editing option 26 on my DHCP server. It says that the value is not correct. I'm using Windows 2003 server and the issues are within a LAN (so we're not messing with our routers). I converted it to hex, and it won't take that either. Am I missing something?

Thanks for the article. It's helping a lot, just need this last step.

Posted by Kevin on September 29, 2008 at 01:51 AM PDT #

If you're trying to do this with Solaris' ping, you can pass it an optional packet size at the end of the command line:

# ping -s -v 10.3.248.3 1472
PING 10.3.248.3: 1472 data bytes
1480 bytes from dtu002.sunray.example.ac.uk (10.3.248.3): icmp_seq=0. time=1.11 ms
1480 bytes from dtu002.sunray.example.ac.uk (10.3.248.3): icmp_seq=1. time=0.764 ms
^C
----10.3.248.3 PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms) min/avg/max/stddev = 0.764/0.935/1.11/0.24
# ping -s -v 10.3.248.3 1473
PING 10.3.248.3: 1473 data bytes
^C
----10.3.248.3 PING Statistics----
2 packets transmitted, 0 packets received, 100% packet loss

Our maximum MTU is therefore 1472.

Posted by Ceri Davies on December 04, 2008 at 01:22 PM PST #

Replying to Ceri Davies' comment:

I guess your interpretation is wrong.

The ping replies are 8 bytes larger, as stated in ping's output, so the biggest packet allowed is 1480 bytes. Moreover, ping doesn't count the IP Header in the packet size (20 bytes), and your MTU is then 1500.

I had the same result as you as the max ping parameter (1472), and was surprised a 1500 MTU was not achievable with a gigabit switch...

I believe setting your MTU to 1472, and reperforming the MTU discovery test gave you 1444 bytes as a result (1472 - 8 (answer overhead) - 20 (ip header))

However, although 1500 seems OK here using ping, the display is still awful on the DTU. A fullscreen image slideshow looks like we're "faxing" the images to the screen; it takes almost 1 second in the best cases for the screen to be completely redrawn...

Posted by Kevin POCHAT on April 06, 2009 at 06:40 AM PDT #

I would have to agree with Kevin. I am using WANem to emulate the WAN links used by our clients, and I am able to replicate their issues, which are a fax-like slide show. This is most apparent with PDF files.

I keep hearing how ALP is supposed to be better than RDP over WAN links, but so far I am getting the opposite results, with ALP performing much worse than RDP.

Any Suggestions?

Posted by Peter Sorensen on May 29, 2009 at 07:17 AM PDT #

We have a similar slowness issue, but it is very odd. We have a Cisco gigabit switch and the terminals plugged directly in run fine. We purchased some cheap 5 port GB switches as desktop port multipliers and we now have the fax issue. Yet in testing, when I replace the GB with a 100MB unit, the speed comes back up... very confusing.

Posted by xvart on June 03, 2009 at 12:48 AM PDT #

To xvart,

Actually the DTUs work at 100MB, and don't negotiate their preferred speed with GB switches (at least in some cases), so the switch talks with the DTU at 1GB --> lots of packet loss --> Fax effect.

Replacing your GB switches with 100MB ones solved the problem as the switches now (obviously) talk to the DTU at 100MB.

Posted by Kevin POCHAT on June 03, 2009 at 02:34 AM PDT #

Unfortunately that will not solve my problem as these DTU's are on 100Mb switches...

Posted by Peter Sorensen on June 03, 2009 at 05:45 AM PDT #

The issue here is that the cheap Gigabit switches have UDP buffers that are far too small for the large instantaneous bursts of data ALP can send.

If you are running on Solaris, you can set hires_tick=1 in /etc/system and reboot.

You can confirm that this is set by using 'getconf CLK_TCK' and confirm that it returns 1000

This only works if you are reasonably up to date with SRSS patches.

The other alternative is to set the server NIC to 100MB.

Posted by bhlackey on June 03, 2009 at 08:15 AM PDT #

This blog copyright 2010 by ThinGuy