Next: Network Address Translators Up: Notes on ``TCP/IP Illustrated'' Previous: Bibliography

A Backwards Compatibility Problem

This describes an example which actually occurred at the University, showing how evolution in the IP standards can cause a substantial backwards-compatibility problem. It should be read after Chapter 4 of Stevens, as an illustration of the material so far. See also the notes to pages 118 and 123, which illustrate several infelicities due to the fact that sub-netting was added after other ICMP features were defined.

When the University of Bath installed its first Ethernet (in Mathematical Sciences in 1986), the University had, thanks to a far-sighted member of the Computing Service, already obtained a Class B address (138.38.x.y); such an allocation would be impossible today. It was already clear that some allocation policy would be needed, and RFC 950, describing sub-netting, had just been published, although it was not yet widely supported. Mathematical Sciences was allocated 138.38.96.x (and up to 138.38.103.x). So the three Sun machines in Mathematical Sciences formed a nice little private network, unconnected to anything else. By 1988, the Sun operating system supported sub-netting.

Mathematical Sciences also had some High-Level Hardware Orion machines, which soon supported Ethernet and TCP/IP, but not sub-netting, and they were added to this Ethernet. As long as this network remained separate, there was no problem: the Orions thought that they were machines 96.10 etc. on Class B network 138.38, while the Suns thought that they were machines 3 etc. on sub-net 138.38.96, but this did not matter, since the same IP addresses were used and no routing was taking place.

Once the Sun 138.38.96.1 became multi-homed, with a second Ethernet connection named 138.38.32.254 on the campus backbone (then Ethernet), there was a problem with the Orions. The Sun 138.38.96.3, with a subnet mask of ffffff00, knew that 138.38.32.1 was on a different subnet from itself, so it sent packets for it via its default route, i.e. 138.38.96.1. The Orion 138.38.96.10, with no sub-netting, i.e. a mask of ffff0000, thought that the two were on the same net, so it would not route, but would instead ARP for 138.38.32.1, a request that no machine on the Mathematical Sciences Ethernet would answer.
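The disagreement between the two hosts can be reproduced with Python's standard ipaddress module. This is only an illustrative sketch: the function name is_on_link is invented here, and a real IP stack makes this decision inside its own routing code rather than in anything resembling this script.

```python
import ipaddress

def is_on_link(host: str, mask: str, destination: str) -> bool:
    """Return True if `host`, using `mask`, believes `destination`
    is on its own network (so it would ARP for it directly)."""
    # strict=False lets us pass a host address rather than a network address.
    network = ipaddress.ip_network(f"{host}/{mask}", strict=False)
    return ipaddress.ip_address(destination) in network

SUN_MASK = "255.255.255.0"   # ffffff00: the Sun understands sub-netting
ORION_MASK = "255.255.0.0"   # ffff0000: the Orion sees one Class B network

sun_direct = is_on_link("138.38.96.3", SUN_MASK, "138.38.32.1")
orion_direct = is_on_link("138.38.96.10", ORION_MASK, "138.38.32.1")

print("Sun ARPs directly for 138.38.32.1:  ", sun_direct)
print("Orion ARPs directly for 138.38.32.1:", orion_direct)
```

Under these assumptions the Sun concludes that the destination is off-link and hands the packet to its default router, whereas the Orion concludes that it is on-link and broadcasts an ARP request for it.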

This problem was solved by ``proxy ARPing'' (p. 60), in that 138.38.96.3 was instructed to reply to ARP requests for 138.38.32.1 (and other machines) by giving the Ethernet address of 138.38.96.1. 138.38.96.10 was then satisfied, and sent the packet for 138.38.32.1 out on the Mathematical Sciences Ethernet, with the Ethernet address of 138.38.96.1, but believing that this was the Ethernet address of the destination. On arrival at 138.38.96.1, there was no way to tell this packet from a correctly routed packet (e.g. from 138.38.96.3), and it was routed on to 138.38.32.1. A reply packet was routed by 138.38.32.1, which understood sub-netting, to its non-default router 138.38.32.254 for the 138.38.96.x subnet, and passed on to 138.38.96.10.
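The proxy's behaviour can be modelled as a lookup table of ``published'' entries. The following is a toy sketch, not the Sun implementation: the class ArpResponder and both Ethernet addresses are invented for illustration.

```python
# Hypothetical Ethernet addresses, invented for this sketch.
ROUTER_MAC = "08:00:20:00:00:01"  # stands in for 138.38.96.1
PROXY_MAC = "08:00:20:00:00:03"   # stands in for 138.38.96.3

class ArpResponder:
    """Toy model of a host that answers ARP requests for itself
    and for any addresses it has been told to publish."""

    def __init__(self, own_ip, own_mac):
        self.own_ip = own_ip
        self.own_mac = own_mac
        self.published = {}  # IP address -> Ethernet address to hand out

    def publish(self, ip, mac):
        """Record a proxy entry: answer requests for `ip` with `mac`."""
        self.published[ip] = mac

    def answer(self, requested_ip):
        """Return the Ethernet address to reply with, or None to stay silent."""
        if requested_ip == self.own_ip:
            return self.own_mac
        return self.published.get(requested_ip)

proxy = ArpResponder("138.38.96.3", PROXY_MAC)
# The proxy hands out the router's Ethernet address, not its own.
proxy.publish("138.38.32.1", ROUTER_MAC)

reply = proxy.answer("138.38.32.1")    # the router's Ethernet address
unknown = proxy.answer("138.38.32.2")  # None: no entry was published
```

The Orion then addresses its frame to the router's Ethernet address while still believing that it is talking directly to 138.38.32.1. A destination with no published entry gets no answer, which is why the list had to be maintained by hand.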

There were some drawbacks to this scheme. The proxy entries had to be re-added to the ARP table on 138.38.96.3 each time the system booted, so a list of arp -s pub (see p. 63) commands was added to the start-up script. Every time the Orions wanted to communicate with a new machine not on the sub-net, this list had to be extended. Also, 138.38.96.3 became an additional point of failure: if it was down, the Orions could not communicate with any machine on the rest of 138.38.x.y (outside 138.38.96.y) unless the entry was already in their ARP cache. Thus a failure of 138.38.96.3 manifested itself as a slowly-growing series of complaints of the form ``my Orion won't talk to ...''. An attempt was made to solve this by using 138.38.96.1 as the proxy server, but this failed, as it also answered ARP requests for 138.38.32.1 on the 138.38.32.y sub-net. 138.38.96.2 was also used as a proxy server, with another manually-maintained list of ARP commands in its start-up script. Most of the time this worked, but occasionally people would forget to add machines to both lists. This was not a problem as long as both machines were up, but if one failed, a few connectivity problems would occur, as an Orion tried to contact a machine for which only the failed machine was answering.
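The hazard of keeping two hand-maintained lists can be made concrete with a small consistency check. This is a hypothetical sketch: no such check is described as having existed, the function name unprotected is invented, and the list contents are made up.

```python
import ipaddress

def unprotected(published_by_3, published_by_2):
    """Return the addresses for which only one of the two proxies answers,
    i.e. those that become unreachable if that one proxy is down."""
    only_one = set(published_by_3) ^ set(published_by_2)  # symmetric difference
    return sorted(only_one, key=ipaddress.ip_address)

# Hypothetical contents of the two start-up lists.
proxy_3 = ["138.38.32.1", "138.38.32.7", "138.38.40.2"]
proxy_2 = ["138.38.32.1", "138.38.40.2"]

at_risk = unprotected(proxy_3, proxy_2)
print(at_risk)
```

Any address reported by such a check is reachable only while the one proxy that lists it is up, which matches the occasional connectivity problems described above.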

RFC 1122 (October 1989) made sub-netting support compulsory, but High-Level Hardware never upgraded their system. The hack continued to run until the Orion kernels were patched in 1993 to give them the correct subnet masks; the Orions themselves were turned off in 1995.


James Davenport 2004-03-09