-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathamber.texi
3915 lines (3281 loc) · 186 KB
/
amber.texi
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename amber.info
@settitle Chaosnet
@c %**end of header
@c Converted from Bolio by ams, assisted by victor
@c texi2any --html --no-split --set-customization-variable 'USE_TITLEPAGE_FOR_TITLE 1' --css-ref "https://tumbleweed.nu/lm-3/static/texinfo.css" --css-include=extra.css
@documentdescription
Chaosnet local area network protocol documentation. The master file is at https://github.com/Chaosnet/amber.
@end documentdescription
@macro xalpha
@math{\\alpha}
@end macro
@macro xbeta
@math{\\beta}
@end macro
@c for opcodes
@defindex op
@macro packet{name,num,descr}
@opindex \name\ (\num\): \descr\
@deffn Opcode @b{\num\} {@sansserif{\name\} @r{\descr\}}
@end macro
@c Notes:
@c Check all quoted numbers
@c Make sure all initialisms are decoded upon first occurence.
@c Discuss how to be a barebones implementation?
@c Service-finding protocol??
@c Error code names for TOPS-20
@titlepage
@center @t{MASSACHUSETTS INSTITUTE OF TECHNOLOGY}
@center @t{ARTIFICIAL INTELLIGENCE LABORATORY}
@ifnothtml
@sp 2
@end ifnothtml
@flushleft
@flushright
June, 1981
@end flushright
A.I. Memo No. 628
@end flushleft
@ifnothtml
@center @t{Chaosnet}
@sp 4
@center David A. Moon
@sp 6
@center Abstract
@end ifnothtml
@ifhtml
@title Chaosnet
@author David A. Moon
@b{Abstract:}
@end ifhtml
@indentedblock
Chaosnet is a @emph{local network}, that is, a system for communication
among a group of computers located within about 1000 meters of each
other. Originally developed by the Artificial Intelligence Laboratory
as the internal communications medium of the Lisp Machine system, it
has since come to be used to link a variety of machines around MIT
and elsewhere.
This memo describes both the hardware and the software protocols. It
is intended to be the definitive documentation for Chaosnet.
@end indentedblock
@iftex
@vskip 0pt plus 1filll
@end iftex
@ifnottex
@ifnothtml
@sp 1
@end ifnothtml
@end ifnottex
This report describes research done at the Artificial Intelligence Laboratory
of the Massachusetts Institute of Technology. Support for the Laboratory's
artificial intelligence research is provided in part by the Advanced Research
Projects Agency of the Department of Defense under Office of Naval Research
contract N00014-80-C-0505.
@end titlepage
@ifnottex
@ifnothtml
@node Top
@top Chaosnet
@end ifnothtml
@end ifnottex
@contents
@chapter Introduction
Chaosnet is a @emph{local network}, that is, a system for communication
among a group of computers located within one or two kilometers of each other.
The name Chaosnet refers to the lack of any centralized control element
in this network.
Chaosnet was originally developed in 1975 by the Artificial Intelligence Laboratory
of the Massachusetts Institute of Technology as the internal communications
medium of the Lisp Machine system [@ref{CHINUAL}, @ref{AIM444}]. It has since come to be
used to link a variety of machines around the Institute. Chaosnets also
exist at several other universities and research laboratories.
The Lisp Machine system is a multi-processor in which each active user is
assigned a "personal" computer consisting of a medium-scale processor, a
suitable amount of memory, and a swapping disk. Files are stored in a
central file-system accessed through Chaosnet. This shared file-system
retains the traditional advantages of a time-sharing system, namely
inter-user communication, shared programs, and centralized backup and
maintenance. At the same time, by giving each active user his own
processor, the Lisp Machine system is much more capable than a time-sharing
system at executing Lisp programs several million words in size efficiently
and with rapid interactive response. Because Chaosnet is taking the place
of the file disk in a conventional system, it must be fast (both in
response and in throughput), it must be reliable (this is the reason why
there is no centralized control), and it must allow connection of several
dozen machines. However, it does not need to operate over long distances.
Chaosnet is used to access other shared resources in addition to the file
system; these include printers, tape drives, and one-of-a-kind specialized
processors and I/O devices.
The system goals for Chaosnet were primarily simplicity and high performance.
The performance is achieved by starting with a very high speed transmission
medium and operating it in a simple, low-overhead fashion, rather than by
using unusually clever algorithms. Of course one has to be careful to avoid
algorithms which are so simple that they don't work or waste so much of the
transmission medium that performance is impacted. Simplicity was important
not only to improve performance, but because Chaosnet connects a diverse
set of machines, and hence had to have several implementations all of which
require maintenance in proportion to their complexity.
Simplicity of design also aids greatly in maintenance and management of the
network itself, which has proven to be a non-trivial task in a network involving
a variety of machines and used by a variety of groups, even within the single
institution of MIT. It is important to be able to isolate an apparent failure of the
whole network to the cable or to a particular host's hardware or software as
rapidly as possible.
The design of Chaosnet was greatly simplified by ignoring problems
irrelevant to local networks. Chaosnet contains no special provisions for
things such as low-speed links, noisy (very high
error-rate) links, multiple paths, and long-distance links with significant
transit time.
This means that Chaosnet is not particularly suitable for
use across the continent or in satellite applications.
Chaosnet also makes no attempt to provide unnecessary features such as multiple
levels of service or secure communication (other than by end-to-end encryption).
Chaosnet was largely inspired by the pioneering work on local networks at
Xerox PARC [@ref{ETHERNET}].
Chaosnet consists of two parts--the hardware and the software--which, while
logically separable, were designed for each other. The hardware provides a
carrier-sense multiple-access structure very similar to the PARC Ethernet.
Network nodes contend for access to a cable, or ether, over which they may
transmit packets addressed to other network nodes. The software defines
higher-level protocols in terms of packets. These protocols can be (and
are) used with media other than the Chaosnet cable, and with multiple
interconnected cables. The software contains ideas borrowed from Ethernet
[@ref{ETHERNET}], TCP [@ref{TCP}], and Arpanet, with some original ideas and
modifications.
@chapter Hardware Protocol
@section The Ether
The transmission medium of Chaosnet is called the @dfn{ether}.
Physically it is a coaxial cable, of the semi-rigid 1/2 inch low-loss type used for cable TV,
with 75-ohm termination at both ends. At each
network node a @dfn{cable transceiver} is attached to the cable. A 10-meter
flat cable connects the transceiver to an @dfn{interface} which is attached
to a computer's I/O bus. A network node consists of this transceiver and
interface and a computer which executes a certain piece of software called
the Network Control Program (NCP), which manages and controls Chaosnet, in
addition to whatever application software justifies its existence in the
first place.
One network node at a time may seize the ether and transmit a
packet, which arrives at all other nodes; each node decides in hardware
whether to ignore the packet or to receive it. A single ether must be a
linear cable; it may not contain branches nor stubs, and the ends may not
be joined into a circle. The maximum length of an ether cable is about 1
kilometer. This is determined by dispersion and by DC attenuation. The
maximum number of nodes on a single ether is probably a few dozen. This is
determined by degradation of the electrical properties of the cable by the
connectors used to attach the transceivers.
The maximum length of an ether could be increased by using
repeaters (bidirectional digital amplifiers joining two pieces of cable).
In practice this is not done. Instead, the protocol provides for multiple
ethers, joined together by nodes called @dfn{bridges} which relay packets
from one ether to another. A bridge is typically a pdp11 computer with two
or more network interfaces attached to it. A bridge node usually
performs other tasks as well, such as interfacing terminals. Bridges attach
other network media as well as ethers; some computers connect to the network
through their manufacturer's high speed computer-to-computer interface to
a nearby bridge, rather than being interfaced directly to an ether.
Asynchronous lines have also been used as Chaosnet media.
The form of information on the ether, the transceiver and
interface hardware, and the protocols which control who may seize the
ether are described in the following sections.
@section Packets
The basic unit of transmission is called a packet. This is a
sequence of up to 4032 data bits, plus 48 bits of @dfn{header} information
used by the hardware. Packets' bits are normally grouped into 16-bit
words. The division of a transmitted bit stream into packets provides a
conveniently-sized unit for resource allocation and error control. The job
of the hardware is to deliver a packet from one node to another. Software
protocols define the meaning of the data bits in a packet, manage the hardware,
compensate for imperfections of the hardware, and provide more useful services
than simple transmission of packets from one computer to another.
The hardware header consists of three 16-bit words, called
@dfn{destination}, @dfn{source}, and @dfn{check}. The source identifies the node
which transmitted this packet onto this ether. This is not necessarily the
original source of the message, since it may have originated on a
different ether. The destination identifies the node which is intended
to receive this packet from this ether. This is not necessarily the final
destination of the message; it may be a bridge which should relay the
packet to another ether, whence it will eventually reach its final destination.
The check word is a cyclic redundancy checksum, generated and checked
by hardware, which detects errors in transmission through the ether,
entirely-spurious packets created by noise on the cable, and memory
errors in the transmitting and receiving packet buffers.
The software protocol is also based on packets, taking 128 of the data bits
as a software header. This is described in @ref{packets-section,Packets}.
@section The Transceiver
Everyone who connects to the ether does so through a transceiver,
which is a small box which mounts directly on the cable via a UHF connector
and a T-joint. All nodes use identical transceivers (the interface varies
depending on what computer it is designed to interface to). The
transceiver contains the analog portion of the interface logic, provides
ground isolation between the ether cable and the computer, and contains
some protective circuitry designed to prevent a malfunctioning program or
interface from continuously jamming the ether. (If it were to be
redesigned, it ought to contain even more protective circuitry, since there
are some possible interface failures that can get past the protection and
render the ether unusable.)
The transceiver receives a differential digital signal from the
computer interface and impresses it onto the cable as a level of about 8
volts for a 1, or 0 volts (open circuit) for a 0, through a very fast VMOS
power FET. When the cable is idle it is held at 0 volts by the
terminations. This simple-minded unipolar scheme is adequate for the
medium cable lengths and transmission speeds we are using. The transceiver
monitors the cable by comparing it against a reference voltage, and returns
a differential signal to the interface. In addition, it detects
interference (another transceiver transmitting at the same time as this
one) and informs the interface.
The transceiver includes indicators (light-emitting diodes) for
power OK, transmitted data, received data, and interface attempting to jam
the ether. A test button simulates an input of continuous 1's from the
interface, which should light all the lights (dimly) if the transceiver is
working. These indicators and the test button are useful for rapidly
tracking down network problems. The transceiver requires its own power
supply mounted nearby; one power supply can service three transceivers if
they are all adjacent. High-voltage isolation between the cable and the
computer is provided by optical isolators within the transceiver.
@section The Interface
The interface is typically a wire-wrap board containing about 120 TTL
logic chips, which plugs into the I/O bus of a computer and connects it
to the ether (through a transceiver.) The interface implements the hardware
protocols described in the next section, buffers incoming and outgoing
packets, generates and checks checksums, and interrupts the host computer
when a packet is to be read out of the receive packet buffer or stored
into the transmit packet buffer. These packet buffers shield the host computer
from the high speed of data transmission on the cable. Instead of having to
produce bits at a high rate, the host can produce them at a lower rate,
collect them into a packet, and then tell the interface to transmit the packet
in a single burst of high-speed transmission.
Interfaces currently exist for Lisp machines, for DEC LSI-11
micro-computers, and for the DEC Unibus [@ref{UNIBUS}], which allows Chaosnet to
be attached to pdp11's, VAX's, and the peripheral processors of most
pdp10's. Programming documentation for this compatible family of
interfaces appears later in this paper.
Interfaces for other computers are likely to exist in the future. The
existing interface design does not make any unusual special demands of
its host computer and should be easily adaptable to other architectures.
@section Hardware Protocols
The purpose of these protocols is to deliver packets intact from
one node to another node on the same ether, with fairly high probability of
success, and to guarantee to give an error indication or lose the packet
entirely if it is not delivered intact. An additional purpose is to
provide high performance and not to bog down when subjected to a heavy
load.
Bits are represented on the ether using a technique which is called
Upright Biphase NRZI. Each bit cell, which is approximately 250
nanoseconds long, begins with a transition in state, from high to low or
from low to high. This transition marks the beginning of a bit cell and
provides self-clocking. 3/4 of the way through the bit cell, the state of
the cable is sampled; high represents a 1 and low represents a 0. If the
bit being represented is the same as the previous bit, there will be one
transition at the beginning of the bit cell and a second in the middle of
the bit cell. If the bit being represented is the opposite of the previous
bit, there will be no transition in the middle of the bit cell since the
clock transition will have set the cable to the desired state. The
AC frequency of the signal on the cable varies between 1/2 the bit rate and
the full bit rate. The information bit-rate is 4 million bits per second.
The self-clocking feature allows for slight variations in transmission
and cable propagation speed. However, since the 3/4 of a bit cell delay
is a fixed delay, only modest variations in speed can be tolerated. A crystal
clock is used as the source of the transmit timing in the interface.
Since there is always at least one state-transition per bit cell,
the states where the ether remains high or low for an appreciable time
are available for non-data uses. If the ether remains low for more than
about two bit cells, it is considered to be not-busy. This condition marks
the end of a packet and allows someone else to transmit. Note that if no
transceivers are active, the terminations will hold the ether low.
If the ether remains high for about two bit cells, this is an
"abort signal". Abort signals are used for two purposes. If the
transceiver detects a collision (two nodes trying to transmit at the same
time), each transmitting interface ceases to transmit and sends an abort
signal (four bit cells long), which tells all receivers to ignore the aborted
packet and ensures that the other transmitter also aborts. Thus when a
collision occurs, the ether is cleared as soon as possible to help prevent
long tie-ups under conditions of heavy load. The other use for the abort
signal is in hardware flow-control. When a receiving interface determines
that an incoming packet is addressed to it, but its receive buffer already
contains a packet, it sends an abort signal which causes the transmitter to
stop. This serves the dual purpose of immediately informing the
transmitter that its message did not get through, and preventing the ether
from being tied up while a long packet is transmitted which the receiver
cannot receive.
Packets are transmitted over the ether in reverse bit-order, for
hardware convenience. The three header words, which to the software appear
to be at the end of the packet, are transmitted first, in the order check,
source, destination. The data words, in reverse order, follow. Words are
transmitted least-significant bit first. Of course, the software need not
be aware of this reversal; packets arrive at the destination in the same
form as they were created by the source. At the end of the packet, an
extra zero bit is appended to bring the ether to the low state so that an
extra spurious clock-transition will not be generated when it goes idle.
This bit is stripped off by the interface and is never seen by software.
The check word is used for error detection, as described above.
The source word is made available to the software, which ignores it in
most cases, and also serves to synchronize the clocks in the
collision-avoidance mechanism which is described below. The
destination word is compared by each receiver against its own address.
If they match, or if the destination is zero, or if the software
selects the "match any destination" mode, the packet is placed in the
receive packet buffer and the host computer is interrupted. The zero
destination feature is used for "broadcast" messages. Note that a
receiver whose packet buffer is full will only generate an abort signal
if the packet was specifically addressed to it.
@section Ether Contention
Chaosnet has no centralized control element; when a network node
has a message to transmit, its interface seizes the ether and transmits
a packet. The time when it seizes the ether is determined only by state
inside that particular interface and by the local state of the cable at
the point where that interface's transceiver is attached.
If two interfaces should decide to seize the ether and transmit
at the same time, their transmissions will interfere and no useful
information will be transmitted. This is called a @dfn{collision}.
Collisions are the principal limitation on the bandwidth of a
heavily-loaded ether-type network, and should be avoided. (However,
neither PARC's network nor MIT's network has yet been operated with a
heavy enough load to make collisions really significant.)
Chaosnet uses a novel collision avoidance technique. First of
all, an interface will never initiate transmission unless the ether is
seen to be not busy, i.e. it has been in the low state for some time.
This ensures that collisions can only occur near the beginning of a
packet. Once transmission of a packet has gotten well started, the
ether is effectively "seized" (all interfaces realize that it is busy)
and transmission will continue successfully through to the end of the
packet. The amount of ether transmission time wasted by a collided
packet is therefore limited to the round-trip cable propagation delay.
This technique is called @dfn{carrier sense}.
Secondly, the hardware uses a time-division technique to attempt
to prevent two interfaces from initiating transmission at the same time.
This technique should prevent essentially all collisions while imposing
only a modest delay in the initiation of transmission. It is designed
so that it works better as the load on the ether increases; the wasted time
between packets and the relative rate of collisions both decrease.
The basic idea is that each interface is assigned a time-slot,
or @dfn{turn}, according to its address. It may only initiate
transmission during its turn. The turns are spaced far enough apart
that if one interface initiates transmission, every other interface
will perceive that the ether is busy by the time its own turn arrives,
and will not initiate an interfering transmission. Each interface
contains a time-slot counter which counts while the ether is not busy,
keeping track of whose turn it is. Each packet synchronizes the
counters in all of the interfaces by setting them from the source
address of that packet; at the time the packet was transmitted, it must
have been the turn of the interface that transmitted it.
Another way to think of this is to make an analogy with ring
networks. One can imagine a @dfn{virtual token} which passes down the cable
until it gets to the end, then jumps to the beginning of the cable and
repeats. An interface may only initiate transmission at the instant the
token passes by it. When an interface transmits, the token stops moving
and remains at that interface until the end of the packet, whereupon it
continues down the cable, passing every other interface, giving them
each a chance to transmit before letting the first interface transmit a second packet.
The token is not represented by any physical transmission on the
cable. That would constitute a form of centralized control, and would
lead to reliability problems if the token was lost or duplicated.
Instead, every interface contains a time-slot counter which keeps
track of where the token is thought to be. Every time a packet is
transmitted these counters are brought up-to-date. The token cannot be
lost because a counter by its nature eventually returns to all previous
states. It does not matter if the token is duplicated (i.e. the
counters lose synchronization) occasionally, since this will only cause
collisions, which we know how to detect and deal with, and since the first successful
transmission will resynchronize all counters. The basic mechanism
of the ether network with contention and collisions is known to work,
and the collision-avoidance scheme is an added-on optimization which
improves performance without changing the basic mechanism.
There is a finite propagation delay time between interfaces,
and this time is not small compared with the bit-rate of Chaosnet, nor
when compared with the desirable length of a time slot.
This time consists of the delay in the cable, about 5 nanoseconds per meter,
and the delay through the two transceivers, about 220 nanoseconds.
This propagation delay means that the time-slot counters in two different
interfaces cannot be exactly synchronized, and that when interface A
initiates transmission interface B will not instantaneously see that the
ether is busy. Special relativity tells us that in fact the concept
"exactly synchronized" is meaningless. Since the two time-slot counters are
not in the same place, the only way we can compare them is to send a
message from one to the other, through the ether, containing the reading of
the counter. But this message takes non-zero time to get there, so the
counter-reading it contains is wrong by the time it is compared against the
other counter! We in fact do send messages containing counter readings;
the source address in a packet equals the reading of the time-slot counter in the
interface that sent it--at the time it was sent. Since the network nodes are
not in relative motion, we can measure the distance between them and use
that information to improve their synchronization.
What we are trying to do is to prevent collisions. This means that
if interface A starts transmitting a packet in its turn, then by the time
interface B thinks that its own turn has arrived, it must perceive the
ether as busy. We will assign addresses (and hence time slots) and set the
length of a time slot in such a way that this will happen. Suppose the
maximum delay through the ether between A and B is @math{t}. This would be
the delay for one of them sending a packet to the other; the delay between
A's receipt of a third party's packet and B's receipt of that packet is
less, especially if the third party is between A and B on the cable. Then
the maximum perceived difference between a clock at A and a clock at B is
@math{2t}; if a message is sent from B to A synchronizing the clocks, and then
a message is sent from A to B containing A's clock reading, at B this clock
reading will be slow by @math{2t}.
When a packet transmitted by A arrives at B, B's clock may read as
much as @math{2t} earlier or later than A's turn, depending on the transmission
direction of the last synchronizing message. In order to guarantee that B's
turn has not yet happened, the time between any of A's turns and any of B's
turns must always be at least @math{2t}, twice the maximum propagation delay
through the ether between A and B. This is the important idea! We cause
this to be true by assigning addresses starting at one end of the cable;
each node's address is the previous node's address plus twice the
propagation delay between them, divided by the length of a turn. It is
easy to see that if this is done for all adjacent pairs, the condition will
automatically be true for non-adjacent pairs as well. When we get to the
end of the cable, we must assign a number of empty slots equal to twice the
propagation delay of the full length of cable, to provide
the necessary separation between the turns of the two nodes at the ends
of the cable.
The virtual token travels through the network at a substantially
slower speed than a real signal such as a packet; in the fastest case, when nodes are very
far apart, it travels at just half the speed of a real signal. Since a
Chaosnet ether has the geometry of a line, as compared to the ring net's
circle, the virtual token is also slowed down by the need to return from
the end of the cable to the beginning. This slower speed of the token is
the price one pays for the increased robustness of Chaosnet as compared
with a ring network. In a real system, we slow the token down even more to
provide a margin of safety. The speed of the network is not significantly
affected by the slow token, since the interval between packet transmissions
by a single node is much longer than the round-trip time of the token.
Indeed, if the network is being used primarily for file transfer, and hence
the packets are large, the transmission time alone for a typical packet is
several times the round-trip time of the token. A typical value for the
token's round-trip time is 64 microseconds.
In spite of all this, sometimes collisions will occur anyway. If
the cable has been idle for a long time, the various clocks will have lost
synchronization. If a source address is corrupted by a transmission error,
any interface that sees that source address will set its clock to an
incorrect value. Sometimes a packet will collide with random noise rather
than another legitimate packet. In addition, the transmitter does not
distinguish receiver-busy aborts from real collisions.
When a collision does occur, we recover from it (in software) by
retransmitting the packet again a couple of times, hoping that we will be
lucky enough not to have another collision, or that the receiver will soon
clear its packet buffer. This retransmission is done by the software, not
the hardware, since the hardware destroys the packet in its packet buffer
in the process of transmitting it. But if collisions continue to occur, we
give up and let somebody else have the ether. The packet is lost. A
higher level of protocol will soon realize that it has been lost and
retransmit it. We assume that there is enough randomness in the
higher-level software that the two nodes which originally collided will not
collide again on the retransmission by deciding to retransmit at precisely
the same instant.
@chapter Software Protocol--World View
The purpose of the basic software protocol of Chaosnet is to allow
high-speed communication among processes on different machines, with no
undetected transmission errors. The speed for file transfers in real-life
circumstances was to be comparable with an inexpensive magnetic tape drive
(30000 characters per second, or about 10 times the speed of the Arpanet).
We actually get about double this in some favorable cases. To achieve this speed it
was important to design out bottlenecks such as are found in the Arpanet,
for instance the control-link which is shared between multiple connections
and the need to acknowledge each message before the next message can be
sent. The protocol must be simple, for the sake of reliability and to
allow its use by modest computer systems. A full Chaosnet Network Control
Program is just about half the size of an Arpanet NCP on the same machine,
and the protocol allows low-performance implementations to omit some
features. A minimal implementation exists for a single-chip microcomputer.
@section Connections
The principal service provided by Chaosnet is a @dfn{connection} between
two user processes. This is a full-duplex reliable packet-transmission
channel. The network undertakes never to garble, lose, duplicate, or
resequence the packets; in the event of a serious error it may break
the connection off entirely, informing both user processes. User
programs may either deal in terms of packets, or ignore packet
boundaries and treat the connection as two uni-directional streams of
8-bit or 16-bit bytes.
On top of the connection facility "user" programs build other
facilities, such as file access, interactive terminal connections, and
data in other byte sizes, such as 36 bits. The meaning of the packets
or bytes transmitted through a connection is defined by the particular
higher-level protocol in use.
In addition to reliable communication, the protocol provides flow
control, includes a way by which prospective communicants may get in
touch with each other (called @dfn{contacting} or @dfn{rendezvous}), and
provides various network maintenance and housekeeping facilities. These are
discussed later.
@section Contact Names
When first establishing a connection, it is necessary for the two
communicating processes to contact each other. In addition, in the
usual user/server situation, the server process does not exist beforehand
and needs to be created and made to execute the appropriate program.
We chose to implement contacting in an asymmetric way. (Once the
connection has been established everything is completely symmetric.)
One process is designated the @dfn{user}, and the other is designated the
@dfn{server}. The server has some @dfn{contact name} to which it
@dfn{listens}. The user process requests its local operating system to
connect it to the server, specifying the network node and contact name
of the server. The local operating system sends a message (a @i{Request
for Connection}) to the remote operating system, which examines the
contact name and creates a connection to a listening process, creates
a new server process and connects to it, or rejects the request.
Automatically discovering to which host to connect in order to obtain a
particular service is a subject for higher-level protocols and for
further research. It is not dealt with by Chaosnet.
Once a connection has been established, there is no more need for the
contact name and it is discarded. Indeed, often the contact name
is simply the name of a service (such as "@code{TELNET}") and several
users should be able to have simultaneous connections to separate
instances of that service, so contact names must be reusable.
In the case where two existing processes that already know about each
other want to establish a connection, we arbitrarily designate one as
the listener (server) and the other as the requester (user). The
listener somehow generates a "unique" contact name, somehow
communicates it to the requester, and listens for it. The requester
requests to connect to that contact name and the connection is
established. In the most common case of establishing a second
connection between two processes which are already connected, the index
number (see below) of the first connection can serve as a unique
contact name.
Contact names are restricted to strings of upper-case letters, numbers,
and ASCII punctuation. The maximum length of a contact name is limited
only by the packet size, although on ITS hosts the names of
automatically-started servers are limited by the file-system to six characters.
See @ref{connection-opening,Connection Establishment} for complete details of how to establish
a connection.
@section Addresses and Indices
Each node (or host) on the network is identified by an @dfn{address},
which is a 16-bit number. These addresses are used in the routing of
packets. There is a table (the system hosts table, @file{SYSBIN;HOSTS2}, in the case
of ITS) which relates symbolic host names to numeric host addresses.
An address consists of two fields. The most-significant 8 bits
identify a @dfn{subnet}, and the least-significant 8 bits identify a host
within that subnet. Both fields must be non-zero. A subnet
corresponds to a single transmission path. Some subnets are physical
Chaosnet cables (@dfn{ethers}), while others are other media, for
instance an interface between a pdp10 and a pdp11. The significance
of subnets will become clear when routing is discussed (see @ref{routing,Routing}).
When a host is connected to an ether, the host's hardware address on that
ether is the same as its software address, including the subnet field.
A connection is specified by the names of its two ends. Such a name consists
of a 16-bit host address and a 16-bit connection index, which is assigned
by that host, as the name of the entity inside the host which owns the connection.
The only requirements placed by the protocol on indices
are that they be non-zero and that they be unique within a particular
host; that is, a host may not assign the same index number to two different
connections unless enough time has elapsed between the closing of the first
connection and the opening of the second connection that confusion between
the two is unlikely.
Typically the least-significant @math{n} bits of an index are used as a
subscript into the operating system's tables, and the most-significant
16-@math{n} bits are incremented each time a table slot is reused, to
provide uniqueness. The number of uniquizing bits must be sufficiently
large, compared to the rate at which connection-table slots are reused,
that if two connections have the same index, a packet from the old
connection cannot sit around in the network (e.g. in buffers inside
hosts or bridges) long enough to be seen as belonging to the new connection.
It is important to note that packets are @emph{not} sent between hosts
(physical computers). They are sent between user processes; more exactly,
between channels attached to user processes. Each channel has a 32-bit
identification, which is divided into subnet, host, index, and uniquization
fields. From the point of a view of a user process using the network, the
Network Control Program section of his host's operating system is part of
the network, and the multiplexing and demultiplexing it performs is no
different from the routing performed by other parts of the network.
It makes no difference whether two communicating processes run in the same
host or in different hosts.
Certain control packets, however, are sent between hosts rather than users.
This is visible to users when opening a connection; a contact name is only
valid with respect to a particular host. This is a compromise in the design
of Chaosnet, which was made so that an operational system could be built
without first solving the research and engineering problems associated with
making a diverse set of hosts into a uniform, one-level name space.
@section Packet Numbers
There are two kinds of packets, controlled and uncontrolled. Controlled
packets are subject to error-control and flow-control protocols, described
below (see @ref{flow-and-error-control,Flow and Error Control}), which guarantee that each controlled
packet is delivered to its destination exactly once, that the controlled
packets belonging to a single connection are delivered in the same order
they were sent, and that a slow receiver is not overwhelmed with packets
from a fast sender. Uncontrolled packets are simply transmitted; they will
usually but not always arrive at their destination exactly once. The
protocol for using them must take this into account.
Each controlled packet is identified by an unsigned 16-bit @dfn{packet number}.
Successive packets are identified by sequential numbers, with wrap-around
from all 1's to all 0's. When a connection is first opened, each end
numbers its first controlled packet (RFC or OPN) however it likes, and
that sets the numbering for all following packets.
Packet numbers should be compared modulo 65536 (2 to the 16th), to ensure correct
handling of wrap-around cases.
On a pdp11, use the instructions
@example
CMP A,B
BMI A_is_less
@end example
Do not use the @code{BLT} or @code{BLO} instruction.
On a pdp10, use the instructions
@example
SUB A,B
TRNE A,100000
JRST A_is_less
@end example
Do not use the @code{CAMGE} instruction.
On a Lisp machine, use the code
@lisp
(IF (BIT-TEST #o100000 (- A B))
<A is less>)
@end lisp
Do not use the @code{LESSP} (or @code{<}) function.
@section Packets
@anchor{packets-section}
A packet consists of a header, which is 8 16-bit words, and zero or
more 8-bit or 16-bit bytes of accompanying data. In addition there are
three words put on by the hardware, described earlier in this paper.
The following are the 8 header words:
@table @dfn
@item Operation
The most-significant 8 bits of this word are the @dfn{Opcode} of the packet,
a number which tells what the packet means. The 128 opcodes with high-order
bit 0 are for the use of the network itself. The 128 opcodes with high-order
bit 1 are for use by users. The various opcodes are described in
@ref{software-protocol-details,Software Protocol--Details}.
The least-significant 8 bits of this word are reserved for future use,
and must be zero.
@item Count
The most-significant 4 bits of this word are the forwarding count, which tells
how many times this packet has been forwarded by bridges. Its use is explained
in the Routing section.
The least-significant 12 bits of this word are the data byte count,
which tells the number of 8-bit bytes of data in the packet. The
minimum value is 0 and the maximum value is 488. Note that the count
is in 8-bit bytes even if the data are regarded as 16-bit bytes.
The byte count must be consistent with the actual length of the
hardware packet. Since the hardware cyclic redundancy check algorithm
is not sensitive to extra zero bits, packets whose hardware length
disagrees with their software length are discarded as hardware errors.
@item Destination Address
This word contains the network address of the destination host
to which this packet should be sent.
@item Destination Index
This word contains the connection index at the destination host
of the connection to which this packet belongs, or 0 if this packet
does not belong to any connection.
@item Source Address
This word contains the network address of the source host which
originated this packet.
@item Source Index
This word contains the connection index at the source host
of the connection to which this packet belongs, or 0 if this
packet does not belong to any connection.
@item Packet Number
If this is a controlled packet, this word contains its identifying number.
@item Acknowledgement
The use of this word is described in @ref{flow-and-error-control,Flow and Error Control}.
@end table
@section Data Formats
Data transmitted through Chaosnet generally follow Lisp Machine
standards. Bits and bytes are numbered from right to
left, or least-significant to most-significant. The first 8-bit byte
in a 16-bit word is the one in the arithmetically least-significant
position. The first 16-bit word in a 32-bit double-word is the one
in the arithmetically least-significant position.
The character set used is dictated by the higher-level protocol in use.
Telnet and Supdup, for example, each specifies its own ASCII-based character set.
The "default" character set--used for new protocols and for text that
appears in the basic Chaosnet protocol, such as contact names--is
the Lisp Machine character set [@ref{CHINUAL}]. This is basically ASCII,
augmented with additional printing characters and a different set of
format-effector (or "control") characters.
Because the rules for bit numbering conflict with the native byte-ordering
in pdp10s, and because it is quite expensive to rearrange the bytes using
the pdp10 instruction set, pdp11s which act as front-ends for pdp10s must
reformat packets passing through them, and pdp10s interfaced directly to
the network must have interfaces capable of rearranging the bytes. This
requires that the network protocols explicitly specify which portions of
each type of packet are 8-bit bytes and which are 16-bit bytes. In general
the header is 16-bit bytes and the data field is 8-bit bytes, but certain
packet types (OPN, STS, RUT, and opcodes 300 through 377) have 16-bit bytes
in the data field. Use of 32-bit data is rare, so no provision is made for
putting 32-bit data into the standard format for pdp10s. On our current
network pdp10s are the only hosts which require this packet reformatting
assistance, because most modern computers number their bits and bytes from
least-significant to most-significant.
The effect of this is that user programs that use the Chaosnet always see the
data in a packet and its header in the native form of the machine they are
running on, and the necessary conversions are automatically applied by the
network. This statement applies to the order of bits and bytes within a word,
but not to the character set (when packets contain textual data) which is
dictated by protocols.
Unlike some other network protocols, Chaosnet does not use any software
checksumming. Because of the diversity of hosts with different
architectures attached to the Chaosnet, it is impossible to devise a
checksumming algorithm which can be executed compatibly and efficiently on
all hosts. Instead, Chaosnet relies on error-checking hardware in the
network interfaces, and assumes that other sources of packet damage that
checksums could detect, such as software bugs in a Network Control Program,
either do not occur or will produce symptoms so obvious that they will be
detected and fixed immediately.
@section Routing
@anchor{routing}
@dfn{Routing} consists of deciding how to deliver a packet to the network
node specified by the destination address field of the packet.
Having reached that node, the packet can trivially be delivered to the
destination user process via the destination index.
In general routing may be a multi-step process involving transmission
through several subnets, since there may not be a direct hardware
connection between the source and the destination. Note that the
routing decision is made separately for each packet, with no reference
to the concept of connections.
Any host that is connected to more than one subnet acts as a @dfn{bridge}
and @dfn{forwards} packets from one subnet to another when necessary.
There could also be hardware bridges which are not hosts, although we have
not yet designed any such device. Since routing does not depend on connections,
a bridge is a very simple device (or program) which does not need much state.
This makes the bridge function inexpensive to piggyback onto a computer which
is also performing other functions, and makes reliable bridge software easy
to implement.
The difference between a @dfn{bridge} and a @dfn{gateway}, in our terminology,
is that a bridge forwards packets from one sub-Chaosnet to another, without
modifying the packets or understanding them other than to look at the
destination address and increment the forwarding count, and does not deal
with connections nor with flow control, while a gateway interconnects two
networks with differing protocols and must understand and translate the
information passing through it. Gateways may also have to deal with flow
and error control because they connect networks with slow or differing
speeds. Bridges are suitable for local networks while gateways are
suitable for long-distance networks and for connecting networks not
produced by the same organization.
To prevent routing loops, each packet contains a forwarding-count field.
Each bridge that forwards the packet increments this count; if the count reaches
its maximum value the packet is discarded. The error-control protocol will
recover discarded packets, or decide that no viable connection can be established
between the two hosts.
The implementation of routing in an operating system is as follows, given a
packet to be routed, which may have come in from the network or may have
been originated by the local host. First, check the packet's destination
address. If it is this host, receive the packet. Otherwise, increment the
forwarding count and discard the packet if it has been forwarded too many
times. If the destination is some other host on a subnet to which this
host is directly connected, transmit the packet on that subnet; the
destination host should receive it. If the destination is a host on a
subnet of which this host has no knowledge, look up the subnet in the
host's @dfn{routing table} to find the best bridge to that subnet, and
transmit the packet to that bridge.
Each host has a routing table, indexed by subnet number, which tells
how to get packets to hosts on that subnet.
Each entry contains: (exact details may vary depending on implementation)
@table @dfn
@item Type
The type of connection between the host and this subnet. This can
be one of @dfn{Direct}, @dfn{Bridge}, or @dfn{Fixed Bridge}. @i{Direct} means a physical
connection such as a Chaosnet interface. @i{Bridge} means
an indirect connection, via a packet-forwarding bridge. Which bridge
is best to use is to be discovered by this routing mechanism.
@i{Fixed Bridge} is the same except that the automatic mechanism
is not to change which bridge is used. This is useful to set up
explicit routing for purposes such as network debugging.
@item Address
Identifies the connection to this subnet in a way which depends on the type.
For a direct connection, this identifies the piece of hardware which
implements the connection. (It might be a Unibus address.)
For a bridge or a fixed bridge, this is the network address of the bridge.
@item Cost
A measure of the cost of sending a packet through this route.
Costs are used to select the best route from among alternatives in a way
described below. For a direct connection, the cost is 10 for a direct
interface between two computers (e.g. between a pdp10 and its front-end pdp11),
11 for a Chaosnet ether cable, 20 for a slow medium such as an asynchronous line, etc.
For a bridge or a fixed bridge, the cost is specified by the bridge in a RUT packet
(described below).
@end table
The routing table is initialized with the number of a more or
less arbitrary existent host and a high cost, for each subnet
to which the host is not directly connected. Until the correct
bridge is discovered (which normally happens within a minute
of coming up), packets for that subnet will be bounced off of
that arbitrary host, which probably knows the right bridge to forward them to.
The cost for subnets accessed via bridges is increased by 1 every 4
seconds, thus typically doubling after a minute. When the cost
reaches a "high" value, it sticks there, preventing problems with
arithmetic overflow. The purpose of the increasing cost is to discount
the value of old information. The cost for subnets accessed via direct
connections and fixed bridges does not increase.
Every 15 seconds, a bridge advertises its presence by broadcasting a
routing (RUT) packet on each subnet to which it is directly connected.
Each host on that subnet receives the RUT packet and uses it to update
its routing table. If the host's routing table says to access a certain
subnet via bridges, and the RUT packet says that this is the best bridge
to that subnet, the routing table is updated to say that this bridge should
be used.
Note that it is important that the rate at which the costs increase with
time be slow enough that it takes more than twice the broadcast interval to
increase the cost of one hop to be more than the cost of two hops. Otherwise
the routing algorithm is not well-behaved. Suppose subnet A has two
bridges (@xalpha{} and @xbeta{}) on it, and bridge @xalpha{} is connected to subnet
B but bridge @xbeta{} is not (it goes to some other irrelevant subnet). Then
if the costs increase too fast and bridges @xalpha{} and @xbeta{} do not
broadcast their RUT packets exactly simultaneously, sometimes packets for
subnet B may be sent to bridge @xbeta{} because its cost appears lower.
Bridge @xbeta{} will then send them to bridge @xalpha{}, where they should have
gone directly. In more complicated situations packets can go around in a
circle some of the time.
The source address of a RUT packet must be the hardware address of the
bridge on the particular subnet on which the packet is broadcast. The
destination address of a RUT packet must be zero; RUT packets are
not forwarded onto other subnets. The byte count of a RUT packet is a
multiple of 4 and the packet contains up to 122 pairs of 16-bit words:
@table @t
@item word 1
The subnet number of a subnet which this bridge can get to,
directly or indirectly, right-adjusted.
@item word 2
The cost of sending to that subnet via this bridge. This is the
current cost from the bridge's routing table, plus the cost
for the subnet on which the routing packet is being broadcast.
Adding the subnet cost eliminates loops, and prefers one-hop
paths over two-hop paths.
@end table
When a host receives a RUT packet, it processes each 2-word entry by
comparing the cost for that subnet against its current cost; if it is less or equal
the cost and the address of the bridge are entered into the routing table,
provided that that subnet's routing table entry is not of the Direct
or Fixed Bridge type.
When there are multiple equivalent bridges, the traffic is spread among them only by
virtue of their RUT packets being sent at different times, so that sometimes one
bridge has the lower cost, and sometimes the other. If this isn't adequate,
hosts could have hairier routing tables which remember more than one possible
route and use them according to their relative costs, but so far
this has not been necessary since the network traffic is not so high as
to saturate any one bridge.
The design of this routing scheme is predicated on the assumption that
the network geometry is simple, there are few multiple paths,
and the length of any path is quite short. This makes more sophisticated
schemes unnecessary.
An important feature of this routing scheme is that the size of the
table is proportional to the number of subnets, not to the number of hosts.
Thus it does not take up an inordinate amount of memory in a small computer,
and no complicated dynamic allocation schemes are required.
In the case of a pdp10 which accesses the Chaosnet through a front-end pdp11, we define
the interface between the two computers to be a subnet, and regard the pdp11
as a bridge which forwards packets between the network and the pdp10.
This gives the pdp10 and the pdp11 separate addresses so that we can
choose to talk to either one, even though they are part of the same
computer system. This is occasionally useful for maintenance purposes.
It becomes more useful when the front-end pdp11 has peripherals which
are to be accessed through the Chaosnet, since they can simply look like
hosts on that pdp11's private subnet.
In the case of a host which is attached to more than one subnet,
it is undesirable for the host to have more than one address, since this