Advanced Systems Programming - Lesson 22: 82573L
82573L
Initializing our Pro/1000
Chicken-and-Egg?
• We want to create a Linux Kernel Module
that can serve application-programs as a
character-mode device-driver for our NIC
• So, as with the UART device, we will need
to implement ‘read()’ and ‘write()’ methods
• But which method should we do first?
• No way to “test” a ‘read()’ method without
having a way to send packets to our NIC
How ‘transmit’ works
[Figure: in Random Access Memory, a List of Buffer-Descriptors (descriptor0 .. descriptor3), each entry referring to its own packet buffer (Buffer0 .. Buffer3)]
• We set up each data-packet that we want
to be transmitted in a ‘Buffer’ area in RAM
• We also create a list of buffer-descriptors
and inform the NIC of its location and size
• Then, when ready, we tell the NIC to ‘Go!’
(i.e., start transmitting), and to let us know
when these transmissions are ‘Done’
Registers’ Names
Memory-information registers
TDBA(L/H) = Transmit-Descriptor Base-Address Low/High (64-bits)
TDLEN = Transmit-Descriptor array Length
TDH = Transmit-Descriptor Head
TDT = Transmit-Descriptor Tail
Transmit-engine control registers
TXDCTL = Transmit-Descriptor Control Register
TCTL = Transmit Control Register
Notification timing registers
TIDV = Transmit Interrupt Delay Value
TADV = Transmit-interrupt Absolute Delay Value
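The register names above can be given symbolic offsets for use in driver code. The sketch below uses the offsets published in Intel's documentation and in the Linux e1000 driver header (TXDCTL and TCTL also appear with these offsets later in this lesson). The `E1000_` prefix follows the names this lesson uses for `E1000_STATUS` and `E1000_CTRL`; the helper `tx_reg_name()` is hypothetical and is here only for illustration.

```c
#include <stddef.h>

/* Byte offsets from the start of the NIC's memory-mapped register space. */
enum {
    E1000_TCTL   = 0x0400,  /* Transmit Control                        */
    E1000_TDBAL  = 0x3800,  /* Transmit-Descriptor Base-Address Low    */
    E1000_TDBAH  = 0x3804,  /* Transmit-Descriptor Base-Address High   */
    E1000_TDLEN  = 0x3808,  /* Transmit-Descriptor array Length        */
    E1000_TDH    = 0x3810,  /* Transmit-Descriptor Head                */
    E1000_TDT    = 0x3818,  /* Transmit-Descriptor Tail                */
    E1000_TIDV   = 0x3820,  /* Transmit Interrupt Delay Value          */
    E1000_TXDCTL = 0x3828,  /* Transmit-Descriptor Control             */
    E1000_TADV   = 0x382C   /* Transmit-interrupt Absolute Delay Value */
};

/* Hypothetical helper: map an offset back to its name (NULL if unknown). */
const char *tx_reg_name(unsigned offset)
{
    switch (offset) {
    case E1000_TCTL:   return "TCTL";
    case E1000_TDBAL:  return "TDBAL";
    case E1000_TDBAH:  return "TDBAH";
    case E1000_TDLEN:  return "TDLEN";
    case E1000_TDH:    return "TDH";
    case E1000_TDT:    return "TDT";
    case E1000_TIDV:   return "TIDV";
    case E1000_TXDCTL: return "TXDCTL";
    case E1000_TADV:   return "TADV";
    default:           return NULL;
    }
}
```

A lookup like this can make ‘printk()’ output easier to read when a module dumps several registers.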
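Tx-Desc Ring-Buffer
Circular buffer (128-bytes minimum)
[Figure: a ring of 16-byte descriptor slots at offsets 0x00, 0x10, 0x20, ... 0x80 from the TDBA base-address; TDLEN gives the array's length (in bytes); TDH (head) and TDT (tail) mark the boundary between the descriptors owned by hardware (nic) and those owned by software (cpu)]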
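The head/tail bookkeeping for such a ring can be sketched in a few lines of C. This assumes the usual convention for this NIC family: descriptors from TDH up to (but not including) TDT are owned by the hardware, and software hands over a new descriptor by advancing TDT. The function names are hypothetical.

```c
#include <stdint.h>

/* Index that follows 'i' in a ring of 'n' descriptors (wraps to 0). */
static inline uint32_t ring_next(uint32_t i, uint32_t n)
{
    return (i + 1) % n;
}

/* Number of descriptors currently owned by hardware:
 * those from 'head' up to, but not including, 'tail'. */
static inline uint32_t ring_hw_owned(uint32_t head, uint32_t tail, uint32_t n)
{
    return (tail + n - head) % n;
}

/* The ring is treated as full when advancing the tail would make it
 * equal to the head, because head == tail is reserved to mean "empty". */
static inline int ring_full(uint32_t head, uint32_t tail, uint32_t n)
{
    return ring_next(tail, n) == head;
}
```

With the minimum ring of 8 descriptors (128 bytes), a head of 6 and a tail of 1 means that slots 6, 7 and 0 are still waiting to be transmitted.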
Tx-Descriptor Control (0x3828)
bits 31..25   0 0 0 0 0 0 0
bit  24       GRAN
bits 23..22   0 0
bits 21..16   WTHRESH (Writeback Threshold)
bits 15..14   0 0
bits 13..8    HTHRESH (Host Threshold)
bits  7..6    0 0
bits  5..0    PTHRESH (Prefetch Threshold)
Recommended for 82573: 0x01010000 (GRAN=1, WTHRESH=1)
“This register controls the fetching and write back of transmit descriptors.
The three threshold values are used to determine when descriptors are
read from, and written to, host memory. Their values can be in units of
cache lines or of descriptors (each descriptor is 16 bytes), based on the
value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1,
all descriptors are written back (even if not requested).” --Intel manual
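As a check on the recommended value, the register can be composed from its fields. This is a minimal sketch that assumes the field positions given in Intel's documentation (PTHRESH in bits 5..0, HTHRESH in bits 13..8, WTHRESH in bits 21..16, GRAN in bit 24); the function name `txdctl_value()` is hypothetical.

```c
#include <stdint.h>

/* Compose a TXDCTL value from its three 6-bit thresholds and the GRAN bit. */
static inline uint32_t txdctl_value(unsigned pthresh, unsigned hthresh,
                                    unsigned wthresh, int gran)
{
    return  ((uint32_t)(pthresh & 0x3F) <<  0)   /* Prefetch Threshold  */
          | ((uint32_t)(hthresh & 0x3F) <<  8)   /* Host Threshold      */
          | ((uint32_t)(wthresh & 0x3F) << 16)   /* Writeback Threshold */
          | ((uint32_t)(gran ? 1 : 0)   << 24);  /* 1 = descriptors     */
}
```

Calling `txdctl_value( 0, 0, 1, 1 )` yields 0x01010000, the value recommended for the 82573.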
Transmit Control (0x0400)
bits 31..29   R=0
bit  28       MULR
bits 27..26   TXCSCMT
bit  25       UNORTX
bit  24       RTLC
bit  23       R=0
bit  22       SWXOFF
bits 21..12   COLD (COLLISION DISTANCE): upper 6-bits in 21..16, lower 4-bits in 15..12
bits 11..4    CT (COLLISION THRESHOLD)
bit   3       PSP
bit   2       0
bit   1       EN
bit   0       R=0
EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support
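The suggested CT and COLD settings can be combined with the EN and PSP bits to form a TCTL value. This is a sketch only, assuming the bit positions from Intel's documentation (EN in bit 1, PSP in bit 3, CT in bits 11..4, COLD in bits 21..12); the macro and function names are hypothetical, and whether a given driver wants PSP set is a design choice.

```c
#include <stdint.h>

#define TCTL_EN        (1u << 1)                          /* Transmit Enable   */
#define TCTL_PSP       (1u << 3)                          /* Pad Short Packets */
#define TCTL_CT(x)     (((uint32_t)(x) & 0xFFu)  << 4)    /* Collision Thresh. */
#define TCTL_COLD(x)   (((uint32_t)(x) & 0x3FFu) << 12)   /* Collision Dist.   */

/* One possible TCTL setting: transmitter enabled, short packets padded,
 * CT = 0xF and COLD = 0x3F as noted in the legend above. */
static inline uint32_t tctl_default(void)
{
    return TCTL_EN | TCTL_PSP | TCTL_CT(0x0F) | TCTL_COLD(0x3F);
}
```

The result is 0x0003F0FA: 0x3F000 for COLD, 0xF0 for CT, 0x8 for PSP and 0x2 for EN.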
Tx Configuration Word (0x0178)
bit  31       ANE
bit  30       TxConfig
bits 29..16   Reserved (=0)
bits 15..0    TxConfigWord
ANE = Auto-Negotiation Enable
TxConfig = Transmit Configuration Control bit
TxConfigWord = Transmit Configuration Word
This register has two meanings, depending on the state of the ANE bit
(i.e., setting ANE=1 enables the hardware auto-negotiation machine).
Applicable only in SerDes mode; program as 0 for internal-PHY mode.
Legacy Tx-Descriptor Layout
offset   bits 31 .. 0
0x0      Buffer-Address low (bits 31..0)
0x4      Buffer-Address high (bits 63..32)
0x8      CMD (31..24) | CSO (23..16) | Packet Length (in bytes) (15..0)
0xC      special (31..16) | CSS (15..8) | reserved =0 (7..4) | status (3..0)
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field
Suggested C syntax
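typedef struct {
        unsigned long long  base_addr;   // packet-buffer's 64-bit physical address
        unsigned short      pkt_length;  // Packet Length (in bytes)
        unsigned char       cksum_off;   // CSO = Checksum Offset
        unsigned char       desc_cmd;    // CMD = Command-field
        unsigned char       desc_stat;   // STA = Status-field
        unsigned char       cksum_org;   // CSS = Checksum Start
        unsigned short      special;
        } tx_descriptor;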
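Because the NIC reads descriptors directly from memory, the C structure must occupy exactly 16 bytes with each field at the offset the hardware expects. The sketch below restates the structure with fixed-width types and lets the compiler verify the layout; the type name `tx_desc_t` is hypothetical, and `_Static_assert` assumes a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as the suggested structure, using fixed-width types. */
typedef struct {
    uint64_t base_addr;   /* bytes 0..7   : buffer address        */
    uint16_t pkt_length;  /* bytes 8..9   : packet length         */
    uint8_t  cksum_off;   /* byte  10     : CSO                   */
    uint8_t  desc_cmd;    /* byte  11     : CMD                   */
    uint8_t  desc_stat;   /* byte  12     : status (+ reserved)   */
    uint8_t  cksum_org;   /* byte  13     : CSS                   */
    uint16_t special;     /* bytes 14..15 : special               */
} tx_desc_t;

/* Compilation fails if the layout ever drifts from the hardware format. */
_Static_assert(sizeof(tx_desc_t) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(tx_desc_t, pkt_length) == 8,  "length at 0x8");
_Static_assert(offsetof(tx_desc_t, desc_cmd)   == 11, "CMD in bits 31..24 of 0x8");
_Static_assert(offsetof(tx_desc_t, desc_stat)  == 12, "status at 0xC");
_Static_assert(offsetof(tx_desc_t, special)    == 14, "special in bits 31..16 of 0xC");
```

The byte offsets correspond to a little-endian view of the four 32-bit words, which is how an x86 host stores them.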
TxDesc Command-field
bit:     7     6     5      4             3    2    1      0
field:   IDE   VLE   DEXT   reserved =0   RS   IC   IFCS   EOP
EOP = End Of Packet (1=yes, 0=no)
IFCS = Insert Frame CheckSum (1=yes, 0=no) – provided EOP is set
IC = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
RS = Report Status (1=yes, 0=no)
DEXT = Descriptor Extension (1=yes, 0=no) use ‘0’ for Legacy-Mode
VLE = VLAN-Packet Enable (1=yes, 0=no) – provided EOP is set
IDE = Interrupt-Delay Enable (1=yes, 0=no)
TxDesc Status field
bit:     3             2    1    0
field:   reserved =0   LC   EC   DD
DD = Descriptor Done
this bit is written back after the NIC processes the descriptor
provided the descriptor’s RS-bit was set (i.e., Report Status)
EC = Excess Collisions
indicates that the packet has experienced more than the
maximum number of collisions (as defined by the
TCTL.CT field) and therefore was not transmitted.
(This bit is meaningful only in HALF-DUPLEX mode.)
LC = Late Collision
indicates that a Late Collision has occurred while operating in
HALF-DUPLEX mode. Note that the collision window size
depends on the SPEED: 64-bytes for 10/100-Mbps, or
512-bytes for 1000-Mbps.
Bit-mask definitions
enum {
        // bits in the TxDesc Status-field
        DD   = (1<<0),  // Descriptor Done
        EC   = (1<<1),  // Excess Collisions
        LC   = (1<<2),  // Late Collision
        // bits in the TxDesc Command-field
        EOP  = (1<<0),  // End Of Packet
        IFCS = (1<<1),  // Insert Frame CheckSum
        IC   = (1<<2),  // Insert CheckSum as per CSO/CSS
        RS   = (1<<3),  // Report Status
        DEXT = (1<<5),  // Descriptor Extension
        VLE  = (1<<6),  // VLAN packet
        IDE  = (1<<7)   // Interrupt-Delay Enable
        };
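The bit-masks come together when a descriptor is filled in for a single-buffer packet. The following is a sketch under stated assumptions, not the lesson's driver code: the names `tx_desc_t`, `tx_desc_prepare()` and `tx_desc_done()` are hypothetical, and the choice of EOP, IFCS and RS is one reasonable combination for a legacy-mode descriptor whose completion we want reported.

```c
#include <stdint.h>

typedef struct {
    uint64_t base_addr;
    uint16_t pkt_length;
    uint8_t  cksum_off;
    uint8_t  desc_cmd;
    uint8_t  desc_stat;
    uint8_t  cksum_org;
    uint16_t special;
} tx_desc_t;

enum {
    TXD_STAT_DD  = (1 << 0),  /* Descriptor Done       */
    TXD_CMD_EOP  = (1 << 0),  /* End Of Packet         */
    TXD_CMD_IFCS = (1 << 1),  /* Insert Frame CheckSum */
    TXD_CMD_RS   = (1 << 3)   /* Report Status         */
};

/* Describe one complete packet held in a single buffer. */
static inline void tx_desc_prepare(tx_desc_t *d, uint64_t phys_addr,
                                   uint16_t len)
{
    d->base_addr  = phys_addr;
    d->pkt_length = len;
    d->cksum_off  = 0;
    d->desc_cmd   = TXD_CMD_EOP | TXD_CMD_IFCS | TXD_CMD_RS;
    d->desc_stat  = 0;        /* the NIC sets DD when it has finished */
    d->cksum_org  = 0;
    d->special    = 0;
}

/* True once the NIC has written back the DD bit for this descriptor. */
static inline int tx_desc_done(const volatile tx_desc_t *d)
{
    return (d->desc_stat & TXD_STAT_DD) != 0;
}
```

The descriptor is read through a `volatile` pointer in `tx_desc_done()` because the status byte is changed by the NIC, not by the program.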
Allocating kernel-memory
• Our 82573L device-driver will need to use
a segment of contiguous physical memory
which is cache-aligned and non-pageable
• As explained in our LDD3 textbook, such a
memory-block can be allocated using the
Linux kernel’s ‘kmalloc()’ function (and it
can later be deallocated using ‘kfree()’)
• The largest allocation you can portably count on is 128-KB
• You should use the ‘GFP_KERNEL’ flag
Network MTU
• Unless the ‘Large-Send’ functionality has
been enabled, there will be a maximum
length for your network ‘datagrams’ equal
to 1536 bytes (=0x0600)
• So if you reused the same Packet-Buffer
for successive transmissions, you could fit
your packet-buffer and a moderate-sized
Descriptor-Buffer into one 4KB-pageframe
Single page-frame option
[Figure: one 4KB Page-Frame divided into two areas]
Packet-Buffer (3-KB)
(reused for successive transmissions)
Descriptor-Buffer (1-KB)
(room for up to 64 descriptors of 16 bytes each)
Another design-option
[Figure: one 4KB Page-Frame divided into two areas]
16 Packet-Buffers (3840-bytes)
(240-bytes per buffer)
Descriptor-Buffer (256-bytes)
(room for 16 descriptors of 16 bytes each)
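The arithmetic behind such page-frame layouts is easy to check in C, given that each legacy descriptor occupies 16 bytes. The helper names below are hypothetical; this is only a sketch for testing a proposed split before committing to it in a driver.

```c
#include <stddef.h>

enum { PAGE_FRAME_SIZE = 4096, TX_DESC_SIZE = 16 };

/* How many 16-byte descriptors fit in an area of 'area_bytes' bytes. */
static inline size_t desc_capacity(size_t area_bytes)
{
    return area_bytes / TX_DESC_SIZE;
}

/* Does a layout of 'npkt' packet-buffers of 'pkt_bytes' each, plus
 * 'ndesc' descriptors, fit within one 4KB page-frame? */
static inline int layout_fits(size_t npkt, size_t pkt_bytes, size_t ndesc)
{
    return npkt * pkt_bytes + ndesc * TX_DESC_SIZE <= (size_t)PAGE_FRAME_SIZE;
}
```

For example, `desc_capacity( 1024 )` is 64, and `layout_fits( 16, 240, 16 )` is true because 16 x 240 + 16 x 16 is exactly 4096.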
Initialization
• Your device-driver needs to initialize your
82573L hardware to a known state, and
configure its options for your desired mode
of operation
• The Device Control register has bits which
let you initiate a ‘device reset’ operation
• The Device Status register has bits which
inform you when a ‘reset’ has completed
Device Status (0x0008)
bit  31       ? (some undocumented functionality?)
bits 30..20   0
bit  19       GIO Master EN
bits 18..11   0
bit  10       PHYreset
bits  9..8    ASDV
bits  7..6    SPEED
bit   5       0
bit   4       TXOFF
bits  3..2    Function ID
bit   1       LU
bit   0       FD
FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
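A value read from the Device Status register can be decoded with a few masks. The sketch below assumes the bit positions given in Intel's documentation (FD in bit 0, LU in bit 1, TXOFF in bit 4, SPEED in bits 7..6); the macro and function names are hypothetical.

```c
#include <stdint.h>

#define STATUS_FD           (1u << 0)   /* Full-Duplex         */
#define STATUS_LU           (1u << 1)   /* Link Up             */
#define STATUS_TXOFF        (1u << 4)   /* Transmission Paused */
#define STATUS_SPEED_SHIFT  6
#define STATUS_SPEED_MASK   (3u << STATUS_SPEED_SHIFT)

static inline int status_link_up(uint32_t status)
{
    return (status & STATUS_LU) != 0;
}

static inline int status_full_duplex(uint32_t status)
{
    return (status & STATUS_FD) != 0;
}

/* Link speed in Mbps, or 0 for the encoding this lesson lists as reserved. */
static inline int status_speed_mbps(uint32_t status)
{
    switch ((status & STATUS_SPEED_MASK) >> STATUS_SPEED_SHIFT) {
    case 0:  return 10;
    case 1:  return 100;
    case 2:  return 1000;
    default: return 0;
    }
}
```

A status value whose low byte is 0x83, for instance, decodes as full-duplex, link up, 1000 Mbps.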
Device Control (0x0000)
bit  31       PHYRST
bit  30       VME
bit  29       R=0
bit  28       TFCE
bit  27       RFCE
bit  26       RST
bits 25..21   R=0
bit  20       ADVD3WUC
bit  19       R=0
bit  18       D/UD status
bits 17..13   R=0
bit  12       FRCDPLX
bit  11       FRCSPD
bit  10       R=0
bits  9..8    SPEED
bit   7       R=0
bit   6       SLU
bits  5..4    R=0
bit   3       R=1
bit   2       GIOMD
bit   1       R=0
bit   0       FD
FD = Full-Duplex
GIOMD = GIO Master Disable
SLU = Set Link Up
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
FRCSPD = Force Speed
FRCDPLX = Force Duplex
D/UD = Dock/Undock status
ADVD3WUC = Advertise Cold Wake Up Capability
RST = Device Reset
RFCE = Rx Flow-Control Enable
TFCE = Tx Flow-Control Enable
VME = VLAN Mode Enable
PHYRST = Phy Reset
Extended Control (0x0018)
bits 31..30   R=0
bit  29       ITCE
bit  28       R=0
bit  27       IAME
bit  26       R=0
bit  25       DFPAREN
bit  24       PBPAREN
bit  23       TxLS
bit  22       TxLSFlow
bit  21       R=0
bit  20       PhyPwrDownEn
bit  19       DMADynGE
bit  18       R=0
bit  17       RODIS
bit  16       R=0
bit  15       SPDBYPS
bit  14       R=0
bit  13       EERST
bit  12       ASDCHK
bits 11..0    R=0
(the slide also marks one of the upper bit-positions with ‘?’)
ASDCHK = AutoSpeed Detection Check
EERST = EEPROM Reset
SPDBYPS = Speed-selection Bypass
RODIS = Relaxed-Ordering Disable
DMADynGE = DMA Dynamic-Gating Enable
PhyPwrDownEn = Phy PowerDown Enable
TxLSFlow = Tx Large-Send Flow
TxLS = Tx Large-Send functionality
PBPAREN = Packet-Buffer Parity-Error Detect
DFPAREN = Descriptor-FIFO Parity-Error Detect
IAME = Interrupt-Acknowledge Auto-Mask Enable
ITCE = Interrupt Timers Cleared Enable
Example
// clear STATUS bit #31
iowrite32( 0x00000000, io + E1000_STATUS );
// initiate Device-Reset and Phy-Reset
iowrite32( 0x84000000, io + E1000_CTRL );
// wait until STATUS bit #31 is set
while ( ( ioread32( io + E1000_STATUS ) & (1U<<31) ) == 0 );
// program Link Up with desired operating-mode settings
iowrite32( 0x00040241, io + E1000_CTRL );
// wait until LU-bit (bit #1) in STATUS is set
while ( ( ioread32( io + E1000_STATUS ) & (1U<<1) ) == 0 );
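The two constants written to the Device Control register in this example can be rebuilt from named bits, which makes their meaning easier to see. The sketch assumes the bit positions from Intel's documentation; the macro and function names are hypothetical. Bit 18 is the position this lesson labels ‘D/UD status’; why the example sets it is not explained here.

```c
#include <stdint.h>

#define CTRL_FD          (1u << 0)    /* Full-Duplex               */
#define CTRL_SLU         (1u << 6)    /* Set Link Up               */
#define CTRL_SPEED_1000  (2u << 8)    /* SPEED field = 10 (binary) */
#define CTRL_DUD         (1u << 18)   /* Dock/Undock status        */
#define CTRL_RST         (1u << 26)   /* Device Reset              */
#define CTRL_PHYRST      (1u << 31)   /* Phy Reset                 */

/* Value that initiates both a Device-Reset and a Phy-Reset. */
static inline uint32_t ctrl_reset_value(void)
{
    return CTRL_PHYRST | CTRL_RST;
}

/* Value that requests link-up at 1000 Mbps, full-duplex. */
static inline uint32_t ctrl_linkup_value(void)
{
    return CTRL_DUD | CTRL_SPEED_1000 | CTRL_SLU | CTRL_FD;
}
```

These functions return 0x84000000 and 0x00040241 respectively, matching the literal values in the example.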
Interrupt Cause Read (0x00C0)
bit  31       INT-Assert
bits 30..18   R=0
bit  17       ACK
bit  16       SRPD
bit  15       TXDLOW
bits 14..10   R=0
bit   9       MDAC
bit   8       R=0
bit   7       RXT0
bit   6       RXO
bit   5       R=0
bit   4       RXDMT0
bit   3       R=0
bit   2       LSC
bit   1       TXQE
bit   0       TXDW
TXDW = Transmit Descriptor Written back
TXQE = Transmit Queue Empty
LSC = Link Status Changed
RXDMT0 = Receive Descriptor Minimum Threshold Reached
RXO = Receiver Overrun
RXT0 = Receiver Timer Interrupt
MDAC = MDI/O Access Completed
TXDLOW = Transmit Descriptor Low Threshold Reached
SRPD = Small Receive Packet Detected
ACK = Receive ACK-frame detected
INT-Assert = Interrupt Assertion is still pending
Mechanism for NIC-event notifications
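When a module prints the value of the ICR register, it helps to turn the set bits into names. The following sketch assumes the bit positions from Intel's documentation; the table `icr_bits` and the function `icr_describe()` are hypothetical, and the same logic could be used with ‘printk()’ inside a module.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct icr_bit { uint32_t mask; const char *name; };

static const struct icr_bit icr_bits[] = {
    { 1u <<  0, "TXDW"   }, { 1u <<  1, "TXQE"   }, { 1u <<  2, "LSC"  },
    { 1u <<  4, "RXDMT0" }, { 1u <<  6, "RXO"    }, { 1u <<  7, "RXT0" },
    { 1u <<  9, "MDAC"   }, { 1u << 15, "TXDLOW" }, { 1u << 16, "SRPD" },
    { 1u << 17, "ACK"    }, { 1u << 31, "INT-Assert" }
};

/* Write the names of all bits set in 'icr' into 'out', separated by
 * spaces; returns the number of characters written (excluding '\0'). */
size_t icr_describe(uint32_t icr, char *out, size_t outlen)
{
    size_t used = 0;
    size_t i;

    if (outlen == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < sizeof(icr_bits) / sizeof(icr_bits[0]); i++) {
        int n;
        if (!(icr & icr_bits[i].mask))
            continue;
        n = snprintf(out + used, outlen - used, "%s%s",
                     used ? " " : "", icr_bits[i].name);
        if (n < 0 || (size_t)n >= outlen - used)
            break;              /* output buffer too small: stop here */
        used += (size_t)n;
    }
    return used;
}
```

For an ICR value of 0x80000005 the function produces the text "TXDW LSC INT-Assert".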
In-Class Exercise #1
• Try compiling and installing our ‘tryreset.c’
demo-module, and examine the messages
put in the kernel’s log-file (use ‘dmesg’)
• Then modify the module-code so that it
also outputs the value in the ICR register
(Interrupt Cause Read) during each pass
through the two ‘busy-waiting’ loops
• #define E1000_ICR 0x00C0
In-Class Exercise #2
• Apply the same techniques we employed in
our earlier ‘announce.c’ demo-module so
that the ‘printk()’ statements in ‘tryreset.c’
get replaced by statements that will show
the messages onscreen, or in the current
desktop window, rather than writing them
to the kernel’s (out-of-view) log-file