Some cards seem to have the receive VLAN tag stripping feature enabled
by default, which causes received VLAN packets to be misinterpreted as
being received by the trunk device.
Fix by disabling VLAN tag stripping in the C+ Command Register.
Debugged-by: Xinming Lai <yiyihu@gmail.com>
Tested-by: Xinming Lai <yiyihu@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RTL8211B seems to have a bug that prevents the link from coming up
unless the MII_MMD_DATA register is cleared.
The Linux kernel driver applies this workaround (in rtl8211b_resume())
only to the specific RTL8211B PHY model, along with a matching
workaround to set bit 9 of MII_MMD_DATA when suspending the PHY.
Since we have no need to ever suspend the PHY, and since writing a
zero ought to be harmless, we just clear the register unconditionally.
Debugged-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Tested-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Include a potential DMA mapping within the definition of an I/O
buffer, and move all I/O buffer DMA mapping functions from dma.h to
iobuf.h. This avoids the need for drivers to maintain a separate list
of DMA mappings for each I/O buffer that they may handle.
Network device drivers typically do not keep track of transmit I/O
buffers, since the network device core already maintains a transmit
queue. Drivers will typically call netdev_tx_complete_next() to
complete a transmission without first obtaining the relevant I/O
buffer pointer (and will rely on the network device core automatically
cancelling any pending transmissions when the device is closed).
To allow this driver design approach to be retained, update the
netdev_tx_complete() family of functions to automatically perform the
DMA unmapping operation if required. For symmetry, also update the
netdev_rx() family of functions to behave the same way.
As a further convenience for drivers, allow the network device core to
automatically perform DMA mapping on the transmit datapath before
calling the driver's transmit() method. This avoids the need to
introduce a mapping error handling code path into the typically
error-free transmit methods.
With these changes, the modifications required to update a typical
network device driver to use the new DMA API are fairly minimal:
- Allocate and free descriptor rings and similar coherent structures
using dma_alloc()/dma_free() rather than malloc_phys()/free_phys()
- Allocate and free receive buffers using alloc_rx_iob()/free_rx_iob()
rather than alloc_iob()/free_iob()
- Calculate DMA addresses using dma() or iob_dma() rather than
virt_to_bus()
- Set a 64-bit DMA mask if needed using dma_set_mask_64bit() and
thereafter eliminate checks on DMA address ranges
- Either record the DMA device in netdev->dma, or call iob_map_tx() as
part of the transmit() method
- Ensure that debug messages use virt_to_phys() when displaying
"hardware" addresses
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Redefine the value stored within a DMA mapping to be the offset
between physical addresses and DMA addresses within the mapped region.
Provide a dma() wrapper function to calculate the DMA address for any
pointer within a mapped region, thereby simplifying the use cases when
a device needs to be given addresses other than the region start
address.
On a platform using the "flat" DMA implementation the DMA offset for
any mapped region is always zero, with the result that dma_map() can
be optimised away completely and dma() reduces to a straightforward
call to virt_to_phys().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Physical addresses in debug messages are more meaningful from an
end-user perspective than potentially IOMMU-mapped I/O virtual
addresses, and have the advantage of being calculable without access
to the original DMA mapping entry (e.g. when displaying an address for
a single failed completion within a descriptor ring).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The malloc_dma() function allocates memory with specified physical
alignment, and is typically (though not exclusively) used to allocate
memory for DMA.
Rename to malloc_phys() to more closely match the functionality, and
to create name space for functions that specifically allocate and map
DMA-capable buffers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The legacy transmit descriptor index is not reset by anything short of
a full device reset. This can cause the legacy transmit ring to stall
after closing and reopening the device, since the hardware and
software indices will be out of sync.
Fix by performing a reset after closing the interface. Do this only
if operating in legacy mode, since in C+ mode the reset is not
required and would undesirably clear additional state (such as the C+
command register itself).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently have no generic concept of a PHY address, since all
existing implementations simply hardcode the PHY address within the
MII access methods.
A bit-bashing MII interface will need to be provided with an explicit
PHY address in order to generate the correct waveform. Allow for this
by separating out the concept of a MII device (i.e. a specific PHY
address attached to a particular MII interface).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On some RTL8169 onboard NICs (observed with a Lenovo ThinkPad 11e),
the EEPROM is not merely not present: any attempt to read from the
non-existent EEPROM will crash and reboot the system.
The equivalent code to read from the EEPROM was removed from the Linux
r8169 driver in 2009 with a comment suggesting that it was similarly
found to be unreliable on some systems.
Fix by accessing the EEPROM only on RTL8139 NICs, and assuming that
the MAC address will always be correctly preset on RTL8169 NICs.
Reported-by: Evan Prohaska <eprohaska@edkey.org>
Tested-by: Evan Prohaska <eprohaska@edkey.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On an Asus Z87-K motherboard with an onboard 8168 NIC, booting into
Windows 7 and then warm rebooting into iPXE results in a broken RX
datapath: packets can be transmitted successfully but garbage is
received. A cold reboot clears the problem.
A dump of the PHY registers reveals only one difference: in the
failure case the bits ADVERTISE_PAUSE_CAP and ADVERTISE_PAUSE_ASYM are
cleared. Explicitly setting these bits does not fix the problem.
A dump of the MAC registers reveals a few differences, of which the
most obvious culprit is the undocumented bit 24 of the Receive
Configuration Register (RCR), which is set in the failure case.
Explicitly clearing this bit does fix the problem.
Reported-by: Sebastian Nielsen <ipxe@sebbe.eu>
Reported-by: Oliver Rath <rath@mglug.de>
Debugged-by: Sebastian Nielsen <ipxe@sebbe.eu>
Tested-by: Sebastian Nielsen <ipxe@sebbe.eu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
realtek_destroy_ring() currently does nothing if the card is operating
in legacy (pre-RTL8139C+) mode. In particular, the producer and
consumer counters are incorrectly left holding their current values.
Virtual hardware (e.g. the emulated RTL8139 in qemu and similar VMs)
is tolerant of this behaviour, but real hardware will fail to transmit
if the descriptors are not used in the correct order.
Fix by resetting the producer and consumer counters in
realtek_destroy_ring() even if the card is operating in legacy mode.
Reported-by: Gelip <mrgelip@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On some systems, it appears to be possible for writes to the EEPROM
registers to be delayed for long enough that the EEPROM's setup and
hold times are violated, resulting in invalid data being read from the
EEPROM.
Fix by inserting a PCI read cycle immediately after writes to
RTL_9346CR, to ensure that the write has completed before starting the
udelay() used to time the SPI bus transitions.
Reported-by: Gelip <mrgelip@gmail.com>
Tested-by: Gelip <mrgelip@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some older RTL8139 chips seem to not immediately update the
RTL_CR.BUFE bit in response to a write to RTL_CAPR. This results in
iPXE seeing a spurious zero-length received packet, and thereafter
being out of sync with the hardware's RX ring offset.
Fix by inserting an extra PCI read cycle after writing to RTL_CAPR, to
give the chip time to react before we next read RTL_CR.
Reported-by: Gelip <mrgelip@gmail.com>
Tested-by: Gelip <mrgelip@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some onboard RTL8169 NICs seem to leave the EEPROM pins disconnected.
The existing is_valid_ether_addr() test will not necessarily catch
this, since it expects a missing EEPROM to show up as a MAC address of
00:00:00:00:00:00 or ff:ff:ff:ff:ff:ff. When the EEPROM pins are
floating the MAC address may read as e.g. 00:00:00:00:0f:00, which
will not be detected as invalid.
Check the ID word in the first two bytes of the EEPROM (which should
have the value 0x8129 for all RTL8139 and RTL8169 chips), and use this
to determine whether or not an EEPROM is present.
Reported-by: Carl Karsten <carl@nextdayvideo.com>
Tested-by: Carl Karsten <carl@nextdayvideo.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards seem to drive the EEPROM CS line high (i.e. active)
when 9346CR.EEM is set to "normal operating mode", with the result
that the CS line is never deasserted. The symptom of this is that the
first read from the EEPROM will work, while all subsequent reads will
return garbage data.
Reported-by: Thomas Miletich <thomas.miletich@gmail.com>
Debugged-by: Thomas Miletich <thomas.miletich@gmail.com>
Tested-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards (observed with an RTL8169SC) power up advertising
only 100Mbps, despite being capable of 1000Mbps. Forcibly enable
advertisement of 1000Mbps on any RTL8169-like card.
This change relies on the assumption that the CTRL1000 register will
not exist on 100Mbps-only RTL8169 cards such as the RTL8101.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards (observed with an RTL8169SC) crash and burn if DAC
is enabled, even if only 32-bit addresses are used. Observed
behaviour includes system lockups and repeated transmission of garbage
data onto the wire.
This seems to be a known problem. The Linux r8169 driver disables DAC
by default and provides a "use_dac" module parameter.
There appears to be no known test for determining whether or not DAC
will work. As a workaround, enable DAC only if we are built as as
64-bit binary. This at least eliminates the problem in the common
case of a 32-bit build, which will never use 64-bit addresses anyway.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some bits in the C+ Command register are always one. Testing for the
presence of the register must allow for this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards (observed with an RTL8169SC) power up with
TCR.MXDMA set to 16 bytes. While this does not prevent proper
operation, it almost certainly degrades performance.
Fix by explicitly setting TCR.MXDMA to "unlimited".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards (observed with an RTL8169SC) power up with invalid
values in RCR.RXFTH and RCR.MXDMA, causing receive DMA to fail. Fix
by setting explicit values for both fields.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some RTL8169 cards (observed with an RTL8169SC) power up with garbage
values in the ring address registers, and do not clear the registers
on reset.
Fix by always setting the high dword of the ring address registers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RTL8139C+ cards use essentially the same datapath as RTL8169, which is
zerocopy and 64-bit capable. Older RTL8139 cards use a single receive
ring buffer rather than a descriptor ring, but still share substantial
amounts of functionality with RTL8169.
Include support for RTL8139 cards within the generic Realtek driver,
since there is no way to differentiate between RTL8139 and RTL8139C+
cards based on the PCI IDs alone.
Many thanks to all the people who worked on the rtl8139 driver over
the years.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The link state is currently set at probe time, and updated only when
the device is polled. This results in the user seeing a misleading
stale "Link: down" message, if autonegotiation did not complete within
the short timespan of the probe routine.
Fix by updating the link state when the device is opened, so that the
message that ends up being displayed to the user reflects the real
link state at device open time.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Tested-by: Thomas Miletich <thomas.miletich@gmail.com>
Debugged-by: Thomas Miletich <thomas.miletich@gmail.com>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>