In order to construct outgoing link-layer frames or parse incoming
ones properly, some protocols (such as 802.11) need more state than is
available in the existing variables passed to the link-layer protocol
handlers. To remedy this, add struct net_device *netdev as the first
argument to each of these functions, so that more information can be
fetched from the link layer-private part of the network device.
Updated all three call sites (netdevice.c, efi_snp.c, pxe_undi.c) and
both implementations (ethernet.c, ipoib.c) of ll_protocol to use the
new argument.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Several SPI chips will respond to an SPI read command with a dummy
zero bit immediately prior to the first real data bit. This can be
used to autodetect the address length, provided that the command
length and data length are already known, and that the MISO data line
is tied high.
Tested-by: Thomas Miletich <thomas.miletich@gmail.com>
Debugged-by: Thomas Miletich <thomas.miletich@gmail.com>
The pcnet32 driver mismanages its RX buffers, with the result that
packets get corrupted if more than one packet arrives between calls to
poll().
Originally-fixed-by: Bill Lortz <Bill.Lortz@premier.org>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Tested-by: Stefan Hajnoczi <stefanha@gmail.com>
Also adds the MAC_ADDR_CORRECT flag, to indicate whether or not the
MAC address needs to be fixed up by the driver.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
This is a major rewrite of the legacy etherboot 3c90x driver using the
gPXE API for much improved performance over the legacy driver it
replaces.
This driver has been tested on 3c905, 3c905B, and 3c905C cards.
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Marty Connor <mdc@etherboot.org>
Tested-by: Marty Connor <mdc@etherboot.org>
Tested-by: Daniel Verkamp <daniel@drv.nu>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Intel's C compiler (icc) chokes on the zero-length arrays that we
currently use as part of the mechanism for accessing linker table
entries. Abstract away the zero-length arrays, to make a port to icc
easier.
Introduce macros such as for_each_table_entry() to simplify the common
case of iterating over all entries in a linker table.
Represent table names as #defined string constants rather than
unquoted literals; this avoids visual confusion between table names
and C variable or type names, and also allows us to force a
compilation error in the event of incorrect table names.
Following the example of the Linux driver, we add a check and delay to
make sure that the NIC has finished resetting before the driver issues
any additional commands.
Signed-off-by: Marty Connor <mdc@etherboot.org>
This previously unsupported NIC variant was was found to work using
the current driver:
PCI_ROM(0x13f0, 0x0200, "ip100a", "IC+ IP100A"),
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some targets send a spurious CHECK CONDITION message in response to
the first SCSI command. We issue (and ignore the status of) an
arbitary harmless SCSI command (a READ CAPACITY (10)) in order to draw
out this response.
The Solaris Comstar target seems to send more than one spurious CHECK
CONDITION response. Attempt up to SCSI_MAX_DUMMY_READ_CAP dummy READ
CAPACITY (10) commands before assuming that error responses are
meaningful.
Problem reported by Kristof Van Doorsselaere <kvandoor@aserver.com>
and Shiva Shankar <802.11e@gmail.com>.
Driver was storing the result of pci_bar_start() and pci_bar_size() in
an int, rather than an unsigned long.
(Bug was introduced in the vendor's tree in commit eac85cd "Port
etherfabric driver to net_device api".)
adjust_pci_device() has historically enabled bus-mastering and I/O
cycles, but has never previously needed to enable memory cycles. Some
EFI systems seem not to enable memory cycles by default, so add that
to the list of PCI command register bits that we force on.
When compiling for the Linux kernel, PCI_BASE_ADDRESS_0 == 0, and
PCI_BASE_ADDRESS_1 == 1. This is not so when compiling for gPXE. We
must use the symbolic names rather than integers to get the correct
values.
Bug identified and patch supplied by:
George Chou <george.chou@advantech.com>
The patch file supplied for commit 3a799e9 ("[hermon] Add PCI ID for
ConnectX QDR card") accidentally marked drivers/infiniband/hermon.c as
being executable.
This driver is based on Stefan Hajnoczi's summer work, which
is in turn based on version 1.01 of the linux b44 driver.
I just assembled the pieces and fixed/added a few pieces
here and there to make it work for my hardware.
The most major limitation is that this driver won't work
on systems with >1GB RAM due to the card not having enough
address bits for that and gPXE not working around this
limitation.
Still, other than that the driver works well enough for
at least 2 users :) and the above limitation can always
be fixed when somebody wants it bad enough :)
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
This brings us in to line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.
The return path in directed route SMPs lists the egress ports in order
from SM to node, rather than from node to SM.
To write to the correct offset within the return path, we need to
parse the hop pointer. This is held within the class-specific data
portion of the MAD header, which was previously unused by us and
defined to be a uint16_t. Define this field to be a union type; this
requires some rearrangement of ib_mad.h and corresponding changes to
ipoib.c.
These cards very nearly support our current IB Verbs model. There is
one minor difference: multicast packets will always be delivered by
the hardware to QP0, so the driver has to redirect them to the
appropriate QP. This means that QP owners may see receive completions
for buffers that they never posted. Nothing in our current codebase
will break because of this.
This can be used with cards that require the driver to construct and
parse packet headers manually. Headers are optionally handled
out-of-line from the packet payload, since some such cards will split
received headers into a separate ring buffer.
Some Infiniband cards will not be as accommodating as the Arbel and
Hermon cards in providing enough space for us to push a fake extra
header at the start of the received packet. We must therefore make do
with squeezing enough information to identify source and destination
addresses into the two bytes of padding within a genuine IPoIB
link-layer header.
Not all Infiniband cards have embedded subnet management agents.
Split out the code that communicates with such an embedded SMA into a
separate ib_smc.c file, and have drivers call ib_smc_update()
explicitly when they suspect that the answers given by the embedded
SMA may have changed.
Receive completion handlers now get passed an address vector
containing the information extracted from the packet headers
(including the GRH, if present), and only the payload remains in the
I/O buffer.
This breaks the symmetry between transmit and receive completions, so
remove the ib_completer_t type and use an ib_completion_queue_operations
structure instead.
Rename the "destination QPN" and "destination LID" fields in struct
ib_address_vector to reflect its new dual usage.
Since the ib_completion structure now contains only an IB status code,
("syndrome") replace it with a generic gPXE integer status code.
Avoid leaking I/O buffers in ib_destroy_qp() by completing any
outstanding work queue entries with a generic error code. This
requires the completion handlers to be available to ib_destroy_qp(),
which is done by making them static configuration parameters of the CQ
(set by ib_create_cq()) rather than being provided on each call to
ib_poll_cq().
This mimics the functionality of netdev_{tx,rx}_flush(). The netdev
flush functions would previously have been catching any I/O buffers
leaked by the IPoIB data queue (though not by the IPoIB metadata
queue).
Add the simplified ne2k_isa driver. It is just a selective copy+paste
of the relevant parts from ns8390.c plus a little trivial hacking to
make it actually work.
It is true that the code is pretty ugly, but:
a) ns8390.c is worse
b) It is only 372 lines and no #ifdefs
c) It works both in qemu/bochs and in real hardware
and we all know it is easier to cleanup working code
Hope someone will find the time to rewrite this driver properly,
but until then at least for me this is an ok solution.
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
Halting the PEGs breaks platforms where there is sideband access to
the NIC (e.g. HP machines using iLO). (We have to retain the
unhalting code because on some other platforms (e.g. IBM blades with
BOFM) the pre-PXE firmware must halt the PEGs to avoid issues with the
BIOS rereading via the expansion ROM BAR.)
This is something of an ugly hack to accommodate an OEM requirement.
The NIC has only one expansion ROM BAR, rather than one per port. To
allow individual ports to be selectively enabled/disabled for PXE boot
(as required), we must therefore leave the expansion ROM always
enabled, and place the per-port enable/disable logic within the gPXE
driver.
The Phantom firmware selectively disables PCI functions based on the
board type, with the end result that we see one PCI function for each
network port. This allows us to eliminate the code for reading from
flash and, more importantly, removes knowledge of the board type magic
number from the gPXE driver.
Settings can be constructed using a dotted-decimal notation, to allow
for access to unnamed settings. The default interpretation is as a
DHCP option number (with encapsulated options represented as
"<encapsulating option>.<encapsulated option>".
In several contexts (e.g. SMBIOS, Phantom CLP), it is useful to
interpret the dotted-decimal notation as referring to non-DHCP
options. In this case, it becomes necessary for these contexts to
ignore standard DHCP options, otherwise we end up trying to, for
example, retrieve the boot filename from SMBIOS.
Allow settings blocks to specify a "tag magic". When dotted-decimal
notation is used to construct a setting, the tag magic value of the
originating settings block will be ORed in to the tag number.
Store/fetch methods can then check for the magic number before
interpreting arbitrarily-numbered settings.
This interface provides access to firmware settings (e.g. MAC address)
that will apply to all drivers loaded for the duration of the current
system boot.
A hardware bug means that reads through the expansion ROM BAR can
return corrupted data if the PEGs are running. This breaks platforms
that re-read the expansion ROM after invoking gPXE code, such as IBM
blade servers.
Halt PEGs during driver shutdown, and unhalt PEGs during driver
startup if we detect that this is not the first startup since
power-on.
Most other Phantom drivers define a register space in terms of a 64M
virtual address space. While this doesn't map in any meaningful way
to the actual addresses used on the latest cards, it makes maintenance
easier if we do the same.
The virtnet_transmit() logic for waiting the packet to be transmitted is
reversed: we can't wait the packet to be transmitted if we didn't kick()
the ring yet. The vring_more_used() while loop logic is reversed also,
that explains why the code works today.
The current code risks trying to free a buffer from the used ring
when none was available, that will happen most times because KVM
doesn't handle the packet immediately on kick(). Luckily it was working
because it was unlikely to have a buffer still queued for transmit when
virtnet_transmit() was called.
Also, adds a BUG_ON() to vring_get_buf(), to catch cases where we try
to free a buffer from the used ring when there was none available.
Patch for Etherboot. gPXE has the same problem on the code, but I hadn't
a chance to test gPXE using virtio-net yet.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
EFI requires us to be able to specify the source address for
individual transmitted packets, and to be able to extract the
destination address on received packets.
Take advantage of this to rationalise the push() and pull() methods so
that push() takes a (dest,source,proto) tuple and pull() returns a
(dest,source,proto) tuple.
Multicast hashing is an ugly overlap between network and link layers.
EFI requires us to provide access to this functionality, so move it
out of ipv4.c and expose it as a method of the link layer.
-Wformat-nonliteral is not enabled by -Wall and needs to be explicitly
specified.
Modified the few files that use nonliteral format strings to work with
this new setting in place.
Inspired by a patch from Carl Karsten <carl@personnelware.com> and an
identical patch from Rorschach <r0rschach@lavabit.com>.
Some devices (e.g. the Atmel AT24C11) have no concept of a device
address; they respond to every device address and use this value as
the word address. Some other devices use part of the device address
field to extend the word address field.
Generalise the i2c bit-bashing support to handle this by defining the
device address length and word address length as properties of an i2c
device. The word address is assumed to overflow into the device
address field if the address used exceeds the width of the word
address field.
Also add a bus reset mechanism. i2c chips don't usually have a reset
line, so rebooting the host will not clear any bizarre state that the
chip may be in. We reset the bus by clocking SCL until we see SDA
high, at which point we know we can generate a start condition and
have it seen by all devices. We then generate a stop condition to
leave the bus in a known state prior to use.
Finally, add some extra debugging messages to i2c_bit.c.
Use individual page mappings rather than a single whole-region
mapping, to avoid the waste of memory that occurs due to the
constraint that each mapped block must be aligned on its own size.
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
It is possible for the BIOS to use the UNDI API to bring up the NIC
prior to system boot. If this happens, UNM_NIC_REG_CMDPEG_STATE will
contain the value 0xf00f (UNM_NIC_REG_CMDPEG_STATE_INITIALIZE_ACK),
and we should skip initialising the command PEG.
The firmware will now determine the right port mode on all cards, so
the PXE driver doesn't have to set it. (Setting the port mode
apparently breaks some newer cards.)
Commit f58cc3f introduced a temporary workaround for a bug in current
prototype silicon, but failed to apply it to all eight PCI functions
within the device.
Determine the network-layer packet type and fill it in for UNDI
clients. This is required by some NBPs such as emBoot's winBoot/i.
This change requires refactoring the link-layer portions of the
gPXE netdevice API, so that it becomes possible to strip the
link-layer header without passing the packet up the network stack.
This patch adds support for the virtio-net adapter provided by KVM.
Written by Laurent Vivier <Laurent.Vivier@bull.net> for Etherboot.
Wrapped as legacy driver for gPXE by Stefan Hajnoczi
<stefanha@gmail.com>.
In tg3_chip_reset(), the PCI_EXPRESS change is taken from the Linux
tg3 driver. I am not sure what exactly it does (it is not documented
in the Linux driver), but it is necessary for the NIC to work
correctly.
Conjecture: The hardware issues 64-bit DMA writes of status descriptors,
which some PCI bridges seem to split into two 32-bit writes in reverse
order (i.e. dword 1 first). This means that we sometimes observe a
partial status descriptor. Add an explicit check to ensure that the
descriptor is complete before processing it.
Also ensure that the RDS consumer counter is incremented only when we
know that we have actually consumed an RX descriptor.
ns8390.c can produce four different drivers (one PCI, three ISA.) The
ISA driver requires setting a few macros; do that by setting defines
in stub files instead of using src/Config.
Currently, all the ISA drivers are broken (they were not enabled by
default), so #if 0 them out.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
From: Daniel Mealha Cabrita <dancab@utfpr.edu.br>
I've added tg3-5721 support for gPXE, the patch (against gpxe-0.9.3) is
attached to this message.
This chipset is present in HP ML150 G2 servers (possibly other HP machines
as well).
Drivers are not allowed to call printf(). Converted eprintf() to DBG(),
and removed spurious startup banner.
Fixed hardcoded inclusion of little_bswap.h
Use EIO rather than 1 as an error number.
Add ability for network devices to flag link up/down state to the
networking core.
Autobooting code will now wait for link-up before attempting DHCP.
IPoIB reflects the Infiniband link state as the network device link state
(which is not strictly correct; we also need a succesful IPoIB IPv4
broadcast group join), but is probably more informative.
Infiniband devices no longer block waiting for link-up in
register_ibdev().
Hermon driver needs to create an event queue and poll for link-up events.
Infiniband core needs to reread MAD parameters when link state changes.
IPoIB needs to cope with Infiniband link parameters being only partially
available at probe and open time.
Arbel and Hermon cards both have multiple ports. Add the
infrastructure required to register each port as a separate IB
device. Don't yet register more than one port, since registration
will currently fail unless a valid link is detected.
Use ib_*_{set,get}_{drv,owner}data wrappers to access driver- and
owner-private data on Infiniband structures.
Pull out common code for handling management datagrams from arbel.c
and hermon.c into infiniband.c.
Add port number to struct ib_device.
Add open(), close() and mad() methods to struct ib_device_operations.
From: Geert Stappers <stappers@stappers.nl>
To: etherboot-developers@lists.sourceforge.net
Subject: [Etherboot-developers] 3c90x polling again [patch]
Date: Thu, 29 Nov 2007 09:22:36 +0100
User-Agent: Mutt/1.5.16 (2007-06-11)
Hello,
gPXE didn't work on 3COM 905C Tornado cards for me.
It did transmit the DHCP request, but it didn't see the DHCP offer.
Adding debug print statements allready solved the problem.
Attached is a patch that has a cleaner delay then print statements.
The core of it is
- for(i=0;i<40000;i++);
+ mdelay(1);
There was no research if the change is about a longer delay
or about code NOT being optimized away. It works for me :-)
Cheers
Geert Stappers
driver's probe() routine fills in in nic->irqno. This is so that
non-interrupt-capable legacy drivers which set nic->irqno=0 will end
up reporting IRQ#0 via PXENV_UNDI_GET_INFORMATION; this in turn means
that the calling PXE NBP will (should) hook the timer interrupt, and
everything will sort of work.
The e1000_irq() routine should (per mcb30) do enable on non-zero,
disable on zero. This is not consistent in all drivers, so I'll
wait to update it when doing a global sweep.
This needs to be done manually because if the irq() routine is
implemented then we want something like "nic->irqno = pci->irqno;",
else we do "nic->irqno = 0;" nic->ioaddr may also need to be set
carefully.
Also added local variables to end of many files, for emacs indentation
to match kernel style (tab does 8 space indent).
There may still be an issue with memory handling, since it seems to
die ungracefully when ARP packets come in after loading a kernel.
Something to debug.
tracking down a bug that turned out to be a free_iob() used where I
needed a netdev_tx_complete(). This left the freed I/O buffer on the
net device's TX list, with bad, bad consequences later.
Also fixed the bug in question.
This driver really needs to be rewritten.
It tries to build both ISA and PCI images,
and makes life diffifult for the build system
and rom-o-matic.net. The warning was just a reminder
that it needs to be cleaned up and re-factored
to split the PCI and ISA drivers.
This ancient ISA driver should probably be removed.
It is hard to get a card to test it with, and there
are comments to the effect that it cannot work with
relocation. I would be quite interested to get a
bug report by someone who actually has this card.
safe dropping of the netdev ref by the driver while other refs still
exist.
Add netdev_irq() method. Net device open()/close() methods should no
longer enable or disable IRQs.
Remove rx_quota; it wasn't used anywhere and added too much complexity
to implementing correct interrupt-masking behaviour in pxe_undi.c.
Use generic fields in struct device_description rather than assuming
that the struct device * is contained within a pci_device or
isapnp_device; this assumption is broken when using the undionly
driver.
Add PXENV_UNDI_SET_STATION_ADDRESS.