Several use cases (e.g. the UNDI API and the EFI SNP API) require
access to the raw network device receive queue, and so currently use
manual calls to netdev_poll() on a specific network device in order to
prevent received packets from being processed by the network stack.
As an alternative, provide a flag that allows receive queue processing
to be frozen on a per-device basis. When receive queue processing is
frozen, packets will be enqueued as normal, but will not be
automatically dequeued and passed up the network stack.
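The intended behaviour can be sketched as follows. This is a minimal
illustration using a toy device structure; the names toy_netdev,
rx_frozen, toy_enqueue() and toy_poll() are hypothetical and are not
the identifiers used by the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_RX_QUEUE_LEN 8

/* Toy network device (hypothetical; not iPXE's struct net_device) */
struct toy_netdev {
	int rx_queue[TOY_RX_QUEUE_LEN];	/* Queued packet identifiers */
	size_t rx_count;		/* Number of queued packets */
	bool rx_frozen;			/* Receive queue processing is frozen */
	size_t delivered;		/* Packets passed up the stack */
};

/* Enqueue a received packet; this happens whether or not frozen */
bool toy_enqueue ( struct toy_netdev *netdev, int pkt ) {
	if ( netdev->rx_count >= TOY_RX_QUEUE_LEN )
		return false;
	netdev->rx_queue[netdev->rx_count++] = pkt;
	return true;
}

/* Process the receive queue, unless processing is frozen */
void toy_poll ( struct toy_netdev *netdev ) {
	if ( netdev->rx_frozen )
		return;
	/* Stand-in for passing each packet up the network stack */
	netdev->delivered += netdev->rx_count;
	netdev->rx_count = 0;
}
```

While rx_frozen is set, a consumer such as an UNDI or SNP
implementation could dequeue packets directly from rx_queue without
racing against toy_poll().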
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some network cards do not generate interrupts when operated via the
UNDI API. Allow for this by waiting for the ISR to be triggered only
if the PXE stack advertises that it supports interrupts. When the PXE
stack does not advertise interrupt support, we skip the call to
PXENV_UNDI_ISR_IN_START and just poll the device using
PXENV_UNDI_ISR_IN_PROCESS. This matches the observed behaviour of at
least one other PXE NBP (emBoot's winBoot/i), so there is a reasonable
chance of this working.
Originally-implemented-by: Muralidhar Appalla <Muralidhar.Appalla@emulex.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a "hexhyp" setting type, which functions identically to the
"hex" setting type except that it uses a hyphen instead of a colon as
the byte delimiter.
For example, if ${mac} expands to "52:54:00:12:34:56", then
${mac:hexhyp} will expand to "52-54-00-12-34-56".
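The formatting difference amounts to a choice of delimiter. A rough
sketch, assuming a hypothetical helper format_hex_delim() (this is not
the settings code from the patch itself):

```c
#include <stddef.h>
#include <stdio.h>

/* Format raw bytes as two-digit hex values separated by a delimiter.
 *
 * Hypothetical helper for illustration only.  Returns the number of
 * characters written (excluding the NUL), or -1 if the buffer is too
 * small.
 */
int format_hex_delim ( const unsigned char *data, size_t len, char delim,
		       char *buf, size_t buflen ) {
	size_t used = 0;
	size_t needed;
	size_t i;

	if ( buflen == 0 )
		return -1;
	buf[0] = '\0';
	for ( i = 0 ; i < len ; i++ ) {
		/* Two hex digits, an optional delimiter, and the NUL */
		needed = ( ( i > 0 ) ? 3 : 2 ) + 1;
		if ( ( buflen - used ) < needed )
			return -1;
		if ( i > 0 )
			buf[used++] = delim;
		used += ( size_t ) snprintf ( &buf[used], ( buflen - used ),
					      "%02x", data[i] );
	}
	return ( int ) used;
}
```

Calling this with ':' gives the "hex" style of output, and calling it
with '-' gives the "hexhyp" style.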
Originally-implemented-by: Jarrod Johnson <jarrod.b.johnson@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Fix typographical error from commit ea631f6 ("[list] Add
list_first_entry()"). The symptom was PXELINUX 3.86 causing a stack
overflow under VMware.
Tested-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow fc_ulp_decrement() to guarantee to fc_peer_decrement() that the
peer reference remains valid for the duration of the call, by ensuring
that ulp->peer remains valid while ulp is valid.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow link examination methods to safely assume that their
self-reference remains valid for the duration of the method call.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calling a timer's expiry method may cause arbitrary consequences,
including arbitrary modifications of the list of retry timers.
list_for_each_entry_safe() guards against only deletion of the current
list entry; it provides no protection against other list
modifications. In particular, if a timer's expiry method causes the
subsequent timer in the list to be deleted, then the next loop
iteration will access a timer that may no longer exist.
This is a particularly nasty bug, since absolutely none of the
list-manipulation or reference-counting assertion checks will be
triggered. (The first assertion failure happens on the next iteration
through list_for_each_entry(), showing that the list has become
corrupted but providing no clue as to when this happened.)
Fix by stopping traversal of the list of retry timers as soon as we
hit an expired timer.
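The approach can be sketched with a toy singly-linked timer list. All
names here (toy_timer, toy_timer_del(), toy_retry_step(),
toy_expire_all()) are hypothetical, and the list implementation is a
simplification rather than iPXE's own:

```c
#include <stddef.h>

/* Toy retry timer (hypothetical; not iPXE's struct retry_timer) */
struct toy_timer {
	struct toy_timer *next;
	unsigned long start;
	unsigned long timeout;
	void ( * expired ) ( struct toy_timer *timer );
};

/* List of running timers */
struct toy_timer *toy_timers;

/* Remove a timer from the list, if present */
void toy_timer_del ( struct toy_timer *timer ) {
	struct toy_timer **prev;

	for ( prev = &toy_timers ; *prev ; prev = &(*prev)->next ) {
		if ( *prev == timer ) {
			*prev = timer->next;
			timer->next = NULL;
			return;
		}
	}
}

/* Handle at most one expired timer; returns non-zero if one expired */
int toy_retry_step ( unsigned long now ) {
	struct toy_timer *timer;

	for ( timer = toy_timers ; timer ; timer = timer->next ) {
		if ( ( now - timer->start ) >= timer->timeout ) {
			toy_timer_del ( timer );
			timer->expired ( timer );
			/* The list may now have changed arbitrarily,
			 * so do not touch "timer" or continue the
			 * traversal.
			 */
			return 1;
		}
	}
	return 0;
}

/* Example expiry method that deletes every remaining timer */
void toy_expire_all ( struct toy_timer *timer ) {
	( void ) timer;
	while ( toy_timers )
		toy_timer_del ( toy_timers );
}
```

Any further expired timers are picked up by later calls to
toy_retry_step(), each of which starts from a fresh view of the list.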
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rearrange the fields in struct memory_block (without altering
MIN_MEMBLOCK_SIZE) so that the "count" field of a reference-counted
object is left intact when the memory containing the object is freed.
This allows for the possibility of detecting reference-counting errors
such as double-freeing.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Check that the reference count is valid (i.e. non-negative) on each
call to ref_get() and ref_put(), using an assert() at the point of
use.
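A minimal sketch of such a check, assuming the convention that the
stored count is non-negative while the object is live and that the
object is freed when the count drops below zero. The toy_ names are
hypothetical stand-ins for the real reference-counting functions:

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference counter (hypothetical; assumed convention described above) */
struct toy_refcnt {
	int count;
	void ( * free ) ( struct toy_refcnt *refcnt );
};

/* Number of times the example free method has been called */
int toy_free_calls;

/* Example free method that merely counts its invocations */
void toy_count_free ( struct toy_refcnt *refcnt ) {
	( void ) refcnt;
	toy_free_calls++;
}

/* Acquire a reference, checking validity at the point of use */
struct toy_refcnt * toy_ref_get ( struct toy_refcnt *refcnt ) {
	if ( refcnt ) {
		assert ( refcnt->count >= 0 );
		refcnt->count++;
	}
	return refcnt;
}

/* Drop a reference, checking validity at the point of use */
void toy_ref_put ( struct toy_refcnt *refcnt ) {
	if ( ! refcnt )
		return;
	assert ( refcnt->count >= 0 );
	refcnt->count--;
	if ( ( refcnt->count < 0 ) && refcnt->free )
		refcnt->free ( refcnt );
}
```

With this in place, a second toy_ref_put() after the object has
already been freed would trip the assertion immediately, provided that
the count field has not been overwritten in the meantime.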
Signed-off-by: Michael Brown <mcb30@ipxe.org>
free_memblock() currently uses list_for_each_entry() to iterate over
the free list, and may delete an entry over which it iterates. While
there is no way that the deleted list entry could be overwritten
before we reference it, this does rely upon list_del() leaving the
"next" pointer intact, which is not guaranteed. Discovered while
tracking down a list-corruption bug (as a result of having modified
list_del() to sanitise the deleted list entry).
Fix by using list_for_each_entry_safe().
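The difference can be illustrated with a toy doubly-linked list whose
deletion function clears the deleted entry's pointers. The names below
are hypothetical, and the loop is written out by hand instead of using
iPXE's list macros:

```c
#include <stddef.h>

/* Toy doubly-linked list (hypothetical; not iPXE's struct list_head) */
struct toy_list {
	struct toy_list *next;
	struct toy_list *prev;
};

/* Toy free-memory block */
struct toy_block {
	struct toy_list list;
	size_t size;
};

void toy_list_init ( struct toy_list *head ) {
	head->next = head;
	head->prev = head;
}

void toy_list_add_tail ( struct toy_list *entry, struct toy_list *head ) {
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Delete an entry and sanitise its pointers */
void toy_list_del ( struct toy_list *entry ) {
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = NULL;
	entry->prev = NULL;
}

/* Remove all blocks of a given size; returns the number removed */
unsigned int toy_remove_size ( struct toy_list *head, size_t size ) {
	struct toy_list *pos;
	struct toy_list *tmp;
	struct toy_block *block;
	unsigned int removed = 0;

	/* Record the next entry before the loop body runs, so that
	 * deleting "pos" cannot break the traversal.
	 */
	for ( pos = head->next, tmp = pos->next ; pos != head ;
	      pos = tmp, tmp = pos->next ) {
		block = ( ( struct toy_block * )
			  ( ( char * ) pos -
			    offsetof ( struct toy_block, list ) ) );
		if ( block->size == size ) {
			toy_list_del ( pos );
			removed++;
		}
	}
	return removed;
}
```

A loop that instead advanced using pos->next after the deletion would
dereference the sanitised (NULL) pointer.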
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There are several points in the iPXE codebase where
list_for_each_entry() is (ab)used to extract only the first entry from
a list. Add a macro list_first_entry() to make this code easier to
read.
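A sketch of what such a macro might look like, assuming that it takes
the containing type explicitly and yields NULL for an empty list. The
toy_ prefix marks these as illustrative definitions, not the ones
added by this patch:

```c
#include <stddef.h>

/* Toy doubly-linked list (hypothetical; not iPXE's struct list_head) */
struct toy_list {
	struct toy_list *next;
	struct toy_list *prev;
};

/* Test whether a list is empty */
#define toy_list_empty( head ) ( (head)->next == (head) )

/* Get the structure containing a list entry */
#define toy_list_entry( ptr, type, member ) \
	( ( type * ) ( ( char * ) (ptr) - offsetof ( type, member ) ) )

/* Get the first entry in a list, or NULL if the list is empty */
#define toy_list_first_entry( head, type, member )		\
	( toy_list_empty ( (head) ) ?				\
	  ( type * ) NULL :					\
	  toy_list_entry ( (head)->next, type, member ) )

/* Example structure that can be placed on a list */
struct toy_item {
	int value;
	struct toy_list list;
};
```

A caller can then write

  item = toy_list_first_entry ( &items, struct toy_item, list );

rather than opening a loop and breaking out of it on the first
iteration.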
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Functions that instantiate objects generally own one reference to the
object being created. The error paths must therefore usually call
ref_put() to release this reference.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For some install-to-SAN scenarios, the OS needs to be able to reboot
to reread the partition table. On this second boot attempt, the SAN
disk will not be empty and so iPXE will attempt to boot from it,
rather than falling back to the OS's installation media.
Work around this problem by introducing the "skip-san-boot" option,
similar in spirit to "keep-san".
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Improve the visibility of error messages by removing the redundant
final printing of the URL being booted.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some SCSI targets (observed with an EMC CLARiiON Fibre Channel target)
will not respond to commands correctly until a TEST UNIT READY has
been issued. In particular, a READ CAPACITY (10) command will return
with a success status, but no capacity data.
Fix by issuing a TEST UNIT READY command automatically, and delaying
further SCSI commands until the TEST UNIT READY has succeeded.
Reported-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The FCP command reference number is intended to be used for
controlling precise delivery of FCP commands, rather than being an
essentially arbitrary tag field (as with iSCSI and SRP).
Use the Fibre Channel local exchange ID as the tag for FCP commands,
instead of the FCP command reference. The local exchange ID does not
appear within the FCP IU itself, but does appear within the FC frame
header; debug traces can therefore still be correlated with packet
captures.
Reported-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Users tend to gloss over cryptic-looking error messages such as
"Boot failed: Exec format error (Error 0x2e852001)"
In particular, users tend not to report the error number, which is the
single most useful piece of diagnostic information in an iPXE error
message. Try replacing the "Error 0x2e852001" portion with a URL,
giving
"Boot failed: Exec format error (http://ipxe.org/2e852001)"
in the hope that users will, upon seeing something that is
recognisably a URL, try viewing it in a web browser. Such users will
be greeted by a web page containing a more detailed description of the
error (automatically generated from the einfo text), including links
to each line of code that might generate the error, and a section for
additional user-contributed notes. At the time of writing, a user who
visits http://ipxe.org/2e852001 would see a note saying
"This error usually indicates that the SAN disk is empty, and does
not yet contain a bootable operating system."
which may be more useful than "Exec format error (Error 0x2e852001)".
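Constructing the URL form from an error number is straightforward. A
sketch, assuming a hypothetical helper format_error_url() (the real
change lives in the error string code and may be structured
differently):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format an error number as a URL under http://ipxe.org/
 *
 * Hypothetical helper for illustration only.  Returns the length of
 * the formatted URL, as reported by snprintf().
 */
int format_error_url ( uint32_t errno_value, char *buf, size_t len ) {
	return snprintf ( buf, len, "http://ipxe.org/%08x",
			  ( unsigned int ) errno_value );
}
```

For the error number 0x2e852001 this produces the string
"http://ipxe.org/2e852001".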
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 5f4ab0d ("[iscsi] Randomise a portion of the ISID to force new
session instantiation") introduced a regression by randomising the
ISID on each call to iscsi_start_login(), which may be called more
than once per connection, rather than on each call to
iscsi_open_connection(), which is guaranteed to be called only once
per connection. This is incorrect behaviour that causes our
connection to be rejected by some iSCSI targets (observed with a
COMSTAR target under OpenSolaris).
Fix by generating the ISID in iscsi_open_connection(), and storing the
randomised ISID as part of the session state.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The config/local/*.h files are expected to be empty in most cases.
This should not cause a licence determination to fail.
Fix by ignoring config/local/*.h for licensing purposes.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When a connection to an iSCSI target is broken without gracefully
closing the TCP socket, a subsequent connection attempt may fail
because the target believes that we are attempting session
reinstatement (see RFC3720 section 5.3.1). This has been observed
using the Microsoft iSCSI target.
Section 9.1.1 of RFC3720 states that initiators should use a stable
ISID, however section 5.3.1 shows that the only way to explicitly
request that a new session be created is to use a new ISID.
Fix by randomising the "qualifier" portion of the ISID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently set both the FP and SP bits in our FIP FLOGI, to allow
the FCF the choice of selecting either a fabric-provided or a server-
provided MAC address. This complies with the FCoE specification, but
has been observed to result in an FLOGI rejection from some FCFs.
Fix by recording whether or not the FCF supports SPMA, and requesting
only one of FPMA or SPMA in our FIP FLOGI. We choose to prefer SPMA
where available, because many iPXE drivers will not be able to receive
unicast packets sent to a non-default MAC address.
Reported-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When using binutils 2.20, it seems to be necessary to add -ldl to link
against -lbfd.
Reported-by: Duane Voth <duanev@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
(Ab)use the "secs" field in transmitted DHCP packets to convey
metadata about the DHCP session state. In particular:
  bit 0 represents the receipt of a ProxyDHCPOFFER
  bit 1 represents the receipt of a DHCPOFFER
  bits 2+ represent the transmitted packet sequence number
This allows some relevant information about the internal state of the
DHCP session to be read out from a packet trace from a non-debug build
of iPXE. It also potentially allows replies to be correlated to their
requests (for servers that copy the "secs" field from request to
reply).
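The encoding described above can be sketched as follows. The macro and
function names are hypothetical, and the value is shown in host byte
order; conversion to network byte order for transmission is omitted.

```c
#include <stdint.h>

/* Hypothetical names for the bit assignments described above */
#define TOY_SECS_PROXY_OFFER	0x0001	/* bit 0: ProxyDHCPOFFER received */
#define TOY_SECS_OFFER		0x0002	/* bit 1: DHCPOFFER received */
#define TOY_SECS_SEQ_SHIFT	2	/* bits 2+: packet sequence number */

/* Construct the 16-bit "secs" value (host byte order) */
uint16_t toy_dhcp_secs ( int have_proxy_offer, int have_offer,
			 unsigned int seq ) {
	uint16_t secs = 0;

	if ( have_proxy_offer )
		secs |= TOY_SECS_PROXY_OFFER;
	if ( have_offer )
		secs |= TOY_SECS_OFFER;
	/* High-order sequence bits are silently truncated */
	secs |= ( uint16_t ) ( seq << TOY_SECS_SEQ_SHIFT );
	return secs;
}
```

For example, the fourth transmitted packet (sequence number 3) sent
after receiving only a DHCPOFFER would carry a "secs" value of 14.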
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some ProxyDHCP implementations seem to violate the PXE specification
by expecting the client to retain options from the ProxyDHCPOFFER
rather than issuing a separate ProxyDHCPREQUEST.
Work around such broken servers by retaining the ProxyDHCPOFFER
packet, and proceeding to a ProxyDHCPREQUEST only if the
ProxyDHCPOFFER does not already contain PXE options.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A recent patch series breaks compatibility with various common DHCP
implementations.
Revert "[dhcp] Don't consider invalid offers to be duplicates"
This reverts commit 905ea56753.
Revert "[dhcp] Honor PXEBS_SKIP option in discovery control"
This reverts commit 620b98ee4b.
Revert "[dhcp] Keep multiple DHCP offers received, and use them intelligently"
This reverts commit 5efc2fcb60.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
FCoE requires us to be able to receive unicast packets for multiple
addresses. Support this by operating in promiscuous mode.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The port ID assigned by the FLOGI response is implicit in the
destination ID used for the response (which will differ from the
source ID used for the corresponding request).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
FCoE requires the use of fabric-provided MAC addresses, which breaks
the assumption that the net device's MAC address is implicitly the
source address for net_tx() and the (unicast) destination address for
net_rx().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The disk signature is used by some OSes (notably Windows) to identify
the boot disk, so it's useful debugging information to have.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Error numbers are signed ints. EUNIQ() should not allow implicit type
promotion based on the supplied error disambiguator, because this
causes problems with statements such as
rc = ( condition ? -EUNIQ ( EBASE, disambiguator ) : -EBASE );
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support the extensions mandated by EDD 4.0, including:
o the ability to specify a flat physical address in a disk address
packet,
o the ability to specify a sector count greater than 127 in a disk
address packet,
o support for all functions within the Fixed Disk Access and EDD
Support subsets,
o the ability to describe a device using EDD Device Path Information.
This implementation is based on draft revision 3 of the EDD 4.0
specification, with reference to the EDD 3.0 specification. It is
possible that this implementation may need to change in order to
conform to the final published EDD 4.0 specification.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Avoid a tedious timeout delay when attempting to issue a command over
a network device that has been closed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow multiple space-separated values (such as kernel arguments
passed via DHCP) to be assigned to an identifier using the "set"
command.
Originally-implemented-by: Aaron Brooks <aaron@brooks1.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add PRM structures to support Hermon Ethernet devices.
Signed-off-by: Itay Gazit <itaygazit@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Unlike Arbel, port parameters must be applied via a separate call to
SET_PORT, rather than as parameters to INIT_PORT.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The response to a received FLOGI should probably be sent to the peer
port ID assigned as a result of the WWPN comparison.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Mapping a single page at a time causes a several-second delay at
device initialisation time. Reduce this by mapping multiple pages at
a time, using the largest block sizes possible given the alignment
constraints.
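The block size selection can be sketched as below, assuming that each
block must be a power-of-two number of pages and must be aligned on
its own size, and that the start address and length are non-zero
multiples of the page size. The page size and function names are
hypothetical; the real driver issues firmware mapping commands rather
than merely counting them.

```c
/* Assumed page size for illustration */
#define TOY_PAGE_SIZE 4096UL

/* Find the largest power-of-two block that is naturally aligned at
 * "start" and no larger than "len".
 */
unsigned long toy_block_size ( unsigned long start, unsigned long len ) {
	unsigned long size = TOY_PAGE_SIZE;

	while ( ( ( start & ( ( size << 1 ) - 1 ) ) == 0 ) &&
		( ( size << 1 ) <= len ) ) {
		size <<= 1;
	}
	return size;
}

/* Count the mapping operations needed to cover a region */
unsigned int toy_map_count ( unsigned long start, unsigned long len ) {
	unsigned int count = 0;
	unsigned long size;

	while ( len ) {
		size = toy_block_size ( start, len );
		/* A real driver would map [start,start+size) here */
		start += size;
		len -= size;
		count++;
	}
	return count;
}
```

A seven-page region starting at address 0x1000 is covered by three
blocks (of one, two and four pages) instead of seven single-page
mappings.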
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Mapping a single page at a time causes a several-second delay at
device initialisation time. Reduce this by mapping multiple pages at
a time, using the largest block sizes possible given the alignment
constraints.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use individual page mappings rather than a single whole-region
mapping, to avoid the waste of memory that occurs due to the
constraint that each mapped block must be aligned on its own size.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Backport some changes from the Hermon driver to the Arbel driver.
Specifically:
o Rename reserved_lkey to lkey
o Add arbel_rate() to calculate transmission rates
o Structure code to allow for addition of RC queue pairs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the amount of ICM space required by choosing to order the
various allocations in approximately descending order of alignment
requirements.
This saves approximately 512kB of host memory.
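The effect of allocation order on padding can be seen in a small
sketch. The structure, function name and table sizes used here are
purely illustrative and are not taken from the hardware or the driver:

```c
#include <stddef.h>

/* Toy ICM table description (hypothetical) */
struct toy_icm_table {
	unsigned long size;	/* Table size in bytes */
	unsigned long align;	/* Required alignment (power of two) */
};

/* Calculate the total space consumed when tables are laid out in the
 * given order, including any alignment padding.
 */
unsigned long toy_icm_total ( const struct toy_icm_table *tables,
			      size_t count ) {
	unsigned long offset = 0;
	size_t i;

	for ( i = 0 ; i < count ; i++ ) {
		/* Round up to the alignment required by this table */
		offset = ( ( offset + tables[i].align - 1 ) &
			   ~( tables[i].align - 1 ) );
		offset += tables[i].size;
	}
	return offset;
}
```

With three tables of 0x1000, 0x4000 and 0x10000 bytes, each aligned on
its own size, ascending order consumes 0x20000 bytes while descending
order consumes only 0x15000 bytes.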
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The current method for ICM allocation exactly matches the addresses
chosen by the old Etherboot driver, but does not match the
specification. Some ICM tables (notably the queue pair context table)
therefore end up incorrectly aligned.
Fix by performing allocations as per the specification.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Improve the utility of debugging messages by including the relevant
port number, queue number (QPN, CQN, EQN), work queue entry (WQE)
number, and physical addresses wherever applicable.
Add arbel_dump_cqctx() for dumping a completion queue context and
arbel_dump_qpctx() for dumping a queue pair context.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This is a backport of commit 0b1222f ("[hermon] Randomise the
high-order bits of queue pair numbers") to the Arbel driver.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This is a backport of commit cd5a213 ("[hermon] Allow software GMA to
receive packets destined for QP1") to the Arbel driver.
This patch includes a correction to a bug in the autogenerated
hardware description header file.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Only port state change events are currently mapped to our event queue,
since those are the only events we are prepared to handle. This
ignores a potentially useful source of diagnostic information in the
case of unexpected failures.
Fix by mapping all events to the event queue; a build with debugging
enabled will therefore at least dump the raw content of the unexpected
events.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Only port state change events are currently mapped to our event queue,
since those are the only events we are prepared to handle. This
ignores a potentially useful source of diagnostic information in the
case of unexpected failures.
Fix by mapping all events to the event queue; a build with debugging
enabled will therefore at least dump the raw content of the unexpected
events.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE currently uses the first port's port GUID as the node GUID,
rather than using the (possibly distinct) real node GUID. This can
confuse opensm during the handover to a loaded OS: it thinks the port
already belongs to a different node and so discards our port
information with a warning message about duplicate ports. Everything
is picked up correctly on the second subnet sweep, after opensm has
established that the "old" node no longer exists, but this can delay
link-up unnecessarily by several seconds.
Fix by using the real node GUID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
No event is generated upon reaching INIT, so we must poll separately
for link state changes while we remain DOWN.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
No event is generated upon reaching INIT, so we must poll separately
for link state changes while we remain DOWN.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
ib_smc_update() potentially updates the Infiniband port state, and so
should almost always be followed by a call to ib_link_state_changed().
The one exception is the call made to ib_smc_update() before the
device is registered.
Fix by removing the explicit calls to ib_link_state_changed() from
drivers that use ib_smc_update(), adding a call to
ib_link_state_changed() within ib_smc_update() itself, and creating a
separate ib_smc_init() for use prior to device registration.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The sense key gives a first idea of what the problem might be, and so
is potentially useful in diagnosing problems in a non-debug build.
Signed-off-by: Michael Brown <mcb30@ipxe.org>