Commit Graph

1497 Commits (dcad73ca5ad3e1fe011c52a24036f67ad69fadc1)

Author SHA1 Message Date
Michael Brown 856ffe000e [ena] Limit submission queue fill level to completion queue size
The CREATE_CQ command is permitted to return a size smaller than
requested, which could leave us in a situation where the completion
queue could overflow.

Avoid overflow by limiting the submission queue fill level to the
actual size of the completion queue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26 19:37:54 +01:00
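The fill-level limit described in the [ena] commit above can be sketched roughly as follows.  This is a minimal illustration rather than the driver's actual code: the structure and function names (struct queue_pair, sq_fill_limit()) are hypothetical, and the only behaviour taken from the commit message is that the device may grant a completion queue smaller than the one requested.

```c
#include <assert.h>

/* Hypothetical bookkeeping for one queue pair (not the real driver
 * structures)
 */
struct queue_pair {
	unsigned int sq_count;      /* submission queue entries allocated */
	unsigned int cq_requested;  /* completion queue entries requested */
	unsigned int cq_actual;     /* completion queue entries granted */
};

/* Maximum number of submissions allowed to be outstanding at once */
static unsigned int sq_fill_limit ( const struct queue_pair *qp ) {

	/* The device may shrink the completion queue, never grow it */
	assert ( qp->cq_actual <= qp->cq_requested );

	/* Never post more entries than the completion queue can hold */
	return ( ( qp->sq_count < qp->cq_actual ) ?
		 qp->sq_count : qp->cq_actual );
}
```

With a 64-entry submission queue and a completion queue that was granted only 16 entries, this sketch would cap the fill level at 16.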
Michael Brown c5af41a6f5 [intelxl] Explicitly request a single queue pair for virtual functions
Current versions of the E810 PF driver fail to set the number of
in-use queue pairs in response to the CONFIG_VSI_QUEUES message.  When
the number of in-use queue pairs is less than the number of available
queue pairs, this results in some packets being directed to
nonexistent receive queues and hence silently dropped.

Work around this PF driver bug by explicitly configuring the number of
available queue pairs via the REQUEST_QUEUES message.  This message
triggers a VF reset that, in turn, requires us to reopen the admin
queue and issue an additional GET_RESOURCES message to restore the VF
to a functional state.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16 19:31:06 +01:00
Michael Brown 04879352c4 [intelxl] Allow for admin commands that trigger a VF reset
The RESET_VF admin queue command does not complete via the usual
mechanism, but instead requires us to poll registers to wait for the
reset to take effect and then reopen the admin queue.

Allow for the existence of other admin queue commands that also
trigger a VF reset, by separating out the logic that waits for the
reset to complete.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16 19:29:01 +01:00
Michael Brown 491c075f7f [intelxl] Negotiate virtual function API version 1.1
Negotiate API version 1.1 in order to allow access to virtual function
opcodes that are disallowed by default on the E810.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16 17:58:52 +01:00
Michael Brown b52ea20841 [intelxl] Show virtual function packet statistics for debugging
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16 17:58:46 +01:00
Michael Brown cad1cc6b44 [intelxl] Add driver for Intel 100 Gigabit Ethernet NICs
Add a driver for the E810 family of 100 Gigabit Ethernet NICs.  The
core datapath is identical to that of the 40 Gigabit XL710, and this
part of the code is shared between both drivers.  The admin queue
mechanism is sufficiently similar to make it worth reusing substantial
portions of the code, with separate implementations for several
commands to handle the (unnecessarily) breaking changes in data
structure layouts.  The major differences are in the mechanisms for
programming queue contexts (where the E810 abandons TX/RX symmetry)
and for configuring the transmit scheduler and receive filters: these
portions are sufficiently different to justify a separate driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12 16:15:17 +01:00
Michael Brown 6871a7de70 [intelxl] Use admin queue to set port MAC address and maximum frame size
Remove knowledge of the PRTGL_SA[HL] registers, and instead use the
admin queue to set the MAC address and maximum frame size.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12 13:24:06 +01:00
Michael Brown 727b034f11 [intelxl] Use admin queue to get port MAC address
Remove knowledge of the PRTPM_SA[HL] registers, and instead use the
admin queue to retrieve the MAC address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12 13:03:12 +01:00
Michael Brown 06467ee70f [intelxl] Defer fetching MAC address until after opening admin queue
Allow for the MAC address to be fetched using an admin queue command,
instead of reading the PRTPM_SA[HL] registers directly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12 13:03:12 +01:00
Michael Brown d6e36a2d73 [intelxl] Set maximum frame size to 9728 bytes as per datasheet
The PRTGL_SAH register contains the current maximum frame size, and is
not guaranteed on reset to contain the actual maximum frame size
supported by the hardware, which the datasheet specifies as 9728 bytes
(including the 4-byte CRC).

Set the maximum packet size to a hardcoded 9728 bytes instead of
reading from the PRTGL_SAH register.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12 13:03:12 +01:00
Michael Brown 99242bbe2e [intelxl] Always issue "clear PXE mode" admin queue command
Remove knowledge of the GLLAN_RCTL_0 register (which changes location
between the XL710 and E810 register maps), and instead unconditionally
issue the "clear PXE mode" command with the EEXIST error silenced.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 15:28:03 +01:00
Michael Brown faf26bf8b8 [intelxl] Allow expected admin queue command errors to be silenced
The "clear PXE mode" admin queue command will return an EEXIST error
if the device is already in non-PXE mode, but there is no other admin
queue command that can be used to determine whether the device has
already been switched into non-PXE mode.

Provide a mechanism to allow expected errors from a command to be
silenced, to allow the "clear PXE mode" command to be cleanly used
without needing to first check the GLLAN_RCTL_0 register value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 15:28:03 +01:00
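One way to picture the "silenced expected error" mechanism from the commit above is sketched below.  It is an assumption-laden illustration, not the intelxl implementation: admin_complete() and its parameters are hypothetical, and treating the silenced error as success is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical completion handler: "ret" is the result reported for
 * the command (zero or a negative errno value), and "silence" is the
 * single negative errno value that the caller expects and does not
 * want reported (zero for none).
 */
static int admin_complete ( int ret, int silence ) {

	/* This sketch uses the negative errno convention throughout */
	assert ( silence <= 0 );

	/* Success needs no further handling */
	if ( ret == 0 )
		return 0;

	/* Expected error: swallow it quietly */
	if ( ret == silence )
		return 0;

	/* Unexpected error: report it and pass it on */
	fprintf ( stderr, "admin command failed: %d\n", ret );
	return ret;
}
```

Under these assumptions, a "clear PXE mode" style caller would pass -EEXIST as the value to be silenced, so that "already in non-PXE mode" is not treated as a failure while any other error is still reported.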
Michael Brown f0ea19b238 [intelxl] Increase data buffer size to 4kB
At least one E810 admin queue command (Query Default Scheduling Tree
Topology) insists upon being provided with a 4kB data buffer, even
when the data to be returned is much smaller.

Work around this requirement by increasing the admin queue data buffer
size to 4kB.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 15:24:29 +01:00
Michael Brown fb69d14002 [intelxl] Separate virtual function driver definitions
Move knowledge of the virtual function data structures and admin
command definitions from intelxl.h to intelxlvf.h.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 14:53:57 +01:00
Michael Brown c220b93f31 [intelxl] Reuse admin command descriptor and buffer for VF responses
Remove the large static admin data buffer structure embedded within
struct intelxl_nic, and instead copy the response received via the
"send to VF" admin queue event to the (already consumed and completed)
admin command descriptor and data buffer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 14:53:57 +01:00
Michael Brown 67f8878e10 [intelxl] Handle admin events via a callback
The physical and virtual function drivers each care about precisely
one admin queue event type.  Simplify event handling by using a
per-driver callback instead of the existing weak function symbol.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11 14:53:54 +01:00
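The per-driver callback arrangement described in the commit above might look something like the following sketch.  All names and opcode values here are hypothetical.  The commit message does not say which event each driver cares about, so the choice of a link status event for the physical function driver is purely an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EVT_LINK_STATUS 0x0607  /* hypothetical event opcode */
#define EVT_SEND_TO_VF  0x0802  /* hypothetical event opcode */

struct nic {
	/* Per-driver admin event handler */
	void ( * handle ) ( struct nic *nic, unsigned int opcode );
	unsigned int handled;  /* events this driver acted upon */
	unsigned int ignored;  /* events this driver did not recognise */
};

/* Physical function driver: cares about exactly one event type */
static void pf_handle ( struct nic *nic, unsigned int opcode ) {
	if ( opcode == EVT_LINK_STATUS )
		nic->handled++;
	else
		nic->ignored++;
}

/* Virtual function driver: cares about a different single event type */
static void vf_handle ( struct nic *nic, unsigned int opcode ) {
	if ( opcode == EVT_SEND_TO_VF )
		nic->handled++;
	else
		nic->ignored++;
}

/* Shared event loop code: dispatch via the callback */
static void admin_event ( struct nic *nic, unsigned int opcode ) {
	assert ( nic->handle != NULL );
	nic->handle ( nic, opcode );
}
```

Compared with a weak function symbol, a callback of this kind makes the choice of handler explicit in each driver's own data rather than depending on which objects happen to be linked in.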
Michael Brown 9e46ffa924 [intelxl] Rename 8086:1889 PCI ID to "iavf"
The PCI device ID 8086:1889 is for the Intel Ethernet Adaptive Virtual
Function, which is a generic virtual function that can be exposed by
different generations of Intel hardware.

Rename the PCI ID from "xl710-vf-ad" to "iavf" to reflect that the
driver is not XL710-specific.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10 12:29:47 +01:00
Michael Brown ef70667557 [intelxl] Increase receive descriptor ring size to 64 entries
The E810 requires that receive descriptor rings have at least 64
entries (and are a multiple of 32 entries).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10 12:29:47 +01:00
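The ring size constraint stated in the commit above (at least 64 entries, and a multiple of 32 entries) can be expressed as a small check.  The macro and function names below are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define RX_RING_MIN   64  /* minimum number of entries */
#define RX_RING_ALIGN 32  /* entry count must be a multiple of this */

/* Check whether a receive ring entry count satisfies the constraint */
static bool rx_ring_count_ok ( unsigned int count ) {
	return ( ( count >= RX_RING_MIN ) &&
		 ( ( count % RX_RING_ALIGN ) == 0 ) );
}

/* Round a requested entry count up to the nearest acceptable value */
static unsigned int rx_ring_count_fixup ( unsigned int count ) {
	unsigned int rounded;

	if ( count < RX_RING_MIN )
		return RX_RING_MIN;
	rounded = ( ( ( count + RX_RING_ALIGN - 1 ) / RX_RING_ALIGN ) *
		    RX_RING_ALIGN );
	assert ( rx_ring_count_ok ( rounded ) );
	return rounded;
}
```

For example, 32 entries fails the minimum, 80 entries fails the multiple-of-32 rule, and 64 or 96 entries satisfy both.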
Michael Brown 9f5b9e3abb [intelxl] Negotiate API version for virtual function via admin queue
Do not attempt to use the admin commands to get the firmware version
and report the driver version for the virtual function driver, since
these will be rejected by the E810 firmware as invalid commands when
issued by a virtual function.  Instead, use the mailbox interface to
negotiate the API version with the physical function driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10 12:29:47 +01:00
Michael Brown b4216fa506 [intelxl] Use non-zero MSI-X vector for virtual function interrupts
The 100 Gigabit physical function driver requires a virtual function
driver to request that transmit and receive queues are mapped to MSI-X
vector 1 or higher.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10 12:29:47 +01:00
Michael Brown 1b61c2118c [intelxl] Fix invocation of intelxlvf_admin_queues()
The second parameter to intelxlvf_admin_queues() is a boolean used to
select the VF opcode, rather than the raw VF opcode itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10 12:29:45 +01:00
Michael Brown a202de385d [intelxl] Use function-level reset instead of PFGEN_CTRL.PFSWR
Remove knowledge of the PFGEN_CTRL register (which changes location
between XL710 and E810 register maps), and instead use PCIe FLR to
reset the physical function.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 16:43:36 +01:00
Michael Brown 0965cec53c [pci] Generalise function-level reset mechanism
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 16:39:40 +01:00
Michael Brown 9dfcdc04c8 [intelxl] Update list of PCI IDs
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
Michael Brown d8014b1801 [intelxl] Include admin command response data buffer in debug output
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
Michael Brown 319caeaa7b [intelxl] Identify rings consistently in debug messages
Use the tail register offset (which exists for all ring types) as the
ring identifier in all relevant debug messages.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
Michael Brown 814aef68c5 [intelxl] Add missing padding bytes to receive queue context
For the sake of completeness, ensure that all 32 bytes of the receive
queue context are programmed (including the unused final 8 bytes).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
Michael Brown 725f0370fa [intelxl] Fix bit width of function number in PFFUNC_RID register
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
Michael Brown 5d3fad5c10 [intelxl] Fix retrieval of switch configuration via admin queue
Commit 8f3e648 ("[intelxl] Use one admin queue buffer per admin queue
descriptor") changed the API for intelxl_admin_command() such that the
caller now constructs the command directly within the next available
descriptor ring entry, rather than relying on intelxl_admin_command()
to copy the descriptor to and from the descriptor ring.

This introduced a regression in intelxl_admin_switch(), since the
second and subsequent iterations of the loop will not have constructed
a valid command in the new descriptor ring entry before calling
intelxl_admin_command().

Fix by constructing the command within the loop.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08 15:59:55 +01:00
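The regression fixed by the commit above comes down to where the command is constructed relative to the loop.  The following sketch simulates that pattern with a toy descriptor ring.  Everything in it (the structures, the opcode value, the three-page simulated device) is hypothetical and greatly simplified compared with the real admin queue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE     4
#define OPCODE_SWITCH 0x0200  /* hypothetical opcode value */
#define TOTAL_PAGES   3       /* pages returned by the simulated device */

struct desc {
	uint16_t opcode;  /* command opcode */
	uint16_t next;    /* continuation token (zero when complete) */
};

struct admin_queue {
	struct desc ring[RING_SIZE];
	unsigned int index;  /* next descriptor to be used */
};

/* Get the next available descriptor, cleared ready for construction */
static struct desc * admin_next_desc ( struct admin_queue *q ) {
	struct desc *d = &q->ring[ q->index % RING_SIZE ];

	memset ( d, 0, sizeof ( *d ) );
	return d;
}

/* Simulated device: execute the command already in the ring entry */
static int admin_command ( struct admin_queue *q ) {
	struct desc *d = &q->ring[ q->index++ % RING_SIZE ];

	/* An entry with no command constructed in it is rejected */
	if ( d->opcode != OPCODE_SWITCH )
		return -1;
	d->next = ( ( ( d->next + 1 ) < TOTAL_PAGES ) ? ( d->next + 1 ) : 0 );
	return 0;
}

/* Retrieve all pages, returning the page count (or -1 on error) */
static int admin_switch ( struct admin_queue *q ) {
	uint16_t next = 0;
	int pages = 0;

	do {
		/* Construct the command within the loop, so that
		 * every iteration has a valid descriptor.
		 */
		struct desc *d = admin_next_desc ( q );
		d->opcode = OPCODE_SWITCH;
		d->next = next;
		if ( admin_command ( q ) != 0 )
			return -1;
		next = d->next;
		pages++;
	} while ( next != 0 );

	return pages;
}
```

If the two lines that fill in the descriptor were moved above the loop, the second iteration in this simulation would submit an empty ring entry and fail, which mirrors the regression the commit describes.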
Michael Brown 87f1796f15 [ecm] Treat ACPI MAC address as being a non-permanent MAC address
When applying an ACPI-provided system-specific MAC address, apply it
to netdev->ll_addr rather than netdev->hw_addr.  This allows iPXE
scripts to access the permanent MAC address via the ${netX/hwaddr}
setting (and thereby provides scripts with a mechanism to ascertain
that the NIC is using a MAC address other than its own permanent
hardware address).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-05-23 12:23:53 +01:00
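The distinction drawn in the commit above between the permanent hardware address and the current link-layer address can be illustrated with a toy structure.  The field names hw_addr and ll_addr follow the commit message; struct fake_netdev and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Toy stand-in for a network device (not the real struct net_device) */
struct fake_netdev {
	uint8_t hw_addr[ETH_ALEN];  /* permanent hardware address */
	uint8_t ll_addr[ETH_ALEN];  /* address currently in use */
};

/* Apply a system-specific MAC address without touching hw_addr */
static void apply_system_mac ( struct fake_netdev *netdev,
			       const uint8_t *mac ) {
	memcpy ( netdev->ll_addr, mac, ETH_ALEN );
}

/* Check whether the device is using its own permanent address */
static bool using_own_mac ( const struct fake_netdev *netdev ) {
	return ( memcmp ( netdev->hw_addr, netdev->ll_addr,
			  ETH_ALEN ) == 0 );
}
```

Because hw_addr is left untouched, a comparison of the two fields is enough to tell whether an overriding address is in effect.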
Michael Brown 04288974f6 [pci] Ensure that pci_read_config() initialises all fields
As per the general pattern for initialisation functions in iPXE,
pci_init() saves code size by assuming that the caller has already
zeroed the underlying storage (e.g. as part of zeroing a larger
containing structure).  There are several places within the code where
pci_init() is deliberately used to initialise a transient struct
pci_device without zeroing the entire structure, because the calling
code knows that only the PCI bus:dev.fn address is required to be
initialised (e.g. when reading from PCI configuration space).

Ensure that using pci_init() followed by pci_read_config() will fully
initialise the struct pci_device even if the caller did not previously
zero the underlying storage, since Coverity reports that there are
several places in the code that rely upon this.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-16 12:55:28 +00:00
Michael Brown e1cedbc0d4 [console] Support AltGr to access ASCII characters via remapping
Several keyboard layouts define ASCII characters as accessible only
via the AltGr modifier.  Add support for this modifier to ensure that
all ASCII characters are accessible.

Experiments suggest that the BIOS console is likely to fail to
generate ASCII characters when the AltGr key is pressed.  Work around
this limitation by accepting LShift+RShift (which will definitely
produce an ASCII character) as a synonym for AltGr.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-15 12:50:26 +00:00
Michael Brown f2a59d5973 [console] Centralise handling of key modifiers
Handle Ctrl and CapsLock key modifiers within key_remap(), to provide
consistent behaviour across different console types.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-15 11:58:50 +00:00
Michael Brown 0bbd896783 [console] Handle remapping of scancode 86
The key with scancode 86 appears in the position between left shift
and Z on a US keyboard, where it typically fails to exist entirely.
Most US keyboard maps define this nonexistent key as generating "\|",
with the notable exception of "loadkeys" which instead reports it as
generating "<>".  Both of these mapping choices duplicate keys that
exist elsewhere in the map, which causes problems for our ASCII-based
remapping mechanism.

Work around these quirks by treating the key as generating "\|" with
the high bit set, and making it subject to remapping.  Where the BIOS
generates "\|" as expected, this allows us to remap to the correct
ASCII value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-10 13:59:32 +00:00
Michael Brown eb92ba0a4f [usb] Handle upper/lower case and Ctrl-<key> after applying remapping
Some keyboard layouts (e.g. "fr") swap letter and punctuation keys.
Apply the logic for upper and lower case and for Ctrl-<key> only after
applying remapping, in order to handle these layouts correctly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-10 13:11:27 +00:00
Michael Brown 468980db2b [usb] Support keyboard remapping via the native USB keyboard driver
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-02-10 13:11:27 +00:00
Michael Brown 562c74e1ea [efi] Run ExitBootServices shutdown hook at TPL_NOTIFY
On some systems (observed with the Thunderbolt ports on a ThinkPad X1
Extreme Gen3 and a ThinkPad P53), if the IOMMU is enabled then the
system firmware will install an ExitBootServices notification event
that disables bus mastering on the Thunderbolt xHCI controller and all
PCI bridges, and destroys any extant IOMMU mappings.  This leaves the
xHCI controller unable to perform any DMA operations.

As described in commit 236299b ("[xhci] Avoid DMA during shutdown if
firmware has disabled bus mastering"), any subsequent DMA operation
attempted by the xHCI controller will end up completing after the
operating system kernel has reenabled bus mastering, resulting in a
DMA operation to an area of memory that the hardware is no longer
permitted to access and, on Windows with the Driver Verifier enabled,
a STOP 0xE6 (DRIVER_VERIFIER_DMA_VIOLATION).

That commit avoids triggering any DMA attempts during the shutdown of
the xHCI controller itself.  However, this is not a complete solution
since any attached and opened USB device (e.g. a USB NIC) may
asynchronously trigger DMA attempts that happen to occur after bus
mastering has been disabled but before we reset the xHCI controller.

Avoid this problem by installing our own ExitBootServices notification
event at TPL_NOTIFY, thereby causing it to be invoked before the
firmware's own ExitBootServices notification event that disables bus
mastering.

This unsurprisingly causes the shutdown hook itself to be invoked at
TPL_NOTIFY, which causes a fatal error when later code attempts to
raise the TPL to TPL_CALLBACK (which is a lower TPL).  Work around
this problem by redefining the "internal" iPXE TPL to be variable, and
set this internal TPL to TPL_NOTIFY when the shutdown hook is invoked.

Avoid calling into an underlying SNP protocol instance from within our
shutdown hook at TPL_NOTIFY, since the underlying SNP driver may
attempt to raise the TPL to TPL_CALLBACK (which would cause a fatal
error).  Leaving the underlying SNP device running is safe, since the
underlying device must, in any case, have installed its own
ExitBootServices hook if any shutdown actions are required.

Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-11-23 15:55:01 +00:00
Benedikt Braunger 3ad27fbe78 [intel] Add PCI ID for Intel X553 0x15e4
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-11-22 12:42:18 +00:00
Michael Brown 236299baa3 [xhci] Avoid DMA during shutdown if firmware has disabled bus mastering
On some systems (observed with the Thunderbolt ports on a ThinkPad X1
Extreme Gen3 and a ThinkPad P53), the system firmware will disable bus
mastering on the xHCI controller and all PCI bridges at the point that
ExitBootServices() is called if the IOMMU is enabled.  This leaves the
xHCI controller unable to shut down cleanly since all commands will
fail with a timeout.

Commit 85eb961 ("[xhci] Allow for permanent failure of the command
mechanism") allows us to detect that this has happened and respond
cleanly.  However, some unidentified hardware component (either the
xHCI controller or one of the PCI bridges) seems to manage to enqueue
the attempted DMA operation and eventually complete it after the
operating system kernel has reenabled bus mastering.  This results in
a DMA operation to an area of memory that the hardware is no longer
permitted to access.  On Windows with the Driver Verifier enabled,
this will result in a STOP 0xE6 (DRIVER_VERIFIER_DMA_VIOLATION).

Work around this problem by detecting when bus mastering has been
disabled, and immediately failing the device to avoid initiating any
further DMA attempts.

Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-11-12 22:27:25 +00:00
Michael Brown 85eb961bf9 [xhci] Allow for permanent failure of the command mechanism
Some xHCI controllers (observed with the Thunderbolt ports on a
ThinkPad X1 Extreme Gen3 and a ThinkPad P53) seem to suffer a
catastrophic failure at the point that ExitBootServices() is called if
the IOMMU is enabled.  The symptoms appear to be consistent with
another UEFI driver (e.g. the IOMMU driver, or the Thunderbolt driver)
having torn down the DMA mappings, leaving the xHCI controller unable
to write to host memory.  The observable effect is that all commands
fail with a timeout, and attempts to abort command execution similarly
fail since the xHCI controller is unable to report the abort
completion.

Check for failure to abort a command, and respond by performing a full
device reset (as recommended by the xHCI specification) and by marking
the device as permanently failed.

Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-10-28 23:18:07 +01:00
Aaron Young f24a2794e1 [virtio] Update driver to use DMA API
Signed-off-by: Aaron Young <aaron.young@oracle.com>
2021-10-28 13:19:30 +01:00
Michael Brown 05a76acc6d [ecm] Use ACPI-provided system-specific MAC address if present
Use the "system MAC address" provided within the DSDT/SSDT if such an
address is available and has not already been assigned to a network
device.

Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-09-09 12:56:02 +01:00
Michael Brown 91e147213c [ecm] Expose USB vendor/device information to ecm_fetch_mac()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-09-09 12:52:12 +01:00
Michael Brown 4aa0375821 [rdc] Add driver for RDC R6040 embedded NIC
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-06-28 12:32:19 +01:00
Michael Brown 5622575c5e [realtek] Work around hardware bug on RTL8211B
The RTL8211B seems to have a bug that prevents the link from coming up
unless the MII_MMD_DATA register is cleared.

The Linux kernel driver applies this workaround (in rtl8211b_resume())
only to the specific RTL8211B PHY model, along with a matching
workaround to set bit 9 of MII_MMD_DATA when suspending the PHY.
Since we have no need to ever suspend the PHY, and since writing a
zero ought to be harmless, we just clear the register unconditionally.

Debugged-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Tested-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-06-24 12:36:46 +01:00
Michael Brown 065dce8d59 [ath5k] Avoid returning uninitialised data on EEPROM read errors
Originally-implemented-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-06-04 14:16:44 +01:00
Joseph 059c4dc688 [bnxt] Use hexadecimal values in PCI_ROM entries
Use hexadecimal values instead of macros in PCI_ROM entries so that
the Perl script can parse them correctly.  Move PCI_ROM entries from
the header file to the C file.  Integrate the bnxt_vf_nics array into
the PCI_ROM entries by introducing a BNXT_FLAG_PCI_VF flag in the
driver_data field.  Add whitespace in PCI_ROM entries for style
consistency.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-05-17 22:35:53 +01:00
Christian Nilsson adb2ed907e [intel] Add PCI ID for I219-V and -LM 10 to 15
Signed-off-by: Christian Nilsson <nikize@gmail.com>
2021-05-17 22:29:07 +01:00
Michael Brown 85d179f2c6 [xen] Support scatter-gather to allow for jumbo frames
The use of jumbo frames for the Xen netfront virtual NIC requires the
use of scatter-gather ("feature-sg"), with the receive descriptor ring
becoming a list of page-sized buffers and the backend using as many
page buffers as required for each packet.

Since iPXE's abstraction of an I/O buffer does not include any sort of
scatter-gather list, this requires an extra allocation and copy on the
receive datapath for any packet that spans more than a single page.

This support is required in order to successfully boot an AWS EC2
virtual machine (with non-enhanced networking) via iSCSI if jumbo
frames are enabled, since the netback driver used in EC2 seems not to
allow "feature-sg" to be renegotiated once the Linux kernel driver
takes over.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-04-14 16:33:41 +01:00
Michael Brown 0be8491b71 [pci] Avoid scanning nonexistent buses when using PCIAPI_DIRECT
There is no method for obtaining the number of PCI buses when using
PCIAPI_DIRECT, and we therefore currently scan all possible bus
numbers.  This can cause a several-second startup delay in some
virtualised environments, since PCI configuration space access will
necessarily require the involvement of the hypervisor.

Ameliorate this situation by defaulting to scanning only a single bus,
and expanding the number of PCI buses to accommodate any subordinate
buses that are detected during enumeration.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-04-10 15:05:05 +01:00
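The enumeration strategy described in the commit above (start with a single bus, and grow the bus count as subordinate buses are discovered) can be sketched against a simulated topology.  The function name and the array-based topology are hypothetical.  Real enumeration would read bridge configuration space rather than a lookup table.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BUSES 256

/* Scan buses, returning the number of buses found to exist.
 *
 * "subordinate" is a simulated topology with MAX_BUSES entries: for
 * each bus number, the highest subordinate bus number reported by
 * any bridge found on that bus (or zero if there are no bridges).
 */
static unsigned int scan_buses ( const uint8_t *subordinate ) {
	unsigned int count = 1;  /* assume that only bus 0 exists */
	unsigned int bus;

	for ( bus = 0 ; bus < count ; bus++ ) {
		/* Expand the scan range to cover subordinate buses */
		if ( ( subordinate[bus] + 1U ) > count )
			count = ( subordinate[bus] + 1U );
		assert ( count <= MAX_BUSES );
	}
	return count;
}
```

In a topology where a bridge on bus 0 reports subordinate bus 2 and a bridge on bus 2 reports subordinate bus 3, this sketch visits four buses instead of all 256 possible bus numbers.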
Tyler J. Stachecki c0346dbb49 [intel] Add additional PCI device ID to table
Adding this missing identifier allows the X557-AT2 chipset seen on (at
least) Super Micro A2SDI-H-TF motherboards to function with iPXE.

Signed-off-by: Tyler J. Stachecki <stachecki.tyler@gmail.com>
2021-04-10 14:56:00 +01:00
Michael Brown 7b963310aa [linux] Allow arbitrary settings to be applied to Linux devices
Allow arbitrary settings to be specified on the Linux command line.
For example:

    ./bin-x86_64-linux/slirp.linux \
          --net slirp,testserver=qa-test.ipxe.org

This can be useful when using the Linux userspace build to test
embedded scripts, since it allows arbitrary parameters to be passed
directly on the command line.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-03-02 19:35:11 +00:00
Michael Brown 2b5d3f582f [slirp] Add libslirp driver for Linux
Add a driver using libslirp to provide a virtual network interface
without requiring root permissions on the host.  This simplifies the
process of running iPXE as a Linux userspace application with network
access.  For example:

  make bin-x86_64-linux/slirp.linux
  ./bin-x86_64-linux/slirp.linux --net slirp

libslirp will provide a built-in emulated DHCP server and NAT router.
Settings such as the boot filename may be controlled via command-line
options.  For example:

  ./bin-x86_64-linux/slirp.linux \
      --net slirp,filename=http://192.168.0.1/boot.ipxe

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-03-02 11:09:57 +00:00
Michael Brown f309d7a7b7 [linux] Use host glibc system call wrappers
When building as a Linux userspace application, iPXE currently
implements its own system calls to the host kernel rather than relying
on the host's C library.  The output binary is statically linked and
has no external dependencies.

This matches the general philosophy of other platforms on which iPXE
runs, since there are no external libraries available on either BIOS
or UEFI bare metal.  However, it would be useful for the Linux
userspace application to be able to link against host libraries such
as libslirp.

Modify the build process to perform a two-stage link: first picking
out the requested objects in the usual way from blib.a but with
relocations left present, then linking again with a helper object to
create a standard hosted application.  The helper object provides the
standard main() entry point and wrappers for the Linux system calls
required by the iPXE Linux drivers and interface code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-28 23:28:23 +00:00
Bruce Rogers 19d0fab40f [ath5k] Add missing AR5K_EEPROM_READ in ath5k_eeprom_read_turbo_modes
The GCC 11 compiler pointed out something that apparently no previous
compiler noticed: in ath5k_eeprom_read_turbo_modes(), the local
variable val is used uninitialized.  From what I can see, the code is
just missing an initial AR5K_EEPROM_READ.  Add it right before the
switch statement.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-16 23:35:24 +00:00
Michael Brown 0049243367 [ena] Switch to two-phase reset mechanism
The Linux and FreeBSD drivers for the (totally undocumented) ENA
adapters use a two-phase reset mechanism: first set ENA_CTRL.RESET and
wait for this to be reflected in ENA_STAT.RESET, then clear
ENA_CTRL.RESET and again wait for it to be reflected in
ENA_STAT.RESET.

The iPXE driver currently assumes a self-clearing reset mechanism,
which appeared to work at the time that the driver was created but
seems no longer to function, at least on the t3.nano and t3a.nano
instance types found in eu-west-1.

Switch to a simplified version of the two-phase reset mechanism as
used by Linux and FreeBSD.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-13 19:08:45 +00:00
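The two-phase reset sequence described in the commit above can be sketched against a simulated device.  The bit positions, the poll limit, and the struct fake_ena model (in which the status bit follows the control bit after a few polls) are all assumptions made for illustration.  None of them are taken from the real hardware or driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTRL_RESET      0x00000001UL  /* hypothetical bit position */
#define STAT_RESET      0x00000001UL  /* hypothetical bit position */
#define RESET_MAX_POLLS 100           /* hypothetical poll limit */

/* Simulated device: STAT.RESET follows CTRL.RESET after "lag" polls */
struct fake_ena {
	uint32_t ctrl;
	uint32_t stat;
	unsigned int lag;      /* polls before a change becomes visible */
	unsigned int pending;  /* polls remaining until stat catches up */
};

static void fake_write_ctrl ( struct fake_ena *dev, uint32_t value ) {
	/* Only the reset bit is modelled in this sketch */
	assert ( ( value & ~CTRL_RESET ) == 0 );
	dev->ctrl = value;
	dev->pending = dev->lag;
}

static uint32_t fake_read_stat ( struct fake_ena *dev ) {
	if ( dev->pending )
		dev->pending--;
	else
		dev->stat = ( ( dev->ctrl & CTRL_RESET ) ? STAT_RESET : 0 );
	return dev->stat;
}

/* Wait for STAT.RESET to take the expected value */
static bool wait_stat ( struct fake_ena *dev, uint32_t expected ) {
	unsigned int i;

	for ( i = 0 ; i < RESET_MAX_POLLS ; i++ ) {
		if ( ( fake_read_stat ( dev ) & STAT_RESET ) == expected )
			return true;
	}
	return false;
}

/* Two-phase reset: assert and wait, then deassert and wait again */
static bool two_phase_reset ( struct fake_ena *dev ) {
	fake_write_ctrl ( dev, CTRL_RESET );
	if ( ! wait_stat ( dev, STAT_RESET ) )
		return false;
	fake_write_ctrl ( dev, 0 );
	return wait_stat ( dev, 0 );
}
```

A self-clearing model would instead write the reset bit once and wait for it to read back as zero.  The sketch shows why that is not equivalent here: the simulated status bit never changes until the control bit is explicitly cleared.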
Christian Iversen 1af0fe04f8 [hermon] Add support for ConnectX-3 based cards
After a ton of tedious work, I am pleased to finally introduce full
support for ConnectX-3 cards in iPXE!

The work has been done by finding all publicly available versions of
the Mellanox Flexboot sources, cleaning them up, synthesizing a git
history from them, cleaning out non-significant changes, and
correlating with the iPXE upstream git history.

After this, a proof-of-concept diff was produced that allowed iPXE to
be compiled with rudimentary ConnectX-3 support.  This diff was over
10k lines, and contained many changes that were not part of the core
driver.

Special thanks to Michael Brown <mcb30@ipxe.org> for answering my
barrage of questions, and helping brainstorm the development along the
way.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-02-02 00:37:43 +01:00
Michael Brown 6f1cb791ee [hermon] Avoid parsing length field on completion errors
The CQE length field will not be valid for a completion in error.
Avoid parsing the length field and just call the completion handler
directly.

In debug builds, also dump the queue pair context to allow for
inspection of the error.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 23:08:49 +00:00
Michael Brown 8747241b3e [hermon] Make hermon_dump_xxx() functions no-ops on non-debug builds
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 23:00:05 +00:00
Michael Brown 410566cef7 [hermon] Minimise reset time
Check for reset completion by waiting for the device to respond to PCI
configuration cycles, as documented in the Programmer's Reference
Manual.  On the original ConnectX HCA, this reduces the time spent on
reset from 1000ms down to 1ms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 22:29:30 +00:00
Christian Iversen 7b2b35981f [hermon] Throttle debug output when sensing port type
When auto-detecting the initial port type, the Hermon driver will spam
the debug output without hesitation.  Add a short delay in each
iteration to fix this.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-02-01 12:35:22 +00:00
Christian Iversen 299c671f57 [hermon] Add a debug notice when initialization is complete
Signed-off-by: Christian Iversen <ci@iversenit.dk>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 12:30:25 +00:00
Christian Iversen 8b07c88df8 [hermon] Add support for port management event
Inspired by Flexboot, the function hermon_event_port_mgmnt_change() is
added to handle the HERMON_EV_PORT_MGMNT_CHANGE event type, which
updates the Infiniband subsystem.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 11:44:54 +00:00
Christian Iversen d948ac6c61 [hermon] Adjust Ethernet work queue size
Hermon Ethernet work queues have more RX than TX entries, unlike most
other drivers.  This is possibly the source of some stochastic
deadlocks previously experienced with this driver.

Update the sizes to be in line with other drivers, and make them
slightly larger for better performance.  These new queue sizes have
been found to work well with ConnectX-3 hardware.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 11:12:26 +00:00
Michael Brown e62c3e3513 [hermon] Use reset value suitable for ConnectX-3
The programming documentation states that the reset magic value is
"0x00000001 (Big Endian)", and the current code matches this by using
the value 0x01000000 for the implicitly little-endian writel().

Inspection of the FlexBoot source code reveals an exciting variety of
reset values, some suggestive of confusion around endianness.

Experimentation suggests that the value 0x01000001 works reliably
across a wide range of hardware.

Debugged-by: Christian Iversen <ci@iversenit.dk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-02-01 01:53:15 +00:00
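The byte-order arithmetic behind the commit above can be checked with a small helper.  The function names are hypothetical.  The only facts relied upon are the values quoted in the commit message and the definition of a 32-bit byte swap.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value */
static uint32_t swap32 ( uint32_t value ) {
	return ( ( ( value & 0x000000ffU ) << 24 ) |
		 ( ( value & 0x0000ff00U ) << 8 ) |
		 ( ( value & 0x00ff0000U ) >> 8 ) |
		 ( ( value & 0xff000000U ) >> 24 ) );
}

/* Value to hand to a little-endian register write so that the device
 * observes "be_value" when interpreting the register as big-endian
 */
static uint32_t le_write_value_for_be ( uint32_t be_value ) {
	/* Sanity check: swapping twice must round-trip */
	assert ( swap32 ( swap32 ( be_value ) ) == be_value );
	return swap32 ( be_value );
}
```

The documented magic value 0x00000001 (big-endian) therefore corresponds to 0x01000000 for a little-endian write, matching the original code.  The value 0x01000001 is unchanged by a byte swap, so it reads the same under either interpretation.  This is an observation about the number itself, not an explanation offered by the commit message for why it works.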
Christian Iversen 2e3d5909ee [hermon] Clean up whitespace in hermon.c
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-02-01 01:48:29 +00:00
Christian Iversen 79031fee21 [iscsi] Update link to iBFT reference manual
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-02-01 01:27:08 +01:00
Michael Brown def46cf344 [hermon] Limit link poll frequency in DOWN state
Some older versions of the hardware (and/or firmware) do not report an
event when an Infiniband link reaches the INIT state.  The driver
works around this missing event by calling ib_smc_update() on each
event queue poll while the link is in the DOWN state.

Commit 6cb12ee ("[hermon] Increase polling rate for command
completions") mitigated the cost of this workaround by reducing the
time taken to issue each command invoked by ib_smc_update().
Experimentation shows that the impact is still significant: for
example, in a situation where an unplugged port is opened, the
throughput on the other port can be reduced by over 99%.

Fix by throttling the rate at which link polling is attempted.

Debugged-by: Christian Iversen <ci@iversenit.dk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-31 23:29:45 +00:00
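Illustration for the [hermon] link poll entry above (not part of the commit): a minimal sketch of one way to throttle a poll, assuming a free-running 32-bit tick counter. The structure, function name and interval are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed minimum number of ticks between link polls (illustrative value) */
#define LINK_POLL_INTERVAL_TICKS 100U

/* Hypothetical per-port throttle state */
struct link_poller {
    uint32_t last_poll;  /* tick count at the most recent poll */
    bool started;        /* false until the first poll has happened */
};

/* Return true if a link poll should be attempted now, and record it.
 * Unsigned subtraction keeps the comparison correct across counter wrap.
 */
bool link_poll_due(struct link_poller *p, uint32_t now)
{
    if (p->started &&
        (uint32_t)(now - p->last_poll) < LINK_POLL_INTERVAL_TICKS)
        return false;
    p->started = true;
    p->last_poll = now;
    return true;
}
```

An event queue poll would call link_poll_due() while the link is DOWN and skip the expensive update whenever it returns false.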
Christian Iversen 43d72d0087 [hermon] Perform clean MPT unmap on device shutdown
This change is ported from FlexBoot sources.  When stopping a Hermon
device, perform hermon_unmap_mpt() which runs HERMON_HCR_HW2SW_MPT to
bring the Memory Protection Table (MPT) back to software control.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-29 00:46:53 +00:00
Christian Iversen 699b9f1d1b [hermon] Use Ethernet MAC as eIPoIB local EMAC
The eIPoIB local Ethernet MAC is currently constructed from the port
GUID.  Given a base GUID/MAC value of N, Mellanox seems to populate:

  Node GUID:   N + 0
  Port 1 GUID: N + 1
  Port 2 GUID: N + 2

and

  Port 1 MAC:  N + 0
  Port 2 MAC:  N + 1

This causes a duplicate local MAC address when port 1 is configured as
Infiniband and port 2 as Ethernet, since both will derive their MAC
address as (N + 1).

Fix by using the port's Ethernet MAC as the eIPoIB local EMAC.  This
is a behavioural change that could potentially break configurations
that rely on the local EMAC value, such as a DHCP server relying on
the chaddr field for DHCP reservations.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-29 00:13:46 +00:00
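Illustration for the [hermon] eIPoIB entry above (not part of the commit): a minimal sketch of the collision, treating GUIDs and MACs as plain integers offset from a base value N exactly as the message's table does. The function names are hypothetical, and the real GUID-to-MAC derivation is not modelled.

```c
#include <stdint.h>

/* Port GUID per the table above: port 1 -> N + 1, port 2 -> N + 2 */
uint64_t port_guid(uint64_t base, unsigned int port)
{
    return base + port;
}

/* Port Ethernet MAC per the table above: port 1 -> N + 0, port 2 -> N + 1 */
uint64_t port_mac(uint64_t base, unsigned int port)
{
    return base + port - 1;
}
```

With port 1 as Infiniband (local EMAC derived from its GUID) and port 2 as Ethernet, port_guid(N, 1) and port_mac(N, 2) are both N + 1. Deriving the EMAC from port_mac() for both port types removes the overlap.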
Christian Iversen 6cb12ee2b0 [hermon] Increase polling rate for command completions
Some older versions of the hardware (and/or firmware) do not report an
event when an Infiniband link reaches the INIT state.  The driver
works around this missing event by calling ib_smc_update() on each
event queue poll while the link is in the DOWN state.  This results in
a very large number of commands being issued while any open Infiniband
link is in the DOWN state (e.g. unplugged), to the point that the 1ms
delay from waiting for each command to complete will noticeably affect
responsiveness.

Fix by decreasing the command completion polling delay from 1ms to
10us.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-28 23:47:00 +00:00
Michael Brown 7d32225b55 [hermon] Add event queue debug functions
Add hermon_dump_eqctx() for dumping the event queue context and
hermon_dump_eqes() for dumping any unconsumed event queue entries.

Originally-implemented-by: Christian Iversen <ci@iversenit.dk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-28 22:30:56 +00:00
Christian Iversen 7c40227e18 [hermon] Increase command timeout from 2 to 10 seconds
Some commands (particularly in relation to device initialization) can
occasionally take longer than 2 seconds, and the Mellanox documentation
recommends a 10 second timeout.

Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-28 20:55:14 +00:00
Michael Brown cd126c41bb [hermon] Add assorted debug error messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-28 20:52:36 +00:00
Michael Brown ce45c8dc21 [hermon] Show "issuing command" messages only at DBGLVL_EXTRA
Originally-implemented-by: Christian Iversen <ci@iversenit.dk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-28 17:29:36 +00:00
Christian Iversen a2893dc18a [hermon] Reorganize PCI ROM list and document well-known product names
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-28 17:23:05 +00:00
Christian Iversen 0e788c8eda [golan] Backport typo fix in nodnic_prm.h: s/HERMON/NODNIC/
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-28 17:19:22 +00:00
Christian Iversen 36a892a7c7 [arbel] Clean up whitespace in MT25218_PRM.h header
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-28 17:14:08 +00:00
Christian Iversen 414c842f06 [hermon] Clean up whitespace in MT25408_PRM.h header
Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-28 17:10:47 +00:00
Christian Iversen b9de7e6eda [infiniband] Require drivers to specify the number of ports
Require drivers to report the total number of Infiniband ports.  This
is necessary to report the correct number of ports on devices with
dynamic port types.

For example, dual-port Mellanox cards configured for (eth, ib) would
be rejected by the subnet manager, because they report using "port 2,
out of 1".

Signed-off-by: Christian Iversen <ci@iversenit.dk>
2021-01-27 01:15:35 +00:00
Michael Brown 8e3826aa10 [build] Inhibit spurious array bounds warning on some versions of gcc
Some versions of gcc (observed with gcc 9.3.0 on NixOS Linux) produce
a spurious warning about an out-of-bounds array access for the
isa_extra_probe_addrs[] array.

Work around this compiler bug by redefining the array index as a
signed long, which seems to somehow avoid this spurious warning.

Debugged-by: Manuel Mendez <mmendez534@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-15 20:54:27 +00:00
Manuel Mendez a5fb41873d [isa] Add missing #include <config/isa.h>
Signed-off-by: Manuel Mendez <mmendez534@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-13 23:01:27 +00:00
Michael Brown c42f31bc8a [xhci] Avoid false positive Coverity warning
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-04 09:37:59 +00:00
Michael Brown 7ce3b84050 [xhci] Show meaningful error messages after command failures
Ensure that any command failure messages are followed up with an error
message indicating what the failed command was attempting to perform.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-03 19:12:00 +00:00
Michael Brown 017b345d5a [xhci] Fail attempts to issue concurrent commands
The xHCI driver can handle only a single command TRB in progress at
any one time.  Immediately fail any attempts to issue concurrent
commands (which should not occur in normal operation).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-01-03 19:08:49 +00:00
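Illustration for the [xhci] concurrent command entry above (not part of the commit): a minimal sketch of a single-slot guard that rejects a second command while one is outstanding. The structure, function names and the choice of -EBUSY are assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command state: at most one command in progress */
struct cmd_slot {
    const void *pending;  /* command in progress, or NULL if idle */
};

/* Try to issue a command; fail immediately if one is already pending */
int cmd_issue(struct cmd_slot *slot, const void *cmd)
{
    if (slot->pending)
        return -EBUSY;
    slot->pending = cmd;
    return 0;
}

/* Mark the pending command as complete, freeing the slot */
void cmd_complete(struct cmd_slot *slot)
{
    slot->pending = NULL;
}
```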
Martin Habets da491eaae7 [sfc] Update email addresses
Email addresses at solarflare.com will stop working, so update them.
Remove the email address for Shradha Shah, as she is no longer
involved.  Update copyright notices for the files touched.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-12-28 18:41:55 +00:00
Mohammed Taha ce841946df [golan] Add new PCI IDs
Signed-off-by: Mohammed <mohammedt@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-12-28 13:55:30 +00:00
Michael Brown f47a45ea2d [iphone] Add iPhone tethering driver
USB tethering via an iPhone is unreasonably complicated due to the
requirement to perform a pairing operation that involves establishing
a TLS session over a completely unrelated USB function that speaks a
protocol that is almost, but not quite, entirely unlike TCP.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-12-16 13:29:06 +00:00
Michael Brown 13a6d17296 [xhci] Update driver to use DMA API
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-29 11:25:40 +00:00
Michael Brown 8d337ecdae [dma] Move I/O buffer DMA operations to iobuf.h
Include a potential DMA mapping within the definition of an I/O
buffer, and move all I/O buffer DMA mapping functions from dma.h to
iobuf.h.  This avoids the need for drivers to maintain a separate list
of DMA mappings for each I/O buffer that they may handle.

Network device drivers typically do not keep track of transmit I/O
buffers, since the network device core already maintains a transmit
queue.  Drivers will typically call netdev_tx_complete_next() to
complete a transmission without first obtaining the relevant I/O
buffer pointer (and will rely on the network device core automatically
cancelling any pending transmissions when the device is closed).

To allow this driver design approach to be retained, update the
netdev_tx_complete() family of functions to automatically perform the
DMA unmapping operation if required.  For symmetry, also update the
netdev_rx() family of functions to behave the same way.

As a further convenience for drivers, allow the network device core to
automatically perform DMA mapping on the transmit datapath before
calling the driver's transmit() method.  This avoids the need to
introduce a mapping error handling code path into the typically
error-free transmit methods.

With these changes, the modifications required to update a typical
network device driver to use the new DMA API are fairly minimal:

- Allocate and free descriptor rings and similar coherent structures
  using dma_alloc()/dma_free() rather than malloc_phys()/free_phys()

- Allocate and free receive buffers using alloc_rx_iob()/free_rx_iob()
  rather than alloc_iob()/free_iob()

- Calculate DMA addresses using dma() or iob_dma() rather than
  virt_to_bus()

- Set a 64-bit DMA mask if needed using dma_set_mask_64bit() and
  thereafter eliminate checks on DMA address ranges

- Either record the DMA device in netdev->dma, or call iob_map_tx() as
  part of the transmit() method

- Ensure that debug messages use virt_to_phys() when displaying
  "hardware" addresses

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-28 20:26:28 +00:00
Michael Brown 70e6e83243 [dma] Record DMA device as part of DMA mapping if needed
Allow for dma_unmap() to be called by code other than the DMA device
driver itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-28 18:56:50 +00:00
Michael Brown cf12a41703 [dma] Modify DMA API to simplify calculation of medial addresses
Redefine the value stored within a DMA mapping to be the offset
between physical addresses and DMA addresses within the mapped region.

Provide a dma() wrapper function to calculate the DMA address for any
pointer within a mapped region, thereby simplifying the use cases when
a device needs to be given addresses other than the region start
address.

On a platform using the "flat" DMA implementation the DMA offset for
any mapped region is always zero, with the result that dma_map() can
be optimised away completely and dma() reduces to a straightforward
call to virt_to_phys().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-25 16:15:55 +00:00
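Illustration for the [dma] offset entry above (not part of the commit): a minimal sketch of the offset-based calculation, assuming the stored offset is the DMA address minus the physical address. The type and function names are hypothetical stand-ins and do not reproduce the real API signatures.

```c
#include <stdint.h>

/* Hypothetical mapping: holds only the phys-to-DMA offset for the region */
struct dma_map_sketch {
    uint64_t offset;  /* assumed to be (DMA address - physical address) */
};

/* DMA address for any physical address inside the mapped region */
uint64_t dma_addr_sketch(const struct dma_map_sketch *map, uint64_t phys)
{
    return phys + map->offset;
}
```

With a zero offset, as in the "flat" case described above, the function returns the physical address unchanged, which is why the lookup can collapse to a plain address conversion.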
Michael Brown 24ef743778 [intelxl] Configure DMA mask as 64-bit
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-24 17:47:42 +00:00
Michael Brown 9e280aecb7 [intel] Configure DMA mask as 64-bit
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-24 17:46:39 +00:00
Michael Brown 03314e8da9 [intelxl] Update driver to use DMA API
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-21 13:35:11 +00:00
Michael Brown 76a7bfe939 [intelxl] Read PCI bus:dev.fn number from PFFUNC_RID register
For the physical function driver, the transmit queue needs to be
configured to be associated with the relevant physical function
number.  This is currently obtained from the bus:dev.fn address of the
underlying PCI device.

In the case of a virtual machine using the physical function via PCI
passthrough, the PCI bus:dev.fn address within the virtual machine is
unrelated to the real physical function number.  Such a function will
typically be presented to the virtual machine as a single-function
device.  The function number extracted from the PCI bus:dev.fn address
will therefore always be zero.

Fix by reading from the Function Requester ID Information Register,
which always returns the real PCI bus:dev.fn address as used by the
physical host.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-21 13:35:11 +00:00
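Illustration for the [intelxl] PFFUNC_RID entry above (not part of the commit): a minimal sketch of decoding a standard 16-bit PCI bus:dev.fn value into its fields. The helper names are hypothetical, and the sketch assumes the value has already been extracted from the register, whose exact layout is not described here.

```c
#include <stdint.h>

/* Bus number: bits 15:8 of a bus:dev.fn value */
unsigned int rid_bus(uint16_t rid)
{
    return (rid >> 8) & 0xff;
}

/* Device number: bits 7:3 */
unsigned int rid_dev(uint16_t rid)
{
    return (rid >> 3) & 0x1f;
}

/* Function number: bits 2:0 */
unsigned int rid_fn(uint16_t rid)
{
    return rid & 0x7;
}
```

A passed-through function seen by a guest as 00:03.0 always decodes to function 0, whereas the host-side value, for example 3b:00.1, decodes to the real function number 1.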
Michael Brown b6eb17cbd7 [intelxl] Read MAC address from PRTPM_SA[HL] instead of PRTGL_SA[HL]
The datasheet is fairly incomprehensible in terms of identifying the
appropriate MAC address for use by the physical function driver.
Choose to read the MAC address from PRTPM_SAH and PRTPM_SAL, which at
least matches the MAC address as selected by the Linux i40e driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-20 19:15:30 +00:00
Michael Brown 062711f1cf [intel] Use physical addresses in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-16 15:07:03 +00:00
Michael Brown 810dc5d6c3 [realtek] Use physical addresses in debug messages
Physical addresses in debug messages are more meaningful from an
end-user perspective than potentially IOMMU-mapped I/O virtual
addresses, and have the advantage of being calculable without access
to the original DMA mapping entry (e.g. when displaying an address for
a single failed completion within a descriptor ring).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-16 14:58:57 +00:00
Michael Brown fc5cf18dab [efi] Use casts rather than virt_to_bus() for UNDI buffer addresses
For a software UNDI, the addresses in PXE_CPB_TRANSMIT.FrameAddr and
PXE_CPB_RECEIVE.BufferAddr are host addresses, not bus addresses.

Remove the spurious (and no-op) use of virt_to_bus() and replace with
a cast via intptr_t.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-15 23:36:17 +00:00
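Illustration for the [efi] UNDI buffer entry above (not part of the commit): a minimal sketch of passing a host pointer as a 64-bit address field via an intptr_t cast, with no bus address translation. The function name is hypothetical.

```c
#include <stdint.h>

/* Convert a host pointer to a 64-bit address field by casting via
 * intptr_t, rather than translating it to a bus address.
 */
uint64_t host_addr(const void *buf)
{
    return (uint64_t)(intptr_t)buf;
}
```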