opengnsys_ipxe

Commit Graph

Author	SHA1	Message	Date
Michael Brown	dcad73ca5a	[efi] Add support for driving EFI_MANAGED_NETWORK_PROTOCOL devices We want exclusive access to the network device, both for performance reasons and because we perform operations such as EAPoL that affect the entire link. We currently drive the network card via either a native hardware driver or via the SNP or NII/UNDI interfaces, both of which grant us this exclusive access. Add an alternative driver that drives the network card non-exclusively via the EFI_MANAGED_NETWORK_PROTOCOL interface. This can function as a fallback for situations where neither SNP nor NII/UNDI interfaces are functional, and also opens up the possibility of non-destructively installing a temporary network device over which to download the autoexec.ipxe script. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-03-25 17:58:33 +00:00
Michael Brown	a15ce00182	[efi] Match chainloaded device by uppermost matching handle Commit `4c5b794` ("[efi] Use the SNP protocol instance to match the SNP chainloading device") switched the chainloaded device matching logic to use a target protocol instance rather than the loaded image's device handle, on the basis that we want to bind to the parent SNP device rather than to a duplicate SNP protocol instance installed onto an IPv4 or IPv6 child device handle. It is possible that our calls to DisconnectController() and ConnectController() will cause the target protocol instance to be uninstalled and reinstalled, which may change the value of the protocol instance pointer. Allow for this by identifying and matching against the uppermost handle that initially has this target protocol instance installed. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-03-25 17:58:33 +00:00
Michael Brown	926816c58f	[efi] Pad transmit buffer length to work around vendor driver bugs The Mellanox/Nvidia UEFI driver is built from the same codebase as the iPXE driver, and appears to contain the bug that was fixed in commit `c11734e` ("[golan] Use ETH_HLEN for inline header size"). This results in identical failures when using the SNP or NII interface (via e.g. snponly.efi) to drive a Mellanox card while EAPoL is enabled. Work around the underlying UEFI driver bug by padding transmit I/O buffers to the minimum Ethernet frame length before passing them to the underlying driver's transmit function. This padding is not technically necessary, since almost all modern hardware will insert transmit padding as necessary (and where the hardware does not support doing so, the underlying UEFI driver is responsible for adding any necessary padding). However, it is guaranteed to be harmless (other than a miniscule performance impact): the Ethernet specification requires zero padding up to the minimum frame length for packets that are transmitted onto the wire, and so the receiver will see the same packet whether or not we manually insert this padding in software. The additional padding causes the underlying Mellanox driver to avoid its faulty code path, since it will never be asked to transmit a very short packet. Tested-by: Eric Hagberg <ehagberg@janestreet.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-03-18 22:52:05 +00:00
Rabia Manaa	c11734eee0	[golan] Use ETH_HLEN for inline header size The driver does not correctly handle very short transmitted packets such as EAPoL-Start where the entire DMA content lies within the current send work queue entry inline header length of 18 bytes. Fix by reducing the inline header length to the Ethernet frame header length of 14 bytes. Modified-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-03-17 22:55:32 +00:00
Michael Brown	bac967d51a	[snp] Allocate additional padding for receive buffers Some SNP implementations (observed with a wifi adapter in a Dell Latitude 3440 laptop) seem to require additional space in the allocated receive buffers, otherwise full-length packets will be silently dropped. The EDK2 MnpDxe driver happens to allocate an additional 8 bytes of padding (4 for a VLAN tag, 4 for the Ethernet frame checksum). Match this behaviour since drivers are very likely to have been tested against MnpDxe. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-03-16 23:28:34 +00:00
Geert Stappers	e5f3ba0ca7	[drivers] Sort PCI_ROM() entries numerically Done with the help of this Perl script: $MARKER = 'PCI_ROM'; # a regex $AB = 1; # At Begin @HEAD = (); @ITEMS = (); @TAIL = (); foreach $fn (@ARGV) { open(IN, $fn) or die "Can't open file '$fn': $!\n"; while (<IN>) { if (/$MARKER/) { push @ITEMS, $_; $AB = 0; # not anymore at begin } else { if ($AB) { push @HEAD, $_; } else { push @TAIL, $_; } } } } continue { close IN; open(OUT, ">$fn") or die "Can't open file '$fn' for output: $!\n"; print OUT @HEAD; print OUT sort @ITEMS; print OUT @TAIL; close OUT; # For a next file $AB = 1; @HEAD = (); @ITEMS = (); @TAIL = (); } Executed that script while src/drivers/ as current working directory, provided '$(grep -rl PCI_ROM)' as argument. Signed-off-by: Geert Stappers <stappers@stappers.it>	2024-02-22 14:19:04 +00:00
Joseph Wong	a846c4ccfc	[bnxt] Add support for BCM957608 Add support for BCM957608 device. Add support for additional link speeds supported by BCM957608. Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>	2024-02-08 15:10:12 +00:00
Joseph Wong	de8a0821c7	[bnxt] Add support for additional chip IDs Add additional chip IDs that can be recognized as part of the thor family. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-01-19 22:08:48 +00:00
Michael Brown	4b7d9a6af0	[libc] Replace linker_assert() with build_assert() We currently implement build-time assertions via a mechanism that generates a call to an undefined external function that will cause the link to fail unless the compiler can prove that the asserted condition is true (and thereby eliminate the undefined function call). This assertion mechanism can be used for conditions that are not amenable to the use of static_assert(), since static_assert() will not allow for proofs via dead code elimination. Add __attribute__((error(...))) to the undefined external function, so that the error is raised at compile time rather than at link time. This allows us to provide a more meaningful error message (which will include the file name and line number, as with any other compile-time error), and avoids the need for the caller to specify a unique symbol name for the external function. Change the name from linker_assert() to build_assert(), since the assertion now takes place at compile time rather than at link time. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-01-16 13:35:08 +00:00
Michael Brown	6d29415c89	[libc] Make static_assert() available via assert.h Expose static_assert() via assert.h and migrate link-time assertions to build-time assertions where possible. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2024-01-16 13:35:08 +00:00
Christian Helmuth	119c415ee4	[intel] Add PCI ID for I219-LM (23) Successfully tested on FUJITSU LIFEBOOK U7413. Signed-off-by: Christian Helmuth <christian.helmuth@genode-labs.com>	2023-12-21 13:53:24 +01:00
Michael Brown	36e1a559a2	[pci] Check that ECAM configuration space is within reachable memory Some machines (observed with an AWS EC2 m7a.large instance) will place the ECAM configuration space window above 4GB, thereby making it unreachable from non-paged 32-bit code. This problem is currently ignored by iPXE, since the address is silently truncated in the call to ioremap(). (Note that other uses of ioremap() are not affected since the PCI core will already have checked for unreachable 64-bit BARs when retrieving the physical address to be mapped.) Fix by adding an explicit check that the region to be mapped starts within the reachable memory address space. (Assume that no machines will be sufficiently peverse to provide a region that straddles the 4GB boundary.) Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-11-02 15:38:08 +00:00
Michael Brown	1f3a37e342	[pci] Cache ECAM mapping errors When an error occurs during ECAM configuration space mapping, preserve the error within the existing cached mapping (instead of invalidating the cached mapping) in order to avoid flooding the debug log with repeated identical mapping errors. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-11-02 15:20:27 +00:00
Michael Brown	74ec00a9f3	[pci] Handle non-zero starting bus in ECAM allocations The base address provided in the PCI ECAM allocation within the ACPI MCFG table is the base address for the segment as a whole, not for the starting bus within that allocation. On machines that provide ECAM allocations with a non-zero starting bus number (observed with an AWS EC2 m7a.large instance), this will result in iPXE accessing the wrong memory addresses within the ECAM region. Fix by adding the appropriate starting bus offset to the base address. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-11-02 15:05:15 +00:00
Michael Brown	f883203132	[pci] Force completion of ECAM configuration space writes The PCIe specification requires that "processor and host bridge implementations must ensure that a method exists for the software to determine when the write using the ECAM is completed by the completer" but does not specify any particular method to be used. Some platforms might treat writes to the ECAM region as non-posted, others might require reading back from a dedicated (and implementation-specific) completion register to determine when the configuration space write has completed. Since PCI configuration space writes will never be used for any performance-critical datapath operations (on any sane hardware), a simple and platform-independent solution is to always read back from the written register in order to guarantee that the write must have completed. This is safe to do, since the PCIe specification defines a limited set of configuration register types, none of which have read side effects. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-11-01 22:32:21 +00:00
Michael Brown	115707c0ed	[iphone] Add missing va_start()/va_end() around reused argument list The ipair_tx() function uses a va_list twice (first to calculate the formatted string length before allocation, then to construct the string in the allocated buffer) but is missing the va_start() and va_end() around the second usage. This is undefined behaviour that happens to work on some build platforms. Fix by adding the missing va_start() and va_end() around the second usage of the variadic argument list. Reported-by: Andreas Hammarskjöld <andreas@2PintSoftware.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-10-24 11:43:56 +01:00
Michael Brown	ae4e85bde9	[netdevice] Allocate private data for each network upper-layer driver Allow network upper-layer drivers (such as LLDP, which attaches to each network device in order to provide a corresponding LLDP settings block) to specify a size for private data, which will be allocated as part of the network device structure (as with the existing private data allocated for the underlying device driver). This will allow network upper-layer drivers to be simplified by omitting memory allocation and freeing code. If the upper-layer driver requires a reference counter (e.g. for interface initialisation), then it may use the network device's existing reference counter, since this is now the reference counter for the containing block of memory. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-09-13 20:23:46 +01:00
Michael Brown	eeb7cd56e5	[netdevice] Remove netdev_priv() helper function Some network device drivers use the trivial netdev_priv() helper function while others use the netdev->priv pointer directly. Standardise on direct use of netdev->priv, in order to free up the function name netdev_priv() for reuse. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-09-13 16:29:48 +01:00
Alexander Eichner	9e99a55b31	[virtio] Fix implementation of vpm_ioread32() The current implementation of vpm_ioread32() erroneously reads only 16 bits of data, which fails when used with the (stricter) virtio device emulation in VirtualBox. Fix by using the correct readl()/inl() I/O wrappers. Reworded-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-08-22 13:45:44 +01:00
Michael Brown	f3036fc213	[linux] Set a default MAC address for tap devices Avoid the need to always specify a local MAC address on the command line by setting a default hardware MAC address (using the same default address as for slirp devices). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-07-05 15:24:32 +01:00
Michael Brown	59d065c9ac	[linux] Fix error control flow in af_packet_nic_probe() Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-07-05 15:17:58 +01:00
Michael Brown	48ae5d5361	[linux] Fix error control flow in tap_probe() Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-07-05 14:47:13 +01:00
Michael Brown	3ef4f7e2ef	[console] Avoid overlap between special keys and Unicode characters The special key range (from KEY_MIN upwards) currently overlaps with the valid range for Unicode characters, and therefore prohibits the use of Unicode key values outside the ASCII range. Create space for Unicode key values by moving the special keys to the range immediately above the maximum valid Unicode character. This allows the existing encoding of special keys as an efficiently packed representation of the equivalent ANSI escape sequence to be maintained almost as-is. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-07-04 14:33:43 +01:00
Michael Brown	2689a6e776	[efi] Always poll for TX completions Polling for TX completions is arguably redundant when there are no transmissions currently in progress. Commit `c6c7e78` ("[efi] Poll for TX completions only when there is an outstanding TX buffer") switched to setting the PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS flag only when there is an in-progress transmission awaiting completion, in order to reduce reported TX errors and debug message noise from buggy NII implementations that report spurious TX completions whenever the transmit queue is empty. Some other NII implementations (observed with the Realtek driver in a Dell Latitude 3440) seem to have a bug in the transmit datapath handling which results in the transmit ring freezing after sending a few hundred packets under heavy load. The symptoms are that the TPPoll register's NPQ bit remains set and the 256-entry transmit ring contains a large number of uncompleted descriptors (with the OWN bit set), the first two of which have identical data buffer addresses. Though iPXE will submit at most one in-progress transmission via NII, the Dell/Realtek driver seems to make a page-aligned copy of each transmit data buffer and to report TX completions immediately without waiting for the packet to actually be transmitted. These synthetic TX completions continue even after the hardware transmit ring freezes. Setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll reduces the probability of this Dell/Realtek driver bug being triggered by a factor of around 500, which brings the failure rate down to the point that it can sensibly be managed by external logic such as the "--timeout" option for image downloads. Closing and reopening the interface (via "ifclose"/"ifopen") will clear the error condition and allow transmissions to resume. Revert to setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll, and silently ignore situations in which the hardware reports a completion when no transmission is in progress. This approximately matches the behaviour of the SnpDxe driver, which will also generally set PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-06-21 11:49:53 +01:00
Matt Parrella	bf25e23d07	[intel] Add workaround for I210 reset hardware bugs The Intel I210's packet buffer size registers reset only on power up, not when a reset signal is asserted. This can lead to the inability to pass traffic in the event that the DMA TX Maximum Packet Size (which does reset to its default value on reset) is bigger than the TX Packet Buffer Size. For example, an operating system may be using the time sensitive networking features of the I210 and the registers may be programmed correctly, but then a reset signal is asserted and iPXE on the next boot will be unable to use the I210. Mimic what Linux does and forcibly set the registers to their default values. Signed-off-by: Matt Parrella <parrella.matthew@gmail.com>	2023-03-14 14:44:32 +00:00
Forest Crossman	523788ccda	[intelx] Add PCI IDs for Intel 82599 10GBASE-T NIC Signed-off-by: Forest Crossman <cyrozap@gmail.com>	2023-03-05 18:22:18 -06:00
Michael Brown	2733c4763a	[iscsi] Limit maximum transfer size to MaxBurstLength We currently specify only the iSCSI default value for MaxBurstLength and ignore any negotiated value, since our internal block device API allows only for receiving directly into caller-allocated buffers and so we have no intrinsic limit on burst length. A conscientious target may however refuse to attempt a transfer that we request for a number of blocks that would exceed the negotiated maximum burst length. Fix by recording the negotiated maximum burst length and using it to limit the maximum number of blocks per transfer as reported by the SCSI layer. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-02-16 13:27:25 +00:00
Michael Brown	be8ecaf805	[eisa] Check for system board presence before probing for slots EISA expansion slot I/O port addresses overlap space that may be assigned to PCI devices, which can lead to register reads and writes with unwanted side effects during EISA probing. Reduce the chances of performing EISA probing on PCI devices by probing EISA slot vendor and product ID registers only if the EISA system board vendor ID register indicates that the motherboard supports EISA. Debugged-by: Václav Ovsík <vaclav.ovsik@gmail.com> Tested-by: Václav Ovsík <vaclav.ovsik@gmail.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-02-10 23:34:59 +00:00
Michael Brown	4e456d9928	[efi] Do not attempt to drive PCI bridge devices The "bridge" driver introduced in `3aa6b79` ("[pci] Add minimal PCI bridge driver") is required only for BIOS builds using the ENA driver, where experimentation shows that we cannot rely on the BIOS to fully assign MMIO addresses. Since the driver is a valid PCI driver, it will end up binding to all PCI bridge devices even on a UEFI platform, where the firmware is likely to have completed MMIO address assignment correctly. This has no impact on most systems since there is generally no UEFI driver for PCI bridges: the enumeration of the whole PCI bus is handled by the PciBusDxe driver bound to the root bridge. Experimentation shows that at least one laptop will freeze at the point that iPXE attempts to bind to the bridge device. No deeper investigation has been carried out to find the root cause. Fix by causing efipci_supported() to return an error unless the configuration space header type indicates a non-bridge device. Reported-by: Marcel Petersen <mp@sbe.de> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-02-03 16:10:31 +00:00
Michael Brown	b6304f2984	[realtek] Explicitly disable VLAN offload Some cards seem to have the receive VLAN tag stripping feature enabled by default, which causes received VLAN packets to be misinterpreted as being received by the trunk device. Fix by disabling VLAN tag stripping in the C+ Command Register. Debugged-by: Xinming Lai <yiyihu@gmail.com> Tested-by: Xinming Lai <yiyihu@gmail.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-02-01 19:09:30 +00:00
Mohammed Taha	c5426cdaa9	[golan] Add new PCI ID for NVIDIA BlueField-3 network device Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-01-23 22:52:30 +00:00
Michael Brown	68734b9a4d	[efi] Bind to only the topmost instance of the SNP or NII protocols UEFI has the mildly annoying habit of installing copies of the EFI_SIMPLE_NETWORK_PROTOCOL instance on the IPv4 and IPv6 child device handles. This can cause iPXE's SNP driver to attempt to bind to a copy of the EFI_SIMPLE_NETWORK_PROTOCOL that iPXE itself provided on a different handle. Fix by refusing to bind to an SNP (or NII) handle if there exists another instance of the same protocol further up the device path (on the basis that we always want to bind to the highest possible device). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-01-23 19:27:13 +00:00
Michael Brown	2fef0c541e	[efi] Extend efi_locate_device() to allow searching up the device path Extend the functionality of efi_locate_device() to allow callers to find instances of the protocol that may exist further up the device path. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-01-23 19:27:13 +00:00
Alexander Graf	6b977d1250	[ena] Allocate an unused Asynchronous Event Notification Queue (AENQ) We currently don't allocate an Asynchronous Event Notification Queue (AENQ) because we don't actually care about any of the events that may come in. The ENA firmware found on Graviton instances requires the AENQ to exist, otherwise all admin queue commands will fail. Fix by allocating an AENQ and disabling all events (so that we do not need to include code to acknowledge any events that may arrive). Signed-off-by: Alexander Graf <graf@amazon.com>	2023-01-18 22:47:58 +00:00
Michael Brown	c4c03e5be8	[netdevice] Allow duplicate MAC addresses Many laptops now include the ability to specify a "system-specific MAC address" (also known as "pass-through MAC"), which is supposed to be used for both the onboard NIC and for any attached docking station or other USB NIC. This is intended to simplify interoperability with software or hardware that relies on a MAC address to recognise an individual machine: for example, a deployment server may associate the MAC address with a particular operating system image to be deployed. This therefore creates legitimate situations in which duplicate MAC addresses may exist within the same system. As described in commit `98d09a1` ("[netdevice] Avoid registering duplicate network devices"), the Xen netfront driver relies on the rejection of duplicate MAC addresses in order to inhibit registration of the emulated PCI devices that a Xen PV-HVM guest will create to shadow each of the paravirtual network devices. Move the code that rejects duplicate MAC addresses from the network device core to the Xen netfront driver, to allow for the existence of duplicate MAC addresses in non-Xen setups. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-01-15 00:42:52 +00:00
Michael Brown	ab19546386	[efi] Disable receive filters to work around buggy UNDI drivers Some UNDI drivers (such as the AMI UsbNetworkPkg currently in the process of being upstreamed into EDK2) have a bug that will prevent any packets from being received unless at least one attempt has been made to disable some receive filters. Work around these buggy drivers by attempting to disable receive filters before enabling them. Ignore any errors, since we genuinely do not care whether or not the disabling succeeds. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2023-01-11 00:18:18 +00:00
Christian I. Nilsson	563bff4722	[intel] Add PCI ID for I219-V and -LM 16,17 Signed-off-by: Christian I. Nilsson <nikize@gmail.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-11-15 13:05:28 +00:00
Michael Brown	2ae5355321	[pci] Backup and restore standard config space across PCIe FLR The behaviour of PCI devices across a function-level reset seems to be inconsistent in practice: some devices will preserve PCI BARs, some will not. Fix the behaviour of FLR on devices that do not preserve PCI BARs by backing up and restoring PCI configuration space across the reset. Preserve only the standard portion of the configuration space, since there may be registers with unexpected side effects in the remaining non-standardised space. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-11-13 21:38:41 +00:00
Michael Brown	ca2be7e094	[pci] Allow PCI config space backup to be limited by maximum offset Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-11-13 20:42:09 +00:00
Michael Brown	081b3eefc4	[ena] Assign memory BAR if left empty by BIOS Some BIOSes in AWS EC2 (observed with a c6i.metal instance in eu-west-2) will fail to assign an MMIO address to the ENA device, which causes ioremap() to fail. Experiments show that the ENA device is the only device behind its bridge, even when multiple ENA devices are present, and that the BIOS does assign a memory window to the bridge. We may therefore choose to assign the device an MMIO address at the start of the bridge's memory window. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-19 17:49:25 +01:00
Michael Brown	3aa6b79c8d	[pci] Add minimal PCI bridge driver Add a minimal driver for PCI bridges that can be used to locate the bridge to which a PCI device is attached. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-19 17:47:57 +01:00
Michael Brown	649176cd60	[pci] Select PCI I/O API at runtime for cloud images Pretty much all physical machines and off-the-shelf virtual machines will provide a functional PCI BIOS. We therefore default to using only the PCI BIOS, with no fallback to an alternative mechanism if the PCI BIOS fails. AWS EC2 provides the opportunity to experience some exceptions to this rule. For example, the t3a.nano instances in eu-west-1 have no functional PCI BIOS at all. As of commit `83516ba` ("[cloud] Use PCIAPI_DIRECT for cloud images") we therefore use direct Type 1 configuration space accesses in the images built and published for use in the cloud. Recent experience has discovered yet more variation in AWS EC2 instances. For example, some of the metal instance types have multiple PCI host bridges and the direct Type 1 accesses therefore see only a subset of the PCI devices. Attempt to accommodate future such variations by making the PCI I/O API selectable at runtime and choosing ECAM (if available), falling back to the PCI BIOS (if available), then finally falling back to direct Type 1 accesses. This is implemented as a dedicated PCIAPI_CLOUD API, rather than by having the PCI core select a suitable API at runtime (as was done for timers in commit `302f1ee` ("[time] Allow timer to be selected at runtime"). The common case will remain that only the PCI BIOS API is required, and we would prefer to retain the optimisations that come from inlining the configuration space accesses in this common case. Cloud images are (at present) disk images rather than ROM images, and so the increased code size required for this design approach in the PCIAPI_CLOUD case is acceptable. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-18 13:41:21 +01:00
Michael Brown	be667ba948	[pci] Add support for the Enhanced Configuration Access Mechanism (ECAM) The ACPI MCFG table describes a direct mapping of PCI configuration space into MMIO space. This mapping allows access to extended configuration space (up to 4096 bytes) and also provides for the existence of multiple host bridges. Add support for the ECAM mechanism described by the ACPI MCFG table, as a selectable PCI I/O API alongside the existing PCI BIOS and Type 1 mechanisms. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-16 01:05:47 +01:00
Michael Brown	ff228f745c	[pci] Generalise pci_num_bus() to pci_discover() Allow pci_find_next() to discover devices beyond the first PCI segment, by generalising pci_num_bus() (which implicitly assumes that there is only a single PCI segment) with pci_discover() (which has the ability to return an arbitrary contiguous chunk of PCI bus:dev.fn address space). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-15 16:49:47 +01:00
Michael Brown	56b30364c5	[pci] Check for wraparound in callers of pci_find_next() The semantics of the bus:dev.fn parameter passed to pci_find_next() are "find the first existent PCI device at this address or higher", with the caller expected to increment the address between finding devices. This does not allow the parameter to distinguish between the two cases "start from address zero" and "wrapped after incrementing maximal possible address", which could therefore lead to an infinite loop in the degenerate case that a device with address ffff:ff:1f.7 really exists. Fix by checking for wraparound in the caller (which is already responsible for performing the increment). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-15 15:20:58 +01:00
Michael Brown	8fc3c26eae	[pci] Allow pci_find_next() to return non-zero PCI segments Separate the return status code from the returned PCI bus:dev.fn address, in order to allow pci_find_next() to be used to find devices with a non-zero PCI segment number. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-09-15 15:20:58 +01:00
Michael Brown	a80124456e	[ena] Increase receive ring size to 128 entries Some versions of the ENA hardware (observed on a c6i.large instance in eu-west-2) seem to require a receive ring containing at least 128 entries: any smaller ring will never see receive completions or will stall after the first few completions. Increase the receive ring size to 128 entries (determined empirically) for compatibility with these hardware versions. Limit the receive ring fill level to 16 (as at present) to avoid consuming more memory than will typically be available in the internal heap. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-08-26 19:38:27 +01:00
Michael Brown	3b81a4e256	[ena] Provide a host information page Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) seem to require a host information page, without which the CREATE_CQ command will fail with ENA_ADMIN_UNKNOWN_ERROR. These firmware versions also seem to require us to claim that we are a Linux kernel with a specific driver major version number. This appears to be a firmware bug, as revealed by Linux kernel commit 1a63443af ("net/amazon: Ensure that driver version is aligned to the linux kernel"): this commit changed the value of the driver version number field to be the Linux kernel version, and was hastily reverted in commit 92040c6da ("net: ena: fix broken interface between ENA driver and FW") which clarified that the version number field does actually have some undocumented significance to some versions of the firmware. Fix by providing a host information page via the SET_FEATURE command, incorporating the apparently necessary lies about our identity. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-08-26 19:38:27 +01:00
Michael Brown	9f81e97af5	[ena] Specify the unused completion queue MSI-X vector as 0xffffffff Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) will complain if the completion queue's MSI-X vector field is left empty, even though the queue configuration specifies that interrupts are not used. Work around these firmware versions by passing in what appears to be the magic "no MSI-X vector" value in this field. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-08-26 19:38:27 +01:00
Michael Brown	6d2cead461	[ena] Allow for out-of-order completions The ENA data path design has separate submission and completion queues. Submission queues must be refilled in strict order (since there is only a single linear tail pointer used to communicate the existence of new entries to the hardware), and completion queue entries include a request identifier copied verbatim from the submission queue entry. Once the submission queue doorbell has been rung, software never again reads from the submission queue entry and nothing ever needs to write back to the submission queue entry since completions are reported via the separate completion queue. This design allows the hardware to complete submission queue entries out of order, provided that it internally caches at least as many entries as it leaves gaps. Record and identify I/O buffers by request identifier (using a circular ring buffer of unique request identifiers), and remove the assumption that submission queue entries will be completed in order. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2022-08-26 19:38:25 +01:00

1 2 3 4 5 ...

1497 Commits (dcad73ca5ad3e1fe011c52a24036f67ad69fadc1)