Commit Graph

1523 Commits (e202ea92cf972778d6ff692859932acc43cdb5e2)

Author SHA1 Message Date
Joseph Wong e202ea92cf
Merge d6220856ad into 7e64e9b670 2025-04-03 06:07:55 +00:00
Joseph Wong d6220856ad [bnxt] Use updated DMA APIs
Replace malloc_phys with dma_alloc, free_phys with dma_free, alloc_iob with alloc_rx_iob, free_iob with free_rx_iob, virt_to_bus with dma or iob_dma. Replace dma_addr_t with physaddr_t.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-04-02 22:42:18 -07:00
Michael Brown 4bcaa3d380 [efi] Disconnect existing drivers on a per-protocol basis
UEFI does not provide a direct method to disconnect the existing
driver of a specific protocol from a handle.  We currently use
DisconnectController() to remove all drivers from a handle that we
want to drive ourselves, and then rely on recursion in the call to
ConnectController() to reconnect any drivers that did not need to be
disconnected in the first place.

Experience shows that OEMs tend not to ever test the disconnection
code paths in their UEFI drivers, and it is common to find drivers
that refuse to disconnect, fail to close opened handles, fail to
function correctly after reconnection, or lock up the entire system.

Implement a more selective form of disconnection, in which we use
OpenProtocolInformation() to identify the driver associated with a
specific protocol, and then disconnect only that driver.

Perform disconnections in reverse order of attachment priority, since
this is the order likely to minimise the number of cascaded implicit
disconnections.

This allows our MNP driver to avoid performing any disconnections at
all, since it does not require exclusive access to the MNP protocol.
It also avoids performing unnecessary disconnections and reconnections
of unrelated drivers such as the "UEFI WiFi Connection Manager" that
attaches to wireless network interfaces in order to manage wireless
network associations.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 20:26:06 +00:00
Michael Brown 7737fec5c6 [efi] Define an attachment priority order for EFI drivers
Define an ordering for internal EFI drivers on the basis of how close
the driver is to the hardware, and attempt to start drivers in this
order.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-29 18:44:34 +00:00
Michael Brown 2399c79980 [fdt] Allow for the existence of multiple device trees
When running on a platform that uses FDT as its hardware description
mechanism, we are likely to have multiple device tree structures.  At
a minimum, there will be the device tree passed to us from the
previous boot stage (e.g. OpenSBI), and the device tree that we
construct to be passed to the booted operating system.

Update the internal FDT API to include an FDT pointer in all function
parameter lists.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-28 14:14:32 +00:00
Michael Brown 32a9408217 [efi] Allow use of typed pointers for efi_open() et al
Provide wrapper macros to allow efi_open() and related functions to
accept a pointer to any pointer type as the "interface" argument, in
order to allow a substantial amount of type adjustment boilerplate to
be removed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 15:43:56 +00:00
Michael Brown bac3187439 [efi] Use efi_open() for all ephemeral protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
Michael Brown 5a5e2a1dae [efi] Use efi_open_unsafe() for all explicitly unsafe protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
Michael Brown 9dd30f11f7 [efi] Use efi_open_by_driver() for all by-driver protocol opens
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-03-24 13:19:26 +00:00
Joseph Wong bd90abf487 [bnxt] Allocate TX rings with firmware input
Use queue_id value retrieved from firmware unconditionally when
allocating TX rings.

Signed-off by: Joseph Wong <joseph.wong@broadcom.com>
2025-02-07 09:26:15 +00:00
Michael Brown 24db39fb29 [gve] Run startup process only while device is open
The startup process is scheduled to run when the device is opened and
terminated (if still running) when the device is closed.  It assumes
that the resource allocation performed in gve_open() has taken place,
and that the admin and transmit/receive data structure pointers are
therefore valid.

The process initialisation in gve_probe() erroneously calls
process_init() rather than process_init_stopped() and will therefore
schedule the startup process immediately, before the relevant
resources have been allocated.

This bug is masked in the typical use case of a Google Cloud instance
with a single NIC built with the config/cloud/gce.ipxe embedded
script, since the embedded script will immediately open the NIC (and
therefore allocate the required resources) before the scheduled
process is allowed to run for the first time.  In a multi-NIC
instance, undefined behaviour will arise as soon as the startup
process for the second NIC is allowed to run.

Fix by using process_init_stopped() to avoid implicitly scheduling the
startup process during gve_probe().

Originally-fixed-by: Kal Cutter Conley <kalcutterc@nvidia.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-12-03 13:57:06 +00:00
Michael Brown cc45ca372c [pci] Drag in PCI settings mechanism only when PCI support is present
Allow for the existence of platforms with no PCI bus by including the
PCI settings mechanism only if PCI bus support is included.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-10-25 14:40:28 +01:00
Michael Brown c69f9589cc [usb] Expose USB device descriptor and strings via settings
Allow scripts to read basic information from USB device descriptors
via the settings mechanism.  For example:

  echo USB vendor ID: ${usb/${busloc}.8.2}
  echo USB device ID: ${usb/${busloc}.10.2}
  echo USB manufacturer name: ${usb/${busloc}.14.0}

The general syntax is

  usb/<bus:dev>.<offset>.<length>

where bus:dev is the USB bus:device address (as obtained via the
"usbscan" command, or from e.g. ${net0/busloc} for a USB network
device), and <offset> and <length> select the required portion of the
USB device descriptor.

Following the usage of SMBIOS settings tags, a <length> of zero may be
used to indicate that the byte at <offset> contains a USB string
descriptor index, and an <offset> of zero may be used to indicate that
the <length> contains a literal USB string descriptor index.

Since the byte at offset zero can never contain a string index, and a
literal string index can never be zero, the combination of both
<length> and <offset> being zero may be used to indicate that the
entire device descriptor is to be read as a raw hex dump.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-10-18 13:13:51 +01:00
Michael Brown c219b5d8a9 [usb] Add "usbscan" command for iterating over USB devices
Implement a "usbscan" command as a direct analogy of the existing
"pciscan" command, allowing scripts to iterate over all detected USB
devices.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-10-17 14:18:22 +01:00
Michael Brown 59d123658b [gve] Allocate all possible event counters
The admin queue API requires us to tell the device how many event
counters we have provided via the "configure device resources" admin
queue command.  There is, of course, absolutely no documentation
indicating how many event counters actually need to be provided.

We require only two event counters: one for the transmit queue, one
for the receive queue.  (The receive queue doesn't seem to actually
make any use of its event counter, but the "create receive queue"
admin queue command will fail if it doesn't have an available event
counter to choose.)

In the absence of any documentation, we currently make the assumption
that allocating and configuring 16 counters (i.e. one whole cacheline)
will be sufficient to allow for the use of two counters.

This assumption turns out to be incorrect.  On larger instance types
(observed with a c3d-standard-16 instance in europe-west4-a), we find
that creating the transmit or receive queues will each fail with a
probability of around 50% with the "failed precondition" error code.

Experimentation suggests that even though the device has accepted our
"configure device resources" command indicating that we are providing
only 16 event counters, it will attempt to choose any of its potential
32 event counters (and will then fail since the event counter that it
unilaterally chose is outside of the agreed range).

Work around this firmware bug by always allocating the maximum number
of event counters supported by the device.  (This requires deferring
the allocation of the event counters until after issuing the "describe
device" command.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-09-17 13:37:20 +01:00
Michael Brown f88761ef49 [ena] Change reported operating system type to "iPXE"
As described in commit 3b81a4e ("[ena] Provide a host information
page"), we currently report an operating system type of "Linux" in
order to work around broken versions of the ENA firmware that will
fail to create a completion queue if we report the correct operating
system type.

As of September 2024, the ENA team at AWS assures us that the entire
AWS fleet has been upgraded to fix this bug, and that we are now safe
to report the correct operating system type value in the "type" field
of struct ena_host_info.

The ENA team has also clarified that at least some deployed versions
of the ENA firmware still have the defect that requires us to report
an operating system version number of 2 (regardless of operating
system type), and so we continue to report ENA_HOST_INFO_VERSION_WTF
in the "version" field of struct ena_host_info.

Add an explicit warning on the previous known failure path, in case
some deployed versions of the ENA firmware turn out to not have been
upgraded as expected.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-09-05 14:13:16 +01:00
Animesh Bhatt c7f2e75519 [aqc1xx] Add support for Marvell AQtion Ethernet controller
This patch adds support for the AQtion Ethernet controller, enabling
iPXE to recognize and utilize the specific models (AQC114, AQC113, and
AQC107).

Tested-by: Animesh Bhatt <animeshb@marvell.com>
Signed-off-by: Animesh Bhatt <animeshb@marvell.com>
2024-09-02 13:45:54 +01:00
Michael Brown 7f75d320f6 [etherfabric] Fix use of uninitialised variable in falcon_xaui_link_ok()
The link status check in falcon_xaui_link_ok() reads from the
FCN_XX_CORE_STAT_REG_MAC register only on production hardware (where
the FPGA version reads as zero), but modifies the value and writes
back to this register unconditionally.  This triggers an uninitialised
variable warning on newer versions of gcc.

Fix by assuming that the register exists only on production hardware,
and so moving the "modify-write" portion of the "read-modify-write"
operation to also be covered by the same conditional check.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-09-02 12:24:57 +01:00
Michael Brown 46937a9df6 [crypto] Remove the concept of a public-key algorithm reusable context
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data.  This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.

Simplify the public-key algorithm API so that there is no reusable
algorithm context.  In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().

This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature.  This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-08-21 21:00:57 +01:00
Michael Brown be2784649d [gve] Add missing error codes in EUNIQ() list of potential errors
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-08-20 22:44:15 +01:00
Michael Brown 53f089b723 [crypto] Pass asymmetric keys as ASN.1 cursors
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.

Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-08-18 15:44:38 +01:00
Michael Brown 7c82ff0b6b [pci] Separate permission to probe buses from bus:dev.fn range discovery
The UEFI device model requires us to not probe the PCI bus directly,
but instead to wait to be offered the opportunity to drive devices via
our driver service binding handle.

We currently inhibit PCI bus probing by having pci_discover() return
an empty range when using the EFI PCI I/O API.  This has the unwanted
side effect that scanning the bus manually using the "pciscan" command
will also fail to discover any devices.

Separate out the concept of being allowed to probe PCI buses from the
mechanism for discovering PCI bus:dev.fn address ranges, so that this
limitation may be removed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-08-15 09:31:14 +01:00
Michael Brown d2d194bc60 [gve] Increase number of receive buffers to reduce packet loss
Experiments suggest that using fewer than 64 receive buffers leads to
excessive packet drop rates on some instance types (observed with a
c3-standard-4 instance in europe-west4-a).

Fix by increasing the number of receive data buffers (and adjusting
the length of the registrable queue page address list to match).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-07-25 00:13:33 +01:00
Michael Brown c7b76e3adc [gve] Add driver for Google Virtual Ethernet NIC
The Google Virtual Ethernet NIC (GVE or gVNIC) is found only in Google
Cloud instances.  There is essentially zero documentation available
beyond the mostly uncommented source code in the Linux kernel.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-07-24 14:45:46 +01:00
Michael Brown b940d54235 [cachedhcp] Allow cached DHCPACK to apply to temporary network devices
Retain a reference to the cached DHCPACK until the late startup phase,
and allow it to be recycled for reuse.  This allows the cached DHCPACK
to be used for a temporary MNP network device and then subsequently
reused for the corresponding real network device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-04-02 22:59:50 +01:00
Michael Brown b66f6025fa [efi] Add the ability to create a temporary MNP network device
An MNP network device may be temporarily and non-destructively
installed on top of an existing UEFI network stack without having to
disconnect existing drivers.

Add the ability to create such a temporary network device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-29 14:46:13 +00:00
Michael Brown dcad73ca5a [efi] Add support for driving EFI_MANAGED_NETWORK_PROTOCOL devices
We want exclusive access to the network device, both for performance
reasons and because we perform operations such as EAPoL that affect
the entire link.  We currently drive the network card via either a
native hardware driver or via the SNP or NII/UNDI interfaces, both of
which grant us this exclusive access.

Add an alternative driver that drives the network card non-exclusively
via the EFI_MANAGED_NETWORK_PROTOCOL interface.  This can function as
a fallback for situations where neither SNP nor NII/UNDI interfaces
are functional, and also opens up the possibility of non-destructively
installing a temporary network device over which to download the
autoexec.ipxe script.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-25 17:58:33 +00:00
Michael Brown a15ce00182 [efi] Match chainloaded device by uppermost matching handle
Commit 4c5b794 ("[efi] Use the SNP protocol instance to match the SNP
chainloading device") switched the chainloaded device matching logic
to use a target protocol instance rather than the loaded image's
device handle, on the basis that we want to bind to the parent SNP
device rather than to a duplicate SNP protocol instance installed onto
an IPv4 or IPv6 child device handle.

It is possible that our calls to DisconnectController() and
ConnectController() will cause the target protocol instance to be
uninstalled and reinstalled, which may change the value of the
protocol instance pointer.  Allow for this by identifying and matching
against the uppermost handle that initially has this target protocol
instance installed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-25 17:58:33 +00:00
Michael Brown 926816c58f [efi] Pad transmit buffer length to work around vendor driver bugs
The Mellanox/Nvidia UEFI driver is built from the same codebase as the
iPXE driver, and appears to contain the bug that was fixed in commit
c11734e ("[golan] Use ETH_HLEN for inline header size").  This results
in identical failures when using the SNP or NII interface (via
e.g. snponly.efi) to drive a Mellanox card while EAPoL is enabled.

Work around the underlying UEFI driver bug by padding transmit I/O
buffers to the minimum Ethernet frame length before passing them to
the underlying driver's transmit function.

This padding is not technically necessary, since almost all modern
hardware will insert transmit padding as necessary (and where the
hardware does not support doing so, the underlying UEFI driver is
responsible for adding any necessary padding).  However, it is
guaranteed to be harmless (other than a miniscule performance impact):
the Ethernet specification requires zero padding up to the minimum
frame length for packets that are transmitted onto the wire, and so
the receiver will see the same packet whether or not we manually
insert this padding in software.

The additional padding causes the underlying Mellanox driver to avoid
its faulty code path, since it will never be asked to transmit a very
short packet.

Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-18 22:52:05 +00:00
Rabia Manaa c11734eee0 [golan] Use ETH_HLEN for inline header size
The driver does not correctly handle very short transmitted packets
such as EAPoL-Start where the entire DMA content lies within the
current send work queue entry inline header length of 18 bytes.

Fix by reducing the inline header length to the Ethernet frame header
length of 14 bytes.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-17 22:55:32 +00:00
Michael Brown bac967d51a [snp] Allocate additional padding for receive buffers
Some SNP implementations (observed with a wifi adapter in a Dell
Latitude 3440 laptop) seem to require additional space in the
allocated receive buffers, otherwise full-length packets will be
silently dropped.

The EDK2 MnpDxe driver happens to allocate an additional 8 bytes of
padding (4 for a VLAN tag, 4 for the Ethernet frame checksum).  Match
this behaviour since drivers are very likely to have been tested
against MnpDxe.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-03-16 23:28:34 +00:00
Geert Stappers e5f3ba0ca7 [drivers] Sort PCI_ROM() entries numerically
Done with the help of this Perl script:

$MARKER = 'PCI_ROM';  # a regex
$AB = 1;  # At Begin
@HEAD = ();
@ITEMS = ();
@TAIL = ();

foreach $fn (@ARGV) {
    open(IN, $fn) or die "Can't open file '$fn': $!\n";
    while (<IN>) {
        if (/$MARKER/) {
            push @ITEMS, $_;
            $AB = 0;  # not anymore at begin
        }
        else {
            if ($AB) {
                push @HEAD, $_;
            }
            else {
                push @TAIL, $_;
            }
        }
    }
} continue {
    close IN;
    open(OUT, ">$fn") or die "Can't open file '$fn' for output: $!\n";
    print OUT @HEAD;
    print OUT sort @ITEMS;
    print OUT @TAIL;
    close OUT;
    # For a next file
    $AB = 1;
    @HEAD = ();
    @ITEMS = ();
    @TAIL = ();
}

Executed that script while src/drivers/ as current working directory,
provided '$(grep -rl PCI_ROM)' as argument.

Signed-off-by: Geert Stappers <stappers@stappers.it>
2024-02-22 14:19:04 +00:00
Joseph Wong a846c4ccfc [bnxt] Add support for BCM957608
Add support for BCM957608 device.  Add support for additional link
speeds supported by BCM957608.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2024-02-08 15:10:12 +00:00
Joseph Wong de8a0821c7 [bnxt] Add support for additional chip IDs
Add additional chip IDs that can be recognized as part of the thor
family.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-01-19 22:08:48 +00:00
Michael Brown 4b7d9a6af0 [libc] Replace linker_assert() with build_assert()
We currently implement build-time assertions via a mechanism that
generates a call to an undefined external function that will cause the
link to fail unless the compiler can prove that the asserted condition
is true (and thereby eliminate the undefined function call).

This assertion mechanism can be used for conditions that are not
amenable to the use of static_assert(), since static_assert() will not
allow for proofs via dead code elimination.

Add __attribute__((error(...))) to the undefined external function, so
that the error is raised at compile time rather than at link time.
This allows us to provide a more meaningful error message (which will
include the file name and line number, as with any other compile-time
error), and avoids the need for the caller to specify a unique symbol
name for the external function.

Change the name from linker_assert() to build_assert(), since the
assertion now takes place at compile time rather than at link time.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-01-16 13:35:08 +00:00
Michael Brown 6d29415c89 [libc] Make static_assert() available via assert.h
Expose static_assert() via assert.h and migrate link-time assertions
to build-time assertions where possible.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2024-01-16 13:35:08 +00:00
Christian Helmuth 119c415ee4 [intel] Add PCI ID for I219-LM (23)
Successfully tested on FUJITSU LIFEBOOK U7413.

Signed-off-by: Christian Helmuth <christian.helmuth@genode-labs.com>
2023-12-21 13:53:24 +01:00
Michael Brown 36e1a559a2 [pci] Check that ECAM configuration space is within reachable memory
Some machines (observed with an AWS EC2 m7a.large instance) will place
the ECAM configuration space window above 4GB, thereby making it
unreachable from non-paged 32-bit code.  This problem is currently
ignored by iPXE, since the address is silently truncated in the call
to ioremap().  (Note that other uses of ioremap() are not affected
since the PCI core will already have checked for unreachable 64-bit
BARs when retrieving the physical address to be mapped.)

Fix by adding an explicit check that the region to be mapped starts
within the reachable memory address space.  (Assume that no machines
will be sufficiently peverse to provide a region that straddles the
4GB boundary.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-11-02 15:38:08 +00:00
Michael Brown 1f3a37e342 [pci] Cache ECAM mapping errors
When an error occurs during ECAM configuration space mapping, preserve
the error within the existing cached mapping (instead of invalidating
the cached mapping) in order to avoid flooding the debug log with
repeated identical mapping errors.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-11-02 15:20:27 +00:00
Michael Brown 74ec00a9f3 [pci] Handle non-zero starting bus in ECAM allocations
The base address provided in the PCI ECAM allocation within the ACPI
MCFG table is the base address for the segment as a whole, not for the
starting bus within that allocation.  On machines that provide ECAM
allocations with a non-zero starting bus number (observed with an AWS
EC2 m7a.large instance), this will result in iPXE accessing the wrong
memory addresses within the ECAM region.

Fix by adding the appropriate starting bus offset to the base address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-11-02 15:05:15 +00:00
Michael Brown f883203132 [pci] Force completion of ECAM configuration space writes
The PCIe specification requires that "processor and host bridge
implementations must ensure that a method exists for the software to
determine when the write using the ECAM is completed by the completer"
but does not specify any particular method to be used.  Some platforms
might treat writes to the ECAM region as non-posted, others might
require reading back from a dedicated (and implementation-specific)
completion register to determine when the configuration space write
has completed.

Since PCI configuration space writes will never be used for any
performance-critical datapath operations (on any sane hardware), a
simple and platform-independent solution is to always read back from
the written register in order to guarantee that the write must have
completed.  This is safe to do, since the PCIe specification defines a
limited set of configuration register types, none of which have read
side effects.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-11-01 22:32:21 +00:00
Michael Brown 115707c0ed [iphone] Add missing va_start()/va_end() around reused argument list
The ipair_tx() function uses a va_list twice (first to calculate the
formatted string length before allocation, then to construct the
string in the allocated buffer) but is missing the va_start() and
va_end() around the second usage.  This is undefined behaviour that
happens to work on some build platforms.

Fix by adding the missing va_start() and va_end() around the second
usage of the variadic argument list.

Reported-by: Andreas Hammarskjöld <andreas@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-10-24 11:43:56 +01:00
Michael Brown ae4e85bde9 [netdevice] Allocate private data for each network upper-layer driver
Allow network upper-layer drivers (such as LLDP, which attaches to
each network device in order to provide a corresponding LLDP settings
block) to specify a size for private data, which will be allocated as
part of the network device structure (as with the existing private
data allocated for the underlying device driver).

This will allow network upper-layer drivers to be simplified by
omitting memory allocation and freeing code.  If the upper-layer
driver requires a reference counter (e.g. for interface
initialisation), then it may use the network device's existing
reference counter, since this is now the reference counter for the
containing block of memory.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-09-13 20:23:46 +01:00
Michael Brown eeb7cd56e5 [netdevice] Remove netdev_priv() helper function
Some network device drivers use the trivial netdev_priv() helper
function while others use the netdev->priv pointer directly.

Standardise on direct use of netdev->priv, in order to free up the
function name netdev_priv() for reuse.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-09-13 16:29:48 +01:00
Alexander Eichner 9e99a55b31 [virtio] Fix implementation of vpm_ioread32()
The current implementation of vpm_ioread32() erroneously reads only 16
bits of data, which fails when used with the (stricter) virtio device
emulation in VirtualBox.

Fix by using the correct readl()/inl() I/O wrappers.

Reworded-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-08-22 13:45:44 +01:00
Michael Brown f3036fc213 [linux] Set a default MAC address for tap devices
Avoid the need to always specify a local MAC address on the command
line by setting a default hardware MAC address (using the same default
address as for slirp devices).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-07-05 15:24:32 +01:00
Michael Brown 59d065c9ac [linux] Fix error control flow in af_packet_nic_probe()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-07-05 15:17:58 +01:00
Michael Brown 48ae5d5361 [linux] Fix error control flow in tap_probe()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-07-05 14:47:13 +01:00
Michael Brown 3ef4f7e2ef [console] Avoid overlap between special keys and Unicode characters
The special key range (from KEY_MIN upwards) currently overlaps with
the valid range for Unicode characters, and therefore prohibits the
use of Unicode key values outside the ASCII range.

Create space for Unicode key values by moving the special keys to the
range immediately above the maximum valid Unicode character.  This
allows the existing encoding of special keys as an efficiently packed
representation of the equivalent ANSI escape sequence to be maintained
almost as-is.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-07-04 14:33:43 +01:00
Michael Brown 2689a6e776 [efi] Always poll for TX completions
Polling for TX completions is arguably redundant when there are no
transmissions currently in progress.  Commit c6c7e78 ("[efi] Poll for
TX completions only when there is an outstanding TX buffer") switched
to setting the PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS flag only when
there is an in-progress transmission awaiting completion, in order to
reduce reported TX errors and debug message noise from buggy NII
implementations that report spurious TX completions whenever the
transmit queue is empty.

Some other NII implementations (observed with the Realtek driver in a
Dell Latitude 3440) seem to have a bug in the transmit datapath
handling which results in the transmit ring freezing after sending a
few hundred packets under heavy load.  The symptoms are that the
TPPoll register's NPQ bit remains set and the 256-entry transmit ring
contains a large number of uncompleted descriptors (with the OWN bit
set), the first two of which have identical data buffer addresses.

Though iPXE will submit at most one in-progress transmission via NII,
the Dell/Realtek driver seems to make a page-aligned copy of each
transmit data buffer and to report TX completions immediately without
waiting for the packet to actually be transmitted.  These synthetic TX
completions continue even after the hardware transmit ring freezes.

Setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll reduces the
probability of this Dell/Realtek driver bug being triggered by a
factor of around 500, which brings the failure rate down to the point
that it can sensibly be managed by external logic such as the
"--timeout" option for image downloads.  Closing and reopening the
interface (via "ifclose"/"ifopen") will clear the error condition and
allow transmissions to resume.

Revert to setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll,
and silently ignore situations in which the hardware reports a
completion when no transmission is in progress.  This approximately
matches the behaviour of the SnpDxe driver, which will also generally
set PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-06-21 11:49:53 +01:00