Restructure the parsing of the build directory name from
bin[[-<arch>]-<platform>]
to
bin[-<arch>[-<platform>]]
and allow for a per-architecture default build platform.
For the sake of backwards compatibility, handle "bin-efi" as a special
case equivalent to "bin-i386-efi".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The timer and entropy seed CSRs will, by design, return different
values each time they are read.
Add the missing volatile qualifiers on the inline assembly to prevent
gcc from assuming that repeated invocations may be elided.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Zkr entropy source extension defines a potentially unprivileged
seed CSR that can be read to obtain 16 bits of entropy input, with a
mandated requirement that 256 entropy input bits read from the seed
CSR will contain at least 128 bits of min-entropy.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Zicntr extension defines an unprivileged wall-clock time CSR that
roughly matches the behaviour of an invariant TSC on x86. The nominal
frequency of this timer may be read from the "timebase-frequency"
property of the CPU node in the device tree.
Add a timer source using RDTIME to provide implementations of udelay()
and currticks(), modelled on the existing RDTSC-based timer for x86.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RISC-V seems to allow for direct discovery of CPU features only from
M-mode (e.g. by setting up a trap handler and then attempting to
access a CSR), with S-mode code expected to read the resulting
constructed ISA description from the device tree.
Add the ability to check for the presence of named extensions listed
in the "riscv,isa" property of the device tree node corresponding to
the boot hart.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for the existence of platforms with no PCI bus by including the
PCI settings mechanism only if PCI bus support is included.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Running with flat physical addressing is a fairly common early boot
environment. Rename UACCESS_EFI to UACCESS_FLAT so that this code may
be reused in non-UEFI boot environments that also use flat physical
addressing.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the ability to issue Supervisor Binary Interface (SBI) calls via
the ECALL instruction, and use the SBI DBCN extension to implement a
debug console.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery multiplication requires calculating the inverse of the
modulus modulo a larger power of two.
Add bigint_mod_invert() to calculate the inverse of any (odd) big
integer modulo an arbitrary power of two, using a lightly modified
version of the algorithm presented in "A New Algorithm for Inversion
mod p^k (Koç, 2017)".
The power of two is taken to be 2^k, where k is the number of bits
available in the big integer representation of the invertend. The
inverse modulo any smaller power of two may be obtained simply by
masking off the relevant bits in the inverse.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow scripts to read basic information from USB device descriptors
via the settings mechanism. For example:
echo USB vendor ID: ${usb/${busloc}.8.2}
echo USB device ID: ${usb/${busloc}.10.2}
echo USB manufacturer name: ${usb/${busloc}.14.0}
The general syntax is
usb/<bus:dev>.<offset>.<length>
where bus:dev is the USB bus:device address (as obtained via the
"usbscan" command, or from e.g. ${net0/busloc} for a USB network
device), and <offset> and <length> select the required portion of the
USB device descriptor.
Following the usage of SMBIOS settings tags, a <length> of zero may be
used to indicate that the byte at <offset> contains a USB string
descriptor index, and an <offset> of zero may be used to indicate that
the <length> contains a literal USB string descriptor index.
Since the byte at offset zero can never contain a string index, and a
literal string index can never be zero, the combination of both
<length> and <offset> being zero may be used to indicate that the
entire device descriptor is to be read as a raw hex dump.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Implement a "usbscan" command as a direct analogy of the existing
"pciscan" command, allowing scripts to iterate over all detected USB
devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Faster modular multiplication algorithms such as Montgomery
multiplication will still require the ability to perform a single
direct modular reduction.
Neaten up the implementation of direct reduction and split it out into
a separate bigint_reduce() function, complete with its own unit tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Every architecture uses the same implementation for bigint_is_set(),
and there is no reason to suspect that a future CPU architecture will
provide a more efficient way to implement this operation.
Simplify the code by providing a single architecture-independent
implementation of bigint_is_set().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The big integer shift operations are misleadingly described as
rotations since the original x86 implementations are essentially
trivial loops around the relevant rotate-through-carry instruction.
The overall operation performed is a shift rather than a rotation.
Update the function names and descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An n-bit multiplication product may be added to up to two n-bit
integers without exceeding the range of a (2n)-bit integer:
(2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
Exploit this to perform big integer multiplication in constant time
without requiring the caller to provide temporary carry space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for building as a Linux userspace binary for AArch32.
This allows the self-test suite to be more easily run for the 32-bit
ARM code. For example:
make CROSS=arm-linux-gnu- bin-arm32-linux/tests.linux
qemu-arm -L /usr/arm-linux-gnu/sys-root/ \
./bin-arm32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reading from PMCCNTR causes an undefined instruction exception when
running in PL0 (e.g. as a Linux userspace binary), unless the
PMUSERENR.EN bit is set.
Restructure profile_timestamp() for 32-bit ARM to perform an
availability check on the first invocation, with subsequent
invocations returning zero if PMCCNTR could not be enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
All consumers of profile_timestamp() currently treat the value as an
unsigned long. Only the elapsed number of ticks is ever relevant: the
absolute value of the timestamp is not used. Profiling is used to
measure short durations that are generally fewer than a million CPU
cycles, for which an unsigned long is easily large enough.
Standardise the return type of profile_timestamp() as unsigned long
across all CPU architectures. This allows 32-bit architectures such
as i386 and riscv32 to omit all logic associated with retrieving the
upper 32 bits of the 64-bit hardware counter, which simplifies the
code and allows riscv32 and riscv64 to share the same implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integer multiplication currently performs immediate carry
propagation from each step of the long multiplication, relying on the
fact that the overall result has a known maximum value to minimise the
number of carries performed without ever needing to explicitly check
against the result buffer size.
This is not a constant-time algorithm, since the number of carries
performed will be a function of the input values. We could make it
constant-time by always continuing to propagate the carry until
reaching the end of the result buffer, but this would introduce a
large number of redundant zero carries.
Require callers of bigint_multiply() to provide a temporary carry
storage buffer, of the same size as the result buffer. This allows
the carry-out from the accumulation of each double-element product to
be accumulated in the temporary carry space, and then added in via a
single call to bigint_add() after the multiplication is complete.
Since the structure of big integer multiplication is identical across
all current CPU architectures, provide a single shared implementation
of bigint_multiply(). The architecture-specific operation then
becomes the multiplication of two big integer elements and the
accumulation of the double-element product.
Note that any intermediate carry arising from accumulating the lower
half of the double-element product may be added to the upper half of
the double-element product without risk of overflow, since the result
of multiplying two n-bit integers can never have all n bits set in its
upper half. This simplifies the carry calculations for architectures
such as RISC-V and LoongArch64 that do not have a carry flag.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The admin queue API requires us to tell the device how many event
counters we have provided via the "configure device resources" admin
queue command. There is, of course, absolutely no documentation
indicating how many event counters actually need to be provided.
We require only two event counters: one for the transmit queue, one
for the receive queue. (The receive queue doesn't seem to actually
make any use of its event counter, but the "create receive queue"
admin queue command will fail if it doesn't have an available event
counter to choose.)
In the absence of any documentation, we currently make the assumption
that allocating and configuring 16 counters (i.e. one whole cacheline)
will be sufficient to allow for the use of two counters.
This assumption turns out to be incorrect. On larger instance types
(observed with a c3d-standard-16 instance in europe-west4-a), we find
that creating the transmit or receive queues will each fail with a
probability of around 50% with the "failed precondition" error code.
Experimentation suggests that even though the device has accepted our
"configure device resources" command indicating that we are providing
only 16 event counters, it will attempt to choose any of its potential
32 event counters (and will then fail since the event counter that it
unilaterally chose is outside of the agreed range).
Work around this firmware bug by always allocating the maximum number
of event counters supported by the device. (This requires deferring
the allocation of the event counters until after issuing the "describe
device" command.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As of commit 79c0173 ("[build] Create util/genfsimg for building
filesystem-based images"), the EFI boot file name for each CPU
architecture is defined within the genfsimg script itself, rather than
being passed in as a Makefile parameter.
Remove the now-redundant Makefile definitions for EFI_BOOT_FILE.
Reported-by: Christian I. Nilsson <nikize@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for building iPXE as a 64-bit or 32-bit RISC-V binary, for
either UEFI or Linux userspace platforms. For example:
# RISC-V 64-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv64-efi/ipxe.efi
# RISC-V 32-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv32-efi/ipxe.efi
# RISC-V 64-bit Linux
make CROSS=riscv64-linux-gnu- bin-riscv64-linux/tests.linux
qemu-riscv64 -L /usr/riscv64-linux-gnu/sys-root \
./bin-riscv64-linux/tests.linux
# RISC-V 32-bit Linux
make CROSS=riscv64-linux-gnu- SYSROOT=/usr/riscv32-linux-gnu/sys-root \
bin-riscv32-linux/tests.linux
qemu-riscv32 -L /usr/riscv32-linux-gnu/sys-root \
./bin-riscv32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cross-compiler will typically use the appropriate sysroot
directory automatically. This may not work for toolchains where a
single cross-compiler is used to produce output for multiple CPU
variants (e.g. 32-bit and 64-bit RISC-V).
Add a SYSROOT=... parameter that may be used to specify the relevant
sysroot directory, e.g.
make CROSS=riscv64-linux-gnu- SYSROOT=/usr/riscv32-linux-gnu/sys-root \
bin-riscv32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EDK2 header macros VA_START(), VA_ARG() etc produce build errors
on some CPU architectures (notably on 32-bit RISC-V, which is not yet
supported by EDK2).
Fix by using the standard variable argument list macros.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For some 32-bit CPUs, we need to provide implementations of 64-bit
shifts as libgcc helper functions. Add test cases to cover these.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Define a cpu_halt() function which is architecture-specific but
platform-independent, and merge the multiple architecture-specific
implementations of the EFI cpu_nap() function into a single central
efi_cpu_nap() that uses cpu_halt() if applicable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The definitions of the setjmp() and longjmp() functions are common to
all architectures, with only the definition of the jump buffer
structure being architecture-specific.
Move the architecture-specific portions to bits/setjmp.h and provide a
common setjmp.h for the function definitions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As described in commit 3b81a4e ("[ena] Provide a host information
page"), we currently report an operating system type of "Linux" in
order to work around broken versions of the ENA firmware that will
fail to create a completion queue if we report the correct operating
system type.
As of September 2024, the ENA team at AWS assures us that the entire
AWS fleet has been upgraded to fix this bug, and that we are now safe
to report the correct operating system type value in the "type" field
of struct ena_host_info.
The ENA team has also clarified that at least some deployed versions
of the ENA firmware still have the defect that requires us to report
an operating system version number of 2 (regardless of operating
system type), and so we continue to report ENA_HOST_INFO_VERSION_WTF
in the "version" field of struct ena_host_info.
Add an explicit warning on the previous known failure path, in case
some deployed versions of the ENA firmware turn out to not have been
upgraded as expected.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Move the <gdbmach.h> file to <bits/gdbmach.h>, and provide a common
dummy implementation for all architectures that have not yet
implemented support for GDB.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the process of adding a new CPU architecture by providing
common implementations of typically empty architecture-specific header
files.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This patch adds support for the AQtion Ethernet controller, enabling
iPXE to recognize and utilize the specific models (AQC114, AQC113, and
AQC107).
Tested-by: Animesh Bhatt <animeshb@marvell.com>
Signed-off-by: Animesh Bhatt <animeshb@marvell.com>
The link status check in falcon_xaui_link_ok() reads from the
FCN_XX_CORE_STAT_REG_MAC register only on production hardware (where
the FPGA version reads as zero), but modifies the value and writes
back to this register unconditionally. This triggers an uninitialised
variable warning on newer versions of gcc.
Fix by assuming that the register exists only on production hardware,
and so moving the "modify-write" portion of the "read-modify-write"
operation to also be covered by the same conditional check.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the "imgdecrypt" command that can be used to decrypt a detached
encrypted data image using a cipher key obtained from a separate CMS
envelope image. For example:
# Create non-detached encrypted CMS messages
#
openssl cms -encrypt -binary -aes-256-gcm -recip client.crt \
-in vmlinuz -outform DER -out vmlinuz.cms
openssl cms -encrypt -binary -aes-256-gcm -recip client.crt \
-in initrd.img -outform DER -out initrd.img.cms
# Detach data from envelopes (using iPXE's contrib/crypto/cmsdetach)
#
cmsdetach vmlinuz.cms -d vmlinuz.dat -e vmlinuz.env
cmsdetach initrd.img.cms -d initrd.img.dat -e initrd.img.env
and then within iPXE:
#!ipxe
imgfetch http://192.168.0.1/vmlinuz.dat
imgfetch http://192.168.0.1/initrd.img.dat
imgdecrypt vmlinuz.dat http://192.168.0.1/vmlinuz.env
imgdecrypt initrd.img.dat http://192.168.0.1/initrd.img.env
boot vmlinuz
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for decrypting images containing detached encrypted data
using a cipher key obtained from a separate CMS envelope image (in DER
or PEM format).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise CMS self-test data structure and macro names to refer to
"messages" rather than "signatures", in preparation for adding image
decryption tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some ASN.1 OID-identified algorithms require additional parameters,
such as an initialisation vector for a block cipher. The structure of
the parameters is defined by the individual algorithm.
Extend asn1_algorithm() to allow these additional parameters to be
returned via a separate ASN.1 cursor.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the number of dynamic allocations required to parse a CMS
message by retaining the ASN.1 cursor returned from image_asn1() for
the lifetime of the CMS message. This allows embedded ASN.1 cursors
to be used for parsed objects within the message, such as embedded
signatures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data. This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.
Simplify the public-key algorithm API so that there is no reusable
algorithm context. In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().
This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature. This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
client and server operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TLS connection structure has grown to become unmanageably large as
new features and support for new TLS protocol versions have been added
over time.
Split out the portions of struct tls_connection that are specific to
transmit and receive operations into separate structures, and simplify
some structure field names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the existing support for performing RSA public-key
encryption, decryption, signature, and verification tests, and update
the code to use okx() for neater reporting of test results.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.
Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the logic for identifying the matching PCI root bridge I/O
protocol to allow for identifying the closest matching PCI bus:dev.fn
address range, and use this to provide PCI address range discovery
(while continuing to inhibit automatic PCI bus probing).
This allows the "pciscan" command to work as expected under UEFI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI device model requires us to not probe the PCI bus directly,
but instead to wait to be offered the opportunity to drive devices via
our driver service binding handle.
We currently inhibit PCI bus probing by having pci_discover() return
an empty range when using the EFI PCI I/O API. This has the unwanted
side effect that scanning the bus manually using the "pciscan" command
will also fail to discover any devices.
Separate out the concept of being allowed to probe PCI buses from the
mechanism for discovering PCI bus:dev.fn address ranges, so that this
limitation may be removed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An attempt to use a validator for an empty certificate chain will
correctly fail the overall validation with the "empty certificate
chain" error propagated from x509_auto_append().
In a debug build, the call to validator_name() will attempt to call
x509_name() on a non-existent certificate, resulting in garbage in the
debug message.
Fix by checking for the special case of an empty certificate chain.
This issue does not affect non-debug builds, since validator_name() is
(as per its description) called only for debug messages.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is some exploitable similarity between the data structures used
for representing CMS signatures and CMS encryption keys. In both
cases, the CMS message fundamentally encodes a list of participants
(either message signers or message recipients), where each participant
has an associated certificate and an opaque octet string representing
the signature or encrypted cipher key. The ASN.1 structures are not
identical, but are sufficiently similar to be worth exploiting: for
example, the SignerIdentifier and RecipientIdentifier data structures
are defined identically.
Rename data structures and functions, and add the concept of a CMS
message type.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the definition of an ASN.1 OID-identified algorithm to include
a potential cipher suite, and add identifiers for AES-CBC and AES-GCM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cms_signature() and cms_verify() functions currently accept raw
data pointers. This will not be possible for cms_decrypt(), which
will need the ability to extract fragments of ASN.1 data from a
potentially large image.
Change cms_signature() and cms_verify() to accept an image as an input
parameter, and move the responsibility for setting the image trust
flag within cms_verify() since that now becomes a more natural fit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow passing a NULL value for the certificate list to all functions
used for identifying an X.509 certificate from an existing set of
certificates, and rename function parameters to indicate that this
certificate list represents an unordered certificate store (rather
than an ordered certificate chain).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Centralise all current mechanisms for identifying an X.509 certificate
(by raw content, by subject, by issuer and serial number, and by
matching public key), and remove the certstore-specific and
CMS-specific variants of these functions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Handling large ASN.1 objects such as encrypted CMS files will require
the ability to use the asn1_enter() and asn1_skip() family of
functions on partial object cursors, where a defined additional length
is known to exist after the end of the data buffer pointed to by the
ASN.1 object cursor.
We already have support for partial object cursors in the underlying
asn1_start() operation used by both asn1_enter() and asn1_skip(), and
this is used by the DER image probe routine to check that the
potential DER file comprises a single ASN.1 SEQUENCE object.
Add asn1_enter_partial() to formalise the process of entering an ASN.1
partial object, and refactor the DER image probe routine to use this
instead of open-coding calls to the underlying asn1_start() operation.
There is no need for an equivalent asn1_skip_partial() function, since
only objects that are wholly contained within the partial cursor may
be successfully skipped.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calling asn1_skip_if_exists() on a malformed ASN.1 object may
currently leave the cursor in a partially-updated state, where the tag
byte and one of the length bytes have been stripped. The cursor is
left with a valid data pointer and length and so no out-of-bounds
access can arise, but the cursor no longer points to the start of an
ASN.1 object.
Ensure that each ASN.1 cursor manipulation code path leads to the
cursor being either fully updated, left unmodified, or invalidated,
and update the function descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Successfully reaching the end of a well-formed ASN.1 object list is
arguably not an error, but the current code (dating back to the
original ASN.1 commit in 2007) will explicitly check for and report
this as an error condition.
Remove the explicit check for reaching the end of a well-formed ASN.1
object list, and instead return success along with a zero-length (and
hence implicitly invalidated) cursor.
Almost every existing caller of asn1_skip() or asn1_skip_if_exists()
currently ignores the return value anyway. Skipped objects are (by
definition) not of interest to the caller, and the invalidation
behaviour of asn1_skip() ensures that any errors will be safely caught
on a subsequent attempt to actually use the ASN.1 object content.
Since these existing callers ignore the return value, they cannot be
affected by this change.
There is one existing caller of asn1_skip_if_exists() that does check
the return value: in asn1_skip() itself, an error returned from
asn1_skip_if_exists() will cause the cursor to be invalidated. In the
case of an error indicating only that the cursor length is already
zero, invalidation is a no-op, and so this change affects only the
return value propagated from asn1_skip().
This leaves only a single call site within ocsp_request() where the
return value from asn1_skip() is currently checked. The return status
here is moot since there is no way for the code in question to fail
(absent a bug in the ASN.1 construction or parsing code).
There are therefore no callers of asn1_skip() or asn1_skip_if_exists()
that rely on an error being returned for successfully reaching the end
of a well-formed ASN.1 object list. Simplify the code by redefining
this as a successful outcome.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Redefine bit 30 of an SMBIOS numerical setting to be part of the
function number, in order to allow access to hypervisor CPUID leaves.
This technically breaks backwards compatibility with scripts
attempting to read more than 64 consecutive functions. Since there is
no meaningful block of 64 consecutive related functions, it is
vanishingly unlikely that this capability has ever been used.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hypervisors typically intercept CPUID leaves in the range 0x40000000
to 0x400000ff, with leaf 0x40000000 returning the maximum supported
function within this range in register %eax.
iPXE currently masks off bit 30 from the requested CPUID leaf when
checking to see if a function is supported, which causes this check to
read from leaf 0x00000000 instead of 0x40000000.
Fix by including bit 30 within the mask.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The general syntax for SMBIOS settings:
smbios/<instance>.<type>.<offset>.<length>
is currently extended such that a <length> of zero indicates that the
byte at <offset> contains a string index, and an <offset> of zero
indicates that the <length> contains a literal string index.
Since the byte at offset zero can never contain a string index, and a
literal string index can never have a zero value, the combination of
both <length> and <offset> being zero is currently invalid and will
always return "not found".
Extend the syntax such that the combination of both <length> and
<offset> being zero may be used to read the entire data structure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Experiments suggest that using fewer than 64 receive buffers leads to
excessive packet drop rates on some instance types (observed with a
c3-standard-4 instance in europe-west4-a).
Fix by increasing the number of receive data buffers (and adjusting
the length of the registrable queue page address list to match).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Google Virtual Ethernet NIC (GVE or gVNIC) is found only in Google
Cloud instances. There is essentially zero documentation available
beyond the mostly uncommented source code in the Linux kernel.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DHCPv6 protocol does not itself provide a router address or a
prefix length. This information is instead obtained from the router
advertisements.
Our IPv6 minirouting table construction logic will first construct an
entry for each advertised prefix, and later update the entry to
include an address assigned within that prefix via stateful DHCPv6 (if
applicable).
This logic fails if the address assigned via stateful DHCPv6 does not
fall within any of the advertised prefixes (e.g. if the network is
configured to use DHCPv6-assigned /128 addresses with no advertised
on-link prefixes). We will currently treat this situation as
equivalent to having a manually assigned address with no corresponding
router address or prefix length: the routing table entry will use the
default /64 prefix length and will not include the router address.
DHCPv6 is triggered only in response to a router advertisement with
the "Managed Address Configuration (M)" or "Other Configuration (O)"
flags set, and a router address is therefore available at the point
that we initiate DHCPv6.
Record the router address when initiating DHCPv6, and expose this
router address as part of the DHCPv6 settings block. This allows the
routing table entry for any address assigned via stateful DHCPv6 to
correctly include the router address, even if the assigned address
does not fall within an advertised prefix.
Also provide a fixed /128 prefix length as part of the DHCPv6 settings
block. When an address assigned via stateful DHCPv6 does not fall
within an advertised prefix, this will cause the routing table entry
to have a /128 prefix length as expected. (When such an address does
fall within an advertised prefix, it will continue to use the
advertised prefix length.)
Originally-fixed-by: Guvenc Gulce <guevenc.guelce@sap.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In a small subnet (with a /31 or /32 subnet mask), all addresses
within the subnet are valid host addresses: there is no separate
network address or directed broadcast address.
The logic used in iPXE to determine whether or not to use a link-layer
broadcast address will currently fail in these subnets. In a /31
subnet, the higher of the two host addresses (i.e. the address with
all host bits set) will be treated as a broadcast address. In a /32
subnet, the single valid host address will be treated as a broadcast
address.
Fix by adding the concept of a host mask, defined such that an address
in the local subnet with all of the mask bits set to zero represents
the network address, and an address in the local subnet with all of
the mask bits set to one represents the directed broadcast address.
For most subnets, this is simply the inverse of the subnet mask. For
small subnets (/31 or /32) we can obtain the desired behaviour by
setting the host mask to all ones, so that only the local broadcast
address 255.255.255.255 will be treated as a broadcast address.
Originally-fixed-by: Lukas Stockner <lstockner@genesiscloud.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the now-unused generalised text widget user interface, along
with the associated concept of a widget set and the implementation of
a read-only label widget.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rewrite the code implementing the "login" user interface to use a
predefined interactive form. The command "login" then becomes roughly
equivalent to:
#!ipxe
form
item username Username
item --secret password Password
present
with the result that login form customisations (e.g. to add a Windows
domain name) may be implemented within the scripting language.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for presenting a dynamic user interface as an interactive
form, alongside the existing support for presenting a dynamic user
interface as a menu.
An interactive form may be used to allow a user to input (or edit)
values for multiple settings on a single screen, as a user-friendly
alternative to prompting for setting values via the "read" command.
In the present implementation, all input fields must fit on a single
screen (with no scrolling), and the only supported widget type is an
editable text box.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For interactive forms, the concept of a secret value becomes
meaningful (e.g. for password fields).
Add a flag to indicate that an item represents a secret value, and
allow this flag to be set via the "--secret" option of the "item"
command.
This flag has no meaning for menu items, but is silently accepted
anyway to keep the code size minimal.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the ability to look up a dynamic user interface item by
index or by shortcut key, to allow for reuse of this code for
interactive forms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently have an abstract model of a dynamic menu as a list of
items, each of which has a name, a description, and assorted metadata
such as a shortcut key. The "menu" and "item" commands construct
representations in this abstract model, and the "choose" command then
presents the items as a single-choice menu, with the selected item's
name used as the output value.
This same abstraction may be used to model a dynamic form as a list of
editable items, each of which has a corresponding setting name, an
optional description label, and assorted metadata such as a shortcut
key. By defining a "form" command as an alias for the "menu" command,
we could construct and present forms using commands such as:
#!ipxe
form Login to ${url}
item username Username or email address
item --secret password Password
present
or
#!ipxe
form Configure IPv4 networking for ${netX/ifname}
item netX/ip IPv4 address
item netX/netmask Subnet mask
item netX/gateway Gateway address
item netX/dns DNS server address
present
Reusing the same abstract model for both menus and forms allows us to
minimise the increase in code size, since the implementation of the
"form" and "item" commands is essentially zero-cost.
Rename everything within the abstract data model from "menu" to
"dynamic user interface" to reflect this generalisation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for wraparound scrolling and allow the tab key to be used
to move forward through a list of elements, wrapping back around to
the beginning of the list on overflow.
This is mildly useful for a menu, and likely to be a strong user
expectation for an interactive form.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Switch terminology for the "item" command from "item <label> <text>"
to "item <name> <text>", in preparation for repurposing the "item"
command to cover interactive forms as well as menus.
Since this renaming affects only a positional parameter, it does not
break compatibility with any existing scripts.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The msg() and alert() functions currently defined in settings_ui.c
provide a general-purpose facility for printing messages centred on
the screen.
Split this out to a separate file to allow for reuse by the form
presentation code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The curses concept of a window has been supported but never actively
used in iPXE since the mucurses library was first implemented in 2006.
Simplify the code by removing the ability to place a widget set in a
specified window, and instead use the standard screen for all drawing
operations.
This simplification allows the widget set parameter to be omitted for
the draw_widget() and edit_widget() operations, since the only reason
for its inclusion was to provide access to the specified window.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Create a generic abstraction of a text widget, refactor the existing
editable text box widget to use this abstraction, add an
implementation of a non-editable text label widget, and generalise the
login user interface to use this generic widget abstraction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The comments for replace_string() state that a successful return
status guarantees that the dynamically allocated string pointer is no
longer NULL (even if it was initially NULL and the replacement string
is NULL or empty). This is relied upon by readline() to guarantee
that it will always return a non-NULL string if successful.
The code behaviour does not currently match this comment: an empty
replacement string may result in a successful return status even if
the (single-byte) allocation fails.
Fix up the code behaviour to match the comments, and to additionally
ensure that the edit history is filled in even in the event of an
allocation failure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The reference implementation of Dhcp6Dxe in EDK2 has a fatal flaw: the
code in EfiDhcp6Stop() will poll the network in a tight loop until
either a response is received or a timer tick (at TPL_CALLBACK)
occurs. When EfiDhcp6Stop() is called at TPL_CALLBACK or higher, this
will result in an endless loop and an apparently frozen system.
Since this is the reference implementation of Dhcp6Dxe, it is likely
that almost all platforms have the same problem.
Fix by vetoing the broken driver. If the upstream driver is ever
fixed and a new version number issued, then we could plausibly test
against the version number exposed via the driver binding protocol.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Editable strings currently require a fixed-size buffer, which is
inelegant and limits the potential for creating interactive forms with
a variable number of edit box widgets.
Remove this limitation by switching to using a dynamically allocated
buffer for editable strings and edit box widgets.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
If we do not have a current working URI (after applying the EFI device
path settings and any cached DHCP settings), then an attempt to
download autoexec.ipxe will fail since there is no base URI from which
to resolve the full autoexec.ipxe URI.
Avoid this potentially confusing error message by attempting the
download only if we have successfully obtained a current working URI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a new setting to provide access to the link layer protocol type
from scripts. This can be useful in order to skip configuring
interfaces based on their link layer protocol or, conversely,
configure only selected interface types (Ethernet, IPoIB, etc.)
Example script:
set idx:int32 0
:loop
isset ${net${idx}/mac} || exit 0
iseq ${net${idx}/linktype} IPoIB && goto try_next ||
autoboot net${idx} ||
:try_next
inc idx && goto loop
Signed-off-by: Pavel Krotkiy <porsh@nebius.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently attempt to obtain the autoexec.ipxe script via early use
of the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL or EFI_PXE_BASE_CODE_PROTOCOL
interfaces to obtain an opaque block of memory, which is then
registered as an image at an appropriate point during our startup
sequence. The early use of these existent interfaces allows us to
obtain the script even if our subsequent actions (e.g. disconnecting
drivers in order to connect up our own) may cause the script to become
inaccessible.
This mirrors the approach used under BIOS, where the autoexec.ipxe
script is provided by the prefix (e.g. as an initrd image when using
the .lkrn build of iPXE) and so must be copied into a normally
allocated image from wherever it happens to previously exist in
memory.
We do not currently have support for downloading an autoexec.ipxe
script if we were ourselves downloaded via UEFI HTTP boot.
There is an EFI_HTTP_PROTOCOL defined within the UEFI specification,
but it is so poorly designed as to be unusable for the simple purpose
of downloading an additional file from the same directory. It
provides almost nothing more than a very slim wrapper around
EFI_TCP4_PROTOCOL (or EFI_TCP6_PROTOCOL). It will not handle
redirection, content encoding, retries, or even fundamentals such as
the Content-Length header, leaving all of this up to the caller.
The UEFI HTTP Boot driver will install an EFI_LOAD_FILE_PROTOCOL
instance on the loaded image's device handle. This looks promising at
first since it provides the LoadFile() API call which is specified to
accept an arbitrary filename parameter. However, experimentation (and
inspection of the code in EDK2) reveals a multitude of problems that
prevent this from being usable. Calling LoadFile() will idiotically
restart the entire DHCP process (and potentially pop up a UI requiring
input from the user for e.g. a wireless network password). The
filename provided to LoadFile() will be ignored. Any downloaded file
will be rejected unless it happens to match one of the limited set of
types expected by the UEFI HTTP Boot driver. The list of design
failures and conceptual mismatches is fairly impressive.
Choose to bypass every possible aspect of UEFI HTTP support, and
instead use our own HTTP client and network stack to download the
autoexec.ipxe script over a temporary MNP network device. Since this
approach works for TFTP as well as HTTP, drop the direct use of
EFI_PXE_BASE_CODE_PROTOCOL. For consistency and simplicity, also drop
the direct use of EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and rely upon our
existing support to access local files via "file:" URIs.
This approach results in console output during the "iPXE initialising
devices...ok" message that appears while startup is in progress.
Remove the trailing "ok" so that this intermediate output appears at a
sensible location on the screen. The welcome banner that will be
printed immediately afterwards provides an indication that startup has
completed successfully even absent the explicit "ok".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Retain a reference to the cached DHCPACK until the late startup phase,
and allow it to be recycled for reuse. This allows the cached DHCPACK
to be used for a temporary MNP network device and then subsequently
reused for the corresponding real network device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An MNP network device may be temporarily and non-destructively
installed on top of an existing UEFI network stack without having to
disconnect existing drivers.
Add the ability to create such a temporary network device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the code that allocates our internal struct efi_device
representations, to allow for the creation of temporary MNP devices in
order to download the autoexec.ipxe script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for an HTTP 404 status
code, so that any automatic attempt to download a non-existent
autoexec.ipxe script produces only a minimal error message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for a TFTP "file not
found" error code, so that any automatic attempt to download a
non-existent autoexec.ipxe script produces only a minimal error
message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an abbreviated "Not found" error message for an EFI_NOT_FOUND
error encountered when attempting to open a file on a local
filesystem, so that any automatic attempt to download a non-existent
autoexec.ipxe script produces only a minimal error message.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE is designed around fully asynchronous I/O, including asynchronous
connection opening. Almost all errors are therefore necessarily
reported as occurring during an in-progress download, rather than
occurring at the time that the URI is opened.
Local file access is currently an exception to this: errors such as
nonexistent files will be encountered while opening the URI. This
results in mildly unexpected error messages of the form "Could not
start download", rather than the usual pattern of showing the URI, the
initial progress dots, and then the error message.
Fix this inconsistency by deferring the local filesystem access until
the local file download process is running.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some URI schemes allow for a path name to be specified via the opaque
component of the URI (e.g. "file:/script.ipxe" to specify a path on
the filesystem from which iPXE itself was loaded). Files loaded from
such paths will currently fail to be assigned an appropriate name,
since only the path component of the URI will be used to construct a
default image name.
Fix by falling back to attempt deriving an image name from the opaque
component of a URI, if no path component is specified.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For unknown reasons, miscellaneous versions of gcc seem to struggle
with the static assertions used to ensure the correct layout of the
GCM structures.
Adjust the assertions to use offsetof() rather than direct pointer
comparison, on the basis that offsetof() must be a compile-time
constant value.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI HTTP boot mechanism is extraordinarily badly designed, even
by the standards of the UEFI specification in general. It has the
symptoms of a feature that has been designed entirely in terms of user
stories, without any consideration at all being given to the
underlying technical architecture. It does work, provided that you
are doing precisely and only what was envisioned by the product owner.
If you want to try anything outside the bounds of the product owner's
extremely limited imagination, then you are almost certainly about to
enter a world of pain.
As one very minor example of this: the cached DHCP packet is not
available when using HTTP boot. The UEFI HTTP boot code does perform
DHCP, but it pointlessly and unhelpfully throws away the DHCP packet
and trashes the network interface configuration before handing over to
the downloaded executable.
Work around this imbecility by parsing and applying the few network
configuration settings that are persisted into the loaded image's
device path. This is limited to very basic information such as the IP
address, gateway address, and DNS server address, but it does at least
provide enough for a functional routing table.
Signed-off-by: Michael Brown <mcb30@ipxe.org>