SBAT defines an encoding for security generation numbers stored as a
CSV file within a special ".sbat" section in the signed binary. If a
Secure Boot exploit is discovered then the generation number will be
incremented alongside the corresponding fix.
Platforms may then record the minimum generation number required for
any given product. This allows for an efficient revocation mechanism
that consumes minimal flash storage space (in contrast to the DBX
mechanism, which allows for only a single-digit number of revocation
events to ever take place across all possible signed binaries).
Add SBAT metadata to iPXE EFI binaries to support this mechanism.
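The revocation check described above can be sketched roughly as follows. This is only an illustration: it assumes that each CSV entry begins with a component name followed by a decimal generation number, and the function name sbat_entry_revoked() is hypothetical rather than part of iPXE or the UEFI shim.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether one SBAT-style CSV entry is
 * revoked.  Assumes the entry begins "component,generation,...".
 *
 * Returns 1 if the entry names "component" and its generation number
 * is below "min_generation", or 0 otherwise (including on any parse
 * failure).
 */
static int sbat_entry_revoked ( const char *entry, const char *component,
				unsigned long min_generation ) {
	const char *field;
	char *end;
	size_t len;
	unsigned long generation;

	assert ( entry != NULL );
	assert ( component != NULL );

	/* Entry must start with "<component>," */
	len = strlen ( component );
	if ( ( strncmp ( entry, component, len ) != 0 ) ||
	     ( entry[len] != ',' ) )
		return 0;

	/* Parse the decimal generation number from the second field */
	field = &entry[ len + 1 ];
	generation = strtoul ( field, &end, 10 );
	if ( end == field )
		return 0;
	if ( ( *end != ',' ) && ( *end != '\0' ) )
		return 0;

	/* Revoked if older than the platform's recorded minimum */
	return ( generation < min_generation );
}
```

The platform needs to store only one small integer per component, no matter how many individual binaries end up being revoked.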
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As of commit f1e9e2b ("[efi] Align EFI image sections by page size"),
the VirtualSize fields for the .reloc and .debug sections have been
rounded up to the (4kB) image alignment. This breaks the PE
relocation logic in the UEFI shim, which requires the VirtualSize
field to exactly match the size as recorded in the data directory.
Fix by setting the VirtualSize field to the unaligned size of the
section, as is already done for normal PE sections (i.e. those other
than .reloc and .debug).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RFC4122 specification defines UUIDs as being in network byte
order, but an unfortunately significant amount of (mostly Microsoft)
software treats them as having the first three fields in little-endian
byte order.
In an ideal world, any server-side software that compares UUIDs for
equality would perform an endian-insensitive comparison (analogous to
comparing strings for equality using a case-insensitive comparison),
and would therefore not care about byte order differences.
Define a setting type name ":guid" to allow a UUID setting to be
formatted in little-endian order, to simplify interoperability with
server-side software that expects such a formatting.
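As a minimal sketch of the difference between the two formattings, the following helper prints the same 16 raw bytes either in network byte order or with the first three fields byte-swapped. The function name uuid_format() and its signature are assumptions made for illustration and do not reflect the iPXE settings code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format 16 raw UUID bytes (held in network byte order) as text.
 * If "mixed_endian" is non-zero, the first three fields (32 bits,
 * 16 bits, 16 bits) are byte-swapped before printing.
 *
 * "buf" must have room for 37 bytes (36 characters plus NUL).
 */
static void uuid_format ( const uint8_t raw[16], int mixed_endian,
			  char buf[37] ) {
	uint8_t u[16];
	uint8_t tmp;

	assert ( raw != NULL );
	assert ( buf != NULL );

	memcpy ( u, raw, sizeof ( u ) );
	if ( mixed_endian ) {
		/* Swap the 32-bit field */
		tmp = u[0]; u[0] = u[3]; u[3] = tmp;
		tmp = u[1]; u[1] = u[2]; u[2] = tmp;
		/* Swap the two 16-bit fields */
		tmp = u[4]; u[4] = u[5]; u[5] = tmp;
		tmp = u[6]; u[6] = u[7]; u[7] = tmp;
	}
	snprintf ( buf, 37,
		   "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
		   "%02x%02x-%02x%02x%02x%02x%02x%02x",
		   u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
		   u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15] );
}
```

The final two fields are unaffected, so only the first 18 characters of the formatted string differ between the two forms.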
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI specification mandates that the EFI watchdog timer should be
disabled by the platform firmware as part of the ExitBootServices()
call, but some platforms (e.g. Hyper-V) are observed to occasionally
forget to do so, resulting in a reboot approximately five minutes
after starting the operating system.
Work around these firmware bugs by disabling the watchdog timer
ourselves.
Requested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On some systems (observed with the Thunderbolt ports on a ThinkPad X1
Extreme Gen3 and a ThinkPad P53), if the IOMMU is enabled then the
system firmware will install an ExitBootServices notification event
that disables bus mastering on the Thunderbolt xHCI controller and all
PCI bridges, and destroys any extant IOMMU mappings. This leaves the
xHCI controller unable to perform any DMA operations.
As described in commit 236299b ("[xhci] Avoid DMA during shutdown if
firmware has disabled bus mastering"), any subsequent DMA operation
attempted by the xHCI controller will end up completing after the
operating system kernel has reenabled bus mastering, resulting in a
DMA operation to an area of memory that the hardware is no longer
permitted to access and, on Windows with the Driver Verifier enabled,
a STOP 0xE6 (DRIVER_VERIFIER_DMA_VIOLATION).
That commit avoids triggering any DMA attempts during the shutdown of
the xHCI controller itself. However, this is not a complete solution
since any attached and opened USB device (e.g. a USB NIC) may
asynchronously trigger DMA attempts that happen to occur after bus
mastering has been disabled but before we reset the xHCI controller.
Avoid this problem by installing our own ExitBootServices notification
event at TPL_NOTIFY, thereby causing it to be invoked before the
firmware's own ExitBootServices notification event that disables bus
mastering.
This unsurprisingly causes the shutdown hook itself to be invoked at
TPL_NOTIFY, which causes a fatal error when later code attempts to
raise the TPL to TPL_CALLBACK (which is a lower TPL). Work around
this problem by redefining the "internal" iPXE TPL to be variable, and
set this internal TPL to TPL_NOTIFY when the shutdown hook is invoked.
Avoid calling into an underlying SNP protocol instance from within our
shutdown hook at TPL_NOTIFY, since the underlying SNP driver may
attempt to raise the TPL to TPL_CALLBACK (which would cause a fatal
error). It is safe to leave the underlying SNP device running, since
the underlying device must, in any case, have installed its own
ExitBootServices hook if any shutdown actions are required.
Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the "--uefi" option when invoking isohybrid on an EFI-bootable
image, to create a partition mapping to the EFI system partition
embedded within the ISO image.
This allows the resulting isohybrid image to be booted on UEFI systems
that will not recognise an El Torito boot catalog on a non-CDROM
device.
Originally-fixed-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The efi_unload() function is currently missing the calls to raise and
restore the TPL. This has the side effect of causing iPXE to return
from the driver unload entry point at TPL_CALLBACK, which will cause
unexpected behaviour (typically a system lockup) shortly afterwards.
Fix by adding the missing calls to raise and restore the TPL.
Debugged-by: Petr Borsodi <petr.borsodi@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EFI loaded image protocol allows an image to be provided with a
custom system table, and we currently use this mechanism to wrap any
boot services calls made by the loaded image in order to provide
strace-like debugging via DEBUG=efi_wrap.
The ExitBootServices() call will modify the global system table,
leaving the loaded image using a system table that is no longer
current. When DEBUG=efi_wrap is used, this generally results in the
machine locking up at the point that the loaded operating system calls
ExitBootServices().
Fix by modifying the global EFI system table to point to our wrapper
functions, instead of providing a custom system table via the loaded
image protocol.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A successful call to ExitBootServices() will result in the EFI console
becoming unusable. Ensure that the EFI wrapper produces a complete
line of debug output before calling the wrapped ExitBootServices()
method, and attempt subsequent debug output only if the call fails.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On some systems (observed with the Thunderbolt ports on a ThinkPad X1
Extreme Gen3 and a ThinkPad P53), the system firmware will disable bus
mastering on the xHCI controller and all PCI bridges at the point that
ExitBootServices() is called if the IOMMU is enabled. This leaves the
xHCI controller unable to shut down cleanly since all commands will
fail with a timeout.
Commit 85eb961 ("[xhci] Allow for permanent failure of the command
mechanism") allows us to detect that this has happened and respond
cleanly. However, some unidentified hardware component (either the
xHCI controller or one of the PCI bridges) seems to manage to enqueue
the attempted DMA operation and eventually complete it after the
operating system kernel has reenabled bus mastering. This results in
a DMA operation to an area of memory that the hardware is no longer
permitted to access. On Windows with the Driver Verifier enabled,
this will result in a STOP 0xE6 (DRIVER_VERIFIER_DMA_VIOLATION).
Work around this problem by detecting when bus mastering has been
disabled, and immediately failing the device to avoid initiating any
further DMA attempts.
Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE decodes any percent-encoded characters during the URI parsing
stage, thereby allowing protocol implementations to consume the raw
field values directly without further decoding.
When reconstructing a URI string for use in an HTTP request line, the
percent-encoding is currently reapplied in a reversible way: we
guarantee that our reconstructed URI string could be decoded to give
the same raw field values.
This technically violates RFC3986, which states that "URIs that differ
in the replacement of a reserved character with its corresponding
percent-encoded octet are not equivalent". Experiments show that
several HTTP server applications will attach meaning to the choice of
whether or not a particular character was percent-encoded, even when
the percent-encoding is unnecessary from the perspective of parsing
the URI into its component fields.
Fix by storing the originally encoded substrings for the path, query,
and fragment fields and using these original encoded versions when
reconstructing a URI string. The path field is also stored as a
decoded string, for use by protocols such as TFTP that communicate
using raw strings rather than URI-encoded strings. All other fields
(such as the username and password) continue to be stored only in
their decoded versions since nothing ever needs to know the originally
encoded versions of these fields.
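To illustrate why the decoded form alone is insufficient, the sketch below decodes two differently encoded paths and shows that they become indistinguishable. The helpers percent_decode() and same_after_decoding() are hypothetical names used only for this example and are not the iPXE URI parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert one hexadecimal digit to its value, or -1 if invalid */
static int hex_value ( char c ) {
	if ( ( c >= '0' ) && ( c <= '9' ) )
		return ( c - '0' );
	if ( ( c >= 'a' ) && ( c <= 'f' ) )
		return ( c - 'a' + 10 );
	if ( ( c >= 'A' ) && ( c <= 'F' ) )
		return ( c - 'A' + 10 );
	return -1;
}

/*
 * Decode percent-encoded text into "buf" (of size "len", which must
 * be non-zero).  Output is truncated if necessary and is always
 * NUL-terminated.  Returns the decoded length.
 */
static size_t percent_decode ( const char *encoded, char *buf, size_t len ) {
	size_t used = 0;
	int hi;
	int lo;

	assert ( len > 0 );
	while ( *encoded && ( used < ( len - 1 ) ) ) {
		if ( ( encoded[0] == '%' ) &&
		     ( ( hi = hex_value ( encoded[1] ) ) >= 0 ) &&
		     ( ( lo = hex_value ( encoded[2] ) ) >= 0 ) ) {
			/* Valid "%XX" sequence: emit the decoded octet */
			buf[used++] = ( ( char ) ( ( hi << 4 ) | lo ) );
			encoded += 3;
		} else {
			/* Anything else is copied through unchanged */
			buf[used++] = *(encoded++);
		}
	}
	buf[used] = '\0';
	return used;
}

/* Check whether two encoded strings decode to the same raw value */
static int same_after_decoding ( const char *a, const char *b ) {
	char buf_a[256];
	char buf_b[256];

	percent_decode ( a, buf_a, sizeof ( buf_a ) );
	percent_decode ( b, buf_b, sizeof ( buf_b ) );
	return ( strcmp ( buf_a, buf_b ) == 0 );
}
```

A path written with an encoded slash and one written with a literal slash decode to the same raw string, so a server that distinguishes between them can be satisfied only by replaying the originally encoded substring.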
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some xHCI controllers (observed with the Thunderbolt ports on a
ThinkPad X1 Extreme Gen3 and a ThinkPad P53) seem to suffer a
catastrophic failure at the point that ExitBootServices() is called if
the IOMMU is enabled. The symptoms appear to be consistent with
another UEFI driver (e.g. the IOMMU driver, or the Thunderbolt driver)
having torn down the DMA mappings, leaving the xHCI controller unable
to write to host memory. The observable effect is that all commands
fail with a timeout, and attempts to abort command execution similarly
fail since the xHCI controller is unable to report the abort
completion.
Check for failure to abort a command, and respond by performing a full
device reset (as recommended by the xHCI specification) and by marking
the device as permanently failed.
Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Realistic Linux kernel command lines may exceed our current
256-character limit
character limit for interactively edited commands or settings.
Switch from stack allocation to heap allocation, and increase the
limit to 1024 characters.
Requested-by: Matteo Guglielmi <Matteo.Guglielmi@dalco.ch>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the "system MAC address" provided within the DSDT/SSDT if such an
address is available and has not already been assigned to a network
device.
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some vendors provide a "system MAC address" within the DSDT/SSDT, to
be used to override the MAC address for a USB docking station.
A full implementation would require an ACPI bytecode interpreter,
since at least one OEM allows the MAC address to be constructed by
executable ACPI bytecode (rather than a fixed data structure).
We instead attempt to extract a plausible-looking "_AUXMAC_#.....#"
string that appears shortly after an "AMAC" or "MACA" signature. This
should work for most implementations encountered in practice.
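The string extraction can be sketched as below. This is a simplified illustration under stated assumptions: the value between the "#" delimiters is taken to be exactly twelve hexadecimal digits, the search for the preceding "AMAC" or "MACA" signature is omitted, and the name auxmac_extract() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: "_AUXMAC_#" + 12 hex digits + "#" */
#define AUXMAC_PREFIX "_AUXMAC_#"
#define AUXMAC_DIGITS 12

/* Convert one hexadecimal digit to its value, or -1 if invalid */
static int hex_nibble ( uint8_t c ) {
	if ( ( c >= '0' ) && ( c <= '9' ) )
		return ( c - '0' );
	if ( ( c >= 'a' ) && ( c <= 'f' ) )
		return ( c - 'a' + 10 );
	if ( ( c >= 'A' ) && ( c <= 'F' ) )
		return ( c - 'A' + 10 );
	return -1;
}

/*
 * Scan a raw table image for a plausible-looking MAC address string.
 * Returns 0 and fills in "mac" on success, or -1 if nothing is found.
 */
static int auxmac_extract ( const uint8_t *data, size_t len,
			    uint8_t mac[6] ) {
	const size_t prefix_len = strlen ( AUXMAC_PREFIX );
	const size_t total_len = ( prefix_len + AUXMAC_DIGITS + 1 );
	const uint8_t *digits;
	size_t offset;
	unsigned int i;
	int hi;
	int lo;

	assert ( data != NULL );
	for ( offset = 0 ; ( offset + total_len ) <= len ; offset++ ) {
		/* Look for the prefix and the trailing delimiter */
		if ( memcmp ( &data[offset], AUXMAC_PREFIX, prefix_len ) != 0 )
			continue;
		digits = &data[ offset + prefix_len ];
		if ( digits[AUXMAC_DIGITS] != '#' )
			continue;
		/* Decode six bytes from twelve hex digits */
		for ( i = 0 ; i < 6 ; i++ ) {
			hi = hex_nibble ( digits[ 2 * i ] );
			lo = hex_nibble ( digits[ ( 2 * i ) + 1 ] );
			if ( ( hi < 0 ) || ( lo < 0 ) )
				break;
			mac[i] = ( ( uint8_t ) ( ( hi << 4 ) | lo ) );
		}
		if ( i == 6 )
			return 0;
	}
	return -1;
}
```

Since no bytecode is executed, an address constructed at run time by ACPI methods would not be found by a scan of this kind.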
Debugged-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for the DSDT/SSDT signature-scanning and value extraction code
to be reused for extracting a pass-through MAC address.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit cd3de55 ("[efi] Record cached DHCPACK from loaded image's
device handle, if present") added the ability for a chainloaded UEFI
iPXE to reuse an IPv4 address and DHCP options previously obtained by
a built-in PXE stack, without needing to perform a second DHCP
request.
Extend this to also record the cached ProxyDHCPOFFER and PXEBSACK
obtained from the EFI_PXE_BASE_CODE_PROTOCOL instance installed on the
loaded image's device handle, if present.
This allows a chainloaded UEFI iPXE to reuse a boot filename or other
options that were provided via a ProxyDHCP or PXE boot server
mechanism, rather than by standard DHCP.
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When building an EFI ROM image for which no PCI vendor/device ID is
applicable (e.g. bin-x86_64-efi/ipxe.efirom), the build process will
currently construct a command such as
  ./util/efirom -v -d -c bin-x86_64-efi/ipxe.efidrv \
                bin-x86_64-efi/ipxe.efirom
which gets interpreted as a vendor ID of "-0xd" (i.e. 0xfff3, after
truncation to 16 bits).
Fix by using an explicit zero ID when no applicable ID exists, as is
already done when constructing BIOS ROM images.
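The arithmetic behind the bogus vendor ID can be reproduced with a short sketch. It assumes that the ID argument is parsed as hexadecimal using strtoul() and then truncated to 16 bits; the helper name parse_pci_id() is hypothetical and is not taken from the efirom source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a PCI vendor or device ID from a command-line argument.
 *
 * strtoul() accepts a leading '-' and negates the parsed value, so
 * an argument of "-d" yields the unsigned negation of 0xd, which is
 * then truncated to 16 bits.
 */
static uint16_t parse_pci_id ( const char *arg ) {
	assert ( arg != NULL );
	return ( ( uint16_t ) strtoul ( arg, NULL, 16 ) );
}
```

With an explicit zero ID on the command line, the following "-d" is parsed as an option in its own right rather than being consumed as an ID value.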
Reported-by: Konstantin Aladyshev <aladyshev22@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DHCP service in EC2 has been observed to occasionally stop
responding for bursts of several seconds. This can easily result in a
failed boot, since the current cloud boot script will attempt DHCP
only once.
Work around this problem by retrying DHCP in a fairly tight cycle
within the cloud boot script, and falling back to a reboot after
several failed DHCP attempts.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As of commit f1e9e2b ("[efi] Align EFI image sections by page size"),
our SectionAlignment has been increased to 4kB in order to allow for
page-level memory protection to be applied by the UEFI firmware, with
FileAlignment left at 32 bytes.
The PE specification states that the value for FileAlignment "should
be a power of 2 between 512 and 64k, inclusive", and that "if the
SectionAlignment is less than the architecture's page size, then
FileAlignment must match SectionAlignment".
Testing shows that signtool.exe will reject binaries where
FileAlignment is less than 512, unless FileAlignment is equal to
SectionAlignment. This indicates a somewhat zealous interpretation of
the word "should" in the PE specification.
Work around this interpretation by increasing FileAlignment from 32
bytes to 512 bytes, and add explanatory comments for both
FileAlignment and SectionAlignment.
Debugged-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When building the Linux userspace binaries, the external system
headers may have already defined values for the __LITTLE_ENDIAN and
__BIG_ENDIAN constants.
Fix by retaining the existing values if already defined, since the
actual values of these constants do not matter.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 3986 section 3.1 defines URI schemes as case-insensitive (though
the canonical form is always lowercase).
Use strcasecmp() rather than strcmp() to allow for case insensitivity
in URI schemes.
Requested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RTL8211B seems to have a bug that prevents the link from coming up
unless the MII_MMD_DATA register is cleared.
The Linux kernel driver applies this workaround (in rtl8211b_resume())
only to the specific RTL8211B PHY model, along with a matching
workaround to set bit 9 of MII_MMD_DATA when suspending the PHY.
Since we have no need to ever suspend the PHY, and since writing a
zero ought to be harmless, we just clear the register unconditionally.
Debugged-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Tested-by: Nikolay Pertsev <nikolay.p@cos.flag.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The peer discovery time has a significant impact on the overall
PeerDist download speed, since each block requires an individual
discovery attempt. In most cases, a peer that responds for block N
will turn out to also respond for block N+1.
Assume that the most recently discovered peer (for any block) probably
has a copy of the next block to be discovered, thereby allowing the
peer download attempt to begin immediately.
In the case that this assumption is incorrect, the existing error
recovery path will allow for fallback to newly discovered peers (or to
the origin server).
Suggested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of GNU objcopy (observed with binutils 2.23.52.0.1 on
CentOS 7.0.1406) document the -D/--enable-deterministic-archives
option but fail to recognise the short form of the option.
Work around this problem by using the long form of the option.
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 040cdd0 ("[linux] Add a prefix to all symbols to avoid future
name collisions") unintentionally reintroduced an element of
non-determinism into the build ID, by omitting the -D option when
manipulating the blib.a archive.
Fix by adding the -D option to restore determinism.
Reworded-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Ip4ConfigDxe driver bug that was observed on Dell systems in
commit 64b4452 ("[efi] Blacklist the Dell Ip4ConfigDxe driver") has
also been observed on systems with a manufacturer name of "Itautec
S.A.". The symptoms of the bug are identical: an attempt to call
DisconnectController() on the LOM device handle will lock up the
system.
Fix by extending the veto to cover the Ip4ConfigDxe driver for this
manufacturer.
Debugged-by: Celso Viana <celso.vianna@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most RNDIS data structures include a trailing 4-byte reserved field.
For the REMOTE_NDIS_PACKET_MSG and REMOTE_NDIS_INITIALIZE_CMPLT
structures, this is an 8-byte field instead.
iPXE currently uses incorrect structure definitions with a 4-byte
reserved field in all data structures, resulting in data payloads that
overlap the last 4 bytes of the 8-byte reserved field.
RNDIS uses explicit offsets to locate any data payloads beyond the
message header, and so liberal RNDIS parsers (such as those used in
Hyper-V and in the Linux USB Ethernet gadget driver) are still able to
parse the malformed structures.
A stricter RNDIS parser (such as that found in some older Android
builds that seem to use an out-of-tree USB Ethernet gadget driver) may
reject the malformed structures since the data payload offset is less
than the header length, causing iPXE to be unable to transmit packets.
Fix by correcting the length of the reserved fields.
Debugged-by: Martin Nield <pmn1492@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The ARM versions of the big-integer inline assembly functions include
constraints to indicate that the output value is modified by the
assembly code. These constraints are not present in the equivalent
code for the x86 versions.
As of GCC 11, this results in the compiler reporting that the output
values may be uninitialized.
Fix by including the relevant memory output constraints.
Reported-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a file "initrd.magic" via the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL
that contains the initrd file as constructed for BIOS bzImage kernels
(including injected files with CPIO headers constructed by iPXE).
This allows BIOS and UEFI kernels to obtain the exact same initramfs
image, by adding "initrd=initrd.magic" to the kernel command line.
For example:
  #!ipxe
  kernel boot/vmlinuz initrd=initrd.magic
  initrd boot/initrd.img
  initrd boot/modules/e1000.ko /lib/modules/e1000.ko
  initrd boot/modules/af_packet.ko /lib/modules/af_packet.ko
  boot
Do not include the "initrd.magic" file within the root directory
listing, since doing so would break software such as wimboot that
processes all files within the root directory.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Restructure the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL implementation to
allow for the existence of virtual files that are not simply backed by
a single underlying image.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE will construct CPIO headers for images that have a non-empty
command line, thereby allowing raw images (without CPIO headers) to be
injected into a dynamically constructed initrd. This feature is
currently implemented within the BIOS-only bzImage format support.
Split out the CPIO header construction logic to allow for reuse in
other contexts such as in a UEFI build.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
DNS names are case-insensitive, and RFC 5280 (unlike RFC 3280)
mandates support for case-insensitive name comparison in X.509
certificates.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use hexadecimal values instead of macros in PCI_ROM entries so that
the Perl script can parse them correctly. Move PCI_ROM entries from
the header file to the C file. Integrate the bnxt_vf_nics array into
the PCI_ROM entries by introducing a BNXT_FLAG_PCI_VF flag in the
driver_data field. Add whitespace in PCI_ROM entries for style
consistency.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support for the zlib and gzip archive image formats is currently
included only if the IMAGE_ARCHIVE_CMD is used to enable the
"imgextract" command.
The ability to transparently execute a single-member archive image
without using the "imgextract" command renders this unintuitive: a
user wanting to gain the ability to boot a gzip-compressed kernel
image would expect to have to enable IMAGE_GZIP rather than
IMAGE_ARCHIVE_CMD.
Reverse the inclusion logic, so that archive image formats must now be
enabled explicitly (via IMAGE_GZIP and/or IMAGE_ZLIB), with the
archive image management commands dragged in as needed if any archive
image formats are enabled. The archive image management commands may
be explicitly disabled via IMAGE_ARCHIVE_CMD if necessary.
This matches the behaviour of IBMGMT_CMD and similar options, where
the relevant commands are included only when something else already
drags in the underlying feature.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An extracted image is wholly derived from the original archive image.
If the original archive image has been verified and marked as trusted,
then this trust logically extends to any image extracted from it.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide image_extract_exec() as a helper method to allow single-member
archive images (such as gzip compressed images) to be executed without
an explicit "imgextract" step.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Avoid using the "rdtsc" instruction unless profiling is enabled. This
allows the non-debug build of the UNDI driver to be used on a CPU such
as a 486 that does not support the TSC.
Reported-by: Nikolai Zhubr <n-a-zhubr@yandex.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The decompressor uses the i486 "bswap" instruction, but does not
require any instructions that exist only on i586 or above. Update the
".arch" directive to reflect the requirements of the code as
implemented.
Reported-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the concept of extracting an image from an archive (which could be
a single-file archive such as a gzip-compressed file), along with an
"imgextract" command to expose this functionality to scripts.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EFI PCI API takes a page count as the input to AllocateBuffer()
but a byte count as the input to Map(). There is nothing in the UEFI
specification that requires us to map exactly the allocated length,
and no systems have yet been observed that will fail if the map length
does not exactly match the allocated length. However, it is plausible
that some implementations may fail if asked to map a length that does
not match the length of the corresponding allocation.
Avoid potential future problems by always mapping the full allocated
length.
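The length calculation can be sketched as follows, assuming 4kB EFI pages. The macro and function names here are local stand-ins chosen for the example and are not the definitions used by iPXE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for page size conversions (4kB pages assumed) */
#define SKETCH_PAGE_SIZE 4096
#define SKETCH_SIZE_TO_PAGES( len ) \
	( ( (len) + SKETCH_PAGE_SIZE - 1 ) / SKETCH_PAGE_SIZE )
#define SKETCH_PAGES_TO_SIZE( pages ) \
	( (pages) * SKETCH_PAGE_SIZE )

/*
 * Calculate the length to pass to Map() for a buffer of "len" bytes
 * that was allocated as a whole number of pages via AllocateBuffer().
 */
static size_t full_map_length ( size_t len ) {
	size_t pages;

	/* Guard against overflow when rounding up */
	assert ( len <= ( SIZE_MAX - ( SKETCH_PAGE_SIZE - 1 ) ) );

	/* Page count as would be passed to AllocateBuffer() */
	pages = SKETCH_SIZE_TO_PAGES ( len );

	/* Byte count covering the whole allocation */
	return SKETCH_PAGES_TO_SIZE ( pages );
}
```

A request for 4097 bytes therefore allocates two pages and maps all 8192 bytes, rather than mapping only the 4097 bytes that were requested.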
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 79c0173 ("[build] Create util/genfsimg for building
filesystem-based images") introduced the new genfsimg, which lacks the
-l option when building ISO files. This option is required to build
level 2 (long plain) ISO9660 filenames, which are required when using
the .lkrn extensions on older versions of ISOLINUX.
Fix by passing the -l option when building ISO images.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The use of jumbo frames for the Xen netfront virtual NIC requires the
use of scatter-gather ("feature-sg"), with the receive descriptor ring
becoming a list of page-sized buffers and the backend using as many
page buffers as required for each packet.
Since iPXE's abstraction of an I/O buffer does not include any sort of
scatter-gather list, this requires an extra allocation and copy on the
receive datapath for any packet that spans more than a single page.
This support is required in order to successfully boot an AWS EC2
virtual machine (with non-enhanced networking) via iSCSI if jumbo
frames are enabled, since the netback driver used in EC2 seems not to
allow "feature-sg" to be renegotiated once the Linux kernel driver
takes over.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The INT 13 extensions provide a mechanism for accessing disks using
linear (LBA) rather than C/H/S addressing. SAN protocols such as
iSCSI invariably support only linear addresses and so iPXE currently
provides LBA access to all SAN disks (with autodetection and emulation
of an appropriate geometry for C/H/S accesses).
Most BIOSes will not report support for INT 13 extensions for floppy
disk drives, and some operating systems may be confused by a floppy
drive that claims such support.
Minimise surprise by reporting the existence of support for INT 13
extensions only for non-floppy drive numbers. Continue to provide
support for all drive numbers, to avoid breaking operating systems
that may unconditionally use the INT 13 extensions without first
checking for support.
Reported-by: Valdo Toost <vtoost@hot.ee>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When CONSOLE_SYSLOG is used, a DBG() from within a network device
driver may cause its transmit() or poll() methods to be unexpectedly
re-entered. Since these methods are not intended to be re-entrant,
this can lead to undefined behaviour.
Add an explicit re-entrancy guard to both methods. Note that this
must operate at a per-netdevice level, since there are legitimate
circumstances under which the netdev_tx() or netdev_poll() functions
may be re-entered (e.g. when using VLAN devices).
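A per-device guard of this kind might look roughly like the sketch below. The structure and function names are hypothetical and heavily simplified compared with the real network device core; the noisy_poll() method stands in for a driver whose debug output ends up polling the same device again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified network device */
struct sketch_netdev {
	/* Driver poll method */
	void ( * poll ) ( struct sketch_netdev *netdev );
	/* Re-entrancy flag for this device only */
	int poll_active;
	/* Number of times the driver poll method has run */
	unsigned int polls;
};

static void sketch_netdev_poll ( struct sketch_netdev *netdev );

/* A driver poll method whose debug output re-polls the device */
static void noisy_poll ( struct sketch_netdev *netdev ) {
	netdev->polls++;
	/* Simulate a network console poking the same device */
	sketch_netdev_poll ( netdev );
}

/* Poll a device, refusing to re-enter the driver for that device */
static void sketch_netdev_poll ( struct sketch_netdev *netdev ) {
	assert ( netdev != NULL );

	/* Do nothing if this device is already being polled */
	if ( netdev->poll_active )
		return;

	netdev->poll_active = 1;
	netdev->poll ( netdev );
	netdev->poll_active = 0;
}
```

Because the flag lives in the device structure rather than in a global variable, polling a different device (such as the trunk device underlying a VLAN device) from within a poll remains possible.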
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no method for obtaining the number of PCI buses when using
PCIAPI_DIRECT, and we therefore currently scan all possible bus
numbers. This can cause a several-second startup delay in some
virtualised environments, since PCI configuration space access will
necessarily require the involvement of the hypervisor.
Ameliorate this situation by defaulting to scanning only a single bus,
and expanding the number of PCI buses to accommodate any subordinate
buses that are detected during enumeration.
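The expanding scan can be illustrated with a small simulation. The bridge list below stands in for what would be read from PCI configuration space during enumeration; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a PCI bridge found during a scan */
struct sketch_bridge {
	/* Bus on which the bridge itself resides */
	unsigned int bus;
	/* Highest bus number reachable behind the bridge */
	unsigned int subordinate;
};

/*
 * Count the buses that need to be scanned, starting from a single
 * bus and growing the range whenever a bridge on an already-scanned
 * bus reports a higher subordinate bus number.
 */
static unsigned int sketch_count_buses ( const struct sketch_bridge *bridges,
					 size_t count ) {
	unsigned int num_buses = 1;
	unsigned int bus;
	size_t i;

	assert ( ( bridges != NULL ) || ( count == 0 ) );
	for ( bus = 0 ; bus < num_buses ; bus++ ) {
		for ( i = 0 ; i < count ; i++ ) {
			/* Only bridges on the current bus are visible */
			if ( bridges[i].bus != bus )
				continue;
			/* Expand the range to cover subordinate buses */
			if ( bridges[i].subordinate >= num_buses )
				num_buses = ( bridges[i].subordinate + 1 );
		}
	}
	return num_buses;
}
```

A system with no bridges is scanned as a single bus, which avoids the cost of probing every possible bus number.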
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Adding this missing identifier allows the X557-AT2 chipset seen on (at
least) Super Micro A2SDI-H-TF motherboards to function with iPXE.
Signed-off-by: Tyler J. Stachecki <stachecki.tyler@gmail.com>
After a PE image is fully loaded and relocated, the loader code may
opt to zero discardable sections for security reasons. This includes
relocation and debug information, as both contain hints about specific
locations within the binary. Mark both generated sections as
discardable, which follows the PE specification.
Signed-off-by: Marvin Häuser <mhaeuser@posteo.de>
For optimal memory permission management, PE sections need to be
aligned by the platform's minimum page size. Currently, the PE
section alignment is fixed to 32 bytes, which is below the typical 4kB
page size. Align all sections to 4kB and adjust ELF to PE image
conversion accordingly.
Signed-off-by: Marvin Häuser <mhaeuser@posteo.de>
As per https://github.com/ipxe/ipxe/pull/313#issuecomment-816018398,
these sections are not required for EFI execution. Discard them to
avoid producing malformed binaries as a result of
implementation-defined alignment.
Signed-off-by: Marvin Häuser <mhaeuser@posteo.de>
Handle a DHCPNAK by returning to the discovery state to allow iPXE to
attempt to obtain a replacement IPv4 address.
Reuse the existing logic for deferring discovery when the link is
blocked: this avoids hammering a misconfigured DHCP server with a
non-stop stream of requests and allows the DHCP process to eventually
time out and fail.
Originally-implemented-by: Blake Rouse <blake.rouse@canonical.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The iPXE build system is constructed for a standalone codebase with no
external dependencies, and does not have any equivalent of the
standard userspace ./configure script. We currently check for the
ability to include slirp/libslirp.h and conditionalise portions of
linux_api.c on its presence. The actual slirp driver code is built
unconditionally, as with all iPXE drivers.
This currently leads to a silent runtime failure if attempting to use
slirp.linux built on a system that was missing slirp/libslirp.h.
Convert this to a link-time failure by deliberately omitting the
relevant symbols from linux_api.c when slirp/libslirp.h is not
present. This allows other builds (e.g. tap.linux or tests.linux) to
succeed: the link-time failure will occur only if the slirp driver is
included within the build target.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Linux kernel 3.12 and earlier report a zero size via stat() for all
ACPI table files in sysfs. There is no way to determine the file size
other than by reading the file until EOF.
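Determining the size by reading until EOF can be sketched as follows, using only POSIX read(). The function name size_by_reading() is hypothetical, and the sketch discards the data rather than retaining it as real code would presumably need to do.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Determine the amount of data available from a file descriptor by
 * reading until EOF.  Returns the total length, or -1 on error.
 */
static ssize_t size_by_reading ( int fd ) {
	char buf[4096];
	ssize_t total = 0;
	ssize_t len;

	assert ( fd >= 0 );
	for ( ; ; ) {
		len = read ( fd, buf, sizeof ( buf ) );
		if ( len == 0 ) {
			/* EOF reached: total is the true size */
			return total;
		}
		if ( len < 0 ) {
			/* Retry if interrupted, otherwise fail */
			if ( errno == EINTR )
				continue;
			return -1;
		}
		total += len;
	}
}
```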
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Consumers of acpi_find() will assume that returned structures include
a valid table header and that the length in the table header is
correct. These assumptions are necessary when dealing with raw ACPI
tables, since there exists no independent source of length
information.
Ensure that these assumptions are also valid for ACPI tables read from
sysfs.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The statx() system call has a clean header file and a consistent
layout, but was unfortunately added only in kernel 4.11.
Using stat() or fstat() directly is extremely messy since glibc does
not necessarily use the kernel native data structures. However, as
the only current use case is to obtain the length of an open file, we
can merely provide a wrapper that does precisely this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DNS server list is currently printed as a debug message whenever
settings are applied. This can result in some very noisy debug logs
when a script makes extensive use of settings.
Move the DNS server list debug messages to DBGLVL_EXTRA.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow arbitrary settings to be specified on the Linux command line.
For example:
  ./bin-x86_64-linux/slirp.linux \
        --net slirp,testserver=qa-test.ipxe.org
This can be useful when using the Linux userspace build to test
embedded scripts, since it allows arbitrary parameters to be passed
directly on the command line.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Versions of gcc prior to 9.1 do not support the single-argument form
of static_assert(). Fix by unconditionally defining a compatibility
macro for the single file that uses this.
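A compatibility macro of this kind might look like the sketch below. The macro is given a distinct, hypothetical name here (sketch_static_assert) to avoid clashing with any static_assert already provided by the compiler or by assert.h; the structure being checked is likewise invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Single-argument static assertion built on the two-argument
 * _Static_assert keyword, using the stringified condition as the
 * diagnostic message.
 */
#define sketch_static_assert( x ) _Static_assert ( x, #x )

/* Hypothetical structure whose layout is to be checked */
struct sketch_header {
	uint32_t magic;
	uint32_t length;
};

/* Fails at compile time if the structure is not exactly 8 bytes */
sketch_static_assert ( sizeof ( struct sketch_header ) == 8 );
```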
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a driver using libslirp to provide a virtual network interface
without requiring root permissions on the host. This simplifies the
process of running iPXE as a Linux userspace application with network
access. For example:
  make bin-x86_64-linux/slirp.linux
  ./bin-x86_64-linux/slirp.linux --net slirp
libslirp will provide a built-in emulated DHCP server and NAT router.
Settings such as the boot filename may be controlled via command-line
options. For example:
  ./bin-x86_64-linux/slirp.linux \
        --net slirp,filename=http://192.168.0.1/boot.ipxe
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "used" attribute can be applied only to functions or variables,
which prevents the use of __asmcall as a type attribute.
Fix by removing "used" from the definition of __asmcall for i386 and
x86_64 architectures, and adding explicit __used annotations where
necessary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The ACPI API currently expects platforms to provide access to a single
contiguous ACPI table. Some platforms (e.g. Linux userspace) do not
provide a convenient way to obtain the entire ACPI table, but do
provide access to individual tables.
All iPXE consumers of the ACPI API require access only to individual
tables.
Redefine the internal API to make acpi_find() an API method, with all
existing implementations delegating to the current RSDT-based
implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The result from acpi_find_rsdt() is used only for the debug message.
Simplify the debug message and remove the otherwise redundant call to
acpi_find_rsdt().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When building as a Linux userspace application, iPXE currently
implements its own system calls to the host kernel rather than relying
on the host's C library. The output binary is statically linked and
has no external dependencies.
This matches the general philosophy of other platforms on which iPXE
runs, since there are no external libraries available on either BIOS
or UEFI bare metal. However, it would be useful for the Linux
userspace application to be able to link against host libraries such
as libslirp.
Modify the build process to perform a two-stage link: first picking
out the requested objects in the usual way from blib.a but with
relocations left present, then linking again with a helper object to
create a standard hosted application. The helper object provides the
standard main() entry point and wrappers for the Linux system calls
required by the iPXE Linux drivers and interface code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for the possibility of linking to platform libraries for the
Linux userspace build by adding an iPXE-specific symbol prefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Recent versions of the GNU assembler (observed with GNU as 2.35 on
Fedora 33) will produce a warning message
  Warning: no instruction mnemonic suffix given and no register
  operands; using default for `bts'
The operand size affects only the potential range for the bit number.
Since we pass the bit number as an unsigned int, it is already
constrained to 32 bits for both i386 and x86_64.
Silence the assembler warning by specifying an explicit 32-bit operand
size (and thereby matching the choice that the assembler would
otherwise make automatically).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the reference implementation of the EFI compression algorithm
(taken from the EDK2 codebase, with minor bugfixes to allow
compilation with -Werror) to compress EFI ROM images.
Inspired-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Assume that preservation of the %xmm registers is unnecessary during
installation of iPXE into memory, since this is an operation that by
its nature substantially disrupts large portions of the system anyway
(such as the E820 memory map). This assumption allows us to utilise
the existing CPUID code to check that FXSAVE/FXRSTOR are supported.
Test for support during the call to init_librm and store the flag for
use during subsequent calls to virt_call.
Reduce the scope of TIVOLI_VMM_WORKAROUND to affecting only the call
to check_fxsr(), to reduce #ifdef pollution in the remaining code.
Debugged-by: Johannes Heimansberg <git@jhe.dedyn.io>
Signed-off-by: Michael Brown <mcb30@ipxe.org>