Commit edf74df ("[pxe] Always reconstruct packet for
PXENV_GET_CACHED_INFO") fixed the problems caused by returning stale
DHCP packets (e.g. from an earlier boot attempt using a different
network device), but broke interoperability with NBPs such as WDS
which may overwrite our cached (fake) DHCP packets and expect the
modified packets to be returned by a subsequent call to
PXENV_GET_CACHED_INFO.
Fix by constructing the fake DHCP packets immediately before
transferring control to a PXE NBP. Calls to PXENV_GET_CACHED_INFO
will now never modify the cached packets.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A relatively common user mistake is to attempt to boot an EFI
executable (such as grub.efi) using a BIOS version of iPXE.
Unfortunately there are no signature checks that we can use to
unambiguously identify a PXE NBP, since a PXE NBP is just raw machine
code. We therefore have to accept anything sufficiently small to fit
into base memory as a valid PXE NBP.
We can detect that a file might be an EFI executable by checking for
the initial "MZ" signature bytes. This does not necessarily preclude
the file from also being a PXE NBP (since it would be possible to
create a hybrid binary which acts as both an EFI executable and a PXE
NBP, similar to the way in which wimboot and the Linux kernel are
hybrid binaries which act as both an EFI executable and a bzImage).
If the initial "MZ" signature bytes are present, then attempt to warn
the user by setting the image type to "PXE-NBP (may be EFI?)". We
can't (sensibly) prevent the user from accidentally running an EFI
executable as a PXE NBP, but we can at least make it easier for the
user to identify their mistake.
Inspired-by: Robin Smidsrød <robin@smidsrod.no>
Inspired-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
UEFI platforms may provide a watchdog timer, which will reboot the
machine if an operating system takes more than five minutes to load.
This can cause long-lived iPXE downloads (or interactive shell
sessions) to unexpectedly reboot.
Fix by resetting the watchdog timer every ten seconds while the iPXE
main processing loop continues to run.
Reported-by: Bradley B Williams <bradleybwilliams@swbell.net>
Reported-by: John Clark <john.r.clark.3@gmail.com>
Reported-by: wdriever@gmail.com
Reported-by: Charlie Beima <cbeima@indiana.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Check for existence of the UART in uart_select(), not just in
uart_init(). This allows uart_select() to refuse to set a non-working
address in uart->base, which in turns means that the serial console
code will not attempt to use a non-existent UART.
Reported-by: Torgeir Wulfsberg <Torgeir.Wulfsberg@kongsberg.com>
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When the ability for iPXE to handle multiple serial ports was added,
the choice was made that the singular serial port referred to by
COMBOOT calls should mean the port used for the serial console. This
unintentionally caused IMAGE_COMBOOT to also enable CONSOLE_SERIAL.
Fix by providing a weak-symbol version of the serial console which
will be used if serial console support was not explicitly enabled.
Reported-by: Torgeir Wulfsberg <Torgeir.Wulfsberg@kongsberg.com>
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We do not set up any kind of virtual addressing before invoking an
ELFBoot image. Reject if the image's program headers indicate that
virtual addresses are not equal to physical addresses.
This avoids problems when loading some RHEL5 kernels, which seem to
include ELFBoot headers using virtual addressing. With this change,
these kernels are no longer detected as ELFBoot, and so may be
(correctly) detected as bzImage instead.
Reported-by: Torgeir.Wulfsberg@kongsberg.com
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose the high bit of the VGA text attribute byte via the ANSI SGR
parameters 5 ("blink on") and 25 ("blink off").
Note that some video cards (and virtual machines) may display a high
intensity background colour instead of blinking text.
Signed-off-by: Christian Nilsson <nikize@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Older, out-of-tree Xen kernel modules (such as those provided with
SuSE Linux Enterprise Server 11) do not clear the leftover "event
pending" bit when opening an event channel. Consequently, no event is
ever delivered to indicate that there is information in the XenStore
ring buffer, and the system hangs shortly after loading the
xen-platform-pci kernel module.
Work around this problem by always waiting for the XenStore event
channel to be signalled, and clearing the event before processing the
received data.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Avoid accidentally returning stale packets (e.g. for a previously
attempted network device) by always constructing a fresh DHCP packet.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The size of the .mrom payload (the second PCI ROM image) is defined in
its PCI header. The code type for the .mrom payload image is
deliberately set to an invalid value (0xff) to ensure that no BIOS
tries to parse anything in the image other than the PCI header.
Since the code type is not set to 0x00 ("Intel x86, PC-AT
compatible"), bytes 0x02-0x17 should not be interpreted by the BIOS as
being in the standard ISA expansion ROM format. In particular, the
byte at offset 0x02 does not represent the length of the ROM image (in
512-byte blocks).
However, some Dell BIOSes seem to erroneously use the byte at offset
0x02 to determine the length of the .mrom payload when walking the
list of PCI ROM images. Since this byte is currently set to zero,
this can lead to the BIOS getting stuck in an infinite loop during
POST. (This problem may not arise if the .mrom payload is the final
image in the ROM, since the BIOS will then have no reason to attempt
to locate the next image.)
One possible workaround would be to put the real payload size in this
byte, but doing so would constrain the .mrom payload size to 128kB
(see commit 8049a52 ("[mromprefix] Allow for .mrom images larger than
128kB") for more details).
Another possible workaround would be to put the real payload size as a
word in bytes 0x02-0x03 (as is done for EFI ROMs). This would not
constrain the .mrom payload size, but a payload size which happened to
be exactly 128kB would result in a zero value in the byte at offset
0x02 and so could still result in infinite loops on BIOSes with this
bug.
We choose to place a fixed value of 0x01 in the byte at offset 0x02.
This should at least prevent the BIOS from getting stuck in an
infinite loop. (The BIOS may walk into the middle of the .mrom
payload, where it will almost certainly not find a valid {0x55,0xaa}
signature or a valid PCIR header, and will therefore hopefully abort
processing.)
Reported-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some HP BIOSes (observed with an HP ProLiant m710p Server Cartridge)
have a bug in the implementation of INT 1a,b101: they blithely assume
that real-mode code is able to read from anywhere in the 32-bit memory
space.
This problem affects the call to INT 1a,b101 made from within
pcibios_num_bus() (which uses REAL_CODE() and hence executes in
genuine real mode) but does not affect the call made from within
romprefix.S (since with a PMM BIOS, that call executes in flat real
mode anyway).
Work around the problem by explicitly calling flatten_real_mode()
before invoking INT 1a,b101. This is a rarely-used code path, and so
the extra overhead of emulating instructions in some VM configurations
(see commit 6d4deee ("[librm] Use genuine real mode to accelerate
operation in virtual machines") for more details) is negligible.
Reported-by: Wissam Shoukair <wissams@mellanox.com>
Debugged-by: Wissam Shoukair <wissams@mellanox.com>
Debugged-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several popular public cloud providers do not provide any sensible
mechanism for obtaining debug output from an OS which is failing to
boot. For example, Amazon EC2 provides the "Get System Log" facility,
which occasionally deigns to report a random subset of the characters
emitted via the VM's serial port, but usually returns only a blank
screen. (Amazingly, this is still superior to the debugging
facilities provided by Azure.)
Work around these shortcomings by adding a console type which sends
output to a magically detected raw disk partition, and including such
a partition within any iPXE .usb-format image.
To use this facility:
- build an iPXE .usb image with CONSOLE_INT13 enabled
- boot the cloud VM from this image
- after the boot fails, attach the VM's boot disk to a second VM
- from this second VM, use "less -f -R /dev/sdb3" (or similar) to
view the iPXE output.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rename PCI_CLASS() (which constructs a struct pci_class_id) to
PCI_CLASS_ID(), and provide PCI_CLASS() as a macro which constructs
the 24-bit scalar value of a PCI class code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "vram" setting returns the (Base64-encoded) contents of video RAM,
and can be used to capture a screenshot. For example: after running
memtest.0 and encountering an error, the output can be captured and
sent to a remote server for later diagnosis:
#!ipxe
chain -a http://server/memtest.0 && goto ok || goto bad
:bad
params
param errno ${errno}
param vram ${vram}
chain -a http://server/report.php##params
:ok
Inspired-by: Christian Nilsson <nikize@gmail.com>
Originally-implemented-by: Christian Nilsson <nikize@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The USB bus drivers (ehci.c and xhci.c) have PCI device ID tables and
hence PCI_ROM() lines, but should probably not be included in the
all-drivers build on this basis, since they do nothing useful unless a
USB network driver is also present.
Fix by constructing the all-drivers list based on the driver class
(i.e. the portion of the source path immediately after "drivers/").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The assembler on OpenBSD 5.7 seems not to correctly handle the
combinations of .struct and .previous used in unlzma.S, and ends up
complaining about an "attempt to allocate data in absolute section".
Work around this problem by explicitly resetting the section after the
data structure definitions.
Reported-by: Jiri B <jirib@devio.us>
Tested-by: Jiri B <jirib@devio.us>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PCI v3.0 supports a "device list" which allows the ROM to claim
support for multiple PCI device IDs (but only a single vendor ID).
Add support for building such ROMs by scanning the build target
element list and incorporating any device IDs into the ROM's device
list header. For example:
make bin/8086153a--8086153b.mrom
would build a ROM claiming support for both 8086:153a and 8086:153b.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Entropy gathering via timer ticks is slow under UEFI (of the order of
20-30 seconds on some machines). Use the EFI_RNG_PROTOCOL if
available, to speed up the process of entropy gathering.
Note that some implementations (including EDK2) will fail if we
request fewer than 32 random bytes at a time, and that the RNG
protocol provides no guarantees about the amount of entropy provided
by a call to GetRNG(). We take the (hopefully pessimistic) view that
a 32-byte block returned by GetRNG() will contain at least the 1.3
bits of entropy claimed by min_entropy_per_sample().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Our current behaviour when booting as a ROM is to autoboot only from
devices which are attached via the PCI bus:dev.fn address passed to
the ROM's initialisation vector.
Add a build configuration option AUTOBOOT_ROM_FILTER (enabled by
default) to control this behaviour. This allows for ROMs to be built
which will attempt to boot from any detected device, even if not
attached via the original PCI bus:dev.fn address. (This is
particularly useful when building combined EHCI/xHCI ROMs for USB
network boot, since the BIOS may request a boot via the EHCI
controller but the xHCI driver will reroute the root hub ports to the
xHCI controller.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We hook the UEFI ExitBootServices() event and use it to trigger a call
to shutdown_boot(). This does not automatically cause drivers to be
disconnected from their devices, since device enumeration is now
handled by the UEFI core rather than by iPXE. (Under the old and
dubiously compatible device model, iPXE used to perform its own device
enumeration and so the call to shutdown_boot() would indeed have
caused drivers to be disconnected.)
Fix by replicating parts of the dummy "EFI root device" from
efiprefix.c to efidrvprefix.c, so that the call to shutdown_boot()
will call efi_driver_disconnect_all().
Originally-fixed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
None of the x86_64 builds currently have any way of invoking these
functions. They are included only to avoid introducing unnecessary
architecture-specific dependencies into the self-test suite.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 8ab4b00 ("[libc] Rewrite setjmp() and longjmp()") introduced a
regression in which the saved values of %ebx, %esi, and %edi were all
accidentally restored into %esp. The result is that the second and
subsequent returns from setjmp() would effectively corrupt %ebx, %esi,
%edi, and the stack pointer %esp.
Use of setjmp() and longjmp() is generally discouraged: our only use
occurs as part of the implementation of PXENV_RESTART_TFTP, since the
PXE API effectively mandates its use here. The call to setjmp()
occurs at the start of pxe_start_nbp(), where there are almost
certainly no values held in %ebx, %esi, or %edi. The corruption of
these registers therefore had no visible effect on program execution.
The corruption of %esp would have been visible on return from
pxe_start_nbp(), but there are no known PXE NBPs which first call
PXENV_RESTART_TFTP and subsequently attempt to return to the PXE base
code. The effect on program execution was therefore similar to that
of moving the stack to a pseudo-random location in the 32-bit address
space; this will often allow execution to complete successfully since
there is a high chance that the pseudo-random location will be unused.
The regression therefore went undetected for around one month.
Fix by restoring the correct registers from the saved jmp_buf
structure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several of the assembly files in arch/i386/prefix were missed by the
automated relicensing tool due to missing licence declarations, code
dating back to the initial git revision, etc. Manual review shows
that these files may be relicensed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Relicense files with kind permission from
Stefan Hajnoczi <stefanha@redhat.com>
alongside the contributors who have already granted such relicensing
permission.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At some point in the past few years, binutils became more aggressive
at removing unused symbols. To function as a symbol requirement, a
relocation record must now be in a section marked with @progbits and
must not be in a section which gets discarded during the link (either
via --gc-sections or via /DISCARD/).
Update REQUIRE_SYMBOL() to generate relocation records meeting these
criteria. To minimise the impact upon the final binary size, we use
existing symbols (specified via the REQUIRING_SYMBOL() macro) as the
relocation targets where possible. We use R_386_NONE or R_X86_64_NONE
relocation types to prevent any actual unwanted relocation taking
place. Where no suitable symbol exists for REQUIRING_SYMBOL() (such
as in config.c), the macro PROVIDE_REQUIRING_SYMBOL() can be used to
generate a one-byte-long symbol to act as the relocation target.
If there are versions of binutils for which this approach fails, then
the fallback will probably involve killing off REQUEST_SYMBOL(),
redefining REQUIRE_SYMBOL() to use the current definition of
REQUEST_SYMBOL(), and postprocessing the linked ELF file with
something along the lines of "nm -u | wc -l" to check that there are
no undefined symbols remaining.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The valgrind headers are not x86-specific; they detect the CPU
architecture and contain inline assembly for multiple architectures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
These files cannot be automatically relicensed by util/relicense.pl
since they either contain unusual but trivial contributions (such as
the addition of __nonnull function attributes), or contain lines
dating back to the initial git revision (and so require manual
knowledge of the code's origin).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Relicence files with kind permission from the following contributors:
Alex Williamson <alex.williamson@redhat.com>
Eduardo Habkost <ehabkost@redhat.com>
Greg Jednaszewski <jednaszewski@gmail.com>
H. Peter Anvin <hpa@zytor.com>
Marin Hannache <git@mareo.fr>
Robin Smidsrød <robin@smidsrod.no>
Shao Miller <sha0.miller@gmail.com>
Thomas Horsten <thomas@horsten.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The code in lzma_literal() checks to see if we are at the start of the
compressed input data in order to determine whether or not a most
recent output byte exists. This check is incorrect, since
initialisation of the decompressor will always consume the first five
bytes of the compressed input data.
Fix by instead checking whether or not we are at the start of the
output data stream. This is, in any case, a more logical check.
This issue was masked during development and testing since virtual
machines tend to zero the initial contents of RAM; the spuriously-read
"most recent output byte" is therefore likely to already be a zero
when running in a virtual machine.
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The 0xe9 debug port exists only on virtual machines. Provide an
option to print debug output on the BIOS console, to allow for
debugging on real hardware.
Note that this option can be used only if the decompressor is called
in flat real mode; the easiest way to achieve this is to build with
DEBUG=libprefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the decompressor the option of generating debugging output via
the BIOS console by calling it in flat real mode (rather than 16-bit
protected mode) when libprefix.S is built with debugging enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
LZMA performs an extra normalisation after decompression is complete,
which does not affect the output but may consume an extra byte from
the input (and so may affect which byte is identified as being the
start of the next block).
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
LZMA provides significantly better compression (by ~15%) than the
current NRV2B algorithm.
We use a raw LZMA stream (aka LZMA1) to avoid the need for code to
parse the LZMA2 block headers. We use parameters {lc=2,lp=0,pb=0} to
reduce the stack space required by the decompressor to acceptable
levels (around 8kB). Using lc=3 or pb=2 would give marginally better
compression, but at the cost of substantially increasing the required
stack space.
The build process now requires the liblzma headers to be present on
the build system, since we do not include a copy of an LZMA compressor
within the iPXE source tree. The decompressor is written from scratch
(based on XZ Embedded) and is entirely self-contained within the
iPXE source.
The branch-call-jump (BCJ) filter used to improve the compressibility
is specific to iPXE. We choose not to use liblzma's built-in BCJ
filter since the algorithm is complex and undocumented. Our BCJ
filter achieves approximately the same results (on typical iPXE
binaries) with a substantially simpler algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some decompression algorithms (e.g. LZMA) require large amounts of
temporary stack space, which may not be made available by all
prefixes. Use .bss16 as a temporary stack for the duration of the
calls to install_block (switching back to the external stack before we
start making calls into code which might access variables in .bss16),
and allow the decompressor to define a global symbol to force a
minimum value on the size of .bss16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Other hypervisors (e.g. KVM) may provide an unusable subset of the
Hyper-V features, and our attempts to use these non-existent features
cause the guest to reboot.
Fix by explicitly checking for the Hyper-V features that we use.
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The 8254 timer code (used to implement udelay()) has an unknown
provenance. Rewrite this code to avoid potential licensing
uncertainty.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As with memcpy(), we can reduce the code size (by an average of 0.2%)
by giving the compiler more visibility into what memset() is doing,
and by avoiding the "rep" prefix on short fixed-length sequences of
string operations.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some of the C library string functions have an unknown provenance.
Reimplement all such functions to avoid potential licensing
uncertainty.
Remove the inline-assembler versions of strlen(), memswap(), and
strncmp(); these save a minimal amount of space (around 40 bytes in
total) and are not performance-critical.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hardened versions of gcc default to building position-independent
code, which breaks our i386 build. Our build process therefore
detects such platforms and automatically adds "-fno-PIE -nopie" to the
gcc command line.
On x86_64, we choose to build position-independent code (in order to
reduce the final binary size and, in particular, the number of
relocations required for UEFI binaries). The workaround therefore
breaks the build process for x86_64 binaries on such platforms.
Fix by moving the workaround to the i386-specific portion of the
Makefile.
Reported-by: Jan Kundrát <jkt@kde.org>
Debugged-by: Jan Kundrát <jkt@kde.org>
Debugged-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use PRODUCT_SHORT_NAME instead of a hardcoded "iPXE" for strings which
are typically shown in the user interface.
Note that this only allows for customisation of the user interface.
Where the "iPXE" string serves a technical purpose (such as in the
HTTP User-Agent), the string cannot be customised.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some devices return multiple packets in a single poll. Handle such
devices gracefully by enqueueing received PXE UDP packets (along with
a pseudo-header to hold the IPv4 addresses and port numbers) and
dequeueing them on subsequent calls to PXENV_UDP_READ.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Fetching the TFTP file size is currently implemented via a custom
"tftpsize://" protocol hack. Generalise this approach to instead
close the TFTP connection whenever the parent data-transfer interface
is closed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow drivers to specify a supported PCI class code. To save space in
the final binary, make this an attribute of the driver rather than an
attribute of a PCI device ID list entry.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EDK2 codebase uses -malign-double for 32-bit builds, which causes
64-bit integers to be naturally aligned. This affects the layout of
some structures (including EFI_BLOCK_IO_MEDIA).
This mirrors wimboot commit 7b8f39d ("[build] Fix building of 32-bit
UEFI version").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .mrom payload has a code type of 0xff and so the initialisation
length field (single byte at offset 0x02) does not need to be
present. Use only the PCI header's image length field, which allows
the .mrom payload to be up to 32MB in size.
Inspired-by: Swift Geek <swiftgeek@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
mromprefix.S currently uses the initialisation length field (single
byte at offset 0x02) to determine the length of a ROM image within a
multi-image ROM BAR. For PCI ROM images with a code type other than
0, the initialisation length field may not be present.
Fix by using the PCI header's image length field instead.
Inspired-by: Swift Geek <swiftgeek@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The build process has for a long time assumed that every ROM is a PCI
ROM, and will always include the PCI header and PCI-related
functionality (such as checking the PCI BIOS version, including the
PCI bus:dev.fn address within the ROM product name string, etc.).
While real ISA cards are no longer in use, some virtualisation
environments (notably VirtualBox) have support only for ISA ROMs.
This can cause problems: in particular, VirtualBox will call our
initialisation entry point with random garbage in %ax, which we then
treat as the PCI bus:dev.fn address of the autoboot device: this
generally prevents the default boot sequence from using any network
devices.
Create .isarom and .pcirom prefixes which can be used to explicitly
specify the type of ROM to be created. (Note that the .mrom prefix
always implies a PCI ROM, since the .mrom mechanism relies on
reconfiguring PCI BARs.)
Make .rom a magic prefix which will automatically select the
appropriate PCI or ISA ROM prefix for ROMs defined via a PCI_ROM() or
ISA_ROM() macro. To maintain backwards compatibility, we default to
building a PCI ROM for anything which is not directly derived from a
PCI_ROM() or ISA_ROM() macro (e.g. bin/intel.rom).
Add a selection of targets to "make everything" to ensure that the
(relatively obscure) ISA ROM build process is included within the
per-commit QA checks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Since some PnP BIOSes fail to set %es:di to point to the PnP signature
on entry, we identify a PnP BIOS by scanning through the top 64kB of
base memory looking for the PnP structure. We therefore don't
actually use the values of %es:di provided to the initialisation entry
point, and so there is no need to preserve them.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Using version 1 grant tables limits guests to using 16TB of grantable
RAM, and prevents the use of subpage grants. Some versions of the Xen
hypervisor refuse to allow the grant table version to be set after the
first grant references have been created, so the loaded operating
system may be stuck with whatever choice we make here. We therefore
currently use version 2 grant tables, since they give the most
flexibility to the loaded OS.
Current versions (7.2.0) of the Windows PV drivers have no support for
version 2 grant tables, and will merrily create version 1 entries in
what the hypervisor believes to be a version 2 table. This causes
some confusion.
Avoid this problem by attempting to use version 1 tables, since
otherwise we may render Windows unable to boot.
Play nicely with other potential bootloaders by accepting either
version 1 or version 2 grant tables (if we are unable to set our
requested version).
Note that the use of version 1 tables on a 64-bit system introduces a
possible failure path in which a frame number cannot fit into the
32-bit field within the v1 structure. This in turn introduces
additional failure paths into netfront_transmit() and
netfront_refill_rx().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At some point during XenServer development history, the Windows PV
drivers changed to using a PCI device ID of 5853:0002 rather than
5853:0001. Current (7.2.0) drivers will bind to either 5853:0001 or
5853:0002, and the general approach taken by the world at large
(including Amazon EC2) seems to be to use only 5853:0001.
However, the current version of XenServer (6.2.0) will create the
platform device as 5853:0002 (via the platform:device_id VM parameter)
for any VMs created using the built-in templates for Windows Vista or
later.
Accept either PCI ID, since the underlying device is identical.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently treat network devices as available for use via the SNP
API only if RX queue processing has been frozen. (This is similar in
spirit to the way that RX queue processing is frozen for the network
device currently exposed via the PXE API.)
The default state of a freshly created network device is for the RX
queue to not be frozen, and thus to be unavailable for use via SNP.
This causes problems when devices are added through code paths other
than _efidrv_start() (which explicitly releases devices for use via
SNP).
We don't actually need to freeze RX queue processing, since calls via
the SNP API will always use netdev_poll() rather than net_poll(), and
so will never trigger the RX queue processing code path anyway.
We can therefore simplify the code to use a single global flag to
indicate whether network devices are claimed for use by iPXE or
available for use via SNP. Using a global flag allows the default
state for dynamically created network devices to behave sensibly.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add basic support for Xen PV-HVM domains (detected via the Xen
platform PCI device with IDs 5853:0001), including support for
accessing configuration via XenStore and enumerating devices via
XenBus.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When a 32-bit iPXE binary is running on a system which allocates PCI
memory BARs above 4GB, our PCI subsystem will return the base address
for any such BARs as zero (with a warning message if DEBUG=pci is
enabled). Currently, ioremap() will happily map an address pointing
to the start of physical memory, providing no sensible indication of
failure.
Fix by always returning NULL if we are asked to ioremap() a zero bus
address.
With a totally flat memory model (e.g. under EFI), this provides an
accurate failure indication since no PCI peripheral will be mapped to
the zero bus address.
With the librm memory model, there is the possibility of a spurious
NULL return from ioremap() if the bus address happens to be equal to
virt_offset. Under the current virtual memory map, the NULL virtual
address will always be the start of .textdata, and so this problem
cannot occur; a NULL return from ioremap() will always be an accurate
failure indication.
Debugged-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a single instance of EFI_DRIVER_BINDING_PROTOCOL (attached to
our image handle); this matches the expectations scattered throughout
the EFI specification.
Open the underlying hardware device using EFI_OPEN_PROTOCOL_BY_DRIVER
and EFI_OPEN_PROTOCOL_EXCLUSIVE, to prevent other drivers from
attaching to the same device.
Do not automatically connect to devices when being loaded as a driver;
leave this task to the platform firmware (or to the user, if loading
directly from the EFI shell).
When running as an application, forcibly disconnect any existing
drivers from devices that we want to control, and reconnect them on
exit.
Provide a meaningful driver version number (based on the build
timestamp), to allow platform firmware to automatically load newer
versions of iPXE drivers if multiple drivers are present.
Include device paths within debug messages where possible, to aid in
debugging.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some external code (observed with FreeBSD's bootloader) will continue
to make INT 13 calls after reconfiguring the 8259 PIC to change the
vector offsets for IRQs. If an IRQ (e.g. the timer IRQ) subsequently
occurs while iPXE is in protected mode, this will cause a general
protection fault since the corresponding IDT entry is empty.
A general protection fault is INT 0x0d, which happens to overlap with
the original IRQ5. We therefore do have an ISR set up to handle a
general protection fault, but this ISR simply reflects the interrupt
down to the real-mode INT 0x0d and then attempts to return. Since our
ISR is expecting a hardware interrupt rather than a general protection
fault, it doesn't remove the error code from the stack before issuing
the iret instruction; it therefore attempts to return to a garbage
address. Since the segment part of this address is likely to be
invalid, a second general protection fault occurs. This cycle
continues until we run out of stack space and triple fault.
Fix by reflecting all INTs down to real mode. This actually reduces
the code size by four bytes (but increases the bss size by almost
2kB).
Reported-by: Brian Rak <dn@devicenull.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The bzImage boot protocol allows the real-mode code to be loaded at
any segment within base memory. (The fact that both iPXE and recent
versions of Syslinux will load the real-mode code at 1000:0000 is a
coincidence; it is not guaranteed by the specification.)
Fix by making the code relocatable.
Reported-by: Andrew Stuart <andrew@shopcusa.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rework geniso and genliso to provide a single merged utility for
generating ISO images.
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .lkrn prefix currently provides a zImage kernel with unused setup
sectors and the whole iPXE binary placed within the "protected mode
kernel" portion of the zImage.
The work carried out years ago to create the .mrom format provides a
mechanism allowing the iPXE binary to be split into a small real-mode
header and a larger payload. This neatly matches the way that a
bzImage is loaded: the "setup sectors" can contain the header and the
"protected mode kernel" can contain the payload.
This removes the size restrictions on an iPXE .lkrn image (and hence
on derived image formats such as .iso).
Also remove obsolete copyright information, since none of the original
code or functionality now remains.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Interrupt processing adds noise to profiling results. Allow
interrupts (from within protected mode) to be profiled separately,
with time spent within the interrupt handler being excluded from any
other profiling currently in progress.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_UNDI_ISR calls may implicitly refill the underlying receive
ring, and so could continue to retrieve packets indefinitely. Place
an upper limit on the number of calls to PXENV_UNDI_ISR per call to
undinet_poll().
Signed-off-by: Michael Brown <mcb30@ipxe.org>