initrd_init() calls umalloc() to allocate space for the initrd image,
but does so before hide_etherboot() has been called. It is therefore
possible for the initrd to end up overwriting iPXE itself.
Fix by converting initrd_init() from an init_fn to a startup_fn.
Originally-fixed-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The command line may be situated in an area of base memory that will
be overwritten by iPXE's real-mode segments, causing the command line
to be corrupted before it can be used.
Fix by creating a copy of the command line on the prefix stack (below
0x7c00) before installing the real-mode segments.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_FILE_EXIT_HOOK is designed to allow ipxelinux.0 to unload both
the iPXE and pxelinux components without affecting the underlying PXE
stack. Unfortunately, it causes unexpected behaviour in other
situations, such as when loading a non-embedded pxelinux.0 via
undionly.kpxe. For example:
PXE ROM -> undionly.kpxe -> pxelinux.0 -> chain.c32 to boot hd0
would cause control to return to iPXE instead of booting from the hard
disk. In some cases, this would result in a harmless but confusing
"No more network devices" message; in other cases stranger things
would happen, such as being returned to the iPXE shell prompt.
The fundamental problem is that when pxelinux detects
PXENV_FILE_EXIT_HOOK, it may attempt to specify an exit hook and then
exit back to iPXE, assuming that iPXE will in turn exit cleanly via
the specified exit hook. This is not a valid assumption in the
general case, since the action of exiting back to iPXE does not
directly cause iPXE to exit itself. (In the specific case of
ipxelinux.0, this will work since the embedded script exits as soon as
pxelinux.0 exits.)
Fix the unexpected behaviour in the non-ipxelinux.0 cases by including
support for PXENV_FILE_EXIT_HOOK only when using a new .kkkpxe format.
The ipxelinux.0 build process should therefore now use undionly.kkkpxe
instead of undionly.kkpxe.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Very nasty things can happen if a NULL network device is used. Check
that pxe_netdev is non-NULL at the applicable entry points, so that
this type of problem gets reported to the caller rather than being
allowed to crash the system.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On at least one PXE stack (Realtek r8169), PXENV_UNDI_INITIALIZE has
been observed to fail intermittently due to a media test failure (PXE
error 0x00000061). Retrying the call to PXENV_UNDI_INITIALIZE
succeeds, and the NIC is then usable.
It is worth noting that this particular Realtek PXE stack is already
known to be unreliable: for example, it repeatably fails its own
boot-time media test after every warm reboot.
Fix by attempting PXENV_UNDI_INITIALIZE multiple times, with a short
delay between each attempt to allow the link to settle.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow an initrd (such as an embedded script) to be passed to iPXE when
loaded as a .lkrn (or .iso) image. This allows an embedded script to
be varied without recompiling iPXE.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Specify a driver name of "undionly" and a device name based on the
UNDI-reported underlying hardware device. For example:
net0: 52:54:00:12:34:56 using undionly on UNDI-PCI00:03.0 (open)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes are reported to corrupt %ebx when using INT 15,2401 (see
http://opensolaris.org/jive/thread.jspa?messageID=377026). Guard
against this by preserving all (non-segment) registers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The symbol_text16 is defined globally by the linker. Use rm_text16
instead of _text16 for the local variable within librm.S to avoid
confusion when reading linker maps.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
All users of imgdownload() require registration of the image, so make
registration an integral part of imgdownload() itself and simplify the
"action" parameter to be one of image_select(), image_exec() et al.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE specifies a value of 0 for cmdline_size, causing GRUB to not pass
in a command line. Fix by setting cmdline_size to the maximum value
of 2047.
Signed-off-by: Valentine Barshak <gvaxon@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the link layer to directly report whether or not a packet is
multicast or broadcast at the time of calling pull(), rather than
relying on heuristics to determine this at a later stage.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow objects to support both streaming and block device protocols, by
starting streaming data only when the data transfer window opens.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some bootloaders seem to add "BOOT_IMAGE=..." at the end of the
command line; some at the start. Cope with either variation.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
IBM BIOSes ignore the PnP header offset stored at address 0x1a and
instead scan for the $PnP signature on a 16-byte boundary. (This
alignment is not mandated by the PnP specification.)
Force PnP header to a 16-byte boundary to work around these BIOSes.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several BIOSes (including most IBM BIOSes and many virtual machine
BIOSes) do not provide detectable PnP support, but will use the BEV
entry point for a PnP option ROM. On these semi-PnP BIOSes, iPXE will
respond to the absence of detectable PnP support by hooking INT19,
which disrupts the boot order.
BIOSes that genuinely require hooking INT19 seem to be very rare
nowadays. It may therefore be preferable to assume that the absence
of detectable PnP support indicates a semi-PnP BIOS rather than a
non-PnP BIOS.
Change the default behaviour so that INT19 will never be hooked unless
the compile-time option NONPNP_HOOK_INT19 is enabled. Leave the
redundant PnP detection routine in-place to allow for debugging via
the ROM banner line.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Revert commit 38cd351 ("[romprefix] Attempt to gracefully handle
semi-PnP IBM BIOSes"), since the test for the "IBM " signature in %edi
is not sufficient to identify an IBM BIOS.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some IBM BIOSes provide partial support for PnP: they will use the BEV
entry point but will not advertise PnP support. This causes iPXE to
hook INT 19, which disrupts the boot process.
Attempt to improve this situation by detecting an IBM BIOS and
treating it as a PnP BIOS despite the absence of a PnP signature.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of binutils have curious concepts of what constitutes
subtraction. For example:
0x00000000000000f0 _text16_late = .
0x0000000000000898 _mtext16 = .
0x0000000000000898 _etext16 = .
0x0000000000000898 _text16_late_filesz = ABSOLUTE ((_mtext16 - _text16_late))
0x00000000000007a8 _text16_late_memsz = ABSOLUTE ((_etext16 - _text16_late))
This has interesting side-effects such as producing sizes for .bss
segments that are negative, causing the majority of addressable memory
to be zeroed out.
Fix by using the form
ABSOLUTE ( x ) - ABSOLUTE ( y )
rather than
ABSOLUTE ( x - y )
Reported-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Tested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This allows older versions of ELTORITO.SYS (such as the version found
on the FreeDOS installation CD-ROM) to use iPXE's emulated CD-ROM
drive.
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose the multiple-SAN-drive capability of the iPXE core via the iPXE
command line by adding commands to hook and unhook additional drives.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks (notably old Etherboot/gPXE stacks) will claim to use
the timer interrupt, rather than reporting that interrupts are not
supported. Since using the timer interrupt is equivalent to polling
anyway, we may as well genuinely poll these stacks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Temporary modification to prevent valgrind.h from breaking compilation
with gcc 4.6. When this problem is fixed upstream, a new and
unmodified copy of valgrind.h should be imported.
Signed-off-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An iPXE .exe image can be loaded from DOS. Tested using bin/ipxe.exe
to load a Linux kernel and simple initramfs from within MS-DOS 6.22.
(EDD must be disabled using the "edd=off" kernel parameter, since the
loaded kernel image has already overwritten parts of DOS' INT 13
wrapper.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In the unlikely (but observable) event that INT 15,88 returns less
memory above 1MB than is required for the temporary decompression
area, ignore it and use the 1MB point anyway.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Make the allocators used by malloc and linux_umalloc valgrindable.
Include valgrind headers in the codebase to avoid a build dependency
on valgrind.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks advertise that interrupts are not supported, despite
requiring the use of interrupts. Attempt to cope with such cards
without breaking others by always hooking the interrupt, and using the
"interrupts supported" flag only to decide whether or not to wait for
an interrupt before calling PXENV_UNDI_ISR_IN_PROCESS.
The possible combinations are therefore:
1. Card generates interrupts and claims to support interrupts
iPXE will call PXENV_UNDI_ISR_IN_PROCESS only after an interrupt
has been observed. (This is required to avoid lockups in some PXE
stacks, which spuriously sulk if called before an interrupt has
been generated.)
Such a card should work correctly.
2. Card does not generate interrupts and does not claim to support
interrupts
iPXE will call PXENV_UNDI_ISR_IN_PROCESS indiscriminately, matching
the observed behaviour of at least one other PXE NBP (winBoot/i).
Such a card should work correctly.
3. Card generates interrupts but claims not to support interrupts
iPXE will call PXENV_UNDI_ISR_IN_PROCESS indiscriminately. An
interrupt will still result in a call to PXENV_UNDI_ISR_IN_START.
Such a card may work correctly.
4. Card does not generate interrupts but claims to support interrupts
Such a card will not work at all.
Reported-by: Jerry Cheng <jaspers.cheng@msa.hinet.net>
Tested-by: Jerry Cheng <jaspers.cheng@msa.hinet.net>
Reported-by: Mauricio Silveira <mauricio@livreti.com.br>
Tested-by: Mauricio Silveira <mauricio@livreti.com.br>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE allocates its first PMM block using the image source length,
which is rounded up to the nearest 16-byte paragraph. It then copies
in data of a length calculated from the ROM size, which is
theoretically less than or equal to the image source length, but is
rounded up to the nearest 512-byte sector. This can result in copying
beyond the end of the allocated PMM block, which can corrupt the PMM
data structures (and other essentially arbitrary areas of memory).
Fix by rounding up the image source length to the nearest 512-byte
sector before using it as the PMM allocation length.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Jarrod Johnson <jarrod.b.johnson@gmail.com>
Reported-by: Itay Gazit <itayg@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
INT 16,01 will discard some extended keystrokes on some BIOSes, making
it impossible for iPXE to detect keypresses such as F12. Fix by using
INT 16,11 instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some prefixes (e.g. .lkrn) allow a command line to be passed in to
iPXE. At present, this command line is ignored.
If a command line is provided, treat it as an embedded script (without
an explicit "#!ipxe" magic marker). This allows for patterns of
invocation such as
title iPXE
kernel /boot/ipxe.lkrn dhcp && \
sanboot iscsi:10.0.4.1::::iqn.2010-04.org.ipxe.dolphin:storage
Here GRUB is instructed to load ipxe.lkrn with an embedded script
equivalent to
#!ipxe
dhcp
sanboot iscsi:10.0.4.1::::iqn.2010-04.org.ipxe.dolphin:storage
This can be used to effectively vary the embedded script without
having to rebuild ipxe.lkrn.
Originally-implemented-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The function keys F5-F12 all conform to the same ANSI pattern as the
other "special" keys that we currently recognise. Add these key
definitions, and shrink the representation of the ANSI sequences in
bios_console.c to compensate.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Refactor the {load,exec} image operations as {probe,exec}. This makes
the probe mechanism cleaner, eliminates some forward declarations,
avoids holding magic state in image->priv, eliminates the possibility
of screwing up between the "load" and "exec" stages, and makes the
documentation simpler since the concept of "loading" (as distinct from
"executing") no longer needs to be explained.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The online documentation (e.g. http://ipxe.org/cmd/ifopen), though not
yet complete, is far more comprehensive than could be provided within
the iPXE binary. Save around 200 bytes (compressed) by removing the
command descriptions from the interactive help, and instead referring
users directly to the web page describing the relevant command.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use INT 13,00 as an opportunity to reopen the underlying
block device, which works well for callers such as DOS that will use
INT 13,00 in response to any disk errors. However, some callers (such
as Windows Server 2008) do not attempt to reset the disk, and so any
failures become effectively permanent.
Fix this by automatically reopening the underlying block device
whenever we might want to access it.
This makes direct installation of Windows to an iSCSI target much more
reliable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "size" bit (aka the D/B) bit should (as far as I can tell) be
irrelevant for accesses to a non-code, non-stack, expand-upwards
segment. However, VirtualBox fails on some accesses via this segment
if this bit is not set.
This change allows iPXE to boot under VirtualBox without having to
disable VT-x/AMD-V support.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Building the Linux-specific code (tap.o et al) requires external
headers that have proven to be extremely variable across systems,
causing frequent build failures.
Until this situation is rectified, remove the Linux-specific code from
the default (non-Linux build).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some binutils versions will drag in an object to satisfy the entry
symbol; some won't. Try to cope with this exciting variety of
behaviour by ensuring that all entry symbols are unique.
Remove the explicit inclusion of the prefix object on the linker
command line, since the entry symbol now provides all the information
needed to identify the prefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 623469d ("[build] Eliminate unused sections at link-time")
introduced a regression in several build formats, in which the prefix
would end up being garbage-collected out of existence. Fix by
ensuring that an entry symbol exists in each possible prefix, and is
required by the linker script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use -ffunction-sections, -fdata-sections, and --gc-sections to
automatically prune out any unreferenced sections.
This saves around 744 bytes (uncompressed) from the rtl8139.rom build.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EFI performs its own PCI bus enumeration. Respect this, and start
controlling devices only when instructed to do so by EFI.
As a side benefit, we should now correctly create multiple SNP
instances for multi-port devices.
This should also fix the problem of failing to enumerate devices
because the PCI bridges have not yet been enabled at the time the iPXE
driver is loaded.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Merge the "bus" and "devfn" fields into a single "busdevfn" field, to
match the format used by the majority of external code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes can report multiple memory regions which may be adjacent
and the same type. Since only the first region is used in the
mboot.c32 layer it's possible to run out of memory when loading all of
the boot modules. One may get around this problem by having iPXE
merge these memory regions internally.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the concept of shutdown exit flags, and replace it with a
counter used to keep track of exposed interfaces that require devices
to remain active.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
libflat no longer has anything to do with flat real mode; it handles
only the A20 gate. Update library name to match.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Flat real mode will have been set up as a side-effect of the
protected-mode call invoked during install_block() for .text16.early;
there is no need to do so explicitly.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Flat real mode works perfectly on real hardware, but seems to cause
problems for some hypervisors. Revert to using 16-bit protected mode
(and returning to real mode with 4GB limits, so as not to break PMM
BIOSes).
Allow the code specific to the .mrom format to continue to assume that
flat real mode works, since this format is specific to real hardware.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PXE debugging messages have remained pretty much unaltered since
Etherboot 5.4, and are now difficult to read in comparison to most of
the rest of iPXE.
Bring the pxe_udp debug messages up to normal iPXE standards.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Earlier versions of the PXE specification do not have the SubVendor_ID
and SubDevice_ID fields, and some NBPs may not provide space for them.
Avoid overwriting the contents of these fields, just in case.
This is similar to the problem with the BufferLimit field in
PXENV_GET_CACHED_INFO.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Changes were made to files where the licence text within the files
themselves confirms that the files are GPL version 2 or later.
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the real-mode address ffff:0010 to access the linear address
0x100000, and so test whether or not the A20 gate is enabled without
requiring a switch into flat real mode (or some other addressing
mode).
This speeds up CPU mode transitions, and also avoids breaking the NBP
from IBM's Tivoli Provisioning Manager for Operating System
Deployment. This NBP makes some calls to iPXE in VM86 mode rather
than true real mode and does not correctly emulate our transition into
flat real mode.
Interestingly, Tivoli's VMM *does* allow us to switch into protected
mode (though it patches our GDT so that we execute in ring 1 rather
than ring 0). However, paging is still disabled and we have a 4GB
segment limit. Being in ring 1 does not, therefore, restrict us in
any meaningful way; this has been verified by deliberately writing
garbage over Tivoli's own GDT (at address 0x02201010) during a
nominally VM86-mode PXE API call. It's unclear precisely what
protection this VMM is supposed to be offering.
Suggested-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some network cards do not generate interrupts when operated via the
UNDI API. Allow for this by waiting for the ISR to be triggered only
if the PXE stack advertises that it supports interrupts. When the PXE
stack does not advertise interrupt support, we skip the call to
PXENV_UNDI_ISR_IN_START and just poll the device using
PXENV_UNDI_ISR_IN_PROCESS. This matches the observed behaviour of at
least one other PXE NBP (emBoot's winBoot/i), so there is a reasonable
chance of this working.
Originally-implemented-by: Muralidhar Appalla <Muralidhar.Appalla@emulex.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The disk signature is used by some OSes (notably Windows) to identify
the boot disk, so it's useful debugging information to have.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support the extensions mandated by EDD 4.0, including:
o the ability to specify a flat physical address in a disk address
packet,
o the ability to specify a sector count greater than 127 in a disk
address packet,
o support for all functions within the Fixed Disk Access and EDD
Support subsets,
o the ability to describe a device using EDD Device Path Information.
This implementation is based on draft revision 3 of the EDD 4.0
specification, with reference to the EDD 3.0 specification. It is
possible that this implementation may need to change in order to
conform to the final published EDD 4.0 specification.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The block device interface used in gPXE predates the invention of even
the old gPXE data-transfer interface, let alone the current iPXE
generic asynchronous interface mechanism. Bring this old code up to
date, with the following benefits:
o Block device commands can be cancelled by the requestor. The INT 13
layer uses this to provide a global timeout on all INT 13 calls,
with the result that an unexpected passive failure mode (such as
an iSCSI target ACKing the request but never sending a response)
will lead to a timeout that gets reported back to the INT 13 user,
rather than simply freezing the system.
o INT 13,00 (reset drive) is now able to reset the underlying block
device. INT 13 users, such as DOS, that use INT 13,00 as a method
for error recovery now have a chance of recovering.
o All block device commands are tagged, with a numerical tag that
will show up in debugging output and in packet captures; this will
allow easier interpretation of bug reports that include both
sources of information.
o The extremely ugly hacks used to generate the boot firmware tables
have been eradicated and replaced with a generic acpi_describe()
method (exploiting the ability of iPXE interfaces to pass through
methods to an underlying interface). The ACPI tables are now
built in a shared data block within .bss16, rather than each
requiring dedicated space in .data16.
o The architecture-independent concept of a SAN device has been
exposed to the iPXE core through the sanboot API, which provides
calls to hook, unhook, boot, and describe SAN devices. This
allows for much more flexible usage patterns (such as hooking an
empty SAN device and then running an OS installer via TFTP).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE has never supported SEEK_END; the usage of "whence" offers only
the options of SEEK_SET and SEEK_CUR and so is effectively a boolean
flag. Further flags will be required to support additional metadata
required by the Fibre Channel network model, so repurpose the "whence"
field as a generic "flags" field.
xfer_seek() has always been used with SEEK_SET, so remove the "whence"
field altogether from its argument list.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Declarations without the accompanying __table_entry cause misalignment
of the table entries when using gcc 4.5. Fix by adding the
appropriate __table_entry macro or (where possible) by removing
unnecessary forward declarations.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support qemu-like arguments for network setup:
--net driver_name[,setting=value]*
and global settings:
--settings setting=value[,setting=value]*
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add makefiles, ld scripts and default config for linux platform for
both i386 and x86_64.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
pcbios specific get_memmap() is used by the b44 driver making
all-drivers builds fail on other platforms. Move it to the I/O API
group and provide a dummy implementation on EFI.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
COM32 binaries generally expect to run with interrupts
enabled. Syslinux does so, and COM32 programs will execute cli/sti
pairs when running a critical section, to provide mutual exclusion
against BIOS interrupt handlers. Previously, under iPXE, the IDT was
not valid, so any interrupt (e.g. a timer tick) would generally cause
the machine to triple fault.
This change introduces code to:
- Create a valid IDT at the same location that syslinux uses
- Create an "interrupt jump buffer", which contains small pieces of
code that simply record the vector number and jump to a common
handler
- Thunk down to real mode and execute the BIOS's interrupt handler
whenever an interrupt is received in a COM32 program
- Switch IDTs and enable/disable interrupts when context switching to
and from COM32 binaries
Testing done:
- Booted VMware ESX using a COM32 multiboot loader (mboot.c32)
- Built with GDBSERIAL enabled, and tested breakpoints on int22 and
com32_irq
- Put the following code in a COM32 program:
asm volatile ( "sti" );
while ( 1 );
Before this change, the machine would triple fault
immediately. After this change, it hangs as expected. Under Bochs,
it is possible to see the interrupt handler run, and the current
time in the BIOS data area gets incremented.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An assembly version of memswap() is in an x86 word-length-agnostic
header file, but it used 32-bit registers to store pointers, leading
to memory errors responding to ARP queries on 64-bit systems.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The existence and usage of the BEV entry point is covered by the PnP
spec, not the BBS spec; the BBS spec merely describes a policy for
selecting the boot device order. iPXE should therefore check only for
a PnP BIOS in order to decide whether or not to hook INT19.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit ea12dc0 ("[build] Avoid hard-coding the path to perl")
introduced a build failure for fully clean trees (e.g. after running
"make veryclean"), since the dependency upon $(PARSEROM) now includes
a dependency upon "perl" (which doesn't exist) rather than upon
"/usr/bin/perl" (which does exist).
There should of course be no dependency upon the perl binary at all;
the dependency should be upon "./util/parserom.pl" alone.
Fix by removing the $(PERL) from the definition of Perl-based utility
paths, and adding $(PERL) at the point of usage.
Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove data-xfer as an interface type, and replace data-xfer
interfaces with generic interfaces supporting the data-xfer methods.
Filter interfaces (as used by the TLS layer) are handled using the
generic pass-through interface capability. A side-effect of this is
that deliver_raw() no longer exists as a data-xfer method. (In
practice this doesn't lose any efficiency, since there are no
instances within the current codebase where xfer_deliver_raw() is used
to pass data to an interface supporting the deliver_raw() method.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove name-resolution as an interface type, and replace
name-resolution interfaces with generic interfaces supporting the
resolv_done() method.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
strerror() has not been able to use the PXE-only error table since
commit 9aa61ad ("Add per-file error identifiers") back in 2007.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most of iPXE uses __attribute__((packed)) anyway, and PACKED conflicts
with an identically-named macro in the upstream EFI header files.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
See RFC 4578 for details.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The linker chooses to look for _start first and always picks
efidrvprefix.o to satisfy it (probably because it's earlier in the
archive) which causes a multiple definition error when the linker
later has to pick efiprefix.o for other symbols.
Fix by using EFI-specific TGT_LD_FLAGS with an explicit entry point.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This removes the need for inline safety wrappers, marginally reducing
the size penalty of weak functions, and works around an apparent
binutils bug that causes undefined weak symbols to not actually be
NULL when compiling with -fPIE (as EFI builds do).
A bug in versions of binutils prior to 2.16 (released in 2005) will
cause same-file weak definitions to not work with those
toolchains. Update the README to reflect our new dependency on
binutils >= 2.16.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
COMBOOT API calls set the carry flag on failure. This was not being
propagated because the COMBOOT interrupt handler used iret to return
with EFLAGS restored from the stack. This patch propagates CF before
returning from the interrupt.
Reported-by: Geoff Lywood <glywood@vmware.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Microsoft WDS can end up calling PXENV_RESTART_TFTP to execute a
second-stage NBP which then exits. Specifically, wdsnbp.com uses
PXENV_RESTART_TFTP to execute pxeboot.com, which will exit if the user
does not press F12. iPXE currently treats PXENV_RESTART_TFTP as a
normal PXE API call, and so attempts to return to wdsnbp.com, which
has just been vaporised by pxeboot.com.
Use rmsetjmp/rmlongjmp to preserve the stack state as of the initial
NBP execution, and to restore this state immediately prior to
executing the NBP loaded via PXENV_RESTART_TFTP. This matches the
behaviour in the PXE spec (which says that "if TFTP is restarted,
control is never returned to the caller"), and allows pxeboot.com to
exit relatively cleanly back to iPXE.
As with all usage of setjmp/longjmp, there may be subtle corner case
bugs due to not gracefully unwinding any state accumulated by the time
of the longjmp call, but this seems to be the only viable way to
provide the specified behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an infrastructure allowing the prefix to provide an open_payload()
method for obtaining out-of-band access to the whole iPXE image. Add
a mechanism within this infrastructure that allows raw access to the
expansion ROM BAR by temporarily borrowing an address from a suitable
memory BAR on the same PCI card.
For cards that have a memory BAR that is at least as large as their
expansion ROM BAR, this allows large iPXE ROMs to be supported even on
systems where PMM fails, or where option ROM space pressure makes it
impossible to use PMM shrinking. The BIOS sees only a stub ROM of
approximately 3kB in size; the remainder (which can be well over 64kB)
is loaded only at the time iPXE is invoked.
As a nice side-effect, an iPXE .mrom image will continue to work even
if its PMM-allocated areas are overwritten between initialisation and
invocation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The only remaining useful function of makerom.pl is to correct the ROM
and PnP checksums; the PCI IDs are set at link time, and padding is
performed using padimg.pl.
Option::ROM already provides a facility for correcting the checksums,
so we may as well just use this instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
It is common for system memory maps to be grotesquely unreliable
during POST. Many sanity checks have been added to the memory map
reading code, but these do not catch all problems.
Skip relocation entirely if called during POST. This should avoid the
problems typically encountered, at the cost of slightly disrupting the
memory map of an operating system booted via iPXE when iPXE was
entered during POST. Since this is a very rare special case (used,
for example, when reflashing an experimental ROM that would otherwise
prevent the system from completing POST), this is an acceptable cost.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes (at least some AMI BIOSes) tend to refuse to allocate a
single area large enough to hold both the iPXE image source and the
temporary decompression area, despite promising a largest available
PMM memory block of several megabytes. This causes ROM image
shrinking to fail on these BIOSes, with undesirable consequences:
other option ROMs may be disabled due to shortage of option ROM space,
and the iPXE ROM may itself be corrupted by a further BIOS bug (again,
observed on an AMI BIOS) which causes large ROMs to end up overlapping
reserved areas of memory. This can potentially render a system
unbootable via any means.
Increase the chances of a successful PMM allocation by dropping the
alignment requirement (which is redundant now that we can enable A20
from within the prefix); this allows us to reduce the allocation size
from 2MB down to only the required size.
Increase the chances still further by using two separate allocations:
one to hold the image source (i.e. the copy of the ROM before being
shrunk) and the other to act as the decompression area. This allows
ROM image shrinking to take place even on systems that fail to
allocate enough memory for the temporary decompression area.
Improve the behaviour of iPXE in systems with multiple iPXE ROMs by
sharing PMM allocations where possible. Image source areas can be
shared with any iPXE ROMs with a matching build identifier, and the
temporary decompression area can be shared with any iPXE ROMs with the
same uncompressed size (rounded up to the nearest 128kB).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use INT 15,88 to find a suitable temporary decompression area, rather
than a fixed address. This hopefully gives us a better chance of not
treading on any PMM-allocated areas, in BIOSes where PMM support
exists but tends not to give us the large blocks that we ask for.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Always call INT 15,88 even if we don't use the result. This allows
DEBUG=memmap to show the complete result set returned by all of the
INT 15 memory-map calls.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Randomly generate a 32-bit build identifier that can be used to
identify identical iPXE ROMs when multiple such ROMs are present in a
system (e.g. when a multi-function NIC exposes the same iPXE ROM image
via each function's expansion ROM BAR).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The existing "iPXE starting execution" message indicates that the BEV
(or INT19) was invoked, but gives no indication on whether or not the
iPXE source was successfully retrieved (e.g. from PMM). Split the
"starting execution message" into "starting execution...ok"; the "ok"
indicates that the main iPXE body was successfully decompressed and
relocated.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Now that we can use odd megabytes, there is no particular need to use
an even megabyte as the fallback temporary load point.
Note that the old warnings about avoiding 2MB pre-date our ability to
cooperate with other PXE ROMs by using PMM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE is now capable of operating in odd megabytes of memory, so remove
the obsolete code enforcing an even-megabyte constraint.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the shared code in libflat to perform the A20 transitions
automatically on each transition from real to protected mode. This
allows us to remove all explicit calls to gateA20_set().
The old warnings about avoiding automatically enabling A20 are
essentially redundant; they date back to the time when we would always
start hammering the keyboard controller without first checking to see
if gate A20 was already enabled (which it almost always is).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE currently insists on residing in an even megabyte. This imposes
undesirably severe constraints upon our PMM allocation strategy, and
limits our options for mechanisms to access ROMs greater than 64kB in
size.
Add A20 handling code to libflat so that prefixes are able to access
memory even in odd megabytes.
The algorithms and tuning parameters in the new A20 handling code are
based upon a mixture of the existing iPXE A20 code and the A20 code
from the 2.6.32 Linux kernel.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The flatten_real_mode routine is not needed until after decompressing
.text16.early, and currently performs various contortions to
compensate for the fact that .prefix may not be writable. Move
flatten_real_mode to .text16.early to save on (compressed) binary size
and simplify the code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a section .text16.early which is always kept inline with the
prefix. This will allow for some code sharing between the .prefix and
.text16 sections.
Note that the simple solution of just prepending the .prefix section
to the .text16 section will not work, because a bug in Wyse Streaming
Manager server (WLDRM13.BIN) requires us to place a dummy PXENV+ entry
point at the start of .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use flat real mode rather than 16-bit protected mode for access to
high memory during installation. This simplifies the code by reducing
the number of CPU modes we need to think about, and also increases the
amount of code in common between the normal and (somewhat
hypothetical) KEEP_IT_REAL methods of operation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When returning to real mode, set 4GB segment limits instead of 64kB
limits. This change improves our chances of successfully returning to
a PMM-capable BIOS aftering entering iPXE during POST; the BIOS will
have set up flat real mode before calling our initialisation point,
and may be disconcerted if we then return in genuine real mode.
This change is unlikely to break anything, since any code that might
potentially access beyond 64kB must use addr32 prefixes to do so; if
this is the case then it is almost certainly code written to expect
flat real mode anyway.
Note that it is not possible to restore the real-mode segment limits
to their original values, since it is not possible to know which
protected-mode segment descriptor was originally used to initialise
the limit portion of the segment register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .hrom prefix provides an experimental mechanism for reducing
option ROM space usage on systems where PMM allocation fails, by
pretending that PMM allocation succeeded and gave us an address fixed
at compilation time. This is unreliable, and potentially dangerous.
In particular, when multiple gPXE ROMs are present in a system, each
gPXE ROM will assume ownership of the same fixed address, resulting in
undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .xrom prefix provides an experimental mechanism for loading ROM
images greater than 64kB in size by mapping the expansion ROM BAR in
at a hopefully-unused address. This is unreliable, and potentially
dangerous. In particular, there is no guarantee that any PCI bridges
between the CPU and the device will respond to accesses for the
"unused" memory region that is chosen, and it is possible that the
process of scanning for the "unused" memory region may end up issuing
reads to other PCI devices. If this ends up trampling on a register
with read side-effects belonging to an unrelated PCI device, this may
cause undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
gPXE currently overwrites the filename stored in the cached DHCP
packets when a call to PXENV_TFTP_READ_FILE or PXENV_RESTART_TFTP is
made. This code has existed for many years as a workaround for RIS,
which seemed to require that this be done.
pxe_set_cached_filename() causes problems with the Bootix NBP, and a
recent test demonstrates that RIS will complete successfully even with
pxe_set_cached_filename() removed. There have been many changes to
the DHCP and PXE logic since this code was first added, and it is
quite plausible that it was masking a bug that no longer exists.
Reported-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Debugged-by: Shao Miller <Shao.Miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Current gPXE code always returns "OURS" in response to
PXENV_UNDI_ISR:START. This is harmless for non-shared interrupt
lines, and avoids the complexity of trying to determine whether or not
we really did cause the interrupt. (This is a non-trivial
determination; some drivers don't have interrupt support and hook the
system timer interrupt instead, for example.)
A problem occurs when we have a shared interrupt line, the other
device asserts an interrupt, and the controlling ISR does not chain to
the other device's ISR when we return "OURS". Under these
circumstances, the other device's ISR never executes, and so the
interrupt remains asserted, causing an interrupt storm.
Work around this by returning "OURS" if and only if our net device's
interrupt is currently recorded as being enabled. Since we always
disable interrupts as a result of a call to PXENV_UNDI_ISR:START, this
guarantees that we will eventually (on the second call) return "NOT
OURS", allowing the other ISR to be called. Under normal operation,
including a non-shared interrupt situation, this change will make no
difference since PXENV_UNDI_ISR:START would be called only when
interrupts were enabled anyway.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
In the actual SYSLINUX suite's comboot implementation, the version
string is prefixed by CR LF, and the copyright string has a leading
space. Some tools (specifically HDT) assume these padding characters
exist, so we should probably return strings in a similar format.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Loading multiple UNDI instances would be useful in systems that have
several network cards with vendor PXE ROMs. However, we cannot rely on
UNDI ROMs working correctly with multiple instances loaded
simultaneously.
The gPXE UNDI driver supports the following multi-NIC configurations:
1. Chainloading undionly.kpxe on a specific NIC.
2. Loading the UNDI driver for the first probed device and ignoring all
other UNDI devices in the system.
This patch refuses to probe additional UNDI devices so there can never
be multiple instances of UNDI loaded.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .elf, .elfd, .lmelf, and .lmelfd prefices were brought over from
legacy Etherboot and they do not build in gPXE. This patch removes the
ELF prefices.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The unfinished .exe prefix was brought over from legacy Etherboot.
There has been no demand for .exe images so this patch removes the
prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The DOS .com prefix was brought over from legacy Etherboot but does not
build. There has been no demand for .com images so this patch removes
the prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .lkrn prefix allows gPXE to be loaded as a Linux bzImage. The
bImage prefix was carried over from legacy Etherboot and does not build.
This patch removes the .bImage prefix, use .lkrn instead.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
It might be the case that we wish to chain to an NBP without
being "in the way". We now implement a hook in our exit path
for gPXE *.*pxe build targets. The hook is a pointer to a
SEG16:OFF16 which we try to jump to during exit. By default,
this pointer results in the usual exit path.
We also implement the "pxenv_file_exit_hook" PXE API routine
to allow the user to specify an alternate SEG16:OFF16 to jump
to during exit.
Unfortunately, this additional PXE extension has a cost
in code size. Fortunately, a look at the size difference
for a gPXE .rom build target shows zero size difference
after compression.
The routine is documented in doc/pxe_extensions as follows:
FILE EXIT HOOK
Op-Code: PXENV_FILE_EXIT_HOOK (00e7h)
Input: Far pointer to a t_PXENV_FILE_EXIT_HOOK parameter
structure that has been initialized by the caller.
Output: PXENV_EXIT_SUCCESS or PXENV_EXIT_FAILURE must be
returned in AX. The Status field in the parameter
structure must be set to one of the values represented
by the PXENV_STATUS_xxx constants.
Description:Modify the exit path to jump to the specified code.
Only valid for pxeprefix-based builds.
typedef struct s_PXENV_FILE_EXIT_HOOK {
PXENV_STATUS_t Status;
SEGOFF16_t Hook;
} t_PXENV_FILE_EXIT_HOOK;
Set before calling API service:
Hook: The SEG16:OFF16 of the code to jump to.
Returned from API service:
Status: See PXENV_STATUS_xxx constants.
Requested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The standard option ROM format provides a header indicating the size
of the entire ROM, which the BIOS will reserve space for, load, and
call as necessary. However, this space is strictly limited to 128k for
all ROMs. gPXE ameliorates this somewhat by reserving space for itself
in high memory and relocating the majority of its code there, but on
systems prior to PCI3 enough space must still be present to load the
ROM in the first place. Even on PCI3 systems, the BIOS often limits the
size of ROM it will load to a bit over 64kB.
These space problems can be solved by providing an artificially small
size in the ROM header: just enough to let the prefix code (at the
beginning of the ROM image) be loaded by the BIOS. To the BIOS, the
gPXE ROM will appear to be only a few kilobytes; it can then load
the rest of itself by accessing the ROM directly using the PCI
interface reserved for that task.
There are a few problems with this approach. First, gPXE needs to find
an unmapped region in memory to map the ROM so it can read from it;
this is done using the crude but effective approach of scanning high
memory (over 0xF0000000) for a sufficiently large region of all-ones
(0xFF) reads. (In x86 architecture, all-ones is returned for accesses
to memory regions that no mapped device can satisfy.) This is not
provably valid in all situations, but has worked well in practice.
More importantly, this type of ROM access can only work if the PCI ROM
BAR exists at all. NICs on physical add-in PCI cards generally must
have the BAR in order for the BIOS to be able to load their ROM, but
ISA cards and LAN-on-Motherboard cards will both fail to load gPXE
using this scheme.
Due to these uncertainties, it is recommended that .xrom only be used
when a regular .rom image is infeasible due to crowded option ROM
space. However, when it works it could allow loading gPXE images
as large as a flash chip one could find - 128kB or even higher.
Signed-off-by: Marty Connor <mdc@etherboot.org>
For extremely tight space requirements and specific applications, it is
sometimes desirable to create gPXE images that cannot provide the PXE API
functionality to client programs. Add a configuration header option,
PXE_STACK, that can be removed to remove this stack. Also add PXE_MENU
to control the PXE boot menu, which most uses of gPXE do not need.
Signed-off-by: Marty Connor <mdc@etherboot.org>
If we don't unload the PXE stack before executing gPXE, automatically
take advantage of the cached DHCPACK that the underlying/parent PXE
stack can provide. If that cached DHCPACK contains option 175.178, or
the user sets the use-cached setting before invoking DHCP, the real
DHCP request will be skipped and the cached DHCPACK will be used for
network configuration. Otherwise, the cached settings block is thrown
away as soon as a fresh one is acquired.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Calling the parent PXE stack (the stack that loaded us, for
undionly.kkpxe) can be useful for more than UNDI calls; for instance,
it lets us get cached DHCP packets to avoid re-DHCP when working with
embedded images.
Signed-off-by: Marty Connor <mdc@etherboot.org>
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open, waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.
This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.
Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.
I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.
Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
When the "keep-san" option is used, the function is exited without
unregistering the stack allocated int13h drive. To prevent a dangling
pointer to the stack, these structs should be heap allocated.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
gPXE currently takes advantage of the feature of PCI3.0 that allows
option ROMs to relocate the bulk of their code to high memory and so
take up only a small amount of space in the option ROM area. Currently,
the relocation can only take place if the BIOS's implementation of PMM
can be made to return blocks aligned to an even megabyte, because of
the A20 gate. AMI BIOSes, in particular, will not return allocations
that gPXE can use.
Ameliorate the situation somewhat by adding a prefix, .hrom, that works
identically to .rom except in the case that PMM allocation fails. Where
.rom would give up and place itself entirely in option ROM space, .hrom
moves to a block (assumed free) at HIGHMEM_LOADPOINT = 4MB. This allows
for the use of larger gPXE ROMs than would otherwise be possible.
Because there is no way to check that the area at HIGHMEM_LOADPOINT is
really free, other devices using that memory during the boot process
will cause failure for gPXE, the other device, or both. In practice
such conflicts will likely not occur, but this prefix should still be
considered EXPERIMENTAL.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The disk partition prefix code in hdprefix.S reads the gPXE image in
tracks, not individual sectors. This means it will attempt to read
beyond the end of the image if the .hd image type is not padded to 32
KB.
This issue is affects virtualization software which may execute a .hd or
.usb image file directly - effectively running a machine with a tiny
disk containing just the gPXE image. Boot will fail when gPXE tries to
read beyond the end of disk.
The Multiboot memory map needs to be built after unhiding gPXE and
downloaded images from memory. Solaris faults during boot when trying
to access the ramdisk, which is hidden from the memory map while gPXE is
executing. This issue is fixed by using the memory map from after gPXE
unhides itself.
Reported-by: Moinak Ghosh <moinakg@belenix.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
The get_underlying_e820 function should return with CF unset on success.
Reported-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
REQUIRE_SYMBOL() formerly used a formulation of symbol requirement
that would allow a link to succeed despite lacking a required symbol,
because it did not introduce any relocations. Fix by renaming it to
REQUEST_SYMBOL() (since the soft-requirement behavior can be useful)
and add a REQUIRE_SYMBOL() that truly requires.
Add EXPORT_SYMBOL() and IMPORT_SYMBOL() for REQUEST_SYMBOL()-like
behavior that allows one to make use of the symbol, by combining a
weak external on the symbol itself with a REQUEST_SYMBOL() of a second
symbol.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some BIOSes (observed with an AMI BIOS on a SunFire X2200) seem to
reset the BIOS drive counter at 40:75 after a failed boot attempt.
This causes problems when attempting a Windows direct-to-iSCSI
installation: bootmgr.exe calls INT 13,0800 and gets told that there
are no hard disks, so never bothers to read the MBR in order to obtain
the boot disk signature. The Windows iSCSI initiator will detect the
iBFT and connect to the target, and everything will appear to work
except for the error message "This computer's hardware may not support
booting to this disk. Ensure that the disk's controller is enabled in
the computer's BIOS menu."
Fix by checking the BIOS drive counter on every INT 13 call, and
updating it whenever necessary.
The case of an unsupported SAN protocol will currently not result in
any error message. Fix by printing the error message at the top level
using strerror(), rather than using hard-coded error messages in the
error paths.
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".
The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime. This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.
Expose the hardware and link-layer addresses as separate properties
within a net device. Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
The option ROM header contains a one-byte field indicating the number
of 512-byte sectors in the ROM image. Currently it is linked to
contain the number of uncompressed sectors, with an instruction to the
compressor to correct it. This causes link failure when the
uncompressed size of the ROM image is over 128k.
Fix by replacing the SUBx compressor fixup with an ADDx fixup that
adds the total compressed output length, scaled as requested, to an
addend stored in the field where the final length value will be
placed. This is similar to the behavior of ELF relocations, and
ensures that an overflow error will not be generated unless the
compressed size is still too large for the field.
This also allows us to do away with the _filesz_pgh and _filesz_sect
calculations exported by the linker script.
Output tested bitwise identical to the old SUBx mechanism on hd, dsk,
lkrn, and rom prefixes, on both 32-bit and 64-bit processors.
Modified-by: Michael Brown <mcb30@etherboot.org>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The SRP Boot Firmware Table serves a similar role to the iSCSI and AoE
Boot Firmware Tables; it provides information required by the loaded
OS in order to establish a connection back to the SRP boot device.
SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory. The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
Some BIOSes support the BIOS Boot Specification (BBS) but fail to set
%es:%di correctly when calling the option ROM initialisation entry
point. This causes gPXE to identify the BIOS as non-PnP (and so
non-BBS), leaving the user unable to control the boot order.
Fix by scanning for the $PnP signature ourselves, rather than relying
on the BIOS having passed in %es:%di correctly.
Tested-by: Helmut Adrigan <helmut.adrigan@chello.at>
pxe_api.h is just a description of API functions, it's actively
undesirable to have more implementations than necessary. Allowing it
under the MIT license lets the Syslinux libraries use it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
We add a syslinux floppy disk type using parts of the genliso script.
This floppy image cat be dd'ed to a physical floppy or used in
instances where a virtual floppy with an mountable DOS filesystem is
useful.
We also modify the genliso script to only generate .liso images
rather than creating images depending on how it is called.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
IPoIB has a link-layer broadcast address that varies according to the
partition key. We currently go through several contortions to pretend
that the link-layer address is a fixed constant; by making the
broadcast address a property of the network device rather than the
link-layer protocol it will be possible to simplify IPoIB's broadcast
handling.
These commands can be used to activate or deactivate the PXE API (on a
specifiable network interface).
This is currently of limited use, since most image formats will call
shutdown() before booting the image, meaning that the underlying net
device gets shut down during remove_devices() anyway.
pxe_init_structures() fills in the fields of the !PXE and PXENV+
structures that aren't known until gPXE starts up. Once gPXE is
started, these values will never change.
Make pxe_init_structures() an initialisation function so that PXE
users don't have to worry about calling it.
It is possible that the UNDI ISR may be triggered before netdev_tx()
returns control to pxenv_undi_transmit(). This means that
pxenv_undi_isr() may see a zero undi_tx_count, and so not check for TX
completions. This is not a significant problem, since it will check
for TX completions on the next call to pxenv_undi_isr() anyway; it
just means that the NBP will see a spurious IRQ that was apparently
caused by nothing.
Fix by updating the undi_tx_count before calling netdev_tx(), so that
pxenv_undi_isr() can decrement it and report the TX completion.
Symantec Ghost requires working multicast support. gPXE configures
all (sufficiently supported) network adapters into "receive all
multicasts" mode, which means that PXENV_UNDI_SET_MCAST_ADDRESS is
actually a no-op, but the current implementation returns
PXENV_STATUS_UNSUPPORTED instead.
Fix by making PXENV_UNDI_SET_MCAST_ADDRESS return success. For good
measure, also implement PXENV_UNDI_GET_MCAST_ADDRESS, since the
relevant functionality is now exposed by the net device core.
Note that this will silently fail if the gPXE driver for the NIC being
used fails to configure the NIC in "receive all multicasts" mode.
The PXE debugging messages have remained pretty much unaltered since
Etherboot 5.4, and are now difficult to read in comparison to most of
the rest of gPXE.
Bring the pxe_undi debug messages up to normal gPXE standards.
The Symantec UNDI DOS driver fails when run on top of gPXE because we
return our interface type as "gPXE" rather than one of the predefined
NDIS interface type strings.
Fix by returning the standard "DIX+802.3" string; this isn't
necessarily always accurate, but it's highly unlikely that anything
trying to use the UNDI API would understand our IPoIB link-layer
pseudo-header anyway.
The Intel DOS UNDI driver fails when run on top of gPXE because we do
not fill in the ServiceFlags field in PXENV_UNDI_GET_IFACE_INFO.
Fix by filling in the ServiceFlags field with reasonable values
indicating our approximate feature capabilities.
The 3Com DOS UNDI driver fails when run on top of gPXE for two
reasons: firstly because PXENV_UNDI_SET_PACKET_FILTER is unsupported,
and secondly because gPXE enters the NBP without enabling interrupts
on the NIC, and the 3Com driver never calls PXENV_UNDI_OPEN.
Fix by always returning success from PXENV_UNDI_SET_PACKET_FILTER
(which is no worse than the current situation, since we already ignore
the receive packet filter in PXENV_UNDI_OPEN), and by forcibly
enabling interrupts on the NIC within PXENV_UNDI_TRANSMIT. The latter
is something of a hack, but avoids the need to implement a complete
base-code ISR that we would otherwise need if we were to enter the NBP
with interrupts enabled.
In order to construct outgoing link-layer frames or parse incoming
ones properly, some protocols (such as 802.11) need more state than is
available in the existing variables passed to the link-layer protocol
handlers. To remedy this, add struct net_device *netdev as the first
argument to each of these functions, so that more information can be
fetched from the link layer-private part of the network device.
Updated all three call sites (netdevice.c, efi_snp.c, pxe_undi.c) and
both implementations (ethernet.c, ipoib.c) of ll_protocol to use the
new argument.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Etherboot 5.4 erroneously treats PXENV_UNLOAD_STACK as the "final
shutdown" call, and unhooks INT15. When using gPXE's undionly.kpxe,
this results in gPXE overwriting the portion of Etherboot located in
high memory, because it is no longer hidden from the system memory map
at the time that gPXE loads.
Work around this by explicitly testing for Etherboot as the underlying
PXE stack (as is already done in undinet.c) and skipping the call to
PXENV_UNLOAD_STACK if necessary.
Solaris kernels are multiboot images with the "raw" flag set,
indicating that the loader should use the raw address fields within
the multiboot header rather than looking for an ELF header. However,
the Solaris kernel contains garbage data in the raw address fields,
and requires us to use the ELF header instead.
Work around this by always using the ELF header if present. This
renders the "raw" flag somewhat redundant.
The build mechanism currently allows for multiple objects per source
file. The only remaining user of this is unnrv2b.S. Replace this
usage with a separate unnrv2b16.S wrapper file, as is currently used
for e.g. pxeprefix.S and kpxeprefix.S.
Some utilities that expect a floppy disk image (e.g. iLO?) may test
for a file of the correct size. Reinstate the .pdsk image format in
order to provide this if needed.
QEMU will silently round down a disk or ROM image file to the nearest
512 bytes. Fix by always padding .rom, .dsk and .hd images to the
nearest 512-byte boundary.
Originally-fixed-by: Stefan Hajnoczi <stefanha@gmail.com>
Using "lret $2" to return from an interrupt causes interrupts to be
disabled in the calling program, since the INT instruction will have
disabled interrupts. Instead, patch CF on the stack and use iret to
return.
Interestingly, the original PC BIOS had this bug in at least one
place.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
This allows gPXE to load memtest86, which is packaged as an old kernel.
Split all code that directly touches the kernel headers out into
bzimage_parse_header() and bzimage_update_header(), to reduce code
size and offset the cost of supporting older kernels.
Total cost of this feature: 11 bytes (uncompressed).
The parsing of the !PXE and PXENV+ structures share a fair bit of
code; merge the common code to save a few bytes.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Search for the PXE entry points (via the !PXE or PXENV+ structures)
through all known combinations of search methods. Furthermore, if we
find a PXENV+ structure, attempt to use it to find the !PXE structure
if at all possible.
Avoid passing credentials in the iBFT that were available but not
required for login. This works around a problem in the Microsoft
iSCSI initiator, which will refuse to initiate sessions if the CHAP
password is fewer than 12 characters, even if the target ends up not
asking for CHAP authentication.
The PXE 1.x spec specifies that on NBP entry or on return from INT
1Ah AX=5650h, EDX shall point to the physical address of the PXENV+
structure. The PXE 2.x spec drops this requirement, simply stating
that EDX is clobbered. Given the principle "be conservative in what
you send, liberal in what you accept", however, we should implement
this anyway.
Certain combinations of PXE stack and BIOS result in a broken INT 18
call, which will leave the system displaying a "PRESS ANY KEY TO
REBOOT" message instead of proceeding to the next boot device. On
these systems, returning via the PXE stack is the only way to continue
to the next boot device. Returning via the PXE stack works only if we
haven't already blown away the PXE base code in pxeprefix.S.
In most circumstances, we do want to blow away the PXE base code.
Base memory is a limited resource, and it is desirable to reclaim as
much as possible. When we perform an iSCSI boot, we need to place the
iBFT above the 512kB mark, because otherwise it may not be detected by
the loaded OS; this may not be possible if the PXE base code is still
occupying that memory.
Introduce a new prefix type .kkpxe which will preserve both the PXE
base code and the UNDI driver (as compared to .kpxe, which preserves
the UNDI driver but uninstalls the PXE base code). This prefix type
can be used on systems that are known to experience the specific
problem of INT 18 being broken, or in builds (such as gpxelinux.0) for
which it is particularly important to know that returning to the BIOS
will work.
Written by H. Peter Anvin <hpa@zytor.com> and Stefan Hajnoczi
<stefanha@gmail.com>, minor structural alterations by Michael Brown
<mcb30@etherboot.org>.
COMBOOT images use INTs to issue API calls; these end up making calls
into gPXE from real mode, and so temporarily change the real-mode
stack pointer. When our COMBOOT code uses a longjmp() to implement
the various "exit COMBOOT image" API calls, this leaves the real-mode
stack pointer stuck with its temporary value, which causes problems if
we eventually try to exit out of gPXE back to the BIOS.
Fix by adding rmsetjmp() and rmlongjmp() calls (analogous to
sigsetjmp()/siglongjmp()); these save and restore the additional state
needed for real-mode calls to function correctly.
Multi-level menus via COMBOOT rely on the COMBOOT program being able
to exit and invoke a new COMBOOT program (the next menu). This works,
but rapidly (within about five iterations) runs out of space in gPXE's
internal stack, since each new image is executed in a new function
context.
Fix by allowing tail recursion between images; an image can now
specify a replacement image for itself, and image_exec() will perform
the necessary tail recursion.
The version of the GNU assembler shipped with Fedora 10
(2.18.50.0.9-8.fc10) complains about character literals in some of our
assembly code. Changing $'x' to $( 'x' ) seems to fix the problem.
Yes, the whitespace is required; using just $('x') does not work.
Reported by Kevin O'Connor <kevin@koconnor.net>.
There are code paths other than PMM allocation that can result in our
changing the ROM checksum. For example, we attempt to update our
product string to incorporate the PCI bus:dev.fn number. In a system
that does not support PMM, we could therefore end up with an incorrect
checksum.
Fix by attempting to update the checksum unconditionally.
As reported by Stefan, commit 13d09e6 ("[i386] Simplify linker script
and standardise linker-defined symbol names") breaks gdb, readelf and
associated utilities.
This is caused by the .stack section overwriting a block in the middle
of the .debug_info section (despite being included in the
.bss.textdata section in the output file, which apparently has the
correct attributes for a .bss section).
Fixed by adding explicit flags and type to the stack section
declaration.
If it happens that _textdata_memsz ends up being an exact multiple of
4kB, then this will cause the .textdata section (after relocation) to
start on a page boundary. This means that the hidden memory region
(which is rounded down to the nearest page boundary) will start
exactly at virtual address 0, i.e. UNULL. This means that
init_eheap() will erroneously assume that it has failed to allocate a
an external heap, since it typically ends up choosing the area that
lies immediately below .textdata, which in this case will be the
region with top==UNULL.
A subsequent error is that memtop_urealloc() passes through the error
return status -ENOMEM to the caller, which (rightly) assumes that the
result represents a valid userptr_t address.
Fixed by using alternative tests for heap non-existence, and by
returning UNULL in case of an error from init_eheap().
There are many functions that take ownership of the I/O buffer they
are passed as a parameter. The caller should not retain a pointer to
the I/O buffer. Use iob_disown() to automatically nullify the
caller's pointer, e.g.:
xfer_deliver_iob ( xfer, iob_disown ( iobuf ) );
This will ensure that iobuf is set to NULL for any code after the call
to xfer_deliver_iob().
iob_disown() is currently used only in places where it simplifies the
code, by avoiding an extra line explicitly setting the I/O buffer
pointer to NULL. It should ideally be used with each call to any
function that takes ownership of an I/O buffer. (The SSA
optimisations will ensure that use of iob_disown() gets optimised away
in cases where the caller makes no further use of the I/O buffer
pointer anyway.)
If gcc ever introduces an __attribute__((free)), indicating that use
of a function argument after a function call should generate a
warning, then we should use this to identify all applicable function
call sites, and add iob_disown() as necessary.
The DHCP client code now implements only the mechanism of the DHCP and
PXE Boot Server protocols. Boot Server Discovery can be initiated
manually using the "pxebs" command. The menuing code is separated out
into a user-level function on a par with boot_root_path(), and is
entered in preference to a normal filename boot if the DHCP vendor
class is "PXEClient" and the PXE boot menu option exists.
pxe_tftp.c assumes that the first seek on its data-transfer interface
represents the block size. Apart from being an ugly hack, this will
also screw up file size calculation for files smaller than one block.
The proper solution would be to extend the data-transfer interface to
support the reporting of stat()-like data. This is not going to
happen until the cost of adding interface methods is reduced (a fix I
have planned since June 2008).
In the meantime, abuse the xfer_window() method to return the block
size, since it is not being used for anything else and is vaguely
justifiable.
Astonishingly, having returned the incorrect TFTP blocksize via
PXENV_TFTP_OPEN for almost a year seems not to have affected any of
the test cases run during that time; this bug was found only when
someone tried running the heavily-patched version of pxegrub found in
OpenSolaris.
elf2efi converts a suitable ELF executable (containing relocation
information, and with appropriate virtual addresses) into an EFI
executable. It is less tightly coupled with the gPXE build process
and, in particular, does not require the use of a hand-crafted PE
image header in efiprefix.S.
elf2efi correctly handles .bss sections, which significantly reduces
the size of the gPXE EFI executable.
The check for unresolved symbols does not explicitly specify an output
architecture format, and so causes a warning when building an i386 EFI
binary on an x86_64 platform. This warning is harmless, and
specifying the output architecture in multiple places is cumbersome,
so just inhibit the warning.
At POST time some BIOSes return invalid e820 maps even though
they indicate that the data is valid. We add a check that the first
region returned by e820 is RAM type and declare the map to be invalid
if it is not.
This extends the sanity checks from 8b20e5d ("[pcbios] Sanity-check
the INT15,e820 and INT15,e801 memory maps").
Currently the only supported platform for x86_64 is EFI.
Building an EFI64 gPXE requires a version of gcc that supports
__attribute__((ms_abi)). This currently means a development build of
gcc; the feature should be present when gcc 4.4 is released.
In the meantime; you can grab a suitable gcc tree from
git://git.etherboot.org/scm/people/mcb30/gcc/.git
EFI provides a copy of the SMBIOS table accessible via the EFI system
table, which we should use instead of manually scanning through the
F000:0000 segment.
On non-BBS systems, we have to hook INT 19 in order to be able to boot
from the gPXE ROM at all. However, doing this unconditionally will
prevent the user from booting via any other devices.
Previously, the INT 19 entry point would prompt the user to press B in
order to boot from gPXE, which makes it impossible to perform an
unattended network boot. We now prompt the user to press N to skip
booting from gPXE, which allows for unattended operation.
This should be a better match for most real-world scenarios. Most
modern systems support BBS and so are unaffected by this change. Very
old (non-BBS) systems tend not to have PXE ROMs by default anyway; if
the user has added a gPXE ROM then they probably do want to boot from
the network. Newer non-BBS systems are essentially limited to IBM
servers, which will recapture the INT 19 vector anyway and implement
their own boot-ordering selection mechanism.
Remove the assortment of miscellaneous hacks to guess the "network
boot device", and replace them each with a call to last_opened_netdev().
It still isn't guaranteed correct, but it won't be any worse than
before, and it will at least be consistent.
This brings us in to line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.