gcc 4.8 and 4.9 fail to compile pxe_call.c with the error "bp cannot
be used in asm here". Other points in the codebase which use "ebp" in
the asm clobber list do not seem to be affected.
Unfortunately gcc provides no way to specify %ebp as an output
register, so we cannot use this as a workaround. The only viable
solution is to explicitly push/pop %ebp within the asm itself. This
is ugly for two reasons: firstly, it may be unnecessary; secondly, it
may cause gcc to generate invalid %esp-relative addresses if the asm
happens to use memory operands. This specific block of asm uses no
memory operands and so will not generate invalid code.
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Reported-by: Christian Hesse <list@eworm.de>
Originally-fixed-by: Christian Hesse <list@eworm.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of Linux apparently complain if initrds are not aligned
to a page boundary. Fix by changing INITRD_ALIGN from 4 bytes to 4096
bytes.
The amount of padding at the end of each initrd will now often be
sufficient to allow the cpio header to be prepended without crossing
an alignment boundary. The final location of the initrd may therefore
end up being slightly higher than the post-shuffle location.
bzimage_load_initrd() must therefore now copy the initrd body prior to
copying the cpio header, otherwise the start of the initrd body may be
overwritten by the cpio header. (Note that the guarantee that an
initrd will never need to overwrite an initrd at a higher location
still holds, since the overall length of each initrd cannot decrease
as a result of adding a cpio header.)
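As a rough illustration of the new alignment, rounding an address up to
the next boundary might look like the sketch below. INITRD_ALIGN is the
name used above; initrd_align_up() is a hypothetical helper introduced
only for this example and is not necessarily how the code performs the
calculation.

```c
#include <stdint.h>

/* Alignment for initrds, as changed by this commit */
#define INITRD_ALIGN 4096

/* Hypothetical helper: round addr up to the next INITRD_ALIGN boundary */
static inline uint32_t initrd_align_up ( uint32_t addr ) {
	return ( ( addr + INITRD_ALIGN - 1 ) &
		 ~( ( uint32_t ) INITRD_ALIGN - 1 ) );
}
```

With a 4096-byte alignment, an initrd of length 1 occupies a full page,
which is why the trailing padding is now often large enough to hold the
cpio header.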
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 2422647 ("[prefix] Allow prefix to specify an arbitrary maximum
address for relocation") introduced a regression into the UNDI ROM
loader by preserving an extra register on the stack without modifying
the %sp-relative addresses used in the routine.
Fix by correcting the %sp-relative addresses to allow for the extra
preserved variable.
Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When the $(eval) function is available (in GNU make >= 3.80), we can
evaluate many of the dynamically-generated Makefile rules directly.
This avoids generating a few hundred Makefile fragments in the
filesystem, and so speeds up the build process.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Create an explicit concept of "settings scope" and eliminate the magic
values used for numerical setting tags.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Report the cause of the failure when we are unable to open the .mrom
payload. There are two possible failure cases:
 - Unable to find a suitable memory BAR to borrow (e.g. if the NIC
   doesn't have a memory BAR that is at least as large as the
   expansion ROM BAR, or if the memory BAR has been assigned a 64-bit
   address which won't fit into the 32-bit expansion ROM BAR). This
   will be reported as "BABABABA".
 - Unable to find correct ROM image within the BAR. This will be
   reported as the address (within the borrowed BAR) at which we first
   fail to find a valid 55AA signature.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Exploit the redefinition of iPXE error codes to include a "platform
error code" to allow for meaningful conversion of EFI_STATUS values to
iPXE errors and vice versa.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The low 8 bits of an iPXE error code are currently defined as the
closest equivalent PXE error code. Generalise this scheme to
platforms other than PC-BIOS by extending this definition to "closest
equivalent platform error code". This allows for the possibility of
returning meaningful errors via EFI APIs.
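A minimal sketch of the idea, assuming only what is stated above (that
the platform error code occupies the low 8 bits). The names
PLATFORM_ERRNO_MASK and platform_errno() are hypothetical and chosen
for illustration.

```c
#include <stdint.h>

/* Hypothetical mask: the platform error code lives in the low 8 bits */
#define PLATFORM_ERRNO_MASK 0xffU

/* Hypothetical helper: extract the platform error code from an
 * iPXE-style error code
 */
static inline uint8_t platform_errno ( uint32_t rc ) {
	return ( ( uint8_t ) ( rc & PLATFORM_ERRNO_MASK ) );
}
```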
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The implementation of INT 10,06 on some BIOSes (observed with both
Hyper-V and a Dell OptiPlex 7010) seems to treat %dx=0xffff as a
special value meaning "do absolutely nothing". Fix by using
%dx=0xfefe, which should still be sufficient to cover any realistic
screen size.
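For reference, INT 10,06 takes the lower-right corner of the scroll
window in %dx, with the row in %dh and the column in %dl. The sketch
below (with a hypothetical helper name) shows how the value is packed;
0xfefe therefore corresponds to row 254, column 254.

```c
#include <stdint.h>

/* Hypothetical helper: pack the lower-right corner for INT 10,06
 * (row in the high byte, column in the low byte)
 */
static inline uint16_t scroll_corner ( uint8_t row, uint8_t col ) {
	return ( ( uint16_t ) ( ( ( uint16_t ) row << 8 ) | col ) );
}
```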
Reported-by: John Clark <skyman@iastate.edu>
Tested-by: John Clark <skyman@iastate.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Abstract out the ability to reboot the system to a separate reboot()
function (with platform-specific implementations), add an EFI
implementation, and make the existing "reboot" command available under
EFI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 2629b7e ("[pcbios] Inhibit all calls to INT 15,e820 and INT
15,e801 during POST") introduced a regression into .lkrn images when
used with no corresponding initrd.
Specifically, the semantics of the "maximum address for relocation"
value passed to install_prealloc() in %ebp changed so that zero became
a special value meaning "inhibit use of INT 15,e820 and INT 15,e801".
The %ebp value meaning "no upper limit on relocation" was changed from
zero to 0xffffffff, and all prefixes providing fixed values for %ebp
were updated to match the new semantics.
The .lkrn prefix provides the initrd base address as the maximum
address for relocation. When no initrd is present, this address will
be zero, and so will unintentionally trigger the "inhibit INT 15,e820
and INT 15,e801" behaviour.
Fix by explicitly setting %ebp to 0xffffffff if no initrd is present
before calling install_prealloc().
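The fix itself lives in the assembly prefix; the C sketch below only
illustrates the selection logic. RELOC_NO_LIMIT and lkrn_max_reloc()
are hypothetical names used for this example.

```c
#include <stdint.h>

/* Value meaning "no upper limit on relocation" (as described above) */
#define RELOC_NO_LIMIT 0xffffffffU

/* Hypothetical helper: choose the value to pass in %ebp.  A zero
 * initrd base means "no initrd", and must not be passed through since
 * zero would inhibit INT 15,e820 and INT 15,e801.
 */
static inline uint32_t lkrn_max_reloc ( uint32_t initrd_base ) {
	return ( initrd_base ? initrd_base : RELOC_NO_LIMIT );
}
```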
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Tested-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
If a multifunction PCI device exposes an iPXE ROM via each function,
then each function will display a "Press Ctrl-B to configure iPXE"
prompt, and delay for two seconds. Since a single instance of iPXE
can drive all functions on the multifunction device, this simply adds
unnecessary delay to the boot process.
Fix by inhibiting the "Press Ctrl-B" prompt for all except the first
function on a PCI device.
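A sketch of the check, assuming the usual PCI encoding in which the
function number is the low three bits of the device/function byte.
The helper names are hypothetical; the real check is performed in the
ROM prefix.

```c
#include <stdint.h>

/* Function number: low three bits of the PCI device/function byte */
#define EXAMPLE_PCI_FUNC( devfn ) ( (devfn) & 0x07 )

/* Hypothetical helper: show the "Press Ctrl-B" prompt only for the
 * first function of a device
 */
static inline int show_ctrl_b_prompt ( uint8_t devfn ) {
	return ( EXAMPLE_PCI_FUNC ( devfn ) == 0 );
}
```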
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Many BIOSes do not construct the full system memory map until after
calling the option ROM initialisation entry points. For several
years, we have added sanity checks and workarounds to accommodate
charming quirks such as BIOSes which report the entire 32-bit address
space (including all memory-mapped PCI BARs) as being usable RAM.
The IBM x3650 takes quirky behaviour to a new extreme. Calling either
INT 15,e820 or INT 15,e801 during POST doesn't just get you invalid
data. We could cope with invalid data. Instead, these nominally
read-only API calls manage to trash some internal BIOS state, with the
result that the system memory map is _never_ constructed. This tends
to confuse subsequent bootloaders and operating systems.
[ GRUB 0.97 fails in a particularly amusing way. Someone thought it
would be a good idea for memcpy() to check that the destination memory
region is a valid part of the system memory map; if not, then memcpy()
will sulk, fail, and return NULL. This breaks pretty much every use
of memcpy() including, for example, those inserted implicitly by gcc
to copy non-const initialisers. Debugging is _fun_ when a simple call
to printf() manages to create an infinite recursion, exhaust the
available stack space, and shut down the CPU. ]
Fix by completely inhibiting calls to INT 15,e820 and INT 15,e801
during POST.
We do now allow relocation during POST up to the maximum address
returned by INT 15,88 (which seems so far to always be safe). This
allows us to continue to have a reasonable size of external heap, even
if the PMM allocation is close to the 1MB mark.
The downside of allowing relocation during POST is that we may
overwrite PMM-allocated memory in use by other option ROMs. However,
the downside of inhibiting relocation, when combined with also
inhibiting calls to INT 15,e820 and INT 15,e801, would be that we
might have no external heap available: this would make booting an OS
impossible and could prevent some devices from even completing
initialisation.
On balance, the lesser evil is probably to allow relocation during
POST (up to the limit provided by INT 15,88). Entering iPXE during
POST is a rare operation; on the even rarer systems where doing so
happens to overwrite a PMM-allocated region, then there exists a
fairly simple workaround: if the user enters iPXE during POST and
wishes to exit iPXE, then the user must reboot. This is an acceptable
cost, given the rarity of the situation and the simplicity of the
workaround.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
romprefix.S currently calls uninstall() with an invalid value in %ax.
Consequently, base memory is not freed after a ROM boot attempt (or
after entering iPXE during POST).
The uninstall() function is physically present in .text16, and so can
use %cs to determine the .text16 segment address. The .data16 segment
address is not required, since uninstall() is called only by code
paths which set up .data16 to immediately follow .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PXE TFTP API allows the caller to request a particular TFTP block
size. Since mid-2008, iPXE has appended a "?blksize=xxx" parameter to
the TFTP URI constructed internally; nothing has ever parsed this
parameter. Nobody seems to have cared that this parameter has been
ignored for almost five years.
Fix by using xfer_window(), which provides a fairly natural way to
convey the block size information from the PXE TFTP API to the TFTP
protocol layer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks are known to claim that IRQs are supported, but then
never generate interrupts. No satisfactory solution has been found to
this problem; the workaround is to add the PCI vendor and device IDs
to a list of devices which will be treated as simply not supporting
interrupts.
This is something of a hack, since it will generate false positives
for identical devices with a working PXE stack (e.g. those that have
been reflashed with iPXE), but it's an improvement on the current
situation.
Reported-by: Richard Moore <rich@richud.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At present, loading a bzImage via iPXE requires enough RAM to hold two
copies of each initrd file. Remove this constraint by rearranging the
initrds in place.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
No code from the original source remains within this file; relicense
under GPL2+ with a new copyright notice.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Current versions of gcc require -maccumulate-outgoing-args if any
sysv_abi functions call ms_abi functions. This requirement is likely
to be lifted in future gcc versions, so test explicitly to see if the
current version of gcc requires -maccumulate-outgoing-args.
This problem is currently masked since the implied
-fasynchronous-unwind-tables (which is the default in current gcc
versions) implies -maccumulate-outgoing-args.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 73eb3f1 ("[int13] Zero all possible registers when jumping to a
boot sector") introduced a regression preventing the SAN-booting of
boot sectors which rely upon %dl containing the correct drive number
(such as most CD-ROM boot sectors).
Fix by not zeroing %edx.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At least one boot sector (the DUET boot sector used for bootstrapping
EFI from a non-EFI system) fails to initialise the high words of
registers before using them in calculations, leading to undefined
behaviour.
Work around such broken boot sectors by explicitly zeroing the
contents of all registers apart from %cs:%ip and %ss:%sp.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Almost all clients of the raw-packet interfaces (UNDI and SNP) can
handle only Ethernet link layers. Expose an Ethernet-compatible link
layer to local clients, while remaining compatible with IPoIB on the
wire. This requires manipulation of ARP (but not DHCP) packets within
the IPoIB driver.
This is ugly, but it's the only viable way to allow IPoIB devices to
be driven via the raw-packet interfaces.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
COMBOOT images are detected by looking for a ".com" or ".cbt" filename
extension. There are widely-used files with a ".com" extension, such
as "wdsnbp.com", which are PXE images rather than COMBOOT images.
Avoid false detection of PXE images as COMBOOT images by accepting
only a ".cbt" extension as indicating a COMBOOT image.
Interestingly, this bug has been present for a long time but was
frequently concealed because the filename was truncated to fit the
fixed-length "name" field in struct image. (PXE binaries ending in
".com" tend to be related to Windows deployment products and so often
use pathnames including backslashes, which iPXE doesn't recognise as a
path separator and so treats as part of a very long filename.)
Commit 1c127a6 ("[image] Simplify image management commands and
internal API") made the image name a variable-length field, and so
exposed this flaw in the COMBOOT image detection algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the file mode to be specified using a "mode=" command line
parameter. For example:
  initrd http://web/boot/bootlocal.sh /opt/bootlocal.sh mode=755
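A minimal sketch of how such an argument could be parsed, assuming the
mode is interpreted as octal in the usual Unix fashion. parse_mode()
is a hypothetical helper and not the function used by the image
loader.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "mode=NNN" as an octal file mode.
 * Returns the mode, or -1 if the argument is not a valid "mode="
 * parameter.
 */
static long parse_mode ( const char *arg ) {
	static const char prefix[] = "mode=";
	char *end;
	unsigned long mode;

	/* Check for the "mode=" prefix */
	if ( strncmp ( arg, prefix, strlen ( prefix ) ) != 0 )
		return -1;
	arg += strlen ( prefix );

	/* Parse octal digits; reject empty or trailing garbage */
	mode = strtoul ( arg, &end, 8 );
	if ( ( end == arg ) || ( *end != '\0' ) )
		return -1;
	return ( ( long ) mode );
}
```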
Requested-by: Bryce Zimmerman <bryce.zimmerman@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some AMI BIOSes apparently break in exciting ways when asked for PMM
allocations for sizes that are not multiples of 4kB.
Fix by rounding up the image source area to the nearest 4kB. (The
temporary decompression area is already rounded up to the nearest
128kB, to facilitate sharing between multiple iPXE ROMs.)
Reported-by: Itay Gazit <itayg@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Cygwin's assembler treats '/' as a comment character.
Reported-by: Steve Goodrich <steve.goodrich@se-eng.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Similarly to FreeBSD, OpenBSD requires the object format to be
specified as elf_i386_obsd rather than elf_i386.
Reported-by: Jiri B <jirib@devio.us>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PCI3.0 allows us to report a "runtime size" which can be smaller than
the actual ROM size. On systems that support PMM our runtime size
will be small (~2.5kB), which helps to conserve the limited option ROM
space. However, there is no guarantee that the PMM allocation will
succeed, and so we need to report the worst-case runtime size in the
PCI header.
Move the "shrunk ROM size" field from the PCI header to a new "iPXE
ROM header", allowing it to be accessed by ROM-manipulation utilities
such as disrom.pl.
Reported-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_FILE_CMDLINE is an iPXE extension, and will not be supported by
most PXE stacks. Do not report any errors to the user, since in
almost all cases the error will mean simply "not loaded by iPXE".
Reported-by: Patrick Domack <patrickdk@patrickdk.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Attempt to restore the network device to the state it was in prior to
calling the NBP. This simplifies the task of taking follow-up action
in an iPXE script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes (observed on a Supermicro system with an AMI BIOS) seem to
use the area immediately below 0x7c00 to store data related to the
boot process. This data is currently liable to be overwritten by the
temporary stack used while decompressing and installing iPXE.
Try to avoid any such problems by placing the temporary stack
immediately after the loaded iPXE binary. Any memory used by the
stack could then potentially have been overwritten anyway by a larger
binary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The setup_move_size field is not defined in protocol versions earlier
than 2.00 (and is obsolete in versions later than 2.01). In binaries
using versions earlier than 2.00, the relevant location is likely to
contain executable code.
Interestingly, this bug has been present since support for pre-2.00
protocol versions was added in 2009, and has been unexpectedly
modifying the memtest86+ code fragment:
  mov $0x92, %dx
  inb %dx, %al
Fortuitously, the modification exactly overwrote the value loaded into
%dx, and so the net effect was limited to causing Fast Gate A20
detection to always fail.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The undinet driver always has to make a copy of the received frame
into an I/O buffer. Align this copy sensibly so that subsequent
operations are as fast as possible.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The generic TCP/IP checksum implementation requires approximately 10
CPU clocks per byte (as measured using the TSC). Improve this to
approximately 0.5 CPU clocks per byte by using "lodsl ; adcl" in an
unrolled loop.
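For comparison, a portable C reference for the value being computed
(the 16-bit ones-complement sum) is sketched below. This is not the
optimised "lodsl ; adcl" routine, and reference_checksum() is a
hypothetical name; it treats the data as big-endian 16-bit words and
returns the complemented sum.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable reference: 16-bit ones-complement checksum of a buffer */
static uint16_t reference_checksum ( const uint8_t *data, size_t len ) {
	uint32_t sum = 0;
	size_t i;

	/* Sum the data as big-endian 16-bit words */
	for ( i = 0 ; ( i + 1 ) < len ; i += 2 )
		sum += ( ( ( uint32_t ) data[i] << 8 ) | data[ i + 1 ] );

	/* Pad any trailing odd byte with zero */
	if ( len & 1 )
		sum += ( ( uint32_t ) data[ len - 1 ] << 8 );

	/* Fold the carries back into the low 16 bits */
	while ( sum >> 16 )
		sum = ( ( sum & 0xffff ) + ( sum >> 16 ) );

	return ( ( uint16_t ) ~sum );
}
```

Running the checksum over a header that already contains a correct
checksum field yields zero, which is how received packets are
verified.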
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calculating the TCP/IP checksum on received packets accounts for a
substantial fraction of the response latency.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "rep" prefix can be used with an iteration count of zero, which
allows the variable-length memcpy() to be implemented without using
any conditional jumps.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PMM defines the return code 0xffffffff as meaning "unsupported
function". It's hard to imagine a PMM BIOS that doesn't support
pmmAllocate(), but apparently such things do exist.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A .mrom image currently assumes that it is the first image within the
expansion ROM BAR, which may not be correct when multiple images are
present.
Fix by scanning through the BAR until we locate an image matching our
build ID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The header of a .mrom image declares its length to be only a few
kilobytes; the remainder is accessed via a sideband mechanism. This
makes it difficult to append an additional ROM image, such as an EFI
ROM.
Add a second, dummy ROM header covering the payload portion of the
.mrom image, allowing consumers to locate any appended ROM images in
the usual way.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Avoid potential confusion in the documentation by using a
vendor-neutral name for the extended (AMD-defined) feature set.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The WinCE, a.out and FreeBSD loaders are designed to be #included by
core/loader.c, which no longer exists. These old loaders are not
usable anymore and cause compilation failures when enabled in
config/general.h.
Signed-off-by: Marin Hannache <mareo@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Solaris assumes that there is enough space above the Multiboot modules
to use as a decompression and scratch area. This assumption is
invalid when using iPXE, which places the Multiboot modules near the
top of (32-bit) memory.
Fix by copying the modules to an area of memory immediately following
the loaded kernel.
Debugged-by: Michael Brown <mcb30@ipxe.org>
Debugged-by: Scott McWhirter <scottm@joyent.com>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow iPXE settings to be specified in the .vmx file via the VMware
GuestInfo mechanism. For example:
  guestinfo.ipxe.filename = "http://boot.ipxe.org/demo/boot.php"
  guestinfo.ipxe.dns = "192.168.0.1"
  guestinfo.ipxe.net0.ip = "192.168.0.15"
  guestinfo.ipxe.net0.netmask = "255.255.255.0"
  guestinfo.ipxe.net0.gateway = "192.168.0.1"
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Solaris kernels seem to rely on having the full kernel path present in
the multiboot command line; if only the kernel name is present then
the boot fails with the error message
  krtld: failed to open 'unix'
Debugged-by: Michael Brown <mcb30@ipxe.org>
Debugged-by: Scott McWhirter <scottm@joyent.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Using __from_text16() and __from_data16() in inline asm constraints
sometimes defeats gcc's ability to simplify expressions down to
compile-time constants.
Reported-by: Jason Kohles <jkohles@palantir.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At present, we always hide an extra sizeof(struct external_memory), to
account for the header on the lowest allocated block. This header
ceases to exist when there are no allocated blocks remaining.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An ANSI escape sequence context cannot be shared between multiple
users. Make the ANSI escape sequence context part of the line console
definition and provide individual contexts for each user.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The output from text-based user interfaces such as the "config"
command is not generally meaningful for logfile-based consoles such as
syslog and vmconsole.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the concept of a "console usage", such as "standard output" or
"debug messages". Allow usages to be associated with each console
independently. For example, to send debugging output via the serial
port, while preventing it from appearing on the local console:
  #define CONSOLE_SERIAL CONSOLE_USAGE_ALL
  #define CONSOLE_PCBIOS ( CONSOLE_USAGE_ALL & ~CONSOLE_USAGE_DEBUG )
If no usages are explicitly specified, then a default set of usages
will be applied. For example:
  #define CONSOLE_SERIAL
will have the same effect as
  #define CONSOLE_SERIAL CONSOLE_USAGE_ALL
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the name, cmdline, and action parameters from imgdownload() and
imgdownload_string(). These functions now simply download and return
an image.
Add the function imgacquire(), which will interpret a "name or URI
string" parameter and return either an existing image or a newly
downloaded image.
Use imgacquire() to merge similar image-management commands that
currently differ only by whether they take the name of an existing
image or the URI of a new image to download. For example, "chain" and
"imgexec" can now be merged.
Extend imgstat and imgfree commands to take an optional list of
images.
Remove the arbitrary restriction on the length of image names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no INT 10 call for "display character with attribute,
advancing the cursor and scrolling the screen as necessary". We
therefore make two INT 10 calls: INT 10,09 to write the character with
its attribute at the current cursor position, and then INT 10,0e to
(re)write the character (leaving the attribute unchanged), advance the
cursor position and scroll as necessary.
This confuses the serial-over-LAN console redirection feature provided
by some BIOSes.
Fix by performing the INT 10,09 only when necessary to change the
existing attribute.
Reported-by: Itay Gazit <itaygazit@gmail.com>
Tested-by: Itay Gazit <itaygazit@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RSA requires modular exponentiation using arbitrarily large integers.
Given the sizes of the modulus and exponent, all required calculations
can be done without any further dynamic storage allocation. The x86
architecture allows for efficient large integer support via inline
assembly using the instructions that take advantage of the carry flag
(e.g. "adcl", "rcrl").
This implementation is approximately 80% smaller than the (more generic)
AXTLS implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Minimise code size by forcing the use of memory addresses for
__bswap_16s() and __bswap_64s(). (__bswap_32s() cannot avoid loading the
value into a register.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Fix a strict-aliasing error on certain versions of gcc.
Reported-by: Marko Myllynen <myllynen@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the "bswap" instruction to shrink the size of byte-swapping code,
and provide the in-place variants __bswap_{16,32,64}s.
"bswap" is available only on 486 and later processors. (We already
assume the presence of "cpuid" and "rdtsc", which are available only
on Pentium and later processors.)
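For readers unfamiliar with the operation, a portable C equivalent of
a 32-bit byte swap and its in-place variant is sketched below. The
names example_swap32() and example_swap32s() are hypothetical; the
real implementation uses the "bswap" instruction via inline assembly.

```c
#include <stdint.h>

/* Portable equivalent of a 32-bit byte swap */
static inline uint32_t example_swap32 ( uint32_t x ) {
	return ( ( x >> 24 ) |
		 ( ( x >> 8 ) & 0x0000ff00U ) |
		 ( ( x << 8 ) & 0x00ff0000U ) |
		 ( x << 24 ) );
}

/* In-place variant, in the spirit of __bswap_32s() */
static inline void example_swap32s ( uint32_t *x ) {
	*x = example_swap32 ( *x );
}
```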
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks (observed with a QLogic 8242) will always try to
prepend a link-layer header, even if the caller uses P_UNKNOWN to
indicate that the link-layer header has already been filled in. This
results in an invalid packet being transmitted.
Work around these faulty PXE stacks where possible by stripping the
existing link-layer header and allowing the PXE stack to (re)construct
the link-layer header itself.
Originally-fixed-by: Buck Huppmann <buckh@pobox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RTC-based entropy source uses the nanosecond-scale CPU TSC to
measure the time between two 1kHz interrupts generated by the CMOS
RTC. In a physical machine these clocks are driven from independent
crystals, resulting in some observable clock drift. In a virtual
machine, the CMOS RTC is typically emulated using host-OS
constructions such as SIGALRM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
ANS X9.82 specifies several Approved Sources of Entropy Input (SEI).
One such SEI uses an entropy source as the Source of Entropy Input,
condensing each entropy source output after each GetEntropy call.
This can be implemented relatively cheaply in iPXE and avoids the need
to allocate potentially very large buffers.
(Note that the terms "entropy source" and "Source of Entropy Input"
are not synonyms within the context of ANS X9.82.)
Use the iPXE API mechanism to allow entropy sources to be selected at
compilation time.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
initrd_init() calls umalloc() to allocate space for the initrd image,
but does so before hide_etherboot() has been called. It is therefore
possible for the initrd to end up overwriting iPXE itself.
Fix by converting initrd_init() from an init_fn to a startup_fn.
Originally-fixed-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The command line may be situated in an area of base memory that will
be overwritten by iPXE's real-mode segments, causing the command line
to be corrupted before it can be used.
Fix by creating a copy of the command line on the prefix stack (below
0x7c00) before installing the real-mode segments.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_FILE_EXIT_HOOK is designed to allow ipxelinux.0 to unload both
the iPXE and pxelinux components without affecting the underlying PXE
stack. Unfortunately, it causes unexpected behaviour in other
situations, such as when loading a non-embedded pxelinux.0 via
undionly.kpxe. For example:
  PXE ROM -> undionly.kpxe -> pxelinux.0 -> chain.c32 to boot hd0
would cause control to return to iPXE instead of booting from the hard
disk. In some cases, this would result in a harmless but confusing
"No more network devices" message; in other cases stranger things
would happen, such as being returned to the iPXE shell prompt.
The fundamental problem is that when pxelinux detects
PXENV_FILE_EXIT_HOOK, it may attempt to specify an exit hook and then
exit back to iPXE, assuming that iPXE will in turn exit cleanly via
the specified exit hook. This is not a valid assumption in the
general case, since the action of exiting back to iPXE does not
directly cause iPXE to exit itself. (In the specific case of
ipxelinux.0, this will work since the embedded script exits as soon as
pxelinux.0 exits.)
Fix the unexpected behaviour in the non-ipxelinux.0 cases by including
support for PXENV_FILE_EXIT_HOOK only when using a new .kkkpxe format.
The ipxelinux.0 build process should therefore now use undionly.kkkpxe
instead of undionly.kkpxe.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Very nasty things can happen if a NULL network device is used. Check
that pxe_netdev is non-NULL at the applicable entry points, so that
this type of problem gets reported to the caller rather than being
allowed to crash the system.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On at least one PXE stack (Realtek r8169), PXENV_UNDI_INITIALIZE has
been observed to fail intermittently due to a media test failure (PXE
error 0x00000061). Retrying the call to PXENV_UNDI_INITIALIZE
succeeds, and the NIC is then usable.
It is worth noting that this particular Realtek PXE stack is already
known to be unreliable: for example, it repeatably fails its own
boot-time media test after every warm reboot.
Fix by attempting PXENV_UNDI_INITIALIZE multiple times, with a short
delay between each attempt to allow the link to settle.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow an initrd (such as an embedded script) to be passed to iPXE when
loaded as a .lkrn (or .iso) image. This allows an embedded script to
be varied without recompiling iPXE.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Specify a driver name of "undionly" and a device name based on the
UNDI-reported underlying hardware device. For example:
  net0: 52:54:00:12:34:56 using undionly on UNDI-PCI00:03.0 (open)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes are reported to corrupt %ebx when using INT 15,2401 (see
http://opensolaris.org/jive/thread.jspa?messageID=377026). Guard
against this by preserving all (non-segment) registers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The symbol _text16 is defined globally by the linker. Use rm_text16
instead of _text16 for the local variable within librm.S to avoid
confusion when reading linker maps.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
All users of imgdownload() require registration of the image, so make
registration an integral part of imgdownload() itself and simplify the
"action" parameter to be one of image_select(), image_exec() et al.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE specifies a value of 0 for cmdline_size, causing GRUB to not pass
in a command line. Fix by setting cmdline_size to the maximum value
of 2047.
Signed-off-by: Valentine Barshak <gvaxon@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the link layer to directly report whether or not a packet is
multicast or broadcast at the time of calling pull(), rather than
relying on heuristics to determine this at a later stage.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow objects to support both streaming and block device protocols, by
starting streaming data only when the data transfer window opens.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some bootloaders seem to add "BOOT_IMAGE=..." at the end of the
command line; some at the start. Cope with either variation.
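One way to cope with either position is to remove the argument
wherever it appears. The sketch below is an illustration only:
strip_boot_image() is a hypothetical helper, and it assumes that
arguments are separated by spaces.

```c
#include <string.h>

/* Hypothetical helper: remove a "BOOT_IMAGE=..." argument from the
 * command line in place, whether it appears at the start, in the
 * middle, or at the end.  Returns the command line.
 */
static char * strip_boot_image ( char *cmdline ) {
	static const char key[] = "BOOT_IMAGE=";
	char *start = cmdline;
	char *end;

	/* Find the key at the start of an argument */
	while ( ( start = strstr ( start, key ) ) != NULL ) {
		if ( ( start == cmdline ) || ( start[-1] == ' ' ) )
			break;
		start++;
	}
	if ( ! start )
		return cmdline;

	/* Remove the argument and any spaces that follow it */
	end = strchr ( start, ' ' );
	if ( end ) {
		while ( *end == ' ' )
			end++;
		memmove ( start, end, ( strlen ( end ) + 1 ) );
	} else {
		*start = '\0';
	}

	/* Trim any trailing spaces left behind */
	end = ( cmdline + strlen ( cmdline ) );
	while ( ( end > cmdline ) && ( end[-1] == ' ' ) )
		*(--end) = '\0';

	return cmdline;
}
```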
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
IBM BIOSes ignore the PnP header offset stored at address 0x1a and
instead scan for the $PnP signature on a 16-byte boundary. (This
alignment is not mandated by the PnP specification.)
Force PnP header to a 16-byte boundary to work around these BIOSes.
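The BIOS behaviour described above can be modelled roughly as
follows. This is a sketch of the scanning behaviour as reported, not
code taken from any BIOS; find_pnp_header() is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the BIOS scan: look for the "$PnP" signature
 * on 16-byte boundaries only.  Returns the offset of the header, or
 * -1 if no aligned signature is found.
 */
static long find_pnp_header ( const unsigned char *rom, size_t len ) {
	size_t offset;

	for ( offset = 0 ; ( offset + 4 ) <= len ; offset += 16 ) {
		if ( memcmp ( ( rom + offset ), "$PnP", 4 ) == 0 )
			return ( ( long ) offset );
	}
	return -1;
}
```

A header placed at an offset that is not a multiple of 16 is never
seen by such a scan, which is why the header must be aligned.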
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several BIOSes (including most IBM BIOSes and many virtual machine
BIOSes) do not provide detectable PnP support, but will use the BEV
entry point for a PnP option ROM. On these semi-PnP BIOSes, iPXE will
respond to the absence of detectable PnP support by hooking INT19,
which disrupts the boot order.
BIOSes that genuinely require hooking INT19 seem to be very rare
nowadays. It may therefore be preferable to assume that the absence
of detectable PnP support indicates a semi-PnP BIOS rather than a
non-PnP BIOS.
Change the default behaviour so that INT19 will never be hooked unless
the compile-time option NONPNP_HOOK_INT19 is enabled. Leave the
redundant PnP detection routine in-place to allow for debugging via
the ROM banner line.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Revert commit 38cd351 ("[romprefix] Attempt to gracefully handle
semi-PnP IBM BIOSes"), since the test for the "IBM " signature in %edi
is not sufficient to identify an IBM BIOS.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some IBM BIOSes provide partial support for PnP: they will use the BEV
entry point but will not advertise PnP support. This causes iPXE to
hook INT 19, which disrupts the boot process.
Attempt to improve this situation by detecting an IBM BIOS and
treating it as a PnP BIOS despite the absence of a PnP signature.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of binutils have curious concepts of what constitutes
subtraction. For example:
  0x00000000000000f0  _text16_late = .
  0x0000000000000898  _mtext16 = .
  0x0000000000000898  _etext16 = .
  0x0000000000000898  _text16_late_filesz = ABSOLUTE ((_mtext16 - _text16_late))
  0x00000000000007a8  _text16_late_memsz = ABSOLUTE ((_etext16 - _text16_late))
This has interesting side-effects such as producing sizes for .bss
segments that are negative, causing the majority of addressable memory
to be zeroed out.
Fix by using the form
  ABSOLUTE ( x ) - ABSOLUTE ( y )
rather than
  ABSOLUTE ( x - y )
Reported-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Tested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This allows older versions of ELTORITO.SYS (such as the version found
on the FreeDOS installation CD-ROM) to use iPXE's emulated CD-ROM
drive.
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose the multiple-SAN-drive capability of the iPXE core via the iPXE
command line by adding commands to hook and unhook additional drives.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks (notably old Etherboot/gPXE stacks) will claim to use
the timer interrupt, rather than reporting that interrupts are not
supported. Since using the timer interrupt is equivalent to polling
anyway, we may as well genuinely poll these stacks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Temporary modification to prevent valgrind.h from breaking compilation
with gcc 4.6. When this problem is fixed upstream, a new and
unmodified copy of valgrind.h should be imported.
Signed-off-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An iPXE .exe image can be loaded from DOS. Tested using bin/ipxe.exe
to load a Linux kernel and simple initramfs from within MS-DOS 6.22.
(EDD must be disabled using the "edd=off" kernel parameter, since the
loaded kernel image has already overwritten parts of DOS' INT 13
wrapper.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In the unlikely (but observable) event that INT 15,88 returns less
memory above 1MB than is required for the temporary decompression
area, ignore it and use the 1MB point anyway.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Make the allocators used by malloc and linux_umalloc valgrindable.
Include valgrind headers in the codebase to avoid a build dependency
on valgrind.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some PXE stacks advertise that interrupts are not supported, despite
requiring the use of interrupts. Attempt to cope with such cards
without breaking others by always hooking the interrupt, and using the
"interrupts supported" flag only to decide whether or not to wait for
an interrupt before calling PXENV_UNDI_ISR_IN_PROCESS.
The possible combinations are therefore:
  1. Card generates interrupts and claims to support interrupts
     iPXE will call PXENV_UNDI_ISR_IN_PROCESS only after an interrupt
     has been observed. (This is required to avoid lockups in some PXE
     stacks, which spuriously sulk if called before an interrupt has
     been generated.)
     Such a card should work correctly.
  2. Card does not generate interrupts and does not claim to support
     interrupts
     iPXE will call PXENV_UNDI_ISR_IN_PROCESS indiscriminately, matching
     the observed behaviour of at least one other PXE NBP (winBoot/i).
     Such a card should work correctly.
  3. Card generates interrupts but claims not to support interrupts
     iPXE will call PXENV_UNDI_ISR_IN_PROCESS indiscriminately. An
     interrupt will still result in a call to PXENV_UNDI_ISR_IN_START.
     Such a card may work correctly.
  4. Card does not generate interrupts but claims to support interrupts
     Such a card will not work at all.
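The four combinations above reduce to a single decision, sketched here. The function name should_process_isr() and its parameters are hypothetical; this models the described policy and is not the driver code itself.

```c
#include <stdbool.h>

/* Hypothetical model of the polling policy: decide whether to call
 * PXENV_UNDI_ISR_IN_PROCESS on this poll.
 *
 * irq_supported: the PXE stack claims to support interrupts
 * irq_observed:  an interrupt has been seen since the last poll
 */
bool should_process_isr ( bool irq_supported, bool irq_observed ) {
	/* Stack claims interrupt support: wait for an interrupt */
	if ( irq_supported )
		return irq_observed;
	/* Stack claims no interrupt support: call indiscriminately */
	return true;
}
```

In case 4 the function never returns true, since the card claims interrupt support but no interrupt is ever observed; this matches the statement that such a card will not work at all.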
Reported-by: Jerry Cheng <jaspers.cheng@msa.hinet.net>
Tested-by: Jerry Cheng <jaspers.cheng@msa.hinet.net>
Reported-by: Mauricio Silveira <mauricio@livreti.com.br>
Tested-by: Mauricio Silveira <mauricio@livreti.com.br>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE allocates its first PMM block using the image source length,
which is rounded up to the nearest 16-byte paragraph. It then copies
in data of a length calculated from the ROM size, which is
theoretically less than or equal to the image source length, but is
rounded up to the nearest 512-byte sector. This can result in copying
beyond the end of the allocated PMM block, which can corrupt the PMM
data structures (and other essentially arbitrary areas of memory).
Fix by rounding up the image source length to the nearest 512-byte
sector before using it as the PMM allocation length.
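The mismatch between the two rounding granularities can be shown with a little arithmetic. The helper names and the example length of 50001 bytes are hypothetical, chosen only to illustrate the overrun.

```c
#include <stdint.h>

/* Round len up to a multiple of align (align must be a power of two) */
uint32_t round_up ( uint32_t len, uint32_t align ) {
	return ( ( len + align - 1 ) & ~( align - 1 ) );
}

/* Bytes copied beyond the PMM block when the allocation is rounded
 * to a 16-byte paragraph but the copy is rounded to a 512-byte sector
 */
uint32_t old_overrun ( uint32_t image_len, uint32_t rom_len ) {
	uint32_t allocated = round_up ( image_len, 16 );
	uint32_t copied = round_up ( rom_len, 512 );

	return ( ( copied > allocated ) ? ( copied - allocated ) : 0 );
}

/* Same calculation with the allocation rounded up to a sector */
uint32_t new_overrun ( uint32_t image_len, uint32_t rom_len ) {
	uint32_t allocated = round_up ( image_len, 512 );
	uint32_t copied = round_up ( rom_len, 512 );

	return ( ( copied > allocated ) ? ( copied - allocated ) : 0 );
}
```

For a 50001-byte image, the old scheme allocates 50016 bytes but copies 50176, overrunning by 160 bytes; with sector rounding on both sides the overrun is zero whenever rom_len does not exceed image_len.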
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Jarrod Johnson <jarrod.b.johnson@gmail.com>
Reported-by: Itay Gazit <itayg@mellanox.co.il>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
INT 16,01 will discard some extended keystrokes on some BIOSes, making
it impossible for iPXE to detect keypresses such as F12. Fix by using
INT 16,11 instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some prefixes (e.g. .lkrn) allow a command line to be passed in to
iPXE. At present, this command line is ignored.
If a command line is provided, treat it as an embedded script (without
an explicit "#!ipxe" magic marker). This allows for patterns of
invocation such as
  title iPXE
  kernel /boot/ipxe.lkrn dhcp && \
         sanboot iscsi:10.0.4.1::::iqn.2010-04.org.ipxe.dolphin:storage
Here GRUB is instructed to load ipxe.lkrn with an embedded script
equivalent to
  #!ipxe
  dhcp
  sanboot iscsi:10.0.4.1::::iqn.2010-04.org.ipxe.dolphin:storage
This can be used to effectively vary the embedded script without
having to rebuild ipxe.lkrn.
Originally-implemented-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The function keys F5-F12 all conform to the same ANSI pattern as the
other "special" keys that we currently recognise. Add these key
definitions, and shrink the representation of the ANSI sequences in
bios_console.c to compensate.
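The common pattern can be sketched as below, assuming the usual VT220/xterm-style numbering in which F5-F12 are sent as "ESC [ n ~". The function name ansi_fkey() is hypothetical and this is not the representation used in bios_console.c.

```c
#include <stdio.h>

/* Write the ANSI escape sequence for function key F5..F12 into buf.
 * Returns the sequence length, or -1 for keys outside F5..F12.
 * Assumes the conventional "ESC [ n ~" numbering.
 */
int ansi_fkey ( unsigned int fkey, char *buf, size_t len ) {
	static const unsigned char codes[] =
		{ 15, 17, 18, 19, 20, 21, 23, 24 };

	if ( ( fkey < 5 ) || ( fkey > 12 ) )
		return -1;
	return snprintf ( buf, len, "\x1b[%u~",
			  ( unsigned int ) codes[ fkey - 5 ] );
}
```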
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Refactor the {load,exec} image operations as {probe,exec}. This makes
the probe mechanism cleaner, eliminates some forward declarations,
avoids holding magic state in image->priv, eliminates the possibility
of screwing up between the "load" and "exec" stages, and makes the
documentation simpler since the concept of "loading" (as distinct from
"executing") no longer needs to be explained.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The online documentation (e.g. http://ipxe.org/cmd/ifopen), though not
yet complete, is far more comprehensive than could be provided within
the iPXE binary. Save around 200 bytes (compressed) by removing the
command descriptions from the interactive help, and instead referring
users directly to the web page describing the relevant command.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use INT 13,00 as an opportunity to reopen the underlying
block device, which works well for callers such as DOS that will use
INT 13,00 in response to any disk errors. However, some callers (such
as Windows Server 2008) do not attempt to reset the disk, and so any
failures become effectively permanent.
Fix this by automatically reopening the underlying block device
whenever we might want to access it.
This makes direct installation of Windows to an iSCSI target much more
reliable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "size" bit (aka the D/B bit) should (as far as I can tell) be
irrelevant for accesses to a non-code, non-stack, expand-upwards
segment. However, VirtualBox fails on some accesses via this segment
if this bit is not set.
This change allows iPXE to boot under VirtualBox without having to
disable VT-x/AMD-V support.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Building the Linux-specific code (tap.o et al) requires external
headers that have proven to be extremely variable across systems,
causing frequent build failures.
Until this situation is rectified, remove the Linux-specific code from
the default (non-Linux) build.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some binutils versions will drag in an object to satisfy the entry
symbol; some won't. Try to cope with this exciting variety of
behaviour by ensuring that all entry symbols are unique.
Remove the explicit inclusion of the prefix object on the linker
command line, since the entry symbol now provides all the information
needed to identify the prefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 623469d ("[build] Eliminate unused sections at link-time")
introduced a regression in several build formats, in which the prefix
would end up being garbage-collected out of existence. Fix by
ensuring that an entry symbol exists in each possible prefix, and is
required by the linker script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use -ffunction-sections, -fdata-sections, and --gc-sections to
automatically prune out any unreferenced sections.
This saves around 744 bytes (uncompressed) from the rtl8139.rom build.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EFI performs its own PCI bus enumeration. Respect this, and start
controlling devices only when instructed to do so by EFI.
As a side benefit, we should now correctly create multiple SNP
instances for multi-port devices.
This should also fix the problem of failing to enumerate devices
because the PCI bridges have not yet been enabled at the time the iPXE
driver is loaded.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Merge the "bus" and "devfn" fields into a single "busdevfn" field, to
match the format used by the majority of external code.
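A combined field of this kind is conventionally laid out with the bus number in bits 15:8, the device in bits 7:3 and the function in bits 2:0. The sketch below assumes that layout is the one meant here; the helper names are hypothetical and are not the macros used in the codebase.

```c
#include <stdint.h>

/* Pack bus, device (slot) and function into one 16-bit value,
 * assuming the conventional bus[15:8] / device[7:3] / function[2:0]
 * layout.
 */
uint16_t busdevfn_pack ( unsigned int bus, unsigned int slot,
			 unsigned int func ) {
	return ( ( uint16_t ) ( ( ( bus & 0xff ) << 8 ) |
				( ( slot & 0x1f ) << 3 ) |
				( func & 0x07 ) ) );
}

/* Extract the individual fields again */
unsigned int busdevfn_bus ( uint16_t busdevfn ) {
	return ( ( busdevfn >> 8 ) & 0xff );
}

unsigned int busdevfn_slot ( uint16_t busdevfn ) {
	return ( ( busdevfn >> 3 ) & 0x1f );
}

unsigned int busdevfn_func ( uint16_t busdevfn ) {
	return ( busdevfn & 0x07 );
}
```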
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes can report multiple memory regions which may be adjacent
and of the same type. Since only the first region is used in the
mboot.c32 layer, it's possible to run out of memory when loading all of
the boot modules. One may get around this problem by having iPXE
merge these memory regions internally.
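One way to perform such a merge is sketched below. The structure and function names are hypothetical, and the sketch assumes the regions are already sorted by start address; it is not the iPXE memory-map code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory region descriptor: [start, end) plus a type */
struct mem_region {
	uint64_t start;
	uint64_t end;
	unsigned int type;
};

/* Merge adjacent regions of the same type, in place.  Assumes the
 * array is sorted by start address.  Returns the new region count.
 */
size_t merge_regions ( struct mem_region *regions, size_t count ) {
	size_t in;
	size_t out = 0;

	for ( in = 0 ; in < count ; in++ ) {
		if ( ( out > 0 ) &&
		     ( regions[ out - 1 ].type == regions[in].type ) &&
		     ( regions[ out - 1 ].end == regions[in].start ) ) {
			/* Extend the previous region to cover this one */
			regions[ out - 1 ].end = regions[in].end;
		} else {
			/* Keep as a separate region */
			regions[out++] = regions[in];
		}
	}
	return out;
}
```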
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the concept of shutdown exit flags, and replace it with a
counter used to keep track of exposed interfaces that require devices
to remain active.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
libflat no longer has anything to do with flat real mode; it handles
only the A20 gate. Update library name to match.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Flat real mode will have been set up as a side-effect of the
protected-mode call invoked during install_block() for .text16.early;
there is no need to do so explicitly.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Flat real mode works perfectly on real hardware, but seems to cause
problems for some hypervisors. Revert to using 16-bit protected mode
(and returning to real mode with 4GB limits, so as not to break PMM
BIOSes).
Allow the code specific to the .mrom format to continue to assume that
flat real mode works, since this format is specific to real hardware.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The PXE debugging messages have remained pretty much unaltered since
Etherboot 5.4, and are now difficult to read in comparison to most of
the rest of iPXE.
Bring the pxe_udp debug messages up to normal iPXE standards.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Earlier versions of the PXE specification do not have the SubVendor_ID
and SubDevice_ID fields, and some NBPs may not provide space for them.
Avoid overwriting the contents of these fields, just in case.
This is similar to the problem with the BufferLimit field in
PXENV_GET_CACHED_INFO.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Changes were made to files where the licence text within the files
themselves confirms that the files are GPL version 2 or later.
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the real-mode address ffff:0010 to access the linear address
0x100000, and so test whether or not the A20 gate is enabled without
requiring a switch into flat real mode (or some other addressing
mode).
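The address arithmetic behind this test can be sketched as follows. The function name real_to_linear() is hypothetical, and the A20 gate is modelled simply as masking address bit 20.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a real-mode segment:offset pair to the linear address that
 * actually reaches the memory bus.  With the A20 gate disabled,
 * address bit 20 is forced to zero.
 */
uint32_t real_to_linear ( uint16_t segment, uint16_t offset,
			  bool a20_enabled ) {
	uint32_t linear = ( ( ( uint32_t ) segment << 4 ) + offset );

	if ( ! a20_enabled )
		linear &= ~( ( uint32_t ) 1 << 20 );
	return linear;
}
```

With A20 enabled, ffff:0010 reaches 0x100000; with A20 disabled it wraps around to 0x000000 and so aliases 0000:0000. Comparing the contents seen through the two addresses is therefore enough to detect the state of the gate from ordinary real mode.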
This speeds up CPU mode transitions, and also avoids breaking the NBP
from IBM's Tivoli Provisioning Manager for Operating System
Deployment. This NBP makes some calls to iPXE in VM86 mode rather
than true real mode and does not correctly emulate our transition into
flat real mode.
Interestingly, Tivoli's VMM *does* allow us to switch into protected
mode (though it patches our GDT so that we execute in ring 1 rather
than ring 0). However, paging is still disabled and we have a 4GB
segment limit. Being in ring 1 does not, therefore, restrict us in
any meaningful way; this has been verified by deliberately writing
garbage over Tivoli's own GDT (at address 0x02201010) during a
nominally VM86-mode PXE API call. It's unclear precisely what
protection this VMM is supposed to be offering.
Suggested-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some network cards do not generate interrupts when operated via the
UNDI API. Allow for this by waiting for the ISR to be triggered only
if the PXE stack advertises that it supports interrupts. When the PXE
stack does not advertise interrupt support, we skip the call to
PXENV_UNDI_ISR_IN_START and just poll the device using
PXENV_UNDI_ISR_IN_PROCESS. This matches the observed behaviour of at
least one other PXE NBP (emBoot's winBoot/i), so there is a reasonable
chance of this working.
Originally-implemented-by: Muralidhar Appalla <Muralidhar.Appalla@emulex.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The disk signature is used by some OSes (notably Windows) to identify
the boot disk, so it's useful debugging information to have.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support the extensions mandated by EDD 4.0, including:
o the ability to specify a flat physical address in a disk address
packet,
o the ability to specify a sector count greater than 127 in a disk
address packet,
o support for all functions within the Fixed Disk Access and EDD
Support subsets,
o the ability to describe a device using EDD Device Path Information.
This implementation is based on draft revision 3 of the EDD 4.0
specification, with reference to the EDD 3.0 specification. It is
possible that this implementation may need to change in order to
conform to the final published EDD 4.0 specification.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The block device interface used in gPXE predates the invention of even
the old gPXE data-transfer interface, let alone the current iPXE
generic asynchronous interface mechanism. Bring this old code up to
date, with the following benefits:
o Block device commands can be cancelled by the requestor. The INT 13
layer uses this to provide a global timeout on all INT 13 calls,
with the result that an unexpected passive failure mode (such as
an iSCSI target ACKing the request but never sending a response)
will lead to a timeout that gets reported back to the INT 13 user,
rather than simply freezing the system.
o INT 13,00 (reset drive) is now able to reset the underlying block
device. INT 13 users, such as DOS, that use INT 13,00 as a method
for error recovery now have a chance of recovering.
o All block device commands are tagged, with a numerical tag that
will show up in debugging output and in packet captures; this will
allow easier interpretation of bug reports that include both
sources of information.
o The extremely ugly hacks used to generate the boot firmware tables
have been eradicated and replaced with a generic acpi_describe()
method (exploiting the ability of iPXE interfaces to pass through
methods to an underlying interface). The ACPI tables are now
built in a shared data block within .bss16, rather than each
requiring dedicated space in .data16.
o The architecture-independent concept of a SAN device has been
exposed to the iPXE core through the sanboot API, which provides
calls to hook, unhook, boot, and describe SAN devices. This
allows for much more flexible usage patterns (such as hooking an
empty SAN device and then running an OS installer via TFTP).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE has never supported SEEK_END; the usage of "whence" offers only
the options of SEEK_SET and SEEK_CUR and so is effectively a boolean
flag. Further flags will be required to support additional metadata
required by the Fibre Channel network model, so repurpose the "whence"
field as a generic "flags" field.
xfer_seek() has always been used with SEEK_SET, so remove the "whence"
field altogether from its argument list.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Declarations without the accompanying __table_entry cause misalignment
of the table entries when using gcc 4.5. Fix by adding the
appropriate __table_entry macro or (where possible) by removing
unnecessary forward declarations.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support qemu-like arguments for network setup:
--net driver_name[,setting=value]*
and global settings:
--settings setting=value[,setting=value]*
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add makefiles, ld scripts and default config for linux platform for
both i386 and x86_64.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The pcbios-specific get_memmap() is used by the b44 driver, making
all-drivers builds fail on other platforms. Move it to the I/O API
group and provide a dummy implementation on EFI.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
COM32 binaries generally expect to run with interrupts
enabled. Syslinux does so, and COM32 programs will execute cli/sti
pairs when running a critical section, to provide mutual exclusion
against BIOS interrupt handlers. Previously, under iPXE, the IDT was
not valid, so any interrupt (e.g. a timer tick) would generally cause
the machine to triple fault.
This change introduces code to:
- Create a valid IDT at the same location that syslinux uses
- Create an "interrupt jump buffer", which contains small pieces of
code that simply record the vector number and jump to a common
handler
- Thunk down to real mode and execute the BIOS's interrupt handler
whenever an interrupt is received in a COM32 program
- Switch IDTs and enable/disable interrupts when context switching to
and from COM32 binaries
Testing done:
- Booted VMware ESX using a COM32 multiboot loader (mboot.c32)
- Built with GDBSERIAL enabled, and tested breakpoints on int22 and
com32_irq
- Put the following code in a COM32 program:
asm volatile ( "sti" );
while ( 1 );
Before this change, the machine would triple fault
immediately. After this change, it hangs as expected. Under Bochs,
it is possible to see the interrupt handler run, and the current
time in the BIOS data area gets incremented.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An assembly version of memswap() is in an x86 word-length-agnostic
header file, but it used 32-bit registers to store pointers, leading
to memory errors responding to ARP queries on 64-bit systems.
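For comparison, a word-length-agnostic version can be written in plain C, where the compiler chooses pointer-sized registers. This is an illustrative sketch with a hypothetical name, not the fix that was applied to the assembly version.

```c
#include <stddef.h>

/* Swap the contents of two non-overlapping memory areas.  Pointers
 * are handled by the compiler, so the code is correct on both 32-bit
 * and 64-bit targets.
 */
void memswap_sketch ( void *first, void *second, size_t len ) {
	unsigned char *a = first;
	unsigned char *b = second;
	unsigned char tmp;

	while ( len-- ) {
		tmp = *a;
		*a++ = *b;
		*b++ = tmp;
	}
}
```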
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The existence and usage of the BEV entry point is covered by the PnP
spec, not the BBS spec; the BBS spec merely describes a policy for
selecting the boot device order. iPXE should therefore check only for
a PnP BIOS in order to decide whether or not to hook INT19.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit ea12dc0 ("[build] Avoid hard-coding the path to perl")
introduced a build failure for fully clean trees (e.g. after running
"make veryclean"), since the dependency upon $(PARSEROM) now includes
a dependency upon "perl" (which doesn't exist) rather than upon
"/usr/bin/perl" (which does exist).
There should of course be no dependency upon the perl binary at all;
the dependency should be upon "./util/parserom.pl" alone.
Fix by removing the $(PERL) from the definition of Perl-based utility
paths, and adding $(PERL) at the point of usage.
Reported-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove data-xfer as an interface type, and replace data-xfer
interfaces with generic interfaces supporting the data-xfer methods.
Filter interfaces (as used by the TLS layer) are handled using the
generic pass-through interface capability. A side-effect of this is
that deliver_raw() no longer exists as a data-xfer method. (In
practice this doesn't lose any efficiency, since there are no
instances within the current codebase where xfer_deliver_raw() is used
to pass data to an interface supporting the deliver_raw() method.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove name-resolution as an interface type, and replace
name-resolution interfaces with generic interfaces supporting the
resolv_done() method.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
strerror() has not been able to use the PXE-only error table since
commit 9aa61ad ("Add per-file error identifiers") back in 2007.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most of iPXE uses __attribute__((packed)) anyway, and PACKED conflicts
with an identically-named macro in the upstream EFI header files.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
See RFC 4578 for details.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The linker chooses to look for _start first and always picks
efidrvprefix.o to satisfy it (probably because it's earlier in the
archive) which causes a multiple definition error when the linker
later has to pick efiprefix.o for other symbols.
Fix by using EFI-specific TGT_LD_FLAGS with an explicit entry point.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This removes the need for inline safety wrappers, marginally reducing
the size penalty of weak functions, and works around an apparent
binutils bug that causes undefined weak symbols to not actually be
NULL when compiling with -fPIE (as EFI builds do).
A bug in versions of binutils prior to 2.16 (released in 2005) will
cause same-file weak definitions to not work with those
toolchains. Update the README to reflect our new dependency on
binutils >= 2.16.
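A same-file weak definition looks roughly like the sketch below, which assumes a GCC-compatible compiler. The symbol names are hypothetical and the example is not taken from the iPXE headers.

```c
/* Weak default definition: used only if no other object file
 * provides a strong definition of the same symbol.
 */
__attribute__ (( weak )) int optional_hook ( void ) {
	return 0;
}

int call_optional_hook ( void ) {
	/* No NULL check or inline wrapper is needed: the symbol always
	 * resolves, either to a strong definition or to the default.
	 */
	return optional_hook();
}
```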
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
COMBOOT API calls set the carry flag on failure. This was not being
propagated because the COMBOOT interrupt handler used iret to return
with EFLAGS restored from the stack. This patch propagates CF before
returning from the interrupt.
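Since iret reloads the flags from the stack, the handler's carry flag only reaches the caller if it is first copied into the saved flags image. The sketch below models that step in C; the function name is hypothetical and the actual handler is written in assembly.

```c
#include <stdint.h>

#define FLAG_CF 0x0001	/* Carry flag is bit 0 of (E)FLAGS */

/* Return the flags image that should be left on the stack for iret:
 * the caller's saved flags, with CF replaced by the handler's CF.
 */
uint16_t propagate_cf ( uint16_t saved_flags, uint16_t handler_flags ) {
	return ( ( uint16_t ) ( ( saved_flags & ~FLAG_CF ) |
				( handler_flags & FLAG_CF ) ) );
}
```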
Reported-by: Geoff Lywood <glywood@vmware.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Microsoft WDS can end up calling PXENV_RESTART_TFTP to execute a
second-stage NBP which then exits. Specifically, wdsnbp.com uses
PXENV_RESTART_TFTP to execute pxeboot.com, which will exit if the user
does not press F12. iPXE currently treats PXENV_RESTART_TFTP as a
normal PXE API call, and so attempts to return to wdsnbp.com, which
has just been vaporised by pxeboot.com.
Use rmsetjmp/rmlongjmp to preserve the stack state as of the initial
NBP execution, and to restore this state immediately prior to
executing the NBP loaded via PXENV_RESTART_TFTP. This matches the
behaviour in the PXE spec (which says that "if TFTP is restarted,
control is never returned to the caller"), and allows pxeboot.com to
exit relatively cleanly back to iPXE.
As with all usage of setjmp/longjmp, there may be subtle corner case
bugs due to not gracefully unwinding any state accumulated by the time
of the longjmp call, but this seems to be the only viable way to
provide the specified behaviour.
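The control flow can be illustrated with the standard C setjmp/longjmp, standing in here for rmsetjmp/rmlongjmp. All names in the sketch are hypothetical, and no real NBP is loaded.

```c
#include <setjmp.h>

jmp_buf nbp_entry_state;	/* State as of initial NBP execution */
int nbp_stage;			/* Which stage ran most recently */

/* Stand-in for PXENV_RESTART_TFTP: control never returns to caller */
void restart_tftp ( void ) {
	longjmp ( nbp_entry_state, 1 );
}

/* Run the first-stage NBP, then the second stage if TFTP restarts */
int run_nbp ( void ) {
	if ( setjmp ( nbp_entry_state ) == 0 ) {
		/* First stage: runs, then restarts TFTP */
		nbp_stage = 1;
		restart_tftp();
		/* Not reached: the first stage has been replaced */
		return -1;
	}
	/* Second stage: starts from the saved state and may exit
	 * cleanly, without returning into the first stage.
	 */
	nbp_stage = 2;
	return 0;
}
```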
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an infrastructure allowing the prefix to provide an open_payload()
method for obtaining out-of-band access to the whole iPXE image. Add
a mechanism within this infrastructure that allows raw access to the
expansion ROM BAR by temporarily borrowing an address from a suitable
memory BAR on the same PCI card.
For cards that have a memory BAR that is at least as large as their
expansion ROM BAR, this allows large iPXE ROMs to be supported even on
systems where PMM fails, or where option ROM space pressure makes it
impossible to use PMM shrinking. The BIOS sees only a stub ROM of
approximately 3kB in size; the remainder (which can be well over 64kB)
is loaded only at the time iPXE is invoked.
As a nice side-effect, an iPXE .mrom image will continue to work even
if its PMM-allocated areas are overwritten between initialisation and
invocation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The only remaining useful function of makerom.pl is to correct the ROM
and PnP checksums; the PCI IDs are set at link time, and padding is
performed using padimg.pl.
Option::ROM already provides a facility for correcting the checksums,
so we may as well just use this instead.
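The checksum rule itself is simple: all bytes of the image must sum to zero modulo 256. A minimal sketch of the correction follows, with hypothetical function names and with the position of the checksum byte passed in by the caller; it is not a description of how Option::ROM is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of the image, modulo 256 */
uint8_t rom_sum ( const uint8_t *rom, size_t len ) {
	uint8_t sum = 0;
	size_t i;

	for ( i = 0 ; i < len ; i++ )
		sum = ( ( uint8_t ) ( sum + rom[i] ) );
	return sum;
}

/* Adjust the byte at checksum_offset so that the image sums to zero */
void fix_rom_checksum ( uint8_t *rom, size_t len,
			size_t checksum_offset ) {
	uint8_t sum = rom_sum ( rom, len );

	rom[checksum_offset] = ( ( uint8_t ) ( rom[checksum_offset] - sum ) );
}
```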
Signed-off-by: Michael Brown <mcb30@ipxe.org>
It is common for system memory maps to be grotesquely unreliable
during POST. Many sanity checks have been added to the memory map
reading code, but these do not catch all problems.
Skip relocation entirely if called during POST. This should avoid the
problems typically encountered, at the cost of slightly disrupting the
memory map of an operating system booted via iPXE when iPXE was
entered during POST. Since this is a very rare special case (used,
for example, when reflashing an experimental ROM that would otherwise
prevent the system from completing POST), this is an acceptable cost.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some BIOSes (at least some AMI BIOSes) tend to refuse to allocate a
single area large enough to hold both the iPXE image source and the
temporary decompression area, despite promising a largest available
PMM memory block of several megabytes. This causes ROM image
shrinking to fail on these BIOSes, with undesirable consequences:
other option ROMs may be disabled due to shortage of option ROM space,
and the iPXE ROM may itself be corrupted by a further BIOS bug (again,
observed on an AMI BIOS) which causes large ROMs to end up overlapping
reserved areas of memory. This can potentially render a system
unbootable via any means.
Increase the chances of a successful PMM allocation by dropping the
alignment requirement (which is redundant now that we can enable A20
from within the prefix); this allows us to reduce the allocation size
from 2MB down to only the required size.
Increase the chances still further by using two separate allocations:
one to hold the image source (i.e. the copy of the ROM before being
shrunk) and the other to act as the decompression area. This allows
ROM image shrinking to take place even on systems that fail to
allocate enough memory for the temporary decompression area.
Improve the behaviour of iPXE in systems with multiple iPXE ROMs by
sharing PMM allocations where possible. Image source areas can be
shared with any iPXE ROMs with a matching build identifier, and the
temporary decompression area can be shared with any iPXE ROMs with the
same uncompressed size (rounded up to the nearest 128kB).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use INT 15,88 to find a suitable temporary decompression area, rather
than a fixed address. This hopefully gives us a better chance of not
treading on any PMM-allocated areas, in BIOSes where PMM support
exists but tends not to give us the large blocks that we ask for.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Always call INT 15,88 even if we don't use the result. This allows
DEBUG=memmap to show the complete result set returned by all of the
INT 15 memory-map calls.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Randomly generate a 32-bit build identifier that can be used to
identify identical iPXE ROMs when multiple such ROMs are present in a
system (e.g. when a multi-function NIC exposes the same iPXE ROM image
via each function's expansion ROM BAR).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The existing "iPXE starting execution" message indicates that the BEV
(or INT19) was invoked, but gives no indication of whether or not the
iPXE source was successfully retrieved (e.g. from PMM). Split the
"starting execution" message into "starting execution...ok"; the "ok"
indicates that the main iPXE body was successfully decompressed and
relocated.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Now that we can use odd megabytes, there is no particular need to use
an even megabyte as the fallback temporary load point.
Note that the old warnings about avoiding 2MB pre-date our ability to
cooperate with other PXE ROMs by using PMM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE is now capable of operating in odd megabytes of memory, so remove
the obsolete code enforcing an even-megabyte constraint.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use the shared code in libflat to perform the A20 transitions
automatically on each transition from real to protected mode. This
allows us to remove all explicit calls to gateA20_set().
The old warnings about avoiding automatically enabling A20 are
essentially redundant; they date back to the time when we would always
start hammering the keyboard controller without first checking to see
if gate A20 was already enabled (which it almost always is).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE currently insists on residing in an even megabyte. This imposes
undesirably severe constraints upon our PMM allocation strategy, and
limits our options for mechanisms to access ROMs greater than 64kB in
size.
Add A20 handling code to libflat so that prefixes are able to access
memory even in odd megabytes.
The algorithms and tuning parameters in the new A20 handling code are
based upon a mixture of the existing iPXE A20 code and the A20 code
from the 2.6.32 Linux kernel.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The flatten_real_mode routine is not needed until after decompressing
.text16.early, and currently performs various contortions to
compensate for the fact that .prefix may not be writable. Move
flatten_real_mode to .text16.early to save on (compressed) binary size
and simplify the code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a section .text16.early which is always kept inline with the
prefix. This will allow for some code sharing between the .prefix and
.text16 sections.
Note that the simple solution of just prepending the .prefix section
to the .text16 section will not work, because a bug in Wyse Streaming
Manager server (WLDRM13.BIN) requires us to place a dummy PXENV+ entry
point at the start of .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use flat real mode rather than 16-bit protected mode for access to
high memory during installation. This simplifies the code by reducing
the number of CPU modes we need to think about, and also increases the
amount of code in common between the normal and (somewhat
hypothetical) KEEP_IT_REAL methods of operation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When returning to real mode, set 4GB segment limits instead of 64kB
limits. This change improves our chances of successfully returning to
a PMM-capable BIOS after entering iPXE during POST; the BIOS will
have set up flat real mode before calling our initialisation point,
and may be disconcerted if we then return in genuine real mode.
This change is unlikely to break anything, since any code that might
potentially access beyond 64kB must use addr32 prefixes to do so; if
this is the case then it is almost certainly code written to expect
flat real mode anyway.
Note that it is not possible to restore the real-mode segment limits
to their original values, since it is not possible to know which
protected-mode segment descriptor was originally used to initialise
the limit portion of the segment register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .hrom prefix provides an experimental mechanism for reducing
option ROM space usage on systems where PMM allocation fails, by
pretending that PMM allocation succeeded and gave us an address fixed
at compilation time. This is unreliable, and potentially dangerous.
In particular, when multiple gPXE ROMs are present in a system, each
gPXE ROM will assume ownership of the same fixed address, resulting in
undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .xrom prefix provides an experimental mechanism for loading ROM
images greater than 64kB in size by mapping the expansion ROM BAR in
at a hopefully-unused address. This is unreliable, and potentially
dangerous. In particular, there is no guarantee that any PCI bridges
between the CPU and the device will respond to accesses for the
"unused" memory region that is chosen, and it is possible that the
process of scanning for the "unused" memory region may end up issuing
reads to other PCI devices. If this ends up trampling on a register
with read side-effects belonging to an unrelated PCI device, this may
cause undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming the project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
gPXE currently overwrites the filename stored in the cached DHCP
packets when a call to PXENV_TFTP_READ_FILE or PXENV_RESTART_TFTP is
made. This code has existed for many years as a workaround for RIS,
which seemed to require that this be done.
pxe_set_cached_filename() causes problems with the Bootix NBP, and a
recent test demonstrates that RIS will complete successfully even with
pxe_set_cached_filename() removed. There have been many changes to
the DHCP and PXE logic since this code was first added, and it is
quite plausible that it was masking a bug that no longer exists.
Reported-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Debugged-by: Shao Miller <Shao.Miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Current gPXE code always returns "OURS" in response to
PXENV_UNDI_ISR:START. This is harmless for non-shared interrupt
lines, and avoids the complexity of trying to determine whether or not
we really did cause the interrupt. (This is a non-trivial
determination; some drivers don't have interrupt support and hook the
system timer interrupt instead, for example.)
A problem occurs when we have a shared interrupt line, the other
device asserts an interrupt, and the controlling ISR does not chain to
the other device's ISR when we return "OURS". Under these
circumstances, the other device's ISR never executes, and so the
interrupt remains asserted, causing an interrupt storm.
Work around this by returning "OURS" if and only if our net device's
interrupt is currently recorded as being enabled. Since we always
disable interrupts as a result of a call to PXENV_UNDI_ISR:START, this
guarantees that we will eventually (on the second call) return "NOT
OURS", allowing the other ISR to be called. Under normal operation,
including a non-shared interrupt situation, this change will make no
difference since PXENV_UNDI_ISR:START would be called only when
interrupts were enabled anyway.
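The decision described above can be modelled as follows. This is a
minimal sketch only: the structure, the enum and the function name are
hypothetical stand-ins, not the actual pxe_undi.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative result codes; not the real PXE constants */
enum isr_result { ISR_NOT_OURS, ISR_OURS };

/* Hypothetical stand-in for the net device's recorded interrupt state */
struct irq_state {
	bool enabled;
};

/* Model of PXENV_UNDI_ISR:START: claim the interrupt if and only if
 * our interrupt was recorded as enabled, then disable it.
 */
static enum isr_result undi_isr_start ( struct irq_state *irq ) {
	enum isr_result result;

	assert ( irq != NULL );
	result = ( irq->enabled ? ISR_OURS : ISR_NOT_OURS );
	irq->enabled = false;
	return result;
}
```

Calling undi_isr_start() twice in a row on an enabled interrupt gives
"OURS" and then "NOT OURS", which is what lets the other device's ISR
run on a shared line.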
Signed-off-by: Michael Brown <mcb30@etherboot.org>
In the actual SYSLINUX suite's comboot implementation, the version
string is prefixed by CR LF, and the copyright string has a leading
space. Some tools (specifically HDT) assume these padding characters
exist, so we should probably return strings in a similar format.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Loading multiple UNDI instances would be useful in systems that have
several network cards with vendor PXE ROMs. However, we cannot rely on
UNDI ROMs working correctly with multiple instances loaded
simultaneously.
The gPXE UNDI driver supports the following multi-NIC configurations:
1. Chainloading undionly.kpxe on a specific NIC.
2. Loading the UNDI driver for the first probed device and ignoring all
other UNDI devices in the system.
This patch refuses to probe additional UNDI devices so there can never
be multiple instances of UNDI loaded.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .elf, .elfd, .lmelf, and .lmelfd prefixes were brought over from
legacy Etherboot and they do not build in gPXE. This patch removes the
ELF prefixes.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The unfinished .exe prefix was brought over from legacy Etherboot.
There has been no demand for .exe images so this patch removes the
prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The DOS .com prefix was brought over from legacy Etherboot but does not
build. There has been no demand for .com images so this patch removes
the prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .lkrn prefix allows gPXE to be loaded as a Linux bzImage. The
.bImage prefix was carried over from legacy Etherboot and does not
build. This patch removes the .bImage prefix; use .lkrn instead.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
It might be the case that we wish to chain to an NBP without
being "in the way". We now implement a hook in our exit path
for gPXE *.*pxe build targets. The hook is a pointer to a
SEG16:OFF16 which we try to jump to during exit. By default,
this pointer results in the usual exit path.
We also implement the "pxenv_file_exit_hook" PXE API routine
to allow the user to specify an alternate SEG16:OFF16 to jump
to during exit.
Unfortunately, this additional PXE extension has a cost
in code size. Fortunately, a look at the size difference
for a gPXE .rom build target shows zero size difference
after compression.
The routine is documented in doc/pxe_extensions as follows:
FILE EXIT HOOK
Op-Code: PXENV_FILE_EXIT_HOOK (00e7h)
Input: Far pointer to a t_PXENV_FILE_EXIT_HOOK parameter
structure that has been initialized by the caller.
Output: PXENV_EXIT_SUCCESS or PXENV_EXIT_FAILURE must be
returned in AX. The Status field in the parameter
structure must be set to one of the values represented
by the PXENV_STATUS_xxx constants.
Description: Modify the exit path to jump to the specified code.
Only valid for pxeprefix-based builds.
typedef struct s_PXENV_FILE_EXIT_HOOK {
PXENV_STATUS_t Status;
SEGOFF16_t Hook;
} t_PXENV_FILE_EXIT_HOOK;
Set before calling API service:
Hook: The SEG16:OFF16 of the code to jump to.
Returned from API service:
Status: See PXENV_STATUS_xxx constants.
Requested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The standard option ROM format provides a header indicating the size
of the entire ROM, which the BIOS will reserve space for, load, and
call as necessary. However, this space is strictly limited to 128k for
all ROMs. gPXE ameliorates this somewhat by reserving space for itself
in high memory and relocating the majority of its code there, but on
systems prior to PCI3 enough space must still be present to load the
ROM in the first place. Even on PCI3 systems, the BIOS often limits the
size of ROM it will load to a bit over 64kB.
These space problems can be solved by providing an artificially small
size in the ROM header: just enough to let the prefix code (at the
beginning of the ROM image) be loaded by the BIOS. To the BIOS, the
gPXE ROM will appear to be only a few kilobytes; it can then load
the rest of itself by accessing the ROM directly using the PCI
interface reserved for that task.
There are a few problems with this approach. First, gPXE needs to find
an unmapped region in memory to map the ROM so it can read from it;
this is done using the crude but effective approach of scanning high
memory (over 0xF0000000) for a sufficiently large region of all-ones
(0xFF) reads. (In x86 architecture, all-ones is returned for accesses
to memory regions that no mapped device can satisfy.) This is not
provably valid in all situations, but has worked well in practice.
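The scan can be pictured with a small sketch that searches a buffer
for an aligned run of 0xFF bytes. This operates on an ordinary array
rather than on physical memory, and the function name and parameters
are assumptions made for illustration; the real prefix code is
assembly and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Find the first offset, a multiple of "align", at which at least
 * "want" consecutive bytes all read as 0xFF.  Returns -1 if no such
 * region exists.
 */
static long find_all_ones ( const uint8_t *mem, size_t len,
			    size_t want, size_t align ) {
	size_t start;
	size_t i;

	assert ( mem != NULL );
	assert ( want != 0 );
	assert ( align != 0 );

	for ( start = 0 ; ( start + want ) <= len ; start += align ) {
		for ( i = 0 ; i < want ; i++ ) {
			if ( mem[ start + i ] != 0xff )
				break;
		}
		if ( i == want )
			return ( ( long ) start );
	}
	return -1;
}
```

Note that a device which genuinely returns 0xFF for its registers
would fool this test, which is one reason the approach cannot be
proven valid.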
More importantly, this type of ROM access can only work if the PCI ROM
BAR exists at all. NICs on physical add-in PCI cards generally must
have the BAR in order for the BIOS to be able to load their ROM, but
ISA cards and LAN-on-Motherboard cards will both fail to load gPXE
using this scheme.
Due to these uncertainties, it is recommended that .xrom only be used
when a regular .rom image is infeasible due to crowded option ROM
space. However, when it works, it could allow loading gPXE images as
large as the largest flash chip one could find: 128kB or even more.
Signed-off-by: Marty Connor <mdc@etherboot.org>
For extremely tight space requirements and specific applications, it is
sometimes desirable to create gPXE images that cannot provide the PXE API
functionality to client programs. Add a configuration header option,
PXE_STACK, that can be removed to omit the PXE stack. Also add PXE_MENU
to control the PXE boot menu, which most uses of gPXE do not need.
Signed-off-by: Marty Connor <mdc@etherboot.org>
If we don't unload the PXE stack before executing gPXE, automatically
take advantage of the cached DHCPACK that the underlying/parent PXE
stack can provide. If that cached DHCPACK contains option 175.178, or
the user sets the use-cached setting before invoking DHCP, the real
DHCP request will be skipped and the cached DHCPACK will be used for
network configuration. Otherwise, the cached settings block is thrown
away as soon as a fresh one is acquired.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Calling the parent PXE stack (the stack that loaded us, for
undionly.kkpxe) can be useful for more than UNDI calls; for instance,
it lets us get cached DHCP packets to avoid re-DHCP when working with
embedded images.
Signed-off-by: Marty Connor <mdc@etherboot.org>
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open, waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.
This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.
Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.
I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.
Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
When the "keep-san" option is used, the function is exited without
unregistering the stack-allocated int13h drive. To prevent a dangling
pointer to the stack, these structs should be heap-allocated.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
gPXE currently takes advantage of the feature of PCI3.0 that allows
option ROMs to relocate the bulk of their code to high memory and so
take up only a small amount of space in the option ROM area. Currently,
the relocation can only take place if the BIOS's implementation of PMM
can be made to return blocks aligned to an even megabyte, because of
the A20 gate. AMI BIOSes, in particular, will not return allocations
that gPXE can use.
Ameliorate the situation somewhat by adding a prefix, .hrom, that works
identically to .rom except in the case that PMM allocation fails. Where
.rom would give up and place itself entirely in option ROM space, .hrom
moves to a block (assumed free) at HIGHMEM_LOADPOINT = 4MB. This allows
for the use of larger gPXE ROMs than would otherwise be possible.
Because there is no way to check that the area at HIGHMEM_LOADPOINT is
really free, other devices using that memory during the boot process
will cause failure for gPXE, the other device, or both. In practice
such conflicts will likely not occur, but this prefix should still be
considered EXPERIMENTAL.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The disk partition prefix code in hdprefix.S reads the gPXE image in
tracks, not individual sectors. This means it will attempt to read
beyond the end of the image if the .hd image type is not padded to 32
KB.
This issue affects virtualization software which may execute a .hd or
.usb image file directly - effectively running a machine with a tiny
disk containing just the gPXE image. Boot will fail when gPXE tries to
read beyond the end of disk.
The Multiboot memory map needs to be built after unhiding gPXE and
downloaded images from memory. Solaris faults during boot when trying
to access the ramdisk, which is hidden from the memory map while gPXE is
executing. This issue is fixed by using the memory map from after gPXE
unhides itself.
Reported-by: Moinak Ghosh <moinakg@belenix.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
The get_underlying_e820 function should return with CF unset on success.
Reported-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
REQUIRE_SYMBOL() formerly used a formulation of symbol requirement
that would allow a link to succeed despite lacking a required symbol,
because it did not introduce any relocations. Fix by renaming it to
REQUEST_SYMBOL() (since the soft-requirement behavior can be useful)
and add a REQUIRE_SYMBOL() that truly requires.
Add EXPORT_SYMBOL() and IMPORT_SYMBOL() for REQUEST_SYMBOL()-like
behavior that allows one to make use of the symbol, by combining a
weak external on the symbol itself with a REQUEST_SYMBOL() of a second
symbol.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some BIOSes (observed with an AMI BIOS on a SunFire X2200) seem to
reset the BIOS drive counter at 40:75 after a failed boot attempt.
This causes problems when attempting a Windows direct-to-iSCSI
installation: bootmgr.exe calls INT 13,0800 and gets told that there
are no hard disks, so never bothers to read the MBR in order to obtain
the boot disk signature. The Windows iSCSI initiator will detect the
iBFT and connect to the target, and everything will appear to work
except for the error message "This computer's hardware may not support
booting to this disk. Ensure that the disk's controller is enabled in
the computer's BIOS menu."
Fix by checking the BIOS drive counter on every INT 13 call, and
updating it whenever necessary.
The case of an unsupported SAN protocol will currently not result in
any error message. Fix by printing the error message at the top level
using strerror(), rather than using hard-coded error messages in the
error paths.
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".
The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime. This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.
Expose the hardware and link-layer addresses as separate properties
within a net device. Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
The option ROM header contains a one-byte field indicating the number
of 512-byte sectors in the ROM image. Currently it is linked to
contain the number of uncompressed sectors, with an instruction to the
compressor to correct it. This causes link failure when the
uncompressed size of the ROM image is over 128k.
Fix by replacing the SUBx compressor fixup with an ADDx fixup that
adds the total compressed output length, scaled as requested, to an
addend stored in the field where the final length value will be
placed. This is similar to the behavior of ELF relocations, and
ensures that an overflow error will not be generated unless the
compressed size is still too large for the field.
This also allows us to do away with the _filesz_pgh and _filesz_sect
calculations exported by the linker script.
Output tested bitwise identical to the old SUBx mechanism on hd, dsk,
lkrn, and rom prefixes, on both 32-bit and 64-bit processors.
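A minimal sketch of the ADDx idea for a one-byte field follows. The
function name is hypothetical, and two details are assumptions rather
than facts about the compressor: that the scaled length is rounded up,
and that the addend is treated as unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply an ADDB-style fixup: add the compressed length, divided by
 * "divisor", to the addend already stored in the one-byte field.
 * Returns 0 on success, or -1 if the result does not fit the field.
 */
static int apply_addb ( uint8_t *field, size_t compressed_len,
			size_t divisor ) {
	size_t value;

	assert ( field != NULL );
	assert ( divisor != 0 );

	/* Scale the length (rounding up is an assumption) */
	value = ( *field + ( ( compressed_len + divisor - 1 ) / divisor ) );

	/* Overflow depends only on the compressed size */
	if ( value > UINT8_MAX )
		return -1;

	*field = ( ( uint8_t ) value );
	return 0;
}
```

With a zero addend and a divisor of 512, a 64kB compressed image
yields a sector count of 128, regardless of how large the image was
before compression.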
Modified-by: Michael Brown <mcb30@etherboot.org>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The SRP Boot Firmware Table serves a similar role to the iSCSI and AoE
Boot Firmware Tables; it provides information required by the loaded
OS in order to establish a connection back to the SRP boot device.
SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory. The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
Some BIOSes support the BIOS Boot Specification (BBS) but fail to set
%es:%di correctly when calling the option ROM initialisation entry
point. This causes gPXE to identify the BIOS as non-PnP (and so
non-BBS), leaving the user unable to control the boot order.
Fix by scanning for the $PnP signature ourselves, rather than relying
on the BIOS having passed in %es:%di correctly.
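Such a scan might look roughly like the sketch below, which searches
a copy of the BIOS area held in an ordinary buffer. The 16-byte
alignment, the length byte at offset 5 and the zero-sum checksum
follow the usual description of the PnP BIOS installation check
structure; whether the real prefix code performs exactly these checks
is an assumption, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan a buffer for a "$PnP" structure on a 16-byte boundary whose
 * bytes sum to zero.  Returns its offset, or -1 if none is found.
 */
static long find_pnp ( const uint8_t *bios, size_t len ) {
	size_t offset;
	size_t i;

	assert ( bios != NULL );

	for ( offset = 0 ; ( offset + 16 ) <= len ; offset += 16 ) {
		const uint8_t *candidate = &bios[offset];
		uint8_t struct_len;
		uint8_t sum = 0;

		if ( memcmp ( candidate, "$PnP", 4 ) != 0 )
			continue;

		/* Length byte follows the signature and version */
		struct_len = candidate[5];
		if ( ( struct_len < 6 ) || ( ( offset + struct_len ) > len ) )
			continue;

		/* All bytes of the structure must sum to zero */
		for ( i = 0 ; i < struct_len ; i++ )
			sum += candidate[i];
		if ( sum == 0 )
			return ( ( long ) offset );
	}
	return -1;
}
```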
Tested-by: Helmut Adrigan <helmut.adrigan@chello.at>
pxe_api.h is just a description of API functions; it is actively
undesirable to have more implementations than necessary. Allowing it
under the MIT license lets the Syslinux libraries use it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
We add a syslinux floppy disk type using parts of the genliso script.
This floppy image can be dd'ed to a physical floppy or used in
instances where a virtual floppy with a mountable DOS filesystem is
useful.
We also modify the genliso script to only generate .liso images
rather than creating images depending on how it is called.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
IPoIB has a link-layer broadcast address that varies according to the
partition key. We currently go through several contortions to pretend
that the link-layer address is a fixed constant; by making the
broadcast address a property of the network device rather than the
link-layer protocol it will be possible to simplify IPoIB's broadcast
handling.
These commands can be used to activate or deactivate the PXE API (on a
specifiable network interface).
This is currently of limited use, since most image formats will call
shutdown() before booting the image, meaning that the underlying net
device gets shut down during remove_devices() anyway.
pxe_init_structures() fills in the fields of the !PXE and PXENV+
structures that aren't known until gPXE starts up. Once gPXE is
started, these values will never change.
Make pxe_init_structures() an initialisation function so that PXE
users don't have to worry about calling it.
It is possible that the UNDI ISR may be triggered before netdev_tx()
returns control to pxenv_undi_transmit(). This means that
pxenv_undi_isr() may see a zero undi_tx_count, and so not check for TX
completions. This is not a significant problem, since it will check
for TX completions on the next call to pxenv_undi_isr() anyway; it
just means that the NBP will see a spurious IRQ that was apparently
caused by nothing.
Fix by updating the undi_tx_count before calling netdev_tx(), so that
pxenv_undi_isr() can decrement it and report the TX completion.
Symantec Ghost requires working multicast support. gPXE configures
all (sufficiently supported) network adapters into "receive all
multicasts" mode, which means that PXENV_UNDI_SET_MCAST_ADDRESS is
actually a no-op, but the current implementation returns
PXENV_STATUS_UNSUPPORTED instead.
Fix by making PXENV_UNDI_SET_MCAST_ADDRESS return success. For good
measure, also implement PXENV_UNDI_GET_MCAST_ADDRESS, since the
relevant functionality is now exposed by the net device core.
Note that this will silently fail if the gPXE driver for the NIC being
used fails to configure the NIC in "receive all multicasts" mode.
The PXE debugging messages have remained pretty much unaltered since
Etherboot 5.4, and are now difficult to read in comparison to most of
the rest of gPXE.
Bring the pxe_undi debug messages up to normal gPXE standards.
The Symantec UNDI DOS driver fails when run on top of gPXE because we
return our interface type as "gPXE" rather than one of the predefined
NDIS interface type strings.
Fix by returning the standard "DIX+802.3" string; this isn't
necessarily always accurate, but it's highly unlikely that anything
trying to use the UNDI API would understand our IPoIB link-layer
pseudo-header anyway.
The Intel DOS UNDI driver fails when run on top of gPXE because we do
not fill in the ServiceFlags field in PXENV_UNDI_GET_IFACE_INFO.
Fix by filling in the ServiceFlags field with reasonable values
indicating our approximate feature capabilities.
The 3Com DOS UNDI driver fails when run on top of gPXE for two
reasons: firstly because PXENV_UNDI_SET_PACKET_FILTER is unsupported,
and secondly because gPXE enters the NBP without enabling interrupts
on the NIC, and the 3Com driver never calls PXENV_UNDI_OPEN.
Fix by always returning success from PXENV_UNDI_SET_PACKET_FILTER
(which is no worse than the current situation, since we already ignore
the receive packet filter in PXENV_UNDI_OPEN), and by forcibly
enabling interrupts on the NIC within PXENV_UNDI_TRANSMIT. The latter
is something of a hack, but avoids the need to implement a complete
base-code ISR that we would otherwise need if we were to enter the NBP
with interrupts enabled.
In order to construct outgoing link-layer frames or parse incoming
ones properly, some protocols (such as 802.11) need more state than is
available in the existing variables passed to the link-layer protocol
handlers. To remedy this, add struct net_device *netdev as the first
argument to each of these functions, so that more information can be
fetched from the link layer-private part of the network device.
Updated all three call sites (netdevice.c, efi_snp.c, pxe_undi.c) and
both implementations (ethernet.c, ipoib.c) of ll_protocol to use the
new argument.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Etherboot 5.4 erroneously treats PXENV_UNLOAD_STACK as the "final
shutdown" call, and unhooks INT15. When using gPXE's undionly.kpxe,
this results in gPXE overwriting the portion of Etherboot located in
high memory, because it is no longer hidden from the system memory map
at the time that gPXE loads.
Work around this by explicitly testing for Etherboot as the underlying
PXE stack (as is already done in undinet.c) and skipping the call to
PXENV_UNLOAD_STACK if necessary.
Solaris kernels are multiboot images with the "raw" flag set,
indicating that the loader should use the raw address fields within
the multiboot header rather than looking for an ELF header. However,
the Solaris kernel contains garbage data in the raw address fields,
and requires us to use the ELF header instead.
Work around this by always using the ELF header if present. This
renders the "raw" flag somewhat redundant.
The build mechanism currently allows for multiple objects per source
file. The only remaining user of this is unnrv2b.S. Replace this
usage with a separate unnrv2b16.S wrapper file, as is currently used
for e.g. pxeprefix.S and kpxeprefix.S.
Some utilities that expect a floppy disk image (e.g. iLO?) may test
for a file of the correct size. Reinstate the .pdsk image format in
order to provide this if needed.
QEMU will silently round down a disk or ROM image file to the nearest
512 bytes. Fix by always padding .rom, .dsk and .hd images to the
nearest 512-byte boundary.
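The padded length is just the image length rounded up to the next
multiple of 512. A minimal sketch of that arithmetic (the function
name is hypothetical; the build itself may do this in a script rather
than in C):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Round a length up to the next multiple of the sector size */
static size_t pad_to_sector ( size_t len ) {
	/* Guard against wrap-around for absurdly large lengths */
	assert ( len <= ( SIZE_MAX - ( SECTOR_SIZE - 1 ) ) );
	return ( ( len + SECTOR_SIZE - 1 ) &
		 ~( ( size_t ) SECTOR_SIZE - 1 ) );
}
```

An image that is already a multiple of 512 bytes is left unchanged,
so QEMU's rounding down never discards data.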
Originally-fixed-by: Stefan Hajnoczi <stefanha@gmail.com>
Using "lret $2" to return from an interrupt causes interrupts to be
disabled in the calling program, since the INT instruction will have
disabled interrupts. Instead, patch CF on the stack and use iret to
return.
Interestingly, the original PC BIOS had this bug in at least one
place.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
This allows gPXE to load memtest86, which is packaged as an old kernel.
Split all code that directly touches the kernel headers out into
bzimage_parse_header() and bzimage_update_header(), to reduce code
size and offset the cost of supporting older kernels.
Total cost of this feature: 11 bytes (uncompressed).
The parsing of the !PXE and PXENV+ structures share a fair bit of
code; merge the common code to save a few bytes.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Search for the PXE entry points (via the !PXE or PXENV+ structures)
through all known combinations of search methods. Furthermore, if we
find a PXENV+ structure, attempt to use it to find the !PXE structure
if at all possible.
Avoid passing credentials in the iBFT that were available but not
required for login. This works around a problem in the Microsoft
iSCSI initiator, which will refuse to initiate sessions if the CHAP
password is fewer than 12 characters, even if the target ends up not
asking for CHAP authentication.
The PXE 1.x spec specifies that on NBP entry or on return from INT
1Ah AX=5650h, EDX shall point to the physical address of the PXENV+
structure. The PXE 2.x spec drops this requirement, simply stating
that EDX is clobbered. Given the principle "be conservative in what
you send, liberal in what you accept", however, we should implement
this anyway.
Certain combinations of PXE stack and BIOS result in a broken INT 18
call, which will leave the system displaying a "PRESS ANY KEY TO
REBOOT" message instead of proceeding to the next boot device. On
these systems, returning via the PXE stack is the only way to continue
to the next boot device. Returning via the PXE stack works only if we
haven't already blown away the PXE base code in pxeprefix.S.
In most circumstances, we do want to blow away the PXE base code.
Base memory is a limited resource, and it is desirable to reclaim as
much as possible. When we perform an iSCSI boot, we need to place the
iBFT above the 512kB mark, because otherwise it may not be detected by
the loaded OS; this may not be possible if the PXE base code is still
occupying that memory.
Introduce a new prefix type .kkpxe which will preserve both the PXE
base code and the UNDI driver (as compared to .kpxe, which preserves
the UNDI driver but uninstalls the PXE base code). This prefix type
can be used on systems that are known to experience the specific
problem of INT 18 being broken, or in builds (such as gpxelinux.0) for
which it is particularly important to know that returning to the BIOS
will work.
Written by H. Peter Anvin <hpa@zytor.com> and Stefan Hajnoczi
<stefanha@gmail.com>, minor structural alterations by Michael Brown
<mcb30@etherboot.org>.
COMBOOT images use INTs to issue API calls; these end up making calls
into gPXE from real mode, and so temporarily change the real-mode
stack pointer. When our COMBOOT code uses a longjmp() to implement
the various "exit COMBOOT image" API calls, this leaves the real-mode
stack pointer stuck with its temporary value, which causes problems if
we eventually try to exit out of gPXE back to the BIOS.
Fix by adding rmsetjmp() and rmlongjmp() calls (analogous to
sigsetjmp()/siglongjmp()); these save and restore the additional state
needed for real-mode calls to function correctly.
Multi-level menus via COMBOOT rely on the COMBOOT program being able
to exit and invoke a new COMBOOT program (the next menu). This works,
but rapidly (within about five iterations) runs out of space in gPXE's
internal stack, since each new image is executed in a new function
context.
Fix by allowing tail recursion between images; an image can now
specify a replacement image for itself, and image_exec() will perform
the necessary tail recursion.
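The idea can be sketched as a loop that replaces recursion. The
structure layout, the "replacement" out-parameter and the function
names below are hypothetical and chosen only to illustrate the
technique; they are not the real image API.

```c
#include <assert.h>
#include <stddef.h>

struct image {
	/* Run the image; it may name a successor via *replacement */
	int ( * exec ) ( struct image *image, struct image **replacement );
	/* Hypothetical: the menu that this menu hands over to */
	struct image *next_menu;
	/* Number of times this image has been executed */
	unsigned int runs;
};

/* Hypothetical menu image: exits by requesting the next menu, if any */
static int menu_exec ( struct image *image, struct image **replacement ) {
	image->runs++;
	*replacement = image->next_menu;
	return 0;
}

/* Execute an image and any chain of replacements in a single stack
 * frame, so that stack usage does not grow with the chain length.
 */
static int image_exec_chain ( struct image *image ) {
	struct image *replacement;
	int rc;

	assert ( image != NULL );
	do {
		replacement = NULL;
		rc = image->exec ( image, &replacement );
		image = replacement;
	} while ( ( rc == 0 ) && ( image != NULL ) );

	return rc;
}
```

Because each image returns before its replacement starts, a menu
chain of any depth uses the same amount of stack as a single image.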
The version of the GNU assembler shipped with Fedora 10
(2.18.50.0.9-8.fc10) complains about character literals in some of our
assembly code. Changing $'x' to $( 'x' ) seems to fix the problem.
Yes, the whitespace is required; using just $('x') does not work.
Reported by Kevin O'Connor <kevin@koconnor.net>.
There are code paths other than PMM allocation that can result in our
changing the ROM checksum. For example, we attempt to update our
product string to incorporate the PCI bus:dev.fn number. In a system
that does not support PMM, we could therefore end up with an incorrect
checksum.
Fix by attempting to update the checksum unconditionally.
As reported by Stefan, commit 13d09e6 ("[i386] Simplify linker script
and standardise linker-defined symbol names") breaks gdb, readelf and
associated utilities.
This is caused by the .stack section overwriting a block in the middle
of the .debug_info section (despite being included in the
.bss.textdata section in the output file, which apparently has the
correct attributes for a .bss section).
Fixed by adding explicit flags and type to the stack section
declaration.
If it happens that _textdata_memsz ends up being an exact multiple of
4kB, then this will cause the .textdata section (after relocation) to
start on a page boundary. This means that the hidden memory region
(which is rounded down to the nearest page boundary) will start
exactly at virtual address 0, i.e. UNULL. This means that
init_eheap() will erroneously assume that it has failed to allocate an
external heap, since it typically ends up choosing the area that
lies immediately below .textdata, which in this case will be the
region with top==UNULL.
A subsequent error is that memtop_urealloc() passes through the error
return status -ENOMEM to the caller, which (rightly) assumes that the
result represents a valid userptr_t address.
Fixed by using alternative tests for heap non-existence, and by
returning UNULL in case of an error from init_eheap().
There are many functions that take ownership of the I/O buffer they
are passed as a parameter. The caller should not retain a pointer to
the I/O buffer. Use iob_disown() to automatically nullify the
caller's pointer, e.g.:
xfer_deliver_iob ( xfer, iob_disown ( iobuf ) );
This will ensure that iobuf is set to NULL for any code after the call
to xfer_deliver_iob().
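One portable way to get this behaviour is sketched below. This is not
necessarily how iob_disown() is actually defined; the helper function,
the dummy structure and the deliver_iob() consumer are all
hypothetical and exist only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Dummy I/O buffer, for illustration only */
struct io_buffer {
	int dummy;
};

/* Return the buffer and nullify the caller's pointer to it */
static inline struct io_buffer *
iob_disown_ptr ( struct io_buffer **iobuf ) {
	struct io_buffer *owned;

	assert ( iobuf != NULL );
	owned = *iobuf;
	*iobuf = NULL;
	return owned;
}

/* Allow the call-site syntax iob_disown ( iobuf ) */
#define iob_disown( iobuf ) iob_disown_ptr ( &(iobuf) )

/* Hypothetical function that takes ownership of an I/O buffer */
static struct io_buffer *last_delivered;

static void deliver_iob ( struct io_buffer *iobuf ) {
	last_delivered = iobuf;
}
```

After deliver_iob ( iob_disown ( iobuf ) ), the callee holds the
buffer and the caller's iobuf is NULL, so any accidental later use
fails loudly instead of touching memory the caller no longer owns.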
iob_disown() is currently used only in places where it simplifies the
code, by avoiding an extra line explicitly setting the I/O buffer
pointer to NULL. It should ideally be used with each call to any
function that takes ownership of an I/O buffer. (The SSA
optimisations will ensure that use of iob_disown() gets optimised away
in cases where the caller makes no further use of the I/O buffer
pointer anyway.)
If gcc ever introduces an __attribute__((free)), indicating that use
of a function argument after a function call should generate a
warning, then we should use this to identify all applicable function
call sites, and add iob_disown() as necessary.
The DHCP client code now implements only the mechanism of the DHCP and
PXE Boot Server protocols. Boot Server Discovery can be initiated
manually using the "pxebs" command. The menuing code is separated out
into a user-level function on a par with boot_root_path(), and is
entered in preference to a normal filename boot if the DHCP vendor
class is "PXEClient" and the PXE boot menu option exists.
pxe_tftp.c assumes that the first seek on its data-transfer interface
represents the block size. Apart from being an ugly hack, this will
also screw up file size calculation for files smaller than one block.
The proper solution would be to extend the data-transfer interface to
support the reporting of stat()-like data. This is not going to
happen until the cost of adding interface methods is reduced (a fix I
have planned since June 2008).
In the meantime, abuse the xfer_window() method to return the block
size, since it is not being used for anything else and is vaguely
justifiable.
Astonishingly, having returned the incorrect TFTP blocksize via
PXENV_TFTP_OPEN for almost a year seems not to have affected any of
the test cases run during that time; this bug was found only when
someone tried running the heavily-patched version of pxegrub found in
OpenSolaris.
elf2efi converts a suitable ELF executable (containing relocation
information, and with appropriate virtual addresses) into an EFI
executable. It is less tightly coupled with the gPXE build process
and, in particular, does not require the use of a hand-crafted PE
image header in efiprefix.S.
elf2efi correctly handles .bss sections, which significantly reduces
the size of the gPXE EFI executable.
The check for unresolved symbols does not explicitly specify an output
architecture format, and so causes a warning when building an i386 EFI
binary on an x86_64 platform. This warning is harmless, and
specifying the output architecture in multiple places is cumbersome,
so just inhibit the warning.
At POST time some BIOSes return invalid e820 maps even though
they indicate that the data is valid. We add a check that the first
region returned by e820 is RAM type and declare the map to be invalid
if it is not.
This extends the sanity checks from 8b20e5d ("[pcbios] Sanity-check
the INT15,e820 and INT15,e801 memory maps").
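The additional check can be sketched roughly as follows. The structure
layout is the basic 20-byte e820 descriptor; the function name
e820_map_looks_sane() is an assumption made for this example, and the
real check lives in the assembly-level e820 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_TYPE_RAM 1	/* "usable RAM" type code used by INT 15,e820 */

/* Basic 20-byte e820 descriptor */
struct e820_entry {
	uint64_t start;
	uint64_t len;
	uint32_t type;
} __attribute__ (( packed ));

/* Return nonzero if the map passes the first-region sanity check. */
static int e820_map_looks_sane ( const struct e820_entry *map,
				 size_t count ) {
	if ( count == 0 )
		return 0;
	/* A map whose first region is not RAM is treated as invalid */
	return ( map[0].type == E820_TYPE_RAM );
}
```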
Currently the only supported platform for x86_64 is EFI.
Building an EFI64 gPXE requires a version of gcc that supports
__attribute__((ms_abi)). This currently means a development build of
gcc; the feature should be present when gcc 4.4 is released.
In the meantime, you can grab a suitable gcc tree from
git://git.etherboot.org/scm/people/mcb30/gcc/.git
EFI provides a copy of the SMBIOS table accessible via the EFI system
table, which we should use instead of manually scanning through the
F000:0000 segment.
On non-BBS systems, we have to hook INT 19 in order to be able to boot
from the gPXE ROM at all. However, doing this unconditionally will
prevent the user from booting via any other devices.
Previously, the INT 19 entry point would prompt the user to press B in
order to boot from gPXE, which makes it impossible to perform an
unattended network boot. We now prompt the user to press N to skip
booting from gPXE, which allows for unattended operation.
This should be a better match for most real-world scenarios. Most
modern systems support BBS and so are unaffected by this change. Very
old (non-BBS) systems tend not to have PXE ROMs by default anyway; if
the user has added a gPXE ROM then they probably do want to boot from
the network. Newer non-BBS systems are essentially limited to IBM
servers, which will recapture the INT 19 vector anyway and implement
their own boot-ordering selection mechanism.
Remove the assortment of miscellaneous hacks to guess the "network
boot device", and replace them each with a call to last_opened_netdev().
It still isn't guaranteed correct, but it won't be any worse than
before, and it will at least be consistent.
This brings us into line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.
Code paths that automatically allocate memory from the FBMS at 40:13
should also free it, if possible.
Freeing this memory will not be possible if either:
1. The FBMS has been modified since our allocation, or
2. We have not been able to unhook one or more BIOS interrupt vectors.
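The two conditions above can be sketched as a single guard. This is an
illustration only: struct fbms_alloc and fbms_try_free() are
hypothetical names, and the FBMS word at 40:13 is modelled as an
ordinary variable holding a size in kB:

```c
#include <assert.h>
#include <stdint.h>

/* Bookkeeping for an allocation taken from the FBMS counter at 40:13 */
struct fbms_alloc {
	uint16_t fbms_before;	/* counter value before we allocated (kB) */
	uint16_t fbms_after;	/* counter value we wrote back (kB) */
};

/*
 * Try to give the memory back.  "fbms" points at a copy of the word
 * at 40:13; "vectors_still_hooked" counts interrupt vectors that
 * could not be unhooked.  Returns nonzero if the memory was freed.
 */
static int fbms_try_free ( uint16_t *fbms, const struct fbms_alloc *alloc,
			   unsigned int vectors_still_hooked ) {
	if ( *fbms != alloc->fbms_after )
		return 0;	/* FBMS modified since our allocation */
	if ( vectors_still_hooked )
		return 0;	/* our code may still be called */
	*fbms = alloc->fbms_before;
	return 1;
}
```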
_filesz was incorrectly forced to be aligned up to MAX_ALIGN. In a
non-compressed build, this would cause a build failure unless _filesz
happened to already be aligned to MAX_ALIGN.
The only way that PMM allows us to request a block in a region with
A20=0 is to ask for a block with an alignment of 2MB. Due to the PMM
API design, the only way we can do this is to ask for a block with a
size of 2MB.
Unfortunately, some BIOSes will hit problems if we allocate a 2MB
block. In particular, it may not be possible to enter the BIOS setup
screen; the BIOS setup code attempts a PMM allocation, fails, and
hangs the machine.
We now try allocating only as much as we need via PMM. If the
allocated block has A20=1, we free the allocated block, double the
allocation size, and try again. Repeat until either we obtain a block
with A20=0 or allocation fails. (This is guaranteed to terminate by
the time we reach an allocation size of 2MB.)
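The retry loop can be sketched in C as below. The real code is
real-mode assembly calling the PMM entry point; struct pmm_ops, the
mock allocator and all function names here are assumptions for
illustration. The mock mimics an allocator whose alignment equals the
(power-of-two) allocation size:

```c
#include <assert.h>
#include <stdint.h>

#define A20_BIT 0x00100000U	/* address bit 20 */

/* Hypothetical allocator hooks standing in for the PMM entry point */
struct pmm_ops {
	uint32_t ( * alloc ) ( uint32_t size );	/* returns 0 on failure */
	void ( * release ) ( uint32_t addr );
};

/*
 * Start with the size actually needed and keep doubling until the
 * block's base address has bit 20 clear, or until allocation fails.
 * Returns the base address, or 0 on failure.
 */
static uint32_t pmm_alloc_a20_clear ( const struct pmm_ops *pmm,
				      uint32_t needed ) {
	uint32_t size;
	uint32_t addr;

	/* "size != 0" guards against overflow with a misbehaving PMM */
	for ( size = needed ; size != 0 ; size <<= 1 ) {
		addr = pmm->alloc ( size );
		if ( ! addr )
			return 0;
		if ( ! ( addr & A20_BIT ) )
			return addr;
		pmm->release ( addr );
	}
	return 0;
}

/* Mock allocator: lowest size-aligned address at or above 0x110000 */
static uint32_t mock_alloc ( uint32_t size ) {
	uint32_t base = 0x00110000U;
	return ( ( base + size - 1 ) & ~( size - 1 ) );
}

static unsigned int mock_releases;

static void mock_release ( uint32_t addr ) {
	( void ) addr;
	mock_releases++;
}
```

With the mock above, a 64kB request is retried at 128kB, 256kB and
512kB before the 1MB attempt lands on a block with bit 20 clear.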
With a 16-bit operand, lgdt/lidt will load only a 24-bit base address,
ignoring the high-order bits. This meant that we could fail to fully
restore the GDT across a call into gPXE, if the GDT happened to be
located above the 16MB mark.
Not all of our lgdt/lidt instructions require a data32 prefix (for
example, reloading the real-mode IDT can never require a 32-bit base
address), but by adding them everywhere we will hopefully not forget
the necessary ones in future.
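The truncation can be modelled with a one-line helper; the function
name is an assumption made for this example and does not exist in the
tree:

```c
#include <assert.h>
#include <stdint.h>

/* Base address actually loaded by lgdt/lidt for a given operand size */
static uint32_t lgdt_effective_base ( uint32_t base, int operand_is_32bit ) {
	/* A 16-bit operand size loads only the low 24 bits of the base */
	return ( operand_is_32bit ? base : ( base & 0x00ffffffU ) );
}
```

A GDT located at 0x01234000 would therefore be reloaded as though it
were at 0x00234000 unless the data32 prefix is present.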
Some hardware vendors have been known to remove all gPXE-related
branding from ROMs that they build. While this is not prohibited by
the GPL, it is a little impolite.
Add a facility for adding branding messages via two #defines
(PRODUCT_NAME and PRODUCT_SHORT_NAME) in config/general.h. This
should accommodate all known OEM-mandated branding requirements.
Vendors with branding requirements that cannot be satisfied by using
PRODUCT_NAME and/or PRODUCT_SHORT_NAME should contact us so that we
can extend this facility as necessary.
This function is a major kludge, but can be made slightly more
accurate by ignoring net devices that aren't open. Eventually it
needs to be removed entirely.
Settings can be constructed using a dotted-decimal notation, to allow
for access to unnamed settings. The default interpretation is as a
DHCP option number (with encapsulated options represented as
"<encapsulating option>.<encapsulated option>").
In several contexts (e.g. SMBIOS, Phantom CLP), it is useful to
interpret the dotted-decimal notation as referring to non-DHCP
options. In this case, it becomes necessary for these contexts to
ignore standard DHCP options, otherwise we end up trying to, for
example, retrieve the boot filename from SMBIOS.
Allow settings blocks to specify a "tag magic". When dotted-decimal
notation is used to construct a setting, the tag magic value of the
originating settings block will be ORed in to the tag number.
Store/fetch methods can then check for the magic number before
interpreting arbitrarily-numbered settings.
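A rough sketch of the mechanism follows. The packing of one byte per
dotted component, the function names and the magic values used in the
usage note are all assumptions made for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a dotted-decimal setting name such as "175.1" into a tag
 * number (one byte per component, an assumption of this sketch),
 * then OR in the tag magic of the originating settings block.
 * Returns 0 on a malformed name.
 */
static uint32_t parse_dotted_tag ( const char *name, uint32_t tag_magic ) {
	uint32_t tag = 0;
	char *end;

	assert ( name != NULL );
	while ( 1 ) {
		unsigned long component = strtoul ( name, &end, 10 );
		if ( ( end == name ) || ( component > 0xff ) )
			return 0;
		tag = ( ( tag << 8 ) | ( uint32_t ) component );
		if ( *end == '\0' )
			break;
		if ( *end != '.' )
			return 0;
		name = ( end + 1 );
	}
	return ( tag | tag_magic );
}

/* A store/fetch method can reject tags that lack its own magic */
static int tag_has_magic ( uint32_t tag, uint32_t tag_magic,
			   uint32_t magic_mask ) {
	return ( ( tag & magic_mask ) == tag_magic );
}
```

With a hypothetical magic of 0x5b000000 in the top byte, "175.1"
becomes 0x5b00af01, while a plain DHCP option such as 67 carries no
magic and can be ignored by the non-DHCP settings block.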
This extends the sanity checks on the runtime segment address provided
in %bx, first implemented in commit 5600955.
We now allow the ROM to be placed anywhere above a000:0000 (rather
than c000:0000, as before), since this is the region allowed by the
PCI 3 spec. If the BIOS asks us to place the runtime image such that
it would overlap with the init-time image (which is explicitly
prohibited by the PCI 3 spec), then we assume that the BIOS is faulty
and ignore the provided runtime segment address.
Testing on a SuperMicro BIOS providing overlapping segment addresses
shows that ignoring the provided runtime segment address is safe to do
in these circumstances.
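The checks described above amount to something like the following.
The real code is assembly in the ROM prefix; the function name, the
parameter list and the use of paragraph (16-byte) units throughout are
assumptions of this sketch:

```c
#include <assert.h>
#include <stdint.h>

#define OPTION_ROM_MIN_SEG 0xa000	/* lowest segment accepted */

/*
 * Decide whether to honour the runtime segment passed in %bx.
 * All values are in 16-byte paragraphs.  Returns nonzero to accept.
 */
static int runtime_segment_ok ( uint16_t runtime_seg, uint16_t init_seg,
				uint16_t init_len, uint16_t runtime_len ) {
	uint32_t init_end = ( ( uint32_t ) init_seg + init_len );
	uint32_t runtime_end = ( ( uint32_t ) runtime_seg + runtime_len );

	if ( runtime_seg < OPTION_ROM_MIN_SEG )
		return 0;
	/* Reject any overlap between the runtime and init-time images */
	if ( ( runtime_seg < init_end ) && ( init_seg < runtime_end ) )
		return 0;
	return 1;
}
```

When the check fails, the provided runtime segment address is simply
ignored, as described above.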
Someone at Dell must have a full-time job designing ways to screw up
implementations of INT 15,e820. This latest gem is courtesy of a Dell
Xanadu system, which arbitrarily decides to obliterate the contents of
%esi.
Preserve %esi, %edi and %ebp across calls to INT 15,e820, in case
someone tries a variation on this trick in future.
FreeBSD requires the object format to be specified as elf_i386_fbsd,
rather than elf_i386.
Based on a patch from Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Some PCI 3 BIOSes seem to provide a garbage value in %bx, which should
contain the runtime segment address. Perform a basic sanity check: we
reject the segment if it is below the start of option ROM space. If
the sanity check fails, we assume that the BIOS was not expecting us
to be a PCI 3 ROM, and we just leave our image in situ.
The section name seems to have significance for some versions of
binutils.
There is no way to instruct gcc that sections such as .bss16 contain
uninitialised data; it will emit them with contents explicitly set to
zero. We therefore have to rely on the linker script to force these
sections to become uninitialised-data sections. We do this by marking
them as NOLOAD; this seems to be the closest semantic equivalent in the
linker script language.
However, this gets ignored by some versions of ld (including 2.17 as
shipped with Debian Etch), which mark the resulting sections with
(CONTENTS,ALLOC,LOAD,DATA). Combined with the fact that this version of
ld seems to ignore the specified LMA for these sections, this means that
they end up overlapping other sections, and so parts of .prefix (for
example) get obliterated by .data16's bss section.
Rename the .bss sections from .section_bss to .bss.section; this seems to
cause these versions of ld to treat them as uninitialised data.
Not fully understood, but it seems that the LMA of bss sections matters
for some newer binutils builds. Force all bss sections to have an LMA
at the end of the file, so that they don't interfere with other
sections.
The symptom was that objcopy -O binary -j .zinfo would extract the
.zinfo section from bin/xxx.tmp as a blob of the correct length, but
with zero contents. This would then cause the [ZBIN] stage of the
build to fail.
Also explicitly state that .zinfo(.*) sections have @progbits, in case
some future assembler or linker variant decides to omit them.
Some versions of ld choke on the "AT ( _xxx_lma )" in efi.lds with an
error saying "nonconstant expression for load base". Since these were
only explicitly setting the LMA to the address that it would have had
anyway, they can be safely omitted.
We have EFI APIs for CPU I/O, PCI I/O, timers, console I/O, user
access and user memory allocation.
EFI executables are created using the vanilla GNU toolchain, with the
EXE header handcrafted in assembly and relocations generated by a
custom efilink utility.
The userptr_t is now the fundamental type that gets used for conversions.
For example, virt_to_phys() is implemented in terms of virt_to_user() and
user_to_phys().
Reduce the number of sections within the linker script to match the
number of practical sections within the output file.
Define _section, _msection, _esection, _section_filesz, _section_memsz,
and _section_lma for each section, replacing the mixture of symbols that
previously existed.
In particular, replace _text and _end with _textdata and _etextdata, to
make it explicit within code that uses these symbols that the .text and
.data sections are always treated as a single contiguous block.
Allow for the build CPU architecture and platform to be specified as part
of the make command goals. For example:
  make bin/rtl8139.rom      # Standard i386 PC-BIOS build
  make bin-efi/rtl8139.efi  # i386 EFI build
The generic syntax is "bin[-[arch-]platform]", with the default
architecture being "i386" (regardless of the host architecture) and the
default platform being "pcbios".
Non-path targets such as "srcs" can be specified using e.g.
  make bin-efi srcs
Note that this changeset is merely Makefile restructuring to allow the
build architecture and platform to be determined by the make command
goals, and to export these to compiled code via the ARCH and PLATFORM
defines. It doesn't actually introduce any new build platforms.
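The "bin[-[arch-]platform]" syntax and its defaults can be illustrated
with a small parser. The real logic lives in the Makefile; this C
sketch, including struct build_target and parse_bin_dir(), is purely
hypothetical and only demonstrates how the directory name splits:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct build_target {
	char arch[16];
	char platform[16];
};

/*
 * Split a build directory name of the form "bin[-[arch-]platform]"
 * into architecture and platform, applying the defaults.
 * Returns 0 on success, -1 if the name does not match.
 */
static int parse_bin_dir ( const char *dir, struct build_target *target ) {
	const char *rest;
	const char *dash;

	strcpy ( target->arch, "i386" );
	strcpy ( target->platform, "pcbios" );
	if ( strncmp ( dir, "bin", 3 ) != 0 )
		return -1;
	rest = ( dir + 3 );
	if ( *rest == '\0' )
		return 0;		/* plain "bin": all defaults */
	if ( *rest != '-' )
		return -1;
	rest++;
	if ( *rest == '\0' )
		return -1;		/* "bin-" names no platform */
	dash = strchr ( rest, '-' );
	if ( dash ) {
		/* "bin-arch-platform" */
		snprintf ( target->arch, sizeof ( target->arch ), "%.*s",
			   ( int ) ( dash - rest ), rest );
		rest = ( dash + 1 );
	}
	snprintf ( target->platform, sizeof ( target->platform ), "%s",
		   rest );
	return 0;
}
```

Under this reading, "bin" yields i386/pcbios, "bin-efi" yields
i386/efi, and a name such as "bin-x86_64-efi" (used here only to
illustrate the syntax) yields x86_64/efi.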
Although the E820 API allows for a caller to provide only a 20-byte
buffer, there exists at least one combination (HP BIOS, 32-bit WinPE)
that relies on information found only in the "extended attributes"
field, which requires a 24-byte buffer.
Allow for up to a 64-byte E820 buffer, in the hope of coping with
future idiocies like this one.
The ACPI specification defines an additional 4-byte field at offset 20
for an E820 memory map entry. This field is presumably optional,
since generally E820 gets given only a 20-byte buffer to fill.
However, the bits of this optional field are defined as:
  bit 0 : region is enabled
  bit 1 : region is non-volatile memory rather than RAM
so it seems as though callers that pass in only a 20-byte buffer may
be missing out on some rather important information.
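A caller that does pass a 24-byte buffer could interpret the extra
field along these lines. The bit meanings are taken from the
description above; the macro names, e820_is_usable_ram() and the
policy for entries without attributes are assumptions of this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_TYPE_RAM		1		/* usable RAM */
#define E820_ATTR_ENABLED	0x00000001U	/* bit 0 */
#define E820_ATTR_NONVOLATILE	0x00000002U	/* bit 1 */

/* 24-byte descriptor including the extended attributes field */
struct e820_entry {
	uint64_t start;
	uint64_t len;
	uint32_t type;
	uint32_t attrs;		/* valid only if the BIOS filled 24 bytes */
} __attribute__ (( packed ));

/*
 * Return nonzero if the region should be treated as ordinary RAM.
 * "size" is the number of bytes the BIOS reported as filled in.
 */
static int e820_is_usable_ram ( const struct e820_entry *entry,
				size_t size ) {
	if ( entry->type != E820_TYPE_RAM )
		return 0;
	if ( size < sizeof ( *entry ) )
		return 1;	/* no attributes: assume enabled RAM */
	if ( ! ( entry->attrs & E820_ATTR_ENABLED ) )
		return 0;
	if ( entry->attrs & E820_ATTR_NONVOLATILE )
		return 0;
	return 1;
}
```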
Our INT 15,e820 code was setting %es=%ss (as part of the "look ahead
in the memory map" logic), but failing to restore %es afterwards.
This is a serious bug, but wasn't affecting many platforms because
almost all callers seem to set %es=%ss anyway.
Some BIOSes require us to pass in not only the continuation value (in
%ebx) as returned by the previous call to INT 15,e820 but also the
unmodified buffer (at %es:%di) as returned by the previous call to INT
15,e820. Apparently, someone thought it would be a worthwhile
optimisation to fill in only the low dword of the "length" field and
the low byte of the "type" field, assuming that the buffer would
remain unaltered from the previous call.
This problem was being triggered by the "peek ahead" logic in
get_mangled_e820(), which would read the next entry into a temporary
buffer in order to be able to guarantee terminating the map with
%ebx=0 rather than CF=1. (Terminating with CF=1 upsets some Windows
flavours, despite being documented legal behaviour.)
Work around this problem by always fetching directly into our e820
cache; that way we can guarantee that the underlying call always sees
the previous buffer contents (and the same buffer address).