This patch extends the embedded image feature to allow multiple
embedded images instead of just one.
gPXE now always boots the first embedded image on startup instead of
doing the hardcoded DHCP boot (aka autoboot).
Based heavily upon a patch by Stefan Hajnoczi <stefanha@gmail.com>.
There are code paths other than PMM allocation that can result in our
changing the ROM checksum. For example, we attempt to update our
product string to incorporate the PCI bus:dev.fn number. In a system
that does not support PMM, we could therefore end up with an incorrect
checksum.
Fix by attempting to update the checksum unconditionally.
As reported by Stefan, commit 13d09e6 ("[i386] Simplify linker script
and standardise linker-defined symbol names") breaks gdb, readelf and
associated utilities.
This is caused by the .stack section overwriting a block in the middle
of the .debug_info section (despite being included in the
.bss.textdata section in the output file, which apparently has the
correct attributes for a .bss section).
Fixed by adding explicit flags and type to the stack section
declaration.
The documentation in xfer.h and xfer.c does not say that the metadata
parameter is optional in calls such as xfer_deliver_iob_meta() and the
deliver_iob() method. However, some code in net/ is prepared to
accept a NULL pointer, and xfer_deliver_as_iob() passes a NULL pointer
directly to the deliver_iob() method.
Fix this mess of conflicting assumptions by making everything assume
that the metadata parameter is mandatory, and fixing
xfer_deliver_as_iob() to pass in a dummy metadata structure (as is
already done in xfer_deliver_iob()).
If it happens that _textdata_memsz ends up being an exact multiple of
4kB, then this will cause the .textdata section (after relocation) to
start on a page boundary. This means that the hidden memory region
(which is rounded down to the nearest page boundary) will start
exactly at virtual address 0, i.e. UNULL. This means that
init_eheap() will erroneously assume that it has failed to allocate a
an external heap, since it typically ends up choosing the area that
lies immediately below .textdata, which in this case will be the
region with top==UNULL.
A subsequent error is that memtop_urealloc() passes through the error
return status -ENOMEM to the caller, which (rightly) assumes that the
result represents a valid userptr_t address.
Fixed by using alternative tests for heap non-existence, and by
returning UNULL in case of an error from init_eheap().
fetchf_uristring() was failing to handle error values from
fetch_setting(), resulting in its attempting to allocate extremely
large temporary buffers on the stack (and so overrunning the stack and
locking up the machine).
Problem reported by Shao Miller <Shao.Miller@yrdsb.edu.on.ca>.
This previously unsupported NIC variant was was found to work using
the current driver:
PCI_ROM(0x13f0, 0x0200, "ip100a", "IC+ IP100A"),
Signed-off-by: Marty Connor <mdc@etherboot.org>
The PXE spec dictates the rather ugly feature that we have to present
a DHCP-specified prompt string to the user, then wait to see if they
press F8 before displaying the menu.
This seems to me to be a significant retrograde step from the current
situation of displaying the menu with the timeout counting down
against the default selected boot option, but apparently the lack of
the "Press F8" prompt causes some confusion.
Various combinations of options 43.6, 43.7 and 43.8 dictate which
servers we send Boot Server Discovery requests to, and which servers
we should accept responses from. Obey these options, and remove the
explicit specification of a single Boot Server from start_pxebs() and
dependent functions.
There are many functions that take ownership of the I/O buffer they
are passed as a parameter. The caller should not retain a pointer to
the I/O buffer. Use iob_disown() to automatically nullify the
caller's pointer, e.g.:
xfer_deliver_iob ( xfer, iob_disown ( iobuf ) );
This will ensure that iobuf is set to NULL for any code after the call
to xfer_deliver_iob().
iob_disown() is currently used only in places where it simplifies the
code, by avoiding an extra line explicitly setting the I/O buffer
pointer to NULL. It should ideally be used with each call to any
function that takes ownership of an I/O buffer. (The SSA
optimisations will ensure that use of iob_disown() gets optimised away
in cases where the caller makes no further use of the I/O buffer
pointer anyway.)
If gcc ever introduces an __attribute__((free)), indicating that use
of a function argument after a function call should generate a
warning, then we should use this to identify all applicable function
call sites, and add iob_disown() as necessary.
A TFTP DATA packet with a block number of zero (representing a
negative offset within the file) could potentially cause problems.
Fixed by explicitly rejecting such packets.
Identified by Stefan Hajnoczi <stefanha@gmail.com>.
The DHCP client code now implements only the mechanism of the DHCP and
PXE Boot Server protocols. Boot Server Discovery can be initiated
manually using the "pxebs" command. The menuing code is separated out
into a user-level function on a par with boot_root_path(), and is
entered in preference to a normal filename boot if the DHCP vendor
class is "PXEClient" and the PXE boot menu option exists.
Automatically unregister any settings with the same name (and position
within the settings tree) as a newly registered settings block.
This functionality is generalised out from dhcp.c.
Some targets send a spurious CHECK CONDITION message in response to
the first SCSI command. We issue (and ignore the status of) an
arbitary harmless SCSI command (a READ CAPACITY (10)) in order to draw
out this response.
The Solaris Comstar target seems to send more than one spurious CHECK
CONDITION response. Attempt up to SCSI_MAX_DUMMY_READ_CAP dummy READ
CAPACITY (10) commands before assuming that error responses are
meaningful.
Problem reported by Kristof Van Doorsselaere <kvandoor@aserver.com>
and Shiva Shankar <802.11e@gmail.com>.
Try to qualify relative names in the DNS resolver using the DHCP Domain
Name. For example:
DHCP Domain Name: etherboot.org
(Relative) Name: www
yields:
www.etherboot.org
Only names with no dots ('.') will be modified. A name with one or more
dots is unchanged.
pxe_tftp.c assumes that the first seek on its data-transfer interface
represents the block size. Apart from being an ugly hack, this will
also screw up file size calculation for files smaller than one block.
The proper solution would be to extend the data-transfer interface to
support the reporting of stat()-like data. This is not going to
happen until the cost of adding interface methods is reduced (a fix I
have planned since June 2008).
In the meantime, abuse the xfer_window() method to return the block
size, since it is not being used for anything else and is vaguely
justifiable.
Astonishingly, having returned the incorrect TFTP blocksize via
PXENV_TFTP_OPEN for almost a year seems not to have affected any of
the test cases run during that time; this bug was found only when
someone tried running the heavily-patched version of pxegrub found in
OpenSolaris.
PXE dictates a mechanism for boot menuing, involving prompting the
user with a variable message, waiting for a predefined keypress,
displaying a boot menu, and waiting for a selection.
This breaks the currently desirable abstraction that DHCP is a process
that can happen in the background without any user interaction.
F8 is represented by the ANSI escape sequence "^[[19~", which is not
representable as a KEY_xxx constant using the current encoding scheme.
Adapt the encoding scheme to allow F8 to be represented, since PXE
requires that we may need to prompt the user to press F8.
Remove the lazy assumption that ProxyDHCP == "DHCP with option 60 set
to PXEClient", and explicitly separate the notion of ProxyDHCP from
the notion of packets containing PXE options.
It is possible to configure a DHCP server to hand out PXE options
without a ProxyDHCP server present. This requires setting option 60
to "PXEClient", which will cause gPXE to attempt ProxyDHCP.
We assume in several places that dhcp->proxydhcpack is set to the
DHCPACK packet containing option 60 set to "PXEClient". When we
transition into ProxyDHCPREQUEST, set dhcp->proxydhcpack=dhcp->dhcpack
so that this assumption holds true.
We ought to rename several references to "proxydhcp" to something more
accurate, such as "pxedhcp". Treating a single DHCP response as
potentially both DHCPOFFER and ProxyDHCPOFFER does make the code
smaller, but the variable names get confusing.
Pick out the first boot menu item from the boot menu (option 43.9) and
pass it to the boot server as the boot menu item (option 43.71).
Also improve DHCP debug messages to include more details of the
packets being transmitted.
Apparently this can cause a major speedup on some iSCSI targets, which
will otherwise wait for a timer to expire before responding. It
doesn't seem to hurt other simple TCP test cases (e.g. HTTP
downloads).
Problem and solution identified by Shiva Shankar <802.11e@gmail.com>
Some PXE configurations require us to perform a third DHCP transaction
(in addition to the real DHCP transaction and the ProxyDHCP
transaction) in order to retrieve information from a "Boot Server".
This is an experimental implementation, since the actual behaviour is
not well specified in the PXE spec.
When sending to a multicast address, it may be necessary to specify
the source address explicitly, since the multicast destination address
does not provide enough information to deduce the source address via
the miniroute table.
Allow the source address specified via the data-xfer metadata to be
passed down through the TCP/IP stack to the IPv4 layer, which can use
it as a default source address.
Move all the DHCP state transition logic into a single function
dhcp_next_state(). This will make it easier to add support for PXE
Boot Servers, since it abstracts away the difference between "mark
DHCP as complete" and "transition to boot server discovery".
The Linux PXE server (http://www.kano.org.uk/projects/pxe) does not
set the server identifier in its ProxyDHCP responses. If the server
ID is missing, do not treat this as an error.
This resolves the "vague and unsettling memory" mentioned in commit
fdb8481d ("[dhcp] Verify server identifier on ProxyDHCPACKs").
Note that we already accept ProxyDHCPOFFERs without a server
identifier; they get treated as potential BOOTP packets.
At some point, it seems that someone decided to change the GUID for
the EFI_NETWORK_INTERFACE_IDENTIFIER_PROTOCOL. Current EFI builds
ignore the older GUID, older EFI builds ignore the newer GUID, so we
have to expose both.
Include a minimal component name protocol so that the driver name
shows up as something other than "<UNKNOWN>" in the driver list, and a
device path protocol so that the network interface shows up as a
separate device in the device list, rather than being attached
directly to the PCI device.
Incidentally, the EFI component name protocol reaches new depths for
signal-to-noise ratio in program code. A typical instance within the
EFI development kit will use an additional 300 lines of code to
provide slightly less functionality than GNU gettext achieves with
three additional characters.
The UEFI specification does not mention ROM checksums, and reassigns
the field typically used as a checksum byte. The UEFI shell
"loadpcirom" utility does not verify ROM checksums, but it seems that
some UEFI BIOSes do.
Some devices take a very long time to initialise. This can make it
difficult to visually distinguish between the error cases of failing
to start executing C code and failing to initialise a device.
Add a "gPXE initialising devices..." message. The trailing ellipsis
indicates to the user that this may take some time, and the presence
of the message indicates to the developer that relocation etc. all
succeeded.
elf2efi converts a suitable ELF executable (containing relocation
information, and with appropriate virtual addresses) into an EFI
executable. It is less tightly coupled with the gPXE build process
and, in particular, does not require the use of a hand-crafted PE
image header in efiprefix.S.
elf2efi correctly handles .bss sections, which significantly reduces
the size of the gPXE EFI executable.
Conventional usage of the various struct sockaddr_xxx types involves
liberal use of casting, which tends to trigger strict-aliasing
warnings from gcc. Avoid these now and in future by marking all the
relevant types with __attribute__((may_alias)).
The check for unresolved symbols does not explicitly specify an output
architecture format, and so causes a warning when building an i386 EFI
binary on an x86_64 platform. This warning is harmless, and
specifying the output architecture in multiple places is cumbersome,
so just inhibit the warning.
At POST time some BIOSes return invalid e820 maps even though
they indicate that the data is valid. We add a check that the first
region returned by e820 is RAM type and declare the map to be invalid
if it is not.
This extends the sanity checks from 8b20e5d ("[pcbios] Sanity-check
the INT15,e820 and INT15,e801 memory maps").
Driver was storing the result of pci_bar_start() and pci_bar_size() in
an int, rather than an unsigned long.
(Bug was introduced in the vendor's tree in commit eac85cd "Port
etherfabric driver to net_device api".)
adjust_pci_device() has historically enabled bus-mastering and I/O
cycles, but has never previously needed to enable memory cycles. Some
EFI systems seem not to enable memory cycles by default, so add that
to the list of PCI command register bits that we force on.
When compiling for the Linux kernel, PCI_BASE_ADDRESS_0 == 0, and
PCI_BASE_ADDRESS_1 == 1. This is not so when compiling for gPXE. We
must use the symbolic names rather than integers to get the correct
values.
Bug identified and patch supplied by:
George Chou <george.chou@advantech.com>
Currently the only supported platform for x86_64 is EFI.
Building an EFI64 gPXE requires a version of gcc that supports
__attribute__((ms_abi)). This currently means a development build of
gcc; the feature should be present when gcc 4.4 is released.
In the meantime; you can grab a suitable gcc tree from
git://git.etherboot.org/scm/people/mcb30/gcc/.git
The patch file supplied for commit 3a799e9 ("[hermon] Add PCI ID for
ConnectX QDR card") accidentally marked drivers/infiniband/hermon.c as
being executable.
EFI provides a copy of the SMBIOS table accessible via the EFI system
table, which we should use instead of manually scanning through the
F000:0000 segment.
EFI passes in copies of SMBIOS and other system configuration tables
via the EFI system table. Allow configuration tables to be requested
using a mechanism similar to the current method for requesting EFI
protocols.
On non-BBS systems, we have to hook INT 19 in order to be able to boot
from the gPXE ROM at all. However, doing this unconditionally will
prevent the user from booting via any other devices.
Previously, the INT 19 entry point would prompt the user to press B in
order to boot from gPXE, which makes it impossible to perform an
unattended network boot. We now prompt the user to press N to skip
booting from gPXE, which allows for unattended operation.
This should be a better match for most real-world scenarios. Most
modern systems support BBS and so are unaffected by this change. Very
old (non-BBS) systems tend not to have PXE ROMs by default anyway; if
the user has added a gPXE ROM then they probably do want to boot from
the network. Newer non-BBS systems are essentially limited to IBM
servers, which will recapture the INT 19 vector anyway and implement
their own boot-ordering selection mechanism.
This driver is based on Stefan Hajnoczi's summer work, which
is in turn based on version 1.01 of the linux b44 driver.
I just assembled the pieces and fixed/added a few pieces
here and there to make it work for my hardware.
The most major limitation is that this driver won't work
on systems with >1GB RAM due to the card not having enough
address bits for that and gPXE not working around this
limitation.
Still, other than that the driver works well enough for
at least 2 users :) and the above limitation can always
be fixed when somebody wants it bad enough :)
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
Remove the assortment of miscellaneous hacks to guess the "network
boot device", and replace them each with a call to last_opened_netdev().
It still isn't guaranteed correct, but it won't be any worse than
before, and it will at least be consistent.
There are currently four places within the codebase that use a
heuristic to guess the "boot network device", with varying degrees of
success. Add a feature to the net device core to maintain a list of
open network devices, in order of opening, and provide a function
last_opened_netdev() to retrieve the most recently opened net device.
This should do a better job than the current assortment of
guess_boot_netdev() functions.
The AoE spec does not specify that the source MAC address of a
received packet actually matches the MAC address of the AoE target.
In principle an AoE server can respond to an AoE request on any
interface available to it, which may not be an address configured to
accept AoE requests.
This issue is resolved by implementing AoE device discovery. The
purpose of AoE discovery is to find out which addresses an AoE target
can use for requests. An AoE configuration command is sent when the
AoE attach is attempted. The AoE target must respond to that
configuration query from an interface that can accept requests.
Based on a patch from Ryan Thomas <ryan@coraid.com>
EFI_STATUS is defined as an INTN, which maps to UINT32 (i.e. unsigned
int) on i386 and UINT64 (i.e. unsigned long) on x86_64. This would
require a cast each time the error status is printed.
Add efi_strerror() to avoid this ickiness and simultaneously enable
prettier reporting of EFI status codes.
This brings us in to line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.
Code paths that automatically allocate memory from the FBMS at 40:13
should also free it, if possible.
Freeing this memory will not be possible if either
1. The FBMS has been modified since our allocation, or
2. We have not been able to unhook one or more BIOS interrupt vectors.
_filesz was incorrectly forced to be aligned up to MAX_ALIGN. In a
non-compressed build, this would cause a build failure unless _filesz
happened to already be aligned to MAX_ALIGN.
The return path in directed route SMPs lists the egress ports in order
from SM to node, rather than from node to SM.
To write to the correct offset within the return path, we need to
parse the hop pointer. This is held within the class-specific data
portion of the MAD header, which was previously unused by us and
defined to be a uint16_t. Define this field to be a union type; this
requires some rearrangement of ib_mad.h and corresponding changes to
ipoib.c.
The only way that PMM allows us to request a block in a region with
A20=0 is to ask for a block with an alignment of 2MB. Due to the PMM
API design, the only way we can do this is to ask for a block with a
size of 2MB.
Unfortunately, some BIOSes will hit problems if we allocate a 2MB
block. In particular, it may not be possible to enter the BIOS setup
screen; the BIOS setup code attempts a PMM allocation, fails, and
hangs the machine.
We now try allocating only as much as we need via PMM. If the
allocated block has A20=1, we free the allocated block, double the
allocation size, and try again. Repeat until either we obtain a block
with A20=0 or allocation fails. (This is guaranteed to terminate by
the time we reach an allocation size of 2MB.)
These cards very nearly support our current IB Verbs model. There is
one minor difference: multicast packets will always be delivered by
the hardware to QP0, so the driver has to redirect them to the
appropriate QP. This means that QP owners may see receive completions
for buffers that they never posted. Nothing in our current codebase
will break because of this.
This can be used with cards that require the driver to construct and
parse packet headers manually. Headers are optionally handled
out-of-line from the packet payload, since some such cards will split
received headers into a separate ring buffer.
Some Infiniband cards will not be as accommodating as the Arbel and
Hermon cards in providing enough space for us to push a fake extra
header at the start of the received packet. We must therefore make do
with squeezing enough information to identify source and destination
addresses into the two bytes of padding within a genuine IPoIB
link-layer header.
Not all Infiniband cards have embedded subnet management agents.
Split out the code that communicates with such an embedded SMA into a
separate ib_smc.c file, and have drivers call ib_smc_update()
explicitly when they suspect that the answers given by the embedded
SMA may have changed.
Receive completion handlers now get passed an address vector
containing the information extracted from the packet headers
(including the GRH, if present), and only the payload remains in the
I/O buffer.
This breaks the symmetry between transmit and receive completions, so
remove the ib_completer_t type and use an ib_completion_queue_operations
structure instead.
Rename the "destination QPN" and "destination LID" fields in struct
ib_address_vector to reflect its new dual usage.
Since the ib_completion structure now contains only an IB status code,
("syndrome") replace it with a generic gPXE integer status code.
Avoid leaking I/O buffers in ib_destroy_qp() by completing any
outstanding work queue entries with a generic error code. This
requires the completion handlers to be available to ib_destroy_qp(),
which is done by making them static configuration parameters of the CQ
(set by ib_create_cq()) rather than being provided on each call to
ib_poll_cq().
This mimics the functionality of netdev_{tx,rx}_flush(). The netdev
flush functions would previously have been catching any I/O buffers
leaked by the IPoIB data queue (though not by the IPoIB metadata
queue).
Add the simplified ne2k_isa driver. It is just a selective copy+paste
of the relevant parts from ns8390.c plus a little trivial hacking to
make it actually work.
It is true that the code is pretty ugly, but:
a) ns8390.c is worse
b) It is only 372 lines and no #ifdefs
c) It works both in qemu/bochs and in real hardware
and we all know it is easier to cleanup working code
Hope someone will find the time to rewrite this driver properly,
but until then at least for me this is an ok solution.
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
netdev_rx_err() and netdev_tx_complete_err() get passed the error
code, but currently use it only in debug messages.
Retain error numbers and frequencey counts for up to
NETDEV_MAX_UNIQUE_ERRORS (4) different errors for each of TX and RX.
This allows the "ifstat" command to report the reasons for TX/RX
errors in most cases, even in non-debug builds.
Halting the PEGs breaks platforms where there is sideband access to
the NIC (e.g. HP machines using iLO). (We have to retain the
unhalting code because on some other platforms (e.g. IBM blades with
BOFM) the pre-PXE firmware must halt the PEGs to avoid issues with the
BIOS rereading via the expansion ROM BAR.)
The retry timer needs to be running as soon as we know that we are
trying to transmit a command. If transmission fails because of a
temporary error condition, then the timer will allow us to retry the
transmission later.
This fixes a regression introduced in commit 612f4e7:
[settings] Avoid returning uninitialised data on error in fetch_xxx_setting()
in which the memset() was moved from fetch_string_setting() to
fetch_setting(), in order that it would be useful for non-string
setting types. However, this neglects to take into account the fact
that fetch_string_setting() shrinks its buffer by one byte (to allow
for the NUL) before calling fetch_setting().
Restore the memset() in fetch_string_setting(), so that the
terminating NUL is guaranteed to actually be a NUL.
With a 16-bit operand, lgdt/lidt will load only a 24-bit base address,
ignoring the high-order bits. This meant that we could fail to fully
restore the GDT across a call into gPXE, if the GDT happened to be
located above the 16MB mark.
Not all of our lgdt/lidt instructions require a data32 prefix (for
example, reloading the real-mode IDT can never require a 32-bit base
address), but by adding them everywhere we will hopefully not forget
the necessary ones in future.
This is something of an ugly hack to accommodate an OEM requirement.
The NIC has only one expansion ROM BAR, rather than one per port. To
allow individual ports to be selectively enabled/disabled for PXE boot
(as required), we must therefore leave the expansion ROM always
enabled, and place the per-port enable/disable logic within the gPXE
driver.
Some hardware vendors have been known to remove all gPXE-related
branding from ROMs that they build. While this is not prohibited by
the GPL, it is a little impolite.
Add a facility for adding branding messages via two #defines
(PRODUCT_NAME and PRODUCT_SHORT_NAME) in config/general.h. This
should accommodate all known OEM-mandated branding requirements.
Vendors with branding requirements that cannot be satisfied by using
PRODUCT_NAME and/or PRODUCT_SHORT_NAME should contact us so that we
can extended this facility as necessary.
The Phantom firmware selectively disables PCI functions based on the
board type, with the end result that we see one PCI function for each
network port. This allows us to eliminate the code for reading from
flash and, more importantly, removes knowledge of the board type magic
number from the gPXE driver.
This function is a major kludge, but can be made slightly more
accurate by ignoring net devices that aren't open. Eventually it
needs to be removed entirely.
Settings can be constructed using a dotted-decimal notation, to allow
for access to unnamed settings. The default interpretation is as a
DHCP option number (with encapsulated options represented as
"<encapsulating option>.<encapsulated option>".
In several contexts (e.g. SMBIOS, Phantom CLP), it is useful to
interpret the dotted-decimal notation as referring to non-DHCP
options. In this case, it becomes necessary for these contexts to
ignore standard DHCP options, otherwise we end up trying to, for
example, retrieve the boot filename from SMBIOS.
Allow settings blocks to specify a "tag magic". When dotted-decimal
notation is used to construct a setting, the tag magic value of the
originating settings block will be ORed in to the tag number.
Store/fetch methods can then check for the magic number before
interpreting arbitrarily-numbered settings.
This extends the sanity checks on the runtime segment address provided
in %bx, first implemented in commit 5600955.
We now allow the ROM to be placed anywhere above a000:0000 (rather
than c000:0000, as before), since this is the region allowed by the
PCI 3 spec. If the BIOS asks us to place the runtime image such that
it would overlap with the init-time image (which is explicitly
prohibited by the PCI 3 spec), then we assume that the BIOS is faulty
and ignore the provided runtime segment address.
Testing on a SuperMicro BIOS providing overlapping segment addresses
shows that ignoring the provided runtime segment address is safe to do
in these circumstances.
This interface provides access to firmware settings (e.g. MAC address)
that will apply to all drivers loaded for the duration of the current
system boot.
A hardware bug means that reads through the expansion ROM BAR can
return corrupted data if the PEGs are running. This breaks platforms
that re-read the expansion ROM after invoking gPXE code, such as IBM
blade servers.
Halt PEGs during driver shutdown, and unhalt PEGs during driver
startup if we detect that this is not the first startup since
power-on.
A DOS-style full path name such as "C:\Program Files\tftpboot\nbp.0"
satisfies the syntax requirements for a URI with a scheme of "C" and
an opaque portion of "\Program Files\tftpboot\nbp.0".
Add a check in parse_uri() to ignore schemes that are apparently only
a single character long; this avoids interpreting DOS-style paths in
this way, and shouldn't affect any practical URI scheme.
Most other Phantom drivers define a register space in terms of a 64M
virtual address space. While this doesn't map in any meaningful way
to the actual addresses used on the latest cards, it makes maintenance
easier if we do the same.
Someone at Dell must have a full-time job designing ways to screw up
implementations of INT 15,e820. This latest gem is courtesy of a Dell
Xanadu system, which arbitrarily decides to obliterate the contents of
%esi.
Preserve %esi, %edi and %ebp across calls to INT 15,e820, in case
someone tries a variation on this trick in future.
Callers (e.g. usr/autoboot.c) may not check the return values from
fetch_xxx_setting(), assuming that in error cases the returned setting
value will be "empty" (for some sensible value of "empty").
In particular, if the DHCP server did not specify a next-server
address, this would result in gPXE using uninitialised data for the
TFTP server IP address.
FreeBSD requires the object format to be specified as elf_i386_fbsd,
rather than elf_i386.
Based on a patch from Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Some PCI 3 BIOSes seem to provide a garbage value in %bx, which should
contain the runtime segment address. Perform a basic sanity check: we
reject the segment if it is below the start of option ROM space. If
the sanity check fails, we assume that the BIOS was not expecting us
to be a PCI 3 ROM, and we just leave our image in situ.
The section name seems to have significance for some versions of
binutils.
There is no way to instruct gcc that sections such as .bss16 contain
uninitialised data; it will emit them with contents explicitly set to
zero. We therefore have to rely on the linker script to force these
sections to become uninitialised-data sections. We do this by marking
them as NOLOAD; this seems to be the closest semantic equivalent in the
linker script language.
However, this gets ignored by some versions of ld (including 2.17 as
shipped with Debian Etch), which mark the resulting sections with
(CONTENTS,ALLOC,LOAD,DATA). Combined with the fact that this version of
ld seems to ignore the specified LMA for these sections, this means that
they end up overlapping other sections, and so parts of .prefix (for
example) get obliterated by .data16's bss section.
Rename the .bss sections from .section_bss to .bss.section; this seems to
cause these versions of ld to treat them as uninitialised data.
Not fully understood, but it seems that the LMA of bss sections matters
for some newer binutils builds. Force all bss sections to have an LMA
at the end of the file, so that they don't interfere with other
sections.
The symptom was that objcopy -O binary -j .zinfo would extract the
.zinfo section from bin/xxx.tmp as a blob of the correct length, but
with zero contents. This would then cause the [ZBIN] stage of the
build to fail.
Also explicitly state that .zinfo(.*) sections have @progbits, in case
some future assembler or linker variant decides to omit them.
The virtnet_transmit() logic for waiting the packet to be transmitted is
reversed: we can't wait the packet to be transmitted if we didn't kick()
the ring yet. The vring_more_used() while loop logic is reversed also,
that explains why the code works today.
The current code risks trying to free a buffer from the used ring
when none was available, that will happen most times because KVM
doesn't handle the packet immediately on kick(). Luckily it was working
because it was unlikely to have a buffer still queued for transmit when
virtnet_transmit() was called.
Also, adds a BUG_ON() to vring_get_buf(), to catch cases where we try
to free a buffer from the used ring when there was none available.
Patch for Etherboot. gPXE has the same problem on the code, but I hadn't
a chance to test gPXE using virtio-net yet.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
EFI requires us to be able to specify the source address for
individual transmitted packets, and to be able to extract the
destination address on received packets.
Take advantage of this to rationalise the push() and pull() methods so
that push() takes a (dest,source,proto) tuple and pull() returns a
(dest,source,proto) tuple.
Multicast hashing is an ugly overlap between network and link layers.
EFI requires us to provide access to this functionality, so move it
out of ipv4.c and expose it as a method of the link layer.
Some versions of ld choke on the "AT ( _xxx_lma )" in efi.lds with an
error saying "nonconstant expression for load base". Since these were
only explicitly setting the LMA to the address that it would have had
anyway, they can be safely omitted.
We have EFI APIs for CPU I/O, PCI I/O, timers, console I/O, user
access and user memory allocation.
EFI executables are created using the vanilla GNU toolchain, with the
EXE header handcrafted in assembly and relocations generated by a
custom efilink utility.
monojob_wait() was holding a reference to the completed job, meaning that
various objects would not be freed until the next job was plugged in to
the monojob interface.
The userptr_t is now the fundamental type that gets used for conversions.
For example, virt_to_phys() is implemented in terms of virt_to_user() and
user_to_phys().
-Wformat-nonliteral is not enabled by -Wall and needs to be explicitly
specified.
Modified the few files that use nonliteral format strings to work with
this new setting in place.
Inspired by a patch from Carl Karsten <carl@personnelware.com> and an
identical patch from Rorschach <r0rschach@lavabit.com>.
The intention is to include near-verbatim copies of the EFI headers
required by gPXE. This is achieved using the import.pl script in
src/include/gpxe/efi.
Note that import.pl will modify any #include lines in each imported
header to reflect its new location within the gPXE tree. It will also
tidy up the file by removing carriage return characters and trailing
whitespace.
Reduce the number of sections within the linker script to match the
number of practical sections within the output file.
Define _section, _msection, _esection, _section_filesz, _section_memsz,
and _section_lma for each section, replacing the mixture of symbols that
previously existed.
In particular, replace _text and _end with _textdata and _etextdata, to
make it explicit within code that uses these symbols that the .text and
.data sections are always treated as a single contiguous block.
Allow for the build CPU architecture and platform to be specified as part
of the make command goals. For example:
make bin/rtl8139.rom # Standard i386 PC-BIOS build
make bin-efi/rtl8139.efi # i386 EFI build
The generic syntax is "bin[-[arch-]platform]", with the default
architecture being "i386" (regardless of the host architecture) and the
default platform being "pcbios".
Non-path targets such as "srcs" can be specified using e.g.
make bin-efi srcs
Note that this changeset is merely Makefile restructuring to allow the
build architecture and platform to be determined by the make command
goals, and to export these to compiled code via the ARCH and PLATFORM
defines. It doesn't actually introduce any new build platforms.
Some devices (e.g. the Atmel AT24C11) have no concept of a device
address; they respond to every device address and use this value as
the word address. Some other devices use part of the device address
field to extend the word address field.
Generalise the i2c bit-bashing support to handle this by defining the
device address length and word address length as properties of an i2c
device. The word address is assumed to overflow into the device
address field if the address used exceeds the width of the word
address field.
Also add a bus reset mechanism. i2c chips don't usually have a reset
line, so rebooting the host will not clear any bizarre state that the
chip may be in. We reset the bus by clocking SCL until we see SDA
high, at which point we know we can generate a start condition and
have it seen by all devices. We then generate a stop condition to
leave the bus in a known state prior to use.
Finally, add some extra debugging messages to i2c_bit.c.
Some older versions of gcc don't complain about unknown compiler flags
unless you ask them to actually compile; asking them to merely
preprocess won't trigger the error.
Fix the -fno-stack-protector test by making it attempt to compile an
empty file, rather than preprocess an empty file.
The usefulness of DBGLVL_IO is limited by the fact that many cards
require large numbers of uninteresting I/O reads/writes at device
probe time, typically when driving a bit-bashing I2C/SPI bus to read
the MAC address.
This patch adds the DBG_DISABLE() and DBG_ENABLE() macros, which can
be used to temporarily disable and re-enable selected debug levels.
Note that debug levels must still be enabled in the build in order to
function at all: you can't use DBG_ENABLE(DBGLVL_IO) in an object
built with DEBUG=object:1 and expect it to do anything.
?= in a Makefile means that that variable can be overridden by the
environment. This is confusing to users, especially with a generic
name like "ARCH".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Although the E820 API allows for a caller to provide only a 20-byte
buffer, there exists at least one combination (HP BIOS, 32-bit WinPE)
that relies on information found only in the "extended attributes"
field, which requires a 24-byte buffer.
Allow for up to a 64-byte E820 buffer, in the hope of coping with
future idiocies like this one.
The ACPI specification defines an additional 4-byte field at offset 20
for an E820 memory map entry. This field is presumably optional,
since generally E820 gets given only a 20-byte buffer to fill.
However, the bits of this optional field are defined as:
bit 0 : region is enabled
bit 1 : region is non-volatile memory rather than RAM
so it seems as though callers that pass in only a 20-byte buffer may
be missing out on some rather important information.
Our INT 15,e820 code was setting %es=%ss (as part of the "look ahead
in the memory map" logic), but failing to restore %es afterwards.
This is a serious bug, but wasn't affecting many platforms because
almost all callers seem to set %es=%ss anyway.
Use individual page mappings rather than a single whole-region
mapping, to avoid the waste of memory that occurs due to the
constraint that each mapped block must be aligned on its own size.
Some BIOSes require us to pass in not only the continuation value (in
%ebx) as returned by the previous call to INT 15,e820 but also the
unmodified buffer (at %es:%di) as returned by the previous call to INT
15,e820. Apparently, someone thought it would be a worthwhile
optimisation to fill in only the low dword of the "length" field and
the low byte of the "type field", assuming that the buffer would
remain unaltered from the previous call.
This problem was being triggered by the "peek ahead" logic in
get_mangled_e820(), which would read the next entry into a temporary
buffer in order to be able to guarantee terminating the map with
%ebx=0 rather than CF=1. (Terminating with CF=1 upsets some Windows
flavours, despite being documented legal behaviour.)
Work around this problem by always fetching directly into our e820
cache; that way we can guarantee that the underlying call always sees
the previous buffer contents (and the same buffer address).
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
We seem to be having issues with various E820 memory maps. These
problems are often difficult to reproduce, requiring access to the
specific system exhibiting the problem.
Add a facility for hooking in a fake E820 map generator, using an
arbitrary map defined in a C array, solely in order to be able to test
the map-mangling code against arbitrary E820 maps.
In particular, allow BANNER_TIMEOUT=0 to inhibit the prompt banners
altogether.
Ironically, this request comes from the same OEM that originally
required the prompts to be present during POST.
Some really moronic BIOSes bring up the PXE stack via the UNDI loader
entry point during POST, and then don't bother to unload it before
overwriting the code and data segments. If this happens, we really
don't want to leave INT 15 hooked, because that will cause any loaded
OS to die horribly as soon as it attempts to fetch the system memory
map.
We use a heuristic to detect whether or not we are being loaded at the
top of free base memory. If we determine that we are being loaded at
some other arbitrary location in base memory, then we assume that it's
not safe to hook INT 15.
This allows settings to be expanded in a way that is safe to include
within a URI string, such as
kernel http://10.0.0.1/boot.php?mf=${manufacturer:uristring}
where the ${manufacturer} setting may contain characters that are not
permitted (or have reserved purposes) within a URI.
Since whitespace characters will be URI-encoded (e.g. "%20" for a
space character), this also works around the problem that spaces
within an expanded setting would cause the shell to split command-line
arguments incorrectly.
On non-BBS systems we hook INT 19, since there is no other way we can
guarantee gaining control of the flow of execution. If we end up
doing this, prompt the user before attempting boot, since forcibly
capturing INT 19 is rather antisocial.
It is possible for the BIOS to use the UNDI API to bring up the NIC
prior to system boot. If this happens, UNM_NIC_REG_CMDPEG_STATE will
contain the value 0xf00f (UNM_NIC_REG_CMDPEG_STATE_INITIALIZE_ACK),
and we should skip initialising the command PEG.
The firmware will now determine the right port mode on all cards, so
the PXE driver doesn't have to set it. (Setting the port mode
apparently breaks some newer cards.)
At least one Dell system calls the UNDI loader entry point with the
BIOS console disabled. The serial console is active only after a call
to initialise(), so move the debug message in undi_loader() so that it
can be displayed via the serial console.
If the INT 15,e820 memory map reports a region [0,0), this confuses
the "truncate to even megabytes" logic, which ends up rounding the
region 'down' to [0,fff00000).
Fix by ensuring that the region's end address is at least 1, before we
subtract 1 to obtain the "last byte in region" address.
INT 15,e801 is capable of returning a memory range that extends to
4GB, so allow for this in the debug message that shows the data
returned by INT 15,e801.
The domain etherboot.org was actually registered on 2000-01-09, not
2000-09-01. (To put it another way, it was registered on 1/9/2000 (US
date format) rather than 1/9/2000 (sensible date format); this may
illuminate the cause of the error.)
"iqn.2000-09.org.etherboot:" is still valid as per RFC3720, but may be
surprising to users, so change it to something less unexpected.
Thanks to the anonymous contributor for pointing this one out.
Apparently some BIOSes will place option ROMs on 512-byte boundaries.
While this is against specification, it doesn't actually hurt
anything, so we may as well increase our scan granularity to 512
bytes.
Contributed by Luca <lucarx76@gmail.com>
Wyse Streaming Manager server (WLDRM13.BIN) assumes that the PXENV+
entry point is at UNDI_CS:0000; apparently, somebody at Wyse has
difficulty distinguishing between the words "may" and "must"...
Add a dummy entry point at UNDI_CS:0000, which just jumps to the
correct entry point.
The multiboot specification states that, for raw images, if
load_end_addr is zero then it should be interpreted as meaning "use
the entire file", and if bss_end_addr is zero it should be interpreted
as meaning "no bss".
Must check that argument to a fclose() is not NULL -- we can get to the
'err' label when file was not opened. fclose(NULL) is known to produce
core dump on some platforms and we don't want zbin to fail so loudly.
Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Explicitly state that we are using 32-bit addressing in 16-bit code.
GNU as 2.15 (FreeBSD/amd64 7-STABLE) got confused that 32-bit registers
are used in the code that was declared as 16-bit. Add explicit modifier
'addr32' to make assembler happy.
Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
IBM's iSCSI Firmware Initiator checks the UNDIROMID pointer in the
!PXE structure that gets created by the UNDI loader. We didn't
previously fill this value in.
Commit f58cc3f introduced a temporary workaround for a bug in current
prototype silicon, but failed to apply it to all eight PCI functions
within the device.
Option::ROM was assuming that ROM images using a short jump
instruction for the init entry point would have a zero byte at offset
5; this is not necessarily true.
Include PMM allocation result in POST banner.
Include full product string in "starting execution" message.
Also mark ourselves as supporting DDIM in PnP header, for
completeness.
On a system that doesn't support BBS, we end up hooking INT19 to gain
control of the boot process. If the system is PCI3.0, we must take
care to use the runtime value for %cs, rather than the POST-time
value, otherwise we end up pointing INT19 to the temporary option ROM
POST scratch area.
Determine the network-layer packet type and fill it in for UNDI
clients. This is required by some NBPs such as emBoot's winBoot/i.
This change requires refactoring the link-layer portions of the
gPXE netdevice API, so that it becomes possible to strip the
link-layer header without passing the packet up the network stack.
Some dumb NBPs (e.g. emBoot's winBoot/i) never call PXENV_UNDI_ISR
with FuncFlag=PXENV_UNDI_ISR_START; they just sit in a tight polling
loop merrily violating the PXE spec with repeated calls to
PXENV_UNDI_ISR_IN_PROCESS. Force a extra calls to netdev_poll() to
cope with these out-of-spec clients.
Allow for an arbitrary number of splits of the system memory map via
INT 15,e820.
Features of the new map-mangling algorithm include:
Supports random access to e820 map entries.
Requires only sequential access support from the underlying e820
map, even if our caller uses random access.
Empty regions will always be stripped.
Always terminates with %ebx=0, even if the underlying map terminates
with CF=1.
Allows for an arbitrary number of hidden regions, with underlying
regions split into as many subregions as necessary.
Total size increase to achieve this is 193 bytes.
Define a list of N allowed memory regions, and split each underlying
e820 region into up to N subregions. Strip resulting empty regions
out of the map, avoiding using the "return with CF set to strip last
empty region" trick, because it seems that bootmgr.exe in Win2k8 gets
upset if the memory map is terminated with CF set.
This is an intermediate checkin that defines a single allowed memory
region covering the entire 64-bit address space, and uses the existing
map-mangling code on top of the new region-splitting code. This
sanitises the memory map to the point that Win2k8 is able to boot even
on a system that defines a final zero-length region at the 4GB mark.
I'm checking this in because it may be useful for future debugging
efforts to be able to run with the existing and known-working map
mangling code together with the map sanitisation capabilities of the
new map mangling code.
Add support for manipulating the jump instruction that forms the
option ROM initialisation entry point, so that mergerom.pl can treat
it just like other entry points.
Add support for merging the initialisation entry point (and IBM BOFM
table) to mergerom.pl; this is another slightly icky but unfortunately
necessary GPL vs. NDA workaround. When mergerom.pl replaces an entry
point in the original ROM, it now fills in the corresponding entry
point in the merged ROM with the original value; this allows (for
example) a merged initialisation entry point to do some processing and
then jump back to the original entry point.
fetch_string_setting() was subtracting one from the length of the
to-be-NUL-terminated buffer in order to obtain the length of the
unterminated buffer to be passed to fetch_setting(). This works
extremely well unless the length of the to-be-NUL-terminated buffer is
zero, at which point we end up giving fetch_setting() a buffer of
length -1UL, thereby inviting it to overwrite as much memory as it
wants...
The ProxyDHCPREQUEST is a unicast packet, so the first request will
almost always be lost due to not having the IP address in the ARP
cache. If the minimum retry time is set to one second (as per commit
ff2b6a5), then ProxyDHCP will time out and give up before managing to
successfully transmit a request.
The DHCP timers need to be reworked anyway, so this mild hack is
acceptable for now.
New min_timeout and max_timeout fields in struct retry_timer allow
users of this timer to set their own desired minimum and maximum
timeouts, without being constrained to a single global minimum and
maximum. Users of the timer can still elect to use the default global
values by leaving the min_timeout and max_timeout fields as 0.
WinPE seems to have a bug that causes it to always use the TFTP server
IP address and filename from the ProxyDHCPACK packet, even if the
ProxyDHCPACK packet doesn't exist. This causes it to end up
attempting to fetch a file such as
tftp://0.0.0.0/bootmgr.exe
If we don't have a ProxyDHCPACK to use, we pretend that it was a copy
of the DHCPACK packet. This works around the problem, and hopefully
won't surprise any NBPs.
Altiris erroneously cares about the ordering of DHCP options, and will
get confused if we don't construct them in the order it expects.
This is observed (so far) only when attempting to deploy 64-bit Win2k3.
This patch adds support for the virtio-net adapter provided by KVM.
Written by Laurent Vivier <Laurent.Vivier@bull.net> for Etherboot.
Wrapped as legacy driver for gPXE by Stefan Hajnoczi
<stefanha@gmail.com>.
When we boot from a DHCP-supplied filename, we previously relied on
the fact that the current working URI is set to tftp://[next-server]/
in order to resolve the filename into a full tftp:// URI. However,
this process will eliminate the distinction between filenames with and
without initial slashes:
cwuri="tftp://10.0.0.1/" filename="vmlinuz" => URI="tftp://10.0.0.1/vmlinuz"
cwuri="tftp://10.0.0.1/" filename="/vmlinuz" => URI="tftp://10.0.0.1/vmlinuz"
This distinction is important for some TFTP servers. We now
explicitly construct a string of the form
"tftp://[next-server]/filename"
so that a filename with an initial slash will result in a URI
containing a double-slash, e.g.
"tftp://10.0.0.1//vmlinuz"
The TFTP code always strips a single initial slash, and so ends up
presenting the correct path to the server.
URIs entered explicitly by users at the command line must include a
double slash if they want an initial slash presented to the TFTP
server:
"kernel tftp://10.0.0.1/vmlinuz" => filename="vmlinuz"
"kernel tftp://10.0.0.1//vmlinuz" => filename="/vmlinuz"
This utility is required as a workaround for legal restrictions on
including GPL and non-GPL code within the same expansion ROM image.
While this is not encouraged, we are prepared to accept that
concatenation of ROM images and updating of the ROM header data
structures can be classed as "mere aggregation" within the terms of
the GPL.
If in any doubt, assume that you cannot include GPL and non-GPL code
within the same expansion ROM image. Contact the Etherboot team for
clarification on your specific circumstances.
When an error reply (not 1xx, 2xx or 3xx) was received, ftp_reply()
invoked ftp_done() to close connections, but did not return, and the
rest of code in this function could try to send commands to the closed
control connection.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Based on a patch contributed by Sergey Vlasov <vsu@altlinux.ru> :
In my testing with "qemu -net user" the 226 response to RETR was
often received earlier than final packets of the data connection;
this caused the received file to become truncated without any error
indication. Fix this by adding an intermediate state FTP_TRANSFER
between FTP_RETR and FTP_QUIT, so that the transfer is considered to
be complete only when both the end of data connection is encountered
and the final reply to the RETR command is received.
H. Peter Anvin <hpa@zytor.com> sent word that Sergey Vlasov
<vsu@altlinux.ru> discovered gPXE lkrn images fail to load in SYSLINUX
3.70 because we have initrd_addr_max zeroed. This patch sets the same
value as the Linux kernel.
Also change the header jmp instruction to use a hardcoded opcode value
like Linux does. Just in case the assembler decides to use a three-byte
instruction instead of the desired two-byte jmp.
Allow settings to be expanded in arbitrary commands, such as
kernel http://10.0.0.1/boot.php?uuid=${uuid}
Also add the "echo" command, as being the easiest way to test this
features.
Print one dot per second while waiting in monojob.c (e.g. for DHCP,
for file downloads, etc.), to inform user that the system has not
locked up.
Patch contributed by Andrew Schran <aschran@google.com>, minor
modification by me.
This change allows the time for which shell banners are displayed to
be configured in the config.h file. The ability to access the shell
can also be effectively disabled by setting this timeout to zero.
In tg3_chip_reset(), the PCI_EXPRESS change is taken from the Linux
tg3 driver. I am not sure what exactly it does (it is not documented
in the Linux driver), but it is necessary for the NIC to work
correctly.
Use "-include" rather than "include" for the generated Makefile
fragments, in order to suppress the long list of warnings that
otherwise appears at the start of a clean build.
Contributed by Edward Waugh <ewaugh@netxen.com>
Add yet another ugly hack to iscsiboot.c, this time to allow the user to
inhibit the shutdown/removal of the iSCSI INT13 device (and the network
devices, since they are required for the iSCSI device to function).
On the plus side, the fact that shutdown() now takes flags to
differentiate between shutdown-for-exit and shutdown-for-boot means that
another ugly hack (to allow returning via the PXE stack on BIOSes that
have broken INT 18 calls) will be easier.
I feel dirty.
Conjecture: The hardware issues 64-bit DMA writes of status descriptors,
which some PCI bridges seem to split into two 32-bit writes in reverse
order (i.e. dword 1 first). This means that we sometimes observe a
partial status descriptor. Add an explicit check to ensure that the
descriptor is complete before processing it.
Also ensure that the RDS consumer counter is incremented only when we
know that we have actually consumed an RX descriptor.
Shifting all INT13 drive numbers causes problems on systems that use a
sparse drive number space (e.g. qemu BIOS, which uses 0xe0 for the CD-ROM
drive).
The strategy now is:
Each drive is assigned a "natural" drive number, being the next
available drive number in the system (based on the BIOS drive count).
Each drive is accessed using its specified drive number. If the
specified drive number is -1, the natural drive number will be used.
Accesses to the specified drive number will be delivered to the
emulated drive, masking out any preexisting drive using this number.
Accesses to the natural drive number, if different, will be remapped to
the masked-out drive.
The overall upshot is that, for examples:
System has no drives. Emulated INT13 drive gets natural number 0x80
and specified number 0x80. Accesses to drive 0x80 go to the emulated
drive, and there is no remapping.
System has one drive. Emulated INT13 drive gets natural number 0x81
and specified number 0x80. Accesses to drive 0x80 go to the emulated
drive. Accesses to drive 0x81 get remapped to the original drive 0x80.
Verifying server ID and DHCP transaction ID is insufficient to
differentiate between DHCPACK and ProxyDHCPACK when the DHCP server and
Proxy DHCP server are the same machine.
If there is more than one loaded image, refuse to automatically select
the image to execute. There are at least two possible cases, with
different "correct" answers:
1. User loads image A by mistake, then loads image B and types "boot".
User wants to execute image B.
2. User loads image A, then loads image B (which patches image A), then
types "boot". User wants to execute image A.
If a user actually wants to load multiple images, they must explicitly
specify which image is to be executed.
Clearing the LOADED flag actually prevents users from doing clever things
such as loading an image, then loading a patch image, then executing the
first image. (image_exec() checks for IMAGE_LOADED, so this sequence of
operations will fail if the LOADED flag gets cleared.)
This reverts commit 14c080020f.
Loading an image may overwrite part or all of any previously-loaded
images, so we should clear the LOADED flag for all images prior to
attempting to load a new image.
We can just treat all non-kernel images as initrds, which matches our
behaviour for multiboot kernels. This allows us to eliminate initrd as
an image type, and treat the "initrd" command as just another synonym for
"imgfetch".
dhcppkt_store() is supposed to clear the setting if passed NULL for the
setting data. In the case of fixed-location fields (e.g. client IP
address), this requires setting the content of the field to all-zeros.
__from_data16 and __from_text16 now take a pointer to a
.data16/.text16 variable, and return the real-mode offset within the
appropriate segment. This matches the use case for every occurrence
of these macros, and prevents potential future bugs such as that fixed
in commit d51d80f. (The bug arose essentially because "&pointer" is
still syntactically valid.)
__from_data16 takes the value pointed to, rather than the pointer
itself. This was silently causing gPXE to return a dud buffer pointer
when the caller did not supply a buffer for PXENV_GET_CACHED_INFO.
Perform the same test for a matching DHCP_SERVER_IDENTIFIER on
ProxyDHCPACKs as we do for DHCPACKs. Otherwise, a retransmitted
DHCPACK can end up being treated as the ProxyDHCPACK.
I have a vague and unsettling memory that this test was deliberately
omitted, but I can't remember why, and can't find anything in the VC
logs.
ns8390.c can produce four different drivers (one PCI, three ISA.) The
ISA driver requires setting a few macros; do that by setting defines
in stub files instead of using src/Config.
Currently, all the ISA drivers are broken (they were not enabled by
default), so #if 0 them out.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When the 16-bit segment registers are accessed using 32-bit instructions
the high order bytes are undefined on older CPUs. We now explicitly
zero the high order bytes when snapshotting the CPU state. This ensures
that the GDB stub reports consistent values for the segment registers.
This commit implements GDB over UDP. Using UDP is more complex than
serial and has required some restructuring.
The GDB stub is now built using one or both of GDBSERIAL and GDBUDP
config.h options.
To enter the debugger, execute the gPXE shell command:
gdbstub <transport> [<options>...]
Where <transport> is "serial" or "udp". For "udp", the name of a
configured network device is required:
gdbstub udp net0
The GDB stub listens on UDP port 43770 by default.
Commit fd0aef9 introduced a typo that caused PMM detection to start at
paragraph 0xe00 rather than 0xe000. (Detection would still work, since it
would scan until it ran out of base memory, but it would end up scanning
an unnecessarily large portion of base memory.)
Spotted by Sebastian Herbszt <herbszt@gmx.de>.
Send a null command, specifically "pulse outputs" with no outputs
selected, to the KBC after changing A20. This was apparently done by DOS,
presumably as a synchronization hack, and the authors of the UHCI spec
thought it was inherent. Therefore, there are systems out there (e.g. HP
DL360 G5) which will stop responsing to "legacy USB" unless they see the
null command, 0xFF, written to port 0x64 at the end of the A20 toggling
sequence.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Some drivers that still use the legacy-driver wrapper (tg3 in particular)
apparently do not specify their alignment constraints properly. This
hack forces any __shared data to be maximally aligned.
Note that this provides only 16-byte alignment; it is not possible to
request alignment to any greater than 16 bytes using
__attribute__((aligned)), since the relocation code will preserve only 16
byte alignment (and operation under -DKEEP_IT_REAL cannot preserve more
that 16 byte alignment).
Idea proposed by Tim Hockin <thockin@google.com>
When the BIOS doesn't support BBS, hooking INT 19 is the only way to add
ourselves as a boot device. If we have to do this, we should at least
try to chain to the original INT 19 vector if our boot fails.
Idea suggested by Andrew Schran <aschran@google.com>
2.6.22+ kernels have an extra field in the bzimage_header structure to
indicate the maximum permitted command-line length. Use this if it is
available.
A bug in read_smbios_string() was causing the starting offset of the
SMBIOS structure to be added twice, resulting in completely the wrong
strings being returned.
Bug identified by Martin Herweg <m.herweg@gmx.de>
Add the definition of SLAM_MAX_BLOCKS_PER_NACK, which is roughly
equivalent to a TCP window size; it represents the maximum number of
packets that will be requested in a single NACK.
Note that, to keep the code size down, we still limit ourselves to
requesting only a single range per NACK; if the missing-block list is
discontiguous then we may request fewer than SLAM_MAX_BLOCKS_PER_NACK
blocks.
Avoid calling cpu_nap() until after we have determined that there is
no input ready to read. This avoids delaying for one timer interrupt
(~50ms) in the case of
if ( iskey() )
char = getkey()
which happens to be present in monojob.c, which is where we spend most
of our time looping (e.g. during any download).
This should eliminate the irritating tendency of gPXE to lose
keypresses.
Discovered on a Dell system where the serial port seems to send in a
constant stream of 0xff characters; this wouldn't be a problem in
itself except that each one triggers the 50ms delay (as mentioned
above), which really kills performance.
On any fast network, or with any driver that may drop packets
(e.g. Infiniband, which has very small RX rings), the traditional
usage of the SLAM protocol will result in enormous numbers of packet
drops and a consequent large number of retransmissions.
By adapting the client behaviour, we can force the server to act more
like a multicast TFTP server, with flow control provided by a single
master client.
This behaviour should interoperate with any traditional SLAM client
(e.g. Etherboot 5.4) on the network. The SLAM protocol isn't actually
documented anywhere, so it's hard to define either behaviour as
compliant or otherwise.
A missing test for dhcp->dhcpoffer in dhcp_timer_expired() was causing
the client to transition to DHCPREQUEST after timing out on waiting
for ProxyDHCP even if no DHCPOFFERs had been received.
In a SLAM NACK packet, if we run out of space to represent the
missing-block list, then indicate all remaining blocks as missing.
This avoids the need to wait for the one-second timeout before
receiving the blocks that otherwise wouldn't have been requested due
to running out of space.
Shorter NACK packets take less time to construct and spew out less
debug output, and there's a limit to how useful it is to send a
complete missing-block list anyway; if the loss rate is high then
we're going to have to retransmit an updated missing-block list
anyway.
Also add pretty debugging output to show the list of requested blocks.
We never set up specific multicast filters; native drivers will ask
the card to receive all multicast packets. The only way to achieve
this via the UNDI API is to enable promiscuous mode.
UDP sockets can be used for multicast, at which point it becomes
plausible that we could receive packets that aren't destined for us
but that still match on a port number.
Delete ELF as a generic image type. The method for invoking an
ELF-based image (as well as any tables that must be set up to allow it
to boot) will always depend on the specific architecture. core/elf.c
now only provides the elf_load() function, to avoid duplicating
functionality between ELF-based image types.
Add arch/i386/image/elfboot.c, to handle the generic case of 32-bit
x86 ELF images. We don't currently set up any multiboot tables, ELF
notes, etc. This seems to be sufficient for loading kernels generated
using both wraplinux and coreboot's mkelfImage.
Note that while Etherboot 5.4 allowed ELF images to return, we don't.
There is no callback mechanism for the loaded image to shut down gPXE,
which means that we have to shut down before invoking the image. This
means that we lose device state, protection against being trampled on,
etc. It is not safe to continue afterwards.
Revert "Use .SECONDARY instead of .PRECIOUS for bin/%.tmp targets."
This reverts commit de29e5a39c.
.SECONDARY doesn't seem to work properly with the target patterns of
implicit rules. In particular, a "make clean ; make bin/rtl8139.dsk"
will correctly leave the bin/rtl8139.dsk.tmp file present when .PRECIOUS
is used, but not when .SECONDARY is used.
This is slightly irritating since we don't want the
"do-not-delete-if-interrupted" semantics of .PRECIOUS, but it seems to be
the best compromise.
During development it is often handy to change the config.h options from
their defaults, for example to enable debugging features.
To prevent accidental commits of debugging config.h changes, mdc
suggested having a config-local.h that is excluded from source control.
This file acts as a temporary config.h and can override any of the
defaults.
This commit is an attempt to implement the config-local.h feature.
The config.h file now has the following as its last line:
/* @TRYSOURCE config-local.h */
The @TRYSOURCE directive causes config-local.h to be included at that
point in the file. If config-local.h does not exist, no error will be
printed and parsing will continue as normal. Therefore, mkconfig.pl is
"trying" to "source" config-local.h.
The GDBSYM config.h option was an attempt at QEMU GDB debugging. I have
removed the code since it is unused and may confuse people wanting to
use the GDB stub.
Maintain state for the advertised window length, and only ever increase
it (instead of calculating it afresh on each transmit). This avoids
triggering "treason uncloaked" messages on Linux peers.
Respond to zero-length TCP keepalives (i.e. empty data packets
transmitted outside the window). Even if the peer wouldn't otherwise
expect an ACK (because its packet consumed no sequence space), force an
ACK if it was outside the window.
We don't yet generate TCP keepalives. It could be done, but it's unclear
what benefit this would have. (Linux, for example, doesn't start sending
keepalives until the connection has been idle for two hours.)
When the embedded image is a script, the unregister_image() performed by
image/script.c corrupts memory, since image/embedded.c omitted the call
to register_image().
This is the first bug fixed using Stefan Hajnoczi's gdb stub for gPXE.
Return the most appropriate of EACCES, EPERM, ENODEV, ENOTSUP, EIO or
EINVAL depending on the exact error returned by the target, rather than
just always returning EPERM.
Also, ensure that error strings exist for these errors.
The ROM prefix now prompts the user to enter the gPXE shell during POST;
this allows for configuring gPXE without needing to attempt to boot from
it. (It also slows down system boot by three seconds per gPXE ROM, but
hey.)
This is apparently a certain OEM's requirement for option ROMs.
Various specification documents disagree about the byte ordering of
UUIDs. However, SMBIOS seems to use the standard in which everything is
in network-endian order.
This doesn't affect anything sent on the wire; only what gets printed on
the screen when the "uuid" variable is displayed.
From: Daniel Mealha Cabrita <dancab@utfpr.edu.br>
I've added tg3-5721 support for gPXE, the patch (against gpxe-0.9.3) is
attached to this message.
This chipset is present in HP ML150 G2 servers (possibly other HP machines
as well).
From: Viswanath Krishnamurthy <viswa.krish@gmail.com>
The current ipv4 incorrectly checks the IP address for multicast address.
This causes valid IPv4 unicast address to be trated as multicast address
For e.g if the PXE/tftp server IP address is 192.168.4.XXX where XXX is
224 or greater, it gets treated as multicast address and a ethernet
multicast address is sent out on the wire causing timeouts
Some EMC targets will fail if we advertise that we can authenticate with
CHAP, but the target is configured to allow unauthenticated access to that
target. We advertise AuthMethod=CHAP,None; the target should (I think)
select AuthMethod=None for unprotected targets. IETD does this, but an
EMC Celerra NS83 doesn't.
Fix by offering only AuthMethod=None if the user hasn't supplied a
username and password; this means that we won't be offering CHAP
authentication unless the user is expecting to use it (in which case the
target is presumably configured appropriately).
Many thanks to Alessandro Iurlano <alessandro.iurlano@gmail.com> for
reporting and helping to diagnose this problem.
ROMs will refuse to build unless pci_vendor_id and pci_device_id are
defined. We probably ought to fix up the Makefile (and the ROM prefix) so
that they're required only for PCI ROMs, but this will do for now.
Drivers are not allowed to call printf(). Converted eprintf() to DBG(),
and removed spurious startup banner.
Fixed hardcoded inclusion of little_bswap.h
Use EIO rather than 1 as an error number.
Add ability for network devices to flag link up/down state to the
networking core.
Autobooting code will now wait for link-up before attempting DHCP.
IPoIB reflects the Infiniband link state as the network device link state
(which is not strictly correct; we also need a succesful IPoIB IPv4
broadcast group join), but is probably more informative.
Infiniband devices no longer block waiting for link-up in
register_ibdev().
Hermon driver needs to create an event queue and poll for link-up events.
Infiniband core needs to reread MAD parameters when link state changes.
IPoIB needs to cope with Infiniband link parameters being only partially
available at probe and open time.
PXE is a catch-all image format with no signature checks. If an
unsupported image file is loaded, it will be treated as a PXE image. In
most cases, the image will be too large to be loaded as a PXE image (which
has to fit in base memory), so the error returned to the user will be that
the segment could not fit within the memory region.
Add an explicit check to pxe_image.c to reject images larger than base
memory with ENOEXEC.
Add ENOEXEC to the error string table.
gPXE is not compliant with the HTTP/1.1 specification (RFC 2616),
since it lacks support for "Transfer-Encoding: chunked". gPXE is,
however, compliant with the HTTP/1.0 specification (RFC 1945), which
does not require "Transfer-Encoding: chunked" to be supported.
The only HTTP/1.1 feature that gPXE uses is the "Host:" header, but
servers universally accept that one from HTTP/1.0 clients as an
optional extension (it is obligatory for HTTP/1.1). gPXE does not,
for example, appear to support connection caching. Advertising as a
HTTP/1.0 client will typically make the server close the connection
immediately upon sending the last data, which is actually beneficial
if we aren't going to keep the connection alive anyway.
The PXE spec is (as usual) unclear on precisely when ProxyDHCPREQUESTs
should be issued. We adapt the following, slightly paranoid approach:
If an offer contains an IP address, then it is a normal DHCPOFFER.
If an offer contains an option #60 "PXEClient", then it is a
ProxyDHCPOFFER. Note that the same packet can be both a normal
DHCPOFFER and a ProxyDHCPOFFER.
After receiving the normal DHCPACK, if we have received a
ProxyDHCPOFFER, we unicast a ProxyDHCPREQUEST back to the ProxyDHCP
server on port 4011. If we time out waiting for a ProxyDHCPACK, we
treat this as a non-fatal error.
Change the PXE return code for EWOULDBLOCK from PXENV_STATUS_FAILURE
to PXENV_STATUS_TFTP_OPEN. This code is only used by the FILE_READ
PXEXT call, and is necessary to distinguish "error" from "no data" in
that call.
(The only other nonblocking call is UDP_READ, where the caller doesn't
care about the distinction, however, gPXE doesn't use EWOULDBLOCK
internally to represent this condition in that code.)
Allow for settings to be described by something other than a DHCP option
tag if desirable. Currently used only for the MAC address setting.
Separate out fake DHCP packet creation code from dhcp.c to fakedhcp.c.
Remove notion of settings from dhcppkt.c.
Rationalise dhcp.c to use settings API only for final registration of the
DHCP options, rather than using {store,fetch}_setting throughout.
Add dedicated functions create_dhcpdiscover(), create_dhcpack() and
create_proxydhcpack() for use by external code such as the PXE preboot
code.
Register ProxyDHCP options under the global scope "proxydhcp".
Unregister previously-acquired DHCP and ProxyDHCP settings when DHCP
succeeds.
Add a configuration settings block for each net device. This will
provide the parent scope for settings applicable only to that network
device (e.g. non-volatile options stored on the NIC, options obtained via
DHCP, etc.).
Expose the MAC address as a setting.
Add the notion of the settings hierarchy, complete with
register/unregister routines.
Rename set->store and get->fetch to avoid naming conflicts with get/put
as used in reference counting.
Add the concept of an abstract configuration setting, comprising a (DHCP)
tag value and an associated byte sequence.
Add the concept of a settings namespace.
Add functions for extracting string, IPv4 address, and signed and
unsigned integer values from configuration settings (analogous to
dhcp_snprintf(), dhcp_ipv4_option(), etc.).
Update functions for parsing and formatting named/typed options to work
with new settings API.
Update NVO commands and config UI to use new settings API.
Timers are sometimes required before the call to initialise(), so we
cannot rely on initialise() to set up the timers before use.
Also fix a potential integer overflow issue in generic_currticks_udelay()
Add missing comments to timer code.
Lock system if no suitable timer source is found.
Fix initialisation order so that timers are initialised before code that
needs to use them.
Allow encapsulated options to be specified as e.g. "175.3". As a
side-effect, change the separator character for the type field from "." to
":"; for example, the IP address pseudo-option is now "175.3:ipv4".
Add parse and display routines for 16-bit and 32-bit integer configuration
settings.
Add parse and display routines for hex-string configuration settings.
Assume hex-string as a configuration setting type if no type is explicitly
specified.
show_setting() and related functions now return an "actual length" in the
style of snprintf(). This is to allow consumers to allocate buffers large
enough to hold the formatted setting.
When PMM is used, the gPXE image source will no longer be in base memory.
Decompression of .text16 and .data16 can therefore no longer be done in
real mode.
Use BBS installation check to see if we need to hook INT19 even on a PnP
BIOS.
Verify that $PnP signature is paragraph-aligned; bochs/qemu BIOS provides
a dummy $PnP signature with no valid entry point, and deliberately
unaligns the signature to indicate that it is not properly valid.
Print message if INT19 is hooked.
Attempt to use PMM even if BBS check failed.
WinPE's pxeboot.n12 takes the BufferLimit returned by gPXE (indicating
the size of gPXE's internal DHCP packet buffers) and erroneously passes
it in as BufferSize (indicating the size of pxeboot.n12's DHCP packet
buffer). If these don't match, then pxeboot.n12 ends up instructing gPXE
to overwrite parts of its data segment.
Change gPXE's internal DHCP packet buffers to be exactly
sizeof(BOOTPLAYER_t) bytes to work around this problem.
ROM initialisation vector now attempts to allocate a 2MB block using
PMM. If successful, it copies the ROM image to this block, then
shrinks the ROM image to allow for more option ROMs. If unsuccessful,
it leaves the ROM as-is.
ROM BEV now attempts to return to the BIOS, resorting to INT 18 only
if the BIOS stack has been corrupted.
The generate-by-PCI-device-ID rules (bin/pci_VVVV_DDDD.rom) are generally
used for building actual ROM images to be burned, and the burning
utilities generally run under some DOS variant. Change the filename from
pci_VVVV_DDDD.rom to VVVVDDDD.rom so that it is compatible with the DOS
8.3-character filename limit.
This allows pxelinux to execute arbitrary gPXE commands. This is
remarkably unsafe (not least because some of the commands will assume
full ownership of memory and do nasty things like edit the e820 map
underneath the calling pxelinux), but it does allow access to the
"sanboot" command.
Replace a printf with a DBG in timer_rtdsc.c
Replace a printf in timer.c with assert
Return proper error codes from timer drivers
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Timer subsystem initialization code in core/timer.c
Split the BIOS and RTDSC timer drivers from i386_timer.c
Split arch/i386/firmware/pcbios/bios.c into the RTSDC
timer driver and arch/i386/core/nap.c
Split the headers properly:
include/unistd.h - delay functions to be used by the
gPXE core and drivers.
include/gpxe/timer.h - the fimer subsystem interface
to be used by the timer drivers
and currticks() to be used by
the code gPXE subsystems.
include/latch.h - removed
include/timer.h - scheduled for removal. Some driver
are using currticks, which is
only for core subsystems.
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
RFC 4390 provides for the DHCP client identifier to contain the link-layer
hardware type and MAC address when the MAC address exceeds 16 bytes.
However, the hardware type field is only 8 bits; we were assuming 16 bits.
Arbel and Hermon cards both have multiple ports. Add the
infrastructure required to register each port as a separate IB
device. Don't yet register more than one port, since registration
will currently fail unless a valid link is detected.
Use ib_*_{set,get}_{drv,owner}data wrappers to access driver- and
owner-private data on Infiniband structures.
Pull out common code for handling management datagrams from arbel.c
and hermon.c into infiniband.c.
Add port number to struct ib_device.
Add open(), close() and mad() methods to struct ib_device_operations.
As written, if the if the UNDI ISR call clobbers the upper halves of
any of the GPRs (which by convention it is permitted to do, and by
paranoia should be expected to do) then nothing in the interrupt
handler will recover the state.
Additionally, save/restore %fs and %gs out of sheer paranoia - it's a
cheap enough operation, and may prevent problems due to poorly written
UNDI stacks.
Since we don't know what the UNDI code does, it is safest to
save/restore %eflags even though the lower half of %eflags is
automatically saved by the interrupt itself.
As written, if the if the UNDI ISR call clobbers the upper halves of
any of the GPRs (which by convention it is permitted to do, and by
paranoia should be expected to do) then nothing in the interrupt
handler will recover the state.
Additionally, save/restore %fs and %gs out of sheer paranoia - it's a
cheap enough operation, and may prevent problems due to poorly written
UNDI stacks.
Allow port numbers in iSCSI redirection.
Wait for SCSI status, not just the final data-in (which may be followed
by an explicit SCSI Response PDU if the S bit is not set).
stappers and xl0 pointed out that gnu make sets some variables, so ?=
is ineffective in some cases where we use it.. Cross-compilation
requires that some variables can be overridden in the
src/$(ARCH)/Config file, so include that file _after_ utility program
variables are set.
From: Geert Stappers <stappers@stappers.nl>
To: etherboot-developers@lists.sourceforge.net
Subject: [Etherboot-developers] 3c90x polling again [patch]
Date: Thu, 29 Nov 2007 09:22:36 +0100
User-Agent: Mutt/1.5.16 (2007-06-11)
Hello,
gPXE didn't work on 3COM 905C Tornado cards for me.
It did transmit the DHCP request, but it didn't see the DHCP offer.
Adding debug print statements allready solved the problem.
Attached is a patch that has a cleaner delay then print statements.
The core of it is
- for(i=0;i<40000;i++);
+ mdelay(1);
There was no research if the change is about a longer delay
or about code NOT being optimized away. It works for me :-)
Cheers
Geert Stappers
driver's probe() routine fills in in nic->irqno. This is so that
non-interrupt-capable legacy drivers which set nic->irqno=0 will end
up reporting IRQ#0 via PXENV_UNDI_GET_INFORMATION; this in turn means
that the calling PXE NBP will (should) hook the timer interrupt, and
everything will sort of work.
The e1000_irq() routine should (per mcb30) do enable on non-zero,
disable on zero. This is not consistent in all drivers, so I'll
wait to update it when doing a global sweep.
This needs to be done manually because if the irq() routine is
implemented then we want something like "nic->irqno = pci->irqno;",
else we do "nic->irqno = 0;" nic->ioaddr may also need to be set
carefully.
Also added local variables to end of many files, for emacs indentation
to match kernel style (tab does 8 space indent).
_textdata_link_addr, _load_addr and _max_align in the linker scripts.
A bug in some versions of ld causes segfaults if the DEFINED() macro
is used in a linker script *and* the -Map option to ld is present.
We don't currently need to override any of these values; if we need to
do so in future then the solution will probably be to always specify
the values on the ld command line, and have the linker script not
define them at all.
PXENV_GET_CACHED_INFO. If we dont do this, Altiris' NBP screws up; it
relies on being able to grab pointers to each of the three packets and
then read them at will later.
There may still be an issue with memory handling, since it seems to
die ungracefully when ARP packets come in after loading a kernel.
Something to debug.
We didn't specify values for MaxRecvDataSegmentLength and
MaxBurstLength (to save space, since we were happy with the
RFC-defined default values of 8kB and 256kB respectively). However,
the OpenSolaris target (incorrectly) assumes default values of zero
for these parameters.
The upshot was that the OpenSolaris target would get stuck in an
endless loop trying to send us the first 512-byte sector, zero bytes
at a time, and would eventually run out of memory and core-dump.
Fixed by explicitly specifying the default values for these two
parameters.
memory map. (We achieve this by setting CF on the last entry if it is
zero-length; this avoids the need to look ahead to see at each entry
if the *next* entry would be both the last entry and zero-length).
This fixes the "0kB base memory" error message upon starting Windows
2003 on a SunFire X2100.
memory map. (We achieve this by setting CF on the last entry if it is
zero-length; this avoids the need to look ahead to see at each entry
if the *next* entry would be both the last entry and zero-length).
This fixes the "0kB base memory" error message upon starting Windows
2003 on a SunFire X2100.
byte, rather than the number of permissible bytes (i.e. subtract one
from the value under the previous definition to get the value under
the new definition).
This avoids integer overflow on 64-bit kernels, where
bzhdr.initrd_addr_max may be 0xffffffffffffffff; under the old
behaviour we set mem_limit equal to initrd_addr_max+1, which meant it
ended up as zero. Kernel loads would fail with ENOBUFS.
tracking down a bug that turned out to be a free_iob() used where I
needed a netdev_tx_complete(). This left the freed I/O buffer on the
net device's TX list, with bad, bad consequences later.
Also fixed the bug in question.
initiator. This table needs to be replaced by something similar to
iBFT (i.e. scanned for and identified by signature, rather than being
at a fixed address), but it works for now.
Experimentation reveals that gcc ignores -mrtd for the implicit
arithmetic functions (e.g. __udivdi3), but not for the implicit
memcpy() and memset() functions. Mark the implicit arithmetic
functions with __attribute__((cdecl)) to compensate for this.
(Note: we cannot mark with with __cdecl, because we define __cdecl to
incorporate regparm(0) as well.)