FreeBSD requires the object format to be specified as elf_i386_fbsd,
rather than elf_i386.
Based on a patch from Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Some PCI 3 BIOSes seem to provide a garbage value in %bx, which should
contain the runtime segment address. Perform a basic sanity check: we
reject the segment if it is below the start of option ROM space. If
the sanity check fails, we assume that the BIOS was not expecting us
to be a PCI 3 ROM, and we just leave our image in situ.
The section name seems to have significance for some versions of
binutils.
There is no way to instruct gcc that sections such as .bss16 contain
uninitialised data; it will emit them with contents explicitly set to
zero. We therefore have to rely on the linker script to force these
sections to become uninitialised-data sections. We do this by marking
them as NOLOAD; this seems to be the closest semantic equivalent in the
linker script language.
However, this gets ignored by some versions of ld (including 2.17 as
shipped with Debian Etch), which mark the resulting sections with
(CONTENTS,ALLOC,LOAD,DATA). Combined with the fact that this version of
ld seems to ignore the specified LMA for these sections, this means that
they end up overlapping other sections, and so parts of .prefix (for
example) get obliterated by .data16's bss section.
Rename the .bss sections from .section_bss to .bss.section; this seems to
cause these versions of ld to treat them as uninitialised data.
Not fully understood, but it seems that the LMA of bss sections matters
for some newer binutils builds. Force all bss sections to have an LMA
at the end of the file, so that they don't interfere with other
sections.
The symptom was that objcopy -O binary -j .zinfo would extract the
.zinfo section from bin/xxx.tmp as a blob of the correct length, but
with zero contents. This would then cause the [ZBIN] stage of the
build to fail.
Also explicitly state that .zinfo(.*) sections have @progbits, in case
some future assembler or linker variant decides to omit them.
The virtnet_transmit() logic for waiting the packet to be transmitted is
reversed: we can't wait the packet to be transmitted if we didn't kick()
the ring yet. The vring_more_used() while loop logic is reversed also,
that explains why the code works today.
The current code risks trying to free a buffer from the used ring
when none was available, that will happen most times because KVM
doesn't handle the packet immediately on kick(). Luckily it was working
because it was unlikely to have a buffer still queued for transmit when
virtnet_transmit() was called.
Also, adds a BUG_ON() to vring_get_buf(), to catch cases where we try
to free a buffer from the used ring when there was none available.
Patch for Etherboot. gPXE has the same problem on the code, but I hadn't
a chance to test gPXE using virtio-net yet.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
EFI requires us to be able to specify the source address for
individual transmitted packets, and to be able to extract the
destination address on received packets.
Take advantage of this to rationalise the push() and pull() methods so
that push() takes a (dest,source,proto) tuple and pull() returns a
(dest,source,proto) tuple.
Multicast hashing is an ugly overlap between network and link layers.
EFI requires us to provide access to this functionality, so move it
out of ipv4.c and expose it as a method of the link layer.
Some versions of ld choke on the "AT ( _xxx_lma )" in efi.lds with an
error saying "nonconstant expression for load base". Since these were
only explicitly setting the LMA to the address that it would have had
anyway, they can be safely omitted.
We have EFI APIs for CPU I/O, PCI I/O, timers, console I/O, user
access and user memory allocation.
EFI executables are created using the vanilla GNU toolchain, with the
EXE header handcrafted in assembly and relocations generated by a
custom efilink utility.
monojob_wait() was holding a reference to the completed job, meaning that
various objects would not be freed until the next job was plugged in to
the monojob interface.
The userptr_t is now the fundamental type that gets used for conversions.
For example, virt_to_phys() is implemented in terms of virt_to_user() and
user_to_phys().
-Wformat-nonliteral is not enabled by -Wall and needs to be explicitly
specified.
Modified the few files that use nonliteral format strings to work with
this new setting in place.
Inspired by a patch from Carl Karsten <carl@personnelware.com> and an
identical patch from Rorschach <r0rschach@lavabit.com>.
The intention is to include near-verbatim copies of the EFI headers
required by gPXE. This is achieved using the import.pl script in
src/include/gpxe/efi.
Note that import.pl will modify any #include lines in each imported
header to reflect its new location within the gPXE tree. It will also
tidy up the file by removing carriage return characters and trailing
whitespace.
Reduce the number of sections within the linker script to match the
number of practical sections within the output file.
Define _section, _msection, _esection, _section_filesz, _section_memsz,
and _section_lma for each section, replacing the mixture of symbols that
previously existed.
In particular, replace _text and _end with _textdata and _etextdata, to
make it explicit within code that uses these symbols that the .text and
.data sections are always treated as a single contiguous block.
Allow for the build CPU architecture and platform to be specified as part
of the make command goals. For example:
make bin/rtl8139.rom # Standard i386 PC-BIOS build
make bin-efi/rtl8139.efi # i386 EFI build
The generic syntax is "bin[-[arch-]platform]", with the default
architecture being "i386" (regardless of the host architecture) and the
default platform being "pcbios".
Non-path targets such as "srcs" can be specified using e.g.
make bin-efi srcs
Note that this changeset is merely Makefile restructuring to allow the
build architecture and platform to be determined by the make command
goals, and to export these to compiled code via the ARCH and PLATFORM
defines. It doesn't actually introduce any new build platforms.
Some devices (e.g. the Atmel AT24C11) have no concept of a device
address; they respond to every device address and use this value as
the word address. Some other devices use part of the device address
field to extend the word address field.
Generalise the i2c bit-bashing support to handle this by defining the
device address length and word address length as properties of an i2c
device. The word address is assumed to overflow into the device
address field if the address used exceeds the width of the word
address field.
Also add a bus reset mechanism. i2c chips don't usually have a reset
line, so rebooting the host will not clear any bizarre state that the
chip may be in. We reset the bus by clocking SCL until we see SDA
high, at which point we know we can generate a start condition and
have it seen by all devices. We then generate a stop condition to
leave the bus in a known state prior to use.
Finally, add some extra debugging messages to i2c_bit.c.
Some older versions of gcc don't complain about unknown compiler flags
unless you ask them to actually compile; asking them to merely
preprocess won't trigger the error.
Fix the -fno-stack-protector test by making it attempt to compile an
empty file, rather than preprocess an empty file.
The usefulness of DBGLVL_IO is limited by the fact that many cards
require large numbers of uninteresting I/O reads/writes at device
probe time, typically when driving a bit-bashing I2C/SPI bus to read
the MAC address.
This patch adds the DBG_DISABLE() and DBG_ENABLE() macros, which can
be used to temporarily disable and re-enable selected debug levels.
Note that debug levels must still be enabled in the build in order to
function at all: you can't use DBG_ENABLE(DBGLVL_IO) in an object
built with DEBUG=object:1 and expect it to do anything.
?= in a Makefile means that that variable can be overridden by the
environment. This is confusing to users, especially with a generic
name like "ARCH".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Although the E820 API allows for a caller to provide only a 20-byte
buffer, there exists at least one combination (HP BIOS, 32-bit WinPE)
that relies on information found only in the "extended attributes"
field, which requires a 24-byte buffer.
Allow for up to a 64-byte E820 buffer, in the hope of coping with
future idiocies like this one.
The ACPI specification defines an additional 4-byte field at offset 20
for an E820 memory map entry. This field is presumably optional,
since generally E820 gets given only a 20-byte buffer to fill.
However, the bits of this optional field are defined as:
bit 0 : region is enabled
bit 1 : region is non-volatile memory rather than RAM
so it seems as though callers that pass in only a 20-byte buffer may
be missing out on some rather important information.
Our INT 15,e820 code was setting %es=%ss (as part of the "look ahead
in the memory map" logic), but failing to restore %es afterwards.
This is a serious bug, but wasn't affecting many platforms because
almost all callers seem to set %es=%ss anyway.
Use individual page mappings rather than a single whole-region
mapping, to avoid the waste of memory that occurs due to the
constraint that each mapped block must be aligned on its own size.
Some BIOSes require us to pass in not only the continuation value (in
%ebx) as returned by the previous call to INT 15,e820 but also the
unmodified buffer (at %es:%di) as returned by the previous call to INT
15,e820. Apparently, someone thought it would be a worthwhile
optimisation to fill in only the low dword of the "length" field and
the low byte of the "type field", assuming that the buffer would
remain unaltered from the previous call.
This problem was being triggered by the "peek ahead" logic in
get_mangled_e820(), which would read the next entry into a temporary
buffer in order to be able to guarantee terminating the map with
%ebx=0 rather than CF=1. (Terminating with CF=1 upsets some Windows
flavours, despite being documented legal behaviour.)
Work around this problem by always fetching directly into our e820
cache; that way we can guarantee that the underlying call always sees
the previous buffer contents (and the same buffer address).
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
We were accidentally allocating only half the required amount of
memory (given the alignment method) for the firmware buffer, leading
to conflicts between the firmware buffer and gPXE code/data segments.
We seem to be having issues with various E820 memory maps. These
problems are often difficult to reproduce, requiring access to the
specific system exhibiting the problem.
Add a facility for hooking in a fake E820 map generator, using an
arbitrary map defined in a C array, solely in order to be able to test
the map-mangling code against arbitrary E820 maps.
In particular, allow BANNER_TIMEOUT=0 to inhibit the prompt banners
altogether.
Ironically, this request comes from the same OEM that originally
required the prompts to be present during POST.
Some really moronic BIOSes bring up the PXE stack via the UNDI loader
entry point during POST, and then don't bother to unload it before
overwriting the code and data segments. If this happens, we really
don't want to leave INT 15 hooked, because that will cause any loaded
OS to die horribly as soon as it attempts to fetch the system memory
map.
We use a heuristic to detect whether or not we are being loaded at the
top of free base memory. If we determine that we are being loaded at
some other arbitrary location in base memory, then we assume that it's
not safe to hook INT 15.
This allows settings to be expanded in a way that is safe to include
within a URI string, such as
kernel http://10.0.0.1/boot.php?mf=${manufacturer:uristring}
where the ${manufacturer} setting may contain characters that are not
permitted (or have reserved purposes) within a URI.
Since whitespace characters will be URI-encoded (e.g. "%20" for a
space character), this also works around the problem that spaces
within an expanded setting would cause the shell to split command-line
arguments incorrectly.
On non-BBS systems we hook INT 19, since there is no other way we can
guarantee gaining control of the flow of execution. If we end up
doing this, prompt the user before attempting boot, since forcibly
capturing INT 19 is rather antisocial.
It is possible for the BIOS to use the UNDI API to bring up the NIC
prior to system boot. If this happens, UNM_NIC_REG_CMDPEG_STATE will
contain the value 0xf00f (UNM_NIC_REG_CMDPEG_STATE_INITIALIZE_ACK),
and we should skip initialising the command PEG.
The firmware will now determine the right port mode on all cards, so
the PXE driver doesn't have to set it. (Setting the port mode
apparently breaks some newer cards.)
At least one Dell system calls the UNDI loader entry point with the
BIOS console disabled. The serial console is active only after a call
to initialise(), so move the debug message in undi_loader() so that it
can be displayed via the serial console.
If the INT 15,e820 memory map reports a region [0,0), this confuses
the "truncate to even megabytes" logic, which ends up rounding the
region 'down' to [0,fff00000).
Fix by ensuring that the region's end address is at least 1, before we
subtract 1 to obtain the "last byte in region" address.
INT 15,e801 is capable of returning a memory range that extends to
4GB, so allow for this in the debug message that shows the data
returned by INT 15,e801.
The domain etherboot.org was actually registered on 2000-01-09, not
2000-09-01. (To put it another way, it was registered on 1/9/2000 (US
date format) rather than 1/9/2000 (sensible date format); this may
illuminate the cause of the error.)
"iqn.2000-09.org.etherboot:" is still valid as per RFC3720, but may be
surprising to users, so change it to something less unexpected.
Thanks to the anonymous contributor for pointing this one out.
Apparently some BIOSes will place option ROMs on 512-byte boundaries.
While this is against specification, it doesn't actually hurt
anything, so we may as well increase our scan granularity to 512
bytes.
Contributed by Luca <lucarx76@gmail.com>
Wyse Streaming Manager server (WLDRM13.BIN) assumes that the PXENV+
entry point is at UNDI_CS:0000; apparently, somebody at Wyse has
difficulty distinguishing between the words "may" and "must"...
Add a dummy entry point at UNDI_CS:0000, which just jumps to the
correct entry point.
The multiboot specification states that, for raw images, if
load_end_addr is zero then it should be interpreted as meaning "use
the entire file", and if bss_end_addr is zero it should be interpreted
as meaning "no bss".
Must check that argument to a fclose() is not NULL -- we can get to the
'err' label when file was not opened. fclose(NULL) is known to produce
core dump on some platforms and we don't want zbin to fail so loudly.
Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Explicitly state that we are using 32-bit addressing in 16-bit code.
GNU as 2.15 (FreeBSD/amd64 7-STABLE) got confused that 32-bit registers
are used in the code that was declared as 16-bit. Add explicit modifier
'addr32' to make assembler happy.
Signed-off-by: Eygene Ryabinkin <rea-fbsd@codelabs.ru>
IBM's iSCSI Firmware Initiator checks the UNDIROMID pointer in the
!PXE structure that gets created by the UNDI loader. We didn't
previously fill this value in.
Commit f58cc3f introduced a temporary workaround for a bug in current
prototype silicon, but failed to apply it to all eight PCI functions
within the device.
Option::ROM was assuming that ROM images using a short jump
instruction for the init entry point would have a zero byte at offset
5; this is not necessarily true.
Include PMM allocation result in POST banner.
Include full product string in "starting execution" message.
Also mark ourselves as supporting DDIM in PnP header, for
completeness.
On a system that doesn't support BBS, we end up hooking INT19 to gain
control of the boot process. If the system is PCI3.0, we must take
care to use the runtime value for %cs, rather than the POST-time
value, otherwise we end up pointing INT19 to the temporary option ROM
POST scratch area.
Determine the network-layer packet type and fill it in for UNDI
clients. This is required by some NBPs such as emBoot's winBoot/i.
This change requires refactoring the link-layer portions of the
gPXE netdevice API, so that it becomes possible to strip the
link-layer header without passing the packet up the network stack.
Some dumb NBPs (e.g. emBoot's winBoot/i) never call PXENV_UNDI_ISR
with FuncFlag=PXENV_UNDI_ISR_START; they just sit in a tight polling
loop merrily violating the PXE spec with repeated calls to
PXENV_UNDI_ISR_IN_PROCESS. Force a extra calls to netdev_poll() to
cope with these out-of-spec clients.
Allow for an arbitrary number of splits of the system memory map via
INT 15,e820.
Features of the new map-mangling algorithm include:
Supports random access to e820 map entries.
Requires only sequential access support from the underlying e820
map, even if our caller uses random access.
Empty regions will always be stripped.
Always terminates with %ebx=0, even if the underlying map terminates
with CF=1.
Allows for an arbitrary number of hidden regions, with underlying
regions split into as many subregions as necessary.
Total size increase to achieve this is 193 bytes.
Define a list of N allowed memory regions, and split each underlying
e820 region into up to N subregions. Strip resulting empty regions
out of the map, avoiding using the "return with CF set to strip last
empty region" trick, because it seems that bootmgr.exe in Win2k8 gets
upset if the memory map is terminated with CF set.
This is an intermediate checkin that defines a single allowed memory
region covering the entire 64-bit address space, and uses the existing
map-mangling code on top of the new region-splitting code. This
sanitises the memory map to the point that Win2k8 is able to boot even
on a system that defines a final zero-length region at the 4GB mark.
I'm checking this in because it may be useful for future debugging
efforts to be able to run with the existing and known-working map
mangling code together with the map sanitisation capabilities of the
new map mangling code.
Add support for manipulating the jump instruction that forms the
option ROM initialisation entry point, so that mergerom.pl can treat
it just like other entry points.
Add support for merging the initialisation entry point (and IBM BOFM
table) to mergerom.pl; this is another slightly icky but unfortunately
necessary GPL vs. NDA workaround. When mergerom.pl replaces an entry
point in the original ROM, it now fills in the corresponding entry
point in the merged ROM with the original value; this allows (for
example) a merged initialisation entry point to do some processing and
then jump back to the original entry point.
fetch_string_setting() was subtracting one from the length of the
to-be-NUL-terminated buffer in order to obtain the length of the
unterminated buffer to be passed to fetch_setting(). This works
extremely well unless the length of the to-be-NUL-terminated buffer is
zero, at which point we end up giving fetch_setting() a buffer of
length -1UL, thereby inviting it to overwrite as much memory as it
wants...
The ProxyDHCPREQUEST is a unicast packet, so the first request will
almost always be lost due to not having the IP address in the ARP
cache. If the minimum retry time is set to one second (as per commit
ff2b6a5), then ProxyDHCP will time out and give up before managing to
successfully transmit a request.
The DHCP timers need to be reworked anyway, so this mild hack is
acceptable for now.
New min_timeout and max_timeout fields in struct retry_timer allow
users of this timer to set their own desired minimum and maximum
timeouts, without being constrained to a single global minimum and
maximum. Users of the timer can still elect to use the default global
values by leaving the min_timeout and max_timeout fields as 0.
WinPE seems to have a bug that causes it to always use the TFTP server
IP address and filename from the ProxyDHCPACK packet, even if the
ProxyDHCPACK packet doesn't exist. This causes it to end up
attempting to fetch a file such as
tftp://0.0.0.0/bootmgr.exe
If we don't have a ProxyDHCPACK to use, we pretend that it was a copy
of the DHCPACK packet. This works around the problem, and hopefully
won't surprise any NBPs.
Altiris erroneously cares about the ordering of DHCP options, and will
get confused if we don't construct them in the order it expects.
This is observed (so far) only when attempting to deploy 64-bit Win2k3.
This patch adds support for the virtio-net adapter provided by KVM.
Written by Laurent Vivier <Laurent.Vivier@bull.net> for Etherboot.
Wrapped as legacy driver for gPXE by Stefan Hajnoczi
<stefanha@gmail.com>.
When we boot from a DHCP-supplied filename, we previously relied on
the fact that the current working URI is set to tftp://[next-server]/
in order to resolve the filename into a full tftp:// URI. However,
this process will eliminate the distinction between filenames with and
without initial slashes:
cwuri="tftp://10.0.0.1/" filename="vmlinuz" => URI="tftp://10.0.0.1/vmlinuz"
cwuri="tftp://10.0.0.1/" filename="/vmlinuz" => URI="tftp://10.0.0.1/vmlinuz"
This distinction is important for some TFTP servers. We now
explicitly construct a string of the form
"tftp://[next-server]/filename"
so that a filename with an initial slash will result in a URI
containing a double-slash, e.g.
"tftp://10.0.0.1//vmlinuz"
The TFTP code always strips a single initial slash, and so ends up
presenting the correct path to the server.
URIs entered explicitly by users at the command line must include a
double slash if they want an initial slash presented to the TFTP
server:
"kernel tftp://10.0.0.1/vmlinuz" => filename="vmlinuz"
"kernel tftp://10.0.0.1//vmlinuz" => filename="/vmlinuz"
This utility is required as a workaround for legal restrictions on
including GPL and non-GPL code within the same expansion ROM image.
While this is not encouraged, we are prepared to accept that
concatenation of ROM images and updating of the ROM header data
structures can be classed as "mere aggregation" within the terms of
the GPL.
If in any doubt, assume that you cannot include GPL and non-GPL code
within the same expansion ROM image. Contact the Etherboot team for
clarification on your specific circumstances.
When an error reply (not 1xx, 2xx or 3xx) was received, ftp_reply()
invoked ftp_done() to close connections, but did not return, and the
rest of code in this function could try to send commands to the closed
control connection.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Based on a patch contributed by Sergey Vlasov <vsu@altlinux.ru> :
In my testing with "qemu -net user" the 226 response to RETR was
often received earlier than final packets of the data connection;
this caused the received file to become truncated without any error
indication. Fix this by adding an intermediate state FTP_TRANSFER
between FTP_RETR and FTP_QUIT, so that the transfer is considered to
be complete only when both the end of data connection is encountered
and the final reply to the RETR command is received.
H. Peter Anvin <hpa@zytor.com> sent word that Sergey Vlasov
<vsu@altlinux.ru> discovered gPXE lkrn images fail to load in SYSLINUX
3.70 because we have initrd_addr_max zeroed. This patch sets the same
value as the Linux kernel.
Also change the header jmp instruction to use a hardcoded opcode value
like Linux does. Just in case the assembler decides to use a three-byte
instruction instead of the desired two-byte jmp.
Allow settings to be expanded in arbitrary commands, such as
kernel http://10.0.0.1/boot.php?uuid=${uuid}
Also add the "echo" command, as being the easiest way to test this
features.
Print one dot per second while waiting in monojob.c (e.g. for DHCP,
for file downloads, etc.), to inform user that the system has not
locked up.
Patch contributed by Andrew Schran <aschran@google.com>, minor
modification by me.
This change allows the time for which shell banners are displayed to
be configured in the config.h file. The ability to access the shell
can also be effectively disabled by setting this timeout to zero.
In tg3_chip_reset(), the PCI_EXPRESS change is taken from the Linux
tg3 driver. I am not sure what exactly it does (it is not documented
in the Linux driver), but it is necessary for the NIC to work
correctly.
Use "-include" rather than "include" for the generated Makefile
fragments, in order to suppress the long list of warnings that
otherwise appears at the start of a clean build.
Contributed by Edward Waugh <ewaugh@netxen.com>