Weak symbols are a useful tool in eliminating unnecessary dependencies
between object files, but they are somewhat dangerous because one must
remember to test the weak symbol against NULL before using it. To
rectify that, add macros for declaring weak functions that will return
a default value inline if the file defining them is not available at
link time.
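A minimal sketch of the idea, assuming GCC or Clang on an ELF target, where an undefined weak symbol resolves to a null address at link time. The macro name WEAK_FUNC_OR_DEFAULT and the function optional_feature are hypothetical and are not the macros added by this patch; the sketch is limited to functions taking a single int.

```c
#include <stddef.h>

/* Hypothetical macro (not the one from this patch): declare a weak
 * function taking one int, plus an inline wrapper that returns a
 * default value when no object file provides the function. */
#define WEAK_FUNC_OR_DEFAULT(ret, name, dflt)              \
    extern ret name(int) __attribute__((weak));            \
    static inline ret name##_or_default(int arg)           \
    {                                                      \
        /* A weak symbol that was never defined is NULL */ \
        return name ? name(arg) : (dflt);                  \
    }

/* optional_feature() is not defined anywhere in this file, so the
 * wrapper falls back to -1 unless another object file supplies it. */
WEAK_FUNC_OR_DEFAULT(int, optional_feature, -1)
```

Callers use optional_feature_or_default() and never see the NULL test, which is the hazard described above.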
Signed-off-by: Marty Connor <mdc@etherboot.org>
There is no defined error code for aborting a request but 0 is commonly
used. This patch switches the abort request error code from
TFTP_ERR_UNKNOWN_TID (5) to 0.
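For illustration, a sketch of what such an abort packet looks like on the wire, using the ERROR packet layout from RFC 1350 (opcode 5, then a 16-bit error code, then a NUL-terminated message). The helper name tftp_build_abort and the TFTP_ERR_ABORT constant are hypothetical and are not taken from the gPXE sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_OP_ERROR  5   /* ERROR opcode (RFC 1350) */
#define TFTP_ERR_ABORT 0   /* hypothetical name for error code 0 */

/* Hypothetical helper: write an ERROR packet with error code 0 into
 * buf. Returns the packet length, or 0 if the buffer is too small. */
size_t tftp_build_abort(uint8_t *buf, size_t buf_len, const char *msg)
{
    size_t msg_len = strlen(msg);
    size_t total = 4 + msg_len + 1;   /* opcode + code + message + NUL */

    if (buf_len < total)
        return 0;
    buf[0] = 0;
    buf[1] = TFTP_OP_ERROR;            /* opcode, network byte order */
    buf[2] = 0;
    buf[3] = TFTP_ERR_ABORT;           /* error code, network byte order */
    memcpy(&buf[4], msg, msg_len + 1); /* message plus trailing NUL */
    return total;
}
```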
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open(), waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.
This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.
Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.
I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.
Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
When the "keep-san" option is used, the function is exited without
unregistering the stack allocated int13h drive. To prevent a dangling
pointer to the stack, these structs should be heap allocated.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The retry timer is used to retransmit TFTP packets lost on the network,
and to start a new connection. There is an unnecessary delay while
waiting for name resolution because the timer period is fixed and cannot
be shortened when name resolution completes. This patch keeps the timer
period at zero while name resolution takes place so that no time is lost
before sending the first packet.
Reported-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
This patch adds TFTP support for files larger than 65535 blocks by
wrapping the 16-bit block number.
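A sketch of the wrapping logic, under the assumption that the receiver keeps a wider running block count and compares only its low 16 bits with the number on the wire. The struct and function names are hypothetical and do not come from the gPXE TFTP code.

```c
#include <stdint.h>

/* Hypothetical receive state: a 32-bit running count of blocks seen. */
struct tftp_progress {
    uint32_t blocks_received;
    uint16_t blksize;
};

/* Accept a DATA block if its 16-bit wire number matches the low
 * 16 bits of the next expected block. Returns 1 if accepted. */
int tftp_accept_block(struct tftp_progress *p, uint16_t wire_block)
{
    uint16_t expected = (uint16_t)(p->blocks_received + 1);

    if (wire_block != expected)
        return 0;
    p->blocks_received++;
    return 1;
}

/* File offset at which the next block's data belongs. */
uint64_t tftp_block_offset(const struct tftp_progress *p)
{
    return (uint64_t)p->blocks_received * p->blksize;
}
```

After block 65535 the next block arrives numbered 0, and the wider counter keeps the file offset correct.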
Reported-by: Mark Johnson <johnson.nh@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
IBM's Tivoli Provisioning Manager for OS Deployment, when acting as a
ProxyDHCP server, sends an initial offer with a vendor class of "PXEClient"
and vendor-encapsulated options that have nothing to do with PXE. To
differentiate between this case and the case of a ProxyDHCP server that
sends all PXE options in its initial offer, modify gPXE to check for
the presence of an encapsulated PXE boot menu option (43.9) instead of
simply checking for the existence of any encapsulated options at all.
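The check can be sketched as a walk over the encapsulated (option 43) data looking for sub-option 9. This is an illustration only: the function name and constants are hypothetical, and the tag/length/value layout with pad (0) and end (255) markers is assumed to match ordinary DHCP option encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define PXE_BOOT_MENU 9     /* sub-option 9 inside option 43 */
#define OPT_PAD       0x00
#define OPT_END       0xff

/* Return 1 if the encapsulated option data holds a PXE boot menu
 * sub-option, 0 otherwise (including for malformed data). */
int has_pxe_boot_menu(const uint8_t *data, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t tag = data[i];

        if (tag == OPT_END)
            break;
        if (tag == OPT_PAD) {
            i++;                        /* pad has no length byte */
            continue;
        }
        if (i + 1 >= len)
            break;                      /* truncated: no length byte */
        if (i + 2 + (size_t)data[i + 1] > len)
            break;                      /* truncated: value overruns */
        if (tag == PXE_BOOT_MENU)
            return 1;
        i += 2 + (size_t)data[i + 1];   /* skip tag, length and value */
    }
    return 0;
}
```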
This is the same check used by the Intel vendor PXE ROM.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The PXE standard provides examples of ProxyDHCP responses being encoded both
as type DHCPOFFER and DHCPACK, but currently we only accept DHCPACKs. Since
there are PXE servers in existence that respond to ProxyDHCPREQUESTs with
DHCPOFFERs, modify gPXE's ProxyDHCP pruning logic to treat both types of
responses equally.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The previous [skge] commit should have been recorded as authored by
Thomas Miletich <thomas.miletich@gmail.com>.
I mistakenly committed it improperly after fixing a merge issue.
Signed-off-by: Marty Connor <mdc@etherboot.org>
This code is based on the Linux skge driver. It supports Marvell Yukon
and SysKonnect Gigabit chipsets.
The code is based on code Michael Decker <mrd999@gmail.com> wrote for
Google Summer of Code 2008.
Support for dual-port cards is untested. The code, however, was left
in. In my opinion it's easier to fix the code if we need to, instead
of having to add support for it from scratch.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The bin/xxx.sizes targets examine the list of obj_ symbols in bin/xxx.tmp
to determine which objects to measure the size of. These symbols have been
normalized to C identifiers, so the result is an error message from `size'
when examining a target that includes objects that were originally named
with hyphens.
Fix by turning obj_foo_bar into $(wildcard bin/foo?bar.o) instead of
bin/foo_bar.o.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Change the behaviour for adding DHCP options into a DHCP packet so
that we now append options, rather than insert them in front of
whatever options might already be present.
Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options. If we build a DHCP packet
pre-populated with some options, their order will now be preserved,
except for encapsulated options.
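A sketch of append-style insertion into a flat options buffer, assuming the buffer is terminated by an end marker (255). The function name is hypothetical, and encapsulated options are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DHCP_PAD 0x00
#define DHCP_END 0xff

/* Hypothetical helper: append one option just before the END marker,
 * leaving options already present in their original order.
 * Returns 0 on success, -1 if there is no END marker or no room. */
int dhcp_append_option(uint8_t *opts, size_t max_len, uint8_t tag,
                       const void *data, uint8_t len)
{
    size_t used = 0;

    /* Walk the existing options to find the END marker */
    while (used < max_len && opts[used] != DHCP_END) {
        if (opts[used] == DHCP_PAD) {
            used++;
            continue;
        }
        if (used + 1 >= max_len)
            return -1;
        used += 2 + (size_t)opts[used + 1];
    }
    if (used >= max_len)
        return -1;                      /* no END marker found */
    if (used + 2 + (size_t)len + 1 > max_len)
        return -1;                      /* no room for option + END */

    opts[used] = tag;
    opts[used + 1] = len;
    memcpy(&opts[used + 2], data, len);
    opts[used + 2 + len] = DHCP_END;    /* re-terminate the list */
    return 0;
}
```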
Signed-off-by: Marty Connor <mdc@etherboot.org>
Apparently, the DHCP relay logic on a Nortel 470-48T layer 2 switch
cares about the order of DHCP options. Specifically, it requires
that the DHCP message type option be the first option present in the
DHCP packet. We achieve this by having this option appear first in
our dhcp_request_options_data array, which pre-populates DHCP
requests.
Signed-off-by: Marty Connor <mdc@etherboot.org>
gPXE currently takes advantage of the feature of PCI3.0 that allows
option ROMs to relocate the bulk of their code to high memory and so
take up only a small amount of space in the option ROM area. Currently,
the relocation can only take place if the BIOS's implementation of PMM
can be made to return blocks aligned to an even megabyte, because of
the A20 gate. AMI BIOSes, in particular, will not return allocations
that gPXE can use.
Ameliorate the situation somewhat by adding a prefix, .hrom, that works
identically to .rom except in the case that PMM allocation fails. Where
.rom would give up and place itself entirely in option ROM space, .hrom
moves to a block (assumed free) at HIGHMEM_LOADPOINT = 4MB. This allows
for the use of larger gPXE ROMs than would otherwise be possible.
Because there is no way to check that the area at HIGHMEM_LOADPOINT is
really free, other devices using that memory during the boot process
will cause failure for gPXE, the other device, or both. In practice
such conflicts will likely not occur, but this prefix should still be
considered EXPERIMENTAL.
Signed-off-by: Marty Connor <mdc@etherboot.org>
This driver supports all current Myricom 10 gigabit Ethernet NICs.
It was written from scratch for gPXE by Glenn Brown <glenn@myri.com>,
referencing Myricom's Linux and EFI drivers, with permission.
Signed-off-by: Glenn Brown <glenn@myri.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Taken from Linux /usr/include/linux/pci.h.
Signed-off-by: Glenn Brown <glenn@myri.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Contrary to the IEEE specification, some access points apparently
set the Spectrum Mgmt bit in the capabilities field even when
broadcasting on a 2.4GHz band that does not require spectrum
management. Allow gPXE to attempt to connect to such networks;
if spectrum management is really required, our advertisement
of capabilities not including it will result in an association
failure.
Reported-by: Peter Meyer <residue@xmail.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Wireless gPXE images are already so large that user-friendliness
seems to trump ROM-size friendliness in this case.
Signed-off-by: Marty Connor <mdc@etherboot.org>
EAPOL is a container protocol that can wrap either EAP packets or
802.11 EAPOL-Key frames. For cleanliness' sake, add a stub that strips
the framing and sends packets off to the appropriate handler if it
is compiled in.
Signed-off-by: Marty Connor <mdc@etherboot.org>
WEP is a highly flawed cryptosystem, barely better than no encryption at all,
but many people still use it. It does have the advantage of being very simple
and small in code size.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Add commands `iwstat' (to list 802.11-specific status information for
802.11 devices) and `iwlist' (to scan for available networks and print
a list along with security information).
Signed-off-by: Marty Connor <mdc@etherboot.org>
This fixes an issue where passing a length as a compound expression
(e.g. using `hdrlen + datalen') would trigger compiler warnings and
potentially precedence-related errors.
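The class of bug can be shown with a pair of hypothetical macros (not the ones touched by this patch). Without parentheses around the argument, `hdrlen + datalen' is pasted into the expression and the division binds to datalen alone.

```c
#include <stddef.h>

/* Fragile: the argument is substituted without parentheses */
#define BLOCKS_FRAGILE(len) (len / 512)

/* Safe: the argument is parenthesised before use */
#define BLOCKS_SAFE(len)    ((len) / 512)

size_t blocks_fragile(size_t hdrlen, size_t datalen)
{
    /* Expands to (hdrlen + datalen / 512) */
    return BLOCKS_FRAGILE(hdrlen + datalen);
}

size_t blocks_safe(size_t hdrlen, size_t datalen)
{
    /* Expands to ((hdrlen + datalen) / 512) */
    return BLOCKS_SAFE(hdrlen + datalen);
}
```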
Signed-off-by: Marty Connor <mdc@etherboot.org>
Both of these routines are used by 802.11 WPA, but they are generic
and could be needed by other protocols as well.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The disk partition prefix code in hdprefix.S reads the gPXE image in
tracks, not individual sectors. This means it will attempt to read
beyond the end of the image if the .hd image type is not padded to 32
KB.
This issue affects virtualization software which may execute a .hd or
.usb image file directly - effectively running a machine with a tiny
disk containing just the gPXE image. Boot will fail when gPXE tries to
read beyond the end of disk.
The Multiboot memory map needs to be built after unhiding gPXE and
downloaded images from memory. Solaris faults during boot when trying
to access the ramdisk, which is hidden from the memory map while gPXE is
executing. This issue is fixed by using the memory map from after gPXE
unhides itself.
Reported-by: Moinak Ghosh <moinakg@belenix.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
This version is based on Michael Decker's GSoC 2008 code.
A number of cleanups and fixes were applied.
Earlier-version-reviewed-by: Marty Connor <mdc@etherboot.org>
Earlier-version-tested-by: Marty Connor <mdc@etherboot.org>
Earlier-version-tested-by: Shao Miller <Shao.Miller@yrdsb.edu.on.ca>
Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The 82571 supports an alternate MAC address location in NVRAM.
When this is set, use this for the MAC rather than the default
physical MAC address.
Ported from linux-2.6.git 93ca161027eb6a1761fb674ad7b995aedccf5f6e
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Tested-by: Thomas Miletich <thomas.miletich@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The get_underlying_e820 function should return with CF unset on success.
Reported-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
It is often the case that some module of gPXE is only relevant if the
subsystem it depends on is already being included. For instance,
commands to manage wireless interfaces are quite useless if no
compiled-in driver has pulled in the wireless networking stack. There
may be user-modifiable configuration options for these dependent
modules, but even if enabled, they should not be included when they
would be useless.
Solve this by allowing the creation of config_subsystem.c, for
configuration directives like those in the global config.c that should
only be considered when subsystem.c is included in the final gPXE
build.
For consistency, move core/config.c to the config/ directory, where
the other config_subsystem.c files will eventually reside.
Signed-off-by: Marty Connor <mdc@etherboot.org>
REQUIRE_SYMBOL() formerly used a formulation of symbol requirement
that would allow a link to succeed despite lacking a required symbol,
because it did not introduce any relocations. Fix by renaming it to
REQUEST_SYMBOL() (since the soft-requirement behavior can be useful)
and adding a REQUIRE_SYMBOL() that truly requires.
Add EXPORT_SYMBOL() and IMPORT_SYMBOL() for REQUEST_SYMBOL()-like
behavior that allows one to make use of the symbol, by combining a
weak external on the symbol itself with a REQUEST_SYMBOL() of a second
symbol.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The PXE menu code also treated the type as big-endian, which went
unnoticed until the first fix because its ntohs() was matched by a
htons() in the PXE boot server discovery code.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some BIOSes (observed with an AMI BIOS on a SunFire X2200) seem to
reset the BIOS drive counter at 40:75 after a failed boot attempt.
This causes problems when attempting a Windows direct-to-iSCSI
installation: bootmgr.exe calls INT 13,0800 and gets told that there
are no hard disks, so never bothers to read the MBR in order to obtain
the boot disk signature. The Windows iSCSI initiator will detect the
iBFT and connect to the target, and everything will appear to work
except for the error message "This computer's hardware may not support
booting to this disk. Ensure that the disk's controller is enabled in
the computer's BIOS menu."
Fix by checking the BIOS drive counter on every INT 13 call, and
updating it whenever necessary.
The case of an unsupported SAN protocol will currently not result in
any error message. Fix by printing the error message at the top level
using strerror(), rather than using hard-coded error messages in the
error paths.
The latest RTL-generated register lists include (mostly redundant)
xxx_MSB values alongside xxx_LSB and xxx_RMASK, and also include
default register values.
Some subnet managers expect the GetResponse from a SetPortInfo MAD to
contain the new link state. The transition is not immediate, so we
often end up returning the previous link state. This can cause the SM
to fail to activate the port.
Fix by waiting for up to 20us for the link state transition to take
effect.
The first byte of the IPoIB MAC address is used for flags indicating
support for "connected mode". Strip out the non-QPN bits of the first
dword when constructing the address vector for transmitted IPoIB
packets, so as not to end up passing an invalid QPN in the BTH.
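A sketch of the masking step, assuming the first four bytes of the IPoIB MAC are a flags byte followed by a 24-bit QPN in network byte order. The function and constant names are hypothetical.

```c
#include <stdint.h>

#define IB_QPN_MASK 0x00ffffffU   /* hypothetical name: low 24 bits */

/* Extract the QPN from the first dword of an IPoIB MAC address,
 * discarding the flags carried in the first byte. */
uint32_t ipoib_mac_qpn(const uint8_t *mac)
{
    /* Assemble the big-endian dword byte by byte, so that the result
     * does not depend on host endianness. */
    uint32_t first = ((uint32_t)mac[0] << 24) |
                     ((uint32_t)mac[1] << 16) |
                     ((uint32_t)mac[2] << 8) |
                     (uint32_t)mac[3];

    return first & IB_QPN_MASK;
}
```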
IBA section 14.2.5.2 states that "the contents of the NodeDescription
attribute are the same for all ports on a node". Satisfy this by
using the HCA GUID rather than the port GUID to form the node
description string.
We do not discard routing table entries when closing an interface. It
is plausible that multiple interfaces may be on the same physical
network; if so, then we may end up in a situation whereby outbound
packets attempt to route via a closed interface.
Fix by ignoring non-open net devices in ipv4_route().
ipv4.c calculates the default subnet mask before calling
fetch_ipv4_setting() to retrieve the configured subnet mask (if any).
However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.
Fix by fetching the setting first and calculating the default subnet
mask only if necessary.
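The corrected ordering can be sketched as follows, assuming addresses in host byte order, a zero value meaning "setting not present", and a classful default mask. Both function names are hypothetical.

```c
#include <stdint.h>

/* Classful default netmask for an address in host byte order */
uint32_t ipv4_default_netmask(uint32_t addr)
{
    if ((addr & 0x80000000U) == 0)
        return 0xff000000U;             /* class A */
    if ((addr & 0xc0000000U) == 0x80000000U)
        return 0xffff0000U;             /* class B */
    if ((addr & 0xe0000000U) == 0xc0000000U)
        return 0xffffff00U;             /* class C */
    return 0;                           /* no default for other classes */
}

/* Use the configured netmask if one was fetched; fall back to the
 * default only when the fetched value is zero (setting missing). */
uint32_t ipv4_effective_netmask(uint32_t addr, uint32_t configured)
{
    return configured ? configured : ipv4_default_netmask(addr);
}
```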
ipv4.c uses a gateway address of INADDR_NONE to represent "no
gateway". It initialises the gateway address to INADDR_NONE before
calling fetch_ipv4_setting() to retrieve the configured gateway
address (if any).
However, as of commit 612f4e7 "[settings] Avoid returning
uninitialised data on error in fetch_xxx_setting()",
fetch_ipv4_setting() will zero the IP address if the setting does not
exist, rather than leaving it unaltered.
Fix by using a zero IP address to indicate "no gateway", so that a
non-existent gateway address setting will be treated as such.
The PXE type field is canonically little-endian, but the pxebs command
treats it as big-endian in converting the type number passed on the
command line to a field value to search against. Fix, to prevent the
necessity of incantations like "pxebs net0 1536" to select menu item #6.
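The value 1536 in that incantation is simply 6 with its two bytes exchanged (0x0006 becomes 0x0600), as this small illustration shows. The helper name is hypothetical.

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```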
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
Error message was:
[BUILD] bin/atl1e.o
cc1: warnings being treated as errors
drivers/net/atl1e.c: In function 'atl1e_get_permanent_address':
drivers/net/atl1e.c:1326: error: dereferencing type-punned pointer will break strict-aliasing rules
make: *** [bin/atl1e.o] Error 1
Reported-by: Giandomenico De Tullio <ghisha@email.it>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
Remove spaces in 3rd PCI_ROM field.
Debugged-by: Marty Connor <mdc@etherboot.org>
Reported-by: Giandomenico De Tullio <ghisha@email.it>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The iBFT is Ethernet-centric in providing only six bytes for a MAC
address. This is most probably an indirect consequence of a similar
design flaw in the Windows NDIS stack. (The WinOF IPoIB stack
performs all sorts of contortions in order to pretend to the NDIS
layer that it is dealing with six-byte MAC addresses.)
There is no sensible way in which to extend the iBFT without breaking
compatibility with programs that expect to parse it. Add the notion
of an "Ethernet-compatible" MAC address to our link layer abstraction,
so that link layers can provide their own workarounds for this
limitation.
Recent gcc versions generate more warnings when compiling util/zbin.c
on a 64-bit system:
util/zbin.c: In function `read_file':
util/zbin.c:85: warning: format `%d' expects type `int', but
argument 3 has type `size_t'
util/zbin.c:91: warning: format `%d' expects type `int', but
argument 3 has type `size_t'
util/zbin.c: In function `read_zinfo_file':
util/zbin.c:119: warning: format `%d' expects type `int', but
argument 4 has type `size_t'
util/zbin.c: In function `alloc_output_file':
util/zbin.c:134: warning: format `%d' expects type `int', but
argument 3 has type `size_t'
util/zbin.c: In function `process_zinfo_add':
util/zbin.c:244: warning: format `%d' expects type `int', but
argument 3 has type `size_t'
util/zbin.c:266: warning: format `%d' expects type `int', but
argument 7 has type `size_t'
util/zbin.c:286: warning: format `%#x' expects type `unsigned int',
but argument 7 has type `size_t'
util/zbin.c: In function `write_output_file':
util/zbin.c:348: warning: format `%d' expects type `int', but
argument 3 has type `size_t'
This patch eliminates these warnings.
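One common way to silence this class of warning is the C99 `z' length modifier, which matches size_t on both 32-bit and 64-bit hosts. The sketch below only illustrates that technique and is not necessarily how the patch changes util/zbin.c; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t in decimal and hex without any casts */
int format_len(char *buf, size_t buf_len, size_t len)
{
    return snprintf(buf, buf_len, "length %zu (%#zx)", len, len);
}
```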
Signed-off-by: Marty Connor <mdc@etherboot.org>
gcc 3.3.3 gave the following error when compiling sis190.c:
drivers/net/sis190.c: In function 'sis190_get_mac_addr_from_apc':
drivers/net/sis190.c:966: warning: 'isa_bridge' might be used
uninitialized in this function
make: *** [bin/sis190.o] Error 1
This patch allows error-free compilation.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Recent gcc versions generate warnings when compiling util/zbin.c
(tested with gcc-4.3.3):
util/zbin.c: In function ‘process_zinfo_pack’:
util/zbin.c:200: warning: format ‘%#zx’ expects type ‘size_t’, but argument 6
has type ‘long unsigned int’
util/zbin.c: In function ‘process_zinfo_add’:
util/zbin.c:257: warning: format ‘%#lx’ expects type ‘long unsigned int’, but
argument 4 has type ‘int’
util/zbin.c:266: warning: format ‘%#lx’ expects type ‘long unsigned int’, but
argument 4 has type ‘int’
util/zbin.c:266: warning: format ‘%d’ expects type ‘int’, but argument 8 has
type ‘long unsigned int’
util/zbin.c:286: warning: format ‘%#lx’ expects type ‘long unsigned int’, but
argument 6 has type ‘int’
util/zbin.c:286: warning: format ‘%#lx’ expects type ‘long unsigned int’, but
argument 7 has type ‘size_t’
This patch eliminates these warnings.
Tested with gcc-4.3.3 on Ubuntu 9.04 and gcc-4.1.2 on Debian Etch.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some BIOSes set the PCI cacheline size to zero for the card; the ath5k
driver fixes it to a reasonable value in PCI config space, but failed to
correct the internal value it had already read. This resulted in
divide-by-zero errors when cacheline-aligning various data structures.
Fix by setting the internal cachelsz to a sane value at the same time
as we write that value to PCI config space.
Signed-off-by: Marty Connor <mdc@etherboot.org>
This adds basic rfkill support for enabling the wireless card on certain
laptops, and changes miscellaneous other details that may help in obscure
cases.
Also change the error handling to not report CRC errors, which due to the
basic facts of wireless may happen even more frequently than valid packets.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Add the 82576 to the e1000 driver. The changes are based on:
- the Linux 2.6.30-rc4 igb driver, which supports this card; and
- the Intel® 82576 Gigabit Ethernet Controller Datasheet v2.1, which
is available from Intel's web site.
I only have a dual-ported card with Copper PHY, so any code paths relating
to Fibre haven't been tested. Also, I have only tested using auto-negotiation
of speed and duplex, and no flow control. Other code paths relating to
those settings also have not been exercised.
Signed-off-by: Simon Horman <horms@verge.net.au>
Sponsored-by: Thomas Miletich <thomas.miletich@gmail.com>
Modified-by: Thomas Miletich <thomas.miletich@gmail.com>
Modified-by: Marty Connor <mdc@etherboot.org>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Enable interrupts in sis900_irq(). Doing so allows some programs using
gPXE's UNDI interface to work properly, including Symantec Ghost.
Tested-by: Hubert Mercier <hubert.mercier@unilim.fr>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The mtools version check does not handle GNU mtools 4.0.10. This commit
makes the pattern more general so it matches older mtools as well as the
newer "mtools (GNU mtools) 4.0.10" string.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
A jump instruction starts at the third byte of an option ROM image, and
it is required that the bytes in the whole image add up to zero. To
achieve this, a checksum byte is usually placed after the jump. The jump
can be either a short jump (2 bytes, EB xx) or a near jump (3 bytes,
E9 xx xx). gPXE's romprefix.S uses a near jump, but modrom.pl assumed
a short jump, and clobbered the high byte of the offset. This caused
modrom-modified gPXE ROM images to crash the system during POST.
Fix by making modrom.pl place the checksum at byte 6, like makerom.pl does.
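The checksum rule can be sketched in C as follows (modrom.pl itself is Perl; the function names here are hypothetical). The checksum byte at offset 6 is chosen so that all bytes of the image sum to zero modulo 256, leaving the three-byte near jump at offsets 3 to 5 untouched.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_CHECKSUM_OFFSET 6   /* byte 6, as described above */

/* Sum of all image bytes, modulo 256 */
uint8_t rom_sum(const uint8_t *rom, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    return sum;
}

/* Set the checksum byte so that the whole image sums to zero */
void rom_fix_checksum(uint8_t *rom, size_t len)
{
    rom[ROM_CHECKSUM_OFFSET] = 0;
    rom[ROM_CHECKSUM_OFFSET] = (uint8_t)(0 - rom_sum(rom, len));
}
```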
Signed-off-by: Marty Connor <mdc@etherboot.org>
Debug builds for filenames with hyphens such as:
$ make bin/via-rhine.dsk DEBUG=via-rhine
fail with:
[BUILD] bin/via-rhine.dbg1.o
<command-line>: error: missing whitespace after the macro name
make: *** [bin/via-rhine.dbg1.o] Error 1
This is because "-" is not a legal character in C identifiers, and
gcc rejects "-Ddebug_via-rhine=1" as an argument.
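One possible normalisation, shown here in C purely as an illustration: map every character that is not legal in a C identifier to an underscore, so that "via-rhine" becomes "via_rhine". The helper name is hypothetical, and the actual build system change is not reproduced here.

```c
#include <ctype.h>
#include <stddef.h>

/* Replace, in place, every character that is not a letter or digit
 * with an underscore. A leading digit is not handled in this sketch. */
void c_identifier_from_name(char *name)
{
    size_t i;

    for (i = 0; name[i] != '\0'; i++) {
        if (!isalnum((unsigned char)name[i]))
            name[i] = '_';
    }
}
```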
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Both methods disabled packet tx and rx only to have them re-enabled
by the call to a3c90x_reset().
Fixed by disabling tx and rx after the call to a3c90x_reset().
Tested by booting Ubuntu Intrepid (8.10) directly from gPXE and pxelinux.
Tested on 3c905, 3c905B, 3c905C.
Signed-off-by: Marty Connor <mdc@etherboot.org>
Some systems will retry their boot sequence in the event of a boot
failure. On these systems, the second and subsequent boot attempts
will fail to initialise the Hermon HCA.
Fix by resetting the HCA during probe(). This incurs a one-second
cost, but there seems to be no viable alternative.
Originally-fixed-by: Itay Gazit <itaygazit@gmail.com>
Some devices can only be reset via a mechanism that also resets the
card's PCI core, thus necessitating a backup and restore of all or
part of the PCI configuration space across a reset.
802.11 multicast hashing is the same as standard Ethernet hashing, so
just expose and use eth_mc_hash().
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
The recent change to process_add() to detect duplicate process
additions relies on the fact that all processes will be initialized
using process_init_stopped() before being passed to that function.
The autoassociation process was not initialized in this fashion, so
process_add() erroneously detected it as a duplicate.
Fix by using process_init_stopped() to initialize the autoassociation
process instead of setting the step member directly.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
For IPoIB, the chaddr field is too small (16 bytes) to contain the
20-byte IPoIB link-layer address. RFC4390 mandates that we should
pass an empty chaddr field and rely on the DHCP client identifier
instead. This has many problems, not least of which is that a client
identifier containing an IPoIB link-layer address is not very useful
from the point of view of creating DHCP reservations, since the QPN
component is assigned at runtime and may vary between boots.
Leave the DHCP client identifier as-is, to avoid breaking existing
setups as far as possible, but expose the real hardware address (the
port GUID) via the DHCP chaddr field, using the broadcast flag to
instruct the DHCP server not to use this chaddr value as a link-layer
address.
This makes it possible (at least with ISC dhcpd) to create DHCP
reservations using host declarations such as:
host duckling {
fixed-address 10.252.252.99;
hardware unknown-32 00:02:c9:02:00:25:a1:b5;
}
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".
The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime. This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.
Expose the hardware and link-layer addresses as separate properties
within a net device. Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().
The option ROM header contains a one-byte field indicating the number
of 512-byte sectors in the ROM image. Currently it is linked to
contain the number of uncompressed sectors, with an instruction to the
compressor to correct it. This causes link failure when the
uncompressed size of the ROM image is over 128k.
Fix by replacing the SUBx compressor fixup with an ADDx fixup that
adds the total compressed output length, scaled as requested, to an
addend stored in the field where the final length value will be
placed. This is similar to the behavior of ELF relocations, and
ensures that an overflow error will not be generated unless the
compressed size is still too large for the field.
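The arithmetic behind the overflow, and the ADDx idea, can be sketched as below. This is an assumption-laden illustration rather than the zbin implementation: the function name is hypothetical, the addend is treated as unsigned, and the scaled length is rounded up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ADDx-style fixup for a one-byte field: add the output
 * length, scaled down by divisor, to the addend already stored in the
 * field. Returns 0 on success, -1 if the result does not fit. */
int add_fixup_u8(uint8_t *field, size_t output_len, size_t divisor)
{
    size_t units = (output_len + divisor - 1) / divisor;  /* round up */
    size_t value = (size_t)*field + units;

    if (value > UINT8_MAX)
        return -1;              /* overflow only if still too large */
    *field = (uint8_t)value;
    return 0;
}
```

For example, a 200 kB uncompressed image would need 400 sectors, which cannot be stored in one byte, whereas the same image compressed to 60 kB needs only 120.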
This also allows us to do away with the _filesz_pgh and _filesz_sect
calculations exported by the linker script.
Output tested bitwise identical to the old SUBx mechanism on hd, dsk,
lkrn, and rom prefixes, on both 32-bit and 64-bit processors.
Modified-by: Michael Brown <mcb30@etherboot.org>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The SRP Boot Firmware Table serves a similar role to the iSCSI and AoE
Boot Firmware Tables; it provides information required by the loaded
OS in order to establish a connection back to the SRP boot device.
There is diagnostic value in being able to disambiguate between the
various reasons why an IB CM has rejected a connection attempt. In
particular, reason 8 "invalid service ID" can be used to identify an
incorrect SRP service_id root-path component, and reason 28 "consumer
reject" corresponds to a genuine SRP login rejection IU, which can be
passed up to the SRP layer.
For rejection reasons other than "consumer reject", we should not pass
through the private data, since it is most likely generated by the CM
without any protocol-specific knowledge.
With iSCSI, connection attempts are expensive; it may take many
seconds to determine that a connection will fail. SRP connection
attempts are much less expensive, so we may as well avoid the
"optimisation" of declaring a state of permanent failure after a
certain number of attempts. This allows a gPXE SRP initiator to
resume operations after an arbitrary amount of SRP target downtime.
Generate errors within individual MAD transaction consumers such as
ib_pathrec.c and ib_mcast.c, rather than within ib_mi.c. This allows
for more meaningful error messages to eventually be displayed to the
user.
SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory. The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
The minimal-surprise behaviour, when no explicit SRP initiator device
is specified, will probably be to use the most recently opened
Infiniband device. This matches our behaviour with using the most
recently opened net device for PXE, iSCSI, AoE, NBI, etc.
SRP over Infiniband uses a protocol whereby data is sent via a
combination of the CM private data fields and the RC queue pair
itself. This seems sufficiently generic that it's worth having
available as a separate protocol.
The ACK timeout determines how long we take to notice a failed
Reliable Connection. Reducing it from the arbitrary value of 19 down
to 14 reduces the individual ACK timeout from around 2.1s to 67ms;
this in turn reduces the time to tear down and re-establish a broken
SRP session from around 30s to around 1s.
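The figures follow from the IBA encoding of the local ACK timeout as 4.096us multiplied by two to the power of the configured value. A small sketch of the arithmetic, with a hypothetical function name:

```c
#include <stdint.h>

/* Local ACK timeout in nanoseconds: 4.096us * 2^exponent */
uint64_t ib_ack_timeout_ns(unsigned int exponent)
{
    return (uint64_t)4096 << exponent;
}
```

An exponent of 19 gives 2147483648ns (about 2.1s), and an exponent of 14 gives 67108864ns (about 67ms).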
The Infiniband Communication Manager will refuse to establish a
connection if it believes the connection is already established.
There is no immediately obvious way to ask it to tear down the
existing connection and replace it; to issue a DREP we would need to
know the local and remote communication IDs used for the previous
connection setup.
We can work around this by randomising the high-order bits of the
queue pair number; these have no significance to the hardware, but are
sufficient to convince the IB CM that this is a different connection.
We will terminate our transaction as soon as we receive the first CM
REP, since that provides all the state that we need. However, the
peer may resend the REP if it didn't see our RTU, and if we don't
respond with another RTU we risk being disconnected. (This protocol
appears not to handle retries gracefully.)
Fix by adding a management agent that will listen for these duplicate
REPs and send back an RTU.
When a probe found no results, the list head of beacons would not be
freed, leaking 16 bytes of memory per probe.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Previously the maximum packet length was computed using an erroneous
understanding of the role of the MIC field in TKIP-encrypted packets.
The field is actually considered to be part of the MSDU (encrypted and
fragmented data), not the MPDU (container for each encrypted
fragment). As such its size does not contribute to cryptographic
overhead outside the data field's size limitations. The net result is
that the previous maximum packet length value was 4 bytes too long;
fix it to the correct value of 2352.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Some cards (such as ath5k) always need to tune to a particular channel
when they are reset; the reset may happen upon open(), which is before
the channels array would be set up (in prepare_probe()). Avoid tuning
the card to an inconsistent state by copying the hardware
supported-channels array to the 802.11 device's allowable-channels
array even before channels are "properly" set up.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
The prior net80211 model of physical-layer behavior for drivers was
overly simplistic and limited the drivers that could be written. To
be more flexible, split the driver-provided list of supported rates by
band, and add a means for specifying a list of supported channels.
Allow drivers to specify a hardware channel value that will be tied to
uses of the channel.
Expose net80211_duration() to drivers, and make the rate it uses in
its computations configurable, so that it can be used in calculating
durations that must be set in hardware for ACK and CTS packets. Add
net80211_cts_duration() for the common case of calculating the
duration for a CTS packet.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
If isolinux.bin is not installed in the expected location, the error
message shown is slightly misleading.
Signed-off-by: Vibi Sreenivasan <vibi_sreenivasan@cms.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
A management interface is the component through which both local and
remote management agents are accessed.
This new implementation of a management interface allows the user to
react to timed-out transactions, and also allows in-progress
transactions to be cancelled.
Some BIOSes support the BIOS Boot Specification (BBS) but fail to set
%es:%di correctly when calling the option ROM initialisation entry
point. This causes gPXE to identify the BIOS as non-PnP (and so
non-BBS), leaving the user unable to control the boot order.
Fix by scanning for the $PnP signature ourselves, rather than relying
on the BIOS having passed in %es:%di correctly.
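A minimal sketch of such a scan is shown below. It assumes the
structure layout given in the Plug and Play BIOS Specification (a
"$PnP" signature on a 16-byte boundary, a length byte at offset 5, and
all bytes of the structure summing to zero). It searches a
caller-supplied copy of the BIOS area rather than real memory, and the
function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the length byte within the $PnP structure */
#define EXAMPLE_PNP_LENGTH_OFFSET 5

/**
 * Scan a copy of the BIOS area for a valid $PnP structure
 *
 * @v bios		Copy of the BIOS area
 * @v len		Length of the copy
 * @ret offset		Offset of the structure, or -1 if not found
 */
static long example_find_pnp ( const uint8_t *bios, size_t len ) {
	size_t offset;
	size_t i;
	uint8_t pnp_len;
	uint8_t sum;

	/* The structure is aligned on a 16-byte boundary */
	for ( offset = 0 ; ( offset + 16 ) <= len ; offset += 16 ) {

		/* Check for the signature */
		if ( memcmp ( &bios[offset], "$PnP", 4 ) != 0 )
			continue;

		/* Check that the claimed length is plausible */
		pnp_len = bios[ offset + EXAMPLE_PNP_LENGTH_OFFSET ];
		if ( ( pnp_len == 0 ) || ( ( offset + pnp_len ) > len ) )
			continue;

		/* All bytes of the structure must sum to zero */
		sum = 0;
		for ( i = 0 ; i < pnp_len ; i++ )
			sum += bios[ offset + i ];
		if ( sum == 0 )
			return ( ( long ) offset );
	}

	return -1;
}
```

Verifying the checksum as well as the signature avoids mistaking a
stray "$PnP" string elsewhere in the BIOS image for the real structure.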
Tested-by: Helmut Adrigan <helmut.adrigan@chello.at>
The IBA specification refers to management "interfaces" and "agents".
The interface is the component that connects to the queue pair and
sends and receives MADs; the agent is the component that constructs
the reply to the MAD.
Rename the IB_{QPN,QKEY,QPT} constants as a first step towards making
this separation in gPXE.
The function __intel_new_proc_init() (called implicitly when building
using icc) is marked with __attribute__((cdecl)). This breaks
building on x86_64, where cdecl is meaningless.
Fix by replacing with the existing __libgcc macro, which is already
defined to be "__attribute__((cdecl))" for i386 builds and empty for
x86_64 builds.
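The pattern can be sketched as follows. The macro and function names
here are hypothetical stand-ins for __libgcc and
__intel_new_proc_init(), and the function body is invented for the
example; only the shape of the conditional definition is taken from
the description above.

```c
/* Hypothetical stand-in for the __libgcc macro: a calling-convention
 * attribute on i386, and nothing at all elsewhere.
 */
#if defined ( __i386__ )
#define example_libgcc __attribute__ (( cdecl ))
#else
#define example_libgcc
#endif

/* Hypothetical stand-in for a helper that the compiler calls
 * implicitly.  The same declaration now builds on both architectures.
 */
example_libgcc int example_new_proc_init ( void ) {
	return 0;
}
```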
The MIT and ISC licenses are legally equivalent to the bsd2 license,
but with slightly different verbiage.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
pxe_api.h is just a description of API functions; it is actively
undesirable to have more implementations than necessary. Allowing it
under the MIT license lets the Syslinux libraries use it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
In several places, we currently use size_t to represent a difference
between TCP sequence numbers. This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.
Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.
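A short sketch of the idea: with uint32_t throughout, the difference
wraps correctly modulo 2^32 and always has a single, known format
specifier. The function names are hypothetical, and the use of the
standard PRIu32 macro is an assumption of this example rather than a
description of how gPXE formats these values.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/**
 * Calculate difference between two TCP sequence numbers
 *
 * The subtraction is performed modulo 2^32, so the result is correct
 * even when the sequence space has wrapped between the two values.
 */
static uint32_t example_tcp_seq_diff ( uint32_t later, uint32_t earlier ) {
	return ( later - earlier );
}

/**
 * Format a sequence number difference
 *
 * Because the difference is a uint32_t on every platform, one format
 * specifier is correct everywhere.
 */
static int example_format_seq_diff ( char *buf, size_t len,
				     uint32_t later, uint32_t earlier ) {
	return snprintf ( buf, len, "%" PRIu32,
			  example_tcp_seq_diff ( later, earlier ) );
}
```

For example, a sequence number of 0x00000005 is 10 bytes beyond
0xfffffffb, even though it is numerically smaller.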
Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
The geniso, genliso and gensdsk scripts contain hard-coded temporary
directory names, and so could potentially collide with each other when
run as part of a concurrent build (e.g. "make -j 4").
Fix by using mktemp to generate suitable temporary directory names.
We add a syslinux floppy disk type using parts of the genliso script.
This floppy image can be dd'ed to a physical floppy or used in
instances where a virtual floppy with a mountable DOS filesystem is
useful.
We also modify the genliso script to only generate .liso images
rather than creating images depending on how it is called.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
This is required for all modern 802.11 devices, and allows drivers
to be written for them with little more effort than is required
for a wired NIC.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
The Linux IB Communication Manager will always send MADs to QP1,
rather than back to the originating QP. On Hermon, QP1 is by default
handled by the embedded firmware. We can change this, but the cost is
that we have to handle both QP0 and QP1 (i.e. we have to provide SMA
as well as GMA service in software), and we have to use MLX queues
rather than standard UD queues (i.e. we have to construct the UD
datagrams by hand).
There doesn't seem to be any viable way around this situation, ugly
though it is.
Queue pairs are now assumed to be created in the INIT state, with a
call to ib_modify_qp() required to bring the queue pair to the RTS
state.
ib_modify_qp() no longer takes a modification list; callers should
modify the relevant queue pair parameters (e.g. qkey) directly and
then call ib_modify_qp() to synchronise the changes to the hardware.
The packet sequence number is now a property of the queue pair, rather
than of the device.
Each queue pair may have an associated address vector. For RC queue
pairs, this is the address vector that will be programmed into the
hardware as the remote address. For UD queue pairs, it will be used
as the default address vector if none is supplied to ib_post_send().
Now that MAD handlers no longer return a status code, we can allow
them to return a pointer to a MAD structure if and only if they want
to send a response. This provides a more natural and flexible
approach than using a "response method" field within the handler's
descriptor.
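The calling convention might look something like the sketch below.
The structure is a heavily simplified, hypothetical stand-in for a
real MAD, and the handler and dispatcher names are invented for the
example; the method values are the Get, Trap and GetResp codes from
the IBA specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified MAD */
struct example_mad {
	uint8_t method;
	uint16_t status;
};

#define EXAMPLE_METHOD_GET	0x01
#define EXAMPLE_METHOD_TRAP	0x05
#define EXAMPLE_METHOD_GET_RESP	0x81

/**
 * Handle a MAD
 *
 * @v mad		Received MAD
 * @ret response	MAD to send in response, or NULL for no response
 */
static struct example_mad * example_handler ( struct example_mad *mad ) {

	/* Nothing to send back for anything other than a Get */
	if ( mad->method != EXAMPLE_METHOD_GET )
		return NULL;

	/* Construct the response in place and ask for it to be sent */
	mad->method = EXAMPLE_METHOD_GET_RESP;
	return mad;
}

/**
 * Dispatch a received MAD
 *
 * @ret sent		Non-zero if a response would be sent
 */
static int example_dispatch ( struct example_mad *mad ) {
	struct example_mad *response = example_handler ( mad );

	if ( ! response )
		return 0;

	/* Real code would post the response to the send queue here */
	return 1;
}
```

The decision about whether to respond lives entirely in the handler's
return value, so the dispatcher needs no per-handler "response method"
metadata.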
MAD handlers have to set the status fields within the MAD itself
anyway, in order to provide a meaningful response MAD; the additional
gPXE return status code is just noise.
Note that we probably don't need to ever explicitly set the status to
IB_MGMT_STATUS_OK, since it should already have this value from the
request. (By not explicitly setting the status in this way, we can
safely have ib_sma_set_xxx() call ib_sma_get_xxx() in order to
generate the GetResponse MAD without worrying that ib_sma_get_xxx()
will clear any error status set by ib_sma_set_xxx().)
Most IB hardware seems not to allow allocation of the genuine QPNs 0
and 1, so allow for the externally-visible QPN (as constructed and
parsed by ib_packet, where used) to differ from the real
hardware-allocated QPN.
The queue key is stored as a property of the queue pair, and so can
optionally be added by the Infiniband core at the time of calling
ib_post_send(), rather than always having to be specified by the
caller.
This allows IPoIB to avoid explicitly keeping track of the data queue
key.
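One way to picture the defaulting behaviour is sketched below. The
structures and function are hypothetical simplifications, and treating
a zero queue key in the address vector as "not specified" is an
assumption of this example.

```c
#include <stdint.h>

/* Hypothetical, simplified queue pair */
struct example_qp {
	/* Queue key stored as a property of the queue pair */
	uint32_t qkey;
};

/* Hypothetical, simplified address vector */
struct example_av {
	uint32_t qpn;
	/* Queue key (assumed here: zero means "not specified") */
	uint32_t qkey;
};

/**
 * Determine the queue key to use for a send
 *
 * @v qp		Queue pair
 * @v av		Address vector supplied by the caller
 * @ret qkey		Queue key that would be placed in the packet
 */
static uint32_t example_post_send_qkey ( const struct example_qp *qp,
					 const struct example_av *av ) {
	/* Work on a copy so that the caller's address vector is unchanged */
	struct example_av av_copy = *av;

	/* Fill in the queue pair's own key if the caller gave none */
	if ( ! av_copy.qkey )
		av_copy.qkey = qp->qkey;

	return av_copy.qkey;
}
```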
Now that path record lookups are handled entirely via
ib_resolve_path(), the only role of the IPoIB peer cache is as a
lookup table for MAC addresses. Update the code structure and
comments to reflect this.
The IPoIB broadcast MAC address varies according to the partition key.
Now that the broadcast MAC address is a property of the network device
rather than of the link layer, we can expose this real MAC address
directly.
The broadcast LID is now identified via a path record lookup; this is
marginally inefficient (since it was present in the MCMemberRecord
GetResponse), but avoids the need to special-case broadcasts when
constructing the address vector in ipoib_transmit().
Generalise the subnet management agent into a general management agent
capable of sending and responding to MADs, including support for
retransmissions as necessary.
Currently, all Infiniband users must create a process for polling
their completion queues (or rely on a regular hook such as
netdev_poll() in ipoib.c).
Move instead to a model whereby the Infiniband core maintains a single
process calling ib_poll_eq(), and polling the event queue triggers
polls of the applicable completion queues. (At present, the
Infiniband core simply polls all of the device's completion queues.)
Polling a completion queue will now implicitly refill all attached
receive work queues; this is analogous to the way that netdev_poll()
implicitly refills the RX ring.
Infiniband users no longer need to create a process just to poll their
completion queues and refill their receive rings.
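The resulting call chain can be sketched roughly as follows. All of
the structures, names and sizes are hypothetical simplifications: each
completion queue is given a single receive ring, and the event queue
poll simply visits every completion queue, as described above.

```c
/* Hypothetical sizes for the example */
#define EXAMPLE_NUM_CQS 2
#define EXAMPLE_RX_FILL 4

/* Hypothetical completion queue with one attached receive ring */
struct example_cq {
	/* Number of receive buffers currently posted */
	unsigned int rx_fill;
	/* Number of times this queue has been polled */
	unsigned int polls;
};

/* Hypothetical Infiniband device */
struct example_ibdev {
	struct example_cq cqs[EXAMPLE_NUM_CQS];
};

/* Refill the receive ring up to its fill level */
static void example_refill_recv ( struct example_cq *cq ) {
	while ( cq->rx_fill < EXAMPLE_RX_FILL )
		cq->rx_fill++;
}

/* Poll a completion queue, implicitly refilling its receive ring */
static void example_poll_cq ( struct example_cq *cq ) {
	cq->polls++;
	/* Real code would process completions here */
	example_refill_recv ( cq );
}

/* Poll the event queue: called from a single core-owned process */
static void example_poll_eq ( struct example_ibdev *ibdev ) {
	unsigned int i;

	/* For now, simply poll all of the device's completion queues */
	for ( i = 0 ; i < EXAMPLE_NUM_CQS ; i++ )
		example_poll_cq ( &ibdev->cqs[i] );
}
```

A user of the device only has to create its queues; neither polling
nor refilling requires any code on the user's side.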
IPoIB and the SMA have separate constants for the packet size to be
used for I/O buffer allocations. Merge these into the single
IB_MAX_PAYLOAD_SIZE constant.
(Various other points in the Infiniband stack have hard-coded
assumptions of a 2048-byte payload; we don't currently support
variable MTUs.)
IPoIB has a link-layer broadcast address that varies according to the
partition key. We currently go through several contortions to pretend
that the link-layer address is a fixed constant; by making the
broadcast address a property of the network device rather than the
link-layer protocol it will be possible to simplify IPoIB's broadcast
handling.
Move the icky call to step() from aoe.c to ata.c; this takes it at
least one step further away from where it really doesn't belong.
Unfortunately, AoE has the ugly aoe_discover() mechanism which means
that we still have a step() loop in aoe.c for now; this needs to be
replaced at some future point.
Objects typically call xfer_close() as part of their response to a
close() message. If the initiating object has already nullified the
xfer interface then this isn't a problem, but it can lead to
unexpected behaviour when the initiating object is aiming to reuse the
connection and so does not nullify the interface.
Fix by always temporarily nullifying the interface during xfer_close()
(as was already being done by xfer_vreopen() in order to work around
this specific problem).
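The effect of the temporary nullification can be seen in the following
sketch, in which a peer responds to close() by closing back towards
the initiator. The interface structure and all names are hypothetical
simplifications of the real data transfer interface.

```c
#include <stddef.h>

struct example_intf;

/* Hypothetical interface operations */
struct example_ops {
	void ( * close ) ( struct example_intf *intf, int rc );
};

/* Hypothetical data transfer interface */
struct example_intf {
	/* Interface to which this one is connected */
	struct example_intf *dest;
	/* Operations for messages arriving at this interface */
	const struct example_ops *op;
	/* Number of close() messages actually handled */
	unsigned int close_count;
};

/* Null operation: silently ignore the message */
static void example_null_close ( struct example_intf *intf, int rc ) {
	( void ) intf;
	( void ) rc;
}

static const struct example_ops example_null_ops = { example_null_close };

/* Send a close() message to the far end of an interface */
static void example_close ( struct example_intf *intf, int rc ) {
	struct example_intf *dest = intf->dest;
	const struct example_ops *op = intf->op;

	/* Temporarily nullify our own end, so that any close() sent
	 * back to us during this call is ignored.
	 */
	intf->op = &example_null_ops;
	dest->op->close ( dest, rc );

	/* Restore our operations so the interface can be reused */
	intf->op = op;
}

/* An object that calls close() as part of its response to close() */
static void example_echo_close ( struct example_intf *intf, int rc ) {
	intf->close_count++;
	example_close ( intf, rc );
}

static const struct example_ops example_echo_ops = { example_echo_close };
```

Without the nullification, two such objects connected to each other
would bounce close() messages back and forth indefinitely.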
Reported-by: infernix <infernix@infernix.net>
Tested-by: infernix <infernix@infernix.net>
These commands can be used to activate or deactivate the PXE API (on a
specifiable network interface).
This is currently of limited use, since most image formats will call
shutdown() before booting the image, meaning that the underlying net
device gets shut down during remove_devices() anyway.
ifcommon_exec() was long ago marked as __attribute__((regparm(2))) in
order to minimise the size of functions that call into it. Since
then, gPXE has added -mregparm=3 as a general compilation option, and
this "optimisation" is now counter-productive.
Change (and simplify) the prototype to minimise code size given the
current compilation conditions.
pxe_init_structures() fills in the fields of the !PXE and PXENV+
structures that aren't known until gPXE starts up. Once gPXE is
started, these values will never change.
Make pxe_init_structures() an initialisation function so that PXE
users don't have to worry about calling it.
It is possible that the UNDI ISR may be triggered before netdev_tx()
returns control to pxenv_undi_transmit(). This means that
pxenv_undi_isr() may see a zero undi_tx_count, and so not check for TX
completions. This is not a significant problem, since it will check
for TX completions on the next call to pxenv_undi_isr() anyway; it
just means that the NBP will see a spurious IRQ that was apparently
caused by nothing.
Fix by updating the undi_tx_count before calling netdev_tx(), so that
pxenv_undi_isr() can decrement it and report the TX completion.
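The ordering issue can be modelled with the small sketch below, in
which the interrupt fires before the transmit call returns. The
counter and function names are hypothetical stand-ins for
undi_tx_count, netdev_tx(), pxenv_undi_transmit() and
pxenv_undi_isr().

```c
/* Hypothetical stand-in for undi_tx_count */
static unsigned int example_tx_count;

/* Number of TX completions reported to the NBP */
static unsigned int example_completions_reported;

/* Interrupt service routine: report a completion if one is pending */
static void example_isr ( void ) {
	if ( example_tx_count ) {
		example_tx_count--;
		example_completions_reported++;
	}
}

/* Hypothetical transmit in which the interrupt is triggered before
 * control returns to the caller.
 */
static void example_netdev_tx ( void ) {
	example_isr();
}

/* Transmit a packet on behalf of the NBP */
static void example_transmit ( void ) {
	/* Update the count first, so that an early interrupt sees it */
	example_tx_count++;
	example_netdev_tx();
}
```

Had the increment followed example_netdev_tx(), the ISR in this model
would have seen a zero count and reported nothing.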
Symantec Ghost requires working multicast support. gPXE configures
all (sufficiently supported) network adapters into "receive all
multicasts" mode, which means that PXENV_UNDI_SET_MCAST_ADDRESS is
actually a no-op, but the current implementation returns
PXENV_STATUS_UNSUPPORTED instead.
Fix by making PXENV_UNDI_SET_MCAST_ADDRESS return success. For good
measure, also implement PXENV_UNDI_GET_MCAST_ADDRESS, since the
relevant functionality is now exposed by the net device core.
Note that this will silently fail if the gPXE driver for the NIC being
used fails to configure the NIC in "receive all multicasts" mode.
The PXE debugging messages have remained pretty much unaltered since
Etherboot 5.4, and are now difficult to read in comparison to most of
the rest of gPXE.
Bring the pxe_undi debug messages up to normal gPXE standards.
This keeps code size down, since the wireless interface management
commands have the same command-line interface and overall structure as
the wired commands.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
With the addition of link status codes, we can now display a detailed
error indication if iflinkwait() fails.
Putting the error output in iflinkwait avoids code duplication, and
gains symmetry with the other interface management routines; ifopen()
already prints an error directly if it cannot open its interface.
Modified-by: Michael Brown <mcb30@etherboot.org>
Signed-off-by: Michael Brown <mcb30@etherboot.org>