Expose the IPv6 address (or prefix) as ${ip6}, the prefix length as
${len6}, and the router address as ${gateway6}.
Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Originally-implemented-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The settings scope ipv6_scope refers specifically to IPv6 settings
that have a corresponding DHCPv6 option. Rename to dhcpv6_scope to
more accurately reflect this purpose.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In edk2, there are several drivers that associate HII forms (and
corresponding config access protocol instances) with each individual
network device. (In this context, "network device" means the EFI
handle on which the SNP protocol is installed, and on which the device
path ending with the MAC() node is installed also.) Such edk2 drivers
are, for example: Ip4Dxe, HttpBootDxe, VlanConfigDxe.
In UEFI, any given handle can carry at most one instance of a specific
protocol (see e.g. the specification of the InstallProtocolInterface()
boot service). This implies that the class of drivers mentioned above
can't install their EFI_HII_CONFIG_ACCESS_PROTOCOL instances on the
SNP handle directly -- they would conflict with each other.
Accordingly, each of those edk2 drivers creates a "private" child
handle under the SNP handle, and installs its config access protocol
(and corresponding HII package list) on its child handle.
The device path for the child handle is traditionally derived by
appending a Hardware Vendor Device Path node after the MAC() node.
The VenHw() nodes in question consist of a GUID (by definition), and
no trailing data (by choice). The purpose of these VenHw() nodes is
only that all the child nodes can be uniquely identified by device
path.
At the moment iPXE does not follow this pattern. It doesn't run into
a conflict when it installs its EFI_HII_CONFIG_ACCESS_PROTOCOL
directly on the SNP handle, but that's only because iPXE is the sole
driver not following the pattern. This behavior seems risky (one
might call it a "latent bug"); better align iPXE with the edk2 custom.
Cc: Michael Brown <mcb30@ipxe.org>
Cc: Gary Lin <glin@suse.com>
Cc: Ladi Prosek <lprosek@redhat.com>
Ref: http://thread.gmane.org/gmane.comp.bios.edk2.devel/13494/focus=13532
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As with assertions, profiling is enabled for objects built with any
debug level (including an explicit debug level of zero).
Allow profiling to be globally enabled or disabled by adding PROFILE=1
or PROFILE=0 respectively to the build command line.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Assertions are enabled for objects built with any debug level
(including an explicit debug level of zero). It is sometimes useful
to be able to enable assertions across all objects; this currently
requires manually hacking include/assert.h.
Allow assertions to be globally enabled by adding ASSERT=1 to the
build command line. For example:
make bin/8086100e.mrom ASSERT=1
Similarly, allow assertions to be globally disabled by adding ASSERT=0
to the build command line. If no ASSERT=... is specified on the
build command line, then only objects mentioned in DEBUG=... will have
assertions enabled (as is currently the case).
Note than globally enabling assertions imposes a relatively heavy
runtime penalty, primarily due to the various sanity checks performed
by list_add(), list_for_each_entry(), etc.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the DEBUG=... syntax to allow debug messages to be compiled in
but disabled by default. For example:
make bin/undionly.kpxe DEBUG=netdevice:3:1
would compile in the messages as for DEBUG=netdevice:3, but would set
the debug level mask so that only the DEBUG=netdevice:1 messages would
be displayed.
This allows for external code to selectively enable the additional
debug messages at runtime, without being overwhelmed by unwanted
initial noise. For example, a developer of a new protocol may want to
temporarily enable tracing of all packets received: this can be done
by building with DEBUG=netdevice:3:1 and using
// temporarily enable per-packet messages
DBG_ENABLE_OBJECT ( netdevice, DBGLVL_EXTRA );
...
// disable per-packet messages
DBG_DISABLE_OBJECT ( netdevice, DBGLVL_EXTRA );
Note that unlike the usual DBG_ENABLE() and DBG_DISABLE() macros,
DBG_ENABLE_OBJECT() and DBG_DISABLE_OBJECT() will not be removed via
dead code elimination if debugging is disabled in the specified
object. In particular, this means that using either of these macros
will always result in a symbol reference to the specified object.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DBG_ENABLE() and DBG_DISABLE() macros currently affect the debug
level of all objects that were built with debugging enabled. This is
undesirable, since it is common to use different debug levels in each
object.
Make the debug level mask a per-object variable. DBG_ENABLE() and
DBG_DISABLE() now control only the debug level for the containing
object (which is consistent with the intended usage across the
existing codebase). DBG_ENABLE_OBJECT() and DBG_DISABLE_OBJECT() may
be used to control the debug level for a specified object. For
example:
// Enable DBG() messages from tcpip.c
DBG_ENABLE_OBJECT ( tcpip, DBGLVL_LOG );
Note that the existence of debug messages continues to be gated by the
DEBUG=... list specified on the build command line. If an object was
built without the relevant debug level, then DBG_ENABLE_OBJECT() will
have no effect on that object at runtime (other than to explicitly
drag in the object via a symbol reference).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The vendor class identifier strings in DHCP_ARCH_VENDOR_CLASS_ID are
out of sync with the (correct) client architecture values in
DHCP_ARCH_CLIENT_ARCHITECTURE.
Fix by removing all definitions of DHCP_ARCH_VENDOR_CLASS_ID, and
instead generating the vendor class identifier string automatically
based on DHCP_ARCH_CLIENT_ARCHITECTURE and DHCP_ARCH_CLIENT_NDI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC3315 defines DHCPv6 option 16 (vendor class identifier) but does
not define any direct relationship with the roughly equivalent DHCPv4
option 60.
The PXE specification predates IPv6, and the UEFI specification is
expectedly vague on the subject. Examination of the reference EDK2
codebase suggests that the DHCPv6 vendor class identifier will be
formatted in accordance with RFC3315, using a single vendor-class-data
item in which the opaque-data field is the string as would appear in
DHCPv4 option 60.
RFC3315 requires the vendor class identifier to specify an IANA
enterprise number, as a way of disambiguating the vendor-class-data
namespace. The EDK2 code uses the value 343, described as:
// TODO: IANA TBD: temporarily using Intel's
Since this "TODO" has been present since at least 2010, it is probably
safe to assume that it has now become a de facto standard.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC5970 defines DHCPv6 options 61 (client system architecture type)
and 62 (client network interface identifier), with contents equivalent
to DHCPv4 options 93 and 94 respectively.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some of the regions may end up being unmapped, either because they are
optional or because the attempt to map them has failed. Region types
starting at 0 didn't make it easy to test for this condition.
This commit bumps all valid region types up by 1 with 0 having the
implicit 'unmapped' meaning.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a mechanism to allow an arbitrary adjustment to be applied to
all subsequent calls to time().
Note that the underlying clock source (e.g. the RTC clock) will not be
changed; only the time as reported within iPXE will be affected.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In some circumstances, intermediate devices may lose state in a way
that temporarily prevents the successful delivery of packets from a
TCP peer. For example, a firewall may drop a NAT forwarding table
entry.
Since iPXE spends most of its time downloading files (and hence purely
receiving data, sending only TCP ACKs), this can easily happen in a
situation in which there is no reason for iPXE's TCP stack to generate
any retransmissions. The temporary loss of connectivity can therefore
effectively become permanent.
Work around this problem by sending TCP keepalives after a period of
inactivity on an established connection.
TCP keepalives usually send a single garbage byte in sequence number
space that has already been ACKed by the peer. Since we do not need
to elicit a response from the peer, we instead send pure ACKs (with no
garbage data) in order to keep the transmit code path simple.
Originally-implemented-by: Ladi Prosek <lprosek@redhat.com>
Debugged-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the 16-bit PCI bus:dev.fn address to a 32-bit seg🚌dev.fn
address, assuming a segment value of zero in contexts where multiple
segments are unsupported by the underlying data structures (e.g. in
the iBFT or BOFM tables).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Mac OS X uses non-standard EFI protocols to obtain the DHCP packets
from the UEFI firmware.
Originally-implemented-by: Michael Kuron <m.kuron@gmx.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There has been a longstanding disagreement between RFC4578 and the
IANA "Processor Architecture Types" registry. RFC4578 section 2.1
defines type 7 as "EFI BC" and type 9 as "EFI x86-64"; the IANA
registry quotes RFC4578 as its source but has these values erroneously
swapped. The EDK2 codebase uses the IANA values.
As of March 2016, RFC4578 has been modified by an errata to match the
values as recorded in the IANA registry.
Fix our definitions to match the consensus values.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently use the EFI_CPU_ARCH_PROTOCOL's GetTimerValue() method to
generate the currticks() timer, calibrated against a 1ms delay from
the boot services Stall() method.
This does not work on ARM platforms, where GetTimerValue() is an empty
stub which just returns EFI_UNSUPPORTED.
Fix by instead creating a periodic timer event, and using this event
to increment a current tick counter.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Require architecture-specific code to make a deliberate choice to use
the unoptimised generic_tcpip_continue_chksum() function, if there is
no optimised version available.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This commit adds support for driving virtio 1.0 PCI devices. In
addition to various helpers, a number of vpm_ functions are introduced
to be used instead of their legacy vp_ counterparts when accessing
virtio 1.0 (aka modern) devices.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Virtio 1.0 introduces new constants and data structures, common to all
devices as well as specific to virtio-net. This commit adds a subset
of these to be able to drive the virtio-net 1.0 network device.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PCI devices may support more capabilities of the same type (for
example PCI_CAP_ID_VNDR) and there was no way to discover all of them.
This commit adds a new API pci_find_next_capability which provides
this functionality. It would typically be used like so:
for (pos = pci_find_capability(pci, PCI_CAP_ID_VNDR);
pos > 0;
pos = pci_find_next_capability(pci, pos, PCI_CAP_ID_VNDR)) {
...
}
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DHCP option 175.189 has been defined (by us) since 2006 as
containing the drive number to be used for a SAN boot, but has never
been automatically used as such by iPXE.
Use this option (if specified) to override the default SAN drive
number.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Interpret the maximum drive number (0xff for hard disks, 0x7f for
floppy disks) as meaning "use natural drive number".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide access to local files via the "file://" URI scheme. There are
three syntaxes:
- An opaque URI with a relative path (e.g. "file:script.ipxe").
This will be interpreted as a path relative to the iPXE binary.
- A hierarchical URI with a non-network absolute path
(e.g. "file:/boot/script.ipxe"). This will be interpreted as a
path relative to the root of the filesystem from which the iPXE
binary was loaded.
- A hierarchical URI with a network path in which the authority is a
volume label (e.g. "file://bootdisk/script.ipxe"). This will be
interpreted as a path relative to the root of the filesystem with
the specified volume label.
Note that the potentially desirable shell mappings (e.g. "fs0:" and
"blk0:") are concepts internal to the UEFI shell binary, and do not
seem to be exposed in any way to external executables. The old
EFI_SHELL_PROTOCOL (which did provide access to these mappings) is no
longer installed by current versions of the UEFI shell.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On some architectures (such as ARM) the "@" character is used as a
comment delimiter. A section type argument such as "@progbits"
therefore becomes "%progbits".
This is further complicated by the fact that the "%" character has
special meaning for inline assembly when input or output operands are
used, in which cases "@progbits" becomes "%%progbits".
Allow the section type character(s) to be defined via Makefile
variables.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no practical way to generate an underlength ARP packet since
an ARP packet is always padded up to the minimum Ethernet frame length
(or dropped by the receiving Ethernet hardware if incorrectly padded),
but the absence of an explicit check causes warnings from some
analysis tools.
Fix by adding an explicit check on the I/O buffer length.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The assumption in asn1_type() that an ASN.1 cursor will always contain
a type byte is incorrect. A cursor that has been cleanly invalidated
via asn1_invalidate_cursor() will contain a type byte, but there are
other ways in which to arrive at a zero-length cursor.
Fix by explicitly checking the cursor length in asn1_type(). This
allows asn1_invalidate_cursor() to be reduced to simply zeroing the
length field.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some EoIB implementations utilise an EoIB-to-Ethernet gateway device
that does not perform a FullMember join to the multicast group for the
EoIB broadcast domain. This has various exciting side-effects, such
as requiring every EoIB node to send every broadcast packet twice.
As an added bonus, the gateway may also break the EoIB MAC address to
GID mapping protocol by sending Ethernet-sourced packets from the
wrong QPN.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some EoIB implementations require each individual EoIB node to create
the multicast group for the EoIB broadcast domain.
It is left as an exercise for the interested reader to determine how
such an implementation might ever allow the parameters of such a
multicast group to be changed without requiring a simultaneous upgrade
of every driver on every operating system on every machine currently
attached to the fabric.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
EoIB is a fairly simple protocol in which raw Ethernet frames
(excluding the CRC) are encapsulated within Infiniband Unreliable
Datagrams, with a four-byte fixed EoIB header (which conveys no actual
information). The Ethernet broadcast domain is provided by a
multicast group, similar to the IPoIB IPv4 multicast group.
The mapping from Ethernet MAC addresses to Infiniband address vectors
is achieved by snooping incoming traffic and building a peer cache
which can then be used to map a MAC address into a port GID. The
address vector is completed using a path record lookup, as for IPoIB.
Note that this requires every packet to include a GRH.
Add basic support for EoIB devices. This driver is substantially
derived from the IPoIB driver. There is currently no mechanism for
automatically creating EoIB devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit e62e52b ("[ipoib] Simplify test for received broadcast
packets") relies upon the multicast LID being present in the
destination address vector as passed to ipoib_complete_recv().
Unfortunately, this information is not present in many Infiniband
devices' completion queue entries.
Fix by testing instead for the presence of a multicast GID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Record the speed of a USB device based on the port's speed at the time
that the device was enabled. This allows us to remember the device's
speed even after the device has been disconnected (and so the port's
current speed has changed).
In particular, this allows us to correctly identify the transaction
translator for a low-speed or full-speed device after the device has
been disconnected.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide BIT_QWORD_PTR() to allow for easy extraction of non-endian
fields (e.g. Infiniband GUIDs) without unnecessary byte swapping.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Tested using QEMU and usbredir to expose the LAN9512 chip present on a
Raspberry Pi.
There is a known issue with the LAN9512: an extra two bytes are
appended to every transmitted packet. These two bytes comprise:
{ 0x00, 0x08 } if packet length == 0 (mod 8)
{ CRC[0], 0x00 } if packet length == 7 (mod 8)
{ CRC[0], CRC[1] } otherwise
The extra bytes are appended whether the Ethernet CRC is generated
manually or added automatically by the hardware. The issue occurs
with the Linux kernel driver as well as the iPXE driver. It appears
to be an undocumented hardware errata.
TCP/IP traffic is not affected, since the IP header length field
causes the extraneous bytes to be discarded by the receiver. However,
protocols that rely on the length of the Ethernet frame (such as FCoE
or iPXE's "lotest" protocol) will be unusable on this hardware.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the UEFI platform firmware to provide drivers for unrecognised
devices, by exposing our own implementation of EFI_USB_IO_PROTOCOL.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Make the class ID a property of the USB driver (rather than a property
of the USB device ID), and allow USB drivers to specify a wildcard ID
for any of the three component IDs (class, subclass, or protocol).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generate a score for each possible USB device configuration based on
the available driver support, and select the configuration with the
highest score. This will allow us to prefer ECM over RNDIS (for
devices which support both) and will allow us to meaningfully select a
configuration even when we have drivers available for all functions
(e.g. when exposing unused functions via EFI_USB_IO_PROTOCOL).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The decision on whether or not a zero-length packet needs to be
transmitted is independent of the host controller and belongs in the
USB core.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
TCP/IP checksum fields are one's complement values and therefore have
two possible representations of zero: positive zero (0x0000) and
negative zero (0xffff).
In RFC768, UDP over IPv4 exploits this redundancy to repurpose the
positive representation of zero (0x0000) to mean "no checksum
calculated"; checksums are optional for UDP over IPv4.
In RFC2460, checksums are made mandatory for UDP over IPv4. The
wording of the RFC is such that the UDP header is mandated to use only
the negative representation of zero (0xffff), rather than simply
requiring the checksum to be correct but allowing for either
representation of zero to be used.
In RFC1071, an example algorithm is given for calculating the TCP/IP
checksum. This algorithm happens to produce only the positive
representation of zero (0x0000); this is an artifact of the way that
unsigned arithmetic is used to calculate a signed one's complement
sum (and its final negation).
A common misconception has developed (exemplified in RFC1624) that
this artifact is part of the specification. Many people have assumed
that the checksum field should never contain the negative
representation of zero (0xffff).
A sensible receiver will calculate the checksum over the whole packet
and verify that the result is zero (in whichever representation of
zero happens to be generated by the receiver's algorithm). Such a
receiver will not care which representation of zero happens to be used
in the checksum field.
However, there are receivers in existence which will verify the
received checksum the hard way: by calculating the checksum over the
remainder of the packet and comparing the result against the checksum
field. If the representation of zero used by the receiver's algorithm
does not match the representation of zero used by the transmitter (and
so placed in the checksum field), and if the receiver does not
explicitly allow for both representations to compare as equal, then
the receiver may reject packets with a valid checksum.
For UDP, the combined RFCs effectively mandate that we should generate
only the negative representation of zero in the checksum field.
For IP, TCP and ICMP, the RFCs do not mandate which representation of
zero should be used, but the misconceptions which have grown up around
RFC1071 and RFC1624 suggest that it would be least surprising to
generate only the positive representation of zero in the checksum
field.
Fix by ensuring that all of our checksum algorithms generate only the
positive representation of zero, and explicitly inverting this in the
case of transmitted UDP packets.
Reported-by: Wissam Shoukair <wissams@mellanox.com>
Tested-by: Wissam Shoukair <wissams@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow iPXE to coexist with other USB device drivers, by attaching to
the EFI_USB_IO_PROTOCOL instances provided by the UEFI platform
firmware.
The EFI_USB_IO_PROTOCOL is an unsurprisingly badly designed
abstraction of a USB device. The poor design choices intrinsic in the
UEFI specification prevent efficient operation as a network device,
with the result that devices operated using the EFI_USB_IO_PROTOCOL
operate approximately two orders of magnitude slower than devices
operated using our native EHCI or xHCI host controller drivers.
Since the performance is so abysmally slow, and since the underlying
problems are due to fundamental architectural mistakes in the UEFI
specification, support for the EFI_USB_IO_PROTOCOL host controller
driver is left as disabled by default. Users are advised to use the
native iPXE host controller drivers instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Many UEFI NBPs expect to find an EFI_PXE_BASE_CODE_PROTOCOL installed
in addition to the EFI_SIMPLE_NETWORK_PROTOCOL. Most NBPs use the
EFI_PXE_BASE_CODE_PROTOCOL only to retrieve the cached DHCP packets.
This implementation has been tested with grub.efi, shim.efi,
syslinux.efi, and wdsmgfw.efi. Some methods (such as Discover() and
Arp()) are not used by any known NBP and so have not (yet) been
implemented.
Usage notes for the tested bootstraps are:
- grub.efi uses EFI_PXE_BASE_CODE_PROTOCOL only to retrieve the
cached DHCP packet, and uses no other methods.
- shim.efi uses EFI_PXE_BASE_CODE_PROTOCOL to retrieve the cached
DHCP packet and to retrieve the next NBP via the Mtftp() method.
If shim.efi was downloaded via HTTP (or other non-TFTP protocol)
then shim.efi will blindly call Mtftp() with an HTTP URI as the
filename: this allows the next NBP (e.g. grubx64.efi) to also be
transparently retrieved by HTTP.
shim.efi can also use the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL to
retrieve files previously loaded by "imgfetch" or similar commands
in iPXE. The current implementation of shim.efi will use the
EFI_SIMPLE_FILE_SYSTEM_PROTOCOL only if it does not find an
EFI_PXE_BASE_CODE_PROTOCOL; this patch therefore prevents this
usage of our EFI_SIMPLE_FILE_SYSTEM_PROTOCOL. This logic could be
trivially reversed in shim.efi if needed.
- syslinux.efi uses EFI_PXE_BASE_CODE_PROTOCOL only to retrieve the
cached DHCP packet. Versions 6.03 and earlier have a bug which
may cause syslinux.efi to attach to the wrong NIC if there are
multiple NICs in the system (or if the UEFI firmware supports
IPv6).
- wdsmgfw.efi (ab)uses EFI_PXE_BASE_CODE_PROTOCOL to retrieve the
cached DHCP packets, and to send and retrieve UDP packets via the
UdpWrite() and UdpRead() methods. (This was presumably done in
order to minimise the amount of benefit obtainable by switching to
UEFI, by replicating all of the design mistakes present in the
original PXE specification.)
The EFI_DOWNGRADE_UX configuration option remains available for now,
until this implementation has received more widespread testing.
Signed-off-by: Michael Brown <mcb30@ipxe.org>