SHA-224 is almost identical to SHA-256, with differing initial hash
values and a truncated output length.
This implementation has been verified using the NIST SHA-224 test
vectors.
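As a rough illustration of the two differences, the sketch below shows
the initial hash values (as published in FIPS 180-4) and the output
truncation. The helper name sha256_family_output() is hypothetical,
and the shared compression function is deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA256_DIGEST_SIZE 32
#define SHA224_DIGEST_SIZE 28

/* Initial hash values from FIPS 180-4; all other steps are shared */
static const uint32_t sha256_init[8] = {
	0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
	0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
};
static const uint32_t sha224_init[8] = {
	0xc1059ed8, 0x367cd507, 0x3070dd17, 0xf70e5939,
	0xffc00b31, 0x68581511, 0x64f98fa7, 0xbefa4fa4,
};

/* Hypothetical helper: write the state words in big-endian order,
 * keeping only the first digest_size bytes.
 */
static void sha256_family_output ( const uint32_t state[8], uint8_t *out,
				   size_t digest_size ) {
	uint8_t full[SHA256_DIGEST_SIZE];
	unsigned int i;

	assert ( digest_size <= sizeof ( full ) );
	for ( i = 0 ; i < 8 ; i++ ) {
		full[ ( 4 * i ) + 0 ] = ( ( state[i] >> 24 ) & 0xff );
		full[ ( 4 * i ) + 1 ] = ( ( state[i] >> 16 ) & 0xff );
		full[ ( 4 * i ) + 2 ] = ( ( state[i] >> 8 ) & 0xff );
		full[ ( 4 * i ) + 3 ] = ( ( state[i] >> 0 ) & 0xff );
	}
	memcpy ( out, full, digest_size );
}
```

Passing SHA224_DIGEST_SIZE discards the final state word, which is the
only change needed on the output side.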
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Update the digest self-tests to use okx(), and centralise concepts and
data shared between tests for multiple algorithms to reduce duplicated
code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
None of the x86_64 builds currently have any way of invoking these
functions. They are included only to avoid introducing unnecessary
architecture-specific dependencies into the self-test suite.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 8ab4b00 ("[libc] Rewrite setjmp() and longjmp()") introduced a
regression in which the saved values of %ebx, %esi, and %edi were all
accidentally restored into %esp. The result is that the second and
subsequent returns from setjmp() would effectively corrupt %ebx, %esi,
%edi, and the stack pointer %esp.
Use of setjmp() and longjmp() is generally discouraged: our only use
occurs as part of the implementation of PXENV_RESTART_TFTP, since the
PXE API effectively mandates its use here. The call to setjmp()
occurs at the start of pxe_start_nbp(), where there are almost
certainly no values held in %ebx, %esi, or %edi. The corruption of
these registers therefore had no visible effect on program execution.
The corruption of %esp would have been visible on return from
pxe_start_nbp(), but there are no known PXE NBPs which first call
PXENV_RESTART_TFTP and subsequently attempt to return to the PXE base
code. The effect on program execution was therefore similar to that
of moving the stack to a pseudo-random location in the 32-bit address
space; this will often allow execution to complete successfully since
there is a high chance that the pseudo-random location will be unused.
The regression therefore went undetected for around one month.
Fix by restoring the correct registers from the saved jmp_buf
structure.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
xHCI provides a somewhat convoluted mechanism for specifying details
of a transaction translator. Hubs must be marked as such in the
device slot context. The only opportunity to do so is as part of a
Configure Endpoint command, which can be executed only when opening
the hub's interrupt endpoint.
We add a mechanism for host controllers to intercept the opening of
hub devices, providing xHCI with an opportunity to update the internal
device slot structure for the corresponding USB device to indicate
that the device is a hub. We then include the hub-specific details in
the input context whenever any Configure Endpoint command is issued.
When a device is opened, we record the device slot and port for its
transaction translator (if any), and supply these as part of the
Address Device command.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Support low-speed and full-speed devices attached to a USB2 hub. Such
devices use a transaction translator (TT) within the USB2 hub, which
asynchronously initiates transactions on the lower-speed bus and
returns the result via a split completion on the high-speed bus.
We make the simplifying assumption that there will never be more than
sixteen active interrupt endpoints behind a single transaction
translator; this assumption allows us to schedule all periodic start
splits in microframe 0 and all periodic split completions in
microframes 2 and 3. (We do not handle isochronous endpoints.)
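The fixed schedule can be sketched as a pair of microframe bitmasks.
The macro names are hypothetical, and the packing (start mask in bits
0-7, completion mask in bits 8-15) is an assumption modelled on the
EHCI queue head layout rather than a copy of the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per microframe (0-7) within a frame */
#define UFRAME( n ) ( 1U << ( n ) )

/* Start splits in microframe 0, split completions in 2 and 3 */
#define SPLIT_START_MASK	( UFRAME ( 0 ) )
#define SPLIT_COMPLETE_MASK	( UFRAME ( 2 ) | UFRAME ( 3 ) )

/* Assumed packing: start mask in bits 0-7, completion in bits 8-15 */
static uint32_t split_schedule ( void ) {
	return ( SPLIT_START_MASK | ( SPLIT_COMPLETE_MASK << 8 ) );
}
```

Every endpoint behind the translator shares this one schedule, which
is why the assumption about the number of endpoints matters.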
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The current endpoint reset logic defers the reset until the caller
attempts to enqueue a new transfer to that endpoint. This is
insufficient when dealing with endpoints behind a transaction
translator, since the transaction translator is a resource shared
between multiple endpoints.
We cannot reset the endpoint as part of the completion handling, since
that would introduce recursive calls to usb_poll(). Instead, we
add the endpoint to a list of halted endpoints, and perform the reset
on the next call to usb_step().
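A minimal sketch of the deferred reset, using hypothetical structure
and function names (the real code uses the iPXE list primitives and
performs an actual endpoint reset):

```c
#include <assert.h>
#include <stddef.h>

struct endpoint {
	struct endpoint *next_halted;	/* Next entry in halted list */
	int halted;			/* Endpoint is awaiting reset */
	unsigned int resets;		/* Stand-in for the real reset */
};

static struct endpoint *halted_list;

/* Called from completion handling: record the halt, do not reset */
static void endpoint_halt ( struct endpoint *ep ) {
	assert ( ep != NULL );
	if ( ep->halted )
		return;
	ep->halted = 1;
	ep->next_halted = halted_list;
	halted_list = ep;
}

/* Called from the step function, outside any completion handler */
static void step_reset_halted ( void ) {
	struct endpoint *ep;

	while ( ( ep = halted_list ) != NULL ) {
		halted_list = ep->next_halted;
		ep->next_halted = NULL;
		ep->halted = 0;
		ep->resets++;
	}
}
```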
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The endpoint may already have enqueued TRBs at the time that
xhci_endpoint_reset() is called. Ring the doorbell to resume
processing these TRBs immediately, rather than waiting until the next
call to xhci_endpoint_message() or xhci_endpoint_stream().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several of the USB timeouts were chosen on the principle of "pick an
arbitrary but ridiculously large value, just to be safe". It turns
out that some of the timeouts permitted by the USB specification are
even larger: for example, control transactions are allowed to take up
to five seconds to complete.
Fix up these USB timeout values to match those found in the USB2
specification.
Debugged-by: Robin Smidsrød <robin@smidsrod.no>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
xHCI (and EHCI) nominally provide a mechanism for releasing ownership
of the host controller back to the BIOS, which can then potentially
restore legacy USB keyboard functionality.
This is a rarely used code path, since most operating systems claim
ownership and never attempt to later return to the BIOS. On some
systems (observed with a Lenovo X1 Carbon), this code path leads to
obscure and interesting bugs: if the xHCI and EHCI controllers are
both claimed and later released back to the BIOS, then a subsequent
call to INT 16,0305 to set the keyboard repeat rate to a non-default
value will lock the system.
Obscure though this sequence of operations may sound, it is exactly
what happens when using iPXE to boot a Linux kernel via a USB network
card. There is old and probably unwanted code in Linux's
arch/x86/boot/main.c which sets the keyboard repeat rate (with the
accompanying comment "Set keyboard repeat rate (why?)"). When booting
Linux via a USB network card on a Lenovo X1 Carbon, the system
therefore locks up immediately after jumping to the kernel's entry
point.
Work around this problem by preventing the release of ownership back
to the BIOS if it is known that we are shutting down to boot an OS.
This should allow legacy USB keyboard functionality to be restored if
the user chooses to exit iPXE, while avoiding the rarely used code
paths (and corresponding BIOS bugs) if the user chooses instead to
boot an OS.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When using iPXE as an option ROM for a PCI USB controller (e.g. via
qemu's "-device nec-usb-xhci,romfile=..." syntax), the ROM prefix will
set the PCI bus:dev.fn address of the USB controller as the PCI
autoboot device. This will cause iPXE to fail to boot from any
detected USB network devices, since they will not match the autoboot
bus type (or location).
Fix by allowing the autoboot bus type and location to match against
the network device or any of its parent devices. This allows the
match to succeed for USB network devices attached to the selected PCI
USB controller.
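The matching rule amounts to a walk up the device hierarchy. The
following is a sketch only; the structure layout and the function name
are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct device {
	unsigned int bus_type;		/* e.g. PCI or USB */
	unsigned int location;		/* Bus-specific location */
	struct device *parent;		/* Parent device, or NULL */
};

/* Return non-zero if the device or any ancestor matches */
static int autoboot_matches ( const struct device *dev,
			      unsigned int bus_type,
			      unsigned int location ) {
	for ( ; dev != NULL ; dev = dev->parent ) {
		if ( ( dev->bus_type == bus_type ) &&
		     ( dev->location == location ) )
			return 1;
	}
	return 0;
}
```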
Reported-by: Dan Ellis <Dan.Ellis@displaylink.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
If the BIOS fails to gracefully release ownership of the xHCI
controller, we can forcibly claim it by disabling all SMIs via the
USB legacy support control/status register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RX FIFO overflow is almost inevitable since the (usable) USB2 bus
bandwidth is approximately one quarter of the Ethernet bandwidth.
Avoid flooding the console with RX FIFO overflow messages in a
standard debug build.
With TCP SACK implemented, the RX FIFO overflow no longer causes a
catastrophic drop in throughput. Experimentation shows that HTTP
downloads now progress at a fairly smooth 250Mbps, which is around the
maximum speed attainable for a USB2 NIC.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The TCP Selective Acknowledgement option (specified in RFC2018)
provides a mechanism for the receiver to indicate packets that have
been received out of order (e.g. due to earlier dropped packets).
iPXE often operates in environments in which there is a high
probability of packet loss. For example, the legacy USB keyboard
emulation in some BIOSes involves polling the USB bus from within a
system management interrupt: this introduces an invisible delay of
around 500us which is long enough for around 40 full-length packets to
be dropped. Similarly, almost all 1Gbps USB2 devices will eventually
end up dropping packets because the USB2 bus does not provide enough
bandwidth to sustain a 1Gbps stream, and most devices will not provide
enough internal buffering to hold a full TCP window's worth of
received packets.
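The figure of around 40 packets can be checked with a short
calculation. This sketch assumes a 1Gbps link and counts 1538 bytes on
the wire per full-length frame (1500 payload, 14 header, 4 FCS, 8
preamble, 12 inter-frame gap); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed on-wire size of a full-length Ethernet frame */
#define WIRE_BYTES 1538

/* Number of whole frames that arrive during a stall */
static unsigned int packets_during_stall ( uint64_t link_bps,
					   unsigned int stall_us ) {
	uint64_t bits = ( ( link_bps * stall_us ) / 1000000 );

	return ( ( unsigned int ) ( bits / ( WIRE_BYTES * 8 ) ) );
}
```

For a 500us stall at 1Gbps this gives 500000 bits, or 40 whole frames.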
Add support for sending TCP Selective Acknowledgements. This provides
the sender with more detailed information about which packets have
been lost, and so allows for a more efficient retransmission strategy.
We include a SACK-permitted option in our SYN packet, since
experimentation shows that at least Linux peers will not include a
SACK-permitted option in the SYN-ACK packet if one was not present in
the initial SYN. (RFC2018 does not seem to mandate this behaviour,
but it is consistent with the approach taken in RFC1323.) We ignore
any received SACK options; this is safe to do since SACK is only ever
advisory and we never have to send non-trivial amounts of data.
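For reference, the two option encodings defined by RFC2018 can be
sketched as follows. The function and structure names are hypothetical
and this is not the code added by this commit; only the option kinds,
lengths, and byte order are taken from the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_OPTION_SACK_PERMITTED 4	/* Kind 4, length 2 */
#define TCP_OPTION_SACK 5		/* Kind 5, length 2 + 8n */

struct sack_block {
	uint32_t left;	/* First sequence number in block */
	uint32_t right;	/* Sequence number following block */
};

static uint8_t * put_be32 ( uint8_t *p, uint32_t value ) {
	p[0] = ( ( value >> 24 ) & 0xff );
	p[1] = ( ( value >> 16 ) & 0xff );
	p[2] = ( ( value >> 8 ) & 0xff );
	p[3] = ( ( value >> 0 ) & 0xff );
	return ( p + 4 );
}

/* Write a SACK-permitted option (sent in SYN); returns its length */
static size_t tcp_sack_permitted ( uint8_t *buf ) {
	buf[0] = TCP_OPTION_SACK_PERMITTED;
	buf[1] = 2;
	return 2;
}

/* Write a SACK option holding 1-4 blocks; returns its length */
static size_t tcp_sack ( uint8_t *buf, const struct sack_block *blocks,
			 unsigned int count ) {
	uint8_t *p = buf;
	unsigned int i;

	assert ( ( count >= 1 ) && ( count <= 4 ) );
	*(p++) = TCP_OPTION_SACK;
	*(p++) = ( ( uint8_t ) ( 2 + ( 8 * count ) ) );
	for ( i = 0 ; i < count ; i++ ) {
		p = put_be32 ( p, blocks[i].left );
		p = put_be32 ( p, blocks[i].right );
	}
	return ( ( size_t ) ( p - buf ) );
}
```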
Since our TCP receive queue is a candidate for cache discarding under
low memory conditions, we may end up discarding data that has been
reported as received via a SACK option. This is permitted by RFC2018.
We follow the stricture that SACK blocks must not report data which is
no longer held by the receiver: previously-reported blocks are
validated against the current receive queue before being included
within the current SACK block list.
Experiments in a qemu VM using forced packet drops (by setting
NETDEV_DISCARD_RATE to 32) show that implementing SACK improves
throughput by around 400%.
Experiments with a USB2 NIC (an SMSC7500) show that implementing SACK
improves throughput roughly sevenfold, increasing the download rate
from 35Mbps up to 250Mbps (which is approximately the usable bandwidth
limit for USB2).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Several of the assembly files in arch/i386/prefix were missed by the
automated relicensing tool due to missing licence declarations, code
dating back to the initial git revision, etc. Manual review shows
that these files may be relicensed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This driver is functional but any downloads via a TCP-based protocol
tend to perform poorly. The 1Gbps Ethernet line rate is substantially
higher than the 480Mbps (in practice around 280Mbps) provided by USB2,
and the device has only 32kB of internal buffer memory. Our 256kB TCP
receive window therefore rapidly overflows the RX FIFO, leading to
multiple dropped packets (usually within the same TCP window) and
hence a low overall throughput.
Reducing the TCP window size so that the RX FIFO does not overflow
greatly increases throughput, but is not a general-purpose solution.
Further investigation is required to determine how other OSes
(e.g. Linux) cope with this scenario. It is possible that
implementing TCP SACK would provide some benefit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most devices expose at least the link up/down status via a bit in a
MAC register, since the MAC generally already needs to know whether or
not the link is up. Some devices (e.g. the SMSC75xx USB NIC) expose
this information to software only via the MII registers.
Provide a generic mii_check_link() implementation to check the BMSR
and report the link status via netdev_link_{up,down}().
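The check itself is small. The sketch below uses the standard MII
register number and link status bit, but the structure, the function
names, and the fake register read are all stand-ins: the real
implementation reads via the MII interface and calls
netdev_link_{up,down}() rather than setting a flag.

```c
#include <assert.h>
#include <stddef.h>

#define MII_BMSR 0x01		/* Basic mode status register */
#define BMSR_LSTATUS 0x0004	/* Link status */

struct mii {
	int ( * read ) ( struct mii *mii, unsigned int reg );
	int link_up;		/* Stand-in for netdev link state */
};

/* Read BMSR and record the link state; returns 0 or a read error */
static int mii_check_link_sketch ( struct mii *mii ) {
	int bmsr;

	assert ( mii->read != NULL );
	bmsr = mii->read ( mii, MII_BMSR );
	if ( bmsr < 0 )
		return bmsr;
	mii->link_up = ( !! ( bmsr & BMSR_LSTATUS ) );
	return 0;
}

/* Fake register access, standing in for real hardware */
static int fake_bmsr;
static int fake_read ( struct mii *mii, unsigned int reg ) {
	( void ) mii;
	return ( ( reg == MII_BMSR ) ? fake_bmsr : -1 );
}
```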
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Microsoft IIS supports only MD5-sess for Digest authentication.
Requested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE already sends RX notifications to the backend when needed, but
does not set the "feature-rx-notify" flag. As of XenServer 6.5, this
flag is mandatory and omitting it will cause the backend to fail.
Fix by setting the "feature-rx-notify" flag, to inform the backend
that we will send notifications.
Reported-by: Shalom Bhooshi <shalom.bhooshi@citrix.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Restore the original values of XUSB2PR and USB3PSSEN, in case we are
booting an OS with no support for xHCI.
Suggested-by: Dan Ellis <Dan.Ellis@displaylink.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Intel PCH controllers default to routing USB2 ports to EHCI rather
than xHCI, and default to disabling SuperSpeed connections.
Manipulate the PCI configuration space registers as necessary to
reroute ports and enable SuperSpeed.
Originally-fixed-by: Dan Ellis <Dan.Ellis@displaylink.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Relicense files with kind permission from
Stefan Hajnoczi <stefanha@redhat.com>
alongside the contributors who have already granted such relicensing
permission.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Rewrite (and relicense) the header files which are included in all
builds of iPXE (including non-Linux builds).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At some point in the past few years, binutils became more aggressive
at removing unused symbols. To function as a symbol requirement, a
relocation record must now be in a section marked with @progbits and
must not be in a section which gets discarded during the link (either
via --gc-sections or via /DISCARD/).
Update REQUIRE_SYMBOL() to generate relocation records meeting these
criteria. To minimise the impact upon the final binary size, we use
existing symbols (specified via the REQUIRING_SYMBOL() macro) as the
relocation targets where possible. We use R_386_NONE or R_X86_64_NONE
relocation types to prevent any actual unwanted relocation taking
place. Where no suitable symbol exists for REQUIRING_SYMBOL() (such
as in config.c), the macro PROVIDE_REQUIRING_SYMBOL() can be used to
generate a one-byte-long symbol to act as the relocation target.
If there are versions of binutils for which this approach fails, then
the fallback will probably involve killing off REQUEST_SYMBOL(),
redefining REQUIRE_SYMBOL() to use the current definition of
REQUEST_SYMBOL(), and postprocessing the linked ELF file with
something along the lines of "nm -u | wc -l" to check that there are
no undefined symbols remaining.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The valgrind headers are not x86-specific; they detect the CPU
architecture and contain inline assembly for multiple architectures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Unregistering a child settings block can have almost arbitrary
effects, due to the call to apply_settings(). Avoid potentially
dereferencing a stale pointer by using list_first_entry() rather than
list_for_each_entry_safe() to iterate over the list of child settings.
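The difference between the two iteration styles can be sketched with a
self-contained miniature list. The names below are simplified
stand-ins for the real list.h and settings code; the point is that the
loop re-reads the list head on every pass instead of caching a "next"
pointer that an arbitrary side effect might have freed.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init ( struct list_head *list ) {
	list->next = list->prev = list;
}
static void list_add_tail ( struct list_head *new,
			    struct list_head *head ) {
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}
static void list_del ( struct list_head *entry ) {
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

struct settings {
	struct list_head siblings;
	struct list_head children;
	int registered;
};

static void settings_init ( struct settings *settings ) {
	list_init ( &settings->siblings );
	list_init ( &settings->children );
	settings->registered = 1;
}
static void settings_add ( struct settings *parent,
			   struct settings *child ) {
	list_add_tail ( &child->siblings, &parent->children );
}

/* Equivalent of list_first_entry(): NULL if the list is empty */
static struct settings * first_child ( struct settings *parent ) {
	struct list_head *entry = parent->children.next;

	if ( entry == &parent->children )
		return NULL;
	return ( ( struct settings * )
		 ( ( char * ) entry -
		   offsetof ( struct settings, siblings ) ) );
}

static void settings_remove ( struct settings *settings ) {
	struct settings *child;

	/* Always restart from the head: no stale "next" pointer */
	while ( ( child = first_child ( settings ) ) != NULL )
		settings_remove ( child );
	list_del ( &settings->siblings );
	settings->registered = 0;
}
```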
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The code in list.h was originally taken from the Linux kernel many
years ago, but has been rewritten to the point that no original code
remains, and may therefore be relicensed.
The functions and data structures remain largely API-compatible, to
facilitate the conversion of Linux network drivers to iPXE.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
These files cannot be automatically relicensed by util/relicense.pl
since they either contain unusual but trivial contributions (such as
the addition of __nonnull function attributes), or contain lines
dating back to the initial git revision (and so require manual
knowledge of the code's origin).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Relicense files with kind permission from the following contributors:
Alex Williamson <alex.williamson@redhat.com>
Eduardo Habkost <ehabkost@redhat.com>
Greg Jednaszewski <jednaszewski@gmail.com>
H. Peter Anvin <hpa@zytor.com>
Marin Hannache <git@mareo.fr>
Robin Smidsrød <robin@smidsrod.no>
Shao Miller <sha0.miller@gmail.com>
Thomas Horsten <thomas@horsten.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Relicense files authored by Dan Lynch while working as an employee of
Fen Systems Ltd., with permission from Fen Systems Ltd.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UBDL relicensing tool (util/relicense.pl) is designed to identify
files which may be relicensed under a dual GPL+UBDL licence. It uses
git-blame to identify the author of each line (using the -M and -C
options to track lines moved or copied between files), and relicenses
files for which all authors have given permission.
The relicensing tool will ignore certain types of lines identified by
git-blame:
- empty lines
- comments
- standalone opening or closing braces
- "#include ..."
- "return 0;"
- "return rc;"
- "PCI_ROM(...)"
- "FILE_LICENCE(...)"
These lines either contain no meaningful content (e.g. empty lines),
contain only non-copyrightable facts (e.g. PCI ROM IDs) or are
sufficiently common within the codebase that git-blame is likely to
misattribute their origin (e.g. "return 0").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the text for the Unmodified Binary Distribution Licence. This
Licence allows for the distribution of unmodified binaries built from
publicly available source code, without imposing the obligations of
the GNU General Public License upon anyone who chooses to distribute
only the unmodified binaries built from that source code. See the
licence text for the precise terms and conditions.
Add the licence GPL2_OR_LATER_OR_UBDL to the set of licences which can
be declared using FILE_LICENCE(), and add the corresponding support to
licence.pl.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the standard warranty disclaimer and Free Software Foundation
address paragraphs to the licence text where these are not currently
present.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The code in lzma_literal() checks to see if we are at the start of the
compressed input data in order to determine whether or not a most
recent output byte exists. This check is incorrect, since
initialisation of the decompressor will always consume the first five
bytes of the compressed input data.
Fix by instead checking whether or not we are at the start of the
output data stream. This is, in any case, a more logical check.
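The corrected check can be sketched as below. The structure and
function names are hypothetical; the only behaviour taken from the
description above is that the test is made against the output stream
position rather than the input position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct decompressor {
	const uint8_t *out_start;	/* Start of output stream */
	uint8_t *out;			/* Current output position */
};

/* Most recent output byte, or zero if nothing has been written */
static uint8_t most_recent_byte ( const struct decompressor *d ) {
	assert ( d->out >= d->out_start );
	return ( ( d->out == d->out_start ) ? 0 : d->out[-1] );
}
```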
This issue was masked during development and testing since virtual
machines tend to zero the initial contents of RAM; the spuriously-read
"most recent output byte" is therefore likely to already be a zero
when running in a virtual machine.
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The 0xe9 debug port exists only on virtual machines. Provide an
option to print debug output on the BIOS console, to allow for
debugging on real hardware.
Note that this option can be used only if the decompressor is called
in flat real mode; the easiest way to achieve this is to build with
DEBUG=libprefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow the decompressor the option of generating debugging output via
the BIOS console by calling it in flat real mode (rather than 16-bit
protected mode) when libprefix.S is built with debugging enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
LZMA performs an extra normalisation after decompression is complete,
which does not affect the output but may consume an extra byte from
the input (and so may affect which byte is identified as being the
start of the next block).
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE uses DHCP timeouts loosely based on values recommended by the
specification, but often abbreviated to reduce timeouts for reliable
and/or simple network topologies. Extract the DHCP timing parameters
to config/dhcp.h and document them. The resulting default iPXE
behavior is exactly the same, but downstreams are now afforded the
opportunity to implement spec-compliant behavior via config file
overrides.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
LZMA provides significantly better compression (by ~15%) than the
current NRV2B algorithm.
We use a raw LZMA stream (aka LZMA1) to avoid the need for code to
parse the LZMA2 block headers. We use parameters {lc=2,lp=0,pb=0} to
reduce the stack space required by the decompressor to acceptable
levels (around 8kB). Using lc=3 or pb=2 would give marginally better
compression, but at the cost of substantially increasing the required
stack space.
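The effect of lc on memory use can be estimated from the literal
coder alone, which in LZMA needs 0x300 probability estimates per
literal context. This sketch assumes 16-bit estimates, covers only the
literal portion of the decompressor state, and uses a hypothetical
function name; the remaining tables (some of which scale with pb) are
not counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Probability estimates per literal context */
#define LZMA_LITERAL_CODER_SIZE 0x300

/* Bytes needed for the literal probability tables alone */
static size_t lzma_literal_bytes ( unsigned int lc, unsigned int lp ) {
	assert ( ( lc <= 8 ) && ( lp <= 4 ) );
	return ( ( ( size_t ) LZMA_LITERAL_CODER_SIZE << ( lc + lp ) ) *
		 sizeof ( uint16_t ) );
}
```

With lc=2 and lp=0 the literal tables occupy 6144 bytes; moving to
lc=3 doubles this to 12288 bytes.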
The build process now requires the liblzma headers to be present on
the build system, since we do not include a copy of an LZMA compressor
within the iPXE source tree. The decompressor is written from scratch
(based on XZ Embedded) and is entirely self-contained within the
iPXE source.
The branch-call-jump (BCJ) filter used to improve the compressibility
is specific to iPXE. We choose not to use liblzma's built-in BCJ
filter since the algorithm is complex and undocumented. Our BCJ
filter achieves approximately the same results (on typical iPXE
binaries) with a substantially simpler algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some decompression algorithms (e.g. LZMA) require large amounts of
temporary stack space, which may not be made available by all
prefixes. Use .bss16 as a temporary stack for the duration of the
calls to install_block (switching back to the external stack before we
start making calls into code which might access variables in .bss16),
and allow the decompressor to define a global symbol to force a
minimum value on the size of .bss16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Other hypervisors (e.g. KVM) may provide an unusable subset of the
Hyper-V features, and our attempts to use these non-existent features
cause the guest to reboot.
Fix by explicitly checking for the Hyper-V features that we use.
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The implementation of strtoul() has a partially unknown provenance.
Rewrite this code to avoid potential licensing uncertainty.
Since we now use -ffunction-sections, there is no need to place
strtoull() in a separate file from strtoul().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The implementation of inet_aton() has an unknown provenance. Rewrite
this code to avoid potential licensing uncertainty.
Also move the code from core/misc.c to its logical home in net/ipv4.c,
and add a few extra test cases.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When a command times out, abort it (via the Command Abort bit in the
Command Ring Control Register) so that subsequent commands may execute
as expected.
This improves robustness when a device fails to respond to the Set
Address command, since the subsequent Disable Slot command will now
succeed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
If the Disable Slot command fails then the hardware may continue to
write to the slot context. Leak the memory used by the slot context
to avoid future memory corruption.
This situation has been observed in practice when a Set Address
command fails, causing the command ring to become temporarily
unresponsive.
Note that there is no need to similarly leak memory on the failure
path in xhci_device_open(), since in the event of a failure the
hardware is never informed of the slot context address.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The 8254 timer code (used to implement udelay()) has an unknown
provenance. Rewrite this code to avoid potential licensing
uncertainty.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As with memcpy(), we can reduce the code size (by an average of 0.2%)
by giving the compiler more visibility into what memset() is doing,
and by avoiding the "rep" prefix on short fixed-length sequences of
string operations.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some of the C library string functions have an unknown provenance.
Reimplement all such functions to avoid potential licensing
uncertainty.
Remove the inline-assembler versions of strlen(), memswap(), and
strncmp(); these save a minimal amount of space (around 40 bytes in
total) and are not performance-critical.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Provide a generic framework for allocating, refilling, and optionally
recycling I/O buffers used by bulk IN and interrupt endpoints.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Hardened versions of gcc default to building position-independent
code, which breaks our i386 build. Our build process therefore
detects such platforms and automatically adds "-fno-PIE -nopie" to the
gcc command line.
On x86_64, we choose to build position-independent code (in order to
reduce the final binary size and, in particular, the number of
relocations required for UEFI binaries). The workaround therefore
breaks the build process for x86_64 binaries on such platforms.
Fix by moving the workaround to the i386-specific portion of the
Makefile.
Reported-by: Jan Kundrát <jkt@kde.org>
Debugged-by: Jan Kundrát <jkt@kde.org>
Debugged-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
UEFI binaries may be relocated to any location within the 64-bit
address space. We compile as position-independent code with hidden
visibility, which should force all relocation records to be either
PC-relative (in which case no PE relocations are required) or full
64-bit relocations. There should be no R_X86_64_32 relocation
records, since that would imply an invalid assumption that code could
not be relocated above 4GB.
Remove support for R_X86_64_32 relocation records from util/elf2efi.c,
so that any such records result in a build failure rather than a
potential runtime failure.
Reported-by: Jan Kundrát <jkt@kde.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When building hvmloader for Xen tools, the iPXE objects are also
linked into the binary. The linker places them in the order found in
the archive. Since this order is not deterministic, the resulting
hvmloader binary can differ between build hosts even when built from
identical sources. To make the binary reproducible, sort the elements
of blib.a before passing them to $(AR).
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit a60f2dd ("[usb] Try multiple USB device configurations")
changed the behaviour of register_usb() such that if no drivers are
found then the device will be closed and the memory used will be
freed.
If a port status change subsequently occurs while the device is still
physically attached, then usb_hotplug() will see this as a new device
having been attached, since there is no device recorded as being
currently attached to the port. This can lead to spurious hotplug
events (or even endless loops of hotplug events, if the process of
opening and closing the device happens to generate a port status
change).
Fix by using a separate flag to indicate that a device is physically
attached (even if we have no corresponding struct usb_device).
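A sketch of the idea, with hypothetical names: the hotplug decision
depends only on the attached flag, so a port whose device had no
usable driver (and therefore no device structure) is not mistaken for
an empty port.

```c
#include <assert.h>
#include <stddef.h>

enum hotplug_event { HOTPLUG_NONE, HOTPLUG_ATTACHED, HOTPLUG_DETACHED };

struct usb_port {
	int attached;	/* A device is physically attached */
	void *usb;	/* Device structure, or NULL if no driver */
};

/* Decide what a port status change means for this port */
static enum hotplug_event port_hotplug ( struct usb_port *port,
					 int connected ) {
	assert ( port != NULL );
	if ( connected && ( ! port->attached ) ) {
		port->attached = 1;
		return HOTPLUG_ATTACHED;
	}
	if ( ( ! connected ) && port->attached ) {
		port->attached = 0;
		port->usb = NULL;
		return HOTPLUG_DETACHED;
	}
	return HOTPLUG_NONE;
}
```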
Reported-by: Dan Ellis <Dan.Ellis@displaylink.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use PRODUCT_SHORT_NAME instead of a hardcoded "iPXE" for strings which
are typically shown in the user interface.
Note that this only allows for customisation of the user interface.
Where the "iPXE" string serves a technical purpose (such as in the
HTTP User-Agent), the string cannot be customised.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some xHCI controllers (observed with a Renesas Electronics PCIe USB3
card) seem to require a delay after forcing the link state of USB3
ports to RxDetect. Omitting this delay causes strange behaviour
including system lockups.
Add an unconditional 20ms delay after writing the port link states.
This seems to be sufficient to avoid the problem.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some USB endpoints require that a short packet be used to terminate
transfers, since they have no other way to determine message
boundaries. If the message length happens to be an exact multiple of
the USB packet size, then this requires the use of an additional
zero-length packet.
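The rule can be expressed as a packet count. This is an illustrative
sketch with a hypothetical function name, not the driver code:

```c
#include <assert.h>
#include <stddef.h>

/* Number of packets needed for a message, including any
 * terminating zero-length packet.
 */
static size_t usb_packet_count ( size_t len, size_t mtu, int terminate ) {
	size_t count;

	assert ( mtu > 0 );
	count = ( ( len + mtu - 1 ) / mtu );
	/* A message ending exactly on a packet boundary contains no
	 * short packet, so append a zero-length packet to mark its end.
	 */
	if ( terminate && ( ( len % mtu ) == 0 ) )
		count++;
	return count;
}
```

A 1024-byte message with a 512-byte packet size therefore needs three
packets when termination is required, while a 1000-byte message needs
only two since its final packet is already short.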
Signed-off-by: Michael Brown <mcb30@ipxe.org>
USB Communications Device Class devices may use a union functional
descriptor to group several interfaces into a function.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Iterate over a USB device's available configurations until we find one
for which we have working drivers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some protocols (such as ARP) may modify the received packet and re-use
the same I/O buffer for transmission of a reply. To allow this,
reserve sufficient headroom at the start of each received packet
buffer for our transmit datapath headers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some devices return multiple packets in a single poll. Handle such
devices gracefully by enqueueing received PXE UDP packets (along with
a pseudo-header to hold the IPv4 addresses and port numbers) and
dequeueing them on subsequent calls to PXENV_UDP_READ.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Fetching the TFTP file size is currently implemented via a custom
"tftpsize://" protocol hack. Generalise this approach to instead
close the TFTP connection whenever the parent data-transfer interface
is closed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some devices have a very small number of internal buffers, and rely on
being able to pack multiple packets into each buffer. Using 2048-byte
buffers on such devices produces throughput of around 100Mbps. Using
a small number of much larger buffers (e.g. 32kB) increases the
throughput to around 780Mbps. (The full 1Gbps is not reached because
the high RTT induced by the use of multi-packet buffers causes us to
saturate our 256kB TCP window.)
Since allocation of large buffers is very likely to fail, allocate the
buffer set only once when the device is opened and recycle buffers
immediately after use. Received data is now always copied to
per-packet buffers.
If allocation of large buffers fails, fall back to allocating a larger
number of smaller buffers. This will give reduced performance, but
the device will at least still be functional.
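The fallback strategy can be sketched as follows. The buffer sizes
come from the description above; the buffer counts (4 and 16), the
structure, and the function names are assumptions, and the allocator
is passed in so that a failing large allocation can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BUFS 16

struct rx_buffers {
	void *buf[MAX_BUFS];
	unsigned int count;
	size_t len;
};

static void rx_free ( struct rx_buffers *rx ) {
	while ( rx->count )
		free ( rx->buf[ --rx->count ] );
}

/* Allocate a complete buffer set, or nothing at all */
static int rx_alloc ( struct rx_buffers *rx, unsigned int count,
		      size_t len, void * ( * alloc ) ( size_t len ) ) {
	assert ( count <= MAX_BUFS );
	rx->count = 0;
	rx->len = len;
	while ( rx->count < count ) {
		rx->buf[rx->count] = alloc ( len );
		if ( ! rx->buf[rx->count] ) {
			rx_free ( rx );
			return -1;
		}
		rx->count++;
	}
	return 0;
}

/* Prefer a few large buffers; fall back to many small buffers */
static int rx_open ( struct rx_buffers *rx,
		     void * ( * alloc ) ( size_t len ) ) {
	if ( rx_alloc ( rx, 4, 32768, alloc ) == 0 )
		return 0;
	return rx_alloc ( rx, 16, 2048, alloc );
}

/* Test allocator that refuses large requests */
static void * small_only_alloc ( size_t len ) {
	return ( ( len > 4096 ) ? NULL : malloc ( len ) );
}
```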
Share code between the interrupt and bulk IN endpoint handlers, since
the buffer handling is now very similar.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow drivers to specify a supported PCI class code. To save space in
the final binary, make this an attribute of the driver rather than an
attribute of a PCI device ID list entry.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We require the ability to disconnect from and reconnect to VMBus; if
we don't have this then there is no (viable) way for a loaded
operating system to continue to use any VMBus devices. (There is also
a small but non-zero risk that the host will continue to write to our
interrupt and monitor pages, since the VMBUS_UNLOAD message in earlier
versions is essentially a no-op.)
This requires us to ensure that the host supports protocol version 3.0
(VMBUS_VERSION_WIN8_1). However, we can't actually _use_ protocol
version 3.0, since doing so causes an iSCSI-booted Windows Server 2012
R2 VM to crash due to a NULL pointer dereference in vmbus.sys.
To work around this problem, we first ensure that we can connect using
protocol v3.0, then disconnect and reconnect using the oldest known
protocol.
This deliberately prevents the use of the iPXE native Hyper-V drivers
on older versions of Hyper-V, where we could use our drivers but in so
doing would break the loaded operating system.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Windows Server 2012 R2 generates an RNDIS_INDICATE_STATUS_MSG with a
status code of 0x40020006. This status code does not appear to be
documented anywhere within the sphere of human knowledge.
Explicitly ignore this status code in order to avoid unnecessarily
cluttering the display when RNDIS debugging is enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The (undocumented) VMBus protocol seems to allow for transfer
page-based packets where the data payload is split into an arbitrary
set of ranges within the transfer page set.
The RNDIS protocol includes a length field within the header of each
message, and it is known from observation that multiple RNDIS messages
can be concatenated into a single VMBus message.
iPXE currently assumes that the transfer page range boundaries are
entirely arbitrary, and uses the RNDIS header length to determine the
RNDIS message boundaries.
Windows Server 2012 R2 generates an RNDIS_INDICATE_STATUS_MSG for an
undocumented and unknown status code (0x40020006) with a malformed
RNDIS header length: the length does not cover the StatusBuffer
portion of the message. This causes iPXE to report a malformed RNDIS
message and to discard any further RNDIS messages within the same
VMBus message.
The Linux Hyper-V driver assumes that the transfer page range
boundaries correspond to RNDIS message boundaries, and so does not
notice the malformed length field in the RNDIS header.
Match the behaviour of the Linux Hyper-V driver: assume that the
transfer page range boundaries correspond to the RNDIS message
boundaries and ignore the RNDIS header length. This avoids triggering
the "malformed packet" error and also avoids unnecessary data copying:
since we now have one I/O buffer per RNDIS message, there is no longer
any need to use iob_split().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Empirical observation suggests that 32 is a sensible size to minimise
the number of deferred packet transmissions without overflowing the
VMBus transmit ring buffer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for elision of transmitted TCP ACKs by handling all received
VMBus messages in each network device poll operation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On Windows Server 2012 R2, closing and reopening the device will
sometimes result in a non-functional RX datapath. The root cause is
unknown. Clearing the receive filter before closing the device seems
to fix the problem.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On Windows Server 2012 R2, the receive buffer teardown completion
message seems to occasionally be deferred until after the VMBus
channel has been closed. This happens even if there are no packets
currently in the receive buffer.
Work around this problem by separating the revocation and teardown of
the receive buffer, and deferring the teardown until after the VMBus
channel has been closed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Hyper-V RNDIS implementation on Windows Server 2012 R2 requires
that we send an explicit RNDIS initialisation message in order to get
a working RX datapath.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RNDIS devices may provide multiple packets encapsulated into a single
message. Provide an API to allow the RNDIS driver to split an I/O
buffer into smaller portions.
The current implementation will always copy the underlying data,
rather than splitting the buffer in situ.
Signed-off-by: Michael Brown <mcb30@ipxe.org>