Commit Graph

1244 Commits (60531ff6e25d363f36f843cae29a95031aac2147)

Author SHA1 Message Date
Michael Brown 2b6b02ee7e [tls] Use intf_insert() to add TLS to an interface
Restructure the use of add_tls() to insert a TLS filter onto an
existing interface.  This allows for the possibility of using
add_tls() to start TLS on an existing connection (as used in several
protocols which will negotiate the choice to use TLS before the
ClientHello is sent).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-12-07 13:51:46 +00:00
Michael Brown a2e5cf1a3f [netdevice] Fix misleading comment on netdev_rx()
Unlike netdev_rx_err(), there is no valid circumstance under which
netdev_rx() may be called with a null I/O buffer, since a call to
netdev_rx() represents the successful reception of a packet.  Fix the
code comment to reflect this.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-29 11:25:40 +00:00
Michael Brown 9ff61ab28d [netdevice] Do not attempt to unmap a null I/O buffer
netdev_tx_err() may be called with a null I/O buffer (e.g. to record a
transmit error with no associated buffer).  Avoid a potential null
pointer dereference in the DMA unmapping code path.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-29 11:25:40 +00:00
Michael Brown 8d337ecdae [dma] Move I/O buffer DMA operations to iobuf.h
Include a potential DMA mapping within the definition of an I/O
buffer, and move all I/O buffer DMA mapping functions from dma.h to
iobuf.h.  This avoids the need for drivers to maintain a separate list
of DMA mappings for each I/O buffer that they may handle.

Network device drivers typically do not keep track of transmit I/O
buffers, since the network device core already maintains a transmit
queue.  Drivers will typically call netdev_tx_complete_next() to
complete a transmission without first obtaining the relevant I/O
buffer pointer (and will rely on the network device core automatically
cancelling any pending transmissions when the device is closed).

To allow this driver design approach to be retained, update the
netdev_tx_complete() family of functions to automatically perform the
DMA unmapping operation if required.  For symmetry, also update the
netdev_rx() family of functions to behave the same way.

As a further convenience for drivers, allow the network device core to
automatically perform DMA mapping on the transmit datapath before
calling the driver's transmit() method.  This avoids the need to
introduce a mapping error handling code path into the typically
error-free transmit methods.

With these changes, the modifications required to update a typical
network device driver to use the new DMA API are fairly minimal:

- Allocate and free descriptor rings and similar coherent structures
  using dma_alloc()/dma_free() rather than malloc_phys()/free_phys()

- Allocate and free receive buffers using alloc_rx_iob()/free_rx_iob()
  rather than alloc_iob()/free_iob()

- Calculate DMA addresses using dma() or iob_dma() rather than
  virt_to_bus()

- Set a 64-bit DMA mask if needed using dma_set_mask_64bit() and
  thereafter eliminate checks on DMA address ranges

- Either record the DMA device in netdev->dma, or call iob_map_tx() as
  part of the transmit() method

- Ensure that debug messages use virt_to_phys() when displaying
  "hardware" addresses

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-11-28 20:26:28 +00:00
Michael Brown a2e44077cd [infiniband] Allow SRP device to be described using an EFI device path
The UEFI specification provides a partial definition of an Infiniband
device path structure.  Use this structure to construct what may be a
plausible path containing at least some of the information required to
identify an SRP target device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-23 15:34:35 +01:00
Michael Brown bf051a76ee [fcp] Allow Fibre Channel device to be described using an EFI device path
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-22 14:16:55 +01:00
Michael Brown e6f9054d13 [iscsi] Allow iSCSI device to be described using an EFI device path
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-20 15:05:37 +01:00
Michael Brown 04cb17de50 [aoe] Allow AoE device to be described using an EFI device path
There is no standard defined for AoE device paths in the UEFI
specification, and it seems unlikely that any standard will be adopted
in future.

Choose to construct an AoE device path using a concatenation of the
network device path and a SATA device path, treating the AoE major and
minor numbers as the HBA port number and port multiplier port number
respectively.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-19 14:45:49 +01:00
Michael Brown b50ad5f09a [http] Allow HTTP connection to be described using an EFI device path
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-19 13:07:40 +01:00
Michael Brown 6ccd5239b1 [ipv6] Reduce time spent waiting for router discovery
Now that IPv6 is enabled by default for UEFI builds, it is important
that iPXE does not delay unnecessarily in the (still relatively
common) case of a network that lacks IPv6 routers.

Apply the timeout values used for neighbour discovery to the router
discovery process.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-14 14:57:27 +01:00
Michael Brown 388d657080 [lacp] Ignore (and do not echo) trailing padding on received packets
The LACP responder reuses the received I/O buffer to construct the
response LACP (or marker) packet.  Any received padding will therefore
be unintentionally included within the response.

Truncate the received I/O buffer to the expected length (which is
already defined in a way to allow for future protocol expansion)
before reusing it to construct the response.

Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-14 14:18:49 +01:00
Michael Brown 3d43789914 [lacp] Detect and ignore erroneously looped back LACP packets
Some external drivers (observed with the UEFI NII driver provided by
an HPE-branded Mellanox ConnectX-3 Pro) seem to cause LACP packets
transmitted by iPXE to be looped back as received packets.  Since
iPXE's trivial LACP responder will send one response per received
packet, this results in an immediate LACP packet storm.

Detect looped back LACP packets (based on the received LACP actor MAC
address), and refuse to respond to such packets.

Reported-by: Tore Anderson <tore@fud.no>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-10-14 13:36:17 +01:00
Michael Brown ee2dc525b4 [wpa] Fix erroneous debug message in wpa_derive_ptk
Split debug message since eth_ntoa() uses a static result buffer.

Originally-fixed-by: Michael Bazzinotti <bazz@bazz1.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-07-21 15:34:39 +01:00
Michael Brown 366206517e [dns] Use all configured DNS servers
When no response is obtained from the first configured DNS server,
fall back to attempting the other configured servers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-07-15 19:10:30 +01:00
Michael Brown a95a2eafc5 [xfer] Remove address family from definition of a socket opener
All implemented socket openers provide definitions for both IPv4 and
IPv6 using exactly the same opener method.  Simplify the logic by
omitting the address family from the definition.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-07-15 18:46:58 +01:00
Michael Brown 2dac11eb1d [tls] Allow a minimum TLS protocol version to be specified
The supported ciphers and digest algorithms may already be specified
via config/crypto.h.  Extend this to allow a minimum TLS protocol
version to be specified.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-06-12 21:40:33 +01:00
Michael Brown e3ca211071 [iscsi] Eliminate variable-length stack allocation in URI parsing
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 23:47:06 +00:00
Michael Brown e2e29e7ae3 [iscsi] Eliminate variable-length stack allocations in CHAP handlers
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 23:19:55 +00:00
Michael Brown 0a74321915 [slam] Allow for the possibility of IPv6 multicast addresses
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 22:02:25 +00:00
Michael Brown c5306bcfa5 [slam] Eliminate variable-length stack allocation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 21:55:59 +00:00
Michael Brown 6248ac396a [infiniband] Eliminate variable-length stack allocation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 21:42:35 +00:00
Michael Brown c625681ca1 [tftp] Eliminate unnecessary variable-length stack allocation
Eliminate an unnecessary variable-length stack allocation and memory
copy by allowing TFTP option processors to modify the option string
in-place.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2020-02-16 20:08:20 +00:00
Michael Brown a2d3bedf1f [peerdist] Allow for the use of a hosted cache server
Allow a PeerDist hosted cache server to be specified via the
${peerhost} setting, e.g.:

  # Use 192.168.0.1 as hosted cache server
  set peerhost 192.168.0.1

Note that this simply treats the hosted cache server as a permanently
discovered peer for all segments.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-12-15 23:29:44 +00:00
Michael Brown 53af9905e0 [peerdist] Allow PeerDist to be globally enabled or disabled
Allow the use of PeerDist content encoding to be enabled or disabled
via the ${peerdist} setting, e.g.:

  # Disable PeerDist
  set peerdist 0

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-12-13 14:44:22 +00:00
Michael Brown f1e6efa40b [ethernet] Avoid false positive Coverity warning
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-08-17 17:30:09 +01:00
Michael Brown fd96acb7de [tls] Add missing call to tls_tx_resume() when restarting negotiation
The restart of negotiation triggered by a HelloRequest currently does
not call tls_tx_resume() and so may end up leaving the connection in
an idle state in which the pending ClientHello is never sent.

Fix by calling tls_tx_resume() as part of tls_restart(), since the
call to tls_tx_resume() logically belongs alongside the code that sets
bits in tls->tx_pending.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-08-16 22:51:14 +01:00
Michael Brown d8a1958ba5 [peerdist] Limit number of concurrent raw block downloads
Raw block downloads are expensive if the origin server uses HTTPS,
since each concurrent download will require local TLS resources
(including potentially large received encrypted data buffers).

Raw block downloads may also be prohibitively slow to initiate when
the origin server is using HTTPS and client certificates.  Origin
servers for PeerDist downloads are likely to be running IIS, which has
a bug that breaks session resumption and requires each connection to
go through the full client certificate verification.

Limit the total number of concurrent raw block downloads to ameliorate
these problems.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-08-16 22:19:50 +01:00
Michael Brown 02b26de963 [peerdist] Start block download timers from within opener methods
Move the responsibility for starting the block download timers from
peerblk_expired() to peerblk_raw_open() and peerblk_retrieval_open(),
in preparation for adding the ability to defer calls to
peerblk_raw_open() via a block download queue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-08-16 22:19:50 +01:00
Michael Brown fe680c8228 [vlan] Provide vlan_netdev_rx() and vlan_netdev_rx_err()
The Hermon driver uses vlan_find() to identify the appropriate VLAN
device for packets that are received with the VLAN tag already
stripped out by the hardware.  Generalise this capability and expose
it for use by other network card drivers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-04-27 20:25:00 +01:00
Michael Brown f6b2bf9507 [tcp] Display "connecting" status until connection is established
Provide increased visibility into the progress of TCP connections by
displaying an explicit "connecting" status message while waiting for
the TCP handshake to complete.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-10 17:29:06 +00:00
Michael Brown 7b63c1275f [tls] Display validator messages only while validation is in progress
Allow the cipherstream to report progress status messages during
connection establishment.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-10 17:27:33 +00:00
Michael Brown b28ccfc725 [tls] Display cross-certificate and OCSP status messages
TLS connections will almost always create background connections to
perform cross-signed certificate downloads and OCSP checks.  There is
currently no direct visibility into which checks are taking place,
which makes troubleshooting difficult in the absence of either a
packet capture or a debug build.

Use the job progress message buffer to report the current cross-signed
certificate download or OCSP status check, where applicable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-07 15:23:19 +00:00
Michael Brown 447e5cd447 [crypto] Use x509_name() in validator debug messages
Display a human-readable certificate name in validator debug messages
wherever possible.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-07 13:47:30 +00:00
Michael Brown eaba1a22b8 [tls] Support stateless session resumption
Add support for RFC5077 session ticket extensions to allow for
stateless TLS session resumption.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-06 15:11:18 +00:00
Michael Brown 799781f168 [tls] Fix incorrectly duplicated error number
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-03-06 15:11:18 +00:00
Michael Brown 272fe32529 [tls] Support stateful session resumption
Record the session ID (if any) provided by the server and attempt to
reuse it for any concurrent connections to the same server.

If multiple connections are initiated concurrently (e.g. when using
PeerDist) then defer sending the ClientHello for all but the first
connection, to allow time for the first connection to potentially
obtain a session ID (and thereby speed up the negotiation for all
remaining connections).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-02-21 11:32:25 +00:00
Michael Brown 36a4c85f91 [init] Show startup and shutdown function names in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2019-01-25 14:53:43 +00:00
Michael Brown b9d68b9de0 [ethernet] Use standard 1500 byte MTU unless explicitly overridden
Devices that support jumbo frames will currently default to the
largest possible MTU.  This assumption is valid for virtual adapters
such as virtio-net, where the MTU must have been configured by a
system administrator, but is unsafe in the general case of a physical
adapter.

Default to the standard Ethernet MTU, unless explicitly overridden
either by the driver or via the ${netX/mtu} setting.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-07-17 12:14:43 +01:00
Michael Brown 05b979146d [rndis] Clean up error handling path in register_rndis()
Avoid calling rndis_halt() and rndis->op->close() twice if the call to
register_netdev() fails.

Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-07-09 10:35:57 +01:00
Roman Kagan 16d7495308 [rndis] Register netdev with MAC filled
register_netdev expects ->hw_addr and ->ll_addr to be already filled,
so move it towards the end of register_rndis, after the respective
fields have been successfully queried from the underlying device.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-07-07 20:19:14 +01:00
Michael Brown e7f67d5a4c [http] Work around stateful authentication schemes
As pointedly documented in RFC7230 section 2.3, HTTP is a stateless
protocol: each request message can be understood in isolation from any
other requests or responses.  Various authentication schemes such as
NTLM break this fundamental property of HTTP and rely on the same TCP
connection being reused.

Work around these broken authentication schemes by ensuring that the
most recently pooled connection is reused for the subsequent
authentication retry.

Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-06-08 13:53:02 +01:00
Michael Brown baaf50017d [tls] Ensure that window change is propagated to plainstream interface
The cipherstream xfer_window_changed() message is used to retrigger
the TLS transmit state machine.  If the transmit state machine is
idle, then the window change message will not be propagated to the
plainstream interface.  This can potentially cause the plainstream
interface peer (e.g. httpcore) to block waiting for a window change
message that will never arrive.

Fix by ensuring that the window change message is propagated to the
plainstream interface if the transmit state machine is idle.  (If the
transmit state machine is not idle then the plainstream window will be
zero anyway.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-24 21:51:07 +00:00
Michael Brown 4152aff103 [tls] Rename tls_session to tls_connection
In TLS terminology a session conceptually spans multiple individual
connections, and essentially represents the stored cryptographic state
(master secret and cipher suite) required to establish communication
without going through the certificate and key exchange handshakes.

Rename tls_session to tls_connection in order to make the name
tls_session available to represent the session state.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-24 21:37:17 +00:00
Michael Brown ac4fbd47ae [tls] Ensure received data list is initialised before calling tls_free()
A failure in tls_generate_random() will result in a call to ref_put()
before the received data list has been initialised, which will cause
free_tls() to attempt to traverse an uninitialised list.

Fix by ensuring that all fields referenced by free_tls() are
initialised before any of the potential failure paths.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-23 11:07:29 +00:00
Michael Brown 342ff967cc [lacp] Check the partner's own state when checking for blocked links
The blocked link test in eth_slow_lacp_rx() is performed before the
actor TLV is copied to the partner TLV, and so must test the actor
state field rather than the partner state field.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-19 15:47:39 +02:00
Michael Brown a0021a30dd [ocsp] Centralise test for whether or not an OCSP check is required
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-18 22:25:01 +02:00
Michael Brown b11ae1d91b [tftp] Prevent potential division by zero
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-18 17:43:11 +02:00
Michael Brown c160c9dfc0 [lacp] Fix debug message to match documentation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-18 17:20:04 +02:00
Michael Brown 33d79d5d2b [lacp] Mark link as blocked if partner is not yet up and running
Mark the link as blocked if the LACP partner is not reporting itself
as being in sync, collecting, and distributing.

This matches the behaviour for STP: we mark the link as blocked if we
detect that the switch is actively blocking traffic, in order to
extend the DHCP discovery period and so prevent boot failures on
switches that take an excessively long time to enable ports.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-18 17:16:35 +02:00
Hannes Reinecke c84f9d6727 [iscsi] Parse IPv6 address in root path
The iSCSI root path may contain a literal IPv6 address.  Update the
parser to handle this address format correctly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-03-01 13:30:41 +00:00
Michael Brown 6737a8795f [http] Allow for domain names within NTLM user names
Allow a NetBIOS domain name to be specified within a URL using a
syntax such as:

  http://domain%5Cusername:password@server/path

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-02-19 11:58:28 +00:00
Martin Habets 70189a8e47 [netdevice] Make netdev_irq_enabled() independent of netdev_irq_supported()
The UNDI layer uses the NETDEV_IRQ_ENABLED flag to choose whether to
return PXENV_UNDI_ISR_OUT_OURS or PXENV_UNDI_ISR_OUT_NOT_OURS for a
given interrupt.  For a network device that does not support
interrupts, the flag will never be set and so pxenv_undi_isr() will
always return PXENV_UNDI_ISR_OUT_NOT_OURS.  This causes some NBPs
(such as lpxelinux.0) to hang.

Redefine NETDEV_IRQ_ENABLED as a simple administrative flag which can
be set even on network devices that do not support interrupts.  This
allows pxenv_undi_isr() (which is the sole user of NETDEV_IRQ_ENABLED)
to function as expected by lpxelinux.0.

Signed-off-by: Martin Habets <mhabets@solarflare.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2018-01-14 21:53:29 +00:00
Michael Brown 659c484efc [http] Report unsuccessful response status lines at DBGVL_LOG
The precise HTTP response status code is currently visible only at
DBGLVL_EXTRA.  Allow for easier debugging by reporting the whole
status line at DBGLVL_LOG for any unsuccessful responses.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-12-28 13:04:59 +00:00
Michael Brown ea29122a70 [http] Include error messages for 4xx and 5xx response codes
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-12-28 12:34:07 +00:00
Michael Brown b5e0b50723 [http] Add support for NTLM authentication
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-11-12 18:52:04 +00:00
Michael Brown 96bd872c03 [http] Handle parsing of WWW-Authenticate header within authentication scheme
Allow individual authentication schemes to parse WWW-Authenticate
headers that do not comply with RFC2617.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-11-12 18:52:04 +00:00
Michael Brown c49acbb4d2 [http] Gracefully handle offers of multiple authentication schemes
Servers may provide multiple WWW-Authenticate headers, each offering a
different authentication scheme.  We currently fail the request as
soon as we encounter an unrecognised scheme, which prevents subsequent
offers from succeeding.

Fix by silently ignoring headers for schemes that we do not recognise.
If no schemes are recognised then the request will eventually fail
anyway due to the 401 response code.

If multiple schemes are supported, arbitrarily choose the scheme
appearing first within the response headers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-11-12 18:52:03 +00:00
Ladi Prosek 0631a46a94 [crypto] Fail fast if cross-certificate source is empty
In fully self-contained deployments it may be desirable to build iPXE
with an empty CROSSCERT source to avoid talking to external services.

Add an explicit check for this case and make validator_start_download
fail immediately if the base URI is empty.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-24 17:56:04 +01:00
Michael Brown af02a8d071 [dns] Ensure DNS names are NUL-terminated when used as diagnostic strings
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-07 12:19:35 +01:00
Michael Brown 9faf069126 [dns] Report current DNS query as job progress status message
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-06 11:46:13 +01:00
Michael Brown 8047baf7c6 [netdevice] Add "hwaddr" setting
Expose the underlying hardware address as a setting.  For IPoIB
devices, this provides scripts with access to the Infiniband GUID.

Requested-by: Allen, Benjamin S. <bsallen@alcf.anl.gov>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-06 10:52:30 +01:00
Michael Brown 7e673a6b67 [peerdist] Gather and report peer statistics during download
Record and report the number of peers (calculated as the maximum
number of peers discovered for a block's segment at the time that the
block download is complete), and the percentage of blocks retrieved
from peers rather than from the origin server.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-05 23:23:22 +01:00
Michael Brown 97f0f56a34 [netdevice] Cancel all pending transmissions on any transmit error
Some external code (such as the UEFI UNDI driver for the Realtek USB
NIC on a Microsoft Surface Book) will block during transmission
attempts and can take several seconds to report a transmit error.  If
there is a large queue of pending transmissions, then the accumulated
time from a series of such failures can easily exceed the EFI watchdog
timeout, resulting in what appears to be a system lockup followed by a
reboot.

Work around this problem by immediately cancelling any pending
transmissions as soon as any transmit error occurs.

The only expected transmit error under normal operation is ENOBUFS
arising when the hardware transmit queue is full.  By definition, this
can happen only for drivers that do not utilise deferred
transmissions, and so this new behaviour will not affect these
drivers.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-09-05 12:30:04 +01:00
Michael Brown 1e4a3f5bab [tls] Support RFC5746 secure renegotiation
Support renegotiation with servers supporting RFC5746.  This allows
for the use of per-directory client certificates.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-07-04 19:54:34 +01:00
Michael Brown 2f12690455 [tls] Keep cipherstream window open until TLS negotiation is complete
When performing a SAN boot, the plainstream window size will be zero
(since this is the mechanism used internally to indicate that no data
should be fetched via the initial request).  This zero value currently
propagates to the advertised TCP window size, which prevents the TLS
negotiation from completing.

Fix by ensuring that the cipherstream window is held open until TLS
negotiation is complete, and only then falling back to passing through
the plainstream window size.

Reported-by: John Wigley <johnwigley#ipxe@acorna.co.uk>
Tested-by: John Wigley <johnwigley#ipxe@acorna.co.uk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-05-22 13:17:23 +01:00
Michael Brown 785389c2ba [iscsi] Always send FirstBurstLength parameter
As of kernel 4.11, the LIO target will propose a value for
FirstBurstLength if the initiator did not do so.  This is entirely
redundant in our case, since FirstBurstLength is defined by RFC 3720
to be

  "Irrelevant when: ( InitialR2T=Yes and ImmediateData=No )"

and we already enforce both InitialR2T=Yes and ImmediateData=No in our
initial proposal.  However, LIO (arguably correctly) complains when we
do not respond to its redundant proposal of an already-irrelevant
value.

Fix by always proposing the default value for FirstBurstLength.

Debugged-by: Patrick Seeburger <info@8bit.de>
Tested-by: Patrick Seeburger <info@8bit.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-05-03 13:01:11 +01:00
Michael Brown c8cae7cc17 [http] Notify data transfer interface when underlying connection is ready
HTTP implements xfer_window_changed() on the underlying server
connection using http_step(), which does not propagate the window
change notification to the data transfer interface.  This breaks the
multipath-capable SAN boot code, which relies on the window change
notification to discover that the HTTP block device is ready for
commands to be issued.

Fix by sending xfer_window_changed() in http_step() once the
underlying connection has been determined to be ready.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-28 23:40:52 +03:00
Michael Brown 7cfdd769aa [block] Describe all SAN devices via ACPI tables
Describe all SAN devices via ACPI tables such as the iBFT.  For tables
that can describe only a single device (i.e. the aBFT and sBFT), one
table is installed per device.  For multi-device tables (i.e. the
iBFT), all devices are described in a single table.

An underlying SAN device connection may be closed at the time that we
need to construct an ACPI table.  We therefore introduce the concept
of an "ACPI descriptor" which enables the SAN boot code to maintain an
opaque pointer to the underlying object, and an "ACPI model" which can
build tables from a list of such descriptors.  This separates the
lifecycles of ACPI descriptions from the lifecycles of the block
device interfaces, and allows for construction of the ACPI tables even
if the block device interface has been closed.

For a multipath SAN device, iPXE will wait until sufficient
information is available to describe all devices but will not wait for
all paths to connect successfully.  For example: with a multipath
iSCSI boot iPXE will wait until at least one path has become available
and name resolution has completed on all other paths.  We do this
since the iBFT has to include IP addresses rather than DNS names.  We
will commence booting without waiting for the inactive paths to either
become available or close; this avoids unnecessary boot delays.

Note that the Linux kernel will refuse to accept an iBFT with more
than two NIC or target structures.  We therefore describe only the
NICs that are actually required in order to reach the described
targets.  Any iBFT with at most two targets is therefore guaranteed to
describe at most two NICs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-28 19:12:48 +03:00
Michael Brown 75bb948008 [tcp] Use correct length for memset()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-22 15:11:05 +02:00
Michael Brown c26c1fd07c [infiniband] Return status code from ib_create_mi()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-22 11:18:23 +02:00
Michael Brown 39ef530088 [infiniband] Return status code from ib_create_cq() and ib_create_qp()
Any underlying errors arising during ib_create_cq() or ib_create_qp()
are lost since the functions simply return NULL on error.  This makes
debugging harder, since a debug-enabled build is required to discover
the root cause of the error.

Fix by returning a status code from these functions, thereby allowing
any underlying errors to be propagated.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-22 11:18:02 +02:00
Michael Brown f17cf0ecd0 [http] Add missing check for memory allocation failure
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-21 15:20:59 +02:00
Michael Brown 64de7dc7fd [slam] Avoid NULL pointer dereference in slam_pull_value()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-21 14:57:36 +02:00
Michael Brown 60561d0f3d [slam] Fix resource leak on error path
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-21 14:53:13 +02:00
Michael Brown 9b581158b5 [802.11] Remove redundant NULL pointer check after dereference
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-21 14:01:08 +02:00
Michael Brown e500e5dd07 [nfs] Fix double free bug on error path
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-21 13:46:26 +02:00
Michael Brown de2c6fa240 [dhcp] Allow vendor class to be changed in DHCP requests
Allow the DHCPv4 vendor class to be specified via the "vendor-class"
setting.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-20 13:58:59 +02:00
Vishvananda Ishaya Abrams 4524cc11bf [iscsi] Don't close when receiving NOP-In
Some iSCSI targets send NOP-In.  Rather than closing the connection
when we receive one, it is more user friendly to log a debug message
and keep the connection open.  Eventually, it would be nice if iPXE
supported replying to NOP-Ins, but we might as well keep the
connection open until the target disconnects us.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-09 14:23:22 +00:00
Michael Brown a29bdb3a92 [iscsi] Use intfs_shutdown() when shutting down multiple interfaces
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-03-09 12:16:15 +00:00
Michael Brown 4a4da573dd [http] Cleanly shut down potentially looped interfaces
Use intfs_shutdown() and intfs_restart() to cleanly shut down multiple
interfaces that may loop back to the same object.

This fixes a regression introduced by commit daa8ed9 ("[interface]
Provide intf_reinit() to reinitialise nullified interfaces") which
broke the use of HTTP Basic and Digest authentication.

Reported-by: murmansk <murmansk@hotmail.com>
Reported-by: Brett Waldo <brettwaldo@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-02-02 16:58:00 +00:00
Michael Brown 302f1eeb80 [time] Allow timer to be selected at runtime
Allow the active timer (providing udelay() and currticks()) to be
selected at runtime based on probing during the INIT_EARLY stage of
initialisation.

TICKS_PER_SEC is now a fixed compile-time constant for all builds, and
is independent of the underlying clock tick rate.  We choose the value
1024 to allow multiplications and divisions on seconds to be converted
to bit shifts.

TICKS_PER_MS is defined as 1, allowing multiplications and divisions
on milliseconds to be omitted entirely.  The 2% inaccuracy in this
definition is negligible when using the standard BIOS timer (running
at around 18.2Hz).

TIMER_RDTSC now checks for a constant TSC before claiming to be a
usable timer.  (This timer can be tested in KVM via the command-line
option "-cpu host,+invtsc".)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-26 08:17:37 +00:00
Michael Brown 70fc25ad6e [netdevice] Limit MTU by hardware maximum frame length
Separate out the concept of "hardware maximum supported frame length"
and "configured link MTU", and limit the latter according to the
former.

In networks where the DHCP-supplied link MTU is inconsistent with the
hardware or driver capabilities (e.g. a network using jumbo frames),
this will result in iPXE advertising a TCP MSS consistent with a size
that can actually be received.

Note that the term "MTU" is typically used to refer to the maximum
length excluding the link-layer headers; we adopt this usage.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-25 14:55:09 +00:00
Michael Brown 16aed6e5ce [netdevice] Allow MTU to be changed at runtime
Provide a settings applicator to modify netdev->max_pkt_len in
response to changes to the "mtu" setting (DHCP option 26).

Note that as with MAC address changes, drivers are permitted to
completely ignore any changes in the MTU value.  The net result will
be that iPXE effectively uses the smaller of either the hardware
default MTU or the software configured MTU.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-23 17:47:28 +00:00
Michael Brown de85336abb [cloud] Add ability to retrieve Google Compute Engine metadata
For some unspecified "security" reason, the Google Compute Engine
metadata server will refuse any requests that do not include the
non-standard HTTP header "Metadata-Flavor: Google".

Attempt to autodetect such requests (by comparing the hostname against
"metadata.google.internal"), and add the "Metadata-Flavor: Google"
header if applicable.

Enable this feature in the CONFIG=cloud build, and include a sample
embedded script allowing iPXE to boot from a script configured as
metadata via e.g.

  # Create shared boot image
  make bin/ipxe.usb CONFIG=cloud EMBED=config/cloud/gce.ipxe

  # Configure per-instance boot script
  gcloud compute instances add-metadata <instance> \
         --metadata-from-file ipxeboot=boot.ipxe

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-23 14:43:20 +00:00
David Decotigny 04c7befa73 [build] Return const char * from uuid_ntoa()
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-22 13:45:00 +00:00
Michael Brown 43b2d8eafb [ipv4] Accept unicast packets for the local network broadcast address
The ISC Kea DHCP server transmits its DHCPOFFER as a unicast packet
with a broadcast IPv4 destination address (255.255.255.255).  This
combination is currently rejected by iPXE.

Fix by explicitly accepting the local network broadcast address
(255.255.255.255) as a valid unicast destination address.

Reported-by: Roy Ledochowski <roy.ledochowski@hpe.com>
Tested-by: Roy Ledochowski <roy.ledochowski@hpe.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2017-01-22 09:12:52 +00:00
Michael Brown 81fceaec6e [iscsi] Avoid potential infinite loops during shutdown
The command and data interfaces may be connected to the same object.
Nullify the data interface before shutting down the control interface
to avoid potential infinite loops.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-11-16 23:03:37 +00:00
Michael Brown daa8ed9274 [interface] Provide intf_reinit() to reinitialise nullified interfaces
Provide an abstraction intf_reinit() to restore the descriptor of a
previously nullified interface.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-11-16 22:22:13 +00:00
Michael Brown ff28b22568 [crypto] Generalise X.509 "valid" field to a "flags" field
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-08-25 15:41:57 +01:00
Michael Brown a4c4f72297 [ipv6] Allow for multiple routers
Select the IPv6 source address and corresponding router (if any) using
a very simplified version of the algorithm from RFC6724:

- Ignore any source address that has a smaller scope than the
  destination address.  For example, do not use a link-local source
  address when sending to a global destination address.

- If we have a source address which is on the same link as the
  destination address, then use that source address.

- If we are left with multiple possible source addresses, then choose
  the address with the smallest scope.  For example, if we are sending
  to a site-local destination address and we have both a global source
  address and a site-local source address, then use the site-local
  source address.

- If we are still left with multiple possible source addresses, then
  choose the address with the longest matching prefix.

For the purposes of this algorithm, we treat RFC4193 Unique Local
Addresses as having organisation-local scope.  Since we use only
link-local scope for our multicast transmissions, this approximation
should remain valid in all practical situations.

Originally-implemented-by: Thomas Bächler <thomas@archlinux.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-25 15:20:22 +01:00
Michael Brown daa1a59310 [ipv6] Rename ipv6_scope to ipv6_settings_scope
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-21 15:47:45 +01:00
Michael Brown c34d1518eb [ipv6] Create routing table based on IPv6 settings
Use the IPv6 settings to construct the routing table, in a matter
analogous to the construction of the IPv4 routing table.

This allows for manual assignment of IPv6 addresses via e.g.

  set net0/ip6 2001:ba8:0:1d4::6950:5845
  set net0/len6 64
  set net0/gateway6 fe80::226:bff:fedd:d3c0

The prefix length ("len6") may be omitted, in which case a default
prefix length of 64 will be assumed.

Multiple IPv6 addresses may be assigned manually by implicitly
creating child settings blocks.  For example:

  set net0/ip6 2001:ba8:0:1d4::6950:5845
  set net0.ula/ip6 fda4:2496:e992::6950:5845

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-20 13:02:44 +01:00
Michael Brown 4ad3c73b30 [ipv6] Match user expectations for IPv6 settings priorities
A reasonable user expectation is that ${net0/ip6} should show the
"highest-priority" of the IPv6 addresses, even when multiple IPv6
addresses are active.  The expected order of priority is likely to be
manually-assigned addresses first, then stateful DHCPv6 addresses,
then SLAAC addresses, and lastly link-local addresses.

Using ${priority} to enforce an ordering is undesirable since that
would affect the priority assigned to each of the net<N> blocks as a
whole, so use the sibling ordering capability instead.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-19 17:07:53 +01:00
Michael Brown 1fdc7da435 [ipv6] Expose IPv6 link-local address settings
Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Originally-implemented-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-19 14:35:30 +01:00
Michael Brown 03d19cf14d [dhcpv6] Expose IPv6 address setting acquired through DHCPv6
Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Originally-implemented-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-19 01:20:34 +01:00
Michael Brown 3b783d7fd2 [ipv6] Expose IPv6 settings acquired through NDP
Expose the IPv6 address (or prefix) as ${ip6}, the prefix length as
${len6}, and the router address as ${gateway6}.

Originally-implemented-by: Hannes Reinecke <hare@suse.de>
Originally-implemented-by: Marin Hannache <git@mareo.fr>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-19 00:13:00 +01:00
Michael Brown ee54ab5be6 [ipv6] Allow settings to comprise arbitrary subsets of NDP options
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-19 00:13:00 +01:00
Michael Brown 129206f476 [ipv6] Rename ipv6_scope to dhcpv6_scope
The settings scope ipv6_scope refers specifically to IPv6 settings
that have a corresponding DHCPv6 option.  Rename to dhcpv6_scope to
more accurately reflect this purpose.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-16 12:42:08 +01:00
Michael Brown ecfc81d76f [settings] Create space for IPv6 in settings display order
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-15 17:39:49 +01:00
Michael Brown c53a209a42 [ipv6] Perform SLAAC only during autoconfiguration
We currently perform IPv6 stateless address autoconfiguration (SLAAC)
in response to any router advertisement with the relevant flags set.
This can result in the local IPv6 source address changing midway
through a TCP connection, since our connections bind only to a local
port number and do not store a local network address.

In addition, this behaviour for SLAAC is inconsistent with that for
DHCPv4 and stateful DHCPv6, both of which will be performed only as a
result of an explicit autoconfiguration action (e.g. via the default
autoboot sequence, or the "ifconf" command).

Fix by ignoring router advertisements arriving outside the context of
an ongoing autoconfiguration attempt.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2016-07-15 15:58:47 +01:00