Make is_uefi_supported() only check for /sys/firmware/efi, since
get_efi_partition() already detects a missing ESP or an invalid
partition scheme. Stop using get_efi_partition() inside
is_uefi_supported(), as the former is eventually called in every
UEFI related code path.
UEFI supports both MBR and GPT partition schemes, and this change is
required to handle the particular case of Windows not being able to
boot via UEFI from an MBR partition scheme.
Provide more information in exception messages, as those are the
source of the logging messages. Add information about paths, files
or configuration related to the operation associated with the
exception.
Replace exception types to be more explicit about the nature of
the error.
Improve the exception raising semantics by using the 'from' keyword.
This wraps an older exception into a new one while keeping the
original attached to the chain as its cause.
Use only the exception messages as the main resource for error
messages.
The previous error handling code had string duplication in the form of:
logging.error('msg here')
raise Exception('msg here')
That approach also has the downside of duplicated logging, as it had
the local logging.error() and a global logging.exception() inside
send_internal_server_error capturing the exception message.
The new code only requires raising an exception with a proper
error message.
Improve exception messages to give more error context.
Log every AssertionError as a backtrace.
Use the 'raise Exception from e' syntax to turn a previously
raised exception 'e' into an exception with additional context or
a different type. This also avoids the implicit message warning
that a new exception was raised while handling the initial one.
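A minimal sketch of this pattern; the file name and exception types are
illustrative, not the actual ogClient code:

    import logging

    def read_config(path):
        try:
            with open(path) as f:
                return f.read()
        except OSError as e:
            # Chain the original exception: the traceback keeps both, and
            # the implicit "During handling of the above exception, another
            # exception occurred" notice is replaced by an explicit cause.
            raise ValueError(f'cannot read configuration file {path}') from e

    try:
        read_config('/nonexistent/ogclient.json')
    except ValueError:
        # A single logging.exception() call prints the message plus the
        # chained traceback; no separate logging.error() is needed.
        logging.exception('configuration error')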
The main function must be able to handle the logging of critical
errors in the main ogClient class instance. Add a try/except block
to the ogClient run logic and move the relevant error logs into
the except block.
Delegate the error messages to the exception message. This is the
first step towards error message deduplication.
Create ogboot.me and ogboot.secondboot as empty files and
ogboot.firstboot with the value "iniciado" in the root of
the BIOS Windows system partition.
The files must already contain data for GRUB to be able to write
content into them, therefore they are created containing 3072 null bytes.
The Windows boot process is handled by the "pxe" profile.
There the files ogboot.me, ogboot.firstboot and ogboot.secondboot
are used as a state machine to choose between booting Windows and
ogLive.
The first Windows boot happens if ogboot.me and ogboot.firstboot
are identical, then "iniciado" is written to ogboot.firstboot.
We skip this stage as we create ogboot.firstboot with 'iniciado'.
The second Windows boot occurs if ogboot.me and ogboot.secondboot
are identical, then "iniciado" is written to ogboot.secondboot.
After the Windows boot ogLive is booted.
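A minimal sketch of how these state files could be created; the padding
layout and helper name are assumptions, not the actual ogClient code:

    import os

    OGBOOT_FILES = {
        'ogboot.me': b'\x00' * 3072,
        'ogboot.firstboot': b'iniciado'.ljust(3072, b'\x00'),
        'ogboot.secondboot': b'\x00' * 3072,
    }

    def create_ogboot_files(mountpoint):
        # Write the GRUB "pxe" profile state files in the root of the
        # mounted Windows system partition, padded to 3072 bytes so GRUB
        # can later overwrite their contents in place.
        for name, content in OGBOOT_FILES.items():
            with open(os.path.join(mountpoint, name), 'wb') as f:
                f.write(content)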
Create a bios.py file to hold all the BIOS specific functions.
Implement _boot_bios_linux in Python. The new boot process
tries to find the vmlinuz and initrd binaries on the desired
partition, then tries to load them with kexec using the proper
GRUB boot params.
One step closer to the removal of the legacy boot script.
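A rough sketch of this flow, assuming hypothetical helper arguments and a
simplified kernel command line (the real code derives it from the GRUB
configuration):

    import glob
    import subprocess

    def _boot_bios_linux(mountpoint, root_device, append='ro quiet splash'):
        # Locate the newest vmlinuz/initrd pair on the mounted partition.
        kernels = sorted(glob.glob(f'{mountpoint}/boot/vmlinuz-*'))
        initrds = sorted(glob.glob(f'{mountpoint}/boot/initrd.img-*'))
        if not kernels or not initrds:
            raise FileNotFoundError(f'no vmlinuz/initrd found in {mountpoint}/boot')
        cmdline = f'root={root_device} {append}'
        # Load the kernel with kexec, then execute it.
        subprocess.run(['kexec', '-l', kernels[-1],
                        f'--initrd={initrds[-1]}',
                        f'--append={cmdline}'], check=True)
        subprocess.run(['kexec', '-e'], check=True)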
The image creation process was being interrupted by an error when
reading the Windows registry with the Hivex library.
Now the exceptions are handled and an error is reported.
The OS probe logic must be able to check a distro programmatically:
add get_linux_distro_id to return an id without versioning.
This makes a value such as 'ubuntu' available when we need to ensure
certain features are only used with a supported system.
This change is a preparation for reimplementing the BIOS boot
in order to deprecate the legacy script. All the codepaths to
boot systems located on a partition are now called from the
boot_os_at function, enabling an easier structure for the incoming
code.
Checking the existence of /sys/firmware/efi is not reliable, as it
might sometimes appear in BIOS installs if the BIOS configuration is
not proper. Checking for the EFI partition is the safest method
to verify the install type.
The function getlinuxversion receives a path to the os-release
file. The case of not being able to open it was not handled, thus
causing an unwanted exception.
The json functionality proposed upstream might be merged one day
in efibootmgr so deploying a fork would not be needed anymore.
This change aims to ease the migration once that day comes.
Replace the IniciarSesion script with native Python code when booting
a UEFI system into Linux. This completes the implementation of booting
into an OS on a UEFI compliant system.
Replace the IniciarSesion script with native Python code when booting
a UEFI system. This applies when running the "session" command.
WIP: Only Windows systems boot via UEFI for now. Raise a
NotImplementedError exception when trying to boot a Linux system using UEFI.
Add utility module related to the process of booting a system from a
client's partition.
The main utility function to boot a client's system is boot_os_at(), from
which firmware (UEFI or BIOS) and OS-family specific private functions are invoked.
This initial commit adds the UEFI Windows boot function.
Add UEFI related utilities inside a new utility module: uefi.py
_check_efibootmgr_json
======================
Check if the system efibootmgr executable supports JSON output. This is
a private function used only by other functions from uefi.py.
is_uefi_supported
=================
Check if the system supports UEFI firmware.
run_efibootmgr_json
===================
Run efibootmgr with JSON output support. Return the JSON output as a
Python dict.
efibootmgr_create_bootentry
===========================
Create an nvram boot entry. This boot entry is usually later set to boot
next just once via the "BootNext" nvram variable.
efibootmgr_delete_bootentry
===========================
Delete an nvram boot entry. Used to avoid duplicates when booting the
same disk and partition from a given client.
efibootmgr_bootnext
===================
Set nvram "BootNext" variable to a given boot entry so after client
reboot, PXE is not executed and the given boot entry takes precedence.
Add a dependency on efibootmgr version >= 18, and on the efibootmgr JSON
output support which is currently out of tree in the upstream repo.
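A minimal sketch of two of these helpers; the '--json' flag name is an
assumption tied to the out of tree JSON support mentioned above:

    import json
    import subprocess

    def run_efibootmgr_json():
        # Run efibootmgr with JSON output and return it as a Python dict.
        # Requires an efibootmgr build (>= 18) with JSON output support.
        proc = subprocess.run(['efibootmgr', '--json'],
                              capture_output=True, check=True)
        return json.loads(proc.stdout)

    def efibootmgr_bootnext(bootnum):
        # Set the BootNext nvram variable so the given entry boots just
        # once, taking precedence over PXE on the next reboot.
        subprocess.run(['efibootmgr', '--bootnext', bootnum], check=True)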
Add a basic OS family enumeration: OSFamily.
Add utility function that probes for an installed Linux or Windows
system and returns the corresponding enum value, or OSFamily.UNKNOWN
otherwise.
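A sketch of what such an enumeration and probe could look like; the probing
heuristics are illustrative, not the actual ogClient checks:

    import os
    from enum import Enum, auto

    class OSFamily(Enum):
        WINDOWS = auto()
        LINUX = auto()
        UNKNOWN = auto()

    def get_os_family(mountpoint):
        # Probe a mounted partition for a Windows or Linux install.
        if os.path.exists(os.path.join(mountpoint, 'Windows', 'System32')):
            return OSFamily.WINDOWS
        if os.path.exists(os.path.join(mountpoint, 'etc', 'os-release')):
            return OSFamily.LINUX
        return OSFamily.UNKNOWN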
Add utility function inside disk.py to find, if any, the first ESP
partition of a given disk.
The disk is provided as an integer (starting at 1, following the usual
values in OpenGnsys scripts), meaning the (n-1)th disk from the disk array
returned from get_disks(). In the future a better mechanism should be
put in place to fetch probed disks from a running client.
This change is part of the upcoming drop of "IniciarSesion" script in
favor of a Python native approach. Specifically regarding UEFI systems.
Value extraction did not have error checking and was handled in
a one-liner. The new implementation expands the parsing logic
and moves it into a function.
Revisit 5056b8f0d5 ("fs: validate ntfsresize dry-run output"), which
introduced a possible infinite loop.
Disentangle this loop while at it: iterate until the smallest workable
size is found by probing.
Do not return the returncode, instead return an integer.
Do not use
except CalledProcessError as e:
as it raises another exception while handling the original exception.
Remount the original image repository.
It should be possible to simplify this further by:
- stacking mounts: there is no need to umount the initial repo and mount it
again when switching to the new repo, because remounting the initial repo
might fail (!)
- using check=False and simply checking x.returncode
Remove mbuffer, this is never used.
mbuffer has never been used since ogClient supports native image restore.
Originally this was used like this:
partclone [...] | mbuffer -q -M 40M | lzop [...]
supposedly to speed up partclone in case the device where the read happens is
slower than the device that is used for writes.
See the mbuffer(1) manpage examples.
In any case, this needs benchmarking to really make sure it helps.
Remove it until that ever happens.
Cover more error cases where exceptions need to be raised.
Check the return code of the invoked subprocess.
restoreImageCustom has been intentionally left behind, as it
is unclear what this custom script returns on success and
error.
Validate that 'Needed relocations : ' is present before indexing into the split chunks.
(2024-01-11 10:28:16) ogClient: [ERROR] - Exception when running "image create" subprocess
Traceback (most recent call last):
File "/opt/opengnsys/ogClient/src/live/ogOperations.py", line 454, in image_create
ogReduceFs(disk, partition)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 105, in ogReduceFs
_reduce_ntfsresize(partdev)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 235, in _reduce_ntfsresize
extra_size = int(out_resize_dryrun.split('Needed relocations : ')[1].split(' ')[0])*1.1+1024
IndexError: list index out of range
If it is not present, there is no need to adjust the size.
Users can create an image of a filesystem that contains no OS.
Therefore, instead of raising an exception when no OS is detected,
deliver an "unknown" OS and an empty list of software.
Image backup is considered a legacy feature. Use the legacy mechanism of
naming image backups by adding ".ant" suffix.
Previously, when using the strftime suffix, clients were reporting that
the disks were getting full rather quickly.
When a good method for image deletion is implemented then a proper
backup naming mechanism should be reconsidered.
When a client's hardware presents an empty PCI storage child there is an
invalid call to _bytes_to_human: a string is supplied as a default value
if the storage child does not present a 'size' attribute.
Fix this by checking if 'size' is present in the JSON output from lshw.
If size is present then map the bytes to a human readable string using
_bytes_to_human; if no size is present then use 'Empty slot' to indicate
that the slot is not being used.
When a client's hardware presents an empty memory bank, an invalid call
to _bytes_to_human is performed because None is passed as a parameter:
size = _bytes_to_human(obj.get('size', None))
Fix this by checking if 'size' is present in the JSON output from lshw.
If size is present then map the bytes to a human readable string using
_bytes_to_human; if no size is present then use 'Empty slot' to indicate
that the memory bank is not being used.
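A sketch of the check described in both fixes above; _bytes_to_human below
is a simplified stand-in for the real helper:

    def _bytes_to_human(n):
        # Simplified stand-in for the real helper.
        for unit in ('B', 'KiB', 'MiB', 'GiB', 'TiB'):
            if n < 1024:
                return f'{n:.0f}{unit}'
            n /= 1024
        return f'{n:.0f}PiB'

    def _describe_size(obj):
        # Map an lshw JSON node to a human readable size, or report an
        # empty slot when no 'size' attribute is present.
        size = obj.get('size')
        if size is None:
            return 'Empty slot'
        return _bytes_to_human(size)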
Some users have mistakenly reported tiptorrent problems when the process
takes a long time. Specifically by rebooting or powering off the client
in the middle of the md5sum computation stage, just after the tiptorrent
transfer.
The same problem occurs when the image creation command takes a long
period of time.
In order to help the user understand the different stages of commands
such as image creation or image restore using tiptorrent, the following
changes have been made to the current logging solution:
- Add log messages to warn users not to reboot or shut down the client
during a tiptorrent transfer, and also during the md5sum computation
stage.
- Add a log message telling the user that the image creation processes
have started.
- Use logging.exception inside "except:" blocks to print a traceback
with the log message.
(https://docs.python.org/3/library/logging.html#logging.exception)
The first stage of parsing the "lshw -json" command output is to load
the json string into a Python dictionary. lshw output is large and
varies from machine to machine, so it is not safe to assume that
particular keys will be present in the dictionary.
Use dict.get() instead of dict[key] to avoid KeyError exceptions.
Backup image file if image creation request included
"backup": true
This only applies when the target image is already present in the
repository folder before running the partclone subprocess.
This parameter is ignored if the target image is not present in the
repository.
Enable parsing of "X-Sequence" HTTP headers from incoming requests.
Add "seq" field in restRequest class.
Enable adding "X-Sequence" to outgoing responses.
Add "seq" field inside restResponse class.
Store current client sequence number inside ogClient class.
Ideally, the restRequest object should be used to retrieve the
sequence number but not all processing functions inside ogRest.py
receive the request as parameter (eg: process_refresh).
On the other hand, all processing functions receive the ogClient object.
The subprocess module expects a bytes-like object for the "input" parameter
by default. Passing a string object results in the following error:
(2023-06-13 14:44:43) ogClient: [ERROR] - Exception when running "image create" subprocess
(2023-06-13 14:44:43) ogClient: [ERROR] - Unexpected error
Traceback (most recent call last):
File "/opt/opengnsys/ogClient/src/live/ogOperations.py", line 465, in image_create
ogExtendFs(disk, partition)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 124, in ogExtendFs
_extend_ntfsresize(partdev)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 250, in _extend_ntfsresize
proc = subprocess.run(cmd, input='y')
File "/usr/lib/python3.8/subprocess.py", line 495, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/usr/lib/python3.8/subprocess.py", line 1013, in communicate
self._stdin_write(input)
File "/usr/lib/python3.8/subprocess.py", line 962, in _stdin_write
self.stdin.write(input)
TypeError: a bytes-like object is required, not 'str'
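The failure is easy to reproduce with any command that reads stdin; either
of the variants below avoids the error:

    import subprocess

    # subprocess.run() expects bytes for 'input' unless text mode is enabled.
    subprocess.run(['cat'], input=b'y')            # bytes: OK
    subprocess.run(['cat'], input='y', text=True)  # str:   OK with text=True
    # subprocess.run(['cat'], input='y')           # TypeError: a bytes-like
    #                                              # object is required, not 'str'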
Fixes: dd999bfe34 ("utils: rewrite ogReduceFs")
There is a corner case in which a target NTFS filesystem is already
shrunken. When this happens, the ntfsresize text output parsing breaks.
Check when ntfsresize reports nothing to do, warn the user about this
and stop the dry-run ntfsresize loop.
_extend_ntfsresize contains an incorrect variable name inside
subprocess.run referring to the resize command value.
Simplify this variable name inside each specific _extend_* function:
s/cmd_resize2fs/cmd
s/cmd_ntfsresize/cmd
Remove unnecessary InventarioSoftware invocation inside image_create
operation. Software inventory is executed after image creation
(see ogRest.py).
Remove legacy 'path' parameter. This parameter was used to specify the
path of a text file in which legacy bash scripts wrote the software
inventory of the client (something like "Csft-{ip}...").
Fixes: 04bb35bd86 ("live: rewrite software inventory")
Fixes: 2e3d47b7b8 ("Avoid writting /software output to a file")
Don't raise an exception if any Windows program is missing the DisplayName
node in the Windows registry.
This attribute/node should contain the program's name. This name is used
as the package's name in the software set (software inventory).
This patch should be considered a hotfix; python-hivex does not report
any helpful message about this error.
(2023-05-09 14:43:13) ogClient: [ERROR] - Unexpected error
Traceback (most recent call last):
[...]
RuntimeError: Success
Before this patch, image creation *might* fail because it cannot create
the software inventory associated with the image due to the previously
described error. The software inventory is part of the response payload
of the image creation command (see src/ogRest:image_create).
Fixes: 04bb35bd86 (live: rewrite software inventory)
Add optional 'operation' parameter to _poweroff_oglive function.
Reuse _poweroff_oglive code before the busybox subprocess when rebooting
an ogLive client.
Replace legacy bash script /opt/opengnsys/client/scripts/poweroff with a
Python native solution.
Use subprocess module for any required external program when shutting
down a client. ethtool is used to ensure WoL setting is correct before
shutting down.
ogLive does not properly use an init system, so busybox is used when
shutting down the system. In other live environments the poweroff operation
just calls /sbin/poweroff.
Add utility function to unmount any mountpoint present in the /mnt
folder.
This function is a simplified version of the legacy bash function
ogUnmountAll used in several operations.
Drop subprocess call to bash function ogExtendFs. Use a native python
solution with subprocess calls to the required underlying tools.
Use get_filesystem_type to get the present filesystem from a partition
and call the corresponding filesystem grow function.
Filesystem specific functions are declared "_extend_{filesystem}" and
should not be imported elsewhere.
Each filesystem specific function wraps a subprocess call to the
required underlying program:
- NTFS filesystems: "ntfsresize -f [partition]"
- ext4 filesystems: "resize2fs -f [partition]"
Set the NTFS related subprocess stdin to 'y' because the interactive
confirmation cannot be disabled via other ntfsresize parameters.
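A minimal sketch of the filesystem specific helpers and dispatch described
above; the real ogExtendFs derives the partition device and filesystem type
itself:

    import subprocess

    def _extend_ntfsresize(partdev):
        # ntfsresize asks for interactive confirmation that cannot be
        # disabled with other options, so feed 'y' on stdin.
        subprocess.run(['ntfsresize', '-f', partdev], input=b'y', check=True)

    def _extend_resize2fs(partdev):
        subprocess.run(['resize2fs', '-f', partdev], check=True)

    def extend_filesystem(partdev, fstype):
        # Dispatch to the filesystem specific grow helper.
        extend = {'ntfs': _extend_ntfsresize, 'ext4': _extend_resize2fs}
        extend[fstype.lower()](partdev)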
Drop subprocess call to bash function ogReduceFs. Use a native python
solution with subprocess calls to the required underlying tools.
Use get_filesystem_type to get the filesystem from a partition and call
the corresponding supported filesystem shrink function.
Filesystem specific functions are declared "_reduce_{filesystem}" and
should not be imported elsewhere.
In case of NTFS filesystems, the output of 'ntfsresize' is processed
directly. This is dirty, but we can expect no changes to the output
strings if we read the following comment in the ntfsresize.c source
code:
https://github.com/tuxera/ntfs-3g/blob/edge/ntfsprogs/ntfsresize.c#L12
ntfsresize requires previous dry-run executions to confirm
that the resizing is possible.
If a dry-run fails but a 10% increase in size is still smaller than the
original filesystem, then retry the operation until the dry-run reports
success or the size increase is bigger than the original.
If resizing to a smaller ntfs filesystem is not possible then ogReduceFs
will do nothing.
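A sketch of the dry-run probing loop described above; the helper name, size
units and growth handling are assumptions:

    import subprocess

    def _probe_ntfs_size(partdev, orig_size, start_size):
        # Find the smallest size (in bytes) that an ntfsresize dry run
        # accepts, growing the candidate on each failure. Returns None
        # when shrinking below the original size is not possible.
        size = start_size
        while size < orig_size:
            dry_run = subprocess.run(
                ['ntfsresize', '--no-action', '--force',
                 '--size', str(int(size)), partdev],
                capture_output=True)
            if dry_run.returncode == 0:
                return int(size)
            size = size * 1.1 + 1024
        return None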
Replace the legacy shell script InventarioHardware with helper functions
from hw_inventory.py
Use get_hardware_inventory to obtain a HardwareInventory object with
the hardware information. Map the HardwareInventory object to a legacy
response string with the legacy_list_hardware_inventory function.
Remove "Chrd-*" file reading logic, it's no longer needed. Legacy shell
script InventarioHardware uses that file.
Expect a change in the structure of hardware inventory response payload
in the future. This patch does not address the HTTP response containing
the hardware inventory as a '\n' separated string of hardware elements.
hw_inventory.py defines classes and helper functions enabling the
fetching of the hardware inventory from a running client.
Uses a subprocess call to the command 'lshw -json' to obtain hardware
information.
Relevant public functions:
> get_hardware_inventory()
Main function encapsulating subprocess and output processing
logic.
Returns a HardwareInventory object.
> legacy_list_hardware_inventory(inventory)
Returns the legacy string representation of the given HardwareInventory object.
Rename software inventory file to sw_inventory to better distinguish
it from a future hardware inventory code.
In the future sw_inventory and hw_inventory might be merged together
once each file is tidied up.
Replace the legacy bash script with Python code. This improves error
traceability and eases further development.
The software inventory operation mounts the target partition and it
fetches the list of installed software (package set). Once the
operation is complete, it unmounts the target partition.
For Windows, introduce hivex library python bindings for accessing
Windows registry hive files (https://libguestfs.org/hivex.3.html).
This operation is still processed by legacy code in the server side
(ogAdmServer.c in ogServer). Legacy backend process expects the software
inventory like the following example:
"software": "Windows 10 Enterprise Evaluation 2004 \nIntel(R) Network Connections 24.0.0.11 24.0.0.11 ..."
The OS name is inserted first in this list, followed by a '\n' separated
string of the software packages.
The legacy server code can be found in function actualizaSoftware at
ogServer/src/ogAdmServer.c
The software inventory payload is expected to change in the future to
a simpler solution using just a JSON array of strings.
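A rough sketch of reading DisplayName values with the hivex bindings
mentioned above; the registry path handling and decoding are simplified
and not the exact ogClient code:

    import hivex

    def list_display_names(software_hive_path):
        h = hivex.Hivex(software_hive_path)
        node = h.root()
        for key in ('Microsoft', 'Windows', 'CurrentVersion', 'Uninstall'):
            node = h.node_get_child(node, key)
        names = []
        for child in h.node_children(node):
            try:
                value = h.node_get_value(child, 'DisplayName')
                # REG_SZ data is UTF-16LE; value_value() returns (type, bytes).
                _, data = h.value_value(value)
                names.append(data.decode('utf-16-le').rstrip('\x00'))
            except RuntimeError:
                # Entry without a DisplayName node: skip it instead of
                # aborting the whole software inventory.
                continue
        return names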
Add missing samba credentials parameter in ogChangeRepo invocation.
Credentials are loaded from ogClient config file.
Any production deployment should use its own samba user and password.
ogChangeRepo fails when using default samba credentials in a production
environment.
Fixes: a1edbe904b ("legacy: rewrite ogChangeRepo")
Fixes: 3703fd6063 ("live: support native unicast cache image restore")
Removes undefined 'repo' variable from error logging message.
This caused the traceback to be polluted with an unhelpful message
about this variable being undefined.
Fixes: 3703fd606 ("live: support native unicast cache image restore")
Adds linux swap partition type, mapped to the 'LINUX-SWAP' string in web
interfaces like ogCP or webconsole.
Fixes: 29c53e54e9 ("live: add parttypes.py")
Capture all possible Python exceptions in the try/except block of every
opengnsys operation.
Create an error handling function to deduplicate code in the except
block. The error handling function resets the ogRest state to IDLE and
sends the corresponding 500 Internal Server Error.
This *does not cover* every possible error. There are functions inside
ogThread which contain code that may raise errors that are not covered
by any try/except block.
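A minimal sketch of such an error handling function, with illustrative
names for the state value and the client object:

    import logging
    from enum import Enum, auto

    class ThreadState(Enum):   # stand-in for the real ogRest state values
        IDLE = auto()
        BUSY = auto()

    def handle_operation_error(ogrest, client):
        # Shared except-block handler: log the traceback, reset the
        # ogRest state machine to IDLE and reply with a 500 error.
        logging.exception('Unexpected error while processing operation')
        ogrest.state = ThreadState.IDLE
        client.send_internal_server_error()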
Remove the unnecessary root logger constant: LOGGER
The root logger is used by default when executing:
logging.debug()
logging.info()
logging.warning()
...
There is no point in doing:
LOGGER = logging.getLogger() # Get root logger
LOGGER.debug() # Use root logger
Change the name of the helper functions used when getting opengnsys
image information (legacy ogGetImageInfo bash script). As of now the
process consists of decompressing the image file with lzop and feeding
that output to partclone.info.
Prefer a more explicit function name rather than "process_image_*"
Add comment about skipping the first two lines of partclone.info output.
Usually, partclone.info starts printing out these two lines that are not
related to the partclone image information:
Partclone v0.3.23 http://partclone.org
Showing info of image (-)
As long as partclone.info output doesn't change we'll be fine, but we
should not depend on human readable output. This might change in the
future (e.g. adding a json output format to partclone.info).
Rewrites this legacy script's behavior in native Python code, using the
subprocess module to execute programs like partclone.info or lzop.
ogGetImageInfo is a bash script that retrieves information regarding an
OpenGnsys partition image, specifically:
- clonator
- compressor
- filesystem
- datasize (size of the partition image)
This rewrite only supports partclone and lzop compressed images. This is
the standard behavior; we have no reports of other programs or compression
algorithms in use.
Keep this legacy function with hungarian notation to emphasize this is
still a legacy component that may be replaced in the future.
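A sketch of the lzop/partclone.info pipeline; whether partclone.info prints
to stdout or stderr is not assumed, and the output parsing is left out:

    import subprocess

    def run_partclone_info(image_path):
        # Decompress the lzop image to stdout and pipe it into
        # partclone.info reading from '-' (stdin).
        lzop = subprocess.Popen(['lzop', '-dc', image_path],
                                stdout=subprocess.PIPE)
        info = subprocess.run(['partclone.info', '-s', '-'],
                              stdin=lzop.stdout,
                              capture_output=True, text=True)
        lzop.stdout.close()
        lzop.wait()
        # partclone tools usually write human readable info to stderr.
        return info.stderr or info.stdout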
Drop ogChangeRepo Bash script in favor of a native Python
approach. Use only the necessary subprocess calls instead of keeping
all the logic of this function inside a Bash script black box.
ogChangeRepo unmounts the current OpenGnsys image samba folder
(/opt/opengnsys/images) and mounts (connects to) a new directory using
the newly provided ip address, keeping the access mode from the previous mount.
If anything goes wrong when mounting the new directory, it will fallback
to mounting the previous directory.
If no previous OpenGnsys image samba directory is detected, this
function tries to mount the new directory anyway. In this case,
it will raise CalledProcessError if something goes wrong.
Rewrites the setup operation using the python-libfdisk module instead of
an external bash script. This consolidates the operation into Python code,
limiting external subprocesses to well known programs and small, concrete
tasks that are difficult to fully integrate into Python.
Use parttypes.py to fetch partition types from python-libfdisk module.
Use fs.py to create any specified supported filesystem.
OpenGnsys cache partitions are created by labelling the partition as
"CACHE". Stop setting the non-standard MBR hexcode (0xca) on the cache
partition in addition to the filesystem label.
Any partition specified as type EMPTY will be ignored.
init_cache() creates the default directory in which OpenGnsys stores
images when using any cache enabled transfer method.
As of this commit this folder must exist for tiptorrent.py to
work properly.
The subprocess Popen objects inside tiptorrent.py use the optional
'cwd' parameter like:
cwd='/opt/opengnsys/cache/opt/opengnsys/images/'
This folder convention might change in the future.
Adds a utility module which wraps several mkfs.* calls as subprocesses.
The main utility function is mkfs(fs, disk, partition, label), which
subsequently calls the corresponding mkfs_*(partition_device) function.
mkfs() allows specifying a drive label for filesystems that support one.
Other modules using fs.py should call mkfs() only.
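A sketch of this dispatch; the partition device naming below is a
simplification of the real lookup:

    import subprocess

    def mkfs_ext4(partdev, label=None):
        cmd = ['mkfs.ext4', '-F']
        if label:
            cmd += ['-L', label]
        subprocess.run(cmd + [partdev], check=True)

    def mkfs_ntfs(partdev, label=None):
        cmd = ['mkfs.ntfs', '-f']
        if label:
            cmd += ['-L', label]
        subprocess.run(cmd + [partdev], check=True)

    def mkfs(fs, disk, partition, label=None):
        # Map (disk, partition) to a device node and call the filesystem
        # specific helper. Device naming is simplified here.
        partdev = f'/dev/sd{chr(ord("a") + disk - 1)}{partition}'
        dispatch = {'ext4': mkfs_ext4, 'ntfs': mkfs_ntfs}
        dispatch[fs.lower()](partdev, label)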
Adds the parttypes.py module with utility functions to get partition types
(parttypes) from python-libfdisk.
Supports standard partition types, either DOS or GPT.
DOS labels use a hex code to define partition types; python-libfdisk
exposes get_parttype_from_code to look up DOS partition types from a
given hexcode.
GPT labels use a string (UUID) for each supported partition type;
python-libfdisk exposes get_parttype_from_string to look up GPT
partition types from a given string.
Clients running in ogLive can show log messages via a lighttpd server.
In particular, an html page named "real time log" consists of <text-area>
tags with the contents of two particular text files:
/tmp/session.log and /tmp/command.log
Adds a Python logging handler in order to write ogClient log messages
into /tmp/session.log. This way ogClient logs are shown in the "real time
log" html page too.
Clears the content of the blue text areas in the real time log view before
executing a restore image operation.
Adds the private function _ogbrowser_clear_logs; this function writes to a
couple of text files present in the ogLive environment.
The contents of these files are printed out to the blue text areas
in the "real time log" view.
Fix error paths in live operations which do not
reset the "browser" to the main page (one with the menu).
Add error logging messages when:
* _restartBrowser fails.
* ogChangeRepo fails.
Improve checksum fetch error handling. For example, when an invalid
repository IP is specified.
UNICAST-CACHE consists of:
1. Checking if the target image is already present at the opengnsys
cache partition. If so, check for integrity (local and remote
checksum). If the image is not present in the cache partition,
download the target image into it.
2. Restore the image from the cache partition.
This commit adds native support for this operation in ogClient's
Python code.