Add explicit check for .full.sum after downloading it.
Rewrite error logs: one of them is misleading when checksum validation
fails, as it refers to a missing .full.sum although the failure could
have a different cause.
Fix image size, permissions and creation time.
Improve error report related to these parameters now showing the
exact cause of the problem if any occurred during the definition
of image size, file permissions or image creation time values.
Use the constant OG_CACHE_IMAGE_PATH from cache.py to obtain the
location of the directory where images are stored.
This way the path can be changed from one single point.
Rename OGIMG as OG_IMAGE_PATH.
Rename OGCACHE_MOUNTPOINT as OG_CACHE_PATH.
Define OG_CACHE_IMAGE_PATH as OG_CACHE_PATH + OG_IMAGE_PATH.
This will serve to have a unique point to obtain cache related
paths.
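A minimal sketch of how the three constants could relate. Only the constant names come from this change; the literal values are assumptions based on the cache image directory used by tiptorrent.py, and the real cache.py may differ.

```python
# Hypothetical values; only the constant names are taken from the commit.
OG_IMAGE_PATH = '/opt/opengnsys/images/'
OG_CACHE_PATH = '/opt/opengnsys/cache'

# Plain concatenation as described: cache mountpoint + image directory.
OG_CACHE_IMAGE_PATH = OG_CACHE_PATH + OG_IMAGE_PATH
```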
This method reports the /dev path to the cache partition; rename it.
Add an explicit check that blkid is successful.
Also add logging to report that the device path to the cache is not found.
Add exception checks to the os.mkdir operation and log the error
found. The previous implementation was too optimistic and only
handled mount related errors.
Report mkfs failure for every partition. This does not raise an
exception as that would skip partprobe operations and the mkfs
operations in the next potentially well-formatted partitions.
otherwise error path uses uninitialized variable
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 112, in ogReduceFs
return ret
UnboundLocalError: local variable 'ret' referenced before assignment
Capture only the relevant exception types in each except block.
The capture of the Exception type means hiding information for
unhandled error cases, even for syntax errors in the codebase.
Using a more fine grained exception filtering improves error
traceability.
Log an error message in known error cases and log a backtrace
otherwise.
Define a new error type OgError to be used in all the 'raise'
blocks to define the error message to log. The exception
propagates until it reaches send_internal_server_error() where
the exception type is checked. If the type is OgError we log
the exception message. Logs the backtrace for other types.
The initial error implementation printed a backtrace every time
an error occurred. The next iteration changed it to only print
a backtrace in a very particular case but ended up omitting too
much information such as syntax errors or unknown error context.
The current implementation only logs the message in the cases we
already cover in the codebase and logs a backtrace in the others,
enabling a better debugging experience.
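A minimal sketch of the idea, not the actual ogClient code. OgError is the name used above; log_failure() and run() are hypothetical stand-ins for the type check done in send_internal_server_error().

```python
import logging


class OgError(Exception):
    """Known error whose message is enough for the log."""


def log_failure(exc):
    # Hypothetical helper mirroring the described type check.
    if isinstance(exc, OgError):
        logging.error(exc)
    else:
        # Unknown error: keep the full backtrace for debugging.
        logging.error('Unexpected error', exc_info=exc)


def run(operation):
    # Top-level catch-all; inner code should still use specific types.
    try:
        operation()
    except Exception as e:
        log_failure(e)
        return False
    return True
```

With this split, raising OgError('...') anywhere in an operation yields a single log line, while any other exception type still produces a traceback.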
Refine 97647c32aa utils: add enforce_gpt argument to get_efi_partition()
to provide more explicit error when trying to boot Windows UEFI from DOS
partition.
Replace nonexistent mountpoint variable to report a failed
mount operation before an OS probe from a partition.
Improve the semantics of the error message replacing 'at' with
'into'.
Remove the period at the end of the log message.
Make init_cache() use the actual cache mountpoint returned by the
function mount_cache() for the creation of the cache directories
instead of a hardcoded path.
Implement a Python equivalent of ogCopyEfiBootLoader as the
function copy_efi_bootloader. This function copies the contents of
the folder of the EFI loader in the ESP into a ogBoot folder at
the root of the partition target of an image creation.
copy_efi_bootloader is a Windows only functionality.
The Windows bootloader only supports a UEFI boot from a GPT
partition. Set enforce_gpt to True in every codepath related to
Windows. When enforce_gpt is set to True get_efi_partition()
raises an exception when an MBR partition scheme is detected.
Make is_uefi_supported() only check for /sys/firmware/efi as
get_efi_partition() will detect a missing ESP or an invalid
partition scheme. Stop using get_efi_partition() inside
is_uefi_supported() as the former is eventually called in every
UEFI related code.
UEFI supports both MBR and GPT as partition schemes and this is
a required change to handle the particular case of Windows not
being able to boot UEFI from a MBR partition scheme.
Provide more information in exception messages as those are the
source of the logging messages. Add information about paths, files
or configuration related to the operation associated to the
exception.
Replace exception types to be more explicit about the nature of
the error.
Improve the exception raising semantics by using the 'from' keyword:
this chains the original exception to the new one as its cause, so the
original error information is preserved.
Use only the exception messages as the main resource for error
messages.
The previous error code had string duplication in the form of:
logging.error('msg here')
raise Exception('msg here')
That approach also has the downside of having log duplication as
it had the local logging.error() and a global logging.exception()
inside send_internal_server_error capturing the exception message.
The current code only requires raising an exception with a proper
error message.
Improve exception messages to give more error context.
Log every AssertionError as a backtrace.
Use the 'raise Exception from e' syntax to turn a previously
raised exception 'e' into an exception with additional context or a
different type. This also prevents the message that warns about
newer exceptions being raised while handling an initial exception.
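A short sketch of the 'raise ... from e' pattern. The function name, the message and the use of RuntimeError are illustrative assumptions, not code from this change.

```python
def read_os_release(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # Chain the low-level error; it stays reachable as __cause__.
        raise RuntimeError(f'Cannot read os-release file at {path}') from e
```

The new exception carries the path in its message, and the original OSError is kept as the explicit cause in any backtrace.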
Create ogboot.me and ogboot.secondboot as empty files and
ogboot.firstboot with the value "iniciado" in the root of
the BIOS Windows system partition.
The files must contain data for GRUB to be able to write content,
therefore these are created containing 3072 null bytes.
The Windows boot process is handled by the "pxe" profile.
There the files ogboot.me, ogboot.firstboot and ogboot.secondboot
are used as a state machine to choose between booting Windows and
ogLive.
The first Windows boot happens if ogboot.me and ogboot.firstboot
are identical, then "iniciado" is written in ogboot.firstboot.
We skip this stage as we create ogboot.firstboot with 'iniciado'.
The second Windows boot occurs if ogboot.me and ogboot.secondboot
are identical, then "iniciado" is written in ogboot.secondboot.
After the Windows boot ogLive is booted.
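A rough sketch of the file creation step under stated assumptions. The file names, the "iniciado" marker and the 3072 byte filler come from the description above; the function name is hypothetical, and writing the marker as plain unpadded text is an assumption.

```python
import os

OGBOOT_FILLER = b'\x00' * 3072  # size taken from the description above


def create_ogboot_files(root):
    # The "empty" state files still carry filler so GRUB can overwrite them.
    for name in ('ogboot.me', 'ogboot.secondboot'):
        with open(os.path.join(root, name), 'wb') as f:
            f.write(OGBOOT_FILLER)
    # Assumption: the marker is stored as plain text without padding.
    with open(os.path.join(root, 'ogboot.firstboot'), 'w') as f:
        f.write('iniciado')
```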
Create a bios.py file to hold all the BIOS specific functions.
Implement the _boot_bios_linux in Python. The new boot process
tries to find the vmlinuz and initrd binaries at the desired
partition. Then it tries to load them with kexec with the proper
Grub boot params.
One step closer to the removal of the boot legacy script.
The image creation process was being interrupted by an error
trying to read the Windows registry by the Hivex library.
Now the exceptions are handled and an error is reported.
The OS probe logic must be able to check a distro programmatically,
add get_linux_distro_id to return an id without versioning.
Ensure the availability of 'ubuntu' when we need to ensure certain
features are only used with a supported system.
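A possible shape for such a helper, as a sketch only. The function name comes from this change; the signature, the parsing of the ID field and the 'unknown' fallback are assumptions.

```python
def get_linux_distro_id(osrelease_path):
    """Return the ID field of an os-release file, e.g. 'ubuntu'."""
    try:
        with open(osrelease_path) as f:
            for line in f:
                key, sep, value = line.strip().partition('=')
                if sep and key == 'ID':
                    return value.strip('"\'')
    except OSError:
        # Missing or unreadable file: report an unknown distro.
        return 'unknown'
    return 'unknown'
```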
This change is a preparation for reimplementing the BIOS boot
in order to deprecate the legacy script. All the codepaths to
boot systems located at a partition are now called from the
boot_os_at function enabling an easier structure for the incoming
code.
Checking the existence of /sys/firmware/efi is not enough, as it might
appear sometimes in BIOS installs if the BIOS configuration is not
proper. Checking for the EFI partition is the safest method
to verify the install type.
The function getlinuxversion receives a path to the os-release
file. The case of not being able to open it was not handled and
thus causing an unwanted exception.
The json functionality proposed upstream might be merged one day
in efibootmgr so deploying a fork would not be needed anymore.
This change aims to ease the migration once that day comes.
Replace IniciarSesion script in favor of native Python code when booting
a UEFI system into Linux. This completes the implementation of booting
into an OS on a UEFI compliant system.
Add utility module related to the process of booting a system from a
client's partition.
The main utility function to boot a client's system is boot_os_at(), from
which firmware (UEFI or BIOS) and os-family specific private functions are invoked.
This initial commit adds the UEFI Windows boot function.
Add UEFI related utilities inside a new utility module: uefi.py
_check_efibootmgr_json
======================
Check if the system efibootmgr executable supports JSON output. This is
a private function used only by other functions from uefi.py.
is_uefi_supported
=================
Check if the system supports UEFI firmware.
run_efibootmgr_json
===================
Runs efibootmgr with json output support. Return the JSON output as a
Python dict.
efibootmgr_create_bootentry
===========================
Create nvram boot entry. This bootentry is usually later set to boot
next just once via "BootNext" nvram variable.
efibootmgr_delete_bootentry
===========================
Delete a nvram boot entry. Used to avoid duplicates when booting the
same disk and partition from a given client.
efibootmgr_bootnext
===================
Set nvram "BootNext" variable to a given boot entry so after client
reboot, PXE is not executed and the given boot entry takes precedence.
Add dependency with efibootmgr version >= 18, and efibootmgr JSON output
which is currently out of tree from util-linux repo.
Add a basic OS family enumeration: OSFamily.
Add utility function that probes for an installed Linux or Windows
system, returns the corresponding enum value, OSFamily.UNKNOWN
otherwise.
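A minimal sketch of the enumeration and probe. OSFamily and OSFamily.UNKNOWN are named above; the other member names, the function name and the marker paths are illustrative guesses rather than the real probe logic.

```python
import os
from enum import Enum


class OSFamily(Enum):
    LINUX = 1
    WINDOWS = 2
    UNKNOWN = 3


def get_os_family(mountpoint):
    # Marker paths are assumptions, chosen only for illustration.
    if os.path.exists(os.path.join(mountpoint, 'etc', 'os-release')):
        return OSFamily.LINUX
    if os.path.exists(os.path.join(mountpoint, 'Windows', 'System32')):
        return OSFamily.WINDOWS
    return OSFamily.UNKNOWN
```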
Add utility function inside disk.py to find, if any, the first ESP
partition of a given disk.
The disk is provided as an integer (starting at 1 following OpenGnsys
scripts usual values), meaning the (n-1)th disk from the disk array
returned from get_disks(). In the future a better mechanism should be
put in place to fetch probed disks from a running client.
This change is part of the upcoming drop of "IniciarSesion" script in
favor of a Python native approach. Specifically regarding UEFI systems.
value extraction did not have error checking and was handled in
a one-liner. The actual implementation expands the parsing logic
and moves it into a function.
Revisit 5056b8f0d5 ("fs: validate ntfsresize dry-run output") that has
introduced a possible infinite loop.
Disentangle this loop while at it: iterate until best smallest size is
found by probing.
do not return the returncode, instead return an integer.
do not use
except CalledProcessError as e:
it causes another exception while handling the exception.
Remount the original image repository.
it should be possible to simplify this further by:
- stacking mounts, no need to umount initial repo and mount it again
when switching to the new repo, because remount back initial repo
might fail (!)
- use check=False and simply check for x.returncode
cover more error cases where exceptions need to be raised.
check return code in the invoked subprocess.
restoreImageCustom has been intentionally left behind, it
is unclear what this custom script returns on success and
error.
validate 'Needed relocations: ' is in place before stepping on the split chunks
(2024-01-11 10:28:16) ogClient: [ERROR] - Exception when running "image create" subprocess
Traceback (most recent call last):
File "/opt/opengnsys/ogClient/src/live/ogOperations.py", line 454, in image_create
ogReduceFs(disk, partition)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 105, in ogReduceFs
_reduce_ntfsresize(partdev)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 235, in _reduce_ntfsresize
extra_size = int(out_resize_dryrun.split('Needed relocations : ')[1].split(' ')[0])*1.1+1024
IndexError: list index out of range
if not present, no need to adjust size
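The validation could look roughly like this. The marker string is the one visible in the traceback above; the function name and the choice of returning 0 when the marker is absent are assumptions.

```python
def parse_needed_relocations(output):
    """Return the relocation size reported by ntfsresize, or 0 if absent."""
    marker = 'Needed relocations : '
    if marker not in output:
        # Marker missing: nothing to relocate, no size adjustment needed.
        return 0
    return int(output.split(marker)[1].split(' ')[0])
```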
Users can create an image of a filesystem that contains no OS, therefore,
instead of raising an exception when no OS is detected, deliver an "unknown"
OS and an empty list of software.
When a client's hardware presents an empty pci storage child there is an
invalid call to _bytes_to_human: a string is supplied as a default value
if the storage child does not present a 'size' attribute.
Fix this by checking if 'size' is present in the JSON output from lshw.
If size is present then map the bytes to a human readable string using
_bytes_to_human, if no size is present then use 'Empty slot' to indicate
that the memory bank is not being used.
When a client's hardware presents an empty memory bank, an invalid call
to _bytes_to_human is performed because None is passed as a parameter.
size = _bytes_to_human(obj.get('size', None))
Fix this by checking if 'size' is present in the JSON output from lshw.
If size is present then map the bytes to a human readable string using
_bytes_to_human, if no size is present then use 'Empty slot' to indicate
that the memory bank is not being used.
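A sketch of the described check. The _bytes_to_human below is a simplified stand-in for the real helper, and describe_size is a hypothetical name.

```python
def _bytes_to_human(size):
    # Simplified stand-in for the real helper in hw_inventory.py.
    for unit in ('B', 'KiB', 'MiB', 'GiB'):
        if size < 1024:
            return f'{size:.0f} {unit}'
        size /= 1024
    return f'{size:.0f} TiB'


def describe_size(obj):
    # 'size' is absent for unused slots in the lshw JSON output.
    if 'size' in obj:
        return _bytes_to_human(obj['size'])
    return 'Empty slot'
```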
Some users have mistakenly reported tiptorrent problems when the process
takes a long time. Specifically by rebooting or powering off the client
in the middle of the md5sum computation stage, just after the tiptorrent
transfer.
Same problem occurs when image creation command takes a long period of
time.
In order to help the user understand the different stages of commands
such as image creation or image restore using tiptorrent, the following
changes have been made to the current logging solution:
- Add log messages to warn users not to reboot or shut down the client
during a tiptorrent transfer, and also during the md5sum computation
stage.
- Add a log message telling the user that the image creation processes
have started.
- Use logging.exception inside "except:" blocks to print a traceback
with the log message.
(https://docs.python.org/3/library/logging.html#logging.exception)
The first stage of parsing the "lshw -json" command output is to load
the json string into a Python dictionary. lshw output is large and
varies from machine to machine, so it's not safe to assume that
different keys will be present in the dictionary.
Use dict.get() instead of dict[key] to avoid KeyError exceptions.
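For illustration, a small sketch of defensive access to lshw JSON. 'children' and 'product' are keys found in lshw output; the function name and the 'Unknown' default are assumptions.

```python
import json


def list_child_products(lshw_json):
    data = json.loads(lshw_json)
    products = []
    # 'children' and 'product' may be missing on some machines.
    for child in data.get('children', []):
        products.append(child.get('product', 'Unknown'))
    return products
```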
The subprocess module expects bytes-like object for "input" parameter by
default. Passing a string object results in the following error:
(2023-06-13 14:44:43) ogClient: [ERROR] - Exception when running "image create" subprocess
(2023-06-13 14:44:43) ogClient: [ERROR] - Unexpected error
Traceback (most recent call last):
File "/opt/opengnsys/ogClient/src/live/ogOperations.py", line 465, in image_create
ogExtendFs(disk, partition)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 124, in ogExtendFs
_extend_ntfsresize(partdev)
File "/opt/opengnsys/ogClient/src/utils/fs.py", line 250, in _extend_ntfsresize
proc = subprocess.run(cmd, input='y')
File "/usr/lib/python3.8/subprocess.py", line 495, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/usr/lib/python3.8/subprocess.py", line 1013, in communicate
self._stdin_write(input)
File "/usr/lib/python3.8/subprocess.py", line 962, in _stdin_write
self.stdin.write(input)
TypeError: a bytes-like object is required, not 'str'
Fixes: dd999bfe34 ("utils: rewrite ogReduceFs")
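Two ways to avoid this TypeError, shown with a stand-in command that echoes stdin instead of ntfsresize. Which variant the fix uses is not stated here, so treat both as sketches.

```python
import subprocess
import sys

# Stand-in command that echoes stdin; the real call runs ntfsresize.
cmd = [sys.executable, '-c', 'import sys; print(sys.stdin.read())']

# Either pass a bytes object as input...
proc = subprocess.run(cmd, input=b'y', stdout=subprocess.PIPE)

# ...or enable text mode so that a str is accepted.
proc_text = subprocess.run(cmd, input='y', stdout=subprocess.PIPE,
                           universal_newlines=True)
```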
There is a corner case in which a target NTFS filesystem is already
shrunken. When this happens ntfsresize text output parsing breaks.
Check when ntfsresize reports nothing to do, warn the user about this
and stop the dry-run ntfsresize loop.
_extend_ntfsresize contains an incorrect variable name inside
subprocess.run referring to the resize command value.
Simplify this variable name inside each specific _extend_* function:
s/cmd_resize2fs/cmd
s/cmd_ntfsresize/cmd
Don't raise an exception if any Windows program is missing the DisplayName
node in the Windows registry.
This attribute/node should contain the program's name. This name is used
as the package's name in the software set (software inventory).
This patch should be considered a hotfix, python-hivex does not report
any helpful message about this error.
(2023-05-09 14:43:13) ogClient: [ERROR] - Unexpected error
Traceback (most recent call last):
[...]
RuntimeError: Success
Before this patch, image creation *might* fail because it cannot create
the software inventory associated with the image due to the previously
described error. The software inventory is part of the response payload
of the image creation command (see src/ogRest:image_create).
Fixes: 04bb35bd86 (live: rewrite software inventory)
Add utility function to unmount any mountpoint present in the /mnt
folder.
This function is a simplified version of the legacy bash function
ogUnmountAll used in several operations.
Drop subprocess call to bash function ogExtendFs. Use a native python
solution with subprocess calls to the required underlying tools.
Use get_filesystem_type to get the present filesystem from a partition
and call the corresponding filesystem grow function.
Filesystem specific functions are declared "_extend_{filesystem}" and
should not be imported elsewhere.
Each filesystem specific function wraps a subprocess call to the
required underlying program:
- NTFS filesystems: "ntfsresize -f [partition]"
- ext4 filesystems: "resize2fs -f [partition]"
Set NTFS related subprocess stdin to 'y' because human input cannot be
unset with other ntfsresize parameters.
Drop subprocess call to bash function ogReduceFs. Use a native python
solution with subprocess calls to the required underlying tools.
Use get_filesystem_type to get the filesystem from a partition and call
the corresponding supported filesystem shrink function.
Filesystem specific functions are declared "_reduce_{filesystem}" and
should not be imported elsewhere.
In case of NTFS filesystems, the output of 'ntfsresize' is processed
directly. This is dirty, but we can expect no changes to the output
strings if we read the following comment in the ntfsresize.c source
code:
https://github.com/tuxera/ntfs-3g/blob/edge/ntfsprogs/ntfsresize.c#L12
ntfsresize requires previous dry-run executions to confirm
that the resizing is possible.
If a dry-run fails but a 10% increase in size is still smaller than the
original filesystem then retry the operation until the dry-run reports
success or the increased size is bigger than the original.
If resizing to a smaller ntfs filesystem is not possible then ogReduceFs
will do nothing.
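The retry logic can be sketched independently of ntfsresize. Here dry_run_ok is a hypothetical callable standing in for one dry-run execution, and the function name is an assumption.

```python
def find_shrink_size(min_size, original_size, dry_run_ok):
    """Probe sizes upward from min_size in 10% steps.

    Returns the first size accepted by dry_run_ok, or None when no
    size below original_size is accepted (shrinking is not possible).
    """
    size = min_size
    while size < original_size:
        if dry_run_ok(size):
            return size
        size = int(size * 1.1) + 1  # +1 guarantees progress for tiny sizes
    return None
```

Because the size grows on every iteration and the loop stops at original_size, this form cannot loop forever.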
hw_inventory.py defines classes and helper functions enabling
fetching of hardware inventory from a running client.
Uses a subprocess call to the command 'lshw -json' to obtain hardware
information.
Relevant public functions:
> get_hardware_inventory()
Main function encapsulating subprocess and output processing
logic.
Returns a HardwareInventory object.
> legacy_list_hardware_inventory(inventory)
Legacy string representation of parameter HardwareInventory object
Rename software inventory file to sw_inventory to better distinguish
it from a future hardware inventory code.
In the future sw_inventory and hw_inventory might be merged together
once each file is tidied up.
Replace legacy bash script in favor of Python code. Improves error
traceability and further development.
The software inventory operation mounts the target partition and it
fetches the list of installed software (package set). Once the
operation is complete, it unmounts the target partition.
For Windows, introduce hivex library python bindings for accessing
Windows registry hive files (https://libguestfs.org/hivex.3.html).
This operation is still processed by legacy code in the server side
(ogAdmServer.c in ogServer). Legacy backend process expects the software
inventory like the following example:
"software": "Windows 10 Enterprise Evaluation 2004 \nIntel(R) Network Connections 24.0.0.11 24.0.0.11 ..."
The os name is inserted first in this list followed by a '\n' separated
string of the software packages.
The legacy server code can be found in function actualizaSoftware at
ogServer/src/ogAdmServer.c
It is expected for software inventory payload to change in the future to
a simpler solution using just a json array of strings.
Change the name of the helper functions used when getting opengnsys
image information (legacy ogGetImageInfo bash script). As of now the
process consists of decompressing the image file with lzop and feeding
that output to partclone.info.
Prefer a more explicit function name rather than "process_image_*"
Add comment about skipping the first two lines of partclone.info output.
Usually, partclone.info starts printing out these two lines that are not
related to the partclone image information:
Partclone v0.3.23 http://partclone.org
Showing info of image (-)
As long as partclone.info output doesn't change we'll be fine, but we
should not depend on human readable output. This might change in the
future (i.e. adding json output format to partclone.info).
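A tiny sketch of the skipping step, assuming the banner is always exactly two lines. The function name is hypothetical.

```python
def partclone_info_lines(output):
    # Drop the two banner lines ("Partclone v..." and "Showing info...").
    return output.splitlines()[2:]
```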
Rewrites this legacy script behavior using native Python code, using
subprocess module when executing programs like partclone.info or lzop.
ogGetImageInfo is a bash script that retrieves information regarding an
OpenGnsys partition image, specifically:
- clonator
- compressor
- filesystem
- datasize (size of the partition image)
This rewrite only supports partclone and lzop compressed images. This is
standard behavior, we have no reports of other programs or compression
algorithms in use.
Keep this legacy function with hungarian notation to emphasize this is
still a legacy component that may be replaced in the future.
Drop ogChangeRepo Bash script in favor of a native Python
approach. Use only necessary subprocess calls instead of bringing
all the logic of this function into a Bash script black box.
ogChangeRepo unmounts the current OpenGnsys image samba folder
(/opt/opengnsys/images) and mounts (connects to) a new directory using
the new provided ip address. Keeping access mode from previous mount.
If anything goes wrong when mounting the new directory, it will fallback
to mounting the previous directory.
If no previous OpenGnsys image samba directory is detected, this
function tries to mount the new directory anyway. In this case,
it will raise CalledProcessError if something goes wrong.
init_cache() creates the default directory in which OpenGnsys stores
images when using any cache enabled transfer method.
As of this commit this folder must exist for tiptorrent.py to
work properly.
The subprocess Popen object inside tiptorrent.py uses the
'cwd' optional parameter like:
cwd='/opt/opengnsys/cache/opt/opengnsys/images/'
This folder convention might change in the future.
Adds utility module which wraps several mkfs.* calls as a subprocess.
The main utility function is mkfs(fs, disk, partition, label), which
subsequently calls the corresponding mkfs_*(partition_device) function.
mkfs() supports specifying a drive label where supported.
Other modules using fs.py should call mkfs() only.
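A sketch of how a label-aware mkfs command could be assembled. It only builds the argument list; the function name and the set of supported filesystems are assumptions, while -L (mkfs.ext4, mkfs.ntfs) and -n (mkfs.vfat) are the label options of those tools.

```python
def build_mkfs_command(fs, partition_device, label=None):
    # Label option per tool; unsupported filesystems are rejected.
    label_flag = {'ext4': '-L', 'ntfs': '-L', 'vfat': '-n'}
    if fs not in label_flag:
        raise ValueError(f'Unsupported filesystem: {fs}')
    cmd = [f'mkfs.{fs}']
    if label:
        cmd += [label_flag[fs], label]
    return cmd + [partition_device]
```

The resulting list can be handed to subprocess.run() by the caller.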
Fix error paths in live operations which do not
reset the "browser" to the main page (one with the menu).
Add error logging messages when:
* _restartBrowser fails.
* ogChangeRepo fails.
Improve checksum fetch error handling. For example, when an invalid
repository IP is specified.
Integrates image restore command into native ogClient code. Further
reduces the need for external Bash scripts.
After a successful image restore, OS configuration is still using
external Bash script "osConfigure/osConfigureCustom".
ogCopyEfiBootloader is an invalid legacy bash function name.
Rename to the correct function name 'ogCopyEfiBootLoader' and
rename utility python wrapper too.
Fixes: 0bd037c1a409c65fbcb01355ee0dd6dca770330e
Do not return the subprocess result for ogReduceFs/ogExtendFs.
ogReduceFs works with or without the target filesystem mounted.
ogExtendFs requires the target filesystem to be mounted.
'ogMount' legacy script invocation should be replaced by a better
mount/umount wrapper.
Use legacy script that saves the Windows-specific content from the ESP
to the image target filesystem.
Current image restore solution from OpenGnsys scripts expects the EFI
partition to be stored in the target system partition. (Only for Windows
10)
For example, storing the ESP in the NTFS partition of a Windows image.
Expect use of bash script ogCopyEfiBootloader until further
integration is merged.
Integrates some parts of this operation into native code, eg: the md5
checksum computation.
Wraps non native processes and commands using the subprocess module.
For example, legacy.py stores bash commands pending integration.
Supports python >=3.6, expected until more modern ogLives are put into
production environments.
Adds new logging handler redirecting messages to the log file
located in the Samba shared directory (applies to live mode
clients, i.e: ogLive)
Parses log level configuration from ogclient.json. See:
{
"opengnsys": {
...
"log": "INFO",
...
}
...
}
Adds --debug option to set root logger level to DEBUG when starting
ogClient. Overrides log level from config file.
In addition:
- Replaces any occurrence of print with a corresponding logging function.
- Unsets log level for handlers, use root logger level instead.
- Default level for root logger is INFO.
- Replaces level from response log messages to debug (ogRest)
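The level selection described above, as a sketch. The "opengnsys"/"log" keys and the --debug option come from this change; the function name and its arguments are assumptions.

```python
import argparse
import json
import logging


def configure_logging(config_text, argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args(argv)

    config = json.loads(config_text)
    # Default root level is INFO when the config omits "log".
    level = config.get('opengnsys', {}).get('log', 'INFO')
    if args.debug:
        level = 'DEBUG'  # the command line wins over the config file
    logging.getLogger().setLevel(level)
    return level
```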
Returns true if target is already a mountpoint. Does not call mount.
It's possible that another device might be mounted in the target
mountpoint. A future check between the source and target for
equal device major:minor must be added.
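A sketch of the early return, assuming a helper with this hypothetical name and signature. It does not yet compare source and target devices, as noted above.

```python
import os
import subprocess


def mount_mkdir(source, target):
    if os.path.ismount(target):
        # Already a mountpoint: report success without calling mount.
        return True
    os.makedirs(target, exist_ok=True)
    return subprocess.run(['mount', source, target]).returncode == 0
```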
Generates a cache.txt file if a cache partition is detected.
OpenGnsys stores information about stored images in its 'cache'
partition via a text file.
The file is stored in a samba shared directory, mounted at
'/opt/opengnsys/log/' in a live client. The file name is '{ip}.cache.txt'.
Previously, the generation of this file was delegated to external bash
scripts.