[File] [PATCH] of Magdir/archive, virtual VirtualBox *.NVRAM described as tar archive

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Wed Oct 25 23:55:50 UTC 2023


Hello,
some weeks ago I had to migrate to Windows 10. During that process I
lost some VirtualBox machines, so I looked for file formats related to
VirtualBox. One such format uses the file name extension nvram.

When running file command version 5.45 with the -e tar option on such
examples and related files, I get output like:

Black_Cobra_003.cbt: Comic Book archive, tar archive
		     , 1st image 19.jpg
FreeDOS_1.ova:       Open Virtualization Format Archive
		     , with FreeDOS_1.ovf
Mint-21.1_2nd.nvram: POSIX tar archive (GNU), file TpmEmuTpms/permall
		     , mode 0100700, uid 0000000, gid 0000000
		     , size 00000010451, seconds 14431206570
		     , user someone, group somegroup
OS X 10.11.nvram:    data
Vista.nvram:         data
Win10_22H2de.nvram:  POSIX tar archive (GNU), file TpmEmuTpms/permall
		     , mode 0100700, uid 0000000, gid 0000000
		     , size 00000010451, seconds 14344626366
		     , user someone, group somegroup
Win11-no_tar.nvram:  data
tar-1.35.tar:        POSIX tar archive, directory tar-1.35/
		     , mode 0000755, uid 0001750, gid 0001750
		     , size 00000000000, seconds 14455433533
		     , user gray, group gray

For the VirtualBox samples the --extension option displays only
tar/gtar or ???. Furthermore, with the -i option only
application/x-gtar or the generic application/octet-stream is shown for
the nvram samples.

For comparison I also ran the file format identification utility DROID
(see https://sourceforge.net/projects/droid/). The program often
freezes here. The samples are described as "Tape Archive Format" with
mime type application/x-tar by PUID x-fmt/265. The OVA and NVRAM
suffixes are considered bad. The CBT sample is described as "Comic Book
Archive" via PUID fmt/1462 based on the file name suffix (see
EXTENSION_MISMATCH true in droid-nvram.csv.gz).

For comparison I also ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). The samples described as TAR
by the file command are also described here, with low priority, as "TAR
- Tape ARchive (GNU)" by ark-tar-gnu.trid.xml with the generic mime type
application/x-gtar and 2 wrong suffixes (.TAR/GTAR). With highest
priority these samples are described as "VirtualBox saved (U)EFI BIOS
settings (TAR)" by nvram-virtualbox-tar.trid.xml with mime type
application/x-virtualbox-nvram. The other nvram samples are described
similarly, without the (TAR) phrase, by nvram-virtualbox.trid.xml. The
other tar based samples are described with low priority as "TAR - Tape
ARchive" by ark-tar-posix.trid.xml and ark-tar-file.trid.xml. The OVA
sample is described correctly with highest priority as "Open
Virtualization Format package" by ova.trid.xml, whereas for the CBT
samples no sub classification is shown (see appended
trid-v-nvram.txt.gz).

Unfortunately I found no file format description for such VirtualBox
nvram samples. I and other people often complain about Microsoft's
behaviour, but open software is also not the holy grail in every field.
Such nvram samples are used and installed by VirtualBox, but the file
type is not officially registered and no sufficient file specification
can be found. Some people say "may the source be with you", but when
unpacking the VirtualBox source packages I get about 1 GB of source text
files. Unfortunately I have neither enough expertise nor enough time to
find the needed explanations there.

But the VirtualBox User Manual has, in Chapter 8 about VBoxManage, a
section about the modifynvram command. This command lists and modifies
the NVRAM content of a virtual machine. So I use this as reference URL.
That is expressed inside Magdir/virtual by comment lines like:
# URL: 		https://www.virtualbox.org/
#		manual/ch08.html#vboxmanage-modifynvram
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/n/nvram-virtualbox.trid.xml

The interesting sub command is listvars. It lists all UEFI variables
stored in the virtual machine along with their owner UUID. This can be
done, for example, with a command line like:
      VBoxManage modifynvram "Mint-21.1_2nd" listvars

So we get the variable names and their GUIDs. In the global strings of
the TrID definition we find the listed UEFI variables again, but encoded
as UTF-16. Some are obviously identified as UEFI variables. These are
expressed by lines like:
	<String>M'E'M'O'R'Y'T'Y'P'E'I'N'F'O'R'M'A'T'I'O'N</String>
	<String>K'E'Y'0'0'0'0</String>
	<String>B'O'O'T'O'R'D'E'R</String>
	<String>B'O'O'T'0'0'0'0</String>
	<String>M'T'C</String>
	<String>C'O'N'O'U'T</String>
	<String>C'O'N'I'N</String>
	<String>T'I'M'E'O'U'T</String>
	<String>P'L'A'T'F'O'R'M'L'A'N'G</String>
	<String>I'N'I'T'I'A'L'A'T'T'E'M'P'T'O'R'D'E'R</String>

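The UTF-16 encoding mentioned above can be illustrated with a short
Python sketch. It assumes that the names are stored as UTF-16LE (little
endian) and that the apostrophes in the TrID strings stand for the zero
bytes between the characters. Both points are assumptions, and the
helper name find_utf16_names is made up for this illustration. TrID
shows the names in upper case, while the sketch uses the mixed case
spelling BootOrder.

```python
def find_utf16_names(data: bytes, names):
    """Return the names whose UTF-16LE encoding occurs in data.

    Assumption: variable names are stored as UTF-16LE, so "MTC"
    appears as the bytes 4D 00 54 00 43 00.
    """
    found = []
    for name in names:
        if name.encode("utf-16-le") in data:
            found.append(name)
    return found


# Hypothetical sample buffer, not taken from a real NVRAM file.
sample = b"\x00" * 8 + "BootOrder".encode("utf-16-le") + b"\xff" * 4
print(find_utf16_names(sample, ["BootOrder", "Timeout", "MTC"]))
```

On a real sample the buffer would be the file content, for example the
bytes read from Win11-no_tar.nvram.
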
I can query the content of a given UEFI variable with the sub command
queryvar. This looks, for example, like:
	VBoxManage modifynvram "Win10_test" queryvar --name=Boot0000
	VBoxManage modifynvram "Win10_test" queryvar --name=Boot0001
	VBoxManage modifynvram "Win10_test" queryvar --name=Boot0002
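
The repeated queryvar calls can be generated instead of typed. The
following Python sketch only builds the argument lists. The function
names are hypothetical, and it assumes that VBoxManage is on the search
path and that the boot entries are numbered with four hexadecimal
digits.

```python
def queryvar_command(vm_name: str, var_name: str):
    """Build the argument list for one queryvar call."""
    return ["VBoxManage", "modifynvram", vm_name,
            "queryvar", f"--name={var_name}"]


def boot_entry_commands(vm_name: str, count: int = 3):
    """Commands for Boot0000, Boot0001, ... (four hex digits)."""
    return [queryvar_command(vm_name, f"Boot{i:04X}") for i in range(count)]


for cmd in boot_entry_commands("Win10_test"):
    print(" ".join(cmd))
```

Each list can be handed to subprocess.run() on a machine where
VirtualBox is installed.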

The shown contents are boot devices. Encoded as UTF-16, they are
expressed in the TrID definition inside the global string section by
lines like:
	<String>U'E'F'I' 'V'B'O'X' 'C'D'-'R'O'M' 'V'B</String>
	<String>U'I'A'P'P</String>
	<String>E'F'I' 'I'N'T'E'R'N'A'L' 'S'H'E'L'L</String>

Then there are some short lines that look like ASCII inside the global
string section of the TrID definition. These look like:
         <String>EI2YD</String>
         <String>_FVH</String>

Some of them are found in the first 64 bytes of the NVRAM samples, which
seem to be constant. Nothing looks like a magic pattern except for the
4 byte sequence _FVH and the 2 byte sequence AA55 at the end of this
header. So such samples are now described inside Magdir/virtual by lines
like:
0	long		0
 >0x28	string		_FVH
 >>0x64	beshort		0xAA55
 >>>0	use		virtualbox-nvram
0	name	virtualbox-nvram
 >0x64	beshort		x			VirtualBox NVRAM file
!:mime	application/x-virtualbox-nvram
!:ext	nvram
Because I do not know if the start magic is always true (VirtualBox
version 7.0.12 r159484), I put the displaying part inside the sub
routine virtualbox-nvram, so in the worst case things can easily be
changed. Instead of the generic mime type application/octet-stream I
show a user defined one.
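
The three tests of these magic lines can be mirrored in a few lines of
Python, which makes it easy to try the assumptions on other samples.
This is only a sketch: the function name is hypothetical and the buffer
below is synthetic, not a real NVRAM file.

```python
import struct


def looks_like_virtualbox_nvram(data: bytes) -> bool:
    """Mirror the three tests of the magic lines.

    0     long    0        first four bytes are zero
    0x28  string  _FVH     signature
    0x64  beshort 0xAA55   bytes AA 55
    """
    if len(data) < 0x66:
        return False
    if data[0:4] != b"\x00\x00\x00\x00":
        return False
    if data[0x28:0x2C] != b"_FVH":
        return False
    (marker,) = struct.unpack_from(">H", data, 0x64)
    return marker == 0xAA55


# Synthetic buffer built only to exercise the checks.
fake = bytearray(0x100)
fake[0x28:0x2C] = b"_FVH"
fake[0x64:0x66] = b"\xAA\x55"
print(looks_like_virtualbox_nvram(bytes(fake)))      # True
print(looks_like_virtualbox_nvram(b"\x00" * 0x100))  # False
```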

Some NVRAM samples are just tar files. That can be verified by listing
the archive content (see appended 7z-l-slt-nvram.txt.gz) with a command
like:
     7z l -ttar -slt  *.nvram

Then we see the same information as reported by the file command. The
first member is the file TpmEmuTpms\permall, which is writeable,
readable and executable by user someone (uid 0) "-rwx------" with group
somegroup (gid 0). Now comes the interesting part. The second member is
a file named efi\nvram, which is readable and writeable by all
"-rw-rw-rw-". Apparently that file is of the same kind as the other
variant. So I extracted these members under the same name as the
VirtualBox machine, with an additional no_tar phrase before the suffix.
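
The extraction of the efi/nvram member can also be done with Python's
tarfile module instead of 7z. The following is a minimal sketch. The
function names are hypothetical, and the small archive built here is
only a stand-in that uses the two member names seen in the listing.

```python
import io
import tarfile


def extract_efi_nvram(tar_bytes: bytes, member: str = "efi/nvram") -> bytes:
    """Return the content of the given member of a tar based .nvram file."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        extracted = tar.extractfile(member)
        if extracted is None:
            raise ValueError(f"{member} is not a regular file")
        return extracted.read()


def make_fake_nvram_tar() -> bytes:
    """Build a tiny stand-in archive with the two member names."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.GNU_FORMAT) as tar:
        for name, payload in (("TpmEmuTpms/permall", b"tpm"),
                              ("efi/nvram", b"uefi")):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


print(extract_efi_nvram(make_fake_nvram_tar()))
```

For a real sample the returned bytes would be written to a file such as
Win11-no_tar.nvram.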

This variant is described by the TrID definition
nvram-virtualbox-tar.trid.xml. The first member is TpmEmuTpms\permall.
Unfortunately I do not know under which conditions the tar based or the
other variant is generated. The tar based samples were generated by
VirtualBox version 7.0.8 and are also used by version 7.0.12.

According to TrID some NVRAM samples are just tar files with a
TpmEmuTpms/permall archive member. That can be verified by listing the
archive content (see appended 7z-l-nvram.txt.gz) with a command like:
	7z l -ttar *.nvram

Assuming that the TpmEmuTpms\permall file always comes first in the TAR
archive, I can change the magic lines inside Magdir/archive. There,
after some test lines, the displaying part is done by subroutine tar-cbt
for Comic Book archives packed as tar (*.cbt), tar-ova for Open
Virtualization Format Archives (*.ova) or tar-file for other cases. So I
only have to insert test lines for NVRAM samples that check whether the
first archive member name[100] is TpmEmuTpms/permall. This part now
looks like:
  >>>>>>>>0	string	TpmEmuTpms/permall
  >>>>>>>>>0	use	tar-nvram
  >>>>>>>>0	regex	\^[0-9]{2,4}[.](png|jpg|jpeg|tif|tiff|gif|bmp)
  >>>>>>>>>0	use	tar-cbt
  >>>>>>>>0	regex	\^.{1,96}[.](ovf)
  >>>>>>>>>0	use	tar-ova
  >>>>>>>>0	default		x
  >>>>>>>>>0	use	tar-file

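The selection of the sub routine by the first member name can be
sketched in Python as follows. This is a simplified first-match version
and not how libmagic evaluates the lines internally. The function name
is hypothetical, and it is assumed that the leading \^ in the magic
regex lines is meant as the start-of-string anchor.

```python
import re

# Patterns copied from the magic lines; "\^" there becomes "^" here.
CBT_PATTERN = re.compile(r"^[0-9]{2,4}[.](png|jpg|jpeg|tif|tiff|gif|bmp)")
OVA_PATTERN = re.compile(r"^.{1,96}[.](ovf)")


def pick_subroutine(first_member_name: str) -> str:
    """Choose the display sub routine from the first tar member name."""
    if first_member_name.startswith("TpmEmuTpms/permall"):
        return "tar-nvram"
    if CBT_PATTERN.search(first_member_name):
        return "tar-cbt"
    if OVA_PATTERN.search(first_member_name):
        return "tar-ova"
    return "tar-file"


for name in ("TpmEmuTpms/permall", "19.jpg", "FreeDOS_1.ovf", "tar-1.35/"):
    print(name, "->", pick_subroutine(name))
```
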
Unfortunately the TAR format is used by others as a container with
their own file name suffix and even their own mime type, and exact
information about such items is often missing. To overcome this problem,
and to allow for future expansion, I first restructured the current
magic lines. So far, after characterising the different tar variants,
the sub routine tar-file showed information (like file type, name, mode,
time, owner, group) about the first tar archive member, starting with
the interpretation of the type flag. Now I put this part inside the new
sub routine tar-entry. This looks like:
0	name		tar-entry
#>156	ubyte		x		\b, %c-type
 >156	ubyte		x
 >>156	ubyte		0		\b, file
 >>156	ubyte		0x30		\b, file
 >>156	ubyte		0x31		\b, hard link
...
 >>>257	string		>\0		\b, comment: %-.40s

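What tar-entry reads can be reproduced in Python with the usual tar
header layout (name at offset 0, mode at 100, size at 124, time at 136,
type flag at 156). The following is only a sketch: the function names
and dictionary keys are hypothetical, only the type flags quoted above
are mapped, and the header is generated with the tarfile module using
the size and seconds values from the Mint-21.1_2nd.nvram output.

```python
import tarfile

# Only the type flags quoted above; everything else is "other".
TYPE_NAMES = {0: "file", 0x30: "file", 0x31: "hard link"}


def octal_field(block: bytes, start: int, length: int) -> int:
    """Decode a NUL or space padded octal text field of a tar header."""
    text = block[start:start + length].split(b"\0", 1)[0].strip()
    return int(text, 8) if text else 0


def describe_entry(block: bytes) -> dict:
    """Read a few fields of one 512 byte tar header block."""
    return {
        "name": block[0:100].split(b"\0", 1)[0].decode("utf-8", "replace"),
        "mode": octal_field(block, 100, 8),
        "size": octal_field(block, 124, 12),
        "seconds": octal_field(block, 136, 12),
        "type": TYPE_NAMES.get(block[156], "other"),
    }


# Stand-in header; tarfile stores only the permission bits of the mode.
info = tarfile.TarInfo("TpmEmuTpms/permall")
info.mode = 0o700
info.size = 0o10451
info.mtime = 0o14431206570
header = info.tobuf(format=tarfile.GNU_FORMAT)
entry = describe_entry(header)
print(entry)
```

The octal seconds value 14431206570 corresponds to 1684344184 seconds
since the epoch, which is a date in May 2023.
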
Now I replace the corresponding lines inside routine tar-file by a call
of this sub routine. Then I replace the corresponding lines in the other
routines for tar based files. For CBT samples this routine looked like:
0	name		tar-cbt
 >0	string		x		Comic Book archive, tar archive
!:mime	application/vnd.comicbook
!:ext	cbt
 >0	string		>\0		\b, 1st image %-.60s
So the last line is now replaced and the routine now looks like:
0	name		tar-cbt
 >0	string		x		Comic Book archive, tar archive
!:mime	application/vnd.comicbook
!:ext	cbt
 >0	use	tar-entry

Then I do the same in routine tar-ova. For VirtualBox samples the
corresponding sub routine now looks like:
0	name		tar-nvram
 >0	string		x		VirtualBox NVRAM file
!:mime	application/x-virtualbox-nvram
!:ext	nvram
 >0	use	tar-entry
 >512	search/0x1800/s	efi/nvram\0
 >>&0	use	tar-entry
#>>&512	indirect	x
After showing the information of the first archive entry (which starts
with file TpmEmuTpms/permall) I also look for the next member name,
which seems to be always efi/nvram. Now I also call tar-entry for this
second entry. If I wanted, I could also inspect the content of this
entry via an indirect call of Magdir/virtual.
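
Why a search range of 0x1800 is enough for these samples can be checked
with a little arithmetic: the first member has size 10451 octal (4393
bytes), which is padded to 9 blocks of 512 bytes, so the second header
is expected at offset 512 + 4608 = 5120 (0x1400). The Python sketch
below imitates the search roughly. The function names are hypothetical,
the exact window handling of libmagic is an assumption, and the archive
is a stand-in built with the tarfile module.

```python
import io
import tarfile


def next_header_offset(first_size: int) -> int:
    """Offset of the 2nd header: 512 byte header plus padded data."""
    blocks = (first_size + 511) // 512
    return 512 + blocks * 512


def find_second_entry(data: bytes, name: bytes = b"efi/nvram\0",
                      start: int = 512, search_range: int = 0x1800) -> int:
    """Return the offset where name starts inside the window, or -1."""
    end = start + search_range - 1 + len(name)
    return data.find(name, start, end)


# Stand-in archive with the member names and first size seen above.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w",
                  format=tarfile.GNU_FORMAT) as tar:
    for member, size in (("TpmEmuTpms/permall", 0o10451),
                         ("efi/nvram", 16)):
        info = tarfile.TarInfo(member)
        info.size = size
        tar.addfile(info, io.BytesIO(b"\0" * size))
archive = buffer.getvalue()

print(next_header_offset(0o10451))  # 5120
print(find_second_entry(archive))   # 5120
```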

After applying the above mentioned modifications with the 2 patches
file-5.45-archive-nvram.diff and file-5.45-virtual-nvram.diff, I get
with option -e tar a more correct output like:

Black_Cobra_003.cbt: Comic Book archive, tar archive
		     , file 19.jpg
		     , mode 000644
		     , size 00003315356, seconds 11540725637
FreeDOS_1.ova:       Open Virtualization Format Archive
		     , file FreeDOS_1.ovf
		     , mode 0100640, uid 0000007, gid 0000000
		     , size 00000023702, seconds 14423046655
		     , user vboxovf10, group vbox_v7.0.6r155176
Mint-21.1_2nd.nvram: VirtualBox NVRAM file
		     , file TpmEmuTpms/permall
		     , mode 0100700, uid 0000000, gid 0000000
		     , size 00000010451, seconds 14431206570
		     , user someone, group somegroup
		     , file efi/nvram
		     , mode 0100700, uid 0000000, gid 0000000
		     , size 00002040000, seconds 14431411147
		     , user someone, group somegroup
OS X 10.11.nvram:    VirtualBox NVRAM file
Vista.nvram:         VirtualBox NVRAM file
Win10_22H2de.nvram:  VirtualBox NVRAM file
		     , file TpmEmuTpms/permall
		     , mode 0100700, uid 0000000, gid 0000000
		     , size 00000010451, seconds 14344626366
		     , user someone, group somegroup
		     , file efi/nvram
		     , mode 0100666, uid 0000000, gid 0000000
		     , size 00002040000, seconds 14344626337
		     , user someone, group somegroup
Win11-no_tar.nvram:  VirtualBox NVRAM file
tar-1.35.tar:        POSIX tar archive
		     , directory tar-1.35/
		     , mode 0000755, uid 0001750, gid 0001750
		     , size 00000000000, seconds 14455433533
		     , user gray, group gray

With the --extension option the correct file name suffixes are now
shown, like:
Black_Cobra_003.cbt: cbt
FreeDOS_1.ova:       ova
Mint-21.1_2nd.nvram: nvram
OS X 10.11.nvram:    nvram
Vista.nvram:         nvram
Win10_22H2de.nvram:  nvram
Win11-no_tar.nvram:  nvram
tar-1.35.tar:        tar/ustar

I hope my diff files can be applied in a future version of the file
utility. I hope that other users check whether my assumptions are
always true and give hints about information concerning the NVRAM file
format.

With best wishes
Jörg Jenderek
--
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-nvram.txt.gz
Type: application/x-gzip
Size: 1182 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 7z-l-nvram.txt.gz
Type: application/x-gzip
Size: 560 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 7z-l-slt-nvram.txt.gz
Type: application/x-gzip
Size: 579 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment-0002.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/archive.old	2023-07-27 20:04:45.000000000 +0200
+++ file-5.45/magic/Magdir/archive	2023-10-26 01:11:40.128266800 +0200
@@ -27,2 +27,7 @@
 >>>>>>>148	ubyte&0xEF	=0x20	
+# check for specific 1st member name that indicates other mime type and file name suffix
+>>>>>>>>0	string		TpmEmuTpms/permall
+# maybe also look for 2nd tar member efi/nvram containing UEFI variables part
+#>>>>>>>>>512	search/0x1800	efi/nvram\0		EFI_PART_FOUND
+>>>>>>>>>0	use	tar-nvram
 # FOR DEBUGGING: 
@@ -36,3 +41,3 @@
 >>>>>>>>>0	use	tar-ova
-# if 1st member name without digits and without used image suffix and without *.ovf then it is a TAR archive
+# if 1st member name without digits and without used image suffix, without *.ovf and TpmEmuTpms/ then it is a pure TAR archive
 >>>>>>>>0	default		x
@@ -88,3 +93,7 @@
 !:ext	tar/ustar
-# type flag of 1st tar archive member
+# show information for 1st tar archive member
+>0	use	tar-entry
+#	display information of tar archive member (file type, name, permissions, user, group)
+0	name		tar-entry
+# type flag of tar archive member
 #>156	ubyte		x		\b, %c-type
@@ -159,2 +168,21 @@
 >>>257	string		>\0		\b, comment: %-.40s
+# Summary:	VirtualBox NvramFile with UEFI variables packed inside TAR archive
+# URL:		https://www.virtualbox.org/manual/ch08.html#vboxmanage-modifynvram
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/n/nvram-virtualbox-tar.trid.xml
+# Note:		called "VirtualBox saved (U)EFI BIOS settings (TAR)" by TrID and
+#		verified by 7-Zip `7z l -ttar Mint-21.1.nvram` and
+#		VirtualBox `VBoxManage modifynvram "Mint-21.1" listvars`
+0	name		tar-nvram
+# 
+>0	string		x		VirtualBox NVRAM file
+#!:mime	application/x-gtar
+!:mime	application/x-virtualbox-nvram
+!:ext	nvram
+# first name[100] like: TpmEmuTpms/permall
+>0	use	tar-entry
+# 2nd tar member efi/nvram contains UEFI variables part described by ./virtual
+>512	search/0x1800/s	efi/nvram\0
+>>&0	use	tar-entry
+# 2nd tar member efi/nvram content could be described by ./virtual
+#>>&512	indirect	x
 # Summary:	Comic Book Archive *.CBT with TAR format
@@ -171,3 +199,4 @@
 # or maybe like ComicInfo.xml
->0	string		>\0		\b, 1st image %-.60s
+#>0	string		>\0		\b, 1st image %-.60s
+>0	use	tar-entry
 # Summary:	Open Virtualization Format *.OVF with disk images and more packed as TAR archive *.OVA
@@ -186,3 +215,4 @@
 # assuming name[100] like: DOS-0.9.ovf FreeDOS_1.ovf Win98SE_DE.ovf
->0	string		>\0		\b, with %-.60s
+#>0	string		>\0		\b, with %-.60s
+>0	use	tar-entry
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-archive-nvram.diff.sig
Type: application/octet-stream
Size: 1326 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-nvram.csv.gz
Type: application/x-gzip
Size: 586 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment-0003.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/virtual.old	2022-08-29 17:22:27.000000000 +0200
+++ file-5.45/magic/Magdir/virtual	2023-10-26 01:38:21.141221000 +0200
@@ -296,12 +296,29 @@
 0x40	ulelong		0xbeda107f	VirtualBox Disk Image
 >0x44	uleshort	>0		\b, major %u
 >0x46	uleshort	>0		\b, minor %u
 >0	string		>\0		(%s)
 >368	lequad		x		 \b, %lld bytes
 
+# From:		Joerg Jenderek
+# URL: 		https://www.virtualbox.org/manual/ch08.html#vboxmanage-modifynvram
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/n/nvram-virtualbox.trid.xml
+# Note:		called "VirtualBox saved (U)EFI BIOS settings" by TrID and
+#		verified partly by VirtualBox version 7.0.12 `VBoxManage modifynvram <uuid|vmname> listvars`
+# first 64 bytes seems to be constant
+0	long		0
+>0x28	string		_FVH
+>>0x64	beshort		0xAA55
+>>>0	use		virtualbox-nvram
+#	display information of virtualbox *.nvram
+0	name	virtualbox-nvram
+>0x64	beshort		x			VirtualBox NVRAM file
+#!:mime	application/octet-stream
+!:mime	application/x-virtualbox-nvram
+!:ext	nvram
+
 0	string/b	Bochs\ Virtual\ HD\ Image	Bochs disk image,
 >32	string	x				type %s,
 >48	string	x				subtype %s
 
 0	lelong	0x02468ace			Bochs Sparse disk image
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-virtual-nvram.diff.sig
Type: application/octet-stream
Size: 846 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231026/eb9dc085/attachment-0001.obj>

