[File] [PATCH] of Magdir/virtual for Microsoft Disk Image eXtended (*.vhdx)
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Sat Dec 15 17:00:12 UTC 2018
Hello,
a few days ago I ran the file command, version 5.35, on my disk images.
Disk images with the name extension vhdx are described only as "data".
VHDX (Virtual Hard Disk v2) is the successor format to VHD, so I added
new lines to Magdir/virtual after the VHD entry. Information about the
file format can be found on Microsoft servers, so I added a reference
URL to the document [MS-VHDX].pdf, dated 9/12/2018 and titled "Virtual
Hard Disk v2 (VHDX) File Format".
According to the Microsoft documentation, such images start with the
VHDX_FILE_IDENTIFIER signature 0x656C696678646876, which is the ASCII
string "vhdxfile" read as a little-endian 64-bit value. This becomes
the first magic line:
0 string vhdxfile
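As a quick cross-check that is not part of the patch, the following
Python sketch shows that the 64-bit value from the documentation and the
string used in the magic line describe the same eight bytes. The name
SIGNATURE is only chosen for illustration.

```python
import struct

# VHDX_FILE_IDENTIFIER signature as quoted from the documentation.
SIGNATURE = 0x656C696678646876

# Pack it as an unsigned little-endian 64-bit value, the way it sits on disk.
raw = struct.pack("<Q", SIGNATURE)
print(raw)  # b'vhdxfile'
```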
To distinguish a disk image from ASCII text that starts with the phrase
"vhdxfile", I look for more characteristics. According to the docs, one
header is stored at offset 64 KB and another at 128 KB. Each starts with
the VHDX_HEADER signature "head". This is now used as
>0x10000 string head Microsoft Disk Image eXtended
The name of the creator of the VHDX file is stored as UTF-16 directly
after the file identifier signature, at offset 8. So names like
"QEMU v3.0.0" or "Microsoft Windows 6.3.9600.18512" are also shown by
the line
>>8 lestring16 x \b, by %.256s
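The two checks and the creator lookup can be sketched in Python as
follows. This is only an illustration of what the magic lines test, not
code from the patch. The function names describe and make_sample are
hypothetical, and the sketch assumes the creator field is 256 UTF-16LE
code units (512 bytes) padded with zeros.

```python
def describe(image: bytes):
    """Return the creator string if the bytes look like VHDX, else None."""
    if image[:8] != b"vhdxfile":          # file identifier signature
        return None
    if image[0x10000:0x10004] != b"head":  # first header at 64 KB
        return None
    # Assumption: creator occupies 512 bytes right after the signature.
    raw = image[8:8 + 512]
    creator = raw.decode("utf-16-le", errors="replace")
    # Cut at the first NUL, which ends the string inside the padded field.
    return creator.split("\x00", 1)[0]


def make_sample(creator: str) -> bytes:
    """Build a tiny synthetic image holding only the fields tested above."""
    image = bytearray(0x10000 + 4096)
    image[0:8] = b"vhdxfile"
    encoded = creator.encode("utf-16-le")
    image[8:8 + len(encoded)] = encoded
    image[0x10000:0x10004] = b"head"
    return bytes(image)


print(describe(make_sample("QEMU v3.0.0")))
```

A plain text file that merely begins with "vhdxfile" has no "head" at
64 KB, so describe returns None for it.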
The second field in the header is a CRC-32C hash over the entire 4 KB
structure. So I add a line like
#>>0x10004 ulelong x \b, CRC 0x%x
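The magic line only prints the stored value. For readers who want to
verify it, here is a minimal Python sketch of a CRC-32C (Castagnoli)
check. It is not part of the patch and rests on one assumption: that the
checksum is computed with the 4-byte Checksum field itself set to zero.
The function names are hypothetical.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C using the reflected Castagnoli polynomial."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def header_checksum_ok(header: bytes) -> bool:
    """Compare the stored checksum with one computed over the 4 KB header."""
    stored = int.from_bytes(header[4:8], "little")
    # Assumption: the Checksum field counts as zero while hashing.
    zeroed = header[:4] + b"\x00\x00\x00\x00" + header[8:4096]
    return crc32c(zeroed) == stored
```

The bitwise loop is slow but short. A table-driven variant would be the
usual choice for anything beyond a one-off check.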
Because this information is not so useful for "normal" users, I add
this as a comment line. I handle other fields in the same way.
Luckily, newer versions of file support quad values in indirect
offsets. So it is possible to peek inside the Log Entry section. To
check for the existence of a log entry, I look for the signature value
(0x65676F6C~loge) with the lines:
>>(0x10048.q) ulelong !0x65676F6C \b, NO Log Signature
>>(0x10048.q) ulelong =0x65676F6C \b; LOG
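What the indirect offset (0x10048.q) does can be sketched in Python like
this: read the 64-bit LogOffset from the first header, then test the
four bytes found at that position. This is an illustration only. The
function name log_state is hypothetical and the returned strings simply
mirror the messages of the two magic lines.

```python
import struct


def log_state(image: bytes) -> str:
    """Follow LogOffset and report whether a log entry signature is there."""
    # LogOffset: unsigned little-endian quad at 0x10048 in the first header.
    (log_offset,) = struct.unpack_from("<Q", image, 0x10048)
    # 0x65676F6C read as little-endian is the ASCII string "loge".
    signature = image[log_offset:log_offset + 4]
    return "LOG" if signature == b"loge" else "NO Log Signature"
```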
I tried to get the value for the virtual size of the image, which is
also shown by a command like `qemu-img info`. It is stored as
VirtualDiskSize under the Virtual Disk Size GUID, at an offset that
differs from image to image. Unfortunately I am not smart enough to
program this part. First, the region section gives the information
needed to jump to the metadata section. I can do this step. Then the
metadata table must be searched for the entry with the wanted GUID.
Yes, I can do this too. Finally, the value must be read at an offset
relative to the beginning of the metadata. There I fail. Maybe another
person is clever enough to do this.
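Outside of magic syntax the three steps are easier to express. The
following Python sketch is not part of the patch and is only meant to
show the lookup chain. The two GUIDs come from the patch comments. The
metadata table layout is an assumption taken from my reading of the
format description: an 8-byte signature "metadata", a 16-bit entry count
at byte 10, a 32-byte table header, and 32-byte entries holding the item
GUID followed by a 32-bit offset. The function name virtual_size is
hypothetical.

```python
import struct
import uuid

# GUIDs from the patch comments; bytes_le gives the on-disk byte order.
METADATA_REGION = uuid.UUID("8B7CA206-4790-4B9A-B8FE-575F050F886E").bytes_le
VIRTUAL_DISK_SIZE = uuid.UUID("2FA54224-CD1B-4876-B211-5DBED83BF4B8").bytes_le

REGION_TABLE = 0x30000  # first region table at 192 KB


def virtual_size(image: bytes):
    """Return VirtualDiskSize in bytes, or None if it cannot be located."""
    # Step 1: find the metadata region in the region table.
    if image[REGION_TABLE:REGION_TABLE + 4] != b"regi":
        return None
    (count,) = struct.unpack_from("<I", image, REGION_TABLE + 8)
    metadata_start = None
    for i in range(min(count, 2047)):
        entry = REGION_TABLE + 16 + 32 * i
        if image[entry:entry + 16] == METADATA_REGION:
            (metadata_start,) = struct.unpack_from("<Q", image, entry + 16)
    if metadata_start is None:
        return None
    # Step 2: search the metadata table (assumed layout, see lead-in).
    if image[metadata_start:metadata_start + 8] != b"metadata":
        return None
    (items,) = struct.unpack_from("<H", image, metadata_start + 10)
    for i in range(items):
        entry = metadata_start + 32 + 32 * i
        if image[entry:entry + 16] == VIRTUAL_DISK_SIZE:
            # Step 3: the item offset counts from the start of the region.
            (offset,) = struct.unpack_from("<I", image, entry + 16)
            (size,) = struct.unpack_from("<Q", image, metadata_start + offset)
            return size
    return None
```

The third step is the hard one for a magic file: the final position is
the sum of two values read from different places in the image.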
So I keep the first steps, which start with the displayed phrase
"region". But this information is normally not of interest to users.
After applying the above-mentioned modifications with the patch
file-5.35-virtual-vhdx.diff, all such disk images are described by
Magdir/virtual like:
Esp.vhdx:
Microsoft Disk Image eXtended,
by Microsoft Windows 6.3.9600.18512, sequence 0x14;
LOG; region, 2 entries,
id BAT, at 0x300000, Required 1,
id Metadata, at 0x200000, Required 1
qemu16MB-dynamic.vhdx:
Microsoft Disk Image eXtended,
by QEMU v3.0.0, sequence 0x1993d0b2,
NO Log Signature; region, 2 entries,
id BAT, at 0x200000, Required 0,
id Metadata, at 0x300000, Required 0
qemu16MB-fixed.vhdx:
Microsoft Disk Image eXtended,
by QEMU v3.0.0, sequence 0x9897bc4c,
NO Log Signature; region, 2 entries,
id BAT, at 0x200000, Required 0,
id Metadata, at 0x300000, Required 0
qemu24MB-logsize2M.vhdx:
Microsoft Disk Image eXtended,
by QEMU v3.0.0, sequence 0x675f22b0,
LogLength 2 MB, NO Log Signature; region, 2 entries,
id BAT, at 0x300000, Required 0,
id Metadata, at 0x400000, Required 0
\VM\FISCH_C.VHDX:
Microsoft Disk Image eXtended,
by d2v, sequence 0xa;
LOG; region, 2 entries,
id Metadata, at 0x200000, Required 1,
id BAT, at 0x300000, Required 1
\temp\vhdx24mb.vhdx:
Microsoft Disk Image eXtended,
by Microsoft Windows 6.3.9600.18512, sequence 0x8;
LOG; region, 2 entries,
id BAT, at 0x300000, Required 1,
id Metadata, at 0x200000, Required 1
I hope my diff file can be applied in a future version of the
file utility.
With best wishes
Jörg Jenderek
--
Jörg Jenderek
-------------- next part --------------
--- file-5.35/magic/Magdir/virtual.old 2017-03-17 21:34:26 +0000
+++ file-5.35/magic/Magdir/virtual 2018-12-15 16:04:46 +0000
@@ -11,2 +11,124 @@
+# From: Joerg Jenderek
+# URL: https://msdn.microsoft.com/en-us/library/mt740058.aspx
+# Reference: https://winprotocoldoc.blob.core.windows.net/productionwindowsarchives/
+# MS-VHDX/[MS-VHDX].pdf
+# Note: extends the VHD format with new capabilities, such as a 64TB maximum size
+# TODO: find and display values like virtual size, disk size, cluster_size, etc
+# display id in GUID format
+#
+# VHDX_FILE_IDENTIFIER signature 0x656C696678646876
+0 string vhdxfile
+# VHDX_HEADER signature. 1 header is stored at offset 64KB and the other at 128KB
+>0x10000 string head Microsoft Disk Image eXtended
+#>0x20000 string head \b, 2nd header
+#!:mime application/x-virtualbox-vhdx
+!:ext vhdx
+# Creator[256] like "QEMU v3.0.0", "Microsoft Windows 6.3.9600.18512"
+>>8 lestring16 x \b, by %.256s
+# The Checksum field is a CRC-32C hash over the entire 4 KB structure
+#>>0x10004 ulelong x \b, CRC 0x%x
+# SequenceNumber
+>>0x10008 ulequad x \b, sequence 0x%llx
+# FileWriteGuid
+#>>0x10010 ubequad x \b, file id 0x%llx
+#>>>0x10018 ubequad x \b-%llx
+# DataWriteGuid
+#>>0x10020 ubequad x \b, data id 0x%llx
+#>>>0x10028 ubequad x \b-%llx
+# LogGuid. If this field is zero, then the log is empty or has no valid entries
+>>0x10030 ubequad >0 \b, log id 0x%llx
+>>>0x10038 ubequad x \b-%llx
+# LogVersion. If not 0 there is a log to replay
+>>0x10040 uleshort >0 \b, LogVersion 0x%x
+# Version. This field must be set to 1
+>>0x10042 uleshort !1 \b, Version 0x%x
+# LogLength must be multiples of 1 MB
+>>0x10044 ulelong/1048576 >1 \b, LogLength %u MB
+# LogOffset (normally 0x100000 when the log directly follows the header section); multiple of 1 MB
+>>0x10048 ulequad !0x100000 \b, LogOffset 0x%llx
+# Log Entry Signature must be 0x65676F6C~loge
+>>(0x10048.q) ulelong !0x65676F6C \b, NO Log Signature
+>>(0x10048.q) ulelong =0x65676F6C \b; LOG
+# Log Entry Checksum
+#>>>(0x10048.q+4) ulelong x \b, Log CRC 0x%x
+# Log Entry Length must be a multiple of 4 KB
+>>>(0x10048.q+8) ulelong/1024 >4 \b, EntryLength %u KB
+# Log Entry Tail must be a multiple of 4 KB
+#>>>(0x10048.q+12) ulelong x \b, Tail 0x%x
+# Log Entry SequenceNumber
+#>>>(0x10048.q+16) ulequad x \b, # 0x%llx
+# Log Entry DescriptorCount may be zero. only 4 bytes in other docs instead of 8
+#>>>(0x10048.q+24) ulelong x \b, DescriptorCount 0x%x
+# Log Entry Reserved must be set to 0
+>>>(0x10048.q+28) ulelong !0 \b, Reserved 0x%x
+# Log Entry LogGuid
+#>>>(0x10048.q+32) ubequad x \b, Log id 0x%llx
+#>>>(0x10048.q+40) ubequad x \b-%llx
+# Log Entry FlushedFileOffset should be the VHDX file size when the entry is written.
+#>>>(0x10048.q+48) ulequad x \b, FlushedFileOffset %llu
+# Log Entry LastFileOffset
+#>>>(0x10048.q+56) ulequad x \b, LastFileOffset %llu
+# filling
+#>>>(0x10048.q+64) ulequad >0 \b, filling %llx
+# Reserved[4016]
+#>>0x10050 ulequad >0 \b, Reserved 0x%llx
+# VHDX_REGION_TABLE_HEADER Signature 0x69676572~regi at offset 192 KB and 256 KB
+>0x30000 ulelong !0x69676572 \b, 1st region INVALID
+>0x30000 ulelong =0x69676572 \b; region
+# region Checksum. CRC-32C hash over the entire 64-KB table
+#>>0x30004 ulelong x \b, CRC 0x%x
+# The EntryCount specifies number of valid entries; Found 2; This must be <= 2047.
+>>0x30008 ulelong x \b, %u entries
+# reserved must be zero
+#>>0x3000C ulelong !0 \b, RESERVED 0x%x
+# Region Table Entry starts with identifier for the object. often BAT id
+>>0x30010 use vhdx-id
+# FileOffset
+>>0x30020 ulequad x \b, at 0x%llx
+# Length. Specifies the length of the object within the file
+#>>0x30028 ulelong x \b, Length 0x%x
+# 1 means region entry is required. if region not recognized, then REFUSE to load VHDX
+>>0x3002C ulelong x \b, Required %u
+# 2nd region entry often metadata id
+>>0x30030 use vhdx-id
+# 2nd entry FileOffset
+>>0x30040 ulequad x \b, at 0x%llx
+# 1 means region entry is required. if region not recognized, then REFUSE to load VHDX
+>>0x3004C ulelong x \b, Required %u
+# 2nd region
+>>0x40000 ulelong !0x69676572 \b, 2nd region INVALID
+# check in vhdx images for known id and show names instead of hexadecimal
+0 name vhdx-id
+# http://www.windowstricks.in/online-windows-guid-converter
+# 2DC27766-F623-4200-9D64-115E9BFD4A08 BAT GUID
+# 6677C22D23F600429D64115E9BFD4A08 BAT ID
+>0 ubequad =0x6677C22D23F60042
+>>8 ubequad =0x9D64115E9BFD4A08 \b, id BAT
+# no BAT id
+>>8 default x
+>>>0 use vhdx-id-hex
+# 8B7CA206-4790-4B9A-B8FE-575F050F886E Metadata region GUID
+# 06A27C8B90479A4BB8FE575F050F886E Metadata region ID
+>0 ubequad =0x06A27C8B90479A4B
+>>8 ubequad =0xB8FE575F050F886E \b, id Metadata
+# no Metadata id
+>>8 default x
+>>>0 use vhdx-id-hex
+# 2FA54224-CD1B-4876-B211-5DBED83BF4B8 Virtual Disk Size GUID
+# 2442A52F1BCD7648B2115DBED83BF4B8 Virtual Disk Size ID
+# value "virtual size" can be verified by command `qemu-img info `
+>0 ubequad =0x2442A52F1BCD7648
+>>8 ubequad =0xB2115DBED83BF4B8 \b, id vsize
+# no Virtual Disk Size ID
+>>8 default x
+>>>0 use vhdx-id-hex
+# other ids
+>0 default x
+>>0 use vhdx-id-hex
+# in vhdx images show id as hexadecimal
+0 name vhdx-id-hex
+>0 ubequad x \b, ID 0x%16.16llx
+>8 ubequad x \b-%16.16llx
+#
# libvirt