[File] [PATCH] of Magdir/virtual for Microsoft Disk Image (update+mime type+extension vhd)

Jörg Jenderek joerg.jen.der.ek at gmx.net
Fri Nov 23 02:08:06 UTC 2018


Some days ago I ran the file command 5.35 on my disc images. The good
news is that all these examples are described correctly by the phrase
"Microsoft Disk Image, Virtual Server or Virtual PC".

But no file name extension "vhd" is shown by the command line option
--extension. Also, no mime type is shown by the -i option.

So I started to change Magdir/virtual. The vhd file format was
introduced by the Microsoft virtualization software "Virtual PC".
If I understand the information found at
http://www.forensicswiki.org/wiki/Virtual_Hard_Disk_%28VHD%29 correctly,
the link ending in "bb676673.aspx" used to point to the Virtual Hard
Disk Image Format Specification on the Microsoft web site. But Microsoft
no longer supports "Virtual PC", and the link now redirects to the
successor virtualization software "Hyper-V". So I removed the old link
and looked for a new URL on the Microsoft servers about this file
format. In the end I found a Word document named "Virtual Hard Disk
Format Spec_10_18_06.doc" there, so I added that URL as a reference.
To have a long-lived URL I also added a URL about "VHD (Virtual Hard
Disk)" on http://fileformats.archiveteam.org/ .

After the identifying magic line starting with
 0	string	conectix	
the file name extension is now shown, according to the above mentioned
web site, by the line:
 !:ext   vhd
After installing the virtualization software "VirtualBox" on Windows,
such disc images get a user defined mime type. This is now also used by
the file command through the additional line
 !:mime	application/x-virtualbox-vhd

Furthermore I looked for information that can be extracted and shown by
additional magic lines. Why? Sometimes I run out of real disc space,
must move some disk images and forget to adapt the vhd dependencies in
other virtual machines. Or I delete some test vhd images and later must
recover images from the trash directory.

Following the Word document I found, I added further magic lines. Some
fields are only interesting for forensic purposes, or I do not
understand their full purpose. For such cases I added the lines as
comments. For example, for the stored data offset I added a line like
 #>16	ubequad		!0x200		\b, Data Offset 0x%llx

So I display the information that may be useful for normal users. The
4 byte Creator Application is displayed by the line:
 >28	string		x		\b, Creator %-4.4s
If the hard disk image is created by Microsoft Virtual PC, "vpc " is
written in this field. If it is created by Microsoft Virtual Server,
then "vs  " is written in this field. For VirtualBox I found "vbox", and
Sysinternals disk2vhd writes "d2v\0". Other applications should use
their own unique identifiers.

The next field stores the Creator Version. This field holds the
major/minor version of the application that created the hard disk image.
Virtual Server 2004 sets this value to 0x00010000 and Virtual PC 2004
sets it to 0x00050000. This information is now shown in human readable
form by the lines:
 >32	ubeshort	x		%x
 >34	ubeshort	x		\b.%x
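
For readers who want to check these two fields outside of file, here is
a minimal Python sketch. It is not part of the patch. It assumes that
the first 512 bytes of the image hold a footer copy starting with
"conectix" (the same condition the magic line at offset 0 tests), and
the function name read_creator is hypothetical.

```python
import struct

def read_creator(footer: bytes):
    """Return (application, major, minor) from a VHD footer copy."""
    if footer[:8] != b"conectix":
        raise ValueError("no conectix cookie at offset 0")
    # Offset 28: 4-byte Creator Application, e.g. b"vpc ", b"vbox", b"d2v\0".
    app = footer[28:32].decode("ascii", errors="replace").rstrip(" \0")
    # Offsets 32 and 34: the two big-endian 16-bit halves of Creator Version.
    major, minor = struct.unpack(">HH", footer[32:36])
    return app, major, minor

# Synthetic footer for illustration only: VirtualBox, version 0x00050001.
sample = bytearray(512)
sample[0:8] = b"conectix"
sample[28:32] = b"vbox"
sample[32:36] = struct.pack(">HH", 5, 1)

app, major, minor = read_creator(bytes(sample))
# Hexadecimal formatting mirrors the %x used in the magic lines.
print(f"Creator {app} {major:x}.{minor:x}")
```

The padding blanks and the trailing nil byte are stripped here only for
display; the magic line prints the 4 bytes as they are.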

The Creator Host OS is stored in the next field, with the following
meaning:
0x5769326B~Windows (Wi2k), 0x4D616320~Macintosh (Mac)
This is now shown in human readable form by the lines:
 >36	ubelong		x		(
 >>36	ubelong		0x5769326B	\bW2k
 >>36	ubelong		0x4D616320	\bMac
 >>36	default		x		\b0x
 >>>36	ubelong		x		\b%8.8x
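
The same decision (known value, otherwise fall back to hexadecimal) can
be sketched in Python as below. This is only an illustration of the
logic of the magic lines; the names HOST_OS and host_os are
hypothetical, and the footer copy is assumed to start at offset 0.

```python
import struct

# The two values quoted from the specification above.
HOST_OS = {0x5769326B: "W2k", 0x4D616320: "Mac"}

def host_os(footer: bytes) -> str:
    """Decode the 4-byte big-endian Creator Host OS field at offset 36."""
    (code,) = struct.unpack(">I", footer[36:40])
    # Unknown values are shown as 8 hex digits, like the %8.8x fallback.
    return HOST_OS.get(code, f"0x{code:08x}")

# Synthetic footer for illustration: the bytes "Wi2k" are 0x5769326B.
footer = bytearray(512)
footer[36:40] = b"Wi2k"
print(host_os(bytes(footer)))
```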

Afterwards the creation time of the hard disk image is shown. This time
is stored as a big endian 4 byte value counting seconds since 1 Jan 2000
UTC. That date is 946684800 seconds after the Unix epoch. So it is now
shown by the magic line
 >24	bedate+946684800	x	\b) %s
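
The offset arithmetic can be checked with a short Python sketch. It is
only an illustration: the name vhd_timestamp is hypothetical, the field
is read as an unsigned value (an assumption), and the result is kept in
UTC, whereas file prints local time unless told otherwise.

```python
import struct
from datetime import datetime, timezone

# Seconds between the Unix epoch (1 Jan 1970) and 1 Jan 2000, both UTC.
VHD_EPOCH_OFFSET = 946684800

def vhd_timestamp(footer: bytes) -> datetime:
    """Convert the 4-byte big-endian time stamp at offset 24 to UTC."""
    (secs,) = struct.unpack(">I", footer[24:28])
    return datetime.fromtimestamp(secs + VHD_EPOCH_OFFSET, tz=timezone.utc)

# A stored value of 0 must map to the start of the year 2000.
footer = bytearray(512)
print(vhd_timestamp(bytes(footer)).isoformat())
```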

Afterwards the Current Size is displayed. This field stores the current
size of the hard disk image. The value is the same as the original size
when the hard disk image is created, and it can change when the image is
expanded. This information is now shown by the line:
 >48	ubequad		x		\b, %llu bytes

Afterwards the Disk Geometry (cylinders, heads, and sectors per track)
is stored. This is now shown by the lines:
 >56	ubeshort	x		\b, CHS %u
 >58	ubyte		x		\b/%u
 >59	ubyte		x		\b/%u
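
As a cross-check of the geometry field, a small Python sketch follows.
It is an illustration only: read_geometry and chs_capacity are
hypothetical names, the footer copy is assumed to start at offset 0,
and the 512 byte sector size is an assumption of this sketch.

```python
import struct

SECTOR_SIZE = 512  # assumption: bytes per sector

def read_geometry(footer: bytes):
    """Return (cylinders, heads, sectors per track) from offsets 56-59."""
    # One big-endian 16-bit value followed by two single bytes.
    return struct.unpack(">HBB", footer[56:60])

def chs_capacity(cylinders, heads, sectors, sector_size=SECTOR_SIZE):
    """Bytes addressable through the given CHS geometry."""
    return cylinders * heads * sectors * sector_size

# Synthetic footer for illustration, using CHS 652/16/63.
footer = bytearray(512)
footer[56:60] = struct.pack(">HBB", 652, 16, 63)
c, h, s = read_geometry(bytes(footer))
print(f"CHS {c}/{h}/{s}, {chs_capacity(c, h, s)} bytes")
```

With CHS 652/16/63 the product is 336494592 bytes, so the capacity
computed from the geometry does not have to equal the value in the
Current Size field.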

With this additional information the output lines get long. So maybe
shorter names like "Microsoft Virtual Hard Disk image" or "Microsoft
Virtual HD image" could/should be used, considering how others call such
disk images. See for example at
But at the moment I keep the old phrase.

After applying the above mentioned modifications with the patch
file-5.35-virtual-vhd.diff, all such disk images are described by
Magdir/virtual with additional information like:

ramdrive147MB-d2v.VHD: Microsoft Disk Image, Virtual Server or
	Virtual PC
	, Creator d2v  1.0 (W2k) Thu Nov 22 02:52:46 2018
	, 157286400 bytes, CHS 65535/16/255
VirtualXPVHD.vhd:      Microsoft Disk Image, Virtual Server or
	Virtual PC
	, Creator vpc  1.0 (W2k) Thu Sep 10 13:56:37 2009
	, 136365211648 bytes, CHS 65278/16/255
Free-DOS-1.1.vhd:      Microsoft Disk Image, Virtual Server or
	Virtual PC
	, Creator vbox 5.1 (W2k) Fri Mar 17 02:07:55 2017
	, 336592896 bytes, CHS 652/16/63
dyn16Mb_mac.vhd:       Microsoft Disk Image, Virtual Server or
	Virtual PC
	, Creator vbox 5.2 (Mac) Wed Nov 21 16:49:34 2018
	, 16777216 bytes, CHS 481/4/17

The second item is that many disc images are misidentified as
"(Lepton 2.x)" or "(Lepton 3.x)". This misidentification does not occur
in version 5.32. These messages are triggered by Magdir/measure. So in
that file the magic lines for DIY-Thermocam data seem to be not
accurate enough. Somebody should check that file.

Furthermore there exists a successor, an extended format of VHD named
VHDX. This newer file format is not recognized by the file command 5.35
and needs its own new magic lines. I am working on this TODO item.

I hope my diff file and suggestions can be applied in a future version
of the file utility.

With best wishes
Jörg Jenderek

-------------- next part --------------
--- file-5.35/magic/Magdir/virtual.old	2017-03-17 21:34:26 +0000
+++ file-5.35/magic/Magdir/virtual	2018-11-23 01:49:42 +0000
@@ -7,5 +7,66 @@
 # Virtual PC
-# http://technet.microsoft.com/en-us/virtualserver/bb676673.aspx
-# .vhd
+# VirtualBox
+# URL: http://fileformats.archiveteam.org/wiki/VHD_(Virtual_Hard_Disk)
+# Reference: https://download.microsoft.com/download/f/f/e/ffef50a5-07dd-4cf8-aaa3-442c0673a029/
+# Virtual%20Hard%20Disk%20Format%20Spec_10_18_06.doc
 0	string	conectix	Microsoft Disk Image, Virtual Server or Virtual PC
+# alternative shorter names
+#0	string	conectix	Microsoft Virtual Hard Disk image
+#0	string	conectix	Microsoft Virtual HD image
+!:mime	application/x-virtualbox-vhd
+!:ext   vhd
+# Features is a bit field used to indicate specific feature support
+#>8	ubelong		!0x00000002	\b, Features 0x%x
+# Reserved. This bit must always be set to 1.
+#>8	ubelong		&0x00000002	\b, Reserved 0x%x
+# File Format Version for the current specification 0x00010000
+#>12	ubelong		!0x00010000	\b, Version 0x%8.8x
+# Data Offset only found 0x200
+#>16	ubequad		!0x200		\b, Data Offset 0x%llx
+#>16	ubequad		x		\b, at 0x%llx
+# Dynamic Disk Header cookie like cxsparse
+#>(16.Q)	string		x		"%-.8s"
+# This field contains a Unicode string (UTF-16) of the parent hard disk filename
+#>(16.Q+64)	ubequad	x		\b, parent name 0x%llx
+# Creator Application
+# vpc~Microsoft Virtual PC, vs~Microsoft Virtual Server, vbox~VirtualBox, d2v~disk2vhd
+>28	string		x		\b, Creator %-4.4s
+# Creator Version: 0x00010000~Virtual Server 2004, 0x00050000~Virtual PC 2004
+# holds the major/minor version of the application that created the image
+>32	ubeshort	x		%x
+>34	ubeshort	x		\b.%x
+#>32	ubelong		x		\b, Version 0x%8.8x
+# Creator Host OS: 0x5769326B~Windows (Wi2k), 0x4D616320~Macintosh (Mac)
+>36	ubelong		x		(
+>>36	ubelong		0x5769326B	\bW2k
+>>36	ubelong		0x4D616320	\bMac
+>>36	default		x		\b0x
+>>>36	ubelong		x		\b%8.8x
+# creation Time in seconds since 1 Jan 2000 UTC~946684800 sec. since Unix Epoch
+>24	bedate+946684800	x	\b) %s
+# Original Size
+#>40	ubequad		x		\b, o.-Size 0x%llx
+# Current Size is same as original size, but change when disk is expanded
+#>48	ubequad		x		\b, Size 0x%llx
+>48	ubequad		x		\b, %llu bytes
+# Disk Geometry: cylinder, heads, and sectors/track for hard disk
+#>56	ubeshort	x		\b, Cylinder 0x%x
+>56	ubeshort	x		\b, CHS %u
+# Heads
+#>58	ubyte		x		\b, Heads 0x%x
+>58	ubyte		x		\b/%u
+# Sectors per track
+#>59	ubyte		x		\b, Sectors 0x%x
+>59	ubyte		x		\b/%u
+# Disk Type: 3~Dynamic hard disk
+>60	ubelong		!0x3		\b, type 0x%x
+# Checksum
+#>64	ubelong		x		\b, cksum 0x%x
+# universally unique identifier (UUID) to associate a parent with its differencing image
+#>68	ubequad		x		\b, id 0x%16.16llx
+#>76	ubequad		x		\b-%16.16llx
+# Saved State: 1~Saved State
+>84	ubyte		!0		\b, State 0x%x
+# Reserved 427 bytes with nils
+#>85	ubequad	!0			\b, Reserved 0x%16.16llx
