[File] [PATCH] of Magdir/fsav for Clam AntiVirus (update+extensions)

Jörg Jenderek joerg.jen.der.ek at gmx.net
Sun Dec 2 19:50:14 UTC 2018


Hello,

A few days ago I ran the file command, version 5.35, on files belonging
to the malware detection software ClamAV and got some strange output like:
c:\ProgramData\.clamwin\db\daily.cld:
	Clam AntiVirus database
	02 Dec 2018 09-13 -0500, version 25173, gzipped
c:\Users\Public\Documents\.clamwin\db\daily.cld:
	Clam AntiVirus database
	15 Apr 2016 22-35 -0400, version 21495
T:\win98se-de.cd\SOFTWARE\SECURITY\ANTIVIR\CLAMAV\daily.cvd:
	Clam AntiVirus database
	08 Nov 2006 16-10 +0000, version 2177, gzipped
clamav-0c45ab870ff4e51a4df6e0946292fd3e.00002224.clamtmp:
	Clam AntiVirus database
	22 Mar 2017 16-28 -0400, version 23228, gzipped
B:\DIVERSE\clamav-0.100.2\examples\fileprop_analysis\analysis.cud:
	Clam AntiVirus database
	04 Mar 2015 13-58 -0500, version 246, gzipped
B:\cvd\new\out\db.cud:
	Clam AntiVirus database
	:1:3:82:70cf2d6be559c6c, gzipped
B:\cvd\new\out\db-3.cud:
	Clam AntiVirus database
	:17:2:82:6fac060672a8ec, gzipped
B:\cvd\new\out\db-1.cud:
	Clam AntiVirus database
	:2:1:82:36b0618d8da4a7e, gzipped
B:\cvd\new\out\db-3.info:
	Clam AntiVirus database
	:17:2:82:X:X:firstname
B:\cvd\new\out\db-1.info:
	Clam AntiVirus database
	:2:1:82:X:X:Joerg_Jende
B:\cvd_cld\daily.info:
	Clam AntiVirus database
	30 May 2016 01-18 -0400, version 21637

The output is correct for the "official" virus databases (*.cvd or
*.cld), but it is wrong for examples with other filename extensions,
such as "db-3.info" or "db-3.cud". Furthermore, the --extension option
shows no filename extension.

The descriptions that the file command gives for such examples come from
Magdir/fsav, so I changed some lines there. The old link
html/node45.html pointed to the chapter "CVD format" of the Clam
AntiVirus user manual. In version 0.100.2 this has become something like
html/node60.html. Unfortunately the HTML version of the user manual no
longer exists as a web site, so I added a URL pointing to the user
manual as a PDF document:
https://github.com/vrtadmin/clamav-faq/raw/master/manual/clamdoc.pdf

According to the manual, the header is a 512-byte string with
colon-separated fields. After the magic string "ClamAV-VDB" comes the
build time. For the official databases this field looks like
"08 Aug 2018 20-43 -0400". This information was shown by magic lines like:
 0	string		ClamAV-VDB:
 >11	string		>\0		Clam AntiVirus database %-.23s
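
To make the header layout more concrete, here is a minimal Python sketch
that splits the header into named fields. The field names are taken from
the layout comment in Magdir/fsav; the function name parse_header and
the sample header are only assumptions for illustration (the sample is
assembled by hand from values quoted in this mail, not read from a real
file).

```python
# Field names follow the layout comment in Magdir/fsav:
# ClamAV-VDB:buildDate:version:signaturesNumbers:
#   functionalityLevelRequired:MD5:Signature:builder:buildTime
FIELD_NAMES = (
    "magic", "buildDate", "version", "signaturesNumbers",
    "functionalityLevelRequired", "MD5", "Signature", "builder",
    "buildTime",
)


def parse_header(raw):
    """Split the first 512 bytes of a ClamAV database into named fields.

    Hypothetical helper: it assumes that no field contains a colon,
    which matches the examples in this mail (the date uses "20-43").
    """
    header = raw[:512].decode("ascii", errors="replace")
    if not header.startswith("ClamAV-VDB:"):
        raise ValueError("not a ClamAV-VDB header")
    parts = header.split(":")[:len(FIELD_NAMES)]
    # The last kept field is followed by the space padding; strip it.
    parts[-1] = parts[-1].rstrip()
    return dict(zip(FIELD_NAMES, parts))


# Synthetic header with an empty buildDate, padded with spaces to 512 bytes.
sample = b"ClamAV-VDB::17:2:82:X:X:firstname signername:1506611558".ljust(512)
fields = parse_header(sample)
print(fields["version"], fields["builder"])
```

An empty field simply shows up as an empty string here, which is the
"::" case that the old fixed-offset magic lines could not handle.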

Unfortunately the documentation is incomplete. Apparently some fields
are optional. If a field is missing entirely, the header contains a
string part like "::". Sometimes an unused field is instead marked with
an X character. So the assumption of fixed-length strings does not hold
in general, and I replaced such magic lines with ones using regular
expressions. The build time is now shown by the line
 >>11	regex		\^[^:]{0,23}	\b, %s
The next field after the colon character is also shown by a regular
expression, starting relative to the field before it. So the third
field, the version, is now shown by a line like
 >>>&1	regex		\^[^:]{1,6}	\b, version %s
This information can often be verified by running a command like:
`sigtool --info=FILE`
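
The chaining of the regular expressions can be imitated in Python. This
is only a sketch of the idea, not how the file command is implemented:
each pattern is matched at the current position, and the next field
starts one byte after the end of the previous match (my reading of
"&1", which skips the colon). The name walk_fields is hypothetical; the
length limits are copied from the patched magic lines.

```python
import re

# (field label, minimum length, maximum length), as in the magic lines
FIELDS = [
    ("buildDate", 0, 23),
    ("version", 1, 6),
    ("signatures", 1, 10),
    ("level", 1, 4),
]


def walk_fields(header, start=11):
    """Match each field right after the previous one, skipping one colon."""
    pos = start
    found = {}
    for name, lo, hi in FIELDS:
        # Pattern.match(data, pos) anchors at pos, so no "^" is needed.
        pattern = re.compile(rb"[^:]{%d,%d}" % (lo, hi))
        m = pattern.match(header, pos)
        if m is None:
            break
        found[name] = m.group().decode("ascii")
        pos = m.end() + 1  # step over the colon after the match
    return found


print(walk_fields(b"ClamAV-VDB:08 Nov 2006 16-10 +0000:2177:2183:9:"))
print(walk_fields(b"ClamAV-VDB::17:2:82:X:X:firstname"))
```

Because the first pattern allows a length of zero, an empty build date
still matches and the following fields stay aligned.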

In the inspected examples I found that the buildtime field of
official/signed variants holds a longer date description, whereas for
unofficial/unsigned variants this field was mostly empty. According to
the sigtool(1) man page, the filename extension "cud" is used for the
second variant. For the other variant the extensions "cld" and "cvd"
are used. And during the update process the temporarily used extension
"clamtmp" is found. This is now expressed by lines like
 >>10	string		=::		(unsigned)
 !:ext	cud
 >>10	default		x		(with buildtime)
 !:ext	cld/cvd/clamtmp/cud
Unfortunately I am not smart enough to understand the source code of
ClamAV, so I do not know whether my observations always hold.
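
The same decision, written as a small Python sketch: the two bytes at
offset 10 are the colon that ends "ClamAV-VDB" and the first byte of the
build date, so "::" there means an empty build date. The function name
classify_build_date is hypothetical, and the mapping to extensions only
mirrors the observations above, not the ClamAV source.

```python
def classify_build_date(header):
    """Return (description, extensions) depending on bytes 10..11."""
    if header[10:12] == b"::":
        # empty build date: observed only in unsigned databases
        return "(unsigned)", ["cud"]
    # anything else is treated like the "default" magic line
    return "(with buildtime)", ["cld", "cvd", "clamtmp", "cud"]


print(classify_build_date(b"ClamAV-VDB::17:2:82:X:X:firstname"))
print(classify_build_date(b"ClamAV-VDB:15 Apr 2016 22-35 -0400:21495:"))
```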

When I remove the text message "(with buildtime)" I get error messages like:
Magdir/fsav, 78: Warning: Current entry does not yet have a
description for adding a EXTENSION type
file: could not find any valid magic files! (No error)
This is annoying, because the identifying text "Clam AntiVirus" is
already there and the subclassification just gives a different filename
extension. Maybe somebody can check this in the source of the file
command.

Furthermore, I found files with the filename extension "info" starting
with the magic string "ClamAV-VDB". These examples seem to contain the
first bytes of ClamAV databases, sometimes followed by additional ASCII
text, and exist only while a database is being updated or built. This
behavior is now described by lines like
 >511	ubyte		=0x20		database
 !:mime	application/x-clamav-database
 >511	default		x		file
 !:mime	application/x-clamav
 !:ext	info

This works because in a real ClamAV database the rest of the header
before offset 512 is padded with space characters, whereas in "info"
files there is nothing or ASCII text.

The documentation about the real database type is not accurate. I found
pure tar files rarely and gzipped tar files often. Now I use the file
command itself to inspect the real database type in the last magic
lines. First, look for the padding spaces again to skip "info" files.
Then look for tar characteristics; if they are present, hand over to the
subroutine found in Magdir/archive. If it is not tar, look for the gzip
magic; if that is present, hand over to the magic lines inside
Magdir/compress. This now looks like:
 >510	ubyte		=0x20
 >>1012	quad		=0		\b, with
 >>>512	use		tar-file
 >>1012	quad		!0
 >>>512	string		\037\213	\b, with
 >>>>512 indirect	x
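
The order of these checks can be written down as a short Python sketch.
It only mirrors the magic lines above and does not unpack anything; the
name payload_kind is hypothetical. Offset 1012 is 512 + 500, which in a
tar header falls into the zero-filled bytes after the prefix field, so
eight zero bytes there are taken as a sign of a plain tar archive.

```python
def payload_kind(data):
    """Guess what follows the 512-byte header: "tar", "gzip" or None."""
    if len(data) <= 510 or data[510] != 0x20:
        return None  # no space padding, so probably an "info" file
    if data[1012:1020] == b"\x00" * 8:
        return "tar"  # like: >>1012 quad =0, then use tar-file
    if data[512:514] == b"\x1f\x8b":
        return "gzip"  # like: >>>512 string \037\213, then indirect
    return None


# Synthetic example: space-padded header followed by a tiny gzip stream.
import gzip

header = b"ClamAV-VDB::17:2:82:X:X:firstname signername:".ljust(512)
print(payload_kind(header + gzip.compress(b"example payload")))
```

A slice that runs past the end of a short file compares unequal to the
eight zero bytes, so such files fall through to the gzip test.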

For pure tar databases I found only the cld extension. Whether a
database is compressed seems to depend on the values of the variables
"CompressLocalDatabase" and "ScriptedUpdates" inside freshclam.conf.
Maybe a ClamAV expert can check this.

After applying the above-mentioned modifications with the patch
file-5.35-fsav-clamav.diff, all inspected examples are described by
fsav+compress+archive like:

c:\ProgramData\.clamwin\db\daily.cld:
	Clam AntiVirus database (with buildtime)
	, 02 Dec 2018 09-13 -0500, version 25173, 2167842 signatures
	, level 63, builder neo, with
	gzip compressed data, max compression,
	from NTFS filesystem (NT), original size 157552640
c:\Users\Public\Documents\.clamwin\db\daily.cld:
	Clam AntiVirus database (with buildtime)
	, 15 Apr 2016 22-35 -0400, version 21495, 85311 signatures
	, level 63, builder neo, with
	tar archive (V7), file COPYING, size 43110
T:\win98se-de.cd\SOFTWARE\SECURITY\ANTIVIR\CLAMAV\daily.cvd:
	Clam AntiVirus database (with buildtime)
	, 08 Nov 2006 16-10 +0000, version 2177, 2183 signatures
	, level 9, builder sven, with
	gzip compressed data,
	from Unix, original size 532480
clamav-0c45ab870ff4e51a4df6e0946292fd3e.00002224.clamtmp:
	Clam AntiVirus database (with buildtime)
	, 22 Mar 2017 16-28 -0400, version 23228, 1894770 signatures
	, level 63, builder neo, with
	gzip compressed data, max compression,
	from Unix, original size 2273556519
B:\DIVERSE\clamav-0.100.2\examples\fileprop_analysis\analysis.cud:
	Clam AntiVirus database (with buildtime)
	, 04 Mar 2015 13-58 -0500, version 246, 4 signatures
	, level 80, builder clamav, with
	gzip compressed data, max compression,
	from Unix, original size 54784
B:\cvd\new\out\db.cud:
	Clam AntiVirus database (unsigned)
	, , version 1, 3 signatures
	, level 82, builder Firstname signername_abcdefghij, with
	gzip compressed data, max compression,
	from NTFS filesystem (NT), original size 19968
B:\cvd\new\out\db-3.cud:
	Clam AntiVirus database (unsigned)
	, , version 17, 2 signatures
	, level 82, builder firstname signername, with
	gzip compressed data, max compression,
	from NTFS filesystem (NT), original size 19968
B:\cvd\new\out\db-1.cud:
	Clam AntiVirus database (unsigned)
	, , version 2, 1 signatures
	, level 82, builder Joerg_Jenderek_1234567890abcdef, with
	gzip compressed data, max compression,
	from NTFS filesystem (NT), original size 19968
B:\cvd\new\out\db-3.info:
	Clam AntiVirus file
	, , version 17, 2 signatures
	, level 82, builder firstname signername
B:\cvd\new\out\db-1.info:
	Clam AntiVirus file
	, , version 2, 1 signatures
	, level 82, builder Joerg_Jenderek_1234567890abcdef
B:\cvd_cld\daily.info:
	Clam AntiVirus file
	, 30 May 2016 01-18 -0400, version 21637, 197706 signatures
	, level 63, builder neo

I hope my diff file can be applied in a future version of the
file utility.

With best wishes
Jörg Jenderek
-- 
Jörg Jenderek



-------------- next part --------------
--- file-5.35/magic/Magdir/fsav.old	2018-07-16 13:30:41 +0000
+++ file-5.35/magic/Magdir/fsav	2018-12-02 19:07:32 +0000
@@ -41,23 +41,62 @@
 
 # Joerg Jenderek: joerg dot jenderek at web dot de
-# http://www.clamav.net/doc/latest/html/node45.html
-# .cvd files start with a 512 bytes colon separated header
+# clamav-0.100.2\docs\html\node60.html 
+# https://github.com/vrtadmin/clamav-faq/raw/master/manual/clamdoc.pdf
+# ClamAV virus database files start with a 512 bytes colon separated header
 # ClamAV-VDB:buildDate:version:signaturesNumbers:functionalityLevelRequired:MD5:Signature:builder:buildTime
-# + gzipped tarball files
-0	string		ClamAV-VDB:
->11	string		>\0		Clam AntiVirus database %-.23s
->>34	string		:
->>>35		string		!:	\b, version
->>>>35		string		x 	\b %-.1s
->>>>>36		string		!:
->>>>>>36	string		x 	\b%-.1s
->>>>>>>37	string		!:
->>>>>>>>37	string		x 	\b%-.1s
->>>>>>>>>38	string		!:
->>>>>>>>>>38	string		x 	\b%-.1s
->>>>>>>>>>>39	string		!:
->>>>>>>>>>>>39	string		x 	\b%-.1s
->512	string		\037\213	\b, gzipped
->769	string		ustar\0		\b, tarred
+# + gzipped (optional) tarball files
+# output can often be verified by `sigtool --info=FILE`
+0	string		ClamAV-VDB:	Clam AntiVirus
+# padding spaces implies database
+>511	ubyte		=0x20		database
+!:mime	application/x-clamav-database
+# empty build time
+>>10	string		=::		(unsigned)
+# sigtool(1) man page
+!:ext	cud
+# display some text to avoid error like:
+# Magdir/fsav, 78: Warning: Current entry does not yet have a description for adding a EXTENSION type
+# file: could not find any valid magic files! (No error)
+>>10	default		x		(with buildtime)
+#>>10	default		x
+# clamtmp is used for temporily database like update process
+# for pure tar database only cld extension found
+!:ext	cld/cvd/clamtmp/cud
+>511	default		x		file
+!:mime	application/x-clamav
+!:ext	info
+>11	string		>\0
+# buildDate empty or like "22 Mar 2017 12-57 -0400"; verified by `sigtool -i FILE`
+>>11	regex		\^[^:]{0,23}	\b, %s
+# version like 25170
+>>>&1	regex		\^[^:]{1,6}	\b, version %s
+# signaturesNumbers like 4566249
+>>>>&1	regex		\^[^:]{1,10}	\b, %s signatures
+# functionalityLevelRequired like 60
+>>>>>&1	regex		\^[^:]{1,4}	\b, level %s
+# X for nothing or MD5
+#>>>>>>&1	regex	\^[^:]{1,32}	\b, MD5 "%s"
+>>>>>>&1	regex	\^[^:]{1,32}
+# X for nothing or digital signature starting like AIzk/LYbX
+#>>>>>>>&1	regex	\^[^:]{1,256}	\b, signature "%s"
+>>>>>>>&1	regex	\^[^:]{1,256}
+# builder like neo
+>>>>>>>>&1	regex	\^[^:]{1,32}	\b, builder %s
+# buildTime like 1506611558
+#>>>>>>>>>&1	regex	\^[^:]{1,10}	\b, %s
+>>>>>>>>>&1	regex	\^[^:]{1,10}	
+# padding with spaces
+#>>>>>>>>>>&1	ubequad	x		\b, padding 0x%16.16llx
+>510	ubyte		=0x20
+# inspect real database content
+#>>512	ubeshort	x		\b, database MAGIC 0x%x
+# ./archive handle pure tar archives
+>>1012	quad		=0		\b, with
+>>>512	use		tar-file
+# not pure tar
+>>1012	quad		!0
+# one space at the end of text and then handles gziped archives by ./compress
+>>>512	string		\037\213	\b, with 
+>>>>512	indirect	x
 
 # Type: Grisoft AVG AntiVirus

