[File] [PATCH] Magdir/diff bsdiff(1) patch file; missing mime type+extension

Christos Zoulas christos at zoulas.com
Sun Jan 21 19:59:54 UTC 2024


Committed, thanks!

christos

> On Jan 20, 2024, at 4:11 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> A few days ago I had to handle some patch files. Unfortunately there are
> about a dozen different variants, and some are not recognized. In this
> session I will handle "bsdiff" samples, which are binary and not text.
> The samples were created by the bsdiff utility.
> 
> When running the file command version 5.45 on such samples I get
> output like:
> 
> fmt-439-signature-id-672.bsdiff: bsdiff(1) patch file
> lmhosts.bsdiff:                  bsdiff(1) patch file
> lsmod-xbox.bsdiff:               bsdiff(1) patch file
> test.bsdiff:                     bsdiff(1) patch file
> 
> Furthermore, only the generic mime type application/octet-stream is
> shown with the -i option. With the --extension option only the 3 byte
> sequence ??? is shown.
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It also recognizes the
> samples. With highest priority they are described as "bsdiff patch
> (v4)" with mime type application/x-bsdiff and suffix BSDIFF by
> bsdiff.trid.xml (see appended trid-v-bsdiff.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/).
> The recognized samples are described there as "BSDIFF" with version 4.0
> by PUID fmt/439. No mime type is listed (see appended
> droid-bsdiff.csv.gz).
> 
> On Linux, according to the shared MIME-info database, such samples are
> called "Binary differences between files", and application/x-bsdiff is
> used as the mime type. This makes sense because the samples are binary
> files and not text files like many other kinds of difference output.
> There the samples are recognized just by looking for the 8 byte sequence
> "BSDIFF40" at the beginning. "BSDIFN40" is also considered a valid start
> magic there, but I do not have such samples and cannot create them. That
> information can be seen in the source freedesktop.org.xml.in, found for
> example on gitlab.freedesktop.org.
> 
> With the help of these tools I found pages about the BSDIFF file format.
> That is expressed inside Magdir/diff by comment lines like:
> # URL: 	http://www.daemonology.net/bsdiff/
> # Ref.:	https://github.com/cperciva/bsdiff/blob/master/bsdiff-ra/FORMAT
> #	http://mark0.net/download/triddefs_xml.7z/defs/b/bsdiff.trid.xml
> 
> The samples are detected by a line inside Magdir/diff which looks like:
> 0	string/b	BSDIFF40	bsdiff(1) patch file
> 
> First I looked at what other tools check. These also check for the
> string BZh at offset 32 and for the string 1AY&SY at offset 36 (that is
> the hexadecimal sequence 314159265359, the digits of pi in BCD
> notation). These patterns are characteristic of bzip2 compressed data,
> which is described by Magdir/compress. After the 32 byte header comes
> compressed data, which is currently produced by bzip2. So show
> information about that part by adding a line like:
> >>0x20	indirect	x		\b, at 0x20
> 
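The check described above can be sketched in Python. This is only an illustration and not part of the submitted patch: the function names `payload_looks_like_bzip2` and `make_synthetic_sample` are hypothetical, and the synthetic sample carries zeroed length fields, so it is not a usable patch.

```python
import bz2

BSDIFF_MAGIC = b"BSDIFF40"
HEADER_SIZE = 32  # 0x20, the header size given in the post
BZIP2_BLOCK_MAGIC = bytes.fromhex("314159265359")  # ASCII "1AY&SY"


def payload_looks_like_bzip2(data: bytes) -> bool:
    """Return True if bzip2 signatures follow the 32 byte header.

    Mirrors the two checks named in the post: "BZh" at offset 32
    and the block magic 1AY&SY at offset 36.
    """
    return (
        data[HEADER_SIZE:HEADER_SIZE + 3] == b"BZh"
        and data[HEADER_SIZE + 4:HEADER_SIZE + 10] == BZIP2_BLOCK_MAGIC
    )


def make_synthetic_sample(payload: bytes) -> bytes:
    """Hypothetical helper: magic, 24 zero bytes, then bzip2 data.

    The length fields are left at zero, so this is NOT a valid patch;
    it only exercises the signature check above.
    """
    return BSDIFF_MAGIC + bytes(HEADER_SIZE - len(BSDIFF_MAGIC)) + bz2.compress(payload)
```

Calling `payload_looks_like_bzip2(make_synthetic_sample(b"hello"))` should succeed for any non-empty payload, because a bzip2 stream with at least one block starts with the stream header followed by the block magic.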
> According to the documentation, the 32 byte header stores the lengths
> of the different patch parts. These can be shown by lines like:
> >>8	lequad		x		\b, new length %lld
> >>16	lelong		x		\b, new segment length %d
> >>20	lelong		!0		\b, compressed header length %d
> >>24	lequad		x		\b, data length %lld
> 
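As a rough illustration of the four magic lines above, the same offsets and widths can be unpacked in Python. The field names are copied from the labels in those lines; whether they match the semantics of the actual format is not verified here, and `BsdiffHeader` and `parse_header` are hypothetical names.

```python
import struct
from typing import NamedTuple

# Little-endian, no padding: 8 byte magic, lequad at 8,
# lelong at 16, lelong at 20, lequad at 24 (32 bytes in total).
HEADER_FORMAT = "<8sqiiq"


class BsdiffHeader(NamedTuple):
    magic: bytes
    new_length: int                # offset 8, lequad
    new_segment_length: int        # offset 16, lelong
    compressed_header_length: int  # offset 20, lelong
    data_length: int               # offset 24, lequad


def parse_header(data: bytes) -> BsdiffHeader:
    """Unpack the first 32 bytes using the offsets from the magic lines."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("need at least 32 bytes of header")
    return BsdiffHeader(*struct.unpack_from(HEADER_FORMAT, data, 0))
```

The values are read as signed integers, matching the default signedness of `lequad` and `lelong` in magic(5).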
> The sample fmt-439-signature-id-672.bsdiff is not a real patch. It is
> used by the DROID tool as a pattern template to recognize bsdiff patches.
> In this sample all length fields are zero, so I skip it because its
> new file segment length is invalid. So the magic starts like:
> 0	string/b	BSDIFF40
> >16	long		!0		bsdiff(1) patch file
> !:mime	application/x-bsdiff
> !:ext	bsdiff
> 
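The effect of the two-level test above can be sketched as follows. This is a simplified model, not how file(1) is implemented: the hypothetical `classify` function returns "data" whenever the entry does not match, whereas file would go on to try its other magic entries.

```python
def classify(data: bytes) -> str:
    """Simplified model of the proposed magic entry.

    Level 0: the 8 byte string BSDIFF40 at offset 0.
    Level 1: the 4 byte value at offset 16 must be non-zero.
    A non-zero test gives the same result in either byte order.
    """
    if data[:8] != b"BSDIFF40":
        return "data"
    field = data[16:20]
    if len(field) < 4 or field == b"\x00\x00\x00\x00":
        # Covers the DROID template sample, whose length fields are all zero.
        return "data"
    return "bsdiff(1) patch file"
```

A header with only zero bytes after the magic is therefore reported as "data", while any non-zero value at offset 16 yields the description line.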
> After applying the above mentioned modifications with the patch
> file-5.45-diff-bsdiff.diff and using Magdir/compress, my samples
> are still recognized, but more details are shown and the invalid
> DROID sample is skipped. The output now looks like:
> 
> fmt-439-signature-id-672.bsdiff: data
> lmhosts.bsdiff:                  bsdiff(1) patch file
> 				 , at 0x20 bzip2 compressed data
> 				 , block size = 900k
> lsmod-xbox.bsdiff:               bsdiff(1) patch file
> 				 , at 0x20 bzip2 compressed data
> 				 , block size = 900k
> test.bsdiff:                     bsdiff(1) patch file
> 				 , at 0x20 bzip2 compressed data
> 				 , block size = 900k
> 
> I hope my diff file can be applied in a future version of the file
> utility.
> 
> There are still other patch formats which are sometimes not
> recognized or not described completely. I will try to handle these in a
> future session.
> 
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
> <trid-v-bsdiff.txt.gz><droid-bsdiff.csv.gz><file-5_45-diff-bsdiff_diff.DEFANGED-238><file-5_45-diff-bsdiff_diff_sig.DEFANGED-239>


