[File] [PATCH] Magdir/audio Adaptive Multi-Rate Codec; missing variant WideBand *.awb

Christos Zoulas christos at zoulas.com
Mon Oct 23 19:45:41 UTC 2023


Committed, thanks!

christos

> On Oct 20, 2023, at 8:13 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> Some days ago I wanted to handle some shebang scripts. Surprisingly,
> some of the files turned out to be audio samples and not scripts. In
> this session I will handle Adaptive Multi-Rate Codec samples (*.amr *.awb).
> 
> When running the file command version 5.45 with the -k option on such
> audio samples, I get output like:
> 
> AUD001.amr:   Adaptive Multi-Rate Codec (GSM telephony)
> 	      a AMR script executable (binary data)
> amr-wb.awb:   Adaptive Multi-Rate Codec (GSM telephony)
> 	      a AMR-WB script executable (binary data)
> example.3ga:  Adaptive Multi-Rate Codec (GSM telephony)
> 	      a AMR script executable (binary data)
> example.amr:  Adaptive Multi-Rate Codec (GSM telephony)
> 	      a AMR script executable (binary data)
> 
> With the -i option, audio/amr is shown for all samples, and with the
> --extension option, amr is displayed for all samples.
> 
> For comparison I ran the file format identification utility TrID
> (see https://mark0.net/soft-trid-e.html). The samples with the awb
> suffix are called "Adaptive Multi-Rate Wideband ACELP codec", without
> a mime type, by audio-awb.trid.xml. The other samples (*.amr) are
> called "AMR (Adaptive Multi Rate) encoded audio" with mime type
> audio/amr by audio-amr.trid.xml. Only the suffix amr is listed there
> (see appended trid-v-amr.txt.gz).
> 
> For comparison I also ran the file format identification utility
> DROID (see https://sourceforge.net/projects/droid/). Here the samples
> with the awb suffix are described as "Adaptive Multi-Rate Wideband
> Audio" with mime type audio/amr-wb by PUID fmt/954. The other samples
> are described as "Adaptive Multi-Rate Audio" with mime type audio/amr
> by PUID fmt/356. For the sample with the 3ga suffix the name is
> considered invalid (see EXTENSION_MISMATCH true in droid-amr.csv.gz).
> 
> According to the shared-mime-info database (see freedesktop.org.xml.in,
> for example on freedesktop.org), the sample with the awb suffix is
> called "AMR-WB", "AMR-WB audio" or "Adaptive Multi-Rate Wideband" with
> mime type audio/AMR-WB. audio/amr-wb-encrypted is listed as an
> alternative mime type. The other samples are called "AMR", "AMR audio"
> or "Adaptive Multi-Rate" with mime type audio/AMR. Only the suffix amr
> is listed there.
> 
> TrID lists the file name extensions in use and, with the -v option,
> often also the related URL pointing to file format information.
> 
> With the help of these tools I found a page on the File Formats
> Archive Team web site. Links to downloadable samples are also listed
> there. This information is now expressed inside Magdir/audio by
> additional comment lines like:
> # http://fileformats.archiveteam.org/wiki/Adaptive_Multi-Rate_Audio
> # Reference:	https://datatracker.ietf.org/doc/html/rfc4867
> #		http://mark0.net/download/triddefs_xml.7z
> #		defs/a/audio-amr.trid.xml
> #		defs/a/audio-awb.trid.xml
> 
> The existing description in Magdir/audio is done by lines like:
> 0	string	#!AMR	Adaptive Multi-Rate Codec (GSM telephony)
> !:mime	audio/amr
> !:ext  amr
> 
> In line with the other tools and documentation, the audio samples are
> now described by a starting line with magic strength 80 like:
> 0	string	#!AMR	Adaptive Multi-Rate Codec
> Then I do sub-classification for the wideband variant by an additional
> branch that looks like:
> >5	string	-WB		(Wideband)
> !:mime	audio/AMR-WB
> !:apple	????amrw
> !:ext	awb
> As officially registered at iana.org, the audio subtype is written
> in upper case as AMR-WB. Some sites list the lower-case form amr-wb.
> That is not the official spelling, and links and searches on IANA
> using it do not work. IANA also lists the 4-byte Macintosh type code
> amrw.
> 
> For the other variant I do not check the bytes after the first 5
> magic bytes; I assume these are valid and unique enough. So this is
> done by a branch with lines like:
> >5	default	x		(GSM telephony)
> !:mime	audio/AMR
> !:apple	????amr
> !:ext  amr
> #!:ext  amr/3ga
> 
> Here again the official IANA audio subtype is written in upper case
> as AMR, not lower case. IANA also lists the 4-byte Macintosh type
> code "amr ". On the File Formats Archive Team web site one sample,
> example.3ga, with the 3ga suffix is listed, but other sites do not
> list such items. So I am unsure whether this suffix is generally valid
> or whether it appears there by accident. Therefore I show only the
> one suffix amr here.
> 
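The two-level test described above (match the shared 5-byte prefix, then branch on the bytes at offset 5) can be sketched outside of magic(5) syntax. This is only an illustrative sketch in Python, not part of the patch; the function name classify_amr and the tuple layout are hypothetical, while the strings, mime types and suffixes are taken from the quoted magic lines.

```python
# Illustrative sketch only: mirrors the quoted Magdir/audio lines.
# classify_amr and its return layout are hypothetical names for this example.

AMR_MAGIC = b"#!AMR"   # "0 string #!AMR": prefix shared by both variants
WB_SUFFIX = b"-WB"     # ">5 string -WB": present only in the wideband variant


def classify_amr(header: bytes):
    """Return (description, mime type, suffix), or None if not AMR."""
    if not header.startswith(AMR_MAGIC):
        return None
    base = "Adaptive Multi-Rate Codec"
    if header[5:8] == WB_SUFFIX:
        return (base + " (Wideband)", "audio/AMR-WB", "awb")
    # ">5 default x": anything else after the prefix is treated as
    # the narrowband variant without further checks.
    return (base + " (GSM telephony)", "audio/AMR", "amr")


if __name__ == "__main__":
    print(classify_amr(b"#!AMR\n"))
    print(classify_amr(b"#!AMR-WB\n"))
    print(classify_amr(b"#!/bin/sh\n"))
```

As in the magic lines, the fallback branch does not inspect the bytes after the prefix, so any input starting with #!AMR that is not followed by -WB is reported as the narrowband variant.
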
> After applying the above mentioned modifications with the patch
> file-5.45-audio-amr.diff, my audio samples are still recognized and
> are described with the correct sub-classification. This now looks like:
> 
> AUD001.amr:   Adaptive Multi-Rate Codec (GSM telephony)
> amr-wb.awb:   Adaptive Multi-Rate Codec (Wideband)
> example.3ga:  Adaptive Multi-Rate Codec (GSM telephony)
> example.amr:  Adaptive Multi-Rate Codec (GSM telephony)
> 
> I hope my diff file can be applied in a future version of the file utility.
> 
> There is still something to do. Inside Magdir/varied.script the
> misidentification as "a AMR script executable (binary data)" with
> strength (20=60/3) should be excluded. I will try to handle this in
> the future once I have fully understood what is happening there and
> have more bad/good script samples.
> 
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
> <trid-v-amr.txt.gz><droid-amr.csv.gz><file-5_45-audio-amr_diff.DEFANGED-17979><file-5_45-audio-amr_diff_sig.DEFANGED-17980>
> --
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file


