[File] Excel BIFF 2-8 BOF Magic Decoding

Brian Inglis Brian.Inglis at Shaw.ca
Mon May 8 21:24:53 UTC 2023


Hi folks,

Attached is ExcelBIFF2-8BOF.magic, which works well alongside the existing file 
and Excel magic to decode Beginning Of File (BOF) records from ancient Excel 
2.0-8.x Binary Interchange File Format (BIFF) versions 2, 3, 4, 5, and 8 [6 and 
7 may have been MS internal only]. These records also happen to be generated by 
common PDF etc. to XLS converters. The magic is a side effect of a personal 
project to extract the data, as existing xls2csv converters only work for MS 
Office Composite Document types.
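
For anyone who wants to check a file by hand, here is a minimal Python sketch of the same BOF test the magic performs. It assumes the layout quoted in the attached magic comments (record identifier, record length, version, data type, each a little-endian 16-bit value at file offsets 0, 2, 4, 6). The names read_bof, BOF_RECORD_IDS, and DATA_TYPES are my own for illustration and are not part of file or of the attachment.

```python
import struct

# Record identifier -> BIFF label, as listed in the magic comments.
# 0x0809 is shared by BIFF5 and BIFF8, so it is labelled as both here.
BOF_RECORD_IDS = {0x0009: "BIFF2", 0x0209: "BIFF3",
                  0x0409: "BIFF4", 0x0809: "BIFF5/8"}

# "Type of the following data" values from the same comments.
DATA_TYPES = {0x0005: "Workbook Globals", 0x0006: "VB Module",
              0x0010: "Sheet", 0x0020: "Chart",
              0x0040: "Macros", 0x0100: "Workspace"}


def read_bof(data: bytes):
    """Return (biff_label, length, version, data_type), or None if the
    buffer does not start with a recognised BOF record."""
    if len(data) < 8:
        return None
    # Four unsigned little-endian shorts: id, length, version, type.
    rec_id, length, version, kind = struct.unpack_from("<4H", data, 0)
    if rec_id not in BOF_RECORD_IDS or length < 4:
        return None
    return (BOF_RECORD_IDS[rec_id], length, version,
            DATA_TYPES.get(kind, "unknown"))


# The example from the magic comments: BOF, len 4, version 2, Sheet.
print(read_bof(bytes.fromhex("0900040002001000")))
```

Unlike the magic, this sketch reports the data type whatever the version field holds, since the BIFF2-4 comments describe that field as not used.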

All the required information should be included, except the sources of my test 
files, which I downloaded from:

	https://telparia.com/fileFormatSamples/document/xls/
and
	https://webarchive.nationalarchives.gov.uk/ukgwa/

The latter allows selection by year, file type, and PRONOM fmt/55... terms to 
find ancient formats.

Please license the attached under your standard terms, whether public domain, 
BSD, MIT, X, or more formal, and rename, modify, fold, spindle, mutilate to 
match your standards.

Some example outputs are in the attached log, with personal files elided; I 
tested with:

	$ file -L -m /usr/share/misc/magic:ExcelBIFF2-8BOF.magic *.xl*

Standalone BIFF 5/8 files may not be seen in the wild, except possibly as 
converter output, but the BIFF 5/8 decoding may be helpful in combination with 
CDF object decoding to expand on bare Composite Document File V2 results, which 
do not even mention Excel, as in the entries towards the bottom of the attached 
log.
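
As a rough illustration of what the BIFF 5/8 decoding extracts, the Python sketch below reads a 0x0809 BOF record from a byte buffer. It assumes the field layout and version values quoted in the attached magic comments. The names decode_0809 and VERSION_FIELD are hypothetical, and locating the workbook stream inside a CDF container is out of scope here.

```python
import struct

# Version field -> BIFF version for record identifier 0x0809, per the
# "BOF Records Written by Other External Tools" table in the magic comments.
VERSION_FIELD = {0x0000: 5, 0x0200: 2, 0x0300: 3,
                 0x0400: 4, 0x0500: 5, 0x0600: 8}


def decode_0809(data: bytes):
    """Decode a BOF record with identifier 0x0809 at the start of data.

    Returns a dict with the BIFF version (None if the version field is
    not in the table) and, when the record is long enough, the build
    identifier and build year. Returns None if this is not such a record.
    """
    if len(data) < 8:
        return None
    rec_id, length, version = struct.unpack_from("<3H", data, 0)
    if rec_id != 0x0809 or length < 4:
        return None
    info = {"biff": VERSION_FIELD.get(version)}
    # Build identifier and year follow the type field, at file offsets 8, 10.
    if length >= 8 and len(data) >= 12:
        info["build"], info["year"] = struct.unpack_from("<2H", data, 8)
    return info
```

The version field, not the record identifier, decides the BIFF version here, following the note in the comments that external tools write 0x0809 with differing version values.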

I also noticed a couple of other anomalies in the attached log:

* the code page is printed as signed; it should be an unsigned uleshort, so 
"Code page: -535" should show as 65001 (Unicode UTF-8):

	$ printf "%d\n" $((-535&0xffff))
	65001

* an xlsx file shows up as just a zip, at the bottom of the log; to download it, 
go to:

https://www.blackviper.com/service-configurations/black-vipers-windows-10-service-configurations/

and click on the EXCEL download button above the table near the bottom of the 
page. The file opens normally in my Excel emulators, Gnumeric and LibreOffice 
Calc.
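
For the code page anomaly in the first item, the same reinterpretation of a signed 16-bit value as unsigned can be sketched in Python. The helper name unsigned16 is just for illustration.

```python
def unsigned16(value: int) -> int:
    """Reinterpret a signed 16-bit value as unsigned by masking to 16 bits."""
    return value & 0xFFFF


print(unsigned16(-535))   # the code page shown as -535 in the log
print(unsigned16(1252))   # values already in range are unchanged
```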

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                  -- Antoine de Saint-Exupéry
-------------- next part --------------
#!/usr/bin/file -m
# ExcelBIFF2-8BOF.magic - Excel Binary Interchange File Format versions 2-8 Beginning of File records
# See https://www.gaia-gis.it/gaia-sins/freexl-1.0.6-doxy-doc/html/Format.html
#	Excel	Commercial	BIFF	Release
#	Version	Name		Version	Year	Notes
#	2.x	Excel 2.0	BIFF2	1987	Before CFBF. File is the BIFF stream,
#						containing a single worksheet.
#	3.0	Excel 3.0	BIFF3	1990	""
#	4.0	Excel 4.0	BIFF4	1992	""
#	5.0	Excel 5.0	BIFF5	1993	Starting with BIFF5, a single Workbook
#						can internally store many individual Worksheets.
#						The BIFF stream is stored in the CFBF file container.
#	7.0	Excel 95	BIFF5	1995	
#	8.0	Excel 98	BIFF8	1998	
#	9.0	Excel 2000	BIFF8	1999	
#	10.0	Excel XP	BIFF8	2001	
#	11.0	Excel 2003	BIFF8	2003	
# See https://www.openoffice.org/sc/excelfileformat.pdf#page=135
#	5.8 BOF – Beginning of File
# See also https://en.wikipedia.org/wiki/Microsoft_Excel;
#	Old file extensions
#	Format		Extension	Description
#	Spreadsheet	.xls	Main spreadsheet format which holds data in
#				worksheets, charts, and macros
#	Add-in (VBA)	.xla	Adds custom functionality; written in VBA
#	Toolbar		.xlb	The file extension where Microsoft Excel custom
#				toolbar settings are stored.
#	Chart		.xlc	A chart created with data from a Microsoft Excel
#				spreadsheet that only saves the chart.
#				To save the chart and spreadsheet save as .XLS.
#				XLC is not supported in Excel 2007 or in any
#				newer versions of Excel.
#	Dialog		.xld	Used in older versions of Excel.
#	Archive		.xlk	A backup of an Excel Spreadsheet
#	Add-in (DLL)	.xll	Adds custom functionality; written in C++/C,
#				Fortran, etc. and compiled in to a special
#				dynamic-link library
#	Macro		.xlm	A macro is created by the user or pre-installed
#				with Excel.
#	Template	.xlt	A pre-formatted spreadsheet created by the user
#				or by Microsoft Excel.
#	Module		.xlv	A module is written in VBA (Visual Basic for
#				Applications) for Microsoft Excel
#	Workspace	.xlw	Arrangement of the windows of multiple Workbooks
#	Library		.DLL	Code written in VBA may access functions in a DLL,
#				typically this is used to access the Windows API
#!:ext	xls/xla/xlb/xlc/xld/xlk/xll/xlm/xlt/xlv/xlw

#!:mime	application/vnd.ms-excel

#	5.8.1 BOF Records Written by Excel
#	Record BOF, BIFF2 (record identifier is 0009 H):
#	Offset	Size	Contents
#	0	2	BIFF version (not used)
#	2	2	Type of the following data:	0010H = Sheet
#							0020H = Chart
#							0040H = Macro sheet
#	e.g. 0x0009 BOF len 4 version 2 content 0x0010 Sheet
0	uleshort	=0x0009	Excel 2 BIFF 2
>2	uleshort	=4
#			version
>>4	uleshort	=0
>>4	uleshort	=2
>>>6	uleshort	=0x0010	Sheet
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros

#	Record BOF, BIFF3 (record identifier is 0209 H) and BIFF4 (record identifier is 0409H):
#	Offset	Size	Contents
#	0	2	BIFF version (not used)
#	2	2	Type of the following data:	0010H = Sheet
#							0020H = Chart
#							0040H = Macro sheet
#							0100H = Workspace (BIFF3W/BIFF4W only)
#	4	2	Not used
0	uleshort	=0x0209	Excel 3 BIFF 3
>2	uleshort	=6
#			version
>>4	uleshort	=0
>>4	uleshort	=3
>>>6	uleshort	=0x0010	Sheet
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros
#			(BIFF3W only)
>>>6	uleshort	=0x0100	Workspace

0	uleshort	=0x0409	Excel 4 BIFF 4
>2	uleshort	=6
#			version
>>4	uleshort	=0
>>4	uleshort	=4
>>>6	uleshort	=0x0010	Sheet
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros
#			(BIFF4W only)
>>>6	uleshort	=0x0100	Workspace

#	Record BOF, BIFF5 (record identifier is 0809 H):
#	Offset	Size	Contents
#	0	2	BIFF version (always 0500H for BIFF5).
#			Should only be used, if this record is the leading workbook globals BOF (see above).
#	2	2	Type of the following data:	0005H = Workbook globals
#							0006H = Visual Basic module
#							0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#							0020H = Chart
#							0040H = Macro sheet
#							0100H = Workspace (BIFF5W only)
#	4	2	Build identifier, must not be 0
#	6	2	Build year
0	uleshort	=0x0809	Excel 5 BIFF 5
>2	uleshort	=8
#			version
>>4	uleshort	=0x0500
>>4	uleshort	=5
>>4	uleshort	=0
>>>6	uleshort	=0x0005	Workbook Globals
>>>6	uleshort	=0x0006	VB Module
>>>6	uleshort	=0x0010	Sheet
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros
#			(BIFF5W only)
>>>6	uleshort	=0x0100	Workspace
>>>>8	uleshort	>0	Build %d
>>>>>10	uleshort	>1900	Year %d

#	Record BOF, BIFF8 (record identifier is 0809 H):
#	Offset	Size	Contents
#	 0	2	BIFF version (always 0600 H for BIFF8)
#	 2	2	Type of the following data:	0005H = Workbook globals
#							0006H = Visual Basic module
#							0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#							0020H = Chart
#							0040H = Macro sheet
#							0100H = Workspace (BIFF8W only)
#	 4	2	Build identifier, must not be 0
#	 6	2	Build year, must not be 0
#	 8	4	File history flags
#	12	4	Lowest Excel version that can read all records in this file
0	uleshort	=0x0809	Excel 8 BIFF 8
>2	uleshort	=16
#			version
>>4	uleshort	=0x0600
>>4	uleshort	=8
>>4	uleshort	=0
>>>6	uleshort	=0x0005	Workbook Globals
>>>6	uleshort	=0x0006	VB Module
>>>6	uleshort	=0x0010	Sheet
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros
#			(BIFF8W only)
>>>6	uleshort	=0x0100	Workspace
>>>>8	uleshort	>0	Build %d
>>>>>10	uleshort	>1900	Year %d
>>>>>>12 ulelong	!0	File history %u
>>>>>>16 ulelong	>0	Excel version needed %u

#	5.8.2 BOF Records Written by Other External Tools
#	Various external tools write non-standard BOF records with the record
#	identifier 0809H (determining a BIFF5-BIFF8 BOF record), but with a
#	different BIFF version field. In this case, the record identifier is
#	ignored, and only the version field is used to set the BIFF version of
#	the workbook.
#	Record BOF (record identifier is 0809 H):
#	Offset	Size	Contents
#	0	2	BIFF version:			0000H = BIFF5
#							0200H = BIFF2
#							0300H = BIFF3
#							0400H = BIFF4
#							0500H = BIFF5
#							0600H = BIFF8
#	2	2	Type of the following data:	0005H = Workbook globals
#							0006H = Visual Basic module
#							0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#							0020H = Chart
#							0040H = Macro sheet
#							0100H = Workspace
#	[4]	var.	(optional) Additional fields of a BOF record, should be ignored
0	uleshort	=0x0809
#			>= 4
>2	uleshort	>3
>>4	uleshort	=0	Excel 5 BIFF 5
>>4	uleshort	=0x0200	Excel 2 BIFF 2
>>4	uleshort	=2	Excel 2 BIFF 2
>>4	uleshort	=0x0300	Excel 3 BIFF 3
>>4	uleshort	=3	Excel 3 BIFF 3
>>4	uleshort	=0x0400	Excel 4 BIFF 4
>>4	uleshort	=4	Excel 4 BIFF 4
>>4	uleshort	=0x0500	Excel 5 BIFF 5
>>4	uleshort	=5	Excel 5 BIFF 5
>>4	uleshort	=0x0600	Excel 8 BIFF 8
>>4	uleshort	=6	Excel 8 BIFF 8
>>4	uleshort	=0x0800	Excel 8 BIFF 8
>>4	uleshort	=8	Excel 8 BIFF 8
>>>6	uleshort	=0x0005	Workbook Globals
>>>6	uleshort	=0x0006	VB Module
>>>6	uleshort	=0x0010	Sheet/Dialogue
>>>6	uleshort	=0x0020	Chart
>>>6	uleshort	=0x0040	Macros
#			(workspace files only)
>>>6	uleshort	=0x0100	Workspace

-------------- next part --------------
chart.xls:      Excel 3 BIFF 3
dial.xlm:       Excel 2 BIFF 2 Macros
east.xls:       Excel 3 BIFF 3
hgram2.xlm:     Excel 3 BIFF 3
navigate.xlm:   Excel 2 BIFF 2
order.xls:      Excel 3 BIFF 3
preview.xlm:    Excel 3 BIFF 3
que.xls:        Excel 3 BIFF 3
timeln2.xls:    Excel 3 BIFF 3
west.xls:       Excel 3 BIFF 3
winoye.xlw:     Excel 4 BIFF 4

DJNS.xlc:       CDFV2 Microsoft Excel

BBS.xls:                                Composite Document File V2 Document, Little Endian, Os: Windows, Version 4.0, Code page: 936, Author: BaoWei, Last Saved By: BaoWei, Name of Creating Application: Microsoft Excel, Create Time/Date: Sun Dec 28 13:07:00 1997, Security: 0

example.xls:                            Composite Document File V2 Document, Little Endian, Os: Windows, Version 6.1, Code page: 1252, Author: Vb1, Last Saved By: Vb1, Revision Number: 6, Name of Creating Application: Microsoft Excel, Create Time/Date: Wed Sep 24 14:27:18 2014, Last Saved Time/Date: Tue Nov 18 13:58:52 2014, Security: 0

futhorc.xls:                            Composite Document File V2 Document, Little Endian, Os: Windows, Version 10.0, Code page: 1252, Author: default, Last Saved By: Simon Ager, Name of Creating Application: Microsoft Excel, Last Printed: Thu Nov 13 15:01:36 2003, Create Time/Date: Wed Mar 19 20:04:47 2003, Last Saved Time/Date: Sat May 11 12:30:37 2019, Security: 0

NAD83(CSRS)ASCMSubsetData.xls:          Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.1, Code page: 1252, Title: NAD83 (CSRS) ASCM Subset Data, Author: Geoff Banham, Keywords: NAD83 (CSRS) ASCM Subset Data, Last Saved By: Edward Titanich, Name of Creating Application: Microsoft Excel, Last Printed: Mon Feb 22 14:18:23 2010, Create Time/Date: Thu Sep 21 21:00:08 2000, Last Saved Time/Date: Thu Mar  4 14:19:49 2010, Security: 0

ComponentPrices.xls:                    Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.2, Title: Product Compone, Author: Crystal D, Comments: Powered by

Canadian Population Quarterly.xls:      Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Revision Number: 2, Total Editing Time: 02:11, Last Saved Time/Date: Sun Apr  5 08:16:02 2020

Quarterly-Disp-Fee-Rpt-ESC-BOB-Q3-2021-All-except-Ontario-wdc-EN.xls:                     Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Revision Number: 3, Total Editing Time: 10:33, Last Saved Time/Date: Mon Feb  7 04:49:35 2022

Black Vipers Windows 10 Service Configurations  Black Viper  www.blackviper.com.xlsx:     Zip archive data, at least v1.0 to extract, compression method=store


