[File] Excel BIFF 2-8 BOF Magic Decoding
Brian Inglis
Brian.Inglis at Shaw.ca
Mon May 8 21:24:53 UTC 2023
Hi folks,
Attached is ExcelBIFF2-8BOF.magic, which works well with the existing file
command and its Excel magic to decode the Beginning Of File (BOF) records of
ancient Excel 2.0-8.x Binary Interchange File Format (BIFF) versions 2, 3, 4,
5, and 8 [6 and 7 may have been MS internal only]. These records also happen to
be generated by common PDF etc. to XLS converters. The magic is a side effect
of a personal project to extract the data, as existing xls2csv converters only
work for MS Office Composite Document types.
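For illustration only, the header layout that the attached magic tests can be
sketched in Python as below. The function name describe_bof and the two lookup
tables are hypothetical names made up for this sketch; the record identifiers
and data type codes are the ones listed in the attached magic file. The sketch
assumes the buffer starts at the BOF record, as in a standalone BIFF file.

```python
import struct

# Record identifiers and data type codes as listed in the attached magic file.
BOF_RECORD_IDS = {0x0009: "BIFF2", 0x0209: "BIFF3", 0x0409: "BIFF4", 0x0809: "BIFF5/8"}
DATA_TYPES = {
    0x0005: "Workbook Globals",
    0x0006: "VB Module",
    0x0010: "Sheet",
    0x0020: "Chart",
    0x0040: "Macros",
    0x0100: "Workspace",
}


def describe_bof(data):
    """Return a short description of a leading BOF record, or None (hypothetical helper)."""
    if len(data) < 8:
        return None
    # Four little-endian unsigned 16-bit fields:
    # record identifier, record length, BIFF version field, data type.
    record_id, length, _version, data_type = struct.unpack_from("<4H", data, 0)
    biff = BOF_RECORD_IDS.get(record_id)
    if biff is None or length < 4:
        return None
    kind = DATA_TYPES.get(data_type, "unknown type 0x%04X" % data_type)
    return "%s %s" % (biff, kind)


if __name__ == "__main__":
    # Hand-built BIFF2 header: id 0x0009, length 4, version 2, type 0x0010.
    print(describe_bof(bytes.fromhex("0900040002001000")))  # BIFF2 Sheet
```

Unlike the magic, this sketch does not check the exact record length per BIFF
version; it only requires enough bytes for the version and type fields.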
All the required information should be included, except the sources of my test
files, which I downloaded from:
https://telparia.com/fileFormatSamples/document/xls/
and
https://webarchive.nationalarchives.gov.uk/ukgwa/
The latter allows selection by year, file type, and PRONOM fmt/55... terms to
find ancient formats.
Please license the attached under your standard terms, whether public domain,
BSD, MIT, X, or more formal, and rename, modify, fold, spindle, mutilate to
match your standards.
Some example outputs are in the attached log: personal files have been elided
(tested with file -L -m /usr/share/misc/magic:ExcelBIFF2-8BOF.magic *.xl*).
The BIFF 5/8 decoding may not be seen in standalone files in the wild, except
possibly for converter output, but may be helpful in combination with CDF object
decoding to expand on bare "Composite Document File V2" results, which do not
even mention Excel, as in the entries towards the bottom of the attached log.
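As a rough sketch of what that BIFF 5/8 decoding amounts to, the Python below
reads a 0x0809 BOF record and takes the BIFF version from the version field,
following the 5.8.2 table quoted in the attached magic. It assumes the bytes
passed in are the start of a BIFF stream (for example, already extracted from a
CDF container by some other means). The names decode_bof_0809 and VERSION_FIELD
are hypothetical, and the sample record is hand-built with made-up build and
year values.

```python
import struct

# BIFF version field values from the 5.8.2 table in the attached magic file.
VERSION_FIELD = {
    0x0000: "BIFF5",
    0x0200: "BIFF2",
    0x0300: "BIFF3",
    0x0400: "BIFF4",
    0x0500: "BIFF5",
    0x0600: "BIFF8",
}


def decode_bof_0809(data):
    """Decode a BOF record with identifier 0x0809, or return None (hypothetical helper)."""
    if len(data) < 8:
        return None
    record_id, length, version, data_type = struct.unpack_from("<4H", data, 0)
    if record_id != 0x0809 or length < 4:
        return None
    info = {
        "biff": VERSION_FIELD.get(version),  # None if the version field is unknown
        "type": data_type,
        "build": None,
        "year": None,
    }
    # Build identifier and build year follow only in the longer BIFF5/BIFF8 forms.
    if length >= 8 and len(data) >= 12:
        info["build"], info["year"] = struct.unpack_from("<2H", data, 8)
    return info


if __name__ == "__main__":
    # Made-up BIFF8 workbook globals BOF: length 16, version 0x0600, type 0x0005.
    sample = bytes.fromhex("0908100000060500bb0dcc070000000006000000")
    print(decode_bof_0809(sample))
```

Records written by external tools may stop after the type field, in which case
the build and year entries stay None.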
I also noticed a couple of other anomalies in the attached log:
* the code page is printed as signed; it should be an unsigned uleshort, so
"Code page: -535" should show as 65001 (Unicode, UTF-8):
$ printf "%d\n" $((-535&0xffff))
65001
* an xlsx file shows up as just a zip, at the bottom of the log. To download it,
go to:
https://www.blackviper.com/service-configurations/black-vipers-windows-10-service-configurations/
and click on the EXCEL download button above the table near the bottom. It opens
normally in my Excel emulators, Gnumeric and LibreOffice Calc.
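For the code page anomaly above, the same two bytes give either value depending
only on whether they are read as signed or unsigned. A small Python sketch (the
helper name to_unsigned16 is hypothetical):

```python
import struct


def to_unsigned16(value):
    """Reinterpret a signed 16-bit value as unsigned by masking to 16 bits."""
    return value & 0xFFFF


if __name__ == "__main__":
    raw = struct.pack("<h", -535)        # the bytes as a signed little-endian short
    print(struct.unpack("<h", raw)[0])   # -535, as currently printed
    print(struct.unpack("<H", raw)[0])   # 65001, the unsigned reading
    print(to_unsigned16(-535))           # 65001
```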
--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
-------------- next part --------------
#!/usr/bin/file -m
# ExcelBIFF2-8BOF.magic - Excel Binary Interchange File Format versions 2-8 Beginning of File records
# See https://www.gaia-gis.it/gaia-sins/freexl-1.0.6-doxy-doc/html/Format.html
# Excel    Commercial  BIFF     Release
# Version  Name        Version  Year     Notes
# 2.x      Excel 2.0   BIFF2    1987     Before CFBF. File is the BIFF stream,
#                                        containing a single worksheet.
# 3.0      Excel 3.0   BIFF3    1990     ""
# 4.0      Excel 4.0   BIFF4    1992     ""
# 5.0      Excel 5.0   BIFF5    1993     Starting with BIFF5, a single Workbook
#                                        can internally store many individual
#                                        Worksheets. The BIFF stream is stored
#                                        in the CFBF file container.
# 7.0      Excel 95    BIFF5    1995
# 8.0      Excel 98    BIFF8    1998
# 9.0      Excel 2000  BIFF8    1999
# 10.0     Excel XP    BIFF8    2001
# 11.0     Excel 2003  BIFF8    2003
# See https://www.openoffice.org/sc/excelfileformat.pdf#page=135
# 5.8 BOF – Beginning of File
# See also https://en.wikipedia.org/wiki/Microsoft_Excel;
# Old file extensions
# Format        Extension  Description
# Spreadsheet   .xls       Main spreadsheet format which holds data in
#                          worksheets, charts, and macros
# Add-in (VBA)  .xla       Adds custom functionality; written in VBA
# Toolbar       .xlb       The file extension where Microsoft Excel custom
#                          toolbar settings are stored.
# Chart         .xlc       A chart created with data from a Microsoft Excel
#                          spreadsheet that only saves the chart.
#                          To save the chart and spreadsheet save as .XLS.
#                          XLC is not supported in Excel 2007 or in any
#                          newer versions of Excel.
# Dialog        .xld       Used in older versions of Excel.
# Archive       .xlk       A backup of an Excel Spreadsheet
# Add-in (DLL)  .xll       Adds custom functionality; written in C++/C,
#                          Fortran, etc. and compiled in to a special
#                          dynamic-link library
# Macro         .xlm       A macro is created by the user or pre-installed
#                          with Excel.
# Template      .xlt       A pre-formatted spreadsheet created by the user
#                          or by Microsoft Excel.
# Module        .xlv       A module is written in VBA (Visual Basic for
#                          Applications) for Microsoft Excel
# Workspace     .xlw       Arrangement of the windows of multiple Workbooks
# Library       .DLL       Code written in VBA may access functions in a DLL,
#                          typically this is used to access the Windows API
#!:ext xls/xla/xlb/xlc/xld/xlk/xll/xlm/xlt/xlv/xlw
#!:mime application/vnd.ms-excel
# 5.8.1 BOF Records Written by Excel
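# Record BOF, BIFF2 (record identifier is 0009H):
# (offsets below are relative to the record body, which follows the 2-byte
# record identifier and 2-byte record length, so file offset = offset + 4)
# Offset  Size  Contents
# 0       2     BIFF version (not used)
# 2       2     Type of the following data: 0010H = Sheet
#                                           0020H = Chart
#                                           0040H = Macro sheet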
# e.g. 0x0009 BOF len 4 version 2 content 0x0010 Sheet
0 uleshort =0x0009 Excel 2 BIFF 2
>2 uleshort =4
# version
>>4 uleshort =0
>>4 uleshort =2
>>>6 uleshort =0x0010 Sheet
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# Record BOF, BIFF3 (record identifier is 0209H) and BIFF4 (record identifier is 0409H):
# Offset  Size  Contents
# 0       2     BIFF version (not used)
# 2       2     Type of the following data: 0010H = Sheet
#                                           0020H = Chart
#                                           0040H = Macro sheet
#                                           0100H = Workspace (BIFF3W/BIFF4W only)
# 4       2     Not used
0 uleshort =0x0209 Excel 3 BIFF 3
>2 uleshort =6
# version
>>4 uleshort =0
>>4 uleshort =3
>>>6 uleshort =0x0010 Sheet
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# (BIFF3W only)
>>>6 uleshort =0x0100 Workspace
0 uleshort =0x0409 Excel 4 BIFF 4
>2 uleshort =6
# version
>>4 uleshort =0
>>4 uleshort =4
>>>6 uleshort =0x0010 Sheet
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# (BIFF4W only)
>>>6 uleshort =0x0100 Workspace
# Record BOF, BIFF5 (record identifier is 0809H):
# Offset  Size  Contents
# 0       2     BIFF version (always 0500H for BIFF5).
#               Should only be used, if this record is the leading workbook
#               globals BOF (see above).
# 2       2     Type of the following data: 0005H = Workbook globals
#                                           0006H = Visual Basic module
#                                           0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#                                           0020H = Chart
#                                           0040H = Macro sheet
#                                           0100H = Workspace (BIFF5W only)
# 4       2     Build identifier, must not be 0
# 6       2     Build year
0 uleshort =0x0809 Excel 5 BIFF 5
>2 uleshort =8
# version
>>4 uleshort =0x0500
>>4 uleshort =5
>>4 uleshort =0
>>>6 uleshort =0x0005 Workbook Globals
>>>6 uleshort =0x0006 VB Module
>>>6 uleshort =0x0010 Sheet
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# (BIFF5W only)
>>>6 uleshort =0x0100 Workspace
>>>>8 uleshort >0 Build %d
>>>>>10 uleshort >1900 Year %d
# Record BOF, BIFF8 (record identifier is 0809H):
# Offset  Size  Contents
# 0       2     BIFF version (always 0600H for BIFF8)
# 2       2     Type of the following data: 0005H = Workbook globals
#                                           0006H = Visual Basic module
#                                           0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#                                           0020H = Chart
#                                           0040H = Macro sheet
#                                           0100H = Workspace (BIFF8W only)
# 4       2     Build identifier, must not be 0
# 6       2     Build year, must not be 0
# 8       4     File history flags
# 12      4     Lowest Excel version that can read all records in this file
0 uleshort =0x0809 Excel 8 BIFF 8
>2 uleshort =16
# version
>>4 uleshort =0x0600
>>4 uleshort =8
>>4 uleshort =0
>>>6 uleshort =0x0005 Workbook Globals
>>>6 uleshort =0x0006 VB Module
>>>6 uleshort =0x0010 Sheet
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# (BIFF8W only)
>>>6 uleshort =0x0100 Workspace
>>>>8 uleshort >0 Build %d
>>>>>10 uleshort >1900 Year %d
>>>>>>12 ulelong !0 File history %d
>>>>>>16 ulelong >0 Excel version needed %d
# 5.8.2 BOF Records Written by Other External Tools
# Various external tools write non-standard BOF records with the record
# identifier 0809H (determining a BIFF5-BIFF8 BOF record), but with a
# different BIFF version field. In this case, the record identifier is
# ignored, and only the version field is used to set the BIFF version of
# the workbook.
# Record BOF (record identifier is 0809H):
# Offset  Size  Contents
# 0       2     BIFF version: 0000H = BIFF5
#                             0200H = BIFF2
#                             0300H = BIFF3
#                             0400H = BIFF4
#                             0500H = BIFF5
#                             0600H = BIFF8
# 2       2     Type of the following data: 0005H = Workbook globals
#                                           0006H = Visual Basic module
#                                           0010H = Sheet or dialogue (see SHEETPR, ➜5.97)
#                                           0020H = Chart
#                                           0040H = Macro sheet
#                                           0100H = Workspace
# [4]     var.  (optional) Additional fields of a BOF record, should be ignored
0 uleshort =0x0809
# >= 4
>2 uleshort >3
>>4 uleshort =0 Excel 5 BIFF 5
>>4 uleshort =0x0200 Excel 2 BIFF 2
>>4 uleshort =2 Excel 2 BIFF 2
>>4 uleshort =0x0300 Excel 3 BIFF 3
>>4 uleshort =3 Excel 3 BIFF 3
>>4 uleshort =0x0400 Excel 4 BIFF 4
>>4 uleshort =4 Excel 4 BIFF 4
>>4 uleshort =0x0500 Excel 5 BIFF 5
>>4 uleshort =5 Excel 5 BIFF 5
>>4 uleshort =0x0600 Excel 8 BIFF 8
>>4 uleshort =6 Excel 8 BIFF 8
>>4 uleshort =0x0800 Excel 8 BIFF 8
>>4 uleshort =8 Excel 8 BIFF 8
>>>6 uleshort =0x0005 Workbook Globals
>>>6 uleshort =0x0006 VB Module
>>>6 uleshort =0x0010 Sheet/Dialogue
>>>6 uleshort =0x0020 Chart
>>>6 uleshort =0x0040 Macros
# (BIFF8W only)
>>>6 uleshort =0x0100 Workspace
-------------- next part --------------
chart.xls: Excel 3 BIFF 3
dial.xlm: Excel 2 BIFF 2 Macros
east.xls: Excel 3 BIFF 3
hgram2.xlm: Excel 3 BIFF 3
navigate.xlm: Excel 2 BIFF 2
order.xls: Excel 3 BIFF 3
preview.xlm: Excel 3 BIFF 3
que.xls: Excel 3 BIFF 3
timeln2.xls: Excel 3 BIFF 3
west.xls: Excel 3 BIFF 3
winoye.xlw: Excel 4 BIFF 4
DJNS.xlc: CDFV2 Microsoft Excel
BBS.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 4.0, Code page: 936, Author: BaoWei, Last Saved By: BaoWei, Name of Creating Application: Microsoft Excel, Create Time/Date: Sun Dec 28 13:07:00 1997, Security: 0
example.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 6.1, Code page: 1252, Author: Vb1, Last Saved By: Vb1, Revision Number: 6, Name of Creating Application: Microsoft Excel, Create Time/Date: Wed Sep 24 14:27:18 2014, Last Saved Time/Date: Tue Nov 18 13:58:52 2014, Security: 0
futhorc.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 10.0, Code page: 1252, Author: default, Last Saved By: Simon Ager, Name of Creating Application: Microsoft Excel, Last Printed: Thu Nov 13 15:01:36 2003, Create Time/Date: Wed Mar 19 20:04:47 2003, Last Saved Time/Date: Sat May 11 12:30:37 2019, Security: 0
NAD83(CSRS)ASCMSubsetData.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.1, Code page: 1252, Title: NAD83 (CSRS) ASCM Subset Data, Author: Geoff Banham, Keywords: NAD83 (CSRS) ASCM Subset Data, Last Saved By: Edward Titanich, Name of Creating Application: Microsoft Excel, Last Printed: Mon Feb 22 14:18:23 2010, Create Time/Date: Thu Sep 21 21:00:08 2000, Last Saved Time/Date: Thu Mar 4 14:19:49 2010, Security: 0
ComponentPrices.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.2, Title: Product Compone, Author: Crystal D, Comments: Powered by
Canadian Population Quarterly.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Revision Number: 2, Total Editing Time: 02:11, Last Saved Time/Date: Sun Apr 5 08:16:02 2020
Quarterly-Disp-Fee-Rpt-ESC-BOB-Q3-2021-All-except-Ontario-wdc-EN.xls: Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Revision Number: 3, Total Editing Time: 10:33, Last Saved Time/Date: Mon Feb 7 04:49:35 2022
Black Vipers Windows 10 Service Configurations Black Viper www.blackviper.com.xlsx: Zip archive data, at least v1.0 to extract, compression method=store