New Upstream Snapshot - hachoir
Ready changes
Summary
Merged new upstream version: 3.2.0+git20221203.1.01ee9e5+dfsg (was: 3.1.0+dfsg).
Resulting package
Built on 2023-01-20T11:49 (took 4m19s)
The resulting binary packages can be installed (if you have the apt repository enabled) by running:
apt install -t fresh-snapshots hachoir
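Hachoir, the packaged library, views a binary stream as a tree of named fields (as the README changes in the diff below describe). As a rough illustration of that field-by-field idea, here is a minimal stdlib-only sketch that splits a PNG signature and IHDR chunk into named fields with `struct`; it demonstrates the concept only and does not use the hachoir API itself, and the sample dimensions are invented to match the `logo-Kubuntu.png` (331x90x8) example from the docs:

```python
import struct

def parse_png_header(data: bytes) -> dict:
    """Split the start of a PNG file into named fields,
    in the spirit of hachoir's field-by-field view.
    (Illustration only: plain struct, not the hachoir API.)"""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG stream")
    # Each PNG chunk: 4-byte big-endian length, 4-byte type, payload, CRC
    length, ctype = struct.unpack(">I4s", data[8:16])
    if ctype != b"IHDR":
        raise ValueError("expected IHDR as first chunk")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "color_type": color_type}

# Synthetic header: signature + IHDR for a 331x90, 8-bit RGBA image
header = (b"\x89PNG\r\n\x1a\n"
          + struct.pack(">I4s", 13, b"IHDR")
          + struct.pack(">IIBBBBB", 331, 90, 8, 6, 0, 0, 0))
print(parse_png_header(header))
# → {'width': 331, 'height': 90, 'bit_depth': 8, 'color_type': 6}
```

The real library exposes this through parser classes (see `hachoir/parser/image/png.py` in the source tree), where each field also carries its bit-level offset and size.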
Diff
diff --git a/.gitignore b/.gitignore
deleted file mode 100644
index bec29d09..00000000
--- a/.gitignore
+++ /dev/null
@@ -1,13 +0,0 @@
-*.py[cod]
-*.swp
-MANIFEST
-build/
-dist/
-hachoir-metadata/hachoir_metadata/qt/dialog_ui.py
-
-# generated by tox
-.tox/
-hachoir.egg-info/
-
-# Mac files
-.DS_Store
diff --git a/.hgignore b/.hgignore
deleted file mode 100644
index ca7b6c5e..00000000
--- a/.hgignore
+++ /dev/null
@@ -1,19 +0,0 @@
-syntax: glob
-
-# Generated files: .py => .pyc
-*.pyc
-*.pyo
-__pycache__
-hachoir-metadata/hachoir_metadata/qt/dialog_ui.py
-
-# Temporary files (vim backups)
-*.swp
-
-# build/ subdirectories
-build
-
-# build by the tox command
-.tox/
-
-# build by distutils
-hachoir.egg-info/
diff --git a/.travis.yml b/.travis.yml
deleted file mode 100644
index b04fbdd5..00000000
--- a/.travis.yml
+++ /dev/null
@@ -1,7 +0,0 @@
-language: python
-env:
- - TOXENV=py36
- - TOXENV=doc
- - TOXENV=pep8
-install: pip install -U tox
-script: tox
diff --git a/PKG-INFO b/PKG-INFO
new file mode 100644
index 00000000..e06fd943
--- /dev/null
+++ b/PKG-INFO
@@ -0,0 +1,69 @@
+Metadata-Version: 2.1
+Name: hachoir
+Version: 3.2.0
+Summary: Package of Hachoir parsers used to open binary files
+Home-page: http://hachoir.readthedocs.io/
+Author: Hachoir team (see AUTHORS file)
+License: GNU GPL v2
+Project-URL: Source, https://github.com/vstinner/hachoir
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Environment :: Console :: Curses
+Classifier: Environment :: Plugins
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Education
+Classifier: License :: OSI Approved :: GNU General Public License (GPL)
+Classifier: Natural Language :: English
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Multimedia
+Classifier: Topic :: Scientific/Engineering :: Information Analysis
+Classifier: Topic :: Software Development :: Disassemblers
+Classifier: Topic :: Software Development :: Interpreters
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: System :: Filesystems
+Classifier: Topic :: Text Processing
+Classifier: Topic :: Utilities
+Provides-Extra: urwid
+Provides-Extra: wx
+License-File: COPYING
+
+*******
+Hachoir
+*******
+
+.. image:: https://img.shields.io/pypi/v/hachoir.svg
+ :alt: Latest release on the Python Cheeseshop (PyPI)
+ :target: https://pypi.python.org/pypi/hachoir
+
+.. image:: https://github.com/vstinner/hachoir/actions/workflows/build.yml/badge.svg
+ :alt: Build status of hachoir on GitHub Actions
+ :target: https://github.com/vstinner/hachoir/actions
+
+.. image:: http://unmaintained.tech/badge.svg
+ :target: http://unmaintained.tech/
+ :alt: No Maintenance Intended
+
+Hachoir is a Python library to view and edit a binary stream field by field.
+In other words, Hachoir allows you to "browse" any binary stream just like you
+browse directories and files.
+
+A file is splitted in a tree of fields, where the smallest field is just one
+bit. Examples of fields types: integers, strings, bits, padding types, floats,
+etc. Hachoir is the French word for a meat grinder (meat mincer), which is used
+by butchers to divide meat into long tubes; Hachoir is used by computer
+butchers to divide binary files into fields.
+
+* `Hachoir website <http://hachoir.readthedocs.io/>`_ (source code, bugs)
+* `Hachoir on GitHub (Source code, bug tracker) <https://github.com/vstinner/hachoir>`_
+* License: GNU GPL v2
+
+Command line tools using Hachoir parsers:
+
+* hachoir-grep: find a text pattern in a binary file
+* hachoir-metadata: get metadata from binary files
+* hachoir-strip: modify a file to remove metadata
+* hachoir-urwid: display the content of a binary file in text mode
+
+Installation instructions: http://hachoir.readthedocs.io/en/latest/install.html
+
+Hachoir is written for Python 3.6 or newer.
diff --git a/README.rst b/README.rst
index fb2e8844..da6d8f33 100644
--- a/README.rst
+++ b/README.rst
@@ -6,9 +6,9 @@ Hachoir
:alt: Latest release on the Python Cheeseshop (PyPI)
:target: https://pypi.python.org/pypi/hachoir
-.. image:: https://travis-ci.org/vstinner/hachoir.svg?branch=master
- :alt: Build status of hachoir on Travis CI
- :target: https://travis-ci.org/vstinner/hachoir
+.. image:: https://github.com/vstinner/hachoir/actions/workflows/build.yml/badge.svg
+ :alt: Build status of hachoir on GitHub Actions
+ :target: https://github.com/vstinner/hachoir/actions
.. image:: http://unmaintained.tech/badge.svg
:target: http://unmaintained.tech/
diff --git a/TODO.rst b/TODO.rst
deleted file mode 100644
index 4b6018c9..00000000
--- a/TODO.rst
+++ /dev/null
@@ -1,41 +0,0 @@
-*********
-TODO list
-*********
-
-TODO
-====
-
-* Fix hachoir-subfile: hachoir.regex only supports Unicode?
-* Write more tests:
-
- - use coverage to check which parsers are never tested
- - write tests for hachoir-subfile
-
-* convert all methods names to PEP8!!!
-* test hachoir-gtk
-
-
-subfile
-=======
-
-Disabled Parsers
-^^^^^^^^^^^^^^^^
-
- * MPEG audio is disabled
-
-Parsers without magic string
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- * PCX: PhotoRec regex:
- "\x0a[\0\2\3\4\5]][[\0\1][\1\4\x08\x18]"
- (magic, version, compression, bits/pixel)
- * TGA
- * MPEG video, proposition:
- regex "\x00\x00\x01[\xB0\xB3\xB5\xBA\xBB" (from PhotoRec) at offset 0
- (0xBA is the most common value)
-
-Compute content size
-^^^^^^^^^^^^^^^^^^^^
-
- * gzip: need to decompress flow (deflate using zlib)
- * bzip2: need to decompress flow
diff --git a/debian/changelog b/debian/changelog
index f3d2b500..e798aa36 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,9 @@
-hachoir (3.1.0+dfsg-6) UNRELEASED; urgency=medium
+hachoir (3.2.0+git20221203.1.01ee9e5+dfsg-1) UNRELEASED; urgency=medium
* Update standards version to 4.6.1, no changes needed.
+ * New upstream snapshot.
- -- Debian Janitor <janitor@jelmer.uk> Mon, 07 Nov 2022 17:56:56 -0000
+ -- Debian Janitor <janitor@jelmer.uk> Fri, 20 Jan 2023 11:47:20 -0000
hachoir (3.1.0+dfsg-5) unstable; urgency=medium
diff --git a/doc/.gitignore b/doc/.gitignore
deleted file mode 100644
index 968ec3c8..00000000
--- a/doc/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-parser_list.rst
diff --git a/doc/authors.rst b/doc/authors.rst
index 4fc3a2a5..c4b57811 100644
--- a/doc/authors.rst
+++ b/doc/authors.rst
@@ -23,7 +23,7 @@ Contributors
* Aurélien Jacobs <aurel AT gnuage DOT org> - AVI parser big contributor
* Christophe Fergeau <teuf AT gnome.org> - Improve iTunesDB parser
* Christophe Gisquet <christophe.gisquet AT free.fr> - Write RAR parser
-* Cyril Zorin <cyril.zorin AT gmail.com> - Author of 3DO parser
+* Kirill Zorin <cyril.zorin AT gmail.com> - Author of hachoir-wx, 3DO and game parsers
* Elie Roudninski aka adema <liliroud AT hotmail.com> - Started Gtk GUI
* Feth Arezki <feth AT arezki.net> - Fix hachoir-metadata-qt to save the current directory
* Frédéric Weisbecker <chantecode AT gmail.com> - Author of ReiserFS parser
diff --git a/doc/changelog.rst b/doc/changelog.rst
index d5468daa..0773e901 100644
--- a/doc/changelog.rst
+++ b/doc/changelog.rst
@@ -2,6 +2,38 @@
Changelog
+++++++++
+hachoir 3.2.0 (2022-11-27)
+==========================
+
+* Fix hachoir-grep command line parsing.
+* PYC parser supports Python 3.12.
+
+hachoir 3.1.3 (2022-04-04)
+==========================
+
+* The development branch ``master`` was renamed to ``main``.
+ See https://sfconservancy.org/news/2020/jun/23/gitbranchname/ for the
+ rationale.
+* Replace Travis CI with GitHub Actions.
+* ttf: Support OpenType magic number (OTTO).
+* hachoir-wx: Load darkdetect and test once, fallback if not found.
+* Add hachoir-wx docs.
+* jpeg: Set the size of a JpegImageData with no terminator to the
+ remaining length in the stream to avoid parsing subfields of the JpegImageData
+ if possible.
+* fit: Add parser of Garmin fit files.
+* lzx: Fix LZX decompression.
+
+hachoir 3.1.2 (2020-02-15)
+==========================
+
+* Fix a SyntaxWarning in the PDF parser.
+
+hachoir 3.1.1 (2020-01-06)
+==========================
+
+* Fix hachoir-wx
+
hachoir 3.1.0 (2019-10-28)
==========================
diff --git a/doc/conf.py b/doc/conf.py
index 808c86de..d0150450 100644
--- a/doc/conf.py
+++ b/doc/conf.py
@@ -55,7 +55,7 @@ copyright = u'2014, Victor Stinner'
#
# The short X.Y version.
# The full version, including alpha/beta/rc tags.
-version = release = '3.0a6'
+version = release = '3.2.0'
# The language for content autogenerated by Sphinx. Refer to documentation
diff --git a/doc/images/urwid.png b/doc/images/urwid.png
deleted file mode 100644
index fcc61448..00000000
Binary files a/doc/images/urwid.png and /dev/null differ
diff --git a/doc/index.rst b/doc/index.rst
index 693c6fd7..2fd59257 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -16,6 +16,7 @@ Command line tools using Hachoir parsers:
* :ref:`hachoir-metadata <metadata>`: get metadata from binary files
* :ref:`hachoir-urwid <urwid>`: display the content of a binary file in text mode
+* :ref:`hachoir-wx <wx>`: display the content of a binary file in GUI mode
* :ref:`hachoir-grep <grep>`: find a text pattern in a binary file
* :ref:`hachoir-strip <strip>`: modify a file to remove metadata
@@ -32,6 +33,7 @@ User Guide
install
metadata
urwid
+ wx
subfile
grep
strip
diff --git a/doc/install.rst b/doc/install.rst
index 2a81b368..308b0fc9 100644
--- a/doc/install.rst
+++ b/doc/install.rst
@@ -7,7 +7,7 @@ To install Hachoir, type::
python3 -m pip install -U hachoir
To use hachoir-urwid, you will also need to install `urwid library
-<http://excess.org/urwid/>`_::
+<http://urwid.org/>`_::
python3 -m pip install -U urwid
diff --git a/doc/metadata.rst b/doc/metadata.rst
index 20e349e4..ffdb9126 100644
--- a/doc/metadata.rst
+++ b/doc/metadata.rst
@@ -140,25 +140,6 @@ Video
Command line options
====================
-Modes --mime and --type
-=======================
-
-Option --mime ask to just display file MIME type (works like UNIX
-"file --mime" program)::
-
- $ hachoir-metadata --mime logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
- logo-Kubuntu.png: image/png
- sheep_on_drugs.mp3: audio/mpeg
- wormux_32x32_16c.ico: image/x-ico
-
-Option --file display short description of file type (works like
-UNIX "file" program)::
-
- $ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
- logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
- sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
- wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32
-
Modes --mime and --type
-----------------------
@@ -171,7 +152,7 @@ Option ``--mime`` ask to just display file MIME type::
(it works like UNIX "file --mime" program)
-Option ``--file`` display short description of file type::
+Option ``--type`` display short description of file type::
$ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
diff --git a/doc/wx.rst b/doc/wx.rst
new file mode 100644
index 00000000..429ade8e
--- /dev/null
+++ b/doc/wx.rst
@@ -0,0 +1,25 @@
+.. _wx:
+
+++++++++++++++++++
+hachoir-wx program
+++++++++++++++++++
+
+hachoir-wx is a graphical binary file explorer and hex viewer, which uses the
+Hachoir library to parse the files and the WxPython library to create the user
+interface.
+
+Before use, make sure to install the required dependencies with ``pip install
+hachoir[wx]``. On Mac OS and Windows, this will install WxPython. On Linux, you
+may need to install a version of WxPython using your distribution's package manager
+or from the `WxPython Download page <https://www.wxpython.org/pages/downloads/>`_.
+
+.. image:: images/wx.png
+ :alt: hachoir-wx screenshot (MP3 audio file)
+
+Command line options
+====================
+
+* ``--preload=10``: Load 10 fields when loading a new field set
+* ``--path="/header/bpp"``: Open the specified path and focus on the field
+* ``--parser=PARSERID``: Force a parser (and skip parser validation)
+* ``--help``: Show all command line options
diff --git a/hachoir-grep b/hachoir-grep
deleted file mode 100755
index aed5c15f..00000000
--- a/hachoir-grep
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.grep import main
-
-if __name__ == "__main__":
- main()
diff --git a/hachoir-metadata b/hachoir-metadata
deleted file mode 100755
index 2db3c280..00000000
--- a/hachoir-metadata
+++ /dev/null
@@ -1,3 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.metadata.main import main
-main()
diff --git a/hachoir-metadata-gtk b/hachoir-metadata-gtk
deleted file mode 100755
index b52a1efe..00000000
--- a/hachoir-metadata-gtk
+++ /dev/null
@@ -1,3 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.metadata.gtk import MetadataGtk
-MetadataGtk().main()
diff --git a/hachoir-metadata-qt b/hachoir-metadata-qt
deleted file mode 100755
index eb090871..00000000
--- a/hachoir-metadata-qt
+++ /dev/null
@@ -1,3 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.metadata.qt.main import main
-main()
diff --git a/hachoir-strip b/hachoir-strip
deleted file mode 100755
index 0a0f5b1e..00000000
--- a/hachoir-strip
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.strip import main
-if __name__ == "__main__":
- main()
diff --git a/hachoir-subfile b/hachoir-subfile
deleted file mode 100755
index ff501336..00000000
--- a/hachoir-subfile
+++ /dev/null
@@ -1,3 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.subfile.main import main
-main()
diff --git a/hachoir-urwid b/hachoir-urwid
deleted file mode 100755
index 39bf50c9..00000000
--- a/hachoir-urwid
+++ /dev/null
@@ -1,3 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.urwid import main
-main()
diff --git a/hachoir-wx b/hachoir-wx
deleted file mode 100755
index 08012aa7..00000000
--- a/hachoir-wx
+++ /dev/null
@@ -1,5 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.wx.main import main
-
-if __name__ == "__main__":
- main()
diff --git a/hachoir.egg-info/PKG-INFO b/hachoir.egg-info/PKG-INFO
new file mode 100644
index 00000000..e06fd943
--- /dev/null
+++ b/hachoir.egg-info/PKG-INFO
@@ -0,0 +1,69 @@
+Metadata-Version: 2.1
+Name: hachoir
+Version: 3.2.0
+Summary: Package of Hachoir parsers used to open binary files
+Home-page: http://hachoir.readthedocs.io/
+Author: Hachoir team (see AUTHORS file)
+License: GNU GPL v2
+Project-URL: Source, https://github.com/vstinner/hachoir
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Environment :: Console :: Curses
+Classifier: Environment :: Plugins
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Education
+Classifier: License :: OSI Approved :: GNU General Public License (GPL)
+Classifier: Natural Language :: English
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Multimedia
+Classifier: Topic :: Scientific/Engineering :: Information Analysis
+Classifier: Topic :: Software Development :: Disassemblers
+Classifier: Topic :: Software Development :: Interpreters
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: System :: Filesystems
+Classifier: Topic :: Text Processing
+Classifier: Topic :: Utilities
+Provides-Extra: urwid
+Provides-Extra: wx
+License-File: COPYING
+
+*******
+Hachoir
+*******
+
+.. image:: https://img.shields.io/pypi/v/hachoir.svg
+ :alt: Latest release on the Python Cheeseshop (PyPI)
+ :target: https://pypi.python.org/pypi/hachoir
+
+.. image:: https://github.com/vstinner/hachoir/actions/workflows/build.yml/badge.svg
+ :alt: Build status of hachoir on GitHub Actions
+ :target: https://github.com/vstinner/hachoir/actions
+
+.. image:: http://unmaintained.tech/badge.svg
+ :target: http://unmaintained.tech/
+ :alt: No Maintenance Intended
+
+Hachoir is a Python library to view and edit a binary stream field by field.
+In other words, Hachoir allows you to "browse" any binary stream just like you
+browse directories and files.
+
+A file is splitted in a tree of fields, where the smallest field is just one
+bit. Examples of fields types: integers, strings, bits, padding types, floats,
+etc. Hachoir is the French word for a meat grinder (meat mincer), which is used
+by butchers to divide meat into long tubes; Hachoir is used by computer
+butchers to divide binary files into fields.
+
+* `Hachoir website <http://hachoir.readthedocs.io/>`_ (source code, bugs)
+* `Hachoir on GitHub (Source code, bug tracker) <https://github.com/vstinner/hachoir>`_
+* License: GNU GPL v2
+
+Command line tools using Hachoir parsers:
+
+* hachoir-grep: find a text pattern in a binary file
+* hachoir-metadata: get metadata from binary files
+* hachoir-strip: modify a file to remove metadata
+* hachoir-urwid: display the content of a binary file in text mode
+
+Installation instructions: http://hachoir.readthedocs.io/en/latest/install.html
+
+Hachoir is written for Python 3.6 or newer.
diff --git a/hachoir.egg-info/SOURCES.txt b/hachoir.egg-info/SOURCES.txt
new file mode 100644
index 00000000..f46b25b2
--- /dev/null
+++ b/hachoir.egg-info/SOURCES.txt
@@ -0,0 +1,403 @@
+COPYING
+MANIFEST.in
+README.rst
+benchmark.sh
+hachoir-metadata-csv
+runtests.py
+setup.py
+tox.ini
+doc/Makefile
+doc/authors.rst
+doc/changelog.rst
+doc/conf.py
+doc/contact.rst
+doc/developer.rst
+doc/editor.rst
+doc/gen_parser_list.py
+doc/grep.rst
+doc/hacking.rst
+doc/index.rst
+doc/install.rst
+doc/internals.rst
+doc/make.bat
+doc/metadata.rst
+doc/parser.rst
+doc/regex.rst
+doc/strip.rst
+doc/subfile.rst
+doc/urwid.rst
+doc/wx.rst
+doc/examples/editor_add_extra.py
+doc/examples/editor_gzip.py
+doc/examples/editor_zip.py
+doc/examples/metadata.py
+hachoir/__init__.py
+hachoir/grep.py
+hachoir/strip.py
+hachoir/test.py
+hachoir/urwid.py
+hachoir.egg-info/PKG-INFO
+hachoir.egg-info/SOURCES.txt
+hachoir.egg-info/dependency_links.txt
+hachoir.egg-info/entry_points.txt
+hachoir.egg-info/requires.txt
+hachoir.egg-info/top_level.txt
+hachoir.egg-info/zip-safe
+hachoir/core/__init__.py
+hachoir/core/benchmark.py
+hachoir/core/bits.py
+hachoir/core/cmd_line.py
+hachoir/core/config.py
+hachoir/core/dict.py
+hachoir/core/endian.py
+hachoir/core/error.py
+hachoir/core/event_handler.py
+hachoir/core/i18n.py
+hachoir/core/iso639.py
+hachoir/core/language.py
+hachoir/core/log.py
+hachoir/core/memory.py
+hachoir/core/profiler.py
+hachoir/core/text_handler.py
+hachoir/core/timeout.py
+hachoir/core/tools.py
+hachoir/editor/__init__.py
+hachoir/editor/field.py
+hachoir/editor/fieldset.py
+hachoir/editor/typed_field.py
+hachoir/field/__init__.py
+hachoir/field/basic_field_set.py
+hachoir/field/bit_field.py
+hachoir/field/byte_field.py
+hachoir/field/character.py
+hachoir/field/enum.py
+hachoir/field/fake_array.py
+hachoir/field/field.py
+hachoir/field/field_set.py
+hachoir/field/float.py
+hachoir/field/fragment.py
+hachoir/field/generic_field_set.py
+hachoir/field/helper.py
+hachoir/field/integer.py
+hachoir/field/link.py
+hachoir/field/padding.py
+hachoir/field/parser.py
+hachoir/field/seekable_field_set.py
+hachoir/field/static_field_set.py
+hachoir/field/string_field.py
+hachoir/field/sub_file.py
+hachoir/field/timestamp.py
+hachoir/field/vector.py
+hachoir/metadata/__init__.py
+hachoir/metadata/__main__.py
+hachoir/metadata/archive.py
+hachoir/metadata/audio.py
+hachoir/metadata/config.py
+hachoir/metadata/cr2.py
+hachoir/metadata/csv.py
+hachoir/metadata/file_system.py
+hachoir/metadata/filter.py
+hachoir/metadata/formatter.py
+hachoir/metadata/gtk.py
+hachoir/metadata/image.py
+hachoir/metadata/jpeg.py
+hachoir/metadata/main.py
+hachoir/metadata/metadata.py
+hachoir/metadata/metadata_item.py
+hachoir/metadata/misc.py
+hachoir/metadata/program.py
+hachoir/metadata/register.py
+hachoir/metadata/riff.py
+hachoir/metadata/safe.py
+hachoir/metadata/setter.py
+hachoir/metadata/timezone.py
+hachoir/metadata/video.py
+hachoir/metadata/qt/__init__.py
+hachoir/metadata/qt/main.py
+hachoir/parser/__init__.py
+hachoir/parser/guess.py
+hachoir/parser/parser.py
+hachoir/parser/parser_list.py
+hachoir/parser/template.py
+hachoir/parser/archive/__init__.py
+hachoir/parser/archive/ace.py
+hachoir/parser/archive/ar.py
+hachoir/parser/archive/arj.py
+hachoir/parser/archive/bomstore.py
+hachoir/parser/archive/bzip2_parser.py
+hachoir/parser/archive/cab.py
+hachoir/parser/archive/gzip_parser.py
+hachoir/parser/archive/lzx.py
+hachoir/parser/archive/mar.py
+hachoir/parser/archive/mozilla_ar.py
+hachoir/parser/archive/prs_pak.py
+hachoir/parser/archive/rar.py
+hachoir/parser/archive/rpm.py
+hachoir/parser/archive/sevenzip.py
+hachoir/parser/archive/tar.py
+hachoir/parser/archive/zip.py
+hachoir/parser/archive/zlib.py
+hachoir/parser/audio/__init__.py
+hachoir/parser/audio/aiff.py
+hachoir/parser/audio/au.py
+hachoir/parser/audio/flac.py
+hachoir/parser/audio/id3.py
+hachoir/parser/audio/itunesdb.py
+hachoir/parser/audio/midi.py
+hachoir/parser/audio/mod.py
+hachoir/parser/audio/modplug.py
+hachoir/parser/audio/mpeg_audio.py
+hachoir/parser/audio/real_audio.py
+hachoir/parser/audio/s3m.py
+hachoir/parser/audio/xm.py
+hachoir/parser/common/__init__.py
+hachoir/parser/common/deflate.py
+hachoir/parser/common/msdos.py
+hachoir/parser/common/tracker.py
+hachoir/parser/common/win32.py
+hachoir/parser/common/win32_lang_id.py
+hachoir/parser/container/__init__.py
+hachoir/parser/container/action_script.py
+hachoir/parser/container/asn1.py
+hachoir/parser/container/mkv.py
+hachoir/parser/container/mp4.py
+hachoir/parser/container/ogg.py
+hachoir/parser/container/realmedia.py
+hachoir/parser/container/riff.py
+hachoir/parser/container/swf.py
+hachoir/parser/file_system/__init__.py
+hachoir/parser/file_system/ext2.py
+hachoir/parser/file_system/fat.py
+hachoir/parser/file_system/iso9660.py
+hachoir/parser/file_system/linux_swap.py
+hachoir/parser/file_system/mbr.py
+hachoir/parser/file_system/ntfs.py
+hachoir/parser/file_system/reiser_fs.py
+hachoir/parser/game/__init__.py
+hachoir/parser/game/blp.py
+hachoir/parser/game/laf.py
+hachoir/parser/game/spider_man_video.py
+hachoir/parser/game/zsnes.py
+hachoir/parser/image/__init__.py
+hachoir/parser/image/bmp.py
+hachoir/parser/image/common.py
+hachoir/parser/image/cr2.py
+hachoir/parser/image/exif.py
+hachoir/parser/image/gif.py
+hachoir/parser/image/ico.py
+hachoir/parser/image/iptc.py
+hachoir/parser/image/jpeg.py
+hachoir/parser/image/pcx.py
+hachoir/parser/image/photoshop_metadata.py
+hachoir/parser/image/png.py
+hachoir/parser/image/psd.py
+hachoir/parser/image/tga.py
+hachoir/parser/image/tiff.py
+hachoir/parser/image/wmf.py
+hachoir/parser/image/xcf.py
+hachoir/parser/misc/__init__.py
+hachoir/parser/misc/bplist.py
+hachoir/parser/misc/chm.py
+hachoir/parser/misc/common.py
+hachoir/parser/misc/dsstore.py
+hachoir/parser/misc/file_3do.py
+hachoir/parser/misc/file_3ds.py
+hachoir/parser/misc/fit.py
+hachoir/parser/misc/gnome_keyring.py
+hachoir/parser/misc/hlp.py
+hachoir/parser/misc/lnk.py
+hachoir/parser/misc/mapsforge_map.py
+hachoir/parser/misc/msoffice.py
+hachoir/parser/misc/msoffice_summary.py
+hachoir/parser/misc/mstask.py
+hachoir/parser/misc/ole2.py
+hachoir/parser/misc/ole2_util.py
+hachoir/parser/misc/pcf.py
+hachoir/parser/misc/pdf.py
+hachoir/parser/misc/pifv.py
+hachoir/parser/misc/torrent.py
+hachoir/parser/misc/ttf.py
+hachoir/parser/misc/word_2.py
+hachoir/parser/misc/word_doc.py
+hachoir/parser/network/__init__.py
+hachoir/parser/network/common.py
+hachoir/parser/network/ouid.py
+hachoir/parser/network/tcpdump.py
+hachoir/parser/program/__init__.py
+hachoir/parser/program/elf.py
+hachoir/parser/program/exe.py
+hachoir/parser/program/exe_ne.py
+hachoir/parser/program/exe_pe.py
+hachoir/parser/program/exe_res.py
+hachoir/parser/program/java.py
+hachoir/parser/program/java_serialized.py
+hachoir/parser/program/macho.py
+hachoir/parser/program/nds.py
+hachoir/parser/program/prc.py
+hachoir/parser/program/python.py
+hachoir/parser/video/__init__.py
+hachoir/parser/video/amf.py
+hachoir/parser/video/asf.py
+hachoir/parser/video/flv.py
+hachoir/parser/video/fourcc.py
+hachoir/parser/video/mpeg_ts.py
+hachoir/parser/video/mpeg_video.py
+hachoir/regex/__init__.py
+hachoir/regex/parser.py
+hachoir/regex/pattern.py
+hachoir/regex/regex.py
+hachoir/stream/__init__.py
+hachoir/stream/input.py
+hachoir/stream/input_helper.py
+hachoir/stream/output.py
+hachoir/stream/stream.py
+hachoir/subfile/__init__.py
+hachoir/subfile/__main__.py
+hachoir/subfile/data_rate.py
+hachoir/subfile/main.py
+hachoir/subfile/output.py
+hachoir/subfile/pattern.py
+hachoir/subfile/search.py
+hachoir/wx/__init__.py
+hachoir/wx/__main__.py
+hachoir/wx/app.py
+hachoir/wx/dialogs.py
+hachoir/wx/dispatcher.py
+hachoir/wx/main.py
+hachoir/wx/unicode.py
+hachoir/wx/field_view/__init__.py
+hachoir/wx/field_view/core_type_menu.py
+hachoir/wx/field_view/core_type_menu_fwd.py
+hachoir/wx/field_view/core_type_menu_imp.py
+hachoir/wx/field_view/field_menu.py
+hachoir/wx/field_view/field_menu_fwd.py
+hachoir/wx/field_view/field_menu_imp.py
+hachoir/wx/field_view/field_menu_setup.py
+hachoir/wx/field_view/field_split_menu.py
+hachoir/wx/field_view/field_split_menu_fwd.py
+hachoir/wx/field_view/field_split_menu_imp.py
+hachoir/wx/field_view/field_view.py
+hachoir/wx/field_view/field_view_fwd.py
+hachoir/wx/field_view/field_view_imp.py
+hachoir/wx/field_view/field_view_setup.py
+hachoir/wx/field_view/format.py
+hachoir/wx/field_view/mutator.py
+hachoir/wx/field_view/stubs.py
+hachoir/wx/frame_view/__init__.py
+hachoir/wx/frame_view/frame_view.py
+hachoir/wx/frame_view/frame_view_fwd.py
+hachoir/wx/frame_view/frame_view_imp.py
+hachoir/wx/frame_view/frame_view_setup.py
+hachoir/wx/hex_view/__init__.py
+hachoir/wx/hex_view/file_cache.py
+hachoir/wx/hex_view/hex_view.py
+hachoir/wx/hex_view/hex_view_setup.py
+hachoir/wx/resource/__init__.py
+hachoir/wx/resource/hachoir_wx.xrc
+hachoir/wx/resource/resource.py
+hachoir/wx/tree_view/__init__.py
+hachoir/wx/tree_view/tree_view.py
+hachoir/wx/tree_view/tree_view_setup.py
+tests/regex_regression.rst
+tests/test_doc.py
+tests/test_editor.py
+tests/test_grep.py
+tests/test_metadata.py
+tests/test_parser.py
+tests/test_strip.py
+tests/files/08lechat_hq_fr.mp3
+tests/files/10min.mkv
+tests/files/25min.aifc
+tests/files/32bpp.tga
+tests/files/7zip.chm
+tests/files/Panasonic_AG_HMC_151.MTS
+tests/files/ReferenceMap.class
+tests/files/andorra.map
+tests/files/angle-bear-48x48.ani
+tests/files/anti-arpeggio_tune.ptm
+tests/files/archive.7z
+tests/files/arp_dns_ping_dns.tcpdump
+tests/files/article01.bmp
+tests/files/audio_8khz_8bit_ulaw_4s39.au
+tests/files/breakdance.flv
+tests/files/cacert_class3.der
+tests/files/canon.raw.cr2
+tests/files/cd_0008_5C48_1m53s.cda
+tests/files/cercle.exe
+tests/files/claque-beignet.swf
+tests/files/com.apple.pkg.BaseSystemResources.bom
+tests/files/cross.xcf
+tests/files/debian-31r4-i386-binary-1.iso.torrent
+tests/files/default_mount_opts.ext2
+tests/files/deja_vu_serif-2.7.ttf
+tests/files/dell8.fat16
+tests/files/dontyou.xm
+tests/files/eula.exe
+tests/files/example2.arj
+tests/files/example4_chapters.arj
+tests/files/firstrun.rm
+tests/files/flashmob.mkv
+tests/files/free-software-song.midi.bz2
+tests/files/ftp-0.17-537.i586.rpm
+tests/files/georgia.cab
+tests/files/get-versions.64bit.little.elf
+tests/files/globe.wmf
+tests/files/gps.jpg
+tests/files/grasslogo_vector.emf
+tests/files/green_fire.jpg
+tests/files/hachoir-core.ace
+tests/files/hachoir-core.rar
+tests/files/hachoir.org.sxw
+tests/files/hero.tga
+tests/files/hotel_california.flac
+tests/files/india_map.gif
+tests/files/indiana.mid
+tests/files/interlude_david_aubrun.ogg
+tests/files/jpeg.exif.photoshop.jpg
+tests/files/kde_click.wav
+tests/files/kde_haypo_corner.bmp
+tests/files/kino14s.laf
+tests/files/ladouce_1h15.wav
+tests/files/lara_croft.pcx
+tests/files/linux_swap_9pages
+tests/files/logo-kubuntu.png
+tests/files/macos_10.12.macho
+tests/files/macos_10.5.macho
+tests/files/marc_kravetz.mp3
+tests/files/matrix_ping_pong.wmv
+tests/files/mbr_linux_and_ext
+tests/files/mev.32bit.big.elf
+tests/files/mev.64bit.big.elf
+tests/files/my60k.ext2
+tests/files/nitrodir.nds
+tests/files/ocr10.laf
+tests/files/paktest.pak
+tests/files/pentax_320x240.mov
+tests/files/pikachu.wmf
+tests/files/ping_20020927-3ubuntu2
+tests/files/png_331x90x8_truncated.png
+tests/files/pyc_example_1.5.2_pyc.bin
+tests/files/pyc_example_2.2.3_pyc.bin
+tests/files/pyc_example_2.5c1_pyc.bin
+tests/files/python.cpython-312.pyc.bin
+tests/files/python.cpython-37.pyc.bin
+tests/files/quicktime.mp4
+tests/files/radpoor.doc
+tests/files/reiserfs_v3_332k.bin
+tests/files/sample.tif
+tests/files/sample.ts
+tests/files/satellite_one.s3m
+tests/files/sheep_on_drugs.mp3
+tests/files/small_text.tar
+tests/files/smallville.s03e02.avi
+tests/files/steganography.mp3
+tests/files/swat.blp
+tests/files/test.txt.gz
+tests/files/test_file.fit
+tests/files/twunk_16.exe
+tests/files/types.ext2
+tests/files/usa_railroad.jpg
+tests/files/vim.lnk
+tests/files/weka.model
+tests/files/wormux_32x32_16c.ico
+tests/files/yellowdude.3ds
\ No newline at end of file
diff --git a/hachoir.egg-info/dependency_links.txt b/hachoir.egg-info/dependency_links.txt
new file mode 100644
index 00000000..8b137891
--- /dev/null
+++ b/hachoir.egg-info/dependency_links.txt
@@ -0,0 +1 @@
+
diff --git a/hachoir.egg-info/entry_points.txt b/hachoir.egg-info/entry_points.txt
new file mode 100644
index 00000000..a32f78fa
--- /dev/null
+++ b/hachoir.egg-info/entry_points.txt
@@ -0,0 +1,8 @@
+[console_scripts]
+hachoir-grep = hachoir.grep:main
+hachoir-metadata = hachoir.metadata.main:main
+hachoir-strip = hachoir.strip:main
+hachoir-urwid = hachoir.urwid:main
+
+[gui_scripts]
+hachoir-wx = hachoir.wx.main:main
diff --git a/hachoir.egg-info/requires.txt b/hachoir.egg-info/requires.txt
new file mode 100644
index 00000000..d5586164
--- /dev/null
+++ b/hachoir.egg-info/requires.txt
@@ -0,0 +1,7 @@
+
+[urwid]
+urwid==1.3.1
+
+[wx]
+darkdetect
+wxPython==4.*
diff --git a/hachoir.egg-info/top_level.txt b/hachoir.egg-info/top_level.txt
new file mode 100644
index 00000000..beff499a
--- /dev/null
+++ b/hachoir.egg-info/top_level.txt
@@ -0,0 +1 @@
+hachoir
diff --git a/hachoir.egg-info/zip-safe b/hachoir.egg-info/zip-safe
new file mode 100644
index 00000000..8b137891
--- /dev/null
+++ b/hachoir.egg-info/zip-safe
@@ -0,0 +1 @@
+
diff --git a/hachoir/__init__.py b/hachoir/__init__.py
index d34f5e55..4e2da0af 100644
--- a/hachoir/__init__.py
+++ b/hachoir/__init__.py
@@ -1,2 +1,2 @@
-VERSION = (3, 1, 0)
+VERSION = (3, 2, 0)
__version__ = ".".join(map(str, VERSION))
diff --git a/hachoir/core/dict.py b/hachoir/core/dict.py
index 053bd4b2..c55e3f4c 100644
--- a/hachoir/core/dict.py
+++ b/hachoir/core/dict.py
@@ -168,7 +168,7 @@ class Dict(object):
_index = index
if index < 0:
index += len(self._value_list)
- if not(0 <= index <= len(self._value_list)):
+ if not (0 <= index <= len(self._value_list)):
raise IndexError("Insert error: index '%s' is invalid" % _index)
for item_key, item_index in self._index.items():
if item_index >= index:
diff --git a/hachoir/core/tools.py b/hachoir/core/tools.py
index 7655c0e8..43575f22 100644
--- a/hachoir/core/tools.py
+++ b/hachoir/core/tools.py
@@ -493,7 +493,7 @@ def timestampUNIX(value):
"""
if not isinstance(value, (float, int)):
raise TypeError("timestampUNIX(): an integer or float is required")
- if not(0 <= value <= 2147483647):
+ if not (0 <= value <= 2147483647):
raise ValueError("timestampUNIX(): value have to be in 0..2147483647")
return UNIX_TIMESTAMP_T0 + timedelta(seconds=value)
@@ -514,7 +514,7 @@ def timestampMac32(value):
"""
if not isinstance(value, (float, int)):
raise TypeError("an integer or float is required")
- if not(0 <= value <= 4294967295):
+ if not (0 <= value <= 4294967295):
return "invalid Mac timestamp (%s)" % value
return MAC_TIMESTAMP_T0 + timedelta(seconds=value)
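A large share of this release is the mechanical `not(...)` → `not (...)` cleanup required by newer flake8 (E275, missing whitespace after keyword); the logic is unchanged. As a standalone sketch of what the range check in `timestampUNIX()` does (names simplified, not the hachoir API):

```python
from datetime import datetime, timedelta

UNIX_TIMESTAMP_T0 = datetime(1970, 1, 1)

def timestamp_unix(value):
    # Reject non-numeric input, mirroring hachoir's type check
    if not isinstance(value, (float, int)):
        raise TypeError("an integer or float is required")
    # Only 32-bit-safe timestamps are accepted (0 .. 2**31 - 1)
    if not (0 <= value <= 2147483647):
        raise ValueError("value has to be in 0..2147483647")
    return UNIX_TIMESTAMP_T0 + timedelta(seconds=value)

print(timestamp_unix(0))  # 1970-01-01 00:00:00
```

Values outside the signed 32-bit range raise `ValueError` rather than producing an out-of-range datetime.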
diff --git a/hachoir/editor/field.py b/hachoir/editor/field.py
index 0724c230..f3d43c0d 100644
--- a/hachoir/editor/field.py
+++ b/hachoir/editor/field.py
@@ -63,7 +63,7 @@ class FakeField(object):
addr = self._parent._getFieldInputAddress(self._name)
input = self._parent.input
stream = input.stream
- if size % 8:
+ if size % 8 or addr % 8:
output.copyBitsFrom(stream, addr, size, input.endian)
else:
output.copyBytesFrom(stream, addr, size // 8)
diff --git a/hachoir/editor/typed_field.py b/hachoir/editor/typed_field.py
index 38d63792..fe646962 100644
--- a/hachoir/editor/typed_field.py
+++ b/hachoir/editor/typed_field.py
@@ -101,7 +101,7 @@ class EditableBits(EditableFixedField):
self._is_altered = True
def _setValue(self, value):
- if not(0 <= value < (1 << self._size)):
+ if not (0 <= value < (1 << self._size)):
raise ValueError("Invalid value, must be in range %s..%s"
% (0, (1 << self._size) - 1))
self._value = value
@@ -248,7 +248,7 @@ class EditableInteger(EditableFixedField):
else:
valid = self.VALID_VALUE_UNSIGNED
minval, maxval = valid[self._size]
- if not(minval <= value <= maxval):
+ if not (minval <= value <= maxval):
raise ValueError("Invalid value, must be in range %s..%s"
% (minval, maxval))
self._value = value
@@ -274,7 +274,7 @@ class EditableTimestampMac32(EditableFixedField):
EditableFixedField.__init__(self, parent, name, value, 32)
def _setValue(self, value):
- if not(self.minval <= value <= self.maxval):
+ if not (self.minval <= value <= self.maxval):
raise ValueError("Invalid value, must be in range %s..%s"
% (self.minval, self.maxval))
self._value = value
diff --git a/hachoir/field/__init__.py b/hachoir/field/__init__.py
index aea37526..cbc2d937 100644
--- a/hachoir/field/__init__.py
+++ b/hachoir/field/__init__.py
@@ -34,7 +34,8 @@ from hachoir.field.vector import GenericVector, UserVector # noqa
# Complex types
from hachoir.field.float import Float32, Float64, Float80 # noqa
-from hachoir.field.timestamp import (GenericTimestamp, # noqa
+from hachoir.field.timestamp import ( # noqa
+ GenericTimestamp,
TimestampUnix32, TimestampUnix64, TimestampMac32, TimestampUUID60,
TimestampWin64, TimedeltaMillisWin64,
DateTimeMSDOS32, TimeDateMSDOS32, TimedeltaWin64)
diff --git a/hachoir/field/byte_field.py b/hachoir/field/byte_field.py
index c372ad83..e0bdb083 100644
--- a/hachoir/field/byte_field.py
+++ b/hachoir/field/byte_field.py
@@ -20,7 +20,7 @@ class RawBytes(Field):
def __init__(self, parent, name, length, description="Raw data"):
assert issubclass(parent.__class__, Field)
- if not(0 < length <= MAX_LENGTH):
+ if not (0 < length <= MAX_LENGTH):
raise FieldError("Invalid RawBytes length (%s)!" % length)
Field.__init__(self, parent, name, length * 8, description)
self._display = None
diff --git a/hachoir/field/generic_field_set.py b/hachoir/field/generic_field_set.py
index e67e4b56..74d8898f 100644
--- a/hachoir/field/generic_field_set.py
+++ b/hachoir/field/generic_field_set.py
@@ -117,7 +117,7 @@ class GenericFieldSet(BasicFieldSet):
_getSize, doc="Size in bits, may create all fields to get size")
def _getCurrentSize(self):
- assert not(self.done)
+ assert not (self.done)
return self._current_size
current_size = property(_getCurrentSize)
diff --git a/hachoir/field/padding.py b/hachoir/field/padding.py
index 80b082dc..4c7265c8 100644
--- a/hachoir/field/padding.py
+++ b/hachoir/field/padding.py
@@ -23,7 +23,7 @@ class PaddingBits(Bits):
self._display_pattern = self.checkPattern()
def checkPattern(self):
- if not(config.check_padding_pattern):
+ if not (config.check_padding_pattern):
return False
if self.pattern != 0:
return False
@@ -72,7 +72,7 @@ class PaddingBytes(Bytes):
self._display_pattern = self.checkPattern()
def checkPattern(self):
- if not(config.check_padding_pattern):
+ if not (config.check_padding_pattern):
return False
if self.pattern is None:
return False
diff --git a/hachoir/field/string_field.py b/hachoir/field/string_field.py
index 41e47d28..742634d2 100644
--- a/hachoir/field/string_field.py
+++ b/hachoir/field/string_field.py
@@ -244,7 +244,7 @@ class GenericString(Bytes):
and err.end == len(text) \
and self._charset == "UTF-16-LE":
try:
- text = str(text + "\0", self._charset, "strict")
+ text = str(text + b"\0", self._charset, "strict")
self.warning(
"Fix truncated %s string: add missing nul byte" % self._charset)
return text
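The one-character change above is a real bug fix: `text` is a `bytes` object at this point, so concatenating the *str* literal `"\0"` raised `TypeError` instead of repairing the truncated string. A minimal standalone illustration of the repair (hypothetical helper, not the hachoir API):

```python
def decode_utf16le(raw: bytes) -> str:
    # A UTF-16-LE string truncated mid code unit is missing its final
    # byte; re-appending it as bytes (the 3.2.0 fix) lets strict
    # decoding succeed instead of raising.
    try:
        return raw.decode("UTF-16-LE", "strict")
    except UnicodeDecodeError as err:
        if err.end == len(raw):
            return (raw + b"\0").decode("UTF-16-LE", "strict")
        raise
```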
diff --git a/hachoir/field/timestamp.py b/hachoir/field/timestamp.py
index 0f7d3a56..e45b8320 100644
--- a/hachoir/field/timestamp.py
+++ b/hachoir/field/timestamp.py
@@ -61,7 +61,7 @@ class TimeDateMSDOS32(FieldSet):
def createValue(self):
return datetime(
- 1980 + self["year"].value, self["month"].value, self["day"].value,
+ 1980 + self["year"].value, self["month"].value or 1, self["day"].value or 1,
self["hour"].value, self["minute"].value, 2 * self["second"].value)
def createDisplay(self):
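`month` and `day` are raw bit fields in an MSDOS timestamp, so a zeroed field in a malformed file previously made `datetime()` raise `ValueError`; the patch clamps both to 1. A self-contained sketch of the patched `createValue()` logic (hypothetical helper):

```python
from datetime import datetime

def msdos_datetime(year, month, day, hour, minute, second2):
    # A zero month or day would make datetime() raise ValueError;
    # clamp to 1 as the patched createValue() does. MSDOS stores
    # years since 1980 and seconds in units of two.
    return datetime(1980 + year, month or 1, day or 1,
                    hour, minute, 2 * second2)
```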
diff --git a/hachoir/field/vector.py b/hachoir/field/vector.py
index 8b5474e6..fabb70e1 100644
--- a/hachoir/field/vector.py
+++ b/hachoir/field/vector.py
@@ -7,7 +7,7 @@ class GenericVector(FieldSet):
# Sanity checks
assert issubclass(item_class, Field)
assert isinstance(item_class.static_size, int)
- if not(0 < nb_items):
+ if not (0 < nb_items):
raise ParserError('Unable to create empty vector "%s" in %s'
% (name, parent.path))
size = nb_items * item_class.static_size
diff --git a/hachoir/grep.py b/hachoir/grep.py
index 4a84d93f..b1c46cc2 100644
--- a/hachoir/grep.py
+++ b/hachoir/grep.py
@@ -63,7 +63,7 @@ def parseOptions():
if len(arguments) < 2:
parser.print_help()
sys.exit(1)
- pattern = str(arguments[0], "ascii")
+ pattern = arguments[0]
filenames = arguments[1:]
return values, pattern, filenames
@@ -169,11 +169,11 @@ class ConsoleGrep(Grep):
def runGrep(values, pattern, filenames):
grep = ConsoleGrep()
grep.display_filename = (1 < len(filenames))
- grep.display_address = not(values.no_addr)
+ grep.display_address = not values.no_addr
grep.display_path = values.path
- grep.display_value = not(values.no_value)
+ grep.display_value = not values.no_value
grep.display_percent = values.percent
- grep.display = not(values.bench)
+ grep.display = not values.bench
for filename in filenames:
grep.searchFile(filename, pattern, case_sensitive=values.case)
diff --git a/hachoir/metadata/main.py b/hachoir/metadata/main.py
index b652f9ec..7f2e9873 100644
--- a/hachoir/metadata/main.py
+++ b/hachoir/metadata/main.py
@@ -85,7 +85,7 @@ def processFile(values, filename,
with parser:
# Extract metadata
- extract_metadata = not(values.mime or values.type)
+ extract_metadata = not (values.mime or values.type)
if extract_metadata:
try:
metadata = extractMetadata(parser, values.quality)
@@ -124,7 +124,7 @@ def processFile(values, filename,
def processFiles(values, filenames, display=True):
- human = not(values.raw)
+ human = not values.raw
ok = True
priority = int(values.level) * 100 + 99
display_filename = (1 < len(filenames))
diff --git a/hachoir/metadata/qt/dialog.ui b/hachoir/metadata/qt/dialog.ui
deleted file mode 100644
index 498a8dae..00000000
--- a/hachoir/metadata/qt/dialog.ui
+++ /dev/null
@@ -1,64 +0,0 @@
-<ui version="4.0" >
- <class>Form</class>
- <widget class="QWidget" name="Form" >
- <property name="geometry" >
- <rect>
- <x>0</x>
- <y>0</y>
- <width>441</width>
- <height>412</height>
- </rect>
- </property>
- <property name="windowTitle" >
- <string>hachoir-metadata</string>
- </property>
- <layout class="QVBoxLayout" name="verticalLayout" >
- <item>
- <layout class="QHBoxLayout" name="horizontalLayout_2" >
- <item>
- <widget class="QPushButton" name="open_button" >
- <property name="text" >
- <string>Open</string>
- </property>
- </widget>
- </item>
- <item>
- <widget class="QComboBox" name="files_combo" >
- <property name="sizePolicy" >
- <sizepolicy vsizetype="Fixed" hsizetype="Expanding" >
- <horstretch>0</horstretch>
- <verstretch>0</verstretch>
- </sizepolicy>
- </property>
- </widget>
- </item>
- </layout>
- </item>
- <item>
- <widget class="QTableWidget" name="metadata_table" >
- <property name="alternatingRowColors" >
- <bool>true</bool>
- </property>
- <property name="showGrid" >
- <bool>false</bool>
- </property>
- <property name="rowCount" >
- <number>0</number>
- </property>
- <property name="columnCount" >
- <number>0</number>
- </property>
- </widget>
- </item>
- <item>
- <widget class="QPushButton" name="quit_button" >
- <property name="text" >
- <string>Quit</string>
- </property>
- </widget>
- </item>
- </layout>
- </widget>
- <resources/>
- <connections/>
-</ui>
diff --git a/hachoir/parser/archive/__init__.py b/hachoir/parser/archive/__init__.py
index d35ea0e8..3ec59338 100644
--- a/hachoir/parser/archive/__init__.py
+++ b/hachoir/parser/archive/__init__.py
@@ -1,5 +1,6 @@
from hachoir.parser.archive.ace import AceFile # noqa
from hachoir.parser.archive.ar import ArchiveFile # noqa
+from hachoir.parser.archive.arj import ArjParser # noqa
from hachoir.parser.archive.bomstore import BomFile # noqa
from hachoir.parser.archive.bzip2_parser import Bzip2Parser # noqa
from hachoir.parser.archive.cab import CabFile # noqa
diff --git a/hachoir/parser/archive/arj.py b/hachoir/parser/archive/arj.py
new file mode 100644
index 00000000..4b8a9bd2
--- /dev/null
+++ b/hachoir/parser/archive/arj.py
@@ -0,0 +1,155 @@
+"""
+ARJ archive file parser
+
+https://github.com/FarGroup/FarManager/blob/master/plugins/multiarc/arc.doc/arj.txt
+"""
+
+from hachoir.core.endian import LITTLE_ENDIAN
+from hachoir.field import (FieldSet, ParserError,
+ CString, Enum, RawBytes,
+ UInt8, UInt16, UInt32,
+ Bytes)
+from hachoir.parser import Parser
+
+HOST_OS = {
+ 0: "MSDOS",
+ 1: "PRIMOS",
+ 2: "UNIX",
+ 3: "AMIGA",
+ 4: "MACDOS",
+ 5: "OS/2",
+ 6: "APPLE GS",
+ 7: "ATARI ST",
+ 8: "NEXT",
+ 9: "VAX VMS",
+ 10: "WIN95",
+ 11: "WIN32",
+}
+
+FILE_TYPE = {
+ 0: "BINARY",
+ 1: "TEXT",
+ 2: "COMMENT",
+ 3: "DIRECTORY",
+ 4: "VOLUME",
+ 5: "CHAPTER",
+}
+
+MAGIC = b"\x60\xEA"
+
+
+class BaseBlock(FieldSet):
+ @property
+ def isEmpty(self):
+ return self["basic_header_size"].value == 0
+
+ def _header_start_fields(self):
+ yield Bytes(self, "magic", len(MAGIC))
+ if self["magic"].value != MAGIC:
+ raise ParserError("Wrong header magic")
+ yield UInt16(self, "basic_header_size", "zero if end of archive")
+ if not self.isEmpty:
+ yield UInt8(self, "first_hdr_size")
+ yield UInt8(self, "archiver_version")
+ yield UInt8(self, "min_archiver_version")
+ yield Enum(UInt8(self, "host_os"), HOST_OS)
+ yield UInt8(self, "arj_flags")
+
+ def _header_end_fields(self):
+ yield UInt8(self, "last_chapter")
+ fhs = self["first_hdr_size"]
+ name_position = fhs.address // 8 + fhs.value
+ current_position = self["last_chapter"].address // 8 + 1
+ if name_position > current_position:
+ yield RawBytes(self, "reserved2", name_position - current_position)
+
+ yield CString(self, "filename", "File name", charset="ASCII")
+ yield CString(self, "comment", "Comment", charset="ASCII")
+ yield UInt32(self, "crc", "Header CRC")
+
+ i = 0
+ while not self.eof:
+ yield UInt16(self, f"extended_header_size_{i}")
+ cur_size = self[f"extended_header_size_{i}"].value
+ if cur_size == 0:
+ break
+ yield RawBytes(self, "extended_header_data", cur_size)
+ yield UInt32(self, f"extended_header_crc_{i}")
+ i += 1
+
+ def validate(self):
+ if self.stream.readBytes(0, 2) != MAGIC:
+ return "Invalid magic"
+ return True
+
+
+class Header(BaseBlock):
+ def createFields(self):
+ yield from self._header_start_fields()
+ if not self.isEmpty:
+ yield UInt8(self, "security_version")
+ yield Enum(UInt8(self, "file_type"), FILE_TYPE)
+ yield UInt8(self, "reserved")
+ yield UInt32(self, "date_time_created")
+ yield UInt32(self, "date_time_modified")
+ yield UInt32(self, "archive_size")
+ yield UInt32(self, "security_envelope_file_position")
+ yield UInt16(self, "filespec_position")
+ yield UInt16(self, "security_envelope_data_len")
+ yield UInt8(self, "encryption_version")
+ yield from self._header_end_fields()
+
+ def createDescription(self):
+ if self.isEmpty:
+ return "Empty main header"
+ return "Main header of '%s'" % self["filename"].value
+
+
+class Block(BaseBlock):
+ def createFields(self):
+ yield from self._header_start_fields()
+ if not self.isEmpty:
+ yield UInt8(self, "method")
+ yield Enum(UInt8(self, "file_type"), FILE_TYPE)
+ yield UInt8(self, "reserved")
+ yield UInt32(self, "date_time_modified")
+ yield UInt32(self, "compressed_size")
+ yield UInt32(self, "original_size")
+ yield UInt32(self, "original_file_crc")
+ yield UInt16(self, "filespec_position")
+ yield UInt16(self, "file_access_mode")
+ yield UInt8(self, "first_chapter")
+ yield from self._header_end_fields()
+ compressed_size = self["compressed_size"].value
+ if compressed_size > 0:
+ yield RawBytes(self, "compressed_data", compressed_size)
+
+ def createDescription(self):
+ if self.isEmpty:
+ return "Empty file header"
+ return "File header of '%s'" % self["filename"].value
+
+
+class ArjParser(Parser):
+ endian = LITTLE_ENDIAN
+ PARSER_TAGS = {
+ "id": "arj",
+ "category": "archive",
+ "file_ext": ("arj",),
+ "min_size": 4 * 8,
+ "description": "ARJ archive"
+ }
+
+ def validate(self):
+ if self.stream.readBytes(0, 2) != MAGIC:
+ return "Invalid magic"
+ return True
+
+ def createFields(self):
+ yield Header(self, "header")
+ if not self["header"].isEmpty:
+ while not self.eof:
+ block = Block(self, "file_header[]")
+ yield block
+ if block.isEmpty:
+ break
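The new `ArjParser` recognises archives by the two-byte magic `0x60 0xEA`, as in its `validate()` method. A minimal standalone check of that signature (not the hachoir API):

```python
MAGIC = b"\x60\xEA"

def looks_like_arj(f):
    # Mirrors ArjParser.validate(): an ARJ archive must start
    # with the bytes 0x60 0xEA.
    return f.read(2) == MAGIC
```

For example, `looks_like_arj(io.BytesIO(b"\x60\xEA..."))` returns True, while a ZIP signature does not match.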
diff --git a/hachoir/parser/archive/bzip2_parser.py b/hachoir/parser/archive/bzip2_parser.py
index 9c2b9211..2e91d690 100644
--- a/hachoir/parser/archive/bzip2_parser.py
+++ b/hachoir/parser/archive/bzip2_parser.py
@@ -57,8 +57,8 @@ class ZeroTerminatedNumber(Field):
return self._value
-def move_to_front(l, c):
- l[:] = l[c:c + 1] + l[0:c] + l[c + 1:]
+def move_to_front(seq, index):
+ seq[:] = seq[index:index + 1] + seq[0:index] + seq[index + 1:]
class Bzip2Bitmap(FieldSet):
@@ -218,7 +218,7 @@ class Bzip2Parser(Parser):
def validate(self):
if self.stream.readBytes(0, 3) != b'BZh':
return "Wrong file signature"
- if not("1" <= self["blocksize"].value <= "9"):
+ if not ("1" <= self["blocksize"].value <= "9"):
return "Wrong blocksize"
return True
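Renaming `l` to `seq` fixes flake8's E741 (ambiguous variable name) without changing behaviour. The helper implements the move-to-front transform used during bzip2 decoding:

```python
def move_to_front(seq, index):
    # Move seq[index] to the front, shifting the earlier items right;
    # mutates seq in place via slice assignment.
    seq[:] = seq[index:index + 1] + seq[0:index] + seq[index + 1:]

symbols = list(b"abcd")
move_to_front(symbols, 2)
# symbols is now list(b"cabd")
```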
diff --git a/hachoir/parser/archive/lzx.py b/hachoir/parser/archive/lzx.py
index 8db16f00..9d6baf6e 100644
--- a/hachoir/parser/archive/lzx.py
+++ b/hachoir/parser/archive/lzx.py
@@ -13,6 +13,7 @@ from hachoir.field import (FieldSet,
from hachoir.core.endian import MIDDLE_ENDIAN, LITTLE_ENDIAN
from hachoir.core.tools import paddingSize
from hachoir.parser.archive.zlib import build_tree, HuffmanCode, extend_data
+import struct
class LZXPreTreeEncodedTree(FieldSet):
@@ -146,6 +147,8 @@ class LZXBlock(FieldSet):
self.window_size = self.WINDOW_SIZE[self.compression_level]
self.block_type = self["block_type"].value
curlen = len(self.parent.uncompressed_data)
+ intel_started = False # Do we perform Intel jump fixups on this block?
+
if self.block_type in (1, 2): # Verbatim or aligned offset block
if self.block_type == 2:
for i in range(8):
@@ -156,6 +159,8 @@ class LZXBlock(FieldSet):
yield LZXPreTreeEncodedTree(self, "main_tree_rest", self.window_size * 8)
main_tree = build_tree(
self["main_tree_start"].lengths + self["main_tree_rest"].lengths)
+ if self["main_tree_start"].lengths[0xE8]:
+ intel_started = True
yield LZXPreTreeEncodedTree(self, "length_tree", 249)
length_tree = build_tree(self["length_tree"].lengths)
current_decoded_size = 0
@@ -169,7 +174,7 @@ class LZXBlock(FieldSet):
field._description = "Literal value %r" % chr(
field.realvalue)
current_decoded_size += 1
- self.parent.uncompressed_data += chr(field.realvalue)
+ self.parent.uncompressed_data.append(field.realvalue)
yield field
continue
position_header, length_header = divmod(
@@ -243,8 +248,7 @@ class LZXBlock(FieldSet):
self.parent.r2 = self.parent.r1
self.parent.r1 = self.parent.r0
self.parent.r0 = position
- self.parent.uncompressed_data = extend_data(
- self.parent.uncompressed_data, length, position)
+ extend_data(self.parent.uncompressed_data, length, position)
current_decoded_size += length
elif self.block_type == 3: # Uncompressed block
padding = paddingSize(self.address + self.current_size, 16)
@@ -253,6 +257,7 @@ class LZXBlock(FieldSet):
else:
yield PaddingBits(self, "padding[]", 16)
self.endian = LITTLE_ENDIAN
+ intel_started = True # apparently intel fixup may be needed on uncompressed blocks?
yield UInt32(self, "r[]", "New value of R0")
yield UInt32(self, "r[]", "New value of R1")
yield UInt32(self, "r[]", "New value of R2")
@@ -266,12 +271,40 @@ class LZXBlock(FieldSet):
else:
raise ParserError("Unknown block type %d!" % self.block_type)
+ # Fixup Intel jumps if necessary
+ if (
+ intel_started
+ and self.parent["filesize_indicator"].value
+ and self.parent["filesize"].value > 0
+ ):
+ # Note that we're decoding a block-at-a-time instead of a frame-at-a-time,
+ # so we need to handle the frame boundaries carefully.
+ filesize = self.parent["filesize"].value
+ start_pos = max(0, curlen - 10) # We may need to correct something from the last block
+ end_pos = len(self.parent.uncompressed_data) - 10
+ while 1:
+ jmp_pos = self.parent.uncompressed_data.find(b"\xE8", start_pos, end_pos)
+ if jmp_pos == -1:
+ break
+ if (jmp_pos % 32768) >= (32768 - 10):
+ # jumps at the end of frames are not fixed up
+ start_pos = jmp_pos + 1
+ continue
+ abs_off, = struct.unpack("<i", self.parent.uncompressed_data[jmp_pos + 1:jmp_pos + 5])
+ if -jmp_pos <= abs_off < filesize:
+ if abs_off < 0:
+ rel_off = abs_off + filesize
+ else:
+ rel_off = abs_off - jmp_pos
+ self.parent.uncompressed_data[jmp_pos + 1:jmp_pos + 5] = struct.pack("<i", rel_off)
+ start_pos = jmp_pos + 5
+
class LZXStream(Parser):
endian = MIDDLE_ENDIAN
def createFields(self):
- self.uncompressed_data = ""
+ self.uncompressed_data = bytearray()
self.r0 = 1
self.r1 = 1
self.r2 = 1
@@ -291,6 +324,6 @@ class LZXStream(Parser):
def lzx_decompress(stream, window_bits):
data = LZXStream(stream)
data.compr_level = window_bits
- for unused in data:
+ for _ in data:
pass
return data.uncompressed_data
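The largest functional change in `lzx.py` is the new "Intel E8" post-processing pass: LZX may preprocess x86 code by rewriting `CALL` (opcode `0xE8`) targets as absolute file offsets, and the decompressor must convert them back to relative offsets. A simplified sketch of that conversion, which ignores the 32 KB frame-boundary rule the real patch handles:

```python
import struct

def intel_e8_fixup(data: bytearray, filesize: int):
    # Scan for 0xE8 opcodes and translate the following 32-bit
    # absolute offset back into a call-relative one, in place.
    pos = 0
    end = len(data) - 10  # the last 10 bytes are never fixed up
    while True:
        pos = data.find(b"\xE8", pos, end)
        if pos == -1:
            break
        abs_off, = struct.unpack("<i", data[pos + 1:pos + 5])
        if -pos <= abs_off < filesize:
            rel_off = abs_off + filesize if abs_off < 0 else abs_off - pos
            data[pos + 1:pos + 5] = struct.pack("<i", rel_off)
            pos += 5
        else:
            pos += 1
```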
diff --git a/hachoir/parser/archive/mar.py b/hachoir/parser/archive/mar.py
index be71607b..a6efb381 100644
--- a/hachoir/parser/archive/mar.py
+++ b/hachoir/parser/archive/mar.py
@@ -44,7 +44,7 @@ class MarFile(Parser):
return "Invalid magic"
if self["version"].value != 3:
return "Invalid version"
- if not(1 <= self["nb_file"].value <= MAX_NB_FILE):
+ if not (1 <= self["nb_file"].value <= MAX_NB_FILE):
return "Invalid number of file"
return True
diff --git a/hachoir/parser/archive/zlib.py b/hachoir/parser/archive/zlib.py
index ef55ca0b..4c6e0d28 100644
--- a/hachoir/parser/archive/zlib.py
+++ b/hachoir/parser/archive/zlib.py
@@ -14,13 +14,13 @@ from hachoir.core.text_handler import textHandler, hexadecimal
from hachoir.core.tools import paddingSize, alignValue
-def extend_data(data, length, offset):
- """Extend data using a length and an offset."""
+def extend_data(data: bytearray, length, offset):
+ """Extend data using a length and an offset, LZ-style."""
if length >= offset:
new_data = data[-offset:] * (alignValue(length, offset) // offset)
- return data + new_data[:length]
+ data += new_data[:length]
else:
- return data + data[-offset:-offset + length]
+ data += data[-offset:-offset + length]
def build_tree(lengths):
@@ -136,9 +136,9 @@ class DeflateBlock(FieldSet):
CODE_LENGTH_ORDER = [16, 17, 18, 0, 8, 7, 9,
6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]
- def __init__(self, parent, name, uncomp_data="", *args, **kwargs):
+ def __init__(self, parent, name, uncomp_data=b"", *args, **kwargs):
FieldSet.__init__(self, parent, name, *args, **kwargs)
- self.uncomp_data = uncomp_data
+ self.uncomp_data = bytearray(uncomp_data)
def createFields(self):
yield Bit(self, "final", "Is this the final block?") # BFINAL
@@ -227,7 +227,7 @@ class DeflateBlock(FieldSet):
field._description = "Literal Code %r (Huffman Code %i)" % (
chr(value), field.value)
yield field
- self.uncomp_data += chr(value)
+ self.uncomp_data.append(value)
if value == 256:
field._description = "Block Terminator Code (256) (Huffman Code %i)" % field.value
yield field
@@ -267,15 +267,14 @@ class DeflateBlock(FieldSet):
extrafield._description = "Distance Extra Bits (%i), total length %i" % (
extrafield.value, distance)
yield extrafield
- self.uncomp_data = extend_data(
- self.uncomp_data, length, distance)
+ extend_data(self.uncomp_data, length, distance)
class DeflateData(GenericFieldSet):
endian = LITTLE_ENDIAN
def createFields(self):
- uncomp_data = ""
+ uncomp_data = bytearray()
blk = DeflateBlock(self, "compressed_block[]", uncomp_data)
yield blk
uncomp_data = blk.uncomp_data
@@ -326,11 +325,11 @@ class ZlibData(Parser):
yield textHandler(UInt32(self, "data_checksum", "ADLER32 checksum of compressed data"), hexadecimal)
-def zlib_inflate(stream, wbits=None, prevdata=""):
+def zlib_inflate(stream, wbits=None):
if wbits is None or wbits >= 0:
return ZlibData(stream)["data"].uncompressed_data
else:
data = DeflateData(None, "root", stream, "", stream.askSize(None))
- for unused in data:
+ for _ in data:
pass
return data.uncompressed_data
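Across `zlib.py` and `lzx.py`, decompression buffers switch from `str` to `bytearray`, and `extend_data()` now mutates the buffer in place instead of returning a new copy, avoiding quadratic copying. Its LZ77 back-reference semantics in isolation, with `align_value` standing in for `hachoir.core.tools.alignValue`:

```python
def align_value(value, align):
    # Round value up to the next multiple of align
    return value if value % align == 0 else value + align - value % align

def extend_data(data: bytearray, length, offset):
    # Append `length` bytes copied from `offset` bytes back.
    if length >= offset:
        # Overlapping match: the last `offset` bytes repeat
        new_data = data[-offset:] * (align_value(length, offset) // offset)
        data += new_data[:length]
    else:
        data += data[-offset:-offset + length]

buf = bytearray(b"ab")
extend_data(buf, 6, 2)
# buf == bytearray(b"abababab")
```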
diff --git a/hachoir/parser/audio/id3.py b/hachoir/parser/audio/id3.py
index 99195935..e6f11312 100644
--- a/hachoir/parser/audio/id3.py
+++ b/hachoir/parser/audio/id3.py
@@ -451,7 +451,7 @@ class ID3_Chunk(FieldSet):
if size:
cls = None
- if not(is_compressed):
+ if not is_compressed:
tag = self["tag"].value
if tag in ID3_Chunk.handler:
cls = ID3_Chunk.handler[tag]
diff --git a/hachoir/parser/audio/itunesdb.py b/hachoir/parser/audio/itunesdb.py
index 92871e8e..095679dc 100644
--- a/hachoir/parser/audio/itunesdb.py
+++ b/hachoir/parser/audio/itunesdb.py
@@ -128,7 +128,7 @@ class DataObject(FieldSet):
yield padding
for i in range(self["entry_count"].value):
yield UInt32(self, "index[" + str(i) + "]", "Index of the " + str(i) + "nth mhit")
- elif(self["type"].value < 15) or (self["type"].value > 17) or (self["type"].value >= 200):
+ elif (self["type"].value < 15) or (self["type"].value > 17) or (self["type"].value >= 200):
yield UInt32(self, "unknown[]")
yield UInt32(self, "unknown[]")
yield UInt32(self, "position", "Position")
diff --git a/hachoir/parser/audio/midi.py b/hachoir/parser/audio/midi.py
index 03f93ec6..b9ed1338 100644
--- a/hachoir/parser/audio/midi.py
+++ b/hachoir/parser/audio/midi.py
@@ -29,7 +29,7 @@ class Integer(Bits):
while True:
bits = stream.readBits(addr, 8, parent.endian)
value = (value << 7) + (bits & 127)
- if not(bits & 128):
+ if not (bits & 128):
break
addr += 8
self._size += 8
diff --git a/hachoir/parser/file_system/ext2.py b/hachoir/parser/file_system/ext2.py
index 3a1f973a..7f0e5443 100644
--- a/hachoir/parser/file_system/ext2.py
+++ b/hachoir/parser/file_system/ext2.py
@@ -747,7 +747,7 @@ class EXT2_FS(HachoirParser, RootSeekableFieldSet):
def validate(self):
if self.stream.readBytes((1024 + 56) * 8, 2) != b"\x53\xEF":
return "Invalid magic number"
- if not(0 <= self["superblock/log_block_size"].value <= 2):
+ if not (0 <= self["superblock/log_block_size"].value <= 2):
return "Invalid (log) block size"
if self["superblock/inode_size"].value not in (0, 128):
return "Unsupported inode size"
diff --git a/hachoir/parser/guess.py b/hachoir/parser/guess.py
index 0ec323f6..97536088 100644
--- a/hachoir/parser/guess.py
+++ b/hachoir/parser/guess.py
@@ -127,10 +127,14 @@ def createParser(filename, real_filename=None, tags=None):
Create a parser from a file or returns None on error.
Options:
- - filename (unicode): Input file name ;
- - real_filename (str|unicode): Real file name.
+ - file (str|io.IOBase): Input file name or
+ a byte io.IOBase stream ;
+ - real_filename (str): Real file name.
"""
if not tags:
tags = []
stream = FileInputStream(filename, real_filename, tags=tags)
- return guessParser(stream)
+ guess = guessParser(stream)
+ if guess is None:
+ stream.close()
+ return guess
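`createParser()` previously leaked the opened stream when no parser matched; it now closes the stream before returning None. The resource-handling pattern in isolation (hypothetical names, not the hachoir API):

```python
import io

def create_parser(open_stream, guess_parser):
    # Sketch of the 3.2.0 createParser() fix: when guessing fails,
    # close the stream instead of leaking the file handle.
    stream = open_stream()
    guess = guess_parser(stream)
    if guess is None:
        stream.close()
    return guess

stream = io.BytesIO(b"not a known format")
result = create_parser(lambda: stream, lambda s: None)
# result is None and stream.closed is True: nothing leaked
```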
diff --git a/hachoir/parser/image/gif.py b/hachoir/parser/image/gif.py
index e97a7fbc..a33b1283 100644
--- a/hachoir/parser/image/gif.py
+++ b/hachoir/parser/image/gif.py
@@ -27,7 +27,7 @@ MAX_HEIGHT = MAX_WIDTH
MAX_FILE_SIZE = 100 * 1024 * 1024
-def rle_repr(l):
+def rle_repr(chain):
"""Run-length encode a list into an "eval"-able form
Example:
@@ -46,7 +46,7 @@ def rle_repr(l):
result[-1] = '[%s, %s]' % (result[-1][1:-1], previous)
else:
result.append('[%s]' % previous)
- iterable = iter(l)
+ iterable = iter(chain)
runlen = 1
result = []
try:
diff --git a/hachoir/parser/image/jpeg.py b/hachoir/parser/image/jpeg.py
index 64d50c10..419d6f48 100644
--- a/hachoir/parser/image/jpeg.py
+++ b/hachoir/parser/image/jpeg.py
@@ -205,7 +205,7 @@ class SOSComponent(FieldSet):
def createFields(self):
comp_id = UInt8(self, "component_id")
yield comp_id
- if not(1 <= comp_id.value <= self["../nr_components"].value):
+ if not (1 <= comp_id.value <= self["../nr_components"].value):
raise ParserError("JPEG error: Invalid component-id")
yield Bits(self, "dc_coding_table", 4, "DC entropy coding table destination selector")
yield Bits(self, "ac_coding_table", 4, "AC entropy coding table destination selector")
@@ -387,7 +387,10 @@ class JpegImageData(FieldSet):
end = self.stream.searchBytes(b"\xff", start, MAX_FILESIZE * 8)
if end is None:
# this is a bad sign, since it means there is no terminator
- # we ignore this; it likely means a truncated image
+ # this likely means a truncated image:
+ # set the size to the remaining length of the stream
+ # to avoid being forced to parse subfields to calculate size
+ self._size = self.stream._size - self.absolute_address
break
if self.stream.readBytes(end, 2) == b'\xff\x00':
# padding: false alarm
diff --git a/hachoir/parser/image/wmf.py b/hachoir/parser/image/wmf.py
index 7541c1c0..2a60951a 100644
--- a/hachoir/parser/image/wmf.py
+++ b/hachoir/parser/image/wmf.py
@@ -597,7 +597,7 @@ class WMF_File(Parser):
yield UInt32(self, "max_record_size", "The size of largest record in 16-bit words")
yield UInt16(self, "nb_params", "Not Used (always 0)")
- while not(self.eof):
+ while not self.eof:
yield Function(self, "func[]")
def isEMF(self):
diff --git a/hachoir/parser/misc/__init__.py b/hachoir/parser/misc/__init__.py
index ccb72fb2..208ffe06 100644
--- a/hachoir/parser/misc/__init__.py
+++ b/hachoir/parser/misc/__init__.py
@@ -16,3 +16,4 @@ from hachoir.parser.misc.word_doc import WordDocumentParser # noqa
from hachoir.parser.misc.word_2 import Word2DocumentParser # noqa
from hachoir.parser.misc.mstask import MSTaskFile # noqa
from hachoir.parser.misc.mapsforge_map import MapsforgeMapFile # noqa
+from hachoir.parser.misc.fit import FITFile # noqa
diff --git a/hachoir/parser/misc/fit.py b/hachoir/parser/misc/fit.py
new file mode 100644
index 00000000..8e51f877
--- /dev/null
+++ b/hachoir/parser/misc/fit.py
@@ -0,0 +1,173 @@
+"""
+Garmin fit file Format parser.
+
+Author: Sebastien Ponce <sebastien.ponce@cern.ch>
+"""
+
+from hachoir.parser import Parser
+from hachoir.field import FieldSet, Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, RawBytes, Bit, Bits, Bytes, String, Float32, Float64
+from hachoir.core.endian import BIG_ENDIAN, LITTLE_ENDIAN
+
+field_types = {
+ 0: UInt8, # enum
+ 1: Int8, # signed int of 8 bits
+ 2: UInt8, # unsigned int of 8 bits
+ 131: Int16, # signed int of 16 bits
+ 132: UInt16, # unsigned int of 16 bits
+ 133: Int32, # signed int of 32 bits
+ 134: UInt32, # unsigned int of 32 bits
+ 7: String, # string
+ 136: Float32, # float
+ 137: Float64, # double
+ 10: UInt8, # unsigned int of 8 bits with 0 as invalid value
+ 139: UInt16, # unsigned int of 16 bits with 0 as invalid value
+ 140: UInt32, # unsigned int of 32 bits with 0 as invalid value
+ 13: Bytes, # bytes
+ 142: Int64, # signed int of 64 bits
+ 143: UInt64, # unsigned int of 64 bits
+ 144: UInt64 # unsigned int of 64 bits with 0 as invalid value
+}
+
+
+class Header(FieldSet):
+ endian = LITTLE_ENDIAN
+
+ def createFields(self):
+ yield UInt8(self, "size", "Header size")
+ yield UInt8(self, "protocol", "Protocol version")
+ yield UInt16(self, "profile", "Profile version")
+ yield UInt32(self, "datasize", "Data size")
+ yield RawBytes(self, "datatype", 4)
+ yield UInt16(self, "crc", "CRC of first 11 bytes or 0x0")
+
+ def createDescription(self):
+ return "Header of fit file. Data size is %d" % (self["datasize"].value)
+
+
+class NormalRecordHeader(FieldSet):
+
+ def createFields(self):
+ yield Bit(self, "normal", "Normal header (0)")
+ yield Bit(self, "type", "Message type (0 data, 1 definition")
+ yield Bit(self, "typespecific", "0")
+ yield Bit(self, "reserved", "0")
+ yield Bits(self, "msgType", 4, description="Message type")
+
+ def createDescription(self):
+ return "Record header, this is a %s message" % ("definition" if self["type"].value else "data")
+
+
+class FieldDefinition(FieldSet):
+
+ def createFields(self):
+ yield UInt8(self, "number", "Field definition number")
+ yield UInt8(self, "size", "Size in bytes")
+ yield UInt8(self, "type", "Base type")
+
+ def createDescription(self):
+ return "Field Definition. Number %d, Size %d" % (self["number"].value, self["size"].value)
+
+
+class DefinitionMessage(FieldSet):
+
+ def createFields(self):
+ yield NormalRecordHeader(self, "RecordHeader")
+ yield UInt8(self, "reserved", "Reserved (0)")
+ yield UInt8(self, "architecture", "Architecture (0 little, 1 big endian")
+ self.endian = BIG_ENDIAN if self["architecture"].value else LITTLE_ENDIAN
+ yield UInt16(self, "msgNumber", "Message Number")
+ yield UInt8(self, "nbFields", "Number of fields")
+ for n in range(self["nbFields"].value):
+ yield FieldDefinition(self, "fieldDefinition[]")
+
+ def createDescription(self):
+ return "Definition Message. Contains %d fields" % (self["nbFields"].value)
+
+
+class DataMessage(FieldSet):
+
+ def createFields(self):
+ hdr = NormalRecordHeader(self, "RecordHeader")
+ yield hdr
+ msgType = self["RecordHeader"]["msgType"].value
+ msgDef = self.parent.msgDefs[msgType]
+ for n in range(msgDef["nbFields"].value):
+ desc = msgDef["fieldDefinition[%d]" % n]
+ typ = field_types[desc["type"].value]
+ self.endian = BIG_ENDIAN if msgDef["architecture"].value else LITTLE_ENDIAN
+ if typ == String or typ == Bytes:
+ yield typ(self, "field%d" % n, desc["size"].value)
+ else:
+ if typ.static_size // 8 == desc["size"].value:
+ yield typ(self, "field%d" % n, desc["size"].value)
+ else:
+ for p in range(desc["size"].value * 8 // typ.static_size):
+ yield typ(self, "field%d[]" % n)
+
+ def createDescription(self):
+ return "Data Message"
+
+
+class TimeStamp(FieldSet):
+
+ def createFields(self):
+ yield Bit(self, "timestamp", "TimeStamp (1)")
+ yield Bits(self, "msgType", 3, description="Message type")
+ yield Bits(self, "time", 4, description="TimeOffset")
+
+ def createDescription(self):
+ return "TimeStamp"
+
+
+class CRC(FieldSet):
+
+ def createFields(self):
+ yield UInt16(self, "crc", "CRC")
+
+ def createDescription(self):
+ return "CRC"
+
+
+class FITFile(Parser):
+ endian = BIG_ENDIAN
+ PARSER_TAGS = {
+ "id": "fit",
+ "category": "misc",
+ "file_ext": ("fit",),
+ "mime": ("application/fit",),
+ "min_size": 14 * 8,
+ "description": "Garmin binary fit format"
+ }
+
+ def __init__(self, *args, **kwargs):
+ Parser.__init__(self, *args, **kwargs)
+ self.msgDefs = {}
+
+ def validate(self):
+ s = self.stream.readBytes(0, 12)
+ if s[8:12] != b'.FIT':
+ return "Invalid header %d %d %d %d" % tuple([int(b) for b in s[8:12]])
+ return True
+
+ def createFields(self):
+ yield Header(self, "header")
+ while self.current_size < self["header"]["datasize"].value * 8:
+ b = self.stream.readBits(self.absolute_address + self.current_size, 2, self.endian)
+ if b == 1:
+ defMsg = DefinitionMessage(self, "definition[]")
+ msgType = defMsg["RecordHeader"]["msgType"].value
+ self.msgDefs[msgType] = defMsg
+ yield defMsg
+ elif b == 0:
+ yield DataMessage(self, "data[]")
+ else:
+ yield TimeStamp(self, "timestamp[]")
+ yield CRC(self, "crc")
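The two-bit peek in `createFields` is what separates the three FIT record kinds (definition message, data message, compressed timestamp). A standalone sketch of that decision on a raw header byte; the function name and sample bytes are illustrative, not part of the parser:

```python
def classify_fit_record(header_byte):
    """Classify a FIT record by the top two bits of its header byte,
    mirroring the 2-bit big-endian peek in FITFile.createFields."""
    top2 = header_byte >> 6
    if top2 == 0b01:      # normal header with the definition bit set
        return "definition"
    if top2 == 0b00:      # normal header, data message
        return "data"
    return "timestamp"    # high bit set: compressed-timestamp header
```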
diff --git a/hachoir/parser/misc/mapsforge_map.py b/hachoir/parser/misc/mapsforge_map.py
index 906e95c1..a393942b 100644
--- a/hachoir/parser/misc/mapsforge_map.py
+++ b/hachoir/parser/misc/mapsforge_map.py
@@ -41,7 +41,7 @@ class UIntVbe(Field):
size += 1
assert size < 100, "UIntVBE is too large"
- if not(haveMoreData):
+ if not haveMoreData:
break
self._size = size * 8
@@ -71,7 +71,7 @@ class IntVbe(Field):
size += 1
assert size < 100, "IntVBE is too large"
- if not(haveMoreData):
+ if not haveMoreData:
break
if isNegative:
@@ -142,7 +142,7 @@ class TileHeader(FieldSet):
def createFields(self):
numLevels = int(self.zoomIntervalCfg[
"max_zoom_level"].value - self.zoomIntervalCfg["min_zoom_level"].value) + 1
- assert(numLevels < 50)
+ assert (numLevels < 50)
for i in range(numLevels):
yield TileZoomTable(self, "zoom_table_entry[]")
yield UIntVbe(self, "first_way_offset")
diff --git a/hachoir/parser/misc/ole2.py b/hachoir/parser/misc/ole2.py
index 74b2168e..bfc1f7d8 100644
--- a/hachoir/parser/misc/ole2.py
+++ b/hachoir/parser/misc/ole2.py
@@ -211,7 +211,7 @@ class OLE2_File(HachoirParser, RootSeekableFieldSet):
return "Unknown major version (%s)" % self["header/ver_maj"].value
if self["header/endian"].value not in (b"\xFF\xFE", b"\xFE\xFF"):
return "Unknown endian (%s)" % self["header/endian"].raw_display
- if not(MIN_BIG_BLOCK_LOG2 <= self["header/bb_shift"].value <= MAX_BIG_BLOCK_LOG2):
+ if not (MIN_BIG_BLOCK_LOG2 <= self["header/bb_shift"].value <= MAX_BIG_BLOCK_LOG2):
return "Invalid (log 2 of) big block size (%s)" % self["header/bb_shift"].value
if self["header/bb_shift"].value < self["header/sb_shift"].value:
return "Small block size (log2=%s) is bigger than big block size (log2=%s)!" \
diff --git a/hachoir/parser/misc/pdf.py b/hachoir/parser/misc/pdf.py
index dc934bfe..2fccc6a1 100644
--- a/hachoir/parser/misc/pdf.py
+++ b/hachoir/parser/misc/pdf.py
@@ -392,7 +392,7 @@ class CrossReferenceTable(FieldSet):
FieldSet.__init__(self, parent, name, description=desc)
pos = self.stream.searchBytesLength(Trailer.MAGIC, False)
if pos is None:
- raise ParserError("Can't find '%s' starting at %u"
+ raise ParserError("Can't find '%s' starting at %u" %
(Trailer.MAGIC, self.absolute_address // 8))
self._size = 8 * pos - self.absolute_address
diff --git a/hachoir/parser/misc/ttf.py b/hachoir/parser/misc/ttf.py
index ac374658..ca5e7c49 100644
--- a/hachoir/parser/misc/ttf.py
+++ b/hachoir/parser/misc/ttf.py
@@ -2,6 +2,8 @@
TrueType Font parser.
Documents:
+ - "The OpenType Specification"
+ https://docs.microsoft.com/en-us/typography/opentype/spec/
- "An Introduction to TrueType Fonts: A look inside the TTF format"
written by "NRSI: Computers & Writing Systems"
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IWS-Chapter08
@@ -11,11 +13,26 @@ Creation date: 2007-02-08
"""
from hachoir.parser import Parser
-from hachoir.field import (FieldSet, ParserError,
- UInt16, UInt32, Bit, Bits,
- PaddingBits, NullBytes,
- String, RawBytes, Bytes, Enum,
- TimestampMac32)
+from hachoir.field import (
+ FieldSet,
+ ParserError,
+ UInt8,
+ UInt16,
+ UInt24,
+ UInt32,
+ Int16,
+ Bit,
+ Bits,
+ PaddingBits,
+ NullBytes,
+ String,
+ RawBytes,
+ Bytes,
+ Enum,
+ TimestampMac32,
+ GenericVector,
+ PascalString8,
+)
from hachoir.core.endian import BIG_ENDIAN
from hachoir.core.text_handler import textHandler, hexadecimal, filesizeHandler
@@ -69,11 +86,65 @@ CHARSET_MAP = {
3: {1: "UTF-16-BE"},
}
+PERMISSIONS = {
+ 0: "Installable embedding",
+ 2: "Restricted License embedding",
+ 4: "Preview & Print embedding",
+ 8: "Editable embedding",
+}
-class TableHeader(FieldSet):
+FWORD = Int16
+UFWORD = UInt16
+
+
+class Tag(String):
+ def __init__(self, parent, name, description=None):
+ String.__init__(self, parent, name, 4, description)
+
+
+class Version16Dot16(FieldSet):
+ static_size = 32
def createFields(self):
- yield String(self, "tag", 4)
+ yield UInt16(self, "major")
+ yield UInt16(self, "minor")
+
+ def createValue(self):
+ return float("%u.%x" % (self["major"].value, self["minor"].value))
+
+
+class Fixed(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "int_part")
+ yield UInt16(self, "float_part")
+
+ def createValue(self):
+ return self["int_part"].value + float(self["float_part"].value) / 65536
+
+
+class Tuple(FieldSet):
+ def __init__(self, parent, name, axisCount):
+ super().__init__(parent, name, description="Tuple Record")
+ self.axisCount = axisCount
+
+ def createFields(self):
+ for _ in range(self.axisCount):
+ yield (Fixed(self, "coordinate[]"))
+
+
+class F2DOT14(FieldSet):
+ static_size = 16
+
+ def createFields(self):
+ yield Int16(self, "int_part")
+
+ def createValue(self):
+ return self["int_part"].value / 16384
+
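Version16Dot16, Fixed, and F2DOT14 are three different fixed-point encodings, and their `createValue` conversions are easy to confuse. The same arithmetic restated on plain integers (sample values are illustrative):

```python
def version16dot16(major, minor):
    # minor is interpreted as hex digits, e.g. 0x5000 -> ".5000"
    return float("%u.%x" % (major, minor))

def fixed_16_16(int_part, frac_part):
    # 16.16 fixed point: fraction in units of 1/65536
    return int_part + frac_part / 65536

def f2dot14(raw):
    # signed 2.14 fixed point: units of 1/16384, range [-2.0, 2.0)
    return raw / 16384
```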
+
+class TableHeader(FieldSet):
+ def createFields(self):
+ yield Tag(self, "tag")
yield textHandler(UInt32(self, "checksum"), hexadecimal)
yield UInt32(self, "offset")
yield filesizeHandler(UInt32(self, "size"))
@@ -83,7 +154,6 @@ class TableHeader(FieldSet):
class NameHeader(FieldSet):
-
def createFields(self):
yield Enum(UInt16(self, "platformID"), PLATFORM_NAME)
yield UInt16(self, "encodingID")
@@ -135,7 +205,7 @@ def parseFontHeader(self):
yield Bits(self, "adobe", 2, "(used by Adobe)")
yield UInt16(self, "unit_per_em", "Units per em")
- if not(16 <= self["unit_per_em"].value <= 16384):
+ if not (16 <= self["unit_per_em"].value <= 16384):
raise ParserError("TTF: Invalid unit/em value")
yield UInt32(self, "created_high")
yield TimestampMac32(self, "created")
@@ -162,17 +232,273 @@ def parseFontHeader(self):
yield UInt16(self, "glyph_format", "(=0)")
+class AxisValueMap(FieldSet):
+ static_size = 32
+
+ def createFields(self):
+ yield F2DOT14(self, "fromCoordinate")
+ yield F2DOT14(self, "toCoordinate")
+
+
+class SegmentMaps(FieldSet):
+ def createFields(self):
+ yield UInt16(
+ self, "positionMapCount", "The number of correspondence pairs for this axis"
+ )
+ for _ in range(self["positionMapCount"].value):
+ yield (AxisValueMap(self, "axisValueMaps[]"))
+
+
+def parseAvar(self):
+ yield UInt16(self, "majorVersion", "Major version")
+ yield UInt16(self, "minorVersion", "Minor version")
+ yield PaddingBits(self, "reserved[]", 16)
+ yield UInt16(self, "axisCount", "The number of variation axes for this font")
+ for _ in range(self["axisCount"].value):
+ yield (SegmentMaps(self, "segmentMaps[]"))
+
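Each SegmentMaps entry defines a piecewise-linear remapping of one normalized axis coordinate, with the endpoints decoded from F2DOT14. A sketch of how a consumer would apply such a map (the pair values below are illustrative):

```python
def apply_avar_segment(coord, pairs):
    """Linearly interpolate coord through sorted (from, to) pairs,
    as an avar segment map remaps one normalized axis coordinate."""
    pairs = sorted(pairs)
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x0 <= coord <= x1:
            if x1 == x0:
                return y0
            t = (coord - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return coord  # outside the mapped range: unchanged
```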
+
+class VariationAxisRecord(FieldSet):
+ def createFields(self):
+ yield Tag(self, "axisTag", "Tag identifying the design variation for the axis")
+ yield Fixed(self, "minValue", "The minimum coordinate value for the axis")
+ yield Fixed(self, "defaultValue", "The default coordinate value for the axis")
+ yield Fixed(self, "maxValue", "The maximum coordinate value for the axis")
+ yield PaddingBits(self, "reservedFlags", 15)
+ yield Bit(
+ self, "hidden", "The axis should not be exposed directly in user interfaces"
+ )
+ yield UInt16(
+ self,
+ "axisNameID",
+ "The name ID for entries in the 'name' table that provide a display name for this axis",
+ )
+
+
+class InstanceRecord(FieldSet):
+ def __init__(self, parent, name, axisCount, hasPSNameID=False):
+ super().__init__(parent, name, description="Instance record")
+ self.axisCount = axisCount
+ self.hasPSNameID = hasPSNameID
+
+ def createFields(self):
+ yield UInt16(
+ self, "subfamilyNameID", "Name ID for subfamily names for this instance"
+ )
+ yield PaddingBits(self, "reservedFlags", 16)
+ yield Tuple(self, "coordinates", axisCount=self.axisCount)
+ if self.hasPSNameID:
+ yield UInt16(
+ self,
+ "postScriptNameID",
+ "Name ID for PostScript names for this instance",
+ )
+
+
+def parseFvar(self):
+ yield UInt16(self, "majorVersion", "Major version")
+ yield UInt16(self, "minorVersion", "Minor version")
+ yield UInt16(
+ self, "axisArrayOffset", "Offset to the start of the VariationAxisRecord array."
+ )
+ yield PaddingBits(self, "reserved[]", 16)
+ yield UInt16(self, "axisCount", "The number of variation axes for this font")
+ yield UInt16(self, "axisSize", "The size in bytes of each VariationAxisRecord")
+ yield UInt16(self, "instanceCount", "The number of named instances for this font")
+ yield UInt16(self, "instanceSize", "The size in bytes of each InstanceRecord")
+ if self["axisArrayOffset"].value > 16:
+ yield PaddingBits(self, "padding", 8 * (self["axisArrayOffset"].value - 16))
+ for _ in range(self["axisCount"].value):
+ yield (VariationAxisRecord(self, "axes[]"))
+ for _ in range(self["instanceCount"].value):
+ yield (
+ InstanceRecord(
+ self,
+ "instances[]",
+ axisCount=self["axisCount"].value,
+ hasPSNameID=(
+                    self["instanceSize"].value == (4 * self["axisCount"].value + 6)
+ ),
+ )
+ )
+
+
+class EncodingRecord(FieldSet):
+ static_size = 64
+
+ def createFields(self):
+ yield Enum(UInt16(self, "platformID"), PLATFORM_NAME)
+ yield UInt16(self, "encodingID")
+ self.offset = UInt32(self, "subtableOffset")
+ yield self.offset
+
+
+class CmapTable0(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "format", "Table format")
+ yield UInt16(self, "length", "Length in bytes")
+ yield UInt16(self, "language", "Language ID")
+ yield GenericVector(self, "mapping", 256, UInt8)
+
+
+class CmapTable4(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "format", "Table format")
+ yield UInt16(self, "length", "Length in bytes")
+ yield UInt16(self, "language", "Language ID")
+ yield UInt16(self, "segCountX2", "Twice the number of segments")
+ segments = self["segCountX2"].value // 2
+ yield UInt16(self, "searchRange")
+ yield UInt16(self, "entrySelector")
+ yield UInt16(self, "rangeShift")
+ yield GenericVector(self, "endCode", segments, UInt16)
+ yield PaddingBits(self, "reserved[]", 16)
+ yield GenericVector(self, "startCode", segments, UInt16)
+ yield GenericVector(self, "idDelta", segments, Int16)
+ yield GenericVector(self, "idRangeOffsets", segments, UInt16)
+        remainder = (self["length"].value - (self.current_size // 8)) // 2
+        if remainder:
+            yield GenericVector(self, "glyphIdArray", remainder, UInt16)
+
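The four parallel arrays read by CmapTable4 (endCode, startCode, idDelta, idRangeOffsets) drive the standard format-4 character lookup. A minimal sketch over plain lists, covering only the common idRangeOffset == 0 case (the segment values below are synthetic):

```python
def cmap4_lookup(char_code, end_codes, start_codes, id_deltas):
    """Format-4 cmap lookup, idRangeOffset == 0 case only."""
    for end, start, delta in zip(end_codes, start_codes, id_deltas):
        if char_code <= end:
            if char_code < start:
                return 0                      # .notdef
            return (char_code + delta) & 0xFFFF
    return 0
```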
+
+class CmapTable6(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "format", "Table format")
+ yield UInt16(self, "length", "Length in bytes")
+ yield UInt16(self, "language", "Language ID")
+ yield UInt16(self, "firstCode", "First character code of subrange")
+ yield UInt16(self, "entryCount", "Number of character codes in subrange")
+ yield GenericVector(self, "glyphIdArray", self["entryCount"].value, UInt16)
+
+
+class SequentialMapGroup(FieldSet):
+ def createFields(self):
+ yield UInt32(self, "startCharCode", "First character code in this group")
+        yield UInt32(self, "endCharCode", "Last character code in this group")
+ yield UInt32(
+ self,
+ "startGlyphID",
+ "Glyph index corresponding to the starting character code",
+ )
+
+
+class CmapTable12(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "format", "Table format")
+ yield PaddingBits(self, "reserved[]", 16)
+ yield UInt32(self, "length", "Length in bytes")
+ yield UInt32(self, "language", "Language ID")
+ yield UInt32(self, "numGroups", "Number of groupings which follow")
+ for i in range(self["numGroups"].value):
+ yield SequentialMapGroup(self, "mapgroup[]")
+
+
+class VariationSelector(FieldSet):
+ def createFields(self):
+ yield UInt24(self, "varSelector", "Variation selector")
+ yield UInt32(self, "defaultUVSOffset", "Offset to default UVS table")
+ yield UInt32(self, "nonDefaultUVSOffset", "Offset to non-default UVS table")
+
+
+class CmapTable14(FieldSet):
+ def createFields(self):
+ yield UInt16(self, "format", "Table format")
+ yield UInt32(self, "length", "Length in bytes")
+ yield UInt32(
+ self, "numVarSelectorRecords", "Number of variation selector records"
+ )
+ for i in range(self["numVarSelectorRecords"].value):
+ yield VariationSelector(self, "variationSelector[]")
+
+
+def parseCmap(self):
+ yield UInt16(self, "version")
+ numTables = UInt16(self, "numTables", "Number of encoding tables")
+ yield numTables
+ encodingRecords = []
+ for index in range(numTables.value):
+ entry = EncodingRecord(self, "encodingRecords[]")
+ yield entry
+ encodingRecords.append(entry)
+ encodingRecords.sort(key=lambda field: field["subtableOffset"].value)
+ last = None
+ for er in encodingRecords:
+ offset = er["subtableOffset"].value
+ if last and last == offset:
+ continue
+ last = offset
+
+ # Add padding if any
+ padding = self.seekByte(offset, relative=True, null=False)
+ if padding:
+ yield padding
+ format = UInt16(self, "format").value
+ if format == 0:
+ yield CmapTable0(self, "cmap table format 0")
+ elif format == 4:
+ yield CmapTable4(self, "cmap table format 4")
+ elif format == 6:
+ yield CmapTable6(self, "cmap table format 6")
+ elif format == 12:
+ yield CmapTable12(self, "cmap table format 12")
+ elif format == 14:
+ yield CmapTable14(self, "cmap table format 14")
+
+
+class SignatureRecord(FieldSet):
+ def createFields(self):
+        yield UInt32(self, "format", "Table format")
+        yield UInt32(self, "length", "Length of signature block")
+        yield UInt32(self, "signatureBlockOffset", "Offset to signature block")
+
+
+class SignatureBlock(FieldSet):
+ def createFields(self):
+ yield PaddingBits(self, "reserved[]", 32)
+ yield UInt32(
+ self,
+ "length",
+ "Length (in bytes) of the PKCS#7 packet in the signature field",
+ )
+ yield String(self, "signature", self["length"].value, "Signature block")
+
+
+def parseDSIG(self):
+ yield UInt32(self, "version")
+ yield UInt16(self, "numSignatures", "Number of signatures in the table")
+    yield PaddingBits(self, "reserved[]", 15)
+    yield Bit(self, "flag", "Cannot be resigned")
+ entries = []
+ for i in range(self["numSignatures"].value):
+ record = SignatureRecord(self, "signatureRecords[]")
+ entries.append(record)
+ yield record
+ entries.sort(key=lambda field: field["signatureBlockOffset"].value)
+ last = None
+ for entry in entries:
+ offset = entry["signatureBlockOffset"].value
+ if last and last == offset:
+ continue
+ last = offset
+ # Add padding if any
+ padding = self.seekByte(offset, relative=True, null=False)
+ if padding:
+ yield padding
+
+ padding = (self.size - self.current_size) // 8
+ if padding:
+ yield NullBytes(self, "padding_end", padding)
+
+
def parseNames(self):
# Read header
yield UInt16(self, "format")
if self["format"].value != 0:
- raise ParserError("TTF (names): Invalid format (%u)" %
- self["format"].value)
+ raise ParserError("TTF (names): Invalid format (%u)" % self["format"].value)
yield UInt16(self, "count")
yield UInt16(self, "offset")
if MAX_NAME_COUNT < self["count"].value:
- raise ParserError("Invalid number of names (%s)"
- % self["count"].value)
+ raise ParserError("Invalid number of names (%s)" % self["count"].value)
# Read name index
entries = []
@@ -208,17 +534,210 @@ def parseNames(self):
# Read value
size = entry["length"].value
if size:
- yield String(self, "value[]", size, entry.description, charset=entry.getCharset())
+ yield String(
+ self, "value[]", size, entry.description, charset=entry.getCharset()
+ )
padding = (self.size - self.current_size) // 8
if padding:
yield NullBytes(self, "padding_end", padding)
+def parseMaxp(self):
+ # Read header
+ yield Version16Dot16(self, "format", "format version")
+ yield UInt16(self, "numGlyphs", "Number of glyphs")
+ if self["format"].value >= 1:
+ yield UInt16(self, "maxPoints", "Maximum points in a non-composite glyph")
+ yield UInt16(self, "maxContours", "Maximum contours in a non-composite glyph")
+ yield UInt16(self, "maxCompositePoints", "Maximum points in a composite glyph")
+ yield UInt16(
+ self, "maxCompositeContours", "Maximum contours in a composite glyph"
+ )
+ yield UInt16(self, "maxZones", "Do instructions use the twilight zone?")
+ yield UInt16(self, "maxTwilightPoints", "Maximum points used in Z0")
+ yield UInt16(self, "maxStorage", "Number of Storage Area locations")
+ yield UInt16(self, "maxFunctionDefs", "Number of function definitions")
+ yield UInt16(self, "maxInstructionDefs", "Number of instruction definitions")
+ yield UInt16(self, "maxStackElements", "Maximum stack depth")
+ yield UInt16(
+ self, "maxSizeOfInstructions", "Maximum byte count for glyph instructions"
+ )
+ yield UInt16(
+ self,
+ "maxComponentElements",
+ "Maximum number of components at glyph top level",
+ )
+ yield UInt16(self, "maxComponentDepth", "Maximum level of recursion")
+
+
+def parseHhea(self):
+ yield UInt16(self, "majorVersion", "Major version")
+ yield UInt16(self, "minorVersion", "Minor version")
+ yield FWORD(self, "ascender", "Typographic ascent")
+ yield FWORD(self, "descender", "Typographic descent")
+ yield FWORD(self, "lineGap", "Typographic linegap")
+ yield UFWORD(self, "advanceWidthMax", "Maximum advance width")
+ yield FWORD(self, "minLeftSideBearing", "Minimum left sidebearing value")
+ yield FWORD(self, "minRightSideBearing", "Minimum right sidebearing value")
+ yield FWORD(self, "xMaxExtent", "Maximum X extent")
+ yield Int16(self, "caretSlopeRise", "Caret slope rise")
+ yield Int16(self, "caretSlopeRun", "Caret slope run")
+ yield Int16(self, "caretOffset", "Caret offset")
+ yield GenericVector(self, "reserved", 4, Int16)
+ yield Int16(self, "metricDataFormat", "Metric data format")
+ yield UInt16(self, "numberOfHMetrics", "Number of horizontal metrics")
+
+
+class fsType(FieldSet):
+ def createFields(self):
+        yield PaddingBits(self, "reserved[]", 6)
+        yield Bit(
+            self,
+            "bitmap_embedding",
+            "Only bitmaps contained in the font may be embedded",
+        )
+        yield Bit(self, "no_subsetting", "Font may not be subsetted prior to embedding")
+        yield PaddingBits(self, "reserved[]", 4)
+        yield Enum(Bits(self, "usage_permissions", 4), PERMISSIONS)
+
+
+def parseOS2(self):
+ yield UInt16(self, "version", "Table version")
+ yield Int16(self, "xAvgCharWidth")
+ yield UInt16(self, "usWeightClass")
+ yield UInt16(self, "usWidthClass")
+ yield fsType(self, "fsType")
+ yield Int16(self, "ySubscriptXSize")
+ yield Int16(self, "ySubscriptYSize")
+ yield Int16(self, "ySubscriptXOffset")
+ yield Int16(self, "ySubscriptYOffset")
+ yield Int16(self, "ySuperscriptXSize")
+ yield Int16(self, "ySuperscriptYSize")
+ yield Int16(self, "ySuperscriptXOffset")
+ yield Int16(self, "ySuperscriptYOffset")
+ yield Int16(self, "yStrikeoutSize")
+ yield Int16(self, "yStrikeoutPosition")
+ yield Int16(self, "sFamilyClass")
+ yield GenericVector(self, "panose", 10, UInt8)
+ yield UInt32(self, "ulUnicodeRange1")
+ yield UInt32(self, "ulUnicodeRange2")
+ yield UInt32(self, "ulUnicodeRange3")
+ yield UInt32(self, "ulUnicodeRange4")
+ yield Tag(self, "achVendID", "Vendor ID")
+ yield UInt16(self, "fsSelection")
+ yield UInt16(self, "usFirstCharIndex")
+ yield UInt16(self, "usLastCharIndex")
+ yield Int16(self, "sTypoAscender")
+ yield Int16(self, "sTypoDescender")
+ yield Int16(self, "sTypoLineGap")
+ yield UInt16(self, "usWinAscent")
+ yield UInt16(self, "usWinDescent")
+ if self["version"].value >= 1:
+ yield UInt32(self, "ulCodePageRange1")
+ yield UInt32(self, "ulCodePageRange2")
+ if self["version"].value >= 2:
+ yield Int16(self, "sxHeight")
+ yield Int16(self, "sCapHeight")
+ yield UInt16(self, "usDefaultChar")
+ yield UInt16(self, "usBreakChar")
+ yield UInt16(self, "usMaxContext")
+ if self["version"].value >= 5:
+ yield UInt16(self, "usLowerOpticalPointSize")
+ yield UInt16(self, "usUpperOpticalPointSize")
+
+
+def parsePost(self):
+ yield Version16Dot16(self, "version", "Table version")
+ yield Fixed(
+ self,
+ "italicAngle",
+ "Italic angle in counter-clockwise degrees from the vertical.",
+ )
+ yield FWORD(self, "underlinePosition", "Top of underline to baseline")
+ yield FWORD(self, "underlineThickness", "Suggested underline thickness")
+ yield UInt32(self, "isFixedPitch", "Is the font fixed pitch?")
+ yield UInt32(self, "minMemType42", "Minimum memory usage (OpenType)")
+ yield UInt32(self, "maxMemType42", "Maximum memory usage (OpenType)")
+ yield UInt32(self, "minMemType1", "Minimum memory usage (Type 1)")
+ yield UInt32(self, "maxMemType1", "Maximum memory usage (Type 1)")
+ if self["version"].value == 2.0:
+ yield UInt16(self, "numGlyphs")
+        indices = GenericVector(
+            self,
+            "glyphNameIndex",
+            self["numGlyphs"].value,
+            UInt16,
+            description="Array of indices into the string data",
+        )
+ yield indices
+ for gid, index in enumerate(indices):
+ if index.value >= 258:
+ yield PascalString8(self, "glyphname[%i]" % gid)
+    elif self["version"].value == 2.5:
+        yield UInt16(self, "numGlyphs")
+        indices = GenericVector(
+            self,
+            "offset",
+            self["numGlyphs"].value,
+            UInt16,
+            description="Difference between glyph index and standard order of glyph",
+        )
+ yield indices
+
+
+# This is work-in-progress until I work out good ways to do random-access on offsets
+parseScriptList = (
+ parseFeatureList
+) = parseLookupList = parseFeatureVariationsTable = lambda x: None
+
+
+def parseGSUB(self):
+ yield UInt16(self, "majorVersion", "Major version")
+ yield UInt16(self, "minorVersion", "Minor version")
+ SUBTABLES = [
+ ("script list", parseScriptList),
+ ("feature list", parseFeatureList),
+ ("lookup list", parseLookupList),
+ ]
+ offsets = []
+ for description, parser in SUBTABLES:
+ name = description.title().replace(" ", "")
+ offset = UInt16(
+ self, name[0].lower() + name[1:], "Offset to %s table" % description
+ )
+ yield offset
+ offsets.append((offset.value, parser))
+    if self["minorVersion"].value == 1:
+ offset = UInt32(
+ self, "featureVariationsOffset", "Offset to feature variations table"
+ )
+ offsets.append((offset.value, parseFeatureVariationsTable))
+
+ offsets.sort(key=lambda field: field[0])
+ padding = self.seekByte(offsets[0][0], null=True)
+ if padding:
+ yield padding
+    lastOffset = offsets[0][0]
+    for offset, parser in offsets[1:]:
+        # yield parser(self)
+        yield RawBytes(self, "content[]", offset - lastOffset)
+        lastOffset = offset
+
+
class Table(FieldSet):
TAG_INFO = {
+ "DSIG": ("DSIG", "Digital Signature", parseDSIG),
+ "GSUB": ("GSUB", "Glyph Substitutions", parseGSUB),
+ "avar": ("avar", "Axis variation table", parseAvar),
+ "cmap": ("cmap", "Character to Glyph Index Mapping", parseCmap),
+ "fvar": ("fvar", "Font variations table", parseFvar),
"head": ("header", "Font header", parseFontHeader),
+ "hhea": ("hhea", "Horizontal Header", parseHhea),
+ "maxp": ("maxp", "Maximum Profile", parseMaxp),
"name": ("names", "Names", parseNames),
+ "OS/2": ("OS_2", "OS/2 and Windows Metrics", parseOS2),
+ "post": ("post", "PostScript", parsePost),
}
def __init__(self, parent, name, table, **kw):
@@ -251,10 +770,15 @@ class TrueTypeFontFile(Parser):
}
def validate(self):
- if self["maj_ver"].value != 1:
- return "Invalid major version (%u)" % self["maj_ver"].value
- if self["min_ver"].value != 0:
- return "Invalid minor version (%u)" % self["min_ver"].value
+ if self["maj_ver"].value == 1 and self["min_ver"].value == 0:
+ pass
+ elif self["maj_ver"].value == 0x4F54 and self["min_ver"].value == 0x544F:
+ pass
+ else:
+ return "Invalid version (%u.%u)" % (
+ self["maj_ver"].value,
+ self["min_ver"].value,
+ )
if not (MIN_NB_TABLE <= self["nb_table"].value <= MAX_NB_TABLE):
return "Invalid number of table (%u)" % self["nb_table"].value
return True
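The relaxed validate() above accepts two sfnt flavors: version 1.0 (TrueType outlines) and the ASCII tag "OTTO" (CFF outlines), whose four bytes read as maj_ver=0x4F54, min_ver=0x544F. The same test restated on raw bytes:

```python
import struct

def sfnt_flavor(data):
    """Classify a font by its 4-byte sfnt version field."""
    maj, minor = struct.unpack(">HH", data[:4])
    if (maj, minor) == (1, 0):
        return "truetype"
    if (maj, minor) == (0x4F54, 0x544F):  # b"OTTO"
        return "cff"
    return None
```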
diff --git a/hachoir/parser/parser.py b/hachoir/parser/parser.py
index 1ec1b5e8..a00bf76f 100644
--- a/hachoir/parser/parser.py
+++ b/hachoir/parser/parser.py
@@ -13,7 +13,7 @@ class HachoirParser(object):
"""
A parser is the root of all other fields. It create first level of fields
and have special attributes and methods:
- - tags: dictionnary with keys:
+ - tags: dictionary with keys:
- "file_ext": classical file extensions (string or tuple of strings) ;
- "mime": MIME type(s) (string or tuple of strings) ;
- "description": String describing the parser.
diff --git a/hachoir/parser/program/python.py b/hachoir/parser/program/python.py
index f2d3127c..bd2d905f 100644
--- a/hachoir/parser/program/python.py
+++ b/hachoir/parser/program/python.py
@@ -10,10 +10,12 @@ Creation: 25 march 2005
"""
from hachoir.parser import Parser
-from hachoir.field import (FieldSet, UInt8,
- UInt16, Int32, UInt32, Int64, ParserError, Float64,
- Character, RawBytes, PascalString8, TimestampUnix32,
- Bit, String)
+from hachoir.field import (
+ FieldSet, UInt8,
+ UInt16, Int32, UInt32, Int64, UInt64,
+ ParserError, Float64,
+ Character, RawBytes, PascalString8, TimestampUnix32,
+ Bit, String, NullBits)
from hachoir.core.endian import LITTLE_ENDIAN
from hachoir.core.bits import long2raw
from hachoir.core.text_handler import textHandler, hexadecimal
@@ -152,13 +154,17 @@ def parseShortASCII(parent):
def parseCode(parent):
- if 0x3000000 <= parent.root.getVersion():
+ version = parent.root.getVersion()
+ if 0x3000000 <= version:
yield UInt32(parent, "arg_count", "Argument count")
+ if 0x3080000 <= version:
+ yield UInt32(parent, "posonlyargcount", "Positional only argument count")
yield UInt32(parent, "kwonlyargcount", "Keyword only argument count")
- yield UInt32(parent, "nb_locals", "Number of local variables")
+ if version < 0x30B0000:
+ yield UInt32(parent, "nb_locals", "Number of local variables")
yield UInt32(parent, "stack_size", "Stack size")
yield UInt32(parent, "flags")
- elif 0x2030000 <= parent.root.getVersion():
+ elif 0x2030000 <= version:
yield UInt32(parent, "arg_count", "Argument count")
yield UInt32(parent, "nb_locals", "Number of local variables")
yield UInt32(parent, "stack_size", "Stack size")
@@ -168,20 +174,34 @@ def parseCode(parent):
yield UInt16(parent, "nb_locals", "Number of local variables")
yield UInt16(parent, "stack_size", "Stack size")
yield UInt16(parent, "flags")
+
yield Object(parent, "compiled_code")
yield Object(parent, "consts")
yield Object(parent, "names")
- yield Object(parent, "varnames")
- if 0x2000000 <= parent.root.getVersion():
- yield Object(parent, "freevars")
- yield Object(parent, "cellvars")
+ if 0x30B0000 <= version:
+ yield Object(parent, "co_localsplusnames")
+ yield Object(parent, "co_localspluskinds")
+ else:
+ yield Object(parent, "varnames")
+ if 0x2000000 <= version:
+ yield Object(parent, "freevars")
+ yield Object(parent, "cellvars")
+
yield Object(parent, "filename")
yield Object(parent, "name")
- if 0x2030000 <= parent.root.getVersion():
+ if 0x30B0000 <= version:
+ yield Object(parent, "qualname")
+
+ if 0x2030000 <= version:
yield UInt32(parent, "firstlineno", "First line number")
else:
yield UInt16(parent, "firstlineno", "First line number")
- yield Object(parent, "lnotab")
+ if 0x30A0000 <= version:
+ yield Object(parent, "linetable")
+ if 0x30B0000 <= version:
+ yield Object(parent, "exceptiontable")
+ else:
+ yield Object(parent, "lnotab")
class Object(FieldSet):
@@ -301,6 +321,16 @@ class BytecodeChar(Character):
static_size = 7
+PY_RELEASE_LEVEL_ALPHA = 0xA
+PY_RELEASE_LEVEL_FINAL = 0xF
+
+
+def VERSION(major, minor, release_level=PY_RELEASE_LEVEL_FINAL, serial=0):
+ micro = 0
+ return ((major << 24) + (minor << 16) + (micro << 8)
+ + (release_level << 4) + (serial << 0))
+
+
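VERSION packs a release into the same hexversion layout used by the comparisons in parseCode (major, minor, micro, release level, serial). Restated standalone with a couple of worked values:

```python
PY_RELEASE_LEVEL_ALPHA = 0xA
PY_RELEASE_LEVEL_FINAL = 0xF

def VERSION(major, minor, release_level=PY_RELEASE_LEVEL_FINAL, serial=0):
    # hexversion layout: 0xMMmmuuRS (major, minor, micro, release, serial)
    micro = 0
    return ((major << 24) + (minor << 16) + (micro << 8)
            + (release_level << 4) + (serial << 0))
```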
class PythonCompiledFile(Parser):
PARSER_TAGS = {
"id": "python",
@@ -394,7 +424,90 @@ class PythonCompiledFile(Parser):
3377: ("Python 3.6b1 ", 0x3060000),
3378: ("Python 3.6b2 ", 0x3060000),
3379: ("Python 3.6rc1", 0x3060000),
- 3390: ("Python 3.7a0 ", 0x3070000),
+ 3390: ("Python 3.7a1", 0x30700A1),
+ 3391: ("Python 3.7a2", 0x30700A2),
+ 3392: ("Python 3.7a4", 0x30700A4),
+ 3393: ("Python 3.7b1", 0x30700B1),
+ 3394: ("Python 3.7b5", 0x30700B5),
+ 3400: ("Python 3.8a1", VERSION(3, 8)),
+ 3401: ("Python 3.8a1", VERSION(3, 8)),
+ 3410: ("Python 3.8a1", VERSION(3, 8)),
+ 3411: ("Python 3.8b2", VERSION(3, 8)),
+ 3412: ("Python 3.8b2", VERSION(3, 8)),
+ 3413: ("Python 3.8b4", VERSION(3, 8)),
+ 3420: ("Python 3.9a0", VERSION(3, 9)),
+ 3421: ("Python 3.9a0", VERSION(3, 9)),
+ 3422: ("Python 3.9a0", VERSION(3, 9)),
+ 3423: ("Python 3.9a2", VERSION(3, 9)),
+ 3424: ("Python 3.9a2", VERSION(3, 9)),
+ 3425: ("Python 3.9a2", VERSION(3, 9)),
+ 3430: ("Python 3.10a1", VERSION(3, 10)),
+ 3431: ("Python 3.10a1", VERSION(3, 10)),
+ 3432: ("Python 3.10a2", VERSION(3, 10)),
+ 3433: ("Python 3.10a2", VERSION(3, 10)),
+ 3434: ("Python 3.10a6", VERSION(3, 10)),
+ 3435: ("Python 3.10a7", VERSION(3, 10)),
+ 3436: ("Python 3.10b1", VERSION(3, 10)),
+ 3437: ("Python 3.10b1", VERSION(3, 10)),
+ 3438: ("Python 3.10b1", VERSION(3, 10)),
+ 3439: ("Python 3.10b1", VERSION(3, 10)),
+ 3450: ("Python 3.11a1", VERSION(3, 11)),
+ 3451: ("Python 3.11a1", VERSION(3, 11)),
+ 3452: ("Python 3.11a1", VERSION(3, 11)),
+ 3453: ("Python 3.11a1", VERSION(3, 11)),
+ 3454: ("Python 3.11a1", VERSION(3, 11)),
+ 3455: ("Python 3.11a1", VERSION(3, 11)),
+ 3456: ("Python 3.11a1", VERSION(3, 11)),
+ 3457: ("Python 3.11a1", VERSION(3, 11)),
+ 3458: ("Python 3.11a1", VERSION(3, 11)),
+ 3459: ("Python 3.11a1", VERSION(3, 11)),
+ 3460: ("Python 3.11a1", VERSION(3, 11)),
+ 3461: ("Python 3.11a1", VERSION(3, 11)),
+ 3462: ("Python 3.11a2", VERSION(3, 11)),
+ 3463: ("Python 3.11a3", VERSION(3, 11)),
+ 3464: ("Python 3.11a3", VERSION(3, 11)),
+ 3465: ("Python 3.11a3", VERSION(3, 11)),
+ 3466: ("Python 3.11a4", VERSION(3, 11)),
+ 3467: ("Python 3.11a4", VERSION(3, 11)),
+ 3468: ("Python 3.11a4", VERSION(3, 11)),
+ 3469: ("Python 3.11a4", VERSION(3, 11)),
+ 3470: ("Python 3.11a4", VERSION(3, 11)),
+ 3471: ("Python 3.11a4", VERSION(3, 11)),
+ 3472: ("Python 3.11a4", VERSION(3, 11)),
+ 3473: ("Python 3.11a4", VERSION(3, 11)),
+ 3474: ("Python 3.11a4", VERSION(3, 11)),
+ 3475: ("Python 3.11a5", VERSION(3, 11)),
+ 3476: ("Python 3.11a5", VERSION(3, 11)),
+ 3477: ("Python 3.11a5", VERSION(3, 11)),
+ 3478: ("Python 3.11a5", VERSION(3, 11)),
+ 3479: ("Python 3.11a5", VERSION(3, 11)),
+ 3480: ("Python 3.11a5", VERSION(3, 11)),
+ 3481: ("Python 3.11a5", VERSION(3, 11)),
+ 3482: ("Python 3.11a5", VERSION(3, 11)),
+ 3483: ("Python 3.11a5", VERSION(3, 11)),
+ 3484: ("Python 3.11a5", VERSION(3, 11)),
+ 3485: ("Python 3.11a5", VERSION(3, 11)),
+ 3486: ("Python 3.11a6", VERSION(3, 11)),
+ 3487: ("Python 3.11a6", VERSION(3, 11)),
+ 3488: ("Python 3.11a6", VERSION(3, 11)),
+ 3489: ("Python 3.11a6", VERSION(3, 11)),
+ 3490: ("Python 3.11a6", VERSION(3, 11)),
+ 3491: ("Python 3.11a6", VERSION(3, 11)),
+ 3492: ("Python 3.11a7", VERSION(3, 11)),
+ 3493: ("Python 3.11a7", VERSION(3, 11)),
+ 3494: ("Python 3.11a7", VERSION(3, 11)),
+ 3500: ("Python 3.12a1", VERSION(3, 12)),
+ 3501: ("Python 3.12a1", VERSION(3, 12)),
+ 3502: ("Python 3.12a1", VERSION(3, 12)),
+ 3503: ("Python 3.12a1", VERSION(3, 12)),
+ 3504: ("Python 3.12a1", VERSION(3, 12)),
+ 3505: ("Python 3.12a1", VERSION(3, 12)),
+ 3506: ("Python 3.12a1", VERSION(3, 12)),
+ 3507: ("Python 3.12a1", VERSION(3, 12)),
+ 3508: ("Python 3.12a1", VERSION(3, 12)),
+ 3509: ("Python 3.12a1", VERSION(3, 12)),
+ 3510: ("Python 3.12a1", VERSION(3, 12)),
+ 3511: ("Python 3.12a1", VERSION(3, 12)),
}
# Dictionnary which associate the pyc signature (4-byte long string)
@@ -411,13 +524,7 @@ class PythonCompiledFile(Parser):
if self["magic_string"].value != "\r\n":
return r"Wrong magic string (\r\n)"
- version = self.getVersion()
- if version >= 0x3030000 and self['magic_number'].value >= 3200:
- offset = 12
- else:
- offset = 8
- value = self.stream.readBits(offset * 8, 7, self.endian)
- if value != ord(b'c'):
+ if self["content/bytecode"].value != "c":
return "First object bytecode is not code"
return True
@@ -430,8 +537,22 @@ class PythonCompiledFile(Parser):
def createFields(self):
yield UInt16(self, "magic_number", "Magic number")
yield String(self, "magic_string", 2, r"Magic string \r\n", charset="ASCII")
- yield TimestampUnix32(self, "timestamp", "Timestamp")
version = self.getVersion()
- if version >= 0x3030000 and self['magic_number'].value >= 3200:
- yield UInt32(self, "filesize", "Size of the Python source file (.py) modulo 2**32")
+
+ # PEP 552: Deterministic pycs #31650 (Python 3.7a4); magic=3392
+ if version >= 0x30700A4:
+ yield Bit(self, "use_hash", "Is hash based?")
+ yield Bit(self, "checked")
+ yield NullBits(self, "reserved", 30)
+ use_hash = self['use_hash'].value
+ else:
+ use_hash = False
+
+ if use_hash:
+ yield UInt64(self, "hash")
+ else:
+ yield TimestampUnix32(self, "timestamp", "Timestamp modulo 2**32")
+ if version >= 0x3030000 and self['magic_number'].value >= 3200:
+ yield UInt32(self, "filesize", "Size of the Python source file (.py) modulo 2**32")
+
yield Object(self, "content")
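The hunk above teaches the parser the PEP 552 header layout: after the 4-byte magic, Python 3.7+ pyc files carry a 32-bit flags word whose low bit selects between an 8-byte source hash and the classic 4-byte timestamp (followed by a 4-byte source size since magic 3200). That layout can be sketched independently of hachoir with only the standard library (the helper name is ours, not part of either codebase):

```python
import importlib.util
import struct

def read_pyc_header(raw):
    """Parse the 16-byte header of a Python 3.7+ .pyc file (PEP 552).

    Layout: 4-byte magic, 4-byte flags word, then either an 8-byte
    source hash (hash-based pyc) or a 4-byte mtime + 4-byte source size.
    """
    magic, flags = struct.unpack('<4sI', raw[:8])
    use_hash = bool(flags & 0x1)   # bit 0: hash-based pyc
    checked = bool(flags & 0x2)    # bit 1: hash is checked at import time
    if use_hash:
        (source_hash,) = struct.unpack('<Q', raw[8:16])
        return magic, use_hash, checked, source_hash
    mtime, source_size = struct.unpack('<II', raw[8:16])
    return magic, use_hash, checked, (mtime, source_size)
```

For a timestamp-based pyc produced by the running interpreter, `importlib.util.MAGIC_NUMBER` supplies the expected magic to compare against.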
diff --git a/hachoir/parser/video/asf.py b/hachoir/parser/video/asf.py
index 8da2d1ac..fc41624b 100644
--- a/hachoir/parser/video/asf.py
+++ b/hachoir/parser/video/asf.py
@@ -355,7 +355,7 @@ class AsfFile(Parser):
if self.stream.readBytes(0, len(magic)) != magic:
return "Invalid magic"
header = self[0]
- if not(30 <= header["size"].value <= MAX_HEADER_SIZE):
+ if not (30 <= header["size"].value <= MAX_HEADER_SIZE):
return "Invalid header size (%u)" % header["size"].value
return True
diff --git a/hachoir/parser/video/mpeg_ts.py b/hachoir/parser/video/mpeg_ts.py
index 8e4e8701..e626e70c 100644
--- a/hachoir/parser/video/mpeg_ts.py
+++ b/hachoir/parser/video/mpeg_ts.py
@@ -134,7 +134,7 @@ class MPEG_TS(Parser):
# FIXME: detect using file content, not file name
# maybe detect sync at offset+4 bytes?
source = self.stream.source
- if not(source and source.startswith("file:")):
+ if not (source and source.startswith("file:")):
return True
filename = source[5:].lower()
return filename.endswith((".m2ts", ".mts"))
diff --git a/hachoir/parser/video/mpeg_video.py b/hachoir/parser/video/mpeg_video.py
index 4ddc37f0..d77d758c 100644
--- a/hachoir/parser/video/mpeg_video.py
+++ b/hachoir/parser/video/mpeg_video.py
@@ -244,7 +244,7 @@ class PacketElement(FieldSet):
yield Bits(self, "sync[]", 4) # =2, or 3 if has_dts=True
yield Timestamp(self, "pts")
if self["has_dts"].value:
- if not(self["has_pts"].value):
+ if not self["has_pts"].value:
raise ParserError("Invalid PTS/DTS values")
yield Bits(self, "sync[]", 4) # =1
yield Timestamp(self, "dts")
diff --git a/hachoir/regex/parser.py b/hachoir/regex/parser.py
index f381459a..234c935f 100644
--- a/hachoir/regex/parser.py
+++ b/hachoir/regex/parser.py
@@ -164,7 +164,7 @@ def _parse(text, start=0, until=None):
if char == 'b':
new_regex = RegexWord()
else:
- if not(char in REGEX_COMMAND_CHARACTERS or char in " '"):
+ if not (char in REGEX_COMMAND_CHARACTERS or char in " '"):
raise SyntaxError(
"Operator '\\%s' is not supported" % char)
new_regex = RegexString(char)
diff --git a/hachoir/stream/input_helper.py b/hachoir/stream/input_helper.py
index 132dd670..ed9263e9 100644
--- a/hachoir/stream/input_helper.py
+++ b/hachoir/stream/input_helper.py
@@ -4,18 +4,23 @@ from hachoir.stream import InputIOStream, InputSubStream, InputStreamError
def FileInputStream(filename, real_filename=None, **args):
"""
- Create an input stream of a file. filename must be unicode.
+ Create an input stream of a file. filename must be unicode or a file
+ object.
real_filename is an optional argument used to specify the real filename,
its type can be 'str' or 'unicode'. Use real_filename when you are
not able to convert filename to real unicode string (ie. you have to
use unicode(name, 'replace') or unicode(name, 'ignore')).
"""
- assert isinstance(filename, str)
if not real_filename:
- real_filename = filename
+ real_filename = (filename if isinstance(filename, str)
+ else getattr(filename, 'name', ''))
try:
- inputio = open(real_filename, 'rb')
+ if isinstance(filename, str):
+ inputio = open(real_filename, 'rb')
+ else:
+ inputio = filename
+ filename = getattr(filename, 'name', '')
except IOError as err:
errmsg = str(err)
raise InputStreamError(
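The `FileInputStream` change above drops the `str`-only assertion so callers can pass an already-open binary file object instead of a path. The dispatch logic can be sketched standalone, without hachoir installed (the helper name is hypothetical, used here only to illustrate the pattern):

```python
import io

def open_input(source, real_filename=None):
    """Accept either a filename string or a binary file object,
    mimicking the updated FileInputStream dispatch (illustrative only)."""
    if isinstance(source, str):
        name = real_filename or source
        return open(name, 'rb'), name
    # Already a file object: reuse it and recover a name if one exists.
    name = real_filename or getattr(source, 'name', '')
    return source, name
```

A `BytesIO` buffer, for instance, passes straight through with an empty display name, which matches the `getattr(filename, 'name', '')` fallback in the patch.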
diff --git a/hachoir/stream/output.py b/hachoir/stream/output.py
index 6f62671c..4a9e1514 100644
--- a/hachoir/stream/output.py
+++ b/hachoir/stream/output.py
@@ -2,6 +2,7 @@ from io import StringIO
from hachoir.core.endian import BIG_ENDIAN, LITTLE_ENDIAN
from hachoir.core.bits import long2raw
from hachoir.stream import StreamError
+from hachoir.core import config
from errno import EBADF
MAX_READ_NBYTES = 2 ** 16
@@ -111,12 +112,12 @@ class OutputStream(object):
self.writeBytes(raw)
def copyBitsFrom(self, input, address, nb_bits, endian):
- if (nb_bits % 8) == 0:
+ if (nb_bits % 8) == 0 and (address % 8) == 0 and (self._bit_pos % 8) == 0:
self.copyBytesFrom(input, address, nb_bits // 8)
else:
# Arbitrary limit (because we should use a buffer, like copyBytesFrom(),
# but with endianess problem
- assert nb_bits <= 128
+ assert nb_bits <= config.max_bit_length
data = input.readBits(address, nb_bits, endian)
self.writeBits(nb_bits, data, endian)
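The `copyBitsFrom` fix is subtle: checking only `nb_bits % 8 == 0` let the byte-copy fast path fire even when the source address or the destination's current bit position sat mid-byte, shifting every copied bit (the new `tests/test_editor.py` below exercises exactly this case). The corrected guard, extracted as a standalone predicate (names are ours):

```python
def can_use_byte_copy(nb_bits, src_bit_addr, dst_bit_pos):
    """Mirror the fixed guard in OutputStream.copyBitsFrom: the fast
    byte-for-byte path is only correct when the length, the source bit
    address and the destination bit position are all byte-aligned."""
    return (nb_bits % 8 == 0
            and src_bit_addr % 8 == 0
            and dst_bit_pos % 8 == 0)
```

If any of the three is misaligned, the code must fall through to the slower `readBits`/`writeBits` path, whose size cap the patch also raises from a hard-coded 128 to `config.max_bit_length`.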
diff --git a/hachoir/strip.py b/hachoir/strip.py
index 9b33cdeb..5db2868f 100644
--- a/hachoir/strip.py
+++ b/hachoir/strip.py
@@ -278,7 +278,7 @@ def main():
if parser:
editor = createEditor(parser)
ok &= stripEditor(editor, filename + ".new",
- level, not(values.quiet))
+ level, not values.quiet)
else:
ok = False
if ok:
diff --git a/hachoir/subfile/main.py b/hachoir/subfile/main.py
index a4c477e5..fe895819 100644
--- a/hachoir/subfile/main.py
+++ b/hachoir/subfile/main.py
@@ -85,7 +85,7 @@ def main():
stream = FileInputStream(filename)
with stream:
subfile = SearchSubfile(stream, values.offset, values.size)
- subfile.verbose = not(values.quiet)
+ subfile.verbose = not values.quiet
subfile.debug = values.debug
if output:
subfile.setOutput(output)
diff --git a/hachoir/subfile/search.py b/hachoir/subfile/search.py
index 9cb9ad98..f7ae929d 100644
--- a/hachoir/subfile/search.py
+++ b/hachoir/subfile/search.py
@@ -95,7 +95,7 @@ class SearchSubfile:
print("[!] Memory error!", file=stderr)
self.mainFooter()
self.stream.close()
- return not(main_error)
+ return (not main_error)
def mainHeader(self):
# Fix slice size if needed
@@ -149,7 +149,7 @@ class SearchSubfile:
if parser.content_size is not None:
text += " size=%s (%s)" % (parser.content_size //
8, humanFilesize(parser.content_size // 8))
- if not(parser.content_size) or parser.content_size // 8 < FILE_MAX_SIZE:
+ if not parser.content_size or parser.content_size // 8 < FILE_MAX_SIZE:
text += ": " + parser.description
else:
text += ": " + parser.__class__.__name__
diff --git a/hachoir/urwid.py b/hachoir/urwid.py
index 7839031c..7be693ed 100644
--- a/hachoir/urwid.py
+++ b/hachoir/urwid.py
@@ -295,7 +295,7 @@ class Walker(ListWalker):
text += "= %s" % display
if node.field.description and self.flags & self.display_description:
description = node.field.description
- if not(self.flags & self.human_size):
+ if not (self.flags & self.human_size):
description = makePrintable(description, "ASCII")
text += ": %s" % description
if self.flags & self.display_size and node.field.size or self.flags & self.display_type:
diff --git a/hachoir/wx/app.py b/hachoir/wx/app.py
index 1bd62d92..06b4b079 100644
--- a/hachoir/wx/app.py
+++ b/hachoir/wx/app.py
@@ -8,12 +8,12 @@ from hachoir.wx.dispatcher import dispatcher_t
from hachoir.wx import frame_view, field_view, hex_view, tree_view
from hachoir.wx.dialogs import file_open_dialog
from hachoir.wx.unicode import force_unicode
-from hachoir.version import VERSION
+from hachoir import __version__
class app_t(App):
def __init__(self, filename=None):
- print("[+] Run hachoir-wx version %s" % VERSION)
+ print("[+] Run hachoir-wx version %s" % __version__)
self.filename = filename
App.__init__(self, False)
diff --git a/hachoir/wx/field_view/stubs.py b/hachoir/wx/field_view/stubs.py
index fae03182..1f9e9e82 100644
--- a/hachoir/wx/field_view/stubs.py
+++ b/hachoir/wx/field_view/stubs.py
@@ -32,7 +32,7 @@ def field_type_name(field):
def convert_size(from_field, to_type):
- if not(('Byte' in field_type_name(from_field)) ^ ('Byte' in to_type.__name__)):
+ if not (('Byte' in field_type_name(from_field)) ^ ('Byte' in to_type.__name__)):
return from_field.size
elif 'Byte' in field_type_name(from_field):
return from_field.size * 8
diff --git a/hachoir/wx/hex_view/hex_view.py b/hachoir/wx/hex_view/hex_view.py
index 69d06c7f..7cfde491 100644
--- a/hachoir/wx/hex_view/hex_view.py
+++ b/hachoir/wx/hex_view/hex_view.py
@@ -1,6 +1,12 @@
import wx
from .file_cache import FileCache
+try:
+ import darkdetect
+ darkmode = darkdetect.isDark()
+except ImportError:
+ darkmode = False
+
textchars = set('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ ')
text_view_transtable = bytes([c if chr(c) in textchars else ord('.') for c in range(256)])
@@ -168,7 +174,10 @@ class hex_view_t(wx.ScrolledWindow):
# Draw "textbox" rects under the hex and text views
dc.SetPen(wx.NullPen)
- dc.SetBrush(wx.WHITE_BRUSH)
+ if darkmode:
+ dc.SetBrush(wx.BLACK_BRUSH)
+ else:
+ dc.SetBrush(wx.WHITE_BRUSH)
dc.DrawRectangle(lo.boxstart('hex'), 0, lo.boxwidth('hex'), h)
dc.DrawRectangle(lo.boxstart('text'), 0, lo.boxwidth('text'), h)
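The `darkdetect` guard above is the standard soft-dependency pattern: the module ships in the new `wx` extra, so the hex view must degrade gracefully when it is absent. In isolation the pattern looks like this (the brush stand-in is ours; the real code selects between `wx.BLACK_BRUSH` and `wx.WHITE_BRUSH`):

```python
# Probe an optional dependency; fall back when it is not installed.
try:
    import darkdetect            # optional, from the "wx" extra
    darkmode = bool(darkdetect.isDark())
except ImportError:
    darkmode = False

# Later code branches on the flag instead of re-probing the import.
background = "black" if darkmode else "white"
```

Wrapping `isDark()` in `bool()` also normalizes the `None` that darkdetect returns on platforms it cannot detect.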
diff --git a/hachoir/wx/main.py b/hachoir/wx/main.py
index 8b39519b..b902023a 100755
--- a/hachoir/wx/main.py
+++ b/hachoir/wx/main.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python3
from hachoir.wx.app import app_t
-from hachoir.version import PACKAGE, VERSION, WEBSITE
+from hachoir import __version__
from hachoir.core.cmd_line import getHachoirOptions, configureHachoir
from optparse import OptionParser
import sys
@@ -24,8 +24,7 @@ def parseOptions():
def main():
- print("%s version %s" % (PACKAGE, VERSION))
- print(WEBSITE)
+ print("hachoir version %s" % __version__)
print()
values, filename = parseOptions()
configureHachoir(values)
diff --git a/setup.cfg b/setup.cfg
new file mode 100644
index 00000000..8bfd5a12
--- /dev/null
+++ b/setup.cfg
@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0
+
diff --git a/setup.py b/setup.py
index 3f181148..1f5ec9f2 100755
--- a/setup.py
+++ b/setup.py
@@ -2,26 +2,26 @@
#
# Prepare a release:
#
-# - check version: hachoir/version.py and doc/conf.py
+# - check version: hachoir/__init__.py and doc/conf.py
# - set the release date: edit doc/changelog.rst
# - run: git commit -a
# - Remove untracked files/dirs: git clean -fdx
-# - run tests: tox
+# - run tests: tox --parallel auto
# - run: git push
-# - check Travis CI status:
-# https://travis-ci.org/vstinner/hachoir
-# - run: git tag x.y.z
-# - Remove untracked files/dirs: git clean -fdx
-# - run: python3 setup.py sdist bdist_wheel
+# - check GitHub Actions status:
+# https://github.com/vstinner/hachoir/actions
#
# Release a new version:
#
+# - git tag x.y.z
+# - git clean -fdx # Remove untracked files/dirs
+# - python3 setup.py sdist bdist_wheel
# - git push --tags
# - twine upload dist/*
#
# After the release:
#
-# - set version to N+1: hachoir/version.py and doc/conf.py
+# - set version to N+1: hachoir/__init__.py and doc/conf.py
ENTRY_POINTS = {
'console_scripts': [
@@ -72,6 +72,9 @@ def main():
"name": "hachoir",
"version": hachoir.__version__,
"url": 'http://hachoir.readthedocs.io/',
+ "project_urls": {
+ "Source": "https://github.com/vstinner/hachoir",
+ },
"author": "Hachoir team (see AUTHORS file)",
"description": "Package of Hachoir parsers used to open binary files",
"long_description": long_description,
@@ -83,6 +86,10 @@ def main():
"extras_require": {
"urwid": [
"urwid==1.3.1"
+ ],
+ "wx": [
+ "darkdetect",
+ "wxPython==4.*"
]
},
"zip_safe": True,
diff --git a/tests/test_editor.py b/tests/test_editor.py
new file mode 100644
index 00000000..1a00eb59
--- /dev/null
+++ b/tests/test_editor.py
@@ -0,0 +1,44 @@
+import unittest
+from io import BytesIO
+from hachoir.core.endian import BIG_ENDIAN
+from hachoir.editor import createEditor
+from hachoir.field import Parser, Bits
+from hachoir.stream import StringInputStream, OutputStream
+from hachoir.test import setup_tests
+
+
+class TestEditor(unittest.TestCase):
+ def test_bit_alignment(self):
+ data = bytes([255, 255, 255, 254])
+ stream = StringInputStream(data)
+ parser = TestParser(stream)
+ editor = createEditor(parser)
+
+ # Cause a change in a non-byte-aligned field
+ editor['flags[2]'].value -= 1
+
+ # Generate output and verify operation
+ output_io = BytesIO()
+ output_stream = OutputStream(output_io)
+
+ editor.writeInto(output_stream)
+ output_bits = "{0:b}".format(int.from_bytes(output_io.getvalue(), 'big'))
+
+ # X is the modified bit
+ # .....,,,,,,,,,,,,,,,,..X,,,,,,,,
+ self.assertEqual(output_bits, "11111111111111111111111011111110")
+
+
+class TestParser(Parser):
+ endian = BIG_ENDIAN
+
+ def createFields(self):
+ yield Bits(self, 'flags[]', 5)
+ yield Bits(self, 'flags[]', 16)
+ yield Bits(self, 'flags[]', 3)
+ yield Bits(self, 'flags[]', 8)
+
+
+if __name__ == "__main__":
+ setup_tests()
+ unittest.main()
diff --git a/tests/test_metadata.py b/tests/test_metadata.py
index 8677ba13..345aa5aa 100755
--- a/tests/test_metadata.py
+++ b/tests/test_metadata.py
@@ -126,7 +126,7 @@ class TestMetadata(unittest.TestCase):
# Check type
if type(read) != type(value) \
- and not(isinstance(value, int) and isinstance(value, int)):
+ and not (isinstance(value, int) and isinstance(value, int)):
if self.verbose:
sys.stdout.write("wrong type (%s instead of %s)!\n"
% (type(read).__name__, type(value).__name__))
diff --git a/tests/test_parser.py b/tests/test_parser.py
index fc10f342..11278b6e 100755
--- a/tests/test_parser.py
+++ b/tests/test_parser.py
@@ -236,6 +236,12 @@ class TestParsers(unittest.TestCase):
parser = self.parse("python.cpython-37.pyc.bin")
self.checkValue(parser, "/content/consts/item[0]/name/text", "f")
+ def check_pyc_312(self, parser):
+ parser = self.parse("python.cpython-312.pyc.bin")
+ self.checkValue(parser, "/content/consts/item[0]/value", 1)
+ self.checkValue(parser, "/content/names/item[0]/text", "x")
+ self.checkValue(parser, "/content/name/text", "<module>")
+
def test_java(self):
parser = self.parse("ReferenceMap.class")
self.checkValue(parser, "/minor_version", 3)
@@ -836,6 +842,37 @@ class TestParsers(unittest.TestCase):
self.checkValue(parser, "/bpp", 32)
self.checkDisplay(parser, "/codec", "True-color RLE")
+ def test_ttf(self):
+ parser = self.parse("deja_vu_serif-2.7.ttf")
+ self.checkValue(parser, "/hhea/ascender", 1901)
+ self.checkValue(parser, "/maxp/maxCompositePoints", 101)
+ self.checkValue(parser, "/cmap/encodingRecords[1]/platformID", 1)
+ self.checkValue(parser, "/OS_2/achVendID", "Deja")
+
+ def test_fit(self):
+ parser = self.parse("test_file.fit")
+ self.checkValue(parser, "/header/datasize", 8148)
+ self.checkValue(parser, "/definition[18]/msgNumber", 325)
+ self.checkValue(parser, "/definition[18]/fieldDefinition[6]/number", 5)
+ self.checkValue(parser, "/definition[18]/RecordHeader/msgType", 2)
+ self.checkValue(parser, "/data[50]/field0", 1000231166)
+ self.checkValue(parser, "/data[50]/field1", 111)
+ self.checkValue(parser, "/data[50]/field2", 96)
+
+ def test_arj(self):
+ parser = self.parse("example2.arj")
+ self.checkValue(parser, "/header/crc", 0xd2fe10aa)
+ self.checkValue(parser, "/header/filename", "example2.arj")
+ self.checkValue(parser, "/file_header[1]/filename", "usr/bin/awk")
+ self.checkValue(parser, "/file_header[1]/original_size", 125416)
+
+ def test_arj2(self):
+ parser = self.parse("example4_chapters.arj")
+ self.checkValue(parser, "/header/crc", 0x967cc3b)
+ self.checkValue(parser, "/header/filename", "example4.arj")
+ self.checkValue(parser, "/file_header[15]/filename", "usr/bin/groups")
+ self.checkValue(parser, "/file_header[15]/original_size", 35000)
+
class TestParserRandomStream(unittest.TestCase):
diff --git a/tools/entropy.py b/tools/entropy.py
deleted file mode 100755
index 4471819d..00000000
--- a/tools/entropy.py
+++ /dev/null
@@ -1,78 +0,0 @@
-#!/usr/bin/env python3
-from math import log
-
-
-class Entropy:
-
- def __init__(self):
- self.frequence = dict((index, 0) for index in range(0, 256))
- self.count = 0
-
- def readBytes(self, bytes):
- for byte in bytes:
- self.frequence[byte] = self.frequence[byte] + 1
- self.count += len(bytes)
- return self
-
- def compute(self):
- h = 0
- for value in self.frequence.values():
- if not value:
- continue
- p_i = float(value) / self.count
- h -= p_i * log(p_i, 2)
- return h
-
-from time import time
-from sys import stderr
-
-
-class EntropyFile(Entropy):
-
- def __init__(self):
- Entropy.__init__(self)
- self.progress_time = 1.0
- self.buffer_size = 4096
-
- def displayProgress(self, percent):
- print("Progress: %.1f%%" % percent, file=stderr)
-
- def readStream(self, stream, streamsize=None):
- # Read stream size
- if streamsize is None:
- stream.seek(0, 2)
- streamsize = stream.tell()
- if streamsize <= 0:
- raise ValueError("Empty stream")
-
- # Read stream content
- stream.seek(0, 0)
- next_msg = time() + self.progress_time
- while True:
- if next_msg <= time():
- self.displayProgress(stream.tell() * 100.0 / streamsize)
- next_msg = time() + self.progress_time
- raw = stream.read(self.buffer_size)
- if not raw:
- break
- self.readBytes(raw)
- return self
-
- def readFile(self, filename):
- stream = open(filename, 'rb')
- self.readStream(stream)
- return self
-
-
-def main():
- from sys import argv, exit
- if len(argv) != 2:
- print("usage: %s filename" % argv[0], file=stderr)
- exit(1)
- entropy = EntropyFile()
- entropy.readFile(argv[1])
- print("Entropy: %.4f bit/byte" % entropy.compute())
- exit(0)
-
-if __name__ == "__main__":
- main()
diff --git a/tools/find_deflate.py b/tools/find_deflate.py
deleted file mode 100755
index c1909bc6..00000000
--- a/tools/find_deflate.py
+++ /dev/null
@@ -1,57 +0,0 @@
-#!/usr/bin/env python3
-from zlib import decompress, error as zlib_error
-from sys import argv, stderr, exit
-from time import time
-
-MIN_SIZE = 2
-
-
-def canDeflate(compressed_data):
- try:
- data = decompress(compressed_data)
- return True
- except zlib_error:
- return False
-
-
-def findDeflateBlocks(data):
- next_msg = time() + 1.0
- max_index = len(data) - MIN_SIZE - 1
- for index in range(max_index + 1):
- if next_msg < time():
- next_msg = time() + 1.0
- print("Progress: %.1f%% (offset %s/%s)" % (
- index * 100.0 / max_index, index, max_index))
- if canDeflate(data[index:]):
- yield index
-
-
-def guessDeflateSize(data, offset):
- size = len(data) - offset
- while size:
- if canDeflate(data[offset:offset + size]):
- yield size
- size -= 1
-
-
-def main():
- if len(argv) != 2:
- print("usage: %s filename" % argv[0], file=stderr)
- exit(1)
- data = open(argv[1], 'rb').read()
- offsets = []
- for offset in findDeflateBlocks(data):
- print("Offset %s" % offset)
- offsets.append(offset)
- if offsets:
- for offset in offsets:
- for size in guessDeflateSize(data, offset):
- if size == (len(data) - offset):
- size = "%s (until the end)" % size
- print("Offset %s -- size %s" % (offset, size))
- else:
- print("No deflate block found", file=stderr)
- exit(0)
-
-if __name__ == "__main__":
- main()
diff --git a/tools/flake8.sh b/tools/flake8.sh
deleted file mode 100755
index 651bd4f0..00000000
--- a/tools/flake8.sh
+++ /dev/null
@@ -1,6 +0,0 @@
-#!/bin/sh
-set -e -x
-cd $(dirname "$0")/..
-# use /bin/sh to support "*.py"
-# FIXME: add hachoir-wx (currrently broken)
-flake8 hachoir/ tests/ runtests.py setup.py doc/examples/*.py
diff --git a/tools/flv_extractor.py b/tools/flv_extractor.py
deleted file mode 100755
index fe25ecda..00000000
--- a/tools/flv_extractor.py
+++ /dev/null
@@ -1,37 +0,0 @@
-#!/usr/bin/env python3
-"""
-Extract audio from a FLV movie
-
-Author: Victor Stinner
-Creation date: 2006-11-06
-"""
-from hachoir.parser import createParser
-from hachoir.stream import FileOutputStream
-from hachoir.parser.video.flv import AUDIO_CODEC_MP3
-from sys import stderr, exit, argv
-
-
-def main():
- if len(argv) != 2:
- print("usage: %s video.flv" % argv[0], file=stderr)
- exit(1)
-
- # Open input video
- inputname = argv[1]
- parser = createParser(inputname)
- if parser["audio[0]/codec"].value != AUDIO_CODEC_MP3:
- print("Unknown audio codec: %s" %
- parser["audio[0]/codec"].display, file=stderr)
-
- # Extract audio
- print("Extractor audio from: %s" % inputname)
- outputname = inputname + ".mp3"
- output = FileOutputStream(outputname)
- for chunk in parser.array("audio"):
- data = chunk["music_data"]
- output.copyBitsFrom(
- data.parent.stream, data.absolute_address, data.size, data.parent.endian)
- print("Write audio into: %s" % outputname)
-
-
-main()
diff --git a/tools/fuzzer/file_fuzzer.py b/tools/fuzzer/file_fuzzer.py
deleted file mode 100644
index 9757a2f7..00000000
--- a/tools/fuzzer/file_fuzzer.py
+++ /dev/null
@@ -1,195 +0,0 @@
-from os import path
-from os.path import basename
-from random import randint
-from tools import getFilesize, generateUniqueID
-from hachoir.stream import InputIOStream, InputStreamError
-from hachoir.metadata import extractMetadata
-from hachoir.parser import guessParser
-from io import StringIO
-from array import array
-from mangle import mangle
-from time import time
-
-# Truncate: minimum/maximum file size (in bytes)
-MIN_SIZE = 1
-MAX_SIZE = 1024 * 1024
-
-# Limit each test to 10 secondes
-MAX_DURATION = 10.0
-
-# Number of mangle operation depending of current size
-# '0.30' means: 30% of current size in byte
-MANGLE_PERCENT = 0.01
-MANGLE_PERCENT_INCR = 0
-MIN_MANGLE_PERCENT = 0.10
-
-# Limit fuzzing to 40 tests
-MAX_NB_EXTRACT = 40
-
-# 1 times on 20 tries
-TRUNCATE_RATE = 20
-
-
-class UndoMangle:
-
- def __init__(self, fuzz):
- self.data = fuzz.data.tostring()
- self.orig = fuzz.is_original
-
- def __call__(self, fuzz):
- fuzz.data = array('B', self.data)
- fuzz.is_original = self.orig
-
-
-class UndoTruncate:
-
- def __init__(self, fuzz):
- self.data = fuzz.data
- self.orig = fuzz.is_original
-
- def __call__(self, fuzz):
- fuzz.data = self.data
- fuzz.is_original = self.orig
-
-
-class FileFuzzer:
-
- def __init__(self, fuzzer, filename):
- self.fuzzer = fuzzer
- self.verbose = fuzzer.verbose
- self.mangle_percent = MANGLE_PERCENT
- self.file = open(filename, "rb")
- self.nb_undo = 0
- self.filename = filename
- self.size = getFilesize(self.file)
- self.mangle_count = 0
- self.mangle_call = 0
- size = randint(MIN_SIZE, MAX_SIZE)
- data_str = self.file.read(size)
- self.data = array('B', data_str)
- self.undo = None
- self.nb_extract = 0
- if len(self.data) < self.size:
- self.is_original = False
- self.nb_truncate = 1
- self.warning("Truncate to %s bytes" % len(self.data))
- else:
- self.is_original = True
- self.nb_truncate = 0
- self.info("Size: %s bytes" % len(self.data))
-
- def acceptTruncate(self):
- return MIN_SIZE < len(self.data)
-
- def sumUp(self):
- self.warning("[SUMUP] Extraction: %s" % self.nb_extract)
- if self.mangle_call:
- self.warning("[SUMUP] Mangle# %s (%s op.)" % (
- self.mangle_call, self.mangle_count))
- if self.nb_truncate:
- percent = len(self.data) * 100.0 / self.size
- self.warning("[SUMUP] Truncate# %s -- size: %.1f%% of %s" % (
- self.nb_truncate, percent, self.size))
-
- def warning(self, message):
- print(" %s (%s): %s" %
- (basename(self.filename), self.nb_extract, message))
-
- def info(self, message):
- if self.verbose:
- self.warning(message)
-
- def mangle(self):
- # Store last state
- self.undo = UndoMangle(self)
-
- # Mangle data
- count = mangle(self.data, self.mangle_percent)
-
- # Update state
- self.mangle_call += 1
- self.mangle_count += count
- self.is_original = False
- self.warning("Mangle #%s (%s op.)" % (self.mangle_call, count))
-
- def truncate(self):
- assert MIN_SIZE < len(self.data)
- # Store last state (for undo)
- self.undo = UndoTruncate(self)
-
- # Truncate
- self.nb_truncate += 1
- new_size = randint(MIN_SIZE, len(self.data) - 1)
- self.warning("Truncate #%s (%s bytes)" % (self.nb_truncate, new_size))
- self.data = self.data[:new_size]
- self.is_original = False
-
- def tryUndo(self):
- # No operation to undo?
- if not self.undo:
- self.info("Unable to undo")
- return False
-
- # Undo
- self.nb_undo += 1
- self.info("Undo #%s" % self.nb_undo)
- self.undo(self)
- self.undo = None
-
- # Update mangle percent
- percent = max(self.mangle_percent -
- MANGLE_PERCENT_INCR, MIN_MANGLE_PERCENT)
- if self.mangle_percent != percent:
- self.mangle_percent = percent
- self.info("Set mangle percent to: %u%%" %
- int(self.mangle_percent * 100))
- return True
-
- def extract(self):
- self.nb_extract += 1
- self.prefix = ""
-
- data = self.data.tostring()
- stream = InputIOStream(StringIO(data), filename=self.filename)
-
- # Create parser
- start = time()
- try:
- parser = guessParser(stream)
- except InputStreamError as err:
- parser = None
- if not parser:
- self.info("Unable to create parser: stop")
- return None
-
- # Extract metadata
- try:
- metadata = extractMetadata(parser, 0.5)
- failure = bool(self.fuzzer.log_error)
- except Exception as err:
- self.info("SERIOUS ERROR: %s" % err)
- self.prefix = "metadata"
- failure = True
- duration = time() - start
-
- # Timeout?
- if MAX_DURATION < duration:
- self.info("Process is too long: %.1f seconds" % duration)
- failure = True
- self.prefix = "timeout"
- if not failure and (metadata is None or not metadata):
- self.info("Unable to extract metadata")
- return None
-# for line in metadata.exportPlaintext():
-# print(">>> %s" % line)
- return failure
-
- def keepFile(self, prefix):
- data = self.data.tostring()
- uniq_id = generateUniqueID(data)
- filename = "%s-%s" % (uniq_id, basename(self.filename))
- if prefix:
- filename = "%s-%s" % (prefix, filename)
- filename = path.join(self.fuzzer.error_dir, filename)
- open(filename, "wb").write(data)
- print("=> Store file %s" % filename)
diff --git a/tools/fuzzer/mangle.py b/tools/fuzzer/mangle.py
deleted file mode 100644
index b6d35143..00000000
--- a/tools/fuzzer/mangle.py
+++ /dev/null
@@ -1,91 +0,0 @@
-from random import randint, choice as random_choice
-from array import array
-
-MAX_MIX = 20
-MIN_MIX = -MAX_MIX
-MIN_COUNT = 15
-MAX_COUNT = 2500
-MAX_INC = 32
-MIN_INC = -MAX_INC
-
-SPECIAL_VALUES_NOENDIAN = (
- "\x00",
- "\x00\x00",
- "\x7f",
- "\x7f\xff",
- "\x7f\xff\xff\xff",
- "\x80",
- "\x80\x00",
- "\x80\x00\x00\x00",
- "\xfe",
- "\xfe\xff",
- "\xfe\xff\xff\xff",
- "\xff",
- "\xff\xff",
- "\xff\xff\xff\xff",
-)
-
-SPECIAL_VALUES = []
-for item in SPECIAL_VALUES_NOENDIAN:
- SPECIAL_VALUES.append(item)
- itemb = item[::-1]
- if item != itemb:
- SPECIAL_VALUES.append(itemb)
-
-
-def mangle_replace(data, offset):
- data[offset] = randint(0, 255)
-
-
-def mangle_increment(data, offset):
- value = data[offset] + randint(MIN_INC, MAX_INC)
- data[offset] = max(min(value, 255), 0)
-
-
-def mangle_bit(data, offset):
- bit = randint(0, 7)
- if randint(0, 1) == 1:
- value = data[offset] | (1 << bit)
- else:
- value = data[offset] & (~(1 << bit) & 0xFF)
- data[offset] = value
-
-
-def mangle_special_value(data, offset):
- tlen = len(data) - offset
- text = random_choice(SPECIAL_VALUES)[:tlen]
- data[offset:offset + len(text)] = array("B", text)
-
-
-def mangle_mix(data, ofs1):
- ofs2 = ofs1 + randint(MIN_MIX, MAX_MIX)
- ofs2 = max(min(ofs2, len(data) - 1), 0)
- data[ofs1], data[ofs2] = data[ofs2], data[ofs1]
-
-
-MANGLE_OPERATIONS = (
- mangle_replace,
- mangle_increment,
- mangle_bit,
- mangle_special_value,
- mangle_mix,
-)
-
-
-def mangle(data, percent, min_count=MIN_COUNT, max_count=MAX_COUNT):
- """
- Mangle data: add few random bytes in input byte array.
-
- This function is based on an idea of Ilja van Sprundel (file mangle.c).
- """
- hsize = len(data) - 1
- max_percent = max(min(percent, 1.0), 0.0001)
- count = int(float(len(data)) * max_percent)
- count = max(count, min_count)
- count = min(count, max_count)
- count = randint(1, count)
- for index in range(count):
- operation = random_choice(MANGLE_OPERATIONS)
- offset = randint(0, hsize)
- operation(data, offset)
- return count
diff --git a/tools/fuzzer/stress.py b/tools/fuzzer/stress.py
deleted file mode 100755
index 457ddf26..00000000
--- a/tools/fuzzer/stress.py
+++ /dev/null
@@ -1,201 +0,0 @@
-#!/usr/bin/env python3
-from os import path, getcwd, nice, mkdir
-from sys import exit, argv, stderr
-from glob import glob
-from random import choice as random_choice, randint
-from hachoir.core.memory import limitedMemory
-from errno import EEXIST
-from time import sleep
-from hachoir.core.log import log as hachoir_logger, Log
-from file_fuzzer import FileFuzzer, MAX_NB_EXTRACT, TRUNCATE_RATE
-import re
-
-# Constants
-SLEEP_SEC = 0
-MEMORY_LIMIT = 5 * 1024 * 1024
-
-
-class Fuzzer:
-
- def __init__(self, filedb_dirs, error_dir):
- self.filedb_dirs = filedb_dirs
- self.filedb = []
- self.tmp_file = "/tmp/stress-hachoir"
- self.nb_error = 0
- self.error_dir = error_dir
- self.verbose = False
-
- def filterError(self, text):
- if "Error during metadata extraction" in text:
- return False
- if text.startswith("Error when creating MIME type"):
- return True
- if text.startswith("Unable to create value: "):
- why = text[24:]
- if why.startswith("Can't get field \""):
- return True
- if why.startswith("invalid literal for int(): "):
- return True
- if why.startswith("timestampUNIX(): value have to be in "):
- return True
- if re.match("^Can't read [0-9]+ bits at ", why):
- return True
- if why.startswith("'decimal' codec can't encode character"):
- return True
- if why.startswith("date newer than year "):
- return True
- if why in (
- "day is out of range for month",
- "year is out of range",
- "[Float80] floating point overflow"):
- return True
- if re.match("^(second|minute|hour|month) must be in ", why):
- return True
- if re.match("days=[0-9]+; must have magnitude ", text):
- # Error during metadata extraction: days=1143586582; must have
- # magnitude <= 999999999
- return True
- if "floating point overflow" in text:
- return True
- if "field is too large" in text:
- return True
- if "Seek below field set start" in text:
- return True
- if text.startswith("Unable to create directory directory["):
- # [/section_rsrc] Unable to create directory directory[2][0][]: Can't get field "header" from /section_rsrc/directory[2][0][]
- return True
- if "FAT chain: " in text:
- return True
- if text.startswith("EXE resource: depth too high"):
- return True
- if "OLE2: Unable to parse property of type" in text:
- return True
- if "OLE2: Too much sections" in text:
- return True
- if "OLE2: Invalid endian value" in text:
- return True
- if "Seek above field set end" in text:
- return True
- return False
-
- def newLog(self, level, prefix, text, context):
- if level < Log.LOG_ERROR or self.filterError(text):
- # if self.verbose:
- # print " ignore %s %s" % (prefix, text)
- return
- self.log_error += 1
- print("METADATA ERROR: %s %s" % (prefix, text))
-
- def fuzzFile(self, fuzz):
-
- failure = False
- while fuzz.nb_extract < MAX_NB_EXTRACT:
- if SLEEP_SEC:
- sleep(SLEEP_SEC)
- self.log_error = 0
- fatal_error = False
- try:
- failure = limitedMemory(MEMORY_LIMIT, fuzz.extract)
- prefix = fuzz.prefix
- except KeyboardInterrupt:
- try:
- failure = (
- input("Keep current file (y/n)?").strip() == "y")
- except (KeyboardInterrupt, EOFError):
- print()
- failure = False
- prefix = "interrupt"
- fatal_error = True
- except MemoryError:
- print("MEMORY ERROR!")
- failure = True
- prefix = "memory"
- except Exception as err:
- print("EXCEPTION (%s): %s" % (err.__class__.__name__, err))
- failure = True
- prefix = "exception"
- if fatal_error:
- break
- if failure is None:
- if fuzz.tryUndo():
- failure = False
- elif fuzz.is_original:
- print(
- " Warning: Unsupported file format: remove %s from test suite" % fuzz.filename)
- self.filedb.remove(fuzz.filename)
- return True
- if failure is None:
- break
- if failure:
- break
- if fuzz.acceptTruncate():
- if randint(0, TRUNCATE_RATE - 1) == 0:
- fuzz.truncate()
- else:
- fuzz.mangle()
- else:
- fuzz.mangle()
-
- # Process error
- if failure:
- fuzz.keepFile(prefix)
- self.nb_error += 1
- fuzz.sumUp()
- return (not fatal_error)
-
- def init(self):
- # Setup log
- self.nb_error = 0
- hachoir_logger.use_print = False
- hachoir_logger.on_new_message = self.newLog
-
- # Load file DB
- self.filedb = []
- for directory in self.filedb_dirs:
- new_files = glob(path.join(directory, "*.*"))
- self.filedb.extend(new_files)
- if not self.filedb:
- print("Empty directories: %s" % self.filedb_dirs)
- exit(1)
-
- # Create error directory
- try:
- mkdir(self.error_dir)
- except OSError as err:
- if err.errno == EEXIST:
- pass
-
- def run(self):
- self.init()
- try:
- while True:
- test_file = random_choice(self.filedb)
- print("[+] %s error -- test file: %s" %
- (self.nb_error, test_file))
- fuzz = FileFuzzer(self, test_file)
- ok = self.fuzzFile(fuzz)
- if not ok:
- break
- except KeyboardInterrupt:
- print("Stop")
-
-
-def main():
- # Read command line argument
- if len(argv) < 2:
- print("usage: %s directory [directory2 ...]" % argv[0], file=stderr)
- exit(1)
- test_dirs = [path.normpath(path.expanduser(item)) for item in argv[1:]]
-
- # Directory is current directory?
- err_dir = path.join(getcwd(), "error")
-
- # Nice
- nice(19)
-
- fuzzer = Fuzzer(test_dirs, err_dir)
- fuzzer.run()
-
-
-if __name__ == "__main__":
- main()
diff --git a/tools/fuzzer/tools.py b/tools/fuzzer/tools.py
deleted file mode 100644
index 4b809f8a..00000000
--- a/tools/fuzzer/tools.py
+++ /dev/null
@@ -1,38 +0,0 @@
-from sys import platform
-
-if platform == 'win32':
- from win32process import (GetCurrentProcess, SetPriorityClass,
- BELOW_NORMAL_PRIORITY_CLASS)
-
- def beNice():
- process = GetCurrentProcess()
- # FIXME: Not supported on Windows 95/98/Me/NT: ignore error?
- # which error?
- SetPriorityClass(process, BELOW_NORMAL_PRIORITY_CLASS)
-
- OS_ERRORS = (OSError, WindowsError)
-else:
- from os import nice
-
- def beNice():
- nice(19)
-
- OS_ERRORS = OSError
-
-try:
- import sha
-
- def generateUniqueID(data):
- return sha.new(data).hexdigest()
-except ImportError:
- def generateUniqueID(data):
- generateUniqueID.sequence += 1
- return generateUniqueID.sequence
- generateUniqueID.sequence = 0
-
-
-def getFilesize(file):
- file.seek(0, 2)
- size = file.tell()
- file.seek(0, 0)
- return size
diff --git a/tools/hachoir-fuse.py b/tools/hachoir-fuse.py
deleted file mode 100755
index 39bf65d2..00000000
--- a/tools/hachoir-fuse.py
+++ /dev/null
@@ -1,243 +0,0 @@
-#!/usr/bin/env python3
-"""
-Proof of concept of Hachoir user interface using FUSE
-"""
-
-import errno
-# import os
-import stat
-from hachoir.log import log
-from hachoir.field import MissingField
-from hachoir.tools import makePrintable
-from hachoir.stream import FileOutputStream
-from hachoir.editor import createEditor
-from hachoir.parser import createParser
-
-# some spaghetti to make it usable without fuse-py being installed
-for i in True, False:
- try:
- import fuse
- from fuse import Fuse
- except ImportError:
- if i:
- try:
- import _find_fuse_parts # noqa
- except ImportError:
- pass
- else:
- raise
-
-if not hasattr(fuse, '__version__'):
- raise RuntimeError(
- "your fuse-py doesn't know of fuse.__version__, probably it's too old.")
-
-# This setting is optional, but it ensures that this class will keep
-# working after a future API revision
-fuse.fuse_python_api = (0, 2)
-
-
-class MyStat(fuse.Stat):
-
- def __init__(self):
- self.st_mode = 0
- self.st_ino = 0
- self.st_dev = 0
- self.st_nlink = 0
- self.st_uid = 0
- self.st_gid = 0
- self.st_size = 0
- self.st_atime = 0
- self.st_mtime = 0
- self.st_ctime = 0
-
-
-class HelloFS(Fuse):
-
- def __init__(self, input_filename, **kw):
- Fuse.__init__(self, **kw)
- log.setFilename("/home/haypo/fuse_log")
- self.hachoir = createParser(input_filename)
- if True:
- self.hachoir = createEditor(self.hachoir)
- self.readonly = False
- else:
- self.readonly = True
- self.fs_charset = "ASCII"
-
- def getField(self, path):
- try:
- field = self.hachoir
- try:
- for name in path.split("/")[1:]:
- if not name:
- break
- name = name.split("-", 1)
- if len(name) != 2:
- return None
- field = field[name[1]]
- return field
- except MissingField:
- return None
- except Exception as xx:
- log.info("Exception: %s" % str(xx))
- raise
-
- def fieldValue(self, field):
- return makePrintable(field.display, "ISO-8859-1") + "\n"
-
- def getattr(self, path):
- st = MyStat()
- if path == '/':
- st.st_mode = stat.S_IFDIR | 0o755
- st.st_nlink = 2
- return st
- if path == "/.command":
- st.st_mode = stat.S_IFDIR | 0o755
- return st
- if path.startswith("/.command/"):
- name = path.split("/", 3)[2]
- if name in ("writeInto",):
- st.st_mode = stat.S_IFREG | 0o444
- return st
- return -errno.ENOENT
-
- # Get field
- field = self.getField(path)
- if not field:
- return -errno.ENOENT
-
- # Set size and mode
- if field.is_field_set:
- st.st_mode = stat.S_IFDIR | 0o755
- else:
- st.st_mode = stat.S_IFREG | 0o444
- st.st_nlink = 1
- if field.hasValue():
- st.st_size = len(self.fieldValue(field))
- else:
- st.st_size = 0
- return st
-
- def unlink(self, path, *args):
- log.info("unlink(%s)" % path)
- field = self.getField(path)
- log.info("del %s" % field.name)
- if not field:
- return -errno.ENOENT
- if self.readonly:
- return -errno.EACCES
- log.info("del %s" % field.name)
- try:
- del field.parent[field.name]
- except Exception as err:
- log.info("del ERROR %s" % err)
- return 0
-
- def readCommandDir(self):
- yield fuse.Direntry('writeInto')
-
- def readdir(self, path, offset):
- log.info("readdir(%s)" % path)
- yield fuse.Direntry('.')
- yield fuse.Direntry('..')
- if path == "/.command":
- for entry in self.readCommandDir():
- yield entry
- return
-
- # Get field
- fieldset = self.getField(path)
-# if not fieldset:
-# return -errno.ENOENT
-
- if path == "/":
- entry = fuse.Direntry(".command")
- entry.type = stat.S_IFREG
- yield entry
-
- # Format file name
- count = len(fieldset)
- if count % 10:
- count += 10 - (count % 10)
- format = "%%0%ud-%%s" % (count // 10)
-
- # Create entries
- for index, field in enumerate(fieldset):
- name = format % (1 + index, field.name)
- entry = fuse.Direntry(name)
- if field.is_field_set:
- entry.type = stat.S_IFDIR
- else:
- entry.type = stat.S_IFREG
- yield entry
- log.info("readdir(%s) done" % path)
-
- def open(self, path, flags):
- log.info("open(%s)" % path)
-# if ...:
-# return -errno.ENOENT
-# accmode = os.O_RDONLY | os.O_WRONLY | os.O_RDWR
-# if (flags & accmode) != os.O_RDONLY:
-# return -errno.EACCES
-
- def write(self, path, data, offset):
- if path == "/.command/writeInto":
- if self.readonly:
- return -errno.EACCES
- try:
- data = data.strip(" \t\r\n\0")
- filename = str(data, self.fs_charset)
- except UnicodeDecodeError:
- log.info("writeInto(): unicode error!")
- return 0
- log.info("writeInto(%s)" % filename)
- stream = FileOutputStream(filename)
- self.hachoir.writeInto(stream)
- return len(data)
-
- def read(self, path, size, offset):
- try:
- log.info("read(%s, %s, %s)" % (path, size, offset))
- field = self.getField(path)
- if not field:
- log.info("=> ENOENT")
- return -errno.ENOENT
- if not field.hasValue():
- return ''
- except Exception as xx:
- log.info("ERR: %s" % xx)
- raise
- data = self.fieldValue(field)
- slen = len(data)
- if offset >= slen:
- return ''
- if offset + size > slen:
- size = slen - offset
- data = data[offset:offset + size]
- log.info("=> %s" % repr(data))
- return data
-
- def stop(self):
- log.info("stop()")
-
- def truncate(self, *args):
- log.info("truncate(): TODO!")
-
-
-def main():
- usage = """
-Userspace hello example
-
-""" + Fuse.fusage
- server = HelloFS('/home/haypo/testcase/KDE_Click.wav',
- version="%prog " + fuse.__version__,
- usage=usage,
- dash_s_do='setsingle')
-
- server.parse(errex=1)
- server.main()
- server.stop()
-
-
-if __name__ == '__main__':
- main()
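The `read()` method above clamps the requested `(offset, size)` window to the length of the field's rendered value, which is the standard contract for a FUSE read handler. The clamping logic can be sketched on its own (the function name is hypothetical):

```python
def read_window(data: bytes, size: int, offset: int) -> bytes:
    # Clamp the requested [offset, offset + size) window to the data
    # length, mirroring the bounds checks in HelloFS.read() above
    total = len(data)
    if offset >= total:
        return b""
    if offset + size > total:
        size = total - offset
    return data[offset:offset + size]
```

Returning an empty bytes object for out-of-range offsets lets the kernel treat the read as end-of-file rather than an error.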
diff --git a/tools/steganography.py b/tools/steganography.py
deleted file mode 100755
index 44a4fc8c..00000000
--- a/tools/steganography.py
+++ /dev/null
@@ -1,172 +0,0 @@
-#!/usr/bin/env python3
-
-from hachoir.editor import (createEditor as hachoirCreateEditor,
- NewFieldSet, EditableInteger, EditableString, EditableBytes)
-from hachoir.stream import FileOutputStream
-from hachoir.parser import createParser
-from hachoir.parser.image import PngFile
-from hachoir.parser.audio import MpegAudioFile
-from sys import argv, stdin, stdout, stderr, exit
-import zlib
-
-
-class InjecterError(Exception):
- pass
-
-
-class Injecter:
-
- def __init__(self, editor):
- self.editor = editor
-
- def getMaxSize(self):
- "None: no limit"
- raise NotImplementedError()
-
- def read(self):
- raise NotImplementedError()
-
- def write(self, data):
- raise NotImplementedError()
-
- def saveInto(self, filename):
- output = FileOutputStream(filename)
- self.editor.writeInto(output)
-
-
-def computeCRC32(data):
- "Compute CRC-32 of data string. Result is a positive integer."
- crc = zlib.crc32(data)
- if 0 <= crc:
- return crc
- else:
- return crc + (1 << 32)
-
-
-class PngInjecter(Injecter):
- MAGIC = "HACHOIR"
-
- def getMaxSize(self):
- return None
-
- def read(self):
- for field in self.editor:
- if field.name.startswith("text[") \
- and field["keyword"].value == self.MAGIC:
- return field["text"].value
- return None
-
- def write(self, data):
- tag = "tEXt"
- data = "%s\0%s" % (self.MAGIC, data)
- size = len(data)
- crc = computeCRC32(tag + data)
- chunk = NewFieldSet(self.editor, "inject[]")
- chunk.insert(EditableInteger(chunk, "size", False, 32, size))
- chunk.insert(EditableBytes(chunk, "tag", tag))
- chunk.insert(EditableBytes(chunk, "content", data))
- chunk.insert(EditableInteger(chunk, "crc32", False, 32, crc))
- self.editor.insertBefore("end", chunk)
-
-
-class MpegAudioInjecter(Injecter):
- MAX_PACKET_SIZE = 2048 # bytes between each frame
-
- def __init__(self, editor, packet_size=None):
- Injecter.__init__(self, editor)
- self.frames = editor["frames"]
- if packet_size:
- # Limit packet size to 1..MAX_PACKET_SIZE bytes
- self.packet_size = max(min(self.MAX_PACKET_SIZE, packet_size), 1)
- else:
- self.packet_size = self.MAX_PACKET_SIZE
-
- def getMaxSize(self):
- return len(self.frames) * self.packet_size * 8
-
- def read(self):
- data = []
- for field in self.frames:
- if field.name.startswith("padding["):
- data.append(field.value)
- if data:
- return b"".join(data)
- else:
- return None
-
- def write(self, data):
- count = 30
- self.packet_size = 3
- data = b"\0" * (self.packet_size * count - 1)
- print("Packet size: %s" % self.packet_size)
- print("Check input message")
- if b"\xff" in data:
- raise InjecterError(
- "Sorry, MPEG audio injecter disallows 0xFF byte")
-
-# print "Check message size"
-# maxbytes = self.getMaxSize()
-# if maxbytes < len(data)*8:
-# raise InjecterError("Message is too big (max: %s, want: %s)" % \
-# (maxbytes, len(data)))
-
- print("Inject message")
- field_index = 0
- index = 0
- output = self.frames
- while index < len(data):
- padding = data[index:index + self.packet_size]
- print(index, index + self.packet_size, type(padding))
- name = "frame[%u]" % field_index
- print("Insert %s before %s" % (len(padding), name))
- output.insertAfter(name, EditableString(
- output, "padding[]", "fixed", padding))
- index += self.packet_size
- field_index += 2
-
-
-def createEditor(filename):
- parser = createParser(filename)
- return hachoirCreateEditor(parser)
-
-
-injecter_cls = {
- PngFile: PngInjecter,
- MpegAudioFile: MpegAudioInjecter,
-}
-
-
-def main():
- if len(argv) != 2:
- print("usage: %s music.mp3" % argv[0], file=stderr)
- exit(1)
-
- filename = str(argv[1])
- editor = createEditor(filename)
-# injecter = injecter_cls[editor.input.__class__]
- injecter = MpegAudioInjecter(editor, packet_size=16)
-
- if False:
- data = injecter.read()
- if data:
- stdout.write(data)
- exit(0)
- else:
- print("No data", file=stderr)
- exit(1)
- else:
- out_filename = filename + ".msg"
- print("Write your message and valid with CTRL+D:")
- stdout.flush()
- data = stdin.read()
- data = data.encode('utf-8')
-
- print("Hide message")
- injecter.write(data)
-
- print("Write output into: %s" % out_filename)
- injecter.saveInto(out_filename)
-
-
-if __name__ == "__main__":
- main()
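The `computeCRC32()` helper above normalizes the signed result that `zlib.crc32()` could return under Python 2; on Python 3 the value is already unsigned. A version-agnostic sketch masks instead of branching (the helper name is an assumption, not part of the original tool):

```python
import zlib


def crc32_unsigned(data: bytes) -> int:
    # Masking with 0xFFFFFFFF yields the same unsigned 32-bit value
    # regardless of whether zlib.crc32() returned a signed result
    return zlib.crc32(data) & 0xFFFFFFFF
```

This is the form PNG chunk writers typically use, since the chunk CRC field is an unsigned 32-bit big-endian integer.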
diff --git a/tools/swf_extractor.py b/tools/swf_extractor.py
deleted file mode 100755
index 116a9866..00000000
--- a/tools/swf_extractor.py
+++ /dev/null
@@ -1,100 +0,0 @@
-#!/usr/bin/env python3
-from hachoir.parser import createParser, guessParser
-from sys import stderr, exit, argv
-
-
-class JpegExtractor:
-
- def __init__(self):
- self.jpg_index = 1
- self.snd_index = 1
- self.verbose = False
-
- def storeJPEG(self, content):
- name = "image-%03u.jpg" % self.jpg_index
- print("Write new image: %s" % name)
- with open(name, "wb") as fp:
- fp.write(content)
- self.jpg_index += 1
-
- def createNewSound(self):
- name = "sound-%03u.mp3" % self.snd_index
- print("Write new sound: %s" % name)
- self.snd_index += 1
- return open(name, "wb")
-
- def extractFormat2(self, field):
- if "jpeg_header" in field:
- header = field["jpeg_header"]
- if 32 < header.size:
- if self.verbose:
- print("Use JPEG table: %s" % header.path)
- header = field.root.stream.readBytes(
- header.absolute_address, (header.size - 16) // 8)
- else:
- header = ""
- else:
- header = None
- content = field["image"].value
- if header:
- content = header + content[2:]
- if self.verbose:
- print("Extract JPEG from %s" % field.path)
- self.storeJPEG(content)
-
- def extractSound2(self, parser):
- header = None
- output = None
- for field in parser:
- if field.name.startswith("def_sound["):
- header = field
- output = self.createNewSound()
- data = header["music_data"].value
- assert data[:1] == b'\xFF'
- output.write(data)
- elif field.name.startswith("sound_blk") \
- and "music_data" in field:
- data = field["music_data"].value
- if data:
- assert data[:1] == b'\xFF'
- output.write(data)
-
- def main(self):
- if len(argv) != 2:
- print("usage: %s document.swf" % argv[0], file=stderr)
- exit(1)
-
- filename = argv[1]
- parser = createParser(filename)
-
- if parser["signature"].value == "CWS":
- deflate_swf = parser["compressed_data"].getSubIStream()
- parser = guessParser(deflate_swf)
-
- if "jpg_table/data" in parser:
- # JPEG pictures with common header
- jpeg_header = parser["jpg_table/data"].value[:-2]
- for field in parser.array("def_bits"):
- jpeg_content = field["image"].value[2:]
- if self.verbose:
- print("Extract JPEG from %s" % field.path)
- self.storeJPEG(jpeg_header + jpeg_content)
-
- # JPEG in format 2/3
- for field in parser.array("def_bits_jpeg2"):
- self.extractFormat2(field)
- for field in parser.array("def_bits_jpeg3"):
- self.extractFormat2(field)
-
- # Extract sound
- # self.extractSound(parser)
- self.extractSound2(parser)
-
- # Does it extract anything?
- if self.jpg_index == 1:
- print("No JPEG picture found.")
- if self.snd_index == 1:
- print("No sound found.")
-
-
-JpegExtractor().main()
diff --git a/tox.ini b/tox.ini
index 190a2c13..32abdbc9 100644
--- a/tox.ini
+++ b/tox.ini
@@ -13,10 +13,14 @@ commands =
sh tools/flake8.sh
[flake8]
+# E121 continuation line under-indented for hanging indent
+# hachoir/parser/network/ouid.py
+# E131 continuation line unaligned for hanging indent
+# parser/container/mp4.py
# E501 line too long (88 > 79 characters)
# W503 line break before binary operator
# W504 line break after binary operator
-ignore = E501,W503,W504
+ignore = E121,E131,E501,W503,W504
[testenv:doc]
deps=