New Upstream Release - python-saneyaml

Ready changes

Summary

Merged new upstream version: 0.6.0 (was: 0.5.2).

Resulting package

Built on 2023-06-06T13:12 (took 6m53s)

The resulting binary package can be installed (if you have the apt repository enabled) by running:

apt install -t fresh-releases python3-saneyaml
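
To check which version ended up installed, the following is a minimal sketch using only the Python standard library (Python 3.8 or later). It assumes the Debian package ships standard distribution metadata under the name `saneyaml`, the name given in the upstream PKG-INFO; the helper name `installed_version` is hypothetical and not part of saneyaml.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not visible to this interpreter.
        return None


if __name__ == "__main__":
    # "saneyaml" is the distribution name from the upstream metadata (assumed
    # to be what the Debian package registers). After this merge the expected
    # value is 0.6.0; None means the package is not installed.
    print(installed_version("saneyaml"))
```

Run it with the system `python3`, since that is the interpreter the apt package installs into.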

Lintian Result

Diff

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..96c89ce
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,3 @@
+# Ignore all Git auto CR/LF line endings conversions
+* -text
+pyproject.toml export-subst
diff --git a/.github/workflows/docs-ci.yml b/.github/workflows/docs-ci.yml
new file mode 100644
index 0000000..18a44aa
--- /dev/null
+++ b/.github/workflows/docs-ci.yml
@@ -0,0 +1,37 @@
+name: CI Documentation
+
+on: [push, pull_request]
+
+jobs:
+  build:
+    runs-on: ubuntu-20.04
+
+    strategy:
+      max-parallel: 4
+      matrix:
+        python-version: [3.9]
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v2
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Give permission to run scripts
+        run: chmod +x ./docs/scripts/doc8_style_check.sh
+
+      - name: Install Dependencies
+        run:  pip install -e .[docs]
+
+      - name: Check Sphinx Documentation build minimally
+        working-directory: ./docs
+        run: sphinx-build -E -W source build
+
+      - name: Check for documentation style errors
+        working-directory: ./docs
+        run: ./scripts/doc8_style_check.sh
+
+
diff --git a/.github/workflows/pypi-release.yml b/.github/workflows/pypi-release.yml
new file mode 100644
index 0000000..22315ff
--- /dev/null
+++ b/.github/workflows/pypi-release.yml
@@ -0,0 +1,83 @@
+name: Create library release archives, create a GH release and publish PyPI wheel and sdist on tag in main branch
+
+
+# This is executed automatically on a tag in the main branch
+
+# Summary of the steps:
+# - build wheels and sdist
+# - upload wheels and sdist to PyPI
+# - create gh-release and upload wheels and dists there
+# TODO: smoke test wheels and sdist
+# TODO: add changelog to release text body
+
+# WARNING: this is designed only for packages building as pure Python wheels
+
+on:
+  workflow_dispatch:
+  push:
+    tags:
+      - "v*.*.*"
+
+jobs:
+  build-pypi-distribs:
+    name: Build and publish library to PyPI
+    runs-on: ubuntu-20.04
+
+    steps:
+      - uses: actions/checkout@master
+      - name: Set up Python
+        uses: actions/setup-python@v1
+        with:
+          python-version: 3.9
+
+      - name: Install pypa/build
+        run: python -m pip install build --user
+
+      - name: Build a binary wheel and a source tarball
+        run: python -m build --sdist --wheel --outdir dist/
+
+      - name: Upload built archives
+        uses: actions/upload-artifact@v3
+        with:
+          name: pypi_archives
+          path: dist/*
+
+
+  create-gh-release:
+    name: Create GH release
+    needs:
+      - build-pypi-distribs
+    runs-on: ubuntu-20.04
+
+    steps:
+      - name: Download built archives
+        uses: actions/download-artifact@v3
+        with:
+          name: pypi_archives
+          path: dist
+
+      - name: Create GH release
+        uses: softprops/action-gh-release@v1
+        with:
+          draft: true
+          files: dist/*
+
+
+  create-pypi-release:
+    name: Create PyPI release
+    needs:
+      - create-gh-release
+    runs-on: ubuntu-20.04
+
+    steps:
+      - name: Download built archives
+        uses: actions/download-artifact@v3
+        with:
+          name: pypi_archives
+          path: dist
+
+      - name: Publish to PyPI
+        if: startsWith(github.ref, 'refs/tags')
+        uses: pypa/gh-action-pypi-publish@master
+        with:
+          password: ${{ secrets.PYPI_API_TOKEN }}
diff --git a/.gitignore b/.gitignore
index fc50d61..2d48196 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,10 +1,8 @@
-# ScanCode special files
-/SCANCODE_DEV_MODE
-
 # Python compiled files
 *.py[cod]
 
 # virtualenv and other misc bits
+/src/*.egg-info
 *.egg-info
 /dist
 /build
@@ -15,6 +13,7 @@
 /Lib
 /pip-selfcheck.json
 /tmp
+/venv
 .Python
 /include
 /Include
@@ -44,9 +43,16 @@ htmlcov
 .idea
 org.eclipse.core.resources.prefs
 .vscode
+.vs
 
 # Sphinx
 docs/_build
+docs/bin
+docs/build
+docs/include
+docs/Lib
+doc/pyvenv.cfg
+pyvenv.cfg
 
 # Various junk and temp files
 .DS_Store
@@ -59,3 +65,10 @@ docs/_build
 
 # pyenv
 /.python-version
+/man/
+/.pytest_cache/
+lib64
+tcl
+
+# Ignore Jupyter Notebook related temp files
+.ipynb_checkpoints/
diff --git a/.readthedocs.yml b/.readthedocs.yml
new file mode 100644
index 0000000..1b71cd9
--- /dev/null
+++ b/.readthedocs.yml
@@ -0,0 +1,18 @@
+# .readthedocs.yml
+# Read the Docs configuration file
+# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
+
+# Required
+version: 2
+
+# Where the Sphinx conf.py file is located
+sphinx:
+   configuration: docs/source/conf.py
+
+# Setting the python version and doc build requirements
+python:
+  install:
+    - method: pip
+      path: .
+      extra_requirements:
+        - docs
diff --git a/.travis.yml b/.travis.yml
deleted file mode 100644
index c4e028b..0000000
--- a/.travis.yml
+++ /dev/null
@@ -1,22 +0,0 @@
-language: python
-
-python:
-  - "2.7"
-  - "3.6"
-
-install:
-    - pip install -r requirements_dev.txt
-
-script:
-    - bin/py.test -vvs tests
-
-notifications:
-    irc:
-        channels:
-          - "chat.freenode.net#aboutcode"
-    on_success: change
-    on_failure: always
-    use_notice: true
-    skip_join: true
-    template:
-      - "%{repository_slug}#%{build_number} (%{branch} - %{commit} : %{author}): %{message} : %{build_url}"
diff --git a/AUTHORS.rst b/AUTHORS.rst
new file mode 100644
index 0000000..51a19cc
--- /dev/null
+++ b/AUTHORS.rst
@@ -0,0 +1,3 @@
+The following organizations or individuals have contributed to this repo:
+
+- 
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index f84fbd1..cf800af 100644
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -1,7 +1,38 @@
 Changelog
 =========
 
-0.1 (2018-11-17)
-----------------
+v0.6 (2023-01-15)
+-----------------
 
-Initial release.
+- Merge latest https://github.com/nexB/skeleton
+- Support PyYAML 5.x and 6.x and document in README
+  Thank you to @mwgamble
+
+
+v0.5 (2021-03-31)
+-----------------
+
+- Adopt https://github.com/nexB/skeleton
+- Support Python 3 only, drop Python 2 support
+- Support both PyYAML 5.x only
+- Drop travis for Azure pipelines
+
+
+v0.4 (2019-04-10)
+-----------------
+
+- Dump nulls correctly
+- Fix CI on Travis
+- Attempt to support PyYAML 5.x partially
+
+
+v0.3 (2019-04-07)
+-----------------
+
+- Use OrderedDict to keep ordering of mappings
+
+
+v0.1 (2018-11-17)
+-----------------
+
+- Initial release based on code originally used in ScanCode toolkit.
diff --git a/CODE_OF_CONDUCT.rst b/CODE_OF_CONDUCT.rst
new file mode 100644
index 0000000..590ba19
--- /dev/null
+++ b/CODE_OF_CONDUCT.rst
@@ -0,0 +1,86 @@
+Contributor Covenant Code of Conduct
+====================================
+
+Our Pledge
+----------
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our
+project and our community a harassment-free experience for everyone,
+regardless of age, body size, disability, ethnicity, gender identity and
+expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, religion, or sexual identity and
+orientation.
+
+Our Standards
+-------------
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+-  Using welcoming and inclusive language
+-  Being respectful of differing viewpoints and experiences
+-  Gracefully accepting constructive criticism
+-  Focusing on what is best for the community
+-  Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+-  The use of sexualized language or imagery and unwelcome sexual
+   attention or advances
+-  Trolling, insulting/derogatory comments, and personal or political
+   attacks
+-  Public or private harassment
+-  Publishing others’ private information, such as a physical or
+   electronic address, without explicit permission
+-  Other conduct which could reasonably be considered inappropriate in a
+   professional setting
+
+Our Responsibilities
+--------------------
+
+Project maintainers are responsible for clarifying the standards of
+acceptable behavior and are expected to take appropriate and fair
+corrective action in response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit,
+or reject comments, commits, code, wiki edits, issues, and other
+contributions that are not aligned to this Code of Conduct, or to ban
+temporarily or permanently any contributor for other behaviors that they
+deem inappropriate, threatening, offensive, or harmful.
+
+Scope
+-----
+
+This Code of Conduct applies both within project spaces and in public
+spaces when an individual is representing the project or its community.
+Examples of representing a project or community include using an
+official project e-mail address, posting via an official social media
+account, or acting as an appointed representative at an online or
+offline event. Representation of a project may be further defined and
+clarified by project maintainers.
+
+Enforcement
+-----------
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may
+be reported by contacting the project team at pombredanne@gmail.com
+or on the Gitter chat channel at https://gitter.im/aboutcode-org/discuss .
+All complaints will be reviewed and investigated and will result in a
+response that is deemed necessary and appropriate to the circumstances.
+The project team is obligated to maintain confidentiality with regard to
+the reporter of an incident. Further details of specific enforcement
+policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in
+good faith may face temporary or permanent repercussions as determined
+by other members of the project’s leadership.
+
+Attribution
+-----------
+
+This Code of Conduct is adapted from the `Contributor Covenant`_ ,
+version 1.4, available at
+https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
+
+.. _Contributor Covenant: https://www.contributor-covenant.org
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index fdb659f..3d41e89 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -48,11 +48,9 @@ For other questions, discussions, and chats, we have:
   Gitter also has an IRC bridge at https://irc.gitter.im/
   This is the main place where we chat and meet.
 
-- an official #aboutcode IRC channel on freenode (server chat.freenode.net)
+- an official #aboutcode IRC channel on Libera Chat (server web.libera.chat)
   for scancode and other related tools. You can use your
-  favorite IRC client or use the web chat at https://webchat.freenode.net/ .
-  This is a busy place with a lot of CI and commit notifications that makes
-  actual chat sometimes difficult!
+  favorite IRC client or use the web chat at https://web.libera.chat/?#aboutcode .
 
 - a mailing list at `sourceforge <https://lists.sourceforge.net/lists/listinfo/aboutcode-discuss>`_
 
diff --git a/MANIFEST.in b/MANIFEST.in
index 7f92448..ef3721e 100644
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,19 +1,15 @@
 graft src
-graft tests
 
-prune src/saneyaml.egg-info
-
-include CHANGELOG.rst
-include README.rst
-include CONTRIBUTING.rst
-include apache-2.0.LICENSE
+include *.LICENSE
 include NOTICE
-include MANIFEST.in
-include setup.py
-include setup.cfg
-include .gitignore
-include .travis.yml
-include appveyor.yml
-include saneyaml.ABOUT
+include *.ABOUT
+include *.toml
+include *.yml
+include *.rst
+include setup.*
+include configure*
+include requirements*
+include .git*
 
 global-exclude *.py[co] __pycache__ *.*~
+
diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..cc36c35
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: Apache-2.0
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+
+# Python version can be specified with `$ PYTHON_EXE=python3.x make conf`
+PYTHON_EXE?=python3
+VENV=venv
+ACTIVATE?=. ${VENV}/bin/activate;
+
+dev:
+	@echo "-> Configure the development envt."
+	./configure --dev
+
+isort:
+	@echo "-> Apply isort changes to ensure proper imports ordering"
+	${VENV}/bin/isort --sl -l 100 src tests setup.py
+
+black:
+	@echo "-> Apply black code formatter"
+	${VENV}/bin/black -l 100 src tests setup.py
+
+doc8:
+	@echo "-> Run doc8 validation"
+	@${ACTIVATE} doc8 --max-line-length 100 --ignore-path docs/_build/ --quiet docs/
+
+valid: isort black
+
+check:
+	@echo "-> Run pycodestyle (PEP8) validation"
+	@${ACTIVATE} pycodestyle --max-line-length=100 --exclude=.eggs,venv,lib,thirdparty,docs,migrations,settings.py,.cache .
+	@echo "-> Run isort imports ordering validation"
+	@${ACTIVATE} isort --sl --check-only -l 100 setup.py src tests . 
+	@echo "-> Run black validation"
+	@${ACTIVATE} black --check --check -l 100 src tests setup.py
+
+clean:
+	@echo "-> Clean the Python env"
+	./configure --clean
+
+test:
+	@echo "-> Run the test suite"
+	${VENV}/bin/pytest -vvs
+
+docs:
+	rm -rf docs/_build/
+	@${ACTIVATE} sphinx-build docs/ docs/_build/
+
+.PHONY: conf dev check valid black isort clean test docs
diff --git a/NOTICE b/NOTICE
index 77a4e0f..65936b2 100644
--- a/NOTICE
+++ b/NOTICE
@@ -1,12 +1,19 @@
-Copyright (c) 2018 nexB Inc. and others. All rights reserved.
-http://nexb.com and https://github.com/nexB/saneyaml/
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-    http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
+#
+# Copyright (c) nexB Inc. and others.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Visit https://aboutcode.org and https://github.com/nexB/ for support and download.
+# ScanCode is a trademark of nexB Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
diff --git a/PKG-INFO b/PKG-INFO
new file mode 100644
index 0000000..b9719a1
--- /dev/null
+++ b/PKG-INFO
@@ -0,0 +1,70 @@
+Metadata-Version: 2.1
+Name: saneyaml
+Version: 0.6.0
+Summary: Read and write readable YAML safely preserving order and avoiding bad surprises with unwanted infered type conversions. This library is a PyYaml wrapper with sane behaviour to read and write readable YAML safely, typically when used for configuration.
+Home-page: https://github.com/nexB/saneyaml
+Author: nexB. Inc. and others
+Author-email: info@aboutcode.org
+License: Apache-2.0
+Keywords: utilities,yaml,pyyaml,block,flow
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3 :: Only
+Classifier: Topic :: Software Development
+Classifier: Topic :: Utilities
+Requires-Python: >=3.7
+Description-Content-Type: text/x-rst
+Provides-Extra: testing
+Provides-Extra: docs
+License-File: apache-2.0.LICENSE
+License-File: NOTICE
+License-File: AUTHORS.rst
+License-File: CHANGELOG.rst
+License-File: CODE_OF_CONDUCT.rst
+
+========
+saneyaml
+========
+
+This micro library is a PyYaml wrapper with sane behaviour to read and
+write readable YAML safely, typically when used with configuration files.
+
+With saneyaml you can dump readable and clean YAML and load safely any YAML
+preserving ordering and avoiding surprises of type conversions by loading
+everything except booleans as strings.
+
+Optionally you can check for duplicated map keys when loading YAML.
+
+Works with Python 3. Requires PyYAML 5.x or higher.
+
+license: apache-2.0
+homepage_url: https://github.com/nexB/saneyaml
+
+Usage::
+
+    pip install saneyaml
+    
+    >>> from  saneyaml import load
+    >>> from  saneyaml import dump
+    >>> a=load('''version: 3.0.0.dev6
+    ... 
+    ... description: |
+    ...     AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+    ...     provides a way to document a software component.
+    ... ''')
+    >>> a
+    dict([
+        (u'version', u'3.0.0.dev6'), 
+        (u'description', u'AboutCode Toolkit is a tool to process ABOUT files. '
+        'An ABOUT file\nprovides a way to document a software component.\n')])
+    
+    >>> pprint(a.items())
+    [(u'version', u'3.0.0.dev6'),
+     (u'description',
+      u'AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file\nprovides a way to document a software component.\n')]
+    >>> print(dump(a))
+    version: 3.0.0.dev6
+    description: |
+      AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+      provides a way to document a software component.
diff --git a/README.rst b/README.rst
index 24d1097..8da7780 100644
--- a/README.rst
+++ b/README.rst
@@ -8,38 +8,38 @@ write readable YAML safely, typically when used with configuration files.
 With saneyaml you can dump readable and clean YAML and load safely any YAML
 preserving ordering and avoiding surprises of type conversions by loading
 everything except booleans as strings.
+
 Optionally you can check for duplicated map keys when loading YAML.
 
-Works with Python 2 and 3. Requires PyYAML.
+Works with Python 3. Requires PyYAML 5.x or higher.
 
-License: apache-2.0
-Homepage_url: https://github.com/nexB/saneyaml
+license: apache-2.0
+homepage_url: https://github.com/nexB/saneyaml
 
 Usage::
 
-pip install saneyaml
-
->>> from  saneyaml import load as l
->>> from  saneyaml import dump as d
->>> a=l('''version: 3.0.0.dev6
-... 
-... description: |
-...     AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
-...     provides a way to document a software component.
-... ''')
->>> a
-OrderedDict([
-    (u'version', u'3.0.0.dev6'), 
-    (u'description', u'AboutCode Toolkit is a tool to process ABOUT files. '
-    'An ABOUT file\nprovides a way to document a software component.\n')])
-
->>> pprint(a.items())
-[(u'version', u'3.0.0.dev6'),
- (u'description',
-  u'AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file\nprovides a way to document a software component.\n')]
->>> print(d(a))
-version: 3.0.0.dev6
-description: |
-  AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
-  provides a way to document a software component.
-
+    pip install saneyaml
+    
+    >>> from  saneyaml import load
+    >>> from  saneyaml import dump
+    >>> a=load('''version: 3.0.0.dev6
+    ... 
+    ... description: |
+    ...     AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+    ...     provides a way to document a software component.
+    ... ''')
+    >>> a
+    dict([
+        (u'version', u'3.0.0.dev6'), 
+        (u'description', u'AboutCode Toolkit is a tool to process ABOUT files. '
+        'An ABOUT file\nprovides a way to document a software component.\n')])
+    
+    >>> pprint(a.items())
+    [(u'version', u'3.0.0.dev6'),
+     (u'description',
+      u'AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file\nprovides a way to document a software component.\n')]
+    >>> print(dump(a))
+    version: 3.0.0.dev6
+    description: |
+      AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+      provides a way to document a software component.
diff --git a/apache-2.0.LICENSE b/apache-2.0.LICENSE
index d9a10c0..261eeb9 100644
--- a/apache-2.0.LICENSE
+++ b/apache-2.0.LICENSE
@@ -174,3 +174,28 @@
       of your accepting any such warranty or additional liability.
 
    END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
diff --git a/appveyor.yml b/appveyor.yml
index ceb7ffe..b4b4a35 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -1,16 +1,12 @@
 version: '{build}'
 
-install:
-    - configure etc/conf/dev
-
 build: off
 
 test_script:
-    - set
+    - pip install pyyaml
+    - pip install -r requirements_dev.txt
+# test also on latest version
+    - py.test -vvs tests
+    - pip uninstall -y pyyaml
+    - pip install pyyaml
     - py.test -vvs tests
-
-on_success:
-  - "python etc/scripts/irc-notify.py aboutcode [{project_name}:{branch}] {short_commit}: \"{message}\" ({author}) {color_green}Succeeded,Details: {build_url},Commit: {commit_url}"
-
-on_failure:
-  - "python etc/scripts/irc-notify.py aboutcode [{project_name}:{branch}] {short_commit}: \"{message}\" ({author}) {color_red}Failed,Details: {build_url},Commit: {commit_url}"
diff --git a/azure-pipelines.yml b/azure-pipelines.yml
new file mode 100644
index 0000000..fc5a41e
--- /dev/null
+++ b/azure-pipelines.yml
@@ -0,0 +1,72 @@
+
+################################################################################
+# We use Azure to run the full tests suites on multiple Python 3.x
+# on multiple Windows, macOS and Linux versions all on 64 bits
+# These jobs are using VMs with Azure-provided Python builds
+################################################################################
+
+jobs:
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: ubuntu18_cpython
+          image_name: ubuntu-18.04
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: ubuntu20_cpython
+          image_name: ubuntu-20.04
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: ubuntu22_cpython
+          image_name: ubuntu-22.04
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: macos1015_cpython
+          image_name: macos-10.15
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: macos11_cpython
+          image_name: macos-11
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-posix.yml
+      parameters:
+          job_name: macos12_cpython
+          image_name: macos-12
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv/bin/pytest -n 2 -vvs
+
+    - template: etc/ci/azure-win.yml
+      parameters:
+          job_name: win2019_cpython
+          image_name: windows-2019
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv\Scripts\pytest -n 2 -vvs
+
+    - template: etc/ci/azure-win.yml
+      parameters:
+          job_name: win2022_cpython
+          image_name: windows-2022
+          python_versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+          test_suites:
+              all: venv\Scripts\pytest -n 2 -vvs
diff --git a/configure b/configure
new file mode 100755
index 0000000..926a894
--- /dev/null
+++ b/configure
@@ -0,0 +1,202 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/ for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+
+set -e
+#set -x
+
+################################
+# A configuration script to set things up:
+# create a virtualenv and install or update thirdparty packages.
+# Source this script for initial configuration
+# Use configure --help for details
+#
+# NOTE: please keep in sync with Windows script configure.bat
+#
+# This script will search for a virtualenv.pyz app in etc/thirdparty/virtualenv.pyz
+# Otherwise it will download the latest from the VIRTUALENV_PYZ_URL default
+################################
+CLI_ARGS=$1
+
+################################
+# Defaults. Change these variables to customize this script
+################################
+
+# Requirement arguments passed to pip and used by default or with --dev.
+REQUIREMENTS="--editable . --constraint requirements.txt"
+DEV_REQUIREMENTS="--editable .[testing] --constraint requirements.txt --constraint requirements-dev.txt"
+DOCS_REQUIREMENTS="--editable .[docs] --constraint requirements.txt"
+
+# where we create a virtualenv
+VIRTUALENV_DIR=venv
+
+# Cleanable files and directories to delete with the --clean option
+CLEANABLE="build dist venv .cache .eggs"
+
+# extra  arguments passed to pip
+PIP_EXTRA_ARGS=" "
+
+# the URL to download virtualenv.pyz if needed
+VIRTUALENV_PYZ_URL=https://bootstrap.pypa.io/virtualenv.pyz
+################################
+
+
+################################
+# Current directory where this script lives
+CFG_ROOT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+CFG_BIN_DIR=$CFG_ROOT_DIR/$VIRTUALENV_DIR/bin
+
+
+################################
+# Install with or without and index. With "--no-index" this is using only local wheels
+# This is an offline mode with no index and no network operations
+# NO_INDEX="--no-index "
+NO_INDEX=""
+
+
+################################
+# Thirdparty package locations and index handling
+# Find packages from the local thirdparty directory if present
+THIRDPARDIR=$CFG_ROOT_DIR/thirdparty
+if [[ "$(echo $THIRDPARDIR/*.whl)x" != "$THIRDPARDIR/*.whlx" ]]; then
+    PIP_EXTRA_ARGS="$NO_INDEX --find-links $THIRDPARDIR"
+fi
+
+
+################################
+# Set the quiet flag to empty if not defined
+if [[ "$CFG_QUIET" == "" ]]; then
+    CFG_QUIET=" "
+fi
+
+
+################################
+# Find a proper Python to run
+# Use environment variables or a file if available.
+# Otherwise the latest Python by default.
+find_python() {
+    if [[ "$PYTHON_EXECUTABLE" == "" ]]; then
+        # check for a file named PYTHON_EXECUTABLE
+        if [ -f "$CFG_ROOT_DIR/PYTHON_EXECUTABLE" ]; then
+            PYTHON_EXECUTABLE=$(cat "$CFG_ROOT_DIR/PYTHON_EXECUTABLE")
+        else
+            PYTHON_EXECUTABLE=python3
+        fi
+    fi
+}
+
+
+################################
+create_virtualenv() {
+    # create a virtualenv for Python
+    # Note: we do not use the bundled Python 3 "venv" because its behavior and
+    # presence is not consistent across Linux distro and sometimes pip is not
+    # included either by default. The virtualenv.pyz app cures all these issues.
+
+    VENV_DIR="$1"
+    if [ ! -f "$CFG_BIN_DIR/python" ]; then
+
+        mkdir -p "$CFG_ROOT_DIR/$VENV_DIR"
+
+        if [ -f "$CFG_ROOT_DIR/etc/thirdparty/virtualenv.pyz" ]; then
+            VIRTUALENV_PYZ="$CFG_ROOT_DIR/etc/thirdparty/virtualenv.pyz"
+        else
+            VIRTUALENV_PYZ="$CFG_ROOT_DIR/$VENV_DIR/virtualenv.pyz"
+            wget -O "$VIRTUALENV_PYZ" "$VIRTUALENV_PYZ_URL" 2>/dev/null || curl -o  "$VIRTUALENV_PYZ" "$VIRTUALENV_PYZ_URL"
+        fi
+
+        $PYTHON_EXECUTABLE "$VIRTUALENV_PYZ" \
+            --wheel embed --pip embed --setuptools embed \
+            --seeder pip \
+            --never-download \
+            --no-periodic-update \
+            --no-vcs-ignore \
+            $CFG_QUIET \
+            "$CFG_ROOT_DIR/$VENV_DIR"
+    fi
+}
+
+
+################################
+install_packages() {
+    # install requirements in virtualenv
+    # note: --no-build-isolation means that pip/wheel/setuptools will not
+    # be reinstalled a second time and reused from the virtualenv and this
+    # speeds up the installation.
+    # We always have the PEP517 build dependencies installed already.
+
+    "$CFG_BIN_DIR/pip" install \
+        --upgrade \
+        --no-build-isolation \
+        $CFG_QUIET \
+        $PIP_EXTRA_ARGS \
+        $1
+}
+
+
+################################
+cli_help() {
+    echo An initial configuration script
+    echo "  usage: ./configure [options]"
+    echo
+    echo The default is to configure for regular use. Use --dev for development.
+    echo
+    echo The options are:
+    echo " --clean: clean built and installed files and exit."
+    echo " --dev:   configure the environment for development."
+    echo " --help:  display this help message and exit."
+    echo
+    echo By default, the python interpreter version found in the path is used.
+    echo Alternatively, the PYTHON_EXECUTABLE environment variable can be set to
+    echo configure another Python executable interpreter to use. If this is not
+    echo set, a file named PYTHON_EXECUTABLE containing a single line with the
+    echo path of the Python executable to use will be checked last.
+    set +e
+    exit
+}
+
+
+################################
+clean() {
+    # Remove cleanable file and directories and files from the root dir.
+    echo "* Cleaning ..."
+    for cln in $CLEANABLE;
+        do rm -rf "${CFG_ROOT_DIR:?}/${cln:?}";
+    done
+    set +e
+    exit
+}
+
+
+################################
+# Main command line entry point
+CFG_REQUIREMENTS=$REQUIREMENTS
+
+# We are using getopts to parse option arguments that start with "-"
+while getopts :-: optchar; do
+    case "${optchar}" in
+        -)
+            case "${OPTARG}" in
+                help  ) cli_help;;
+                clean ) find_python && clean;;
+                dev   ) CFG_REQUIREMENTS="$DEV_REQUIREMENTS";;
+                docs   ) CFG_REQUIREMENTS="$DOCS_REQUIREMENTS";;
+            esac;;
+    esac
+done
+
+
+PIP_EXTRA_ARGS="$PIP_EXTRA_ARGS"
+
+find_python
+create_virtualenv "$VIRTUALENV_DIR"
+install_packages "$CFG_REQUIREMENTS"
+. "$CFG_BIN_DIR/activate"
+
+
+set +e
diff --git a/configure.bat b/configure.bat
new file mode 100644
index 0000000..5e95b31
--- /dev/null
+++ b/configure.bat
@@ -0,0 +1,207 @@
+@echo OFF
+@setlocal
+
+@rem Copyright (c) nexB Inc. and others. All rights reserved.
+@rem SPDX-License-Identifier: Apache-2.0
+@rem See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+@rem See https://github.com/nexB/ for support or download.
+@rem See https://aboutcode.org for more information about nexB OSS projects.
+
+
+@rem ################################
+@rem # A configuration script to set things up:
+@rem # create a virtualenv and install or update thirdparty packages.
+@rem # Source this script for initial configuration
+@rem # Use configure --help for details
+
+@rem # NOTE: please keep in sync with POSIX script configure
+
+@rem # This script will search for a virtualenv.pyz app in etc\thirdparty\virtualenv.pyz
+@rem # Otherwise it will download the latest from the VIRTUALENV_PYZ_URL default
+@rem ################################
+
+
+@rem ################################
+@rem # Defaults. Change these variables to customize this script
+@rem ################################
+
+@rem # Requirement arguments passed to pip and used by default or with --dev.
+set "REQUIREMENTS=--editable . --constraint requirements.txt"
+set "DEV_REQUIREMENTS=--editable .[testing] --constraint requirements.txt --constraint requirements-dev.txt"
+set "DOCS_REQUIREMENTS=--editable .[docs] --constraint requirements.txt"
+
+@rem # where we create a virtualenv
+set "VIRTUALENV_DIR=venv"
+
+@rem # Cleanable files and directories to delete with the --clean option
+set "CLEANABLE=build dist venv .cache .eggs"
+
+@rem # extra  arguments passed to pip
+set "PIP_EXTRA_ARGS= "
+
+@rem # the URL to download virtualenv.pyz if needed
+set VIRTUALENV_PYZ_URL=https://bootstrap.pypa.io/virtualenv.pyz
+@rem ################################
+
+
+@rem ################################
+@rem # Current directory where this script lives
+set CFG_ROOT_DIR=%~dp0
+set "CFG_BIN_DIR=%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\Scripts"
+
+
+@rem ################################
+@rem # Thirdparty package locations and index handling
+@rem # Find packages from the local thirdparty directory
+if exist "%CFG_ROOT_DIR%\thirdparty" (
+    set PIP_EXTRA_ARGS=--find-links "%CFG_ROOT_DIR%\thirdparty"
+)
+
+
+@rem ################################
+@rem # Set the quiet flag to empty if not defined
+if not defined CFG_QUIET (
+    set "CFG_QUIET= "
+)
+
+
+@rem ################################
+@rem # Main command line entry point
+set "CFG_REQUIREMENTS=%REQUIREMENTS%"
+
+:again
+if not "%1" == "" (
+    if "%1" EQU "--help"   (goto cli_help)
+    if "%1" EQU "--clean"  (goto clean)
+    if "%1" EQU "--dev"    (
+        set "CFG_REQUIREMENTS=%DEV_REQUIREMENTS%"
+    )
+    if "%1" EQU "--docs"    (
+        set "CFG_REQUIREMENTS=%DOCS_REQUIREMENTS%"
+    )
+    shift
+    goto again
+)
+
+set "PIP_EXTRA_ARGS=%PIP_EXTRA_ARGS%"
+
+
+@rem ################################
+@rem # Find a proper Python to run
+@rem # Use environment variables or a file if available.
+@rem # Otherwise the latest Python by default.
+if not defined PYTHON_EXECUTABLE (
+    @rem # check for a file named PYTHON_EXECUTABLE
+    if exist "%CFG_ROOT_DIR%\PYTHON_EXECUTABLE" (
+        set /p PYTHON_EXECUTABLE=<"%CFG_ROOT_DIR%\PYTHON_EXECUTABLE"
+    ) else (
+        set "PYTHON_EXECUTABLE=py"
+    )
+)
+
+
+@rem ################################
+:create_virtualenv
+@rem # create a virtualenv for Python
+@rem # Note: we do not use the bundled Python 3 "venv" because its behavior and
+@rem # presence is not consistent across Linux distro and sometimes pip is not
+@rem # included either by default. The virtualenv.pyz app cures all these issues.
+
+if not exist "%CFG_BIN_DIR%\python.exe" (
+    if not exist "%CFG_BIN_DIR%" (
+        mkdir "%CFG_BIN_DIR%"
+    )
+
+    if exist "%CFG_ROOT_DIR%\etc\thirdparty\virtualenv.pyz" (
+        %PYTHON_EXECUTABLE% "%CFG_ROOT_DIR%\etc\thirdparty\virtualenv.pyz" ^
+            --wheel embed --pip embed --setuptools embed ^
+            --seeder pip ^
+            --never-download ^
+            --no-periodic-update ^
+            --no-vcs-ignore ^
+            %CFG_QUIET% ^
+            "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%"
+    ) else (
+        if not exist "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\virtualenv.pyz" (
+            curl -o "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\virtualenv.pyz" %VIRTUALENV_PYZ_URL%
+
+            if errorlevel 1 (
+                exit /b 1
+            )
+        )
+        %PYTHON_EXECUTABLE% "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\virtualenv.pyz" ^
+            --wheel embed --pip embed --setuptools embed ^
+            --seeder pip ^
+            --never-download ^
+            --no-periodic-update ^
+            --no-vcs-ignore ^
+            %CFG_QUIET% ^
+            "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%"
+    )
+)
+
+if %ERRORLEVEL% neq 0 (
+    exit /b %ERRORLEVEL%
+)
+
+
+@rem ################################
+:install_packages
+@rem # install requirements in virtualenv
+@rem # note: --no-build-isolation means that pip/wheel/setuptools will not
+@rem # be reinstalled a second time and reused from the virtualenv and this
+@rem # speeds up the installation.
+@rem # We always have the PEP517 build dependencies installed already.
+
+"%CFG_BIN_DIR%\pip" install ^
+    --upgrade ^
+    --no-build-isolation ^
+    %CFG_QUIET% ^
+    %PIP_EXTRA_ARGS% ^
+    %CFG_REQUIREMENTS%
+
+
+@rem ################################
+:create_bin_junction
+@rem # Create junction to bin to have the same directory between linux and windows
+if exist "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\bin" (
+    rmdir /s /q "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\bin"
+)
+mklink /J "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\bin" "%CFG_ROOT_DIR%\%VIRTUALENV_DIR%\Scripts"
+
+if %ERRORLEVEL% neq 0 (
+    exit /b %ERRORLEVEL%
+)
+
+exit /b 0
+
+
+@rem ################################
+:cli_help
+    echo An initial configuration script
+    echo "  usage: configure [options]"
+    echo " "
+    echo The default is to configure for regular use. Use --dev for development.
+    echo " "
+    echo The options are:
+    echo " --clean: clean built and installed files and exit."
+    echo " --dev:   configure the environment for development."
+    echo " --help:  display this help message and exit."
+    echo " "
+    echo By default, the python interpreter version found in the path is used.
+    echo Alternatively, the PYTHON_EXECUTABLE environment variable can be set to
+    echo configure another Python executable interpreter to use. If this is not
+    echo set, a file named PYTHON_EXECUTABLE containing a single line with the
+    echo path of the Python executable to use will be checked last.
+    exit /b 0
+
+
+@rem ################################
+:clean
+@rem # Remove cleanable files and directories from the root dir.
+echo "* Cleaning ..."
+for %%F in (%CLEANABLE%) do (
+    rmdir /s /q "%CFG_ROOT_DIR%\%%F" >nul 2>&1
+    del /f /q "%CFG_ROOT_DIR%\%%F" >nul 2>&1
+)
+exit /b 0
diff --git a/debian/changelog b/debian/changelog
index 49d231e..ea8778c 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+python-saneyaml (0.6.0-1) UNRELEASED; urgency=low
+
+  * New upstream release.
+  * New upstream release.
+
+ -- Debian Janitor <janitor@jelmer.uk>  Tue, 06 Jun 2023 13:06:40 -0000
+
 python-saneyaml (0.3-2) unstable; urgency=medium
 
   * Source-only rebuild.
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 0000000..d0c3cbf
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = source
+BUILDDIR      = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/make.bat b/docs/make.bat
new file mode 100644
index 0000000..6247f7e
--- /dev/null
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=source
+set BUILDDIR=build
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+	echo.installed, then set the SPHINXBUILD environment variable to point
+	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+	echo.may add the Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.http://sphinx-doc.org/
+	exit /b 1
+)
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+:end
+popd
diff --git a/docs/scripts/doc8_style_check.sh b/docs/scripts/doc8_style_check.sh
new file mode 100644
index 0000000..9416323
--- /dev/null
+++ b/docs/scripts/doc8_style_check.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+# halt script on error
+set -e
+# Check for Style Code Violations
+doc8 --max-line-length 100 source --ignore D000 --quiet
\ No newline at end of file
diff --git a/docs/scripts/sphinx_build_link_check.sh b/docs/scripts/sphinx_build_link_check.sh
new file mode 100644
index 0000000..c542686
--- /dev/null
+++ b/docs/scripts/sphinx_build_link_check.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+# halt script on error
+set -e
+# Build locally, and then check links
+sphinx-build -E -W -b linkcheck source build
\ No newline at end of file
diff --git a/docs/source/_static/theme_overrides.css b/docs/source/_static/theme_overrides.css
new file mode 100644
index 0000000..9662d63
--- /dev/null
+++ b/docs/source/_static/theme_overrides.css
@@ -0,0 +1,353 @@
+body {
+    color: #000000;
+}
+
+p {
+    margin-bottom: 10px;
+}
+
+.wy-plain-list-disc, .rst-content .section ul, .rst-content .toctree-wrapper ul, article ul {
+    margin-bottom: 10px;
+}
+
+.custom_header_01 {
+    color: #cc0000;
+    font-size: 22px;
+    font-weight: bold;
+    line-height: 50px;
+}
+
+h1, h2, h3, h4, h5, h6 {
+    margin-bottom: 20px;
+    margin-top: 20px;
+}
+
+h5 {
+    font-size: 18px;
+    color: #000000;
+    font-style: italic;
+    margin-bottom: 10px;
+}
+
+h6 {
+    font-size: 15px;
+    color: #000000;
+    font-style: italic;
+    margin-bottom: 10px;
+}
+
+/* custom admonitions */
+/* success */
+.custom-admonition-success  .admonition-title {
+    color: #000000;
+    background: #ccffcc;
+    border-radius: 5px 5px 0px 0px;
+}
+div.custom-admonition-success.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #cccccc;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+/* important */
+.custom-admonition-important  .admonition-title {
+    color: #000000;
+    background: #ccffcc;
+    border-radius: 5px 5px 0px 0px;
+    border-bottom: solid 1px #000000;
+}
+div.custom-admonition-important.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #cccccc;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+/* caution */
+.custom-admonition-caution  .admonition-title {
+    color: #000000;
+    background: #ffff99;
+    border-radius: 5px 5px 0px 0px;
+    border-bottom: solid 1px #e8e8e8;
+}
+div.custom-admonition-caution.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #cccccc;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+/* note */
+.custom-admonition-note  .admonition-title {
+    color: #ffffff;
+    background: #006bb3;
+    border-radius: 5px 5px 0px 0px;
+}
+div.custom-admonition-note.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #cccccc;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+/* todo */
+.custom-admonition-todo  .admonition-title {
+    color: #000000;
+    background: #cce6ff;
+    border-radius: 5px 5px 0px 0px;
+    border-bottom: solid 1px #99ccff;
+}
+div.custom-admonition-todo.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #99ccff;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+/* examples */
+.custom-admonition-examples  .admonition-title {
+    color: #000000;
+    background: #ffe6cc;
+    border-radius: 5px 5px 0px 0px;
+    border-bottom: solid 1px #d8d8d8;
+}
+div.custom-admonition-examples.admonition {
+    color: #000000;
+    background: #ffffff;
+    border: solid 1px #cccccc;
+    border-radius: 5px;
+    box-shadow: 1px 1px 5px 3px #d8d8d8;
+    margin: 20px 0px 30px 0px;
+}
+
+.wy-nav-content {
+    max-width: 100%;
+    padding-right: 100px;
+    padding-left: 100px;
+    background-color: #f2f2f2;
+}
+
+div.rst-content {
+    background-color: #ffffff;
+    border: solid 1px #e5e5e5;
+    padding: 20px 40px 20px 40px;
+}
+
+.rst-content .guilabel {
+    border: 1px solid #ffff99;
+    background: #ffff99;
+    font-size: 100%;
+    font-weight: normal;
+    border-radius: 4px;
+    padding: 2px 0px;
+    margin: auto 2px;
+    vertical-align: middle;
+}
+
+.rst-content kbd {
+    font-family: SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;
+    border: solid 1px #d8d8d8;
+    background-color: #f5f5f5;
+    padding: 0px 3px;
+    border-radius: 3px;
+}
+
+.wy-nav-content-wrap a {
+    color: #0066cc;
+    text-decoration: none;
+}
+.wy-nav-content-wrap a:hover {
+    color: #0099cc;
+    text-decoration: underline;
+}
+
+.wy-nav-top a {
+    color: #ffffff;
+}
+
+/* Based on numerous similar approaches e.g., https://github.com/readthedocs/sphinx_rtd_theme/issues/117 and https://rackerlabs.github.io/docs-rackspace/tools/rtd-tables.html -- but remove form-factor limits to enable table wrap on full-size and smallest-size form factors */
+.wy-table-responsive table td {
+    white-space: normal !important;
+}
+
+.rst-content table.docutils td,
+.rst-content table.docutils th {
+    padding: 5px 10px 5px 10px;
+}
+.rst-content table.docutils td p,
+.rst-content table.docutils th p {
+    font-size: 14px;
+    margin-bottom: 0px;
+}
+.rst-content table.docutils td p cite,
+.rst-content table.docutils th p cite {
+    font-size: 14px;
+    background-color: transparent;
+}
+
+.colwidths-given th {
+    border: solid 1px #d8d8d8 !important;
+}
+.colwidths-given td {
+    border: solid 1px #d8d8d8 !important;
+}
+
+/*handles single-tick inline code*/
+.wy-body-for-nav cite {
+    color: #000000;
+    background-color: transparent;
+    font-style: normal;
+    font-family: "Courier New";
+    font-size: 13px;
+    padding: 3px 3px 3px 3px;
+}
+
+.rst-content pre.literal-block, .rst-content div[class^="highlight"] pre, .rst-content .linenodiv pre {
+    font-family: SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",Courier,monospace;
+    font-size: 13px;
+    overflow: visible;
+    white-space: pre-wrap;
+    color: #000000;
+}
+
+.rst-content pre.literal-block, .rst-content div[class^='highlight'] {
+    background-color: #f8f8f8;
+    border: solid 1px #e8e8e8;
+}
+
+/* This enables inline code to wrap. */
+code, .rst-content tt, .rst-content code {
+    white-space: pre-wrap;
+    padding: 2px 3px 1px;
+    border-radius: 3px;
+    font-size: 13px;
+    background-color: #ffffff;
+}
+
+/* use this added class for code blocks attached to bulleted list items */
+.highlight-top-margin {
+    margin-top: 20px !important;
+}
+
+/* change color of inline code block */
+span.pre {
+    color: #e01e5a;
+}
+
+.wy-body-for-nav blockquote {
+    margin: 1em 0;
+    padding-left: 1em;
+    border-left: 4px solid #ddd;
+    color: #000000;
+}
+
+/* Fix the unwanted top and bottom padding inside a nested bulleted/numbered list */
+.rst-content .section ol p, .rst-content .section ul p {
+    margin-bottom: 0px;
+}
+
+/* add spacing between bullets for legibility */
+.rst-content .section ol li, .rst-content .section ul li {
+    margin-bottom: 5px;
+}
+
+.rst-content .section ol li:first-child, .rst-content .section ul li:first-child {
+    margin-top: 5px;
+}
+
+/* but exclude the toctree bullets */
+.rst-content .toctree-wrapper ul li, .rst-content .toctree-wrapper ul li:first-child {
+    margin-top: 0px;
+    margin-bottom: 0px;
+}
+
+/* remove extra space at bottom of multiline list-table cell */
+.rst-content .line-block {
+    margin-left: 0px;
+    margin-bottom: 0px;
+    line-height: 24px;
+}
+
+/* fix extra vertical spacing in page toctree */
+.rst-content .toctree-wrapper ul li ul, article ul li ul {
+    margin-top: 0;
+    margin-bottom: 0;
+}
+
+/* this is used by the genindex added via layout.html (see source/_templates/) to sidebar toc */
+.reference.internal.toc-index {
+    color: #d9d9d9;
+}
+
+.reference.internal.toc-index.current {
+    background-color: #ffffff;
+    color: #000000;
+    font-weight: bold;
+}
+
+.toc-index-div {
+    border-top: solid 1px #000000;
+    margin-top: 10px;
+    padding-top: 5px;
+}
+
+.indextable ul li {
+    font-size: 14px;
+    margin-bottom: 5px;
+}
+
+/* The next 2 fix the poor vertical spacing in genindex.html (the alphabetized index) */
+.indextable.genindextable {
+    margin-bottom: 20px;
+}
+
+div.genindex-jumpbox {
+    margin-bottom: 10px;
+}
+
+/* rst image classes */
+
+.clear-both {
+    clear: both;
+  }
+
+.float-left {
+    float: left;
+    margin-right: 20px;
+}
+
+img {
+    border: solid 1px #e8e8e8;
+}
+
+/* These are custom and need to be defined in conf.py to access in all pages, e.g., '.. role:: red' */
+.img-title {
+    color: #000000;
+    /* neither padding nor margin works for vertical spacing bc it's a span -- line-height does, sort of */
+    line-height: 3.0;
+    font-style: italic;
+    font-weight: 600;
+}
+
+.img-title-para {
+    color: #000000;
+    margin-top: 20px;
+    margin-bottom: 0px;
+    font-style: italic;
+    font-weight: 500;
+}
+
+.red {
+    color: red;
+}
diff --git a/docs/source/conf.py b/docs/source/conf.py
new file mode 100644
index 0000000..d5435e7
--- /dev/null
+++ b/docs/source/conf.py
@@ -0,0 +1,97 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+# import os
+# import sys
+# sys.path.insert(0, os.path.abspath('.'))
+
+
+# -- Project information -----------------------------------------------------
+
+project = "nexb-skeleton"
+copyright = "nexB Inc. and others."
+author = "AboutCode.org authors and contributors"
+
+
+# -- General configuration ---------------------------------------------------
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+    "sphinx.ext.intersphinx",
+]
+
+# This points to aboutcode.readthedocs.io
+# In case of "undefined label" ERRORS check docs on intersphinx to troubleshoot
+# Link was created at commit - https://github.com/nexB/aboutcode/commit/faea9fcf3248f8f198844fe34d43833224ac4a83
+
+intersphinx_mapping = {
+    "aboutcode": ("https://aboutcode.readthedocs.io/en/latest/", None),
+    "scancode-workbench": ("https://scancode-workbench.readthedocs.io/en/develop/", None),
+}
+
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ["_templates"]
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This pattern also affects html_static_path and html_extra_path.
+exclude_patterns = []
+
+
+# -- Options for HTML output -------------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = "sphinx_rtd_theme"
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ["_static"]
+
+master_doc = "index"
+
+html_context = {
+    "display_github": True,
+    "github_user": "nexB",
+    "github_repo": "nexb-skeleton",
+    "github_version": "develop",  # branch
+    "conf_py_path": "/docs/source/",  # path in the checkout to the docs root
+}
+
+html_css_files = ["theme_overrides.css"]  # relative to html_static_path
+
+
+# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
+html_show_sphinx = True
+
+# Define CSS and HTML abbreviations used in .rst files.  These are examples.
+# .. role:: is used to refer to styles defined in _static/theme_overrides.css and is used like this: :red:`text`
+rst_prolog = """
+.. |psf| replace:: Python Software Foundation
+
+.. # define a hard line break for HTML
+.. |br| raw:: html
+
+   <br />
+
+.. role:: red
+
+.. role:: img-title
+
+.. role:: img-title-para
+
+"""
diff --git a/docs/source/contribute/contrib_doc.rst b/docs/source/contribute/contrib_doc.rst
new file mode 100644
index 0000000..13882e1
--- /dev/null
+++ b/docs/source/contribute/contrib_doc.rst
@@ -0,0 +1,314 @@
+.. _contrib_doc_dev:
+
+Contributing to the Documentation
+=================================
+
+.. _contrib_doc_setup_local:
+
+Setup Local Build
+-----------------
+
+To get started, create or identify a working directory on your local machine.
+
+Open that directory and execute the following command in a terminal session::
+
+    git clone https://github.com/nexB/skeleton.git
+
+That will create a ``skeleton`` directory in your working directory.
+Now you can install the dependencies in a virtualenv::
+
+    cd skeleton
+    ./configure --docs
+
+.. note::
+
+    On Windows, run ``configure --docs`` instead.
+
+Now, this will install the following prerequisites:
+
+- Sphinx
+- sphinx_rtd_theme (the format theme used by ReadTheDocs)
+- doc8 (style linter)
+
+These requirements are already listed in setup.cfg, and ``./configure --docs`` installs them.
+
+Now you can build the HTML documents locally::
+
+    source venv/bin/activate
+    cd docs
+    make html
+
+Assuming that your Sphinx installation was successful, Sphinx should build a local instance of the
+documentation .html files::
+
+    open build/html/index.html
+
+.. note::
+
+    In case this command did not work, for example on Ubuntu 18.04 you may get a message like “Couldn’t
+    get a file descriptor referring to the console”, try:
+
+    ::
+
+        see build/html/index.html
+
+You now have a local build of the AboutCode documents.
+
+.. _contrib_doc_share_improvements:
+
+Share Document Improvements
+---------------------------
+
+Ensure that you have the latest files::
+
+    git pull
+    git status
+
+Before committing changes, run the Continuous Integration scripts locally. Refer to
+:ref:`doc_ci` for instructions.
+
+Follow standard git procedures to upload your new and modified files. The following commands are
+examples::
+
+    git status
+    git add source/index.rst
+    git add source/how-to-scan.rst
+    git status
+    git commit -m "New how-to document that explains how to scan"
+    git status
+    git push
+    git status
+
+The Scancode-Toolkit webhook with ReadTheDocs should rebuild the documentation after your
+Pull Request is Merged.
+
+Refer to the `Pro Git Book <https://git-scm.com/book/en/v2/>`_ available online for Git tutorials
+covering more complex topics on Branching, Merging, Rebasing etc.
+
+.. _doc_ci:
+
+Continuous Integration
+----------------------
+
+The documentation is checked on every new commit through Travis-CI, so that common errors are
+avoided and documentation standards are enforced. Travis-CI presently checks these 3 aspects
+of the documentation:
+
+1. Successful Builds (By using ``sphinx-build``)
+2. No Broken Links   (By Using ``linkcheck``)
+3. Linting Errors    (By Using ``Doc8``)
+
+So run these scripts at your local system before creating a Pull Request::
+
+    cd docs
+    ./scripts/sphinx_build_link_check.sh
+    ./scripts/doc8_style_check.sh
+
+If you don't have permission to run the scripts, run::
+
+    chmod u+x ./scripts/doc8_style_check.sh
+
+.. _doc_style_docs8:
+
+Style Checks Using ``Doc8``
+---------------------------
+
+How To Run Style Tests
+^^^^^^^^^^^^^^^^^^^^^^
+
+In the project root, run the following commands::
+
+    $ cd docs
+    $ ./scripts/doc8_style_check.sh
+
+A sample output is::
+
+    Scanning...
+    Validating...
+    docs/source/misc/licence_policy_plugin.rst:37: D002 Trailing whitespace
+    docs/source/misc/faq.rst:45: D003 Tabulation used for indentation
+    docs/source/misc/faq.rst:9: D001 Line too long
+    docs/source/misc/support.rst:6: D005 No newline at end of file
+    ========
+    Total files scanned = 34
+    Total files ignored = 0
+    Total accumulated errors = 326
+    Detailed error counts:
+        - CheckCarriageReturn = 0
+        - CheckIndentationNoTab = 75
+        - CheckMaxLineLength = 190
+        - CheckNewlineEndOfFile = 13
+        - CheckTrailingWhitespace = 47
+        - CheckValidity = 1
+
+Fix the errors and rerun the check until no style errors remain in the documentation.
+
+What is Checked?
+^^^^^^^^^^^^^^^^
+
+PyCQA is an organization for code quality tools (and plugins) for the Python programming language.
+Doc8 is a sub-project of the same organization. Refer to this `README <https://github.com/PyCQA/doc8/blob/master/README.rst>`_ for more details.
+
+What is checked:
+
+    - invalid rst format - D000
+    - lines should not be longer than 100 characters - D001
+
+        - RST exception: line with no whitespace except in the beginning
+        - RST exception: lines with http or https URLs
+        - RST exception: literal blocks
+        - RST exception: rst target directives
+
+    - no trailing whitespace - D002
+    - no tabulation for indentation - D003
+    - no carriage returns (use UNIX newlines) - D004
+    - no newline at end of file - D005
+
+.. _doc_interspinx:
+
+Intersphinx
+-----------
+
+ScanCode toolkit documentation uses `Intersphinx <http://www.sphinx-doc.org/en/master/usage/extensions/intersphinx.html>`_
+to link to other Sphinx documentation and to maintain links to other AboutCode projects.
+
+To link sections in the same documentation, standard reST labels are used. Refer to
+`Cross-Referencing <http://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#ref-role>`_ for more information.
+
+For example::
+
+    .. _my-reference-label:
+
+    Section to cross-reference
+    --------------------------
+
+    This is the text of the section.
+
+    It refers to the section itself, see :ref:`my-reference-label`.
+
+Now, using Intersphinx, you can create these labels in one Sphinx documentation and then reference
+these labels from another Sphinx documentation hosted in a different location.
+
+You just have to add the following in the ``conf.py`` file for your Sphinx Documentation, where you
+want to add the links::
+
+    extensions = [
+    'sphinx.ext.intersphinx'
+    ]
+
+    intersphinx_mapping = {'aboutcode': ('https://aboutcode.readthedocs.io/en/latest/', None)}
+
+To show all Intersphinx links and their targets of an Intersphinx mapping file, run::
+
+    python -msphinx.ext.intersphinx https://aboutcode.readthedocs.io/en/latest/objects.inv
+
+.. WARNING::
+
+    ``python -msphinx.ext.intersphinx https://aboutcode.readthedocs.io/objects.inv`` will give
+    an error.
+
+This enables you to create links to the ``aboutcode`` Documentation in your own Documentation,
+where you modified the configuration file. Links can be added like this::
+
+    For more details refer :ref:`aboutcode:doc_style_guide`.
+
+You can also omit the ``aboutcode`` prefix assigned to all links from aboutcode.readthedocs.io,
+if no label with the same name exists in your own Sphinx documentation. Example::
+
+    For more details refer :ref:`doc_style_guide`.
+
+If you have a label in your documentation which is also present in the documentation linked by
+Intersphinx, and you link to that label, it will create a link to the local label.
+
+For more information, refer to this tutorial named
+`Using Intersphinx <https://my-favorite-documentation-test.readthedocs.io/en/latest/using_intersphinx.html>`_.
+
+.. _doc_style_conv:
+
+Style Conventions for the Documentation
+---------------------------------------
+
+1. Headings
+
+    (`Refer <http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#sections>`_)
+    Normally, there are no heading levels assigned to certain characters as the structure is
+    determined from the succession of headings. However, this convention is used in Python’s Style
+    Guide for documenting which you may follow:
+
+    # with overline, for parts
+
+    * with overline, for chapters
+
+    =, for sections
+
+    -, for subsections
+
+    ^, for sub-subsections
+
+    ", for paragraphs
+
+2. Heading Underlines
+
+    Do not use underlines that are longer/shorter than the title headline itself. As in:
+
+    ::
+
+        Correct :
+
+        Extra Style Checks
+        ------------------
+
+        Incorrect :
+
+        Extra Style Checks
+        ------------------------
+
+.. note::
+
+    Underlines shorter than the title text generate errors on sphinx-build.
+
+
+3. Internal Links
+
+    Using ``:ref:`` is advised over standard reStructuredText links to sections (like
+    ```Section title`_``) because it works across files, when section headings are changed, will
+    raise warnings if incorrect, and works for all builders that support cross-references.
+    However, external links are created by using the standard ```Section title`_`` method.
+
+4. Eliminate Redundancy
+
+    If a section/file has to be repeated somewhere else, do not write the exact same section/file
+    twice. Use ``.. include:: ../README.rst`` instead. Here, ``../`` refers to the documentation
+    root, so file location can be used accordingly. This enables us to link documents from other
+    upstream folders.
+
+5. Using ``:ref:`` only when necessary
+
+    Use ``:ref:`` to create internal links only when needed, i.e. it is referenced somewhere.
+    Do not create references for all the sections and then only reference some of them, because
+    this creates unnecessary references. This also generates an ERROR in ``restructuredtext-lint``.
+
+6. Spelling
+
+    You should check for spelling errors before you push changes. `Aspell <http://aspell.net/>`_
+    is a GNU project Command Line tool you can use for this purpose. Download and install Aspell,
+    then execute ``aspell check <file-name>`` for all the files changed. Be careful about not
+    changing commands or other stuff as Aspell gives prompts for a lot of them. Also delete the
+    temporary ``.bak`` files generated. Refer to the `manual <http://aspell.net/man-html/>`_ for
+    more information on how to use it.
+
+7. Notes and Warning Snippets
+
+    All ``Note`` and ``Warning`` sections are to be kept in ``rst_snippets/note_snippets/`` and
+    ``rst_snippets/warning_snippets/`` and then included to eliminate redundancy, as these are
+    frequently used in multiple files.
+
+Converting from Markdown
+------------------------
+
+If you want to convert a ``.md`` file to a ``.rst`` file, this `tool <https://github.com/chrissimpkins/md2rst>`_
+does it pretty well. You'd still have to clean up and check for errors, as the output can contain
+a lot of bugs. But this is definitely better than converting everything by yourself.
+
+This will be helpful in converting GitHub wikis (Markdown files) to reStructuredText files for
+Sphinx/ReadTheDocs hosting.
diff --git a/docs/source/index.rst b/docs/source/index.rst
new file mode 100644
index 0000000..eb63717
--- /dev/null
+++ b/docs/source/index.rst
@@ -0,0 +1,16 @@
+Welcome to nexb-skeleton's documentation!
+=========================================
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+   skeleton-usage
+   contribute/contrib_doc
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
diff --git a/docs/source/skeleton-usage.rst b/docs/source/skeleton-usage.rst
new file mode 100644
index 0000000..cde23dc
--- /dev/null
+++ b/docs/source/skeleton-usage.rst
@@ -0,0 +1,160 @@
+Usage
+=====
+A brand new project
+-------------------
+.. code-block:: bash
+
+    git init my-new-repo
+    cd my-new-repo
+    git pull git@github.com:nexB/skeleton
+
+    # Create the new repo on GitHub, then update your remote
+    git remote set-url origin git@github.com:nexB/your-new-repo.git
+
+From here, you can make the appropriate changes to the files for your specific project.
+
+Update an existing project
+--------------------------
+.. code-block:: bash
+
+    cd my-existing-project
+    git remote add skeleton git@github.com:nexB/skeleton
+    git fetch skeleton
+    git merge skeleton/main --allow-unrelated-histories
+
+This is also the workflow to use when updating the skeleton files in any given repository.
+
+Customizing
+-----------
+
+You typically want to perform these customizations:
+
+- remove or update the src/README.rst and tests/README.rst files
+- set project info and dependencies in setup.cfg
+- check the configure and configure.bat defaults
+
+Initializing a project
+----------------------
+
+All projects using the skeleton will be expected to pull all of their dependencies
+from thirdparty.aboutcode.org/pypi or the local thirdparty directory, using
+requirements.txt and/or requirements-dev.txt to determine what version of a
+package to collect. By default, PyPI will not be used to find and collect
+packages.
+
+In the case where we are starting a new project where we do not have
+requirements.txt and requirements-dev.txt and whose dependencies are not yet on
+thirdparty.aboutcode.org/pypi, we run the following command after adding and
+customizing the skeleton files to your project:
+
+.. code-block:: bash
+
+    ./configure
+
+This will initialize the virtual environment for the project, pull in the
+dependencies from PyPI and add them to the virtual environment.
+
+
+Generating requirements.txt and requirements-dev.txt
+----------------------------------------------------
+
+After the project has been initialized, we can generate the requirements.txt and
+requirements-dev.txt files.
+
+Ensure the virtual environment is enabled.
+
+.. code-block:: bash
+
+    source venv/bin/activate
+
+To generate requirements.txt:
+
+.. code-block:: bash
+
+    python etc/scripts/gen_requirements.py -s venv/lib/python<version>/site-packages/
+
+Replace \<version\> with the version number of the Python being used, for example:
+``venv/lib/python3.6/site-packages/``
+
+To generate requirements-dev.txt after requirements.txt has been generated:
+
+.. code-block:: bash
+
+    ./configure --dev
+    python etc/scripts/gen_requirements_dev.py -s venv/lib/python<version>/site-packages/
+
+Note: on Windows, the ``site-packages`` directory is located at ``venv\Lib\site-packages\``
+
+.. code-block:: bash
+
+    python .\\etc\\scripts\\gen_requirements.py -s .\\venv\\Lib\\site-packages\\
+    .\configure --dev
+    python .\\etc\\scripts\\gen_requirements_dev.py -s .\\venv\\Lib\\site-packages\\
+
+
+Collecting and generating ABOUT files for dependencies
+------------------------------------------------------
+
+Ensure that the dependencies used by ``etc/scripts/fetch_thirdparty.py`` are installed:
+
+.. code-block:: bash
+
+    pip install -r etc/scripts/requirements.txt
+
+Once we have requirements.txt and requirements-dev.txt, we can fetch the project
+dependencies as wheels and generate ABOUT files for them:
+
+.. code-block:: bash
+
+    python etc/scripts/fetch_thirdparty.py -r requirements.txt -r requirements-dev.txt
+
+There may be issues with the generated ABOUT files, which will have to be
+corrected. You can check to see if your corrections are valid by running:
+
+.. code-block:: bash
+
+    python etc/scripts/check_thirdparty.py -d thirdparty
+
+Once the wheels are collected and the ABOUT files are generated and correct,
+upload them to thirdparty.aboutcode.org/pypi by placing the wheels and ABOUT
+files from the thirdparty directory to the pypi directory at
+https://github.com/nexB/thirdparty-packages
+
+
+Usage after project initialization
+----------------------------------
+
+Once the ``requirements.txt`` and ``requirements-dev.txt`` have been generated
+and the project dependencies and their ABOUT files have been uploaded to
+thirdparty.aboutcode.org/pypi, you can configure the project as needed, typically
+when you update dependencies or use a new checkout.
+
+If the virtual env for the project becomes polluted, or you would like to remove
+it, use the ``--clean`` option:
+
+.. code-block:: bash
+
+    ./configure --clean
+
+Then you can run ``./configure`` again to set up the project virtual environment.
+
+To set up the project for development use:
+
+.. code-block:: bash
+
+    ./configure --dev
+
+To update the project dependencies (adding, removing, updating packages, etc.),
+update the dependencies in ``setup.cfg``, then run:
+
+.. code-block:: bash
+
+    ./configure --clean # Remove existing virtual environment
+    source venv/bin/activate # Ensure virtual environment is activated
+    python etc/scripts/gen_requirements.py -s venv/lib/python<version>/site-packages/ # Regenerate requirements.txt
+    python etc/scripts/gen_requirements_dev.py -s venv/lib/python<version>/site-packages/ # Regenerate requirements-dev.txt
+    pip install -r etc/scripts/requirements.txt # Install dependencies needed by etc/scripts/bootstrap.py
+    python etc/scripts/fetch_thirdparty.py -r requirements.txt -r requirements-dev.txt # Collect dependency wheels and their ABOUT files
+
+Ensure that the generated ABOUT files are valid, then take the dependency wheels
+and ABOUT files and upload them to thirdparty.aboutcode.org/pypi.
diff --git a/etc/ci/azure-container-deb.yml b/etc/ci/azure-container-deb.yml
new file mode 100644
index 0000000..85b611d
--- /dev/null
+++ b/etc/ci/azure-container-deb.yml
@@ -0,0 +1,50 @@
+parameters:
+    job_name: ''
+    container: ''
+    python_path: ''
+    python_version: ''
+    package_manager: apt-get
+    install_python: ''
+    install_packages: |
+        set -e -x
+        sudo apt-get -y update
+        sudo apt-get -y install \
+            build-essential \
+            xz-utils zlib1g bzip2 libbz2-1.0 tar \
+            sqlite3 libxml2-dev libxslt1-dev \
+            software-properties-common openssl
+    test_suite: ''
+    test_suite_label: ''
+
+
+jobs:
+    - job: ${{ parameters.job_name }}
+
+      pool:
+          vmImage: 'ubuntu-16.04'
+
+      container:
+          image: ${{ parameters.container }}
+          options: '--name ${{ parameters.job_name }} -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 -v /usr/bin/docker:/tmp/docker:ro'
+
+      steps:
+          - checkout: self
+            fetchDepth: 10
+
+          - script: /tmp/docker exec -t -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 -u 0 ${{ parameters.job_name }} $(Build.SourcesDirectory)/etc/ci/install_sudo.sh ${{ parameters.package_manager }}
+            displayName: Install sudo
+
+          - script: ${{ parameters.install_packages }}
+            displayName: Install required packages
+
+          - script: ${{ parameters.install_python }}
+            displayName: 'Install Python ${{ parameters.python_version }}'
+
+          - script: ${{ parameters.python_path }} --version
+            displayName: 'Show Python version'
+
+          - script: PYTHON_EXE=${{ parameters.python_path }} ./configure --dev
+            displayName: 'Run Configure'
+
+          - script: ${{ parameters.test_suite }}
+            displayName: 'Run ${{ parameters.test_suite_label }} tests with py${{ parameters.python_version }} on ${{ parameters.job_name }}'
diff --git a/etc/ci/azure-container-rpm.yml b/etc/ci/azure-container-rpm.yml
new file mode 100644
index 0000000..1e6657d
--- /dev/null
+++ b/etc/ci/azure-container-rpm.yml
@@ -0,0 +1,51 @@
+parameters:
+    job_name: ''
+    image_name: 'ubuntu-16.04'
+    container: ''
+    python_path: ''
+    python_version: ''
+    package_manager: yum
+    install_python: ''
+    install_packages: |
+        set -e -x
+        sudo yum groupinstall -y "Development Tools"
+        sudo yum install -y \
+            openssl openssl-devel \
+            sqlite-devel zlib-devel xz-devel bzip2-devel \
+            bzip2 tar unzip zip \
+            libxml2-devel libxslt-devel
+    test_suite: ''
+    test_suite_label: ''
+
+
+jobs:
+    - job: ${{ parameters.job_name }}
+
+      pool:
+          vmImage: ${{ parameters.image_name }}
+
+      container:
+          image: ${{ parameters.container }}
+          options: '--name ${{ parameters.job_name }} -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 -v /usr/bin/docker:/tmp/docker:ro'
+
+      steps:
+          - checkout: self
+            fetchDepth: 10
+
+          - script: /tmp/docker exec -t -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 -u 0 ${{ parameters.job_name }} $(Build.SourcesDirectory)/etc/ci/install_sudo.sh ${{ parameters.package_manager }}
+            displayName: Install sudo
+
+          - script: ${{ parameters.install_packages }}
+            displayName: Install required packages
+
+          - script: ${{ parameters.install_python }}
+            displayName: 'Install Python ${{ parameters.python_version }}'
+
+          - script: ${{ parameters.python_path }} --version
+            displayName: 'Show Python version'
+
+          - script: PYTHON_EXE=${{ parameters.python_path }} ./configure --dev
+            displayName: 'Run Configure'
+
+          - script: ${{ parameters.test_suite }}
+            displayName: 'Run ${{ parameters.test_suite_label }} tests with py${{ parameters.python_version }} on ${{ parameters.job_name }}'
diff --git a/etc/ci/azure-posix.yml b/etc/ci/azure-posix.yml
new file mode 100644
index 0000000..9fdc7f1
--- /dev/null
+++ b/etc/ci/azure-posix.yml
@@ -0,0 +1,39 @@
+parameters:
+    job_name: ''
+    image_name: ''
+    python_versions: []
+    test_suites: {}
+    python_architecture: x64
+
+jobs:
+    - job: ${{ parameters.job_name }}
+
+      pool:
+          vmImage: ${{ parameters.image_name }}
+
+      strategy:
+          matrix:
+              ${{ each tsuite in parameters.test_suites }}:
+                 ${{ tsuite.key }}:
+                     test_suite_label: ${{ tsuite.key }}
+                     test_suite: ${{ tsuite.value }}
+
+      steps:
+          - checkout: self
+            fetchDepth: 10
+
+          - ${{ each pyver in parameters.python_versions }}:
+              - task: UsePythonVersion@0
+                inputs:
+                    versionSpec: '${{ pyver }}'
+                    architecture: '${{ parameters.python_architecture }}'
+                displayName: '${{ pyver }} - Install Python'
+
+              - script: |
+                    python${{ pyver }} --version
+                    echo "python${{ pyver }}" > PYTHON_EXECUTABLE
+                    ./configure --clean && ./configure --dev
+                displayName: '${{ pyver }} - Configure'
+
+              - script: $(test_suite)
+                displayName: '${{ pyver }} - $(test_suite_label) on ${{ parameters.job_name }}'
diff --git a/etc/ci/azure-win.yml b/etc/ci/azure-win.yml
new file mode 100644
index 0000000..26b4111
--- /dev/null
+++ b/etc/ci/azure-win.yml
@@ -0,0 +1,39 @@
+parameters:
+    job_name: ''
+    image_name: ''
+    python_versions: []
+    test_suites: {}
+    python_architecture: x64
+
+jobs:
+    - job: ${{ parameters.job_name }}
+
+      pool:
+          vmImage: ${{ parameters.image_name }}
+
+      strategy:
+          matrix:
+              ${{ each tsuite in parameters.test_suites }}:
+                 ${{ tsuite.key }}:
+                     test_suite_label: ${{ tsuite.key }}
+                     test_suite: ${{ tsuite.value }}
+
+      steps:
+          - checkout: self
+            fetchDepth: 10
+
+          - ${{ each pyver in parameters.python_versions }}:
+              - task: UsePythonVersion@0
+                inputs:
+                    versionSpec: '${{ pyver }}'
+                    architecture: '${{ parameters.python_architecture }}'
+                displayName: '${{ pyver }} - Install Python'
+
+              - script: |
+                   python --version
+                   echo | set /p=python> PYTHON_EXECUTABLE
+                   configure --clean && configure --dev
+                displayName: '${{ pyver }} - Configure'
+
+              - script: $(test_suite)
+                displayName: '${{ pyver }} - $(test_suite_label) on ${{ parameters.job_name }}'
diff --git a/etc/ci/install_sudo.sh b/etc/ci/install_sudo.sh
new file mode 100644
index 0000000..77f4210
--- /dev/null
+++ b/etc/ci/install_sudo.sh
@@ -0,0 +1,15 @@
+#!/bin/bash
+set -e
+
+
+if [[ "$1" == "apt-get" ]]; then
+    apt-get update -y
+    apt-get -o DPkg::Options::="--force-confold" install -y sudo
+
+elif [[ "$1" == "yum" ]]; then
+    yum install -y sudo
+
+elif [[ "$1" == "dnf" ]]; then
+    dnf install -y sudo
+
+fi
diff --git a/etc/ci/macports-ci b/etc/ci/macports-ci
new file mode 100644
index 0000000..ac474e4
--- /dev/null
+++ b/etc/ci/macports-ci
@@ -0,0 +1,304 @@
+#! /bin/bash
+
+# Copyright (c) 2019 Giovanni Bussi
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+export COLUMNS=80
+
+if [ "$GITHUB_ACTIONS" = true ] ; then
+    echo "COLUMNS=$COLUMNS" >> "$GITHUB_ENV"
+fi
+
+# file to be source at the end of subshell:
+export MACPORTS_CI_SOURCEME="$(mktemp)"
+
+(
+# start subshell
+# this allows to use the script in two ways:
+# 1. as ./macports-ci
+# 2. as source ./macports-ci
+# as of now, choice 2 only changes the env var COLUMNS.
+
+MACPORTS_VERSION=2.6.4
+MACPORTS_PREFIX=/opt/local
+MACPORTS_SYNC=tarball
+
+action=$1
+shift
+
+case "$action" in
+(install)
+
+echo "macports-ci: install"
+
+KEEP_BREW=yes
+
+for opt
+do
+  case "$opt" in
+  (--source) SOURCE=yes ;;
+  (--binary) SOURCE=no ;;
+  (--keep-brew) KEEP_BREW=yes ;;
+  (--remove-brew) KEEP_BREW=no ;;
+  (--version=*) MACPORTS_VERSION="${opt#--version=}" ;;
+  (--prefix=*)  MACPORTS_PREFIX="${opt#--prefix=}" ;;
+  (--sync=*)    MACPORTS_SYNC="${opt#--sync=}" ;;
+  (*) echo "macports-ci: unknown option $opt"
+      exit 1 ;;
+  esac
+done
+
+if test "$KEEP_BREW" = no ; then
+  echo "macports-ci: removing homebrew"
+  pushd "$(mktemp -d)"
+  curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/uninstall > uninstall
+  chmod +x uninstall
+  ./uninstall --force
+  popd
+else
+  echo "macports-ci: keeping HomeBrew"
+fi
+
+echo "macports-ci: prefix=$MACPORTS_PREFIX"
+
+if test "$MACPORTS_PREFIX" != /opt/local ; then
+  echo "macports-ci: Installing on non standard prefix $MACPORTS_PREFIX can be only made from sources"
+  SOURCE=yes
+fi
+
+if test "$SOURCE" = yes ; then
+  echo "macports-ci: Installing from source"
+else
+  echo "macports-ci: Installing from binary"
+fi
+
+echo "macports-ci: Sync mode=$MACPORTS_SYNC"
+
+pushd "$(mktemp -d)"
+
+OSX_VERSION="$(sw_vers -productVersion | grep -o '^[0-9][0-9]*\.[0-9][0-9]*')"
+
+if test "$OSX_VERSION" == 10.10 ; then
+  OSX_NAME=Yosemite
+elif test "$OSX_VERSION" == 10.11 ; then
+  OSX_NAME=ElCapitan
+elif test "$OSX_VERSION" == 10.12 ; then
+  OSX_NAME=Sierra
+elif test "$OSX_VERSION" == 10.13 ; then
+  OSX_NAME=HighSierra
+elif test "$OSX_VERSION" == 10.14 ; then
+  OSX_NAME=Mojave
+elif test "$OSX_VERSION" == 10.15 ; then
+  OSX_NAME=Catalina
+else
+  echo "macports-ci: Unknown OSX version $OSX_VERSION"
+  exit 1
+fi
+
+echo "macports-ci: OSX version $OSX_VERSION $OSX_NAME"
+
+MACPORTS_PKG=MacPorts-${MACPORTS_VERSION}-${OSX_VERSION}-${OSX_NAME}.pkg
+
+# this is a workaround needed because binary installer MacPorts-2.6.3-10.12-Sierra.pkg is broken
+if [ "$SOURCE" != yes ] && [ "$MACPORTS_PKG" = "MacPorts-2.6.3-10.12-Sierra.pkg" ] ; then
+  echo "macports-ci: WARNING $MACPORTS_PKG installer is broken"
+  echo "macports-ci: reverting to 2.6.2 installer followed by selfupdate"
+  MACPORTS_VERSION=2.6.2
+  MACPORTS_PKG=MacPorts-${MACPORTS_VERSION}-${OSX_VERSION}-${OSX_NAME}.pkg
+fi
+
+URL="https://distfiles.macports.org/MacPorts"
+URL="https://github.com/macports/macports-base/releases/download/v$MACPORTS_VERSION/"
+
+echo "macports-ci: Base URL is $URL"
+
+if test "$SOURCE" = yes ; then
+# download source:
+  curl -LO $URL/MacPorts-${MACPORTS_VERSION}.tar.bz2
+  tar xjf MacPorts-${MACPORTS_VERSION}.tar.bz2
+  cd MacPorts-${MACPORTS_VERSION}
+# install
+  ./configure --prefix="$MACPORTS_PREFIX" --with-applications-dir="$MACPORTS_PREFIX/Applications" >/dev/null &&
+    sudo make install >/dev/null
+else
+
+# download installer:
+  curl -LO $URL/$MACPORTS_PKG
+# install:
+  sudo installer -verbose -pkg $MACPORTS_PKG -target /
+fi
+
+# update:
+export PATH="$MACPORTS_PREFIX/bin:$PATH"
+
+echo "PATH=\"$MACPORTS_PREFIX/bin:\$PATH\""  > "$MACPORTS_CI_SOURCEME"
+
+if [ "$GITHUB_ACTIONS" = true ] ; then
+    echo "$MACPORTS_PREFIX/bin" >> "$GITHUB_PATH"
+fi
+
+
+SOURCES="${MACPORTS_PREFIX}"/etc/macports/sources.conf
+
+case "$MACPORTS_SYNC" in
+(rsync)
+  echo "macports-ci: Using rsync"
+  ;;
+(github)
+  echo "macports-ci: Using github"
+   pushd "$MACPORTS_PREFIX"/var/macports/sources
+   sudo mkdir -p github.com/macports/macports-ports/
+   sudo chown -R $USER:admin github.com
+   git clone https://github.com/macports/macports-ports.git github.com/macports/macports-ports/
+   awk '{if($NF=="[default]") print "file:///opt/local/var/macports/sources/github.com/macports/macports-ports/"; else print}' "$SOURCES" > $HOME/$$.tmp
+   sudo mv -f $HOME/$$.tmp "$SOURCES"
+   popd
+  ;;
+(tarball)
+  echo "macports-ci: Using tarball"
+  awk '{if($NF=="[default]") print "https://distfiles.macports.org/ports.tar.gz [default]"; else print}' "$SOURCES" > $$.tmp
+  sudo mv -f $$.tmp "$SOURCES"
+  ;;
+(*)
+  echo "macports-ci: Unknown sync mode $MACPORTS_SYNC"
+  ;;
+esac
+
+i=1
+# run through a while to retry upon failure
+while true
+do
+  echo "macports-ci: Trying to selfupdate (iteration $i)"
+# here I test for the presence of a known portfile
+# this check confirms that ports were installed
+# notice that port -N selfupdate && break is not sufficient as a test
+# (sometime it returns a success even though ports have not been installed)
+# for some misterious reasons, running without "-d" does not work in some case
+  sudo port -d -N selfupdate 2>&1 | grep -v DEBUG | awk '{if($1!="x")print}'
+  port info xdrfile > /dev/null && break || true
+  sleep 5
+  i=$((i+1))
+  if ((i>20)) ; then
+    echo "macports-ci: Failed after $i iterations"
+    exit 1
+  fi
+done
+
+echo "macports-ci: Selfupdate successful after $i iterations"
+
+dir="$PWD"
+popd
+sudo rm -fr $dir
+
+;;
+
+(localports)
+
+echo "macports-ci: localports"
+
+for opt
+do
+  case "$opt" in
+  (*) ports="$opt" ;;
+  esac
+done
+
+if ! test -d "$ports" ; then
+  echo "macports-ci: Please provide a port directory"
+  exit 1
+fi
+
+w=$(which port)
+
+MACPORTS_PREFIX="${w%/bin/port}"
+
+cd "$ports"
+
+ports="$(pwd)"
+
+echo "macports-ci: Portdir fullpath: $ports"
+SOURCES="${MACPORTS_PREFIX}"/etc/macports/sources.conf
+
+awk -v repo="file://$ports" '{if($NF=="[default]") print repo; print}' "$SOURCES" > $$.tmp
+sudo mv -f $$.tmp "$SOURCES"
+
+portindex
+
+;;
+
+(ccache)
+w=$(which port)
+MACPORTS_PREFIX="${w%/bin/port}"
+
+echo "macports-ci: ccache"
+
+ccache_do=install
+
+for opt
+do
+  case "$opt" in
+  (--save) ccache_do=save ;;
+  (--install) ccache_do=install ;;
+  (*) echo "macports-ci: ccache: unknown option $opt"
+      exit 1 ;;
+  esac
+done
+
+
+case "$ccache_do" in
+(install)
+# first install ccache
+sudo port -N install ccache
+# then tell macports to use it
+CONF="${MACPORTS_PREFIX}"/etc/macports/macports.conf
+awk '{if(match($0,"configureccache")) print "configureccache yes" ; else print }' "$CONF" > $$.tmp
+sudo mv -f $$.tmp "$CONF"
+
+# notice that cache size is set to 512Mb, same as it is set by Travis-CI on linux
+# might be changed in the future
+test -f "$HOME"/.macports-ci-ccache/ccache.conf &&
+  sudo rm -fr "$MACPORTS_PREFIX"/var/macports/build/.ccache &&
+  sudo mkdir -p "$MACPORTS_PREFIX"/var/macports/build/.ccache &&
+  sudo cp -a "$HOME"/.macports-ci-ccache/* "$MACPORTS_PREFIX"/var/macports/build/.ccache/ &&
+  sudo echo "max_size = 512M" > "$MACPORTS_PREFIX"/var/macports/build/.ccache/ccache.conf &&
+  sudo chown -R macports:admin "$MACPORTS_PREFIX"/var/macports/build/.ccache
+
+;;
+(save)
+
+sudo rm -fr "$HOME"/.macports-ci-ccache
+sudo mkdir -p "$HOME"/.macports-ci-ccache
+sudo cp -a "$MACPORTS_PREFIX"/var/macports/build/.ccache/* "$HOME"/.macports-ci-ccache/
+
+esac
+
+CCACHE_DIR="$MACPORTS_PREFIX"/var/macports/build/.ccache/ ccache -s
+
+;;
+
+(*)
+echo "macports-ci: unknown action $action"
+
+esac
+
+)
+
+# allows setting env var if necessary:
+source "$MACPORTS_CI_SOURCEME"
diff --git a/etc/ci/macports-ci.ABOUT b/etc/ci/macports-ci.ABOUT
new file mode 100644
index 0000000..60a11f8
--- /dev/null
+++ b/etc/ci/macports-ci.ABOUT
@@ -0,0 +1,16 @@
+about_resource: macports-ci
+name: macports-ci
+version: c9676e67351a3a519e37437e196cd0ee9c2180b8
+download_url: https://raw.githubusercontent.com/GiovanniBussi/macports-ci/c9676e67351a3a519e37437e196cd0ee9c2180b8/macports-ci
+description: Simplify MacPorts setup on Travis-CI
+homepage_url: https://github.com/GiovanniBussi/macports-ci
+license_expression: mit
+copyright: Copyright (c) Giovanni Bussi
+attribute: yes
+checksum_md5: 5d31d479132502f80acdaed78bed9e23
+checksum_sha1: 74b15643bd1a528d91b4a7c2169c6fc656f549c2
+package_url: pkg:github/giovannibussi/macports-ci@c9676e67351a3a519e37437e196cd0ee9c2180b8#macports-ci
+licenses:
+  - key: mit
+    name: MIT License
+    file: mit.LICENSE
diff --git a/etc/ci/mit.LICENSE b/etc/ci/mit.LICENSE
new file mode 100644
index 0000000..e662c78
--- /dev/null
+++ b/etc/ci/mit.LICENSE
@@ -0,0 +1,5 @@
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
\ No newline at end of file
diff --git a/etc/scripts/README.rst b/etc/scripts/README.rst
new file mode 100755
index 0000000..5e54a2c
--- /dev/null
+++ b/etc/scripts/README.rst
@@ -0,0 +1,112 @@
+This directory contains the tools to manage a directory of thirdparty Python
+package sources, wheels and metadata: pin, build, update, document and publish to
+a PyPI-like repo (GitHub release).
+
+NOTE: These are tested to run ONLY on Linux.
+
+
+Thirdparty packages management scripts
+======================================
+
+Pre-requisites
+--------------
+
+* There are two run "modes":
+
+  * To generate or update pip requirement files, you need to start with a clean
+    virtualenv as instructed below (This is to avoid injecting requirements
+    specific to the tools used here in the main requirements).
+
+  * For other usages, the tools here can run either in their own isolated
+    virtualenv or in the main configured development virtualenv.
+    These requirements need to be installed::
+
+        pip install --requirement etc/scripts/requirements.txt
+
+TODO: we need to pin the versions of these tools
+
+
+
+Generate or update pip requirement files
+----------------------------------------
+
+Scripts
+~~~~~~~
+
+**gen_requirements.py**: create/update requirements files from currently
+  installed requirements.
+
+**gen_requirements_dev.py** does the same but can subtract the main requirements
+  to get extra requirements used in only development.
+
+
+Usage
+~~~~~
+
+The sequence of commands to run is:
+
+
+* Start with these to generate the main pip requirements file::
+
+    ./configure --clean
+    ./configure
+    python etc/scripts/gen_requirements.py --site-packages-dir <path to site-packages dir>
+
+* You can optionally install or update extra main requirements after the
+  ./configure step such that these are included in the generated main requirements.
+
+* Optionally, generate a development pip requirements file by running these::
+
+    ./configure --clean
+    ./configure --dev
+    python etc/scripts/gen_requirements_dev.py --site-packages-dir <path to site-packages dir>
+
+* You can optionally install or update extra dev requirements after the
+  ./configure step such that these are included in the generated dev
+  requirements.
+
+Notes: we generate development requirements after the main as this step requires
+the main requirements.txt to be up-to-date first. See **gen_requirements.py and
+gen_requirements_dev.py** --help for details.
+
+Note: this does NOT hash requirements for now.
+
+Note: Be aware that "conditional" requirements (e.g. only for some
+OS or Python versions) in setup.py/setup.cfg/requirements.txt are NOT
+yet supported.
+
+
+Populate a thirdparty directory with wheels, sources, .ABOUT and license files
+------------------------------------------------------------------------------
+
+Scripts
+~~~~~~~
+
+* **fetch_thirdparty.py** will fetch package wheels, source sdist tarballs
+  and their ABOUT, LICENSE and NOTICE files to populate a local directory from
+  a list of PyPI simple URLs (typically PyPI.org proper and our self-hosted PyPI)
+  using pip requirements file(s), specifiers or pre-existing packages files.
+  Fetch wheels for specific python version and operating system combinations.
+
+* **check_thirdparty.py** will check a thirdparty directory for errors.
+
+
+Upgrade virtualenv app
+----------------------
+
+The bundled virtualenv.pyz has to be upgraded by hand and is stored under
+etc/thirdparty
+
+* Fetch https://github.com/pypa/get-virtualenv/raw/<latest tag>/public/virtualenv.pyz
+  for instance https://github.com/pypa/get-virtualenv/raw/20.2.2/public/virtualenv.pyz
+  and save to thirdparty and update the ABOUT and LICENSE files as needed.
+
+* This virtualenv app also contains bundled pip, wheel and setuptools that are
+  essential for the installation to work.
+
+
+Other files
+===========
+
+The other files and scripts are test, support and utility modules used by the
+main scripts documented here.
diff --git a/etc/scripts/check_thirdparty.py b/etc/scripts/check_thirdparty.py
new file mode 100644
index 0000000..b052f25
--- /dev/null
+++ b/etc/scripts/check_thirdparty.py
@@ -0,0 +1,55 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+import click
+
+import utils_thirdparty
+
+
+@click.command()
+@click.option(
+    "-d",
+    "--dest",
+    type=click.Path(exists=True, readable=True, path_type=str, file_okay=False),
+    required=True,
+    help="Path to the thirdparty directory to check.",
+)
+@click.option(
+    "-w",
+    "--wheels",
+    is_flag=True,
+    help="Check missing wheels.",
+)
+@click.option(
+    "-s",
+    "--sdists",
+    is_flag=True,
+    help="Check missing source sdists tarballs.",
+)
+@click.help_option("-h", "--help")
+def check_thirdparty_dir(
+    dest,
+    wheels,
+    sdists,
+):
+    """
+    Check a thirdparty directory for problems and print these on screen.
+    """
+    # check for problems
+    print("==> CHECK FOR PROBLEMS")
+    utils_thirdparty.find_problems(
+        dest_dir=dest,
+        report_missing_sources=sdists,
+        report_missing_wheels=wheels,
+    )
+
+
+if __name__ == "__main__":
+    check_thirdparty_dir()
diff --git a/etc/scripts/fetch_thirdparty.py b/etc/scripts/fetch_thirdparty.py
new file mode 100644
index 0000000..eedf05c
--- /dev/null
+++ b/etc/scripts/fetch_thirdparty.py
@@ -0,0 +1,315 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+
+import itertools
+import os
+import sys
+from collections import defaultdict
+
+import click
+
+import utils_thirdparty
+import utils_requirements
+
+TRACE = False
+TRACE_DEEP = False
+
+
+@click.command()
+@click.option(
+    "-r",
+    "--requirements",
+    "requirements_files",
+    type=click.Path(exists=True, readable=True, path_type=str, dir_okay=False),
+    metavar="REQUIREMENT-FILE",
+    multiple=True,
+    required=False,
+    help="Path to pip requirements file(s) listing thirdparty packages.",
+)
+@click.option(
+    "--spec",
+    "--specifier",
+    "specifiers",
+    type=str,
+    metavar="SPECIFIER",
+    multiple=True,
+    required=False,
+    help="Thirdparty package name==version specification(s) as in django==1.2.3. "
+    "With --latest-version a plain package name is also acceptable.",
+)
+@click.option(
+    "-l",
+    "--latest-version",
+    is_flag=True,
+    help="Get the latest version of all packages, ignoring any specified versions.",
+)
+@click.option(
+    "-d",
+    "--dest",
+    "dest_dir",
+    type=click.Path(exists=True, readable=True, path_type=str, file_okay=False),
+    metavar="DIR",
+    default=utils_thirdparty.THIRDPARTY_DIR,
+    show_default=True,
+    help="Path to the destination directory where to save downloaded wheels, "
+    "sources, ABOUT and LICENSE files.",
+)
+@click.option(
+    "-w",
+    "--wheels",
+    is_flag=True,
+    help="Download wheels.",
+)
+@click.option(
+    "-s",
+    "--sdists",
+    is_flag=True,
+    help="Download source sdists tarballs.",
+)
+@click.option(
+    "-p",
+    "--python-version",
+    "python_versions",
+    type=click.Choice(utils_thirdparty.PYTHON_VERSIONS),
+    metavar="PYVER",
+    default=utils_thirdparty.PYTHON_VERSIONS,
+    show_default=True,
+    multiple=True,
+    help="Python version(s) to use for wheels.",
+)
+@click.option(
+    "-o",
+    "--operating-system",
+    "operating_systems",
+    type=click.Choice(utils_thirdparty.PLATFORMS_BY_OS),
+    metavar="OS",
+    default=tuple(utils_thirdparty.PLATFORMS_BY_OS),
+    multiple=True,
+    show_default=True,
+    help="OS(es) to use for wheels: one or more of linux, mac or windows.",
+)
+@click.option(
+    "--index-url",
+    "index_urls",
+    type=str,
+    metavar="INDEX",
+    default=utils_thirdparty.PYPI_INDEX_URLS,
+    show_default=True,
+    multiple=True,
+    help="PyPI index URL(s) to use for wheels and sources, in order of preferences.",
+)
+@click.option(
+    "--use-cached-index",
+    is_flag=True,
+    help="Use on disk cached PyPI indexes list of packages and versions and do not refetch if present.",
+)
+@click.option(
+    "--sdist-only",
+    "sdist_only",
+    type=str,
+    metavar="SDIST",
+    default=tuple(),
+    show_default=False,
+    multiple=True,
+    help="Package name(s) that come only in sdist format (no wheels). "
+         "The command will not fail and exit if no wheel exists for these names",
+)
+@click.option(
+    "--wheel-only",
+    "wheel_only",
+    type=str,
+    metavar="WHEEL",
+    default=tuple(),
+    show_default=False,
+    multiple=True,
+    help="Package name(s) that come only in wheel format (no sdist). "
+         "The command will not fail and exit if no sdist exists for these names",
+)
+@click.option(
+    "--no-dist",
+    "no_dist",
+    type=str,
+    metavar="DIST",
+    default=tuple(),
+    show_default=False,
+    multiple=True,
+    help="Package name(s) that do not come either in wheel or sdist format. "
+         "The command will not fail and exit if no distribution exists for these names",
+)
+@click.help_option("-h", "--help")
+def fetch_thirdparty(
+    requirements_files,
+    specifiers,
+    latest_version,
+    dest_dir,
+    python_versions,
+    operating_systems,
+    wheels,
+    sdists,
+    index_urls,
+    use_cached_index,
+    sdist_only,
+    wheel_only,
+    no_dist,
+):
+    """
+    Download to --dest THIRDPARTY_DIR the PyPI wheels, source distributions,
+    and their ABOUT metadata, license and notices files.
+
+    Download the PyPI packages listed in the combination of:
+    - the pip requirements --requirements REQUIREMENT-FILE(s),
+    - the pip name==version --specifier SPECIFIER(s)
+    - any pre-existing wheels or sdists found in --dest THIRDPARTY_DIR.
+
+    Download wheels with the --wheels option for the ``--python-version``
+    PYVER(s) and ``--operating-system`` OS(s) combinations defaulting to all
+    supported combinations.
+
+    Download sdists tarballs with the --sdists option.
+
+    Generate or Download .ABOUT, .LICENSE and .NOTICE files for all the wheels
+    and sources fetched.
+
+    Download from the provided PyPI simple --index-url INDEX(s) URLs.
+    """
+    if not (wheels or sdists):
+        print("Error: one or both of --wheels and --sdists is required.")
+        sys.exit(1)
+
+    print(f"COLLECTING REQUIRED NAMES & VERSIONS FROM {dest_dir}")
+
+    existing_packages_by_nv = {
+        (package.name, package.version): package
+        for package in utils_thirdparty.get_local_packages(directory=dest_dir)
+    }
+
+    required_name_versions = set(existing_packages_by_nv.keys())
+
+    for req_file in requirements_files:
+        nvs = utils_requirements.load_requirements(
+            requirements_file=req_file,
+            with_unpinned=latest_version,
+        )
+        required_name_versions.update(nvs)
+
+    for specifier in specifiers:
+        nv = utils_requirements.get_required_name_version(
+            requirement=specifier,
+            with_unpinned=latest_version,
+        )
+        required_name_versions.add(nv)
+
+    if latest_version:
+        names = set(name for name, _version in sorted(required_name_versions))
+        required_name_versions = {(n, None) for n in names}
+
+    if not required_name_versions:
+        print("Error: no requirements requested.")
+        sys.exit(1)
+
+    if TRACE_DEEP:
+        print("required_name_versions:")
+        for n, v in required_name_versions:
+            print(f"    {n} @ {v}")
+
+    # create the environments matrix we need for wheels
+    environments = None
+    if wheels:
+        evts = itertools.product(python_versions, operating_systems)
+        environments = [utils_thirdparty.Environment.from_pyver_and_os(pyv, opsys) for pyv, opsys in evts]
+
+    # Collect PyPI repos
+    repos = []
+    for index_url in index_urls:
+        index_url = index_url.strip("/")
+        existing = utils_thirdparty.DEFAULT_PYPI_REPOS_BY_URL.get(index_url)
+        if existing:
+            existing.use_cached_index = use_cached_index
+            repos.append(existing)
+        else:
+            repo = utils_thirdparty.PypiSimpleRepository(
+                index_url=index_url,
+                use_cached_index=use_cached_index,
+            )
+            repos.append(repo)
+
+    wheels_or_sdist_not_found = defaultdict(list)
+
+    for name, version in sorted(required_name_versions):
+        nv = name, version
+        print(f"Processing: {name} @ {version}")
+        if wheels:
+            for environment in environments:
+
+                if TRACE:
+                    print(f"  ==> Fetching wheel for envt: {environment}")
+
+                fetched = utils_thirdparty.download_wheel(
+                    name=name,
+                    version=version,
+                    environment=environment,
+                    dest_dir=dest_dir,
+                    repos=repos,
+                )
+                if not fetched:
+                    wheels_or_sdist_not_found[f"{name}=={version}"].append(environment)
+                    if TRACE:
+                        print(f"      NOT FOUND")
+
+        if (sdists or
+            (f"{name}=={version}" in wheels_or_sdist_not_found and name in sdist_only)
+         ):
+            if TRACE:
+                print(f"  ==> Fetching sdist: {name}=={version}")
+
+            fetched = utils_thirdparty.download_sdist(
+                name=name,
+                version=version,
+                dest_dir=dest_dir,
+                repos=repos,
+            )
+            if not fetched:
+                wheels_or_sdist_not_found[f"{name}=={version}"].append("sdist")
+                if TRACE:
+                    print(f"      NOT FOUND")
+
+    mia = []
+    for nv, dists in wheels_or_sdist_not_found.items():
+        name, _, version = nv.partition("==")
+        if name in no_dist:
+            continue
+        sdist_missing = sdists and "sdist" in dists and name not in wheel_only
+        if sdist_missing:
+            mia.append(f"SDist missing: {nv} {dists}")
+        wheels_missing = wheels and any(d for d in dists if d != "sdist") and name not in sdist_only
+        if wheels_missing:
+            mia.append(f"Wheels missing: {nv} {dists}")
+
+    if mia:
+        for m in mia:
+            print(m)
+        raise Exception(mia)
+
+    print(f"==> FETCHING OR CREATING ABOUT AND LICENSE FILES")
+    utils_thirdparty.fetch_abouts_and_licenses(dest_dir=dest_dir, use_cached_index=use_cached_index)
+    utils_thirdparty.clean_about_files(dest_dir=dest_dir)
+
+    # check for problems
+    print(f"==> CHECK FOR PROBLEMS")
+    utils_thirdparty.find_problems(
+        dest_dir=dest_dir,
+        report_missing_sources=sdists,
+        report_missing_wheels=wheels,
+    )
+
+
+if __name__ == "__main__":
+    fetch_thirdparty()
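
The missing-distribution check at the end of `fetch_thirdparty()` can be read as a pure function. The following is a minimal sketch, not the upstream code: `report_missing` is a hypothetical name, and environments are represented by plain strings rather than `utils_thirdparty.Environment` objects.

```python
def report_missing(not_found, wheels, sdists, sdist_only=(), wheel_only=(), no_dist=()):
    """
    Return a list of messages for distributions that could not be fetched.
    ``not_found`` maps "name==version" to a list holding "sdist" and/or
    environment labels (hypothetical strings here).
    """
    messages = []
    for nv, dists in not_found.items():
        name, _, _version = nv.partition("==")
        # Names declared as having no distribution at all are never reported.
        if name in no_dist:
            continue
        # A missing sdist is tolerated for wheel-only packages.
        if sdists and "sdist" in dists and name not in wheel_only:
            messages.append(f"SDist missing: {nv} {dists}")
        # A missing wheel is tolerated for sdist-only packages.
        if wheels and any(d != "sdist" for d in dists) and name not in sdist_only:
            messages.append(f"Wheels missing: {nv} {dists}")
    return messages
```

Under these assumptions, a package listed in `wheel_only` produces no message when only its sdist is missing, which matches the intent stated in the `--wheel-only` help text.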
diff --git a/etc/scripts/gen_pypi_simple.py b/etc/scripts/gen_pypi_simple.py
new file mode 100644
index 0000000..214d90d
--- /dev/null
+++ b/etc/scripts/gen_pypi_simple.py
@@ -0,0 +1,315 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+# SPDX-License-Identifier: BSD-2-Clause-Views AND MIT
+# Copyright (c) 2010 David Wolever <david@wolever.net>. All rights reserved.
+# originally from https://github.com/wolever/pip2pi
+
+import hashlib
+import os
+import re
+import shutil
+from collections import defaultdict
+from html import escape
+from pathlib import Path
+from typing import NamedTuple
+
+"""
+Generate a PyPI simple index from a directory.
+"""
+
+
+class InvalidDistributionFilename(Exception):
+    pass
+
+
+def get_package_name_from_filename(filename):
+    """
+    Return the normalized package name extracted from a package ``filename``.
+    Normalization is done according to distribution name rules.
+    Raise an ``InvalidDistributionFilename`` if the ``filename`` is invalid::
+
+    >>> get_package_name_from_filename("foo-1.2.3_rc1.tar.gz")
+    'foo'
+    >>> get_package_name_from_filename("foo_bar-1.2-py27-none-any.whl")
+    'foo-bar'
+    >>> get_package_name_from_filename("Cython-0.17.2-cp26-none-linux_x86_64.whl")
+    'cython'
+    >>> get_package_name_from_filename("python_ldap-2.4.19-cp27-none-macosx_10_10_x86_64.whl")
+    'python-ldap'
+    >>> try:
+    ...     get_package_name_from_filename("foo.whl")
+    ... except InvalidDistributionFilename:
+    ...     pass
+    >>> try:
+    ...     get_package_name_from_filename("foo.png")
+    ... except InvalidDistributionFilename:
+    ...     pass
+    """
+    if not filename or not filename.endswith(dist_exts):
+        raise InvalidDistributionFilename(filename)
+
+    filename = os.path.basename(filename)
+
+    if filename.endswith(sdist_exts):
+        name_ver = None
+        extension = None
+
+        for ext in sdist_exts:
+            if filename.endswith(ext):
+                name_ver, extension, _ = filename.rpartition(ext)
+                break
+
+        if not extension or not name_ver:
+            raise InvalidDistributionFilename(filename)
+
+        name, _, version = name_ver.rpartition("-")
+
+        if not (name and version):
+            raise InvalidDistributionFilename(filename)
+
+    elif filename.endswith(wheel_ext):
+
+        wheel_info = get_wheel_from_filename(filename)
+
+        if not wheel_info:
+            raise InvalidDistributionFilename(filename)
+
+        name = wheel_info.group("name")
+        version = wheel_info.group("version")
+
+        if not (name and version):
+            raise InvalidDistributionFilename(filename)
+
+    elif filename.endswith(app_ext):
+        name_ver, extension, _ = filename.rpartition(".pyz")
+
+        if "-" in filename:
+            name, _, version = name_ver.rpartition("-")
+        else:
+            name = name_ver
+
+        if not name:
+            raise InvalidDistributionFilename(filename)
+
+    name = normalize_name(name)
+    return name
+
+
+def normalize_name(name):
+    """
+    Return a normalized package name per PEP503, and copied from
+    https://www.python.org/dev/peps/pep-0503/#id4
+    """
+    return name and re.sub(r"[-_.]+", "-", name).lower() or name
+
+
+def build_per_package_index(pkg_name, packages, base_url):
+    """
+    Return an HTML document as string representing the index for a package
+    """
+    document = []
+    header = f"""<!DOCTYPE html>
+<html>
+  <head>
+    <meta name="pypi:repository-version" content="1.0">
+    <title>Links for {pkg_name}</title>
+  </head>
+  <body>"""
+    document.append(header)
+
+    for package in sorted(packages, key=lambda p: p.archive_file):
+        document.append(package.simple_index_entry(base_url))
+
+    footer = """  </body>
+</html>
+"""
+    document.append(footer)
+    return "\n".join(document)
+
+
+def build_links_package_index(packages_by_package_name, base_url):
+    """
+    Return an HTML document as string which is a links index of all packages
+    """
+    document = []
+    header = f"""<!DOCTYPE html>
+<html>
+  <head>
+    <title>Links for all packages</title>
+  </head>
+  <body>"""
+    document.append(header)
+
+    for _name, packages in sorted(packages_by_package_name.items(), key=lambda i: i[0]):
+        for package in sorted(packages, key=lambda p: p.archive_file):
+            document.append(package.simple_index_entry(base_url))
+
+    footer = """  </body>
+</html>
+"""
+    document.append(footer)
+    return "\n".join(document)
+
+
+class Package(NamedTuple):
+    name: str
+    index_dir: Path
+    archive_file: Path
+    checksum: str
+
+    @classmethod
+    def from_file(cls, name, index_dir, archive_file):
+        with open(archive_file, "rb") as f:
+            checksum = hashlib.sha256(f.read()).hexdigest()
+        return cls(
+            name=name,
+            index_dir=index_dir,
+            archive_file=archive_file,
+            checksum=checksum,
+        )
+
+    def simple_index_entry(self, base_url):
+        return (
+            f'    <a href="{base_url}/{self.archive_file.name}#sha256={self.checksum}">'
+            f"{self.archive_file.name}</a><br/>"
+        )
+
+
+def build_pypi_index(directory, base_url="https://thirdparty.aboutcode.org/pypi"):
+    """
+    Using a ``directory`` directory of wheels and sdists, create a PyPI
+    simple directory index at ``directory``/simple/ populated with the proper
+    PyPI simple index directory structure crafted using symlinks.
+
+    WARNING: The ``directory``/simple/ directory is removed if it exists.
+    NOTE: in addition to the PyPI simple index.html there is also a links.html
+    index file generated which is suitable to use with pip's --find-links
+    """
+
+    directory = Path(directory)
+
+    index_dir = directory / "simple"
+    if index_dir.exists():
+        shutil.rmtree(str(index_dir), ignore_errors=True)
+
+    index_dir.mkdir(parents=True)
+    packages_by_package_name = defaultdict(list)
+
+    # generate the main simple index.html
+    simple_html_index = [
+        "<!DOCTYPE html>",
+        "<html><head><title>PyPI Simple Index</title>",
+        '<meta charset="UTF-8">' '<meta name="api-version" value="2" /></head><body>',
+    ]
+
+    for pkg_file in directory.iterdir():
+
+        pkg_filename = pkg_file.name
+
+        if (
+            not pkg_file.is_file()
+            or not pkg_filename.endswith(dist_exts)
+            or pkg_filename.startswith(".")
+        ):
+            continue
+
+        pkg_name = get_package_name_from_filename(
+            filename=pkg_filename,
+        )
+        pkg_index_dir = index_dir / pkg_name
+        pkg_index_dir.mkdir(parents=True, exist_ok=True)
+        pkg_indexed_file = pkg_index_dir / pkg_filename
+
+        link_target = Path("../..") / pkg_filename
+        pkg_indexed_file.symlink_to(link_target)
+
+        if pkg_name not in packages_by_package_name:
+            esc_name = escape(pkg_name)
+            simple_html_index.append(f'<a href="{esc_name}/">{esc_name}</a><br/>')
+
+        packages_by_package_name[pkg_name].append(
+            Package.from_file(
+                name=pkg_name,
+                index_dir=pkg_index_dir,
+                archive_file=pkg_file,
+            )
+        )
+
+    # finalize main index
+    simple_html_index.append("</body></html>")
+    index_html = index_dir / "index.html"
+    index_html.write_text("\n".join(simple_html_index))
+
+    # also generate the simple index.html of each package, listing all its versions.
+    for pkg_name, packages in packages_by_package_name.items():
+        per_package_index = build_per_package_index(
+            pkg_name=pkg_name,
+            packages=packages,
+            base_url=base_url,
+        )
+        pkg_index_dir = packages[0].index_dir
+        ppi_html = pkg_index_dir / "index.html"
+        ppi_html.write_text(per_package_index)
+
+    # also generate a links.html page with all packages.
+    package_links = build_links_package_index(
+        packages_by_package_name=packages_by_package_name,
+        base_url=base_url,
+    )
+    links_html = index_dir / "links.html"
+    links_html.write_text(package_links)
+
+
+"""
+name: pip-wheel
+version: 20.3.1
+download_url: https://github.com/pypa/pip/blob/20.3.1/src/pip/_internal/models/wheel.py
+copyright: Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+license_expression: mit
+notes: the wheel name regex is copied from pip-20.3.1 pip/_internal/models/wheel.py
+
+Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+"""
+get_wheel_from_filename = re.compile(
+    r"""^(?P<namever>(?P<name>.+?)-(?P<version>.*?))
+    ((-(?P<build>\d[^-]*?))?-(?P<pyvers>.+?)-(?P<abis>.+?)-(?P<plats>.+?)
+    \.whl)$""",
+    re.VERBOSE,
+).match
+
+sdist_exts = (
+    ".tar.gz",
+    ".tar.bz2",
+    ".zip",
+    ".tar.xz",
+)
+
+wheel_ext = ".whl"
+app_ext = ".pyz"
+dist_exts = sdist_exts + (wheel_ext, app_ext)
+
+if __name__ == "__main__":
+    import sys
+
+    pkg_dir = sys.argv[1]
+    build_pypi_index(pkg_dir)
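
To illustrate how `gen_pypi_simple.py` derives an index directory name from a wheel filename, here is a minimal standalone sketch. It reuses the wheel regex and the PEP 503 normalization shown in the diff above; `wheel_name_and_version` is a hypothetical helper that does not exist in the script.

```python
import re

# Wheel filename pattern as it appears in the diff (noted there as copied from pip 20.3.1).
WHEEL_RE = re.compile(
    r"""^(?P<namever>(?P<name>.+?)-(?P<version>.*?))
    ((-(?P<build>\d[^-]*?))?-(?P<pyvers>.+?)-(?P<abis>.+?)-(?P<plats>.+?)
    \.whl)$""",
    re.VERBOSE,
)


def normalize_name(name):
    # PEP 503: collapse runs of "-", "_" and "." into one "-" and lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def wheel_name_and_version(filename):
    """
    Return a (normalized name, version) tuple for a wheel ``filename``,
    or None if the filename does not look like a wheel.
    """
    match = WHEEL_RE.match(filename)
    if not match:
        return None
    return normalize_name(match.group("name")), match.group("version")
```

For example, `wheel_name_and_version("foo_bar-1.2-py27-none-any.whl")` yields the normalized name `foo-bar`, which is the directory name the script would create under `simple/`.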
diff --git a/etc/scripts/gen_pypi_simple.py.ABOUT b/etc/scripts/gen_pypi_simple.py.ABOUT
new file mode 100644
index 0000000..4de5ded
--- /dev/null
+++ b/etc/scripts/gen_pypi_simple.py.ABOUT
@@ -0,0 +1,8 @@
+about_resource: gen_pypi_simple.py
+name: gen_pypi_simple.py
+license_expression: bsd-2-clause-views and mit
+copyright: Copyright (c) nexB Inc.
+   Copyright (c) 2010 David Wolever <david@wolever.net>
+   Copyright (c) The pip developers
+notes: Originally from https://github.com/wolever/pip2pi and modified extensively
+ Also partially derived from pip code
diff --git a/etc/scripts/gen_pypi_simple.py.NOTICE b/etc/scripts/gen_pypi_simple.py.NOTICE
new file mode 100644
index 0000000..6e0fbbc
--- /dev/null
+++ b/etc/scripts/gen_pypi_simple.py.NOTICE
@@ -0,0 +1,56 @@
+SPDX-License-Identifier: BSD-2-Clause-Views AND MIT
+
+Copyright (c) nexB Inc.
+Copyright (c) 2010 David Wolever <david@wolever.net>
+Copyright (c) The pip developers
+
+
+Original code: copyright 2010 David Wolever <david@wolever.net>. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+   1. Redistributions of source code must retain the above copyright notice,
+   this list of conditions and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
+EVENT SHALL <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+The views and conclusions contained in the software and documentation are those
+of the authors and should not be interpreted as representing official policies,
+either expressed or implied, of David Wolever.
+
+
+Original code: Copyright (c) 2008-2020 The pip developers
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
diff --git a/etc/scripts/gen_requirements.py b/etc/scripts/gen_requirements.py
new file mode 100644
index 0000000..07e26f7
--- /dev/null
+++ b/etc/scripts/gen_requirements.py
@@ -0,0 +1,57 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+import argparse
+import pathlib
+
+import utils_requirements
+
+"""
+Utilities to manage requirements files.
+NOTE: this should use ONLY the standard library and not import anything else
+because this is used for bootstrapping with no requirements installed.
+"""
+
+
+def gen_requirements():
+    description = """
+    Create or replace the `--requirements-file` FILE pip requirements file with
+    all Python packages found installed in `--site-packages-dir`.
+    """
+    parser = argparse.ArgumentParser(description=description)
+
+    parser.add_argument(
+        "-s",
+        "--site-packages-dir",
+        dest="site_packages_dir",
+        type=pathlib.Path,
+        required=True,
+        metavar="DIR",
+        help="Path to the 'site-packages' directory where wheels are installed such as lib/python3.6/site-packages",
+    )
+    parser.add_argument(
+        "-r",
+        "--requirements-file",
+        type=pathlib.Path,
+        metavar="FILE",
+        default="requirements.txt",
+        help="Path to the requirements file to update or create.",
+    )
+
+    args = parser.parse_args()
+
+    utils_requirements.lock_requirements(
+        site_packages_dir=args.site_packages_dir,
+        requirements_file=args.requirements_file,
+    )
+
+
+if __name__ == "__main__":
+    gen_requirements()
diff --git a/etc/scripts/gen_requirements_dev.py b/etc/scripts/gen_requirements_dev.py
new file mode 100644
index 0000000..12cc06d
--- /dev/null
+++ b/etc/scripts/gen_requirements_dev.py
@@ -0,0 +1,68 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+import argparse
+import pathlib
+
+import utils_requirements
+
+"""
+Utilities to manage requirements files.
+NOTE: this should use ONLY the standard library and not import anything else
+because this is used for bootstrapping with no requirements installed.
+"""
+
+
+def gen_dev_requirements():
+    description = """
+    Create or overwrite the `--dev-requirements-file` pip requirements FILE with
+    all Python packages found installed in `--site-packages-dir`. Exclude
+    package names also listed in the --main-requirements-file pip requirements
+    FILE (that are assumed to be the production requirements and therefore to always
+    be present in addition to the development requirements).
+    """
+    parser = argparse.ArgumentParser(description=description)
+
+    parser.add_argument(
+        "-s",
+        "--site-packages-dir",
+        type=pathlib.Path,
+        required=True,
+        metavar="DIR",
+        help='Path to the "site-packages" directory where wheels are installed such as lib/python3.6/site-packages',
+    )
+    parser.add_argument(
+        "-d",
+        "--dev-requirements-file",
+        type=pathlib.Path,
+        metavar="FILE",
+        default="requirements-dev.txt",
+        help="Path to the dev requirements file to update or create.",
+    )
+    parser.add_argument(
+        "-r",
+        "--main-requirements-file",
+        type=pathlib.Path,
+        default="requirements.txt",
+        metavar="FILE",
+        help="Path to the main requirements file. Its requirements will be excluded "
+        "from the generated dev requirements.",
+    )
+    args = parser.parse_args()
+
+    utils_requirements.lock_dev_requirements(
+        dev_requirements_file=args.dev_requirements_file,
+        main_requirements_file=args.main_requirements_file,
+        site_packages_dir=args.site_packages_dir,
+    )
+
+
+if __name__ == "__main__":
+    gen_dev_requirements()
diff --git a/etc/scripts/requirements.txt b/etc/scripts/requirements.txt
new file mode 100644
index 0000000..7c514da
--- /dev/null
+++ b/etc/scripts/requirements.txt
@@ -0,0 +1,12 @@
+aboutcode_toolkit
+attrs
+commoncode
+click
+requests
+saneyaml
+pip
+setuptools
+twine
+wheel
+build
+packvers
diff --git a/etc/scripts/test_utils_pip_compatibility_tags.py b/etc/scripts/test_utils_pip_compatibility_tags.py
new file mode 100644
index 0000000..98187c5
--- /dev/null
+++ b/etc/scripts/test_utils_pip_compatibility_tags.py
@@ -0,0 +1,130 @@
+"""Generate and work with PEP 425 Compatibility Tags.
+
+copied from pip-20.3.1 pip/tests/unit/test_utils_compatibility_tags.py
+download_url: https://raw.githubusercontent.com/pypa/pip/20.3.1/tests/unit/test_utils_compatibility_tags.py
+
+Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+"""
+
+from unittest.mock import patch
+import sysconfig
+
+import pytest
+
+import utils_pip_compatibility_tags
+
+
+@pytest.mark.parametrize(
+    "version_info, expected",
+    [
+        ((2,), "2"),
+        ((2, 8), "28"),
+        ((3,), "3"),
+        ((3, 6), "36"),
+        # Test a tuple of length 3.
+        ((3, 6, 5), "36"),
+        # Test a 2-digit minor version.
+        ((3, 10), "310"),
+    ],
+)
+def test_version_info_to_nodot(version_info, expected):
+    actual = utils_pip_compatibility_tags.version_info_to_nodot(version_info)
+    assert actual == expected
+
+
+class Testcompatibility_tags(object):
+    def mock_get_config_var(self, **kwd):
+        """
+        Patch sysconfig.get_config_var for arbitrary keys.
+        """
+        get_config_var = sysconfig.get_config_var
+
+        def _mock_get_config_var(var):
+            if var in kwd:
+                return kwd[var]
+            return get_config_var(var)
+
+        return _mock_get_config_var
+
+    def test_no_hyphen_tag(self):
+        """
+        Test that no tag contains a hyphen.
+        """
+        import pip._internal.utils.compatibility_tags
+
+        mock_gcf = self.mock_get_config_var(SOABI="cpython-35m-darwin")
+
+        with patch("sysconfig.get_config_var", mock_gcf):
+            supported = pip._internal.utils.compatibility_tags.get_supported()
+
+        for tag in supported:
+            assert "-" not in tag.interpreter
+            assert "-" not in tag.abi
+            assert "-" not in tag.platform
+
+
+class TestManylinux2010Tags(object):
+    @pytest.mark.parametrize(
+        "manylinux2010,manylinux1",
+        [
+            ("manylinux2010_x86_64", "manylinux1_x86_64"),
+            ("manylinux2010_i686", "manylinux1_i686"),
+        ],
+    )
+    def test_manylinux2010_implies_manylinux1(self, manylinux2010, manylinux1):
+        """
+        Specifying manylinux2010 implies manylinux1.
+        """
+        groups = {}
+        supported = utils_pip_compatibility_tags.get_supported(platforms=[manylinux2010])
+        for tag in supported:
+            groups.setdefault((tag.interpreter, tag.abi), []).append(tag.platform)
+
+        for arches in groups.values():
+            if arches == ["any"]:
+                continue
+            assert arches[:2] == [manylinux2010, manylinux1]
+
+
+class TestManylinux2014Tags(object):
+    @pytest.mark.parametrize(
+        "manylinuxA,manylinuxB",
+        [
+            ("manylinux2014_x86_64", ["manylinux2010_x86_64", "manylinux1_x86_64"]),
+            ("manylinux2014_i686", ["manylinux2010_i686", "manylinux1_i686"]),
+        ],
+    )
+    def test_manylinuxA_implies_manylinuxB(self, manylinuxA, manylinuxB):
+        """
+        Specifying manylinux2014 implies manylinux2010/manylinux1.
+        """
+        groups = {}
+        supported = utils_pip_compatibility_tags.get_supported(platforms=[manylinuxA])
+        for tag in supported:
+            groups.setdefault((tag.interpreter, tag.abi), []).append(tag.platform)
+
+        expected_arches = [manylinuxA]
+        expected_arches.extend(manylinuxB)
+        for arches in groups.values():
+            if arches == ["any"]:
+                continue
+            assert arches[:3] == expected_arches
diff --git a/etc/scripts/test_utils_pip_compatibility_tags.py.ABOUT b/etc/scripts/test_utils_pip_compatibility_tags.py.ABOUT
new file mode 100644
index 0000000..07eee35
--- /dev/null
+++ b/etc/scripts/test_utils_pip_compatibility_tags.py.ABOUT
@@ -0,0 +1,14 @@
+about_resource: test_utils_pip_compatibility_tags.py
+
+type: github
+namespace: pypa
+name: pip
+version: 20.3.1
+subpath: tests/unit/test_utils_compatibility_tags.py
+
+package_url: pkg:github/pypa/pip@20.3.1#tests/unit/test_utils_compatibility_tags.py
+
+download_url: https://raw.githubusercontent.com/pypa/pip/20.3.1/tests/unit/test_utils_compatibility_tags.py
+copyright: Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+license_expression: mit
+notes: subset copied from pip for tag handling
diff --git a/etc/scripts/test_utils_pypi_supported_tags.py b/etc/scripts/test_utils_pypi_supported_tags.py
new file mode 100644
index 0000000..d291572
--- /dev/null
+++ b/etc/scripts/test_utils_pypi_supported_tags.py
@@ -0,0 +1,92 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import pytest
+
+from utils_pypi_supported_tags import validate_platforms_for_pypi
+
+"""
+Wheel platform checking tests
+
+Copied and modified on 2020-12-24 from
+https://github.com/pypa/warehouse/blob/37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d/tests/unit/forklift/test_legacy.py
+"""
+
+
+def validate_wheel_filename_for_pypi(filename):
+    """
+    Validate if the filename is a PyPI/warehouse-uploadable wheel file name
+    with supported platform tags. Return a list of unsupported platform tags or
+    an empty list if all tags are supported.
+    """
+    from utils_thirdparty import Wheel
+
+    wheel = Wheel.from_filename(filename)
+    return validate_platforms_for_pypi(wheel.platforms)
+
+
+@pytest.mark.parametrize(
+    "plat",
+    [
+        "any",
+        "win32",
+        "win_amd64",
+        "win_ia64",
+        "manylinux1_i686",
+        "manylinux1_x86_64",
+        "manylinux2010_i686",
+        "manylinux2010_x86_64",
+        "manylinux2014_i686",
+        "manylinux2014_x86_64",
+        "manylinux2014_aarch64",
+        "manylinux2014_armv7l",
+        "manylinux2014_ppc64",
+        "manylinux2014_ppc64le",
+        "manylinux2014_s390x",
+        "manylinux_2_5_i686",
+        "manylinux_2_12_x86_64",
+        "manylinux_2_17_aarch64",
+        "manylinux_2_17_armv7l",
+        "manylinux_2_17_ppc64",
+        "manylinux_2_17_ppc64le",
+        "manylinux_3_0_s390x",
+        "macosx_10_6_intel",
+        "macosx_10_13_x86_64",
+        "macosx_11_0_x86_64",
+        "macosx_10_15_arm64",
+        "macosx_11_10_universal2",
+        # A real tag used by e.g. some numpy wheels
+        (
+            "macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64."
+            "macosx_10_10_intel.macosx_10_10_x86_64"
+        ),
+    ],
+)
+def test_is_valid_pypi_wheel_return_true_for_supported_wheel(plat):
+    filename = f"foo-1.2.3-cp34-none-{plat}.whl"
+    assert not validate_wheel_filename_for_pypi(filename)
+
+
+@pytest.mark.parametrize(
+    "plat",
+    [
+        "linux_x86_64",
+        "linux_x86_64.win32",
+        "macosx_9_2_x86_64",
+        "macosx_12_2_arm64",
+        "macosx_10_15_amd64",
+    ],
+)
+def test_is_valid_pypi_wheel_raise_exception_for_unsupported_wheel(plat):
+    filename = f"foo-1.2.3-cp34-none-{plat}.whl"
+    invalid = validate_wheel_filename_for_pypi(filename)
+    assert invalid
diff --git a/etc/scripts/test_utils_pypi_supported_tags.py.ABOUT b/etc/scripts/test_utils_pypi_supported_tags.py.ABOUT
new file mode 100644
index 0000000..176efac
--- /dev/null
+++ b/etc/scripts/test_utils_pypi_supported_tags.py.ABOUT
@@ -0,0 +1,17 @@
+about_resource: test_utils_pypi_supported_tags.py
+
+type: github
+namespace: pypa
+name: warehouse
+version: 37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d
+subpath: tests/unit/forklift/test_legacy.py
+
+package_url: pkg:github/pypa/warehouse@37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d#tests/unit/forklift/test_legacy.py
+
+download_url: https://github.com/pypa/warehouse/blob/37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d/tests/unit/forklift/test_legacy.py
+copyright: Copyright (c) The warehouse developers
+homepage_url: https://warehouse.readthedocs.io
+license_expression: apache-2.0
+notes: Test for wheel platform checking copied and heavily modified on
+ 2020-12-24 from warehouse. This contains the basic functions to check if a
+ wheel file name would be supported for uploading to PyPI.
diff --git a/etc/scripts/utils_dejacode.py b/etc/scripts/utils_dejacode.py
new file mode 100644
index 0000000..c42e6c9
--- /dev/null
+++ b/etc/scripts/utils_dejacode.py
@@ -0,0 +1,211 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+import io
+import os
+import zipfile
+
+import requests
+import saneyaml
+
+from packvers import version as packaging_version
+
+"""
+Utility to create and retrieve package and ABOUT file data from DejaCode.
+"""
+
+DEJACODE_API_KEY = os.environ.get("DEJACODE_API_KEY", "")
+DEJACODE_API_URL = os.environ.get("DEJACODE_API_URL", "")
+
+DEJACODE_API_URL_PACKAGES = f"{DEJACODE_API_URL}packages/"
+DEJACODE_API_HEADERS = {
+    "Authorization": "Token {}".format(DEJACODE_API_KEY),
+    "Accept": "application/json; indent=4",
+}
+
+
+def can_do_api_calls():
+    if not (DEJACODE_API_KEY and DEJACODE_API_URL):
+        print("DejaCode DEJACODE_API_KEY and/or DEJACODE_API_URL not configured. Doing nothing")
+        return False
+    else:
+        return True
+
+
+def fetch_dejacode_packages(params):
+    """
+    Return a list of package data mappings by calling the package API with
+    `params`, or an empty list.
+    """
+    if not can_do_api_calls():
+        return []
+
+    response = requests.get(
+        DEJACODE_API_URL_PACKAGES,
+        params=params,
+        headers=DEJACODE_API_HEADERS,
+    )
+
+    return response.json()["results"]
+
+
+def get_package_data(distribution):
+    """
+    Return a mapping of package data or None for a Distribution `distribution`.
+    """
+    results = fetch_dejacode_packages(distribution.identifiers())
+
+    len_results = len(results)
+
+    if len_results == 1:
+        return results[0]
+
+    elif len_results > 1:
+        print(f"More than 1 entry exists, review at: {DEJACODE_API_URL_PACKAGES}")
+    else:
+        print("Could not find package:", distribution.download_url)
+
+
+def update_with_dejacode_data(distribution):
+    """
+    Update the Distribution `distribution` with DejaCode package data. Return
+    True if data was updated.
+    """
+    package_data = get_package_data(distribution)
+    if package_data:
+        return distribution.update(package_data, keep_extra=False)
+
+    print(f"No package found for: {distribution}")
+
+
+def update_with_dejacode_about_data(distribution):
+    """
+    Update the Distribution `distribution` with ABOUT code data fetched from
+    DejaCode. Return True if data was updated.
+    """
+    package_data = get_package_data(distribution)
+    if package_data:
+        package_api_url = package_data["api_url"]
+        about_url = f"{package_api_url}about"
+        response = requests.get(about_url, headers=DEJACODE_API_HEADERS)
+        # note that this is YAML-formatted
+        about_text = response.json()["about_data"]
+        about_data = saneyaml.load(about_text)
+
+        return distribution.update(about_data, keep_extra=True)
+
+    print(f"No package found for: {distribution}")
+
+
+def fetch_and_save_about_files(distribution, dest_dir="thirdparty"):
+    """
+    Fetch and save in `dest_dir` the .ABOUT, .LICENSE and .NOTICE files fetched
+    from DejaCode for a Distribution `distribution`. Return True if files were
+    fetched.
+    """
+    package_data = get_package_data(distribution)
+    if package_data:
+        package_api_url = package_data["api_url"]
+        about_url = f"{package_api_url}about_files"
+        response = requests.get(about_url, headers=DEJACODE_API_HEADERS)
+        about_zip = response.content
+        with io.BytesIO(about_zip) as zf:
+            with zipfile.ZipFile(zf) as zi:
+                zi.extractall(path=dest_dir)
+        return True
+
+    print(f"No package found for: {distribution}")
+
+
+def find_latest_dejacode_package(distribution):
+    """
+    Return a mapping of package data for a Distribution `distribution` or None.
+    Return the package matching the distribution exactly if there is one,
+    otherwise return the package with the latest version.
+    Filter out version-specific attributes.
+    """
+    ids = distribution.purl_identifiers(skinny=True)
+    packages = fetch_dejacode_packages(params=ids)
+    if not packages:
+        return
+
+    for package_data in packages:
+        matched = (
+            package_data["download_url"] == distribution.download_url
+            and package_data["version"] == distribution.version
+            and package_data["filename"] == distribution.filename
+        )
+
+        if matched:
+            return package_data
+
+    # there was no exact match, find the latest version
+    # TODO: consider the closest version rather than the latest
+    # or the version that has the best data
+    with_versions = [(packaging_version.parse(p["version"]), p) for p in packages]
+    with_versions = sorted(with_versions, key=lambda version_pkg: version_pkg[0])
+    latest_version, latest_package_version = with_versions[-1]
+    print(
+        f"Found DejaCode latest version: {latest_version} " f"for dist: {distribution.package_url}",
+    )
+
+    return latest_package_version
+
+
+def create_dejacode_package(distribution):
+    """
+    Create a new DejaCode Package for a Distribution `distribution`.
+    Return the new or existing package data.
+    """
+    if not can_do_api_calls():
+        return
+
+    existing_package_data = get_package_data(distribution)
+    if existing_package_data:
+        return existing_package_data
+
+    print(f"Creating new DejaCode package for: {distribution}")
+
+    new_package_payload = {
+        # Trigger data collection, scan, and purl
+        "collect_data": 1,
+    }
+
+    fields_to_carry_over = [
+        "download_url", "type",
+        "namespace",
+        "name",
+        "version",
+        "qualifiers",
+        "subpath",
+        "license_expression",
+        "copyright",
+        "description",
+        "homepage_url",
+        "primary_language",
+        "notice_text",
+    ]
+
+    for field in fields_to_carry_over:
+        value = getattr(distribution, field, None)
+        if value:
+            new_package_payload[field] = value
+
+    response = requests.post(
+        DEJACODE_API_URL_PACKAGES,
+        data=new_package_payload,
+        headers=DEJACODE_API_HEADERS,
+    )
+    new_package_data = response.json()
+    if response.status_code != 201:
+        raise Exception(f"Error, cannot create package for: {distribution}")
+
+    print(f'New Package created at: {new_package_data["absolute_url"]}')
+    return new_package_data
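The configuration guard in can_do_api_calls() depends on how `not` binds relative to `and`. A minimal sketch of that precedence follows; the function names guard_loose and guard_strict and the example URL are hypothetical and only stand in for the two environment-driven settings.

```python
# Hypothetical helpers illustrating operator precedence; they are not part
# of utils_dejacode.py.


def guard_loose(api_key, api_url):
    # Python parses this as: (not api_key) and api_url
    return bool(not api_key and api_url)


def guard_strict(api_key, api_url):
    # True whenever either setting is missing or empty
    return not (api_key and api_url)


cases = [
    ("", ""),
    ("key", ""),
    ("", "https://example.invalid/api/"),
    ("key", "https://example.invalid/api/"),
]

for key, url in cases:
    print(repr(key), repr(url), guard_loose(key, url), guard_strict(key, url))
```

Only the parenthesized form reports a problem when both settings are empty or when the URL alone is missing.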
diff --git a/etc/scripts/utils_pip_compatibility_tags.py b/etc/scripts/utils_pip_compatibility_tags.py
new file mode 100644
index 0000000..af42a0c
--- /dev/null
+++ b/etc/scripts/utils_pip_compatibility_tags.py
@@ -0,0 +1,192 @@
+"""Generate and work with PEP 425 Compatibility Tags.
+
+copied from pip-20.3.1 pip/_internal/utils/compatibility_tags.py
+download_url: https://github.com/pypa/pip/blob/20.3.1/src/pip/_internal/utils/compatibility_tags.py
+
+Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+"""
+
+import re
+
+from packvers.tags import (
+    compatible_tags,
+    cpython_tags,
+    generic_tags,
+    interpreter_name,
+    interpreter_version,
+    mac_platforms,
+)
+
+_osx_arch_pat = re.compile(r"(.+)_(\d+)_(\d+)_(.+)")
+
+
+def version_info_to_nodot(version_info):
+    # type: (Tuple[int, ...]) -> str
+    # Only use up to the first two numbers.
+    return "".join(map(str, version_info[:2]))
+
+
+def _mac_platforms(arch):
+    # type: (str) -> List[str]
+    match = _osx_arch_pat.match(arch)
+    if match:
+        name, major, minor, actual_arch = match.groups()
+        mac_version = (int(major), int(minor))
+        arches = [
+            # Since we have always only checked that the platform starts
+            # with "macosx", for backwards-compatibility we extract the
+            # actual prefix provided by the user in case they provided
+            # something like "macosxcustom_". It may be good to remove
+            # this as undocumented or deprecate it in the future.
+            "{}_{}".format(name, arch[len("macosx_") :])
+            for arch in mac_platforms(mac_version, actual_arch)
+        ]
+    else:
+        # arch pattern didn't match (?!)
+        arches = [arch]
+    return arches
+
+
+def _custom_manylinux_platforms(arch):
+    # type: (str) -> List[str]
+    arches = [arch]
+    arch_prefix, arch_sep, arch_suffix = arch.partition("_")
+    if arch_prefix == "manylinux2014":
+        # manylinux1/manylinux2010 wheels run on most manylinux2014 systems
+        # with the exception of wheels depending on ncurses. PEP 599 states
+        # manylinux1/manylinux2010 wheels should be considered
+        # manylinux2014 wheels:
+        # https://www.python.org/dev/peps/pep-0599/#backwards-compatibility-with-manylinux2010-wheels
+        if arch_suffix in {"i686", "x86_64"}:
+            arches.append("manylinux2010" + arch_sep + arch_suffix)
+            arches.append("manylinux1" + arch_sep + arch_suffix)
+    elif arch_prefix == "manylinux2010":
+        # manylinux1 wheels run on most manylinux2010 systems with the
+        # exception of wheels depending on ncurses. PEP 571 states
+        # manylinux1 wheels should be considered manylinux2010 wheels:
+        # https://www.python.org/dev/peps/pep-0571/#backwards-compatibility-with-manylinux1-wheels
+        arches.append("manylinux1" + arch_sep + arch_suffix)
+    return arches
+
+
+def _get_custom_platforms(arch):
+    # type: (str) -> List[str]
+    arch_prefix, _arch_sep, _arch_suffix = arch.partition("_")
+    if arch.startswith("macosx"):
+        arches = _mac_platforms(arch)
+    elif arch_prefix in ["manylinux2014", "manylinux2010"]:
+        arches = _custom_manylinux_platforms(arch)
+    else:
+        arches = [arch]
+    return arches
+
+
+def _expand_allowed_platforms(platforms):
+    # type: (Optional[List[str]]) -> Optional[List[str]]
+    if not platforms:
+        return None
+
+    seen = set()
+    result = []
+
+    for p in platforms:
+        if p in seen:
+            continue
+        additions = [c for c in _get_custom_platforms(p) if c not in seen]
+        seen.update(additions)
+        result.extend(additions)
+
+    return result
+
+
+def _get_python_version(version):
+    # type: (str) -> PythonVersion
+    if len(version) > 1:
+        return int(version[0]), int(version[1:])
+    else:
+        return (int(version[0]),)
+
+
+def _get_custom_interpreter(implementation=None, version=None):
+    # type: (Optional[str], Optional[str]) -> str
+    if implementation is None:
+        implementation = interpreter_name()
+    if version is None:
+        version = interpreter_version()
+    return "{}{}".format(implementation, version)
+
+
+def get_supported(
+    version=None,  # type: Optional[str]
+    platforms=None,  # type: Optional[List[str]]
+    impl=None,  # type: Optional[str]
+    abis=None,  # type: Optional[List[str]]
+):
+    # type: (...) -> List[Tag]
+    """Return a list of supported tags for the version specified in
+    `version`.
+
+    :param version: a string version, of the form "33" or "32",
+        or None. The version will be assumed to support our ABI.
+    :param platforms: specify a list of platforms you want valid
+        tags for, or None. If None, use the local system platform.
+    :param impl: specify the exact implementation you want valid
+        tags for, or None. If None, use the local interpreter impl.
+    :param abis: specify a list of abis you want valid
+        tags for, or None. If None, use the local interpreter abi.
+    """
+    supported = []  # type: List[Tag]
+
+    python_version = None  # type: Optional[PythonVersion]
+    if version is not None:
+        python_version = _get_python_version(version)
+
+    interpreter = _get_custom_interpreter(impl, version)
+
+    platforms = _expand_allowed_platforms(platforms)
+
+    is_cpython = (impl or interpreter_name()) == "cp"
+    if is_cpython:
+        supported.extend(
+            cpython_tags(
+                python_version=python_version,
+                abis=abis,
+                platforms=platforms,
+            )
+        )
+    else:
+        supported.extend(
+            generic_tags(
+                interpreter=interpreter,
+                abis=abis,
+                platforms=platforms,
+            )
+        )
+    supported.extend(
+        compatible_tags(
+            python_version=python_version,
+            interpreter=interpreter,
+            platforms=platforms,
+        )
+    )
+
+    return supported
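The manylinux expansion and the order-preserving de-duplication used above can be sketched in isolation. This is a simplified standalone version using only the standard library; the names expand_manylinux and expand_all are hypothetical and the macOS handling is left out.

```python
def expand_manylinux(platform_tag):
    """Return platform_tag followed by the older manylinux tags it implies."""
    arches = [platform_tag]
    prefix, sep, suffix = platform_tag.partition("_")
    if prefix == "manylinux2014":
        # Only the x86 family has manylinux2010/manylinux1 predecessors.
        if suffix in {"i686", "x86_64"}:
            arches.append("manylinux2010" + sep + suffix)
            arches.append("manylinux1" + sep + suffix)
    elif prefix == "manylinux2010":
        arches.append("manylinux1" + sep + suffix)
    return arches


def expand_all(platform_tags):
    """Expand every tag, keeping first-seen order and dropping duplicates."""
    seen = set()
    result = []
    for tag in platform_tags:
        for candidate in expand_manylinux(tag):
            if candidate not in seen:
                seen.add(candidate)
                result.append(candidate)
    return result


print(expand_all(["manylinux2014_x86_64", "manylinux2010_x86_64"]))
```

The newest tag stays first, which is the ordering the manylinux tests in this diff assert on.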
diff --git a/etc/scripts/utils_pip_compatibility_tags.py.ABOUT b/etc/scripts/utils_pip_compatibility_tags.py.ABOUT
new file mode 100644
index 0000000..7bbb026
--- /dev/null
+++ b/etc/scripts/utils_pip_compatibility_tags.py.ABOUT
@@ -0,0 +1,14 @@
+about_resource: utils_pip_compatibility_tags.py
+
+type: github
+namespace: pypa
+name: pip
+version: 20.3.1
+subpath: src/pip/_internal/utils/compatibility_tags.py
+
+package_url: pkg:github/pypa/pip@20.3.1#src/pip/_internal/utils/compatibility_tags.py
+
+download_url: https://github.com/pypa/pip/blob/20.3.1/src/pip/_internal/utils/compatibility_tags.py
+copyright: Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+license_expression: mit
+notes: subset copied from pip for tag handling
\ No newline at end of file
diff --git a/etc/scripts/utils_pypi_supported_tags.py b/etc/scripts/utils_pypi_supported_tags.py
new file mode 100644
index 0000000..de9f21b
--- /dev/null
+++ b/etc/scripts/utils_pypi_supported_tags.py
@@ -0,0 +1,105 @@
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+
+"""
+Wheel platform checking
+
+Copied and modified on 2020-12-24 from
+https://github.com/pypa/warehouse/blob/37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d/warehouse/forklift/legacy.py
+
+This contains the basic functions to check if a wheel file name would be
+supported for uploading to PyPI.
+"""
+
+# These platforms can be handled by a simple static list:
+_allowed_platforms = {
+    "any",
+    "win32",
+    "win_amd64",
+    "win_ia64",
+    "manylinux1_x86_64",
+    "manylinux1_i686",
+    "manylinux2010_x86_64",
+    "manylinux2010_i686",
+    "manylinux2014_x86_64",
+    "manylinux2014_i686",
+    "manylinux2014_aarch64",
+    "manylinux2014_armv7l",
+    "manylinux2014_ppc64",
+    "manylinux2014_ppc64le",
+    "manylinux2014_s390x",
+    "linux_armv6l",
+    "linux_armv7l",
+}
+# macosx is a little more complicated:
+_macosx_platform_re = re.compile(r"macosx_(?P<major>\d+)_(\d+)_(?P<arch>.*)")
+_macosx_arches = {
+    "ppc",
+    "ppc64",
+    "i386",
+    "x86_64",
+    "arm64",
+    "intel",
+    "fat",
+    "fat32",
+    "fat64",
+    "universal",
+    "universal2",
+}
+_macosx_major_versions = {
+    "10",
+    "11",
+}
+
+# manylinux pep600 is a little more complicated:
+_manylinux_platform_re = re.compile(r"manylinux_(\d+)_(\d+)_(?P<arch>.*)")
+_manylinux_arches = {
+    "x86_64",
+    "i686",
+    "aarch64",
+    "armv7l",
+    "ppc64",
+    "ppc64le",
+    "s390x",
+}
+
+
+def is_supported_platform_tag(platform_tag):
+    """
+    Return True if the ``platform_tag`` is supported on PyPI.
+    """
+    if platform_tag in _allowed_platforms:
+        return True
+    m = _macosx_platform_re.match(platform_tag)
+    if m and m.group("major") in _macosx_major_versions and m.group("arch") in _macosx_arches:
+        return True
+    m = _manylinux_platform_re.match(platform_tag)
+    if m and m.group("arch") in _manylinux_arches:
+        return True
+    return False
+
+
+def validate_platforms_for_pypi(platforms):
+    """
+    Validate if the wheel platforms are supported platform tags on PyPI. Return
+    a list of unsupported platform tags or an empty list if all tags are
+    supported.
+    """
+
+    # Check that if it's a binary wheel, it's on a supported platform
+    invalid_tags = []
+    for plat in platforms:
+        if not is_supported_platform_tag(plat):
+            invalid_tags.append(plat)
+    return invalid_tags
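The macOS check combines a regular expression with two allow-lists. A minimal sketch follows, assuming that compound tags are split on "." before each part is checked; that split is not shown in this module. The names is_supported_macosx and unsupported_parts are hypothetical and the arch set is abbreviated.

```python
import re

# Same pattern shape as _macosx_platform_re; the arch set below is abbreviated.
MACOSX_RE = re.compile(r"macosx_(?P<major>\d+)_(\d+)_(?P<arch>.*)")
MACOSX_MAJORS = {"10", "11"}
MACOSX_ARCHES = {"x86_64", "arm64", "intel", "universal2"}


def is_supported_macosx(tag):
    """Return True if a single macOS platform tag passes both allow-lists."""
    m = MACOSX_RE.match(tag)
    return bool(
        m
        and m.group("major") in MACOSX_MAJORS
        and m.group("arch") in MACOSX_ARCHES
    )


def unsupported_parts(compound_tag):
    """Return the parts of a dot-separated tag that are not supported."""
    # Assumption: the caller splits compound tags on "." before checking.
    return [p for p in compound_tag.split(".") if not is_supported_macosx(p)]


print(is_supported_macosx("macosx_10_13_x86_64"))
print(is_supported_macosx("macosx_12_2_arm64"))
print(unsupported_parts("macosx_10_6_intel.macosx_10_9_x86_64"))
```

Because the arch group is greedy, an unsplit compound tag captures everything after the first version as its arch and fails the allow-list, so the split has to happen first.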
diff --git a/etc/scripts/utils_pypi_supported_tags.py.ABOUT b/etc/scripts/utils_pypi_supported_tags.py.ABOUT
new file mode 100644
index 0000000..228a538
--- /dev/null
+++ b/etc/scripts/utils_pypi_supported_tags.py.ABOUT
@@ -0,0 +1,17 @@
+about_resource: utils_pypi_supported_tags.py
+
+type: github
+namespace: pypa
+name: warehouse
+version: 37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d
+subpath: warehouse/forklift/legacy.py
+
+package_url: pkg:github/pypa/warehouse@37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d#warehouse/forklift/legacy.py
+
+download_url: https://github.com/pypa/warehouse/blob/37a83dd342d9e3b3ab4f6bde47ca30e6883e2c4d/warehouse/forklift/legacy.py
+copyright: Copyright (c) The warehouse developers
+homepage_url: https://warehouse.readthedocs.io
+license_expression: apache-2.0
+notes: Wheel platform checking copied and heavily modified on 2020-12-24 from
+ warehouse. This contains the basic functions to check if a wheel file name
+ would be supported for uploading to PyPI.
diff --git a/etc/scripts/utils_requirements.py b/etc/scripts/utils_requirements.py
new file mode 100644
index 0000000..0fc25a3
--- /dev/null
+++ b/etc/scripts/utils_requirements.py
@@ -0,0 +1,157 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+
+import os
+import re
+import subprocess
+
+"""
+Utilities to manage requirements files and call pip.
+NOTE: this should use ONLY the standard library and not import anything else
+because this is used for bootstrapping with no requirements installed.
+"""
+
+
+def load_requirements(requirements_file="requirements.txt", with_unpinned=False):
+    """
+    Yield package (name, version) tuples for each requirement in the
+    `requirements_file` file. Only accept pinned requirements by default.
+    """
+    with open(requirements_file) as reqs:
+        req_lines = reqs.read().splitlines(False)
+    return get_required_name_versions(req_lines, with_unpinned=with_unpinned)
+
+
+def get_required_name_versions(requirement_lines, with_unpinned=False):
+    """
+    Yield required (name, version) tuples given a `requirement_lines` iterable of
+    requirement text lines. Only accept requirements pinned to an exact version.
+    """
+
+    for req_line in requirement_lines:
+        req_line = req_line.strip()
+        if not req_line or req_line.startswith("#"):
+            continue
+        if req_line.startswith("-") or (not with_unpinned and "==" not in req_line):
+            print(f"Requirement line is not supported: ignored: {req_line}")
+            continue
+        yield get_required_name_version(requirement=req_line, with_unpinned=with_unpinned)
+
+
+def get_required_name_version(requirement, with_unpinned=False):
+    """
+    Return a (name, version) tuple given a `requirement` specifier string.
+    Requirement version must be pinned. If ``with_unpinned`` is True, unpinned
+    requirements are accepted and only the name portion is returned.
+
+    For example:
+    >>> assert get_required_name_version("foo==1.2.3") == ("foo", "1.2.3")
+    >>> assert get_required_name_version("fooA==1.2.3.DEV1") == ("fooa", "1.2.3.dev1")
+    >>> assert get_required_name_version("foo==1.2.3", with_unpinned=False) == ("foo", "1.2.3")
+    >>> assert get_required_name_version("foo", with_unpinned=True) == ("foo", "")
+    >>> assert get_required_name_version("foo>=1.2", with_unpinned=True) == ("foo", ""), get_required_name_version("foo>=1.2")
+    >>> try:
+    ...   assert not get_required_name_version("foo", with_unpinned=False)
+    ... except Exception as e:
+    ...   assert "Requirement version must be pinned" in str(e)
+    """
+    requirement = requirement and "".join(requirement.lower().split())
+    assert requirement, f"Requirement specifier is required but is empty: {requirement!r}"
+    name, operator, version = split_req(requirement)
+    assert name, f"Name is required: {requirement}"
+    is_pinned = operator == "=="
+    if with_unpinned:
+        version = ""
+    else:
+        assert is_pinned and version, f"Requirement version must be pinned: {requirement}"
+    return name, version
+
+
+def lock_requirements(requirements_file="requirements.txt", site_packages_dir=None):
+    """
+    Freeze and lock current installed requirements and save this to the
+    `requirements_file` requirements file.
+    """
+    with open(requirements_file, "w") as fo:
+        fo.write(get_installed_reqs(site_packages_dir=site_packages_dir))
+
+
+def lock_dev_requirements(
+    dev_requirements_file="requirements-dev.txt",
+    main_requirements_file="requirements.txt",
+    site_packages_dir=None,
+):
+    """
+    Freeze and lock current installed development-only requirements and save
+    this to the `dev_requirements_file` requirements file. Development-only is
+    achieved by subtracting requirements from the `main_requirements_file`
+    requirements file from the current requirements using package names (and
+    ignoring versions).
+    """
+    main_names = {n for n, _v in load_requirements(main_requirements_file)}
+    all_reqs = get_installed_reqs(site_packages_dir=site_packages_dir)
+    all_req_lines = all_reqs.splitlines(False)
+    all_req_nvs = get_required_name_versions(all_req_lines)
+    dev_only_req_nvs = {n: v for n, v in all_req_nvs if n not in main_names}
+
+    new_reqs = "\n".join(f"{n}=={v}" for n, v in sorted(dev_only_req_nvs.items()))
+    with open(dev_requirements_file, "w") as fo:
+        fo.write(new_reqs)
+
+
+def get_installed_reqs(site_packages_dir):
+    """
+    Return the installed pip requirements found in `site_packages_dir` as
+    text.
+    """
+    if not os.path.exists(site_packages_dir):
+        raise Exception(f"site_packages directory: {site_packages_dir!r} does not exist")
+    # Also include these packages in the output with --all: wheel, distribute,
+    # setuptools, pip
+    args = ["pip", "freeze", "--exclude-editable", "--all", "--path", site_packages_dir]
+    return subprocess.check_output(args, encoding="utf-8")
+
+
+comparators = (
+    "===",
+    "~=",
+    "!=",
+    "==",
+    "<=",
+    ">=",
+    ">",
+    "<",
+)
+
+_comparators_re = r"|".join(comparators)
+version_splitter = re.compile(rf"({_comparators_re})")
+
+
+def split_req(req):
+    """
+    Return a three-tuple of (name, comparator, version) given a ``req``
+    requirement specifier string. Each segment may be empty. Spaces are removed.
+
+    For example:
+    >>> assert split_req("foo==1.2.3") == ("foo", "==", "1.2.3"), split_req("foo==1.2.3")
+    >>> assert split_req("foo") == ("foo", "", ""), split_req("foo")
+    >>> assert split_req("==1.2.3") == ("", "==", "1.2.3"), split_req("==1.2.3")
+    >>> assert split_req("foo >= 1.2.3 ") == ("foo", ">=", "1.2.3"), split_req("foo >= 1.2.3 ")
+    >>> assert split_req("foo>=1.2") == ("foo", ">=", "1.2"), split_req("foo>=1.2")
+    """
+    assert req
+    # do not allow multiple constraints and tags
+    assert not any(c in req for c in ",;")
+    req = "".join(req.split())
+    if not any(c in req for c in comparators):
+        return req, "", ""
+    segments = version_splitter.split(req, maxsplit=1)
+    return tuple(segments)
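split_req() relies on re.split() keeping the matched separator when the pattern has a capturing group. A minimal standalone sketch of that technique follows; the name split_once is hypothetical, and re.escape() is added here as a precaution that the module itself does not use.

```python
import re

# Longest operators first, so that "===" wins over "==" at the same position.
COMPARATORS = ("===", "~=", "!=", "==", "<=", ">=", ">", "<")
SPLITTER = re.compile("(" + "|".join(re.escape(c) for c in COMPARATORS) + ")")


def split_once(req):
    """Return (name, comparator, version) for a single-constraint specifier."""
    req = "".join(req.split())  # drop all whitespace
    # The capturing group makes re.split() return the operator as well.
    parts = SPLITTER.split(req, maxsplit=1)
    if len(parts) == 1:
        return req, "", ""
    return tuple(parts)


print(split_once("foo==1.2.3"))
print(split_once("foo >= 1.2 "))
print(split_once("foo"))
```

With maxsplit=1 the result is always either one segment (no operator found) or exactly three.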
diff --git a/etc/scripts/utils_thirdparty.py b/etc/scripts/utils_thirdparty.py
new file mode 100644
index 0000000..addf8e5
--- /dev/null
+++ b/etc/scripts/utils_thirdparty.py
@@ -0,0 +1,2288 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+import email
+import itertools
+import os
+import re
+import shutil
+import subprocess
+import tempfile
+import time
+import urllib
+from collections import defaultdict
+from urllib.parse import quote_plus
+
+import attr
+import license_expression
+import packageurl
+import requests
+import saneyaml
+from commoncode import fileutils
+from commoncode.hash import multi_checksums
+from commoncode.text import python_safe_name
+from packvers import tags as packaging_tags
+from packvers import version as packaging_version
+
+import utils_pip_compatibility_tags
+
+"""
+Utilities to manage Python thirdparty libraries source, binaries and metadata in
+local directories and remote repositories.
+
+- download wheels for packages for each supported combination of operating
+  system (Linux, macOS, Windows) and Python version (3.x)
+
+- download sources for packages (aka. sdist)
+
+- create, update and download ABOUT, NOTICE and LICENSE metadata for these
+  wheels and source distributions
+
+- update pip requirement files based on actually installed packages for
+  production and development
+
+
+Approach
+--------
+
+The processing is organized around these key objects:
+
+- A PyPiPackage represents a PyPI package with its name and version and the
+  metadata used to populate an .ABOUT file and document origin and license.
+  It contains the downloadable Distribution objects for that version:
+
+  - one Sdist source Distribution
+  - a list of Wheel binary Distributions
+
+- A Distribution (either a Wheel or Sdist) is identified by and created from its
+  filename as well as its name and version.
+  A Distribution is fetched from a Repository.
+  Distribution metadata can be loaded from and dumped to ABOUT files.
+
+- A Wheel binary Distribution can have Python/Platform/OS tags it supports and
+  was built for and these tags can be matched to an Environment.
+
+- An Environment is a combination of a Python version and operating system
+  (e.g., platform and ABI tags) and is represented by the "tags" it supports.
+
+- A plain LinksRepository is just a collection of URLs scraped from a web
+  page such as an HTTP directory listing. It is used either with the pip
+  "--find-links" option or to fetch ABOUT and LICENSE files.
+
+- A PypiSimpleRepository is a PyPI "simple" index where an HTML page lists
+  package name links. Each such link points to an HTML page listing URLs to all
+  wheels and sdists of all versions of this package.
+
+PypiSimpleRepository and Packages are related through package name, version and
+filenames.
+
+The Wheel models code is partially derived from the MIT-licensed pip and the
+Distribution/Wheel/Sdist design has been heavily inspired by the packaging-
+dists library https://github.com/uranusjr/packaging-dists by Tzu-ping Chung
+"""
+
+"""
+Wheel downloader
+
+- parse requirement file
+- create a TODO queue of requirements to process
+- done: create an empty map of processed binary requirements as {package name: (list of versions/tags)}
+
+
+- while we have package reqs in TODO queue, process one requirement:
+    - for each PyPI simple index:
+        - fetch through cache the PyPI simple index for this package
+        - for each environment:
+            - find a wheel matching pinned requirement in this index
+            - if file exists locally, continue
+            - fetch the wheel for env
+                - IF pure, break, no more needed for env
+            - collect requirement deps from wheel metadata and add to queue
+    - if fetched, break, otherwise display error message
+
+
+"""
+
+TRACE = False
+TRACE_DEEP = False
+TRACE_ULTRA_DEEP = False
+
+# Supported environments
+PYTHON_VERSIONS = "37", "38", "39", "310"
+
+PYTHON_DOT_VERSIONS_BY_VER = {
+    "37": "3.7",
+    "38": "3.8",
+    "39": "3.9",
+    "310": "3.10",
+}
+
+
+def get_python_dot_version(version):
+    """
+    Return a dot version from a plain, non-dot version.
+    """
+    return PYTHON_DOT_VERSIONS_BY_VER[version]
+
+
+ABIS_BY_PYTHON_VERSION = {
+    "37": ["cp37", "cp37m", "abi3"],
+    "38": ["cp38", "cp38m", "abi3"],
+    "39": ["cp39", "cp39m", "abi3"],
+    "310": ["cp310", "cp310m", "abi3"],
+}
+
+PLATFORMS_BY_OS = {
+    "linux": [
+        "linux_x86_64",
+        "manylinux1_x86_64",
+        "manylinux2010_x86_64",
+        "manylinux2014_x86_64",
+    ],
+    "macos": [
+        "macosx_10_6_intel",
+        "macosx_10_6_x86_64",
+        "macosx_10_9_intel",
+        "macosx_10_9_x86_64",
+        "macosx_10_10_intel",
+        "macosx_10_10_x86_64",
+        "macosx_10_11_intel",
+        "macosx_10_11_x86_64",
+        "macosx_10_12_intel",
+        "macosx_10_12_x86_64",
+        "macosx_10_13_intel",
+        "macosx_10_13_x86_64",
+        "macosx_10_14_intel",
+        "macosx_10_14_x86_64",
+        "macosx_10_15_intel",
+        "macosx_10_15_x86_64",
+        "macosx_11_0_x86_64",
+        "macosx_11_intel",
+        "macosx_11_0_x86_64",
+        "macosx_11_intel",
+        "macosx_10_9_universal2",
+        "macosx_10_10_universal2",
+        "macosx_10_11_universal2",
+        "macosx_10_12_universal2",
+        "macosx_10_13_universal2",
+        "macosx_10_14_universal2",
+        "macosx_10_15_universal2",
+        "macosx_11_0_universal2",
+        # 'macosx_11_0_arm64',
+    ],
+    "windows": [
+        "win_amd64",
+    ],
+}
+
+THIRDPARTY_DIR = "thirdparty"
+CACHE_THIRDPARTY_DIR = ".cache/thirdparty"
+
+################################################################################
+
+ABOUT_BASE_URL = "https://thirdparty.aboutcode.org/pypi"
+ABOUT_PYPI_SIMPLE_URL = f"{ABOUT_BASE_URL}/simple"
+ABOUT_LINKS_URL = f"{ABOUT_PYPI_SIMPLE_URL}/links.html"
+PYPI_SIMPLE_URL = "https://pypi.org/simple"
+PYPI_INDEX_URLS = (PYPI_SIMPLE_URL, ABOUT_PYPI_SIMPLE_URL)
+
+################################################################################
+
+EXTENSIONS_APP = (".pyz",)
+EXTENSIONS_SDIST = (
+    ".tar.gz",
+    ".zip",
+    ".tar.xz",
+)
+EXTENSIONS_INSTALLABLE = EXTENSIONS_SDIST + (".whl",)
+EXTENSIONS_ABOUT = (
+    ".ABOUT",
+    ".LICENSE",
+    ".NOTICE",
+)
+EXTENSIONS = EXTENSIONS_INSTALLABLE + EXTENSIONS_ABOUT + EXTENSIONS_APP
+
+LICENSEDB_API_URL = "https://scancode-licensedb.aboutcode.org"
+
+LICENSING = license_expression.Licensing()
+
+collect_urls = re.compile('href="([^"]+)"').findall
+
+################################################################################
+# Fetch wheels and sources locally
+################################################################################
+
+
+class DistributionNotFound(Exception):
+    pass
+
+
+def download_wheel(name, version, environment, dest_dir=THIRDPARTY_DIR, repos=tuple()):
+    """
+    Download the wheel binary distribution(s) of package ``name`` and
+    ``version`` matching the ``environment`` Environment constraints into the
+    ``dest_dir`` directory. Return a list of fetched_wheel_filenames, possibly
+    empty.
+
+    Use the first PyPI simple repository from a list of ``repos`` that contains this wheel.
+    """
+    if TRACE_DEEP:
+        print(f"  download_wheel: {name}=={version} for envt: {environment}")
+
+    if not repos:
+        repos = DEFAULT_PYPI_REPOS
+
+    fetched_wheel_filenames = []
+
+    for repo in repos:
+        package = repo.get_package_version(name=name, version=version)
+        if not package:
+            if TRACE_DEEP:
+                print(f"    download_wheel: No package in {repo.index_url} for {name}=={version}")
+            continue
+        supported_wheels = list(package.get_supported_wheels(environment=environment))
+        if not supported_wheels:
+            if TRACE_DEEP:
+                print(
+                    f"    download_wheel: No supported wheel for {name}=={version}: {environment} "
+                )
+            continue
+
+        for wheel in supported_wheels:
+            if TRACE_DEEP:
+                print(
+                    f"    download_wheel: Getting wheel from index (or cache): {wheel.download_url}"
+                )
+            fetched_wheel_filename = wheel.download(dest_dir=dest_dir)
+            fetched_wheel_filenames.append(fetched_wheel_filename)
+
+        if fetched_wheel_filenames:
+            # do not fetch further from other repos if found in the first one, typically PyPI
+            break
+
+    return fetched_wheel_filenames
+
+
+def download_sdist(name, version, dest_dir=THIRDPARTY_DIR, repos=tuple()):
+    """
+    Download the sdist source distribution of package ``name`` and ``version``
+    into the ``dest_dir`` directory. Return a fetched filename or None.
+
+    Use the first PyPI simple repository from a list of ``repos`` that contains
+    this sdist.
+    """
+    if TRACE:
+        print(f"  download_sdist: {name}=={version}")
+
+    if not repos:
+        repos = DEFAULT_PYPI_REPOS
+
+    fetched_sdist_filename = None
+
+    for repo in repos:
+        package = repo.get_package_version(name=name, version=version)
+
+        if not package:
+            if TRACE_DEEP:
+                print(f"    download_sdist: No package in {repo.index_url} for {name}=={version}")
+            continue
+        sdist = package.sdist
+        if not sdist:
+            if TRACE_DEEP:
+                print(f"    download_sdist: No sdist for {name}=={version}")
+            continue
+
+        if TRACE_DEEP:
+            print(f"    download_sdist: Getting sdist from index (or cache): {sdist.download_url}")
+        fetched_sdist_filename = package.sdist.download(dest_dir=dest_dir)
+
+        if fetched_sdist_filename:
+            # do not fetch further from other repos if found in the first one, typically PyPI
+            break
+
+    return fetched_sdist_filename
+
+
+################################################################################
+#
+# Core models
+#
+################################################################################
+
+
+@attr.attributes
+class NameVer:
+    name = attr.ib(
+        type=str,
+        metadata=dict(help="Python package name, lowercase and normalized."),
+    )
+
+    version = attr.ib(
+        type=str,
+        metadata=dict(help="Python package version string."),
+    )
+
+    @property
+    def normalized_name(self):
+        return NameVer.normalize_name(self.name)
+
+    @staticmethod
+    def normalize_name(name):
+        """
+        Return a normalized package name per PEP503, and copied from
+        https://www.python.org/dev/peps/pep-0503/#id4
+        """
+        return name and re.sub(r"[-_.]+", "-", name).lower() or name
+
+    def sortable_name_version(self):
+        """
+        Return a tuple of values to sort by name, then version.
+        This method is suitable to use as a key for sorting NameVer instances.
+        """
+        return self.normalized_name, packaging_version.parse(self.version)
+
+    @classmethod
+    def sorted(cls, namevers):
+        return sorted(namevers or [], key=cls.sortable_name_version)
+
+
+@attr.attributes
+class Distribution(NameVer):
+
+    # field names that can be updated from another Distribution or mapping
+    updatable_fields = [
+        "license_expression",
+        "copyright",
+        "description",
+        "homepage_url",
+        "primary_language",
+        "notice_text",
+        "extra_data",
+    ]
+
+    filename = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="File name."),
+    )
+
+    path_or_url = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Path or URL"),
+    )
+
+    sha256 = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="SHA256 checksum."),
+    )
+
+    sha1 = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="SHA1 checksum."),
+    )
+
+    md5 = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="MD5 checksum."),
+    )
+
+    type = attr.ib(
+        repr=False,
+        type=str,
+        default="pypi",
+        metadata=dict(help="Package type"),
+    )
+
+    namespace = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Package URL namespace"),
+    )
+
+    qualifiers = attr.ib(
+        repr=False,
+        type=dict,
+        default=attr.Factory(dict),
+        metadata=dict(help="Package URL qualifiers"),
+    )
+
+    subpath = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Package URL subpath"),
+    )
+
+    size = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Size in bytes."),
+    )
+
+    primary_language = attr.ib(
+        repr=False,
+        type=str,
+        default="Python",
+        metadata=dict(help="Primary Programming language."),
+    )
+
+    description = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Description."),
+    )
+
+    homepage_url = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Homepage URL"),
+    )
+
+    notes = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Notes."),
+    )
+
+    copyright = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Copyright."),
+    )
+
+    license_expression = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="License expression"),
+    )
+
+    licenses = attr.ib(
+        repr=False,
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of license mappings."),
+    )
+
+    notice_text = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="Notice text"),
+    )
+
+    extra_data = attr.ib(
+        repr=False,
+        type=dict,
+        default=attr.Factory(dict),
+        metadata=dict(help="Extra data"),
+    )
+
+    @property
+    def package_url(self):
+        """
+        Return a Package URL string of self.
+        """
+        return str(
+            packageurl.PackageURL(
+                type=self.type,
+                namespace=self.namespace,
+                name=self.name,
+                version=self.version,
+                subpath=self.subpath,
+                qualifiers=self.qualifiers,
+            )
+        )
+
+    @property
+    def download_url(self):
+        return self.get_best_download_url()
+
+    def get_best_download_url(self, repos=tuple()):
+        """
+        Return the best download URL for this distribution, where best means
+        the first URL found for this distribution in the list of ``repos``,
+        searched in order.
+
+        Return None if no URL is found in any of the ``repos``.
+        """
+
+        if not repos:
+            repos = DEFAULT_PYPI_REPOS
+
+        for repo in repos:
+            package = repo.get_package_version(name=self.name, version=self.version)
+            if not package:
+                if TRACE:
+                    print(
+                        f"     get_best_download_url: {self.name}=={self.version} "
+                        f"not found in {repo.index_url}"
+                    )
+                continue
+            pypi_url = package.get_url_for_filename(self.filename)
+            if pypi_url:
+                return pypi_url
+            else:
+                if TRACE:
+                    print(
+                        f"     get_best_download_url: {self.filename} not found in {repo.index_url}"
+                    )
+
+    def download(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Download this distribution into `dest_dir` directory.
+        Return the fetched filename.
+        """
+        assert self.filename
+        if TRACE_DEEP:
+            print(
+                f"Fetching distribution of {self.name}=={self.version}:",
+                self.filename,
+            )
+
+        # FIXME:
+        fetch_and_save(
+            path_or_url=self.path_or_url,
+            dest_dir=dest_dir,
+            filename=self.filename,
+            as_text=False,
+        )
+        return self.filename
+
+    @property
+    def about_filename(self):
+        return f"{self.filename}.ABOUT"
+
+    @property
+    def about_download_url(self):
+        return f"{ABOUT_BASE_URL}/{self.about_filename}"
+
+    @property
+    def notice_filename(self):
+        return f"{self.filename}.NOTICE"
+
+    @property
+    def notice_download_url(self):
+        return f"{ABOUT_BASE_URL}/{self.notice_filename}"
+
+    @classmethod
+    def from_path_or_url(cls, path_or_url):
+        """
+        Return a distribution built from the data found in the filename of a
+        ``path_or_url`` string. Raise an exception if this is not a valid
+        filename.
+        """
+        filename = os.path.basename(path_or_url.strip("/"))
+        dist = cls.from_filename(filename)
+        dist.path_or_url = path_or_url
+        return dist
+
+    @classmethod
+    def get_dist_class(cls, filename):
+        if filename.endswith(".whl"):
+            return Wheel
+        elif filename.endswith(
+            (
+                ".zip",
+                ".tar.gz",
+            )
+        ):
+            return Sdist
+        raise InvalidDistributionFilename(filename)
+
+    @classmethod
+    def from_filename(cls, filename):
+        """
+        Return a distribution built from the data found in a `filename` string.
+        Raise an exception if this is not a valid filename
+        """
+        filename = os.path.basename(filename.strip("/"))
+        clazz = cls.get_dist_class(filename)
+        return clazz.from_filename(filename)
+
+    def has_key_metadata(self):
+        """
+        Return True if this distribution has key metadata required for basic attribution.
+        """
+        if self.license_expression == "public-domain":
+            # copyright not needed
+            return True
+        return bool(self.license_expression and self.copyright and self.path_or_url)
+
+    def to_about(self):
+        """
+        Return a mapping of ABOUT data from this distribution fields.
+        """
+        about_data = dict(
+            about_resource=self.filename,
+            checksum_md5=self.md5,
+            checksum_sha1=self.sha1,
+            copyright=self.copyright,
+            description=self.description,
+            download_url=self.download_url,
+            homepage_url=self.homepage_url,
+            license_expression=self.license_expression,
+            name=self.name,
+            namespace=self.namespace,
+            notes=self.notes,
+            notice_file=self.notice_filename if self.notice_text else "",
+            package_url=self.package_url,
+            primary_language=self.primary_language,
+            qualifiers=self.qualifiers,
+            size=self.size,
+            subpath=self.subpath,
+            type=self.type,
+            version=self.version,
+        )
+
+        about_data.update(self.extra_data)
+        about_data = {k: v for k, v in sorted(about_data.items()) if v}
+        return about_data
+
+    def to_dict(self):
+        """
+        Return a mapping of data from this distribution.
+        """
+        return {k: v for k, v in attr.asdict(self).items() if v}
+
+    def save_about_and_notice_files(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Save a .ABOUT file to `dest_dir`. Include a .NOTICE file if there is a
+        notice_text.
+        """
+
+        def save_if_modified(location, content):
+            if os.path.exists(location):
+                with open(location) as fi:
+                    existing_content = fi.read()
+                if existing_content == content:
+                    return False
+
+            if TRACE:
+                print(f"Saving ABOUT (and NOTICE) files for: {self}")
+            with open(location, "w") as fo:
+                fo.write(content)
+            return True
+
+        as_about = self.to_about()
+
+        save_if_modified(
+            location=os.path.join(dest_dir, self.about_filename),
+            content=saneyaml.dump(as_about),
+        )
+
+        notice_text = self.notice_text and self.notice_text.strip()
+        if notice_text:
+            save_if_modified(
+                location=os.path.join(dest_dir, self.notice_filename),
+                content=notice_text,
+            )
+
+    def load_about_data(self, about_filename_or_data=None, dest_dir=THIRDPARTY_DIR):
+        """
+        Update self with ABOUT data loaded from an `about_filename_or_data`
+        which is either a .ABOUT file in `dest_dir` or an ABOUT data mapping.
+        `about_filename_or_data` defaults to this distribution default ABOUT
+        filename if not provided. Load the notice_text if present from dest_dir.
+        """
+        if not about_filename_or_data:
+            about_filename_or_data = self.about_filename
+
+        if isinstance(about_filename_or_data, str):
+            # that's an about_filename
+            about_path = os.path.join(dest_dir, about_filename_or_data)
+            if os.path.exists(about_path):
+                with open(about_path) as fi:
+                    about_data = saneyaml.load(fi.read())
+                    if not about_data:
+                        return False
+            else:
+                return False
+        else:
+            about_data = about_filename_or_data
+
+        md5 = about_data.pop("checksum_md5", None)
+        if md5:
+            about_data["md5"] = md5
+        sha1 = about_data.pop("checksum_sha1", None)
+        if sha1:
+            about_data["sha1"] = sha1
+        sha256 = about_data.pop("checksum_sha256", None)
+        if sha256:
+            about_data["sha256"] = sha256
+
+        about_data.pop("about_resource", None)
+        notice_text = about_data.pop("notice_text", None)
+        notice_file = about_data.pop("notice_file", None)
+        if notice_text:
+            about_data["notice_text"] = notice_text
+        elif notice_file:
+            notice_loc = os.path.join(dest_dir, notice_file)
+            if os.path.exists(notice_loc):
+                with open(notice_loc) as fi:
+                    about_data["notice_text"] = fi.read()
+        return self.update(about_data, keep_extra=True)
+
+    def load_remote_about_data(self):
+        """
+        Fetch the "remote" ABOUT file of this Distribution, and its NOTICE file
+        if any, and update self with that data. Return True if the data was updated.
+        """
+        try:
+            about_text = CACHE.get(
+                path_or_url=self.about_download_url,
+                as_text=True,
+            )
+        except RemoteNotFetchedException:
+            return False
+
+        if not about_text:
+            return False
+
+        about_data = saneyaml.load(about_text)
+        notice_file = about_data.pop("notice_file", None)
+        if notice_file:
+            try:
+                notice_text = CACHE.get(
+                    path_or_url=self.notice_download_url,
+                    as_text=True,
+                )
+                if notice_text:
+                    about_data["notice_text"] = notice_text
+            except RemoteNotFetchedException:
+                print(f"Failed to fetch NOTICE file: {self.notice_download_url}")
+        return self.load_about_data(about_data)
+
+    def get_checksums(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Return a mapping of computed checksums for this dist filename in
+        `dest_dir`.
+        """
+        dist_loc = os.path.join(dest_dir, self.filename)
+        if os.path.exists(dist_loc):
+            return multi_checksums(dist_loc, checksum_names=("md5", "sha1", "sha256"))
+        else:
+            return {}
+
+    def set_checksums(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Update self with checksums computed for this dist filename in `dest_dir`.
+        """
+        self.update(self.get_checksums(dest_dir), overwrite=True)
+
+    def validate_checksums(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Return True if all checksums that have a value in this dist match
+        checksums computed for this dist filename in `dest_dir`.
+        """
+        real_checksums = self.get_checksums(dest_dir)
+        for csk in ("md5", "sha1", "sha256"):
+            csv = getattr(self, csk)
+            rcv = real_checksums.get(csk)
+            if csv and rcv and csv != rcv:
+                return False
+        return True
+
+    def get_license_keys(self):
+        try:
+            keys = LICENSING.license_keys(
+                self.license_expression,
+                unique=True,
+                simple=True,
+            )
+        except license_expression.ExpressionParseError:
+            return ["unknown"]
+        return keys
+
+    def fetch_license_files(self, dest_dir=THIRDPARTY_DIR, use_cached_index=False):
+        """
+        Fetch license files if missing in `dest_dir`.
+        Return a list of error messages for the license files that could not be fetched.
+        """
+        urls = LinksRepository.from_url(use_cached_index=use_cached_index).links
+        errors = []
+        extra_lic_names = [l.get("file") for l in self.extra_data.get("licenses", {})]
+        extra_lic_names += [self.extra_data.get("license_file")]
+        extra_lic_names = [ln for ln in extra_lic_names if ln]
+        lic_names = [f"{key}.LICENSE" for key in self.get_license_keys()]
+        for filename in lic_names + extra_lic_names:
+            floc = os.path.join(dest_dir, filename)
+            if os.path.exists(floc):
+                continue
+
+            try:
+                # try remotely first
+                lic_url = get_license_link_for_filename(filename=filename, urls=urls)
+
+                fetch_and_save(
+                    path_or_url=lic_url,
+                    dest_dir=dest_dir,
+                    filename=filename,
+                    as_text=True,
+                )
+                if TRACE:
+                    print(f"Fetched license from remote: {lic_url}")
+
+            except Exception:
+                try:
+                    # try licensedb second
+                    lic_url = f"{LICENSEDB_API_URL}/{filename}"
+                    fetch_and_save(
+                        path_or_url=lic_url,
+                        dest_dir=dest_dir,
+                        filename=filename,
+                        as_text=True,
+                    )
+                    if TRACE:
+                        print(f"Fetched license from licensedb: {lic_url}")
+
+                except Exception:
+                    msg = f'No text for license {filename} in expression "{self.license_expression}" from {self}'
+                    print(msg)
+                    errors.append(msg)
+
+        return errors
+
+    def extract_pkginfo(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Return the text of the first PKG-INFO or METADATA file found in the
+        archive of this Distribution in `dest_dir`. Return None if not found.
+        """
+
+        fn = self.filename
+        if fn.endswith(".whl"):
+            fmt = "zip"
+        elif fn.endswith(".tar.gz"):
+            fmt = "gztar"
+        else:
+            fmt = None
+
+        dist = os.path.join(dest_dir, fn)
+        with tempfile.TemporaryDirectory(prefix=f"pypi-tmp-extract-{fn}") as td:
+            shutil.unpack_archive(filename=dist, extract_dir=td, format=fmt)
+            # NOTE: we only care about the first one found in the dist
+            # which may not be 100% right
+            for pi in fileutils.resource_iter(location=td, with_dirs=False):
+                if pi.endswith(
+                    (
+                        "PKG-INFO",
+                        "METADATA",
+                    )
+                ):
+                    with open(pi) as fi:
+                        return fi.read()
+
+    def load_pkginfo_data(self, dest_dir=THIRDPARTY_DIR):
+        """
+        Update self with data loaded from the PKG-INFO file found in the
+        archive of this Distribution in `dest_dir`.
+        """
+        pkginfo_text = self.extract_pkginfo(dest_dir=dest_dir)
+        if not pkginfo_text:
+            print(f"!!!!PKG-INFO/METADATA not found in {self.filename}")
+            return
+        raw_data = email.message_from_string(pkginfo_text)
+
+        classifiers = raw_data.get_all("Classifier") or []
+
+        declared_license = [raw_data["License"]] + [
+            c for c in classifiers if c.startswith("License")
+        ]
+        license_expression = get_license_expression(declared_license)
+        other_classifiers = [c for c in classifiers if not c.startswith("License")]
+
+        holder = raw_data["Author"]
+        holder_contact = raw_data["Author-email"]
+        copyright_statement = f"Copyright (c) {holder} <{holder_contact}>"
+
+        pkginfo_data = dict(
+            name=raw_data["Name"],
+            declared_license=declared_license,
+            version=raw_data["Version"],
+            description=raw_data["Summary"],
+            homepage_url=raw_data["Home-page"],
+            copyright=copyright_statement,
+            license_expression=license_expression,
+            holder=holder,
+            holder_contact=holder_contact,
+            keywords=raw_data["Keywords"],
+            classifiers=other_classifiers,
+        )
+
+        return self.update(pkginfo_data, keep_extra=True)
+
+    def update_from_other_dist(self, dist):
+        """
+        Update self using data from another dist
+        """
+        return self.update(dist.get_updatable_data())
+
+    def get_updatable_data(self, data=None):
+        data = data or self.to_dict()
+        return {k: v for k, v in data.items() if v and k in self.updatable_fields}
+
+    def update(self, data, overwrite=False, keep_extra=True):
+        """
+        Update self with a mapping of `data`. Keep unknown data as extra_data if
+        `keep_extra` is True. If `overwrite` is True, overwrite self with `data`.
+        Return True if any data was updated, False otherwise. Raise an exception
+        if there are key data conflicts.
+        """
+        package_url = data.get("package_url")
+        if package_url:
+            purl_from_data = packageurl.PackageURL.from_string(package_url)
+            purl_from_self = packageurl.PackageURL.from_string(self.package_url)
+            if purl_from_data != purl_from_self:
+                print(
+                    f"Invalid dist update attempt, not the same purl as dist: "
+                    f"{self} using data {data}."
+                )
+                return False
+
+        data.pop("about_resource", None)
+        dl = data.pop("download_url", None)
+        if dl:
+            data["path_or_url"] = dl
+
+        updated = False
+        extra = {}
+        for k, v in data.items():
+            if isinstance(v, str):
+                v = v.strip()
+            if not v:
+                continue
+
+            if hasattr(self, k):
+                value = getattr(self, k, None)
+                if not value or (overwrite and value != v):
+                    try:
+                        setattr(self, k, v)
+                    except Exception as e:
+                        raise Exception(f"{self}, {k}, {v}") from e
+                    updated = True
+
+            elif keep_extra:
+                # note that we always overwrite extra
+                extra[k] = v
+                updated = True
+
+        self.extra_data.update(extra)
+
+        return updated
+
+
+def get_license_link_for_filename(filename, urls):
+    """
+    Return a link for `filename` found in the `urls` list of URLs or paths. Raise an
+    exception if no link is found or if there is more than one link for that
+    file name.
+    """
+    path_or_url = [l for l in urls if l.endswith(f"/{filename}")]
+    if not path_or_url:
+        raise Exception(f"Missing link to file: {filename}")
+    if len(path_or_url) != 1:
+        raise Exception(f"Multiple links to file: {filename}: \n" + "\n".join(path_or_url))
+    return path_or_url[0]
+
+
+class InvalidDistributionFilename(Exception):
+    pass
+
+
+def get_sdist_name_ver_ext(filename):
+    """
+    Return a (name, version, extension) tuple if filename is a valid sdist name,
+    False otherwise. Some legacy binary builds have weird names.
+
+    In particular they do not use PEP440 compliant versions and/or mix tags, os
+    and arch names in tarball names and versions:
+
+    >>> assert get_sdist_name_ver_ext("intbitset-1.3.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("intbitset-1.3.linux-x86_64.tar.gz")
+    >>> assert get_sdist_name_ver_ext("intbitset-1.4a.tar.gz")
+    >>> assert get_sdist_name_ver_ext("intbitset-1.4a.zip")
+    >>> assert not get_sdist_name_ver_ext("intbitset-2.0.linux-x86_64.tar.gz")
+    >>> assert get_sdist_name_ver_ext("intbitset-2.0.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("intbitset-2.1-1.src.rpm")
+    >>> assert not get_sdist_name_ver_ext("intbitset-2.1-1.x86_64.rpm")
+    >>> assert not get_sdist_name_ver_ext("intbitset-2.1.linux-x86_64.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("cffi-1.2.0-1.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("html5lib-1.0-reupload.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("selenium-2.0-dev-9429.tar.gz")
+    >>> assert not get_sdist_name_ver_ext("testfixtures-1.8.0dev-r4464.tar.gz")
+    """
+    name_ver = None
+    extension = None
+
+    for ext in EXTENSIONS_SDIST:
+        if filename.endswith(ext):
+            name_ver, extension, _ = filename.rpartition(ext)
+            break
+
+    if not extension or not name_ver:
+        return False
+
+    name, _, version = name_ver.rpartition("-")
+
+    if not name or not version:
+        return False
+
+    # weird version
+    if any(
+        w in version
+        for w in (
+            "x86_64",
+            "i386",
+        )
+    ):
+        return False
+
+    # all char versions
+    if version.isalpha():
+        return False
+
+    # non-pep 440 version
+    if "-" in version:
+        return False
+
+    # single version
+    if version.isdigit() and len(version) == 1:
+        return False
+
+    # r1 version
+    if len(version) == 2 and version[0] == "r" and version[1].isdigit():
+        return False
+
+    # dotless version (but calver is OK)
+    if "." not in version and len(version) < 3:
+        return False
+
+    # version with dashes selenium-2.0-dev-9429.tar.gz
+    if name.endswith(("dev",)) and "." not in version:
+        return False
+    # version pre or post, old legacy
+    if version.startswith(("beta", "rc", "pre", "post", "final")):
+        return False
+
+    return name, version, extension
+
+
+@attr.attributes
+class Sdist(Distribution):
+
+    extension = attr.ib(
+        repr=False,
+        type=str,
+        default="",
+        metadata=dict(help="File extension, including leading dot."),
+    )
+
+    @classmethod
+    def from_filename(cls, filename):
+        """
+        Return an Sdist object built from a filename.
+        Raise an exception if this is not a valid sdist filename.
+        """
+        name_ver_ext = get_sdist_name_ver_ext(filename)
+        if not name_ver_ext:
+            raise InvalidDistributionFilename(filename)
+
+        name, version, extension = name_ver_ext
+
+        return cls(
+            type="pypi",
+            name=name,
+            version=version,
+            extension=extension,
+            filename=filename,
+        )
+
+    def to_filename(self):
+        """
+        Return an sdist filename reconstructed from its fields (that may not be
+        the same as the original filename.)
+        """
+        return f"{self.name}-{self.version}{self.extension}"
+
+
+@attr.attributes
+class Wheel(Distribution):
+
+    """
+    Represents a wheel file.
+
+    Copied and heavily modified from pip-20.3.1
+    pip/_internal/models/wheel.py
+
+    name: pip compatibility tags
+    version: 20.3.1
+    download_url: https://github.com/pypa/pip/blob/20.3.1/src/pip/_internal/models/wheel.py
+    copyright: Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+    license_expression: mit
+    notes: copied from pip-20.3.1 pip/_internal/models/wheel.py
+
+    Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+
+    Permission is hereby granted, free of charge, to any person obtaining
+    a copy of this software and associated documentation files (the
+    "Software"), to deal in the Software without restriction, including
+    without limitation the rights to use, copy, modify, merge, publish,
+    distribute, sublicense, and/or sell copies of the Software, and to
+    permit persons to whom the Software is furnished to do so, subject to
+    the following conditions:
+
+    The above copyright notice and this permission notice shall be
+    included in all copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+    MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+    NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+    LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+    OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+    WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+    """
+
+    get_wheel_from_filename = re.compile(
+        r"""^(?P<namever>(?P<name>.+?)-(?P<ver>.*?))
+        ((-(?P<build>\d[^-]*?))?-(?P<pyvers>.+?)-(?P<abis>.+?)-(?P<plats>.+?)
+        \.whl)$""",
+        re.VERBOSE,
+    ).match
+
+    build = attr.ib(
+        type=str,
+        default="",
+        metadata=dict(help="Python wheel build."),
+    )
+
+    python_versions = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of wheel Python version tags."),
+    )
+
+    abis = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of wheel ABI tags."),
+    )
+
+    platforms = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of wheel platform tags."),
+    )
+
+    tags = attr.ib(
+        repr=False,
+        type=set,
+        default=attr.Factory(set),
+        metadata=dict(help="Set of all tags for this wheel."),
+    )
+
+    @classmethod
+    def from_filename(cls, filename):
+        """
+        Return a wheel object built from a filename.
+        Raise an exception if this is not a valid wheel filename
+        """
+        wheel_info = cls.get_wheel_from_filename(filename)
+        if not wheel_info:
+            raise InvalidDistributionFilename(filename)
+
+        name = wheel_info.group("name").replace("_", "-")
+        # we'll assume "_" means "-" due to wheel naming scheme
+        # (https://github.com/pypa/pip/issues/1150)
+        version = wheel_info.group("ver").replace("_", "-")
+        build = wheel_info.group("build")
+        python_versions = wheel_info.group("pyvers").split(".")
+        abis = wheel_info.group("abis").split(".")
+        platforms = wheel_info.group("plats").split(".")
+
+        # All the tag combinations from this file
+        tags = {
+            packaging_tags.Tag(x, y, z) for x in python_versions for y in abis for z in platforms
+        }
+
+        return cls(
+            filename=filename,
+            type="pypi",
+            name=name,
+            version=version,
+            build=build,
+            python_versions=python_versions,
+            abis=abis,
+            platforms=platforms,
+            tags=tags,
+        )
+
+    def is_supported_by_tags(self, tags):
+        """
+        Return True if this wheel is compatible with one of a list of PEP 425 tags.
+        """
+        if TRACE_DEEP:
+            print()
+            print("is_supported_by_tags: tags:", tags)
+            print("self.tags:", self.tags)
+        return not self.tags.isdisjoint(tags)
+
+    def to_filename(self):
+        """
+        Return a wheel filename reconstructed from its fields (that may not be
+        the same as the original filename.)
+        """
+        build = f"-{self.build}" if self.build else ""
+        pyvers = ".".join(self.python_versions)
+        abis = ".".join(self.abis)
+        plats = ".".join(self.platforms)
+        return f"{self.name}-{self.version}{build}-{pyvers}-{abis}-{plats}.whl"
+
+    def is_pure(self):
+        """
+        Return True if this wheel is a "pure" wheel, i.e. a wheel that runs on
+        all Python 3 versions and all OSes.
+
+        For example::
+
+        >>> Wheel.from_filename('aboutcode_toolkit-5.1.0-py2.py3-none-any.whl').is_pure()
+        True
+        >>> Wheel.from_filename('beautifulsoup4-4.7.1-py3-none-any.whl').is_pure()
+        True
+        >>> Wheel.from_filename('beautifulsoup4-4.7.1-py2-none-any.whl').is_pure()
+        False
+        >>> Wheel.from_filename('bitarray-0.8.1-cp36-cp36m-win_amd64.whl').is_pure()
+        False
+        >>> Wheel.from_filename('extractcode_7z-16.5-py2.py3-none-macosx_10_13_intel.whl').is_pure()
+        False
+        >>> Wheel.from_filename('future-0.16.0-cp36-none-any.whl').is_pure()
+        False
+        >>> Wheel.from_filename('foo-4.7.1-py3-none-macosx_10_13_intel.whl').is_pure()
+        False
+        >>> Wheel.from_filename('future-0.16.0-py3-cp36m-any.whl').is_pure()
+        False
+        """
+        return "py3" in self.python_versions and "none" in self.abis and "any" in self.platforms
+
+
+def is_pure_wheel(filename):
+    try:
+        return Wheel.from_filename(filename).is_pure()
+    except Exception:
+        return False
+
+
+@attr.attributes
+class PypiPackage(NameVer):
+    """
+    A Python package contains one or more wheels and one source distribution
+    from a repository.
+    """
+
+    sdist = attr.ib(
+        repr=False,
+        type=Sdist,
+        default=None,
+        metadata=dict(help="Sdist source distribution for this package."),
+    )
+
+    wheels = attr.ib(
+        repr=False,
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of Wheel for this package"),
+    )
+
+    def get_supported_wheels(self, environment, verbose=TRACE_ULTRA_DEEP):
+        """
+        Yield all the Wheel of this package supported and compatible with the
+        Environment `environment`.
+        """
+        envt_tags = environment.tags()
+        if verbose:
+            print("get_supported_wheels: envt_tags:", envt_tags)
+        for wheel in self.wheels:
+            if wheel.is_supported_by_tags(envt_tags):
+                yield wheel
+
+    @classmethod
+    def package_from_dists(cls, dists):
+        """
+        Return a new PypiPackage built from an iterable of Wheels and Sdist
+        objects all for the same package name and version.
+
+        For example:
+        >>> w1 = Wheel(name='bitarray', version='0.8.1', build='',
+        ...    python_versions=['cp38'], abis=['cp38m'],
+        ...    platforms=['linux_x86_64'])
+        >>> w2 = Wheel(name='bitarray', version='0.8.1', build='',
+        ...    python_versions=['cp38'], abis=['cp38m'],
+        ...    platforms=['macosx_10_9_x86_64', 'macosx_10_10_x86_64'])
+        >>> sd = Sdist(name='bitarray', version='0.8.1')
+        >>> package = PypiPackage.package_from_dists(dists=[w1, w2, sd])
+        >>> assert package.name == 'bitarray'
+        >>> assert package.version == '0.8.1'
+        >>> assert package.sdist == sd
+        >>> assert package.wheels == [w1, w2]
+        """
+        dists = list(dists)
+        if TRACE_DEEP:
+            print(f"package_from_dists: {dists}")
+        if not dists:
+            return
+
+        reference_dist = dists[0]
+        normalized_name = reference_dist.normalized_name
+        version = reference_dist.version
+
+        package = PypiPackage(name=normalized_name, version=version)
+
+        for dist in dists:
+            if dist.normalized_name != normalized_name:
+                if TRACE:
+                    print(
+                        f"  Skipping inconsistent dist name: expected {normalized_name} got {dist}"
+                    )
+                continue
+            elif dist.version != version:
+                dv = packaging_version.parse(dist.version)
+                v = packaging_version.parse(version)
+                if dv != v:
+                    if TRACE:
+                        print(
+                            f"  Skipping inconsistent dist version: expected {version} got {dist}"
+                        )
+                    continue
+
+            if isinstance(dist, Sdist):
+                package.sdist = dist
+
+            elif isinstance(dist, Wheel):
+                package.wheels.append(dist)
+
+            else:
+                raise Exception(f"Unknown distribution type: {dist}")
+
+        if TRACE_DEEP:
+            print(f"package_from_dists: {package}")
+
+        return package
+
+    @classmethod
+    def packages_from_dir(cls, directory):
+        """
+        Yield PypiPackages built from files found in the `directory` path.
+        """
+        base = os.path.abspath(directory)
+
+        paths = [os.path.join(base, f) for f in os.listdir(base) if f.endswith(EXTENSIONS)]
+
+        if TRACE_ULTRA_DEEP:
+            print("packages_from_dir: paths:", paths)
+        return PypiPackage.packages_from_many_paths_or_urls(paths)
+
+    @classmethod
+    def packages_from_many_paths_or_urls(cls, paths_or_urls):
+        """
+        Yield PypiPackages built from a list of paths or URLs.
+        These are sorted by name and then by version from oldest to newest.
+        """
+        dists = PypiPackage.dists_from_paths_or_urls(paths_or_urls)
+        if TRACE_ULTRA_DEEP:
+            print("packages_from_many_paths_or_urls: dists:", dists)
+
+        dists = NameVer.sorted(dists)
+
+        for _projver, dists_of_package in itertools.groupby(
+            dists,
+            key=NameVer.sortable_name_version,
+        ):
+            package = PypiPackage.package_from_dists(dists_of_package)
+            if TRACE_ULTRA_DEEP:
+                print("packages_from_many_paths_or_urls", package)
+            yield package
+
+    @classmethod
+    def dists_from_paths_or_urls(cls, paths_or_urls):
+        """
+        Return a list of Distribution given a list of
+        ``paths_or_urls`` to wheels or source distributions.
+
+        Each Distribution receives two extra attributes:
+            - the path_or_url it was created from
+            - its filename
+
+        For example:
+        >>> paths_or_urls ='''
+        ...     /home/foo/bitarray-0.8.1-cp36-cp36m-linux_x86_64.whl
+        ...     bitarray-0.8.1-cp36-cp36m-macosx_10_9_x86_64.macosx_10_10_x86_64.whl
+        ...     bitarray-0.8.1-cp36-cp36m-win_amd64.whl
+        ...     https://example.com/bar/bitarray-0.8.1.tar.gz
+        ...     bitarray-0.8.1.tar.gz.ABOUT
+        ...     bit.LICENSE'''.split()
+        >>> results = list(PypiPackage.dists_from_paths_or_urls(paths_or_urls))
+        >>> for r in results:
+        ...    print(r.__class__.__name__, r.name, r.version)
+        ...    if isinstance(r, Wheel):
+        ...       print(" ", ", ".join(r.python_versions), ", ".join(r.platforms))
+        Wheel bitarray 0.8.1
+          cp36 linux_x86_64
+        Wheel bitarray 0.8.1
+          cp36 macosx_10_9_x86_64, macosx_10_10_x86_64
+        Wheel bitarray 0.8.1
+          cp36 win_amd64
+        Sdist bitarray 0.8.1
+        """
+        dists = []
+        if TRACE_ULTRA_DEEP:
+            print("     ###paths_or_urls:", paths_or_urls)
+        installable = [f for f in paths_or_urls if f.endswith(EXTENSIONS_INSTALLABLE)]
+        for path_or_url in installable:
+            try:
+                dist = Distribution.from_path_or_url(path_or_url)
+                dists.append(dist)
+                if TRACE_DEEP:
+                    print(
+                        "     ===> dists_from_paths_or_urls:",
+                        dist,
+                        "\n     ",
+                        "with URL:",
+                        dist.download_url,
+                        "\n     ",
+                        "from URL:",
+                        path_or_url,
+                    )
+            except InvalidDistributionFilename:
+                if TRACE_DEEP:
+                    print(f"     Skipping invalid distribution from: {path_or_url}")
+                continue
+        return dists
+
+    def get_distributions(self):
+        """
+        Yield all distributions available for this PypiPackage
+        """
+        if self.sdist:
+            yield self.sdist
+        for wheel in self.wheels:
+            yield wheel
+
+    def get_url_for_filename(self, filename):
+        """
+        Return the URL for this filename or None.
+        """
+        for dist in self.get_distributions():
+            if dist.filename == filename:
+                return dist.path_or_url
+
+
+@attr.attributes
+class Environment:
+    """
+    An Environment describes a target installation environment with its
+    supported Python version, ABI, platform, implementation and related
+    attributes.
+
+    We can use these to pass as `pip download` options and force fetching only
+    the subset of packages that match these Environment constraints as opposed
+    to the current running Python interpreter constraints.
+    """
+
+    python_version = attr.ib(
+        type=str,
+        default="",
+        metadata=dict(help="Python version supported by this environment."),
+    )
+
+    operating_system = attr.ib(
+        type=str,
+        default="",
+        metadata=dict(help="operating system supported by this environment."),
+    )
+
+    implementation = attr.ib(
+        type=str,
+        default="cp",
+        metadata=dict(help="Python implementation supported by this environment."),
+        repr=False,
+    )
+
+    abis = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of ABI tags supported by this environment."),
+        repr=False,
+    )
+
+    platforms = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of platform tags supported by this environment."),
+        repr=False,
+    )
+
+    @classmethod
+    def from_pyver_and_os(cls, python_version, operating_system):
+        if "." in python_version:
+            python_version = "".join(python_version.split("."))
+
+        return cls(
+            python_version=python_version,
+            implementation="cp",
+            abis=ABIS_BY_PYTHON_VERSION[python_version],
+            platforms=PLATFORMS_BY_OS[operating_system],
+            operating_system=operating_system,
+        )
+
+    def get_pip_cli_options(self):
+        """
+        Return a list of pip download command line options for this environment.
+        """
+        options = [
+            "--python-version",
+            self.python_version,
+            "--implementation",
+            self.implementation,
+        ]
+        for abi in self.abis:
+            options.extend(["--abi", abi])
+
+        for platform in self.platforms:
+            options.extend(["--platform", platform])
+
+        return options
+
+    def tags(self):
+        """
+        Return a set of all the PEP425 tags supported by this environment.
+        """
+        return set(
+            utils_pip_compatibility_tags.get_supported(
+                version=self.python_version or None,
+                impl=self.implementation or None,
+                platforms=self.platforms or None,
+                abis=self.abis or None,
+            )
+        )
+
+
+################################################################################
+#
+# PyPI repo and link index for package wheels and sources
+#
+################################################################################
+
+
+@attr.attributes
+class PypiSimpleRepository:
+    """
+    A PyPI repository of Python packages: wheels, sdist, etc. like the public
+    PyPI simple index. It is populated lazily based on requested packages names.
+    """
+
+    index_url = attr.ib(
+        type=str,
+        default=PYPI_SIMPLE_URL,
+        metadata=dict(help="Base PyPI simple URL for this index."),
+    )
+
+    # we keep a nested mapping of PypiPackage that has this shape:
+    # {name: {version: PypiPackage, version: PypiPackage, etc}}
+    # the inner versions mapping is sorted by version from oldest to newest
+
+    packages = attr.ib(
+        type=dict,
+        default=attr.Factory(lambda: defaultdict(dict)),
+        metadata=dict(
+            help="Mapping of {name: {version: PypiPackage, version: PypiPackage, etc}} available in this repo"
+        ),
+    )
+
+    fetched_package_normalized_names = attr.ib(
+        type=set,
+        default=attr.Factory(set),
+        metadata=dict(help="A set of already fetched package normalized names."),
+    )
+
+    use_cached_index = attr.ib(
+        type=bool,
+        default=False,
+        metadata=dict(
+            help="If True, use any existing on-disk cached PyPI index files. Otherwise, fetch and cache."
+        ),
+    )
+
+    def _get_package_versions_map(self, name):
+        """
+        Return a mapping of all available PypiPackage versions for this package name.
+        The mapping may be empty. It is ordered by version from oldest to newest.
+        """
+        assert name
+        normalized_name = NameVer.normalize_name(name)
+        versions = self.packages[normalized_name]
+        if not versions and normalized_name not in self.fetched_package_normalized_names:
+            self.fetched_package_normalized_names.add(normalized_name)
+            try:
+                links = self.fetch_links(normalized_name=normalized_name)
+                # note that this is sorted so the mapping is also sorted
+                versions = {
+                    package.version: package
+                    for package in PypiPackage.packages_from_many_paths_or_urls(paths_or_urls=links)
+                }
+                self.packages[normalized_name] = versions
+            except RemoteNotFetchedException as e:
+                if TRACE:
+                    print(f"failed to fetch package name: {name} from: {self.index_url}:\n{e}")
+
+        if not versions and TRACE:
+            print(f"WARNING: package {name} not found in repo: {self.index_url}")
+
+        return versions
+
+    def get_package_versions(self, name):
+        """
+        Return a mapping of all available PypiPackage versions as {version:
+        package} for this package name. The mapping may be empty but not None.
+        It is sorted by version from oldest to newest.
+        """
+        return dict(self._get_package_versions_map(name))
+
+    def get_package_version(self, name, version=None):
+        """
+        Return the PypiPackage with name and version or None.
+        Return the latest PypiPackage version if version is None.
+        """
+        if not version:
+            versions = list(self._get_package_versions_map(name).values())
+            # return the latest version
+            return versions[-1] if versions else None
+        else:
+            return self._get_package_versions_map(name).get(version)
+
+    def fetch_links(self, normalized_name):
+        """
+        Return a list of download link URLs found in a PyPI simple index for package
+        name using the `index_url` of this repository.
+        """
+        package_url = f"{self.index_url}/{normalized_name}"
+        text = CACHE.get(
+            path_or_url=package_url,
+            as_text=True,
+            force=not self.use_cached_index,
+        )
+        links = collect_urls(text)
+        # TODO: keep sha256
+        links = [l.partition("#sha256=") for l in links]
+        links = [url for url, _, _sha256 in links]
+        return links
+
+
+PYPI_PUBLIC_REPO = PypiSimpleRepository(index_url=PYPI_SIMPLE_URL)
+PYPI_SELFHOSTED_REPO = PypiSimpleRepository(index_url=ABOUT_PYPI_SIMPLE_URL)
+DEFAULT_PYPI_REPOS = PYPI_PUBLIC_REPO, PYPI_SELFHOSTED_REPO
+DEFAULT_PYPI_REPOS_BY_URL = {r.index_url: r for r in DEFAULT_PYPI_REPOS}
+
+
+@attr.attributes
+class LinksRepository:
+    """
+    Represents a simple links repository such as an HTTP directory listing or an
+    HTML page with links.
+    """
+
+    url = attr.ib(
+        type=str,
+        default="",
+        metadata=dict(help="Links directory URL"),
+    )
+
+    links = attr.ib(
+        type=list,
+        default=attr.Factory(list),
+        metadata=dict(help="List of links available in this repo"),
+    )
+
+    use_cached_index = attr.ib(
+        type=bool,
+        default=False,
+        metadata=dict(
+            help="If True, use any existing on-disk cached index files. Otherwise, fetch and cache."
+        ),
+    )
+
+    def __attrs_post_init__(self):
+        if not self.links:
+            self.links = self.find_links()
+
+    def find_links(self, _CACHE=[]):
+        """
+        Return a list of link URLs found in the HTML page at `self.url`
+        """
+        if _CACHE:
+            return _CACHE
+
+        links_url = self.url
+        if TRACE_DEEP:
+            print(f"Finding links from: {links_url}")
+        plinks_url = urllib.parse.urlparse(links_url)
+        base_url = urllib.parse.SplitResult(
+            plinks_url.scheme, plinks_url.netloc, "", "", ""
+        ).geturl()
+
+        if TRACE_DEEP:
+            print(f"Base URL {base_url}")
+
+        text = CACHE.get(
+            path_or_url=links_url,
+            as_text=True,
+            force=not self.use_cached_index,
+        )
+
+        links = []
+        for link in collect_urls(text):
+            if not link.endswith(EXTENSIONS):
+                continue
+
+            plink = urllib.parse.urlsplit(link)
+
+            if plink.scheme:
+                # full URL kept as-is
+                url = link
+
+            elif plink.path.startswith("/"):
+                # absolute link
+                url = f"{base_url}{link}"
+
+            else:
+                # relative link
+                url = f"{links_url}/{link}"
+
+            if TRACE_DEEP:
+                print(f"Adding URL: {url}")
+
+            links.append(url)
+
+        if TRACE:
+            print(f"Found {len(links)} links at {links_url}")
+        _CACHE.extend(links)
+        return links
+
+    @classmethod
+    def from_url(cls, url=ABOUT_BASE_URL, _LINKS_REPO={}, use_cached_index=False):
+        if url not in _LINKS_REPO:
+            _LINKS_REPO[url] = cls(url=url, use_cached_index=use_cached_index)
+        return _LINKS_REPO[url]
+
+
+################################################################################
+# Globals for remote repos to be lazily created and cached on first use for the
+# life of the session together with some convenience functions.
+################################################################################
+
+
+def get_local_packages(directory=THIRDPARTY_DIR):
+    """
+    Return the list of all PypiPackage objects built from a local directory. Return
+    an empty list if no package is found.
+    """
+    return list(PypiPackage.packages_from_dir(directory=directory))
+
+
+################################################################################
+#
+# Basic file and URL-based operations using a persistent file-based Cache
+#
+################################################################################
+
+
+@attr.attributes
+class Cache:
+    """
+    A simple file-based cache based only on a filename presence.
+    This is used to avoid impolite fetching from remote locations.
+    """
+
+    directory = attr.ib(type=str, default=CACHE_THIRDPARTY_DIR)
+
+    def __attrs_post_init__(self):
+        os.makedirs(self.directory, exist_ok=True)
+
+    def get(self, path_or_url, as_text=True, force=False):
+        """
+        Return the content fetched from a ``path_or_url`` through the cache.
+        Raise an Exception on errors. Treat the content as text if `as_text` is
+        True, otherwise as binary. `path_or_url` can be a path or a URL
+        to a file.
+        """
+        cache_key = quote_plus(path_or_url.strip("/"))
+        cached = os.path.join(self.directory, cache_key)
+
+        if force or not os.path.exists(cached):
+            if TRACE_DEEP:
+                print(f"        FILE CACHE MISS: {path_or_url}")
+            content = get_file_content(path_or_url=path_or_url, as_text=as_text)
+            wmode = "w" if as_text else "wb"
+            with open(cached, wmode) as fo:
+                fo.write(content)
+            return content
+        else:
+            if TRACE_DEEP:
+                print(f"        FILE CACHE HIT: {path_or_url}")
+            return get_local_file_content(path=cached, as_text=as_text)
+
+
+CACHE = Cache()
+
+
+def get_file_content(path_or_url, as_text=True):
+    """
+    Fetch and return the content at `path_or_url` from either a local path or a
+    remote URL. Return the content as bytes if `as_text` is False.
+    """
+    if path_or_url.startswith("https://"):
+        if TRACE_DEEP:
+            print(f"Fetching: {path_or_url}")
+        _headers, content = get_remote_file_content(url=path_or_url, as_text=as_text)
+        return content
+
+    elif path_or_url.startswith("file://") or (
+        path_or_url.startswith("/") and os.path.exists(path_or_url)
+    ):
+        return get_local_file_content(path=path_or_url, as_text=as_text)
+
+    else:
+        raise Exception(f"Unsupported URL scheme: {path_or_url}")
+
+
+def get_local_file_content(path, as_text=True):
+    """
+    Return the content at `path` as text. Return the content as bytes if
+    `as_text` is False.
+    """
+    if path.startswith("file://"):
+        path = path[7:]
+
+    mode = "r" if as_text else "rb"
+    with open(path, mode) as fo:
+        return fo.read()
+
+
+class RemoteNotFetchedException(Exception):
+    pass
+
+
+def get_remote_file_content(
+    url,
+    as_text=True,
+    headers_only=False,
+    headers=None,
+    _delay=0,
+):
+    """
+    Fetch and return a tuple of (headers, content) at `url`. Return content as a
+    text string if `as_text` is True. Otherwise return the content as bytes.
+
+    If `headers_only` is True, return only (headers, None). Headers is a mapping
+    of HTTP headers.
+    Retry the fetch multiple times with an increasing delay if there is an HTTP
+    429 throttling response.
+    """
+    time.sleep(_delay)
+    headers = headers or {}
+    # using a GET with stream=True ensures we get the final header from
+    # several redirects and that we can ignore content there. A HEAD request may
+    # not get us this last header
+    print(f"    DOWNLOADING: {url}")
+    with requests.get(url, allow_redirects=True, stream=True, headers=headers) as response:
+        status = response.status_code
+        if status != requests.codes.ok:  # NOQA
+            if status == 429 and _delay < 20:
+                # too many requests: start some exponential delay
+                increased_delay = (_delay * 2) or 1
+                return get_remote_file_content(
+                    url,
+                    as_text=as_text,
+                    headers_only=headers_only,
+                    headers=headers,
+                    _delay=increased_delay,
+                )
+
+            else:
+                raise RemoteNotFetchedException(f"Failed HTTP request from {url} with {status}")
+
+        if headers_only:
+            return response.headers, None
+
+        return response.headers, response.text if as_text else response.content
+
+
+def fetch_and_save(
+    path_or_url,
+    dest_dir,
+    filename,
+    as_text=True,
+):
+    """
+    Fetch content at ``path_or_url`` URL or path and save this to
+    ``dest_dir/filename``. Return the fetched content. Raise an Exception on
+    errors. Treat the content as text if ``as_text`` is True, otherwise as
+    binary.
+    """
+    content = CACHE.get(
+        path_or_url=path_or_url,
+        as_text=as_text,
+    )
+    output = os.path.join(dest_dir, filename)
+    wmode = "w" if as_text else "wb"
+    with open(output, wmode) as fo:
+        fo.write(content)
+    return content
+
+
+################################################################################
+#
+# Functions to update or fetch ABOUT and license files
+#
+################################################################################
+
+
+def clean_about_files(
+    dest_dir=THIRDPARTY_DIR,
+):
+    """
+    Given a thirdparty dir, clean ABOUT files
+    """
+    local_packages = get_local_packages(directory=dest_dir)
+    for local_package in local_packages:
+        for local_dist in local_package.get_distributions():
+            local_dist.load_about_data(dest_dir=dest_dir)
+            local_dist.set_checksums(dest_dir=dest_dir)
+
+            if "classifiers" in local_dist.extra_data:
+                local_dist.extra_data.pop("classifiers", None)
+                local_dist.save_about_and_notice_files(dest_dir)
+
+
+def fetch_abouts_and_licenses(dest_dir=THIRDPARTY_DIR, use_cached_index=False):
+    """
+    Given a thirdparty dir, add missing ABOUT, LICENSE and NOTICE files using
+    best efforts:
+
+    - use existing ABOUT files
+    - try to load existing remote ABOUT files
+    - derive from existing distribution with same name and latest version that
+      would have such ABOUT file
+    - extract ABOUT file data from distributions PKGINFO or METADATA files
+
+    Use available existing on-disk cached index if use_cached_index is True.
+    """
+
+    def get_other_dists(_package, _dist):
+        """
+        Return a list of all the dists from `_package` that are not the `_dist`
+        object
+        """
+        return [d for d in _package.get_distributions() if d != _dist]
+
+    local_packages = get_local_packages(directory=dest_dir)
+    packages_by_name = defaultdict(list)
+    for local_package in local_packages:
+        distributions = list(local_package.get_distributions())
+        distribution = distributions[0]
+        packages_by_name[distribution.name].append(local_package)
+
+    for local_package in local_packages:
+        for local_dist in local_package.get_distributions():
+            local_dist.load_about_data(dest_dir=dest_dir)
+            local_dist.set_checksums(dest_dir=dest_dir)
+
+            # if has key data we may look to improve later, but we can move on
+            if local_dist.has_key_metadata():
+                local_dist.save_about_and_notice_files(dest_dir=dest_dir)
+                local_dist.fetch_license_files(dest_dir=dest_dir, use_cached_index=use_cached_index)
+                continue
+
+            # let's try to get from another dist of the same local package
+            for otherd in get_other_dists(local_package, local_dist):
+                updated = local_dist.update_from_other_dist(otherd)
+                if updated and local_dist.has_key_metadata():
+                    break
+
+            # if has key data we may look to improve later, but we can move on
+            if local_dist.has_key_metadata():
+                local_dist.save_about_and_notice_files(dest_dir=dest_dir)
+                local_dist.fetch_license_files(dest_dir=dest_dir, use_cached_index=use_cached_index)
+                continue
+
+            # try to get another version of the same package that is not our version
+            other_local_packages = [
+                p
+                for p in packages_by_name[local_package.name]
+                if p.version != local_package.version
+            ]
+            other_local_version = other_local_packages and other_local_packages[-1]
+            if other_local_version:
+                latest_local_dists = list(other_local_version.get_distributions())
+                for latest_local_dist in latest_local_dists:
+                    latest_local_dist.load_about_data(dest_dir=dest_dir)
+                    if not latest_local_dist.has_key_metadata():
+                        # there is not much value to get other data if we are missing the key ones
+                        continue
+                    else:
+                        local_dist.update_from_other_dist(latest_local_dist)
+                        # if has key data we may look to improve later, but we can move on
+                        if local_dist.has_key_metadata():
+                            break
+
+                # if has key data we may look to improve later, but we can move on
+                if local_dist.has_key_metadata():
+                    local_dist.save_about_and_notice_files(dest_dir=dest_dir)
+                    local_dist.fetch_license_files(
+                        dest_dir=dest_dir, use_cached_index=use_cached_index
+                    )
+                    continue
+
+            # let's try to fetch remotely
+            local_dist.load_remote_about_data()
+
+            # if has key data we may look to improve later, but we can move on
+            if local_dist.has_key_metadata():
+                local_dist.save_about_and_notice_files(dest_dir=dest_dir)
+                local_dist.fetch_license_files(dest_dir=dest_dir, use_cached_index=use_cached_index)
+                continue
+
+            # try to get a latest version of the same package that is not our version
+            # and that is in our self hosted repo
+            lpv = local_package.version
+            lpn = local_package.name
+
+            other_remote_packages = [
+                p for v, p in PYPI_SELFHOSTED_REPO.get_package_versions(lpn).items() if v != lpv
+            ]
+
+            latest_version = other_remote_packages and other_remote_packages[-1]
+            if latest_version:
+                latest_dists = list(latest_version.get_distributions())
+                for remote_dist in latest_dists:
+                    remote_dist.load_remote_about_data()
+                    if not remote_dist.has_key_metadata():
+                        # there is not much value to get other data if we are missing the key ones
+                        continue
+                    else:
+                        local_dist.update_from_other_dist(remote_dist)
+                        # if has key data we may look to improve later, but we can move on
+                        if local_dist.has_key_metadata():
+                            break
+
+                # if has key data we may look to improve later, but we can move on
+                if local_dist.has_key_metadata():
+                    local_dist.save_about_and_notice_files(dest_dir=dest_dir)
+                    local_dist.fetch_license_files(
+                        dest_dir=dest_dir, use_cached_index=use_cached_index
+                    )
+                    continue
+
+            # try to get data from pkginfo (no license though)
+            local_dist.load_pkginfo_data(dest_dir=dest_dir)
+
+            # FIXME: save as this is the last resort for now in all cases
+            # if local_dist.has_key_metadata() or not local_dist.has_key_metadata():
+            local_dist.save_about_and_notice_files(dest_dir)
+
+            lic_errs = local_dist.fetch_license_files(dest_dir, use_cached_index=use_cached_index)
+
+            if not local_dist.has_key_metadata():
+                print(f"Unable to add essential ABOUT data for: {local_dist}")
+            if lic_errs:
+                lic_errs = "\n".join(lic_errs)
+                print(f"Failed to fetch some licenses: {lic_errs}")
+
+
+################################################################################
+#
+# Functions to build new Python wheels including native on multiple OSes
+#
+################################################################################
+
+
+def call(args, verbose=TRACE):
+    """
+    Call args in a subprocess and display output on the fly if ``verbose`` is True.
+    Return a tuple of (returncode, stdout, stderr)
+    """
+    if TRACE_DEEP:
+        print("Calling:", " ".join(args))
+    with subprocess.Popen(
+        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, encoding="utf-8"
+    ) as process:
+
+        stdouts = []
+        while True:
+            line = process.stdout.readline()
+            if not line and process.poll() is not None:
+                break
+            stdouts.append(line)
+            if verbose:
+                print(line.rstrip(), flush=True)
+
+        stdout, stderr = process.communicate()
+        if not stdout.strip():
+            stdout = "\n".join(stdouts)
+        return process.returncode, stdout, stderr
+
+
+def download_wheels_with_pip(
+    requirements_specifiers=tuple(),
+    requirements_files=tuple(),
+    environment=None,
+    dest_dir=THIRDPARTY_DIR,
+    index_url=PYPI_SIMPLE_URL,
+    links_url=ABOUT_LINKS_URL,
+):
+    """
+    Fetch binary wheel(s) using pip for the ``environment`` Environment given a
+    list of pip ``requirements_files`` and a list of ``requirements_specifiers``
+    strings (such as package names or name==version).
+    Return a tuple of (list of downloaded files, error string).
+    Do NOT fail on errors, but return an error message on failure.
+    """
+
+    cli_args = [
+        "pip",
+        "download",
+        "--only-binary",
+        ":all:",
+        "--dest",
+        dest_dir,
+        "--index-url",
+        index_url,
+        "--find-links",
+        links_url,
+        "--no-color",
+        "--progress-bar",
+        "off",
+        "--no-deps",
+        "--no-build-isolation",
+        "--verbose",
+        #         "--verbose",
+    ]
+
+    if environment:
+        eopts = environment.get_pip_cli_options()
+        cli_args.extend(eopts)
+    else:
+        print("WARNING: no download environment provided.")
+
+    cli_args.extend(requirements_specifiers)
+    for req_file in requirements_files:
+        cli_args.extend(["--requirement", req_file])
+
+    if TRACE:
+        print("Downloading wheels using command:", " ".join(cli_args))
+
+    existing = set(os.listdir(dest_dir))
+    error = False
+    try:
+        returncode, _stdout, stderr = call(cli_args, verbose=True)
+        if returncode != 0:
+            error = stderr
+    except Exception as e:
+        error = str(e)
+
+    if error:
+        print()
+        print("###########################################################################")
+        print("##################### Failed to fetch all wheels ##########################")
+        print("###########################################################################")
+        print(error)
+        print()
+        print("###########################################################################")
+
+    downloaded = existing ^ set(os.listdir(dest_dir))
+    return sorted(downloaded), error
+
+
+################################################################################
+#
+# Functions to check for problems
+#
+################################################################################
+
+
+def check_about(dest_dir=THIRDPARTY_DIR):
+    try:
+        subprocess.check_output(f"venv/bin/about check {dest_dir}".split())
+    except subprocess.CalledProcessError as cpe:
+        print()
+        print("Invalid ABOUT files:")
+        print(cpe.output.decode("utf-8", errors="replace"))
+
+
+def find_problems(
+    dest_dir=THIRDPARTY_DIR,
+    report_missing_sources=False,
+    report_missing_wheels=False,
+):
+    """
+    Print the problems found in `dest_dir`.
+    """
+
+    local_packages = get_local_packages(directory=dest_dir)
+
+    for package in local_packages:
+        if report_missing_sources and not package.sdist:
+            print(f"{package.name}=={package.version}: Missing source distribution.")
+        if report_missing_wheels and not package.wheels:
+            print(f"{package.name}=={package.version}: Missing wheels.")
+
+        for dist in package.get_distributions():
+            dist.load_about_data(dest_dir=dest_dir)
+            abpth = os.path.abspath(os.path.join(dest_dir, dist.about_filename))
+            if not dist.has_key_metadata():
+                print(f"   Missing key ABOUT data in file://{abpth}")
+            if "classifiers" in dist.extra_data:
+                print(f"   Dangling classifiers data in file://{abpth}")
+            if not dist.validate_checksums(dest_dir):
+                print(f"   Invalid checksums in file://{abpth}")
+            if not (dist.sha1 and dist.md5):
+                print(f"   Missing checksums in file://{abpth}")
+
+    check_about(dest_dir=dest_dir)
+
+
+def get_license_expression(declared_licenses):
+    """
+    Return a normalized license expression or None.
+    """
+    if not declared_licenses:
+        return
+    try:
+        from packagedcode.licensing import get_only_expression_from_extracted_license
+
+        return get_only_expression_from_extracted_license(declared_licenses)
+    except ImportError:
+        # Scancode is not installed, clean and join all the licenses
+        lics = [python_safe_name(l).lower() for l in declared_licenses]
+        return " AND ".join(lics).lower()
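The retry logic in `get_remote_file_content` earlier in this file doubles `_delay` on each HTTP 429 response and stops retrying once `_delay` reaches 20 or more. The sketch below only reproduces that delay schedule so the total wait is easy to see; `retry_delays` and `max_delay` are hypothetical names used for illustration and are not part of the upstream script.

```python
def retry_delays(max_delay=20):
    """
    Yield the delay in seconds slept before each attempt, mirroring the
    ``(_delay * 2) or 1`` doubling and the ``_delay < 20`` retry guard.
    """
    delay = 0
    while True:
        yield delay
        if delay >= max_delay:
            # Last attempt: another 429 here would raise instead of retrying.
            return
        # 0 becomes 1 on the first retry, then the delay doubles each time.
        delay = (delay * 2) or 1


delays = list(retry_delays())
print(delays)       # [0, 1, 2, 4, 8, 16, 32]
print(sum(delays))  # 63
```

Under these assumptions a URL that keeps answering 429 is attempted seven times and the caller waits about 63 seconds in total before the exception is raised.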
diff --git a/etc/scripts/utils_thirdparty.py.ABOUT b/etc/scripts/utils_thirdparty.py.ABOUT
new file mode 100644
index 0000000..8480349
--- /dev/null
+++ b/etc/scripts/utils_thirdparty.py.ABOUT
@@ -0,0 +1,15 @@
+about_resource: utils_thirdparty.py
+package_url: pkg:github.com/pypa/pip/@20.3.1#src/pip/_internal/models/wheel.py
+type: github
+namespace: pypa
+name: pip
+version: 20.3.1
+subpath: src/pip/_internal/models/wheel.py
+
+download_url: https://github.com/pypa/pip/blob/20.3.1/src/pip/_internal/models/wheel.py
+copyright: Copyright (c) 2008-2020 The pip developers (see AUTHORS.txt file)
+license_expression: mit
+notes: copied from pip-20.3.1 pip/_internal/models/wheel.py
+ The models code has been heavily inspired from the ISC-licensed packaging-dists
+ https://github.com/uranusjr/packaging-dists by Tzu-ping Chung
+ 
\ No newline at end of file
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 0000000..b221ffc
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,52 @@
+[build-system]
+requires = ["setuptools >= 50", "wheel", "setuptools_scm[toml] >= 6"]
+build-backend = "setuptools.build_meta"
+
+[tool.setuptools_scm]
+# this is used when creating a git archive
+# and when there is no .git dir and/or there is no git installed
+fallback_version = "9999.999.999"
+
+[tool.pytest.ini_options]
+norecursedirs = [
+   ".git",
+   "bin",
+   "dist",
+   "build",
+   "_build",
+   "dist",
+   "etc",
+   "local",
+   "ci",
+   "docs",
+   "man",
+   "share",
+   "samples",
+   ".cache",
+   ".settings",
+   "Include",
+   "include",
+   "Lib",
+   "lib",
+   "lib64",
+   "Lib64",
+   "Scripts",
+   "thirdparty",
+   "tmp",
+   "venv",
+   "tests/data",
+   ".eggs",
+   "src/*/data",
+   "tests/*/data"
+]
+
+python_files = "*.py"
+
+python_classes = "Test"
+python_functions = "test"
+
+addopts = [
+    "-rfExXw",
+    "--strict-markers",
+    "--doctest-modules"
+]
diff --git a/requirements-dev.txt b/requirements-dev.txt
new file mode 100644
index 0000000..e69de29
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..e69de29
diff --git a/requirements_dev.txt b/requirements_dev.txt
deleted file mode 100644
index 42fbdac..0000000
--- a/requirements_dev.txt
+++ /dev/null
@@ -1,3 +0,0 @@
-pytest
--e .
-
diff --git a/saneyaml.ABOUT b/saneyaml.ABOUT
index 5b12815..2c4f0bd 100644
--- a/saneyaml.ABOUT
+++ b/saneyaml.ABOUT
@@ -2,18 +2,15 @@ about_resource: .
 name: saneyaml
 
 description: |
+ Dump readable YAML and load safely any YAML preserving
+ ordering and avoiding surprises of type conversions.
+ This library is a PyYaml wrapper with sane behaviour to read and
+ write readable YAML safely, typically when used for configuration.
 
-homepage_url: http://www.nexb.com/community.html
+homepage_url: https://github.com/nexB/saneyaml
 
 license_expression: apache-2.0
-licenses:
-    - key: apache-2.0
-      name: Apache 2.0
-      file: apache-2.0.LICENSE
-
-copyright: Copyright (c) 2018 nexB Inc. and others
-
+copyright: Copyright (c) nexB Inc. and others
 notice_file: NOTICE
-
 vcs_tool: git
 vcs_repository: https://github.com/nexB/saneyaml.git
diff --git a/setup.cfg b/setup.cfg
index c358a6f..f6cc8f0 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,48 +1,61 @@
-[bdist_wheel]
-universal = 1
-
 [metadata]
-license_file = NOTICE
+name = saneyaml
+license = Apache-2.0
+description = Read and write readable YAML safely preserving order and avoiding bad surprises with unwanted inferred type conversions. This library is a PyYaml wrapper with sane behaviour to read and write readable YAML safely, typically when used for configuration.
+long_description = file:README.rst
+long_description_content_type = text/x-rst
+url = https://github.com/nexB/saneyaml
+author = nexB. Inc. and others
+author_email = info@aboutcode.org
+classifiers = 
+	Development Status :: 5 - Production/Stable
+	Intended Audience :: Developers
+	Programming Language :: Python :: 3
+	Programming Language :: Python :: 3 :: Only
+	Topic :: Software Development
+	Topic :: Utilities
+keywords = 
+	utilities
+	yaml
+	pyyaml
+	block
+	flow
+license_files = 
+	apache-2.0.LICENSE
+	NOTICE
+	AUTHORS.rst
+	CHANGELOG.rst
+	CODE_OF_CONDUCT.rst
 
-[aliases]
-release = clean --all sdist bdist_wheel 
+[options]
+package_dir = 
+	=src
+packages = find:
+include_package_data = true
+zip_safe = false
+setup_requires = setuptools_scm[toml] >= 4
+python_requires = >=3.7
+py_modules = 
+	saneyaml
+install_requires = PyYAML
 
-[tool:pytest]
-norecursedirs =
-    .git
-    bin
-    dist
-    build
-    _build
-    dist
-    local
-    ci
-    docs
-    man
-    share
-    samples
-    .cache
-    .settings
-    etc
-    Include
-    include
-    Lib
-    lib
-    Scripts
-    thirdparty/*
-    tmp/*
-    tests/testdata/*
+[options.packages.find]
+where = src
 
-python_files = *.py
+[options.extras_require]
+testing = 
+	pytest >= 6, != 7.0.0
+	pytest-xdist >= 2
+	aboutcode-toolkit >= 7.0.2
+	twine
+	black
+	isort
+docs = 
+	Sphinx >= 3.3.1
+	sphinx-rtd-theme >= 0.5.0
+	doc8 >= 0.8.1
 
-python_classes=Test
-python_functions=test
+[egg_info]
+tag_build = 
+tag_date = 0
 
-addopts =
-    -rfEsxXw
-    --strict
-     -s
-     -vv
-    --ignore docs/conf.py
-    --ignore setup.py
-    --doctest-modules
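The new setup.cfg declares the optional dependency groups as multi-line values under `[options.extras_require]`. As a rough illustration of how such a section is laid out, the sketch below reads an abridged copy of it with the standard-library `configparser`. This is not how setuptools itself processes the file; `SETUP_CFG` and `read_extras` are hypothetical names introduced for the example.

```python
import configparser
import textwrap

# Abridged copy of the [options.extras_require] section from the diff above.
SETUP_CFG = textwrap.dedent(
    """\
    [options.extras_require]
    testing =
        pytest >= 6, != 7.0.0
        pytest-xdist >= 2
    docs =
        Sphinx >= 3.3.1
        doc8 >= 0.8.1
    """
)


def read_extras(cfg_text):
    """Return a mapping of extra name -> list of requirement strings."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["options.extras_require"]
    # Each value is a multi-line string: one requirement per indented line.
    return {
        name: [line.strip() for line in value.splitlines() if line.strip()]
        for name, value in section.items()
    }


extras = read_extras(SETUP_CFG)
print(extras["testing"])
print(extras["docs"])
```

Each key corresponds to an extra that can be requested at install time, as the docs CI workflow does with `pip install -e .[docs]`.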
diff --git a/setup.py b/setup.py
index 40b6668..bac24a4 100644
--- a/setup.py
+++ b/setup.py
@@ -1,55 +1,6 @@
 #!/usr/bin/env python
-# -*- encoding: utf-8 -*-
 
-from __future__ import absolute_import
-from __future__ import print_function
+import setuptools
 
-from glob import glob
-import io
-from os.path import basename
-from os.path import dirname
-from os.path import join
-from os.path import splitext
-
-from setuptools import find_packages
-from setuptools import setup
-
-
-setup(
-    name='saneyaml',
-    version='0.3',
-    license='Apache-2.0',
-    description='Dump readable YAML and load safely any YAML preserving '
-        'ordering and avoiding surprises of type conversions. '
-        'This library is a PyYaml wrapper with sane behaviour to read and '
-        'write readable YAML safely, typically when used for configuration.',
-    long_description='',
-    author='AboutCode authors and others.',
-    author_email='info@nexb.com',
-    url='http://aboutcode.org',
-    packages=find_packages('src'),
-    package_dir={'': 'src'},
-    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
-    include_package_data=True,
-    zip_safe=False,
-    platforms='any',
-    classifiers=[
-        'Development Status :: 5 - Production/Stable',
-        'Programming Language :: Python',
-        'Programming Language :: Python :: 2',
-        'Programming Language :: Python :: 2.7',
-        'Programming Language :: Python :: 3',
-        'Programming Language :: Python :: 3.6',
-        'Intended Audience :: Developers',
-        'License :: OSI Approved :: Apache Software License',
-        'Operating System :: OS Independent',
-        'Topic :: Software Development',
-        'Topic :: Utilities',
-    ],
-    keywords=[
-        'yaml', 'block', 'flow', 'readable',
-    ],
-    install_requires=[
-        'PyYAML >= 3.11, <= 3.13',
-    ],
-)
+if __name__ == "__main__":
+    setuptools.setup()
diff --git a/src/saneyaml.egg-info/PKG-INFO b/src/saneyaml.egg-info/PKG-INFO
new file mode 100644
index 0000000..b9719a1
--- /dev/null
+++ b/src/saneyaml.egg-info/PKG-INFO
@@ -0,0 +1,70 @@
+Metadata-Version: 2.1
+Name: saneyaml
+Version: 0.6.0
+Summary: Read and write readable YAML safely preserving order and avoiding bad surprises with unwanted inferred type conversions. This library is a PyYaml wrapper with sane behaviour to read and write readable YAML safely, typically when used for configuration.
+Home-page: https://github.com/nexB/saneyaml
+Author: nexB. Inc. and others
+Author-email: info@aboutcode.org
+License: Apache-2.0
+Keywords: utilities,yaml,pyyaml,block,flow
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3 :: Only
+Classifier: Topic :: Software Development
+Classifier: Topic :: Utilities
+Requires-Python: >=3.7
+Description-Content-Type: text/x-rst
+Provides-Extra: testing
+Provides-Extra: docs
+License-File: apache-2.0.LICENSE
+License-File: NOTICE
+License-File: AUTHORS.rst
+License-File: CHANGELOG.rst
+License-File: CODE_OF_CONDUCT.rst
+
+========
+saneyaml
+========
+
+This micro library is a PyYaml wrapper with sane behaviour to read and
+write readable YAML safely, typically when used with configuration files.
+
+With saneyaml you can dump readable and clean YAML and load safely any YAML
+preserving ordering and avoiding surprises of type conversions by loading
+everything except booleans as strings.
+
+Optionally you can check for duplicated map keys when loading YAML.
+
+Works with Python 3. Requires PyYAML 5.x or higher.
+
+license: apache-2.0
+homepage_url: https://github.com/nexB/saneyaml
+
+Usage::
+
+    pip install saneyaml
+    
+    >>> from  saneyaml import load
+    >>> from  saneyaml import dump
+    >>> a=load('''version: 3.0.0.dev6
+    ... 
+    ... description: |
+    ...     AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+    ...     provides a way to document a software component.
+    ... ''')
+    >>> a
+    dict([
+        (u'version', u'3.0.0.dev6'), 
+        (u'description', u'AboutCode Toolkit is a tool to process ABOUT files. '
+        'An ABOUT file\nprovides a way to document a software component.\n')])
+    
+    >>> pprint(a.items())
+    [(u'version', u'3.0.0.dev6'),
+     (u'description',
+      u'AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file\nprovides a way to document a software component.\n')]
+    >>> print(dump(a))
+    version: 3.0.0.dev6
+    description: |
+      AboutCode Toolkit is a tool to process ABOUT files. An ABOUT file
+      provides a way to document a software component.
diff --git a/src/saneyaml.egg-info/SOURCES.txt b/src/saneyaml.egg-info/SOURCES.txt
new file mode 100644
index 0000000..4d0fd52
--- /dev/null
+++ b/src/saneyaml.egg-info/SOURCES.txt
@@ -0,0 +1,115 @@
+.gitattributes
+.gitignore
+.readthedocs.yml
+AUTHORS.rst
+CHANGELOG.rst
+CODE_OF_CONDUCT.rst
+CONTRIBUTING.rst
+MANIFEST.in
+Makefile
+NOTICE
+README.rst
+apache-2.0.LICENSE
+appveyor.yml
+azure-pipelines.yml
+configure
+configure.bat
+pyproject.toml
+requirements-dev.txt
+requirements.txt
+saneyaml.ABOUT
+setup.cfg
+setup.py
+.github/workflows/docs-ci.yml
+.github/workflows/pypi-release.yml
+docs/Makefile
+docs/make.bat
+docs/scripts/doc8_style_check.sh
+docs/scripts/sphinx_build_link_check.sh
+docs/source/conf.py
+docs/source/index.rst
+docs/source/skeleton-usage.rst
+docs/source/_static/theme_overrides.css
+docs/source/contribute/contrib_doc.rst
+etc/ci/azure-container-deb.yml
+etc/ci/azure-container-rpm.yml
+etc/ci/azure-posix.yml
+etc/ci/azure-win.yml
+etc/ci/install_sudo.sh
+etc/ci/macports-ci
+etc/ci/macports-ci.ABOUT
+etc/ci/mit.LICENSE
+etc/scripts/README.rst
+etc/scripts/check_thirdparty.py
+etc/scripts/fetch_thirdparty.py
+etc/scripts/gen_pypi_simple.py
+etc/scripts/gen_pypi_simple.py.ABOUT
+etc/scripts/gen_pypi_simple.py.NOTICE
+etc/scripts/gen_requirements.py
+etc/scripts/gen_requirements_dev.py
+etc/scripts/requirements.txt
+etc/scripts/test_utils_pip_compatibility_tags.py
+etc/scripts/test_utils_pip_compatibility_tags.py.ABOUT
+etc/scripts/test_utils_pypi_supported_tags.py
+etc/scripts/test_utils_pypi_supported_tags.py.ABOUT
+etc/scripts/utils_dejacode.py
+etc/scripts/utils_pip_compatibility_tags.py
+etc/scripts/utils_pip_compatibility_tags.py.ABOUT
+etc/scripts/utils_pypi_supported_tags.py
+etc/scripts/utils_pypi_supported_tags.py.ABOUT
+etc/scripts/utils_requirements.py
+etc/scripts/utils_thirdparty.py
+etc/scripts/utils_thirdparty.py.ABOUT
+src/saneyaml.py
+src/saneyaml.egg-info/PKG-INFO
+src/saneyaml.egg-info/SOURCES.txt
+src/saneyaml.egg-info/dependency_links.txt
+src/saneyaml.egg-info/not-zip-safe
+src/saneyaml.egg-info/requires.txt
+src/saneyaml.egg-info/top_level.txt
+tests/test_saneyaml.py
+tests/test_skeleton_codestyle.py
+tests/data/ruby_tags/metadata1
+tests/data/ruby_tags/metadata1.notag
+tests/data/yamls/about.yml
+tests/data/yamls/about.yml.expected.load.json
+tests/data/yamls/about.yml.expected.yaml.dump
+tests/data/yamls/afpl-8.0.yml
+tests/data/yamls/afpl-8.0.yml.expected.load.json
+tests/data/yamls/afpl-8.0.yml.expected.yaml.dump
+tests/data/yamls/artistic-2.0_or_later_or_gpl-2.0.yml
+tests/data/yamls/artistic-2.0_or_later_or_gpl-2.0.yml.expected.load.json
+tests/data/yamls/artistic-2.0_or_later_or_gpl-2.0.yml.expected.yaml.dump
+tests/data/yamls/autoconf-exception-3.0.yml
+tests/data/yamls/autoconf-exception-3.0.yml.expected.load.json
+tests/data/yamls/autoconf-exception-3.0.yml.expected.yaml.dump
+tests/data/yamls/contextlib2.yml
+tests/data/yamls/contextlib2.yml.expected.load.json
+tests/data/yamls/contextlib2.yml.expected.yaml.dump
+tests/data/yamls/copyright_test.yml
+tests/data/yamls/copyright_test.yml.expected.load.json
+tests/data/yamls/copyright_test.yml.expected.yaml.dump
+tests/data/yamls/corner-cases.yml
+tests/data/yamls/corner-cases.yml.expected.load.json
+tests/data/yamls/corner-cases.yml.expected.yaml.dump
+tests/data/yamls/gpl-2.0.yml
+tests/data/yamls/gpl-2.0.yml.expected.load.json
+tests/data/yamls/gpl-2.0.yml.expected.yaml.dump
+tests/data/yamls/idna-2.6-py2.py3-none-any.whl.yml
+tests/data/yamls/idna-2.6-py2.py3-none-any.whl.yml.expected.load.json
+tests/data/yamls/idna-2.6-py2.py3-none-any.whl.yml.expected.yaml.dump
+tests/data/yamls/isodate.yml
+tests/data/yamls/isodate.yml.expected.load.json
+tests/data/yamls/isodate.yml.expected.yaml.dump
+tests/data/yamls/license_texs.yml
+tests/data/yamls/license_texs.yml.expected.load.json
+tests/data/yamls/license_texs.yml.expected.yaml.dump
+tests/data/yamls/lxml-4.2.1-cp27-cp27m-win_amd64.whl.yml
+tests/data/yamls/lxml-4.2.1-cp27-cp27m-win_amd64.whl.yml.expected.load.json
+tests/data/yamls/lxml-4.2.1-cp27-cp27m-win_amd64.whl.yml.expected.yaml.dump
+tests/data/yamls/lxml-4.2.1.tar.gz.yml
+tests/data/yamls/lxml-4.2.1.tar.gz.yml.expected.load.json
+tests/data/yamls/lxml-4.2.1.tar.gz.yml.expected.yaml.dump
+tests/data/yamls/not-a-license_125.yml
+tests/data/yamls/not-a-license_125.yml.expected.load.json
+tests/data/yamls/not-a-license_125.yml.expected.yaml.dump
\ No newline at end of file
diff --git a/src/saneyaml.egg-info/dependency_links.txt b/src/saneyaml.egg-info/dependency_links.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/src/saneyaml.egg-info/dependency_links.txt
@@ -0,0 +1 @@
+
diff --git a/src/saneyaml.egg-info/not-zip-safe b/src/saneyaml.egg-info/not-zip-safe
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/src/saneyaml.egg-info/not-zip-safe
@@ -0,0 +1 @@
+
diff --git a/src/saneyaml.egg-info/requires.txt b/src/saneyaml.egg-info/requires.txt
new file mode 100644
index 0000000..e5584da
--- /dev/null
+++ b/src/saneyaml.egg-info/requires.txt
@@ -0,0 +1,14 @@
+PyYAML
+
+[docs]
+Sphinx>=3.3.1
+sphinx-rtd-theme>=0.5.0
+doc8>=0.8.1
+
+[testing]
+pytest!=7.0.0,>=6
+pytest-xdist>=2
+aboutcode-toolkit>=7.0.2
+twine
+black
+isort
diff --git a/src/saneyaml.egg-info/top_level.txt b/src/saneyaml.egg-info/top_level.txt
new file mode 100644
index 0000000..e1b4d5e
--- /dev/null
+++ b/src/saneyaml.egg-info/top_level.txt
@@ -0,0 +1 @@
+saneyaml
diff --git a/src/saneyaml.py b/src/saneyaml.py
index 9286215..eb0c19f 100644
--- a/src/saneyaml.py
+++ b/src/saneyaml.py
@@ -1,26 +1,16 @@
 #!/usr/bin/env python
 # -*- coding: utf8 -*-
 #
-# Copyright (c) 2018 nexB Inc. and others. All rights reserved.
-# http://nexb.com and https://github.com/nexB/saneyaml/
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/saneyaml/ for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
 #
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#     http://www.apache.org/licenses/LICENSE-2.0
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import absolute_import
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from collections import OrderedDict
+
 from functools import partial
-import sys
+import re
 
 import yaml
 from yaml.error import YAMLError
@@ -34,25 +24,6 @@ try:  # pragma: nocover
 except ImportError:  # pragma: nocover
     from yaml import SafeLoader
 
-try:  # pragma: nocover
-    # Python 2
-    unicode
-except NameError:  # pragma: nocover
-    # Python 3
-    unicode = str  # NOQA
-
-# Python 2 to 3.5
-python2 = sys.version_info[0] < 3
-python3old = sys.version_info[0] == 3 and sys.version_info[1] < 6
-OLD_PY = python2 or python3old
-
-if OLD_PY:  # pragma: nocover
-    from collections import OrderedDict as odict
-else:
-    # CPython 3.6 and up dict is ordered by default. And this is the Python spec
-    # in 3.7 and up.
-    odict = dict
-
 """
 A wrapper around PyYAML to provide sane defaults ensuring that dump/load does
 not damage content, keeps ordering and ordered mappings, use always block-style
@@ -71,11 +42,11 @@ versions and Python vs C/libyaml, the tests may behave differently in some cases
 and fail.
 """
 
-
 ###############################################################################
 # Loading
 ###############################################################################
 
+
 def load(s, allow_duplicate_keys=True):
     """
     Return an object safely loaded from a YAML string `s`. `s` must be unicode
@@ -109,25 +80,27 @@ class BaseSaneLoader(SafeLoader):
     def ordered_loader(self, node, check_dupe=False):
         """
         Ensure that YAML maps order is preserved and loaded in an ordered mapping.
+        Legacy from the pre-Python 3.6 times when dicts were not ordered.
         """
         assert isinstance(node, yaml.MappingNode)
-        omap = odict()
+        omap = dict()
         yield omap
         for key, value in node.value:
             key = self.construct_object(key)
             value = self.construct_object(value)
             if check_dupe and key in omap:
-                raise UnsupportedYamlFeatureError(
-                    'Duplicate key in YAML source: {}'.format(key))
+                raise UnsupportedYamlFeatureError('Duplicate key in YAML source: {}'.format(key))
             omap[key] = value
 
+
 # Load most types as strings : nulls, ints, (such as in version 01) floats (such
 # as version 2.20) and timestamps conversion (in versions too), booleans are all
 # loaded as plain strings.
 # This avoid unwanted type conversions for unquoted strings and the resulting
 # content damaging. This overrides the implicit resolvers. Callers must handle
 # type conversion explicitly from unicode to other types in the loaded objects.
-
+# NOTE: we are still using the built-in loader for booleans. It will recognize
+# yes/no as a boolean.
 BaseSaneLoader.add_constructor('tag:yaml.org,2002:str', BaseSaneLoader.string_loader)
 BaseSaneLoader.add_constructor('tag:yaml.org,2002:null', BaseSaneLoader.string_loader)
 BaseSaneLoader.add_constructor('tag:yaml.org,2002:boolean', BaseSaneLoader.string_loader)
@@ -143,6 +116,7 @@ BaseSaneLoader.add_constructor(None, BaseSaneLoader.ordered_loader)
 class SaneLoader(BaseSaneLoader):
     pass
 
+
 # Always load mapping as ordered mappings
 SaneLoader.add_constructor('tag:yaml.org,2002:map', BaseSaneLoader.ordered_loader)
 SaneLoader.add_constructor('tag:yaml.org,2002:omap', BaseSaneLoader.ordered_loader)
@@ -161,11 +135,13 @@ dupe_checkding_ordered_loader = partial(BaseSaneLoader.ordered_loader, check_dup
 DupeKeySaneLoader.add_constructor('tag:yaml.org,2002:map', dupe_checkding_ordered_loader)
 DupeKeySaneLoader.add_constructor('tag:yaml.org,2002:omap', dupe_checkding_ordered_loader)
 
-
 ###############################################################################
 # Dumping
 ###############################################################################
 
+WIDTH = 90
+
+
 def dump(obj, indent=2, encoding=None):
     """
     Return a safe and sane YAML string representation from `obj`.
@@ -187,7 +163,7 @@ def dump(obj, indent=2, encoding=None):
         # anything above 2 will yield weird vertical indents on lists and maps
         indent=indent,
         # make this 80ish
-        width=90,
+        width=WIDTH,
         # posix LF
         line_break='\n',
         # no --- and ...
@@ -197,30 +173,57 @@ def dump(obj, indent=2, encoding=None):
 
 
 class IndentingEmitter(Emitter):
+
     def increase_indent(self, flow=False, indentless=False):
         """
         Ensure that lists items are always indented.
         """
         return super(IndentingEmitter, self).increase_indent(
-            flow=False, indentless=False)
+            flow=False,
+            indentless=False,
+        )
 
 
 class SaneDumper(IndentingEmitter, Serializer, SafeRepresenter, Resolver):
 
     def __init__(self, stream,
-            default_style=None, default_flow_style=None,
-            canonical=None, indent=None, width=None,
-            allow_unicode=None, line_break=None,
-            encoding=None, explicit_start=None, explicit_end=None,
-            version=None, tags=None):
-        IndentingEmitter.__init__(self, stream, canonical=canonical,
-                indent=indent, width=width,
-                allow_unicode=allow_unicode, line_break=line_break)
-        Serializer.__init__(self, encoding=encoding,
-                explicit_start=explicit_start, explicit_end=explicit_end,
-                version=version, tags=tags)
-        SafeRepresenter.__init__(self, default_style=default_style,
-                default_flow_style=default_flow_style)
+        default_style=None,
+        default_flow_style=None,
+        canonical=None,
+        indent=None,
+        width=None,
+        allow_unicode=None,
+        line_break=None,
+        encoding=None,
+        explicit_start=None,
+        explicit_end=None,
+        version=None,
+        tags=None,
+        sort_keys=False,
+        **kwargs,
+    ):
+        IndentingEmitter.__init__(
+            self,
+            stream,
+            canonical=canonical,
+            indent=indent,
+            width=width,
+            allow_unicode=allow_unicode,
+            line_break=line_break,
+        )
+        Serializer.__init__(
+            self,
+            encoding=encoding,
+            explicit_start=explicit_start,
+            explicit_end=explicit_end,
+            version=version,
+            tags=tags,
+        )
+        SafeRepresenter.__init__(
+            self,
+            default_style=default_style,
+            default_flow_style=default_flow_style,
+        )
         Resolver.__init__(self)
 
     def determine_block_hints(self, text):
@@ -245,7 +248,7 @@ class SaneDumper(IndentingEmitter, Serializer, SafeRepresenter, Resolver):
         """
         Always dump nulls as empty string.
         """
-        return self.represent_scalar('tag:yaml.org,2002:null', '')
+        return self.represent_scalar('tag:yaml.org,2002:str', '', style=None)
 
     def string_dumper(self, value):
         """
@@ -254,20 +257,50 @@ class SaneDumper(IndentingEmitter, Serializer, SafeRepresenter, Resolver):
         """
         tag = 'tag:yaml.org,2002:str'
         style = None
+
+        if value is None:
+            return ''
+
+        if isinstance(value, bool):
+            value = 'yes' if value else 'no'
+            style = ''
+
         if isinstance(value, float):
             style = "'"
 
+        if isinstance(value, int):
+            value = str(value)
+            style = ''
+
         if isinstance(value, bytes):
             value = value.decode('utf-8')
-        elif not isinstance(value, unicode):
+        elif isinstance(value, int):
+            value = str(value)
+        elif not isinstance(value, str):
             value = repr(value)
 
         # do not quote integer strings
-        if value.isdigit() and unicode(int(value)) == value:
-            style = None
-            tag = 'tag:yaml.org,2002:int'
+        if value.isdigit():
+            if value.lstrip('0') == value:
+                style = ''
+            else:
+                # things such as 012 needs to be quoted
+                style = "'"
+
+        # quote things that could be mistakenly loaded as date
+        if is_iso_date(value):
+            style = "'"
+
+        # quote things that could be mistakenly loaded as float such as version numbers
+        if value != '.' and len(value.split('.')) == 2 and all(c in '0123456789.' for c in value):
+            style = "'"
+
+        elif value == 'null':
+            style = "'"
 
-        if '\n' in value:
+        # if '\n' in value or len(value) > WIDTH:
+            # literal_style for multilines or long
+        elif '\n' in value:
             # literal_style for multilines
             style = '|'
 
@@ -282,12 +315,27 @@ class SaneDumper(IndentingEmitter, Serializer, SafeRepresenter, Resolver):
         return self.represent_scalar('tag:yaml.org,2002:bool', value, style=None)
 
 
+def is_float(s):
+    """
+    Return True if this is a float with trailing zeroes such as `1.20`
+    """
+    try:
+        float(s)
+        return s.startswith('0') or s.endswith('0')
+    except:
+        return False
+
+
+# Return True if s is an iso date such as `2019-12-12`
+is_iso_date = re.compile(r'19|20[0-9]{2}-[0-1][0-9]-[0-3]?[1-9]').match
+
 SaneDumper.add_representer(int, SaneDumper.string_dumper)
-SaneDumper.add_representer(odict, SaneDumper.ordered_dumper)
-SaneDumper.add_representer(OrderedDict, SaneDumper.ordered_dumper)
+SaneDumper.add_representer(dict, SaneDumper.ordered_dumper)
 SaneDumper.add_representer(type(None), SaneDumper.null_dumper)
-SaneDumper.add_representer(bool, SaneDumper.boolean_dumper)
+SaneDumper.add_representer(bool, SaneDumper.string_dumper)
 SaneDumper.add_representer(bytes, SaneDumper.string_dumper)
 SaneDumper.add_representer(str, SaneDumper.string_dumper)
-SaneDumper.add_representer(unicode, SaneDumper.string_dumper)
 SaneDumper.add_representer(float, SaneDumper.string_dumper)
+
+SaneDumper.yaml_implicit_resolvers = {}
+SaneDumper.yaml_path_resolvers = {}
diff --git a/tests/data/yamls/corner-cases.yml.expected.yaml.dump b/tests/data/yamls/corner-cases.yml.expected.yaml.dump
index 2b1278e..ac4194d 100644
--- a/tests/data/yamls/corner-cases.yml.expected.yaml.dump
+++ b/tests/data/yamls/corner-cases.yml.expected.yaml.dump
@@ -1,8 +1,8 @@
 about_resource: 'null'
 name: '123.34'
 about_resource_path: '012'
-? ''
+?
 : - this
   - 'null'
   - '2012-03-12'
-that: ''
+that:
diff --git a/tests/test_saneyaml.py b/tests/test_saneyaml.py
index 86f52d8..bd296a7 100644
--- a/tests/test_saneyaml.py
+++ b/tests/test_saneyaml.py
@@ -1,24 +1,14 @@
 #!/usr/bin/env python
 # -*- coding: utf8 -*-
 #
-# Copyright (c) 2018 nexB Inc. and others. All rights reserved.
-# http://nexb.com and https://github.com/nexB/saneyaml/
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/saneyaml/ for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
 #
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#     http://www.apache.org/licenses/LICENSE-2.0
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import absolute_import
-from __future__ import print_function
-from __future__ import unicode_literals
-
-from collections import OrderedDict
+
 import io
 import json
 import os
@@ -27,12 +17,6 @@ from unittest.case import TestCase
 
 import saneyaml
 
-try:
-    unicode
-except NameError:
-    unicode = str  # NOQA
-
-
 test_data_dir = os.path.join(os.path.dirname(__file__), 'data')
 
 
@@ -104,7 +88,7 @@ a: 45
     def test_dump_does_handles_numbers_and_booleans_correctly(self):
         test = [
             None,
-            OrderedDict([
+            dict([
                 (1, None),
                 (123.34, 'tha')
             ])
@@ -146,7 +130,7 @@ x: !!int 5
 that: *environ
 '''
         result = saneyaml.load(test)
-        expected = OrderedDict([
+        expected = dict([
             ('x', '5'),
             ('', ['this', 'null', '2012-03-12']),
             ('that', '')
@@ -157,13 +141,11 @@ that: *environ
 safe_chars = re.compile(r'[\W_]', re.MULTILINE)
 
 
-def python_safe(s, python2=False):
+def python_safe(s):
     """Return a name safe to use as a python function name"""
     s = s.strip().lower()
     s = [x for x in safe_chars.split(s) if x]
     s = '_'.join(s)
-    if saneyaml.python2:
-        s = s.encode('utf-8')
     return s
 
 
@@ -186,7 +168,7 @@ def get_yaml_test_method(test_file, expected_load_file, expected_dump_file, rege
                 out.write(test_dump)
 
         with io.open(expected_load_file, encoding='utf-8') as inp:
-            expected_load = json.load(inp, object_pairs_hook=OrderedDict)
+            expected_load = json.load(inp)
 
         with io.open(expected_dump_file, encoding='utf-8') as inp:
             expected_dump = inp.read()
@@ -194,7 +176,6 @@ def get_yaml_test_method(test_file, expected_load_file, expected_dump_file, rege
         assert expected_load == test_load
         assert expected_dump == test_dump
 
-
     tfn = test_file.replace(test_data_dir, '').strip('/\\')
     test_name = 'test_{}'.format(tfn)
     test_name = python_safe(test_name)
diff --git a/tests/test_skeleton_codestyle.py b/tests/test_skeleton_codestyle.py
new file mode 100644
index 0000000..2eb6e55
--- /dev/null
+++ b/tests/test_skeleton_codestyle.py
@@ -0,0 +1,36 @@
+#
+# Copyright (c) nexB Inc. and others. All rights reserved.
+# ScanCode is a trademark of nexB Inc.
+# SPDX-License-Identifier: Apache-2.0
+# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
+# See https://github.com/nexB/skeleton for support or download.
+# See https://aboutcode.org for more information about nexB OSS projects.
+#
+
+import subprocess
+import unittest
+import configparser
+
+
+class BaseTests(unittest.TestCase):
+    def test_skeleton_codestyle(self):
+        """
+        This test shouldn't run in proliferated repositories.
+        """
+        setup_cfg = configparser.ConfigParser()
+        setup_cfg.read("setup.cfg")
+        if setup_cfg["metadata"]["name"] != "skeleton":
+            return
+
+        args = "venv/bin/black --check -l 100 setup.py etc tests"
+        try:
+            subprocess.check_output(args.split())
+        except subprocess.CalledProcessError as e:
+            print("===========================================================")
+            print(e.output)
+            print("===========================================================")
+            raise Exception(
+                "Black style check failed; please format the code using:\n"
+                "  python -m black -l 100 setup.py etc tests",
+                e.output,
+            ) from e
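
Note on the is_iso_date helper added to saneyaml.py in the diff above: in Python's re syntax, alternation binds more loosely than concatenation, so the pattern reads as "19" OR "20[0-9]{2}-...". Because .match only anchors at the start of the string, any value beginning with "19" is treated as a date and gets quoted. The sketch below puts the pattern as it appears in the diff next to a grouped, end-anchored variant. The name strict_iso_date and its pattern are assumptions made for illustration; they are not part of saneyaml.

```python
import re

# Pattern copied verbatim from the diff above.
is_iso_date = re.compile(r'19|20[0-9]{2}-[0-1][0-9]-[0-3]?[1-9]').match

# Hypothetical stricter variant (an assumption, not upstream code): the
# century alternatives are grouped and the match is anchored at the end.
strict_iso_date = re.compile(
    r'(?:19|20)[0-9]{2}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01])$'
).match

samples = ['2019-12-12', '2012-03-12', '1984', '19 bottles', '2019-13-40']
for s in samples:
    # Columns: value, result of the diff's pattern, result of the strict variant.
    print(repr(s), bool(is_iso_date(s)), bool(strict_iso_date(s)))
```

In string_dumper a positive match only switches the scalar to single-quoted style, so the extra matches appear to cost nothing more than some additional quoting in the dumped YAML.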

Debdiff

[The following lists of changes regard files as different if they have different names, permissions or owners.]

Files in second set of .debs but not in first

-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.6.0.egg-info/PKG-INFO
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.6.0.egg-info/dependency_links.txt
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.6.0.egg-info/not-zip-safe
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.6.0.egg-info/requires.txt
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.6.0.egg-info/top_level.txt

Files in first set of .debs but not in second

-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.3.egg-info/PKG-INFO
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.3.egg-info/dependency_links.txt
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.3.egg-info/not-zip-safe
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.3.egg-info/requires.txt
-rw-r--r--  root/root   /usr/lib/python3/dist-packages/saneyaml-0.3.egg-info/top_level.txt

No differences were encountered in the control files
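
The two file lists above follow from the comparison rule stated in the bracketed note: an entry counts as the same file only when name, permissions and owner all agree. The egg-info directory name embeds the version, so every file under it appears once as removed and once as added. A minimal sketch of that comparison as a set difference, using a hypothetical listing helper and only two of the five entries per side; this is an illustration, not debdiff's implementation:

```python
def listing(version, names):
    """Build (permissions, owner, path) tuples for a hypothetical package."""
    base = '/usr/lib/python3/dist-packages/saneyaml-{}.egg-info/'.format(version)
    return {('-rw-r--r--', 'root/root', base + name) for name in names}


names = ['PKG-INFO', 'top_level.txt']
first = listing('0.3', names)
second = listing('0.6.0', names)

# Entries are equal only if permissions, owner and path all match.
only_in_second = sorted(second - first)
only_in_first = sorted(first - second)

print('Files in second set but not in first')
for perms, owner, path in only_in_second:
    print(perms, owner, path)

print('Files in first set but not in second')
for perms, owner, path in only_in_first:
    print(perms, owner, path)
```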
