[run]
branch = True
source =
    hyperlink
    ../hyperlink

[paths]
source =
    ../hyperlink
    */lib/python*/site-packages/hyperlink
    */Lib/site-packages/hyperlink
    */pypy/site-packages/hyperlink
# Hyperlink Changelog

## dev (not yet released)

* *None so far*

## 17.3.0

*(July 18, 2017)*

Fixed a couple of major decoding issues and simplified the URL API.

* Limit the types accepted by `URL.from_text()` to just text (`str` on
  Python 3, `unicode` on Python 2). See #20.
* Fix percent-decoding issues surrounding multiple calls to
  `URL.to_iri()`. See #16.
* Remove the `socket`-inspired `family` argument from `URL`'s APIs. It
  was never consistently implemented and created slightly more
  problems than it solved.
* Improve authority parsing. See #26.
* Include LICENSE, README, docs, and other resources in the package.
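The decoding pitfall behind the `to_iri()` fix above is easy to demonstrate with the standard library (an illustrative sketch, not hyperlink's implementation): percent-decoding is not idempotent, so applying it more than once corrupts data.

```python
from urllib.parse import unquote

# A path segment whose *decoded* form is "100%25" -- i.e., the original
# text contained a literal percent sign, which percent-encodes as "%25".
encoded = "100%2525"

once = unquote(encoded)   # correct: "100%25"
twice = unquote(once)     # corrupted: "100%" -- the "%25" left over from
                          # the first pass gets decoded a second time
print(once, twice)
```

This is why a conversion like `to_iri()` must track what has already been decoded rather than blindly re-decoding its own output.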

## 17.2.1

*(June 18, 2017)*

A small bugfix release after yesterday's big changes. This patch
version simply reverts an exception message for parameters expecting
strings on Python 3, returning to compliance with Twisted's test
suite.

## 17.2.0

*(June 17, 2017)*

Fixed a great round of issues based on the amazing community review
(@wsanchez and @jvanasco) after our first listserv announcement and
[PyConWeb talk](https://www.youtube.com/watch?v=EIkmADO-r10).

* Add checking for invalid unescaped delimiters in parameters to the
  `URL` constructor. No more slashes and question marks allowed in
  path segments themselves.
* More robust support for IDNA decoding on "narrow"/UCS-2 Python
  builds (e.g., Mac's built-in Python).
* Correctly encode colons in the first segment of relative paths for
  URLs with no scheme set.
* Make URLs with empty paths compare as equal (`http://example.com`
  vs. `http://example.com/`), per RFC 3986. If you need the stricter
  check, compare the attributes directly or compare the strings.
* Automatically escape the arguments to `.child()` and `.sibling()`.
* Fix some IPv6 and port parsing corner cases.
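The empty-path equality rule can be sketched with the standard library (illustrative only; hyperlink handles this internally): per RFC 3986 section 6.2.3, when an authority is present, an empty path is equivalent to `/`.

```python
from urllib.parse import urlsplit

def normalized(url):
    """Normalize an empty path to "/" when an authority (netloc) is
    present, per RFC 3986 section 6.2.3. Illustrative sketch only."""
    parts = urlsplit(url)
    path = parts.path or ("/" if parts.netloc else "")
    return parts._replace(path=path)

# With normalization applied, the two spellings compare equal.
assert normalized("http://example.com") == normalized("http://example.com/")
```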

## 17.1.1

* Python 2.6 support
* Added LICENSE
* Automated CI and code coverage
* When a host and a query string are present, empty paths are now
  rendered as a single slash. This is slightly more in line with RFC
  3986 section 6.2.3, but may need to go further and render a single
  slash whenever the authority is present. This also better replicates
  the old behavior of Twisted's URL.

## 17.1.0

* Correct encoding for the username/password part of the URL (userinfo)
* Dot segments are resolved on empty `URL.click()`
* Many, many more schemes and default ports
* Faster percent-encoding with segment-specific functions
* Better detection and inference of scheme netloc usage (the presence
  of `//` in URLs)
* IPv6 support with IP literal validation
* Faster, regex-based parsing
* `URLParseError` type for errors while parsing URLs
* `URL` is now hashable, so feel free to use URLs as keys in dicts
* Improved error message on invalid scheme, directing users to
  `URL.from_text()` in the event that they used the wrong constructor
* PEP8-compatible API, with full, transparent backwards compatibility
  for Twisted APIs, guaranteed
* Extensive docstring expansion
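The dot-segment resolution that `URL.click()` models comes from RFC 3986's relative-resolution algorithm. The standard library's `urljoin` implements the same algorithm and illustrates the behavior:

```python
from urllib.parse import urljoin

base = "http://example.com/a/b/c"

# "../d": drop the last segment "c", go up past "b", then descend to "d"
print(urljoin(base, "../d"))  # http://example.com/a/d

# an empty reference resolves to the base URL itself
print(urljoin(base, ""))      # http://example.com/a/b/c
```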

## Pre-17.0.0

* Lots of good features! Used to be called `twisted.python.url`.
Copyright (c) 2017
Glyph Lefkowitz
Itamar Turner-Trauring
Jean Paul Calderone
Adi Roiban
Amber Hawkie Brown
Mahmoud Hashemi

and others that have contributed code to the public domain.

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
include README.md LICENSE CHANGELOG.md tox.ini requirements-test.txt .coveragerc Makefile pytest.ini .tox-coveragerc
exclude TODO.md appveyor.yml

graft docs
prune docs/_build
Metadata-Version: 1.1
Name: hyperlink
Version: 17.3.1
Summary: A featureful, correct URL for Python.
Home-page: https://github.com/python-hyper/hyperlink
Author: Mahmoud Hashemi and Glyph Lefkowitz
Author-email: mahmoud@hatnote.com
License: MIT
Description: The humble, but powerful, URL runs everything around us. Chances
        are you've used several just to read this text.

        Hyperlink is a featureful, pure-Python implementation of the URL, with
        an emphasis on correctness. MIT licensed.

        See the docs at http://hyperlink.readthedocs.io.

Platform: any
Classifier: Topic :: Utilities
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: Implementation :: PyPy
# Hyperlink

*Cool URLs that don't change.*

<a href="https://hyperlink.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat"></a>
<a href="https://pypi.python.org/pypi/hyperlink"><img src="https://img.shields.io/pypi/v/hyperlink.svg"></a>
<a href="http://calver.org"><img src="https://img.shields.io/badge/calver-YY.MINOR.MICRO-22bfda.svg"></a>

Hyperlink provides a pure-Python implementation of immutable
URLs. Based on [RFC 3986][rfc3986] and [3987][rfc3987], the Hyperlink URL
makes working with both URIs and IRIs easy.

Hyperlink is tested against Python 2.7, 3.4, 3.5, 3.6, and PyPy.

Full documentation is available on [Read the Docs][docs].

[rfc3986]: https://tools.ietf.org/html/rfc3986
[rfc3987]: https://tools.ietf.org/html/rfc3987
[docs]: http://hyperlink.readthedocs.io/en/latest/

## Installation

Hyperlink is a pure-Python package and requires nothing but
Python. The easiest way to install is with pip:

```
pip install hyperlink
```

Then, hyperlink away!

```python
from hyperlink import URL

url = URL.from_text('http://github.com/mahmoud/hyperlink?utm_source=README')
utm_source = url.get('utm_source')
better_url = url.replace(scheme='https')
user_url = better_url.click('..')
```

See the full API docs on [Read the Docs][docs].

## More information

Hyperlink would not have been possible without the help of
[Glyph Lefkowitz](https://glyph.twistedmatrix.com/) and many other
community members, especially considering that it started as an
extract from the Twisted networking library. Thanks to them,
Hyperlink's URL has been production-grade for well over a decade.

Still, should you encounter any issues, do file an issue, or submit a
pull request.
# Makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
PAPER         =
BUILDDIR      = _build

# User-friendly check for sphinx-build
ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
endif

# Internal variables.
PAPEROPT_a4     = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS   = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS  = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .

.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest coverage gettext

help:
	@echo "Please use \`make <target>' where <target> is one of"
	@echo "  html       to make standalone HTML files"
	@echo "  dirhtml    to make HTML files named index.html in directories"
	@echo "  singlehtml to make a single large HTML file"
	@echo "  pickle     to make pickle files"
	@echo "  json       to make JSON files"
	@echo "  htmlhelp   to make HTML files and a HTML help project"
	@echo "  qthelp     to make HTML files and a qthelp project"
	@echo "  applehelp  to make an Apple Help Book"
	@echo "  devhelp    to make HTML files and a Devhelp project"
	@echo "  epub       to make an epub"
	@echo "  latex      to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
	@echo "  latexpdf   to make LaTeX files and run them through pdflatex"
	@echo "  latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
	@echo "  text       to make text files"
	@echo "  man        to make manual pages"
	@echo "  texinfo    to make Texinfo files"
	@echo "  info       to make Texinfo files and run them through makeinfo"
	@echo "  gettext    to make PO message catalogs"
	@echo "  changes    to make an overview of all changed/added/deprecated items"
	@echo "  xml        to make Docutils-native XML files"
	@echo "  pseudoxml  to make pseudoxml-XML files for display purposes"
	@echo "  linkcheck  to check all external links for integrity"
	@echo "  doctest    to run all doctests embedded in the documentation (if enabled)"
	@echo "  coverage   to run coverage check of the documentation (if enabled)"

clean:
	rm -rf $(BUILDDIR)/*

html:
	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
	@echo
	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."

dirhtml:
	$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
	@echo
	@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."

singlehtml:
	$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
	@echo
	@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."

pickle:
	$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
	@echo
	@echo "Build finished; now you can process the pickle files."

json:
	$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
	@echo
	@echo "Build finished; now you can process the JSON files."

htmlhelp:
	$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
	@echo
	@echo "Build finished; now you can run HTML Help Workshop with the" \
	      ".hhp project file in $(BUILDDIR)/htmlhelp."

qthelp:
	$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
	@echo
	@echo "Build finished; now you can run "qcollectiongenerator" with the" \
	      ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
	@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/hyperlink.qhcp"
	@echo "To view the help file:"
	@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/hyperlink.qhc"

applehelp:
	$(SPHINXBUILD) -b applehelp $(ALLSPHINXOPTS) $(BUILDDIR)/applehelp
	@echo
	@echo "Build finished. The help book is in $(BUILDDIR)/applehelp."
	@echo "N.B. You won't be able to view it unless you put it in" \
	      "~/Library/Documentation/Help or install it in your application" \
	      "bundle."

devhelp:
	$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
	@echo
	@echo "Build finished."
	@echo "To view the help file:"
	@echo "# mkdir -p $$HOME/.local/share/devhelp/hyperlink"
	@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/hyperlink"
	@echo "# devhelp"

epub:
	$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
	@echo
	@echo "Build finished. The epub file is in $(BUILDDIR)/epub."

latex:
	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
	@echo
	@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
	@echo "Run \`make' in that directory to run these through (pdf)latex" \
	      "(use \`make latexpdf' here to do that automatically)."

latexpdf:
	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
	@echo "Running LaTeX files through pdflatex..."
	$(MAKE) -C $(BUILDDIR)/latex all-pdf
	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."

latexpdfja:
	$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
	@echo "Running LaTeX files through platex and dvipdfmx..."
	$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
	@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."

text:
	$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
	@echo
	@echo "Build finished. The text files are in $(BUILDDIR)/text."

man:
	$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
	@echo
	@echo "Build finished. The manual pages are in $(BUILDDIR)/man."

texinfo:
	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
	@echo
	@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
	@echo "Run \`make' in that directory to run these through makeinfo" \
	      "(use \`make info' here to do that automatically)."

info:
	$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
	@echo "Running Texinfo files through makeinfo..."
	make -C $(BUILDDIR)/texinfo info
	@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."

gettext:
	$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
	@echo
	@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."

changes:
	$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
	@echo
	@echo "The overview file is in $(BUILDDIR)/changes."

linkcheck:
	$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
	@echo
	@echo "Link check complete; look for any errors in the above output " \
	      "or in $(BUILDDIR)/linkcheck/output.txt."

doctest:
	$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
	@echo "Testing of doctests in the sources finished, look at the " \
	      "results in $(BUILDDIR)/doctest/output.txt."

coverage:
	$(SPHINXBUILD) -b coverage $(ALLSPHINXOPTS) $(BUILDDIR)/coverage
	@echo "Testing of coverage in the sources finished, look at the " \
	      "results in $(BUILDDIR)/coverage/python.txt."

xml:
	$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
	@echo
	@echo "Build finished. The XML files are in $(BUILDDIR)/xml."

pseudoxml:
	$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
	@echo
	@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
{% extends "!page.html" %}
{% block menu %}
{{ super() }}
<iframe src="https://ghbtns.com/github-btn.html?user=python-hyper&repo=hyperlink&type=star&count=true&size=medium" frameborder="0" scrolling="0" width="160px" height="30px" style="margin-left: 23px; margin-top: 10px;"></iframe>
{% endblock %}
.. _hyperlink_api:

Hyperlink API
=============

.. automodule:: hyperlink._url

Creation
--------

Before you can work with URLs, you must create them. There are two
ways: from parts, with the :class:`~hyperlink.URL` constructor, and
from text, with :meth:`~hyperlink.URL.from_text`.

.. autoclass:: hyperlink.URL
.. automethod:: hyperlink.URL.from_text

Transformation
--------------

Once a URL is created, some of the most common tasks are to transform
it into other URLs and text.

.. automethod:: hyperlink.URL.to_text
.. automethod:: hyperlink.URL.to_uri
.. automethod:: hyperlink.URL.to_iri
.. automethod:: hyperlink.URL.replace

Navigation
----------

Go places with URLs. Simulate browser behavior and perform semantic
path operations.

.. automethod:: hyperlink.URL.click
.. automethod:: hyperlink.URL.sibling
.. automethod:: hyperlink.URL.child

Query Parameters
----------------

CRUD operations on the query string multimap.

.. automethod:: hyperlink.URL.get
.. automethod:: hyperlink.URL.add
.. automethod:: hyperlink.URL.set
.. automethod:: hyperlink.URL.remove

Attributes
----------

URLs have many parts, and URL objects have many attributes to represent them.

.. autoattribute:: hyperlink.URL.absolute
.. autoattribute:: hyperlink.URL.scheme
.. autoattribute:: hyperlink.URL.host
.. autoattribute:: hyperlink.URL.port
.. autoattribute:: hyperlink.URL.path
.. autoattribute:: hyperlink.URL.query
.. autoattribute:: hyperlink.URL.fragment
.. autoattribute:: hyperlink.URL.userinfo
.. autoattribute:: hyperlink.URL.user
.. autoattribute:: hyperlink.URL.rooted
.. autoattribute:: hyperlink.URL.family

Low-level functions
-------------------

A couple of notable helpers used by the :class:`~hyperlink.URL` type.

.. autoclass:: hyperlink.URLParseError
.. autofunction:: hyperlink.register_scheme
.. autofunction:: hyperlink.parse_host

.. TODO: run doctests in docs?
# -*- coding: utf-8 -*-
#
# hyperlink documentation build configuration file, created by
# sphinx-quickstart on Sat Mar 21 00:34:18 2015.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

import os
import sys
import sphinx
from pprint import pprint

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
CUR_PATH = os.path.dirname(os.path.abspath(__file__))
PROJECT_PATH = os.path.abspath(CUR_PATH + '/../')
PACKAGE_PATH = os.path.abspath(CUR_PATH + '/../hyperlink')
sys.path.insert(0, PROJECT_PATH)
sys.path.insert(0, PACKAGE_PATH)

pprint(os.environ)

# -- General configuration ------------------------------------------------

autosummary_generate = True

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
    'sphinx.ext.doctest',
    'sphinx.ext.intersphinx',
    'sphinx.ext.coverage',
    'sphinx.ext.viewcode',
]

# Read the Docs is version 1.2 as of writing
if sphinx.version_info[:2] < (1, 3):
    extensions.append('sphinxcontrib.napoleon')
else:
    extensions.append('sphinx.ext.napoleon')

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# source_suffix = ['.rst', '.md']
source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = u'hyperlink'
copyright = u'2017, Mahmoud Hashemi'
author = u'Mahmoud Hashemi'

version = '17.3'
release = '17.3.0'

if os.name != 'nt':
    today_fmt = '%B %d, %Y'

exclude_patterns = ['_build']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'

# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {'python': ('https://docs.python.org/2.7', None)}


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'

if on_rtd:
    html_theme = 'default'
else:  # only import and set the theme if we're building docs locally
    import sphinx_rtd_theme
    html_theme = 'sphinx_rtd_theme'
    html_theme_path = ['_themes', sphinx_rtd_theme.get_html_theme_path()]

html_theme_options = {'navigation_depth': 3,
                      'collapse_navigation': False}

# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = []

# TEMP: see https://github.com/rtfd/readthedocs.org/issues/1692
# Add RTD Theme Path.
#if 'html_theme_path' in globals():
#    html_theme_path.append('/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx')
#else:
#    html_theme_path = ['_themes', '/home/docs/checkouts/readthedocs.org/readthedocs/templates/sphinx']

# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
#html_title = None

# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None

# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None

# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
#html_extra_path = []

# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'

# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True

# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}

# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}

# If false, no module index is generated.
#html_domain_indices = True

# If false, no index is generated.
#html_use_index = True

# If true, the index is split into individual pages for each letter.
#html_split_index = False

# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True

# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True

# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True

# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''

# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None

# Language to be used for generating the HTML full-text search index.
# Sphinx supports the following languages:
#   'da', 'de', 'en', 'es', 'fi', 'fr', 'hu', 'it', 'ja'
#   'nl', 'no', 'pt', 'ro', 'ru', 'sv', 'tr'
#html_search_language = 'en'

# A dictionary with options for the search language support, empty by default.
# Now only 'ja' uses this config value
#html_search_options = {'type': 'default'}

# The name of a javascript file (relative to the configuration directory) that
# implements a search results scorer. If empty, the default will be used.
#html_search_scorer = 'scorer.js'

# Output file base name for HTML help builder.
htmlhelp_basename = 'hyperlinkdoc'

# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #'preamble': '',

    # Latex figure (float) alignment
    #'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'hyperlink.tex', u'hyperlink Documentation',
     u'Mahmoud Hashemi', 'manual'),
]

# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None

# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False

# If true, show page references after internal links.
#latex_show_pagerefs = False

# If true, show URL addresses after external links.
#latex_show_urls = False

# Documents to append as an appendix to all manuals.
#latex_appendices = []

# If false, no module index is generated.
#latex_domain_indices = True


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'hyperlink', u'hyperlink Documentation',
     [author], 1)
]

# If true, show URL addresses after external links.
#man_show_urls = False


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'hyperlink', u'hyperlink Documentation',
     author, 'hyperlink', 'One line description of project.',
     'Miscellaneous'),
]

# Documents to append as an appendix to all manuals.
#texinfo_appendices = []

# If false, no module index is generated.
#texinfo_domain_indices = True

# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'

# If true, do not generate a @detailmenu in the "Top" node's menu.
#texinfo_no_detailmenu = False
0 | Hyperlink Design | |
1 | ================ | |
2 | ||
3 | The URL is a nuanced format with a long history. Suitably, a lot of | |
4 | work has gone into translating the standards, `RFC 3986`_ and `RFC | |
5 | 3987`_, into a Pythonic interface. Hyperlink's design strikes a unique | |
6 | balance of correctness and usability. | |
7 | ||
8 | .. _uris_and_iris: | |
9 | ||
10 | A Tale of Two Representations | |
11 | ----------------------------- | |
12 | ||
13 | The URL is a powerful construct, designed to be used by both humans | |
14 | and computers. | |
15 | ||
16 | This dual purpose has resulted in two canonical representations: the | |
17 | URI and the IRI. | |
18 | ||
19 | Even though the W3C themselves have `recognized the confusion`_ this can | |
20 | cause, Hyperlink's URL makes the distinction quite natural. Simply: | |
21 | ||
22 | * **URI**: Fully-encoded, ASCII-only, suitable for network transfer | |
23 | * **IRI**: Fully-decoded, Unicode-friendly, suitable for display (e.g., in a browser bar) | |
24 | ||
25 | We can use Hyperlink to very easily demonstrate the difference:: | |
26 | ||
27 | >>> url = URL.from_text('http://example.com/café') | |
28 | >>> url.to_uri().to_text() | |
29 | u'http://example.com/caf%C3%A9' | |
30 | ||
31 | We construct a URL from text containing Unicode (``é``), then | |
32 | transform it using :meth:`~URL.to_uri()`. This results in ASCII-only | |
33 | percent-encoding familiar to all web developers, and a common | |
34 | characteristic of URIs. | |
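
For comparison, the same percent-encoding can be reproduced with the standard library (a Python 3 sketch; ``quote()`` encodes non-ASCII text as UTF-8 by default):

```python
from urllib.parse import quote

# Percent-encode the non-ASCII path segment, much as to_uri() does
# above; quote() encodes text to UTF-8 bytes before escaping.
print(quote('café'))  # caf%C3%A9
```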
35 | ||
36 | Still, Hyperlink's distinction between URIs and IRIs is pragmatic, | |
37 | limited only to output. Input can contain *any mix* of | |
38 | percent-encoding and Unicode, without issue: | |
39 | ||
40 | >>> url = URL.from_text('http://example.com/caf%C3%A9/au láit') | |
41 | >>> print(url.to_iri().to_text()) | |
42 | http://example.com/café/au láit | |
43 | >>> print(url.to_uri().to_text()) | |
44 | http://example.com/caf%C3%A9/au%20l%C3%A1it | |
45 | ||
46 | Note that even when a URI and IRI point to the same resource, they | |
47 | will often be different URLs: | |
48 | ||
49 | >>> url.to_uri() == url.to_iri() | |
50 | False | |
51 | ||
52 | And with that caveat out of the way, you're qualified to correct other | |
53 | people (and their code) on the nuances of URI vs IRI. | |
54 | ||
55 | .. _recognized the confusion: https://www.w3.org/TR/uri-clarification/ | |
56 | ||
57 | Immutability | |
58 | ------------ | |
59 | ||
60 | Hyperlink's URL is notable for being an `immutable`_ representation. Once | |
61 | constructed, instances are not changed. Methods like | |
62 | :meth:`~URL.click()`, :meth:`~URL.set()`, and :meth:`~URL.replace()` | |
63 | all return new URL objects. This makes URLs usable in sets and as | |
64 | dictionary keys. | |
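
The same value-type pattern can be sketched with a plain ``namedtuple`` (a toy stand-in for illustration, not hyperlink's actual implementation), showing why immutability makes hashing safe:

```python
from collections import namedtuple

# A toy immutable link type: like URL, "modifying" it returns a new
# instance and leaves the original untouched.
Link = namedtuple('Link', ['scheme', 'host', 'path'])

a = Link('http', 'example.com', '/')
b = a._replace(scheme='https')   # new object; `a` is unchanged
assert a.scheme == 'http' and b.scheme == 'https'

index = {a: 'insecure', b: 'secure'}  # hashable, so dict keys work
```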
65 | ||
66 | .. _immutable: https://docs.python.org/2/glossary.html#term-immutable | |
67 | .. _multidict: https://en.wikipedia.org/wiki/Multimap | |
68 | .. _query string: https://en.wikipedia.org/wiki/Query_string | |
69 | .. _GET parameters: http://php.net/manual/en/reserved.variables.get.php | |
70 | .. _twisted.python.url.URL: https://twistedmatrix.com/documents/current/api/twisted.python.url.URL.html | |
71 | .. _boltons.urlutils: http://boltons.readthedocs.io/en/latest/urlutils.html | |
72 | .. _uri clarification: https://www.w3.org/TR/uri-clarification/ | |
73 | .. _BNF grammar: https://tools.ietf.org/html/rfc3986#appendix-A | |
74 | ||
75 | ||
76 | .. _RFC 3986: https://tools.ietf.org/html/rfc3986 | |
77 | .. _RFC 3987: https://tools.ietf.org/html/rfc3987 | |
78 | .. _section 5.4: https://tools.ietf.org/html/rfc3986#section-5.4 | |
79 | .. _section 3.4: https://tools.ietf.org/html/rfc3986#section-3.4 | |
80 | .. _section 5.2.4: https://tools.ietf.org/html/rfc3986#section-5.2.4 | |
81 | .. _section 2.2: https://tools.ietf.org/html/rfc3986#section-2.2 | |
82 | .. _section 2.3: https://tools.ietf.org/html/rfc3986#section-2.3 | |
83 | .. _section 3.2.1: https://tools.ietf.org/html/rfc3986#section-3.2.1 | |
84 | ||
85 | ||
86 | Query parameters | |
87 | ---------------- | |
88 | ||
89 | One of the URL format's most useful features is the mapping formed | |
90 | by the query parameters, sometimes called "query arguments" or "GET | |
91 | parameters". Regardless of what you call them, they are encoded in | |
92 | the query string portion of the URL, and they are very powerful. | |
93 | ||
94 | Query parameters are actually a type of "multidict", where a given key | |
95 | can have multiple values. This is why the :meth:`~URL.get()` method | |
96 | always returns a list of values. Keys can also have no value, which is | |
97 | conventionally interpreted as a truthy flag. | |
98 | ||
99 | >>> url = URL.from_text('http://example.com/?a=b&c') | |
100 | >>> url.get(u'a') | |
101 | ['b'] | |
102 | >>> url.get(u'c') | |
103 | [None] | |
104 | >>> url.get(u'missing')  # returns an empty list | |
105 | [] | |
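
For comparison, the standard library exposes the same multidict structure, though it represents valueless keys as empty strings rather than ``None`` (a Python 3 sketch):

```python
from urllib.parse import parse_qs

# Each key maps to a list of values; bare keys survive only with
# keep_blank_values=True, and come back as '' instead of None.
params = parse_qs('a=b&c&x=x&x=y', keep_blank_values=True)
print(params)  # {'a': ['b'], 'c': [''], 'x': ['x', 'y']}
```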
106 | ||
107 | ||
108 | Values can be modified and added using :meth:`~URL.set()` and | |
109 | :meth:`~URL.add()`. | |
110 | ||
111 | >>> url = url.add(u'x', u'x') | |
112 | >>> url = url.add(u'x', u'y') | |
113 | >>> url.to_text() | |
114 | u'http://example.com/?a=b&c&x=x&x=y' | |
115 | >>> url = url.set(u'x', u'z') | |
116 | >>> url.to_text() | |
117 | u'http://example.com/?a=b&c&x=z' | |
118 | ||
119 | ||
120 | Values can be unset with :meth:`~URL.remove()`. | |
121 | ||
122 | >>> url = url.remove(u'a') | |
123 | >>> url = url.remove(u'c') | |
124 | >>> url.to_text() | |
125 | u'http://example.com/?x=z' | |
126 | ||
127 | Note how all modifying methods return copies of the URL and do not | |
128 | mutate the URL in place, much like methods on strings. | |
129 | ||
130 | Origins and backwards-compatibility | |
131 | ----------------------------------- | |
132 | ||
133 | Hyperlink's URL is descended directly from `twisted.python.url.URL`_, | |
134 | in all but the literal code-inheritance sense. While a lot of | |
135 | functionality has been incorporated from `boltons.urlutils`_, extra | |
136 | care has been taken to maintain backwards-compatibility for legacy | |
137 | APIs, making Hyperlink's URL a drop-in replacement for Twisted's URL type. | |
138 | ||
139 | If you are porting a Twisted project to use Hyperlink's URL, and | |
140 | encounter any sort of incompatibility, please do not hesitate to `file | |
141 | an issue`_. | |
142 | ||
143 | .. _file an issue: https://github.com/python-hyper/hyperlink/issues |
0 | FAQ | |
1 | === | |
2 | ||
3 | There were bound to be questions. | |
4 | ||
5 | .. contents:: | |
6 | :local: | |
7 | ||
8 | Why not just use text? | |
9 | ---------------------- | |
10 | ||
11 | URLs were designed as a text format, so, apart from the principle of | |
12 | structuring structured data, why use URL objects? | |
13 | ||
14 | There are two major advantages of using :class:`~hyperlink.URL` over | |
15 | representing URLs as strings. The first is that it's really easy to | |
16 | evaluate a relative hyperlink, for example, when crawling documents, | |
17 | to figure out what is linked:: | |
18 | ||
19 | >>> URL.from_text(u'https://example.com/base/uri/').click(u"/absolute") | |
20 | URL.from_text(u'https://example.com/absolute') | |
21 | >>> URL.from_text(u'https://example.com/base/uri/').click(u"rel/path") | |
22 | URL.from_text(u'https://example.com/base/uri/rel/path') | |
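
The standard library's ``urljoin`` performs the same RFC 3986 relative resolution, for comparison:

```python
from urllib.parse import urljoin

base = 'https://example.com/base/uri/'
# Absolute-path references replace the whole path...
print(urljoin(base, '/absolute'))  # https://example.com/absolute
# ...while relative ones resolve against the base directory.
print(urljoin(base, 'rel/path'))   # https://example.com/base/uri/rel/path
```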
23 | ||
24 | The other is that URLs have two normalizations. One representation is | |
25 | suitable for humans to read, because it can represent data from many | |
26 | character sets - this is the Internationalized, or IRI, normalization. | |
27 | The other is the older, US-ASCII-only representation, which is | |
28 | necessary for most contexts where you would need to put a URI. You | |
29 | can convert *between* these representations according to certain | |
30 | rules. :class:`~hyperlink.URL` exposes these conversions as methods:: | |
31 | ||
32 | >>> URL.from_text(u"https://→example.com/foo⇧bar/").to_uri() | |
33 | URL.from_text(u'https://xn--example-dk9c.com/foo%E2%87%A7bar/') | |
34 | >>> URL.from_text(u'https://xn--example-dk9c.com/foo%E2%87%A7bar/').to_iri() | |
35 | URL.from_text(u'https://\\u2192example.com/foo\\u21e7bar/') | |
36 | ||
37 | For more info, see A Tale of Two Representations, above. | |
38 | ||
39 | How does Hyperlink compare to other libraries? | |
40 | ---------------------------------------------- | |
41 | ||
42 | Hyperlink certainly isn't the first library to provide a Python model | |
43 | for URLs. It just happens to be among the best. | |
44 | ||
45 | urlparse: Built into the standard library (merged into urllib for | |
46 | Python 3). No URL type; requires the user to juggle a bunch of | |
47 | strings. Its overly simple approach makes it easy to make mistakes. | |
48 | ||
49 | boltons.urlutils: Shares some underlying implementation. Two key | |
50 | differences. First, the boltons URL is mutable, intended to work like | |
51 | a string factory for URL text. Second, the boltons URL has an advanced | |
52 | query parameter mapping type. Complete implementation in a single | |
53 | file. | |
54 | ||
55 | furl: Not a single URL type, but types for many parts of the | |
56 | URL. Similar approach to boltons for query parameters. Poor netloc | |
57 | handling (support for non-network schemes like mailto). Unlicensed. | |
58 | ||
59 | purl: Another immutable implementation. Method-heavy API. | |
60 | ||
61 | rfc3986: Very heavily focused on various types of validation. Large | |
62 | for a URL library, if that matters to you. Exclusively supports URIs, | |
63 | `lacking IRI support`_ at the time of writing. | |
64 | ||
65 | In reality, any of the third-party libraries above does a better job | |
66 | than the standard library, and than much of the hastily thrown-together | |
67 | code found in a corner of a util.py deep in a project. URLs are easy | |
68 | to mess up; make sure you use a tested implementation. | |
69 | ||
70 | .. _lacking IRI support: https://github.com/sigmavirus24/rfc3986/issues/23 | |
71 | ||
72 | Are URLs really a big deal in 201X? | |
73 | ----------------------------------- | |
74 | ||
75 | Hyperlink's first release, in 2017, comes somewhere between 23 and 30 | |
76 | years after URLs were already in use. Is the URL really still that big | |
77 | of a deal? | |
78 | ||
79 | Look, buddy, I don't know how you got this document, but I'm pretty | |
80 | sure you (and your computer) used one if not many URLs to get | |
81 | here. URLs are only getting more relevant. Buy stock in URLs. | |
82 | ||
83 | And if you're worried that URLs are just another technology with an | |
84 | obsoletion date planned in advance, I'll direct your attention to the | |
85 | ``IPvFuture`` rule in the `BNF grammar`_. If it has plans to outlast | |
86 | IPv6, the URL will probably outlast you and me, too. | |
87 | ||
88 | .. _BNF grammar: https://tools.ietf.org/html/rfc3986#appendix-A |
0 | .. hyperlink documentation master file, created on Mon Apr 10 00:34:18 2017. | |
1 | hyperlink | |
2 | ========= | |
3 | ||
4 | *Cool URLs that don't change.* | |
5 | ||
6 | |release| |calver| | |
7 | ||
8 | **Hyperlink** provides a pure-Python implementation of immutable | |
9 | URLs. Based on `RFC 3986`_ and `RFC 3987`_, the Hyperlink URL balances | |
10 | simplicity and correctness for both :ref:`URIs and IRIs <uris_and_iris>`. | |
11 | ||
12 | Hyperlink is tested against Python 2.7, 3.4, 3.5, and PyPy. | |
13 | ||
14 | For an introduction to the hyperlink library, its background, and URLs | |
15 | in general, see `this talk from PyConWeb 2017`_ (and `the accompanying | |
16 | slides`_). | |
17 | ||
18 | .. _RFC 3986: https://tools.ietf.org/html/rfc3986 | |
19 | .. _RFC 3987: https://tools.ietf.org/html/rfc3987 | |
20 | .. _this talk from PyConWeb 2017: https://www.youtube.com/watch?v=EIkmADO-r10 | |
21 | .. _the accompanying slides: https://speakerdeck.com/mhashemi/urls-in-plain-view | |
22 | .. |release| image:: https://img.shields.io/pypi/v/hyperlink.svg | |
23 | :target: https://pypi.python.org/pypi/hyperlink | |
24 | ||
25 | .. |calver| image:: https://img.shields.io/badge/calver-YY.MINOR.MICRO-22bfda.svg | |
26 | :target: http://calver.org | |
27 | ||
28 | ||
29 | Installation and Integration | |
30 | ---------------------------- | |
31 | ||
32 | Hyperlink is a pure-Python package and only depends on the standard | |
33 | library. The easiest way to install is with pip:: | |
34 | ||
35 | pip install hyperlink | |
36 | ||
37 | Then, URLs are just an import away:: | |
38 | ||
39 | from hyperlink import URL | |
40 | ||
41 | url = URL.from_text('http://github.com/mahmoud/hyperlink?utm_source=readthedocs') | |
42 | ||
43 | better_url = url.replace(scheme='https') | |
44 | user_url = better_url.click('.') | |
45 | ||
46 | print(user_url.to_text()) | |
47 | # prints: https://github.com/mahmoud/ | |
48 | ||
49 | print(better_url.get('utm_source')[0]) | |
50 | # prints: readthedocs | |
51 | ||
52 | See :ref:`the API docs <hyperlink_api>` for more usage examples. | |
53 | ||
54 | Gaps | |
55 | ---- | |
56 | ||
57 | Found something missing in hyperlink? `Pull Requests`_ and `Issues`_ welcome! | |
58 | ||
59 | .. _Pull Requests: https://github.com/mahmoud/python-hyper/pulls | |
60 | .. _Issues: https://github.com/mahmoud/python-hyper/issues | |
61 | ||
62 | Section listing | |
63 | --------------- | |
64 | ||
65 | .. toctree:: | |
66 | :maxdepth: 2 | |
67 | ||
68 | design | |
69 | api | |
70 | faq |
0 | @ECHO OFF | |
1 | ||
2 | REM Command file for Sphinx documentation | |
3 | ||
4 | if "%SPHINXBUILD%" == "" ( | |
5 | set SPHINXBUILD=sphinx-build | |
6 | ) | |
7 | set BUILDDIR=_build | |
8 | set ALLSPHINXOPTS=-d %BUILDDIR%/doctrees %SPHINXOPTS% . | |
9 | set I18NSPHINXOPTS=%SPHINXOPTS% . | |
10 | if NOT "%PAPER%" == "" ( | |
11 | set ALLSPHINXOPTS=-D latex_paper_size=%PAPER% %ALLSPHINXOPTS% | |
12 | set I18NSPHINXOPTS=-D latex_paper_size=%PAPER% %I18NSPHINXOPTS% | |
13 | ) | |
14 | ||
15 | if "%1" == "" goto help | |
16 | ||
17 | if "%1" == "help" ( | |
18 | :help | |
19 | echo.Please use `make ^<target^>` where ^<target^> is one of | |
20 | echo. html to make standalone HTML files | |
21 | echo. dirhtml to make HTML files named index.html in directories | |
22 | echo. singlehtml to make a single large HTML file | |
23 | echo. pickle to make pickle files | |
24 | echo. json to make JSON files | |
25 | echo. htmlhelp to make HTML files and a HTML help project | |
26 | echo. qthelp to make HTML files and a qthelp project | |
27 | echo. devhelp to make HTML files and a Devhelp project | |
28 | echo. epub to make an epub | |
29 | echo. latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter | |
30 | echo. text to make text files | |
31 | echo. man to make manual pages | |
32 | echo. texinfo to make Texinfo files | |
33 | echo. gettext to make PO message catalogs | |
34 | echo. changes to make an overview over all changed/added/deprecated items | |
35 | echo. xml to make Docutils-native XML files | |
36 | echo. pseudoxml to make pseudoxml-XML files for display purposes | |
37 | echo. linkcheck to check all external links for integrity | |
38 | echo. doctest to run all doctests embedded in the documentation if enabled | |
39 | echo. coverage to run coverage check of the documentation if enabled | |
40 | goto end | |
41 | ) | |
42 | ||
43 | if "%1" == "clean" ( | |
44 | for /d %%i in (%BUILDDIR%\*) do rmdir /q /s %%i | |
45 | del /q /s %BUILDDIR%\* | |
46 | goto end | |
47 | ) | |
48 | ||
49 | ||
50 | REM Check if sphinx-build is available and fallback to Python version if any | |
51 | %SPHINXBUILD% 2> nul | |
52 | if errorlevel 9009 goto sphinx_python | |
53 | goto sphinx_ok | |
54 | ||
55 | :sphinx_python | |
56 | ||
57 | set SPHINXBUILD=python -m sphinx.__init__ | |
58 | %SPHINXBUILD% 2> nul | |
59 | if errorlevel 9009 ( | |
60 | echo. | |
61 | echo.The 'sphinx-build' command was not found. Make sure you have Sphinx | |
62 | echo.installed, then set the SPHINXBUILD environment variable to point | |
63 | echo.to the full path of the 'sphinx-build' executable. Alternatively you | |
64 | echo.may add the Sphinx directory to PATH. | |
65 | echo. | |
66 | echo.If you don't have Sphinx installed, grab it from | |
67 | echo.http://sphinx-doc.org/ | |
68 | exit /b 1 | |
69 | ) | |
70 | ||
71 | :sphinx_ok | |
72 | ||
73 | ||
74 | if "%1" == "html" ( | |
75 | %SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html | |
76 | if errorlevel 1 exit /b 1 | |
77 | echo. | |
78 | echo.Build finished. The HTML pages are in %BUILDDIR%/html. | |
79 | goto end | |
80 | ) | |
81 | ||
82 | if "%1" == "dirhtml" ( | |
83 | %SPHINXBUILD% -b dirhtml %ALLSPHINXOPTS% %BUILDDIR%/dirhtml | |
84 | if errorlevel 1 exit /b 1 | |
85 | echo. | |
86 | echo.Build finished. The HTML pages are in %BUILDDIR%/dirhtml. | |
87 | goto end | |
88 | ) | |
89 | ||
90 | if "%1" == "singlehtml" ( | |
91 | %SPHINXBUILD% -b singlehtml %ALLSPHINXOPTS% %BUILDDIR%/singlehtml | |
92 | if errorlevel 1 exit /b 1 | |
93 | echo. | |
94 | echo.Build finished. The HTML pages are in %BUILDDIR%/singlehtml. | |
95 | goto end | |
96 | ) | |
97 | ||
98 | if "%1" == "pickle" ( | |
99 | %SPHINXBUILD% -b pickle %ALLSPHINXOPTS% %BUILDDIR%/pickle | |
100 | if errorlevel 1 exit /b 1 | |
101 | echo. | |
102 | echo.Build finished; now you can process the pickle files. | |
103 | goto end | |
104 | ) | |
105 | ||
106 | if "%1" == "json" ( | |
107 | %SPHINXBUILD% -b json %ALLSPHINXOPTS% %BUILDDIR%/json | |
108 | if errorlevel 1 exit /b 1 | |
109 | echo. | |
110 | echo.Build finished; now you can process the JSON files. | |
111 | goto end | |
112 | ) | |
113 | ||
114 | if "%1" == "htmlhelp" ( | |
115 | %SPHINXBUILD% -b htmlhelp %ALLSPHINXOPTS% %BUILDDIR%/htmlhelp | |
116 | if errorlevel 1 exit /b 1 | |
117 | echo. | |
118 | echo.Build finished; now you can run HTML Help Workshop with the ^ | |
119 | .hhp project file in %BUILDDIR%/htmlhelp. | |
120 | goto end | |
121 | ) | |
122 | ||
123 | if "%1" == "qthelp" ( | |
124 | %SPHINXBUILD% -b qthelp %ALLSPHINXOPTS% %BUILDDIR%/qthelp | |
125 | if errorlevel 1 exit /b 1 | |
126 | echo. | |
127 | echo.Build finished; now you can run "qcollectiongenerator" with the ^ | |
128 | .qhcp project file in %BUILDDIR%/qthelp, like this: | |
129 | echo.^> qcollectiongenerator %BUILDDIR%\qthelp\hyperlink.qhcp | |
130 | echo.To view the help file: | |
131 | echo.^> assistant -collectionFile %BUILDDIR%\qthelp\hyperlink.ghc | |
132 | goto end | |
133 | ) | |
134 | ||
135 | if "%1" == "devhelp" ( | |
136 | %SPHINXBUILD% -b devhelp %ALLSPHINXOPTS% %BUILDDIR%/devhelp | |
137 | if errorlevel 1 exit /b 1 | |
138 | echo. | |
139 | echo.Build finished. | |
140 | goto end | |
141 | ) | |
142 | ||
143 | if "%1" == "epub" ( | |
144 | %SPHINXBUILD% -b epub %ALLSPHINXOPTS% %BUILDDIR%/epub | |
145 | if errorlevel 1 exit /b 1 | |
146 | echo. | |
147 | echo.Build finished. The epub file is in %BUILDDIR%/epub. | |
148 | goto end | |
149 | ) | |
150 | ||
151 | if "%1" == "latex" ( | |
152 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex | |
153 | if errorlevel 1 exit /b 1 | |
154 | echo. | |
155 | echo.Build finished; the LaTeX files are in %BUILDDIR%/latex. | |
156 | goto end | |
157 | ) | |
158 | ||
159 | if "%1" == "latexpdf" ( | |
160 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex | |
161 | cd %BUILDDIR%/latex | |
162 | make all-pdf | |
163 | cd %~dp0 | |
164 | echo. | |
165 | echo.Build finished; the PDF files are in %BUILDDIR%/latex. | |
166 | goto end | |
167 | ) | |
168 | ||
169 | if "%1" == "latexpdfja" ( | |
170 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex | |
171 | cd %BUILDDIR%/latex | |
172 | make all-pdf-ja | |
173 | cd %~dp0 | |
174 | echo. | |
175 | echo.Build finished; the PDF files are in %BUILDDIR%/latex. | |
176 | goto end | |
177 | ) | |
178 | ||
179 | if "%1" == "text" ( | |
180 | %SPHINXBUILD% -b text %ALLSPHINXOPTS% %BUILDDIR%/text | |
181 | if errorlevel 1 exit /b 1 | |
182 | echo. | |
183 | echo.Build finished. The text files are in %BUILDDIR%/text. | |
184 | goto end | |
185 | ) | |
186 | ||
187 | if "%1" == "man" ( | |
188 | %SPHINXBUILD% -b man %ALLSPHINXOPTS% %BUILDDIR%/man | |
189 | if errorlevel 1 exit /b 1 | |
190 | echo. | |
191 | echo.Build finished. The manual pages are in %BUILDDIR%/man. | |
192 | goto end | |
193 | ) | |
194 | ||
195 | if "%1" == "texinfo" ( | |
196 | %SPHINXBUILD% -b texinfo %ALLSPHINXOPTS% %BUILDDIR%/texinfo | |
197 | if errorlevel 1 exit /b 1 | |
198 | echo. | |
199 | echo.Build finished. The Texinfo files are in %BUILDDIR%/texinfo. | |
200 | goto end | |
201 | ) | |
202 | ||
203 | if "%1" == "gettext" ( | |
204 | %SPHINXBUILD% -b gettext %I18NSPHINXOPTS% %BUILDDIR%/locale | |
205 | if errorlevel 1 exit /b 1 | |
206 | echo. | |
207 | echo.Build finished. The message catalogs are in %BUILDDIR%/locale. | |
208 | goto end | |
209 | ) | |
210 | ||
211 | if "%1" == "changes" ( | |
212 | %SPHINXBUILD% -b changes %ALLSPHINXOPTS% %BUILDDIR%/changes | |
213 | if errorlevel 1 exit /b 1 | |
214 | echo. | |
215 | echo.The overview file is in %BUILDDIR%/changes. | |
216 | goto end | |
217 | ) | |
218 | ||
219 | if "%1" == "linkcheck" ( | |
220 | %SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck | |
221 | if errorlevel 1 exit /b 1 | |
222 | echo. | |
223 | echo.Link check complete; look for any errors in the above output ^ | |
224 | or in %BUILDDIR%/linkcheck/output.txt. | |
225 | goto end | |
226 | ) | |
227 | ||
228 | if "%1" == "doctest" ( | |
229 | %SPHINXBUILD% -b doctest %ALLSPHINXOPTS% %BUILDDIR%/doctest | |
230 | if errorlevel 1 exit /b 1 | |
231 | echo. | |
232 | echo.Testing of doctests in the sources finished, look at the ^ | |
233 | results in %BUILDDIR%/doctest/output.txt. | |
234 | goto end | |
235 | ) | |
236 | ||
237 | if "%1" == "coverage" ( | |
238 | %SPHINXBUILD% -b coverage %ALLSPHINXOPTS% %BUILDDIR%/coverage | |
239 | if errorlevel 1 exit /b 1 | |
240 | echo. | |
241 | echo.Testing of coverage in the sources finished, look at the ^ | |
242 | results in %BUILDDIR%/coverage/python.txt. | |
243 | goto end | |
244 | ) | |
245 | ||
246 | if "%1" == "xml" ( | |
247 | %SPHINXBUILD% -b xml %ALLSPHINXOPTS% %BUILDDIR%/xml | |
248 | if errorlevel 1 exit /b 1 | |
249 | echo. | |
250 | echo.Build finished. The XML files are in %BUILDDIR%/xml. | |
251 | goto end | |
252 | ) | |
253 | ||
254 | if "%1" == "pseudoxml" ( | |
255 | %SPHINXBUILD% -b pseudoxml %ALLSPHINXOPTS% %BUILDDIR%/pseudoxml | |
256 | if errorlevel 1 exit /b 1 | |
257 | echo. | |
258 | echo.Build finished. The pseudo-XML files are in %BUILDDIR%/pseudoxml. | |
259 | goto end | |
260 | ) | |
261 | ||
262 | :end |
0 | ||
1 | from ._url import URL, URLParseError, register_scheme, parse_host | |
2 | ||
3 | __all__ = [ | |
4 | "URL", | |
5 | "URLParseError", | |
6 | "register_scheme", | |
7 | "parse_host" | |
8 | ] |
0 | # -*- coding: utf-8 -*- | |
1 | u"""Hyperlink provides Pythonic URL parsing, construction, and rendering. | |
2 | ||
3 | Usage is straightforward:: | |
4 | ||
5 | >>> from hyperlink import URL | |
6 | >>> url = URL.from_text(u'http://github.com/mahmoud/hyperlink?utm_source=docs') | |
7 | >>> url.host | |
8 | u'github.com' | |
9 | >>> secure_url = url.replace(scheme=u'https') | |
10 | >>> secure_url.get('utm_source')[0] | |
11 | u'docs' | |
12 | ||
13 | As seen here, the API revolves around the lightweight and immutable | |
14 | :class:`URL` type, documented below. | |
15 | """ | |
16 | ||
17 | import re | |
18 | import string | |
19 | import socket | |
20 | from unicodedata import normalize | |
21 | try: | |
22 | from socket import inet_pton | |
23 | except ImportError: | |
24 | # based on https://gist.github.com/nnemkin/4966028 | |
25 | # this code only applies on Windows Python 2.7 | |
26 | import ctypes | |
27 | ||
28 | class _sockaddr(ctypes.Structure): | |
29 | _fields_ = [("sa_family", ctypes.c_short), | |
30 | ("__pad1", ctypes.c_ushort), | |
31 | ("ipv4_addr", ctypes.c_byte * 4), | |
32 | ("ipv6_addr", ctypes.c_byte * 16), | |
33 | ("__pad2", ctypes.c_ulong)] | |
34 | ||
35 | WSAStringToAddressA = ctypes.windll.ws2_32.WSAStringToAddressA | |
36 | WSAAddressToStringA = ctypes.windll.ws2_32.WSAAddressToStringA | |
37 | ||
38 | def inet_pton(address_family, ip_string): | |
39 | addr = _sockaddr() | |
40 | ip_string = ip_string.encode('ascii') | |
41 | addr.sa_family = address_family | |
42 | addr_size = ctypes.c_int(ctypes.sizeof(addr)) | |
43 | ||
44 | if WSAStringToAddressA(ip_string, address_family, None, ctypes.byref(addr), ctypes.byref(addr_size)) != 0: | |
45 | raise socket.error(ctypes.FormatError()) | |
46 | ||
47 | if address_family == socket.AF_INET: | |
48 | return ctypes.string_at(addr.ipv4_addr, 4) | |
49 | if address_family == socket.AF_INET6: | |
50 | return ctypes.string_at(addr.ipv6_addr, 16) | |
51 | raise socket.error('unknown address family') | |
52 | ||
53 | ||
54 | unicode = type(u'') | |
55 | try: | |
56 | unichr | |
57 | except NameError: | |
58 | unichr = chr # py3 | |
59 | NoneType = type(None) | |
60 | ||
61 | ||
62 | # from boltons.typeutils | |
63 | def make_sentinel(name='_MISSING', var_name=None): | |
64 | """Creates and returns a new **instance** of a new class, suitable for | |
65 | usage as a "sentinel", a kind of singleton often used to indicate | |
66 | a value is missing when ``None`` is a valid input. | |
67 | ||
68 | Args: | |
69 | name (str): Name of the Sentinel | |
70 | var_name (str): Set this name to the name of the variable in | |
71 | its respective module enable pickleability. | |
72 | ||
73 | >>> make_sentinel(var_name='_MISSING') | |
74 | _MISSING | |
75 | ||
76 | The most common use cases here in boltons are as default values | |
77 | for optional function arguments, partly because of its | |
78 | less-confusing appearance in automatically generated | |
79 | documentation. Sentinels also function well as placeholders in queues | |
80 | and linked lists. | |
81 | ||
82 | .. note:: | |
83 | ||
84 | By design, additional calls to ``make_sentinel`` with the same | |
85 | values will not produce equivalent objects. | |
86 | ||
87 | >>> make_sentinel('TEST') == make_sentinel('TEST') | |
88 | False | |
89 | >>> type(make_sentinel('TEST')) == type(make_sentinel('TEST')) | |
90 | False | |
91 | ||
92 | """ | |
93 | class Sentinel(object): | |
94 | def __init__(self): | |
95 | self.name = name | |
96 | self.var_name = var_name | |
97 | ||
98 | def __repr__(self): | |
99 | if self.var_name: | |
100 | return self.var_name | |
101 | return '%s(%r)' % (self.__class__.__name__, self.name) | |
102 | if var_name: | |
103 | def __reduce__(self): | |
104 | return self.var_name | |
105 | ||
106 | def __nonzero__(self): | |
107 | return False | |
108 | ||
109 | __bool__ = __nonzero__ | |
110 | ||
111 | return Sentinel() | |
112 | ||
113 | ||
114 | _unspecified = _UNSET = make_sentinel('_UNSET') | |
115 | ||
116 | ||
117 | # RFC 3986 Section 2.3, Unreserved URI Characters | |
118 | # https://tools.ietf.org/html/rfc3986#section-2.3 | |
119 | _UNRESERVED_CHARS = frozenset('~-._0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' | |
120 | 'abcdefghijklmnopqrstuvwxyz') | |
121 | ||
122 | ||
123 | # URL parsing regex (based on RFC 3986 Appendix B, with modifications) | |
124 | _URL_RE = re.compile(r'^((?P<scheme>[^:/?#]+):)?' | |
125 | r'((?P<_netloc_sep>//)' | |
126 | r'(?P<authority>[^/?#]*))?' | |
127 | r'(?P<path>[^?#]*)' | |
128 | r'(\?(?P<query>[^#]*))?' | |
129 | r'(#(?P<fragment>.*))?$') | |
130 | # '-' goes last in the class to avoid creating a character range | |
131 | _SCHEME_RE = re.compile(r'^[a-zA-Z0-9+.-]*$') | |
131 | _AUTHORITY_RE = re.compile(r'^(?:(?P<userinfo>[^@/?#]*)@)?' | |
132 | r'(?P<host>' | |
133 | r'(?:\[(?P<ipv6_host>[^[\]/?#]*)\])' | |
134 | r'|(?P<plain_host>[^:/?#[\]]*)' | |
135 | r'|(?P<bad_host>.*?))?' | |
136 | r'(?::(?P<port>.*))?$') | |
137 | ||
138 | ||
139 | _HEX_CHAR_MAP = dict([((a + b).encode('ascii'), | |
140 | unichr(int(a + b, 16)).encode('charmap')) | |
141 | for a in string.hexdigits for b in string.hexdigits]) | |
142 | _ASCII_RE = re.compile('([\x00-\x7f]+)') | |
143 | ||
144 | # RFC 3986 section 2.2, Reserved Characters | |
145 | # https://tools.ietf.org/html/rfc3986#section-2.2 | |
146 | _GEN_DELIMS = frozenset(u':/?#[]@') | |
147 | _SUB_DELIMS = frozenset(u"!$&'()*+,;=") | |
148 | _ALL_DELIMS = _GEN_DELIMS | _SUB_DELIMS | |
149 | ||
150 | _USERINFO_SAFE = _UNRESERVED_CHARS | _SUB_DELIMS | |
151 | _USERINFO_DELIMS = _ALL_DELIMS - _USERINFO_SAFE | |
152 | _PATH_SAFE = _UNRESERVED_CHARS | _SUB_DELIMS | set(u':@%') | |
153 | _PATH_DELIMS = _ALL_DELIMS - _PATH_SAFE | |
154 | _SCHEMELESS_PATH_SAFE = _PATH_SAFE - set(':') | |
155 | _SCHEMELESS_PATH_DELIMS = _ALL_DELIMS - _SCHEMELESS_PATH_SAFE | |
156 | _FRAGMENT_SAFE = _UNRESERVED_CHARS | _PATH_SAFE | set(u'/?') | |
157 | _FRAGMENT_DELIMS = _ALL_DELIMS - _FRAGMENT_SAFE | |
158 | _QUERY_SAFE = _UNRESERVED_CHARS | (_FRAGMENT_SAFE - set(u'&=+')) | |
159 | _QUERY_DELIMS = _ALL_DELIMS - _QUERY_SAFE | |
160 | ||
161 | ||
162 | def _make_decode_map(delims, allow_percent=False): | |
163 | ret = dict(_HEX_CHAR_MAP) | |
164 | if not allow_percent: | |
165 | delims = set(delims) | set([u'%']) | |
166 | for delim in delims: | |
167 | _hexord = '{0:02X}'.format(ord(delim)).encode('ascii') | |
168 | _hexord_lower = _hexord.lower() | |
169 | ret.pop(_hexord) | |
170 | if _hexord != _hexord_lower: | |
171 | ret.pop(_hexord_lower) | |
172 | return ret | |
173 | ||
174 | ||
175 | def _make_quote_map(safe_chars): | |
176 | ret = {} | |
177 | # v is included in the dict for py3 mostly, because bytestrings | |
178 | # are iterables of ints, of course! | |
179 | for i, v in zip(range(256), range(256)): | |
180 | c = chr(v) | |
181 | if c in safe_chars: | |
182 | ret[c] = ret[v] = c | |
183 | else: | |
184 | ret[c] = ret[v] = '%{0:02X}'.format(i) | |
185 | return ret | |
186 | ||
187 | ||
188 | _USERINFO_PART_QUOTE_MAP = _make_quote_map(_USERINFO_SAFE) | |
189 | _USERINFO_DECODE_MAP = _make_decode_map(_USERINFO_DELIMS) | |
190 | _PATH_PART_QUOTE_MAP = _make_quote_map(_PATH_SAFE) | |
191 | _SCHEMELESS_PATH_PART_QUOTE_MAP = _make_quote_map(_SCHEMELESS_PATH_SAFE) | |
192 | _PATH_DECODE_MAP = _make_decode_map(_PATH_DELIMS) | |
193 | _QUERY_PART_QUOTE_MAP = _make_quote_map(_QUERY_SAFE) | |
194 | _QUERY_DECODE_MAP = _make_decode_map(_QUERY_DELIMS) | |
195 | _FRAGMENT_QUOTE_MAP = _make_quote_map(_FRAGMENT_SAFE) | |
_FRAGMENT_DECODE_MAP = _make_decode_map(_FRAGMENT_DELIMS)
_UNRESERVED_DECODE_MAP = dict([(k, v) for k, v in _HEX_CHAR_MAP.items()
                               if v.decode('ascii', 'replace')
                               in _UNRESERVED_CHARS])

_ROOT_PATHS = frozenset(((), (u'',)))


def _encode_path_part(text, maximal=True):
    "Percent-encode a single segment of a URL path."
    if maximal:
        bytestr = normalize('NFC', text).encode('utf8')
        return u''.join([_PATH_PART_QUOTE_MAP[b] for b in bytestr])
    return u''.join([_PATH_PART_QUOTE_MAP[t] if t in _PATH_DELIMS else t
                     for t in text])


def _encode_schemeless_path_part(text, maximal=True):
    """Percent-encode the first segment of a URL path for a URL without a
    scheme specified.
    """
    if maximal:
        bytestr = normalize('NFC', text).encode('utf8')
        return u''.join([_SCHEMELESS_PATH_PART_QUOTE_MAP[b] for b in bytestr])
    return u''.join([_SCHEMELESS_PATH_PART_QUOTE_MAP[t]
                     if t in _SCHEMELESS_PATH_DELIMS else t for t in text])


def _encode_path_parts(text_parts, rooted=False, has_scheme=True,
                       has_authority=True, joined=True, maximal=True):
    """
    Percent-encode a tuple of path parts into a complete path.

    Setting *maximal* to False percent-encodes only the reserved
    characters that are syntactically necessary for serialization,
    preserving any IRI-style textual data.

    Leaving *maximal* set to its default True percent-encodes
    everything required to convert a portion of an IRI to a portion of
    a URI.

    RFC 3986 3.3:

    If a URI contains an authority component, then the path component
    must either be empty or begin with a slash ("/") character. If a URI
    does not contain an authority component, then the path cannot begin
    with two slash characters ("//"). In addition, a URI reference
    (Section 4.1) may be a relative-path reference, in which case the
    first path segment cannot contain a colon (":") character.
    """
    if not text_parts:
        return u'' if joined else text_parts
    if rooted:
        text_parts = (u'',) + text_parts
    # elif has_authority and text_parts:
    #     raise Exception('see rfc above')  # TODO: too late to fail like this?
    encoded_parts = []
    if has_scheme:
        encoded_parts = [_encode_path_part(part, maximal=maximal)
                         if part else part for part in text_parts]
    else:
        # thread *maximal* through to the first segment as well, so
        # non-maximal encoding is honored consistently across the path
        encoded_parts = [_encode_schemeless_path_part(text_parts[0],
                                                      maximal=maximal)]
        encoded_parts.extend([_encode_path_part(part, maximal=maximal)
                              if part else part for part in text_parts[1:]])
    if joined:
        return u'/'.join(encoded_parts)
    return tuple(encoded_parts)

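# An illustrative, self-contained sketch (the _demo_* name is made up
# and is not part of hyperlink's API): how the *rooted* flag used by
# _encode_path_parts above becomes a leading slash -- prepending an
# empty segment before joining makes the serialized path start with "/".
def _demo_join_path(parts, rooted=False):
    if not parts:
        return u''
    if rooted:
        # an empty first segment serializes as a leading "/"
        parts = (u'',) + tuple(parts)
    return u'/'.join(parts)
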

def _encode_query_part(text, maximal=True):
    """
    Percent-encode a single query string key or value.
    """
    if maximal:
        bytestr = normalize('NFC', text).encode('utf8')
        return u''.join([_QUERY_PART_QUOTE_MAP[b] for b in bytestr])
    return u''.join([_QUERY_PART_QUOTE_MAP[t] if t in _QUERY_DELIMS else t
                     for t in text])


def _encode_fragment_part(text, maximal=True):
    """Quote the fragment part of the URL. Fragments don't have
    subdelimiters, so the whole URL fragment can be passed.
    """
    if maximal:
        bytestr = normalize('NFC', text).encode('utf8')
        return u''.join([_FRAGMENT_QUOTE_MAP[b] for b in bytestr])
    return u''.join([_FRAGMENT_QUOTE_MAP[t] if t in _FRAGMENT_DELIMS else t
                     for t in text])


def _encode_userinfo_part(text, maximal=True):
    """Quote special characters in either the username or password
    section of the URL.
    """
    if maximal:
        bytestr = normalize('NFC', text).encode('utf8')
        return u''.join([_USERINFO_PART_QUOTE_MAP[b] for b in bytestr])
    return u''.join([_USERINFO_PART_QUOTE_MAP[t] if t in _USERINFO_DELIMS
                     else t for t in text])



# This port list painstakingly curated by hand searching through
# https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
# and
# https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml
SCHEME_PORT_MAP = {'acap': 674, 'afp': 548, 'dict': 2628, 'dns': 53,
                   'file': None, 'ftp': 21, 'git': 9418, 'gopher': 70,
                   'http': 80, 'https': 443, 'imap': 143, 'ipp': 631,
                   'ipps': 631, 'irc': 194, 'ircs': 6697, 'ldap': 389,
                   'ldaps': 636, 'mms': 1755, 'msrp': 2855, 'msrps': None,
                   'mtqp': 1038, 'nfs': 111, 'nntp': 119, 'nntps': 563,
                   'pop': 110, 'prospero': 1525, 'redis': 6379, 'rsync': 873,
                   'rtsp': 554, 'rtsps': 322, 'rtspu': 5005, 'sftp': 22,
                   'smb': 445, 'snmp': 161, 'ssh': 22, 'steam': None,
                   'svn': 3690, 'telnet': 23, 'ventrilo': 3784, 'vnc': 5900,
                   'wais': 210, 'ws': 80, 'wss': 443, 'xmpp': None}

# This list of schemes that don't use authorities is also from the link above.
NO_NETLOC_SCHEMES = set(['urn', 'about', 'bitcoin', 'blob', 'data', 'geo',
                         'magnet', 'mailto', 'news', 'pkcs11',
                         'sip', 'sips', 'tel'])
# As of Mar 11, 2017, there were 44 netloc schemes, and 13 non-netloc


def register_scheme(text, uses_netloc=True, default_port=None):
    """Registers new scheme information, resulting in correct port and
    slash behavior from the URL object. There are dozens of standard
    schemes preregistered, so this function is mostly meant for
    proprietary internal customizations or stopgaps on missing
    standards information. If a scheme seems to be missing, please
    `file an issue`_!

    Args:
        text (unicode): Text representing the scheme.
            (the 'http' in 'http://hatnote.com')
        uses_netloc (bool): Does the scheme support specifying a
            network host? For instance, "http" does, "mailto" does
            not. Defaults to True.
        default_port (int): The default port, if any, for netloc-using
            schemes.

    .. _file an issue: https://github.com/mahmoud/hyperlink/issues

    """
    text = text.lower()
    if default_port is not None:
        try:
            default_port = int(default_port)
        except (ValueError, TypeError):
            raise ValueError('default_port expected integer or None, not %r'
                             % (default_port,))

    if uses_netloc is True:
        SCHEME_PORT_MAP[text] = default_port
    elif uses_netloc is False:
        if default_port is not None:
            raise ValueError('unexpected default port while specifying'
                             ' non-netloc scheme: %r' % default_port)
        NO_NETLOC_SCHEMES.add(text)
    else:
        raise ValueError('uses_netloc expected bool, not: %r' % uses_netloc)

    return


def scheme_uses_netloc(scheme, default=None):
    """Whether or not a URL uses :code:`:` or :code:`://` to separate the
    scheme from the rest of the URL depends on the scheme's own
    standard definition. There is no way to infer this behavior
    from other parts of the URL. A scheme either supports network
    locations or it does not.

    The URL type's approach to this is to check for explicitly
    registered schemes, with common schemes like HTTP
    preregistered. This is the same approach taken by
    :mod:`urlparse`.

    URL adds two additional heuristics if the scheme as a whole is
    not registered. First, it attempts to check the subpart of the
    scheme after the last ``+`` character. This adds intuitive
    behavior for schemes like ``git+ssh``. Second, if a URL with
    an unrecognized scheme is loaded, it will maintain the
    separator it sees.
    """
    if not scheme:
        return False
    scheme = scheme.lower()
    if scheme in SCHEME_PORT_MAP:
        return True
    if scheme in NO_NETLOC_SCHEMES:
        return False
    if scheme.split('+')[-1] in SCHEME_PORT_MAP:
        return True
    return default

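# An illustrative, self-contained sketch with a toy registry (all
# names here are made up for illustration): the lookup order
# scheme_uses_netloc follows -- exact match first, then the part
# after the last "+", then the caller-supplied default.
def _demo_uses_netloc(scheme, default=None):
    netloc_schemes = {'http', 'https', 'ssh'}  # stand-in for SCHEME_PORT_MAP
    no_netloc_schemes = {'mailto'}  # stand-in for NO_NETLOC_SCHEMES
    scheme = scheme.lower()
    if scheme in netloc_schemes:
        return True
    if scheme in no_netloc_schemes:
        return False
    if scheme.split('+')[-1] in netloc_schemes:
        return True  # e.g., "git+ssh" inherits ssh's netloc behavior
    return default
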

class URLParseError(ValueError):
    """Exception inheriting from :exc:`ValueError`, raised when failing to
    parse a URL. Mostly raised on invalid ports and IPv6 addresses.
    """
    pass


def _optional(argument, default):
    if argument is _UNSET:
        return default
    else:
        return argument


def _typecheck(name, value, *types):
    """
    Check that the given *value* is one of the given *types*, or raise an
    exception describing the problem using *name*.
    """
    if not types:
        raise ValueError('expected one or more types, maybe use _textcheck?')
    if not isinstance(value, types):
        raise TypeError("expected %s for %s, got %r"
                        % (" or ".join([t.__name__ for t in types]),
                           name, value))
    return value


def _textcheck(name, value, delims=frozenset(), nullable=False):
    if not isinstance(value, unicode):
        if nullable and value is None:
            return value  # used by query string values
        else:
            str_name = "unicode" if bytes is str else "str"
            exp = str_name + ' or NoneType' if nullable else str_name
            raise TypeError('expected %s for %s, got %r' % (exp, name, value))
    if delims and set(value) & set(delims):  # TODO: test caching into regexes
        raise ValueError('one or more reserved delimiters %s present in %s: %r'
                         % (''.join(delims), name, value))
    return value


def _decode_unreserved(text, normalize_case=False):
    return _percent_decode(text, normalize_case=normalize_case,
                           _decode_map=_UNRESERVED_DECODE_MAP)


def _decode_userinfo_part(text, normalize_case=False):
    return _percent_decode(text, normalize_case=normalize_case,
                           _decode_map=_USERINFO_DECODE_MAP)


def _decode_path_part(text, normalize_case=False):
    """
    >>> _decode_path_part(u'%61%77%2f%7a')
    u'aw%2fz'
    >>> _decode_path_part(u'%61%77%2f%7a', normalize_case=True)
    u'aw%2Fz'
    """
    return _percent_decode(text, normalize_case=normalize_case,
                           _decode_map=_PATH_DECODE_MAP)


def _decode_query_part(text, normalize_case=False):
    return _percent_decode(text, normalize_case=normalize_case,
                           _decode_map=_QUERY_DECODE_MAP)


def _decode_fragment_part(text, normalize_case=False):
    return _percent_decode(text, normalize_case=normalize_case,
                           _decode_map=_FRAGMENT_DECODE_MAP)


def _percent_decode(text, normalize_case=False, _decode_map=_HEX_CHAR_MAP):
    """Convert percent-encoded text characters to their normal,
    human-readable equivalents.

    All characters in the input text must be valid ASCII. All special
    characters underlying the values in the percent-encoding must be
    valid UTF-8. If a non-UTF8-valid string is passed, the original
    text is returned with no changes applied.

    Only called by field-tailored variants, e.g.,
    :func:`_decode_path_part`, as every percent-encodable part of the
    URL has characters which should not be percent decoded.

    >>> _percent_decode(u'abc%20def')
    u'abc def'

    Args:
        text (unicode): The ASCII text with percent-encoding present.
        normalize_case (bool): Whether undecoded percent segments, such
            as encoded delimiters, should be uppercased, per RFC 3986
            Section 2.1. See :func:`_decode_path_part` for an example.

    Returns:
        unicode: The percent-decoded version of *text*, with UTF-8
            decoding applied.

    """
    try:
        quoted_bytes = text.encode("ascii")
    except UnicodeEncodeError:
        return text

    bits = quoted_bytes.split(b'%')
    if len(bits) == 1:
        return text

    res = [bits[0]]
    append = res.append

    if not normalize_case:
        for item in bits[1:]:
            try:
                append(_decode_map[item[:2]])
                append(item[2:])
            except KeyError:
                append(b'%')
                append(item)
    else:
        for item in bits[1:]:
            try:
                append(_decode_map[item[:2]])
                append(item[2:])
            except KeyError:
                append(b'%')
                if item[:2] in _HEX_CHAR_MAP:
                    append(item[:2].upper())
                    append(item[2:])
                else:
                    append(item)

    unquoted_bytes = b''.join(res)

    try:
        return unquoted_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return text

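# An illustrative, self-contained sketch (names made up) of the
# selective-decoding technique _percent_decode uses above: split on
# "%", decode only the hex pairs present in an allowed map, and keep
# everything else (including the "%") untouched, so reserved escapes
# like %2F survive the round trip.
def _demo_selective_decode(text, allowed):
    # *allowed* maps lowercase two-hex-digit strings to decoded text,
    # e.g. {'20': u' '} decodes %20 but nothing else
    bits = text.split(u'%')
    res = [bits[0]]
    for item in bits[1:]:
        decoded = allowed.get(item[:2].lower())
        if decoded is not None:
            res.append(decoded + item[2:])
        else:
            res.append(u'%' + item)  # not in the map; leave it encoded
    return u''.join(res)
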

def _resolve_dot_segments(path):
    """Normalize the URL path by resolving segments of '.' and '..'. For
    more details, see `RFC 3986 section 5.2.4, Remove Dot Segments`_.

    Args:
        path (list): path segments in string form

    Returns:
        list: a new list of path segments with the '.' and '..' elements
            removed and resolved.

    .. _RFC 3986 section 5.2.4, Remove Dot Segments: https://tools.ietf.org/html/rfc3986#section-5.2.4
    """
    segs = []

    for seg in path:
        if seg == u'.':
            pass
        elif seg == u'..':
            if segs:
                segs.pop()
        else:
            segs.append(seg)

    if list(path[-1:]) in ([u'.'], [u'..']):
        segs.append(u'')

    return segs

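# An illustrative, standalone copy of the rule above for quick
# experimentation outside the module (the _demo_* name is made up):
# "." drops out, ".." pops the previous segment when one exists, and
# a trailing "." or ".." keeps a trailing empty segment, which
# serializes as a trailing slash.
def _demo_remove_dot_segments(path):
    segs = []
    for seg in path:
        if seg == u'.':
            continue  # "." means "the current directory"; drop it
        elif seg == u'..':
            if segs:
                segs.pop()  # ".." climbs one level, if possible
        else:
            segs.append(seg)
    if list(path[-1:]) in ([u'.'], [u'..']):
        segs.append(u'')
    return segs
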

def parse_host(host):
    """Parse the host into a tuple of ``(family, host)``, where family
    is the appropriate :mod:`socket` module constant when the host is
    an IP address. Family is ``None`` when the host is not an IP.

    Will raise :class:`URLParseError` on invalid IPv6 constants.

    Returns:
        tuple: family (socket constant or None), host (string)

    >>> parse_host('googlewebsite.com') == (None, 'googlewebsite.com')
    True
    >>> parse_host('::1') == (socket.AF_INET6, '::1')
    True
    >>> parse_host('192.168.1.1') == (socket.AF_INET, '192.168.1.1')
    True
    """
    if not host:
        return None, u''
    if u':' in host:
        try:
            inet_pton(socket.AF_INET6, host)
        except socket.error as se:
            raise URLParseError('invalid IPv6 host: %r (%r)' % (host, se))
        except UnicodeEncodeError:
            pass  # TODO: this can't be a real host right?
        else:
            family = socket.AF_INET6
            return family, host
    try:
        inet_pton(socket.AF_INET, host)
    except (socket.error, UnicodeEncodeError):
        family = None  # not an IP
    else:
        family = socket.AF_INET
    return family, host

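# An illustrative, self-contained sketch of the family-detection
# technique parse_host relies on (the _demo_* name is made up):
# socket.inet_pton raises socket.error for text that is not a valid
# address of the given family, so a successful parse identifies the
# address family directly.
import socket


def _demo_ip_family(host):
    if u':' in host:
        try:
            socket.inet_pton(socket.AF_INET6, host)
        except (socket.error, UnicodeEncodeError):
            return None  # colon present, but not a valid IPv6 address
        return socket.AF_INET6
    try:
        socket.inet_pton(socket.AF_INET, host)
    except (socket.error, UnicodeEncodeError):
        return None  # not an IP at all; likely a domain name
    return socket.AF_INET
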

class URL(object):
    """From blogs to billboards, URLs are so common that it's easy to
    overlook their complexity and power. With hyperlink's
    :class:`URL` type, working with URLs doesn't have to be hard.

    URLs are made of many parts. Most of these parts are officially
    named in `RFC 3986`_ and this diagram may prove handy in identifying
    them::

        foo://user:pass@example.com:8042/over/there?name=ferret#nose
        \_/   \_______/ \_________/ \__/\_________/ \_________/ \__/
         |        |          |       |       |           |       |
      scheme  userinfo     host     port    path       query  fragment

    While :meth:`~URL.from_text` is used for parsing whole URLs, the
    :class:`URL` constructor builds a URL from the individual
    components, like so::

        >>> from hyperlink import URL
        >>> url = URL(scheme=u'https', host=u'example.com', path=[u'hello', u'world'])
        >>> print(url.to_text())
        https://example.com/hello/world

    The constructor runs basic type checks. All strings are expected
    to be decoded (:class:`unicode` in Python 2). All arguments are
    optional, defaulting to appropriately empty values. A full list of
    constructor arguments is below.

    Args:
        scheme (unicode): The text name of the scheme.
        host (unicode): The host portion of the network location.
        port (int): The port part of the network location. If
            ``None`` or no port is passed, the port will default to
            the default port of the scheme, if it is known. See
            ``SCHEME_PORT_MAP`` and :func:`register_scheme` for
            more info.
        path (tuple): A tuple of strings representing the
            slash-separated parts of the path.
        query (tuple): The query parameters, as a tuple of
            key-value pairs.
        fragment (unicode): The fragment part of the URL.
        rooted (bool): Whether or not the path begins with a slash.
        userinfo (unicode): The username or colon-separated
            username:password pair.
        uses_netloc (bool): Indicates whether two slashes appear
            between the scheme and the host (``http://eg.com`` vs
            ``mailto:e@g.com``). Set automatically based on scheme.

    All of these parts are also exposed as read-only attributes of
    URL instances, along with several useful methods.

    .. _RFC 3986: https://tools.ietf.org/html/rfc3986
    .. _RFC 3987: https://tools.ietf.org/html/rfc3987
    """

    def __init__(self, scheme=None, host=None, path=(), query=(), fragment=u'',
                 port=None, rooted=None, userinfo=u'', uses_netloc=None):
        if host is not None and scheme is None:
            scheme = u'http'  # TODO: why
        if port is None:
            port = SCHEME_PORT_MAP.get(scheme)
        if host and query and not path:
            # per RFC 3986 6.2.3, "a URI that uses the generic syntax
            # for authority with an empty path should be normalized to
            # a path of '/'."
            path = (u'',)

        # Now that we're done detecting whether they were passed, we can set
        # them to their defaults:
        if scheme is None:
            scheme = u''
        if host is None:
            host = u''
        if rooted is None:
            rooted = bool(host)

        # Set attributes.
        self._scheme = _textcheck("scheme", scheme)
        if self._scheme:
            if not _SCHEME_RE.match(self._scheme):
                raise ValueError('invalid scheme: %r. Only alphanumeric, "+",'
                                 ' "-", and "." allowed. Did you mean to call'
                                 ' %s.from_text()?'
                                 % (self._scheme, self.__class__.__name__))

        _, self._host = parse_host(_textcheck('host', host, '/?#@'))
        if isinstance(path, unicode):
            raise TypeError("expected iterable of text for path, not: %r"
                            % (path,))
        self._path = tuple((_textcheck("path segment", segment, '/?#')
                            for segment in path))
        self._query = tuple(
            (_textcheck("query parameter name", k, '&=#'),
             _textcheck("query parameter value", v, '&#', nullable=True))
            for (k, v) in query
        )
        self._fragment = _textcheck("fragment", fragment)
        self._port = _typecheck("port", port, int, NoneType)
        self._rooted = _typecheck("rooted", rooted, bool)
        self._userinfo = _textcheck("userinfo", userinfo, '/?#@')

        uses_netloc = scheme_uses_netloc(self._scheme, uses_netloc)
        self._uses_netloc = _typecheck("uses_netloc",
                                       uses_netloc, bool, NoneType)

        return

    @property
    def scheme(self):
        """The scheme is a string, the first part of an absolute URL. It
        appears before the first colon and defines the semantics of the
        rest of the URL. Examples include "http", "https", "ssh",
        "file", "mailto", and many others. See
        :func:`~hyperlink.register_scheme()` for more info.
        """
        return self._scheme

    @property
    def host(self):
        """The host is a string, and the second standard part of an absolute
        URL. When present, a valid host must be a domain name, or an
        IP (v4 or v6). It occurs before the first slash, or the second
        colon, if a :attr:`~hyperlink.URL.port` is provided.
        """
        return self._host

    @property
    def port(self):
        """The port is an integer that is commonly used in connecting to the
        :attr:`host`, and almost never appears without it.

        When not present in the original URL, this attribute defaults
        to the scheme's default port. If the scheme's default port is
        not known, and the port is not provided, this attribute will
        be set to None.

        >>> URL.from_text(u'http://example.com/pa/th').port
        80
        >>> URL.from_text(u'foo://example.com/pa/th').port
        >>> URL.from_text(u'foo://example.com:8042/pa/th').port
        8042

        .. note::

            Per the standard, when the port is the same as the scheme's
            default port, it will be omitted in the text URL.

        """
        return self._port

    @property
    def path(self):
        """A tuple of strings, created by splitting the slash-separated
        hierarchical path. The path starts after the first slash
        following the host and is terminated by a "?", which marks the
        start of the :attr:`~hyperlink.URL.query` string.
        """
        return self._path

    @property
    def query(self):
        """Tuple of pairs, created by splitting the ampersand-separated
        mapping of keys and optional values representing
        non-hierarchical data used to identify the resource. Keys are
        always strings. Values are strings when present, or None when
        missing.

        For more operations on the mapping, see
        :meth:`~hyperlink.URL.get()`, :meth:`~hyperlink.URL.add()`,
        :meth:`~hyperlink.URL.set()`, and
        :meth:`~hyperlink.URL.delete()`.
        """
        return self._query

    @property
    def fragment(self):
        """A string, the last part of the URL, indicated by the first "#"
        after the :attr:`~hyperlink.URL.path` or
        :attr:`~hyperlink.URL.query`. Enables indirect identification
        of a secondary resource, like an anchor within an HTML page.

        """
        return self._fragment

    @property
    def rooted(self):
        """Whether or not the path starts with a forward slash (``/``).

        This is taken from the terminology in the BNF grammar,
        specifically the "path-rootless" rule, since "absolute path"
        and "absolute URI" are somewhat ambiguous. :attr:`path` does
        not contain the implicit prefixed ``"/"`` since that is
        somewhat awkward to work with.

        """
        return self._rooted

    @property
    def userinfo(self):
        """The colon-separated string forming the username-password
        combination.
        """
        return self._userinfo

    @property
    def uses_netloc(self):
        """Whether or not a network location (``//``) separates the
        scheme from the rest of the URL, as in ``http://example.com``
        (compare ``mailto:user@example.com``).
        """
        return self._uses_netloc

    @property
    def user(self):
        """
        The user portion of :attr:`~hyperlink.URL.userinfo`.
        """
        return self.userinfo.split(u':')[0]

    def authority(self, with_password=False, **kw):
        """Compute and return the appropriate host/port/userinfo combination.

        >>> url = URL.from_text(u'http://user:pass@localhost:8080/a/b?x=y')
        >>> url.authority()
        u'user:@localhost:8080'
        >>> url.authority(with_password=True)
        u'user:pass@localhost:8080'

        Args:
            with_password (bool): Whether the return value of this
                method includes the password in the URL, if it is
                set. Defaults to False.

        Returns:
            str: The authority (network location and user information) portion
                of the URL.
        """
        # first, a bit of twisted compat
        with_password = kw.pop('includeSecrets', with_password)
        if kw:
            raise TypeError('got unexpected keyword arguments: %r' % kw.keys())
        host = self.host
        if ':' in host:
            hostport = ['[' + host + ']']
        else:
            hostport = [self.host]
        if self.port != SCHEME_PORT_MAP.get(self.scheme):
            hostport.append(unicode(self.port))
        authority = []
        if self.userinfo:
            userinfo = self.userinfo
            if not with_password and u":" in userinfo:
                userinfo = userinfo[:userinfo.index(u":") + 1]
            authority.append(userinfo)
        authority.append(u":".join(hostport))
        return u"@".join(authority)

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        for attr in ['scheme', 'userinfo', 'host', 'query',
                     'fragment', 'port', 'uses_netloc']:
            if getattr(self, attr) != getattr(other, attr):
                return False
        if self.path == other.path or (self.path in _ROOT_PATHS
                                       and other.path in _ROOT_PATHS):
            return True
        return False

    def __ne__(self, other):
        if not isinstance(other, self.__class__):
            return NotImplemented
        return not self.__eq__(other)

    def __hash__(self):
        return hash((self.__class__, self.scheme, self.userinfo, self.host,
                     self.path, self.query, self.fragment, self.port,
                     self.rooted, self.uses_netloc))

    @property
    def absolute(self):
        """Whether or not the URL is "absolute". Absolute URLs are complete
        enough to resolve to a network resource without being relative
        to a base URI.

        >>> URL.from_text(u'http://wikipedia.org/').absolute
        True
        >>> URL.from_text(u'?a=b&c=d').absolute
        False

        Absolute URLs must have both a scheme and a host set.
        """
        return bool(self.scheme and self.host)

    def replace(self, scheme=_UNSET, host=_UNSET, path=_UNSET, query=_UNSET,
                fragment=_UNSET, port=_UNSET, rooted=_UNSET, userinfo=_UNSET,
                uses_netloc=_UNSET):
        """:class:`URL` objects are immutable, which means that attributes
        are designed to be set only once, at construction. Instead of
        modifying an existing URL, one simply creates a copy with the
        desired changes.

        If any of the following arguments is omitted, it defaults to
        the value on the current URL.

        Args:
            scheme (unicode): The text name of the scheme.
            host (unicode): The host portion of the network location.
            port (int): The port part of the network location.
            path (tuple): A tuple of strings representing the
                slash-separated parts of the path.
            query (tuple): The query parameters, as a tuple of
                key-value pairs.
            fragment (unicode): The fragment part of the URL.
            rooted (bool): Whether or not the path begins with a slash.
            userinfo (unicode): The username or colon-separated
                username:password pair.
            uses_netloc (bool): Indicates whether two slashes appear
                between the scheme and the host (``http://eg.com`` vs
                ``mailto:e@g.com``)

        Returns:
            URL: a copy of the current :class:`URL`, with new values for
                parameters passed.

        """
        return self.__class__(
            scheme=_optional(scheme, self.scheme),
            host=_optional(host, self.host),
            path=_optional(path, self.path),
            query=_optional(query, self.query),
            fragment=_optional(fragment, self.fragment),
            port=_optional(port, self.port),
            rooted=_optional(rooted, self.rooted),
            userinfo=_optional(userinfo, self.userinfo),
            uses_netloc=_optional(uses_netloc, self.uses_netloc)
        )

    @classmethod
    def from_text(cls, text):
        """Whereas the :class:`URL` constructor is useful for constructing
        URLs from parts, :meth:`~URL.from_text` supports parsing whole
        URLs from their string form::

            >>> URL.from_text(u'http://example.com')
            URL.from_text(u'http://example.com')
            >>> URL.from_text(u'?a=b&x=y')
            URL.from_text(u'?a=b&x=y')

        As you can see above, it's also used as the :func:`repr` of
        :class:`URL` objects, and is the natural counterpart to
        :func:`~URL.to_text()`. This method only accepts *text*, so be
        sure to decode those bytestrings.

        Args:
            text (unicode): A valid URL string.

        Returns:
            URL: The structured object version of the parsed string.

        .. note::

            Somewhat unexpectedly, URLs are a far more permissive
            format than most would assume. Many strings which don't
            look like URLs are still valid URLs. As a result, this
            method only raises :class:`URLParseError` on invalid port
            and IPv6 values in the host portion of the URL.

        """
        um = _URL_RE.match(_textcheck('text', text))
        try:
            gs = um.groupdict()
        except AttributeError:
            raise URLParseError('could not parse url: %r' % text)

        au_text = gs['authority'] or u''
        au_m = _AUTHORITY_RE.match(au_text)
        try:
            au_gs = au_m.groupdict()
        except AttributeError:
            raise URLParseError('invalid authority %r in url: %r'
                                % (au_text, text))
        if au_gs['bad_host']:
            raise URLParseError('invalid host %r in url: %r'
                                % (au_gs['bad_host'], text))
985 | ||
986 | userinfo = au_gs['userinfo'] or u'' | |
987 | ||
988 | host = au_gs['ipv6_host'] or au_gs['plain_host'] | |
989 | port = au_gs['port'] | |
990 | if port is not None: | |
991 | try: | |
992 | port = int(port) | |
993 | except ValueError: | |
994 | if not port: # TODO: excessive? | |
995 | raise URLParseError('port must not be empty: %r' % au_text) | |
996 | raise URLParseError('expected integer for port, not %r' % port) | |
997 | ||
998 | scheme = gs['scheme'] or u'' | |
999 | fragment = gs['fragment'] or u'' | |
1000 | uses_netloc = bool(gs['_netloc_sep']) | |
1001 | ||
1002 | if gs['path']: | |
1003 | path = gs['path'].split(u"/") | |
1004 | if not path[0]: | |
1005 | path.pop(0) | |
1006 | rooted = True | |
1007 | else: | |
1008 | rooted = False | |
1009 | else: | |
1010 | path = () | |
1011 | rooted = bool(au_text) | |
1012 | if gs['query']: | |
1013 | query = ((qe.split(u"=", 1) if u'=' in qe else (qe, None)) | |
1014 | for qe in gs['query'].split(u"&")) | |
1015 | else: | |
1016 | query = () | |
1017 | return cls(scheme, host, path, query, fragment, port, | |
1018 | rooted, userinfo, uses_netloc) | |
1019 | ||
1020 | def normalize(self, scheme=True, host=True, path=True, query=True, | |
1021 | fragment=True): | |
1022 | """Return a new URL object with several standard normalizations | |
1023 | applied: | |
1024 | ||
1025 | * Decode unreserved characters (`RFC 3986 2.3`_) | |
1026 | * Uppercase remaining percent-encoded octets (`RFC 3986 2.1`_) | |
1027 | * Convert scheme and host casing to lowercase (`RFC 3986 3.2.2`_) | |
1028 | * Resolve any "." and ".." references in the path (`RFC 3986 6.2.2.3`_) | |
1029 | * Ensure an ending slash on URLs with an empty path (`RFC 3986 6.2.3`_) | |
1030 | ||
1031 | All are applied by default, but normalizations can be disabled | |
1032 | per-part by passing `False` for that part's corresponding | |
1033 | name. | |
1034 | ||
1035 | Args: | |
1036 | scheme (bool): Convert the scheme to lowercase | |
1037 | host (bool): Convert the host to lowercase | |
1038 | path (bool): Normalize the path (see above for details) | |
1039 | query (bool): Normalize the query string | |
1040 | fragment (bool): Normalize the fragment | |
1041 | ||
1042 | >>> url = URL.from_text(u'Http://example.COM/a/../b/./c%2f?%61') | |
1043 | >>> print(url.normalize().to_text()) | |
1044 | http://example.com/b/c%2F?a | |
1045 | ||
1046 | .. _RFC 3986 3.2.2: https://tools.ietf.org/html/rfc3986#section-3.2.2 | |
1047 | .. _RFC 3986 2.3: https://tools.ietf.org/html/rfc3986#section-2.3 | |
1048 | .. _RFC 3986 2.1: https://tools.ietf.org/html/rfc3986#section-2.1 | |
1049 | .. _RFC 3986 6.2.2.3: https://tools.ietf.org/html/rfc3986#section-6.2.2.3 | |
1050 | .. _RFC 3986 6.2.3: https://tools.ietf.org/html/rfc3986#section-6.2.3 | |
1051 | ||
1052 | """ | |
1053 | # TODO: userinfo? | |
1054 | kw = {} | |
1055 | if scheme: | |
1056 | kw['scheme'] = self.scheme.lower() | |
1057 | if host: | |
1058 | kw['host'] = self.host.lower() | |
1059 | if path: | |
1060 | if self.path: | |
1061 | kw['path'] = [_decode_unreserved(p, normalize_case=True) | |
1062 | for p in _resolve_dot_segments(self.path)] | |
1063 | else: | |
1064 | kw['path'] = (u'',) | |
1065 | if query: | |
1066 | kw['query'] = [(_decode_unreserved(k, normalize_case=True), | |
1067 | _decode_unreserved(v, normalize_case=True) | |
1068 | if v else v) for k, v in self.query] | |
1069 | if fragment: | |
1070 | kw['fragment'] = _decode_unreserved(self.fragment, | |
1071 | normalize_case=True) | |
1072 | return self.replace(**kw) | |
1073 | ||
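The path normalization above leans on a dot-segment resolution pass (RFC 3986 section 5.2.4). A minimal standalone sketch of that algorithm, as a hypothetical stand-in for the module's `_resolve_dot_segments` helper:

```python
def resolve_dot_segments(segments):
    """Resolve '.' and '..' in a list of path segments per
    RFC 3986 section 5.2.4."""
    output = []
    for seg in segments:
        if seg == u'.':
            pass  # single-dot segments are simply dropped
        elif seg == u'..':
            if output:
                output.pop()  # double-dot removes the previous segment
        else:
            output.append(seg)
    if segments and segments[-1] in (u'.', u'..'):
        output.append(u'')  # a trailing dot segment keeps the trailing slash
    return output

print(resolve_dot_segments([u'a', u'..', u'b', u'.', u'c']))  # → ['b', 'c']
```

This matches the docstring example: the path of `/a/../b/./c%2f` normalizes to `/b/c%2F`. Leading `..` segments that would climb above the root are silently dropped, as in RFC 3986's "abnormal" examples.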
1074 | def child(self, *segments): | |
1075 | """Make a new :class:`URL` where the given path segments are a child | |
1076 | of this URL, preserving other parts of the URL, including the | |
1077 | query string and fragment. | |
1078 | ||
1079 | For example:: | |
1080 | ||
1081 | >>> url = URL.from_text(u'http://localhost/a/b?x=y') | |
1082 | >>> child_url = url.child(u"c", u"d") | |
1083 | >>> child_url.to_text() | |
1084 | u'http://localhost/a/b/c/d?x=y' | |
1085 | ||
1086 | Args: | |
1087 | segments (unicode): Additional parts to be joined and added to | |
1088 | the path, like :func:`os.path.join`. Special characters | |
1089 | in segments will be percent encoded. | |
1090 | ||
1091 | Returns: | |
1092 | URL: A copy of the current URL with the extra path segments. | |
1093 | ||
1094 | """ | |
1095 | segments = [_textcheck('path segment', s) for s in segments] | |
1096 | new_segs = _encode_path_parts(segments, joined=False, maximal=False) | |
1097 | new_path = self.path[:-1 if (self.path and self.path[-1] == u'') | |
1098 | else None] + new_segs | |
1099 | return self.replace(path=new_path) | |
1100 | ||
1101 | def sibling(self, segment): | |
1102 | """Make a new :class:`URL` with a single path segment that is a | |
1103 | sibling of this URL path. | |
1104 | ||
1105 | Args: | |
1106 | segment (unicode): A single path segment. | |
1107 | ||
1108 | Returns: | |
1109 | URL: A copy of the current URL with the last path segment | |
1110 | replaced by *segment*. Special characters such as | |
1111 | ``/?#`` will be percent encoded. | |
1112 | ||
1113 | """ | |
1114 | _textcheck('path segment', segment) | |
1115 | new_path = self.path[:-1] + (_encode_path_part(segment),) | |
1116 | return self.replace(path=new_path) | |
1117 | ||
1118 | def click(self, href=u''): | |
1119 | """Resolve the given URL relative to this URL. | |
1120 | ||
1121 | The resulting URI should match what a web browser would | |
1122 | generate if you visited the current URL and clicked on *href*. | |
1123 | ||
1124 | >>> url = URL.from_text(u'http://blog.hatnote.com/') | |
1125 | >>> url.click(u'/post/155074058790').to_text() | |
1126 | u'http://blog.hatnote.com/post/155074058790' | |
1127 | >>> url = URL.from_text(u'http://localhost/a/b/c/') | |
1128 | >>> url.click(u'../d/./e').to_text() | |
1129 | u'http://localhost/a/b/d/e' | |
1130 | ||
1131 | Args: | |
1132 | href (unicode): A string representing a clicked URL. | |
1133 | ||
1134 | Returns: | |
1135 | URL: A copy of the current URL with navigation logic applied. | |
1136 | ||
1137 | For more information, see `RFC 3986 section 5`_. | |
1138 | ||
1139 | .. _RFC 3986 section 5: https://tools.ietf.org/html/rfc3986#section-5 | |
1140 | """ | |
1141 | if href: | |
1142 | if isinstance(href, URL): | |
1143 | clicked = href | |
1144 | else: | |
1145 | # TODO: This error message is not completely accurate, | |
1146 | # as URL objects are now also valid, but Twisted's | |
1147 | # test suite (wrongly) relies on this exact message. | |
1148 | _textcheck('relative URL', href) | |
1149 | clicked = URL.from_text(href) | |
1150 | if clicked.absolute: | |
1151 | return clicked | |
1152 | else: | |
1153 | clicked = self | |
1154 | ||
1155 | query = clicked.query | |
1156 | if clicked.scheme and not clicked.rooted: | |
1157 | # Schemes with relative paths are not well-defined. RFC 3986 calls | |
1158 | # them a "loophole in prior specifications" that should be avoided, | |
1159 | # or supported only for backwards compatibility. | |
1160 | raise NotImplementedError('absolute URI with rootless path: %r' | |
1161 | % (href,)) | |
1162 | else: | |
1163 | if clicked.rooted: | |
1164 | path = clicked.path | |
1165 | elif clicked.path: | |
1166 | path = self.path[:-1] + clicked.path | |
1167 | else: | |
1168 | path = self.path | |
1169 | if not query: | |
1170 | query = self.query | |
1171 | return self.replace(scheme=clicked.scheme or self.scheme, | |
1172 | host=clicked.host or self.host, | |
1173 | port=clicked.port or self.port, | |
1174 | path=_resolve_dot_segments(path), | |
1175 | query=query, | |
1176 | fragment=clicked.fragment) | |
1177 | ||
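When the clicked URL has a relative path, the merge above (`self.path[:-1] + clicked.path`) follows RFC 3986 section 5.2.3: drop the base path's last segment, then append the reference's segments. A small illustrative sketch (`merge_paths` is a hypothetical name, not part of this module):

```python
def merge_paths(base_segments, ref_segments):
    """Merge a relative reference's path into a base path per
    RFC 3986 section 5.2.3: drop the base's last segment, append the rest."""
    return list(base_segments[:-1]) + list(ref_segments)

# clicking u'd/e' from a base whose path is /a/b/c:
print(merge_paths([u'a', u'b', u'c'], [u'd', u'e']))  # → ['a', 'b', 'd', 'e']
```

Dot segments in the merged result are resolved afterward, which is why `click()` finishes with `_resolve_dot_segments(path)`.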
1178 | def to_uri(self): | |
1179 | u"""Make a new :class:`URL` instance with all non-ASCII characters | |
1180 | appropriately percent-encoded. This is useful to do in preparation | |
1181 | for sending a :class:`URL` over a network protocol. | |
1182 | ||
1183 | For example:: | |
1184 | ||
1185 | >>> URL.from_text(u'https://→example.com/foo⇧bar/').to_uri() | |
1186 | URL.from_text(u'https://xn--example-dk9c.com/foo%E2%87%A7bar/') | |
1187 | ||
1188 | Returns: | |
1189 | URL: A new instance with its path segments, query parameters, and | |
1190 | hostname encoded, so that they are all in the standard | |
1191 | US-ASCII range. | |
1192 | """ | |
1193 | new_userinfo = u':'.join([_encode_userinfo_part(p) for p in | |
1194 | self.userinfo.split(':', 1)]) | |
1195 | new_path = _encode_path_parts(self.path, has_scheme=bool(self.scheme), | |
1196 | rooted=False, joined=False, maximal=True) | |
1197 | return self.replace( | |
1198 | userinfo=new_userinfo, | |
1199 | host=self.host.encode("idna").decode("ascii"), | |
1200 | path=new_path, | |
1201 | query=tuple([tuple(_encode_query_part(x, maximal=True) | |
1202 | if x is not None else None | |
1203 | for x in (k, v)) | |
1204 | for k, v in self.query]), | |
1205 | fragment=_encode_fragment_part(self.fragment, maximal=True) | |
1206 | ) | |
1207 | ||
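The `encode("idna")` call above uses Python's built-in IDNA codec, which transcodes each hostname label to its ASCII-Compatible Encoding. A standalone illustration with a well-known internationalized host:

```python
# Encode an internationalized hostname for the wire, as to_uri() does
# for the host part (Python's "idna" codec handles it label by label).
host = u'b\xfccher.ch'  # "bücher.ch"
ascii_host = host.encode('idna').decode('ascii')
print(ascii_host)  # → xn--bcher-kva.ch

# ...and decode it back for display, as to_iri() does:
print(ascii_host.encode('ascii').decode('idna'))  # → bücher.ch
```

Non-host components take a different route: they are percent-encoded byte-by-byte rather than IDNA-encoded.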
1208 | def to_iri(self): | |
1209 | u"""Make a new :class:`URL` instance with all but a few reserved | |
1210 | characters decoded into human-readable format. | |
1211 | ||
1212 | Percent-encoded Unicode and IDNA-encoded hostnames are | |
1213 | decoded, like so:: | |
1214 | ||
1215 | >>> url = URL.from_text(u'https://xn--example-dk9c.com/foo%E2%87%A7bar/') | |
1216 | >>> print(url.to_iri().to_text()) | |
1217 | https://→example.com/foo⇧bar/ | |
1218 | ||
1219 | .. note:: | |
1220 | ||
1221 | As a general Python issue, "narrow" (UCS-2) builds of | |
1222 | Python may not be able to fully decode certain URLs, and | |
1223 | in those cases, this method will return a best-effort, | |
1224 | partially-decoded URL which is still valid. This issue | |
1225 | does not affect Python builds 3.4 and later. | |
1226 | ||
1227 | Returns: | |
1228 | URL: A new instance with its path segments, query parameters, and | |
1229 | hostname decoded for display purposes. | |
1230 | """ | |
1231 | new_userinfo = u':'.join([_decode_userinfo_part(p) for p in | |
1232 | self.userinfo.split(':', 1)]) | |
1233 | try: | |
1234 | asciiHost = self.host.encode("ascii") | |
1235 | except UnicodeEncodeError: | |
1236 | textHost = self.host | |
1237 | else: | |
1238 | try: | |
1239 | textHost = asciiHost.decode("idna") | |
1240 | except ValueError: | |
1241 | # only reached on "narrow" (UCS-2) Python builds <3.4, see #7 | |
1242 | textHost = self.host | |
1243 | return self.replace(userinfo=new_userinfo, | |
1244 | host=textHost, | |
1245 | path=[_decode_path_part(segment) | |
1246 | for segment in self.path], | |
1247 | query=[tuple(_decode_query_part(x) | |
1248 | if x is not None else None | |
1249 | for x in (k, v)) | |
1250 | for k, v in self.query], | |
1251 | fragment=_decode_fragment_part(self.fragment)) | |
1252 | ||
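Both `normalize()` and `to_iri()` share one constraint: percent-decoding must never free a reserved delimiter such as `%2F`, or the URL's structure changes. A rough sketch of the strictest form of that rule, decoding only the unreserved set of RFC 3986 section 2.3 (`decode_unreserved` below is hypothetical, loosely mirroring the module's `_decode_unreserved` with case normalization):

```python
import re

# The RFC 3986 2.3 "unreserved" set: safe to decode anywhere.
_UNRESERVED = (u'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
               u'abcdefghijklmnopqrstuvwxyz0123456789-._~')

def decode_unreserved(text):
    """Decode %XX escapes only for unreserved octets; uppercase the rest."""
    def _maybe_decode(match):
        ch = chr(int(match.group(1), 16))
        if ch in _UNRESERVED:
            return ch
        return match.group(0).upper()  # keep reserved octets encoded
    return re.sub(r'%([0-9A-Fa-f]{2})', _maybe_decode, text)

print(decode_unreserved(u'%7Ejoe/%2fpath'))  # → ~joe/%2Fpath
```

Note that `%7E` becomes a literal `~` while `%2f` stays encoded (and is uppercased to `%2F`), so the path keeps the same number of segments.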
1253 | def to_text(self, with_password=False): | |
1254 | """Render this URL to its textual representation. | |
1255 | ||
1256 | By default, the URL text will *not* include a password, if one | |
1257 | is set. RFC 3986 considers using URLs to represent such | |
1258 | sensitive information as deprecated. Quoting from RFC 3986, | |
1259 | `section 3.2.1`_: | |
1260 | ||
1261 | "Applications should not render as clear text any data after the | |
1262 | first colon (":") character found within a userinfo subcomponent | |
1263 | unless the data after the colon is the empty string (indicating no | |
1264 | password)." | |
1265 | ||
1266 | Args: | |
1267 | with_password (bool): Whether or not to include the | |
1268 | password in the URL text. Defaults to False. | |
1269 | ||
1270 | Returns: | |
1271 | str: The serialized textual representation of this URL, | |
1272 | such as ``u"http://example.com/some/path?some=query"``. | |
1273 | ||
1274 | The natural counterpart to :class:`URL.from_text()`. | |
1275 | ||
1276 | .. _section 3.2.1: https://tools.ietf.org/html/rfc3986#section-3.2.1 | |
1277 | """ | |
1278 | scheme = self.scheme | |
1279 | authority = self.authority(with_password) | |
1280 | path = _encode_path_parts(self.path, | |
1281 | rooted=self.rooted, | |
1282 | has_scheme=bool(scheme), | |
1283 | has_authority=bool(authority), | |
1284 | maximal=False) | |
1285 | query_string = u'&'.join( | |
1286 | u'='.join((_encode_query_part(x, maximal=False) | |
1287 | for x in ([k] if v is None else [k, v]))) | |
1288 | for (k, v) in self.query) | |
1289 | ||
1290 | fragment = self.fragment | |
1291 | ||
1292 | parts = [] | |
1293 | _add = parts.append | |
1294 | if scheme: | |
1295 | _add(scheme) | |
1296 | _add(':') | |
1297 | if authority: | |
1298 | _add('//') | |
1299 | _add(authority) | |
1300 | elif (scheme and path[:2] != '//' and self.uses_netloc): | |
1301 | _add('//') | |
1302 | if path: | |
1303 | if scheme and authority and path[:1] != '/': | |
1304 | _add('/') # relpaths with abs authorities auto get '/' | |
1305 | _add(path) | |
1306 | if query_string: | |
1307 | _add('?') | |
1308 | _add(query_string) | |
1309 | if fragment: | |
1310 | _add('#') | |
1311 | _add(fragment) | |
1312 | return u''.join(parts) | |
1313 | ||
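The part-joining at the end of `to_text()` follows the component-recomposition algorithm of RFC 3986 section 5.3. Stripped of this class's special cases (netloc-carrying schemes without an authority, authority-relative paths), the skeleton looks like this (`recompose` is a hypothetical name for illustration):

```python
def recompose(scheme, authority, path, query, fragment):
    """Assemble URL text from components per RFC 3986 section 5.3,
    mirroring the joining order used by to_text()."""
    parts = []
    if scheme:
        parts.append(scheme + u':')
    if authority is not None:
        parts.append(u'//' + authority)
    parts.append(path)
    if query is not None:
        parts.append(u'?' + query)
    if fragment is not None:
        parts.append(u'#' + fragment)
    return u''.join(parts)

print(recompose(u'http', u'example.com', u'/a/b', u'x=y', None))
# → http://example.com/a/b?x=y
```

The `is not None` checks matter: an empty-but-present component (e.g. a bare `?` or `#`) is distinct from an absent one.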
1314 | def __repr__(self): | |
1315 | """Convert this URL to a representation that shows all of its | |
1316 | constituent parts, as well as being a valid argument to | |
1317 | :func:`eval`. | |
1318 | """ | |
1319 | return '%s.from_text(%r)' % (self.__class__.__name__, self.to_text()) | |
1320 | ||
1321 | # # Begin Twisted Compat Code | |
1322 | asURI = to_uri | |
1323 | asIRI = to_iri | |
1324 | ||
1325 | @classmethod | |
1326 | def fromText(cls, s): | |
1327 | return cls.from_text(s) | |
1328 | ||
1329 | def asText(self, includeSecrets=False): | |
1330 | return self.to_text(with_password=includeSecrets) | |
1331 | ||
1332 | def __dir__(self): | |
1333 | try: | |
1334 | ret = object.__dir__(self) | |
1335 | except AttributeError: | |
1336 | # object.__dir__ does not exist on Python 2; fall back manually | |
1337 | ret = dir(self.__class__) + list(self.__dict__.keys()) | |
1338 | ret = sorted(set(ret) - set(['fromText', 'asURI', 'asIRI', 'asText'])) | |
1339 | return ret | |
1340 | ||
1341 | # # End Twisted Compat Code | |
1342 | ||
1343 | def add(self, name, value=None): | |
1344 | """Make a new :class:`URL` instance with a given query argument, | |
1345 | *name*, added to it with the value *value*, like so:: | |
1346 | ||
1347 | >>> URL.from_text(u'https://example.com/?x=y').add(u'x') | |
1348 | URL.from_text(u'https://example.com/?x=y&x') | |
1349 | >>> URL.from_text(u'https://example.com/?x=y').add(u'x', u'z') | |
1350 | URL.from_text(u'https://example.com/?x=y&x=z') | |
1351 | ||
1352 | Args: | |
1353 | name (unicode): The name of the query parameter to add. The | |
1354 | part before the ``=``. | |
1355 | value (unicode): The value of the query parameter to add. The | |
1356 | part after the ``=``. Defaults to ``None``, meaning no | |
1357 | value. | |
1358 | ||
1359 | Returns: | |
1360 | URL: A new :class:`URL` instance with the parameter added. | |
1361 | """ | |
1362 | return self.replace(query=self.query + ((name, value),)) | |
1363 | ||
1364 | def set(self, name, value=None): | |
1365 | """Make a new :class:`URL` instance with the query parameter *name* | |
1366 | set to *value*. All existing occurrences, if any, are replaced | |
1367 | by a single name-value pair. | |
1368 | ||
1369 | >>> URL.from_text(u'https://example.com/?x=y').set(u'x') | |
1370 | URL.from_text(u'https://example.com/?x') | |
1371 | >>> URL.from_text(u'https://example.com/?x=y').set(u'x', u'z') | |
1372 | URL.from_text(u'https://example.com/?x=z') | |
1373 | ||
1374 | Args: | |
1375 | name (unicode): The name of the query parameter to set. The | |
1376 | part before the ``=``. | |
1377 | value (unicode): The value of the query parameter to set. The | |
1378 | part after the ``=``. Defaults to ``None``, meaning no | |
1379 | value. | |
1380 | ||
1381 | Returns: | |
1382 | URL: A new :class:`URL` instance with the parameter set. | |
1383 | """ | |
1384 | # Preserve the original position of the query key in the list | |
1385 | q = [(k, v) for (k, v) in self.query if k != name] | |
1386 | idx = next((i for (i, (k, v)) in enumerate(self.query) | |
1387 | if k == name), len(q)) | |
1388 | q[idx:idx] = [(name, value)] | |
1389 | return self.replace(query=q) | |
1390 | ||
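The position-preserving replacement in `set()` can be sketched standalone. `set_param` below is a hypothetical mirror operating on a plain list of pairs, keeping the first occurrence's position when the name exists and appending at the end when it does not:

```python
def set_param(query, name, value):
    """Replace every pair named *name* with a single (name, value) pair,
    at the position of the first original occurrence."""
    filtered = [(k, v) for (k, v) in query if k != name]
    # index of the first occurrence in the original list; append if absent
    idx = next((i for i, (k, _) in enumerate(query) if k == name),
               len(filtered))
    filtered[idx:idx] = [(name, value)]
    return filtered

print(set_param([(u'a', u'1'), (u'x', u'y'), (u'x', u'z')], u'x', u'q'))
# → [('a', '1'), ('x', 'q')]
```

Keeping the key in place means repeatedly calling `set()` does not shuffle an existing query string's parameter order.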
1391 | def get(self, name): | |
1392 | """Get a list of values for the given query parameter, *name*:: | |
1393 | ||
1394 | >>> url = URL.from_text(u'?x=1&x=2') | |
1395 | >>> url.get('x') | |
1396 | [u'1', u'2'] | |
1397 | >>> url.get('y') | |
1398 | [] | |
1399 | ||
1400 | If the given *name* is not set, an empty list is returned. A | |
1401 | list is always returned, and this method raises no exceptions. | |
1402 | ||
1403 | Args: | |
1404 | name (unicode): The name of the query parameter to get. | |
1405 | ||
1406 | Returns: | |
1407 | list: A list of all the values associated with the key, in | |
1408 | string form. | |
1409 | ||
1410 | """ | |
1411 | return [value for (key, value) in self.query if name == key] | |
1412 | ||
1413 | def remove(self, name): | |
1414 | """Make a new :class:`URL` instance with all occurrences of the query | |
1415 | parameter *name* removed. No exception is raised if the | |
1416 | parameter is not already set. | |
1417 | ||
1418 | Args: | |
1419 | name (unicode): The name of the query parameter to remove. | |
1420 | ||
1421 | Returns: | |
1422 | URL: A new :class:`URL` instance with the parameter removed. | |
1423 | ||
1424 | """ | |
1425 | return self.replace(query=((k, v) for (k, v) in self.query | |
1426 | if k != name)) |
0 | ||
1 | ||
2 | from unittest import TestCase | |
3 | ||
4 | ||
5 | class HyperlinkTestCase(TestCase): | |
6 | """This type mostly exists to provide a backwards-compatible | |
7 | assertRaises method for Python 2.6 testing. | |
8 | """ | |
9 | def assertRaises(self, excClass, callableObj=None, *args, **kwargs): | |
10 | """Fail unless an exception of class excClass is raised | |
11 | by callableObj when invoked with arguments args and keyword | |
12 | arguments kwargs. If a different type of exception is | |
13 | raised, it will not be caught, and the test case will be | |
14 | deemed to have suffered an error, exactly as for an | |
15 | unexpected exception. | |
16 | ||
17 | If called with callableObj omitted or None, will return a | |
18 | context object used like this:: | |
19 | ||
20 | with self.assertRaises(SomeException): | |
21 | do_something() | |
22 | ||
23 | The context manager keeps a reference to the exception as | |
24 | the 'exception' attribute. This allows you to inspect the | |
25 | exception after the assertion:: | |
26 | ||
27 | with self.assertRaises(SomeException) as cm: | |
28 | do_something() | |
29 | the_exception = cm.exception | |
30 | self.assertEqual(the_exception.error_code, 3) | |
31 | """ | |
32 | context = _AssertRaisesContext(excClass, self) | |
33 | if callableObj is None: | |
34 | return context | |
35 | with context: | |
36 | callableObj(*args, **kwargs) | |
37 | ||
38 | ||
39 | class _AssertRaisesContext(object): | |
40 | "A context manager used to implement HyperlinkTestCase.assertRaises." | |
41 | ||
42 | def __init__(self, expected, test_case): | |
43 | self.expected = expected | |
44 | self.failureException = test_case.failureException | |
45 | ||
46 | def __enter__(self): | |
47 | return self | |
48 | ||
49 | def __exit__(self, exc_type, exc_value, tb): | |
50 | if exc_type is None: | |
51 | exc_name = self.expected.__name__ | |
52 | raise self.failureException("%s not raised" % (exc_name,)) | |
53 | if not issubclass(exc_type, self.expected): | |
54 | # let unexpected exceptions pass through | |
55 | return False | |
56 | self.exception = exc_value # store for later retrieval | |
57 | return True |
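`_AssertRaisesContext` works because of a detail of the context-manager protocol: a truthy return from `__exit__` suppresses the in-flight exception, while a falsy return lets it propagate. A minimal demonstration of the same mechanics (the `Suppress` class is illustrative, not part of the package):

```python
class Suppress(object):
    """Swallow a given exception type and record the instance --
    the same __exit__ mechanics _AssertRaisesContext relies on."""
    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.exception = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None and issubclass(exc_type, self.exc_type):
            self.exception = exc_value
            return True   # truthy return value suppresses the exception
        return False      # anything else propagates normally

with Suppress(ValueError) as cm:
    raise ValueError('boom')
print(cm.exception)  # → boom
```

`_AssertRaisesContext` adds one more wrinkle on top of this: if `__exit__` sees no exception at all, it raises the test's `failureException` itself.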
0 | """ | |
1 | Tests for hyperlink.test.common | |
2 | """ | |
3 | from unittest import TestCase | |
4 | from .common import HyperlinkTestCase | |
5 | ||
6 | ||
7 | class _ExpectedException(Exception): | |
8 | """An exception used to test HyperlinkTestCase.assertRaises. | |
9 | ||
10 | """ | |
11 | ||
12 | ||
13 | class _UnexpectedException(Exception): | |
14 | """An exception used to test HyperlinkTestCase.assertRaises. | |
15 | ||
16 | """ | |
17 | ||
18 | ||
19 | class TestHyperlink(TestCase): | |
20 | """Tests for HyperlinkTestCase""" | |
21 | ||
22 | def setUp(self): | |
23 | self.hyperlink_test = HyperlinkTestCase("run") | |
24 | ||
25 | def test_assertRaisesWithCallable(self): | |
26 | """HyperlinkTestCase.assertRaises does not raise an AssertionError | |
27 | when given a callable that, when called with the provided | |
28 | arguments, raises the expected exception. | |
29 | ||
30 | """ | |
31 | called_with = [] | |
32 | ||
33 | def raisesExpected(*args, **kwargs): | |
34 | called_with.append((args, kwargs)) | |
35 | raise _ExpectedException | |
36 | ||
37 | self.hyperlink_test.assertRaises(_ExpectedException, | |
38 | raisesExpected, 1, keyword=True) | |
39 | self.assertEqual(called_with, [((1,), {"keyword": True})]) | |
40 | ||
41 | def test_assertRaisesWithCallableUnexpectedException(self): | |
42 | """When given a callable that raises an unexpected exception, | |
43 | HyperlinkTestCase.assertRaises raises that exception. | |
44 | ||
45 | """ | |
46 | ||
47 | def doesNotRaiseExpected(*args, **kwargs): | |
48 | raise _UnexpectedException | |
49 | ||
50 | try: | |
51 | self.hyperlink_test.assertRaises(_ExpectedException, | |
52 | doesNotRaiseExpected) | |
53 | except _UnexpectedException: | |
54 | pass | |
55 | else: | |
56 | self.fail("_UnexpectedException was not raised") | |
55 | ||
56 | def test_assertRaisesWithCallableDoesNotRaise(self): | |
57 | """HyperlinkTestCase.assertRaises raises an AssertionError when given | |
58 | a callable that, when called, does not raise any exception. | |
59 | ||
60 | """ | |
61 | ||
62 | def doesNotRaise(*args, **kwargs): | |
63 | return True | |
64 | ||
65 | try: | |
66 | self.hyperlink_test.assertRaises(_ExpectedException, | |
67 | doesNotRaise) | |
68 | except AssertionError: | |
69 | pass | |
70 | else: | |
71 | self.fail("AssertionError was not raised") | |
70 | ||
71 | def test_assertRaisesContextManager(self): | |
72 | """HyperlinkTestCase.assertRaises does not raise an AssertionError | |
73 | when used as a context manager with a suite that raises the | |
74 | expected exception. The context manager stores the exception | |
75 | instance under its `exception` instance variable. | |
76 | ||
77 | """ | |
78 | with self.hyperlink_test.assertRaises(_ExpectedException) as cm: | |
79 | raise _ExpectedException | |
80 | ||
81 | self.assertTrue(isinstance(cm.exception, _ExpectedException)) | |
82 | ||
83 | def test_assertRaisesContextManagerUnexpectedException(self): | |
84 | """When used as a context manager with a block that raises an | |
85 | unexpected exception, HyperlinkTestCase.assertRaises raises | |
86 | that unexpected exception. | |
87 | ||
88 | """ | |
89 | try: | |
90 | with self.hyperlink_test.assertRaises(_ExpectedException): | |
91 | raise _UnexpectedException | |
92 | except _UnexpectedException: | |
93 | pass | |
94 | else: | |
95 | self.fail("_UnexpectedException was not raised") | |
94 | ||
95 | def test_assertRaisesContextManagerDoesNotRaise(self): | |
96 | """HyperlinkTestcase.assertRaises raises an AssertionError when used | |
97 | as a context manager with a block that does not raise any | |
98 | exception. | |
99 | ||
100 | """ | |
101 | try: | |
102 | with self.hyperlink_test.assertRaises(_ExpectedException): | |
103 | pass | |
104 | except AssertionError: | |
105 | pass | |
106 | else: | |
107 | self.fail("AssertionError was not raised") | |
0 | # -*- coding: utf-8 -*- | |
1 | from __future__ import unicode_literals | |
2 | ||
3 | ||
4 | from .. import _url | |
5 | from .common import HyperlinkTestCase | |
6 | from .._url import register_scheme, URL | |
7 | ||
8 | ||
9 | class TestSchemeRegistration(HyperlinkTestCase): | |
10 | ||
11 | def setUp(self): | |
12 | self._orig_scheme_port_map = dict(_url.SCHEME_PORT_MAP) | |
13 | self._orig_no_netloc_schemes = set(_url.NO_NETLOC_SCHEMES) | |
14 | ||
15 | def tearDown(self): | |
16 | _url.SCHEME_PORT_MAP = self._orig_scheme_port_map | |
17 | _url.NO_NETLOC_SCHEMES = self._orig_no_netloc_schemes | |
18 | ||
19 | def test_register_scheme_basic(self): | |
20 | register_scheme('deltron', uses_netloc=True, default_port=3030) | |
21 | ||
22 | u1 = URL.from_text('deltron://example.com') | |
23 | assert u1.scheme == 'deltron' | |
24 | assert u1.port == 3030 | |
25 | assert u1.uses_netloc is True | |
26 | ||
27 | # test netloc works even when the original gives no indication | |
28 | u2 = URL.from_text('deltron:') | |
29 | u2 = u2.replace(host='example.com') | |
30 | assert u2.to_text() == 'deltron://example.com' | |
31 | ||
32 | # test default port means no emission | |
33 | u3 = URL.from_text('deltron://example.com:3030') | |
34 | assert u3.to_text() == 'deltron://example.com' | |
35 | ||
36 | register_scheme('nonetron', default_port=3031) | |
37 | u4 = URL(scheme='nonetron') | |
38 | u4 = u4.replace(host='example.com') | |
39 | assert u4.to_text() == 'nonetron://example.com' | |
40 | ||
41 | def test_register_no_netloc_scheme(self): | |
42 | register_scheme('noloctron', uses_netloc=False) | |
43 | u4 = URL(scheme='noloctron') | |
44 | u4 = u4.replace(path=("example", "path")) | |
45 | assert u4.to_text() == 'noloctron:example/path' | |
46 | ||
47 | def test_register_no_netloc_with_port(self): | |
48 | with self.assertRaises(ValueError): | |
49 | register_scheme('badnetlocless', uses_netloc=False, default_port=7) | |
50 | ||
51 | def test_invalid_uses_netloc(self): | |
52 | with self.assertRaises(ValueError): | |
53 | register_scheme('badnetloc', uses_netloc=None) | |
54 | with self.assertRaises(ValueError): | |
55 | register_scheme('badnetloc', uses_netloc=object()) | |
56 | ||
57 | def test_register_invalid_uses_netloc(self): | |
58 | with self.assertRaises(ValueError): | |
59 | register_scheme('lol', uses_netloc=lambda: 'nope') | |
60 | ||
61 | def test_register_invalid_port(self): | |
62 | with self.assertRaises(ValueError): | |
63 | register_scheme('nope', default_port=lambda: 'lol') |
0 | # -*- coding: utf-8 -*- | |
1 | ||
2 | # Copyright (c) Twisted Matrix Laboratories. | |
3 | # See LICENSE for details. | |
4 | ||
5 | from __future__ import unicode_literals | |
6 | ||
7 | import socket | |
8 | ||
9 | from .common import HyperlinkTestCase | |
10 | from .. import URL, URLParseError | |
11 | # automatically import the py27 windows implementation when appropriate | |
12 | from .. import _url | |
13 | from .._url import inet_pton, SCHEME_PORT_MAP, parse_host | |
14 | ||
15 | unicode = type(u'') | |
16 | ||
17 | ||
18 | BASIC_URL = "http://www.foo.com/a/nice/path/?zot=23&zut" | |
19 | ||
20 | # Examples from RFC 3986 section 5.4, Reference Resolution Examples | |
21 | relativeLinkBaseForRFC3986 = 'http://a/b/c/d;p?q' | |
22 | relativeLinkTestsForRFC3986 = [ | |
23 | # "Normal" | |
24 | # ('g:h', 'g:h'), # can't click on a scheme-having url without an abs path | |
25 | ('g', 'http://a/b/c/g'), | |
26 | ('./g', 'http://a/b/c/g'), | |
27 | ('g/', 'http://a/b/c/g/'), | |
28 | ('/g', 'http://a/g'), | |
29 | ('//g', 'http://g'), | |
30 | ('?y', 'http://a/b/c/d;p?y'), | |
31 | ('g?y', 'http://a/b/c/g?y'), | |
32 | ('#s', 'http://a/b/c/d;p?q#s'), | |
33 | ('g#s', 'http://a/b/c/g#s'), | |
34 | ('g?y#s', 'http://a/b/c/g?y#s'), | |
35 | (';x', 'http://a/b/c/;x'), | |
36 | ('g;x', 'http://a/b/c/g;x'), | |
37 | ('g;x?y#s', 'http://a/b/c/g;x?y#s'), | |
38 | ('', 'http://a/b/c/d;p?q'), | |
39 | ('.', 'http://a/b/c/'), | |
40 | ('./', 'http://a/b/c/'), | |
41 | ('..', 'http://a/b/'), | |
42 | ('../', 'http://a/b/'), | |
43 | ('../g', 'http://a/b/g'), | |
44 | ('../..', 'http://a/'), | |
45 | ('../../', 'http://a/'), | |
46 | ('../../g', 'http://a/g'), | |
47 | ||
48 | # Abnormal examples | |
49 | # ".." cannot be used to change the authority component of a URI. | |
50 | ('../../../g', 'http://a/g'), | |
51 | ('../../../../g', 'http://a/g'), | |
52 | ||
53 | # Only include "." and ".." when they are only part of a larger segment, | |
54 | # not by themselves. | |
55 | ('/./g', 'http://a/g'), | |
56 | ('/../g', 'http://a/g'), | |
57 | ('g.', 'http://a/b/c/g.'), | |
58 | ('.g', 'http://a/b/c/.g'), | |
59 | ('g..', 'http://a/b/c/g..'), | |
60 | ('..g', 'http://a/b/c/..g'), | |
61 | # Unnecessary or nonsensical forms of "." and "..". | |
62 | ('./../g', 'http://a/b/g'), | |
63 | ('./g/.', 'http://a/b/c/g/'), | |
64 | ('g/./h', 'http://a/b/c/g/h'), | |
65 | ('g/../h', 'http://a/b/c/h'), | |
66 | ('g;x=1/./y', 'http://a/b/c/g;x=1/y'), | |
67 | ('g;x=1/../y', 'http://a/b/c/y'), | |
68 | # Separating the reference's query and fragment components from the path. | |
69 | ('g?y/./x', 'http://a/b/c/g?y/./x'), | |
70 | ('g?y/../x', 'http://a/b/c/g?y/../x'), | |
71 | ('g#s/./x', 'http://a/b/c/g#s/./x'), | |
72 | ('g#s/../x', 'http://a/b/c/g#s/../x') | |
73 | ] | |
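The table above reproduces the reference-resolution examples of RFC 3986 section 5.4. On Python 3, the standard library's `urllib.parse.urljoin` implements the same algorithm and can cross-check entries (Python 2's `urlparse` predates RFC 3986 and fails several of the abnormal cases):

```python
from urllib.parse import urljoin

base = 'http://a/b/c/d;p?q'
# a few of the RFC 3986 section 5.4 examples, as in the table above
assert urljoin(base, 'g') == 'http://a/b/c/g'
assert urljoin(base, '../g') == 'http://a/b/g'
assert urljoin(base, '?y') == 'http://a/b/c/d;p?y'
assert urljoin(base, '#s') == 'http://a/b/c/d;p?q#s'
print('urljoin agrees with the checked RFC 3986 examples')
```

This makes `urljoin` a handy independent oracle when extending the `click()` tests.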
74 | ||
75 | ||
76 | ROUNDTRIP_TESTS = ( | |
77 | "http://localhost", | |
78 | "http://localhost/", | |
79 | "http://127.0.0.1/", | |
80 | "http://[::127.0.0.1]/", | |
81 | "http://[::1]/", | |
82 | "http://localhost/foo", | |
83 | "http://localhost/foo/", | |
84 | "http://localhost/foo!!bar/", | |
85 | "http://localhost/foo%20bar/", | |
86 | "http://localhost/foo%2Fbar/", | |
87 | "http://localhost/foo?n", | |
88 | "http://localhost/foo?n=v", | |
89 | "http://localhost/foo?n=/a/b", | |
90 | "http://example.com/foo!@$bar?b!@z=123", | |
91 | "http://localhost/asd?a=asd%20sdf/345", | |
92 | "http://(%2525)/(%2525)?(%2525)&(%2525)=(%2525)#(%2525)", | |
93 | "http://(%C3%A9)/(%C3%A9)?(%C3%A9)&(%C3%A9)=(%C3%A9)#(%C3%A9)", | |
94 | "?sslrootcert=/Users/glyph/Downloads/rds-ca-2015-root.pem&sslmode=verify", | |
95 | ||
96 | # from boltons.urlutils' tests | |
97 | ||
98 | 'http://googlewebsite.com/e-shops.aspx', | |
99 | 'http://example.com:8080/search?q=123&business=Nothing%20Special', | |
100 | 'http://hatnote.com:9000/?arg=1&arg=2&arg=3', | |
101 | 'https://xn--bcher-kva.ch', | |
102 | 'http://xn--ggbla1c4e.xn--ngbc5azd/', | |
103 | 'http://tools.ietf.org/html/rfc3986#section-3.4', | |
104 | # 'http://wiki:pedia@hatnote.com', | |
105 | 'ftp://ftp.rfc-editor.org/in-notes/tar/RFCs0001-0500.tar.gz', | |
106 | 'http://[1080:0:0:0:8:800:200C:417A]/index.html', | |
107 | 'ssh://192.0.2.16:2222/', | |
108 | 'https://[::101.45.75.219]:80/?hi=bye', | |
109 | 'ldap://[::192.9.5.5]/dc=example,dc=com??sub?(sn=Jensen)', | |
110 | 'mailto:me@example.com?to=me@example.com&body=hi%20http://wikipedia.org', | |
111 | 'news:alt.rec.motorcycle', | |
112 | 'tel:+1-800-867-5309', | |
113 | 'urn:oasis:member:A00024:x', | |
114 | ('magnet:?xt=urn:btih:1a42b9e04e122b97a5254e3df77ab3c4b7da725f&dn=Puppy%' | |
115 | '20Linux%20precise-5.7.1.iso&tr=udp://tracker.openbittorrent.com:80&' | |
116 | 'tr=udp://tracker.publicbt.com:80&tr=udp://tracker.istole.it:6969&' | |
117 | 'tr=udp://tracker.ccc.de:80&tr=udp://open.demonii.com:1337'), | |
118 | ||
119 | # percent-encoded delimiters in percent-encodable fields | |
120 | ||
121 | 'https://%3A@example.com/', # colon in username | |
122 | 'https://%40@example.com/', # at sign in username | |
123 | 'https://%2f@example.com/', # slash in username | |
124 | 'https://a:%3a@example.com/', # colon in password | |
125 | 'https://a:%40@example.com/', # at sign in password | |
126 | 'https://a:%2f@example.com/', # slash in password | |
127 | 'https://a:%3f@example.com/', # question mark in password | |
128 | 'https://example.com/%2F/', # slash in path | |
129 | 'https://example.com/%3F/', # question mark in path | |
130 | 'https://example.com/%23/', # hash in path | |
131 | 'https://example.com/?%23=b', # hash in query param name | |
132 | 'https://example.com/?%3D=b', # equals in query param name | |
133 | 'https://example.com/?%26=b', # ampersand in query param name | |
134 | 'https://example.com/?a=%23', # hash in query param value | |
135 | 'https://example.com/?a=%26', # ampersand in query param value | |
136 | 'https://example.com/?a=%3D', # equals in query param value | |
137 | # double-encoded percent sign in all percent-encodable positions: | |
138 | "http://(%2525):(%2525)@example.com/(%2525)/?(%2525)=(%2525)#(%2525)", | |
139 | # colon in first part of schemeless relative url | |
140 | 'first_seg_rel_path__colon%3Anotok/second_seg__colon%3Aok', | |
141 | ) | |
142 | ||
143 | ||
144 | class TestURL(HyperlinkTestCase): | |
145 | """ | |
146 | Tests for L{URL}. | |
147 | """ | |
148 | ||
149 | def assertUnicoded(self, u): | |
150 | """ | |
151 | The given L{URL}'s components should be L{unicode}. | |
152 | ||
153 | @param u: The L{URL} to test. | |
154 | """ | |
155 | self.assertTrue(isinstance(u.scheme, unicode) or u.scheme is None, | |
156 | repr(u)) | |
157 | self.assertTrue(isinstance(u.host, unicode) or u.host is None, | |
158 | repr(u)) | |
159 | for seg in u.path: | |
160 | self.assertEqual(type(seg), unicode, repr(u)) | |
161 | for (k, v) in u.query: | |
162 | self.assertEqual(type(k), unicode, repr(u)) | |
163 | self.assertTrue(v is None or isinstance(v, unicode), repr(u)) | |
164 | self.assertEqual(type(u.fragment), unicode, repr(u)) | |
165 | ||
166 | def assertURL(self, u, scheme, host, path, query, | |
167 | fragment, port, userinfo=''): | |
168 | """ | |
169 | The given L{URL} should have the given components. | |
170 | ||
171 | @param u: The actual L{URL} to examine. | |
172 | ||
173 | @param scheme: The expected scheme. | |
174 | ||
175 | @param host: The expected host. | |
176 | ||
177 | @param path: The expected path. | |
178 | ||
179 | @param query: The expected query. | |
180 | ||
181 | @param fragment: The expected fragment. | |
182 | ||
183 | @param port: The expected port. | |
184 | ||
185 | @param userinfo: The expected userinfo. | |
186 | """ | |
187 | actual = (u.scheme, u.host, u.path, u.query, | |
188 | u.fragment, u.port, u.userinfo) | |
189 | expected = (scheme, host, tuple(path), tuple(query), | |
190 | fragment, port, userinfo) | |
191 | self.assertEqual(actual, expected) | |

    def test_initDefaults(self):
        """
        L{URL} should have appropriate default values.
        """
        def check(u):
            self.assertUnicoded(u)
            self.assertURL(u, 'http', '', [], [], '', 80, '')

        check(URL('http', ''))
        check(URL('http', '', [], []))
        check(URL('http', '', [], [], ''))

    def test_init(self):
        """
        L{URL} should accept L{unicode} parameters.
        """
        u = URL('s', 'h', ['p'], [('k', 'v'), ('k', None)], 'f')
        self.assertUnicoded(u)
        self.assertURL(u, 's', 'h', ['p'], [('k', 'v'), ('k', None)],
                       'f', None)

        self.assertURL(URL('http', '\xe0', ['\xe9'],
                           [('\u03bb', '\u03c0')], '\u22a5'),
                       'http', '\xe0', ['\xe9'],
                       [('\u03bb', '\u03c0')], '\u22a5', 80)

    def test_initPercent(self):
        """
        L{URL} should accept (and not interpret) percent characters.
        """
        u = URL('s', '%68', ['%70'], [('%6B', '%76'), ('%6B', None)],
                '%66')
        self.assertUnicoded(u)
        self.assertURL(u,
                       's', '%68', ['%70'],
                       [('%6B', '%76'), ('%6B', None)],
                       '%66', None)

    def test_repr(self):
        """
        L{URL.__repr__} will display the canonical form of the URL, wrapped in
        a L{URL.from_text} invocation, so that it is C{eval}-able but still
        easy to read.
        """
        self.assertEqual(
            repr(URL(scheme='http', host='foo', path=['bar'],
                     query=[('baz', None), ('k', 'v')],
                     fragment='frob')),
            "URL.from_text(%s)" % (repr(u"http://foo/bar?baz&k=v#frob"),)
        )

    def test_from_text(self):
        """
        Round-tripping L{URL.from_text} with C{str} results in an equivalent
        URL.
        """
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual(BASIC_URL, urlpath.to_text())

    def test_roundtrip(self):
        """
        L{URL.to_text} should invert L{URL.from_text}.
        """
        for test in ROUNDTRIP_TESTS:
            result = URL.from_text(test).to_text(with_password=True)
            self.assertEqual(test, result)

    def test_roundtrip_double_iri(self):
        """
        L{URL.to_iri} should be idempotent: applying it twice yields the
        same result as applying it once.
        """
        for test in ROUNDTRIP_TESTS:
            url = URL.from_text(test)
            iri = url.to_iri()
            double_iri = iri.to_iri()
            assert iri == double_iri

            iri_text = iri.to_text(with_password=True)
            double_iri_text = double_iri.to_text(with_password=True)
            assert iri_text == double_iri_text

    def test_equality(self):
        """
        Two URLs decoded using L{URL.from_text} will be equal (C{==}) if they
        decoded the same URL string, and unequal (C{!=}) if they decoded
        different strings.
        """
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual(urlpath, URL.from_text(BASIC_URL))
        self.assertNotEqual(
            urlpath,
            URL.from_text('ftp://www.anotherinvaliddomain.com/'
                          'foo/bar/baz/?zot=21&zut')
        )

    def test_fragmentEquality(self):
        """
        A URL created with the empty string for a fragment compares equal
        to a URL created with an unspecified fragment.
        """
        self.assertEqual(URL(fragment=''), URL())
        self.assertEqual(URL.from_text(u"http://localhost/#"),
                         URL.from_text(u"http://localhost/"))

    def test_child(self):
        """
        L{URL.child} appends a new path segment, but does not affect the query
        or fragment.
        """
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual("http://www.foo.com/a/nice/path/gong?zot=23&zut",
                         urlpath.child('gong').to_text())
        self.assertEqual("http://www.foo.com/a/nice/path/gong%2F?zot=23&zut",
                         urlpath.child('gong/').to_text())
        self.assertEqual(
            "http://www.foo.com/a/nice/path/gong%2Fdouble?zot=23&zut",
            urlpath.child('gong/double').to_text()
        )
        self.assertEqual(
            "http://www.foo.com/a/nice/path/gong%2Fdouble%2F?zot=23&zut",
            urlpath.child('gong/double/').to_text()
        )

    def test_multiChild(self):
        """
        L{URL.child} receives multiple segments as C{*args} and appends each
        in turn.
        """
        url = URL.from_text('http://example.com/a/b')
        self.assertEqual(url.child('c', 'd', 'e').to_text(),
                         'http://example.com/a/b/c/d/e')

    def test_childInitRoot(self):
        """
        L{URL.child} of a L{URL} without a path produces a L{URL} with a
        single path segment.
        """
        childURL = URL(host=u"www.foo.com").child(u"c")
        self.assertTrue(childURL.rooted)
        self.assertEqual("http://www.foo.com/c", childURL.to_text())

    def test_sibling(self):
        """
        L{URL.sibling} of a L{URL} replaces the last path segment, but does
        not affect the query or fragment.
        """
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual(
            "http://www.foo.com/a/nice/path/sister?zot=23&zut",
            urlpath.sibling('sister').to_text()
        )
        # Use a URL without a trailing '/' to check that the last segment
        # is removed.
        url_text = "http://www.foo.com/a/nice/path?zot=23&zut"
        urlpath = URL.from_text(url_text)
        self.assertEqual(
            "http://www.foo.com/a/nice/sister?zot=23&zut",
            urlpath.sibling('sister').to_text()
        )

    def test_click(self):
        """
        L{URL.click} interprets the given string as a relative URI-reference
        and returns a new L{URL} interpreting C{self} as the base absolute
        URI.
        """
        urlpath = URL.from_text(BASIC_URL)
        # A null URI reference should be valid (and return the same URL).
        self.assertEqual("http://www.foo.com/a/nice/path/?zot=23&zut",
                         urlpath.click("").to_text())
        # A simple relative path removes the query.
        self.assertEqual("http://www.foo.com/a/nice/path/click",
                         urlpath.click("click").to_text())
        # An absolute path replaces the path and query.
        self.assertEqual("http://www.foo.com/click",
                         urlpath.click("/click").to_text())
        # Replace just the query.
        self.assertEqual("http://www.foo.com/a/nice/path/?burp",
                         urlpath.click("?burp").to_text())
        # Clicking from one full URL to another should not generate '//'
        # between the authority and the path.
        self.assertTrue("//foobar" not in
                        urlpath.click('http://www.foo.com/foobar').to_text())

        # When clicking from a URL with no query to a URL with a query, the
        # query should be handled properly.
        u = URL.from_text('http://www.foo.com/me/noquery')
        self.assertEqual('http://www.foo.com/me/17?spam=158',
                         u.click('/me/17?spam=158').to_text())

        # Check that everything from the path onward is removed when the
        # clicked link has no path.
        u = URL.from_text('http://localhost/foo?abc=def')
        self.assertEqual(u.click('http://www.python.org').to_text(),
                         'http://www.python.org')

        # https://twistedmatrix.com/trac/ticket/8184
        u = URL.from_text('http://hatnote.com/a/b/../c/./d/e/..')
        res = 'http://hatnote.com/a/c/d/'
        self.assertEqual(u.click('').to_text(), res)

        # The default argument to click is equivalent to the empty string
        # above.
        self.assertEqual(u.click().to_text(), res)

        # Clicking on a URL instance also works.
        u = URL.from_text('http://localhost/foo/?abc=def')
        u2 = URL.from_text('bar')
        u3 = u.click(u2)
        self.assertEqual(u3.to_text(), 'http://localhost/foo/bar')

    def test_clickRFC3986(self):
        """
        L{URL.click} should correctly resolve the examples in RFC 3986.
        """
        base = URL.from_text(relativeLinkBaseForRFC3986)
        for (ref, expected) in relativeLinkTestsForRFC3986:
            self.assertEqual(base.click(ref).to_text(), expected)

    def test_clickSchemeRelPath(self):
        """
        L{URL.click} should not accept schemes with relative paths.
        """
        base = URL.from_text(relativeLinkBaseForRFC3986)
        self.assertRaises(NotImplementedError, base.click, 'g:h')
        self.assertRaises(NotImplementedError, base.click, 'http:h')

    def test_cloneUnchanged(self):
        """
        Verify that L{URL.replace} doesn't change any of the arguments it
        is passed.
        """
        urlpath = URL.from_text('https://x:1/y?z=1#A')
        self.assertEqual(urlpath.replace(urlpath.scheme,
                                         urlpath.host,
                                         urlpath.path,
                                         urlpath.query,
                                         urlpath.fragment,
                                         urlpath.port),
                         urlpath)
        self.assertEqual(urlpath.replace(), urlpath)

    def test_clickCollapse(self):
        """
        L{URL.click} collapses C{.} and C{..} according to RFC 3986 section
        5.2.4.
        """
        tests = [
            ['http://localhost/', '.', 'http://localhost/'],
            ['http://localhost/', '..', 'http://localhost/'],
            ['http://localhost/a/b/c', '.', 'http://localhost/a/b/'],
            ['http://localhost/a/b/c', '..', 'http://localhost/a/'],
            ['http://localhost/a/b/c', './d/e', 'http://localhost/a/b/d/e'],
            ['http://localhost/a/b/c', '../d/e', 'http://localhost/a/d/e'],
            ['http://localhost/a/b/c', '/./d/e', 'http://localhost/d/e'],
            ['http://localhost/a/b/c', '/../d/e', 'http://localhost/d/e'],
            ['http://localhost/a/b/c/', '../../d/e/',
             'http://localhost/a/d/e/'],
            ['http://localhost/a/./c', '../d/e', 'http://localhost/d/e'],
            ['http://localhost/a/./c/', '../d/e', 'http://localhost/a/d/e'],
            ['http://localhost/a/b/c/d', './e/../f/../g',
             'http://localhost/a/b/c/g'],
            ['http://localhost/a/b/c', 'd//e', 'http://localhost/a/b/d//e'],
        ]
        for start, click, expected in tests:
            actual = URL.from_text(start).click(click).to_text()
            self.assertEqual(
                actual,
                expected,
                "{start}.click({click}) => {actual} not {expected}".format(
                    start=start,
                    click=repr(click),
                    actual=actual,
                    expected=expected,
                )
            )

    def test_queryAdd(self):
        """
        L{URL.add} adds query parameters.
        """
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?foo=bar",
            URL.from_text("http://www.foo.com/a/nice/path/")
            .add(u"foo", u"bar").to_text())
        self.assertEqual(
            "http://www.foo.com/?foo=bar",
            URL(host=u"www.foo.com").add(u"foo", u"bar")
            .to_text())
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=23&zut&burp",
            urlpath.add(u"burp").to_text())
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=23&zut&burp=xxx",
            urlpath.add(u"burp", u"xxx").to_text())
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=23&zut&burp=xxx&zing",
            urlpath.add(u"burp", u"xxx").add(u"zing").to_text())
        # Note the inversion!
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=23&zut&zing&burp=xxx",
            urlpath.add(u"zing").add(u"burp", u"xxx").to_text())
        # Note the two values for the same name.
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=23&zut&burp=xxx&zot=32",
            urlpath.add(u"burp", u"xxx").add(u"zot", '32')
            .to_text())

    def test_querySet(self):
        """
        L{URL.set} replaces query parameters by name.
        """
        urlpath = URL.from_text(BASIC_URL)
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=32&zut",
            urlpath.set(u"zot", '32').to_text())
        # Replace name without value with name/value and vice-versa.
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot&zut=itworked",
            urlpath.set(u"zot").set(u"zut", u"itworked").to_text()
        )
        # Q: what happens when the query has two values and we replace?
        # A: we replace both values with a single one
        self.assertEqual(
            "http://www.foo.com/a/nice/path/?zot=32&zut",
            urlpath.add(u"zot", u"xxx").set(u"zot", '32').to_text()
        )

    def test_queryRemove(self):
        """
        L{URL.remove} removes all instances of a query parameter.
        """
        url = URL.from_text(u"https://example.com/a/b/?foo=1&bar=2&foo=3")
        self.assertEqual(
            url.remove(u"foo"),
            URL.from_text(u"https://example.com/a/b/?bar=2")
        )

    def test_parseEqualSignInParamValue(self):
        """
        Every C{=}-sign after the first in a query parameter is simply
        included in the value of the parameter.
        """
        u = URL.from_text('http://localhost/?=x=x=x')
        self.assertEqual(u.get(''), ['x=x=x'])
        self.assertEqual(u.to_text(), 'http://localhost/?=x%3Dx%3Dx')
        u = URL.from_text('http://localhost/?foo=x=x=x&bar=y')
        self.assertEqual(u.query, (('foo', 'x=x=x'), ('bar', 'y')))
        self.assertEqual(u.to_text(), 'http://localhost/?foo=x%3Dx%3Dx&bar=y')

    def test_empty(self):
        """
        An empty L{URL} should serialize as the empty string.
        """
        self.assertEqual(URL().to_text(), '')

    def test_justQueryText(self):
        """
        An L{URL} with query text should serialize as just query text.
        """
        u = URL(query=[(u"hello", u"world")])
        self.assertEqual(u.to_text(), '?hello=world')

    def test_identicalEqual(self):
        """
        L{URL} compares equal to itself.
        """
        u = URL.from_text('http://localhost/')
        self.assertEqual(u, u)

    def test_similarEqual(self):
        """
        URLs with equivalent components should compare equal.
        """
        u1 = URL.from_text('http://u@localhost:8080/p/a/t/h?q=p#f')
        u2 = URL.from_text('http://u@localhost:8080/p/a/t/h?q=p#f')
        self.assertEqual(u1, u2)

    def test_differentNotEqual(self):
        """
        L{URL}s that refer to different resources are both unequal (C{!=}) and
        also not equal (not C{==}).
        """
        u1 = URL.from_text('http://localhost/a')
        u2 = URL.from_text('http://localhost/b')
        self.assertFalse(u1 == u2, "%r != %r" % (u1, u2))
        self.assertNotEqual(u1, u2)

    def test_otherTypesNotEqual(self):
        """
        L{URL} is not equal (C{==}) to other types.
        """
        u = URL.from_text('http://localhost/')
        self.assertFalse(u == 42, "URL must not equal a number.")
        self.assertFalse(u == object(), "URL must not equal an object.")
        self.assertNotEqual(u, 42)
        self.assertNotEqual(u, object())

    def test_identicalNotUnequal(self):
        """
        Identical L{URL}s are not unequal (C{!=}) to each other.
        """
        u = URL.from_text('http://u@localhost:8080/p/a/t/h?q=p#f')
        self.assertFalse(u != u, "%r == itself" % u)

    def test_similarNotUnequal(self):
        """
        Structurally similar L{URL}s are not unequal (C{!=}) to each other.
        """
        u1 = URL.from_text('http://u@localhost:8080/p/a/t/h?q=p#f')
        u2 = URL.from_text('http://u@localhost:8080/p/a/t/h?q=p#f')
        self.assertFalse(u1 != u2, "%r == %r" % (u1, u2))

    def test_differentUnequal(self):
        """
        Structurally different L{URL}s are unequal (C{!=}) to each other.
        """
        u1 = URL.from_text('http://localhost/a')
        u2 = URL.from_text('http://localhost/b')
        self.assertTrue(u1 != u2, "%r == %r" % (u1, u2))

    def test_otherTypesUnequal(self):
        """
        L{URL} is unequal (C{!=}) to other types.
        """
        u = URL.from_text('http://localhost/')
        self.assertTrue(u != 42, "URL must differ from a number.")
        self.assertTrue(u != object(), "URL must differ from an object.")

    def test_asURI(self):
        """
        L{URL.asURI} produces a URI, converting any non-ASCII text into
        pure US-ASCII, and returns a new L{URL}.
        """
        unicodey = ('http://\N{LATIN SMALL LETTER E WITH ACUTE}.com/'
                    '\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}'
                    '?\N{LATIN SMALL LETTER A}\N{COMBINING ACUTE ACCENT}='
                    '\N{LATIN SMALL LETTER I}\N{COMBINING ACUTE ACCENT}'
                    '#\N{LATIN SMALL LETTER U}\N{COMBINING ACUTE ACCENT}')
        iri = URL.from_text(unicodey)
        uri = iri.asURI()
        self.assertEqual(iri.host, '\N{LATIN SMALL LETTER E WITH ACUTE}.com')
        self.assertEqual(iri.path[0],
                         '\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}')
        self.assertEqual(iri.to_text(), unicodey)
        expectedURI = 'http://xn--9ca.com/%C3%A9?%C3%A1=%C3%AD#%C3%BA'
        actualURI = uri.to_text()
        self.assertEqual(actualURI, expectedURI,
                         '%r != %r' % (actualURI, expectedURI))

    def test_asIRI(self):
        """
        L{URL.asIRI} decodes any percent-encoded text in the URI, making it
        more suitable for reading by humans, and returns a new L{URL}.
        """
        asciiish = 'http://xn--9ca.com/%C3%A9?%C3%A1=%C3%AD#%C3%BA'
        uri = URL.from_text(asciiish)
        iri = uri.asIRI()
        self.assertEqual(uri.host, 'xn--9ca.com')
        self.assertEqual(uri.path[0], '%C3%A9')
        self.assertEqual(uri.to_text(), asciiish)
        expectedIRI = ('http://\N{LATIN SMALL LETTER E WITH ACUTE}.com/'
                       '\N{LATIN SMALL LETTER E WITH ACUTE}'
                       '?\N{LATIN SMALL LETTER A WITH ACUTE}='
                       '\N{LATIN SMALL LETTER I WITH ACUTE}'
                       '#\N{LATIN SMALL LETTER U WITH ACUTE}')
        actualIRI = iri.to_text()
        self.assertEqual(actualIRI, expectedIRI,
                         '%r != %r' % (actualIRI, expectedIRI))

    def test_badUTF8AsIRI(self):
        """
        Bad UTF-8 in a path segment, query parameter, or fragment results in
        that portion of the URI remaining percent-encoded in the IRI.
        """
        urlWithBinary = 'http://xn--9ca.com/%00%FF/%C3%A9'
        uri = URL.from_text(urlWithBinary)
        iri = uri.asIRI()
        expectedIRI = ('http://\N{LATIN SMALL LETTER E WITH ACUTE}.com/'
                       '%00%FF/'
                       '\N{LATIN SMALL LETTER E WITH ACUTE}')
        actualIRI = iri.to_text()
        self.assertEqual(actualIRI, expectedIRI,
                         '%r != %r' % (actualIRI, expectedIRI))

    def test_alreadyIRIAsIRI(self):
        """
        A L{URL} composed of non-ASCII text will result in non-ASCII text.
        """
        unicodey = ('http://\N{LATIN SMALL LETTER E WITH ACUTE}.com/'
                    '\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}'
                    '?\N{LATIN SMALL LETTER A}\N{COMBINING ACUTE ACCENT}='
                    '\N{LATIN SMALL LETTER I}\N{COMBINING ACUTE ACCENT}'
                    '#\N{LATIN SMALL LETTER U}\N{COMBINING ACUTE ACCENT}')
        iri = URL.from_text(unicodey)
        alsoIRI = iri.asIRI()
        self.assertEqual(alsoIRI.to_text(), unicodey)

    def test_alreadyURIAsURI(self):
        """
        A L{URL} composed of encoded text will remain encoded.
        """
        expectedURI = 'http://xn--9ca.com/%C3%A9?%C3%A1=%C3%AD#%C3%BA'
        uri = URL.from_text(expectedURI)
        actualURI = uri.asURI().to_text()
        self.assertEqual(actualURI, expectedURI)

    def test_userinfo(self):
        """
        L{URL.from_text} will parse the C{userinfo} portion of the URI
        separately from the host and port.
        """
        url = URL.from_text(
            'http://someuser:somepassword@example.com/some-segment@ignore'
        )
        self.assertEqual(url.authority(True),
                         'someuser:somepassword@example.com')
        self.assertEqual(url.authority(False), 'someuser:@example.com')
        self.assertEqual(url.userinfo, 'someuser:somepassword')
        self.assertEqual(url.user, 'someuser')
        self.assertEqual(url.to_text(),
                         'http://someuser:@example.com/some-segment@ignore')
        self.assertEqual(
            url.replace(userinfo=u"someuser").to_text(),
            'http://someuser@example.com/some-segment@ignore'
        )

    def test_portText(self):
        """
        L{URL.from_text} parses custom port numbers as integers.
        """
        portURL = URL.from_text(u"http://www.example.com:8080/")
        self.assertEqual(portURL.port, 8080)
        self.assertEqual(portURL.to_text(), u"http://www.example.com:8080/")

    def test_mailto(self):
        """
        Although L{URL} instances are mainly for dealing with HTTP, other
        schemes (such as C{mailto:}) should work as well. For example,
        L{URL.from_text}/L{URL.to_text} round-trips cleanly for a C{mailto:}
        URL representing an email address.
        """
        self.assertEqual(URL.from_text(u"mailto:user@example.com").to_text(),
                         u"mailto:user@example.com")

    def test_queryIterable(self):
        """
        When a L{URL} is created with a C{query} argument, the C{query}
        argument is converted into an N-tuple of 2-tuples.
        """
        url = URL(query=[['alpha', 'beta']])
        self.assertEqual(url.query, (('alpha', 'beta'),))

    def test_pathIterable(self):
        """
        When a L{URL} is created with a C{path} argument, the C{path} is
        converted into a tuple.
        """
        url = URL(path=['hello', 'world'])
        self.assertEqual(url.path, ('hello', 'world'))

    def test_invalidArguments(self):
        """
        Passing an argument of the wrong type to any of the constructor
        arguments of L{URL} will raise a descriptive L{TypeError}.

        L{URL} typechecks very aggressively to ensure that its constituent
        parts are all properly immutable and to prevent confusing errors when
        bad data crops up in a method call long after the code that called the
        constructor is off the stack.
        """
        class Unexpected(object):
            def __str__(self):
                return "wrong"

            def __repr__(self):
                return "<unexpected>"

        defaultExpectation = "unicode" if bytes is str else "str"

        def assertRaised(raised, expectation, name):
            self.assertEqual(str(raised.exception),
                             "expected {0} for {1}, got {2}".format(
                                 expectation, name, "<unexpected>"))

        def check(param, expectation=defaultExpectation):
            with self.assertRaises(TypeError) as raised:
                URL(**{param: Unexpected()})

            assertRaised(raised, expectation, param)

        check("scheme")
        check("host")
        check("fragment")
        check("rooted", "bool")
        check("userinfo")
        check("port", "int or NoneType")

        with self.assertRaises(TypeError) as raised:
            URL(path=[Unexpected()])

        assertRaised(raised, defaultExpectation, "path segment")

        with self.assertRaises(TypeError) as raised:
            URL(query=[(u"name", Unexpected())])

        assertRaised(raised, defaultExpectation + " or NoneType",
                     "query parameter value")

        with self.assertRaises(TypeError) as raised:
            URL(query=[(Unexpected(), u"value")])

        assertRaised(raised, defaultExpectation, "query parameter name")
        # No custom error message for this one, just want to make sure
        # non-2-tuples don't get through.

        with self.assertRaises(TypeError):
            URL(query=[Unexpected()])

        with self.assertRaises(ValueError):
            URL(query=[('k', 'v', 'vv')])

        with self.assertRaises(ValueError):
            URL(query=[('k',)])

        url = URL.from_text("https://valid.example.com/")
        with self.assertRaises(TypeError) as raised:
            url.child(Unexpected())
        assertRaised(raised, defaultExpectation, "path segment")
        with self.assertRaises(TypeError) as raised:
            url.sibling(Unexpected())
        assertRaised(raised, defaultExpectation, "path segment")
        with self.assertRaises(TypeError) as raised:
            url.click(Unexpected())
        assertRaised(raised, defaultExpectation, "relative URL")

    def test_technicallyTextIsIterableBut(self):
        """
        Technically, L{str} (or L{unicode}, as appropriate) is iterable, but
        C{URL(path="foo")} resulting in C{URL.from_text("f/o/o")} is never
        what you want.
        """
        with self.assertRaises(TypeError) as raised:
            URL(path='foo')
        self.assertEqual(
            str(raised.exception),
            "expected iterable of text for path, not: {0}"
            .format(repr('foo'))
        )

    def test_netloc(self):
        """
        C{uses_netloc} is inferred from the scheme where the scheme is
        known, and from the parsed text otherwise.
        """
        url = URL(scheme='https')
        self.assertEqual(url.uses_netloc, True)

        url = URL(scheme='git+https')
        self.assertEqual(url.uses_netloc, True)

        url = URL(scheme='mailto')
        self.assertEqual(url.uses_netloc, False)

        url = URL(scheme='ztp')
        self.assertEqual(url.uses_netloc, None)

        url = URL.from_text('ztp://test.com')
        self.assertEqual(url.uses_netloc, True)

        url = URL.from_text('ztp:test:com')
        self.assertEqual(url.uses_netloc, False)

    def test_ipv6_with_port(self):
        t = 'https://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80/'
        url = URL.from_text(t)
        assert url.host == '2001:0db8:85a3:0000:0000:8a2e:0370:7334'
        assert url.port == 80
        assert SCHEME_PORT_MAP[url.scheme] != url.port

    def test_basic(self):
        text = 'https://user:pass@example.com/path/to/here?k=v#nice'
        url = URL.from_text(text)
        assert url.scheme == 'https'
        assert url.userinfo == 'user:pass'
        assert url.host == 'example.com'
        assert url.path == ('path', 'to', 'here')
        assert url.fragment == 'nice'

        text = 'https://user:pass@127.0.0.1/path/to/here?k=v#nice'
        url = URL.from_text(text)
        assert url.scheme == 'https'
        assert url.userinfo == 'user:pass'
        assert url.host == '127.0.0.1'
        assert url.path == ('path', 'to', 'here')

        text = 'https://user:pass@[::1]/path/to/here?k=v#nice'
        url = URL.from_text(text)
        assert url.scheme == 'https'
        assert url.userinfo == 'user:pass'
        assert url.host == '::1'
        assert url.path == ('path', 'to', 'here')

    def test_invalid_url(self):
        self.assertRaises(URLParseError, URL.from_text, '#\n\n')

    def test_invalid_authority_url(self):
        self.assertRaises(URLParseError, URL.from_text, 'http://abc:\n\n/#')

    def test_invalid_ipv6(self):
        invalid_ipv6_ips = ['2001::0234:C1ab::A0:aabc:003F',
                            '2001::1::3F',
                            ':',
                            '::::',
                            '::256.0.0.1']
        for ip in invalid_ipv6_ips:
            url_text = 'http://[' + ip + ']'
            self.assertRaises(socket.error, inet_pton,
                              socket.AF_INET6, ip)
            self.assertRaises(URLParseError, URL.from_text, url_text)

    def test_invalid_port(self):
        self.assertRaises(URLParseError, URL.from_text,
                          'ftp://portmouth:smash')
        self.assertRaises(ValueError, URL.from_text,
                          'http://reader.googlewebsite.com:neverforget')

    def test_idna(self):
        u1 = URL.from_text('http://bücher.ch')
        self.assertEqual(u1.host, 'bücher.ch')
        self.assertEqual(u1.to_text(), 'http://bücher.ch')
        self.assertEqual(u1.to_uri().to_text(), 'http://xn--bcher-kva.ch')

        u2 = URL.from_text('https://xn--bcher-kva.ch')
        self.assertEqual(u2.host, 'xn--bcher-kva.ch')
        self.assertEqual(u2.to_text(), 'https://xn--bcher-kva.ch')
        self.assertEqual(u2.to_iri().to_text(), u'https://bücher.ch')

    def test_netloc_slashes(self):
        # basic sanity checks
        url = URL.from_text('mailto:mahmoud@hatnote.com')
        self.assertEqual(url.scheme, 'mailto')
        self.assertEqual(url.to_text(), 'mailto:mahmoud@hatnote.com')

        url = URL.from_text('http://hatnote.com')
        self.assertEqual(url.scheme, 'http')
        self.assertEqual(url.to_text(), 'http://hatnote.com')

        # test that unrecognized schemes stay consistent with '//'
        url = URL.from_text('newscheme:a:b:c')
        self.assertEqual(url.scheme, 'newscheme')
        self.assertEqual(url.to_text(), 'newscheme:a:b:c')

        url = URL.from_text('newerscheme://a/b/c')
        self.assertEqual(url.scheme, 'newerscheme')
        self.assertEqual(url.to_text(), 'newerscheme://a/b/c')

        # test that reasonable guesses are made
        url = URL.from_text('git+ftp://gitstub.biz/glyph/lefkowitz')
        self.assertEqual(url.scheme, 'git+ftp')
        self.assertEqual(url.to_text(),
                         'git+ftp://gitstub.biz/glyph/lefkowitz')

        url = URL.from_text('what+mailto:freerealestate@enotuniq.org')
        self.assertEqual(url.scheme, 'what+mailto')
        self.assertEqual(url.to_text(),
                         'what+mailto:freerealestate@enotuniq.org')

        url = URL(scheme='ztp', path=('x', 'y', 'z'), rooted=True)
        self.assertEqual(url.to_text(), 'ztp:/x/y/z')

        # also works when the input doesn't include '//'
        url = URL(scheme='git+ftp', path=('x', 'y', 'z', ''),
                  rooted=True, uses_netloc=True)
        # would be broken with the stdlib's urlunsplit
        self.assertEqual(url.to_text(), 'git+ftp:///x/y/z/')

        # an unlikely case, but scheme replacement should still work
        url = URL.from_text('file:///path/to/heck')
        url2 = url.replace(scheme='mailto')
        self.assertEqual(url2.to_text(), 'mailto:/path/to/heck')

        url_text = 'unregisteredscheme:///a/b/c'
        url = URL.from_text(url_text)
        no_netloc_url = url.replace(uses_netloc=False)
        self.assertEqual(no_netloc_url.to_text(),
                         'unregisteredscheme:/a/b/c')
        netloc_url = url.replace(uses_netloc=True)
        self.assertEqual(netloc_url.to_text(), url_text)

    def test_wrong_constructor(self):
        with self.assertRaises(ValueError):
            # whole URL not allowed
            URL(BASIC_URL)
        with self.assertRaises(ValueError):
            # explicitly bad scheme not allowed
            URL('HTTP_____more_like_imHoTTeP')

    def test_encoded_userinfo(self):
        url = URL.from_text('http://user:pass@example.com')
        assert url.userinfo == 'user:pass'
        url = url.replace(userinfo='us%20her:pass')
        iri = url.to_iri()
        assert (iri.to_text(with_password=True)
                == 'http://us her:pass@example.com')
        assert (iri.to_text(with_password=False)
                == 'http://us her:@example.com')
        assert (iri.to_uri().to_text(with_password=True)
                == 'http://us%20her:pass@example.com')

    def test_hash(self):
        url_map = {}
        url1 = URL.from_text('http://blog.hatnote.com/ask?utm_source=geocity')
        assert hash(url1) == hash(url1)  # sanity

        url_map[url1] = 1

        url2 = URL.from_text('http://blog.hatnote.com/ask')
        url2 = url2.set('utm_source', 'geocity')

        url_map[url2] = 2

        assert len(url_map) == 1
        assert list(url_map.values()) == [2]

        assert hash(URL()) == hash(URL())  # slightly more sanity

    def test_dir(self):
        url = URL()
        res = dir(url)

        assert len(res) > 15
        # Twisted-style camelCase aliases should not appear in dir()
        assert 'fromText' not in res
        assert 'asText' not in res
        assert 'asURI' not in res
        assert 'asIRI' not in res

    def test_twisted_compat(self):
        url = URL.fromText(u'http://example.com/a%20té%C3%A9st')
        assert url.asText() == u'http://example.com/a%20té%C3%A9st'
        assert (url.asURI().asText()
                == u'http://example.com/a%20t%C3%A9%C3%A9st')
        # TODO: assert url.asIRI().asText() == u'http://example.com/a%20téést'

    def test_set_ordering(self):
        # TODO
        url = URL.from_text('http://example.com/?a=b&c')
        url = url.set(u'x', u'x')
        url = url.add(u'x', u'y')
        assert url.to_text() == u'http://example.com/?a=b&x=x&c&x=y'
        # would expect:
        # assert url.to_text() == u'http://example.com/?a=b&c&x=x&x=y'

    def test_schemeless_path(self):
        "See issue #4"
        u1 = URL.from_text("urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob")
        u2 = URL.from_text(u1.to_text())
        assert u1 == u2  # sanity check for roundtripping

        u3 = URL.from_text(u1.to_iri().to_text())
        assert u1 == u3
        assert u2 == u3

        # test that colons are OK past the first segment
        u4 = URL.from_text("first-segment/urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob")
        u5 = u4.to_iri()
        assert u5.to_text() == u'first-segment/urn:ietf:wg:oauth:2.0:oob'

        u6 = URL.from_text(u5.to_text()).to_uri()
        # colons stay decoded because they're not in the first segment
        assert u5 == u6

    def test_emoji_domain(self):
        "See issue #7, affecting only narrow builds (2.6-3.3)"
        url = URL.from_text('https://xn--vi8hiv.ws')
        iri = url.to_iri()
        iri.to_text()
        # as long as we don't get ValueErrors, we're good

    def test_delim_in_param(self):
        "Per issues #6 and #8"
        self.assertRaises(ValueError, URL, scheme=u'http', host=u'a/c')
        self.assertRaises(ValueError, URL, path=(u"?",))
        self.assertRaises(ValueError, URL, path=(u"#",))
        self.assertRaises(ValueError, URL, query=((u"&", "test"),))

    def test_empty_paths_eq(self):
        # all four slash/no-slash combinations should compare equal
        u1 = URL.from_text('http://example.com/')
        u2 = URL.from_text('http://example.com')
        assert u1 == u2

        u1 = URL.from_text('http://example.com')
        u2 = URL.from_text('http://example.com')
        assert u1 == u2

        u1 = URL.from_text('http://example.com')
        u2 = URL.from_text('http://example.com/')
        assert u1 == u2

        u1 = URL.from_text('http://example.com/')
        u2 = URL.from_text('http://example.com/')
        assert u1 == u2

    def test_from_text_type(self):
        assert URL.from_text(u'#ok').fragment == u'ok'  # sanity
        self.assertRaises(TypeError, URL.from_text, b'bytes://x.y.z')
        self.assertRaises(TypeError, URL.from_text, object())

    def test_from_text_bad_authority(self):
        # bad ipv6 brackets
        self.assertRaises(URLParseError, URL.from_text, 'http://[::1/')
        self.assertRaises(URLParseError, URL.from_text, 'http://::1]/')
        self.assertRaises(URLParseError, URL.from_text, 'http://[[::1]/')
        self.assertRaises(URLParseError, URL.from_text, 'http://[::1]]/')

        # empty port
        self.assertRaises(URLParseError, URL.from_text, 'http://127.0.0.1:')
        # non-integer port
        self.assertRaises(URLParseError, URL.from_text, 'http://127.0.0.1:hi')
        # extra port colon (makes for an invalid host)
        self.assertRaises(URLParseError, URL.from_text, 'http://127.0.0.1::80')

    def test_normalize(self):
        url = URL.from_text('HTTP://Example.com/A%61/./../A%61?B%62=C%63#D%64')
        assert url.get('Bb') == []
        assert url.get('B%62') == ['C%63']
        assert len(url.path) == 4

        # test that most expected normalizations happen
        norm_url = url.normalize()

        assert norm_url.scheme == 'http'
        assert norm_url.host == 'example.com'
        assert norm_url.path == ('Aa',)
        assert norm_url.get('Bb') == ['Cc']
        assert norm_url.fragment == 'Dd'
        assert norm_url.to_text() == 'http://example.com/Aa?Bb=Cc#Dd'

        # test that the per-part flags work
        noop_norm_url = url.normalize(scheme=False, host=False,
                                      path=False, query=False, fragment=False)
        assert noop_norm_url == url

        # test that empty paths get at least one slash
        slashless_url = URL.from_text('http://example.io')
        slashful_url = slashless_url.normalize()
        assert slashful_url.to_text() == 'http://example.io/'

        # test case normalization of percent encoding
        delimited_url = URL.from_text('/a%2fb/cd%3f?k%3d=v%23#test')
        norm_delimited_url = delimited_url.normalize()
        assert norm_delimited_url.to_text() == '/a%2Fb/cd%3F?k%3D=v%23#test'

        # test that invalid percent encoding survives normalize unchanged
        assert URL(path=('', '%te%sts')).normalize().to_text() == '/%te%sts'
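The percent-encoding case normalization exercised by `test_normalize` above (e.g. `%2f` becoming `%2F` while malformed escapes like `%te` pass through) can be sketched with the stdlib alone; `normalize_pct` is a hypothetical helper for illustration, not part of hyperlink's API:

```python
import re

def normalize_pct(text):
    # Uppercase the hex digits of well-formed percent-escapes, per
    # RFC 3986 section 6.2.2.1; malformed escapes are left untouched.
    return re.sub(r'%[0-9a-fA-F]{2}', lambda m: m.group(0).upper(), text)

print(normalize_pct('/a%2fb/cd%3f?k%3d=v%23'))  # -> /a%2Fb/cd%3F?k%3D=v%23
print(normalize_pct('/%te%sts'))                # -> /%te%sts
```

Only escapes that are syntactically valid (`%` plus exactly two hex digits) are touched, which mirrors the behavior the last assertion in `test_normalize` checks.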
Metadata-Version: 1.1
Name: hyperlink
Version: 17.3.1
Summary: A featureful, correct URL for Python.
Home-page: https://github.com/python-hyper/hyperlink
Author: Mahmoud Hashemi and Glyph Lefkowitz
Author-email: mahmoud@hatnote.com
License: MIT
Description: The humble, but powerful, URL runs everything around us. Chances
        are you've used several just to read this text.

        Hyperlink is a featureful, pure-Python implementation of the URL, with
        an emphasis on correctness. MIT licensed.

        See the docs at http://hyperlink.readthedocs.io.

Platform: any
Classifier: Topic :: Utilities
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: Implementation :: PyPy
.tox-coveragerc
CHANGELOG.md
LICENSE
MANIFEST.in
README.md
pytest.ini
requirements-test.txt
setup.cfg
setup.py
tox.ini
docs/Makefile
docs/api.rst
docs/conf.py
docs/design.rst
docs/faq.rst
docs/hyperlink_logo_proto.png
docs/hyperlink_logo_v1.png
docs/index.rst
docs/make.bat
docs/_templates/page.html
hyperlink/__init__.py
hyperlink/_url.py
hyperlink.egg-info/PKG-INFO
hyperlink.egg-info/SOURCES.txt
hyperlink.egg-info/dependency_links.txt
hyperlink.egg-info/not-zip-safe
hyperlink.egg-info/top_level.txt
hyperlink/test/__init__.py
hyperlink/test/common.py
hyperlink/test/test_common.py
hyperlink/test/test_scheme_registration.py
hyperlink/test/test_url.py
hyperlink
0 | """The humble, but powerful, URL runs everything around us. Chances | |
1 | are you've used several just to read this text. | |
2 | ||
3 | Hyperlink is a featureful, pure-Python implementation of the URL, with | |
4 | an emphasis on correctness. BSD licensed. | |
5 | ||
6 | See the docs at http://hyperlink.readthedocs.io. | |
7 | """ | |
8 | ||
9 | from setuptools import setup | |
10 | ||
11 | ||
12 | __author__ = 'Mahmoud Hashemi and Glyph Lefkowitz' | |
13 | __version__ = '17.3.1' | |
14 | __contact__ = 'mahmoud@hatnote.com' | |
15 | __url__ = 'https://github.com/python-hyper/hyperlink' | |
16 | __license__ = 'MIT' | |
17 | ||
18 | ||
19 | setup(name='hyperlink', | |
20 | version=__version__, | |
21 | description="A featureful, correct URL for Python.", | |
22 | long_description=__doc__, | |
23 | author=__author__, | |
24 | author_email=__contact__, | |
25 | url=__url__, | |
26 | packages=['hyperlink', 'hyperlink.test'], | |
27 | include_package_data=True, | |
28 | zip_safe=False, | |
29 | license=__license__, | |
30 | platforms='any', | |
31 | classifiers=[ | |
32 | 'Topic :: Utilities', | |
33 | 'Intended Audience :: Developers', | |
34 | 'Topic :: Software Development :: Libraries', | |
35 | 'Development Status :: 5 - Production/Stable', | |
36 | 'Programming Language :: Python :: 2.6', | |
37 | 'Programming Language :: Python :: 2.7', | |
38 | 'Programming Language :: Python :: 3.4', | |
39 | 'Programming Language :: Python :: 3.5', | |
40 | 'Programming Language :: Python :: 3.6', | |
41 | 'Programming Language :: Python :: Implementation :: PyPy', ] | |
42 | ) | |
43 | ||
44 | """ | |
45 | A brief checklist for release: | |
46 | ||
47 | * tox | |
48 | * git commit (if applicable) | |
49 | * Bump setup.py version off of -dev | |
50 | * git commit -a -m "bump version for x.y.z release" | |
51 | * python setup.py sdist bdist_wheel upload | |
52 | * bump docs/conf.py version | |
53 | * git commit | |
54 | * git tag -a vx.y.z -m "brief summary" | |
55 | * write CHANGELOG | |
56 | * git commit | |
57 | * bump setup.py version onto n+1 dev | |
58 | * git commit | |
59 | * git push | |
60 | ||
61 | """ |
[tox]
envlist = py26,py27,py34,py35,py36,pypy,coverage-report,packaging

[testenv]
changedir = .tox
deps = -rrequirements-test.txt
commands = coverage run --parallel --rcfile {toxinidir}/.tox-coveragerc -m pytest --doctest-modules {envsitepackagesdir}/hyperlink {posargs}

# Uses the default basepython; otherwise reporting doesn't work on Travis,
# where Python 3.6 is only available in 3.6 jobs.
[testenv:coverage-report]
changedir = .tox
deps = coverage
commands = coverage combine --rcfile {toxinidir}/.tox-coveragerc
           coverage report --rcfile {toxinidir}/.tox-coveragerc
           coverage html --rcfile {toxinidir}/.tox-coveragerc -d {toxinidir}/htmlcov


[testenv:packaging]
changedir = {toxinidir}
deps =
    check-manifest==0.35
    readme_renderer==17.2
commands =
    check-manifest
    python setup.py check --metadata --restructuredtext --strict