python-geopandas / f0946d5
Update upstream source from tag 'upstream/0.11.0'
Update to upstream version '0.11.0' with Debian dir 811a86892cc89d887b82a63bdf1a3dc151669a3b
Bas Couwenberg, 1 year, 10 months ago
103 changed files with 5431 additions and 2548 deletions.
99 geopandas/io/tests/*
1010 geopandas/tools/tests/*
1111 geopandas/_version.py
12 geopandas/datasets/naturalearth_creation.py
1010
1111 - [ ] I have confirmed this bug exists on the latest version of geopandas.
1212
13 - [ ] (optional) I have confirmed this bug exists on the master branch of geopandas.
13 - [ ] (optional) I have confirmed this bug exists on the main branch of geopandas.
1414
1515 ---
1616
1818 with:
1919 python-version: "3.x"
2020
21 - name: Build a binary wheel and a source tarball
21 - name: Build source and wheel distributions
2222 run: |
23 python -m pip install --upgrade pip
24 pip install setuptools wheel
25 python setup.py sdist bdist_wheel
23 python -m pip install --upgrade build twine
24 python -m build
25 twine check --strict dist/*
2626
2727 - name: Publish distribution to PyPI
2828 uses: pypa/gh-action-pypi-publish@master
11
22 on:
33 push:
4 branches: [master]
4 branches: [main, 0.**]
55 pull_request:
6 branches: [master]
6 branches: [main, 0.**]
77 schedule:
88 - cron: "0 0 * * *"
99
10 concurrency:
10 concurrency:
1111 group: ${{ github.workflow }}-${{ github.ref }}
1212 cancel-in-progress: true
1313
1616 runs-on: ubuntu-latest
1717
1818 steps:
19 - uses: actions/checkout@v2
20 - uses: actions/setup-python@v2
21 - uses: pre-commit/action@v2.0.0
19 - uses: actions/checkout@v3
20 - uses: actions/setup-python@v3
21 - uses: pre-commit/action@v2.0.3
2222
2323 Test:
2424 needs: Linting
3434 postgis: [false]
3535 dev: [false]
3636 env:
37 - ci/envs/37-minimal.yaml
38 - ci/envs/38-no-optional-deps.yaml
39 - ci/envs/37-pd10.yaml
40 - ci/envs/37-latest-defaults.yaml
41 - ci/envs/37-latest-conda-forge.yaml
37 - ci/envs/38-minimal.yaml
38 - ci/envs/39-no-optional-deps.yaml
39 - ci/envs/38-pd11-defaults.yaml
40 - ci/envs/38-latest-defaults.yaml
4241 - ci/envs/38-latest-conda-forge.yaml
42 - ci/envs/39-pd12-conda-forge.yaml
4343 - ci/envs/39-latest-conda-forge.yaml
44 - ci/envs/310-latest-conda-forge.yaml
4445 include:
45 - env: ci/envs/37-latest-conda-forge.yaml
46 os: macos-latest
47 postgis: false
48 dev: false
4946 - env: ci/envs/38-latest-conda-forge.yaml
5047 os: macos-latest
5148 postgis: false
5249 dev: false
53 - env: ci/envs/37-latest-conda-forge.yaml
54 os: windows-latest
50 - env: ci/envs/39-latest-conda-forge.yaml
51 os: macos-latest
5552 postgis: false
5653 dev: false
5754 - env: ci/envs/38-latest-conda-forge.yaml
5855 os: windows-latest
5956 postgis: false
6057 dev: false
61 - env: ci/envs/38-dev.yaml
58 - env: ci/envs/39-latest-conda-forge.yaml
59 os: windows-latest
60 postgis: false
61 dev: false
62 - env: ci/envs/310-dev.yaml
6263 os: ubuntu-latest
6364 dev: true
6465
6566 steps:
66 - uses: actions/checkout@v2
67 - uses: actions/checkout@v3
6768
6869 - name: Setup Conda
6970 uses: conda-incubator/setup-miniconda@v2
7071 with:
7172 environment-file: ${{ matrix.env }}
73 miniforge-version: latest
74 miniforge-variant: Mambaforge
75 use-mamba: true
7276
7377 - name: Check and Log Environment
7478 run: |
101105 pytest -v -r s -n auto --color=yes --cov=geopandas --cov-append --cov-report term-missing --cov-report xml geopandas/
102106
103107 - name: Test with PostGIS
104 if: contains(matrix.env, '38-latest-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
108 if: contains(matrix.env, '39-pd12-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
105109 env:
106110 PGUSER: postgres
107111 PGPASSWORD: postgres
113117 pytest -v -r s --color=yes --cov=geopandas --cov-append --cov-report term-missing --cov-report xml geopandas/io/tests/test_sql.py | tee /dev/stderr | if grep SKIPPED >/dev/null;then echo "TESTS SKIPPED, FAILING" && exit 1;fi
114118
115119 - name: Test docstrings
116 if: contains(matrix.env, '38-latest-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
120 if: contains(matrix.env, '39-pd12-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
117121 env:
118122 USE_PYGEOS: 1
119123 run: |
120124 pytest -v --color=yes --doctest-only geopandas --ignore=geopandas/datasets
121125
122 - uses: codecov/codecov-action@v1
126 - uses: codecov/codecov-action@v2
6565
6666 geopandas.egg-info
6767 geopandas/version.py
68 geopandas/datasets/ne_110m_admin_0_countries.zip
6869
6970 .asv
7071 doc/source/getting_started/my_file.geojson
00 files: 'geopandas\/'
11 repos:
2 - repo: https://github.com/python/black
3 rev: 20.8b1
4 hooks:
5 - id: black
6 language_version: python3
7 - repo: https://gitlab.com/pycqa/flake8
8 rev: 3.8.3
9 hooks:
10 - id: flake8
11 language: python_venv
2 - repo: https://github.com/psf/black
3 rev: 22.3.0
4 hooks:
5 - id: black
6 language_version: python3
7 - repo: https://gitlab.com/pycqa/flake8
8 rev: 3.8.3
9 hooks:
10 - id: flake8
11 language: python_venv
00 Changelog
11 =========
22
3 Version 0.11 (June 20, 2022)
4 ----------------------------
5
6 Highlights of this release:
7
8 - The ``geopandas.read_file()`` and ``GeoDataFrame.to_file()`` methods to read
9 and write GIS file formats can now optionally use the
10 [pyogrio](https://github.com/geopandas/pyogrio/) package under the hood
11 through the ``engine="pyogrio"`` keyword (see the sketch below). The pyogrio
12 package implements vectorized IO for GDAL/OGR vector data sources, and is
13 faster than the ``fiona``-based engine (#2225).
14 - GeoParquet support updated to implement
15 [v0.4.0](https://github.com/opengeospatial/geoparquet/releases/tag/v0.4.0) of the
16 OpenGeospatial/GeoParquet specification (#2441). Backwards compatibility with v0.1.0 of
17 the metadata spec (implemented in the previous releases of GeoPandas) is guaranteed,
18 and reading and writing Parquet and Feather files will no longer produce a ``UserWarning``
19 (#2327).
20
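A minimal sketch of the new ``engine`` keyword described above (the file name is a hypothetical example; pyogrio must be installed separately):

```python
import geopandas as gpd

# Read with the vectorized pyogrio engine instead of the default fiona engine.
gdf = gpd.read_file("countries.gpkg", engine="pyogrio")

# Writing accepts the same keyword.
gdf.to_file("countries_copy.gpkg", engine="pyogrio")
```
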
21 New features and improvements:
22
23 - Improved handling of GeoDataFrame when the active geometry column is
24 lost from the GeoDataFrame. Previously, square bracket indexing ``gdf[[...]]`` returned
25 a GeoDataFrame when the active geometry column was retained and a DataFrame was
26 returned otherwise. Other pandas indexing methods (``loc``, ``iloc``, etc) did not follow
27 the same rules. The new behaviour for all indexing/reshaping operations is now as
28 follows (#2329, #2060; a sketch follows below):
29 - If operations produce a ``DataFrame`` containing the active geometry column, a
30 GeoDataFrame is returned
31 - If operations produce a ``DataFrame`` containing ``GeometryDtype`` columns, but not the
32 active geometry column, a ``GeoDataFrame`` is returned, where the active geometry
33 column is set to ``None`` (set the new geometry column with ``set_geometry()``)
34 - If operations produce a ``DataFrame`` containing no ``GeometryDtype`` columns, a
35 ``DataFrame`` is returned (this can be upcast again by calling ``set_geometry()`` or the
36 ``GeoDataFrame`` constructor)
37 - If operations produce a ``Series`` of ``GeometryDtype``, a ``GeoSeries`` is returned,
38 otherwise ``Series`` is returned.
39 - Error messages for having an invalid geometry column
40 have been improved, indicating the name of the last valid active geometry column set
41 and whether other geometry columns can be promoted to the active geometry column
42 (#2329).
43
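A minimal sketch of the indexing rules above (the column names are made up for illustration):

```python
import geopandas as gpd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:4326",
)

sub = gdf[["name", "geometry"]]  # keeps the active geometry -> GeoDataFrame
plain = gdf[["name"]]            # no GeometryDtype columns -> plain DataFrame
geom = gdf["geometry"]           # GeometryDtype column -> GeoSeries

# A plain DataFrame can be upcast again via the constructor or set_geometry():
back = gpd.GeoDataFrame(plain, geometry=gdf.geometry)
```
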
44 - Datetime fields are now read and written correctly for GIS formats which support them
45 (e.g. GPKG, GeoJSON) with fiona 1.8.14 or higher. Previously, datetimes were read as
46 strings (#2202).
47 - ``folium.Map`` keyword arguments can now be specified as the ``map_kwds`` argument
48 to the ``GeoDataFrame.explore()`` method (#2315).
49 - Add a new parameter ``style_function`` to ``GeoDataFrame.explore()`` to enable plot styling
50 based on GeoJSON properties (#2377).
51 - It is now possible to write an empty ``GeoDataFrame`` to a file for supported formats
52 (#2240). Attempting to do so will now emit a ``UserWarning`` instead of a ``ValueError``.
53 - Fast rectangle clipping has been exposed as ``GeoSeries/GeoDataFrame.clip_by_rect()``
54 (#1928).
55 - The ``mask`` parameter of ``GeoSeries/GeoDataFrame.clip()`` now accepts a rectangular mask
56 as a list-like to perform fast rectangle clipping using the new
57 ``GeoSeries/GeoDataFrame.clip_by_rect()`` (#2414).
58 - Bundled demo dataset ``naturalearth_lowres`` has been updated to version 5.0.1 of the
59 source, with field ``ISO_A3`` manually corrected for some cases (#2418).
60
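A sketch of the rectangle clipping mentioned in the bullets above (the bounds are arbitrary):

```python
import geopandas as gpd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(geometry=[Point(0.5, 0.5), Point(2, 2)], crs="EPSG:4326")

# Fast rectangle clipping with explicit (xmin, ymin, xmax, ymax) bounds ...
clipped = gdf.clip_by_rect(0, 0, 1, 1)

# ... or via clip(), which now accepts a list-like rectangular mask.
clipped_too = gdf.clip((0, 0, 1, 1))
```
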
61 Deprecations and compatibility notes:
62
63 - The active development branch of geopandas on GitHub has been renamed from master to
64 main (#2277).
65 - Deprecated methods ``GeometryArray.equals_exact()`` and ``GeometryArray.almost_equals()``
66 have been removed. They should
67 be replaced with ``GeometryArray.geom_equals_exact()`` and
68 ``GeometryArray.geom_almost_equals()`` respectively (#2267).
69 - Deprecated CRS functions ``explicit_crs_from_epsg()``, ``epsg_from_crs()`` and
70 ``get_epsg_file_contents()`` were removed (#2340).
71 - Warning about the behaviour change to ``GeoSeries.isna()`` with empty
72 geometries present has been removed (#2349).
73 - Specifying a CRS in the ``GeoDataFrame/GeoSeries`` constructor which contradicted the
74 underlying ``GeometryArray`` now raises a ``ValueError`` (#2100).
75 - Specifying a CRS in the ``GeoDataFrame`` constructor when no geometry column is provided
76 and calling ``GeoDataFrame.set_crs`` on a ``GeoDataFrame`` without an active geometry
77 column now raise a ``ValueError`` (#2100).
78 - Passing non-geometry data to the ``GeoSeries`` constructor is now fully deprecated and
79 will raise a ``TypeError`` (#2314). Previously, a ``pandas.Series`` was returned for
80 non-geometry data.
81 - Deprecated ``GeoSeries/GeoDataFrame`` set operations ``__xor__()``,
82 ``__or__()``, ``__and__()`` and ``__sub__()``, ``geopandas.io.file.read_file``/``to_file`` and
83 ``geopandas.io.sql.read_postgis`` now emit ``FutureWarning`` instead of
84 ``DeprecationWarning`` and will be completely removed in a future release.
85 - Accessing the ``crs`` of a ``GeoDataFrame`` without active geometry column is deprecated and will be removed in GeoPandas 0.12 (#2373).
86
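A sketch of the stricter CRS handling noted above; the explicit override is the documented escape hatch:

```python
import geopandas as gpd
from shapely.geometry import Point

s = gpd.GeoSeries([Point(0, 0)], crs="EPSG:4326")

# Contradicting the CRS already carried by the data now raises:
# gpd.GeoSeries(s, crs="EPSG:3857")  # ValueError

# To deliberately replace a CRS without reprojecting, be explicit:
s_overridden = s.set_crs("EPSG:3857", allow_override=True)
```
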
87 Bug fixes:
88
89 - ``GeoSeries.to_frame`` now creates a ``GeoDataFrame`` with the geometry column name set
90 correctly (#2296)
91 - Fix pickle files created with pygeos installed not being readable when pygeos is
92 not installed (#2237).
93 - Fixed ``UnboundLocalError`` in ``GeoDataFrame.plot()`` using ``legend=True`` and
94 ``missing_kwds`` (#2281).
95 - Fix ``explode()`` incorrectly relating index to columns, including where the input index
96 is not unique (#2292)
97 - Fix ``GeoSeries.[xyz]`` raising an ``IndexError`` when the underlying GeoSeries contains
98 empty points (#2335). Rows corresponding to empty points now contain ``np.nan``.
99 - Fix ``GeoDataFrame.iloc`` raising a ``TypeError`` when indexing a ``GeoDataFrame`` with only
100 a single column of ``GeometryDtype`` (#1970).
101 - Fix ``GeoDataFrame.iterfeatures()`` not returning features with the same field order as
102 ``GeoDataFrame.columns`` (#2396).
103 - Fix ``GeoDataFrame.from_features()`` to support reading GeoJSON with null properties
104 (#2243).
105 - Fix ``GeoDataFrame.to_parquet()`` not intercepting ``engine`` keyword argument, breaking
106 consistency with pandas (#2227)
107 - Fix ``GeoDataFrame.explore()`` producing an error when ``column`` is of boolean dtype
108 (#2403).
109 - Fix an issue where ``GeoDataFrame.to_postgis()`` output the wrong SRID for ESRI
110 authority CRS (#2414).
111 - Fix ``GeoDataFrame.from_dict/from_features`` classmethods using ``GeoDataFrame`` rather
112 than ``cls`` as the constructor.
113 - Fix ``GeoDataFrame.plot()`` producing incorrect colors with mixed geometry types when
114 ``colors`` keyword is provided (#2420).
115
116 Notes on (optional) dependencies:
117
118 - GeoPandas 0.11 drops support for Python 3.7 and pandas 0.25 (the minimum supported
119 pandas version is now 1.0.5). Further, the minimum required versions for the listed
120 dependencies have now changed to shapely 1.7, fiona 1.8.13.post1, pyproj 2.6.1.post1,
121 matplotlib 3.2 and mapclassify 2.4.0 (#2358, #2391).
122
123
3124 Version 0.10.2 (October 16, 2021)
4125 ---------------------------------
5126
6127 Small bug-fix release:
7128
8 - Fix regression in `overlay()` in case no geometries are intersecting (but
129 - Fix regression in ``overlay()`` in case no geometries are intersecting (but
9130 have overlapping total bounds) (#2172).
10 - Fix regression in `overlay()` with `keep_geom_type=True` in case the
131 - Fix regression in ``overlay()`` with ``keep_geom_type=True`` in case the
11132 overlay of two geometries results in a GeometryCollection with other geometry types
12133 (#2177).
13 - Fix `overlay()` to honor the `keep_geom_type` keyword for the
14 `op="difference"` case (#2164).
15 - Fix regression in `plot()` with a mapclassify `scheme` in case the
134 - Fix ``overlay()`` to honor the ``keep_geom_type`` keyword for the
135 ``op="difference"`` case (#2164).
136 - Fix regression in ``plot()`` with a mapclassify ``scheme`` in case the
16137 formatted legend labels have duplicates (#2166).
17 - Fix a bug in the `explore()` method ignoring the `vmin` and `vmax` keywords
138 - Fix a bug in the ``explore()`` method ignoring the ``vmin`` and ``vmax`` keywords
18139 in case they are set to 0 (#2175).
19 - Fix `unary_union` to correctly handle a GeoSeries with missing values (#2181).
20 - Avoid internal deprecation warning in `clip()` (#2179).
140 - Fix ``unary_union`` to correctly handle a GeoSeries with missing values (#2181).
141 - Avoid internal deprecation warning in ``clip()`` (#2179).
21142
22143
23144 Version 0.10.1 (October 8, 2021)
25146
26147 Small bug-fix release:
27148
28 - Fix regression in `overlay()` with non-overlapping geometries and a
29 non-default `how` (i.e. not "intersection") (#2157).
149 - Fix regression in ``overlay()`` with non-overlapping geometries and a
150 non-default ``how`` (i.e. not "intersection") (#2157).
30151
31152
32153 Version 0.10.0 (October 3, 2021)
34155
35156 Highlights of this release:
36157
37 - A new `sjoin_nearest()` method to join based on proximity, with the
38 ability to set a maximum search radius (#1865). In addition, the `sindex`
158 - A new ``sjoin_nearest()`` method to join based on proximity, with the
159 ability to set a maximum search radius (#1865). In addition, the ``sindex``
39160 attribute gained a new method for a "nearest" spatial index query (#1865,
40161 #2053).
41 - A new `explore()` method on GeoDataFrame and GeoSeries with native support
162 - A new ``explore()`` method on GeoDataFrame and GeoSeries with native support
42163 for interactive visualization based on folium / leaflet.js (#1953)
43 - The `geopandas.sjoin()`/`overlay()`/`clip()` functions are now also
164 - The ``geopandas.sjoin()``/``overlay()``/``clip()`` functions are now also
44165 available as methods on the GeoDataFrame (#2141, #1984, #2150).
45166
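A minimal sketch of the new ``sjoin_nearest()`` highlighted above (the data, CRS and radius are made up; a spatial index backend with nearest support is required):

```python
import geopandas as gpd
from shapely.geometry import Point

pts = gpd.GeoDataFrame(geometry=[Point(0, 0), Point(5, 5)], crs="EPSG:3857")
poi = gpd.GeoDataFrame({"name": ["shop"]}, geometry=[Point(0.5, 0)], crs="EPSG:3857")

# Join each point to its nearest neighbour within a 2-unit search radius,
# recording the measured distance in a new column.
joined = gpd.sjoin_nearest(pts, poi, max_distance=2, distance_col="dist")
```
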
46167 New features and improvements:
47168
48 - Add support for pandas' `value_counts()` method for geometry dtype (#2047).
49 - The `explode()` method has a new `ignore_index` keyword (consistent with
169 - Add support for pandas' ``value_counts()`` method for geometry dtype (#2047).
170 - The ``explode()`` method has a new ``ignore_index`` keyword (consistent with
50171 pandas' explode method) to reset the index in the result, and a new
51 `index_parts` keyword to control whether a cumulative count indexing the
172 ``index_parts`` keyword to control whether a cumulative count indexing the
52173 parts of the exploded multi-geometries should be added (#1871).
53 - `points_from_xy()` is now available as a GeoSeries method `from_xy` (#1936).
54 - The `to_file()` method will now attempt to detect the driver (if not
174 - ``points_from_xy()`` is now available as a GeoSeries method ``from_xy`` (#1936).
175 - The ``to_file()`` method will now attempt to detect the driver (if not
55176 specified) based on the extension of the provided filename, instead of
56177 defaulting to ESRI Shapefile (#1609).
57 - Support for the `storage_options` keyword in `read_parquet()` for
178 - Support for the ``storage_options`` keyword in ``read_parquet()`` for
58179 specifying filesystem-specific options (e.g. for S3) based on fsspec (#2107).
59 - The read/write functions now support `~` (user home directory) expansion (#1876).
60 - Support the `convert_dtypes()` method from pandas to preserve the
180 - The read/write functions now support ``~`` (user home directory) expansion (#1876).
181 - Support the ``convert_dtypes()`` method from pandas to preserve the
61182 GeoDataFrame class (#2115).
62 - Support WKB values in the hex format in `GeoSeries.from_wkb()` (#2106).
63 - Update the `estimate_utm_crs()` method to handle crossing the antimeridian
183 - Support WKB values in the hex format in ``GeoSeries.from_wkb()`` (#2106).
184 - Update the ``estimate_utm_crs()`` method to handle crossing the antimeridian
64185 with pyproj 3.1+ (#2049).
65186 - Improved heuristic to decide how many decimals to show in the repr based on
66187 whether the CRS is projected or geographic (#1895).
67 - Switched the default for `geocode()` from GeoCode.Farm to the Photon
188 - Switched the default for ``geocode()`` from GeoCode.Farm to the Photon
68189 geocoding API (https://photon.komoot.io) (#2007).
69190
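A sketch of the ``GeoSeries.from_xy()`` helper from the list above (the coordinates are arbitrary):

```python
import geopandas as gpd

# Build point geometries directly from coordinate sequences.
s = gpd.GeoSeries.from_xy(x=[2.35, -0.13], y=[48.86, 51.51], crs="EPSG:4326")
```
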
70191 Deprecations and compatibility notes:
71192
72 - The `op=` keyword of `sjoin()` to indicate which spatial predicate to use
73 for joining is being deprecated and renamed in favor of a new `predicate=`
193 - The ``op=`` keyword of ``sjoin()`` to indicate which spatial predicate to use
194 for joining is being deprecated and renamed in favor of a new ``predicate=``
74195 keyword (#1626).
75 - The `cascaded_union` attribute is deprecated, use `unary_union` instead (#2074).
196 - The ``cascaded_union`` attribute is deprecated, use ``unary_union`` instead (#2074).
76197 - Constructing a GeoDataFrame with a duplicated "geometry" column is now
77 disallowed. This can also raise an error in the `pd.concat(.., axis=1)`
198 disallowed. This can also raise an error in the ``pd.concat(.., axis=1)``
78199 function if this results in duplicated active geometry columns (#2046).
79 - The `explode()` method currently returns a GeoSeries/GeoDataFrame with a
200 - The ``explode()`` method currently returns a GeoSeries/GeoDataFrame with a
80201 MultiIndex, with an additional level with indices of the parts of the
81202 exploded multi-geometries. For consistency with pandas, this will change in
82 the future and the new `index_parts` keyword is added to control this.
83
84 Bug fixes:
85
86 - Fix in the `clip()` function to correctly clip MultiPoints instead of
203 the future and the new ``index_parts`` keyword is added to control this.
204
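A sketch of the ``index_parts`` / ``ignore_index`` keywords discussed above (the geometry is made up):

```python
import geopandas as gpd
from shapely.geometry import MultiPoint

gdf = gpd.GeoDataFrame(geometry=[MultiPoint([(0, 0), (1, 1)])])

# Keep a MultiIndex with an extra level numbering the parts:
parts = gdf.explode(index_parts=True)

# Or reset the index entirely, as in pandas' explode:
flat = gdf.explode(ignore_index=True)
```
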
205 Bug fixes:
206
207 - Fix in the ``clip()`` function to correctly clip MultiPoints instead of
87208 leaving them intact when partly outside of the clip bounds (#2148).
88 - Fix `GeoSeries.isna()` to correctly return a boolean Series in case of an
209 - Fix ``GeoSeries.isna()`` to correctly return a boolean Series in case of an
89210 empty GeoSeries (#2073).
90211 - Fix the GeoDataFrame constructor to preserve the geometry name when the
91 argument is already a GeoDataFrame object (i.e. `GeoDataFrame(gdf)`) (#2138).
212 argument is already a GeoDataFrame object (i.e. ``GeoDataFrame(gdf)``) (#2138).
92213 - Fix loss of the values' CRS when setting those values as a column
93 (`GeoDataFrame.__setitem__`) (#1963)
94 - Fix in `GeoDataFrame.apply()` to preserve the active geometry column name
214 (``GeoDataFrame.__setitem__``) (#1963)
215 - Fix in ``GeoDataFrame.apply()`` to preserve the active geometry column name
95216 (#1955).
96 - Fix in `sjoin()` to not ignore the suffixes in case of a right-join
97 (`how="right"`) (#2065).
98 - Fix `GeoDataFrame.explode()` with a MultiIndex (#1945).
99 - Fix the handling of missing values in `to/from_wkb` and `to_from_wkt` (#1891).
100 - Fix `to_file()` and `to_json()` when DataFrame has duplicate columns to
217 - Fix in ``sjoin()`` to not ignore the suffixes in case of a right-join
218 (``how="right"``) (#2065).
219 - Fix ``GeoDataFrame.explode()`` with a MultiIndex (#1945).
220 - Fix the handling of missing values in ``to/from_wkb`` and ``to_from_wkt`` (#1891).
221 - Fix ``to_file()`` and ``to_json()`` when DataFrame has duplicate columns to
101222 raise an error (#1900).
102223 - Fix bug in the colors shown with user-defined classification scheme (#2019).
103 - Fix handling of the `path_effects` keyword in `plot()` (#2127).
104 - Fix `GeoDataFrame.explode()` to preserve `attrs` (#1935)
224 - Fix handling of the ``path_effects`` keyword in ``plot()`` (#2127).
225 - Fix ``GeoDataFrame.explode()`` to preserve ``attrs`` (#1935)
105226
106227 Notes on (optional) dependencies:
107228
124245
125246 New features and improvements:
126247
127 - The `geopandas.read_file` function now accepts more general
128 file-like objects (e.g. `fsspec` open file objects). It will now also
248 - The ``geopandas.read_file`` function now accepts more general
249 file-like objects (e.g. ``fsspec`` open file objects). It will now also
129250 automatically recognize zipped files (#1535).
130 - The `GeoDataFrame.plot()` method now provides access to the pandas plotting
131 functionality for the non-geometry columns, either using the `kind` keyword
132 or the accessor method (e.g. `gdf.plot(kind="bar")` or `gdf.plot.bar()`)
251 - The ``GeoDataFrame.plot()`` method now provides access to the pandas plotting
252 functionality for the non-geometry columns, either using the ``kind`` keyword
253 or the accessor method (e.g. ``gdf.plot(kind="bar")`` or ``gdf.plot.bar()``)
133254 (#1465).
134 - New `from_wkt()`, `from_wkb()`, `to_wkt()`, `to_wkb()` methods for
255 - New ``from_wkt()``, ``from_wkb()``, ``to_wkt()``, ``to_wkb()`` methods for
135256 GeoSeries to construct a GeoSeries from geometries in WKT or WKB
136257 representation, or to convert a GeoSeries to a pandas Series with WKT or WKB
137258 values (#1710).
138 - New `GeoSeries.z` attribute to access the z-coordinates of Point geometries
139 (similar to the existing `.x` and `.y` attributes) (#1773).
140 - The `to_crs()` method now handles missing values (#1618).
141 - Support for pandas' new `.attrs` functionality (#1658).
142 - The `dissolve()` method now allows dissolving by no column (`by=None`) to
259 - New ``GeoSeries.z`` attribute to access the z-coordinates of Point geometries
260 (similar to the existing ``.x`` and ``.y`` attributes) (#1773).
261 - The ``to_crs()`` method now handles missing values (#1618).
262 - Support for pandas' new ``.attrs`` functionality (#1658).
263 - The ``dissolve()`` method now allows dissolving by no column (``by=None``) to
143264 create a union of all geometries (single-row GeoDataFrame) (#1568).
144 - New `estimate_utm_crs()` method on GeoSeries/GeoDataFrame to determine the
265 - New ``estimate_utm_crs()`` method on GeoSeries/GeoDataFrame to determine the
145266 UTM CRS based on the bounds (#1646).
146 - `GeoDataFrame.from_dict()` now accepts `geometry` and `crs` keywords
267 - ``GeoDataFrame.from_dict()`` now accepts ``geometry`` and ``crs`` keywords
147268 (#1619).
148 - `GeoDataFrame.to_postgis()` and `geopandas.read_postgis()` now support
269 - ``GeoDataFrame.to_postgis()`` and ``geopandas.read_postgis()`` now support
149270 both sqlalchemy engine and connection objects (#1638).
150 - The `GeoDataFrame.explode()` method now allows exploding based on a
271 - The ``GeoDataFrame.explode()`` method now allows exploding based on a
151272 non-geometry column, using the pandas implementation (#1720).
152 - Performance improvement in `GeoDataFrame/GeoSeries.explode()` when using
273 - Performance improvement in ``GeoDataFrame/GeoSeries.explode()`` when using
153274 the PyGEOS backend (#1693).
154 - The binary operation and predicate methods (eg `intersection()`,
155 `intersects()`) have a new `align` keyword which allows optionally not
156 aligning on the index before performing the operation with `align=False`
275 - The binary operation and predicate methods (eg ``intersection()``,
276 ``intersects()``) have a new ``align`` keyword which allows optionally not
277 aligning on the index before performing the operation with ``align=False``
157278 (#1668).
158 - The `GeoDataFrame.dissolve()` method now supports all relevant keywords of
159 `groupby()`, i.e. the `level`, `sort`, `observed` and `dropna` keywords
279 - The ``GeoDataFrame.dissolve()`` method now supports all relevant keywords of
280 ``groupby()``, i.e. the ``level``, ``sort``, ``observed`` and ``dropna`` keywords
160281 (#1845).
161 - The `geopandas.overlay()` function now accepts `make_valid=False` to skip
162 the step to ensure the input geometries are valid using `buffer(0)` (#1802).
163 - The `GeoDataFrame.to_json()` method gained a `drop_id` keyword to
282 - The ``geopandas.overlay()`` function now accepts ``make_valid=False`` to skip
283 the step to ensure the input geometries are valid using ``buffer(0)`` (#1802).
284 - The ``GeoDataFrame.to_json()`` method gained a ``drop_id`` keyword to
164285 optionally not write the GeoDataFrame's index as the "id" field in the
165286 resulting JSON (#1637).
166 - A new `aspect` keyword in the plotting methods to optionally allow retaining
287 - A new ``aspect`` keyword in the plotting methods to optionally allow retaining
167288 the original aspect (#1512)
168 - A new `interval` keyword in the `legend_kwds` group of the `plot()` method
289 - A new ``interval`` keyword in the ``legend_kwds`` group of the ``plot()`` method
169290 to control the appearance of the legend labels when using a classification
170291 scheme (#1605).
171 - The spatial index of a GeoSeries (accessed with the `sindex` attribute) is
292 - The spatial index of a GeoSeries (accessed with the ``sindex`` attribute) is
172293 now stored on the underlying array. This ensures that the spatial index is
173294 preserved in more operations where possible, and that multiple geometry
174295 columns of a GeoDataFrame can each have a spatial index (#1444).
175 - Addition of a `has_sindex` attribute on the GeoSeries/GeoDataFrame to check
296 - Addition of a ``has_sindex`` attribute on the GeoSeries/GeoDataFrame to check
176297 if a spatial index has already been initialized (#1627).
177 - The `geopandas.testing.assert_geoseries_equal()` and `assert_geodataframe_equal()`
178 testing utilities now have a `normalize` keyword (False by default) to
298 - The ``geopandas.testing.assert_geoseries_equal()`` and ``assert_geodataframe_equal()``
299 testing utilities now have a ``normalize`` keyword (False by default) to
179300 normalize geometries before comparing for equality (#1826). Those functions
180301 now also give a more informative error message when failing (#1808).
181302
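A sketch of the WKT round-trip methods listed above:

```python
import geopandas as gpd

# Construct a GeoSeries from WKT strings ...
s = gpd.GeoSeries.from_wkt(["POINT (0 0)", "POINT (1 1)"], crs="EPSG:4326")

# ... and convert back to a pandas Series of WKT strings.
wkt = s.to_wkt()
```
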
182303 Deprecations and compatibility notes:
183304
184 - The `is_ring` attribute currently returns True for Polygons. In the future,
305 - The ``is_ring`` attribute currently returns True for Polygons. In the future,
185306 this will be False (#1631). In addition, it is now checked for LineStrings
186307 and LinearRings (instead of always returning False).
187 - The deprecated `objects` keyword in the `intersection()` method of the
188 `GeoDataFrame/GeoSeries.sindex` spatial index object has been removed
308 - The deprecated ``objects`` keyword in the ``intersection()`` method of the
309 ``GeoDataFrame/GeoSeries.sindex`` spatial index object has been removed
189310 (#1444).
190311
191312 Bug fixes:
192313
193 - Fix regression in the `plot()` method raising an error with empty
314 - Fix regression in the ``plot()`` method raising an error with empty
194315 geometries (#1702, #1828).
195 - Fix `geopandas.overlay()` to preserve geometries of the correct type which
316 - Fix ``geopandas.overlay()`` to preserve geometries of the correct type which
196317 are nested within a GeometryCollection as a result of the overlay
197318 operation (#1582). In addition, a warning will now be raised if geometries
198319 of different type are dropped from the result (#1554).
199320 - Fix the repr of an empty GeoSeries to not show spurious warnings (#1673).
200 - Fix the `.crs` for empty GeoDataFrames (#1560).
201 - Fix `geopandas.clip` to preserve the correct geometry column name (#1566).
202 - Fix bug in `plot()` method when using `legend_kwds` with multiple subplots
321 - Fix the ``.crs`` for empty GeoDataFrames (#1560).
322 - Fix ``geopandas.clip`` to preserve the correct geometry column name (#1566).
323 - Fix bug in ``plot()`` method when using ``legend_kwds`` with multiple subplots
203324 (#1583)
204 - Fix spurious warning with `missing_kwds` keyword of the `plot()` method
325 - Fix spurious warning with ``missing_kwds`` keyword of the ``plot()`` method
205326 when there are no areas with missing data (#1600).
206 - Fix the `plot()` method to correctly align values passed to the `column`
327 - Fix the ``plot()`` method to correctly align values passed to the ``column``
207328 keyword as a pandas Series (#1670).
208329 - Fix bug in plotting MultiPoints when passing values to determine the color
209330 (#1694)
210 - The `rename_geometry()` method now raises a more informative error message
331 - The ``rename_geometry()`` method now raises a more informative error message
211332 when a duplicate column name is used (#1602).
212 - Fix `explode()` method to preserve the CRS (#1655)
213 - Fix the `GeoSeries.apply()` method to again accept the `convert_dtype`
333 - Fix ``explode()`` method to preserve the CRS (#1655)
334 - Fix the ``GeoSeries.apply()`` method to again accept the ``convert_dtype``
214335 keyword to be consistent with pandas (#1636).
215 - Fix `GeoDataFrame.apply()` to preserve the CRS when possible (#1848).
216 - Fix bug in containment test as `geom in geoseries` (#1753).
217 - The `shift()` method of a GeoSeries/GeoDataFrame now preserves the CRS
336 - Fix ``GeoDataFrame.apply()`` to preserve the CRS when possible (#1848).
337 - Fix bug in containment test as ``geom in geoseries`` (#1753).
338 - The ``shift()`` method of a GeoSeries/GeoDataFrame now preserves the CRS
218339 (#1744).
219340 - The PostGIS IO functionality now quotes table names to ensure it works with
220341 case-sensitive names (#1825).
221 - Fix the `GeoSeries` constructor without passing data but only an index (#1798).
342 - Fix the ``GeoSeries`` constructor without passing data but only an index (#1798).
222343
223344 Notes on (optional) dependencies:
224345
225346 - GeoPandas 0.9.0 dropped support for Python 3.5. Further, the minimum
226347 required versions are pandas 0.24, numpy 1.15, shapely 1.6 and fiona 1.8.
227 - The `descartes` package is no longer required for plotting polygons. This
348 - The ``descartes`` package is no longer required for plotting polygons. This
228349 functionality is now included by default in GeoPandas itself, when
229350 matplotlib is available (#1677).
230 - Fiona is now only imported when used in `read_file`/`to_file`. This means
351 - Fiona is now only imported when used in ``read_file``/``to_file``. This means
231352 you can now force geopandas to install without fiona installed (although it
232353 is still a default requirement) (#1775).
233354 - Compatibility with the upcoming Shapely 1.8 (#1659, #1662, #1819).
244365
245366 Small bug-fix release:
246367
247 - Fix a regression in the `plot()` method when visualizing with a
368 - Fix a regression in the ``plot()`` method when visualizing with a
248369 JenksCaspallSampled or FisherJenksSampled scheme (#1486).
249 - Fix spurious warning in `GeoDataFrame.to_postgis` (#1497).
250 - Fix the un-pickling with `pd.read_pickle` of files written with older
370 - Fix spurious warning in ``GeoDataFrame.to_postgis`` (#1497).
371 - Fix the un-pickling with ``pd.read_pickle`` of files written with older
251372 GeoPandas versions (#1511).
252373
253374
257378 **Experimental**: optional use of PyGEOS to speed up spatial operations (#1155).
258379 PyGEOS is a faster alternative for Shapely (being contributed back to a future
259380 version of Shapely), and is used in element-wise spatial operations and for
260 spatial index in e.g. `sjoin` (#1343, #1401, #1421, #1427, #1428). See the
381 spatial index in e.g. ``sjoin`` (#1343, #1401, #1421, #1427, #1428). See the
261382 [installation docs](https://geopandas.readthedocs.io/en/latest/install.html#using-the-optional-pygeos-dependency)
262383 for more info and how to enable it.
263384
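A sketch of opting in to the experimental PyGEOS backend (pygeos must be installed; see the linked installation docs):

```python
# Either export USE_PYGEOS=1 in the environment before importing geopandas,
# or toggle the option at runtime:
import geopandas as gpd

gpd.options.use_pygeos = True
```
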
265386
266387 - IO enhancements:
267388
268 - New `GeoDataFrame.to_postgis()` method to write to PostGIS database (#1248).
389 - New ``GeoDataFrame.to_postgis()`` method to write to PostGIS database (#1248).
269390 - New Apache Parquet and Feather file format support (#1180, #1435)
270 - Allow appending to files with `GeoDataFrame.to_file` (#1229).
271 - Add support for the `ignore_geometry` keyword in `read_file` to only read
391 - Allow appending to files with ``GeoDataFrame.to_file`` (#1229).
392 - Add support for the ``ignore_geometry`` keyword in ``read_file`` to only read
272393 the attribute data. If set to True, a pandas DataFrame without geometry is
273394 returned (#1383).
274 - `geopandas.read_file` now supports reading from file-like objects (#1329).
275 - `GeoDataFrame.to_file` now supports specifying the CRS to write to the file
395 - ``geopandas.read_file`` now supports reading from file-like objects (#1329).
396 - ``GeoDataFrame.to_file`` now supports specifying the CRS to write to the file
276397 (#802). By default it still uses the CRS of the GeoDataFrame.
277 - New `chunksize` keyword in `geopandas.read_postgis` to read a query in
398 - New ``chunksize`` keyword in ``geopandas.read_postgis`` to read a query in
278399 chunks (#1123).
279400
280401 - Improvements related to geometry columns and CRS:
282403 - Any column of the GeoDataFrame that has a "geometry" dtype is now returned
283404 as a GeoSeries. This means that when having multiple geometry columns, not
284405 only the "active" geometry column is returned as a GeoSeries, but also
285 accessing another geometry column (`gdf["other_geom_column"]`) gives a
406 accessing another geometry column (``gdf["other_geom_column"]``) gives a
286407 GeoSeries (#1336).
287408 - Multiple geometry columns in a GeoDataFrame can now each have a different
288 CRS. The global `gdf.crs` attribute continues to return the CRS of the
409 CRS. The global ``gdf.crs`` attribute continues to return the CRS of the
289410 "active" geometry column. The CRS of other geometry columns can be accessed
290 from the column itself (eg `gdf["other_geom_column"].crs`) (#1339).
291 - New `set_crs()` method on GeoDataFrame/GeoSeries to set the CRS of naive
411 from the column itself (eg ``gdf["other_geom_column"].crs``) (#1339).
412 - New ``set_crs()`` method on GeoDataFrame/GeoSeries to set the CRS of naive
292413 geometries (#747).
293414
294415 - Improvements related to plotting:
295416
296417 - The y-axis is now scaled depending on the center of the plot when using a
297418 geographic CRS, instead of using an equal aspect ratio (#1290).
298 - When passing a column of categorical dtype to the `column=` keyword of the
299 GeoDataFrame `plot()`, we now honor all categories and their order (#1483).
300 In addition, a new `categories` keyword allows specifying all categories
419 - When passing a column of categorical dtype to the ``column=`` keyword of the
420 GeoDataFrame ``plot()``, we now honor all categories and their order (#1483).
421 In addition, a new ``categories`` keyword allows specifying all categories
301422 and their order otherwise (#1173).
302 - For choropleths using a classification scheme (using `scheme=`), the
303 `legend_kwds` accepts two new keywords to control the formatting of the
304 legend: `fmt` with a format string for the bin edges (#1253), and `labels`
423 - For choropleths using a classification scheme (using ``scheme=``), the
424 ``legend_kwds`` accepts two new keywords to control the formatting of the
425 legend: ``fmt`` with a format string for the bin edges (#1253), and ``labels``
305426 to pass fully custom class labels (#1302).
306427
307 - New `covers()` and `covered_by()` methods on GeoSeries/GeoDataFrame for the
428 - New ``covers()`` and ``covered_by()`` methods on GeoSeries/GeoDataFrame for the
308429 equivalent spatial predicates (#1460, #1462).
309430 - GeoPandas now warns when using distance-based methods with data in a
310431 geographic projection (#1378).
313434
314435 - When constructing a GeoSeries or GeoDataFrame from data that already has a
315436 CRS, a deprecation warning is raised when the two CRS don't match, and in the
316 future an error will be raised in such a case. You can use the new `set_crs`
437 future an error will be raised in such a case. You can use the new ``set_crs``
317438 method to override an existing CRS. See
318439 [the docs](https://geopandas.readthedocs.io/en/latest/projections.html#projection-for-multiple-geometry-columns).
319 - The helper functions in the `geopandas.plotting` module are deprecated for
440 - The helper functions in the ``geopandas.plotting`` module are deprecated for
320441 public usage (#656).
321 - The `geopandas.io` functions are deprecated, use the top-level `read_file` and
322 `to_file` instead (#1407).
323 - The set operators (`&`, `|`, `^`, `-`) are deprecated, use the
324 `intersection()`, `union()`, `symmetric_difference()`, `difference()` methods
442 - The ``geopandas.io`` functions are deprecated, use the top-level ``read_file`` and
443 ``to_file`` instead (#1407).
444 - The set operators (``&``, ``|``, ``^``, ``-``) are deprecated, use the
445 ``intersection()``, ``union()``, ``symmetric_difference()``, ``difference()`` methods
325446 instead (#1255).
326 - The `sindex` for an empty dataframe will in the future return an empty spatial
327 index instead of `None` (#1438).
328 - The `objects` keyword in the `intersection` method of the spatial index
329 returned by the `sindex` attribute is deprecated and will be removed in the
447 - The ``sindex`` for an empty dataframe will in the future return an empty spatial
448 index instead of ``None`` (#1438).
449 - The ``objects`` keyword in the ``intersection`` method of the spatial index
450 returned by the ``sindex`` attribute is deprecated and will be removed in the
330451 future (#1440).
331452
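A sketch of moving from the deprecated set operators to the named methods (the geometries are arbitrary):

```python
import geopandas as gpd
from shapely.geometry import box

a = gpd.GeoSeries([box(0, 0, 2, 2)])
b = gpd.GeoSeries([box(1, 1, 3, 3)])

# Instead of the deprecated a & b, a | b, a ^ b, a - b:
inter = a.intersection(b)
union = a.union(b)
sym = a.symmetric_difference(b)
diff = a.difference(b)
```
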
332453 Bug fixes:
333454
334 - Fix the `total_bounds()` method to ignore missing and empty geometries (#1312).
335 - Fix `geopandas.clip` when masking with non-overlapping area resulting in an
455 - Fix the ``total_bounds()`` method to ignore missing and empty geometries (#1312).
456 - Fix ``geopandas.clip`` when masking with non-overlapping area resulting in an
336457 empty GeoDataFrame (#1309, #1365).
337 - Fix error in `geopandas.sjoin` when joining on an empty geometry column (#1318).
338 - CRS related fixes: `pandas.concat` preserves CRS when concatenating GeoSeries
339 objects (#1340), preserve the CRS in `geopandas.clip` (#1362) and in
340 `GeoDataFrame.astype` (#1366).
341 - Fix bug in `GeoDataFrame.explode()` when 'level_1' is one of the column names
458 - Fix error in ``geopandas.sjoin`` when joining on an empty geometry column (#1318).
459 - CRS related fixes: ``pandas.concat`` preserves CRS when concatenating GeoSeries
460 objects (#1340), preserve the CRS in ``geopandas.clip`` (#1362) and in
461 ``GeoDataFrame.astype`` (#1366).
462 - Fix bug in ``GeoDataFrame.explode()`` when 'level_1' is one of the column names
342463 (#1445).
343464 - Better error message when rtree is not installed (#1425).
344 - Fix bug in `GeoSeries.equals()` (#1451).
465 - Fix bug in ``GeoSeries.equals()`` (#1451).
345466 - Fix plotting of multi-part geometries with additional style keywords (#1385).
346467
347 And we now have a [Code of Conduct](https://github.com/geopandas/geopandas/blob/master/CODE_OF_CONDUCT.md)!
468 And we now have a [Code of Conduct](https://github.com/geopandas/geopandas/blob/main/CODE_OF_CONDUCT.md)!
348469
349470 GeoPandas 0.8.0 is the last release to support Python 3.5. The next release
350471 will require Python 3.6, pandas 0.24, numpy 1.15 and shapely 1.6 or higher.
356477 Support for Python 2.7 has been dropped. GeoPandas now works with Python >= 3.5.
357478
358479 The important API change of this release is that GeoPandas now requires
359 PROJ > 6 and pyproj > 2.2, and that the `.crs` attribute of a GeoSeries and
480 PROJ > 6 and pyproj > 2.2, and that the ``.crs`` attribute of a GeoSeries and
360481 GeoDataFrame no longer stores the CRS information as a proj4 string or dict,
361482 but as a ``pyproj.CRS`` object (#1101).
362483
367488
368489 Other API changes:
369490
370 - The `GeoDataFrame.to_file` method will now also write the GeoDataFrame index
491 - The ``GeoDataFrame.to_file`` method will now also write the GeoDataFrame index
371492 to the file, if the index is named and/or non-integer. You can use the
372 `index=True/False` keyword to overwrite this default inference (#1059).
493 ``index=True/False`` keyword to overwrite this default inference (#1059).
373494
374495 New features and improvements:
375496
376 - A new `geopandas.clip` function to clip a GeoDataFrame to the spatial extent
497 - A new ``geopandas.clip`` function to clip a GeoDataFrame to the spatial extent
377498 of another shape (#1128).
378 - The `geopandas.overlay` function now works for all geometry types, including
499 - The ``geopandas.overlay`` function now works for all geometry types, including
379500 points and linestrings in addition to polygons (#1110).
380 - The `plot()` method gained support for missing values (in the column that
501 - The ``plot()`` method gained support for missing values (in the column that
381502 determines the colors). By default it doesn't plot the corresponding
382 geometries, but using the new `missing_kwds` argument you can specify how to
503 geometries, but using the new ``missing_kwds`` argument you can specify how to
383504 style those geometries (#1156).
384 - The `plot()` method now also supports plotting GeometryCollection and
505 - The ``plot()`` method now also supports plotting GeometryCollection and
385506 LinearRing objects (#1225).
386507 - Added support for filtering with a geometry or reading a subset of the rows in
387 `geopandas.read_file` (#1160).
508 ``geopandas.read_file`` (#1160).
388509 - Added support for the new nullable integer data type of pandas in
389 `GeoDataFrame.to_file` (#1220).
390
391 Bug fixes:
392
393 - `GeoSeries.reset_index()` now correctly results in a GeoDataFrame instead of DataFrame (#1252).
394 - Fixed the `geopandas.sjoin` function to handle MultiIndex correctly (#1159).
395 - Fixed the `geopandas.sjoin` function to preserve the index name of the left GeoDataFrame (#1150).
510 ``GeoDataFrame.to_file`` (#1220).
511
512 Bug fixes:
513
514 - ``GeoSeries.reset_index()`` now correctly results in a GeoDataFrame instead of DataFrame (#1252).
515 - Fixed the ``geopandas.sjoin`` function to handle MultiIndex correctly (#1159).
516 - Fixed the ``geopandas.sjoin`` function to preserve the index name of the left GeoDataFrame (#1150).
396517
397518
398519 Version 0.6.3 (February 6, 2020)
401522 Small bug-fix release:
402523
403524 - Compatibility with Shapely 1.7 and pandas 1.0 (#1244).
404 - Fix `GeoDataFrame.fillna` to accept non-geometry values again when there are
525 - Fix ``GeoDataFrame.fillna`` to accept non-geometry values again when there are
405526 no missing values in the geometry column. This should make it easier to fill
406527 the numerical columns of the GeoDataFrame (#1279).
407528
424545
425546 Small bug-fix release fixing a few regressions:
426547
427 - Fix `astype` when converting to string with Multi geometries (#1145) or when converting a dataframe without geometries (#1144).
428 - Fix `GeoSeries.fillna` to accept `np.nan` again (#1149).
548 - Fix ``astype`` when converting to string with Multi geometries (#1145) or when converting a dataframe without geometries (#1144).
549 - Fix ``GeoSeries.fillna`` to accept ``np.nan`` again (#1149).
429550
430551
431552 Version 0.6.0 (September 27, 2019)
437558
438559 - A refactor of the internals based on the pandas ExtensionArray interface (#1000). The main user visible changes are:
439560
440 - The `.dtype` of a GeoSeries is now a `'geometry'` dtype (and no longer a numpy `object` dtype).
441 - The `.values` of a GeoSeries now returns a custom `GeometryArray`, and no longer a numpy array. To get back a numpy array of Shapely scalars, you can convert explicitly using `np.asarray(..)`.
442
443 - The `GeoSeries` constructor now raises a warning when passed non-geometry data. Currently the constructor falls back to return a pandas `Series`, but in the future this will raise an error (#1085).
561 - The ``.dtype`` of a GeoSeries is now a ``'geometry'`` dtype (and no longer a numpy ``object`` dtype).
562 - The ``.values`` of a GeoSeries now returns a custom ``GeometryArray``, and no longer a numpy array. To get back a numpy array of Shapely scalars, you can convert explicitly using ``np.asarray(..)``.
563
564 - The ``GeoSeries`` constructor now raises a warning when passed non-geometry data. Currently the constructor falls back to return a pandas ``Series``, but in the future this will raise an error (#1085).
444565 - The missing value handling has been changed to now separate the concepts of missing geometries and empty geometries (#601, #1062). In practice this means that (see [the docs](https://geopandas.readthedocs.io/en/v0.6.0/missing_empty.html) for more details):
445566
446 - `GeoSeries.isna` now considers only missing values, and if you want to check for empty geometries, you can use `GeoSeries.is_empty` (`GeoDataFrame.isna` already only looked at missing values).
447 - `GeoSeries.dropna` now actually drops missing values (before it didn't drop either missing or empty geometries)
448 - `GeoSeries.fillna` only fills missing values (behaviour unchanged).
449 - `GeoSeries.align` uses missing values instead of empty geometries by default to fill non-matching index entries.
567 - ``GeoSeries.isna`` now considers only missing values, and if you want to check for empty geometries, you can use ``GeoSeries.is_empty`` (``GeoDataFrame.isna`` already only looked at missing values).
568 - ``GeoSeries.dropna`` now actually drops missing values (before it didn't drop either missing or empty geometries)
569 - ``GeoSeries.fillna`` only fills missing values (behaviour unchanged).
570 - ``GeoSeries.align`` uses missing values instead of empty geometries by default to fill non-matching index entries.
450571
451572 New features and improvements:
452573
453 - Addition of a `GeoSeries.affine_transform` method, equivalent of Shapely's function (#1008).
454 - Addition of a `GeoDataFrame.rename_geometry` method to easily rename the active geometry column (#1053).
455 - Addition of `geopandas.show_versions()` function, which can be used to give an overview of the installed libraries in bug reports (#899).
456 - The `legend_kwds` keyword of the `plot()` method can now also be used to specify keywords for the color bar (#1102).
457 - Performance improvement in the `sjoin()` operation by re-using existing spatial index of the input dataframes, if available (#789).
574 - Addition of a ``GeoSeries.affine_transform`` method, equivalent of Shapely's function (#1008).
575 - Addition of a ``GeoDataFrame.rename_geometry`` method to easily rename the active geometry column (#1053).
576 - Addition of ``geopandas.show_versions()`` function, which can be used to give an overview of the installed libraries in bug reports (#899).
577 - The ``legend_kwds`` keyword of the ``plot()`` method can now also be used to specify keywords for the color bar (#1102).
578 - Performance improvement in the ``sjoin()`` operation by re-using existing spatial index of the input dataframes, if available (#789).
458579 - Updated documentation to work with latest version of geoplot and contextily (#1044, #1088).
459580 - A new ``geopandas.options`` configuration, with currently a single option to control the display precision of the coordinates (``options.display_precision``). The default is now to show less coordinates (3 for projected and 5 for geographic coordinates), but the default can be overridden with the option.
460581
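A sketch of the diagnostics helper and display option mentioned above:

```python
import geopandas as gpd

# Print the versions of geopandas and its dependencies for bug reports.
gpd.show_versions()

# Show coordinates in the repr with a fixed number of decimals.
gpd.options.display_precision = 2
```
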
461582 Bug fixes:
462583
463 - Also try to use `pysal` instead of `mapclassify` if available (#1082).
464 - The `GeoDataFrame.astype()` method now correctly returns a `GeoDataFrame` if the geometry column is preserved (#1009).
465 - The `to_crs` method now uses `always_xy=True` to ensure correct lon/lat order handling for pyproj>=2.2.0 (#1122).
466 - Fixed passing list-like colors in the `plot()` method in case of "multi" geometries (#1119).
467 - Fixed the coloring of shapes and colorbar when passing a custom `norm` in the `plot()` method (#1091, #1089).
468 - Fixed `GeoDataFrame.to_file` to preserve VFS file paths (e.g. when a "s3://" path is specified) (#1124).
584 - Also try to use ``pysal`` instead of ``mapclassify`` if available (#1082).
585 - The ``GeoDataFrame.astype()`` method now correctly returns a ``GeoDataFrame`` if the geometry column is preserved (#1009).
586 - The ``to_crs`` method now uses ``always_xy=True`` to ensure correct lon/lat order handling for pyproj>=2.2.0 (#1122).
587 - Fixed passing list-like colors in the ``plot()`` method in case of "multi" geometries (#1119).
588 - Fixed the coloring of shapes and colorbar when passing a custom ``norm`` in the ``plot()`` method (#1091, #1089).
589 - Fixed ``GeoDataFrame.to_file`` to preserve VFS file paths (e.g. when a "s3://" path is specified) (#1124).
469590 - Fixed failing case in ``geopandas.sjoin`` with empty geometries (#1138).
470591
471592
482603
483604 Improvements:
484605
485 * Significant performance improvement (around 10x) for `GeoDataFrame.iterfeatures`,
486 which also improves `GeoDataFrame.to_file` (#864).
487 * File IO enhancements based on Fiona 1.8:
488
489 * Support for writing bool dtype (#855) and datetime dtype, if the file format supports it (#728).
490 * Support for writing dataframes with multiple geometry types, if the file format allows it (e.g. GeoJSON for all types, or ESRI Shapefile for Polygon+MultiPolygon) (#827, #867, #870).
491
492 * Compatibility with pyproj >= 2 (#962).
493 * A new `geopandas.points_from_xy()` helper function to convert x and y coordinates to Point objects (#896).
494 * The `buffer` and `interpolate` methods now accept an array-like to specify a variable distance for each geometry (#781).
495 * Addition of a `relate` method, corresponding to the shapely method that returns the DE-9IM matrix (#853).
496 * Plotting improvements:
497
498 * Performance improvement in plotting by only flattening the geometries if there are actually 'Multi' geometries (#785).
499 * Choropleths: access to all `mapclassify` classification schemes and addition of the `classification_kwds` keyword in the `plot` method to specify options for the scheme (#876).
500 * Ability to specify a matplotlib axes object on which to plot the color bar with the `cax` keyword, in order to have more control over the color bar placement (#894).
501
502 * Changed the default provider in ``geopandas.tools.geocode`` from Google (now requires an API key) to Geocode.Farm (#907, #975).
606 - Significant performance improvement (around 10x) for ``GeoDataFrame.iterfeatures``,
607 which also improves ``GeoDataFrame.to_file`` (#864).
608 - File IO enhancements based on Fiona 1.8:
609
610 - Support for writing bool dtype (#855) and datetime dtype, if the file format supports it (#728).
611 - Support for writing dataframes with multiple geometry types, if the file format allows it (e.g. GeoJSON for all types, or ESRI Shapefile for Polygon+MultiPolygon) (#827, #867, #870).
612
613 - Compatibility with pyproj >= 2 (#962).
614 - A new ``geopandas.points_from_xy()`` helper function to convert x and y coordinates to Point objects (#896).
615 - The ``buffer`` and ``interpolate`` methods now accept an array-like to specify a variable distance for each geometry (#781).
616 - Addition of a ``relate`` method, corresponding to the shapely method that returns the DE-9IM matrix (#853).
617 - Plotting improvements:
618
619 - Performance improvement in plotting by only flattening the geometries if there are actually 'Multi' geometries (#785).
620 - Choropleths: access to all ``mapclassify`` classification schemes and addition of the ``classification_kwds`` keyword in the ``plot`` method to specify options for the scheme (#876).
621 - Ability to specify a matplotlib axes object on which to plot the color bar with the ``cax`` keyword, in order to have more control over the color bar placement (#894).
622
623 - Changed the default provider in ``geopandas.tools.geocode`` from Google (now requires an API key) to Geocode.Farm (#907, #975).
503624
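A sketch of the ``points_from_xy()`` helper introduced above (the column names are hypothetical):

```python
import pandas as pd
import geopandas as gpd

df = pd.DataFrame({"lon": [2.35, -0.13], "lat": [48.86, 51.51]})

# Convert coordinate columns into Point geometries for a GeoDataFrame.
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]))
```
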
504625 Bug fixes:
505626
506627 - Remove the edge in the legend marker (#807).
507 - Fix the `align` method to preserve the CRS (#829).
508 - Fix `geopandas.testing.assert_geodataframe_equal` to correctly compare left and right dataframes (#810).
628 - Fix the ``align`` method to preserve the CRS (#829).
629 - Fix ``geopandas.testing.assert_geodataframe_equal`` to correctly compare left and right dataframes (#810).
509630 - Fix in choropleth mapping when the values contain missing values (#877).
510 - Better error message in `sjoin` if the input is not a GeoDataFrame (#842).
511 - Fix in `read_postgis` to handle nullable (missing) geometries (#856).
512 - Correctly passing through the `parse_dates` keyword in `read_postgis` to the underlying pandas method (#860).
631 - Better error message in ``sjoin`` if the input is not a GeoDataFrame (#842).
632 - Fix in ``read_postgis`` to handle nullable (missing) geometries (#856).
633 - Correctly passing through the ``parse_dates`` keyword in ``read_postgis`` to the underlying pandas method (#860).
513634 - Fixed the shape of Antarctica in the included demo dataset 'naturalearth_lowres'
514635 (by updating to the latest version) (#804).
515636
516
517637 Version 0.4.1 (March 5, 2019)
518638 -----------------------------
519639
520640 Small bug-fix release for compatibility with the latest Fiona and PySAL
521641 releases:
522642
523 * Compatibility with Fiona 1.8: fix deprecation warning (#854).
524 * Compatibility with PySAL 2.0: switched to `mapclassify` instead of `PySAL` as
525 dependency for choropleth mapping with the `scheme` keyword (#872).
526 * Fix for new `overlay` implementation in case the intersection is empty (#800).
527
643 - Compatibility with Fiona 1.8: fix deprecation warning (#854).
644 - Compatibility with PySAL 2.0: switched to ``mapclassify`` instead of ``PySAL`` as
645 dependency for choropleth mapping with the ``scheme`` keyword (#872).
646 - Fix for new ``overlay`` implementation in case the intersection is empty (#800).
528647
529648 Version 0.4.0 (July 15, 2018)
530649 -----------------------------
531650
532651 Improvements:
533652
534 * Improved `overlay` function (better performance, several incorrect behaviours fixed) (#429)
535 * Pass keywords to control legend behavior (`legend_kwds`) to `plot` (#434)
536 * Add basic support for reading remote datasets in `read_file` (#531)
537 * Pass kwargs for `buffer` operation on GeoSeries (#535)
538 * Expose all geopy services as options in geocoding (#550)
539 * Faster write speeds to GeoPackage (#605)
540 * Permit `read_file` filtering with a bounding box from a GeoDataFrame (#613)
541 * Set CRS on GeoDataFrame returned by `read_postgis` (#627)
542 * Permit setting markersize for Point GeoSeries plots with column values (#633)
543 * Started an example gallery (#463, #690, #717)
544 * Support for plotting MultiPoints (#683)
545 * Testing functionality (e.g. `assert_geodataframe_equal`) is now publicly exposed (#707)
546 * Add `explode` method to GeoDataFrame (similar to the GeoSeries method) (#671)
547 * Set equal aspect on active axis on multi-axis figures (#718)
548 * Pass array of values to column argument in `plot` (#770)
549
550 Bug fixes:
551
552 * Ensure that colorbars are plotted on the correct axis (#523)
553 * Handle plotting empty GeoDataFrame (#571)
554 * Save z-dimension when writing files (#652)
555 * Handle reading empty shapefiles (#653)
556 * Correct dtype for empty result of spatial operations (#685)
557 * Fix empty `sjoin` handling for pandas>=0.23 (#762)
558
653 - Improved ``overlay`` function (better performance, several incorrect behaviours fixed) (#429)
654 - Pass keywords to control legend behavior (``legend_kwds``) to ``plot`` (#434)
655 - Add basic support for reading remote datasets in ``read_file`` (#531)
656 - Pass kwargs for ``buffer`` operation on GeoSeries (#535)
657 - Expose all geopy services as options in geocoding (#550)
658 - Faster write speeds to GeoPackage (#605)
659 - Permit ``read_file`` filtering with a bounding box from a GeoDataFrame (#613)
660 - Set CRS on GeoDataFrame returned by ``read_postgis`` (#627)
661 - Permit setting markersize for Point GeoSeries plots with column values (#633)
662 - Started an example gallery (#463, #690, #717)
663 - Support for plotting MultiPoints (#683)
664 - Testing functionality (e.g. ``assert_geodataframe_equal``) is now publicly exposed (#707)
665 - Add ``explode`` method to GeoDataFrame (similar to the GeoSeries method) (#671)
666 - Set equal aspect on active axis on multi-axis figures (#718)
667 - Pass array of values to column argument in ``plot`` (#770)
668
669 Bug fixes:
670
671 - Ensure that colorbars are plotted on the correct axis (#523)
672 - Handle plotting empty GeoDataFrame (#571)
673 - Save z-dimension when writing files (#652)
674 - Handle reading empty shapefiles (#653)
675 - Correct dtype for empty result of spatial operations (#685)
676 - Fix empty ``sjoin`` handling for pandas>=0.23 (#762)
559677
560678 Version 0.3.0 (August 29, 2017)
561679 -------------------------------
562680
563681 Improvements:
564682
565 * Improve plotting performance using ``matplotlib.collections`` (#267)
566 * Improve default plotting appearance. The defaults now follow the new matplotlib defaults (#318, #502, #510)
567 * Provide access to x/y coordinates as attributes for Point GeoSeries (#383)
568 * Make the NYBB dataset available through ``geopandas.datasets`` (#384)
569 * Enable ``sjoin`` on non-integer-index GeoDataFrames (#422)
570 * Add ``cx`` indexer to GeoDataFrame (#482)
571 * ``GeoDataFrame.from_features`` now also accepts a Feature Collection (#225, #507)
572 * Use index label instead of integer id in output of ``iterfeatures`` and
683 - Improve plotting performance using ``matplotlib.collections`` (#267)
684 - Improve default plotting appearance. The defaults now follow the new matplotlib defaults (#318, #502, #510)
685 - Provide access to x/y coordinates as attributes for Point GeoSeries (#383)
686 - Make the NYBB dataset available through ``geopandas.datasets`` (#384)
687 - Enable ``sjoin`` on non-integer-index GeoDataFrames (#422)
688 - Add ``cx`` indexer to GeoDataFrame (#482)
689 - ``GeoDataFrame.from_features`` now also accepts a Feature Collection (#225, #507)
690 - Use index label instead of integer id in output of ``iterfeatures`` and
573691 ``to_json`` (#421)
574 * Return empty data frame rather than raising an error when performing a spatial join with non overlapping geodataframes (#335)
575
576 Bug fixes:
577
578 * Compatibility with shapely 1.6.0 (#512)
579 * Fix ``fiona.filter`` results when bbox is not None (#372)
580 * Fix ``dissolve`` to retain CRS (#389)
581 * Fix ``cx`` behavior when using index of 0 (#478)
582 * Fix display of lower bin in legend label of choropleth plots using a PySAL scheme (#450)
583
692 - Return empty data frame rather than raising an error when performing a spatial join with non overlapping geodataframes (#335)
693
694 Bug fixes:
695
696 - Compatibility with shapely 1.6.0 (#512)
697 - Fix ``fiona.filter`` results when bbox is not None (#372)
698 - Fix ``dissolve`` to retain CRS (#389)
699 - Fix ``cx`` behavior when using index of 0 (#478)
700 - Fix display of lower bin in legend label of choropleth plots using a PySAL scheme (#450)
584701
585702 Version 0.2.0
586703 -------------
587704
588705 Improvements:
589706
590 * Complete overhaul of the documentation
591 * Addition of ``overlay`` to perform spatial overlays with polygons (#142)
592 * Addition of ``sjoin`` to perform spatial joins (#115, #145, #188)
593 * Addition of ``__geo_interface__`` that returns a python data structure
707 - Complete overhaul of the documentation
708 - Addition of ``overlay`` to perform spatial overlays with polygons (#142)
709 - Addition of ``sjoin`` to perform spatial joins (#115, #145, #188)
710 - Addition of ``__geo_interface__`` that returns a python data structure
594711 to represent the ``GeoSeries`` as a GeoJSON-like ``FeatureCollection`` (#116)
595712 and ``iterfeatures`` method (#178)
596 * Addition of the ``explode`` (#146) and ``dissolve`` (#310, #311) methods.
597 * Addition of the ``sindex`` attribute, a Spatial Index using the optional
713 - Addition of the ``explode`` (#146) and ``dissolve`` (#310, #311) methods.
714 - Addition of the ``sindex`` attribute, a Spatial Index using the optional
598715 dependency ``rtree`` (``libspatialindex``) that can be used to speed up
599716 certain operations such as overlays (#140, #141).
600 * Addition of the ``GeoSeries.cx`` coordinate indexer to slice a GeoSeries based
717 - Addition of the ``GeoSeries.cx`` coordinate indexer to slice a GeoSeries based
601718 on a bounding box of the coordinates (#55).
602 * Improvements to plotting: ability to specify edge colors (#173), support for
719 - Improvements to plotting: ability to specify edge colors (#173), support for
603720 the ``vmin``, ``vmax``, ``figsize``, ``linewidth`` keywords (#207), legends
604721 for choropleth plots (#210), color points by specifying a colormap (#186) or
605722 a single color (#238).
606 * Larger flexibility of ``to_crs``, accepting both dicts and proj strings (#289)
607 * Addition of embedded example data, accessible through
723 - Larger flexibility of ``to_crs``, accepting both dicts and proj strings (#289)
724 - Addition of embedded example data, accessible through
608725 ``geopandas.datasets.get_path``.
609726
610727 API changes:
611728
612 * In the ``plot`` method, the ``axes`` keyword is renamed to ``ax`` for
729 - In the ``plot`` method, the ``axes`` keyword is renamed to ``ax`` for
613730 consistency with pandas, and the ``colormap`` keyword is renamed to ``cmap``
614731 for consistency with matplotlib (#208, #228, #240).
615732
616733 Bug fixes:
617734
618 * Properly handle rows with missing geometries (#139, #193).
619 * Fix ``GeoSeries.to_json`` (#263).
620 * Correctly serialize metadata when pickling (#199, #206).
621 * Fix ``merge`` and ``concat`` to return correct GeoDataFrame (#247, #320, #322).
735 - Properly handle rows with missing geometries (#139, #193).
736 - Fix ``GeoSeries.to_json`` (#263).
737 - Correctly serialize metadata when pickling (#199, #206).
738 - Fix ``merge`` and ``concat`` to return correct GeoDataFrame (#247, #320, #322).
1111 In general, GeoPandas follows the conventions of the pandas project
1212 where applicable. Please read the [contributing
1313 guidelines](https://geopandas.readthedocs.io/en/latest/community/contributing.html).
14
1514
1615 In particular, when submitting a pull request:
1716
3837 Style
3938 -----
4039
41 - GeoPandas supports Python 3.7+ only. The last version of GeoPandas
40 - GeoPandas supports Python 3.8+ only. The last version of GeoPandas
4241 supporting Python 2 is 0.6.
4342
4443 - GeoPandas follows [the PEP 8
0 GeoPandas [![Actions Status](https://github.com/geopandas/geopandas/workflows/Tests/badge.svg)](https://github.com/geopandas/geopandas/actions?query=workflow%3ATests) [![Coverage Status](https://codecov.io/gh/geopandas/geopandas/branch/master/graph/badge.svg)](https://codecov.io/gh/geopandas/geopandas) [![Join the chat at https://gitter.im/geopandas/geopandas](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/geopandas/geopandas?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/geopandas/geopandas/master) [![DOI](https://zenodo.org/badge/11002815.svg)](https://zenodo.org/badge/latestdoi/11002815)
1 =========
0 [![pypi](https://img.shields.io/pypi/v/geopandas.svg)](https://pypi.python.org/pypi/geopandas/)
1 [![Actions Status](https://github.com/geopandas/geopandas/workflows/Tests/badge.svg)](https://github.com/geopandas/geopandas/actions?query=workflow%3ATests)
2 [![Coverage Status](https://codecov.io/gh/geopandas/geopandas/branch/main/graph/badge.svg)](https://codecov.io/gh/geopandas/geopandas)
3 [![Join the chat at https://gitter.im/geopandas/geopandas](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/geopandas/geopandas?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
4 [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/geopandas/geopandas/main)
5 [![DOI](https://zenodo.org/badge/11002815.svg)](https://zenodo.org/badge/latestdoi/11002815)
6
7 GeoPandas
8 ---------
29
310 Python tools for geographic data
411
3441 - ``shapely``
3542 - ``fiona``
3643 - ``pyproj``
44 - ``packaging``
3745
3846 Further, ``matplotlib`` is an optional dependency, required
3947 for plotting, and [``rtree``](https://github.com/Toblerity/rtree) is an optional
2525
2626 self.df1, self.df2 = df1, df2
2727
28 def time_sjoin(self, op):
29 sjoin(self.df1, self.df2, op=op)
28 def time_sjoin(self, predicate):
29 sjoin(self.df1, self.df2, predicate=predicate)
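
The ``op`` → ``predicate`` rename in this benchmark mirrors the public ``sjoin`` API. A minimal sketch of the new keyword, using the bundled demo datasets (variable names are placeholders)::

    import geopandas

    cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))
    countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

    # 'predicate' replaces the deprecated 'op' keyword of sjoin
    joined = geopandas.sjoin(cities, countries, how="inner", predicate="within")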
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.10
5 - cython
6 # required
7 - shapely
8 - fiona
9 - pyproj
10 - geos
11 - packaging
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 #- geopy
20 - SQLalchemy
21 - libspatialite
22 - pyarrow
23 - pip
24 - pip:
25 - geopy
26 - mapclassify>=2.4.0
27 # dev versions of packages
28 - --pre --extra-index https://pypi.anaconda.org/scipy-wheels-nightly/simple
29 - numpy
30 - git+https://github.com/pandas-dev/pandas.git@main
31 - git+https://github.com/matplotlib/matplotlib.git@main
32 # - git+https://github.com/Toblerity/Shapely.git@main
33 - git+https://github.com/pygeos/pygeos.git@master
34 - git+https://github.com/python-visualization/folium.git@main
35 - git+https://github.com/geopandas/xyzservices.git@main
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.10
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 - pygeos
11 - packaging
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib-base
20 - mapclassify
21 - folium
22 - xyzservices
23 - scipy
24 - geopy
25 # installed in tests.yaml, because not available on windows
26 # - postgis
27 - SQLalchemy
28 - psycopg2
29 - libspatialite
30 - geoalchemy2
31 - pyarrow
32 # - pyogrio
33 # doctest testing
34 - pytest-doctestplus
35 - pip
36 - pip:
37 - git+https://github.com/geopandas/pyogrio.git@main
+0 -28 ci/envs/37-latest-conda-forge.yaml (deleted)
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.7
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 - pygeos
11 # testing
12 - pytest
13 - pytest-cov
14 - pytest-xdist
15 - fsspec
16 # optional
17 - rtree
18 - matplotlib
19 - mapclassify
20 - folium
21 - xyzservices
22 - scipy
23 - geopy
24 - SQLalchemy
25 - libspatialite
26 - pyarrow
27
+0 -28 ci/envs/37-latest-defaults.yaml (deleted)
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.7
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 - geos
11 # testing
12 - pytest
13 - pytest-cov
14 - pytest-xdist
15 - fsspec
16 # optional
17 - rtree
18 - matplotlib
19 #- geopy
20 - SQLalchemy
21 - libspatialite
22 - pip:
23 - geopy
24 - mapclassify
25 - pyarrow
26 - folium
27 - xyzservices
+0 -28 ci/envs/37-minimal.yaml (deleted)
0 name: test
1 channels:
2 - defaults
3 - conda-forge
4 dependencies:
5 - python=3.7
6 # required
7 - numpy=1.18
8 - pandas==0.25
9 - shapely=1.6
10 - fiona=1.8.13
11 #- pyproj
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 - matplotlib=3.1
21 # - mapclassify=2.4.0 - doesn't build due to conflicts
22 - geopy
23 - SQLalchemy
24 - libspatialite
25 - pyarrow
26 - pip:
27 - pyproj==2.2.2
+0 -29 ci/envs/37-pd10.yaml (deleted)
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.7
5 # required
6 - pandas=1.0
7 - shapely
8 - fiona
9 - numpy=<1.19
10 #- pyproj
11 - geos
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 #- geopy
21 - SQLalchemy
22 - libspatialite
23 - pip
24 - pip:
25 - pyproj==3.0.1
26 - geopy
27 - mapclassify==2.4.0
28 - pyarrow
+0 -33 ci/envs/38-dev.yaml (deleted)
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.8
5 - cython
6 # required
7 - fiona
8 - pyproj
9 - geos
10 # testing
11 - pytest
12 - pytest-cov
13 - pytest-xdist
14 - fsspec
15 # optional
16 - rtree
17 #- geopy
18 - SQLalchemy
19 - libspatialite
20 - pyarrow
21 - pip:
22 - geopy
23 - mapclassify>=2.4.0
24 # dev versions of packages
25 - git+https://github.com/numpy/numpy.git@main
26 - git+https://github.com/pydata/pandas.git@master
27 - git+https://github.com/matplotlib/matplotlib.git@master
28 - git+https://github.com/Toblerity/Shapely.git@master
29 - git+https://github.com/pygeos/pygeos.git@master
30 - git+https://github.com/python-visualization/folium.git@master
31 - git+https://github.com/geopandas/xyzservices.git@main
32
33 dependencies:
44 - python=3.8
55 # required
6 - pandas=1.3.2 # temporary pin because 1.3.3 has regression for overlay (GH2101)
6 - pandas
77 - shapely
8 - fiona
8 # - fiona # build with only pyogrio
9 - libgdal
910 - pyproj
1011 - pygeos
12 - packaging
1113 # testing
1214 - pytest
1315 - pytest-cov
1416 - pytest-xdist
15 - fsspec
17 # - fsspec # to have one non-minimal build without fsspec
1618 # optional
1719 - rtree
1820 - matplotlib
2123 - xyzservices
2224 - scipy
2325 - geopy
24 # installed in tests.yaml, because not available on windows
25 # - postgis
2626 - SQLalchemy
27 - psycopg2
2827 - libspatialite
29 - geoalchemy2
3028 - pyarrow
31 # doctest testing
32 - pytest-doctestplus
29 - pip
30 - pip:
31 - pyogrio
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.8
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 - geos
11 - packaging
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 #- geopy
21 - SQLalchemy
22 - libspatialite
23 - pip
24 - pip:
25 - geopy
26 - mapclassify
27 - pyarrow
28 - folium
29 - xyzservices
0 name: test
1 channels:
2 - defaults
3 - conda-forge
4 dependencies:
5 - python=3.8
6 # required
7 - numpy=1.18
8 - pandas==1.0.5
9 - shapely=1.7
10 - fiona=1.8.13.post1
11 - pyproj=2.6.1.post1
12 - packaging
13 #- pyproj
14 # testing
15 - pytest
16 - pytest-cov
17 - pytest-xdist
18 - fsspec
19 # optional
20 - rtree
21 - matplotlib
22 - matplotlib=3.2
23 - mapclassify=2.4.0
24 - geopy
25 - SQLalchemy
26 - libspatialite
27 - pyarrow
+0 -14 ci/envs/38-no-optional-deps.yaml (deleted)
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.8
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 # testing
11 - pytest
12 - pytest-cov
13 - pytest-xdist
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.8
5 # required
6 - pandas=1.1
7 - shapely
8 - fiona
9 - numpy=<1.19
10 - pyproj=3.1.0
11 #- pyproj
12 - geos
13 - packaging
14 # testing
15 - pytest
16 - pytest-cov
17 - pytest-xdist
18 - fsspec
19 # optional
20 - rtree
21 - matplotlib
22 #- geopy
23 - SQLalchemy
24 - libspatialite
25 - pip
26 - pip:
27 - geopy
28 - mapclassify==2.4.0
29 - pyarrow
33 dependencies:
44 - python=3.9
55 # required
6 - pandas
6 - pandas=1.3
77 - shapely
88 - fiona
99 - pyproj
1010 - pygeos
11 - packaging
1112 # testing
1213 - pytest
1314 - pytest-cov
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.9
5 # required
6 - pandas
7 - shapely
8 - fiona
9 - pyproj
10 - packaging
11 # testing
12 - pytest
13 - pytest-cov
14 - pytest-xdist
0 name: test
1 channels:
2 - conda-forge
3 dependencies:
4 - python=3.9
5 # required
6 - pandas=1.2
7 - shapely
8 - fiona
9 - pyproj
10 - pygeos
11 - packaging
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 - mapclassify
21 - folium
22 - xyzservices
23 - scipy
24 - geopy
25 # installed in tests.yaml, because not available on windows
26 # - postgis
27 - SQLalchemy
28 - psycopg2
29 - libspatialite
30 - geoalchemy2
31 - pyarrow
32 # doctest testing
33 - pytest-doctestplus
3838 - libpysal=4.5.1
3939 - pygeos=0.10.2
4040 - xyzservices=2021.9.1
41 - packaging=21.0
4142 - pip
4243 - pip:
4344 - sphinx-toggleprompt
4343
4444 ## Download
4545
46 You can download all versions in SVG and PNG from the [GitHub repository](https://github.com/geopandas/geopandas/tree/master/doc/source/_static/logo).
46 You can download all versions in SVG and PNG from the [GitHub repository](https://github.com/geopandas/geopandas/tree/main/doc/source/_static/logo).
4747
4848
4949 ## Colors
2020
2121 - All existing tests should pass. Please make sure that the test
2222 suite passes, both locally and on
23 `GitHub Actions <hhttps://github.com/geopandas/geopandas/actions>`_. Status on
23 `GitHub Actions <https://github.com/geopandas/geopandas/actions>`_. Status on
2424 GHA will be visible on a pull request. GHA are automatically enabled
2525 on your own fork as well. To trigger a check, make a PR to your own fork.
2626
4343 imports when possible, and explicit relative imports for local
4444 imports when necessary in tests.
4545
46 - GeoPandas supports Python 3.7+ only. The last version of GeoPandas
46 - GeoPandas supports Python 3.8+ only. The last version of GeoPandas
4747 supporting Python 2 is 0.6.
4848
4949
113113 Creating a branch
114114 ~~~~~~~~~~~~~~~~~~
115115
116 You want your master branch to reflect only production-ready code, so create a
116 You want your main branch to reflect only production-ready code, so create a
117117 feature branch for making your changes. For example::
118118
119119 git branch shiny-new-feature
128128 what the branch brings to *GeoPandas*. You can have many shiny-new-features
129129 and switch in between them using the git checkout command.
130130
131 To update this branch, you need to retrieve the changes from the master branch::
131 To update this branch, you need to retrieve the changes from the main branch::
132132
133133 git fetch upstream
134 git rebase upstream/master
135
136 This will replay your commits on top of the latest GeoPandas git master. If this
134 git rebase upstream/main
135
136 This will replay your commits on top of the latest GeoPandas git main. If this
137137 leads to merge conflicts, you must resolve these before submitting your pull
138138 request. If you have uncommitted changes, you will need to ``stash`` them prior
139139 to updating. This will effectively store your changes and they can be reapplied
154154 - Make sure that you have :ref:`cloned the repository <contributing.forking>`
155155 - ``cd`` to the *geopandas* source directory
156156
157 Tell conda to create a new environment, named ``geopandas_dev``, or any other name you would like
157 Using the provided environment
158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159
160 *GeoPandas* provides an environment which includes the required dependencies for development.
161 The environment file is located in the top level of the repo and is named ``environment-dev.yml``.
162 You can create this environment by navigating to the *GeoPandas* source directory
163 and running::
164
165 conda env create -f environment-dev.yml
166
167 This will create a new conda environment named ``geopandas_dev``.
168
169 Creating the environment manually
170 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
171
172 Alternatively, it is possible to create a development environment manually. To do this,
173 tell conda to create a new environment named ``geopandas_dev``, or any other name you would like
158174 for this environment, by running::
159175
160176 conda create -n geopandas_dev python
162178 This will create the new environment, and not touch any of your existing environments,
163179 nor any existing python installation.
164180
181 Working with the environment
182 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
183
165184 To work in this environment, you need to ``activate`` it. The instructions below
166185 should work for both Windows, Mac and Linux::
167186
182201
183202 At this point you can easily do a *development* install, as detailed in the next sections.
184203
204
185205 3) Installing Dependencies
186206 --------------------------
187207
188208 To run *GeoPandas* in an development environment, you must first install
189 *GeoPandas*'s dependencies. We suggest doing so using the following commands
190 (executed after your development environment has been activated)::
209 *GeoPandas*'s dependencies. If you used the provided environment in section 2, skip this
210 step and continue to section 4. If you created the environment manually, we suggest installing
211 dependencies using the following commands (executed after your development environment has been activated)::
191212
192213 conda install -c conda-forge pandas fiona shapely pyproj rtree pytest
193214
252273 <http://www.sphinx-doc.org/en/stable/rest.html#rst-primer>`_ and MyST syntax for ``md``
253274 files `explained here <https://myst-parser.readthedocs.io/en/latest/index.html>`_.
254275 The docstrings follow the `Numpy Docstring standard
255 <https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_. Some pages
276 <https://github.com/numpy/numpy/blob/main/doc/HOWTO_DOCUMENT.rst.txt>`_. Some pages
256277 and examples are Jupyter notebooks converted to docs using `nbsphinx
257278 <https://nbsphinx.readthedocs.io/>`_. Jupyter notebooks should be stored without the output.
258279
307328 submitting code to run the check yourself::
308329
309330 black geopandas
310 git diff upstream/master -u -- "*.py" | flake8 --diff
331 git diff upstream/main -u -- "*.py" | flake8 --diff
311332
312333 to auto-format your code. Additionally, many editors have plugins that will
313334 apply ``black`` as you edit files.
314335
315336 Optionally (but recommended), you can setup `pre-commit hooks <https://pre-commit.com/>`_
316 to automatically run ``black`` and ``flake8`` when you make a git commit. This
317 can be done by installing ``pre-commit``::
337 to automatically run ``black`` and ``flake8`` when you make a git commit. If you did not
338 use the provided development environment in ``environment-dev.yml``, you must first install ``pre-commit``::
318339
319340 $ python -m pip install pre-commit
320341
8585
8686 # General information about the project.
8787 project = u"GeoPandas"
88 copyright = u"2013–2021, GeoPandas developers"
88 copyright = u"2013–2022, GeoPandas developers"
8989
9090 # The version info for the project you're documenting, acts as replacement for
9191 # |version| and |release|, also used in various other places throughout the
9292 # built documents.
9393 import geopandas
9494
95 version = release = geopandas.__version__
95 release = geopandas.__version__
96 version = release
97 if "+" in version:
98 version, remainder = release.split("+")
99 if not remainder.startswith("0"):
100 version = version + ".dev+" + remainder.split(".")[0]
96101
97102 # The language for content autogenerated by Sphinx. Refer to documentation
98103 # for a list of supported languages.
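
A worked illustration of the version-mangling block above; the input string is an assumed example of the local-version format versioneer emits for untagged commits::

    release = "0.10.2+25.g04d377f"  # assumed example value
    version = release
    if "+" in version:
        version, remainder = release.split("+")
        if not remainder.startswith("0"):
            version = version + ".dev+" + remainder.split(".")[0]
    print(version)  # -> "0.10.2.dev+25"; exact releases pass through unchanged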
325330 .. note::
326331
327332 | This page was generated from `{{ docname }}`__.
328 | Interactive online version: :raw-html:`<a href="https://mybinder.org/v2/gh/geopandas/geopandas/master?urlpath=lab/tree/doc/source/{{ docname }}"><img alt="Binder badge" src="https://mybinder.org/badge_logo.svg" style="vertical-align:text-bottom"></a>`
329
330 __ https://github.com/geopandas/geopandas/blob/master/doc/source/{{ docname }}
333 | Interactive online version: :raw-html:`<a href="https://mybinder.org/v2/gh/geopandas/geopandas/main?urlpath=lab/tree/doc/source/{{ docname }}"><img alt="Binder badge" src="https://mybinder.org/badge_logo.svg" style="vertical-align:text-bottom"></a>`
334
335 __ https://github.com/geopandas/geopandas/blob/main/doc/source/{{ docname }}
331336 """
332337
333338 # --Options for sphinx extensions -----------------------------------------------
6868 .. autosummary::
6969 :toctree: api/
7070
71 GeoSeries.clip_by_rect
7172 GeoSeries.difference
7273 GeoSeries.intersection
7374 GeoSeries.symmetric_difference
7474 * 'sum'
7575 * 'mean'
7676 * 'median'
77 * function
78 * string function name
79 * list of functions and/or function names, e.g. [np.sum, 'mean']
80 * dict of axis labels -> functions, function names or list of such.
81
82 For example, to get the number of countries on each continent,
83 as well as the populations of the largest and smallest country of each,
84 we can aggregate the ``'name'`` column using ``'count'``,
85 and the ``'pop_est'`` column using ``'min'`` and ``'max'``:
86
87 .. ipython:: python
88
89 world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
90 continents = world.dissolve(
91 by="continent",
92 aggfunc={
93 "name": "count",
94 "pop_est": ["min", "max"],
95 },
96 )
97
98 continents.head()
5959
6060 .. method:: GeoSeries.scale(self, xfact=1.0, yfact=1.0, zfact=1.0, origin='center')
6161
62 Scale the geometries of the :class:`~geopandas.GeoSeries` along each (x, y, z) dimensio.
62 Scale the geometries of the :class:`~geopandas.GeoSeries` along each (x, y, z) dimension.
6363
6464 .. method:: GeoSeries.skew(self, angle, origin='center', use_radians=False)
6565
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "id": "c554e753",
5 "metadata": {},
46 "source": [
57 "# Interactive mapping\n",
68 "\n",
911 "Creating maps for interactive exploration mirrors the API of [static plots](../reference/api/geopandas.GeoDataFrame.plot.html) in an [explore()](../reference/api/geopandas.GeoDataFrame.explore.html) method of a GeoSeries or GeoDataFrame.\n",
1012 "\n",
1113 "Loading some example data:"
12 ],
13 "metadata": {}
14 ]
1415 },
1516 {
1617 "cell_type": "code",
1718 "execution_count": null,
19 "id": "caf2fbd5",
20 "metadata": {},
21 "outputs": [],
1822 "source": [
1923 "import geopandas\n",
2024 "\n",
2125 "nybb = geopandas.read_file(geopandas.datasets.get_path('nybb'))\n",
2226 "world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))\n",
2327 "cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))"
24 ],
25 "outputs": [],
26 "metadata": {}
28 ]
2729 },
2830 {
2931 "cell_type": "markdown",
32 "id": "56bf1bcf",
33 "metadata": {},
3034 "source": [
3135 "The simplest option is to use `GeoDataFrame.explore()`:"
32 ],
33 "metadata": {}
36 ]
3437 },
3538 {
3639 "cell_type": "code",
3740 "execution_count": null,
41 "id": "6b484ecc",
42 "metadata": {},
43 "outputs": [],
3844 "source": [
3945 "nybb.explore()"
40 ],
41 "outputs": [],
42 "metadata": {}
46 ]
4347 },
4448 {
4549 "cell_type": "markdown",
50 "id": "7a797389",
51 "metadata": {},
4652 "source": [
4753 "Interactive plotting offers largely the same customisation as static one plus some features on top of that. Check the code below which plots a customised choropleth map. You can use `\"BoroName\"` column with NY boroughs names as an input of the choropleth, show (only) its name in the tooltip on hover but show all values on click. You can also pass custom background tiles (either a name supported by folium, a name recognized by `xyzservices.providers.query_name()`, XYZ URL or `xyzservices.TileProvider` object), specify colormap (all supported by `matplotlib`) and specify black outline."
48 ],
49 "metadata": {}
54 ]
55 },
56 {
57 "cell_type": "markdown",
58 "id": "798bf532",
59 "metadata": {},
60 "source": [
61 "<div class=\"alert alert-info\">\n",
62 "Note\n",
63 "\n",
64 "Note that the GeoDataFrame needs to have a CRS set if you want to use background tiles.\n",
65 "</div>"
66 ]
5067 },
5168 {
5269 "cell_type": "code",
5370 "execution_count": null,
71 "id": "94b4ff24",
72 "metadata": {},
73 "outputs": [],
5474 "source": [
5575 "nybb.explore( \n",
5676 " column=\"BoroName\", # make choropleth based on \"BoroName\" column\n",
6080 " cmap=\"Set1\", # use \"Set1\" matplotlib colormap\n",
6181 " style_kwds=dict(color=\"black\") # use black outline\n",
6282 " )"
63 ],
64 "outputs": [],
65 "metadata": {}
83 ]
6684 },
6785 {
6886 "cell_type": "markdown",
87 "id": "5a10291e",
88 "metadata": {},
6989 "source": [
7090 "The `explore()` method returns a `folium.Map` object, which can also be passed directly (as you do with `ax` in `plot()`). You can then use folium functionality directly on the resulting map. In the example below, you can plot two GeoDataFrames on the same map and add layer control using folium. You can also add additional tiles allowing you to change the background directly in the map."
71 ],
72 "metadata": {}
91 ]
7392 },
7493 {
7594 "cell_type": "code",
7695 "execution_count": null,
96 "id": "cba9970b",
97 "metadata": {},
98 "outputs": [],
7799 "source": [
78100 "import folium\n",
79101 "\n",
99121 "folium.LayerControl().add_to(m) # use folium to add layer control\n",
100122 "\n",
101123 "m # show map"
102 ],
103 "outputs": [],
104 "metadata": {}
124 ]
105125 }
106126 ],
107127 "metadata": {
125145 },
126146 "nbformat": 4,
127147 "nbformat_minor": 5
128 }
148 }
2828 the ``layer`` keyword::
2929
3030 countries_gdf = geopandas.read_file("package.gpkg", layer='countries')
31
31
32 Currently fiona only exposes the default drivers. To display those, type::
33
34 import fiona; fiona.supported_drivers
35
36 There is an `array <https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py>`_
37 of unexposed but supported (depending on the GDAL-build) drivers. One can activate
38 these on runtime by updating the `supported_drivers` dictionary like::
39
40 fiona.supported_drivers["NAS"] = "raw"
41
3242 Where supported in :mod:`fiona`, *geopandas* can also load resources directly from
3343 a web URL, for example for GeoJSON files from `geojson.xyz <http://geojson.xyz/>`_::
3444
134144 ^^^^^^^^^^^^^^^^^^^^
135145
136146 Load in a subset of fields from the file:
147
148 .. note:: Requires Fiona 1.9+
149
150 .. code-block:: python
151
152 gdf = geopandas.read_file(
153 geopandas.datasets.get_path("naturalearth_lowres"),
154 include_fields=["pop_est", "continent", "name"],
155 )
137156
138157 .. note:: Requires Fiona 1.8+
139158
108108 columns of the same GeoDataFrame. The projection is now stored together with geometries per column (directly
109109 on the GeometryArray level).
110110
111 Note that if GeometryArray has assigned projection, it is preferred over the
112 projection passed to GeoSeries or GeoDataFrame during the creation:
111 Note that if GeometryArray has an assigned projection, it cannot be overridden by another inconsistent
112 projection during the creation of a GeoSeries or GeoDataFrame:
113113
114114 .. code-block:: python
115115
120120 - Lat[north]: Geodetic latitude (degree)
121121 - Lon[east]: Geodetic longitude (degree)
122122 ...
123 >>> GeoSeries(array, crs=3395).crs # crs=3395 is ignored as array already has CRS
124 FutureWarning: CRS mismatch between CRS of the passed geometries and 'crs'. Use 'GeoDataFrame.set_crs(crs, allow_override=True)' to overwrite CRS or 'GeoDataFrame.to_crs(crs)' to reproject geometries. CRS mismatch will raise an error in the future versions of GeoPandas.
123 >>> GeoSeries(array, crs=4326) # crs=4326 is okay, as it matches the existing CRS
124 >>> GeoSeries(array, crs=3395) # crs=3395 is forbidden as array already has CRS
125 ValueError: CRS mismatch between CRS of the passed geometries and 'crs'. Use 'GeoSeries.set_crs(crs, allow_override=True)' to overwrite CRS or 'GeoSeries.to_crs(crs)' to reproject geometries.
125126 GeoSeries(array, crs=3395).crs
126127
127 <Geographic 2D CRS: EPSG:4326>
128 Name: WGS 84
129 Axis Info [ellipsoidal]:
130 - Lat[north]: Geodetic latitude (degree)
131 - Lon[east]: Geodetic longitude (degree)
132 ...
133
134 If you want to overwrite projection, you can then assign it to the GeoSeries
128 If you want to overwrite the projection, you can then assign it to the GeoSeries
135129 manually or re-project geometries to the target projection using either
136130 ``GeoSeries.set_crs(epsg=3395, allow_override=True)`` or
137131 ``GeoSeries.to_crs(epsg=3395)``.
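
A minimal sketch of both escape hatches; the series construction below is an assumed stand-in for the ``array`` used above::

    from geopandas import GeoSeries
    from shapely.geometry import Point

    s = GeoSeries([Point(0, 0), Point(1, 1)], crs=4326)  # assumed input

    # overwrite the CRS metadata without touching the coordinates
    s_overridden = s.set_crs(epsg=3395, allow_override=True)

    # or reproject the coordinates to the target CRS
    s_projected = s.to_crs(epsg=3395)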
211211 More Examples
212212 -------------
213213
214 A larger set of examples of the use of :meth:`~geopandas.GeoDataFrame.overlay` can be found `here <https://nbviewer.jupyter.org/github/geopandas/geopandas/blob/master/doc/source/gallery/overlays.ipynb>`_
214 A larger set of examples of the use of :meth:`~geopandas.GeoDataFrame.overlay` can be found `here <https://nbviewer.jupyter.org/github/geopandas/geopandas/blob/main/doc/source/gallery/overlays.ipynb>`_
215215
216216
217217
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "metadata": {},
45 "source": [
56 "# Spatial Joins\n",
67 "\n",
1112 "A common use case might be a spatial join between a point layer and a polygon layer where you want to retain the point geometries and grab the attributes of the intersecting polygons.\n",
1213 "\n",
1314 "![illustration](https://web.natur.cuni.cz/~langhamr/lectures/vtfg1/mapinfo_1/about_gis/Image23.gif)"
14 ],
15 "metadata": {}
16 },
17 {
18 "cell_type": "markdown",
15 ]
16 },
17 {
18 "cell_type": "markdown",
19 "metadata": {},
1920 "source": [
2021 "\n",
2122 "## Types of spatial joins\n",
8384 " 0101000000F0D88AA0E1A4EEBF7052F7E5B115E9BF | 2 | 20\n",
8485 "(4 rows) \n",
8586 "```"
86 ],
87 "metadata": {}
88 },
89 {
90 "cell_type": "markdown",
87 ]
88 },
89 {
90 "cell_type": "markdown",
91 "metadata": {},
9192 "source": [
9293 "## Spatial Joins between two GeoDataFrames\n",
9394 "\n",
9495 "Let's take a look at how we'd implement these using `GeoPandas`. First, load up the NYC test data into `GeoDataFrames`:"
95 ],
96 "metadata": {}
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
96 ]
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
101 "metadata": {},
102 "outputs": [],
101103 "source": [
102104 "%matplotlib inline\n",
103105 "from shapely.geometry import Point\n",
117119 "\n",
118120 "# Make sure they're using the same projection reference\n",
119121 "pointdf.crs = polydf.crs"
120 ],
121 "outputs": [],
122 "metadata": {}
123 },
124 {
125 "cell_type": "code",
126 "execution_count": null,
122 ]
123 },
124 {
125 "cell_type": "code",
126 "execution_count": null,
127 "metadata": {},
128 "outputs": [],
127129 "source": [
128130 "pointdf"
129 ],
130 "outputs": [],
131 "metadata": {}
132 },
133 {
134 "cell_type": "code",
135 "execution_count": null,
131 ]
132 },
133 {
134 "cell_type": "code",
135 "execution_count": null,
136 "metadata": {},
137 "outputs": [],
136138 "source": [
137139 "polydf"
138 ],
139 "outputs": [],
140 "metadata": {}
141 },
142 {
143 "cell_type": "code",
144 "execution_count": null,
140 ]
141 },
142 {
143 "cell_type": "code",
144 "execution_count": null,
145 "metadata": {},
146 "outputs": [],
145147 "source": [
146148 "pointdf.plot()"
147 ],
148 "outputs": [],
149 "metadata": {}
150 },
151 {
152 "cell_type": "code",
153 "execution_count": null,
149 ]
150 },
151 {
152 "cell_type": "code",
153 "execution_count": null,
154 "metadata": {},
155 "outputs": [],
154156 "source": [
155157 "polydf.plot()"
156 ],
157 "outputs": [],
158 "metadata": {}
159 },
160 {
161 "cell_type": "markdown",
158 ]
159 },
160 {
161 "cell_type": "markdown",
162 "metadata": {},
162163 "source": [
163164 "## Joins"
164 ],
165 "metadata": {}
166 },
167 {
168 "cell_type": "code",
169 "execution_count": null,
165 ]
166 },
167 {
168 "cell_type": "code",
169 "execution_count": null,
170 "metadata": {},
171 "outputs": [],
170172 "source": [
171173 "join_left_df = pointdf.sjoin(polydf, how=\"left\")\n",
172174 "join_left_df\n",
173175 "# Note the NaNs where the point did not intersect a boro"
174 ],
175 "outputs": [],
176 "metadata": {}
177 },
178 {
179 "cell_type": "code",
180 "execution_count": null,
176 ]
177 },
178 {
179 "cell_type": "code",
180 "execution_count": null,
181 "metadata": {},
182 "outputs": [],
181183 "source": [
182184 "join_right_df = pointdf.sjoin(polydf, how=\"right\")\n",
183185 "join_right_df\n",
184186 "# Note Staten Island is repeated"
185 ],
186 "outputs": [],
187 "metadata": {}
188 },
189 {
190 "cell_type": "code",
191 "execution_count": null,
187 ]
188 },
189 {
190 "cell_type": "code",
191 "execution_count": null,
192 "metadata": {},
193 "outputs": [],
192194 "source": [
193195 "join_inner_df = pointdf.sjoin(polydf, how=\"inner\")\n",
194196 "join_inner_df\n",
195197 "# Note the lack of NaNs; dropped anything that didn't intersect"
196 ],
197 "outputs": [],
198 "metadata": {}
199 },
200 {
201 "cell_type": "markdown",
198 ]
199 },
200 {
201 "cell_type": "markdown",
202 "metadata": {},
202203 "source": [
203204 "We're not limited to using the `intersection` binary predicate. Any of the `Shapely` geometry methods that return a Boolean can be used by specifying the `op` kwarg."
204 ],
205 "metadata": {}
206 },
207 {
208 "cell_type": "code",
209 "execution_count": null,
210 "source": [
211 "pointdf.sjoin(polydf, how=\"left\", op=\"within\")"
212 ],
213 "outputs": [],
214 "metadata": {}
205 ]
206 },
207 {
208 "cell_type": "code",
209 "execution_count": null,
210 "metadata": {},
211 "outputs": [],
212 "source": [
213 "pointdf.sjoin(polydf, how=\"left\", predicate=\"within\")"
214 ]
215 },
216 {
217 "cell_type": "markdown",
218 "metadata": {},
219 "source": [
220 "We can also conduct a nearest neighbour join with `sjoin_nearest`."
221 ]
222 },
223 {
224 "cell_type": "code",
225 "execution_count": null,
226 "metadata": {},
227 "outputs": [],
228 "source": [
229 "pointdf.sjoin_nearest(polydf, how=\"left\", distance_col=\"Distances\")\n",
230 "# Note the optional Distances column with computed distances between each point\n",
231 "# and the nearest polydf geometry."
232 ]
215233 }
216234 ],
217235 "metadata": {
235253 },
236254 "nbformat": 4,
237255 "nbformat_minor": 4
238 }
256 }
9292 installed correctly.
9393
9494 - `fiona`_ provides binary wheels with the dependencies included for Mac and Linux,
95 but not for Windows.
95 but not for Windows. Alternatively, you can install `pyogrio`_ which does
96 have wheels for Windows.
9697 - `pyproj`_, `rtree`_, and `shapely`_ provide binary wheels with dependencies included
9798 for Mac, Linux, and Windows.
98 - Windows wheels for `shapely`, `fiona`, `pyproj` and `rtree`
99 can be found at `Christopher Gohlke's website
100 <https://www.lfd.uci.edu/~gohlke/pythonlibs/>`_.
10199
102100 Depending on your platform, you might need to compile and install their
103101 C dependencies manually. We refer to the individual packages for more
137135 Required dependencies:
138136
139137 - `numpy`_
140 - `pandas`_ (version 0.25 or later)
141 - `shapely`_ (interface to `GEOS`_)
142 - `fiona`_ (interface to `GDAL`_)
143 - `pyproj`_ (interface to `PROJ`_; version 2.2.0 or later)
138 - `pandas`_ (version 1.0 or later)
139 - `shapely`_ (interface to `GEOS`_; version 1.7 or later)
140 - `fiona`_ (interface to `GDAL`_; version 1.8 or later)
141 - `pyproj`_ (interface to `PROJ`_; version 2.6.1 or later)
142 - `packaging`_
144143
145144 Further, optional dependencies are:
146145
146 - `pyogrio`_ (optional; experimental alternative for fiona)
147147 - `rtree`_ (optional; spatial index to improve performance and required for
148148 overlay operations; interface to `libspatialindex`_)
149149 - `psycopg2`_ (optional; for PostGIS connection)
153153
154154 For plotting, these additional packages may be used:
155155
156 - `matplotlib`_ (>= 3.1.0)
156 - `matplotlib`_ (>= 3.2.0)
157157 - `mapclassify`_ (>= 2.4.0)
158158
159159
210210
211211 .. _fiona: https://fiona.readthedocs.io
212212
213 .. _pyogrio: https://pyogrio.readthedocs.io
214
213215 .. _matplotlib: http://matplotlib.org
214216
215217 .. _geopy: https://github.com/geopy/geopy
241243 .. _PROJ: https://proj.org/
242244
243245 .. _PyGEOS: https://github.com/pygeos/pygeos/
246
247 .. _packaging: https://packaging.pypa.io/en/latest/
11 channels:
22 - conda-forge
33 dependencies:
4 - python
45 # required
56 - fiona>=1.8
6 - pandas>=0.25
7 - pyproj>=2.2.0
8 - shapely>=1.6
9
10 # geodatabase access
11 - psycopg2>=2.5.1
12 - SQLAlchemy>=0.8.3
13
14 # geocoding
15 - geopy
16
17 # plotting
18 - matplotlib>=2.2
19 - mapclassify
7 - pandas>=1.0.0
8 - pygeos
9 - pyproj>=2.6.1.post1
10 - shapely>=1.7
11 - packaging
2012
2113 # testing
2214 - pytest>=3.1.0
2315 - pytest-cov
16 - pytest-xdist
17 - fsspec
2418 - codecov
25
26 # spatial access methods
27 - rtree>=0.8
28
2919 # styling
3020 - black
3121 - pre-commit
22
23 # optional
24 - folium
25 - xyzservices
26 - scipy
27 - libspatialite
28 - geoalchemy2
29 - pyarrow
30 # doctest testing
31 - pytest-doctestplus
32 # geocoding
33 - geopy
34 # geodatabase access
35 - psycopg2>=2.8.0
36 - SQLAlchemy>=1.3
37 # plotting
38 - matplotlib>=3.2
39 - mapclassify
40 # spatial access methods
41 - rtree>=0.9
1717 - rasterio
1818 - geoplot
1919 - folium
20 - packaging
00 # Examples Gallery
11
2 Examples are available in the [documentation](https://geopandas.readthedocs.io/en/latest/gallery/index.html). Source Jupyter notebooks are in [`doc/source/gallery`](https://github.com/geopandas/geopandas/tree/master/doc/source/gallery).
2 Examples are available in the [documentation](https://geopandas.readthedocs.io/en/latest/gallery/index.html). Source Jupyter notebooks are in [`doc/source/gallery`](https://github.com/geopandas/geopandas/tree/main/doc/source/gallery).
2222 import pandas as pd # noqa
2323 import numpy as np # noqa
2424
25 from ._version import get_versions
25 from . import _version
2626
27 __version__ = get_versions()["version"]
28 del get_versions
27 __version__ = _version.get_versions()["version"]
00 import contextlib
1 from distutils.version import LooseVersion
1 from packaging.version import Version
22 import importlib
33 import os
44 import warnings
1414 # pandas compat
1515 # -----------------------------------------------------------------------------
1616
17 PANDAS_GE_10 = str(pd.__version__) >= LooseVersion("1.0.0")
18 PANDAS_GE_11 = str(pd.__version__) >= LooseVersion("1.1.0")
19 PANDAS_GE_115 = str(pd.__version__) >= LooseVersion("1.1.5")
20 PANDAS_GE_12 = str(pd.__version__) >= LooseVersion("1.2.0")
17 PANDAS_GE_11 = Version(pd.__version__) >= Version("1.1.0")
18 PANDAS_GE_115 = Version(pd.__version__) >= Version("1.1.5")
19 PANDAS_GE_12 = Version(pd.__version__) >= Version("1.2.0")
20 PANDAS_GE_13 = Version(pd.__version__) >= Version("1.3.0")
21 PANDAS_GE_14 = Version(pd.__version__) >= Version("1.4.0rc0")
2122
2223
2324 # -----------------------------------------------------------------------------
2526 # -----------------------------------------------------------------------------
2627
2728
28 SHAPELY_GE_17 = str(shapely.__version__) >= LooseVersion("1.7.0")
29 SHAPELY_GE_18 = str(shapely.__version__) >= LooseVersion("1.8")
30 SHAPELY_GE_20 = str(shapely.__version__) >= LooseVersion("2.0")
29 SHAPELY_GE_18 = Version(shapely.__version__) >= Version("1.8")
30 SHAPELY_GE_182 = Version(shapely.__version__) >= Version("1.8.2")
31 SHAPELY_GE_20 = Version(shapely.__version__) >= Version("2.0")
3132
3233 GEOS_GE_390 = shapely.geos.geos_version >= (3, 9, 0)
3334
4647 import pygeos # noqa
4748
4849 # only automatically use pygeos if version is high enough
49 if str(pygeos.__version__) >= LooseVersion("0.8"):
50 if Version(pygeos.__version__) >= Version("0.8"):
5051 HAS_PYGEOS = True
51 PYGEOS_GE_09 = str(pygeos.__version__) >= LooseVersion("0.9")
52 PYGEOS_GE_010 = str(pygeos.__version__) >= LooseVersion("0.10")
52 PYGEOS_GE_09 = Version(pygeos.__version__) >= Version("0.9")
53 PYGEOS_GE_010 = Version(pygeos.__version__) >= Version("0.10")
5354 else:
5455 warnings.warn(
5556 "The installed version of PyGEOS is too old ({0} installed, 0.8 required),"
9192 import pygeos # noqa
9293
9394 # validate the pygeos version
94 if not str(pygeos.__version__) >= LooseVersion("0.8"):
95 if not Version(pygeos.__version__) >= Version("0.8"):
9596 raise ImportError(
9697 "PyGEOS >= 0.8 is required, version {0} is installed".format(
9798 pygeos.__version__
148149 )
149150 yield
150151
151
152 elif (str(np.__version__) >= LooseVersion("1.21")) and not SHAPELY_GE_20:
152 elif (Version(np.__version__) >= Version("1.21")) and not SHAPELY_GE_20:
153153
154154 @contextlib.contextmanager
155155 def ignore_shapely2_warnings():
161161 )
162162 yield
163163
164
165164 else:
166165
167166 @contextlib.contextmanager
219218 except ImportError:
220219 HAS_RTREE = False
221220
221
222222 # -----------------------------------------------------------------------------
223223 # pyproj compat
224224 # -----------------------------------------------------------------------------
225225
226 PYPROJ_LT_3 = LooseVersion(pyproj.__version__) < LooseVersion("3")
227 PYPROJ_GE_31 = LooseVersion(pyproj.__version__) >= LooseVersion("3.1")
226 PYPROJ_LT_3 = Version(pyproj.__version__) < Version("3")
227 PYPROJ_GE_31 = Version(pyproj.__version__) >= Version("3.1")
228 PYPROJ_GE_32 = Version(pyproj.__version__) >= Version("3.2")
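
The ``LooseVersion`` → ``Version`` switch above tracks the deprecation of ``distutils`` (and with it ``distutils.version``); ``packaging`` is added as a hard dependency for exactly this. The same comparison pattern works for any dependency; a minimal sketch with an illustrative flag name::

    from packaging.version import Version

    import numpy as np

    NUMPY_GE_121 = Version(np.__version__) >= Version("1.21")  # illustrative flag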
4949 cls = self.__class__.__name__
5050 description = ""
5151 for key, option in self._options.items():
52 descr = u"{key}: {cur!r} [default: {default!r}]\n".format(
52 descr = "{key}: {cur!r} [default: {default!r}]\n".format(
5353 key=key, cur=self._config[key], default=option.default_value
5454 )
5555 description += descr
5757 if option.doc:
5858 doc_text = "\n".join(textwrap.wrap(option.doc, width=70))
5959 else:
60 doc_text = u"No description available."
60 doc_text = "No description available."
6161 doc_text = textwrap.indent(doc_text, prefix=" ")
6262 description += doc_text + "\n"
6363 space = "\n "
1010
1111 import shapely.geometry
1212 import shapely.geos
13 import shapely.ops
1314 import shapely.wkb
1415 import shapely.wkt
1516
5556 return True
5657 elif isinstance(value, float) and np.isnan(value):
5758 return True
58 elif compat.PANDAS_GE_10 and value is pd.NA:
59 elif value is pd.NA:
5960 return True
6061 else:
6162 return False
727728 #
728729
729730
731 def clip_by_rect(data, xmin, ymin, xmax, ymax):
732 if compat.USE_PYGEOS:
733 return pygeos.clip_by_rect(data, xmin, ymin, xmax, ymax)
734 else:
735 clipped_geometries = np.empty(len(data), dtype=object)
736 clipped_geometries[:] = [
737 shapely.ops.clip_by_rect(s, xmin, ymin, xmax, ymax)
738 if s is not None
739 else None
740 for s in data
741 ]
742 return clipped_geometries
743
744
730745 def difference(data, other):
731746 if compat.USE_PYGEOS:
732747 return _binary_method("difference", data, other)
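
The vectorized helper above backs the ``GeoSeries.clip_by_rect`` method that is new in this release. A minimal usage sketch with the bundled demo dataset::

    import geopandas

    world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

    # clip every geometry to the rectangle (xmin, ymin, xmax, ymax);
    # geometries outside the rectangle become empty rather than being dropped
    clipped = world.geometry.clip_by_rect(-90, -45, 90, 45)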
44 # that just contains the computed version number.
55
66 # This file is released into the public domain. Generated by
7 # versioneer-0.16 (https://github.com/warner/python-versioneer)
7 # versioneer-0.21 (https://github.com/python-versioneer/python-versioneer)
88
99 """Git implementation of _version.py."""
1010
1313 import re
1414 import subprocess
1515 import sys
16 from typing import Callable, Dict
1617
1718
1819 def get_keywords():
2122 # setup.py/versioneer.py will grep for the variable names, so they must
2223 # each be defined on a line of their own. _version.py will just call
2324 # get_keywords().
24 git_refnames = " (HEAD -> master, tag: v0.10.2)"
25 git_full = "04d377f321972801888381356cb6259766eb63b6"
26 keywords = {"refnames": git_refnames, "full": git_full}
25 git_refnames = " (HEAD -> main, tag: v0.11.0)"
26 git_full = "1977b5036b9ca3a034e65ea1f5ba48b7225550a7"
27 git_date = "2022-06-21 08:00:39 +0200"
28 keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
2729 return keywords
2830
2931
4951 """Exception raised if a method is not valid for the current scenario."""
5052
5153
52 LONG_VERSION_PY = {}
53 HANDLERS = {}
54 LONG_VERSION_PY: Dict[str, str] = {}
55 HANDLERS: Dict[str, Dict[str, Callable]] = {}
5456
5557
5658 def register_vcs_handler(vcs, method): # decorator
57 """Decorator to mark a method as the handler for a particular VCS."""
59 """Create decorator to mark a method as the handler of a VCS."""
5860
5961 def decorate(f):
6062 """Store f in HANDLERS[vcs][method]."""
6668 return decorate
6769
6870
69 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False):
71 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None):
7072 """Call the given command(s)."""
7173 assert isinstance(commands, list)
72 p = None
73 for c in commands:
74 process = None
75 for command in commands:
7476 try:
75 dispcmd = str([c] + args)
77 dispcmd = str([command] + args)
7678 # remember shell=False, so use git.cmd on windows, not just git
77 p = subprocess.Popen(
78 [c] + args,
79 process = subprocess.Popen(
80 [command] + args,
7981 cwd=cwd,
82 env=env,
8083 stdout=subprocess.PIPE,
8184 stderr=(subprocess.PIPE if hide_stderr else None),
8285 )
8386 break
84 except EnvironmentError:
87 except OSError:
8588 e = sys.exc_info()[1]
8689 if e.errno == errno.ENOENT:
8790 continue
8891 if verbose:
8992 print("unable to run %s" % dispcmd)
9093 print(e)
91 return None
94 return None, None
9295 else:
9396 if verbose:
9497 print("unable to find command, tried %s" % (commands,))
95 return None
96 stdout = p.communicate()[0].strip()
97 if sys.version_info[0] >= 3:
98 stdout = stdout.decode()
99 if p.returncode != 0:
98 return None, None
99 stdout = process.communicate()[0].strip().decode()
100 if process.returncode != 0:
100101 if verbose:
101102 print("unable to run %s (error)" % dispcmd)
102 return None
103 return stdout
103 print("stdout was %s" % stdout)
104 return None, process.returncode
105 return stdout, process.returncode
104106
105107
106108 def versions_from_parentdir(parentdir_prefix, root, verbose):
107109 """Try to determine the version from the parent directory name.
108110
109 Source tarballs conventionally unpack into a directory that includes
110 both the project name and a version string.
111 """
112 dirname = os.path.basename(root)
113 if not dirname.startswith(parentdir_prefix):
114 if verbose:
115 print(
116 "guessing rootdir is '%s', but '%s' doesn't start with "
117 "prefix '%s'" % (root, dirname, parentdir_prefix)
118 )
119 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
120 return {
121 "version": dirname[len(parentdir_prefix) :],
122 "full-revisionid": None,
123 "dirty": False,
124 "error": None,
125 }
111 Source tarballs conventionally unpack into a directory that includes both
112 the project name and a version string. We will also support searching up
113 two directory levels for an appropriately named parent directory
114 """
115 rootdirs = []
116
117 for _ in range(3):
118 dirname = os.path.basename(root)
119 if dirname.startswith(parentdir_prefix):
120 return {
121 "version": dirname[len(parentdir_prefix) :],
122 "full-revisionid": None,
123 "dirty": False,
124 "error": None,
125 "date": None,
126 }
127 rootdirs.append(root)
128 root = os.path.dirname(root) # up a level
129
130 if verbose:
131 print(
132 "Tried directories %s but none started with prefix %s"
133 % (str(rootdirs), parentdir_prefix)
134 )
135 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
126136
127137
128138 @register_vcs_handler("git", "get_keywords")
134144 # _version.py.
135145 keywords = {}
136146 try:
137 f = open(versionfile_abs, "r")
138 for line in f.readlines():
139 if line.strip().startswith("git_refnames ="):
140 mo = re.search(r'=\s*"(.*)"', line)
141 if mo:
142 keywords["refnames"] = mo.group(1)
143 if line.strip().startswith("git_full ="):
144 mo = re.search(r'=\s*"(.*)"', line)
145 if mo:
146 keywords["full"] = mo.group(1)
147 f.close()
148 except EnvironmentError:
147 with open(versionfile_abs, "r") as fobj:
148 for line in fobj:
149 if line.strip().startswith("git_refnames ="):
150 mo = re.search(r'=\s*"(.*)"', line)
151 if mo:
152 keywords["refnames"] = mo.group(1)
153 if line.strip().startswith("git_full ="):
154 mo = re.search(r'=\s*"(.*)"', line)
155 if mo:
156 keywords["full"] = mo.group(1)
157 if line.strip().startswith("git_date ="):
158 mo = re.search(r'=\s*"(.*)"', line)
159 if mo:
160 keywords["date"] = mo.group(1)
161 except OSError:
149162 pass
150163 return keywords
151164
153166 @register_vcs_handler("git", "keywords")
154167 def git_versions_from_keywords(keywords, tag_prefix, verbose):
155168 """Get version information from git keywords."""
156 if not keywords:
157 raise NotThisMethod("no keywords at all, weird")
169 if "refnames" not in keywords:
170 raise NotThisMethod("Short version file found")
171 date = keywords.get("date")
172 if date is not None:
173 # Use only the last line. Previous lines may contain GPG signature
174 # information.
175 date = date.splitlines()[-1]
176
177 # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant
178 # datestamp. However we prefer "%ci" (which expands to an "ISO-8601
179 # -like" string, which we must then edit to make compliant), because
180 # it's been around since git-1.5.3, and it's too difficult to
181 # discover which version we're using, or to work around using an
182 # older one.
183 date = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
158184 refnames = keywords["refnames"].strip()
159185 if refnames.startswith("$Format"):
160186 if verbose:
161187 print("keywords are unexpanded, not using")
162188 raise NotThisMethod("unexpanded keywords, not a git-archive tarball")
163 refs = set([r.strip() for r in refnames.strip("()").split(",")])
189 refs = {r.strip() for r in refnames.strip("()").split(",")}
164190 # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of
165191 # just "foo-1.0". If we see a "tag: " prefix, prefer those.
166192 TAG = "tag: "
167 tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)])
193 tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)}
168194 if not tags:
169195 # Either we're using git < 1.8.3, or there really are no tags. We use
170196 # a heuristic: assume all version tags have a digit. The old git %d
173199 # between branches and tags. By ignoring refnames without digits, we
174200 # filter out many common branch names like "release" and
175201 # "stabilization", as well as "HEAD" and "master".
176 tags = set([r for r in refs if re.search(r"\d", r)])
202 tags = {r for r in refs if re.search(r"\d", r)}
177203 if verbose:
178204 print("discarding '%s', no digits" % ",".join(refs - tags))
179205 if verbose:
182208 # sorting will prefer e.g. "2.0" over "2.0rc1"
183209 if ref.startswith(tag_prefix):
184210 r = ref[len(tag_prefix) :]
211 # Filter out refs that exactly match prefix or that don't start
212 # with a number once the prefix is stripped (mostly a concern
213 # when prefix is '')
214 if not re.match(r"\d", r):
215 continue
185216 if verbose:
186217 print("picking %s" % r)
187218 return {
189220 "full-revisionid": keywords["full"].strip(),
190221 "dirty": False,
191222 "error": None,
223 "date": date,
192224 }
193225 # no suitable tags, so version is "0+unknown", but full hex is still there
194226 if verbose:
198230 "full-revisionid": keywords["full"].strip(),
199231 "dirty": False,
200232 "error": "no suitable tags",
233 "date": None,
201234 }
202235
203236
204237 @register_vcs_handler("git", "pieces_from_vcs")
205 def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
238 def git_pieces_from_vcs(tag_prefix, root, verbose, runner=run_command):
206239 """Get version from 'git describe' in the root of the source tree.
207240
208241 This only gets called if the git-archive 'subst' keywords were *not*
209242 expanded, and _version.py hasn't already been rewritten with a short
210243 version string, meaning we're inside a checked out source tree.
211244 """
212 if not os.path.exists(os.path.join(root, ".git")):
213 if verbose:
214 print("no .git in %s" % root)
215 raise NotThisMethod("no .git directory")
216
217245 GITS = ["git"]
246 TAG_PREFIX_REGEX = "*"
218247 if sys.platform == "win32":
219248 GITS = ["git.cmd", "git.exe"]
249 TAG_PREFIX_REGEX = r"\*"
250
251 _, rc = runner(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True)
252 if rc != 0:
253 if verbose:
254 print("Directory %s not under git control" % root)
255 raise NotThisMethod("'git rev-parse --git-dir' returned error")
256
220257 # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty]
221258 # if there isn't one, this yields HEX[-dirty] (no NUM)
222 describe_out = run_command(
259 describe_out, rc = runner(
223260 GITS,
224261 [
225262 "describe",
228265 "--always",
229266 "--long",
230267 "--match",
231 "%s*" % tag_prefix,
268 "%s%s" % (tag_prefix, TAG_PREFIX_REGEX),
232269 ],
233270 cwd=root,
234271 )
236273 if describe_out is None:
237274 raise NotThisMethod("'git describe' failed")
238275 describe_out = describe_out.strip()
239 full_out = run_command(GITS, ["rev-parse", "HEAD"], cwd=root)
276 full_out, rc = runner(GITS, ["rev-parse", "HEAD"], cwd=root)
240277 if full_out is None:
241278 raise NotThisMethod("'git rev-parse' failed")
242279 full_out = full_out.strip()
245282 pieces["long"] = full_out
246283 pieces["short"] = full_out[:7] # maybe improved later
247284 pieces["error"] = None
285
286 branch_name, rc = runner(GITS, ["rev-parse", "--abbrev-ref", "HEAD"], cwd=root)
287 # --abbrev-ref was added in git-1.6.3
288 if rc != 0 or branch_name is None:
289 raise NotThisMethod("'git rev-parse --abbrev-ref' returned error")
290 branch_name = branch_name.strip()
291
292 if branch_name == "HEAD":
293 # If we aren't exactly on a branch, pick a branch which represents
294 # the current commit. If all else fails, we are on a branchless
295 # commit.
296 branches, rc = runner(GITS, ["branch", "--contains"], cwd=root)
297 # --contains was added in git-1.5.4
298 if rc != 0 or branches is None:
299 raise NotThisMethod("'git branch --contains' returned error")
300 branches = branches.split("\n")
301
302 # Remove the first line if we're running detached
303 if "(" in branches[0]:
304 branches.pop(0)
305
306 # Strip off the leading "* " from the list of branches.
307 branches = [branch[2:] for branch in branches]
308 if "master" in branches:
309 branch_name = "master"
310 elif not branches:
311 branch_name = None
312 else:
313 # Pick the first branch that is returned. Good or bad.
314 branch_name = branches[0]
315
316 pieces["branch"] = branch_name
248317
249318 # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
250319 # TAG might have hyphens.
262331 # TAG-NUM-gHEX
263332 mo = re.search(r"^(.+)-(\d+)-g([0-9a-f]+)$", git_describe)
264333 if not mo:
265 # unparseable. Maybe git-describe is misbehaving?
334 # unparsable. Maybe git-describe is misbehaving?
266335 pieces["error"] = "unable to parse git-describe output: '%s'" % describe_out
267336 return pieces
268337
288357 else:
289358 # HEX: no tags
290359 pieces["closest-tag"] = None
291 count_out = run_command(GITS, ["rev-list", "HEAD", "--count"], cwd=root)
360 count_out, rc = runner(GITS, ["rev-list", "HEAD", "--count"], cwd=root)
292361 pieces["distance"] = int(count_out) # total number of commits
362
363 # commit date: see ISO-8601 comment in git_versions_from_keywords()
364 date = runner(GITS, ["show", "-s", "--format=%ci", "HEAD"], cwd=root)[0].strip()
365 # Use only the last line. Previous lines may contain GPG signature
366 # information.
367 date = date.splitlines()[-1]
368 pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
293369
294370 return pieces
295371
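Worked example of the date normalisation added above (the date value is
hypothetical); the two chained replaces turn git's "%ci" output into an
ISO-8601-like string:

    "2022-06-20 12:34:56 +0200".replace(" ", "T", 1).replace(" ", "", 1)
    # -> '2022-06-20T12:34:56+0200'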
325401 return rendered
326402
327403
404 def render_pep440_branch(pieces):
405 """TAG[[.dev0]+DISTANCE.gHEX[.dirty]] .
406
407 The ".dev0" means not master branch. Note that .dev0 sorts backwards
408 (a feature branch will appear "older" than the master branch).
409
410 Exceptions:
411 1: no tags. 0[.dev0]+untagged.DISTANCE.gHEX[.dirty]
412 """
413 if pieces["closest-tag"]:
414 rendered = pieces["closest-tag"]
415 if pieces["distance"] or pieces["dirty"]:
416 if pieces["branch"] != "master":
417 rendered += ".dev0"
418 rendered += plus_or_dot(pieces)
419 rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
420 if pieces["dirty"]:
421 rendered += ".dirty"
422 else:
423 # exception #1
424 rendered = "0"
425 if pieces["branch"] != "master":
426 rendered += ".dev0"
427 rendered += "+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
428 if pieces["dirty"]:
429 rendered += ".dirty"
430 return rendered
431
432
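A rough sketch of what the new render_pep440_branch produces; the "pieces"
dict is hand-built for illustration and assumes the helpers defined in this
file:

    pieces = {"closest-tag": "0.11.0", "distance": 3, "short": "abc1234",
              "dirty": False, "branch": "feature-x"}
    render_pep440_branch(pieces)   # -> '0.11.0.dev0+3.gabc1234'
    pieces["branch"] = "master"
    render_pep440_branch(pieces)   # -> '0.11.0+3.gabc1234'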
433 def pep440_split_post(ver):
434 """Split pep440 version string at the post-release segment.
435
436 Returns the release segments before the post-release and the
437 post-release version number (or -1 if no post-release segment is present).
438 """
439 vc = str.split(ver, ".post")
440 return vc[0], int(vc[1] or 0) if len(vc) == 2 else None
441
442
328443 def render_pep440_pre(pieces):
329 """TAG[.post.devDISTANCE] -- No -dirty.
330
331 Exceptions:
332 1: no tags. 0.post.devDISTANCE
333 """
334 if pieces["closest-tag"]:
335 rendered = pieces["closest-tag"]
444 """TAG[.postN.devDISTANCE] -- No -dirty.
445
446 Exceptions:
447 1: no tags. 0.post0.devDISTANCE
448 """
449 if pieces["closest-tag"]:
336450 if pieces["distance"]:
337 rendered += ".post.dev%d" % pieces["distance"]
338 else:
339 # exception #1
340 rendered = "0.post.dev%d" % pieces["distance"]
451 # update the post release segment
452 tag_version, post_version = pep440_split_post(pieces["closest-tag"])
453 rendered = tag_version
454 if post_version is not None:
455 rendered += ".post%d.dev%d" % (post_version + 1, pieces["distance"])
456 else:
457 rendered += ".post0.dev%d" % (pieces["distance"])
458 else:
459 # no commits, use the tag as the version
460 rendered = pieces["closest-tag"]
461 else:
462 # exception #1
463 rendered = "0.post0.dev%d" % pieces["distance"]
341464 return rendered
342465
343466
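Illustrative behaviour of the two helpers above (the version strings are
hypothetical):

    pep440_split_post("0.11.0.post2")   # -> ("0.11.0", 2)
    pep440_split_post("0.11.0")         # -> ("0.11.0", None)
    # render_pep440_pre now bumps an existing post segment: a closest-tag
    # of "0.11.0.post2" with distance 5 renders as "0.11.0.post3.dev5",
    # while a plain "0.11.0" tag renders as "0.11.0.post0.dev5".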
365488 if pieces["dirty"]:
366489 rendered += ".dev0"
367490 rendered += "+g%s" % pieces["short"]
491 return rendered
492
493
494 def render_pep440_post_branch(pieces):
495 """TAG[.postDISTANCE[.dev0]+gHEX[.dirty]] .
496
497 The ".dev0" means not master branch.
498
499 Exceptions:
500 1: no tags. 0.postDISTANCE[.dev0]+gHEX[.dirty]
501 """
502 if pieces["closest-tag"]:
503 rendered = pieces["closest-tag"]
504 if pieces["distance"] or pieces["dirty"]:
505 rendered += ".post%d" % pieces["distance"]
506 if pieces["branch"] != "master":
507 rendered += ".dev0"
508 rendered += plus_or_dot(pieces)
509 rendered += "g%s" % pieces["short"]
510 if pieces["dirty"]:
511 rendered += ".dirty"
512 else:
513 # exception #1
514 rendered = "0.post%d" % pieces["distance"]
515 if pieces["branch"] != "master":
516 rendered += ".dev0"
517 rendered += "+g%s" % pieces["short"]
518 if pieces["dirty"]:
519 rendered += ".dirty"
368520 return rendered
369521
370522
438590 "full-revisionid": pieces.get("long"),
439591 "dirty": None,
440592 "error": pieces["error"],
593 "date": None,
441594 }
442595
443596 if not style or style == "default":
445598
446599 if style == "pep440":
447600 rendered = render_pep440(pieces)
601 elif style == "pep440-branch":
602 rendered = render_pep440_branch(pieces)
448603 elif style == "pep440-pre":
449604 rendered = render_pep440_pre(pieces)
450605 elif style == "pep440-post":
451606 rendered = render_pep440_post(pieces)
607 elif style == "pep440-post-branch":
608 rendered = render_pep440_post_branch(pieces)
452609 elif style == "pep440-old":
453610 rendered = render_pep440_old(pieces)
454611 elif style == "git-describe":
463620 "full-revisionid": pieces["long"],
464621 "dirty": pieces["dirty"],
465622 "error": None,
623 "date": pieces.get("date"),
466624 }
467625
468626
486644 # versionfile_source is the relative path from the top of the source
487645 # tree (where the .git directory might live) to this file. Invert
488646 # this to find the root from __file__.
489 for i in cfg.versionfile_source.split("/"):
647 for _ in cfg.versionfile_source.split("/"):
490648 root = os.path.dirname(root)
491649 except NameError:
492650 return {
494652 "full-revisionid": None,
495653 "dirty": None,
496654 "error": "unable to find root of source tree",
655 "date": None,
497656 }
498657
499658 try:
513672 "full-revisionid": None,
514673 "dirty": None,
515674 "error": "unable to compute version",
675 "date": None,
516676 }
0 from collections.abc import Iterable
10 import numbers
21 import operator
32 import warnings
358357 if isinstance(idx, numbers.Integral):
359358 return _geom_to_shapely(self.data[idx])
360359 # array-like, slice
361 if compat.PANDAS_GE_10:
362 # for pandas >= 1.0, validate and convert IntegerArray/BooleanArray
363 # to numpy array, pass-through non-array-like indexers
364 idx = pd.api.indexers.check_array_indexer(self, idx)
365 if isinstance(idx, (Iterable, slice)):
366 return GeometryArray(self.data[idx], crs=self.crs)
367 else:
368 raise TypeError("Index type not supported", idx)
360 # validate and convert IntegerArray/BooleanArray
361 # to numpy array, pass-through non-array-like indexers
362 idx = pd.api.indexers.check_array_indexer(self, idx)
363 return GeometryArray(self.data[idx], crs=self.crs)
369364
370365 def __setitem__(self, key, value):
371 if compat.PANDAS_GE_10:
372 # for pandas >= 1.0, validate and convert IntegerArray/BooleanArray
373 # keys to numpy array, pass-through non-array-like indexers
374 key = pd.api.indexers.check_array_indexer(self, key)
366 # validate and convert IntegerArray/BooleanArray
367 # keys to numpy array, pass-through non-array-like indexers
368 key = pd.api.indexers.check_array_indexer(self, key)
375369 if isinstance(value, pd.Series):
376370 value = value.values
371 if isinstance(value, pd.DataFrame):
372 value = value.values.flatten()
377373 if isinstance(value, (list, np.ndarray)):
378374 value = from_shapely(value)
379375 if isinstance(value, GeometryArray):
419415 return self.__dict__
420416
421417 def __setstate__(self, state):
422 if compat.USE_PYGEOS:
423 geoms = pygeos.from_wkb(state[0])
418 if not isinstance(state, dict):
419 # pickle file saved with pygeos
420 geoms = vectorized.from_wkb(state[0])
424421 self._crs = state[1]
425422 self._sindex = None # pygeos.STRtree could not be pickled yet
426423 self.data = geoms
427424 self.base = None
428425 else:
426 if compat.USE_PYGEOS:
427 state["data"] = vectorized.from_shapely(state["data"])
429428 if "_crs" not in state:
430429 state["_crs"] = None
431430 self.__dict__.update(state)
560559 return self.geom_equals_exact(other, 0.5 * 10 ** (-decimal))
561560 # return _binary_predicate("almost_equals", self, other, decimal=decimal)
562561
563 def equals_exact(self, other, tolerance):
564 warnings.warn(
565 "GeometryArray.equals_exact() is now GeometryArray.geom_equals_exact(). "
566 "GeometryArray.equals_exact() will be deprecated in the future.",
567 FutureWarning,
568 stacklevel=2,
569 )
570 return self._binary_method("equals_exact", self, other, tolerance=tolerance)
571
572 def almost_equals(self, other, decimal):
573 warnings.warn(
574 "GeometryArray.almost_equals() is now GeometryArray.geom_almost_equals(). "
575 "GeometryArray.almost_equals() will be deprecated in the future.",
576 FutureWarning,
577 stacklevel=2,
578 )
579 return self.geom_equals_exact(other, 0.5 * 10 ** (-decimal))
580
581562 #
582563 # Binary operations that return new geometries
583564 #
565
566 def clip_by_rect(self, xmin, ymin, xmax, ymax):
567 return GeometryArray(
568 vectorized.clip_by_rect(self.data, xmin, ymin, xmax, ymax), crs=self.crs
569 )
584570
585571 def difference(self, other):
586572 return GeometryArray(
739725
740726 >>> a = a.to_crs(3857)
741727 >>> to_wkt(a)
742 array(['POINT (111319 111325)', 'POINT (222639 222684)',
743 'POINT (333958 334111)'], dtype=object)
728 array(['POINT (111319.490793 111325.142866)',
729 'POINT (222638.981587 222684.208506)',
730 'POINT (333958.47238 334111.171402)'], dtype=object)
744731 >>> a.crs # doctest: +SKIP
745732 <Projected CRS: EPSG:3857>
746733 Name: WGS 84 / Pseudo-Mercator
878865 def x(self):
879866 """Return the x location of point geometries in a GeoSeries"""
880867 if (self.geom_type[~self.isna()] == "Point").all():
881 return vectorized.get_x(self.data)
868 empty = self.is_empty
869 if empty.any():
870 nonempty = ~empty
871 coords = np.full_like(nonempty, dtype=float, fill_value=np.nan)
872 coords[nonempty] = vectorized.get_x(self.data[nonempty])
873 return coords
874 else:
875 return vectorized.get_x(self.data)
882876 else:
883877 message = "x attribute access only provided for Point geometries"
884878 raise ValueError(message)
887881 def y(self):
888882 """Return the y location of point geometries in a GeoSeries"""
889883 if (self.geom_type[~self.isna()] == "Point").all():
890 return vectorized.get_y(self.data)
884 empty = self.is_empty
885 if empty.any():
886 nonempty = ~empty
887 coords = np.full_like(nonempty, dtype=float, fill_value=np.nan)
888 coords[nonempty] = vectorized.get_y(self.data[nonempty])
889 return coords
890 else:
891 return vectorized.get_y(self.data)
891892 else:
892893 message = "y attribute access only provided for Point geometries"
893894 raise ValueError(message)
896897 def z(self):
897898 """Return the z location of point geometries in a GeoSeries"""
898899 if (self.geom_type[~self.isna()] == "Point").all():
899 return vectorized.get_z(self.data)
900 empty = self.is_empty
901 if empty.any():
902 nonempty = ~empty
903 coords = np.full_like(nonempty, dtype=float, fill_value=np.nan)
904 coords[nonempty] = vectorized.get_z(self.data[nonempty])
905 return coords
906 else:
907 return vectorized.get_z(self.data)
900908 else:
901909 message = "z attribute access only provided for Point geometries"
902910 raise ValueError(message)
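A minimal sketch of the behaviour added in the three properties above: empty
points now yield NaN coordinates (assumes geopandas with this patch applied):

    from shapely.geometry import Point
    import geopandas
    s = geopandas.GeoSeries([Point(1, 2), Point()])
    s.x   # -> roughly Series([1.0, nan])
    s.y   # -> roughly Series([2.0, nan])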
10441052 dtype
10451053 ):
10461054 string_values = to_wkt(self)
1047 if compat.PANDAS_GE_10:
1048 pd_dtype = pd.api.types.pandas_dtype(dtype)
1049 if isinstance(pd_dtype, pd.StringDtype):
1050 # ensure to return a pandas string array instead of numpy array
1051 return pd.array(string_values, dtype=pd_dtype)
1055 pd_dtype = pd.api.types.pandas_dtype(dtype)
1056 if isinstance(pd_dtype, pd.StringDtype):
1057 # ensure to return a pandas string array instead of numpy array
1058 return pd.array(string_values, dtype=pd_dtype)
10521059 return string_values.astype(dtype, copy=False)
10531060 else:
10541061 return np.array(self, dtype=dtype, copy=copy)
11811188 Returns
11821189 -------
11831190 values : ndarray
1184 An array suitable for factoraization. This should maintain order
1191 An array suitable for factorization. This should maintain order
11851192 and be a supported dtype (Float64, Int64, UInt64, String, Object).
11861193 By default, the extension array is cast to object dtype.
11871194 na_value : object
973973 other : GeoSeries or geometric object
974974 The GeoSeries (elementwise) or geometric object to compare to.
975975 decimal : int
976 Decimal place presion used when testing for approximate equality.
976 Decimal place precision used when testing for approximate equality.
977977 align : bool (default True)
978978 If True, automatically aligns GeoSeries based on their indices.
979979 If False, the order of elements is preserved.
10411041 other : GeoSeries or geometric object
10421042 The GeoSeries (elementwise) or geometric object to compare to.
10431043 tolerance : float
1044 Decimal place presion used when testing for approximate equality.
1044 Decimal place precision used when testing for approximate equality.
10451045 align : bool (default True)
10461046 If True, automatically aligns GeoSeries based on their indices.
10471047 If False, the order of elements is preserved.
25342534 GeoSeries.union
25352535 """
25362536 return _binary_geo("intersection", self, other, align)
2537
2538 def clip_by_rect(self, xmin, ymin, xmax, ymax):
2539 """Returns a ``GeoSeries`` of the portions of geometry within the given
2540 rectangle.
2541
2542 Note that the results are not exactly equal to
2543 :meth:`~GeoSeries.intersection()`. E.g. in edge cases,
2544 :meth:`~GeoSeries.clip_by_rect()` will not return a point just touching the
2545 rectangle. Check the examples section below for some of these exceptions.
2546
2547 The geometry is clipped in a fast but possibly dirty way. The output is not
2548 guaranteed to be valid. No exceptions will be raised for topological errors.
2549
2550 Note: empty geometries or geometries that do not overlap with the specified
2551 bounds will result in ``GEOMETRYCOLLECTION EMPTY``.
2552
2553 Parameters
2554 ----------
2555 xmin: float
2556 Minimum x value of the rectangle
2557 ymin: float
2558 Minimum y value of the rectangle
2559 xmax: float
2560 Maximum x value of the rectangle
2561 ymax: float
2562 Maximum y value of the rectangle
2563
2564 Returns
2565 -------
2566 GeoSeries
2567
2568 Examples
2569 --------
2570 >>> from shapely.geometry import Polygon, LineString, Point
2571 >>> s = geopandas.GeoSeries(
2572 ... [
2573 ... Polygon([(0, 0), (2, 2), (0, 2)]),
2574 ... Polygon([(0, 0), (2, 2), (0, 2)]),
2575 ... LineString([(0, 0), (2, 2)]),
2576 ... LineString([(2, 0), (0, 2)]),
2577 ... Point(0, 1),
2578 ... ],
2579 ... crs=3857,
2580 ... )
2581 >>> bounds = (0, 0, 1, 1)
2582 >>> s
2583 0 POLYGON ((0.000 0.000, 2.000 2.000, 0.000 2.00...
2584 1 POLYGON ((0.000 0.000, 2.000 2.000, 0.000 2.00...
2585 2 LINESTRING (0.000 0.000, 2.000 2.000)
2586 3 LINESTRING (2.000 0.000, 0.000 2.000)
2587 4 POINT (0.000 1.000)
2588 dtype: geometry
2589 >>> s.clip_by_rect(*bounds)
2590 0 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00...
2591 1 POLYGON ((0.000 0.000, 0.000 1.000, 1.000 1.00...
2592 2 LINESTRING (0.000 0.000, 1.000 1.000)
2593 3 GEOMETRYCOLLECTION EMPTY
2594 4 GEOMETRYCOLLECTION EMPTY
2595 dtype: geometry
2596
2597 See also
2598 --------
2599 GeoSeries.intersection
2600 """
2601 from .geoseries import GeoSeries
2602
2603 geometry_array = GeometryArray(self.geometry.values)
2604 clipped_geometry = geometry_array.clip_by_rect(xmin, ymin, xmax, ymax)
2605 return GeoSeries(clipped_geometry.data, index=self.index, crs=self.crs)
25372606
25382607 #
25392608 # Other operations
2424
2525 This dataset is being provided by the Department of City Planning (DCP) on DCP’s website for informational purposes only. DCP does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of the dataset, nor are any such warranties to be implied or inferred with respect to the dataset as furnished on the website. DCP and the City are not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of the dataset, or applications utilizing the dataset, provided by any third party.
2626
27 ### `naturalearth_lowres`
2728
29 #### Notes
30
31 - `gdp_md_est` is `GDP_MD` in the source dataset
32 - `iso_a3` has been overridden with `ADM0_A3` if the source value was **-99** and the row corresponds to **Sovereign country** or **Country**
00 """
11 Script that generates the included dataset 'naturalearth_lowres.shp'.
22
3 Raw data: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-0-countries/
4 Current version used: version 4.1.0
3 Raw data: https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip
4 Current version used: version 5.0.1
55 """ # noqa (E501 link is longer than max line length)
66
77 import geopandas as gpd
88
99 # assumes zipfile from naturalearthdata was downloaded to current directory
1010 world_raw = gpd.read_file("zip://./ne_110m_admin_0_countries.zip")
11
12 # not ideal - fix some country codes
13 mask = world_raw["ISO_A3"].eq("-99") & world_raw["TYPE"].isin(
14 ["Sovereign country", "Country"]
15 )
16 world_raw.loc[mask, "ISO_A3"] = world_raw.loc[mask, "ADM0_A3"]
17
1118 # subsets columns of interest for geopandas examples
1219 world_df = world_raw[
13 ["POP_EST", "CONTINENT", "NAME", "ISO_A3", "GDP_MD_EST", "geometry"]
14 ]
20 ["POP_EST", "CONTINENT", "NAME", "ISO_A3", "GDP_MD", "geometry"]
21 ].rename(
22 columns={"GDP_MD": "GDP_MD_EST"}
23 ) # column has changed name...
1524 world_df.columns = world_df.columns.str.lower()
25
1626 world_df.to_file(
1727 driver="ESRI Shapefile", filename="./naturalearth_lowres/naturalearth_lowres.shp"
1828 )
0 GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]
0 GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
5656 tooltip_kwds={},
5757 popup_kwds={},
5858 legend_kwds={},
59 map_kwds={},
5960 **kwargs,
6061 ):
6162 """Interactive map based on GeoPandas and folium/leaflet.js
181182 Fill color. Defaults to the value of the color option
182183 fillOpacity : float (default 0.5)
183184 Fill opacity.
185 style_function : callable
186 Function mapping a GeoJson Feature to a style ``dict``.
187
188 * Style properties :func:`folium.vector_layers.path_options`
189 * GeoJson features :class:`GeoDataFrame.__geo_interface__`
190
191 e.g.::
192
193 lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6
194 else "blue"}
184195
185196 Plus all supported by :func:`folium.vector_layers.path_options`. See the
186197 documentation of :class:`folium.features.GeoJson` for details.
224235 Applies if ``colorbar=False``.
225236 max_labels : int, default 10
226237 Maximum number of colorbar tick labels (requires branca>=0.5.0)
238 map_kwds : dict (default {})
239 Additional keywords to be passed to folium :class:`~folium.folium.Map`,
240 e.g. ``dragging`` or ``scrollWheelZoom``.
241
227242
228243 **kwargs : dict
229244 Additional options to be passed on to the folium object.
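A hypothetical call exercising the two new options documented above, assuming
a GeoDataFrame ``gdf`` with a ``pop_est`` column:

    m = gdf.explore(
        column="pop_est",
        map_kwds={"dragging": False, "scrollWheelZoom": False},
        style_kwds={"style_function": lambda x: {"weight": 0.5}},
    )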
301316 fit = False
302317
303318 # get a subset of kwargs to be passed to folium.Map
304 map_kwds = {i: kwargs[i] for i in kwargs.keys() if i in _MAP_KWARGS}
319 for i in _MAP_KWARGS:
320 if i in map_kwds:
321 raise ValueError(
322 f"'{i}' cannot be specified in 'map_kwds'. "
323 f"Use the '{i}={map_kwds[i]}' argument instead."
324 )
325 map_kwds = {
326 **map_kwds,
327 **{i: kwargs[i] for i in kwargs.keys() if i in _MAP_KWARGS},
328 }
305329
306330 if HAS_XYZSERVICES:
307331 # match provider name string to xyzservices.TileProvider
352376 "Cannot specify 'categories' when column has categorical dtype"
353377 )
354378 categorical = True
355 elif gdf[column].dtype is np.dtype("O") or categories:
379 elif (
380 gdf[column].dtype is np.dtype("O")
381 or gdf[column].dtype is np.dtype(bool)
382 or categories
383 ):
356384 categorical = True
357385
358386 nan_idx = pd.isna(gdf[column])
429457 style_kwds["fillOpacity"] = 0.5
430458 if "weight" not in style_kwds:
431459 style_kwds["weight"] = 2
460 if "style_function" in style_kwds:
461 style_kwds_function = style_kwds["style_function"]
462 if not callable(style_kwds_function):
463 raise ValueError("'style_function' has to be a callable")
464 style_kwds.pop("style_function")
465 else:
466
467 def _no_style(x):
468 return {}
469
470 style_kwds_function = _no_style
432471
433472 # specify color
434473 if color is not None:
439478 ): # use existing column
440479
441480 def _style_color(x):
442 return {
481 base_style = {
443482 "fillColor": x["properties"][color],
444483 **style_kwds,
484 }
485 return {
486 **base_style,
487 **style_kwds_function(x),
445488 }
446489
447490 style_function = _style_color
461504 if not stroke_color:
462505
463506 def _style_column(x):
464 return {
507 base_style = {
465508 "fillColor": x["properties"]["__folium_color"],
466509 "color": x["properties"]["__folium_color"],
467510 **style_kwds,
468511 }
512 return {
513 **base_style,
514 **style_kwds_function(x),
515 }
469516
470517 style_function = _style_column
471518 else:
472519
473520 def _style_stroke(x):
474 return {
521 base_style = {
475522 "fillColor": x["properties"]["__folium_color"],
476523 "color": stroke_color,
477524 **style_kwds,
478525 }
526 return {
527 **base_style,
528 **style_kwds_function(x),
529 }
479530
480531 style_function = _style_stroke
481532 else: # use folium default
482533
483534 def _style_default(x):
484 return {**style_kwds}
535 return {**style_kwds, **style_kwds_function(x)}
485536
486537 style_function = _style_default
487538
525576 ]
526577 gdf = gdf.drop(columns=non_active_geoms)
527578
528 # preprare tooltip and popup
579 # prepare tooltip and popup
529580 if isinstance(gdf, geopandas.GeoDataFrame):
530581 # add named index to the tooltip
531582 if gdf.index.name is not None:
795846 marker_kwds={},
796847 style_kwds={},
797848 highlight_kwds={},
849 map_kwds={},
798850 **kwargs,
799851 ):
800852 """Interactive map based on GeoPandas and folium/leaflet.js
865917 Fill color. Defaults to the value of the color option
866918 fillOpacity : float (default 0.5)
867919 Fill opacity.
920 style_function : callable
921 Function mapping a GeoJson Feature to a style ``dict``.
922
923 * Style properties :func:`folium.vector_layers.path_options`
924 * GeoJson features :class:`GeoSeries.__geo_interface__`
925
926 e.g.::
927
928 lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6
929 else "blue"}
930
868931
869932 Plus all supported by :func:`folium.vector_layers.path_options`. See the
870933 documentation of :class:`folium.features.GeoJson` for details.
872935 highlight_kwds : dict (default {})
873936 Style to be passed to folium highlight_function. Uses the same keywords
874937 as ``style_kwds``. When empty, defaults to ``{"fillOpacity": 0.75}``.
938 map_kwds : dict (default {})
939 Additional keywords to be passed to folium :class:`~folium.folium.Map`,
940 e.g. ``dragging`` or ``scrollWheelZoom``.
875941
876942 **kwargs : dict
877943 Additional options to be passed on to the folium object.
896962 marker_kwds=marker_kwds,
897963 style_kwds=style_kwds,
898964 highlight_kwds=highlight_kwds,
965 map_kwds=map_kwds,
899966 **kwargs,
900967 )
2222 DEFAULT_GEO_COLUMN_NAME = "geometry"
2323
2424
25 def _geodataframe_constructor_with_fallback(*args, **kwargs):
26 """
27 A flexible constructor for GeoDataFrame._constructor, which falls back
28 to returning a DataFrame (if a certain operation does not preserve the
29 geometry column)
30 """
31 df = GeoDataFrame(*args, **kwargs)
32 geometry_cols_mask = df.dtypes == "geometry"
33 if len(geometry_cols_mask) == 0 or geometry_cols_mask.sum() == 0:
34 df = pd.DataFrame(df)
35
36 return df
37
38
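A minimal sketch of the fallback behaviour: without any geometry-typed column
the constructor now degrades to a plain pandas object (data is illustrative):

    import pandas as pd
    df = _geodataframe_constructor_with_fallback({"a": [1, 2]})
    isinstance(df, pd.DataFrame) and not isinstance(df, GeoDataFrame)  # True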
2539 def _ensure_geometry(data, crs=None):
2640 """
2741 Ensure the data is of geometry dtype or converted to it.
3448 if is_geometry_type(data):
3549 if isinstance(data, Series):
3650 data = GeoSeries(data)
37 if data.crs is None:
51 if data.crs is None and crs is not None:
52 # Avoids caching issues/crs sharing issues
53 data = data.copy()
3854 data.crs = crs
3955 return data
4056 else:
4662 return out
4763
4864
49 def _crs_mismatch_warning():
50 # TODO: raise error in 0.9 or 0.10.
51 warnings.warn(
52 "CRS mismatch between CRS of the passed geometries "
53 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
54 "allow_override=True)' to overwrite CRS or "
55 "'GeoDataFrame.to_crs(crs)' to reproject geometries. "
56 "CRS mismatch will raise an error in the future versions "
57 "of GeoPandas.",
58 FutureWarning,
59 stacklevel=3,
60 )
65 crs_mismatch_error = (
66 "CRS mismatch between CRS of the passed geometries "
67 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
68 "allow_override=True)' to overwrite CRS or "
69 "'GeoDataFrame.to_crs(crs)' to reproject geometries. "
70 )
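Illustrative consequence of the change above; ``pts`` is a hypothetical
GeoSeries carrying crs EPSG:4326:

    gdf = GeoDataFrame(geometry=pts)      # crs taken from pts
    GeoDataFrame(gdf, crs="EPSG:3857")    # now raises ValueError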
6171
6272
6373 class GeoDataFrame(GeoPandasBase, DataFrame):
112122 GeoSeries : Series object designed to store shapely geometry objects
113123 """
114124
125 # TODO: remove "_crs" in 0.12
115126 _metadata = ["_crs", "_geometry_column_name"]
116127
117128 _geometry_column_name = DEFAULT_GEO_COLUMN_NAME
120131 with compat.ignore_shapely2_warnings():
121132 super().__init__(data, *args, **kwargs)
122133
123 # need to set this before calling self['geometry'], because
124 # getitem accesses crs
125 self._crs = CRS.from_user_input(crs) if crs else None
134 # TODO: to be removed in 0.12
135 self._crs = None
126136
127137 # set_geometry ensures the geometry data have the proper dtype,
128138 # but is not called if `geometry=None` ('geometry' column present
135145 if geometry is None and isinstance(data, GeoDataFrame):
136146 self._geometry_column_name = data._geometry_column_name
137147 if crs is not None and data.crs != crs:
138 _crs_mismatch_warning()
139 # TODO: raise error in 0.9 or 0.10.
140 return
148 raise ValueError(crs_mismatch_error)
141149
142150 if geometry is None and "geometry" in self.columns:
143151 # Check for multiple columns with name "geometry". If there are,
150158 )
151159
152160 # only if we have actual geometry values -> call set_geometry
153 index = self.index
154161 try:
155162 if (
156163 hasattr(self["geometry"].values, "crs")
158165 and crs
159166 and not self["geometry"].values.crs == crs
160167 ):
161 _crs_mismatch_warning()
162 # TODO: raise error in 0.9 or 0.10.
168 raise ValueError(crs_mismatch_error)
163169 self["geometry"] = _ensure_geometry(self["geometry"].values, crs)
164170 except TypeError:
165171 pass
166172 else:
167 if self.index is not index:
168 # With pandas < 1.0 and an empty frame (no rows), the index
169 # gets reset to a default RangeIndex -> set back the original
170 # index if needed
171 self.index = index
172173 geometry = "geometry"
173174
174175 if geometry is not None:
178179 and crs
179180 and not geometry.crs == crs
180181 ):
181 _crs_mismatch_warning()
182 # TODO: raise error in 0.9 or 0.10.
183 self.set_geometry(geometry, inplace=True)
182 raise ValueError(crs_mismatch_error)
183
184 self.set_geometry(geometry, inplace=True, crs=crs)
184185
185186 if geometry is None and crs:
186 warnings.warn(
187 "Assigning CRS to a GeoDataFrame without a geometry column is now "
188 "deprecated and will not be supported in the future.",
189 FutureWarning,
190 stacklevel=2,
187 raise ValueError(
188 "Assigning CRS to a GeoDataFrame without a geometry column is not "
189 "supported. Supply geometry using the 'geometry=' keyword argument, "
190 "or by providing a DataFrame with column name 'geometry'",
191191 )
192192
193193 def __setattr__(self, attr, val):
199199
200200 def _get_geometry(self):
201201 if self._geometry_column_name not in self:
202 raise AttributeError(
203 "No geometry data set yet (expected in"
204 " column '%s'.)" % self._geometry_column_name
205 )
202 if self._geometry_column_name is None:
203 msg = (
204 "You are calling a geospatial method on the GeoDataFrame, "
205 "but the active geometry column to use has not been set. "
206 )
207 else:
208 msg = (
209 "You are calling a geospatial method on the GeoDataFrame, "
210 f"but the active geometry column ('{self._geometry_column_name}') "
211 "is not present. "
212 )
213 geo_cols = list(self.columns[self.dtypes == "geometry"])
214 if len(geo_cols) > 0:
215 msg += (
216 f"\nThere are columns with geometry data type ({geo_cols}), and "
217 "you can either set one as the active geometry with "
218 'df.set_geometry("name") or access the column as a '
219 'GeoSeries (df["name"]) and call the method directly on it.'
220 )
221 else:
222 msg += (
223 "\nThere are no existing columns with geometry data type. You can "
224 "add a geometry column as the active geometry column with "
225 "df.set_geometry. "
226 )
227
228 raise AttributeError(msg)
206229 return self[self._geometry_column_name]
207230
208231 def _set_geometry(self, col):
275298 frame = self
276299 else:
277300 frame = self.copy()
301 # if there is no previous self.geometry, self.copy() will downcast
302 if type(frame) == DataFrame:
303 frame = GeoDataFrame(frame)
278304
279305 to_remove = None
280306 geo_column_name = self._geometry_column_name
281307 if isinstance(col, (Series, list, np.ndarray, GeometryArray)):
282308 level = col
283 elif hasattr(col, "ndim") and col.ndim != 1:
309 elif hasattr(col, "ndim") and col.ndim > 1:
284310 raise ValueError("Must pass array with one dimension only.")
285311 else:
286312 try:
305331 del frame[to_remove]
306332
307333 if not crs:
308 level_crs = getattr(level, "crs", None)
309 crs = level_crs if level_crs is not None else self._crs
334 crs = getattr(level, "crs", None)
310335
311336 if isinstance(level, (GeoSeries, GeometryArray)) and level.crs != crs:
312337 # Avoids caching issues/crs sharing issues
315340
316341 # Check that we are using a listlike of geometries
317342 level = _ensure_geometry(level, crs=crs)
318 index = frame.index
319343 frame[geo_column_name] = level
320 if frame.index is not index and len(frame.index) == len(index):
321 # With pandas < 1.0 and an empty frame (no rows), the index gets reset
322 # to a default RangeIndex -> set back the original index if needed
323 frame.index = index
324344 frame._geometry_column_name = geo_column_name
325 frame.crs = crs
345
346 # TODO: to be removed in 0.12
347 frame._crs = level.crs
326348 if not inplace:
327349 return frame
328350
404426 GeoDataFrame.to_crs : re-project to another CRS
405427
406428 """
407 return self._crs
429 # TODO: remove try/except in 0.12
430 try:
431 return self.geometry.crs
432 except AttributeError:
433 # the active geometry column might not be set
434 warnings.warn(
435 "Accessing CRS of a GeoDataFrame without a geometry column is "
436 "deprecated and will be removed in GeoPandas 0.12. "
437 "Use GeoDataFrame.set_geometry to set the active geometry column.",
438 FutureWarning,
439 stacklevel=2,
440 )
441 return self._crs
408442
409443 @crs.setter
410444 def crs(self, value):
411445 """Sets the value of the crs"""
412446 if self._geometry_column_name not in self:
413 warnings.warn(
414 "Assigning CRS to a GeoDataFrame without a geometry column is now "
415 "deprecated and will not be supported in the future.",
416 FutureWarning,
417 stacklevel=4,
447 raise ValueError(
448 "Assigning CRS to a GeoDataFrame without a geometry column is not "
449 "supported. Use GeoDataFrame.set_geometry to set the active "
450 "geometry column.",
418451 )
419 self._crs = None if not value else CRS.from_user_input(value)
420452 else:
421453 if hasattr(self.geometry.values, "crs"):
422454 self.geometry.values.crs = value
423 self._crs = self.geometry.values.crs
424455 else:
425456 # column called 'geometry' without geometry
426457 self._crs = None if not value else CRS.from_user_input(value)
427458
459 # TODO: raise this error in 0.12. This already raises a FutureWarning
460 # TODO: defined in the crs property above
461 # raise ValueError(
462 # "Assigning CRS to a GeoDataFrame without an active geometry "
463 # "column is not supported. Use GeoDataFrame.set_geometry to set "
464 # "the active geometry column.",
465 # )
466
428467 def __setstate__(self, state):
429468 # overriding DataFrame method for compat with older pickles (CRS handling)
469 crs = None
430470 if isinstance(state, dict):
431 if "_metadata" in state and "crs" in state["_metadata"]:
432 metadata = state["_metadata"]
433 metadata[metadata.index("crs")] = "_crs"
434471 if "crs" in state and "_crs" not in state:
435 crs = state.pop("crs")
436 state["_crs"] = CRS.from_user_input(crs) if crs is not None else crs
472 crs = state.pop("crs", None)
473 else:
474 crs = state.pop("_crs", None)
475 crs = CRS.from_user_input(crs) if crs is not None else crs
437476
438477 super().__setstate__(state)
439478
441480 # at GeoDataFrame level with '_crs' (and not 'crs'), so without propagating
442481 # to the GeoSeries/GeometryArray
443482 try:
444 if self.crs is not None:
483 if crs is not None:
445484 if self.geometry.values.crs is None:
446 self.crs = self.crs
485 self.crs = crs
447486 except Exception:
448487 pass
449488
470509 GeoDataFrame
471510
472511 """
473 dataframe = super().from_dict(data, **kwargs)
474 return GeoDataFrame(dataframe, geometry=geometry, crs=crs)
512 dataframe = DataFrame.from_dict(data, **kwargs)
513 return cls(dataframe, geometry=geometry, crs=crs)
475514
476515 @classmethod
477516 def from_file(cls, filename, **kwargs):
604643 "geometry": shape(feature["geometry"]) if feature["geometry"] else None
605644 }
606645 # load properties
607 row.update(feature["properties"])
646 properties = feature["properties"]
647 if properties is None:
648 properties = {}
649 row.update(properties)
608650 rows.append(row)
609 return GeoDataFrame(rows, columns=columns, crs=crs)
651 return cls(rows, columns=columns, crs=crs)
610652
611653 @classmethod
612654 def from_postgis(
836878 if not self.columns.is_unique:
837879 raise ValueError("GeoDataFrame cannot contain duplicated column names.")
838880
839 properties_cols = self.columns.difference([self._geometry_column_name])
881 properties_cols = self.columns.drop(self._geometry_column_name)
840882
841883 if len(properties_cols) > 0:
842884 # convert to object to get python scalars.
952994
953995 return df
954996
955 def to_parquet(self, path, index=None, compression="snappy", **kwargs):
997 def to_parquet(
998 self, path, index=None, compression="snappy", version=None, **kwargs
999 ):
9561000 """Write a GeoDataFrame to the Parquet format.
9571001
9581002 Any geometry columns present are serialized to WKB format in the file.
9591003
9601004 Requires 'pyarrow'.
9611005
962 WARNING: this is an initial implementation of Parquet file support and
963 associated metadata. This is tracking version 0.1.0 of the metadata
964 specification at:
965 https://github.com/geopandas/geo-arrow-spec
1006 WARNING: this is an early implementation of Parquet file support and
1007 associated metadata, the specification for which continues to evolve.
1008 This is tracking version 0.4.0 of the GeoParquet specification at:
1009 https://github.com/opengeospatial/geoparquet
9661010
9671011 This metadata specification does not yet make stability promises. As such,
9681012 we do not yet recommend using this in a production setting unless you are
9811025 output except `RangeIndex` which is stored as metadata only.
9821026 compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
9831027 Name of the compression to use. Use ``None`` for no compression.
1028 version : {'0.1.0', '0.4.0', None}
1029 GeoParquet specification version; if not provided will default to
1030 latest supported version.
9841031 kwargs
9851032 Additional keyword arguments passed to :func:`pyarrow.parquet.write_table`.
9861033
9951042 GeoDataFrame.to_file : write GeoDataFrame to file
9961043 """
9971044
1045 # Accept engine keyword for compatibility with pandas.DataFrame.to_parquet
1046 # The only engine currently supported by GeoPandas is pyarrow, so no
1047 # other engine should be specified.
1048 engine = kwargs.pop("engine", "auto")
1049 if engine not in ("auto", "pyarrow"):
1050 raise ValueError(
1051 f"GeoPandas only supports using pyarrow as the engine for "
1052 f"to_parquet: {engine!r} passed instead."
1053 )
1054
9981055 from geopandas.io.arrow import _to_parquet
9991056
1000 _to_parquet(self, path, compression=compression, index=index, **kwargs)
1001
1002 def to_feather(self, path, index=None, compression=None, **kwargs):
1057 _to_parquet(
1058 self, path, compression=compression, index=index, version=version, **kwargs
1059 )
1060
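Hypothetical usage of the new keyword: pin the GeoParquet metadata version
written alongside the data (the path is illustrative):

    gdf.to_parquet("countries.parquet", version="0.1.0")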
1061 def to_feather(self, path, index=None, compression=None, version=None, **kwargs):
10031062 """Write a GeoDataFrame to the Feather format.
10041063
10051064 Any geometry columns present are serialized to WKB format in the file.
10061065
10071066 Requires 'pyarrow' >= 0.17.
10081067
1009 WARNING: this is an initial implementation of Feather file support and
1010 associated metadata. This is tracking version 0.1.0 of the metadata
1011 specification at:
1012 https://github.com/geopandas/geo-arrow-spec
1068 WARNING: this is an early implementation of Feather file support and
1069 associated metadata, the specification for which continues to evolve.
1070 This is tracking version 0.4.0 of the GeoParquet specification at:
1071 https://github.com/opengeospatial/geoparquet
10131072
10141073 This metadata specification does not yet make stability promises. As such,
10151074 we do not yet recommend using this in a production setting unless you are
10291088 compression : {'zstd', 'lz4', 'uncompressed'}, optional
10301089 Name of the compression to use. Use ``"uncompressed"`` for no
10311090 compression. By default uses LZ4 if available, otherwise uncompressed.
1091 version : {'0.1.0', '0.4.0', None}
1092 GeoParquet specification version; if not provided will default to
1093 latest supported version.
10321094 kwargs
10331095 Additional keyword arguments passed to to
10341096 :func:`pyarrow.feather.write_feather`.
10461108
10471109 from geopandas.io.arrow import _to_feather
10481110
1049 _to_feather(self, path, index=index, compression=compression, **kwargs)
1111 _to_feather(
1112 self, path, index=index, compression=compression, version=version, **kwargs
1113 )
10501114
10511115 def to_file(self, filename, driver=None, schema=None, index=None, **kwargs):
10521116 """Write the ``GeoDataFrame`` to a file.
10611125 Parameters
10621126 ----------
10631127 filename : string
1064 File path or file handle to write to.
1128 File path or file handle to write to. The path may specify a
1129 GDAL VSI scheme.
10651130 driver : string, default None
10661131 The OGR format driver used to write the vector file.
10671132 If not specified, it attempts to infer it from the file extension.
10681133 If no extension is specified, it saves ESRI Shapefile to a folder.
1069 schema : dict, default: None
1134 schema : dict, default None
10701135 If specified, the schema dictionary is passed to Fiona to
1071 better control how the file is written.
1136 better control how the file is written. If None, GeoPandas
1137 will determine the schema based on each column's dtype.
1138 Not supported for the "pyogrio" engine.
10721139 index : bool, default None
10731140 If True, write index into one or more columns (for MultiIndex).
10741141 Default None writes the index into one or more columns only if
10771144
10781145 .. versionadded:: 0.7
10791146 Previously the index was not written.
1147 mode : string, default 'w'
1148 The write mode, 'w' to overwrite the existing file and 'a' to append.
1149 Not all drivers support appending. The drivers that support appending
1150 are listed in fiona.supported_drivers or
1151 https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
1152 crs : pyproj.CRS, default None
1153 If specified, the CRS is passed to Fiona to
1154 better control how the file is written. If None, GeoPandas
1154 will determine the crs from the GeoDataFrame's crs attribute.
1156 The value can be anything accepted
1157 by :meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>`,
1158 such as an authority string (eg "EPSG:4326") or a WKT string.
1159 engine : str, "fiona" or "pyogrio"
1160 The underlying library that is used to write the file. Currently, the
1161 supported options are "fiona" and "pyogrio". Defaults to "fiona" if
1162 installed, otherwise tries "pyogrio".
1163 **kwargs :
1164 Keyword args to be passed to the engine, and can be used to write
1165 to multi-layer data, store data within archives (zip files), etc.
1166 In case of the "fiona" engine, the keyword arguments are passed to
1167 ``fiona.open``. For more information on possible keywords, type:
1168 ``import fiona; help(fiona.open)``. In case of the "pyogrio" engine,
1169 the keyword arguments are passed to `pyogrio.write_dataframe`.
10801170
10811171 Notes
10821172 -----
1083 The extra keyword arguments ``**kwargs`` are passed to fiona.open and
1084 can be used to write to multi-layer data, store data within archives
1085 (zip files), etc.
1086
10871173 The format drivers will attempt to detect the encoding of your data, but
10881174 may fail. In this case, the proper encoding can be specified explicitly
10891175 by using the encoding keyword parameter, e.g. ``encoding='utf-8'``.
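Hypothetical calls using the newly documented keywords (paths and layer names
are illustrative):

    gdf.to_file("data.gpkg", layer="rivers", mode="a")   # append to a layer
    gdf.to_file("data.gpkg", engine="pyogrio")           # pick the engine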
12731359 df = self.copy()
12741360 geom = df.geometry.to_crs(crs=crs, epsg=epsg)
12751361 df.geometry = geom
1276 df.crs = geom.crs
12771362 if not inplace:
12781363 return df
12791364
13201405 def __getitem__(self, key):
13211406 """
13221407 If the result is a column containing only 'geometry', return a
1323 GeoSeries. If it's a DataFrame with a 'geometry' column, return a
1324 GeoDataFrame.
1408 GeoSeries. If it's a DataFrame with any columns of GeometryDtype,
1409 return a GeoDataFrame.
13251410 """
13261411 result = super().__getitem__(key)
13271412 geo_col = self._geometry_column_name
13281413 if isinstance(result, Series) and isinstance(result.dtype, GeometryDtype):
13291414 result.__class__ = GeoSeries
1330 elif isinstance(result, DataFrame) and geo_col in result:
1331 result.__class__ = GeoDataFrame
1332 result._geometry_column_name = geo_col
1333 elif isinstance(result, DataFrame) and geo_col not in result:
1334 result.__class__ = DataFrame
1415 elif isinstance(result, DataFrame):
1416 if (result.dtypes == "geometry").sum() > 0:
1417 result.__class__ = GeoDataFrame
1418 if geo_col in result:
1419 result._geometry_column_name = geo_col
1420 else:
1421 result._geometry_column_name = None
1422 else:
1423 result.__class__ = DataFrame
13351424 return result
13361425
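Illustrative subsetting behaviour after the change above (column names are
hypothetical; ``borders`` is a second geometry-typed column):

    gdf[["name", "pop_est"]]   # no geometry columns -> plain DataFrame
    gdf[["borders"]]           # geometry-typed column -> GeoDataFrame, with
                               # the active geometry left unset unless the
                               # original geometry column is included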
13371426 def __setitem__(self, key, value):
13431432 if pd.api.types.is_scalar(value) or isinstance(value, BaseGeometry):
13441433 value = [value] * self.shape[0]
13451434 try:
1346 value = _ensure_geometry(value, crs=self.crs)
1347 self._crs = value.crs
1435 # TODO: remove this use of _crs in 0.12
1436 warn = False
1437 if not (hasattr(self, "geometry") and hasattr(self.geometry, "crs")):
1438 crs = self._crs
1439 warn = True
1440 else:
1441 crs = getattr(self, "crs", None)
1442 value = _ensure_geometry(value, crs=crs)
1443 if warn and crs is not None:
1444 warnings.warn(
1445 "Setting geometries to a GeoDataFrame without a geometry "
1446 "column will currently preserve the CRS, if present. "
1447 "This is deprecated, and in the future the CRS will be lost "
1448 "in this case. You can use set_crs(..) on the result to "
1449 "set the CRS manually.",
1450 FutureWarning,
1451 stacklevel=2,
1452 )
13481453 except TypeError:
13491454 warnings.warn("Geometry column does not contain geometry.")
13501455 super().__setitem__(key, value)
13891494 result = super().apply(
13901495 func, axis=axis, raw=raw, result_type=result_type, args=args, **kwargs
13911496 )
1497 # Reconstruct gdf if it was lost by apply
13921498 if (
1393 isinstance(result, GeoDataFrame)
1499 isinstance(result, DataFrame)
13941500 and self._geometry_column_name in result.columns
1395 and isinstance(result[self._geometry_column_name].dtype, GeometryDtype)
13961501 ):
1397 # apply calls _constructor which resets geom col name to geometry
1398 result._geometry_column_name = self._geometry_column_name
1399 if self.crs is not None and result.crs is None:
1400 result.set_crs(self.crs, inplace=True)
1502 # axis=1 apply will split GeometryDtype to object, try to cast back
1503 try:
1504 result = result.set_geometry(self._geometry_column_name)
1505 except TypeError:
1506 pass
1507 else:
1508 if self.crs is not None and result.crs is None:
1509 result.set_crs(self.crs, inplace=True)
1510 elif isinstance(result, Series):
1511 # Reconstruct series GeometryDtype if lost by apply
1512 try:
1513 # Note CRS cannot be preserved in this case as func could refer
1514 # to any column
1515 result = _ensure_geometry(result)
1516 except TypeError:
1517 pass
1518
14011519 return result
14021520
14031521 @property
14041522 def _constructor(self):
1405 return GeoDataFrame
1523 return _geodataframe_constructor_with_fallback
1524
1525 @property
1526 def _constructor_sliced(self):
1527 def _geodataframe_constructor_sliced(*args, **kwargs):
1528 """
1529 A specialized (Geo)Series constructor which can fall back to a
1530 Series if a certain operation does not produce geometries:
1531
1532 - We only return a GeoSeries if the data is actually of geometry
1533 dtype (and so we don't try to convert geometry objects the way
1534 the normal GeoSeries(..) constructor does with `_ensure_geometry`).
1535 - When we get here from obtaining a row or column from a
1536 GeoDataFrame, the goal is to only return a GeoSeries for a
1537 geometry column, and not return a GeoSeries for a row that happened
1538 to come from a DataFrame with only geometry dtype columns (and
1539 thus could have a geometry dtype). Therefore, we don't return a
1540 GeoSeries if we are sure we are in a row selection case (by
1541 checking the identity of the index)
1542 """
1543 srs = pd.Series(*args, **kwargs)
1544 is_row_proxy = srs.index is self.columns
1545 if is_geometry_type(srs) and not is_row_proxy:
1546 srs = GeoSeries(srs)
1547 return srs
1548
1549 return _geodataframe_constructor_sliced
14061550
14071551 def __finalize__(self, other, method=None, **kwargs):
14081552 """propagate metadata from other to self"""
14241568 f"Please ensure this column from the first DataFrame is not "
14251569 f"repeated."
14261570 )
1571 elif method == "unstack":
1572 # unstack adds multiindex columns and reshapes data.
1573 # it never makes sense to retain geometry column
1574 self._geometry_column_name = None
1575 self._crs = None
14271576 return self
14281577
14291578 def dissolve(
14521601 aggfunc : function or string, default "first"
14531602 Aggregation function for manipulation of data associated
14541603 with each group. Passed to pandas `groupby.agg` method.
1604 Accepted combinations are:
1605
1606 - function
1607 - string function name
1608 - list of functions and/or function names, e.g. [np.sum, 'mean']
1609 - dict of axis labels -> functions, function names or list of such.
14551610 as_index : boolean, default True
14561611 If true, groupby columns become index of result.
14571612 level : int or str or sequence of int or sequence of str, default None
15091664
15101665 See also
15111666 --------
1512 GeoDataFrame.explode : explode muti-part geometries into single geometries
1667 GeoDataFrame.explode : explode multi-part geometries into single geometries
15131668
15141669 """
15151670
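Illustrative use of the expanded aggfunc documented in the dissolve docstring
above, assuming a frame like the bundled naturalearth_lowres:

    world.dissolve(by="continent",
                   aggfunc={"pop_est": "sum", "name": "first"})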
15281683 # Process non-spatial component
15291684 data = self.drop(labels=self.geometry.name, axis=1)
15301685 aggregated_data = data.groupby(**groupby_kwargs).agg(aggfunc)
1686 aggregated_data.columns = aggregated_data.columns.to_flat_index()
15311687
15321688 # Process spatial component
15331689 def merge_geometries(block):
15521708 # overrides the pandas native explode method to break up features geometrically
15531709 def explode(self, column=None, ignore_index=False, index_parts=None, **kwargs):
15541710 """
1555 Explode muti-part geometries into multiple single geometries.
1711 Explode multi-part geometries into multiple single geometries.
15561712
15571713 Each row containing a multi-part geometry will be split into
15581714 multiple rows with single geometries, thereby increasing the vertical
16501806 )
16511807 index_parts = True
16521808
1653 df_copy = self.copy()
1654
1655 level_str = f"level_{df_copy.index.nlevels}"
1656
1657 if level_str in df_copy.columns: # GH1393
1658 df_copy = df_copy.rename(columns={level_str: f"__{level_str}"})
1659
1660 if index_parts:
1661 exploded_geom = df_copy.geometry.explode(index_parts=True)
1662 exploded_index = exploded_geom.index
1663 exploded_geom = exploded_geom.reset_index(level=-1, drop=True)
1664 else:
1665 exploded_geom = df_copy.geometry.explode(index_parts=True).reset_index(
1666 level=-1, drop=True
1667 )
1668 exploded_index = exploded_geom.index
1669
1670 df = (
1671 df_copy.drop(df_copy._geometry_column_name, axis=1)
1672 .join(exploded_geom)
1673 .__finalize__(self)
1674 )
1809 exploded_geom = self.geometry.reset_index(drop=True).explode(index_parts=True)
1810
1811 df = GeoDataFrame(
1812 self.drop(self._geometry_column_name, axis=1).take(
1813 exploded_geom.index.droplevel(-1)
1814 ),
1815 geometry=exploded_geom.values,
1816 ).__finalize__(self)
16751817
16761818 if ignore_index:
16771819 df.reset_index(inplace=True, drop=True)
16781820 elif index_parts:
16791821 # reset to MultiIndex, otherwise df index is only first level of
16801822 # exploded GeoSeries index.
1681 df.set_index(exploded_index, inplace=True)
1682 df.index.names = list(self.index.names) + [None]
1683 else:
1684 df.set_index(exploded_index, inplace=True)
1685 df.index.names = self.index.names
1686
1687 if f"__{level_str}" in df.columns:
1688 df = df.rename(columns={f"__{level_str}": level_str})
1689
1690 geo_df = df.set_geometry(self._geometry_column_name)
1691 return geo_df
1823 df = df.set_index(
1824 exploded_geom.index.droplevel(
1825 list(range(exploded_geom.index.nlevels - 1))
1826 ),
1827 append=True,
1828 )
1829
1830 return df
16921831
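A short sketch of the rewritten explode (the geometry is hypothetical):

    from shapely.geometry import MultiPoint
    import geopandas
    gdf = geopandas.GeoDataFrame({"geometry": [MultiPoint([(0, 0), (1, 1)])]})
    gdf.explode(index_parts=True)
    # -> two POINT rows under the MultiIndex entries (0, 0) and (0, 1)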
16931832 # overrides the pandas astype method to ensure the correct return type
16941833 def astype(self, dtype, copy=True, errors="raise", **kwargs):
17321871 """
17331872 # Overridden to fix GH1870, that return type is not preserved always
17341873 # (and where it was, geometry col was not)
1735
1736 if not compat.PANDAS_GE_10:
1737 raise NotImplementedError(
1738 "GeoDataFrame.convert_dtypes requires pandas >= 1.0"
1739 )
17401874
17411875 return GeoDataFrame(
17421876 super().convert_dtypes(*args, **kwargs),
17751909 - append: Insert new values to the existing table.
17761910 schema : string, optional
17771911 Specify the schema. If None, use default schema: 'public'.
1778 index : bool, default True
1912 index : bool, default False
17791913 Write DataFrame index as a column.
17801914 Uses *index_label* as the column name in the table.
17811915 index_label : string or sequence, default None
18171951 warnings.warn(
18181952 "'^' operator will be deprecated. Use the 'symmetric_difference' "
18191953 "method instead.",
1820 DeprecationWarning,
1954 FutureWarning,
18211955 stacklevel=2,
18221956 )
18231957 return self.geometry.symmetric_difference(other)
18261960 """Implement | operator as for builtin set type"""
18271961 warnings.warn(
18281962 "'|' operator will be deprecated. Use the 'union' method instead.",
1829 DeprecationWarning,
1963 FutureWarning,
18301964 stacklevel=2,
18311965 )
18321966 return self.geometry.union(other)
18351969 """Implement & operator as for builtin set type"""
18361970 warnings.warn(
18371971 "'&' operator will be deprecated. Use the 'intersection' method instead.",
1838 DeprecationWarning,
1972 FutureWarning,
18391973 stacklevel=2,
18401974 )
18411975 return self.geometry.intersection(other)
18441978 """Implement - operator as for builtin set type"""
18451979 warnings.warn(
18461980 "'-' operator will be deprecated. Use the 'difference' method instead.",
1847 DeprecationWarning,
1981 FutureWarning,
18481982 stacklevel=2,
18491983 )
18501984 return self.geometry.difference(other)
20402174
20412175 Notes
20422176 -----
2043 Since this join relies on distances, results will be innaccurate
2177 Since this join relies on distances, results will be inaccurate
20442178 if your geometries are in a geographic CRS.
20452179
20462180 Every operation in GeoPandas is planar, i.e. the potential third
20602194 """Clip points, lines, or polygon geometries to the mask extent.
20612195
20622196 Both layers must be in the same Coordinate Reference System (CRS).
2063 The GeoDataFrame will be clipped to the full extent of the `mask` object.
2197 The GeoDataFrame will be clipped to the full extent of the ``mask`` object.
20642198
20652199 If there are multiple polygons in mask, data from the GeoDataFrame will be
20662200 clipped to the total boundary of all polygons in mask.
20672201
20682202 Parameters
20692203 ----------
2070 mask : GeoDataFrame, GeoSeries, (Multi)Polygon
2071 Polygon vector layer used to clip `gdf`.
2204 mask : GeoDataFrame, GeoSeries, (Multi)Polygon, list-like
2205 Polygon vector layer used to clip the GeoDataFrame.
20722206 The mask's geometry is dissolved into one geometric feature
2073 and intersected with `gdf`.
2207 and intersected with GeoDataFrame.
2208 If the mask is list-like with four elements ``(minx, miny, maxx, maxy)``,
2209 ``clip`` will use a faster rectangle clipping
2210 (:meth:`~GeoSeries.clip_by_rect`), possibly leading to slightly different
2211 results.
20742212 keep_geom_type : boolean, default False
20752213 If True, return only geometries of original type in case of intersection
20762214 resulting in multiple geometry types or GeometryCollections.
20792217 Returns
20802218 -------
20812219 GeoDataFrame
2082 Vector data (points, lines, polygons) from `gdf` clipped to
2220 Vector data (points, lines, polygons) from the GeoDataFrame clipped to
20832221 polygon boundary from mask.
20842222
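Illustrative: a 4-tuple mask now takes the fast rectangle path, while a
polygon mask keeps the dissolve-and-intersect behaviour (``poly`` is a
hypothetical (Multi)Polygon):

    gdf.clip((0, 0, 10, 10))   # minx, miny, maxx, maxy -> clip_by_rect
    gdf.clip(poly)             # polygon mask, original code path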
20852223 See also
22082346
22092347
22102348 DataFrame.set_geometry = _dataframe_set_geometry
2349
2350 if not compat.PANDAS_GE_11: # i.e. on pandas 1.0.x
2351 _geodataframe_constructor_with_fallback._from_axes = GeoDataFrame._from_axes
22
33 import numpy as np
44 import pandas as pd
5 from pandas import Series, MultiIndex
5 from pandas import Series, MultiIndex, DataFrame
66 from pandas.core.internals import SingleBlockManager
77
88 from pyproj import CRS
2727 from .base import is_geometry_type
2828
2929
30 _SERIES_WARNING_MSG = """\
31 You are passing non-geometry data to the GeoSeries constructor. Currently,
32 it falls back to returning a pandas Series. But in the future, we will start
33 to raise a TypeError instead."""
34
35
3630 def _geoseries_constructor_with_fallback(data=None, index=None, crs=None, **kwargs):
3731 """
3832 A flexible constructor for GeoSeries._constructor, which needs to be able
4034 geometries)
4135 """
4236 try:
43 with warnings.catch_warnings():
44 warnings.filterwarnings(
45 "ignore",
46 message=_SERIES_WARNING_MSG,
47 category=FutureWarning,
48 module="geopandas[.*]",
49 )
50 return GeoSeries(data=data, index=index, crs=crs, **kwargs)
37 return GeoSeries(data=data, index=index, crs=crs, **kwargs)
5138 except TypeError:
5239 return Series(data=data, index=index, **kwargs)
40
41
42 def _geoseries_expanddim(data=None, *args, **kwargs):
43 from geopandas import GeoDataFrame
44
45 # pd.Series._constructor_expanddim == pd.DataFrame
46 df = pd.DataFrame(data, *args, **kwargs)
47 geo_col_name = None
48 if isinstance(data, GeoSeries):
49 # pandas default column name is 0, keep convention
50 geo_col_name = data.name if data.name is not None else 0
51
52 if df.shape[1] == 1:
53 geo_col_name = df.columns[0]
54
55 if (df.dtypes == "geometry").sum() > 0:
56 if geo_col_name is None or not is_geometry_type(df[geo_col_name]):
57 df = GeoDataFrame(df)
58 df._geometry_column_name = None
59 else:
60 df = df.set_geometry(geo_col_name)
61
62 return df
63
64
65 # pd.concat (pandas/core/reshape/concat.py) requires this for the
66 # concatenation of series since pandas 1.1
67 # (https://github.com/pandas-dev/pandas/commit/f9e4c8c84bcef987973f2624cc2932394c171c8c)
68 _geoseries_expanddim._get_axis_number = DataFrame._get_axis_number
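The practical effect of ``_geoseries_expanddim``: expanding a GeoSeries to two dimensions now yields a GeoDataFrame whenever a geometry column survives. A minimal sketch:

    import geopandas
    from shapely.geometry import Point

    gs = geopandas.GeoSeries([Point(0, 0), Point(1, 1)], name="geometry")
    df = gs.to_frame()  # pandas routes this through _constructor_expanddim
    # the single geometry column is detected and set as the active geometry
    assert isinstance(df, geopandas.GeoDataFrame)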
5369
5470
5571 class GeoSeries(GeoPandasBase, Series):
132148
133149 _metadata = ["name"]
134150
135 def __new__(cls, data=None, index=None, crs=None, **kwargs):
136 # we need to use __new__ because we want to return Series instance
137 # instead of GeoSeries instance in case of non-geometry data
138
151 def __init__(self, data=None, index=None, crs=None, **kwargs):
139152 if hasattr(data, "crs") and crs:
140153 if not data.crs:
141154 # make a copy to avoid setting CRS to passed GeometryArray
142155 data = data.copy()
143156 else:
144157 if not data.crs == crs:
145 warnings.warn(
158 raise ValueError(
146159 "CRS mismatch between CRS of the passed geometries "
147 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
160 "and 'crs'. Use 'GeoSeries.set_crs(crs, "
148161 "allow_override=True)' to overwrite CRS or "
149162 "'GeoSeries.to_crs(crs)' to reproject geometries. "
150 "CRS mismatch will raise an error in the future versions "
151 "of GeoPandas.",
152 FutureWarning,
153 stacklevel=2,
154163 )
155 # TODO: raise error in 0.9 or 0.10.
156164
157165 if isinstance(data, SingleBlockManager):
158166 if isinstance(data.blocks[0].dtype, GeometryDtype):
167175 values = data.blocks[0].values
168176 block = ExtensionBlock(values, slice(0, len(values), 1), ndim=1)
169177 data = SingleBlockManager([block], data.axes[0], fastpath=True)
170 self = super(GeoSeries, cls).__new__(cls)
171 super(GeoSeries, self).__init__(data, index=index, **kwargs)
172 self.crs = getattr(self.values, "crs", crs)
173 return self
174 warnings.warn(_SERIES_WARNING_MSG, FutureWarning, stacklevel=2)
175 return Series(data, index=index, **kwargs)
178 else:
179 raise TypeError(
180 "Non geometry data passed to GeoSeries constructor, "
181 f"received data of dtype '{data.blocks[0].dtype}'"
182 )
176183
177184 if isinstance(data, BaseGeometry):
178185 # fix problem for scalar geometries passed, ensure the list of
202209 # pd.Series with empty data gives float64 for older pandas versions
203210 s = s.astype(object)
204211 else:
205 warnings.warn(_SERIES_WARNING_MSG, FutureWarning, stacklevel=2)
206 return s
212 raise TypeError(
213 "Non geometry data passed to GeoSeries constructor, "
214 f"received data of dtype '{s.dtype}'"
215 )
207216 # try to convert to GeometryArray, if fails return plain Series
208217 try:
209218 data = from_shapely(s.values, crs)
210219 except TypeError:
211 warnings.warn(_SERIES_WARNING_MSG, FutureWarning, stacklevel=2)
212 return s
220 raise TypeError(
221 "Non geometry data passed to GeoSeries constructor, "
222 f"received data of dtype '{s.dtype}'"
223 )
213224 index = s.index
214225 name = s.name
215226
216 self = super(GeoSeries, cls).__new__(cls)
217 super(GeoSeries, self).__init__(data, index=index, name=name, **kwargs)
218
227 super().__init__(data, index=index, name=name, **kwargs)
219228 if not self.crs:
220229 self.crs = crs
221 return self
222
223 def __init__(self, *args, **kwargs):
224 # need to overwrite Series init to prevent calling it for GeoSeries
225 # (doesn't know crs, all work is already done above)
226 pass
227230
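With ``__new__`` replaced by a plain ``__init__``, both non-geometry data and CRS mismatches are now hard errors instead of warnings. A sketch of the new behaviour:

    import geopandas
    from shapely.geometry import Point

    pts = geopandas.GeoSeries([Point(0, 0)], crs="EPSG:4326")
    try:
        geopandas.GeoSeries(pts, crs="EPSG:3857")  # conflicting CRS
    except ValueError:
        # use pts.set_crs("EPSG:3857", allow_override=True) to relabel,
        # or pts.to_crs("EPSG:3857") to actually reproject
        pass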
228231 def append(self, *args, **kwargs):
229232 return self._wrapped_pandas_method("append", *args, **kwargs)
541544 Parameters
542545 ----------
543546 filename : string
544 File path or file handle to write to.
547 File path or file handle to write to. The path may specify a
548 GDAL VSI scheme.
545549 driver : string, default None
546550 The OGR format driver used to write the vector file.
547551 If not specified, it attempts to infer it from the file extension.
554558
555559 .. versionadded:: 0.7
556560 Previously the index was not written.
557
558 Notes
559 -----
560 The extra keyword arguments ``**kwargs`` are passed to fiona.open and
561 can be used to write to multi-layer data, store data within archives
562 (zip files), etc.
561 mode : string, default 'w'
562 The write mode, 'w' to overwrite the existing file and 'a' to append.
563 Not all drivers support appending. The drivers that support appending
564 are listed in fiona.supported_drivers or
565 https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
566 crs : pyproj.CRS, default None
567 If specified, the CRS is passed to Fiona to
568 better control how the file is written. If None, GeoPandas
569 will determine the crs based on the crs attribute of the GeoSeries.
570 The value can be anything accepted
571 by :meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>`,
572 such as an authority string (eg "EPSG:4326") or a WKT string.
573 engine : str, "fiona" or "pyogrio"
574 The underlying library that is used to write the file. Currently, the
575 supported options are "fiona" and "pyogrio". Defaults to "fiona" if
576 installed, otherwise tries "pyogrio".
577 **kwargs :
578 Keyword args to be passed to the engine, and can be used to write
579 to multi-layer data, store data within archives (zip files), etc.
580 In case of the "fiona" engine, the keyword arguments are passed to
581 ``fiona.open``. For more information on possible keywords, type:
582 ``import fiona; help(fiona.open)``. In case of the "pyogrio" engine,
583 the keyword arguments are passed to `pyogrio.write_dataframe`.
563584
564585 See Also
565586 --------
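A sketch of the new ``mode`` and ``engine`` keywords, assuming ``gs`` is a GeoSeries and a GeoPackage target (a driver that supports appending):

    gs.to_file("data.gpkg", driver="GPKG", mode="w")
    gs.to_file("data.gpkg", driver="GPKG", mode="a", engine="fiona")  # append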
591612
592613 @property
593614 def _constructor_expanddim(self):
594 from geopandas import GeoDataFrame
595
596 return GeoDataFrame
615 return _geoseries_expanddim
597616
598617 def _wrapped_pandas_method(self, mtd, *args, **kwargs):
599618 """Wrap a generic pandas method to ensure it returns a GeoSeries"""
625644 if self.crs is not None:
626645 result.set_crs(self.crs, inplace=True)
627646 return result
628
629 def __finalize__(self, other, method=None, **kwargs):
630 """propagate metadata from other to self"""
631 # NOTE: backported from pandas master (upcoming v0.13)
632 for name in self._metadata:
633 object.__setattr__(self, name, getattr(other, name, None))
634 return self
635647
636648 def isna(self):
637649 """
671683 GeoSeries.notna : inverse of isna
672684 GeoSeries.is_empty : detect empty geometries
673685 """
674 if self.is_empty.any():
675 warnings.warn(
676 "GeoSeries.isna() previously returned True for both missing (None) "
677 "and empty geometries. Now, it only returns True for missing values. "
678 "Since the calling GeoSeries contains empty geometries, the result "
679 "has changed compared to previous versions of GeoPandas.\n"
680 "Given a GeoSeries 's', you can use 's.is_empty | s.isna()' to get "
681 "back the old behaviour.\n\n"
682 "To further ignore this warning, you can do: \n"
683 "import warnings; warnings.filterwarnings('ignore', 'GeoSeries.isna', "
684 "UserWarning)",
685 UserWarning,
686 stacklevel=2,
687 )
688
689686 return super().isna()
690687
691688 def isnull(self):
919916
920917 index = []
921918 geometries = []
922 for idx, s in self.geometry.iteritems():
919 for idx, s in self.geometry.items():
923920 if s.type.startswith("Multi") or s.type == "GeometryCollection":
924921 geoms = s.geoms
925922 idxs = [(idx, i) for i in range(len(geoms))]
12651262 warnings.warn(
12661263 "'^' operator will be deprecated. Use the 'symmetric_difference' "
12671264 "method instead.",
1268 DeprecationWarning,
1265 FutureWarning,
12691266 stacklevel=2,
12701267 )
12711268 return self.symmetric_difference(other)
12741271 """Implement | operator as for builtin set type"""
12751272 warnings.warn(
12761273 "'|' operator will be deprecated. Use the 'union' method instead.",
1277 DeprecationWarning,
1274 FutureWarning,
12781275 stacklevel=2,
12791276 )
12801277 return self.union(other)
12831280 """Implement & operator as for builtin set type"""
12841281 warnings.warn(
12851282 "'&' operator will be deprecated. Use the 'intersection' method instead.",
1286 DeprecationWarning,
1283 FutureWarning,
12871284 stacklevel=2,
12881285 )
12891286 return self.intersection(other)
12921289 """Implement - operator as for builtin set type"""
12931290 warnings.warn(
12941291 "'-' operator will be deprecated. Use the 'difference' method instead.",
1295 DeprecationWarning,
1292 FutureWarning,
12961293 stacklevel=2,
12971294 )
12981295 return self.difference(other)
13081305
13091306 Parameters
13101307 ----------
1311 mask : GeoDataFrame, GeoSeries, (Multi)Polygon
1308 mask : GeoDataFrame, GeoSeries, (Multi)Polygon, list-like
13121309 Polygon vector layer used to clip the GeoSeries.
13131310 The mask's geometry is dissolved into one geometric feature
1314 and intersected with `gdf`.
1311 and intersected with the GeoSeries.
1312 If the mask is list-like with four elements ``(minx, miny, maxx, maxy)``,
1313 ``clip`` will use a faster rectangle clipping
1314 (:meth:`~GeoSeries.clip_by_rect`), possibly leading to slightly different
1315 results.
13151316 keep_geom_type : boolean, default False
13161317 If True, return only geometries of original type in case of intersection
13171318 resulting in multiple geometry types or GeometryCollections.
0 from distutils.version import LooseVersion
0 from packaging.version import Version
11 import json
22 import warnings
33
4 from pandas import DataFrame
4 from pandas import DataFrame, Series
55
66 from geopandas._compat import import_optional_dependency
77 from geopandas.array import from_wkb
99 import geopandas
1010 from .file import _expand_user
1111
12 METADATA_VERSION = "0.1.0"
13 # reference: https://github.com/geopandas/geo-arrow-spec
12 METADATA_VERSION = "0.4.0"
13 SUPPORTED_VERSIONS = ["0.1.0", "0.4.0"]
14 # reference: https://github.com/opengeospatial/geoparquet
1415
1516 # Metadata structure:
1617 # {
1718 # "geo": {
1819 # "columns": {
1920 # "<name>": {
20 # "crs": "<WKT or None: REQUIRED>",
2121 # "encoding": "WKB"
22 # "geometry_type": <str or list of str: REQUIRED>
23 # "crs": "<PROJJSON or None: OPTIONAL>",
24 # "orientation": "<'counterclockwise' or None: OPTIONAL>"
25 # "edges": "planar"
26 # "bbox": <list of [xmin, ymin, xmax, ymax]: OPTIONAL>
27 # "epoch": <float: OPTIONAL>
2228 # }
2329 # },
30 # "primary_column": "<str: REQUIRED>",
31 # "version": "<METADATA_VERSION>",
32 #
33 # # Additional GeoPandas specific metadata (not in metadata spec)
2434 # "creator": {
2535 # "library": "geopandas",
2636 # "version": "<geopandas.__version__>"
2737 # }
28 # "primary_column": "<str: REQUIRED>",
29 # "schema_version": "<METADATA_VERSION>"
3038 # }
3139 # }
3240
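To see this structure in practice, the ``geo`` key can be read back from a written file. A sketch, assuming ``gdf`` is a GeoDataFrame with a CRS set and pyarrow is installed:

    import json
    import pyarrow.parquet as pq

    gdf.to_parquet("cities.parquet")
    geo = json.loads(pq.read_schema("cities.parquet").metadata[b"geo"])
    geo["version"]         # "0.4.0" by default
    geo["primary_column"]  # name of the active geometry column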
3947 )
4048
4149
42 def _create_metadata(df):
50 def _remove_id_from_member_of_ensembles(json_dict):
51 """
52 Older PROJ versions will not recognize IDs of datum ensemble members that
53 were added in more recent PROJ database versions.
54
55 Cf https://github.com/opengeospatial/geoparquet/discussions/110
56 and https://github.com/OSGeo/PROJ/pull/3221
57
58 Mimicking the patch to GDAL from https://github.com/OSGeo/gdal/pull/5872
59 """
60 for key, value in json_dict.items():
61 if isinstance(value, dict):
62 _remove_id_from_member_of_ensembles(value)
63 elif key == "members" and isinstance(value, list):
64 for member in value:
65 member.pop("id", None)
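A minimal illustration of the recursive strip, on a hypothetical PROJJSON fragment:

    crs_json = {
        "datum_ensemble": {
            "members": [{"name": "World Geodetic System 1984 (G2139)", "id": {"code": 1309}}]
        }
    }
    _remove_id_from_member_of_ensembles(crs_json)
    assert "id" not in crs_json["datum_ensemble"]["members"][0]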
66
67
68 def _create_metadata(df, version=None):
4369 """Create and encode geo metadata dict.
4470
4571 Parameters
4672 ----------
4773 df : GeoDataFrame
74 version : {'0.1.0', '0.4.0', None}
75 GeoParquet specification version; if not provided will default to
76 latest supported version.
4877
4978 Returns
5079 -------
5180 dict
5281 """
82
83 version = version or METADATA_VERSION
84
85 if version not in SUPPORTED_VERSIONS:
86 raise ValueError(f"version must be one of: {', '.join(SUPPORTED_VERSIONS)}")
5387
5488 # Construct metadata for each geometry
5589 column_metadata = {}
5690 for col in df.columns[df.dtypes == "geometry"]:
5791 series = df[col]
92 geometry_types = sorted(Series(series.geom_type.unique()).dropna())
93
94 crs = None
95 if series.crs:
96 if version == "0.1.0":
97 crs = series.crs.to_wkt()
98 else: # version >= 0.4.0
99 crs = series.crs.to_json_dict()
100 _remove_id_from_member_of_ensembles(crs)
101
58102 column_metadata[col] = {
59 "crs": series.crs.to_wkt() if series.crs else None,
60103 "encoding": "WKB",
104 "crs": crs,
105 "geometry_type": geometry_types[0]
106 if len(geometry_types) == 1
107 else geometry_types,
61108 "bbox": series.total_bounds.tolist(),
62109 }
63110
64111 return {
65112 "primary_column": df._geometry_column_name,
66113 "columns": column_metadata,
67 "schema_version": METADATA_VERSION,
114 "version": METADATA_VERSION,
68115 "creator": {"library": "geopandas", "version": geopandas.__version__},
69116 }
70117
141188
142189 if not metadata:
143190 raise ValueError("Missing or malformed geo metadata in Parquet/Feather file")
191
192 # version was schema_version in 0.1.0
193 version = metadata.get("version", metadata.get("schema_version"))
194 if not version:
195 raise ValueError(
196 "'geo' metadata in Parquet/Feather file is missing required key: "
197 "'version'"
198 )
144199
145200 required_keys = ("primary_column", "columns")
146201 for key in required_keys:
154209 raise ValueError("'columns' in 'geo' metadata must be a dict")
155210
156211 # Validate that geometry columns have required metadata and values
157 required_col_keys = ("crs", "encoding")
212 # leaving out "geometry_type" for compatibility with 0.1
213 required_col_keys = ("encoding",)
158214 for col, column_metadata in metadata["columns"].items():
159215 for key in required_col_keys:
160216 if key not in column_metadata:
167223 raise ValueError("Only WKB geometry encoding is supported")
168224
169225
170 def _geopandas_to_arrow(df, index=None):
226 def _geopandas_to_arrow(df, index=None, version=None):
171227 """
172228 Helper function with main, shared logic for to_parquet/to_feather.
173229 """
174230 from pyarrow import Table
175231
176 warnings.warn(
177 "this is an initial implementation of Parquet/Feather file support and "
178 "associated metadata. This is tracking version 0.1.0 of the metadata "
179 "specification at "
180 "https://github.com/geopandas/geo-arrow-spec\n\n"
181 "This metadata specification does not yet make stability promises. "
182 "We do not yet recommend using this in a production setting unless you "
183 "are able to rewrite your Parquet/Feather files.\n\n"
184 "To further ignore this warning, you can do: \n"
185 "import warnings; warnings.filterwarnings('ignore', "
186 "message='.*initial implementation of Parquet.*')",
187 UserWarning,
188 stacklevel=4,
189 )
190
191232 _validate_dataframe(df)
192233
193234 # create geo metadata before altering incoming data frame
194 geo_metadata = _create_metadata(df)
235 geo_metadata = _create_metadata(df, version=version)
195236
196237 df = df.to_wkb()
197238
204245 return table.replace_schema_metadata(metadata)
205246
206247
207 def _to_parquet(df, path, index=None, compression="snappy", **kwargs):
248 def _to_parquet(df, path, index=None, compression="snappy", version=None, **kwargs):
208249 """
209250 Write a GeoDataFrame to the Parquet format.
210251
212253
213254 Requires 'pyarrow'.
214255
215 WARNING: this is an initial implementation of Parquet file support and
216 associated metadata. This is tracking version 0.1.0 of the metadata
217 specification at:
218 https://github.com/geopandas/geo-arrow-spec
219
220 This metadata specification does not yet make stability promises. As such,
221 we do not yet recommend using this in a production setting unless you are
222 able to rewrite your Parquet files.
223
256 This is tracking version 0.4.0 of the GeoParquet specification at:
257 https://github.com/opengeospatial/geoparquet
224258
225259 .. versionadded:: 0.8
226260
235269 output except `RangeIndex` which is stored as metadata only.
236270 compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
237271 Name of the compression to use. Use ``None`` for no compression.
272 version : {'0.1.0', '0.4.0', None}
273 GeoParquet specification version; if not provided will default to
274 latest supported version.
238275 kwargs
239276 Additional keyword arguments passed to pyarrow.parquet.write_table().
240277 """
243280 )
244281
245282 path = _expand_user(path)
246 table = _geopandas_to_arrow(df, index=index)
283 table = _geopandas_to_arrow(df, index=index, version=version)
247284 parquet.write_table(table, path, compression=compression, **kwargs)
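From the caller's side, the new keyword pins the GeoParquet spec version when writing. A sketch, assuming ``gdf`` is a GeoDataFrame with a CRS set:

    gdf.to_parquet("out.parquet")                          # defaults to 0.4.0 (PROJJSON crs)
    gdf.to_parquet("out_legacy.parquet", version="0.1.0")  # legacy metadata (WKT crs)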
248285
249286
250 def _to_feather(df, path, index=None, compression=None, **kwargs):
287 def _to_feather(df, path, index=None, compression=None, version=None, **kwargs):
251288 """
252289 Write a GeoDataFrame to the Feather format.
253290
255292
256293 Requires 'pyarrow' >= 0.17.
257294
258 WARNING: this is an initial implementation of Feather file support and
259 associated metadata. This is tracking version 0.1.0 of the metadata
260 specification at:
261 https://github.com/geopandas/geo-arrow-spec
262
263 This metadata specification does not yet make stability promises. As such,
264 we do not yet recommend using this in a production setting unless you are
265 able to rewrite your Feather files.
295 This is tracking version 0.4.0 of the GeoParquet specification at:
296 https://github.com/opengeospatial/geoparquet
266297
267298 .. versionadded:: 0.8
268299
278309 compression : {'zstd', 'lz4', 'uncompressed'}, optional
279310 Name of the compression to use. Use ``"uncompressed"`` for no
280311 compression. By default uses LZ4 if available, otherwise uncompressed.
312 version : {'0.1.0', '0.4.0', None}
313 GeoParquet specification version; if not provided will default to
314 latest supported version.
281315 kwargs
282316 Additional keyword arguments passed to pyarrow.feather.write_feather().
283317 """
287321 # TODO move this into `import_optional_dependency`
288322 import pyarrow
289323
290 if pyarrow.__version__ < LooseVersion("0.17.0"):
324 if Version(pyarrow.__version__) < Version("0.17.0"):
291325 raise ImportError("pyarrow >= 0.17 required for Feather support")
292326
293327 path = _expand_user(path)
294 table = _geopandas_to_arrow(df, index=index)
328 table = _geopandas_to_arrow(df, index=index, version=version)
295329 feather.write_feather(table, path, compression=compression, **kwargs)
296330
297331
298 def _arrow_to_geopandas(table):
332 def _arrow_to_geopandas(table, metadata=None):
299333 """
300334 Helper function with main, shared logic for read_parquet/read_feather.
301335 """
302336 df = table.to_pandas()
303337
304 metadata = table.schema.metadata
338 metadata = metadata or table.schema.metadata
305339 if metadata is None or b"geo" not in metadata:
306340 raise ValueError(
307341 """Missing geo metadata in Parquet/Feather file.
343377
344378 # Convert the WKB columns that are present back to geometry.
345379 for col in geometry_columns:
346 df[col] = from_wkb(df[col].values, crs=metadata["columns"][col]["crs"])
380 col_metadata = metadata["columns"][col]
381 if "crs" in col_metadata:
382 crs = col_metadata["crs"]
383 if isinstance(crs, dict):
384 _remove_id_from_member_of_ensembles(crs)
385 else:
386 # per the GeoParquet spec, missing CRS is to be interpreted as
387 # OGC:CRS84
388 crs = "OGC:CRS84"
389
390 df[col] = from_wkb(df[col].values, crs=crs)
347391
348392 return GeoDataFrame(df, geometry=geometry)
349393
360404 isinstance(path, str)
361405 and storage_options is None
362406 and filesystem is None
363 and LooseVersion(pyarrow.__version__) >= "5.0.0"
407 and Version(pyarrow.__version__) >= Version("5.0.0")
364408 ):
365409 # Use the native pyarrow filesystem if possible.
366410 try:
386430 return filesystem, path
387431
388432
433 def _ensure_arrow_fs(filesystem):
434 """
435 Simplified version of pyarrow.fs._ensure_filesystem. This is only needed
436 below because `pyarrow.parquet.read_metadata` does not yet accept a
437 filesystem keyword (https://issues.apache.org/jira/browse/ARROW-16719)
438 """
439 from pyarrow import fs
440
441 if isinstance(filesystem, fs.FileSystem):
442 return filesystem
443
444 # handle fsspec-compatible filesystems
445 try:
446 import fsspec
447 except ImportError:
448 pass
449 else:
450 if isinstance(filesystem, fsspec.AbstractFileSystem):
451 return fs.PyFileSystem(fs.FSSpecHandler(filesystem))
452
453 return filesystem
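A sketch of what the helper normalizes, assuming fsspec is installed:

    import fsspec
    from pyarrow import fs

    wrapped = _ensure_arrow_fs(fsspec.filesystem("memory"))
    assert isinstance(wrapped, fs.PyFileSystem)  # fsspec filesystem wrapped for pyarrow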
454
455
389456 def _read_parquet(path, columns=None, storage_options=None, **kwargs):
390457 """
391458 Load a Parquet object from the file path, returning a GeoDataFrame.
399466 * if the primary geometry column saved to this file is not included in
400467 columns, the first available geometry column will be set as the geometry
401468 column of the returned GeoDataFrame.
469
470 Supports versions 0.1.0, 0.4.0 of the GeoParquet specification at:
471 https://github.com/opengeospatial/geoparquet
472
473 If the 'crs' key is not present in the GeoParquet metadata associated with the
474 Parquet object, it will default to "OGC:CRS84" according to the specification.
402475
403476 Requires 'pyarrow'.
404477
457530 kwargs["use_pandas_metadata"] = True
458531 table = parquet.read_table(path, columns=columns, filesystem=filesystem, **kwargs)
459532
460 return _arrow_to_geopandas(table)
533 # read metadata separately to get the raw Parquet FileMetaData metadata
534 # (pyarrow doesn't properly expose those in schema.metadata for files
535 # created by GDAL - https://issues.apache.org/jira/browse/ARROW-16688)
536 metadata = None
537 if table.schema.metadata is None or b"geo" not in table.schema.metadata:
538 try:
539 # read_metadata does not accept a filesystem keyword, so need to
540 # handle this manually (https://issues.apache.org/jira/browse/ARROW-16719)
541 if filesystem is not None:
542 pa_filesystem = _ensure_arrow_fs(filesystem)
543 with pa_filesystem.open_input_file(path) as source:
544 metadata = parquet.read_metadata(source).metadata
545 else:
546 metadata = parquet.read_metadata(path).metadata
547 except Exception:
548 pass
549
550 return _arrow_to_geopandas(table, metadata)
461551
462552
463553 def _read_feather(path, columns=None, **kwargs):
473563 * if the primary geometry column saved to this file is not included in
474564 columns, the first available geometry column will be set as the geometry
475565 column of the returned GeoDataFrame.
566
567 Supports versions 0.1.0, 0.4.0 of the GeoParquet specification at:
568 https://github.com/opengeospatial/geoparquet
569
570 If the 'crs' key is not present in the Feather metadata associated with the
571 Feather file, it will default to "OGC:CRS84" according to the specification.
476572
477573 Requires 'pyarrow' >= 0.17.
478574
512608 # TODO move this into `import_optional_dependency`
513609 import pyarrow
514610
515 if pyarrow.__version__ < LooseVersion("0.17.0"):
611 if Version(pyarrow.__version__) < Version("0.17.0"):
516612 raise ImportError("pyarrow >= 0.17 required for Feather support")
517613
518614 path = _expand_user(path)
00 import os
1 from distutils.version import LooseVersion
1 from packaging.version import Version
22 from pathlib import Path
33 import warnings
44
55 import numpy as np
66 import pandas as pd
7 from pandas.api.types import is_integer_dtype
78
89 import pyproj
910 from shapely.geometry import mapping
1011 from shapely.geometry.base import BaseGeometry
1112
12 try:
13 import fiona
14
15 fiona_import_error = None
16
17 # only try to import fiona.Env if the main fiona import succeeded (otherwise you
18 # can get confusing "AttributeError: module 'fiona' has no attribute '_loading'"
19 # / partially initialized module errors)
20 try:
21 from fiona import Env as fiona_env
22 except ImportError:
23 try:
24 from fiona import drivers as fiona_env
25 except ImportError:
26 fiona_env = None
27
28 except ImportError as err:
29 fiona = None
30 fiona_import_error = str(err)
31
32
3313 from geopandas import GeoDataFrame, GeoSeries
34
3514
3615 # Adapted from pandas.io.common
3716 from urllib.request import urlopen as _urlopen
4120
4221 _VALID_URLS = set(uses_relative + uses_netloc + uses_params)
4322 _VALID_URLS.discard("")
23
24
25 fiona = None
26 fiona_env = None
27 fiona_import_error = None
28
29
30 def _import_fiona():
31 global fiona
32 global fiona_env
33 global fiona_import_error
34
35 if fiona is None:
36 try:
37 import fiona
38
39 # only try to import fiona.Env if the main fiona import succeeded
40 # (otherwise you can get confusing "AttributeError: module 'fiona'
41 # has no attribute '_loading'" / partially initialized module errors)
42 try:
43 from fiona import Env as fiona_env
44 except ImportError:
45 try:
46 from fiona import drivers as fiona_env
47 except ImportError:
48 fiona_env = None
49
50 except ImportError as err:
51 fiona = False
52 fiona_import_error = str(err)
53
54
55 pyogrio = None
56 pyogrio_import_error = None
57
58
59 def _import_pyogrio():
60 global pyogrio
61 global pyogrio_import_error
62
63 if pyogrio is None:
64 try:
65 import pyogrio
66 except ImportError as err:
67 pyogrio = False
68 pyogrio_import_error = str(err)
69
70
71 def _check_fiona(func):
72 if fiona is None:
73 raise ImportError(
74 f"the {func} requires the 'fiona' package, but it is not installed or does "
75 f"not import correctly.\nImporting fiona resulted in: {fiona_import_error}"
76 )
77
78
79 def _check_pyogrio(func):
80 if pyogrio is None:
81 raise ImportError(
82 f"the {func} requires the 'pyogrio' package, but it is not installed "
83 "or does not import correctly."
84 "\nImporting pyogrio resulted in: {pyogrio_import_error}"
85 )
86
87
88 def _check_engine(engine, func):
89 # default to "fiona" if installed, otherwise try pyogrio
90 if engine is None:
91 _import_fiona()
92 if fiona:
93 engine = "fiona"
94 else:
95 _import_pyogrio()
96 if pyogrio:
97 engine = "pyogrio"
98
99 if engine == "fiona":
100 _import_fiona()
101 _check_fiona(func)
102 elif engine == "pyogrio":
103 _import_pyogrio()
104 _check_pyogrio(func)
105 elif engine is None:
106 raise ImportError(
107 f"The {func} requires the 'pyogrio' or 'fiona' package, "
108 "but neither is installed or imports correctly."
109 f"\nImporting fiona resulted in: {fiona_import_error}"
110 f"\nImporting pyogrio resulted in: {pyogrio_import_error}"
111 )
112
113 return engine
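Engine resolution in a sketch:

    engine = _check_engine(None, "'read_file' function")
    # -> "fiona" when fiona imports, otherwise "pyogrio"; an ImportError
    #    naming both packages is raised if neither is available
    engine = _check_engine("pyogrio", "'read_file' function")  # validates the pyogrio import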
114
44115
45116 _EXTENSION_TO_DRIVER = {
46117 ".bna": "BNA",
74145 return path
75146
76147
77 def _check_fiona(func):
78 if fiona is None:
79 raise ImportError(
80 f"the {func} requires the 'fiona' package, but it is not installed or does "
81 f"not import correctly.\nImporting fiona resulted in: {fiona_import_error}"
82 )
83
84
85148 def _is_url(url):
86149 """Check to see if *url* has a valid protocol."""
87150 try:
100163 )
101164
102165
103 def _read_file(filename, bbox=None, mask=None, rows=None, **kwargs):
166 def _read_file(filename, bbox=None, mask=None, rows=None, engine=None, **kwargs):
104167 """
105168 Returns a GeoDataFrame from a file or URL.
106169
113176 be opened, or any object with a read() method (such as an open file
114177 or StringIO)
115178 bbox : tuple | GeoDataFrame or GeoSeries | shapely Geometry, default None
116 Filter features by given bounding box, GeoSeries, GeoDataFrame or a
117 shapely geometry. CRS mis-matches are resolved if given a GeoSeries
118 or GeoDataFrame. Tuple is (minx, miny, maxx, maxy) to match the
119 bounds property of shapely geometry objects. Cannot be used with mask.
179 Filter features by given bounding box, GeoSeries, GeoDataFrame or a shapely
180 geometry. With engine="fiona", CRS mis-matches are resolved if given a GeoSeries
181 or GeoDataFrame. With engine="pyogrio", bbox must be in the same CRS as the
182 dataset. Tuple is (minx, miny, maxx, maxy) to match the bounds property of
183 shapely geometry objects. Cannot be used with mask.
120184 mask : dict | GeoDataFrame or GeoSeries | shapely Geometry, default None
121185 Filter for features that intersect with the given dict-like geojson
122186 geometry, GeoSeries, GeoDataFrame or shapely geometry.
125189 rows : int or slice, default None
126190 Load in specific rows by passing an integer (first `n` rows) or a
127191 slice() object.
192 engine : str, "fiona" or "pyogrio"
193 The underlying library that is used to read the file. Currently, the
194 supported options are "fiona" and "pyogrio". Defaults to "fiona" if
195 installed, otherwise tries "pyogrio".
128196 **kwargs :
129 Keyword args to be passed to the `open` or `BytesCollection` method
130 in the fiona library when opening the file. For more information on
131 possible keywords, type:
132 ``import fiona; help(fiona.open)``
197 Keyword args to be passed to the engine. In case of the "fiona" engine,
198 the keyword arguments are passed to the `open` or `BytesCollection`
199 method in the fiona library when opening the file. For more information
200 on possible keywords, type: ``import fiona; help(fiona.open)``. In
201 case of the "pyogrio" engine, the keyword arguments are passed to
202 `pyogrio.read_dataframe`.
133203
134204 Examples
135205 --------
162232 may fail. In this case, the proper encoding can be specified explicitly
163233 by using the encoding keyword parameter, e.g. ``encoding='utf-8'``.
164234 """
165 _check_fiona("'read_file' function")
235 engine = _check_engine(engine, "'read_file' function")
236
166237 filename = _expand_user(filename)
167238
239 from_bytes = False
168240 if _is_url(filename):
169241 req = _urlopen(filename)
170242 path_or_bytes = req.read()
171 reader = fiona.BytesCollection
243 from_bytes = True
172244 elif pd.api.types.is_file_like(filename):
173245 data = filename.read()
174246 path_or_bytes = data.encode("utf-8") if isinstance(data, str) else data
175 reader = fiona.BytesCollection
247 from_bytes = True
176248 else:
249 path_or_bytes = filename
250
251 if engine == "fiona":
252 return _read_file_fiona(
253 path_or_bytes, from_bytes, bbox=bbox, mask=mask, rows=rows, **kwargs
254 )
255 elif engine == "pyogrio":
256 return _read_file_pyogrio(
257 path_or_bytes, bbox=bbox, mask=mask, rows=rows, **kwargs
258 )
259 else:
260 raise ValueError(f"unknown engine '{engine}'")
261
262
263 def _read_file_fiona(
264 path_or_bytes, from_bytes, bbox=None, mask=None, rows=None, **kwargs
265 ):
266 if not from_bytes:
177267 # Opening a file via URL or file-like-object above automatically detects a
178268 # zipped file. In order to match that behavior, attempt to add a zip scheme
179269 # if missing.
180 if _is_zip(str(filename)):
181 parsed = fiona.parse_path(str(filename))
270 if _is_zip(str(path_or_bytes)):
271 parsed = fiona.parse_path(str(path_or_bytes))
182272 if isinstance(parsed, fiona.path.ParsedPath):
183273 # If fiona is able to parse the path, we can safely look at the scheme
184274 # and update it to have a zip scheme if necessary.
185275 schemes = (parsed.scheme or "").split("+")
186276 if "zip" not in schemes:
187277 parsed.scheme = "+".join(["zip"] + schemes)
188 filename = parsed.name
278 path_or_bytes = parsed.name
189279 elif isinstance(parsed, fiona.path.UnparsedPath) and not str(
190 filename
280 path_or_bytes
191281 ).startswith("/vsi"):
192282 # If fiona is unable to parse the path, it might have a Windows drive
193283 # scheme. Try adding zip:// to the front. If the path starts with "/vsi"
194284 # it is a legacy GDAL path type, so let it pass unmodified.
195 filename = "zip://" + parsed.name
196 path_or_bytes = filename
285 path_or_bytes = "zip://" + parsed.name
286
287 if from_bytes:
288 reader = fiona.BytesCollection
289 else:
197290 reader = fiona.open
198291
199292 with fiona_env():
235328 f_filt = features
236329 # get list of columns
237330 columns = list(features.schema["properties"])
331 datetime_fields = [
332 k for (k, v) in features.schema["properties"].items() if v == "datetime"
333 ]
238334 if kwargs.get("ignore_geometry", False):
239 return pd.DataFrame(
335 df = pd.DataFrame(
240336 [record["properties"] for record in f_filt], columns=columns
241337 )
242
243 return GeoDataFrame.from_features(
244 f_filt, crs=crs, columns=columns + ["geometry"]
245 )
338 else:
339 df = GeoDataFrame.from_features(
340 f_filt, crs=crs, columns=columns + ["geometry"]
341 )
342 for k in datetime_fields:
343 # fiona only supports up to ms precision, any microseconds are
344 # floating point rounding error
345 df[k] = pd.to_datetime(df[k]).dt.round(freq="ms")
346 return df
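The rounding applied per datetime field, in isolation (a sketch):

    import pandas as pd

    s = pd.to_datetime(pd.Series(["2022-01-01 00:00:00.123456"]))
    s.dt.round(freq="ms")  # -> 2022-01-01 00:00:00.123000; sub-ms digits are fiona noise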
347
348
349 def _read_file_pyogrio(path_or_bytes, bbox=None, mask=None, rows=None, **kwargs):
350 import pyogrio
351
352 if rows is not None:
353 if isinstance(rows, int):
354 kwargs["max_features"] = rows
355 elif isinstance(rows, slice):
356 if rows.start is not None:
357 kwargs["skip_features"] = rows.start
358 if rows.stop is not None:
359 kwargs["max_features"] = rows.stop - (rows.start or 0)
360 if rows.step is not None:
361 raise ValueError("slice with step is not supported")
362 else:
363 raise TypeError("'rows' must be an integer or a slice.")
364 if bbox is not None:
365 if isinstance(bbox, (GeoDataFrame, GeoSeries)):
366 bbox = tuple(bbox.total_bounds)
367 elif isinstance(bbox, BaseGeometry):
368 bbox = bbox.bounds
369 if len(bbox) != 4:
370 raise ValueError("'bbox' should be a length-4 tuple.")
371 if mask is not None:
372 raise ValueError(
373 "The 'mask' keyword is not supported with the 'pyogrio' engine. "
374 "You can use 'bbox' instead."
375 )
376 if kwargs.pop("ignore_geometry", False):
377 kwargs["read_geometry"] = False
378
379 # TODO: if bbox is not None, check its CRS vs the CRS of the file
380 return pyogrio.read_dataframe(path_or_bytes, bbox=bbox, **kwargs)
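Seen from ``geopandas.read_file``, the translation above means the following (a sketch, assuming a hypothetical data.gpkg):

    import geopandas

    # rows=slice(10, 20) becomes skip_features=10, max_features=10
    gdf = geopandas.read_file("data.gpkg", engine="pyogrio", rows=slice(10, 20))
    # for this engine, bbox must already be in the dataset's CRS
    gdf = geopandas.read_file("data.gpkg", engine="pyogrio", bbox=(0, 40, 30, 60))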
246381
247382
248383 def read_file(*args, **kwargs):
249 import warnings
250
251384 warnings.warn(
252385 "geopandas.io.file.read_file() is intended for internal "
253386 "use only, and will be deprecated. Use geopandas.read_file() instead.",
254 DeprecationWarning,
387 FutureWarning,
255388 stacklevel=2,
256389 )
257390
259392
260393
261394 def to_file(*args, **kwargs):
262 import warnings
263
264395 warnings.warn(
265396 "geopandas.io.file.to_file() is intended for internal "
266397 "use only, and will be deprecated. Use GeoDataFrame.to_file() "
267398 "or GeoSeries.to_file() instead.",
268 DeprecationWarning,
399 FutureWarning,
269400 stacklevel=2,
270401 )
271402
298429 index=None,
299430 mode="w",
300431 crs=None,
432 engine=None,
301433 **kwargs,
302434 ):
303435 """
311443 ----------
312444 df : GeoDataFrame to be written
313445 filename : string
314 File path or file handle to write to.
446 File path or file handle to write to. The path may specify a
447 GDAL VSI scheme.
315448 driver : string, default None
316449 The OGR format driver used to write the vector file.
317450 If not specified, it attempts to infer it from the file extension.
319452 schema : dict, default None
320453 If specified, the schema dictionary is passed to Fiona to
321454 better control how the file is written. If None, GeoPandas
322 will determine the schema based on each column's dtype
455 will determine the schema based on each column's dtype.
456 Not supported for the "pyogrio" engine.
323457 index : bool, default None
324458 If True, write index into one or more columns (for MultiIndex).
325459 Default None writes the index into one or more columns only if
340474 The value can be anything accepted
341475 by :meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>`,
342476 such as an authority string (eg "EPSG:4326") or a WKT string.
343
344 The *kwargs* are passed to fiona.open and can be used to write
345 to multi-layer data, store data within archives (zip files), etc.
346 The path may specify a fiona VSI scheme.
477 engine : str, "fiona" or "pyogrio"
478 The underlying library that is used to write the file. Currently, the
479 supported options are "fiona" and "pyogrio". Defaults to "fiona" if
480 installed, otherwise tries "pyogrio".
481 **kwargs :
482 Keyword args to be passed to the engine, and can be used to write
483 to multi-layer data, store data within archives (zip files), etc.
484 In case of the "fiona" engine, the keyword arguments are passed to
485 ``fiona.open``. For more information on possible keywords, type:
486 ``import fiona; help(fiona.open)``. In case of the "pyogrio" engine,
487 the keyword arguments are passed to `pyogrio.write_dataframe`.
347488
348489 Notes
349490 -----
351492 may fail. In this case, the proper encoding can be specified explicitly
352493 by using the encoding keyword parameter, e.g. ``encoding='utf-8'``.
353494 """
354 _check_fiona("'to_file' method")
495 engine = _check_engine(engine, "'to_file' method")
496
355497 filename = _expand_user(filename)
356498
357499 if index is None:
358500 # Determine if index attribute(s) should be saved to file
359 index = list(df.index.names) != [None] or type(df.index) not in (
360 pd.RangeIndex,
361 pd.Int64Index,
362 )
501 # (only if they are named or are non-integer)
502 index = list(df.index.names) != [None] or not is_integer_dtype(df.index.dtype)
363503 if index:
364504 df = df.reset_index(drop=False)
365 if schema is None:
366 schema = infer_schema(df)
367 if crs:
368 crs = pyproj.CRS.from_user_input(crs)
369 else:
370 crs = df.crs
371505
372506 if driver is None:
373507 driver = _detect_driver(filename)
378512 "ESRI Shapefile.",
379513 stacklevel=3,
380514 )
515
516 if engine == "fiona":
517 _to_file_fiona(df, filename, driver, schema, crs, mode, **kwargs)
518 elif engine == "pyogrio":
519 _to_file_pyogrio(df, filename, driver, schema, crs, mode, **kwargs)
520 else:
521 raise ValueError(f"unknown engine '{engine}'")
522
523
524 def _to_file_fiona(df, filename, driver, schema, crs, mode, **kwargs):
525
526 if schema is None:
527 schema = infer_schema(df)
528
529 if crs:
530 crs = pyproj.CRS.from_user_input(crs)
531 else:
532 crs = df.crs
381533
382534 with fiona_env():
383535 crs_wkt = None
385537 gdal_version = fiona.env.get_gdal_release_name()
386538 except AttributeError:
387539 gdal_version = "2.0.0" # just assume it is not the latest
388 if LooseVersion(gdal_version) >= LooseVersion("3.0.0") and crs:
540 if Version(gdal_version) >= Version("3.0.0") and crs:
389541 crs_wkt = crs.to_wkt()
390542 elif crs:
391543 crs_wkt = crs.to_wkt("WKT1_GDAL")
393545 filename, mode=mode, driver=driver, crs_wkt=crs_wkt, schema=schema, **kwargs
394546 ) as colxn:
395547 colxn.writerecords(df.iterfeatures())
548
549
550 def _to_file_pyogrio(df, filename, driver, schema, crs, mode, **kwargs):
551 import pyogrio
552
553 if schema is not None:
554 raise ValueError(
555 "The 'schema' argument is not supported with the 'pyogrio' engine."
556 )
557
558 if mode != "w":
559 raise ValueError(
560 "Only mode='w' is supported for now with the 'pyogrio' engine."
561 )
562
563 if crs is not None:
564 raise ValueError("Passing 'crs' it not supported with the 'pyogrio' engine.")
565
566 # for the fiona engine, this check is done in gdf.iterfeatures()
567 if not df.columns.is_unique:
568 raise ValueError("GeoDataFrame cannot contain duplicated column names.")
569
570 pyogrio.write_dataframe(df, filename, driver=driver, **kwargs)
396571
397572
398573 def infer_schema(df):
424599 )
425600
426601 if df.empty:
427 raise ValueError("Cannot write empty DataFrame to file.")
602 warnings.warn(
603 "You are attempting to write an empty DataFrame to file. "
604 "For some drivers, this operation may fail.",
605 UserWarning,
606 stacklevel=3,
607 )
428608
429609 # Since https://github.com/Toblerity/Fiona/issues/446 resolution,
430610 # Fiona allows a list of geometry types
181181 warnings.warn(
182182 "geopandas.io.sql.read_postgis() is intended for internal "
183183 "use only, and will be deprecated. Use geopandas.read_postgis() instead.",
184 DeprecationWarning,
184 FutureWarning,
185185 stacklevel=2,
186186 )
187187
229229
230230 # Check for 3D-coordinates
231231 if any(gdf.geometry.has_z):
232 target_geom_type = target_geom_type + "Z"
232 target_geom_type += "Z"
233233
234234 return target_geom_type, has_curve
235235
241241
242242 # Use geoalchemy2 default for srid
243243 # Note: undefined srid in PostGIS is 0
244 srid = -1
244 srid = None
245245 warning_msg = (
246246 "Could not parse CRS from the GeoDataFrame. "
247 + "Inserting data without defined CRS.",
247 "Inserting data without defined CRS."
248248 )
249249 if gdf.crs is not None:
250250 try:
251 srid = gdf.crs.to_epsg(min_confidence=25)
252 if srid is None:
253 srid = -1
254 warnings.warn(warning_msg, UserWarning, stacklevel=2)
251 for confidence in (100, 70, 25):
252 srid = gdf.crs.to_epsg(min_confidence=confidence)
253 if srid is not None:
254 break
255 auth_srid = gdf.crs.to_authority(
256 auth_name="ESRI", min_confidence=confidence
257 )
258 if auth_srid is not None:
259 srid = int(auth_srid[1])
260 break
255261 except Exception:
256262 warnings.warn(warning_msg, UserWarning, stacklevel=2)
263
264 if srid is None:
265 srid = -1
266 warnings.warn(warning_msg, UserWarning, stacklevel=2)
267
257268 return srid
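The confidence ladder in isolation (a sketch):

    import pyproj

    crs = pyproj.CRS("EPSG:4326")
    srid = None
    for confidence in (100, 70, 25):
        srid = crs.to_epsg(min_confidence=confidence)
        if srid is not None:
            break  # EPSG:4326 already matches at confidence 100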
258269
259270
271282
272283
273284 def _convert_to_ewkb(gdf, geom_name, srid):
274 """Convert geometries to ewkb. """
285 """Convert geometries to ewkb."""
275286 if compat.USE_PYGEOS:
276287 from pygeos import set_srid, to_wkb
277288
368379 try:
369380 from geoalchemy2 import Geometry
370381 except ImportError:
371 raise ImportError("'to_postgis()' requires geoalchemy2 package. ")
372
373 if not compat.SHAPELY_GE_17:
374 raise ImportError(
375 "'to_postgis()' requires newer version of Shapely "
376 "(>= '1.7.0').\nYou can update the library using "
377 "'pip install shapely --upgrade' or using "
378 "'conda update shapely' if using conda package manager."
379 )
382 raise ImportError("'to_postgis()' requires geoalchemy2 package.")
380383
381384 gdf = gdf.copy()
382385 geom_name = gdf.geometry.name
geopandas/io/tests/data/pickle/0.5.1_pd-0.25.3_py-3.7.3_x86_64_linux.pickle
Binary diff not shown
3030
3131
3232 def create_pickle_data():
33 """ create the pickle data """
33 """create the pickle data"""
3434
3535 # custom geometry column name
3636 gdf_the_geom = geopandas.GeoDataFrame(
00 from __future__ import absolute_import
11
2 from distutils.version import LooseVersion
2 from itertools import product
3 import json
4 from packaging.version import Version
35 import os
6 import pathlib
47
58 import pytest
69 from pandas import DataFrame, read_parquet as pd_read_parquet
710 from pandas.testing import assert_frame_equal
811 import numpy as np
9 from shapely.geometry import box
12 import pyproj
13 from shapely.geometry import box, Point, MultiPolygon
14
1015
1116 import geopandas
1217 from geopandas import GeoDataFrame, read_file, read_parquet, read_feather
1318 from geopandas.array import to_wkb
1419 from geopandas.datasets import get_path
1520 from geopandas.io.arrow import (
21 SUPPORTED_VERSIONS,
1622 _create_metadata,
1723 _decode_metadata,
1824 _encode_metadata,
25 _geopandas_to_arrow,
1926 _get_filesystem_path,
27 _remove_id_from_member_of_ensembles,
2028 _validate_dataframe,
2129 _validate_metadata,
2230 METADATA_VERSION,
2331 )
2432 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
33 from geopandas.tests.util import mock
34
35
36 DATA_PATH = pathlib.Path(os.path.dirname(__file__)) / "data"
2537
2638
2739 # Skip all tests in this module if pyarrow is not available
2840 pyarrow = pytest.importorskip("pyarrow")
29
30 # TEMPORARY: hide warning from to_parquet
31 pytestmark = pytest.mark.filterwarnings("ignore:.*initial implementation of Parquet.*")
3241
3342
3443 @pytest.fixture(
3746 pytest.param(
3847 "feather",
3948 marks=pytest.mark.skipif(
40 pyarrow.__version__ < LooseVersion("0.17.0"),
49 Version(pyarrow.__version__) < Version("0.17.0"),
4150 reason="needs pyarrow >= 0.17",
4251 ),
4352 ),
5665 metadata = _create_metadata(df)
5766
5867 assert isinstance(metadata, dict)
59 assert metadata["schema_version"] == METADATA_VERSION
68 assert metadata["version"] == METADATA_VERSION
69 assert metadata["primary_column"] == "geometry"
70 assert "geometry" in metadata["columns"]
71 crs_expected = df.crs.to_json_dict()
72 _remove_id_from_member_of_ensembles(crs_expected)
73 assert metadata["columns"]["geometry"]["crs"] == crs_expected
74 assert metadata["columns"]["geometry"]["encoding"] == "WKB"
75 assert metadata["columns"]["geometry"]["geometry_type"] == [
76 "MultiPolygon",
77 "Polygon",
78 ]
79
80 assert np.array_equal(
81 metadata["columns"]["geometry"]["bbox"], df.geometry.total_bounds
82 )
83
6084 assert metadata["creator"]["library"] == "geopandas"
6185 assert metadata["creator"]["version"] == geopandas.__version__
62 assert metadata["primary_column"] == "geometry"
63 assert "geometry" in metadata["columns"]
64 assert metadata["columns"]["geometry"]["crs"] == df.geometry.crs.to_wkt()
65 assert metadata["columns"]["geometry"]["encoding"] == "WKB"
66
67 assert np.array_equal(
68 metadata["columns"]["geometry"]["bbox"], df.geometry.total_bounds
69 )
86
87
88 def test_crs_metadata_datum_ensemble():
89 # compatibility for older PROJ versions using PROJJSON with datum ensembles
90 # https://github.com/geopandas/geopandas/pull/2453
91 crs = pyproj.CRS("EPSG:4326")
92 crs_json = crs.to_json_dict()
93 check_ensemble = False
94 if "datum_ensemble" in crs_json:
95 # older version of PROJ don't yet have datum ensembles
96 check_ensemble = True
97 assert "id" in crs_json["datum_ensemble"]["members"][0]
98 _remove_id_from_member_of_ensembles(crs_json)
99 if check_ensemble:
100 assert "id" not in crs_json["datum_ensemble"]["members"][0]
101 # ensure roundtrip still results in an equivalent CRS
102 assert pyproj.CRS(crs_json) == crs
103
104
105 def test_write_metadata_invalid_spec_version():
106 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:4326")
107 with pytest.raises(ValueError, match="version must be one of"):
108 _create_metadata(gdf, version="invalid")
70109
71110
72111 def test_encode_metadata():
81120
82121 expected = {"a": "b"}
83122 assert _decode_metadata(metadata_str) == expected
123
124 assert _decode_metadata(None) is None
84125
85126
86127 def test_validate_dataframe():
111152 {
112153 "primary_column": "geometry",
113154 "columns": {"geometry": {"crs": None, "encoding": "WKB"}},
155 "schema_version": "0.1.0",
114156 }
115157 )
116158
117159 _validate_metadata(
118160 {
119161 "primary_column": "geometry",
120 "columns": {"geometry": {"crs": "WKT goes here", "encoding": "WKB"}},
162 "columns": {"geometry": {"crs": None, "encoding": "WKB"}},
163 "version": "<version>",
164 }
165 )
166
167 _validate_metadata(
168 {
169 "primary_column": "geometry",
170 "columns": {
171 "geometry": {
172 "crs": {
173 # truncated PROJJSON for testing, as PROJJSON contents
174 # not validated here
175 "id": {"authority": "EPSG", "code": 4326},
176 },
177 "encoding": "WKB",
178 }
179 },
180 "version": "0.4.0",
121181 }
122182 )
123183
125185 @pytest.mark.parametrize(
126186 "metadata,error",
127187 [
188 (None, "Missing or malformed geo metadata in Parquet/Feather file"),
128189 ({}, "Missing or malformed geo metadata in Parquet/Feather file"),
129 (
130 {"primary_column": "foo"},
131 "'geo' metadata in Parquet/Feather file is missing required key:",
132 ),
190 # missing "version" key:
133191 (
134192 {"primary_column": "foo", "columns": None},
135193 "'geo' metadata in Parquet/Feather file is missing required key",
136194 ),
195 # missing "columns" key:
137196 (
138 {"primary_column": "foo", "columns": []},
197 {"primary_column": "foo", "version": "<version>"},
198 "'geo' metadata in Parquet/Feather file is missing required key:",
199 ),
200 # missing "primary_column"
201 (
202 {"columns": [], "version": "<version>"},
203 "'geo' metadata in Parquet/Feather file is missing required key:",
204 ),
205 (
206 {"primary_column": "foo", "columns": [], "version": "<version>"},
139207 "'columns' in 'geo' metadata must be a dict",
140208 ),
209 # missing "encoding" for column
141210 (
142 {"primary_column": "foo", "columns": {"foo": {}}},
211 {"primary_column": "foo", "columns": {"foo": {}}, "version": "<version>"},
143212 (
144 "'geo' metadata in Parquet/Feather file is missing required key 'crs' "
145 "for column 'foo'"
213 "'geo' metadata in Parquet/Feather file is missing required key "
214 "'encoding' for column 'foo'"
146215 ),
147216 ),
148 (
149 {"primary_column": "foo", "columns": {"foo": {"crs": None}}},
150 "'geo' metadata in Parquet/Feather file is missing required key",
151 ),
152 (
153 {"primary_column": "foo", "columns": {"foo": {"encoding": None}}},
154 "'geo' metadata in Parquet/Feather file is missing required key",
155 ),
217 # invalid column encoding
156218 (
157219 {
158220 "primary_column": "foo",
159221 "columns": {"foo": {"crs": None, "encoding": None}},
222 "version": "<version>",
160223 },
161224 "Only WKB geometry encoding is supported",
162225 ),
164227 {
165228 "primary_column": "foo",
166229 "columns": {"foo": {"crs": None, "encoding": "BKW"}},
230 "version": "<version>",
167231 },
168232 "Only WKB geometry encoding is supported",
169233 ),
172236 def test_validate_metadata_invalid(metadata, error):
173237 with pytest.raises(ValueError, match=error):
174238 _validate_metadata(metadata)
239
240
241 def test_to_parquet_fails_on_invalid_engine(tmpdir):
242 df = GeoDataFrame(data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)])
243
244 with pytest.raises(
245 ValueError,
246 match=(
247 "GeoPandas only supports using pyarrow as the engine for "
248 "to_parquet: 'fastparquet' passed instead."
249 ),
250 ):
251 df.to_parquet(tmpdir / "test.parquet", engine="fastparquet")
252
253
254 @mock.patch("geopandas.io.arrow._to_parquet")
255 def test_to_parquet_does_not_pass_engine_along(mock_to_parquet):
256 df = GeoDataFrame(data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)])
257 df.to_parquet("", engine="pyarrow")
258 # assert that engine keyword is not passed through to _to_parquet (and thus
259 # parquet.write_table)
260 mock_to_parquet.assert_called_with(
261 df, "", compression="snappy", index=None, version=None
262 )
175263
176264
177265 # TEMPORARY: used to determine if pyarrow fails for roundtripping pandas data
216304
217305 filename = os.path.join(str(tmpdir), "test.pq")
218306
219 # TEMP: Initial implementation should raise a UserWarning
220 with pytest.warns(UserWarning, match="initial implementation"):
221 writer(df, filename)
307 writer(df, filename)
222308
223309 assert os.path.exists(filename)
224310
270356
271357
272358 @pytest.mark.skipif(
273 pyarrow.__version__ < LooseVersion("0.17.0"),
359 Version(pyarrow.__version__) < Version("0.17.0"),
274360 reason="Feather only supported for pyarrow >= 0.17",
275361 )
276362 @pytest.mark.parametrize("compression", ["uncompressed", "lz4", "zstd"])
488574
489575
490576 @pytest.mark.skipif(
491 pyarrow.__version__ >= LooseVersion("0.17.0"),
577 Version(pyarrow.__version__) >= Version("0.17.0"),
492578 reason="Feather only supported for pyarrow >= 0.17",
493579 )
494580 def test_feather_arrow_version(tmpdir):
527613 result = read_parquet("memory://data.parquet", filesystem=memfs)
528614 assert_geodataframe_equal(result, df)
529615
616 # reset fsspec registry
617 fsspec.register_implementation(
618 "memory", fsspec.implementations.memory.MemoryFileSystem, clobber=True
619 )
620
530621
531622 def test_non_fsspec_url_with_storage_options_raises():
532623 with pytest.raises(ValueError, match="storage_options"):
535626
536627
537628 @pytest.mark.skipif(
538 pyarrow.__version__ < LooseVersion("5.0.0"),
629 Version(pyarrow.__version__) < Version("5.0.0"),
539630 reason="pyarrow.fs requires pyarrow>=5.0.0",
540631 )
541632 def test_prefers_pyarrow_fs():
559650 f_df = geopandas.read_feather(test_file)
560651 assert_geodataframe_equal(gdf, f_df, check_crs=True)
561652 os.remove(os.path.expanduser(test_file))
653
654
655 @pytest.mark.parametrize("format", ["feather", "parquet"])
656 def test_write_read_default_crs(tmpdir, format):
657 if format == "feather":
658 from pyarrow.feather import write_feather as write
659 else:
660 from pyarrow.parquet import write_table as write
661
662 filename = os.path.join(str(tmpdir), f"test.{format}")
663 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)])
664 table = _geopandas_to_arrow(gdf)
665
666 # update the geo metadata to strip 'crs' entry
667 metadata = table.schema.metadata
668 geo_metadata = _decode_metadata(metadata[b"geo"])
669 del geo_metadata["columns"]["geometry"]["crs"]
670 metadata.update({b"geo": _encode_metadata(geo_metadata)})
671 table = table.replace_schema_metadata(metadata)
672
673 write(table, filename)
674
675 read = getattr(geopandas, f"read_{format}")
676 df = read(filename)
677 assert df.crs.equals(pyproj.CRS("OGC:CRS84"))
678
679
680 @pytest.mark.parametrize(
681 "format,version", product(["feather", "parquet"], [None] + SUPPORTED_VERSIONS)
682 )
683 def test_write_spec_version(tmpdir, format, version):
684 if format == "feather":
685 from pyarrow.feather import read_table
686
687 else:
688 from pyarrow.parquet import read_table
689
690 filename = os.path.join(str(tmpdir), f"test.{format}")
691 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="EPSG:4326")
692 write = getattr(gdf, f"to_{format}")
693 write(filename, version=version)
694
695 # ensure that we can roundtrip data regardless of version
696 read = getattr(geopandas, f"read_{format}")
697 df = read(filename)
698 assert_geodataframe_equal(df, gdf)
699
700 table = read_table(filename)
701 metadata = json.loads(table.schema.metadata[b"geo"])
702 assert metadata["version"] == version or METADATA_VERSION
703
704 # verify that CRS is correctly handled between versions
705 if version == "0.1.0":
706 assert metadata["columns"]["geometry"]["crs"] == gdf.crs.to_wkt()
707
708 else:
709 crs_expected = gdf.crs.to_json_dict()
710 _remove_id_from_member_of_ensembles(crs_expected)
711 assert metadata["columns"]["geometry"]["crs"] == crs_expected
712
713
714 @pytest.mark.parametrize("version", ["0.1.0", "0.4.0"])
715 def test_read_versioned_file(version):
716 """
717 Verify that files written under different metadata spec versions can be read.
718 The test files were created for each supported version with:
719
720 # small dummy test dataset (not naturalearth_lowres, as this can change over time)
721 from shapely.geometry import box, MultiPolygon
722 df = geopandas.GeoDataFrame(
723 {"col_str": ["a", "b"], "col_int": [1, 2], "col_float": [0.1, 0.2]},
724 geometry=[MultiPolygon([box(0, 0, 1, 1), box(2, 2, 3, 3)]), box(4, 4, 5,5)],
725 crs="EPSG:4326",
726 )
727 df.to_feather(DATA_PATH / 'arrow' / f'test_data_v{METADATA_VERSION}.feather') # noqa: E501
728 df.to_parquet(DATA_PATH / 'arrow' / f'test_data_v{METADATA_VERSION}.parquet') # noqa: E501
729 """
730 check_crs = Version(pyproj.__version__) >= Version("3.0.0")
731
732 expected = geopandas.GeoDataFrame(
733 {"col_str": ["a", "b"], "col_int": [1, 2], "col_float": [0.1, 0.2]},
734 geometry=[MultiPolygon([box(0, 0, 1, 1), box(2, 2, 3, 3)]), box(4, 4, 5, 5)],
735 crs="EPSG:4326",
736 )
737
738 df = geopandas.read_feather(DATA_PATH / "arrow" / f"test_data_v{version}.feather")
739 assert_geodataframe_equal(df, expected, check_crs=check_crs)
740
741 df = geopandas.read_parquet(DATA_PATH / "arrow" / f"test_data_v{version}.parquet")
742 assert_geodataframe_equal(df, expected, check_crs=check_crs)
743
744
745 def test_read_gdal_files():
746 """
747 Verify that files written by GDAL can be read by geopandas.
748 Since it is not yet straightforward to install GDAL with
749 Parquet/Arrow enabled in our conda setup, we are testing with some
750 generated files included in the repo (using GDAL 3.5.0):
751
752 # small dummy test dataset (not naturalearth_lowres, as this can change over time)
753 from shapely.geometry import box, MultiPolygon
754 df = geopandas.GeoDataFrame(
755 {"col_str": ["a", "b"], "col_int": [1, 2], "col_float": [0.1, 0.2]},
756 geometry=[MultiPolygon([box(0, 0, 1, 1), box(2, 2, 3, 3)]), box(4, 4, 5,5)],
757 crs="EPSG:4326",
758 )
759 df.to_file("test_data.gpkg", GEOMETRY_NAME="geometry")
760 and then the gpkg file is converted to Parquet/Arrow with:
761 $ ogr2ogr -f Parquet -lco FID= test_data_gdal350.parquet test_data.gpkg
762 $ ogr2ogr -f Arrow -lco FID= -lco GEOMETRY_ENCODING=WKB test_data_gdal350.arrow test_data.gpkg # noqa: E501
763 """
764 check_crs = Version(pyproj.__version__) >= Version("3.0.0")
765
766 expected = geopandas.GeoDataFrame(
767 {"col_str": ["a", "b"], "col_int": [1, 2], "col_float": [0.1, 0.2]},
768 geometry=[MultiPolygon([box(0, 0, 1, 1), box(2, 2, 3, 3)]), box(4, 4, 5, 5)],
769 crs="EPSG:4326",
770 )
771
772 df = geopandas.read_parquet(DATA_PATH / "arrow" / "test_data_gdal350.parquet")
773 assert_geodataframe_equal(df, expected, check_crs=check_crs)
774
775 df = geopandas.read_feather(DATA_PATH / "arrow" / "test_data_gdal350.arrow")
776 assert_geodataframe_equal(df, expected, check_crs=check_crs)
777
778
779 def test_parquet_read_partitioned_dataset(tmpdir):
780 # we don't yet explicitly support this (in writing), but for Parquet it
781 # works for reading (by relying on pyarrow.read_table)
782 df = read_file(get_path("naturalearth_lowres"))
783
784 # manually create partitioned dataset
785 basedir = tmpdir / "partitioned_dataset"
786 basedir.mkdir()
787 df[:100].to_parquet(basedir / "data1.parquet")
788 df[100:].to_parquet(basedir / "data2.parquet")
789
790 result = read_parquet(basedir)
791 assert_geodataframe_equal(result, df)
792
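Reading the directory works because geopandas defers to pyarrow.parquet.read_table, which accepts a dataset directory as well as a single file. A sketch of the underlying call, assuming pyarrow is installed and the directory layout from the test above:

    import pyarrow.parquet as pq

    # reads data1.parquet and data2.parquet from the directory as one table;
    # geopandas.read_parquet(basedir) builds on the same call
    table = pq.read_table("partitioned_dataset")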
793
794 def test_parquet_read_partitioned_dataset_fsspec(tmpdir):
795 fsspec = pytest.importorskip("fsspec")
796
797 df = read_file(get_path("naturalearth_lowres"))
798
799 # manually create partitioned dataset
800 memfs = fsspec.filesystem("memory")
801 memfs.mkdir("partitioned_dataset")
802 with memfs.open("partitioned_dataset/data1.parquet", "wb") as f:
803 df[:100].to_parquet(f)
804 with memfs.open("partitioned_dataset/data2.parquet", "wb") as f:
805 df[100:].to_parquet(f)
806
807 result = read_parquet("memory://partitioned_dataset")
808 assert_geodataframe_equal(result, df)
00 from collections import OrderedDict
11 import datetime
2 from packaging.version import Version
23 import io
34 import os
45 import pathlib
78 import numpy as np
89 import pandas as pd
910
10 import fiona
11 import pytz
12 from pandas.testing import assert_series_equal
1113 from shapely.geometry import Point, Polygon, box
1214
1315 import geopandas
1416 from geopandas import GeoDataFrame, read_file
15 from geopandas.io.file import fiona_env, _detect_driver, _EXTENSION_TO_DRIVER
17 from geopandas.io.file import _detect_driver, _EXTENSION_TO_DRIVER
1618
1719 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
1820 from geopandas.tests.util import PACKAGE_DIR, validate_boro_df
2022 import pytest
2123
2224
25 try:
26 import pyogrio
27 except ImportError:
28 pyogrio = False
29
30
31 try:
32 import fiona
33
34 FIONA_GE_1814 = Version(fiona.__version__) >= Version(
35 "1.8.14"
36 ) # datetime roundtrip
37 except ImportError:
38 fiona = False
39 FIONA_GE_1814 = False
40
41
42 PYOGRIO_MARK = pytest.mark.skipif(not pyogrio, reason="pyogrio not installed")
43 FIONA_MARK = pytest.mark.skipif(not fiona, reason="fiona not installed")
44
45
2346 _CRS = "epsg:4326"
2447
2548
49 @pytest.fixture(
50 params=[
51 pytest.param("fiona", marks=FIONA_MARK),
52 pytest.param("pyogrio", marks=PYOGRIO_MARK),
53 ]
54 )
55 def engine(request):
56 return request.param
57
58
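The engine fixture above runs each IO test once per installed backend. The same keyword can be passed directly by users; a minimal sketch, assuming both engines are installed and using hypothetical output paths:

    import geopandas

    path = geopandas.datasets.get_path("nybb")

    # read with either backend explicitly (the engine keyword is new in 0.11)
    df_fiona = geopandas.read_file(path, engine="fiona")
    df_pyogrio = geopandas.read_file(path, engine="pyogrio")

    # the same keyword works on write
    df_pyogrio.to_file("nybb_copy.shp", engine="pyogrio")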
59 def skip_pyogrio_not_supported(engine):
60 if engine == "pyogrio":
61 pytest.skip("not supported for the pyogrio engine")
62
63
2664 @pytest.fixture
27 def df_nybb():
65 def df_nybb(engine):
2866 nybb_path = geopandas.datasets.get_path("nybb")
29 df = read_file(nybb_path)
67 df = read_file(nybb_path, engine=engine)
3068 return df
3169
3270
71109 ]
72110
73111
74 def assert_correct_driver(file_path, ext):
112 def assert_correct_driver(file_path, ext, engine):
75113 # check the expected driver
76114 expected_driver = "ESRI Shapefile" if ext == "" else _EXTENSION_TO_DRIVER[ext]
77 with fiona.open(str(file_path)) as fds:
78 assert fds.driver == expected_driver
115
116 if engine == "fiona":
117 with fiona.open(str(file_path)) as fds:
118 assert fds.driver == expected_driver
119 else:
120 # TODO pyogrio doesn't yet provide a way to check the driver of a file
121 return
79122
80123
81124 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
82 def test_to_file(tmpdir, df_nybb, df_null, driver, ext):
125 def test_to_file(tmpdir, df_nybb, df_null, driver, ext, engine):
83126 """Test to_file and from_file"""
84127 tempfilename = os.path.join(str(tmpdir), "boros." + ext)
85 df_nybb.to_file(tempfilename, driver=driver)
128 df_nybb.to_file(tempfilename, driver=driver, engine=engine)
86129 # Read layer back in
87 df = GeoDataFrame.from_file(tempfilename)
130 df = GeoDataFrame.from_file(tempfilename, engine=engine)
88131 assert "geometry" in df
89132 assert len(df) == 5
90133 assert np.alltrue(df["BoroName"].values == df_nybb["BoroName"])
91134
92135 # Write layer with null geometry out to file
93136 tempfilename = os.path.join(str(tmpdir), "null_geom" + ext)
94 df_null.to_file(tempfilename, driver=driver)
137 df_null.to_file(tempfilename, driver=driver, engine=engine)
95138 # Read layer back in
96 df = GeoDataFrame.from_file(tempfilename)
139 df = GeoDataFrame.from_file(tempfilename, engine=engine)
97140 assert "geometry" in df
98141 assert len(df) == 2
99142 assert np.alltrue(df["Name"].values == df_null["Name"])
100143 # check the expected driver
101 assert_correct_driver(tempfilename, ext)
144 assert_correct_driver(tempfilename, ext, engine)
102145
103146
104147 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
105 def test_to_file_pathlib(tmpdir, df_nybb, df_null, driver, ext):
148 def test_to_file_pathlib(tmpdir, df_nybb, driver, ext, engine):
106149 """Test to_file and from_file"""
107150 temppath = pathlib.Path(os.path.join(str(tmpdir), "boros." + ext))
108 df_nybb.to_file(temppath, driver=driver)
151 df_nybb.to_file(temppath, driver=driver, engine=engine)
109152 # Read layer back in
110 df = GeoDataFrame.from_file(temppath)
153 df = GeoDataFrame.from_file(temppath, engine=engine)
111154 assert "geometry" in df
112155 assert len(df) == 5
113156 assert np.alltrue(df["BoroName"].values == df_nybb["BoroName"])
114157 # check the expected driver
115 assert_correct_driver(temppath, ext)
158 assert_correct_driver(temppath, ext, engine)
116159
117160
118161 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
119 def test_to_file_bool(tmpdir, driver, ext):
162 def test_to_file_bool(tmpdir, driver, ext, engine):
120163 """Test error raise when writing with a boolean column (GH #437)."""
121164 tempfilename = os.path.join(str(tmpdir), "temp.{0}".format(ext))
122165 df = GeoDataFrame(
123166 {
124 "a": [1, 2, 3],
125 "b": [True, False, True],
167 "col": [True, False, True],
126168 "geometry": [Point(0, 0), Point(1, 1), Point(2, 2)],
127169 },
128170 crs=4326,
129171 )
130172
131 df.to_file(tempfilename, driver=driver)
132 result = read_file(tempfilename)
173 df.to_file(tempfilename, driver=driver, engine=engine)
174 result = read_file(tempfilename, engine=engine)
133175 if ext in (".shp", ""):
134176 # Shapefile does not support boolean, so is read back as int
135 df["b"] = df["b"].astype("int64")
177 if engine == "fiona":
178 df["col"] = df["col"].astype("int64")
179 else:
180 df["col"] = df["col"].astype("int32")
136181 assert_geodataframe_equal(result, df)
137182 # check the expected driver
138 assert_correct_driver(tempfilename, ext)
139
140
141 def test_to_file_datetime(tmpdir):
183 assert_correct_driver(tempfilename, ext, engine)
184
185
186 TEST_DATE = datetime.datetime(2021, 11, 21, 1, 7, 43, 17500)
187 eastern = pytz.timezone("US/Eastern")
188
189 datetime_type_tests = (TEST_DATE, eastern.localize(TEST_DATE))
190
191
192 @pytest.mark.parametrize(
193 "time", datetime_type_tests, ids=("naive_datetime", "datetime_with_timezone")
194 )
195 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
196 def test_to_file_datetime(tmpdir, driver, ext, time, engine):
142197 """Test writing a data file with the datetime column type"""
143 tempfilename = os.path.join(str(tmpdir), "test_datetime.gpkg")
198 if engine == "pyogrio" and time.tzinfo is not None:
199 # TODO
200 pytest.skip("pyogrio doesn't yet support timezones")
201 if ext in (".shp", ""):
202 pytest.skip(f"Driver corresponding to ext {ext} doesn't support dt fields")
203 if time.tzinfo is not None and FIONA_GE_1814 is False:
204 # https://github.com/Toblerity/Fiona/pull/915
205 pytest.skip("Fiona >= 1.8.14 needed for timezone support")
206
207 tempfilename = os.path.join(str(tmpdir), f"test_datetime{ext}")
144208 point = Point(0, 0)
145 now = datetime.datetime.now()
146 df = GeoDataFrame({"a": [1, 2], "b": [now, now]}, geometry=[point, point], crs=4326)
147 df.to_file(tempfilename, driver="GPKG")
148 df_read = read_file(tempfilename)
149 assert_geoseries_equal(df.geometry, df_read.geometry)
209
210 df = GeoDataFrame(
211 {"a": [1.0, 2.0], "b": [time, time]}, geometry=[point, point], crs=4326
212 )
213 if FIONA_GE_1814:
214 fiona_precision_limit = "ms"
215 else:
216 fiona_precision_limit = "s"
217 df["b"] = df["b"].dt.round(freq=fiona_precision_limit)
218
219 df.to_file(tempfilename, driver=driver, engine=engine)
220 df_read = read_file(tempfilename, engine=engine)
221
222 assert_geodataframe_equal(df.drop(columns=["b"]), df_read.drop(columns=["b"]))
223 if df["b"].dt.tz is not None:
224 # US/Eastern becomes pytz.FixedOffset(-300) when read from file
225 # so compare fairly in terms of UTC
226 assert_series_equal(
227 df["b"].dt.tz_convert(pytz.utc), df_read["b"].dt.tz_convert(pytz.utc)
228 )
229 else:
230 assert_series_equal(df["b"], df_read["b"])
150231
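As the test above encodes, Fiona preserves sub-second precision only from 1.8.14 on (milliseconds), so values are rounded before comparison. A sketch of the same roundtrip, assuming fiona >= 1.8.14 and a hypothetical GPKG path:

    import datetime

    import geopandas
    from pandas.testing import assert_series_equal
    from shapely.geometry import Point

    stamp = datetime.datetime(2021, 11, 21, 1, 7, 43, 17500)
    df = geopandas.GeoDataFrame(
        {"a": [1.0], "b": [stamp]}, geometry=[Point(0, 0)], crs=4326
    )
    # round to Fiona's millisecond precision so the roundtrip compares equal
    df["b"] = df["b"].dt.round(freq="ms")

    df.to_file("stamps.gpkg", driver="GPKG", engine="fiona")
    back = geopandas.read_file("stamps.gpkg", engine="fiona")
    assert_series_equal(df["b"], back["b"])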
151232
152233 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
153 def test_to_file_with_point_z(tmpdir, ext, driver):
234 def test_to_file_with_point_z(tmpdir, ext, driver, engine):
154235 """Test that 3D geometries are retained in writes (GH #612)."""
155236
156237 tempfilename = os.path.join(str(tmpdir), "test_3Dpoint" + ext)
157238 point3d = Point(0, 0, 500)
158239 point2d = Point(1, 1)
159240 df = GeoDataFrame({"a": [1, 2]}, geometry=[point3d, point2d], crs=_CRS)
160 df.to_file(tempfilename, driver=driver)
161 df_read = GeoDataFrame.from_file(tempfilename)
241 df.to_file(tempfilename, driver=driver, engine=engine)
242 df_read = GeoDataFrame.from_file(tempfilename, engine=engine)
162243 assert_geoseries_equal(df.geometry, df_read.geometry)
163244 # check the expected driver
164 assert_correct_driver(tempfilename, ext)
245 assert_correct_driver(tempfilename, ext, engine)
165246
166247
167248 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
168 def test_to_file_with_poly_z(tmpdir, ext, driver):
249 def test_to_file_with_poly_z(tmpdir, ext, driver, engine):
169250 """Test that 3D geometries are retained in writes (GH #612)."""
170251
171252 tempfilename = os.path.join(str(tmpdir), "test_3Dpoly" + ext)
172253 poly3d = Polygon([[0, 0, 5], [0, 1, 5], [1, 1, 5], [1, 0, 5]])
173254 poly2d = Polygon([[0, 0], [0, 1], [1, 1], [1, 0]])
174255 df = GeoDataFrame({"a": [1, 2]}, geometry=[poly3d, poly2d], crs=_CRS)
175 df.to_file(tempfilename, driver=driver)
176 df_read = GeoDataFrame.from_file(tempfilename)
256 df.to_file(tempfilename, driver=driver, engine=engine)
257 df_read = GeoDataFrame.from_file(tempfilename, engine=engine)
177258 assert_geoseries_equal(df.geometry, df_read.geometry)
178259 # check the expected driver
179 assert_correct_driver(tempfilename, ext)
180
181
182 def test_to_file_types(tmpdir, df_points):
260 assert_correct_driver(tempfilename, ext, engine)
261
262
263 def test_to_file_types(tmpdir, df_points, engine):
183264 """Test various integer type columns (GH#93)"""
184265 tempfilename = os.path.join(str(tmpdir), "int.shp")
185266 int_types = [
199280 for i, dtype in enumerate(int_types)
200281 )
201282 df = GeoDataFrame(data, geometry=geometry)
202 df.to_file(tempfilename)
203
204
205 def test_to_file_int64(tmpdir, df_points):
283 df.to_file(tempfilename, engine=engine)
284
285
286 def test_to_file_int64(tmpdir, df_points, engine):
287 skip_pyogrio_not_supported(engine) # TODO
206288 tempfilename = os.path.join(str(tmpdir), "int64.shp")
207289 geometry = df_points.geometry
208290 df = GeoDataFrame(geometry=geometry)
209291 df["data"] = pd.array([1, np.nan] * 5, dtype=pd.Int64Dtype())
210 df.to_file(tempfilename)
211 df_read = GeoDataFrame.from_file(tempfilename)
292 df.to_file(tempfilename, engine=engine)
293 df_read = GeoDataFrame.from_file(tempfilename, engine=engine)
212294 assert_geodataframe_equal(df_read, df, check_dtype=False, check_like=True)
213295
214296
215 def test_to_file_empty(tmpdir):
216 input_empty_df = GeoDataFrame()
297 def test_to_file_empty(tmpdir, engine):
298 input_empty_df = GeoDataFrame(columns=["geometry"])
217299 tempfilename = os.path.join(str(tmpdir), "test.shp")
218 with pytest.raises(ValueError, match="Cannot write empty DataFrame to file."):
219 input_empty_df.to_file(tempfilename)
300 with pytest.warns(UserWarning):
301 input_empty_df.to_file(tempfilename, engine=engine)
220302
221303
222304 def test_to_file_privacy(tmpdir, df_nybb):
223305 tempfilename = os.path.join(str(tmpdir), "test.shp")
224 with pytest.warns(DeprecationWarning):
306 with pytest.warns(FutureWarning):
225307 geopandas.io.file.to_file(df_nybb, tempfilename)
226308
227309
228 def test_to_file_schema(tmpdir, df_nybb):
310 def test_to_file_schema(tmpdir, df_nybb, engine):
229311 """
230312 Ensure that the file is written according to the schema
231313 if it is specified
242324 )
243325 schema = {"geometry": "Polygon", "properties": properties}
244326
245 # Take the first 2 features to speed things up a bit
246 df_nybb.iloc[:2].to_file(tempfilename, schema=schema)
247
248 with fiona.open(tempfilename) as f:
249 result_schema = f.schema
250
251 assert result_schema == schema
252
253
254 def test_to_file_column_len(tmpdir, df_points):
327 if engine == "pyogrio":
328 with pytest.raises(ValueError):
329 df_nybb.iloc[:2].to_file(tempfilename, schema=schema, engine=engine)
330 else:
331 # Take the first 2 features to speed things up a bit
332 df_nybb.iloc[:2].to_file(tempfilename, schema=schema, engine=engine)
333
334 import fiona
335
336 with fiona.open(tempfilename) as f:
337 result_schema = f.schema
338
339 assert result_schema == schema
340
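When a fiona schema is given, writing follows it instead of inferring field types from the data (pyogrio rejects the keyword, as tested above). A minimal sketch of a hand-written schema, assuming the fiona engine and a hypothetical output path:

    import geopandas
    from shapely.geometry import Point

    df = geopandas.GeoDataFrame(
        {"name": ["a"], "value": [1], "geometry": [Point(0, 0)]}, crs=4326
    )
    # field names and types are taken from the schema, not inferred
    schema = {
        "geometry": "Point",
        "properties": {"name": "str", "value": "int"},
    }
    df.to_file("with_schema.shp", schema=schema, engine="fiona")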
341
342 def test_to_file_crs(tmpdir, engine):
343 """
344 Ensure that the file is written according to the crs
345 if it is specified
346 """
347 df = read_file(geopandas.datasets.get_path("nybb"), engine=engine)
348 tempfilename = os.path.join(str(tmpdir), "crs.shp")
349
350 # save correct CRS
351 df.to_file(tempfilename, engine=engine)
352 result = GeoDataFrame.from_file(tempfilename, engine=engine)
353 assert result.crs == df.crs
354
355 if engine == "pyogrio":
356 with pytest.raises(ValueError, match="Passing 'crs' it not supported"):
357 df.to_file(tempfilename, crs=3857, engine=engine)
358 return
359
360 # overwrite CRS
361 df.to_file(tempfilename, crs=3857, engine=engine)
362 result = GeoDataFrame.from_file(tempfilename, engine=engine)
363 assert result.crs == "epsg:3857"
364
365 # specify CRS for gdf without one
366 df2 = df.copy()
367 df2.crs = None
368 df2.to_file(tempfilename, crs=2263, engine=engine)
369 df = GeoDataFrame.from_file(tempfilename, engine=engine)
370 assert df.crs == "epsg:2263"
371
372
373 def test_to_file_column_len(tmpdir, df_points, engine):
255374 """
256375 Ensure that a warning about truncation is given when a geodataframe with
257376 column names longer than 10 characters is saved to shapefile
264383 with pytest.warns(
265384 UserWarning, match="Column names longer than 10 characters will be truncated"
266385 ):
267 df.to_file(tempfilename, driver="ESRI Shapefile")
386 df.to_file(tempfilename, driver="ESRI Shapefile", engine=engine)
387
388
389 def test_to_file_with_duplicate_columns(tmpdir, engine):
390 df = GeoDataFrame(data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)])
391 tempfilename = os.path.join(str(tmpdir), "duplicate.shp")
392 with pytest.raises(
393 ValueError, match="GeoDataFrame cannot contain duplicated column names."
394 ):
395 df.to_file(tempfilename, engine=engine)
268396
269397
270398 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
271 def test_append_file(tmpdir, df_nybb, df_null, driver, ext):
399 def test_append_file(tmpdir, df_nybb, df_null, driver, ext, engine):
272400 """Test to_file with append mode and from_file"""
401 skip_pyogrio_not_supported(engine)
273402 from fiona import supported_drivers
274403
275404 tempfilename = os.path.join(str(tmpdir), "boros" + ext)
277406 if "a" not in supported_drivers[driver]:
278407 return None
279408
280 df_nybb.to_file(tempfilename, driver=driver)
281 df_nybb.to_file(tempfilename, mode="a", driver=driver)
409 df_nybb.to_file(tempfilename, driver=driver, engine=engine)
410 df_nybb.to_file(tempfilename, mode="a", driver=driver, engine=engine)
282411 # Read layer back in
283 df = GeoDataFrame.from_file(tempfilename)
412 df = GeoDataFrame.from_file(tempfilename, engine=engine)
284413 assert "geometry" in df
285414 assert len(df) == (5 * 2)
286415 expected = pd.concat([df_nybb] * 2, ignore_index=True)
288417
289418 # Write layer with null geometry out to file
290419 tempfilename = os.path.join(str(tmpdir), "null_geom" + ext)
291 df_null.to_file(tempfilename, driver=driver)
292 df_null.to_file(tempfilename, mode="a", driver=driver)
420 df_null.to_file(tempfilename, driver=driver, engine=engine)
421 df_null.to_file(tempfilename, mode="a", driver=driver, engine=engine)
293422 # Read layer back in
294 df = GeoDataFrame.from_file(tempfilename)
423 df = GeoDataFrame.from_file(tempfilename, engine=engine)
295424 assert "geometry" in df
296425 assert len(df) == (2 * 2)
297426 expected = pd.concat([df_null] * 2, ignore_index=True)
299428
300429
301430 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
302 def test_empty_crs(tmpdir, driver, ext):
431 def test_empty_crs(tmpdir, driver, ext, engine):
303432 """Test handling of undefined CRS with GPKG driver (GH #1975)."""
304433 if ext == ".gpkg":
305434 pytest.xfail("GPKG is read with Undefined geographic SRS.")
307436 tempfilename = os.path.join(str(tmpdir), "boros" + ext)
308437 df = GeoDataFrame(
309438 {
310 "a": [1, 2, 3],
439 "a": [1.0, 2.0, 3.0],
311440 "geometry": [Point(0, 0), Point(1, 1), Point(2, 2)],
312441 },
313442 )
314443
315 df.to_file(tempfilename, driver=driver)
316 result = read_file(tempfilename)
444 df.to_file(tempfilename, driver=driver, engine=engine)
445 result = read_file(tempfilename, engine=engine)
317446
318447 if ext == ".geojson":
319448 # geojson by default assumes epsg:4326
327456 # -----------------------------------------------------------------------------
328457
329458
330 with fiona.open(geopandas.datasets.get_path("nybb")) as f:
331 CRS = f.crs["init"] if "init" in f.crs else f.crs_wkt
332 NYBB_COLUMNS = list(f.meta["schema"]["properties"].keys())
333
334
335 def test_read_file(df_nybb):
336 df = df_nybb.rename(columns=lambda x: x.lower())
459 NYBB_CRS = "epsg:2263"
460
461
462 def test_read_file(engine):
463 df = read_file(geopandas.datasets.get_path("nybb"), engine=engine)
337464 validate_boro_df(df)
338 assert df.crs == CRS
339 # get lower case columns, and exclude geometry column from comparison
340 lower_columns = [c.lower() for c in NYBB_COLUMNS]
341 assert (df.columns[:-1] == lower_columns).all()
465 assert df.crs == NYBB_CRS
466 expected_columns = ["BoroCode", "BoroName", "Shape_Leng", "Shape_Area"]
467 assert (df.columns[:-1] == expected_columns).all()
342468
343469
344470 @pytest.mark.web
345 def test_read_file_remote_geojson_url():
471 def test_read_file_remote_geojson_url(engine):
346472 url = (
347473 "https://raw.githubusercontent.com/geopandas/geopandas/"
348 "master/geopandas/tests/data/null_geom.geojson"
349 )
350 gdf = read_file(url)
474 "main/geopandas/tests/data/null_geom.geojson"
475 )
476 gdf = read_file(url, engine=engine)
351477 assert isinstance(gdf, geopandas.GeoDataFrame)
352478
353479
354480 @pytest.mark.web
355 def test_read_file_remote_zipfile_url():
481 def test_read_file_remote_zipfile_url(engine):
356482 url = (
357483 "https://raw.githubusercontent.com/geopandas/geopandas/"
358 "master/geopandas/datasets/nybb_16a.zip"
359 )
360 gdf = read_file(url)
484 "main/geopandas/datasets/nybb_16a.zip"
485 )
486 gdf = read_file(url, engine=engine)
361487 assert isinstance(gdf, geopandas.GeoDataFrame)
362488
363489
364 def test_read_file_textio(file_path):
490 def test_read_file_textio(file_path, engine):
365491 file_text_stream = open(file_path)
366492 file_stringio = io.StringIO(open(file_path).read())
367 gdf_text_stream = read_file(file_text_stream)
368 gdf_stringio = read_file(file_stringio)
493 gdf_text_stream = read_file(file_text_stream, engine=engine)
494 gdf_stringio = read_file(file_stringio, engine=engine)
369495 assert isinstance(gdf_text_stream, geopandas.GeoDataFrame)
370496 assert isinstance(gdf_stringio, geopandas.GeoDataFrame)
371497
372498
373 def test_read_file_bytesio(file_path):
499 def test_read_file_bytesio(file_path, engine):
374500 file_binary_stream = open(file_path, "rb")
375501 file_bytesio = io.BytesIO(open(file_path, "rb").read())
376 gdf_binary_stream = read_file(file_binary_stream)
377 gdf_bytesio = read_file(file_bytesio)
502 gdf_binary_stream = read_file(file_binary_stream, engine=engine)
503 gdf_bytesio = read_file(file_bytesio, engine=engine)
378504 assert isinstance(gdf_binary_stream, geopandas.GeoDataFrame)
379505 assert isinstance(gdf_bytesio, geopandas.GeoDataFrame)
380506
381507
382 def test_read_file_raw_stream(file_path):
508 def test_read_file_raw_stream(file_path, engine):
383509 file_raw_stream = open(file_path, "rb", buffering=0)
384 gdf_raw_stream = read_file(file_raw_stream)
510 gdf_raw_stream = read_file(file_raw_stream, engine=engine)
385511 assert isinstance(gdf_raw_stream, geopandas.GeoDataFrame)
386512
387513
388 def test_read_file_pathlib(file_path):
514 def test_read_file_pathlib(file_path, engine):
389515 path_object = pathlib.Path(file_path)
390 gdf_path_object = read_file(path_object)
516 gdf_path_object = read_file(path_object, engine=engine)
391517 assert isinstance(gdf_path_object, geopandas.GeoDataFrame)
392518
393519
394 def test_read_file_tempfile():
520 def test_read_file_tempfile(engine):
395521 temp = tempfile.TemporaryFile()
396522 temp.write(
397523 b"""
408534 """
409535 )
410536 temp.seek(0)
411 gdf_tempfile = geopandas.read_file(temp)
537 gdf_tempfile = geopandas.read_file(temp, engine=engine)
412538 assert isinstance(gdf_tempfile, geopandas.GeoDataFrame)
413539 temp.close()
414540
415541
416 def test_read_binary_file_fsspec():
542 def test_read_binary_file_fsspec(engine):
417543 fsspec = pytest.importorskip("fsspec")
418544 # Remove the zip scheme so fsspec doesn't open as a zipped file,
419545 # instead we want to read as bytes and let the IO engine decode it.
420546 path = geopandas.datasets.get_path("nybb")[6:]
421547 with fsspec.open(path, "rb") as f:
422 gdf = read_file(f)
548 gdf = read_file(f, engine=engine)
423549 assert isinstance(gdf, geopandas.GeoDataFrame)
424550
425551
426 def test_read_text_file_fsspec(file_path):
552 def test_read_text_file_fsspec(file_path, engine):
427553 fsspec = pytest.importorskip("fsspec")
428554 with fsspec.open(file_path, "r") as f:
429 gdf = read_file(f)
555 gdf = read_file(f, engine=engine)
430556 assert isinstance(gdf, geopandas.GeoDataFrame)
431557
432558
433 def test_infer_zipped_file():
559 def test_infer_zipped_file(engine):
434560 # Remove the zip scheme so that the test for a zipped file can
435561 # check it and add it back.
436562 path = geopandas.datasets.get_path("nybb")[6:]
437 gdf = read_file(path)
563 gdf = read_file(path, engine=engine)
438564 assert isinstance(gdf, geopandas.GeoDataFrame)
439565
440566 # Check that it can successfully add a zip scheme to a path that already has a
441567 # scheme
442 gdf = read_file("file+file://" + path)
568 gdf = read_file("file+file://" + path, engine=engine)
443569 assert isinstance(gdf, geopandas.GeoDataFrame)
444570
445571 # Check that it can add a zip scheme for a path that includes a subpath
446572 # within the archive.
447 gdf = read_file(path + "!nybb.shp")
573 gdf = read_file(path + "!nybb.shp", engine=engine)
448574 assert isinstance(gdf, geopandas.GeoDataFrame)
449575
450576
451 def test_allow_legacy_gdal_path():
577 def test_allow_legacy_gdal_path(engine):
452578 # Construct a GDAL-style zip path.
453579 path = "/vsizip/" + geopandas.datasets.get_path("nybb")[6:]
454 gdf = read_file(path)
580 gdf = read_file(path, engine=engine)
455581 assert isinstance(gdf, geopandas.GeoDataFrame)
456582
457583
458 def test_read_file_filtered__bbox(df_nybb):
584 def test_read_file_filtered__bbox(df_nybb, engine):
459585 nybb_filename = geopandas.datasets.get_path("nybb")
460586 bbox = (
461587 1031051.7879884212,
463589 1047224.3104931959,
464590 244317.30894023244,
465591 )
466 filtered_df = read_file(nybb_filename, bbox=bbox)
592 filtered_df = read_file(nybb_filename, bbox=bbox, engine=engine)
467593 expected = df_nybb[df_nybb["BoroName"].isin(["Bronx", "Queens"])]
468594 assert_geodataframe_equal(filtered_df, expected.reset_index(drop=True))
469595
470596
471 def test_read_file_filtered__bbox__polygon(df_nybb):
597 def test_read_file_filtered__bbox__polygon(df_nybb, engine):
472598 nybb_filename = geopandas.datasets.get_path("nybb")
473599 bbox = box(
474600 1031051.7879884212, 224272.49231459625, 1047224.3104931959, 244317.30894023244
475601 )
476 filtered_df = read_file(nybb_filename, bbox=bbox)
602 filtered_df = read_file(nybb_filename, bbox=bbox, engine=engine)
477603 expected = df_nybb[df_nybb["BoroName"].isin(["Bronx", "Queens"])]
478604 assert_geodataframe_equal(filtered_df, expected.reset_index(drop=True))
479605
480606
481 def test_read_file_filtered__rows(df_nybb):
607 def test_read_file_filtered__rows(df_nybb, engine):
482608 nybb_filename = geopandas.datasets.get_path("nybb")
483 filtered_df = read_file(nybb_filename, rows=1)
609 filtered_df = read_file(nybb_filename, rows=1, engine=engine)
484610 assert_geodataframe_equal(filtered_df, df_nybb.iloc[[0], :])
485611
486612
487 def test_read_file_filtered__rows_slice(df_nybb):
613 def test_read_file_filtered__rows_slice(df_nybb, engine):
488614 nybb_filename = geopandas.datasets.get_path("nybb")
489 filtered_df = read_file(nybb_filename, rows=slice(1, 3))
615 filtered_df = read_file(nybb_filename, rows=slice(1, 3), engine=engine)
490616 assert_geodataframe_equal(filtered_df, df_nybb.iloc[1:3, :].reset_index(drop=True))
491617
492618
493619 @pytest.mark.filterwarnings(
494620 "ignore:Layer does not support OLC_FASTFEATURECOUNT:RuntimeWarning"
495621 ) # for the slice with -1
496 def test_read_file_filtered__rows_bbox(df_nybb):
622 def test_read_file_filtered__rows_bbox(df_nybb, engine):
497623 nybb_filename = geopandas.datasets.get_path("nybb")
498624 bbox = (
499625 1031051.7879884212,
501627 1047224.3104931959,
502628 244317.30894023244,
503629 )
504 # combination bbox and rows (rows slice applied after bbox filtering!)
505 filtered_df = read_file(nybb_filename, bbox=bbox, rows=slice(4, None))
506 assert filtered_df.empty
507 filtered_df = read_file(nybb_filename, bbox=bbox, rows=slice(-1, None))
508 assert_geodataframe_equal(filtered_df, df_nybb.iloc[4:, :].reset_index(drop=True))
509
510
511 def test_read_file_filtered_rows_invalid():
630 if engine == "pyogrio":
631 with pytest.raises(ValueError, match="'skip_features' must be between 0 and 1"):
632 # combination bbox and rows (rows slice applied after bbox filtering!)
633 filtered_df = read_file(
634 nybb_filename, bbox=bbox, rows=slice(4, None), engine=engine
635 )
636 else: # fiona
637 # combination bbox and rows (rows slice applied after bbox filtering!)
638 filtered_df = read_file(
639 nybb_filename, bbox=bbox, rows=slice(4, None), engine=engine
640 )
641 assert filtered_df.empty
642
643 if engine == "pyogrio":
644 # TODO: support negative rows in pyogrio
645 with pytest.raises(ValueError, match="'skip_features' must be between 0 and 1"):
646 filtered_df = read_file(
647 nybb_filename, bbox=bbox, rows=slice(-1, None), engine=engine
648 )
649 else:
650 filtered_df = read_file(
651 nybb_filename, bbox=bbox, rows=slice(-1, None), engine=engine
652 )
653 filtered_df["BoroCode"] = filtered_df["BoroCode"].astype("int64")
654 assert_geodataframe_equal(
655 filtered_df, df_nybb.iloc[4:, :].reset_index(drop=True)
656 )
657
658
659 def test_read_file_filtered_rows_invalid(engine):
512660 with pytest.raises(TypeError):
513 read_file(geopandas.datasets.get_path("nybb"), rows="not_a_slice")
514
515
516 def test_read_file__ignore_geometry():
661 read_file(
662 geopandas.datasets.get_path("nybb"), rows="not_a_slice", engine=engine
663 )
664
665
666 def test_read_file__ignore_geometry(engine):
517667 pdf = geopandas.read_file(
518 geopandas.datasets.get_path("naturalearth_lowres"), ignore_geometry=True
668 geopandas.datasets.get_path("naturalearth_lowres"),
669 ignore_geometry=True,
670 engine=engine,
519671 )
520672 assert "geometry" not in pdf.columns
521673 assert isinstance(pdf, pd.DataFrame) and not isinstance(pdf, geopandas.GeoDataFrame)
522674
523675
524 def test_read_file__ignore_all_fields():
676 def test_read_file__ignore_all_fields(engine):
677 skip_pyogrio_not_supported(engine) # pyogrio has "columns" keyword instead
525678 gdf = geopandas.read_file(
526679 geopandas.datasets.get_path("naturalearth_lowres"),
527680 ignore_fields=["pop_est", "continent", "name", "iso_a3", "gdp_md_est"],
681 engine="fiona",
528682 )
529683 assert gdf.columns.tolist() == ["geometry"]
530684
531685
532 def test_read_file_filtered_with_gdf_boundary(df_nybb):
686 @PYOGRIO_MARK
687 def test_read_file__columns():
688 # TODO: this is only supported for pyogrio, but we could mimic it for fiona as well
689 gdf = geopandas.read_file(
690 geopandas.datasets.get_path("naturalearth_lowres"),
691 columns=["name", "pop_est"],
692 engine="pyogrio",
693 )
694 assert gdf.columns.tolist() == ["name", "pop_est", "geometry"]
695
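The two tests above exercise the backend-specific ways of dropping attribute columns on read. A side-by-side sketch, assuming both engines are installed:

    import geopandas

    path = geopandas.datasets.get_path("naturalearth_lowres")

    # fiona: list the fields to skip
    gdf = geopandas.read_file(path, ignore_fields=["pop_est", "gdp_md_est"], engine="fiona")

    # pyogrio: list the fields to keep
    gdf = geopandas.read_file(path, columns=["name", "continent"], engine="pyogrio")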
696
697 def test_read_file_filtered_with_gdf_boundary(df_nybb, engine):
533698 full_df_shape = df_nybb.shape
534699 nybb_filename = geopandas.datasets.get_path("nybb")
535700 bbox = geopandas.GeoDataFrame(
541706 244317.30894023244,
542707 )
543708 ],
544 crs=CRS,
545 )
546 filtered_df = read_file(nybb_filename, bbox=bbox)
709 crs=NYBB_CRS,
710 )
711 filtered_df = read_file(nybb_filename, bbox=bbox, engine=engine)
547712 filtered_df_shape = filtered_df.shape
548713 assert full_df_shape != filtered_df_shape
549714 assert filtered_df_shape == (2, 5)
550715
551716
552 def test_read_file_filtered_with_gdf_boundary__mask(df_nybb):
717 def test_read_file_filtered_with_gdf_boundary__mask(df_nybb, engine):
718 skip_pyogrio_not_supported(engine)
553719 gdf_mask = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
554720 gdf = geopandas.read_file(
555721 geopandas.datasets.get_path("naturalearth_cities"),
556722 mask=gdf_mask[gdf_mask.continent == "Africa"],
723 engine=engine,
557724 )
558725 filtered_df_shape = gdf.shape
559726 assert filtered_df_shape == (50, 2)
560727
561728
562 def test_read_file_filtered_with_gdf_boundary__mask__polygon(df_nybb):
729 def test_read_file_filtered_with_gdf_boundary__mask__polygon(df_nybb, engine):
730 skip_pyogrio_not_supported(engine)
563731 full_df_shape = df_nybb.shape
564732 nybb_filename = geopandas.datasets.get_path("nybb")
565733 mask = box(
566734 1031051.7879884212, 224272.49231459625, 1047224.3104931959, 244317.30894023244
567735 )
568 filtered_df = read_file(nybb_filename, mask=mask)
736 filtered_df = read_file(nybb_filename, mask=mask, engine=engine)
569737 filtered_df_shape = filtered_df.shape
570738 assert full_df_shape != filtered_df_shape
571739 assert filtered_df_shape == (2, 5)
572740
573741
574 def test_read_file_filtered_with_gdf_boundary_mismatched_crs(df_nybb):
742 def test_read_file_filtered_with_gdf_boundary_mismatched_crs(df_nybb, engine):
743 skip_pyogrio_not_supported(engine)
575744 full_df_shape = df_nybb.shape
576745 nybb_filename = geopandas.datasets.get_path("nybb")
577746 bbox = geopandas.GeoDataFrame(
583752 244317.30894023244,
584753 )
585754 ],
586 crs=CRS,
755 crs=NYBB_CRS,
587756 )
588757 bbox.to_crs(epsg=4326, inplace=True)
589 filtered_df = read_file(nybb_filename, bbox=bbox)
758 filtered_df = read_file(nybb_filename, bbox=bbox, engine=engine)
590759 filtered_df_shape = filtered_df.shape
591760 assert full_df_shape != filtered_df_shape
592761 assert filtered_df_shape == (2, 5)
593762
594763
595 def test_read_file_filtered_with_gdf_boundary_mismatched_crs__mask(df_nybb):
764 def test_read_file_filtered_with_gdf_boundary_mismatched_crs__mask(df_nybb, engine):
765 skip_pyogrio_not_supported(engine)
596766 full_df_shape = df_nybb.shape
597767 nybb_filename = geopandas.datasets.get_path("nybb")
598768 mask = geopandas.GeoDataFrame(
604774 244317.30894023244,
605775 )
606776 ],
607 crs=CRS,
777 crs=NYBB_CRS,
608778 )
609779 mask.to_crs(epsg=4326, inplace=True)
610 filtered_df = read_file(nybb_filename, mask=mask.geometry)
780 filtered_df = read_file(nybb_filename, mask=mask.geometry, engine=engine)
611781 filtered_df_shape = filtered_df.shape
612782 assert full_df_shape != filtered_df_shape
613783 assert filtered_df_shape == (2, 5)
614784
615785
616 def test_read_file_empty_shapefile(tmpdir):
786 @pytest.mark.filterwarnings(
787 "ignore:Layer 'b'test_empty'' does not have any features:UserWarning"
788 )
789 def test_read_file_empty_shapefile(tmpdir, engine):
790 if engine == "pyogrio" and not fiona:
791 pytest.skip("test requires fiona to work")
792 from geopandas.io.file import fiona_env
617793
618794 # create empty shapefile
619795 meta = {
632808 with fiona.open(fname, "w", **meta) as _: # noqa
633809 pass
634810
635 empty = read_file(fname)
811 empty = read_file(fname, engine=engine)
636812 assert isinstance(empty, geopandas.GeoDataFrame)
637813 assert all(empty.columns == ["A", "Z", "geometry"])
638814
639815
640816 def test_read_file_privacy(tmpdir, df_nybb):
641 with pytest.warns(DeprecationWarning):
817 with pytest.warns(FutureWarning):
642818 geopandas.io.file.read_file(geopandas.datasets.get_path("nybb"))
643819
644820
661837 @pytest.mark.parametrize(
662838 "driver,ext", [("ESRI Shapefile", "shp"), ("GeoJSON", "geojson")]
663839 )
664 def test_write_index_to_file(tmpdir, df_points, driver, ext):
840 def test_write_index_to_file(tmpdir, df_points, driver, ext, engine):
665841 fngen = FileNumber(tmpdir, "check", ext)
666842
667843 def do_checks(df, index_is_used):
690866
691867 # check GeoDataFrame with default index=None to autodetect
692868 tempfilename = next(fngen)
693 df.to_file(tempfilename, driver=driver, index=None)
694 df_check = read_file(tempfilename)
869 df.to_file(tempfilename, driver=driver, index=None, engine=engine)
870 df_check = read_file(tempfilename, engine=engine)
695871 if len(other_cols) == 0:
696872 expected_cols = driver_col[:]
697873 else:
703879
704880 # similar check on GeoSeries with index=None
705881 tempfilename = next(fngen)
706 df.geometry.to_file(tempfilename, driver=driver, index=None)
707 df_check = read_file(tempfilename)
882 df.geometry.to_file(tempfilename, driver=driver, index=None, engine=engine)
883 df_check = read_file(tempfilename, engine=engine)
708884 if index_is_used:
709885 expected_cols = index_cols + ["geometry"]
710886 else:
713889
714890 # check GeoDataFrame with index=True
715891 tempfilename = next(fngen)
716 df.to_file(tempfilename, driver=driver, index=True)
717 df_check = read_file(tempfilename)
892 df.to_file(tempfilename, driver=driver, index=True, engine=engine)
893 df_check = read_file(tempfilename, engine=engine)
718894 assert list(df_check.columns) == index_cols + other_cols + ["geometry"]
719895
720896 # similar check on GeoSeries with index=True
721897 tempfilename = next(fngen)
722 df.geometry.to_file(tempfilename, driver=driver, index=True)
723 df_check = read_file(tempfilename)
898 df.geometry.to_file(tempfilename, driver=driver, index=True, engine=engine)
899 df_check = read_file(tempfilename, engine=engine)
724900 assert list(df_check.columns) == index_cols + ["geometry"]
725901
726902 # check GeoDataFrame with index=False
727903 tempfilename = next(fngen)
728 df.to_file(tempfilename, driver=driver, index=False)
729 df_check = read_file(tempfilename)
904 df.to_file(tempfilename, driver=driver, index=False, engine=engine)
905 df_check = read_file(tempfilename, engine=engine)
730906 if len(other_cols) == 0:
731907 expected_cols = driver_col + ["geometry"]
732908 else:
735911
736912 # similar check on GeoSeries with index=False
737913 tempfilename = next(fngen)
738 df.geometry.to_file(tempfilename, driver=driver, index=False)
739 df_check = read_file(tempfilename)
914 df.geometry.to_file(tempfilename, driver=driver, index=False, engine=engine)
915 df_check = read_file(tempfilename, engine=engine)
740916 assert list(df_check.columns) == driver_col + ["geometry"]
741917
742918 return
8411017 @pytest.mark.parametrize(
8421018 "test_file", [(pathlib.Path("~/test_file.geojson")), "~/test_file.geojson"]
8431019 )
844 def test_write_read_file(test_file):
1020 def test_write_read_file(test_file, engine):
8451021 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs=_CRS)
8461022 gdf.to_file(test_file, driver="GeoJSON")
847 df_json = geopandas.read_file(test_file)
1023 df_json = geopandas.read_file(test_file, engine=engine)
8481024 assert_geodataframe_equal(gdf, df_json, check_crs=True)
8491025 os.remove(os.path.expanduser(test_file))
1313
1414 from geopandas.testing import assert_geodataframe_equal
1515 import pytest
16
17 from .test_file import FIONA_MARK, PYOGRIO_MARK
18
1619
1720 # Credit: Polygons below come from Montreal city Open Data portal
1821 # http://donnees.ville.montreal.qc.ca/dataset/unites-evaluation-fonciere
245248 return request.param
246249
247250
248 def test_to_file_roundtrip(tmpdir, geodataframe, ogr_driver):
251 @pytest.fixture(
252 params=[
253 pytest.param("fiona", marks=FIONA_MARK),
254 pytest.param("pyogrio", marks=PYOGRIO_MARK),
255 ]
256 )
257 def engine(request):
258 return request.param
259
260
261 def test_to_file_roundtrip(tmpdir, geodataframe, ogr_driver, engine):
249262 output_file = os.path.join(str(tmpdir), "output_file")
250263
251264 expected_error = _expected_error_on(geodataframe, ogr_driver)
252265 if expected_error:
253 with pytest.raises(RuntimeError, match="Failed to write record"):
254 geodataframe.to_file(output_file, driver=ogr_driver)
266 with pytest.raises(
267 RuntimeError, match="Failed to write record|Could not add feature to layer"
268 ):
269 geodataframe.to_file(output_file, driver=ogr_driver, engine=engine)
255270 else:
256 geodataframe.to_file(output_file, driver=ogr_driver)
257
258 reloaded = geopandas.read_file(output_file)
271 geodataframe.to_file(output_file, driver=ogr_driver, engine=engine)
272
273 reloaded = geopandas.read_file(output_file, engine=engine)
274
275 if ogr_driver == "GeoJSON" and engine == "pyogrio":
276 # For GeoJSON files, the int64 column comes back as int32
277 reloaded["a"] = reloaded["a"].astype("int64")
259278
260279 assert_geodataframe_equal(geodataframe, reloaded, check_column_type="equiv")
11 See generate_legacy_storage_files.py for the creation of the legacy files.
22
33 """
4 from distutils.version import LooseVersion
4 from contextlib import contextmanager
5 from packaging.version import Version
56 import glob
67 import os
78 import pathlib
3536 return request.param
3637
3738
38 @pytest.fixture
39 def with_use_pygeos_false():
39 @contextmanager
40 def with_use_pygeos(option):
4041 orig = geopandas.options.use_pygeos
41 geopandas.options.use_pygeos = not orig
42 yield
43 geopandas.options.use_pygeos = orig
42 geopandas.options.use_pygeos = option
43 try:
44 yield
45 finally:
46 geopandas.options.use_pygeos = orig
4447
4548
4649 @pytest.mark.skipif(
47 compat.USE_PYGEOS or (str(pyproj.__version__) < LooseVersion("2.4")),
50 compat.USE_PYGEOS or (Version(pyproj.__version__) < Version("2.4")),
4851 reason=(
4952 "pygeos-based unpickling currently only works for pygeos-written files; "
5053 "old pyproj versions can't read pickles from newer pyproj versions"
6972 assert isinstance(result.has_sindex, bool)
7073
7174
72 @pytest.mark.skipif(not compat.HAS_PYGEOS, reason="requires pygeos to test #1745")
73 def test_pygeos_switch(tmpdir, with_use_pygeos_false):
74 gdf_crs = geopandas.GeoDataFrame(
75 def _create_gdf():
76 return geopandas.GeoDataFrame(
7577 {"a": [0.1, 0.2, 0.3], "geometry": [Point(1, 1), Point(2, 2), Point(3, 3)]},
7678 crs="EPSG:4326",
7779 )
78 path = str(tmpdir / "gdf_crs.pickle")
79 gdf_crs.to_pickle(path)
80 result = pd.read_pickle(path)
81 assert_geodataframe_equal(result, gdf_crs)
80
81
82 @pytest.mark.skipif(not compat.HAS_PYGEOS, reason="requires pygeos to test #1745")
83 def test_pygeos_switch(tmpdir):
84 # writing and reading with pygeos disabled
85 with with_use_pygeos(False):
86 gdf = _create_gdf()
87 path = str(tmpdir / "gdf_crs1.pickle")
88 gdf.to_pickle(path)
89 result = pd.read_pickle(path)
90 assert_geodataframe_equal(result, gdf)
91
92 # writing without pygeos, reading with pygeos
93 with with_use_pygeos(False):
94 gdf = _create_gdf()
95 path = str(tmpdir / "gdf_crs1.pickle")
96 gdf.to_pickle(path)
97
98 with with_use_pygeos(True):
99 result = pd.read_pickle(path)
100 gdf = _create_gdf()
101 assert_geodataframe_equal(result, gdf)
102
103 # writing with pygeos, reading without pygeos
104 with with_use_pygeos(True):
105 gdf = _create_gdf()
106 path = str(tmpdir / "gdf_crs1.pickle")
107 gdf.to_pickle(path)
108
109 with with_use_pygeos(False):
110 result = pd.read_pickle(path)
111 gdf = _create_gdf()
112 assert_geodataframe_equal(result, gdf)
2525 @pytest.fixture()
2626 def connection_postgis():
2727 """
28 Initiaties a connection to a postGIS database that must already exist.
28 Initiates a connection to a postGIS database that must already exist.
2929 See create_postgis for more information.
3030 """
3131 psycopg2 = pytest.importorskip("psycopg2")
5050 @pytest.fixture()
5151 def engine_postgis():
5252 """
53 Initiaties a connection engine to a postGIS database that must already exist.
53 Initiates a connection engine to a postGIS database that must already exist.
5454 """
5555 sqlalchemy = pytest.importorskip("sqlalchemy")
5656 from sqlalchemy.engine.url import URL
325325 create_postgis(con, df_nybb)
326326
327327 sql = "SELECT * FROM nybb;"
328 with pytest.warns(DeprecationWarning):
328 with pytest.warns(FutureWarning):
329329 geopandas.io.sql.read_postgis(sql, con)
330330
331331 def test_write_postgis_default(self, engine_postgis, df_nybb):
457457 ).fetchone()[0]
458458 assert target_srid == 0, "SRID should be 0, found %s" % target_srid
459459
460 def test_write_postgis_with_esri_authority(self, engine_postgis, df_nybb):
461 """
462 Tests that GeoDataFrame can be written to PostGIS with ESRI Authority
463 CRS information (GH #2414).
464 """
465 engine = engine_postgis
466
467 table = "nybb"
468
469 # Write to db
470 df_nybb_esri = df_nybb.to_crs("ESRI:102003")
471 write_postgis(df_nybb_esri, con=engine, name=table, if_exists="replace")
472 # Validate that srid is 102003
473 target_srid = engine.execute(
474 "SELECT Find_SRID('{schema}', '{table}', '{geom_col}');".format(
475 schema="public", table=table, geom_col="geometry"
476 )
477 ).fetchone()[0]
478 assert target_srid == 102003, "SRID should be 102003, found %s" % target_srid
479
460480 def test_write_postgis_geometry_collection(
461481 self, engine_postgis, df_geom_collection
462482 ):
55
66 import geopandas
77
8 from distutils.version import LooseVersion
8 from packaging.version import Version
99
1010 from ._decorator import doc
1111
1212
13 def deprecated(new):
13 def deprecated(new, warning_type=FutureWarning):
1414 """Helper to provide deprecation warning."""
1515
1616 def old(*args, **kwargs):
1717 warnings.warn(
1818 "{} is intended for internal ".format(new.__name__[1:])
1919 + "use only, and will be deprecated.",
20 DeprecationWarning,
20 warning_type,
2121 stacklevel=2,
2222 )
2323 new(*args, **kwargs)
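A minimal sketch of how this helper is used: wrapping a private function so the public alias emits the warning (FutureWarning by default, matching the updated tests) and forwards the call. The function names here are hypothetical:

    import warnings

    def _do_work(path):
        print("working on", path)

    # public alias warns, then forwards to the private implementation
    do_work = deprecated(_do_work)

    with warnings.catch_warnings(record=True) as w:
        warnings.simplefilter("always")
        do_work("example.shp")
        assert issubclass(w[-1].category, FutureWarning)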
6969 from matplotlib.colors import is_color_like
7070 from typing import Iterable
7171
72 mpl = matplotlib.__version__
73 if mpl >= LooseVersion("3.4") or (mpl > LooseVersion("3.3.2") and "+" in mpl):
72 mpl = Version(matplotlib.__version__)
73 if mpl >= Version("3.4"):
7474 # alpha is supported as array argument with matplotlib 3.4+
7575 scalar_kwargs = ["marker", "path_effects"]
7676 else:
324324 ----------
325325 s : Series
326326 The GeoSeries to be plotted. Currently Polygon,
327 MultiPolygon, LineString, MultiLineString and Point
327 MultiPolygon, LineString, MultiLineString, Point and MultiPoint
328328 geometries can be plotted.
329329 cmap : str (default None)
330330 The name of a colormap recognized by matplotlib. Any
334334
335335 tab10, tab20, Accent, Dark2, Paired, Pastel1, Set1, Set2
336336
337 color : str (default None)
337 color : str, np.array, pd.Series, List (default None)
338338 If specified, all objects will be colored uniformly.
339339 ax : matplotlib.pyplot.Artist (default None)
340340 axes on which to draw the plot
413413 )
414414 return ax
415415
416 # have colors been given for all geometries?
417 color_given = pd.api.types.is_list_like(color) and len(color) == len(s)
418
416419 # if cmap is specified, create range of colors based on cmap
417420 values = None
418421 if cmap is not None:
425428 # decompose GeometryCollections
426429 geoms, multiindex = _flatten_multi_geoms(s.geometry, prefix="Geom")
427430 values = np.take(values, multiindex, axis=0) if cmap else None
431 # ensure indexes are consistent
432 if color_given and isinstance(color, pd.Series):
433 color = color.reindex(s.index)
434 expl_color = np.take(color, multiindex, axis=0) if color_given else color
428435 expl_series = geopandas.GeoSeries(geoms)
429436
430437 geom_types = expl_series.type
442449 # color overrides both face and edgecolor. As we want people to be
443450 # able to use edgecolor as well, pass color to facecolor
444451 facecolor = style_kwds.pop("facecolor", None)
452 color_ = expl_color[poly_idx] if color_given else color
445453 if color is not None:
446 facecolor = color
454 facecolor = color_
447455
448456 values_ = values[poly_idx] if cmap else None
449457 _plot_polygon_collection(
454462 lines = expl_series[line_idx]
455463 if not lines.empty:
456464 values_ = values[line_idx] if cmap else None
465 color_ = expl_color[line_idx] if color_given else color
466
457467 _plot_linestring_collection(
458 ax, lines, values_, color=color, cmap=cmap, **style_kwds
468 ax, lines, values_, color=color_, cmap=cmap, **style_kwds
459469 )
460470
461471 # plot all Points in the same collection
462472 points = expl_series[point_idx]
463473 if not points.empty:
464474 values_ = values[point_idx] if cmap else None
475 color_ = expl_color[point_idx] if color_given else color
476
465477 _plot_point_collection(
466 ax, points, values_, color=color, cmap=cmap, **style_kwds
478 ax, points, values_, color=color_, cmap=cmap, **style_kwds
467479 )
468480
469481 plt.draw()
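The change above decomposes a list-like color alongside the exploded multi-geometries (via the same np.take multiindex used for cmap values), so per-geometry colors stay aligned with their parts. A minimal sketch, assuming matplotlib is available:

    import geopandas
    from shapely.geometry import MultiPoint, Point

    gs = geopandas.GeoSeries([Point(0, 0), MultiPoint([(1, 1), (2, 2)])])
    # one entry per geometry; both parts of the MultiPoint inherit its entry
    ax = gs.plot(color=["red", "blue"])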
523535 - 'hexbin' : hexbin plot.
524536 cmap : str (default None)
525537 The name of a colormap recognized by matplotlib.
526 color : str (default None)
538 color : str, np.array, pd.Series (default None)
527539 If specified, all objects will be colored uniformly.
528540 ax : matplotlib.pyplot.Artist (default None)
529541 axes on which to draw the plot
562574 Size of the resulting matplotlib.figure.Figure. If the argument
563575 axes is given explicitly, figsize is ignored.
564576 legend_kwds : dict (default None)
565 Keyword arguments to pass to matplotlib.pyplot.legend() or
566 matplotlib.pyplot.colorbar().
577 Keyword arguments to pass to :func:`matplotlib.pyplot.legend` or
578 :func:`matplotlib.pyplot.colorbar`.
567579 Additional accepted keywords when `scheme` is specified:
568580
569581 fmt : string
736748 except ImportError:
737749 raise ImportError(mc_err)
738750
739 if mapclassify.__version__ < LooseVersion("2.4.0"):
751 if Version(mapclassify.__version__) < Version("2.4.0"):
740752 raise ImportError(mc_err)
741753
742754 if classification_kwds is None:
859871 **style_kwds,
860872 )
861873
862 if missing_kwds is not None and not expl_series[nan_idx].empty:
874 missing_data = not expl_series[nan_idx].empty
875 if missing_kwds is not None and missing_data:
863876 if color:
864877 if "color" not in missing_kwds:
865878 missing_kwds["color"] = color
901914 markeredgewidth=0,
902915 )
903916 )
904 if missing_kwds is not None:
917 if missing_kwds is not None and missing_data:
905918 if "color" in merged_kwds:
906919 merged_kwds["facecolor"] = merged_kwds["color"]
907920 patches.append(
452452
453453 # handle empty / invalid geometries
454454 if geometry is None:
455 # return an empty integer array, similar to pygeys.STRtree.query.
455 # return an empty integer array, similar to pygeos.STRtree.query.
456456 return np.array([], dtype=np.intp)
457457
458458 if not isinstance(geometry, BaseGeometry):
632632
633633 _PYGEOS_PREDICATES = {p.name for p in pygeos.strtree.BinaryPredicate} | set([None])
634634
635 class PyGEOSSTRTreeIndex(pygeos.STRtree):
635 class PyGEOSSTRTreeIndex(BaseSpatialIndex):
636636 """A simple wrapper around pygeos's STRtree.
637637
638638
650650 non_empty = geometry.copy()
651651 non_empty[pygeos.is_empty(non_empty)] = None
652652 # set empty geometries to None to maintain indexing
653 super().__init__(non_empty)
653 self._tree = pygeos.STRtree(non_empty)
654654 # store geometries, including empty geometries for user access
655655 self.geometries = geometry.copy()
656656
686686 if isinstance(geometry, BaseGeometry):
687687 geometry = array._shapely_to_geom(geometry)
688688
689 matches = super().query(geometry=geometry, predicate=predicate)
689 matches = self._tree.query(geometry=geometry, predicate=predicate)
690690
691691 if sort:
692692 return np.sort(matches)
739739
740740 geometry = self._as_geometry_array(geometry)
741741
742 res = super().query_bulk(geometry, predicate)
742 res = self._tree.query_bulk(geometry, predicate)
743743
744744 if sort:
745745 # sort by first array (geometry) and then second (tree)
759759 geometry = self._as_geometry_array(geometry)
760760
761761 if not return_all and max_distance is None and not return_distance:
762 return super().nearest(geometry)
763
764 result = super().nearest_all(
762 return self._tree.nearest(geometry)
763
764 result = self._tree.nearest_all(
765765 geometry, max_distance=max_distance, return_distance=return_distance
766766 )
767767 if return_distance:
803803
804804 # need to convert tuple of bounds to a geometry object
805805 if len(coordinates) == 4:
806 indexes = super().query(pygeos.box(*coordinates))
806 indexes = self._tree.query(pygeos.box(*coordinates))
807807 elif len(coordinates) == 2:
808 indexes = super().query(pygeos.points(*coordinates))
808 indexes = self._tree.query(pygeos.points(*coordinates))
809809 else:
810810 raise TypeError(
811811 "Invalid coordinates, must be iterable in format "
818818 @property
819819 @doc(BaseSpatialIndex.size)
820820 def size(self):
821 return len(self)
821 return len(self._tree)
822822
823823 @property
824824 @doc(BaseSpatialIndex.is_empty)
825825 def is_empty(self):
826 return len(self) == 0
826 return len(self._tree) == 0
827
828 def __len__(self):
829 return len(self._tree)
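The hunk above moves PyGEOSSTRTreeIndex from subclassing pygeos.STRtree to holding one in self._tree, so every tree operation is delegated explicitly instead of via super(). A stripped-down sketch of the same composition pattern, assuming pygeos is installed (class name hypothetical):

    import numpy as np
    import pygeos


    class WrappedSTRtree:
        """Minimal sketch: compose a pygeos.STRtree instead of inheriting from it."""

        def __init__(self, geometry):
            geometry = np.asarray(geometry, dtype=object)
            self._tree = pygeos.STRtree(geometry)  # delegate, don't inherit
            self.geometries = geometry.copy()

        def query(self, geometry, predicate=None):
            return self._tree.query(geometry, predicate=predicate)

        def __len__(self):
            return len(self._tree)


    tree = WrappedSTRtree(pygeos.points(np.arange(3), np.arange(3)))
    assert len(tree) == 3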
315315 )
316316
317317 # geometry comparison
318 for col, dtype in left.dtypes.iteritems():
318 for col, dtype in left.dtypes.items():
319319 if isinstance(dtype, GeometryDtype):
320320 assert_geoseries_equal(
321321 left[col],
00 import subprocess
11 import sys
2
3 from geopandas._compat import PANDAS_GE_10
42
53
64 def test_no_additional_imports():
1917 "psycopg2",
2018 "geopy",
2119 "geoalchemy2",
20 "matplotlib",
2221 }
23 if PANDAS_GE_10:
24 # pandas > 0.25 stopped importing matplotlib by default
25 blacklist.add("matplotlib")
2622
2723 code = """
2824 import sys
99 import shapely.geometry
1010 from shapely.geometry.base import CAP_STYLE, JOIN_STYLE
1111 import shapely.wkb
12 import shapely.wkt
1213 from shapely._buildcfg import geos_version
1314
1415 import geopandas
3132 shapely.geometry.Polygon([(random.random(), random.random()) for i in range(3)])
3233 for _ in range(10)
3334 ]
34 triangles = triangle_no_missing + [shapely.geometry.Polygon(), None]
35 triangles = triangle_no_missing + [shapely.wkt.loads("POLYGON EMPTY"), None]
3536 T = from_shapely(triangles)
3637
3738 points_no_missing = [
142143 missing_values = [None]
143144 if not compat.USE_PYGEOS:
144145 missing_values.extend([b"", np.nan])
145
146 if compat.PANDAS_GE_10:
147 missing_values.append(pd.NA)
146 missing_values.append(pd.NA)
148147
149148 res = from_wkb(missing_values)
150149 np.testing.assert_array_equal(res, np.full(len(missing_values), None))
204203 L_wkt = [f(p.wkt) for p in points_no_missing]
205204 res = from_wkt(L_wkt)
206205 assert isinstance(res, GeometryArray)
207 assert all(v.almost_equals(t) for v, t in zip(res, points_no_missing))
206 tol = 0.5 * 10 ** (-6)
207 assert all(v.equals_exact(t, tolerance=tol) for v, t in zip(res, points_no_missing))
208 assert all(v.equals_exact(t, tolerance=tol) for v, t in zip(res, points_no_missing))
208209
209210 # array
210211 res = from_wkt(np.array(L_wkt, dtype=object))
211212 assert isinstance(res, GeometryArray)
212 assert all(v.almost_equals(t) for v, t in zip(res, points_no_missing))
213 assert all(v.equals_exact(t, tolerance=tol) for v, t in zip(res, points_no_missing))
213214
214215 # missing values
215216 # TODO(pygeos) does not support empty strings, np.nan, or pd.NA
216217 missing_values = [None]
217218 if not compat.USE_PYGEOS:
218219 missing_values.extend([f(""), np.nan])
219
220 if compat.PANDAS_GE_10:
221 missing_values.append(pd.NA)
220 missing_values.append(pd.NA)
222221
223222 res = from_wkb(missing_values)
224223 np.testing.assert_array_equal(res, np.full(len(missing_values), None))
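For context on the tolerance used above: shapely's deprecated almost_equals(other, decimal=6) compares coordinates rounded to a number of decimal places, and equals_exact with tolerance 0.5 * 10**-decimal is the drop-in replacement. A sketch:

    from shapely.geometry import Point

    a = Point(0, 0)
    b = Point(0, 1e-7)  # within the old 6-decimal default

    tol = 0.5 * 10 ** (-6)
    assert a.equals_exact(b, tolerance=tol)
    assert not a.equals_exact(Point(0, 1e-5), tolerance=tol)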
340339
341340
342341 @pytest.mark.parametrize(
343 "attr,args", [("equals_exact", (0.1,)), ("almost_equals", (3,))]
344 )
345 def test_equals_deprecation(attr, args):
346 point = points[0]
347 tri = triangles[0]
348
349 for other in [point, tri, shapely.geometry.Polygon()]:
350 with pytest.warns(FutureWarning):
351 result = getattr(T, attr)(other, *args)
352 assert result.tolist() == getattr(T, "geom_" + attr)(other, *args).tolist()
353
354
355 @pytest.mark.parametrize(
356342 "attr",
357343 [
358344 "boundary",
366352 def test_unary_geo(attr):
367353 na_value = None
368354
369 if attr == "boundary":
370 # pygeos returns None for empty geometries
371 if not compat.USE_PYGEOS:
372 # boundary raises for empty geometry
373 with pytest.raises(Exception):
374 T.boundary
375
376 values = triangle_no_missing + [None]
377 A = from_shapely(values)
378 else:
379 values = triangles
380 A = T
381
382 result = getattr(A, attr)
383 if attr == "exterior" and compat.USE_PYGEOS:
384 # TODO(pygeos)
385 # empty Polygon() has an exterior with shapely > 1.7, which gives
386 # empty LinearRing instead of None,
387 # but conversion to pygeos still results in empty GeometryCollection
388 expected = [
389 getattr(t, attr) if t is not None and not t.is_empty else na_value
390 for t in values
391 ]
392 else:
393 expected = [getattr(t, attr) if t is not None else na_value for t in values]
355 result = getattr(T, attr)
356 expected = [getattr(t, attr) if t is not None else na_value for t in triangles]
394357
395358 assert equal_geometries(result, expected)
396359
465428 "has_z",
466429 # for is_ring we raise a warning about the value for Polygon changing
467430 pytest.param(
468 "is_ring", marks=pytest.mark.filterwarnings("ignore:is_ring:FutureWarning")
431 "is_ring",
432 marks=[
433 pytest.mark.filterwarnings("ignore:is_ring:FutureWarning"),
434 ],
469435 ),
470436 ],
471437 )
483449
484450 result = getattr(V, attr)
485451
486 if attr == "is_simple" and (geos_version < (3, 8) or compat.USE_PYGEOS):
452 if attr == "is_simple" and geos_version < (3, 8):
487453 # poly.is_simple raises an error for empty polygon for GEOS < 3.8
488454 # with shapely; pygeos always returns False for all GEOS versions
489 # But even for Shapely with GEOS >= 3.8, empty GeometryCollection
490 # returns True instead of False
491455 expected = [
492456 getattr(t, attr) if t is not None and not t.is_empty else na_value
493457 for t in vals
499463 else na_value
500464 for t in vals
501465 ]
466 # empty Linearring.is_ring gives False with Shapely < 2.0
467 if compat.USE_PYGEOS and not compat.SHAPELY_GE_20:
468 expected[-2] = True
469 elif (
470 attr == "is_closed"
471 and compat.USE_PYGEOS
472 and compat.SHAPELY_GE_182
473 and not compat.SHAPELY_GE_20
474 ):
475 # In shapely 1.8.2, is_closed was changed to return always True for
476 # Polygon/MultiPolygon, while PyGEOS returns always False
477 expected = [False] * len(vals)
502478 else:
503479 expected = [getattr(t, attr) if t is not None else na_value for t in vals]
480
504481 assert result.tolist() == expected
505482
506483
512489 shapely.geometry.LineString([(0, 0), (1, 1), (1, -1)]),
513490 shapely.geometry.LineString([(0, 0), (1, 1), (1, -1), (0, 0)]),
514491 shapely.geometry.Polygon([(0, 0), (1, 1), (1, -1)]),
515 shapely.geometry.Polygon(),
492 shapely.wkt.loads("POLYGON EMPTY"),
516493 None,
517494 ]
518 expected = [True, False, True, True, False, False]
495 expected = [True, False, True, True, True, False]
496 if not compat.USE_PYGEOS and not compat.SHAPELY_GE_20:
497 # empty polygon is_ring gives False with Shapely < 2.0
498 expected[-2] = False
519499
520500 result = from_shapely(g).is_ring
521501
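The swap from Polygon() to shapely.wkt.loads("POLYGON EMPTY") above is deliberate; a minimal standalone sketch of the distinction (the pygeos caveat is the one noted in test_unary_geo earlier in this diff):

import shapely.wkt
from shapely.geometry import Polygon

empty = shapely.wkt.loads("POLYGON EMPTY")
assert empty.is_empty
# Polygon() is also empty, but round-tripping it through other backends
# (e.g. conversion to pygeos) has historically produced an empty
# GeometryCollection instead, which the explicit WKT form avoids.
assert Polygon().is_empty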
535515 def test_geom_types():
536516 cat = T.geom_type
537517 # the empty polygon now reports Polygon type (was GeometryCollection)
538 assert list(cat) == ["Polygon"] * (len(T) - 2) + ["GeometryCollection", None]
518 assert list(cat) == ["Polygon"] * (len(T) - 1) + [None]
539519
540520
541521 def test_geom_types_null_mixed():
783763 assert P5.equals(points[1])
784764
785765
766 @pytest.mark.parametrize(
767 "item",
768 [
769 geopandas.GeoDataFrame(
770 geometry=[shapely.geometry.Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]
771 ),
772 geopandas.GeoSeries(
773 [shapely.geometry.Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]
774 ),
775 np.array([shapely.geometry.Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]),
776 [shapely.geometry.Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])],
777 shapely.geometry.Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
778 ],
779 )
780 def test_setitem(item):
781 points = [shapely.geometry.Point(i, i) for i in range(10)]
782 P = from_shapely(points)
783
784 P[[0]] = item
785
786 assert isinstance(P[0], shapely.geometry.Polygon)
787
788
786789 def test_equality_ops():
787790 with pytest.raises(ValueError):
788791 P[:5] == P[:7]
903906 assert t1[0] is None
904907
905908
906 @pytest.mark.skipif(not compat.PANDAS_GE_10, reason="pd.NA introduced in pandas 1.0")
907909 def test_isna_pdNA():
908910 t1 = T.copy()
909911 t1[0] = pd.NA
939941 self.landmarks.estimate_utm_crs()
940942 else:
941943 assert self.landmarks.estimate_utm_crs() == CRS("EPSG:32618")
942 assert self.landmarks.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
944 if compat.PYPROJ_GE_32: # result is unstable in older pyproj
945 assert self.landmarks.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
943946
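For context, a small usage sketch of estimate_utm_crs, standalone rather than part of the test class (coordinates taken from the landmarks fixture; the NAD83 result is only stable on pyproj >= 3.2, which is what the PYPROJ_GE_32 gate encodes):

import geopandas
from shapely.geometry import Point

s = geopandas.GeoSeries([Point(-73.9847, 40.7484)], crs="EPSG:4326")
print(s.estimate_utm_crs())         # EPSG:32618 (WGS 84 / UTM zone 18N)
print(s.estimate_utm_crs("NAD83"))  # EPSG:26918 on pyproj >= 3.2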
944947 @pytest.mark.skipif(compat.PYPROJ_LT_3, reason="requires pyproj 3 or higher")
945948 def test_estimate_utm_crs__projected(self):
0 from distutils.version import LooseVersion
1 import os
0 from packaging.version import Version
21
32 import random
43
1716
1817 # pyproj 2.3.1 fixed a segfault for the case working in an environment with
1918 # 'init' dicts (https://github.com/pyproj4/pyproj/issues/415)
20 PYPROJ_LT_231 = LooseVersion(pyproj.__version__) < LooseVersion("2.3.1")
19 PYPROJ_LT_231 = Version(pyproj.__version__) < Version("2.3.1")
20 PYPROJ_GE_3 = Version(pyproj.__version__) >= Version("3.0.0")
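The migration from distutils' LooseVersion to packaging.Version, repeated across these files, buys strict PEP 440 comparison semantics; a minimal standalone illustration:

from packaging.version import Version

# PEP 440 comparison is numeric per component, so 2.10 sorts after 2.9;
# naive string comparison would order these the other way around.
assert Version("2.10.0") > Version("2.9.1")
# trailing zero components are insignificant
assert Version("3") == Version("3.0.0")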
2121
2222
2323 def _create_df(x, y=None, crs=None):
124124 # with PROJ >= 7, the transformation using EPSG code vs proj4 string is
125125 # slightly different due to use of grid files or not -> turn off network
126126 # to not use grid files at all for this test
127 os.environ["PROJ_NETWORK"] = "OFF"
127 if PYPROJ_GE_3:
128 pyproj.network.set_network_enabled(False)
129
128130 df = df_epsg26918()
129131 lonlat = df.to_crs(**epsg4326)
130132 utm = lonlat.to_crs(**epsg26918)
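The runtime call replacing the PROJ_NETWORK environment variable is presumably more robust, since an environment variable only takes effect if set before PROJ initializes; a minimal sketch, assuming pyproj >= 3:

import pyproj

pyproj.network.set_network_enabled(False)  # disable grid-file downloads
assert pyproj.network.is_network_enabled() is False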
199201 assert s.crs == self.osgb
200202 assert s.values.crs == self.osgb
201203
202 with pytest.warns(FutureWarning):
204 with pytest.raises(
205 ValueError,
206 match="CRS mismatch between CRS of the passed geometries and 'crs'",
207 ):
203208 s = GeoSeries(arr, crs=4326)
204209 assert s.crs == self.osgb
205210
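The promotion from FutureWarning to ValueError means a conflicting crs= argument is no longer silently ignored; a minimal standalone sketch of the new failure mode:

import pytest
import geopandas
from shapely.geometry import Point

s = geopandas.GeoSeries([Point(0, 0)], crs="EPSG:27700")
with pytest.raises(ValueError, match="CRS mismatch"):
    # the GeometryArray already carries EPSG:27700, so crs=4326 conflicts
    geopandas.GeoSeries(s.values, crs="EPSG:4326")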
206 @pytest.mark.filterwarnings("ignore:Assigning CRS")
207211 def test_dataframe(self):
208212 arr = from_shapely(self.geoms, crs=27700)
209213 df = GeoDataFrame(geometry=arr)
218222 assert df.geometry.crs == self.osgb
219223 assert df.geometry.values.crs == self.osgb
220224
221 # different passed CRS than array CRS is ignored
222 with pytest.warns(FutureWarning, match="CRS mismatch"):
225 # a passed CRS that differs from the array CRS is now an error
226 match_str = "CRS mismatch between CRS of the passed geometries and 'crs'"
227 with pytest.raises(ValueError, match=match_str):
223228 df = GeoDataFrame(geometry=s, crs=4326)
224 assert df.crs == self.osgb
225 assert df.geometry.crs == self.osgb
226 assert df.geometry.values.crs == self.osgb
227 with pytest.warns(FutureWarning, match="CRS mismatch"):
229 with pytest.raises(ValueError, match=match_str):
228230 GeoDataFrame(geometry=s, crs=4326)
229 with pytest.warns(FutureWarning, match="CRS mismatch"):
231 with pytest.raises(ValueError, match=match_str):
230232 GeoDataFrame({"data": [1, 2], "geometry": s}, crs=4326)
231 with pytest.warns(FutureWarning, match="CRS mismatch"):
233 with pytest.raises(ValueError, match=match_str):
232234 GeoDataFrame(df, crs=4326).crs
233235
234236 # manually change CRS
240242 assert df.geometry.crs == self.wgs
241243 assert df.geometry.values.crs == self.wgs
242244
243 df = GeoDataFrame(self.geoms, columns=["geom"], crs=27700)
244 assert df.crs == self.osgb
245 df = df.set_geometry("geom")
245 with pytest.raises(ValueError, match="Assigning CRS to a GeoDataFrame without"):
246 GeoDataFrame(self.geoms, columns=["geom"], crs=27700)
247 with pytest.raises(ValueError, match="Assigning CRS to a GeoDataFrame without"):
248 GeoDataFrame(crs=27700)
249
250 df = GeoDataFrame(self.geoms, columns=["geom"])
251 df = df.set_geometry("geom", crs=27700)
246252 assert df.crs == self.osgb
247253 assert df.geometry.crs == self.osgb
248254 assert df.geometry.values.crs == self.osgb
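With crs= on a geometry-less GeoDataFrame now an error, the CRS is attached when the geometry column is activated instead; a short sketch of the new idiom (column name is illustrative):

import geopandas
from shapely.geometry import Point

df = geopandas.GeoDataFrame({"geom": [Point(0, 0), Point(1, 1)]})
df = df.set_geometry("geom", crs="EPSG:27700")  # CRS set at activation time
assert df.crs == "EPSG:27700"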
254260 assert df.geometry.crs == self.osgb
255261 assert df.geometry.values.crs == self.osgb
256262
257 df = GeoDataFrame(crs=27700)
258 df = df.set_geometry(self.geoms)
259 assert df.crs == self.osgb
260 assert df.geometry.crs == self.osgb
261 assert df.geometry.values.crs == self.osgb
262
263263 # new geometry with set CRS has priority over GDF CRS
264 df = GeoDataFrame(crs=27700)
264 df = GeoDataFrame(geometry=self.geoms, crs=27700)
265265 df = df.set_geometry(self.geoms, crs=4326)
266266 assert df.crs == self.wgs
267267 assert df.geometry.crs == self.wgs
296296
297297 # geometry column without geometry
298298 df = GeoDataFrame({"geometry": [0, 1]})
299 df.crs = 27700
300 assert df.crs == self.osgb
299 with pytest.warns(
300 FutureWarning, match="Accessing CRS of a GeoDataFrame without a geometry"
301 ):
302 df.crs = 27700
303 with pytest.warns(
304 FutureWarning, match="Accessing CRS of a GeoDataFrame without a geometry"
305 ):
306 assert df.crs == self.osgb
307
308 def test_dataframe_getitem_without_geometry_column(self):
309 df = GeoDataFrame({"col": range(10)}, geometry=self.arr)
310 df["geom2"] = df.geometry.centroid
311 subset = df[["col", "geom2"]]
312 with pytest.warns(
313 FutureWarning, match="Accessing CRS of a GeoDataFrame without a geometry"
314 ):
315 assert subset.crs == self.osgb
301316
302317 def test_dataframe_setitem(self):
303318 # new geometry CRS has priority over GDF CRS
335350 assert df["geometry"].crs == self.wgs
336351 assert df["other_geom"].crs == self.osgb
337352
353 def test_dataframe_setitem_without_geometry_column(self):
354 arr = from_shapely(self.geoms)
355 df = GeoDataFrame({"col1": [1, 2], "geometry": arr}, crs=4326)
356
357 # create a dataframe without a geometry column, which currently keeps a cached _crs
358 with pytest.warns(UserWarning):
359 df["geometry"] = 1
360
361 # assigning a list of geometry objects will currently use the cached _crs
362 with pytest.warns(
363 FutureWarning,
364 match="Setting geometries to a GeoDataFrame without a geometry",
365 ):
366 df["geometry"] = self.geoms
367 assert df.crs == self.wgs
368
338369 @pytest.mark.parametrize(
339370 "scalar", [None, Point(0, 0), LineString([(0, 0), (1, 1)])]
340371 )
341372 def test_scalar(self, scalar):
342 with pytest.warns(FutureWarning):
373 df = GeoDataFrame()
374 df["geometry"] = scalar
375 df.crs = 4326
376 assert df.crs == self.wgs
377 assert df.geometry.crs == self.wgs
378 assert df.geometry.values.crs == self.wgs
379
380 @pytest.mark.filterwarnings("ignore:Accessing CRS")
381 def test_crs_with_no_geom_fails(self):
382 with pytest.raises(ValueError, match="Assigning CRS to a GeoDataFrame without"):
343383 df = GeoDataFrame()
344384 df.crs = 4326
345 df["geometry"] = scalar
346 assert df.crs == self.wgs
347 assert df.geometry.crs == self.wgs
348 assert df.geometry.values.crs == self.wgs
349385
350386 def test_read_file(self):
351387 nybb_filename = datasets.get_path("nybb")
577613 assert merged.geom.values.crs == self.osgb
578614 assert merged.crs == self.osgb
579615
580 # CRS should be assigned to geometry
581 def test_deprecation(self):
582 with pytest.warns(FutureWarning):
583 df = GeoDataFrame([], crs=27700)
584
585 # https://github.com/geopandas/geopandas/issues/1548
586 # ensure we still have converted the crs value to a CRS object
587 assert isinstance(df.crs, pyproj.CRS)
588
589 with pytest.warns(FutureWarning):
590 df = GeoDataFrame([])
591 df.crs = 27700
592
593 assert isinstance(df.crs, pyproj.CRS)
594
595616 # make sure that geometry column from list has CRS (__setitem__)
596617 def test_setitem_geometry(self):
597618 arr = from_shapely(self.geoms, crs=27700)
66
77 from pandas.testing import assert_frame_equal
88 import pytest
9
10 from geopandas.testing import assert_geodataframe_equal
911
1012
1113 @pytest.fixture
1719 nybb_polydf = nybb_polydf.set_geometry("myshapes")
1820 nybb_polydf["manhattan_bronx"] = 5
1921 nybb_polydf.loc[3:4, "manhattan_bronx"] = 6
22 nybb_polydf["BoroCode"] = nybb_polydf["BoroCode"].astype("int64")
2023 return nybb_polydf
2124
2225
7073 assert test.crs is None
7174
7275
73 def first_dissolve(nybb_polydf, first):
76 def test_first_dissolve(nybb_polydf, first):
7477 test = nybb_polydf.dissolve("manhattan_bronx")
7578 assert_frame_equal(first, test, check_column_type=False)
7679
296299 UserWarning, match="dropna kwarg is not supported for pandas < 1.1.0"
297300 ):
298301 nybb_polydf.dissolve(dropna=False)
302
303
304 def test_dissolve_multi_agg(nybb_polydf, merged_shapes):
305
306 merged_shapes[("BoroCode", "min")] = [3, 1]
307 merged_shapes[("BoroCode", "max")] = [5, 2]
308 merged_shapes[("BoroName", "count")] = [3, 2]
309
310 with pytest.warns(None) as record:
311 test = nybb_polydf.dissolve(
312 by="manhattan_bronx",
313 aggfunc={
314 "BoroCode": ["min", "max"],
315 "BoroName": "count",
316 },
317 )
318 assert_geodataframe_equal(test, merged_shapes)
319 assert len(record) == 0
11 import numpy as np
22 import pandas as pd
33 import pytest
4 from distutils.version import LooseVersion
4 from packaging.version import Version
55
66 folium = pytest.importorskip("folium")
77 branca = pytest.importorskip("branca")
1212 import matplotlib.colors as colors # noqa
1313 from branca.colormap import StepColormap # noqa
1414
15 BRANCA_05 = str(branca.__version__) > LooseVersion("0.4.2")
15 BRANCA_05 = Version(branca.__version__) > Version("0.4.2")
1616
1717
1818 class TestExplore:
6161 assert "openstreetmap" in m.to_dict()["children"].keys()
6262
6363 def test_map_settings_custom(self):
64 """Check custom map settins"""
64 """Check custom map settings"""
6565 m = self.nybb.explore(
6666 zoom_control=False,
6767 width=200,
251251 df["categorical"] = pd.Categorical(df["BoroName"])
252252 with pytest.raises(ValueError, match="Cannot specify 'categories'"):
253253 df.explore("categorical", categories=["Brooklyn", "Staten Island"])
254
255 def test_bool(self):
256 df = self.nybb.copy()
257 df["bool"] = [True, False, True, False, True]
258 m = df.explore("bool")
259 out_str = self._fetch_map_string(m)
260 assert '"__folium_color":"#9edae5","bool":true' in out_str
261 assert '"__folium_color":"#1f77b4","bool":false' in out_str
254262
255263 def test_column_values(self):
256264 """
308316 m = self.world.explore(column="pop_est", style_kwds=dict(color="black"))
309317 assert '"color":"black"' in self._fetch_map_string(m)
310318
319 # custom style_function - geopandas/issues/2350
320 m = self.world.explore(
321 style_kwds={
322 "style_function": lambda x: {
323 "fillColor": "red"
324 if x["properties"]["gdp_md_est"] < 10**6
325 else "green",
326 "color": "black"
327 if x["properties"]["gdp_md_est"] < 10**6
328 else "white",
329 }
330 }
331 )
332 # two lines with formatting instructions from style_function;
333 # make sure each passes the test
334 assert all(
335 [
336 ('"fillColor":"green"' in t and '"color":"white"' in t)
337 or ('"fillColor":"red"' in t and '"color":"black"' in t)
338 for t in [
339 "".join(line.split())
340 for line in m._parent.render().split("\n")
341 if "return" in line and "color" in line
342 ]
343 ]
344 )
345
346 # style function has to be callable
347 with pytest.raises(ValueError, match="'style_function' has to be a callable"):
348 self.world.explore(style_kwds={"style_function": "not callable"})
349
311350 def test_tooltip(self):
312351 """Test tooltip"""
313352 # default with no tooltip or popup
412451 assert "BoroName" in out_str
413452
414453 def test_default_markers(self):
415 # check overriden default for points
454 # check overridden default for points
416455 m = self.cities.explore()
417456 strings = ['"radius":2', '"fill":true', "CircleMarker(latlng,opts)"]
418457 out_str = self._fetch_map_string(m)
541580 assert out_str.count("#5ec962ff") == 100
542581 assert out_str.count("#fde725ff") == 100
543582
544 # scale legend accorrdingly
583 # scale legend accordingly
545584 m = self.world.explore(
546585 "pop_est",
547586 legend=True,
549588 )
550589 out_str = self._fetch_map_string(m)
551590 assert out_str.count("#440154ff") == 16
552 assert out_str.count("#3b528bff") == 51
553 assert out_str.count("#21918cff") == 133
554 assert out_str.count("#5ec962ff") == 282
555 assert out_str.count("#fde725ff") == 18
591 assert out_str.count("#3b528bff") == 50
592 assert out_str.count("#21918cff") == 138
593 assert out_str.count("#5ec962ff") == 290
594 assert out_str.count("#fde725ff") == 6
556595
557596 # discrete cmap
558597 m = self.world.explore("pop_est", legend=True, cmap="Pastel2")
569608
570609 @pytest.mark.skipif(not BRANCA_05, reason="requires branca >= 0.5.0")
571610 def test_colorbar_max_labels(self):
611 import re
612
572613 # linear
573614 m = self.world.explore("pop_est", legend_kwds=dict(max_labels=3))
574615 out_str = self._fetch_map_string(m)
575
576 tick_values = [140.0, 465176713.5921569, 930353287.1843138]
577 for tick in tick_values:
578 assert str(tick) in out_str
616 tick_str = re.search(r"tickValues\(\[[\',\,\.,0-9]*\]\)", out_str).group(0)
617 assert (
618 tick_str.replace(",''", "")
619 == "tickValues([140.0,471386328.07843137,942772516.1568627])"
620 )
579621
580622 # scheme
581623 m = self.world.explore(
582624 "pop_est", scheme="headtailbreaks", legend_kwds=dict(max_labels=3)
583625 )
584626 out_str = self._fetch_map_string(m)
585
586 assert "tickValues([140,'',182567501.0,'',1330619341.0,''])" in out_str
627 assert "tickValues([140.0,'',184117213.1818182,'',1382066377.0,''])" in out_str
587628
588629 # short cmap
589630 m = self.world.explore("pop_est", legend_kwds=dict(max_labels=3), cmap="tab10")
590631 out_str = self._fetch_map_string(m)
591632
592 tick_values = [140.0, 551721192.4, 1103442244.8]
593 for tick in tick_values:
594 assert str(tick) in out_str
633 tick_str = re.search(r"tickValues\(\[[\',\,\.,0-9]*\]\)", out_str).group(0)
634 assert (
635 tick_str
636 == "tickValues([140.0,'','','',559086084.0,'','','',1118172028.0,'','',''])"
637 )
595638
596639 def test_xyzservices_providers(self):
597640 xyzservices = pytest.importorskip("xyzservices")
607650 'attribution":"\\u0026copy;\\u003cahref=\\"https://www.openstreetmap.org'
608651 in out_str
609652 )
610 assert '"maxNativeZoom":19,"maxZoom":19,"minZoom":0' in out_str
653 assert '"maxNativeZoom":20,"maxZoom":20,"minZoom":0' in out_str
611654
612655 def test_xyzservices_query_name(self):
613656 pytest.importorskip("xyzservices")
623666 'attribution":"\\u0026copy;\\u003cahref=\\"https://www.openstreetmap.org'
624667 in out_str
625668 )
626 assert '"maxNativeZoom":19,"maxZoom":19,"minZoom":0' in out_str
669 assert '"maxNativeZoom":20,"maxZoom":20,"minZoom":0' in out_str
627670
628671 def test_linearrings(self):
629672 rings = self.nybb.explode(index_parts=True).exterior
643686 out_str = self._fetch_map_string(m)
644687
645688 strings = [
646 "[140.00,33986655.00]",
647 "(33986655.00,105350020.00]",
648 "(105350020.00,207353391.00]",
649 "(207353391.00,326625791.00]",
650 "(326625791.00,1379302771.00]",
689 "[140.00,21803000.00]",
690 "(21803000.00,66834405.00]",
691 "(66834405.00,163046161.00]",
692 "(163046161.00,328239523.00]",
693 "(328239523.00,1397715000.00]",
651694 "missing",
652695 ]
653696 for s in strings:
664707 out_str = self._fetch_map_string(m)
665708
666709 strings = [
667 ">140.00,33986655.00",
668 ">33986655.00,105350020.00",
669 ">105350020.00,207353391.00",
670 ">207353391.00,326625791.00",
671 ">326625791.00,1379302771.00",
710 ">140.00,21803000.00",
711 ">21803000.00,66834405.00",
712 ">66834405.00,163046161.00",
713 ">163046161.00,328239523.00",
714 ">328239523.00,1397715000.00",
672715 "missing",
673716 ]
674717 for s in strings:
699742 out_str = self._fetch_map_string(m)
700743
701744 strings = [
702 ">140,33986655",
703 ">33986655,105350020",
704 ">105350020,207353391",
705 ">207353391,326625791",
706 ">326625791,1379302771",
745 ">140,21803000",
746 ">21803000,66834405",
747 ">66834405,163046161",
748 ">163046161,328239523",
749 ">328239523,1397715000",
707750 "missing",
708751 ]
709752 for s in strings:
749792 for s in strings:
750793 assert s in out_str
751794
752 assert out_str.count("008000ff") == 306
753 assert out_str.count("ffff00ff") == 187
754 assert out_str.count("ff0000ff") == 190
795 assert out_str.count("008000ff") == 304
796 assert out_str.count("ffff00ff") == 188
797 assert out_str.count("ff0000ff") == 191
755798
756799 # Using custom function colormap
757800 def my_color_function(field):
796839 gdf["centroid"] = gdf.centroid
797840
798841 gdf.explore()
842
843 def test_map_kwds(self):
844 def check():
845 out_str = self._fetch_map_string(m)
846 assert "zoomControl:false" in out_str
847 assert "dragging:false" in out_str
848 assert "scrollWheelZoom:false" in out_str
849
850 # check that folium and leaflet Map() parameters can be passed
851 m = self.world.explore(
852 zoom_control=False, map_kwds=dict(dragging=False, scrollWheelZoom=False)
853 )
854 check()
855 with pytest.raises(
856 ValueError, match="'zoom_control' cannot be specified in 'map_kwds'"
857 ):
858 self.world.explore(
859 map_kwds=dict(dragging=False, scrollWheelZoom=False, zoom_control=False)
860 )
230230 return request.param
231231
232232
233 @pytest.fixture
234 def invalid_scalar(data):
235 """
236 A scalar that *cannot* be held by this ExtensionArray.
237
238 The default should work for most subclasses, but is not guaranteed.
239
240 If the array can hold any item (i.e. object dtype), then use pytest.skip.
241 """
242 return object.__new__(object)
243
244
233245 # Fixtures defined in pandas/conftest.py that are also needed: defining them
234246 # here instead of importing for compatibility
235247
292304 class TestInterface(extension_tests.BaseInterfaceTests):
293305 def test_array_interface(self, data):
294306 # we are overriding this base test because the creation of `expected`
295 # potentionally doesn't work for shapely geometries
307 # potentially doesn't work for shapely geometries
296308 # TODO can be removed with Shapely 2.0
297309 result = np.array(data)
298310 assert result[0] == data[0]
11 import os
22 import shutil
33 import tempfile
4 from distutils.version import LooseVersion
4 from packaging.version import Version
55
66 import numpy as np
77 import pandas as pd
88
9 import pyproj
109 from pyproj import CRS
1110 from pyproj.exceptions import CRSError
1211 from shapely.geometry import Point, Polygon
2322 import pytest
2423
2524
26 PYPROJ_LT_3 = LooseVersion(pyproj.__version__) < LooseVersion("3")
2725 TEST_NEAREST = compat.PYGEOS_GE_010 and compat.USE_PYGEOS
28 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
26 pandas_133 = Version(pd.__version__) == Version("1.3.3")
2927
3028
3129 @pytest.fixture
335333 assert isinstance(result, GeoDataFrame)
336334 assert isinstance(result.index, pd.DatetimeIndex)
337335
336 def test_set_geometry_np_int(self):
337 self.df.loc[:, 0] = self.df.geometry
338 df = self.df.set_geometry(np.int64(0))
339 assert df.geometry.name == 0
340
341 def test_get_geometry_invalid(self):
342 df = GeoDataFrame()
343 df["geom"] = self.df.geometry
344 msg_geo_col_none = "active geometry column to use has not been set. "
345 msg_geo_col_missing = "is not present. "
346
347 with pytest.raises(AttributeError, match=msg_geo_col_missing):
348 df.geometry
349 df2 = self.df.copy()
350 df2["geom2"] = df2.geometry
351 df2 = df2[["BoroCode", "BoroName", "geom2"]]
352 with pytest.raises(AttributeError, match=msg_geo_col_none):
353 df2.geometry
354
355 msg_other_geo_cols_present = "There are columns with geometry data type"
356 msg_no_other_geo_cols = "There are no existing columns with geometry data type"
357 with pytest.raises(AttributeError, match=msg_other_geo_cols_present):
358 df2.geometry
359
360 with pytest.raises(AttributeError, match=msg_no_other_geo_cols):
361 GeoDataFrame().geometry
362
338363 def test_align(self):
339364 df = self.df2
340365
505530 assert type(df2) is GeoDataFrame
506531 assert self.df.crs == df2.crs
507532
508 def test_to_file_crs(self):
509 """
510 Ensure that the file is written according to the crs
511 if it is specified
512
513 """
514 tempfilename = os.path.join(self.tempdir, "crs.shp")
515 # save correct CRS
516 self.df.to_file(tempfilename)
517 df = GeoDataFrame.from_file(tempfilename)
518 assert df.crs == self.df.crs
519 # overwrite CRS
520 self.df.to_file(tempfilename, crs=3857)
521 df = GeoDataFrame.from_file(tempfilename)
522 assert df.crs == "epsg:3857"
523
524 # specify CRS for gdf without one
525 df2 = self.df.copy()
526 df2.crs = None
527 df2.to_file(tempfilename, crs=2263)
528 df = GeoDataFrame.from_file(tempfilename)
529 assert df.crs == "epsg:2263"
530
531 def test_to_file_with_duplicate_columns(self):
532 df = GeoDataFrame(
533 data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)]
534 )
535 with pytest.raises(
536 ValueError, match="GeoDataFrame cannot contain duplicated column names."
537 ):
538 tempfilename = os.path.join(self.tempdir, "crs.shp")
539 df.to_file(tempfilename)
540
541533 def test_bool_index(self):
542534 # Find boros with 'B' in their name
543535 df = self.df[self.df["BoroName"].str.contains("B")]
593585 p3 = Point(3, 3)
594586 f3 = {
595587 "type": "Feature",
596 "properties": {"a": 2},
588 "properties": None,
597589 "geometry": p3.__geo_interface__,
598590 }
599591
601593
602594 result = df[["a", "b"]]
603595 expected = pd.DataFrame.from_dict(
604 [{"a": 0, "b": np.nan}, {"a": np.nan, "b": 1}, {"a": 2, "b": np.nan}]
596 [{"a": 0, "b": np.nan}, {"a": np.nan, "b": 1}, {"a": np.nan, "b": np.nan}]
605597 )
606598 assert_frame_equal(expected, result)
599
600 def test_from_features_empty_properties(self):
601 geojson_properties_object = """{
602 "type": "FeatureCollection",
603 "features": [
604 {
605 "type": "Feature",
606 "properties": {},
607 "geometry": {
608 "type": "Polygon",
609 "coordinates": [
610 [
611 [
612 11.3456529378891,
613 46.49461446367692
614 ],
615 [
616 11.345674395561216,
617 46.494097442978195
618 ],
619 [
620 11.346918940544128,
621 46.49385370294394
622 ],
623 [
624 11.347616314888,
625 46.4938352377453
626 ],
627 [
628 11.347514390945435,
629 46.49466985846028
630 ],
631 [
632 11.3456529378891,
633 46.49461446367692
634 ]
635 ]
636 ]
637 }
638 }
639 ]
640 }"""
641
642 geojson_properties_null = """{
643 "type": "FeatureCollection",
644 "features": [
645 {
646 "type": "Feature",
647 "properties": null,
648 "geometry": {
649 "type": "Polygon",
650 "coordinates": [
651 [
652 [
653 11.3456529378891,
654 46.49461446367692
655 ],
656 [
657 11.345674395561216,
658 46.494097442978195
659 ],
660 [
661 11.346918940544128,
662 46.49385370294394
663 ],
664 [
665 11.347616314888,
666 46.4938352377453
667 ],
668 [
669 11.347514390945435,
670 46.49466985846028
671 ],
672 [
673 11.3456529378891,
674 46.49461446367692
675 ]
676 ]
677 ]
678 }
679 }
680 ]
681 }"""
682
683 # geoJSON with empty properties
684 gjson_po = json.loads(geojson_properties_object)
685 gdf1 = GeoDataFrame.from_features(gjson_po)
686
687 # geoJSON with null properties
688 gjson_null = json.loads(geojson_properties_null)
689 gdf2 = GeoDataFrame.from_features(gjson_null)
690
691 assert_frame_equal(gdf1, gdf2)
607692
608693 def test_from_features_geom_interface_feature(self):
609694 class Placemark(object):
633718 geometry = [Point(xy) for xy in zip(df["lon"], df["lat"])]
634719 gdf = GeoDataFrame(df, geometry=geometry)
635720 # from_features returns sorted columns
636 expected = gdf[["geometry", "lat", "lon", "name"]]
721 expected = gdf[["geometry", "name", "lat", "lon"]]
637722
638723 # test FeatureCollection
639724 res = GeoDataFrame.from_features(gdf.__geo_interface__)
760845 assert self.df.crs == unpickled.crs
761846
762847 def test_estimate_utm_crs(self):
763 if PYPROJ_LT_3:
848 if compat.PYPROJ_LT_3:
764849 with pytest.raises(RuntimeError, match=r"pyproj 3\+ required"):
765850 self.df.estimate_utm_crs()
766851 else:
767852 assert self.df.estimate_utm_crs() == CRS("EPSG:32618")
768 assert self.df.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
853 if compat.PYPROJ_GE_32: # result is unstable in older pyproj
854 assert self.df.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
769855
770856 def test_to_wkb(self):
771857 wkbs0 = [
66 from shapely.geometry import LinearRing, LineString, MultiPoint, Point, Polygon
77 from shapely.geometry.collection import GeometryCollection
88 from shapely.ops import unary_union
9 from shapely import wkt
910
1011 from geopandas import GeoDataFrame, GeoSeries
1112 from geopandas.base import GeoPandasBase
7172 self.landmarks = GeoSeries([self.esb, self.sol], crs="epsg:4326")
7273 self.pt2d = Point(-73.9847, 40.7484)
7374 self.landmarks_mixed = GeoSeries([self.esb, self.sol, self.pt2d], crs=4326)
75 self.pt_empty = wkt.loads("POINT EMPTY")
76 self.landmarks_mixed_empty = GeoSeries(
77 [self.esb, self.sol, self.pt2d, self.pt_empty], crs=4326
78 )
7479 self.l1 = LineString([(0, 0), (0, 1), (1, 1)])
7580 self.l2 = LineString([(0, 0), (1, 0), (1, 1), (0, 1)])
7681 self.g5 = GeoSeries([self.l1, self.l2])
7984 self.g8 = GeoSeries([self.t1, self.t5])
8085 self.empty = GeoSeries([])
8186 self.all_none = GeoSeries([None, None])
87 self.all_geometry_collection_empty = GeoSeries(
88 [GeometryCollection([]), GeometryCollection([])]
89 )
8290 self.empty_poly = Polygon()
8391 self.g9 = GeoSeries(self.g0, index=range(1, 8))
92 self.g10 = GeoSeries([self.t1, self.t4])
8493
8594 # Crossed lines
8695 self.l3 = LineString([(0, 0), (1, 1)])
247256 with pytest.warns(UserWarning, match="The indices .+ different"):
248257 assert len(self.g0.intersection(self.g9, align=True) == 8)
249258 assert len(self.g0.intersection(self.g9, align=False) == 7)
259
260 def test_clip_by_rect(self):
261 self._test_binary_topological(
262 "clip_by_rect", self.g1, self.g10, *self.sq.bounds
263 )
264 # self.g1 and self.t3.bounds do not intersect
265 self._test_binary_topological(
266 "clip_by_rect", self.all_geometry_collection_empty, self.g1, *self.t3.bounds
267 )
250268
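clip_by_rect is new in this release, so a minimal usage sketch may help (values are illustrative; the rectangle is given as xmin, ymin, xmax, ymax):

import geopandas
from shapely.geometry import Polygon

s = geopandas.GeoSeries([Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])])
clipped = s.clip_by_rect(0, 0, 2, 2)  # keep only the part inside the rectangle
assert clipped[0].bounds == (0.0, 0.0, 2.0, 2.0)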
251269 def test_union_series(self):
252270 self._test_binary_topological("union", self.sq, self.g1, self.g2)
471489 )
472490 assert_array_dtype_equal(expected, self.na_none.distance(self.p0))
473491
474 expected = Series(np.array([np.sqrt(4 ** 2 + 4 ** 2), np.nan]), self.g6.index)
492 expected = Series(np.array([np.sqrt(4**2 + 4**2), np.nan]), self.g6.index)
475493 assert_array_dtype_equal(expected, self.g6.distance(self.na_none))
476494
477495 expected = Series(np.array([np.nan, 0, 0, 0, 0, 0, np.nan, np.nan]), range(8))
621639 # mixed dimensions
622640 expected_z = [30.3244, 31.2344, np.nan]
623641 assert_array_dtype_equal(expected_z, self.landmarks_mixed.geometry.z)
642
643 def test_xyz_points_empty(self):
644 expected_x = [-73.9847, -74.0446, -73.9847, np.nan]
645 expected_y = [40.7484, 40.6893, 40.7484, np.nan]
646 expected_z = [30.3244, 31.2344, np.nan, np.nan]
647
648 assert_array_dtype_equal(expected_x, self.landmarks_mixed_empty.geometry.x)
649 assert_array_dtype_equal(expected_y, self.landmarks_mixed_empty.geometry.y)
650 assert_array_dtype_equal(expected_z, self.landmarks_mixed_empty.geometry.z)
624651
625652 def test_xyz_polygons(self):
626653 # accessing x attribute in polygon geoseries should raise an error
10891116 test_df = df.explode(ignore_index=True, index_parts=True)
10901117 assert_frame_equal(test_df, expected_df)
10911118
1119 def test_explode_order(self):
1120 df = GeoDataFrame(
1121 {"vals": [1, 2, 3]},
1122 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(3)],
1123 index=[2, 9, 7],
1124 )
1125 test_df = df.explode(index_parts=True)
1126
1127 expected_index = MultiIndex.from_arrays(
1128 [[2, 2, 9, 9, 7, 7], [0, 1, 0, 1, 0, 1]],
1129 )
1130 expected_geometry = GeoSeries(
1131 [
1132 Point(0, 0),
1133 Point(0, 0),
1134 Point(1, 1),
1135 Point(1, 0),
1136 Point(2, 2),
1137 Point(2, 0),
1138 ],
1139 index=expected_index,
1140 )
1141 expected_df = GeoDataFrame(
1142 {"vals": [1, 1, 2, 2, 3, 3]},
1143 geometry=expected_geometry,
1144 index=expected_index,
1145 )
1146 assert_geodataframe_equal(test_df, expected_df)
1147
1148 def test_explode_order_no_multi(self):
1149 df = GeoDataFrame(
1150 {"vals": [1, 2, 3]},
1151 geometry=[Point(0, x) for x in range(3)],
1152 index=[2, 9, 7],
1153 )
1154 test_df = df.explode(index_parts=True)
1155
1156 expected_index = MultiIndex.from_arrays(
1157 [[2, 9, 7], [0, 0, 0]],
1158 )
1159 expected_df = GeoDataFrame(
1160 {"vals": [1, 2, 3]},
1161 geometry=[Point(0, x) for x in range(3)],
1162 index=expected_index,
1163 )
1164 assert_geodataframe_equal(test_df, expected_df)
1165
1166 def test_explode_order_mixed(self):
1167 df = GeoDataFrame(
1168 {"vals": [1, 2, 3]},
1169 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(2)] + [Point(0, 10)],
1170 index=[2, 9, 7],
1171 )
1172 test_df = df.explode(index_parts=True)
1173
1174 expected_index = MultiIndex.from_arrays(
1175 [[2, 2, 9, 9, 7], [0, 1, 0, 1, 0]],
1176 )
1177 expected_geometry = GeoSeries(
1178 [
1179 Point(0, 0),
1180 Point(0, 0),
1181 Point(1, 1),
1182 Point(1, 0),
1183 Point(0, 10),
1184 ],
1185 index=expected_index,
1186 )
1187 expected_df = GeoDataFrame(
1188 {"vals": [1, 1, 2, 2, 3]},
1189 geometry=expected_geometry,
1190 index=expected_index,
1191 )
1192 assert_geodataframe_equal(test_df, expected_df)
1193
1194 def test_explode_duplicated_index(self):
1195 df = GeoDataFrame(
1196 {"vals": [1, 2, 3]},
1197 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(3)],
1198 index=[1, 1, 2],
1199 )
1200 test_df = df.explode(index_parts=True)
1201 expected_index = MultiIndex.from_arrays(
1202 [[1, 1, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]],
1203 )
1204 expected_geometry = GeoSeries(
1205 [
1206 Point(0, 0),
1207 Point(0, 0),
1208 Point(1, 1),
1209 Point(1, 0),
1210 Point(2, 2),
1211 Point(2, 0),
1212 ],
1213 index=expected_index,
1214 )
1215 expected_df = GeoDataFrame(
1216 {"vals": [1, 1, 2, 2, 3, 3]},
1217 geometry=expected_geometry,
1218 index=expected_index,
1219 )
1220 assert_geodataframe_equal(test_df, expected_df)
1221
10921222 #
10931223 # Test '&', '|', '^', and '-'
10941224 #
10951225 def test_intersection_operator(self):
1096 with pytest.warns(DeprecationWarning):
1226 with pytest.warns(FutureWarning):
10971227 self._test_binary_operator("__and__", self.t1, self.g1, self.g2)
1098 with pytest.warns(DeprecationWarning):
1228 with pytest.warns(FutureWarning):
10991229 self._test_binary_operator("__and__", self.t1, self.gdf1, self.g2)
11001230
11011231 def test_union_operator(self):
1102 with pytest.warns(DeprecationWarning):
1232 with pytest.warns(FutureWarning):
11031233 self._test_binary_operator("__or__", self.sq, self.g1, self.g2)
1104 with pytest.warns(DeprecationWarning):
1234 with pytest.warns(FutureWarning):
11051235 self._test_binary_operator("__or__", self.sq, self.gdf1, self.g2)
11061236
11071237 def test_union_operator_polygon(self):
1108 with pytest.warns(DeprecationWarning):
1238 with pytest.warns(FutureWarning):
11091239 self._test_binary_operator("__or__", self.sq, self.g1, self.t2)
1110 with pytest.warns(DeprecationWarning):
1240 with pytest.warns(FutureWarning):
11111241 self._test_binary_operator("__or__", self.sq, self.gdf1, self.t2)
11121242
11131243 def test_symmetric_difference_operator(self):
1114 with pytest.warns(DeprecationWarning):
1244 with pytest.warns(FutureWarning):
11151245 self._test_binary_operator("__xor__", self.sq, self.g3, self.g4)
1116 with pytest.warns(DeprecationWarning):
1246 with pytest.warns(FutureWarning):
11171247 self._test_binary_operator("__xor__", self.sq, self.gdf3, self.g4)
11181248
11191249 def test_difference_series2(self):
11201250 expected = GeoSeries([GeometryCollection(), self.t2])
1121 with pytest.warns(DeprecationWarning):
1251 with pytest.warns(FutureWarning):
11221252 self._test_binary_operator("__sub__", expected, self.g1, self.g2)
1123 with pytest.warns(DeprecationWarning):
1253 with pytest.warns(FutureWarning):
11241254 self._test_binary_operator("__sub__", expected, self.gdf1, self.g2)
11251255
11261256 def test_difference_poly2(self):
11271257 expected = GeoSeries([self.t1, self.t1])
1128 with pytest.warns(DeprecationWarning):
1258 with pytest.warns(FutureWarning):
11291259 self._test_binary_operator("__sub__", expected, self.g1, self.t2)
1130 with pytest.warns(DeprecationWarning):
1260 with pytest.warns(FutureWarning):
11311261 self._test_binary_operator("__sub__", expected, self.gdf1, self.t2)
66 import numpy as np
77 from numpy.testing import assert_array_equal
88 import pandas as pd
9 from pandas.util.testing import assert_index_equal
9 from pandas.testing import assert_index_equal
1010
1111 from pyproj import CRS
1212 from shapely.geometry import (
2020 from shapely.geometry.base import BaseGeometry
2121
2222 from geopandas import GeoSeries, GeoDataFrame, read_file, datasets, clip
23 from geopandas._compat import PYPROJ_LT_3, ignore_shapely2_warnings
23 from geopandas._compat import ignore_shapely2_warnings
2424 from geopandas.array import GeometryArray, GeometryDtype
2525 from geopandas.testing import assert_geoseries_equal
2626
2727 from geopandas.tests.util import geom_equals
2828 from pandas.testing import assert_series_equal
2929 import pytest
30
31 import geopandas._compat as compat
3032
3133
3234 class TestSeries:
204206 self.landmarks.to_crs(crs=None, epsg=None)
205207
206208 def test_estimate_utm_crs__geographic(self):
207 if PYPROJ_LT_3:
209 if compat.PYPROJ_LT_3:
208210 with pytest.raises(RuntimeError, match=r"pyproj 3\+ required"):
209211 self.landmarks.estimate_utm_crs()
210212 else:
211213 assert self.landmarks.estimate_utm_crs() == CRS("EPSG:32618")
212 assert self.landmarks.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
213
214 @pytest.mark.skipif(PYPROJ_LT_3, reason="requires pyproj 3 or higher")
214 if compat.PYPROJ_GE_32: # result is unstable in older pyproj
215 assert self.landmarks.estimate_utm_crs("NAD83") == CRS("EPSG:26918")
216
217 @pytest.mark.skipif(compat.PYPROJ_LT_3, reason="requires pyproj 3 or higher")
215218 def test_estimate_utm_crs__projected(self):
216219 assert self.landmarks.to_crs("EPSG:3857").estimate_utm_crs() == CRS(
217220 "EPSG:32618"
218221 )
219222
220 @pytest.mark.skipif(PYPROJ_LT_3, reason="requires pyproj 3 or higher")
223 @pytest.mark.skipif(compat.PYPROJ_LT_3, reason="requires pyproj 3 or higher")
221224 def test_estimate_utm_crs__out_of_bounds(self):
222225 with pytest.raises(RuntimeError, match="Unable to determine UTM CRS"):
223226 GeoSeries(
224227 [Polygon([(0, 90), (1, 90), (2, 90)])], crs="EPSG:4326"
225228 ).estimate_utm_crs()
226229
227 @pytest.mark.skipif(PYPROJ_LT_3, reason="requires pyproj 3 or higher")
230 @pytest.mark.skipif(compat.PYPROJ_LT_3, reason="requires pyproj 3 or higher")
228231 def test_estimate_utm_crs__missing_crs(self):
229232 with pytest.raises(RuntimeError, match="crs must be set"):
230233 GeoSeries([Polygon([(0, 90), (1, 90), (2, 90)])]).estimate_utm_crs()
378381 assert_geoseries_equal(expected, GeoSeries.from_xy(x, y, z))
379382
380383
381 def test_missing_values_empty_warning():
382 s = GeoSeries([Point(1, 1), None, np.nan, BaseGeometry(), Polygon()])
383 with pytest.warns(UserWarning):
384 s.isna()
385
386 with pytest.warns(UserWarning):
387 s.notna()
388
389
390384 @pytest.mark.filterwarnings("ignore::UserWarning")
391385 def test_missing_values():
392386 s = GeoSeries([Point(1, 1), None, np.nan, BaseGeometry(), Polygon()])
411405
412406
413407 def test_isna_empty_geoseries():
414 # ensure that isna() result for emtpy GeoSeries has the correct bool dtype
408 # ensure that isna() result for empty GeoSeries has the correct bool dtype
415409 s = GeoSeries([])
416410 result = s.isna()
417411 assert_series_equal(result, pd.Series([], dtype="bool"))
469463 for x in gs:
470464 assert x.equals(g)
471465
472 def test_no_geometries_fallback(self):
473 with pytest.warns(FutureWarning):
474 s = GeoSeries([True, False, True])
475 assert not isinstance(s, GeoSeries)
476 assert type(s) == pd.Series
477
478 with pytest.warns(FutureWarning):
479 s = GeoSeries(["a", "b", "c"])
480 assert not isinstance(s, GeoSeries)
481 assert type(s) == pd.Series
482
483 with pytest.warns(FutureWarning):
484 s = GeoSeries([[1, 2], [3, 4]])
485 assert not isinstance(s, GeoSeries)
486 assert type(s) == pd.Series
466 def test_non_geometry_raises(self):
467 with pytest.raises(TypeError, match="Non geometry data passed to GeoSeries"):
468 GeoSeries([True, False, True])
469
470 with pytest.raises(TypeError, match="Non geometry data passed to GeoSeries"):
471 GeoSeries(["a", "b", "c"])
472
473 with pytest.raises(TypeError, match="Non geometry data passed to GeoSeries"):
474 GeoSeries([[1, 2], [3, 4]])
487475
488476 def test_empty(self):
489477 s = GeoSeries([])
499487 def test_empty_array(self):
500488 # with empty data that have an explicit dtype, we use the fallback or
501489 # not depending on the dtype
502 arr = np.array([], dtype="bool")
503490
504491 # dtypes that can never hold geometry-like data
505492 for arr in [
509496 # this gets converted to object dtype by pandas
510497 # np.array([], dtype="str"),
511498 ]:
512 with pytest.warns(FutureWarning):
513 s = GeoSeries(arr)
514 assert not isinstance(s, GeoSeries)
515 assert type(s) == pd.Series
499 with pytest.raises(
500 TypeError, match="Non geometry data passed to GeoSeries"
501 ):
502 GeoSeries(arr)
516503
517504 # dtypes that can potentially hold geometry-like data (object) or
518505 # can come from empty data (float64)
543530 assert s.index is g.index
544531
545532 # GH 1216
546 def test_expanddim(self):
533 @pytest.mark.parametrize("name", [None, "geometry", "Points"])
534 @pytest.mark.parametrize("crs", [None, "epsg:4326"])
535 def test_reset_index(self, name, crs):
547536 s = GeoSeries(
548 [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
537 [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])],
538 name=name,
539 crs=crs,
549540 )
550541 s = s.explode(index_parts=True)
551542 df = s.reset_index()
552543 assert type(df) == GeoDataFrame
544 # name None -> 0, otherwise name preserved
545 assert df.geometry.name == (name if name is not None else 0)
546 assert df.crs == s.crs
547
548 @pytest.mark.parametrize("name", [None, "geometry", "Points"])
549 @pytest.mark.parametrize("crs", [None, "epsg:4326"])
550 def test_to_frame(self, name, crs):
551 s = GeoSeries([Point(0, 0), Point(1, 1)], name=name, crs=crs)
552 df = s.to_frame()
553 assert type(df) == GeoDataFrame
554 # name None -> 0, otherwise name preserved
555 expected_name = name if name is not None else 0
556 assert df.geometry.name == expected_name
557 assert df._geometry_column_name == expected_name
558 assert df.crs == s.crs
559
560 # if name is provided to to_frame, it should override
561 df2 = s.to_frame(name="geom")
562 assert type(df) == GeoDataFrame
563 assert df2.geometry.name == "geom"
564 assert df2.crs == s.crs
553565
554566 def test_explode_without_multiindex(self):
555567 s = GeoSeries(
565577 )
566578 s = s.explode(ignore_index=True)
567579 expected_index = pd.Index(range(len(s)))
568 print(expected_index)
569580 assert_index_equal(s.index, expected_index)
570581
571582 # index_parts is ignored if ignore_index=True
00 import pandas as pd
11 import pytest
22 from geopandas.testing import assert_geodataframe_equal
3 from pandas.testing import assert_index_equal
34
45 from shapely.geometry import Point
56
67 from geopandas import GeoDataFrame, GeoSeries
8 from geopandas import _compat as compat
79
810
911 class TestMerging:
9698 res3 = pd.concat([df2.set_crs("epsg:4326"), self.gdf], axis=1)
9799 # check metadata comes from first df
98100 self._check_metadata(res3, geometry_column_name="geom", crs="epsg:4326")
101
102 @pytest.mark.xfail(
103 not compat.PANDAS_GE_11,
104 reason="pandas <=1.0 hard codes concat([GeoSeries, GeoSeries]) -> "
105 "DataFrame or Union[DataFrame, SparseDataFrame] in 0.25",
106 )
107 @pytest.mark.filterwarnings("ignore:Accessing CRS")
108 def test_concat_axis1_geoseries(self):
109 gseries2 = GeoSeries([Point(i, i) for i in range(3, 6)], crs="epsg:4326")
110 result = pd.concat([gseries2, self.gseries], axis=1)
111 # Note this is not consistent with concat([gdf, gdf], axis=1) where the
112 # left metadata is set on the result. This is deliberate for now.
113 assert type(result) is GeoDataFrame
114 self._check_metadata(result, geometry_column_name=None, crs=None)
115 assert_index_equal(pd.Index([0, 1]), result.columns)
116
117 gseries2.name = "foo"
118 result2 = pd.concat([gseries2, self.gseries], axis=1)
119 assert type(result2) is GeoDataFrame
120 self._check_metadata(result2, geometry_column_name=None, crs=None)
121 assert_index_equal(pd.Index(["foo", 0]), result2.columns)
0 import pandas as pd
1 import pyproj
2 import pytest
3 import geopandas._compat as compat
4
5 from shapely.geometry import Point
6 import numpy as np
7
8 from geopandas import GeoDataFrame, GeoSeries
9
10
11 crs_osgb = pyproj.CRS(27700)
12 crs_wgs = pyproj.CRS(4326)
13
14
15 N = 10
16
17
18 @pytest.fixture(params=["geometry", "point"])
19 def df(request):
20 geo_name = request.param
21
22 df = GeoDataFrame(
23 [
24 {
25 "value1": x + y,
26 "value2": x * y,
27 geo_name: Point(x, y), # rename this col in tests
28 }
29 for x, y in zip(range(N), range(N))
30 ],
31 crs=crs_wgs,
32 geometry=geo_name,
33 )
34 # want geometry2 to be a GeoSeries, not a Series; test behaviour of a non-geometry col
35 df["geometry2"] = df[geo_name].set_crs(crs_osgb, allow_override=True)
36 return df
37
38
39 @pytest.fixture
40 def df2():
41 """For constructor_sliced tests"""
42 return GeoDataFrame(
43 {
44 "geometry": GeoSeries([Point(x, x) for x in range(3)]),
45 "geometry2": GeoSeries([Point(x, x) for x in range(3)]),
46 "geometry3": GeoSeries([Point(x, x) for x in range(3)]),
47 "value": [1, 2, 1],
48 "value_nan": np.nan,
49 }
50 )
51
52
53 def _check_metadata_gdf(gdf, geo_name="geometry", crs=crs_wgs):
54 assert gdf._geometry_column_name == geo_name
55 assert gdf.geometry.name == geo_name
56 assert gdf.crs == crs
57
58
59 def _check_metadata_gs(gs, name="geometry", crs=crs_wgs):
60 assert gs.name == name
61 assert gs.crs == crs
62
63
64 def assert_object(
65 result, expected_type, geo_name="geometry", crs=crs_wgs, check_none_name=False
66 ):
67 """
68 Helper method to make tests easier to read. Checks result is of the expected
69 type. If result is a GeoDataFrame or GeoSeries, checks geo_name
70 and crs match. If geo_name is None, then we expect a GeoDataFrame
71 where the geometry column is invalid or isn't set. This is never desirable,
72 but is a reality of this first stage of implementation.
73 """
74 assert type(result) is expected_type
75
76 if expected_type == GeoDataFrame:
77 if geo_name is not None:
78 _check_metadata_gdf(result, geo_name=geo_name, crs=crs)
79 else:
80 if check_none_name: # TODO this is awkward
81 assert result._geometry_column_name is None
82
83 if result._geometry_column_name is None:
84 msg = (
85 "You are calling a geospatial method on the GeoDataFrame, "
86 "but the active"
87 )
88 else:
89 msg = (
90 "You are calling a geospatial method on the GeoDataFrame, but "
91 r"the active geometry column \("
92 rf"'{result._geometry_column_name}'\) is not present"
93 )
94 with pytest.raises(AttributeError, match=msg):
95 result.geometry.name # be explicit that geometry is invalid here
96 elif expected_type == GeoSeries:
97 _check_metadata_gs(result, name=geo_name, crs=crs)
98
99
100 def test_getitem(df):
101 geo_name = df.geometry.name
102 assert_object(df[["value1", "value2"]], pd.DataFrame)
103 assert_object(df[[geo_name, "geometry2"]], GeoDataFrame, geo_name)
104 assert_object(df[[geo_name]], GeoDataFrame, geo_name)
105 assert_object(df[["geometry2", "value1"]], GeoDataFrame, None)
106 assert_object(df[["geometry2"]], GeoDataFrame, None)
107 assert_object(df[["value1"]], pd.DataFrame)
108 # Series
109 assert_object(df[geo_name], GeoSeries, geo_name)
110 assert_object(df["geometry2"], GeoSeries, "geometry2", crs=crs_osgb)
111 assert_object(df["value1"], pd.Series)
112
113
114 def test_loc(df):
115 geo_name = df.geometry.name
116 assert_object(df.loc[:, ["value1", "value2"]], pd.DataFrame)
117 assert_object(df.loc[:, [geo_name, "geometry2"]], GeoDataFrame, geo_name)
118 assert_object(df.loc[:, [geo_name]], GeoDataFrame, geo_name)
119 assert_object(df.loc[:, ["geometry2", "value1"]], GeoDataFrame, None)
120 assert_object(df.loc[:, ["geometry2"]], GeoDataFrame, None)
121 assert_object(df.loc[:, ["value1"]], pd.DataFrame)
122 # Series
123 assert_object(df.loc[:, geo_name], GeoSeries, geo_name)
124 assert_object(df.loc[:, "geometry2"], GeoSeries, "geometry2", crs=crs_osgb)
125 assert_object(df.loc[:, "value1"], pd.Series)
126
127
128 def test_iloc(df):
129 geo_name = df.geometry.name
130 assert_object(df.iloc[:, 0:2], pd.DataFrame)
131 assert_object(df.iloc[:, 2:4], GeoDataFrame, geo_name)
132 assert_object(df.iloc[:, [2]], GeoDataFrame, geo_name)
133 assert_object(df.iloc[:, [3, 0]], GeoDataFrame, None)
134 assert_object(df.iloc[:, [3]], GeoDataFrame, None)
135 assert_object(df.iloc[:, [0]], pd.DataFrame)
136 # Series
137 assert_object(df.iloc[:, 2], GeoSeries, geo_name)
138 assert_object(df.iloc[:, 3], GeoSeries, "geometry2", crs=crs_osgb)
139 assert_object(df.iloc[:, 0], pd.Series)
140
141
142 def test_squeeze(df):
143 geo_name = df.geometry.name
144 assert_object(df[[geo_name]].squeeze(), GeoSeries, geo_name)
145 assert_object(df[["geometry2"]].squeeze(), GeoSeries, "geometry2", crs=crs_osgb)
146
147
148 def test_to_frame(df):
149 geo_name = df.geometry.name
150 res1 = df[geo_name].to_frame()
151 assert_object(res1, GeoDataFrame, geo_name, crs=df[geo_name].crs)
152
153 res2 = df["geometry2"].to_frame()
154 assert_object(res2, GeoDataFrame, "geometry2", crs=crs_osgb)
155
156 res3 = df["value1"].to_frame()
157 assert_object(res3, pd.DataFrame)
158
159
160 def test_reindex(df):
161 geo_name = df.geometry.name
162 assert_object(df.reindex(columns=["value1", "value2"]), pd.DataFrame)
163 assert_object(df.reindex(columns=[geo_name, "geometry2"]), GeoDataFrame, geo_name)
164 assert_object(df.reindex(columns=[geo_name]), GeoDataFrame, geo_name)
165 assert_object(df.reindex(columns=["new_col", geo_name]), GeoDataFrame, geo_name)
166 assert_object(df.reindex(columns=["geometry2", "value1"]), GeoDataFrame, None)
167 assert_object(df.reindex(columns=["geometry2"]), GeoDataFrame, None)
168 assert_object(df.reindex(columns=["value1"]), pd.DataFrame)
169
170 # reindexing the rows always preserves the GeoDataFrame
171 assert_object(df.reindex(index=[0, 1, 20]), GeoDataFrame, geo_name)
172
173 # reindexing both rows and columns
174 assert_object(
175 df.reindex(index=[0, 1, 20], columns=[geo_name]), GeoDataFrame, geo_name
176 )
177 assert_object(df.reindex(index=[0, 1, 20], columns=["value1"]), pd.DataFrame)
178
179
180 def test_drop(df):
181 geo_name = df.geometry.name
182 assert_object(df.drop(columns=[geo_name, "geometry2"]), pd.DataFrame)
183 assert_object(df.drop(columns=["value1", "value2"]), GeoDataFrame, geo_name)
184 cols = ["value1", "value2", "geometry2"]
185 assert_object(df.drop(columns=cols), GeoDataFrame, geo_name)
186 assert_object(df.drop(columns=[geo_name, "value2"]), GeoDataFrame, None)
187 assert_object(df.drop(columns=["value1", "value2", geo_name]), GeoDataFrame, None)
188 assert_object(df.drop(columns=["geometry2", "value2", geo_name]), pd.DataFrame)
189
190
191 def test_apply(df):
192 geo_name = df.geometry.name
193
194 def identity(x):
195 return x
196
197 # axis = 0
198 assert_object(df[["value1", "value2"]].apply(identity), pd.DataFrame)
199 assert_object(df[[geo_name, "geometry2"]].apply(identity), GeoDataFrame, geo_name)
200 assert_object(df[[geo_name]].apply(identity), GeoDataFrame, geo_name)
201 assert_object(df[["geometry2", "value1"]].apply(identity), GeoDataFrame, None, None)
202 assert_object(df[["geometry2"]].apply(identity), GeoDataFrame, None, None)
203 assert_object(df[["value1"]].apply(identity), pd.DataFrame)
204
205 # axis = 0, Series
206 assert_object(df[geo_name].apply(identity), GeoSeries, geo_name)
207 assert_object(df["geometry2"].apply(identity), GeoSeries, "geometry2", crs=crs_osgb)
208 assert_object(df["value1"].apply(identity), pd.Series)
209
210 # axis = 0, Series, no longer geometry
211 assert_object(df[geo_name].apply(lambda x: str(x)), pd.Series)
212 assert_object(df["geometry2"].apply(lambda x: str(x)), pd.Series)
213
214 # axis = 1
215 assert_object(df[["value1", "value2"]].apply(identity, axis=1), pd.DataFrame)
216 assert_object(
217 df[[geo_name, "geometry2"]].apply(identity, axis=1), GeoDataFrame, geo_name
218 )
219 assert_object(df[[geo_name]].apply(identity, axis=1), GeoDataFrame, geo_name)
220 # TODO: the result below should be a GeoDataFrame, to be consistent with the
221 # new getitem logic; left as a follow-up since it is quite complicated:
222 # FrameColumnApply.series_generator returns object-dtype Series, so the
223 # result of apply would have to be patched
224 assert_object(df[["geometry2", "value1"]].apply(identity, axis=1), pd.DataFrame)
225
226 assert_object(df[["value1"]].apply(identity, axis=1), pd.DataFrame)
227
228
229 @pytest.mark.xfail(not compat.PANDAS_GE_11, reason="apply is different in pandas 1.0.5")
230 def test_apply_axis1_secondary_geo_cols(df):
231 # note #GH2436 would also fix this
232 def identity(x):
233 return x
234
235 assert_object(df[["geometry2"]].apply(identity, axis=1), GeoDataFrame, None, None)
236
237
238 def test_expanddim_in_apply():
239 # https://github.com/geopandas/geopandas/pull/2296#issuecomment-1021966443
240 s = GeoSeries.from_xy([0, 1], [0, 1])
241 result = s.apply(lambda x: pd.Series([x.x, x.y]))
242 assert_object(result, pd.DataFrame)
243
244
245 @pytest.mark.xfail(
246 not compat.PANDAS_GE_11,
247 reason="pandas <1.1 don't preserve subclass through groupby ops", # Pandas GH33884
248 )
249 def test_expandim_in_groupby_aggregate_multiple_funcs():
250 # https://github.com/geopandas/geopandas/pull/2296#issuecomment-1021966443
251 # There are two calls to _constructor_expanddim here
252 # SeriesGroupBy._aggregate_multiple_funcs() and
253 # SeriesGroupBy._wrap_series_output() len(output) > 1
254
255 s = GeoSeries.from_xy([0, 1, 2], [0, 1, 3])
256
257 def union(s):
258 return s.unary_union
259
260 def total_area(s):
261 return s.area.sum()
262
263 grouped = s.groupby([0, 1, 0])
264 agg = grouped.agg([total_area, union])
265 assert_object(agg, GeoDataFrame, None, None, check_none_name=True)
266 result = grouped.agg([union, total_area])
267 assert_object(result, GeoDataFrame, None, None, check_none_name=True)
268 assert_object(grouped.agg([total_area, total_area]), pd.DataFrame)
269 assert_object(grouped.agg([total_area]), pd.DataFrame)
270
271
272 @pytest.mark.xfail(
273 not compat.PANDAS_GE_11,
274 reason="pandas <1.1 uses concat([Series]) in unstack", # Pandas GH33356
275 )
276 def test_expanddim_in_unstack():
277 # https://github.com/geopandas/geopandas/pull/2296#issuecomment-1021966443
278 s = GeoSeries.from_xy(
279 [0, 1, 2],
280 [0, 1, 3],
281 index=pd.MultiIndex.from_tuples([("A", "a"), ("A", "b"), ("B", "a")]),
282 )
283 unstack = s.unstack()
284 assert_object(unstack, GeoDataFrame, None, None, False)
285
286 if compat.PANDAS_GE_12:
287 assert unstack._geometry_column_name is None
288 else: # pandas GH37369, unstack doesn't call finalize
289 assert unstack._geometry_column_name == "geometry"
290
291
292 # indexing / constructor_sliced tests
293
294 test_case_column_sets = [
295 ["geometry"],
296 ["geometry2"],
297 ["geometry", "geometry2"],
298 # non active geo col case
299 ["geometry", "value"],
300 ["geometry", "value_nan"],
301 ["geometry2", "value"],
302 ["geometry2", "value_nan"],
303 ]
304
305
306 @pytest.mark.parametrize(
307 "column_set",
308 test_case_column_sets,
309 ids=[", ".join(i) for i in test_case_column_sets],
310 )
311 def test_constructor_sliced_row_slices(df2, column_set):
312 # https://github.com/geopandas/geopandas/issues/2282
313 df_subset = df2[column_set]
314 assert isinstance(df_subset, GeoDataFrame)
315 res = df_subset.loc[0]
316 # row slices shouldn't be GeoSeries, even if they have a geometry col
317 assert type(res) == pd.Series
318 if "geometry" in column_set:
319 assert not isinstance(res.geometry, pd.Series)
320 assert res.geometry == Point(0, 0)
321
322
323 def test_constructor_sliced_column_slices(df2):
324 # Note loc doesn't use _constructor_sliced so it's not tested here
325 geo_idx = df2.columns.get_loc("geometry")
326 sub = df2.head(1)
327 # column slices should be GeoSeries if of geometry type
328 assert type(sub.iloc[:, geo_idx]) == GeoSeries
329 assert type(sub.iloc[[0], geo_idx]) == GeoSeries
330 sub = df2.head(2)
331 assert type(sub.iloc[:, geo_idx]) == GeoSeries
332 assert type(sub.iloc[[0, 1], geo_idx]) == GeoSeries
333
334 # check iloc row slices are pd.Series instead
335 assert type(df2.iloc[0, :]) == pd.Series
336
337
338 def test_constructor_sliced_in_pandas_methods(df2):
339 # _constructor_sliced is used in many places; check that a sample of
340 # non-geometry cases behaves sensibly
341 assert type(df2.count()) == pd.Series
342 # drop the secondary geometry columns as not hashable
343 hashable_test_df = df2.drop(columns=["geometry2", "geometry3"])
344 assert type(hashable_test_df.duplicated()) == pd.Series
345 assert type(df2.quantile()) == pd.Series
346 assert type(df2.memory_usage()) == pd.Series
00 import os
1 from distutils.version import LooseVersion
1 from packaging.version import Version
22
33 import numpy as np
44 import pandas as pd
55
66 from shapely.geometry import Point, Polygon, LineString, GeometryCollection, box
7 from fiona.errors import DriverError
87
98 import geopandas
109 from geopandas import GeoDataFrame, GeoSeries, overlay, read_file
1312 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
1413 import pytest
1514
15 try:
16 from fiona.errors import DriverError
17 except ImportError:
18
19 class DriverError(Exception):
20 pass
21
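The same import-guard pattern extends to other optional IO backends; a hypothetical parallel for pyogrio, whose DataSourceError appears to surface as the RuntimeError caught near the end of this file:

try:
    from pyogrio.errors import DataSourceError
except ImportError:

    class DataSourceError(Exception):
        pass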
22
1623 DATA = os.path.join(os.path.abspath(os.path.dirname(__file__)), "data", "overlay")
1724
1825
1926 pytestmark = pytest.mark.skip_no_sindex
20 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
27 pandas_133 = Version(pd.__version__) == Version("1.3.3")
2128
2229
2330 @pytest.fixture
8087 os.path.join(DATA, "polys", "df1_df2-{0}.geojson".format(name))
8188 )
8289 expected.crs = None
90 for col in expected.columns[expected.dtypes == "int32"]:
91 expected[col] = expected[col].astype("int64")
8392 return expected
8493
8594 if how == "identity":
528537 except OSError: # fiona < 1.8
529538 assert result.empty
530539
540 except RuntimeError: # pyogrio.DataSourceError
541 assert result.empty
542
531543
532544 def test_mixed_geom_error():
533545 polys1 = GeoSeries(
794806 expected = GeoDataFrame(columns=["foo", "bar", "geometry"])
795807 result = overlay(gdf1, gdf2, how="intersection")
796808 assert_geodataframe_equal(result, expected, check_index_type=False)
809
810
811 class TestOverlayWikiExample:
812 def setup_method(self):
813 self.layer_a = GeoDataFrame(geometry=[box(0, 2, 6, 6)])
814
815 self.layer_b = GeoDataFrame(geometry=[box(4, 0, 10, 4)])
816
817 self.intersection = GeoDataFrame(geometry=[box(4, 2, 6, 4)])
818
819 self.union = GeoDataFrame(
820 geometry=[
821 box(4, 2, 6, 4),
822 Polygon([(4, 2), (0, 2), (0, 6), (6, 6), (6, 4), (4, 4), (4, 2)]),
823 Polygon([(10, 0), (4, 0), (4, 2), (6, 2), (6, 4), (10, 4), (10, 0)]),
824 ]
825 )
826
827 self.a_difference_b = GeoDataFrame(
828 geometry=[Polygon([(4, 2), (0, 2), (0, 6), (6, 6), (6, 4), (4, 4), (4, 2)])]
829 )
830
831 self.b_difference_a = GeoDataFrame(
832 geometry=[
833 Polygon([(10, 0), (4, 0), (4, 2), (6, 2), (6, 4), (10, 4), (10, 0)])
834 ]
835 )
836
837 self.symmetric_difference = GeoDataFrame(
838 geometry=[
839 Polygon([(4, 2), (0, 2), (0, 6), (6, 6), (6, 4), (4, 4), (4, 2)]),
840 Polygon([(10, 0), (4, 0), (4, 2), (6, 2), (6, 4), (10, 4), (10, 0)]),
841 ]
842 )
843
844 self.a_identity_b = GeoDataFrame(
845 geometry=[
846 box(4, 2, 6, 4),
847 Polygon([(4, 2), (0, 2), (0, 6), (6, 6), (6, 4), (4, 4), (4, 2)]),
848 ]
849 )
850
851 self.b_identity_a = GeoDataFrame(
852 geometry=[
853 box(4, 2, 6, 4),
854 Polygon([(10, 0), (4, 0), (4, 2), (6, 2), (6, 4), (10, 4), (10, 0)]),
855 ]
856 )
857
858 def test_intersection(self):
859 df_result = overlay(self.layer_a, self.layer_b, how="intersection")
860 assert df_result.geom_equals(self.intersection).bool()
861
862 def test_union(self):
863 df_result = overlay(self.layer_a, self.layer_b, how="union")
864 assert_geodataframe_equal(df_result, self.union)
865
866 def test_a_difference_b(self):
867 df_result = overlay(self.layer_a, self.layer_b, how="difference")
868 assert_geodataframe_equal(df_result, self.a_difference_b)
869
870 def test_b_difference_a(self):
871 df_result = overlay(self.layer_b, self.layer_a, how="difference")
872 assert_geodataframe_equal(df_result, self.b_difference_a)
873
874 def test_symmetric_difference(self):
875 df_result = overlay(self.layer_a, self.layer_b, how="symmetric_difference")
876 assert_geodataframe_equal(df_result, self.symmetric_difference)
877
878 def test_a_identity_b(self):
879 df_result = overlay(self.layer_a, self.layer_b, how="identity")
880 assert_geodataframe_equal(df_result, self.a_identity_b)
881
882 def test_b_identity_a(self):
883 df_result = overlay(self.layer_b, self.layer_a, how="identity")
884 assert_geodataframe_equal(df_result, self.b_identity_a)
8787
8888 def test_indexing(s, df):
8989
90 # accessing scalar from the geometry (colunm)
90 # accessing scalar from the geometry (column)
9191 exp = Point(1, 1)
9292 assert s[1] == exp
9393 assert s.loc[1] == exp
138138 assert isinstance(res.geometry, GeoSeries)
139139 assert_frame_equal(res, df[["value1", "geometry"]])
140140
141 # TODO df.reindex(columns=['value1', 'value2']) still returns GeoDataFrame,
142 # should it return DataFrame instead ?
141 res = df.reindex(columns=["value1", "value2"])
142 assert type(res) == pd.DataFrame
143 assert_frame_equal(res, df[["value1", "value2"]])
143144
144145
145146 def test_take(s, df):
240241 res = df.astype({"value1": float})
241242 assert isinstance(res, GeoDataFrame)
242243
243 # check whether returned object is a datafrane
244 # check whether returned object is a dataframe
244245 res = df.astype(str)
245246 assert isinstance(res, pd.DataFrame) and not isinstance(res, GeoDataFrame)
246247
261262 assert res["a"].dtype == object
262263
263264
264 @pytest.mark.xfail(
265 not compat.PANDAS_GE_10,
266 reason="Convert dtypes new in pandas 1.0",
267 raises=NotImplementedError,
268 )
269265 def test_convert_dtypes(df):
270266 # https://github.com/geopandas/geopandas/issues/1870
271267
272268 # Test geometry col is first col, first, geom_col_name=geometry
273269 # (order is important in concat, used internally)
274 res1 = df.convert_dtypes() # note res1 done first for pandas < 1 xfail check
270 res1 = df.convert_dtypes()
275271
276272 expected1 = GeoDataFrame(
277273 pd.DataFrame(df).convert_dtypes(), crs=df.crs, geometry=df.geometry.name
382378 df2["geometry"] = s2
383379 res = df2.fillna(Point(1, 1))
384380 assert_geodataframe_equal(res, df)
385 with pytest.raises(NotImplementedError):
381 with pytest.raises((NotImplementedError, TypeError)): # GH2351
386382 df2.fillna(0)
387383
388384 # allow non-geometry fill value if there are no missing values
443439 assert_array_equal(s.unique(), exp)
444440
445441
442 def pd14_compat_index(index):
443 if compat.PANDAS_GE_14:
444 return from_shapely(index)
445 else:
446 return index
447
448
446449 def test_value_counts():
447450 # each object is considered unique
448451 s = GeoSeries([Point(0, 0), Point(1, 1), Point(0, 0)])
449452 res = s.value_counts()
450453 with compat.ignore_shapely2_warnings():
451 exp = pd.Series([2, 1], index=[Point(0, 0), Point(1, 1)])
454 exp = pd.Series([2, 1], index=pd14_compat_index([Point(0, 0), Point(1, 1)]))
452455 assert_series_equal(res, exp)
453456 # Check crs doesn't make a difference - note it is not kept in output index anyway
454457 s2 = GeoSeries([Point(0, 0), Point(1, 1), Point(0, 0)], crs="EPSG:4326")
455458 res2 = s2.value_counts()
456459 assert_series_equal(res2, exp)
460 if compat.PANDAS_GE_14:
461 # TODO: should/can we fix the CRS being lost?
462 assert s2.value_counts().index.array.crs is None
457463
458464 # check mixed geometry
459465 s3 = GeoSeries([Point(0, 0), LineString([[1, 1], [2, 2]]), Point(0, 0)])
460466 res3 = s3.value_counts()
467 index = pd14_compat_index([Point(0, 0), LineString([[1, 1], [2, 2]])])
461468 with compat.ignore_shapely2_warnings():
462 exp3 = pd.Series([2, 1], index=[Point(0, 0), LineString([[1, 1], [2, 2]])])
469 exp3 = pd.Series([2, 1], index=index)
463470 assert_series_equal(res3, exp3)
464471
465472 # check None is handled
466473 s4 = GeoSeries([Point(0, 0), None, Point(0, 0)])
467474 res4 = s4.value_counts(dropna=True)
468475 with compat.ignore_shapely2_warnings():
469 exp4_dropna = pd.Series([2], index=[Point(0, 0)])
476 exp4_dropna = pd.Series([2], index=pd14_compat_index([Point(0, 0)]))
470477 assert_series_equal(res4, exp4_dropna)
471478 with compat.ignore_shapely2_warnings():
472 exp4_keepna = pd.Series([2, 1], index=[Point(0, 0), None])
479 exp4_keepna = pd.Series([2, 1], index=pd14_compat_index([Point(0, 0), None]))
473480 res4_keepna = s4.value_counts(dropna=False)
474481 assert_series_equal(res4_keepna, exp4_keepna)
475482
545552 assert_frame_equal(res, exp)
546553
547554
555 @pytest.mark.skip_no_sindex
556 @pytest.mark.skipif(
557 compat.PANDAS_GE_13 and not compat.PANDAS_GE_14,
558 reason="this was broken in pandas 1.3.5 (GH-2294)",
559 )
560 @pytest.mark.parametrize("crs", [None, "EPSG:4326"])
561 def test_groupby_metadata(crs):
562 # https://github.com/geopandas/geopandas/issues/2294
563 df = GeoDataFrame(
564 {
565 "geometry": [Point(0, 0), Point(1, 1), Point(0, 0)],
566 "value1": np.arange(3, dtype="int64"),
567 "value2": np.array([1, 2, 1], dtype="int64"),
568 },
569 crs=crs,
570 )
571
572 # dummy test asserting we can access the crs
573 def func(group):
574 assert isinstance(group, GeoDataFrame)
575 assert group.crs == crs
576
577 df.groupby("value2").apply(func)
578
579 # actual test with functionality
580 res = df.groupby("value2").apply(
581 lambda x: geopandas.sjoin(x, x[["geometry", "value1"]], how="inner")
582 )
583
584 expected = (
585 df.take([0, 2, 0, 2, 1])
586 .set_index("value2", drop=False, append=True)
587 .swaplevel()
588 .rename(columns={"value1": "value1_left"})
589 .assign(value1_right=[0, 0, 2, 2, 1])
590 )
591 assert_geodataframe_equal(res.drop(columns=["index_right"]), expected)
592
593
548594 def test_apply(s):
549595 # function that returns geometry preserves GeoSeries class
550596 def geom_func(geom):
586632 if crs:
587633 df = df.set_crs(crs)
588634 result = df.apply(lambda col: col.astype(str), axis=0)
589 # TODO this should actually not return a GeoDataFrame
590 assert isinstance(result, GeoDataFrame)
635 assert type(result) is pd.DataFrame
591636 expected = df.astype(str)
592637 assert_frame_equal(result, expected)
593638
594639 result = df.apply(lambda col: col.astype(str), axis=1)
595 assert isinstance(result, GeoDataFrame)
640 assert type(result) is pd.DataFrame
596641 assert_frame_equal(result, expected)
597642
598643
602647 assert result.geometry.name == "geom"
603648
604649
605 @pytest.mark.skipif(not compat.PANDAS_GE_10, reason="attrs introduced in pandas 1.0")
650 def test_df_apply_returning_series(df):
651 # https://github.com/geopandas/geopandas/issues/2283
652 result = df.apply(lambda row: row.geometry, axis=1)
653 assert_geoseries_equal(result, df.geometry, check_crs=False)
654
655 result = df.apply(lambda row: row.value1, axis=1)
656 assert_series_equal(result, df["value1"].rename(None))
657
658
606659 def test_preserve_attrs(df):
607660 # https://github.com/geopandas/geopandas/issues/1654
608661 df.attrs["name"] = "my_name"
0 from distutils.version import LooseVersion
0 from packaging.version import Version
11 import itertools
22 import warnings
33
1515 MultiPoint,
1616 MultiLineString,
1717 GeometryCollection,
18 box,
1819 )
1920
2021
3233 try: # skipif and importorskip do not work for decorators
3334 from matplotlib.testing.decorators import check_figures_equal
3435
35 if matplotlib.__version__ >= LooseVersion("3.3.0"):
36 if Version(matplotlib.__version__) >= Version("3.3.0"):
3637
3738 MPL_DECORATORS = True
3839 else:
302303 with pytest.warns(UserWarning):
303304 ax = s.plot()
304305 assert len(ax.collections) == 0
305 df = GeoDataFrame([])
306 df = GeoDataFrame([], columns=["geometry"])
306307 with pytest.warns(UserWarning):
307308 ax = df.plot()
308309 assert len(ax.collections) == 0
403404 ):
404405 self.df.plot(column="cats", categories=["cat1"])
405406
406 def test_misssing(self):
407 def test_missing(self):
407408 self.df.loc[0, "values"] = np.nan
408409 ax = self.df.plot("values")
409410 cmap = plt.get_cmap()
426427 leg_colors1 = ax.get_legend().axes.collections[1].get_facecolors()
427428 np.testing.assert_array_equal(point_colors[0], leg_colors[0])
428429 np.testing.assert_array_equal(nan_color[0], leg_colors1[0])
430
431 def test_no_missing_and_missing_kwds(self):
432 # GH2210
433 df = self.df.copy()
434 df["category"] = df["values"].astype("str")
435 df.plot("category", missing_kwds={"facecolor": "none"}, legend=True)
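 
The new test above (GH2210) guards the case where missing_kwds is passed but the column has no missing values; the usual missing_kwds pattern it protects looks roughly like this (a sketch; gdf and its "values" column stand in for the test fixture):

    import numpy as np

    gdf.loc[0, "values"] = np.nan
    # rows with missing values get the given style and their own legend entry
    ax = gdf.plot("values", missing_kwds={"color": "lightgrey"}, legend=True)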
429436
430437
431438 class TestPointZPlotting:
862869 def test_plot(self):
863870 # basic test that points with z coords don't break plotting
864871 self.df.plot()
872
873
874 class TestColorParamArray:
875 def setup_method(self):
876 geom = []
877 color = []
878 for a, b in [(0, 2), (4, 6)]:
879 b = box(a, a, b, b)
880 geom += [b, b.buffer(0.8).exterior, b.centroid]
881 color += ["red", "green", "blue"]
882
883 self.gdf = GeoDataFrame({"geometry": geom, "color_rgba": color})
884 self.mgdf = self.gdf.dissolve(self.gdf.type)
885
886 def test_color_single(self):
887 ax = self.gdf.plot(color=self.gdf["color_rgba"])
888
889 _check_colors(
890 4,
891 np.concatenate([c.get_edgecolor() for c in ax.collections]),
892 ["green"] * 2 + ["blue"] * 2,
893 )
894 _check_colors(
895 4,
896 np.concatenate([c.get_facecolor() for c in ax.collections]),
897 ["red"] * 2 + ["blue"] * 2,
898 )
899
900 def test_color_multi(self):
901 ax = self.mgdf.plot(color=self.mgdf["color_rgba"])
902
903 _check_colors(
904 4,
905 np.concatenate([c.get_edgecolor() for c in ax.collections]),
906 ["green"] * 2 + ["blue"] * 2,
907 )
908 _check_colors(
909 4,
910 np.concatenate([c.get_facecolor() for c in ax.collections]),
911 ["red"] * 2 + ["blue"] * 2,
912 )
865913
866914
867915 class TestGeometryCollectionPlotting:
10631111 import mapclassify # noqa
10641112 except ImportError:
10651113 pytest.importorskip("mapclassify")
1114 cls.mc = mapclassify
10661115 cls.classifiers = list(mapclassify.classifiers.CLASSIFIERS)
10671116 cls.classifiers.remove("UserDefined")
10681117 pth = get_path("naturalearth_lowres")
10831132 )
10841133 labels = [t.get_text() for t in ax.get_legend().get_texts()]
10851134 expected = [
1086 u" 140.00, 5217064.00",
1087 u" 5217064.00, 19532732.33",
1088 u" 19532732.33, 1379302771.00",
1135 s.split("|")[0][1:-2]
1136 for s in str(self.mc.Quantiles(self.df["pop_est"], k=3)).split("\n")[4:]
10891137 ]
10901138 assert labels == expected
10911139
11181166 column="NEGATIVES", scheme="FISHER_JENKS", k=3, cmap="OrRd", legend=True
11191167 )
11201168 labels = [t.get_text() for t in ax.get_legend().get_texts()]
1121 expected = [u"-10.00, -3.41", u" -3.41, 3.30", u" 3.30, 10.00"]
1169 expected = ["-10.00, -3.41", " -3.41, 3.30", " 3.30, 10.00"]
11221170 assert labels == expected
11231171
11241172 def test_fmt(self):
11311179 legend_kwds={"fmt": "{:.0f}"},
11321180 )
11331181 labels = [t.get_text() for t in ax.get_legend().get_texts()]
1134 expected = [u"-10, -3", u" -3, 3", u" 3, 10"]
1182 expected = ["-10, -3", " -3, 3", " 3, 10"]
11351183 assert labels == expected
11361184
11371185 def test_interval(self):
11441192 legend_kwds={"interval": True},
11451193 )
11461194 labels = [t.get_text() for t in ax.get_legend().get_texts()]
1147 expected = [u"[-10.00, -3.41]", u"( -3.41, 3.30]", u"( 3.30, 10.00]"]
1195 expected = ["[-10.00, -3.41]", "( -3.41, 3.30]", "( 3.30, 10.00]"]
11481196 assert labels == expected
11491197
11501198 @pytest.mark.parametrize("scheme", ["FISHER_JENKS", "FISHERJENKS"])
11671215 legend=True,
11681216 )
11691217 labels = [t.get_text() for t in ax.get_legend().get_texts()]
1170 expected = [" 140.00, 9961396.00", " 9961396.00, 1379302771.00"]
1218 expected = [
1219 s.split("|")[0][1:-2]
1220 for s in str(self.mc.Percentiles(self.df["pop_est"], pct=[50, 100])).split(
1221 "\n"
1222 )[4:]
1223 ]
1224
11711225 assert labels == expected
11721226
11731227 def test_invalid_scheme(self):
13771431 )
13781432
13791433 def test_points(self):
1380 # failing with matplotlib 1.4.3 (edge stays black even when specified)
1381 pytest.importorskip("matplotlib", "1.5.0")
1382
13831434 from geopandas.plotting import _plot_point_collection, plot_point_collection
13841435 from matplotlib.collections import PathCollection
13851436
14341485 with pytest.raises((TypeError, ValueError)):
14351486 _plot_point_collection(ax, self.points, color="not color")
14361487
1437 # check DeprecationWarning
1438 with pytest.warns(DeprecationWarning):
1488 # check FutureWarning
1489 with pytest.warns(FutureWarning):
14391490 plot_point_collection(ax, self.points)
14401491
14411492 def test_points_values(self):
15071558 # not a color
15081559 with pytest.raises((TypeError, ValueError)):
15091560 _plot_linestring_collection(ax, self.lines, color="not color")
1510 # check DeprecationWarning
1511 with pytest.warns(DeprecationWarning):
1561 # check FutureWarning
1562 with pytest.warns(FutureWarning):
15121563 plot_linestring_collection(ax, self.lines)
15131564
15141565 def test_linestrings_values(self):
15991650 # not a color
16001651 with pytest.raises((TypeError, ValueError)):
16011652 _plot_polygon_collection(ax, self.polygons, color="not color")
1602 # check DeprecationWarning
1603 with pytest.warns(DeprecationWarning):
1653 # check FutureWarning
1654 with pytest.warns(FutureWarning):
16041655 plot_polygon_collection(ax, self.polygons)
16051656
16061657 def test_polygons_values(self):
18361887 Previously, we did `fig.axes[1]`, but in matplotlib 3.4 the order switched
18371888 and the colorbar ax was first and subplot ax second.
18381889 """
1839 if matplotlib.__version__ < LooseVersion("3.0.0"):
1840 if label == "<colorbar>":
1841 return fig.axes[1]
1842 elif label == "":
1843 return fig.axes[0]
18441890 for ax in fig.axes:
18451891 if ax.get_label() == label:
18461892 return ax
172172 assert subset1.sindex is original_index
173173 subset2 = self.df[["A", "geom"]]
174174 assert subset2.sindex is original_index
175
176 def test_rebuild_on_update_inplace(self):
177 gdf = self.df.copy()
178 old_sindex = gdf.sindex
179 # sorting in place
180 gdf.sort_values("A", ascending=False, inplace=True)
181 # spatial index should be invalidated
182 assert not gdf.has_sindex
183 new_sindex = gdf.sindex
184 # and should be different
185 assert new_sindex is not old_sindex
186
187 # sorting should still have happened though
188 assert gdf.index.tolist() == [4, 3, 2, 1, 0]
189
190 @pytest.mark.skipif(not compat.PANDAS_GE_11, reason="fails on pd<1.1.0")
191 def test_update_inplace_no_rebuild(self):
192 gdf = self.df.copy()
193 old_sindex = gdf.sindex
194 gdf.rename(columns={"A": "AA"}, inplace=True)
195 # a rename shouldn't invalidate the index
196 assert gdf.has_sindex
197 # and the "new" should be the same
198 new_sindex = gdf.sindex
199 assert old_sindex is new_sindex
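
The two tests above pin down the spatial-index caching contract: in-place operations that reorder rows must invalidate the cached index, while metadata-only changes such as a column rename keep it. Roughly, under the same fixture assumptions (a GeoDataFrame gdf with a column "A"):

    idx = gdf.sindex                       # builds and caches the spatial index
    assert gdf.has_sindex
    gdf.sort_values("A", inplace=True)     # rows move, so the cache is dropped
    assert not gdf.has_sindex
    assert gdf.sindex is not idx           # rebuilt lazily on next access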
175200
176201
177202 # Skip to accommodate Shapely geometries being unhashable
0 from .crs import explicit_crs_from_epsg
10 from .geocoding import geocode, reverse_geocode
21 from .overlay import overlay
32 from .sjoin import sjoin, sjoin_nearest
65
76 __all__ = [
87 "collect",
9 "explicit_crs_from_epsg",
108 "geocode",
119 "overlay",
1210 "reverse_geocode",
66 """
77 import warnings
88
9 from shapely.geometry import Polygon, MultiPolygon
9 import pandas.api.types
10 from shapely.geometry import Polygon, MultiPolygon, box
1011
1112 from geopandas import GeoDataFrame, GeoSeries
1213 from geopandas.array import _check_crs, _crs_mismatch_warn
1314
1415
15 def _clip_gdf_with_polygon(gdf, poly):
16 """Clip geometry to the polygon extent.
17
18 Clip an input GeoDataFrame to the polygon extent of the poly
16 def _mask_is_list_like_rectangle(mask):
17 return pandas.api.types.is_list_like(mask) and not isinstance(
18 mask, (GeoDataFrame, GeoSeries, Polygon, MultiPolygon)
19 )
20
21
22 def _clip_gdf_with_mask(gdf, mask):
23 """Clip geometry to the polygon/rectangle extent.
24
25 Clip an input GeoDataFrame to the extent of the ``mask``
1926 parameter.
2027
2128 Parameters
2330 gdf : GeoDataFrame, GeoSeries
2431 Dataframe to clip.
2532
26 poly : (Multi)Polygon
27 Reference polygon for clipping.
33 mask : (Multi)Polygon, list-like
34 Reference polygon/rectangle for clipping.
2835
2936 Returns
3037 -------
3138 GeoDataFrame
3239 The returned GeoDataFrame is a clipped subset of gdf
33 that intersects with poly.
40 that intersects with the polygon/rectangle.
3441 """
35 gdf_sub = gdf.iloc[gdf.sindex.query(poly, predicate="intersects")]
42 clipping_by_rectangle = _mask_is_list_like_rectangle(mask)
43 if clipping_by_rectangle:
44 intersection_polygon = box(*mask)
45 else:
46 intersection_polygon = mask
47
48 gdf_sub = gdf.iloc[gdf.sindex.query(intersection_polygon, predicate="intersects")]
3649
3750 # For performance reasons, points don't need to be intersected with the polygon
3851 non_point_mask = gdf_sub.geom_type != "Point"
4457 # Clip the data with the polygon
4558 if isinstance(gdf_sub, GeoDataFrame):
4659 clipped = gdf_sub.copy()
47 clipped.loc[
48 non_point_mask, clipped._geometry_column_name
49 ] = gdf_sub.geometry.values[non_point_mask].intersection(poly)
60 if clipping_by_rectangle:
61 clipped.loc[
62 non_point_mask, clipped._geometry_column_name
63 ] = gdf_sub.geometry.values[non_point_mask].clip_by_rect(*mask)
64 else:
65 clipped.loc[
66 non_point_mask, clipped._geometry_column_name
67 ] = gdf_sub.geometry.values[non_point_mask].intersection(mask)
5068 else:
5169 # GeoSeries
5270 clipped = gdf_sub.copy()
53 clipped[non_point_mask] = gdf_sub.values[non_point_mask].intersection(poly)
54
71 if clipping_by_rectangle:
72 clipped[non_point_mask] = gdf_sub.values[non_point_mask].clip_by_rect(*mask)
73 else:
74 clipped[non_point_mask] = gdf_sub.values[non_point_mask].intersection(mask)
75
76 if clipping_by_rectangle:
77 # clip_by_rect might return empty geometry collections in edge cases
78 clipped = clipped[~clipped.is_empty]
5579 return clipped
5680
5781
5983 """Clip points, lines, or polygon geometries to the mask extent.
6084
6185 Both layers must be in the same Coordinate Reference System (CRS).
62 The `gdf` will be clipped to the full extent of the clip object.
63
64 If there are multiple polygons in mask, data from `gdf` will be
86 The ``gdf`` will be clipped to the full extent of the clip object.
87
88 If there are multiple polygons in mask, data from ``gdf`` will be
6589 clipped to the total boundary of all polygons in mask.
90
91 If the ``mask`` is list-like with four elements ``(minx, miny, maxx, maxy)``, a
92 faster rectangle clipping algorithm will be used. Note that this can lead to
93 slightly different results in edge cases, e.g. if a line would be reduced to a
94 point, this point might not be returned.
95 The geometry is clipped in a fast but possibly dirty way. The output is not
96 guaranteed to be valid. No exceptions will be raised for topological errors.
6697
6798 Parameters
6899 ----------
69100 gdf : GeoDataFrame or GeoSeries
70101 Vector layer (point, line, polygon) to be clipped to mask.
71 mask : GeoDataFrame, GeoSeries, (Multi)Polygon
72 Polygon vector layer used to clip `gdf`.
102 mask : GeoDataFrame, GeoSeries, (Multi)Polygon, list-like
103 Polygon vector layer used to clip ``gdf``.
73104 The mask's geometry is dissolved into one geometric feature
74 and intersected with `gdf`.
105 and intersected with ``gdf``.
106 If the mask is list-like with four elements ``(minx, miny, maxx, maxy)``,
107 ``clip`` will use a faster rectangle clipping (:meth:`~GeoSeries.clip_by_rect`),
108 possibly leading to slightly different results.
75109 keep_geom_type : boolean, default False
76110 If True, return only geometries of original type in case of intersection
77111 resulting in multiple geometry types or GeometryCollections.
80114 Returns
81115 -------
82116 GeoDataFrame or GeoSeries
83 Vector data (points, lines, polygons) from `gdf` clipped to
117 Vector data (points, lines, polygons) from ``gdf`` clipped to
84118 polygon boundary from mask.
85119
86120 See also
109143 "'gdf' should be GeoDataFrame or GeoSeries, got {}".format(type(gdf))
110144 )
111145
112 if not isinstance(mask, (GeoDataFrame, GeoSeries, Polygon, MultiPolygon)):
146 mask_is_list_like = _mask_is_list_like_rectangle(mask)
147 if (
148 not isinstance(mask, (GeoDataFrame, GeoSeries, Polygon, MultiPolygon))
149 and not mask_is_list_like
150 ):
113151 raise TypeError(
114 "'mask' should be GeoDataFrame, GeoSeries or"
115 "(Multi)Polygon, got {}".format(type(mask))
152 "'mask' should be GeoDataFrame, GeoSeries,"
153 f"(Multi)Polygon or list-like, got {type(mask)}"
154 )
155
156 if mask_is_list_like and len(mask) != 4:
157 raise TypeError(
158 "If 'mask' is list-like, it must have four values (minx, miny, maxx, maxy)"
116159 )
117160
118161 if isinstance(mask, (GeoDataFrame, GeoSeries)):
121164
122165 if isinstance(mask, (GeoDataFrame, GeoSeries)):
123166 box_mask = mask.total_bounds
167 elif mask_is_list_like:
168 box_mask = mask
124169 else:
125170 box_mask = mask.bounds
126171 box_gdf = gdf.total_bounds
131176 return gdf.iloc[:0]
132177
133178 if isinstance(mask, (GeoDataFrame, GeoSeries)):
134 poly = mask.geometry.unary_union
135 else:
136 poly = mask
137
138 clipped = _clip_gdf_with_polygon(gdf, poly)
179 combined_mask = mask.geometry.unary_union
180 else:
181 combined_mask = mask
182
183 clipped = _clip_gdf_with_mask(gdf, combined_mask)
139184
140185 if keep_geom_type:
141186 geomcoll_concat = (clipped.geom_type == "GeometryCollection").any()
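
Taken together, the clip.py changes above let clip accept a plain (minx, miny, maxx, maxy) sequence and route it through the faster clip_by_rect path. A minimal usage sketch:

    import geopandas
    from shapely.geometry import Point

    gdf = geopandas.GeoDataFrame(geometry=[Point(1, 1), Point(5, 5), Point(20, 20)])
    # a list-like rectangle mask triggers the clip_by_rect fast path
    clipped = geopandas.clip(gdf, (0, 0, 10, 10))
    assert len(clipped) == 2  # the point at (20, 20) falls outside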
geopandas/tools/crs.py (deleted: 0 additions, 57 deletions)
0 import warnings
1
2 from pyproj import CRS
3
4
5 def explicit_crs_from_epsg(crs=None, epsg=None):
6 """
7 Gets full/explicit CRS from EPSG code provided.
8
9 Parameters
10 ----------
11 crs : dict or string, default None
12 An existing crs dict or Proj string with the 'init' key specifying an EPSG code
13 epsg : string or int, default None
14 The EPSG code to lookup
15 """
16 warnings.warn(
17 "explicit_crs_from_epsg is deprecated. "
18 "You can set the epsg on the GeoDataFrame (gdf) using gdf.crs=epsg",
19 FutureWarning,
20 stacklevel=2,
21 )
22 if crs is not None:
23 return CRS.from_user_input(crs)
24 elif epsg is not None:
25 return CRS.from_epsg(epsg)
26 raise ValueError("Must pass either crs or epsg.")
27
28
29 def epsg_from_crs(crs):
30 """
31 Returns an epsg code from a crs dict or Proj string.
32
33 Parameters
34 ----------
35 crs : dict or string, default None
36 A crs dict or Proj string
37
38 """
39 warnings.warn(
40 "epsg_from_crs is deprecated. "
41 "You can get the epsg code from GeoDataFrame (gdf) "
42 "using gdf.crs.to_epsg()",
43 FutureWarning,
44 stacklevel=2,
45 )
46 crs = CRS.from_user_input(crs)
47 if "init=epsg" in crs.to_string().lower():
48 epsg_code = crs.to_epsg(0)
49 else:
50 epsg_code = crs.to_epsg()
51 return epsg_code
52
53
54 def get_epsg_file_contents():
55 warnings.warn("get_epsg_file_contents is deprecated.", FutureWarning, stacklevel=2)
56 return ""
138138 # keep geometry column last
139139 columns = list(dfunion.columns)
140140 columns.remove("geometry")
141 columns = columns + ["geometry"]
141 columns.append("geometry")
142142 return dfunion.reindex(columns=columns)
143143
144144
106106 "A non-default value for `predicate` was passed"
107107 f' (got `predicate="{predicate}"`'
108108 f' in combination with `op="{op}"`).'
109 " The value of `predicate` will be overriden by the value of `op`,"
109 " The value of `predicate` will be overridden by the value of `op`,"
110110 " , which may result in unexpected behavior."
111111 f"\n{deprecation_message}"
112112 )
342342 )
343343 .set_index(index_right)
344344 .drop(["_key_left", "_key_right"], axis=1)
345 .set_geometry(right_df.geometry.name)
345346 )
346347 if isinstance(index_right, list):
347348 joined.index.names = right_index_name
415416
416417 Results will include multiple output records for a single input record
417418 where there are multiple equidistant nearest or intersected neighbors.
419
420 Distance is calculated in CRS units and can be returned using the
421 `distance_col` parameter.
418422
419423 See the User Guide page
420424 https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html
502506
503507 Notes
504508 -----
505 Since this join relies on distances, results will be innaccurate
509 Since this join relies on distances, results will be inaccurate
506510 if your geometries are in a geographic CRS.
507511
508512 Every operation in GeoPandas is planar, i.e. the potential third
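
The sjoin.py documentation added above mentions distance_col; a short usage sketch (assuming a projected CRS, since distances come out in CRS units, and a spatial-index backend that supports nearest queries):

    import geopandas
    from shapely.geometry import Point

    left = geopandas.GeoDataFrame(geometry=[Point(0, 0)], crs="EPSG:3857")
    right = geopandas.GeoDataFrame(geometry=[Point(3, 4)], crs="EPSG:3857")
    joined = geopandas.sjoin_nearest(left, right, distance_col="dist")
    assert joined["dist"].iloc[0] == 5.0  # metres in EPSG:3857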
00 """Tests for the clip module."""
11
22 import warnings
3 from distutils.version import LooseVersion
3 from packaging.version import Version
44
55 import numpy as np
66 import pandas as pd
1313 LinearRing,
1414 GeometryCollection,
1515 MultiPoint,
16 box,
1617 )
1718
1819 import geopandas
2122 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
2223 import pytest
2324
25 from geopandas.tools.clip import _mask_is_list_like_rectangle
2426
2527 pytestmark = pytest.mark.skip_no_sindex
26 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
28 pandas_133 = Version(pd.__version__) == Version("1.3.3")
29 mask_variants_single_rectangle = [
30 "single_rectangle_gdf",
31 "single_rectangle_gdf_list_bounds",
32 "single_rectangle_gdf_tuple_bounds",
33 "single_rectangle_gdf_array_bounds",
34 ]
35 mask_variants_large_rectangle = [
36 "larger_single_rectangle_gdf",
37 "larger_single_rectangle_gdf_bounds",
38 ]
2739
2840
2941 @pytest.fixture
5971 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:3857")
6072 gdf["attr2"] = "site-boundary"
6173 return gdf
74
75
76 @pytest.fixture
77 def single_rectangle_gdf_tuple_bounds(single_rectangle_gdf):
78 """Bounds of the created single rectangle"""
79 return tuple(single_rectangle_gdf.total_bounds)
80
81
82 @pytest.fixture
83 def single_rectangle_gdf_list_bounds(single_rectangle_gdf):
84 """Bounds of the created single rectangle"""
85 return list(single_rectangle_gdf.total_bounds)
86
87
88 @pytest.fixture
89 def single_rectangle_gdf_array_bounds(single_rectangle_gdf):
90 """Bounds of the created single rectangle"""
91 return single_rectangle_gdf.total_bounds
6292
6393
6494 @pytest.fixture
72102 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:3857")
73103 gdf["attr2"] = ["study area"]
74104 return gdf
105
106
107 @pytest.fixture
108 def larger_single_rectangle_gdf_bounds(larger_single_rectangle_gdf):
109 """Bounds of the created single rectangle"""
110 return tuple(larger_single_rectangle_gdf.total_bounds)
75111
76112
77113 @pytest.fixture
173209 with pytest.raises(TypeError):
174210 clip((2, 3), single_rectangle_gdf)
175211 with pytest.raises(TypeError):
176 clip(single_rectangle_gdf, (2, 3))
177
178
179 def test_returns_gdf(point_gdf, single_rectangle_gdf):
180 """Test that function returns a GeoDataFrame (or GDF-like) object."""
181 out = clip(point_gdf, single_rectangle_gdf)
182 assert isinstance(out, GeoDataFrame)
183
184
185 def test_returns_series(point_gdf, single_rectangle_gdf):
186 """Test that function returns a GeoSeries if GeoSeries is passed."""
187 out = clip(point_gdf.geometry, single_rectangle_gdf)
188 assert isinstance(out, GeoSeries)
212 clip(single_rectangle_gdf, "foobar")
213 with pytest.raises(TypeError):
214 clip(single_rectangle_gdf, (1, 2, 3))
215 with pytest.raises(TypeError):
216 clip(single_rectangle_gdf, (1, 2, 3, 4, 5))
189217
190218
191219 def test_non_overlapping_geoms():
202230 assert_geoseries_equal(out2, GeoSeries(crs=unit_gdf.crs))
203231
204232
205 def test_clip_points(point_gdf, single_rectangle_gdf):
206 """Test clipping a points GDF with a generic polygon geometry."""
207 clip_pts = clip(point_gdf, single_rectangle_gdf)
208 pts = np.array([[2, 2], [3, 4], [9, 8]])
209 exp = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857")
210 assert_geodataframe_equal(clip_pts, exp)
211
212
213 def test_clip_points_geom_col_rename(point_gdf, single_rectangle_gdf):
214 """Test clipping a points GDF with a generic polygon geometry."""
215 point_gdf_geom_col_rename = point_gdf.rename_geometry("geometry2")
216 clip_pts = clip(point_gdf_geom_col_rename, single_rectangle_gdf)
217 pts = np.array([[2, 2], [3, 4], [9, 8]])
218 exp = GeoDataFrame(
219 [Point(xy) for xy in pts],
220 columns=["geometry2"],
221 crs="EPSG:3857",
222 geometry="geometry2",
223 )
224 assert_geodataframe_equal(clip_pts, exp)
225
226
227 def test_clip_poly(buffered_locations, single_rectangle_gdf):
228 """Test clipping a polygon GDF with a generic polygon geometry."""
229 clipped_poly = clip(buffered_locations, single_rectangle_gdf)
230 assert len(clipped_poly.geometry) == 3
231 assert all(clipped_poly.geom_type == "Polygon")
232
233
234 def test_clip_poly_geom_col_rename(buffered_locations, single_rectangle_gdf):
235 """Test clipping a polygon GDF with a generic polygon geometry."""
236
237 poly_gdf_geom_col_rename = buffered_locations.rename_geometry("geometry2")
238 clipped_poly = clip(poly_gdf_geom_col_rename, single_rectangle_gdf)
239 assert len(clipped_poly.geometry) == 3
240 assert "geometry" not in clipped_poly.keys()
241 assert "geometry2" in clipped_poly.keys()
242
243
244 def test_clip_poly_series(buffered_locations, single_rectangle_gdf):
245 """Test clipping a polygon GDF with a generic polygon geometry."""
246 clipped_poly = clip(buffered_locations.geometry, single_rectangle_gdf)
247 assert len(clipped_poly) == 3
248 assert all(clipped_poly.geom_type == "Polygon")
233 @pytest.mark.parametrize("mask_fixture_name", mask_variants_single_rectangle)
234 class TestClipWithSingleRectangleGdf:
235 @pytest.fixture
236 def mask(self, mask_fixture_name, request):
237 return request.getfixturevalue(mask_fixture_name)
238
239 def test_returns_gdf(self, point_gdf, mask):
240 """Test that function returns a GeoDataFrame (or GDF-like) object."""
241 out = clip(point_gdf, mask)
242 assert isinstance(out, GeoDataFrame)
243
244 def test_returns_series(self, point_gdf, mask):
245 """Test that function returns a GeoSeries if GeoSeries is passed."""
246 out = clip(point_gdf.geometry, mask)
247 assert isinstance(out, GeoSeries)
248
249 def test_clip_points(self, point_gdf, mask):
250 """Test clipping a points GDF with a generic polygon geometry."""
251 clip_pts = clip(point_gdf, mask)
252 pts = np.array([[2, 2], [3, 4], [9, 8]])
253 exp = GeoDataFrame(
254 [Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857"
255 )
256 assert_geodataframe_equal(clip_pts, exp)
257
258 def test_clip_points_geom_col_rename(self, point_gdf, mask):
259 """Test clipping a points GDF with a generic polygon geometry."""
260 point_gdf_geom_col_rename = point_gdf.rename_geometry("geometry2")
261 clip_pts = clip(point_gdf_geom_col_rename, mask)
262 pts = np.array([[2, 2], [3, 4], [9, 8]])
263 exp = GeoDataFrame(
264 [Point(xy) for xy in pts],
265 columns=["geometry2"],
266 crs="EPSG:3857",
267 geometry="geometry2",
268 )
269 assert_geodataframe_equal(clip_pts, exp)
270
271 def test_clip_poly(self, buffered_locations, mask):
272 """Test clipping a polygon GDF with a generic polygon geometry."""
273 clipped_poly = clip(buffered_locations, mask)
274 assert len(clipped_poly.geometry) == 3
275 assert all(clipped_poly.geom_type == "Polygon")
276
277 def test_clip_poly_geom_col_rename(self, buffered_locations, mask):
278 """Test clipping a polygon GDF with a generic polygon geometry."""
279
280 poly_gdf_geom_col_rename = buffered_locations.rename_geometry("geometry2")
281 clipped_poly = clip(poly_gdf_geom_col_rename, mask)
282 assert len(clipped_poly.geometry) == 3
283 assert "geometry" not in clipped_poly.keys()
284 assert "geometry2" in clipped_poly.keys()
285
286 def test_clip_poly_series(self, buffered_locations, mask):
287 """Test clipping a polygon GDF with a generic polygon geometry."""
288 clipped_poly = clip(buffered_locations.geometry, mask)
289 assert len(clipped_poly) == 3
290 assert all(clipped_poly.geom_type == "Polygon")
291
292 @pytest.mark.xfail(pandas_133, reason="Regression in pandas 1.3.3 (GH #2101)")
293 def test_clip_multipoly_keep_geom_type(self, multi_poly_gdf, mask):
294 """Test a multi poly object where the return includes a sliver.
295 Also the bounds of the object should == the bounds of the clip object
296 if they fully overlap (as they do in these fixtures)."""
297 clipped = clip(multi_poly_gdf, mask, keep_geom_type=True)
298 expected_bounds = (
299 mask if _mask_is_list_like_rectangle(mask) else mask.total_bounds
300 )
301 assert np.array_equal(clipped.total_bounds, expected_bounds)
302 # Assert returned data is not a geometry collection
303 assert (clipped.geom_type.isin(["Polygon", "MultiPolygon"])).all()
304
305 def test_clip_multiline(self, multi_line, mask):
306 """Test that clipping a multiline feature with a poly returns expected
307 output."""
308 clipped = clip(multi_line, mask)
309 assert clipped.geom_type[0] == "MultiLineString"
310
311 def test_clip_multipoint(self, multi_point, mask):
312 """Clipping a multipoint feature with a polygon works as expected.
313 It should return a GeoDataFrame with a single MultiPoint feature."""
314 clipped = clip(multi_point, mask)
315 assert clipped.geom_type[0] == "MultiPoint"
316 assert hasattr(clipped, "attr")
317 # All points should intersect the clip geom
318 assert len(clipped) == 2
319 clipped_multipoint = MultiPoint(
320 [
321 Point(2, 2),
322 Point(3, 4),
323 Point(9, 8),
324 ]
325 )
326 assert clipped.iloc[0].geometry.wkt == clipped_multipoint.wkt
327 shape_for_points = (
328 box(*mask) if _mask_is_list_like_rectangle(mask) else mask.unary_union
329 )
330 assert all(clipped.intersects(shape_for_points))
331
332 def test_clip_lines(self, two_line_gdf, mask):
333 """Test what happens when you give the clip_extent a line GDF."""
334 clip_line = clip(two_line_gdf, mask)
335 assert len(clip_line.geometry) == 2
336
337 def test_mixed_geom(self, mixed_gdf, mask):
338 """Test clipping a mixed GeoDataFrame"""
339 clipped = clip(mixed_gdf, mask)
340 assert (
341 clipped.geom_type[0] == "Point"
342 and clipped.geom_type[1] == "Polygon"
343 and clipped.geom_type[2] == "LineString"
344 )
345
346 def test_mixed_series(self, mixed_gdf, mask):
347 """Test clipping a mixed GeoSeries"""
348 clipped = clip(mixed_gdf.geometry, mask)
349 assert (
350 clipped.geom_type[0] == "Point"
351 and clipped.geom_type[1] == "Polygon"
352 and clipped.geom_type[2] == "LineString"
353 )
354
355 def test_clip_warning_no_extra_geoms(self, buffered_locations, mask):
356 """Test a user warning is provided if no new geometry types are found."""
357 with pytest.warns(UserWarning):
358 clip(buffered_locations, mask, True)
359 warnings.warn(
360 "keep_geom_type was called when no extra geometry types existed.",
361 UserWarning,
362 )
363
364 def test_clip_with_line_extra_geom(self, sliver_line, mask):
365 """When the output of a clipped line returns a geom collection,
366 and keep_geom_type is True, no geometry collections should be returned."""
367 clipped = clip(sliver_line, mask, keep_geom_type=True)
368 assert len(clipped.geometry) == 1
369 # Assert returned data is not a geometry collection
370 assert not (clipped.geom_type == "GeometryCollection").any()
371
372 def test_clip_no_box_overlap(self, pointsoutside_nooverlap_gdf, mask):
373 """Test clip when intersection is empty and boxes do not overlap."""
374 clipped = clip(pointsoutside_nooverlap_gdf, mask)
375 assert len(clipped) == 0
376
377 def test_clip_box_overlap(self, pointsoutside_overlap_gdf, mask):
378 """Test clip when intersection is empty and boxes do overlap."""
379 clipped = clip(pointsoutside_overlap_gdf, mask)
380 assert len(clipped) == 0
381
382 def test_warning_extra_geoms_mixed(self, mixed_gdf, mask):
383 """Test the correct warnings are raised if keep_geom_type is
384 called on a mixed GDF"""
385 with pytest.warns(UserWarning):
386 clip(mixed_gdf, mask, keep_geom_type=True)
387
388 def test_warning_geomcoll(self, geomcol_gdf, mask):
389 """Test the correct warnings are raised if keep_geom_type is
390 called on a GDF with GeometryCollection"""
391 with pytest.warns(UserWarning):
392 clip(geomcol_gdf, mask, keep_geom_type=True)
393
394
395 def test_clip_line_keep_slivers(sliver_line, single_rectangle_gdf):
396 """Test the correct output if a point is returned
397 from a line only geometry type."""
398 clipped = clip(sliver_line, single_rectangle_gdf)
399 # Slivers are kept, so mixed geometry types (Point, LineString) are returned
400 assert "Point" == clipped.geom_type[0]
401 assert "LineString" == clipped.geom_type[1]
249402
250403
251404 @pytest.mark.xfail(pandas_133, reason="Regression in pandas 1.3.3 (GH #2101)")
259412 assert "GeometryCollection" in clipped.geom_type[0]
260413
261414
262 @pytest.mark.xfail(pandas_133, reason="Regression in pandas 1.3.3 (GH #2101)")
263 def test_clip_multipoly_keep_geom_type(multi_poly_gdf, single_rectangle_gdf):
264 """Test a multi poly object where the return includes a sliver.
265 Also the bounds of the object should == the bounds of the clip object
266 if they fully overlap (as they do in these fixtures)."""
267 clipped = clip(multi_poly_gdf, single_rectangle_gdf, keep_geom_type=True)
268 assert np.array_equal(clipped.total_bounds, single_rectangle_gdf.total_bounds)
269 # Assert returned data is a not geometry collection
270 assert (clipped.geom_type == "Polygon").any()
271
272
273 def test_clip_single_multipoly_no_extra_geoms(
274 buffered_locations, larger_single_rectangle_gdf
275 ):
276 """When clipping a multi-polygon feature, no additional geom types
277 should be returned."""
278 multi = buffered_locations.dissolve(by="type").reset_index()
279 clipped = clip(multi, larger_single_rectangle_gdf)
280 assert clipped.geom_type[0] == "Polygon"
281
282
283 def test_clip_multiline(multi_line, single_rectangle_gdf):
284 """Test that clipping a multiline feature with a poly returns expected output."""
285 clipped = clip(multi_line, single_rectangle_gdf)
286 assert clipped.geom_type[0] == "MultiLineString"
287
288
289 def test_clip_multipoint(single_rectangle_gdf, multi_point):
290 """Clipping a multipoint feature with a polygon works as expected.
291 should return a geodataframe with a single multi point feature"""
292 clipped = clip(multi_point, single_rectangle_gdf)
293 assert clipped.geom_type[0] == "MultiPoint"
294 assert hasattr(clipped, "attr")
295 # All points should intersect the clip geom
296 assert len(clipped) == 2
297 clipped_mutltipoint = MultiPoint(
298 [
299 Point(2, 2),
300 Point(3, 4),
301 Point(9, 8),
302 ]
303 )
304 assert clipped.iloc[0].geometry.wkt == clipped_mutltipoint.wkt
305 assert all(clipped.intersects(single_rectangle_gdf.unary_union))
306
307
308 def test_clip_lines(two_line_gdf, single_rectangle_gdf):
309 """Test what happens when you give the clip_extent a line GDF."""
310 clip_line = clip(two_line_gdf, single_rectangle_gdf)
311 assert len(clip_line.geometry) == 2
312
313
314 def test_clip_with_multipolygon(buffered_locations, single_rectangle_gdf):
315 """Test clipping a polygon with a multipolygon."""
316 multi = buffered_locations.dissolve(by="type").reset_index()
317 clipped = clip(single_rectangle_gdf, multi)
318 assert clipped.geom_type[0] == "Polygon"
319
320
321 def test_mixed_geom(mixed_gdf, single_rectangle_gdf):
322 """Test clipping a mixed GeoDataFrame"""
323 clipped = clip(mixed_gdf, single_rectangle_gdf)
324 assert (
325 clipped.geom_type[0] == "Point"
326 and clipped.geom_type[1] == "Polygon"
327 and clipped.geom_type[2] == "LineString"
328 )
329
330
331 def test_mixed_series(mixed_gdf, single_rectangle_gdf):
332 """Test clipping a mixed GeoSeries"""
333 clipped = clip(mixed_gdf.geometry, single_rectangle_gdf)
334 assert (
335 clipped.geom_type[0] == "Point"
336 and clipped.geom_type[1] == "Polygon"
337 and clipped.geom_type[2] == "LineString"
338 )
339
340
341 def test_clip_warning_no_extra_geoms(buffered_locations, single_rectangle_gdf):
342 """Test a user warning is provided if no new geometry types are found."""
343 with pytest.warns(UserWarning):
344 clip(buffered_locations, single_rectangle_gdf, True)
345 warnings.warn(
346 "keep_geom_type was called when no extra geometry types existed.",
347 UserWarning,
348 )
415 def test_warning_crs_mismatch(point_gdf, single_rectangle_gdf):
416 with pytest.warns(UserWarning, match="CRS mismatch between the CRS"):
417 clip(point_gdf, single_rectangle_gdf.to_crs(4326))
349418
350419
351420 def test_clip_with_polygon(single_rectangle_gdf):
360429 assert_geodataframe_equal(clipped, exp)
361430
362431
363 def test_clip_with_line_extra_geom(single_rectangle_gdf, sliver_line):
364 """When the output of a clipped line returns a geom collection,
365 and keep_geom_type is True, no geometry collections should be returned."""
366 clipped = clip(sliver_line, single_rectangle_gdf, keep_geom_type=True)
367 assert len(clipped.geometry) == 1
368 # Assert returned data is a not geometry collection
369 assert not (clipped.geom_type == "GeometryCollection").any()
370
371
372 def test_clip_line_keep_slivers(single_rectangle_gdf, sliver_line):
373 """Test the correct output if a point is returned
374 from a line only geometry type."""
375 clipped = clip(sliver_line, single_rectangle_gdf)
376 # Assert returned data is a geometry collection given sliver geoms
377 assert "Point" == clipped.geom_type[0]
378 assert "LineString" == clipped.geom_type[1]
379
380
381 def test_clip_no_box_overlap(pointsoutside_nooverlap_gdf, single_rectangle_gdf):
382 """Test clip when intersection is empty and boxes do not overlap."""
383 clipped = clip(pointsoutside_nooverlap_gdf, single_rectangle_gdf)
384 assert len(clipped) == 0
385
386
387 def test_clip_box_overlap(pointsoutside_overlap_gdf, single_rectangle_gdf):
388 """Test clip when intersection is empty and boxes do overlap."""
389 clipped = clip(pointsoutside_overlap_gdf, single_rectangle_gdf)
390 assert len(clipped) == 0
391
392
393 def test_warning_extra_geoms_mixed(single_rectangle_gdf, mixed_gdf):
394 """Test the correct warnings are raised if keep_geom_type is
395 called on a mixed GDF"""
396 with pytest.warns(UserWarning):
397 clip(mixed_gdf, single_rectangle_gdf, keep_geom_type=True)
398
399
400 def test_warning_geomcoll(single_rectangle_gdf, geomcol_gdf):
401 """Test the correct warnings are raised if keep_geom_type is
402 called on a GDF with GeometryCollection"""
403 with pytest.warns(UserWarning):
404 clip(geomcol_gdf, single_rectangle_gdf, keep_geom_type=True)
405
406
407 def test_warning_crs_mismatch(point_gdf, single_rectangle_gdf):
408 with pytest.warns(UserWarning, match="CRS mismatch between the CRS"):
409 clip(point_gdf, single_rectangle_gdf.to_crs(4326))
432 def test_clip_with_multipolygon(buffered_locations, single_rectangle_gdf):
433 """Test clipping a polygon with a multipolygon."""
434 multi = buffered_locations.dissolve(by="type").reset_index()
435 clipped = clip(single_rectangle_gdf, multi)
436 assert clipped.geom_type[0] == "Polygon"
437
438
439 @pytest.mark.parametrize(
440 "mask_fixture_name",
441 mask_variants_large_rectangle,
442 )
443 def test_clip_single_multipoly_no_extra_geoms(
444 buffered_locations, mask_fixture_name, request
445 ):
446 """When clipping a multi-polygon feature, no additional geom types
447 should be returned."""
448 masks = request.getfixturevalue(mask_fixture_name)
449 multi = buffered_locations.dissolve(by="type").reset_index()
450 clipped = clip(multi, masks)
451 assert clipped.geom_type[0] == "Polygon"
0 from distutils.version import LooseVersion
0 from packaging.version import Version
11 import math
22 from typing import Sequence
33 from geopandas.testing import assert_geodataframe_equal
136136 if op != predicate:
137137 warntype = UserWarning
138138 match = (
139 "`predicate` will be overriden by the value of `op`"
139 "`predicate` will be overridden by the value of `op`"
140140 + r"(.|\s)*"
141141 + match
142142 )
353353 exp.index.names = df2.index.names
354354
355355 # GH 1364 fix of behaviour was done in pandas 1.1.0
356 if predicate == "within" and str(pd.__version__) >= LooseVersion("1.1.0"):
356 if predicate == "within" and Version(pd.__version__) >= Version("1.1.0"):
357357 exp = exp.sort_index()
358358
359359 assert_frame_equal(res, exp, check_index_type=False)
766766 [Point(1, 1), Point(0.25, 1)],
767767 [0, 1],
768768 [1, 0],
769 [math.sqrt(0.25 ** 2 + 1), 0],
769 [math.sqrt(0.25**2 + 1), 0],
770770 ),
771771 (
772772 [Point(0, 0), Point(1, 1)],
773773 [Point(-10, -10), Point(100, 100)],
774774 [0, 1],
775775 [0, 0],
776 [math.sqrt(10 ** 2 + 10 ** 2), math.sqrt(11 ** 2 + 11 ** 2)],
776 [math.sqrt(10**2 + 10**2), math.sqrt(11**2 + 11**2)],
777777 ),
778778 (
779779 [Point(0, 0), Point(1, 1)],
787787 [Point(1.1, 1.1), Point(0, 0)],
788788 [0, 1, 2],
789789 [1, 0, 1],
790 [0, np.sqrt(0.1 ** 2 + 0.1 ** 2), 0],
790 [0, np.sqrt(0.1**2 + 0.1**2), 0],
791791 ),
792792 ],
793793 )
851851 [Point(-10, -10), Point(100, 100)],
852852 [0, 1],
853853 [0, 1],
854 [math.sqrt(10 ** 2 + 10 ** 2), math.sqrt(99 ** 2 + 99 ** 2)],
854 [math.sqrt(10**2 + 10**2), math.sqrt(99**2 + 99**2)],
855855 ),
856856 (
857857 [Point(0, 0), Point(1, 1)],
858858 [Point(x, y) for x, y in zip(np.arange(10), np.arange(10))],
859859 [0, 1] + [1] * 8,
860860 list(range(10)),
861 [0, 0] + [np.sqrt(x ** 2 + x ** 2) for x in np.arange(1, 9)],
861 [0, 0] + [np.sqrt(x**2 + x**2) for x in np.arange(1, 9)],
862862 ),
863863 (
864864 [Point(0, 0), Point(1, 1), Point(0, 0)],
865865 [Point(1.1, 1.1), Point(0, 0)],
866866 [1, 0, 2],
867867 [0, 1, 1],
868 [np.sqrt(0.1 ** 2 + 0.1 ** 2), 0, 0],
868 [np.sqrt(0.1**2 + 0.1**2), 0, 0],
869869 ),
870870 ],
871871 )
0 from distutils.version import LooseVersion
1
20 from shapely.geometry import LineString, MultiPoint, Point
3 import pyproj
4 from pyproj import CRS
51
62 from geopandas import GeoSeries
73 from geopandas.tools import collect
8 from geopandas.tools.crs import epsg_from_crs, explicit_crs_from_epsg
94
105 import pytest
11
12
13 # pyproj 2.3.1 fixed a segfault for the case working in an environment with
14 # 'init' dicts (https://github.com/pyproj4/pyproj/issues/415)
15 PYPROJ_LT_231 = LooseVersion(pyproj.__version__) < LooseVersion("2.3.1")
166
177
188 class TestTools:
5848 def test_collect_mixed_multi(self):
5949 with pytest.raises(ValueError):
6050 collect([self.mpc, self.mp1])
61
62 @pytest.mark.skipif(PYPROJ_LT_231, reason="segfault")
63 def test_epsg_from_crs(self):
64 with pytest.warns(FutureWarning):
65 assert epsg_from_crs({"init": "epsg:4326"}) == 4326
66 assert epsg_from_crs({"init": "EPSG:4326"}) == 4326
67 assert epsg_from_crs("+init=epsg:4326") == 4326
68
69 @pytest.mark.skipif(PYPROJ_LT_231, reason="segfault")
70 def test_explicit_crs_from_epsg(self):
71 with pytest.warns(FutureWarning):
72 assert explicit_crs_from_epsg(epsg=4326) == CRS.from_epsg(4326)
73 assert explicit_crs_from_epsg(epsg="4326") == CRS.from_epsg(4326)
74 assert explicit_crs_from_epsg(crs={"init": "epsg:4326"}) == CRS.from_dict(
75 {"init": "epsg:4326"}
76 )
77 assert explicit_crs_from_epsg(crs="+init=epsg:4326") == CRS.from_proj4(
78 "+init=epsg:4326"
79 )
80
81 @pytest.mark.filterwarnings("ignore:explicit_crs_from_epsg:FutureWarning")
82 def test_explicit_crs_from_epsg__missing_input(self):
83 with pytest.raises(ValueError):
84 explicit_crs_from_epsg()
0 [build-system]
1 requires = ["setuptools", "wheel"]
2 build-backend = "setuptools.build_meta"
00 version: 2
1 build:
2 os: ubuntu-20.04
3 tools:
4 python: mambaforge-4.10
5 python:
6 install:
7 - method: pip
8 path: .
9 conda:
10 environment: doc/environment.yml
111 formats: []
2 conda:
3 environment: doc/environment.yml
4 python:
5 version: 3
6 install:
7 - method: pip
8 path: .
00 # required
11 fiona>=1.8
2 pandas>=0.25
3 pyproj>=2.2.0
4 shapely>=1.6
2 pandas>=1.0.0
3 pyproj>=2.6.1.post1
4 shapely>=1.7
5 packaging
56
67 # geodatabase access
7 psycopg2>=2.5.1
8 SQLAlchemy>=0.8.3
8 psycopg2>=2.8.0
9 SQLAlchemy>=1.3
910
1011 # geocoding
1112 geopy
1213
1314 # plotting
14 matplotlib>=2.2
15 matplotlib>=3.2
1516 mapclassify
1617
1718 # testing
2021 codecov
2122
2223 # spatial access methods
23 rtree>=0.8
24 rtree>=0.9
2425
2526 # styling
2627 black
0 [bdist_wheel]
1 universal = 1
2
30 # See the docstring in versioneer.py for instructions. Note that you must
41 # re-run 'versioneer.py setup' after changing this section, and commit the
52 # resulting files.
33 """
44
55 import os
6 import sys
67
7 try:
8 from setuptools import setup
9 except ImportError:
10 from distutils.core import setup
8 from setuptools import setup
119
12 import versioneer
10 # ensure the current directory is on sys.path so versioneer can be imported
11 # when pip uses PEP 517/518 build rules.
12 # https://github.com/python-versioneer/python-versioneer/issues/193
13 sys.path.append(os.path.dirname(__file__))
14
15 import versioneer # noqa: E402
1316
1417 LONG_DESCRIPTION = """GeoPandas is a project to add support for geographic data to
1518 `pandas`_ objects.
2932 INSTALL_REQUIRES = []
3033 else:
3134 INSTALL_REQUIRES = [
32 "pandas >= 0.25.0",
33 "shapely >= 1.6",
35 "pandas >= 1.0.0",
36 "shapely >= 1.7, < 2",
3437 "fiona >= 1.8",
35 "pyproj >= 2.2.0",
38 "pyproj >= 2.6.1.post1",
39 "packaging",
3640 ]
3741
3842 # get all data dirs in the datasets module
5660 author="GeoPandas contributors",
5761 author_email="kjordahl@alum.mit.edu",
5862 url="http://geopandas.org",
63 project_urls={
64 "Source": "https://github.com/geopandas/geopandas",
65 },
5966 long_description=LONG_DESCRIPTION,
67 long_description_content_type="text/x-rst",
6068 packages=[
6169 "geopandas",
6270 "geopandas.io",
6674 "geopandas.tools.tests",
6775 ],
6876 package_data={"geopandas": data_files},
69 python_requires=">=3.7",
77 python_requires=">=3.8",
7078 install_requires=INSTALL_REQUIRES,
7179 cmdclass=versioneer.get_cmdclass(),
7280 )
0
1 # Version: 0.16
0 # Version: 0.21
21
32 """The Versioneer - like a rocketeer, but for versions.
43
65 ==============
76
87 * like a rocketeer, but for versions!
9 * https://github.com/warner/python-versioneer
8 * https://github.com/python-versioneer/python-versioneer
109 * Brian Warner
1110 * License: Public Domain
12 * Compatible With: python2.6, 2.7, 3.3, 3.4, 3.5, and pypy
13 * [![Latest Version]
14 (https://pypip.in/version/versioneer/badge.svg?style=flat)
15 ](https://pypi.python.org/pypi/versioneer/)
16 * [![Build Status]
17 (https://travis-ci.org/warner/python-versioneer.png?branch=master)
18 ](https://travis-ci.org/warner/python-versioneer)
11 * Compatible with: Python 3.6, 3.7, 3.8, 3.9 and pypy3
12 * [![Latest Version][pypi-image]][pypi-url]
13 * [![Build Status][travis-image]][travis-url]
1914
2015 This is a tool for managing a recorded version number in distutils-based
2116 python projects. The goal is to remove the tedious and error-prone "update
2621
2722 ## Quick Install
2823
29 * `pip install versioneer` to somewhere to your $PATH
30 * add a `[versioneer]` section to your setup.cfg (see below)
24 * `pip install versioneer` to somewhere in your $PATH
25 * add a `[versioneer]` section to your setup.cfg (see [Install](INSTALL.md))
3126 * run `versioneer install` in your source tree, commit the results
27 * Verify version information with `python setup.py version`
3228
3329 ## Version Identifiers
3430
6056 for example `git describe --tags --dirty --always` reports things like
6157 "0.7-1-g574ab98-dirty" to indicate that the checkout is one revision past the
6258 0.7 tag, has a unique revision id of "574ab98", and is "dirty" (it has
63 uncommitted changes.
59 uncommitted changes).
6460
6561 The version identifier is used for multiple purposes:
6662
8783
8884 ## Installation
8985
90 First, decide on values for the following configuration variables:
91
92 * `VCS`: the version control system you use. Currently accepts "git".
93
94 * `style`: the style of version string to be produced. See "Styles" below for
95 details. Defaults to "pep440", which looks like
96 `TAG[+DISTANCE.gSHORTHASH[.dirty]]`.
97
98 * `versionfile_source`:
99
100 A project-relative pathname into which the generated version strings should
101 be written. This is usually a `_version.py` next to your project's main
102 `__init__.py` file, so it can be imported at runtime. If your project uses
103 `src/myproject/__init__.py`, this should be `src/myproject/_version.py`.
104 This file should be checked in to your VCS as usual: the copy created below
105 by `setup.py setup_versioneer` will include code that parses expanded VCS
106 keywords in generated tarballs. The 'build' and 'sdist' commands will
107 replace it with a copy that has just the calculated version string.
108
109 This must be set even if your project does not have any modules (and will
110 therefore never import `_version.py`), since "setup.py sdist" -based trees
111 still need somewhere to record the pre-calculated version strings. Anywhere
112 in the source tree should do. If there is a `__init__.py` next to your
113 `_version.py`, the `setup.py setup_versioneer` command (described below)
114 will append some `__version__`-setting assignments, if they aren't already
115 present.
116
117 * `versionfile_build`:
118
119 Like `versionfile_source`, but relative to the build directory instead of
120 the source directory. These will differ when your setup.py uses
121 'package_dir='. If you have `package_dir={'myproject': 'src/myproject'}`,
122 then you will probably have `versionfile_build='myproject/_version.py'` and
123 `versionfile_source='src/myproject/_version.py'`.
124
125 If this is set to None, then `setup.py build` will not attempt to rewrite
126 any `_version.py` in the built tree. If your project does not have any
127 libraries (e.g. if it only builds a script), then you should use
128 `versionfile_build = None`. To actually use the computed version string,
129 your `setup.py` will need to override `distutils.command.build_scripts`
130 with a subclass that explicitly inserts a copy of
131 `versioneer.get_version()` into your script file. See
132 `test/demoapp-script-only/setup.py` for an example.
133
134 * `tag_prefix`:
135
136 a string, like 'PROJECTNAME-', which appears at the start of all VCS tags.
137 If your tags look like 'myproject-1.2.0', then you should use
138 tag_prefix='myproject-'. If you use unprefixed tags like '1.2.0', this
139 should be an empty string, using either `tag_prefix=` or `tag_prefix=''`.
140
141 * `parentdir_prefix`:
142
143 a optional string, frequently the same as tag_prefix, which appears at the
144 start of all unpacked tarball filenames. If your tarball unpacks into
145 'myproject-1.2.0', this should be 'myproject-'. To disable this feature,
146 just omit the field from your `setup.cfg`.
147
148 This tool provides one script, named `versioneer`. That script has one mode,
149 "install", which writes a copy of `versioneer.py` into the current directory
150 and runs `versioneer.py setup` to finish the installation.
151
152 To versioneer-enable your project:
153
154 * 1: Modify your `setup.cfg`, adding a section named `[versioneer]` and
155 populating it with the configuration values you decided earlier (note that
156 the option names are not case-sensitive):
157
158 ````
159 [versioneer]
160 VCS = git
161 style = pep440
162 versionfile_source = src/myproject/_version.py
163 versionfile_build = myproject/_version.py
164 tag_prefix =
165 parentdir_prefix = myproject-
166 ````
167
168 * 2: Run `versioneer install`. This will do the following:
169
170 * copy `versioneer.py` into the top of your source tree
171 * create `_version.py` in the right place (`versionfile_source`)
172 * modify your `__init__.py` (if one exists next to `_version.py`) to define
173 `__version__` (by calling a function from `_version.py`)
174 * modify your `MANIFEST.in` to include both `versioneer.py` and the
175 generated `_version.py` in sdist tarballs
176
177 `versioneer install` will complain about any problems it finds with your
178 `setup.py` or `setup.cfg`. Run it multiple times until you have fixed all
179 the problems.
180
181 * 3: add an `import versioneer` to your `setup.py`, and add the following
182 arguments to the setup() call (a complete sketch follows these steps):
183
184 version=versioneer.get_version(),
185 cmdclass=versioneer.get_cmdclass(),
186
187 * 4: commit these changes to your VCS. To make sure you won't forget,
188 `versioneer install` will mark everything it touched for addition using
189 `git add`. Don't forget to add `setup.py` and `setup.cfg` too.
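Taken together, a minimal `setup.py` for the configuration sketched in step 1
might look like this (a sketch only; the name and package layout are the
placeholders used above):

````
import versioneer
from setuptools import setup

setup(
    name="myproject",
    version=versioneer.get_version(),
    cmdclass=versioneer.get_cmdclass(),
    packages=["myproject"],
    package_dir={"myproject": "src/myproject"},
)
````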
190
191 ## Post-Installation Usage
192
193 Once established, all uses of your tree from a VCS checkout should get the
194 current version string. All generated tarballs should include an embedded
195 version string (so users who unpack them will not need a VCS tool installed).
196
197 If you distribute your project through PyPI, then the release process should
198 boil down to two steps:
199
200 * 1: git tag 1.0
201 * 2: python setup.py register sdist upload
202
203 If you distribute it through github (i.e. users use github to generate
204 tarballs with `git archive`), the process is:
205
206 * 1: git tag 1.0
207 * 2: git push; git push --tags
208
209 Versioneer will report "0+untagged.NUMCOMMITS.gHASH" until your tree has at
210 least one tag in its history.
86 See [INSTALL.md](./INSTALL.md) for detailed installation instructions.
21187
21288 ## Version-String Flavors
21389
227103
228104 * `['full-revisionid']`: detailed revision identifier. For Git, this is the
229105 full SHA1 commit id, e.g. "1076c978a8d3cfc70f408fe5974aa6c092c949ac".
106
107 * `['date']`: Date and time of the latest `HEAD` commit. For Git, it is the
108 commit date in ISO 8601 format. This will be None if the date is not
109 available.
230110
231111 * `['dirty']`: a boolean, True if the tree has uncommitted changes. Note that
232112 this is only accurate if run in a VCS checkout, otherwise it is likely to
266146 software (exactly equal to a known tag), the identifier will only contain the
267147 stripped tag, e.g. "0.11".
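For example, querying the computed information from Python (a sketch; the
values shown are purely illustrative):

````
import versioneer

# run from the project root, next to setup.cfg
info = versioneer.get_versions()
# e.g. {'version': '0.11+2.g1076c97',
#       'full-revisionid': '1076c978a8d3cfc70f408fe5974aa6c092c949ac',
#       'dirty': False, 'error': None,
#       'date': '2022-06-20T10:00:00+0200'}
print(info["version"])
````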
268148
269 Other styles are available. See details.md in the Versioneer source tree for
270 descriptions.
149 Other styles are available. See [details.md](details.md) in the Versioneer
150 source tree for descriptions.
271151
272152 ## Debugging
273153
277157 display the full contents of `get_versions()` (including the `error` string,
278158 which may help identify what went wrong).
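The same information can also be inspected from a Python prompt (a sketch;
run it from the project root):

````
import versioneer

info = versioneer.get_versions(verbose=True)
if info["error"]:
    print("version lookup failed:", info["error"])
````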
279159
160 ## Known Limitations
161
162 Some situations are known to cause problems for Versioneer. This section
163 details the most significant ones. More can be found on the GitHub
164 [issues page](https://github.com/python-versioneer/python-versioneer/issues).
165
166 ### Subprojects
167
168 Versioneer has limited support for source trees in which `setup.py` is not in
169 the root directory (e.g. `setup.py` and `.git/` are *not* siblings). There are
170 two common reasons why `setup.py` might not be in the root:
171
172 * Source trees which contain multiple subprojects, such as
173 [Buildbot](https://github.com/buildbot/buildbot), which contains both
174 "master" and "slave" subprojects, each with their own `setup.py`,
175 `setup.cfg`, and `tox.ini`. Projects like these produce multiple PyPI
176 distributions (and upload multiple independently-installable tarballs).
177 * Source trees whose main purpose is to contain a C library, but which also
178 provide bindings to Python (and perhaps other languages) in subdirectories.
179
180 Versioneer will look for `.git` in parent directories, and most operations
181 should get the right version string. However `pip` and `setuptools` have bugs
182 and implementation details which frequently cause `pip install .` from a
183 subproject directory to fail to find a correct version string (so it usually
184 defaults to `0+unknown`).
185
186 `pip install --editable .` should work correctly. `setup.py install` might
187 work too.
188
189 pip 8.1.1 is known to have this problem, but hopefully it will be fixed in
190 some later version.
191
192 [Bug #38](https://github.com/python-versioneer/python-versioneer/issues/38) is tracking
193 this issue. The discussion in
194 [PR #61](https://github.com/python-versioneer/python-versioneer/pull/61) describes the
195 issue from the Versioneer side in more detail.
196 [pip PR#3176](https://github.com/pypa/pip/pull/3176) and
197 [pip PR#3615](https://github.com/pypa/pip/pull/3615) contain work to improve
198 pip to let Versioneer work correctly.
199
200 Versioneer-0.16 and earlier only looked for a `.git` directory next to the
201 `setup.cfg`, so subprojects were completely unsupported with those releases.
202
203 ### Editable installs with setuptools <= 18.5
204
205 `setup.py develop` and `pip install --editable .` allow you to install a
206 project into a virtualenv once, then continue editing (and testing) the
207 source code without re-installing after every change.
208
209 "Entry-point scripts" (`setup(entry_points={"console_scripts": ..})`) are a
210 convenient way to specify executable scripts that should be installed along
211 with the Python package.
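For instance, a sketch of wiring an entry-point script together with
Versioneer (the module path `myproject.cli:main` is a placeholder):

````
from setuptools import setup
import versioneer

setup(
    name="myproject",
    version=versioneer.get_version(),
    cmdclass=versioneer.get_cmdclass(),
    entry_points={
        "console_scripts": [
            "myproject = myproject.cli:main",
        ]
    },
)
````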
212
213 These both work as expected when using modern setuptools. When using
214 setuptools-18.5 or earlier, however, certain operations will cause
215 `pkg_resources.DistributionNotFound` errors when running the entrypoint
216 script, which must be resolved by re-installing the package. This happens
217 when the package is installed at one version, then the egg_info data is
218 regenerated while a different version is checked out. Many setup.py commands
219 cause egg_info to be rebuilt (including `sdist`, `wheel`, and installing into
220 a different virtualenv), so this can be surprising.
221
222 [Bug #83](https://github.com/python-versioneer/python-versioneer/issues/83) describes
223 this one, but upgrading to a newer version of setuptools should probably
224 resolve it.
225
226
280227 ## Updating Versioneer
281228
282229 To upgrade your project to a new release of Versioneer, do the following:
283230
284231 * install the new Versioneer (`pip install -U versioneer` or equivalent)
285232 * edit `setup.cfg`, if necessary, to include any new configuration settings
286 indicated by the release notes
233 indicated by the release notes. See [UPGRADING](./UPGRADING.md) for details.
287234 * re-run `versioneer install` in your source tree, to replace
288235 `SRC/_version.py`
289236 * commit any changed files
290
291 ### Upgrading to 0.16
292
293 Nothing special.
294
295 ### Upgrading to 0.15
296
297 Starting with this version, Versioneer is configured with a `[versioneer]`
298 section in your `setup.cfg` file. Earlier versions required the `setup.py` to
299 set attributes on the `versioneer` module immediately after import. The new
300 version will refuse to run (raising an exception during import) until you
301 have provided the necessary `setup.cfg` section.
302
303 In addition, the Versioneer package provides an executable named
304 `versioneer`, and the installation process is driven by running `versioneer
305 install`. In 0.14 and earlier, the executable was named
306 `versioneer-installer` and was run without an argument.
307
308 ### Upgrading to 0.14
309
310 0.14 changes the format of the version string. 0.13 and earlier used
311 hyphen-separated strings like "0.11-2-g1076c97-dirty". 0.14 and beyond use a
312 plus-separated "local version" section, with dot-separated
313 components, like "0.11+2.g1076c97". PEP440-strict tools did not like the old
314 format, but should be ok with the new one.
315
316 ### Upgrading from 0.11 to 0.12
317
318 Nothing special.
319
320 ### Upgrading from 0.10 to 0.11
321
322 You must add `versioneer.VCS = "git"` to your `setup.py` before re-running
323 `setup.py setup_versioneer`. This will enable the use of additional
324 version-control systems (SVN, etc) in the future.
325237
326238 ## Future Directions
327239
336248 direction and include code from all supported VCS systems, reducing the
337249 number of intermediate scripts.
338250
251 ## Similar projects
252
253 * [setuptools_scm](https://github.com/pypa/setuptools_scm/) - a non-vendored build-time
254 dependency
253 * [miniver](https://github.com/jbweston/miniver) - a lightweight reimplementation of
256 versioneer
257 * [versioningit](https://github.com/jwodder/versioningit) - a PEP 518-based setuptools
258 plugin
339259
340260 ## License
341261
345265 Dedication" license (CC0-1.0), as described in
346266 https://creativecommons.org/publicdomain/zero/1.0/ .
347267
268 [pypi-image]: https://img.shields.io/pypi/v/versioneer.svg
269 [pypi-url]: https://pypi.python.org/pypi/versioneer/
270 [travis-image]:
271 https://img.shields.io/travis/com/python-versioneer/python-versioneer.svg
272 [travis-url]: https://travis-ci.com/github/python-versioneer/python-versioneer
273
348274 """
349
350 from __future__ import print_function
351 try:
352 import configparser
353 except ImportError:
354 import ConfigParser as configparser
275 # pylint:disable=invalid-name,import-outside-toplevel,missing-function-docstring
276 # pylint:disable=missing-class-docstring,too-many-branches,too-many-statements
277 # pylint:disable=raise-missing-from,too-many-lines,too-many-locals,import-error
278 # pylint:disable=too-few-public-methods,redefined-outer-name,consider-using-with
279 # pylint:disable=attribute-defined-outside-init,too-many-arguments
280
281 import configparser
355282 import errno
356283 import json
357284 import os
358285 import re
359286 import subprocess
360287 import sys
288 from typing import Callable, Dict
361289
362290
363291 class VersioneerConfig:
379307 setup_py = os.path.join(root, "setup.py")
380308 versioneer_py = os.path.join(root, "versioneer.py")
381309 if not (os.path.exists(setup_py) or os.path.exists(versioneer_py)):
382 err = ("Versioneer was unable to find the project root directory. "
383 "Versioneer requires setup.py to be executed from "
384 "its immediate directory (like 'python setup.py COMMAND'), "
385 "or in a way that lets it use sys.argv[0] to find the root "
386 "(like 'python path/to/setup.py COMMAND').")
310 err = (
311 "Versioneer was unable to run the project root directory. "
312 "Versioneer requires setup.py to be executed from "
313 "its immediate directory (like 'python setup.py COMMAND'), "
314 "or in a way that lets it use sys.argv[0] to find the root "
315 "(like 'python path/to/setup.py COMMAND')."
316 )
387317 raise VersioneerBadRootError(err)
388318 try:
389319 # Certain runtime workflows (setup.py install/develop in a setuptools
392322 # module-import table will cache the first one. So we can't use
393323 # os.path.dirname(__file__), as that will find whichever
394324 # versioneer.py was first imported, even in later projects.
395 me = os.path.realpath(os.path.abspath(__file__))
396 if os.path.splitext(me)[0] != os.path.splitext(versioneer_py)[0]:
397 print("Warning: build in %s is using versioneer.py from %s"
398 % (os.path.dirname(me), versioneer_py))
325 my_path = os.path.realpath(os.path.abspath(__file__))
326 me_dir = os.path.normcase(os.path.splitext(my_path)[0])
327 vsr_dir = os.path.normcase(os.path.splitext(versioneer_py)[0])
328 if me_dir != vsr_dir:
329 print(
330 "Warning: build in %s is using versioneer.py from %s"
331 % (os.path.dirname(my_path), versioneer_py)
332 )
399333 except NameError:
400334 pass
401335 return root
403337
404338 def get_config_from_root(root):
405339 """Read the project setup.cfg file to determine Versioneer config."""
406 # This might raise EnvironmentError (if setup.cfg is missing), or
340 # This might raise OSError (if setup.cfg is missing), or
407341 # configparser.NoSectionError (if it lacks a [versioneer] section), or
408342 # configparser.NoOptionError (if it lacks "VCS="). See the docstring at
409343 # the top of versioneer.py for instructions on writing your setup.cfg .
410344 setup_cfg = os.path.join(root, "setup.cfg")
411 parser = configparser.SafeConfigParser()
412 with open(setup_cfg, "r") as f:
413 parser.readfp(f)
345 parser = configparser.ConfigParser()
346 with open(setup_cfg, "r") as cfg_file:
347 parser.read_file(cfg_file)
414348 VCS = parser.get("versioneer", "VCS") # mandatory
415349
416 def get(parser, name):
417 if parser.has_option("versioneer", name):
418 return parser.get("versioneer", name)
419 return None
350 # Dict-like interface for non-mandatory entries
351 section = parser["versioneer"]
352
420353 cfg = VersioneerConfig()
421354 cfg.VCS = VCS
422 cfg.style = get(parser, "style") or ""
423 cfg.versionfile_source = get(parser, "versionfile_source")
424 cfg.versionfile_build = get(parser, "versionfile_build")
425 cfg.tag_prefix = get(parser, "tag_prefix")
355 cfg.style = section.get("style", "")
356 cfg.versionfile_source = section.get("versionfile_source")
357 cfg.versionfile_build = section.get("versionfile_build")
358 cfg.tag_prefix = section.get("tag_prefix")
426359 if cfg.tag_prefix in ("''", '""'):
427360 cfg.tag_prefix = ""
428 cfg.parentdir_prefix = get(parser, "parentdir_prefix")
429 cfg.verbose = get(parser, "verbose")
361 cfg.parentdir_prefix = section.get("parentdir_prefix")
362 cfg.verbose = section.get("verbose")
430363 return cfg
431364
432365
433366 class NotThisMethod(Exception):
434367 """Exception raised if a method is not valid for the current scenario."""
435368
369
436370 # these dictionaries contain VCS-specific tools
437 LONG_VERSION_PY = {}
438 HANDLERS = {}
371 LONG_VERSION_PY: Dict[str, str] = {}
372 HANDLERS: Dict[str, Dict[str, Callable]] = {}
439373
440374
441375 def register_vcs_handler(vcs, method): # decorator
442 """Decorator to mark a method as the handler for a particular VCS."""
376 """Create decorator to mark a method as the handler of a VCS."""
377
443378 def decorate(f):
444379 """Store f in HANDLERS[vcs][method]."""
445 if vcs not in HANDLERS:
446 HANDLERS[vcs] = {}
447 HANDLERS[vcs][method] = f
380 HANDLERS.setdefault(vcs, {})[method] = f
448381 return f
382
449383 return decorate
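# Illustrative note (not upstream code): the decorator fills the nested
# HANDLERS mapping, so a definition such as
#
#     @register_vcs_handler("git", "get_keywords")
#     def git_get_keywords(versionfile_abs): ...
#
# is stored as HANDLERS["git"]["get_keywords"] and later looked up by VCS
# name and method when computing the version.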
450384
451385
452 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False):
386 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None):
453387 """Call the given command(s)."""
454388 assert isinstance(commands, list)
455 p = None
456 for c in commands:
389 process = None
390 for command in commands:
457391 try:
458 dispcmd = str([c] + args)
392 dispcmd = str([command] + args)
459393 # remember shell=False, so use git.cmd on windows, not just git
460 p = subprocess.Popen([c] + args, cwd=cwd, stdout=subprocess.PIPE,
461 stderr=(subprocess.PIPE if hide_stderr
462 else None))
394 process = subprocess.Popen(
395 [command] + args,
396 cwd=cwd,
397 env=env,
398 stdout=subprocess.PIPE,
399 stderr=(subprocess.PIPE if hide_stderr else None),
400 )
463401 break
464 except EnvironmentError:
402 except OSError:
465403 e = sys.exc_info()[1]
466404 if e.errno == errno.ENOENT:
467405 continue
468406 if verbose:
469407 print("unable to run %s" % dispcmd)
470408 print(e)
471 return None
409 return None, None
472410 else:
473411 if verbose:
474412 print("unable to find command, tried %s" % (commands,))
475 return None
476 stdout = p.communicate()[0].strip()
477 if sys.version_info[0] >= 3:
478 stdout = stdout.decode()
479 if p.returncode != 0:
413 return None, None
414 stdout = process.communicate()[0].strip().decode()
415 if process.returncode != 0:
480416 if verbose:
481417 print("unable to run %s (error)" % dispcmd)
482 return None
483 return stdout
484 LONG_VERSION_PY['git'] = '''
418 print("stdout was %s" % stdout)
419 return None, process.returncode
420 return stdout, process.returncode
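# Illustrative note (not upstream code): callers receive a (stdout,
# returncode) pair, e.g.
#
#     out, rc = run_command(["git"], ["rev-parse", "HEAD"], cwd=root)
#
# where out is None whenever the command could not be started or exited
# with a non-zero status.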
421
422
423 LONG_VERSION_PY[
424 "git"
425 ] = r'''
485426 # This file helps to compute a version number in source trees obtained from
486427 # git-archive tarball (such as those provided by githubs download-from-tag
487428 # feature). Distribution tarballs (built by setup.py sdist) and build
489430 # that just contains the computed version number.
490431
491432 # This file is released into the public domain. Generated by
492 # versioneer-0.16 (https://github.com/warner/python-versioneer)
433 # versioneer-0.21 (https://github.com/python-versioneer/python-versioneer)
493434
494435 """Git implementation of _version.py."""
495436
498439 import re
499440 import subprocess
500441 import sys
442 from typing import Callable, Dict
501443
502444
503445 def get_keywords():
508450 # get_keywords().
509451 git_refnames = "%(DOLLAR)sFormat:%%d%(DOLLAR)s"
510452 git_full = "%(DOLLAR)sFormat:%%H%(DOLLAR)s"
511 keywords = {"refnames": git_refnames, "full": git_full}
453 git_date = "%(DOLLAR)sFormat:%%ci%(DOLLAR)s"
454 keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
512455 return keywords
513456
514457
534477 """Exception raised if a method is not valid for the current scenario."""
535478
536479
537 LONG_VERSION_PY = {}
538 HANDLERS = {}
480 LONG_VERSION_PY: Dict[str, str] = {}
481 HANDLERS: Dict[str, Dict[str, Callable]] = {}
539482
540483
541484 def register_vcs_handler(vcs, method): # decorator
542 """Decorator to mark a method as the handler for a particular VCS."""
485 """Create decorator to mark a method as the handler of a VCS."""
543486 def decorate(f):
544487 """Store f in HANDLERS[vcs][method]."""
545488 if vcs not in HANDLERS:
549492 return decorate
550493
551494
552 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False):
495 def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False,
496 env=None):
553497 """Call the given command(s)."""
554498 assert isinstance(commands, list)
555 p = None
556 for c in commands:
499 process = None
500 for command in commands:
557501 try:
558 dispcmd = str([c] + args)
502 dispcmd = str([command] + args)
559503 # remember shell=False, so use git.cmd on windows, not just git
560 p = subprocess.Popen([c] + args, cwd=cwd, stdout=subprocess.PIPE,
561 stderr=(subprocess.PIPE if hide_stderr
562 else None))
504 process = subprocess.Popen([command] + args, cwd=cwd, env=env,
505 stdout=subprocess.PIPE,
506 stderr=(subprocess.PIPE if hide_stderr
507 else None))
563508 break
564 except EnvironmentError:
509 except OSError:
565510 e = sys.exc_info()[1]
566511 if e.errno == errno.ENOENT:
567512 continue
568513 if verbose:
569514 print("unable to run %%s" %% dispcmd)
570515 print(e)
571 return None
516 return None, None
572517 else:
573518 if verbose:
574519 print("unable to find command, tried %%s" %% (commands,))
575 return None
576 stdout = p.communicate()[0].strip()
577 if sys.version_info[0] >= 3:
578 stdout = stdout.decode()
579 if p.returncode != 0:
520 return None, None
521 stdout = process.communicate()[0].strip().decode()
522 if process.returncode != 0:
580523 if verbose:
581524 print("unable to run %%s (error)" %% dispcmd)
582 return None
583 return stdout
525 print("stdout was %%s" %% stdout)
526 return None, process.returncode
527 return stdout, process.returncode
584528
585529
586530 def versions_from_parentdir(parentdir_prefix, root, verbose):
587531 """Try to determine the version from the parent directory name.
588532
589 Source tarballs conventionally unpack into a directory that includes
590 both the project name and a version string.
591 """
592 dirname = os.path.basename(root)
593 if not dirname.startswith(parentdir_prefix):
594 if verbose:
595 print("guessing rootdir is '%%s', but '%%s' doesn't start with "
596 "prefix '%%s'" %% (root, dirname, parentdir_prefix))
597 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
598 return {"version": dirname[len(parentdir_prefix):],
599 "full-revisionid": None,
600 "dirty": False, "error": None}
533 Source tarballs conventionally unpack into a directory that includes both
534 the project name and a version string. We will also support searching up
535 two directory levels for an appropriately named parent directory
536 """
537 rootdirs = []
538
539 for _ in range(3):
540 dirname = os.path.basename(root)
541 if dirname.startswith(parentdir_prefix):
542 return {"version": dirname[len(parentdir_prefix):],
543 "full-revisionid": None,
544 "dirty": False, "error": None, "date": None}
545 rootdirs.append(root)
546 root = os.path.dirname(root) # up a level
547
548 if verbose:
549 print("Tried directories %%s but none started with prefix %%s" %%
550 (str(rootdirs), parentdir_prefix))
551 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
601552
602553
603554 @register_vcs_handler("git", "get_keywords")
609560 # _version.py.
610561 keywords = {}
611562 try:
612 f = open(versionfile_abs, "r")
613 for line in f.readlines():
614 if line.strip().startswith("git_refnames ="):
615 mo = re.search(r'=\s*"(.*)"', line)
616 if mo:
617 keywords["refnames"] = mo.group(1)
618 if line.strip().startswith("git_full ="):
619 mo = re.search(r'=\s*"(.*)"', line)
620 if mo:
621 keywords["full"] = mo.group(1)
622 f.close()
623 except EnvironmentError:
563 with open(versionfile_abs, "r") as fobj:
564 for line in fobj:
565 if line.strip().startswith("git_refnames ="):
566 mo = re.search(r'=\s*"(.*)"', line)
567 if mo:
568 keywords["refnames"] = mo.group(1)
569 if line.strip().startswith("git_full ="):
570 mo = re.search(r'=\s*"(.*)"', line)
571 if mo:
572 keywords["full"] = mo.group(1)
573 if line.strip().startswith("git_date ="):
574 mo = re.search(r'=\s*"(.*)"', line)
575 if mo:
576 keywords["date"] = mo.group(1)
577 except OSError:
624578 pass
625579 return keywords
626580
628582 @register_vcs_handler("git", "keywords")
629583 def git_versions_from_keywords(keywords, tag_prefix, verbose):
630584 """Get version information from git keywords."""
631 if not keywords:
632 raise NotThisMethod("no keywords at all, weird")
585 if "refnames" not in keywords:
586 raise NotThisMethod("Short version file found")
587 date = keywords.get("date")
588 if date is not None:
589 # Use only the last line. Previous lines may contain GPG signature
590 # information.
591 date = date.splitlines()[-1]
592
593 # git-2.2.0 added "%%cI", which expands to an ISO-8601 -compliant
594 # datestamp. However we prefer "%%ci" (which expands to an "ISO-8601
595 # -like" string, which we must then edit to make compliant), because
596 # it's been around since git-1.5.3, and it's too difficult to
597 # discover which version we're using, or to work around using an
598 # older one.
599 date = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
633600 refnames = keywords["refnames"].strip()
634601 if refnames.startswith("$Format"):
635602 if verbose:
636603 print("keywords are unexpanded, not using")
637604 raise NotThisMethod("unexpanded keywords, not a git-archive tarball")
638 refs = set([r.strip() for r in refnames.strip("()").split(",")])
605 refs = {r.strip() for r in refnames.strip("()").split(",")}
639606 # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of
640607 # just "foo-1.0". If we see a "tag: " prefix, prefer those.
641608 TAG = "tag: "
642 tags = set([r[len(TAG):] for r in refs if r.startswith(TAG)])
609 tags = {r[len(TAG):] for r in refs if r.startswith(TAG)}
643610 if not tags:
644611 # Either we're using git < 1.8.3, or there really are no tags. We use
645612 # a heuristic: assume all version tags have a digit. The old git %%d
648615 # between branches and tags. By ignoring refnames without digits, we
649616 # filter out many common branch names like "release" and
650617 # "stabilization", as well as "HEAD" and "master".
651 tags = set([r for r in refs if re.search(r'\d', r)])
618 tags = {r for r in refs if re.search(r'\d', r)}
652619 if verbose:
653 print("discarding '%%s', no digits" %% ",".join(refs-tags))
620 print("discarding '%%s', no digits" %% ",".join(refs - tags))
654621 if verbose:
655622 print("likely tags: %%s" %% ",".join(sorted(tags)))
656623 for ref in sorted(tags):
657624 # sorting will prefer e.g. "2.0" over "2.0rc1"
658625 if ref.startswith(tag_prefix):
659626 r = ref[len(tag_prefix):]
627 # Filter out refs that exactly match prefix or that don't start
628 # with a number once the prefix is stripped (mostly a concern
629 # when prefix is '')
630 if not re.match(r'\d', r):
631 continue
660632 if verbose:
661633 print("picking %%s" %% r)
662634 return {"version": r,
663635 "full-revisionid": keywords["full"].strip(),
664 "dirty": False, "error": None
665 }
636 "dirty": False, "error": None,
637 "date": date}
666638 # no suitable tags, so version is "0+unknown", but full hex is still there
667639 if verbose:
668640 print("no suitable tags, using unknown + full revision id")
669641 return {"version": "0+unknown",
670642 "full-revisionid": keywords["full"].strip(),
671 "dirty": False, "error": "no suitable tags"}
643 "dirty": False, "error": "no suitable tags", "date": None}
672644
673645
674646 @register_vcs_handler("git", "pieces_from_vcs")
675 def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
647 def git_pieces_from_vcs(tag_prefix, root, verbose, runner=run_command):
676648 """Get version from 'git describe' in the root of the source tree.
677649
678650 This only gets called if the git-archive 'subst' keywords were *not*
679651 expanded, and _version.py hasn't already been rewritten with a short
680652 version string, meaning we're inside a checked out source tree.
681653 """
682 if not os.path.exists(os.path.join(root, ".git")):
683 if verbose:
684 print("no .git in %%s" %% root)
685 raise NotThisMethod("no .git directory")
686
687654 GITS = ["git"]
655 TAG_PREFIX_REGEX = "*"
688656 if sys.platform == "win32":
689657 GITS = ["git.cmd", "git.exe"]
658 TAG_PREFIX_REGEX = r"\*"
659
660 _, rc = runner(GITS, ["rev-parse", "--git-dir"], cwd=root,
661 hide_stderr=True)
662 if rc != 0:
663 if verbose:
664 print("Directory %%s not under git control" %% root)
665 raise NotThisMethod("'git rev-parse --git-dir' returned error")
666
690667 # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty]
691668 # if there isn't one, this yields HEX[-dirty] (no NUM)
692 describe_out = run_command(GITS, ["describe", "--tags", "--dirty",
693 "--always", "--long",
694 "--match", "%%s*" %% tag_prefix],
695 cwd=root)
669 describe_out, rc = runner(GITS, ["describe", "--tags", "--dirty",
670 "--always", "--long",
671 "--match",
672 "%%s%%s" %% (tag_prefix, TAG_PREFIX_REGEX)],
673 cwd=root)
696674 # --long was added in git-1.5.5
697675 if describe_out is None:
698676 raise NotThisMethod("'git describe' failed")
699677 describe_out = describe_out.strip()
700 full_out = run_command(GITS, ["rev-parse", "HEAD"], cwd=root)
678 full_out, rc = runner(GITS, ["rev-parse", "HEAD"], cwd=root)
701679 if full_out is None:
702680 raise NotThisMethod("'git rev-parse' failed")
703681 full_out = full_out.strip()
706684 pieces["long"] = full_out
707685 pieces["short"] = full_out[:7] # maybe improved later
708686 pieces["error"] = None
687
688 branch_name, rc = runner(GITS, ["rev-parse", "--abbrev-ref", "HEAD"],
689 cwd=root)
690 # --abbrev-ref was added in git-1.6.3
691 if rc != 0 or branch_name is None:
692 raise NotThisMethod("'git rev-parse --abbrev-ref' returned error")
693 branch_name = branch_name.strip()
694
695 if branch_name == "HEAD":
696 # If we aren't exactly on a branch, pick a branch which represents
697 # the current commit. If all else fails, we are on a branchless
698 # commit.
699 branches, rc = runner(GITS, ["branch", "--contains"], cwd=root)
700 # --contains was added in git-1.5.4
701 if rc != 0 or branches is None:
702 raise NotThisMethod("'git branch --contains' returned error")
703 branches = branches.split("\n")
704
705 # Remove the first line if we're running detached
706 if "(" in branches[0]:
707 branches.pop(0)
708
709 # Strip off the leading "* " from the list of branches.
710 branches = [branch[2:] for branch in branches]
711 if "master" in branches:
712 branch_name = "master"
713 elif not branches:
714 branch_name = None
715 else:
716 # Pick the first branch that is returned. Good or bad.
717 branch_name = branches[0]
718
719 pieces["branch"] = branch_name
709720
710721 # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
711722 # TAG might have hyphens.
723734 # TAG-NUM-gHEX
724735 mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
725736 if not mo:
726 # unparseable. Maybe git-describe is misbehaving?
737 # unparsable. Maybe git-describe is misbehaving?
727738 pieces["error"] = ("unable to parse git-describe output: '%%s'"
728739 %% describe_out)
729740 return pieces
748759 else:
749760 # HEX: no tags
750761 pieces["closest-tag"] = None
751 count_out = run_command(GITS, ["rev-list", "HEAD", "--count"],
752 cwd=root)
762 count_out, rc = runner(GITS, ["rev-list", "HEAD", "--count"], cwd=root)
753763 pieces["distance"] = int(count_out) # total number of commits
764
765 # commit date: see ISO-8601 comment in git_versions_from_keywords()
766 date = runner(GITS, ["show", "-s", "--format=%%ci", "HEAD"], cwd=root)[0].strip()
767 # Use only the last line. Previous lines may contain GPG signature
768 # information.
769 date = date.splitlines()[-1]
770 pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
754771
755772 return pieces
756773
787804 return rendered
788805
789806
790 def render_pep440_pre(pieces):
791 """TAG[.post.devDISTANCE] -- No -dirty.
807 def render_pep440_branch(pieces):
808 """TAG[[.dev0]+DISTANCE.gHEX[.dirty]] .
809
810 The ".dev0" means not master branch. Note that .dev0 sorts backwards
811 (a feature branch will appear "older" than the master branch).
792812
793813 Exceptions:
794 1: no tags. 0.post.devDISTANCE
814 1: no tags. 0[.dev0]+untagged.DISTANCE.gHEX[.dirty]
795815 """
796816 if pieces["closest-tag"]:
797817 rendered = pieces["closest-tag"]
818 if pieces["distance"] or pieces["dirty"]:
819 if pieces["branch"] != "master":
820 rendered += ".dev0"
821 rendered += plus_or_dot(pieces)
822 rendered += "%%d.g%%s" %% (pieces["distance"], pieces["short"])
823 if pieces["dirty"]:
824 rendered += ".dirty"
825 else:
826 # exception #1
827 rendered = "0"
828 if pieces["branch"] != "master":
829 rendered += ".dev0"
830 rendered += "+untagged.%%d.g%%s" %% (pieces["distance"],
831 pieces["short"])
832 if pieces["dirty"]:
833 rendered += ".dirty"
834 return rendered
835
836
837 def pep440_split_post(ver):
838 """Split pep440 version string at the post-release segment.
839
840 Returns the release segments before the post-release and the
841 post-release version number (or None if no post-release segment is present).
842 """
843 vc = str.split(ver, ".post")
844 return vc[0], int(vc[1] or 0) if len(vc) == 2 else None
845
846
847 def render_pep440_pre(pieces):
848 """TAG[.postN.devDISTANCE] -- No -dirty.
849
850 Exceptions:
851 1: no tags. 0.post0.devDISTANCE
852 """
853 if pieces["closest-tag"]:
798854 if pieces["distance"]:
799 rendered += ".post.dev%%d" %% pieces["distance"]
855 # update the post release segment
856 tag_version, post_version = pep440_split_post(pieces["closest-tag"])
857 rendered = tag_version
858 if post_version is not None:
859 rendered += ".post%%d.dev%%d" %% (post_version+1, pieces["distance"])
860 else:
861 rendered += ".post0.dev%%d" %% (pieces["distance"])
862 else:
863 # no commits, use the tag as the version
864 rendered = pieces["closest-tag"]
800865 else:
801866 # exception #1
802 rendered = "0.post.dev%%d" %% pieces["distance"]
867 rendered = "0.post0.dev%%d" %% pieces["distance"]
803868 return rendered
804869
805870
830895 return rendered
831896
832897
898 def render_pep440_post_branch(pieces):
899 """TAG[.postDISTANCE[.dev0]+gHEX[.dirty]] .
900
901 The ".dev0" means not master branch.
902
903 Exceptions:
904 1: no tags. 0.postDISTANCE[.dev0]+gHEX[.dirty]
905 """
906 if pieces["closest-tag"]:
907 rendered = pieces["closest-tag"]
908 if pieces["distance"] or pieces["dirty"]:
909 rendered += ".post%%d" %% pieces["distance"]
910 if pieces["branch"] != "master":
911 rendered += ".dev0"
912 rendered += plus_or_dot(pieces)
913 rendered += "g%%s" %% pieces["short"]
914 if pieces["dirty"]:
915 rendered += ".dirty"
916 else:
917 # exception #1
918 rendered = "0.post%%d" %% pieces["distance"]
919 if pieces["branch"] != "master":
920 rendered += ".dev0"
921 rendered += "+g%%s" %% pieces["short"]
922 if pieces["dirty"]:
923 rendered += ".dirty"
924 return rendered
925
926
833927 def render_pep440_old(pieces):
834928 """TAG[.postDISTANCE[.dev0]] .
835929
836930 The ".dev0" means dirty.
837931
838 Eexceptions:
932 Exceptions:
839933 1: no tags. 0.postDISTANCE[.dev0]
840934 """
841935 if pieces["closest-tag"]:
898992 return {"version": "unknown",
899993 "full-revisionid": pieces.get("long"),
900994 "dirty": None,
901 "error": pieces["error"]}
995 "error": pieces["error"],
996 "date": None}
902997
903998 if not style or style == "default":
904999 style = "pep440" # the default
9051000
9061001 if style == "pep440":
9071002 rendered = render_pep440(pieces)
1003 elif style == "pep440-branch":
1004 rendered = render_pep440_branch(pieces)
9081005 elif style == "pep440-pre":
9091006 rendered = render_pep440_pre(pieces)
9101007 elif style == "pep440-post":
9111008 rendered = render_pep440_post(pieces)
1009 elif style == "pep440-post-branch":
1010 rendered = render_pep440_post_branch(pieces)
9121011 elif style == "pep440-old":
9131012 rendered = render_pep440_old(pieces)
9141013 elif style == "git-describe":
9191018 raise ValueError("unknown style '%%s'" %% style)
9201019
9211020 return {"version": rendered, "full-revisionid": pieces["long"],
922 "dirty": pieces["dirty"], "error": None}
1021 "dirty": pieces["dirty"], "error": None,
1022 "date": pieces.get("date")}
9231023
9241024
9251025 def get_versions():
9431043 # versionfile_source is the relative path from the top of the source
9441044 # tree (where the .git directory might live) to this file. Invert
9451045 # this to find the root from __file__.
946 for i in cfg.versionfile_source.split('/'):
1046 for _ in cfg.versionfile_source.split('/'):
9471047 root = os.path.dirname(root)
9481048 except NameError:
9491049 return {"version": "0+unknown", "full-revisionid": None,
9501050 "dirty": None,
951 "error": "unable to find root of source tree"}
1051 "error": "unable to find root of source tree",
1052 "date": None}
9521053
9531054 try:
9541055 pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose)
9641065
9651066 return {"version": "0+unknown", "full-revisionid": None,
9661067 "dirty": None,
967 "error": "unable to compute version"}
1068 "error": "unable to compute version", "date": None}
9681069 '''
9691070
9701071
9771078 # _version.py.
9781079 keywords = {}
9791080 try:
980 f = open(versionfile_abs, "r")
981 for line in f.readlines():
982 if line.strip().startswith("git_refnames ="):
983 mo = re.search(r'=\s*"(.*)"', line)
984 if mo:
985 keywords["refnames"] = mo.group(1)
986 if line.strip().startswith("git_full ="):
987 mo = re.search(r'=\s*"(.*)"', line)
988 if mo:
989 keywords["full"] = mo.group(1)
990 f.close()
991 except EnvironmentError:
1081 with open(versionfile_abs, "r") as fobj:
1082 for line in fobj:
1083 if line.strip().startswith("git_refnames ="):
1084 mo = re.search(r'=\s*"(.*)"', line)
1085 if mo:
1086 keywords["refnames"] = mo.group(1)
1087 if line.strip().startswith("git_full ="):
1088 mo = re.search(r'=\s*"(.*)"', line)
1089 if mo:
1090 keywords["full"] = mo.group(1)
1091 if line.strip().startswith("git_date ="):
1092 mo = re.search(r'=\s*"(.*)"', line)
1093 if mo:
1094 keywords["date"] = mo.group(1)
1095 except OSError:
9921096 pass
9931097 return keywords
9941098
9961100 @register_vcs_handler("git", "keywords")
9971101 def git_versions_from_keywords(keywords, tag_prefix, verbose):
9981102 """Get version information from git keywords."""
999 if not keywords:
1000 raise NotThisMethod("no keywords at all, weird")
1103 if "refnames" not in keywords:
1104 raise NotThisMethod("Short version file found")
1105 date = keywords.get("date")
1106 if date is not None:
1107 # Use only the last line. Previous lines may contain GPG signature
1108 # information.
1109 date = date.splitlines()[-1]
1110
1111 # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant
1112 # datestamp. However we prefer "%ci" (which expands to an "ISO-8601
1113 # -like" string, which we must then edit to make compliant), because
1114 # it's been around since git-1.5.3, and it's too difficult to
1115 # discover which version we're using, or to work around using an
1116 # older one.
1117 date = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
10011118 refnames = keywords["refnames"].strip()
10021119 if refnames.startswith("$Format"):
10031120 if verbose:
10041121 print("keywords are unexpanded, not using")
10051122 raise NotThisMethod("unexpanded keywords, not a git-archive tarball")
1006 refs = set([r.strip() for r in refnames.strip("()").split(",")])
1123 refs = {r.strip() for r in refnames.strip("()").split(",")}
10071124 # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of
10081125 # just "foo-1.0". If we see a "tag: " prefix, prefer those.
10091126 TAG = "tag: "
1010 tags = set([r[len(TAG):] for r in refs if r.startswith(TAG)])
1127 tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)}
10111128 if not tags:
10121129 # Either we're using git < 1.8.3, or there really are no tags. We use
10131130 # a heuristic: assume all version tags have a digit. The old git %d
10161133 # between branches and tags. By ignoring refnames without digits, we
10171134 # filter out many common branch names like "release" and
10181135 # "stabilization", as well as "HEAD" and "master".
1019 tags = set([r for r in refs if re.search(r'\d', r)])
1136 tags = {r for r in refs if re.search(r"\d", r)}
10201137 if verbose:
1021 print("discarding '%s', no digits" % ",".join(refs-tags))
1138 print("discarding '%s', no digits" % ",".join(refs - tags))
10221139 if verbose:
10231140 print("likely tags: %s" % ",".join(sorted(tags)))
10241141 for ref in sorted(tags):
10251142 # sorting will prefer e.g. "2.0" over "2.0rc1"
10261143 if ref.startswith(tag_prefix):
1027 r = ref[len(tag_prefix):]
1144 r = ref[len(tag_prefix) :]
1145 # Filter out refs that exactly match prefix or that don't start
1146 # with a number once the prefix is stripped (mostly a concern
1147 # when prefix is '')
1148 if not re.match(r"\d", r):
1149 continue
10281150 if verbose:
10291151 print("picking %s" % r)
1030 return {"version": r,
1031 "full-revisionid": keywords["full"].strip(),
1032 "dirty": False, "error": None
1033 }
1152 return {
1153 "version": r,
1154 "full-revisionid": keywords["full"].strip(),
1155 "dirty": False,
1156 "error": None,
1157 "date": date,
1158 }
10341159 # no suitable tags, so version is "0+unknown", but full hex is still there
10351160 if verbose:
10361161 print("no suitable tags, using unknown + full revision id")
1037 return {"version": "0+unknown",
1038 "full-revisionid": keywords["full"].strip(),
1039 "dirty": False, "error": "no suitable tags"}
1162 return {
1163 "version": "0+unknown",
1164 "full-revisionid": keywords["full"].strip(),
1165 "dirty": False,
1166 "error": "no suitable tags",
1167 "date": None,
1168 }
10401169
10411170
10421171 @register_vcs_handler("git", "pieces_from_vcs")
1043 def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
1172 def git_pieces_from_vcs(tag_prefix, root, verbose, runner=run_command):
10441173 """Get version from 'git describe' in the root of the source tree.
10451174
10461175 This only gets called if the git-archive 'subst' keywords were *not*
10471176 expanded, and _version.py hasn't already been rewritten with a short
10481177 version string, meaning we're inside a checked out source tree.
10491178 """
1050 if not os.path.exists(os.path.join(root, ".git")):
1051 if verbose:
1052 print("no .git in %s" % root)
1053 raise NotThisMethod("no .git directory")
1054
10551179 GITS = ["git"]
1180 TAG_PREFIX_REGEX = "*"
10561181 if sys.platform == "win32":
10571182 GITS = ["git.cmd", "git.exe"]
1183 TAG_PREFIX_REGEX = r"\*"
1184
1185 _, rc = runner(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True)
1186 if rc != 0:
1187 if verbose:
1188 print("Directory %s not under git control" % root)
1189 raise NotThisMethod("'git rev-parse --git-dir' returned error")
1190
10581191 # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty]
10591192 # if there isn't one, this yields HEX[-dirty] (no NUM)
1060 describe_out = run_command(GITS, ["describe", "--tags", "--dirty",
1061 "--always", "--long",
1062 "--match", "%s*" % tag_prefix],
1063 cwd=root)
1193 describe_out, rc = runner(
1194 GITS,
1195 [
1196 "describe",
1197 "--tags",
1198 "--dirty",
1199 "--always",
1200 "--long",
1201 "--match",
1202 "%s%s" % (tag_prefix, TAG_PREFIX_REGEX),
1203 ],
1204 cwd=root,
1205 )
10641206 # --long was added in git-1.5.5
10651207 if describe_out is None:
10661208 raise NotThisMethod("'git describe' failed")
10671209 describe_out = describe_out.strip()
1068 full_out = run_command(GITS, ["rev-parse", "HEAD"], cwd=root)
1210 full_out, rc = runner(GITS, ["rev-parse", "HEAD"], cwd=root)
10691211 if full_out is None:
10701212 raise NotThisMethod("'git rev-parse' failed")
10711213 full_out = full_out.strip()
10751217 pieces["short"] = full_out[:7] # maybe improved later
10761218 pieces["error"] = None
10771219
1220 branch_name, rc = runner(GITS, ["rev-parse", "--abbrev-ref", "HEAD"], cwd=root)
1221 # --abbrev-ref was added in git-1.6.3
1222 if rc != 0 or branch_name is None:
1223 raise NotThisMethod("'git rev-parse --abbrev-ref' returned error")
1224 branch_name = branch_name.strip()
1225
1226 if branch_name == "HEAD":
1227 # If we aren't exactly on a branch, pick a branch which represents
1228 # the current commit. If all else fails, we are on a branchless
1229 # commit.
1230 branches, rc = runner(GITS, ["branch", "--contains"], cwd=root)
1231 # --contains was added in git-1.5.4
1232 if rc != 0 or branches is None:
1233 raise NotThisMethod("'git branch --contains' returned error")
1234 branches = branches.split("\n")
1235
1236 # Remove the first line if we're running detached
1237 if "(" in branches[0]:
1238 branches.pop(0)
1239
1240 # Strip off the leading "* " from the list of branches.
1241 branches = [branch[2:] for branch in branches]
1242 if "master" in branches:
1243 branch_name = "master"
1244 elif not branches:
1245 branch_name = None
1246 else:
1247 # Pick the first branch that is returned. Good or bad.
1248 branch_name = branches[0]
1249
1250 pieces["branch"] = branch_name
1251
10781252 # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
10791253 # TAG might have hyphens.
10801254 git_describe = describe_out
10831257 dirty = git_describe.endswith("-dirty")
10841258 pieces["dirty"] = dirty
10851259 if dirty:
1086 git_describe = git_describe[:git_describe.rindex("-dirty")]
1260 git_describe = git_describe[: git_describe.rindex("-dirty")]
10871261
10881262 # now we have TAG-NUM-gHEX or HEX
10891263
10901264 if "-" in git_describe:
10911265 # TAG-NUM-gHEX
1092 mo = re.search(r'^(.+)-(\d+)-g([0-9a-f]+)$', git_describe)
1266 mo = re.search(r"^(.+)-(\d+)-g([0-9a-f]+)$", git_describe)
10931267 if not mo:
1094 # unparseable. Maybe git-describe is misbehaving?
1095 pieces["error"] = ("unable to parse git-describe output: '%s'"
1096 % describe_out)
1268 # unparsable. Maybe git-describe is misbehaving?
1269 pieces["error"] = "unable to parse git-describe output: '%s'" % describe_out
10971270 return pieces
10981271
10991272 # tag
11021275 if verbose:
11031276 fmt = "tag '%s' doesn't start with prefix '%s'"
11041277 print(fmt % (full_tag, tag_prefix))
1105 pieces["error"] = ("tag '%s' doesn't start with prefix '%s'"
1106 % (full_tag, tag_prefix))
1278 pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % (
1279 full_tag,
1280 tag_prefix,
1281 )
11071282 return pieces
1108 pieces["closest-tag"] = full_tag[len(tag_prefix):]
1283 pieces["closest-tag"] = full_tag[len(tag_prefix) :]
11091284
11101285 # distance: number of commits since tag
11111286 pieces["distance"] = int(mo.group(2))
11161291 else:
11171292 # HEX: no tags
11181293 pieces["closest-tag"] = None
1119 count_out = run_command(GITS, ["rev-list", "HEAD", "--count"],
1120 cwd=root)
1294 count_out, rc = runner(GITS, ["rev-list", "HEAD", "--count"], cwd=root)
11211295 pieces["distance"] = int(count_out) # total number of commits
1296
1297 # commit date: see ISO-8601 comment in git_versions_from_keywords()
1298 date = runner(GITS, ["show", "-s", "--format=%ci", "HEAD"], cwd=root)[0].strip()
1299 # Use only the last line. Previous lines may contain GPG signature
1300 # information.
1301 date = date.splitlines()[-1]
1302 pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
11221303
11231304 return pieces
11241305
11271308 """Git-specific installation logic for Versioneer.
11281309
11291310 For Git, this means creating/changing .gitattributes to mark _version.py
1130 for export-time keyword substitution.
1311 for export-subst keyword substitution.
11311312 """
11321313 GITS = ["git"]
11331314 if sys.platform == "win32":
11361317 if ipy:
11371318 files.append(ipy)
11381319 try:
1139 me = __file__
1140 if me.endswith(".pyc") or me.endswith(".pyo"):
1141 me = os.path.splitext(me)[0] + ".py"
1142 versioneer_file = os.path.relpath(me)
1320 my_path = __file__
1321 if my_path.endswith(".pyc") or my_path.endswith(".pyo"):
1322 my_path = os.path.splitext(my_path)[0] + ".py"
1323 versioneer_file = os.path.relpath(my_path)
11431324 except NameError:
11441325 versioneer_file = "versioneer.py"
11451326 files.append(versioneer_file)
11461327 present = False
11471328 try:
1148 f = open(".gitattributes", "r")
1149 for line in f.readlines():
1150 if line.strip().startswith(versionfile_source):
1151 if "export-subst" in line.strip().split()[1:]:
1152 present = True
1153 f.close()
1154 except EnvironmentError:
1329 with open(".gitattributes", "r") as fobj:
1330 for line in fobj:
1331 if line.strip().startswith(versionfile_source):
1332 if "export-subst" in line.strip().split()[1:]:
1333 present = True
1334 break
1335 except OSError:
11551336 pass
11561337 if not present:
1157 f = open(".gitattributes", "a+")
1158 f.write("%s export-subst\n" % versionfile_source)
1159 f.close()
1338 with open(".gitattributes", "a+") as fobj:
1339 fobj.write(f"{versionfile_source} export-subst\n")
11601340 files.append(".gitattributes")
11611341 run_command(GITS, ["add", "--"] + files)
11621342
11641344 def versions_from_parentdir(parentdir_prefix, root, verbose):
11651345 """Try to determine the version from the parent directory name.
11661346
1167 Source tarballs conventionally unpack into a directory that includes
1168 both the project name and a version string.
1169 """
1170 dirname = os.path.basename(root)
1171 if not dirname.startswith(parentdir_prefix):
1172 if verbose:
1173 print("guessing rootdir is '%s', but '%s' doesn't start with "
1174 "prefix '%s'" % (root, dirname, parentdir_prefix))
1175 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
1176 return {"version": dirname[len(parentdir_prefix):],
1177 "full-revisionid": None,
1178 "dirty": False, "error": None}
1347 Source tarballs conventionally unpack into a directory that includes both
1348 the project name and a version string. We will also support searching up
1349 two directory levels for an appropriately named parent directory
1350 """
1351 rootdirs = []
1352
1353 for _ in range(3):
1354 dirname = os.path.basename(root)
1355 if dirname.startswith(parentdir_prefix):
1356 return {
1357 "version": dirname[len(parentdir_prefix) :],
1358 "full-revisionid": None,
1359 "dirty": False,
1360 "error": None,
1361 "date": None,
1362 }
1363 rootdirs.append(root)
1364 root = os.path.dirname(root) # up a level
1365
1366 if verbose:
1367 print(
1368 "Tried directories %s but none started with prefix %s"
1369 % (str(rootdirs), parentdir_prefix)
1370 )
1371 raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
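# Illustrative note (not upstream code): with parentdir_prefix "myproject-"
# and root ".../myproject-1.2.0/build/stage", the loop above checks "stage",
# then "build", then "myproject-1.2.0", so a suitably named directory up to
# two levels above the starting root is still recognized.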
1372
11791373
11801374 SHORT_VERSION_PY = """
1181 # This file was generated by 'versioneer.py' (0.16) from
1375 # This file was generated by 'versioneer.py' (0.21) from
11821376 # revision-control system data, or from the parent directory name of an
11831377 # unpacked source archive. Distribution tarballs contain a pre-generated copy
11841378 # of this file.
11851379
11861380 import json
1187 import sys
11881381
11891382 version_json = '''
11901383 %s
12011394 try:
12021395 with open(filename) as f:
12031396 contents = f.read()
1204 except EnvironmentError:
1397 except OSError:
12051398 raise NotThisMethod("unable to read _version.py")
1206 mo = re.search(r"version_json = '''\n(.*)''' # END VERSION_JSON",
1207 contents, re.M | re.S)
1399 mo = re.search(
1400 r"version_json = '''\n(.*)''' # END VERSION_JSON", contents, re.M | re.S
1401 )
1402 if not mo:
1403 mo = re.search(
1404 r"version_json = '''\r\n(.*)''' # END VERSION_JSON", contents, re.M | re.S
1405 )
12081406 if not mo:
12091407 raise NotThisMethod("no version_json in _version.py")
12101408 return json.loads(mo.group(1))
12131411 def write_to_version_file(filename, versions):
12141412 """Write the given version number to the given _version.py file."""
12151413 os.unlink(filename)
1216 contents = json.dumps(versions, sort_keys=True,
1217 indent=1, separators=(",", ": "))
1414 contents = json.dumps(versions, sort_keys=True, indent=1, separators=(",", ": "))
12181415 with open(filename, "w") as f:
12191416 f.write(SHORT_VERSION_PY % contents)
12201417
12461443 rendered += ".dirty"
12471444 else:
12481445 # exception #1
1249 rendered = "0+untagged.%d.g%s" % (pieces["distance"],
1250 pieces["short"])
1446 rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
12511447 if pieces["dirty"]:
12521448 rendered += ".dirty"
12531449 return rendered
12541450
12551451
1256 def render_pep440_pre(pieces):
1257 """TAG[.post.devDISTANCE] -- No -dirty.
1452 def render_pep440_branch(pieces):
1453 """TAG[[.dev0]+DISTANCE.gHEX[.dirty]] .
1454
1455 The ".dev0" means not master branch. Note that .dev0 sorts backwards
1456 (a feature branch will appear "older" than the master branch).
12581457
12591458 Exceptions:
1260 1: no tags. 0.post.devDISTANCE
1459 1: no tags. 0[.dev0]+untagged.DISTANCE.gHEX[.dirty]
12611460 """
12621461 if pieces["closest-tag"]:
12631462 rendered = pieces["closest-tag"]
1463 if pieces["distance"] or pieces["dirty"]:
1464 if pieces["branch"] != "master":
1465 rendered += ".dev0"
1466 rendered += plus_or_dot(pieces)
1467 rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
1468 if pieces["dirty"]:
1469 rendered += ".dirty"
1470 else:
1471 # exception #1
1472 rendered = "0"
1473 if pieces["branch"] != "master":
1474 rendered += ".dev0"
1475 rendered += "+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
1476 if pieces["dirty"]:
1477 rendered += ".dirty"
1478 return rendered
1479
1480
1481 def pep440_split_post(ver):
1482 """Split pep440 version string at the post-release segment.
1483
1484 Returns the release segments before the post-release and the
1485 post-release version number (or None if no post-release segment is present).
1486 """
1487 vc = str.split(ver, ".post")
1488 return vc[0], int(vc[1] or 0) if len(vc) == 2 else None
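# Illustrative examples (not upstream code):
#   pep440_split_post("1.2.post3") == ("1.2", 3)
#   pep440_split_post("1.2")       == ("1.2", None)
#   pep440_split_post("1.2.post")  == ("1.2", 0)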
1489
1490
1491 def render_pep440_pre(pieces):
1492 """TAG[.postN.devDISTANCE] -- No -dirty.
1493
1494 Exceptions:
1495 1: no tags. 0.post0.devDISTANCE
1496 """
1497 if pieces["closest-tag"]:
12641498 if pieces["distance"]:
1265 rendered += ".post.dev%d" % pieces["distance"]
1499 # update the post release segment
1500 tag_version, post_version = pep440_split_post(pieces["closest-tag"])
1501 rendered = tag_version
1502 if post_version is not None:
1503 rendered += ".post%d.dev%d" % (post_version + 1, pieces["distance"])
1504 else:
1505 rendered += ".post0.dev%d" % (pieces["distance"])
1506 else:
1507 # no commits, use the tag as the version
1508 rendered = pieces["closest-tag"]
12661509 else:
12671510 # exception #1
1268 rendered = "0.post.dev%d" % pieces["distance"]
1511 rendered = "0.post0.dev%d" % pieces["distance"]
12691512 return rendered
12701513
12711514
12961539 return rendered
12971540
12981541
1542 def render_pep440_post_branch(pieces):
1543 """TAG[.postDISTANCE[.dev0]+gHEX[.dirty]] .
1544
1545 The ".dev0" means not master branch.
1546
1547 Exceptions:
1548 1: no tags. 0.postDISTANCE[.dev0]+gHEX[.dirty]
1549 """
1550 if pieces["closest-tag"]:
1551 rendered = pieces["closest-tag"]
1552 if pieces["distance"] or pieces["dirty"]:
1553 rendered += ".post%d" % pieces["distance"]
1554 if pieces["branch"] != "master":
1555 rendered += ".dev0"
1556 rendered += plus_or_dot(pieces)
1557 rendered += "g%s" % pieces["short"]
1558 if pieces["dirty"]:
1559 rendered += ".dirty"
1560 else:
1561 # exception #1
1562 rendered = "0.post%d" % pieces["distance"]
1563 if pieces["branch"] != "master":
1564 rendered += ".dev0"
1565 rendered += "+g%s" % pieces["short"]
1566 if pieces["dirty"]:
1567 rendered += ".dirty"
1568 return rendered
1569
1570
12991571 def render_pep440_old(pieces):
13001572 """TAG[.postDISTANCE[.dev0]] .
13011573
13021574 The ".dev0" means dirty.
13031575
1304 Eexceptions:
1576 Exceptions:
13051577 1: no tags. 0.postDISTANCE[.dev0]
13061578 """
13071579 if pieces["closest-tag"]:
13611633 def render(pieces, style):
13621634 """Render the given version pieces into the requested style."""
13631635 if pieces["error"]:
1364 return {"version": "unknown",
1365 "full-revisionid": pieces.get("long"),
1366 "dirty": None,
1367 "error": pieces["error"]}
1636 return {
1637 "version": "unknown",
1638 "full-revisionid": pieces.get("long"),
1639 "dirty": None,
1640 "error": pieces["error"],
1641 "date": None,
1642 }
13681643
13691644 if not style or style == "default":
13701645 style = "pep440" # the default
13711646
13721647 if style == "pep440":
13731648 rendered = render_pep440(pieces)
1649 elif style == "pep440-branch":
1650 rendered = render_pep440_branch(pieces)
13741651 elif style == "pep440-pre":
13751652 rendered = render_pep440_pre(pieces)
13761653 elif style == "pep440-post":
13771654 rendered = render_pep440_post(pieces)
1655 elif style == "pep440-post-branch":
1656 rendered = render_pep440_post_branch(pieces)
13781657 elif style == "pep440-old":
13791658 rendered = render_pep440_old(pieces)
13801659 elif style == "git-describe":
13841663 else:
13851664 raise ValueError("unknown style '%s'" % style)
13861665
1387 return {"version": rendered, "full-revisionid": pieces["long"],
1388 "dirty": pieces["dirty"], "error": None}
1666 return {
1667 "version": rendered,
1668 "full-revisionid": pieces["long"],
1669 "dirty": pieces["dirty"],
1670 "error": None,
1671 "date": pieces.get("date"),
1672 }
13891673
13901674
13911675 class VersioneerBadRootError(Exception):
14081692 handlers = HANDLERS.get(cfg.VCS)
14091693 assert handlers, "unrecognized VCS '%s'" % cfg.VCS
14101694 verbose = verbose or cfg.verbose
1411 assert cfg.versionfile_source is not None, \
1412 "please set versioneer.versionfile_source"
1695 assert (
1696 cfg.versionfile_source is not None
1697 ), "please set versioneer.versionfile_source"
14131698 assert cfg.tag_prefix is not None, "please set versioneer.tag_prefix"
14141699
14151700 versionfile_abs = os.path.join(root, cfg.versionfile_source)
14631748 if verbose:
14641749 print("unable to compute version")
14651750
1466 return {"version": "0+unknown", "full-revisionid": None,
1467 "dirty": None, "error": "unable to compute version"}
1751 return {
1752 "version": "0+unknown",
1753 "full-revisionid": None,
1754 "dirty": None,
1755 "error": "unable to compute version",
1756 "date": None,
1757 }
14681758
14691759
14701760 def get_version():
14721762 return get_versions()["version"]
14731763
14741764
1475 def get_cmdclass():
1476 """Get the custom setuptools/distutils subclasses used by Versioneer."""
1765 def get_cmdclass(cmdclass=None):
1766 """Get the custom setuptools/distutils subclasses used by Versioneer.
1767
1768 If the package uses a different cmdclass (e.g. one from numpy), it
1769 should be provided as an argument.
1770 """
14771771 if "versioneer" in sys.modules:
14781772 del sys.modules["versioneer"]
14791773 # this fixes the "python setup.py develop" case (also 'install' and
14871781 # parent is protected against the child's "import versioneer". By
14881782 # removing ourselves from sys.modules here, before the child build
14891783 # happens, we protect the child from the parent's versioneer too.
1490 # Also see https://github.com/warner/python-versioneer/issues/52
1491
1492 cmds = {}
1784 # Also see https://github.com/python-versioneer/python-versioneer/issues/52
1785
1786 cmds = {} if cmdclass is None else cmdclass.copy()
14931787
14941788 # we add "version" to both distutils and setuptools
14951789 from distutils.core import Command
15101804 print("Version: %s" % vers["version"])
15111805 print(" full-revisionid: %s" % vers.get("full-revisionid"))
15121806 print(" dirty: %s" % vers.get("dirty"))
1807 print(" date: %s" % vers.get("date"))
15131808 if vers["error"]:
15141809 print(" error: %s" % vers["error"])
1810
15151811 cmds["version"] = cmd_version
15161812
15171813 # we override "build_py" in both distutils and setuptools
15231819 # setuptools/bdist_egg -> distutils/install_lib -> build_py
15241820 # setuptools/install -> bdist_egg -> ...
15251821 # setuptools/develop -> ?
1822 # pip install:
1823 # copies source tree to a tempdir before running egg_info/etc
1824 # if .git isn't copied too, 'git describe' will fail
1825 # then does setup.py bdist_wheel, or sometimes setup.py install
1826 # setup.py egg_info -> ?
15261827
15271828 # we override different "build_py" commands for both environments
1528 if "setuptools" in sys.modules:
1829 if "build_py" in cmds:
1830 _build_py = cmds["build_py"]
1831 elif "setuptools" in sys.modules:
15291832 from setuptools.command.build_py import build_py as _build_py
15301833 else:
15311834 from distutils.command.build_py import build_py as _build_py
15391842 # now locate _version.py in the new build/ directory and replace
15401843 # it with an updated value
15411844 if cfg.versionfile_build:
1542 target_versionfile = os.path.join(self.build_lib,
1543 cfg.versionfile_build)
1845 target_versionfile = os.path.join(self.build_lib, cfg.versionfile_build)
15441846 print("UPDATING %s" % target_versionfile)
15451847 write_to_version_file(target_versionfile, versions)
1848
15461849 cmds["build_py"] = cmd_build_py
1850
1851 if "build_ext" in cmds:
1852 _build_ext = cmds["build_ext"]
1853 elif "setuptools" in sys.modules:
1854 from setuptools.command.build_ext import build_ext as _build_ext
1855 else:
1856 from distutils.command.build_ext import build_ext as _build_ext
1857
1858 class cmd_build_ext(_build_ext):
1859 def run(self):
1860 root = get_root()
1861 cfg = get_config_from_root(root)
1862 versions = get_versions()
1863 _build_ext.run(self)
1864 if self.inplace:
1865 # build_ext --inplace will only build extensions in
1866 # build/lib<..> dir with no _version.py to write to.
1867 # As in place builds will already have a _version.py
1868 # in the module dir, we do not need to write one.
1869 return
1870 # now locate _version.py in the new build/ directory and replace
1871 # it with an updated value
1872 target_versionfile = os.path.join(self.build_lib, cfg.versionfile_build)
1873 print("UPDATING %s" % target_versionfile)
1874 write_to_version_file(target_versionfile, versions)
1875
1876 cmds["build_ext"] = cmd_build_ext
15471877
15481878 if "cx_Freeze" in sys.modules: # cx_freeze enabled?
15491879 from cx_Freeze.dist import build_exe as _build_exe
1880
1881 # nczeczulin reports that py2exe won't like the pep440-style string
1882 # as FILEVERSION, but it can be used for PRODUCTVERSION, e.g.
1883 # setup(console=[{
1884 # "version": versioneer.get_version().split("+", 1)[0], # FILEVERSION
1885 # "product_version": versioneer.get_version(),
1886 # ...
15501887
15511888 class cmd_build_exe(_build_exe):
15521889 def run(self):
15611898 os.unlink(target_versionfile)
15621899 with open(cfg.versionfile_source, "w") as f:
15631900 LONG = LONG_VERSION_PY[cfg.VCS]
1564 f.write(LONG %
1565 {"DOLLAR": "$",
1566 "STYLE": cfg.style,
1567 "TAG_PREFIX": cfg.tag_prefix,
1568 "PARENTDIR_PREFIX": cfg.parentdir_prefix,
1569 "VERSIONFILE_SOURCE": cfg.versionfile_source,
1570 })
1901 f.write(
1902 LONG
1903 % {
1904 "DOLLAR": "$",
1905 "STYLE": cfg.style,
1906 "TAG_PREFIX": cfg.tag_prefix,
1907 "PARENTDIR_PREFIX": cfg.parentdir_prefix,
1908 "VERSIONFILE_SOURCE": cfg.versionfile_source,
1909 }
1910 )
1911
15711912 cmds["build_exe"] = cmd_build_exe
15721913 del cmds["build_py"]
15731914
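The f.write(LONG % {...}) calls in these frozen-build overrides use printf-style mapping substitution to fill the "%(NAME)s" placeholders in the LONG_VERSION_PY template; a self-contained sketch with an illustrative template string:

    tmpl = "# style=%(STYLE)s tag_prefix=%(TAG_PREFIX)s src=%(VERSIONFILE_SOURCE)s"
    tmpl % {"STYLE": "pep440", "TAG_PREFIX": "v",
            "VERSIONFILE_SOURCE": "geopandas/_version.py"}
    # -> "# style=pep440 tag_prefix=v src=geopandas/_version.py"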
1915 if "py2exe" in sys.modules: # py2exe enabled?
1916 from py2exe.distutils_buildexe import py2exe as _py2exe
1917
1918 class cmd_py2exe(_py2exe):
1919 def run(self):
1920 root = get_root()
1921 cfg = get_config_from_root(root)
1922 versions = get_versions()
1923 target_versionfile = cfg.versionfile_source
1924 print("UPDATING %s" % target_versionfile)
1925 write_to_version_file(target_versionfile, versions)
1926
1927 _py2exe.run(self)
1928 os.unlink(target_versionfile)
1929 with open(cfg.versionfile_source, "w") as f:
1930 LONG = LONG_VERSION_PY[cfg.VCS]
1931 f.write(
1932 LONG
1933 % {
1934 "DOLLAR": "$",
1935 "STYLE": cfg.style,
1936 "TAG_PREFIX": cfg.tag_prefix,
1937 "PARENTDIR_PREFIX": cfg.parentdir_prefix,
1938 "VERSIONFILE_SOURCE": cfg.versionfile_source,
1939 }
1940 )
1941
1942 cmds["py2exe"] = cmd_py2exe
1943
15741944 # we override different "sdist" commands for both environments
1575 if "setuptools" in sys.modules:
1945 if "sdist" in cmds:
1946 _sdist = cmds["sdist"]
1947 elif "setuptools" in sys.modules:
15761948 from setuptools.command.sdist import sdist as _sdist
15771949 else:
15781950 from distutils.command.sdist import sdist as _sdist
15951967 # updated value
15961968 target_versionfile = os.path.join(base_dir, cfg.versionfile_source)
15971969 print("UPDATING %s" % target_versionfile)
1598 write_to_version_file(target_versionfile,
1599 self._versioneer_generated_versions)
1970 write_to_version_file(
1971 target_versionfile, self._versioneer_generated_versions
1972 )
1973
16001974 cmds["sdist"] = cmd_sdist
16011975
16021976 return cmds
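A usage sketch of the new cmdclass argument from a downstream setup.py (the pre-existing command here is hypothetical); get_cmdclass() copies the mapping and subclasses the supplied command rather than discarding it:

    from setuptools import setup
    from setuptools.command.build_py import build_py

    import versioneer

    class MyBuildPy(build_py):  # hypothetical project-specific build step
        def run(self):
            super().run()

    setup(
        version=versioneer.get_version(),
        cmdclass=versioneer.get_cmdclass({"build_py": MyBuildPy}),
    )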
16392013
16402014 """
16412015
1642 INIT_PY_SNIPPET = """
2016 OLD_SNIPPET = """
16432017 from ._version import get_versions
16442018 __version__ = get_versions()['version']
16452019 del get_versions
16462020 """
16472021
2022 INIT_PY_SNIPPET = """
2023 from . import {0}
2024 __version__ = {0}.get_versions()['version']
2025 """
2026
16482027
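With a versionfile_source of geopandas/_version.py, the module name formatted into the new snippet is "_version", so the text written into __init__.py reads:

    from . import _version
    __version__ = _version.get_versions()['version']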
16492028 def do_setup():
1650 """Main VCS-independent setup function for installing Versioneer."""
2029 """Do main VCS-independent setup function for installing Versioneer."""
16512030 root = get_root()
16522031 try:
16532032 cfg = get_config_from_root(root)
1654 except (EnvironmentError, configparser.NoSectionError,
1655 configparser.NoOptionError) as e:
1656 if isinstance(e, (EnvironmentError, configparser.NoSectionError)):
1657 print("Adding sample versioneer config to setup.cfg",
1658 file=sys.stderr)
2033 except (OSError, configparser.NoSectionError, configparser.NoOptionError) as e:
2034 if isinstance(e, (OSError, configparser.NoSectionError)):
2035 print("Adding sample versioneer config to setup.cfg", file=sys.stderr)
16592036 with open(os.path.join(root, "setup.cfg"), "a") as f:
16602037 f.write(SAMPLE_CONFIG)
16612038 print(CONFIG_ERROR, file=sys.stderr)
16642041 print(" creating %s" % cfg.versionfile_source)
16652042 with open(cfg.versionfile_source, "w") as f:
16662043 LONG = LONG_VERSION_PY[cfg.VCS]
1667 f.write(LONG % {"DOLLAR": "$",
1668 "STYLE": cfg.style,
1669 "TAG_PREFIX": cfg.tag_prefix,
1670 "PARENTDIR_PREFIX": cfg.parentdir_prefix,
1671 "VERSIONFILE_SOURCE": cfg.versionfile_source,
1672 })
1673
1674 ipy = os.path.join(os.path.dirname(cfg.versionfile_source),
1675 "__init__.py")
2044 f.write(
2045 LONG
2046 % {
2047 "DOLLAR": "$",
2048 "STYLE": cfg.style,
2049 "TAG_PREFIX": cfg.tag_prefix,
2050 "PARENTDIR_PREFIX": cfg.parentdir_prefix,
2051 "VERSIONFILE_SOURCE": cfg.versionfile_source,
2052 }
2053 )
2054
2055 ipy = os.path.join(os.path.dirname(cfg.versionfile_source), "__init__.py")
16762056 if os.path.exists(ipy):
16772057 try:
16782058 with open(ipy, "r") as f:
16792059 old = f.read()
1680 except EnvironmentError:
2060 except OSError:
16812061 old = ""
1682 if INIT_PY_SNIPPET not in old:
2062 module = os.path.splitext(os.path.basename(cfg.versionfile_source))[0]
2063 snippet = INIT_PY_SNIPPET.format(module)
2064 if OLD_SNIPPET in old:
2065 print(" replacing boilerplate in %s" % ipy)
2066 with open(ipy, "w") as f:
2067 f.write(old.replace(OLD_SNIPPET, snippet))
2068 elif snippet not in old:
16832069 print(" appending to %s" % ipy)
16842070 with open(ipy, "a") as f:
1685 f.write(INIT_PY_SNIPPET)
2071 f.write(snippet)
16862072 else:
16872073 print(" %s unmodified" % ipy)
16882074 else:
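The module placeholder above is derived from the versionfile's basename; a quick sketch of that derivation:

    import os.path
    os.path.splitext(os.path.basename("geopandas/_version.py"))[0]
    # -> "_version"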
17012087 if line.startswith("include "):
17022088 for include in line.split()[1:]:
17032089 simple_includes.add(include)
1704 except EnvironmentError:
2090 except OSError:
17052091 pass
17062092 # That doesn't cover everything MANIFEST.in can do
17072093 # (http://docs.python.org/2/distutils/sourcedist.html#commands), so
17142100 else:
17152101 print(" 'versioneer.py' already in MANIFEST.in")
17162102 if cfg.versionfile_source not in simple_includes:
1717 print(" appending versionfile_source ('%s') to MANIFEST.in" %
1718 cfg.versionfile_source)
2103 print(
2104 " appending versionfile_source ('%s') to MANIFEST.in"
2105 % cfg.versionfile_source
2106 )
17192107 with open(manifest_in, "a") as f:
17202108 f.write("include %s\n" % cfg.versionfile_source)
17212109 else:
17222110 print(" versionfile_source already in MANIFEST.in")
17232111
17242112 # Make VCS-specific changes. For git, this means creating/changing
1725 # .gitattributes to mark _version.py for export-time keyword
2113 # .gitattributes to mark _version.py for export-subst keyword
17262114 # substitution.
17272115 do_vcs_install(manifest_in, cfg.versionfile_source, ipy)
17282116 return 0
17642152 errors += 1
17652153 return errors
17662154
2155
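The dispatch below is versioneer.py's command-line entry point; invoked from a project root as, e.g.:

    $ python versioneer.py setup
     creating geopandas/_version.py
     appending to geopandas/__init__.py

(the output lines mirror the print() calls in do_setup(); paths shown are illustrative).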
17672156 if __name__ == "__main__":
17682157 cmd = sys.argv[1]
17692158 if cmd == "setup":