python-geopandas / 3275ff6
Update upstream source from tag 'upstream/0.10.0': update to upstream version '0.10.0' with Debian dir 56b5e009f1c2a96f88d664dc4d45b82749c898a9 (Bas Couwenberg, 2 years ago)
101 changed files with 7516 additions and 1809 deletions.
66 branches: [master]
77 schedule:
88 - cron: "0 0 * * *"
9
10 concurrency:
11 group: ${{ github.workflow }}-${{ github.ref }}
12 cancel-in-progress: true
913
1014 jobs:
1115 Linting:
2024 needs: Linting
2125 name: ${{ matrix.os }}, ${{ matrix.env }}
2226 runs-on: ${{ matrix.os }}
27 defaults:
28 run:
29 shell: bash -l {0}
2330 strategy:
31 fail-fast: false
2432 matrix:
2533 os: [ubuntu-latest]
2634 postgis: [false]
2735 dev: [false]
2836 env:
29 - ci/envs/36-minimal.yaml
37 - ci/envs/37-minimal.yaml
3038 - ci/envs/38-no-optional-deps.yaml
31 - ci/envs/36-pd025.yaml
39 - ci/envs/37-pd10.yaml
3240 - ci/envs/37-latest-defaults.yaml
3341 - ci/envs/37-latest-conda-forge.yaml
3442 - ci/envs/38-latest-conda-forge.yaml
5866 - uses: actions/checkout@v2
5967
6068 - name: Setup Conda
61 uses: s-weigand/setup-conda@v1
69 uses: conda-incubator/setup-miniconda@v2
6270 with:
63 activate-conda: false
64
65 - name: Install Env
66 shell: bash
67 run: conda env create -f ${{ matrix.env }}
71 environment-file: ${{ matrix.env }}
6872
6973 - name: Check and Log Environment
70 shell: bash
7174 run: |
72 source activate test
7375 python -V
7476 python -c "import geopandas; geopandas.show_versions();"
7577 conda info
8688 fi
8789
8890 - name: Test without PyGEOS
89 shell: bash
9091 env:
9192 USE_PYGEOS: 0
9293 run: |
93 source activate test
9494 pytest -v -r s -n auto --color=yes --cov=geopandas --cov-append --cov-report term-missing --cov-report xml geopandas/
9595
9696 - name: Test with PyGEOS
97 shell: bash
9897 if: env.HAS_PYGEOS == 1
9998 env:
10099 USE_PYGEOS: 1
101100 run: |
102 source activate test
103101 pytest -v -r s -n auto --color=yes --cov=geopandas --cov-append --cov-report term-missing --cov-report xml geopandas/
104102
105103 - name: Test with PostGIS
106 shell: bash
107104 if: contains(matrix.env, '38-latest-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
108105 env:
109106 PGUSER: postgres
110107 PGPASSWORD: postgres
111108 PGHOST: "127.0.0.1"
109 PGPORT: 5432
112110 run: |
113 source activate test
114111 conda install postgis -c conda-forge
115 source ci/envs/setup_postgres.sh
112 sh ci/scripts/setup_postgres.sh
116113 pytest -v -r s --color=yes --cov=geopandas --cov-append --cov-report term-missing --cov-report xml geopandas/io/tests/test_sql.py | tee /dev/stderr | if grep SKIPPED >/dev/null;then echo "TESTS SKIPPED, FAILING" && exit 1;fi
117114
118115 - name: Test docstrings
119 shell: bash
120116 if: contains(matrix.env, '38-latest-conda-forge.yaml') && contains(matrix.os, 'ubuntu')
121117 env:
122118 USE_PYGEOS: 1
123119 run: |
124 source activate test
125120 pytest -v --color=yes --doctest-only geopandas --ignore=geopandas/datasets
126121
127122 - uses: codecov/codecov-action@v1
00 Changelog
11 =========
2
3 Version 0.10.0 (October 3, 2021)
4 --------------------------------
5
6 Highlights of this release:
7
8 - A new `sjoin_nearest()` method to join based on proximity, with the
9 ability to set a maximum search radius (#1865). In addition, the `sindex`
10 attribute gained a new method for a "nearest" spatial index query (#1865,
11 #2053).
12 - A new `explore()` method on GeoDataFrame and GeoSeries with native support
13 for interactive visualization based on folium / leaflet.js (#1953)
14 - The `geopandas.sjoin()`/`overlay()`/`clip()` functions are now also
15 available as methods on the GeoDataFrame (#2141, #1984, #2150).
16
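A minimal sketch of the headline additions (hedged; assumes the bundled `naturalearth_lowres`/`naturalearth_cities` datasets and an optional folium install, values purely illustrative):

```python
import geopandas

countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))

# join each city to the nearest country, optionally capped by a search radius
# (max_distance is expressed in CRS units; degrees here, purely illustrative)
joined = cities.sjoin_nearest(countries, max_distance=1)

# interactive folium/leaflet.js map; renders inline in a Jupyter notebook
m = countries.explore(column="pop_est")
```
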
17 New features and improvements:
18
19 - Add support for pandas' `value_counts()` method for geometry dtype (#2047).
20 - The `explode()` method has a new `ignore_index` keyword (consistent with
21 pandas' explode method) to reset the index in the result, and a new
2122 `index_parts` keyword to control whether a cumulative count indexing the
23 parts of the exploded multi-geometries should be added (#1871).
24 - `points_from_xy()` is now available as a GeoSeries method `from_xy` (#1936).
25 - The `to_file()` method will now attempt to detect the driver (if not
26 specified) based on the extension of the provided filename, instead of
27 defaulting to ESRI Shapefile (#1609).
28 - Support for the `storage_options` keyword in `read_parquet()` for
29 specifying filesystem-specific options (e.g. for S3) based on fsspec (#2107).
30 - The read/write functions now support `~` (user home directory) expansion (#1876).
31 - Support the `convert_dtypes()` method from pandas to preserve the
32 GeoDataFrame class (#2115).
33 - Support WKB values in the hex format in `GeoSeries.from_wkb()` (#2106).
34 - Update the `estimate_utm_crs()` method to handle crossing the antimeridian
35 with pyproj 3.1+ (#2049).
36 - Improved heuristic to decide how many decimals to show in the repr based on
37 whether the CRS is projected or geographic (#1895).
38 - Switched the default for `geocode()` from GeoCode.Farm to the Photon
39 geocoding API (https://photon.komoot.io) (#2007).
40
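A quick sketch of a few of these smaller additions (hedged; data and column names are illustrative):

```python
import pandas as pd
import geopandas

df = pd.DataFrame({"x": [0, 1, 1], "y": [0, 1, 1]})
# new GeoSeries.from_xy classmethod, mirroring points_from_xy()
gdf = geopandas.GeoDataFrame(
    df, geometry=geopandas.GeoSeries.from_xy(df["x"], df["y"])
)

gdf.geometry.value_counts()    # pandas value_counts() now works on geometry dtype
gdf.explode(index_parts=True)  # explicit control over the parts index level
```
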
41 Deprecations and compatibility notes:
42
43 - The `op=` keyword of `sjoin()` to indicate which spatial predicate to use
44 for joining is being deprecated and renamed in favor of a new `predicate=`
45 keyword (#1626).
46 - The `cascaded_union` attribute is deprecated, use `unary_union` instead (#2074).
47 - Constructing a GeoDataFrame with a duplicated "geometry" column is now
48 disallowed. This can also raise an error in the `pd.concat(.., axis=1)`
49 function if this results in duplicated active geometry columns (#2046).
50 - The `explode()` method currently returns a GeoSeries/GeoDataFrame with a
51 MultiIndex, with an additional level with indices of the parts of the
52 exploded multi-geometries. For consistency with pandas, this will change in
53 the future and the new `index_parts` keyword is added to control this.
54
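For example, reusing the frames from the sketch above:

```python
# deprecated spelling (warns):
geopandas.sjoin(cities, countries, op="within")
# new spelling:
geopandas.sjoin(cities, countries, predicate="within")

# likewise, prefer unary_union over the deprecated cascaded_union attribute
merged = countries.geometry.unary_union
```
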
55 Bug fixes:
56
57 - Fix in the `clip()` function to correctly clip MultiPoints instead of
58 leaving them intact when partly outside of the clip bounds (#2148).
59 - Fix `GeoSeries.isna()` to correctly return a boolean Series in case of an
60 empty GeoSeries (#2073).
61 - Fix the GeoDataFrame constructor to preserve the geometry name when the
62 argument is already a GeoDataFrame object (i.e. `GeoDataFrame(gdf)`) (#2138).
63 - Fix loss of the values' CRS when setting those values as a column
64 (`GeoDataFrame.__setitem__`) (#1963)
65 - Fix in `GeoDataFrame.apply()` to preserve the active geometry column name
66 (#1955).
67 - Fix in `sjoin()` to not ignore the suffixes in case of a right-join
6768 (`how="right"`) (#2065).
69 - Fix `GeoDataFrame.explode()` with a MultiIndex (#1945).
6970 - Fix the handling of missing values in `to/from_wkb` and `to/from_wkt` (#1891).
71 - Fix `to_file()` and `to_json()` when DataFrame has duplicate columns to
72 raise an error (#1900).
73 - Fix bug in the colors shown with user-defined classification scheme (#2019).
74 - Fix handling of the `path_effects` keyword in `plot()` (#2127).
75 - Fix `GeoDataFrame.explode()` to preserve `attrs` (#1935)
76
77 Notes on (optional) dependencies:
78
7879 - GeoPandas 0.10.0 dropped support for Python 3.6 and pandas 0.24. Further,
80 the minimum required versions are numpy 1.18, shapely 1.6, fiona 1.8,
81 matplotlib 3.1 and pyproj 2.2.
82 - Plotting with a classification schema now requires mapclassify version >=
83 2.4 (#1737).
84 - Compatibility fixes for the latest numpy in combination with Shapely 1.7 (#2072)
85 - Compatibility fixes for the upcoming Shapely 1.8 (#2087).
86 - Compatibility fixes for the latest PyGEOS (#1872, #2014) and matplotlib
87 (colorbar issue, #2066).
88
289
390 Version 0.9.0 (February 28, 2021)
491 ---------------------------------
77164 - Fix regression in the `plot()` method raising an error with empty
78165 geometries (#1702, #1828).
79166 - Fix `geopandas.overlay()` to preserve geometries of the correct type which
80 are nested withing a GeometryCollection as a result of the overlay
167 are nested within a GeometryCollection as a result of the overlay
81168 operation (#1582). In addition, a warning will now be raised if geometries
82169 of different type are dropped from the result (#1554).
83170 - Fix the repr of an empty GeoSeries to not show spurious warnings (#1673).
84171 - Fix the `.crs` for empty GeoDataFrames (#1560).
85 - Fix `geopandas.clip` to preserve the correct geometry column name (#1566).
172 - Fix `geopandas.clip` to preserve the correct geometry column name (#1566).
86173 - Fix bug in `plot()` method when using `legend_kwds` with multiple subplots
87174 (#1583)
88175 - Fix spurious warning with `missing_kwds` keyword of the `plot()` method
148235 New features and improvements:
149236
150237 - IO enhancements:
238
151239 - New `GeoDataFrame.to_postgis()` method to write to PostGIS database (#1248).
152240 - New Apache Parquet and Feather file format support (#1180, #1435)
153241 - Allow appending to files with `GeoDataFrame.to_file` (#1229).
156244 returned (#1383).
157245 - `geopandas.read_file` now supports reading from file-like objects (#1329).
158246 - `GeoDataFrame.to_file` now supports specifying the CRS to write to the file
159 (#802). By default it still uses the CRS of the GeoDataFrame.
247 (#802). By default it still uses the CRS of the GeoDataFrame.
160248 - New `chunksize` keyword in `geopandas.read_postgis` to read a query in
161249 chunks (#1123).
250
162251 - Improvements related to geometry columns and CRS:
252
163253 - Any column of the GeoDataFrame that has a "geometry" dtype is now returned
164254 as a GeoSeries. This means that when having multiple geometry columns, not
165255 only the "active" geometry column is returned as a GeoSeries, but also
171261 from the column itself (eg `gdf["other_geom_column"].crs`) (#1339).
172262 - New `set_crs()` method on GeoDataFrame/GeoSeries to set the CRS of naive
173263 geometries (#747).
264
174265 - Improvements related to plotting:
266
175267 - The y-axis is now scaled depending on the center of the plot when using a
176268 geographic CRS, instead of using an equal aspect ratio (#1290).
177269 - When passing a column of categorical dtype to the `column=` keyword of the
182274 `legend_kwds` accept two new keywords to control the formatting of the
183275 legend: `fmt` with a format string for the bin edges (#1253), and `labels`
184276 to pass fully custom class labels (#1302).
277
185278 - New `covers()` and `covered_by()` methods on GeoSeries/GeoDataframe for the
186279 equivalent spatial predicates (#1460, #1462).
187280 - GeoPandas now warns when using distance-based methods with data in a
193286 CRS, a deprecation warning is raised when both CRS don't match, and in the
194287 future an error will be raised in such a case. You can use the new `set_crs`
195288 method to override an existing CRS. See
196 [the docs](https://geopandas.readthedocs.io/en/latest/projections.html#projection-for-multiple-geometry-columns).
289 [the docs](https://geopandas.readthedocs.io/en/latest/projections.html#projection-for-multiple-geometry-columns).
197290 - The helper functions in the `geopandas.plotting` module are deprecated for
198291 public usage (#656).
199292 - The `geopandas.io` functions are deprecated, use the top-level `read_file` and
314407 API changes:
315408
316409 - A refactor of the internals based on the pandas ExtensionArray interface (#1000). The main user visible changes are:
410
317411 - The `.dtype` of a GeoSeries is now a `'geometry'` dtype (and no longer a numpy `object` dtype).
318412 - The `.values` of a GeoSeries now returns a custom `GeometryArray`, and no longer a numpy array. To get back a numpy array of Shapely scalars, you can convert explicitly using `np.asarray(..)`.
413
319414 - The `GeoSeries` constructor now raises a warning when passed non-geometry data. Currently the constructor falls back to return a pandas `Series`, but in the future this will raise an error (#1085).
320415 - The missing value handling has been changed to now separate the concepts of missing geometries and empty geometries (#601, 1062). In practice this means that (see [the docs](https://geopandas.readthedocs.io/en/v0.6.0/missing_empty.html) for more details):
416
321417 - `GeoSeries.isna` now considers only missing values, and if you want to check for empty geometries, you can use `GeoSeries.is_empty` (`GeoDataFrame.isna` already only looked at missing values).
322418 - `GeoSeries.dropna` now actually drops missing values (before it didn't drop either missing or empty geometries)
323419 - `GeoSeries.fillna` only fills missing values (behaviour unchanged).
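A small sketch of the new distinction (values illustrative):

```python
from shapely.geometry import Point, Polygon
import geopandas

s = geopandas.GeoSeries([Point(0, 0), None, Polygon()])
s.isna()      # False, True, False -- only the missing geometry
s.is_empty    # False, False, True -- only the empty polygon
s.dropna()    # drops the None entry but keeps the empty polygon
```
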
360456 * Significant performance improvement (around 10x) for `GeoDataFrame.iterfeatures`,
361457 which also improves `GeoDataFrame.to_file` (#864).
362458 * File IO enhancements based on Fiona 1.8:
459
363460 * Support for writing bool dtype (#855) and datetime dtype, if the file format supports it (#728).
364461 * Support for writing dataframes with multiple geometry types, if the file format allows it (e.g. GeoJSON for all types, or ESRI Shapefile for Polygon+MultiPolygon) (#827, #867, #870).
462
365463 * Compatibility with pyproj >= 2 (#962).
366464 * A new `geopandas.points_from_xy()` helper function to convert x and y coordinates to Point objects (#896).
367 * The `buffer` and `interpolate` methods now accept an array-like to specify a variable distance for each geometry (#781).
465 * The `buffer` and `interpolate` methods now accept an array-like to specify a variable distance for each geometry (#781).
368466 * Addition of a `relate` method, corresponding to the shapely method that returns the DE-9IM matrix (#853).
369467 * Plotting improvements:
468
370469 * Performance improvement in plotting by only flattening the geometries if there are actually 'Multi' geometries (#785).
371470 * Choropleths: access to all `mapclassify` classification schemes and addition of the `classification_kwds` keyword in the `plot` method to specify options for the scheme (#876).
372471 * Ability to specify a matplotlib axes object on which to plot the color bar with the `cax` keyword, in order to have more control over the color bar placement (#894).
472
373473 * Changed the default provider in ``geopandas.tools.geocode`` from Google (now requires an API key) to Geocode.Farm (#907, #975).
374474
375475 Bug fixes:
413513 * Permit setting markersize for Point GeoSeries plots with column values (#633)
414514 * Started an example gallery (#463, #690, #717)
415515 * Support for plotting MultiPoints (#683)
416 * Testing functionalty (e.g. `assert_geodataframe_equal`) is now publicly exposed (#707)
516 * Testing functionality (e.g. `assert_geodataframe_equal`) is now publicly exposed (#707)
417517 * Add `explode` method to GeoDataFrame (similar to the GeoSeries method) (#671)
418518 * Set equal aspect on active axis on multi-axis figures (#718)
419519 * Pass array of values to column argument in `plot` (#770)
1818 - Install the requirements for the development environment (one can do this
1919 with either conda, and the environment.yml file, or pip, and the
2020 requirements-dev.txt file, and can use the pandas contributing guidelines
21 as a guide).
21 as a guide).
2222 - All existing tests should pass. Please make sure that the test
2323 suite passes, both locally and on
2424 [GitHub Actions](https://github.com/geopandas/geopandas/actions). Status on
3838 Style
3939 -----
4040
41 - GeoPandas supports Python 3.6+ only. The last version of GeoPandas
41 - GeoPandas supports Python 3.7+ only. The last version of GeoPandas
4242 supporting Python 2 is 0.6.
4343
4444 - GeoPandas follows [the PEP 8
00 from geopandas import GeoDataFrame, GeoSeries, read_file, datasets, overlay
1 from shapely.geometry import Polygon
1 import numpy as np
2 from shapely.geometry import Point, Polygon
23
34
45 class Countries:
56
6 param_names = ['op']
7 param_names = ['how']
78 params = [('intersection', 'union', 'identity', 'symmetric_difference',
89 'difference')]
910
1920 self.countries = countries
2021 self.capitals = capitals
2122
22 def time_overlay(self, op):
23 overlay(self.countries, self.capitals, how=op)
23 def time_overlay(self, how):
24 overlay(self.countries, self.capitals, how=how)
2425
2526
2627 class Small:
2728
28 param_names = ['op']
29 param_names = ['how']
2930 params = [('intersection', 'union', 'identity', 'symmetric_difference',
3031 'difference')]
3132
4041
4142 self.df1, self.df2 = df1, df2
4243
43 def time_overlay(self, op):
44 overlay(self.df1, self.df2, how=op)
44 def time_overlay(self, how):
45 overlay(self.df1, self.df2, how=how)
46
47
48 class ManyPoints:
49
50 param_names = ['how']
51 params = [('intersection', 'union', 'identity', 'symmetric_difference',
52 'difference')]
53
54 def setup(self, *args):
55
56 points = GeoDataFrame(geometry=[Point(i, i) for i in range(1000)])
57 base = np.array([[0, 0], [0, 100], [100, 100], [100, 0]])
58 polys = GeoDataFrame(
59 geometry=[Polygon(base + i * 100) for i in range(10)])
60
61 self.df1, self.df2 = points, polys
62
63 def time_overlay(self, how):
64 overlay(self.df1, self.df2, how=how)
66
77 class Bench:
88
9 param_names = ['geom_type']
10 params = [('Point', 'LineString', 'Polygon', 'MultiPolygon', 'mixed')]
9 param_names = ["geom_type"]
10 params = [("Point", "LineString", "Polygon", "MultiPolygon", "mixed")]
1111
1212 def setup(self, geom_type):
1313
14 if geom_type == 'Point':
14 if geom_type == "Point":
1515 geoms = GeoSeries([Point(i, i) for i in range(1000)])
16 elif geom_type == 'LineString':
17 geoms = GeoSeries([LineString([(random.random(), random.random())
18 for _ in range(5)])
19 for _ in range(100)])
20 elif geom_type == 'Polygon':
21 geoms = GeoSeries([Polygon([(random.random(), random.random())
22 for _ in range(3)])
23 for _ in range(100)])
24 elif geom_type == 'MultiPolygon':
16 elif geom_type == "LineString":
2517 geoms = GeoSeries(
26 [MultiPolygon([Polygon([(random.random(), random.random())
27 for _ in range(3)])
28 for _ in range(3)])
29 for _ in range(20)])
30 elif geom_type == 'mixed':
18 [
19 LineString([(random.random(), random.random()) for _ in range(5)])
20 for _ in range(100)
21 ]
22 )
23 elif geom_type == "Polygon":
24 geoms = GeoSeries(
25 [
26 Polygon([(random.random(), random.random()) for _ in range(3)])
27 for _ in range(100)
28 ]
29 )
30 elif geom_type == "MultiPolygon":
31 geoms = GeoSeries(
32 [
33 MultiPolygon(
34 [
35 Polygon(
36 [(random.random(), random.random()) for _ in range(3)]
37 )
38 for _ in range(3)
39 ]
40 )
41 for _ in range(20)
42 ]
43 )
44 elif geom_type == "mixed":
3145 g1 = GeoSeries([Point(i, i) for i in range(100)])
32 g2 = GeoSeries([LineString([(random.random(), random.random())
33 for _ in range(5)])
34 for _ in range(100)])
35 g3 = GeoSeries([Polygon([(random.random(), random.random())
36 for _ in range(3)])
37 for _ in range(100)])
46 g2 = GeoSeries(
47 [
48 LineString([(random.random(), random.random()) for _ in range(5)])
49 for _ in range(100)
50 ]
51 )
52 g3 = GeoSeries(
53 [
54 Polygon([(random.random(), random.random()) for _ in range(3)])
55 for _ in range(100)
56 ]
57 )
3858
3959 geoms = g1
40 geoms.iloc[np.random.randint(0, 100, 50)] = g2
41 geoms.iloc[np.random.randint(0, 100, 33)] = g3
60 geoms.iloc[np.random.randint(0, 100, 50)] = g2.iloc[:50]
61 geoms.iloc[np.random.randint(0, 100, 33)] = g3.iloc[:33]
4262
4363 print(geoms.geom_type.value_counts())
4464
45 df = GeoDataFrame({'geometry': geoms,
46 'values': np.random.randn(len(geoms))})
65 df = GeoDataFrame({"geometry": geoms, "values": np.random.randn(len(geoms))})
4766
4867 self.geoms = geoms
4968 self.df = df
5271 self.geoms.plot()
5372
5473 def time_plot_values(self, *args):
55 self.df.plot(column='values')
56
74 self.df.plot(column="values")
+0 -28 ci/envs/36-minimal.yaml (deleted)
0 name: test
1 channels:
2 - defaults
3 - conda-forge
4 dependencies:
5 - python=3.6
6 # required
7 - numpy=1.15
8 - pandas==0.24
9 - shapely=1.6
10 - fiona=1.8.13
11 #- pyproj
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 - matplotlib=2.2
21 - mapclassify>=2.2.0
22 - geopy
23 - SQLalchemy
24 - libspatialite
25 - pyarrow
26 - pip:
27 - pyproj==2.2.2
+0 -27 ci/envs/36-pd025.yaml (deleted)
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.6
5 # required
6 - pandas=0.25
7 - shapely
8 - fiona
9 #- pyproj
10 - geos
11 # testing
12 - pytest
13 - pytest-cov
14 - pytest-xdist
15 - fsspec
16 # optional
17 - rtree
18 - matplotlib
19 #- geopy
20 - SQLalchemy
21 - libspatialite
22 - pyarrow
23 - pip:
24 - pyproj==2.3.1
25 - geopy
26 - mapclassify==2.2.0
1717 - rtree
1818 - matplotlib
1919 - mapclassify
20 - folium
21 - xyzservices
2022 - scipy
2123 - geopy
2224 - SQLalchemy
2325 - libspatialite
2426 - pyarrow
25
27
11 channels:
22 - defaults
33 dependencies:
4 - python=3.7.3
4 - python=3.7
55 # required
66 - pandas
77 - shapely
1919 #- geopy
2020 - SQLalchemy
2121 - libspatialite
22 - pyarrow
2322 - pip:
2423 - geopy
2524 - mapclassify
25 - pyarrow
26 - folium
27 - xyzservices
0 name: test
1 channels:
2 - defaults
3 - conda-forge
4 dependencies:
5 - python=3.7
6 # required
7 - numpy=1.18
8 - pandas==0.25
9 - shapely=1.6
10 - fiona=1.8.13
11 #- pyproj
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 - matplotlib=3.1
21 # - mapclassify=2.4.0 - doesn't build due to conflicts
22 - geopy
23 - SQLalchemy
24 - libspatialite
25 - pyarrow
26 - pip:
27 - pyproj==2.2.2
0 name: test
1 channels:
2 - defaults
3 dependencies:
4 - python=3.7
5 # required
6 - pandas=1.0
7 - shapely
8 - fiona
9 - numpy<=1.19
10 #- pyproj
11 - geos
12 # testing
13 - pytest
14 - pytest-cov
15 - pytest-xdist
16 - fsspec
17 # optional
18 - rtree
19 - matplotlib
20 #- geopy
21 - SQLalchemy
22 - libspatialite
23 - pip
24 - pip:
25 - pyproj==3.0.1
26 - geopy
27 - mapclassify==2.4.0
28 - pyarrow
2020 - pyarrow
2121 - pip:
2222 - geopy
23 - mapclassify>=2.2.0
23 - mapclassify>=2.4.0
2424 # dev versions of packages
25 - git+https://github.com/numpy/numpy.git@master
25 - git+https://github.com/numpy/numpy.git@main
2626 - git+https://github.com/pydata/pandas.git@master
2727 - git+https://github.com/matplotlib/matplotlib.git@master
2828 - git+https://github.com/Toblerity/Shapely.git@master
2929 - git+https://github.com/pygeos/pygeos.git@master
30 - git+https://github.com/python-visualization/folium.git@master
31 - git+https://github.com/geopandas/xyzservices.git@main
32
33 dependencies:
44 - python=3.8
55 # required
6 - pandas
6 - pandas=1.3.2 # temporary pin because 1.3.3 has regression for overlay (GH2101)
77 - shapely
88 - fiona
99 - pyproj
1717 - rtree
1818 - matplotlib
1919 - mapclassify
20 - folium
21 - xyzservices
2022 - scipy
2123 - geopy
2224 # installed in tests.yaml, because not available on windows
1616 # optional
1717 - rtree
1818 - matplotlib
19 - descartes
2019 - mapclassify
20 - folium
21 - xyzservices
2122 - scipy
2223 - geopy
2324 # installed in tests.yaml, because not available on windows
2930 - pyarrow
3031 # doctest testing
3132 - pytest-doctestplus
33
+0 -21 ci/envs/setup_postgres.sh (deleted)
0 #!/bin/bash -e
1
2 echo "Setting up Postgresql"
3
4 mkdir -p ${HOME}/var
5 rm -rf ${HOME}/var/db
6
7 pg_ctl initdb -D ${HOME}/var/db
8 pg_ctl start -D ${HOME}/var/db
9
10 echo -n 'waiting for postgres'
11 while [ ! -e /tmp/.s.PGSQL.5432 ]; do
12 sleep 1
13 echo -n '.'
14 done
15
16 createuser -U ${USER} -s postgres
17 createdb --owner=postgres test_geopandas
18 psql -d test_geopandas -q -c "CREATE EXTENSION postgis"
19
20 echo "Done setting up Postgresql"
0 #!/bin/sh
1 set -e
2
3 if [ -z "${PGUSER}" ] || [ -z "${PGPORT}" ]; then
4 echo "Environment variables PGUSER and PGPORT must be set"
5 exit 1
6 fi
7
8 PGDATA=$(mktemp -d /tmp/postgres.XXXXXX)
9 echo "Setting up PostgreSQL in ${PGDATA} on port ${PGPORT}"
10
11 pg_ctl -D ${PGDATA} initdb
12 pg_ctl -D ${PGDATA} start
13
14 SOCKETPATH="/tmp/.s.PGSQL.${PGPORT}"
15 echo -n 'waiting for postgres'
16 while [ ! -e ${SOCKETPATH} ]; do
17 sleep 1
18 echo -n '.'
19 done
20 echo
21
22 echo "Done setting up PostgreSQL. When finished, stop and cleanup using:"
23 echo
24 echo " pg_ctl -D ${PGDATA} stop"
25 echo " rm -rf ${PGDATA}"
26 echo
27
28 createuser -U ${USER} -s ${PGUSER}
29 createdb --owner=${PGUSER} test_geopandas
30 psql -d test_geopandas -q -c "CREATE EXTENSION postgis"
31
32 echo "PostGIS server ready."
11 channels:
22 - conda-forge
33 dependencies:
4 - python=3.9.1
5 - pandas=1.2.2
4 - python=3.9.7
5 - pandas=1.3.2
66 - shapely=1.7.1
7 - fiona=1.8.18
8 - pyproj=3.0.0.post1
7 - fiona=1.8.20
8 - pyproj=3.2.1
99 - rtree=0.9.7
10 - geopy=2.1.0
11 - matplotlib=3.3.4
12 - mapclassify=2.4.2
13 - sphinx=3.5.1
14 - pydata-sphinx-theme=0.4.3
10 - geopy=2.2.0
11 - matplotlib=3.4.3
12 - mapclassify=2.4.3
13 - sphinx=4.2.0
14 - pydata-sphinx-theme=0.6.3
1515 - numpydoc=1.1.0
16 - ipython=7.20.0
17 - pillow=8.1.0
16 - ipython=7.27.0
17 - pillow=8.3.2
1818 - mock=4.0.3
19 - cartopy=0.18.0
19 - cartopy=0.20.0
2020 - pyepsg=0.4.0
2121 - contextily=1.1.0
22 - rasterio=1.2.0
23 - geoplot=0.4.1
24 - sphinx-gallery=0.8.2
25 - jinja2=2.11.3
22 - rasterio=1.2.8
23 - geoplot=0.4.4
24 - sphinx-gallery=0.9.0
25 - jinja2=3.0.1
2626 - doc2dash=2.3.0
27 - matplotlib-scalebar=0.7.2
2728 # specify additional dependencies to reduce solving for conda
28 - gdal=3.1.4
29 - libgdal=3.1.4
30 - proj=7.2.0
31 - geos=3.9.0
32 - nbsphinx=0.8.1
33 - jupyter_client=6.1.11
34 - ipykernel=5.4.3
35 - myst-parser=0.13.5
29 - gdal=3.3.2
30 - libgdal=3.3.2
31 - proj=8.0.1
32 - geos=3.9.1
33 - nbsphinx=0.8.7
34 - jupyter_client=7.0.3
35 - ipykernel=6.4.1
36 - myst-parser=0.15.2
3637 - folium=0.12.0
37 - libpysal=4.4.0
38 - pygeos=0.9
38 - libpysal=4.5.1
39 - pygeos=0.10.2
40 - xyzservices=2021.9.1
3941 - pip
4042 - pip:
4143 - sphinx-toggleprompt
00 /* colors */
11
2 h1 {
3 color: #139C5A;
4 }
5
6 h2 {
7 color: #333333;
8 }
9
10 .nav li.active>a, .navbar-nav>.active>.nav-link {
11 color: #139C5A!important;
12 }
13
14 .toc-entry>.nav-link.active {
15 border-left-color: #139C5A;
16 color: #139C5A!important;
17 }
18
19 .nav li>a:hover {
20 color: #333333!important;
2 :root {
3 --pst-color-primary: 19, 156, 90;
4 --pst-color-active-navigation: 19, 156, 90;
5 --pst-color-h2: var(--color-text-base);
216 }
227
238 /* buttons */
4343 imports when possible, and explicit relative imports for local
4444 imports when necessary in tests.
4545
46 - GeoPandas supports Python 3.6+ only. The last version of GeoPandas
46 - GeoPandas supports Python 3.7+ only. The last version of GeoPandas
4747 supporting Python 2 is 0.6.
4848
4949
107107 the upstream (main project) *GeoPandas* repository.
108108
109109 The testing suite will run automatically on GitHub Actions once your pull request is
110 submitted. The test suite will also autmatically run on your branch so you can
111 check it prior to submitting the pull request.
110 submitted. The test suite will also automatically run on your branch so you can
111 check it prior to submitting the pull request.
112112
113113 Creating a branch
114114 ~~~~~~~~~~~~~~~~~~
246246 6) Updating the Documentation
247247 -----------------------------
248248
249 *GeoPandas* documentation resides in the ``doc`` folder. Changes to the docs are make by
250 modifying the appropriate file in the `source` folder within ``doc``. *GeoPandas* docs use
249 *GeoPandas* documentation resides in the ``doc`` folder. Changes to the docs are made by
250 modifying the appropriate file in the ``source`` folder within ``doc``. *GeoPandas* docs use
251251 mixture of reStructuredText syntax for ``rst`` files, `which is explained here
252252 <http://www.sphinx-doc.org/en/stable/rest.html#rst-primer>`_ and MyST syntax for ``md``
253253 files `explained here <https://myst-parser.readthedocs.io/en/latest/index.html>`_.
256256 and examples are Jupyter notebooks converted to docs using `nbsphinx
257257 <https://nbsphinx.readthedocs.io/>`_. Jupyter notebooks should be stored without the output.
258258
259 We highly encourage you to follow the `Google developer documentation style guide
260 <https://developers.google.com/style/highlights>`_ when updating or creating new documentation.
261
259262 Once you have made your changes, you may try if they render correctly by
260 building the docs using sphinx. To do so, you can navigate to the `doc` folder
263 building the docs using sphinx. To do so, you can navigate to the `doc` folder::
264
265 cd doc
266
261267 and type::
262268
263269 make html
264270
265 The resulting html pages will be located in ``doc/build/html``. In case of any errors, you
266 can try to use ``make html`` within a new environment based on environment.yml
267 specification in the ``doc`` folder. You may need to register Jupyter kernel as
271 The resulting html pages will be located in ``doc/build/html``.
272
273 In case of any errors, you can try to use ``make html`` within a new environment based on
274 environment.yml specification in the ``doc`` folder. You may need to register Jupyter kernel as
268275 ``geopandas_docs``. Using conda::
269276
277 cd doc
270278 conda env create -f environment.yml
271279 conda activate geopandas_docs
272280 python -m ipykernel install --user --name geopandas_docs
273281 make html
274282
275 For minor updates, you can skip whole ``make html`` part as reStructuredText and MyST
283 For minor updates, you can skip the ``make html`` part as reStructuredText and MyST
276284 syntax are usually quite straightforward.
277285
278286
346354 Now you can commit your changes in your local repository::
347355
348356 git commit -m
349
11
22 ## GeoPandas dependencies
33
4 GeoPandas brings together the full capability of `pandas` and open-source geospatial
4 GeoPandas brings together the full capability of `pandas` and the open-source geospatial
55 tools `Shapely`, which brings manipulation and analysis of geometric objects backed by
66 [`GEOS`](https://trac.osgeo.org/geos) library, `Fiona`, allowing us to read and write
77 geographic data files using [`GDAL`](https://gdal.org), and `pyproj`, a library for
8 cartographic projections and coordinate transformations, which is a Python interface of
8 cartographic projections and coordinate transformations, which is a Python interface to
99 [`PROJ`](https://proj.org).
1010
1111 Furthermore, GeoPandas has several optional dependencies such as `rtree`, `pygeos`,
3939
4040 #### [pyproj](https://github.com/pyproj4/pyproj)
4141 `pyproj` is a Python interface to `PROJ` (cartographic projections and coordinate
42 transformations library). GeoPandas uses `pyproj.crs.CRS` object to keep track of a
42 transformations library). GeoPandas uses a `pyproj.crs.CRS` object to keep track of the
4343 projection of each `GeoSeries` and its `Transformer` object to manage re-projections.
4444
4545 ### Optional dependencies
7777
7878 Various packages are built on top of GeoPandas addressing specific geospatial data
7979 processing needs, analysis, and visualization. Below is an incomplete list (in no
80 particular order) of tools which form GeoPandas related Python ecosystem.
80 particular order) of tools which form the GeoPandas-related Python ecosystem.
8181
8282 ### Spatial analysis and Machine Learning
8383
104104 ##### [segregation](https://github.com/pysal/segregation)
105105 The `segregation` package calculates over 40 different segregation indices and provides a
106106 suite of additional features for measurement, visualization, and hypothesis testing that
107 together represent the state-of-the-art in quantitative segregation analysis.
107 together represent the state of the art in quantitative segregation analysis.
108108
109109 ##### [mgwr](https://github.com/pysal/mgwr)
110110 `mgwr` provides scalable algorithms for estimation, inference, and prediction using
111 single- and multi-scale geographically-weighted regression models in a variety of
112 generalized linear model frameworks, as well model diagnostics tools.
111 single- and multi-scale geographically weighted regression models in a variety of
112 generalized linear model frameworks, as well as model diagnostics tools.
113113
114114 ##### [tobler](https://github.com/pysal/tobler)
115 `tobler` provides functionality for for areal interpolation and dasymetric mapping.
115 `tobler` provides functionality for areal interpolation and dasymetric mapping.
116116 `tobler` includes functionality for interpolating data using area-weighted approaches,
117117 regression model-based approaches that leverage remotely-sensed raster data as auxiliary
118118 information, and hybrid approaches.
119
120119
121120 #### [movingpandas](https://github.com/anitagraser/movingpandas)
122121 `MovingPandas` is a package for dealing with movement data. `MovingPandas` implements a
162161
163162 ### Visualization
164163
164 #### [hvPlot](https://hvplot.holoviz.org/user_guide/Geographic_Data.html#Geopandas)
165 `hvPlot` provides interactive Bokeh-based plotting for GeoPandas
166 dataframes and series using the same API as the Matplotlib `.plot()`
167 support that comes with GeoPandas. hvPlot makes it simple to pan and zoom into
168 your plots, use widgets to explore multidimensional data, and render even the
169 largest datasets in web browsers using [Datashader](https://datashader.org).
170
165171 #### [contextily](https://github.com/geopandas/contextily)
166172 `contextily` is a small Python 3 (3.6 and above) package to retrieve tile maps from the
167173 internet. It can add those tiles as basemap to `matplotlib` figures or write tile maps
199205 `matplotlib`.
200206
201207 #### [GeoViews](https://github.com/holoviz/geoviews)
202 `GeoViews` is a Python library that makes it easy to explore and visualize any data that
203 includes geographic locations. It has particularly powerful support for multidimensional
204 meteorological and oceanographic datasets, such as those used in weather, climate, and
205 remote sensing research, but is useful for almost anything that you would want to plot
206 on a map!
208 `GeoViews` is a Python library that makes it easy to explore and
209 visualize any data that includes geographic locations, with native
210 support for GeoPandas dataframes and series objects. It has
211 particularly powerful support for multidimensional meteorological and
212 oceanographic datasets, such as those used in weather, climate, and
213 remote sensing research, but is useful for almost anything that you
214 would want to plot on a map!
207215
208216 #### [EarthPy](https://github.com/earthlab/earthpy)
209217 `EarthPy` is a python package that makes it easier to plot and work with spatial raster
228236 ### Geometry manipulation
229237
230238 #### [TopoJSON](https://github.com/mattijn/topojson)
231 `Topojson` is a library that is capable of creating a topojson encoded format of merely
232 any geographical object in Python. With topojson it is possible to reduce the size of
233 your geographical data. Mostly by orders of magnitude. It is able to do so through:
234 eliminating redundancy through computation of a topology; fixed-precision integer
235 encoding of coordinates and simplification and quantization of arcs.
239 `topojson` is a library for creating a TopoJSON encoding of nearly any
240 geographical object in Python. With topojson it is possible to reduce the size of
241 your geographical data, typically by orders of magnitude. It is able to do so through
242 eliminating redundancy through computation of a topology, fixed-precision integer
243 encoding of coordinates, and simplification and quantization of arcs.
236244
237245 #### [geocube](https://github.com/corteva/geocube)
238246 Tool to convert geopandas vector data into rasterized `xarray` data.
243251 `OSMnx` is a Python package that lets you download spatial data from OpenStreetMap and
244252 model, project, visualize, and analyze real-world street networks. You can download and
245253 model walkable, drivable, or bikeable urban networks with a single line of Python code
246 then easily analyze and visualize them. You can just as easily download and work with
254 and then easily analyze and visualize them. You can just as easily download and work with
247255 other infrastructure types, amenities/points of interest, building footprints, elevation
248256 data, street bearings/orientations, and speed/travel time.
249257
266274 package is intended for exploratory data analysis and draws inspiration from
267275 sqlalchemy-like interfaces and `acs.R`. With separate APIs for application developers
268276 and folks who only want to get their data quickly & painlessly, `cenpy` should meet the
269 needs of most who aim to get US Census Data from Python.
277 needs of most who aim to get US Census Data into Python.
270278
271279 ```{admonition} Expand this page
272280 Do know a package which should be here? [Let us
3535 "myst_parser",
3636 "nbsphinx",
3737 "numpydoc",
38 'sphinx_toggleprompt',
39 "matplotlib.sphinxext.plot_directive"
38 "sphinx_toggleprompt",
39 "matplotlib.sphinxext.plot_directive",
4040 ]
4141
4242 # continue doc build and only print warnings/errors in examples
5353
5454
5555 def setup(app):
56 app.add_stylesheet("custom.css") # may also be an URL
56 app.add_css_file("custom.css") # may also be an URL
5757
5858
5959 # Add any paths that contain templates here, relative to this directory.
6464
6565 nbsphinx_execute = "always"
6666 nbsphinx_allow_errors = True
67
68 # connect docs in other projects
69 intersphinx_mapping = {"pyproj": ("http://pyproj4.github.io/pyproj/stable/", None)}
67 nbsphinx_kernel_name = "python3"
68
7069 # suppress matplotlib warning in examples
7170 warnings.filterwarnings(
7271 "ignore",
330329
331330 __ https://github.com/geopandas/geopandas/blob/master/doc/source/{{ docname }}
332331 """
332
333 # --Options for sphinx extensions -----------------------------------------------
334
335 # connect docs in other projects
336 intersphinx_mapping = {
337 "cartopy": (
338 "https://scitools.org.uk/cartopy/docs/latest/",
339 "https://scitools.org.uk/cartopy/docs/latest/objects.inv",
340 ),
341 "contextily": (
342 "https://contextily.readthedocs.io/en/stable/",
343 "https://contextily.readthedocs.io/en/stable/objects.inv",
344 ),
345 "fiona": (
346 "https://fiona.readthedocs.io/en/stable/",
347 "https://fiona.readthedocs.io/en/stable/objects.inv",
348 ),
349 "folium": (
350 "https://python-visualization.github.io/folium/",
351 "https://python-visualization.github.io/folium/objects.inv",
352 ),
353 "geoplot": (
354 "https://residentmario.github.io/geoplot/index.html",
355 "https://residentmario.github.io/geoplot/objects.inv",
356 ),
357 "geopy": (
358 "https://geopy.readthedocs.io/en/stable/",
359 "https://geopy.readthedocs.io/en/stable/objects.inv",
360 ),
361 "libpysal": (
362 "https://pysal.org/libpysal/",
363 "https://pysal.org/libpysal/objects.inv",
364 ),
365 "mapclassify": (
366 "https://pysal.org/mapclassify/",
367 "https://pysal.org/mapclassify/objects.inv",
368 ),
369 "matplotlib": (
370 "https://matplotlib.org/stable/",
371 "https://matplotlib.org/stable/objects.inv",
372 ),
373 "pandas": (
374 "https://pandas.pydata.org/pandas-docs/stable/",
375 "https://pandas.pydata.org/pandas-docs/stable/objects.inv",
376 ),
377 "pyarrow": ("https://arrow.apache.org/docs/", None),
378 "pyepsg": (
379 "https://pyepsg.readthedocs.io/en/stable/",
380 "https://pyepsg.readthedocs.io/en/stable/objects.inv",
381 ),
382 "pygeos": (
383 "https://pygeos.readthedocs.io/en/latest/",
384 "https://pygeos.readthedocs.io/en/latest/objects.inv",
385 ),
386 "pyproj": (
387 "https://pyproj4.github.io/pyproj/stable/",
388 "https://pyproj4.github.io/pyproj/stable/objects.inv",
389 ),
390 "python": (
391 "https://docs.python.org/3",
392 "https://docs.python.org/3/objects.inv",
393 ),
394 "rtree": (
395 "https://rtree.readthedocs.io/en/stable/",
396 "https://rtree.readthedocs.io/en/stable/objects.inv",
397 ),
398 "rasterio": (
399 "https://rasterio.readthedocs.io/en/stable/",
400 "https://rasterio.readthedocs.io/en/stable/objects.inv",
401 ),
402 "shapely": (
403 "https://shapely.readthedocs.io/en/stable/",
404 "https://shapely.readthedocs.io/en/stable/objects.inv",
405 ),
406 "branca": (
407 "https://python-visualization.github.io/branca/",
408 "https://python-visualization.github.io/branca/objects.inv",
409 ),
410 "xyzservices": (
411 "https://xyzservices.readthedocs.io/en/stable/",
412 "https://xyzservices.readthedocs.io/en/stable/objects.inv",
413 ),
414 }
1616 :maxdepth: 2
1717
1818 user_guide/missing_empty
19 user_guide/reproject_fiona
1212
1313 GeoDataFrame
1414
15 Reading and writing files
16 -------------------------
15 Serialization / IO / conversion
16 -------------------------------
1717
1818 .. autosummary::
1919 :toctree: api/
2626 GeoDataFrame.to_parquet
2727 GeoDataFrame.to_feather
2828 GeoDataFrame.to_postgis
29 GeoDataFrame.to_wkb
30 GeoDataFrame.to_wkt
2931
3032 Projection handling
3133 -------------------
5658 GeoDataFrame.dissolve
5759 GeoDataFrame.explode
5860
61 Spatial joins
62 -------------
63
64 .. autosummary::
65 :toctree: api/
66
67 GeoDataFrame.sjoin
68 GeoDataFrame.sjoin_nearest
69
70 Overlay operations
71 ------------------
72
73 .. autosummary::
74 :toctree: api/
75
76 GeoDataFrame.clip
77 GeoDataFrame.overlay
78
5979 Plotting
6080 --------
81
82 .. autosummary::
83 :toctree: api/
84
85 GeoDataFrame.explore
86
6187
6288 .. autosummary::
6389 :toctree: api/
6490 :template: accessor_callable.rst
6591
6692 GeoDataFrame.plot
67
6893
6994 Spatial index
7095 -------------
95120 All pandas ``DataFrame`` methods are also available, although they may
96121 not operate in a meaningful way on the ``geometry`` column. All methods
97122 listed in `GeoSeries <geoseries>`__ work directly on an active geometry column of GeoDataFrame.
98
107107 GeoSeries.unary_union
108108 GeoSeries.explode
109109
110 Reading and writing files
111 -------------------------
110 Serialization / IO / conversion
111 -------------------------------
112112
113113 .. autosummary::
114114 :toctree: api/
115115
116116 GeoSeries.from_file
117 GeoSeries.from_wkb
118 GeoSeries.from_wkt
119 GeoSeries.from_xy
117120 GeoSeries.to_file
118121 GeoSeries.to_json
122 GeoSeries.to_wkb
123 GeoSeries.to_wkt
119124
120125 Projection handling
121126 -------------------
138143 GeoSeries.isna
139144 GeoSeries.notna
140145
146 Overlay operations
147 ------------------
148
149 .. autosummary::
150 :toctree: api/
151
152 GeoSeries.clip
153
141154 Plotting
142155 --------
143156
145158 :toctree: api/
146159
147160 GeoSeries.plot
161 GeoSeries.explore
148162
149163
150164 Spatial index
2929
3030 intersection
3131 is_empty
32 nearest
3233 query
3334 query_bulk
3435 size
4142 (``geopandas.sindex.RTreeIndex``) offers the full capability of
4243 ``rtree.index.Index`` - see the full API in the `rtree documentation`_.
4344
45 Similarly, the ``pygeos``-based spatial index
46 (``geopandas.sindex.PyGEOSSTRTreeIndex``) offers the full capability of
47 ``pygeos.STRtree``, including nearest-neighbor queries.
48 See the full API in the `PyGEOS STRTree documentation`_.
49
4450 .. _rtree documentation: https://rtree.readthedocs.io/en/stable/class.html
51 .. _PyGEOS STRTree documentation: https://pygeos.readthedocs.io/en/latest/strtree.html
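
A hedged sketch of a nearest query against the PyGEOS-backed index (requires
``pygeos``; inputs and results are illustrative)::

    from shapely.geometry import Point
    import geopandas

    world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
    pts = geopandas.GeoSeries([Point(0, 0), Point(50, 10)])

    # returns a (2, n) array of (input index, tree index) pairs
    idx = world.sindex.nearest(pts)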
66 :toctree: api/
77
88 sjoin
9 sjoin_nearest
910 overlay
1011 clip
1112 tools.geocode
1111
1212 Spatial data are often more granular than we need. For example, we might have data on sub-national units, but we're actually interested in studying patterns at the level of countries.
1313
14 In a non-spatial setting, when all we need are summary statistics of the data, we aggregate our data using the ``groupby`` function. But for spatial data, we sometimes also need to aggregate geometric features. In the *geopandas* library, we can aggregate geometric features using the ``dissolve`` function.
14 In a non-spatial setting, when all we need are summary statistics of the data, we aggregate our data using the :meth:`~pandas.DataFrame.groupby` function. But for spatial data, we sometimes also need to aggregate geometric features. In the *geopandas* library, we can aggregate geometric features using the :meth:`~geopandas.GeoDataFrame.dissolve` function.
1515
16 ``dissolve`` can be thought of as doing three things: (a) it dissolves all the geometries within a given group together into a single geometric feature (using the ``unary_union`` method), and (b) it aggregates all the rows of data in a group using ``groupby.aggregate()``, and (c) it combines those two results.
16 :meth:`~geopandas.GeoDataFrame.dissolve` can be thought of as doing three things:
1717
18 ``dissolve`` Example
19 ~~~~~~~~~~~~~~~~~~~~~
18 (a) it dissolves all the geometries within a given group together into a single geometric feature (using the :attr:`~geopandas.GeoSeries.unary_union` method), and
19 (b) it aggregates all the rows of data in a group using :ref:`groupby.aggregate <groupby.aggregate>`, and
20 (c) it combines those two results.
21
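As a rough, purely illustrative sketch (not geopandas' actual implementation), assuming ``world`` is the ``naturalearth_lowres`` dataset used below::

    import geopandas
    from shapely.ops import unary_union

    # (a) merge the geometries within each group into a single feature
    geoms = world.groupby("continent")["geometry"].agg(lambda g: unary_union(list(g)))
    # (b) aggregate the remaining columns with groupby.aggregate
    data = world.drop(columns="geometry").groupby("continent").agg("first")
    # (c) combine the two results
    manual = geopandas.GeoDataFrame(data, geometry=geoms)
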
22 :meth:`~geopandas.GeoDataFrame.dissolve` Example
23 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2024
2125 Suppose we are interested in studying continents, but we only have country-level data like the country dataset included in *geopandas*. We can easily convert this to a continent-level dataset.
2226
2327
24 First, let's look at the most simple case where we just want continent shapes and names. By default, ``dissolve`` will pass ``'first'`` to ``groupby.aggregate``.
28 First, let's look at the simplest case where we just want continent shapes and names. By default, :meth:`~geopandas.GeoDataFrame.dissolve` will pass ``'first'`` to :ref:`groupby.aggregate <groupby.aggregate>`.
2529
2630 .. ipython:: python
2731
3438
3539 continents.head()
3640
37 If we are interested in aggregate populations, however, we can pass different functions to the ``dissolve`` method to aggregate populations using the ``aggfunc =`` argument:
41 If we are interested in aggregate populations, however, we can pass different functions to the :meth:`~geopandas.GeoDataFrame.dissolve` method to aggregate populations using the ``aggfunc =`` argument:
3842
3943 .. ipython:: python
4044
6165 ~~~~~~~~~~~~~~~~~~
6266
6367 The ``aggfunc =`` argument defaults to 'first', which means that the first row of attribute values found in the dissolve routine will be assigned to the resultant dissolved geodataframe.
64 However it also accepts other summary statistic options as allowed by ``pandas.groupby()`` including:
68 However it also accepts other summary statistic options as allowed by :meth:`pandas.groupby <pandas.DataFrame.groupby>` including:
6569
6670 * 'first'
6771 * 'last'
1313 =========================================
1414
1515 GeoPandas implements two main data structures, a :class:`GeoSeries` and a
16 :class:`GeoDataFrame`. These are subclasses of pandas ``Series`` and
17 ``DataFrame``, respectively.
16 :class:`GeoDataFrame`. These are subclasses of :class:`pandas.Series` and
17 :class:`pandas.DataFrame`, respectively.
1818
1919 GeoSeries
2020 ---------
4444 by matching indices. Binary operations can also be applied to a
4545 single geometry, in which case the operation is carried out for each
4646 element of the series with that geometry. In either case, a
47 ``Series`` or a :class:`GeoSeries` will be returned, as appropriate.
47 :class:`~pandas.Series` or a :class:`GeoSeries` will be returned, as appropriate.
4848
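For instance (a hedged sketch using shapely's ``Point``)::

    from shapely.geometry import Point
    from geopandas import GeoSeries

    s = GeoSeries([Point(0, 0).buffer(1), Point(2, 2).buffer(1)])
    s.distance(Point(0, 0))   # element-wise against one geometry -> pandas Series
    s.intersection(s)         # element-wise between two GeoSeries -> GeoSeries
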
4949 A short summary of a few attributes and methods for GeoSeries is
5050 presented here, and a full list can be found in the :doc:`all attributes and methods page <../reference/geoseries>`.
6363 Basic Methods
6464 ^^^^^^^^^^^^^^
6565
66 * :meth:`~GeoSeries.distance`: returns ``Series`` with minimum distance from each entry to ``other``
66 * :meth:`~GeoSeries.distance`: returns :class:`~pandas.Series` with minimum distance from each entry to ``other``
6767 * :attr:`~GeoSeries.centroid`: returns :class:`GeoSeries` of centroids
6868 * :meth:`~GeoSeries.representative_point`: returns :class:`GeoSeries` of points that are guaranteed to be within each geometry. It does **NOT** return centroids.
6969 * :meth:`~GeoSeries.to_crs`: change coordinate reference system. See :doc:`projections <projections>`
116116 Now, we create centroids and make it the geometry:
117117
118118 .. ipython:: python
119 :okwarning:
119120
120121 world['centroid_column'] = world.centroid
121122 world = world.set_geometry('centroid_column')
128129
129130 gdf = gdf.rename(columns={'old_name': 'new_name'}).set_geometry('new_name')
130131
131 **Note 2:** Somewhat confusingly, by default when you use the ``read_file`` command, the column containing spatial objects from the file is named "geometry" by default, and will be set as the active geometry column. However, despite using the same term for the name of the column and the name of the special attribute that keeps track of the active column, they are distinct. You can easily shift the active geometry column to a different :class:`GeoSeries` with the :meth:`~GeoDataFrame.set_geometry` command. Further, ``gdf.geometry`` will always return the active geometry column, *not* the column named ``geometry``. If you wish to call a column named "geometry", and a different column is the active geometry column, use ``gdf['geometry']``, not ``gdf.geometry``.
132 **Note 2:** Somewhat confusingly, by default when you use the :func:`~geopandas.read_file` command, the column containing spatial objects from the file is named "geometry" by default, and will be set as the active geometry column. However, despite using the same term for the name of the column and the name of the special attribute that keeps track of the active column, they are distinct. You can easily shift the active geometry column to a different :class:`GeoSeries` with the :meth:`~GeoDataFrame.set_geometry` command. Further, ``gdf.geometry`` will always return the active geometry column, *not* the column named ``geometry``. If you wish to call a column named "geometry", and a different column is the active geometry column, use ``gdf['geometry']``, not ``gdf.geometry``.
132133
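Continuing the ``world`` example above, a short sketch of that distinction::

    world.geometry.name       # 'centroid_column' -- the active geometry column
    world['geometry'].name    # the column literally named "geometry", still present
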
133134 Attributes and Methods
134135 ~~~~~~~~~~~~~~~~~~~~~~
135136
136137 Any of the attributes calls or methods described for a :class:`GeoSeries` will work on a :class:`GeoDataFrame` -- effectively, they are just applied to the "geometry" :class:`GeoSeries`.
137138
138 However, ``GeoDataFrames`` also have a few extra methods for input and output which are described on the :doc:`Input and Output <io>` page and for geocoding with are described in :doc:`Geocoding <geocoding>`.
139 However, :class:`GeoDataFrames <GeoDataFrame>` also have a few extra methods for input and output, which are described on the :doc:`Input and Output <io>` page, and for geocoding, which are described in :doc:`Geocoding <geocoding>`.
139140
140141
141142 .. ipython:: python
156157 geopandas.options
157158
158159 The ``geopandas.options.display_precision`` option can control the number of
159 decimals to show in the display of coordinates in the geometry column.
160 decimals to show in the display of coordinates in the geometry column.
160161 In the ``world`` example of above, the default is to show 5 decimals for
161162 geographic coordinates:
162163
3131 boro_locations.plot(ax=ax, color="red");
3232
3333
34 By default, the ``geocode`` function uses the
35 `GeoCode.Farm geocoding API <https://geocode.farm/>`__ with a rate limitation
36 applied. But a different geocoding service can be specified with the
34 By default, the :func:`~geopandas.tools.geocode` function uses the
35 `Photon geocoding API <https://photon.komoot.io>`__.
36 But a different geocoding service can be specified with the
3737 ``provider`` keyword.
3838
3939 The argument to ``provider`` can either be a string referencing geocoding
4040 services, such as ``'google'``, ``'bing'``, ``'yahoo'``, and
41 ``'openmapquest'``, or an instance of a ``Geocoder`` from ``geopy``. See
41 ``'openmapquest'``, or an instance of a :mod:`Geocoder <geopy.geocoders>` from :mod:`geopy`. See
4242 ``geopy.geocoders.SERVICE_TO_GEOCODER`` for the full list.
4343 For many providers, parameters such as API keys need to be passed as
44 ``**kwargs`` in the ``geocode`` call.
44 ``**kwargs`` in the :func:`~geopandas.tools.geocode` call.
4545
4646 For example, to use the OpenStreetMap Nominatim geocoder, you need to specify
4747 a user agent:
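A sketch of such a call (hedged; the ``user_agent`` string is a placeholder identifying your own application)::

    from geopandas.tools import geocode

    geocode(["boston common"], provider="nominatim", user_agent="my-application")
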
5353 .. attention::
5454
5555 Please consult the Terms of Service for the chosen provider. The example
56 above uses ``'geocodefarm'`` (the default), for which free users are
57 limited to 250 calls per day and 4 requests per second
58 (`geocodefarm ToS <https://geocode.farm/geocoding/free-api-documentation/>`_).
56 above uses ``'photon'`` (the default), which expects fair usage
57 - extensive usage will be throttled.
58 (`Photon's Terms of Use <https://photon.komoot.io>`_).
1111
1212 .. method:: GeoSeries.buffer(distance, resolution=16)
1313
14 Returns a ``GeoSeries`` of geometries representing all points within a given `distance`
14 Returns a :class:`~geopandas.GeoSeries` of geometries representing all points within a given `distance`
1515 of each geometric object.
1616
1717 .. attribute:: GeoSeries.boundary
1818
19 Returns a ``GeoSeries`` of lower dimensional objects representing
19 Returns a :class:`~geopandas.GeoSeries` of lower dimensional objects representing
2020 each geometry's set-theoretic `boundary`.
2121
2222 .. attribute:: GeoSeries.centroid
2323
24 Returns a ``GeoSeries`` of points for each geometric centroid.
24 Returns a :class:`~geopandas.GeoSeries` of points for each geometric centroid.
2525
2626 .. attribute:: GeoSeries.convex_hull
2727
28 Returns a ``GeoSeries`` of geometries representing the smallest
28 Returns a :class:`~geopandas.GeoSeries` of geometries representing the smallest
2929 convex `Polygon` containing all the points in each object unless the
3030 number of points in the object is less than three. For two points,
3131 the convex hull collapses to a `LineString`; for 1, a `Point`.
3232
3333 .. attribute:: GeoSeries.envelope
3434
35 Returns a ``GeoSeries`` of geometries representing the point or
35 Returns a :class:`~geopandas.GeoSeries` of geometries representing the point or
3636 smallest rectangular polygon (with sides parallel to the coordinate
3737 axes) that contains each object.
3838
3939 .. method:: GeoSeries.simplify(tolerance, preserve_topology=True)
4040
41 Returns a ``GeoSeries`` containing a simplified representation of
41 Returns a :class:`~geopandas.GeoSeries` containing a simplified representation of
4242 each object.
4343
4444 .. attribute:: GeoSeries.unary_union
4545
46 Return a geometry containing the union of all geometries in the ``GeoSeries``.
46 Return a geometry containing the union of all geometries in the :class:`~geopandas.GeoSeries`.
4747
4848
4949 Affine transformations
5151
5252 .. method:: GeoSeries.affine_transform(self, matrix)
5353
54 Transform the geometries of the GeoSeries using an affine transformation matrix
54 Transform the geometries of the :class:`~geopandas.GeoSeries` using an affine transformation matrix
5555
5656 .. method:: GeoSeries.rotate(self, angle, origin='center', use_radians=False)
5757
58 Rotate the coordinates of the GeoSeries.
58 Rotate the coordinates of the :class:`~geopandas.GeoSeries`.
5959
6060 .. method:: GeoSeries.scale(self, xfact=1.0, yfact=1.0, zfact=1.0, origin='center')
6161
62 Scale the geometries of the GeoSeries along each (x, y, z) dimensio.
62 Scale the geometries of the :class:`~geopandas.GeoSeries` along each (x, y, z) dimension.
6363
6464 .. method:: GeoSeries.skew(self, angle, origin='center', use_radians=False)
6565
66 Shear/Skew the geometries of the GeoSeries by angles along x and y dimensions.
66 Shear/Skew the geometries of the :class:`~geopandas.GeoSeries` by angles along x and y dimensions.
6767
6868 .. method:: GeoSeries.translate(self, xoff=0.0, yoff=0.0, zoff=0.0)
6969
70 Shift the coordinates of the GeoSeries.
70 Shift the coordinates of the :class:`~geopandas.GeoSeries`.
7171
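A short sketch combining a few of these transformations (hedged; geometries are illustrative)::

    from shapely.geometry import Point
    from geopandas import GeoSeries

    s = GeoSeries([Point(0, 0).buffer(1), Point(2, 2).buffer(1)])
    s.translate(xoff=1.0, yoff=2.0)                    # shift every geometry by (1, 2)
    s.scale(xfact=2.0, yfact=2.0, origin="centroid")   # grow each geometry in place
    s.rotate(45, origin="center")                      # rotate about the bounding-box center
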
7272
7373
9191
9292 .. image:: ../../_static/test.png
9393
94 Some geographic operations return normal pandas object. The ``area`` property of a ``GeoSeries`` will return a ``pandas.Series`` containing the area of each item in the ``GeoSeries``:
94 Some geographic operations return normal pandas objects. The :attr:`~geopandas.GeoSeries.area` property of a :class:`~geopandas.GeoSeries` will return a :class:`pandas.Series` containing the area of each item in the :class:`~geopandas.GeoSeries`:
9595
9696 .. sourcecode:: python
9797
160160 .. image:: ../../_static/nyc_hull.png
161161
162162 To demonstrate a more complex operation, we'll generate a
163 ``GeoSeries`` containing 2000 random points:
163 :class:`~geopandas.GeoSeries` containing 2000 random points:
164164
165165 .. sourcecode:: python
166166
177177
178178 >>> circles = pts.buffer(2000)
179179
180 We can collapse these circles into a single shapely MultiPolygon
180 We can collapse these circles into a single :class:`MultiPolygon`
181181 geometry with
182182
183183 .. sourcecode:: python
202202 .. image:: ../../_static/boros_with_holes.png
203203
204204 Note that this can be simplified a bit, since ``geometry`` is
205 available as an attribute on a ``GeoDataFrame``, and the
206 ``intersection`` and ``difference`` methods are implemented with the
205 available as an attribute on a :class:`~geopandas.GeoDataFrame`, and the
206 :meth:`~geopandas.GeoSeries.intersection` and :meth:`~geopandas.GeoSeries.difference` methods are implemented with the
207207 "&" and "-" operators, respectively. For example, the latter could
208208 have been expressed simply as ``boros.geometry - mp``.
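
For example, both spellings below are equivalent (a minimal sketch reusing the ``boros`` and ``mp`` objects from above):

.. sourcecode:: python

    >>> holes = boros.geometry - mp               # operator form
    >>> holes = boros.geometry.difference(mp)     # method form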
209209
88 Indexing and Selecting Data
99 ===========================
1010
11 GeoPandas inherits the standard ``pandas`` methods for indexing/selecting data. This includes label based indexing with ``.loc`` and integer position based indexing with ``.iloc``, which apply to both ``GeoSeries`` and ``GeoDataFrame`` objects. For more information on indexing/selecting, see the pandas_ documentation.
11 GeoPandas inherits the standard pandas_ methods for indexing/selecting data. This includes label based indexing with :attr:`~pandas.DataFrame.loc` and integer position based indexing with :attr:`~pandas.DataFrame.iloc`, which apply to both :class:`GeoSeries` and :class:`GeoDataFrame` objects. For more information on indexing/selecting, see the pandas_ documentation.
1212
1313 .. _pandas: http://pandas.pydata.org/pandas-docs/stable/indexing.html
1414
15 In addition to the standard ``pandas`` methods, GeoPandas also provides
16 coordinate based indexing with the ``cx`` indexer, which slices using a bounding
17 box. Geometries in the ``GeoSeries`` or ``GeoDataFrame`` that intersect the
15 In addition to the standard pandas_ methods, GeoPandas also provides
16 coordinate based indexing with the :attr:`~GeoDataFrame.cx` indexer, which slices using a bounding
17 box. Geometries in the :class:`GeoSeries` or :class:`GeoDataFrame` that intersect the
1818 bounding box will be returned.
1919
2020 Using the ``world`` dataset, we can use this functionality to quickly select all
2626 southern_world = world.cx[:, :0]
2727 @savefig world_southern.png
2828 southern_world.plot(figsize=(10, 3));
29
0 {
1 "cells": [
2 {
3 "cell_type": "markdown",
4 "source": [
5 "# Interactive mapping\n",
6 "\n",
7 "Alongside static plots, `geopandas` can create interactive maps based on the [folium](https://python-visualization.github.io/folium/) library.\n",
8 "\n",
9 "Creating maps for interactive exploration mirrors the API of [static plots](../reference/api/geopandas.GeoDataFrame.plot.html) in an [explore()](../reference/api/geopandas.GeoDataFrame.explore.html) method of a GeoSeries or GeoDataFrame.\n",
10 "\n",
11 "Loading some example data:"
12 ],
13 "metadata": {}
14 },
15 {
16 "cell_type": "code",
17 "execution_count": null,
18 "source": [
19 "import geopandas\n",
20 "\n",
21 "nybb = geopandas.read_file(geopandas.datasets.get_path('nybb'))\n",
22 "world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))\n",
23 "cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))"
24 ],
25 "outputs": [],
26 "metadata": {}
27 },
28 {
29 "cell_type": "markdown",
30 "source": [
31 "The simplest option is to use `GeoDataFrame.explore()`:"
32 ],
33 "metadata": {}
34 },
35 {
36 "cell_type": "code",
37 "execution_count": null,
38 "source": [
39 "nybb.explore()"
40 ],
41 "outputs": [],
42 "metadata": {}
43 },
44 {
45 "cell_type": "markdown",
46 "source": [
47 "Interactive plotting offers largely the same customisation as static one plus some features on top of that. Check the code below which plots a customised choropleth map. You can use `\"BoroName\"` column with NY boroughs names as an input of the choropleth, show (only) its name in the tooltip on hover but show all values on click. You can also pass custom background tiles (either a name supported by folium, a name recognized by `xyzservices.providers.query_name()`, XYZ URL or `xyzservices.TileProvider` object), specify colormap (all supported by `matplotlib`) and specify black outline."
48 ],
49 "metadata": {}
50 },
51 {
52 "cell_type": "code",
53 "execution_count": null,
54 "source": [
55 "nybb.explore( \n",
56 " column=\"BoroName\", # make choropleth based on \"BoroName\" column\n",
57 " tooltip=\"BoroName\", # show \"BoroName\" value in tooltip (on hover)\n",
58 " popup=True, # show all values in popup (on click)\n",
59 " tiles=\"CartoDB positron\", # use \"CartoDB positron\" tiles\n",
60 " cmap=\"Set1\", # use \"Set1\" matplotlib colormap\n",
61 " style_kwds=dict(color=\"black\") # use black outline\n",
62 " )"
63 ],
64 "outputs": [],
65 "metadata": {}
66 },
67 {
68 "cell_type": "markdown",
69 "source": [
70 "The `explore()` method returns a `folium.Map` object, which can also be passed directly (as you do with `ax` in `plot()`). You can then use folium functionality directly on the resulting map. In the example below, you can plot two GeoDataFrames on the same map and add layer control using folium. You can also add additional tiles allowing you to change the background directly in the map."
71 ],
72 "metadata": {}
73 },
74 {
75 "cell_type": "code",
76 "execution_count": null,
77 "source": [
78 "import folium\n",
79 "\n",
80 "m = world.explore(\n",
81 " column=\"pop_est\", # make choropleth based on \"BoroName\" column\n",
82 " scheme=\"naturalbreaks\", # use mapclassify's natural breaks scheme\n",
83 " legend=True, # show legend\n",
84 " k=10, # use 10 bins\n",
85 " legend_kwds=dict(colorbar=False), # do not use colorbar\n",
86 " name=\"countries\" # name of the layer in the map\n",
87 ")\n",
88 "\n",
89 "cities.explore(\n",
90 " m=m, # pass the map object\n",
91 " color=\"red\", # use red color on all points\n",
92 " marker_kwds=dict(radius=10, fill=True), # make marker radius 10px with fill\n",
93 " tooltip=\"name\", # show \"name\" column in the tooltip\n",
94 " tooltip_kwds=dict(labels=False), # do not show column label in the tooltip\n",
95 " name=\"cities\" # name of the layer in the map\n",
96 ")\n",
97 "\n",
98 "folium.TileLayer('Stamen Toner', control=True).add_to(m) # use folium to add alternative tiles\n",
99 "folium.LayerControl().add_to(m) # use folium to add layer control\n",
100 "\n",
101 "m # show map"
102 ],
103 "outputs": [],
104 "metadata": {}
105 }
106 ],
107 "metadata": {
108 "kernelspec": {
109 "display_name": "Python 3",
110 "language": "python",
111 "name": "python3"
112 },
113 "language_info": {
114 "codemirror_mode": {
115 "name": "ipython",
116 "version": 3
117 },
118 "file_extension": ".py",
119 "mimetype": "text/x-python",
120 "name": "python",
121 "nbconvert_exporter": "python",
122 "pygments_lexer": "ipython3",
123 "version": "3.9.2"
124 }
125 },
126 "nbformat": 4,
127 "nbformat_minor": 5
128 }
1717 transformations.
1818
1919 Any arguments passed to :func:`geopandas.read_file` after the file name will be
20 passed directly to ``fiona.open``, which does the actual data importation. In
20 passed directly to :func:`fiona.open`, which does the actual data importation. In
2121 general, :func:`geopandas.read_file` is pretty smart and should do what you want
2222 without extra arguments, but for more help, type::
2323
2929
3030 countries_gdf = geopandas.read_file("package.gpkg", layer='countries')
3131
32 Where supported in ``fiona``, *geopandas* can also load resources directly from
32 Where supported in :mod:`fiona`, *geopandas* can also load resources directly from
3333 a web URL, for example for GeoJSON files from `geojson.xyz <http://geojson.xyz/>`_::
3434
3535 url = "http://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson"
4949
5050 zipfile = "zip:///Users/name/Downloads/gadm36_AFG_shp.zip!data/gadm36_AFG_1.shp"
5151
52 It is also possible to read any file-like objects with a ``read()`` method, such
53 as a file handler (e.g. via built-in ``open`` function) or ``StringIO``::
52 It is also possible to read any file-like object with a ``read()`` method, such
53 as a file handler (e.g. via the built-in :func:`open` function) or :class:`~io.StringIO`::
5454
5555 filename = "test.geojson"
5656 file = open(filename)
196196 Writing to PostGIS::
197197
198198 from sqlalchemy import create_engine
199 db_connection_url = "postgres://myusername:mypassword@myhost:5432/mydatabase";
199 db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydatabase";
200200 engine = create_engine(db_connection_url)
201201 countries_gdf.to_postgis("countries_table", con=engine)
202202
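Reading the table back works similarly (a minimal sketch; the table and geometry
column names are the ones used above and may differ for your data)::

    import geopandas
    countries_gdf = geopandas.read_postgis(
        "SELECT * FROM countries_table", con=engine, geom_col="geometry"
    )
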
1414 =========================================
1515
1616
17 *geopandas* provides a high-level interface to the ``matplotlib`` library for making maps. Mapping shapes is as easy as using the ``plot()`` method on a ``GeoSeries`` or ``GeoDataFrame``.
17 *geopandas* provides a high-level interface to the matplotlib_ library for making maps. Mapping shapes is as easy as using the :meth:`~GeoDataFrame.plot()` method on a :class:`GeoSeries` or :class:`GeoDataFrame`.
18
19 .. _matplotlib: https://matplotlib.org/stable/
1820
1921 Loading some example data:
2022
3436 @savefig world_randomcolors.png
3537 world.plot();
3638
37 Note that in general, any options one can pass to `pyplot <http://matplotlib.org/api/pyplot_api.html>`_ in ``matplotlib`` (or `style options that work for lines <http://matplotlib.org/api/lines_api.html>`_) can be passed to the ``plot()`` method.
39 Note that in general, any options one can pass to `pyplot <http://matplotlib.org/api/pyplot_api.html>`_ in matplotlib_ (or `style options that work for lines <http://matplotlib.org/api/lines_api.html>`_) can be passed to the :meth:`~GeoDataFrame.plot` method.
3840
3941
4042 Choropleth Maps
4345 *geopandas* makes it easy to create Choropleth maps (maps where the color of each shape is based on the value of an associated variable). Simply use the plot command with the ``column`` argument set to the column whose values you want used to assign colors.
4446
4547 .. ipython:: python
46
47 # Plot by GDP per capta
48 :okwarning:
49
50 # Plot by GDP per capita
4851 world = world[(world.pop_est>0) & (world.name!="Antarctica")]
4952 world['gdp_per_cap'] = world.gdp_md_est / world.pop_est
5053 @savefig world_gdp_per_cap.png
6467 @savefig world_pop_est.png
6568 world.plot(column='pop_est', ax=ax, legend=True)
6669
67 However, the default appearance of the legend and plot axes may not be desirable. One can define the plot axes (with ``ax``) and the legend axes (with ``cax``) and then pass those in to the ``plot`` call. The following example uses ``mpl_toolkits`` to vertically align the plot axes and the legend axes:
70 However, the default appearance of the legend and plot axes may not be desirable. One can define the plot axes (with ``ax``) and the legend axes (with ``cax``) and then pass those in to the :meth:`~GeoDataFrame.plot` call. The following example uses ``mpl_toolkits`` to vertically align the plot axes and the legend axes:
6871
6972 .. ipython:: python
7073
9598 Choosing colors
9699 ~~~~~~~~~~~~~~~~
97100
98 One can also modify the colors used by ``plot`` with the ``cmap`` option (for a full list of colormaps, see the `matplotlib website <http://matplotlib.org/users/colormaps.html>`_):
101 One can also modify the colors used by :meth:`~GeoDataFrame.plot` with the ``cmap`` option (for a full list of colormaps, see the `matplotlib website <http://matplotlib.org/users/colormaps.html>`_):
99102
100103 .. ipython:: python
101104
153156 },
154157 );
155158
159 Other map customizations
160 ~~~~~~~~~~~~~~~~~~~~~~~~
161
162 Maps usually do not need axis labels. You can turn them off using the ``set_axis_off()`` or ``axis("off")`` axis methods.
163
164 .. ipython:: python
165
166 ax = world.plot()
167 @savefig set_axis_off.png
168 ax.set_axis_off();
169
156170 Maps with Layers
157171 -----------------
158172
1010
1111 There are two ways to combine datasets in *geopandas* -- attribute joins and spatial joins.
1212
13 In an attribute join, a ``GeoSeries`` or ``GeoDataFrame`` is combined with a regular *pandas* ``Series`` or ``DataFrame`` based on a common variable. This is analogous to normal merging or joining in *pandas*.
13 In an attribute join, a :class:`GeoSeries` or :class:`GeoDataFrame` is
14 combined with a regular :class:`pandas.Series` or :class:`pandas.DataFrame` based on a
15 common variable. This is analogous to normal merging or joining in *pandas*.
1416
15 In a Spatial Join, observations from two ``GeoSeries`` or ``GeoDataFrames`` are combined based on their spatial relationship to one another.
17 In a Spatial Join, observations from two :class:`GeoSeries` or :class:`GeoDataFrame`
18 are combined based on their spatial relationship to one another.
1619
1720 In the following examples, we use these datasets:
1821
3336 Appending
3437 ---------
3538
36 Appending GeoDataFrames and GeoSeries uses pandas ``append`` methods. Keep in mind, that appended geometry columns needs to have the same CRS.
39 Appending :class:`GeoDataFrame` and :class:`GeoSeries` objects uses the pandas :meth:`~pandas.DataFrame.append` method.
40 Keep in mind that appended geometry columns need to have the same CRS.
3741
3842 .. ipython:: python
3943
4953 Attribute Joins
5054 ----------------
5155
52 Attribute joins are accomplished using the ``merge`` method. In general, it is recommended to use the ``merge`` method called from the spatial dataset. With that said, the stand-alone ``merge`` function will work if the GeoDataFrame is in the ``left`` argument; if a DataFrame is in the ``left`` argument and a GeoDataFrame is in the ``right`` position, the result will no longer be a GeoDataFrame.
56 Attribute joins are accomplished using the :meth:`~pandas.DataFrame.merge` method. In general, it is recommended
57 to use the ``merge()`` method called from the spatial dataset. With that said, the stand-alone
58 :func:`pandas.merge` function will work if the :class:`GeoDataFrame` is in the ``left`` argument;
59 if a :class:`~pandas.DataFrame` is in the ``left`` argument and a :class:`GeoDataFrame`
60 is in the ``right`` position, the result will no longer be a :class:`GeoDataFrame`.
5361
54
55 For example, consider the following merge that adds full names to a ``GeoDataFrame`` that initially has only ISO codes for each country by merging it with a *pandas* ``DataFrame``.
62 For example, consider the following merge that adds full names to a :class:`GeoDataFrame`
63 that initially has only ISO codes for each country by merging it with a :class:`~pandas.DataFrame`.
5664
5765 .. ipython:: python
5866
6573 # Merge with `merge` method on shared variable (iso codes):
6674 country_shapes = country_shapes.merge(country_names, on='iso_a3')
6775 country_shapes.head()
68
6976
7077
7178 Spatial Joins
8390
8491 # Execute spatial join
8592
86 cities_with_country = geopandas.sjoin(cities, countries, how="inner", op='intersects')
93 cities_with_country = cities.sjoin(countries, how="inner", predicate='intersects')
8794 cities_with_country.head()
8895
8996
90 Sjoin Arguments
91 ~~~~~~~~~~~~~~~~
97 GeoPandas provides two spatial-join functions:
9298
93 ``sjoin()`` has two core arguments: ``how`` and ``op``.
99 - :meth:`GeoDataFrame.sjoin`: joins based on binary predicates (intersects, contains, etc.)
100 - :meth:`GeoDataFrame.sjoin_nearest`: joins based on proximity, with the ability to set a maximum search radius.
94101
95 **op**
102 .. note::
103 For historical reasons, both methods are also available as top-level functions :func:`sjoin` and :func:`sjoin_nearest`.
104 It is recommended to use methods as the functions may be deprecated in the future.
96105
97 The ``op`` argument specifies how ``geopandas`` decides whether or not to join the attributes of one object to another, based on their geometric relationship.
106 Binary Predicate Joins
107 ~~~~~~~~~~~~~~~~~~~~~~
98108
99 The values for ``op`` correspond to the names of geometric binary predicates and depend on the spatial index implementation.
109 Binary predicate joins are available via :meth:`GeoDataFrame.sjoin`.
100110
101 The default spatial index in GeoPandas currently supports the following values for ``op``:
111 :meth:`GeoDataFrame.sjoin` has two core arguments: ``how`` and ``predicate``.
112
113 **predicate**
114
115 The ``predicate`` argument specifies how ``geopandas`` decides whether or not to join the attributes of one
116 object to another, based on their geometric relationship.
117
118 The values for ``predicate`` correspond to the names of geometric binary predicates and depend on the spatial
119 index implementation.
120
121 The default spatial index in ``geopandas`` currently supports the following values for ``predicate`` which are
122 defined in the
123 `Shapely documentation <http://shapely.readthedocs.io/en/latest/manual.html#binary-predicates>`__:
102124
103125 * `intersects`
104126 * `contains`
107129 * `crosses`
108130 * `overlaps`
109131
110 You can read more about each join type in the `Shapely documentation <http://shapely.readthedocs.io/en/latest/manual.html#binary-predicates>`__.
111
112132 **how**
113133
114 The `how` argument specifies the type of join that will occur and which geometry is retained in the resultant geodataframe. It accepts the following options:
134 The `how` argument specifies the type of join that will occur and which geometry is retained in the resultant
135 :class:`GeoDataFrame`. It accepts the following options:
115136
116 * ``left``: use the index from the first (or `left_df`) geodataframe that you provide to ``sjoin``; retain only the `left_df` geometry column
137 * ``left``: use the index from the first (or `left_df`) :class:`GeoDataFrame` that you provide
138 to :meth:`GeoDataFrame.sjoin`; retain only the `left_df` geometry column
117139 * ``right``: use the index from the second (or `right_df`); retain only the `right_df` geometry column
118 * ``inner``: use intersection of index values from both geodataframes; retain only the `left_df` geometry column
140 * ``inner``: use the intersection of index values from both :class:`GeoDataFrame` objects; retain only the `left_df` geometry column
119141
120 Note more complicated spatial relationships can be studied by combining geometric operations with spatial join. To find all polygons within a given distance of a point, for example, one can first use the ``buffer`` method to expand each point into a circle of appropriate radius, then intersect those buffered circles with the polygons in question.
142 Note that more complicated spatial relationships can be studied by combining geometric operations with spatial joins.
143 To find all polygons within a given distance of a point, for example, one can first use the :meth:`~geopandas.GeoSeries.buffer` method to expand each
144 point into a circle of appropriate radius, then intersect those buffered circles with the polygons in question.
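
A minimal sketch of that pattern (assuming ``points`` and ``polygons`` GeoDataFrames in a shared
projected CRS; the names and the 10 km radius are illustrative)::

    circles = points.copy()
    circles["geometry"] = points.buffer(10000)  # expand each point into a 10 km circle
    nearby = circles.sjoin(polygons, predicate="intersects")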
145
146 Nearest Joins
147 ~~~~~~~~~~~~~
148
149 Proximity-based joins can be done via :meth:`GeoDataFrame.sjoin_nearest`.
150
151 :meth:`GeoDataFrame.sjoin_nearest` shares the ``how`` argument with :meth:`GeoDataFrame.sjoin`, and
152 includes two additional arguments: ``max_distance`` and ``distance_col``.
153
154 **max_distance**
155
156 The ``max_distance`` argument specifies a maximum search radius for matching geometries. Setting it can considerably improve performance in some cases.
157 If you can, it is highly recommended that you use this parameter.
158
159 **distance_col**
160
161 If set, the resultant GeoDataFrame will include a column with this name containing the computed distances between an input geometry and the nearest geometry.
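
A minimal sketch combining both arguments (reusing the ``cities`` and ``countries`` datasets from
above, re-projected to a metric CRS so that distances are in meters)::

    cities_m = cities.to_crs(epsg=3395)
    countries_m = countries.to_crs(epsg=3395)
    nearest = cities_m.sjoin_nearest(
        countries_m, max_distance=50000, distance_col="distance"
    )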
2121 a Shapely geometry object.
2222 - **Missing geometries** are unknown values in a GeoSeries. They will typically
2323 be propagated in operations (for example in calculations of the area or of
24 the intersection), or ignored in reductions such as ``unary_union``.
24 the intersection), or ignored in reductions such as :attr:`~GeoSeries.unary_union`.
2525 The scalar object (when accessing a single element of a GeoSeries) is the
2626 Python ``None`` object.
2727
6666
6767 s.is_empty
6868
69 To get only the actual geometry objects that are neiter missing nor empty,
69 To get only the actual geometry objects that are neither missing nor empty,
7070 you can use a combination of both:
7171
7272 .. ipython:: python
159159
160160 .. code-block:: python
161161
162 >>> s1.intersection(s2)
162 >>> s1.intersection(s2)
163163 0 GEOMETRYCOLLECTION EMPTY
164164 1 POINT (1 1)
165165 2 GEOMETRYCOLLECTION EMPTY
168168 * Starting from GeoPandas v0.6.0, :meth:`GeoSeries.align` will use missing
169169 values to fill in the non-aligned indices, to be consistent with the
170170 behaviour in pandas:
171
171
172172 .. ipython:: python
173173
174174 s1_aligned, s2_aligned = s1.align(s2)
180180 depending on the spatial operation:
181181
182182 .. ipython:: python
183 :okwarning:
183184
184185 s1.intersection(s2)
2929 - CRS WKT string
3030 - An authority string (e.g. "epsg:4326")
3131 - An EPSG integer code (e.g. 4326)
32 - A ``pyproj.CRS``
32 - A :class:`pyproj.CRS <pyproj.crs.CRS>`
3333 - An object with a to_wkt method.
3434 - PROJ string
3535 - Dictionary of PROJ parameters
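
For example, the following are equivalent ways of setting WGS84 on a GeoDataFrame
without a CRS (a minimal sketch; ``gdf`` is illustrative)::

    import pyproj

    gdf.set_crs(4326)                        # EPSG integer code
    gdf.set_crs("EPSG:4326")                 # authority string
    gdf.set_crs(pyproj.CRS.from_epsg(4326))  # pyproj.CRS object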
8585 world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
8686
8787 # Check original projection
88 # (it's Platte Carre! x-y are long and lat)
88 # (it's Plate Carrée! x-y are long and lat)
8989 world.crs
9090
9191 # Visualize
144144 ---------------------------------------------------------
145145
146146 Starting with GeoPandas 0.7, the `.crs` attribute of a GeoSeries or GeoDataFrame
147 stores the CRS information as a ``pyproj.CRS``, and no longer as a proj4 string
147 stores the CRS information as a :class:`pyproj.CRS <pyproj.crs.CRS>`, and no longer as a proj4 string
148148 or dict.
149149
150150 Before, you might have seen this:
175175 migration issues.
176176
177177 See the `pyproj docs <https://pyproj4.github.io/pyproj/stable/>`__ for more on
178 the ``pyproj.CRS`` object.
178 the :class:`pyproj.CRS <pyproj.crs.CRS>` object.
179179
180180 Importing data from files
181181 ^^^^^^^^^^^^^^^^^^^^^^^^^
266266 **Other formats**
267267
268268 Next to the EPSG code mentioned above, there are also other ways to specify the
269 CRS: an actual ``pyproj.CRS`` object, a WKT string, a PROJ JSON string, etc.
270 Anything that is accepted by ``pyproj.CRS.from_user_input`` can by specified
269 CRS: an actual :class:`pyproj.CRS <pyproj.crs.CRS>` object, a WKT string, a PROJ JSON string, etc.
270 Anything that is accepted by :meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>` can be specified
271271 to the ``crs`` keyword/attribute in GeoPandas.
272272
273 Also compatible CRS objects, such as from the ``rasterio`` package, can be
273 Compatible CRS objects, such as those from the :mod:`rasterio` package, can also be
274274 passed directly to GeoPandas.
275275
276276
305305 There are many file sources and CRS definitions out there "in the wild" that
306306 might have a CRS description that does not fully conform to the new standards of
307307 PROJ > 6 (proj4 strings, older WKT formats, ...). In such cases, you will get a
308 ``pyproj.CRS`` object that might not be fully what you expected (e.g. not equal
308 :class:`pyproj.CRS <pyproj.crs.CRS>` object that might not be fully what you expected (e.g. not equal
309309 to the expected EPSG code). Below we list a few possible cases.
310310
311311 I get a "Bound CRS"?
446446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
447447
448448 If you relied on the ``.crs`` object being a dict or a string, such code can
449 be broken given it is now a ``pyproj.CRS`` object. But this object actually
449 be broken given it is now a :class:`pyproj.CRS <pyproj.crs.CRS>` object. But this object actually
450450 provides a more robust interface to get information about the CRS.
451451
452452 For example, if you used the following code to get the EPSG code:
456456 gdf.crs['init']
457457
458458 This will no longer work. To get the EPSG code from a ``crs`` object, you can use
459 the ``to_epsg()`` method.
459 the :meth:`~pyproj.crs.CRS.to_epsg` method.
460460
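For example::

    epsg = gdf.crs.to_epsg()  # e.g. 4326; returns None if no matching EPSG code is found
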
461461 Or to check if a CRS was a certain UTM zone:
462462
470470
471471 gdf.crs.utm_zone is not None
472472
473 And there are many other methods available on the ``pyproj.CRS`` class to get
473 And there are many other methods available on the :class:`pyproj.CRS <pyproj.crs.CRS>` class to get
474474 information about the CRS.
0 Re-projecting using GDAL with Rasterio and Fiona
1 ================================================
2
3 The simplest method of re-projecting is :meth:`GeoDataFrame.to_crs`.
4 It uses ``pyproj`` as the engine and transforms the points within the geometries.
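
A minimal sketch of that default approach (``gdf`` stands in for any GeoDataFrame):

.. code-block:: python

    gdf_mercator = gdf.to_crs("EPSG:3395")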
5
6 These examples demonstrate how to use ``Fiona`` or ``rasterio`` as the engine to re-project your data.
7 Fiona and rasterio are powered by GDAL and use algorithms that consider the geometry as a whole instead of
8 just the points it contains. This is particularly useful for antimeridian cutting.
9 However, this also means the transformation is not as fast.
10
11
12 Fiona Example
13 -------------
14
15 .. code-block:: python
16
17 from functools import partial
18
19 import fiona
20 import geopandas
21 from fiona.transform import transform_geom
22 from packaging import version
23 from pyproj import CRS
24 from pyproj.enums import WktVersion
25 from shapely.geometry import mapping, shape
26
27
28 # set up Fiona transformer
29 def crs_to_fiona(proj_crs):
30 proj_crs = CRS.from_user_input(proj_crs)
31 if version.parse(fiona.__gdal_version__) < version.parse("3.0.0"):
32 fio_crs = proj_crs.to_wkt(WktVersion.WKT1_GDAL)
33 else:
34 # GDAL 3+ can use WKT2
35 fio_crs = proj_crs.to_wkt()
36 return fio_crs
37
38 def base_transformer(geom, src_crs, dst_crs):
39 return shape(
40 transform_geom(
41 src_crs=crs_to_fiona(src_crs),
42 dst_crs=crs_to_fiona(dst_crs),
43 geom=mapping(geom),
44 antimeridian_cutting=True,
45 )
46 )
47
48 # load example data
49 world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
50
51 destination_crs = "EPSG:3395"
52 forward_transformer = partial(base_transformer, src_crs=world.crs, dst_crs=destination_crs)
53
54 # Reproject to Mercator (after dropping Antarctica)
55 world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")]
56 with fiona.Env(OGR_ENABLE_PARTIAL_REPROJECTION="YES"):
57 mercator_world = world.set_geometry(world.geometry.apply(forward_transformer), crs=destination_crs)
58
59
60 Rasterio Example
61 ----------------
62
63 This example requires rasterio 1.2+ and GDAL 3+.
64
65
66 .. code-block:: python
67
68 import geopandas
69 import rasterio.warp
70 from shapely.geometry import shape
71
72 # load example data
73 world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
74 # Reproject to Mercator (after dropping Antarctica)
75 world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")]
76
77 destination_crs = "EPSG:3395"
78 geometry = rasterio.warp.transform_geom(
79 src_crs=world.crs,
80 dst_crs=destination_crs,
81 geom=world.geometry.values,
82 )
83 mercator_world = world.set_geometry(
84 [shape(geom) for geom in geometry],
85 crs=destination_crs,
86 )
0 .. currentmodule:: geopandas
1
02 .. ipython:: python
13 :suppress:
24
1315 those datasets overlap (or don't overlap). These manipulations are often
1416 referred to using the language of sets -- intersections, unions, and differences.
1517 These types of operations are made available in the *geopandas* library through
16 the ``overlay`` function.
18 the :meth:`~geopandas.GeoDataFrame.overlay` method.
1719
1820 The basic idea is demonstrated by the graphic below but keep in mind that
1921 overlays operate at the DataFrame level, not on individual geometries, and the
20 properties from both are retained. In effect, for every shape in the first
21 GeoDataFrame, this operation is executed against every other shape in the other
22 GeoDataFrame:
22 properties from both are retained. In effect, for every shape in the left
23 :class:`~geopandas.GeoDataFrame`, this operation is executed against every shape in the right
24 :class:`~geopandas.GeoDataFrame`:
2325
2426 .. image:: ../../_static/overlay_operations.png
2527
2628 **Source: QGIS Documentation**
2729
28 (Note to users familiar with the *shapely* library: ``overlay`` can be thought
29 of as offering versions of the standard *shapely* set-operations that deal with
30 the complexities of applying set operations to two *GeoSeries*. The standard
31 *shapely* set-operations are also available as ``GeoSeries`` methods.)
30 .. note::
31 Note to users familiar with the *shapely* library: :meth:`~geopandas.GeoDataFrame.overlay` can be thought
32 of as offering versions of the standard *shapely* set-operations that deal with
33 the complexities of applying set operations to two *GeoSeries*. The standard
34 *shapely* set-operations are also available as :class:`~geopandas.GeoSeries` methods.
3235
3336
3437 The different Overlay operations
5659 df2.plot(ax=ax, color='green', alpha=0.5);
5760
5861 We illustrate the different overlay modes with the above example.
59 The ``overlay`` function will determine the set of all individual geometries
62 The :meth:`~geopandas.GeoDataFrame.overlay` method will determine the set of all individual geometries
6063 from overlaying the two input GeoDataFrames. This result covers the area covered
6164 by the two input GeoDataFrames, and also preserves all unique regions defined by
6265 the combined boundaries of the two GeoDataFrames.
6366
67 .. note::
68 For historical reasons, the overlay method is also available as a top-level function :func:`overlay`.
69 It is recommended to use the method as the function may be deprecated in the future.
70
6471 When using ``how='union'``, all those possible geometries are returned:
6572
6673 .. ipython:: python
6774
68 res_union = geopandas.overlay(df1, df2, how='union')
75 res_union = df1.overlay(df2, how='union')
6976 res_union
7077
7178 ax = res_union.plot(alpha=0.5, cmap='tab10')
7986
8087 .. ipython:: python
8188
82 res_intersection = geopandas.overlay(df1, df2, how='intersection')
89 res_intersection = df1.overlay(df2, how='intersection')
8390 res_intersection
8491
8592 ax = res_intersection.plot(cmap='tab10')
9299
93100 .. ipython:: python
94101
95 res_symdiff = geopandas.overlay(df1, df2, how='symmetric_difference')
102 res_symdiff = df1.overlay(df2, how='symmetric_difference')
96103 res_symdiff
97104
98105 ax = res_symdiff.plot(cmap='tab10')
105112
106113 .. ipython:: python
107114
108 res_difference = geopandas.overlay(df1, df2, how='difference')
115 res_difference = df1.overlay(df2, how='difference')
109116 res_difference
110117
111118 ax = res_difference.plot(cmap='tab10')
118125
119126 .. ipython:: python
120127
121 res_identity = geopandas.overlay(df1, df2, how='identity')
128 res_identity = df1.overlay(df2, how='identity')
122129 res_identity
123130
124131 ax = res_identity.plot(cmap='tab10')
145152 countries = countries.to_crs('epsg:3395')
146153 capitals = capitals.to_crs('epsg:3395')
147154
148 To illustrate the ``overlay`` function, consider the following case in which one
155 To illustrate the :meth:`~geopandas.GeoDataFrame.overlay` method, consider the following case in which one
149156 wishes to identify the "core" portion of each country -- defined as areas within
150157 500km of a capital -- using a ``GeoDataFrame`` of countries and a
151158 ``GeoDataFrame`` of capitals.
170177
171178 .. ipython:: python
172179
173 country_cores = geopandas.overlay(countries, capitals, how='intersection')
180 country_cores = countries.overlay(capitals, how='intersection')
174181 @savefig country_cores.png width=5in
175182 country_cores.plot(alpha=0.5, edgecolor='k', cmap='tab10');
176183
178185
179186 .. ipython:: python
180187
181 country_peripheries = geopandas.overlay(countries, capitals, how='difference')
188 country_peripheries = countries.overlay(capitals, how='difference')
182189 @savefig country_peripheries.png width=5in
183190 country_peripheries.plot(alpha=0.5, edgecolor='k', cmap='tab10');
184191
193200 keep_geom_type keyword
194201 ----------------------
195202
196 In default settings, ``overlay`` returns only geometries of the same geometry type as df1
203 In default settings, :meth:`~geopandas.GeoDataFrame.overlay` returns only geometries of the same geometry type as the GeoDataFrame
197204 (left one) has, where Polygon and MultiPolygon are considered the same type (other types likewise).
198205 You can control this behavior using the ``keep_geom_type`` option, which is set to
199206 True by default. Once set to False, ``overlay`` will return all geometry types resulting from
204211 More Examples
205212 -------------
206213
207 A larger set of examples of the use of ``overlay`` can be found `here <http://nbviewer.jupyter.org/github/geopandas/geopandas/blob/master/examples/overlays.ipynb>`_
214 A larger set of examples of the use of :meth:`~geopandas.GeoDataFrame.overlay` can be found `here <https://nbviewer.jupyter.org/github/geopandas/geopandas/blob/master/doc/source/gallery/overlays.ipynb>`_
208215
209216
210217
1313 Reading and Writing Files <user_guide/io>
1414 Indexing and Selecting Data <user_guide/indexing>
1515 Making Maps and plots <user_guide/mapping>
16 Interactive mapping <user_guide/interactive_mapping>
1617 Managing Projections <user_guide/projections>
1718 Geometric Manipulations <user_guide/geometric_manipulations>
1819 Set Operations with overlay <user_guide/set_operations>
1717 "* Equal Intervals\n",
1818 " - Separates the measure's interval into equal parts, 5C per bin.\n",
1919 "* Natural Breaks (Fischer Jenks)\n",
20 " - This algorithm tries to split the rows into naturaly occurring clusters. The numbers per bin will depend on how the observations are located on the interval."
20 " - This algorithm tries to split the rows into naturally occurring clusters. The numbers per bin will depend on how the observations are located on the interval."
2121 ]
2222 },
2323 {
88 "\n",
99 "This example shows how to create a ``GeoDataFrame`` when starting from\n",
1010 "a *regular* ``DataFrame`` that has coordinates either WKT\n",
11 "([well-known text](https://en.wikipedia.org/wiki/Well-known_text>))\n",
11 "([well-known text](https://en.wikipedia.org/wiki/Well-known_text))\n",
1212 "format, or in\n",
1313 "two columns.\n"
1414 ]
224224 },
225225 "nbformat": 4,
226226 "nbformat_minor": 4
227 }
227 }
0 {
1 "cells": [
2 {
3 "cell_type": "markdown",
4 "metadata": {},
5 "source": [
6 "\n",
7 "# Using GeoPandas with Rasterio to sample point data\n",
8 "\n",
9 "This example shows how to use GeoPandas with Rasterio. [Rasterio](https://rasterio.readthedocs.io/en/latest/index.html) is a package for reading and writing raster data.\n",
10 "\n",
11 "In this example a set of vector points is used to sample raster data at those points.\n",
12 "\n",
13 "The raster data used is Copernicus Sentinel data 2018 for Sentinel data.\n"
14 ]
15 },
16 {
17 "cell_type": "code",
18 "execution_count": null,
19 "metadata": {},
20 "outputs": [],
21 "source": [
22 "import geopandas\n",
23 "import rasterio\n",
24 "import matplotlib.pyplot as plt\n",
25 "from shapely.geometry import Point"
26 ]
27 },
28 {
29 "cell_type": "markdown",
30 "metadata": {},
31 "source": [
32 "Create example vector data\n",
33 "=============================\n",
34 "\n",
35 "Generate a geodataframe from a set of points\n"
36 ]
37 },
38 {
39 "cell_type": "code",
40 "execution_count": null,
41 "metadata": {},
42 "outputs": [],
43 "source": [
44 "# Create sampling points\n",
45 "points = [Point(625466, 5621289), Point(626082, 5621627), Point(627116, 5621680), Point(625095, 5622358)]\n",
46 "gdf = geopandas.GeoDataFrame([1, 2, 3, 4], geometry=points, crs=32630)"
47 ]
48 },
49 {
50 "cell_type": "markdown",
51 "metadata": {},
52 "source": [
53 "The ``GeoDataFrame`` looks like this:"
54 ]
55 },
56 {
57 "cell_type": "code",
58 "execution_count": null,
59 "metadata": {},
60 "outputs": [],
61 "source": [
62 "gdf.head()"
63 ]
64 },
65 {
66 "cell_type": "markdown",
67 "metadata": {},
68 "source": [
69 "Open the raster data\n",
70 "=============================\n",
71 "\n",
72 "Use ``rasterio`` to open the raster data to be sampled"
73 ]
74 },
75 {
76 "cell_type": "code",
77 "execution_count": null,
78 "metadata": {},
79 "outputs": [],
80 "source": [
81 "src = rasterio.open('s2a_l2a_fishbourne.tif')"
82 ]
83 },
84 {
85 "cell_type": "markdown",
86 "metadata": {},
87 "source": [
88 "Let's see the raster data with the point data overlaid.\n",
89 "\n"
90 ]
91 },
92 {
93 "cell_type": "code",
94 "execution_count": null,
95 "metadata": {
96 "tags": [
97 "nbsphinx-thumbnail"
98 ]
99 },
100 "outputs": [],
101 "source": [
102 "from rasterio.plot import show\n",
103 "\n",
104 "fig, ax = plt.subplots()\n",
105 "\n",
106 "# transform rasterio plot to real world coords\n",
107 "extent=[src.bounds[0], src.bounds[2], src.bounds[1], src.bounds[3]]\n",
108 "ax = rasterio.plot.show(src, extent=extent, ax=ax, cmap='pink')\n",
109 "\n",
110 "gdf.plot(ax=ax)"
111 ]
112 },
113 {
114 "cell_type": "markdown",
115 "metadata": {},
116 "source": [
117 "Sampling the data\n",
118 "===============\n",
119 "Rasterio requires a list of the coordinates in x,y format rather than as the points that are in the geomentry column.\n",
120 "\n",
121 "This can be achieved using the code below"
122 ]
123 },
124 {
125 "cell_type": "code",
126 "execution_count": null,
127 "metadata": {},
128 "outputs": [],
129 "source": [
130 "coord_list = [(x,y) for x,y in zip(gdf['geometry'].x , gdf['geometry'].y)]"
131 ]
132 },
133 {
134 "cell_type": "markdown",
135 "metadata": {},
136 "source": [
137 "Carry out the sampling of the data and store the results in a new column called `value`. Note that if the image has more than one band, a value is returned for each band."
138 ]
139 },
140 {
141 "cell_type": "code",
142 "execution_count": null,
143 "metadata": {},
144 "outputs": [],
145 "source": [
146 "gdf['value'] = [x for x in src.sample(coord_list)]\n",
147 "gdf.head()"
148 ]
149 }
150 ],
151 "metadata": {
152 "kernelspec": {
153 "display_name": "Python 3 (ipykernel)",
154 "language": "python",
155 "name": "python3"
156 },
157 "language_info": {
158 "codemirror_mode": {
159 "name": "ipython",
160 "version": 3
161 },
162 "file_extension": ".py",
163 "mimetype": "text/x-python",
164 "name": "python",
165 "nbconvert_exporter": "python",
166 "pygments_lexer": "ipython3",
167 "version": "3.9.2"
168 }
169 },
170 "nbformat": 4,
171 "nbformat_minor": 4
172 }
0 {
1 "cells": [
2 {
3 "cell_type": "markdown",
4 "metadata": {},
5 "source": [
6 "# Adding a scale bar to a matplotlib plot\n",
7 "When making a geospatial plot in matplotlib, you can use [maplotlib-scalebar library](https://pypi.org/project/matplotlib-scalebar/) to add a scale bar."
8 ]
9 },
10 {
11 "cell_type": "code",
12 "execution_count": null,
13 "metadata": {},
14 "outputs": [],
15 "source": [
16 "import geopandas as gpd\n",
17 "from matplotlib_scalebar.scalebar import ScaleBar"
18 ]
19 },
20 {
21 "cell_type": "markdown",
22 "metadata": {},
23 "source": [
24 "## Creating a ScaleBar object\n",
25 "The only required parameter for creating a ScaleBar object is `dx`. This is equal to a size of one pixel in real world. Value of this parameter depends on units of your CRS.\n",
26 "\n",
27 "### Projected coordinate system (meters)\n",
28 "The easiest way to add a scale bar is using a projected coordinate system with meters as units. Just set `dx = 1`:"
29 ]
30 },
31 {
32 "cell_type": "code",
33 "execution_count": null,
34 "metadata": {
35 "tags": [
36 "nbsphinx-thumbnail"
37 ]
38 },
39 "outputs": [],
40 "source": [
41 "nybb = gpd.read_file(gpd.datasets.get_path('nybb'))\n",
42 "nybb = nybb.to_crs(32619) # Convert the dataset to a coordinate\n",
43 "# system which uses meters\n",
44 "\n",
45 "ax = nybb.plot()\n",
46 "ax.add_artist(ScaleBar(1))"
47 ]
48 },
49 {
50 "cell_type": "markdown",
51 "metadata": {},
52 "source": [
53 "### Geographic coordinate system (degrees)\n",
54 "With a geographic coordinate system with degrees as units, `dx` should be equal to a distance in meters of two points with the same latitude (Y coordinate) which are one full degree of longitude (X) apart. You can calculate this distance by online calculator [(e.g. the Great Circle calculator)](http://edwilliams.org/gccalc.htm) or in geopandas.\\\n",
55 "\\\n",
56 "Firstly, we will create a GeoSeries with two points that have roughly the coordinates of NYC. They are located on the same latitude but one degree of longitude from each other. Their initial coordinates are specified in a geographic coordinate system (geographic WGS 84). They are then converted to a projected system for the calculation:"
57 ]
58 },
59 {
60 "cell_type": "code",
61 "execution_count": null,
62 "metadata": {},
63 "outputs": [],
64 "source": [
65 "from shapely.geometry.point import Point\n",
66 "\n",
67 "points = gpd.GeoSeries([Point(-73.5, 40.5), Point(-74.5, 40.5)], crs=4326) # Geographic WGS 84 - degrees\n",
68 "points = points.to_crs(32619) # Projected WGS 84 - meters"
69 ]
70 },
71 {
72 "cell_type": "markdown",
73 "metadata": {},
74 "source": [
75 "After the conversion, we can calculate the distance between the points. The result slightly differs from the Great Circle Calculator but the difference is insignificant (84,921 and 84,767 meters):"
76 ]
77 },
78 {
79 "cell_type": "code",
80 "execution_count": null,
81 "metadata": {},
82 "outputs": [],
83 "source": [
84 "distance_meters = points[0].distance(points[1])"
85 ]
86 },
87 {
88 "cell_type": "markdown",
89 "metadata": {},
90 "source": [
91 "Finally, we are able to use geographic coordinate system in our plot. We set value of `dx` parameter to a distance we just calculated:"
92 ]
93 },
94 {
95 "cell_type": "code",
96 "execution_count": null,
97 "metadata": {
98 "scrolled": true
99 },
100 "outputs": [],
101 "source": [
102 "nybb = gpd.read_file(gpd.datasets.get_path('nybb'))\n",
103 "nybb = nybb.to_crs(4326) # Using geographic WGS 84\n",
104 "\n",
105 "ax = nybb.plot()\n",
106 "ax.add_artist(ScaleBar(distance_meters))"
107 ]
108 },
109 {
110 "cell_type": "markdown",
111 "metadata": {},
112 "source": [
113 "## Using other units \n",
114 "The default unit for `dx` is m (meter). You can change this unit by the `units` and `dimension` parameters. There is a list of some possible `units` for various values of `dimension` below:\n",
115 "\n",
116 "| dimension | units |\n",
117 "| ----- |:-----:|\n",
118 "| si-length | km, m, cm, um|\n",
119 "| imperial-length |in, ft, yd, mi|\n",
120 "|si-length-reciprocal|1/m, 1/cm|\n",
121 "|angle|deg|\n",
122 "\n",
123 "In the following example, we will leave the dataset in its initial CRS which uses feet as units. The plot shows scale of 2 leagues (approximately 11 kilometers):"
124 ]
125 },
126 {
127 "cell_type": "code",
128 "execution_count": null,
129 "metadata": {},
130 "outputs": [],
131 "source": [
132 "nybb = gpd.read_file(gpd.datasets.get_path('nybb'))\n",
133 "\n",
134 "ax = nybb.plot()\n",
135 "ax.add_artist(ScaleBar(1, dimension=\"imperial-length\", units=\"ft\"))"
136 ]
137 },
138 {
139 "cell_type": "markdown",
140 "metadata": {},
141 "source": [
142 "## Customization of the scale bar"
143 ]
144 },
145 {
146 "cell_type": "code",
147 "execution_count": null,
148 "metadata": {
149 "scrolled": true
150 },
151 "outputs": [],
152 "source": [
153 "nybb = gpd.read_file(gpd.datasets.get_path('nybb')).to_crs(32619)\n",
154 "ax = nybb.plot()\n",
155 "\n",
156 "# Position and layout\n",
157 "scale1 = ScaleBar(\n",
158 "dx=1, label='Scale 1',\n",
159 " location='upper left', # in relation to the whole plot\n",
160 " label_loc='left', scale_loc='bottom' # in relation to the line\n",
161 ")\n",
162 "\n",
163 "# Color\n",
164 "scale2 = ScaleBar(\n",
165 " dx=1, label='Scale 2', location='center', \n",
166 " color='#b32400', box_color='yellow',\n",
167 " box_alpha=0.8 # Slightly transparent box\n",
168 ")\n",
169 "\n",
170 "# Font and text formatting\n",
171 "scale3 = ScaleBar(\n",
172 " dx=1, label='Scale 3',\n",
173 " font_properties={'family':'serif', 'size': 'large'}, # For more information, see the cell below\n",
174 " scale_formatter=lambda value, unit: f'> {value} {unit} <'\n",
175 ")\n",
176 "\n",
177 "ax.add_artist(scale1)\n",
178 "ax.add_artist(scale2)\n",
179 "ax.add_artist(scale3)"
180 ]
181 },
182 {
183 "cell_type": "markdown",
184 "metadata": {},
185 "source": [
186 "*Note:* Font is specified by six properties: `family`, `style`, `variant`, `stretch`, `weight`, `size` (and `math_fontfamily`). See [more](https://matplotlib.org/stable/api/font_manager_api.html#matplotlib.font_manager.FontProperties).\\\n",
187 "\\\n",
188 "For more information about matplotlib-scalebar library, see the [PyPI](https://pypi.org/project/matplotlib-scalebar/) or [GitHub](https://github.com/ppinard/matplotlib-scalebar) page."
189 ]
190 }
191 ],
192 "metadata": {
193 "interpreter": {
194 "hash": "9914e2881520d4f08a067c2c2c181121476026b863eca2e121cd0758701ab602"
195 },
196 "kernelspec": {
197 "display_name": "Python 3",
198 "language": "python",
199 "name": "python3"
200 },
201 "language_info": {
202 "codemirror_mode": {
203 "name": "ipython",
204 "version": 3
205 },
206 "file_extension": ".py",
207 "mimetype": "text/x-python",
208 "name": "python",
209 "nbconvert_exporter": "python",
210 "pygments_lexer": "ipython3",
211 "version": "3.9.2"
212 }
213 },
214 "nbformat": 4,
215 "nbformat_minor": 4
216 }
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "metadata": {},
54 "source": [
65 "# Overlays\n",
76 "\n",
1514 "not on individual geometries, and the properties from both are retained\n",
1615 "\n",
1716 "![illustration](http://docs.qgis.org/testing/en/_images/overlay_operations.png)"
18 ]
19 },
20 {
21 "cell_type": "markdown",
22 "metadata": {},
17 ],
18 "metadata": {}
19 },
20 {
21 "cell_type": "markdown",
2322 "source": [
2423 "Now we can load up two GeoDataFrames containing (multi)polygon geometries..."
25 ]
26 },
27 {
28 "cell_type": "code",
29 "execution_count": null,
30 "metadata": {},
31 "outputs": [],
24 ],
25 "metadata": {}
26 },
27 {
28 "cell_type": "code",
29 "execution_count": null,
3230 "source": [
3331 "%matplotlib inline\n",
3432 "from shapely.geometry import Point\n",
4644 " {'geometry': Point(x, y).buffer(10000), 'value1': x + y, 'value2': x - y}\n",
4745 " for x, y in zip(range(b[0], b[2], int((b[2] - b[0]) / N)),\n",
4846 " range(b[1], b[3], int((b[3] - b[1]) / N)))])"
49 ]
50 },
51 {
52 "cell_type": "markdown",
53 "metadata": {},
47 ],
48 "outputs": [],
49 "metadata": {}
50 },
51 {
52 "cell_type": "markdown",
5453 "source": [
5554 "The first dataframe contains multipolygons of the NYC boros"
56 ]
57 },
58 {
59 "cell_type": "code",
60 "execution_count": null,
61 "metadata": {},
62 "outputs": [],
55 ],
56 "metadata": {}
57 },
58 {
59 "cell_type": "code",
60 "execution_count": null,
6361 "source": [
6462 "polydf.plot()"
65 ]
66 },
67 {
68 "cell_type": "markdown",
69 "metadata": {},
63 ],
64 "outputs": [],
65 "metadata": {}
66 },
67 {
68 "cell_type": "markdown",
7069 "source": [
7170 "And the second GeoDataFrame is a sequentially generated set of circles in the same geographic space. We'll plot these with a [different color palette](https://matplotlib.org/examples/color/colormaps_reference.html)."
72 ]
73 },
74 {
75 "cell_type": "code",
76 "execution_count": null,
77 "metadata": {},
78 "outputs": [],
71 ],
72 "metadata": {}
73 },
74 {
75 "cell_type": "code",
76 "execution_count": null,
7977 "source": [
8078 "polydf2.plot(cmap='tab20b')"
81 ]
82 },
83 {
84 "cell_type": "markdown",
85 "metadata": {},
79 ],
80 "outputs": [],
81 "metadata": {}
82 },
83 {
84 "cell_type": "markdown",
8685 "source": [
8786 "The `geopandas.tools.overlay` function takes three arguments:\n",
8887 "\n",
9897 " 'symmetric_difference',\n",
9998 " 'difference']\n",
10099 "\n",
101 "So let's identify the areas (and attributes) where both dataframes intersect using the `overlay` tool. "
102 ]
103 },
104 {
105 "cell_type": "code",
106 "execution_count": null,
107 "metadata": {},
108 "outputs": [],
109 "source": [
110 "from geopandas.tools import overlay\n",
111 "newdf = overlay(polydf, polydf2, how=\"intersection\")\n",
112 "newdf.plot(cmap='tab20b')"
113 ]
114 },
115 {
116 "cell_type": "markdown",
117 "metadata": {},
100 "So let's identify the areas (and attributes) where both dataframes intersect using the `overlay` method. "
101 ],
102 "metadata": {}
103 },
104 {
105 "cell_type": "code",
106 "execution_count": null,
107 "source": [
108 "newdf = polydf.overlay(polydf2, how=\"intersection\")\n",
109 "newdf.plot(cmap='tab20b')"
110 ],
111 "outputs": [],
112 "metadata": {}
113 },
114 {
115 "cell_type": "markdown",
118116 "source": [
119117 "And take a look at the attributes; we see that the attributes from both of the original GeoDataFrames are retained. "
120 ]
121 },
122 {
123 "cell_type": "code",
124 "execution_count": null,
125 "metadata": {},
126 "outputs": [],
118 ],
119 "metadata": {}
120 },
121 {
122 "cell_type": "code",
123 "execution_count": null,
127124 "source": [
128125 "polydf.head()"
129 ]
130 },
131 {
132 "cell_type": "code",
133 "execution_count": null,
134 "metadata": {},
135 "outputs": [],
126 ],
127 "outputs": [],
128 "metadata": {}
129 },
130 {
131 "cell_type": "code",
132 "execution_count": null,
136133 "source": [
137134 "polydf2.head()"
138 ]
139 },
140 {
141 "cell_type": "code",
142 "execution_count": null,
143 "metadata": {},
144 "outputs": [],
135 ],
136 "outputs": [],
137 "metadata": {}
138 },
139 {
140 "cell_type": "code",
141 "execution_count": null,
145142 "source": [
146143 "newdf.head()"
147 ]
148 },
149 {
150 "cell_type": "markdown",
151 "metadata": {},
144 ],
145 "outputs": [],
146 "metadata": {}
147 },
148 {
149 "cell_type": "markdown",
152150 "source": [
153151 "Now let's look at the other `how` operations:"
154 ]
155 },
156 {
157 "cell_type": "code",
158 "execution_count": null,
159 "metadata": {},
160 "outputs": [],
161 "source": [
162 "newdf = overlay(polydf, polydf2, how=\"union\")\n",
163 "newdf.plot(cmap='tab20b')"
164 ]
165 },
166 {
167 "cell_type": "code",
168 "execution_count": null,
169 "metadata": {},
170 "outputs": [],
171 "source": [
172 "newdf = overlay(polydf, polydf2, how=\"identity\")\n",
173 "newdf.plot(cmap='tab20b')"
174 ]
175 },
176 {
177 "cell_type": "code",
178 "execution_count": null,
152 ],
153 "metadata": {}
154 },
155 {
156 "cell_type": "code",
157 "execution_count": null,
158 "source": [
159 "newdf = polydf.overlay(polydf2, how=\"union\")\n",
160 "newdf.plot(cmap='tab20b')"
161 ],
162 "outputs": [],
163 "metadata": {}
164 },
165 {
166 "cell_type": "code",
167 "execution_count": null,
168 "source": [
169 "newdf = polydf.overlay(polydf2, how=\"identity\")\n",
170 "newdf.plot(cmap='tab20b')"
171 ],
172 "outputs": [],
173 "metadata": {}
174 },
175 {
176 "cell_type": "code",
177 "execution_count": null,
178 "source": [
179 "newdf = polydf.overlay(polydf2, how=\"symmetric_difference\")\n",
180 "newdf.plot(cmap='tab20b')"
181 ],
182 "outputs": [],
179183 "metadata": {
180184 "tags": [
181185 "nbsphinx-thumbnail"
182186 ]
183 },
184 "outputs": [],
185 "source": [
186 "newdf = overlay(polydf, polydf2, how=\"symmetric_difference\")\n",
187 "newdf.plot(cmap='tab20b')"
188 ]
189 },
190 {
191 "cell_type": "code",
192 "execution_count": null,
193 "metadata": {},
194 "outputs": [],
195 "source": [
196 "newdf = overlay(polydf, polydf2, how=\"difference\")\n",
197 "newdf.plot(cmap='tab20b')"
198 ]
187 }
188 },
189 {
190 "cell_type": "code",
191 "execution_count": null,
192 "source": [
193 "newdf = polydf.overlay(polydf2, how=\"difference\")\n",
194 "newdf.plot(cmap='tab20b')"
195 ],
196 "outputs": [],
197 "metadata": {}
199198 }
200199 ],
201200 "metadata": {
202201 "kernelspec": {
203 "display_name": "Python 3",
202 "display_name": "Python 3 (ipykernel)",
204203 "language": "python",
205204 "name": "python3"
206205 },
214213 "name": "python",
215214 "nbconvert_exporter": "python",
216215 "pygments_lexer": "ipython3",
217 "version": "3.9.1"
216 "version": "3.9.2"
218217 }
219218 },
220219 "nbformat": 4,
221220 "nbformat_minor": 4
222 }
221 }
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "metadata": {},
54 "source": [
65 "# Clip Vector Data with GeoPandas\n",
76 "\n",
87 "\n",
98 "Learn how to clip geometries to the boundary of a polygon geometry\n",
109 "using GeoPandas."
11 ]
12 },
13 {
14 "cell_type": "markdown",
15 "metadata": {},
10 ],
11 "metadata": {}
12 },
13 {
14 "cell_type": "markdown",
1615 "source": [
1716 "The example below shows you how to clip a set of vector geometries\n",
1817 "to the spatial extent / shape of another vector object. Both sets of geometries\n",
3231 "be clipped to the total boundary of all polygons in clip object.\n",
3332 "</div>\n",
3433 "\n"
35 ]
36 },
37 {
38 "cell_type": "markdown",
39 "metadata": {},
34 ],
35 "metadata": {}
36 },
37 {
38 "cell_type": "markdown",
4039 "source": [
4140 "Import Packages\n",
4241 "---------------\n",
4342 "\n",
4443 "To begin, import the needed packages.\n",
4544 "\n"
46 ]
47 },
48 {
49 "cell_type": "code",
50 "execution_count": null,
51 "metadata": {},
52 "outputs": [],
45 ],
46 "metadata": {}
47 },
48 {
49 "cell_type": "code",
50 "execution_count": null,
5351 "source": [
5452 "import matplotlib.pyplot as plt\n",
5553 "import geopandas\n",
5654 "from shapely.geometry import Polygon"
57 ]
58 },
59 {
60 "cell_type": "markdown",
61 "metadata": {},
55 ],
56 "outputs": [],
57 "metadata": {}
58 },
59 {
60 "cell_type": "markdown",
6261 "source": [
6362 "Get or Create Example Data\n",
6463 "--------------------------\n",
6766 "Additionally, a polygon is created with shapely and then converted into a\n",
6867 "GeoDataFrame with the same CRS as the GeoPandas world dataset.\n",
6968 "\n"
70 ]
71 },
72 {
73 "cell_type": "code",
74 "execution_count": null,
75 "metadata": {},
76 "outputs": [],
69 ],
70 "metadata": {}
71 },
72 {
73 "cell_type": "code",
74 "execution_count": null,
7775 "source": [
7876 "capitals = geopandas.read_file(geopandas.datasets.get_path(\"naturalearth_cities\"))\n",
7977 "world = geopandas.read_file(geopandas.datasets.get_path(\"naturalearth_lowres\"))\n",
8482 "# Create a custom polygon\n",
8583 "polygon = Polygon([(0, 0), (0, 90), (180, 90), (180, 0), (0, 0)])\n",
8684 "poly_gdf = geopandas.GeoDataFrame([1], geometry=[polygon], crs=world.crs)"
87 ]
88 },
89 {
90 "cell_type": "markdown",
91 "metadata": {},
85 ],
86 "outputs": [],
87 "metadata": {}
88 },
89 {
90 "cell_type": "markdown",
9291 "source": [
9392 "Plot the Unclipped Data\n",
9493 "-----------------------\n",
9594 "\n"
96 ]
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
101 "metadata": {},
102 "outputs": [],
95 ],
96 "metadata": {}
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
103101 "source": [
104102 "fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))\n",
105103 "world.plot(ax=ax1)\n",
111109 "ax1.set_axis_off()\n",
112110 "ax2.set_axis_off()\n",
113111 "plt.show()"
114 ]
115 },
116 {
117 "cell_type": "markdown",
118 "metadata": {},
112 ],
113 "outputs": [],
114 "metadata": {}
115 },
116 {
117 "cell_type": "markdown",
119118 "source": [
120119 "Clip the Data\n",
121120 "--------------\n",
122121 "\n",
123 "When you call `clip`, the first object called is the object that will\n",
124 "be clipped. The second object called is the clip extent. The returned output\n",
122 "The object on which you call `clip` is the object that will\n",
123 "be clipped. The object you pass is the clip extent. The returned output\n",
125124 "will be a new clipped GeoDataframe. All of the attributes for each returned\n",
126125 "geometry will be retained when you clip.\n",
127126 "\n",
130129 "Note\n",
131130 "\n",
132131 "Recall that the data must be in the same CRS in order to use the\n",
133 "`clip` function. If the data are not in the same CRS, be sure to use\n",
132 "`clip` method. If the data are not in the same CRS, be sure to use\n",
134133 "the GeoPandas `GeoDataFrame.to_crs` method to ensure both datasets\n",
135134 "are in the same CRS.\n",
136135 "</div>\n",
137136 "\n"
138 ]
139 },
140 {
141 "cell_type": "markdown",
142 "metadata": {},
137 ],
138 "metadata": {}
139 },
140 {
141 "cell_type": "markdown",
143142 "source": [
144143 "Clip the World Data\n",
145144 "--------------------\n",
146145 "\n"
147 ]
148 },
149 {
150 "cell_type": "code",
151 "execution_count": null,
152 "metadata": {
153 "tags": [
154 "nbsphinx-thumbnail"
155 ]
156 },
157 "outputs": [],
158 "source": [
159 "world_clipped = geopandas.clip(world, polygon)\n",
146 ],
147 "metadata": {}
148 },
149 {
150 "cell_type": "code",
151 "execution_count": null,
152 "source": [
153 "world_clipped = world.clip(polygon)\n",
160154 "\n",
161155 "# Plot the clipped data\n",
162156 "# The plot below shows the results of the clip function applied to the world\n",
168162 "ax.set_title(\"World Clipped\", fontsize=20)\n",
169163 "ax.set_axis_off()\n",
170164 "plt.show()"
171 ]
172 },
173 {
174 "cell_type": "markdown",
175 "metadata": {},
176 "source": [
165 ],
166 "outputs": [],
167 "metadata": {
168 "tags": [
169 "nbsphinx-thumbnail"
170 ]
171 }
172 },
173 {
174 "cell_type": "markdown",
175 "source": [
176 "<div class=\"alert alert-info\">\n",
177 " \n",
178 "Note\n",
179 "\n",
180 "For historical reasons, the clip method is also available as a top-level function `geopandas.clip`.\n",
181 "It is recommended to use the method as the function may be deprecated in the future.\n",
182 "</div>\n",
183 "\n",
177184 "Clip the Capitals Data\n",
178185 "----------------------\n",
179186 "\n"
180 ]
181 },
182 {
183 "cell_type": "code",
184 "execution_count": null,
185 "metadata": {},
186 "outputs": [],
187 "source": [
188 "capitals_clipped = geopandas.clip(capitals, south_america)\n",
187 ],
188 "metadata": {}
189 },
190 {
191 "cell_type": "code",
192 "execution_count": null,
193 "source": [
194 "capitals_clipped = capitals.clip(south_america)\n",
189195 "\n",
190196 "# Plot the clipped data\n",
191197 "# The plot below shows the results of the clip function applied to the capital cities\n",
195201 "ax.set_title(\"Capitals Clipped\", fontsize=20)\n",
196202 "ax.set_axis_off()\n",
197203 "plt.show()"
198 ]
204 ],
205 "outputs": [],
206 "metadata": {}
199207 }
200208 ],
201209 "metadata": {
219227 },
220228 "nbformat": 4,
221229 "nbformat_minor": 4
222 }
230 }
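Condensed, the clip workflow this notebook walks through is only a few calls. A minimal sketch, assuming the bundled `naturalearth_lowres` dataset; the explicit CRS check mirrors the note in the notebook:

```python
import geopandas
from shapely.geometry import Polygon

world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

# Clip mask covering the north-eastern quarter of the globe
polygon = Polygon([(0, 0), (0, 90), (180, 90), (180, 0), (0, 0)])
mask = geopandas.GeoDataFrame(geometry=[polygon], crs=world.crs)

# Both layers must share a CRS; reproject the mask if they differ
if mask.crs != world.crs:
    mask = mask.to_crs(world.crs)

# The caller is clipped; the argument is the clip extent
world_clipped = world.clip(mask)
```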
99 "This example shows how you can add a background basemap to plots created\n",
1010 "with the geopandas ``.plot()`` method. This makes use of the\n",
1111 "[contextily](https://github.com/geopandas/contextily) package to retrieve\n",
12 "web map tiles from several sources (OpenStreetMap, Stamen).\n"
13 ]
14 },
15 {
16 "cell_type": "code",
17 "execution_count": null,
18 "metadata": {},
19 "outputs": [],
20 "source": [
21 "import geopandas"
12 "web map tiles from several sources (OpenStreetMap, Stamen). Also have a\n",
13 "look at contextily's \n",
14 "[introduction guide](https://contextily.readthedocs.io/en/latest/intro_guide.html#Using-transparent-layers)\n",
15 "for possible new features not covered here.\n"
16 ]
17 },
18 {
19 "cell_type": "code",
20 "execution_count": null,
21 "metadata": {},
22 "outputs": [],
23 "source": [
24 "import geopandas\n",
25 "import contextily as cx"
2226 ]
2327 },
2428 {
4448 "cell_type": "markdown",
4549 "metadata": {},
4650 "source": [
47 "Convert the data to Web Mercator\n",
48 "================================\n",
49 "\n",
51 "## Matching coordinate systems \n",
52 "\n",
53 "\n",
54 "Before adding web map tiles to this plot, we first need to ensure the\n",
55 "coordinate reference systems (CRS) of the tiles and the data match.\n",
5056 "Web map tiles are typically provided in\n",
5157 "[Web Mercator](https://en.wikipedia.org/wiki/Web_Mercator>)\n",
52 "([EPSG 3857](https://epsg.io/3857)), so we need to make sure to convert\n",
53 "our data first to the same CRS to combine our polygons and background tiles\n",
54 "in the same map:\n",
55 "\n"
56 ]
57 },
58 {
59 "cell_type": "code",
60 "execution_count": null,
61 "metadata": {},
62 "outputs": [],
63 "source": [
64 "df = df.to_crs(epsg=3857)"
65 ]
66 },
67 {
68 "cell_type": "code",
69 "execution_count": null,
70 "metadata": {},
71 "outputs": [],
72 "source": [
73 "import contextily as ctx"
74 ]
75 },
76 {
77 "cell_type": "markdown",
78 "metadata": {},
79 "source": [
80 "Add background tiles to plot\n",
81 "============================\n",
82 "\n",
83 "We can use `add_basemap` function of contextily to easily add a background\n",
84 "map to our plot. :\n",
85 "\n"
58 "([EPSG 3857](https://epsg.io/3857)), so let us first check what\n",
59 "CRS our NYC boroughs are in:"
60 ]
61 },
62 {
63 "cell_type": "code",
64 "execution_count": null,
65 "metadata": {},
66 "outputs": [],
67 "source": [
68 "df.crs"
69 ]
70 },
71 {
72 "cell_type": "markdown",
73 "metadata": {},
74 "source": [
75 "Now we know the CRS do not match, so we need to choose in which\n",
76 "CRS we wish to visualize the data: either the CRS of the tiles,\n",
77 "the one of the data, or even a different one.\n",
78 "\n",
79 "The first option to match CRS is to leverage the `to_crs` method\n",
80 "of GeoDataFrames to convert the CRS of our data, here to Web Mercator:"
81 ]
82 },
83 {
84 "cell_type": "code",
85 "execution_count": null,
86 "metadata": {},
87 "outputs": [],
88 "source": [
89 "df_wm = df.to_crs(epsg=3857)"
90 ]
91 },
92 {
93 "cell_type": "markdown",
94 "metadata": {},
95 "source": [
96 "We can then use `add_basemap` function of contextily to easily add a\n",
97 "background map to our plot:"
8698 ]
8799 },
88100 {
95107 },
96108 "outputs": [],
97109 "source": [
110 "ax = df_wm.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
111 "cx.add_basemap(ax)"
112 ]
113 },
114 {
115 "cell_type": "markdown",
116 "metadata": {},
117 "source": [
118 "If we want to convert the CRS of the tiles instead, which might be advisable\n",
119 "for large datasets, we can use the `crs` keyword argument of `add_basemap`\n",
120 "as follows:"
121 ]
122 },
123 {
124 "cell_type": "code",
125 "execution_count": null,
126 "metadata": {},
127 "outputs": [],
128 "source": [
98129 "ax = df.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
99 "ctx.add_basemap(ax)"
130 "cx.add_basemap(ax, crs=df.crs)"
131 ]
132 },
133 {
134 "cell_type": "markdown",
135 "metadata": {},
136 "source": [
137 "This reprojects map tiles to a target CRS which may in some cases cause a\n",
138 "loss of sharpness. See \n",
139 "[contextily's guide on warping tiles](https://contextily.readthedocs.io/en/latest/warping_guide.html)\n",
140 "for more information on the subject."
141 ]
142 },
143 {
144 "cell_type": "markdown",
145 "metadata": {},
146 "source": [
147 "## Controlling the level of detail"
100148 ]
101149 },
102150 {
115163 "metadata": {},
116164 "outputs": [],
117165 "source": [
118 "ax = df.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
119 "ctx.add_basemap(ax, zoom=12)"
166 "ax = df_wm.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
167 "cx.add_basemap(ax, zoom=12)"
168 ]
169 },
170 {
171 "cell_type": "markdown",
172 "metadata": {},
173 "source": [
174 "## Choosing a different style"
120175 ]
121176 },
122177 {
124179 "metadata": {},
125180 "source": [
126181 "By default, contextily uses the Stamen Terrain style. We can specify a\n",
127 "different style using ``ctx.providers``:\n",
182 "different style using ``cx.providers``:\n",
128183 "\n"
129184 ]
130185 },
134189 "metadata": {},
135190 "outputs": [],
136191 "source": [
137 "ax = df.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
138 "ctx.add_basemap(ax, url=ctx.providers.Stamen.TonerLite)\n",
192 "ax = df_wm.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
193 "cx.add_basemap(ax, source=cx.providers.Stamen.TonerLite)\n",
139194 "ax.set_axis_off()"
140195 ]
141196 },
142197 {
143 "cell_type": "code",
144 "execution_count": null,
145 "metadata": {},
146 "outputs": [],
147 "source": []
198 "cell_type": "markdown",
199 "metadata": {},
200 "source": [
201 "## Adding labels as an overlay"
202 ]
203 },
204 {
205 "cell_type": "markdown",
206 "metadata": {},
207 "source": [
208 "Sometimes, when you plot data on a basemap, the data will obscure some important map elements, such as labels,\n",
209 "that you would otherwise want to see unobscured. Some map tile providers offer multiple sets of partially\n",
210 "transparent tiles to solve this, and `contextily` will do its best to auto-detect these transparent layers\n",
211 "and put them on top."
212 ]
213 },
214 {
215 "cell_type": "code",
216 "execution_count": null,
217 "metadata": {},
218 "outputs": [],
219 "source": [
220 "ax = df_wm.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
221 "cx.add_basemap(ax, source=cx.providers.Stamen.TonerLite)\n",
222 "cx.add_basemap(ax, source=cx.providers.Stamen.TonerLabels)"
223 ]
224 },
225 {
226 "cell_type": "markdown",
227 "metadata": {},
228 "source": [
229 "By splitting the layers like this, you can also independently manipulate the level of zoom on each layer,\n",
230 "for example to make labels larger while still showing a lot of detail."
231 ]
232 },
233 {
234 "cell_type": "code",
235 "execution_count": null,
236 "metadata": {},
237 "outputs": [],
238 "source": [
239 "ax = df_wm.plot(figsize=(10, 10), alpha=0.5, edgecolor='k')\n",
240 "cx.add_basemap(ax, source=cx.providers.Stamen.Watercolor, zoom=12)\n",
241 "cx.add_basemap(ax, source=cx.providers.Stamen.TonerLabels, zoom=10)"
242 ]
148243 }
149244 ],
150245 "metadata": {
163258 "name": "python",
164259 "nbconvert_exporter": "python",
165260 "pygments_lexer": "ipython3",
166 "version": "3.7.6"
261 "version": "3.9.1"
167262 }
168263 },
169264 "nbformat": 4,
33 "cell_type": "markdown",
44 "metadata": {},
55 "source": [
6 "# Plotting with folium\n",
6 "# Plotting with Folium\n",
77 "\n",
88 "__What is Folium?__\n",
99 "\n",
10 "It builds on the data wrangling and a Python wrapper for leaflet.js. It makes it easy to visualize data in Python with minimal instructions.\n",
11 "\n",
12 "Folium expands on the data wrangling properties utilized in Python language and the mapping characteristics of the Leaflet.js library. Folium enables us to make an intuitive map and are is visualized in a Leaflet map after manipulating data in Python. Folium results are intuitive which makes this library helpful for dashboard building and easier to work with.\n",
13 "\n",
14 "Let's see the implementation of both GeoPandas and Folium:"
15 ]
16 },
17 {
18 "cell_type": "code",
19 "execution_count": null,
20 "metadata": {},
21 "outputs": [],
22 "source": [
23 "# Importing Libraries\n",
10 "[Folium](https://python-visualization.github.io/folium/) builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the leaflet.js library. This allows you to manipulate your data in Geopandas and visualize it on a Leaflet map via Folium.\n",
11 "\n",
12 "In this example, we will first use Geopandas to load the geometries (volcano point data), and then create the Folium map with markers representing the different types of volcanoes."
13 ]
14 },
15 {
16 "cell_type": "markdown",
17 "metadata": {},
18 "source": [
19 "## Load geometries\n",
20 "This example uses a freely available [volcano dataset](https://www.kaggle.com/texasdave/volcano-eruptions). We will be reading the csv file using pandas, and then convert the pandas `DataFrame` to a Geopandas `GeoDataFrame`."
21 ]
22 },
23 {
24 "cell_type": "code",
25 "execution_count": null,
26 "metadata": {},
27 "outputs": [],
28 "source": [
29 "# Import Libraries\n",
2430 "import pandas as pd\n",
2531 "import geopandas\n",
2632 "import folium\n",
27 "import matplotlib.pyplot as plt\n",
28 "\n",
29 "from shapely.geometry import Point"
33 "import matplotlib.pyplot as plt"
3034 ]
3135 },
3236 {
3640 "outputs": [],
3741 "source": [
3842 "df1 = pd.read_csv('volcano_data_2010.csv')\n",
43 "\n",
44 "# Keep only relevant columns\n",
3945 "df = df1.loc[:, (\"Year\", \"Name\", \"Country\", \"Latitude\", \"Longitude\", \"Type\")]\n",
4046 "df.info()"
4147 ]
4652 "metadata": {},
4753 "outputs": [],
4854 "source": [
55 "# Create point geometries\n",
4956 "geometry = geopandas.points_from_xy(df.Longitude, df.Latitude)\n",
5057 "geo_df = geopandas.GeoDataFrame(df[['Year','Name','Country', 'Latitude', 'Longitude', 'Type']], geometry=geometry)\n",
5158 "\n",
8289 "cell_type": "markdown",
8390 "metadata": {},
8491 "source": [
85 "We will be using different icons to differentiate the types of Volcanoes using Folium.\n",
86 "But before we start, we can see a few different tiles to choose from folium."
92 "## Create Folium map\n",
93 "Folium has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen. For example:"
8794 ]
8895 },
8996 {
123130 "cell_type": "markdown",
124131 "metadata": {},
125132 "source": [
126 "We can use other tiles for the visualization, these are just a few examples.\n",
127 "\n",
128 "### Markers\n",
129 "Now, let's look at different volcanoes on the map using different Markers to represent the volcanoes."
130 ]
131 },
132 {
133 "cell_type": "code",
134 "execution_count": null,
135 "metadata": {},
136 "outputs": [],
137 "source": [
138 "#use terrain map layer to actually see volcano terrain\n",
133 "This example uses the Stamen Terrain map layer to visualize the volcano terrain."
134 ]
135 },
136 {
137 "cell_type": "code",
138 "execution_count": null,
139 "metadata": {},
140 "outputs": [],
141 "source": [
142 "# Use terrain map layer to see volcano terrain\n",
139143 "map = folium.Map(location = [4,10], tiles = \"Stamen Terrain\", zoom_start = 3)"
140144 ]
141145 },
142146 {
143 "cell_type": "code",
144 "execution_count": null,
145 "metadata": {},
146 "outputs": [],
147 "source": [
148 "# insert multiple markers, iterate through list\n",
149 "# add a different color marker associated with type of volcano\n",
150 "\n",
147 "cell_type": "markdown",
148 "metadata": {},
149 "source": [
150 "### Add markers\n",
151 "To represent the different types of volcanoes, you can create Folium markers and add them to your map."
152 ]
153 },
154 {
155 "cell_type": "code",
156 "execution_count": null,
157 "metadata": {},
158 "outputs": [],
159 "source": [
160 "# Create a geometry list from the GeoDataFrame\n",
151161 "geo_df_list = [[point.xy[1][0], point.xy[0][0]] for point in geo_df.geometry ]\n",
152162 "\n",
163 "# Iterate through list and add a marker for each volcano, color-coded by its type.\n",
153164 "i = 0\n",
154165 "for coordinates in geo_df_list:\n",
155166 " #assign a color marker for the type of volcano, Strato being the most common\n",
165176 " type_color = \"purple\"\n",
166177 "\n",
167178 "\n",
168 " #now place the markers with the popup labels and data\n",
179 " # Place the markers with the popup labels and data\n",
169180 " map.add_child(folium.Marker(location = coordinates,\n",
170181 " popup =\n",
171182 " \"Year: \" + str(geo_df.Year[i]) + '<br>' +\n",
190201 "cell_type": "markdown",
191202 "metadata": {},
192203 "source": [
193 "### Heatmaps\n",
194 "\n",
195 "Folium is well known for it's heatmap which create a heatmap layer. To plot a heat map in folium, one needs a list of Latitude, Longitude."
196 ]
197 },
198 {
199 "cell_type": "code",
200 "execution_count": null,
201 "metadata": {},
202 "outputs": [],
203 "source": [
204 "# In this example, with the hep of heat maps, we are able to perceive the density of volcanoes\n",
205 "# which is more in some part of the world compared to others.\n",
204 "## Folium Heatmaps\n",
205 "\n",
206 "Folium is well known for its heatmaps, which create a heatmap layer. To plot a heatmap in Folium, you need a list of latitudes and longitudes."
207 ]
208 },
209 {
210 "cell_type": "code",
211 "execution_count": null,
212 "metadata": {},
213 "outputs": [],
214 "source": [
215 "# This example uses heatmaps to visualize the density of volcanoes\n",
216 "# which is more in some parts of the world compared to others.\n",
206217 "\n",
207218 "from folium import plugins\n",
208219 "\n",
215226 "\n",
216227 "map"
217228 ]
218 },
219 {
220 "cell_type": "code",
221 "execution_count": null,
222 "metadata": {},
223 "outputs": [],
224 "source": []
225229 }
226230 ],
227231 "metadata": {
240244 "name": "python",
241245 "nbconvert_exporter": "python",
242246 "pygments_lexer": "ipython3",
243 "version": "3.7.6"
247 "version": "3.9.1"
244248 }
245249 },
246250 "nbformat": 4,
33 "cell_type": "markdown",
44 "metadata": {},
55 "source": [
6 "# An example of polygon plotting with folium \n",
7 "We are going to demonstrate polygon plotting in this example with the help of folium"
6 "# Plotting polygons with Folium\n",
7 "This example demonstrates how to plot polygons on a Folium map."
88 ]
99 },
1010 {
2222 "cell_type": "markdown",
2323 "metadata": {},
2424 "source": [
25 "We make use of nybb dataset"
25 "## Load geometries\n",
26 "This example uses the nybb dataset, which contains polygons of New York boroughs."
2627 ]
2728 },
2829 {
6162 "cell_type": "markdown",
6263 "metadata": {},
6364 "source": [
64 "One thing to notice is that the values of the geometry do not directly represent the values of latitude of longitude in geographic coordinate system\n"
65 ]
66 },
67 {
68 "cell_type": "code",
69 "execution_count": null,
70 "metadata": {},
71 "outputs": [],
72 "source": [
73 "print(df.crs)"
74 ]
75 },
76 {
77 "cell_type": "markdown",
78 "metadata": {},
79 "source": [
80 "As folium(i.e. leaflet.js) by default takes input of values of latitude and longitude, we need to project the geometry first"
81 ]
82 },
83 {
84 "cell_type": "code",
85 "execution_count": null,
86 "metadata": {},
87 "outputs": [],
88 "source": [
65 "Notice that the values of the polygon geometries do not directly represent the values of latitude of longitude in a geographic coordinate system.\n",
66 "To view the coordinate reference system of the geometry column, access the `crs` attribute:"
67 ]
68 },
69 {
70 "cell_type": "code",
71 "execution_count": null,
72 "metadata": {},
73 "outputs": [],
74 "source": [
75 "df.crs"
76 ]
77 },
78 {
79 "cell_type": "markdown",
80 "metadata": {},
81 "source": [
82 "The [epsg:2263](https://epsg.io/2263) crs is a projected coordinate reference system with linear units (ft in this case).\n",
83 "As folium (i.e. leaflet.js) by default accepts values of latitude and longitude (angular units) as input, we need to project the geometry to a geographic coordinate system first."
84 ]
85 },
86 {
87 "cell_type": "code",
88 "execution_count": null,
89 "metadata": {},
90 "outputs": [],
91 "source": [
92 "# Use WGS 84 (epsg:4326) as the geographic coordinate system\n",
8993 "df = df.to_crs(epsg=4326)\n",
9094 "print(df.crs)\n",
9195 "df.head()"
105109 "cell_type": "markdown",
106110 "metadata": {},
107111 "source": [
108 "Initialize folium map object"
112 "## Create Folium map"
109113 ]
110114 },
111115 {
122126 "cell_type": "markdown",
123127 "metadata": {},
124128 "source": [
125 "Overlay the boundaries of boroughs on map with borough name as popup"
129 "### Add polygons to map\n",
130 "Overlay the boundaries of boroughs on map with borough name as popup:"
126131 ]
127132 },
128133 {
132137 "outputs": [],
133138 "source": [
134139 "for _, r in df.iterrows():\n",
135 " #without simplifying the representation of each borough, the map might not be displayed \n",
136 " #sim_geo = gpd.GeoSeries(r['geometry'])\n",
140 " # Without simplifying the representation of each borough,\n",
141 " # the map might not be displayed \n",
137142 " sim_geo = gpd.GeoSeries(r['geometry']).simplify(tolerance=0.001)\n",
138143 " geo_j = sim_geo.to_json()\n",
139144 " geo_j = folium.GeoJson(data=geo_j,\n",
147152 "cell_type": "markdown",
148153 "metadata": {},
149154 "source": [
150 "Add marker showing the area and length of each borough"
151 ]
152 },
153 {
154 "cell_type": "code",
155 "execution_count": null,
156 "metadata": {},
157 "outputs": [],
158 "source": [
159 "df['lat'] = df.centroid.y\n",
160 "df['lon'] = df.centroid.x\n",
155 "### Add centroid markers\n",
156 "In order to properly compute geometric properties, in this case centroids, of the geometries, we need to project the data to a projected coordinate system."
157 ]
158 },
159 {
160 "cell_type": "code",
161 "execution_count": null,
162 "metadata": {},
163 "outputs": [],
164 "source": [
165 "# Project to NAD83 projected crs\n",
166 "df = df.to_crs(epsg=2263)\n",
167 "\n",
168 "# Access the centroid attribute of each polygon\n",
169 "df['centroid'] = df.centroid"
170 ]
171 },
172 {
173 "cell_type": "markdown",
174 "metadata": {},
175 "source": [
176 "Since we're again adding a new geometry to the Folium map, we need to project the geometry back to a geographic coordinate system with latitude and longitude values."
177 ]
178 },
179 {
180 "cell_type": "code",
181 "execution_count": null,
182 "metadata": {},
183 "outputs": [],
184 "source": [
185 "# Project to WGS84 geographic crs\n",
186 "\n",
187 "# geometry (active) column\n",
188 "df = df.to_crs(epsg=4326)\n",
189 "\n",
190 "# Centroid column\n",
191 "df['centroid'] = df['centroid'].to_crs(epsg=4326)\n",
192 "\n",
161193 "df.head()"
162194 ]
163195 },
168200 "outputs": [],
169201 "source": [
170202 "for _, r in df.iterrows():\n",
171 " folium.Marker(location=[r['lat'], r['lon']], popup='length: {} <br> area: {}'.format(r['Shape_Leng'], r['Shape_Area'])).add_to(m)\n",
172 " \n",
203 " lat = r['centroid'].y\n",
204 " lon = r['centroid'].x\n",
205 " folium.Marker(location=[lat, lon],\n",
206 " popup='length: {} <br> area: {}'.format(r['Shape_Leng'], r['Shape_Area'])).add_to(m)\n",
207 "\n",
173208 "m"
174209 ]
175 },
176 {
177 "cell_type": "code",
178 "execution_count": null,
179 "metadata": {},
180 "outputs": [],
181 "source": []
182210 }
183211 ],
184212 "metadata": {
213 "kernelspec": {
214 "display_name": "Python 3",
215 "language": "python",
216 "name": "python3"
217 },
185218 "language_info": {
186219 "codemirror_mode": {
187220 "name": "ipython",
192225 "name": "python",
193226 "nbconvert_exporter": "python",
194227 "pygments_lexer": "ipython3",
195 "version": "3.8.5"
228 "version": "3.9.1"
196229 }
197230 },
198231 "nbformat": 4,
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "metadata": {},
54 "source": [
65 "# Spatial Joins\n",
76 "\n",
1211 "A common use case might be a spatial join between a point layer and a polygon layer where you want to retain the point geometries and grab the attributes of the intersecting polygons.\n",
1312 "\n",
1413 "![illustration](https://web.natur.cuni.cz/~langhamr/lectures/vtfg1/mapinfo_1/about_gis/Image23.gif)"
15 ]
16 },
17 {
18 "cell_type": "markdown",
19 "metadata": {},
14 ],
15 "metadata": {}
16 },
17 {
18 "cell_type": "markdown",
2019 "source": [
2120 "\n",
2221 "## Types of spatial joins\n",
8483 " 0101000000F0D88AA0E1A4EEBF7052F7E5B115E9BF | 2 | 20\n",
8584 "(4 rows) \n",
8685 "```"
87 ]
88 },
89 {
90 "cell_type": "markdown",
91 "metadata": {},
86 ],
87 "metadata": {}
88 },
89 {
90 "cell_type": "markdown",
9291 "source": [
9392 "## Spatial Joins between two GeoDataFrames\n",
9493 "\n",
9594 "Let's take a look at how we'd implement these using `GeoPandas`. First, load up the NYC test data into `GeoDataFrames`:"
96 ]
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
101 "metadata": {},
102 "outputs": [],
95 ],
96 "metadata": {}
97 },
98 {
99 "cell_type": "code",
100 "execution_count": null,
103101 "source": [
104102 "%matplotlib inline\n",
105103 "from shapely.geometry import Point\n",
106104 "from geopandas import datasets, GeoDataFrame, read_file\n",
107 "from geopandas.tools import overlay\n",
108105 "\n",
109106 "# NYC Boros\n",
110107 "zippath = datasets.get_path('nybb')\n",
120117 "\n",
121118 "# Make sure they're using the same projection reference\n",
122119 "pointdf.crs = polydf.crs"
123 ]
124 },
125 {
126 "cell_type": "code",
127 "execution_count": null,
128 "metadata": {},
129 "outputs": [],
120 ],
121 "outputs": [],
122 "metadata": {}
123 },
124 {
125 "cell_type": "code",
126 "execution_count": null,
130127 "source": [
131128 "pointdf"
132 ]
133 },
134 {
135 "cell_type": "code",
136 "execution_count": null,
137 "metadata": {},
138 "outputs": [],
129 ],
130 "outputs": [],
131 "metadata": {}
132 },
133 {
134 "cell_type": "code",
135 "execution_count": null,
139136 "source": [
140137 "polydf"
141 ]
142 },
143 {
144 "cell_type": "code",
145 "execution_count": null,
146 "metadata": {},
147 "outputs": [],
138 ],
139 "outputs": [],
140 "metadata": {}
141 },
142 {
143 "cell_type": "code",
144 "execution_count": null,
148145 "source": [
149146 "pointdf.plot()"
150 ]
151 },
152 {
153 "cell_type": "code",
154 "execution_count": null,
155 "metadata": {},
156 "outputs": [],
147 ],
148 "outputs": [],
149 "metadata": {}
150 },
151 {
152 "cell_type": "code",
153 "execution_count": null,
157154 "source": [
158155 "polydf.plot()"
159 ]
160 },
161 {
162 "cell_type": "markdown",
163 "metadata": {},
156 ],
157 "outputs": [],
158 "metadata": {}
159 },
160 {
161 "cell_type": "markdown",
164162 "source": [
165163 "## Joins"
166 ]
167 },
168 {
169 "cell_type": "code",
170 "execution_count": null,
171 "metadata": {},
172 "outputs": [],
173 "source": [
174 "from geopandas.tools import sjoin\n",
175 "join_left_df = sjoin(pointdf, polydf, how=\"left\")\n",
164 ],
165 "metadata": {}
166 },
167 {
168 "cell_type": "code",
169 "execution_count": null,
170 "source": [
171 "join_left_df = pointdf.sjoin(polydf, how=\"left\")\n",
176172 "join_left_df\n",
177173 "# Note the NaNs where the point did not intersect a boro"
178 ]
179 },
180 {
181 "cell_type": "code",
182 "execution_count": null,
183 "metadata": {},
184 "outputs": [],
185 "source": [
186 "join_right_df = sjoin(pointdf, polydf, how=\"right\")\n",
174 ],
175 "outputs": [],
176 "metadata": {}
177 },
178 {
179 "cell_type": "code",
180 "execution_count": null,
181 "source": [
182 "join_right_df = pointdf.sjoin(polydf, how=\"right\")\n",
187183 "join_right_df\n",
188184 "# Note Staten Island is repeated"
189 ]
190 },
191 {
192 "cell_type": "code",
193 "execution_count": null,
194 "metadata": {},
195 "outputs": [],
196 "source": [
197 "join_inner_df = sjoin(pointdf, polydf, how=\"inner\")\n",
185 ],
186 "outputs": [],
187 "metadata": {}
188 },
189 {
190 "cell_type": "code",
191 "execution_count": null,
192 "source": [
193 "join_inner_df = pointdf.sjoin(polydf, how=\"inner\")\n",
198194 "join_inner_df\n",
199195 "# Note the lack of NaNs; dropped anything that didn't intersect"
200 ]
201 },
202 {
203 "cell_type": "markdown",
204 "metadata": {},
196 ],
197 "outputs": [],
198 "metadata": {}
199 },
200 {
201 "cell_type": "markdown",
205202 "source": [
206203 "We're not limited to using the `intersection` binary predicate. Any of the `Shapely` geometry methods that return a Boolean can be used by specifying the `op` kwarg."
207 ]
208 },
209 {
210 "cell_type": "code",
211 "execution_count": null,
212 "metadata": {},
213 "outputs": [],
214 "source": [
215 "sjoin(pointdf, polydf, how=\"left\", op=\"within\")"
216 ]
204 ],
205 "metadata": {}
206 },
207 {
208 "cell_type": "code",
209 "execution_count": null,
210 "source": [
211 "pointdf.sjoin(polydf, how=\"left\", op=\"within\")"
212 ],
213 "outputs": [],
214 "metadata": {}
217215 }
218216 ],
219217 "metadata": {
237235 },
238236 "nbformat": 4,
239237 "nbformat_minor": 4
240 }
238 }
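Version 0.10 also adds proximity joins via `sjoin_nearest` (note the new export in `geopandas/__init__.py` further down this diff). A minimal sketch with made-up coordinates; `max_distance` caps the search radius, and `sjoin_nearest` requires the optional PyGEOS dependency:

```python
import geopandas
from shapely.geometry import Point

stations = geopandas.GeoDataFrame(
    {"station": ["s1", "s2"]},
    geometry=[Point(0, 0), Point(10, 0)],
    crs="EPSG:3857",
)
incidents = geopandas.GeoDataFrame(
    {"incident": ["i1", "i2", "i3"]},
    geometry=[Point(1, 1), Point(9, -1), Point(100, 100)],
    crs="EPSG:3857",
)

# Each incident is joined to its nearest station; the far-away point
# exceeds max_distance and gets NaN attributes in the left join
joined = incidents.sjoin_nearest(
    stations, how="left", max_distance=5, distance_col="dist"
)
```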
137137 Required dependencies:
138138
139139 - `numpy`_
140 - `pandas`_ (version 0.24 or later)
140 - `pandas`_ (version 0.25 or later)
141141 - `shapely`_ (interface to `GEOS`_)
142142 - `fiona`_ (interface to `GDAL`_)
143143 - `pyproj`_ (interface to `PROJ`_; version 2.2.0 or later)
153153
154154 For plotting, these additional packages may be used:
155155
156 - `matplotlib`_ (>= 2.2.0)
157 - `mapclassify`_ (>= 2.2.0)
156 - `matplotlib`_ (>= 3.1.0)
157 - `mapclassify`_ (>= 2.4.0)
158158
159159
160160 Using the optional PyGEOS dependency
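Whether the PyGEOS speed-ups are used can be controlled globally; a sketch of the two switches, assuming PyGEOS is installed:

```python
import os

# Option 1: set the environment variable before importing geopandas
os.environ["USE_PYGEOS"] = "1"

import geopandas

# Option 2: toggle at runtime through the global options
geopandas.options.use_pygeos = True
```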
11 "cells": [
22 {
33 "cell_type": "markdown",
4 "metadata": {},
4 "metadata": {
5 "tags": []
6 },
57 "source": [
68 "# Introduction to GeoPandas\n",
79 "\n",
8 "This quick tutorial provides an introduction to the key concepts of GeoPandas. In a few minutes, we'll describe the basics which allow you to start your projects.\n",
10 "This quick tutorial introduces the key concepts and basic features of GeoPandas to help you get started with your projects.\n",
911 "\n",
1012 "## Concepts\n",
1113 "\n",
12 "GeoPandas, as the name suggests, extends popular data science library [pandas](https://pandas.pydata.org) by adding support for geospatial data. If you are not familiar with `pandas`, we recommend taking a quick look at its [Getting started documentation](https://pandas.pydata.org/docs/getting_started/index.html#getting-started) before proceeding.\n",
13 "\n",
14 "The core data structure in GeoPandas is `geopandas.GeoDataFrame`, a subclass of `pandas.DataFrame` able to store geometry columns and perform spatial operations. Geometries are handled by `geopandas.GeoSeries`, a subclass of `pandas.Series`. Therefore, your `GeoDataFrame` is a combination of `Series` with your data (numerical, boolean, text etc.) and `GeoSeries` with geometries (points, polygons etc.). You can have as many columns with geometries as you wish, there's no limit typical for desktop GIS software.\n",
14 "GeoPandas, as the name suggests, extends the popular data science library [pandas](https://pandas.pydata.org) by adding support for geospatial data. If you are not familiar with `pandas`, we recommend taking a quick look at its [Getting started documentation](https://pandas.pydata.org/docs/getting_started/index.html#getting-started) before proceeding.\n",
15 "\n",
16 "The core data structure in GeoPandas is the `geopandas.GeoDataFrame`, a subclass of `pandas.DataFrame`, that can store geometry columns and perform spatial operations. The `geopandas.GeoSeries`, a subclass of `pandas.Series`, handles the geometries. Therefore, your `GeoDataFrame` is a combination of `pandas.Series`, with traditional data (numerical, boolean, text etc.), and `geopandas.GeoSeries`, with geometries (points, polygons etc.). You can have as many columns with geometries as you wish; there's no limit typical for desktop GIS software.\n",
1517 "\n",
1618 "![geodataframe schema](../_static/dataframe.svg)\n",
1719 "\n",
18 "Each `GeoSeries` can contain any geometry type (we can even mix them within a single array) and has a `GeoSeries.crs` attribute, which stores information on the projection (CRS stands for Coordinate Reference System). Therefore, each `GeoSeries` in a `GeoDataFrame` can be in a different projection, allowing you to have, for example, multiple versions of the same geometry, just in a different CRS.\n",
19 "\n",
20 "One `GeoSeries` within a `GeoDataFrame` is seen as the _active_ geometry, which means that all geometric operations applied to a `GeoDataFrame` use the specified column.\n",
20 "Each `GeoSeries` can contain any geometry type (you can even mix them within a single array) and has a `GeoSeries.crs` attribute, which stores information about the projection (CRS stands for Coordinate Reference System). Therefore, each `GeoSeries` in a `GeoDataFrame` can be in a different projection, allowing you to have, for example, multiple versions (different projections) of the same geometry.\n",
21 "\n",
22 "Only one `GeoSeries` in a `GeoDataFrame` is considered the _active_ geometry, which means that all geometric operations applied to a `GeoDataFrame` operate on this _active_ column.\n",
2123 "\n",
2224 "\n",
2325 "<div class=\"alert alert-info\">\n",
2729 "</div>\n",
2830 "\n",
2931 "\n",
30 "Let's see how this works in practice.\n",
32 "Let's see how some of these concepts work in practice.\n",
3133 "\n",
3234 "## Reading and writing files\n",
3335 "\n",
3436 "First, we need to read some data.\n",
3537 "\n",
36 "### Read files\n",
37 "\n",
38 "Assuming we have a file containing both data and geometry (e.g. GeoPackage, GeoJSON, Shapefile), we can easily read it using `geopandas.read_file` function, which automatically detects filetype and creates a `GeoDataFrame`. In this example, we'll use the `\"nybb\"` dataset, a map of New York boroughs which is part of GeoPandas installation. Therefore we need to get the path to the actual file. With your file, you specify a path as a string (`\"my_data/my_file.geojson\"`)."
38 "### Reading files\n",
39 "\n",
40 "Assuming you have a file containing both data and geometry (e.g. GeoPackage, GeoJSON, Shapefile), you can read it using `geopandas.read_file()`, which automatically detects the filetype and creates a `GeoDataFrame`. This tutorial uses the `\"nybb\"` dataset, a map of New York boroughs, which is part of the GeoPandas installation. Therefore, we use `geopandas.datasets.get_path()` to retrieve the path to the dataset."
3941 ]
4042 },
4143 {
5456 },
5557 {
5658 "cell_type": "markdown",
57 "metadata": {},
58 "source": [
59 "### Write files\n",
60 "\n",
61 "Writing a `GeoDataFrame` back to file is similarly simple, using `GeoDataFrame.to_file`. The default file format is Shapefile, but you can specify your own using `driver` keyword."
59 "metadata": {
60 "tags": []
61 },
62 "source": [
63 "### Writing files\n",
64 "\n",
65 "To write a `GeoDataFrame` back to file use `GeoDataFrame.to_file()`. The default file format is Shapefile, but you can specify your own with the `driver` keyword."
6266 ]
6367 },
6468 {
7276 },
7377 {
7478 "cell_type": "markdown",
75 "metadata": {},
79 "metadata": {
80 "tags": []
81 },
7682 "source": [
7783 "<div class=\"alert alert-info\">\n",
7884 "User Guide\n",
8288 "\n",
8389 "\n",
8490 "\n",
85 "## Simple methods\n",
91 "## Simple accessors and methods\n",
8692 "\n",
8793 "Now we have our `GeoDataFrame` and can start working with its geometry. \n",
8894 "\n",
89 "Since we have only one geometry column read from the file, it is automatically seen as the active geometry and methods used on `GeoDataFrame` will be applied to the `\"geometry\"` column.\n",
95 "Since there was only one geometry column in the New York Boroughs dataset, this column automatically becomes the _active_ geometry and spatial methods used on the `GeoDataFrame` will be applied to the `\"geometry\"` column.\n",
9096 "\n",
9197 "### Measuring area\n",
9298 "\n",
93 "To measure the area of each polygon (or MultiPolygon in this specific case), we can use `GeoDataFrame.area` attribute, which returns a `pandas.Series`. Note that `GeoDataFrame.area` is just `GeoSeries.area` applied to an active geometry column.\n",
94 "\n",
95 "But first, we set the names of boroughs as an index, to make the results easier to read."
99 "To measure the area of each polygon (or MultiPolygon in this specific case), access the `GeoDataFrame.area` attribute, which returns a `pandas.Series`. Note that `GeoDataFrame.area` is just `GeoSeries.area` applied to the _active_ geometry column.\n",
100 "\n",
101 "But first, to make the results easier to read, set the names of the boroughs as the index:"
96102 ]
97103 },
98104 {
120126 "source": [
121127 "### Getting polygon boundary and centroid\n",
122128 "\n",
123 "To get just the boundary of each polygon (LineString), we can call `GeoDataFrame.boundary`."
129 "To get the boundary of each polygon (LineString), access the `GeoDataFrame.boundary`:"
124130 ]
125131 },
126132 {
158164 "source": [
159165 "### Measuring distance\n",
160166 "\n",
161 "We can also measure how far is each centroid from the first one."
167 "We can also measure how far each centroid is from the first centroid location."
162168 ]
163169 },
164170 {
176182 "cell_type": "markdown",
177183 "metadata": {},
178184 "source": [
179 "It's still a DataFrame, so we have all the pandas functionality available to use on the geospatial dataset, and to do data manipulations with the attributes and geometry information together.\n",
180 "\n",
181 "For example, we can calculate average of the distance measured above (by accessing the `'distance'` column, and calling the `mean()` method on it):"
185 "Note that `geopandas.GeoDataFrame` is a subclass of `pandas.DataFrame`, so we have all the pandas functionality available to use on the geospatial dataset — we can even perform data manipulations with the attributes and geometry information together.\n",
186 "\n",
187 "For example, to calculate the average of the distances measured above, access the 'distance' column and call the mean() method on it:"
182188 ]
183189 },
184190 {
196202 "source": [
197203 "## Making maps\n",
198204 "\n",
199 "GeoPandas can also plot maps, so we can check how our geometries look like in space. The key method here is `GeoDataFrame.plot()`. In the example below, we plot the `\"area\"` we measured earlier using the active geometry column. We also want to show a legend (`legend=True`)."
205 "GeoPandas can also plot maps, so we can check how the geometries appear in space. To plot the active geometry, call `GeoDataFrame.plot()`. To color code by another column, pass in that column as the first argument. In the example below, we plot the active geometry column and color code by the `\"area\"` column. We also want to show a legend (`legend=True`)."
200206 ]
201207 },
202208 {
206212 "outputs": [],
207213 "source": [
208214 "gdf.plot(\"area\", legend=True)"
215 ]
216 },
217 {
218 "cell_type": "markdown",
219 "metadata": {},
220 "source": [
221 "You can also explore your data interactively using `GeoDataFrame.explore()`, which behaves in the same way `plot()` does but returns an interactive map instead."
222 ]
223 },
224 {
225 "cell_type": "code",
226 "execution_count": null,
227 "metadata": {},
228 "outputs": [],
229 "source": [
230 "gdf.explore(\"area\", legend=False)"
209231 ]
210232 },
211233 {
274296 "\n",
275297 "### Convex hull\n",
276298 "\n",
277 "If we are interested in the convex hull of our polygons, we can call `GeoDataFrame.convex_hull`."
299 "If we are interested in the convex hull of our polygons, we can access `GeoDataFrame.convex_hull`."
278300 ]
279301 },
280302 {
417439 {
418440 "cell_type": "code",
419441 "execution_count": null,
420 "metadata": {},
442 "metadata": {
443 "tags": []
444 },
421445 "outputs": [],
422446 "source": [
423447 "gdf = gdf.set_geometry(\"buffered_centroid\")\n",
431455 "source": [
432456 "## Projections\n",
433457 "\n",
434 "Each `GeoSeries` has the Coordinate Reference System (CRS) accessible as `GeoSeries.crs`. CRS tells GeoPandas where the coordinates of geometries are located on the Earth. In some cases, CRS is geographic, which means that coordinates are in latitude and longitude. In those cases, its CRS is WGS84, with the authority code `EPSG:4326`. Let's see the projection of our NY boroughs `GeoDataFrame`."
458 "Each `GeoSeries` has its Coordinate Reference System (CRS) accessible at `GeoSeries.crs`. The CRS tells GeoPandas where the coordinates of the geometries are located on the earth's surface. In some cases, the CRS is geographic, which means that the coordinates are in latitude and longitude. In those cases, its CRS is WGS84, with the authority code `EPSG:4326`. Let's see the projection of our NY boroughs `GeoDataFrame`."
435459 ]
436460 },
437461 {
474498 "cell_type": "markdown",
475499 "metadata": {},
476500 "source": [
477 "Notice the difference in coordinates along the axes of the plot. Where we had 120 000 - 280 000 (feet) before, we have 40.5 - 40.9 (degrees) now. In this case, `boroughs_4326` has a `\"geometry\"` column in WGS84 but all the other (with centroids etc.) remains in the original CRS.\n",
501 "Notice the difference in coordinates along the axes of the plot. Where we had 120 000 - 280 000 (feet) before, we now have 40.5 - 40.9 (degrees). In this case, `boroughs_4326` has a `\"geometry\"` column in WGS84 but all the other (with centroids etc.) remain in the original CRS.\n",
478502 "\n",
479503 "<div class=\"alert alert-warning\">\n",
480504 "Warning\n",
481505 " \n",
482 "For operations that rely on distance or area, you always need to use projected CRS (in meters, feet, kilometers etc.) not a geographic one. GeoPandas operations are planar, and degrees reflect the position on a sphere. Therefore the results may not be correct. For example, the result of `gdf.area.sum()` (projected CRS) is 8 429 911 572 ft<sup>2</sup> but the result of `boroughs_4326.area.sum()` (geographic CRS) is 0.083.\n",
506 "For operations that rely on distance or area, you always need to use a projected CRS (in meters, feet, kilometers etc.) not a geographic one (in degrees). GeoPandas operations are planar, whereas degrees reflect the position on a sphere. Therefore, spatial operations using degrees may not yield correct results. For example, the result of `gdf.area.sum()` (projected CRS) is 8 429 911 572 ft<sup>2</sup> but the result of `boroughs_4326.area.sum()` (geographic CRS) is 0.083.\n",
483507 "</div>\n",
484508 "\n",
485509 "<div class=\"alert alert-info\">\n",
490514 "\n",
491515 "## What next?\n",
492516 "\n",
493 "With GeoPandas we can do much more that this, from [aggregations](../docs/user_guide/aggregation_with_dissolve.rst), to [spatial joins](../docs/user_guide/mergingdata.rst), [geocoding](../docs/user_guide/geocoding.rst) and [much more](../gallery/index.rst).\n",
494 "\n",
495 "Head to the [User Guide](../docs/user_guide.rst) for to learn more about different functionality of GeoPandas, to the [Examples](../gallery/index.rst) to see how it can be used or the the [API reference](../docs/reference.rst) for the details."
496 ]
497 },
498 {
499 "cell_type": "code",
500 "execution_count": null,
501 "metadata": {},
502 "outputs": [],
503 "source": []
517 "With GeoPandas we can do much more than what has been introduced so far, from [aggregations](../docs/user_guide/aggregation_with_dissolve.rst), to [spatial joins](../docs/user_guide/mergingdata.rst), to [geocoding](../docs/user_guide/geocoding.rst), and [much more](../gallery/index.rst).\n",
518 "\n",
519 "Head over to the [User Guide](../docs/user_guide.rst) to learn more about the different features of GeoPandas, the [Examples](../gallery/index.rst) to see how they can be used, or to the [API reference](../docs/reference.rst) for the details."
520 ]
504521 }
505522 ],
506523 "metadata": {
524 "kernelspec": {
525 "display_name": "Python 3",
526 "language": "python",
527 "name": "python3"
528 },
507529 "language_info": {
508530 "codemirror_mode": {
509531 "name": "ipython",
514536 "name": "python",
515537 "nbconvert_exporter": "python",
516538 "pygments_lexer": "ipython3",
517 "version": "3.7.6"
539 "version": "3.9.2"
518540 }
519541 },
520542 "nbformat": 4,
1313
1414 ## Installation
1515
16 GeoPandas is written in pure Python, but has several dependecies written in C
16 GeoPandas is written in pure Python, but has several dependencies written in C
1717 ([GEOS](https://geos.osgeo.org), [GDAL](https://www.gdal.org/), [PROJ](https://proj.org/)). Those base C libraries can sometimes be a challenge to
1818 install. Therefore, we advise you to closely follow the recommendations below to avoid
1919 installation problems.
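In practice that usually means installing from the conda-forge channel (`conda install -c conda-forge geopandas`), which ships compatible builds of GEOS, GDAL, and PROJ alongside the Python package.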
33 dependencies:
44 # required
55 - fiona>=1.8
6 - pandas>=0.24
6 - pandas>=0.25
77 - pyproj>=2.2.0
88 - shapely>=1.6
99
77 from geopandas.io.arrow import _read_parquet as read_parquet # noqa
88 from geopandas.io.arrow import _read_feather as read_feather # noqa
99 from geopandas.io.sql import _read_postgis as read_postgis # noqa
10 from geopandas.tools import sjoin # noqa
10 from geopandas.tools import sjoin, sjoin_nearest # noqa
1111 from geopandas.tools import overlay # noqa
1212 from geopandas.tools._show_versions import show_versions # noqa
1313 from geopandas.tools import clip # noqa
33 import os
44 import warnings
55
6 import numpy as np
67 import pandas as pd
78 import pyproj
89 import shapely
1314 # pandas compat
1415 # -----------------------------------------------------------------------------
1516
16 PANDAS_GE_025 = str(pd.__version__) >= LooseVersion("0.25.0")
1717 PANDAS_GE_10 = str(pd.__version__) >= LooseVersion("1.0.0")
1818 PANDAS_GE_11 = str(pd.__version__) >= LooseVersion("1.1.0")
1919 PANDAS_GE_115 = str(pd.__version__) >= LooseVersion("1.1.5")
3737 PYGEOS_SHAPELY_COMPAT = None
3838
3939 PYGEOS_GE_09 = None
40 PYGEOS_GE_010 = None
41
42 INSTALL_PYGEOS_ERROR = "To use PyGEOS within GeoPandas, you need to install PyGEOS: \
43 'conda install pygeos' or 'pip install pygeos'"
4044
4145 try:
4246 import pygeos # noqa
4549 if str(pygeos.__version__) >= LooseVersion("0.8"):
4650 HAS_PYGEOS = True
4751 PYGEOS_GE_09 = str(pygeos.__version__) >= LooseVersion("0.9")
52 PYGEOS_GE_010 = str(pygeos.__version__) >= LooseVersion("0.10")
4853 else:
4954 warnings.warn(
5055 "The installed version of PyGEOS is too old ({0} installed, 0.8 required),"
8893 # validate the pygeos version
8994 if not str(pygeos.__version__) >= LooseVersion("0.8"):
9095 raise ImportError(
91 "PyGEOS >= 0.6 is required, version {0} is installed".format(
96 "PyGEOS >= 0.8 is required, version {0} is installed".format(
9297 pygeos.__version__
9398 )
9499 )
114119 PYGEOS_SHAPELY_COMPAT = True
115120
116121 except ImportError:
117 raise ImportError(
118 "To use the PyGEOS speed-ups within GeoPandas, you need to install "
119 "PyGEOS: 'conda install pygeos' or 'pip install pygeos'"
120 )
122 raise ImportError(INSTALL_PYGEOS_ERROR)
121123
122124
123125 set_use_pygeos()
143145 with warnings.catch_warnings():
144146 warnings.filterwarnings(
145147 "ignore", "Iteration|The array interface|__len__", shapely_warning
148 )
149 yield
150
151
152 elif (str(np.__version__) >= LooseVersion("1.21")) and not SHAPELY_GE_20:
153
154 @contextlib.contextmanager
155 def ignore_shapely2_warnings():
156 with warnings.catch_warnings():
157 # warning from numpy for existing Shapely releases (this is fixed
158 # with Shapely 1.8)
159 warnings.filterwarnings(
160 "ignore", "An exception was ignored while fetching", DeprecationWarning
146161 )
147162 yield
148163
209224 # -----------------------------------------------------------------------------
210225
211226 PYPROJ_LT_3 = LooseVersion(pyproj.__version__) < LooseVersion("3")
227 PYPROJ_GE_31 = LooseVersion(pyproj.__version__) >= LooseVersion("3.1")
1515 """Provide attribute-style access to configuration dict."""
1616
1717 def __init__(self, options):
18 super(Options, self).__setattr__("_options", options)
18 super().__setattr__("_options", options)
1919 # populate with default values
2020 config = {}
2121 for key, option in options.items():
2222 config[key] = option.default_value
2323
24 super(Options, self).__setattr__("_config", config)
24 super().__setattr__("_config", config)
2525
2626 def __setattr__(self, key, value):
2727 # you can't set new keys
5858 doc_text = "\n".join(textwrap.wrap(option.doc, width=70))
5959 else:
6060 doc_text = u"No description available."
61 doc_text = indent(doc_text, prefix=" ")
61 doc_text = textwrap.indent(doc_text, prefix=" ")
6262 description += doc_text + "\n"
6363 space = "\n "
6464 description = description.replace("\n", space)
6565 return "{}({}{})".format(cls, space, description)
66
67
68 def indent(text, prefix, predicate=None):
69 """
70 This is the python 3 textwrap.indent function, which is not available in
71 python 2.
72 """
73 if predicate is None:
74
75 def predicate(line):
76 return line.strip()
77
78 def prefixed_lines():
79 for line in text.splitlines(True):
80 yield (prefix + line if predicate(line) else line)
81
82 return "".join(prefixed_lines())
8366
8467
8568 def _validate_display_precision(value):
0 from textwrap import dedent
1 from typing import Callable, Union
2
3
4 # doc decorator function ported with modifications from Pandas
5 # https://github.com/pandas-dev/pandas/blob/master/pandas/util/_decorators.py
6
7
8 def doc(*docstrings: Union[str, Callable], **params) -> Callable:
9 """
10 A decorator that takes docstring templates, concatenates them, and performs
11 string substitution on them.
12 This decorator will add a variable "_docstring_components" to the wrapped
13 callable to keep track of the original docstring template for potential usage.
14 If it should be considered a template, it will be saved as a string.
15 Otherwise, it will be saved as a callable, and __doc__ and dedent are used
16 later to get the docstring.
17
18 Parameters
19 ----------
20 *docstrings : str or callable
21 The string / docstring / docstring template to be appended in order
22 after default docstring under callable.
23 **params
24 The string which would be used to format docstring template.
25 """
26
27 def decorator(decorated: Callable) -> Callable:
28 # collecting docstring and docstring templates
29 docstring_components: list[Union[str, Callable]] = []
30 if decorated.__doc__:
31 docstring_components.append(dedent(decorated.__doc__))
32
33 for docstring in docstrings:
34 if hasattr(docstring, "_docstring_components"):
35 docstring_components.extend(docstring._docstring_components)
36 elif isinstance(docstring, str) or docstring.__doc__:
37 docstring_components.append(docstring)
38
39 # formatting templates and concatenating docstring
40 decorated.__doc__ = "".join(
41 component.format(**params)
42 if isinstance(component, str)
43 else dedent(component.__doc__ or "")
44 for component in docstring_components
45 )
46
47 decorated._docstring_components = docstring_components
48 return decorated
49
50 return decorator
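A sketch of how the decorator composes templates, with hypothetical functions, assuming the new module lands at `geopandas._decorator`; the template function keeps its unformatted components, so a second decoration can re-substitute:

```python
from geopandas._decorator import doc


@doc(operation="clip")
def _template(gdf):
    """Apply {operation} to a GeoDataFrame."""


# Reuse _template's stored components with a different substitution
@doc(_template, operation="overlay")
def overlay_like(gdf):
    ...


print(_template.__doc__)      # Apply clip to a GeoDataFrame.
print(overlay_like.__doc__)   # Apply overlay to a GeoDataFrame.
```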
66 import warnings
77
88 import numpy as np
9 import pandas as pd
910
1011 import shapely.geometry
1112 import shapely.geos
4344 type_mapping, geometry_type_ids, geometry_type_values = None, None, None
4445
4546
46 def _isna(value):
47 """
48 Check if scalar value is NA-like (None or np.nan).
47 def isna(value):
48 """
49 Check if scalar value is NA-like (None, np.nan or pd.NA).
4950
5051 Custom version that only works for scalars (returning True or False),
5152 as `pd.isna` also works for array-like input returning a boolean array.
5354 if value is None:
5455 return True
5556 elif isinstance(value, float) and np.isnan(value):
57 return True
58 elif compat.PANDAS_GE_10 and value is pd.NA:
5659 return True
5760 else:
5861 return False
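The scalar-only contract is easy to check; a small sketch, assuming the helper lives in `geopandas._vectorized` (as the `vectorized.isna` calls in the array code suggest) and pandas >= 1.0 so that `pd.NA` exists:

```python
import numpy as np
import pandas as pd

from geopandas._vectorized import isna

# NA-like scalars
assert isna(None)
assert isna(np.nan)
assert isna(pd.NA)

# Regular values are not NA-like
assert not isna(0.0)
assert not isna("POINT (0 0)")
```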
126129 out.append(_shapely_to_pygeos(geom))
127130 else:
128131 out.append(geom)
129 elif _isna(geom):
132 elif isna(geom):
130133 out.append(None)
131134 else:
132135 raise TypeError("Input must be valid geometry objects: {0}".format(geom))
164167 out = []
165168
166169 for geom in data:
167 if geom is not None and len(geom):
168 geom = shapely.wkb.loads(geom)
170 if not isna(geom) and len(geom):
171 geom = shapely.wkb.loads(geom, hex=isinstance(geom, str))
169172 else:
170173 geom = None
171174 out.append(geom)
199202 out = []
200203
201204 for geom in data:
202 if geom is not None and len(geom):
205 if not isna(geom) and len(geom):
203206 if isinstance(geom, bytes):
204207 geom = geom.decode("utf-8")
205208 geom = shapely.wkt.loads(geom)
246249 else:
247250 out = _points_from_xy(x, y, z)
248251 aout = np.empty(len(x), dtype=object)
249 aout[:] = out
252 with compat.ignore_shapely2_warnings():
253 aout[:] = out
250254 return aout
251255
252256
608612 "geometry types, None is returned."
609613 )
610614 data = np.empty(len(data), dtype=object)
611 data[:] = inner_rings
615 with compat.ignore_shapely2_warnings():
616 data[:] = inner_rings
612617 return data
613618
614619
618623 else:
619624 # method and not a property -> can't use _unary_geo
620625 out = np.empty(len(data), dtype=object)
621 out[:] = [
622 geom.representative_point() if geom is not None else None for geom in data
623 ]
626 with compat.ignore_shapely2_warnings():
627 out[:] = [
628 geom.representative_point() if geom is not None else None
629 for geom in data
630 ]
624631 return out
625632
626633
793800
794801 def interpolate(data, distance, normalized=False):
795802 if compat.USE_PYGEOS:
796 return pygeos.line_interpolate_point(data, distance, normalize=normalized)
803 try:
804 return pygeos.line_interpolate_point(data, distance, normalized=normalized)
805 except TypeError: # support for pygeos<0.9
806 return pygeos.line_interpolate_point(data, distance, normalize=normalized)
797807 else:
798808 out = np.empty(len(data), dtype=object)
799809 if isinstance(distance, np.ndarray):
802812 "Length of distance sequence does not match "
803813 "length of the GeoSeries"
804814 )
815 with compat.ignore_shapely2_warnings():
816 out[:] = [
817 geom.interpolate(dist, normalized=normalized)
818 for geom, dist in zip(data, distance)
819 ]
820 return out
821
822 with compat.ignore_shapely2_warnings():
805823 out[:] = [
806 geom.interpolate(dist, normalized=normalized)
807 for geom, dist in zip(data, distance)
824 geom.interpolate(distance, normalized=normalized) for geom in data
808825 ]
809 return out
810
811 out[:] = [geom.interpolate(distance, normalized=normalized) for geom in data]
812826 return out
813827
814828
857871
858872 def project(data, other, normalized=False):
859873 if compat.USE_PYGEOS:
860 return pygeos.line_locate_point(data, other, normalize=normalized)
874 try:
875 return pygeos.line_locate_point(data, other, normalized=normalized)
876 except TypeError: # support for pygeos<0.9
877 return pygeos.line_locate_point(data, other, normalize=normalized)
861878 else:
862879 return _binary_op("project", data, other, normalized=normalized)
863880
940957 result = np.empty(n, dtype=object)
941958 for i in range(n):
942959 geom = data[i]
943 if _isna(geom):
960 if isna(geom):
944961 result[i] = geom
945962 else:
946963 result[i] = transform(func, geom)
2121 # setup.py/versioneer.py will grep for the variable names, so they must
2222 # each be defined on a line of their own. _version.py will just call
2323 # get_keywords().
24 git_refnames = " (tag: v0.9.0)"
25 git_full = "ec4c6805d1182f846b9659345a5e66fa7c7afac7"
24 git_refnames = " (HEAD -> master, tag: v0.10.0)"
25 git_full = "0be92da324d6a83d2a65904cde5c983c433a1584"
2626 keywords = {"refnames": git_refnames, "full": git_full}
2727 return keywords
2828
5353
5454
5555 register_extension_dtype(GeometryDtype)
56
57
58 def _isna(value):
59 """
60 Check if scalar value is NA-like (None, np.nan or pd.NA).
61
62 Custom version that only works for scalars (returning True or False),
63 as `pd.isna` also works for array-like input returning a boolean array.
64 """
65 if value is None:
66 return True
67 elif isinstance(value, float) and np.isnan(value):
68 return True
69 elif compat.PANDAS_GE_10 and value is pd.NA:
70 return True
71 else:
72 return False
7356
7457
7558 def _check_crs(left, right, allow_none=False):
397380 if isinstance(key, numbers.Integral):
398381 raise ValueError("cannot set a single element with an array")
399382 self.data[key] = value.data
400 elif isinstance(value, BaseGeometry) or _isna(value):
401 if _isna(value):
383 elif isinstance(value, BaseGeometry) or vectorized.isna(value):
384 if vectorized.isna(value):
402385 # internally only use None as missing value indicator
403386 # but accept others
404387 value = None
844827 raise RuntimeError("crs must be set to estimate UTM CRS.")
845828
846829 minx, miny, maxx, maxy = self.total_bounds
847 # ensure using geographic coordinates
848 if not self.crs.is_geographic:
849 lon, lat = Transformer.from_crs(
850 self.crs, "EPSG:4326", always_xy=True
851 ).transform((minx, maxx, minx, maxx), (miny, miny, maxy, maxy))
852 x_center = np.mean(lon)
853 y_center = np.mean(lat)
854 else:
830 if self.crs.is_geographic:
855831 x_center = np.mean([minx, maxx])
856832 y_center = np.mean([miny, maxy])
833 # ensure using geographic coordinates
834 else:
835 transformer = Transformer.from_crs(self.crs, "EPSG:4326", always_xy=True)
836 if compat.PYPROJ_GE_31:
837 minx, miny, maxx, maxy = transformer.transform_bounds(
838 minx, miny, maxx, maxy
839 )
840 y_center = np.mean([miny, maxy])
841 # crossed the antimeridian
842 if minx > maxx:
843 # shift maxx from [-180,180] to [0,360]
844 # so both numbers are positive for center calculation
845 # Example: -175 to 185
846 maxx += 360
847 x_center = np.mean([minx, maxx])
848 # shift back to [-180,180]
849 x_center = ((x_center + 180) % 360) - 180
850 else:
851 x_center = np.mean([minx, maxx])
852 else:
853 lon, lat = transformer.transform(
854 (minx, maxx, minx, maxx), (miny, miny, maxy, maxy)
855 )
856 x_center = np.mean(lon)
857 y_center = np.mean(lat)
857858
858859 utm_crs_list = query_utm_crs_info(
859860 datum_name=datum_name,
967968 )
968969 # self.data[idx] = value
969970 value_arr = np.empty(1, dtype=object)
970 value_arr[:] = [value]
971 with compat.ignore_shapely2_warnings():
972 value_arr[:] = [value]
971973 self.data[idx] = value_arr
972974 return self
973975
10041006
10051007 if mask.any():
10061008 # fill with value
1007 if _isna(value):
1009 if vectorized.isna(value):
10081010 value = None
10091011 elif not isinstance(value, BaseGeometry):
10101012 raise NotImplementedError(
10461048 pd_dtype = pd.api.types.pandas_dtype(dtype)
10471049 if isinstance(pd_dtype, pd.StringDtype):
10481050 # ensure to return a pandas string array instead of numpy array
1049 return pd.array(string_values, dtype="string")
1051 return pd.array(string_values, dtype=pd_dtype)
10501052 return string_values.astype(dtype, copy=False)
10511053 else:
10521054 return np.array(self, dtype=dtype, copy=copy)
10591061 return pygeos.is_missing(self.data)
10601062 else:
10611063 return np.array([g is None for g in self.data], dtype="bool")
1064
1065 def value_counts(
1066 self,
1067 dropna: bool = True,
1068 ):
1069 """
1070 Compute a histogram of the counts of non-null values.
1071
1072 Parameters
1073 ----------
1074 dropna : bool, default True
1075 Don't include counts of NaN
1076
1077 Returns
1078 -------
1079 pd.Series
1080 """
1081
1082 # note ExtensionArray usage of value_counts only specifies dropna,
1083 # so sort, normalize and bins are not arguments
1084 values = to_wkb(self)
1085 from pandas import Series, Index
1086
1087 result = Series(values).value_counts(dropna=dropna)
1088 # value_counts converts None to nan, need to convert back for from_wkb to work
1089 # note result.index already has object dtype, not geometry
1090 # Can't use fillna(None) or Index.putmask, as this gets converted back to nan
1091 # for object dtypes
1092 result.index = Index(
1093 from_wkb(np.where(result.index.isna(), None, result.index))
1094 )
1095 return result
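A short usage sketch of the WKB round-trip above; geometries that serialize to identical WKB count as the same value:

    import geopandas
    from shapely.geometry import Point

    s = geopandas.GeoSeries([Point(0, 0), Point(0, 0), Point(1, 1), None])
    s.value_counts()              # POINT (0 0) -> 2, POINT (1 1) -> 1
    s.value_counts(dropna=False)  # additionally counts the missing entry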
10621096
10631097 def unique(self):
10641098 """Compute the ExtensionArray of unique values.
11071141 len(self) is returned, with all values filled with
11081142 ``self.dtype.na_value``.
11091143 """
1110 shifted = super(GeometryArray, self).shift(periods, fill_value)
1144 shifted = super().shift(periods, fill_value)
11111145 shifted.crs = self.crs
11121146 return shifted
11131147
12231257
12241258 precision = geopandas.options.display_precision
12251259 if precision is None:
1226 # dummy heuristic based on 10 first geometries that should
1227 # work in most cases
1228 with warnings.catch_warnings():
1229 warnings.simplefilter("ignore", category=RuntimeWarning)
1230 xmin, ymin, xmax, ymax = self[~self.isna()][:10].total_bounds
1231 if (
1232 (-180 <= xmin <= 180)
1233 and (-180 <= xmax <= 180)
1234 and (-90 <= ymin <= 90)
1235 and (-90 <= ymax <= 90)
1236 ):
1237 # geographic coordinates
1238 precision = 5
1260 if self.crs:
1261 if self.crs.is_projected:
1262 precision = 3
1263 else:
1264 precision = 5
12391265 else:
1240 # typically projected coordinates
1241 # (in case of unit meter: mm precision)
1242 precision = 3
1266 # fallback
1267 # dummy heuristic based on 10 first geometries that should
1268 # work in most cases
1269 with warnings.catch_warnings():
1270 warnings.simplefilter("ignore", category=RuntimeWarning)
1271 xmin, ymin, xmax, ymax = self[~self.isna()][:10].total_bounds
1272 if (
1273 (-180 <= xmin <= 180)
1274 and (-180 <= xmax <= 180)
1275 and (-90 <= ymin <= 90)
1276 and (-90 <= ymax <= 90)
1277 ):
1278 # geographic coordinates
1279 precision = 5
1280 else:
1281 # typically projected coordinates
1282 # (in case of unit meter: mm precision)
1283 precision = 3
12431284 return lambda geom: shapely.wkt.dumps(geom, rounding_precision=precision)
12441285 return repr
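Both branches only run while the option is unset; setting it explicitly bypasses the CRS check and the bounds heuristic alike. A minimal sketch:

    import geopandas

    geopandas.options.display_precision = 2     # force two decimals in reprs
    geopandas.options.display_precision = None  # restore the automatic behaviour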
12451286
13181359 """
13191360 Return for `item in self`.
13201361 """
1321 if _isna(item):
1362 if vectorized.isna(item):
13221363 if (
13231364 item is self.dtype.na_value
13241365 or isinstance(item, self.dtype.type)
55
66 from shapely.geometry import box
77 from shapely.geometry.base import BaseGeometry
8 from shapely.ops import cascaded_union
98
109 from .array import GeometryArray, GeometryDtype
1110
698697
699698 @property
700699 def cascaded_union(self):
701 """Deprecated: Return the unary_union of all geometries"""
702 return cascaded_union(np.asarray(self.geometry.values))
700 """Deprecated: use `unary_union` instead"""
701 warn(
702 "The 'cascaded_union' attribute is deprecated, use 'unary_union' instead",
703 FutureWarning,
704 stacklevel=2,
705 )
706 return self.geometry.values.unary_union()
703707
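Migrating off the deprecated property is a one-line change; a hedged sketch (``gdf`` is any GeoDataFrame or GeoSeries):

    union = gdf.unary_union       # preferred
    # union = gdf.cascaded_union  # deprecated: now emits a FutureWarning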
704708 @property
705709 def unary_union(self):
718722
719723 >>> union = s.unary_union
720724 >>> print(union)
721 POLYGON ((0 0, 0 1, 0 2, 2 2, 2 0, 1 0, 0 0))
725 POLYGON ((0 1, 0 2, 2 2, 2 0, 1 0, 0 0, 0 1))
722726 """
723727 return self.geometry.values.unary_union()
724728
730734 """Returns a ``Series`` of ``dtype('bool')`` with value ``True`` for
731735 each aligned geometry that contains `other`.
732736
733 An object is said to contain `other` if its `interior` contains the
734 `boundary` and `interior` of the other object and their boundaries do
735 not touch at all.
737 An object is said to contain `other` if at least one point of `other` lies in
738 the interior and no points of `other` lie in the exterior of the object.
739 (Therefore, any given polygon does not contain its own boundary, since no
740 point of the boundary lies in the interior.)
741 If either object is empty, this operation returns ``False``.
736742
737743 This is the inverse of :meth:`within` in the sense that the expression
738744 ``a.contains(b) == b.within(a)`` always evaluates to ``True``.
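A small shapely sketch of the boundary rule described above:

    from shapely.geometry import Polygon

    poly = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
    poly.contains(poly.boundary)  # False: the boundary has no point in the interior
    poly.covers(poly.boundary)    # True: none of its points lie in the exterior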
745751 Parameters
746752 ----------
747753 other : GeoSeries or geometric object
748 The GeoSeries (elementwise) or geometric object to test if is
749 contained.
754 The GeoSeries (elementwise) or geometric object to test if it
755 is contained.
750756 align : bool (default True)
751757 If True, automatically aligns GeoSeries based on their indices.
752758 If False, the order of elements is preserved.
16401646 """Returns a ``Series`` of ``dtype('bool')`` with value ``True`` for
16411647 each aligned geometry that is within `other`.
16421648
1643 An object is said to be within `other` if its `boundary` and `interior`
1644 intersects only with the `interior` of the other (not its `boundary` or
1645 `exterior`).
1649 An object is said to be within `other` if at least one of its points is located
1650 in the `interior` and no points are located in the `exterior` of the other.
1651 If either object is empty, this operation returns ``False``.
16461652
16471653 This is the inverse of :meth:`contains` in the sense that the
16481654 expression ``a.within(b) == b.contains(a)`` always evaluates to
17571763
17581764 An object A is said to cover another object B if no points of B lie
17591765 in the exterior of A.
1766 If either object is empty, this operation returns ``False``.
17601767
17611768 The operation works on a 1-to-1 row-wise manner:
17621769
21472154 :align: center
21482155
21492156 >>> s.difference(Polygon([(0, 0), (1, 1), (0, 1)]))
2150 0 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2151 1 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2157 0 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
2158 1 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
21522159 2 LINESTRING (1.00000 1.00000, 2.00000 2.00000)
21532160 3 MULTILINESTRING ((2.00000 0.00000, 1.00000 1.0...
21542161 4 POINT EMPTY
21642171
21652172 >>> s.difference(s2, align=True)
21662173 0 None
2167 1 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2174 1 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
21682175 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
21692176 3 LINESTRING EMPTY
21702177 4 POINT (0.00000 1.00000)
21722179 dtype: geometry
21732180
21742181 >>> s.difference(s2, align=False)
2175 0 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2176 1 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2182 0 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
2183 1 POLYGON ((0.00000 0.00000, 0.00000 2.00000, 1....
21772184 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
21782185 3 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
21792186 4 POINT EMPTY
22622269 :align: center
22632270
22642271 >>> s.symmetric_difference(Polygon([(0, 0), (1, 1), (0, 1)]))
2265 0 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2266 1 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2267 2 GEOMETRYCOLLECTION (LINESTRING (1.00000 1.0000...
2268 3 GEOMETRYCOLLECTION (LINESTRING (2.00000 0.0000...
2269 4 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
2272 0 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
2273 1 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
2274 2 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
2275 3 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
2276 4 POLYGON ((0.00000 1.00000, 1.00000 1.00000, 0....
22702277 dtype: geometry
22712278
22722279 We can also check two GeoSeries against each other, row by row.
22792286
22802287 >>> s.symmetric_difference(s2, align=True)
22812288 0 None
2282 1 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2289 1 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
22832290 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
22842291 3 LINESTRING EMPTY
22852292 4 MULTIPOINT (0.00000 1.00000, 1.00000 1.00000)
22872294 dtype: geometry
22882295
22892296 >>> s.symmetric_difference(s2, align=False)
2290 0 POLYGON ((0.00000 1.00000, 0.00000 2.00000, 2....
2291 1 GEOMETRYCOLLECTION (LINESTRING (1.00000 0.0000...
2297 0 POLYGON ((0.00000 2.00000, 2.00000 2.00000, 1....
2298 1 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
22922299 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
22932300 3 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
22942301 4 POINT EMPTY
23742381 :align: center
23752382
23762383 >>> s.union(Polygon([(0, 0), (1, 1), (0, 1)]))
2377 0 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2378 1 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2379 2 GEOMETRYCOLLECTION (LINESTRING (1.00000 1.0000...
2380 3 GEOMETRYCOLLECTION (LINESTRING (2.00000 0.0000...
2381 4 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
2384 0 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 0....
2385 1 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 0....
2386 2 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
2387 3 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
2388 4 POLYGON ((0.00000 1.00000, 1.00000 1.00000, 0....
23822389 dtype: geometry
23832390
23842391 We can also check two GeoSeries against each other, row by row.
23912398
23922399 >>> s.union(s2, align=True)
23932400 0 None
2394 1 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2401 1 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 0....
23952402 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
23962403 3 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
23972404 4 MULTIPOINT (0.00000 1.00000, 1.00000 1.00000)
23992406 dtype: geometry
24002407
24012408 >>> s.union(s2, align=False)
2402 0 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2403 1 GEOMETRYCOLLECTION (LINESTRING (1.00000 0.0000...
2409 0 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 0....
2410 1 GEOMETRYCOLLECTION (POLYGON ((0.00000 0.00000,...
24042411 2 MULTILINESTRING ((0.00000 0.00000, 1.00000 1.0...
24052412 3 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
24062413 4 POINT (0.00000 1.00000)
24872494 :align: center
24882495
24892496 >>> s.intersection(Polygon([(0, 0), (1, 1), (0, 1)]))
2490 0 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2491 1 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2497 0 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
2498 1 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
24922499 2 LINESTRING (0.00000 0.00000, 1.00000 1.00000)
24932500 3 POINT (1.00000 1.00000)
24942501 4 POINT (0.00000 1.00000)
25042511
25052512 >>> s.intersection(s2, align=True)
25062513 0 None
2507 1 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2514 1 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
25082515 2 POINT (1.00000 1.00000)
25092516 3 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
25102517 4 POINT EMPTY
25122519 dtype: geometry
25132520
25142521 >>> s.intersection(s2, align=False)
2515 0 POLYGON ((1.00000 1.00000, 0.00000 0.00000, 0....
2522 0 POLYGON ((0.00000 0.00000, 0.00000 1.00000, 1....
25162523 1 LINESTRING (1.00000 1.00000, 1.00000 2.00000)
25172524 2 POINT (1.00000 1.00000)
25182525 3 POINT (1.00000 1.00000)
27202727 """Returns a ``GeoSeries`` containing a simplified representation of
27212728 each geometry.
27222729
2730 The algorithm (Douglas-Peucker) recursively splits the original line
2731 into smaller parts and connects these parts’ endpoints
2732 by a straight line. Then, it removes all points whose distance
2733 to the straight line is smaller than `tolerance`. It does not
2734 move any points and it always preserves endpoints of
2735 the original line or polygon.
27232736 See http://shapely.readthedocs.io/en/latest/manual.html#object.simplify
27242737 for details
27252738
27262739 Parameters
27272740 ----------
27282741 tolerance : float
2729 All points in a simplified geometry will be no more than
2730 `tolerance` distance from the original.
2742 All parts of a simplified geometry will be no more than
2743 `tolerance` distance from the original. It has the same units
2744 as the coordinate reference system of the GeoSeries.
2745 For example, using `tolerance=100` in a projected CRS with meters
2746 as units means a distance of 100 meters in reality.
27312747 preserve_topology: bool (default True)
27322748 False uses a quicker algorithm, but may produce self-intersecting
27332749 or otherwise invalid geometries.
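A hedged sketch of how tolerance follows the CRS units (here a metric CRS, so tolerance is in meters):

    import geopandas
    from shapely.geometry import LineString

    s = geopandas.GeoSeries([LineString([(0, 0), (5, 10), (0, 20)])], crs="EPSG:3857")
    s.simplify(1)   # midpoint sits 5 m off the chord, so tolerance 1 keeps it
    s.simplify(10)  # tolerance 10 drops it, leaving a straight two-point line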
28912907
28922908 Examples
28932909 --------
2894 >>> from shapely.geometry import Polygon, LineString, Point
2895 >>> s = geopandas.GeoSeries(
2896 ... [
2897 ... Polygon([(0, 0), (2, 2), (0, 2)]),
2910 >>> from shapely.geometry import LineString, Point
2911 >>> s = geopandas.GeoSeries(
2912 ... [
2913 ... LineString([(0, 0), (2, 0), (0, 2)]),
28982914 ... LineString([(0, 0), (2, 2)]),
28992915 ... LineString([(2, 0), (0, 2)]),
29002916 ... ],
29092925 ... )
29102926
29112927 >>> s
2912 0 POLYGON ((0.00000 0.00000, 2.00000 2.00000, 0....
2928 0 LINESTRING (0.00000 0.00000, 2.00000 0.00000, ...
29132929 1 LINESTRING (0.00000 0.00000, 2.00000 2.00000)
29142930 2 LINESTRING (2.00000 0.00000, 0.00000 2.00000)
29152931 dtype: geometry
29272943 :align: center
29282944
29292945 >>> s.project(Point(1, 0))
2930 0 -1.000000
2946 0 1.000000
29312947 1 0.707107
29322948 2 0.707107
29332949 dtype: float64
29482964 dtype: float64
29492965
29502966 >>> s.project(s2, align=False)
2951 0 -1.000000
2967 0 1.000000
29522968 1 0.707107
29532969 2 0.707107
29542970 dtype: float64
33003316 # don't know how to handle step; should this raise?
33013317 if xs.step is not None or ys.step is not None:
33023318 warn("Ignoring step - full interval is used.")
3303 xmin, ymin, xmax, ymax = obj.total_bounds
3319 if xs.start is None or xs.stop is None or ys.start is None or ys.stop is None:
3320 xmin, ymin, xmax, ymax = obj.total_bounds
33043321 bbox = box(
33053322 xs.start if xs.start is not None else xmin,
33063323 ys.start if ys.start is not None else ymin,
0 from statistics import mean
1
2 import geopandas
3 from shapely.geometry import LineString
4 import numpy as np
5 import pandas as pd
6
7 _MAP_KWARGS = [
8 "location",
9 "prefer_canvas",
10 "no_touch",
11 "disable_3d",
12 "png_enabled",
13 "zoom_control",
14 "crs",
15 "zoom_start",
16 "left",
17 "top",
18 "position",
19 "min_zoom",
20 "max_zoom",
21 "min_lat",
22 "max_lat",
23 "min_lon",
24 "max_lon",
25 "max_bounds",
26 ]
27
28
29 def _explore(
30 df,
31 column=None,
32 cmap=None,
33 color=None,
34 m=None,
35 tiles="OpenStreetMap",
36 attr=None,
37 tooltip=True,
38 popup=False,
39 highlight=True,
40 categorical=False,
41 legend=True,
42 scheme=None,
43 k=5,
44 vmin=None,
45 vmax=None,
46 width="100%",
47 height="100%",
48 categories=None,
49 classification_kwds=None,
50 control_scale=True,
51 marker_type=None,
52 marker_kwds={},
53 style_kwds={},
54 highlight_kwds={},
55 missing_kwds={},
56 tooltip_kwds={},
57 popup_kwds={},
58 legend_kwds={},
59 **kwargs,
60 ):
61 """Interactive map based on GeoPandas and folium/leaflet.js
62
63 Generate an interactive leaflet map based on :class:`~geopandas.GeoDataFrame`
64
65 Parameters
66 ----------
67 column : str, np.array, pd.Series (default None)
68 The name of the dataframe column, :class:`numpy.array`,
69 or :class:`pandas.Series` to be plotted. If :class:`numpy.array` or
70 :class:`pandas.Series` is used, it must have the same length as the dataframe.
71 cmap : str, matplotlib.Colormap, branca.colormap or function (default None)
72 The name of a colormap recognized by ``matplotlib``, a list-like of colors,
73 :class:`matplotlib.colors.Colormap`, a :class:`branca.colormap.ColorMap` or
74 function that returns a named color or hex based on the column
75 value, e.g.::
76
77 def my_colormap(value): # scalar value defined in 'column'
78 if value > 1:
79 return "green"
80 return "red"
81
82 color : str, array-like (default None)
83 Named color or a list-like of colors (named or hex).
84 m : folium.Map (default None)
85 Existing map instance on which to draw the plot.
86 tiles : str, xyzservices.TileProvider (default 'OpenStreetMap')
87 Map tileset to use. Can choose from the list supported by folium, query a
88 :class:`xyzservices.TileProvider` by a name from ``xyzservices.providers``,
89 pass :class:`xyzservices.TileProvider` object or pass custom XYZ URL.
90 The current list of built-in providers (when ``xyzservices`` is not available):
91
92 ``["OpenStreetMap", "Stamen Terrain", "Stamen Toner", "Stamen Watercolor",
93 "CartoDB positron", "CartoDB dark_matter"]``
94
95 You can pass a custom tileset to Folium by passing a Leaflet-style URL
96 to the tiles parameter: ``http://{s}.yourtiles.com/{z}/{x}/{y}.png``.
97 Be sure to check their terms and conditions and to provide attribution with
98 the ``attr`` keyword.
99 attr : str (default None)
100 Map tile attribution; only required if passing custom tile URL.
101 tooltip : bool, str, int, list (default True)
102 Display GeoDataFrame attributes when hovering over the object.
103 ``True`` includes all columns. ``False`` removes the tooltip. Pass a string or a
104 list of strings to specify the column(s) to show. An integer limits the tooltip
105 to the first n columns. Defaults to ``True``.
106 popup : bool, str, int, list (default False)
107 GeoDataFrame attributes displayed in a popup when clicking the object.
108 ``True`` includes all columns. ``False`` removes the popup. Pass a string or a
109 list of strings to specify the column(s) to show. An integer limits the popup
110 to the first n columns. Defaults to ``False``.
111 highlight : bool (default True)
112 Enable highlight functionality when hovering over a geometry.
113 categorical : bool (default False)
114 If ``False``, ``cmap`` will reflect numerical values of the
115 column being plotted. For non-numerical columns, this
116 will be set to True.
117 legend : bool (default True)
118 Plot a legend in choropleth plots.
119 Ignored if no ``column`` is given.
120 scheme : str (default None)
121 Name of a choropleth classification scheme (requires ``mapclassify`` >= 2.4.0).
122 A :func:`mapclassify.classify` will be used
123 under the hood. Supported are all schemes provided by ``mapclassify`` (e.g.
124 ``'BoxPlot'``, ``'EqualInterval'``, ``'FisherJenks'``, ``'FisherJenksSampled'``,
125 ``'HeadTailBreaks'``, ``'JenksCaspall'``, ``'JenksCaspallForced'``,
126 ``'JenksCaspallSampled'``, ``'MaxP'``, ``'MaximumBreaks'``,
127 ``'NaturalBreaks'``, ``'Quantiles'``, ``'Percentiles'``, ``'StdMean'``,
128 ``'UserDefined'``). Arguments can be passed in ``classification_kwds``.
129 k : int (default 5)
130 Number of classes
131 vmin : None or float (default None)
132 Minimum value of ``cmap``. If ``None``, the minimum data value
133 in the column to be plotted is used.
134 vmax : None or float (default None)
135 Maximum value of ``cmap``. If ``None``, the maximum data value
136 in the column to be plotted is used.
137 width : pixel int or percentage string (default: '100%')
138 Width of the folium :class:`~folium.folium.Map`. If the argument
139 m is given explicitly, width is ignored.
140 height : pixel int or percentage string (default: '100%')
141 Height of the folium :class:`~folium.folium.Map`. If the argument
142 m is given explicitly, height is ignored.
143 categories : list-like
144 Ordered list-like object of categories to be used for categorical plot.
145 classification_kwds : dict (default None)
146 Keyword arguments to pass to mapclassify
147 control_scale : bool (default True)
148 Whether to add a control scale on the map.
149 marker_type : str, folium.Circle, folium.CircleMarker, folium.Marker (default None)
150 Allowed string options are ('marker', 'circle', 'circle_marker'). Defaults to
151 folium.CircleMarker.
152 marker_kwds: dict (default {})
153 Additional keywords to be passed to the selected ``marker_type``, e.g.:
154
155 radius : float (default 2 for ``circle_marker`` and 50 for ``circle``)
156 Radius of the circle, in meters (for ``circle``) or pixels
157 (for ``circle_marker``).
158 fill : bool (default True)
159 Whether to fill the ``circle`` or ``circle_marker`` with color.
160 icon : folium.map.Icon
161 the :class:`folium.map.Icon` object to use to render the marker.
162 draggable : bool (default False)
163 Set to True to be able to drag the marker around the map.
164
165 style_kwds : dict (default {})
166 Additional style to be passed to folium ``style_function``:
167
168 stroke : bool (default True)
169 Whether to draw stroke along the path. Set it to ``False`` to
170 disable borders on polygons or circles.
171 color : str
172 Stroke color
173 weight : int
174 Stroke width in pixels
175 opacity : float (default 1.0)
176 Stroke opacity
177 fill : boolean (default True)
178 Whether to fill the path with color. Set it to ``False`` to
179 disable filling on polygons or circles.
180 fillColor : str
181 Fill color. Defaults to the value of the color option
182 fillOpacity : float (default 0.5)
183 Fill opacity.
184
185 Plus all supported by :func:`folium.vector_layers.path_options`. See the
186 documentation of :class:`folium.features.GeoJson` for details.
187
188 highlight_kwds : dict (default {})
189 Style to be passed to folium highlight_function. Uses the same keywords
190 as ``style_kwds``. When empty, defaults to ``{"fillOpacity": 0.75}``.
191 tooltip_kwds : dict (default {})
192 Additional keywords to be passed to :class:`folium.features.GeoJsonTooltip`,
193 e.g. ``aliases``, ``labels``, or ``sticky``.
194 popup_kwds : dict (default {})
195 Additional keywords to be passed to :class:`folium.features.GeoJsonPopup`,
196 e.g. ``aliases`` or ``labels``.
197 legend_kwds : dict (default {})
198 Additional keywords to be passed to the legend.
199
200 Currently supported customisation:
201
202 caption : string
203 Custom caption of the legend. Defaults to the column name.
204
205 Additional accepted keywords when ``scheme`` is specified:
206
207 colorbar : bool (default True)
208 An option to control the style of the legend. If True, a continuous
209 colorbar will be used. If False, a categorical legend will be used for bins.
210 scale : bool (default True)
211 Scale bins along the colorbar axis according to the bin edges (True)
212 or use an equal length for each bin (False).
213 fmt : string (default "{:.2f}")
214 A formatting specification for the bin edges of the classes in the
215 legend. For example, to have no decimals: ``{"fmt": "{:.0f}"}``. Applies
216 if ``colorbar=False``.
217 labels : list-like
218 A list of legend labels to override the auto-generated labels.
219 Needs to have the same number of elements as the number of
220 classes (`k`). Applies if ``colorbar=False``.
221 interval : boolean (default False)
222 An option to control the interval brackets in the mapclassify legend.
223 If True, open/closed interval brackets are shown in the legend.
224 Applies if ``colorbar=False``.
225 max_labels : int, default 10
226 Maximum number of colorbar tick labels (requires branca>=0.5.0)
227
228 **kwargs : dict
229 Additional options to be passed on to the folium object.
230
231 Returns
232 -------
233 m : folium.folium.Map
234 folium :class:`~folium.folium.Map` instance
235
236 Examples
237 --------
238 >>> df = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
239 >>> df.head(2) # doctest: +SKIP
240 pop_est continent name iso_a3 \
241 gdp_md_est geometry
242 0 920938 Oceania Fiji FJI 8374.0 MULTIPOLY\
243 GON (((180.00000 -16.06713, 180.00000...
244 1 53950935 Africa Tanzania TZA 150600.0 POLYGON (\
245 (33.90371 -0.95000, 34.07262 -1.05982...
246
247 >>> df.explore("pop_est", cmap="Blues") # doctest: +SKIP
248 """
249 try:
250 import branca as bc
251 import folium
252 import matplotlib.cm as cm
253 import matplotlib.colors as colors
254 import matplotlib.pyplot as plt
255 from mapclassify import classify
256 except (ImportError, ModuleNotFoundError):
257 raise ImportError(
258 "The 'folium', 'matplotlib' and 'mapclassify' packages are required for "
259 "'explore()'. You can install them using "
260 "'conda install -c conda-forge folium matplotlib mapclassify' "
261 "or 'pip install folium matplotlib mapclassify'."
262 )
263
264 # xyzservices is an optional dependency
265 try:
266 import xyzservices
267
268 HAS_XYZSERVICES = True
269 except (ImportError, ModuleNotFoundError):
270 HAS_XYZSERVICES = False
271
272 gdf = df.copy()
273
274 # convert LinearRing to LineString
275 rings_mask = df.geom_type == "LinearRing"
276 if rings_mask.any():
277 gdf.geometry[rings_mask] = gdf.geometry[rings_mask].apply(
278 lambda g: LineString(g)
279 )
280
281 if gdf.crs is None:
282 kwargs["crs"] = "Simple"
283 tiles = None
284 elif not gdf.crs.equals(4326):
285 gdf = gdf.to_crs(4326)
286
287 # create folium.Map object
288 if m is None:
289 # Get bounds to specify location and map extent
290 bounds = gdf.total_bounds
291 location = kwargs.pop("location", None)
292 if location is None:
293 x = mean([bounds[0], bounds[2]])
294 y = mean([bounds[1], bounds[3]])
295 location = (y, x)
296 if "zoom_start" in kwargs.keys():
297 fit = False
298 else:
299 fit = True
300 else:
301 fit = False
302
303 # get a subset of kwargs to be passed to folium.Map
304 map_kwds = {i: kwargs[i] for i in kwargs.keys() if i in _MAP_KWARGS}
305
306 if HAS_XYZSERVICES:
307 # match provider name string to xyzservices.TileProvider
308 if isinstance(tiles, str):
309 try:
310 tiles = xyzservices.providers.query_name(tiles)
311 except ValueError:
312 pass
313
314 if isinstance(tiles, xyzservices.TileProvider):
315 attr = attr if attr else tiles.html_attribution
316 map_kwds["min_zoom"] = tiles.get("min_zoom", 0)
317 map_kwds["max_zoom"] = tiles.get("max_zoom", 18)
318 tiles = tiles.build_url(scale_factor="{r}")
319
320 m = folium.Map(
321 location=location,
322 control_scale=control_scale,
323 tiles=tiles,
324 attr=attr,
325 width=width,
326 height=height,
327 **map_kwds,
328 )
329
330 # fit bounds to get a proper zoom level
331 if fit:
332 m.fit_bounds([[bounds[1], bounds[0]], [bounds[3], bounds[2]]])
333
334 for map_kwd in _MAP_KWARGS:
335 kwargs.pop(map_kwd, None)
336
337 nan_idx = None
338
339 if column is not None:
340 if pd.api.types.is_list_like(column):
341 if len(column) != gdf.shape[0]:
342 raise ValueError(
343 "The GeoDataFrame and given column have different number of rows."
344 )
345 else:
346 column_name = "__plottable_column"
347 gdf[column_name] = column
348 column = column_name
349 elif pd.api.types.is_categorical_dtype(gdf[column]):
350 if categories is not None:
351 raise ValueError(
352 "Cannot specify 'categories' when column has categorical dtype"
353 )
354 categorical = True
355 elif gdf[column].dtype is np.dtype("O") or categories:
356 categorical = True
357
358 nan_idx = pd.isna(gdf[column])
359
360 if categorical:
361 cat = pd.Categorical(gdf[column][~nan_idx], categories=categories)
362 N = len(cat.categories)
363 cmap = cmap if cmap else "tab20"
364
365 # colormap exists in matplotlib
366 if cmap in plt.colormaps():
367
368 color = np.apply_along_axis(
369 colors.to_hex, 1, cm.get_cmap(cmap, N)(cat.codes)
370 )
371 legend_colors = np.apply_along_axis(
372 colors.to_hex, 1, cm.get_cmap(cmap, N)(range(N))
373 )
374
375 # colormap is matplotlib.Colormap
376 elif isinstance(cmap, colors.Colormap):
377 color = np.apply_along_axis(colors.to_hex, 1, cmap(cat.codes))
378 legend_colors = np.apply_along_axis(colors.to_hex, 1, cmap(range(N)))
379
380 # custom list of colors
381 elif pd.api.types.is_list_like(cmap):
382 if N > len(cmap):
383 cmap = cmap * (N // len(cmap) + 1)
384 color = np.take(cmap, cat.codes)
385 legend_colors = np.take(cmap, range(N))
386
387 else:
388 raise ValueError(
389 "'cmap' is invalid. For categorical plots, pass either a valid "
390 "named matplotlib colormap or a list-like of colors."
391 )
392
393 elif callable(cmap):
394 # List of colors based on Branca colormaps or self-defined functions
395 color = list(map(lambda x: cmap(x), df[column]))
396
397 else:
398 vmin = gdf[column].min() if not vmin else vmin
399 vmax = gdf[column].max() if not vmax else vmax
400
401 # get bins
402 if scheme is not None:
403
404 if classification_kwds is None:
405 classification_kwds = {}
406 if "k" not in classification_kwds:
407 classification_kwds["k"] = k
408
409 binning = classify(
410 np.asarray(gdf[column][~nan_idx]), scheme, **classification_kwds
411 )
412 color = np.apply_along_axis(
413 colors.to_hex, 1, cm.get_cmap(cmap, k)(binning.yb)
414 )
415
416 else:
417
418 bins = np.linspace(vmin, vmax, 257)[1:]
419 binning = classify(
420 np.asarray(gdf[column][~nan_idx]), "UserDefined", bins=bins
421 )
422
423 color = np.apply_along_axis(
424 colors.to_hex, 1, cm.get_cmap(cmap, 256)(binning.yb)
425 )
426
427 # set default style
428 if "fillOpacity" not in style_kwds:
429 style_kwds["fillOpacity"] = 0.5
430 if "weight" not in style_kwds:
431 style_kwds["weight"] = 2
432
433 # specify color
434 if color is not None:
435 if (
436 isinstance(color, str)
437 and isinstance(gdf, geopandas.GeoDataFrame)
438 and color in gdf.columns
439 ): # use existing column
440
441 def _style_color(x):
442 return {
443 "fillColor": x["properties"][color],
444 **style_kwds,
445 }
446
447 style_function = _style_color
448 else: # assign new column
449 if isinstance(gdf, geopandas.GeoSeries):
450 gdf = geopandas.GeoDataFrame(geometry=gdf)
451
452 if nan_idx is not None and nan_idx.any():
453 nan_color = missing_kwds.pop("color", None)
454
455 gdf["__folium_color"] = nan_color
456 gdf.loc[~nan_idx, "__folium_color"] = color
457 else:
458 gdf["__folium_color"] = color
459
460 stroke_color = style_kwds.pop("color", None)
461 if not stroke_color:
462
463 def _style_column(x):
464 return {
465 "fillColor": x["properties"]["__folium_color"],
466 "color": x["properties"]["__folium_color"],
467 **style_kwds,
468 }
469
470 style_function = _style_column
471 else:
472
473 def _style_stroke(x):
474 return {
475 "fillColor": x["properties"]["__folium_color"],
476 "color": stroke_color,
477 **style_kwds,
478 }
479
480 style_function = _style_stroke
481 else: # use folium default
482
483 def _style_default(x):
484 return {**style_kwds}
485
486 style_function = _style_default
487
488 if highlight:
489 if "fillOpacity" not in highlight_kwds:
490 highlight_kwds["fillOpacity"] = 0.75
491
492 def _style_highlight(x):
493 return {**highlight_kwds}
494
495 highlight_function = _style_highlight
496 else:
497 highlight_function = None
498
499 # define default for points
500 if marker_type is None:
501 marker_type = "circle_marker"
502
503 marker = marker_type
504 if isinstance(marker_type, str):
505 if marker_type == "marker":
506 marker = folium.Marker(**marker_kwds)
507 elif marker_type == "circle":
508 marker = folium.Circle(**marker_kwds)
509 elif marker_type == "circle_marker":
510 marker_kwds["radius"] = marker_kwds.get("radius", 2)
511 marker_kwds["fill"] = marker_kwds.get("fill", True)
512 marker = folium.CircleMarker(**marker_kwds)
513 else:
514 raise ValueError(
515 "Only 'marker', 'circle', and 'circle_marker' are "
516 "supported as marker values"
517 )
518
519 # remove additional geometries
520 if isinstance(gdf, geopandas.GeoDataFrame):
521 non_active_geoms = [
522 name
523 for name, val in (gdf.dtypes == "geometry").items()
524 if val and name != gdf.geometry.name
525 ]
526 gdf = gdf.drop(columns=non_active_geoms)
527
528 # prepare tooltip and popup
529 if isinstance(gdf, geopandas.GeoDataFrame):
530 # add named index to the tooltip
531 if gdf.index.name is not None:
532 gdf = gdf.reset_index()
533 # specify fields to show in the tooltip
534 tooltip = _tooltip_popup("tooltip", tooltip, gdf, **tooltip_kwds)
535 popup = _tooltip_popup("popup", popup, gdf, **popup_kwds)
536 else:
537 tooltip = None
538 popup = None
539
540 # add dataframe to map
541 folium.GeoJson(
542 gdf.__geo_interface__,
543 tooltip=tooltip,
544 popup=popup,
545 marker=marker,
546 style_function=style_function,
547 highlight_function=highlight_function,
548 **kwargs,
549 ).add_to(m)
550
551 if legend:
552 # NOTE: overlaps will be resolved in branca #88
553 caption = column if not column == "__plottable_column" else ""
554 caption = legend_kwds.pop("caption", caption)
555 if categorical:
556 categories = cat.categories.to_list()
557 legend_colors = legend_colors.tolist()
558
559 if nan_idx.any() and nan_color:
560 categories.append(missing_kwds.pop("label", "NaN"))
561 legend_colors.append(nan_color)
562
563 _categorical_legend(m, caption, categories, legend_colors)
564 elif column is not None:
565
566 cbar = legend_kwds.pop("colorbar", True)
567 colormap_kwds = {}
568 if "max_labels" in legend_kwds:
569 colormap_kwds["max_labels"] = legend_kwds.pop("max_labels")
570 if scheme:
571 cb_colors = np.apply_along_axis(
572 colors.to_hex, 1, cm.get_cmap(cmap, binning.k)(range(binning.k))
573 )
574 if cbar:
575 if legend_kwds.pop("scale", True):
576 index = [vmin] + binning.bins.tolist()
577 else:
578 index = None
579 colorbar = bc.colormap.StepColormap(
580 cb_colors,
581 vmin=vmin,
582 vmax=vmax,
583 caption=caption,
584 index=index,
585 **colormap_kwds,
586 )
587 else:
588 fmt = legend_kwds.pop("fmt", "{:.2f}")
589 if "labels" in legend_kwds:
590 categories = legend_kwds["labels"]
591 else:
592 categories = binning.get_legend_classes(fmt)
593 show_interval = legend_kwds.pop("interval", False)
594 if not show_interval:
595 categories = [c[1:-1] for c in categories]
596
597 if nan_idx.any() and nan_color:
598 categories.append(missing_kwds.pop("label", "NaN"))
599 cb_colors = np.append(cb_colors, nan_color)
600 _categorical_legend(m, caption, categories, cb_colors)
601
602 else:
603 if isinstance(cmap, bc.colormap.ColorMap):
604 colorbar = cmap
605 else:
606
607 mp_cmap = cm.get_cmap(cmap)
608 cb_colors = np.apply_along_axis(
609 colors.to_hex, 1, mp_cmap(range(mp_cmap.N))
610 )
611 # linear legend
612 if mp_cmap.N > 20:
613 colorbar = bc.colormap.LinearColormap(
614 cb_colors,
615 vmin=vmin,
616 vmax=vmax,
617 caption=caption,
618 **colormap_kwds,
619 )
620
621 # steps
622 else:
623 colorbar = bc.colormap.StepColormap(
624 cb_colors,
625 vmin=vmin,
626 vmax=vmax,
627 caption=caption,
628 **colormap_kwds,
629 )
630
631 if cbar:
632 if nan_idx.any() and nan_color:
633 _categorical_legend(
634 m, "", [missing_kwds.pop("label", "NaN")], [nan_color]
635 )
636 m.add_child(colorbar)
637
638 return m
639
640
641 def _tooltip_popup(type, fields, gdf, **kwds):
642 """get tooltip or popup"""
643 import folium
644
645 # specify fields to show in the tooltip
646 if fields is False or fields is None or fields == 0:
647 return None
648 else:
649 if fields is True:
650 fields = gdf.columns.drop(gdf.geometry.name).to_list()
651 elif isinstance(fields, int):
652 fields = gdf.columns.drop(gdf.geometry.name).to_list()[:fields]
653 elif isinstance(fields, str):
654 fields = [fields]
655
656 for field in ["__plottable_column", "__folium_color"]:
657 if field in fields:
658 fields.remove(field)
659
660 # Cast fields to str
661 fields = list(map(str, fields))
662 if type == "tooltip":
663 return folium.GeoJsonTooltip(fields, **kwds)
664 elif type == "popup":
665 return folium.GeoJsonPopup(fields, **kwds)
666
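How the field selection above looks from the caller's side (sketch; ``world`` and its columns are hypothetical):

    world.explore(tooltip=2)                   # first two non-geometry columns
    world.explore(tooltip="name", popup=True)  # one tooltip column, all in the popup
    world.explore(tooltip=False)               # no tooltip at all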
667
668 def _categorical_legend(m, title, categories, colors):
669 """
670 Add categorical legend to a map
671
672 The implementation is using the code originally written by Michel Metran
673 (@michelmetran) and released on GitHub
674 (https://github.com/michelmetran/package_folium) under MIT license.
675
676 Copyright (c) 2020 Michel Metran
677
678 Parameters
679 ----------
680 m : folium.Map
681 Existing map instance on which to draw the plot
682 title : str
683 title of the legend (e.g. column name)
684 categories : list-like
685 list of categories
686 colors : list-like
687 list of colors (in the same order as categories)
688 """
689
690 # Header to Add
691 head = """
692 {% macro header(this, kwargs) %}
693 <script src="https://code.jquery.com/ui/1.12.1/jquery-ui.js"></script>
694 <script>$( function() {
695 $( ".maplegend" ).draggable({
696 start: function (event, ui) {
697 $(this).css({
698 right: "auto",
699 top: "auto",
700 bottom: "auto"
701 });
702 }
703 });
704 });
705 </script>
706 <style type='text/css'>
707 .maplegend {
708 position: absolute;
709 z-index:9999;
710 background-color: rgba(255, 255, 255, .8);
711 border-radius: 5px;
712 box-shadow: 0 0 15px rgba(0,0,0,0.2);
713 padding: 10px;
714 font: 12px/14px Arial, Helvetica, sans-serif;
715 right: 10px;
716 bottom: 20px;
717 }
718 .maplegend .legend-title {
719 text-align: left;
720 margin-bottom: 5px;
721 font-weight: bold;
722 }
723 .maplegend .legend-scale ul {
724 margin: 0;
725 margin-bottom: 0px;
726 padding: 0;
727 float: left;
728 list-style: none;
729 }
730 .maplegend .legend-scale ul li {
731 list-style: none;
732 margin-left: 0;
733 line-height: 16px;
734 margin-bottom: 2px;
735 }
736 .maplegend ul.legend-labels li span {
737 display: block;
738 float: left;
739 height: 14px;
740 width: 14px;
741 margin-right: 5px;
742 margin-left: 0;
743 border: 0px solid #ccc;
744 }
745 .maplegend .legend-source {
746 color: #777;
747 clear: both;
748 }
749 .maplegend a {
750 color: #777;
751 }
752 </style>
753 {% endmacro %}
754 """
755 import branca as bc
756
757 # Add CSS (on Header)
758 macro = bc.element.MacroElement()
759 macro._template = bc.element.Template(head)
760 m.get_root().add_child(macro)
761
762 body = f"""
763 <div id='maplegend {title}' class='maplegend'>
764 <div class='legend-title'>{title}</div>
765 <div class='legend-scale'>
766 <ul class='legend-labels'>"""
767
768 # Loop Categories
769 for label, color in zip(categories, colors):
770 body += f"""
771 <li><span style='background:{color}'></span>{label}</li>"""
772
773 body += """
774 </ul>
775 </div>
776 </div>
777 """
778
779 # Add Body
780 body = bc.element.Element(body, "legend")
781 m.get_root().html.add_child(body)
782
783
784 def _explore_geoseries(
785 s,
786 color=None,
787 m=None,
788 tiles="OpenStreetMap",
789 attr=None,
790 highlight=True,
791 width="100%",
792 height="100%",
793 control_scale=True,
794 marker_type=None,
795 marker_kwds={},
796 style_kwds={},
797 highlight_kwds={},
798 **kwargs,
799 ):
800 """Interactive map based on GeoPandas and folium/leaflet.js
801
802 Generate an interactive leaflet map based on :class:`~geopandas.GeoSeries`
803
804 Parameters
805 ----------
806 color : str, array-like (default None)
807 Named color or a list-like of colors (named or hex).
808 m : folium.Map (default None)
809 Existing map instance on which to draw the plot.
810 tiles : str, xyzservices.TileProvider (default 'OpenStreetMap')
811 Map tileset to use. Can choose from the list supported by folium, query a
812 :class:`xyzservices.TileProvider` by a name from ``xyzservices.providers``,
813 pass :class:`xyzservices.TileProvider` object or pass custom XYZ URL.
814 The current list of built-in providers (when ``xyzservices`` is not available):
815
816 ``["OpenStreetMap", "Stamen Terrain", "Stamen Toner", "Stamen Watercolor",
817 "CartoDB positron", "CartoDB dark_matter"]``
818
819 You can pass a custom tileset to Folium by passing a Leaflet-style URL
820 to the tiles parameter: ``http://{s}.yourtiles.com/{z}/{x}/{y}.png``.
821 Be sure to check their terms and conditions and to provide attribution with
822 the ``attr`` keyword.
823 attr : str (default None)
824 Map tile attribution; only required if passing custom tile URL.
825 highlight : bool (default True)
826 Enable highlight functionality when hovering over a geometry.
827 width : pixel int or percentage string (default: '100%')
828 Width of the folium :class:`~folium.folium.Map`. If the argument
829 m is given explicitly, width is ignored.
830 height : pixel int or percentage string (default: '100%')
831 Height of the folium :class:`~folium.folium.Map`. If the argument
832 m is given explicitly, height is ignored.
833 control_scale : bool (default True)
834 Whether to add a control scale on the map.
835 marker_type : str, folium.Circle, folium.CircleMarker, folium.Marker (default None)
836 Allowed string options are ('marker', 'circle', 'circle_marker'). Defaults to
837 folium.CircleMarker.
838 marker_kwds: dict (default {})
839 Additional keywords to be passed to the selected ``marker_type``, e.g.:
840
841 radius : float
842 Radius of the circle, in meters (for ``circle``) or pixels
843 (for ``circle_marker``).
844 icon : folium.map.Icon
845 the :class:`folium.map.Icon` object to use to render the marker.
846 draggable : bool (default False)
847 Set to True to be able to drag the marker around the map.
848
849 style_kwds : dict (default {})
850 Additional style to be passed to folium ``style_function``:
851
852 stroke : bool (default True)
853 Whether to draw stroke along the path. Set it to ``False`` to
854 disable borders on polygons or circles.
855 color : str
856 Stroke color
857 weight : int
858 Stroke width in pixels
859 opacity : float (default 1.0)
860 Stroke opacity
861 fill : boolean (default True)
862 Whether to fill the path with color. Set it to ``False`` to
863 disable filling on polygons or circles.
864 fillColor : str
865 Fill color. Defaults to the value of the color option
866 fillOpacity : float (default 0.5)
867 Fill opacity.
868
869 Plus all supported by :func:`folium.vector_layers.path_options`. See the
870 documentation of :class:`folium.features.GeoJson` for details.
871
872 highlight_kwds : dict (default {})
873 Style to be passed to folium highlight_function. Uses the same keywords
874 as ``style_kwds``. When empty, defaults to ``{"fillOpacity": 0.75}``.
875
876 **kwargs : dict
877 Additional options to be passed on to the folium object.
878
879 Returns
880 -------
881 m : folium.folium.Map
882 folium :class:`~folium.folium.Map` instance
883
884 """
885 return _explore(
886 s,
887 color=color,
888 m=m,
889 tiles=tiles,
890 attr=attr,
891 highlight=highlight,
892 width=width,
893 height=height,
894 control_scale=control_scale,
895 marker_type=marker_type,
896 marker_kwds=marker_kwds,
897 style_kwds=style_kwds,
898 highlight_kwds=highlight_kwds,
899 **kwargs,
900 )
33 import numpy as np
44 import pandas as pd
55 from pandas import DataFrame, Series
6 from pandas.core.accessor import CachedAccessor
67
78 from shapely.geometry import mapping, shape
89 from shapely.geometry.base import BaseGeometry
910
10
1111 from pyproj import CRS
1212
1313 from geopandas.array import GeometryArray, GeometryDtype, from_shapely, to_wkb, to_wkt
1414 from geopandas.base import GeoPandasBase, is_geometry_type
15 from geopandas.geoseries import GeoSeries, inherit_doc
15 from geopandas.geoseries import GeoSeries
1616 import geopandas.io
17 from geopandas.plotting import plot_dataframe
17 from geopandas.explore import _explore
1818 from . import _compat as compat
19 from ._decorator import doc
1920
2021
2122 DEFAULT_GEO_COLUMN_NAME = "geometry"
3233 """
3334 if is_geometry_type(data):
3435 if isinstance(data, Series):
35 return GeoSeries(data)
36 data = GeoSeries(data)
37 if data.crs is None:
38 data.crs = crs
3639 return data
3740 else:
3841 if isinstance(data, Series):
4144 else:
4245 out = from_shapely(data, crs=crs)
4346 return out
47
48
49 def _crs_mismatch_warning():
50 # TODO: raise error in 0.9 or 0.10.
51 warnings.warn(
52 "CRS mismatch between CRS of the passed geometries "
53 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
54 "allow_override=True)' to overwrite CRS or "
55 "'GeoDataFrame.to_crs(crs)' to reproject geometries. "
56 "CRS mismatch will raise an error in the future versions "
57 "of GeoPandas.",
58 FutureWarning,
59 stacklevel=3,
60 )
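A minimal sketch of a call that now routes through this helper (the second constructor call passes a CRS conflicting with the one already set):

    import geopandas
    from shapely.geometry import Point

    gdf = geopandas.GeoDataFrame(geometry=[Point(0, 0)], crs="EPSG:4326")
    geopandas.GeoDataFrame(gdf, crs="EPSG:3857")  # emits the FutureWarning above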
4461
4562
4663 class GeoDataFrame(GeoPandasBase, DataFrame):
99116
100117 _geometry_column_name = DEFAULT_GEO_COLUMN_NAME
101118
102 def __init__(self, *args, geometry=None, crs=None, **kwargs):
119 def __init__(self, data=None, *args, geometry=None, crs=None, **kwargs):
103120 with compat.ignore_shapely2_warnings():
104 super(GeoDataFrame, self).__init__(*args, **kwargs)
121 super().__init__(data, *args, **kwargs)
105122
106123 # need to set this before calling self['geometry'], because
107124 # getitem accesses crs
113130 # but within a try/except because currently non-geometries are
114131 # allowed in that case
115132 # TODO do we want to raise / return normal DataFrame in this case?
133
134 # if gdf passed in and geo_col is set, we use that for geometry
135 if geometry is None and isinstance(data, GeoDataFrame):
136 self._geometry_column_name = data._geometry_column_name
137 if crs is not None and data.crs != crs:
138 _crs_mismatch_warning()
139 # TODO: raise error in 0.9 or 0.10.
140 return
141
116142 if geometry is None and "geometry" in self.columns:
143 # Check for multiple columns with name "geometry". If there are,
144 # self["geometry"] is a gdf and constructor gets recursively recalled
145 # by pandas internals trying to access this
146 if (self.columns == "geometry").sum() > 1:
147 raise ValueError(
148 "GeoDataFrame does not support multiple columns "
149 "using the geometry column name 'geometry'."
150 )
151
117152 # only if we have actual geometry values -> call set_geometry
118153 index = self.index
119154 try:
123158 and crs
124159 and not self["geometry"].values.crs == crs
125160 ):
126 warnings.warn(
127 "CRS mismatch between CRS of the passed geometries "
128 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
129 "allow_override=True)' to overwrite CRS or "
130 "'GeoDataFrame.to_crs(crs)' to reproject geometries. "
131 "CRS mismatch will raise an error in the future versions "
132 "of GeoPandas.",
133 FutureWarning,
134 stacklevel=2,
135 )
161 _crs_mismatch_warning()
136162 # TODO: raise error in 0.9 or 0.10.
137163 self["geometry"] = _ensure_geometry(self["geometry"].values, crs)
138164 except TypeError:
152178 and crs
153179 and not geometry.crs == crs
154180 ):
155 warnings.warn(
156 "CRS mismatch between CRS of the passed geometries "
157 "and 'crs'. Use 'GeoDataFrame.set_crs(crs, "
158 "allow_override=True)' to overwrite CRS or "
159 "'GeoDataFrame.to_crs(crs)' to reproject geometries. "
160 "CRS mismatch will raise an error in the future versions "
161 "of GeoPandas.",
162 FutureWarning,
163 stacklevel=2,
164 )
181 _crs_mismatch_warning()
165182 # TODO: raise error in 0.9 or 0.10.
166183 self.set_geometry(geometry, inplace=True)
167184
178195 if attr == "geometry":
179196 object.__setattr__(self, attr, val)
180197 else:
181 super(GeoDataFrame, self).__setattr__(attr, val)
198 super().__setattr__(attr, val)
182199
183200 def _get_geometry(self):
184201 if self._geometry_column_name not in self:
267284 raise ValueError("Must pass array with one dimension only.")
268285 else:
269286 try:
270 level = frame[col].values
287 level = frame[col]
271288 except KeyError:
272289 raise ValueError("Unknown column %s" % col)
273290 except Exception:
274291 raise
292 if isinstance(level, DataFrame):
293 raise ValueError(
294 "GeoDataFrame does not support setting the geometry column where "
295 "the column name is shared by multiple columns."
296 )
297
275298 if drop:
276299 to_remove = col
277300 geo_column_name = self._geometry_column_name
428451 def from_dict(cls, data, geometry=None, crs=None, **kwargs):
429452 """
430453 Construct GeoDataFrame from dict of array-like or dicts by
431 overiding DataFrame.from_dict method with geometry and crs
454 overriding DataFrame.from_dict method with geometry and crs
432455
433456 Parameters
434457 ----------
635658 PostGIS
636659
637660 >>> from sqlalchemy import create_engine # doctest: +SKIP
638 >>> db_connection_url = "postgres://myusername:mypassword@myhost:5432/mydb"
661 >>> db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydb"
639662 >>> con = create_engine(db_connection_url) # doctest: +SKIP
640663 >>> sql = "SELECT geom, highway FROM roads"
641664 >>> df = geopandas.GeoDataFrame.from_postgis(sql, con) # doctest: +SKIP
769792 na : str, optional
770793 Options are {'null', 'drop', 'keep'}, default 'null'.
771794 Indicates how to output missing (NaN) values in the GeoDataFrame
772 * null: ouput the missing entries as JSON null
773 * drop: remove the property from the feature. This applies to
774 each feature individually so that features may have
775 different properties
776 * keep: output the missing entries as NaN
795
796 - null: output the missing entries as JSON null
797 - drop: remove the property from the feature. This applies to each feature \
798 individually so that features may have different properties
799 - keep: output the missing entries as NaN
800
777801 show_bbox : bool, optional
778802 Include bbox (bounds) in the geojson. Default False.
779803 drop_id : bool, default: False
808832
809833 ids = np.array(self.index, copy=False)
810834 geometries = np.array(self[self._geometry_column_name], copy=False)
835
836 if not self.columns.is_unique:
837 raise ValueError("GeoDataFrame cannot contain duplicated column names.")
811838
812839 properties_cols = self.columns.difference([self._geometry_column_name])
813840
955982 compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
956983 Name of the compression to use. Use ``None`` for no compression.
957984 kwargs
958 Additional keyword arguments passed to to pyarrow.parquet.write_table().
985 Additional keyword arguments passed to :func:`pyarrow.parquet.write_table`.
959986
960987 Examples
961988 --------
10031030 Name of the compression to use. Use ``"uncompressed"`` for no
10041031 compression. By default uses LZ4 if available, otherwise uncompressed.
10051032 kwargs
1006 Additional keyword arguments passed to to pyarrow.feather.write_feather().
1033 Additional keyword arguments passed to
1034 :func:`pyarrow.feather.write_feather`.
10071035
10081036 Examples
10091037 --------
10201048
10211049 _to_feather(self, path, index=index, compression=compression, **kwargs)
10221050
1023 def to_file(
1024 self, filename, driver="ESRI Shapefile", schema=None, index=None, **kwargs
1025 ):
1051 def to_file(self, filename, driver=None, schema=None, index=None, **kwargs):
10261052 """Write the ``GeoDataFrame`` to a file.
10271053
10281054 By default, an ESRI shapefile is written, but any OGR data source
10361062 ----------
10371063 filename : string
10381064 File path or file handle to write to.
1039 driver : string, default: 'ESRI Shapefile'
1065 driver : string, default None
10401066 The OGR format driver used to write the vector file.
1067 If not specified, it attempts to infer it from the file extension.
1068 If no extension is specified, it saves ESRI Shapefile to a folder.
10411069 schema : dict, default: None
10421070 If specified, the schema dictionary is passed to Fiona to
10431071 better control how the file is written.
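A hedged sketch of the new extension-based inference (``gdf`` is any GeoDataFrame; file names are illustrative):

    gdf.to_file("data.gpkg")     # driver inferred as GPKG
    gdf.to_file("data.geojson")  # driver inferred as GeoJSON
    gdf.to_file("data_folder")   # no extension: ESRI Shapefile written to a folder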
12951323 GeoSeries. If it's a DataFrame with a 'geometry' column, return a
12961324 GeoDataFrame.
12971325 """
1298 result = super(GeoDataFrame, self).__getitem__(key)
1326 result = super().__getitem__(key)
12991327 geo_col = self._geometry_column_name
13001328 if isinstance(result, Series) and isinstance(result.dtype, GeometryDtype):
13011329 result.__class__ = GeoSeries
13161344 value = [value] * self.shape[0]
13171345 try:
13181346 value = _ensure_geometry(value, crs=self.crs)
1347 self._crs = value.crs
13191348 except TypeError:
13201349 warnings.warn("Geometry column does not contain geometry.")
1321 super(GeoDataFrame, self).__setitem__(key, value)
1350 super().__setitem__(key, value)
13221351
13231352 #
13241353 # Implement pandas methods
13551384 result.__class__ = DataFrame
13561385 return result
13571386
1358 @inherit_doc(pd.DataFrame)
1387 @doc(pd.DataFrame)
13591388 def apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwargs):
13601389 result = super().apply(
13611390 func, axis=axis, raw=raw, result_type=result_type, args=args, **kwargs
13631392 if (
13641393 isinstance(result, GeoDataFrame)
13651394 and self._geometry_column_name in result.columns
1366 and any(isinstance(t, GeometryDtype) for t in result.dtypes)
1395 and isinstance(result[self._geometry_column_name].dtype, GeometryDtype)
13671396 ):
1397 # apply calls _constructor which resets geom col name to geometry
1398 result._geometry_column_name = self._geometry_column_name
13681399 if self.crs is not None and result.crs is None:
13691400 result.set_crs(self.crs, inplace=True)
13701401 return result
13741405 return GeoDataFrame
13751406
13761407 def __finalize__(self, other, method=None, **kwargs):
1377 """propagate metadata from other to self """
1408 """propagate metadata from other to self"""
13781409 self = super().__finalize__(other, method=method, **kwargs)
13791410
13801411 # merge operation: using metadata of the left object
13861417 for name in self._metadata:
13871418 object.__setattr__(self, name, getattr(other.objs[0], name, None))
13881419
1420 if (self.columns == self._geometry_column_name).sum() > 1:
1421 raise ValueError(
1422 "Concat operation has resulted in multiple columns using "
1423 f"the geometry column name '{self._geometry_column_name}'.\n"
1424 f"Please ensure this column from the first DataFrame is not "
1425 f"repeated."
1426 )
13891427 return self
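The new guard fires on concat results that duplicate the geometry column name; a minimal sketch:

    import pandas as pd
    import geopandas
    from shapely.geometry import Point

    a = geopandas.GeoDataFrame({"geometry": [Point(0, 0)]})
    b = geopandas.GeoDataFrame({"geometry": [Point(1, 1)]})
    pd.concat([a, b], axis=1)  # ValueError: multiple 'geometry' columns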
13901428
13911429 def dissolve(
15121550 return aggregated
15131551
15141552 # overrides the pandas native explode method to break up features geometrically
1515 def explode(self, column=None, **kwargs):
1553 def explode(self, column=None, ignore_index=False, index_parts=None, **kwargs):
15161554 """
15171555 Explode multi-part geometries into multiple single geometries.
15181556
15201558 multiple rows with single geometries, thereby increasing the vertical
15211559 size of the GeoDataFrame.
15221560
1523 The index of the input geodataframe is no longer unique and is
1524 replaced with a multi-index (original index with additional level
1525 indicating the multiple geometries: a new zero-based index for each
1526 single part geometry per multi-part geometry).
1561 .. note:: ignore_index requires pandas 1.1.0 or newer.
1562
1563 Parameters
1564 ----------
1565 column : string, default None
1566 Column to explode. In the case of a geometry column, multi-part
1567 geometries are converted to single-part.
1568 If None, the active geometry column is used.
1569 ignore_index : bool, default False
1570 If True, the resulting index will be labelled 0, 1, …, n - 1,
1571 ignoring `index_parts`.
1572 index_parts : boolean, default True
1573 If True, the resulting index will be a multi-index (original
1574 index with an additional level indicating the multiple
1575 geometries: a new zero-based index for each single part geometry
1576 per multi-part geometry).
15271577
15281578 Returns
15291579 -------
15481598 0 name1 MULTIPOINT (1.00000 2.00000, 3.00000 4.00000)
15491599 1 name2 MULTIPOINT (2.00000 1.00000, 0.00000 0.00000)
15501600
1551 >>> exploded = gdf.explode()
1601 >>> exploded = gdf.explode(index_parts=True)
15521602 >>> exploded
15531603 col1 geometry
15541604 0 0 name1 POINT (1.00000 2.00000)
15561606 1 0 name2 POINT (2.00000 1.00000)
15571607 1 name2 POINT (0.00000 0.00000)
15581608
1609 >>> exploded = gdf.explode(index_parts=False)
1610 >>> exploded
1611 col1 geometry
1612 0 name1 POINT (1.00000 2.00000)
1613 0 name1 POINT (3.00000 4.00000)
1614 1 name2 POINT (2.00000 1.00000)
1615 1 name2 POINT (0.00000 0.00000)
1616
1617 >>> exploded = gdf.explode(ignore_index=True)
1618 >>> exploded
1619 col1 geometry
1620 0 name1 POINT (1.00000 2.00000)
1621 1 name1 POINT (3.00000 4.00000)
1622 2 name2 POINT (2.00000 1.00000)
1623 3 name2 POINT (0.00000 0.00000)
1624
15591625 See also
15601626 --------
15611627 GeoDataFrame.dissolve : dissolve geometries into a single observation.
15671633 column = self.geometry.name
15681634 # If the specified column is not a geometry dtype use pandas explode
15691635 if not isinstance(self[column].dtype, GeometryDtype):
1570 return super(GeoDataFrame, self).explode(column, **kwargs)
1571 # TODO: make sure index behaviour is consistent
1636 if compat.PANDAS_GE_11:
1637 return super().explode(column, ignore_index=ignore_index, **kwargs)
1638 else:
1639 return super().explode(column, **kwargs)
1640
1641 if index_parts is None:
1642 if not ignore_index:
1643 warnings.warn(
1644 "Currently, index_parts defaults to True, but in the future, "
1645 "it will default to False to be consistent with Pandas. "
1646 "Use `index_parts=True` to keep the current behavior; passing "
1647 "either True or False silences this warning.",
1648 FutureWarning,
1649 stacklevel=2,
1650 )
1651 index_parts = True
15721652
15731653 df_copy = self.copy()
15741654
1575 if "level_1" in df_copy.columns: # GH1393
1576 df_copy = df_copy.rename(columns={"level_1": "__level_1"})
1577
1578 exploded_geom = df_copy.geometry.explode().reset_index(level=-1)
1579 exploded_index = exploded_geom.columns[0]
1580
1581 df = pd.concat(
1582 [df_copy.drop(df_copy._geometry_column_name, axis=1), exploded_geom], axis=1
1655 level_str = f"level_{df_copy.index.nlevels}"
1656
1657 if level_str in df_copy.columns: # GH1393
1658 df_copy = df_copy.rename(columns={level_str: f"__{level_str}"})
1659
1660 if index_parts:
1661 exploded_geom = df_copy.geometry.explode(index_parts=True)
1662 exploded_index = exploded_geom.index
1663 exploded_geom = exploded_geom.reset_index(level=-1, drop=True)
1664 else:
1665 exploded_geom = df_copy.geometry.explode(index_parts=True).reset_index(
1666 level=-1, drop=True
1667 )
1668 exploded_index = exploded_geom.index
1669
1670 df = (
1671 df_copy.drop(df_copy._geometry_column_name, axis=1)
1672 .join(exploded_geom)
1673 .__finalize__(self)
15831674 )
1584 # reset to MultiIndex, otherwise df index is only first level of
1585 # exploded GeoSeries index.
1586 df.set_index(exploded_index, append=True, inplace=True)
1587 df.index.names = list(self.index.names) + [None]
1588
1589 if "__level_1" in df.columns:
1590 df = df.rename(columns={"__level_1": "level_1"})
1675
1676 if ignore_index:
1677 df.reset_index(inplace=True, drop=True)
1678 elif index_parts:
1679 # reset to MultiIndex, otherwise df index is only first level of
1680 # exploded GeoSeries index.
1681 df.set_index(exploded_index, inplace=True)
1682 df.index.names = list(self.index.names) + [None]
1683 else:
1684 df.set_index(exploded_index, inplace=True)
1685 df.index.names = self.index.names
1686
1687 if f"__{level_str}" in df.columns:
1688 df = df.rename(columns={f"__{level_str}": level_str})
15911689
15921690 geo_df = df.set_geometry(self._geometry_column_name)
15931691 return geo_df
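
Because ``index_parts`` currently defaults to ``True`` but is slated to flip to ``False``, callers can pin the behavior now. A minimal sketch (constructed data, not taken from this changeset) of passing it explicitly:

    import geopandas
    from shapely.geometry import MultiPoint

    gdf = geopandas.GeoDataFrame(
        {"col1": ["name1"]},
        geometry=[MultiPoint([(0, 0), (1, 1)])],
    )

    # Being explicit silences the FutureWarning and keeps the index shape
    # stable across geopandas versions.
    multi = gdf.explode(index_parts=True)   # MultiIndex: (0, 0), (0, 1)
    flat = gdf.explode(index_parts=False)   # original index repeated: 0, 0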
16061704 -------
16071705 GeoDataFrame or DataFrame
16081706 """
1609 df = super(GeoDataFrame, self).astype(dtype, copy=copy, errors=errors, **kwargs)
1707 df = super().astype(dtype, copy=copy, errors=errors, **kwargs)
16101708
16111709 try:
16121710 geoms = df[self._geometry_column_name]
16171715 # if the geometry column is converted to non-geometries or did not exist
16181716 # do not return a GeoDataFrame
16191717 return pd.DataFrame(df)
1718
1719 def convert_dtypes(self, *args, **kwargs):
1720 """
1721 Convert columns to best possible dtypes using dtypes supporting ``pd.NA``.
1722
1723 Always returns a GeoDataFrame as no conversions are applied to the
1724 geometry column.
1725
1726 See the pandas.DataFrame.convert_dtypes docstring for more details.
1727
1728 Returns
1729 -------
1730 GeoDataFrame
1731
1732 """
1733 # Overridden to fix GH1870, that return type is not preserved always
1734 # (and where it was, geometry col was not)
1735
1736 if not compat.PANDAS_GE_10:
1737 raise NotImplementedError(
1738 "GeoDataFrame.convert_dtypes requires pandas >= 1.0"
1739 )
1740
1741 return GeoDataFrame(
1742 super().convert_dtypes(*args, **kwargs),
1743 geometry=self.geometry.name,
1744 crs=self.crs,
1745 )
16201746
16211747 def to_postgis(
16221748 self,
16291755 chunksize=None,
16301756 dtype=None,
16311757 ):
1632
16331758 """
16341759 Upload GeoDataFrame into PostGIS database.
16351760
16691794 --------
16701795
16711796 >>> from sqlalchemy import create_engine
1672 >>> engine = create_engine("postgres://myusername:mypassword@myhost:5432\
1797 >>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432\
16731798 /mydatabase") # doctest: +SKIP
16741799 >>> gdf.to_postgis("my_table", engine) # doctest: +SKIP
16751800
17241849 )
17251850 return self.geometry.difference(other)
17261851
1727 if compat.PANDAS_GE_025:
1728 from pandas.core.accessor import CachedAccessor
1729
1730 plot = CachedAccessor("plot", geopandas.plotting.GeoplotAccessor)
1731 else:
1732
1733 def plot(self, *args, **kwargs):
1734 """Generate a plot of the geometries in the ``GeoDataFrame``.
1735 If the ``column`` parameter is given, colors plot according to values
1736 in that column, otherwise calls ``GeoSeries.plot()`` on the
1737 ``geometry`` column.
1738 Wraps the ``plot_dataframe()`` function, and documentation is copied
1739 from there.
1740 """
1741 return plot_dataframe(self, *args, **kwargs)
1742
1743 plot.__doc__ = plot_dataframe.__doc__
1852 plot = CachedAccessor("plot", geopandas.plotting.GeoplotAccessor)
1853
1854 @doc(_explore)
1855 def explore(self, *args, **kwargs):
1856 """Interactive map based on folium/leaflet.js"""
1857 return _explore(self, *args, **kwargs)
1858
1859 def sjoin(self, df, *args, **kwargs):
1860 """Spatial join of two GeoDataFrames.
1861
1862 See the User Guide page :doc:`../../user_guide/mergingdata` for details.
1863
1864 Parameters
1865 ----------
1866 df : GeoDataFrame
1867 how : string, default 'inner'
1868 The type of join:
1869
1870 * 'left': use keys from left_df; retain only left_df geometry column
1871 * 'right': use keys from right_df; retain only right_df geometry column
1872 * 'inner': use intersection of keys from both dfs; retain only
1873 left_df geometry column
1874
1875 predicate : string, default 'intersects'
1876 Binary predicate. Valid values are determined by the spatial index used.
1877 You can check the valid values in left_df or right_df as
1878 ``left_df.sindex.valid_query_predicates`` or
1879 ``right_df.sindex.valid_query_predicates``
1880 lsuffix : string, default 'left'
1881 Suffix to apply to overlapping column names (left GeoDataFrame).
1882 rsuffix : string, default 'right'
1883 Suffix to apply to overlapping column names (right GeoDataFrame).
1884
1885 Examples
1886 --------
1887 >>> countries = geopandas.read_file( \
1888 geopandas.datasets.get_path("naturalearth_lowres"))
1889 >>> cities = geopandas.read_file( \
1890 geopandas.datasets.get_path("naturalearth_cities"))
1891 >>> countries.head() # doctest: +SKIP
1892 pop_est continent name \
1893 iso_a3 gdp_md_est geometry
1894 0 920938 Oceania Fiji FJI 8374.0 \
1895 MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1896 1 53950935 Africa Tanzania TZA 150600.0 \
1897 POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
1898 2 603253 Africa W. Sahara ESH 906.5 \
1899 POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
1900 3 35623680 North America Canada CAN 1674000.0 \
1901 MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
1902 4 326625791 North America United States of America USA 18560000.0 \
1903 MULTIPOLYGON (((-122.84000 49.00000, -120.0000...
1904 >>> cities.head()
1905 name geometry
1906 0 Vatican City POINT (12.45339 41.90328)
1907 1 San Marino POINT (12.44177 43.93610)
1908 2 Vaduz POINT (9.51667 47.13372)
1909 3 Luxembourg POINT (6.13000 49.61166)
1910 4 Palikir POINT (158.14997 6.91664)
1911
1912 >>> cities_w_country_data = cities.sjoin(countries)
1913 >>> cities_w_country_data.head() # doctest: +SKIP
1914 name_left geometry index_right pop_est \
1915 continent name_right iso_a3 gdp_md_est
1916 0 Vatican City POINT (12.45339 41.90328) 141 62137802 \
1917 Europe Italy ITA 2221000.0
1918 1 San Marino POINT (12.44177 43.93610) 141 62137802 \
1919 Europe Italy ITA 2221000.0
1920 192 Rome POINT (12.48131 41.89790) 141 62137802 \
1921 Europe Italy ITA 2221000.0
1922 2 Vaduz POINT (9.51667 47.13372) 114 8754413 \
1923 Europe Austria AUT 416600.0
1924 184 Vienna POINT (16.36469 48.20196) 114 8754413 \
1925 Europe Austria AUT 416600.0
1926
1927 Notes
1928 ------
1929 Every operation in GeoPandas is planar, i.e. the potential third
1930 dimension is not taken into account.
1931
1932 See also
1933 --------
1934 GeoDataFrame.sjoin_nearest : nearest neighbor join
1935 sjoin : equivalent top-level function
1936 """
1937 return geopandas.sjoin(left_df=self, right_df=df, *args, **kwargs)
1938
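Since the valid ``predicate`` values depend on the spatial index backend, a small hedged sketch of inspecting them before joining (reusing the bundled datasets shown in the docstring above):

    import geopandas

    countries = geopandas.read_file(
        geopandas.datasets.get_path("naturalearth_lowres"))
    cities = geopandas.read_file(
        geopandas.datasets.get_path("naturalearth_cities"))

    # The available predicates differ between the rtree and PyGEOS backends.
    print(cities.sindex.valid_query_predicates)
    joined = cities.sjoin(countries, predicate="within", how="inner")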
1939 def sjoin_nearest(
1940 self,
1941 right,
1942 how="inner",
1943 max_distance=None,
1944 lsuffix="left",
1945 rsuffix="right",
1946 distance_col=None,
1947 ):
1948 """
1949 Spatial join of two GeoDataFrames based on the distance between their
1950 geometries.
1951
1952 Results will include multiple output records for a single input record
1953 where there are multiple equidistant nearest or intersected neighbors.
1954
1955 See the User Guide page
1956 https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html
1957 for more details.
1958
1959
1960 Parameters
1961 ----------
1962 right : GeoDataFrame
1963 how : string, default 'inner'
1964 The type of join:
1965
1966 * 'left': use keys from left_df; retain only left_df geometry column
1967 * 'right': use keys from right_df; retain only right_df geometry column
1968 * 'inner': use intersection of keys from both dfs; retain only
1969 left_df geometry column
1970
1971 max_distance : float, default None
1972 Maximum distance within which to query for nearest geometry.
1973 Must be greater than 0.
1974 Setting max_distance can have a significant positive impact on
1975 performance, as it reduces the number of input geometries that
1976 are evaluated for nearest items in the tree.
1977 lsuffix : string, default 'left'
1978 Suffix to apply to overlapping column names (left GeoDataFrame).
1979 rsuffix : string, default 'right'
1980 Suffix to apply to overlapping column names (right GeoDataFrame).
1981 distance_col : string, default None
1982 If set, save the distances computed between matching geometries under a
1983 column of this name in the joined GeoDataFrame.
1984
1985 Examples
1986 --------
1987 >>> countries = geopandas.read_file(geopandas.datasets.get_\
1988 path("naturalearth_lowres"))
1989 >>> cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_citi\
1990 es"))
1991 >>> countries.head(2).name # doctest: +SKIP
1992 pop_est continent name \
1993 iso_a3 gdp_md_est geometry
1994 0 920938 Oceania Fiji FJI 8374.0 MULTI\
1995 POLYGON (((180.00000 -16.06713, 180.00000...
1996 1 53950935 Africa Tanzania TZA 150600.0 POLYG\
1997 ON ((33.90371 -0.95000, 34.07262 -1.05982...
1998 >>> cities.head(2).name # doctest: +SKIP
1999 name geometry
2000 0 Vatican City POINT (12.45339 41.90328)
2001 1 San Marino POINT (12.44177 43.93610)
2002
2003 >>> cities_w_country_data = cities.sjoin_nearest(countries)
2004 >>> cities_w_country_data[['name_left', 'name_right']].head(2) # doctest: +SKIP
2005 name_left geometry index_right pop_est continent n\
2006 ame_right iso_a3 gdp_md_est
2007 0 Vatican City POINT (12.45339 41.90328) 141 62137802 Europe \
2008 Italy ITA 2221000.0
2009 1 San Marino POINT (12.44177 43.93610) 141 62137802 Europe \
2010 Italy ITA 2221000.0
2011
2012 To include the distances:
2013
2014 >>> cities_w_country_data = cities.sjoin_nearest(countries, \
2015 distance_col="distances")
2016 >>> cities_w_country_data[["name_left", "name_right", \
2017 "distances"]].head(2) # doctest: +SKIP
2018 name_left name_right distances
2019 0 Vatican City Italy 0.0
2020 1 San Marino Italy 0.0
2021
2022 In the following example, we get multiple cities for Italy because all results
2023 are equidistant (in this case zero because they intersect).
2024 In fact, we get 3 results in total:
2025
2026 >>> countries_w_city_data = cities.sjoin_nearest(countries, \
2027 distance_col="distances", how="right")
2028 >>> italy_results = \
2029 countries_w_city_data[countries_w_city_data["name_left"] == "Italy"]
2030 >>> italy_results # doctest: +SKIP
2031 name_x name_y
2032 141 Vatican City Italy
2033 141 San Marino Italy
2034 141 Rome Italy
2035
2036 See also
2037 --------
2038 GeoDataFrame.sjoin : binary predicate joins
2039 sjoin_nearest : equivalent top-level function
2040
2041 Notes
2042 -----
2043 Since this join relies on distances, results will be inaccurate
2044 if your geometries are in a geographic CRS.
2045
2046 Every operation in GeoPandas is planar, i.e. the potential third
2047 dimension is not taken into account.
2048 """
2049 return geopandas.sjoin_nearest(
2050 self,
2051 right,
2052 how=how,
2053 max_distance=max_distance,
2054 lsuffix=lsuffix,
2055 rsuffix=rsuffix,
2056 distance_col=distance_col,
2057 )
2058
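As the Notes warn, distances are computed in CRS units, so geographic coordinates give misleading results. A hedged sketch (assuming a PyGEOS-backed spatial index, which this method requires, and EPSG:3857 as an arbitrary projected CRS):

    import geopandas

    countries = geopandas.read_file(
        geopandas.datasets.get_path("naturalearth_lowres"))
    cities = geopandas.read_file(
        geopandas.datasets.get_path("naturalearth_cities"))

    # Reproject both layers so max_distance and distance_col are in metres.
    cities_m = cities.to_crs(epsg=3857)
    countries_m = countries.to_crs(epsg=3857)
    joined = cities_m.sjoin_nearest(
        countries_m, max_distance=50_000, distance_col="dist_m")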
2059 def clip(self, mask, keep_geom_type=False):
2060 """Clip points, lines, or polygon geometries to the mask extent.
2061
2062 Both layers must be in the same Coordinate Reference System (CRS).
2063 The GeoDataFrame will be clipped to the full extent of the `mask` object.
2064
2065 If there are multiple polygons in mask, data from the GeoDataFrame will be
2066 clipped to the total boundary of all polygons in mask.
2067
2068 Parameters
2069 ----------
2070 mask : GeoDataFrame, GeoSeries, (Multi)Polygon
2071 Polygon vector layer used to clip `gdf`.
2072 The mask's geometry is dissolved into one geometric feature
2073 and intersected with `gdf`.
2074 keep_geom_type : boolean, default False
2075 If True, return only geometries of original type in case of intersection
2076 resulting in multiple geometry types or GeometryCollections.
2077 If False, return all resulting geometries (potentially mixed types).
2078
2079 Returns
2080 -------
2081 GeoDataFrame
2082 Vector data (points, lines, polygons) from `gdf` clipped to
2083 polygon boundary from mask.
2084
2085 See also
2086 --------
2087 clip : equivalent top-level function
2088
2089 Examples
2090 --------
2091 Clip points (global cities) with a polygon (the South American continent):
2092
2093 >>> world = geopandas.read_file(
2094 ... geopandas.datasets.get_path('naturalearth_lowres'))
2095 >>> south_america = world[world['continent'] == "South America"]
2096 >>> capitals = geopandas.read_file(
2097 ... geopandas.datasets.get_path('naturalearth_cities'))
2098 >>> capitals.shape
2099 (202, 2)
2100
2101 >>> sa_capitals = capitals.clip(south_america)
2102 >>> sa_capitals.shape
2103 (12, 2)
2104 """
2105 return geopandas.clip(self, mask=mask, keep_geom_type=keep_geom_type)
2106
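To illustrate ``keep_geom_type`` specifically, a constructed sketch (not from the docs) where a line layer clipped by a polygon yields a stray Point at a tangency:

    import geopandas
    from shapely.geometry import LineString, Polygon

    lines = geopandas.GeoDataFrame(geometry=[
        LineString([(0, 0), (10, 0)]),         # crosses the mask -> LineString
        LineString([(0, 7), (4, 6), (8, 7)]),  # touches it at one point -> Point
    ])
    mask = Polygon([(2, -1), (8, -1), (8, 6), (2, 6)])

    mixed = lines.clip(mask)                            # LineString + Point
    lines_only = lines.clip(mask, keep_geom_type=True)  # the Point is dropped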
2107 def overlay(self, right, how="intersection", keep_geom_type=None, make_valid=True):
2108 """Perform spatial overlay between GeoDataFrames.
2109
2110 Currently only supports GeoDataFrames with uniform geometry types,
2111 i.e. containing only (Multi)Polygons, or only (Multi)Points, or a
2112 combination of (Multi)LineString and LinearRing shapes.
2113 Implements several methods that are all effectively subsets of the union.
2114
2115 See the User Guide page :doc:`../../user_guide/set_operations` for details.
2116
2117 Parameters
2118 ----------
2119 right : GeoDataFrame
2120 how : string
2121 Method of spatial overlay: 'intersection', 'union',
2122 'identity', 'symmetric_difference' or 'difference'.
2123 keep_geom_type : bool
2124 If True, return only geometries of the same geometry type the GeoDataFrame
2125 has, if False, return all resulting geometries. Default is None,
2126 which will set keep_geom_type to True but warn upon dropping
2127 geometries.
2128 make_valid : bool, default True
2129 If True, any invalid input geometries are corrected with a call to
2130 `buffer(0)`, if False, a `ValueError` is raised if any input geometries
2131 are invalid.
2132
2133 Returns
2134 -------
2135 df : GeoDataFrame
2136 GeoDataFrame with new set of polygons and attributes
2137 resulting from the overlay
2138
2139 Examples
2140 --------
2141 >>> from shapely.geometry import Polygon
2142 >>> polys1 = geopandas.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)]),
2143 ... Polygon([(2,2), (4,2), (4,4), (2,4)])])
2144 >>> polys2 = geopandas.GeoSeries([Polygon([(1,1), (3,1), (3,3), (1,3)]),
2145 ... Polygon([(3,3), (5,3), (5,5), (3,5)])])
2146 >>> df1 = geopandas.GeoDataFrame({'geometry': polys1, 'df1_data':[1,2]})
2147 >>> df2 = geopandas.GeoDataFrame({'geometry': polys2, 'df2_data':[1,2]})
2148
2149 >>> df1.overlay(df2, how='union')
2150 df1_data df2_data geometry
2151 0 1.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
2152 1 2.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
2153 2 2.0 2.0 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
2154 3 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
2155 4 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
2156 5 NaN 1.0 MULTIPOLYGON (((2.00000 2.00000, 3.00000 2.000...
2157 6 NaN 2.0 POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5....
2158
2159 >>> df1.overlay(df2, how='intersection')
2160 df1_data df2_data geometry
2161 0 1 1 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
2162 1 2 1 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
2163 2 2 2 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
2164
2165 >>> df1.overlay(df2, how='symmetric_difference')
2166 df1_data df2_data geometry
2167 0 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
2168 1 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
2169 2 NaN 1.0 MULTIPOLYGON (((2.00000 2.00000, 3.00000 2.000...
2170 3 NaN 2.0 POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5....
2171
2172 >>> df1.overlay(df2, how='difference')
2173 geometry df1_data
2174 0 POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0.... 1
2175 1 MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000... 2
2176
2177 >>> df1.overlay(df2, how='identity')
2178 df1_data df2_data geometry
2179 0 1.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
2180 1 2.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
2181 2 2.0 2.0 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
2182 3 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
2183 4 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
2184
2185 See also
2186 --------
2187 GeoDataFrame.sjoin : spatial join
2188 overlay : equivalent top-level function
2189
2190 Notes
2191 ------
2192 Every operation in GeoPandas is planar, i.e. the potential third
2193 dimension is not taken into account.
2194 """
2195 return geopandas.overlay(
2196 self, right, how=how, keep_geom_type=keep_geom_type, make_valid=make_valid
2197 )
17442198
17452199
17462200 def _dataframe_set_geometry(self, col, drop=False, inplace=False, crs=None):
1010
1111 from geopandas.base import GeoPandasBase, _delegate_property
1212 from geopandas.plotting import plot_series
13
13 from geopandas.explore import _explore_geoseries
14 import geopandas
15
16 from . import _compat as compat
17 from ._decorator import doc
1418 from .array import (
1519 GeometryDtype,
1620 from_shapely,
1721 from_wkb,
1822 from_wkt,
23 points_from_xy,
1924 to_wkb,
2025 to_wkt,
2126 )
2227 from .base import is_geometry_type
23 from . import _compat as compat
2428
2529
2630 _SERIES_WARNING_MSG = """\
4650 return GeoSeries(data=data, index=index, crs=crs, **kwargs)
4751 except TypeError:
4852 return Series(data=data, index=index, **kwargs)
49
50
51 def inherit_doc(cls):
52 """
53 A decorator adding a docstring from an existing method.
54 """
55
56 def decorator(decorated):
57 original_method = getattr(cls, decorated.__name__, None)
58 if original_method:
59 doc = original_method.__doc__ or ""
60 else:
61 doc = ""
62
63 decorated.__doc__ = doc
64 return decorated
65
66 return decorator
6753
6854
6955 class GeoSeries(GeoPandasBase, Series):
131117 - Lat[north]: Geodetic latitude (degree)
132118 - Lon[east]: Geodetic longitude (degree)
133119 Area of Use:
134 - name: World
120 - name: World.
135121 - bounds: (-180.0, -90.0, 180.0, 90.0)
136 Datum: World Geodetic System 1984
122 Datum: World Geodetic System 1984 ensemble
137123 - Ellipsoid: WGS 84
138124 - Prime Meridian: Greenwich
139125
203189 kwargs.pop("dtype", None)
204190 # Use Series constructor to handle input data
205191 with compat.ignore_shapely2_warnings():
192 # suppress additional warning from pandas for empty data
193 # (will always give object dtype instead of float dtype in the future,
194 # making the `if s.empty: s = s.astype(object)` below unnecessary)
195 empty_msg = "The default dtype for empty Series"
196 warnings.filterwarnings("ignore", empty_msg, DeprecationWarning)
197 warnings.filterwarnings("ignore", empty_msg, FutureWarning)
206198 s = pd.Series(data, index=index, name=name, **kwargs)
207199 # prevent trying to convert non-geometry objects
208200 if s.dtype != object:
209 if s.empty or data is None:
201 if (s.empty and s.dtype == "float64") or data is None:
202 # pd.Series with empty data gives float64 for older pandas versions
210203 s = s.astype(object)
211204 else:
212205 warnings.warn(_SERIES_WARNING_MSG, FutureWarning, stacklevel=2)
445438 return cls._from_wkb_or_wkb(from_wkt, data, index=index, crs=crs, **kwargs)
446439
447440 @classmethod
441 def from_xy(cls, x, y, z=None, index=None, crs=None, **kwargs):
442 """
443 Alternate constructor to create a :class:`~geopandas.GeoSeries` of Point
444 geometries from lists or arrays of x, y(, z) coordinates
445
446 In case of geographic coordinates, it is assumed that longitude is captured
447 by ``x`` coordinates and latitude by ``y``.
448
449 Parameters
450 ----------
451 x, y, z : iterable
452 index : array-like or Index, optional
453 The index for the GeoSeries. If not given and all coordinate inputs
454 are Series with an equal index, that index is used.
455 crs : value, optional
456 Coordinate Reference System of the geometry objects. Can be anything
457 accepted by
458 :meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>`,
459 such as an authority string (eg "EPSG:4326") or a WKT string.
460 **kwargs
461 Additional arguments passed to the Series constructor,
462 e.g. ``name``.
463
464 Returns
465 -------
466 GeoSeries
467
468 See Also
469 --------
470 GeoSeries.from_wkt
471 points_from_xy
472
473 Examples
474 --------
475
476 >>> x = [2.5, 5, -3.0]
477 >>> y = [0.5, 1, 1.5]
478 >>> s = geopandas.GeoSeries.from_xy(x, y, crs="EPSG:4326")
479 >>> s
480 0 POINT (2.50000 0.50000)
481 1 POINT (5.00000 1.00000)
482 2 POINT (-3.00000 1.50000)
483 dtype: geometry
484 """
485 if index is None:
486 if (
487 isinstance(x, Series)
488 and isinstance(y, Series)
489 and x.index.equals(y.index)
490 and (z is None or (isinstance(z, Series) and x.index.equals(z.index)))
491 ): # check if we can reuse index
492 index = x.index
493 return cls(points_from_xy(x, y, z, crs=crs), index=index, crs=crs, **kwargs)
494
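A short sketch of the index-reuse rule described above (assumed example): when every coordinate input is a Series sharing one index, that index carries over to the result.

    import pandas as pd
    import geopandas

    x = pd.Series([2.5, 5.0], index=["a", "b"])
    y = pd.Series([0.5, 1.0], index=["a", "b"])

    s = geopandas.GeoSeries.from_xy(x, y, crs="EPSG:4326")
    assert list(s.index) == ["a", "b"]  # index reused from x and y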
495 @classmethod
448496 def _from_wkb_or_wkb(
449497 cls, from_wkb_or_wkt_function, data, index=None, crs=None, **kwargs
450498 ):
484532
485533 return GeoDataFrame({"geometry": self}).__geo_interface__
486534
487 def to_file(self, filename, driver="ESRI Shapefile", index=None, **kwargs):
535 def to_file(self, filename, driver=None, index=None, **kwargs):
488536 """Write the ``GeoSeries`` to a file.
489537
490538 By default, an ESRI shapefile is written, but any OGR data source
494542 ----------
495543 filename : string
496544 File path or file handle to write to.
497 driver : string, default: 'ESRI Shapefile'
545 driver : string, default None
498546 The OGR format driver used to write the vector file.
547 If not specified, it attempts to infer it from the file extension.
548 If no extension is specified, it saves ESRI Shapefile to a folder.
499549 index : bool, default None
500550 If True, write index into one or more columns (for MultiIndex).
501551 Default None writes the index into one or more columns only if
547597
548598 def _wrapped_pandas_method(self, mtd, *args, **kwargs):
549599 """Wrap a generic pandas method to ensure it returns a GeoSeries"""
550 val = getattr(super(GeoSeries, self), mtd)(*args, **kwargs)
600 val = getattr(super(), mtd)(*args, **kwargs)
551601 if type(val) == Series:
552602 val.__class__ = GeoSeries
553603 val.crs = self.crs
556606 def __getitem__(self, key):
557607 return self._wrapped_pandas_method("__getitem__", key)
558608
559 @inherit_doc(pd.Series)
609 @doc(pd.Series)
560610 def sort_index(self, *args, **kwargs):
561611 return self._wrapped_pandas_method("sort_index", *args, **kwargs)
562612
563 @inherit_doc(pd.Series)
613 @doc(pd.Series)
564614 def take(self, *args, **kwargs):
565615 return self._wrapped_pandas_method("take", *args, **kwargs)
566616
567 @inherit_doc(pd.Series)
617 @doc(pd.Series)
568618 def select(self, *args, **kwargs):
569619 return self._wrapped_pandas_method("select", *args, **kwargs)
570620
571 @inherit_doc(pd.Series)
621 @doc(pd.Series)
572622 def apply(self, func, convert_dtype=True, args=(), **kwargs):
573623 result = super().apply(func, convert_dtype=convert_dtype, args=args, **kwargs)
574624 if isinstance(result, GeoSeries):
577627 return result
578628
579629 def __finalize__(self, other, method=None, **kwargs):
580 """ propagate metadata from other to self """
630 """propagate metadata from other to self"""
581631 # NOTE: backported from pandas master (upcoming v0.13)
582632 for name in self._metadata:
583633 object.__setattr__(self, name, getattr(other, name, None))
636686 stacklevel=2,
637687 )
638688
639 return super(GeoSeries, self).isna()
689 return super().isna()
640690
641691 def isnull(self):
642692 """Alias for `isna` method. See `isna` for more detail."""
694744 UserWarning,
695745 stacklevel=2,
696746 )
697 return super(GeoSeries, self).notna()
747 return super().notna()
698748
699749 def notnull(self):
700750 """Alias for `notna` method. See `notna` for more detail."""
740790 """
741791 if value is None:
742792 value = BaseGeometry()
743 return super(GeoSeries, self).fillna(
744 value=value, method=method, inplace=inplace, **kwargs
745 )
793 return super().fillna(value=value, method=method, inplace=inplace, **kwargs)
746794
747795 def __contains__(self, other):
748796 """Allow tests of the form "geom in s"
756804 else:
757805 return False
758806
807 @doc(plot_series)
759808 def plot(self, *args, **kwargs):
760 """Generate a plot of the geometries in the ``GeoSeries``.
761
762 Wraps the ``plot_series()`` function, and documentation is copied from
763 there.
764 """
765809 return plot_series(self, *args, **kwargs)
766810
767 plot.__doc__ = plot_series.__doc__
768
769 def explode(self):
811 @doc(_explore_geoseries)
812 def explore(self, *args, **kwargs):
813 """Interactive map based on folium/leaflet.js"""
814 return _explore_geoseries(self, *args, **kwargs)
815
816 def explode(self, ignore_index=False, index_parts=None):
770817 """
771818 Explode multi-part geometries into multiple single geometries.
772819
774821 This is analogous to PostGIS's ST_Dump(). The 'path' index is the
775822 second level of the returned MultiIndex.
776823
777 Returns
778 ------
824 Parameters
825 ----------
826 ignore_index : bool, default False
827 If True, the resulting index will be labelled 0, 1, …, n - 1,
828 ignoring `index_parts`.
829 index_parts : boolean, default True
830 If True, the resulting index will be a multi-index (original
831 index with an additional level indicating the multiple
832 geometries: a new zero-based index for each single part geometry
833 per multi-part geometry).
834
835 Returns
836 -------
779837 A GeoSeries with a MultiIndex. The levels of the MultiIndex are the
780838 original index and a zero-based integer index that counts the
781839 number of single geometries within a multi-part geometry.
791849 1 MULTIPOINT (2.00000 2.00000, 3.00000 3.00000, ...
792850 dtype: geometry
793851
794 >>> s.explode()
852 >>> s.explode(index_parts=True)
795853 0 0 POINT (0.00000 0.00000)
796854 1 POINT (1.00000 1.00000)
797855 1 0 POINT (2.00000 2.00000)
804862 GeoDataFrame.explode
805863
806864 """
865 if index_parts is None and not ignore_index:
866 warnings.warn(
867 "Currently, index_parts defaults to True, but in the future, "
868 "it will default to False to be consistent with Pandas. "
869 "Use `index_parts=True` to keep the current behavior and True/False "
870 "to silence the warning.",
871 FutureWarning,
872 stacklevel=2,
873 )
874 index_parts = True
807875
808876 if compat.USE_PYGEOS and compat.PYGEOS_GE_09:
809877 import pygeos # noqa
828896
829897 # extract original index values based on integer index
830898 outer_index = self.index.take(outer_idx)
831
832 index = MultiIndex.from_arrays(
833 [outer_index, inner_index], names=self.index.names + [None]
834 )
899 if ignore_index:
900 index = range(len(geometries))
901
902 elif index_parts:
903 nlevels = outer_index.nlevels
904 index_arrays = [
905 outer_index.get_level_values(lvl) for lvl in range(nlevels)
906 ]
907 index_arrays.append(inner_index)
908
909 index = MultiIndex.from_arrays(
910 index_arrays, names=self.index.names + [None]
911 )
912
913 else:
914 index = outer_index
835915
836916 return GeoSeries(geometries, index=index, crs=self.crs).__finalize__(self)
837917
848928 idxs = [(idx, 0)]
849929 index.extend(idxs)
850930 geometries.extend(geoms)
851 index = MultiIndex.from_tuples(index, names=self.index.names + [None])
931
932 if ignore_index:
933 index = range(len(geometries))
934
935 elif index_parts:
936 # if self.index is a MultiIndex then index is a list of nested tuples
937 if isinstance(self.index, MultiIndex):
938 index = [tuple(outer) + (inner,) for outer, inner in index]
939 index = MultiIndex.from_tuples(index, names=self.index.names + [None])
940
941 else:
942 index = [idx for idx, _ in index]
943
852944 return GeoSeries(geometries, index=index, crs=self.crs).__finalize__(self)
853945
854946 #
12041296 stacklevel=2,
12051297 )
12061298 return self.difference(other)
1299
1300 def clip(self, mask, keep_geom_type=False):
1301 """Clip points, lines, or polygon geometries to the mask extent.
1302
1303 Both layers must be in the same Coordinate Reference System (CRS).
1304 The GeoSeries will be clipped to the full extent of the `mask` object.
1305
1306 If there are multiple polygons in mask, data from the GeoSeries will be
1307 clipped to the total boundary of all polygons in mask.
1308
1309 Parameters
1310 ----------
1311 mask : GeoDataFrame, GeoSeries, (Multi)Polygon
1312 Polygon vector layer used to clip `gdf`.
1313 The mask's geometry is dissolved into one geometric feature
1314 and intersected with `gdf`.
1315 keep_geom_type : boolean, default False
1316 If True, return only geometries of original type in case of intersection
1317 resulting in multiple geometry types or GeometryCollections.
1318 If False, return all resulting geometries (potentially mixed-types).
1319
1320 Returns
1321 -------
1322 GeoSeries
1323 Vector data (points, lines, polygons) from `gdf` clipped to
1324 polygon boundary from mask.
1325
1326 See also
1327 --------
1328 clip : top-level function for clip
1329
1330 Examples
1331 --------
1332 Clip points (global cities) with a polygon (the South American continent):
1333
1334 >>> world = geopandas.read_file(
1335 ... geopandas.datasets.get_path('naturalearth_lowres'))
1336 >>> south_america = world[world['continent'] == "South America"]
1337 >>> capitals = geopandas.read_file(
1338 ... geopandas.datasets.get_path('naturalearth_cities'))
1339 >>> capitals.shape
1340 (202, 2)
1341
1342 >>> sa_capitals = capitals.geometry.clip(south_america)
1343 >>> sa_capitals.shape
1344 (12,)
1345 """
1346 return geopandas.clip(self, mask=mask, keep_geom_type=keep_geom_type)
77 from geopandas.array import from_wkb
88 from geopandas import GeoDataFrame
99 import geopandas
10
10 from .file import _expand_user
1111
1212 METADATA_VERSION = "0.1.0"
1313 # reference: https://github.com/geopandas/geo-arrow-spec
3131 # }
3232
3333
34 def _is_fsspec_url(url):
35 return (
36 isinstance(url, str)
37 and "://" in url
38 and not url.startswith(("http://", "https://"))
39 )
40
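A rough illustration of how the helper classifies paths (assumed behavior, mirroring the condition above):

    assert _is_fsspec_url("s3://bucket/data.parquet")
    assert _is_fsspec_url("memory://data.parquet")
    assert not _is_fsspec_url("https://example.com/data.parquet")  # urllib territory
    assert not _is_fsspec_url("/local/path/data.parquet")          # no scheme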
41
3442 def _create_metadata(df):
3543 """Create and encode geo metadata dict.
3644
234242 "pyarrow.parquet", extra="pyarrow is required for Parquet support."
235243 )
236244
245 path = _expand_user(path)
237246 table = _geopandas_to_arrow(df, index=index)
238247 parquet.write_table(table, path, compression=compression, **kwargs)
239248
281290 if pyarrow.__version__ < LooseVersion("0.17.0"):
282291 raise ImportError("pyarrow >= 0.17 required for Feather support")
283292
293 path = _expand_user(path)
284294 table = _geopandas_to_arrow(df, index=index)
285295 feather.write_feather(table, path, compression=compression, **kwargs)
286296
292302 df = table.to_pandas()
293303
294304 metadata = table.schema.metadata
295 if b"geo" not in metadata:
305 if metadata is None or b"geo" not in metadata:
296306 raise ValueError(
297307 """Missing geo metadata in Parquet/Feather file.
298308 Use pandas.read_parquet/read_feather() instead."""
338348 return GeoDataFrame(df, geometry=geometry)
339349
340350
341 def _read_parquet(path, columns=None, **kwargs):
351 def _get_filesystem_path(path, filesystem=None, storage_options=None):
352 """
353 Get the filesystem and path for a given filesystem and path.
354
355 If the filesystem is not None then it's just returned as is.
356 """
357 import pyarrow
358
359 if (
360 isinstance(path, str)
361 and storage_options is None
362 and filesystem is None
363 and LooseVersion(pyarrow.__version__) >= "5.0.0"
364 ):
365 # Use the native pyarrow filesystem if possible.
366 try:
367 from pyarrow.fs import FileSystem
368
369 filesystem, path = FileSystem.from_uri(path)
370 except Exception:
371 # fallback to use get_handle / fsspec for filesystems
372 # that pyarrow doesn't support
373 pass
374
375 if _is_fsspec_url(path) and filesystem is None:
376 fsspec = import_optional_dependency(
377 "fsspec", extra="fsspec is requred for 'storage_options'."
378 )
379 filesystem, path = fsspec.core.url_to_fs(path, **(storage_options or {}))
380
381 if filesystem is None and storage_options:
382 raise ValueError(
383 "Cannot provide 'storage_options' with non-fsspec path '{}'".format(path)
384 )
385
386 return filesystem, path
387
388
389 def _read_parquet(path, columns=None, storage_options=None, **kwargs):
342390 """
343391 Load a Parquet object from the file path, returning a GeoDataFrame.
344392
365413 geometry read from the file will be set as the geometry column
366414 of the returned GeoDataFrame. If no geometry columns are present,
367415 a ``ValueError`` will be raised.
416 storage_options : dict, optional
417 Extra options that make sense for a particular storage connection, e.g. host,
418 port, username, password, etc. For HTTP(S) URLs the key-value pairs are
419 forwarded to urllib as header options. For other URLs (e.g. starting with
420 "s3://", and "gcs://") the key-value pairs are forwarded to fsspec. Please
421 see fsspec and urllib for more details.
422
423 When no storage options are provided and a filesystem is implemented by
424 both ``pyarrow.fs`` and ``fsspec`` (e.g. "s3://") then the ``pyarrow.fs``
425 filesystem is preferred. Provide the instantiated fsspec filesystem using
426 the ``filesystem`` keyword if you wish to use its implementation.
368427 **kwargs
369428 Any additional kwargs passed to pyarrow.parquet.read_table().
370429
387446 parquet = import_optional_dependency(
388447 "pyarrow.parquet", extra="pyarrow is required for Parquet support."
389448 )
390
449 # TODO(https://github.com/pandas-dev/pandas/pull/41194): see if pandas
450 # adds filesystem as a keyword and match that.
451 filesystem = kwargs.pop("filesystem", None)
452 filesystem, path = _get_filesystem_path(
453 path, filesystem=filesystem, storage_options=storage_options
454 )
455
456 path = _expand_user(path)
391457 kwargs["use_pandas_metadata"] = True
392 table = parquet.read_table(path, columns=columns, **kwargs)
458 table = parquet.read_table(path, columns=columns, filesystem=filesystem, **kwargs)
393459
394460 return _arrow_to_geopandas(table)
395461
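Putting the two entry points together, a hedged usage sketch (the bucket, credentials, and ``anon`` flag are hypothetical; ``key``/``secret`` are s3fs options forwarded via fsspec):

    import fsspec
    import geopandas

    # storage_options are forwarded to the fsspec filesystem for the URL.
    gdf = geopandas.read_parquet(
        "s3://my-bucket/data.parquet",
        storage_options={"key": "...", "secret": "..."},
    )

    # Or force a specific fsspec implementation via the filesystem keyword,
    # bypassing the pyarrow.fs preference described in the docstring.
    fs = fsspec.filesystem("s3", anon=True)
    gdf = geopandas.read_parquet("my-bucket/data.parquet", filesystem=fs)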
449515 if pyarrow.__version__ < LooseVersion("0.17.0"):
450516 raise ImportError("pyarrow >= 0.17 required for Feather support")
451517
518 path = _expand_user(path)
452519 table = feather.read_table(path, columns=columns, **kwargs)
453520 return _arrow_to_geopandas(table)
0 import os
01 from distutils.version import LooseVersion
1
2 from pathlib import Path
23 import warnings
4
35 import numpy as np
46 import pandas as pd
57
1113 import fiona
1214
1315 fiona_import_error = None
16
17 # only try to import fiona.Env if the main fiona import succeeded (otherwise you
18 # can get confusing "AttributeError: module 'fiona' has no attribute '_loading'"
19 # / partially initialized module errors)
20 try:
21 from fiona import Env as fiona_env
22 except ImportError:
23 try:
24 from fiona import drivers as fiona_env
25 except ImportError:
26 fiona_env = None
27
1428 except ImportError as err:
1529 fiona = None
1630 fiona_import_error = str(err)
1731
18 try:
19 from fiona import Env as fiona_env
20 except ImportError:
21 try:
22 from fiona import drivers as fiona_env
23 except ImportError:
24 fiona_env = None
2532
2633 from geopandas import GeoDataFrame, GeoSeries
2734
3441
3542 _VALID_URLS = set(uses_relative + uses_netloc + uses_params)
3643 _VALID_URLS.discard("")
44
45 _EXTENSION_TO_DRIVER = {
46 ".bna": "BNA",
47 ".dxf": "DXF",
48 ".csv": "CSV",
49 ".shp": "ESRI Shapefile",
50 ".dbf": "ESRI Shapefile",
51 ".json": "GeoJSON",
52 ".geojson": "GeoJSON",
53 ".geojsonl": "GeoJSONSeq",
54 ".geojsons": "GeoJSONSeq",
55 ".gpkg": "GPKG",
56 ".gml": "GML",
57 ".xml": "GML",
58 ".gpx": "GPX",
59 ".gtm": "GPSTrackMaker",
60 ".gtz": "GPSTrackMaker",
61 ".tab": "MapInfo File",
62 ".mif": "MapInfo File",
63 ".mid": "MapInfo File",
64 ".dgn": "DGN",
65 }
66
67
68 def _expand_user(path):
69 """Expand paths that use ~."""
70 if isinstance(path, str):
71 path = os.path.expanduser(path)
72 elif isinstance(path, Path):
73 path = path.expanduser()
74 return path
3775
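Assumed behavior of the helper, for both supported input types:

    from pathlib import Path

    _expand_user("~/data.gpkg")        # -> "/home/<user>/data.gpkg" (str)
    _expand_user(Path("~/data.gpkg"))  # -> Path("/home/<user>/data.gpkg")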
3876
3977 def _check_fiona(func):
77115 bbox : tuple | GeoDataFrame or GeoSeries | shapely Geometry, default None
78116 Filter features by given bounding box, GeoSeries, GeoDataFrame or a
79117 shapely geometry. CRS mis-matches are resolved if given a GeoSeries
80 or GeoDataFrame. Cannot be used with mask.
118 or GeoDataFrame. Tuple is (minx, miny, maxx, maxy) to match the
119 bounds property of shapely geometry objects. Cannot be used with mask.
81120 mask : dict | GeoDataFrame or GeoSeries | shapely Geometry, default None
82121 Filter for features that intersect with the given dict-like geojson
83122 geometry, GeoSeries, GeoDataFrame or shapely geometry.
110149
111150 Reading only geometries intersecting ``bbox``:
112151
113 >>> df = geopandas.read_file("nybb.shp", bbox=(0, 10, 0, 20)) # doctest: +SKIP
152 >>> df = geopandas.read_file("nybb.shp", bbox=(0, 0, 10, 20)) # doctest: +SKIP
114153
115154 Returns
116155 -------
124163 by using the encoding keyword parameter, e.g. ``encoding='utf-8'``.
125164 """
126165 _check_fiona("'read_file' function")
166 filename = _expand_user(filename)
167
127168 if _is_url(filename):
128169 req = _urlopen(filename)
129170 path_or_bytes = req.read()
231272 return _to_file(*args, **kwargs)
232273
233274
275 def _detect_driver(path):
276 """
277 Attempt to auto-detect driver based on the extension
278 """
279 try:
280 # in case the path is a file handle
281 path = path.name
282 except AttributeError:
283 pass
284 try:
285 return _EXTENSION_TO_DRIVER[Path(path).suffix.lower()]
286 except KeyError:
287 # Assume it is a shapefile folder for now. In the future,
288 # will likely raise an exception when the expected
289 # folder writing behavior is more clearly defined.
290 return "ESRI Shapefile"
291
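Assumed behavior of the extension lookup (hypothetical paths):

    _detect_driver("roads.geojson")  # -> "GeoJSON"
    _detect_driver("roads.gpkg")     # -> "GPKG"
    _detect_driver("roads")          # no extension -> "ESRI Shapefile" folder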
292
234293 def _to_file(
235294 df,
236295 filename,
237 driver="ESRI Shapefile",
296 driver=None,
238297 schema=None,
239298 index=None,
240299 mode="w",
253312 df : GeoDataFrame to be written
254313 filename : string
255314 File path or file handle to write to.
256 driver : string, default 'ESRI Shapefile'
315 driver : string, default None
257316 The OGR format driver used to write the vector file.
317 If not specified, it attempts to infer it from the file extension.
318 If no extension is specified, it saves ESRI Shapefile to a folder.
258319 schema : dict, default None
259320 If specified, the schema dictionary is passed to Fiona to
260321 better control how the file is written. If None, GeoPandas
291352 by using the encoding keyword parameter, e.g. ``encoding='utf-8'``.
292353 """
293354 _check_fiona("'to_file' method")
355 filename = _expand_user(filename)
356
294357 if index is None:
295358 # Determine if index attribute(s) should be saved to file
296359 index = list(df.index.names) != [None] or type(df.index) not in (
306369 else:
307370 crs = df.crs
308371
372 if driver is None:
373 driver = _detect_driver(filename)
374
309375 if driver == "ESRI Shapefile" and any([len(c) > 10 for c in df.columns.tolist()]):
310376 warnings.warn(
311377 "Column names longer than 10 characters will be truncated when saved to "
137137 PostGIS
138138
139139 >>> from sqlalchemy import create_engine # doctest: +SKIP
140 >>> db_connection_url = "postgres://myusername:mypassword@myhost:5432/mydatabase"
140 >>> db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydatabase"
141141 >>> con = create_engine(db_connection_url) # doctest: +SKIP
142142 >>> sql = "SELECT geom, highway FROM roads"
143143 >>> df = geopandas.read_postgis(sql, con) # doctest: +SKIP
361361 --------
362362
363363 >>> from sqlalchemy import create_engine # doctest: +SKIP
364 >>> engine = create_engine("postgres://myusername:mypassword@myhost:5432\
364 >>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432\
365365 /mydatabase";) # doctest: +SKIP
366366 >>> gdf.to_postgis("my_table", engine) # doctest: +SKIP
367367 """
66 from pandas import DataFrame, read_parquet as pd_read_parquet
77 from pandas.testing import assert_frame_equal
88 import numpy as np
9 from shapely.geometry import box
910
1011 import geopandas
1112 from geopandas import GeoDataFrame, read_file, read_parquet, read_feather
1516 _create_metadata,
1617 _decode_metadata,
1718 _encode_metadata,
19 _get_filesystem_path,
1820 _validate_dataframe,
1921 _validate_metadata,
2022 METADATA_VERSION,
337339 read_parquet(filename)
338340
339341
342 def test_parquet_missing_metadata2(tmpdir):
343 """Missing geo metadata, such as from a parquet file created
344 from a pyarrow Table (which will also not contain pandas metadata),
345 will raise a ValueError.
346 """
347 import pyarrow.parquet as pq
348
349 table = pyarrow.table({"a": [1, 2, 3]})
350 filename = os.path.join(str(tmpdir), "test.pq")
351
352 # use pyarrow.parquet write_table (no geo metadata, but also no pandas metadata)
353 pq.write_table(table, filename)
354
355 # missing metadata will raise ValueError
356 with pytest.raises(
357 ValueError, match="Missing geo metadata in Parquet/Feather file."
358 ):
359 read_parquet(filename)
360
361
340362 @pytest.mark.parametrize(
341363 "geo_meta,error",
342364 [
477499 ImportError, match="pyarrow >= 0.17 required for Feather support"
478500 ):
479501 df.to_feather(filename)
502
503
504 def test_fsspec_url():
505 fsspec = pytest.importorskip("fsspec")
506 import fsspec.implementations.memory
507
508 class MyMemoryFileSystem(fsspec.implementations.memory.MemoryFileSystem):
509 # Simple fsspec filesystem that adds a required keyword.
510 # Attempting to use this filesystem without the keyword will raise an exception.
511 def __init__(self, is_set, *args, **kwargs):
512 self.is_set = is_set
513 super().__init__(*args, **kwargs)
514
515 fsspec.register_implementation("memory", MyMemoryFileSystem, clobber=True)
516 memfs = MyMemoryFileSystem(is_set=True)
517
518 test_dataset = "naturalearth_lowres"
519 df = read_file(get_path(test_dataset))
520
521 with memfs.open("data.parquet", "wb") as f:
522 df.to_parquet(f)
523
524 result = read_parquet("memory://data.parquet", storage_options=dict(is_set=True))
525 assert_geodataframe_equal(result, df)
526
527 result = read_parquet("memory://data.parquet", filesystem=memfs)
528 assert_geodataframe_equal(result, df)
529
530
531 def test_non_fsspec_url_with_storage_options_raises():
532 with pytest.raises(ValueError, match="storage_options"):
533 test_dataset = "naturalearth_lowres"
534 read_parquet(get_path(test_dataset), storage_options={"foo": "bar"})
535
536
537 @pytest.mark.skipif(
538 pyarrow.__version__ < LooseVersion("5.0.0"),
539 reason="pyarrow.fs requires pyarrow>=5.0.0",
540 )
541 def test_prefers_pyarrow_fs():
542 filesystem, _ = _get_filesystem_path("file:///data.parquet")
543 assert isinstance(filesystem, pyarrow.fs.LocalFileSystem)
544
545
546 def test_write_read_parquet_expand_user():
547 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="epsg:4326")
548 test_file = "~/test_file.parquet"
549 gdf.to_parquet(test_file)
550 pq_df = geopandas.read_parquet(test_file)
551 assert_geodataframe_equal(gdf, pq_df, check_crs=True)
552 os.remove(os.path.expanduser(test_file))
553
554
555 def test_write_read_feather_expand_user():
556 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs="epsg:4326")
557 test_file = "~/test_file.feather"
558 gdf.to_feather(test_file)
559 f_df = geopandas.read_feather(test_file)
560 assert_geodataframe_equal(gdf, f_df, check_crs=True)
561 os.remove(os.path.expanduser(test_file))
1212
1313 import geopandas
1414 from geopandas import GeoDataFrame, read_file
15 from geopandas.io.file import fiona_env
15 from geopandas.io.file import fiona_env, _detect_driver, _EXTENSION_TO_DRIVER
1616
1717 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
1818 from geopandas.tests.util import PACKAGE_DIR, validate_boro_df
6060 # to_file tests
6161 # -----------------------------------------------------------------------------
6262
63 driver_ext_pairs = [("ESRI Shapefile", "shp"), ("GeoJSON", "geojson"), ("GPKG", "gpkg")]
63 driver_ext_pairs = [
64 ("ESRI Shapefile", ".shp"),
65 ("GeoJSON", ".geojson"),
66 ("GPKG", ".gpkg"),
67 (None, ".shp"),
68 (None, ""),
69 (None, ".geojson"),
70 (None, ".gpkg"),
71 ]
72
73
74 def assert_correct_driver(file_path, ext):
75 # check the expected driver
76 expected_driver = "ESRI Shapefile" if ext == "" else _EXTENSION_TO_DRIVER[ext]
77 with fiona.open(str(file_path)) as fds:
78 assert fds.driver == expected_driver
6479
6580
6681 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
6782 def test_to_file(tmpdir, df_nybb, df_null, driver, ext):
68 """ Test to_file and from_file """
83 """Test to_file and from_file"""
6984 tempfilename = os.path.join(str(tmpdir), "boros." + ext)
7085 df_nybb.to_file(tempfilename, driver=driver)
7186 # Read layer back in
7590 assert np.alltrue(df["BoroName"].values == df_nybb["BoroName"])
7691
7792 # Write layer with null geometry out to file
78 tempfilename = os.path.join(str(tmpdir), "null_geom." + ext)
93 tempfilename = os.path.join(str(tmpdir), "null_geom" + ext)
7994 df_null.to_file(tempfilename, driver=driver)
8095 # Read layer back in
8196 df = GeoDataFrame.from_file(tempfilename)
8297 assert "geometry" in df
8398 assert len(df) == 2
8499 assert np.alltrue(df["Name"].values == df_null["Name"])
100 # check the expected driver
101 assert_correct_driver(tempfilename, ext)
85102
86103
87104 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
88105 def test_to_file_pathlib(tmpdir, df_nybb, df_null, driver, ext):
89 """ Test to_file and from_file """
106 """Test to_file and from_file"""
90107 temppath = pathlib.Path(os.path.join(str(tmpdir), "boros." + ext))
91108 df_nybb.to_file(temppath, driver=driver)
92109 # Read layer back in
94111 assert "geometry" in df
95112 assert len(df) == 5
96113 assert np.alltrue(df["BoroName"].values == df_nybb["BoroName"])
114 # check the expected driver
115 assert_correct_driver(temppath, ext)
97116
98117
99118 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
105124 "a": [1, 2, 3],
106125 "b": [True, False, True],
107126 "geometry": [Point(0, 0), Point(1, 1), Point(2, 2)],
108 }
127 },
128 crs=4326,
109129 )
110130
111131 df.to_file(tempfilename, driver=driver)
112132 result = read_file(tempfilename)
113 if driver == "GeoJSON":
114 # geojson by default assumes epsg:4326
115 result.crs = None
116 if driver == "ESRI Shapefile":
133 if ext in (".shp", ""):
117134 # Shapefile does not support boolean, so is read back as int
118135 df["b"] = df["b"].astype("int64")
119136 assert_geodataframe_equal(result, df)
137 # check the expected driver
138 assert_correct_driver(tempfilename, ext)
120139
121140
122141 def test_to_file_datetime(tmpdir):
124143 tempfilename = os.path.join(str(tmpdir), "test_datetime.gpkg")
125144 point = Point(0, 0)
126145 now = datetime.datetime.now()
127 df = GeoDataFrame({"a": [1, 2], "b": [now, now]}, geometry=[point, point], crs={})
146 df = GeoDataFrame({"a": [1, 2], "b": [now, now]}, geometry=[point, point], crs=4326)
128147 df.to_file(tempfilename, driver="GPKG")
129148 df_read = read_file(tempfilename)
130149 assert_geoseries_equal(df.geometry, df_read.geometry)
134153 def test_to_file_with_point_z(tmpdir, ext, driver):
135154 """Test that 3D geometries are retained in writes (GH #612)."""
136155
137 tempfilename = os.path.join(str(tmpdir), "test_3Dpoint." + ext)
156 tempfilename = os.path.join(str(tmpdir), "test_3Dpoint" + ext)
138157 point3d = Point(0, 0, 500)
139158 point2d = Point(1, 1)
140159 df = GeoDataFrame({"a": [1, 2]}, geometry=[point3d, point2d], crs=_CRS)
141160 df.to_file(tempfilename, driver=driver)
142161 df_read = GeoDataFrame.from_file(tempfilename)
143162 assert_geoseries_equal(df.geometry, df_read.geometry)
163 # check the expected driver
164 assert_correct_driver(tempfilename, ext)
144165
145166
146167 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
147168 def test_to_file_with_poly_z(tmpdir, ext, driver):
148169 """Test that 3D geometries are retained in writes (GH #612)."""
149170
150 tempfilename = os.path.join(str(tmpdir), "test_3Dpoly." + ext)
171 tempfilename = os.path.join(str(tmpdir), "test_3Dpoly" + ext)
151172 poly3d = Polygon([[0, 0, 5], [0, 1, 5], [1, 1, 5], [1, 0, 5]])
152173 poly2d = Polygon([[0, 0], [0, 1], [1, 1], [1, 0]])
153174 df = GeoDataFrame({"a": [1, 2]}, geometry=[poly3d, poly2d], crs=_CRS)
154175 df.to_file(tempfilename, driver=driver)
155176 df_read = GeoDataFrame.from_file(tempfilename)
156177 assert_geoseries_equal(df.geometry, df_read.geometry)
178 # check the expected driver
179 assert_correct_driver(tempfilename, ext)
157180
158181
159182 def test_to_file_types(tmpdir, df_points):
160 """ Test various integer type columns (GH#93) """
183 """Test various integer type columns (GH#93)"""
161184 tempfilename = os.path.join(str(tmpdir), "int.shp")
162185 int_types = [
163186 np.int8,
246269
247270 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
248271 def test_append_file(tmpdir, df_nybb, df_null, driver, ext):
249 """ Test to_file with append mode and from_file """
272 """Test to_file with append mode and from_file"""
250273 from fiona import supported_drivers
251274
275 tempfilename = os.path.join(str(tmpdir), "boros" + ext)
276 driver = driver if driver else _detect_driver(tempfilename)
252277 if "a" not in supported_drivers[driver]:
253278 return None
254279
255 tempfilename = os.path.join(str(tmpdir), "boros." + ext)
256280 df_nybb.to_file(tempfilename, driver=driver)
257281 df_nybb.to_file(tempfilename, mode="a", driver=driver)
258282 # Read layer back in
263287 assert_geodataframe_equal(df, expected, check_less_precise=True)
264288
265289 # Write layer with null geometry out to file
266 tempfilename = os.path.join(str(tmpdir), "null_geom." + ext)
290 tempfilename = os.path.join(str(tmpdir), "null_geom" + ext)
267291 df_null.to_file(tempfilename, driver=driver)
268292 df_null.to_file(tempfilename, mode="a", driver=driver)
269293 # Read layer back in
272296 assert len(df) == (2 * 2)
273297 expected = pd.concat([df_null] * 2, ignore_index=True)
274298 assert_geodataframe_equal(df, expected, check_less_precise=True)
299
300
301 @pytest.mark.parametrize("driver,ext", driver_ext_pairs)
302 def test_empty_crs(tmpdir, driver, ext):
303 """Test handling of undefined CRS with GPKG driver (GH #1975)."""
304 if ext == ".gpkg":
305 pytest.xfail("GPKG is read with Undefined geographic SRS.")
306
307 tempfilename = os.path.join(str(tmpdir), "boros" + ext)
308 df = GeoDataFrame(
309 {
310 "a": [1, 2, 3],
311 "geometry": [Point(0, 0), Point(1, 1), Point(2, 2)],
312 },
313 )
314
315 df.to_file(tempfilename, driver=driver)
316 result = read_file(tempfilename)
317
318 if ext == ".geojson":
319 # geojson by default assumes epsg:4326
320 df.crs = "EPSG:4326"
321
322 assert_geodataframe_equal(result, df)
275323
276324
277325 # -----------------------------------------------------------------------------
389437 gdf = read_file(path)
390438 assert isinstance(gdf, geopandas.GeoDataFrame)
391439
392 # Check that it can sucessfully add a zip scheme to a path that already has a scheme
440 # Check that it can successfully add a zip scheme to a path that already has a
441 # scheme
393442 gdf = read_file("file+file://" + path)
394443 assert isinstance(gdf, geopandas.GeoDataFrame)
395444
406455 assert isinstance(gdf, geopandas.GeoDataFrame)
407456
408457
409 def test_read_file_filtered(df_nybb):
410 full_df_shape = df_nybb.shape
458 def test_read_file_filtered__bbox(df_nybb):
411459 nybb_filename = geopandas.datasets.get_path("nybb")
412460 bbox = (
413461 1031051.7879884212,
416464 244317.30894023244,
417465 )
418466 filtered_df = read_file(nybb_filename, bbox=bbox)
419 filtered_df_shape = filtered_df.shape
420 assert full_df_shape != filtered_df_shape
421 assert filtered_df_shape == (2, 5)
467 expected = df_nybb[df_nybb["BoroName"].isin(["Bronx", "Queens"])]
468 assert_geodataframe_equal(filtered_df, expected.reset_index(drop=True))
469
470
471 def test_read_file_filtered__bbox__polygon(df_nybb):
472 nybb_filename = geopandas.datasets.get_path("nybb")
473 bbox = box(
474 1031051.7879884212, 224272.49231459625, 1047224.3104931959, 244317.30894023244
475 )
476 filtered_df = read_file(nybb_filename, bbox=bbox)
477 expected = df_nybb[df_nybb["BoroName"].isin(["Bronx", "Queens"])]
478 assert_geodataframe_equal(filtered_df, expected.reset_index(drop=True))
422479
423480
424481 def test_read_file_filtered__rows(df_nybb):
425 full_df_shape = df_nybb.shape
426482 nybb_filename = geopandas.datasets.get_path("nybb")
427483 filtered_df = read_file(nybb_filename, rows=1)
428 filtered_df_shape = filtered_df.shape
429 assert full_df_shape != filtered_df_shape
430 assert filtered_df_shape == (1, 5)
431
432
484 assert_geodataframe_equal(filtered_df, df_nybb.iloc[[0], :])
485
486
487 def test_read_file_filtered__rows_slice(df_nybb):
488 nybb_filename = geopandas.datasets.get_path("nybb")
489 filtered_df = read_file(nybb_filename, rows=slice(1, 3))
490 assert_geodataframe_equal(filtered_df, df_nybb.iloc[1:3, :].reset_index(drop=True))
491
492
493 @pytest.mark.filterwarnings(
494 "ignore:Layer does not support OLC_FASTFEATURECOUNT:RuntimeWarning"
495 ) # for the slice with -1
433496 def test_read_file_filtered__rows_bbox(df_nybb):
434 full_df_shape = df_nybb.shape
435497 nybb_filename = geopandas.datasets.get_path("nybb")
436498 bbox = (
437499 1031051.7879884212,
439501 1047224.3104931959,
440502 244317.30894023244,
441503 )
504 # combination bbox and rows (rows slice applied after bbox filtering!)
505 filtered_df = read_file(nybb_filename, bbox=bbox, rows=slice(4, None))
506 assert filtered_df.empty
442507 filtered_df = read_file(nybb_filename, bbox=bbox, rows=slice(-1, None))
443 filtered_df_shape = filtered_df.shape
444 assert full_df_shape != filtered_df_shape
445 assert filtered_df_shape == (1, 5)
446
447
448 def test_read_file_filtered__rows_bbox__polygon(df_nybb):
449 full_df_shape = df_nybb.shape
450 nybb_filename = geopandas.datasets.get_path("nybb")
451 bbox = box(
452 1031051.7879884212, 224272.49231459625, 1047224.3104931959, 244317.30894023244
453 )
454 filtered_df = read_file(nybb_filename, bbox=bbox, rows=slice(-1, None))
455 filtered_df_shape = filtered_df.shape
456 assert full_df_shape != filtered_df_shape
457 assert filtered_df_shape == (1, 5)
508 assert_geodataframe_equal(filtered_df, df_nybb.iloc[4:, :].reset_index(drop=True))
458509
459510
460511 def test_read_file_filtered_rows_invalid():
778829 # named DatetimeIndex
779830 df.index.name = "datetime"
780831 do_checks(df, index_is_used=True)
832
833
834 def test_to_file__undetermined_driver(tmp_path, df_nybb):
835 shpdir = tmp_path / "boros.invalid"
836 df_nybb.to_file(shpdir)
837 assert shpdir.is_dir()
838 assert list(shpdir.glob("*.shp"))
839
840
841 @pytest.mark.parametrize(
842 "test_file", [(pathlib.Path("~/test_file.geojson")), "~/test_file.geojson"]
843 )
844 def test_write_read_file(test_file):
845 gdf = geopandas.GeoDataFrame(geometry=[box(0, 0, 10, 10)], crs=_CRS)
846 gdf.to_file(test_file, driver="GeoJSON")
847 df_json = geopandas.read_file(test_file)
848 assert_geodataframe_equal(gdf, df_json, check_crs=True)
849 os.remove(os.path.expanduser(test_file))
6363
6464 try:
6565 con = sqlalchemy.create_engine(
66 URL(
66 URL.create(
6767 drivername="postgresql+psycopg2",
6868 username=user,
6969 database=dbname,
113113 def drop_table_if_exists(conn_or_engine, table):
114114 sqlalchemy = pytest.importorskip("sqlalchemy")
115115
116 if conn_or_engine.dialect.has_table(conn_or_engine, table):
116 if sqlalchemy.inspect(conn_or_engine).has_table(table):
117117 metadata = sqlalchemy.MetaData(conn_or_engine)
118118 metadata.reflect()
119119 table = metadata.tables.get(table)
11
22 import numpy as np
33 import pandas as pd
4 from pandas.plotting import PlotAccessor
45
56 import geopandas
67
78 from distutils.version import LooseVersion
9
10 from ._decorator import doc
811
912
1013 def deprecated(new):
6972 mpl = matplotlib.__version__
7073 if mpl >= LooseVersion("3.4") or (mpl > LooseVersion("3.3.2") and "+" in mpl):
7174 # alpha is supported as array argument with matplotlib 3.4+
72 scalar_kwargs = ["marker"]
75 scalar_kwargs = ["marker", "path_effects"]
7376 else:
74 scalar_kwargs = ["marker", "alpha"]
77 scalar_kwargs = ["marker", "alpha", "path_effects"]
7578
7679 for att, value in kwargs.items():
7780 if "color" in att: # color(s), edgecolor(s), facecolor(s)
722725
723726 nan_idx = np.asarray(pd.isna(values), dtype="bool")
724727
725 # Define `values` as a Series
726 if categorical:
727 if cmap is None:
728 cmap = "tab10"
729
730 cat = pd.Categorical(values, categories=categories)
731 categories = list(cat.categories)
732
733 # values missing in the Categorical but not in original values
734 missing = list(np.unique(values[~nan_idx & cat.isna()]))
735 if missing:
736 raise ValueError(
737 "Column contains values not listed in categories. "
738 "Missing categories: {}.".format(missing)
739 )
740
741 values = cat.codes[~nan_idx]
742 vmin = 0 if vmin is None else vmin
743 vmax = len(categories) - 1 if vmax is None else vmax
744
745728 if scheme is not None:
729 mc_err = (
730 "The 'mapclassify' package (>= 2.4.0) is "
731 "required to use the 'scheme' keyword."
732 )
733 try:
734 import mapclassify
735
736 except ImportError:
737 raise ImportError(mc_err)
738
739 if mapclassify.__version__ < LooseVersion("2.4.0"):
740 raise ImportError(mc_err)
741
746742 if classification_kwds is None:
747743 classification_kwds = {}
748744 if "k" not in classification_kwds:
749745 classification_kwds["k"] = k
750746
751 binning = _mapclassify_choro(values[~nan_idx], scheme, **classification_kwds)
747 binning = mapclassify.classify(
748 np.asarray(values[~nan_idx]), scheme, **classification_kwds
749 )
752750 # set categorical to True for creating the legend
753751 categorical = True
754752 if legend_kwds is not None and "labels" in legend_kwds:
760758 )
761759 )
762760 else:
763 categories = list(legend_kwds.pop("labels"))
761 labels = list(legend_kwds.pop("labels"))
764762 else:
765763 fmt = "{:.2f}"
766764 if legend_kwds is not None and "fmt" in legend_kwds:
767765 fmt = legend_kwds.pop("fmt")
768766
769 categories = binning.get_legend_classes(fmt)
767 labels = binning.get_legend_classes(fmt)
770768 if legend_kwds is not None:
771769 show_interval = legend_kwds.pop("interval", False)
772770 else:
773771 show_interval = False
774772 if not show_interval:
775 categories = [c[1:-1] for c in categories]
776 values = np.array(binning.yb)
773 labels = [c[1:-1] for c in labels]
774
775 values = pd.Categorical([np.nan] * len(values), categories=labels, ordered=True)
776 values[~nan_idx] = pd.Categorical.from_codes(
777 binning.yb, categories=labels, ordered=True
778 )
779 if cmap is None:
780 cmap = "viridis"
781
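    # Standalone sketch of the classification step above, assuming
    # mapclassify >= 2.4 (enforced earlier in this function):
    # >>> import mapclassify
    # >>> binning = mapclassify.classify([1, 2, 4, 8, 16], "quantiles", k=2)
    # >>> binning.yb                            # class code per value
    # >>> binning.get_legend_classes("{:.2f}")  # formatted interval labels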
782 # Define `values` as a Series
783 if categorical:
784 if cmap is None:
785 cmap = "tab10"
786
787 cat = pd.Categorical(values, categories=categories)
788 categories = list(cat.categories)
789
790 # values missing in the Categorical but not in original values
791 missing = list(np.unique(values[~nan_idx & cat.isna()]))
792 if missing:
793 raise ValueError(
794 "Column contains values not listed in categories. "
795 "Missing categories: {}.".format(missing)
796 )
797
798 values = cat.codes[~nan_idx]
799 vmin = 0 if vmin is None else vmin
800 vmax = len(categories) - 1 if vmax is None else vmax
777801
778802 # fill values with placeholder where were NaNs originally to map them properly
779803 # (after removing them in categorical or scheme)
902926 else:
903927 legend_kwds.setdefault("ax", ax)
904928
905 n_cmap.set_array([])
929 n_cmap.set_array(np.array([]))
906930 ax.get_figure().colorbar(n_cmap, **legend_kwds)
907931
908932 plt.draw()
909933 return ax
910934
911935
912 if geopandas._compat.PANDAS_GE_025:
913 from pandas.plotting import PlotAccessor
914
915 class GeoplotAccessor(PlotAccessor):
916
917 __doc__ = plot_dataframe.__doc__
918 _pandas_kinds = PlotAccessor._all_kinds
919
920 def __call__(self, *args, **kwargs):
921 data = self._parent.copy()
922 kind = kwargs.pop("kind", "geo")
923 if kind == "geo":
924 return plot_dataframe(data, *args, **kwargs)
925 if kind in self._pandas_kinds:
926 # Access pandas plots
927 return PlotAccessor(data)(kind=kind, **kwargs)
928 else:
929 # raise error
930 raise ValueError(f"{kind} is not a valid plot kind")
931
932 def geo(self, *args, **kwargs):
933 return self(kind="geo", *args, **kwargs)
934
935
936 def _mapclassify_choro(values, scheme, **classification_kwds):
937 """
938 Wrapper for choropleth schemes from mapclassify for use with plot_dataframe
939
940 Parameters
941 ----------
942 values
943 Series to be plotted
944 scheme : str
945 One of mapclassify classification schemes
946 Options are BoxPlot, EqualInterval, FisherJenks,
947 FisherJenksSampled, HeadTailBreaks, JenksCaspall,
948 JenksCaspallForced, JenksCaspallSampled, MaxP,
949 MaximumBreaks, NaturalBreaks, Quantiles, Percentiles, StdMean,
950 UserDefined
951
952 **classification_kwds : dict
953 Keyword arguments for classification scheme
954 For details see mapclassify documentation:
955 https://pysal.org/mapclassify/api.html
956
957 Returns
958 -------
959 binning
960 Binning objects that holds the Series with values replaced with
961 class identifier and the bins.
962 """
963 try:
964 import mapclassify.classifiers as classifiers
965
966 except ImportError:
967 raise ImportError(
968 "The 'mapclassify' >= 2.2.0 package is required to use the 'scheme' keyword"
969 )
970 from mapclassify import __version__ as mc_version
971
972 if mc_version < LooseVersion("2.2.0"):
973 raise ImportError(
974 "The 'mapclassify' >= 2.2.0 package is required to "
975 "use the 'scheme' keyword"
976 )
977 schemes = {}
978 for classifier in classifiers.CLASSIFIERS:
979 schemes[classifier.lower()] = getattr(classifiers, classifier)
980
981 scheme = scheme.lower()
982
983 # mapclassify < 2.1 cleaned up the scheme names (removing underscores)
984 # trying both to keep compatibility with older versions and provide
985 # compatibility with newer versions of mapclassify
986 oldnew = {
987 "Box_Plot": "BoxPlot",
988 "Equal_Interval": "EqualInterval",
989 "Fisher_Jenks": "FisherJenks",
990 "Fisher_Jenks_Sampled": "FisherJenksSampled",
991 "HeadTail_Breaks": "HeadTailBreaks",
992 "Jenks_Caspall": "JenksCaspall",
993 "Jenks_Caspall_Forced": "JenksCaspallForced",
994 "Jenks_Caspall_Sampled": "JenksCaspallSampled",
995 "Max_P_Plassifier": "MaxP",
996 "Maximum_Breaks": "MaximumBreaks",
997 "Natural_Breaks": "NaturalBreaks",
998 "Std_Mean": "StdMean",
999 "User_Defined": "UserDefined",
1000 }
1001 scheme_names_mapping = {}
1002 scheme_names_mapping.update(
1003 {old.lower(): new.lower() for old, new in oldnew.items()}
1004 )
1005 scheme_names_mapping.update(
1006 {new.lower(): old.lower() for old, new in oldnew.items()}
1007 )
1008
1009 try:
1010 scheme_class = schemes[scheme]
1011 except KeyError:
1012 scheme = scheme_names_mapping.get(scheme, scheme)
1013 try:
1014 scheme_class = schemes[scheme]
1015 except KeyError:
1016 raise ValueError(
1017 "Invalid scheme. Scheme must be in the set: %r" % schemes.keys()
1018 )
1019
1020 if classification_kwds["k"] is not None:
1021 from inspect import getfullargspec as getspec
1022
1023 spec = getspec(scheme_class.__init__)
1024 if "k" not in spec.args:
1025 del classification_kwds["k"]
1026 try:
1027 binning = scheme_class(np.asarray(values), **classification_kwds)
1028 except TypeError:
1029 raise TypeError("Invalid keyword argument for %r " % scheme)
1030 return binning
936 @doc(plot_dataframe)
937 class GeoplotAccessor(PlotAccessor):
938
939 _pandas_kinds = PlotAccessor._all_kinds
940
941 def __call__(self, *args, **kwargs):
942 data = self._parent.copy()
943 kind = kwargs.pop("kind", "geo")
944 if kind == "geo":
945 return plot_dataframe(data, *args, **kwargs)
946 if kind in self._pandas_kinds:
947 # Access pandas plots
948 return PlotAccessor(data)(kind=kind, **kwargs)
949 else:
950 # raise error
951 raise ValueError(f"{kind} is not a valid plot kind")
952
953 def geo(self, *args, **kwargs):
954 return self(kind="geo", *args, **kwargs)
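
With PlotAccessor now imported unconditionally at the top of the module,
GeoplotAccessor registers "geo" as the default plot kind while delegating
the regular pandas kinds. A short usage sketch (dataset and column names
refer to the bundled nybb sample):

    import geopandas

    gdf = geopandas.read_file(geopandas.datasets.get_path("nybb"))
    gdf.plot()                             # kind defaults to "geo" -> plot_dataframe
    gdf.plot.geo(color="red")              # explicit geo kind via the accessor method
    gdf.plot(kind="hist", y="Shape_Area")  # pandas kind, dispatched to PlotAccessor
    # any other kind string raises ValueError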
0 from textwrap import dedent
10 import warnings
21
32 from shapely.geometry.base import BaseGeometry
43
54 import numpy as np
65
76 from . import _compat as compat
7 from ._decorator import doc
88
99
1010 def _get_sindex_class():
167167 """
168168 raise NotImplementedError
169169
170 def nearest(
171 self, geometry, return_all=True, max_distance=None, return_distance=False
172 ):
173 """
174 Return the nearest geometry in the tree for each input geometry in
175 ``geometry``.
176
177 .. note::
178 ``nearest`` currently only works with PyGEOS >= 0.10.
179
180 Note that if PyGEOS is not available, geopandas will use rtree
181 for the spatial index, where nearest has a different
182 function signature to temporarily preserve existing
183 functionality. See the documentation of
184 :meth:`rtree.index.Index.nearest` for the details on the
185 ``rtree``-based implementation.
186
187 If multiple tree geometries have the same distance from an input geometry,
188 multiple results will be returned for that input geometry by default.
189 Specify ``return_all=False`` to only get a single nearest geometry
190 (it is non-deterministic which nearest geometry is returned).
191
192 In the context of a spatial join, input geometries are the "left"
193 geometries that determine the order of the results, and tree geometries
194 are "right" geometries that are joined against the left geometries.
195 If ``max_distance`` is not set, this will effectively be a left join
196 because every geometry in ``geometry`` will have a nearest geometry in
197 the tree. However, if ``max_distance`` is used, this becomes an
198 inner join, since some geometries in ``geometry`` may not have a match
199 in the tree.
200
201 For performance reasons, it is highly recommended that you set
202 the ``max_distance`` parameter.
203
204 Parameters
205 ----------
206 geometry : {shapely.geometry, GeoSeries, GeometryArray, numpy.array of PyGEOS \
207 geometries}
208 A single shapely geometry, one of the GeoPandas geometry iterables
209 (GeoSeries, GeometryArray), or a numpy array of PyGEOS geometries to query
210 against the spatial index.
211 return_all : bool, default True
212 If there are multiple equidistant or intersecting nearest
213 geometries, return all those geometries instead of a single
214 nearest geometry.
215 max_distance : float, optional
216 Maximum distance within which to query for nearest items in tree.
217 Must be greater than 0. By default None, indicating no distance limit.
218 return_distance : bool, optional
219 If True, will return distances in addition to indexes. By default False
220
221 Returns
222 -------
223 Indices or tuple of (indices, distances)
224 Indices is an ndarray of shape (2,n) and distances (if present) an
225 ndarray of shape (n).
226 The first subarray of indices contains input geometry indices.
227 The second subarray of indices contains tree geometry indices.
228
229 Examples
230 --------
231 >>> from shapely.geometry import Point, box
232 >>> s = geopandas.GeoSeries(geopandas.points_from_xy(range(10), range(10)))
233 >>> s.head()
234 0 POINT (0.00000 0.00000)
235 1 POINT (1.00000 1.00000)
236 2 POINT (2.00000 2.00000)
237 3 POINT (3.00000 3.00000)
238 4 POINT (4.00000 4.00000)
239 dtype: geometry
240
241 >>> s.sindex.nearest(Point(1, 1))
242 array([[0],
243 [1]])
244
245 >>> s.sindex.nearest([box(4.9, 4.9, 5.1, 5.1)])
246 array([[0],
247 [5]])
248
249 >>> s2 = geopandas.GeoSeries(geopandas.points_from_xy([7.6, 10], [7.6, 10]))
250 >>> s2
251 0 POINT (7.60000 7.60000)
252 1 POINT (10.00000 10.00000)
253 dtype: geometry
254
255 >>> s.sindex.nearest(s2)
256 array([[0, 1],
257 [8, 9]])
258 """
259 raise NotImplementedError
260
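    # Sketch of consuming the tuple form documented above (reusing the
    # ``s``/``s2`` series from the Examples section):
    # >>> idx, dist = s.sindex.nearest(s2, return_distance=True)
    # >>> idx[0]   # positions into the input geometries (s2)
    # >>> idx[1]   # positions into the tree geometries (s)
    # >>> dist     # one distance per returned pair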
170261 def intersection(self, coordinates):
171262 """Compatibility wrapper for rtree.index.Index.intersection,
172 use ``query`` intead.
263 use ``query`` instead.
173264
174265 Parameters
175266 ----------
264355 raise NotImplementedError
265356
266357
267 def doc(docstring):
268 """
269 A decorator take docstring from passed object and it to decorated one.
270 """
271
272 def decorator(decorated):
273 decorated.__doc__ = dedent(docstring.__doc__ or "")
274 return decorated
275
276 return decorator
277
278
279358 if compat.HAS_RTREE:
280359
281360 import rtree.index # noqa
299378 def intersection(self, coordinates, *args, **kwargs):
300379 return super().intersection(coordinates, *args, **kwargs)
301380
381 @doc(BaseSpatialIndex.nearest)
382 def nearest(self, *args, **kwargs):
383 return super().nearest(*args, **kwargs)
384
385 @property
302386 @doc(BaseSpatialIndex.size)
303 @property
304387 def size(self):
305388 return len(self.leaves()[0][1])
306389
390 @property
307391 @doc(BaseSpatialIndex.is_empty)
308 @property
309392 def is_empty(self):
310393 if len(self.leaves()) > 1:
311394 return False
342425 [None] * self.geometries.size, dtype=object
343426 )
344427
428 @property
345429 @doc(BaseSpatialIndex.valid_query_predicates)
346 @property
347430 def valid_query_predicates(self):
348431 return {
349432 None,
449532 input_geometry_index.extend([i] * len(res))
450533 return np.vstack([input_geometry_index, tree_index])
451534
535 def nearest(self, coordinates, num_results=1, objects=False):
536 """
537 Returns the nearest object or objects to the given coordinates.
538
539 Requires rtree, and passes parameters directly to
540 :meth:`rtree.index.Index.nearest`.
541
542 This behaviour is deprecated and will be updated to be consistent
543 with the pygeos PyGEOSSTRTreeIndex in a future release.
544
545 If longer-term compatibility is required, use
546 :meth:`rtree.index.Index.nearest` directly instead.
547
548 Examples
549 --------
550 >>> s = geopandas.GeoSeries(geopandas.points_from_xy(range(3), range(3)))
551 >>> s
552 0 POINT (0.00000 0.00000)
553 1 POINT (1.00000 1.00000)
554 2 POINT (2.00000 2.00000)
555 dtype: geometry
556
557 >>> list(s.sindex.nearest((0, 0))) # doctest: +SKIP
558 [0]
559
560 >>> list(s.sindex.nearest((0.5, 0.5))) # doctest: +SKIP
561 [0, 1]
562
563 >>> list(s.sindex.nearest((3, 3), num_results=2)) # doctest: +SKIP
564 [2, 1]
565
566 >>> list(super(type(s.sindex), s.sindex).nearest((0, 0),
567 ... num_results=2)) # doctest: +SKIP
568 [0, 1]
569
570 Parameters
571 ----------
572 coordinates : sequence or array
573 This may be an object that satisfies the numpy array protocol,
574 providing the index’s dimension * 2 coordinate pairs
575 representing the mink and maxk coordinates in each dimension
576 defining the bounds of the query window.
577 num_results : integer
578 The number of results to return nearest to the given
579 coordinates. If two index entries are equidistant, both are
580 returned. This property means that num_results may return more
581 items than specified
582 objects : True / False / ‘raw’
583 If True, the nearest method will return index objects that were
584 pickled when they were stored with each index entry, as well as
585 the id and bounds of the index entries. If ‘raw’, it will
586 return the object as entered into the database without the
587 rtree.index.Item wrapper.
588 """
589 warnings.warn(
590 "sindex.nearest using the rtree backend was not previously documented "
591 "and this behavior is deprecated in favor of matching the function "
592 "signature provided by the pygeos backend (see "
593 "PyGEOSSTRTreeIndex.nearest for details). This behavior will be "
594 "updated in a future release.",
595 FutureWarning,
596 )
597 return super().nearest(
598 coordinates, num_results=num_results, objects=objects
599 )
600
452601 @doc(BaseSpatialIndex.intersection)
453602 def intersection(self, coordinates):
454603 return super().intersection(coordinates, objects=False)
455604
605 @property
456606 @doc(BaseSpatialIndex.size)
457 @property
458607 def size(self):
459608 if hasattr(self, "_size"):
460609 size = self._size
466615 self._size = size
467616 return size
468617
618 @property
469619 @doc(BaseSpatialIndex.is_empty)
470 @property
471620 def is_empty(self):
472621 return self.geometries.size == 0 or self.size == 0
473622
480629 from . import geoseries # noqa
481630 from . import array # noqa
482631 import pygeos # noqa
632
633 _PYGEOS_PREDICATES = {p.name for p in pygeos.strtree.BinaryPredicate} | set([None])
483634
484635 class PyGEOSSTRTreeIndex(pygeos.STRtree):
485636 """A simple wrapper around pygeos's STRTree.
498649 # https://github.com/pygeos/pygeos/issues/147
499650 non_empty = geometry.copy()
500651 non_empty[pygeos.is_empty(non_empty)] = None
501 # set empty geometries to None to mantain indexing
652 # set empty geometries to None to maintain indexing
502653 super().__init__(non_empty)
503654 # store geometries, including empty geometries for user access
504655 self.geometries = geometry.copy()
520671 {'contains', 'crosses', 'covered_by', None, 'intersects', 'within', \
521672 'touches', 'overlaps', 'contains_properly', 'covers'}
522673 """
523 return pygeos.strtree.VALID_PREDICATES | set([None])
674 return _PYGEOS_PREDICATES
524675
525676 @doc(BaseSpatialIndex.query)
526677 def query(self, geometry, predicate=None, sort=False):
542693
543694 return matches
544695
696 @staticmethod
697 def _as_geometry_array(geometry):
698 """Convert geometry into a numpy array of PyGEOS geometries.
699
700 Parameters
701 ----------
702 geometry
703 An array-like of PyGEOS geometries, a GeoPandas GeoSeries/GeometryArray,
704 shapely.geometry or list of shapely geometries.
705
706 Returns
707 -------
708 np.ndarray
709 A numpy array of pygeos geometries.
710 """
711 if isinstance(geometry, np.ndarray):
712 return geometry
713 elif isinstance(geometry, geoseries.GeoSeries):
714 return geometry.values.data
715 elif isinstance(geometry, array.GeometryArray):
716 return geometry.data
717 elif isinstance(geometry, BaseGeometry):
718 return array._shapely_to_geom(geometry)
719 elif isinstance(geometry, list):
720 return np.asarray(
721 [
722 array._shapely_to_geom(el)
723 if isinstance(el, BaseGeometry)
724 else el
725 for el in geometry
726 ]
727 )
728 else:
729 return np.asarray(geometry)
730
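        # Sketch: the helper normalizes every documented input flavor to a
        # single object ndarray of pygeos geometries, e.g.:
        # >>> from shapely.geometry import Point
        # >>> PyGEOSSTRTreeIndex._as_geometry_array([Point(0, 0), Point(1, 1)])
        # array([<pygeos.Geometry POINT (0 0)>, <pygeos.Geometry POINT (1 1)>],
        #       dtype=object)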
545731 @doc(BaseSpatialIndex.query_bulk)
546732 def query_bulk(self, geometry, predicate=None, sort=False):
547733 if predicate not in self.valid_query_predicates:
550736 predicate, self.valid_query_predicates
551737 )
552738 )
553 if isinstance(geometry, geoseries.GeoSeries):
554 geometry = geometry.values.data
555 elif isinstance(geometry, array.GeometryArray):
556 geometry = geometry.data
557 elif not isinstance(geometry, np.ndarray):
558 geometry = np.asarray(geometry)
739
740 geometry = self._as_geometry_array(geometry)
559741
560742 res = super().query_bulk(geometry, predicate)
561743
566748 return np.vstack((geo_res[indexing], tree_res[indexing]))
567749
568750 return res
751
752 @doc(BaseSpatialIndex.nearest)
753 def nearest(
754 self, geometry, return_all=True, max_distance=None, return_distance=False
755 ):
756 if not compat.PYGEOS_GE_010:
757 raise NotImplementedError("sindex.nearest requires pygeos >= 0.10")
758
759 geometry = self._as_geometry_array(geometry)
760
761 if not return_all and max_distance is None and not return_distance:
762 return super().nearest(geometry)
763
764 result = super().nearest_all(
765 geometry, max_distance=max_distance, return_distance=return_distance
766 )
767 if return_distance:
768 indices, distances = result
769 else:
770 indices = result
771
772 if not return_all:
773 # first subarray of geometry indices is sorted, so we can use this
774 # trick to get the first of each index value
775 mask = np.diff(indices[0, :]).astype("bool")
776 # always select the first element
777 mask = np.insert(mask, 0, True)
778
779 indices = indices[:, mask]
780 if return_distance:
781 distances = distances[mask]
782
783 if return_distance:
784 return indices, distances
785 else:
786 return indices
569787
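        # Standalone illustration of the first-per-group mask used above:
        # >>> idx = np.array([0, 0, 1, 2, 2, 2])
        # >>> np.insert(np.diff(idx).astype("bool"), 0, True)
        # array([ True, False,  True,  True, False, False])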
570788 @doc(BaseSpatialIndex.intersection)
571789 def intersection(self, coordinates):
597815
598816 return indexes
599817
818 @property
600819 @doc(BaseSpatialIndex.size)
601 @property
602820 def size(self):
603821 return len(self)
604822
823 @property
605824 @doc(BaseSpatialIndex.is_empty)
606 @property
607825 def is_empty(self):
608826 return len(self) == 0
1111
1212 def _isna(this):
1313 """isna version that works for both scalars and (Geo)Series"""
14 if hasattr(this, "isna"):
15 return this.isna()
16 elif hasattr(this, "isnull"):
17 return this.isnull()
18 else:
19 return pd.isnull(this)
20
21
22 def geom_equals(this, that):
14 with warnings.catch_warnings():
15 # GeoSeries.isna will raise a warning about no longer returning True
16 # for empty geometries. This helper is used below always in combination
17 # with an is_empty check to preserve behaviour, and thus we ignore the
18 # warning here to avoid it bubbling up to the user
19 warnings.filterwarnings(
20 "ignore", r"GeoSeries.isna\(\) previously returned", UserWarning
21 )
22 if hasattr(this, "isna"):
23 return this.isna()
24 elif hasattr(this, "isnull"):
25 return this.isnull()
26 else:
27 return pd.isnull(this)
28
29
30 def _geom_equals_mask(this, that):
2331 """
2432 Test for geometric equality. Empty or missing geometries are considered
2533 equal.
2836 ----------
2937 this, that : arrays of Geo objects (or anything that has an `is_empty`
3038 attribute)
39
40 Returns
41 -------
42 Series
43 boolean Series, True if geometries in left equal geometries in right
3144 """
3245
3346 return (
3447 this.geom_equals(that)
3548 | (this.is_empty & that.is_empty)
3649 | (_isna(this) & _isna(that))
37 ).all()
38
39
40 def geom_almost_equals(this, that):
50 )
51
52
53 def geom_equals(this, that):
54 """
55 Test for geometric equality. Empty or missing geometries are considered
56 equal.
57
58 Parameters
59 ----------
60 this, that : arrays of Geo objects (or anything that has an `is_empty`
61 attribute)
62
63 Returns
64 -------
65 bool
66 True if all geometries in left equal geometries in right
67 """
68
69 return _geom_equals_mask(this, that).all()
70
71
72 def _geom_almost_equals_mask(this, that):
4173 """
4274 Test for 'almost' geometric equality. Empty or missing geometries
4375 are considered equal.
4981 ----------
5082 this, that : arrays of Geo objects (or anything that has an `is_empty`
5183 property)
84
85 Returns
86 -------
87 Series
88 boolean Series, True if geometries in left almost equal geometries in right
5289 """
5390
5491 return (
5592 this.geom_almost_equals(that)
5693 | (this.is_empty & that.is_empty)
5794 | (_isna(this) & _isna(that))
58 ).all()
95 )
96
97
98 def geom_almost_equals(this, that):
99 """
100 Test for 'almost' geometric equality. Empty or missing geometries
101 are considered equal.
102
103 This method allows a small difference in the coordinates, but it
104 requires the coordinates to be in the same order for all components of a geometry.
105
106 Parameters
107 ----------
108 this, that : arrays of Geo objects (or anything that has an `is_empty`
109 property)
110
111 Returns
112 -------
113 bool
114 True if all geometries in left almost equal geometries in right
115 """
116
117 return _geom_almost_equals_mask(this, that).all()
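
# A short sketch of the mask/reduce split introduced above (imports assumed:
# GeoSeries and shapely's Point):
# >>> left = GeoSeries([Point(0, 0), Point(1, 1)])
# >>> right = GeoSeries([Point(0, 0), Point(2, 2)])
# >>> _geom_equals_mask(left, right).tolist()
# [True, False]
# >>> geom_equals(left, right)
# False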
59118
60119
61120 def assert_geoseries_equal(
156215 )
157216 if check_less_precise:
158217 precise = "almost "
159 if not geom_almost_equals(left, right):
160 unequal_left_geoms = left[~left.geom_almost_equals(right)]
161 unequal_right_geoms = right[~left.geom_almost_equals(right)]
162 raise AssertionError(
163 assert_error_message.format(
164 len(unequal_left_geoms),
165 len(left),
166 unequal_left_geoms.index.to_list(),
167 precise,
168 _truncated_string(unequal_left_geoms.iloc[0]),
169 _truncated_string(unequal_right_geoms.iloc[0]),
170 )
218 equal = _geom_almost_equals_mask(left, right)
219 else:
220 precise = ""
221 equal = _geom_equals_mask(left, right)
222
223 if not equal.all():
224 unequal_left_geoms = left[~equal]
225 unequal_right_geoms = right[~equal]
226 raise AssertionError(
227 assert_error_message.format(
228 len(unequal_left_geoms),
229 len(left),
230 unequal_left_geoms.index.to_list(),
231 precise,
232 _truncated_string(unequal_left_geoms.iloc[0]),
233 _truncated_string(unequal_right_geoms.iloc[0]),
171234 )
172 else:
173 precise = ""
174 if not geom_equals(left, right):
175 unequal_left_geoms = left[~left.geom_almost_equals(right)]
176 unequal_right_geoms = right[~left.geom_almost_equals(right)]
177 raise AssertionError(
178 assert_error_message.format(
179 len(unequal_left_geoms),
180 len(left),
181 unequal_left_geoms.index.to_list(),
182 precise,
183 _truncated_string(unequal_left_geoms.iloc[0]),
184 _truncated_string(unequal_right_geoms.iloc[0]),
185 )
186 )
235 )
187236
188237
189238 def assert_geodataframe_equal(
138138 assert all(v.equals(t) for v, t in zip(res, points_no_missing))
139139
140140 # missing values
141 # TODO(pygeos) does not support empty strings
142 if compat.USE_PYGEOS:
143 L_wkb.extend([None])
144 else:
145 L_wkb.extend([b"", None])
146 res = from_wkb(L_wkb)
147 assert res[-1] is None
141 # TODO(pygeos) does not support empty strings, np.nan, or pd.NA
142 missing_values = [None]
148143 if not compat.USE_PYGEOS:
149 assert res[-2] is None
144 missing_values.extend([b"", np.nan])
145
146 if compat.PANDAS_GE_10:
147 missing_values.append(pd.NA)
148
149 res = from_wkb(missing_values)
150 np.testing.assert_array_equal(res, np.full(len(missing_values), None))
150151
151152 # single MultiPolygon
152153 multi_poly = shapely.geometry.MultiPolygon(
154155 )
155156 res = from_wkb([multi_poly.wkb])
156157 assert res[0] == multi_poly
158
159
160 def test_from_wkb_hex():
161 geometry_hex = ["0101000000CDCCCCCCCCCC1440CDCCCCCCCC0C4A40"]
162 res = from_wkb(geometry_hex)
163 assert isinstance(res, GeometryArray)
164
165 # array
166 res = from_wkb(np.array(geometry_hex, dtype=object))
167 assert isinstance(res, GeometryArray)
157168
158169
159170 def test_to_wkb():
201212 assert all(v.almost_equals(t) for v, t in zip(res, points_no_missing))
202213
203214 # missing values
204 # TODO(pygeos) does not support empty strings
205 if compat.USE_PYGEOS:
206 L_wkt.extend([None])
207 else:
208 L_wkt.extend([f(""), None])
209 res = from_wkt(L_wkt)
210 assert res[-1] is None
215 # TODO(pygeos) does not support empty strings, np.nan, or pd.NA
216 missing_values = [None]
211217 if not compat.USE_PYGEOS:
212 assert res[-2] is None
218 missing_values.extend([f(""), np.nan])
219
220 if compat.PANDAS_GE_10:
221 missing_values.append(pd.NA)
222
223 res = from_wkt(missing_values)
224 np.testing.assert_array_equal(res, np.full(len(missing_values), None))
213225
214226 # single MultiPolygon
215227 multi_poly = shapely.geometry.MultiPolygon(
444456
445457
446458 @pytest.mark.parametrize(
447 "attr", ["is_closed", "is_valid", "is_empty", "is_simple", "has_z", "is_ring"]
459 "attr",
460 [
461 "is_closed",
462 "is_valid",
463 "is_empty",
464 "is_simple",
465 "has_z",
466 # for is_ring we raise a warning about the value for Polygon changing
467 pytest.param(
468 "is_ring", marks=pytest.mark.filterwarnings("ignore:is_ring:FutureWarning")
469 ),
470 ],
448471 )
449472 def test_unary_predicates(attr):
450473 na_value = False
481504 assert result.tolist() == expected
482505
483506
507 # for is_ring we raise a warning about the value for Polygon changing
508 @pytest.mark.filterwarnings("ignore:is_ring:FutureWarning")
484509 def test_is_ring():
485510 g = [
486511 shapely.geometry.LinearRing([(0, 0), (1, 1), (1, -1)]),
922947 "EPSG:32618"
923948 )
924949
950 @pytest.mark.skipif(not compat.PYPROJ_GE_31, reason="requires pyproj 3.1 or higher")
951 def test_estimate_utm_crs__antimeridian(self):
952 antimeridian = from_shapely(
953 [
954 shapely.geometry.Point(1722483.900174921, 5228058.6143420935),
955 shapely.geometry.Point(4624385.494808555, 8692574.544944234),
956 ],
957 crs="EPSG:3851",
958 )
959 assert antimeridian.estimate_utm_crs() == CRS("EPSG:32760")
960
925961 @pytest.mark.skipif(compat.PYPROJ_LT_3, reason="requires pyproj 3 or higher")
926962 def test_estimate_utm_crs__out_of_bounds(self):
927963 with pytest.raises(RuntimeError, match="Unable to determine UTM CRS"):
219219 assert df.geometry.values.crs == self.osgb
220220
221221 # different passed CRS than array CRS is ignored
222 with pytest.warns(FutureWarning):
222 with pytest.warns(FutureWarning, match="CRS mismatch"):
223223 df = GeoDataFrame(geometry=s, crs=4326)
224224 assert df.crs == self.osgb
225225 assert df.geometry.crs == self.osgb
226226 assert df.geometry.values.crs == self.osgb
227 with pytest.warns(FutureWarning):
227 with pytest.warns(FutureWarning, match="CRS mismatch"):
228228 GeoDataFrame(geometry=s, crs=4326)
229 with pytest.warns(FutureWarning):
229 with pytest.warns(FutureWarning, match="CRS mismatch"):
230230 GeoDataFrame({"data": [1, 2], "geometry": s}, crs=4326)
231 with pytest.warns(FutureWarning):
231 with pytest.warns(FutureWarning, match="CRS mismatch"):
232232 GeoDataFrame(df, crs=4326).crs
233233
234234 # manually change CRS
267267 assert df.geometry.crs == self.wgs
268268 assert df.geometry.values.crs == self.wgs
269269
270 arr = from_shapely(self.geoms)
271 s = GeoSeries(arr, crs=27700)
270272 df = GeoDataFrame()
271273 df = df.set_geometry(s)
272274 assert df.crs == self.osgb
296298 df = GeoDataFrame({"geometry": [0, 1]})
297299 df.crs = 27700
298300 assert df.crs == self.osgb
301
302 def test_dataframe_setitem(self):
303 # new geometry CRS has priority over GDF CRS
304 arr = from_shapely(self.geoms)
305 s = GeoSeries(arr, crs=27700)
306 df = GeoDataFrame()
307 df["geometry"] = s
308 assert df.crs == self.osgb
309 assert df.geometry.crs == self.osgb
310 assert df.geometry.values.crs == self.osgb
311
312 arr = from_shapely(self.geoms, crs=27700)
313 df = GeoDataFrame()
314 df["geometry"] = arr
315 assert df.crs == self.osgb
316 assert df.geometry.crs == self.osgb
317 assert df.geometry.values.crs == self.osgb
318
319 # test to_crs case (GH1960)
320 arr = from_shapely(self.geoms)
321 df = GeoDataFrame({"col1": [1, 2], "geometry": arr}, crs=4326)
322 df["geometry"] = df["geometry"].to_crs(27700)
323 assert df.crs == self.osgb
324 assert df.geometry.crs == self.osgb
325 assert df.geometry.values.crs == self.osgb
326
327 # test changing geometry crs not in the geometry column doesn't change the crs
328 arr = from_shapely(self.geoms)
329 df = GeoDataFrame(
330 {"col1": [1, 2], "geometry": arr, "other_geom": arr}, crs=4326
331 )
332 df["other_geom"] = from_shapely(self.geoms, crs=27700)
333 assert df.crs == self.wgs
334 assert df.geometry.crs == self.wgs
335 assert df["geometry"].crs == self.wgs
336 assert df["other_geom"].crs == self.osgb
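        # Precedence in short (the values' CRS wins over the frame-level CRS,
        # and non-active geometry columns keep their own CRS):
        # >>> df = GeoDataFrame({"geometry": from_shapely(self.geoms)}, crs=4326)
        # >>> df["geometry"] = df["geometry"].to_crs(27700)
        # >>> df.crs   # -> EPSG:27700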
299337
300338 @pytest.mark.parametrize(
301339 "scalar", [None, Point(0, 0), LineString([(0, 0), (1, 1)])]
0 from textwrap import dedent
1
2 from geopandas._decorator import doc
3
4
5 @doc(method="cumsum", operation="sum")
6 def cumsum(whatever):
7 """
8 This is the {method} method.
9
10 It computes the cumulative {operation}.
11 """
12 ...
13
14
15 @doc(
16 cumsum,
17 dedent(
18 """
19 Examples
20 --------
21
22 >>> cumavg([1, 2, 3])
23 2
24 """
25 ),
26 method="cumavg",
27 operation="average",
28 )
29 def cumavg(whatever):
30 ...
31
32
33 @doc(cumsum, method="cummax", operation="maximum")
34 def cummax(whatever):
35 ...
36
37
38 @doc(cummax, method="cummin", operation="minimum")
39 def cummin(whatever):
40 ...
41
42
43 def test_docstring_formatting():
44 docstr = dedent(
45 """
46 This is the cumsum method.
47
48 It computes the cumulative sum.
49 """
50 )
51 assert cumsum.__doc__ == docstr
52
53
54 def test_docstring_appending():
55 docstr = dedent(
56 """
57 This is the cumavg method.
58
59 It computes the cumulative average.
60
61 Examples
62 --------
63
64 >>> cumavg([1, 2, 3])
65 2
66 """
67 )
68 assert cumavg.__doc__ == docstr
69
70
71 def test_doc_template_from_func():
72 docstr = dedent(
73 """
74 This is the cummax method.
75
76 It computes the cumulative maximum.
77 """
78 )
79 assert cummax.__doc__ == docstr
80
81
82 def test_inherit_doc_template():
83 docstr = dedent(
84 """
85 This is the cummin method.
86
87 It computes the cumulative minimum.
88 """
89 )
90 assert cummin.__doc__ == docstr
200200 assert_frame_equal(expected_unsorted, gdf.dissolve("a", sort=False))
201201
202202
203 @pytest.mark.skipif(
204 not compat.PANDAS_GE_025,
205 reason="'observed' param behavior changed in pandas 0.25.0",
206 )
207203 def test_dissolve_categorical():
208204 gdf = geopandas.GeoDataFrame(
209205 {
0 import geopandas as gpd
1 import numpy as np
2 import pandas as pd
3 import pytest
4 from distutils.version import LooseVersion
5
6 folium = pytest.importorskip("folium")
7 branca = pytest.importorskip("branca")
8 matplotlib = pytest.importorskip("matplotlib")
9 mapclassify = pytest.importorskip("mapclassify")
10
11 import matplotlib.cm as cm # noqa
12 import matplotlib.colors as colors # noqa
13 from branca.colormap import StepColormap # noqa
14
15 BRANCA_05 = LooseVersion(str(branca.__version__)) > LooseVersion("0.4.2")
16
17
18 class TestExplore:
19 def setup_method(self):
20 self.nybb = gpd.read_file(gpd.datasets.get_path("nybb"))
21 self.world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
22 self.cities = gpd.read_file(gpd.datasets.get_path("naturalearth_cities"))
23 self.world["range"] = range(len(self.world))
24 self.missing = self.world.copy()
25 np.random.seed(42)
26 self.missing.loc[np.random.choice(self.missing.index, 40), "continent"] = np.nan
27 self.missing.loc[np.random.choice(self.missing.index, 40), "pop_est"] = np.nan
28
29 def _fetch_map_string(self, m):
30 out = m._parent.render()
31 out_str = "".join(out.split())
32 return out_str
33
34 def test_simple_pass(self):
35 """Make sure default pass"""
36 self.nybb.explore()
37 self.world.explore()
38 self.cities.explore()
39 self.world.geometry.explore()
40
41 def test_choropleth_pass(self):
42 """Make sure default choropleth pass"""
43 self.world.explore(column="pop_est")
44
45 def test_map_settings_default(self):
46 """Check default map settings"""
47 m = self.world.explore()
48 assert m.location == [
49 pytest.approx(-3.1774349999999956, rel=1e-6),
50 pytest.approx(2.842170943040401e-14, rel=1e-6),
51 ]
52 assert m.options["zoom"] == 10
53 assert m.options["zoomControl"] is True
54 assert m.position == "relative"
55 assert m.height == (100.0, "%")
56 assert m.width == (100.0, "%")
57 assert m.left == (0, "%")
58 assert m.top == (0, "%")
59 assert m.global_switches.no_touch is False
60 assert m.global_switches.disable_3d is False
61 assert "openstreetmap" in m.to_dict()["children"].keys()
62
63 def test_map_settings_custom(self):
64 """Check custom map settins"""
65 m = self.nybb.explore(
66 zoom_control=False,
67 width=200,
68 height=200,
69 )
70 assert m.location == [
71 pytest.approx(40.70582377450201, rel=1e-6),
72 pytest.approx(-73.9778006856748, rel=1e-6),
73 ]
74 assert m.options["zoom"] == 10
75 assert m.options["zoomControl"] is False
76 assert m.height == (200.0, "px")
77 assert m.width == (200.0, "px")
78
79 # custom XYZ tiles
80 m = self.nybb.explore(
81 zoom_control=False,
82 width=200,
83 height=200,
84 tiles="https://mt1.google.com/vt/lyrs=m&x={x}&y={y}&z={z}",
85 attr="Google",
86 )
87
88 out_str = self._fetch_map_string(m)
89 s = '"https://mt1.google.com/vt/lyrs=m\\u0026x={x}\\u0026y={y}\\u0026z={z}"'
90 assert s in out_str
91 assert '"attribution":"Google"' in out_str
92
93 m = self.nybb.explore(location=(40, 5))
94 assert m.location == [40, 5]
95 assert m.options["zoom"] == 10
96
97 m = self.nybb.explore(zoom_start=8)
98 assert m.location == [
99 pytest.approx(40.70582377450201, rel=1e-6),
100 pytest.approx(-73.9778006856748, rel=1e-6),
101 ]
102 assert m.options["zoom"] == 8
103
104 m = self.nybb.explore(location=(40, 5), zoom_start=8)
105 assert m.location == [40, 5]
106 assert m.options["zoom"] == 8
107
108 def test_simple_color(self):
109 """Check color settings"""
110 # single named color
111 m = self.nybb.explore(color="red")
112 out_str = self._fetch_map_string(m)
113 assert '"fillColor":"red"' in out_str
114
115 # list of colors
116 colors = ["#333333", "#367324", "#95824f", "#fcaa00", "#ffcc33"]
117 m2 = self.nybb.explore(color=colors)
118 out_str = self._fetch_map_string(m2)
119 for c in colors:
120 assert f'"fillColor":"{c}"' in out_str
121
122 # column of colors
123 df = self.nybb.copy()
124 df["colors"] = colors
125 m3 = df.explore(color="colors")
126 out_str = self._fetch_map_string(m3)
127 for c in colors:
128 assert f'"fillColor":"{c}"' in out_str
129
130 # line GeoSeries
131 m4 = self.nybb.boundary.explore(color="red")
132 out_str = self._fetch_map_string(m4)
133 assert '"fillColor":"red"' in out_str
134
135 def test_choropleth_linear(self):
136 """Check choropleth colors"""
137 # default cmap
138 m = self.nybb.explore(column="Shape_Leng")
139 out_str = self._fetch_map_string(m)
140 assert 'color":"#440154"' in out_str
141 assert 'color":"#fde725"' in out_str
142 assert 'color":"#50c46a"' in out_str
143 assert 'color":"#481467"' in out_str
144 assert 'color":"#3d4e8a"' in out_str
145
146 # named cmap
147 m = self.nybb.explore(column="Shape_Leng", cmap="PuRd")
148 out_str = self._fetch_map_string(m)
149 assert 'color":"#f7f4f9"' in out_str
150 assert 'color":"#67001f"' in out_str
151 assert 'color":"#d31760"' in out_str
152 assert 'color":"#f0ecf5"' in out_str
153 assert 'color":"#d6bedc"' in out_str
154
155 def test_choropleth_mapclassify(self):
156 """Mapclassify bins"""
157 # quantiles
158 m = self.nybb.explore(column="Shape_Leng", scheme="quantiles")
159 out_str = self._fetch_map_string(m)
160 assert 'color":"#21918c"' in out_str
161 assert 'color":"#3b528b"' in out_str
162 assert 'color":"#5ec962"' in out_str
163 assert 'color":"#fde725"' in out_str
164 assert 'color":"#440154"' in out_str
165
166 # headtail
167 m = self.world.explore(column="pop_est", scheme="headtailbreaks")
168 out_str = self._fetch_map_string(m)
169 assert '"fillColor":"#3b528b"' in out_str
170 assert '"fillColor":"#21918c"' in out_str
171 assert '"fillColor":"#5ec962"' in out_str
172 assert '"fillColor":"#fde725"' in out_str
173 assert '"fillColor":"#440154"' in out_str
174 # custom k
175 m = self.world.explore(column="pop_est", scheme="naturalbreaks", k=3)
176 out_str = self._fetch_map_string(m)
177 assert '"fillColor":"#21918c"' in out_str
178 assert '"fillColor":"#fde725"' in out_str
179 assert '"fillColor":"#440154"' in out_str
180
181 def test_categorical(self):
182 """Categorical maps"""
183 # auto detection
184 m = self.world.explore(column="continent")
185 out_str = self._fetch_map_string(m)
186 assert 'color":"#9467bd","continent":"Europe"' in out_str
187 assert 'color":"#c49c94","continent":"NorthAmerica"' in out_str
188 assert 'color":"#1f77b4","continent":"Africa"' in out_str
189 assert 'color":"#98df8a","continent":"Asia"' in out_str
190 assert 'color":"#ff7f0e","continent":"Antarctica"' in out_str
191 assert 'color":"#9edae5","continent":"SouthAmerica"' in out_str
192 assert 'color":"#7f7f7f","continent":"Oceania"' in out_str
193 assert 'color":"#dbdb8d","continent":"Sevenseas(openocean)"' in out_str
194
195 # forced categorical
196 m = self.nybb.explore(column="BoroCode", categorical=True)
197 out_str = self._fetch_map_string(m)
198 assert 'color":"#9edae5"' in out_str
199 assert 'color":"#c7c7c7"' in out_str
200 assert 'color":"#8c564b"' in out_str
201 assert 'color":"#1f77b4"' in out_str
202 assert 'color":"#98df8a"' in out_str
203
204 # pandas.Categorical
205 df = self.world.copy()
206 df["categorical"] = pd.Categorical(df["name"])
207 m = df.explore(column="categorical")
208 out_str = self._fetch_map_string(m)
209 for c in np.apply_along_axis(colors.to_hex, 1, cm.tab20(range(20))):
210 assert f'"fillColor":"{c}"' in out_str
211
212 # custom cmap
213 m = self.nybb.explore(column="BoroName", cmap="Set1")
214 out_str = self._fetch_map_string(m)
215 assert 'color":"#999999"' in out_str
216 assert 'color":"#a65628"' in out_str
217 assert 'color":"#4daf4a"' in out_str
218 assert 'color":"#e41a1c"' in out_str
219 assert 'color":"#ff7f00"' in out_str
220
221 # custom list of colors
222 cmap = ["#333432", "#3b6e8c", "#bc5b4f", "#8fa37e", "#efc758"]
223 m = self.nybb.explore(column="BoroName", cmap=cmap)
224 out_str = self._fetch_map_string(m)
225 for c in cmap:
226 assert f'"fillColor":"{c}"' in out_str
227
228 # shorter list (to make it repeat)
229 cmap = ["#333432", "#3b6e8c"]
230 m = self.nybb.explore(column="BoroName", cmap=cmap)
231 out_str = self._fetch_map_string(m)
232 for c in cmap:
233 assert f'"fillColor":"{c}"' in out_str
234
235 with pytest.raises(ValueError, match="'cmap' is invalid."):
236 self.nybb.explore(column="BoroName", cmap="nonsense")
237
238 def test_categories(self):
239 m = self.nybb[["BoroName", "geometry"]].explore(
240 column="BoroName",
241 categories=["Brooklyn", "Staten Island", "Queens", "Bronx", "Manhattan"],
242 )
243 out_str = self._fetch_map_string(m)
244 assert '"Bronx","__folium_color":"#c7c7c7"' in out_str
245 assert '"Manhattan","__folium_color":"#9edae5"' in out_str
246 assert '"Brooklyn","__folium_color":"#1f77b4"' in out_str
247 assert '"StatenIsland","__folium_color":"#98df8a"' in out_str
248 assert '"Queens","__folium_color":"#8c564b"' in out_str
249
250 df = self.nybb.copy()
251 df["categorical"] = pd.Categorical(df["BoroName"])
252 with pytest.raises(ValueError, match="Cannot specify 'categories'"):
253 df.explore("categorical", categories=["Brooklyn", "Staten Island"])
254
255 def test_column_values(self):
256 """
257 Check that the dataframe plot method returns same values with an
258 input string (column in df), pd.Series, or np.array
259 """
260 column_array = np.array(self.world["pop_est"])
261 m1 = self.world.explore(column="pop_est") # column name
262 m2 = self.world.explore(column=column_array) # np.array
263 m3 = self.world.explore(column=self.world["pop_est"]) # pd.Series
264 assert m1.location == m2.location == m3.location
265
266 m1_fields = self.world.explore(column=column_array, tooltip=True, popup=True)
267 out1_fields_str = self._fetch_map_string(m1_fields)
268 assert (
269 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
270 in out1_fields_str
271 )
272 assert (
273 'aliases=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
274 in out1_fields_str
275 )
276
277 m2_fields = self.world.explore(
278 column=self.world["pop_est"], tooltip=True, popup=True
279 )
280 out2_fields_str = self._fetch_map_string(m2_fields)
281 assert (
282 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
283 in out2_fields_str
284 )
285 assert (
286 'aliases=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
287 in out2_fields_str
288 )
289
290 # GeoDataframe and the given list have different number of rows
291 with pytest.raises(ValueError, match="different number of rows"):
292 self.world.explore(column=np.array([1, 2, 3]))
293
294 def test_no_crs(self):
295 """Naive geometry get no tiles"""
296 df = self.world.copy()
297 df.crs = None
298 m = df.explore()
299 assert "openstreetmap" not in m.to_dict()["children"].keys()
300
301 def test_style_kwds(self):
302 """Style keywords"""
303 m = self.world.explore(
304 style_kwds=dict(fillOpacity=0.1, weight=0.5, fillColor="orange")
305 )
306 out_str = self._fetch_map_string(m)
307 assert '"fillColor":"orange","fillOpacity":0.1,"weight":0.5' in out_str
308 m = self.world.explore(column="pop_est", style_kwds=dict(color="black"))
309 assert '"color":"black"' in self._fetch_map_string(m)
310
311 def test_tooltip(self):
312 """Test tooltip"""
313 # default with no tooltip or popup
314 m = self.world.explore()
315 assert "GeoJsonTooltip" in str(m.to_dict())
316 assert "GeoJsonPopup" not in str(m.to_dict())
317
318 # True
319 m = self.world.explore(tooltip=True, popup=True)
320 assert "GeoJsonTooltip" in str(m.to_dict())
321 assert "GeoJsonPopup" in str(m.to_dict())
322 out_str = self._fetch_map_string(m)
323 assert (
324 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
325 in out_str
326 )
327 assert (
328 'aliases=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
329 in out_str
330 )
331
332 # True choropleth
333 m = self.world.explore(column="pop_est", tooltip=True, popup=True)
334 assert "GeoJsonTooltip" in str(m.to_dict())
335 assert "GeoJsonPopup" in str(m.to_dict())
336 out_str = self._fetch_map_string(m)
337 assert (
338 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
339 in out_str
340 )
341 assert (
342 'aliases=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
343 in out_str
344 )
345
346 # single column
347 m = self.world.explore(tooltip="pop_est", popup="iso_a3")
348 out_str = self._fetch_map_string(m)
349 assert 'fields=["pop_est"]' in out_str
350 assert 'aliases=["pop_est"]' in out_str
351 assert 'fields=["iso_a3"]' in out_str
352 assert 'aliases=["iso_a3"]' in out_str
353
354 # list
355 m = self.world.explore(
356 tooltip=["pop_est", "continent"], popup=["iso_a3", "gdp_md_est"]
357 )
358 out_str = self._fetch_map_string(m)
359 assert 'fields=["pop_est","continent"]' in out_str
360 assert 'aliases=["pop_est","continent"]' in out_str
361 assert 'fields=["iso_a3","gdp_md_est"' in out_str
362 assert 'aliases=["iso_a3","gdp_md_est"]' in out_str
363
364 # number
365 m = self.world.explore(tooltip=2, popup=2)
366 out_str = self._fetch_map_string(m)
367 assert 'fields=["pop_est","continent"]' in out_str
368 assert 'aliases=["pop_est","continent"]' in out_str
369
370 # keywords tooltip
371 m = self.world.explore(
372 tooltip=True,
373 popup=False,
374 tooltip_kwds=dict(aliases=[0, 1, 2, 3, 4, 5], sticky=False),
375 )
376 out_str = self._fetch_map_string(m)
377 assert (
378 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
379 in out_str
380 )
381 assert "aliases=[0,1,2,3,4,5]" in out_str
382 assert '"sticky":false' in out_str
383
384 # keywords popup
385 m = self.world.explore(
386 tooltip=False,
387 popup=True,
388 popup_kwds=dict(aliases=[0, 1, 2, 3, 4, 5]),
389 )
390 out_str = self._fetch_map_string(m)
391 assert (
392 'fields=["pop_est","continent","name","iso_a3","gdp_md_est","range"]'
393 in out_str
394 )
395 assert "aliases=[0,1,2,3,4,5]" in out_str
396 assert "<th>${aliases[i]" in out_str
397
398 # no labels
399 m = self.world.explore(
400 tooltip=True,
401 popup=True,
402 tooltip_kwds=dict(labels=False),
403 popup_kwds=dict(labels=False),
404 )
405 out_str = self._fetch_map_string(m)
406 assert "<th>${aliases[i]" not in out_str
407
408 # named index
409 gdf = self.nybb.set_index("BoroName")
410 m = gdf.explore()
411 out_str = self._fetch_map_string(m)
412 assert "BoroName" in out_str
413
414 def test_default_markers(self):
415 # check overridden default for points
416 m = self.cities.explore()
417 strings = ['"radius":2', '"fill":true', "CircleMarker(latlng,opts)"]
418 out_str = self._fetch_map_string(m)
419 for s in strings:
420 assert s in out_str
421
422 m = self.cities.explore(marker_kwds=dict(radius=5, fill=False))
423 strings = ['"radius":5', '"fill":false', "CircleMarker(latlng,opts)"]
424 out_str = self._fetch_map_string(m)
425 for s in strings:
426 assert s in out_str
427
428 def test_custom_markers(self):
429 # Markers
430 m = self.cities.explore(
431 marker_type="marker",
432 marker_kwds={"icon": folium.Icon(icon="star")},
433 )
434 assert ""","icon":"star",""" in self._fetch_map_string(m)
435
436 # Circle Markers
437 m = self.cities.explore(marker_type="circle", marker_kwds={"fill_color": "red"})
438 assert ""","fillColor":"red",""" in self._fetch_map_string(m)
439
440 # Folium Markers
441 m = self.cities.explore(
442 marker_type=folium.Circle(
443 radius=4, fill_color="orange", fill_opacity=0.4, color="black", weight=1
444 ),
445 )
446 assert ""","color":"black",""" in self._fetch_map_string(m)
447
448 # Circle
449 m = self.cities.explore(marker_type="circle_marker", marker_kwds={"radius": 10})
450 assert ""","radius":10,""" in self._fetch_map_string(m)
451
452 # Unsupported Markers
453 with pytest.raises(
454 ValueError,
455 match="Only 'marker', 'circle', and 'circle_marker' are supported",
456 ):
457 self.cities.explore(marker_type="dummy")
458
459 def test_vmin_vmax(self):
460 df = self.world.copy()
461 df["range"] = range(len(df))
462 m = df.explore("range", vmin=-100, vmax=1000)
463 out_str = self._fetch_map_string(m)
464 assert 'case"176":return{"color":"#3b528b","fillColor":"#3b528b"' in out_str
465 assert 'case"119":return{"color":"#414287","fillColor":"#414287"' in out_str
466 assert 'case"3":return{"color":"#482173","fillColor":"#482173"' in out_str
467
468 def test_missing_vals(self):
469 m = self.missing.explore("continent")
470 assert '"fillColor":null' in self._fetch_map_string(m)
471
472 m = self.missing.explore("pop_est")
473 assert '"fillColor":null' in self._fetch_map_string(m)
474
475 m = self.missing.explore("pop_est", missing_kwds=dict(color="red"))
476 assert '"fillColor":"red"' in self._fetch_map_string(m)
477
478 m = self.missing.explore("continent", missing_kwds=dict(color="red"))
479 assert '"fillColor":"red"' in self._fetch_map_string(m)
480
481 def test_categorical_legend(self):
482 m = self.world.explore("continent", legend=True)
483 out_str = self._fetch_map_string(m)
484 assert "#1f77b4'></span>Africa" in out_str
485 assert "#ff7f0e'></span>Antarctica" in out_str
486 assert "#98df8a'></span>Asia" in out_str
487 assert "#9467bd'></span>Europe" in out_str
488 assert "#c49c94'></span>NorthAmerica" in out_str
489 assert "#7f7f7f'></span>Oceania" in out_str
490 assert "#dbdb8d'></span>Sevenseas(openocean)" in out_str
491 assert "#9edae5'></span>SouthAmerica" in out_str
492
493 m = self.missing.explore(
494 "continent", legend=True, missing_kwds={"color": "red"}
495 )
496 out_str = self._fetch_map_string(m)
497 assert "red'></span>NaN" in out_str
498
499 def test_colorbar(self):
500 m = self.world.explore("range", legend=True)
501 out_str = self._fetch_map_string(m)
502 assert "attr(\"id\",'legend')" in out_str
503 assert "text('range')" in out_str
504
505 m = self.world.explore(
506 "range", legend=True, legend_kwds=dict(caption="my_caption")
507 )
508 out_str = self._fetch_map_string(m)
509 assert "attr(\"id\",'legend')" in out_str
510 assert "text('my_caption')" in out_str
511
512 m = self.missing.explore("pop_est", legend=True, missing_kwds=dict(color="red"))
513 out_str = self._fetch_map_string(m)
514 assert "red'></span>NaN" in out_str
515
516 # do not scale legend
517 m = self.world.explore(
518 "pop_est",
519 legend=True,
520 legend_kwds=dict(scale=False),
521 scheme="Headtailbreaks",
522 )
523 out_str = self._fetch_map_string(m)
524 assert out_str.count("#440154ff") == 100
525 assert out_str.count("#3b528bff") == 100
526 assert out_str.count("#21918cff") == 100
527 assert out_str.count("#5ec962ff") == 100
528 assert out_str.count("#fde725ff") == 100
529
530 # scale legend accordingly
531 m = self.world.explore(
532 "pop_est",
533 legend=True,
534 scheme="Headtailbreaks",
535 )
536 out_str = self._fetch_map_string(m)
537 assert out_str.count("#440154ff") == 16
538 assert out_str.count("#3b528bff") == 51
539 assert out_str.count("#21918cff") == 133
540 assert out_str.count("#5ec962ff") == 282
541 assert out_str.count("#fde725ff") == 18
542
543 # discrete cmap
544 m = self.world.explore("pop_est", legend=True, cmap="Pastel2")
545 out_str = self._fetch_map_string(m)
546
547 assert out_str.count("b3e2cdff") == 63
548 assert out_str.count("fdcdacff") == 62
549 assert out_str.count("cbd5e8ff") == 63
550 assert out_str.count("f4cae4ff") == 62
551 assert out_str.count("e6f5c9ff") == 62
552 assert out_str.count("fff2aeff") == 63
553 assert out_str.count("f1e2ccff") == 62
554 assert out_str.count("ccccccff") == 63
555
556 @pytest.mark.skipif(not BRANCA_05, reason="requires branca >= 0.5.0")
557 def test_colorbar_max_labels(self):
558 # linear
559 m = self.world.explore("pop_est", legend_kwds=dict(max_labels=3))
560 out_str = self._fetch_map_string(m)
561
562 tick_values = [140.0, 465176713.5921569, 930353287.1843138]
563 for tick in tick_values:
564 assert str(tick) in out_str
565
566 # scheme
567 m = self.world.explore(
568 "pop_est", scheme="headtailbreaks", legend_kwds=dict(max_labels=3)
569 )
570 out_str = self._fetch_map_string(m)
571
572 assert "tickValues([140,'',182567501.0,'',1330619341.0,''])" in out_str
573
574 # short cmap
575 m = self.world.explore("pop_est", legend_kwds=dict(max_labels=3), cmap="tab10")
576 out_str = self._fetch_map_string(m)
577
578 tick_values = [140.0, 551721192.4, 1103442244.8]
579 for tick in tick_values:
580 assert str(tick) in out_str
581
582 def test_xyzservices_providers(self):
583 xyzservices = pytest.importorskip("xyzservices")
584
585 m = self.nybb.explore(tiles=xyzservices.providers.CartoDB.PositronNoLabels)
586 out_str = self._fetch_map_string(m)
587
588 assert (
589 '"https://a.basemaps.cartocdn.com/light_nolabels/{z}/{x}/{y}{r}.png"'
590 in out_str
591 )
592 assert (
593 'attribution":"\\u0026copy;\\u003cahref=\\"https://www.openstreetmap.org'
594 in out_str
595 )
596 assert '"maxNativeZoom":19,"maxZoom":19,"minZoom":0' in out_str
597
598 def test_xyzservices_query_name(self):
599 pytest.importorskip("xyzservices")
600
601 m = self.nybb.explore(tiles="CartoDB Positron No Labels")
602 out_str = self._fetch_map_string(m)
603
604 assert (
605 '"https://a.basemaps.cartocdn.com/light_nolabels/{z}/{x}/{y}{r}.png"'
606 in out_str
607 )
608 assert (
609 'attribution":"\\u0026copy;\\u003cahref=\\"https://www.openstreetmap.org'
610 in out_str
611 )
612 assert '"maxNativeZoom":19,"maxZoom":19,"minZoom":0' in out_str
613
614 def test_linearrings(self):
615 rings = self.nybb.explode(index_parts=True).exterior
616 m = rings.explore()
617 out_str = self._fetch_map_string(m)
618
619 assert out_str.count("LineString") == len(rings)
620
621 def test_mapclassify_categorical_legend(self):
622 m = self.missing.explore(
623 column="pop_est",
624 legend=True,
625 scheme="naturalbreaks",
626 missing_kwds=dict(color="red", label="missing"),
627 legend_kwds=dict(colorbar=False, interval=True),
628 )
629 out_str = self._fetch_map_string(m)
630
631 strings = [
632 "[140.00,33986655.00]",
633 "(33986655.00,105350020.00]",
634 "(105350020.00,207353391.00]",
635 "(207353391.00,326625791.00]",
636 "(326625791.00,1379302771.00]",
637 "missing",
638 ]
639 for s in strings:
640 assert s in out_str
641
642 # interval=False
643 m = self.missing.explore(
644 column="pop_est",
645 legend=True,
646 scheme="naturalbreaks",
647 missing_kwds=dict(color="red", label="missing"),
648 legend_kwds=dict(colorbar=False, interval=False),
649 )
650 out_str = self._fetch_map_string(m)
651
652 strings = [
653 ">140.00,33986655.00",
654 ">33986655.00,105350020.00",
655 ">105350020.00,207353391.00",
656 ">207353391.00,326625791.00",
657 ">326625791.00,1379302771.00",
658 "missing",
659 ]
660 for s in strings:
661 assert s in out_str
662
663 # custom labels
664 m = self.world.explore(
665 column="pop_est",
666 legend=True,
667 scheme="naturalbreaks",
668 k=5,
669 legend_kwds=dict(colorbar=False, labels=["s", "m", "l", "xl", "xxl"]),
670 )
671 out_str = self._fetch_map_string(m)
672
673 strings = [">s<", ">m<", ">l<", ">xl<", ">xxl<"]
674 for s in strings:
675 assert s in out_str
676
677 # fmt
678 m = self.missing.explore(
679 column="pop_est",
680 legend=True,
681 scheme="naturalbreaks",
682 missing_kwds=dict(color="red", label="missing"),
683 legend_kwds=dict(colorbar=False, fmt="{:.0f}"),
684 )
685 out_str = self._fetch_map_string(m)
686
687 strings = [
688 ">140,33986655",
689 ">33986655,105350020",
690 ">105350020,207353391",
691 ">207353391,326625791",
692 ">326625791,1379302771",
693 "missing",
694 ]
695 for s in strings:
696 assert s in out_str
697
698 def test_given_m(self):
699 "Check that geometry is mapped onto a given folium.Map"
700 m = folium.Map()
701 self.nybb.explore(m=m, tooltip=False, highlight=False)
702
703 out_str = self._fetch_map_string(m)
704
705 assert out_str.count("BoroCode") == 5
706 # should not change map settings
707 assert m.options["zoom"] == 1
708
709 def test_highlight(self):
710 m = self.nybb.explore(highlight=True)
711 out_str = self._fetch_map_string(m)
712
713 assert '"fillOpacity":0.75' in out_str
714
715 m = self.nybb.explore(
716 highlight=True, highlight_kwds=dict(fillOpacity=1, color="red")
717 )
718 out_str = self._fetch_map_string(m)
719
720 assert '{"color":"red","fillOpacity":1}' in out_str
721
722 def test_custom_colormaps(self):
723
724 step = StepColormap(["green", "yellow", "red"], vmin=0, vmax=100000000)
725
726 m = self.world.explore("pop_est", cmap=step, tooltip=["name"], legend=True)
727
728 strings = [
729 'fillColor":"#008000ff"', # Green
730 '"fillColor":"#ffff00ff"', # Yellow
731 '"fillColor":"#ff0000ff"', # Red
732 ]
733
734 out_str = self._fetch_map_string(m)
735 for s in strings:
736 assert s in out_str
737
738 assert out_str.count("008000ff") == 306
739 assert out_str.count("ffff00ff") == 187
740 assert out_str.count("ff0000ff") == 190
741
742 # Using custom function colormap
743 def my_color_function(field):
744 """Maps low values to green and high values to red."""
745 if field > 100000000:
746 return "#ff0000"
747 else:
748 return "#008000"
749
750 m = self.world.explore("pop_est", cmap=my_color_function, legend=False)
751
752 strings = [
753 '"color":"#ff0000","fillColor":"#ff0000"',
754 '"color":"#008000","fillColor":"#008000"',
755 ]
756
757 for s in strings:
758 assert s in self._fetch_map_string(m)
759
760 # matplotlib.Colormap
761 cmap = colors.ListedColormap(["red", "green", "blue", "white", "black"])
762
763 m = self.nybb.explore("BoroName", cmap=cmap)
764 strings = [
765 '"fillColor":"#ff0000"', # Red
766 '"fillColor":"#008000"', # Green
767 '"fillColor":"#0000ff"', # Blue
768 '"fillColor":"#ffffff"', # White
769 '"fillColor":"#000000"', # Black
770 ]
771
772 out_str = self._fetch_map_string(m)
773 for s in strings:
774 assert s in out_str
775
776 def test_multiple_geoseries(self):
777 """
778 Additional GeoSeries need to be removed as they cannot be converted to GeoJSON
779 """
780 gdf = self.nybb
781 gdf["boundary"] = gdf.boundary
782 gdf["centroid"] = gdf.centroid
783
784 gdf.explore()
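
The block above closes out the explore() tests. As a hedged, standalone sketch of the call pattern they exercise (this assumes the optional folium, mapclassify and branca dependencies are installed, and uses the bundled naturalearth_lowres dataset in place of the class fixtures):

import geopandas

world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

# choropleth with a classification scheme and a custom legend, mirroring
# test_mapclassify_categorical_legend above
m = world.explore(
    column="pop_est",
    scheme="naturalbreaks",
    k=5,
    legend=True,
    legend_kwds=dict(colorbar=False, fmt="{:.0f}"),
)
m.save("map.html")  # the return value is a folium.Map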
2222 import shapely.geometry
2323
2424 from geopandas.array import GeometryArray, GeometryDtype, from_shapely
25 from geopandas._compat import ignore_shapely2_warnings
2526
2627 import pytest
2728
4748
4849 def make_data():
4950 a = np.empty(100, dtype=object)
50 a[:] = [shapely.geometry.Point(i, i) for i in range(100)]
51 with ignore_shapely2_warnings():
52 a[:] = [shapely.geometry.Point(i, i) for i in range(100)]
5153 ga = from_shapely(a)
5254 return ga
5355
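The hunk above wraps object-array assignment in ignore_shapely2_warnings(); a minimal sketch of the pattern follows (note the helper lives in the private geopandas._compat module, so this mirrors the test code rather than a public API):

import numpy as np
import shapely.geometry
from geopandas._compat import ignore_shapely2_warnings

a = np.empty(3, dtype=object)
with ignore_shapely2_warnings():
    # slice assignment keeps the geometries as opaque objects instead of
    # letting NumPy expand them via Shapely's deprecated array interface
    a[:] = [shapely.geometry.Point(i, i) for i in range(3)]
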
298300 result = np.array(data, dtype=object)
299301 # expected = np.array(list(data), dtype=object)
300302 expected = np.empty(len(data), dtype=object)
301 expected[:] = list(data)
303 with ignore_shapely2_warnings():
304 expected[:] = list(data)
302305 assert_array_equal(result, expected)
303306
304307 def test_contains(self, data, data_missing):
305 # overrided due to the inconsistency between
308 # overridden due to the inconsistency between
306309 # GeometryDtype.na_value = np.nan
307310 # and None being used as NA in array
308311
363366
364367 @pytest.mark.skip("fillna method not supported")
365368 def test_fillna_series_method(self, data_missing, method):
369 pass
370
371 @pytest.mark.skip("fillna method not supported")
372 def test_fillna_no_op_returns_copy(self, data):
366373 pass
367374
368375
394401 """
395402 Fixture for dunder names for common arithmetic operations
396403
397 Adapted to excluse __sub__, as this is implemented as "difference".
398 """
399 return request.param
400
401
404 Adapted to exclude __sub__, as this is implemented as "difference".
405 """
406 return request.param
407
408
409 # an inherited test from pandas creates a Series from a list of geometries, which
410 # triggers a warning from Shapely that is outside GeoPandas' control, so ignore it here
411 @pytest.mark.filterwarnings(
412 "ignore:The array interface is deprecated and will no longer work in Shapely 2.0"
413 )
402414 class TestArithmeticOps(extension_tests.BaseArithmeticOpsTests):
403415 @pytest.mark.skip(reason="not applicable")
404416 def test_divmod_series_array(self, data, data_for_twos):
409421 pass
410422
411423
424 # an inherited test from pandas creates a Series from a list of geometries, which
425 # triggers a warning from Shapely that is outside GeoPandas' control, so ignore it here
426 @pytest.mark.filterwarnings(
427 "ignore:The array interface is deprecated and will no longer work in Shapely 2.0"
428 )
412429 class TestComparisonOps(extension_tests.BaseComparisonOpsTests):
413430 def _compare_other(self, s, data, op_name, other):
414431 op = getattr(operator, op_name.strip("_"))
429446
430447
431448 class TestMethods(extension_tests.BaseMethodsTests):
432 @not_yet_implemented
449 @no_sorting
433450 @pytest.mark.parametrize("dropna", [True, False])
434451 def test_value_counts(self, all_data, dropna):
435452 pass
436453
437 @not_yet_implemented
454 @no_sorting
438455 def test_value_counts_with_normalize(self, data):
439456 pass
440457
2222 """
2323
2424 def __init__(self, *args, **kwargs):
25 super(ForwardMock, self).__init__(*args, **kwargs)
25 super().__init__(*args, **kwargs)
2626 self._n = 0.0
2727
2828 def __call__(self, *args, **kwargs):
2929 self.return_value = args[0], (self._n, self._n + 0.5)
3030 self._n += 1
31 return super(ForwardMock, self).__call__(*args, **kwargs)
31 return super().__call__(*args, **kwargs)
3232
3333
3434 class ReverseMock(mock.MagicMock):
4040 """
4141
4242 def __init__(self, *args, **kwargs):
43 super(ReverseMock, self).__init__(*args, **kwargs)
43 super().__init__(*args, **kwargs)
4444 self._n = 0
4545
4646 def __call__(self, *args, **kwargs):
4747 self.return_value = "address{0}".format(self._n), args[0]
4848 self._n += 1
49 return super(ReverseMock, self).__call__(*args, **kwargs)
49 return super().__call__(*args, **kwargs)
5050
5151
5252 @pytest.fixture
133133 from geopy.exc import GeocoderNotFound
134134
135135 with pytest.raises(GeocoderNotFound):
136 reverse_geocode(["cambridge, ma"], "badprovider")
136 reverse_geocode([Point(0, 0)], "badprovider")
137137
138138
139139 def test_forward(locations, points):
140 from geopy.geocoders import GeocodeFarm
140 from geopy.geocoders import Photon
141141
142 for provider in ["geocodefarm", GeocodeFarm]:
143 with mock.patch("geopy.geocoders.GeocodeFarm.geocode", ForwardMock()) as m:
142 for provider in ["photon", Photon]:
143 with mock.patch("geopy.geocoders.Photon.geocode", ForwardMock()) as m:
144144 g = geocode(locations, provider=provider, timeout=2)
145145 assert len(locations) == m.call_count
146146
154154
155155
156156 def test_reverse(locations, points):
157 from geopy.geocoders import GeocodeFarm
157 from geopy.geocoders import Photon
158158
159 for provider in ["geocodefarm", GeocodeFarm]:
160 with mock.patch("geopy.geocoders.GeocodeFarm.reverse", ReverseMock()) as m:
159 for provider in ["photon", Photon]:
160 with mock.patch("geopy.geocoders.Photon.reverse", ReverseMock()) as m:
161161 g = reverse_geocode(points, provider=provider, timeout=2)
162162 assert len(points) == m.call_count
163163
99 import pyproj
1010 from pyproj import CRS
1111 from pyproj.exceptions import CRSError
12 from shapely.geometry import Point
12 from shapely.geometry import Point, Polygon
1313
1414 import geopandas
15 import geopandas._compat as compat
1516 from geopandas import GeoDataFrame, GeoSeries, read_file
1617 from geopandas.array import GeometryArray, GeometryDtype, from_shapely
18 from geopandas._compat import ignore_shapely2_warnings
1719
1820 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
1921 from geopandas.tests.util import PACKAGE_DIR, validate_boro_df
2224
2325
2426 PYPROJ_LT_3 = LooseVersion(pyproj.__version__) < LooseVersion("3")
27 TEST_NEAREST = compat.PYGEOS_GE_010 and compat.USE_PYGEOS
28 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
29
30
31 @pytest.fixture
32 def dfs(request):
33 s1 = GeoSeries(
34 [
35 Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
36 Polygon([(2, 2), (4, 2), (4, 4), (2, 4)]),
37 ]
38 )
39 s2 = GeoSeries(
40 [
41 Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),
42 Polygon([(3, 3), (5, 3), (5, 5), (3, 5)]),
43 ]
44 )
45 df1 = GeoDataFrame({"col1": [1, 2], "geometry": s1})
46 df2 = GeoDataFrame({"col2": [1, 2], "geometry": s2})
47 return df1, df2
48
49
50 @pytest.fixture(
51 params=["union", "intersection", "difference", "symmetric_difference", "identity"]
52 )
53 def how(request):
54 if pandas_133 and request.param in ["symmetric_difference", "identity", "union"]:
55 pytest.xfail("Regression in pandas 1.3.3 (GH #2101)")
56 return request.param
2557
2658
2759 class TestDataFrame:
366398 assert len(data["features"]) == 5
367399 assert "id" in data["features"][0].keys()
368400
401 @pytest.mark.filterwarnings(
402 "ignore:Geometry column does not contain geometry:UserWarning"
403 )
369404 def test_to_json_geom_col(self):
370405 df = self.df.copy()
371406 df["geom"] = df["geometry"]
456491 for f in data["features"]:
457492 assert "id" not in f.keys()
458493
494 def test_to_json_with_duplicate_columns(self):
495 df = GeoDataFrame(
496 data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)]
497 )
498 with pytest.raises(
499 ValueError, match="GeoDataFrame cannot contain duplicated column names."
500 ):
501 df.to_json()
502
459503 def test_copy(self):
460504 df2 = self.df.copy()
461505 assert type(df2) is GeoDataFrame
484528 df = GeoDataFrame.from_file(tempfilename)
485529 assert df.crs == "epsg:2263"
486530
531 def test_to_file_with_duplicate_columns(self):
532 df = GeoDataFrame(
533 data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)]
534 )
535 with pytest.raises(
536 ValueError, match="GeoDataFrame cannot contain duplicated column names."
537 ):
538 tempfilename = os.path.join(self.tempdir, "crs.shp")
539 df.to_file(tempfilename)
540
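The two new tests above pin the guard against duplicated column names; a short sketch of the asserted behaviour (error text quoted from the tests):

from shapely.geometry import Point
from geopandas import GeoDataFrame

gdf = GeoDataFrame(data=[[1, 2, 3]], columns=["a", "b", "a"], geometry=[Point(1, 1)])
try:
    gdf.to_json()
except ValueError as err:
    print(err)  # GeoDataFrame cannot contain duplicated column names.
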
487541 def test_bool_index(self):
488542 # Find boros with 'B' in their name
489543 df = self.df[self.df["BoroName"].str.contains("B")]
629683 df = self.df.iloc[:1].copy()
630684 df.loc[0, "BoroName"] = np.nan
631685 # when containing missing values
632 # null: ouput the missing entries as JSON null
686 # null: output the missing entries as JSON null
633687 result = list(df.iterfeatures(na="null"))[0]["properties"]
634688 assert result["BoroName"] is None
635689 # drop: remove the property from the feature.
664718 # keep
665719 result = list(df_only_numerical_cols.iterfeatures(na="keep"))[0]
666720 assert type(result["properties"]["Shape_Leng"]) is float
721
722 with pytest.raises(
723 ValueError, match="GeoDataFrame cannot contain duplicated column names."
724 ):
725 df_with_duplicate_columns = df[
726 ["Shape_Leng", "Shape_Leng", "Shape_Area", "geometry"]
727 ]
728 list(df_with_duplicate_columns.iterfeatures())
667729
668730 # geometry not set
669731 df = GeoDataFrame({"values": [0, 1], "geom": [Point(0, 1), Point(1, 0)]})
743805 expected_df = pd.DataFrame({"gs0": wkts0, "gs1": wkts1})
744806 assert_frame_equal(expected_df, gdf.to_wkt())
745807
808 @pytest.mark.parametrize("how", ["left", "inner", "right"])
809 @pytest.mark.parametrize("predicate", ["intersects", "within", "contains"])
810 @pytest.mark.skipif(
811 not compat.USE_PYGEOS and not compat.HAS_RTREE,
812 reason="sjoin needs `rtree` or `pygeos` dependency",
813 )
814 def test_sjoin(self, how, predicate):
815 """
816 Basic test for availability of the GeoDataFrame method. Other
817 sjoin tests are located in /tools/tests/test_sjoin.py
818 """
819 left = read_file(geopandas.datasets.get_path("naturalearth_cities"))
820 right = read_file(geopandas.datasets.get_path("naturalearth_lowres"))
821
822 expected = geopandas.sjoin(left, right, how=how, predicate=predicate)
823 result = left.sjoin(right, how=how, predicate=predicate)
824 assert_geodataframe_equal(result, expected)
825
826 @pytest.mark.parametrize("how", ["left", "inner", "right"])
827 @pytest.mark.parametrize("max_distance", [None, 1])
828 @pytest.mark.parametrize("distance_col", [None, "distance"])
829 @pytest.mark.skipif(
830 not TEST_NEAREST,
831 reason=(
832 "PyGEOS >= 0.10.0"
833 " must be installed and activated via the geopandas.compat module to"
834 " test sjoin_nearest"
835 ),
836 )
837 def test_sjoin_nearest(self, how, max_distance, distance_col):
838 """
839 Basic test for availability of the GeoDataFrame method. Other
840 sjoin tests are located in /tools/tests/test_sjoin.py
841 """
842 left = read_file(geopandas.datasets.get_path("naturalearth_cities"))
843 right = read_file(geopandas.datasets.get_path("naturalearth_lowres"))
844
845 expected = geopandas.sjoin_nearest(
846 left, right, how=how, max_distance=max_distance, distance_col=distance_col
847 )
848 result = left.sjoin_nearest(
849 right, how=how, max_distance=max_distance, distance_col=distance_col
850 )
851 assert_geodataframe_equal(result, expected)
852
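The two delegation tests above assert that the DataFrame methods match the module-level functions; a hedged usage sketch (sjoin needs rtree or pygeos, and sjoin_nearest additionally needs PyGEOS >= 0.10 enabled, per the skip conditions above):

import geopandas

cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))
countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

# predicate-based join; method and function forms are interchangeable
joined = cities.sjoin(countries, how="left", predicate="intersects")

# proximity join; max_distance is in CRS units, distance_col stores distances
nearest = cities.sjoin_nearest(countries, how="left", distance_col="distance")
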
853 @pytest.mark.skip_no_sindex
854 def test_clip(self):
855 """
856 Basic test for availability of the GeoDataFrame method. Other
857 clip tests are located in /tools/tests/test_clip.py
858 """
859 left = read_file(geopandas.datasets.get_path("naturalearth_cities"))
860 world = read_file(geopandas.datasets.get_path("naturalearth_lowres"))
861 south_america = world[world["continent"] == "South America"]
862
863 expected = geopandas.clip(left, south_america)
864 result = left.clip(south_america)
865 assert_geodataframe_equal(result, expected)
866
867 @pytest.mark.skip_no_sindex
868 def test_overlay(self, dfs, how):
869 """
870 Basic test for availability of the GeoDataFrame method. Other
871 overlay tests are located in tests/test_overlay.py
872 """
873 df1, df2 = dfs
874
875 expected = geopandas.overlay(df1, df2, how=how)
876 result = df1.overlay(df2, how=how)
877 assert_geodataframe_equal(result, expected)
878
746879
747880 def check_geodataframe(df, geometry_column="geometry"):
748881 assert isinstance(df, GeoDataFrame)
844977 "B": np.arange(3.0),
845978 "geometry": [Point(x, x) for x in range(3)],
846979 }
847 a = np.array([data["A"], data["B"], data["geometry"]], dtype=object).T
980 with ignore_shapely2_warnings():
981 a = np.array([data["A"], data["B"], data["geometry"]], dtype=object).T
848982
849983 df = GeoDataFrame(a, columns=["A", "B", "geometry"])
850984 check_geodataframe(df)
859993 "geometry": [Point(x, x) for x in range(3)],
860994 }
861995 gpdf = GeoDataFrame(data)
862 pddf = pd.DataFrame(data)
996 with ignore_shapely2_warnings():
997 pddf = pd.DataFrame(data)
863998 check_geodataframe(gpdf)
864999 assert type(pddf) == pd.DataFrame
8651000
8891024
8901025 gpdf = GeoDataFrame(data, geometry="other_geom")
8911026 check_geodataframe(gpdf, "other_geom")
892 pddf = pd.DataFrame(data)
1027 with ignore_shapely2_warnings():
1028 pddf = pd.DataFrame(data)
8931029
8941030 for df in [gpdf, pddf]:
8951031 res = GeoDataFrame(df, geometry="other_geom")
8961032 check_geodataframe(res, "other_geom")
8971033
898 # when passing GeoDataFrame with custom geometry name to constructor
899 # an invalid geodataframe is the result TODO is this desired ?
1034 # gdf from gdf should preserve active geometry column name
9001035 df = GeoDataFrame(gpdf)
901 with pytest.raises(AttributeError):
902 df.geometry
1036 check_geodataframe(df, "other_geom")
9031037
9041038 def test_only_geometry(self):
9051039 exp = GeoDataFrame(
9751109 def test_overwrite_geometry(self):
9761110 # GH602
9771111 data = pd.DataFrame({"geometry": [1, 2, 3], "col1": [4, 5, 6]})
978 geoms = pd.Series([Point(i, i) for i in range(3)])
1112 with ignore_shapely2_warnings():
1113 geoms = pd.Series([Point(i, i) for i in range(3)])
9791114 # passed geometry kwarg should overwrite geometry column in data
9801115 res = GeoDataFrame(data, geometry=geoms)
9811116 assert_geoseries_equal(res.geometry, GeoSeries(geoms))
9821117
1118 def test_repeat_geo_col(self):
1119 df = pd.DataFrame(
1120 [
1121 {"geometry": Point(x, y), "geom": Point(x, y)}
1122 for x, y in zip(range(3), range(3))
1123 ],
1124 )
1125 # explicitly prevent construction of gdf with repeat geometry column names
1126 # two columns called "geometry", geom col inferred
1127 df2 = df.rename(columns={"geom": "geometry"})
1128 with pytest.raises(ValueError):
1129 GeoDataFrame(df2)
1130 # ensure case is caught when custom geom column name is used
1131 # two columns called "geom", geom col explicit
1132 df3 = df.rename(columns={"geometry": "geom"})
1133 with pytest.raises(ValueError):
1134 GeoDataFrame(df3, geometry="geom")
1135
9831136
9841137 def test_geodataframe_crs():
985 gdf = GeoDataFrame()
1138 gdf = GeoDataFrame(columns=["geometry"])
9861139 gdf.crs = "IGNF:ETRS89UTM28"
9871140 assert gdf.crs.to_authority() == ("IGNF", "ETRS89UTM28")
11
22 import numpy as np
33 from numpy.testing import assert_array_equal
4 from pandas import DataFrame, MultiIndex, Series
4 from pandas import DataFrame, Index, MultiIndex, Series
55
66 from shapely.geometry import LinearRing, LineString, MultiPoint, Point, Polygon
77 from shapely.geometry.collection import GeometryCollection
103103 )
104104
105105 def _test_unary_real(self, op, expected, a):
106 """ Tests for 'area', 'length', 'is_valid', etc. """
106 """Tests for 'area', 'length', 'is_valid', etc."""
107107 fcmp = assert_series_equal
108108 self._test_unary(op, expected, a, fcmp)
109109
118118 self._test_unary(op, expected, a, fcmp)
119119
120120 def _test_binary_topological(self, op, expected, a, b, *args, **kwargs):
121 """ Tests for 'intersection', 'union', 'symmetric_difference', etc. """
121 """Tests for 'intersection', 'union', 'symmetric_difference', etc."""
122122 if isinstance(expected, GeoPandasBase):
123123 fcmp = assert_geoseries_equal
124124 else:
228228 result = getattr(gdf, op)
229229 fcmp(result, expected)
230230
231 # TODO reenable for all operations once we use pyproj > 2
231 # TODO re-enable for all operations once we use pyproj > 2
232232 # def test_crs_warning(self):
233233 # # operations on geometries should warn for different CRS
234234 # no_crs_g3 = self.g3.copy()
244244 "intersection", self.all_none, self.g1, self.empty
245245 )
246246
247 assert len(self.g0.intersection(self.g9, align=True) == 8)
247 with pytest.warns(UserWarning, match="The indices .+ different"):
248 assert len(self.g0.intersection(self.g9, align=True) == 8)
248249 assert len(self.g0.intersection(self.g9, align=False) == 7)
249250
250251 def test_union_series(self):
251252 self._test_binary_topological("union", self.sq, self.g1, self.g2)
252253
253 assert len(self.g0.union(self.g9, align=True) == 8)
254 with pytest.warns(UserWarning, match="The indices .+ different"):
255 assert len(self.g0.union(self.g9, align=True) == 8)
254256 assert len(self.g0.union(self.g9, align=False) == 7)
255257
256258 def test_union_polygon(self):
259261 def test_symmetric_difference_series(self):
260262 self._test_binary_topological("symmetric_difference", self.sq, self.g3, self.g4)
261263
262 assert len(self.g0.symmetric_difference(self.g9, align=True) == 8)
264 with pytest.warns(UserWarning, match="The indices .+ different"):
265 assert len(self.g0.symmetric_difference(self.g9, align=True) == 8)
263266 assert len(self.g0.symmetric_difference(self.g9, align=False) == 7)
264267
265268 def test_symmetric_difference_poly(self):
272275 expected = GeoSeries([GeometryCollection(), self.t2])
273276 self._test_binary_topological("difference", expected, self.g1, self.g2)
274277
275 assert len(self.g0.difference(self.g9, align=True) == 8)
278 with pytest.warns(UserWarning, match="The indices .+ different"):
279 assert len(self.g0.difference(self.g9, align=True) == 8)
276280 assert len(self.g0.difference(self.g9, align=False) == 7)
277281
278282 def test_difference_poly(self):
289293 # binary geo empty result with right GeoSeries
290294 result = GeoSeries([l1]).intersection(GeoSeries([l2]))
291295 assert_geoseries_equal(result, expected)
292 # unary geo resulting in emtpy geometry
296 # unary geo resulting in empty geometry
293297 result = GeoSeries([GeometryCollection()]).convex_hull
294298 assert_geoseries_equal(result, expected)
295299
349353
350354 self._test_unary_topological("unary_union", expected, g)
351355
356 def test_cascaded_union_deprecated(self):
357 p1 = self.t1
358 p2 = Polygon([(2, 0), (3, 0), (3, 1)])
359 g = GeoSeries([p1, p2])
360 with pytest.warns(
361 FutureWarning, match="The 'cascaded_union' attribute is deprecated"
362 ):
363 result = g.cascaded_union
364 assert result == g.unary_union
365
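Migrating off the deprecated attribute tested above is a one-line change; as a sketch:

from shapely.geometry import Point
from geopandas import GeoSeries

g = GeoSeries([Point(0, 0).buffer(1), Point(1, 0).buffer(1)])
merged = g.unary_union    # preferred spelling
# merged = g.cascaded_union  # emits the FutureWarning asserted above
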
352366 def test_contains(self):
353367 expected = [True, False, True, False, False, False, False]
354368 assert_array_dtype_equal(expected, self.g0.contains(self.t1))
355369
356370 expected = [False, True, True, True, True, True, False, False]
357 assert_array_dtype_equal(expected, self.g0.contains(self.g9, align=True))
371 with pytest.warns(UserWarning, match="The indices .+ different"):
372 assert_array_dtype_equal(expected, self.g0.contains(self.g9, align=True))
358373
359374 expected = [False, False, True, False, False, False, False]
360375 assert_array_dtype_equal(expected, self.g0.contains(self.g9, align=False))
378393 assert_array_dtype_equal(expected, self.crossed_lines.crosses(self.l3))
379394
380395 expected = [False] * 8
381 assert_array_dtype_equal(expected, self.g0.crosses(self.g9, align=True))
396 with pytest.warns(UserWarning, match="The indices .+ different"):
397 assert_array_dtype_equal(expected, self.g0.crosses(self.g9, align=True))
382398
383399 expected = [False] * 7
384400 assert_array_dtype_equal(expected, self.g0.crosses(self.g9, align=False))
388404 assert_array_dtype_equal(expected, self.g0.disjoint(self.t1))
389405
390406 expected = [False] * 8
391 assert_array_dtype_equal(expected, self.g0.disjoint(self.g9, align=True))
407 with pytest.warns(UserWarning, match="The indices .+ different"):
408 assert_array_dtype_equal(expected, self.g0.disjoint(self.g9, align=True))
392409
393410 expected = [False, False, False, False, True, False, False]
394411 assert_array_dtype_equal(expected, self.g0.disjoint(self.g9, align=False))
425442 index=range(8),
426443 )
427444
428 assert_array_dtype_equal(expected, self.g0.relate(self.g9, align=True))
445 with pytest.warns(UserWarning, match="The indices .+ different"):
446 assert_array_dtype_equal(expected, self.g0.relate(self.g9, align=True))
429447
430448 expected = Series(
431449 [
451469 assert_array_dtype_equal(expected, self.g6.distance(self.na_none))
452470
453471 expected = Series(np.array([np.nan, 0, 0, 0, 0, 0, np.nan, np.nan]), range(8))
454 assert_array_dtype_equal(expected, self.g0.distance(self.g9, align=True))
472 with pytest.warns(UserWarning, match="The indices .+ different"):
473 assert_array_dtype_equal(expected, self.g0.distance(self.g9, align=True))
455474
456475 val = self.g0.iloc[4].distance(self.g9.iloc[4])
457476 expected = Series(np.array([0, 0, 0, 0, val, np.nan, np.nan]), self.g0.index)
478497 assert_array_dtype_equal(expected, self.g0.intersects(self.empty_poly))
479498
480499 expected = [False, True, True, True, True, True, False, False]
481 assert_array_dtype_equal(expected, self.g0.intersects(self.g9, align=True))
500 with pytest.warns(UserWarning, match="The indices .+ different"):
501 assert_array_dtype_equal(expected, self.g0.intersects(self.g9, align=True))
482502
483503 expected = [True, True, True, True, False, False, False]
484504 assert_array_dtype_equal(expected, self.g0.intersects(self.g9, align=False))
491511 assert_array_dtype_equal(expected, self.g4.overlaps(self.t1))
492512
493513 expected = [False] * 8
494 assert_array_dtype_equal(expected, self.g0.overlaps(self.g9, align=True))
514 with pytest.warns(UserWarning, match="The indices .+ different"):
515 assert_array_dtype_equal(expected, self.g0.overlaps(self.g9, align=True))
495516
496517 expected = [False] * 7
497518 assert_array_dtype_equal(expected, self.g0.overlaps(self.g9, align=False))
501522 assert_array_dtype_equal(expected, self.g0.touches(self.t1))
502523
503524 expected = [False] * 8
504 assert_array_dtype_equal(expected, self.g0.touches(self.g9, align=True))
525 with pytest.warns(UserWarning, match="The indices .+ different"):
526 assert_array_dtype_equal(expected, self.g0.touches(self.g9, align=True))
505527
506528 expected = [True, False, False, True, False, False, False]
507529 assert_array_dtype_equal(expected, self.g0.touches(self.g9, align=False))
514536 assert_array_dtype_equal(expected, self.g0.within(self.sq))
515537
516538 expected = [False, True, True, True, True, True, False, False]
517 assert_array_dtype_equal(expected, self.g0.within(self.g9, align=True))
539 with pytest.warns(UserWarning, match="The indices .+ different"):
540 assert_array_dtype_equal(expected, self.g0.within(self.g9, align=True))
518541
519542 expected = [False, True, False, False, False, False, False]
520543 assert_array_dtype_equal(expected, self.g0.within(self.g9, align=False))
531554 assert_series_equal(res, exp)
532555
533556 expected = [False, True, True, True, True, True, False, False]
534 assert_array_dtype_equal(expected, self.g0.covers(self.g9, align=True))
557 with pytest.warns(UserWarning, match="The indices .+ different"):
558 assert_array_dtype_equal(expected, self.g0.covers(self.g9, align=True))
535559
536560 expected = [False, False, True, False, False, False, False]
537561 assert_array_dtype_equal(expected, self.g0.covers(self.g9, align=False))
551575 assert_series_equal(res, exp)
552576
553577 expected = [False, True, True, True, True, True, False, False]
554 assert_array_dtype_equal(expected, self.g0.covered_by(self.g9, align=True))
578 with pytest.warns(UserWarning, match="The indices .+ different"):
579 assert_array_dtype_equal(expected, self.g0.covered_by(self.g9, align=True))
555580
556581 expected = [False, True, False, False, False, False, False]
557582 assert_array_dtype_equal(expected, self.g0.covered_by(self.g9, align=False))
564589 expected = Series(np.array([False] * len(self.g1)), self.g1.index)
565590 self._test_unary_real("is_empty", expected, self.g1)
566591
592 # for is_ring we raise a warning about the value for Polygon changing
593 @pytest.mark.filterwarnings("ignore:is_ring:FutureWarning")
567594 def test_is_ring(self):
568595 expected = Series(np.array([True] * len(self.g1)), self.g1.index)
569596 self._test_unary_real("is_ring", expected, self.g1)
677704
678705 s = GeoSeries([Point(2, 2), Point(0.5, 0.5)], index=[1, 2])
679706 expected = Series([np.nan, 2.0, np.nan])
680 assert_series_equal(self.g5.project(s), expected)
707 with pytest.warns(UserWarning, match="The indices .+ different"):
708 assert_series_equal(self.g5.project(s), expected)
681709
682710 expected = Series([2.0, 0.5], index=self.g5.index)
683711 assert_series_equal(self.g5.project(s, align=False), expected)
832860 index=MultiIndex.from_tuples(index, names=expected_index_name),
833861 crs=4326,
834862 )
835 assert_geoseries_equal(expected, s.explode())
863 with pytest.warns(FutureWarning, match="Currently, index_parts defaults"):
864 assert_geoseries_equal(expected, s.explode())
836865
837866 @pytest.mark.parametrize("index_name", [None, "test"])
838867 def test_explode_geodataframe(self, index_name):
840869 df = GeoDataFrame({"col": [1, 2], "geometry": s})
841870 df.index.name = index_name
842871
843 test_df = df.explode()
872 with pytest.warns(FutureWarning, match="Currently, index_parts defaults"):
873 test_df = df.explode()
844874
845875 expected_s = GeoSeries([Point(1, 2), Point(2, 3), Point(5, 5)])
846876 expected_df = GeoDataFrame({"col": [1, 1, 2], "geometry": expected_s})
859889 df = GeoDataFrame({"level_1": [1, 2], "geometry": s})
860890 df.index.name = index_name
861891
862 test_df = df.explode()
892 test_df = df.explode(index_parts=True)
863893
864894 expected_s = GeoSeries([Point(1, 2), Point(2, 3), Point(5, 5)])
865895 expected_df = GeoDataFrame({"level_1": [1, 1, 2], "geometry": expected_s})
871901 expected_df = expected_df.set_index(expected_index)
872902 assert_frame_equal(test_df, expected_df)
873903
874 @pytest.mark.skipif(
875 not compat.PANDAS_GE_025,
876 reason="pandas explode introduced in pandas 0.25",
877 )
904 @pytest.mark.parametrize("index_name", [None, "test"])
905 def test_explode_geodataframe_no_multiindex(self, index_name):
906 # GH1393
907 s = GeoSeries([MultiPoint([Point(1, 2), Point(2, 3)]), Point(5, 5)])
908 df = GeoDataFrame({"level_1": [1, 2], "geometry": s})
909 df.index.name = index_name
910
911 test_df = df.explode(index_parts=False)
912
913 expected_s = GeoSeries([Point(1, 2), Point(2, 3), Point(5, 5)])
914 expected_df = GeoDataFrame({"level_1": [1, 1, 2], "geometry": expected_s})
915
916 expected_index = Index([0, 0, 1], name=index_name)
917 expected_df = expected_df.set_index(expected_index)
918 assert_frame_equal(test_df, expected_df)
919
878920 def test_explode_pandas_fallback(self):
879921 d = {
880922 "col1": [["name1", "name2"], ["name3", "name4"]],
881 "geometry": [
882 MultiPoint([(1, 2), (3, 4)]),
883 MultiPoint([(2, 1), (0, 0)]),
884 ],
923 "geometry": [MultiPoint([(1, 2), (3, 4)]), MultiPoint([(2, 1), (0, 0)])],
885924 }
886925 gdf = GeoDataFrame(d, crs=4326)
887926 expected_df = GeoDataFrame(
913952 def test_explode_pandas_fallback_ignore_index(self):
914953 d = {
915954 "col1": [["name1", "name2"], ["name3", "name4"]],
916 "geometry": [
917 MultiPoint([(1, 2), (3, 4)]),
918 MultiPoint([(2, 1), (0, 0)]),
919 ],
955 "geometry": [MultiPoint([(1, 2), (3, 4)]), MultiPoint([(2, 1), (0, 0)])],
920956 }
921957 gdf = GeoDataFrame(d, crs=4326)
922958 expected_df = GeoDataFrame(
940976 exploded_df = gdf.explode(column="col1", ignore_index=True)
941977 assert_geodataframe_equal(exploded_df, expected_df)
942978
979 @pytest.mark.parametrize("outer_index", [1, (1, 2), "1"])
980 def test_explode_pandas_multi_index(self, outer_index):
981 index = MultiIndex.from_arrays(
982 [[outer_index, outer_index, outer_index], [1, 2, 3]],
983 names=("first", "second"),
984 )
985 df = GeoDataFrame(
986 {"vals": [1, 2, 3]},
987 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(3)],
988 index=index,
989 )
990
991 test_df = df.explode(index_parts=True)
992
993 expected_s = GeoSeries(
994 [
995 Point(0, 0),
996 Point(0, 0),
997 Point(1, 1),
998 Point(1, 0),
999 Point(2, 2),
1000 Point(2, 0),
1001 ]
1002 )
1003 expected_df = GeoDataFrame({"vals": [1, 1, 2, 2, 3, 3], "geometry": expected_s})
1004 expected_index = MultiIndex.from_tuples(
1005 [
1006 (outer_index, *pair)
1007 for pair in [(1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
1008 ],
1009 names=["first", "second", None],
1010 )
1011 expected_df = expected_df.set_index(expected_index)
1012 assert_frame_equal(test_df, expected_df)
1013
1014 @pytest.mark.parametrize("outer_index", [1, (1, 2), "1"])
1015 def test_explode_pandas_multi_index_false(self, outer_index):
1016 index = MultiIndex.from_arrays(
1017 [[outer_index, outer_index, outer_index], [1, 2, 3]],
1018 names=("first", "second"),
1019 )
1020 df = GeoDataFrame(
1021 {"vals": [1, 2, 3]},
1022 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(3)],
1023 index=index,
1024 )
1025
1026 test_df = df.explode(index_parts=False)
1027
1028 expected_s = GeoSeries(
1029 [
1030 Point(0, 0),
1031 Point(0, 0),
1032 Point(1, 1),
1033 Point(1, 0),
1034 Point(2, 2),
1035 Point(2, 0),
1036 ]
1037 )
1038 expected_df = GeoDataFrame({"vals": [1, 1, 2, 2, 3, 3], "geometry": expected_s})
1039 expected_index = MultiIndex.from_tuples(
1040 [
1041 (outer_index, 1),
1042 (outer_index, 1),
1043 (outer_index, 2),
1044 (outer_index, 2),
1045 (outer_index, 3),
1046 (outer_index, 3),
1047 ],
1048 names=["first", "second"],
1049 )
1050 expected_df = expected_df.set_index(expected_index)
1051 assert_frame_equal(test_df, expected_df)
1052
1053 @pytest.mark.parametrize("outer_index", [1, (1, 2), "1"])
1054 def test_explode_pandas_multi_index_ignore_index(self, outer_index):
1055 index = MultiIndex.from_arrays(
1056 [[outer_index, outer_index, outer_index], [1, 2, 3]],
1057 names=("first", "second"),
1058 )
1059 df = GeoDataFrame(
1060 {"vals": [1, 2, 3]},
1061 geometry=[MultiPoint([(x, x), (x, 0)]) for x in range(3)],
1062 index=index,
1063 )
1064
1065 test_df = df.explode(ignore_index=True)
1066
1067 expected_s = GeoSeries(
1068 [
1069 Point(0, 0),
1070 Point(0, 0),
1071 Point(1, 1),
1072 Point(1, 0),
1073 Point(2, 2),
1074 Point(2, 0),
1075 ]
1076 )
1077 expected_df = GeoDataFrame({"vals": [1, 1, 2, 2, 3, 3], "geometry": expected_s})
1078 expected_index = Index(range(len(expected_df)))
1079 expected_df = expected_df.set_index(expected_index)
1080 assert_frame_equal(test_df, expected_df)
1081
1082 # index_parts is ignored if ignore_index=True
1083 test_df = df.explode(ignore_index=True, index_parts=True)
1084 assert_frame_equal(test_df, expected_df)
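
Taken together, the explode tests above fix the index contract: index_parts=True appends a part-counter level, index_parts=False repeats the original labels, and ignore_index=True resets the index outright (and overrides index_parts). A compact sketch:

from shapely.geometry import MultiPoint
from geopandas import GeoSeries

s = GeoSeries([MultiPoint([(0, 0), (1, 1)])], index=["a"])

s.explode(index_parts=True).index.tolist()   # [("a", 0), ("a", 1)]
s.explode(index_parts=False).index.tolist()  # ["a", "a"]
s.explode(ignore_index=True).index.tolist()  # [0, 1]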
1085
9431086 #
9441087 # Test '&', '|', '^', and '-'
9451088 #
66 import numpy as np
77 from numpy.testing import assert_array_equal
88 import pandas as pd
9 from pandas.util.testing import assert_index_equal
910
1011 from pyproj import CRS
1112 from shapely.geometry import (
1819 )
1920 from shapely.geometry.base import BaseGeometry
2021
21 from geopandas import GeoSeries, GeoDataFrame
22 from geopandas._compat import PYPROJ_LT_3
22 from geopandas import GeoSeries, GeoDataFrame, read_file, datasets, clip
23 from geopandas._compat import PYPROJ_LT_3, ignore_shapely2_warnings
2324 from geopandas.array import GeometryArray, GeometryDtype
2425 from geopandas.testing import assert_geoseries_equal
2526
171172 assert_series_equal(res, exp)
172173
173174 def test_to_file(self):
174 """ Test to_file and from_file """
175 """Test to_file and from_file"""
175176 tempfilename = os.path.join(self.tempdir, "test.shp")
176177 self.g3.to_file(tempfilename)
177178 # Read layer back in?
238239 # self.na_none.fillna(method='backfill')
239240
240241 def test_coord_slice(self):
241 """ Test CoordinateSlicer """
242 """Test CoordinateSlicer"""
242243 # need some better test cases
243244 assert geom_equals(self.g3, self.g3.cx[:, :])
244245 assert geom_equals(self.g3[[True, False]], self.g3.cx[0.9:, :0.1])
261262
262263 def test_proj4strings(self):
263264 # As string
264 reprojected = self.g3.to_crs("+proj=utm +zone=30N")
265 reprojected = self.g3.to_crs("+proj=utm +zone=30")
265266 reprojected_back = reprojected.to_crs(epsg=4326)
266267 assert np.all(self.g3.geom_almost_equals(reprojected_back))
267268
268269 # As dict
269 reprojected = self.g3.to_crs({"proj": "utm", "zone": "30N"})
270 reprojected = self.g3.to_crs({"proj": "utm", "zone": "30"})
270271 reprojected_back = reprojected.to_crs(epsg=4326)
271272 assert np.all(self.g3.geom_almost_equals(reprojected_back))
272273
273274 # Set to equivalent string, convert, compare to original
274275 copy = self.g3.copy()
275276 copy.crs = "epsg:4326"
276 reprojected = copy.to_crs({"proj": "utm", "zone": "30N"})
277 reprojected = copy.to_crs({"proj": "utm", "zone": "30"})
277278 reprojected_back = reprojected.to_crs(epsg=4326)
278279 assert np.all(self.g3.geom_almost_equals(reprojected_back))
279280
280281 # Conversions by different format
281 reprojected_string = self.g3.to_crs("+proj=utm +zone=30N")
282 reprojected_dict = self.g3.to_crs({"proj": "utm", "zone": "30N"})
282 reprojected_string = self.g3.to_crs("+proj=utm +zone=30")
283 reprojected_dict = self.g3.to_crs({"proj": "utm", "zone": "30"})
283284 assert np.all(reprojected_string.geom_almost_equals(reprojected_dict))
284285
285286 def test_from_wkb(self):
321322 def test_to_wkt(self):
322323 assert_series_equal(pd.Series([self.t1.wkt, self.sq.wkt]), self.g1.to_wkt())
323324
325 @pytest.mark.skip_no_sindex
326 def test_clip(self):
327 left = read_file(datasets.get_path("naturalearth_cities"))
328 world = read_file(datasets.get_path("naturalearth_lowres"))
329 south_america = world[world["continent"] == "South America"]
330
331 expected = clip(left.geometry, south_america)
332 result = left.geometry.clip(south_america)
333 assert_geoseries_equal(result, expected)
334
335 def test_from_xy_points(self):
336 x = self.landmarks.x.values
337 y = self.landmarks.y.values
338 index = self.landmarks.index.tolist()
339 crs = self.landmarks.crs
340 assert_geoseries_equal(
341 self.landmarks, GeoSeries.from_xy(x, y, index=index, crs=crs)
342 )
343 assert_geoseries_equal(
344 self.landmarks,
345 GeoSeries.from_xy(self.landmarks.x, self.landmarks.y, crs=crs),
346 )
347
348 def test_from_xy_points_w_z(self):
349 index_values = [5, 6, 7]
350 x = pd.Series([0, -1, 2], index=index_values)
351 y = pd.Series([8, 3, 1], index=index_values)
352 z = pd.Series([5, -6, 7], index=index_values)
353 expected = GeoSeries(
354 [Point(0, 8, 5), Point(-1, 3, -6), Point(2, 1, 7)], index=index_values
355 )
356 assert_geoseries_equal(expected, GeoSeries.from_xy(x, y, z))
357
358 def test_from_xy_points_unequal_index(self):
359 x = self.landmarks.x
360 y = self.landmarks.y
361 y.index = -np.arange(len(y))
362 crs = self.landmarks.crs
363 assert_geoseries_equal(
364 self.landmarks, GeoSeries.from_xy(x, y, index=x.index, crs=crs)
365 )
366 unindexed_landmarks = self.landmarks.copy()
367 unindexed_landmarks.reset_index(inplace=True, drop=True)
368 assert_geoseries_equal(
369 unindexed_landmarks,
370 GeoSeries.from_xy(x, y, crs=crs),
371 )
372
373 def test_from_xy_points_indexless(self):
374 x = np.array([0.0, 3.0])
375 y = np.array([2.0, 5.0])
376 z = np.array([-1.0, 4.0])
377 expected = GeoSeries([Point(0, 2, -1), Point(3, 5, 4)])
378 assert_geoseries_equal(expected, GeoSeries.from_xy(x, y, z))
379
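GeoSeries.from_xy is new in this release; the tests above cover its index, CRS and z handling. A minimal sketch:

import numpy as np
from geopandas import GeoSeries

x = np.array([0.0, 3.0])
y = np.array([2.0, 5.0])
z = np.array([-1.0, 4.0])

# builds Point (here Point Z) geometries element-wise; index and crs are optional
s = GeoSeries.from_xy(x, y, z, crs="EPSG:4326")
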
324380
325381 def test_missing_values_empty_warning():
326382 s = GeoSeries([Point(1, 1), None, np.nan, BaseGeometry(), Polygon()])
354410 assert len(s.dropna()) == 3
355411
356412
413 def test_isna_empty_geoseries():
414 # ensure that isna() result for empty GeoSeries has the correct bool dtype
415 s = GeoSeries([])
416 result = s.isna()
417 assert_series_equal(result, pd.Series([], dtype="bool"))
418
419
357420 def test_geoseries_crs():
358421 gs = GeoSeries()
359422 gs.crs = "IGNF:ETRS89UTM28"
433496 s = GeoSeries(index=range(3))
434497 check_geoseries(s)
435498
499 def test_empty_array(self):
500 # with empty data that has an explicit dtype, whether we fall back to a
501 # plain pandas Series depends on the dtype
502 arr = np.array([], dtype="bool")
503
504 # dtypes that can never hold geometry-like data
505 for arr in [
506 np.array([], dtype="bool"),
507 np.array([], dtype="int64"),
508 np.array([], dtype="float32"),
509 # this gets converted to object dtype by pandas
510 # np.array([], dtype="str"),
511 ]:
512 with pytest.warns(FutureWarning):
513 s = GeoSeries(arr)
514 assert not isinstance(s, GeoSeries)
515 assert type(s) == pd.Series
516
517 # dtypes that can potentially hold geometry-like data (object) or
518 # can come from empty data (float64)
519 for arr in [
520 np.array([], dtype="object"),
521 np.array([], dtype="float64"),
522 np.array([], dtype="str"),
523 ]:
524 with pytest.warns(None) as record:
525 s = GeoSeries(arr)
526 assert not record
527 assert isinstance(s, GeoSeries)
528
436529 def test_from_series(self):
437530 shapes = [
438531 Polygon([(random.random(), random.random()) for _ in range(3)])
439532 for _ in range(10)
440533 ]
441 s = pd.Series(shapes, index=list("abcdefghij"), name="foo")
534 with ignore_shapely2_warnings():
535 # the warning here is not suppressed by GeoPandas, as this is a pure
536 # pandas construction call
537 s = pd.Series(shapes, index=list("abcdefghij"), name="foo")
442538 g = GeoSeries(s)
443539 check_geoseries(g)
444540
451547 s = GeoSeries(
452548 [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
453549 )
454 s = s.explode()
550 s = s.explode(index_parts=True)
455551 df = s.reset_index()
456552 assert type(df) == GeoDataFrame
553
554 def test_explode_without_multiindex(self):
555 s = GeoSeries(
556 [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
557 )
558 s = s.explode(index_parts=False)
559 expected_index = pd.Index([0, 0, 1, 1, 1])
560 assert_index_equal(s.index, expected_index)
561
562 def test_explode_ignore_index(self):
563 s = GeoSeries(
564 [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
565 )
566 s = s.explode(ignore_index=True)
567 expected_index = pd.Index(range(len(s)))
569 assert_index_equal(s.index, expected_index)
570
571 # index_parts is ignored if ignore_index=True
572 s = s.explode(index_parts=True, ignore_index=True)
573 assert_index_equal(s.index, expected_index)
00 import pandas as pd
1 import pytest
2 from geopandas.testing import assert_geodataframe_equal
13
24 from shapely.geometry import Point
35
4749 assert isinstance(res, GeoDataFrame)
4850 assert isinstance(res.geometry, GeoSeries)
4951 self._check_metadata(res)
52 exp = GeoDataFrame(pd.concat([pd.DataFrame(self.gdf), pd.DataFrame(self.gdf)]))
53 assert_geodataframe_equal(exp, res)
54 # check metadata comes from first gdf
55 res4 = pd.concat([self.gdf.set_crs("epsg:4326"), self.gdf], axis=0)
56 # Note: this behaviour potentially does not make sense. If geom cols are
57 # concatenated but have different CRS, the first frame's CRS silently wins.
58 self._check_metadata(res4, crs="epsg:4326")
5059
5160 # series
5261 res = pd.concat([self.gdf.geometry, self.gdf.geometry])
6271 assert isinstance(res, GeoDataFrame)
6372 assert isinstance(res.geometry, GeoSeries)
6473 self._check_metadata(res)
74
75 def test_concat_axis1_multiple_geodataframes(self):
76 # https://github.com/geopandas/geopandas/issues/1230
77 # Expect that concat should fail gracefully if duplicate column names belonging
78 # to geometry columns are introduced.
79 expected_err = (
80 "GeoDataFrame does not support multiple columns using the geometry"
81 " column name 'geometry'"
82 )
83 with pytest.raises(ValueError, match=expected_err):
84 pd.concat([self.gdf, self.gdf], axis=1)
85
86 # Check case is handled if custom geometry column name is used
87 df2 = self.gdf.rename_geometry("geom")
88 expected_err2 = (
89 "Concat operation has resulted in multiple columns using the geometry "
90 "column name 'geom'."
91 )
92 with pytest.raises(ValueError, match=expected_err2):
93 pd.concat([df2, df2], axis=1)
94
95 # Check that two geometry columns is fine, if they have different names
96 res3 = pd.concat([df2.set_crs("epsg:4326"), self.gdf], axis=1)
97 # check metadata comes from first df
98 self._check_metadata(res3, geometry_column_name="geom", crs="epsg:4326")
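
The concat tests above establish two rules: along axis=0 the result is a GeoDataFrame whose metadata (CRS) comes from the first frame, and along axis=1 duplicated geometry column names raise. A sketch, with the error text quoted from the test:

import pandas as pd
from shapely.geometry import Point
from geopandas import GeoDataFrame

gdf = GeoDataFrame({"geometry": [Point(0, 0)]}, crs="EPSG:4326")

stacked = pd.concat([gdf, gdf])  # GeoDataFrame; CRS taken from the first frame
try:
    pd.concat([gdf, gdf], axis=1)  # would create two "geometry" columns
except ValueError as err:
    # GeoDataFrame does not support multiple columns using the geometry
    # column name 'geometry'
    print(err)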
00 import os
1 from distutils.version import LooseVersion
12
23 import pandas as pd
34
67
78 import geopandas
89 from geopandas import GeoDataFrame, GeoSeries, overlay, read_file
10 from geopandas import _compat
911
1012 from geopandas.testing import assert_geodataframe_equal, assert_geoseries_equal
1113 import pytest
1416
1517
1618 pytestmark = pytest.mark.skip_no_sindex
19 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
1720
1821
1922 @pytest.fixture
5053 params=["union", "intersection", "difference", "symmetric_difference", "identity"]
5154 )
5255 def how(request):
56 if pandas_133 and request.param in ["symmetric_difference", "identity", "union"]:
57 pytest.xfail("Regression in pandas 1.3.3 (GH #2101)")
5358 return request.param
5459
5560
182187
183188 # first, check that all bounds and areas are approx equal
184189 # this is a very rough check for multipolygon equality
190 if not _compat.PANDAS_GE_11:
191 kwargs = dict(check_less_precise=True)
192 else:
193 kwargs = {}
185194 pd.testing.assert_series_equal(
186 result.geometry.area, expected.geometry.area, check_less_precise=True
195 result.geometry.area, expected.geometry.area, **kwargs
187196 )
188197 pd.testing.assert_frame_equal(
189 result.geometry.bounds, expected.geometry.bounds, check_less_precise=True
198 result.geometry.bounds, expected.geometry.bounds, **kwargs
190199 )
191200
192201 # There are two cases where the multipolygon have a different number
318327 overlay(df1, df2, how="spandex")
319328
320329
321 def test_duplicate_column_name(dfs):
330 def test_duplicate_column_name(dfs, how):
331 if how == "difference":
332 pytest.skip("Difference uses columns from one df only.")
322333 df1, df2 = dfs
323334 df2r = df2.rename(columns={"col2": "col1"})
324 res = overlay(df1, df2r, how="union")
335 res = overlay(df1, df2r, how=how)
325336 assert ("col1_1" in res.columns) and ("col1_2" in res.columns)
326337
327338
563574 df1 = read_file(os.path.join(DATA, "geom_type", "df1.geojson"))
564575 df2 = read_file(os.path.join(DATA, "geom_type", "df2.geojson"))
565576
577 with pytest.warns(UserWarning, match="`keep_geom_type=True` in overlay"):
578 intersection = overlay(df1, df2, keep_geom_type=None)
579 assert len(intersection) == 1
580 assert (intersection.geom_type == "Polygon").all()
581
566582 intersection = overlay(df1, df2, keep_geom_type=True)
567583 assert len(intersection) == 1
568584 assert (intersection.geom_type == "Polygon").all()
570586 intersection = overlay(df1, df2, keep_geom_type=False)
571587 assert len(intersection) == 1
572588 assert (intersection.geom_type == "GeometryCollection").all()
589
590
591 def test_keep_geom_type_geometry_collection2():
592 polys1 = [
593 box(0, 0, 1, 1),
594 box(1, 1, 3, 3).union(box(1, 3, 5, 5)),
595 ]
596
597 polys2 = [
598 box(0, 0, 1, 1),
599 box(3, 1, 4, 2).union(box(4, 1, 5, 4)),
600 ]
601 df1 = GeoDataFrame({"left": [0, 1], "geometry": polys1})
602 df2 = GeoDataFrame({"right": [0, 1], "geometry": polys2})
603
604 result1 = overlay(df1, df2, keep_geom_type=True)
605 expected1 = GeoDataFrame(
606 {
607 "left": [0, 1],
608 "right": [0, 1],
609 "geometry": [box(0, 0, 1, 1), box(4, 3, 5, 4)],
610 }
611 )
612 assert_geodataframe_equal(result1, expected1)
613
614 result1 = overlay(df1, df2, keep_geom_type=False)
615 expected1 = GeoDataFrame(
616 {
617 "left": [0, 1, 1],
618 "right": [0, 0, 1],
619 "geometry": [
620 box(0, 0, 1, 1),
621 Point(1, 1),
622 GeometryCollection([box(4, 3, 5, 4), LineString([(3, 1), (3, 2)])]),
623 ],
624 }
625 )
626 assert_geodataframe_equal(result1, expected1)
573627
574628
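test_keep_geom_type_geometry_collection2 above spells out the filtering rule: keep_geom_type=True keeps only pieces of the inputs' geometry type, while keep_geom_type=False also returns the lower-dimensional intersections (points, lines, GeometryCollections). A trimmed sketch of the same fixture (a spatial-index backend is required, per the module-level pytestmark):

from shapely.geometry import box
from geopandas import GeoDataFrame, overlay

df1 = GeoDataFrame({"left": [0, 1],
                    "geometry": [box(0, 0, 1, 1), box(1, 1, 3, 3).union(box(1, 3, 5, 5))]})
df2 = GeoDataFrame({"right": [0, 1],
                    "geometry": [box(0, 0, 1, 1), box(3, 1, 4, 2).union(box(4, 1, 5, 4))]})

polys_only = overlay(df1, df2, keep_geom_type=True)   # polygons only
everything = overlay(df1, df2, keep_geom_type=False)  # points/lines/collections kept
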
575629 @pytest.mark.parametrize("make_valid", [True, False])
591645 else:
592646 with pytest.raises(ValueError, match="1 invalid input geometries"):
593647 overlay(df1, df_bowtie, make_valid=make_valid)
648
649
650 def test_empty_overlay_return_non_duplicated_columns():
651
652 nybb = geopandas.read_file(geopandas.datasets.get_path("nybb"))
653 nybb2 = nybb.copy()
654 nybb2.geometry = nybb2.translate(20000000)
655
656 result = geopandas.overlay(nybb, nybb2)
657
658 assert all(result.columns.isin(nybb.columns))
659 assert len(result.columns) == len(nybb.columns)
44 import pandas as pd
55
66 import shapely
7 from shapely.geometry import Point, GeometryCollection
7 from shapely.geometry import Point, GeometryCollection, LineString
88
99 import geopandas
1010 from geopandas import GeoDataFrame, GeoSeries
4545 s1 = GeoSeries([p1, p2, None])
4646 assert "POINT (10.12346 50.12346)" in repr(s1)
4747
48 # geographic coordinates 4326
49 s3 = GeoSeries([p1, p2], crs=4326)
50 assert "POINT (10.12346 50.12346)" in repr(s3)
51
4852 # projected coordinates
4953 p1 = Point(3000.123456789, 3000.123456789)
5054 p2 = Point(4000.123456789, 4000.123456789)
5155 s2 = GeoSeries([p1, p2, None])
5256 assert "POINT (3000.123 3000.123)" in repr(s2)
57
58 # projected geographic coordinate
59 s4 = GeoSeries([p1, p2], crs=3857)
60 assert "POINT (3000.123 3000.123)" in repr(s4)
5361
5462 geopandas.options.display_precision = 1
5563 assert "POINT (10.1 50.1)" in repr(s1)
7078 def test_repr_empty():
7179 # https://github.com/geopandas/geopandas/issues/1195
7280 s = GeoSeries([])
73 if compat.PANDAS_GE_025:
74 # repr with correct name fixed in pandas 0.25
75 assert repr(s) == "GeoSeries([], dtype: geometry)"
76 else:
77 assert repr(s) == "Series([], dtype: geometry)"
81 assert repr(s) == "GeoSeries([], dtype: geometry)"
7882 df = GeoDataFrame({"a": [], "geometry": s})
7983 assert "Empty GeoDataFrame" in repr(df)
8084 # https://github.com/geopandas/geopandas/issues/1184
255259 res = df.astype(object)
256260 assert isinstance(res, pd.DataFrame) and not isinstance(res, GeoDataFrame)
257261 assert res["a"].dtype == object
262
263
264 @pytest.mark.xfail(
265 not compat.PANDAS_GE_10,
266 reason="Convert dtypes new in pandas 1.0",
267 raises=NotImplementedError,
268 )
269 def test_convert_dtypes(df):
270 # https://github.com/geopandas/geopandas/issues/1870
271
272 # Test geometry col is first col, first, geom_col_name=geometry
273 # (order is important in concat, used internally)
274 res1 = df.convert_dtypes() # note res1 done first for pandas < 1 xfail check
275
276 expected1 = GeoDataFrame(
277 pd.DataFrame(df).convert_dtypes(), crs=df.crs, geometry=df.geometry.name
278 )
279
280 # Checking type and metadata are right
281 assert_geodataframe_equal(expected1, res1)
282
283 # Test geom last, geom_col_name=geometry
284 res2 = df[["value1", "value2", "geometry"]].convert_dtypes()
285 assert_geodataframe_equal(expected1[["value1", "value2", "geometry"]], res2)
286
287 # Test again with crs set and custom geom col name
288 df2 = df.set_crs(epsg=4326).rename_geometry("points")
289 expected2 = GeoDataFrame(
290 pd.DataFrame(df2).convert_dtypes(), crs=df2.crs, geometry=df2.geometry.name
291 )
292 res3 = df2.convert_dtypes()
293 assert_geodataframe_equal(expected2, res3)
294
295 # Test geom last, geom_col=geometry
296 res4 = df2[["value1", "value2", "points"]].convert_dtypes()
297 assert_geodataframe_equal(expected2[["value1", "value2", "points"]], res4)
258298
259299
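The convert_dtypes test above checks that the pandas >= 1.0 method preserves the GeoDataFrame subclass, the CRS and a custom geometry column name; in sketch form:

from shapely.geometry import Point
from geopandas import GeoDataFrame

gdf = GeoDataFrame(
    {"value1": [1, 2], "geometry": [Point(0, 0), Point(1, 1)]}, crs="EPSG:4326"
)
res = gdf.convert_dtypes()  # requires pandas >= 1.0

assert isinstance(res, GeoDataFrame)
assert res.crs == gdf.crs
assert str(res["value1"].dtype) == "Int64"  # pandas nullable integer
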
260300 def test_to_csv(df):
403443 assert_array_equal(s.unique(), exp)
404444
405445
406 @pytest.mark.xfail
407446 def test_value_counts():
408447 # each object is considered unique
409448 s = GeoSeries([Point(0, 0), Point(1, 1), Point(0, 0)])
410449 res = s.value_counts()
411 exp = pd.Series([2, 1], index=[Point(0, 0), Point(1, 1)])
450 with compat.ignore_shapely2_warnings():
451 exp = pd.Series([2, 1], index=[Point(0, 0), Point(1, 1)])
412452 assert_series_equal(res, exp)
453 # Check crs doesn't make a difference - note it is not kept in output index anyway
454 s2 = GeoSeries([Point(0, 0), Point(1, 1), Point(0, 0)], crs="EPSG:4326")
455 res2 = s2.value_counts()
456 assert_series_equal(res2, exp)
457
458 # check mixed geometry
459 s3 = GeoSeries([Point(0, 0), LineString([[1, 1], [2, 2]]), Point(0, 0)])
460 res3 = s3.value_counts()
461 with compat.ignore_shapely2_warnings():
462 exp3 = pd.Series([2, 1], index=[Point(0, 0), LineString([[1, 1], [2, 2]])])
463 assert_series_equal(res3, exp3)
464
465 # check None is handled
466 s4 = GeoSeries([Point(0, 0), None, Point(0, 0)])
467 res4 = s4.value_counts(dropna=True)
468 with compat.ignore_shapely2_warnings():
469 exp4_dropna = pd.Series([2], index=[Point(0, 0)])
470 assert_series_equal(res4, exp4_dropna)
471 with compat.ignore_shapely2_warnings():
472 exp4_keepna = pd.Series([2, 1], index=[Point(0, 0), None])
473 res4_keepna = s4.value_counts(dropna=False)
474 assert_series_equal(res4_keepna, exp4_keepna)
413475
414476
415477 @pytest.mark.xfail(strict=False)
453515 assert_frame_equal(res, exp)
454516
455517 # applying on the geometry column
456 res = df.groupby("value2")["geometry"].apply(lambda x: x.cascaded_union)
518 res = df.groupby("value2")["geometry"].apply(lambda x: x.unary_union)
457519 if compat.PANDAS_GE_11:
458520 exp = GeoSeries(
459521 [shapely.geometry.MultiPoint([(0, 0), (2, 2)]), Point(1, 1)],
534596 assert_frame_equal(result, expected)
535597
536598
599 def test_apply_preserves_geom_col_name(df):
600 df = df.rename_geometry("geom")
601 result = df.apply(lambda col: col, axis=0)
602 assert result.geometry.name == "geom"
603
604
537605 @pytest.mark.skipif(not compat.PANDAS_GE_10, reason="attrs introduced in pandas 1.0")
538606 def test_preserve_attrs(df):
539607 # https://github.com/geopandas/geopandas/issues/1654
549617 df2 = df.reset_index()
550618 assert df2.attrs == attrs
551619
620 # https://github.com/geopandas/geopandas/issues/1875
621 df3 = df2.explode(index_parts=True)
622 assert df3.attrs == attrs
623
552624
553625 @pytest.mark.skipif(not compat.PANDAS_GE_12, reason="attrs introduced in pandas 1.0")
554626 def test_preserve_flags(df):
2121 from geopandas import GeoDataFrame, GeoSeries, read_file
2222 from geopandas.datasets import get_path
2323 import geopandas._compat as compat
24 from geopandas.plotting import GeoplotAccessor
2425
2526 import pytest
2627
3132 try: # skipif and importorskip do not work for decorators
3233 from matplotlib.testing.decorators import check_figures_equal
3334
34 MPL_DECORATORS = True
35 if matplotlib.__version__ >= LooseVersion("3.3.0"):
36
37 MPL_DECORATORS = True
38 else:
39 MPL_DECORATORS = False
3540 except ImportError:
3641 MPL_DECORATORS = False
3742
510515 self.df.plot(linestyle=ls, linewidth=1),
511516 self.df.plot(column="values", linestyle=ls, linewidth=1),
512517 ]:
513 np.testing.assert_array_equal(exp_ls, ax.collections[0].get_linestyle())
518 assert exp_ls == ax.collections[0].get_linestyle()
514519
515520 def test_style_kwargs_linewidth(self):
516521 # single
544549 np.linspace(0, 0.0, 1.0, self.N), ax.collections[0].get_alpha()
545550 )
546551
552 def test_style_kwargs_path_effects(self):
553 from matplotlib.patheffects import withStroke
554
555 effects = [withStroke(linewidth=8, foreground="b")]
556 ax = self.df.plot(color="orange", path_effects=effects)
557 assert ax.collections[0].get_path_effects()[0].__dict__["_gc"] == {
558 "linewidth": 8,
559 "foreground": "b",
560 }
561
547562 def test_subplots_norm(self):
548563 # colors of subplots are the same as for plot (norm is applied)
549564 cmap = matplotlib.cm.viridis_r
941956 self.series.plot(linestyles=ls, linewidth=1),
942957 self.df.plot(linestyles=ls, linewidth=1),
943958 ]:
944 np.testing.assert_array_equal(exp_ls, ax.collections[0].get_linestyle())
959 assert exp_ls == ax.collections[0].get_linestyle()
945960
946961 def test_style_kwargs_linewidth(self):
947962 # single
10531068 pth = get_path("naturalearth_lowres")
10541069 cls.df = read_file(pth)
10551070 cls.df["NEGATIVES"] = np.linspace(-10, 10, len(cls.df.index))
1071 cls.df["low_vals"] = np.linspace(0, 0.3, cls.df.shape[0])
1072 cls.df["mid_vals"] = np.linspace(0.3, 0.7, cls.df.shape[0])
1073 cls.df["high_vals"] = np.linspace(0.7, 1.0, cls.df.shape[0])
1074 cls.df.loc[cls.df.index[:20:2], "high_vals"] = np.nan
10561075
10571076 def test_legend(self):
10581077 with warnings.catch_warnings(record=True) as _: # don't print warning
11931212 legend_height = _get_ax(fig, "fixed_colorbar").get_position().height
11941213 assert abs(plot_height - legend_height) < 1e-6
11951214
1215 def test_empty_bins(self):
1216 bins = np.arange(1, 11) / 10
1217 ax = self.df.plot(
1218 "low_vals",
1219 scheme="UserDefined",
1220 classification_kwds={"bins": bins},
1221 legend=True,
1222 )
1223 expected = np.array(
1224 [
1225 [0.281412, 0.155834, 0.469201, 1.0],
1226 [0.267004, 0.004874, 0.329415, 1.0],
1227 [0.244972, 0.287675, 0.53726, 1.0],
1228 ]
1229 )
1230 assert all(
1231 [
1232 (z == expected).all(axis=1).any()
1233 for z in ax.collections[0].get_facecolors()
1234 ]
1235 )
1236 labels = [
1237 "0.00, 0.10",
1238 "0.10, 0.20",
1239 "0.20, 0.30",
1240 "0.30, 0.40",
1241 "0.40, 0.50",
1242 "0.50, 0.60",
1243 "0.60, 0.70",
1244 "0.70, 0.80",
1245 "0.80, 0.90",
1246 "0.90, 1.00",
1247 ]
1248 legend = [t.get_text() for t in ax.get_legend().get_texts()]
1249 assert labels == legend
1250
1251 legend_colors_exp = [
1252 (0.267004, 0.004874, 0.329415, 1.0),
1253 (0.281412, 0.155834, 0.469201, 1.0),
1254 (0.244972, 0.287675, 0.53726, 1.0),
1255 (0.190631, 0.407061, 0.556089, 1.0),
1256 (0.147607, 0.511733, 0.557049, 1.0),
1257 (0.119699, 0.61849, 0.536347, 1.0),
1258 (0.20803, 0.718701, 0.472873, 1.0),
1259 (0.430983, 0.808473, 0.346476, 1.0),
1260 (0.709898, 0.868751, 0.169257, 1.0),
1261 (0.993248, 0.906157, 0.143936, 1.0),
1262 ]
1263
1264 assert [
1265 line.get_markerfacecolor() for line in ax.get_legend().get_lines()
1266 ] == legend_colors_exp
1267
1268 ax2 = self.df.plot(
1269 "mid_vals",
1270 scheme="UserDefined",
1271 classification_kwds={"bins": bins},
1272 legend=True,
1273 )
1274 expected = np.array(
1275 [
1276 [0.244972, 0.287675, 0.53726, 1.0],
1277 [0.190631, 0.407061, 0.556089, 1.0],
1278 [0.147607, 0.511733, 0.557049, 1.0],
1279 [0.119699, 0.61849, 0.536347, 1.0],
1280 [0.20803, 0.718701, 0.472873, 1.0],
1281 ]
1282 )
1283 assert all(
1284 [
1285 (z == expected).all(axis=1).any()
1286 for z in ax2.collections[0].get_facecolors()
1287 ]
1288 )
1289
1290 labels = [
1291 "-inf, 0.10",
1292 "0.10, 0.20",
1293 "0.20, 0.30",
1294 "0.30, 0.40",
1295 "0.40, 0.50",
1296 "0.50, 0.60",
1297 "0.60, 0.70",
1298 "0.70, 0.80",
1299 "0.80, 0.90",
1300 "0.90, 1.00",
1301 ]
1302 legend = [t.get_text() for t in ax2.get_legend().get_texts()]
1303 assert labels == legend
1304 assert [
1305 line.get_markerfacecolor() for line in ax2.get_legend().get_lines()
1306 ] == legend_colors_exp
1307
1308 ax3 = self.df.plot(
1309 "high_vals",
1310 scheme="UserDefined",
1311 classification_kwds={"bins": bins},
1312 legend=True,
1313 )
1314 expected = np.array(
1315 [
1316 [0.709898, 0.868751, 0.169257, 1.0],
1317 [0.993248, 0.906157, 0.143936, 1.0],
1318 [0.430983, 0.808473, 0.346476, 1.0],
1319 ]
1320 )
1321 assert all(
1322 [
1323 (z == expected).all(axis=1).any()
1324 for z in ax3.collections[0].get_facecolors()
1325 ]
1326 )
1327
1328 legend = [t.get_text() for t in ax3.get_legend().get_texts()]
1329 assert labels == legend
1330
1331 assert [
1332 line.get_markerfacecolor() for line in ax3.get_legend().get_lines()
1333 ] == legend_colors_exp
1334
11961335
11971336 class TestPlotCollections:
11981337 def setup_method(self):
14741613 ax.cla()
14751614
14761615
1477 @pytest.mark.skipif(not compat.PANDAS_GE_025, reason="requires pandas > 0.24")
14781616 class TestGeoplotAccessor:
14791617 def setup_method(self):
14801618 geometries = [Polygon([(0, 0), (1, 0), (1, 1)]), Point(1, 3)]
14981636 getattr(self.gdf.plot, kind)(ax=ax_geopandas_2, **kwargs)
14991637
15001638 _pandas_kinds = []
1501 if compat.PANDAS_GE_025:
1502 from geopandas.plotting import GeoplotAccessor
1503
1504 _pandas_kinds = GeoplotAccessor._pandas_kinds
1639
1640 _pandas_kinds = GeoplotAccessor._pandas_kinds
15051641
15061642 if MPL_DECORATORS:
15071643
15251661 kwargs = {"y": "y"}
15261662 elif kind in _xy_kinds:
15271663 kwargs = {"x": "x", "y": "y"}
1664 if kind == "hexbin": # increase gridsize to reduce duration
1665 kwargs["gridsize"] = 10
15281666
15291667 self.compare_figures(kind, fig_test, fig_ref, kwargs)
15301668 plt.close("all")
15601698 polys = GeoSeries([t1, t2], index=list("AB"))
15611699 df = GeoDataFrame({"geometry": polys, "values": [0, 1]})
15621700
1563 # Test with continous values
1701 # Test with continuous values
15641702 ax = df.plot(column="values")
15651703 colors = ax.collections[0].get_facecolors()
15661704 ax = df.plot(column=df["values"])
15801718 colors_array = ax.collections[0].get_facecolors()
15811719 np.testing.assert_array_equal(colors, colors_array)
15821720
1583 # Check raised error: is df rows number equal to column legth?
1721 # Check raised error: does the number of df rows equal the column length?
15841722 with pytest.raises(ValueError, match="different number of rows"):
15851723 ax = df.plot(column=np.array([1, 2, 3]))
15861724
16551793
16561794
16571795 def _style_to_vertices(markerstyle):
1658 """ Converts a markerstyle string to a path. """
1796 """Converts a markerstyle string to a path."""
16591797 # TODO: Vertices values are twice the actual path; unclear, why.
16601798 path = matplotlib.markers.MarkerStyle(markerstyle).get_path()
16611799 return path.vertices / 2
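As a quick aid to the TODO above, a sketch (matplotlib only) of the raw marker paths this helper halves:

    import matplotlib.markers

    # _style_to_vertices returns path.vertices / 2 for styles like these.
    for style in ["o", "^", "s"]:
        path = matplotlib.markers.MarkerStyle(style).get_path()
        print(style, path.vertices.shape, path.vertices.min(), path.vertices.max())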
0 import sys
0 from math import sqrt
11
22 from shapely.geometry import (
33 Point,
1616 import pytest
1717 import numpy as np
1818
19
20 @pytest.mark.skipif(sys.platform.startswith("win"), reason="fails on AppVeyor")
19 if compat.USE_PYGEOS:
20 import pygeos
21
22
2123 @pytest.mark.skip_no_sindex
2224 class TestSeriesSindex:
2325 def test_has_sindex(self):
106108 assert sliced.sindex is not original_index
107109
108110
109 @pytest.mark.skipif(sys.platform.startswith("win"), reason="fails on AppVeyor")
110111 @pytest.mark.skip_no_sindex
111112 class TestFrameSindex:
112113 def setup_method(self):
161162 assert geometry_col.sindex is original_index
162163
163164 @pytest.mark.skipif(
164 not compat.PANDAS_GE_10, reason="Column selection returns a copy on pd<=1.0.0"
165 not compat.PANDAS_GE_11, reason="Column selection returns a copy on pd<=1.1.0"
165166 )
166167 def test_rebuild_on_multiple_col_selection(self):
167168 """Selecting a subset of columns preserves the index."""
669670 )
670671 raise e
671672
673 # ------------------------- `nearest` tests ------------------------- #
674 @pytest.mark.skipif(
675 compat.USE_PYGEOS,
676 reason=("the rtree backend implements sindex.nearest with different behaviour"),
677 )
678 def test_rtree_nearest_warns(self):
679 df = geopandas.GeoDataFrame({"geometry": []})
680 with pytest.warns(
681 FutureWarning, match="sindex.nearest using the rtree backend"
682 ):
683 df.sindex.nearest((0, 0, 1, 1), num_results=2)
684
685 @pytest.mark.skipif(
686 not (compat.USE_PYGEOS and not compat.PYGEOS_GE_010),
687 reason=("PyGEOS < 0.10 does not support sindex.nearest"),
688 )
689 def test_pygeos_error(self):
690 df = geopandas.GeoDataFrame({"geometry": []})
691 with pytest.raises(NotImplementedError, match="requires pygeos >= 0.10"):
692 df.sindex.nearest(None)
693
694 @pytest.mark.skipif(
695 not (compat.USE_PYGEOS and compat.PYGEOS_GE_010),
696 reason=("PyGEOS >= 0.10 is required to test sindex.nearest"),
697 )
698 @pytest.mark.parametrize("return_all", [True, False])
699 @pytest.mark.parametrize(
700 "geometry,expected",
701 [
702 ([0.25, 0.25], [[0], [0]]),
703 ([0.75, 0.75], [[0], [1]]),
704 ],
705 )
706 def test_nearest_single(self, geometry, expected, return_all):
707 geoms = pygeos.points(np.arange(10), np.arange(10))
708 df = geopandas.GeoDataFrame({"geometry": geoms})
709
710 p = Point(geometry)
711 res = df.sindex.nearest(p, return_all=return_all)
712 assert_array_equal(res, expected)
713
714 p = pygeos.points(geometry)
715 res = df.sindex.nearest(p, return_all=return_all)
716 assert_array_equal(res, expected)
717
718 @pytest.mark.skipif(
719 not compat.USE_PYGEOS or not compat.PYGEOS_GE_010,
720 reason=("PyGEOS >= 0.10 is required to test sindex.nearest"),
721 )
722 @pytest.mark.parametrize("return_all", [True, False])
723 @pytest.mark.parametrize(
724 "geometry,expected",
725 [
726 ([(1, 1), (0, 0)], [[0, 1], [1, 0]]),
727 ([(1, 1), (0.25, 1)], [[0, 1], [1, 1]]),
728 ],
729 )
730 def test_nearest_multi(self, geometry, expected, return_all):
731 geoms = pygeos.points(np.arange(10), np.arange(10))
732 df = geopandas.GeoDataFrame({"geometry": geoms})
733
734 ps = [Point(p) for p in geometry]
735 res = df.sindex.nearest(ps, return_all=return_all)
736 assert_array_equal(res, expected)
737
738 ps = pygeos.points(geometry)
739 res = df.sindex.nearest(ps, return_all=return_all)
740 assert_array_equal(res, expected)
741
742 s = geopandas.GeoSeries(ps)
743 res = df.sindex.nearest(s, return_all=return_all)
744 assert_array_equal(res, expected)
745
746 x, y = zip(*geometry)
747 ga = geopandas.points_from_xy(x, y)
748 res = df.sindex.nearest(ga, return_all=return_all)
749 assert_array_equal(res, expected)
750
751 @pytest.mark.skipif(
752 not compat.USE_PYGEOS or not compat.PYGEOS_GE_010,
753 reason=("PyGEOS >= 0.10 is required to test sindex.nearest"),
754 )
755 @pytest.mark.parametrize("return_all", [True, False])
756 @pytest.mark.parametrize(
757 "geometry,expected",
758 [
759 (None, [[], []]),
760 ([None], [[], []]),
761 ],
762 )
763 def test_nearest_none(self, geometry, expected, return_all):
764 geoms = pygeos.points(np.arange(10), np.arange(10))
765 df = geopandas.GeoDataFrame({"geometry": geoms})
766
767 res = df.sindex.nearest(geometry, return_all=return_all)
768 assert_array_equal(res, expected)
769
770 @pytest.mark.skipif(
771 not compat.USE_PYGEOS or not compat.PYGEOS_GE_010,
772 reason=("PyGEOS >= 0.10 is required to test sindex.nearest"),
773 )
774 @pytest.mark.parametrize("return_distance", [True, False])
775 @pytest.mark.parametrize(
776 "return_all,max_distance,expected",
777 [
778 (True, None, ([[0, 0, 1], [0, 1, 5]], [sqrt(0.5), sqrt(0.5), sqrt(50)])),
779 (False, None, ([[0, 1], [0, 5]], [sqrt(0.5), sqrt(50)])),
780 (True, 1, ([[0, 0], [0, 1]], [sqrt(0.5), sqrt(0.5)])),
781 (False, 1, ([[0], [0]], [sqrt(0.5)])),
782 ],
783 )
784 def test_nearest_max_distance(
785 self, expected, max_distance, return_all, return_distance
786 ):
787 geoms = pygeos.points(np.arange(10), np.arange(10))
788 df = geopandas.GeoDataFrame({"geometry": geoms})
789
790 ps = [Point(0.5, 0.5), Point(0, 10)]
791 res = df.sindex.nearest(
792 ps,
793 return_all=return_all,
794 max_distance=max_distance,
795 return_distance=return_distance,
796 )
797 if return_distance:
798 assert_array_equal(res[0], expected[0])
799 assert_array_equal(res[1], expected[1])
800 else:
801 assert_array_equal(res, expected[0])
802
672803 # --------------------------- misc tests ---------------------------- #
673804
674805 def test_empty_tree_geometries(self):
128128 assert_geodataframe_equal(df1, df2, check_crs=False)
129129
130130 assert len(record) == 0
131
132
133 def test_almost_equal_but_not_equal():
134 s_origin = GeoSeries([Point(0, 0)])
135 s_almost_origin = GeoSeries([Point(0.0000001, 0)])
136 assert_geoseries_equal(s_origin, s_almost_origin, check_less_precise=True)
137 with pytest.raises(AssertionError):
138 assert_geoseries_equal(s_origin, s_almost_origin)
00 from .crs import explicit_crs_from_epsg
11 from .geocoding import geocode, reverse_geocode
22 from .overlay import overlay
3 from .sjoin import sjoin
3 from .sjoin import sjoin, sjoin_nearest
44 from .util import collect
55 from .clip import clip
66
1111 "overlay",
1212 "reverse_geocode",
1313 "sjoin",
14 "sjoin_nearest",
1415 "clip",
1516 ]
66 """
77 import warnings
88
9 import numpy as np
10 import pandas as pd
11
129 from shapely.geometry import Polygon, MultiPolygon
1310
1411 from geopandas import GeoDataFrame, GeoSeries
1512 from geopandas.array import _check_crs, _crs_mismatch_warn
1613
1714
18 def _clip_points(gdf, poly):
19 """Clip point geometry to the polygon extent.
15 def _clip_gdf_with_polygon(gdf, poly):
16 """Clip geometry to the polygon extent.
2017
21 Clip an input point GeoDataFrame to the polygon extent of the poly
22 parameter. Points that intersect the poly geometry are extracted with
23 associated attributes and returned.
18 Clip an input GeoDataFrame to the polygon extent of the poly
19 parameter.
2420
2521 Parameters
2622 ----------
2723 gdf : GeoDataFrame, GeoSeries
28 Composed of point geometry that will be clipped to the poly.
29
30 poly : (Multi)Polygon
31 Reference geometry used to spatially clip the data.
32
33 Returns
34 -------
35 GeoDataFrame
36 The returned GeoDataFrame is a subset of gdf that intersects
37 with poly.
38 """
39 return gdf.iloc[gdf.sindex.query(poly, predicate="intersects")]
40
41
42 def _clip_line_poly(gdf, poly):
43 """Clip line and polygon geometry to the polygon extent.
44
45 Clip an input line or polygon to the polygon extent of the poly
46 parameter. Parts of Lines or Polygons that intersect the poly geometry are
47 extracted with associated attributes and returned.
48
49 Parameters
50 ----------
51 gdf : GeoDataFrame, GeoSeries
52 Line or polygon geometry that is clipped to poly.
24 Dataframe to clip.
5325
5426 poly : (Multi)Polygon
5527 Reference polygon for clipping.
6234 """
6335 gdf_sub = gdf.iloc[gdf.sindex.query(poly, predicate="intersects")]
6436
37 # For performance reasons, points don't need to be intersected with poly
38 non_point_mask = gdf_sub.geom_type != "Point"
39
40 if not non_point_mask.any():
41 # only points, directly return
42 return gdf_sub
43
6544 # Clip the data with the polygon
6645 if isinstance(gdf_sub, GeoDataFrame):
6746 clipped = gdf_sub.copy()
68 clipped[gdf.geometry.name] = gdf_sub.intersection(poly)
47 clipped.loc[
48 non_point_mask, clipped._geometry_column_name
49 ] = gdf_sub.geometry.values[non_point_mask].intersection(poly)
6950 else:
7051 # GeoSeries
71 clipped = gdf_sub.intersection(poly)
52 clipped = gdf_sub.copy()
53 clipped[non_point_mask] = gdf_sub.values[non_point_mask].intersection(poly)
7254
7355 return clipped
7456
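A minimal, self-contained sketch of the fast path above (toy geometries; names are illustrative): the spatial index provides a coarse intersects filter, and only non-point rows then pay for the exact intersection:

    import geopandas
    from shapely.geometry import Point, Polygon, box

    poly = box(0, 0, 10, 10)
    gdf = geopandas.GeoDataFrame(
        geometry=[Point(2, 2), Polygon([(5, 5), (15, 5), (15, 15), (5, 15)])]
    )

    # Coarse filter via the spatial index.
    sub = gdf.iloc[gdf.sindex.query(poly, predicate="intersects")]

    # Points that passed the filter are already correct; only non-point
    # geometries need the (more expensive) exact intersection with poly.
    non_point = sub.geom_type != "Point"
    print(non_point.tolist())  # expected [False, True] for this toy data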
10082 GeoDataFrame or GeoSeries
10183 Vector data (points, lines, polygons) from `gdf` clipped to
10284 polygon boundary from mask.
85
86 See also
87 --------
88 GeoDataFrame.clip : equivalent GeoDataFrame method
89 GeoSeries.clip : equivalent GeoSeries method
10390
10491 Examples
10592 --------
148135 else:
149136 poly = mask
150137
151 geom_types = gdf.geometry.type
152 poly_idx = np.asarray((geom_types == "Polygon") | (geom_types == "MultiPolygon"))
153 line_idx = np.asarray(
154 (geom_types == "LineString")
155 | (geom_types == "LinearRing")
156 | (geom_types == "MultiLineString")
157 )
158 point_idx = np.asarray((geom_types == "Point") | (geom_types == "MultiPoint"))
159 geomcoll_idx = np.asarray((geom_types == "GeometryCollection"))
160
161 if point_idx.any():
162 point_gdf = _clip_points(gdf[point_idx], poly)
163 else:
164 point_gdf = None
165
166 if poly_idx.any():
167 poly_gdf = _clip_line_poly(gdf[poly_idx], poly)
168 else:
169 poly_gdf = None
170
171 if line_idx.any():
172 line_gdf = _clip_line_poly(gdf[line_idx], poly)
173 else:
174 line_gdf = None
175
176 if geomcoll_idx.any():
177 geomcoll_gdf = _clip_line_poly(gdf[geomcoll_idx], poly)
178 else:
179 geomcoll_gdf = None
180
181 order = pd.Series(range(len(gdf)), index=gdf.index)
182 concat = pd.concat([point_gdf, line_gdf, poly_gdf, geomcoll_gdf])
138 clipped = _clip_gdf_with_polygon(gdf, poly)
183139
184140 if keep_geom_type:
185 geomcoll_concat = (concat.geom_type == "GeometryCollection").any()
186 geomcoll_orig = geomcoll_idx.any()
141 geomcoll_concat = (clipped.geom_type == "GeometryCollection").any()
142 geomcoll_orig = (gdf.geom_type == "GeometryCollection").any()
187143
188144 new_collection = geomcoll_concat and not geomcoll_orig
189145
209165 # Check how many geometry types are in the clipped GeoDataFrame
210166 clip_types_total = sum(
211167 [
212 concat.geom_type.isin(polys).any(),
213 concat.geom_type.isin(lines).any(),
214 concat.geom_type.isin(points).any(),
168 clipped.geom_type.isin(polys).any(),
169 clipped.geom_type.isin(lines).any(),
170 clipped.geom_type.isin(points).any(),
215171 ]
216172 )
217173
225181 elif new_collection or more_types:
226182 orig_type = gdf.geom_type.iloc[0]
227183 if new_collection:
228 concat = concat.explode()
184 clipped = clipped.explode()
229185 if orig_type in polys:
230 concat = concat.loc[concat.geom_type.isin(polys)]
186 clipped = clipped.loc[clipped.geom_type.isin(polys)]
231187 elif orig_type in lines:
232 concat = concat.loc[concat.geom_type.isin(lines)]
188 clipped = clipped.loc[clipped.geom_type.isin(lines)]
233189
234 # Return empty GeoDataFrame or GeoSeries if no shapes remain
235 if len(concat) == 0:
236 return gdf.iloc[:0]
237
238 # Preserve the original order of the input
239 if isinstance(concat, GeoDataFrame):
240 concat["_order"] = order
241 return concat.sort_values(by="_order").drop(columns="_order")
242 else:
243 concat = GeoDataFrame(geometry=concat)
244 concat["_order"] = order
245 return concat.sort_values(by="_order").geometry
190 return clipped
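For reference, a small usage sketch of the public clip entry point after this refactor (toy data, no CRS set):

    import geopandas
    from shapely.geometry import LineString, box

    mask = box(0, 0, 10, 10)
    lines = geopandas.GeoDataFrame(
        geometry=[LineString([(5, 5), (15, 5)]), LineString([(20, 20), (30, 30)])]
    )

    # The first line is truncated at the mask edge; the second, fully
    # outside, is dropped. Input row order is preserved by the new code path.
    print(geopandas.clip(lines, mask))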
3030 strings : list or Series of addresses to geocode
3131 provider : str or geopy.geocoder
3232 Specifies geocoding service to use. If none is provided,
33 will use 'geocodefarm' with a rate limit applied (see the geocodefarm
34 terms of service at:
35 https://geocode.farm/geocoding/free-api-documentation/ ).
33 will use 'photon' (see Photon's terms of service at:
34 https://photon.komoot.io).
3635
3736 Either the string name used by geopy (as specified in
3837 geopy.geocoders.SERVICE_TO_GEOCODER) or a geopy Geocoder instance
39 (e.g., geopy.geocoders.GeocodeFarm) may be used.
38 (e.g., geopy.geocoders.Photon) may be used.
4039
4140 Some providers require additional arguments such as access keys
4241 See each geocoder's specific parameters in geopy.geocoders
6160 """
6261
6362 if provider is None:
64 # https://geocode.farm/geocoding/free-api-documentation/
65 provider = "geocodefarm"
66 throttle_time = 0.25
67 else:
68 throttle_time = _get_throttle_time(provider)
63 provider = "photon"
64 throttle_time = _get_throttle_time(provider)
6965
7066 return _query(strings, True, provider, throttle_time, **kwargs)
7167
8480 y coordinate is latitude
8581 provider : str or geopy.geocoder (opt)
8682 Specifies geocoding service to use. If none is provided,
87 will use 'geocodefarm' with a rate limit applied (see the geocodefarm
88 terms of service at:
89 https://geocode.farm/geocoding/free-api-documentation/ ).
83 will use 'photon' (see Photon's terms of service at:
84 https://photon.komoot.io).
9085
9186 Either the string name used by geopy (as specified in
9287 geopy.geocoders.SERVICE_TO_GEOCODER) or a geopy Geocoder instance
93 (e.g., geopy.geocoders.GeocodeFarm) may be used.
88 (e.g., geopy.geocoders.Photon) may be used.
9489
9590 Some providers require additional arguments such as access keys
9691 See each geocoder's specific parameters in geopy.geocoders
116111 """
117112
118113 if provider is None:
119 # https://geocode.farm/geocoding/free-api-documentation/
120 provider = "geocodefarm"
121 throttle_time = 0.25
122 else:
123 throttle_time = _get_throttle_time(provider)
114 provider = "photon"
115 throttle_time = _get_throttle_time(provider)
124116
125117 return _query(points, False, provider, throttle_time, **kwargs)
126118
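A hedged usage sketch of the new default provider (requires geopy and network access; the address and coordinates are illustrative):

    from shapely.geometry import Point
    from geopandas.tools import geocode, reverse_geocode

    # provider=None now resolves to "photon" with its standard throttle.
    forward = geocode(["Eiffel Tower, Paris"])
    backward = reverse_geocode([Point(2.2945, 48.8584)])  # x=lon, y=lat
    print(forward.columns.tolist())  # typically ['geometry', 'address']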
130122 from geopy.geocoders.base import GeocoderQueryError
131123 from geopy.geocoders import get_geocoder_for_service
132124
133 if not isinstance(data, pd.Series):
134 data = pd.Series(data)
125 if forward:
126 if not isinstance(data, pd.Series):
127 data = pd.Series(data)
128 else:
129 if not isinstance(data, geopandas.GeoSeries):
130 data = geopandas.GeoSeries(data)
135131
136132 if isinstance(provider, str):
137133 provider = get_geocoder_for_service(provider)
180180
181181 >>> geopandas.overlay(df1, df2, how='union')
182182 df1_data df2_data geometry
183 0 1.0 1.0 POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2....
184 1 2.0 1.0 POLYGON ((3.00000 2.00000, 2.00000 2.00000, 2....
185 2 2.0 2.0 POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4....
186 3 1.0 NaN POLYGON ((2.00000 1.00000, 2.00000 0.00000, 0....
183 0 1.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
184 1 2.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
185 2 2.0 2.0 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
186 3 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
187187 4 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
188188 5 NaN 1.0 MULTIPOLYGON (((2.00000 2.00000, 3.00000 2.000...
189 6 NaN 2.0 POLYGON ((3.00000 4.00000, 3.00000 5.00000, 5....
189 6 NaN 2.0 POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5....
190190
191191 >>> geopandas.overlay(df1, df2, how='intersection')
192192 df1_data df2_data geometry
193 0 1 1 POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2....
194 1 2 1 POLYGON ((3.00000 2.00000, 2.00000 2.00000, 2....
195 2 2 2 POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4....
193 0 1 1 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
194 1 2 1 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
195 2 2 2 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
196196
197197 >>> geopandas.overlay(df1, df2, how='symmetric_difference')
198198 df1_data df2_data geometry
199 0 1.0 NaN POLYGON ((2.00000 1.00000, 2.00000 0.00000, 0....
199 0 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
200200 1 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
201201 2 NaN 1.0 MULTIPOLYGON (((2.00000 2.00000, 3.00000 2.000...
202 3 NaN 2.0 POLYGON ((3.00000 4.00000, 3.00000 5.00000, 5....
202 3 NaN 2.0 POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5....
203203
204204 >>> geopandas.overlay(df1, df2, how='difference')
205 geometry df1_data
206 0 POLYGON ((2.00000 1.00000, 2.00000 0.00000, 0.... 1
207 1 MULTIPOLYGON (((2.00000 3.00000, 2.00000 4.000... 2
205 geometry df1_data
206 0 POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0.... 1
207 1 MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000... 2
208208
209209 >>> geopandas.overlay(df1, df2, how='identity')
210210 df1_data df2_data geometry
211 0 1.0 1.0 POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2....
212 1 2.0 1.0 POLYGON ((3.00000 2.00000, 2.00000 2.00000, 2....
213 2 2.0 2.0 POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4....
214 3 1.0 NaN POLYGON ((2.00000 1.00000, 2.00000 0.00000, 0....
211 0 1.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 1.00000, 1....
212 1 2.0 1.0 POLYGON ((2.00000 2.00000, 2.00000 3.00000, 3....
213 2 2.0 2.0 POLYGON ((4.00000 4.00000, 4.00000 3.00000, 3....
214 3 1.0 NaN POLYGON ((2.00000 0.00000, 0.00000 0.00000, 0....
215215 4 2.0 NaN MULTIPOLYGON (((3.00000 3.00000, 4.00000 3.000...
216216
217217 See also
218218 --------
219219 sjoin : spatial join
220 GeoDataFrame.overlay : equivalent method
220221
221222 Notes
222223 ------
262263 raise NotImplementedError(
263264 "df{} contains mixed geometry types.".format(i + 1)
264265 )
266
267 box_gdf1 = df1.total_bounds
268 box_gdf2 = df2.total_bounds
269
270 if not (
271 ((box_gdf1[0] <= box_gdf2[2]) and (box_gdf2[0] <= box_gdf1[2]))
272 and ((box_gdf1[1] <= box_gdf2[3]) and (box_gdf2[1] <= box_gdf1[3]))
273 ):
274 return GeoDataFrame(
275 [],
276 columns=list(
277 set(
278 df1.drop(df1.geometry.name, axis=1).columns.to_list()
279 + df2.drop(df2.geometry.name, axis=1).columns.to_list()
280 )
281 )
282 + ["geometry"],
283 )
265284
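The early exit added above is a standard interval-overlap test on total_bounds (minx, miny, maxx, maxy); in isolation:

    import numpy as np

    def bboxes_overlap(b1, b2):
        # Boxes overlap iff they overlap on both the x and the y axis.
        return (
            (b1[0] <= b2[2]) and (b2[0] <= b1[2])
            and (b1[1] <= b2[3]) and (b2[1] <= b1[3])
        )

    print(bboxes_overlap(np.array([0, 0, 1, 1]), np.array([2, 2, 3, 3])))  # False
    print(bboxes_overlap(np.array([0, 0, 2, 2]), np.array([1, 1, 3, 3])))  # True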
266285 # Computations
267286 def _make_valid(df):
283302 df1 = _make_valid(df1)
284303 df2 = _make_valid(df2)
285304
286 with warnings.catch_warnings(): # CRS checked above, supress array-level warning
305 with warnings.catch_warnings(): # CRS checked above, suppress array-level warning
287306 warnings.filterwarnings("ignore", message="CRS mismatch between the CRS")
288307 if how == "difference":
289308 return _overlay_difference(df1, df2)
298317 result = dfunion[dfunion["__idx1"].notnull()].copy()
299318
300319 if keep_geom_type:
301 key_order = result.keys()
302 exploded = result.reset_index(drop=True).explode()
303 exploded = exploded.reset_index(level=0)
304
320 geom_type = df1.geom_type.iloc[0]
321
322 # First we filter the geometry types inside GeometryCollections objects
323 # (e.g. GeometryCollection([polygon, point]) -> polygon)
324 # we do this separately on only the relevant rows, as this is an expensive
325 # operation (an expensive no-op for geometry types other than collections)
326 is_collection = result.geom_type == "GeometryCollection"
327 if is_collection.any():
328 geom_col = result._geometry_column_name
329 collections = result[[geom_col]][is_collection]
330
331 exploded = collections.reset_index(drop=True).explode(index_parts=True)
332 exploded = exploded.reset_index(level=0)
333
334 orig_num_geoms_exploded = exploded.shape[0]
335 if geom_type in polys:
336 exploded = exploded.loc[exploded.geom_type.isin(polys)]
337 elif geom_type in lines:
338 exploded = exploded.loc[exploded.geom_type.isin(lines)]
339 elif geom_type in points:
340 exploded = exploded.loc[exploded.geom_type.isin(points)]
341 else:
342 raise TypeError(
343 "`keep_geom_type` does not support {}.".format(geom_type)
344 )
345 num_dropped_collection = orig_num_geoms_exploded - exploded.shape[0]
346
347 # level_0 created with above reset_index operation
348 # and represents the original geometry collections
349 # TODO avoiding dissolve to call unary_union in this case could further
350 # improve performance (we only need to collect geometries in their
351 # respective Multi version)
352 dissolved = exploded.dissolve(by="level_0")
353 result.loc[is_collection, geom_col] = dissolved[geom_col].values
354 else:
355 num_dropped_collection = 0
356
357 # Now we filter all geometries (in theory we don't need to do this
358 # again for the rows handled above for GeometryCollections, but filtering
359 # them out is probably more expensive than simply including them, since
360 # this typically concerns only a few rows)
305361 orig_num_geoms = result.shape[0]
306 geom_type = df1.geom_type.iloc[0]
307362 if geom_type in polys:
308 exploded = exploded.loc[exploded.geom_type.isin(polys)]
363 result = result.loc[result.geom_type.isin(polys)]
309364 elif geom_type in lines:
310 exploded = exploded.loc[exploded.geom_type.isin(lines)]
365 result = result.loc[result.geom_type.isin(lines)]
311366 elif geom_type in points:
312 exploded = exploded.loc[exploded.geom_type.isin(points)]
367 result = result.loc[result.geom_type.isin(points)]
313368 else:
314369 raise TypeError("`keep_geom_type` does not support {}.".format(geom_type))
315
316 # level_0 created with above reset_index operation
317 # and represents the original geometry collections
318 result = exploded.dissolve(by="level_0")[key_order]
319
320 if (result.shape[0] != orig_num_geoms) and keep_geom_type_warning:
321 num_dropped = orig_num_geoms - result.shape[0]
370 num_dropped = orig_num_geoms - result.shape[0]
371
372 if (num_dropped > 0 or num_dropped_collection > 0) and keep_geom_type_warning:
322373 warnings.warn(
323374 "`keep_geom_type=True` in overlay resulted in {} dropped "
324375 "geometries of different geometry types than df1 has. "
325376 "Set `keep_geom_type=False` to retain all "
326 "geometries".format(num_dropped),
377 "geometries".format(num_dropped + num_dropped_collection),
327378 UserWarning,
328379 stacklevel=2,
329380 )
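A small sketch of the behaviour this block implements (toy data): intersecting two squares that share only an edge yields a LineString, which keep_geom_type=True drops with the warning above, while keep_geom_type=False retains it:

    import geopandas
    from shapely.geometry import Polygon

    a = geopandas.GeoDataFrame(geometry=[Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])])
    b = geopandas.GeoDataFrame(geometry=[Polygon([(2, 0), (4, 0), (4, 2), (2, 2)])])

    res = geopandas.overlay(a, b, how="intersection", keep_geom_type=False)
    print(res.geom_type.tolist())  # expected ['LineString'] here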
0 from typing import Optional
01 import warnings
12
3 import numpy as np
24 import pandas as pd
35
46 from geopandas import GeoDataFrame
7 from geopandas import _compat as compat
58 from geopandas.array import _check_crs, _crs_mismatch_warn
69
710
811 def sjoin(
9 left_df, right_df, how="inner", op="intersects", lsuffix="left", rsuffix="right"
12 left_df,
13 right_df,
14 how="inner",
15 predicate="intersects",
16 lsuffix="left",
17 rsuffix="right",
18 **kwargs,
1019 ):
1120 """Spatial join of two GeoDataFrames.
1221
2332 * 'right': use keys from right_df; retain only right_df geometry column
2433 * 'inner': use intersection of keys from both dfs; retain only
2534 left_df geometry column
26 op : string, default 'intersects'
35 predicate : string, default 'intersects'
2736 Binary predicate. Valid values are determined by the spatial index used.
2837 You can check the valid values in left_df or right_df as
2938 ``left_df.sindex.valid_query_predicates`` or
3039 ``right_df.sindex.valid_query_predicates``
40 Replaces deprecated ``op`` parameter.
3141 lsuffix : string, default 'left'
3242 Suffix to apply to overlapping column names (left GeoDataFrame).
3343 rsuffix : string, default 'right'
7787 See also
7888 --------
7989 overlay : overlay operation resulting in a new geometry
90 GeoDataFrame.sjoin : equivalent method
8091
8192 Notes
8293 ------
8394 Every operation in GeoPandas is planar, i.e. the potential third
8495 dimension is not taken into account.
8596 """
97 if "op" in kwargs:
98 op = kwargs.pop("op")
99 deprecation_message = (
100 "The `op` parameter is deprecated and will be removed"
101 " in a future release. Please use the `predicate` parameter"
102 " instead."
103 )
104 if predicate != "intersects" and op != predicate:
105 override_message = (
106 "A non-default value for `predicate` was passed"
107 f' (got `predicate="{predicate}"`'
108 f' in combination with `op="{op}"`).'
109 " The value of `predicate` will be overriden by the value of `op`,"
110 " , which may result in unexpected behavior."
111 f"\n{deprecation_message}"
112 )
113 warnings.warn(override_message, UserWarning, stacklevel=4)
114 else:
115 warnings.warn(deprecation_message, FutureWarning, stacklevel=4)
116 predicate = op
117 if kwargs:
118 first = next(iter(kwargs.keys()))
119 raise TypeError(f"sjoin() got an unexpected keyword argument '{first}'")
120
86121 _basic_checks(left_df, right_df, how, lsuffix, rsuffix)
87122
88 indices = _geom_predicate_query(left_df, right_df, op)
123 indices = _geom_predicate_query(left_df, right_df, predicate)
89124
90125 joined = _frame_join(indices, left_df, right_df, how, lsuffix, rsuffix)
91126
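The shim above maps the old op keyword onto predicate; a migration sketch (both calls produce the same join, the first with a FutureWarning):

    import geopandas
    from shapely.geometry import Point
    from pandas.testing import assert_frame_equal

    pts = geopandas.GeoDataFrame({"geometry": [Point(0, 0)]})
    polys = geopandas.GeoDataFrame({"geometry": [Point(0, 0).buffer(1)]})

    old = geopandas.sjoin(pts, polys, op="within")         # FutureWarning
    new = geopandas.sjoin(pts, polys, predicate="within")  # preferred spelling
    assert_frame_equal(old, new)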
142177 )
143178
144179
145 def _geom_predicate_query(left_df, right_df, op):
180 def _geom_predicate_query(left_df, right_df, predicate):
146181 """Compute geometric comparisons and get matching indices.
147182
148183 Parameters
149184 ----------
150185 left_df : GeoDataFrame
151186 right_df : GeoDataFrame
152 op : string
187 predicate : string
153188 Binary predicate to query.
154189
155190 Returns
164199 warnings.filterwarnings(
165200 "ignore", "Generated spatial index is empty", FutureWarning
166201 )
167 if op == "within":
202
203 original_predicate = predicate
204
205 if predicate == "within":
168206 # within is implemented as the inverse of contains
169207 # contains is a faster predicate
170208 # see discussion at https://github.com/geopandas/geopandas/pull/1421
174212 else:
175213 # all other predicates are symmetric
176214 # keep them the same
177 predicate = op
178215 sindex = right_df.sindex
179216 input_geoms = left_df.geometry
180217
184221 else:
185222 # when sindex is empty / has no valid geometries
186223 indices = pd.DataFrame(columns=["_key_left", "_key_right"], dtype=float)
187 if op == "within":
224
225 if original_predicate == "within":
188226 # within is implemented as the inverse of contains
189227 # flip back the results
190228 indices = indices.rename(
194232 return indices
195233
196234
197 def _frame_join(indices, left_df, right_df, how, lsuffix, rsuffix):
235 def _frame_join(join_df, left_df, right_df, how, lsuffix, rsuffix):
198236 """Join the GeoDataFrames at the DataFrame level.
199237
200238 Parameters
201239 ----------
202 indices : DataFrame
203 Indexes returned by the geometric join.
240 join_df : DataFrame
241 Indices and join data returned by the geometric join.
204242 Must have columns `_key_left` and `_key_right`
205243 with integer indices representing the matches
206244 from `left_df` and `right_df` respectively.
245 Additional columns may be included and will be copied to
246 the resultant GeoDataFrame.
207247 left_df : GeoDataFrame
208248 right_df : GeoDataFrame
209249 lsuffix : string
252292
253293 # perform join on the dataframes
254294 if how == "inner":
255 indices = indices.set_index("_key_left")
295 join_df = join_df.set_index("_key_left")
256296 joined = (
257 left_df.merge(indices, left_index=True, right_index=True)
297 left_df.merge(join_df, left_index=True, right_index=True)
258298 .merge(
259299 right_df.drop(right_df.geometry.name, axis=1),
260300 left_on="_key_right",
270310 joined.index.name = left_index_name
271311
272312 elif how == "left":
273 indices = indices.set_index("_key_left")
313 join_df = join_df.set_index("_key_left")
274314 joined = (
275 left_df.merge(indices, left_index=True, right_index=True, how="left")
315 left_df.merge(join_df, left_index=True, right_index=True, how="left")
276316 .merge(
277317 right_df.drop(right_df.geometry.name, axis=1),
278318 how="left",
292332 joined = (
293333 left_df.drop(left_df.geometry.name, axis=1)
294334 .merge(
295 indices.merge(
335 join_df.merge(
296336 right_df, left_on="_key_right", right_index=True, how="right"
297337 ),
298338 left_index=True,
299339 right_on="_key_left",
300340 how="right",
341 suffixes=("_{}".format(lsuffix), "_{}".format(rsuffix)),
301342 )
302343 .set_index(index_right)
303344 .drop(["_key_left", "_key_right"], axis=1)
308349 joined.index.name = right_index_name
309350
310351 return joined
352
353
354 def _nearest_query(
355 left_df: GeoDataFrame,
356 right_df: GeoDataFrame,
357 max_distance: float,
358 how: str,
359 return_distance: bool,
360 ):
361 if not (compat.PYGEOS_GE_010 and compat.USE_PYGEOS):
362 raise NotImplementedError(
363 "Currently, only PyGEOS >= 0.10.0 supports `nearest_all`. "
364 + compat.INSTALL_PYGEOS_ERROR
365 )
366 # use the opposite of the join direction for the index
367 use_left_as_sindex = how == "right"
368 if use_left_as_sindex:
369 sindex = left_df.sindex
370 query = right_df.geometry
371 else:
372 sindex = right_df.sindex
373 query = left_df.geometry
374 if sindex:
375 res = sindex.nearest(
376 query,
377 return_all=True,
378 max_distance=max_distance,
379 return_distance=return_distance,
380 )
381 if return_distance:
382 (input_idx, tree_idx), distances = res
383 else:
384 (input_idx, tree_idx) = res
385 distances = None
386 if use_left_as_sindex:
387 l_idx, r_idx = tree_idx, input_idx
388 sort_order = np.argsort(l_idx, kind="stable")
389 l_idx, r_idx = l_idx[sort_order], r_idx[sort_order]
390 if distances is not None:
391 distances = distances[sort_order]
392 else:
393 l_idx, r_idx = input_idx, tree_idx
394 join_df = pd.DataFrame(
395 {"_key_left": l_idx, "_key_right": r_idx, "distances": distances}
396 )
397 else:
398 # when sindex is empty / has no valid geometries
399 join_df = pd.DataFrame(
400 columns=["_key_left", "_key_right", "distances"], dtype=float
401 )
402 return join_df
403
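A sketch of the raw query _nearest_query wraps (requires the PyGEOS backend with pygeos >= 0.10, as enforced above); the tree returns parallel arrays of input and tree positions, plus distances on request:

    import numpy as np
    import geopandas
    from shapely.geometry import Point

    tree_df = geopandas.GeoDataFrame(
        {"geometry": geopandas.points_from_xy(np.arange(10), np.arange(10))}
    )

    (input_idx, tree_idx), dist = tree_df.sindex.nearest(
        [Point(0.5, 0.5), Point(0, 10)],
        return_all=True,
        return_distance=True,
    )
    # input_idx[i] was matched to tree_idx[i] at distance dist[i].
    print(input_idx, tree_idx, dist)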
404
405 def sjoin_nearest(
406 left_df: GeoDataFrame,
407 right_df: GeoDataFrame,
408 how: str = "inner",
409 max_distance: Optional[float] = None,
410 lsuffix: str = "left",
411 rsuffix: str = "right",
412 distance_col: Optional[str] = None,
413 ) -> GeoDataFrame:
414 """Spatial join of two GeoDataFrames based on the distance between their geometries.
415
416 Results will include multiple output records for a single input record
417 where there are multiple equidistant nearest or intersected neighbors.
418
419 See the User Guide page
420 https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html
421 for more details.
422
423
424 Parameters
425 ----------
426 left_df, right_df : GeoDataFrames
427 how : string, default 'inner'
428 The type of join:
429
430 * 'left': use keys from left_df; retain only left_df geometry column
431 * 'right': use keys from right_df; retain only right_df geometry column
432 * 'inner': use intersection of keys from both dfs; retain only
433 left_df geometry column
434 max_distance : float, default None
435 Maximum distance within which to query for nearest geometry.
436 Must be greater than 0.
437 The max_distance used to search for nearest items in the tree may have a
438 significant impact on performance by reducing the number of input
439 geometries that need to be evaluated.
440 lsuffix : string, default 'left'
441 Suffix to apply to overlapping column names (left GeoDataFrame).
442 rsuffix : string, default 'right'
443 Suffix to apply to overlapping column names (right GeoDataFrame).
444 distance_col : string, default None
445 If set, save the distances computed between matching geometries under a
446 column of this name in the joined GeoDataFrame.
447
448 Examples
449 --------
450 >>> countries = geopandas.read_file(geopandas.datasets.get_\
451 path("naturalearth_lowres"))
452 >>> cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))
453 >>> countries.head(2) # doctest: +SKIP
454 pop_est continent name \
455 iso_a3 gdp_md_est geometry
456 0 920938 Oceania Fiji FJI 8374.0 MULTIPOLY\
457 GON (((180.00000 -16.06713, 180.00000...
458 1 53950935 Africa Tanzania TZA 150600.0 POLYGON (\
459 (33.90371 -0.95000, 34.07262 -1.05982...
460 >>> cities.head(2) # doctest: +SKIP
461 name geometry
462 0 Vatican City POINT (12.45339 41.90328)
463 1 San Marino POINT (12.44177 43.93610)
464
465 >>> cities_w_country_data = geopandas.sjoin_nearest(cities, countries)
466 >>> cities_w_country_data.head(2) # doctest: +SKIP
467 name_left geometry index_right pop_est continent name_\
468 right iso_a3 gdp_md_est
469 0 Vatican City POINT (12.45339 41.90328) 141 62137802 Europe \
470 Italy ITA 2221000.0
471 1 San Marino POINT (12.44177 43.93610) 141 62137802 Europe \
472 Italy ITA 2221000.0
473
474 To include the distances:
475
476 >>> cities_w_country_data = geopandas.sjoin_nearest\
477 (cities, countries, distance_col="distances")
478 >>> cities_w_country_data[["name_left", "name_right", \
479 "distances"]].head(2) # doctest: +SKIP
480 name_left name_right distances
481 0 Vatican City Italy 0.0
482 1 San Marino Italy 0.0
483
484 In the following example, we get multiple cities for Italy because all results are
485 equidistant (in this case zero because they intersect).
486 In fact, we get 3 results in total:
487
488 >>> countries_w_city_data = geopandas.sjoin_nearest\
489 (cities, countries, distance_col="distances", how="right")
490 >>> italy_results = \
491 countries_w_city_data[countries_w_city_data["name_right"] == "Italy"]
492 >>> italy_results # doctest: +SKIP
493 name_left name_right
494 141 Vatican City Italy
495 141 San Marino Italy
496 141 Rome Italy
497
498 See also
499 --------
500 sjoin : binary predicate joins
501 GeoDataFrame.sjoin_nearest : equivalent method
502
503 Notes
504 -----
505 Since this join relies on distances, results will be inaccurate
506 if your geometries are in a geographic CRS.
507
508 Every operation in GeoPandas is planar, i.e. the potential third
509 dimension is not taken into account.
510 """
511 _basic_checks(left_df, right_df, how, lsuffix, rsuffix)
512
513 left_df.geometry.values.check_geographic_crs(stacklevel=1)
514 right_df.geometry.values.check_geographic_crs(stacklevel=1)
515
516 return_distance = distance_col is not None
517
518 join_df = _nearest_query(left_df, right_df, max_distance, how, return_distance)
519
520 if return_distance:
521 join_df = join_df.rename(columns={"distances": distance_col})
522 else:
523 join_df.pop("distances")
524
525 joined = _frame_join(join_df, left_df, right_df, how, lsuffix, rsuffix)
526
527 if return_distance:
528 columns = [c for c in joined.columns if c != distance_col] + [distance_col]
529 joined = joined[columns]
530
531 return joined
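Finally, a compact usage sketch of sjoin_nearest itself (again requiring pygeos >= 0.10, and a projected CRS for meaningful distances; toy data here):

    import geopandas
    from shapely.geometry import Point

    left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(5, 5)]})
    right = geopandas.GeoDataFrame({"geometry": [Point(0, 1), Point(10, 10)]})

    # Each left record joins to its nearest right record; matches farther
    # than max_distance are NaN-filled under how="left".
    joined = geopandas.sjoin_nearest(
        left, right, how="left", max_distance=3, distance_col="dist"
    )
    print(joined[["index_right", "dist"]])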
00 """Tests for the clip module."""
11
22 import warnings
3 from distutils.version import LooseVersion
34
45 import numpy as np
6 import pandas as pd
57
68 import shapely
7 from shapely.geometry import Polygon, Point, LineString, LinearRing, GeometryCollection
9 from shapely.geometry import (
10 Polygon,
11 Point,
12 LineString,
13 LinearRing,
14 GeometryCollection,
15 MultiPoint,
16 )
817
918 import geopandas
1019 from geopandas import GeoDataFrame, GeoSeries, clip
1423
1524
1625 pytestmark = pytest.mark.skip_no_sindex
26 pandas_133 = pd.__version__ == LooseVersion("1.3.3")
1727
1828
1929 @pytest.fixture
2030 def point_gdf():
2131 """Create a point GeoDataFrame."""
2232 pts = np.array([[2, 2], [3, 4], [9, 8], [-12, -15]])
23 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:4326")
33 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857")
2434 return gdf
2535
2636
2939 """Create a point GeoDataFrame. Its points are all outside the single
3040 rectangle, and its bounds are outside the single rectangle's."""
3141 pts = np.array([[5, 15], [15, 15], [15, 20]])
32 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:4326")
42 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857")
3343 return gdf
3444
3545
3848 """Create a point GeoDataFrame. Its points are all outside the single
3949 rectangle, and its bounds are overlapping the single rectangle's."""
4050 pts = np.array([[5, 15], [15, 15], [15, 5]])
41 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:4326")
51 gdf = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857")
4252 return gdf
4353
4454
4656 def single_rectangle_gdf():
4757 """Create a single rectangle for clipping."""
4858 poly_inters = Polygon([(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)])
49 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:4326")
59 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:3857")
5060 gdf["attr2"] = "site-boundary"
5161 return gdf
5262
5969 eliminates the slivers in the clip return.
6070 """
6171 poly_inters = Polygon([(-5, -5), (-5, 15), (15, 15), (15, -5), (-5, -5)])
62 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:4326")
72 gdf = GeoDataFrame([1], geometry=[poly_inters], crs="EPSG:3857")
6373 gdf["attr2"] = ["study area"]
6474 return gdf
6575
8797 """Create Line Objects For Testing"""
8898 linea = LineString([(1, 1), (2, 2), (3, 2), (5, 3)])
8999 lineb = LineString([(3, 4), (5, 7), (12, 2), (10, 5), (9, 7.5)])
90 gdf = GeoDataFrame([1, 2], geometry=[linea, lineb], crs="EPSG:4326")
100 gdf = GeoDataFrame([1, 2], geometry=[linea, lineb], crs="EPSG:3857")
91101 return gdf
92102
93103
95105 def multi_poly_gdf(donut_geometry):
96106 """Create a multi-polygon GeoDataFrame."""
97107 multi_poly = donut_geometry.unary_union
98 out_df = GeoDataFrame(geometry=GeoSeries(multi_poly), crs="EPSG:4326")
108 out_df = GeoDataFrame(geometry=GeoSeries(multi_poly), crs="EPSG:3857")
99109 out_df["attr"] = ["pool"]
100110 return out_df
101111
107117 # Create a single and multi line object
108118 multiline_feat = two_line_gdf.unary_union
109119 linec = LineString([(2, 1), (3, 1), (4, 1), (5, 2)])
110 out_df = GeoDataFrame(geometry=GeoSeries([multiline_feat, linec]), crs="EPSG:4326")
120 out_df = GeoDataFrame(geometry=GeoSeries([multiline_feat, linec]), crs="EPSG:3857")
111121 out_df["attr"] = ["road", "stream"]
112122 return out_df
113123
120130 geometry=GeoSeries(
121131 [multi_point, Point(2, 5), Point(-11, -14), Point(-10, -12)]
122132 ),
123 crs="EPSG:4326",
133 crs="EPSG:3857",
124134 )
125135 out_df["attr"] = ["tree", "another tree", "shrub", "berries"]
126136 return out_df
134144 poly = Polygon([(3, 4), (5, 2), (12, 2), (10, 5), (9, 7.5)])
135145 ring = LinearRing([(1, 1), (2, 2), (3, 2), (5, 3), (12, 1)])
136146 gdf = GeoDataFrame(
137 [1, 2, 3, 4], geometry=[point, poly, line, ring], crs="EPSG:4326"
147 [1, 2, 3, 4], geometry=[point, poly, line, ring], crs="EPSG:3857"
138148 )
139149 return gdf
140150
145155 point = Point([(2, 3), (11, 4), (7, 2), (8, 9), (1, 13)])
146156 poly = Polygon([(3, 4), (5, 2), (12, 2), (10, 5), (9, 7.5)])
147157 coll = GeometryCollection([point, poly])
148 gdf = GeoDataFrame([1], geometry=[coll], crs="EPSG:4326")
158 gdf = GeoDataFrame([1], geometry=[coll], crs="EPSG:3857")
149159 return gdf
150160
151161
154164 """Create a line that will create a point when clipped."""
155165 linea = LineString([(10, 5), (13, 5), (15, 5)])
156166 lineb = LineString([(1, 1), (2, 2), (3, 2), (5, 3), (12, 1)])
157 gdf = GeoDataFrame([1, 2], geometry=[linea, lineb], crs="EPSG:4326")
167 gdf = GeoDataFrame([1, 2], geometry=[linea, lineb], crs="EPSG:3857")
158168 return gdf
159169
160170
181191 def test_non_overlapping_geoms():
182192 """Test that a bounding box returns empty if the extents don't overlap"""
183193 unit_box = Polygon([(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)])
184 unit_gdf = GeoDataFrame([1], geometry=[unit_box], crs="EPSG:4326")
194 unit_gdf = GeoDataFrame([1], geometry=[unit_box], crs="EPSG:3857")
185195 non_overlapping_gdf = unit_gdf.copy()
186196 non_overlapping_gdf = non_overlapping_gdf.geometry.apply(
187197 lambda x: shapely.affinity.translate(x, xoff=20)
196206 """Test clipping a points GDF with a generic polygon geometry."""
197207 clip_pts = clip(point_gdf, single_rectangle_gdf)
198208 pts = np.array([[2, 2], [3, 4], [9, 8]])
199 exp = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:4326")
209 exp = GeoDataFrame([Point(xy) for xy in pts], columns=["geometry"], crs="EPSG:3857")
200210 assert_geodataframe_equal(clip_pts, exp)
201211
202212
208218 exp = GeoDataFrame(
209219 [Point(xy) for xy in pts],
210220 columns=["geometry2"],
211 crs="EPSG:4326",
221 crs="EPSG:3857",
212222 geometry="geometry2",
213223 )
214224 assert_geodataframe_equal(clip_pts, exp)
238248 assert all(clipped_poly.geom_type == "Polygon")
239249
240250
251 @pytest.mark.xfail(pandas_133, reason="Regression in pandas 1.3.3 (GH #2101)")
241252 def test_clip_multipoly_keep_slivers(multi_poly_gdf, single_rectangle_gdf):
242253 """Test a multi poly object where the return includes a sliver.
243254 Also the bounds of the object should == the bounds of the clip object
248259 assert "GeometryCollection" in clipped.geom_type[0]
249260
250261
262 @pytest.mark.xfail(pandas_133, reason="Regression in pandas 1.3.3 (GH #2101)")
251263 def test_clip_multipoly_keep_geom_type(multi_poly_gdf, single_rectangle_gdf):
252264 """Test a multi poly object where the return includes a sliver.
253265 Also the bounds of the object should == the bounds of the clip object
281293 assert clipped.geom_type[0] == "MultiPoint"
282294 assert hasattr(clipped, "attr")
283295 # All points should intersect the clip geom
296 assert len(clipped) == 2
297 clipped_multipoint = MultiPoint(
298 [
299 Point(2, 2),
300 Point(3, 4),
301 Point(9, 8),
302 ]
303 )
304 assert clipped.iloc[0].geometry.wkt == clipped_multipoint.wkt
284305 assert all(clipped.intersects(single_rectangle_gdf.unary_union))
285306
286307
334355 exp_poly = polygon.intersection(
335356 Polygon([(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)])
336357 )
337 exp = GeoDataFrame([1], geometry=[exp_poly], crs="EPSG:4326")
358 exp = GeoDataFrame([1], geometry=[exp_poly], crs="EPSG:3857")
338359 exp["attr2"] = "site-boundary"
339360 assert_geodataframe_equal(clipped, exp)
340361
364385
365386
366387 def test_clip_box_overlap(pointsoutside_overlap_gdf, single_rectangle_gdf):
367 """Test clip when intersection is emtpy and boxes do overlap."""
388 """Test clip when intersection is empty and boxes do overlap."""
368389 clipped = clip(pointsoutside_overlap_gdf, single_rectangle_gdf)
369390 assert len(clipped) == 0
370391
385406
386407 def test_warning_crs_mismatch(point_gdf, single_rectangle_gdf):
387408 with pytest.warns(UserWarning, match="CRS mismatch between the CRS"):
388 clip(point_gdf, single_rectangle_gdf.to_crs(3857))
409 clip(point_gdf, single_rectangle_gdf.to_crs(4326))
00 from distutils.version import LooseVersion
1 import math
2 from typing import Sequence
3 from geopandas.testing import assert_geodataframe_equal
14
25 import numpy as np
36 import pandas as pd
58 from shapely.geometry import Point, Polygon, GeometryCollection
69
710 import geopandas
8 from geopandas import GeoDataFrame, GeoSeries, read_file, sjoin
11 import geopandas._compat as compat
12 from geopandas import GeoDataFrame, GeoSeries, read_file, sjoin, sjoin_nearest
13 from geopandas.testing import assert_geoseries_equal
914
1015 from pandas.testing import assert_frame_equal
1116 import pytest
17
18
19 TEST_NEAREST = compat.PYGEOS_GE_010 and compat.USE_PYGEOS
1220
1321
1422 pytestmark = pytest.mark.skip_no_sindex
8896
8997
9098 class TestSpatialJoin:
99 @pytest.mark.parametrize(
100 "how, lsuffix, rsuffix, expected_cols",
101 [
102 ("left", "left", "right", {"col_left", "col_right", "index_right"}),
103 ("inner", "left", "right", {"col_left", "col_right", "index_right"}),
104 ("right", "left", "right", {"col_left", "col_right", "index_left"}),
105 ("left", "lft", "rgt", {"col_lft", "col_rgt", "index_rgt"}),
106 ("inner", "lft", "rgt", {"col_lft", "col_rgt", "index_rgt"}),
107 ("right", "lft", "rgt", {"col_lft", "col_rgt", "index_lft"}),
108 ],
109 )
110 def test_suffixes(self, how: str, lsuffix: str, rsuffix: str, expected_cols):
111 left = GeoDataFrame({"col": [1], "geometry": [Point(0, 0)]})
112 right = GeoDataFrame({"col": [1], "geometry": [Point(0, 0)]})
113 joined = sjoin(left, right, how=how, lsuffix=lsuffix, rsuffix=rsuffix)
114 assert set(joined.columns) == expected_cols | set(("geometry",))
115
91116 @pytest.mark.parametrize("dfs", ["default-index", "string-index"], indirect=True)
92117 def test_crs_mismatch(self, dfs):
93118 index, df1, df2, expected = dfs
95120 with pytest.warns(UserWarning, match="CRS mismatch between the CRS"):
96121 sjoin(df1, df2)
97122
123 @pytest.mark.parametrize("dfs", ["default-index"], indirect=True)
124 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
125 def test_deprecated_op_param(self, dfs, op):
126 _, df1, df2, _ = dfs
127 with pytest.warns(FutureWarning, match="`op` parameter is deprecated"):
128 sjoin(df1, df2, op=op)
129
130 @pytest.mark.parametrize("dfs", ["default-index"], indirect=True)
131 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
132 @pytest.mark.parametrize("predicate", ["contains", "within"])
133 def test_deprecated_op_param_nondefault_predicate(self, dfs, op, predicate):
134 _, df1, df2, _ = dfs
135 match = "use the `predicate` parameter instead"
136 if op != predicate:
137 warntype = UserWarning
138 match = (
139 "`predicate` will be overriden by the value of `op`"
140 + r"(.|\s)*"
141 + match
142 )
143 else:
144 warntype = FutureWarning
145 with pytest.warns(warntype, match=match):
146 sjoin(df1, df2, predicate=predicate, op=op)
147
148 @pytest.mark.parametrize("dfs", ["default-index"], indirect=True)
149 def test_unknown_kwargs(self, dfs):
150 _, df1, df2, _ = dfs
151 with pytest.raises(
152 TypeError,
153 match=r"sjoin\(\) got an unexpected keyword argument 'extra_param'",
154 ):
155 sjoin(df1, df2, extra_param="test")
156
157 @pytest.mark.filterwarnings("ignore:The `op` parameter:FutureWarning")
98158 @pytest.mark.parametrize(
99159 "dfs",
100160 [
106166 ],
107167 indirect=True,
108168 )
109 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
110 def test_inner(self, op, dfs):
169 @pytest.mark.parametrize("predicate", ["intersects", "contains", "within"])
170 @pytest.mark.parametrize("predicate_kw", ["predicate", "op"])
171 def test_inner(self, predicate, predicate_kw, dfs):
111172 index, df1, df2, expected = dfs
112173
113 res = sjoin(df1, df2, how="inner", op=op)
114
115 exp = expected[op].dropna().copy()
174 res = sjoin(df1, df2, how="inner", **{predicate_kw: predicate})
175
176 exp = expected[predicate].dropna().copy()
116177 exp = exp.drop("geometry_y", axis=1).rename(columns={"geometry_x": "geometry"})
117178 exp[["df1", "df2"]] = exp[["df1", "df2"]].astype("int64")
118179 if index == "default-index":
149210 ],
150211 indirect=True,
151212 )
152 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
153 def test_left(self, op, dfs):
213 @pytest.mark.parametrize("predicate", ["intersects", "contains", "within"])
214 def test_left(self, predicate, dfs):
154215 index, df1, df2, expected = dfs
155216
156 res = sjoin(df1, df2, how="left", op=op)
217 res = sjoin(df1, df2, how="left", predicate=predicate)
157218
158219 if index in ["default-index", "string-index"]:
159 exp = expected[op].dropna(subset=["index_left"]).copy()
220 exp = expected[predicate].dropna(subset=["index_left"]).copy()
160221 elif index == "named-index":
161 exp = expected[op].dropna(subset=["df1_ix"]).copy()
222 exp = expected[predicate].dropna(subset=["df1_ix"]).copy()
162223 elif index == "multi-index":
163 exp = expected[op].dropna(subset=["level_0_x"]).copy()
224 exp = expected[predicate].dropna(subset=["level_0_x"]).copy()
164225 elif index == "named-multi-index":
165 exp = expected[op].dropna(subset=["df1_ix1"]).copy()
226 exp = expected[predicate].dropna(subset=["df1_ix1"]).copy()
166227 exp = exp.drop("geometry_y", axis=1).rename(columns={"geometry_x": "geometry"})
167228 exp["df1"] = exp["df1"].astype("int64")
168229 if index == "default-index":
200261 }
201262 )
202263 not_in = geopandas.GeoDataFrame({"col1": [1], "geometry": [Point(-0.5, 0.5)]})
203 empty = sjoin(not_in, polygons, how="left", op="intersects")
264 empty = sjoin(not_in, polygons, how="left", predicate="intersects")
204265 assert empty.index_right.isnull().all()
205 empty = sjoin(not_in, polygons, how="right", op="intersects")
266 empty = sjoin(not_in, polygons, how="right", predicate="intersects")
206267 assert empty.index_left.isnull().all()
207 empty = sjoin(not_in, polygons, how="inner", op="intersects")
268 empty = sjoin(not_in, polygons, how="inner", predicate="intersects")
208269 assert empty.empty
209270
210 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
271 @pytest.mark.parametrize("predicate", ["intersects", "contains", "within"])
211272 @pytest.mark.parametrize(
212273 "empty",
213274 [
215276 GeoDataFrame(geometry=GeoSeries()),
216277 ],
217278 )
218 def test_join_with_empty(self, op, empty):
279 def test_join_with_empty(self, predicate, empty):
219280 # Check joins with empty geometry columns/dataframes.
220281 polygons = geopandas.GeoDataFrame(
221282 {
226287 ],
227288 }
228289 )
229 result = sjoin(empty, polygons, how="left", op=op)
290 result = sjoin(empty, polygons, how="left", predicate=predicate)
230291 assert result.index_right.isnull().all()
231 result = sjoin(empty, polygons, how="right", op=op)
292 result = sjoin(empty, polygons, how="right", predicate=predicate)
232293 assert result.index_left.isnull().all()
233 result = sjoin(empty, polygons, how="inner", op=op)
294 result = sjoin(empty, polygons, how="inner", predicate=predicate)
234295 assert result.empty
235296
236297 @pytest.mark.parametrize("dfs", ["default-index", "string-index"], indirect=True)
254315 ],
255316 indirect=True,
256317 )
257 @pytest.mark.parametrize("op", ["intersects", "contains", "within"])
258 def test_right(self, op, dfs):
318 @pytest.mark.parametrize("predicate", ["intersects", "contains", "within"])
319 def test_right(self, predicate, dfs):
259320 index, df1, df2, expected = dfs
260321
261 res = sjoin(df1, df2, how="right", op=op)
322 res = sjoin(df1, df2, how="right", predicate=predicate)
262323
263324 if index in ["default-index", "string-index"]:
264 exp = expected[op].dropna(subset=["index_right"]).copy()
325 exp = expected[predicate].dropna(subset=["index_right"]).copy()
265326 elif index == "named-index":
266 exp = expected[op].dropna(subset=["df2_ix"]).copy()
327 exp = expected[predicate].dropna(subset=["df2_ix"]).copy()
267328 elif index == "multi-index":
268 exp = expected[op].dropna(subset=["level_0_y"]).copy()
329 exp = expected[predicate].dropna(subset=["level_0_y"]).copy()
269330 elif index == "named-multi-index":
270 exp = expected[op].dropna(subset=["df2_ix1"]).copy()
331 exp = expected[predicate].dropna(subset=["df2_ix1"]).copy()
271332 exp = exp.drop("geometry_x", axis=1).rename(columns={"geometry_y": "geometry"})
272333 exp["df2"] = exp["df2"].astype("int64")
273334 if index == "default-index":
292353 exp.index.names = df2.index.names
293354
294355 # GH 1364 fix of behaviour was done in pandas 1.1.0
295 if op == "within" and str(pd.__version__) >= LooseVersion("1.1.0"):
356 if predicate == "within" and str(pd.__version__) >= LooseVersion("1.1.0"):
296357 exp = exp.sort_index()
297358
298359 assert_frame_equal(res, exp, check_index_type=False)
349410 df = sjoin(self.pointdf, self.polydf, how="inner")
350411 assert df.shape == (11, 8)
351412
352 def test_sjoin_op(self):
413 def test_sjoin_predicate(self):
353414 # points within polygons
354 df = sjoin(self.pointdf, self.polydf, how="left", op="within")
415 df = sjoin(self.pointdf, self.polydf, how="left", predicate="within")
355416 assert df.shape == (21, 8)
356417 assert df.loc[1]["BoroName"] == "Staten Island"
357418
358419 # points contain polygons? never happens so we should have nulls
359 df = sjoin(self.pointdf, self.polydf, how="left", op="contains")
420 df = sjoin(self.pointdf, self.polydf, how="left", predicate="contains")
360421 assert df.shape == (21, 8)
361422 assert np.isnan(df.loc[1]["Shape_Area"])
362423
363 def test_sjoin_bad_op(self):
424 def test_sjoin_bad_predicate(self):
364425 # AttributeError: 'Point' object has no attribute 'spandex'
365426 with pytest.raises(ValueError):
366 sjoin(self.pointdf, self.polydf, how="left", op="spandex")
427 sjoin(self.pointdf, self.polydf, how="left", predicate="spandex")
367428
368429 def test_sjoin_duplicate_column_name(self):
369430 pointdf2 = self.pointdf.rename(columns={"pointattr1": "Shape_Area"})
462523 df2 = sjoin(self.pointdf, self.polydf.append(empty), how="left")
463524 assert df2.shape == (21, 8)
464525
465 @pytest.mark.parametrize("op", ["intersects", "within", "contains"])
466 def test_sjoin_no_valid_geoms(self, op):
526 @pytest.mark.parametrize("predicate", ["intersects", "within", "contains"])
527 def test_sjoin_no_valid_geoms(self, predicate):
467528 """Tests a completely empty GeoDataFrame."""
468529 empty = GeoDataFrame(geometry=[], crs=self.pointdf.crs)
469 assert sjoin(self.pointdf, empty, how="inner", op=op).empty
470 assert sjoin(self.pointdf, empty, how="right", op=op).empty
471 assert sjoin(empty, self.pointdf, how="inner", op=op).empty
472 assert sjoin(empty, self.pointdf, how="left", op=op).empty
530 assert sjoin(self.pointdf, empty, how="inner", predicate=predicate).empty
531 assert sjoin(self.pointdf, empty, how="right", predicate=predicate).empty
532 assert sjoin(empty, self.pointdf, how="inner", predicate=predicate).empty
533 assert sjoin(empty, self.pointdf, how="left", predicate=predicate).empty
534
535 def test_empty_sjoin_return_duplicated_columns(self):
536
537 nybb = geopandas.read_file(geopandas.datasets.get_path("nybb"))
538 nybb2 = nybb.copy()
539 nybb2.geometry = nybb2.translate(200000) # to get non-overlapping
540
541 result = geopandas.sjoin(nybb, nybb2)
542
543 assert "BoroCode_right" in result.columns
544 assert "BoroCode_left" in result.columns
473545
474546
475547 class TestSpatialJoinNaturalEarth:
484556 countries = self.world[["geometry", "name"]]
485557 countries = countries.rename(columns={"name": "country"})
486558 cities_with_country = sjoin(
487 self.cities, countries, how="inner", op="intersects"
559 self.cities, countries, how="inner", predicate="intersects"
488560 )
489561 assert cities_with_country.shape == (172, 4)
562
563
564 @pytest.mark.skipif(
565 TEST_NEAREST,
566 reason=("This test can only be run _without_ PyGEOS >= 0.10 installed"),
567 )
568 def test_no_nearest_all():
569 df1 = geopandas.GeoDataFrame({"geometry": []})
570 df2 = geopandas.GeoDataFrame({"geometry": []})
571 with pytest.raises(
572 NotImplementedError,
573 match="Currently, only PyGEOS >= 0.10.0 supports `nearest_all`",
574 ):
575 sjoin_nearest(df1, df2)
576
577
578 @pytest.mark.skipif(
579 not TEST_NEAREST,
580 reason=(
581 "PyGEOS >= 0.10.0"
582 " must be installed and activated via the geopandas.compat module to"
583 " test sjoin_nearest"
584 ),
585 )
586 class TestNearest:
587 @pytest.mark.parametrize(
588 "how_kwargs", ({}, {"how": "inner"}, {"how": "left"}, {"how": "right"})
589 )
590 def test_allowed_hows(self, how_kwargs):
591 left = geopandas.GeoDataFrame({"geometry": []})
592 right = geopandas.GeoDataFrame({"geometry": []})
593 sjoin_nearest(left, right, **how_kwargs) # no error
594
595 @pytest.mark.parametrize("how", ("outer", "abcde"))
596 def test_invalid_hows(self, how: str):
597 left = geopandas.GeoDataFrame({"geometry": []})
598 right = geopandas.GeoDataFrame({"geometry": []})
599 with pytest.raises(ValueError, match="`how` was"):
600 sjoin_nearest(left, right, how=how)
601
602 @pytest.mark.parametrize("distance_col", (None, "distance"))
603 def test_empty_right_df_how_left(self, distance_col):
604 # all records from left and no results from right
605 left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
606 right = geopandas.GeoDataFrame({"geometry": []})
607 joined = sjoin_nearest(
608 left,
609 right,
610 how="left",
611 distance_col=distance_col,
612 )
613 assert_geoseries_equal(joined["geometry"], left["geometry"])
614 assert joined["index_right"].isna().all()
615 if distance_col is not None:
616 assert joined[distance_col].isna().all()
617
618 @pytest.mark.parametrize("distance_col", (None, "distance"))
619 def test_empty_right_df_how_right(self, distance_col):
620 # no records in joined
621 left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
622 right = geopandas.GeoDataFrame({"geometry": []})
623 joined = sjoin_nearest(
624 left,
625 right,
626 how="right",
627 distance_col=distance_col,
628 )
629 assert joined.empty
630 if distance_col is not None:
631 assert distance_col in joined
632
633 @pytest.mark.parametrize("how", ["inner", "left"])
634 @pytest.mark.parametrize("distance_col", (None, "distance"))
635 def test_empty_left_df(self, how, distance_col):
636 right = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
637 left = geopandas.GeoDataFrame({"geometry": []})
638 joined = sjoin_nearest(left, right, how=how, distance_col=distance_col)
639 assert joined.empty
640 if distance_col is not None:
641 assert distance_col in joined
642
643 @pytest.mark.parametrize("distance_col", (None, "distance"))
644 def test_empty_left_df_how_right(self, distance_col):
645 right = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
646 left = geopandas.GeoDataFrame({"geometry": []})
647 joined = sjoin_nearest(
648 left,
649 right,
650 how="right",
651 distance_col=distance_col,
652 )
653 assert_geoseries_equal(joined["geometry"], right["geometry"])
654 assert joined["index_left"].isna().all()
655 if distance_col is not None:
656 assert joined[distance_col].isna().all()
657
658 @pytest.mark.parametrize("how", ["inner", "left"])
659 def test_empty_join_due_to_max_distance(self, how):
660 # after applying max_distance there are no matches: how="inner" drops
661 # the row, how="left" keeps it with NaN in the joined columns
662 left = geopandas.GeoDataFrame({"geometry": [Point(0, 0)]})
663 right = geopandas.GeoDataFrame({"geometry": [Point(1, 1), Point(2, 2)]})
664 joined = sjoin_nearest(
665 left,
666 right,
667 how=how,
668 max_distance=1,
669 distance_col="distances",
670 )
671 expected = left.copy()
672 expected["index_right"] = [np.nan]
673 expected["distances"] = [np.nan]
674 if how == "inner":
675 expected = expected.dropna()
676 expected["index_right"] = expected["index_right"].astype("int64")
677 assert_geodataframe_equal(joined, expected)
678
679 def test_empty_join_due_to_max_distance_how_right(self):
680 # after applying max_distance there are no matches, so the columns
681 # joined from the left side are all NaN
682 left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
683 right = geopandas.GeoDataFrame({"geometry": [Point(2, 2)]})
684 joined = sjoin_nearest(
685 left,
686 right,
687 how="right",
688 max_distance=1,
689 distance_col="distances",
690 )
691 expected = right.copy()
692 expected["index_left"] = [np.nan]
693 expected["distances"] = [np.nan]
694 expected = expected[["index_left", "geometry", "distances"]]
695 assert_geodataframe_equal(joined, expected)
696
697 @pytest.mark.parametrize("how", ["inner", "left"])
698 def test_max_distance(self, how):
699 left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
700 right = geopandas.GeoDataFrame({"geometry": [Point(1, 1), Point(2, 2)]})
701 joined = sjoin_nearest(
702 left,
703 right,
704 how=how,
705 max_distance=1,
706 distance_col="distances",
707 )
708 expected = left.copy()
709 expected["index_right"] = [np.nan, 0]
710 expected["distances"] = [np.nan, 0]
711 if how == "inner":
712 expected = expected.dropna()
713 expected["index_right"] = expected["index_right"].astype("int64")
714 assert_geodataframe_equal(joined, expected)
715
716 def test_max_distance_how_right(self):
717 left = geopandas.GeoDataFrame({"geometry": [Point(1, 1), Point(2, 2)]})
718 right = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
719 joined = sjoin_nearest(
720 left,
721 right,
722 how="right",
723 max_distance=1,
724 distance_col="distances",
725 )
726 expected = right.copy()
727 expected["index_left"] = [np.nan, 0]
728 expected["distances"] = [np.nan, 0]
729 expected = expected[["index_left", "geometry", "distances"]]
730 assert_geodataframe_equal(joined, expected)
731
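The two `max_distance` tests above pin down the search-radius semantics: matches farther than the radius are dropped with `how="inner"`, while `how="left"`/`how="right"` keep the unmatched rows with NaN indices and distances. A minimal sketch using the same geometries as `test_max_distance`:

import geopandas
from shapely.geometry import Point

left = geopandas.GeoDataFrame({"geometry": [Point(0, 0), Point(1, 1)]})
right = geopandas.GeoDataFrame({"geometry": [Point(1, 1), Point(2, 2)]})

# Point(0, 0) has no neighbour within 1 unit, so its row keeps NaN values;
# with how="inner" that row would be dropped instead
joined = geopandas.sjoin_nearest(
    left, right, how="left", max_distance=1, distance_col="distances"
)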
732 @pytest.mark.parametrize("how", ["inner", "left"])
733 @pytest.mark.parametrize(
734 "geo_left, geo_right, expected_left, expected_right, distances",
735 [
736 (
737 [Point(0, 0), Point(1, 1)],
738 [Point(1, 1)],
739 [0, 1],
740 [0, 0],
741 [math.sqrt(2), 0],
742 ),
743 (
744 [Point(0, 0), Point(1, 1)],
745 [Point(1, 1), Point(0, 0)],
746 [0, 1],
747 [1, 0],
748 [0, 0],
749 ),
750 (
751 [Point(0, 0), Point(1, 1)],
752 [Point(1, 1), Point(0, 0), Point(0, 0)],
753 [0, 0, 1],
754 [1, 2, 0],
755 [0, 0, 0],
756 ),
757 (
758 [Point(0, 0), Point(1, 1)],
759 [Point(1, 1), Point(0, 0), Point(2, 2)],
760 [0, 1],
761 [1, 0],
762 [0, 0],
763 ),
764 (
765 [Point(0, 0), Point(1, 1)],
766 [Point(1, 1), Point(0.25, 1)],
767 [0, 1],
768 [1, 0],
769 [math.sqrt(0.25 ** 2 + 1), 0],
770 ),
771 (
772 [Point(0, 0), Point(1, 1)],
773 [Point(-10, -10), Point(100, 100)],
774 [0, 1],
775 [0, 0],
776 [math.sqrt(10 ** 2 + 10 ** 2), math.sqrt(11 ** 2 + 11 ** 2)],
777 ),
778 (
779 [Point(0, 0), Point(1, 1)],
780 [Point(x, y) for x, y in zip(np.arange(10), np.arange(10))],
781 [0, 1],
782 [0, 1],
783 [0, 0],
784 ),
785 (
786 [Point(0, 0), Point(1, 1), Point(0, 0)],
787 [Point(1.1, 1.1), Point(0, 0)],
788 [0, 1, 2],
789 [1, 0, 1],
790 [0, np.sqrt(0.1 ** 2 + 0.1 ** 2), 0],
791 ),
792 ],
793 )
794 def test_sjoin_nearest_left(
795 self,
796 geo_left,
797 geo_right,
798 expected_left: Sequence[int],
799 expected_right: Sequence[int],
800 distances: Sequence[float],
801 how,
802 ):
803 left = geopandas.GeoDataFrame({"geometry": geo_left})
804 right = geopandas.GeoDataFrame({"geometry": geo_right})
805 expected_gdf = left.iloc[expected_left].copy()
806 expected_gdf["index_right"] = expected_right
807 # without distance col
808 joined = sjoin_nearest(left, right, how=how)
809 # inner / left join give a different row order
810 check_like = how == "inner"
811 assert_geodataframe_equal(expected_gdf, joined, check_like=check_like)
812 # with distance col
813 expected_gdf["distance_col"] = np.array(distances, dtype=float)
814 joined = sjoin_nearest(left, right, how=how, distance_col="distance_col")
815 assert_geodataframe_equal(expected_gdf, joined, check_like=check_like)
816
817 @pytest.mark.parametrize(
818 "geo_left, geo_right, expected_left, expected_right, distances",
819 [
820 ([Point(0, 0), Point(1, 1)], [Point(1, 1)], [1], [0], [0]),
821 (
822 [Point(0, 0), Point(1, 1)],
823 [Point(1, 1), Point(0, 0)],
824 [1, 0],
825 [0, 1],
826 [0, 0],
827 ),
828 (
829 [Point(0, 0), Point(1, 1)],
830 [Point(1, 1), Point(0, 0), Point(0, 0)],
831 [1, 0, 0],
832 [0, 1, 2],
833 [0, 0, 0],
834 ),
835 (
836 [Point(0, 0), Point(1, 1)],
837 [Point(1, 1), Point(0, 0), Point(2, 2)],
838 [1, 0, 1],
839 [0, 1, 2],
840 [0, 0, math.sqrt(2)],
841 ),
842 (
843 [Point(0, 0), Point(1, 1)],
844 [Point(1, 1), Point(0.25, 1)],
845 [1, 1],
846 [0, 1],
847 [0, 0.75],
848 ),
849 (
850 [Point(0, 0), Point(1, 1)],
851 [Point(-10, -10), Point(100, 100)],
852 [0, 1],
853 [0, 1],
854 [math.sqrt(10 ** 2 + 10 ** 2), math.sqrt(99 ** 2 + 99 ** 2)],
855 ),
856 (
857 [Point(0, 0), Point(1, 1)],
858 [Point(x, y) for x, y in zip(np.arange(10), np.arange(10))],
859 [0, 1] + [1] * 8,
860 list(range(10)),
861 [0, 0] + [np.sqrt(x ** 2 + x ** 2) for x in np.arange(1, 9)],
862 ),
863 (
864 [Point(0, 0), Point(1, 1), Point(0, 0)],
865 [Point(1.1, 1.1), Point(0, 0)],
866 [1, 0, 2],
867 [0, 1, 1],
868 [np.sqrt(0.1 ** 2 + 0.1 ** 2), 0, 0],
869 ),
870 ],
871 )
872 def test_sjoin_nearest_right(
873 self,
874 geo_left,
875 geo_right,
876 expected_left: Sequence[int],
877 expected_right: Sequence[int],
878 distances: Sequence[float],
879 ):
880 left = geopandas.GeoDataFrame({"geometry": geo_left})
881 right = geopandas.GeoDataFrame({"geometry": geo_right})
882 expected_gdf = right.iloc[expected_right].copy()
883 expected_gdf["index_left"] = expected_left
884 expected_gdf = expected_gdf[["index_left", "geometry"]]
885 # without distance col
886 joined = sjoin_nearest(left, right, how="right")
887 assert_geodataframe_equal(expected_gdf, joined)
888 # with distance col
889 expected_gdf["distance_col"] = np.array(distances, dtype=float)
890 joined = sjoin_nearest(left, right, how="right", distance_col="distance_col")
891 assert_geodataframe_equal(expected_gdf, joined)
892
893 @pytest.mark.filterwarnings("ignore:Geometry is in a geographic CRS")
894 def test_sjoin_nearest_inner(self):
895 # check equivalency of left and inner join
896 countries = read_file(geopandas.datasets.get_path("naturalearth_lowres"))
897 cities = read_file(geopandas.datasets.get_path("naturalearth_cities"))
898 countries = countries[["geometry", "name"]].rename(columns={"name": "country"})
899
900 # default: inner and left give the same result
901 result1 = sjoin_nearest(cities, countries, distance_col="dist")
902 assert result1.shape[0] == cities.shape[0]
903 result2 = sjoin_nearest(cities, countries, distance_col="dist", how="inner")
904 assert_geodataframe_equal(result2, result1)
905 result3 = sjoin_nearest(cities, countries, distance_col="dist", how="left")
906 assert_geodataframe_equal(result3, result1, check_like=True)
907
908 # with max_distance: rows that go above are dropped in case of inner
909 result4 = sjoin_nearest(cities, countries, distance_col="dist", max_distance=1)
910 assert_geodataframe_equal(
911 result4, result1[result1["dist"] < 1], check_like=True
912 )
913 result5 = sjoin_nearest(
914 cities, countries, distance_col="dist", max_distance=1, how="left"
915 )
916 assert result5.shape[0] == cities.shape[0]
917 result5 = result5.dropna()
918 result5["index_right"] = result5["index_right"].astype("int64")
919 assert_geodataframe_equal(result5, result4, check_like=True)
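As `test_sjoin_nearest_inner` asserts, `how="inner"` is the default and, without `max_distance`, is equivalent to `how="left"` up to row order. A brief usage sketch with the same bundled datasets the test reads:

import geopandas

cities = geopandas.read_file(geopandas.datasets.get_path("naturalearth_cities"))
countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

# default how="inner": each city is joined to its nearest country,
# with the distance recorded in the "dist" column
nearest = geopandas.sjoin_nearest(cities, countries, distance_col="dist")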
3333 # Point and MultiPoint... or even just MultiPoint
3434 t = x[0].type
3535 if not all(g.type == t for g in x):
36 raise ValueError("Geometry type must be homogenous")
36 raise ValueError("Geometry type must be homogeneous")
3737 if len(x) > 1 and t.startswith("Multi"):
3838 raise ValueError("Cannot collect {0}. Must have single geometries".format(t))
3939
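This hunk fixes the spelling of the homogeneity error message; assuming it belongs to `geopandas.tools.collect` (the diff omits the file header), a usage sketch:

from shapely.geometry import Point
from geopandas.tools import collect

# single geometries of one type are collected into a Multi* geometry;
# mixing types raises ValueError("Geometry type must be homogeneous")
multipoint = collect([Point(0, 0), Point(1, 1)])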
00 # required
11 fiona>=1.8
2 pandas>=0.24
2 pandas>=0.25
33 pyproj>=2.2.0
44 shapely>=1.6
55
1414 matplotlib>=2.2
1515 mapclassify
1616
1717 # testing
1818 pytest>=3.1.0
1919 pytest-cov
2020 codecov
2121
2222 # spatial access methods
2323 rtree>=0.8
2424
2525 # styling
2929 INSTALL_REQUIRES = []
3030 else:
3131 INSTALL_REQUIRES = [
32 "pandas >= 0.24.0",
32 "pandas >= 0.25.0",
3333 "shapely >= 1.6",
3434 "fiona >= 1.8",
3535 "pyproj >= 2.2.0",
6666 "geopandas.tools.tests",
6767 ],
6868 package_data={"geopandas": data_files},
69 python_requires=">=3.6",
69 python_requires=">=3.7",
7070 install_requires=INSTALL_REQUIRES,
7171 cmdclass=versioneer.get_cmdclass(),
7272 )