I have created a function that is supposed to drop duplicate geometries from a polygon layer and then save the result as a shapefile.
The function runs, but I have realized that the polygons are not saved correctly and I get lines instead.
When I checked how the polygons are saved, I found that the polygon string is not saved in full, and for that reason I can't get the shapes properly.
This is how it happened.
First, I created the function:
import geopandas as gpd
import pandas as pd
# variables to define
file_location = r'data/ABCD.csv'
crs_s = 'epsg:4326'
geometry_column = 'GEOMETRY'
result_dest = 'shape/ABCD.shp'

def unique_geometry(file_location, crs_s, geometry_column, result_dest):
    # read the file and assign the CRS
    gdf = gpd.read_file(file_location)
    gdf.crs = crs_s
    print('original length of the dataframe is: {}'.format(len(gdf)))
    # drop rows whose geometry value is repeated
    gdf.drop_duplicates(subset=geometry_column, inplace=True)
    print('Total unique geometris: {}'.format(len(gdf)))
    # save the result as a shapefile
    gdf.to_file(result_dest)
    return gdf

gdf = unique_geometry(file_location, crs_s, geometry_column, result_dest)
>>>
original length of the dataframe is: 1006
Total unique geometris: 479
Now when I print one value from the geometry column:
gdf.iloc[0,2]
>>>'MULTIPOLYGON(((-51.71916 -12.36817,-51.71903 -12.36821,-51.71906 -12.36936,-51.71906 -12.37198,-51.71924 -12.37435,-51.71916 -12.37494,-51.72185 -12.37441,-51.72488 -12.37372,-51.72539 -12.37355,-51.72615 -12.37306,-51.72612 -12.37264,-51.726 -12.37222,-51.72588 -12.37203,-51.72578 -12.37175,-51.72565 -12.37147,-51.7253 -12.37111,-51.72439 -12.37083,-51.72406 -12.37074,-51.72376 -12.37058,-51.72354 -12.37039,-51.72315 -12.3702,-51.72289 -12.37016,-51.72196 -12.36959,-51.72167 -12.36946,-51.72133 -12.36946,-51.72003 -12.36904,-51.71978 -12.36886,-51.71916 -12.36817)))'
But when I read the saved shapefile back and use iloc on the same cell:
gdf = gpd.read_file(r'shape/ABCD.shp')
gdf.iloc[0,2]
'MULTIPOLYGON(((-51.71916 -12.36817,-51.71903 -12.36821,-51.71906 -12.36936,-51.71906 -12.37198,-51.71924 -12.37435,-51.71916 -12.37494,-51.72185 -12.37441,-51.72488 -12.37372,-51.72539 -12.37355,-51.72615 -12.37306,-51.72612 -12.37264,-51.726 -12.37222,-'
I have no idea why this happens or how to fix it.
This also happens outside the function:
gdf = gpd.read_file(file_location)
gdf.crs = crs_s
gdf.drop_duplicates(subset=geometry_column, inplace=True)
print(gdf.iloc[0, 2])
gdf.to_file('shape/ABCD.shp')
gdf1 = gpd.read_file('shape/ABCD.shp')
print(gdf1.iloc[0, 2])
>>>MULTIPOLYGON(((-51.71916 -12.36817,-51.71903 -12.36821,-51.71906 -12.36936,-51.71906 -12.37198,-51.71924 -12.37435,-51.71916 -12.37494,-51.72185 -12.37441,-51.72488 -12.37372,-51.72539 -12.37355,-51.72615 -12.37306,-51.72612 -12.37264,-51.726 -12.37222,-51.72588 -12.37203,-51.72578 -12.37175,-51.72565 -12.37147,-51.7253 -12.37111,-51.72439 -12.37083,-51.72406 -12.37074,-51.72376 -12.37058,-51.72354 -12.37039,-51.72315 -12.3702,-51.72289 -12.37016,-51.72196 -12.36959,-51.72167 -12.36946,-51.72133 -12.36946,-51.72003 -12.36904,-51.71978 -12.36886,-51.71916 -12.36817)))
MULTIPOLYGON(((-51.71916 -12.36817,-51.71903 -12.36821,-51.71906 -12.36936,-51.71906 -12.37198,-51.71924 -12.37435,-51.71916 -12.37494,-51.72185 -12.37441,-51.72488 -12.37372,-51.72539 -12.37355,-51.72615 -12.37306,-51.72612 -12.37264,-51.726 -12.37222,-
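One observation that may help narrow this down, offered as a likely explanation rather than a confirmed diagnosis: the value read back from the shapefile is exactly 254 characters long, which is the maximum width of a text field in the .dbf attribute table of a shapefile. That would suggest the GEOMETRY column is being written as an ordinary text attribute rather than as the layer's geometry. The sketch below only mimics the cut with a plain string slice (it is not what the writer does internally), and `DBF_TEXT_LIMIT` is a name introduced here for illustration.

```python
# Maximum width of a character field in a .dbf file (name is illustrative).
DBF_TEXT_LIMIT = 254

# Full WKT string as printed before saving, copied from the output above.
full_wkt = (
    "MULTIPOLYGON((("
    "-51.71916 -12.36817,-51.71903 -12.36821,-51.71906 -12.36936,"
    "-51.71906 -12.37198,-51.71924 -12.37435,-51.71916 -12.37494,"
    "-51.72185 -12.37441,-51.72488 -12.37372,-51.72539 -12.37355,"
    "-51.72615 -12.37306,-51.72612 -12.37264,-51.726 -12.37222,"
    "-51.72588 -12.37203,-51.72578 -12.37175,-51.72565 -12.37147,"
    "-51.7253 -12.37111,-51.72439 -12.37083,-51.72406 -12.37074,"
    "-51.72376 -12.37058,-51.72354 -12.37039,-51.72315 -12.3702,"
    "-51.72289 -12.37016,-51.72196 -12.36959,-51.72167 -12.36946,"
    "-51.72133 -12.36946,-51.72003 -12.36904,-51.71978 -12.36886,"
    "-51.71916 -12.36817)))"
)

# Mimic a fixed-width text field by keeping only the first 254 characters.
read_back = full_wkt[:DBF_TEXT_LIMIT]

print(len(full_wkt), len(read_back))
print(read_back[-19:])  # tail of the shortened string, for comparison
```

The tail of the sliced string matches the point where the value read back from `shape/ABCD.shp` stops.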
My end goal: to drop the duplicate geometries and save the result as a shapefile correctly.
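A minimal sketch of one way this goal might be reached, assuming the GEOMETRY column of the CSV holds valid WKT text: deduplicate on the text, parse it into shapely geometries with `shapely.wkt.loads`, and build the GeoDataFrame from those objects so that the polygons are written as the layer geometry instead of as a text attribute. The helper name `wkt_frame_to_gdf` and the small in-memory table are hypothetical and stand in for the real file.

```python
import geopandas as gpd
import pandas as pd
from shapely import wkt


def wkt_frame_to_gdf(df, geometry_column, crs):
    """Drop duplicate WKT strings, then parse them into real geometries."""
    # Deduplicate on the raw WKT text, as the original code does.
    unique = df.drop_duplicates(subset=geometry_column).reset_index(drop=True)
    # Parse each WKT string into a shapely geometry object.
    geoms = unique[geometry_column].apply(wkt.loads)
    # Drop the text column so it is not written as a plain text attribute.
    attrs = unique.drop(columns=[geometry_column])
    return gpd.GeoDataFrame(attrs, geometry=list(geoms), crs=crs)


# Tiny in-memory stand-in for the CSV (hypothetical data, not from ABCD.csv).
demo = pd.DataFrame({
    "id": [1, 2, 3],
    "GEOMETRY": [
        "MULTIPOLYGON(((0 0,1 0,1 1,0 0)))",
        "MULTIPOLYGON(((0 0,1 0,1 1,0 0)))",  # duplicate of the first row
        "MULTIPOLYGON(((2 2,3 2,3 3,2 2)))",
    ],
})

demo_gdf = wkt_frame_to_gdf(demo, "GEOMETRY", "epsg:4326")
print(len(demo_gdf), demo_gdf.geometry.iloc[0].geom_type)
```

With the real data, the same helper could be fed `pd.read_csv(file_location)` and the result saved with `.to_file(result_dest)`; this has not been run against ABCD.csv, so treat it as a starting point.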
question from:
https://stackoverflow.com/questions/65869409/geopandas-do-not-save-the-full-geometry-when-using-to-file