Context
I just get into trouble while trying to do some I/O operations on some databases from a Python3
script.
When I want to connect to a database, I habitually use psycopg2
in order to handle the connections and cursors.
My data are usually stored as Pandas
DataFrames
and/or GeoPandas
's equivalent GeoDataFrames
.
Difficulties
In order to read data from a database table;
I can rely on its .read_sql()
methods which takes as a parameter con
, as stated in the doc:
con : SQLAlchemy connectable (engine/connection) or database str URI
or DBAPI2 connection (fallback mode)'
Using SQLAlchemy makes it possible to use any DB supported by that
library. If a DBAPI2 object, only sqlite3 is supported. The user is responsible
for engine disposal and connection closure for the SQLAlchemy connectable. See
`here <https://docs.sqlalchemy.org/en/13/core/connections.html>`_
I can rely on its .read_postigs()
methods which takes as a parameter con
, as stated in the doc:
con : DB connection object or SQLAlchemy engine
Active connection to the database to query.
In order to write data to a database table;
I can rely on the .to_sql()
methods which takes as a parameter con
, as stated in the doc:
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. Legacy support is provided for sqlite3.Connection objects. The user
is responsible for engine disposal and connection closure for the SQLAlchemy
connectable See `here <https://docs.sqlalchemy.org/en/13/core/connections.html>`_
I can rely on the .to_sql()
methods (which directly relies on the Pandas
.to_sql()
) which takes as a parameter con
, as stated in the doc:
con : sqlalchemy.engine.Engine or sqlite3.Connection
Using SQLAlchemy makes it possible to use any DB supported by that
library. Legacy support is provided for sqlite3.Connection objects. The user
is responsible for engine disposal and connection closure for the SQLAlchemy
connectable See `here <https://docs.sqlalchemy.org/en/13/core/connections.html>`_
From here, I easily understand that GeoPandas
is built on Pandas
especially for its GeoDataFrame
object, which is, shortly, a special DataFrame
that can handle geographic data.
But I'm wondering why do GeoPandas
has the ability to directly takes a psycopg2
connection as an argument and not Pandas
and if it is planned for the latter?
And why is it neither the case for one nor the other when it comes to writing data?
I would like (as probably many of others1,2) to directly give them a psycopg2
connections argument instead of relying on SQLAlchemy
engine.
Because even is this tool is really great, it makes me use two different frameworks to connect to my database and thus handle two different connection strings (and I personally prefer the way psycopg2
handles the parameters expansion from a dictionary to build a connection string properly such as; psycopg2.connect(**dict_params)
vs URL injection as explained here for example: Is it possible to pass a dictionary into create_engine function in SQLAlchemy?).
Workaround
I was first creating my connection string with psycopg2
from a dictionary of parameters this way:
connParams = ("user={}", "password={}", "host={}", "port={}", "dbname={}")
conn = ' '.join(connParams).format(*dict_params.values())
Then I figured out it was better and more pythonic this way:
conn = psycopg2.connect(**dict_params)
Which I finally replaced by this, so that I can interchangeably use it to build either a psycopg2
connections, or a SQLAlchemy
engine:
def connector():
return psycopg2.connect(**dict_params)
a) Initialize a psycopg2
connection is now done by:
conn = connector()
curs = conn.cursor()
b) And initialize a SQLAlchemy
engine by:
engine = create_engine('postgresql+psycopg2://', creator=connector)
(or with any of your flavored db+driver
)
This is well documented here:
https://docs.sqlalchemy.org/en/13/core/engines.html#custom-dbapi-args
and here:
https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine
[1] Dataframe to sql without Sql Alchemy engine
[2] How to write data frame to Postgres table without using SQLAlchemy engine?
See Question&Answers more detail:
os