Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

python - Prevent duplicate table entries in a many-to-many relationship in SQLAlchemy

I am trying to set up a movie database using SQLAlchemy with a many-to-many relationship. I have two tables, 'movie' and 'actor', and an association table 'movie_actor'. I would like to be able to add a new movie to the movie table, but if some of the actors in this new movie are already in the actor table, I would like to prevent duplicating them in the actor table while still adding the movie_id and actor_id's to the association table. Here is my table setup:

   from sqlalchemy import Table, Column, Integer, String, ForeignKey, create_engine, and_, or_
   from sqlalchemy.ext.declarative import declarative_base
   from sqlalchemy.orm import backref, mapper, relationship, Session

   Base = declarative_base()

   formats = {'1': 'DVD', '2': 'Blu-ray', '3': 'Digital', '4': 'VHS'}

   ###########################################################################################
   class Movie(Base):
   """Movie Class"""

        __tablename__ = "movie"

        movie_id = Column(Integer, primary_key=True)
        title = Column(String(20), nullable=False, unique=True)
        year = Column(Integer, nullable=False)
        format = Column(String, nullable=False)
        movie_actor = relationship("MovieActor", cascade="all, delete-orphan", backref="movie")

        def __init__(self, title, year, format):
            self.title = title
            self.year = year
            self.format = format

        def __repr__(self):
            return "%s %s" % (self.movie_id, self.title)

    ###########################################################################################
    class Actor(Base):
    """Actor Class"""

        __tablename__ = "actor"

        actor_id = Column(Integer, primary_key=True)
        full_name = Column(String(30), nullable=False, unique=True)

        def __init__(self, full_name):
            self.full_name = full_name

        def __repr__(self):
            return "%s %s" % (self.actor_id, self.full_name)

    ###########################################################################################
    class MovieActor(Base):
    """MovieActor Association Class"""

        __tablename__ = "movieactor"
        movie_id = Column(Integer, ForeignKey('movie.movie_id'), primary_key=True)
        actor_id = Column(Integer, ForeignKey('actor.actor_id'), primary_key=True)

        def __init__(self, actor):
            self.actor = actor
        actor = relationship(Actor, lazy='joined')

        def __repr__(self):
            return "%s, %s" % (self.movie_id, self.actor_id)

And here is a class which will handle inserting new entries and querying the database:

    ###########################################################################################
    class Database(object):

    # A connection to the movie database is established upon instantiation.
        def __init__(self):
            engine = create_engine('sqlite:///bmdb.db')
            Base.metadata.create_all(engine)
            session = Session(engine)
            self.session = session

    # add_new method takes a dictionary of strings containing all the info for a new movie: "title, year, format, actors"
    # and formats the strings, then adds them to the proper tables in the database

        def add_new(self, new_movie):

            #find out what formats exist
            format = ""
            for i in range(1,5):
                    try:
                            format += new_movie[formats[str(i)]]
                            format += ", "
                    except:
                            pass

            format = format[:-1]
            format = format[:-1]

            # capitalize the first letter of each word in the movie title
            title = " ".join(word[0].upper() + word[1:].lower() for word in new_movie['title'].split())
            try:
                    movie = Movie(title, new_movie['year'], format)
                    # add the new movie to the session
                    self.session.add(movie)
                    # commit the new movie to the database
                    self.session.commit()
            except:
                    print "Duplicate Movie"
                    self.session.rollback()
                    return

            # parse the text in the actors entry
            # take the incoming string of all actors in the movie and split it into a list of individual actors

            actors = new_movie['actors'].split(", ")        
            for i in range(len(actors)):
                    # for each actor in the list, capitalize the first letter in their first and last names
                    actors[i] = " ".join(word[0].upper() + word[1:].lower() for word in actors[i].split())
                    # add each formatted actor name to the Actor table
                    actor = Actor(actors[i])
                    try:
                            # add the appropriate association between the movie and the actors to the MovieActor table
                            movie.movie_actor.append(MovieActor(actor))
                            # add the new actor and movieactor association to the session
                            self.session.add(movie)
                            self.session.commit()
                    except:
                            print "Duplicate Actor"
                            self.session.rollback()

As my code stands now, the try/except block in the add_new() method prevents duplicate actors from being added to the database since the 'full_name' column in the actor table is set to unique=True, but this also prevents an entry from being added to the movie_actor association table.

Basically what I would like to know is how to add a movie, check to see if the actors in the movie already exist in the actor table, and if they do, do not insert the actors into the actor table, but take their existing actor_id from the actor table and create the appropriate association in the movie_actor association table.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You'll likely need to insert a self.session.begin_nested() in your try: block. Then if you need to rollback because of the duplicate key, you can still add the actors to the movie:

from sqlalchemy.exc import IntegrityError  # only catch the right exception!
           # in for loop:
                try:
                        session.begin_nested()
                        actor = Actor(actors[i])
                except IntegrityError:
                        print "Duplicate Actor"
                        self.session.rollback() # subtransaction
                        actor = self.session.query(Actor).
                           filter(Actor.name==actors[i]).first()
                else:
                        self.session.commit()  # subtransaction

                # add the appropriate association between the movie and the actors to the MovieActor table
                movie.movie_actor.append(MovieActor(actor))
                # add the new actor and movieactor association to the session
                self.session.add(movie)
                self.session.commit()

Edit: always except IntegrityError when expecting duplicate key errors.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...