UPSERT (update or insert) in Oracle with Python
2 min read

UPSERT (update or insert) in Oracle with Python

UPSERT (update or insert) in Oracle with Python

One of the things I had to do is to build a script to upload data (e.g. the content of a list of files) into a database with a single table. My table has 3 columns:

  1. ID - a 20-character string identifier with PK properties (unique)
  2. ARTICLE- an XMLType element with the article details
  3. CITEDBY - an XMLType element with citations data

The problem is that some of the data is already in the database and it may only need updating. Following this question, the selected strategy is to perform the update and, if the record does not exist, then insert it. The SQL code equivalent is (from here):

UPDATE tablename
    SET val1 = in_val1,
        val2 = in_val2
    WHERE val3 = in_val3;

IF ( sql%rowcount = 0 )
    THEN
    INSERT INTO tablename
        VALUES (in_val1, in_val2, in_val3);
END IF;

If this were only SQL, everything would be OK. However, the requirement is to perform the operation from python, so I can only do one statement at a time. Moreover, I can't do an IF statement. To be a bit DRY (just in case I need to reuse this somewhere else), I've created a simple function:

def insert_or_update(connection, identifier, article, citations=None):
  cursor = connection.cursor()

Update

First attempt is to update a record, if it exists:

  cursor.setinputsizes(cx_Oracle.CLOB, cx_Oracle.CLOB, 20)
  cursor.prepare(UPDATE_STATEMENT)
  cursor.execute(None, (article, citations, identifier))
  connection.commit()

My update statement is:

update ARTICLE_XML_DATA set
  article = XMLType(
    xmlData => :1,
    schema => 'article-description.xsd',
    validated => 0,
    wellformed => 0
  ),
  citedby = XMLType(
    xmlData => :2
  )
WHERE id = :3

To make usable in the python script, you need to wrap it into UPDATE_STATEMENT=""" ... """.

You can see the statement will have an effect only if the table contains a record with the required identifier. Otherwise, nothing happens.

Insert

Since I can't perform a composite statement (with an IF) without moving to PL/SQL, I need to emulate this. The simplest way to do it is to perform an INSERT anyway and catch the error if the record is already there. The resulting code is described below:

  try:
      cursor.setinputsizes(cx_Oracle.CLOB, cx_Oracle.CLOB, 20)
      cursor.prepare(INSERT_STATEMENT)
      cursor.execute(None, (article, citations, identifier))
      connection.commit()
      print >> sys.stderr, "++ New Entry:", identifier
  except cx_Oracle.IntegrityError, e:
      if ("%s" % e.message).startswith('ORA-00001:'):
          print >> sys.stderr, "Entry already there:", identifier
      else:
          raise e

As you can see, I'm catching the IntegrityError and check if it's for a duplicate entry (my identifier field has a primary key -- unique -- constraint). If yes, then I'm just noting it. Otherwise, I'm propagating the error further.

The insert string is very similar to the update string above:

INSERT into ARTICLE_XML_DATA(article, citedby, id)
  values (
      XMLType(
      xmlData => :1,
      schema => 'article-description.xsd',
      validated => 0,
      wellformed => 0
    ),
    xmltype(:2),
    :3
)

Then, at the end I perform a cursor closing (to avoid the max cursors error):

  finally:
      cursor.close()

Function

The full function contents is:

def insert_or_update(connection, identifier, article, citations=None):
    cursor = connection.cursor()

    cursor.setinputsizes(cx_Oracle.CLOB, cx_Oracle.CLOB, 20)
    cursor.prepare(UPDATE_STATEMENT)
    cursor.execute(None, (article, citations, identifier))
    connection.commit()

    try:
        cursor.setinputsizes(cx_Oracle.CLOB, cx_Oracle.CLOB, 20)
        cursor.prepare(INSERT_STATEMENT)
        cursor.execute(None, (article, citations, identifier))
        connection.commit()
        print >> sys.stderr, "++ New Entry:", identifier
    except cx_Oracle.IntegrityError, e:
        if ("%s" % e.message).startswith('ORA-00001:'):
            print >> sys.stderr, "Entry already there:", identifier
        else:
            raise e
    finally:
        cursor.close()

Some notes

  • The function can be further enhanced to allow for flexible arguments, different strings
  • You could perform the PL/SQL statements in a more complex statement