Use Oracle's RegisterSchema with Large Schemas (from Python)
1 min read

Use Oracle's RegisterSchema with Large Schemas (from Python)

Use Oracle's RegisterSchema with Large Schemas (from Python)

An important note when loading XSDs in Oracle is that DBMSXMLSCHEMA.registerSchema() requires the actual content of the _xsd is loaded in a string. A side effect is that strings in Oracle have limits and schemas can exceed those limits. Therefore, such strings must be fragmented. In this case, we need to employ PL/SQL. A processed fragment could look something like this:

declare
    vSCHEMA clob;
begin
    -- Concatenate fragments
    vSCHEMA := CONCAT(vSCHEMA, 'long-string-fragment');
    vSCHEMA := CONCAT(vSCHEMA, 'long-string-fragment');


    dbms_xmlschema.registerSchema(
        schemaURL => 'mySchems.xsd',
        schemaDoc => vSCHEMA,
        local => true,
        genTypes => false,
        genTables => true,
        enableHierarchy => dbms_xmlschema.ENABLE_HIERARCHY_NONE
    );

end;

Now, all you need to do is populate the fragments (where I've placed 'long-string-fragment') with the actual content. A rule of thumb I've got is to have a fragment to be roughly 3000 characters.

For my project, I've used a bit of python code to generate my string. A first part is the PL/SQL code (wrapped in python):

INSERT_STR_MULTI = """
declare
    vSCHEMA clob;
begin
    %(segments)s

  dbms_xmlschema.registerSchema(
    schemaURL => '%(name)s',
    schemaDoc => vSCHEMA,
    local => true,
    genTypes => false,
    genTables => true,
    enableHierarchy => dbms_xmlschema.ENABLE_HIERARCHY_NONE
  );

end;
"""

Here, the '%(name)s' is the schema's name and the '%(segments)s' will be replaced with the sequence of vSCHEMA := CONCAT(vSCHEMA, '...');.

The meat is:

xsd_content = open(xsd_path).read()

if len(xsd_content) > 3000:
    print "... Large XML file. Splitting"
    result = ""
    while len(xsd_content) > 3000:

        result += "vSCHEMA := CONCAT(vSCHEMA, '%s');\n" % xsd_content[:3000]
        xsd_content = xsd_content[3000:]

    result += "vSCHEMA := CONCAT(vSCHEMA, '%s');\n" % xsd_content
else:
    result = "vSCHEMA := CONCAT(vSCHEMA, '%s');\n" % xsd_content

insert_str = INSERT_STR_MULTI % \
    {
        'name': xsd_file,
        'segments': result,
    }

cursor.execute(
    insert_str
)
connection.commit()

Few notes:

  • Code can be a bit simplified to avoid CONCAT() on short strings.
  • The 3000 limit is a rule of thumb. The larger the string, the less concatenation operations need to be performed, the quicker the code (ish).

Note: Please refer to the previous article to see e.g. connection setup via cx_oracle

HTH,