DBI and cheating with array interface?



I have to load a file of significant size into my Oracle 10g database on a
monthly basis. Things became complicated and I have no other recourse but
to pre-parse file with Perl. For speed sake, I thought of using the array
interface. What I found out shocked me: DBI has array binds and array
execution, but it is actually not using Oracle's array interface, it is
cheating by internally breaking it up and sending each consecutive insert
from within a loop. The proof is below:
CODE:
-----
#!/usr/bin/perl -w
use strict;
use DBI;
use Getopt::Long;
my $arrsize = 16;

# Parse command line options
my $stat = GetOptions( "n|array_size=s" => \$arrsize );

my ( @id, @text, @stat );
my $cnt = 0;
my $ind = 0;
my $trc =
"alter session set events='10046 trace name context forever, level 12'";
my $tid = "alter session set tracefile_identifier='load_multi'";
my $dbh = DBI->connect( "dbi:Oracle:10g", "scott", "tiger" )
|| die( $DBI::errstr . "\n" );
$dbh->{AutoCommit} = 0;
$dbh->{RaiseError} = 1;
$dbh->{ora_check_sql} = 0;

$dbh->do($tid);
$dbh->do($trc);

my $sth = $dbh->prepare("insert into test_table values(?,?)");
$sth->bind_param_array( 1, \@id );
$sth->bind_param_array( 2, \@text );

print STDERR "Loading with array size=$arrsize\n";
while (<>) {
chomp;
my @row = split(/,/);
push @id, $row[0];
push @text, $row[1];
if ( ++$ind == $arrsize ) {
$sth->execute_array( { ArrayTupleStatus => \@stat } );
$ind = 0;
@id = ();
@text = ();
}
++$cnt;
}
if ( $ind > 0 ) {
$sth->execute_array( { ArrayTupleStatus => \@stat } );
}
$dbh->commit;
print STDERR "$cnt records loaded.\n";

END {
$dbh->disconnect() if ( defined $dbh );
}


TKPROF:
-------

insert into test_table
values
(:p1,:p2)


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 20000 8.52 8.32 9 309 62778 20000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20001 8.52 8.32 9 309 62778 20000
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54

There were 20000 executions of the insert statement, there were 20000
lines in the file and I invoked the script with n=64.

$ time load_filen -n 64 data.txt
Loading with array size=64
20000 records loaded.
24.22s real 6.55s user 1.01s system

The only possible conclusion is that DBI is cheating. It doesn't actually
use the array interface, it breaks the execute statement into an internal
loop and bypasses the array interface.

$ perl -e 'use DBI; use DBD::Oracle;print $DBI::VERSION,"\n"'
1.48
$ perl -e 'use DBI; use DBD::Oracle;print $DBD::Oracle::VERSION,"\n"'
1.16
$
$ uname -a
Linux medo.noip.com 2.6.12-1.1381_FC3 #1 Fri Oct 21 03:46:55 EDT 2005 i686 athlon i386 GNU/Linux
$

Here is the award winning question: when will we have a version that will
actually use array interface? Unfortunately, this is useless for speeding
up data loads. Is there anything I can do to make DBI use the array
interface?

--
http://www.mgogala.com

.



Relevant Pages

  • Re: Scripting language with a support for array interface
    ... >> Does it support array bind and, if it does, does it cheat like DBI ... I don't need array interface ... is there a scripting language with regular expressions ...
    (comp.databases.oracle.server)
  • Is ArrayTupleStatus in execute_array mandatory?
    ... "The mandatory ArrayTupleStatus attribute is used to specify a reference to ... array which will receive the execute status of each executed parameter tuple." ... It would appear in DBI 1.50 you can omit it: ...
    (perl.dbi.users)
  • Re: Records that could be arrays
    ... inability to provide an array interface to a record type (to have ... enumerated components), or a record interface to an array (to have named ... type IntArray is ...
    (comp.lang.ada)
  • Re: Ada Popularity: Comparison of Ada/Charles with C++ STL (and Perl)
    ... >> An ability to index is just one of many array properties. ... >> array interface defined and thus it can check a type against that ... Ada has them built-in, but does not allow to implement ...
    (comp.lang.ada)
  • Re: I dont understand the definition of DOES>
    ... on stack on code entry). ... You can use DOES to make ARRAY and then use ARRAY ... CREATE CELLS ALLOT ... First you'd make the code that will execute on the child word. ...
    (comp.lang.forth)