|
|
Python Database API
This API has been defined to encourage similarity between the Python
modules that are used to access databases. By doing this, we hope to
achieve a consistency leading to more easily understood modules,
code that is generally more portable across databases, and a broader
reach of database connectivity from Python.
This interface specification consists of several items:
Comments and questions about this specification may be directed to
the SIG on Tabular Databases in Python.
See the main index for information about modules
that use this interface.
This specification document was last updated on: April 9, 1996. It
will be known as Version 1.0 of this specification.
Note: The ODBC module implements a newer version of this specification.
Some additional error types were added to DBI. A spec update is due.
Module Interface
The database interface modules should typically be named with something
terminated by db. Existing examples are:
oracledb , informixdb , and pg95db .
These modules should export several names:
- modulename(connection_string)
- Constructor for creating a connection to the database. Returns a
Connection Object.
- error
- Exception raise for errors from the database module.
Connection Objects
Connections Objects should respond to the following methods:
- close()
- Close the connection now (rather than whenever __del__ is called).
The connection will be unusable from this point forward; an exception will
be raised if any operation is attempted with the connection.
- commit()
- Commit any pending transaction to the database.
- rollback()
- Roll the database back to the start of any pending transaction.
- cursor()
- Return a new Cursor Object. An exception may be thrown if the
database does not support a cursor concept.
- callproc([params])
-
Note: this method is not well-defined yet.
Call a stored database procedure with the given (optional) parameters.
Returns the result of the stored procedure.
- all Cursor Object attributes and methods
- For databases that do not have cursors and for simple applications that
do not require the complexity of a cursor, a Connection Object should
respond to each of the attributes and methods of the Cursor Object.
Databases that have cursor can implement this by using an implicit, internal
cursor.
Cursor Objects
These objects represent a database cursor, which is used to manage the context
of a fetch operation.
Cursor Objects should respond to the following methods and attributes:
- arraysize
- This read/write attribute specifies the
number of rows to fetch at a time with
fetchmany() . This value
is also used
when inserting multiple rows at a time (passing a tuple/list of tuples/lists
as the params value to execute() ). This attribute will
default to a single row.
Note that the arraysize is optional and is merely provided for higher
performance database interactions. Implementations should observe it
with respect to the fetchmany() method, but are free to
interact with
the database a single row at a time.
- description
- This read-only attribute is a
tuple of 7-tuples. Each 7-tuple contains information describing each
result column:
(name, type_code, display_size, internal_size,
precision, scale, null_ok) . This attribute will be
None for operations that do not
return rows or if the cursor has not had an operation invoked via the
execute() method yet.
The type_code is one of the dbi values
specified in the section below.
Note: this is a bit in flux. Generally, the first two items of
the 7-tuple will always be present; the others may be database specific.
- close()
- Close the cursor now (rather than whenever __del__ is called).
The cursor will be unusable from this point forward; an exception will
be raised if any operation is attempted with the cursor.
- execute(operation [,params])
- Execute (prepare) a database operation (query or command). Parameters
may be provided (as a sequence (e.g. tuple/list)) and will be bound to
variables in the
operation. Variables are specified in a database-specific notation that
is based on the index in the parameter tuple (position-based rather than
name-based).
The parameters may also be specified as a sequence of sequences
(e.g. a list of tuples) to insert multiple rows in a single operation.
A reference to the operation will be retained by the cursor. If the same
operation object is passed in again, then the cursor can optimize its
behavior. This is most effective for algorithms where the same operation
is used, but different parameters are bound to it (many times).
For maximum efficiency when reusing an operation, it is best to use the
setinputsizes() method to specify the parameter types and sizes ahead of
time. It is legal for a parameter to not match the predefined information;
the implementation should compensate, possibly with a loss of efficiency.
Using SQL terminology, these are the possible result values from the
execute() method:
- If the statement is DDL (e.g. CREATE TABLE), then 1 is returned.
- If the statement is DML (e.g. UPDATE or INSERT), then the
number of rows affected is returned (0 or a positive integer).
- If the statement is DQL (e.g. SELECT), None is returned, indicating
that the statement is not really complete until you use one of the
fetch methods.
- fetchone()
- Fetch the next row of a query result, returning a single tuple.
- fetchmany([size])
- Fetch the next set of rows of a query result, returning as a list of
tuples. An empty list is returned when no more rows are available. The
number of rows to fetch is specified by the parameter. If it is
None ,
then the cursor's arraysize determines the number of rows to be fetched.
Note there are performance considerations involved with the size parameter.
For optimal performance, it is usually best to use the arraysize attribute.
If the size parameter is used, then it is best for it to retain the same
value from one fetchmany() call to the next.
- fetchall()
- Fetch all rows of a query result, returning as a list of tuples. Note
that the cursor's arraysize attribute can affect the performance of this
operation.
- setinputsizes(sizes)
-
Note: this method is not well-defined yet.
This can be used before a call to execute() to predefine
memory areas for
the operation's parameters. sizes is specified as a tuple -- one item
for each input parameter. The item should be a Type object that
corresponds to the input that will be used, or it should be an integer
specifying the maximum length of a string parameter. If the item is
None , then no predefined memory area will be reserved for
that column (this is useful to avoid predefined areas for large inputs).
This method would be used before the execute() method
is invoked.
Note that this method is optional and is merely provided for higher
performance
database interaction. Implementations are free to do nothing and users
are free to not use it.
- setoutputsize(size [,col])
-
Note: this method is not well-defined yet.
Set a column buffer size for fetches of large columns (e.g. LONG).
The column
is specified as an index into the result tuple. Using a column of
None will set the default size for all large columns in
the cursor.
This method would be used before the execute() method
is invoked.
Note that this method is optional and is merely provided for higher
performance
database interaction. Implementations are free to do nothing and users
are free to not use it.
DBI Helper Objects
Many databases need to have the input in a particular format
for binding to an operation's input parameters. For example, if an
input is destined for a DATE column, then it must be bound to the
database in a particular string format. Similar problems exist for
"Row ID" columns or large binary items (e.g. blobs or RAW columns).
This presents problems for Python since the parameters to the
execute() method are untyped. When the database module
sees a Python string object, it doesn't know if it should be bound
as a simple CHAR column, as a raw binary item, or as a DATE.
To overcome this problem, the dbi module was created.
This module specifies some basic database interface types for working
with databases. There are two classes: dbiDate and
dbiRaw. These are simple container classes that
wrap up a value. When passed to the database modules, the module can then
detect that the input parameter is intended as a DATE or a RAW. For
symmetry, the database modules will return DATE and RAW columns as
instances of these classes.
A Cursor Object's description attribute returns information
about each of the result columns of a query. The type_code
is defined to be one of five types exported by this module:
STRING, RAW, NUMBER,
DATE, or ROWID.
The module exports the following names:
- dbiDate(value)
- This function constructs a
dbiDate instance that holds
a date value. The value should be specified as an integer number of
seconds since the "epoch" (e.g. time.time()).
- dbiRaw(value)
- This function constructs a
dbiRaw instance that holds
a raw (binary) value. The value should be specified as a Python string.
- STRING
- This object is used to describe columns in a database that are
string-based (e.g. CHAR).
- RAW
- This object is used to describe (large) binary columns in a database
(e.g. LONG RAW, blobs).
- NUMBER
- This object is used to describe numeric columns in a database.
- DATE
- This object is used to describe date columns in a database.
- ROWID
- This object is used to describe the "Row ID" column in a database.
|