Re: DB API 2.0 and transactions



I'm CC:ing this to D'Arcy J.M. Cain. (See comp.lang.python for prequel
D'Arcy.)

Christopher J. Bottaro wrote:
Check this out...

<code>
import pgdb
import time

print time.ctime()
db = pgdb.connect(user='test', host='localhost', database='test')
time.sleep(5)
db.cursor().execute('insert into time_test
                     (datetime)
                     values
                     (CURRENT_TIMESTAMP)')
db.commit()
curs = db.cursor()
curs.execute('select datetime from time_test order by datetime desc limit
1')
row = curs.fetchone()
print row[0]
</code>

<output>
Fri Jun 10 17:27:21 2005
'2005-06-10 17:27:21.654897-05'
</output>

Notice the times are exactly the same instead of 5 sec difference.

What do you make of that?  Some other replies to this thread seemed to
indicate that this is expected and proper behavior.

This is wrong. It should not behave like that if it is to follow the SQL standard which *I* would expect and consider proper.

I don't think the SQL standard mandates that all evaluations of
CURRENT_TIMESTAMP within a transaction should be the same. It does
manadate that CURRENT_TIMESTAMP in only evaluated once in each SQL
statement, so "CURRENT_TIMESTAMP=CURRENT_TIMESTAMP" should always be
true in a WHERE statement. I don't think it's a bug if all timestamps
in a transaction are the same though. It's really a bonus if we can
view all of a transaction as taking place at the same time. (A bit
like Piper Halliwell's time-freezing spell in "Charmed".)

The problem is that transactions should never start until the first
transaction-initiating SQL statement takes place. (In SQL-92, all
standard SQL statements are transaction initiating except CONNECT,
DISCONNECT, COMMIT, ROLLBACK, GET DAIGNOSTICS and most SET commands
(SET DESCRIPTOR is the exception here).) Issuing BEGIN directly after
CONNECT, ROLLBACK and COMMIT is in violation with the SQL standards.

A workaround for you could be to explicitly start a new transaction
before the insert as PostgreSQL (but not the SQL standard) wants you
to do. I suppose you can easily do that using e.g. db.rollback(). If
you like, I guess you could do db.begin=db.rollback in the beginning
of your code and then use db.begin().

Another option would be to investigate if any of the other postgreSQL
drivers have a more correct behaviour. The non-standard behaviour that
you describe it obvious from the pgdb source. See:
http://www.pygresql.org/cvsweb.cgi/pygresql/module/pgdb.py?rev=1.27
(Comments added by me.)

class pgdbCnx:

	def __init__(self, cnx):
		self.__cnx = cnx
		self.__cache = pgdbTypeCache(cnx)
		try:
			src = self.__cnx.source()
			src.execute("BEGIN") # Ouch!
		except:
			raise OperationalError, "invalid connection."

...
	def commit(self):
		try:
			src = self.__cnx.source()
			src.execute("COMMIT")
			src.execute("BEGIN") # Ouch!
		except:
			raise OperationalError, "can't commit."

	def rollback(self):
		try:
			src = self.__cnx.source()
			src.execute("ROLLBACK")
			src.execute("BEGIN") # Ouch!
		except:
			raise OperationalError, "can't rollback."



....

This should be changed to something like this (untested):

class pgdbCnx:

	def __init__(self, cnx):
		self.__cnx = cnx
		self.__cache = pgdbTypeCache(cnx)
		self.inTxn = False #NEW
		try:
			src = self.__cnx.source() # No BEGIN here
		except:
			raise OperationalError, "invalid connection."

....
	def commit(self):
		try:
			src = self.__cnx.source()
			src.execute("COMMIT")
			self.inTxn = False # Changed
		except:
			raise OperationalError, "can't commit."

	def rollback(self):
		try:
			src = self.__cnx.source()
			src.execute("ROLLBACK")
			self.inTxn = False # Changed
		except:
			raise OperationalError, "can't rollback."
....
	def cursor(self):
		try:
			src = self.__cnx.source()
			return pgdbCursor(src, self.__cache, self) # Added self
		except:
			raise pgOperationalError, "invalid connection."


....

class pgdbCursor:

	def __init__(self, src, cache, conn): # Added conn
		self.__cache = cache
		self.__source = src
> self.__conn = conn # New
		self.description = None
		self.rowcount = -1
		self.arraysize = 1
		self.lastrowid = None
....
(execute calls executemany)
....
	def executemany(self, operation, param_seq):
		self.description = None
		self.rowcount = -1

		# first try to execute all queries
		totrows = 0
		sql = "INIT"
		try:
			for params in param_seq:
				if params != None:
					sql = _quoteparams(operation, params)
				else:
					sql = operation
> if not self.__conn.inTxn: # Added test
self.__source.execute('BEGIN')
> self.__conn.inTxn = True
				rows = self.__source.execute(sql)
				if rows != None: # true is __source is NOT a DQL
					totrows = totrows + rows
				else:
					self.rowcount = -1

I guess it would be even better if the executemany method checked that it was really a tranasction-initiating SQL statement, but that makes things a bit slower and more complicated, especially as I suspect that the driver premits several SQL statements separated by semicolon in execute and executemany. We really don't want to add a SQL parser to pgdb. Making all statements transaction-initiating is at least much closer to standard behaviour than to *always* start transactions start prematurely. I guess it will remove problems like the one I mentioned earlier (repeated below) in more than 99% of the cases.

This bug has implications far beyond timestamps. Imagine two
transaction running with isolation level set to e.g. serializable.
Transaction A updates the AMOUNT column in various rows of table
X, and transaction B calculates the sum of all AMOUNTs in X.

Lets say they run over time like this, with | marking transaction
start and > commit (N.B. ASCII art follows, you need a fixed font
to view this):

....|--A-->.......|--A-->........
............|-B->.........|-B->..

This works as expected... The first B-transaction sums up AMOUNTs
after the first A-transaction is done etc, but imagine what happens
if transactions implicitly begin too early as with the current pgdb:

|-----A-->|---------A-->|-------
|------------B->|----------B->|-

This will cause B1 to sum up AMOUNTs before A1, and B2 will sum up
AMOUNTs after A1, not after A2.
.



Relevant Pages

  • Re: Massive DTS job (5 days and i am nervous)
    ... Inserts are done as a transaction. ... Columnist, SQL Server Professional ... The way this was written and i am using the current stored proc being executed as an example is that no transactions are used. ... would like to know which> sql statement is being executed just to satisfy myself that this thing is> not hanging on one sql statement the entire time.... ...
    (microsoft.public.sqlserver.programming)
  • Re: get the sql producing lock
    ... Is there a simple way to get the sql statement which locked the ... I am not aware of 'current transaction sql'. ... executed (or more like currently executing sql). ...
    (comp.databases.oracle.server)
  • Re: Asynchronous Stored Procedure Never Returns - Help?
    ... If you have the Sql Server 2000 or 2005 docs they are thorough and can be ... for Transaction SQL Reference from the drop-down or select a keyword from ... your query in Query Analyzer or Sql Server Management Studio, ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: commit work not working?
    ... Because he is using straight sql, ... The transaction is made up of the statements that occur ... locks the stock table, updates rows in the stock table ... The database server must perform this sequence of operations either ...
    (comp.databases.informix)
  • Re: Handling Script Timeout when invoking ActiveX Object involving ADODB from ASP2.0 pages- SQL Clie
    ... The SQL Client Network Utility on the remote SQL Server 2000 SP3a on Windows ... > This is an example of an Aborted Transaction. ...
    (microsoft.public.sqlserver.clients)