[Info-ingres] Weird problem in Ingres 10

Mark i at dontgetlotsofspamanymore.net
Thu May 17 15:00:48 UTC 2018


On Thu, 17 May 2018 14:51:54 +0000, Martin Bowes
<martin.bowes at ndph.ox.ac.uk> wrote:

>You shouldn't need to do a +j level checkpoint. You only need that if the database itself is not being journaled.

I don't think the database itself is being journalled.  It's only an
R&D installation for testing -- not a productive system.

>If it is a journal enabled database, simply set journaling on the table and then simply ckpdb sw2igt
>
>Marty
>
>-----Original Message-----
>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>Sent: 17 May 2018 15:28
>To: info-ingres at lists.planetingres.org
>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>
>On Thu, 17 May 2018 14:00:02 +0000, Martin Bowes
><martin.bowes at ndph.ox.ac.uk> wrote:
>
>>Please remove it. You may get a swathe of data but in this case more may shed some light on the issue.
>>
>>Are there any other processes you know of which are monkeying with the data?
>
>No.
>
>>Is there a good reason why there is nojournaling on this table? Can we turn it on ... which will entail backing up the database as well.
>
>sw2igt at malfoy 288 % ckpdb +j sw2igt
>Thu May 17 15:13:23 2018 CPP: Preparing to checkpoint database: sw2igt
>Thu May 17 15:13:28 2018 E_DM110D_CPP_DB_BUSY   Database is currently
>being used. Database lock conflict.
>
>Don't know what's accessing the database right now.  All the processes
>I know about are shut down.
>
>>Marty
>>
>>-----Original Message-----
>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>Sent: 17 May 2018 14:54
>>To: info-ingres at lists.planetingres.org
>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>
>>On Thu, 17 May 2018 13:43:35 +0000, Martin Bowes
>><martin.bowes at ndph.ox.ac.uk> wrote:
>>
>>>Hi Mark,
>>>
>>>I note that you are turning off the sc930 trace with a SET TRACE POINT SC930 0 which is still presumably in your ESQLC script. Sadly I think that is masking what is happening here as on the first occasion we lost nearly two minutes of activity, then over a minute on the second and over 4minutes on the third occasion. Could you please remove that from the script and try again.
>>
>>I wanted to disable the tracing after the problematic statement had
>>finished.  I can remove this if there may be useful data shown
>>otherwise.
>>
>>>If this table is journaled could you please also check with auditdb for an insert of the message_id = ' 100055700000XDBW' record after 10:34:06
>>
>>The table is not journalled.
>>
>>>Marty
>>>
>>>-----Original Message-----
>>>From: Martin Bowes [mailto:martin.bowes at ndph.ox.ac.uk] 
>>>Sent: 17 May 2018 14:31
>>>To: info-ingres at lists.planetingres.org
>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>
>>>Whoops sorry, I'll take that back there are 4 queries.
>>>
>>>The first and the fourth queries are  seemingly identical and the fourth indicates a rowcount 1 on the update.
>>>
>>>I'll have a closer look.
>>>
>>>Marty
>>>
>>>-----Original Message-----
>>>From: Martin Bowes [mailto:martin.bowes at ndph.ox.ac.uk] 
>>>Sent: 17 May 2018 14:25
>>>To: Mark; info-ingres at lists.planetingres.org
>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>
>>>Hi Mark,
>>>
>>>That trace shows three successive update attempts each with a different message_id. The first one updates no rows the others update a single row.
>>>
>>>So can you do a:
>>>select count(*) from db_msg where message_id = ' 100055700000XDBW' and queue <> ' log_20150325' /* rowcount = 0 */
>>>
>>>select count(*) from db_msg where message_id = ' 400055700000XDBW' and queue <> ' log_20150325' /* rowcount = 1 */
>>>
>>>select count(*) from db_msg where message_id = ' 500055700000XDBW' and queue <> ' log_20150325' /* rowcount = 1 */
>>>
>>>Marty
>>>
>>>-----Original Message-----
>>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>>Sent: 17 May 2018 14:17
>>>To: info-ingres at lists.planetingres.org
>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>
>>>On Thu, 17 May 2018 13:03:16 +0000, Martin Bowes
>>><martin.bowes at ndph.ox.ac.uk> wrote:
>>>
>>>>Sorry, my bad that should have been a line starting: SESSION BEGINS
>>>>
>>>>Can you send the update queries and parameters (if any) from the good and bad cases. Those won't show any private data, so it should be cool.
>>>
>>>Cheers.  I have emailed you the logs.
>>>
>>>>Marty
>>>>
>>>>-----Original Message-----
>>>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>>>Sent: 17 May 2018 13:51
>>>>To: info-ingres at lists.planetingres.org
>>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>>
>>>>On Thu, 17 May 2018 12:35:38 +0000, Martin Bowes
>>>><martin.bowes at ndph.ox.ac.uk> wrote:
>>>>
>>>>>Going out on a limb here...
>>>>>
>>>>>Are we sure the two sessions are working on the same table owned by the same user in the same database as each other?
>>>>
>>>>I am only using a single session for this.
>>>>
>>>>>In the sc930 trace file can you find the connection information immediately prior to the update code. You are looking for a line like:
>>>>>BEG 27/03/2018 13:26:07.598545, Database: iidbdb, User: ingres
>>>>
>>>>There's nothing like this in the trace files.
>>>>
>>>>>You could also check your database and see if there are any duplicates of the table with the query:
>>>>>Select table_owner from iitables where table_name='YOURTABLE';
>>>>>
>>>>>Do that for each table involved in the update.
>>>>
>>>>Only 1 row (& only 1 table in the update).
>>>>
>>>>>Marty
>>>>>
>>>>>-----Original Message-----
>>>>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>>>>Sent: 17 May 2018 13:21
>>>>>To: info-ingres at lists.planetingres.org
>>>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>>>
>>>>>On Thu, 17 May 2018 12:06:48 +0000, Martin Bowes
>>>>><martin.bowes at ndph.ox.ac.uk> wrote:
>>>>>
>>>>>>No error code and with a rowcount of zero means the update is successful, but had no work to do.
>>>>>>
>>>>>>I presume there is a where clause in the update and that it looks OK.
>>>>>
>>>>>There is a where clause and it does match a record so there is "work
>>>>>to do".
>>>>>
>>>>>>Can you compare the code which worked and the code which didn't work for discrepancies?
>>>>>
>>>>>Yes.  They are exactly the same!  And it only fails the first time it
>>>>>is executed, all other times work fine.
>>>>>
>>>>>>Marty
>>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>>>>>Sent: 17 May 2018 12:57
>>>>>>To: info-ingres at lists.planetingres.org
>>>>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>>>>
>>>>>>On Thu, 17 May 2018 11:21:37 +0000, Martin Bowes
>>>>>><martin.bowes at ndph.ox.ac.uk> wrote:
>>>>>>
>>>>>>>Twice...
>>>>>>>
>>>>>>>Once where it updates the data and once where it puts it back the way it was?
>>>>>>
>>>>>>No.  The data is the same for each.
>>>>>>
>>>>>>>What do the Rowcount figures say on each statement?
>>>>>>
>>>>>>0.
>>>>>>
>>>>>>>Marty
>>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: Mark [mailto:i at dontgetlotsofspamanymore.net] 
>>>>>>>Sent: 17 May 2018 12:12
>>>>>>>To: info-ingres at lists.planetingres.org
>>>>>>>Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>>>>>
>>>>>>>On Wed, 16 May 2018 12:40:48 -0400, Karl and Betty Schendel
>>>>>>><schendel at kbcomputer.com> wrote:
>>>>>>>
>>>>>>>>SC930 tracing is server wide and runs until you stop it with set notrace sc930.
>>>>>>>>So you can connect to any database, do the set trace record '/some/ingres-writable/dir'
>>>>>>>>and the set trace point sc930, go to the trace record directory you gave it and
>>>>>>>>ensure that there's at least something there (should be one or more sessNNNNN files),
>>>>>>>>and then you can run your esql program until the fault happens.
>>>>>>>>
>>>>>>>>You don't actually need to put the sc930 into the esql program unless you want to
>>>>>>>>do it that way.
>>>>>>>>
>>>>>>>>Once you capture what you need to capture, set notrace point sc930 turns it off.
>>>>>>>>
>>>>>>>>You'll need the trace point privilege, simplest is to do it as the installation owner
>>>>>>>>(user ingres, traditionally).
>>>>>>>
>>>>>>>Thanks again.  We now have traces but they don't help me.  The traces
>>>>>>>include the SQL statement and the data but, pretty much, nothing else.
>>>>>>>
>>>>>>>The only notable thing I observed is that the data is traced twice for
>>>>>>>the failed transaction and only once for the successful ones.
>>>>>>>
>>>>>>>There are no error messages.
>>>>>>>
>>>>>>>>Karl
>>>>>>>>
>>>>>>>>> On May 16, 2018, at 11:12 AM, Mark <i at dontgetlotsofspamanymore.net> wrote:
>>>>>>>>> 
>>>>>>>>> On Wed, 16 May 2018 12:11:08 +0000, Martin Bowes
>>>>>>>>> <martin.bowes at ndph.ox.ac.uk> wrote:
>>>>>>>>> 
>>>>>>>>>> Just to expand on Karl's sc930...You may already know this...
>>>>>>>>>> 
>>>>>>>>>> To turn it on, make a recording directory...mkdir /full/path/to/directory'
>>>>>>>>>> 
>>>>>>>>>> And then...
>>>>>>>>>> sql iidbdb << SQL_END
>>>>>>>>>> set trace record '/full/path/to/directory';
>>>>>>>>>> set trace point sc930 1;
>>>>>>>>>> \p\g
>>>>>>>>>> \q
>>>>>>>>>> SQL_END
>>>>>>>>> 
>>>>>>>>> Do I need to run this on the server account?  I have implemented it
>>>>>>>>> currently in the embedded SQL of my program.
>>>>>>>>> 
>>>>>>>>>> FYI. The digit after the sc930 indicates a tracing level, 1 should be sufficient.
>>>>>>>>>> 
>>>>>>>>>> Run the errant query.
>>>>>>>>>> 
>>>>>>>>>> And turn off the sc930.
>>>>>>>>>> sql iidbdb << SQL_END
>>>>>>>>>> set trace point sc930 0;
>>>>>>>>>> \p\g
>>>>>>>>>> \q
>>>>>>>>>> SQL_END
>>>>>>>>>> 
>>>>>>>>>> You can now access the recording directory and start scanning the files for any sign of life from your query.
>>>>>>>>>> 
>>>>>>>>>> Marty
>>>>>>>>>> 
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Karl and Betty Schendel [mailto:schendel at kbcomputer.com] 
>>>>>>>>>> Sent: 16 May 2018 12:54
>>>>>>>>>> To: Ingres and related product discussion forum
>>>>>>>>>> Subject: Re: [Info-ingres] Weird problem in Ingres 10
>>>>>>>>>> 
>>>>>>>>>> On May 16, 2018, at 7:44 AM, Mark <i at dontgetlotsofspamanymore.net> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, 15 May 2018 12:27:09 -0400, Karl and Betty Schendel
>>>>>>>>>>> <schendel at kbcomputer.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> It's certainly not something I have heard of or seen before.  Do you have any rules
>>>>>>>>>>>> defined on the relevant tables?  Try enabling LOG_TRACE if the problem is
>>>>>>>>>>>> sufficiently predictable, or do a logdump after the problem occurs if it's not;
>>>>>>>>>>>> the idea being to try to see whether you actually got any PUT (insert) or
>>>>>>>>>>>> REP (replace) log records that were then rolled back, or whether the insert / update
>>>>>>>>>>>> was never executed at all.
>>>>>>>>>>> 
>>>>>>>>>>> After enabling log_trace all I got was:
>>>>>>>>>>> 
>>>>>>>>>>>   LOG: SAVEPOINT   Size written/reserved:      0/     0  Flags:
>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>> 
>>>>>>>>>> So the insert/update isn't ever being executed.  Either it's failing with some sort of
>>>>>>>>>> silent error, which would seem odd, or it's not reaching the backend at all, or
>>>>>>>>>> it's being pre-empted by a before rule.  I think the next step would be to enable
>>>>>>>>>> sc930 tracing and see if the backend is getting the insert-update, and what
>>>>>>>>>> end-of-query status it's recording.  There should be KB articles on enabling
>>>>>>>>>> SC930 tracing.
>>>>>>>>>> 
>>>>>>>>>> Karl
>>>>>>>>>> 
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Info-ingres mailing list
>>>>>>>>>> Info-ingres at lists.planetingres.org
>>>>>>>>>> http://lists.planetingres.org/mailman/listinfo/info-ingres
>>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> <insert witty sig here>
>>>>>>>>> _______________________________________________
>>>>>>>>> Info-ingres mailing list
>>>>>>>>> Info-ingres at lists.planetingres.org
>>>>>>>>> http://lists.planetingres.org/mailman/listinfo/info-ingres

-- 
<insert witty sig here>



More information about the Info-ingres mailing list