[Info-ingres] Why I love table procedures!
Martin Bowes
martin.bowes at ndph.ox.ac.uk
Thu Nov 22 10:51:35 UTC 2018
Hi All,
Gloat mode engaged....
I had a table which was very big, it had more than the maximum integer4 (2+billion) rows and had to be partitioned 32 ways just to store the data. It had become a serious issue within the database and in particular meant I couldn't transfer this database to another installation which was becoming a necessary step.
It looked like this:
Run_id integer4
Pid integer4
Gid integer2
Status integer1
For any run_id there may be as many as (and typically are as many as) 500,000 rows.
Then I got told, 'What? You're storing that stuff! We don't need it.' ... but after examining the code for a while I found that they did need it during a particular operational phase and then not until another process which occurs once every 6months.
So I couldn't remove the data. But the good news was that those processes accessed the data by run_id only.
So I made an archive table which looked like this:
Run_id integer4
All_data long varchar
Tinsert ingresdate
The all_data field was composed of all the old rows pid,gid,status data separated by commas and each of the original rows data separated by colons.
Eg. 1000780,-1,0:1000800,-1,0:1000889,-1,0:1000959,-1,0...
The all_data would typically be about 65M.
So how to extract the data when actually needed it? Enter a table procedure and the similar to regular expression pattern matching.
select first 10 pid, gid, status from all_data(run_id =146)
┌─────────────┬────────┬──────┐
│pid │gid │status│
├─────────────┼────────┼──────┤
│ 1000018│ -1│ 0│
│ 1000020│ -1│ 0│
│ 1000034│ -1│ 0│
│ 1000041│ -1│ 0│
│ 1000056│ -1│ 0│
│ 1000062│ -1│ 0│
│ 1000075│ -1│ 0│
│ 1000089│ -1│ 0│
│ 1000093│ -1│ 0│
│ 1000104│ -1│ 0│
└─────────────┴────────┴──────┘
(10 rows in 0.004865 secs)
select count(*) from all_data(run_id = 162)
┌─────────────┐
│col1 │
├─────────────┤
│ 502713│
└─────────────┘
(1 row in 20.629744 secs)
By comparison on the source table such a select would take about 0.5 seconds.
The trade off of performance for reliably maintaining the data was worth it.
Marty
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.planetingres.org/pipermail/info-ingres/attachments/20181122/e4127a50/attachment.html>
More information about the Info-ingres
mailing list