<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=koi8-r">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"\@MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hi All,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Gloat mode engaged….<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I had a table which was very big, it had more than the maximum integer4 (2+billion) rows and had to be partitioned 32 ways just to store the data.<span style="color:#1F497D"> It had become a serious issue within the database and in particular
meant I couldn’t transfer this database to another installation which was becoming a necessary step.</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It looked like this:<o:p></o:p></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Run_id integer4<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Pid integer4<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Gid integer2<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Status integer1<o:p></o:p></span></p>
<p class="MsoNormal">For any run_id there may be as many as (and typically are as many as) 500,000 rows.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Then I got told, ’What? You’re storing that stuff! We don’t need it.’ … but after examining the code for a while I found that they did need it during a particular operational phase and then not until another process which occurs once every
6months.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">So I couldn’t remove the data. But the good news was that those processes accessed the data by run_id only.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">So I made an archive table which looked like this:<o:p></o:p></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Run_id integer4<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">All_data long varchar<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New"">Tinsert ingresdate</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The all_data field was composed of all the old rows pid,gid,status data separated by commas and each of the original rows data separated by colons.<o:p></o:p></p>
<p class="MsoNormal">Eg. 1000780,-1,0:1000800,-1,0:1000889,-1,0:1000959,-1,0…<o:p></o:p></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The all_data would typically be about 65M.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">So how to extract the data when actually needed it? Enter a table procedure and the similar to regular expression pattern matching.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">select first 10 pid, gid, status from all_data(run_id =146)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">pid gid status<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000018 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000020 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000034 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000041 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000056 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000062 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000075 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000089 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000093 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 1000104 -1 0<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">(10 rows in 0.004865 secs)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">select count(*) from all_data(run_id = 162)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">col1 <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> 502713<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">(1 row in <span style="background:yellow;mso-highlight:yellow">
20.629744</span> secs)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">By comparison on the source table such a select would take about 0.5 seconds.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The trade off of performance for reliably maintaining the data was worth it.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Marty<o:p></o:p></span></p>
</div>
</body>
</html>