<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"\@MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D">Here is the table with index with sampled stats:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Name: fred<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Number of rows: 1615057<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Storage structure: btree with unique keys<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Column Information:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> Key Avg Count<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Column Name Type Length Nulls Defaults Seq Per Value<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">result_id integer 4 no no 1 1.6<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">sample_id integer 4 no no unique<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">test_cid integer 4 no no
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Secondary indexes:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Index Name Structure Keyed On<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">fred_idx1 isam sample_id<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">And Here is the same table with non-sampled stats (ie. –zs100)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Column Information:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D"> Key Avg Count<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">Column Name Type Length Nulls Defaults Seq Per Value<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">result_id integer 4 no no 1 unique<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">sample_id integer 4 no no 4.4<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Courier New";color:#1F497D">test_cid integer 4 no no
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The concerning thing here is that the secondary index was flagged as unique in the sampled stats(which it most definitely isn’t).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Anyone got any feelings about this?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Marty<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Martin Bowes [mailto:martin.bowes@ndph.ox.ac.uk]
<br>
<b>Sent:</b> 27 January 2017 13:30<br>
<b>To:</b> info-ingres@lists.planetingres.org<br>
<b>Subject:</b> [Info-ingres] stats on tables with more than 1million rows<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi All,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It appears that optimizedb automatically switches to sampling once a table breaks 1million rows, which is cool.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">But now I have a table with unique keys which the stats insist is a non-unique key with an average count per value of 1.6.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Is this a problem?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Martin Bowes<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>