[Info-ingres] Generating Stats with Optimizedb
Henrik Georg Sørensen (HGO)
HGO at kmd.dk
Fri Aug 30 06:11:41 UTC 2019
In our case (on Linux) I've created a script to make sure there's Stats on every official table (listed in our table b_statistik - just tablename),
Which when running this script (edited so hopefully not missed something), which makes sure if no Stats then optimizedb -zk on that table,
And If missing stats on a Column with index creates it with optimizedb -zk and -a for the column.name.
Table=Tabel and Column=Kolonne (in the Danish translation 😊).
#!/bin/ksh
database=${1};
logfil=/tmp/${database}.optimstat.log
${II_SYSTEM}/ingres/bin/optimizedb -zk ${database}|tee -a ${logfil} fi; cat <<+|${II_SYSTEM}/ingres/bin/sql -s ${database} >> ${logfil}
set norules;
set lockmode session where readlock = nolock, level = table;
set autocommit on\g
declare global temporary table session.st_statistics
as select distinct i.table_name
from iistats i
where i.table_owner = dbmsinfo('dba')
on commit preserve rows with norecovery, page_size = 8192;
\p\g\time
modify session.st_statistics to btree on table_name;
\p\g\time
declare global temporary table session.st_optimstat
as select table_name = b.tabelnavn
from b_statistik b, iitables i
where b.tabelnavn = i.table_name and i.table_type = 'T' and
i.table_owner = dbmsinfo('dba') and i.num_rows >= 1 and
i.expire_date = 0 and i.storage_structure != 'HEAP' and
not exists (select *
from session.st_statistics s
where s.table_name = i.table_name)
on commit preserve rows with norecovery;
\p\g\time
modify session.st_optimstat to btree on table_name;
\p\g\time
copy session.st_optimstat
(table_name = c0nl)
into '/tmp/${database}.optimstat.lst';
\p\g\time
+
cat /tmp/${database}.optimstat.lst|while read tabel; do
echo $(date +"%d.%m.%Y %H:%M:%S")' Stats '${database}'/'${tabel}
optimizedb -zk ${database} -r${tabel}
done;
cat <<+|${II_SYSTEM}/ingres/bin/sql -s ${database} >> ${logfil}
set norules;
set lockmode session where readlock = nolock, level = table;
set autocommit on\g
declare global temporary table session.st_statistics
as select distinct i.table_name, i.column_name
from iistats i
where i.table_owner = dbmsinfo('dba')
on commit preserve rows with norecovery, page_size = 8192;
\p\g\time
modify session.st_statistics to btree
on table_name, column_name;
\p\g\time
declare global temporary table session.st_optimstat
as select table_name = c.table_name,
column_name = c.column_name,
reference_name = varchar(squeeze(c.table_name)
+ ':' + squeeze(c.column_name), 65),
key_sequence = c.key_sequence
from b_statistik b, iitables t, iicolumns c
where b.tabelnavn = t.table_name and b.tabelnavn = c.table_name and
t.table_type = 'T' and t.storage_structure != 'HEAP' and
t.table_owner = dbmsinfo('dba') and t.num_rows >= 1 and
t.table_name = c.table_name and t.table_owner = c.table_owner and
not exists (select *
from session.st_statistics s
where s.table_name = c.table_name and
s.column_name = c.column_name)
on commit preserve rows with norecovery, page_size = 8192;
\p\g\time
modify session.st_optimstat to btree
on table_name, column_name;
\p\g\time
declare global temporary table session.st_optimkey
as select table_name = s.table_name,
column_name = s.column_name,
key_sequence = max(c.key_sequence)
from session.st_optimstat s, iiindexes x, iicolumns c
where x.base_name = s.table_name and x.base_owner = x.index_owner and
x.base_owner = c.table_owner and x.index_name = c.table_name and
c.table_owner = dbmsinfo('dba') and
s.column_name = c.column_name and s.key_sequence = 0 and
c.key_sequence != 0
group by s.table_name, s.column_name
on commit preserve rows with norecovery, page_size = 8192;
\p\g\time
modify session.st_optimkey to btree
on table_name, column_name;
\p\g\time
update session.st_optimstat s
from session.st_optimkey k
set key_sequence = k.key_sequence
where s.table_name = k.table_name and s.column_name = k.column_name and
s.key_sequence = 0 and k.key_sequence != 0;
\p\g\time
delete from session.st_optimstat
where key_sequence = 0;
\p\g\time
delete from session.st_optimstat c
where exists (select *
from session.st_statistics s
where s.table_name = c.table_name and
s.column_name = c.column_name)
\p\g\time
copy session.st_optimstat
(reference_name = c0nl)
into '/tmp/${database}.optimstat.lst';
\p\g\time
+
cat /tmp/${database}.optimstat.lst|while read refnavn; do
tabel=$(echo ${refnavn}|cut -f1 -d":");
kolonne=$(echo ${refnavn}|cut -f2 -d":");
echo $(date +"%d.%m.%Y %H:%M:%S")' Stats '${database}'/'${tabel}'/'${kolonne}
${II_SYSTEM}/ingres/bin/optimizedb ${database} -r${tabel} -a${kolonne}
done;
Best regards
Henrik Georg Sørensen
Internal - KMD A/S
-----Oprindelig meddelelse-----
Fra: info-ingres-bounces at lists.planetingres.org <info-ingres-bounces at lists.planetingres.org> På vegne af Roy Hann
Sendt: 29. august 2019 10:59
Til: info-ingres at lists.planetingres.org
Emne: Re: [Info-ingres] Generating Stats with Optimizedb
s.anderson.au at gmail.com wrote:
> Are there any general rules regarding generating statistics for a
> table with optimizedb? For example, should a table have at least X rows?
There're a couple of things to consider. How long will it take optimizedb to generate the statistics? And if it will take a long time how much benefit will you gain from taking the time to think more carefully about which statistics to gather?
If it is going to be quick--and quite what "quick" means is up to you--there is no point spending time thinking about it. Just run optimizedb with the -zk flag and be done with it.
If optimizedb could take a long time because you have some huge tables, and you are aware that specific queries are excessively slow, then you can reasonably invest the time to work out which tables and which columns need statistics.
To do that you will need to look at the query plans for the slow queries. Without getting into what is a massive topic in detail, one thing to look for would be K-joins where the optimizer has under-estimated the number of rows in the outer source (the left-hand source). Try to work out which columns it mistakenly thought would be highly selective, then run optimizedb on that column(s).
Another big win often comes from gathering stats on non-key columns used with constant conditions in the query where the values are highly skewed. Status indicator columns are a classsic example, e.g.
WHERE status = 'FINISHED'
versus
WHERE status = 'PROCESSING'
The first will usually be unselective, the latter will usually be very selective. The optimizer needs to know that. In that case you'd run optmizedb using -rtable_name and -astatus.
Roy
_______________________________________________
Info-ingres mailing list
Info-ingres at lists.planetingres.org
https://lists.planetingres.org/mailman/listinfo/info-ingres
Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du KMD’s Privatlivspolitik<http://www.kmd.dk/Privatlivspolitik>, der fortæller, hvordan vi behandler oplysninger om dig.
Protection of your personal data is important to us. Here you can read KMD’s Privacy Policy<http://www.kmd.net/Privacy-Policy> outlining how we process your personal data.
Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen og ethvert vedhæftet bilag efter vores overbevisning er fri for virus og andre fejl, som kan påvirke computeren eller it-systemet, hvori den modtages og læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget ansvar for tab og skade, som er opstået i forbindelse med at modtage og bruge e-mailen.
Please note that this message may contain confidential information. If you have received this message by mistake, please inform the sender of the mistake by sending a reply, then delete the message from your system without making, distributing or retaining any copies of it. Although we believe that the message and any attachments are free from viruses and other errors that might affect the computer or it-system where it is received and read, the recipient opens the message at his or her own risk. We assume no responsibility for any loss or damage arising from the receipt or use of this message.
More information about the Info-ingres
mailing list