[Info-ingres] 4k Cache
Roy Hann
specially at processed.almost.meat
Thu Jul 23 13:46:22 UTC 2020
Steve wrote:
> Thanks, that's interesting, particularly what you said about generating
> appropriate stats.
>
> Regarding the 4k cache, almost all tables in the database have
> 2k pages. Actian recommended the tables be changed to 8k and the
> secondary indexes to 4k.
That makes sense. My preference for 8kb pages versus 16kb or even 32kb
pages is slight. In some cases the row size will force you to choose a
bigger page. In some cases severe lock contention will encourage
smaller pages. (I bet one could use up days and weeks making
exquisitely precise measurements to decide which is best for any given
table.)
> Actually, that brings up another question,
> I was considering changing the indexes to 4k first and sometime later
> changing the tables to 8k. Is there any reason to think 2k tables cannot
> have 4k secondary indexes?
No reason at all. It will work.
The reason 4kb pages might not be a win is that page sizes above
2kb have a bigger per-row overhead to support row versioning and
row-level locking. If you had a 2kb table with big rows that
only just fit the page you would get one row per page. Because of
the row overhead you might still get only one row per 4kb page and you'd
get a lot of waste. You end up needing the same number of pages. But
because the pages are twice as big you use twice the disk space.
On the other hand very short rows, each with a big row overhead, can
also mean using more disk space. Secondary indexes will often have very
short rows (unless the keys are big or it is a covering index with
non-key columns in it).
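To put rough numbers on both of those cases, here is a small Python sketch. The 40-byte page header and the per-row overheads (2 bytes on 2kb pages, 30 bytes on anything bigger) are invented for illustration and are not real Ingres figures. The function names are hypothetical too. Plug in the actual values for your release before drawing conclusions.

```python
# Illustrative model only: every size below is an assumption made up
# for this example, not a real Ingres figure.
PAGE_HEADER = 40              # assumed bytes of per-page header
SMALL_PAGE_ROW_OVERHEAD = 2   # assumed per-row overhead on 2kb pages
BIG_PAGE_ROW_OVERHEAD = 30    # assumed per-row overhead above 2kb


def row_overhead(page_size):
    """Assumed per-row overhead for the given page size."""
    return SMALL_PAGE_ROW_OVERHEAD if page_size <= 2048 else BIG_PAGE_ROW_OVERHEAD


def rows_per_page(page_size, row_size):
    """Whole rows that fit on one page under the assumed overheads."""
    return (page_size - PAGE_HEADER) // (row_size + row_overhead(page_size))


def disk_bytes(num_rows, page_size, row_size):
    """Bytes of disk needed for num_rows, ignoring fill factor."""
    per_page = rows_per_page(page_size, row_size)
    if per_page == 0:
        raise ValueError("row does not fit on this page size")
    pages = -(-num_rows // per_page)  # ceiling division
    return pages * page_size


if __name__ == "__main__":
    cases = (("big rows", 1000, 2000), ("short index rows", 100000, 12))
    for label, num_rows, row_size in cases:
        for page_size in (2048, 4096):
            print(label, page_size,
                  rows_per_page(page_size, row_size),
                  disk_bytes(num_rows, page_size, row_size))
```

With these made-up figures a 2000-byte row gets one row per page at both sizes, so the 4kb table needs twice the disk. The 12-byte index rows also take more disk at 4kb because the assumed overhead dwarfs the row itself.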
> What are the benefits, reasons for using 16 or higher page sizes?
The argument for bigger pages is more about getting more bang for your I/O
buck. Even very fast electromechanical disks struggle to sustain
more than about 120 IOs per second once their cache is flooded. That is
sloooooooooooooooooow. These days you are probably using SSD though,
which is going to be ~100 times faster. But you are still going to have
to go through layers of drivers and caches and context switches and...I
lose the will... So bigger pages get more data into (and out of) memory
faster.
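As a back-of-envelope illustration of that, the Python sketch below multiplies the rough 120 IOs per second figure by the page size. It assumes every I/O moves exactly one page and that the I/O rate does not change with page size, and it ignores read-ahead and caching, so treat it as an upper-level sanity check rather than a measurement. The function name is hypothetical.

```python
IOPS = 120  # sustained I/Os per second, the rough figure quoted above


def throughput_bytes_per_sec(page_size, iops=IOPS):
    """Bytes moved per second if every I/O transfers exactly one page."""
    return page_size * iops


if __name__ == "__main__":
    for kb in (2, 4, 8, 16, 32, 64):
        rate = throughput_bytes_per_sec(kb * 1024)
        print(f"{kb:>2}kb pages: {rate / 1024:.0f} kb/s")
```

Under those assumptions the data rate scales linearly with page size, so an 8kb page moves four times as much data per second as a 2kb page for the same number of I/Os.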
The downside is that when a big page with lots of data in it is locked for a
long time, a lot of data is locked. Potentially concurrency can suffer.
In reality most systems I see have baked-in lock contention; the page
size makes it neither better nor worse. In which case I'd say: default
to a big page size and revert to a smaller page size only when forced
into it.
Roy