-->

2013-08-16

memo: postgresql

http://www.postgresql.jp/document/9.2/html/indexes.html
http://dqn.sakusakutto.jp/2011/11/postgresql.html
http://d.hatena.ne.jp/y-kawaz/20090226/1235623336
http://lets.postgresql.jp/documents/technical/bulkload/
http://www.techscore.com/blog/2013/03/12/%E9%9B%86%E5%90%88%E3%82%92%E8%BF%94%E3%81%99generate_series%E9%96%A2%E6%95%B0%E3%81%A7%E5%A4%A7%E9%87%8F%E3%83%87%E3%83%BC%E3%82%BF%E3%82%92%E7%94%9F%E6%88%90%E3%81%97%E3%81%A6%E9%9B%86%E8%A8%88/
http://longkey1.net/blog/2013/05/09/clear-postgresql-caches/
http://www.glamenv-septzen.net/view/725
http://d.hatena.ne.jp/hhelibex/20110609/1307620714
http://www.ksknet.net/postgresql/primary_key.html
http://www.postgresql.jp/document/9.2/html/sql-commands.html
http://dba.stackexchange.com/questions/2070/postgresql-count-uses-a-sequential-scan-not-index
http://stackoverflow.com/questions/2204058/show-which-columns-an-index-is-on-in-postgresql
http://stackoverflow.com/questions/3524859/how-to-display-full-stored-procedure-code
http://serverfault.com/questions/34741/postgres-equivalent-to-mysqls-g
http://www.pgadmin.org/
http://stansantiago.wordpress.com/2011/10/30/install-and-configure-pl-php-for-postgres-9-1/
http://www.atmarkit.co.jp/ait/articles/0810/01/news134_3.html
http://lets.postgresql.jp/documents/tutorial/gihyo_rensai/5/
http://lets.postgresql.jp/documents/technical/text-processing/3
http://aoi-f.blog.so-net.ne.jp/2011-05-15

目的は、
postgresqlはどうだろう?使ったことある、
pgadminは便利だったような気がする、
phpをトリガーの言語で使えたような気がするが今は無い、
などです。

テスト環境です。
$ uname -a
Linux vmware-gentoo1 3.8.13-gentoo #1 SMP Thu May 16 19:10:40 JST 2013 x86_64 Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz GenuineIntel GNU/Linux

$ emerge --info|grep -P "^(CFLAGS|CXXFLAGS|CHOST)="
CFLAGS="-march=native -O2"
CHOST="x86_64-pc-linux-gnu"
CXXFLAGS="-march=native -O2"

$ LC_ALL=C lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                3
On-line CPU(s) list:   0-2
Thread(s) per core:    1
Core(s) per socket:    3
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Stepping:              9
CPU MHz:               3192.820
BogoMIPS:              6385.64
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K

$ free
             total       used       free     shared    buffers     cached
Mem:       8170124    1313924    6856200          0     159176     610836
-/+ buffers/cache:     543912    7626212
Swap:       131068          0     131068

インストールします。
$ grep postgresql /etc/portage/package.use/x86_64-pc-linux-gnu
dev-db/postgresql-server       perl
$ sudo emerge --oneshot -avt dev-db/postgresql-server
$ sudo emerge --config dev-db/postgresql-server:9.2

ユーザーを作成します。
# 多分"-a"は非推奨。"-s"が正解。
$ sudo createuser -a -E -P -U postgres dbuser

設定を変更します。
$ /usr/bin/sudo /usr/bin/rcsdiff /etc/config-archive/etc/postgresql-9.2/postgresql.conf,v /etc/postgresql-9.2/postgresql.conf
===================================================================
RCS file: /etc/config-archive/etc/postgresql-9.2/postgresql.conf,v
retrieving revision 1.1
diff -r1.1 /etc/postgresql-9.2/postgresql.conf
59c59
< #listen_addresses = 'localhost'          # what IP address(es) to listen on;
---
> listen_addresses = '192.168.0.110'       # what IP address(es) to listen on;
64c64
< max_connections = 100                      # (change requires restart)
---
> max_connections = 1000                     # (change requires restart)
113c113
< shared_buffers = 32MB                      # min 128kB
---
> shared_buffers = 2048MB                    # min 128kB

設定を変更します。
$ /usr/bin/sudo /usr/bin/rcsdiff /etc/config-archive/etc/postgresql-9.2/pg_hba.conf,v /etc/postgresql-9.2/pg_hba.conf
===================================================================
RCS file: /etc/config-archive/etc/postgresql-9.2/pg_hba.conf,v
retrieving revision 1.1
diff -r1.1 /etc/postgresql-9.2/pg_hba.conf
86a87
> host    all             dbuser            192.168.0.110/24        md5

とりあえずOFFにします。ONの場合安全に再起動。
$ rcsdiff -u -U4 /etc/config-archive/etc/conf.d/postgresql-9.2,v /etc/conf.d/postgresql-9.2
===================================================================
RCS file: /etc/config-archive/etc/conf.d/postgresql-9.2,v
retrieving revision 1.1
diff -u -U4 -r1.1 /etc/conf.d/postgresql-9.2
--- /etc/conf.d/postgresql-9.2  2013/08/15 08:13:01     1.1
+++ /etc/conf.d/postgresql-9.2  2013/08/16 04:21:27
@@ -12,9 +12,9 @@
 # Forecfully disconnect clients from server and shut down. This is performed
 # after NICE_QUIT. Terminated client connections have their open transactions
 # rolled back.
 # Set RUDE_QUIT to "NO" to disable. RUDE_TIMEOUT in seconds.
-RUDE_QUIT="YES"
+RUDE_QUIT="NO"
 RUDE_TIMEOUT=30

 # If the server still fails to shutdown, you can force it to quit by setting
 # this to YES and a recover-run will execute on the next startup.

plperlを入れます。
$ createlang -h 192.168.0.110 -U dbuser plperl dbname

再起動します。
$ sudo /etc/init.d/postgresql-9.2 restart

psqlで接続します。
# pgadmin3を入れて起動しておくと便利です。
$ psql -h 192.168.0.110 -U dbuser dbname

時間を計ります。
dbname=# \timing
タイミングは on です。

ランダムな文字を作成する関数を作成します。
dbname=# \x
Expanded display is on.
dbname=# SELECT prosrc FROM pg_proc WHERE proname = 'randstr';
-[ RECORD 1 ]----------------------------------------------
prosrc | # http://aoi-f.blog.so-net.ne.jp/2011-05-15
       |
       | use strict;
       |
       | my $ret = "";
       |
       | my $length = $_[0];
       |
       | my @char_tmp=();
       |
       | push @char_tmp, ('a'..'z');
       | push @char_tmp, ('A'..'Z');
       | push @char_tmp, (0..9);
       |
       | my $rand_str_tmp = '';
       | my $cnt = $#char_tmp;
       |
       | for (my $i=1; $i<=$length; $i++) {
       |     $rand_str_tmp .= $char_tmp[int(rand($cnt+1))];
       | }
       |
       | return $rand_str_tmp;
       |

Time: 2.506 ms
dbname=# \x
Expanded display is off.

テストテーブルを作成します。
dbname=# DROP TABLE test003;
DROP TABLE
時間: 5.686 ms
dbname=# CREATE TABLE test003 (id SERIAL, data VARCHAR(255));
NOTICE:  CREATE TABLEはシリアル列"test003.id"用に暗黙的なシーケンス"test003_id_seq"を作成します。
CREATE TABLE
時間: 3.490 ms
dbname=# \d test003
                              テーブル "public.test003"
  列  |           型           |                        修飾語
------+------------------------+------------------------------------------------------
 id   | integer                | not null default nextval('test003_id_seq'::regclass)
 data | character varying(255) |

テストテーブルの serial を後回しにします。
dbname=# DROP TABLE test003;
DROP TABLE
時間: 3.348 ms
dbname=# CREATE TABLE test003 (id INTEGER, data VARCHAR(255));
CREATE TABLE
時間: 2.262 ms
dbname=# CREATE SEQUENCE test003_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1;
CREATE SEQUENCE
時間: 2.420 ms
dbname=# ALTER TABLE test003_id_seq
  OWNER TO dbuser;
ALTER TABLE
時間: 0.688 ms
dbname=# ALTER SEQUENCE test003_id_seq OWNED BY test003.id;
ALTER SEQUENCE
時間: 1.429 ms
dbname=# ALTER TABLE test003 ALTER COLUMN id SET NOT NULL;
ALTER TABLE
時間: 1.396 ms
dbname=# ALTER TABLE test003 ALTER COLUMN id SET DEFAULT nextval('test003_id_seq'::regclass);
ALTER TABLE
時間: 1.994 ms
dbname=# \d test003
                              テーブル "public.test003"
  列  |           型           |                        修飾語
------+------------------------+------------------------------------------------------
 id   | integer                | not null default nextval('test003_id_seq'::regclass)
 data | character varying(255) |

10件INSERTします。
dbname=# DROP TABLE test003;
DROP TABLE
時間: 2.531 ms
dbname=# CREATE TABLE test003 (id INTEGER, data VARCHAR(255));
CREATE TABLE
時間: 2.185 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 10);
INSERT 0 10
時間: 3.071 ms
dbname=# SELECT COUNT(data) FROM test003;
 count
-------
    10
(1 行)

時間: 0.826 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len ASC;
 len | cnt
-----+-----
   5 |   1
  20 |   1
  53 |   1
  54 |   1
  58 |   1
  64 |   1
  68 |   1
  82 |   1
 119 |   1
 243 |   1
(10 行)

時間: 1.192 ms

ある程度INSERTします。
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 100);
INSERT 0 100
時間: 11.091 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 1000);
INSERT 0 1000
時間: 65.906 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 10000);
INSERT 0 10000
時間: 424.606 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 100000);
INSERT 0 100000
時間: 4210.131 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 1000000);
INSERT 0 1000000
時間: 41347.289 ms
dbname=# SELECT COUNT(data) FROM test003;
  count
---------
 1111110
(1 行)

時間: 153.315 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len ASC;
 len | cnt
-----+------
   1 | 4572
   2 | 4354
   3 | 4488

... snip ...

 248 | 4435
 249 | 4477
 250 | 4445
(250 行)

時間: 2228.025 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len ASC LIMIT 1;
 len | cnt
-----+------
   1 | 4572
(1 行)

時間: 1860.114 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
 len | cnt
-----+------
 250 | 4445
(1 行)

時間: 1874.450 ms

約1000万レコードでテストに移ります。
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 10000000);
INSERT 0 10000000
時間: 410982.963 ms
dbname=# SELECT COUNT(data) FROM test003;
  count
----------
 11111110
(1 行)

時間: 1408.626 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len ASC LIMIT 1;
 len |  cnt
-----+-------
   1 | 44672
(1 行)

時間: 85998.609 ms
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
 len |  cnt
-----+-------
 250 | 44174
(1 行)

時間: 87158.518 ms
dbname=#

# 非常に怪しいランダムの偏りです。 RANDOM() の使用法に誤りがあります。

serial を戻します。
dbname=# CREATE SEQUENCE test003_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1;
CREATE SEQUENCE
時間: 27.763 ms
dbname=# ALTER TABLE test003_id_seq
  OWNER TO dbuser;
ALTER TABLE
時間: 0.705 ms
dbname=# ALTER SEQUENCE test003_id_seq OWNED BY test003.id;
ALTER SEQUENCE
時間: 1.378 ms
dbname=# ALTER TABLE test003 ALTER COLUMN id SET NOT NULL;
ERROR:  列"id"にはNULL値があります
時間: 36.759 ms
dbname=# ALTER TABLE test003 ALTER COLUMN id SET DEFAULT nextval('test003_id_seq'::regclass);
ALTER TABLE
時間: 2.010 ms
dbname=# UPDATE test003 SET id=nextval('test003_id_seq'::regclass);
UPDATE 11111110
時間: 125298.068 ms
dbname=# ALTER TABLE test003 ALTER COLUMN id SET NOT NULL;
ALTER TABLE
時間: 3793.104 ms
dbname=# \d test003
                              テーブル "public.test003"
  列  |           型           |                        修飾語
------+------------------------+------------------------------------------------------
 id   | integer                | not null default nextval('test003_id_seq'::regclass)
 data | character varying(255) |

dbname=#

serial が正常か確認します。
dbname=# \d test003_id_seq
      シーケンス "public.test003_id_seq"
      列       |   型    |         値
---------------+---------+---------------------
 sequence_name | name    | test003_id_seq
 last_value    | bigint  | 11111110
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 13
 is_cycled     | boolean | f
 is_called     | boolean | t
所有者: public.test003.id

dbname=# SELECT id, LENGTH(data) FROM test003 ORDER BY id DESC LIMIT 2;
    id    | length
----------+--------
 11111110 |    103
 11111109 |    214
(2 行)

時間: 6080.592 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 1);
INSERT 0 1
時間: 16.984 ms
dbname=# SELECT id, LENGTH(data) FROM test003 ORDER BY id DESC LIMIT 2;
    id    | length
----------+--------
 11111111 |     47
 11111110 |    103
(2 行)

時間: 6044.897 ms

インデックス優先にします。
dbname=# SET ENABLE_SEQSCAN = FALSE;
SET
時間: 0.382 ms

primary key を追加します。
dbname=# ALTER TABLE test003 ADD PRIMARY KEY(id);
NOTICE:  ALTER TABLE / ADD PRIMARY KEYはテーブル"test003"に暗黙的なインデックス"test003_pkey"を作成します
ALTER TABLE
時間: 13328.882 ms

dbname=# SELECT id, LENGTH(data) FROM test003 ORDER BY id DESC LIMIT 2;
    id    | length
----------+--------
 11111111 |     47
 11111110 |    103
(2 行)

時間: 1.277 ms
dbname=# SELECT id, LENGTH(data) FROM test003 ORDER BY id ASC LIMIT 2;
 id | length
----+--------
  1 |     68
  2 |    119
(2 行)

時間: 0.886 ms
dbname=# EXPLAIN SELECT id, LENGTH(data) FROM test003 ORDER BY id ASC LIMIT 2;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..7.78 rows=2 width=131)
   ->  Index Scan using test003_pkey on test003  (cost=0.00..43216749.91 rows=11111111 width=131)
(2 行)

時間: 0.807 ms

集約関数にindexは貼れないことを確認します。
dbname=# SELECT COUNT(data) FROM test003;
  count
----------
 11111111
(1 行)

時間: 1494.485 ms
dbname=# EXPLAIN SELECT COUNT(data) FROM test003;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=10000579709.89..10000579709.90 rows=1 width=127)
   ->  Seq Scan on test003  (cost=10000000000.00..10000551932.11 rows=11111111 width=127)
(2 行)

時間: 0.569 ms
dbname=# CREATE INDEX ON test003 (COUNT(data));
ERROR:  式インデックスには集約関数を使用できません
時間: 1001.837 ms
dbname=# SELECT reltuples FROM pg_class WHERE oid = 'test003'::regclass;
  reltuples
-------------
 1.11111e+07
(1 行)

時間: 1.378 ms

COUNT の書き方を試しています。
dbname=# SELECT COUNT(id) FROM test003;
  count
----------
 11111111
(1 行)

時間: 2903.467 ms
dbname=# EXPLAIN SELECT COUNT(id) FROM test003;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Aggregate  (cost=759916.69..759916.70 rows=1 width=4)
   ->  Bitmap Heap Scan on test003  (cost=180206.80..732138.91 rows=11111111 width=4)
         ->  Bitmap Index Scan on test003_pkey  (cost=0.00..177429.02 rows=11111111 width=0)
(3 行)

時間: 0.670 ms
dbname=# SELECT *, COUNT(*) OVER() AS cntall FROM test003 ORDER BY id DESC LIMIT 1;
    id    |                      data                       |  cntall
----------+-------------------------------------------------+----------
 11111111 | vonjFJ3RCln0ohF65zQQ5isLbe1O1AoI0QnJT4iyOGAm42t | 11111111
(1 行)

時間: 97596.188 ms
dbname=# SELECT reltuples::bigint FROM pg_class WHERE oid = 'test003'::regclass;
 reltuples
-----------
  11250615
(1 行)

時間: 9.842 ms
dbname=# explain SELECT *, COUNT(id) OVER() AS cntall FROM test003 ORDER BY id DESC LIMIT 1;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.21 rows=1 width=131)
   ->  WindowAgg  (cost=0.00..2396190.51 rows=11244966 width=131)
         ->  Index Scan Backward using test003_pkey on test003  (cost=0.00..2255628.43 rows=11244966 width=131)
(3 行)

時間: 2.019 ms
dbname=# SELECT *, COUNT(id) OVER() AS cntall FROM test003 ORDER BY id DESC LIMIT 1;
    id    |                      data                       |  cntall
----------+-------------------------------------------------+----------
 11111111 | vonjFJ3RCln0ohF65zQQ5isLbe1O1AoI0QnJT4iyOGAm42t | 11111111
(1 行)

時間: 14490.027 ms

explain が変化しました。多分統計情報が集まってきたためです。
dbname=# SELECT COUNT(id) FROM test003;
  count
----------
 11111111
(1 行)

時間: 1078.204 ms
dbname=# explain SELECT COUNT(id) FROM test003;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=318660.37..318660.38 rows=1 width=4)
   ->  Index Only Scan using test003_pkey on test003  (cost=0.00..290547.96 rows=11244966 width=4)
(2 行)

時間: 0.727 ms

work_mem を確認します。
$ sudo grep work_mem /etc/postgresql-9.2/*
/etc/postgresql-9.2/postgresql.conf:#work_mem = 1MB                             # min 64kB
/etc/postgresql-9.2/postgresql.conf:#maintenance_work_mem = 16MB                # min 1MB

LENGTH(data) にインデックスを貼ります。
dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
 len |  cnt
-----+-------
 250 | 44174
(1 行)

時間: 125475.003 ms
dbname=# explain SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Limit  (cost=10004819382.67..10004819382.70 rows=1 width=127)
   ->  GroupAggregate  (cost=10004819382.67..10004964150.97 rows=4834484 width=127)
         ->  Sort  (cost=10004819382.67..10004847495.09 rows=11244966 width=127)
               Sort Key: (length((data)::text))
               ->  Seq Scan on test003  (cost=10000000000.00..10000581383.08 rows=11244966 width=127)
(5 行)

時間: 0.963 ms
dbname=#

dbname=# CREATE INDEX ON test003 (LENGTH(data));
CREATE INDEX
時間: 37074.003 ms

dbname=# explain SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9.07 rows=1 width=127)
   ->  GroupAggregate  (cost=0.00..43332017.17 rows=4776936 width=127)
         ->  Index Scan Backward using test003_length_idx on test003  (cost=0.00..43216749.91 rows=11111111 width=127)
(3 行)

時間: 1.398 ms

dbname=# SELECT LENGTH(data) AS len, COUNT(data) AS cnt FROM test003 GROUP BY len ORDER BY len DESC LIMIT 1;
 len |  cnt
-----+-------
 250 | 44174
(1 行)

時間: 111.031 ms
dbname=#

LIKE に似た絞込みにインデックスを貼ります。
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE data LIKE 'a%';
  cnt
--------
 179465
(1 行)

時間: 1934.644 ms
dbname=# explain SELECT COUNT(data) AS cnt FROM test003 WHERE data LIKE 'a%';
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=10000579991.33..10000579991.34 rows=1 width=127)
   ->  Seq Scan on test003  (cost=10000000000.00..10000579709.89 rows=112578 width=127)
         Filter: ((data)::text ~~ 'a%'::text)
(3 行)

時間: 1.167 ms
dbname=#

LIKE に似た絞込みにインデックスを貼ります。
dbname=# CREATE INDEX ON test003 (substr(data, 1, 1));
CREATE INDEX
時間: 62754.251 ms
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data, 1, 1) = 'a';
  cnt
--------
 179465
(1 行)

時間: 3131.558 ms
dbname=# explain SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data, 1, 1) = 'a';
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Aggregate  (cost=157082.20..157082.21 rows=1 width=127)
   ->  Bitmap Heap Scan on test003  (cost=1044.03..156943.31 rows=55556 width=127)
         Recheck Cond: (substr((data)::text, 1, 1) = 'a'::text)
         ->  Bitmap Index Scan on test003_substr_idx  (cost=0.00..1030.14 rows=55556 width=0)
               Index Cond: (substr((data)::text, 1, 1) = 'a'::text)
(5 行)

時間: 1.048 ms
dbname=#

LIKE に似た絞込みにインデックスを貼ります。
dbname=# CREATE INDEX ON test003 (substr(data::text, 1, 1));
CREATE INDEX
時間: 62361.070 ms
dbname=# explain SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 1) = 'a';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=157082.20..157082.21 rows=1 width=127)
   ->  Bitmap Heap Scan on test003  (cost=1044.03..156943.31 rows=55556 width=127)
         Recheck Cond: (substr((data)::text, 1, 1) = 'a'::text)
         ->  Bitmap Index Scan on test003_substr_idx1  (cost=0.00..1030.14 rows=55556 width=0)
               Index Cond: (substr((data)::text, 1, 1) = 'a'::text)
(5 行)

時間: 1.409 ms
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 1) = 'a';
  cnt
--------
 179465
(1 行)

時間: 3078.055 ms
dbname=#

LIKE に似た絞込みにインデックスを貼ります。
dbname=# CREATE INDEX ON test003 (substr(data::text, 1, 2));
CREATE INDEX
時間: 65614.633 ms
dbname=# explain SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 2) = 'ab';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=157082.20..157082.21 rows=1 width=127)
   ->  Bitmap Heap Scan on test003  (cost=1044.03..156943.31 rows=55556 width=127)
         Recheck Cond: (substr((data)::text, 1, 2) = 'ab'::text)
         ->  Bitmap Index Scan on test003_substr_idx2  (cost=0.00..1030.14 rows=55556 width=0)
               Index Cond: (substr((data)::text, 1, 2) = 'ab'::text)
(5 行)

時間: 0.904 ms
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 2) = 'ab';
 cnt
------
 2873
(1 行)

時間: 8.811 ms
dbname=#

dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE data LIKE 'ab%';
 cnt
------
 2873
(1 行)

時間: 1856.263 ms
dbname=#

LIKE に似た絞込みにインデックスを貼ります。
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE data LIKE 'abc%';
 cnt
-----
  48
(1 行)

時間: 1887.597 ms
dbname=# CREATE INDEX ON test003 (substr(data::text, 1, 3));
CREATE INDEX
時間: 69296.781 ms
dbname=# SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 3) = 'abc';
 cnt
-----
  48
(1 行)

時間: 0.988 ms
dbname=# explain SELECT COUNT(data) AS cnt FROM test003 WHERE substr(data::text, 1, 3) = 'abc';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=157082.20..157082.21 rows=1 width=127)
   ->  Bitmap Heap Scan on test003  (cost=1044.03..156943.31 rows=55556 width=127)
         Recheck Cond: (substr((data)::text, 1, 3) = 'abc'::text)
         ->  Bitmap Index Scan on test003_substr_idx3  (cost=0.00..1030.14 rows=55556 width=0)
               Index Cond: (substr((data)::text, 1, 3) = 'abc'::text)
(5 行)

時間: 1.042 ms
dbname=#

カウント数がざっくりで良い場合は使えます。
更新しなければ時間が経つと正確なカウント数が取れるかもしれません。
dbname=# SELECT reltuples::bigint FROM pg_class WHERE oid = 'test003'::regclass;
 reltuples
-----------
  11111111
(1 行)

時間: 1.109 ms

統計情報を見ます。
dbname=# select * from report_tablesize where relname like 'test003%';
       relname       | relkind |     rows     |      bytes
---------------------+---------+--------------+------------------
 test003             | r       |   11,111,111 |    3,611,205,632
 test003_id_seq      | S       |            1 |            8,192
 test003_length_idx  | i       |   11,111,111 |      249,593,856
 test003_pkey        | i       |   11,111,111 |      249,593,856
 test003_substr_idx  | i       |   11,111,111 |      249,593,856
 test003_substr_idx1 | i       |   11,111,111 |      249,593,856
 test003_substr_idx2 | i       |   11,111,111 |      249,593,856
 test003_substr_idx3 | i       |   11,111,111 |      249,593,856
(8 行)

時間: 1.175 ms

統計情報を見ます。
drop view report_tablesize;
CREATE VIEW report_tablesize AS

SELECT
  relname,
  relkind,
  to_char(reltuples, '999,999,999') as rows,
  to_char(pg_relation_size(relname::regclass), '999,999,999,999') as bytes
FROM pg_class
WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname='public');

COMMENT ON VIEW report_tablesize IS '現在のテーブルサイズ';

統計情報を見ます。
dbname=# select * from report_tablesize where relname like 'test003%' and relkind='i';
       relname       | relkind |     rows     |      bytes
---------------------+---------+--------------+------------------
 test003_length_idx  | i       |   11,111,111 |      249,593,856
 test003_pkey        | i       |   11,111,111 |      249,593,856
 test003_substr_idx3 | i       |   11,111,111 |      249,593,856
(3 行)

時間: 1.506 ms

COUNT をキャッシュします。
dbname=# CREATE TABLE mytablerowcount (id SERIAL, name VARCHAR(63) NOT NULL, cnt BIGINT NOT NULL);
NOTICE:  CREATE TABLEはシリアル列"mytablerowcount.id"用に暗黙的なシーケンス"mytablerowcount_id_seq"を作成します。
CREATE TABLE
時間: 46.575 ms

dbname=# ALTER TABLE mytablerowcount ADD UNIQUE (name);
NOTICE:  ALTER TABLE / ADD UNIQUEはテーブル"mytablerowcount"に暗黙的なインデックス"mytablerowcount_name_key"を作成します
ALTER TABLE
時間: 26.638 ms

COUNT をキャッシュします。
dbname=# select * from mytablerowcount;
 id |  name   | cnt
----+---------+------
 16 | test004 | 2280
(1 row)

dbname=# select get_mytablerowcount('test003');
 get_mytablerowcount
---------------------
            11111111
(1 row)

dbname=# select * from mytablerowcount;
 id |  name   |   cnt
----+---------+----------
 17 | test003 | 11111111
 16 | test004 |     2280
(2 rows)

dbname=# \timing
Timing is on.
dbname=# select get_mytablerowcount('test003');
 get_mytablerowcount
---------------------
            11111111
(1 row)

Time: 1.892 ms
dbname=# INSERT INTO test003 (data) SELECT randstr((FLOOR(RANDOM()*250)+1)::INT) AS data FROM GENERATE_SERIES(1, 10);
INSERT 0 10
Time: 403.743 ms
dbname=# select get_mytablerowcount('test003');
 get_mytablerowcount
---------------------
            11111121
(1 row)

Time: 1.805 ms

COUNT をキャッシュします。
dbname=# select * from mytablerowcount where name='test003'; select count(id) from test003; select get_mytablerowcount('test003'); SELECT reltuples::bigint FROM pg_class WHERE oid = 'test003'::regclass;
 id |  name   |   cnt
----+---------+----------
 17 | test003 | 11111121
(1 row)

Time: 0.714 ms
  count
----------
 11111121
(1 row)

Time: 1036.212 ms
 get_mytablerowcount
---------------------
            11111121
(1 row)

Time: 0.852 ms
 reltuples
-----------
  11111111
(1 row)

Time: 0.603 ms
dbname=#

COUNT をキャッシュします。
-- Function: set_mytablerowcount()

-- DROP FUNCTION set_mytablerowcount();

CREATE OR REPLACE FUNCTION set_mytablerowcount()
  RETURNS trigger AS
$BODY$

use strict;

my $res;

=pod
unless (($_TD->{event} eq "INSERT" || $_TD->{event} eq "UPDATE") &&
        ($_TD->{when} eq "BEFORE") &&
        ($_TD->{level} eq "ROW"))
{
    elog(ERROR, "CRITICAL: this call timing is invalid. (event = $_TD->{event}, when = $_TD->{when}, level = $_TD->{level})");
}
=cut

if (! defined $_SHARED{stmt_mytablerowcount}->{$_TD->{table_name}})
{
    spi_exec_query("SELECT init_mytablerowcount('" . quote_ident($_TD->{table_name}) . "')");
}

$res = spi_exec_prepared($_SHARED{stmt_mytablerowcount}->{$_TD->{table_name}}->{insert}, {limit => 1}, $_TD->{table_name});

if ($res->{processed} != 1)
{
    elog(ERROR, "affected rows != 1.");
}

return;

$BODY$
  LANGUAGE plperl VOLATILE
  COST 100;
ALTER FUNCTION set_mytablerowcount()
  OWNER TO dbuser;

COUNT をキャッシュします。
-- Function: init_mytablerowcount(text)

-- DROP FUNCTION init_mytablerowcount(text);

CREATE OR REPLACE FUNCTION init_mytablerowcount(table_name text)
  RETURNS void AS
$BODY$

use strict;

my $table_name = $_[0];

my $sql;
my $cnt;

$sql = 'SELECT cnt FROM mytablerowcount WHERE name = $1';
$_SHARED{stmt_mytablerowcount}->{$table_name}->{select} = spi_prepare($sql, 'TEXT');

$sql = 'UPDATE mytablerowcount SET cnt=cnt+1 WHERE name = $1';
$_SHARED{stmt_mytablerowcount}->{$table_name}->{insert} = spi_prepare($sql, 'TEXT');

$sql = 'UPDATE mytablerowcount SET cnt=cnt-1 WHERE name = $1';
$_SHARED{stmt_mytablerowcount}->{$table_name}->{delete} = spi_prepare($sql, 'TEXT');

$sql = 'INSERT INTO mytablerowcount (name, cnt) VALUES ($1, (SELECT COUNT(id) FROM ' . quote_ident($table_name) . '))';
$_SHARED{stmt_mytablerowcount}->{$table_name}->{init} = spi_prepare($sql, 'TEXT');

$cnt = spi_exec_prepared($_SHARED{stmt_mytablerowcount}->{$table_name}->{select},
    {limit => 1}, $table_name)->{rows}[0]->{cnt};

if (defined $cnt)
{
}
else
{
    my $res = spi_exec_prepared($_SHARED{stmt_mytablerowcount}->{$table_name}->{init}, {limit => 1}, $table_name);
}

$BODY$
  LANGUAGE plperl VOLATILE
  COST 100;
ALTER FUNCTION init_mytablerowcount(text)
  OWNER TO dbuser;

COUNT をキャッシュします。
-- Function: get_mytablerowcount(text)

-- DROP FUNCTION get_mytablerowcount(text);

CREATE OR REPLACE FUNCTION get_mytablerowcount(table_name text)
  RETURNS bigint AS
$BODY$

use strict;

my $table_name = $_[0];
my $ret = 0;
my $cnt;

if (! defined $_SHARED{stmt_mytablerowcount}->{$_TD->{table_name}})
{
    spi_exec_query("SELECT init_mytablerowcount('" . quote_ident($table_name) . "')");
}

$cnt = spi_exec_prepared($_SHARED{stmt_mytablerowcount}->{$table_name}->{select},
    {limit => 1}, $table_name)->{rows}[0]->{cnt};

if (defined $cnt)
{
    $ret = $cnt;
}

return $cnt;

$BODY$
  LANGUAGE plperl VOLATILE
  COST 100;
ALTER FUNCTION get_mytablerowcount(text)
  OWNER TO dbuser;

randstr です。
-- Function: randstr(integer)

-- DROP FUNCTION randstr(integer);

CREATE OR REPLACE FUNCTION randstr(len integer)
  RETURNS text AS
$BODY$# http://aoi-f.blog.so-net.ne.jp/2011-05-15

use strict;

my $ret = "";

my $length = $_[0];

my @char_tmp=();

push @char_tmp, ('a'..'z');
push @char_tmp, ('A'..'Z');
push @char_tmp, (0..9);

my $rand_str_tmp = '';
my $cnt = $#char_tmp;

for (my $i=1; $i<=$length; $i++) {
    $rand_str_tmp .= $char_tmp[int(rand($cnt+1))];
}
    
return $rand_str_tmp;
$BODY$
  LANGUAGE plperl VOLATILE
  COST 100;
ALTER FUNCTION randstr(integer)
  OWNER TO dbuser;

画面が一つの場合 "Ctrl + Z => fg" で戻れます。
$ psql -h 192.168.0.110 -U dbuser dbname
ユーザ dbuser のパスワード:
psql (9.2.4)
"help" でヘルプを表示します.

dbname=# ^Z
[1]+  停止                  psql -h 192.168.0.110 -U dbuser dbname
$ sudo su - -c "echo 3 > /proc/sys/vm/drop_caches"
$ fg
psql -h 192.168.0.110 -U dbuser dbname

dbname=#

random()は0.5足して四捨五入が正解かもしれません。
可能であれば更新時に数えとけば集約関数の遅さもOKです。上記の例ではdeleteしてもカウント数が減りません。余計なことをする分、更新スピードは下がります。
pgadmin3は便利です。しかし画面左側のオブジェクトブラウザのツリーが、何かの作成時などに閉じてしまうことがあるのがうざいです。
検索サイト経由でpostgresqlのマニュアルを探すとバージョンがバラバラなのでローカルに検索できる環境を用意したほうが良いです。
pgadmin3に接続しているサーバのバージョンにマッチしたマニュアルの該当項目にジャンプするような右クリックのメニューがあるといいなと思います。
昔substr(data::text, 1, 3)でインデックスを作ってdata like 'abc%'とやるとインデックスを使ったような気がしましたが気のせいです。もしくは設定です。
pl/phpは9でも使えるようですが試していません。

0 件のコメント: