Wednesday, June 02, 2010

Indexing on Partitoned Tables:

Local Indexes:
A local index on a partition table will have the same partition schema as the table. Each partition in a local index will have similar entries like the corresponding data in the table partition.
eg.:
Creating a table first:
SQL> CREATE TABLE DATA_LOAD
( DATA_id NUMBER(6),
source_id NUMBER,
load_date DATE,
data_size NUMBER,
state VARCHAR2(20))
PARTITION BY RANGE (load_date)
SUBPARTITION BY LIST (state)
SUBPARTITION TEMPLATE (
SUBPARTITION NORTH_01 VALUES ('DELHI', 'JAMMU KASHMIR', 'PUNJAB', 'HARYANA') TABLESPACE DATA_01,
SUBPARTITION WEST_01 VALUES ('RAJASTHAN', 'GUJRAT', 'MAHARASTRA') TABLESPACE DATA_02,
SUBPARTITION SOUTH_01 VALUES ('TAMIL NADU', 'KERELA', 'KARNATAKA') TABLESPACE DATA_03,
SUBPARTITION EAST_01 VALUES ('ASSAM', 'WEST BENGAL', 'NAGALAND') TABLESPACE DATA_04,
SUBPARTITION NO_REGION VALUES (NULL) TABLESPACE USERS,
SUBPARTITION OTHER VALUES (DEFAULT) TABLESPACE USERS
)
( PARTITION data_2006 VALUES LESS THAN (TO_DATE('01-JAN-2007','DD-MON-YYYY')) ,
PARTITION data_2007 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')) ,
PARTITION data_2008 VALUES LESS THAN (TO_DATE('01-JAN-2009','DD-MON-YYYY')) ,
PARTITION data_2009 VALUES LESS THAN (TO_DATE('01-JAN-2010','DD-MON-YYYY'))
);
Table created.

SQL> create index DATA_LOAD_LOCAL1 on DATA_LOAD (load_date) LOCAL;
Index created.

SQL> insert into DATA_LOAD values (1,1,sysdate-1200,10,'DELHI');
1 row created.
SQL> insert into DATA_LOAD values (2,2,sysdate-1100,20,'RAJASTHAN');
1 row created.
SQL> insert into DATA_LOAD values (3,3,sysdate-1000,30,'KERELA');
1 row created.
SQL> insert into DATA_LOAD values (4,4,sysdate-1300,40,'ASSAM');
1 row created.
SQL> insert into DATA_LOAD values (5,5,sysdate-900,50,'NAGALAND');
1 row created.
SQL> insert into DATA_LOAD values (6,6,sysdate-300,60,'PUNJAB');
1 row created.
SQL> insert into DATA_LOAD values (7,7,sysdate,70,'HARYANA');
1 row created.
SQL> commit;
Commit complete.
SQL> analyze table DATA_LOAD compute statistics;
Table analyzed.

SQL> analyze index DATA_LOAD_LOCAL1 compute statistics ;
Index analyzed.

SQL> select INDEX_NAME,COMPOSITE,PARTITION_NAME,SUBPARTITION_COUNT,NUM_ROWS from user_ind_partitions where index_name='DATA_LOAD_LOCAL1';

INDEX_NAME COM PARTITION_NAME SUBPARTITION_COUNT NUM_ROWS
------------------------------ --- --------------- ------------------ ----------
DATA_LOAD_LOCAL1 YES DATA_2006 6 5
DATA_LOAD_LOCAL1 YES DATA_2007 6 0
DATA_LOAD_LOCAL1 YES DATA_2008 6 1
DATA_LOAD_LOCAL1 YES DATA_2009 6 1

SO we can see the index partitions were created the same way as the table partition, same subpartitions also.

Prefixed LOCAL Index:
A prefixed local index is an index in which the left most column used in the index definition is the same as the table partitioning key. The above example is a local prefixed index.

LOCAL NON-Prefixed Index:
As the name suggests its opposite to the prefixed local index. In this index the left most key is not the partition key column.

SQL> create index DATA_LOAD_LOCAL2 on DATA_LOAD (source_id) LOCAL;
Index created.

SQL> analyze index DATA_LOAD_LOCAL2 compute statistics;
Index analyzed.

SQL> select INDEX_NAME,COMPOSITE,PARTITION_NAME,SUBPARTITION_COUNT,NUM_ROWS from user_ind_partitions where index_name='DATA_LOAD_LOCAL2';
INDEX_NAME COM PARTITION_NAME SUBPARTITION_COUNT NUM_ROWS
------------------------------ --- --------------- ------------------ ----------
DATA_LOAD_LOCAL2 YES DATA_2006 6 5
DATA_LOAD_LOCAL2 YES DATA_2007 6 0
DATA_LOAD_LOCAL2 YES DATA_2008 6 1
DATA_LOAD_LOCAL2 YES DATA_2009 6 1

We can see that even in the NON-Prefixed index has the same partitioning scheme as the partition table.
Each partition of the NON-Prefixed Local index will contain the whole range of possible partition key values, so this might we inefficient at time if we are not using parallel option while selecting. As here the partitions for the index are created like the that of the table using the partition key, but the index key is something else.

GLOBAL INDEXES:
Basically Global Indexes are indexes on which we can define the parition scheme, we define how many partitions we want of the index using which ever table column we would like to use. A global index can be created with the table partition key as the leftmost column or with any other column and in the create statement we can define the partition values for the key.

SQL> CREATE INDEX DATA_LOAD_GLOBAL ON DATA_LOAD (load_date,data_size)
GLOBAL PARTITION BY RANGE (load_date)
(PARTITION p1 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')),
PARTITION p2 VALUES LESS THAN (MAXVALUE)
);
Index created.

SQL> select INDEX_NAME,COMPOSITE,PARTITION_NAME,SUBPARTITION_COUNT,NUM_ROWS from user_ind_partitions where index_name='DATA_LOAD_GLOBAL';
INDEX_NAME COM PARTITION_NAME SUBPARTITION_COUNT NUM_ROWS
------------------------------ --- --------------- ------------------ ----------
DATA_LOAD_GLOBAL NO P1 0 5
DATA_LOAD_GLOBAL NO P2 0 2

We can see that the partition definition of the index is different from the table.

Actually the Global indexes are more useful when using other column for indexing the table rather than the partition key column of the table. As in our case the partitions are done on the basic on load_date, what if we use the column data_size more often to query than the load_date. In this case if we create a local index on data_size (non-prefixed local index) than its partition defination doesn't help the query as it is partitions are based on the table partitioning.
In our case:
Every partition based on load_date can have all the values of data_size. Say the data size column has values from 1 to 100.
Then DATA_2006 can have values 1 to 100, DATA_2007 can have values 1 to 100 and so on. So this does not help quering this index so much.

Instead we can create a global index on this column and define the correct partition scheme for the index using data_size as the partition key for the index.
SQL> CREATE INDEX DATA_LOAD_GLOBAL2 on DATA_LOAD (data_size)
GLOBAL PARTITION BY RANGE (data_size)
(PARTITION p1 VALUES LESS THAN (30),
PARTITION p2 VALUES LESS THAN (50),
PARTITION p3 VALUES LESS THAN (80),
PARTITION p4 VALUES LESS THAN (MAXVALUE)
);
Index created.

SQL> select INDEX_NAME,COMPOSITE,PARTITION_NAME,SUBPARTITION_COUNT,NUM_ROWS from user_ind_partitions where index_name='DATA_LOAD_GLOBAL2';
INDEX_NAME COM PARTITION_NAME SUBPARTITION_COUNT NUM_ROWS
------------------------------ --- --------------- ------------------ ----------
DATA_LOAD_GLOBAL2 NO P2 0 2
DATA_LOAD_GLOBAL2 NO P3 0 3
DATA_LOAD_GLOBAL2 NO P4 0 0
DATA_LOAD_GLOBAL2 NO P1 0 2

Note: Global indexed should have same left most index key and the index partition key, they cannot be different.
Else we will receive the below error. They are different from the non-prefixed local partition in this context.

SQL> CREATE INDEX DATA_LOAD_GLOBAL2 on DATA_LOAD (data_size)
GLOBAL PARTITION BY RANGE (source_id)
(PARTITION p1 VALUES LESS THAN (30),
PARTITION p2 VALUES LESS THAN (50),
PARTITION p3 VALUES LESS THAN (80),
PARTITION p4 VALUES LESS THAN (MAXVALUE)
); 2 3 4 5 6 7
GLOBAL PARTITION BY RANGE (source_id)
*
ERROR at line 2:
ORA-14038: GLOBAL partitioned index must be prefixed

Similarly we can have hash and list partition Global indexes:

SQL> CREATE INDEX DATA_LOAD_GLOBAL3 on DATA_LOAD (data_size) GLOBAL PARTITION BY HASH (data_size) PARTITIONS 4;
CREATE INDEX DATA_LOAD_GLOBAL3 on DATA_LOAD (data_size) GLOBAL PARTITION BY HASH (data_size) PARTITIONS 4
*
ERROR at line 1:
ORA-01408: such column list already indexed
We received the above error as we already create an index with the above column.

SQL> CREATE INDEX DATA_LOAD_GLOBAL3 on DATA_LOAD (data_size,source_id) GLOBAL PARTITION BY HASH (data_size) PARTITIONS 4;
Index created.

And so on....

No comments: