Oracle ConText Option Administrator's Guide
The topics discussed in this chapter are:
For more information about general database tuning, see Oracle7 Server Tuning.
However, the following areas can be addressed to help improve indexing performance and ensure that ConText indexes are successfully created for text columns:
It is important to ensure that the temporary tablespace for CTXSYS is large enough to perform the sort. In general, the size of the temporary tablespace should be at least 25% of the size of the data being indexed.
However, the size required also depends on the value specified for the SORT_AREA_SIZE initialization parameter. SORT_AREA_SIZE determines how much memory is used to perform the index sort before the sort spills to disk, and so affects the amount of temporary tablespace needed.
For more information about initialization parameters, see Oracle7 Server Administrator's Guide.
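The 25% rule of thumb above can be turned into a quick sizing check. The following is a minimal sketch only: the function name and the `fraction` parameter are illustrative assumptions, not part of ConText, and the result is a lower bound that does not account for the effect of SORT_AREA_SIZE.

```python
import math


def min_temp_tablespace_bytes(data_bytes: int, fraction: float = 0.25) -> int:
    """Return a lower bound on the CTXSYS temporary tablespace size.

    Applies the rule of thumb that the temporary tablespace should be
    at least 25% of the size of the data being indexed. The helper and
    its parameters are hypothetical, for illustration only.
    """
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    if fraction <= 0:
        raise ValueError("fraction must be positive")
    return math.ceil(data_bytes * fraction)


MB = 1024 ** 2
# Example: 2048 MB of text to index suggests at least 512 MB of temp space.
print(min_temp_tablespace_bytes(2048 * MB) // MB)
```

Treat the figure as a starting point and raise it if sorts still run out of temporary space.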
The amount of memory allocated for indexing should be a function of the amount of real and virtual memory on the machine on which the ConText server is running.
To calculate this amount, determine the amount of real memory left over after accounting for all other processes, including ConText servers, running on the machine. The remaining real memory should be allocated for the ConText server(s).
If you are running multiple ConText servers on a single machine, divide the available memory equally among the servers so that each server has sufficient memory for parallel indexing.
Attention: If your real memory usage on the servers exceeds the real memory of the machine, you will experience thrashing due to processes being swapped out of real memory and into virtual memory on disk.
This will cause significant performance degradation and should be avoided at all costs. In other words, always err on the conservative side when allocating indexing memory.
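The calculation described above can be sketched as follows. This is an illustration under stated assumptions: the function name and the `headroom_percent` safety margin are hypothetical and are not ConText parameters. The margin simply reflects the advice to err on the conservative side.

```python
def indexing_memory_per_server(
    total_real_bytes: int,
    other_processes_bytes: int,
    num_servers: int,
    headroom_percent: int = 10,
) -> int:
    """Split the leftover real memory equally among ConText servers.

    headroom_percent is a hypothetical safety margin held back so that
    total usage stays below physical memory and avoids swapping.
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in the range 0-99")
    leftover = total_real_bytes - other_processes_bytes
    if leftover <= 0:
        raise ValueError("no real memory left for ConText servers")
    # Integer arithmetic keeps the result exact and reproducible.
    usable = leftover * (100 - headroom_percent) // 100
    return usable // num_servers


MB = 1024 ** 2
# Example: 512 MB machine, 128 MB used by other processes, 3 servers.
print(indexing_memory_per_server(512 * MB, 128 * MB, 3) // MB)
```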
Immediate DML initiates immediate reindexing so that changes to documents are reflected in the index in real time.
Batch mode is generally preferred because the index will be less fragmented. This is because a larger number of reindexing requests for documents are processed by a single server in one pass, resulting in less segmentation in the index. However, immediate DML is advantageous if updates to the text table need to be reflected in the index in real time.
Note: Fragmentation of the index may be fixed by running index optimization regularly.
On the other hand, if each ConText server runs on a separate workstation in a networked environment, running indexing in parallel yields significant performance gains. The advantage of running multiple servers is gained during the inversion of the document set in memory. However, when the inverted index is flushed from the memory buffers to the index table, contention may occur because multiple servers write to the wordlist table at the same time.
This can be alleviated by increasing the INITRANS parameter for the wordlist (I1T) table to match the number of servers indexing in parallel.
For more information about setting the INITRANS parameter for the I1T table, see the section on Engine Tiles in "Tiles, Tile Attributes, and Attribute Values".
For more information about parallel processing, see Oracle7 Server Administrator's Guide.
Two-table compaction is much faster because only reads are performed on the source table and only inserts are performed on the destination table. There are no updates and, more importantly, no Oracle index updates when two-table compaction is used.
The advantage of in-place compaction is that it needs much less space. Two-table compaction produces an approximate copy of the index table. The destination table is smaller than the source table by an amount that depends on how much compaction is performed: the more fragmented the index, the smaller the destination table relative to the source table.
When choosing a compaction method, consider how much DML has been performed on the table since the last optimization. If a large amount of DML has been performed, optimization is more likely to reduce the wordlist significantly, and the compaction will require a large number of reads and writes.
Two-table compaction is preferred for this scenario. However, if the index has not been significantly fragmented, in-place compaction should perform sufficiently well and requires less tablespace.
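The trade-off between the two methods can be written down as a simple decision helper. This is a sketch only: the function, its inputs, and the 30% threshold are hypothetical values chosen for illustration. How you measure fragmentation (for example, the share of documents changed since the last optimization) is also an assumption and is not defined by ConText.

```python
def choose_compaction(
    fragmented_fraction: float,
    free_tablespace_bytes: int,
    index_table_bytes: int,
    heavy_dml_threshold: float = 0.3,
) -> str:
    """Suggest "two-table" or "in-place" compaction.

    fragmented_fraction: rough estimate (0.0-1.0) of how fragmented the
        index is; how it is obtained is left to the administrator.
    heavy_dml_threshold: hypothetical cut-off above which the index is
        treated as significantly fragmented.
    """
    if not 0.0 <= fragmented_fraction <= 1.0:
        raise ValueError("fragmented_fraction must be between 0 and 1")
    # Two-table compaction writes an approximate copy of the index table,
    # so budget for the worst case: a destination as large as the source.
    has_space = free_tablespace_bytes >= index_table_bytes
    if fragmented_fraction >= heavy_dml_threshold and has_space:
        return "two-table"
    return "in-place"


print(choose_compaction(0.5, free_tablespace_bytes=10, index_table_bytes=5))
```

Sizing for a full copy is deliberately pessimistic, since the destination table is normally smaller than the source.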
Additionally, the index is flushed back to the database only after the indexing memory has been filled. This flush also utilizes array inserts into the database, reducing the number of network hits.
Note: If indexing memory is small, buffer flushes to the database are more frequent, so network performance becomes more of an issue.
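The relationship between indexing memory and flush frequency can be approximated with a simple model. The sketch below assumes that a flush happens each time the indexing memory fills and that the total size of the inverted index is known. Both are simplifications, and the function name is hypothetical.

```python
def estimated_flushes(inverted_index_bytes: int, indexing_memory_bytes: int) -> int:
    """Estimate how many buffer flushes an indexing run needs.

    Simplified model: one flush per full indexing buffer, plus one for
    any remainder. Real behaviour depends on the server implementation.
    """
    if inverted_index_bytes < 0:
        raise ValueError("inverted_index_bytes must be non-negative")
    if indexing_memory_bytes <= 0:
        raise ValueError("indexing_memory_bytes must be positive")
    # Ceiling division using integers only.
    return -(-inverted_index_bytes // indexing_memory_bytes)


MB = 1024 ** 2
# Halving the indexing memory roughly doubles the number of flushes.
print(estimated_flushes(400 * MB, 50 * MB), estimated_flushes(400 * MB, 25 * MB))
```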
The hitlist result table may be shared by all users or may be specific to the user running the query. Performance is better if each user has a unique hitlist table, because the query on the hitlist to get the result set does not require a filter when the hitlist is not shared. Additionally, an unshared hitlist table can be truncated rather than having its rows deleted. Truncating a table is much faster than deleting from it and has the added benefit of generating almost no redo.
Finally, the hitlist table should be in a datafile that is on a raw partition. This will result in faster writes and reads from the table.
For more information about hitlists and the MAX operator, see Oracle ConText Option Application Developer's Guide.
If you find that throughput drops as the number of users increases, start up more query servers, provided that the servers are not competing with each other for CPU cycles. If servers are bottlenecking due to insufficient CPU cycles, consider spreading query servers across multiple machines.
Note: Network considerations are important during queries, because fast response time is generally critical and the time to perform network round trips becomes a significant percentage of the elapsed query time. If you are forced to use multiple machines for your query servers, make sure that the network connection between each query server and the database is fast.
Copyright © 1996 Oracle Corporation. All Rights Reserved.