README.txt - Using ChkDB Rule-Based Data Checking
(After it has been installed)
- Contents
- Re-Running Checks www.chkdb.org Domains and Rule-Sets Populating a Domain Running Checks Errors while Running Checks Reviewing Results Running the Sample Adding Tables to Domains
Re-Running Checks
You are a DBA or SQL expert... you didn't setup ChkDB, but you have now been told to "Run the ChkDB checks again". This section describes how to do that.
You need to know:
- the database (the Oracle username and password)
- the "data domain" name
- the names of the tables to be checked
Login to SQL*Plus as the appropriate user. These queries should each result in some rows:
select * from tab where tname like 'CHKDB%';
select * from CHKDB_DOMAIN;
select * from CHKDB_RULESET;
Pick domains from the results of the select from the CHKDB_DOMAIN table.
Note that the tables specified to be checked might not be the tables in your database; they might be views into those tables or temporary tables to which data is dumped. You might want to...
select DATA_DOMAIN, count(*) from CHKDB_TABLE
group by DATA_DOMAIN;
and then query CHKDB_TABLE to identify the tables to be
checked. In some cases, this could include multiple
views into the same database table.
Checks are run by executing a procedure in a package, passing the domain and the table name. The table name can include the wildcard character '%'.
Ex. To check the CLIENT table in the BOOKS domain: SQL> call CHKDB_CHECK_DB.APPLY_RULES('BOOKS','CLIENT');
Ex. To check all tables beginning with 'ACCT'
in the BOOKS domain:
SQL> call CHKDB_CHECK_DB.APPLY_RULES('BOOKS','ACCT%');
Checking can take time - depending on table size, indexes, and other factors, maybe a long time. It is very wise to run checks on a single large table before attempting to run checks on multiple large tables.
Results are written to the table CHKDB_RESULT. See the file: sample_queries.sql
The ChkDB website www.chkdb.org includes:
- a page where you can download ChkDB (zip or gz)
- data-domain definitions and rule-set summaries
- the .txt files in the release (including this one)
- the ChkDB table-create script and e-r diagram
- pages showing the sample files and how they are used.
- information about the ChkDB project
Domains and Rule-Sets
A ChkDB distribution comes with files that define:
- data domains: tables/views that can be checked
- rule-sets: rules to apply to a data domain.
Domain and rule-set files are SQL script files.
To load a domain or rule-set into a database, you (or your DBA) runs the script file with SQL*Plus.
Existing domains and rule-sets are described at: http://www.chkdb.org/rulesets.htm
You can open domain or rule-set script files using a text editor (like Notepad).
Remove a domain from a database by executing the
script remove_domain.sql, specifying the domain name.
Ex. If the domain is CHKDB_SAMP_HR:
sql> @remove_domain.sql CHKDB_SAMP_HR
This will drop tables listed in CHKDB_TABLE, for this
domain, where DROP_ON_REMOVE_YN is set to 'Y'. This
script will also delete all rule-sets in the domain.
Remove a rule-set (along with its rules and results)
by executing the script remove_ruleset.sql, specifying the
domain name and rule-set name. Ex:
sql> @remove_ruleset CHKDB_SAMP_HR SAMP_RULES
Populating a Domain
A domain defines a set of tables or views to be checked.
Data checking can be done two ways:
- checking views into your database
- checking tables that are populated from your database
For each 'create table' statement in a domain script, you can:
- define a view into your database based on the 'create table' statement, or,
- create the table and populate it from your database with some software before checking is done.
Sometimes, you would like to create a view that is just a subset of another view or table, but it would be difficult to specify the 'create view' statement to select the desired subset. One way to handle this is to create a table with the same primary key as the source table. You populate this table with the keys of the subset using some software. Then you can define the view simply, maybe something like this:
CREATE VIEW EMP_SUPERVISOR as
SELECT * from EMP e where exists
(select 'x' from SUPERVISOR_LIST s
where s.EMP_ID = e.EMP_ID);
Running Checks
To run checks against one or more tables, you run a procedure in a stored PL/SQL package.
You control which rules are applied with the ACTIVE_YN column in the ChkDB tables. You can turn checking on or off for a domain, table, rule-set or individual rule.
You run checks by calling a procedure. You can specify a data domain and a table name. The table name can include the wild-card character '%'.
Examples (using SQL*Plus):
To check the CHKDB_SAMP_DEPT table
in the CHKDB_SAMP_HR domain:
SQL> CALL CHKDB_CHECK_DB.APPLY_RULES('CHKDB_SAMP_HR','CHKDB_SAMP_DEPT');
To check all the tables whose names begin with CHKDB_SAMP
in the CHKDB_SAMP_HR domain:
SQL> CALL CHKDB_CHECK_DB.APPLY_RULES('CHKDB_SAMP_HR','CHKDB_SAMP%');
To apply all active rules in the CHKDB_SAMP_HR domain: SQL> CALL CHKDB_CHECK_DB.APPLY_RULES('CHKDB_SAMP_HR','%');
Note that when a check is run for one or more tables, all existing results for those tables is first deleted.
Errors while Running Checks
If an error occurs when you run a check, it may be a result of a rule that was not specified correctly in the CHKDB_RULE table.
See the table CHKDB_RULE_ERR - if an error occurred because of how a rule is defined, it will be described in this table.
The table CHKDB_RULE_SQL has the SQL that was generated for each rule that is executed.
If you create or modify rules, remember:
- Inside a string delimited by single quotes, you specify a single quote as a pair of them - ex. 'don''t wait'.
- In the SQL generated for a rule, the table to which the rule applies has an alias of t. You may need this alias in where clauses that use 'exists' or if the table being tested and CHKDB tables have columns with the same name.
Reviewing Results
When a check is run and a rule is found to apply to a row in the table being checked, a row is inserted into the CHKDB_RESULT table.
You may have an application or a way of running reports to help you review these results, or you may run queries from SQL*Plus.
A ChkDB distribution includes sample queries in a file called: sample_queries.sql
Running the Sample
You can execute run_sample.sql to:
- create and populate a sample database of two tables
- define a sample domain and rule-set
- run checks against the sample database
- run sample queries (overwriting sample_results.txt)
Alternatively, you may use the run_sample.sql script as a guide to executing the various sample scripts one at a time.
You might want to make a copy of sample_results.txt before the file is overwritten by the sample queries.
Copyright and Open Source License
Copyright (c) Brian Marshall 2003
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
