SQL Formatting Tips for Documentum
http://www.revasolutions.com/ecm-blog/?p=167
SQL Formatting Tips for Documentum
April 6th, 2010 by Zainab Mantri
Sometimes we want to play directly with the underlying database or generate outputs in tabular format. It can easily be done in dql (Documentum Query Language) and then imported into Microsoft Excel. SQL does not generate tabular output and needs to be tweaked a little to get the desired results.
Here are some tips:
- Set autocommit off. This will enable you to verify the changes and doing a rollback if you feel that the changes are not good enough.
- SQL updates do not affect the modify date & modifier so if you are sure of the data that needs to be modified and wish to save the modify date etc use SQL. SQL updates can be done even when the documents are checked out.
- For every object type, there are 2 tables and 2 views that you’ll find useful. For dm_sysobject, the tables are: dm_sysobject_s and dm_sysobject_r, while the views are dm_sysobject_sv and dm_sysobject_rv. The naming conventions are easy enough to remember, appending _s for all the single-value attributes, _r for all the repeating attributes, and an extra v if it is a view. The regular tables only allow you to deal with attributes that weren’t inherited (i.e. r_object_type is not available for dm_document_s because it is inherited from dm_sysobject_s). However, the views will allow you to select all inherited attributes as columns. So if you were doing a select, it would be more convenient to use views, but if you are doing an update, then you are stuck with the regular table and would have to join the tables.
- The column that all 4 of tables share is the r_object_id attribute, so you will be using this in the join condition. I also highly recommend using aliases all the time because it will save you a lot of headaches. Here is a simple example to find how many my_persons have their best friend as one of their friends. A little useless, but it might be a query to verify if your data is consistent.
select count(a.r_object_id) from my_person_sv a, my_person_rv b where a.r_object_id = b.r_object_id and a.bestfriend = b.friends;
- If you want to access more than one repeating attribute, even for the same object type, you have to do another join with that repeating table. So we extend the previous example by adding the constraint that the best friend is a colleague. It is easy to see how repeating attributes can be quite a pain.
select count(a.r_object_id) from my_person_sv a, my_person_rv b, my_person_rv c where a.r_object_id = b.r_object_id and a.r_object_id = c.r_object_id and a.bestfriend = b.friends and a.bestfriend = c.colleagues;
- Speaking of joins, the real pain is when you want to do an update join. I can’t figure out the best way to do this in Oracle SQL, but here are some ways of doing it. This example will set some attribute by looking up a code in a control list if and only if that code exists in the control list.
update my_type_s a
set a.some_attribute = (select the_new_attribute b from my_control_list_s b where a.code = b.code)
where exists (select 1 from my_control_list_s where a.code = b.code);
The exists part performs a subquery for each occurrence and will be a source of slowdown, but if you need joined attributes in the where clause, then sometimes there might not be a better alternative. Of course, in some cases (such as this) there are ways to tweak the query in this way:
update my_type_s a
set a.some_attribute = (select the_new_attribute from my_control_list_s b where a.code = b.code)
where a.code in (select b.code from my_control_list_s b);
- Don’t insert or remove repeating attributes in SQL unless you absolutely know what you are doing. Feel free to update them to other values, but anything else is better left for a DQL query.
- If you are running queries in batch, use SET DEFINE OFF; to prevent ‘%’ or special characters from being treated as substitution variables.
- You can print the results to a file by spooling the data. spool C:\temp.txt. After the output is done use spool off;
- SET HEADING OFF will give you a clean output with the headings only once rather than being repeated every now and then.
- set linesize 1000; (Calculate the total length of all the attributes and the line size should be more than that else it will wrap the data.)
- set colsep “,”. (This will generate the output file with the columns separated by a ‘,’. When you will import this file in Microsoft Excel you can import it using the delimiter as ‘,’. You can use any colsep value like tab, ; , |, etc)
- set pagesize 50000; (The value of PAGESIZE is the number of output lines to be produced on each page.)
I welcome any other tips that all of us could use.