We all get pressure from the application team to migrate big tables and schemas, and during production business hours that is a hard time for us. There should be a faster approach to save us in this situation 🙂 You can test the methods below and be a superhero Oracle DBA for your team. Let us see how!!
Case study: There are 4 big tables to be exported from prod and imported into dev.
Source database – testdb
Destination database – test1db
Version – 12.1.0
Total size of the four tables – 436GB
Prod schema – test
Dev schema – test1
1) Estimated time for a traditional import: 13 hours
2) Importing the tables separately, followed by constraints and indexes: 7 hours
3) Firing the import of the four tables and their objects in parallel: 6 hours
I usually estimate parallelism with the formula parallel = total size of objects / filesize (i.e., the total number of dumpfiles), bounded by the total available CPU cycles.
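To make that concrete with this case study's numbers (436GB at a 10GB filesize), here is a quick worked example; the prompt is just my SQL*Plus session:

kish@exdbx<>select ceil(436/10) "DUMPFILES" from dual;

 DUMPFILES
----------
        44

Roughly 44 dumpfiles, but because this is a table-mode job on only 4 tables, the effective degree is capped at 4 (see the note on per-table workers later in this post).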
Take an export backup of the full tables, with the parameters below and the maximum parallel servers:
~]# vi test.par

directory=export_dir
dumpfile=exptest_%U.dmp
logfile=exptest.log
tables=test.big1,test.big2,test.big3,test.big4
filesize=10GB    # splits the export into multiple dumpfiles to reduce single-file I/O
parallel=4       # parallel = total size of objects / filesize, capped by the table count

~]# expdp user/password parfile=test.par
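While the job runs, you can watch it from another session. A minimal monitoring sketch against the standard DBA_DATAPUMP_JOBS view (the job name will match the one printed in your log):

select owner_name, job_name, operation, state, degree
from dba_datapump_jobs;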
In this case, Data Pump took 1 hour to export the data.
Job "SYSTEM"."SYS_EXPORT_TABLE_01" successfully completed at Thu Dec 21 10:25:26 2018 elapsed 0 01:01:03
Importing the data with the normal method:
directory=export_dir
dumpfile=exptest_%U.dmp
logfile=imp_speed.log
tables=test.big1,test.big2,test.big3,test.big4
remap_schema=test:test1
remap_tablespace=prod1:dev1,prod2:dev2
parallel=4
logtime=all
~]# nohup impdp username/password parfile=test.par &

Time taken for the normal method:
22-DEC-18 15:09:59.953: Job "SYSTEM"."SYS_IMPORT_TABLE_02" successfully completed at Thu Dec 22 15:09:59 2018 elapsed 0 14:11:03
1) Importing with the faster approach:
~]# vi test.par

directory=export_dir
dumpfile=expfast_%U.dmp
logfile=impfast.log
tables=test.big1,test.big2,test.big3,test.big4
remap_schema=test:test1
table_exists_action=replace
parallel=4
logtime=all
exclude=index,constraint,statistics,grant    <=====
access_method=direct_path                    <=====
transform=disable_archive_logging:y          <=====
cluster=y                                    <===== if you have a RAC setup, this parameter utilizes cluster resources to speed up the import

~]# nohup impdp username/password parfile=test.par &
The parameters marked with arrows above have the major impact on the performance enhancement.
The direct_path access method inserts rows above the high water mark. Note that Oracle will not always use this method, due to certain limitations on table properties, but for most tables Oracle chooses the access method automatically.
The disable_archive_logging transform suppresses redo logging while the data is imported. If the database is in force_logging mode, the transform takes no effect. If archiving is already disabled, the parameter does not have much impact either.
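Before counting on that transform, it is worth checking the database's logging state first. A quick sanity check against standard v$database columns:

select log_mode, force_logging from v$database;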
Command for dumping the DDL to a script file:
impdp username/password directory=export_dir dumpfile=exptest_%U.dmp tables=test.big1,test.big2,test.big3,test.big4 sqlfile=script_ddl.sql include=constraint,index,grant
A script file named "script_ddl.sql" is written to the Data Pump directory.
Once the DDL is dumped, just modify it: raise the DOP for index creation and set enable novalidate on the constraints in the script file. Edit the file in the vi editor, split the index, constraint, and grant statements into separate files, and run them.
Substitute "noparallel" or "parallel 1" with "parallel 16" in all the DDL (a sed sketch follows).
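For a large sqlfile, a one-liner beats hand-editing. A hedged sed sketch; check the case and spacing your sqlfile actually uses before running it, and keep a backup:

cp script_ddl.sql script_ddl.sql.bak
sed -i 's/NOPARALLEL/PARALLEL 16/g; s/PARALLEL 1$/PARALLEL 16/g' script_ddl.sql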
The degree of parallelism is calculated as total size of objects / filesize, which gives the number of dumpfiles. E.g., with a total size of 450GB and a filesize of 50GB, the number of dumpfiles is:
kish@exdbx<>select 450/50 from dual;

    450/50
----------
         9
This means that, depending on the number of tables, we can use that many parallel servers, which is 4 here! In Data Pump, no matter how many parallel servers you request, Oracle utilizes the parallel degree per table count. If I have 4 tables, then using a parallel degree of 16 would leave the remaining 12 worker processes idle.
Note 1: The degree of parallelism also depends on the number of CPU cores on your server. This server has 48 cores; if your server has fewer cores, allocate fewer parallel resources.
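To check how many CPUs the instance itself sees, query the standard cpu_count parameter:

select name, value from v$parameter where name = 'cpu_count';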
Substitute "enable" with "enable novalidate" in the constraint DDL (see the sketch after the note below).
Note 2: 'enable novalidate' can be used in an emergency, such as a faster import during business hours. After the constraints are imported without validation, you can 'enable validate' them at leisure, so there will be no application user complaints.
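As with the parallel change, the substitution can be scripted. A hedged sed sketch, assuming the sqlfile ends each constraint statement with the uppercase keyword; verify against your file first:

sed -i 's/ENABLE;/ENABLE NOVALIDATE;/g' script_ddl.sql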
Eg: ~] vi script_ddl.sql
set timing on;

create index "test"."big1_ind" on "test"."big1" ("row1","row2","row3" DESC)
PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT
FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "dev1" parallel 16;
alter index "test"."big1_ind" parallel 16;
alter table "test"."big1" add constraint "big1_pk" primary key ("row11","row12","row13") using index "test"."big1_ind" enable novalidate;    <===== (by default Oracle validates constraints, which takes a huge amount of time)
Use sqlplus to create the constraints and indexes:
nohup sqlplus username/password @script_ddl.sql &
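Since the DDL was split into separate files earlier, the pieces can also run as concurrent background sessions. A sketch with hypothetical file names standing in for however you split script_ddl.sql; let the index session finish first if the constraints reference the indexes:

nohup sqlplus username/password @index_ddl.sql > index_ddl.out 2>&1 &
nohup sqlplus username/password @constraint_ddl.sql > constraint_ddl.out 2>&1 &
nohup sqlplus username/password @grant_ddl.sql > grant_ddl.out 2>&1 &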
Time taken to import without constraints and indexes:
22-DEC-18 22:10:33.991: Job "SYSTEM"."SYS_IMPORT_TABLE_02" successfully completed at Thu Dec 22 22:10:33 2018 elapsed 0 00:48:04
Time until the indexes were created and the constraints validated:

Table data imports faster without metadata such as constraints and indexes. The major time is spent on validating constraints and building indexes; we reduce it by not validating the constraints and by creating the indexes with parallelism.
elapsed time: 06:30:00 hours
For a faster import, we create the constraints with enable novalidate first, and then we can validate them.
After the process, validate the constraints by replacing "enable novalidate" with "enable validate" in the script_ddl.sql file and re-running it, at a time when it will not cause performance issues.
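The dictionary shows what still needs validating. A sketch against the standard DBA_CONSTRAINTS view, followed by the per-constraint statement (the schema and names are this post's examples):

select owner, table_name, constraint_name
from dba_constraints
where validated = 'NOT VALIDATED'
and owner = 'TEST1';

alter table "test1"."big1" enable validate constraint "big1_pk";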
This approach saved 9 hours compared with the previous method.
2) Alternatively, do the import by firing the imports of the four tables at the same time from four PuTTY sessions, to parallelize the operation manually. Follow the same process as in 1), but split the tables across the sessions, one parfile per table, as sketched below.
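A sketch of the manual split; each session gets its own parfile differing only in table name and log file (big1.par shown, big2.par to big4.par analogous). Multiple impdp jobs reading the same dump set concurrently is assumed here, so test it in your POC first:

~]# vi big1.par

directory=export_dir
dumpfile=exptest_%U.dmp
logfile=impfast_big1.log
tables=test.big1
remap_schema=test:test1
table_exists_action=replace
exclude=index,constraint,statistics,grant
transform=disable_archive_logging:y

~]# nohup impdp username/password parfile=big1.par &    # session 1 of 4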
This approach shaved another 30 minutes off the previous method.
elapsed time: 06:01:00 hours
The methods shown were tested in a different environment. Performance varies based on the configuration of the system, and there is no guarantee of speed. Please test it in a POC before refreshing production or development.
If you have big schemas, then follow the same approach with high parallel servers, and make sure to split the dump into multiple chunks of dumpfiles to make parallel more effective.
E.g., if a schema SCOTT has 50 tables, then you can follow the same method used for the 4 tables, but parallel should be equal to or less than 50 (or 48, or 32, ...) based on the CPU power you have.
If, out of the 50 tables, only 5 are big and the remaining 45 are considerably small, then start the import of the 5 big tables separately in 5 different PuTTY sessions, excluding indexes, constraints, and statistics, and in parallel import the 45 small tables with a schema import that excludes the 5 big tables, using 48 parallel degree and without excluding indexes, constraints, and statistics. That way you save time and don't fall into the serialization trap! A sketch follows.
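A hedged sketch of that split, with hypothetical table names; the EXCLUDE=TABLE filter is standard Data Pump syntax, but the quoting is far easier inside a parfile than on the command line:

~]# vi small_tables.par

directory=export_dir
dumpfile=expscott_%U.dmp
logfile=imp_small.log
schemas=scott
exclude=TABLE:"IN ('BIG1','BIG2','BIG3','BIG4','BIG5')"
parallel=48

~]# nohup impdp username/password parfile=small_tables.par &
# plus one session per big table, each with its own parfile as in the earlier sketch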
THANK YOU !! Hope you enjoyed the post