IMPDP PERFORMANCE SPEED UP DATAPUMP ORACLE

“BE PERSISTENT AND PERSEVERANT IN iNFINITE LEARNING”

We all get pressure from application team for data migration of big tables and schemas which is hard time for us during business production hours.There should be some faster approach to save us in this situation 🙂 You can test the below methods and be a super hero oracle dba to your team.Let us see how!!

Case study :There are 4 big tables to be exported from prod and imported to dev.

Database name – source – testdb

destination – test1db

version – 12.1.0

Total size of the four tables = 436GB

Tables: big1,big2,big3,big4

Prod schema: test

Dev schema: test1

1) Estimated time taken for traditional import = 13 hours

2) Importing tables separately ,followed by constraints and indexes took 7 hours

3)Firing the import of four tables parallely and their objects took 6 hours

i usually estimate parallelism based on the formula ===> ” parallel = “total size of objects/total dumpfiles “ and with the total available CPU cycles

Take the export backup of full tables with the parameters with max parallel servers:

~]# vi test.par 
directory = export_dir 
dumpfile = exptest_%U.dmp 
logfile = exptest.log 
tables = test.big1,test.big2,test.big3,test.big4
filesize=10GB #this parameter will split the export dumpfiles to reduce single file I/O read and write.
parallel = 4  -------> parallel = "parallel = "total size of objects/total number of tables per filesize"
~]# expdp user/password parfile=test.par 

In this case , datapump take 1 hour for exporting the data.

Job “SYSTEM”.”SYS_EXPORT_TABLE_01″ successfully completed at Thu Dec 21 10:25:26 2018 elapsed 001:01:03

Importing the data with normal method:

~]vi test.par

PARFILE:

directory=export_dir 
dumpfile=exptest_%U.dmp 
logfile=imp_speed.log 
tables=test.big1,test.big2,test.big3,test.big4 
remap_schema=test:test1 
remap_tablespace=prod1:dev1,prod2:dev2 
parallel=4 
logtime=all 

Time taken for normal method:

nohup expdp username/password parfile=test.par

22-DEC-18 15:09:59.953: Job “SYSTEM”.”SYS_IMPORT_TABLE_02″ successfully completed at Thu Dec 22 15:09:59 2018 elapsed 0 14:11:03

1)Importing with faster approach:

PARFILE:test.par

~] vi test.par
directory=export_dir 
dumpfile=expfast_%U.dmp 
logfile=impfast.log
tables=test.big1,test.big2,test.big3,test.big4 
remap_schema=test:test1 
table_exists_action=replace
parallel=4 
logtime=all
exclude=index,constraint,statistic,grant  <=====
access_method=direct_path <=======
transform=disable_archive_logging:y <=======
cluster=y <====== if you have RAC setup,then this parameter utilizes cluster resources to speed up the import

nohup impdp username/password parfile=test.par

The above parameters which are highlighted in red are the major impact for the performance enhancement.

direct path access method insert the rows on top of the high water mark. Note that this parameter will not always be used by oracle due to certain limitations of table properties but mostly for all tables oracle automatically choose access method

disable archive logging parameter will not archive the redologs during import. If the database is in force_logging mode , then the parameter changes doesnot take any effect.If archivelogs are disabled,then there is not much impact of this parameter

Command for dumping DDL to script file:

---impdp username/password directory=default tables=test.big1,test.big2,test.big3,test.big4 sqlfile=script_ddl.sql include=constraint,index,grant

a script file is placed in your current directory named “script_ddl.sql”

Once the ddl are dumped just modify the ddl with higher DOP for index creation and enable novalidate on constraints in the script file. Edit the file in vi editor.Split the index, constraint and grant in separate files and run it.

Substitute “noparallel” or “parallel 1” with “parallel 16” on all ddl

The degree of parallel is calculated by total size of objects / number of tables(based on filesize)..eg: total filesize=450gb;filesize=50gb; then number of dumpfiles is

kish@exdbx<>select 450/50 from dual;

    450/50
----------
         9

This means that depending on the number of tables we can use parallel servers which is 4 here!In datapump,no matter how many parallel server you use,oracle will utilize parallel degree per table count.If i have 4 tables,then using 16 parallel degree would make remaining 12 worker processes idle.

Note1: Degree of parallelism is dependent on the number of cores of CPU on your server.This server has 48 cores.If your server has less cores then you can give less parallel resource

Substitute “enable” to “enable novalidate” for the ddl command

Note2: ‘Enable novalidate’ can be used in case of emergency like business hours for faster import.After constraints are imported without validation,you can ‘enable validate’ leisurely so that there will not be application user complaint.

Eg: ~] vi script_ddl.sql

---set timing on;
---create index "test"."big1_ind" on "test"."big1"("row1","row2","row3" DESC) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "dev1" parallel 16; 
---alter index "test"."big1_ind" parallel 16; 
---alter table "test"."big1" add constraint "big1_pk" primary key("row11","row12","row13") using index "test"."big1_ind" enable novalidate; (by default it oracle validate constraints which take huge amount of time)<======

use “sqlplus” to import the constraints and indexes

---nohup sqlplus username/password@script_ddl.sql

Time taken to import without constraints and indexes:

22-DEC-18 22:10:33.991: Job “SYSTEM”.”SYS_IMPORT_TABLE_02″ successfully completed at Thu Dec 22 22:10:33 2018 elapsed 0 00:48:04

Time until index and constraint validated:

Table data without metadata like constraints and indexes get import faster.

Major time spent on validating constraints and importing indexes. we reduce the time by not validating the constraints and creating index with parallelism.

index altered

elapsed time: 06:30:00 hours

For faster import,we can enable novalidate to import first and then we can validate constraint.

After the process, validate constraints by replacing “enable validate” on the script_ddl.sql file if there are performance issues.

This approach saved 9 hours from the previous method.

2)Again do the import process by firing the import parallely for four tables with four putty sessions at the same time to parallelize the operation manually.Follow the same process 1) but split the tables.

session 1

impdp tables=big1

session 2

impdp tables=big2

session 3

impdp tables=big3

session 4

impdp tables=big4

this approach reduced 30 mins from previous method.

elapsed time: 06:01:00 hours

The methods shown are tested in a different environment.The performance varies based on the configuration of the system.There is no guarantee of speed.Please test it in POC before refreshing in production or development.

If you have big schemas,then follow the same approach with high parallel servers and make sure to split the import jobs with multiple chunks of dumpfile to make parallel more effective

eg: if a schema SCOTT has 50 tables,then you can follow the same method which has been used for 4 tables.But parallel should be equal to or less than 50 (or 48 or 32 …) based the CPU power you have.

If out of 50 tables only ten tables are big and remaining 45 tables are considerably small,then start import of 5 big tables separately with 5 different putty sessions excluding index,constraints and statistics and parallely import 45 small tables by schema import excluding the 5 big tables with 48 parallel degree without excluding index,constraint and statistics so that you save time and dont fall into serialization trap!

Parallelism for tables

THANK YOU !! Hope you enjoyed the post

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s