Question: Mapping very long to execute

03 Oct 2016 17:24 #1 by letancel
letancel created the topic: Mapping very long to execute
Hi,

I have a simple mapping which copies an AS400 table to an MSSQL table in Append mode.
I use a bulk load to load the data ("Action Process LOAD Rdbms to Mssql (with BULK)").
The source table contains 3,000,000 rows and it takes around 20 minutes to load them into the target table...
Is there a way to load them faster?

Thanks
04 Oct 2016 09:59 #2 by Thomas BLETON
Thomas BLETON replied the topic: Mapping very long to execute
I would first identify which step is taking so long (unload from DB2/400? Bulk load?).
You can check the CORE_DURATION variable of each process step.

If the bulk load is the slow part, try to execute it outside of Stambia => does it make a difference?
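A quick way to time the bulk step outside of Stambia is to run the equivalent BULK INSERT yourself from SSMS or sqlcmd. This is only a sketch: the file path, target table and delimiters below are placeholders, not the ones actually generated by the template, so adapt them to your environment.

-- Hypothetical manual bulk load, to compare its duration with the Stambia step.
-- dbo.TARGET_TABLE, the file path and the delimiters are placeholders to adapt.
BULK INSERT dbo.TARGET_TABLE
FROM 'C:\temp\extract.csv'
WITH (
    FIELDTERMINATOR = ';',    -- must match the unload file's column separator
    ROWTERMINATOR = '\n',     -- line separator
    BATCHSIZE = 100000,       -- commit every 100,000 rows
    TABLOCK                   -- table-level lock, usually faster for a full append
);

If this runs in roughly the same time as the Stambia step, the bottleneck is elsewhere.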
04 Oct 2016 10:23 #3 by letancel
letancel replied the topic: Mapping very long to execute
Hi,
The step which takes time is the unload from DB2/400.
It is executed from Analytics.
04 Oct 2016 11:11 #4 by Thomas BLETON
Thomas BLETON replied the topic: Mapping very long to execute
Please be more specific: is it the "Unload of data from source into file" action that is slow? I suppose so.
You can try to tune it with the "Fetch Size" parameter. See stackoverflow.com/questions/30730383/how...for-the-select-query for an interesting discussion on this subject.

If this is not enough, try unloading data from DB2/400 using DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
=> If this is faster than the SqlFileExport "Unload of data from source into file" then you may want to design your process differently:
- First export data from DB2/400 using DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
- Get this file from the AS/400 to the SqlServer machine (the method depends on the file transfer tools available on your servers)
- Load the file to SqlServer using Bulk
05 Oct 2016 11:04 #5 by letancel
letancel replied the topic: Mapping very long to execute
The Fetch Size parameter doesn't change anything.

I would like to try the second option but I just want to make sure I understood it correctly:
In a process:
1. Operating System Command using the DB2 utilities CALL
2. Get Files to get the file from the AS400 server to the SQL Server machine
3. Load the file using bulk in a mapping
Am I correct?

Thanks
05 Oct 2016 14:59 #6 by Thomas BLETON
Thomas BLETON replied the topic: Mapping very long to execute
1. The OS Command can be used to execute commands on the Runtime's host system.
I suppose you don't have a Runtime running on the AS/400, right?
Instead, simply use a SQL Operation connecting to DB2/400 and containing CALL SYSPROC.ADMIN_CMD('....') (a sketch is given below).

2. Get Files.... with what? First check with your OS/400 admins to find out which kind of file transfer protocol is available.

3. Yes, correct
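For the SQL Operation in point 1, a minimal sketch could look like the statement below. The file path, column list, table name and delimiter are assumptions, and the availability of SYSPROC.ADMIN_CMD depends on your DB2 version and platform, so validate it against your system rather than taking it as the exact statement to use. The column delimiter chosen here must match the one declared in the bulk load on the SQL Server side.

-- Hypothetical example: export two columns to a pipe-delimited file on the DB2 host.
-- /tmp/result.csv, MYLIB.MYTABLE and the '|' delimiter are placeholders to adapt.
CALL SYSPROC.ADMIN_CMD(
  'EXPORT TO /tmp/result.csv OF DEL MODIFIED BY COLDEL| SELECT c1, c2 FROM MYLIB.MYTABLE'
)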
10 Oct 2016 09:56 - 10 Oct 2016 10:35 #7 by letancel
letancel replied the topic: Mapping very long to execute
Hi,

I tried the Export File solution and the performance is not better (it's even a bit slower):
- it takes around 10 minutes to export the table from the AS400 database
- it takes 18 minutes to import the file (csv, 2 GB) into the MSSQL target table using bulk

Is there a third solution? :dry:

By the way, I have a connection timeout which appears randomly on the mapping from the AS400 table to the MSSQL table.
I searched on the Internet but couldn't find the origin of the problem. I guess it's because of the amount of data transferred. Is it a Java problem (heap space)?
Last Edit: 10 Oct 2016 10:35 by letancel.
10 Oct 2016 11:56 #8 by Thomas BLETON
Thomas BLETON replied the topic: Mapping very long to execute
OK, so the DB2/400 & Mssql native tools are not really faster.
Is that their duration with the default configuration, or did you try various parameters (see their respective documentation)?

A third solution can be to parallelize your job.
For example, base the extraction on an indexed or partitioned key which will let you extract independent batches of data.
Let's say I extract Customer data; I can parallelize it into 6 jobs:
- Job1 extracts customers where ID between 1 and 500000
- Job2 extracts customers where ID between 500001 and 1000000
etc.
You have to define your own key and batch size (a sketch of such range queries is given below).
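As an illustration only (the CUSTOMER table, ID column and range boundaries are assumptions, not taken from your actual model), each parallel job would run its own range-restricted SELECT so the extractions never overlap:

-- Hypothetical range queries, one per parallel job; adjust table, key and ranges.
-- Job 1
SELECT c1, c2 FROM MYLIB.CUSTOMER WHERE ID BETWEEN 1 AND 500000;
-- Job 2
SELECT c1, c2 FROM MYLIB.CUSTOMER WHERE ID BETWEEN 500001 AND 1000000;
-- ... and so on up to Job 6, each feeding its own unload file and bulk load.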