Hi Guys,
I have a requirement to load all the csv files available in a folder to the target table. The file structure will be same but the name will be vary for each file. So i used variable and looping concept in ODI to achieve this requirement.
Steps followed.
1) list out the names of all the .csv files available in a directory to a temporary file.
----Command----
ls *.csv > Directory/Text_.txt
2) Separate the names into one by one.
---Command----
awk 'sub("$","\r")' Directory/Text_.txt > DirectoryALL_FILE_NAMES.txt
3)Write the file names into a Target table the structure of the target table can be.We can use an ODI interface to
CREATE TABLE ALL_FILE_NAMES
(
ROW_NUMBER INT,
FILE_NAME VARCHAR(50),
LOAD_STATUS VARCHAR(5),
LOAD_DATE DATE)
;
Row_Number: Sequence
File_Nmae: Can directly map the name from the file in ODI mapping.
Load_Status:Stamp 'N' as default.
Load_Date: Sysdate.
4) Create a variable for taking the file count. Select the count of files having LOAD_STATUS equals 'N'
5) Create an Evaluate variable
This variable can be used for looping through the package. In the evaluate condition we can give '>=1' so till the file count equals 0 the loop will be repeated.
6) Create a global variable for picking the file name dynamically.
Eg:
SELECT max(FILE_NAME) FROM ALL_FILE_NAMES
WHERE LOAD_STATUS='N'
group by FILE_NAME
ORDER BY 1
7) Create the interface for loading the data. The above created variable can be used as a file name for this interface.
Resource_Name:#GLOBAL.V_FILE_NAME
8) Create an ODI procedure for updating the file name.This will update the loaded file status to Y and this file will not be picked for next load.
UPDATE ALL_FILE_NAMES SET LOAD_STATUS='Y' WHERE FILE_NAME in (SELECT FILE_NAME FROM ALL_FILE_NAMES
WHERE LOAD_STATUS='N'
and FILE_NAME=#GLOBAL.V_FILE_NAME
9)Take the file count again. This time you will get the updated file count and if that is greater than 0 the loop will be repeated.
10) Refer the screen shot below so you will get a better understanding about the loop process.
I have a requirement to load all the csv files available in a folder to the target table. The file structure will be same but the name will be vary for each file. So i used variable and looping concept in ODI to achieve this requirement.
Steps followed.
1) list out the names of all the .csv files available in a directory to a temporary file.
----Command----
ls *.csv > Directory/Text_.txt
2) Separate the names into one by one.
---Command----
awk 'sub("$","\r")' Directory/Text_.txt > DirectoryALL_FILE_NAMES.txt
3)Write the file names into a Target table the structure of the target table can be.We can use an ODI interface to
CREATE TABLE ALL_FILE_NAMES
(
ROW_NUMBER INT,
FILE_NAME VARCHAR(50),
LOAD_STATUS VARCHAR(5),
LOAD_DATE DATE)
;
Row_Number: Sequence
File_Nmae: Can directly map the name from the file in ODI mapping.
Load_Status:Stamp 'N' as default.
Load_Date: Sysdate.
4) Create a variable for taking the file count. Select the count of files having LOAD_STATUS equals 'N'
5) Create an Evaluate variable
This variable can be used for looping through the package. In the evaluate condition we can give '>=1' so till the file count equals 0 the loop will be repeated.
6) Create a global variable for picking the file name dynamically.
Eg:
SELECT max(FILE_NAME) FROM ALL_FILE_NAMES
WHERE LOAD_STATUS='N'
group by FILE_NAME
ORDER BY 1
7) Create the interface for loading the data. The above created variable can be used as a file name for this interface.
Resource_Name:#GLOBAL.V_FILE_NAME
8) Create an ODI procedure for updating the file name.This will update the loaded file status to Y and this file will not be picked for next load.
UPDATE ALL_FILE_NAMES SET LOAD_STATUS='Y' WHERE FILE_NAME in (SELECT FILE_NAME FROM ALL_FILE_NAMES
WHERE LOAD_STATUS='N'
and FILE_NAME=#GLOBAL.V_FILE_NAME
9)Take the file count again. This time you will get the updated file count and if that is greater than 0 the loop will be repeated.
10) Refer the screen shot below so you will get a better understanding about the loop process.
No comments:
Post a Comment