Tuesday, October 28, 2014

Importing Flat Files with Newline Characters in Informatica


We were given multiple comma-delimited CSV files in which several fields used a double-quote text qualifier.
Normally this would not be an issue; however, some of the quoted fields contained newline characters. Below is an example of the issue:

"C,2010-05-25 18:47:36,3:9,0,db_id:11111
U,2010-05-25 18:47:53,3:9,0,db_id:11111,date_approved,0000-00-00,
date_submitted,0000-00-00
U,2010-05-26 20:37:17,3:9,0,db_id:11111,date_submitted,0000-00-00,
approval_status,'O',date_approved,0000-00-00"

or
U,2010-05-25 18:47:53,3:9,0,db_id:11111,date_approved,0000-00-00


-00",test
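To make the failure mode concrete, here is a minimal Python sketch (outside Informatica) that compares a line-by-line reading of a file with a quote-aware parse. The sample data and the function names `naive_row_count` and `quoted_aware_rows` are hypothetical and only illustrate the idea; they are not taken from the original files.

```python
import csv
import io

# Hypothetical sample: a header plus two records; the first record's
# quoted third field contains a line break.
SAMPLE = 'id,name,notes\n1,alpha,"first line\nsecond line"\n2,beta,"single line"\n'


def naive_row_count(text):
    """Count rows by splitting on line breaks, as a line-oriented reader would."""
    return len(text.splitlines())


def quoted_aware_rows(text):
    """Parse with csv.reader, which keeps a quoted field together across line breaks."""
    # newline="" stops StringIO from translating line endings before csv sees them.
    return list(csv.reader(io.StringIO(text, newline=""), quotechar='"'))


if __name__ == "__main__":
    print("physical lines:", naive_row_count(SAMPLE))
    rows = quoted_aware_rows(SAMPLE)
    print("logical rows:", len(rows))
    print("notes field of first record:", repr(rows[1][2]))
```

The line-oriented count sees one row more than the quote-aware parse, which is the same mismatch that breaks the import when the reader stops matching quotes at the end of a line.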

NOTE: I found two solutions to this issue, one of which is impractical.
1) The first approach is to open the CSV files in Notepad and manually remove
the newline characters. This is time-consuming and only workable for one or two small files.

2) A better solution is to add a new entry to the Custom Properties field in the session configuration.

Go to the session configuration -> Custom Properties and add the attribute

'MatchQuotesPastEndOfLine' with the value set to 'Yes'. As the name suggests, this lets the reader keep matching a quoted field past the end of the line.

or, entered as a single name=value string:
MatchQuotesPastEndOfLine=Yes;
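If changing the session is not an option, the manual cleanup from the first approach can be scripted instead of done by hand. The following is a minimal sketch in Python, run as a pre-processing step outside Informatica; the function name `flatten_embedded_newlines` and the choice of a single space as the replacement are assumptions made for illustration.

```python
import csv
import io


def flatten_embedded_newlines(text, replacement=" "):
    """Return CSV text with line breaks inside fields replaced by `replacement`."""
    # newline="" keeps embedded line breaks intact so csv.reader can see them.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        cleaned = [
            # Handle Windows, Unix, and old Mac line endings inside a field.
            field.replace("\r\n", replacement)
                 .replace("\n", replacement)
                 .replace("\r", replacement)
            for field in row
        ]
        writer.writerow(cleaned)
    return out.getvalue()


if __name__ == "__main__":
    sample = '1,"a\nb",x\n2,c,y\n'
    print(flatten_embedded_newlines(sample))
```

For real files, the same function can be applied to text read with `open(path, newline="")`. Note that this changes the data (the line breaks are lost), whereas the custom property keeps them, so it is a fallback rather than a replacement for the session setting.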


2 comments:

  1. Hi Dinesh, we are dealing with flat files that contain user-entered comments, and we are receiving double quotes and newline characters inside the data. Does Informatica support anything other than quotes in such custom properties? Kindly let me know.

  2. We were thinking of using something like a tilde character as the qualifier, since users will not type it in the comment section. Let me know if Informatica supports something like that.

