Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
533 views
in Technique[技术] by (71.8m points)

regex - Line breaking issue to move csv file in Linux

[I have moved the csv file into Linux system with binary mode. File content of one field is spitted into multiple lines its comment sections,I need to remove the new line , keep the same format, Please help on shell command or perl command

here is the example for three records, Actual look like] Original content of the file

[After moved into linux, comments field is splitted into 4 lines , i want to keep the comment field in the same format but dont want the new line characters

"First line

Second line

Third line all lines format should not change" ]2

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As I said in my comment above, the specs are not clear but I suspect this is what you are trying to do. Here's a way to load data into Oracle using sqlldr where a field is surrounded by double-quotes and contains linefeeds where the end of the record is a combination carriage return/linefeed. This can happen when the data comes from an Excel spreadsheet saved as a .csv for example, where the cell contains the linefeeds.

Here's the data file as exported by Excel as a .csv and viewed in gvim, with the option turned on to show control characters. You can see the linefeeds as the '$' character and the carriage returns as the '^M' character:

100,test1,"1line1$
1line2$
1line3"^M$
200,test2,"2line1$
2line2$
2line3"^M$

Construct the control file like this using the "str" clause on the infile option line to set the end of record character. It tells sqlldr that hex 0D (carriage return, or ^M) is the record separator (this way it will ignore the linefeeds inside the double-quotes):

LOAD DATA
infile "test.dat" "str x'0D'" 
TRUNCATE
INTO TABLE test
replace
fields terminated by ","  
optionally enclosed by '"'
(
cola char,
colb char,
colc char
)

After loading, the data looks like this with linefeeds in the comment field (I called it colc) preserved:

SQL> select *
  2  from test;

COLA                 COLB                 COLC
-------------------- -------------------- --------------------
100                  test1                1line1
                                          1line2
                                          1line3

200                  test2                2line1
                                          2line2
                                          2line3

SQL>

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...