Languages :: Csharp :: .Net :: Read CSV file to array |
|||
| By: Bernard |
Date: 21/11/2005 08:50:10 |
Points: 20 | Status: Answered Quality : Excellent |
|
Hello, I know how to read a file. But what I don't know is how to split the contents? As normal CSV format, the fields are separated by commas, but the actual data also contains commas. What should I do? I need also to store the contents into a matrix (2D array). Thanks |
|||
| By: VGR | Date: 21/11/2005 09:06:05 | Type : Answer |
|
| Hi ;-) Basically your logic is classical : - read each line of the file - this is your first dimension of the final array - split each line on the delimiter - this is your second dimension of the array So as you said you knew how to read a file, only the second part is problematic. You probably understand that the solution has to do with Split() - like methods You could use regEx.Matches() but this is unefficient Classically, .Net programmers would use a custom CSV parser for this kind of line format : $100,"$2,350",a,15/04/1998 Example : Regex r = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))"); string s = "\"a\",b,\"c, d, e\",,f"; string[] sAry = r.Split(s); for(int i=0;i < sAry.Length;i++) { Console.WriteLine(sAry); } See : link 1 link 2 link 3 Others propose classical VB6 code, but it fails miserably on your data as it contains commas in the litterals. I have seen once someone[alex_developer] proposing to use this kind of ODBC code to solve elegantly the issue : string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\"; OdbcConnection connection = new OdbcConnection(ConnectionString); connection.Open(); OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection); OdbcDataReader reader = command.ExecuteReader(); while (reader.Read()) { // loop through every field for (int x = 0; x < reader.FieldCount; x++) { // add code for puting fields into array.... Console.WriteLine(reader.GetDataTypeName(x) + " - " + reader.GetValue(x)); } } // Close the connection connection.Close(); and I think this is right direction to go if you want ease of use, performance and reliability. My own advice is even better if you have access to a native MySql driver. This is quite common nowadays. Just read the CSV file in MySql with something like : LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n'; And your problem is gone ;-) Afterwards, just read the contents of the table into your 2D-array and it's done. This way you also will have a logic (their names) for retrieval of the columns' values in stead of having to handle numerical nonsense indices in the second dimension. This also enables you to sort, arrange, sub-select from the data read in stead of doing it in your .Net script. It will also be faster. You may even perform fast "LIKE 'XXX%'" operations on your textual column containing commas, something you would not be able to do as easily in "standard" .Net code Best regards, VGR |
|||
|
Do register to be able to answer |
|||
©2010 These pages are served without commercial sponsorship. (No popup ads, etc...). Bandwidth abuse increases hosting cost forcing sponsorship or shutdown. This server aggressively defends against automated copying for any reason including offline viewing, duplication, etc... Please respect this requirement and DO NOT RIP THIS SITE.
Please DO link to this page!








