Avoid querying the position in the text stream using Qt's pos() function
to update the progress dialog. Instead keep track of the stream position
manually. This is possible here because we don't ever seek in the file.
As a result, this speeds up the CSV import dramatically.
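The gist of it, as a sketch only (a QTextStream-based reader with an
illustrative chunk size and helper name; not the actual import code):

    // Sketch: keep our own position counter instead of calling
    // QTextStream::pos(), which can be slow because the stream may have
    // to flush and re-scan its internal buffers to answer it.
    #include <QFile>
    #include <QTextStream>

    void importWithProgress(const QString& fileName)
    {
        QFile file(fileName);
        if(!file.open(QIODevice::ReadOnly | QIODevice::Text))
            return;

        QTextStream stream(&file);
        qint64 approxPosition = 0;          // our own counter

        while(!stream.atEnd())
        {
            const QString chunk = stream.read(4096);
            approxPosition += chunk.size(); // characters read so far;
                                            // good enough for a progress bar

            // ... parse the chunk and feed approxPosition to the
            // progress dialog ...
        }
    }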
This commit bundles a number of smaller optimisations in the CSV parser
and import code. They do add up to a noticeable speed gain though (at
least on some systems and configurations).
We were separating the CSV import into two steps: parsing the CSV file
and inserting the parsed data. This had the advantages of keeping the
parsing code and the database code nicely separated and of giving us
full knowledge of the CSV file before we start inserting the data into
the database. However, it made it necessary to keep the entire parser
result in RAM. For large CSV files this uses enormous amounts of
memory.
This commit changes the import to parse the first 20 lines and analyse
them. This should give us a good impression of what to expect from the
rest of the file. Based on that information we then parse the file row
by row and insert each row into the database as soon as it is parsed.
This means we only have to keep one row at a time in memory while
more or less retaining the ability to analyse the file before
inserting data.
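Roughly, the flow now looks like the following sketch; parseRow() and
insertRow() are hypothetical stand-ins for the real parser and
database code, and the row limit of 20 matches the analysis window
described above:

    #include <QStringList>
    #include <QTextStream>
    #include <QVector>
    #include <functional>

    void importCsv(QTextStream& stream,
                   const std::function<QStringList(QTextStream&)>& parseRow,  // parses one row
                   const std::function<void(const QStringList&)>& insertRow)  // inserts one row
    {
        // Parse and buffer only the first 20 rows to analyse the file
        // (column count, field names, ...).
        QVector<QStringList> preview;
        while(preview.size() < 20 && !stream.atEnd())
            preview.append(parseRow(stream));

        // ... analyse 'preview' and prepare the INSERT statement here ...

        // Insert the buffered rows, then continue row by row so only
        // one row at a time has to be kept in memory.
        for(const QStringList& row : preview)
            insertRow(row);
        while(!stream.atEnd())
            insertRow(parseRow(stream));
    }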
On my system this does seem to change the runtime for small files,
which now take a little longer (<5%), though these measurements aren't
conclusive. For large files, however, it changes memory consumption
from using all memory and starting to swap within seconds to almost no
memory consumption at all. And not having to swap speeds things up a
lot.
When parsing a CSV file we used to check the column count for each row
and track the highest number of columns that we found. This information
could then be used to create an INSERT statement large enough for all
the data.
This commit removes that column number tracking code. Instead it
analyses only the first 20 rows, and it does that while generating the
field list.
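Something along these lines, where the preview rows are the (up to) 20
rows mentioned above and the generic field names are only a
placeholder for what the real code does:

    #include <QStringList>
    #include <QVector>

    QStringList generateFieldList(const QVector<QStringList>& previewRows)
    {
        // The widest of the preview rows determines the column count.
        int columns = 0;
        for(const QStringList& row : previewRows)
            if(row.size() > columns)
                columns = static_cast<int>(row.size());

        // Build generic field names; the real code would prefer names
        // taken from the header row where one is available.
        QStringList fields;
        for(int i = 0; i < columns; ++i)
            fields.append(QString("field%1").arg(i + 1));
        return fields;
    }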
Performance-wise this should take a (very) little longer, but it makes
it easier to improve the performance in other ways later, which should
more than compensate for this commit.
Feature-wise this should fix some (technically invalid) corner-case CSV
files with fewer fields in the title row than in the other rows. It
should also break some other (technically invalid) corner-case CSV files
if they are imported into an existing table and have fewer columns
than the existing table in their first 20 rows but exactly the same
number later on. Neither case, I think, matters too much.
We're reading CSV files not all at once but in chunks. And when we
encounter a \r char we check whether it is followed by a \n char. So
far so good. But it might happen that we hit a \r char right at the
end of the current buffer. In that case the lookahead check doesn't
work as expected because there is no more data available yet.
This commit fixes the issue by checking for these conditions and loading
an extra byte when needed.
See issue #1033.
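A hypothetical sketch of that check, assuming a QString buffer and a
readMoreData() helper that appends the next chunk to it (both names
are made up for illustration):

    #include <QString>
    #include <functional>

    // Returns true if the '\r' at position 'pos' is followed by '\n',
    // loading more data first if '\r' is the buffer's last character.
    bool isCrLf(QString& buffer, int pos,
                const std::function<bool(QString&)>& readMoreData) // false at EOF
    {
        if(pos == buffer.size() - 1)
            readMoreData(buffer);   // the lookahead would fail otherwise

        return pos + 1 < buffer.size() &&
               buffer.at(pos + 1) == QLatin1Char('\n');
    }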