Don't track column count when parsing CSV files

When parsing a CSV file we used to check the column count of each row
and track the highest number of columns found. This information could
then be used to create an INSERT statement large enough for all the
data.

This commit removes the column count tracking code. Instead, only the
first 20 rows are analysed, and this happens while generating the field
list.
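
As a rough illustration of the idea only (not the code changed by this
commit), sampling the first rows could look like the sketch below. It uses
plain standard-library types instead of the Qt types in the parser, and
the names Row and maxColumnsInSample are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed CSV record.
using Row = std::vector<std::string>;

// Hypothetical helper: returns the largest field count found in the first
// `sampleRows` rows, instead of tracking the maximum for every row while
// parsing. Rows after the sample are never inspected.
std::size_t maxColumnsInSample(const std::vector<Row>& rows,
                               std::size_t sampleRows = 20)
{
    std::size_t columns = 0;
    const std::size_t limit = std::min(rows.size(), sampleRows);
    for (std::size_t i = 0; i < limit; ++i)
        columns = std::max(columns, rows[i].size());
    return columns;
}
```

With such a helper, a wider row that first appears after the sampled rows
does not increase the result, which is the trade-off described further
down.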

Performance-wise this should take (very) slightly longer, but it makes
it easier to improve performance in other ways later, which should more
than compensate for the cost of this commit.

Feature-wise this should fix some (technically invalid) corner-case CSV
files with fewer fields in the title row than in the other rows. It
should also break some other (technically invalid) corner-case CSV
files: those imported into an existing table that have fewer columns
than the existing table in their first 20 rows but exactly the same
number later on. I don't think either case matters much.
Martin Kleusberg
2017-09-10 11:07:02 +02:00
parent 67adb99665
commit b7a00d301a
5 changed files with 52 additions and 44 deletions

@@ -8,7 +8,6 @@ CSVParser::CSVParser(bool trimfields, const QChar& fieldseparator, const QChar&
     , m_cFieldSeparator(fieldseparator)
     , m_cQuoteChar(quotechar)
     , m_pCSVProgress(0)
-    , m_nColumns(0)
     , m_nBufferSize(4096)
 {
 }
@@ -32,7 +31,6 @@ inline void addColumn(QStringList& r, QString& field, bool trim)
 bool CSVParser::parse(QTextStream& stream, qint64 nMaxRecords)
 {
     m_vCSVData.clear();
-    m_nColumns = 0;
     ParseStates state = StateNormal;
     QString fieldbuf;
     QStringList record;