Add a new dropdown box to the Import CSV dialog to set an ON CONFLICT
strategy when importing into an existing table. You can now choose
between the old (and still default) behaviour of aborting the import on
a conflict, ignoring the conflicting row from the CSV file, or replacing
the existing row in the table.
See issue #1585.
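For reference, a minimal sketch of how the three strategies map onto
SQLite's INSERT variants; the enum and function names are made up for
illustration and are not the actual dialog code:

    #include <string>

    enum class ConflictStrategy { Abort, Ignore, Replace };

    std::string insertPrefix(ConflictStrategy strategy)
    {
        switch (strategy) {
        case ConflictStrategy::Ignore:  return "INSERT OR IGNORE INTO ";   // skip conflicting CSV rows
        case ConflictStrategy::Replace: return "INSERT OR REPLACE INTO ";  // overwrite the existing row
        case ConflictStrategy::Abort:                                      // default: abort the import
        default:                        return "INSERT INTO ";
        }
    }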
Make sure to show the correct error message when there is an error
during CSV import.
Make sure to release the DB handle used for the import before rolling
back to the last savepoint in case of an error in the CSV import. This
avoids a deadlock situation.
See issue #1590.
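A minimal sketch of the intended order of operations, assuming a plain
sqlite3 handle and a made-up savepoint name (this is not the actual DB4S
code):

    #include <sqlite3.h>

    void abortCsvImport(sqlite3* db, sqlite3_stmt* importStmt)
    {
        // First release everything the import is still holding on to
        // (prepared statement, extra handle, ...), so that no lock from
        // the import can block the rollback below.
        sqlite3_finalize(importStmt);

        // Only then undo the partial import.
        sqlite3_exec(db, "ROLLBACK TO SAVEPOINT csvimport;", nullptr, nullptr, nullptr);
        sqlite3_exec(db, "RELEASE SAVEPOINT csvimport;", nullptr, nullptr, nullptr);
    }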
This commit refactors vast parts of the sqlitetypes.h interface. Its
main goals are: less code, simpler code, a more modern interface, a
reduced likelihood of strange errors, and more flexibility for future
extensions.
The main reason why the sqlitetypes.h functions were working so well in
DB4S was not that they were particularly stable but that they were
tightly interlinked with the rest of the code. This is fine because we
do not plan to ship them as a separate library. But it makes it hard to
find the obvious spot to fix an issue or to add a new function: a change
can always go either into the sqlitetypes code or into the rest of the
DB4S code, because it is just not clear what the interface between the
two should look like. This commit is meant to improve that. One main
change here is making the ownership of objects a bit clearer.
In theory the new code should be faster too, but that difference will be
negligible from a user's point of view.
This commit also fixes a hidden bug which caused all table constraints
to be removed in the Edit Table dialog when a single field was removed
from the table.
This is all still work in progress and more work needs to be done here.
Make strings translatable, remove some more debug code, fix tests,
reduce the size of the patch slightly, remove a weird tooltip, don't
crash when closing a database, simplify code, fix filters, and don't
link against pthread on Windows.
Because there are some circumstances under which the automatic type
detection can cause problems with the imported data and because it is
not accurate when the data changes a lot after the first couple of rows,
we need an option to disable it.
See issue #1382.
This changes the default behaviour of the CSV import to follow a set of
rules which will hopefully make most people happy.
It also adds an "Advanced" section to the settings part of the dialog to
modify this new default behaviour.
See issue #1395.
* Add a combo box for missing values in the import UI
* Add UI pieces to the .cpp file
* Insert NULL values if requested (see the sketch below)
* Allow inserting NULLs when creating a new table, too
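A small illustrative sketch, assuming a plain sqlite3 prepared statement
and a hypothetical emptyAsNull option, of what the new setting changes:

    #include <sqlite3.h>
    #include <string>

    void bindCsvField(sqlite3_stmt* stmt, int column, const std::string& value, bool emptyAsNull)
    {
        if (value.empty() && emptyAsNull)
            sqlite3_bind_null(stmt, column);                       // store SQL NULL
        else
            sqlite3_bind_text(stmt, column, value.c_str(),
                              static_cast<int>(value.size()), SQLITE_TRANSIENT);
    }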
QProgressDialog only takes an int for both its maximum value and its
current value, so using the number of bytes parsed so far isn't going to
work for very large files, where an int would overflow. This commit
changes this by precalculating a progress indicator and handing that
over to the QProgressDialog object.
See issue #1212.
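A small sketch, not the actual implementation, of one way to pre-scale a
64-bit byte count so it fits into the int range that
QProgressDialog::setMaximum() and setValue() accept:

    #include <algorithm>
    #include <cstdint>
    #include <limits>

    struct ProgressScale {
        int64_t bytesPerStep;   // how many bytes one progress step represents
        int maximum;            // value to pass to QProgressDialog::setMaximum()
    };

    ProgressScale makeProgressScale(int64_t fileSize)
    {
        const int64_t intMax = std::numeric_limits<int>::max();
        const int64_t step = std::max<int64_t>(1, fileSize / intMax + 1);
        return { step, static_cast<int>(fileSize / step) };
    }

    // Value to pass to QProgressDialog::setValue() for a given byte offset.
    int progressValue(const ProgressScale& scale, int64_t bytesParsed)
    {
        return static_cast<int>(bytesParsed / scale.bytesPerStep);
    }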
Don't use the QSettings class directly. This keeps the code more
consistent and makes it a bit easier to read. It also means that all
parts of the code benefit from the settings cache that we have
implemented in the Settings class.
When importing multiple CSV files at once, remove each entry from the
list of CSV files as its import completes. This way people can see the
list shrink on screen.
Also don't close the window if there are still files left to be
imported. This allows the user to import the remaining, unchecked files
as well, possibly using different settings.
See issue #1072.
When importing a CSV file into a table that doesn't exist yet (i.e. that
is created during the import), try to guess the data type of each column
based on the first couple of rows. If it is all floats or mixed floats
and integers, set the data type to REAL; if it is all integers, set the
data type to INTEGER; if it is anything else, set the data type to TEXT.
See issue #1003.
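A rough sketch of the heuristic described above (not the exact DB4S
code, and empty values are simply treated as text here):

    #include <cstdlib>
    #include <string>
    #include <vector>

    std::string guessColumnType(const std::vector<std::string>& samples)
    {
        bool allInteger = true, allNumeric = true;
        for (const std::string& s : samples) {
            if (s.empty()) { allInteger = allNumeric = false; break; }
            char* end = nullptr;
            std::strtoll(s.c_str(), &end, 10);
            if (*end != '\0') allInteger = false;   // not a plain integer
            std::strtod(s.c_str(), &end);
            if (*end != '\0') allNumeric = false;   // not a number at all
        }
        if (allNumeric)
            return allInteger ? "INTEGER" : "REAL"; // all ints vs. floats/mixed
        return "TEXT";
    }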
This commit bundles a number of smaller optimisations in the CSV parser
and import code. They do add up to a noticeable speed gain though (at
least on some systems and configurations).
We were separating the CSV import into two steps: parsing the CSV file
and inserting the parsed data. This had the advantage of keeping the
parsing code and the database code nicely separated and of giving us
full knowledge of the CSV file before we start inserting the data into
the database. However, it made it necessary to keep the entire parser
result in RAM. For large CSV files this uses enormous amounts of
memory.
This commit changes the import to parse the first 20 lines and analyse
them. This should give us a good impression of what to expect from the
rest of the file. Based on that information we then parse the file row
by row and insert each row into the database as soon as it is parsed.
This means we only have to keep one row at a time in memory while more
or less retaining the ability to analyse the file before inserting
data.
On my system this does seem to change the runtime for small files, which
take a little longer now (<5%), though these measurements aren't
conclusive. For large files, however, it changes memory consumption from
using all available memory and starting to swap within seconds to almost
no memory consumption at all. And not having to swap speeds things up a
lot.
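A high-level sketch of this two-phase flow; analyseSample() and
parseAndInsert() are hypothetical helpers standing in for the real
parser and database code:

    #include <istream>
    #include <string>
    #include <vector>

    void analyseSample(const std::vector<std::string>& sample)
    {
        // ... guess column count and types from the sample (hypothetical) ...
    }

    void parseAndInsert(const std::string& record)
    {
        // ... split the record and run the prepared INSERT (hypothetical) ...
    }

    void importCsvStreaming(std::istream& csv)
    {
        // Phase 1: look at the first 20 lines only.
        std::vector<std::string> sample;
        std::string line;
        while (sample.size() < 20 && std::getline(csv, line))
            sample.push_back(line);
        analyseSample(sample);

        // Phase 2: insert the sampled rows, then continue one record at a
        // time, so only a single row is held in memory.
        for (const std::string& record : sample)
            parseAndInsert(record);
        while (std::getline(csv, line))
            parseAndInsert(line);
    }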
When parsing a CSV file we used to check the column count for each row
and track the highest number of columns that we found. This information
then could be used to create an INSERT statement large enough for all
the data.
This column number tracking code is removed by this commit. Instead it
analyses the first 20 rows only. It does that while generating the field
list.
Performance-wise this should take a (very) little longer, but it makes
it easier to improve the performance in other ways later, which should
more than compensate for this change.
Feature-wise this should fix some (technically invalid) corner-case CSV
files with fewer fields in the title row than in the other rows. It
should also break some other (technically invalid) corner-case CSV files
if they are imported into an existing table and have fewer columns than
the existing table in their first 20 rows but exactly the same number
later on. Neither case, I think, matters too much.
Don't build a separate SQL statement per row to insert during CSV import
but use a single prepared statement which can be reused for each row.
This should speed up the CSV import noticeably.
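A minimal sketch of this reuse with the plain SQLite C API; the table
name and fixed column count are made up for illustration:

    #include <sqlite3.h>
    #include <string>
    #include <vector>

    void insertRows(sqlite3* db, const std::vector<std::vector<std::string>>& rows)
    {
        sqlite3_stmt* stmt = nullptr;
        sqlite3_prepare_v2(db, "INSERT INTO csv_import VALUES (?1, ?2, ?3);", -1, &stmt, nullptr);

        for (const auto& row : rows) {
            for (size_t i = 0; i < row.size(); ++i)
                sqlite3_bind_text(stmt, static_cast<int>(i) + 1,
                                  row[i].c_str(), -1, SQLITE_TRANSIENT);
            sqlite3_step(stmt);           // run the INSERT for this row
            sqlite3_reset(stmt);          // make the statement reusable
            sqlite3_clear_bindings(stmt);
        }
        sqlite3_finalize(stmt);
    }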
This adds initial basic support for handling different database schemata
at once to the backend code. This is still far from working properly but
shouldn't break much either - mostly because it's not really used yet in
the user interface code.
When importing a CSV file into an existing table (i.e. a table where we
have a table schema), check the data type of a field before inserting
empty values. If it is an integer field, don't insert an empty string
like we did before but 0 or NULL, depending on the NOT NULL flag.
See issue #195.
Please note that this isn't perfect. The preview in the dialog doesn't
reflect these changes yet; it just shows you the contents of the file as
is. It's a little tricky to change this and I somehow think it's better
the way it is now anyway. Also, the import doesn't check for other
constraints like UNIQUE or CHECK which might cause trouble. But then
again it didn't do that before either.
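An illustrative sketch of the decision, assuming a plain sqlite3
statement and simplified type/constraint information (not the actual
import code):

    #include <sqlite3.h>
    #include <string>

    void bindEmptyValue(sqlite3_stmt* stmt, int column,
                        const std::string& declaredType, bool notNull)
    {
        if (declaredType == "INTEGER") {
            if (notNull)
                sqlite3_bind_int(stmt, column, 0);   // NOT NULL field: fall back to 0
            else
                sqlite3_bind_null(stmt, column);     // nullable field: store NULL
        } else {
            // Keep the old behaviour for non-integer fields.
            sqlite3_bind_text(stmt, column, "", 0, SQLITE_STATIC);
        }
    }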
When importing a CSV file and using the first row as the field names,
the row would be imported as the first data row again. It's now skipped
when the checkbox is set.
When importing a single CSV file, the checkbox asking whether to import
into a single table or separate tables is hidden. However, the last used
values are loaded anyway when the dialog is opened. This means the
checkbox could be set even though it's invisible. If it is set, however,
and we're importing a single CSV file, it would be impossible to
manually set the table name to import into. This is fixed, too.
Also this simplifies the code a bit and removes a large loop from the
import dialog code.
- Tweak input checker
- Preserve the old file import so as not to cause any unforeseen breakage
- Allow ignoring file name when importing multiple files to tables
- Mass toggle several files for import
This finally gets rid of the DBBrowserObject class entirely and moves
all its functionality to the newer classes in the sqlb namespace.
I'm still not entirely happy with this but at least things should be a
little more consistent now.
This changes the class structure in the sqlb namespace as well as the
DBBrowserObject class. The rest of the commit consists of changes
required by the modifications in sqlb and DBBrowserObject.
The idea behind this refactoring is this: we currently have the
DBBrowserObject class which holds some basic information about the
database object (name, type, SQL string, etc.). It also contains a
sqlb::Table and a sqlb::Index object. Those are used if the type of
the object is table or index and they contain a whole lot more
information on the object than the DBBrowserObject class, including the
name, the type, the SQL string, etc.
So we have a duplication here. There are two class structures for
storing the same information. This has historic reasons but other than
that there is no point in keeping it this way. With this commit I start
the work of consolidating the sqlb classes in order to get rid of the
DBBrowserObject class entirely.
This commit only starts this task, it doesn't finish it. This is why it
is a little messy here and there, but then again the old structure was a
little messy, too. We will need at least a very basic trigger and view
parser before finishing this is even possible. When this is done, I hope
the code will be much easier to read and understand. But even in the
current state there already is some progress: we save a little bit of
memory, don't copy big objects all the time anymore, and replace a lot
of unnecessary string comparisons with integer comparisons.
When generating SQL statements properly escape all identifiers, even
those containing backticks which apparently are allowed inside
identifiers in SQLite.
See issue #387.
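A small sketch of one safe way to quote SQLite identifiers (not
necessarily the exact escaping DB4S uses): wrap the name in double
quotes and double any embedded quote character, so a backtick needs no
special treatment at all:

    #include <string>

    std::string escapeIdentifier(const std::string& name)
    {
        std::string out = "\"";
        for (char c : name) {
            out += c;
            if (c == '"')
                out += '"';   // double embedded quote characters
        }
        out += '"';
        return out;
    }

    // escapeIdentifier("weird`column") yields "weird`column" wrapped in double quotes.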
Add some basic initial support for SQLCipher. Note that this is more of
a POC than a final implementation.
This commit adds an option called 'sqlcipher' to the cmake and qmake
projects which - when enabled - replaces the default SQLite3 include and
library files by their SQLCipher counterparts. Especially on Mac OS X
there might be some more work required in finding the correct include
paths. The SQLCipher library supports unencrypted databases, too, so
even if the option is enabled the program behaves like before. You can
see the difference, though, in the About Dialog where the SQLite version
string will say 'SQLCipher version xy'.
When the sqlcipher option is enabled and you try to open a file which is
neither a project file nor a normal SQLite3 database, it is now assumed
that the file is an encrypted database. There is no way to tell an
invalid file from an encrypted file, so in both cases a password dialog
pops up. When the correct password and page size are entered, the file
is opened and can be edited like any other database before.
Creating encrypted databases isn't supported yet. So for testing you
need to fall back to the sqlcipher command line tool.
See issue #12.
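A rough sketch of how a SQLCipher build is typically keyed before
reading the file; the page size value is just an example and this is not
the DB4S implementation:

    #include <sqlite3.h>   // the SQLCipher build of the header provides sqlite3_key()
    #include <string>

    bool openEncrypted(const std::string& path, const std::string& password, sqlite3** db)
    {
        if (sqlite3_open(path.c_str(), db) != SQLITE_OK)
            return false;

        // Supply the passphrase and (if needed) the page size before any
        // other access to the database.
        sqlite3_key(*db, password.c_str(), static_cast<int>(password.size()));
        sqlite3_exec(*db, "PRAGMA cipher_page_size = 1024;", nullptr, nullptr, nullptr);

        // Only now can we tell whether the password (or the file) was valid.
        return sqlite3_exec(*db, "SELECT count(*) FROM sqlite_master;",
                            nullptr, nullptr, nullptr) == SQLITE_OK;
    }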