This is a long overdue continuation of some previous refactoring effort.
Before this we used to store the columns a table constraint belongs to
within the constraint object itself. So for example, a foreign key
constraint object would store the referencing as well as the referenced
column names. While initially simple, this approach has the downside of
duplicating certain data, thus breaking ownership and complicating
matters later on. This becomes obvious when renaming the referencing
column. The column name clearly is a feature of the table but in the
previous approach it also needs to be changed in the foreign key object
as well as in any other constraint for this field even though the
constraint itself has not been touched. This illustrates how a
constraint is not only a property of a table but the field names (a
property of the table) are also a property of the constraint, creating a
circular ownership. This makes the code hard to maintain. It also
invalidates references to constraints in the program needlessly, e.g.
when only changing a column name.
With this commit the column names are removed from the constraint types.
Instead they are now solely a property of the table. This, however,
raised another issue. For unique constraints and primary keys it is
possible to use expressions and/or sorted keys whereas for foreign keys
this is not possible. Additionally check constraints have no columns at
all. So when not storing the used columns inside the constraint objects
we need to have different storage types for each of them. So in a second
step this commit moves the code from a single data structure for storing
all table constraints to three data structures, one for PK and unique,
one for foreign keys, and one for check constraints.
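The split into three containers could look roughly like the sketch below. All type and member names are hypothetical and do not claim to match the real sqlitetypes.h interface; referenced columns of foreign keys are left out for brevity. The point is only that the columns act as keys owned by the table, so renaming a column re-keys the entries while the constraint objects themselves stay untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical: a column as used by PRIMARY KEY / UNIQUE, i.e. a name or an
// expression with an optional sort order. Foreign keys only use plain names.
struct IndexedColumn {
    std::string nameOrExpression;
    bool isExpression;
    std::string order; // "", "ASC" or "DESC"
};

bool operator<(const IndexedColumn& a, const IndexedColumn& b) {
    return std::tie(a.nameOrExpression, a.isExpression, a.order)
         < std::tie(b.nameOrExpression, b.isExpression, b.order);
}

// Hypothetical constraint types: none of them stores the columns it applies to.
struct UniqueConstraint { std::string name; bool isPrimaryKey = false; };
struct ForeignKey       { std::string name; std::string referencedTable; };
struct CheckConstraint  { std::string name; std::string expression; };

struct Table {
    // One container per kind of constraint; the columns are the keys.
    std::map<std::vector<IndexedColumn>, std::shared_ptr<UniqueConstraint>> uniqueConstraints;
    std::map<std::vector<std::string>, std::shared_ptr<ForeignKey>> foreignKeys;
    std::vector<std::shared_ptr<CheckConstraint>> checkConstraints; // no columns at all

    // Renaming a column only changes the keys; the constraint objects are reused.
    void renameColumn(const std::string& from, const std::string& to) {
        assert(!to.empty());

        std::map<std::vector<std::string>, std::shared_ptr<ForeignKey>> renamedFks;
        for (const auto& entry : foreignKeys) {
            std::vector<std::string> columns = entry.first;
            std::replace(columns.begin(), columns.end(), from, to);
            renamedFks.emplace(std::move(columns), entry.second);
        }
        foreignKeys.swap(renamedFks);

        std::map<std::vector<IndexedColumn>, std::shared_ptr<UniqueConstraint>> renamedUnique;
        for (const auto& entry : uniqueConstraints) {
            std::vector<IndexedColumn> columns = entry.first;
            for (IndexedColumn& c : columns)
                if (!c.isExpression && c.nameOrExpression == from)
                    c.nameOrExpression = to;
            renamedUnique.emplace(std::move(columns), entry.second);
        }
        uniqueConstraints.swap(renamedUnique);
    }
};
```

In this sketch a pointer to a constraint obtained before the rename still refers to the same object afterwards.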
By doing all this, this commit also changes the interface for handling
constraints quite a bit. The new interface tends to use more explicit
types, which makes the calling code easier to read.
Please note that this is still far from finished. But future development
on this should be a lot easier now.
This moves the code to remove comments from SQL statements from the
SqliteTableModel class to Data.cpp making it a free function. This
removes some dependencies from the SqliteTableModel class with all its
dependencies.
Also simplify the CMakeLists.txt file for the tests by removing all the
dependencies which are not really required.
This adds the moc header files to the dependencies of the executable in
the cmake project so they show up in their proper location in QtCreator.
This makes it more pleasant to use the cmake files as a QtCreator
project file.
Also rework the CMakeLists.txt file a bit by fixing whitespace issues,
rearranging some blocks and unifying the code style a bit.
This extends our new Bison-generated parser to also parse CREATE TABLE
statements and replaces the last parts of the Antlr-generated parser by
doing so.
Also adjust the unit tests to match the new style of parsed expressions.
They have better formatting now and identifiers are always correctly
quoted. This could not be done before and so the tests expected the old
look of expression statements.
See issue #1990.
Replace the Antlr lexer and parser for CREATE INDEX statements with a new
lexer and parser generated with flex and bison. This commit is a first
step towards replacing all Antlr-related parts of the parser. Until then
the new bison-generated parser is only used for CREATE INDEX statements
and the old Antlr-generated parser is used for CREATE TABLE statements.
These are the main reasons for replacing all of the Antlr parser:
- Getting rid of the Antlr runtime library as a dependency.
- Not depending on an old piece of software (we are depending on Antlr2
while Antlr4 is available at the moment. However, migrating to Antlr4
is as bad as migrating to bison).
- Better handling of expressions in statements. This proved to be a
consistent source of problems over the last couple of years.
- Somewhat better Unicode support.
- Reentrant code / multithreading support.
- I can finally uninstall Java from my computer.
See #1990.
This commit changes the class hierarchy to make primary key constraints
a type of unique constraints. This fits nicely with reality because
primary key columns do not allow duplicate values. It also makes our
life easier as the other changes which are introduced here add some code
required by both unique and primary key constraints and which now can be
shared.
Move the auto increment flag from the field class to the primary key
class. This changes how auto increment fields work and look and might be
a bit unfamiliar but it simplifies things a lot for us because an auto
increment field is always a primary key. So before we had to maintain
two places: the field with the auto increment flag and the primary key
which belongs to it. Now it is all in one place in the primary key.
Add support for storing and manipulating sort order for columns in
primary key and unique constraints. It does not add support for them to
the grammar parser though.
Finally add a way to store and manipulate on conflict clauses for unique
and primary key constraints. Again, parser support for them is not added.
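A minimal sketch of such a hierarchy is shown below. The class and member names are made up for illustration and are not the actual classes from the code base. It shows the primary key deriving from the unique constraint, the auto increment flag living on the primary key, and the sort order and ON CONFLICT handling being written once for both. Emitting AUTOINCREMENT and escaping quotes in identifiers are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: a constrained column with an optional sort order.
struct SortedColumn {
    std::string name;
    std::string order; // "", "ASC" or "DESC"
};

class UniqueConstraint {
public:
    virtual ~UniqueConstraint() {}

    std::vector<SortedColumn> columns;
    std::string conflictAction; // "", "ROLLBACK", "ABORT", "FAIL", "IGNORE" or "REPLACE"

    virtual std::string keyword() const { return "UNIQUE"; }

    // Shared code: column list including the sort order.
    std::string columnList() const {
        assert(!columns.empty());
        std::string sql = "(";
        for (std::size_t i = 0; i < columns.size(); ++i) {
            if (i > 0)
                sql += ", ";
            sql += "\"" + columns[i].name + "\"";
            if (!columns[i].order.empty())
                sql += " " + columns[i].order;
        }
        return sql + ")";
    }

    // Shared code: optional ON CONFLICT clause.
    std::string conflictClause() const {
        return conflictAction.empty() ? "" : " ON CONFLICT " + conflictAction;
    }
};

// A primary key is a unique constraint that may additionally auto increment.
class PrimaryKeyConstraint : public UniqueConstraint {
public:
    bool autoIncrement = false; // lives here instead of on the field
    std::string keyword() const override { return "PRIMARY KEY"; }
};

// Works for both kinds because it only needs the base class interface.
std::string describe(const UniqueConstraint& c) {
    return c.keyword() + c.columnList() + c.conflictClause();
}
```

Code that only cares about uniqueness can take a `UniqueConstraint` reference and handle primary keys without a special case.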
This silences a couple of compiler warnings. The changes in the grammar
parser serve the same purpose: they silence at least some of the
warnings Antlr prints while generating the parser code.
In the Edit Dialog a missing break is added to a switch statement. This
seems like it actually was an unintended fallthrough for once, setting
the focus to the hex editor instead of the RTL editor.
When using a "x IS NOT NULL" expression in a statement our parser was
generating something like "xIS NOTNULL". This was especially a problem
because the table looked like it parsed correctly but actually contained
a faulty expression. So when modifying the table you would get
unexpected error messages or, worse, silent errors introduced into your
table.
Also add a test case for this and for commit e7ba79f478.
See issue #1969.
This changes the way we store the constraints associated with a table
from using a map to using a set. The map was mapping from the list of
field names to the constraint applied on these fields. Now the field
list is stored inside the constraint and we can store the constraints in
a simple set. This turns out to simplify the code noticeably.
In the Table class we need to store whether this is a WITHOUT ROWID
table or not. Instead of just storing a boolean flag for that we were
storing a list of the rowid column(s). This is not just more complicated
to handle than a simple flag but also more error-prone because the list
must always be kept equal to the list of primary key columns. Failing to
keep them equal would result in an invalid SQL statement.
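The idea can be sketched as follows, with hypothetical names: the table stores only the flag, and the list of rowid columns is derived from the primary key on demand, so the two can no longer get out of sync. Falling back to `_rowid_` for ordinary tables is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced table type.
struct Table {
    std::vector<std::string> primaryKeyColumns;
    bool withoutRowid = false; // the only thing that is stored

    // Derived instead of stored: always consistent with the primary key.
    std::vector<std::string> rowidColumns() const {
        if (!withoutRowid)
            return std::vector<std::string>(1, "_rowid_");
        // A WITHOUT ROWID table needs a primary key.
        assert(!primaryKeyColumns.empty());
        return primaryKeyColumns;
    }
};
```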
After setting a filter, the user can select from the context menu in the
filter line a new option "Use for Conditional Format", that assigns
automatically a colour to the background of cells fulfilling that
condition.
The formatting is preserved after the user has removed the filter. Several
conditional formats can be successively added to a column using different
filters.
The conditional formats of a column can be cleared when the filter is empty
selecting "Clear All Conditional Formats" from the filter line context
menu.
The conditional formats are saved and loaded in project files as other
browse table settings.
A new class Palette has been added for reusing the automatic colour
assignment of the Plot Dock. It takes into account the theme kind of the
application (dark, light) for the colour selection.
A new class CondFormat has been added so that the conditional formatting
settings can be used from several classes. The conversion of a filter
string from our format to an SQL condition has been moved here for reuse
in filters and conditional formatting.
Whether the conditional format applies is resolved by SQLite, so filters
and conditional formats give the same exact results.
Code for getting a pragma value has been reused for getting the condition
result, and consequently renamed to selectSingleCell.
Possible future improvement:
- New dialog for editing the conditional formatting (at least colour and
application order of conditions, and possibly also adding new conditions
and editing the condition itself).
This commit refactors vast parts of the sqlitetypes.h interface. Its
main goals are: less code, easier code, a more modern interface, reduced
likelihood for strange errors and more flexibility for future
extensions.
The main reason why the sqlitetypes.h functions were working so well in
DB4S was not because they were that stable but because they were
extremely interlinked with the rest of the code. This is fine because we
do not plan to ship them as a separate library. But it makes it hard to
find the obvious spot to fix an issue or to put a new function. It can
always be done in the sqlitetypes function or in the rest of the DB4S
code because it is just not clear what the interface between the two
should look like. This is supposed to be improved by this commit. One
main thing here is to make ownership of objects a bit clearer.
In theory the new code should be faster too but that difference will be
negligible from a user POV.
This commit also fixes a hidden bug which caused all table constraints
to be removed in the Edit Table dialog when a single field was removed
from the table.
This is all still WIP and more work is needed to be done here.
This commit fixes a regression which was introduced in commit
788134eee6 which broke the parsing of row
values.
It also makes sure CHECK expressions are parsed in exactly the same way,
no matter whether they are a column or a table constraint. Before, spaces
were added to the query in a different way. The way it was done for
column constraints also had an error where the minus sign of a negative
number was separated from the first digit by a space. This is fixed,
too.
Because of all the changes this commit also adjusts the tests to expect
the new layout of the check expressions. It also adds some new tests for
row values and for complex expressions to make sure both work. Finally,
it also removes all QScintilla dependencies from the tests which don't
seem to be necessary.
* Replace deprecated qt5_use_modules function
* Fix includes that fall under a larger module
* Bump minimum CMake version to use newer features and properly use libs
* Replace deprecated qt5_use_modules function and bump minimum CMake version to 3.1.0 for 3rd party libraries
* Rename confusing variables
* Fix some project warnings
* Fix code style
* Add constant for the default page size
* Move KeyFormats enum to CipherSettings
* Fix code style
* Fix memory leak
* Stop relying on CipherDialog for encryption settings management
* Fix code style
* Add .env format for QSettings
* Add automatic crypted databases open via dotenvs
This adds support for `.env` files placed next to the encrypted databases
that are to be opened. Such a file contains the needed cipher settings.
The only required one is the plain-text password as a value for the key
with the name of the database like this:
myCryptedDatabase.sqlite = MyPassword
This way, databases with a different extension are supported too:
myCryptedDatabase.db = MyPassword
You can also specify a custom page size adding a different line
(anywhere in the file) like this:
myCryptedDatabase.db_pageSize = 2048
If not specified, `1024` is used.
You can also specify the format of the specified key using the
associated integer id:
anotherCryptedDatabase.sqlite = 0xCAFEBABE
anotherCryptedDatabase.sqlite_keyFormat = 1
where `1` means a Raw key. If not specified, `0` is used, which means a
simple text Passphrase.
Dotenv files (`.env`) are already used on other platforms and by
different tools to manage environment variables, and it is recommended
to exclude them from version control so that they do not leak.
* Add new files to CMakeLists
* Move DotenvFormat include to the implementation
* Fix build error
* Remove superfluous method
(related to ac51c23)
* Remove superfluous checks
* Fix memory leaks
(introduced by 94bbb46)
* Fix code style
* Make dotenv related variable and comment clearer
* Remove duplicated code
* Remove unused forward declaration
(introduced by e5a0293)
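The `.env` lookup described above (a key named after the database file, plus optional `_pageSize` and `_keyFormat` entries with defaults of `1024` and `0`) can be sketched with the standard library alone. This is not the QSettings-based implementation from the commit; `parseDotenv`, `settingsFor` and the `CipherSettings` members here are hypothetical, and no comment or quoting syntax is handled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical settings record using the defaults stated in the text.
struct CipherSettings {
    std::string password;
    int pageSize = 1024; // default page size
    int keyFormat = 0;   // 0 = passphrase, 1 = raw key
};

static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Reads "key = value" lines; lines without '=' are skipped.
std::map<std::string, std::string> parseDotenv(const std::string& text) {
    std::map<std::string, std::string> values;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue;
        values[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return values;
}

// Returns false when the file has no password entry for this database.
// std::stoi throws on non-numeric values; a real implementation would
// want to handle that.
bool settingsFor(const std::map<std::string, std::string>& env,
                 const std::string& dbFileName, CipherSettings& out) {
    assert(!dbFileName.empty());
    const auto password = env.find(dbFileName);
    if (password == env.end())
        return false;
    out.password = password->second;

    const auto pageSize = env.find(dbFileName + "_pageSize");
    if (pageSize != env.end())
        out.pageSize = std::stoi(pageSize->second);

    const auto keyFormat = env.find(dbFileName + "_keyFormat");
    if (keyFormat != env.end())
        out.keyFormat = std::stoi(keyFormat->second);
    return true;
}
```

Because the key is the full file name, two databases in the same directory can share one `.env` file without their settings colliding.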
Modified some tests for taking into account the new standard quoting and
added some more for testing the quoting configuration and for correct
parsing of different quoting styles.
Add a default branch in escapeIdentifier() to try to avoid a warning.
Make strings translatable, remove some more debug code, fix tests,
reduce size of patch slightly, remove weird tooltip, don't crash when
closing database, simplify code, fix filters, don't link against pthread
on Windows.
At the moment a space is inserted between all tokens of which a type
consists. This adds extra spaces to types like VARCHAR(5), which become
"VARCHAR ( 5 )" and cause problems in some applications.
This patch modifies the way tokens are concatenated for a type. It makes
sure that the extra space isn't inserted before "(" and ")" and also
after "(".
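The concatenation rule can be illustrated with a small free function. `joinTypeTokens` is a hypothetical name and the token list is assumed to be already split; the real patch works on the parser's token stream instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Joins the tokens of a type name with single spaces, except that no space
// is inserted before "(" or ")" and none after "(".
std::string joinTypeTokens(const std::vector<std::string>& tokens) {
    std::string result;
    for (const std::string& token : tokens) {
        assert(!token.empty());
        const bool noSpace = result.empty()
                          || token == "(" || token == ")"
                          || result.back() == '(';
        if (!noSpace)
            result += ' ';
        result += token;
    }
    return result;
}
```

With this rule the tokens `VARCHAR`, `(`, `5`, `)` are joined to `VARCHAR(5)`, while multi-word types such as `UNSIGNED BIG INT` keep their spaces.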
This commit bundles a number of smaller optimisations in the CSV parser
and import code. They do add up to a noticeable speed gain though (at
least on some systems and configurations).
We were separating the CSV import into two steps: parsing the CSV file
and inserting the parsed data. This had the advantages that it keeps the
parsing code and the database code nicely separated and that we have
full knowledge of the CSV file when we start inserting the data into the
database. However, this made it necessary to keep the entire parser
results in RAM. For large CSV files this uses enormous amounts of
memory.
This commit changes the import to parse the first 20 lines and analyse
them. This should give us a good impression of what to expect from the
rest of the file. Based on that information we then parse the file row
by row and insert each row into the database as soon as it is parsed.
This means we only have to keep one row at a time in memory while more
or less keeping the possibility to analyse the file before inserting
data.
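The two-phase approach might be sketched like this. Everything here is a simplification under stated assumptions: `splitLine` is a naive stand-in for the real CSV parser (no quoting), `insertRow` is a hypothetical callback standing in for the database insert, and the sketch simply rewinds the stream after sampling, which may differ from how the actual import code proceeds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Naive stand-in for the CSV parser: splits on commas, no quote handling.
Row splitLine(const std::string& line) {
    Row fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ','))
        fields.push_back(field);
    return fields;
}

// Analyses the first sampleSize rows, then streams the whole input row by
// row. Only one parsed row is held in memory at a time.
// Returns the number of rows passed to insertRow.
std::size_t importCsv(std::istream& in,
                      const std::function<void(const Row&)>& insertRow,
                      std::size_t sampleSize = 20) {
    assert(sampleSize > 0);

    // Phase 1: look at the first rows only to decide on the column count.
    std::size_t columns = 0;
    std::string line;
    for (std::size_t i = 0; i < sampleSize && std::getline(in, line); ++i)
        columns = std::max(columns, splitLine(line).size());

    // Rewind so that the sampled rows are imported as well.
    in.clear();
    in.seekg(0);

    // Phase 2: parse and insert one row at a time.
    std::size_t rows = 0;
    while (std::getline(in, line)) {
        Row row = splitLine(line);
        row.resize(columns); // pad short rows, cut rows longer than the sample
        insertRow(row);
        ++rows;
    }
    return rows;
}
```

The trade-off named in the text is visible here: rows beyond the sample that have more fields than any sampled row are truncated, in exchange for constant memory use.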
On my system this does seem to change the runtime for small files, which
take a little longer now (<5%), though these measurements aren't
conclusive. For large files, however, it changes memory consumption
from using all memory and starting to swap within seconds to almost no
memory consumption at all. And not having to swap speeds things up a
lot.
When parsing a CSV file we used to check the column count for each row
and track the highest number of columns that we found. This information
then could be used to create an INSERT statement large enough for all
the data.
This column number tracking code is removed by this commit. Instead it
analyses the first 20 rows only. It does that while generating the field
list.
Performance-wise this should take a (very) little longer but makes it
easier to improve the performance in other ways later, which should more
than compensate for the cost of this commit.
Feature-wise this should fix some (technically invalid) corner-case CSV
files with fewer fields in the title row than in the other rows. It
should also break some other (technically invalid) corner-case CSV files
if they are imported into an existing table and have fewer columns than
the existing table in their first 20 rows but later on the exact same
number. Both cases, I think, don't matter too much.