This is a multi-part series. If you want to play around with a live demo, you can try it out here. You can also view the source code here.
Recently, I've found myself working a lot with SQL in Rust. I've been using both SeaORM and SQLx for my database needs. These two tools provide a lot in terms of query safety and ease of development.
One issue I continually ran into was the need to have a running database during development. This seems like a trivial thing ("We're running a database when testing anyways!") but I really think we should have other options for ensuring our queries are safe. I wanted to explore statically analyzing SQL without running a database.
Our approach requires us to write a program that is able to take in your migrations, manage an internal state, and validate if your queries are valid or invalid (plus, a handy error message or two would be nice!).
This approach that I will be covering is not good for cases where you:
The first thing you do in your SQL database is create the tables that represent your data. When you see the query below, what are all of the pieces that are important for us?
create table account (id int primary key, name text not null, status int not null default 1, email text not null, unique(email))
The first and easiest thing to notice is that the name has a name, "account". This table name serves as an identifier, allowing us to reference this table in other SQL statements.
pub struct Simulator {
// Mapping of the table's name to the Table.
pub tables: HashMap<String, Table>
}
The next thing you notice is the list of the columns that belong to this table.
Each of these columns has a type and a set of constraints. There are a couple
different constraints that can exist on these columns but for now, we will focus
on just not null
and default
. We will handle others,
like "primary key", later on.
We can define some basic structures for SqlType
and Column
for representing the types and the columns respectively. These
are fairly minimal for now.
pub enum SqlType {
/// 16 bit Signed Integer
SmallInt,
/// 32 bit Signed Integer
Integer,
/// 64 bit Signed Integer,
BigInt,
/// 32 bit Floating
Float,
/// 64 bit Floating
Double,
/// String
Text,
Boolean,
Unknown(String),
}
pub struct Column {
// The type of the column like SqlType::Text
pub ty: SqlType,
// If the column can be NULL.
pub nullable: bool,
// If the column has a DEFAULT defined.
pub default: bool,
}
// We can now define the Table.
pub struct Table {
pub columns: HashMap<String, Column>,
}
We can now actually create the table and store it within the Simulator
instance.
There are some low hanging validations we can do now that we are building up our state. One of these is ensuring that there can't be two tables with the same name. We start off by defining an Error type for our Simulator.
#[derive(Debug, thiserror::Error)]
pub enum Error {
#[error("Parsing: {0}")]
Parsing(#[from] sqlparser::parser::ParserError),
#[error("Table '{0}' already exists")]
TableAlreadyExists(String),
#[error("'{0}' is currently unsupported")]
Unsupported(String),
}
Now that we have defined an error type, with a variant to use for this case, we can actually do our validation check.
// Ensure that this table doesn't already exist.
if !create_table.if_not_exists && self.tables.contains_key(&name) {
return Err(Error::TableAlreadyExists(name));
}
Since SQL CREATE TABLE
statements can include an optional IF NOT EXISTS
, we must check that along with ensuring that this
table doesn't currently exist.
After that, we can start loading our columns into this table. We can start off on a naive implementation that just creates a column and ignores constraints.
let mut table = Table::default();
for column in create_table.columns {
let column_name = &column.name.value;
let mut nullable = true;
let mut default = false;
let ty: SqlType = column.data_type.into();
let col = Column { ty, nullable, default };
// TODO: Check nullability, default, and constraints.
// Ensure that this column doen't already exist.
if table.columns.contains_key(column_name) {
return Err(Error::ColumnAlreadyExists(column_name.to_string()));
}
table.columns.insert(column_name.to_string(), col);
}
This snippet above also adds another validation, ensuring that we don't have two columns of the same name within a Table.
So far, we have properly created a Simulator that has accurate state regarding the executed queries and is able to do some basic validations. In the next post, we will handle the column and table level constraints.