
cder

database seed generator that helps create and persist struct-typed instances based on serde-compatible yaml files

5 releases

0.2.2 Aug 14, 2024
0.2.1 Dec 4, 2023
0.2.0 Apr 18, 2023
0.1.1 Jan 25, 2023
0.1.0 Jan 24, 2023

#137 in Development tools

MIT license

37KB
486 lines


cder

A lightweight, simple database seeding tool for Rust


cder (see-der) is a database seeding tool to help you import fixture data in your local environment.

Generating seeds programmatically is an easy task, but maintaining them is not. Every time your schema changes, your seeds can break, and it costs your team extra effort to keep them updated.

With cder you can:

  • maintain your data in a readable format, separated from the seeding program
  • handle referential integrity on the fly, using embedded tags
  • reuse existing structs and insert functions, with only a little glue code needed

cder has no mechanism for database interaction, so it can work with any type of ORM or database wrapper (e.g. sqlx) your application has.

This embedded-tag mechanism is inspired by the fixtures that Ruby on Rails provides for test data generation.

Installation

# Cargo.toml
[dependencies]
cder = "0.2"

Usage

Quick start

Suppose you have a users table as the seeding target:

CREATE TABLE
  users (
    `id` BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `name` VARCHAR(255) NOT NULL,
    `email` VARCHAR(255) NOT NULL
  );

In your application you also have:

  • a struct of type <T> (usually a model, built upon an underlying table)
  • a database insertion method that returns the id of the new record: Fn(T) -> Result<i64>

First, implement the DeserializeOwned trait for the struct. (cder brings in serde as a dependency, so the derive(Deserialize) macro can do the job.)

use serde::Deserialize;

#[derive(Deserialize)] // add this derive macro
struct User {
  name: String,
  email: String,
}

impl User {
  // can be a sync or an async function
  async fn insert(&self) -> Result<i64> {
    //
    // inserts a corresponding record into the table, and returns its id on success
    //
    todo!()
  }
}
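
The insert method above is left as a placeholder. As a minimal sketch of the contract cder relies on (take a record, return the id of the new row), here is an in-memory stand-in for the users table. UserTable and the String error type are hypothetical and only for illustration; a real application would call its ORM or database wrapper here, and the serde derive is omitted so the sketch needs no external crates.

```rust
/// Same fields as the `User` struct above (serde derive omitted in this sketch).
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
    email: String,
}

/// Hypothetical in-memory stand-in for the `users` table.
#[derive(Debug, Default)]
struct UserTable {
    rows: Vec<(i64, User)>,
}

impl UserTable {
    /// Mimics an insert that returns the auto-incremented id of the new row.
    /// The empty-field check is a toy validation, not something cder does.
    fn insert(&mut self, user: &User) -> Result<i64, String> {
        if user.name.is_empty() || user.email.is_empty() {
            return Err("name and email must not be empty".to_string());
        }
        // Ids start at 1 and grow by one per row, like AUTO_INCREMENT on a fresh table.
        let id = self.rows.len() as i64 + 1;
        self.rows.push((id, user.clone()));
        Ok(id)
    }
}
```

Whatever the real implementation looks like, the id it returns is the value that gets associated with the record's label.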

Your User seed is defined by two separate files, data and glue code.

Now create a seed data file 'fixtures/users.yml':

# fixtures/users.yml

User1:
  name: Alice
  email: 'alice@example.com'
User2:
  name: Bob
  email: 'bob@example.com'

Now you can insert the two users above into your database:

use cder::DatabaseSeeder;

async fn populate_seeds() -> Result<()> {
    let mut seeder = DatabaseSeeder::new();

    seeder
        .populate_async("fixtures/users.yml", |input| {
            async move { User::insert(&input).await }
        })
        .await?;

    Ok(())
}

Et voilà! You will get the records Alice and Bob populated in your database.

Working with non-async functions

If your function is a non-async (normal) function, use Seeder::populate instead of Seeder::populate_async.

use cder::DatabaseSeeder;

fn main() -> Result<()> {
    let mut seeder = DatabaseSeeder::new();

    seeder
        .populate("fixtures/users.yml", |input| {
            // this block can contain any non-async functions
            // but it has to return Result<i64> in the end
            diesel::insert_into(users)
                .values((name.eq(input.name), email.eq(input.email)))
                .returning(id)
                .get_result(conn)
                .map(|value| value.into())
        })?;

    Ok(())
}

Constructing instances

If you want more granular control over the deserialized structs before inserting them, use StructLoader instead.

use cder::{ Dict, StructLoader };

fn construct_users() -> Result<()> {
    // provide your fixture filename followed by its directory
    let mut loader = StructLoader::<User>::new("users.yml", "fixtures");

    // deserializes User struct from the given fixture
    // the argument is related to name resolution (described later)
    loader.load(&Dict::<String>::new())?;

    let customer = loader.get("User1")?;
    assert_eq!(customer.name, "Alice");
    assert_eq!(customer.email, "alice@example.com");

    let customer = loader.get("User2")?;
    assert_eq!(customer.name, "Bob");
    assert_eq!(customer.email, "bob@example.com");

    Ok(())
}

Defining values on-the-go

cder replaces certain tags with values based on a couple of rules. This 'pre-processing' runs just before deserialization, so you can define dynamic values that vary depending on your local environment.

Currently the following two cases are covered:

1. Defining relations (foreign keys)

Let's say you have two records to be inserted into the companies table. Their ids (companies.id) are unknown, as they are assigned by the local database on insert.

# fixtures/companies.yml

Company1:
  name: MassiveSoft
Company2:
  name: BuggyTech

Now you have user records that reference these companies:

# fixtures/users.yml

User1:
  name: Alice
  company_id: 1 # this might be wrong

Building User1 might fail, as Company1 is not guaranteed to have id=1 (especially if you have already operated on the companies table). For this, use the ${{ REF(label) }} tag in place of undecided values.

User1:
  name: Alice
  company_id: ${{ REF(Company1) }}

Now, how does Seeder know the id of the Company1 record? As described earlier, the block given to Seeder must return Result<i64>. Seeder stores the result value mapped against the record label, and reuses it later to resolve the tag references.

use cder::DatabaseSeeder;

async fn populate_seeds() -> Result<()> {
    let mut seeder = DatabaseSeeder::new();
    // you can specify the base directory, relative to the project root
    seeder.set_dir("fixtures");

    // Seeder stores mapping of companies record label and its id
    seeder
        .populate_async("companies.yml", |input| {
            async move { Company::insert(&input).await }
        })
        .await?;
    // the mapping is used to resolve the reference tags
    seeder
        .populate_async("users.yml", |input| {
            async move { User::insert(&input).await }
        })
        .await?;

    Ok(())
}

A couple of watch-outs:

  1. Insert the file that contains the 'referenced' records (companies in the example above) before the file with the 'referencing' records (users).
  2. Currently Seeder resolves the tags when reading the source file. That means you cannot have references to a record within the same file. If you want to reference a user record from another one, you can achieve this by splitting the yaml file in two.
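
To make the ordering rule concrete, here is a minimal sketch of how a reference tag could be resolved against a label-to-id mapping. It is not cder's actual implementation: resolve_refs is a hypothetical helper, it only recognizes the exact spelling `${{ REF(label) }}` used in this document, and it works on the raw file text with the standard library only.

```rust
use std::collections::HashMap;

/// Replaces every `${{ REF(label) }}` in `source` with the id recorded for `label`.
/// Returns an error naming the label if no id has been recorded for it yet.
fn resolve_refs(source: &str, ids: &HashMap<String, i64>) -> Result<String, String> {
    const OPEN: &str = "${{ REF(";
    const CLOSE: &str = ") }}";

    let mut out = String::with_capacity(source.len());
    let mut rest = source;

    while let Some(start) = rest.find(OPEN) {
        // Copy everything before the tag unchanged.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        let end = after_open
            .find(CLOSE)
            .ok_or_else(|| "unterminated REF tag".to_string())?;
        let label = after_open[..end].trim();
        // The label must already be known, which is why referenced files go first.
        let id = ids
            .get(label)
            .ok_or_else(|| format!("unresolved reference: {}", label))?;
        out.push_str(&id.to_string());
        rest = &after_open[end + CLOSE.len()..];
    }

    out.push_str(rest);
    Ok(out)
}
```

With an empty mapping, resolving users.yml fails on Company1. Once the companies have been inserted and their ids recorded, the same text resolves to a plain integer that can be deserialized as usual.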

2. Environment vars

You can also refer to environment variables using ${{ ENV(var_name) }} syntax.

Dev:
  name: Developer
  email: ${{ ENV(DEVELOPER_EMAIL) }}

The email is replaced with the value of DEVELOPER_EMAIL if that environment var is defined.

If you would like to fall back to a default value, use the shell-like syntax:

Dev:
  name: Developer
  email: ${{ ENV(DEVELOPER_EMAIL:-"developer@example.com") }}

If no default value is specified, every tag that points to an undefined environment var is simply replaced by an empty string "".
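
The fallback rules above can be sketched as a small function. This is an illustration under stated assumptions, not cder's own code: resolve_env is a hypothetical helper that receives only the text inside ENV(...), and the lookup parameter stands in for the process environment so the behaviour can be checked without touching real variables. It treats a variable as missing only when it is undefined. Whether cder also falls back on an empty value, as the shell does, is not covered here.

```rust
/// Resolves the body of an `ENV(...)` tag, for example `DEVELOPER_EMAIL`
/// or `DEVELOPER_EMAIL:-"developer@example.com"`.
/// `lookup` returns the value of a variable, or `None` if it is undefined.
fn resolve_env<F>(body: &str, lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    // Split `NAME:-default` into the variable name and an optional default,
    // dropping the double quotes around the default if present.
    let (name, default) = match body.split_once(":-") {
        Some((name, default)) => (name.trim(), Some(default.trim().trim_matches('"'))),
        None => (body.trim(), None),
    };

    lookup(name)
        .or_else(|| default.map(str::to_string))
        // No value and no default: fall back to an empty string.
        .unwrap_or_default()
}
```

Against the real environment, the lookup could be passed as `|name| std::env::var(name).ok()`.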

Data representation

cder deserializes YAML data using serde-yaml, which builds on the powerful serde serialization framework. With serde, you can deserialize pretty much any struct. You can see a few sample structs with various types of attributes and the yaml files that can be used as their seeds.

Below are a few basics of the required YAML format. Check serde-yaml's GitHub page for further details.

Basics

Label_1:
  name: Alice
  email: 'alice@example.com'
Label_2:
  name: Bob
  email: 'bob@example.com'

Note that cder requires each record to be labeled (Label_x). A label can be anything (as long as it is a valid yaml key), but you might want to keep labels unique to avoid accidental mis-references.

Enums and Complex types

Enums can be deserialized using YAML's !tag. Suppose you have a struct CustomerProfile with enum Contact.

#[derive(Deserialize)]
struct CustomerProfile {
  name: String,
  contact: Option<Contact>,
}

#[derive(Deserialize)]
enum Contact {
  Email { email: String },
  Employee(usize),
  Unknown,
}

You can generate customers with each type of contact as follows:

Customer1:
  name: "Jane Doe"
  contact: !Email { email: "jane@example.com" }
Customer2:
  name: "Uncle Doe"
  contact: !Employee 10100
Customer3:
  name: "John Doe"
  contact: !Unknown
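
For orientation, these are the Rust values the three records are meant to describe, built by hand. This is only a sketch of the intended tag-to-variant mapping, with the serde derives left out so it compiles without external crates. The expected_customers function is a hypothetical name introduced for illustration.

```rust
#[derive(Debug, PartialEq)]
struct CustomerProfile {
    name: String,
    contact: Option<Contact>,
}

#[derive(Debug, PartialEq)]
enum Contact {
    Email { email: String },
    Employee(usize),
    Unknown,
}

/// Hand-built equivalents of Customer1, Customer2 and Customer3.
fn expected_customers() -> Vec<CustomerProfile> {
    vec![
        // !Email carries a mapping, so it maps to the struct-like variant.
        CustomerProfile {
            name: "Jane Doe".to_string(),
            contact: Some(Contact::Email {
                email: "jane@example.com".to_string(),
            }),
        },
        // !Employee carries a single scalar, so it maps to the newtype variant.
        CustomerProfile {
            name: "Uncle Doe".to_string(),
            contact: Some(Contact::Employee(10100)),
        },
        // !Unknown carries no value, so it maps to the unit variant.
        CustomerProfile {
            name: "John Doe".to_string(),
            contact: Some(Contact::Unknown),
        },
    ]
}
```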

Not for production use

cder is designed to populate seeds in development (or possibly test) environments. Production use is NOT recommended.

License

The project is available as open source under the terms of the MIT License.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, shall be licensed as MIT, without any additional terms or conditions.

Bug reports and pull requests are welcome on GitHub at https://github.com/estie-inc/cder

Dependencies

~4–6MB
~113K SLoC