1 unstable release: 0.1.0 (Mar 11, 2023)
⬡ Hextacy ⬡
A repository designed to quick-start web server development with actix_web by providing an extensible infrastructure and a CLI tool that cuts down on hand-written boilerplate while maintaining best practices.
The project structure this repository uses is heavily based on hexagonal architecture, also known as the ports and adapters architecture, which is very flexible and easily testable. You can read great articles about it here and here.
Architecture
The following is the server architecture intended to be used with the various hextacy helpers, but in order to understand why it is built the way it is, you first need to understand how all the helpers tie together to provide an efficient and flexible architecture.
Backend servers usually, if not always, make use of data stores. Repositories provide the methods through which an application's Adapters get access to database Models.
In this architecture, a repository contains no implementation details. It is simply an interface which adapters utilise for their specific implementations to obtain the underlying model. For this reason, repository methods must always take in a completely generic connection parameter which is made concrete in adapter implementations.
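As a minimal sketch of that idea (not code from hextacy), the repository below is a trait that is generic over the connection type, and an adapter makes the connection concrete. `MemoryConnection`, `MemoryUserAdapter` and `find_name` are hypothetical names invented for this illustration, with an in-memory map standing in for a real database connection.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub id: u32,
    pub name: String,
}

/// The repository: an interface only, generic over the connection type `C`.
pub trait UserRepository<C> {
    fn get_by_id(conn: &mut C, id: u32) -> Option<User>;
}

/// Hypothetical "connection" to an in-memory store (stands in for a driver connection).
pub struct MemoryConnection {
    pub users: HashMap<u32, User>,
}

/// The adapter is where `C` becomes a concrete type.
pub struct MemoryUserAdapter;

impl UserRepository<MemoryConnection> for MemoryUserAdapter {
    fn get_by_id(conn: &mut MemoryConnection, id: u32) -> Option<User> {
        conn.users.get(&id).cloned()
    }
}

/// Calling code names only the trait, never the adapter or the connection.
pub fn find_name<C, R: UserRepository<C>>(conn: &mut C, id: u32) -> Option<String> {
    R::get_by_id(conn, id).map(|user| user.name)
}
```

Swapping the storage backend then means writing another `impl UserRepository<OtherConnection>`; `find_name` stays untouched.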
When business level services need access to the database, they can obtain it by having a repository struct which is bound to whichever repository traits it needs (to have a better clue what this means, take a look at the server example, or the user example below). For example, an authentication service may need access to a user and session repository.
In the service's definition, its repository must be constrained by any repository traits the service requires. This will require that the intermediate service repository also takes in generic connection parameters. Since the service should be oblivious to the repository implementation, this means that the client this intermediate repository uses to establish database connections must also be generic, since the service cannot know in advance which adapter it will be using.
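The shape described above can be sketched with plain std types. Everything here is an assumption made for illustration: the `Connect` trait merely plays the role of a generic client, and `AuthRepository`, `VecClient`, `VecUsers` and `VecSessions` are hypothetical names, not hextacy items.

```rust
use std::marker::PhantomData;

/// Hypothetical client trait: anything that can hand out a connection.
pub trait Connect {
    type Connection;
    fn connect(&self) -> Result<Self::Connection, String>;
}

pub trait UserRepository<Conn> {
    fn exists(conn: &mut Conn, name: &str) -> bool;
}

pub trait SessionRepository<Conn> {
    fn create(conn: &mut Conn, name: &str) -> String;
}

/// Intermediate service repository: generic over client, connection and repositories.
pub struct AuthRepository<C, Conn, U, S> {
    client: C,
    _marker: PhantomData<(Conn, U, S)>,
}

impl<C, Conn, U, S> AuthRepository<C, Conn, U, S>
where
    C: Connect<Connection = Conn>,
    U: UserRepository<Conn>,
    S: SessionRepository<Conn>,
{
    pub fn new(client: C) -> Self {
        Self { client, _marker: PhantomData }
    }

    /// Opens one connection and runs both repository calls on it.
    pub fn login(&self, name: &str) -> Result<String, String> {
        let mut conn = self.client.connect()?;
        if !U::exists(&mut conn, name) {
            return Err(format!("unknown user {name}"));
        }
        Ok(S::create(&mut conn, name))
    }
}

// Toy adapters, only to show where the generic types become concrete.
pub struct VecClient(pub Vec<String>);

impl Connect for VecClient {
    type Connection = Vec<String>;
    fn connect(&self) -> Result<Vec<String>, String> {
        Ok(self.0.clone())
    }
}

pub struct VecUsers;

impl UserRepository<Vec<String>> for VecUsers {
    fn exists(conn: &mut Vec<String>, name: &str) -> bool {
        conn.iter().any(|n| n == name)
    }
}

pub struct VecSessions;

impl SessionRepository<Vec<String>> for VecSessions {
    fn create(_conn: &mut Vec<String>, name: &str) -> String {
        format!("session-for-{name}")
    }
}
```

The only place that mentions the concrete types is the construction site, for example `AuthRepository::<VecClient, Vec<String>, VecUsers, VecSessions>::new(..)`.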
The generic connection could be mitigated by moving the client from the business level to the adapter level, but unfortunately we would then lose the ability to perform database transactions (without a nightmare API). The business level must retain its ability to perform atomic queries.
So far, we have 2 generic parameters, the client and the connection, and we have repositories, interfaces our service repositories can utilise to obtain data. So far, so good!
Because we are now working with completely generic types, we have a completely decoupled architecture (yay), but unfortunately for us, we now have to endure Rust's esoteric trait bounds on every intermediate repository we create (boo). Fortunately for us, we can utilise Rust's most excellent feature: macros!
First, let's go step by step to understand why we'll need these macros by examining an example of a simple user endpoint. Check out the server example in the examples repo to see how everything is ultimately set up.
The server
First things first, we have to define the data we'll use:
- data.rs
// We expect this in the query params
// Validify creates a GetUsersPaginatedPayload in the background
#[derive(Debug, Deserialize)]
#[validify]
#[serde(rename_all = "camelCase")]
pub(super) struct GetUsersPaginated {
    #[validate(range(min = 1, max = 65_535))]
    pub page: Option<u16>,
    #[validate(range(min = 1, max = 65_535))]
    pub per_page: Option<u16>,
}

// It must derive Serialize and optionally new for convenience (provided by the
// derive_new crate)
#[derive(Debug, Serialize, new)]
pub(super) struct UserResponse {
    users: Vec<User>,
}

impl Response for UserResponse {}
GetUsersPaginated comes in, gets validated, UserResponse comes out, simple enough!
We create entry points for the service with handlers:
- handler.rs
use super::{
    contract::ServiceContract,
    data::{GetUsersPaginated, GetUsersPaginatedPayload},
};

pub(super) async fn get_paginated<S: ServiceContract>(
    data: web::Query<GetUsersPaginatedPayload>,
    service: web::Data<S>,
) -> Result<impl Responder, Error> {
    // Validate the raw payload and turn it into the validated struct
    let query = GetUsersPaginated::validify(data.0)?;
    info!("Getting users");
    service.get_paginated(query)
}
So far we've been showcasing a simple actix handler, so let's get to the good stuff.
Notice that we have a ServiceContract bound in our handler. Services define their API through contracts. Contracts are nothing more than traits (interfaces) through which we interact with the service:
- contract.rs
pub(super) trait ServiceContract {
    fn get_paginated(&self, data: GetUsersPaginated) -> Result<HttpResponse, Error>;
}

pub(super) trait RepositoryContract {
    fn get_paginated(
        &self,
        page: u16,
        per_page: u16,
        sort: Option<user::SortOptions>,
    ) -> Result<Vec<User>, Error>;
}
These contracts define the behaviour we want from our service and the infrastructure it will use. The service contract is implemented by the service struct:
- service.rs
pub(super) struct UserService<R>
where
    R: RepositoryContract,
{
    pub repository: R,
}

impl<R> ServiceContract for UserService<R>
where
    R: RepositoryContract,
{
    fn get_paginated(&self, data: GetUsersPaginated) -> Result<HttpResponse, Error> {
        let users = self.repository.get_paginated(
            data.page.unwrap_or(1_u16),
            data.per_page.unwrap_or(25),
            data.sort_by,
        )?;

        Ok(UserResponse::new(users)
            .to_response(StatusCode::OK)
            .finish())
    }
}
The service has a single field that must implement the contract. This contract serves as a layer of abstraction such that we now do not care what goes in the repository field so long as it implements RepositoryContract. This helps with the generic bounds in the upcoming repository and makes testing the services a breeze!
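To illustrate the testing claim, here is a simplified, self-contained sketch. It deliberately drops the HTTP response and the sort option from the contract above, uses `String` as a stand-in error type, and `MockRepository` is a hypothetical hand-written test double.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub name: String,
}

/// Simplified contract: no sorting and a plain `String` error.
pub trait RepositoryContract {
    fn get_paginated(&self, page: u16, per_page: u16) -> Result<Vec<User>, String>;
}

pub struct UserService<R: RepositoryContract> {
    pub repository: R,
}

impl<R: RepositoryContract> UserService<R> {
    /// Applies the same defaults as the service above: page 1, 25 per page.
    pub fn get_paginated(
        &self,
        page: Option<u16>,
        per_page: Option<u16>,
    ) -> Result<Vec<User>, String> {
        self.repository
            .get_paginated(page.unwrap_or(1), per_page.unwrap_or(25))
    }
}

/// Test double backed by a vector; no database involved.
pub struct MockRepository {
    pub users: Vec<User>,
}

impl RepositoryContract for MockRepository {
    fn get_paginated(&self, page: u16, per_page: u16) -> Result<Vec<User>, String> {
        let start = (page.saturating_sub(1) as usize) * per_page as usize;
        Ok(self
            .users
            .iter()
            .skip(start)
            .take(per_page as usize)
            .cloned()
            .collect())
    }
}
```

A test can then build `UserService { repository: MockRepository { .. } }` and assert on the service's defaults and paging without any client or connection in sight.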
Now we have to define our repository, and this is where we enter the esoteric realm of Rust trait bounds:
- adapter.rs
use hextacy_derive::Repository;
use hextacy::clients::db::{Client, DBConnect};
use std::{marker::PhantomData, sync::Arc};

#[derive(Debug, Clone, Repository)]
#[postgres(Conn)]
pub struct Repository<C, Conn, User>
where
    C: DBConnect<Connection = Conn>,
    User: UserRepository<Conn>,
{
    postgres: Client<C, Conn>,
    user: PhantomData<User>,
}

// This one's for convenience
impl<C, Conn, User> Repository<C, Conn, User>
where
    C: DBConnect<Connection = Conn>,
    User: UserRepository<Conn>,
{
    // The client type is the struct's `C` parameter and the field is `postgres`
    pub fn new(client: Arc<C>) -> Self {
        Self {
            postgres: Client::new(client),
            user: PhantomData,
        }
    }
}

impl<C, Conn, User> RepositoryContract for Repository<C, Conn, User>
where
    Self: RepositoryAccess<Conn>,
    C: DBConnect<Connection = Conn>,
    User: UserRepository<Conn>,
{
    fn get_paginated(
        &self,
        page: u16,
        per_page: u16,
        sort: Option<user::SortOptions>,
    ) -> Result<Vec<user::User>, Error> {
        let mut conn = self.connect()?;
        User::get_paginated(&mut conn, page, per_page, sort).map_err(Error::new)
    }
}
That's a lot of stuff for just getting users out of the database.
The Repository derive implements the RepositoryAccess trait using PgPoolConnection as its connection type, since we annotated it with postgres.
RepositoryAccess is a simple trait that is generic over the connection and gives its implementors a connect() method to establish a connection to the database. In the Repository derive, this generic connection is concretised to PgPoolConnection, which basically means we can use any client that can establish that connection. The Postgres client can do just that (it is simply a wrapper around a connection pool).
The #[postgres(Conn)] attribute tells the derive macro which generic connection parameter to substitute in the implementation and must match the generic in the struct. RepositoryAccess can be manually implemented as well.
DBConnect is a trait used by clients to establish an actual connection. All concrete clients implement it in their specific ways.
It is also implemented by the Client struct. A Client is a wrapper around a concrete client and simply delegates the connect() call to it.
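The delegation can be pictured with a small std-only sketch. The `Connect` trait and this `Client` are stand-ins written for the example (hextacy's own `DBConnect` and `Client` may differ in signature), and `CountingPool` is a hypothetical concrete client that hands out numbered connections.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Stand-in for a connect trait: implementors know how to open a connection.
pub trait Connect {
    type Connection;
    fn connect(&self) -> Result<Self::Connection, String>;
}

/// Thin wrapper that shares a concrete client through an `Arc`.
pub struct Client<C> {
    inner: Arc<C>,
}

impl<C> Client<C> {
    pub fn new(inner: Arc<C>) -> Self {
        Self { inner }
    }
}

/// The wrapper implements the same trait by delegating to the wrapped client.
impl<C: Connect> Connect for Client<C> {
    type Connection = C::Connection;

    fn connect(&self) -> Result<Self::Connection, String> {
        self.inner.connect()
    }
}

/// Hypothetical concrete client: each "connection" is just a running number.
pub struct CountingPool {
    pub handed_out: AtomicUsize,
}

impl Connect for CountingPool {
    type Connection = usize;

    fn connect(&self) -> Result<usize, String> {
        Ok(self.handed_out.fetch_add(1, Ordering::SeqCst))
    }
}
```

Two wrappers built from clones of the same `Arc<CountingPool>` draw from the same pool, which mirrors how one client can be shared by several repositories.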
As you can see, the client's C parameter MUST implement DBConnect, which takes care of connecting to the database, and its connection MUST be the same as the connection on DBConnect. This takes care of how we're connecting to the DB.
The User bound is simply a bound to a repository the service's repository will use, which in this case is the UserRepository. Since repository methods must take in a connection (in order to preserve transactions) they do not take in &self. This is fine, but now the compiler will complain we have unused fields because we are in fact not using them. If we remove the fields, the compiler will complain we have unused trait bounds, so we use phantom data to make the compiler think the struct owns the data.
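A tiny std-only sketch of the phantom data trick follows. `Greeter`, `Service` and `English` are hypothetical names; the point is only that a type parameter used solely through associated functions still has to appear in a field.

```rust
use std::marker::PhantomData;

pub trait Greeter {
    // No `&self`, just like the repository methods.
    fn greet(name: &str) -> String;
}

// Without the `PhantomData` field, rustc rejects this struct with
// error E0392 (type parameter `G` is never used).
pub struct Service<G: Greeter> {
    _greeter: PhantomData<G>,
}

impl<G: Greeter> Service<G> {
    pub fn new() -> Self {
        Self { _greeter: PhantomData }
    }

    pub fn run(&self, name: &str) -> String {
        // `G` is only ever used at the type level.
        G::greet(name)
    }
}

pub struct English;

impl Greeter for English {
    fn greet(name: &str) -> String {
        format!("hello {name}")
    }
}
```

`PhantomData` is zero-sized, so the marker field adds nothing to the struct at runtime.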
So far we haven't coupled any implementation details to the service. The derive macro generates code for a postgres client, but it just substitutes the generic connection bounds for concrete ones in its RepositoryAccess implementation. The field is called postgres because the derive macro operates on a specific set of field names; with a manual implementation it could be named anything.
So pretty much, all the service has are calls to some generic clients, connections and repositories.
This fact is at the core of this architecture and is precisely what makes it so powerful. Not only does this make testing a piece of cake, but it also allows us to switch up our adapters any way we want without ever having to change the business logic. They are completely decoupled.
Finally, we'll concretise everything in the setup:
- setup.rs
pub(crate) fn routes(pg: Arc<Postgres>, rd: Arc<Redis>, cfg: &mut web::ServiceConfig) {
    let service = UserService {
        repository: Repository::<Postgres, PgPoolConnection, PgUserAdapter>::new(pg.clone()),
    };
    let auth_guard = interceptor::AuthGuard::new(pg, rd, Role::User);

    cfg.app_data(Data::new(service));

    // Show all
    cfg.service(
        web::resource("/users")
            .route(web::get().to(handler::get_paginated::<
                UserService<Repository<Postgres, PgPoolConnection, PgUserAdapter>>,
            >))
            .wrap(auth_guard),
    );
}
I'll admit it, the trait bounds do look kind of ugly, but seeing as this is the only place where we concretise our types, we never have to worry about the rest of the service breaking when we make changes in our adapters.
To reduce some of the unpleasantness of dealing with so many generics, macros exist to aid the process. If we utilise the repository! and contract! macros, our adapter.rs file becomes a bit easier on the eyes:
- adapter.rs
/* ..imports.. */

repository! {
    C => Connection : field_name;
    User => UserRepository<Connection>
}

contract! {
    C => Connection;
    RepositoryContract => Repository, RepositoryAccess;
    User => UserRepository<Connection>;

    fn get_paginated(
        &self,
        page: u16,
        per_page: u16,
        sort: Option<user::SortOptions>,
    ) -> Result<Vec<user::User>, Error> {
        let mut conn = self.connect()?;
        User::get_paginated(&mut conn, page, per_page, sort).map_err(Error::new)
    }
}
Looks much better! This will essentially generate all the code with the generics from the original file.
You can read more about how the macros work in the hextacy::db module.
Transactions
The reason for the repositories always taking in a connection in their methods is transactions. Since business level services should have the ability to roll back transactions if anything goes south, we have to somehow enable their repositories to support transactions.
We do this by adding a transaction field to the repository, which is simply a RefCell around an Option<C> where C is the connection. We use the ref cell to get mutable access to transactions without poisoning our API with &mut self references.
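A minimal sketch of such a transaction slot, using only std: `Repo`, `begin`, `finish` and `in_transaction` are hypothetical names for this example and not part of hextacy's API.

```rust
use std::cell::RefCell;

pub struct Repo<Conn> {
    /// Holds the connection of the transaction in progress, if any.
    tx: RefCell<Option<Conn>>,
}

impl<Conn> Repo<Conn> {
    pub fn new() -> Self {
        Self { tx: RefCell::new(None) }
    }

    /// Takes `&self`, not `&mut self`: the `RefCell` provides interior mutability.
    pub fn begin(&self, conn: Conn) -> Result<(), &'static str> {
        let mut slot = self.tx.borrow_mut();
        if slot.is_some() {
            return Err("transaction already in progress");
        }
        *slot = Some(conn);
        Ok(())
    }

    /// Takes the connection back out, which ends the transaction.
    pub fn finish(&self) -> Option<Conn> {
        self.tx.borrow_mut().take()
    }

    pub fn in_transaction(&self) -> bool {
        self.tx.borrow().is_some()
    }
}
```

Because `RefCell` checks borrows at runtime and is not `Sync`, a slot like this is meant to be used from one thread at a time.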
This ref cell can now hold an open connection that can be used to perform queries. The Atomic trait provides an interface for any repository to start, commit or roll back a transaction. It works by checking whether our ref cell contains a connection: if it does, we use that one, and if it doesn't, we simply instruct our client to establish a new one. Taking it one step further, let's make the user service repository atomic:
// The derive used below is AcidRepository, so that is the macro to import
use hextacy_derive::AcidRepository;
use hextacy::db::{AtomicConnection, Transaction};
use hextacy::clients::db::{Client, DBConnect};
use std::{marker::PhantomData, sync::Arc};

#[derive(Debug, Clone, AcidRepository)]
#[postgres(Conn)]
pub struct Repository<C, Conn, User>
where
    C: DBConnect<Connection = Conn>,
    User: UserRepository<Conn>,
{
    pub postgres: Client<C, Conn>,
    // Type provided for convenience which is equivalent to RefCell<Option<Conn>>
    pub pg_tx: Transaction<Conn>,
    user: PhantomData<User>,
}
Now, instead of simply establishing a connection and calling User::get_paginated, we first have to check whether an open connection exists:
impl<C, Conn, User> RepositoryContract for Repository<C, Conn, User>
where
    Self: AcidRepositoryAccess<Conn>,
    C: DBConnect<Connection = Conn>,
    User: UserRepository<Conn>,
{
    fn get_paginated(
        &self,
        page: u16,
        per_page: u16,
        sort: Option<user::SortOptions>,
    ) -> Result<Vec<user::User>, Error> {
        let mut conn = self.connect()?;
        // Use atomic! to reduce this boilerplate
        match conn {
            hextacy::db::AtomicConnection::New(mut conn) => {
                User::get_paginated(&mut conn, page, per_page, sort).map_err(Error::new)
            }
            hextacy::db::AtomicConnection::Existing(mut conn) => {
                User::get_paginated(conn.borrow_mut().as_mut().unwrap(), page, per_page, sort)
                    .map_err(Error::new)
            }
        }
    }
}
To reduce the boilerplate around matching whether a connection exists, the atomic! macro can be utilised to perform the query. It does exactly what's written above.
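How a macro can hide that match is sketched below with std only. This is not hextacy's atomic! macro: `with_conn!`, `AtomicConn` and `push_row` are hypothetical, and the `Existing` variant is assumed to hold a reference to the transaction's ref cell.

```rust
use std::cell::RefCell;

/// Either a freshly opened connection or the slot of a running transaction.
pub enum AtomicConn<'a, C> {
    New(C),
    Existing(&'a RefCell<Option<C>>),
}

/// Binds `$c` to a `&mut C` in both cases and evaluates `$body` with it.
macro_rules! with_conn {
    ($conn:expr, |$c:ident| $body:expr) => {
        match $conn {
            AtomicConn::New(mut fresh) => {
                let $c = &mut fresh;
                $body
            }
            AtomicConn::Existing(cell) => {
                let mut guard = cell.borrow_mut();
                let $c = guard.as_mut().expect("transaction slot is empty");
                $body
            }
        }
    };
}

/// Example "query": a `Vec<u32>` plays the role of the connection.
pub fn push_row(conn: AtomicConn<'_, Vec<u32>>, row: u32) -> usize {
    with_conn!(conn, |c| {
        c.push(row);
        c.len()
    })
}
```

With a `New` connection the write lands on a throwaway value, while with `Existing` it lands in the shared slot and stays visible to the rest of the transaction.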
Notice that Repository is changed to AcidRepository and RepositoryAccess is changed to AcidRepositoryAccess. The access traits are the same, except that the atomic version returns an AtomicConnection<C> and requires the repository to implement Atomic, which AcidRepository does behind the scenes:
use hextacy::db::{Atomic, DatabaseError, TransactionError};
use diesel::connection::AnsiTransactionManager;

impl</* ..bounds.. */> Atomic for Repository< /* ..bounds.. */, PgPoolConnection>
where /* ..bounds.. */
{
    fn start_transaction(&self) -> Result<(), DatabaseError> {
        let mut tx = self.transaction.borrow_mut();
        match *tx {
            Some(_) => Err(DatabaseError::Transaction(TransactionError::InProgress)),
            None => {
                let mut conn = self.client.connect()?;
                AnsiTransactionManager::begin_transaction(&mut *conn)?;
                *tx = Some(conn);
                Ok(())
            }
        }
    }

    fn rollback_transaction(&self) -> Result<(), DatabaseError> {
        let mut tx = self.transaction.borrow_mut();
        match tx.take() {
            Some(ref mut conn) => AnsiTransactionManager::rollback_transaction(&mut **conn)
                .map_err(DatabaseError::from),
            None => Err(DatabaseError::Transaction(TransactionError::NonExisting).into()),
        }
    }

    fn commit_transaction(&self) -> Result<(), DatabaseError> {
        let mut tx = self.transaction.borrow_mut();
        match tx.take() {
            Some(ref mut conn) => {
                AnsiTransactionManager::commit_transaction(&mut **conn)
                    .map_err(DatabaseError::from)
            }
            None => Err(DatabaseError::Transaction(TransactionError::NonExisting).into()),
        }
    }
}
Atomic implementations need concrete types, since they must know which transaction manager to use to operate on the connection.
Thankfully, AcidRepository does this for us. One more shortcut that can be used is the acid_repo! macro, which functions the same as repository! except with the addition of the transaction field and the atomic access implementation.
Business level services can now utilise the three methods to perform transactions as they see fit. To reduce the boilerplate associated with them, we can utilise the transaction! macro.
This macro takes in a callback that must return a result. Before the callback starts, start_transaction will be called; then, depending on the result, the transaction will either be committed or rolled back.
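The control flow just described can be written as a plain function. The sketch below is an assumption about the behaviour, not the macro's actual expansion: the `Atomic` trait here uses `String` errors for brevity, and `in_transaction` and `Recorder` are hypothetical names.

```rust
use std::cell::RefCell;

/// Simplified stand-in for the three transaction methods.
pub trait Atomic {
    fn start_transaction(&self) -> Result<(), String>;
    fn commit_transaction(&self) -> Result<(), String>;
    fn rollback_transaction(&self) -> Result<(), String>;
}

/// Start, run the callback, then commit on `Ok` or roll back on `Err`.
pub fn in_transaction<R, T, F>(repo: &R, callback: F) -> Result<T, String>
where
    R: Atomic,
    F: FnOnce() -> Result<T, String>,
{
    repo.start_transaction()?;
    match callback() {
        Ok(value) => {
            repo.commit_transaction()?;
            Ok(value)
        }
        Err(err) => {
            repo.rollback_transaction()?;
            Err(err)
        }
    }
}

/// Test double that records which transaction methods were called.
pub struct Recorder(pub RefCell<Vec<&'static str>>);

impl Atomic for Recorder {
    fn start_transaction(&self) -> Result<(), String> {
        self.0.borrow_mut().push("start");
        Ok(())
    }

    fn commit_transaction(&self) -> Result<(), String> {
        self.0.borrow_mut().push("commit");
        Ok(())
    }

    fn rollback_transaction(&self) -> Result<(), String> {
        self.0.borrow_mut().push("rollback");
        Ok(())
    }
}
```

Note that in this sketch a failed rollback replaces the callback's error; a real implementation may prefer to report both.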
To elaborate further, here's what a repository would look like:
- repository/user.rs
pub trait UserRepository<C> {
    fn get_paginated(
        conn: &mut C,
        page: u16,
        per_page: u16,
        sort_by: Option<SortOptions>,
    ) -> Result<Vec<User>, AdapterError>;
}
The adapter just implements the UserRepository trait and returns the model using its specific ORM. This concludes the architectural part (for now... :).
hextacy
Feature flags:
- full - Enables all the features below
- db - Enables mongo, diesel and redis
- ws - Enables the WS session adapter and message broker
- diesel - Enables the diesel postgres client and derive macros
- mongo - Enables the mongodb client and derive macros
- redis - Enables the redis client and cache access trait
- email - Enables the SMTP client and lettre
- db
Contains a collection of traits to implement on structures that access databases and interact with repositories. Provides macros to easily generate repository structs as shown in the example.
- clients
Contains structures implementing client specific behaviour such as connecting to and establishing connection pools with database, cache, smtp and http servers. All the connections made here are generally shared throughout the app with Arcs.
- logger
The logger module utilizes the tracing, env_logger and log4rs crates to set up logging either to stdout or a server.log file, whichever suits your needs better.
- crypto
Contains cryptographic utilities for encrypting and signing data and generating tokens.
- web
Contains various helpers and utilities for HTTP and websockets.
  - http
  The most notable here are the Default security headers middleware for HTTP (sets all the recommended security headers for each request as described here) and the Response trait, a utility trait that can be implemented by any struct that needs to be turned into an HTTP response. Also some cookie helpers.
  - ws
  Module containing a Websocket session handler.
  Every message sent to this handler must have a top level "domain" field. Domains are completely arbitrary and are used to tell the ws session which datatype to broadcast. Domains are internally mapped to data types. Actors can subscribe via the broker to specific data types they are interested in and WS session actors will in turn publish them whenever they receive any from their respective clients.
  Registered data types are usually enums which are then matched in handlers of the receiving actors. Enums should always be untagged, so as to avoid unnecessary nesting in messages from the client sockets.
  Uses an implementation of a broker utilising the actix framework, a very cool message based communication system based on the Actor model.
  Check out the web::ws module for more info and an example of how it works.
- cache
Contains a cacher trait which can be implemented for services that require access to the cache. Each service must have its cache domain and identifiers for cache separation. The CacheAccess and CacheIdentifier traits can be used for such purposes.
A note on middleware
The structure is similar to the endpoints as demonstrated above. If you're interested in a bit more detail about how Actix's middleware works, here's a nice blog post you can read. By wrapping resources with middleware we get access to the request before it actually hits the handler. This enables us to append any data to the request for use by the designated handler. Essentially, we have to implement the Transform trait for the middleware and the Service trait for the actual business logic.
If you take a look at the auth middleware you'll notice how our Transform implementation, specifically the new_transform function, returns a future whose output value is a result containing either the AuthMiddleware or an InitError, which is a unit type. If you take a look at the signature for Actix's wrap function you can see that we can pass to it anything that implements Transform. This means that, for example, when we want to wrap a resource with our AuthGuardMiddleware, we have to pass the instantiated AuthGuard struct, because that's the one implementing Transform.
If you take an even closer look at what happens in wrap you'll see that it triggers new_transform internally, meaning the instantiated AuthGuard transforms into an AuthGuardMiddleware which executes all the business logic.
The structure is exactly the same as that of endpoints with the exception of interceptor.rs which contains our Transform and Service implementations. The main functionality of the middleware is located in the call function of the Service implementation.
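The relationship between the guard and the middleware can be shown with a deliberately simplified, synchronous sketch. The `Service` and `Transform` traits below are stand-ins written for this example; Actix's real traits are asynchronous and have more associated types. `Request`, `Handler` and `wrap` are likewise hypothetical.

```rust
pub struct Request {
    pub token: Option<String>,
    pub user: Option<String>,
}

/// Ok(body) or Err(status code).
pub type Response = Result<String, u16>;

/// Simplified, synchronous stand-in for a service.
pub trait Service {
    fn call(&self, req: Request) -> Response;
}

/// Simplified stand-in for a middleware factory.
pub trait Transform<S: Service> {
    type Transform: Service;
    fn new_transform(&self, service: S) -> Self::Transform;
}

/// The factory: this is what gets passed to `wrap`.
pub struct AuthGuard {
    pub secret: String,
}

/// The middleware produced by the factory; it owns the wrapped service.
pub struct AuthGuardMiddleware<S> {
    secret: String,
    service: S,
}

impl<S: Service> Transform<S> for AuthGuard {
    type Transform = AuthGuardMiddleware<S>;

    fn new_transform(&self, service: S) -> Self::Transform {
        AuthGuardMiddleware { secret: self.secret.clone(), service }
    }
}

impl<S: Service> Service for AuthGuardMiddleware<S> {
    fn call(&self, mut req: Request) -> Response {
        let authorized = req.token.as_deref() == Some(self.secret.as_str());
        if !authorized {
            return Err(401);
        }
        // Attach data to the request before it reaches the handler.
        req.user = Some("authenticated".to_string());
        self.service.call(req)
    }
}

pub struct Handler;

impl Service for Handler {
    fn call(&self, req: Request) -> Response {
        Ok(format!("hello {}", req.user.unwrap_or_default()))
    }
}

/// Mimics wrapping: the factory is turned into the middleware around `service`.
pub fn wrap<T, S>(guard: &T, service: S) -> <T as Transform<S>>::Transform
where
    T: Transform<S>,
    S: Service,
{
    guard.new_transform(service)
}
```

The handler never sees unauthenticated requests, and it can rely on the `user` field the middleware attached.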
The config file
We tie all our handlers together in the config.rs file in the server's src directory. With only this one endpoint it would look something like:
pub(super) fn init(cfg: &mut ServiceConfig) {
    let pg = Arc::new(Postgres::new());
    users::setup::routes(pg, cfg);
}
We would then pass this function to our server setup.
HttpServer::new(move || {
    App::new()
        .configure(config::init)
        .wrap(Logger::default())
})
.bind_openssl(addr, builder)?
.run()
.await
Read more about the openssl setup in openssl/README.md
The helpers module contains various helper functions usable throughout the server.
Storage Directory Overview
The storage crate is project specific which is why it's completely separated from the rest. It contains 3 main modules:
- Repository
Contains interfaces for interacting with application models. Their sole purpose is to describe the nature of interaction with the database, they are completely oblivious to the implementation. This module is designed to be as generic as possible and usable anywhere in the service logic.
- Adapters
Contains the client specific implementations of the repository interfaces. Adapters adapt the behaviour dictated by their underlying repository. Separating implementation from behaviour decouples any other module using a repository from the client specific code located in the adapter.
- Models
Where application models are located.
The storage adapters can utilize connections established from the clients module.
XTC - Very much a work in progress
XTC, a.k.a. the CLI tool, provides a way of seamlessly generating and documenting endpoints and middleware.
To set up the CLI tool after cloning the repository, enter
cargo install --path xtc
from the project root.
The list of top level commands can be viewed with the xtc -h command.
The most notable commands are [g]enerate, which sets up endpoint/middleware boilerplate, and [anal]yze, which scans the router and middleware directories and constructs a JSON/YAML file containing endpoint info.
Xtc only works for the project structure described in the architecture section.
The [g]enerate command generates an endpoint structure like the one described in the router. It can generate route [r] and middleware [mw] boilerplate. Contracts can also be supplied to the command with the -c flag followed by the contracts you wish to hook up to the endpoint, comma separated, e.g.
xtc gen route <NAME> -c repository,cache
This will automagically hook up the contracts to the service and set up an infrastructure boilerplate. It will also append pub(crate) mod <NAME> to the router's mod.rs. It also takes in a -p argument which can be used to specify the directory in which you want to set up the endpoint.
The analyze function relies heavily on the syn crate. It analyzes the syntax of the data, handler and setup files and extracts the necessary info to document the endpoint.
All commands take in the -v flag, which stands for 'verbose' and, if set, makes xtc print what it is doing to stdout. By default, all commands are run quietly.
TODO:
- Init project with xtc init