In this blog post, I'll walk you through an open source PR I submitted to SeaORM with Rust.
This is a transparent and honest story that serves as a walk-through of my first open source issue after a long break from the open source world. The aim of this blog post is to demystify the process of contributing to open source projects and help the readers realize that open source contribution is (often) challenging but rewarding and fun.
While I use a contribution to Rust’s SeaORM as an example, you do not need any knowledge of Rust to follow this post’s thought process and procedure. The lessons here might help you start your open source journey in any language or ecosystem.
If you like this post or have questions, feel free to share it and reach out to me on Twitter.
Alrighty, hop on the wagon and let us begin this story!
Tl;dr
I know many of you won’t have the time or patience to read my blah blah, so here are the main takeaways for contributing to open source projects:
- Pick a project that not only relates to your work or hobby but is also beginner-friendly with maintainers who are willing to offer mentorship and guidance.
- Pick an issue that’s beginner-friendly and well-scoped (small to medium in size) so that you can understand the issue at hand without being overwhelmed.
- Communicate actively with the maintainers about the issue and possible solutions so that they are on board with your ideas, and vice versa. This way, you save both the maintainers’ time and your own when it comes to code review.
- If you run into a wall, never hesitate to ask the maintainers for help with detailed, well-formulated questions. If one doesn’t answer, ask another. Maintainers are usually helpful and friendly humans.
- Don’t forget to be polite, communicative, and helpful to others in the open source community 🫶
Motivation to write about open source contribution
As software engineers, we are often advised to contribute to open source projects as a way to hone our reading and writing skills as well as to give back to the community. However, many new programmers, including myself, tend to feel very intimidated by open source codebases that might appear foreign, abstract, or overwhelming.
Because of this, I decided to create this blog post (and maybe more future posts) to show other open source enthusiasts my process as well as my inner dialogue when working on open source contributions.
To make the post more readable and less fragmented, I present the entire story as a more or less linear storyline. Please keep in mind that the reality is much more complex and iterative. There are usually many more struggles going into a first contribution to an unfamiliar project, but it is all worth the effort.
Pick a beginner-friendly project
I was trying out Rust Tokio’s axum web framework for building APIs when I encountered a mild issue: I am not particularly good at raw SQL.
In my previous job, I worked mostly in Django (with a little bit of Go), which has an amazing ORM that spoiled me. I am certainly happy and willing to hone my SQL skills, but I also wondered if there was a decent ORM framework in Rust.
By chance, I landed on the homepage of SeaQL, the project behind the Rust crate SeaORM, an async, dynamic, and testable ORM that supports Postgres, MySQL, and SQLite at the time this blog post was written. It seems like a project with a lot of care from its creators, so I decided to see if there’s an opportunity for doing some open source contribution.
It’s good to reiterate an important point in open source contribution: Choose a project that matters to us! Working on a codebase means that we must play with the code ourselves. If the project does not relate to our work or hobby, then we’d be reluctant to play with the code or to test the code we write for the project. I’m a backend engineer and I interact with databases all the time, so an ORM for Rust seems meaningful to me and I’d love to play with SeaORM.
Choose a beginner-friendly issue
The first thing I did was skim through the open issues of SeaORM. Luckily, SeaORM’s maintainers are very active and it was easy for me to spot a `good first issue`: Add flag to `sea-orm-cli` to generate code for `time` crate #661.
Keep in mind that some projects aren’t actively labelling `good first issue` issues and some maintainers don’t have the time or capacity to mentor or guide new contributors.
I personally avoid those projects because, having programmed professionally for only 2 years, I don’t see myself as very good at navigating foreign terrain in a codebase. With more years of experience, we’d become better at this.
Understand the issue
Back to the story. As the author of the issue put it: Since the `time` crate is supported as alternative to `chrono`, it should be possible to generate code for `time` crate.
For those who don’t know enough about Rust, there are 2 crates in Rust that deal with time and date, `time` and `chrono` (note that a `crate` is the equivalent of a `gem` in Ruby or a `package` in Python and JavaScript).
The problem at hand seemed to be that `sea-orm-cli`, the command line tool for `SeaORM`, generated only code that corresponds to `chrono`’s `datetime` types, and some users would like to have a feature in the command line tool to generate `time`’s `datetime` types without having to write custom code themselves.
Define questions for solving the issue
Having had very little experience writing `CLI` tools in Rust (or any language for that matter), I laid down some questions that I needed to clarify in order to work on this issue:
- Where in the codebase are `chrono`’s `datetime` types being generated?
- How do I use `sea-orm-cli` to reproduce the reported issue so that I can progressively iterate towards a solution?
- What kind of flag would the maintainers be happy to accept for the finished feature?
Find the relevant code for the issue at hand
I posted the first question to the issue’s thread:
I'm looking for a `good first issue` :) Just wondering how one is generating code for `chrono` right now?
Billy, one of the maintainers, was quick to reply:
Hey @nahuakang, welcome! You can take a look at:
// sea-orm/sea-orm-codegen/src/entity/column.rs
// https://github.com/SeaQL/sea-orm/blob/86e7e808b37179315a1cc5c6c852764830c04661/sea-orm-codegen/src/entity/column.rs#L47-L51
ColumnType::Date => "Date".to_owned(),
ColumnType::Time(_) => "Time".to_owned(),
ColumnType::DateTime(_) => "DateTime".to_owned(),
ColumnType::Timestamp(_) => "DateTimeUtc".to_owned(),
ColumnType::TimestampWithTimeZone(_) => "DateTimeWithTimeZone".to_owned(),
Okay, at least now I have some code that I can read. I felt some adrenaline kicking into my system now that I got the first thread of code to investigate. But my dearest enemy, imposter syndrome, also kicked in.
I started questioning myself and wondered if I’d actually be able to come up with a solution. After all, some `good first issue` issues aren’t that easy. What if I can’t solve it and embarrass myself? Because of this, I procrastinated.
Communicate with maintainers before coding
As a seasoned procrastinator, I waited for a day before I posted the next two questions:
@billy1624 Thanks! I'm reading the docs as well as trying to understand the workflow with `sea-orm` now. Are there any examples/tests that demonstrate how to use `sea-orm-cli` to generate code for entities involving `chrono`? I haven't seen a `chrono` flag in `sea-orm-cli` so I wonder if the solution here is to add a flag for `time` crate or something else (like handling `time` types)?
Luckily, it was the weekend so it took Billy a few days to respond, by which time I had calmed down from my imposter anxiety:
Hey @nahuakang, you can simply create a database table with timestamp / datetime columns. Then, follow the steps on https://www.sea-ql.org/SeaORM/docs/generate-entity/sea-orm-cli, to generate entity files with `sea-orm-cli`. We don't have a `chrono` flag for `sea-orm-cli`, since it's the default crate to represent datetime crate as of now. I think we can add an option, `--date-time-crate`, to `sea-orm-cli generate entity`. It will take values such as `chrono` / `time` with `chrono` being the default. This way it's backward compatible and users can opt-in to it. Thoughts?
Okay, so the flag for this option should be `--date-time-crate` and we should run the new feature like:
# To use the time crate for the command
$ sea-orm-cli generate entity --date-time-crate=time -u postgres://nahua:password@localhost:5432/timetest -o src/entity
# To use the chrono crate for the command
$ sea-orm-cli generate entity --date-time-crate=chrono -u postgres://nahua:password@localhost:5432/timetest -o src/entity
This is good. So I claimed this issue and began working towards a solution in a format that I knew the maintainers would be willing to accept.
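To make the agreed behavior concrete, here is a minimal sketch of the flag's semantics in plain Rust, using only the standard library. It is not how `sea-orm-cli` parses arguments (the real tool uses `clap`); the function name `parse_date_time_crate` and the `ALLOWED` constant are made up for illustration. It accepts both `--date-time-crate=time` and `--date-time-crate time`, falls back to `chrono` when the flag is absent, and rejects anything else.

```rust
// Hypothetical stand-in for the flag handling; the real CLI relies on clap.
const ALLOWED: [&str; 2] = ["chrono", "time"];

/// Returns the chosen crate name, defaulting to "chrono" when the flag is absent.
fn parse_date_time_crate(args: &[&str]) -> Result<String, String> {
    let mut value: Option<&str> = None;
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if let Some(v) = arg.strip_prefix("--date-time-crate=") {
            // Form 1: --date-time-crate=time
            value = Some(v);
        } else if *arg == "--date-time-crate" {
            // Form 2: --date-time-crate time (the value is the next argument)
            let next = iter
                .next()
                .copied()
                .ok_or("missing value for --date-time-crate")?;
            value = Some(next);
        }
    }
    // Backward compatible: no flag means chrono, exactly as before the feature.
    let chosen = value.unwrap_or("chrono");
    if ALLOWED.contains(&chosen) {
        Ok(chosen.to_owned())
    } else {
        Err(format!("unsupported value '{}'", chosen))
    }
}

fn main() {
    let args = ["generate", "entity", "--date-time-crate=time"];
    println!("{:?}", parse_date_time_crate(&args));
}
```

The default is what keeps the change opt-in: existing users who never pass the flag keep getting `chrono` types.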
Familiarize ourselves with the documentation
Quick pause to clarify some important details before jumping into the solution section.
During this entire time as I communicated with Billy, I was actively reading SeaORM’s documentation. It’s a new ORM and it does things differently from what I did in the Python world.
The most important bits I learned from the documentation that helped me solve the issue were:
- `Entity`: The representation of a table in SeaORM. In the column types section, I found out about how the `time` and `chrono` Rust `datetime` types correspond to SeaORM’s types.
- `sea-orm-cli`: I learned how to use the `CLI` tool that I was now supposed to fix.
Reproduce the issue
To reproduce the existing issue, I needed to do a few things:
- A minimal Rust cargo project to generate the `entities` in
- A running database of my choice that `SeaORM` can connect to (Postgres)
- Generate a table full of date & time types
- Run the command to confirm the issue
- Investigate the code more to understand where I could add the feature to solve the issue
1. Create a minimal cargo project
This is really easy in Rust. I simply ran the following command:
$ cargo new playground
$ cd playground
$ tree
.
├── Cargo.toml
├── README.md
└── src
    └── main.rs
Since I knew I’d only be generating entity files (in any directory of my choice) to observe whether `sea-orm-cli` could generate code that has the correct types for `chrono` or `time` given the user’s choice, this barebone project would suffice without any extra boilerplate code.
I installed the latest distribution of `sea-orm-cli` with the following command:
$ cargo install sea-orm-cli
2. Spin up a Postgres DB
I have Postgres installed on my MacBook, but I really like using Docker for spinning up a throwaway DB that I can test in, so I copy-pasted a very simple `docker-compose` file that I use a lot for job interviews:
version: "3.3"
services:
  db:
    image: postgres:latest
    container_name: postgres
    restart: always
    ports:
      - 5432:5432
    environment:
      - POSTGRES_USER=nahua
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=timetest
    volumes:
      # The official postgres image keeps its data in /var/lib/postgresql/data
      - timetest:/var/lib/postgresql/data
  adminer:
    image: adminer:latest
    restart: always
    ports:
      - 8080:8080
volumes:
  timetest:
This is not a tutorial on Docker, so the gist is that with this `docker-compose` file, I can spin up a `postgres` database by simply running the following command:
# Using -d flag for detach mode
$ docker compose up -d
The database would have a user `nahua` with a password `password` as well as a DB named `timetest` that’s listening on port 5432. To examine the database easily, I could log in to `adminer` on `localhost:8080`.
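These settings are exactly the pieces of the connection URL that `sea-orm-cli` receives through its `-u` flag. As a small illustration, the sketch below assembles that URL from its parts. The helper `postgres_url` is hypothetical and not part of SeaORM, and it assumes the user name and password contain no characters that would need percent-encoding.

```rust
/// Builds a Postgres connection URL of the form
/// postgres://user:password@host:port/database.
/// Hypothetical helper; assumes no special characters in user or password.
fn postgres_url(user: &str, password: &str, host: &str, port: u16, database: &str) -> String {
    format!(
        "postgres://{}:{}@{}:{}/{}",
        user, password, host, port, database
    )
}

fn main() {
    // Values taken from the docker-compose file above.
    let url = postgres_url("nahua", "password", "localhost", 5432, "timetest");
    println!("{}", url);
}
```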
3. Create a test DB table
With the DB spinning, I procrastinated again because I dreaded writing raw SQL commands, not knowing exactly which types I should write.
About a day later, I overcame my inertia and logged in to the database to create the following test table that was filled with some relevant date & time types:
# Access the running database as the user "nahua" for the DB "timetest"
# Then run SQL command to create a table called "time_tests"
$ docker exec -it postgres /usr/bin/psql -U nahua -d timetest
psql (14.2 (Debian 14.2-1.pgdg110+1))
Type "help" for help.
timetest=# CREATE TABLE time_tests (
    id SERIAL NOT NULL PRIMARY KEY,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_date DATE NOT NULL DEFAULT NOW(),
    created_time TIME NOT NULL DEFAULT NOW()
);
timetest=# \q
4. Reproduce the issue for sanity check
I ran the following command as specified in the documentation and confirmed that I had everything set up properly for testing my code later on:
$ sea-orm-cli generate entity -u postgres://nahua:password@localhost:5432/timetest -o .
In essence, `sea-orm-cli` would connect to the database I spun up, fetch the schema of the `time_tests` table, and translate it into Rust code, i.e. `entities` files. Because I passed `-o .`, the files landed in the root directory of my barebone project rather than in `src/`.
This would generate the following files:
$ tree
.
├── Cargo.toml
├── mod.rs
├── prelude.rs
├── seaql_migrations.rs
├── src
│   └── main.rs
└── time_tests.rs
And `time_tests.rs` is the file of interest here:
$ cat time_tests.rs
//! SeaORM Entity. Generated by sea-orm-codegen 0.8.0
use sea_orm::entity::prelude::*;
#[derive(Clone, Debug, PartialEq, DeriveEntityModel)]
#[sea_orm(table_name = "time_tests")]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub created_at: DateTimeWithTimeZone,
    pub created_date: Date,
    pub created_time: Time,
}
#[derive(Copy, Clone, Debug, EnumIter)]
pub enum Relation {}
impl RelationTrait for Relation {
    fn def(&self) -> RelationDef {
        panic!("No RelationDef")
    }
}
impl ActiveModelBehavior for ActiveModel {}
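To get a feel for what "translating a schema into Rust code" means, here is a deliberately tiny sketch that renders a `Model` struct from a list of column names and Rust type names. It only illustrates the idea: the real `sea-orm-codegen` works with token streams rather than plain strings, and the function `render_model` is a made-up name.

```rust
/// Renders a simplified `Model` struct as source text.
/// `columns` holds (field name, Rust type name) pairs.
/// Hypothetical helper; sea-orm-codegen itself emits token streams.
fn render_model(table_name: &str, columns: &[(&str, &str)]) -> String {
    let mut out = String::new();
    out.push_str(&format!("#[sea_orm(table_name = \"{}\")]\n", table_name));
    out.push_str("pub struct Model {\n");
    for (name, rs_type) in columns {
        out.push_str(&format!("    pub {}: {},\n", name, rs_type));
    }
    out.push_str("}\n");
    out
}

fn main() {
    let columns = [
        ("id", "i32"),
        ("created_at", "DateTimeWithTimeZone"),
        ("created_date", "Date"),
        ("created_time", "Time"),
    ];
    print!("{}", render_model("time_tests", &columns));
}
```

Seen this way, the feature boils down to choosing a different type name string for each date and time column.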
When I’m done with my feature, I’d be able to specify a `--date-time-crate` flag with a value so that the field types in the `time_tests` model would be the following for the `chrono` crate:
pub id: i32,
pub created_at: DateTimeWithTimeZone,
pub created_date: Date,
pub created_time: Time,
and the following for the `time` crate:
pub id: i32,
pub created_at: TimeDateTimeWithTimeZone,
pub created_date: TimeDate,
pub created_time: TimeTime,
This information was gathered from the documentation for column types.
Work towards a feature
Now I started investigating the code snippet Billy gave me more carefully. A few things I kept in mind were:
- How does the `sea-orm-cli` code work, and how could I insert a new flag `--date-time-crate`?
- Once the user could specify a `--date-time-crate` value, how do I pass this information down to the part of the code that Billy shared, i.e. into the `Column.get_rs_type` method that was in charge of choosing between `chrono` types and `time` types and translating them into Rust types for the output files we had above?
- Which other functions or objects are using `Column.get_rs_type`? I would have to adjust them as part of the clean-up after implementing the new feature.
1. The Structure of SeaORM project
As the `Cargo.toml` suggests, `sea-orm` has the following workspace members:
[workspace]
members = [".", "sea-orm-macros", "sea-orm-codegen"]
Examining the structure, we can see that `sea-orm-cli` is sort of a stand-alone project living inside `sea-orm`, and `sea-orm-codegen` is the workspace member that contains the `Column.get_rs_type` method that Billy originally pointed me to.
$ pwd
/path/to/our/seaql/sea-orm
$ tree -L 1
...
├── Cargo.toml
...
├── sea-orm-cli
├── sea-orm-codegen
...
├── src
...
└── tests
This means that my feature would almost exclusively reside in the `sea-orm-cli/` and `sea-orm-codegen/` directories.
2. Adding a new CLI flag
By examining the file `sea-orm/sea-orm-cli/src/cli.rs`, I quickly familiarized myself with how the project uses `clap` to generate CLI commands and flags and wrote a new flag for `--date-time-crate` (click for a GitHub view):
// sea-orm-cli/src/cli.rs
pub fn build_cli() -> App<'static, 'static> {
    ...
        ).arg(
            Arg::with_name("DATE_TIME_CRATE")
                .long("date-time-crate")
                .help("The datetime crate to use for generating entities.")
                .takes_value(true)
                .possible_values(&["chrono", "time"])
                .default_value("chrono")
        ),
    ...
}
3. Refactor; add a context struct and date time crate enum
I also traced the usage of `Column.get_rs_type` all the way back to `sea-orm/sea-orm-cli/src/commands.rs` where the following lines would generate the entities:
// sea-orm-cli/src/commands.rs
pub async fn run_generate_command(matches: &ArgMatches<'_>) -> Result<(), Box<dyn Error>> {
    ...
    let output = EntityTransformer::transform(table_stmts)?
        .generate(expanded_format, WithSerde::from_str(with_serde).unwrap());
    ...
}
So I somehow needed to pass the user’s `date-time-crate` choice into the `EntityWriter.generate` method so that `Column.get_rs_type` would eventually pick up this data and act accordingly. Essentially, I needed some sort of a context struct for this job.
I talked with Billy about this and he pointed me to his comment on another standing PR:
Hey @negezor, thanks for the updates! I think the semantic isn't seems right. The transformer, `EntityTransformer::transform(table_stmts, name_resolver)`, don't need name resolver. Instead, we could have introduce a new struct called `EntityWriterContext` which contains three things...
pub struct EntityWriterContext {
    pub(crate) expanded_format: bool,
    pub(crate) with_serde: WithSerde,
    pub(crate) name_resolver: NameResolver,
}
Then, `EntityWriter::generate` method would take an `EntityWriterContext`. Thoughts?
This was very helpful! So I wrote the following struct with a `new` method that takes the values needed by `EntityWriter.generate` to generate the entities:
// sea-orm-codegen/src/entity/writer.rs
#[derive(Debug)]
pub struct EntityWriterContext {
    pub(crate) expanded_format: bool,
    pub(crate) with_serde: WithSerde,
    pub(crate) date_time_crate: DateTimeCrate,
}
impl EntityWriterContext {
    pub fn new(
        expanded_format: bool,
        with_serde: WithSerde,
        date_time_crate: DateTimeCrate,
    ) -> Self {
        Self {
            expanded_format,
            with_serde,
            date_time_crate,
        }
    }
}
Meanwhile, let’s also work out a `DateTimeCrate` enum so that we could store information about which crate the user chooses:
// sea-orm-codegen/src/entity/writer.rs
use std::str::FromStr;

#[derive(Debug)]
pub enum DateTimeCrate {
    Chrono,
    Time,
}
impl FromStr for DateTimeCrate {
    type Err = crate::Error;
    // Map the flag's string value onto an enum variant; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "chrono" => Self::Chrono,
            "time" => Self::Time,
            v => {
                return Err(crate::Error::TransformError(format!(
                    "Unsupported enum variant '{}'",
                    v
                )))
            }
        })
    }
}
These changes are all available here on GitHub.
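Implementing `FromStr` also makes `str::parse` available for the type. The following stand-alone sketch mirrors the enum above but swaps in a local error type (`ParseCrateError`, made up here) because `crate::Error` only exists inside `sea-orm-codegen`. It shows how a caller might handle an unsupported value explicitly instead of calling `unwrap`.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
enum DateTimeCrate {
    Chrono,
    Time,
}

// Local stand-in for the codegen crate's error type (an assumption for this sketch).
#[derive(Debug, PartialEq)]
struct ParseCrateError(String);

impl FromStr for DateTimeCrate {
    type Err = ParseCrateError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "chrono" => Ok(Self::Chrono),
            "time" => Ok(Self::Time),
            other => Err(ParseCrateError(format!(
                "Unsupported enum variant '{}'",
                other
            ))),
        }
    }
}

fn main() {
    // `parse` works for any type that implements `FromStr`.
    for input in ["chrono", "time", "jiff"].iter() {
        match input.parse::<DateTimeCrate>() {
            Ok(choice) => println!("{} -> {:?}", input, choice),
            Err(err) => println!("{} -> error: {}", input, err.0),
        }
    }
}
```

Note that the match is case-sensitive, so `Time` would be rejected just like any other unknown value.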
With the `EntityWriterContext` struct and the `DateTimeCrate` enum ready, I could basically write the following in `sea-orm-cli` to let the command pass the `--date-time-crate` information to the code that generates the entity files:
// sea-orm-cli/src/commands.rs
pub async fn run_generate_command(matches: &ArgMatches<'_>) -> Result<(), Box<dyn Error>> {
    ...
    let date_time_crate = args.value_of("DATE_TIME_CRATE").unwrap();
    ...
    let writer_context = EntityWriterContext::new(
        expanded_format,
        WithSerde::from_str(with_serde).unwrap(),
        DateTimeCrate::from_str(date_time_crate).unwrap(),
    );
    let output = EntityTransformer::transform(table_stmts)?.generate(&writer_context);
}
4. Sew the thread
Now that we’ve introduced an `EntityWriterContext`, we must update all the methods that would be affected by it until we’ve reached the `Column.get_rs_type` method. This part was quite mechanical and rather easy, and the changes can be found here and here, mostly in the `sea-orm-codegen/src/entity/base_entity.rs` file and the `sea-orm-codegen/src/entity/writer.rs` file.
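The shape of that "threading" work can be sketched with a toy call chain. Everything below is simplified and hypothetical (`WriterContext`, `write_model`, `write_field`, and `date_type_name` are not SeaORM names): the outer functions merely pass the context along by reference, and only the innermost function actually looks at the crate choice.

```rust
#[derive(Debug, Clone, Copy)]
enum DateTimeCrate {
    Chrono,
    Time,
}

// Simplified stand-in for a writer context; the real one carries more fields.
struct WriterContext {
    date_time_crate: DateTimeCrate,
}

// Innermost function: the only place that cares about the crate choice.
fn date_type_name(date_time_crate: &DateTimeCrate) -> &'static str {
    match date_time_crate {
        DateTimeCrate::Chrono => "Date",
        DateTimeCrate::Time => "TimeDate",
    }
}

// Middle layer: forwards just the piece of context the inner call needs.
fn write_field(name: &str, ctx: &WriterContext) -> String {
    format!("pub {}: {},", name, date_type_name(&ctx.date_time_crate))
}

// Outer layer: takes the whole context and hands it down unchanged.
fn write_model(date_fields: &[&str], ctx: &WriterContext) -> Vec<String> {
    date_fields.iter().map(|f| write_field(f, ctx)).collect()
}

fn main() {
    let ctx = WriterContext {
        date_time_crate: DateTimeCrate::Time,
    };
    for line in write_model(&["created_date", "updated_date"], &ctx) {
        println!("{}", line);
    }
}
```

Bundling options into one struct means that a future option only changes the struct and the function that consumes it, not every signature in between.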
5. Translate different crate types into correct column types
Finally, we have arrived back at the beginning of this story, when Billy showed me the original code that this feature change should affect:
// sea-orm-codegen/src/entity/column.rs
pub fn get_rs_type(&self) -> TokenStream {
    ...
    #[allow(unreachable_patterns)]
    let ident: TokenStream = match &self.col_type {
        ColumnType::Char(_)
        ...
        ColumnType::Float(_) => "f32".to_owned(),
        ColumnType::Double(_) => "f64".to_owned(),
        ColumnType::Json | ColumnType::JsonBinary => "Json".to_owned(),
        ColumnType::Date => "Date".to_owned(),
        ColumnType::Time(_) => "Time".to_owned(),
        ColumnType::DateTime(_) => "DateTime".to_owned(),
        ColumnType::Timestamp(_) => "DateTimeUtc".to_owned(),
        ColumnType::TimestampWithTimeZone(_) => "DateTimeWithTimeZone".to_owned(),
        ...
}
I decided to just do some nested match expressions to translate the types properly given the variable `date_time_crate` that we’ve passed down from the command line via the `EntityWriterContext` struct:
// sea-orm-codegen/src/entity/column.rs
pub fn get_rs_type(&self, date_time_crate: &DateTimeCrate) -> TokenStream {
    ...
    #[allow(unreachable_patterns)]
    let ident: TokenStream = match &self.col_type {
        ColumnType::Char(_)
        ...
        ColumnType::Float(_) => "f32".to_owned(),
        ColumnType::Double(_) => "f64".to_owned(),
        ColumnType::Json | ColumnType::JsonBinary => "Json".to_owned(),
        ColumnType::Date => match date_time_crate {
            DateTimeCrate::Chrono => "Date".to_owned(),
            DateTimeCrate::Time => "TimeDate".to_owned(),
        },
        ColumnType::Time(_) => match date_time_crate {
            DateTimeCrate::Chrono => "Time".to_owned(),
            DateTimeCrate::Time => "TimeTime".to_owned(),
        },
        ColumnType::DateTime(_) => match date_time_crate {
            DateTimeCrate::Chrono => "DateTime".to_owned(),
            DateTimeCrate::Time => "TimeDateTime".to_owned(),
        },
        ColumnType::Timestamp(_) => match date_time_crate {
            DateTimeCrate::Chrono => "DateTimeUtc".to_owned(),
            // ColumnType::Timestamp(_) => time::PrimitiveDateTime: https://docs.rs/sqlx/0.3.5/sqlx/postgres/types/index.html#time
            DateTimeCrate::Time => "TimeDateTime".to_owned(),
        },
        ColumnType::TimestampWithTimeZone(_) => match date_time_crate {
            DateTimeCrate::Chrono => "DateTimeWithTimeZone".to_owned(),
            DateTimeCrate::Time => "TimeDateTimeWithTimeZone".to_owned(),
        },
        ...
}
Changes can be viewed on GitHub here.
6. Test the code and try it out
Before finishing up, I wrote some unit tests for distinguishing `time` from `chrono` types in the `Column.get_rs_type` method.
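The actual tests live in the PR. As a rough idea of what such tests can look like, here is a self-contained sketch with simplified stand-ins: `ColumnKind` and `rs_type_name` are hypothetical and replace SeaORM's `ColumnType` and `Column.get_rs_type`. The mapping uses a single match on a tuple, which is one possible alternative to nested matches, and the type names come from the snippet above.

```rust
#[derive(Debug, Clone, Copy)]
enum DateTimeCrate {
    Chrono,
    Time,
}

// Simplified stand-in for the date and time variants of a column type.
#[derive(Debug, Clone, Copy)]
enum ColumnKind {
    Date,
    Time,
    DateTime,
    Timestamp,
    TimestampWithTimeZone,
}

/// Maps a column kind and crate choice to the generated Rust type name.
fn rs_type_name(kind: ColumnKind, date_time_crate: DateTimeCrate) -> &'static str {
    match (kind, date_time_crate) {
        (ColumnKind::Date, DateTimeCrate::Chrono) => "Date",
        (ColumnKind::Date, DateTimeCrate::Time) => "TimeDate",
        (ColumnKind::Time, DateTimeCrate::Chrono) => "Time",
        (ColumnKind::Time, DateTimeCrate::Time) => "TimeTime",
        (ColumnKind::DateTime, DateTimeCrate::Chrono) => "DateTime",
        (ColumnKind::DateTime, DateTimeCrate::Time) => "TimeDateTime",
        (ColumnKind::Timestamp, DateTimeCrate::Chrono) => "DateTimeUtc",
        (ColumnKind::Timestamp, DateTimeCrate::Time) => "TimeDateTime",
        (ColumnKind::TimestampWithTimeZone, DateTimeCrate::Chrono) => "DateTimeWithTimeZone",
        (ColumnKind::TimestampWithTimeZone, DateTimeCrate::Time) => "TimeDateTimeWithTimeZone",
    }
}

fn main() {
    let kinds = [
        ColumnKind::Date,
        ColumnKind::Time,
        ColumnKind::DateTime,
        ColumnKind::Timestamp,
        ColumnKind::TimestampWithTimeZone,
    ];
    for kind in kinds.iter() {
        println!(
            "{:?}: chrono = {}, time = {}",
            kind,
            rs_type_name(*kind, DateTimeCrate::Chrono),
            rs_type_name(*kind, DateTimeCrate::Time)
        );
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn chrono_keeps_the_existing_names() {
        assert_eq!(rs_type_name(ColumnKind::Date, DateTimeCrate::Chrono), "Date");
        assert_eq!(rs_type_name(ColumnKind::Timestamp, DateTimeCrate::Chrono), "DateTimeUtc");
    }

    #[test]
    fn time_uses_the_prefixed_names() {
        assert_eq!(rs_type_name(ColumnKind::Date, DateTimeCrate::Time), "TimeDate");
        assert_eq!(rs_type_name(ColumnKind::Timestamp, DateTimeCrate::Time), "TimeDateTime");
    }
}
```

Because the tuple match has no catch-all arm, the compiler rejects the code if a new column kind or crate variant is added without a corresponding mapping.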
Now, since everything is ready, we should test the end product. Let’s install `sea-orm-cli` from our updated local source code by running:
$ pwd
/path/to/our/seaql/sea-orm/sea-orm-cli
# This installs `sea-orm-cli` from our local source code
$ cargo install --path .
Finally, I ran the `sea-orm-cli` command in my barebone project to see if I could generate the correct entity files with the correct types given the `chrono` or `time` crate specification:
$ sea-orm-cli generate entity --date-time-crate time -u postgres://nahua:password@localhost:5432/timetest -o .
$ cat time_tests.rs
//! SeaORM Entity. Generated by sea-orm-codegen 0.8.0
use sea_orm::entity::prelude::*;
#[derive(Clone, Debug, PartialEq, DeriveEntityModel)]
#[sea_orm(table_name = "time_tests")]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub created_at: TimeDateTimeWithTimeZone,
    pub created_date: TimeDate,
    pub created_time: TimeTime,
}
#[derive(Copy, Clone, Debug, EnumIter)]
pub enum Relation {}
impl RelationTrait for Relation {
    fn def(&self) -> RelationDef {
        panic!("No RelationDef")
    }
}
impl ActiveModelBehavior for ActiveModel {}
Awesome. So I wrote up a PR for this issue. Let’s wait and see what the code reviewers say 😉
Final words
Congratulations! You’ve gone through my long and winding blah blah. I hope you’ve learned something from my walk-through and, perhaps, you’ve realized that open source contribution does not need to be that intimidating!
I want to give the good people at SeaQL a quick shout-out and tell the world that they’re awesome for doing open source work! If you can, please sponsor them so that they can continue doing open source work and mentoring new contributors.
In the near future, I hope to write a few more walk-throughs for different projects so as to provide you with a wide range of examples on open source contribution.
Remember, you are well-equipped both mentally and skill-wise to contribute to open source. Take your time, don’t get anxious, and enjoy the process!