- Published on
The Small Engine That Could
- Authors

- Name
- Thomas Quan
- Reading time
15 min read
Table of Contents
- Backstory Time
- It was set up for us
- Yellow Flags
- Breaking points and quick pivot
- Starting on someone else's floor plan
- One codebase, two stores, no real variant support
- Bypassing push first
- Automating deploy while Leaf still compiled
- Production Release was a pain
- Knixteen prep and the actual rebuild
- Extending the engine once it was ours
You don't get many chances to build your own engine to support the workflow of a large e-commerce operation. I was lucky enough to get that chance, and to scale it too.
Backstory Time
A few months after I joined Knix, I got pulled onto the re-platforming project. It was a big initiative but, from what I gathered talking to peers in the office, not a surprising one. It turned out to be a yearly thing.
The goal this time was to move the storefront off Remix + Shopify GraphQL Hydrogen and back onto Shopify Liquid, this time on the latest and greatest version 2. Knix, our main e-commerce store, came first, with Black Friday as the hard deadline. Knixteen, the sister brand, would follow around December on a separate codebase with the same structure.
It was set up for us
When I joined the project, the contract developer had already built the foundational piece that powered our "GitHub x Shopify" Liquid codebase: his custom-made Leaf CLI.
It was demoed to me and my manager as a wrapper around Shopify CLI with a src/ -> dist/ workflow. It included Rollup for JavaScript, PostCSS for styles, and commands like leaf build, leaf pull, and leaf watch for DevOps. As far as I could tell, the only reason to wrap an already-built system was to let us write ES6 JavaScript, and that was all. Because of time constraints, we went along with the structure even though there were some doubts about the reasoning behind it.
Yellow Flags
After a few weeks on the project, the limits were hard to ignore. Our e-commerce setup spans multiple stores, with multiple teams modifying them in tandem, and support for that was basically nonexistent. Pulling theme settings that merchandisers had edited was fragile, to say the least. JS and CSS bundling couldn't be tuned for splitting and minification. There was little to no automation support because the wrapper didn't expose key options such as password prompts, Y/N piping, the live theme flag, the no-delete flag, etc. Everything we wanted to do in the long run hit a dead end. That's not to say we couldn't run a business on it, but it would take a lot of human effort and delicate cross-functional work to keep it going.
All that said, the biggest problem was that we didn't own the wrapper. We were entirely dependent on contract work outside our control for something we wanted to build on long-term. I was never going to fork Leaf, because forking a contract deliverable to run your production pipeline is just a slower way of building the same dependency.
Breaking points and quick pivot
Just weeks before release, we were struggling to figure out how to deploy quickly and safely with all the moving pieces we had. I boldly suggested that we make some critical changes "quickly" and started building the small engine from there. That got a lot of pushback, because we couldn't pause everything and redo it from the start, so I decided to tackle the problem incrementally. The first step wasn't ripping Leaf out. It was figuring out what Shopify CLI could actually do on its own, alongside Leaf, so that version control and deploys could work the way we needed them to. Once I understood that layer, the path to owning the rest became clearer.
Ultimately, the engine I set out to build is one that does all the heavy lifting of pulling, building, and deploying themes, without the human error.
Starting on someone else's floor plan
Leaf wasn't bad for what it was: a way to get a replatform moving on a deadline. You got a sensible source tree:
src/
├── config/ # settings schema + settings_data
├── templates/ # JSON templates
├── sections/
├── snippets/
├── scripts/theme.js
├── styles/theme.css
└── ...
dist/ # what Shopify actually sees
leaf.config.js # limited config for how the build should work
stores.json # which stores to deploy to and pull from
Rollup bundled your JS. PostCSS handled CSS. leaf watch kept dist/ in sync while shopify theme dev hot-reloaded. For kicking off a Liquid replatform while the business was still running promos on the old stack, it worked, but it didn't scale.
[!WARNING] We only use pull to bring down theme settings made by merchandisers, nothing else. That is a clear line between the two teams, and it should not be crossed, for integrity's sake.
stores.json was there to handle deploying to and pulling from multiple themes and multiple stores, even if it came with some hassle, which we'll get to in a bit.
The dependency bothered me more than the code. I'd seen what happens when vendor tooling goes stale and Shopify moves on. Leaf wrapped Shopify CLI behind its own interface, and every improvement I wanted later meant working around that layer instead of building on something we controlled.
I said so early, but the timeline was tight, so we agreed to ship first and pay the cost as we went.
One codebase, two stores, no real variant support
The first real problem wasn't build. It was pull, and the way pull worked dictated how the repo was structured.
leaf pull wrote straight into src/config/ and src/templates/. That is one shared folder for everything, which kept us from switching between store templates even though the stores shared the same codebase:
leaf pull -s ca-store -e production
# Pulling the US store afterwards overwrites
# all the files we have for CA.
leaf pull -s us-store -e staging
Leaf read store and environment from leaf.config.js, called Shopify CLI, and whatever came back became the source of truth on the next commit. No routing. No per-store folders.
[!INFO] FUN FACT: Our git history from that period is full of commits named "leaf pull production" and "leaf pull - settings". Merchandising would tweak something in Shopify admin, someone would pull, and a PR would land touching homepage JSON, page templates, and settings_data.json.
Keeping the staging and preprod themes up to date was a manual hassle too: we had to repeat the pull step every time there was a major change.
With two stores sharing one repo, we didn't really have two stores in the codebase. We had one version. Canada was the default reality on disk most of the time. If you wanted to work on US templates locally, the workflow was roughly: delete your local CA template folder, pull US, do your work, and hope that when you pushed, nothing from Canada bled through and nothing you needed from CA was gone. There was no US theme settings store in the source. There was just src/templates/, and whoever pulled last won.
We never had full US support in the repo in any meaningful sense. US existed in Shopify. In git, it was always fighting CA for the same folder.
At this point, I kept the overall src/ layout Leaf gave us. The pull model had to change eventually, but I couldn't change everything at once, and pull and build were still intertwined through Leaf, so I couldn't cleanly bypass one without the other. So we were left in limbo for a while, storing only CA data in the repo even though we advertised support for both stores.
At this stage, we couldn't introduce a GitHub Actions workflow because there were just too many dynamic variables in play.
Bypassing push first
With build and pull tied to the Leaf CLI, push had to work, or we were going to have a very bad time managing requirements.
Rebuilding the whole toolchain in the middle of a replatform wasn't going to happen. Two groups were already sharing the repo. Merchandising on templates and settings, engineering on Liquid and JS. Both needed to ship without stepping on each other.
What I did first was narrower than it sounds in retrospect. I bypassed push only. Pull still went through Leaf. Build still went through Leaf. They were tied together in the wrapper, and untangling pull without replacing build wasn't something I could do in a week. So the first change was to deploy directly with Shopify CLI and see whether it worked. It turned out it did.
# Build `dist` exactly as before
leaf build
# Bypass leaf deploy and call the underlying Shopify CLI directly
shopify theme push \
  --path dist \
  --store your-store.myshopify.com \
  --theme 123456789 \
  --nodelete \
  --ignore "config/*.json" \
  --ignore "templates/*.json"
# ^ this was a success
leaf build stayed the same and still built into the dist folder. After that, we just told Shopify CLI that dist is the theme folder and deployed it.
That alone changed how the team worked. Developers could push code without overwriting merchandising config sitting in Shopify admin. I wrapped the flags into package.json scripts because nobody should have to remember --ignore patterns from the docs.
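The package.json scripts themselves aren't shown here, so the following is only a sketch of the kind of wrapper such a script could call. The function name push_code_only, its positional arguments, and the DRY_RUN switch are hypothetical. The Shopify CLI flags are the same ones used in the command above.

```shell
# Hypothetical wrapper: push the built dist/ folder without touching
# merchandiser-owned JSON. Function name, arguments, and DRY_RUN are
# illustrative assumptions, not the project's real script.
# Usage: push_code_only <store-domain> <theme-id-or-name>
push_code_only() (
  store="$1"
  theme="$2"

  # Assemble the command as positional parameters so quoting survives.
  set -- shopify theme push \
    --path dist \
    --store "$store" \
    --theme "$theme" \
    --nodelete \
    --ignore "config/*.json" \
    --ignore "templates/*.json"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s ' "$@"
    printf '\n'
  else
    "$@"
  fi
)
```

Running it with DRY_RUN=1 prints the command it would execute, which is a cheap way to check the ignore patterns before anything reaches a real theme.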
Throughout the Knix pre-release period and into those first Black Friday releases, we were still on Leaf recipes, but I'd cut deploy out of the wrapper without touching the compiler or the pull path yet. That turned out to be the right first move, even if it didn't solve the one-codebase problem. It was a small step in the right direction.
Automating deploy while Leaf still compiled
Once direct push worked, we could automate deploys for the dev and pre-release branches.
The convention was simple and it's still how the repo works. Long-lived branches map to named themes in stores.json. Merge to staging, deploy to the staging theme. Merge to preprod, deploy to preprod. Merge to main, deploy to dev-production. Nobody picks a theme ID by hand, though production releases stayed manual because of some limitations.
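As a rough illustration of that convention, a deploy step could resolve the target theme from the branch name. The helper theme_for_branch is hypothetical, and it hardcodes the mapping for brevity, whereas the real mapping lives in stores.json, whose format isn't shown here.

```shell
# Hypothetical helper: resolve the target theme name from a branch name.
# The mapping is hardcoded for illustration; the real one is in stores.json.
theme_for_branch() {
  case "$1" in
    staging) echo "staging" ;;
    preprod) echo "preprod" ;;
    main)    echo "dev-production" ;;
    *)
      echo "no theme mapped for branch: $1" >&2
      return 1
      ;;
  esac
}
```

In a GitHub Actions job this could be called as theme_for_branch "$GITHUB_REF_NAME", so an unmapped branch fails the step instead of deploying somewhere unexpected.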
The early GitHub Actions workflow was: build with Leaf, push with Shopify CLI, per store, per branch, with ignores on the dev store so engineering deploys didn't clobber merchandising work in admin. I added PR labeling by target branch and a release workflow for production. Semver bump, git tag, GitHub Release with the PR description as notes, branch sync. Those were extras on top of the engine itself, but valuable because we were shipping weekly and needed to know what was in prod without asking around.
Production Release was a pain
At this stage, CI was not pulling live settings before every deploy, which was a pain. That came later, after I'd moved store-specific JSON into its own folder structure. Remember, we still didn't have a true multi-store setup, just CA living in the codebase. During pre-release and the first production releases, the US/CA collision problem was still a local workflow problem.
To release a version to production, we had to go through these steps manually:
+----------------------+
| Pull CA templates |
+----------+-----------+
|
v
+----------------------+
| Release CA |
+----------+-----------+
|
v
+----------------------+
| Delete CA templates |
+----------+-----------+
|
v
+----------------------+
| Pull US templates |
+----------+-----------+
|
v
+----------------------+
| Release US |
+----------+-----------+
|
v
+----------------------+
| Delete US templates |
+----------+-----------+
|
v
+----------------------+
| Revert to CA |
+----------------------+
(Note: At this stage, CI was NOT automating these steps on every merge due to too many moving parts.)
For the duration of the Knix pre-release and release period, this is how it went: not perfect, but we got it working in semi-automated steps. Between then and the Knixteen re-platform, I had a chance to revisit this workflow and improve it.
Knixteen prep and the actual rebuild
The work that replaced Leaf entirely and gave us version-controlled stores happened in the second re-platform, which I had the chance to lead. That initiative let me rethink how build, deploy, and pull work.
Knixteen meant another store on the same Liquid base, and the one-folder-for-all-stores model wasn't going to survive another brand. I couldn't keep asking people to wipe local template folders to switch countries, especially new developers on the team. I couldn't keep depending on contract tooling for a codebase we were about to scale across multiple stores. And I wasn't going to fork Leaf to add code splitting or custom watch behavior. That would just be more dependency on work outside our control.
What I ultimately shipped was a custom build script that replaced Leaf entirely, thought through from the ground up to support per-store storage, folder-organized Liquid snippets, JS, and CSS, and native Shopify CLI access exposed for extensibility. Store-specific config and templates now live separately under versions/. This lets us handle multiple stores correctly without having to delete anything.
src/
├── sections/ # shared, same Liquid everywhere
├── snippets/
├── blocks/
├── layout/
├── scripts/ # ES modules, web components
├── styles/ # Tailwind via PostCSS
├── assets/ # fonts, images, not theme.js/theme.css
└── versions/
    ├── ca-store/
    │   ├── config/
    │   └── templates/
    ├── us-store/
    │   ├── config/
    │   └── templates/
    └── knix-dev-store/
        ├── config/
        └── templates/
Compared to Leaf CLI, here is how it works now: at build time, --store ca-store merges the right versions/ folder into dist/config/ and dist/templates/. Everything else is copied from the shared src/, with JS and CSS bundled and minified. Liquid snippets are flattened from their organized folders into the single flat snippets folder Shopify expects.
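The real build script isn't reproduced in this post, so the following is a minimal sketch of just the copy-and-merge half of that step, with no bundling. The function name build_store and its optional src/dist arguments are assumptions made for illustration.

```shell
# Hypothetical sketch of the copy/merge half of the build (no bundling).
# Usage: build_store <store> [src_dir] [dist_dir]
build_store() (
  store="$1"
  src="${2:-src}"
  dist="${3:-dist}"
  version="$src/versions/$store"

  if [ ! -d "$version" ]; then
    echo "unknown store: $store" >&2
    exit 1
  fi

  # Start from a clean output folder.
  rm -rf "$dist"
  mkdir -p "$dist/config" "$dist/templates" "$dist/snippets"

  # Shared Liquid is identical for every store.
  for dir in sections blocks layout; do
    if [ -d "$src/$dir" ]; then
      cp -R "$src/$dir" "$dist/$dir"
    fi
  done

  # Store-specific JSON comes only from the selected versions/ folder.
  cp -R "$version/config/." "$dist/config/"
  cp -R "$version/templates/." "$dist/templates/"

  # Snippets may be nested in src/ but are written flat into dist/snippets/.
  if [ -d "$src/snippets" ]; then
    find "$src/snippets" -type f -name '*.liquid' \
      -exec cp {} "$dist/snippets/" \;
  fi
)
```

One consequence of flattening is that two snippets with the same file name in different folders would overwrite each other, so a real build would probably want to detect that and fail loudly.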
Because the build step now puts everything in the right place in dist, and we own that step, pull and deploy can be simplified by about 90% to bare Shopify CLI calls. We only need to handle how we build things and can leave the rest to Shopify CLI:
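# Pull only the merchandiser-owned files from the live theme into a scratch folder
TEMP="$(mktemp -d)"
shopify theme pull --path "$TEMP" --store "$DOMAIN" --live --only "config/*.json" --only "templates/*"
# Route the result into this store's own folder instead of a shared one
cp -r "$TEMP/config/"* "src/versions/$STORE/config/"
cp -r "$TEMP/templates/"* "src/versions/$STORE/templates/"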
Canada and US could finally coexist in git. Pulling US no longer meant deleting Canada.
How the build engine works under the hood is dead simple, for a reason. It uses ESBuild to build JS and PostCSS to build Tailwind, and the rest is basic cp and mv commands that move Liquid files around. Chokidar watches files for changes during development.
When a system is simple to understand, it is also simple to extend as you scale. That opened up a lot of optimizations that Leaf CLI couldn't offer, including pipeline automation.
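To make the bundling half concrete, here is a sketch of what the two commands might look like when driven from a shell script through the esbuild and postcss-cli command-line tools. The entry points (src/scripts/theme.js, src/styles/theme.css), the output names under assets/, and the helpers run and bundle_assets are all assumptions. The real engine may call these tools through their JavaScript APIs instead, and the Tailwind setup is assumed to live in a PostCSS config file. File watching is left out.

```shell
# Hypothetical sketch: the two bundling commands, printed or executed.
# Entry points and output names are assumptions, not the real config.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s ' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# Usage: bundle_assets [dist_dir]
bundle_assets() {
  bundle_dist="${1:-dist}"

  # esbuild: bundle and minify the JS entry point.
  run npx esbuild src/scripts/theme.js --bundle --minify \
    "--outfile=$bundle_dist/assets/theme.js" || return 1

  # postcss-cli: run the PostCSS pipeline over the CSS entry point.
  run npx postcss src/styles/theme.css \
    --output "$bundle_dist/assets/theme.css" || return 1
}
```

Keeping both tools behind plain commands like this is part of what makes the pipeline easy to reason about: each step can be run and inspected on its own.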
This is where we shaved a tremendous amount of time off releases by automating steps like:
- Pull Live on every release deployment for theme match and merge.
- Release candidate versioning automation
- Quick Sync between live theme and current theme
- Multiple store deployment in same pipeline with fail retry.
- Test environment setup
All the steps we used to do manually can now be done with the click of a button. What used to take about an hour of prepping and releasing can now be done in 5 minutes.
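The "multiple store deployment with fail retry" item can be sketched in a few lines. This is not the actual pipeline code: retry, deploy_all, the RETRY_DELAY variable, and the attempt count of 3 are illustrative, and deploy_store is a placeholder for whatever function builds and pushes a single store.

```shell
# Hypothetical sketch: run a command up to N times before giving up.
# Usage: retry <max_attempts> <command> [args...]
retry() {
  max_attempts="$1"
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "failed after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Wait between attempts; RETRY_DELAY is in seconds.
    sleep "${RETRY_DELAY:-5}"
  done
}

# Deploy each store in turn and stop at the first store that still
# fails after its retries. deploy_store is a placeholder you define.
deploy_all() {
  for store in "$@"; do
    retry 3 deploy_store "$store" || return 1
  done
}
```

A call such as deploy_all ca-store us-store would then cover both stores in one pipeline run, with a transient failure on one store retried rather than failing the whole release outright.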
Extending the engine once it was ours
At that point the engine stopped being a build script and became infrastructure.
Because the build process was ours, every part of the developer experience became ours too. Adding a new workflow no longer meant digging through a third-party wrapper hoping there was an escape hatch. If Shopify CLI gained a new feature, we could expose it the same day. If our release process needed another automation step, we wrote it. If the team wanted a different project structure six months later, we changed it.
The engine itself never became particularly complicated. Under the hood it's still ESBuild, PostCSS, Chokidar, and a lot of copying files into the right place. That simplicity was the point. Instead of hiding Shopify behind another abstraction, it embraced the platform and automated only the pieces unique to our workflow.
Replacing Leaf wasn't about writing a better build tool. It was about removing a ceiling. Once the compiler, pipeline, and deployment flow belonged to us, the question stopped being "does the tooling support this?" and became "should we build it?" More often than not, the answer was yes and should be yes.