At Swarmia, we get to see hundreds of different development workflows from our customers, and have greatly enjoyed learning from them. We felt that it would be appropriate for us to return the favor and open up our own workflow.
This post is an opinionated description of our development workflow in detail. Obviously, one approach doesn’t fit all engineering organizations or teams, but we feel that this "Swarmia way" is a good place to start for teams working in web development. You should always tweak workflows to be optimal for your organization, rather than blindly following someone else’s.
Note: Swarmia (the tool) supports all sorts of different development workflows. This blog post just describes our own.
Team composition
We have built every team in our engineering organization from T-shaped developers, meaning that they know a little about everything and a lot about something specific. Everyone is able to deliver full features end-to-end, starting from the frontend to backend to ops, but additionally some of our developers are stronger in DB performance optimization, frontend, infrastructure, etc.
We think that development teams should have 3-8 members (the two-pizza rule). Beyond that size, communication and focus start breaking down.
We believe in empowered teams, which means that each team owns a problem and has the decision-making power, tools, and skills to build the right solutions to address said problem. In practice, this means that every team should have a dedicated product manager and designer, and developers capable of delivering on those designs.
For us, a culture of kindness, trust, and humility is very important. We’re always open to helping each other, but also ready to challenge opinions when it benefits the team. Developers also have a lot of freedom to work on what they think is most impactful. For example, I was able to just decide that I want to write this blog post instead of coding.
Our own teams are perhaps uncommonly senior — all of our developers have at least 6 years of experience, and all of us have team-leading experience. This will likely change in the future, but for now it’s nice as it allows us to move faster.
Our remote policy
We allow developers to work remotely, although currently everyone is based in Finland. This has been helpful, as sometimes it’s very beneficial to be able to get together at the office to discuss something face-to-face. It’s also easier to gather for team activities like dinners, which build up relationships within the product organization.
We might expand to other countries in the future, but we’re likely to hire in close-by timezones to keep communication fast.
Most people come to the office a few days a week, but it’s by no means mandatory, as all team meetings (dailies, retros, etc.) are held online. Having said that, we’ve found that some types of meetings, like planning new features, do tend to work better face-to-face.
Managing work
Use Kanban for organizing work
We use Kanban as the basic framework for our day-to-day work. That means we have a backlog that we pull new items from as we complete the ones in progress. We don’t use Scrum, as we feel that it’s unnecessarily heavy and inflexible for a well-coordinated team like ours. Scrum is not necessarily bad, but it does feel a bit like training wheels, whereas the more flexible Kanban allows teams to do their best work.
Every team has a work-in-progress limit of 3 stories (I’ll get to our story definition soon), and generally speaking, we require that each story has at least 2 developers working on it. This helps avoid knowledge silos, and encourages completing in-progress projects before picking up new ones. The quality of code reviews also improves when the reviewer is someone who also works on the project.
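To make the two rules above concrete, here is a minimal sketch of how they could be checked against a board. The `Story` shape, the issue keys, and the function names are made up for illustration; they are not part of Jira or of our own tooling.

```typescript
// Hypothetical model of the stories currently in progress on a team's board.
interface Story {
  key: string;
  developers: string[];
}

const WIP_LIMIT = 3; // at most 3 stories in progress per team
const MIN_DEVELOPERS_PER_STORY = 2; // each story should have at least 2 developers

// A new story may be pulled from the backlog only while the team is under its WIP limit.
function canStartStory(inProgress: readonly Story[]): boolean {
  return inProgress.length < WIP_LIMIT;
}

// Stories with fewer developers than the guideline, i.e. potential knowledge silos.
function understaffedStories(inProgress: readonly Story[]): Story[] {
  return inProgress.filter((story) => story.developers.length < MIN_DEVELOPERS_PER_STORY);
}

const board: Story[] = [
  { key: "STORY-1", developers: ["alice", "bob"] },
  { key: "STORY-2", developers: ["carol", "dave"] },
  { key: "STORY-3", developers: ["erin"] },
];

console.log(canStartStory(board)); // false
console.log(understaffedStories(board).map((story) => story.key)); // [ 'STORY-3' ]
```

In this example the team would finish or staff up an existing story before pulling a new one from the backlog.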
We use Jira to track our issues and to provide us with a Kanban board. Honestly, we would probably use the tool from our friends at Linear instead (which we also integrate with), but as most of our customers are using Jira, we’ve decided to use it to better understand their problems.
Split work into small chunks
In Jira, we use the following issue types: epics, stories, tasks/subtasks, and bugs. We use issue types to distinguish between work items of different sizes.
Epics are our bigger projects or features, and should take 1-3 months to complete. An example would be our recent project: introducing team hierarchies everywhere in the product.
Stories are smaller chunks of work, often individual features that can be shipped to customers. They can either belong to epics or be standalone smaller features. Stories should take 1-2 weeks to complete. An example would be a story from the team hierarchies epic: implementing the hierarchical drill down to our organization insights view.
Stories are split into tasks during planning. Each task should take a single developer 1-2 days at most to complete. An example task might be: add hierarchical team selector component. Tasks can also be used to track small pieces of work that are not part of stories, though you need to be careful not to focus too much on this type of work instead of your bigger roadmap items.
Bugs are just specialized tasks describing work that is a fix for something that has customer impact.
What’s the point?
Some readers might think that this process is too heavy — why would I try to make my stories 2 weeks max. instead of just as long as it takes to ship the feature?
There are many benefits to small, predictable chunks of work:
- You’re constantly and quickly shipping value to your customers. If you design each story to be a shippable feature and aim to ship it within 2 weeks, you’re forced to scope the feature small enough that it can start providing value for your customers in 2 weeks. You can then improve it further in future stories.
- When you know how long your stories usually take, you can do a much better job at forecasting. When breaking down an epic into stories, as long as your story size is decently consistent, you will get a good approximation of how long the epic will take. We don’t particularly believe in deadlines, but it’s extremely useful to know if something will ship in 2 weeks or in 4 months.
- It helps to keep the scope small. It’s very easy to want to keep adding more functionality to a feature, when that effort might be better spent in some other feature instead.
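The forecasting point can be illustrated with some simple arithmetic. The sketch below is a rough back-of-the-envelope model, not a tool we use: it assumes every story takes about the same time and that a fixed number of the epic's stories are worked on in parallel. The `EpicPlan` shape and `estimateEpicWeeks` are hypothetical names.

```typescript
// Hypothetical inputs for a rough epic forecast.
interface EpicPlan {
  storyCount: number; // stories the epic was broken down into
  weeksPerStory: number; // typical story duration (the guideline is 1-2 weeks)
  parallelStories: number; // how many of the epic's stories run at the same time
}

// Stories are completed in "batches" of parallelStories; each batch takes weeksPerStory.
function estimateEpicWeeks(plan: EpicPlan): number {
  if (plan.storyCount < 0 || plan.weeksPerStory <= 0 || plan.parallelStories < 1) {
    throw new RangeError("invalid epic plan");
  }
  const batches = Math.ceil(plan.storyCount / plan.parallelStories);
  return batches * plan.weeksPerStory;
}

// 6 stories of ~2 weeks each, 2 in progress at a time: 3 batches of 2 weeks.
const estimate = estimateEpicWeeks({ storyCount: 6, weeksPerStory: 2, parallelStories: 2 });
console.log(estimate); // 6
```

The estimate is only as good as the consistency of your story sizes, which is why it pays to keep them small and predictable.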
Implementing this
It takes quite a bit of practice to get good at consistently scoping the work to meet these time guidelines. In the beginning, you should aim to scope your stories to 1 week, which makes it more likely that you will actually hit the target of 2 weeks.
However, we believe that doing the work to improve the scoping of your projects is well worth the effort.
It’s easier to make your stories small when you’re working with small pull requests.
Maintain enough slack
It’s important that the teams aren’t always working at 100% utilization. This allows developers to take on small improvements that don’t fit neatly on the roadmap, be creative and try out new things, spend time on learning, etc. It also helps reduce burnout as it gives developers a chance to breathe rather than always be rushing to the next thing. After all, software development is a marathon, not a sprint.
To implement this in practice, make sure that you don’t push your work-in-progress limits to the extreme. Also avoid deadlines when possible, as they are the opposite of slack and can result in developers needing to crunch work.
Planning work
Product discovery and planning
We track potentially interesting future product areas in our “product pages” in Notion. There, we collect discovery feedback that we’ve heard from our customers about that feature. When we feel that there is strong enough pull for the feature, we start ideating and creating mockups, which we then validate with some customers.
When the discovery starts to be in good shape, we find 1-2 developers who are interested in planning the feature further, and they meet with the product manager and designer to hash out the details and provide perspective on the technical limitations. We call this “pre-planning”.
Designs are then refined further, and the story is split into tasks by the developers. Then, the developers present the story to the rest of the development team to gather feedback. Usually (not always!) there is no blocking feedback, and the story can be put to the top of the backlog waiting for the team to have available developers to start working on it. It’s a good idea to have at least one of the original planning developers working on the story.
We also make sure to plan how to track the adoption of the new feature, and define what success would look like.
Check out our blog post Why product teams should plan together — and how we do it at Swarmia for more details.
Assign a project lead
Every story and epic has a designated developer to act as the lead of the project. It’s their responsibility to ensure that the project progresses smoothly and doesn’t get stuck, and that feedback is properly taken into account.
We’ve found that it’s good to have one person ultimately responsible for the project. This helps avoid situations where no one is sure who’s taking care of something and it ends up getting dropped. It also provides a natural contact person, allowing you to send all of your feedback and questions directly to them.
Having this role does not mean that the ownership or freedom of the other developers is diminished. This is purely a facilitator role that ensures that balls don’t get dropped. It’s also a good opportunity for developers to grow their leadership skills.
Balance new features and refactoring work
It’s important to not only work on new features but also improve the infrastructure and refactor buggy areas of the codebase. Building new features is of course a great way to get new customers and make the existing ones happier, but we should also make sure that the maintenance load doesn’t become unbearable and start to slow us down. The right balance can be hard to achieve, and in fact, we too have had times when we’ve slipped a bit.
If you have a serious problem with this, you might want to consider dedicating one of your story lanes to increasing productivity. This way, you’ll always have 2 stories for new features and one for infrastructure. This can be a bit heavy handed and inflexible, but at least it ensures that you’re not only working on new features.
This is actually something that we find Swarmia’s investment balance to be very helpful for, as it gives much needed visibility into your ratio of new features vs productivity improvements. You can read more about the details in our blog post How we use Swarmia at Swarmia.
Use RFCs for bigger technical projects
When planning big technical projects like architectural changes, we like to use RFCs (request for comments) to facilitate the discussion. A developer writes up a technical proposal for a plan well in advance (we use GitHub Issues for this), and others then comment on it. This ensures that everyone gets to say their piece and is up to date with the plan. When all comments have been addressed, the work on the change can proceed.
Team rituals
We try to keep the number of meetings to a minimum to allow developers to focus on coding. This chapter describes all the recurring meetings we have, resulting in ~3 hours of meetings per week for developers, with most days only having the daily.
We do a lot of our communication in Slack, and only take meetings when deep collaboration is required, for example when planning new features or pair-debugging difficult problems.
Short dailies that focus on the work
We have dailies where we walk through our Jira Kanban board and quickly go over each task that is in progress, and move finished tasks to the done column. We go through the board lane by lane instead of going from person to person. The goal is to have a shared understanding of what the team is working on, and to provide a low barrier opportunity to briefly discuss with the team any details regarding a task.
We aim to finish the daily in 15 minutes, and actively move discussions to after the daily if they drag on for more than a couple of minutes.
For more details on how we think dailies should be run, see our blog post Are daily stand-ups a waste of time?
Retrospectives for improving the team
We host 1-hour-long retrospectives every 2 weeks for each development team. The facilitator is a rotating role, so that the load gets divided evenly and everyone learns to facilitate retrospectives. In the meeting, we focus on the work that we’ve done in the time period: what have we been happy about, and what didn’t work and needs to be improved.
We use Miro for facilitation. Usually, we have a mind map where we write topics for 15 mins, and then discuss them, but sometimes we mix in different retrospective formats to keep it fresh.
For more thoughts, you can read our blog post Reduce bias in retrospectives with better data.
Demos to connect with the rest of the organization
We have company-wide demos every Friday, where the developers also participate by demoing features we’ve built or the plans that we have for the future. It’s a great opportunity to keep the company informed on what is going on on the product side, and to gather feedback from customer-facing teams.
Coding
Use trunk-based development with feature flags
When we start working on a new task, we create a new branch for it, create commits, push them and open a PR. We don’t create long-living branches for stories or epics. Generally speaking, we only branch from the main branch (except when doing short-lived stacked PRs).
Instead of keeping the feature’s code hidden in a separate branch that will eventually get merge conflicts, we use feature flags to constantly push code to the main branch. The code will only be executed if the user has the feature flag enabled, which allows us to do controlled rollouts of new features. When the feature is ready to be released for everyone, the flag is removed. We have our own simple homegrown feature flagging system instead of using a service, as we haven’t felt the need for that yet.
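As an illustration of the general idea, here is a minimal sketch of a per-user feature flag check. It is not our actual implementation: the flag names, the `User` shape, and the functions are all hypothetical, and a real system would likely load the enabled flags from a database or configuration.

```typescript
// Hypothetical flag names, used only for this example.
type FeatureFlag = "teamHierarchies" | "newInsightsView";

interface User {
  id: string;
  enabledFlags: ReadonlySet<FeatureFlag>;
}

function isFeatureEnabled(user: User, flag: FeatureFlag): boolean {
  return user.enabledFlags.has(flag);
}

// Both code paths live on the main branch; the flag decides which one runs.
function renderTeamSelector(user: User): string {
  if (isFeatureEnabled(user, "teamHierarchies")) {
    return "hierarchical team selector";
  }
  return "flat team selector";
}

const internalUser: User = { id: "u1", enabledFlags: new Set<FeatureFlag>(["teamHierarchies"]) };
const customer: User = { id: "u2", enabledFlags: new Set<FeatureFlag>() };

console.log(renderTeamSelector(internalUser)); // hierarchical team selector
console.log(renderTeamSelector(customer)); // flat team selector
```

Once the feature is released to everyone, the `if` branch becomes the only code path, and both the flag and the old branch can be deleted.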
Create and review PRs
We require one developer to approve each PR (it is also required by our SOC 2 audit). It can be anyone, but most often, people working on the story are also reviewing the PRs from that story.
When we open a new PR, the whole team gets a review request via a GitHub CODEOWNERS file. This triggers the Swarmia bot to send a Slack notification to our channel, so we can easily keep track of pending reviews. This combined with actively checking Swarmia’s Pull Requests view ensures that no PRs fall through the cracks and are left waiting for review.
As we have the culture of developers actively looking for PRs to review when they have time, we don’t usually assign anyone specific to review a PR. This enables faster PR reviews, as you’re not blocked waiting on a specific person for a review. An exception might be when you immediately want to get the attention of someone specific.
When reviewing, we try to strike a good balance of upholding quality but not unnecessarily blocking the release of code.
You can read more about our philosophy of code reviews in our blog post A complete guide to code reviews.
Write tests; mostly integration tests
We don’t have separate QA people. Developers are responsible for writing tests for their own code, and ensuring that their code works as expected. We think that this leads to faster iteration speed and better tests, as the developers need to consider writing tests when writing their code.
The PR reviewers do act as mini QA though, clicking through the UI to check that things look right.
Tests are not free, and we think that integration tests generally offer the best bang for the buck, since they test complete integrations rather than individual pieces. We save unit tests for pure functions that contain a lot of logic. So on the frontend most of our tests are testing the React components (with RTL), and on the backend we use a real database rather than mocking it out. This ensures that we’re testing that the system works end to end, avoiding real issues that might be hiding behind mocks.
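To show what we mean by a pure function that is worth unit testing, here is a small sketch. The `median` function is a made-up example rather than code from our product, and the `expectEqual` helper stands in for whatever test framework you use.

```typescript
// A pure function with enough edge cases (empty input, even vs. odd length)
// to justify a focused unit test.
function median(values: readonly number[]): number {
  if (values.length === 0) {
    throw new RangeError("median of an empty list is undefined");
  }
  const sorted = [...values].sort((a, b) => a - b); // copy so the input is not mutated
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Minimal stand-in for a test framework assertion.
function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

expectEqual(median([5, 1, 3]), 3); // odd length: the middle value
expectEqual(median([4, 1, 3, 2]), 2.5); // even length: mean of the two middle values
expectEqual(median([7]), 7); // single value
```

No database, network, or UI is involved here, so this kind of test is cheap to write and runs instantly. Everything that crosses those boundaries is where we reach for integration tests instead.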
Deploy directly to production
We use GitHub Actions for our CI/CD pipelines. It’s great, because it allows fast iteration on the pipelines, empowering every developer to make changes when needed.
When a PR is opened on our frontend repository, we deploy a preview for it and a link to the preview gets posted to the PR. This is the production bundle, which talks to the production backend, deployed under a specific folder that you can only access with a specific URL. This can be achieved with just a couple of lines of GitHub Actions code, and it will make it much more likely that someone actually tests the change via the UI as it’s so effortless to do.
All of our repositories deploy to production right away when a PR is merged to the main branch. We deliberately don’t have a staging environment, as staging environments place a high maintenance burden on the team and usually don’t truly match the production environment anyway. Instead, we invest in good monitoring tools and fast reverts. This allows us to iterate extremely fast, and when something does go wrong, we have the tools to fix it fast.
Take security seriously
Security is one of those things that won’t make your product features better, but it still just has to be done correctly. Taking security seriously can, however, also be a competitive advantage, and it’s something that we’ve paid close attention to ever since the founding of our company. Getting our SOC 2 Type 2 certification early on has made it a lot easier to get through our customers’ security reviews.
We conduct internal security audits twice a year, where our developers go through the OWASP ASVS framework section by section, identifying any weaknesses that our system might have. ASVS is quite comprehensive, and often we might find a couple of small things that we could be doing better. We then make sure to roll those improvements into a story and put it to the top of our backlog. This audit and improvement cycle takes us a couple of weeks of developer time twice a year.
In addition to the internal audits, we also conduct external penetration tests once a year to ensure that we didn’t miss anything. Usually, our internal audits have caught more issues than the external audits, likely due to our familiarity with our systems.
We also always consider security when planning new features, and when we touch sensitive areas like authentication, we require multiple reviewers on the PRs.
Build a design system early on
When we started building Swarmia, quite early on we decided to build our own design system for basic components like buttons, dropdowns, layouting, etc. This might sound a bit daunting and distracting from delivering features to customers, but it’s really not that much work and quickly pays itself back. When using a design system, your UI will be consistent across views, and you’ll be able to quickly put together new features with the existing components.
We use Styled System to define our building blocks like colors and spacings, and we have a UI component library in Storybook. These are kept in sync with Figma, so that when our designers put out new designs, the developers can easily see what components and values are being used, which makes converting the design to code easy.
Handling support & bugs
We have a weekly rotating role called the Chief Firefighting Officer (CFO), who is responsible for ensuring that the product is running smoothly that week. This includes monitoring our systems in Grafana and Sentry, and investigating customer issues.
When a customer has an issue, they reach out to us via our in-app chat. Someone, usually from the Customers team, picks up the chat and investigates what the problem is. If they can’t solve it, they post about it on our #dev-support Slack channel where the CFO sees it.
The CFO creates bug tickets from customer issues. Other developers then work on them and let the customer know when they are fixed. These bugs go to the “everything else” lane in our Kanban board, which is a separate backlog for this kind of ad hoc work. We don’t usually have long-running bugs, so we’re able to fix bugs as they come in and our bug backlog doesn’t grow.
Connect developers with customers
For developers to do their best job, it’s important for them to see how the product is actually being used. Our Customers team posts notes from customer meetings that they have, which our developers learn from. It also allows us to help the Customers team with technical questions.
We use Gong for recording our customer calls, and the developers can listen to all of them. We have an optional bi-weekly Gong Club where we watch one pre-selected customer call and discuss it. It’s very insightful to see the customers actually clicking around in the tool.
We have Slack Connect channels with our customers where they post feedback and questions, to which our developers reply directly.
Use boring technologies
This isn’t a tech blog post per se, but I’ll say a couple of words about our tech choices. We try to follow the guideline of using boring technologies and avoiding unnecessary additions to the stack.
Using proven (“boring”) technologies is often much faster, as people have already figured out how to solve problems with them. In practice, it means you’ll run into fewer problems that you need to be the first one to solve, which can greatly slow you down or, in the worst case, even force you to switch to a different technology. Choosing boring technologies also pays off when you start scaling up your development organization, as the hiring pool will be bigger and every new hire won’t have to learn a new tech.
It might be really tempting to try out some cool new tech, but a responsible development organization must resist this urge (unless there is a very strong reason to adopt the bleeding edge tech). Often the new tool might really be a little better at the job, but the downsides of treading on unknown territory are still usually bigger than the potential upside.
Keeping the stack small reduces the amount of information that developers need to keep in their head to do their job, and makes onboarding new developers faster. It’s also less work to e.g. operate one database rather than multiple different ones.
Our tech stack
We have Node.js on the backend, React on the frontend, PostgreSQL as the database, and Kubernetes on Google Cloud as our infrastructure, managed with Terraform. We just recently decided to add Redis to our stack after long deliberation over whether we should instead just use Postgres for those use cases as well.
We also use TypeScript for everything, as we believe that when building anything even a little complex, not using static typing just makes development unnecessarily difficult.
Our least boring technology is Apollo GraphQL, which has luckily worked out very well for us so far.
We’ve also made the deliberate decision to avoid too much functional programming, avoiding usage of libraries such as fp-ts. This was a hard decision, as we think that functional programming is awesome, but the implications for onboarding future developers without that experience were too severe.
A look inside our team
The ways of working described in this post are in part how our product organization has been able to move faster than our competitors. You can see from our changelog the real cadence of the features we ship.
I personally find it very interesting to hear about how other teams work, so I hope that this post has been fruitful. I doubt that anyone is going to walk away from this thinking they’ll now implement everything in this blog post, but hopefully there have been some things that have sparked thoughts.