Handshake serves three different types of users (domains): students, employers, and career service center staff at universities. Each of these user types has very different needs and only needs to be aware of a subset of the attributes we store for things like jobs, interview schedules, and applications.
The Handshake web app has largely been developed as a single Rails app servicing all three user types. This has enabled our team to ship features at a fast pace, but has posed its share of challenges. As our team has grown we’ve broken out into smaller teams focused on understanding and solving problems for specific user types.
For the most part, this allows engineers to focus and decreases the surface area of the app that they need to think about. However, it also makes it hard to stay on top of all the changes other teams are making to logic that is shared across teams, and it can be difficult to tell if a change has unintended consequences for another team.
While there’s no substitute for strong cross-team communication, we felt there were architectural changes we could make that would help eliminate this class of issues and enable each team to move faster.
Given that each user type only needs to know about a subset of the data we have available, we’ve decided to create a dedicated database for each user type containing only the tables and columns that user type needs to know about. This drastically simplifies our data model and permissions logic.
Once we agreed on the architecture we wanted to move toward, we had to figure out how we would populate the new databases and keep them in sync with the main Postgres instance supporting our monolith. There are a lot of ways we could have accomplished this, but we decided to stick with what we know and spin up a new API-only Rails app that subscribes to changes in the main database and persists the data it needs in the new service’s database.
We use Google Cloud Platform (GCP) to host our infrastructure and reached for Cloud Pub/Sub to publish messages about what changes are happening in our system. Each service is then able to create a subscription to receive these messages and handle them accordingly.
This pattern has already worked well for us with our Elasticsearch reindexer, a small app that subscribes to a Pub/Sub topic about changes happening in our master database and is responsible for keeping our Elasticsearch documents in sync.
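To make the pattern concrete, here is a minimal sketch of the subscription side. The message shape, table names, and column subsets are assumptions for illustration, not Handshake's actual schema; a real service would receive these messages from a Cloud Pub/Sub subscription and write to its own Postgres database rather than an in-memory map.

```typescript
// Hypothetical change message published when a row changes in the main
// database. Field names here are illustrative assumptions.
type ChangeMessage = {
  table: string;                       // table that changed in the main database
  op: "insert" | "update" | "delete";  // what happened to the row
  id: number;                          // primary key of the changed row
  data: Record<string, unknown>;       // full row payload from the publisher
};

// Columns a hypothetical student-facing service cares about, per table.
const STUDENT_COLUMNS: Record<string, string[]> = {
  jobs: ["id", "title", "employer_id"],
  applications: ["id", "job_id", "status"],
};

// In-memory stand-in for the service's dedicated database.
const store = new Map<string, Map<number, Record<string, unknown>>>();

function handleMessage(msg: ChangeMessage): void {
  const columns = STUDENT_COLUMNS[msg.table];
  if (!columns) return; // this service doesn't track the table; drop the message

  const rows = store.get(msg.table) ?? new Map<number, Record<string, unknown>>();
  store.set(msg.table, rows);

  if (msg.op === "delete") {
    rows.delete(msg.id);
    return;
  }

  // Persist only the subset of columns this user type needs to know about.
  const subset: Record<string, unknown> = {};
  for (const col of columns) subset[col] = msg.data[col];
  rows.set(msg.id, subset);
}
```

Because the subscriber only copies whitelisted columns, fields the user type should never see (internal notes, other domains' data) simply never land in its database, which is what simplifies the permissions logic downstream.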
Our monolith has a set of fairly standard REST endpoints which have been used for making requests from our web app, and a separate API following the JSON:API specification for our mobile clients. Maintaining multiple APIs is costly, and it is hard to guard against regressions when changing the shared services supporting them.
One of our goals with this transition is to have a single API that can support requests from any client that needs to render UI for our student users. To accomplish this, we knew we needed to give consumers more flexibility: they should be able to declare the data they need and fetch it without us creating new endpoints or modifying existing ones.
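This is the core appeal of GraphQL here: different views query the same schema for different shapes of data. The two queries below are hypothetical, against an assumed `jobs`/`job` field; the field names are illustrative, not Handshake's real schema.

```typescript
// A list view might only need titles and employer names:
const jobListQuery = `
  query JobList {
    jobs {
      id
      title
      employer { name }
    }
  }
`;

// A detail view can ask for much more from the same schema,
// with no new server endpoint required:
const jobDetailQuery = `
  query JobDetail($id: ID!) {
    job(id: $id) {
      id
      title
      description
      employer { id name location }
    }
  }
`;
```

Each client declares exactly the data it needs, and the server resolves both from one schema instead of one endpoint per view.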
We’ve been excited to tap into the ecosystem of tooling that is growing around GraphQL. One of the pieces we are most excited about is Apollo Client. We’ve been watching its development closely, anticipating we would use it to make requests from our clients to our new API.
Our very own Spencer Miskoviak even made some contributions to apollo-link-rest, a project we thought was a promising way of providing the developer experience we will be moving toward while transitioning from our current REST endpoints.
So far Apollo Client has worked out well. Setting it up was relatively painless and worked as advertised. We’re taking advantage of the Query component to make requests to our API, and we've also started leveraging the Apollo CLI to generate TypeScript types for us. This has been a breath of fresh air, eliminating the need to maintain interfaces by hand, which were often inaccurate because they were shared and used in so many different contexts.
For example, an Employer interface would be shared across all instances of an employer object, regardless of what attributes we actually had. If we were returning an employer as part of a list of jobs, we may only have the ID and name of the employer, but it was still assigned the Employer interface. With the types being generated based on our query definitions we have much more accurate types, which has already helped catch a number of errors earlier in the development cycle.
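The contrast looks roughly like this. The names below are hypothetical; JobList_jobs_employer mimics the naming convention of Apollo's codegen output for a JobList query, but is hand-written here for illustration.

```typescript
// Before: one hand-maintained interface shared by every context, so every
// field had to be optional even when a given response always included it.
interface Employer {
  id?: string;
  name?: string;
  location?: string;
  description?: string;
}

// After: a type generated from the query definition itself. If the
// JobList query only selects id and name, the type says exactly that.
interface JobList_jobs_employer {
  id: string;
  name: string;
}

const employer: JobList_jobs_employer = { id: "1", name: "Acme" };
// employer.location; // ← would not compile: the query never asked for it
```

Accessing a field the query never requested becomes a compile-time error instead of an `undefined` at runtime, which is where the earlier error-catching comes from.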
Another nice thing about Apollo is that it has libraries for iOS and Android. We’re excited about the prospect of having similar tooling on every client making requests, providing type safety that is verified against our GraphQL schema.
So far I’ve only mentioned the benefits of adopting this new technology, but I don’t want to pretend it’s been a painless process. We are learning a lot as we work through how we want to handle things like data replication, authentication, query optimization, and mixing data sources in the client (Redux + Apollo). We’re still in the early stages of leveraging these new tools, but we’re excited to share our experience and contribute to the base of knowledge available to others who are working on solving similar problems at their companies. Stay tuned for more information like this in future articles.
Please reach out (@ccschmitz on Twitter) if you are interested in talking more about any of the ideas or technology mentioned here, and consider subscribing to Rubber Ducking, a podcast where Spencer Miskoviak and I go into more detail on topics like this.
Finally, please consider checking out our open roles if you want to work with us on scaling this technology to better serve millions of students and help bridge the opportunity gap.