Jupiter Engineering

The Jupiter Stack

April 15, 2020

A rocket containing groceries

by Nate Leung, software engineer

Jupiter is a home-fulfillment service that helps busy people and families effortlessly stock their home with groceries and essentials through a personalized and trusted fulfillment team.

We serve hundreds of homes in the San Francisco Bay Area, our fulfillment teams handle thousands of items, and our frontend and backend systems shoulder the order of magnitude more compute operations that these processes generate every single day.

Our tech stack consists of a single Express server running in App Engine.

Just kidding — while a single Node.js process on a large enough machine might still work now, we’ve spent the last few months building a safe, scalable system to handle the next stage of our growth. It’s currently powering all of our operations, and we’re excited to share it with you.

Here’s a quick overview of the new tech powering Jupiter’s growing operations.

Backend

Stack

Language

Jupiter’s backend is now written in Kotlin, a modern, statically-typed programming language supported by JetBrains and Google.

Although Kotlin is most widely known as an alternative to Java for writing Android apps, it’s a general-purpose programming language that can compile to native code and JavaScript in addition to Java bytecode.

We use Kotlin’s modern features to keep our code clean and safe:

// Using constructs like `try-catch` as expressions
// (i.e. code that yields a value in addition to
// "doing something") reduces the need for mutable
// values. Explicitly nullable types help us avoid
// `NullPointerException`s.
fun uuidFromStringOrNull(uuid: String): UUID? = try {
  UUID.fromString(uuid)
} catch (e: Exception) {
  null
}

// Because of the `filterNotNull` call, Kotlin's type inference
// algorithm knows that the type of `uuids` is `List<UUID>`,
// not `List<UUID?>` (a list of nullable `UUID`s)!
val uuids = listOf(
  "bad uuid",
  "00000000-0000-0000-0000-000000000000",
  "1e12f1ff-c4b1-4f06-b9cb-5028621fe0ea"
)
  .map { uuidFromStringOrNull(it) }
  .filterNotNull()

// Alternatively, Kotlin has a built-in shorthand for the above
val uuids2 = listOf(
  "bad uuid",
  "00000000-0000-0000-0000-000000000000",
  "1e12f1ff-c4b1-4f06-b9cb-5028621fe0ea"
)
  .mapNotNull { uuidFromStringOrNull(it) }

We’ve specifically chosen to compile our code to Java bytecode over JavaScript or native in order to interop with the amazing Java and JVM ecosystem that has developed over the almost 25 years that have passed since Java was initially released.

Notably, this ecosystem includes the framework we’re using for our “server”: gRPC. gRPC was originally created by Google, and it has a super-well-maintained Java implementation and excellent support for Java tooling (likely because Java is one of the most commonly used languages at Google).

RPC

gRPC isn’t exactly a web framework like Express, Django, or Rails, but an RPC framework. Think of gRPC as a generalization of web frameworks: whereas with a web framework we’d send a request from our browser which would subsequently run some code on the backend and then return a response, with gRPC we can send a request and get a response from anywhere (a mobile app, the browser, or even another server — in fact, server-to-server communication is where gRPC really shines).

gRPC defines the structure of requests and responses through Protocol Buffers. Protocol Buffers (protobuf for short) aren’t actually runnable code, but a way of defining how we want our API to be structured.

service UserService {
  rpc ListUsers(ListUsersRequest) returns (ListUsersResponse);
}

message User {
  string name = 1;
}

message ListUsersRequest {}

message ListUsersResponse {
  repeated User users = 1;
}

The protobuf compiler can generate code in multiple languages, depending on the use case. For our backend, that means generating Java code, which can interop with our Kotlin (there is currently no official Kotlin code generator).

import io.grpc.stub.StreamObserver

// The `UserServiceGrpc.UserServiceImplBase`,
// `ListUsersRequest`, `ListUsersResponse`, and
// `User` classes are all generated by the protobuf
// compiler from the code block above.
class UserServiceImpl : UserServiceGrpc.UserServiceImplBase() {
  // The parameters are declared non-null so that we can call
  // methods on them without safe calls.
  override fun listUsers(
    request: ListUsersRequest,
    responseObserver: StreamObserver<ListUsersResponse>
  ) {
    responseObserver.onNext(
      ListUsersResponse.newBuilder()
        // The `repeated User users` field generates `addUsers`
        .addUsers(
          User.newBuilder()
            .setName("Nate")
            .build()
        )
        .build()
    )
    responseObserver.onCompleted()
  }
}

For our frontend, the protobuf compiler can generate JavaScript and corresponding TypeScript type definitions (we’re also currently building an iOS app, for which the protobuf compiler can generate Swift code).

The result is that we can strongly type the requests and responses we send and receive from our backend, increasing the resilience of our client-side code.

// Request and method names follow the `ListUsers` RPC defined above
const req = new ListUsersRequest()
const res = await userServiceClient.listUsers(req)
const users = res.getUsersList()
// If this is TypeScript, and the structure of `users` changes,
// we'd know immediately because this code wouldn't compile
const natesName = users[0].getName()

This is in contrast to traditional web programming, where the structure of requests and responses often needs to be manually kept in sync between the backend and frontend.

// Defining an endpoint in Express
app.get("/api/users", (req, res) => {
  res.json([{ name: "Nate" }, { name: "Will" }])
})

// Accessing the endpoint in the browser
const res = await fetch("/api/users")
const users = await res.json()
// What if we add user IDs and `users` becomes
// a map indexed by ID? I guess our users will
// find out first.
const natesName = users[0].name
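Without generated types, the closest safeguard is a hand-written runtime check on the response. The following is a minimal sketch of what that might look like; the `ApiUser` interface and both function names are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape of one user in the `/api/users` response.
interface ApiUser {
  name: string
}

// Type guard: checks at runtime that an unknown value is a list of users.
function isApiUserList(value: unknown): value is ApiUser[] {
  return (
    Array.isArray(value) &&
    value.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as { name?: unknown }).name === "string"
    )
  )
}

// Validates a decoded JSON body, failing loudly instead of letting a
// changed response shape surface later as `undefined`.
function parseApiUsers(body: unknown): ApiUser[] {
  if (!isApiUserList(body)) {
    throw new Error("Unexpected response shape from /api/users")
  }
  return body
}
```

A check like this has to be updated by hand every time the endpoint changes, which is exactly the bookkeeping that generated protobuf types take over.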

In addition, it’s a breeze to add new services to separate concerns (continuing the example above, we could add a HomeService in addition to our UserService) as the scope of our application grows.

And as we develop, since all the APIs are defined in a single place (in the .proto files), when we want to figure out how to get a certain piece of data from our backend to our frontend, we can just look for the relevant Service definition and see the method signature right then and there. With the types and tooling, compared to configuring a manual fetch, the method calls pretty much write themselves.

Database

gRPC shuttles data from our backend to our frontend, but we need to store it somewhere. We do that in a PostgreSQL database.

There’s not much to say here other than that PostgreSQL seems to better support the SQL standard than MySQL, and the nature of our data (from ordering to shopping to delivering to stocking) has led us to run complicated queries involving some relatively advanced SQL features, so PostgreSQL seems to have been a solid choice for now.

Looking ahead, we’ve started integrating Pusher for realtime operations and we’re potentially looking at using Redis as a cache to lighten the load on our main database.
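To make the caching idea concrete, here is a rough sketch of the read-through pattern a cache like Redis would enable. It is not our implementation: the `TtlCache` class is a hypothetical in-memory stand-in, and it is kept synchronous for brevity, whereas real Redis and PostgreSQL clients are asynchronous.

```typescript
// In-memory stand-in for a cache such as Redis (an assumption for this
// sketch; a real Redis client would be asynchronous and networked).
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs
  }

  // Returns the cached value if it is still fresh; otherwise calls
  // `load` (e.g. a database query), stores the result, and returns it.
  getOrLoad(key: string, load: () => V, now: number = Date.now()): V {
    const hit = this.entries.get(key)
    if (hit !== undefined && hit.expiresAt > now) {
      return hit.value
    }
    const value = load()
    this.entries.set(key, { value, expiresAt: now + this.ttlMs })
    return value
  }
}
```

Repeated reads of the same key within the TTL never reach the loader, which is how a cache would lighten the load on the main database.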

Other APIs

Outside of our own code, we rely on our buddies at Algolia, Segment, Mailgun, Stripe, and Twilio to handle search, analytics, emails, payment processing, and SMS, respectively.

DevOps

Build

Although we initially built our backend code with Gradle, the documentation surrounding build.gradle files wasn’t great (nobody on the team was super familiar with the DSL), we kept running into problems with proto compilation, and our build.gradle files quickly devolved into a mess of random tasks shelling out and back into the task runner.

So, back in February, we migrated to Google’s build tool, Bazel. Although Bazel is language-agnostic, we still currently only use it for our backend code, specifically to compile protos and Kotlin to JVM binaries.

Bazel’s been a great fit for our use case so far. It has first-class monorepo support, which is our current repository structure (and what’s used internally at Google), first-class proto support (Google invented protobuf), and build times seem to be noticeably faster than Gradle’s. We’ve also set up a remote Bazel cache with NGINX, so build artifacts are shared among the team, speeding up builds even more.

kt_jvm_library(
    name = "user_service_impl",
    srcs = glob(["src/main/kotlin/*.kt"]),
    deps = [
        # It's super easy to depend on other parts of our code
        ":util",
        "//src/proto:user_service_java_grpc",
        "//src/proto:user_service_java_proto",
    ],
)

Deploy

Our backend is split into microservices (i.e., continuing the example above, HomeService, UserService, etc. are compiled into separate binaries and run separately) and each service is built individually using a multi-stage Dockerfile:

First, we have a base “builder” image containing Bazel that clones our repository from GitHub and checks out master.

FROM ubuntu:18.04

ARG GITHUB_USERNAME
ARG GITHUB_TOKEN

# ...

# Install Bazel
RUN apt-get update && apt-get install -y bazel

# Clone repo into /srv/repo
WORKDIR /srv
RUN git clone https://$GITHUB_USERNAME:$GITHUB_TOKEN@github.com/Jupiter-Inc/code.git repo

# Set working directory to the repository
WORKDIR /srv/repo

# ...

For each individual service, the “builder” image compiles the Kotlin code and associated protos. Next, we copy the built binary to a slimmer JVM image to produce the final service image. Then, we push the built image to Google Container Registry, where it can be subsequently rolled out to users.

FROM builder AS builder

RUN bazel build //src/user-service:user_service_bin_deploy.jar

# Copy to a standard location so we can consistently copy
# to next stage
RUN cp bazel-bin/src/user-service/user_service_bin_deploy.jar /srv/deploy.jar

FROM openjdk:11-jre-slim-buster

COPY --from=builder /srv/deploy.jar /srv/deploy.jar

ENTRYPOINT ["java", "-jar", "/srv/deploy.jar"]

For deployment to production, we typically trigger a CircleCI workflow to build and push the images (everything happens in the cloud) in order to ensure the process is consistent and independent from the idiosyncrasies of our development machines’ environments.

Infrastructure

Our backend services run on Google Kubernetes Engine, and we use Terraform to declaratively provision our infrastructure — our Kubernetes cluster is created by a google_container_cluster resource.

resource "google_container_cluster" "jupiter-cluster" {
  name = "jupiter-cluster"
}

We’ve also set up Terraform Cloud to get the effect of continuous infrastructure deployment — it hooks up to our GitHub repository and automatically plans infrastructure changes whenever a Terraform file changes. Like our CircleCI build setup, it ensures consistency by divorcing the provisioning of our infrastructure from our development machines’ environments.

The "runs" page of Terraform Cloud

Unlike a traditional Kubernetes setup, since we’re using Terraform, our Kubernetes objects are written in Terraform’s HCL instead of YAML.

Although it would be great if Terraform’s Kubernetes provider supported turning YAML Kubernetes configuration files into the format needed by Terraform, it unfortunately doesn’t.

So in the meantime, using HCL helps us make the most of Terraform by allowing interop between our Kubernetes setup and other pieces of our infrastructure — for instance, we can create a GCP static external IP with Terraform and pass it to an HCL Kubernetes object so it can be associated with a Kubernetes LoadBalancer.

resource "kubernetes_service" "envoy" {
  metadata {
    name = "envoy"
  }

  spec {
    selector = {
      app = "envoy"
    }

    type = "LoadBalancer"
    # Created by another terraform resource
    load_balancer_ip = google_compute_address.address.address

    port {
      name        = "http"
      port        = "80"
      target_port = "8080"
    }

    port {
      name        = "https"
      port        = "443"
      target_port = "8443"
    }
  }
}

This allows us to point domains to our cluster (our DNS is also completely declaratively provisioned with Terraform).

resource "aws_route53_record" "A-jupiterco-api" {
  zone_id = aws_route53_zone.jupiterco-zone.id
  name    = "api"
  type    = "A"
  ttl     = 60
  # This points to the `LoadBalancer` we created above
  records = [google_compute_address.address.address]
}

In terms of the choice of Kubernetes over a different approach to running and managing our application (App Engine, EC2, etc.), the batteries included with Kubernetes (containers, logging, healthchecking) combined with the great Kubernetes open-source community (lots of real-world examples, great documentation, and no vendor lock-in) made it super easy to set up something that fit our use case.

For what it’s worth, we actually initially deployed our containers to AWS Fargate, but documentation surrounding best practices was sparse, there wasn’t much of an ecosystem or community, and our initial setup ended up being far from optimal (lots of hardcoded values and poor resource utilization). The experience with pure Kubernetes has been much better in comparison.

Going forward, I’d say our biggest challenge has been and continues to be introspection: we don’t always know what’s happening inside a container when it stops working, although LogDNA and Stackdriver have been a big help in this regard. Our challenges might also be more on the gRPC level than on the Kubernetes level, but we just haven’t had much bandwidth to investigate (for now, we just restart the container!).

Frontend

Stack

Language

In general, our frontend choices are pretty standard, so there’s a bit less to say here.

Our frontend is written completely in TypeScript, although in certain spots we run our code through both tsc (the TypeScript compiler) and Babel in order to take advantage of a few Babel plugins (like babel-plugin-lodash to make our bundle smaller and babel-plugin-styled-components to help us debug styled-components).

Views

We use React and styled-components to build our user interface on the web, which, like TypeScript, are both relatively standard choices these days.

Google Trends for React, Angular, and Vue.js

At this point in time, I think React has pretty much won the framework wars (even SwiftUI, which we’re looking at as a way to quickly get a decent mobile app off the ground, is vaguely React-like), and it’s not without reason: even more so than our experience with the JVM or Kubernetes ecosystem, the ease of learning, excellent developer experience, and vibrant community surrounding React have been instrumental in keeping our development velocity high.

React, Angular, and Vue GitHub Stars

In a slightly different vein, our choice of CSS-in-JS framework was a bit less clear — I’m partial to styled-components because I’m familiar with it, but I’m sure we could’ve done just fine with CSS Modules, Emotion, Aphrodite, or just plain inline styles (which we still use, albeit infrequently).

// From one of our styled components
interface LogoProps {
  scale: number
}

const StyledLogo = styled.div<LogoProps>`
  height: ${({ scale }) => 3 * scale}rem;
  width: ${({ scale }) => 3 * scale}rem;
`

State

We use Redux to manage state on our largest frontends (app.jupiter.co and register.jupiter.co), and use React’s built-in component state on our other frontends.

Redux has a bit of a reputation for boilerplate, and it definitely isn’t the right choice for a smaller frontend like waitlist.jupiter.co, which has 3 screens, but the pattern has helped us keep state organized in our larger, more involved frontends.

Homepage of the main Jupiter app

For instance, the separation of state management from our components that Redux (vs. component state) enables has allowed us to more easily share cross-cutting logic:

We’ve separated our main app and registration flow into two SPAs on two different subdomains, but we share the same authentication action and reducer code among the two (data is persisted to and loaded from a cookie on *.jupiter.co via a thunk that runs on load). So users can authenticate on either site, and that authentication will be loaded and reflected in the Redux store of both sites.

// We also share this `createStore` wrapper between the main
// app and registration. The generic type parameter ensures
// that both the app and registration stores contain the
// required shared `auth` state
function createJupiterStore<S extends { auth: AuthState }>(
  reducer: Reducer<S>
) {
  return createStore(reducer)
}
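For a sense of how the shared authentication pieces could fit together, here is a minimal sketch with no Redux dependency. Everything in it is an assumption made for illustration: the `AuthState` shape, the action names, the `jupiter_auth` cookie name, and the helper functions are hypothetical, not our actual code.

```typescript
// Hypothetical shape of the shared auth state; the real `AuthState`
// is not shown in this post.
interface AuthState {
  token: string | null
}

type AuthAction =
  | { type: "auth/loaded"; token: string }
  | { type: "auth/cleared" }

const initialAuthState: AuthState = { token: null }

// Reducer that both SPAs could mount under the `auth` key.
function authReducer(
  state: AuthState = initialAuthState,
  action: AuthAction
): AuthState {
  switch (action.type) {
    case "auth/loaded":
      return { token: action.token }
    case "auth/cleared":
      return initialAuthState
    default:
      return state
  }
}

// Extracts a named cookie value from a `document.cookie`-style string.
function readCookie(cookieHeader: string, name: string): string | null {
  for (const part of cookieHeader.split(";")) {
    const [key, ...rest] = part.trim().split("=")
    if (key === name) {
      return decodeURIComponent(rest.join("="))
    }
  }
  return null
}

// The action a thunk running on load might dispatch, or `null` when
// the (hypothetical) `jupiter_auth` cookie is absent.
function authActionFromCookie(cookieHeader: string): AuthAction | null {
  const token = readCookie(cookieHeader, "jupiter_auth")
  return token === null ? null : { type: "auth/loaded", token }
}
```

Because the cookie would be scoped to the parent domain, either site can run the same on-load step and end up with the same `auth` state in its store.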

Fetching Data

As mentioned above, we use gRPC to shuttle data from the frontend to the backend (although gRPC services aren’t normally accessible from the browser, we’re using a library called gRPC-Web to enable communication over HTTP). As a result, we can send typed requests and receive typed responses like so:

// Request and method names follow the `ListUsers` RPC defined earlier
const req = new ListUsersRequest()
const res = await userServiceClient.listUsers(req)
const users = res.getUsersList()
// If this is TypeScript, and the structure of `users` changes,
// we'd know immediately because this code wouldn't compile
const natesName = users[0].getName()

gRPC-Web is still relatively new (its 1.0 release was in October 2018), and in stark contrast to issues with React, it’s sometimes lonely to be the only one (or just one of a few) on the internet with a specific cryptic error message. At least for now, though, the problems have never been too huge, and for what it’s worth, the interop with gRPC and type safety have likely saved us a lot more time.

DevOps

Build

While Bazel has decent support for depending on Maven packages for JVM development, it doesn’t seem to have great support for node_modules dependencies for frontend development (part of this might be because Google doesn’t use node_modules internally). Given the greater JavaScript ecosystem’s reliance on NPM, we decided to stick with a more traditional frontend build system.

That is, we use plain NPM scripts with Rollup and Webpack to package our code (e.g. npm run build will run Rollup or Webpack, depending on the project).

In terms of whether to use Rollup or Webpack on a specific subproject, we have internal JS libraries for shared components and shared business logic (like the shared authentication reducer mentioned above) that build with Rollup to take advantage of ES6 modules and tree shaking, and we publish those to our instance of a private NPM-compatible registry called Verdaccio.

Homepage of our private NPM-compatible registry

Then, in our customer-facing frontends, we can specify those private packages in our package.json and install them from the private registry. We take advantage of Rollup’s ESM output in these private packages to reduce our final bundle size, but we build our frontends using Webpack and Babel for maximum compatibility across multiple browser versions.

Deploy

Our repository is hooked up to Netlify, and every time we merge a frontend change into master, the site is rebuilt and pushed to production.

Although we could technically import our private NPM packages with relative imports instead of publishing to our private registry since everything is in a monorepo, having packages published makes our Netlify deploys a lot easier to configure.

We simply set our base directory in our Netlify config to the subdirectory in our repository containing the site we want to deploy. Netlify looks at this one single directory, reads the directory’s package.json, and downloads only the packages required by this single site, instead of looking at the entire monorepo. This also saves compute since Netlify can automatically skip a deploy if no changes occurred in the site folder (for instance, none of the most recent commits below affected our registration app).

A list of our Netlify deploys
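The skip decision itself is simple to picture. The sketch below is only an illustration of the idea, not Netlify's actual implementation; the function name and the example paths are hypothetical.

```typescript
// Returns true when at least one changed file lives inside `baseDir`.
// Paths are assumed to be repository-relative with forward slashes.
function shouldBuildSite(baseDir: string, changedFiles: string[]): boolean {
  // Normalize to exactly one trailing slash so that "frontend/register"
  // does not also match "frontend/register-old".
  const prefix = baseDir.replace(/\/+$/, "") + "/"
  return changedFiles.some((file) => file.startsWith(prefix))
}
```

For example, a commit that only touches `frontend/app/src/index.tsx` would not trigger a build for a site whose base directory is `frontend/register`.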

In general, our approach to JavaScript development is to embrace the ecosystem: the community appears to have rallied around a few strong opinions (depending on a lot of NPM packages installed in node_modules, making use of NPM scripts, using Webpack to build applications, etc.), and we’d rather not fight the prevailing opinions in order to more cleanly fit “our” way of doing things, like building with Bazel. Netlify’s build image comes with Node.js and NPM preinstalled, but not Bazel, and installing it and then working around Netlify’s heuristics for node_modules to fit the Bazel model would likely be more trouble than it’s worth.

Infrastructure

In addition to their devops features, Netlify handles global deployment, CDN configuration, geographic routing, and a whole slew of other frontend infra concerns, too.

At this time, all of our frontends are deployed on Netlify, so we’ve been relying entirely on their setup for our frontend infra needs.

Join Us

That’s our stack. Thanks for reading!

Does this sound interesting to you? Have ideas for improvement that you want to work on? We’re actively seeking talented engineers at Jupiter. Shoot us a note on Twitter (@_jupiterco) or via email (starship@jupiter.co).


Written by Jupiter Engineering.