To Serverless or Not To Serverless

No managing servers manually

One of the primary proposed advantages of serverless is that you don’t have to care about provisioning physical servers. I’m a little skeptical here, as I think this only holds true if you compare serverless to actually maintaining your own physical server or VM. However, most large cloud providers also offer services where you simply provide a binary or a container, which they then host. If you compare those two types of services, I think you’ll find that the amount of worrying about physical infrastructure is much the same.

Forces your code to be stateless

Functions, being short-lived, force your code to be stateless. There’s no need to debug something that only happens because you somehow end up incorrectly modifying some internal state, and then the whole thing comes crashing down later.
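
As a minimal sketch of why (all names hypothetical): any value a function keeps in memory only lives as long as the host instance that happens to run it, so it can vanish between invocations.

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class CounterFunction
{
    // Lives only as long as this particular host instance; the platform can
    // recycle the instance (or run parallel copies) at any time.
    private static int _requestCount;

    [FunctionName("Counter")]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequest req,
        ILogger log)
    {
        _requestCount++; // Don't rely on this: every instance has its own copy
        log.LogInformation("Requests seen by this instance: {Count}", _requestCount);
        return new OkObjectResult(_requestCount);
    }
}

If state like this actually matters, it belongs in a database or an external cache, not in the process.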

Easy to scale up and down

Now for what I think is probably the “real” benefit of a serverless architecture. The ability to scale up and down in the blink of an eye is powerful. It means that you don’t have to worry about scaling at all! However, as with all good things, there’s a catch: many projects probably don’t need to worry about scaling all that much.

Steady state scaling

Steady state scaling means that as your application or project becomes more popular, it has to serve more users or transfer more data. The characteristic of steady state scaling is that traffic grows at a predictable pace: you know you get 10% more users every month, or that your traffic increases by 15% year over year. You can calculate how many machines you need to handle the load next month, or next year. Load grows steadily, without large spikes.
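
A back-of-the-envelope sketch of that calculation, with every number hypothetical:

using System;

class CapacityPlan
{
    static void Main()
    {
        double currentLoad = 500;        // hypothetical requests/second today
        double monthlyGrowth = 0.10;     // "10% more users every month"
        double perServerCapacity = 200;  // hypothetical requests/second per server
        int monthsAhead = 12;

        // Compound the growth forward, then round up to whole servers
        double projectedLoad = currentLoad * Math.Pow(1 + monthlyGrowth, monthsAhead);
        int serversNeeded = (int)Math.Ceiling(projectedLoad / perServerCapacity);

        Console.WriteLine($"~{projectedLoad:F0} req/s in {monthsAhead} months, " +
                          $"so provision {serversNeeded} servers.");
    }
}

With growth this predictable, you can provision ahead of time and rarely need elastic scaling.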

Peak scaling

Peak scaling is when your system has large spikes in traffic. Perhaps you run a shopping website and on Black Friday you have 10x the normal traffic, or you run a website with funny cat images and your traffic doubles over the lunch break. Without serverless, there are a few classic ways to handle these peaks:

  1. You simply have enough servers to handle peak load. This is the naive solution, where you overprovision servers to such a degree that you’re able to handle the spikes. It works brilliantly, but depending on your volume and how large your spikes are, it’s potentially very expensive.
  2. You provision extra servers when you know you’re going to have spikes. This is possible if you know that, e.g., you’re going to have a fire sale, or you’ve sent out a newsletter. It requires a lot of manual work, though, and is error prone. What if someone forgets to provision more servers right before a big sale and everything goes down?
  3. You use an auto-scaling solution, like Kubernetes Horizontal Pod Autoscaling or Azure Autoscale, where you specify some metrics, like response time or CPU usage, and scale on those. This is usually a fine solution, but it requires that you’re on a technology stack that gives you access to those capabilities.

Lots of bindings

If you’re alright with accepting the vendor lock-in, there are a lot of bindings. Lambda has bindings for lots of AWS services, and Azure Functions has the same for many Azure services. There’s definitely something freeing in just stating that you’re interested in handling events from EventHub Foo, and then having the serverless runtime deal with the nitty-gritty of how to get those events to you, and how to get them onward to a database or to your end-user.

[FunctionName("EventHub2EventHub")]
public static async Task Run(
[EventHubTrigger("source", Connection = "EventHubConnectionAppSetting")] EventData[] events,
[EventHub("dest", Connection = "EventHubConnectionAppSetting")]IAsyncCollector<string> outputEvents,
ILogger log)
{
foreach (EventData eventData in events)
{
// do some processing:
var myProcessedEvent = DoSomething(eventData);

// then send the message
await outputEvents.AddAsync(JsonConvert.SerializeObject(myProcessedEvent));
}
}

Cost model is different

The cost model for serverless is different. It depends on your cloud provider, but you often pay only for what you use, so if you have little traffic, you run fewer function executions and you pay less. This often makes it a very good model when there are large differences between your peak and off-peak traffic. However, I’ve heard multiple stories of people actually paying more in serverless environments than they would with a regular server, particularly when peak and off-peak traffic didn’t differ that much.
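
To make that concrete, here is a rough sketch of the arithmetic. Every price and traffic figure below is hypothetical (check your provider’s current pricing), but the shape of the comparison holds:

using System;

class CostComparison
{
    static void Main()
    {
        // Hypothetical consumption-plan prices
        double pricePerMillionExecutions = 0.20;
        double pricePerGbSecond = 0.000016;

        // Hypothetical flat-rate server alternative
        double flatServerPerMonth = 70.00;

        // Hypothetical steady, high traffic: little peak/off-peak difference
        double executionsPerMonth = 50_000_000;
        double avgDurationSeconds = 0.2;
        double memoryGb = 0.5;

        double serverlessCost =
            executionsPerMonth / 1_000_000 * pricePerMillionExecutions
            + executionsPerMonth * avgDurationSeconds * memoryGb * pricePerGbSecond;

        Console.WriteLine($"Serverless: ~${serverlessCost:F0}/month, " +
                          $"flat-rate server: ${flatServerPerMonth:F0}/month");
    }
}

With these made-up numbers, the steady-traffic system pays roughly $90/month on serverless against $70 flat; divide the traffic by ten and serverless drops to around $9 while the server still costs $70.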

High availability

Serverless environments should give you a high availability setup without you having to worry about anything: your application will always scale to the amount of traffic required. High availability can also be achieved by scaling out normally, but serverless definitely buys you “peace of mind” scaling, where you never have to worry about whether you can handle the traffic.

Testing

Testing serverless code is (potentially) more challenging. In general, most people I’ve talked to tend not to test the binding code, but to extract the business logic, usually following steps that look something like this (sketched in code after the list):

  1. Extract the logic into its own module or function
  2. Make the actual serverless function a very thin wrapper around that logic
  3. Test only the extracted logic
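
A minimal sketch of that split, with all names hypothetical and xUnit assumed as the test framework:

using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Xunit;

public class Order
{
    public decimal Total { get; set; }
    public int ItemCount { get; set; }
}

// 1. The business logic lives in a plain class with no serverless dependencies
public static class OrderPricing
{
    public static decimal ApplyDiscount(decimal total, int itemCount) =>
        itemCount >= 10 ? total * 0.9m : total;
}

// 2. The actual function is a very thin wrapper that only does binding plumbing
public static class PriceOrderFunction
{
    [FunctionName("PriceOrder")]
    public static void Run(
        [QueueTrigger("orders")] Order order,
        ILogger log)
    {
        var price = OrderPricing.ApplyDiscount(order.Total, order.ItemCount);
        log.LogInformation("Priced order at {Price}", price);
    }
}

// 3. The tests touch only the extracted logic, so no function runtime is needed
public class OrderPricingTests
{
    [Fact]
    public void TenItemsGetTenPercentOff() =>
        Assert.Equal(90m, OrderPricing.ApplyDiscount(100m, 10));
}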

Vendor lock-in

This is often listed as a detriment: you’re kind of vendor locked-in. I say “kind of” because you can run these functions on a Kubernetes cluster that can potentially be hosted anywhere. Many of our triggers are bound up in Azure-specific bindings, but I don’t think that matters much. In general, if you separate your bindings and your business logic, migrating to a different vendor isn’t that much of a herculean task. However, I’ve always felt vendor lock-in isn’t that big of a deal: most projects never change databases, cloud providers or anything else, so worrying about being portable is wasted effort.

Monitoring is harder

In general, distributed monitoring is much harder than monitoring a single application. Monitoring functions isn’t harder in itself than monitoring the equivalent amount of microservices would be. But I think that when evaluating a technology stack, you need to consider the patterns the technology encourages. Serverless makes it very easy to create many small functions, which might make your logging and monitoring “more distributed” than it would tend to be with a more classical approach.

Maturity

Serverless is still a reasonably new player on the market. Azure Functions was made generally available in 2016, and version 3, which we’d like to use, has only been live since December 2019, around a month at the time of writing.

Loss of control

You lose some control, and you’re more at the mercy of the bindings. For example, there are some reasonably advanced options we’d like to configure regarding how we read events from our Event Hubs. However, these options aren’t exposed by the Azure Functions runtime, so if we end up using Functions, we’ll have to live with not being able to configure them.

It’s a different skill set

There’s a different skill set to learn when using functions. While much of the business logic might be the same, the way you architect systems doesn’t necessarily carry over. This can either be nice (you get to learn something new) or a hassle (you can’t use some of the skills you’ve previously acquired).

Caching is harder

If you have a system that relies heavily on caches, that’s harder to build with Functions. In regular applications you can cache things in memory, but that’s not always possible with functions, as you have no knowledge of how often a function instance is discarded and re-created. Ideally, you would use a third-party cache provider like Redis or Memcached instead of caching in memory.
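
As a sketch of what that looks like (the connection string name and the backend call are hypothetical, and StackExchange.Redis is assumed as the client), here is a function reading through a Redis cache instead of process memory:

using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using StackExchange.Redis;

public static class CachedLookupFunction
{
    // Lazy<T> creates at most one connection per host instance, which is then
    // reused across the invocations that land on that instance.
    private static readonly Lazy<ConnectionMultiplexer> Redis =
        new Lazy<ConnectionMultiplexer>(() =>
            ConnectionMultiplexer.Connect(
                Environment.GetEnvironmentVariable("RedisConnectionString")));

    [FunctionName("CachedLookup")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequest req,
        ILogger log)
    {
        string id = req.Query["id"];
        IDatabase cache = Redis.Value.GetDatabase();

        // Unlike a static dictionary, the cache outlives any single function instance
        RedisValue cached = await cache.StringGetAsync($"lookup:{id}");
        if (cached.HasValue)
            return new OkObjectResult((string)cached);

        log.LogInformation("Cache miss for {Id}", id);
        string result = await FetchFromSlowBackend(id);
        await cache.StringSetAsync($"lookup:{id}", result, TimeSpan.FromMinutes(5));
        return new OkObjectResult(result);
    }

    // Hypothetical stand-in for the expensive work you would want to cache
    private static Task<string> FetchFromSlowBackend(string id) =>
        Task.FromResult($"value-for-{id}");
}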
