
Saving work, avoiding work, optimising work

๐Ÿ›๏ธ Caching


Caching means storing a copy of something in a temporary, faster-to-access location.

Instead of fetching or calculating the same thing over and over again, we use the stored copy. Caching saves the result of work we have already done, so we can reuse it.
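
A minimal sketch of the idea (the profile endpoint and function names here are made up for illustration): before doing the work, check a store for a saved answer; after doing the work, save the answer for next time.

const cache = new Map();

async function getProfile(username) {
  if (cache.has(username)) {
    return cache.get(username); // reuse the stored copy
  }
  const response = await fetch(`/api/profiles/${username}`); // hypothetical endpoint
  const profile = await response.json();
  cache.set(username, profile); // save the result of the work we have just done
  return profile;
}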

🔖 Browser Caching

All browsers cache very heavily. When you visit a page, your browser downloads resources like images, CSS, and so on. The next time you visit any page that uses that resource, the browser checks its cache first. If it finds a valid local copy, it uses that instead of downloading it from the internet.

Accessing local files is much faster than downloading over the network, so your page loads more quickly. In fact, your browser has many levels of caching. In the Purple Forest application, the application state is cached in local storage.
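
As a sketch of what that local storage caching might look like (the key name and the buildInitialState function are assumptions for illustration, not Purple Forest’s actual code):

function saveState(state) {
  // Keep a copy of the application state in the browser's local storage
  localStorage.setItem("appState", JSON.stringify(state));
}

function loadState() {
  const cached = localStorage.getItem("appState");
  if (cached !== null) {
    return JSON.parse(cached); // reuse the stored copy from a previous visit
  }
  return buildInitialState(); // hypothetical slower path: rebuild or refetch the state
}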

๐Ÿซฑ๐Ÿฝโ€๐Ÿซฒ๐Ÿฟ The Memory vs. CPU/Network Trade-off

Browser caching is a classic example of trading memory for CPU time or network time. We use extra memory space to hold the cached copies and we save network time by not re-downloading assets. We could have chosen to use less memory space, but it would have slowed down our application.

Activity

Open Purple Forest in your browser and take a look at all the ways resources are cached in DevTools.

You could try opening the Network tab and selecting a network request, like a CSS file. Ask AI Assistance something like “show me all the ways this resource is cached in this browser”.

⛓️‍💥 Cache Invalidation


“There are only two hard things in Computer Science: cache invalidation and naming things.” ~ Phil Karlton

๐Ÿฃ Stale/Fresh

What happens if the answer changes after we have cached it? The user might see an old version. This is called a stale cache, or staleness. We need to tell the cache: “This item is old, get a fresh copy”. This is cache invalidation (removing cached data that is no longer accurate or useful), and it’s tricky.

In the browser, common invalidation strategies include:

  • Time-To-Live (TTL): Items expire after a set time (e.g., “cache this image for 1 hour”).
  • ETags/Last-Modified Headers: The browser asks the server “I have version X, is it still current?”. If yes, the server sends back a “304 Not Modified” response; if no, it sends the new version. A server can send a 304 response faster than it can send the whole content, but having to make a network request and wait for the answer is still slower than not making a network request at all. (There is a sketch of how a server might handle TTL and ETags just after this list.)
  • Cache-Busting: Change the name when the content changes. The browser sees a new URL and must download it. Inspect this web page and look at the CSS link in the header: the filename is automatically fingerprinted from the file’s contents, so when the contents change, the fingerprint (and therefore the URL) changes and the old cached copy is bypassed.
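
To make TTL and ETags concrete, here is a sketch of how a server might handle them, written in a Node/Express style (the route, the readCss and computeEtag helpers, and the one-hour lifetime are all made up for illustration):

app.get("/styles/main.css", (req, res) => {
  const content = readCss();          // hypothetical: load the stylesheet
  const etag = computeEtag(content);  // hypothetical: derive a version tag from the contents

  res.set("Cache-Control", "max-age=3600"); // TTL: the browser may reuse this copy for 1 hour
  res.set("ETag", etag);

  if (req.headers["if-none-match"] === etag) {
    res.status(304).end(); // Not Modified: the browser's cached copy is still current
  } else {
    res.type("text/css").send(content); // otherwise send the full content
  }
});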

🤔 Why Is This Hard?

If there are so many solutions, why do we say cache invalidation is hard? It’s because every cache is a tradeoff between running out of time, running out of resources, and giving the right answer. Your program can be fast and cheap, but probably not accurate. It can be fast and accurate, but that’s not cheap. And it can be cheap and accurate, but that’s not likely to be fast.

(Image: Venn diagram showing three overlapping circles: fast, cheap, and accurate.)

When we do not cache, our program is slow. If we do not invalidate our cache at the right moment, our program is wrong.

In different applications, there are different priorities:

  • In a social media application, it’s probably more important to be fast than correct. If our goal is to serve the user the most interesting post for them, but we more quickly serve them the second most interesting post instead, the user will probably still be pretty happy (and maybe it gives us time to load the other post to show them next).
  • In an application handling medical records, it’s probably more important to be correct than fast. If a doctor is checking whether a patient is allergic to a medication they’re about to give them, a fast but wrong answer could be fatal.

Software has real effects in the world. It’s important to think about the real-world context and implications of our software when we’re writing it.

📝 Memoisation


Caching isn’t limited to your browser.

We can save time by caching the results of operations (answers) at every level in our system: in the database, at the endpoint, in the browser, and even inside our functions.

📋 Caching During Execution

Imagine any computation that we do repeatedly on the way to a final result. We could do it once, cache the answer in memory, and reuse it. This is especially helpful in functions that use recursion (a recursive function calls itself, typically until some break condition is met).

🪆 Problems and subproblems

We have seen that software engineering is mostly breaking down large problems into smaller problems that are easier to solve. Dynamic programming is a rigorous application of this idea.

In Chapter 8 of How Computers Really Work, you wrote a program in Assembly to calculate the factorial of a number (a factorial is the product of all the whole numbers between 1 and n). Consider this way of calculating factorials.

def factorial(n):
    return n * factorial(n-1) if n else 1

We recursively execute the function until n reaches 0. If you calculate factorial(5), it calculates factorial(4), which calculates factorial(3), and so on. If you later calculate factorial(6), it will re-calculate factorial(5) and all its subproblems again!

from functools import cache

@cache
def factorial(n):
    return n * factorial(n-1) if n else 1

In Python, the functools module provides a function called cache. The @cache decorator automatically stores the result of factorial(n) for each value of n it computes. If the function is called again with the same value, @cache returns the stored result without rerunning the code.

We don’t need a decorator to apply this idea. We can create a temporary cache, index, or memo inside any function ourselves. Memoisation is an implementation of the strategy of saving time by spending space.
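
For instance, here is the factorial function again with a hand-rolled memo, sketched in JavaScript to show the idea isn’t specific to Python or to any decorator:

const memo = new Map();

function factorial(n) {
  if (memo.has(n)) {
    return memo.get(n); // reuse an answer we have already computed
  }
  const result = n ? n * factorial(n - 1) : 1;
  memo.set(n, result); // remember the answer for next time
  return result;
}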


๐Ÿ›๏ธ Caching too much

We’ve seen that caching allows us to spend more memory to save time. We may want to limit how much memory we’re spending to save time.

Some examples of when storing a value in a cache doesn’t make sense:

Values we won’t use again

Imagine a user signs up for Purple Forest: we compute a home timeline for them and cache it on the server. But they never visit the website again. We’re spending some memory keeping that user’s home timeline cached for no reason.

In some cases, it makes sense not to cache a computation at all, if its results are unlikely to ever be used again.

In other cases, it makes sense to only cache some values. Perhaps we can identify this up front, e.g.

  • We could decide never to cache things on a user’s first visit, but if they’ve come to our site twice we’ll cache things for them.
  • We could decide to only cache values for our top 1000 users, and give everyone else a slower experience.
  • We could decide to only cache values that took us more than 10ms to compute - anything faster than that we can just re-compute again.

In other cases, we want to cache all values, but not remember them all forever. In caching, we call this eviction: deciding to remove something from a cache, perhaps because it is stale, unlikely to be used again, or just because we want to limit how much memory we consume.
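
One common eviction policy is Least Recently Used (LRU): when the cache is full, throw away the entry that hasn’t been used for the longest time. A minimal sketch (the size limit is arbitrary, and real caches use more sophisticated implementations):

class LRUCache {
  constructor(limit) {
    this.limit = limit;
    this.map = new Map(); // a Map remembers insertion order
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert so this key counts as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.limit) {
      const oldest = this.map.keys().next().value; // the least recently used key
      this.map.delete(oldest); // evict it
    }
  }
}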

Reading

There are different strategies to decide what values we should evict and when.

Read about different cache eviction policies.

Write down how you would decide which eviction policy to use when solving a problem. What’s an example of a use-case where each eviction strategy makes the most sense?

Another cache stores equivalent information

Consider this code:

from functools import cache

@cache
def calculate_fibonacci(n: int):
    if n < 2:
        return 1
    return calculate_fibonacci(n - 1) + calculate_fibonacci(n - 2)

@cache
def calculate_fibonacci_plus_one(n: int):
     return calculate_fibonacci(n) + 1

for n in range(10):
    print(calculate_fibonacci_plus_one(n))

How many cache entries will this code store?

Because both calculate_fibonacci and calculate_fibonacci_plus_one are cached, 20 cache entries will be stored: 10 for each function.

What would happen if we removed the @cache decorator on calculate_fibonacci_plus_one?

Our program would work exactly the same, and would have the same performance. All of the calculate_fibonacci calls would still be cached. We wouldn’t be caching the ten + 1 operations in our calls to calculate_fibonacci_plus_one. But these are cheap. (Also, we never call calculate_fibonacci_plus_one with the same argument, so we’re never getting cache hits from its cache anyway).

What would happen if instead we removed the @cache decorator on calculate_fibonacci?

Our program would get a lot slower. If we called calculate_fibonacci_plus_one with the same argument several times, the repeated calls would still be fast, but the first call for each argument would be slow, because all of the fibonacci work has to be done from scratch. And every time we call calculate_fibonacci_plus_one with a different argument, it will re-do all of that fibonacci work again.

Having two @cache decorators may feel useful, but in reality it’s just spending twice as much memory for no real gain.

It’s worth thinking about where a cache is genuinely useful, rather than just adding caches whenever things are slow.

💡 Tip

Before adding a cache, think:

  • Is this computation likely to be re-used?
  • Is there a better place to be adding a cache?
  • Are there any existing caches I should remove, if I add this one?

🔮 Pre-computing


Pre-computation means doing some work upfront to make future work faster.

💪🏾 Doing Work Ahead of Time

So far we have thought about saving the results of computations to access them again more quickly. But if we can predict what computations we will need, we can do them before we need them and access them instantly.

We invest time now to save time later. This could be a complete answer, an index of answers, or some intermediate computation that makes finding our final result faster.

Often, the work we pre-compute is sorting data to make searching operations faster.

Searching for a username in an unsorted list of one million users requires checking each one. This has O(n) complexity, and will on average take half a million comparisons.

But what if we sorted our list? Now, searching for a specific username in the sorted list can use binary search. Binary search repeatedly halves the search space, which is O(log n) complexity - taking only about twenty comparisons for a million items (since 2^20 is just over a million).

But sorting isn’t free! Sorting typically takes O(n * log n), which is slower than O(n).

If we need to search the list many times, the time spent sorting once is easily paid back by the much faster searches later. But if we’re only searching one time, sorting before we search will be slower than just searching once. Pre-sorting is useful when we have data that is read frequently but updated less often.

Pre-computing always involves trade-offs: you invest time and memory upfront to gain speed later.

Drawbacks of pre-sorting

One of the trade-offs with pre-sorting is that it can make updates more expensive. This is fine if we aren’t going to update our list, but may be a problem if we will perform updates.

Imagine if you have an array which you have pre-sorted to make searching easier:

const numbers = [1, 17, 38, 401];

Suppose we want to add the number 20 to the array.

If we didn’t care about keeping our array sorted, we could just add it to the end, an O(1) operation:

numbers.push(20);

If we care about keeping our array sorted, inserting new values may become slower. We may need to first binary search to find where to insert it (an O(log n) operation), and then insert the new value in place. Inserting the new value in place is probably an O(n) operation, because we need to move all of the values that come after it:

const position = binarySearchPosition(numbers, 20); // O(log n)
numbers.splice(position, 0, 20); // O(n)
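
binarySearchPosition above isn’t a built-in function; it stands in for a helper we would write ourselves. A sketch of what it might look like:

function binarySearchPosition(sortedNumbers, value) {
  // Find the index at which `value` should be inserted to keep the array sorted
  let low = 0;
  let high = sortedNumbers.length;
  while (low < high) {
    const mid = Math.floor((low + high) / 2);
    if (sortedNumbers[mid] < value) {
      low = mid + 1; // the answer must be to the right of mid
    } else {
      high = mid; // the answer is at mid or to its left
    }
  }
  return low;
}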

Skip Lists

Different data structures have different performance characteristics for different operations.

Part of the cost of the sorted insertion above is that we’re using an array - arrays are expensive to insert values into anywhere other than the end, because that requires us to move all of the other elements.

Linked lists are a data structure with different performance characteristics. With linked lists, adding or removing elements at arbitrary positions is cheaper: once you’ve found the right place to insert, you just update a few pointers and the new element has been added. But this comes with a cost - linked lists don’t have random access, the ability to ask for any element directly (e.g. to write numbers[2]). Data structures without random access require you to start at a position and keep asking “what is the next element?” until you get to the element you want, which is slower. (As an aside, RAM stands for Random Access Memory because it’s a kind of memory you can access in this way.) So binary search is slower - you can’t just jump to the middle element of the list, you need to repeatedly ask for the next element until you get to the middle. The quicker splice came at the cost of a slower binary search.

This led to the development of a new kind of data structure called a skip list. A skip list is a sorted linked list with extra layers of links that let us skip over many elements at a time, keeping track of which elements sit at different points in the list. This allows us to jump quickly to, say, the middle of the list if we need to (or a bit before it). Skip lists are another example of pre-computing - every time we insert an element into or delete an element from the data structure, we do a bit of extra book-keeping to keep those skip links up to date. In exchange for this extra work on insertion and deletion, our look-ups get faster.

TODO: Exercise about implementing insert + search backed by each of an array, a linked list (implement your own) and a skip list (implement your own).

⚖️ Trade-offs


In software engineering, we constantly make trade-offs. We choose one approach over another, gaining some benefits while accepting some downsides. In this module we have examined many problems that involve balancing resources like memory, CPU time, network bandwidth, and even developer time.

๐Ÿซฑ๐Ÿปโ€๐Ÿซฒ๐Ÿฝ Space vs Time

We saw that caching trades the limited resource of memory space for the limited resource of computation time, but caching introduces the problem of stale data, and it can make it harder to reason about our application. We saw that we can store much more data and query it much more quickly in a database, but also that it’s slower to access that data over a network.

In each optimisation we have examined, there is a trade-off.

🗺️ Where Should Work Be Done?

Understanding these trade-offs allows us to optimise a system. This doesn’t always mean “make it the absolute fastest”. It means making conscious choices to best meet the goals, which might include:

  • Minimising user waiting time (latency)
  • Reducing server hosting costs (CPU, memory, bandwidth usage)
  • Improving developer comprehension (simpler code, patterns)
  • Increasing reliability (avoiding database flooding like the N+1 problem)

Choosing between trade-offs

One of the tricky things with trade-offs is that there often isn’t a single clear best answer. Reasonable people will often come to different conclusions.

We’ve come across many examples of trade-offs in this course. For example:

  • In the Tools module we considered trade-offs between using existing command line tools vs writing our own.
  • In the Tools module we considered the trade-offs of different programming languages (e.g. using compiled vs interpreted languages, using statically typed vs dynamic languages, etc).
  • In the Logic module we considered the trade-offs between using lots of rats in parallel (perhaps getting an answer quickly) vs using fewer rats sequentially (perhaps using fewer rats).
  • In the Decomposition module we considered the trade-offs between accepting the limitations of a simpler frontend-only system vs making a more complex system with more capabilities by including a backend.
  • In the Decomposition module we considered the trade-offs between short-polling, long-polling, or using websockets.
  • In the Decomposition module we considered the trade-offs between giving a new number of likes as a difference ("+1 like") or an absolute value (“now 10 likes”).
  • In the Decomposition module we considered the trade-offs between different continuation callback styles.
  • And lots more!

Some of these trade-offs have clear decision frameworks (do we want a fast answer, or to minimise how many rats we kill?). Others depend a lot more on context.

✏️ Exercise

Think about the trade-offs mentioned above. When would you choose each option? How would you decide between them? What context would help?

It’s important to consider context, and be open to different opinions. This is one of the strengths of working in diverse teams. Perhaps someone in your team has experienced a problem you haven’t, and weighs the downsides of some trade-off differently than you do. If you just work on your own, your system may be worse for not considering those downsides. If you only work with people who have the same experience as you, you may not notice risks that others would spot. By working together, we benefit from all of our shared experience.