🧑🏾‍💻 prep

Examining the cost of computation

🥪 Memory consumption


📏 There are only so many bits you can fit on a chip.

Computers are real things, not diagrams or mathematical models. This means a computer has a limited physical size and a finite amount of memory. When your program runs, it uses some of this memory to store data.

Think back to Chapter 7 of How Your Computer Really Works.

```mermaid
graph LR
    CPU -->|Fastest and Smallest| Cache
    Cache -->|Fast and Small| RAM
    RAM -->|Slow and Big| Disk
    Disk -->|Slowest and Vast| Network
```

At each stage there are limits to how fast you can get the data and how much data you can store. Given this constraint, we need to consider how much memory our programs consume.

🧘🏽 Simpler is smaller

Consider these data structures. Order them from least to most memory needed:

```js
const userIds = [101, 102, ..., 1100]; // An array of 1000 numbers

const userRoles = ["Admin", "Editor", "Viewer"]; // An array of 3 short strings

const userProfiles = [
  { id: 1, name: "Farzaneh", role: "Admin", preferences: {...} },
  { id: 2, name: "Cuneyt", role: "Editor", preferences: {...} },
]; // An array of 2 complex objects
```

Different kinds of data have different memory footprints. All data is fundamentally stored as bytes. We can form intuition for how much memory a piece of data takes:

  • Numbers are typically stored as 8 bytes. In some languages, you can define numbers which take up less space (but can store a smaller range of values).
  • Each character in an ASCII string takes 1 byte. More complex characters may take more bytes. The biggest characters take up to 4 bytes.
  • The longer a string, the more bytes it consumes.
  • Objects and arrays are stored in different ways in different languages. But they need to store at least the information contained within them.
    • This means an array of 5 elements will use at least as much memory as the 5 elements would on their own.
    • And objects will use at least as much memory as all of the values inside the object (and in some languages, all of the keys as well).

More complicated elements or more properties need more memory. It matters what things are made of. All of this data added up is how much space our program takes.
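To make that concrete, here is a rough back-of-the-envelope sketch for the first two structures above. This is a sketch only: real JavaScript engines add their own per-object and per-string overhead, so treat these numbers as lower bounds.

```js
// Rough lower-bound estimates, assuming 8 bytes per number and 1 byte per ASCII character.
const NUMBER_BYTES = 8;
const asciiStringBytes = (text) => text.length;

// 1000 numbers: at least 1000 * 8 = 8000 bytes, before any array overhead.
const userIdsBytes = 1000 * NUMBER_BYTES;

// 3 short strings: at least 5 + 6 + 6 = 17 bytes, before any array or string overhead.
const userRolesBytes = ["Admin", "Editor", "Viewer"]
  .map(asciiStringBytes)
  .reduce((total, bytes) => total + bytes, 0);

console.log(userIdsBytes, userRolesBytes); // 8000 17
```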

It also matters how many steps our program takes. Another way to think of this is how much time it takes to get the answer…

📈 Big-O


Big-O notation describes how resource use grows as the input size (n) grows.

xychart-beta title "Big O Notation - Constant to Exponential" x-axis "Input size (n)" [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] y-axis "Time/Space (log)" 0 --> 1024 line [1, 1, 1, 1, 1, 1, 1, 1, 1, 1] line [0, 1, 1.6, 2, 2.3, 2.6, 2.8, 3, 3.2, 3.3] line [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] line [1, 4, 9, 16, 25, 36, 49, 64, 81, 100] line [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

Reading

Complete the coursework Data Structures and Algorithms: Space and Time Complexity.

This is in your backlog. You do not need to do it right now, but it might help to. If not, you might like to open it in a tab.

😀 Constant: The algorithm takes the same amount of time, regardless of the input size.

xychart-beta title "O(1) Constant Time" x-axis "Input Size" [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] y-axis "Computation Time" 0 --> 10 line [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

An example is getting the first character of a string. No matter how long the string is, we know where the first character is, and we can get it.
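As a tiny sketch:

```js
// O(1): one step, no matter how long the string is.
function firstCharacter(text) {
  return text[0];
}

console.log(firstCharacter("hi")); // "h"
console.log(firstCharacter("a".repeat(1_000_000))); // "a" - still just one step
```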

xychart-beta title "O(log n) Logarithmic Time" x-axis [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] y-axis "Computation Time" 0 --> 10 line [2.3, 3.0, 3.4, 3.7, 3.9, 4.1, 4.2, 4.4, 4.5, 4.6]

☺️ Logarithmic: The runtime grows proportionally to the logarithm of the input size.

An example is finding a string in a sorted list. Each time we can look in the middle of the list, and halve the number of entries we need to consider next time by looking either in the half before or the half after that element.
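Here is that idea sketched as a binary search (assuming the list is already sorted):

```js
// O(log n): every comparison halves the range we still need to search.
function binarySearch(sortedList, target) {
  let low = 0;
  let high = sortedList.length - 1;
  while (low <= high) {
    const middle = Math.floor((low + high) / 2);
    if (sortedList[middle] === target) return middle;
    if (sortedList[middle] < target) {
      low = middle + 1; // the target must be in the half after the middle
    } else {
      high = middle - 1; // the target must be in the half before the middle
    }
  }
  return -1; // not found
}

console.log(binarySearch(["ant", "bee", "cat", "dog", "eel"], "dog")); // 3
```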

xychart-beta title "O(n) Linear Time" x-axis "Input Size" [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] y-axis "Computation Time" 0 --> 10 line [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

🙂 Linear: The runtime grows proportionally to the input size.

An example is finding an element by value in an un-sorted list. To be sure we find the element, we may need to look through every element in the list and check if it’s the one we’re looking for.

If we double the length of the list, we need to check twice as many elements.
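Sketched as code:

```js
// O(n): in the worst case we look at every element before we can answer.
function findIndexOf(list, target) {
  for (let i = 0; i < list.length; i++) {
    if (list[i] === target) return i;
  }
  return -1; // we checked every element and it wasn't there
}

console.log(findIndexOf(["bee", "cat", "ant"], "ant")); // 2
console.log(findIndexOf(["bee", "cat", "ant"], "dog")); // -1
```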

xychart-beta title "O(n²) Quadratic Time" x-axis "Input Size" [10, 20, 30, 40, 50, 60, 70, 80, 90, 100] y-axis "Computation Time" 0 --> 10000 line [100, 400, 900, 1600, 2500, 3600, 4900, 6400, 8100, 10000]
inputtime
11
24
749
864
981
10100

What does this mean? It means that the time is the square of the input size: n * n.

😨 Quadratic: The runtime grows proportionally to the square of the input size.

An example is finding which elements in an array are present more than once. For each element, we need to check every other element in the same array to see if it is equal. If we double the number of elements in the array, we quadruple the number of checks we need to do.
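A slightly simplified sketch of that shape (it only reports whether any duplicate exists, but the nested loop is the important part):

```js
// O(n²): for each element we compare it against every other element.
function hasDuplicates(list) {
  for (let i = 0; i < list.length; i++) {
    for (let j = 0; j < list.length; j++) {
      if (i !== j && list[i] === list[j]) return true;
    }
  }
  return false;
}

console.log(hasDuplicates([1, 2, 3])); // false
console.log(hasDuplicates([1, 2, 1])); // true
```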

xychart-beta title "O(2ⁿ) Exponential Time" x-axis [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] y-axis "Computation Time" 0 --> 1024 line [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
inputtime
12
24
38
416
9512
101024
Oh where have we seen this sequence of numbers before? ;)

😱 Exponential: The runtime grows exponentially with the input size.

An example is making a list of every combination of every element in a list (so if we have [1, 2, 3], all the combinations are: [], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3]).
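A sketch of building those combinations (note that the output itself doubles with every extra element, so both the time and the memory are O(2ⁿ)):

```js
// O(2ⁿ): every extra element doubles the number of combinations.
function allCombinations(list) {
  let combinations = [[]]; // start with just the empty combination
  for (const element of list) {
    // each existing combination spawns a copy that also includes this element
    const withElement = combinations.map((combination) => [...combination, element]);
    combinations = combinations.concat(withElement);
  }
  return combinations;
}

console.log(allCombinations([1, 2, 3]).length); // 8
console.log(allCombinations([1, 2, 3, 4]).length); // 16 - one more element, double the work
```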

You will explore this theory in your backlog, but you will find that you already have a basic understanding of this idea. No really! Let’s look at these algorithms in real life:

Drag the complexity categories onto the correct examples

  • Finding your own parked car in your designated spot. Doesn’t matter if the car park has 10 spots or 1000, finding your spot takes the same time. This is like accessing an index in an array.
  • Finding a name in a physical dictionary or sorted phone book. You open to the middle, decide which half the name is in, and repeat. Each step halves the problem. Adding way more pages doesn’t increase the search time proportionally. This is like binary search.
  • Reading every page of a book in order to find a specific word. If the book doubles in length, it takes roughly double the time. This is like looping over an array.
  • Everyone at a party shaking hands with everyone else. If you double the number of people (n), the number of handshakes increases much faster (roughly n * n). This is like nesting a loop inside a loop.
  • Trying every possible combination to unlock a password. Each extra character dramatically increases the possibilities.

💡Tip

Big-O notation also describes space complexity (how memory use grows). Sometimes an algorithm’s time complexity is different from its space complexity. We have focused on time here, but you’ll meet space complexity analysis in the assigned reading.

Big-O notation is focused on the trend of growth, not the exact growth.

Think about strings: one character may take up one byte or four. If we double the length of the string, we don’t check which characters are in the string. We just think about the trend. The string will take about twice as much space. If the string only has four-byte characters, and we add one-byte characters, the string is still growing linearly, even though it may not take exactly double the space.

🧰 Worked example: Duplicate Encoder

Let’s consider this problem:

✍️exercise

Convert a string to a new string. The new string should:

  • Replace each character which occurs exactly once in the original string with a 1 character.
  • Replace each character which occurs more than once in the original string with a * character.

Ignore capitalization when determining if a character is a duplicate.

Examples

| Input | Output |
| --- | --- |
| "din" | "111" |
| "recede" | "1*1*1*" |
| "Success" | "*1**1**" |
| "11*2" | "**11" |

⚠️Warning

First, try solving this exercise yourself.

Here are three sample solutions we will compare:

Approach 1:

```js
function duplicateEncode(word) {
  const lowerCaseWord = word.toLowerCase();
  let result = "";
  for (const char of lowerCaseWord) {
    if (lowerCaseWord.indexOf(char) === lowerCaseWord.lastIndexOf(char)) {
      result += "1";
    } else {
      result += "*";
    }
  }
  return result;
}
```
Approach 2:

```js
function duplicateEncode(word) {
  let result = "";
  for (const char of word) {
    const lowerCaseChar = char.toLowerCase();
    if (word.indexOf(lowerCaseChar) === word.lastIndexOf(lowerCaseChar)) {
      result += "1";
    } else {
      result += "*";
    }
  }
  return result;
}
```
Approach 3:

```js
function duplicateEncode(word) {
  const occurrences = {};
  for (const char of word) {
    const normalisedChar = char.toLowerCase();
    occurrences[normalisedChar] = (occurrences[normalisedChar] || 0) + 1;
  }
  let out = "";
  for (const char of word) {
    out += (occurrences[char.toLowerCase()] === 1 ? "1" : "*");
  }
  return out;
}
```

Comparing the approaches

Approaches 1 and 2 are very similar. In terms of time and memory complexity, they are the same as each other. But the second uses less memory than the first. The first one keeps around an entire copy of word (converted to lower case) for the whole function. The second one only converts each character in word to lower case one at a time. It’s important to remember that big-O complexity doesn’t tell you how much time or memory an approach takes, only how fast they grow.

Approaches 2 and 3 are quite different. Let’s analyse them each for time complexity:

Approach 2

  • Approach 2 has a for-loop over each character in the word (line 3). If we say the size of the word is n, a for-loop on its own is O(n).
  • Line 4 calls char.toLowerCase() - this is an O(1) operation - changing the case of one character takes constant time.
  • Line 5 calls word.indexOf - this is an O(n) operation - it may have to look through the whole string, comparing every character to see if it’s the one we’re looking for. Because we have an O(n) operation (the word.indexOf call) inside an O(n) operation (the for loop), this makes the function at least O(n^2).
  • Line 5 also calls word.lastIndexOf - this is also an O(n) operation for the same reason. But it doesn’t change the complexity of our function - doing an O(n) operation twice is still O(n) - we ignore constant factors. This is different from doing an O(n) operation inside a for loop (where we do an O(n) operation for each n - making it O(n^2)).

Because the worst thing we’ve seen is O(n^2) (an O(n) operation inside a for loop), approach 2 takes O(n^2) time.

Approach 3

  • Approach 3 has a for-loop over each character in the word (line 3). This is O(n).
  • Each operation inside the for-loop is O(1) (converting one character to lower case, looking up a value in an object, adding two numbers, inserting a value in an object). So this whole loop is O(n).
  • We have a second for-loop over each character in the word - O(n).
  • Each operation inside the for-loop is O(1) (converting one character to lower case, looking up a value in an object, a ternary operator, and appending one character to a string). So this whole loop is O(n).

Because the worst thing we’ve seen is O(n) (even though there were two O(n) loops), approach 3 takes O(n) time.

This technique is called precomputing. Precomputing is when we do some work in advance which we can use to avoid needing to do it later. Here we counted the occurrences of each character before the loop (which was O(n)) rather than needing to search through the string for each character (which would’ve been O(n^2)). We will learn more about it later.

Which is better?

Approach 3 has a better time complexity than approach 2. As longer input is given to the function, approach 3 will be much faster than approach 2.
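You can check this claim yourself with a rough timing sketch (assuming you have saved approach 2 and approach 3 as duplicateEncode2 and duplicateEncode3; exact numbers will vary between machines, but the gap grows quickly with input size):

```js
// Unscientific, but illustrative: time both approaches on a long input.
const longWord = "abcdefghij".repeat(5000); // 50,000 characters

console.time("approach 2 - O(n^2)");
duplicateEncode2(longWord); // assumption: approach 2 renamed to duplicateEncode2
console.timeEnd("approach 2 - O(n^2)");

console.time("approach 3 - O(n)");
duplicateEncode3(longWord); // assumption: approach 3 renamed to duplicateEncode3
console.timeEnd("approach 3 - O(n)");
```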

Speed is not the only concern, however!

Approach 3 uses more memory than approach 2 (though not in terms of memory complexity), because it stores a whole extra object.

Which approach do you think is easiest to read? Ease of reading is an important concern to consider in code.

If we know our function will only take small strings as input, the time and memory use of the functions probably doesn’t matter much, and we should prefer the easiest code to read, maintain, and modify. We can still make small choices to speed things up (like choosing approach 2 over approach 1). But we should only make our code harder to read if we know that we’ll have a problem if we don’t.

🧮 "Expensive" Operations


Let’s think about Purple Forest from the Legacy Code module.

When we build the timeline of blooms on the homepage, we call an endpoint /home. This returns an array of objects, blooms, produced by people we follow, plus our own blooms, sorted by timestamp. We stuff this in our state object (and cache that in our local storage).

There are many different ways we could get and show this information in our frontend. Some ways are better than others. (Here we are defining “better” as faster. We might at other times define better as simpler, clearer, or safer.)

What if we had tried any of the following strategies:

1. Get ALL Blooms, then Filter

  1. Request all the blooms in the database
  2. Request a list of all the people we follow
  3. Filter all the blooms by the people we follow, plus our own name
  4. Sort by timestamp
  5. Display blooms

2. Loop Network Requests

  1. Request a list of all the people we follow
  2. For each person we follow, request their blooms
  3. Request our own blooms
  4. Merge all the arrays
  5. Sort by timestamp
  6. Display blooms

3. Get ALL Blooms & People, then Loop & Filter

  1. Request all the blooms in the database
  2. Request all the people in the database
  3. Request a list of all the people we follow
  4. For each bloom.sender, loop over all the people and check if they are in our “following” list
  5. Sort by timestamp
  6. Display blooms

Given what we’ve just thought about, how efficient are these programs? Which is going to be fastest or slowest? Which is going to use the most or least memory? How could you make them more efficient? Write your ideas down in your notebook.

Our end state is always to show the latest blooms that meet our criteria. How we produce that list determines how quickly our user gets their page. This is very very important. After just three seconds, half of all your users have given up and left.

The Purple Forest application does not do most of this work on the front end, but it does still do it somewhere. (We can make our work cheaper, but never free.) The application uses common strategies to minimise:

  1. Steps taken to find and sort the data
  2. Quantity of data requested
  3. Number of network calls

This is because some operations are more expensive than others. Expensive operations consume a lot of computational resources like CPU time, memory, or disk I/O.

The order also matters. In all of the above strategies, we filter the blooms before sorting them. Sorting isn’t a constant-time operation, so it takes more time to sort more data. If in the first strategy we had sorted all of the blooms before we filtered down to just the ones we cared about, we would have spent a lot more time sorting blooms we don’t care about.
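A sketch of that ordering difference, with made-up sample data standing in for the API response:

```js
// Hypothetical sample data; in Purple Forest this would come from the backend.
const followedNames = new Set(["Ali", "Sally"]);
const blooms = [
  { sender: "Ali", timestamp: 3, content: "hello" },
  { sender: "Daniel", timestamp: 2, content: "hi" },
  { sender: "Sally", timestamp: 1, content: "hey" },
];

// Cheaper: filter first, so we only sort the blooms we will actually show.
const timeline = blooms
  .filter((bloom) => followedNames.has(bloom.sender))
  .sort((a, b) => b.timestamp - a.timestamp);

// More expensive: sorting everything first wastes work on blooms we then throw away.
const timelineSlow = [...blooms]
  .sort((a, b) => b.timestamp - a.timestamp)
  .filter((bloom) => followedNames.has(bloom.sender));

console.log(timeline, timelineSlow); // same result, different amount of work
```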

Network as a bottleneck


We first thought about latency when we learned about fetch, remember:

💡 Network latency is travel time.

Network calls are slow. Even the shortest round trip on a local area network takes around 500 microseconds (µs) or 500,000 nanoseconds (ns). This is thousands of times slower than accessing RAM (~100 ns) and several times slower than a fast local SSD (~16-150 µs).

The further data must travel, the longer it takes. This is why we try to store data close to where we need it. A round trip between continents, like the US and Europe, can easily take 150 milliseconds (ms) or 150,000,000 ns. This is orders of magnitude slower than any local operation.

Thinking back to the strategies we mooted in Expensive Operations, there are two main network related costs:

🦥 Fetching Too Much Data

In designs one and three, we just get everything and sort it out in the browser. This sends too much data over the network! Every single byte of data we request extends the time we have to wait. Once we have all this data, we have to waste yet more time and energy sorting through items we knew we could never need or use.

🦥 Making Too Many Requests

In the second design, we make network calls inside a loop. Each network call in the loop adds latency. If we follow 50 people, we wait for the network 50+ times! This is sometimes called an “N+1 API problem”.
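A sketch of the difference (the endpoint paths and query parameters here are made up for illustration; the real Purple Forest API may differ):

```js
// ❌ N+1 style: one request per followed person, each one adding a full network round trip.
async function fetchBloomsOneByOne(followedIds) {
  const blooms = [];
  for (const id of followedIds) {
    const response = await fetch(`/users/${id}/blooms`); // hypothetical endpoint
    blooms.push(...(await response.json()));
  }
  return blooms;
}

// ✅ Batched: one request, one round trip, the server does the gathering.
async function fetchBloomsBatched(followedIds) {
  const response = await fetch(`/blooms?authors=${followedIds.join(",")}`); // hypothetical endpoint
  return response.json();
}
```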

🤑 Too much, too often

Both looping network calls and fetching/processing huge amounts of data are examples of overusing a limited resource. They make the user wait.

Remember: after just three seconds of waiting, half of all your users have given up and left.

But there’s a further problem with making too many requests or asking for too much data. Let’s think again about our “N+1” problem.

🎟️ N+1 Query Problem


You fetch data from an endpoint, but where does the endpoint get its data? Usually from a database, by sending a query.

💡 Databases also take time to find data

Database calls are expensive. When our backend needs data from the database, it sends a query. If we’re not careful, we can end up making many more queries than necessary.

🟣 Purple Forest Profile Picture

We are building a new feature of “user profile picture”. We already show blooms from people you follow. Now each bloom also needs to show the author’s name and profile picture.

Let’s say we are using a SQL database. A simple, but inefficient, way to fetch this data from the database might be:

  1. Get the IDs of the 50 people you follow: SELECT user_id FROM follows WHERE follower_id = current_user_id (1 query)
  2. Get the latest 10 bloom IDs for each of those 50 people: SELECT bloom_id FROM blooms WHERE author_id = ? ORDER BY timestamp DESC LIMIT 10 (50 separate queries, one for each followed user)
  3. Now we have, say, 100 unique bloom IDs. For each of those 100 blooms, get the bloom content: SELECT * FROM blooms WHERE bloom_id = ? (100 separate queries!)
  4. And for each of those 100 blooms, get the author’s details: SELECT name, avatar_url FROM users WHERE user_id = ?

This pattern (one initial query followed by N queries inside a loop) is the “N+1 Query Problem”. In this example, it’s more like 1 + N + M + P queries! It results in hundreds of separate trips to the database.
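In backend code the pattern tends to look something like this (a sketch only, assuming a hypothetical db.query(sql, params) helper that sends one SQL query and returns its rows):

```js
const db = require("./db"); // hypothetical database helper module

async function getTimelineBlooms(currentUserId) {
  // 1 query: who do we follow?
  const follows = await db.query(
    "SELECT user_id FROM follows WHERE follower_id = ?",
    [currentUserId]
  );

  // N more queries: one per followed user. This loop is the "N" in N+1,
  // and the same shape repeats again for bloom contents and author details.
  const blooms = [];
  for (const follow of follows) {
    const rows = await db.query(
      "SELECT bloom_id FROM blooms WHERE author_id = ? ORDER BY timestamp DESC LIMIT 10",
      [follow.user_id]
    );
    blooms.push(...rows);
  }
  return blooms;
}
```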

🕹️Draft this feature

  1. In your notebook, plan how you would build this feature in your local copy of Purple Forest.
  2. Now build out the feature from the back end to the front. Don’t worry about the layout, just focus on finding the most efficient route through the system.

🌊 Flooding and Performance

Even if each individual database query is reasonably fast, making hundreds of them back-to-back has serious consequences.

We’ve already seen that every query adds network delay and processing time. This is the latency problem we considered before. But there’s also a problem with the number of queries. There are only so many queries you can fit in a queue. Sending a burst of hundreds of simple queries can overwhelm the database server. This is sometimes called “flooding” the database.

The server has to handle each request individually, consuming resources (CPU, memory, connections). If many users trigger this N+1 pattern at once, the database can slow down for everyone, or even fall over entirely.

This N+1 problem can happen with any database interaction if you loop and query individually. Understanding this helps you write code that doesn’t accidentally overload the database.

🤔 What is the N+1 Query Problem?

📦 What to do instead

The real /home endpoint avoids these problems by using efficient strategies:

  • Batching: Ask for many items in one request: “get blooms for users Daniel, Ali, and Sally”. (See the sketch after this list.)
  • Caching: Store results so you don’t have to ask the network again. Ask for only new changes in future.
  • Pagination: Ask for only the first page of results. Load more later if the user scrolls or clicks “next”.
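To make the batching idea concrete, the per-user and per-bloom queries from earlier could collapse into a single trip to the database (a sketch, reusing the hypothetical db.query helper and the table names from the queries above):

```js
const db = require("./db"); // hypothetical database helper module, as before

// One batched query instead of hundreds: recent blooms from everyone we follow,
// with each author's name and avatar joined in the same request.
async function getTimelineBloomsBatched(currentUserId) {
  return db.query(
    `SELECT blooms.*, users.name, users.avatar_url
       FROM blooms
       JOIN users ON users.user_id = blooms.author_id
      WHERE blooms.author_id IN (
              SELECT user_id FROM follows WHERE follower_id = ?
            )
      ORDER BY blooms.timestamp DESC
      LIMIT 100`,
    [currentUserId]
  );
}
```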

All these are ways to save the data we need, close to where we need it. But each strategy also has downsides.

  • Batching may reduce our responsiveness. It will probably take longer to fetch three users’ blooms than one user’s blooms. If we’d just asked for one user’s blooms, then the next user’s, we probably could have shown the user some results sooner. Batching forces us to wait for all of the results before we show anything.
  • Caching may result in stale results. If we store a user’s most recent blooms, and when they re-visit the page we don’t ask the database for the most recent blooms, it’s possible we’ll return old results missing the newest blooms.
  • Pagination means the user doesn’t have complete results up-front. If they want to ask a question like “has this user ever bloomed the word cheese”, they may need to keep scrolling and searching to find the answer (with each scroll requiring a separate network call and database lookup).