Have you heard about Envirotechnical? Let's fight climate change one byte at a time :) Learn More →

logo
Published on

The fancy N+1 problem, a tutorial to f**k up your code

Authors
Table of Contents

What is the N+1 problem?

Yes, I know, you've been waiting for this your whole life, searching for some Guru to spread the word about the misusage of ORMs and why it's always a bad choice to use them.

Well, lucky for you, you've come to the wrong place again! But don't worry too much or your head will go puff, and we don't want that.

Believe in yourself and your ability to overcome difficulties

Let's start with the simple what is what, why do we exist and is this the matrix questions. Spoiler alert: it probably isn't, although you should take some time off to read Aldous Huxley more than you did Orwell, both very good dystopic authors tho.

N+1 is many things, but first and foremost, it's an anti-pattern. It can be considered so because you, the genius programmer, either used too many loops or you used them in the wrong spot. Sometimes the N+1 problem can be solved with the saying Give to Caesar what is of Caesar, which means: write a fuckin join statement - roughly translated from Latin, which is not my first language.

So, how does it happen?

Whenever you intentionally write a query that requires your code to execute N additional queries, you are probably entering the N+1 realms. If you need to fetch N:M data (which is usually the case with N+1 problems), you should rather use a JOIN statement in your queries.

Believe in yourself and your ability to overcome difficulties

Once you have all the data joined together only then will you have the Server (or preferably the Client application) do the heavy lifting, so to avoid unnecessary overloading of the server resources.

A clear example in SQL could be:

SELECT "employees".* FROM "employees" ORDER BY "employees"."id"
SELECT "tickets".* FROM "tickets" WHERE "tickets"."employee_id" = 1
SELECT "tickets".* FROM "tickets" WHERE "tickets"."employee_id" = 2
SELECT "tickets".* FROM "tickets" WHERE "tickets"."employee_id" = 3
SELECT "tickets".* FROM "tickets" WHERE "tickets"."employee_id" = 4
SELECT "tickets".* FROM "tickets" WHERE "tickets"."employee_id" = 5

Or, with Javascript:

const Employees = db.Employees.findAll()
const n = Employees.length

for (let idx = 0; idx < n; idx++) {
  Employees[idx]['tickets'] = db.Tickets.find({
    employee_id: {
      [Op.eq]: Employees[idx]['id'],
    },
  })
}

How do we solve this issue? With a JOIN statement.

SELECT employees.employee_name, count(*) as number_of_tickets FROM employees
JOIN tickets ON employees.user_id = employee_id
GROUP BY employees.employee_name
ORDER BY number_of_tickets ASC;

This way all we need to do is fetch the data we need directly from our data store which is optimized to run this kind of queries and will not suffer the subject of this talk, if used as intended.

The goodbye

I hope you found this article useful and to your liking and if you have any requests, drop a message on one of my social media accounts or open an issue/start a discussion on github, on this repository!

As always you can find me on Twitter, listen to my Podcast on Spotify and add me on LinkedIn to talk professionally (yeah, right)