SQL n + 1 Selects Explained

Posted by karldmoore on February 5, 2009

The SQL n + 1 selects problem is extremely common but I have often found that many people have either never heard of it or simply don’t understand it. It is actually very easy to introduce a problem like this into your code, but it’s also very easy to resolve as well. Problems like this are best explained with an example; so imagine we have a table called users and another called user_roles. These tables are setup with a one-to-many relationship, meaning that one user (e.g. jsmith) can have many roles (e.g. Administrator, Auditor, Developer). Many people might implement something like this;

public Iterable<User> allUsers() {
    final String selectUsers = "select users.username, users.email, " +
        "users.last_password_change from users";
    return getJdbcTemplate().query(selectUsers, new Object[] {},
                new ParameterizedRowMapper<User>() {
        public User mapRow(ResultSet resultSet, int rowNumber) throws SQLException {
            String username = resultSet.getString("username");
            String email = resultSet.getString("email");
            Date lastPasswordChange = resultSet.getDate("last_password_change");
            User user = new DefaultUser(username, email, lastPasswordChange);
            addRolesToUser(user);
            return user;
        }
    });
}

private void addRolesToUser(final User user) {
    final String selectUserRoles = "select role_name from user_roles where username = ?";
    getJdbcTemplate().query(selectUserRoles, new Object[] { user.getPrincipalName() },
                new RowCallbackHandler() {
        public void processRow(ResultSet resultSet) throws SQLException {
            String rolename = resultSet.getString("role_name");
            user.addRole(rolename);
        }
    });
}

Reviewing the code we can see one query is executed to retrieve the users, the problem here is for each user another SQL statement needs to be executed to retrieve the roles. If the first query retrieved one user, this would require one additional query to retrieve the roles. If the first query retrieved a hundred users, this would require one hundred additional queries to retrieve the roles. The pattern will always be the same, one query for the users and n queries dependent on the number of users found, thus n + 1. Although this solution is functional, it does result in many unnecessary SQL statements being executed.

select users.username, users.email, users.last_password_change from users;
select role_name from user_roles where username = ?;
select role_name from user_roles where username = ?;
select role_name from user_roles where username = ?; 
...

Shared resources are typically the bottleneck in most applications, so expensive or unnecessary SQL should be avoided if possible. As the application attempts to scale, this bottleneck can become extremely problematic and severely inhibit application performance. Fortunately this is a simple solutions to this problem; introducing a join into the query.

public Iterable<User> allUsers() {
    final String selectUsers =
        "select users.username, users.email, users.last_password_change, user_roles.role_name "
            + "from users left join user_roles on (users.username = user_roles.username)";
    final Map<String, User> users = new HashMap<String, User>();
    getJdbcTemplate().query(selectUsers, new Object[] {}, new RowCallbackHandler() {
        public void processRow(ResultSet resultSet) throws SQLException {
            String username = resultSet.getString("username");
            if (!users.containsKey(username)) {
                String email = resultSet.getString("email");
                Date lastPasswordChange = resultSet.getDate("last_password_change");
                User user = new DefaultUser(username, email, lastPasswordChange);
                users.put(username, user);
            }

            String rolename = resultSet.getString("role_name");
            if (!StringUtil.isNull(rolename)) {
                User user = users.get(username);
                user.addRole(rolename);
            }
        }
    });
    return users.values();
}

Although the code and SQL statement are slightly more complex that the original example, it results in much fewer SQL statements being executed. Instead of the n + 1 statements executed in the first example, one statement is executed that fetches all the required data. This typically results in much improved performance and as the numbers scale the improvement in performance can become much more apparent.

select users.username, users.email, users.last_password_change, user_roles.role_name
    from users left join user_roles on (users.username = user_roles.username);

As with all performance optimizations the most important thing is to measure the effect of the improvement. Performance optimizations aren’t always predictable so by taking measurements before and after the change, you can accurately know if you have actually improved the performance (or made it worse). A SQL join may be the most appropriate way of solving a problem such as this, but there are other alternatives such as caching the data instead. Although the SQL n + 1 selects is an extremely common problem, is not always well understood and is often still found within code. It is very easy to introduce a problem like this into your code, but it’s also very easy to resolve as well. Next time you are viewing your debug output, see if you can spot SQL n + 1 selects.

References

Database access using Spring JdbcTemplate
Preventing the n + 1 select problem when using Hibernate

This entry was posted on February 5, 2009 at 06:00 and is filed under Development, Refactoring, Spring, SQL. Tagged: Bugs, Code Review, Examples, Improvement, Jdbc, Performance, Refactoring, Smells, Spring, SQL. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

5 Responses to “SQL n + 1 Selects Explained”

Embedded Database Performance: Handle With Care « Pramatr Blog said

February 11, 2009 at 20:22
[…] SQL n + 1 Selects Explained […]

Reply
SOA and the N + 1 Selects Problem | Summa Blog said

February 17, 2009 at 14:33
[…] through that service’s exposed interface – and this is where you come face-to-face with the n + 1 selects problem. In this architecture, your Customer Service would first have to query the customer data […]

Reply
ActiveRecord group by, count and joins « Tatva-Artha said

March 18, 2009 at 21:35
[…] is inefficient because of N+1 select […]

Reply
Boris K said

May 30, 2009 at 20:49
Thanks for explaining this! I was reading the Hibernate documentation, came across the term, and needed an explanation.

Reply
Solution Street Blog » Blog Archive Top 10 skills every software consultant should have - Solution Street Blog said

September 30, 2016 at 20:38
[…] the database is missing an index or a database call is being performed over and over again (e.g., N+1 problem). A good utility or test harness is good for identifying performance issues whether they are […]

Reply

karldmoore development

A development oriented blog, containing anything I've been working on, reading or thinking about.

Blog Stats

Categories

Archives

Recent Posts

Subscribe

Twitter Updates

Disclaimer

Meta