How to generate unique initials for users with Laravel and Faker

JCVD - Generated from Jean-Claude Van Damme

In an application I’m working on, users are represented by their initials. To generate seed data I used Faker to fill in their information, but I struggled to generate unique initials from their first and last names.

Here’s my initial User Factory.

$factory->define(User::class, function (Faker $faker) {
    $first_name = $faker->firstName;
    $last_name = $faker->lastName;

    return [
        'initials' => $first_name[0] . $last_name[0],
        'first_name' => $first_name,
        'last_name' => $last_name,
        'email' => $faker->unique()->safeEmail,
        'password' => '$2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi', // password
        'remember_token' => Str::random(10),
    ];
});

You can see how I implemented this at first. Of course this works great, but I ran into some issues that might be obvious to you.

Initials must be unique

My first issue occurred when multiple users shared the same initials. The seeder will throw a QueryException if you set your initials column as unique.

Here was my first attempt at solving this issue.

$factory->define(User::class, function (Faker $faker) {
    $first_name = $faker->firstName;
    $last_name = $faker->lastName;

    $suffix = 0;
    do {
        $initials = $first_name[0];
        for ($i = 0; $i <= $suffix; $i++) {
            $initials .= $last_name[$i];
        }
        $suffix++;
    } while (User::where('initials', $initials)->count() === 1);

    return [
        'initials' => $initials,
        'first_name' => $first_name,
        'last_name' => $last_name,
        'email' => $faker->unique()->safeEmail,
        'password' => '$2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi', // password
        'remember_token' => Str::random(10),
    ];
});

I thought this would fix it, but I didn’t account for the fact that the seeder only saves the models to the database after it has generated all of them.

This means that my query User::where('initials', $initials)->count() will not take the newly generated models into account.

How to generate unique fields with Model Factory?

If the field you want to be unique is generated from Faker’s API, you can use the unique() method on it. You can see I already use that in the email field.

But the initials are not generated by Faker’s API in my case, so I have to do something else.

In fact it’s possible to share a variable throughout the factory loop (if you create several models at once). You have to use a static variable for this. Let’s see how we can use this static variable to solve our issue.
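To see why a static variable does the job, here is a minimal plain-PHP sketch with no Laravel involved. The $definition closure and its return keys are hypothetical stand-ins for a factory definition. The only point is that a static variable declared inside a closure keeps its value between calls of that same closure.

```php
<?php

// Hypothetical stand-in for a factory definition: any closure behaves the same way.
$definition = function (string $candidate): array {
    // Initialised once, then kept between calls of this closure.
    static $seen = [];

    $isNew = !in_array($candidate, $seen, true);
    if ($isNew) {
        $seen[] = $candidate;
    }

    return [
        'candidate'  => $candidate,
        'is_new'     => $isNew,
        'seen_count' => count($seen),
    ];
};

$first  = $definition('JD'); // 'JD' has not been seen yet
$second = $definition('JD'); // same closure, so 'JD' is remembered
$third  = $definition('AB'); // a second distinct value is recorded
```

The list only lives for the duration of the PHP process, so it resets on every seeder run.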

$factory->define(User::class, function (Faker $faker) {
    static $existingInitials;

    $existingInitials = $existingInitials ?: [];

    $first_name = $faker->firstName;
    $last_name = $faker->lastName;

    $suffix = 0;
    do {
        $initials = $first_name[0];
        for ($i = 0; $i <= $suffix; $i++) {
            $initials .= $last_name[$i];
        }
        $suffix++;
    } while (in_array($initials, $existingInitials));

    $existingInitials[] = $initials;

    return [
        'initials' => $initials,
        'first_name' => $first_name,
        'last_name' => $last_name,
        'email' => $faker->unique()->safeEmail,
        'password' => '$2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi', // password
        'remember_token' => Str::random(10),
    ];
});

Great! It works!

What about special characters?

Soon after this I ran into another issue when I started seeding more users. With more users, the initials have to get longer to stay unique. And while few names have a special character in the first position, it is much more common in the second position.

Illuminate\Database\QueryException  : SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xC3' for column 'initials' at row 1 (SQL: insert into `users` (`initials`, `first_name`, `last_name`, `email`, `password`, `remember_token`, `updated_at`, `created_at`) values (JL�, Jade, Léonard, [email protected], $2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi, 65Yeja16Fg, 2019-07-18 11:54:13, 2019-07-18 11:54:13))

The query fails because \xC3 is not a valid string value on its own. It is the first of the two bytes (\xC3\xA9) that encode the é character of Léonard in UTF-8.
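A small plain-PHP sketch makes the byte issue visible. It assumes PHP 7 or later for the "\u{E9}" escape, and the variable names are only illustrative. String offsets and strlen() work on bytes, so reading a single offset can cut a multi-byte character in half.

```php
<?php

// "\u{E9}" is é; PHP stores it as the two UTF-8 bytes C3 A9.
$lastName = "L\u{E9}onard";

// strlen() counts bytes, not characters: 7 letters, but é takes 2 bytes.
$byteLength = strlen($lastName);

// A string offset addresses a single byte: here only the first half of é.
$secondByte = bin2hex($lastName[1]);

// Reading both bytes together gives the complete character.
$wholeChar = bin2hex(substr($lastName, 1, 2));
```

Here $secondByte is the lone c3 byte that MySQL rejected in the error above.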

What I want in this case is to transform the é into an e.

My first instinct was to search on Stack Overflow. The answers there offer some solutions, but they seemed quite complicated. I tried the iconv function but it didn’t work the way I wanted (it converted é into 'e). Other solutions required an array of replacements for special characters, which seemed a bit too much for my case.

Finally I looked into the Laravel helpers. At first there didn’t seem to be any helper for this, but when I looked into the Str::slug() helper, I noticed it uses another helper called Str::ascii(), which is exactly what I needed.

Let’s put this into the factory.

use Illuminate\Support\Str;

$factory->define(User::class, function (Faker $faker) {
    static $existingInitials;

    $existingInitials = $existingInitials ?: [];

    $first_name = $faker->firstName;
    $last_name = $faker->lastName;

    $suffix = 0;
    do {
        $initials = Str::ascii($first_name)[0];
        for ($i = 0; $i <= $suffix; $i++) {
            $initials .= Str::ascii($last_name)[$i];
        }
        $suffix++;
    } while (in_array($initials, $existingInitials));

    $existingInitials[] = $initials;

    return [
        'initials' => $initials,
        'first_name' => $first_name,
        'last_name' => $last_name,
        'email' => $faker->unique()->safeEmail,
        'password' => '$2y$10$92IXUNpkjO0rOQ5byMi.Ye4oKoEa3Ro9llC/.og/at2.uheWG/igi', // password
        'remember_token' => Str::random(10),
    ];
});

Do note that we are transliterating the whole last name and not just a single character. When the last name is Léonard, $last_name[1] returns only the first byte of the é, which shows up as a special character (Ã).

Nice! I now have unique initials without special characters.

Make sure initials are uppercase

Of course initials must be uppercase. I forgot this, so let’s quickly change the code.

do {
    $initials = strtoupper(Str::ascii($first_name)[0]);
    for ($i = 0; $i <= $suffix; $i++) {
        $initials .= strtoupper(Str::ascii($last_name)[$i]);
    }
    $suffix++;
} while (in_array($initials, $existingInitials));

Illuminate\Database\QueryException Duplicate entry

What?! How?! Why?!

I seeded more users at once and ran into a new QueryException: Duplicate entry on the initials.

It turns out that one of the users was named Arne De Pauw. Another user already had the initials ADE, so the script tried to generate the initials with one additional character: "ADE " (with a trailing space).

If you compare 'ADE' with 'ADE ' in MySQL, they are treated as equal: the trailing space is not taken into account, so the unique index sees a duplicate.

Let’s fix this.

// Skip separators so the initials never contain (or end with) a space, hyphen or apostrophe.
$character = strtoupper(Str::ascii($last_name)[$i]);
if (in_array($character, [' ', '-', '\'', ''])) {
    continue;
}
$initials .= $character;

What if I already have users in my DB?

Good question. We can easily handle this by setting the initial value of $existingInitials to the list of initials already in the database.

$existingInitials = $existingInitials ?: User::pluck('initials')->toArray();

One more thing…

Before we finish this, you may run into a final error when seeding a large number of users.

ErrorException : Uninitialized string offset: 3

This error will be thrown if the last name is not long enough to make a unique set of initials.

In my case it was an issue with a short last name of 5 letters. If you have a lot of users, the length of initials will start to grow and reach 5 or 6 letters.

A simple solution I came up with: when the last name is not long enough, I just append X characters to the initials.

$safeLastName = Str::ascii($last_name);
$initials = strtoupper(Str::ascii($first_name)[0]);
for ($i = 0; $i <= $suffix; $i++) {
    if (! isset($safeLastName[$i])) {
        $initials .= 'X';
        continue;
    }

    $character = $safeLastName[$i];
    if (in_array($character, [' ', '-', '\'', ''])) {
        continue;
    }
    $initials .= strtoupper($character);
}
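Putting the pieces together, the whole loop could look roughly like the sketch below. This is not the code from the gist: uniqueInitials() is a hypothetical helper written in plain PHP so it can run outside of Laravel. It assumes both names are non-empty and already transliterated (for example with Str::ascii()), and it takes the list of existing initials by reference instead of using a static variable.

```php
<?php

/**
 * Hypothetical helper (not from the original gist): builds unique initials.
 * Assumes both names are non-empty ASCII strings, e.g. after Str::ascii().
 */
function uniqueInitials(string $firstName, string $lastName, array &$existing): string
{
    $separators = [' ', '-', "'"];
    $suffix = 0;

    do {
        $initials = strtoupper($firstName[0]);

        for ($i = 0; $i <= $suffix; $i++) {
            if (!isset($lastName[$i])) {
                // Last name exhausted: pad with X so the candidate keeps growing.
                $initials .= 'X';
                continue;
            }
            if (in_array($lastName[$i], $separators, true)) {
                // Never put a separator into the initials.
                continue;
            }
            $initials .= strtoupper($lastName[$i]);
        }

        $suffix++;
    } while (in_array($initials, $existing, true));

    // Remember the result so the next call cannot reuse it.
    $existing[] = $initials;

    return $initials;
}

$taken  = ['AD', 'ADE'];
$first  = uniqueInitials('Arne', 'De Pauw', $taken); // has to skip the space
$second = uniqueInitials('Al', 'Do', $taken);        // needs one extra letter
$third  = uniqueInitials('Ann', 'Do', $taken);       // last name too short, padded with X
```

Inside the factory, the static $existingInitials array would play the role of $taken.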

Conclusion

We now have a pretty good factory that can generate unique initials.

It can still be improved. Here are some ideas:

  • When names have multiple parts, use the first character of each part (Jean-Claude Van Damme would become JCVD)
  • Use a number instead of X when initials already exist so the initials are shorter
  • Extract the logic into a helper class so you can use it anywhere in your project
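As a starting point for the first idea, here is a rough sketch of how the multi-part initials could be built. initialsFromParts() is a hypothetical helper, not part of the factory above. It assumes ASCII input (for example after Str::ascii()) and treats spaces, hyphens and apostrophes as part separators. It does not guarantee uniqueness on its own, so it would still need to be combined with the list of existing initials.

```php
<?php

/**
 * Hypothetical helper: takes the first letter of every part of a name.
 * Assumes ASCII input and treats spaces, hyphens and apostrophes as separators.
 */
function initialsFromParts(string $fullName): string
{
    // Split on runs of separators and drop empty parts.
    $parts = preg_split("/[\\s'-]+/", $fullName, -1, PREG_SPLIT_NO_EMPTY);

    $initials = '';
    foreach ($parts as $part) {
        $initials .= strtoupper($part[0]);
    }

    return $initials;
}

$jcvd = initialsFromParts('Jean-Claude Van Damme');
$adp  = initialsFromParts('Arne De Pauw');
```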

The code of this factory is available on GitHub: https://gist.github.com/depsimon/d26d2809b9269fbe0f7f82b9d2ae2fc6
