Update
PR 31721 has been merged into Laravel 7.0.8. It stops forward slashes from being escaped in the json encoding. Before this, encrypting the same data could give you results of varying size. Now, as of 7.0.8, encrypting the same data will give you the same size result every time.
TL;DR:
Laravel's encrypt method will return a string, so the datatype should be a varchar or text variation, depending on the size of the data being encrypted.
To determine the approximate size, you can use the following series of calculations:
Laravel >= 7.0.8
Let a = the size of the serialized unencrypted data (strlen(serialize($data)))
Let b = a + 16 - (a MOD 16) (calculate size of encrypted data)
Let c = (b + 2 - ((b + 2) MOD 3)) / 3 * 4 (calculate size of base64 encoded data)
Let d = c + 117 (add size of MAC, IV, and json encoding)
Let e = (d + 2 - ((d + 2) MOD 3)) / 3 * 4 (calculate size of base64 encoded data)
Even though the value is not deterministic, the size of the result is. For example, if you were to encrypt a 9 digit social security number, the result will always be 216 characters.
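If you'd rather not do the arithmetic by hand, the steps above can be sketched as a few small PHP functions. The function names (paddedSize, base64Size, encryptedSize) are made up for this example and are not part of Laravel; the 117 is the fixed overhead from step d.

```php
<?php
// Size of the AES-CBC ciphertext: always rounds up to the next multiple of 16.
function paddedSize(int $bytes): int
{
    return $bytes + 16 - ($bytes % 16);
}

// Length of the base64 encoding of $bytes bytes, including '=' padding.
// intdiv($bytes + 2, 3) is the same as ($bytes + 2 - (($bytes + 2) MOD 3)) / 3.
function base64Size(int $bytes): int
{
    return intdiv($bytes + 2, 3) * 4;
}

// Estimated length of the encrypted string on Laravel >= 7.0.8 (hypothetical helper).
function encryptedSize($data): int
{
    $a = strlen(serialize($data)); // serialized plaintext
    $b = paddedSize($a);           // raw ciphertext
    $c = base64Size($b);           // base64 encoded ciphertext
    $d = $c + 117;                 // plus MAC (64), IV (24), and json overhead (29)

    return base64Size($d);         // final base64 encoded payload
}

echo encryptedSize('123456789'), PHP_EOL; // 216
```

Pass in the longest value you expect to store and use the result as the minimum column length.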
Laravel < 7.0.8
Let a = the size of the serialized unencrypted data (strlen(serialize($data)))
Let b = a + 16 - (a MOD 16) (calculate size of encrypted data)
Let c = (b + 2 - ((b + 2) MOD 3)) / 3 * 4 (calculate size of base64 encoded data)
Let d = c + 117 + 8 + ((c + 2 - ((c + 2) MOD 3)) / 3) (add size of MAC, IV, and json encoding, plus extra buffer for potentially escaped slashes)
Let e = (d + 2 - ((d + 2) MOD 3)) / 3 * 4 (calculate size of base64 encoded data)
For example, if you were to encrypt a 9 digit social security number, the result would be at minimum 216 characters, and at maximum 308 characters (though this is probably a statistical impossibility). If you run a loop of 100000+ encryptions, you'll see the size is usually in the 216 - 224 range. The formula provided above would tell you to set your field to 248 characters, which is a healthy buffer above the expected range, but not statistically impossible to hit.
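The buffered calculation for older versions can be sketched the same way. Again, the function names are hypothetical and only mirror the formula above; the extra 8 + c/3 term is the slash buffer (one possible escaped slash for every three base64 characters of the IV and the encrypted value).

```php
<?php
// Length of the base64 encoding of $bytes bytes, including '=' padding.
function base64Size(int $bytes): int
{
    return intdiv($bytes + 2, 3) * 4;
}

// Buffered length estimate for Laravel < 7.0.8 (hypothetical helper).
function bufferedEncryptedSize($data): int
{
    $a = strlen(serialize($data)); // serialized plaintext
    $b = $a + 16 - ($a % 16);      // raw ciphertext
    $c = base64Size($b);           // base64 encoded ciphertext

    // 117 fixed bytes, plus the buffer for escaped slashes:
    // 8 for the 24 character IV and ceil(c / 3) for the encrypted value.
    $d = $c + 117 + 8 + intdiv($c + 2, 3);

    return base64Size($d);
}

echo bufferedEncryptedSize('123456789'), PHP_EOL; // 248
```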
Details:
The value returned from the encrypt method is not just the encrypted text, but is a base64 encoded representation of a json encoded payload array that contains (1) the base64 encoded encrypted value of the serialized data, (2) the base64 encoded initialization vector (IV), and (3) the message authentication code (MAC). So, to determine the size of the field needed, you will need to know the max size of the data that will be encoded, and then add on some extra room for these extra pieces of information that are stuffed in the returned string.
First, let's calculate the max size of your encrypted value. Since your encryption algorithm (AES-256-CBC) is a block cipher, this is pretty easily done with a formula. AES uses 16 byte blocks, and the padding applied before encryption (PKCS#7) always adds at least one byte, so the size of the encrypted value will be the next multiple of 16 above the original size. So, if your original data is 30 bytes, your encrypted data will be 32 bytes. If your original data is 32 bytes, your encrypted data will be 48 bytes (since at least one byte of padding is required, your 32 bytes becomes 33, and then that goes up to the next multiple of 16, which is 48). The formula for this would be x + 16 - (x MOD 16). So, for 30 bytes you get 30 + 16 - (30 MOD 16) = 32.
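If you want to convince yourself of the padding formula, a quick check against openssl_encrypt looks something like this. It assumes the openssl extension is loaded; paddedSize is a name invented for this example, and OPENSSL_RAW_DATA is used so the raw ciphertext length is measured before any base64 encoding.

```php
<?php
// Formula from the text: next multiple of 16 above the input size.
function paddedSize(int $bytes): int
{
    return $bytes + 16 - ($bytes % 16);
}

$key = random_bytes(32); // 256 bit key
$iv  = random_bytes(16); // one AES block

foreach ([15, 16, 30, 32] as $length) {
    // OPENSSL_RAW_DATA returns the raw ciphertext instead of base64 text.
    $raw = openssl_encrypt(str_repeat('x', $length), 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);

    printf("%d bytes in, %d bytes out (formula says %d)\n", $length, strlen($raw), paddedSize($length));
}
```

The 30 and 32 byte inputs should come out as 32 and 48 bytes, matching the examples above.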
When calculating the size of the encrypted value, keep in mind that the data being encrypted is first serialized. So, for example, if you are encrypting a social security number, the plain value is only 9 characters, but the serialized value is actually 16 characters (s:9:"xxxxxxxxx";). Since the serialized value is what is actually encrypted, and it is 16 bytes, the size of the encrypted value will be 32 bytes (16 + 16 - (16 MOD 16) = 32).
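You can see the serialization overhead for yourself with a couple of lines. The values here are just placeholders; note that the same digits stored as an integer serialize to a different length than the string, so measure the type you will actually encrypt.

```php
<?php
$serialized = serialize('123456789');

echo $serialized, PHP_EOL;         // s:9:"123456789";
echo strlen($serialized), PHP_EOL; // 16

// The same digits as an integer serialize as i:123456789;
echo strlen(serialize(123456789)), PHP_EOL; // 12
```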
In addition to this, the openssl_encrypt function, as Laravel calls it, returns the encrypted data already base64 encoded. Base64 encoding increases the size of the value by a factor of about 4/3: for every 3 bytes in the original data, base64 encoding will generate a 4 byte (character) representation. So, for the SSN example, the encrypted result is 32 bytes. When translating to base64, 32 bytes gives us 32 / 3 = 10.67 (rounded) 3 byte segments. Since base64 pads the last partial segment out to a full one, take the ceiling and multiply by 4, which gives 11 * 4 = 44 bytes. So, our original 32 byte encrypted value becomes a 44 character string. If you need a formula for this, you can use (x + 2 - ((x + 2) MOD 3)) / 3 * 4. So, (32 + 2 - ((32 + 2) MOD 3)) / 3 * 4 = 44.
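The base64 formula is easy to check against PHP's own base64_encode. This is only a sanity check; base64Size is a made-up helper name, and the sample sizes are the ones that show up in this answer.

```php
<?php
// (x + 2 - ((x + 2) MOD 3)) / 3 * 4, written with integer division.
function base64Size(int $bytes): int
{
    return intdiv($bytes + 2, 3) * 4;
}

foreach ([16, 32, 161, 184] as $bytes) {
    $actual = strlen(base64_encode(str_repeat('x', $bytes)));

    printf("%d bytes -> formula %d, actual %d\n", $bytes, base64Size($bytes), $actual);
}
```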
The next piece of information is the MAC. The MAC is a SHA256 based hash (an HMAC) stored as a hex string, so we know that it will always be 64 characters.
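The fixed MAC length is easy to confirm with PHP's hash_hmac. The message and key below are placeholders; the length is the same no matter what you feed in.

```php
<?php
// Hex output (the default): 64 characters for SHA256.
$mac = hash_hmac('sha256', 'any message', 'any key');
echo strlen($mac), PHP_EOL; // 64

// Raw binary output, for comparison: 32 bytes.
$rawMac = hash_hmac('sha256', 'any message', 'any key', true);
echo strlen($rawMac), PHP_EOL; // 32
```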
The final piece of information is the IV. The plain IV is 16 random bytes. The IV stored in the payload array is the base64 encoded value of the plain IV. So, we can use the formula above to calculate the size of the base64 encoded IV: (16 + 2 - ((16 + 2) MOD 3)) / 3 * 4 = 24.
These three pieces of information are compacted into an array, and then json_encoded. Because of the json representation and the name of the values in the array, this adds another 29 bytes.
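To see where the 29 bytes come from, you can json encode the same three keys with empty values. What's left is just the key names, quotes, colons, commas, and braces.

```php
<?php
$overhead = json_encode(['iv' => '', 'value' => '', 'mac' => '']);

echo $overhead, PHP_EOL;         // {"iv":"","value":"","mac":""}
echo strlen($overhead), PHP_EOL; // 29
```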
Additionally, in Laravel < 7.0.8, any forward slashes in the base64 encoded data are escaped with backslashes in the json string, so this adds a variable number of bytes depending on how many forward slashes are present. For the SSN example, there are 68 characters of base64 encoded data (44 for the encrypted data, 24 for the IV). As a buffer, let's assume that at most about 1/3 of those characters are forward slashes, or about 23 extra bytes. This is generous, since a slash is only one of the 64 characters in the base64 alphabet. In Laravel >= 7.0.8, these forward slashes are not escaped, so there are no extra bytes.
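The difference is just the JSON_UNESCAPED_SLASHES flag on json_encode. A small illustration with a made-up string containing two slashes:

```php
<?php
$value = 'ab/cd/ef';

$escaped   = json_encode($value);                         // default: slashes are escaped
$unescaped = json_encode($value, JSON_UNESCAPED_SLASHES); // slashes left alone

echo $escaped, ' (', strlen($escaped), ')', PHP_EOL;     // "ab\/cd\/ef" (12)
echo $unescaped, ' (', strlen($unescaped), ')', PHP_EOL; // "ab/cd/ef" (10)
```

Each slash costs exactly one extra byte when escaped, which is why the size varied with the random data before 7.0.8.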
Finally, this json_encoded value is base64_encoded, which will again increase the size by a factor of about 4/3.
So, to put this all together, let's again imagine you're encrypting a social security number. The openssl_encrypt result will be 44 characters, the MAC is 64 characters, the IV is 24 characters, and the json representation adds another 29 characters.
In Laravel < 7.0.8, there is also the buffer of an extra 23 characters. This gives us (44 + 64 + 24 + 29 + 23 = 184) characters. This result gets base64 encoded, which gives us ((184 + 2 - ((184 + 2) MOD 3)) / 3 * 4 = 248) characters.
In Laravel >= 7.0.8, there is no extra buffer. This gives us (44 + 64 + 24 + 29 = 161) characters. This result gets base64 encoded, which gives us ((161 + 2 - ((161 + 2) MOD 3)) / 3 * 4 = 216) characters.
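If you want to check these numbers empirically without a Laravel install, here is a rough stand-in that follows the steps described above. It is not Laravel's actual code: sketchEncrypt and sizeRange are invented names, the exact MAC input is an assumption (only its length matters here), and the openssl extension is assumed to be loaded.

```php
<?php
// Rough imitation of the encrypt steps described in this answer (not Laravel's code).
function sketchEncrypt($data, string $key, int $jsonFlags): string
{
    $iv = random_bytes(16);

    // With options = 0, openssl_encrypt returns base64 text.
    $value = openssl_encrypt(serialize($data), 'aes-256-cbc', $key, 0, $iv);
    $iv    = base64_encode($iv);

    // Assumed MAC input; any HMAC-SHA256 hex string is 64 characters.
    $mac = hash_hmac('sha256', $iv . $value, $key);

    return base64_encode(json_encode(compact('iv', 'value', 'mac'), $jsonFlags));
}

// Encrypt the same value repeatedly and report the smallest and largest result.
function sizeRange(int $jsonFlags, int $runs = 1000): array
{
    $key   = random_bytes(32);
    $sizes = [];

    for ($i = 0; $i < $runs; $i++) {
        $sizes[] = strlen(sketchEncrypt('123456789', $key, $jsonFlags));
    }

    return [min($sizes), max($sizes)];
}

[$min, $max] = sizeRange(JSON_UNESCAPED_SLASHES); // like Laravel >= 7.0.8
echo "unescaped slashes: $min to $max", PHP_EOL;  // 216 to 216

[$min, $max] = sizeRange(0);                      // like Laravel < 7.0.8
echo "escaped slashes:   $min to $max", PHP_EOL;  // varies from run to run
```

With unescaped slashes the size should stay at 216 on every run. With escaped slashes it should wander a little above 216, depending on how many slashes the random data happens to produce.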