Reinvent your SQL coding with the newly released dbForge AI Assistant
Learn more

How to Use Regexp in MySQL With Practical Examples

Regular expressions are supported in various relational database management systems (RDBMSs), including MySQL. This method works with pattern matching in MySQL to make queries to databases more efficient. It also helps database specialists handle various data-related tasks in a more straightforward and precise manner.

This article will examine regular expressions, including standard operators and functions in MySQL, their characteristics, distinctive features, and use cases.

Key operators and functions for regular expressions in MySQL

The primary options for pattern matching in MySQL are the REGEXP and RLIKE operators. These operators check whether a string matches a specified pattern. A more flexible and precise approach is using MySQL's regular expression functions, which are now the recommended method for working with regular expressions.

MySQL offers robust support for regular expressions, which has significantly improved since version 8.0 with the transition to ICU-based regular expressions. Now MySQL uses ICU (International Components for Unicode) library for processing regular expressions which ensures full support for Unicode multibyte characters in patterns.

In this guide, we'll explore both the REGEXP operator and regular expression functions in MySQL, demonstrating their usage with practical examples. For this purpose, we'll use the popular sakila test database and dbForge Studio for MySQL as our integrated development environment (IDE) for writing and executing queries. We'll focus on query execution, but this powerful tool offers a comprehensive suite of features for managing MySQL and MariaDB databases.

REGEXP and RLIKE operators

The REGEXP operator in MySQL is a powerful tool that simplifies complex search operations. It allows developers to define search patterns with precision, making it easier to query large datasets, filter results, validate column data, and more. While REGEXP is the standard operator, RLIKE is its synonym: both function identically.

The syntax for both operators is the same:

SELECT column_name FROM table_name WHERE column_name REGEXP 'pattern';  
SELECT column_name FROM table_name WHERE column_name RLIKE 'pattern';  

Now, let's explore how these operators work with practical examples.

Assume we need to retrieve a list of all customers whose last names begin with "C" using the REGEXP operator:

SELECT  
  customer_id,  
  first_name,  
  last_name  
FROM customer  
WHERE last_name REGEXP '^C';  

MySQL REGEXP standard operator

Now, let's check how the RLIKE operator works. Since RLIKE is a synonym for REGEXP, it can be used for pattern matching in the same way. In the following example, we retrieve all films from the sakila database where the description includes the word "documentary":

SELECT  
  film_id,  
  title,  
  description  
FROM film  
WHERE description RLIKE 'documentary';  

MySQL RLIKE standard operator

Another option for pattern matching is the REGEXP_LIKE function. Let's examine how it differs from the standard operators.

REGEXP_LIKE function

MySQL provides a built-in REGEXP_LIKE function that allows you to search for specific patterns within a string using regular expressions. This function serves the same purpose as the REGEXP operator, but REGEXP_LIKE is a more powerful and flexible option. It is the preferred choice in MySQL 8.0 and later versions.

The syntax of the REGEXP_LIKE function is as follows:

REGEXP_LIKE(string, pattern)  

Parameters:

  • string - the string to search within.
  • pattern - the regular expression pattern used for searching.

Optionally, the query can include the match_type parameter - a string containing one or more characters that adjust the matching behavior.

There are five matching parameters that can be applied in the query:

Match type Meaning
c Case sensitive matching (default)
i Case insensitive matching
m Multi-line mode (^ and $ match start and end of lines, not just the whole string)
n Allows . (dot) to match newline characters
u Enables full Unicode matching

Overall, the REGEXP_LIKE function in MySQL provides a more structured approach to pattern matching. As a function, it offers greater flexibility by allowing additional parameters for more precise results. Additionally, it is portable to other databases.

Let's look at an example of the REGEXP_LIKE function in MySQL. Suppose we want to retrieve a list of actors whose last names end with "son". The query would be:

SELECT  
  actor_id,  
  first_name,  
  last_name  
FROM actor  
WHERE REGEXP_LIKE(last_name, 'son$', 'i');  

MySQL REGEXP_LIKE function

In the 'son$' pattern, the $ symbol ensures that "son" appears at the end of the last name. The i parameter enables case insensitive matching, meaning it will match "Son", "SON", "son", and similar variations.

REGEXP_INSTR function

The REGEXP_INSTR function in MySQL searches for a pattern within a string using a regular expression and returns the position where the match is found. If no match is found, it returns 0.

The syntax of the function is:

REGEXP_INSTR(expr, pattern [, start [, occurrence [, return_option [, match_type]]]])  

Parameters:

  • expr - the string in which to search for the pattern.
  • pattern - the regular expression pattern to search for.
  • start (optional) - the position in expr to start searching from (default is 1, meaning the beginning of the string).
  • occurrence (optional) - the occurrence of the pattern to search for (default is 1).
  • return_option (optional) - the value that should be returned: 0 (default) returns the starting position of the match, 1 returns the position of the first character after the match.
  • match_type (optional) - the matching behavior.

The REGEXP_INSTR function is useful when you need to determine the position of a specific pattern in a string, find multiple occurrences of a substring, or search for patterns in text fields such as email validation, phone numbers, and more.

For example, we want to find the position of the first vowel (a, e, i, o, u) in the last names of actors. We can use the following query:

SELECT  
  actor_id,  
  last_name,  
  REGEXP_INSTR(last_name, '[aeiou]', 1, 1, 0, 'i') AS first_vowel_position  
FROM sakila.actor  
LIMIT 30;  

MySQL REGEXP_INSTR function

In this query, the function searches for any vowel in last_name, starting from position 1. It finds the first occurrence of a vowel and returns its position. The i parameter makes the search case insensitive.

REGEXP_REPLACE function

The REGEXP_REPLACE function in MySQL searches for a regular expression pattern in a string and replaces it with a specified replacement string. This function is particularly useful for masking sensitive data, cleaning and standardizing data, and formatting text. It can replace parts of a string with specific symbols, remove extra spaces, numbers, or unnecessary characters, unify different spelling variations, and more.

This function is ideal for complex substitutions where simple string replacement does not meet all requirements.

The syntax of the function is:

REGEXP_REPLACE(expr, pattern, replacement [, position [, occurrence [, match_type]]])  

Parameters:

  • expr - the string where the replacement will occur.
  • pattern - the regular expression pattern to search for.
  • replacement - the string that will replace the matched pattern.
  • position (optional) - the starting position in expr to search (default is 1).
  • occurrence (optional) - the occurrence of the pattern to replace: 0 (default) replaces all occurrences, any positive number replaces only that specific occurrence.
  • match_type (optional) - the matching behavior.

Below is an example of using REGEXP_REPLACE in MySQL. Suppose we want to replace all vowels (a, e, i, o, u) in the last names of actors with an asterisk (*).

SELECT  
  actor_id,  
  last_name,  
  REGEXP_REPLACE(last_name, '[aeiou]', '*', 1, 0, 'i') AS masked_last_name  
FROM sakila.actor  
LIMIT 30;  

MySQL REGEXP_REPLACE function

In this query, the REGEXP_REPLACE function searches for vowels ([aeiou]) starting from position 1. The 0 parameter ensures that all occurrences are replaced, while the i parameter makes the search case insensitive, replacing both uppercase and lowercase vowels.

REGEXP_SUBSTR function

The REGEXP_SUBSTR function in MySQL extracts a substring from a string that matches a regular expression pattern. This function is useful in various scenarios, such as retrieving specific data parts or parsing structured strings. Additionally, REGEXP_SUBSTR is valuable for data validation and cleaning, as it can filter valid or invalid formats.

The syntax of the function is:

REGEXP_SUBSTR(string, pattern [, position [, occurrence [, match_type]]])  

Parameters:

  • string - the string to search (VARCHAR or TEXT data type).
  • pattern - the regular expression pattern to match.
  • position (optional) - the position in the string to start searching (default is 1).
  • occurrence (optional) - the occurrence of the pattern to return (default is 1).
  • match_type (optional) - the matching behavior.

In the example below, we want to extract the first word from film titles. We can achieve this using the REGEXP_SUBSTR function:

SELECT  
  film_id,  
  title,  
  REGEXP_SUBSTR(title, '^[A-Za-z]+') AS first_word  
FROM film  
LIMIT 30;  

MySQL REGEXP_SUBSTR function

In this query, the pattern '^[A-Za-z]+' matches the first sequence of letters at the beginning (^) of each film title. It then extracts the first word, ensuring only alphabetic characters are included, without numbers or special characters.

Syntax and patterns in MySQL regular expressions

In previous sections, we explored MySQL functions that deal with regular expressions and the parameters used in queries to make the results more precise. In this section, we are going to examine advanced MySQL regex query parameters and specificities that help us compose more effective queries.

Basic syntax rules and common pattern elements

We illustrated various use cases with practical queries, and you might have noticed the patterns used in their syntax. Here we have compiled a table of patterns in MySQL regular expression syntax with their descriptions and examples.

Pattern Description Example
^ Matches the beginning of the string ^A (matches "A" at the start)
$ Matches the end of the string e$ (matches "e" at the end)
. Matches any single character except a newline a.b (matches "acb", "a-b", etc.)
[abc] Matches any one of the characters inside the brackets [aeiou] (matches any vowel)
[^abc] Matches any character not in the brackets [^0-9] (matches non-digit characters)
[a-z] Matches any lowercase letter [a-z] (matches any lowercase letter)
[A-Z] Matches any uppercase letter [A-Z] (matches any uppercase letter)
[0-9] Matches any digit (same as \d) [0-9] (matches any digit)
* Matches zero or more occurrences of the preceding element a* (matches "", "a", "aa", etc.)
+ Matches one or more occurrences of the preceding element a+ (matches "a", "aa", "aaa", etc.)
? Matches zero or one occurrence of the preceding element colou?r (matches "color" or "colour")
{n} Matches exactly n occurrences of the preceding element a{3} (matches "aaa")
{n,} Matches at least n occurrences of the preceding element a{2,} (matches "aa", "aaa", "aaaa", etc.)
{n,m} Matches between n and m occurrences of the preceding element a{2,4} (matches "aa", "aaa", "aaaa")
| OR operator, matches either pattern on either side cat|dog (matches "cat" or "dog")
\\ Escape a special character to treat it as a literal \\. (matches a period .)
\d Matches any digit (same as [0-9]) \d{2} (matches any two digits)
\w Matches any alphanumeric character (same as [A-Za-z0-9_]) \w+ (matches any word)
\b Matches a word boundary \bword\b (matches "word" as a separate word)

Escaping special characters in MySQL REGEXP syntax

It is essential to pay attention to regex escaping in MySQL to ensure the query functions correctly in a given scenario.

In MySQL regex syntax, certain characters have special meanings (for instance, . (dot) represents any character, $ marks the end of a string). However, there are cases where we need to search for these characters literally. To achieve this, they must be escaped.

MySQL uses a double backslash (\\) before a symbol for escaping regex elements. It instructs MySQL to treat the character literally.

Let's review regex escaping with an example. We want to find all amounts greater than 2 in the payment table. Since payments are formatted as 2.xx (with cents), we can use a regular expression to match numbers starting with 2. but greater than 2:

SELECT
  rental_id,
  amount
FROM payment
WHERE amount REGEXP '^2\\.[1-9][0-9]*$';

MySQL Regex escaping example

In this query:

  • ^2\\. matches numbers starting with "2." (escaped dot for a literal match).
  • [1-9] ensures the first decimal digit is greater than 0 (2.1, 2.2, and so on).
  • [0-9]* matches any additional decimal places (like 2.15, 2.99, and more).
  • $ ensures the match occurs at the end of the value.

This query filters out exact 2.00 and retrieves only amounts greater than 2.00 from the payment table.

Advanced pattern matching techniques

While MySQL REGEXP supports basic regex features, we can improve the search using advanced techniques. Let us explore them deeper.

Greedy and non-greedy quantifiers

In MySQL REGEXP, we can define how many times a pattern can repeat using quantifiers. Greedy quantifiers match as much as possible. MySQL does not support non-greedy quantifiers.

Greedy regex quantifiers in MySQL:

  • * matches 0 or more times, as much as possible.
  • + matches 1 or more times, as much as possible.
  • ? matches 0 or 1 time, preferring 1 if possible.
  • {n,m} matches between n and m times, maximizing the match.

For example, we want to find descriptions that contain text inside parentheses ( ... ), using greedy matching to capture the longest possible sequence:

SELECT
  description
FROM film_text
WHERE description REGEXP '\\(.*\\)';
The double backslash (\\) is used to match opening and closing parentheses, while the * quantifier ensures a greedy match, capturing as much text as possible between parentheses.

MySQL regex greedy quantifiers example

As this pattern captures the longest possible sequence, results may include multiple sequences within parentheses as a single match.

MySQL regex greedy quantifiers with multiple sequences

Grouping and alternation

Grouping allows us to use parentheses and the alternation operator (|) to combine multiple expressions and apply operators to the group. For example, let's search for films containing either Astounding panorama or Fanciful panorama in their descriptions:

SELECT
  title,
  description
FROM film_text
WHERE description REGEXP '(Fanciful|Astounding) Panorama';

The (Fanciful|Astounding) fragment matches either Fanciful or Astounding, and Panorama ensures that the word Panorama appears immediately after either word.

MySQL regex grouping example

REGEXP checks for this pattern in the description column and applies grouping.

Alternation allows us to match any one of multiple patterns. For example, we can search for films where descriptions include either Epic Drama or Epic Tale:

SELECT
  title,
  description
FROM film_text
WHERE description REGEXP 'Epic Drama|Epic Tale';

Here, the alternation operator (|) acts like an OR operator for regex, fetching all descriptions that match either string.

MySQL regex alternation example

Using these advanced regex techniques, we can refine search queries and retrieve precise results.

Using RegExp for generating test data in MySQL

When working with databases, it's often necessary to produce realistic test data that meets specific requirements. dbForge Studio for MySQL offers a powerful tool that creates test data for MySQL databases using predefined generators.

MySQL data generation with regex (regular expressions) means that we can generate realistic test data with patterns that follow real-world formats (names, email addresses, phone numbers, etc.). This functionality is helpful for various application testing scenarios when it's necessary to monitor the application behavior under real-world load and in different conditions.

Let us see how the RegExp data generator works in dbForge Studio for MySQL.

RegExp generator in Data Generator for MySQL

The MySQL Data Generator in dbForge Studio is a powerful and flexible tool that allows us to populate database tables with realistic test data in different formats using predefined regex templates.

Open the Studio and navigate to Database > Tasks > New Data Generation, specify the database, and configure the general options for data generation (or accept the default settings).

Launch the dbForge Data Generator for MySQL

Select the tables and columns to be populated. The Studio automatically assigns a generator based on the column and table characteristics. You can select a different generator for any column and customize its parameters. On the right pane, you can see a predefined regex template to generate the necessary data type.

Regex template in dbForge Data Generator

You can define regex patterns in the MySQL test data generator yourself and adjust the parameters for each specific case. The RegExp generator for MySQL offers the possibility to insert various custom snippets, counters, and files.

Regex customization in dbForge Data Generator

Preview the results of the chosen regex data patterns generator. If they match your requirements, click the green arrow at the top to launch the Data Population Wizard.

Start data generation process in dbForge Data Generator

On the Output page, define how the Data Generator should handle the script. You can open it for viewing in the internal editor, save it to a file, or execute it directly against the database.

ata generator output page in dbForge Data Generator

On the Options page, check additional options for the data task generation.

Data generator options page in dbForge Data Generator

On the Additional Scripts page, you can optionally set the tool to execute custom scripts together with the generated data population script.

Additional scripts page in dbForge Data Generator

Finally, on the Summary page, you can see a summary of all actions set to occur during the data generation process, along with warnings if there are any.

Data generator summary in dbForge Studio

Click Generate to start the data generation process.

Automating test data generation with RegExp

Most tasks in dbForge Studio for MySQL can be automated through the command-line interface. To automate MySQL test data generation, configure the process in the Data Population Wizard as shown above. Click Save Command Line.

Save command line in dbForge Data Generator

In the new window, click Validate, and the tool will check the task settings. If it detects any issues, it will inform you immediately, so you can edit the details and ensure they are correct.

Validate command line options in dbForge Data Generator

Click Save to export the task configuration as a .bat file. You can use it to run the task automatically and schedule the data generation job.

You can use the automated regex data generator for MySQL to produce any amount of test data according to your specific needs and criteria to ensure proper application testing and further smooth performance.

Using MySQL REGEXP with dbForge Studio for MySQL

In this article, we used dbForge Studio for MySQL to execute our queries against the Sakila test database and demonstrate the results. However, it is only one option of many more available in the Studio. This IDE covers all database tasks from code writing to version controlling and allows you to perform the whole range of tasks of database development, management, analysis, and administration.

The most helpful feature for MySQL database developers is a powerful Editor that allows us to handle MySQL REGEXP queries in dbForge Studio easily. This tool with its functionality of context-based code auto-completion, syntax validation, and formatting, allows us to write high-quality code much faster and beautify it to make better-readable and matching the code standards across the organization.

Here you can see how we can write regex queries in dbForge Studio for MySQL, getting suggestions.

Code writing in the Editor tool in dbForge Studio for MySQL

The tool also offers the possibility of the MySQL regex test in dbForge Studio, so you can get the full information about any issues detected in the code. Should any errors take place during the query execution, the Studio throws an error message.

Query analysis in dbForge Studio for MySQL

Debugging and optimizing REGEXP queries in dbForge Studio

Queries that include regular expressions are often quite sophisticated, as they have to perform numerous operations in one go. In practice, it means that regex performance in dbForge MySQL needs to be optimized.

The IDE provides a professional Query Profiler that allows us to analyze each query before running it, define the possible issues and bottlenecks, and optimize queries for the most efficient performance. Click Query Profiler on the Start Page and enter the query to analyze. The tool provides the full statistics for the query.

Query Profiler in dbForge Studio for MySQL

When MySQL REGEXP is included in a stored procedure, you can debug the entire procedure using the Studio's Debugger. Select Debug Stored Procedure on the Start Page and define the procedure to be debugged.

Debugger for stored procedures in dbForge Studio for MySQL

It provides the detailed information about the stored procedure.

Debugging results in dbForge Studio for MySQL

These tools allow you to analyze regex queries (and any other query types), ensure their efficiency, and write high-quality code.

Reusing REGEXP queries in dbForge Studio

First of all, you can save regex queries in dbForge Studio using the standard File > Save option. Later, these queries can be reopened and executed whenever applicable.

Besides, any query can be saved as a snippet in dbForge Studio for MySQL. Then you can reuse it easily from the Snippet Manager. Click View > Code Snippets > Open Snippets Manager.

Snippets Manager in dbForge Studio for MySQL

Click Add and create a new snippet with the necessary code.

Creation of the code snippet in dbForge Studio for MySQL

You will be able to refer to the snippet any time when writing your code in the Studio.

Conclusion

In this article, we explored regular expressions in MySQL, covering both standard operators and built-in functions. We provided a comprehensive summary of MySQL regular expressions, including their definitions, specific use cases, and practical benefits for MySQL specialists. Additionally, we demonstrated real-world applications of regular expressions using dbForge Studio for MySQL.

dbForge Studio simplifies database tasks, making it easy to apply best practices for regex in MySQL through an intuitive visual interface and a powerful set of tools, all within a single solution. You can try dbForge Studio for MySQL and experience how it enhances your workflow. A fully functional 30-day free trial is available, so download it and improve your database operations!

FAQ

What is the use of regular expression in MySQL?

Regular expressions (regex) in MySQL are used when the task is to match some pattern in string data. With their help, we can search, validate, and manipulate text strings more efficiently and ensure more precise results.

Why do we use REGEXP in SQL?

REGEXP is the standard MySQL operator for working with regular expressions. It allows the MySQL specialists to define search patterns, set the parameters for the most targeted search, filter the search results, query large datasets, and perform other tasks.

Is MySQL REGEXP case sensitive?

By default, MySQL REGEXP is case insensitive, meaning it treats uppercase and lowercase letters as the same when matching patterns.

If you need a case sensitive search, it's best to use the REGEXP_LIKE() function, introduced in MySQL 8.0. This function provides greater flexibility for working with regular expressions and supports various match types, including case sensitive searches.

What are the most commonly used metacharacters in MySQL regex?

Regular expressions (REGEXP) use various metacharacters to set the specific search patterns. The most commonly used ones include . (dot) to match any single character, ^ to match the start of a string, $ to match the end of the string, * to match zero or more of the preceding character, () to group patterns together, \ to escape character for special symbols, and more. Refer to the table of metacharacters used in MySQL regular expressions.

How do case insensitive matches work in MySQL regular expressions?

The case insensitive match is the default search mode for queries with regular expressions in MySQL. It does not distinguish between uppercase and lowercase letters in patterns.

To perform a case sensitive match, use the REGEXP_LIKE() function with the 'c' flag. This function has been available since MySQL 8.0.

What are the differences between REGEXP and REGEXP_LIKE()?
Feature REGEXP / RLIKE REGEXP_LIKE()
Introduced In Before MySQL 8.0 MySQL 8.0+
Return Type 0 (No match) or 1 (Match) TRUE (Match) or FALSE (No match)
Syntax expression REGEXP pattern REGEXP_LIKE(expression, pattern [, match_type])
Case Sensitivity Case insensitive by default Supports case sensitive matching with the 'c' flag
Multiple Match Modes No Yes (supports 'c', 'i', 'n', 'm')
Standards Compliance MySQL-specific Complies with SQL:2023 standard

dbForge Studio for MySQL

The most intelligent tool for MySQL development and management