Regular Expressions
✅ What is a Regular Expression?
A regular expression (short: regex) is a sequence of characters that defines a search pattern, mainly used for string matching and manipulation (e.g., validation, extraction, replacement).
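The three uses named above can be sketched with Python's standard `re` module. The sample text and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped on 2024-05-01"  # hypothetical sample text

# Validation: does the whole string consist of digits only?
is_number = re.fullmatch(r"\d+", "12345") is not None

# Extraction: pull every run of digits out of the text.
numbers = re.findall(r"\d+", text)

# Replacement: mask every digit with '#'.
masked = re.sub(r"\d", "#", text)

print(is_number)  # True
print(numbers)    # ['66', '2024', '05', '01']
print(masked)     # Order ## shipped on ####-##-##
```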
🔍 Basic Components of Regex
- Literals
  Match exact characters.
  Example: cat matches "cat" in "concatenate".
- Metacharacters
  Special characters with specific meanings:
  - . → Any single character except newline
  - ^ → Start of string (or of each line in multiline mode)
  - $ → End of string (or of each line in multiline mode)
  - * → 0 or more repetitions
  - + → 1 or more repetitions
  - ? → 0 or 1 occurrence
  - | → OR operator
  - \ → Escape special characters
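A short sketch of literals and metacharacters in action, using Python's `re` module. The sample strings are illustrative only.

```python
import re

# Literal: 'cat' is found inside a longer word.
literal_span = re.search(r"cat", "concatenate").span()

# '.' matches any single character except a newline.
dot_hits = re.findall(r"c.t", "cat cot c\nt cut")

# '^' and '$' anchor the pattern to the start and end of the string.
anchored = bool(re.search(r"^cat$", "cat"))
not_anchored = bool(re.search(r"^cat$", "concatenate"))

# '|' is alternation; '\' escapes a metacharacter so it is taken literally.
either = re.findall(r"cat|dog", "cat, dog, bird")
literal_dots = re.findall(r"\.", "a.b.c")

print(literal_span)            # (3, 6)
print(dot_hits)                # ['cat', 'cot', 'cut']
print(anchored, not_anchored)  # True False
print(either)                  # ['cat', 'dog']
print(literal_dots)            # ['.', '.']
```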
🔢 Character Classes
- [abc] → Matches a, b, or c
- [^abc] → Matches any character except a, b, or c
- [a-z] → Matches lowercase letters
- [0-9] → Matches digits
Predefined classes:
- \d → Digit (0-9)
- \D → Non-digit
- \w → Word character (letters, digits, underscore)
- \W → Non-word character
- \s → Whitespace
- \S → Non-whitespace
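The classes above can be tried out with a few `re.findall` and `re.split` calls. This is a minimal sketch with made-up sample strings. One caveat: on Python `str` patterns, `\d`, `\w`, and `\s` also match non-ASCII Unicode characters unless the `re.ASCII` flag is passed.

```python
import re

# Custom classes
in_set = re.findall(r"[abc]", "back")               # characters a, b, or c
not_in_set = re.findall(r"[^abc]", "back")          # everything else
lower_runs = re.findall(r"[a-z]+", "Hello World 42")
digit_runs = re.findall(r"[0-9]+", "Hello World 42")

# Predefined classes
digits = re.findall(r"\d", "a1 b2")
words = re.findall(r"\w+", "snake_case, 2 words!")
non_word = re.findall(r"\W", "a-b c")
fields = re.split(r"\s+", "a  b\tc")                # split on any whitespace run

print(in_set, not_in_set)      # ['b', 'a', 'c'] ['k']
print(lower_runs, digit_runs)  # ['ello', 'orld'] ['42']
print(digits)                  # ['1', '2']
print(words)                   # ['snake_case', '2', 'words']
print(non_word)                # ['-', ' ']
print(fields)                  # ['a', 'b', 'c']
```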
🔁 Quantifiers
- a* → 0 or more occurrences of a
- a+ → 1 or more occurrences of a
- a? → 0 or 1 occurrence of a
- a{3} → Exactly 3 occurrences of a
- a{2,4} → Between 2 and 4 occurrences of a
- a{2,} → 2 or more occurrences of a
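One way to see what each quantifier accepts is to test it against strings of zero to five `a` characters. The helper `matching_lengths` below is a hypothetical name introduced only for this sketch.

```python
import re

# Strings of 0 to 5 'a' characters: "", "a", "aa", ...
candidates = ["a" * n for n in range(6)]

def matching_lengths(pattern: str) -> list[int]:
    """Return the lengths of the candidate strings that the pattern matches in full."""
    return [len(s) for s in candidates if re.fullmatch(pattern, s) is not None]

for pattern in (r"a*", r"a+", r"a?", r"a{3}", r"a{2,4}", r"a{2,}"):
    print(f"{pattern:8} matches lengths {matching_lengths(pattern)}")
```

`re.fullmatch` is used so that the whole string must satisfy the quantifier. With `re.search`, a pattern such as `a{3}` would also succeed inside a longer run of `a` characters.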
🧩 Groups and Capturing
Groups bundle part of a pattern into a single unit. Use a capturing group when you need the matched text later, for example in a backreference or from calling code.
- (abc) → Capturing group
- (?:abc) → Non-capturing group
- (?P<name>abc) → Named capturing group (Python syntax; many other engines write (?<name>abc))
- \1, \2 → Backreferences to captured groups
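The four constructs above can be sketched in Python as follows. The sample strings and variable names are illustrative.

```python
import re

# Capturing groups are numbered left to right, starting at 1.
m = re.search(r"(\d{4})-(\d{2})", "Released 2024-05")
year, month = m.group(1), m.group(2)

# A non-capturing group allows repetition without creating a capture.
nc = re.search(r"(?:ab)+", "xxababab")
nc_text, nc_groups = nc.group(0), nc.groups()

# Named groups can be read by name or as a dictionary.
named = re.search(r"(?P<user>\w+)@(?P<host>\w+)", "alice@example")
user = named.group("user")
parts = named.groupdict()

# Backreference: \1 must repeat whatever group 1 captured (a doubled word).
doubled = re.findall(r"\b(\w+) \1\b", "this is is a a test")

print(year, month)         # 2024 05
print(nc_text, nc_groups)  # ababab ()
print(user, parts)         # alice {'user': 'alice', 'host': 'example'}
print(doubled)             # ['is', 'a']
```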
✅ Common Regex Patterns
- Email Validation (simplified)
  ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
- Phone Number (US)
  ^\(\d{3}\)\s?\d{3}-\d{4}$
- URL
  ^https?:\/\/(www\.)?[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(\/.*)?$
- IP Address (IPv4)
  ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
- Date (YYYY-MM-DD, format only; does not check that the month or day is valid)
  ^\d{4}-\d{2}-\d{2}$
- Password (8+ letters or digits, at least 1 uppercase, at least 1 digit)
  ^(?=.*[A-Z])(?=.*\d)[A-Za-z\d]{8,}$
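As a rough check, the patterns above can be compiled and run against a few sample inputs. The `PATTERNS` dictionary, the `is_valid` helper, and the sample values are hypothetical names and data for this sketch. The last two checks show limits of the patterns as written: the date pattern only checks the shape, and the password pattern rejects symbols.

```python
import re

# Patterns copied from the list above; the dictionary keys are illustrative.
PATTERNS = {
    "email": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
    "phone": r"^\(\d{3}\)\s?\d{3}-\d{4}$",
    "url": r"^https?:\/\/(www\.)?[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/.*)?$",
    "ipv4": r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    "date": r"^\d{4}-\d{2}-\d{2}$",
    "password": r"^(?=.*[A-Z])(?=.*\d)[A-Za-z\d]{8,}$",
}

def is_valid(kind: str, value: str) -> bool:
    """Return True if value matches the pattern registered under kind."""
    return re.search(PATTERNS[kind], value) is not None

print(is_valid("email", "user@example.com"))           # True
print(is_valid("email", "user@example"))               # False: no top-level domain
print(is_valid("phone", "(555) 123-4567"))             # True
print(is_valid("url", "https://www.example.com/path")) # True
print(is_valid("ipv4", "192.168.1.1"))                 # True
print(is_valid("ipv4", "256.1.1.1"))                   # False: octet above 255
print(is_valid("date", "2024-13-45"))                  # True: shape only, not a real date
print(is_valid("password", "Passw0rd!"))               # False: '!' is outside [A-Za-z\d]
```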
⚡ Advanced Features
- Lookahead / Lookbehind
  - (?=...) → Positive lookahead
  - (?!...) → Negative lookahead
  - (?<=...) → Positive lookbehind
  - (?<!...) → Negative lookbehind
Example:
(?=.*\d)(?=.*[A-Z]) ensures at least one digit and one uppercase letter.
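A minimal sketch of all four lookarounds in Python, plus the combined lookahead example above. The sample strings are made up. Lookarounds check the surrounding text without consuming it, so only the digits end up in each match. Note that Python's `re` requires lookbehind patterns to have a fixed width.

```python
import re

# Positive lookahead: digits only when followed by 'px'.
px_values = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Negative lookahead: digits not followed by another digit or by 'px'.
non_px_values = re.findall(r"\d+(?!\d|px)", "10px 20em 30px")

# Positive lookbehind: digits only when preceded by '$'.
prices = re.findall(r"(?<=\$)\d+", "cost $40, qty 3")

# Negative lookbehind: whole numbers not preceded by '$'.
non_prices = re.findall(r"(?<!\$)\b\d+", "cost $40, qty 3")

# Two lookaheads at the same position: both conditions must hold.
digit_and_upper = re.compile(r"(?=.*\d)(?=.*[A-Z])")
both = bool(digit_and_upper.match("abc1D"))
only_digit = bool(digit_and_upper.match("abc1"))

print(px_values)         # ['10', '30']
print(non_px_values)     # ['20']
print(prices)            # ['40']
print(non_prices)        # ['3']
print(both, only_digit)  # True False
```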