SKILL.md
# Regex Explainer
Take a regular expression and produce a plain-English explanation of what it
matches, plus concrete examples that match and don't match.
## When to use
Use this skill when the user:
- Pastes a regex and asks "what does this do?"
- Asks "is this regex right for matching X?"
- Says "explain this regex" or "decode this"
- Pastes a regex from someone else's code they're trying to understand
## Inputs
A regex, in any of:
- Bare pattern: `^[a-z]+$`
- JS literal: `/^[a-z]+$/i`
- Quoted string with escapes: `"^[a-z]+$"`
Optional:
- Flavor (PCRE / JavaScript / POSIX). If unspecified, assume PCRE-compatible.
- A test string the user wants to know whether it matches.
## Output format
```markdown
### Pattern
`<the original regex>`
### What it matches
<2–3 sentence plain-English summary>
### Token-by-token
- `^` — start of string
- `[a-z]+` — one or more lowercase letters
- `$` — end of string
### Matches ✅
- `hello`
- `abc`
### Does not match ❌
- `Hello` — capital H
- `hello world` — contains a space
- `` — empty string (the `+` requires at least one char)
### Edge cases / gotchas
- <e.g. "this matches Unicode letters only with the `u` flag">
```
## Process
1. Normalize the input. Strip wrapper slashes if it's a JS literal,
capture flags.
2. Walk the pattern token by token. For each piece:
- Anchors: `^`, `$`, `\b`, `\B`
- Character classes: `\d`, `\w`, `\s`, `[abc]`, `[^abc]`,
`[a-z]`
- Quantifiers: `*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`,
greedy vs lazy (`*?`)
- Groups: `(...)` capture, `(?:...)` non-capture, `(?<name>...)`
named, `(?=...)` lookahead, `(?!...)` negative lookahead
- Backrefs: `\1`, `\k<name>`
- Flags: `i` (case), `g` (global), `m` (multiline), `s`
(dotall), `u` (unicode)
3. Build the plain-English summary.
4. Generate 2–4 strings that match and 2–4 that don't, including at least
one tricky edge case.
5. Call out flavor-specific gotchas (e.g. JS doesn't support possessive
quantifiers; POSIX has no lookbehind).
## Example
**Input:** `^(?<year>\d{4})-(?<month>0[1-9]|1[0-2])-(?<day>0[1-9]|[12]\d|3[01])$`
**Output:**
```markdown
### Pattern
`^(?<year>\d{4})-(?<month>0[1-9]|1[0-2])-(?<day>0[1-9]|[12]\d|3[01])$`
### What it matches
A strict ISO 8601 date in the form YYYY-MM-DD where the month is
01–12 and the day is 01–31. The named capture groups expose
`year`, `month`, and `day` for downstream use.
### Token-by-token
- `^` — start of string
- `(?<year>\d{4})` — exactly four digits, captured as `year`
- `-` — literal hyphen
- `(?<month>0[1-9]|1[0-2])` — `01`–`09` OR `10`–`12`
- `-` — literal hyphen
- `(?<day>0[1-9]|[12]\d|3[01])` — `01`–`09`, `10`–`29`,
`30`–`31`
- `$` — end of string
### Matches ✅
- `2025-01-15`
- `1999-12-31`
### Does not match ❌
- `2025-13-01` — month 13 isn't allowed
- `2025-02-30` — passes lexical check but isn't a real calendar date
- `25-01-15` — only 2 digits for the year
### Edge cases / gotchas
- This validates *shape*, not *calendar reality*. Feb 30 and Apr 31
pass. Validate dates with `Date` or a calendar library.
- Year is a flat 4-digit window — `0001` and `9999` both match.
```
## Hard rules
- Always test your example strings mentally before listing them.
- Always note when the regex is unsafe for untrusted input
(catastrophic backtracking risk).
- Never claim a regex is "correct for emails" — emails are too irregular.
Suggest a simple-but-pragmatic pattern instead.