# Regex Explainer

Take a regular expression and produce a plain-English explanation of what it
matches, plus concrete examples that match and don't match.

## When to use

Use this skill when the user:
- Pastes a regex and asks "what does this do?"
- Asks "is this regex right for matching X?"
- Says "explain this regex" or "decode this"
- Pastes a regex from someone else's code they're trying to understand

## Inputs

A regex, in any of:
- Bare pattern: `^[a-z]+$`
- JS literal: `/^[a-z]+$/i`
- Quoted string with escapes: `"^[a-z]+$"`

Optional:
- Flavor (PCRE / JavaScript / POSIX). If unspecified, assume PCRE-compatible, unless the input is a JS literal, which implies JavaScript.
- A test string the user wants to know whether it matches.
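
The three input forms above can be reduced to a bare pattern plus a flag string before any explanation starts. The following is a minimal sketch, not a definitive implementation: `NormalizedRegex` and `normalize_regex_input` are hypothetical names, and treating a doubled backslash inside a quoted string as a single one is an assumption about what "quoted string with escapes" means.

```python
import re
from typing import NamedTuple


class NormalizedRegex(NamedTuple):
    pattern: str  # bare pattern with wrappers removed
    flags: str    # flag letters from a JS literal, "" otherwise


# /body/flags, where flags is a (possibly empty) run of lowercase letters.
_JS_LITERAL = re.compile(r"^/(?P<body>.+)/(?P<flags>[a-z]*)$", re.DOTALL)


def normalize_regex_input(raw: str) -> NormalizedRegex:
    """Reduce a bare pattern, JS literal, or quoted string to (pattern, flags)."""
    text = raw.strip()

    js = _JS_LITERAL.match(text)
    if js:
        return NormalizedRegex(js["body"], js["flags"])

    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        # Assumption: the quoted form is source code, so "\\d" means \d.
        return NormalizedRegex(text[1:-1].replace("\\\\", "\\"), "")

    return NormalizedRegex(text, "")


print(normalize_regex_input("/^[a-z]+$/i"))
print(normalize_regex_input('"^[a-z]+$"'))
```

The JS-literal check is a heuristic: a bare pattern that happens to start and end with `/` (a path matcher, for instance) would be misread as a literal, so ask the user when the form is ambiguous.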

## Output format

```markdown
### Pattern

`<the original regex>`

### What it matches

<2–3 sentence plain-English summary>

### Token-by-token

- `^` — start of string
- `[a-z]+` — one or more lowercase letters
- `$` — end of string

### Matches ✅

- `hello`
- `abc`

### Does not match ❌

- `Hello` — capital H
- `hello world` — contains a space
- `` — empty string (the `+` requires at least one char)

### Edge cases / gotchas

- <e.g. "in JavaScript, `\p{L}` matches Unicode letters only with the `u` flag">
```

## Process

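1. Normalize the input. If it's a JS literal, strip the wrapper slashes and
   capture the flags. If it's a quoted string, strip the quotes and undo
   string-level escaping (`"\\d"` is the pattern `\d`).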
2. Walk the pattern token by token. For each piece:
   - Anchors: `^`, `$`, `\b`, `\B`
   - Character classes: `\d`, `\w`, `\s`, `[abc]`, `[^abc]`,
     `[a-z]`
   - Quantifiers: `*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`,
     greedy vs lazy (`*?`)
   - Groups: `(...)` capture, `(?:...)` non-capture, `(?<name>...)`
     named, `(?=...)` lookahead, `(?!...)` negative lookahead,
     `(?<=...)` lookbehind, `(?<!...)` negative lookbehind
   - Backrefs: `\1`, `\k<name>`
   - Flags: `i` (case-insensitive), `g` (global), `m` (multiline), `s`
     (dotall), `u` (unicode)
3. Build the plain-English summary.
4. Generate 2–4 strings that match and 2–4 that don't, including at least
   one tricky edge case.
5. Call out flavor-specific gotchas (e.g. JS doesn't support possessive
   quantifiers; POSIX BRE and ERE have no lookahead or lookbehind).
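
The token walk in step 2 can be sketched as a small tokenizer that labels each piece with its kind. This is an illustration under stated assumptions: it covers only the constructs listed above, the kind names and the `tokenize` helper are hypothetical, and anything it does not recognize falls through to `literal`.

```python
import re

# One named alternative per token kind, tried in order.
_TOKEN = re.compile(
    r"""
      (?P<group_open>\((?:\?(?:[:=!]|<[=!]|<\w+>))?)
    | (?P<group_close>\))
    | (?P<char_class>\[\^?\]?(?:\\.|[^\]\\])*\])
    | (?P<quantifier>(?:[*+?]|\{\d+(?:,\d*)?\})\??)
    | (?P<anchor>\^|\$|\\[bB])
    | (?P<backref>\\\d+|\\k<\w+>)
    | (?P<shorthand>\\[dDwWsS])
    | (?P<alternation>\|)
    | (?P<any_char>\.)
    | (?P<literal>\\.|.)
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(pattern: str) -> list[tuple[str, str]]:
    """Split a pattern into (kind, text) pairs, left to right."""
    return [(m.lastgroup, m.group()) for m in _TOKEN.finditer(pattern)]


for kind, text in tokenize(r"^(?<year>\d{4})-[a-z]+?$"):
    print(f"{kind:12} {text}")
```

Each `(kind, text)` pair then maps to one bullet in the token-by-token section. A trailing `?` after a quantifier is kept in the same token, so lazy forms such as `+?` are reported as a single piece.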

## Example

**Input:** `^(?<year>\d{4})-(?<month>0[1-9]|1[0-2])-(?<day>0[1-9]|[12]\d|3[01])$`

**Output:**
```markdown
### Pattern

`^(?<year>\d{4})-(?<month>0[1-9]|1[0-2])-(?<day>0[1-9]|[12]\d|3[01])$`

### What it matches

A date string in the ISO 8601 form YYYY-MM-DD: a four-digit year, a
month from 01 to 12, and a day from 01 to 31. The named capture groups
expose `year`, `month`, and `day` for downstream use.

### Token-by-token

- `^` — start of string
- `(?<year>\d{4})` — exactly four digits, captured as `year`
- `-` — literal hyphen
- `(?<month>0[1-9]|1[0-2])` — `01`–`09` OR `10`–`12`
- `-` — literal hyphen
- `(?<day>0[1-9]|[12]\d|3[01])` — `01`–`09`, `10`–`29`,
  `30`–`31`
- `$` — end of string

### Matches ✅

- `2025-01-15`
- `1999-12-31`
- `2025-02-30` — right shape, even though it isn't a real calendar date

### Does not match ❌

- `2025-13-01` — month 13 isn't allowed
- `2025-01-32` — day 32 isn't allowed
- `25-01-15` — only 2 digits for the year

### Edge cases / gotchas

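- This validates *shape*, not *calendar reality*. Feb 30 and Apr 31
  pass. Confirm real dates with a calendar library. A lenient parser is
  not enough: JavaScript's `new Date(2025, 1, 30)` rolls over to March 2
  instead of failing.
- The year is any four digits: `0000`, `0001`, and `9999` all match.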
```
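
The gap between shape and calendar reality in this example can be checked by running the pattern. A minimal sketch in Python, with two caveats: Python's `re` module spells named groups `(?P<name>...)`, so the pattern is respelled accordingly, and `shape_matches` and `is_real_date` are hypothetical helper names.

```python
import re
from datetime import date

# Same pattern as the example, using Python's (?P<name>...) spelling.
ISO_DATE = re.compile(
    r"^(?P<year>\d{4})-(?P<month>0[1-9]|1[0-2])-(?P<day>0[1-9]|[12]\d|3[01])$"
)


def shape_matches(text: str) -> bool:
    """True if the text has the YYYY-MM-DD shape the regex accepts."""
    return ISO_DATE.match(text) is not None


def is_real_date(text: str) -> bool:
    """True only if the shape matches and the date exists on the calendar."""
    m = ISO_DATE.match(text)
    if m is None:
        return False
    try:
        # date() raises ValueError for impossible dates such as Feb 30.
        date(int(m["year"]), int(m["month"]), int(m["day"]))
    except ValueError:
        return False
    return True


for candidate in ["2025-01-15", "2025-13-01", "2025-02-30", "25-01-15"]:
    print(candidate, shape_matches(candidate), is_real_date(candidate))
```

Running the candidates shows `2025-02-30` passing the regex but failing the calendar check, which is the distinction the gotchas section describes. Note that in Python, `$` also matches just before a trailing newline, itself a flavor-specific detail worth calling out when the user's flavor is Python.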

## Hard rules

- Always test your example strings mentally before listing them.
- Always note when the regex is unsafe for untrusted input
  (catastrophic backtracking risk).
- Never claim a regex is "correct for emails" — emails are too irregular.
  Suggest a simple-but-pragmatic pattern instead.
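
One way to support the backtracking rule is a quick screen for nested unbounded quantifiers such as `(a+)+`. The sketch below is a crude heuristic, not a proof of safety: `looks_backtracking_prone` is a hypothetical helper, it only inspects innermost groups, it can flag harmless patterns (a quantifier character inside a character class, for example), and it misses other risky shapes such as overlapping alternations.

```python
import re

# An innermost group whose body contains an unbounded quantifier
# (+, *, or {n,}) and which is itself followed by one, e.g. (a+)+ or (\d+)*.
_NESTED_QUANTIFIER = re.compile(
    r"\((?:[^()\\]|\\.)*(?:[+*]|\{\d+,\})(?:[^()\\]|\\.)*\)(?:[+*]|\{\d+,\})"
)


def looks_backtracking_prone(pattern: str) -> bool:
    """Heuristic: True if the pattern contains a nested unbounded quantifier."""
    return _NESTED_QUANTIFIER.search(pattern) is not None


for candidate in [r"(a+)+$", r"^(\w+\s?)+$", r"^[a-z]+$", r"(abc)+"]:
    print(candidate, looks_backtracking_prone(candidate))
```

A positive result is a prompt to mention the risk under edge cases, not a verdict. A negative result does not mean the pattern is safe for untrusted input.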