Zero-dependency TypeScript library for regex intersection, complement and other utilities that go beyond string matching. These are surprisingly hard to come by for any programming language.
import { intersection, size, enumerate } from '@gruhn/regex-utils'
// `intersection` combines multiple regex into one:
const passwordRegex = intersection(
  /^[a-zA-Z0-9]{12,32}$/, // 12-32 alphanumeric characters
  /[0-9]/, // at least one number
  /[A-Z]/, // at least one upper case letter
  /[a-z]/, // at least one lower case letter
)
// `size` calculates the number of strings matching the regex:
console.log(size(passwordRegex))
// 2301586451429392354821768871006991487961066695735482449920n
// `enumerate` returns a stream of strings matching the regex:
for (const sample of enumerate(passwordRegex).take(10)) {
  console.log(sample)
}
// aaaaaaaaaaA0
// aaaaaaaaaa0A
// aaaaaaaaaAA0
// aaaaaaaaaA00
// aaaaaaaaaaA1
// aaaaaaaaa00A
// baaaaaaaaaA0
// AAAAAAAAAA0a
// aaaaaaaaaAA1
// aaaaaaaaaa0B
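The example above can be cross-checked with nothing but built-in JavaScript. The sketch below does not use the library: `matchesAll`, `power` and `countPasswords` are hypothetical helpers introduced only for illustration. The first tests membership in the intersection by matching every input pattern separately; the second counts the same set of strings by inclusion-exclusion over the three "at least one" requirements, assuming the 62-character alphanumeric alphabet from the first pattern.

```typescript
// The four input patterns from the password example.
const requirements: RegExp[] = [
  /^[a-zA-Z0-9]{12,32}$/, // 12-32 alphanumeric characters
  /[0-9]/, // at least one number
  /[A-Z]/, // at least one upper case letter
  /[a-z]/, // at least one lower case letter
]

// Hypothetical helper (not part of the library): a string belongs to the
// intersection exactly when every input pattern matches it.
function matchesAll(patterns: RegExp[], candidate: string): boolean {
  return patterns.every(pattern => pattern.test(candidate))
}

// Hypothetical helper: base^exponent as a bigint, by repeated multiplication.
function power(base: number, exponent: number): bigint {
  let result = BigInt(1)
  for (let i = 0; i < exponent; i++) {
    result = result * BigInt(base)
  }
  return result
}

// Hypothetical helper: number of alphanumeric strings with a length in
// [minLength, maxLength] that contain at least one digit, one upper case
// letter and one lower case letter, counted by inclusion-exclusion.
function countPasswords(minLength: number, maxLength: number): bigint {
  let total = BigInt(0)
  for (let n = minLength; n <= maxLength; n++) {
    total = total + (
      power(62, n) // all alphanumeric strings of length n
      - power(52, n) // minus: no digit
      - power(36, n) * BigInt(2) // minus: no upper case, no lower case
      + power(26, n) * BigInt(2) // plus: lower case only, upper case only
      + power(10, n) // plus: digits only
    )
  }
  return total
}

console.log(matchesAll(requirements, 'aaaaaaaaaaA0')) // true
console.log(matchesAll(requirements, 'aaaaaaaaaaaa')) // false
console.log(countPasswords(12, 32))
```

If the arithmetic and the library agree, the last line should print the same number as `size(passwordRegex)`. The difference is that `matchesAll` can only test one candidate at a time, whereas a single combined regex can be passed on to functions such as `size` and `enumerate`.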
npm install @gruhn/regex-utils
There is a high-level API and a low-level API:
The high-level API operates directly on native JavaScript RegExp instances, which is more convenient but also requires parsing the regular expression.
The low-level API operates on an internal representation, which skips the parsing step and is more efficient when combining multiple functions.
For example, say you want to know how many strings match the intersection
of two regular expressions:
import { size, intersection } from '@gruhn/regex-utils'
size(intersection(regex1, regex2))
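This:
(1) parses the two input RegExp
(2) computes their intersection
(3) converts the result back to a RegExp
(4) parses that RegExp again
(5) computes the size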
Step (1) should be fast for small handwritten regex. But the intersection of two regex can be quite large, which can make steps (3) and (4) quite costly. With the low-level API, steps (3) and (4) can be eliminated:
import * as RE from '@gruhn/regex-utils/low-level-api'
RE.size(
  RE.toStdRegex(
    RE.and(
      RE.parse(regex1),
      RE.parse(regex2)
    )
  )
)
Quantifiers: *, +, ?, {3,5}, ...
Alternation: |
Character classes: ., \w, [a-z], ...
Start/end anchors: ^ / $, but only at the start/end (technically they are allowed anywhere in the expression)
Escaping: \$, \., ...
Groups: (...)
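The remark that anchors are technically allowed anywhere in the expression refers to native JavaScript behavior, which the short sketch below illustrates with the built-in RegExp only. It makes no claim about what the library's parser accepts beyond the start/end case stated above; the variable names are illustrative.

```typescript
// Native JavaScript accepts anchors at any position in a pattern.
const anchorInMiddle = new RegExp('a^b') // compiles without error
const anchorsInBranches = /(^a|b$)/ // anchors inside an alternation

// `^` in the middle is still an anchor, not a literal character,
// so this pattern can never match anything.
console.log(anchorInMiddle.test('a^b')) // false
console.log(anchorInMiddle.test('ab')) // false

// Each branch carries its own anchor.
console.log(anchorsInBranches.test('abc')) // true (starts with "a")
console.log(anchorsInBranches.test('xb')) // true (ends with "b")
console.log(anchorsInBranches.test('xbx')) // false
```

Patterns like these are valid input for the native engine, so a regex that works with `test` or `match` may still use anchor placements outside the start/end case listed above.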
intersection
and complement
(a*|b)*
.new RegExp(...)
constructor crashes.
Heavily informed by these papers: