A random string that corresponds to a regex

How would you go about creating a random alpha-numeric string that matches a certain regular expression?

This is specifically for creating initial passwords that fulfill regular password requirements.

Welp, just musing, but the general question of generating random inputs that match a regex sounds doable to me for a sufficiently relaxed definition of random and a sufficiently tight definition of regex. I'm thinking of the classical formal definition, which allows only ()|* and alphabet characters.

Regular expressions can be mapped to formal machines called finite automata. Such a machine is a directed graph with a node called the initial state, a node called the final state (in general there can be several), and a letter from the alphabet on each edge. A word is accepted by the regex if you can start at the initial state, follow one edge for each character of the word in order, each edge labeled with that character, and end at the final state.
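
To make the acceptance rule concrete, here is a minimal sketch in Python. The automaton is written out by hand for the regex a(b|c)*d; the state names s0, s1, s2 and the edge-list format are just assumptions for illustration, not the output of any particular construction.

```python
# Hypothetical hand-built automaton for a(b|c)*d.
# Each edge is (source state, letter, target state).
EDGES = [
    ("s0", "a", "s1"),
    ("s1", "b", "s1"),
    ("s1", "c", "s1"),
    ("s1", "d", "s2"),
]
INITIAL = "s0"
FINAL = "s2"


def accepts(word):
    """Return True if some path spelling `word` leads from INITIAL to FINAL."""
    current = {INITIAL}  # every state we could be in after the letters read so far
    for letter in word:
        current = {dst for (src, ch, dst) in EDGES if src in current and ch == letter}
    return FINAL in current
```

Tracking a set of states rather than a single one means the same function still works if two edges leaving a node carry the same letter.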

One could build the graph, then start at the final state and traverse random edges backwards, keeping track of the path. In a standard construction, every node in the graph is reachable from the initial state, so from wherever you are there is always some backward path to the initial state, and you do not need to worry about making irrecoverable mistakes and needing to backtrack. If you reach the initial state, stop, and read off the path going forward. That's your match for the regex.
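
A rough sketch of that backward walk, again on a hand-written automaton for a(b|c)*d. The function name random_match, the edge-list format and the max_steps cutoff are my own assumptions; the cutoff is only there because nothing bounds how long the walk takes.

```python
import random

# Hypothetical hand-built automaton for a(b|c)*d; edges are (source, letter, target).
EDGES = [
    ("s0", "a", "s1"),
    ("s1", "b", "s1"),
    ("s1", "c", "s1"),
    ("s1", "d", "s2"),
]


def random_match(edges, initial, final, rng=random, max_steps=10_000):
    """Walk random edges backwards from `final`; return the word read forwards."""
    # Index the edges by target so we can step backwards.
    incoming = {}
    for src, letter, dst in edges:
        incoming.setdefault(dst, []).append((src, letter))

    state = final
    letters = []  # collected in reverse order
    steps = 0
    while state != initial:
        if steps >= max_steps:
            raise RuntimeError("gave up before reaching the initial state")
        # Assumes every state on the walk has an incoming edge, which holds
        # when every state is reachable from the initial state.
        src, letter = rng.choice(incoming[state])
        letters.append(letter)
        state = src
        steps += 1
    return "".join(reversed(letters))


word = random_match(EDGES, "s0", "s2", random.Random(1))
```

Because the walk stops the first time it hits the initial state, strings that pass through the initial state more than once can never come out, which is one concrete way the result is not uniformly random.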

There's no particular guarantee about when or if you'll reach the initial state, though. One would have to figure out in what sense the generated strings are 'random', and in what sense you are hoping for a random element from the language in the first place.

Maybe that's a starting point for thinking about the problem, though!

Now that I've written that out, it seems to me that it might be simpler to repeatedly resolve choices to simplify the regex pattern until you're left with a simple string. Find the first non-alphabet character in the pattern. If it's a *, replicate the preceding item some number of times and remove the *. If it's a |, choose which of the OR'd items to preserve and remove the rest. For a left paren, do the same, but looking at the character following the matching right paren. This is probably easier if you parse the regex into a tree representation first that makes the paren grouping structure easier to work with.
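
Here is a minimal sketch of the tree version, assuming the pattern uses only letters, parentheses, | and *. The names parse and generate, the tuple-based tree, and the max_repeat cap on how often a starred item is repeated are all choices made for this example.

```python
import random
import re


def parse(pattern):
    """Parse a classical regex (letters, parens, | and *) into a nested tuple tree."""
    pos = 0

    def alternation():
        nonlocal pos
        options = [concatenation()]
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            options.append(concatenation())
        return ("alt", options)

    def concatenation():
        items = []
        while pos < len(pattern) and pattern[pos] not in "|)":
            items.append(starred())
        return ("cat", items)

    def starred():
        nonlocal pos
        node = atom()
        while pos < len(pattern) and pattern[pos] == "*":
            pos += 1
            node = ("star", node)
        return node

    def atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            node = alternation()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if ch == "*":
            raise ValueError("nothing to repeat")
        return ("lit", ch)

    tree = alternation()
    if pos != len(pattern):
        raise ValueError("unbalanced closing parenthesis")
    return tree


def generate(node, rng=random, max_repeat=3):
    """Resolve every choice in the tree at random and return the resulting string."""
    kind, value = node
    if kind == "lit":
        return value
    if kind == "cat":
        return "".join(generate(item, rng, max_repeat) for item in value)
    if kind == "alt":
        # Keep one of the OR'd items and drop the rest.
        return generate(rng.choice(value), rng, max_repeat)
    # kind == "star": repeat the item a random number of times, possibly zero.
    count = rng.randint(0, max_repeat)
    return "".join(generate(value, rng, max_repeat) for _ in range(count))


pattern = "(ab|c)*d(e|f)"
sample = generate(parse(pattern), random.Random(0))
# Python's own regex engine agrees on this syntax subset, so it can check the result.
assert re.fullmatch(pattern, sample)
```

Each repetition of a starred item is generated afresh, so (ab|c)* can yield mixed results like "abc" rather than only repeating one choice. For real passwords you would want a cryptographic source of randomness such as random.SystemRandom instead of the default generator.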

To the person who worried that deciding if a regex actually matches anything is equivalent to the halting problem: Nope, regular languages are quite well behaved. You can even tell if any two regexes describe the same set of accepted strings. You basically make the machine above, convert it to a deterministic one, then follow an algorithm to produce a canonical minimal equivalent machine. Do that for both regexes, then check if the resulting minimal machines are identical up to renaming of states, which is straightforward.
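
As a rough illustration that this is decidable, here is a sketch that checks two deterministic machines for equivalence. Note that it does not minimize anything: it swaps in a simpler route to the same answer, walking both machines in lockstep and looking for a reachable pair of states where one accepts and the other does not. The machines are written by hand, and their tuple format is an assumption of this example.

```python
from collections import deque

# Hypothetical deterministic machines over the alphabet {a, b}.
# Format: (start state, set of accepting states, {(state, letter): next state}).
# DFA_ONE and DFA_TWO both accept words ending in b, i.e. (a|b)*b;
# DFA_TWO just has a redundant extra state.
DFA_ONE = (0, {1}, {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1})
DFA_TWO = (
    "p",
    {"r"},
    {
        ("p", "a"): "q", ("p", "b"): "r",
        ("q", "a"): "q", ("q", "b"): "r",
        ("r", "a"): "q", ("r", "b"): "r",
    },
)
# DFA_THREE accepts words containing at least one b, a different language.
DFA_THREE = (0, {1}, {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1})


def equivalent(first, second, alphabet="ab"):
    """Return True if both machines accept exactly the same words."""
    start1, accept1, step1 = first
    start2, accept2, step2 = second
    seen = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        s1, s2 = queue.popleft()
        # Some word leads here in both machines; they must agree on accepting it.
        if (s1 in accept1) != (s2 in accept2):
            return False
        for letter in alphabet:
            pair = (step1[(s1, letter)], step2[(s2, letter)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True
```

The walk visits at most one pair per combination of states, so it always terminates. That is a far cry from the halting problem.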
