Primes With Regex - A PlanarFe Adventure

I ran across a blog post explaining how you can use regular expressions to determine if a number is prime. It seems like an odd use and it has been a while (kinda? Like four whole weeks!) since I’ve used regular expressions so it took a a bit of time and a whole lot of whiteboard space to figure out what was happening so this is the post topic that you’re getting god dammit!

So how about I show you what it looks like before we step through it:

def primes_with_regex?(num)
  num_as_1s = "1" * num
  ! /^1?$|^(11+?)\1+$/.match(num_as_1s)
end

Or, more succinctly:

def primes_with_regex?(num)
   ("1" * num) !~ /^1?$|^(11+?)\1+$/
end

Here’s how it works, broken down piece by piece:

Create a string composed of “1” (or really any character) of length ‘num’.
‘!’ negates the matching condition. If we find a match for either of our two regular expressions then the number passed to our method is not a prime and the method returns false.

–>!<– /^1?$|^(11+?)\1+$/.match(num_as_1s)
Check to see if the number passed to the method is zero or one. These numbers are not prime so our method should return false.

! /–>^1?$<–|^(11+?)\1+$/.match(num_as_1s)
- ‘^’ anchors the expression to the beginning of the string
- ‘1?’ uses the ‘lazy’ quantifier to match zero or one occurrences of ‘1’.
- ‘$’ checks that the zero or one occurrences of ‘1’ are also anchored to the end of the string.
- If ‘1’ occurs either zero or one times and is anchored to both the beginning and the end of the string the regular expression will match and, though our negation, our method will return false thereby correctly stating that the numbers 1 and 0 are not prime.
Checks to see if the string of ‘1’s can be evenly divided into groups of some length n.

! /^1?$|–>^(11+?)\1+$<–/.match(num_as_1s)
- The regular expression will match (and, through negation, our method will return false) if the string can be every divided into groups. If the string can be evenly divided into segments of length n then the repetition of the groups will naturally abut the beginning and the end of the string, thereby satisfying the beginning and end anchor conditions.
  
  ! /^1?$|^–>(11+?)<–\1+$/.match(num_as_1s)
- Creates a matching group of two or more ‘1’s lazily. I think of the ‘?’ as being generous more than lazy. If a subsequent portion of the regular expression could use some of the characters that it has grabbed to match and the loss of those characters will not cause the first portion to no longer match, then it will generously give those characters away to make the entirety of the expression return true. All for one and one for all!!! This matching group is then stored as something like “Matching Group 1” as shown below.
  
  ! /^1?$|^(11+?)–>\1<–+$/.match(num_as_1s)
Here is an example of a capturing group that may make the concept more clear:
1
(\d\d) + (\d\d) = \2 + \1 matches “12 + 65 = 65 + 12”
- Mushes together one or more occurrences of the matching group in an effort to create a collection of them that abut the beginning and end of the string.
  
  ! /^1?$|^(11+?)\1–>+<–$/.match(num_as_1s)

So what does it actually do?

That’s cute, but should I use it? No! Why would you determine primes using a string like some kind of asshole? Computers are good with numbers. Just let ‘em do their thing. That’s not just my opinion. Shakira said that hip’s don’t lie. Neither does benchmark. Let’s see what it says:

Benchmark.bmbm do |x|
  x.report("Prime_w_Regex") { for i in 0..999 do
                                prime_with_regex?(i)
                              end}

  x.report("Prime_by_Math")  { for i in 0..999 do
                                is_prime?(i)
                               end}


  x.report("Prime_by_Prime_Mod")  { for i in 0..999 do
                                      Prime.prime?(i)
                                    end}

end

Rehearsal ------------------------------------------------------
Prime_w_Regex        0.170000   0.000000   0.170000 (  0.172319)
Prime_by_Math        0.000000   0.000000   0.000000 (  0.003190)
Prime_by_Prime_Mod   0.000000   0.000000   0.000000 (  0.002690)
--------------------------------------------- total: 0.170000sec

                         user     system      total        real
Prime_w_Regex        0.170000   0.000000   0.170000 (  0.172192)
Prime_by_Math        0.000000   0.000000   0.000000 (  0.002153)
Prime_by_Prime_Mod   0.010000   0.000000   0.010000 (  0.002359)

So that only took, like 72 times as long. Sooooooo, novelty much? Try adding an extra zero to those ranges and running the test again. Just try it, I’ll wait…

Sources:

The Blog:

http://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/

Regex Cheatsheet (More Quantifiers):

http://www.rexegg.com/regex-quickstart.html

Pickaxe- Regular Expressions- p.100

Thomas, Dave. Programming Ruby 1.9 & 2.0: The Pragmatic Programmers' Guide. 2013