Balanced Parentheses – Ryan J Thompson

This post is part of a series on Mohammad Anwar’s excellent Weekly Challenge, where hackers submit solutions in Perl, Raku, or any other language, to two different challenges every week. (It’s a lot of fun, if you’re into that sort of thing.)

The second challenge this week is another old chestnut:

Write a script to generate a string with random number of ( and ) brackets. Then make the script validate the string if it has balanced brackets.

The question has two parts:

Contents

Generate strings with random brackets

There are many ways to go about this task. However, I’m opting to make one decision that will give the strings a fighting chance to be balanced. If the string contains more of one bracket than the other, there is no way the string can be balanced. Take a moment to convince yourself of this if needed. Thus, we are looking for even length strings of length 2n, containing n left brackets and n right brackets. We’ll call these proper candidates from here on, to save a little typing.

Perl Code

The code to generate these is much shorter than this description:

sub gen_str { join '', shuffle map { ($_) x $_[0] } qw<( )> }

From right to left, the qw<( )> gives us our parentheses. We then map those to a repetition to $_[0] (the first argument) times, so if we stopped there, gen_str(3) would return ('(', '(', '(', ')', ')', ')'). But of course we keep going: we shuffle the results and join them together into a string, such as '())()('. My approach is a fairly idiomatic way to handle problems such as this.

Raku Code

The Raku code is also pretty simple:

sub gen-str( Int $n ) { ('()' x $n).comb.pick(*).join }

This looks a bit different than the Perl solution, but it’s actually quite similar. Thanks to the chaining of the functions, here, this solution can be read from left to right.

First we use string repetition to generate a string of '()' repeated $n times. comb then splits it into a character array. pick(*) randomly picks * elements from that array, which is all of them. Finally, we join everything back together to one string again.

Checking for balance

Now that we have a way to generate proper candidate strings, we’re ready to check if they’re balanced.

The classic way to do this is with a stack. Left brackets go on the stack, and right brackets pull off a left bracket. If the stack is ever empty while there is a right bracket, the string is unbalanced. Otherwise, if the stack is empty at the end, the string is balanced.

Perl

If I were programming this in C, maybe I’d do it that way. But this is Perl. Perl is not optimized for that kind of tight looping. Perl does, however, have an extremely well optimized regular expression engine, so that’s what I’ll use.

Under the hood, the regex engine will use its own stack to handle the recursion and backtracking this regex demands:

sub balanced(_) { $_[0] =~ /^(\((?1)*\))*$/ }

I suppose this sort of solution is why Perl programmers are often accused of writing line noise for a living. Let’s break it down with some whitespace with /x:

sub balanced(_) { 
    $_[0] =~ /^         # Start of string
        (               # Match group 1
            \(          # Opening bracket
                (?1)*   # Recurse to start of group 1
            \)          # Closing bracket
        )*              # Group 1, zero or more times
    $/x                 # End of string
}

Being able to write (and understand) regexes like these is an incredibly useful skill to have, as it can help you write more efficient solutions to some problems that might have previously required dozens of lines of code that runs slowly.

My complete Perl solution is on GitHub.

Raku

Another cute way to check for balance is to work on a copy of the input string, and remove adjacent matching brace pairs until you can’t remove any more, and then the input string is balanced iff the resulting string is empty. For example:

((())())
 (()())
  (())
   ()
BALANCED!

I chose this method because I’ve been leaning on grammars a lot lately, and wanted to try something different, and this way shows off some interesting Raku features. Here’s the code:

sub balanced( Str $str is copy --> Bool ) {
    Nil while $str ~~ s:g/'()'//;
    $str.chars == 0
}

So what’s going on there? First, note the is copy parameter trait. That makes $str a copy of the input string, so we can freely modify it.

The actual balanced check happens in the next line, which just repeatedly subjects $str to a global substitution regex (s:g///), replacing all instances of '()' with the empty string. This while loop repeats as long as there are matches. All of the action happens in the loop’s condition, so our block is simply Nil, effectively a no-op in this case.

After that, we return the Bool result of the expression $str.chars == 0, which is only true if the string is balanced.

My Raku solution is on GitHub.

So that’s it! While working on this problem, though, I started to wonder what percentage of such strings are balanced. You can safely stop reading now if you were only here for the Perl and Raku discussion. It’s OK, really!

A probable detour

An interesting question arising from this is: given a proper candidate string, what is the probability that it’s balanced? With all the recursion and iteration, you might expect a messy, unsatisfying result. The result, however, is quite beautiful.

As any good high school student knows, probability is the total target outcomes divided by the total outcomes. So, first, we need to know how many total ways we can arrange n left parentheses and n right parentheses. Another way to think of this is: how many combinations of ways can we place n parentheses in 2n spots, which is also known as a binomial coefficient, whose formula is as follows:

${n \choose k} = \frac{n!}{k!(n-k)!}$

In our case, the top number is 2n, and the bottom number is n, so we have:

1. ${2n \choose n} = \frac{(2n)!}{n!(2n-n)!} = \frac{(2n)!}{n!}$

(n.b.: That numerator is (2n)!, not 2(n!).)

Next, we need to know how many of those ways lead to balanced strings. That ends up being the Catalan sequence, which comes up an awful lot for recursive definitions and other counting problems:

2. $\frac{(2n)!}{(n+1)n!}$

The probability we are after is therefore (2) divided by (1). All of those factorials just melt away, and we are left with the following beautiful result:

$\frac{(2n)!}{(n+1)n!} \div \frac{(2n)!}{n!} = \frac{1}{(n+1)} $

Therefore, the probability of getting a balanced string of length 2n is 1 / (n+1). Nice!

As confirmation, I wrote a quick Perl solution to generate every possible proper candidate string for lengths 0..10, and then count how many of those are balanced. The results match:

n =  0:     1 / 1      = 1
n =  1:     1 / 2      = 1/2
n =  2:     2 / 6      = 1/3
n =  3:     5 / 20     = 1/4
n =  4:    14 / 70     = 1/5
n =  5:    42 / 252    = 1/6
n =  6:   132 / 924    = 1/7
n =  7:   429 / 3432   = 1/8
n =  8:  1430 / 12870  = 1/9
n =  9:  4862 / 48620  = 1/10
n = 10: 16796 / 184756 = 1/11

The Perl code I used for this is simple enough, thanks to Algorithm::Combinatorics:

use Algorithm::Combinatorics 'combinations';
use Number::Fraction;

# Check all strings of length 0..20
for my $n (0..10) {
    my $iter = combinations( [0 .. 2*$n-1], $n );
    my ($bal, $tot);
    while (my $i = $iter->next) {
        my @a   = ('(') x (2 * $n);
        @a[@$i] = (')') x $n;
        $tot++;
        $bal++ if balanced join '', @a;
    }
    printf "n = %2d: %5d / %-6d = %s\n",
        $n, $bal, $tot, Number::Fraction->new($bal, $tot);
}

I always love it when something so seemingly complex such as whether or not parentheses are balanced leads to such an elegant result.