PWC 052 › Lucky Winner

This post is part of a series on Mohammad Anwar’s excellent Weekly Challenge, where hackers submit solutions in Perl, Raku, or any other language, to two different challenges every week. (It’s a lot of fun, if you’re into that sort of thing.)

Task #2 this week can be as simple or as difficult as you make it. Mohammad’s description:

Suppose there are following coins arranged on a table in a line in random order.

£1, 50p, 1p, 10p, 5p, 20p, £2, 2p

Suppose you are playing against the computer. Player can only pick one coin at a time from either ends. Find out the lucky winner, who has the larger amounts in total?

Analysis

The problem is quite specific, in that we are only given one possible input list, and it is set up as a human v. computer contest, with no other challenge as to how smart (or stupid) the computer is. In fact, this particular problem can be reduced to: “whomever gets the £2 piece wins,” as all the other coins add to 188p.

If you go first, picking £1 will force the opponent to pick 50p or 2p. Keep going until your opponent picks the 2p piece (or the 20p piece), then you get the £2 piece and win.

If the opponent goes first, the same thing applies. Keep stalling until they pick the 20p or 2p piece, then pick the £2 piece and win. Unless of course your opponent is using the same strategy, in which case, they’ve stalled you, and you lose.

So, this isn’t terribly interesting, but I’ll implement it in Raku anyway. Stick around for the Perl version if you yearn for a more general version, though!

Raku (Simple)

So, for a baseline, here is my Raku solution, that keeps it simple and does exactly what is asked:

#| A more literal reading of the problem, with a simple greedy CPU
sub MAIN( Bool :$me-first ) {
    my @coins = @*ARGV ?? @*ARGV !! <100 50 1 10 5 20 200 2>;

    my $my-score = 0;
    my $cpu-score= 0;
    my $my-turn  = $me-first;

    while (@coins) {
        say @coins;
        if ($my-turn) {
            my $lr;
            repeat { $lr = prompt "Move [lr]? " } until 'lr'.index($lr) ≥ 0;
            $my-score += $lr eq 'l' ?? @coins.shift !! @coins.pop;
        } else {
            $cpu-score += @coins[0] > @coins[*-1] ?? @coins.shift !! @coins.pop;
        }
        $my-turn = !$my-turn;
    }

    say $my-score > $cpu-score ?? "You win" !! "CPU wins";
    say "Your score: $my-score. CPU score: $cpu-score";
}

Even then, I still went a bit beyond the spec, by allowing a user-specified list of @coins.

The computer uses a greedy algorithm, which performs reasonably well (we’ll get to that in a bit), but isn’t optimal. That “algorithm” is just one line in this case:

            $cpu-score += @coins[0] > @coins[*-1] ?? @coins.shift !! @coins.pop;

To do better than greedy, we would need only replace that line with code (or a function call) to make a smarter decision. Without looking ahead, greedy is obviously the best we can do.

Going (a lot) further (Perl)

While I went for a barebones solution with Raku, I went completely overboard with Perl. I wanted to go beyond the task description and write a general solution for any number of “coins,” of any value.

I have my reasons for this: since I’ll be reviewing everyone else’s solutions this week, I wanted a platform I could build on to make some comparative statements about any solutions that tried to make a “smart” computer. Plus it was fun and I got a bit carried away. OK, it was mostly that it was fun and I got a bit carried away. What can I say. I make my own amusement.

I made a complete application with arguments and Pod. Here is the help, as generated with pod2github:


NAME

ch-2.pl – Lucky Winner Simulator 9000

SYNOPSIS

ch-2.pl [options] [algorithm1 algorithm2 ...]
ch-2.pl --human=<cpu_algorithm>
ch-2.pl --help

OPTIONS

--count=<iter>     Play <iter> games                   Default: 1000
--coins=<N>        Every game uses <N> coins           Default: 8
--maxcoin=<N>      Maximum coin value                  Default: 200
--help             Full help page
--human=<cpu_alg>  Human vs CPU, CPU uses <cpu_alg>
--seed=<N>         Use specific random number seed (integer)
--verbose          Enable extra output
--noverbose        Disable extra output

ALGORITHMS

humanHuman input. Only available with --human option.
bozoReal stupid algorithm; chooses left or right randomly.
worstSomehow even stupider. Always picks lowest option.
greedyGreedy algorithm. Always picks highest option, but doesn’t look ahead.
ahead1
ahead3
ahead5
Looks ahead 1, 3, or 5 turns, and picks the option that maximizes (my_scoretheir_score)

The script has two basic modes: --human mode, and CPU-only mode. With --human=greedy, for example, the human player will compete against a greedy CPU. You will be prompted for a left/right choice. What I think is a lot more interesting, though, is the CPU-only mode, which pits the CPU against itself in a round-robin battle between every CPU algorithm, and tallies the results to see which one emerges victorious overall. Here is a sample run with the default settings:

      Player0 v Player1       |  Wins0 - Wins1 
--------------------------------------------------
       ahead1 v ahead3    (W) |    920 - 1080  
       ahead1 v ahead5    (W) |    924 - 1076  
(W)    ahead1 v bozo          |   1796 - 204   
(W)    ahead1 v greedy        |   1134 - 866   
(W)    ahead1 v worst         |   1999 - 1     
       ahead3 v ahead5    (W) |    987 - 1013  
(W)    ahead3 v bozo          |   1771 - 229   
(W)    ahead3 v greedy        |   1208 - 792   
(W)    ahead3 v worst         |   1991 - 9     
(W)    ahead5 v bozo          |   1744 - 256   
(W)    ahead5 v greedy        |   1246 - 754   
(W)    ahead5 v worst         |   1987 - 13    
         bozo v greedy    (W) |    263 - 1737  
(W)      bozo v worst         |   1764 - 236   
(W)    greedy v worst         |   2000 - 0     

Leaderboard:
     ahead5:    7066 wins
     ahead3:    7037 wins
     ahead1:    6773 wins
     greedy:    6149 wins
       bozo:    2716 wins
      worst:     259 wins

The first block shows individual match-ups. For each, --count (default: 1000) games are played where player 0 goes first, then another --count games are played where player 1 goes first, so it’s fair. The wins for each are shown on the right (so, in the ahead1 v ahead3 matchup, ahead1 won 917 games, and ahead3 won 1083. I put a (W) beside the one with the most wins, but this doesn’t mean anything to the overall Leaderboard stats.

The Leaderboad: section is a simple tally of wins for each algorithm, regardless of opponent. Notice how worst managed to pick up 259 total wins, despite always picking the smallest of the two coins available!

We can see pretty clearly that while greedy has a respectable showing, looking farther ahead has an advantage. Sometimes it’s better to pick a lower valued coin to force your opponent into giving you an even bigger coin, after all. Looking ahead definitely seems to have diminishing returns after a few turns.

I could have done much, much more with this, but it’s already over 250 lines. The full code is available here. I’ll show you some interesting snippets:

Algorithms

The “easy” algorithms are all one-liners in a dispatch table. Given a list of coins, each sub returns 0 for left, and 1 for right, and is called for every turn:

sub get_algorithms {
    (
        bozo    => sub { rand > 0.5 },
        worst   => sub { $_[0] > $_[-1] },
        greedy  => sub { $_[0] < $_[-1] },
        ahead1  => \&ahead1,
        ahead3  => ahead(3),
        ahead5  => ahead(5),
    );
}

Not shown here is the human sub; that one simply prompts for a left/right choice.

The ahead sub is the complex one. It can handle any depth, and so it returns a closure around the specified number of moves ($n), for the dispatch table:

# Look ahead n moves
sub ahead {
    my $n = shift;

    sub {
        my $ahead = sub {
            my ($depth, $us, $lr, @coins) = @_;
            my $val = $us * ($lr == LEFT ? shift @coins : pop @coins);
            return $val if !$depth or @coins == 0;

            my $f = $us == 1 ? \&min : \&max;
            $val + $f->(
                map { __SUB__->($depth-1, -$us, $_, @coins) } LEFT, RIGHT
            );
        };

        $ahead->($n, 1, LEFT,  @_) > 
        $ahead->($n, 1, RIGHT, @_) ? LEFT : RIGHT;
    };
}

This is a straightforward combinatorial game theory approach. The $ahead sub seeks to maximize the current player’s score while minimizing the other player’s maximum score. It grows exponentially, so it’s best to keep $n small (or not call it often).

The rest of the code is boring tally, display, and Pod (documentation), but you can view it here.

Leave a Reply

Your email address will not be published. Required fields are marked *