Self-descriptive Numbers

This post is part of a series on Mohammad Anwar’s excellent Perl Weekly Challenge, where Perl and Raku hackers submit solutions to two different challenges every week. (It’s a lot of fun, if you’re into that sort of thing.)

Challenge #2 this week (43) is to generate Self-descriptive Numbers in arbitrary bases. A self-descriptive number, as described by Wikipedia, is an integer m that in a given base b is b digits long in which each digit d at position n (the most significant digit being at position 0 and the least significant at position b – 1) counts how many instances of digit n are in m.

For example, 6210001000 is a base-10 self-descriptive number, because there are:

  • 6 zeroes,
  • 2 ones,
  • 1 two,
  • 1 six,
  • and nothing else

For a good description of the number theory involved, check out the Wikipedia page, or this discussion between some of my absolute favourite math(s) YouTubers, James Grime, Matt Parker, and Katie Steckles:

If you want to know more about the math, including proofs for optimizations I’ve used, I encourage you to use the resources above.

Perl Solution

To handle arbitrary bases (I’ve gone up to base-35, but you’re really only limited by your imagination and your favourite Unicode font, I guess), I set up my own symbol table, with a reverse value mapping:

my @base = (0..9, 'a'..'z');
my %val  = map { $base[$_] => $_ } 0..$#base;

Now, for the function which will return all self-descriptive numbers of a given base:

sub self_descriptive_base {
    my $b = shift;

    return "$base[$b-4]21" . '0' x ($b-7) . '1000' if $b >= 7;

    grep { is_self_descriptive($_) }
     map { 10 * $_ } 10**($b-2) .. 10**($b-1) - 1;

You’ll see that if the base is 7 or greater, we can stitch together the answer (the only answer) from the known formula \((b-4)b^{b-1}+2b^{b-2}+b^{b-3}+b^3\).

For bases less than 7, I basically brute force the results. You may have noticed something slightly strange, however: I’m not iterating over base-n values, as you might expect. I’m iterating over n-digit numbers in base 10. I could have easily iterated over all n-digit numbers in base-n. Doing so would cut our iterations by more than half (for base 6, our worst-case), but with a much higher constant number of operations. Performance really isn’t the main driver, here.

Now, what’s with the map { 10 * $_ } ? Every self-descriptive number must be divisible by its base. Since the number is base-digits long, all self-descriptive numbers must end in zero. So I’m cutting my search space by 90% by only looking for multiples of 10, since I’m iterating in base-10.

If working in base-10 for base-1..6 numbers makes you uncomfortable, ordinarily I’d agree, but in this case we will never need to perform any operations where that will make a difference. Trust the math! Do the math!

Is a number descriptive?

We are now in need of an is_descriptive_number() sub, which will return a true value if the given number is self-descriptive. Here is that sub:

sub is_self_descriptive {
    my @s = split '', shift;

    return if @s != sum @s; # Not a Niven number

    my %count;
    $count{ $s[$_] }++ for 0..$#s;

    all { $count{$_} == $s[$_] } 0..$#s;

First, we split the number into digits.

Next, there is an easy optimization we can do: self-descriptive numbers are also Niven (or Harshad) numbers. That means a self-descriptive number’s sum of digits must be equal to its base, which is what we check on line 4.

If we have a Niven number, we then get the count of each digit. For example, 21200 would result in %count = ( 0 => 2, 1 => 1, 2 => 2 ).

Finally, using the definition of a self-descriptive number, we check that the $_th digit is equal to $count{$_} for all digits.

Putting it all together

All that’s left to do is to loop through the user-supplied base and print out the results. I opted to allow the user to specify multiple bases. Simply:

for my $base (@ARGV) {
    printf "base-%2d: %s\n", $base, join ', ', self_descriptive_base($base);

Here are the results for bases 1 through 10, plus 15 and 35 for good measure. This runs in 200ms on my system.

[pwc/rjt_043] challenge-043/⋯/perl5 %> ./ 1 2 3 4 5 6 7 8 9 10 15 35
base- 1: 
base- 2: 
base- 3: 
base- 4: 1210, 2020
base- 5: 21200
base- 6: 
base- 7: 3211000
base- 8: 42101000
base- 9: 521001000
base-10: 6210001000
base-15: b21000000001000
base-35: v2100000000000000000000000000001000

There is one further optimization that can be made, and that is to use the knowledge that bases 2, 3, and 6 do not have any self-descriptive numbers. However, that was a bridge too far for me, personally, since after that there was nothing really stopping me from just returning a list of the known numbers without any computation, which would be extremely fast, but also pointless and boring.

Raku Solution

I was a bit pressed for time this week, so my Raku solution isn’t much more than a simple translation of the Perl one. Here is the main routine:

sub self-descriptive( Int $b ) {
    return @base[$b-4] ~ '21' ~ '0' x ($b-7) ~ '1000' if $b >= 7;

    ((10**($b-2) .. 10**($b-1)-1) »*» 10).grep: { is-self-descriptive($_) };

It is slightly more concise with the use of the »*» hyperoperator, but that’s about it. is-self-descriptive() is similar:

sub is-self-descriptive( Int $n ) {
    my @s = $n.comb;

    return False if $n.chars ≠ @s.sum; # Not a Niven number

    my %count;
    %count{ @s[$_] }++ for 0..^@s.elems;

    (0..^$n.chars).map({ (%count{$_} // 0) == @s[$_] }).all

Given infinite time, I’d have made this a little more Raku-ish, but I’m still happy with it as-is.

Leave a Reply

Your email address will not be published. Required fields are marked *