<p>a1k0n.net, updated 2015-06-11T17:14:23-05:00, Andy Sloane (andy@a1k0n.net), http://a1k0n.net/</p>
<h2>Donut math: how donut.c works</h2>
<p>Andy Sloane, 2011-07-20, http://a1k0n.net/2011/07/20/donut-math</p>
<link href="/css/prettify.css" type="text/css" rel="stylesheet" />
<script type="text/javascript" src="/js/prettify.js">
</script>
<script src="/js/donut.js">
</script>
<p>There has been a sudden resurgence of interest in my <a href="/2006/09/15/obfuscated-c-donut.html">“donut” code from 2006</a>, and I’ve
had a couple of requests to explain this one. It’s been five years now and it’s
not exactly fresh in my memory, so I will reconstruct it from scratch, in great
detail, and hopefully get approximately the same result.</p>
<p>This is the code and the output, animated in Javascript:
<button onclick="anim1();">toggle animation</button></p>
<table border="0" cellspacing="0" cellpadding="0"><tr>
<td style="background-color:#000">
<pre style="background-color:#000; color:#ccc;">
             k;double sin()
         ,cos();main(){float A=
       0,B=0,i,j,z[1760];char b[
     1760];printf("\x1b[2J");for(;;
  ){memset(b,32,1760);memset(z,0,7040)
  ;for(j=0;6.28>j;j+=0.07)for(i=0;6.28
 >i;i+=0.02){float c=sin(i),d=cos(j),e=
 sin(A),f=sin(j),g=cos(A),h=d+2,D=1/(c*
 h*e+f*g+5),l=cos      (i),m=cos(B),n=s\
in(B),t=c*h*g-f*        e;int x=40+30*D*
(l*h*m-t*n),y=            12+15*D*(l*h*n
+t*m),o=x+80*y,          N=8*((f*e-c*d*g
 )*m-c*d*e-f*g-l        *d*n);if(22>y&&
 y>0&&x>0&&80>x&&D>z[o]){z[o]=D;;;b[o]=
 ".,-~:;=!*#$@"[N>0?N:0];}}/*#****!!-*/
  printf("\x1b[H");for(k=0;1761>k;k++)
   putchar(k%80?b[k]:10);A+=0.04;B+=
     0.02;}}/*****####*******!!=;:~
       ~::==!!!**********!!!==::-
         .,~~;;;========;;;:~-.
             ..,--------,*/
</pre>
</td>
<td style="background-color:#000">
<pre id="d" style="background-color:#000; color:#ccc;">
</pre>
</td></tr></table>
<p>At its core, it’s a framebuffer and a Z-buffer into which I render pixels.
Since it’s just rendering relatively low-resolution ASCII art, I massively
cheat. All it does is plot pixels along the surface of the torus at
fixed-angle increments, and does it densely enough that the final result looks
solid. The “pixels” it plots are ASCII characters corresponding to the
illumination value of the surface at each point: <code>.,-~:;=!*#$@</code> from dimmest to
brightest. No raytracing required.</p>
<p>So how do we do that? Well, let’s start with the basic math behind 3D
perspective rendering. The following diagram is a side view of a person
sitting in front of a screen, viewing a 3D object behind it.</p>
<center><img src="/img/perspective.png" /></center>
<p>To render a 3D object onto a 2D screen, we project each point (<em>x</em>,<em>y</em>,<em>z</em>) in
3D-space onto a plane located <em>z’</em> units away from the viewer, so that the
corresponding 2D position is (<em>x’</em>,<em>y’</em>). Since we’re looking from the side,
we can only see the <em>y</em> and <em>z</em> axes, but the math works the same for the <em>x</em>
axis (just pretend this is a top view instead). This projection is really easy
to obtain: notice that the origin, the <em>y</em>-axis, and point (<em>x</em>,<em>y</em>,<em>z</em>) form a
right triangle, and a similar right triangle is formed with (<em>x’</em>,<em>y’</em>,<em>z’</em>).
Thus the relative proportions are maintained:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
\frac{y'}{z'} &= \frac{y}{z} \\
y' &= \frac{y z'}{z}.
\end{aligned} %]]></script>
<p>So to project a 3D coordinate to 2D, we scale each of <em>x</em> and <em>y</em> by the screen
distance <em>z’</em> and divide by the depth <em>z</em>. Since <em>z’</em> is a fixed constant, and not functionally a
coordinate, let’s rename it to <em>K<sub>1</sub></em>, so our projection equation
becomes <script type="math/tex">(x',y') = (\frac{K_1 x}{z}, \frac{K_1 y}{z})</script>. We can choose
<em>K<sub>1</sub></em> arbitrarily based on the field of view we want to show in our
2D window. For example, if we have a 100x100 window of pixels, then the view
is centered at (50,50); and if we want to see an object which extends 10 units to
either side of the view axis in our 3D space, set back 5 units from the viewer, then <em>K<sub>1</sub></em> should
be chosen so that the projection of the point <em>x</em>=10, <em>z</em>=5 is still on the
screen with <em>x’</em> < 50: 10<em>K<sub>1</sub></em>/5 < 50, or <em>K<sub>1</sub></em> < 25.</p>
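<p>As a minimal sketch of the projection, assuming the viewer sits at the origin looking down +<em>z</em>: the names <tt>Point2</tt>, <tt>project</tt> and <tt>max_k1</tt> are made up for illustration and do not appear in donut.c.</p>

```c
#include <assert.h>

/* Screen-space offset from the center of the view. */
typedef struct { float x, y; } Point2;

/* Perspective projection: (x', y') = (K1*x/z, K1*y/z).
   The point must be in front of the viewer (z > 0). */
Point2 project(float x, float y, float z, float K1) {
    assert(z > 0.0f);
    Point2 p = { K1 * x / z, K1 * y / z };
    return p;
}

/* Upper bound on K1 that keeps a point at (x_max, z) within
   half_width pixels of the center: K1 * x_max / z < half_width. */
float max_k1(float x_max, float z, float half_width) {
    assert(x_max > 0.0f);
    return half_width * z / x_max;
}
```

<p>With the numbers from the example, <tt>max_k1(10, 5, 50)</tt> gives the same bound of 25.</p>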
<p>When we’re plotting a bunch of points, we might end up plotting different
points at the same (<em>x’</em>,<em>y’</em>) location but at different depths, so we maintain
a <a href="http://en.wikipedia.org/wiki/Z-buffering">z-buffer</a> which stores
the <em>z</em> coordinate of everything we draw. If we need to plot a location, we
first check to see whether we’re plotting in front of what’s there already. It
also helps to compute <em>z</em><sup>-1</sup> <script type="math/tex">= \frac{1}{z}</script> and use that when depth
buffering because:</p>
<ul>
<li><em>z</em><sup>-1</sup> = 0 corresponds to infinite depth, so we can pre-initialize
our z-buffer to 0 and have the background be infinitely far away</li>
<li>we can re-use <em>z</em><sup>-1</sup> when computing <em>x’</em> and <em>y’</em>:
Dividing once and multiplying by <em>z</em><sup>-1</sup> twice is cheaper than
dividing by <em>z</em> twice.</li>
</ul>
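<p>The depth test on 1/<em>z</em> can be sketched roughly as follows. The 80x22 buffer size and the names <tt>clear_buffers</tt> and <tt>plot</tt> are assumptions for this example, not taken from the original program.</p>

```c
#include <assert.h>
#include <string.h>

#define W 80
#define H 22

static float zbuf[W * H];   /* stores 1/z; 0 means "infinitely far away" */
static char  frame[W * H];  /* the characters we will print */

void clear_buffers(void) {
    memset(frame, ' ', sizeof frame);
    memset(zbuf, 0, sizeof zbuf);  /* all-zero bytes are 0.0f on IEEE-754 */
}

/* Plot ch at (xp, yp) only if it is nearer than what is already there.
   ooz is "one over z". Returns 1 if the character was drawn. */
int plot(int xp, int yp, float ooz, char ch) {
    assert(ooz == ooz);                     /* reject NaN depths */
    if (xp < 0 || xp >= W || yp < 0 || yp >= H) return 0;
    int o = xp + W * yp;
    if (ooz <= zbuf[o]) return 0;           /* something closer is already drawn */
    zbuf[o] = ooz;
    frame[o] = ch;
    return 1;
}
```

<p>Because the buffer starts at 0, anything with positive 1/<em>z</em> beats the empty background without a special case.</p>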
<p>Now, how do we draw a donut, AKA <a href="http://en.wikipedia.org/wiki/Torus">torus</a>? Well, a torus is a <a href="http://en.wikipedia.org/wiki/Solid_of_revolution">solid of
revolution</a>, so one way to do it is to draw a 2D circle around some point in
3D space, and then rotate it around the central axis of the torus. Here is a
cross-section through the center of a torus:</p>
<center><img src="/img/torusxsec.png" /></center>
<p>So we have a circle of radius <em>R</em><sub>1</sub> centered at point
(<em>R</em><sub>2</sub>,0,0), drawn on the <em>xy</em>-plane. We can draw this by sweeping
an angle — let’s call it <em>θ</em> — from 0 to 2π:</p>
<p>\[
(x,y,z) = (R_2,0,0) + (R_1 \cos \theta, R_1 \sin \theta, 0)
\]</p>
<p>Now we take that circle and rotate it around the <em>y</em>-axis by another angle
— let’s call it φ. To rotate an arbitrary 3D point around one of the
cardinal axes, the standard technique is to multiply by a <a href="http://en.wikipedia.org/wiki/Rotation_matrix">rotation matrix</a>. So if
we take the previous points and rotate about the <em>y</em>-axis we get:</p>
<script type="math/tex; mode=display">% <![CDATA[
\left( \begin{matrix}
R_2 + R_1 \cos \theta, &
R_1 \sin \theta, &
0 \end{matrix} \right)
\cdot
\left( \begin{matrix}
\cos \phi & 0 & \sin \phi \\
0 & 1 & 0 \\
-\sin \phi & 0 & \cos \phi \end{matrix} \right)
=
\left( \begin{matrix}
(R_2 + R_1 \cos \theta)\cos \phi, &
R_1 \sin \theta, &
(R_2 + R_1 \cos \theta)\sin \phi \end{matrix} \right) %]]></script>
<p>But wait: we also want the whole donut to spin around on at least two more axes
for the animation. They were called <em>A</em> and <em>B</em> in the original code: it was a
rotation about the <em>x</em>-axis by <em>A</em> and a rotation about the <em>z</em>-axis by <em>B</em>.
This is a bit hairier, so I’m not even going to write the result yet, but it’s a
bunch of matrix multiplies.</p>
<script type="math/tex; mode=display">% <![CDATA[
\left( \begin{matrix}
R_2 + R_1 \cos \theta, &
R_1 \sin \theta, &
0 \end{matrix} \right)
\cdot
\left( \begin{matrix}
\cos \phi & 0 & \sin \phi \\
0 & 1 & 0 \\
-\sin \phi & 0 & \cos \phi \end{matrix} \right)
\cdot
\left( \begin{matrix}
1 & 0 & 0 \\
0 & \cos A & \sin A \\
0 & -\sin A & \cos A \end{matrix} \right)
\cdot
\left( \begin{matrix}
\cos B & \sin B & 0 \\
-\sin B & \cos B & 0 \\
0 & 0 & 1 \end{matrix} \right) %]]></script>
<p>Churning through the above gets us an (<em>x</em>,<em>y</em>,<em>z</em>) point on the surface of our
torus, rotated around two axes, centered at the origin. To actually get screen
coordinates, we need to:</p>
<ul>
<li>Move the torus somewhere in front of the viewer (the viewer is at the
origin) — so we just add some constant to <em>z</em> to move it backward.</li>
<li>Project from 3D onto our 2D screen.</li>
</ul>
<p>So we have another constant to pick, call it <em>K</em><sub>2</sub>, for the distance
of the donut from the viewer, and our projection now looks like:</p>
<p>\[
\left( x', y' \right)
=
\left( \frac{K_1 x}{K_2 + z} , \frac{K_1 y}{K_2 + z} \right)
\]</p>
<p><em>K</em><sub>1</sub> and <em>K</em><sub>2</sub> can be tweaked together to change the field
of view and flatten or exaggerate the depth of the object.</p>
<p>Now, we could implement a 3x3 matrix multiplication routine in our code and
implement the above in a straightforward way. But if our goal is to shrink the
code as much as possible, then every 0 in the matrices above is an opportunity
for simplification. So let’s multiply it out. Churning through a bunch of
algebra (thanks Mathematica!), the full result is:</p>
<script type="math/tex; mode=display">\left( \begin{matrix} x \\ y \\ z \end{matrix} \right) =
\left( \begin{matrix}
(R_2 + R_1 \cos \theta) (\cos B \cos \phi + \sin A \sin B \sin \phi) -
R_1 \cos A \sin B \sin \theta \\
(R_2 + R_1 \cos \theta) (\cos \phi \sin B - \cos B \sin A \sin \phi) +
R_1 \cos A \cos B \sin \theta \\
\cos A (R_2 + R_1 \cos \theta) \sin \phi + R_1 \sin A \sin \theta
\end{matrix} \right)</script>
<p>Well, that looks pretty hideous, but we can precompute some common
subexpressions (e.g. all the sines and cosines, and <script type="math/tex">R_2 + R_1 \cos \theta</script>)
and reuse them in the code. In fact I came up with a completely different
factoring in the original code but that’s left as an exercise for the reader.
(The original code also swaps the sines and cosines of A, effectively rotating
by 90 degrees, so I guess my initial derivation was a bit different but that’s
OK.)</p>
<p>Now we know where to put the pixel, but we still haven’t even considered which
shade to plot. To calculate illumination, we need to know the <a href="http://en.wikipedia.org/wiki/Surface_normal">surface normal</a> —
the direction perpendicular to the surface at each point. If we have that,
then we can take the <a href="http://en.wikipedia.org/wiki/Dot_product">dot
product</a> of the surface normal with the light direction, which we can choose
arbitrarily. That gives us the cosine of the angle between the light direction
and the surface direction: If the dot product is >0, the surface is facing
the light and if it’s <0, it faces away from the light. The higher the
value, the more light falls on the surface.</p>
<p>The derivation of the surface normal direction turns out to be pretty much the
same as our derivation of the point in space. We start with a point on a
circle, rotate it around the torus’s central axis, and then make two more
rotations. The surface normal of the point on the circle is fairly obvious:
it’s the same as the point on a unit (radius=1) circle centered at the origin.</p>
<p>So our surface normal (<em>N<sub>x</sub></em>, <em>N<sub>y</sub></em>, <em>N<sub>z</sub></em>) is
derived the same as above, except the point we start with is just (cos
<em>θ</em>, sin <em>θ</em>, 0). Then we apply the same rotations:</p>
<script type="math/tex; mode=display">% <![CDATA[
\left( \begin{matrix}
N_x, &
N_y, &
N_z \end{matrix} \right)
=
\left( \begin{matrix}
\cos \theta, &
\sin \theta, &
0 \end{matrix} \right)
\cdot
\left( \begin{matrix}
\cos \phi & 0 & \sin \phi \\
0 & 1 & 0 \\
-\sin \phi & 0 & \cos \phi \end{matrix} \right)
\cdot
\left( \begin{matrix}
1 & 0 & 0 \\
0 & \cos A & \sin A \\
0 & -\sin A & \cos A \end{matrix} \right)
\cdot
\left( \begin{matrix}
\cos B & \sin B & 0 \\
-\sin B & \cos B & 0 \\
0 & 0 & 1 \end{matrix} \right) %]]></script>
<p>So which lighting direction should we choose? How about we light up surfaces
facing behind and above the viewer: <script type="math/tex">(0,1,-1)</script>. Technically
this should be a normalized unit vector, and this vector has a magnitude of
√2. That’s okay; we will compensate later. Therefore we compute the
above (<em>N<sub>x</sub></em>, <em>N<sub>y</sub></em>, <em>N<sub>z</sub></em>), throw away <em>N<sub>x</sub></em>, and get our luminance <em>L</em> = <em>N<sub>y</sub></em> - <em>N<sub>z</sub></em>.</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{aligned}
L &=
\left( \begin{matrix}
N_x, &
N_y, &
N_z \end{matrix} \right)
\cdot
\left( \begin{matrix}
0, &
1, &
-1 \end{matrix} \right)
\\
&=
\cos \phi \cos \theta \sin B - \cos A \cos \theta \sin \phi - \sin A \sin \theta +
\cos B ( \cos A \sin \theta - \cos \theta \sin A \sin \phi)
\end{aligned} %]]></script>
<p>Again, not too pretty, but not terrible once we’ve precomputed all the sines
and cosines.</p>
<p>So now all that’s left to do is to pick some values for <em>R</em><sub>1</sub>,
<em>R</em><sub>2</sub>, <em>K</em><sub>1</sub>, and <em>K</em><sub>2</sub>. In the original donut
code I chose <em>R</em><sub>1</sub>=1 and <em>R</em><sub>2</sub>=2, so it has the same
geometry as my cross-section diagram above. <em>K<sub>1</sub></em> controls the
scale, which depends on our pixel resolution and is in fact different for <em>x</em>
and <em>y</em> in the ASCII animation. <em>K</em><sub>2</sub>, the distance from the viewer
to the donut, was chosen to be 5.</p>
<p>I’ve taken the above equations and written a quick and dirty canvas
implementation here, just plotting the pixels and the lighting values from the
equations above. The result is not exactly the same as the original as some of
my rotations are in opposite directions or off by 90 degrees, but it is
qualitatively doing the same thing.</p>
<p>Here it is: <button onclick="anim2();">toggle animation</button></p>
<canvas id="canvasdonut" width="300" height="240">
</canvas>
<p>It’s slightly mind-bending because you can see right through the torus, but the
math does work! Convert that to an ASCII rendering with <em>z</em>-buffering, and
you’ve got yourself a clever little program.</p>
<p>Now, we have all the pieces, but how do we write the code? Roughly like this
(the screen size and the small <tt>main</tt> loop are filled in here so that the listing compiles):</p>
<pre class="prettyprint">
#include &lt;math.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

#define SCREEN_WIDTH  80   // screen size is an arbitrary choice here
#define SCREEN_HEIGHT 24

const float theta_spacing = 0.07f;
const float phi_spacing   = 0.02f;
const float pi = 3.14159265f;

const float R1 = 1;
const float R2 = 2;
const float K2 = 5;

void render_frame(float A, float B) {
  // Calculate K1 based on screen size: the maximum x-distance occurs roughly at
  // the edge of the torus, which is at x=R1+R2, z=0.  we want that to be
  // displaced 3/8ths of the width of the screen, which is 3/4th of the way from
  // the center to the side of the screen.
  // SCREEN_WIDTH*3/8 = K1*(R1+R2)/(K2+0)
  // SCREEN_WIDTH*K2*3/(8*(R1+R2)) = K1
  const float K1 = SCREEN_WIDTH*K2*3/(8*(R1+R2));

  // precompute sines and cosines of A and B
  float cosA = cos(A), sinA = sin(A);
  float cosB = cos(B), sinB = sin(B);

  char output[SCREEN_HEIGHT][SCREEN_WIDTH];
  float zbuffer[SCREEN_HEIGHT][SCREEN_WIDTH];
  memset(output, ' ', sizeof output);
  memset(zbuffer, 0, sizeof zbuffer);

  // theta goes around the cross-sectional circle of a torus
  for (float theta = 0; theta < 2*pi; theta += theta_spacing) {
    // precompute sines and cosines of theta
    float costheta = cos(theta), sintheta = sin(theta);

    // phi goes around the center of revolution of a torus
    for (float phi = 0; phi < 2*pi; phi += phi_spacing) {
      // precompute sines and cosines of phi
      float cosphi = cos(phi), sinphi = sin(phi);

      // the x,y coordinate of the circle, before revolving (factored out of the above equations)
      float circlex = R2 + R1*costheta;
      float circley = R1*sintheta;

      // final 3D (x,y,z) coordinate after rotations, directly from our math above
      float x = circlex*(cosB*cosphi + sinA*sinB*sinphi) - circley*cosA*sinB;
      float y = circlex*(sinB*cosphi - sinA*cosB*sinphi) + circley*cosA*cosB;
      float z = K2 + cosA*circlex*sinphi + circley*sinA;
      float ooz = 1/z;  // "one over z"

      // x and y projection.  note that y is negated here, because y goes up in
      // 3D space but down on 2D displays.  (on a text terminal you would want a
      // smaller scale for y than for x; here anything off-screen is just skipped.)
      int xp = (int) (SCREEN_WIDTH/2 + K1*ooz*x);
      int yp = (int) (SCREEN_HEIGHT/2 - K1*ooz*y);
      if (xp < 0 || xp >= SCREEN_WIDTH || yp < 0 || yp >= SCREEN_HEIGHT) continue;

      // calculate luminance.  ugly, but correct.
      float L = cosphi*costheta*sinB - cosA*costheta*sinphi - sinA*sintheta +
                cosB*(cosA*sintheta - costheta*sinA*sinphi);
      // L ranges from -sqrt(2) to +sqrt(2).  If it's < 0, the surface is
      // pointing away from the light, so we won't bother trying to plot it.
      if (L > 0) {
        // test against the z-buffer.  larger 1/z means the pixel is closer to
        // the viewer than what's already plotted.
        if (ooz > zbuffer[yp][xp]) {
          zbuffer[yp][xp] = ooz;
          int luminance_index = L*8; // this brings L into the range 0..11 (8*sqrt(2) = 11.3)
          // now we lookup the character corresponding to the luminance and plot it in our output:
          output[yp][xp] = ".,-~:;=!*#$@"[luminance_index];
        }
      }
    }
  }

  // now, dump output[] to the screen.
  // bring cursor to "home" location, in just about any currently-used terminal
  // emulation mode
  printf("\x1b[H");
  for (int j = 0; j < SCREEN_HEIGHT; j++) {
    for (int i = 0; i < SCREEN_WIDTH; i++) {
      putchar(output[j][i]);
    }
    putchar('\n');
  }
}

int main(void) {
  float A = 0, B = 0;
  printf("\x1b[2J");  // clear the screen once
  for (;;) {
    render_frame(A, B);
    A += 0.04f;
    B += 0.02f;
  }
}
</pre>
<p>The Javascript source for both the ASCII and canvas rendering is <a href="/js/donut.js">right here</a>.</p>
<h2>Yahoo! Logo ASCII Animation in six lines of C</h2>
<p>Andy Sloane, 2011-06-26, http://a1k0n.net/2011/06/26/obfuscated-c-yahoo-logo</p>
<p>[<b>Update 6/28/2011:</b> added Javascript version; press button below to see the
output without compiling the code.]</p>
<p>[<b>Update 7/9/2011:</b> if you're using a compiler other than gcc, you
might need to put a <tt>#include &lt;math.h&gt;</tt> at the top for it to work
correctly -- I seem to be depending on the builtin behavior of <tt>sin</tt> and
<tt>cos</tt> w.r.t. their return types when undeclared.]</p>
<p>Last week I put together another obfuscated C program and have been urged by
my coworkers to post it publicly. I've made some refinements since posting it
to our internal list, so here is the final version (to those who had seen it
already: it's one line shorter now, and the angles are less screwy, and the
animation is 2 seconds instead of 3). Go ahead, try it:</p>
<script>
var F,S,V,tmr,doframe=function(){var k=document.getElementById("output"),c,d,e,a,f,g,h,j,b,i=[];S+=V+=(1-S)/10-V/4;for(d=0;d<24;d++){for(c=0;c<73;c++){for(a=e=0;a<3;a++){f=S*(c-27);g=S*(d*3+a-36);e^=(136*f*f+84*g*g<92033)<<a;b=0;p=6;for(m=0;m<8;){h=('O:85!fI,wfO8!yZfO8!f*hXK3&fO;:O;#hP;"i'.charCodeAt(b)-79)/14.6423;j="<[\\]O=IKNAL;KNRbF8EbGEROQ@BSXXtG!#t3!^".charCodeAt(b++)-79;if(f*Math.cos(h)+g*Math.sin(h)<j/1.165){b=p;p="<AFJPTX".charCodeAt(m++)-50}else if(b==p){e^=1<<a;m=8}}}i.push(" ''\".$u$"[e])}i.push("\n")}k.innerHTML=
i.join("");if(!F--){clearInterval(tmr);tmr=undefined}};function animate(){F=40;V=S=0;if(tmr===undefined)tmr=setInterval(doframe,50)};
</script>
<pre>
$ cat >yanim.c
c,p,i,j,n,F=40,k,m;float a,x,y,S=0,V=0;main(){for(;F--;usleep(50000),F?puts(
"\x1b[25A"):0)for(S+=V+=(1-S)/10-V/4,j=0;j<72;j+=3,putchar(10))for(i=0;x=S*(
i-27),i++<73;putchar(c[" ''\".$u$"]))for(c=0,n=3;n--;)for(y=S*(j+n-36),k=0,c
^=(136*x*x+84*y*y<92033)<<n,p=6,m=0;m<8;k++["<[\\]O=IKNAL;KNRbF8EbGEROQ@BSX"
"XtG!#t3!^"]/1.16-68>x*cos(a)+y*sin(a)?k=p,p="<AFJPTX"[m++]-50:k==p?c^=1<<n,
m=8:0)a=(k["O:85!fI,wfO8!yZfO8!f*hXK3&fO;:O;#hP;\"i[by asloane"]-79)/14.64;}
^D
$ gcc -o yanim yanim.c -lm
[warnings which real programmers ignore]
$ ./yanim
[you'll see - <button onclick="window.animate();">show the animation</button>]
</pre>
<pre id="output" style="background:#000; color:#ccc;">
</pre>
<p>It's a 20fps, antialiased ASCII art animation of the Yahoo! logo. If you
want to figure out how it works on your own, you're welcome to. Otherwise,
read on.</p>
<p>I encourage you to play with the constants in the code: S+=V+=(1-S)/10-V/4
is the underdamped control system for the animation -- S is scale (=1/zoom), V
is velocity, and 1/10 and 1/4 are the <a
href="http://en.wikipedia.org/wiki/PID_controller">PD constants</a>. S=0
corresponds to infinite zoom on the first frame. S<0 is funny. F is the
frame counter. The 1.16 controls the scale of the polygon rendering (68 is an
approximation of 79/1.16 so you have to adjust that too), and 136/84/92033
define the ellipse. The 14.64 is not a tunable parameter,
though (it's 46/π, and for a good reason).</p>
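<p>To see what that control system does, here is the update <tt>S+=V+=(1-S)/10-V/4</tt> from yanim.c written out as two separate statements. The <tt>Zoom</tt> struct and function names are invented for this sketch, and it uses doubles where the program uses floats.</p>

```c
#include <assert.h>

typedef struct { double S, V; } Zoom;  /* scale and velocity */

/* One animation frame: V is pulled toward S=1 and damped, then S moves by V. */
Zoom zoom_step(Zoom z) {
    z.V += (1.0 - z.S) / 10.0 - z.V / 4.0;
    z.S += z.V;
    return z;
}

/* Run n frames from S=V=0; returns the largest S seen and, optionally,
   the final state, to show the overshoot and the settling. */
double zoom_peak(int n, Zoom *final) {
    assert(n >= 0);
    Zoom z = { 0.0, 0.0 };
    double peak = 0.0;
    for (int i = 0; i < n; i++) {
        z = zoom_step(z);
        if (z.S > peak) peak = z.S;
    }
    if (final) *final = z;
    return peak;
}
```

<p>Starting from zero, S is 0.1 after one frame and 0.265 after two; it passes 1 on the sixth frame, overshoots, and settles back toward 1 by the end of the 40 frames, which is the "underdamped" bounce you see in the animation.</p>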
<p>The antialiasing is simple: each character consists of three
vertically-arranged samples and an 8-character lookup table for each
arrangement of three on/off pixels. Each frame consists of 73x24 characters,
or 73x72 pixels. The 73 horizontal choice was somewhat arbitrary; I suppose I
could have gone up to 79.</p>
<p>The logo is rendered as an ellipse and eight convex polygons using a fairly
neat method (I thought) with sub-pixel precision and no frame buffer. It
required some design tradeoffs to fit into two printable-character arrays, but
it's much less code than rendering triangles to a framebuffer, which is the
typical way polygon rasterization is done.</p>
<p>To produce this, first I had to vectorize the "Y!" logo. I did this by
taking some measurements of a reference image and writing coordinates down on
graph paper. Then I wrote a utility program which takes the points and polygon
definitions and turns them into angles and offsets as defined below. [I put <a
href="http://pastebin.com/tNqrGszq">the generator code on pastebin</a> until
I can get some code highlighting stuff set up for my blog].</p>
<p>The ellipse is fairly standard high-school math:
<i>x</i><sup>2</sup>/<i>a</i><sup>2</sup> +
<i>y</i><sup>2</sup>/<i>b</i><sup>2</sup> < 1. Each point is tested and if
it's inside the ellipse, the pixel is plotted. (136<i>x</i><sup>2</sup> +
84<i>y</i><sup>2</sup> < 92033 was a trivial rearrangement of terms with
<i>a</i> and <i>b</i> being the radii of the two axes of the ellipse measured
from my source image, scaled to the pixel grid). </p>
<p>Each polygon is made up of a set of separating half-planes (a half-plane
being all points on one side of an infinitely long line). If a given point is
"inside" all of the half-planes, it's inside the polygon (which only works as
long as the polygon is convex) and the pixel is toggled with the XOR operator
<tt>^</tt> (thus it handles the "inverse" part inside the ellipse as well as
the uninverted exclamation mark without any special cases). Each side of a
polygon is defined by the equation <i>ax</i> + <i>by</i> > <i>c</i>. To
represent both <i>a</i> and <i>b</i> I use an angle <i>θ</i> so that
<i>a</i> = cos(<i>θ</i>) and <i>b</i> = sin(<i>θ</i>) and quantize
the angle in π/46 increments — my angles are thus represented from
-π to +π as ASCII 33 to 125 — '!' to '}' — with 'O' (ASCII
79) as zero. Then I solve for <i>c</i>, also quantized in scaled increments
from -47 to +47, so that the midpoint of the side is considered inside the
polygon.</p>
<p>Here's an extremely crude diagram: (I'm writing this on a plane and none of
my drawing programs are working. Sorry.)</p> <img src="/img/polygon_separation.jpg" /> <p>The
shaded area is <i>ax</i> + <i>by</i> < <i>c</i>, implying it's outside the
polygon, and the dashed line is <i>ax</i> + <i>by</i> = <i>c</i>.</p>
<p>(<i>a</i>,<i>b</i>) form a vector orthogonal to the line segment they
represent pointing towards the inside of the polygon, so we can get them
directly from the points defining the line segment by taking the vector
defining the side —
(<i>x<sub>1</sub></i> - <i>x<sub>0</sub></i>, <i>y<sub>1</sub></i> - <i>y<sub>0</sub></i>)
—
and rotating it 90 degrees,
<!--(you think of it as multiplying <i>x</i>+<i>yi</i> by <i>i</i> or as multiplying the vector with the matrix [<table cellspacing=2 cellpadding=0 border=0 style="display: inline; margin: 0px; font-size: x-small; margin-bottom: 0px">
<tr><td>0</td><td>1</td></tr><tr><td>-1</td><td>0</td></tr></table>]) -this is unclear and looks like crap -->
resulting in (<i>a</i>, <i>b</i>) = (<i>y<sub>1</sub></i> -
<i>y<sub>0</sub></i>, <i>x<sub>0</sub></i> - <i>x<sub>1</sub></i>). Then we
normalize (<i>a</i>, <i>b</i>); the actual magnitude doesn't matter, since it
will be 1 when we decode the angle and we can compensate with our choice of
<i>c</i> later (if <i>ax</i>+<i>by</i>><i>c</i>, then
<i>sax</i>+<i>sby</i>><i>sc</i> for some scale <i>s</i>>0). Then compute
<i>θ</i> = <tt><a
href="http://en.wikipedia.org/wiki/Atan2">atan2</a></tt>(<i>b</i>,<i>a</i>),
quantize to one of our 93 angle codes, and get our new (<i>a</i>,<i>b</i>) =
(<tt>cos</tt>(<i>θ</i>), <tt>sin</tt>(<i>θ</i>)).</p>
<p><i>c</i> is easy to get by directly substituting any of the points making up
the line on the side of the polygon into <i>c</i> = <i>ax</i>+<i>by</i>. I use
the midpoint of the line segment on the side, (<i>x<sub>t</sub></i>,
<i>y<sub>t</sub></i>) = ((<i>x<sub>0</sub></i> + <i>x<sub>1</sub></i>)/2,
(<i>y<sub>0</sub></i> + <i>y<sub>1</sub></i>)/2), because the angle of the side
can be slightly off after we quantize θ, and this evens the errors out
across the length of the side.</p>
<p>You'll notice on the first couple frames (you can pause with ^S, resume with
^Q -- xon/xoff) that the bottom section of the 'Y' has little bites taken out
of it due to the quantization error in the separating half-plane equations.
</p>
<p>It could probably be made somewhat more efficient CPU-wise by careful
reordering of the separating plane arrays so that most of the drawing area is
rejected first. I didn't get to that in my generator code.</p>
<p>The animation is done by the <tt>&lt;ESC&gt;[25A</tt> sequence, which moves the
cursor up 25 lines in just about any terminal emulation mode. I technically
only need to move up 24 lines, but <tt>puts</tt> is shorter than
<tt>printf</tt> and it implicitly adds a newline. If your terminal isn't at
least 26 lines high, though, it does funky things to your scrollback. And
<tt>usleep</tt> is there to limit it to 20fps, which is the only non-ANSI Cism
about it.</p>
<p>And then I shrunk the code down by arranging it into clever <tt>for</tt>
loops and taking unorthodox advantage of commas, conditionals, and globals
being <tt>int</tt>s by default in C (which is all par for the course in
obfuscated C code). And that pretty much reveals all the secrets as to how it
was done.</p>
<p>It would be fairly easy to enhance this with a different movement sequence,
or rotation (or any kind of 3D transform, as it's basically just ray-tracing
the logo). I just animated the scale to prove the point that it was being
rendered dynamically and not just a compressed logo, and kept the animation
short and sweet.</p>
<p>I apologize in advance for the various sign errors I'm sure to have made
when typing this up, but you get the idea.</p>
<h2>Google AI Challenge post-mortem</h2>
<p>Andy Sloane, 2010-03-04, http://a1k0n.net/2010/03/04/google-ai-postmortem</p>
<p>[<b>Update 8/2/2011</b>: you can play against a dumbed-down Javascript version of the bot <a href="/code/tron.html">here</a>.]</p>
<p />I can't believe <a
href="http://csclub.uwaterloo.ca/contest/rankings.php">I won</a>.
<p />I can't believe I won <i>decisively</i> at all.
<p />I've never won any programming contest before (although I did place 3rd
in the <a href="http://julian.togelius.com/mariocompetition2009/">Mario AI
contest</a>, there were only about 10 entrants). Whenever I badly lose at
an <a href="http://icfpcontest.org/">ICFP contest</a> I'm always anxious to see
the post-mortems of the people who did better than I did, and I imagine a lot
of people are curious as to how I won, exactly. So here's my post-mortem.
<h3>Code</h3>
<p />Before we get into it, note that all of my code is <a
href="http://github.com/a1k0n/tronbot/">on github here</a>. The commit logs
might be a mildly entertaining read.
<h3>Phase 1: denial</h3>
<p />The first thing I did was attempt to ignore this contest as long as
possible, because month-long programming contests get me into a lot of trouble
at work and at home. The contest was the Tron light cycle game. I've played
variants of this game ever since I got one of <a
href="http://www.handheldmuseum.com/Tomy/Tron.htm">these</a> when I was in
1st grade or so. The most fun variant was my uncle's copy of <a
href="http://www.youtube.com/watch?v=BK_a8xV3O6w">Snafu for the
Intellivision</a> which we played at my Grandma's house all day long. I've
long wondered how to write a bot for it, because the AI on these usually isn't
very smart.
<h3>Phase 2: space-filling</h3>
<p />But I finally gave in on the 9th, downloaded a starter pack, and attempted
to make a simple semi-decent wall-hugging bot. I quickly discovered a simple
useful heuristic: the best rule for efficiently filling space is to always
choose the move that removes the least number of edges from the graph. In
other words, go towards the space with the most walls for neighbors. But!
Avoid <a href="http://en.wikipedia.org/wiki/Cut_vertex">cut
vertices</a> (AKA articulation points), and if you have to enter a cut vertex,
then always choose the largest space left over. At this stage I wasn't
actually calculating articulation points; I just checked the 3x3 neighborhood
of the square and made a lookup table of neighbor configurations that
<i>might</i> be articulation points based on the 3x3 neighborhood. This is
what the <tt><a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L238">potential_articulation</a></tt>
function does in my code, and <tt><a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/artictbl.h#L1">artictbl.h</a></tt>
is the lookup table.
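<p />A stripped-down sketch of that space-filling rule, leaving out the articulation-point check entirely: it assumes the map is a flat string where a space is an open square and anything else (including the bot's own square) is a wall. The function names are made up here and are not the ones in my bot.

```c
#include <assert.h>

static const int DX[4] = { 0, 1, 0, -1 };   /* N, E, S, W */
static const int DY[4] = { -1, 0, 1, 0 };

static int is_open(const char *map, int w, int h, int x, int y) {
    return x >= 0 && x < w && y >= 0 && y < h && map[y * w + x] == ' ';
}

/* Number of graph edges a square has, i.e. its open neighbors. */
static int open_neighbors(const char *map, int w, int h, int x, int y) {
    int n = 0;
    for (int d = 0; d < 4; d++)
        n += is_open(map, w, h, x + DX[d], y + DY[d]);
    return n;
}

/* Pick the move whose destination has the fewest open neighbors, since
   stepping there removes the fewest edges. Returns 0..3, or -1 if boxed in. */
int greedy_fill_move(const char *map, int w, int h, int x, int y) {
    assert(map && w > 0 && h > 0);
    int best = -1, best_edges = 5;
    for (int d = 0; d < 4; d++) {
        int nx = x + DX[d], ny = y + DY[d];
        if (!is_open(map, w, h, nx, ny)) continue;
        int edges = open_neighbors(map, w, h, nx, ny);
        if (edges < best_edges) { best_edges = edges; best = d; }
    }
    return best;
}
```

<p />In a 3x2 room with the bot in the bottom-left corner, this steps north into the top-left corner rather than east into the middle, which is the wall-hugging behavior described above.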
<p />I was, however, computing the <a
href="http://en.wikipedia.org/wiki/Connected_component_(graph_theory)">connected
components</a> of the map. This is a simple two-pass O(<i>NM</i>)
algorithm for <i>N</i> squares in the map and <i>M</i> different
components. For each non-wall square in the map, traversed in raster
order, merge it with the component above it (if there is no wall
above) and do the same to its left (if there is no wall to the left).
If it connects two components, renumber based on the lowest index,
maintaining an equivalence lookup table on the side (equivalence
lookups are O(<i>M</i>) but really just linear scans of a tiny
vector). Then scan again and fixup the equivalences. This is what
the <tt><a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L243">Components</a></tt>
structure is for; it has this algorithm and simple sometimes-O(1),
sometimes-O(<i>NM</i>) update functions based on
<tt>potential_articulation</tt> above.
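<p />The two-pass labeling could look roughly like the following. This is a simplified sketch rather than the <tt>Components</tt> code from the repository: the map format (space means open), the <tt>MAX_LABELS</tt> limit and the function names are assumptions, and it has none of the incremental update logic.

```c
#include <assert.h>

#define MAX_LABELS 256

/* Follow the equivalence table down to the lowest equivalent label. */
static int find_root(const int *equiv, int i) {
    while (equiv[i] != i) i = equiv[i];
    return i;
}

/* Fills labels[] (w*h ints): 0 for walls, 1..n for open squares.
   Returns n, the number of connected components. */
int label_components(const char *map, int w, int h, int *labels) {
    int equiv[MAX_LABELS];
    int next = 1;
    equiv[0] = 0;
    /* Pass 1: raster order, merging with the squares above and to the left. */
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int i = y * w + x;
            if (map[i] != ' ') { labels[i] = 0; continue; }
            int up   = y > 0 ? labels[i - w] : 0;
            int left = x > 0 ? labels[i - 1] : 0;
            if (!up && !left) {
                assert(next < MAX_LABELS);
                equiv[next] = next;
                labels[i] = next++;
            } else if (up && left) {
                int a = find_root(equiv, up), b = find_root(equiv, left);
                int lo = a < b ? a : b, hi = a < b ? b : a;
                equiv[hi] = lo;          /* renumber based on the lowest index */
                labels[i] = lo;
            } else {
                labels[i] = find_root(equiv, up ? up : left);
            }
        }
    }
    /* Pass 2: fix up the equivalences and compact the numbering. */
    int remap[MAX_LABELS] = { 0 };
    int count = 0;
    for (int i = 0; i < w * h; i++) {
        if (!labels[i]) continue;
        int r = find_root(equiv, labels[i]);
        if (!remap[r]) remap[r] = ++count;
        labels[i] = remap[r];
    }
    return count;
}
```

<p />The interesting case is a U-shaped region: its two arms get different labels in the first pass and are only joined when the scan reaches the square that connects them.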
<h3>Phase 3: minimax</h3>
<p />I left it at that for the rest of the week and then Friday the 12th I was
inspired by various posts on the official contest forum to implement <a
href="http://en.wikipedia.org/wiki/Minimax">minimax</a> with <a
href="http://en.wikipedia.org/wiki/Alpha-beta_pruning">alpha-beta
pruning</a>, which would let it look several moves ahead before deciding what
to do. The problem with this approach is that you have to have some way of
estimating who is going to win and by how much, given any possible
configuration of walls and players. If the players are separated by a wall,
then the one with the most open squares, for the most part, wins. If they
aren't separated, then we need to somehow guess who will be able to wall in
whom into a smaller space. To do that, I did what everyone else in the contest
who had read the forums was doing at this point: I used the so-called Voronoi
heuristic.
<p />The "Voronoi heuristic" works like this: for each spot on the map, find
whether player 1 can reach it before player 2 does or vice versa. This creates
a <a
href="http://en.wikipedia.org/wiki/Voronoi_diagram">Voronoi diagram</a> with
just two points which sort of winds around all the obstacles. The best way to
explain it is to show what it looks like during a game:
<center><img src="/img/voronoi.gif" /></center>
<p />The light red area is every square the red player can reach before the blue
player can, and similarly for the light blue squares. If they're white, they're
equidistant. The heuristic value I used initially, and many other contestants
used, was to count the squares on each side and take the difference.
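<p />As a rough sketch of that evaluation: do a breadth-first search from each player and count who gets to each square first. The map format (space means open, players' own squares count as walls), the <tt>MAX_CELLS</tt> limit and the names are assumptions for this example, not the code from my bot.

```c
#include <assert.h>

#define MAX_CELLS 4096

/* Breadth-first distances from start; -1 means unreachable. */
static void bfs(const char *map, int w, int h, int start, int *dist) {
    static const int DX[4] = { 0, 1, 0, -1 }, DY[4] = { -1, 0, 1, 0 };
    int queue[MAX_CELLS], head = 0, tail = 0;
    for (int i = 0; i < w * h; i++) dist[i] = -1;
    dist[start] = 0;
    queue[tail++] = start;
    while (head < tail) {
        int cur = queue[head++], x = cur % w, y = cur / w;
        for (int d = 0; d < 4; d++) {
            int nx = x + DX[d], ny = y + DY[d];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
            int n = ny * w + nx;
            if (map[n] != ' ' || dist[n] >= 0) continue;
            dist[n] = dist[cur] + 1;
            queue[tail++] = n;
        }
    }
}

/* (squares player 1 reaches first) - (squares player 2 reaches first).
   p1 and p2 are indices into map; ties count for nobody. */
int voronoi_score(const char *map, int w, int h, int p1, int p2) {
    assert(w > 0 && h > 0 && w * h <= MAX_CELLS);
    int d1[MAX_CELLS], d2[MAX_CELLS];
    bfs(map, w, h, p1, d1);
    bfs(map, w, h, p2, d2);
    int score = 0;
    for (int i = 0; i < w * h; i++) {
        if (map[i] != ' ') continue;
        int r1 = d1[i] >= 0, r2 = d2[i] >= 0;
        if (r1 && (!r2 || d1[i] < d2[i])) score++;
        else if (r2 && (!r1 || d2[i] < d1[i])) score--;
    }
    return score;
}
```

<p />A positive score means player 1 controls more territory; a score of zero on a symmetric map is what you would expect.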
<p />Once the red player cuts the blue one off, they are no longer in the same
connected component and then gameplay evaluation switches to "endgame" or
"survival" mode, where you just try to outlast your opponent. After this
point, the minimax value was 1000*(size of player 1's connected component -
size of player 2's connected component). The factor of 1000 was just to reward
certainty in an endgame vs. heuristic positional value. Note that this was
only used to <i>predict</i> an endgame situation. After the bot actually
reached the endgame, it simply used the greedy wall-following heuristic
described above, and it performed admirably for doing no searching at all.
<h3>evaluation heuristic tweaking</h3>
<p />I next noticed that my bot would make some fairly random moves in the early
game, effectively cluttering its own space. So I took a cue from my flood
filling heuristic and added a territory bonus for the number of open neighbors
each square in the territory had (effectively counting each "edge" twice).
This led to automatic wall-hugging behavior when all else was equal.
<p />After fixing a lot of bugs, and finally realizing that when time runs out on
a minimax search, you have to throw away the ply you're in the middle of
searching and use the best move from the previous ply, I had an extremely
average bot. Due to the arbitrariness of the ranking up until the last week in
the contest, it briefly hit the #1 spot and then settled to a random spot on
the first page. It was pretty hard to tell whether it was any good, but I was
losing some games, so I figured it must not be.
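<p />The "throw away the ply you're in the middle of" rule can be sketched like this. <tt>OutOfTime</tt> and <tt>search_at_depth</tt> are hypothetical names: the idea is only that the search raises when the clock runs out, and the driver keeps the answer from the last depth that finished.

```python
class OutOfTime(Exception):
    """Raised by the search when its time budget is used up."""

def iterative_deepening(search_at_depth, max_depth):
    """Search depth 1, 2, 3, ... and return the move from the last completed depth."""
    best_move = None
    for depth in range(1, max_depth + 1):
        try:
            move = search_at_depth(depth)
        except OutOfTime:
            break  # this ply is incomplete, so its partial result is discarded
        best_move = move  # only a fully searched ply may replace the answer
    return best_move
```

<p />A half-finished ply has only looked at some of the moves, so its "best so far" can easily be worse than the move the previous ply settled on.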
<h3>endgame tweaking</h3>
<p />The next realization was that my bot was doing a lot of stupid things in the
endgame. So the next improvement was to do an iteratively-deepened search in
the endgame. I exhaustively tried all possible moves, and at the bottom of the
search tree, ran my greedy heuristic to completion. Whichever move sequence
"primed" the greedy evaluator the best wins. This works great on the smallish
official contest maps. It works terribly on very large random maps currently
in rotation on <a href="http://www.benzedrine.cx/tron/">dhartmei's server</a>,
but I didn't realize that until after the contest.
<h3>data mining</h3>
<p />I was out of ideas for a while and spent some time optimizing (I used
Dijkstra's to do the Voronoi computation and I sped it up by using what I call
a radix priority queue which is just a vector of stacks... see <a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L382"><tt>dijkstra</tt></a>).
But it had been bothering me that my edge count/node count Voronoi heuristic
was pretty arbitrary, and I wondered if I could do any kind of inference to
discover better ones.
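<p />Here is one reading of the "vector of stacks" idea in Python, not a port of the linked <tt>dijkstra</tt> function. It assumes positive integer edge weights and a known cap on distances (<tt>max_dist</tt>), so the priority queue can be a list with one stack per distance.

```python
def dijkstra_buckets(neighbors, source, max_dist):
    """Shortest distances from `source`, using a list of stacks as the queue.

    `neighbors(node)` yields (next_node, weight) pairs with positive
    integer weights; nodes farther than `max_dist` are ignored.
    """
    buckets = [[] for _ in range(max_dist + 1)]
    buckets[0].append(source)
    dist = {source: 0}
    for d in range(max_dist + 1):
        stack = buckets[d]
        while stack:
            node = stack.pop()
            if dist[node] != d:
                continue  # stale entry: a shorter route was found after this push
            for nxt, weight in neighbors(node):
                nd = d + weight
                if nd <= max_dist and nd < dist.get(nxt, max_dist + 1):
                    dist[nxt] = nd
                    buckets[nd].append(nxt)
    return dist

graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}
distances = dijkstra_buckets(lambda n: graph[n], 'a', 10)
```

<p />Pushing and popping are constant time, and the outer loop just walks the buckets in order, which is why this beats a general-purpose heap when distances are small integers.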
<p />Well, hundreds of thousands of games had been played on the contest server
by this point, and they are extremely easy to download (the contest site's game
viewer does an AJAX request to get some simple-to-parse data for the game), so
I figured I'd try to do some data mining. I wrote a <a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/util/getgame.pl">quick
perl hack</a> to grab games from the site and output them in a format that
Tron bots recognize. Then I copied-and-pasted my code wholesale into
<tt>examine.cc</tt> and marked it up so it would read in a game back-to-front,
find the point at which the players enter separate components, guess the
optimal number of moves they could have made from that point forward, and then
use the existing territory evaluation code on every turn before that and dump
out some statistics. The goal was to discover a model that would predict,
given these territory statistics, what the difference in squares will
eventually be in the endgame.
<p />I started with an extremely simple linear model (and never really changed it
afterwards): the predicted difference in endgame moves is <i>K<sub>1</sub></i>
(<i>N<sub>1</sub></i> - <i>N<sub>2</sub></i>) + <i>K<sub>2</sub></i>
(<i>E<sub>1</sub></i> - <i>E<sub>2</sub></i>) where <i>N<sub>i</sub></i> is the
number of nodes in player <i>i</i>'s territory and <i>E<sub>i</sub></i> is the
number of edges (double-counted actually).
<p />Now, this model is pretty far from absolutely great, and only a little
predictive. This is what the raw data looks like after analyzing 11691 of the
games the top-100 players (at the time) had played:
<p /><img src="/img/nodes.png"><br>
<img src="/img/edges.png">
<p />That's the difference of nodes/edges on the <i>x</i>-axis and the difference
of endgame moves on the <i>y</i>-axis. So both nodes and edges by the Voronoi
metric are, of course, correlated. I did a linear regression to find
approximate values for <i>K<sub>1</sub></i> (currently 0.055) and
<i>K<sub>2</sub></i> (0.194) and multiplied through by 1000 to keep everything
integers.
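<p />In code, the fitted model comes out to something like the following sketch. The coefficients are the ones quoted above scaled by 1000; the territory representation (a set of <tt>(row, col)</tt> squares) and the choice to count only adjacencies inside the territory are assumptions made for the example.

```python
K1_SCALED = 55    # 0.055 * 1000
K2_SCALED = 194   # 0.194 * 1000

def count_nodes_edges(territory):
    """Return (N, E) for a set of (row, col) squares.

    E counts every adjacency from both ends, i.e. edges are double-counted.
    """
    nodes = len(territory)
    edges = 0
    for r, c in territory:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) in territory:
                edges += 1
    return nodes, edges

def territory_value(territory1, territory2):
    """Integer heuristic: K1*(N1 - N2) + K2*(E1 - E2), scaled by 1000."""
    n1, e1 = count_nodes_edges(territory1)
    n2, e2 = count_nodes_edges(territory2)
    return K1_SCALED * (n1 - n2) + K2_SCALED * (e1 - e2)

block = {(0, 0), (0, 1), (1, 0), (1, 1)}   # 2x2 block: N = 4, E = 8
strip = {(5, 5), (5, 6)}                   # two squares: N = 2, E = 2
value = territory_value(block, strip)      # 55*2 + 194*6 = 1274
```

<p />Note how much more the edge term contributes than the node term: an open, well-connected territory is worth more than a stringy one of the same size.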
<p />This definitely improved play in my own testing (I kept 14 different
versions of my bot throughout the contest so I could compare them.
Interestingly, no bot ever totally shut out a previous bot on all maps in my
tests; every bot has a weakness). Once I had that, I was doing very well in
the leaderboard rankings.
<h3>applied graph theory</h3>
<p />Next I noticed <a
href="http://csclub.uwaterloo.ca/contest/forums/viewtopic.php?f=8&t=319&start=10#p1568">dmj's
"mostly useless" idea</a> on the official contest forums: Pretend the game is
played on a checkerboard. Each player can only move from red to black and vice
versa. Therefore, if a given space has a lot more "red" squares than "black"
squares, the surplus "red" squares will necessarily be wasted. I switched out
all my space counting code to count up red and black spaces, and found a
tighter upper bound on the amount of space an ideal bot could fill. This let
my endgame code stop searching when it had found a solution matching the upper
bound, and gave slightly more realistic territory evaluations.
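<p />A small sketch of that bound, with made-up names. It assumes <tt>squares</tt> is the set of open squares in the region, not including the square the player is standing on.

```python
def fill_upper_bound(squares, start):
    """Upper bound on how many of `squares` a player at `start` can visit.

    Every move flips checkerboard color, so a path alternates colors
    starting with the color opposite to the player's own square.
    """
    start_color = (start[0] + start[1]) % 2
    same = sum(1 for r, c in squares if (r + c) % 2 == start_color)
    other = len(squares) - same
    if other > same:
        return 2 * same + 1   # other, same, other, ..., ending on one extra 'other'
    return 2 * other          # other, same, ..., ending on a 'same' square

# A plus shape with the player in the middle: four open squares, all the
# same color, so only one of them can ever be used.
plus = {(0, 1), (1, 0), (1, 2), (2, 1)}
bound = fill_upper_bound(plus, (1, 1))
```

<p />A naive count would say the plus shape holds 4 squares; the checkerboard argument correctly says 1.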
<p />I had already started to think about what came to be called "chamber trees",
as <a
href="http://csclub.uwaterloo.ca/contest/forums/viewtopic.php?f=8&t=319#p1484">pointed
out by iouri in the same thread</a>: find all the articulation points on the
map and construct a graph of connected spaces. I implemented the standard
O(<i>N</i>) algorithm for finding articulation points (<a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L513"><tt>calc_articulations</tt></a>,
taken from <a
href="http://www.eecs.wsu.edu/~holder/courses/CptS223/spr08/slides/graphapps.pdf">this
presentation [pdf]</a>). I messed around with this idea but nothing came to
fruition until just before the deadline.
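<p />For the curious, the textbook depth-first algorithm looks roughly like this in Python. This is a generic sketch on an adjacency-list graph rather than a translation of <tt>calc_articulations</tt>; on a Tron map the nodes would be open squares and the neighbors the open squares next to them. It is recursive, so very large maps would need an explicit stack.

```python
def articulation_points(graph):
    """Return the cut vertices of an undirected graph {node: [neighbors]}."""
    index = {}       # order in which the DFS first reached each node
    low = {}         # lowest index reachable from the node's DFS subtree
    cut = set()
    counter = [0]

    def dfs(node, parent):
        index[node] = low[node] = counter[0]
        counter[0] += 1
        child_count = 0
        for nxt in graph[node]:
            if nxt not in index:
                child_count += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                # nxt's subtree cannot climb above node: removing node cuts it off
                if parent is not None and low[nxt] >= index[node]:
                    cut.add(node)
            elif nxt != parent:
                low[node] = min(low[node], index[nxt])  # back edge
        # the DFS root is a cut vertex only if it has several DFS children
        if parent is None and child_count > 1:
            cut.add(node)

    for node in graph:
        if node not in index:
            dfs(node, None)
    return cut

# Two triangles joined at node 2.
bowtie = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2, 4], 4: [2, 3]}
points = articulation_points(bowtie)
```

<p />Each node and edge is visited once, which is where the linear running time comes from.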
<p />At around this point, I got extremely sick and spent all day Wednesday in
bed. That day, <a
href="http://www.benzedrine.cx/tron/">dhartmei's server</a> showed up, which
was a huge blessing. I ran my bot on there in the background all Thursday
long, and it did very well on there too, which was a very reassuring thing.
But it was still losing a lot of games.
<p />So finally, after failing to get to sleep Thursday night thanks to coughing
and being unable to breathe through my nose, I was struck by an idea at around
3am. This, it turns out, was probably the contest-winning idea, though I'm not
so sure that nobody else implemented it. Anyway, take a look at this game (<a
href="http://csclub.uwaterloo.ca/contest/visualizer.php?game_id=3878644">a1k0n_
v. ecv257</a>):
<center><img src="/img/example1.png" /></center>
<p />(The little circles are the articulation points found by the algorithm
above.) By the so-called Voronoi heuristic, blue has a lot more space than red
does. But red is ultimately going to win this game, because the only space
that blue controls that matters here is the space that borders red. Blue can
choose to cut off that space and fill in the two chambers on the right, or it
can choose to move into the left chamber and take its claim on what I call the
"battlefront": the border between blue space and red space.
<p />I had long ago come to the realization that a better evaluation
heuristic will always beat deeper minimax searches, because a deep
minimax search using a flawed evaluation heuristic is self-deluded
about what its opponent is actually going to do, and will occasionally
favor moves that lose over moves that win, simply because it can't tell
the difference. Anything you can do to make your evaluation function
smarter will result in improved play in the long run.
<p />In this case, I decided to make my evaluation function aware of the above
condition: if the player is not in the chamber containing the "battlefront",
then make the choice I outlined above. More formally, the new heuristic value
is the same as the old one, but <i>N<sub>i</sub></i> and <i>E<sub>i</sub></i>
are counted differently. First, find all cut vertices <i>assuming the
opponent's territory by the Voronoi metric is disconnected</i>. Start a
depth-first search in the player's "chamber", count <i>N<sub>i</sub></i> and
<i>E<sub>i</sub></i> within the chamber, and list all the neighboring cut
vertices but do not traverse them (<a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L547"><tt>_explore_space</tt></a>).
Now, explore space recursively for each adjacent cut vertex. If child
<i>j</i>'s space is <i>not</i> a battlefront, then our potential value is the
sum of the current chamber size and child <i>j</i>'s value. Otherwise, it
<i>is</i> a battlefront, and we ignore the current chamber's size but add only
the number of steps it takes to enter the child chamber (I don't have a good
formal reason for this, it just seemed intuitively right). After computing
this potential value for each child, we return the maximum of them as the
current chamber's value.
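<p />Here is a loose Python sketch of that recursion. It is only one reading of the description above, not a port of <tt>max_articulated_space</tt>: the <tt>Chamber</tt> class is invented for the example, a single <tt>size</tt> number stands in for the node and edge counts, and a chamber with no children is assumed to be worth its own size.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Chamber:
    size: int                      # stand-in for the space counted inside this chamber
    is_battlefront: bool = False   # does this chamber border the opponent's territory?
    steps_to_enter: int = 0        # moves needed to get here from the parent chamber
    children: List["Chamber"] = field(default_factory=list)

def chamber_value(chamber):
    """Best space the player can claim by committing to one child chamber."""
    if not chamber.children:
        return chamber.size
    best = 0
    for child in chamber.children:
        if child.is_battlefront:
            # Heading for the battlefront: give up this chamber's space and
            # count only the steps spent walking into the child.
            candidate = child.steps_to_enter + chamber_value(child)
        else:
            # Safe chamber: fill the current chamber, then the child.
            candidate = chamber.size + chamber_value(child)
        best = max(best, candidate)
    return best

# Roughly the situation in the picture: a battlefront chamber on the left,
# or two chained chambers on the right.
left = Chamber(size=30, is_battlefront=True, steps_to_enter=3)
right = Chamber(size=8, children=[Chamber(size=12)])
home = Chamber(size=10, children=[left, right])
value = chamber_value(home)   # max(3 + 30, 10 + (8 + 12)) = 33
```

<p />The point is the <tt>max</tt>: the two options are mutually exclusive, so their sizes are never added together the way a plain Voronoi count would add them.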
<p />Therefore the new code will handle the mutual exclusion of battlefront
chambers and other chambers, and force it to choose to either ignore the upper
left chamber or ignore the two chambers on the right.
<p />The idea was extremely roughly-formed when I implemented it (see <a
href="http://github.com/a1k0n/tronbot/blob/a1k0nbot-2.18.2/cpp/MyTronBot.cc#L586"><tt>max_articulated_space</tt></a>),
but it did improve play markedly after throwing it together (again, it didn't
shut out the previous version of my bot totally but it won 12-1 IIRC).
<p />I also had the idea of negatively counting the space we ignore on a
battlefront, as we are effectively handing that space to our opponent. Never
got a chance to try it, though. Might be a nice improvement.
<p />So that was it. I submitted that Friday around noon, and was subsequently
afraid to touch it. (My wife and son and I left for Wisconsin Dells right
afterwards, where I couldn't help but check rankings on my cellphone and keep
up on the forums the whole time, which didn't go over well.) The bot is still
running on <a href="http://www.benzedrine.cx/tron/">dhartmei's server</a>, and
still loses many games as a result of miscounting the opponent's space in the
endgame, since my endgame evaluation wasn't very good. But it was good enough
for the contest.
http://a1k0n.net/2009/11/17/hacker-challenge-2-solutionHacker challenge part 2 solution2009-11-17T00:00:00-06:00Andy Sloanehttp://a1k0n.net/<p>Because I am lazy and easily sidetracked, the promised update to
the "Hacker challenge" (<a
href="http://a1k0n.net/blah/archives/2009/03/index.html#e2009-03-31T18_50_59.txt">part
1</a>, <a
href="http://a1k0n.net/blah/archives/2009/04/index.html#e2009-04-03T21_36_15.txt">part
2</a>) is now over seven months late. So I might as well post the
solution to part 2, which was solved by three individuals on
<a
href="http://www.reddit.com/r/programming/comments/89vma/hacker_challenge_part_2_solution_to_part_1_and_a/">
reddit (bobdole, c0dep0et, and xahtep)</a>.</p>
<p>Part 2 used <a href="http://en.wikipedia.org/wiki/Digital_Signature_Algorithm">DSA</a>,
which was fairly obvious; as before I made no effort to hide it. Instead of
decrypting to a particular value, it verifies a message signature hash of
12345678 with various "random per-message values" or <a
href="http://en.wikipedia.org/wiki/Cryptographic_nonce">nonces</a>.</p>
<p>The parameters for the algorithm were encoded in 32-bit chunks. The nice
thing about DSA is that you can use huge <i>p</i> and <i>y</i> values (see
Wikipedia for the terminology) without making the license key any bigger;
unfortunately that doesn't buy you a lot, because the underlying group order is
only <i>q</i>, so that's how big the search space theoretically is -- it
doesn't increase security, it just makes the discrete log marginally slower to compute.</p>
<p>So I chose <i>p</i>, <i>y</i>, and <i>g</i> to be on the order of 384 bits,
but <i>q</i> and <i>x</i> are only on the order of 64 bits. In fact <i>p</i>
is just a 64-bit number shifted left 320 bits and then incremented.</p>
<p>The security of DSA derives from the difficulty of determining <i>x</i> from
<i>y = g<sup>x</sup></i> mod <i>p</i>, which is known as the <a
href="http://en.wikipedia.org/wiki/Discrete_logarithm">discrete logarithm
problem</a> and is generally considered harder than integer factorization.</p>
<p>So after you reconstruct the parameters from the hexadecimal encoding you find:</p>
<i>p</i> = 12026070341127085754893097835098576041235013569186796331<br />
441953314639277634647572425804266039236571162321832835547137<br />
<i>g</i> = 54659936461116297034410364232325768273521088000551606899<br />
39983550682370032756410525809260221877924847568552733696072<br />
<i>y</i> = 27434965696578515868290246727046666462183462061939529180<br />
41150093730722092239431092724025892380242699544101134561292<br />
<p>You can plug these numbers into a <a
href="http://www.alpertron.com.ar/DILOG.HTM">discrete log solver</a> and find
<i>x</i> (it will also deduce <i>q</i> as the subgroup size after a few
seconds). This takes about three hours on my MacBook Pro, IIRC.</p>
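<p>To give a feel for what a discrete log solver does, here is the simplest non-brute-force method, baby-step giant-step, in Python (3.8 or later). This is a stand-in chosen for brevity and not necessarily what the linked solver uses; it needs a table of about the square root of the group size, so it only makes sense for toy numbers like the ones in the example.</p>

```python
from math import isqrt

def discrete_log(g, y, p):
    """Find x with pow(g, x, p) == y, or return None if there is no such x.

    Baby-step giant-step: write x = i*m + j and look for a collision
    between g^j (baby steps) and y * g^(-i*m) (giant steps).
    """
    m = isqrt(p - 1) + 1
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)   # remember the smallest j for each g^j
        value = value * g % p
    factor = pow(g, -m, p)          # g^(-m) mod p
    gamma = y % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None

# Toy example: a small prime modulus and a "secret" exponent.
p, g, secret = 1019, 2, 345
y = pow(g, secret, p)
x = discrete_log(g, y, p)
```

<p>The recovered <tt>x</tt> satisfies <tt>pow(g, x, p) == y</tt>, which is all an attacker needs.</p>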
<p>Once you have <i>x</i> the challenge collapses into reimplementing
DSA (with a small twist: <i>s</i> is inverted in the generator, not in
the validator; I can't see any reason this would affect security and
it makes the validator simpler):</p>
<li />let <i>H</i> = 12345678 (the supposed message hash)
<li />choose a nonce <i>k</i>
<li />compute <i>r</i> = (<i>g<sup>k</sup></i> mod <i>p</i>) mod <i>q</i>
<li />compute <i>k</i><sup>-1</sup> (mod <i>q</i>) using the modular multiplicative inverse (<tt>mpz_invert</tt> with <a href="http://gmplib.org/">GMP</a>)
<li />compute <i>s</i> = (<i>k</i><sup>-1</sup>(<i>H</i> + <i>x r</i>)) mod <i>q</i>
<li />let <i>w</i> = <i>s</i><sup>-1</sup> (mod <i>q</i>)
<li />combine (<i>r,w</i>) into <i>K</i> = <i>r q</i> + <i>w</i>
<li />convert <i>K</i> into a base32-ish key string
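<p>The steps above can be sketched in Python (3.8 or later) with toy parameters. Everything numeric here except <i>H</i> is made up for the example and is far too small to be secure; the validator shown is ordinary DSA verification with <i>w</i> supplied directly, which is my reading of "<i>s</i> is inverted in the generator", and the base32-ish string encoding is left out.</p>

```python
P, Q, G = 23, 11, 2      # toy parameters: Q divides P - 1, and G has order Q mod P
X = 7                    # private key (toy value)
Y = pow(G, X, P)         # public key
H = 12345678             # the fixed "message hash"

def make_key(k):
    """Sign H with nonce k (1 <= k < Q) and pack (r, w) into K = r*Q + w."""
    r = pow(G, k, P) % Q
    if r == 0:
        raise ValueError("unusable nonce; pick another k")
    k_inv = pow(k, -1, Q)              # modular inverse of k
    s = k_inv * (H + X * r) % Q
    if s == 0:
        raise ValueError("unusable nonce; pick another k")
    w = pow(s, -1, Q)                  # inverted here, not in the validator
    return r * Q + w

def check_key(K):
    """Validate a packed key using only the public values P, Q, G, Y and H."""
    r, w = divmod(K, Q)
    if not (0 < r < Q and 0 < w < Q):
        return False
    u1 = H * w % Q
    u2 = r * w % Q
    v = pow(G, u1, P) * pow(Y, u2, P) % P % Q
    return v == r

key = make_key(3)
```

<p>With these toy numbers, nonce 3 gives <i>r</i> = 8 and <i>w</i> = 5, so <i>K</i> = 93, and <tt>check_key(93)</tt> accepts it. Different nonces give different valid keys for the same hash.</p>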
<p>I think this scheme is actually pretty good, as it's non-trivial to solve,
but it's still crackable with the newest discrete log solver methods. It did
confound some dedicated redditors for a couple days, at least, with all the
details laid bare.</p>
<p>The obvious next step is to move on to elliptic curve cryptography, and that
is the reason this post is so late. When I started writing the first hacker
challenge I was completely ignorant of ECC. Immediately after writing the
previous post, I bought a book on the subject, and while I understand the
basics now I still don't understand it well enough to write a toy
implementation suitable for a "Hacker Challenge". So I will leave
that for another day, or perhaps for another person.</p>
<p>Source code for the DSA private key generator and license generator:</p>
<li /><a href="http://a1k0n.net/code/keydecode2.cpp.txt">keydecode2.cpp</a> - the challenge code from the last post (for reference)<br />
<li /><a href="http://a1k0n.net/code/bn.h.txt">bn.h</a> - quick and dirty bignum template<br />
<li /><a href="http://a1k0n.net/code/dl_genpriv.c.txt">dl_genpriv.c</a> - discrete log private/public key pair generator<br />
<li /><a href="http://a1k0n.net/code/dsa_genlic.c.txt">dsa_genlic.c</a> - license generator (key parameters hardcoded)<br />
http://a1k0n.net/2009/04/03/hacker-challenge-2Hacker challenge part 22009-04-03T00:00:00-05:00Andy Sloanehttp://a1k0n.net/<p>Well, I guess I'm not quitting my day job to become a cryptographer
any time soon.</p>
<p>As was instantly ascertained on reddit, the key algorithm in my "<a
href="http://a1k0n.net/blah/archives/2009/03/31/index.html#e2009-03-31T18_50_59.txt">Hacker
Challenge</a>" is <a href="http://en.wikipedia.org/wiki/RSA">RSA</a>.
I was hoping that it would take at least an hour to crack the private
key, but alas, I had severely overestimated the time that modern
elliptic-curve and number field sieve integer factorizers would take.
It factored in under a second, which means the key would have to be
many orders of magnitude larger to offer any kind of security.</p>
<p>So within an hour a factorization of <i>n</i> was <a
href="http://www.reddit.com/r/programming/comments/890yf/hacker_challenge_can_you_make_a_key_generator/c08l4rh">posted
to reddit by c0dep0et</a>. Even so, <a
href="http://www.reddit.com/r/programming/comments/890yf/hacker_challenge_can_you_make_a_key_generator/c08l4mg">LoneStar309
pointed out an embarrassing implementation mistake</a> which
significantly weakened it (given a valid key, you could generate
several other valid encodings of the same key); I patched this, as I
mentioned in my update. And then <a
href="http://www.reddit.com/r/programming/comments/890yf/hacker_challenge_can_you_make_a_key_generator/c08l9k1">teraflop
demonstrated possession of a working keygen</a> a couple hours
later.</p>
<p>I wanted the key generator challenge to be possible, and it
definitely wasn't trivial, but it was still far easier than I had
hoped. Still, I couldn't be happier with the result, and I would like
to thank my fellow programming.redditors for a great discussion.</p>
<p>For those who haven't studied how RSA works in sufficient detail to go
from a factored <i>n</i> to a key generator, go take a moment to read
up on Wikipedia. Basically the public and private keys, <i>e</i> and
<i>d</i>, are <a
href="http://en.wikipedia.org/wiki/Modular_multiplicative_inverse">multiplicative
inverses</a> mod <i>φ(n)</i> where <i>φ(n)</i> is <a
href="http://en.wikipedia.org/wiki/Euler's_totient_function">Euler's
totient function</a>. In the case of <i>n</i>=<i>pq</i> where
<i>p</i> and <i>q</i> are prime, <i>φ(n)</i> = (<i>p</i> -
1)(<i>q</i> - 1). So you use the <a
href="http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm">extended
Euclidean algorithm</a> to find <i>e</i> from <i>d</i> and
<i>φ(n)</i>. If you're using <a href="http://gmplib.org/">GMP</a>
(I am), you can just call <tt>mpz_invert</tt> to do that.</p>
<p>Once you've recovered <i>e</i> from <i>d</i>, you just RSA-encrypt the
message <i>m</i> = 12345678 + <tt>check_mod</tt>*<i>N</i> where
<i>N</i> is the key number of your choosing and 12345678 is a
<a
href="http://en.wikipedia.org/wiki/Nothing_up_my_sleeve_number">"nothing
up my sleeve" number</a> I chose for validating a decryption.
The ciphertext is thus <i>m</i><sup><i>e</i></sup> (mod <i>n</i>),
calculated using <a
href="http://en.wikipedia.org/wiki/Exponentiation_by_squaring">exponentiation
by squaring</a>, mod <i>n</i> at each step (which is what
<tt>expmod</tt> does in <tt>bn.h</tt>), and then you do the reverse of
<tt>decode</tt> to turn the number into a string.</p>
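<p>Here is the whole recipe as a Python sketch with a toy key. The primes, the exponent <i>d</i>, and the value of <tt>check_mod</tt> are all made up for the example (the real ones are in the challenge source), and the final number-to-string step is omitted.</p>

```python
def modinv(a, m):
    """Modular inverse of a mod m via the extended Euclidean algorithm."""
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
    if old_r != 1:
        raise ValueError("no inverse: a and m are not coprime")
    return old_s % m

def expmod(base, exponent, modulus):
    """base**exponent mod modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent:
        if exponent & 1:
            result = result * base % modulus
        base = base * base % modulus   # reduce at every step to keep numbers small
        exponent >>= 1
    return result

# Toy key built from two well-known Mersenne primes (much too small to be secure).
p, q = 2**31 - 1, 2**61 - 1
n = p * q
phi = (p - 1) * (q - 1)
d = 65537               # exponent the validator uses (toy choice)
e = modinv(d, phi)      # exponent the key generator needs

check_mod = 10**8       # hypothetical stand-in for the constant in the challenge code
N = 42                  # key number of your choosing
m = 12345678 + check_mod * N
key_number = expmod(m, e, n)          # the generator's output, before string encoding
decoded = expmod(key_number, d, n)    # what the validator computes
```

<p>Decoding gets back <i>m</i>, so the validator can check that it looks like 12345678 plus a multiple of <tt>check_mod</tt>.</p>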
<p>The code I used for generating RSA private key pairs is <a
href="/code/rsa_genpriv.c.txt">rsa_genpriv.c</a> and for generating
license keys is <a href="/code/rsa_genlic.c.txt">rsa_genlic.c</a>.
These require <a href="http://gmplib.org/">libgmp</a>; the job is just
too big for poor little <tt>bn.h</tt>.</p>
<p>(All my code here is MIT-licensed, by the way, so feel free to
steal it for your own purposes. By all means, use it instead of some
silly easy-to-duplicate hashing scheme for your application...)</p>
<p>So, will RSA-based license schemes work? Not with such a short key
length. Can we just make the key length longer? Well, that depends.
Your ciphertext is always going to be a number between 2 and <i>n</i>;
if <i>n</i> is 512 bits then so is your ciphertext. 1024 bits is
probably the smallest reasonably secure size you'd want to use for
RSA, which is 205 characters in the A-Y,1-9 code I'm using. So if
your users are pasting keys out of an email, that's probably fine, but
if they're typing it in by hand off of a CD case, forget it.</p>
<p>Also, this scheme, though cryptographically weak, has some points
in its favor. If a theoretical cracker disassembles the code, he
absolutely <b>must</b> understand RSA at some level, extract <i>n</i>,
and factor it in order to create a key generator. I probably wouldn't
have the patience to do it if the least bit of obfuscation were used
in conjunction. It's totally self-contained (so you don't have to
link in libcrypto or libopenssl or libgmp), so it's pretty much a
drop-in replacement for whatever hashing scheme that most software
tends to use.</p>
<p>And, though the backbone of the challenge was quickly broken, only
one person demonstrated a keygen. I guess one is all it takes.</p>
<p>Can we do better? Yes, I think we can do much better. RSA's
security derives from the difficulty of the integer factorization
problem. There are two other commonly used classes of asymmetric key
cryptosystems based on harder problems: discrete logarithm and
elliptic curve discrete logarithm. Each provides more "strength" per
bit of key than the last.</p>
<p><a href="http://www.reddit.com/r/programming/comments/890yf/hacker_challenge_can_you_make_a_key_generator/c08l6dl">james_block
brings up some good points</a> along these lines. It may not be
possible to create a software license scheme with both short license
codes and enough security to withstand a large, coordinated effort to
break it. But it's far better to use a license key scheme that could
be broken with a large effort than one that will definitely be broken
with a small effort, when the former is an easy drop-in replacement
for the latter. Truly uncrackable (in the cryptographic sense)
security will require longer keys and users who paste keys out of
emails.</p>
<p>So here is challenge #2. I've used another common algorithm which
is no longer encumbered by a patent. The ciphertext is still slightly
less than 125 bits. It is not impossible to crack by any means, but
it is much harder (in terms of CPU time necessary) than the previous
one. And there's always the possibility that I screwed something up
and left a big back door in it, which is a good reason for proposing
the challenge in the first place.</p>
<p>The code:
<br /><a href="/code/keydecode2.cpp.txt">keydecode2.cpp</a> - challenge #2 decoder
<br /><a href="/code/bn.h.txt">bn.h</a> - quick and dirty bignums (updated from
last time)</p>
<p>I plan on issuing one further challenge next week, and there's a
good chance that this one will be broken before then if it receives
the same level of attention as the first one did.</p>
<p><a
href="http://www.reddit.com/r/programming/comments/89vma/hacker_challenge_part_2_solution_to_part_1_and_a/">Here</a>
is the reddit thread for part 2.</p>
<p><b>Update</b>: <a href="/2009/11/17/hacker-challenge-2-solution.html">The
solution to part 2</a> has been posted.</p>
http://a1k0n.net/2009/03/31/hacker-challengeHacker challenge: Can you make a keygen?2009-03-31T00:00:00-05:00Andy Sloanehttp://a1k0n.net/<p>I like to reverse-engineer things, and I like number theory. These
hobbies happen to intersect in the art of reverse-engineering software
license keys.</p>
<p>I won't lie: I've cracked programs. I've created key generators for
programs. But I also never distribute them. I do it for the
challenge, not for the program.</p>
<p>But from a warez d00d perspective, it is infinitely preferable if you
can create a key generator instead of cracking, because then you can
typically get further software updates, and things are just easier for
everyone. </p>
<p>It is sometimes shockingly easy to create a key generator. Often a
program that checks a license key is structured like this: </p>
<pre>
/* ask the user for a key, then compute what the key "should" be */
char *licensestr = get_license_key_modal_dialog();
char *validlicensestr = make_valid_license(licensestr);
if (strcmp(licensestr, validlicensestr) == 0) { ... }
</pre>
<p>So now all I have to do is extract your make_valid_license code, feed
it random garbage, and I have a key generator for your program. One
time I just replaced the call to strcmp() with puts() in a program and
turned it into its own key generator.</p>
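<p>To make that concrete, here is a toy version of the pattern in Python. The license scheme is entirely hypothetical (12 characters plus 4 check characters from a hash); the point is only that whoever has <tt>make_valid_license</tt> has a key generator.</p>

```python
import hashlib
import random
import string

def make_valid_license(candidate):
    """Hypothetical scheme: keep 12 characters and append 4 check characters."""
    payload = candidate[:12].upper().ljust(12, "X")
    check = hashlib.sha256(payload.encode()).hexdigest()[:4].upper()
    return payload + check

def is_valid(license_str):
    # The flawed pattern: the program computes the right answer, then compares.
    return license_str == make_valid_license(license_str)

def keygen(rng=random):
    """Feed the program's own routine random garbage; out comes a valid key."""
    garbage = "".join(rng.choice(string.ascii_uppercase) for _ in range(12))
    return make_valid_license(garbage)

key = keygen()
```

<p>Every key this produces passes <tt>is_valid</tt>, and no understanding of the hash was needed to write it.</p>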
<p>Other key generators cycle through a hash of some sort (the hash is
sometimes srand() / rand()) and ensure some check digits, or whatever.
Any way you slice it, it's security through obscurity: you're giving
the end user the code, and if the end user can read and understand that
code, they can break it.</p>
<p>It doesn't have to be this way. I have created a self-contained
license key decoder, and I'm distributing the source code to it. In
my next post, I will reveal all the details and how to create keys for
it. For now, I want to see whether anyone can break it without having
the "official" key generator. If so, there's a flaw in my reasoning.
It uses a well-known, public-domain algorithm; that's all I'm going to
say for now.</p>
<p>The code is here:</p>
<p><a href="http://a1k0n.net/code/keydecode.cpp.txt">keydecode.cpp</a> - key
decoder</p>
<p><a href="http://a1k0n.net/code/bn.h.txt">bn.h</a> - quick and dirty bignums</p>
<p>(The web host I'm using has the wrong MIME types on .cpp and .h, so they're
.txts - sorry)</p>
<p>I would like to open up a <a
href="http://www.reddit.com/r/programming/comments/890yf/hacker_challenge_can_you_make_a_key_generator/">discussion
on reddit</a>. Undoubtedly many people there will recognize the
algorithm and maybe poke holes in what I'm doing.</p>
<p><b>Update</b>: "maybe poke holes in what I'm doing". Ha. More like
drive a cement mixer through it in minutes. I was pleasantly
surprised to find that this reached #1 on the programming subreddit.
LoneStar309 found a gaping hole which I patched, and tharkban also
found a bug in the final if statement, also fixed. It's fair game to
make keys that way for the challenge I proposed, I suppose, but I
wanted to see whether the idea would work, not necessarily my poor
implementation of it. Turns out: no, it won't, and unsurprisingly
it's been done before. Part 2 coming later.</p>
<p><b>Update 2</b>: <a href="/2009/04/03/hacker-challenge-2.html">Hacker
challenge part 2</a> has been posted.</p>
http://a1k0n.net/2007/10/12/obfuscated-c-vendetta-onlineThe source code to Vendetta Online2007-10-12T00:00:00-05:00Andy Sloanehttp://a1k0n.net/By request: <a href="/code/vosource.c.txt">Vendetta Online's source code</a>.
<pre>
_
,
i
,
z
,x ,
y, o
,b[1840],A
=0;p(n,c){
for(;n--;x++)c==10?y+=80,x=o-1:x>0 ?80 >x? !(
c==64)?b[y+x]=c:0:0:0;}c(q,l,r,o,s) char *l, *r ,*s;{while(q>
0)q= (s[_ /6]- 62> _++% 6&1? r[q] :l[ q]) -o ;; return q;}u(q,l,
r,o) char *l,* r;{ return c(q ,l,r ,o, "A" "mv" "jjQLQm\\polpxwq"
"{pIw" "hR" "|h" "ZO" "LF" "MR" "{g" "H" "E" "ws" "" "" ""
"[LQm" "lq" "_m" "}L" "?") ;}v( q,l, r,o ) char *l , *r;
{return c(q, l,r, o,"" ">K" "su" "f]" "mz" "G" "kM" "" "" ""
"quZljq" "{b" "m]" "tu" "YJ" "sQ" "mZ" "GK" "J" "hD" ); } vu
(a,b){for(o=x =a,y =80* b,_= 0;(( 283> _)); )a= "*/" "" "" ""
")([\n@" "\\" "_ " [u(8 ,"\" *,'" "&%" ".0" ,"" "$#" "" "" ""
"+!)(-/" "1", 42)+ 10], p("" "#$" "%&" "'(" ")" "*," [u ( 7,
"$#'&*\"-/", "(%" ")!" "+,. ",41 )+9] -34, a); }va( a, b ){for
(o=x=a,y =80* b,_= 0;(( 201> _)); )a= "@/ ]\n-" "\\_." [ v(7,
"# )&\"" "*," ".", "(%" "!$" "'+" "-/" ,41) + 9], p( "#$%&')"[v(4,"#\"$!)",
"%&'( ", 38)+ 6]- 34,a );} main (){ puts ( "\033[" "2J");for(;;A++){for(i=
0;i<1840 ;i++ )b[i ]=32 ;srand(0) ;for (i=0 ;i <100;i ++){z=rand()%20;b[(rand
()%23)*80+(rand()-A*(1+z)/59 )%80 ]=".,o*"[ z/ 5 ] ;} va(10,(int)(8.5+8.5*sin
(A*0.02 /****/ )));; /****/ vu(55,(int)(6.5+6.5*cos(A*0.03/****/ ))); printf(
"%c[H"/********/ , /********/ 27);for(i=1;i<80*23+1;i++ /********/
/*** ***/ /*** ***/ /*** ***/
/** **/ /** **/ )putchar(i%80?b[i]:10); /** **/
/*** ***/ /*** ***/ usleep(10000);}}/*** ***/
/********/ /********/ /********/
/****/ /****/ /****/
</pre>
<p>(Updated 3/9/2010: I added a usleep in there, cause it runs way too
fast)</p>
<p>(From <a
href="http://www.vendetta-online.com/x/msgboard/2/14902?page=2">this
thread</a>, referencing an inside joke from a <a
href="http://www.vendetta-online.com/x/msgboard/1/2807#38314">much
earlier thread</a>, but the message ordering got all mixed up at some
point when I had to reindex the message IDs in the database (sigh)).</p>
http://a1k0n.net/2007/08/24/obfuscated-c-fireAnother short C program.2007-08-24T00:00:00-05:00Andy Sloanehttp://a1k0n.net/<pre>
b[2080];main(j){for(;;){printf("\x1b[H");for(j=1;j<2080;j++)b[j]=j<
2000?(b[j+79]+b[j+80]+b[j]+b[j-1]+b[j+81])/5:rand()%4?0:512,j<1840?
putchar((j%80)==79?'\n':" .:*#$H@"[b[j]>>5]):0;usleep(20000);}}
</pre>
<p>It's supposed to be the old fire demo effect but in ASCII. It looks kinda
like camouflage instead.</p>
<p>update: fixed a crashing bug. It has other issues with uninitialized
data, though.</p>
http://a1k0n.net/2006/09/20/obfuscated-c-donut-2Embellishing the donut: an old-school CG cliche2006-09-20T00:00:00-05:00Andy Sloanehttp://a1k0n.net/
<pre>
_,x,y,o ,N;char b[1840] ;p(n,c)
{for(;n --;x++) c==10?y +=80,x=
o-1:x>= 0?80>x? c!='~'? b[y+x]=
c:0:0:0 ;}c(q,l ,r,o,v) char*l,
*r;{for (;q>=0; )q=("A" "YLrZ^"
"w^?EX" "novne" "bYV" "dO}LE"
"{yWlw" "Jl_Ja|[ur]zovpu" "" "i]e|y"
"ao_Be" "osmIg}r]]r]m|wkZU}{O}" "xys]]\
x|ya|y" "sm||{uel}|r{yIcsm||ya[{uE" "{qY\
w|gGor" "VrVWioriI}Qac{{BIY[sXjjsVW]aM" "T\
tXjjss" "sV_OUkRUlSiorVXp_qOM>E{BadB"[_/6 ]-
62>>_++ %6&1?r[q]:l[q])-o;return q;}E(a){for (
o= x=a,y=0,_=0;1095>_;)a= " <.,`'/)(\n-" "\\_~"[
c (12,"!%*/')#3" "" "+-6,8","\"(.$" "01245"
" &79",46)+14], p("" "#$%&'()0:439 "[ c(10
, "&(*#,./1345" ,"')" "+%-$02\"! ", 44)+12]
-34,a); }main(k){float A=0,B= 0,i,j,z[1840];
puts("" "\x1b[2J");;; for(;; ){float e=sin
(A), n= sin(B),g=cos( A),m= cos(B);for(k=
0;1840> k;k++)y=-10-k/ 80 ,o=41+(k%80-40
)* 1.3/y+n,N=A-100.0/y,b[k]=".#"[o+N&1], z[k]=0;
E( 80-(int)(9*B)%250);for(j=0;6.28>j;j +=0.07)
for (i=0;6.28>i;i+=0.02){float c=sin( i), d=
cos( j),f=sin(j),h=d+2,D=15/(c*h*e+f *g+5),l
=cos(i) ,t=c*h*g-f*e;x=40+2*D*(l*h* m-t*n
),y=12+ D *(l*h*n+t*m),o=x+80*y,N =8*((f*
e-c*d*g )*m -c*d*e-f*g-l*d*n) ;if(D>z
[o])z[o ]=D,b[ o]=" ." ".,,-+"
"+=#$@" [N>0?N: 0];;;;} printf(
"%c[H", 27);for (k=1;18 *100+41
>k;k++) putchar (k%80?b [k]:10)
;;;;A+= 0.053;; B+=0.03 ;;;;;}}
</pre>
(as with the <a href="/2006/09/15/obfuscated-c-donut.html">first one</a>, compile it with -lm, and it needs
ANSI-ish terminal emulation)
http://a1k0n.net/2006/09/15/obfuscated-c-donutHave a donut.2006-09-15T00:00:00-05:00Andy Sloanehttp://a1k0n.net/(compile with <tt>gcc -o donut donut.c -lm</tt>, and it needs ANSI- or
VT100-like emulation)
<pre>
k;double sin()
,cos();main(){float A=
0,B=0,i,j,z[1760];char b[
1760];printf("\x1b[2J");for(;;
){memset(b,32,1760);memset(z,0,7040)
;for(j=0;6.28>j;j+=0.07)for(i=0;6.28
>i;i+=0.02){float c=sin(i),d=cos(j),e=
sin(A),f=sin(j),g=cos(A),h=d+2,D=1/(c*
h*e+f*g+5),l=cos (i),m=cos(B),n=s\
in(B),t=c*h*g-f* e;int x=40+30*D*
(l*h*m-t*n),y= 12+15*D*(l*h*n
+t*m),o=x+80*y, N=8*((f*e-c*d*g
)*m-c*d*e-f*g-l *d*n);if(22>y&&
y>0&&x>0&&80>x&&D>z[o]){z[o]=D;;;b[o]=
".,-~:;=!*#$@"[N>0?N:0];}}/*#****!!-*/
printf("\x1b[H");for(k=0;1761>k;k++)
putchar(k%80?b[k]:10);A+=0.04;B+=
0.02;}}/*****####*******!!=;:~
~::==!!!**********!!!==::-
.,~~;;;========;;;:~-.
..,--------,*/
</pre>
(This was my first attempt at obfuscated C and I feel it's pretty amateurish;
see <a href="/2006/09/20/obfuscated-c-donut-2.html">Donut Mark II</a> for a more
impressive demo — though this one is simple and elegant in comparison.)
http://a1k0n.net/2005/11/04/lisp-using-slime-over-sshUsing SLIME over an SSH tunnel2005-11-04T00:00:00-06:00Andy Sloanehttp://a1k0n.net/<p>If you'd like to use Emacs on one computer (e.g. your Windows box at home) and
use <a href="http://common-lisp.net/project/slime/">SLIME</a> to connect to a
Common Lisp process on a remote computer (e.g. your server at work), here's how
I do it.</p>
<p>First, create a startup file for your favorite Lisp implementation.</p>
<h2>lisp startup file</h2>
<pre>
(require 'asdf)
(asdf:oos 'asdf:load-op 'swank)
; start swank
(setf swank:*use-dedicated-output-stream* nil)
(setf swank:*communication-style* :fd-handler)
(swank:create-server :dont-close t)
</pre>
<p>Now edit your ~/.emacs so that you've got something like the following in it:</p>
<h2>.emacs</h2>
<pre>
(require 'slime)
(require 'tramp)
(add-hook 'lisp-mode-hook (lambda () (slime-mode t)))
(add-hook 'inferior-lisp-mode-hook (lambda () (inferior-slime-mode t)))
(setq lisp-indent-function 'common-lisp-indent-function
      slime-complete-symbol-function 'slime-fuzzy-complete-symbol)
(slime-setup)
;;; If you want to tunnel through an intermediate host, such as your
;;; work firewall, use the following couple lines. If you're using a
;;; Windows emacs, use 'plink' as below, otherwise substitute 'ssh'.
(add-to-list
 'tramp-default-proxies-alist
 '("\\.work-domain\\.com" nil "/plink:fwuserid@firewall.work-domain.com:/"))
(add-to-list
 'tramp-default-proxies-alist
 '("firewall\\.work-domain\\.com" nil nil))
(defvar *my-box-tramp-path*
  "/ssh:me@my-box.work-domain.com:")
(defvar *current-tramp-path* nil)
(defun connect-to-host (path)
  (setq *current-tramp-path* path)
  (setq slime-translate-from-lisp-filename-function
        (lambda (f)
          (concat *current-tramp-path* f)))
  (setq slime-translate-to-lisp-filename-function
        (lambda (f)
          (substring f (length *current-tramp-path*))))
  (slime-connect "localhost" 4005))
(defun my-box-slime ()
  (interactive)
  (connect-to-host *my-box-tramp-path*))
(defun my-box-homedir ()
  (interactive)
  (find-file (concat *my-box-tramp-path* "/home/me/")))
</pre>
<p>Now load the startup file you created into Lisp on the remote host to
start the swank server. Then create an SSH tunnel, e.g.
<tt>ssh -L 4005:localhost:4005 me@my-work.com</tt>.</p>
<p>Now you can <tt>M-x my-box-slime</tt> to connect through your SSH
tunnel to your work box. SLIME's <tt>M-.</tt> command will also
correctly open the file containing the defun of whatever's under
your cursor, <tt>C-c C-k</tt> works correctly, and so on. If you want
to open a Lisp file on the remote box, <tt>M-x my-box-homedir</tt> is a
convenient shortcut.</p>
<h2>For Windows users</h2>
<p>If you're using Windows and also want to use a multi-hop tramp method
(i.e. ssh into your work firewall, and then ssh from there to your
server at work), be aware that tramp 2.1.4 and earlier have a bug; it's
fixed in CVS and probably in 2.1.5, which is not out yet. Information
and a patch are available <a
href="http://lists.gnu.org/archive/html/tramp-devel/2005-10/msg00060.html">here</a>.</p>
<p>You'll also want to use plink from the <a
href="http://www.chiark.greenend.org.uk/~sgtatham/putty/">PuTTY</a>
distribution in lieu of ssh. If you're doing multi-hop tramp, though,
you need to use plink for the first hop (Windows box -> "firewall"
box) and ssh thereafter ("firewall" -> "server").</p>