April 6, 2007

The Chudnovsky Brothers

Filed under: Analysis, Computing — mark @ 7:34 am

Just a few links today. At lunch Wednesday, however these things come up, I found myself talking about the Chudnovsky Brothers, the subjects of the New Yorker profile “The Mountains of Pi”, which describes the parallel-processing computer built in Gregory Chudnovsky’s West Side apartment (from spare parts and Home Depot supplies) to calculate digits of Pi.

The article I was actually referring to, however, covered their more recent project: knitting together high-resolution digital photographs of the Unicorn tapestries at The Cloisters. There is an excellent multimedia resource on the Nova site that discusses this project and the brothers themselves. The 10-minute QuickTime segment is really well done. The New Yorker piece was well done too, but as usual, the entry has vanished from their rather densely commercialized site, and I can’t find a copy anywhere.

May 27, 2006

The Candy Economy

Filed under: Analysis — mark @ 10:48 pm

Data haunts me. I don’t know why … but it commands me to try to sort it out and figure out what it is trying to tell me. I rarely do. Usually an eight-year-old wanders into the room, looks at the multiple spreadsheets open on the main display, eyes the three Perl scripts puttering away on the backup PC, and then tosses in an accurate assessment of the probable conclusion along with a request for a Coke.

Luckily, the resident geniuses are long off to bed, so consider this. There would be no serious objection to the assumption that the Mallo Cup (by Boyer) is the premier candy for us huddled masses yearning to breathe free. I am a longtime fan of the Mallo Cup, although I only recently learned that they are manufactured (why is it that you eat food that is cooked, but candy that is manufactured?) in a factory in Altoona, PA, along a highway I regularly travelled going from Pittsburgh to Penn State and back.

Along with its magical cheap chocolate, marshmallow center, and coconut accents, the Mallo Cup offers its own fake money. Fake money that, it seems, has been printed on the same mimeograph machine since the Dawn of Time. Each card shows a ‘coin’ (worth 5, 10, 25 or 50 points) and an address where you can send in your collected coins for valuable prizes, as well as request a catalogue of the treasures available.

The catalogue is actually one side of a glossy 8.5×11″ brochure showing 10 prizes: a $1.00 rebate check, coffee mug, tin of Mallo Cups, canvas tote bag, baseball cap, youth or adult t-shirts or sweatshirts, and a quartz wristwatch.

Based on a collection of 12 coins (not statistically meaningful, but what about this is, exactly?) you’ll average about 25 points per $0.75 Mallo Cup package. Among the prizes offered, the worst deals are the rebate check and coffee mug … at their stated retail values, they can be ‘purchased’ with points at an exchange rate of $0.002 per point (1/5 of a cent). The best value is the Adult sweatshirt, with an exchange rate of $0.0034 per point.

Unfortunately, to get sufficient points for your sweatshirt, you have to buy (and presumably eat) $150 worth of Mallo Cups. After which the sweatshirt is unlikely to fit.
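For anyone who wants to check the arithmetic, here is a rough sketch in Python. Only the four constants come from the figures above; the point total and the implied retail value of the sweatshirt are back-calculated from them (the brochure's actual numbers are not quoted here), and the variable names are mine.

```python
# Figures from the post; everything below them is derived, not quoted.
POINTS_PER_PACKAGE = 25        # average from the 12-coin sample
PRICE_PER_PACKAGE = 0.75       # dollars per Mallo Cup package
SWEATSHIRT_RATE = 0.0034       # dollars of retail value per point
SPEND_FOR_SWEATSHIRT = 150.0   # dollars of Mallo Cups

# How many packages, and how many points, $150 buys on average.
packages = SPEND_FOR_SWEATSHIRT / PRICE_PER_PACKAGE
points = packages * POINTS_PER_PACKAGE

# Retail value those points represent at the sweatshirt's exchange rate.
implied_value = points * SWEATSHIRT_RATE
return_on_candy = implied_value / SPEND_FOR_SWEATSHIRT

print(f"{packages:.0f} packages, {points:.0f} points")
print(f"implied sweatshirt value: ${implied_value:.2f}")
print(f"return on candy spend: {return_on_candy:.1%}")
```

In other words, about 200 packages and 5,000 points for a sweatshirt worth roughly $17, if the sample average holds.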

February 1, 2006

A Real Search for Fake Data

Filed under: Analysis, Observation — mark @ 9:37 am

Add to your collection of nifty mathematical insights Benford’s Law, which states that in a surprising diversity of data collections, the digits appearing in the data will NOT be uniformly distributed. Or more exactly, they won’t be uniformly distributed “positionally” … a “1” is far more likely to be the initial digit of a number in a data set than, say, a “9”.
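To put numbers on "far more likely": the usual statement of the law gives the expected share of leading digit d as log10(1 + 1/d). A small Python sketch (the function name is just for illustration):

```python
import math

def benford_probability(digit: int) -> float:
    """Expected share of numbers whose leading digit is `digit` (1-9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

That works out to roughly 30% for a leading "1" and under 5% for a leading "9".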

Reg sent a pointer to an NYT article from 1998 that mentions this. (A mathematician friend notes what the article does not … that Benford’s law is generally held to be correct when the sample size is large enough.)

The most interesting practical application of such a law is in constructing tests for Fake Data (let all fillers-out of Expense Reports be forewarned) … a data set could be quickly analyzed to see whether its leading digits fall well outside the distribution that Benford’s Law would suggest.
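One way such a test might look, as a minimal sketch: count the leading digits and compare them to the Benford expectation with a chi-square statistic. The function names are mine, the chi-square test is my choice of comparison (the article may describe something else), and 15.507 is the standard 5% critical value for 8 degrees of freedom. As the mathematician friend points out, none of this means much on a small sample.

```python
import math
from collections import Counter

def leading_digit(value: float) -> int:
    """First significant digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '4.72e+02'.
    return int(f"{abs(value):.14e}"[0])

def benford_chi_square(values) -> float:
    """Chi-square distance between observed leading digits and Benford."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    n = len(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

CRITICAL_5_PERCENT = 15.507  # chi-square, 8 degrees of freedom

# Powers of 2 are a classic Benford-like sequence; evenly spread
# three-digit numbers are what naive made-up data tends to look like.
plausible = [2 ** k for k in range(1, 200)]
suspicious = list(range(100, 1000))

for name, data in [("powers of 2", plausible), ("uniform", suspicious)]:
    stat = benford_chi_square(data)
    verdict = "looks fishy" if stat > CRITICAL_5_PERCENT else "no red flag"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A statistic above the critical value only says the digits look unlike Benford; plenty of honest data (prices capped at a limit, say) fails for innocent reasons.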

… or the corollary might be, with an understanding of Benford’s Law, we now have a market opportunity for tools that create accurate fake data that conforms to Benford’s Law plus or minus epsilon.

We’ll be rich.