A Generic HashSet in .NET 2.0
April 2nd, 2009
One of the great new additions to .NET 3.5 is HashSet<T>. Unfortunately, you can’t run the 3.5 framework with Win2K, which is what my users are forced to use until we get a company-wide upgrade to XP or higher. We get to develop in VS 2008 on XP, and use some of the new C# 3.0 language features, but we still have to compile to 2.0, and because the users can’t run System.Core on Win2K, that means LINQ and HashSet<T> are out, for now.
Or so I thought, until I figured out a way to write it myself using a Dictionary<T, object> as the internal storage. Feel free to use it as you wish:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.Serialization;
public class HashSet<T> : ICollection<T>, ISerializable, IDeserializationCallback
{
private readonly Dictionary<T, object> dict;
public HashSet()
{
dict = new Dictionary<T, object>();
}
public HashSet(IEnumerable<T> items) : this()
{
if (items == null)
{
return;
}
foreach (T item in items)
{
Add(item);
}
}
public HashSet<T> NullSet { get { return new HashSet<T>(); } }
#region ICollection<T> Members
public void Add(T item)
{
if (null == item)
{
throw new ArgumentNullException("item");
}
dict[item] = null;
}
public void Clear()
{
dict.Clear();
}
public bool Contains(T item)
{
return dict.ContainsKey(item);
}
public void CopyTo(T[] array, int arrayIndex)
{
if (array == null) throw new ArgumentNullException("array");
if (arrayIndex < 0 || arrayIndex >= array.Length || arrayIndex >= Count)
{
throw new ArgumentOutOfRangeException("arrayIndex");
}
dict.Keys.CopyTo(array, arrayIndex);
}
public bool Remove(T item)
{
return dict.Remove(item);
}
public int Count
{
get { return dict.Count; }
}
public bool IsReadOnly
{
get
{
return false;
}
}
#endregion
public HashSet<T> Union(HashSet<T> set)
{
HashSet<T> unionSet = new HashSet<T>(this);
if (null == set)
{
return unionSet;
}
foreach (T item in set)
{
if (unionSet.Contains(item))
{
continue;
}
unionSet.Add(item);
}
return unionSet;
}
public HashSet<T> Subtract(HashSet<T> set)
{
HashSet<T> subtractSet = new HashSet<T>(this);
if (null == set)
{
return subtractSet;
}
foreach (T item in set)
{
if (!subtractSet.Contains(item))
{
continue;
}
subtractSet.dict.Remove(item);
}
return subtractSet;
}
public bool IsSubsetOf(HashSet<T> set)
{
HashSet<T> setToCompare = set ?? NullSet;
foreach (T item in this)
{
if (!setToCompare.Contains(item))
{
return false;
}
}
return true;
}
public HashSet<T> Intersection(HashSet<T> set)
{
HashSet<T> intersectionSet = NullSet;
if (null == set)
{
return intersectionSet;
}
foreach (T item in this)
{
if (!set.Contains(item))
{
continue;
}
intersectionSet.Add(item);
}
foreach (T item in set)
{
if (!Contains(item) || intersectionSet.Contains(item))
{
continue;
}
intersectionSet.Add(item);
}
return intersectionSet;
}
public bool IsProperSubsetOf(HashSet<T> set)
{
HashSet<T> setToCompare = set ?? NullSet;
// A is a proper subset of a if the b is a subset of a and a != b
return (IsSubsetOf(setToCompare) && !setToCompare.IsSubsetOf(this));
}
public bool IsSupersetOf(HashSet<T> set)
{
HashSet<T> setToCompare = set ?? NullSet;
foreach (T item in setToCompare)
{
if (!Contains(item))
{
return false;
}
}
return true;
}
public bool IsProperSupersetOf(HashSet<T> set)
{
HashSet<T> setToCompare = set ?? NullSet;
// B is a proper superset of a if b is a superset of a and a != b
return (IsSupersetOf(setToCompare) && !setToCompare.IsSupersetOf(this));
}
public List<T> ToList()
{
return new List<T>(this);
}
#region Implementation of ISerializable
public void GetObjectData(SerializationInfo info, StreamingContext context)
{
if (info == null) throw new ArgumentNullException("info");
dict.GetObjectData(info, context);
}
#endregion
#region Implementation of IDeserializationCallback
public void OnDeserialization(object sender)
{
dict.OnDeserialization(sender);
}
#endregion
#region Implementation of IEnumerable
public IEnumerator<T> GetEnumerator()
{
return dict.Keys.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
#endregion
}
Half-assed Functional Programming in .NET 2.0
April 8th, 2008
Ever since reading Joel’s Can Your Programming Language Do This?, I’ve been obsessed with Functional Programming, and with my shiny new hammer, everything looks like a nail. A few months ago, I got it into my head that it’d be a good idea to implement Map, Filter, and Reduce in .NET 1.1. It was actually simple enough to do, albeit in no way type-safe, and still vulnerable to side effects. Naturally, I named the library QAFP (Quarter-Assed Functional Programming), since it was half-half-assed. As an example, here was my Map function:
public delegate object MapFunction(object obj);
public static IList Map(IList list, MapFunction func)
{
if (null == list || null == func)
{
return null;
}
IList mappedList = new ArrayList(list.Count);
foreach (object o in list)
{
mappedList.Add(func(o));
}
return mappedList;
}
public object Square(object obj)
{
int i = (int) obj;
return i * i;
}
public void MapTest()
{
IList numbers = new ArrayList(5);
for(int i = 1; i <= 5; i++)
{
numbers.Add(i);
}
numbers = Map(numbers, Square);
// list should now contain [1,4,9,16,25]
}
It was ugly, but it worked. Pass it a list of mixed items and forget to do your type checking, and you’re boned. It wound up being very useful, and cut down on some of our development time, until we finally moved up to .NET 2.0. I know all of this stuff is old news to those of you using .NET 3.5 and LINQ, but given that we’ve just updated to 2.0, 3.5 is a long way off for us, so bear with me. If you’re stuck using .NET 2.0, you may enjoy this.
The upgrade to half-assed functional programming came with generics and anonymous delegates. Since side-effects are still possible, it’s not pure functional programming, but it’s close enough to be very useful, as long as you’re careful with it. The new code doesn’t look all that different, but it’s much safer, and any type errors should be caught by the compiler. Here’s the new Map function:
public delegate TOutput MapFunction<tinput, TOutput>(TInput obj);
public static void Map<tinput, TOutput>(List<tinput> inputList, MapFunction<tinput, TOutput> func, out List<toutput> outputList)
{
outputList = new List<toutput>();
if (null == inputList || null == func)
{
return;
}
outputList.Capacity = inputList.Count;
foreach (TInput obj in inputList)
{
outputList.Add(func(obj));
}
}
public int Square(int i)
{
return i * i;
}
public void MapTest()
{
List<int> numbers = new List<int>(5);
List<int> squaredNumbers;
for(int i = 1; i <= 5; i++)
{
numbers.Add(i);
}
Map(numbers, Square, out squaredNumbers);
// list should now contain [1,4,9,16,25]
}
Filtering a list is also needed quite a bit, and the generic List<T> in .NET 2.0 already has this covered, through the use of Predicates. A Predicate is basically a function that takes an object of type T, and returns true or false if it matches some criteria. Combine that with List<T>’s FindAll function, and it’s easy to return a list of Ts matching that criteria. I still wrapped everything in a filter function, just to ensure no empty lists or missing predicates. As an example, to strip out all odd numbers from a list:
public static List<tinput> Filter<tinput>(List<tinput> inputList, Predicate<tinput> func)
{
if (null == inputList || null == func)
{
return null;
}
return inputList.FindAll(func);
}
public List<int> GetEvenNumbers(List<int> numbers)
{
return Filter(numbers, delegate(int i){ return (i % 2) == 1; });
}
Last, but certainly not least, is Reduce, also known as foldr to those of you who speak with a LISP. For those not in the know, it performs an operation on each item in the list and the total (or aggregate) of every item before it, starting with an initial value for the total. The way this total is implemented is completely up to you, as you can see below:
public delegate TAggregate ReduceFunction<tinput, TAggregate>(TAggregate aggregate, TInput obj);
public static TAggregate Reduce<tinput, TAggregate>(List<tinput> inputList, ReduceFunction<tinput, TAggregate> func, TAggregate initialValue)
{
TAggregate aggregate = initialValue;
inputList.ForEach(delegate(TInput input) {
aggregate = func(aggregate, input);
});
return aggregate;
}
public int Sum(int total, int i)
{
return total + i;
}
public int Product(int total, int i)
{
return total * i;
}
public StringBuilder Concatenate(StringBuilder sb, string s)
{
return sb.AppendFormat("{0}", s);
}
public void TestReduce()
{
List<int> numbers = new List<int>(5);
for(int i = 1; i <= 5; i++)
{
numbers.Add(i);
}
int sum = Reduce(numbers, Sum, 0);
int product = Reduce(numbers, Product, 1);
List<string> strings = GetListOfStringsToConcatenate();
StringBuilder sb = new StringBuilder();
sb = Reduce(strings, Concatenate, sb);
}
Now, where and why would you use this? Functional Programming is usually used for math-intensive applications, but like I said, everything looks like a nail. To improve network performance, when making remote function calls, we don’t send entire objects if we only need their ID number. So, given a list of patients, let’s say we want to find those with fatal allergies, to keep them from falling into our strategically placed peanut, grass, and pet dander bins (let’s also say we’re witch doctors).
public List<int> GetAllergiesForPatientsAboveSeverity(List<int> patientIDs, Severity severity)
{
List<patient> patients;
List<int> allergyIDs;
Map(patientIDs, GetPatientObjectFromID, out patients);
List<allergy> allergies =
GetAllAllergiesForPatients(patients)
.Filter(delegate(Allergy allergy) {
return allergy.Severity >= severity;
});
Map(allergies, GetAllergyIDFromObject, out allergyIDs);
return allergyIDs;
}
public List<allergy> GetAllAllergiesForPatients(List<patient> patients)
{
// Grab the allergies for the given patients…
}
public Patient GetPatientObjectFromID(int patientID)
{
return PatientCache.Find(delegate(Patient p) { return p.ID == patientID; };
}
public int GetAllergyIDFromObject(Allergy allergy)
{
return allergy.ID;
}
We’d end up only sending a few ints to the server, and getting a few ints in return. Why is this important? Hell if I know, I just think it’s elegant code, if a contrived example. So, if you’re stuck with .NET 2.0 for the moment, like I am, and you’re into Functional Programming, give this a shot. I’m sure you’ll find all sorts of nails you never noticed before.
Patterns of set bits in a byte
November 15th, 2007
When I was searching for my current job, I got a lot of help from Joel Spolsky’s Guerrilla Guide to Interviewing. I found that a few of my phone interviewers were readers of Joel’s when one (who was not affiliated with Fog Creek Software) asked me:
“How many gas stations are in Los Angeles?”
Right out of the article, and the only one with an answer below it in said article, which I happened to have laid out in front of me at the time. I had to add a few candidate categories to Joel’s list:
- Not-so-smart candidates will get flustered and upset.
- Smart candidates will realize you aren’t quizzing them on their knowledge, but on their problem-solving abilities.
- Smart-ass candidates will say “I’d just go to Joel on Software and look up the answer.”
- Smart smart-ass candidates will read the answer off of JoS and wing it. No sense in letting them know you’ve figured out they’re a lazy interviewer.
I’ve always enjoyed a good challenge, and hadn’t written any C code in awhile, so I decided to tackle a few of his coding questions, like reversing a string in place:
void revstr(TCHAR *str)
{
if( *str == '' )
{
return;
}
TCHAR *start = str;
TCHAR *end = start + strlen(str) - 1;
while(start < end)
{
*start ^= *end;
*end ^= *start;
*start ^= *end;
*start++;
*end–;
/*
could also use *start ^= *end ^= *start++ ^= *end–; if you want to get fancy
*/
}
}
That led me to try my hand at a function to check if a string was a palindrome:
bool isAlphaNumeric(char c)
{
return (iswalpha(c) || iswdigit(c));
}
bool isPalindrome(char *str)
{
/* A man, a plan, Anal Panama!!! */
if(*str == ”)
{
return false;
}
int len = strlen(str);
if(len <= 1) return true;
char *start = str;
char *end = start + len - 1;
while(start < end)
{
if(!isAlphaNumeric(*start))
{
*start++;
continue;
}
if(!isAlphaNumeric(*end))
{
*end–;
continue;
}
if(towlower(*start) != towlower(*end))
{
return false;
}
*start++;
*end–;
}
return true;
}
Nothing too special for either of those. Feel free to use them if they’re useful to you, just don’t pretend you wrote them if you didn’t. I’m personally stealing them from the 2005 version of myself, but to be fair, he slept with my wife, so we’re even.
The point of all this code dick-waving has been what I thought was the really interesting question on Joel’s test: Count all the bits that are on in a byte.
A simple, but inefficient method is to just use a bitwise AND to see if the lowest bit is set, then shift the number one place to the right, and repeat until it’s zero, incrementing a counter each time you come across a set bit. However, unless you’re storing it in a lookup table of some sort, you’ll have to run that code each time you need it. With a lookup table, you can cache it the first time it’s needed, and do a simple lookup every time after that. If you’re willing to get even fancier and think about the problem, you can save a hell of a lot of space.
As a disclaimer, my method is far from the best. The best I’ve seen comes from a few mathematical geniuses and can be found here. Even though I failed CALC 166 in college (who schedules Calculus at 7:30 in the morning?), I’ve always been fascinated with math and numbers, and the odd patterns that show up in them. For example, both my zip code and cell phone number (including the area code) are prime numbers, and my office phone number is made up entirely of different powers of two.
So, when Joel said “Really, really brilliant candidates will try to devise a way to compute the table using some kind of a shortcut taking advantage of the patterns that occur.,” I took that as a challenge to find the patterns, and I believe I have. To start, here are the number of bits set for each number that can be represented in a byte. I went ahead and displayed them as NxN grids, where N is a power of two representing the maximum number of bits in the given number, so these represent 2-, 4-, and 8-bit numbers, respectively.
0-3 in 2x2 grid 0 1 1 2 0-15 in 4×4 grid 0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4 0-255 in 16×16 grid 0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4 1 2 2 3 2 3 3 4 2 3 3 4 3 4 4 5 1 2 2 3 2 3 3 4 2 3 3 4 3 4 4 5 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 1 2 2 3 2 3 3 4 2 3 3 4 3 4 4 5 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 3 4 4 5 4 5 5 6 4 5 5 6 5 6 6 7 1 2 2 3 2 3 3 4 2 3 3 4 3 4 4 5 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 3 4 4 5 4 5 5 6 4 5 5 6 5 6 6 7 2 3 3 4 3 4 4 5 3 4 4 5 4 5 5 6 3 4 4 5 4 5 5 6 4 5 5 6 5 6 6 7 3 4 4 5 4 5 5 6 4 5 5 6 5 6 6 7 4 5 5 6 5 6 6 7 5 6 6 7 6 7 7 8
Notice anything in particular about the numbers when they’re laid out like this? The first row and column have exactly the same numbers, and if you pick a row and column, and add the first number of each together, it’s the exact number you’ll find at the point that row and column intersect. I’m also showing them as 2D arrays, but if you lay out the 2×2 grid as a 1D array, compare it to the first row and column of the 4×4 grid, and then do the same with that grid compared to the 16×16 grid. Each grid, if treated as a one-dimensional array, forms the basis of the next one. There are also some other cool patterns I will mention, but won’t go into.
So, now that we know that we can take the row and column and use that to figure out how many bits are set, we can just divide and mod the number we want by the square root of the size of the array to get our coordinates, like so:
# say we’re looking for the number of bits in 243 x = 243 / 16 => 15 (int) y = 243 % 16 => 3 # or, if you’re using Ruby x,y = 243.divmod(16) => [15,3]
Since we’re going to use these coordinates as indices into arrays, and the arrays are the same, we can just use the same array for both, the first row of the grid:
# 0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4 # Let's call the array "a" add the two indices we got earlier together a[3] + a[15] = 2 + 4 = 6 # Are there six bits set in the number 243? 243 = 11110011 # Indeed there are
So, your lookup table for an N-bit number requires only 2^(N/2) entries (thanks for the correction, nico). Instead of 256 bytes to cache the number of bits in an 8-bit number, you need at most only 16 bytes. Given that I showed how each grid formed the first row of the next grid, and depending on how you decide to trade off between speed and storage, you could get away with only storing the base array of [0,1,1,2] and computing each successive grid from there. I actually did this in the Ruby code I used to generate the grid tables above, which you can see here:
class Fixnum
@@bits_base = [0,1,1,2]
def bits
self.divmod(@@bits_base.length).inject(0){|sum, i| sum += @@bits_base[i]}
end
def update_bits_base(new_bits_base)
@@bits_base = new_bits_base
end
end
class Array
def in_groups_of(n)
list=[]
i = 0
while(i < length) do
list << slice(i, n)
i += n
end
list
end
def print_table(size, col_sep_begin = nil, col_sep_end = “\t“, row_sep_begin = nil, row_sep_end = “\n“, prefix = nil, suffix = nil)
row_arr = self.in_groups_of(size).inject(“”){|row, row_item| col_arr = row_item.inject(“”){|col, col_item| col + “#{col_sep_begin}#{col_item}#{col_sep_end}“}; row + “#{row_sep_begin}#{col_arr}#{row_sep_end}“}
puts “#{prefix}#{row_arr}#{suffix}“
end
def print_html_table(size)
print_table(size, “<td>“, “</td>“, “<tr>“, “</tr>“, “<table>“, “</table>“)
end
end
def bits_on_test(size)
sqrt_size = Math.sqrt(size).to_i
puts “<p>0-#{size-1} in #{sqrt_size}x#{sqrt_size} grid<hr /></p>“
bits = (0..(size-1)).map{|i| i.bits}
bits.print_table(sqrt_size)
bits
end
[1,2,4].each { |i|
i.update_bits_base(bits_on_test(2**(2*i)))
}
The important part is the Fixnum::bits method, which just does the divmod outlined above and sums the array values at those indices. @@bits_base is the lookup table, which is re-computed using itself as the first row and column.
Now, for the other cool pattern I noticed, but won’t go into too much detail on. It might help to have a pen and paper.
# Make a 4x4 grid on the paper, and fill in the upper left quadrant with the values of the 2x2 grid. # For each quadrant, add the value from that same quadrant in the 2x2 grid to the array. # Upper left quad add 0 to each number from 2x2 0 1 * * 1 2 * * * * * * * * * * # Upper right quad add 1 to each number from 2×2 0 1 1 2 1 2 2 3 * * * * * * * * # Lower left quad add 1 to each number from 2×2 0 1 1 2 1 2 2 3 1 2 * * 2 3 * * # Lower right quad add 2 to each number from 2×2 0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4
Does the final grid look familiar? Compare it to the 4×4 grid I computed earlier. Now applying the same principle to this 4×4 grid to make an 8×8, and then a 16×16 grid, and you’ll come up with the 16×16 grid I came up with above. If you’re clever enough, you may come up with some sort of quadtree algorithm to determine what additions you’d have to make to the base [0,1,1,2] array for a given number.
I hope this has made at least some sense to you. If it does, then Wooo! Independent verification from Joel that I’m really, really brilliant. If not, then my secret fear that I’m actually retarded and everyone is just humoring me may be true. Personally, I’d put my money on idiot savant.
Ruby is my timewaster, part 2 (Alienware contest)
October 30th, 2007
Alienware is running a contest to win a trip for 2 to New York. All you have to do is decipher an alien message using clues from this page:
It’s actually a pretty simple substitution cipher, and it took me about 10 minutes to figure it out.
What does this have to do with Ruby? Well, I was able to rewrite a few lines of my word finder from Ruby is my timewaster to search for words where you know the length of the word, a few letters, and their positions, like in a crossword puzzle.
Here’s the revised code:
def get_possible_words(dictionary, required_letters)
re_required_letters = Regexp.new(required_letters)
possible_words = []
File.readlines(dictionary).each { |line|
possible_words << line if line.strip! =~ re_required_letters
}
possible_words
end
def list_possible_words(dictionary, required_letters)
get_possible_words(dictionary, required_letters).each{ |word| puts word }
end
dictionary = “words“ # /usr/share/dict/words stripped of proper nouns and punctuation
required_letters = “^.ut..e$“ # default just for an example
required_letters = ARGV[0] unless ARGV.length == 0
list_possible_words(dictionary, required_letters)
For example, I needed a 6-letter word that matched the pattern “^.ut..e$”. Running the code against it brought back the following words:
- butane
- butene
- cuttle
- futile
- future
- guttae
- guttle
- mutase
- mutate
- mutine
- mutule
- nutate
- outage
- outate
- outbye
- outlie
- outsee
- outvie
- puttee
- rutile
- suttee
- suture
Maybe 3 or 4 of those could reasonably be used in a sentence related to aliens that would be readable by someone without an English degree. The more symbols you figure out, the easier it gets, because you can refine your regular expression and figure out which letters it couldn’t be.
The first page of clues (dated July 8, 1947) gives you enough to figure out all but two letters. If you know Alienware’s naming conventions, one of the two letters should be simple to guess. The second page of clues doesn’t give you any letters you couldn’t have guessed from the first, but it eliminates 3 of the possible letters that could be the last symbol. They’re also the least-frequently used letters of the alphabet, so it’s just a guessing game. I took a shot with the one I thought was most likely.
Hint: Look for the numbers first, and think about why Alienware would use them. That’ll get you the most frequently used letters, and it should only get easier from there.
Second hint: Hey Alienware, I have a fondness for review hardware.
Easier Timezone support in Rails
August 29th, 2007
As I was coding Too Many Secrets (your one-stop shop for anonymous confessions), I decided I wanted to show the times that secrets were added. No real reason, other than I was sick of having to write down the ID of the last secret I read on sites like Cave Canum. Post-it notes, while cheap, are a pain in the ass to keep successfully organized. It’s much easier to remember “Well, the last time I looked at it was around 2AM on Sunday.”
Seemed easy enough to just write out the value of created_at, and let it go at that. Unfortunately, that displays the time on the server, not the client. Since DreamHost is located in California, and I’m in Texas, it’s a 2 hour difference, and I had to find a way to display it in the client’s local time.
Err The Blog had a good article back in March called A Zoned Defense, that covered just this problem. Of course, if JavaScript or cookies were turned off, it wouldn’t work, but it’s difficult to have a fancy Web-2.0-type site without JS or cookies enabled, so let’s assume they are.
The problem with Err’s solution is that I’d like to cache the shit out of everything. If I’m taking a cookie from the user and outputting the time just for them, there’s no way I can cache it. Well, I could if I wanted to cache the fragment for every possible timezone offset. Turns out there’s a much easier way, and it requires no plugins or cookies, but still requires JavaScript.
To make things simple, I’m using MooTools, though it should be doable with Prototype, or just plain old handcrafted JS.
The secret is to use Time::to_i, which returns the value of the given time as an integer number of seconds since epoch. JavaScript’s Date class has a method that allows you to set the time using the number of milliseconds since epoch. All you have to do is multiply your timestamp as an integer by 1000.
Since most people can’t read epoch times, we’ll put it in an invisible span, like so:
<span class="hide dt"> <%= secret.created_at.to_i * 1000 %><br/> </span>
The CSS classes just hide the span from view using display: none;. What we’ll do is grab every span on the page that has the “dt” (datetime) class, read it’s text (which should be the milliseconds since epoch), use that to set the time of a JS Date object, and replace the span’s text with the new date, then drop it’s “hide” class. Add the following to your application.js in Rails.
var Timezone = {
replaceEpochWithLocalDate : function() {
var date = new Date();
var dateSpans = $$(’span.dt’)
dateSpans.each(function(dt){
var msFromEpoch = dt.getText();
date.setTime(msFromEpoch);
dt.empty().setHTML(date.toLocaleString() + ‘<br/>’);
dt.removeClass(‘hide’);
dt.removeClass(‘dt’);
});
}
};
window.addEvent(‘domready’, Timezone.replaceEpochWithLocalDate);
The “domready” event is only available in MooTools, I believe, but either way, you just need to call replaceEpochWithLocalDate() when the page loads. The “DOM Ready” event just does it as soon as the DOM structure is available, before all of the images have even finished loading, so you don’t have the somewhat jarring effect of all of the dates popping up after the page loads completely. There are probably ways to do it in Prototype or plain JS, but it’s done for me in MooTools, so I could give a crap.
The real magic in this method comes from JavaScript’s Date::toLocaleString, which prints out the date using whatever locale your browser it set to, in your local time. No more screwing around behind the scenes trying to figure out how they want it printed. The work is already done for you by their browser, and they’re either using the default, or they have it configured how they want it. Either way, they see the time they prefer. All you have to do is save and display your timestamps like normal, and stop worrying about timezones.
The only downfall to this method is that you have to have JavaScript enabled, same as the three methods provided in Err’s article. If not, you just won’t see the dates, unless you’re using a browser like Lynx, which is text-only, and no CSS or JavaScript. In that case, you’ll just see the epoch time, plain as day. But if you’re cool enough to use Lynx, you’re cool enough to decode epoch times, so it’s a win-win for everyone involved.
Ruby is my timewaster
August 27th, 2007
A few weeks ago, someone on a message board I frequent asked for some good timewasters while their boss was out of town. Someone came up with Make-A-Word, a sort of half-assed Boggle. The only goal seems to be to use the given letters to make as many words as possible in two minutes, with more points given for longer words. Unlike Boggle, the letters don’t have to be next to each other to make a valid word, which I didn’t figure out until I’d gotten horrible scores a few dozen times.
Once I figured it out, my score jumped up a bit, from the 5th percentile to the 30th. Not good enough. Like any nerd worth his salt, I had to figure out a way to beat the system. I turned to my real favorite timewaster: coding. Having fallen in love with Ruby a year or so ago, it seemed like a natural choice.
- The basic rules for the game:
- Words must be at least 2 and no more than 10 letters in length
- Words can only consist of the 9 letters given, but don’t have to include all 9 letters (duh).
- Words cannot be proper nouns (no names) or have punctuation.
- You can’t type the words in. You must use their JavaScript keypad and click with the mouse (usability nightmare).
First things first, I needed a dictionary. Being on Windows (shut up), I couldn’t use the trusty /usr/share/dict/words. However, I found a version online that was essentially the same file stripped of proper nouns and punctuation. That easily took care of one of the rules. Now that I had a dictionary, I needed a plan, and I went with my first:
- Read each line from the dictionary file (one word per line)
- If the word was too short or too long, trash it and go to the next word
- If the word consisted of only the required letters, add it to a list
All of them were fairly easy to accomplish, and the only one that might add complexity to the script would be comparing the word to verify it only contained the required letters. I decided to go with a regular expression after a few shamefully naive implementations. After that, it was just 10 quick lines away. Here’s the final implementation:
def get_possible_words(dictionary, required_letters, min_length = 1, max_length = -1)
required_letters = required_letters.split(//).sort.uniq.join # no duplicate letters
re_required_letters = Regexp.new(“^[#{required_letters}]*$“) # ensure only required letters are used
possible_words = []
File.readlines(dictionary).each { |line|
len = line.strip!.length
# only add the word if length fits and it matches the regex
possible_words << line if (len >= min_length and line =~ re_required_letters) unless (max_length >= min_length and len > max_length)
}
possible_words
end
So, am I done? I’ve got a way to get the words, but I have no idea how fast it’ll run, and I’ve got to click all those buttons. Better get the longest words first to maximize my score. Simple enough:
def get_possible_words_longest_first(dictionary, required_letters, min_length, max_length)
get_possible_words(dictionary, required_letters, min_length, max_length).sort{|x,y| y.length <=> x.length}
end
def list_possible_words(dictionary, required_letters, min_length, max_length)
get_possible_words_longest_first(dictionary, required_letters, min_length, max_length).each{ |word| puts word }
end
required_letters = ‘qwertasdfgzxcvb‘
dictionary = “words“ # /usr/share/dict/words stripped of proper nouns and punctuation
list_possible_words(dictionary, required_letters, 9, 40)
It should be obvious that required_letters is where you enter the letters from the game. I could have made this take a command-line argument, but I just run it from inside SciTE, and I’m lazy. You should also have noticed that the string I used (qwertasdfzxcvb) is longer than the 9 letters you’ll get in the game.
Remember learning that “stewardesses” is the longest English word you can type using only your left hand? That’s wrong, for three reasons:
- I can type one-handed.
- “Sweaterdresses” comes in at 14 letters, two more than “stewardesses”, which only has 12.
- The following words have the same length as “stewardesses”:
- reaggregated
- decerebrated
- decerebrates
- desegregated
- desegregates
- extravagated
- extravagates
- extravasated
- extravasates
- aftereffects
- reaggregates
- resegregated
- resegregates
- reverberated
- reverberates
- sweaterdress
- abracadabras
- watercresses
The longest right-handed words?
- hypolimnion
- homophony
There you go, 19 words you can type using only your left hand that are as long or longer than “stewardesses”, and the two longest right-handed words in the English language. You can at least say you learned something today. Back to the nerdery.
While I’ve certainly played a lot of Starcraft, there’s no way in hell I’m going to click those buttons fast enough to get more than 20 or so words in, which means I’ll be stuck in the mid 1000’s score-wise (i.e. near the top, but not the best). What’s the point of cheating if you can’t decimate the competition?
Since Make-A-Word uses buttons, and doesn’t submit, they must be using Javascript. If you have Firebug for Firefox, you may have noticed that the console allows you to enter your own Javascript, within the scope of the page. Make-A-Word’s JS simply calls a function named “addletter” on each button-click, and another named “addit” to add the word. All we have to do now is loop through the words, and for each one, loop through the letters, outputting the call to “addletter”, and ending with the call to “addit”. Here’s the code for that:
def cheat_like_hell_on_nerdtests(dictionary, required_letters, min_length, max_length)
get_possible_words_longest_first(dictionary, required_letters, min_length, max_length).each{ |word|
word.split(//).each { |letter| puts “addletter(’#{letter}‘);“ }
puts “addit();“
}
end
Simply run the script, copy the output, and run it in Firebug’s console. The cutting and pasting turned out to be the longest part, and I had to find something to occupy the remaining minute and 50 seconds. Admittedly, I could’ve had it just output a Javascript array and do the iteration in the Firebug console, but this only took me 10 minutes from start to finish, and I have no current need to make it run any faster, as you can see:

As with most games, once you know how to cheat, it ceases to be fun, and ruins it for everyone else, so I only recorded my score for one game. Hell, I had more fun writing the script than the game could ever have given me.

