Fun With Ruby 2

I’ve just finished updating all of the libraries I use to run this site to work under Ruby 2, and it’s been a fairly smooth ride.

A few gotchas that happened more than once:

  • It’s strict about the encoding of the return value of #inspect. Violating this constraint looks like: Encoding::CompatibilityError: inspected result must be ASCII only or use the same encoding with default external. Mostly I solved this by just not inspecting, but I guess you could also fix it by transcoding the return value in your custom #inspect.
  • The thread primitives refuse to work inside of a signal-handler. I saw this first in Foreman, which I use to launch mongrel2 and a bunch of Strelka apps in development. This can only be a good thing, as signal handlers should be re-entrant. I submitted a patch to Foreman that adds deferred signal-handling via a Thread-local queue (on the main thread) and DJB’s self-pipe trick. This also cleared up a problem I was having with recent Foreman releases which would make the foreman process not respond to signals at all. Perhaps it was related?
  • The new regex engine still has a few kinks.
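
A minimal sketch of the first two gotchas, for illustration only. This is not the code from the Foreman patch: Label and DeferredSignals are hypothetical names, the choice of US-ASCII as the target encoding is an assumption, and the signal part assumes a POSIX system. The trap handler does nothing but record the signal and poke a pipe; the real work happens later, outside the handler.

```ruby
# Gotcha 1: a custom #inspect that transcodes its result to plain ASCII.
# `Label` is a hypothetical class used only for illustration.
class Label
  def initialize(name)
    @name = name
  end

  # Replace anything that can't be represented in US-ASCII with '?'.
  def inspect
    "#<Label #{@name}>".encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
  end
end

# Gotcha 2: deferred signal handling with a self-pipe.
# `DeferredSignals` is a hypothetical helper, not Foreman's implementation.
class DeferredSignals
  def initialize(*signals)
    @pending = []
    @reader, @writer = IO.pipe
    signals.each do |sig|
      Signal.trap(sig) { defer(sig) }
    end
  end

  # Wait up to +timeout+ seconds for a signal, then return the names of
  # every signal received so far (an empty Array if none arrived).
  def wait(timeout = nil)
    if IO.select([@reader], nil, nil, timeout)
      begin
        @reader.read_nonblock(64) # drain the wakeup bytes
      rescue IO::WaitReadable
        # nothing left to read
      end
    end
    received = []
    received << @pending.shift until @pending.empty?
    received
  end

  private

  # Runs inside the trap handler: no Mutex, no Queue, just an Array
  # append and a non-blocking write to wake up the main loop.
  def defer(sig)
    @pending << sig
    @writer.write_nonblock('.')
  rescue IO::WaitWritable
    # The pipe is full, so a wakeup is already pending.
  end
end
```

A main loop would create one DeferredSignals.new('INT', 'TERM') up front and call wait wherever it would otherwise sleep, handling whatever names come back with the full set of thread primitives available.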

Ruby 2 is noticeably faster than 1.9.3, as well. All in all, it’s been a fantastic p0 release.

The Power of String#[]

There are some very powerful tools in the toolbox of Ruby’s String, and one which I don’t often see used is the index operator with a Regexp argument.

Everyone knows how to do string-extraction using a regex in the tried-and-true Perl way:

# Regexp to match an RFC 2616 HTTP status line:
HTTP_STATUS = %r|
	^
	HTTP/(\d+\.\d+)  # $1: HTTP version
	\x20
	(\d{3})          # $2: Status code
	\x20
	([^[:cntrl:]]+)  # $3: Reason
	$
|x

# An example status line
status = "HTTP/1.1 408 Request Time-out"

if status =~ HTTP_STATUS
	code = $2
else
	raise "Invalid status line!"
end

That works, of course, but it doesn’t look very Ruby-ish, uses a global variable, and takes up 5 lines. We can do better.

Enter String#[].

In addition to the usual numeric-index arguments for extracting parts of a String, you can also extract parts using a Regexp argument:

code = status[ /\d{3}/ ] or raise "no status code!"

However, this isn’t really ideal for our purpose, because the one-argument form returns the entire match: any surrounding text you match in order to anchor the pattern ends up in the result. The two-argument form fixes that by returning only the capture group you ask for. Using the original regexp, we can be sure the status code is what’s actually extracted:

code = status[ HTTP_STATUS, 2 ] or raise "no status code!"
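
To make the difference between the two forms concrete, here is a small sketch using a deliberately simpler pattern than HTTP_STATUS. The name `padded` is made up for the example:

```ruby
status = "HTTP/1.1 408 Request Time-out"

# Anchor the three digits between two spaces.
padded = /\x20(\d{3})\x20/

whole   = status[ padded ]     # entire match, spaces included
capture = status[ padded, 1 ]  # only the first capture group
missing = "not a status line"[ padded, 1 ]

p whole    # => " 408 "
p capture  # => "408"
p missing  # => nil
```

The nil on a failed match is what makes the trailing `or raise` idiom work.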

Much better! If you’re using Ruby 1.9’s regular expressions with named captures, you can do even better:

HTTP_STATUS = %r|
	^
	HTTP/(?<http_version>\d+\.\d+)
	\x20
	(?<status_code>\d{3})
	\x20
	(?<reason>[^[:cntrl:]]+)
	$
|x

code = status[ HTTP_STATUS, :status_code ] or raise "no status code!"
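
As a sketch of how this might be wrapped up for reuse, the following pulls all three fields out by name. STATUS_LINE and parse_status are hypothetical names chosen for the example (the pattern itself is the one above), and returning a Hash is just one possible design:

```ruby
# Same pattern as above, under a different (hypothetical) constant name.
STATUS_LINE = %r|
	^
	HTTP/(?<http_version>\d+\.\d+)
	\x20
	(?<status_code>\d{3})
	\x20
	(?<reason>[^[:cntrl:]]+)
	$
|x

# Parse an HTTP status line into a Hash, raising if it doesn't match.
def parse_status( line )
	code = line[ STATUS_LINE, :status_code ] or
		raise ArgumentError, "invalid status line: #{line.inspect}"

	{
		version: line[ STATUS_LINE, :http_version ],
		code:    Integer( code, 10 ),  # base 10, so a leading zero isn't read as octal
		reason:  line[ STATUS_LINE, :reason ]
	}
end

p parse_status( "HTTP/1.1 408 Request Time-out" )
```

This runs the regexp once per field, which is fine for a one-off; if you need every field on a hot path, a single STATUS_LINE.match( line ) gives you all the named captures in one pass.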

I love it. You should too!

Fun With TextMate 2

I’ve been playing around with TextMate 2 for a few days, trying to figure out the mountain of changes that have been made, and so far I’m pretty happy with it. Especially considering that it’s only an alpha, I’m getting a surprising amount of Real Work™ done with it.

One of the things that kept me from switching to a new and shinier editor has been my Ruby workflow, which makes extensive use of the RSpec bundle. I wrote a little formatter for it that can include the log output for each spec, which you can toggle on and off, and I find it to be a little more compact and readable than the default TextMate formatter.

Anyway, I just got the .tm_properties tweaks figured out that are necessary to make RSpec work in the new version:

If you’re using the WebKit formatter, or any other formatter for that matter, add this to the .tm_properties either in your $HOME (to apply it to all RSpec bundle output), or in a specific project’s .tm_properties file:

TM_RSPEC_OPTS        = '-rrspec/core/formatters/webkit'
TM_RSPEC_FORMATTER   = 'RSpec::Core::Formatters::WebKit'

I’m excited about the possibilities of tweaking the editor’s behavior on a directory-by-directory basis. TM2 still has some rough edges, and some stuff that’s missing or doesn’t work right, but I’m no longer worried that I’ll be disappointed by the final 2.0 release. Now, hopefully Allan won’t get too frustrated with all the bikeshedders and people who won’t read instructions.

A Restart

I started out making this site using a blogging engine backed by a database, but at some point it stopped working after I upgraded a bunch of dependencies, and I wasn’t ever able to get it working again.

To make the situation even more frustrating, I wasn’t able to duplicate it anywhere but the “production” system. Matching library versions exactly on my development box yielded a fully-functional system. So I’ve started over, this time using that time-honored database: the filesystem.

I adapted a simple CMS I wrote for work (which ironically was itself based on an earlier version of this site) to do on-the-fly conversion of Textile files on disk with configurable filters, all managed via Mercurial, and so far it’s been a breath of fresh air.

I started out this project with the promise to myself to do it just for me, not worrying about ever distributing it or cleaning it up for public consumption, which was incredibly hard, but it’s now paid off completely. I burned down everything, started over, and didn’t have to tell anyone, worry about which version number to increment, or manage migrations. It’s programming just for pure fun, without the accompanying repetitive informationless bug reports, inane suggestions, and posturing that has become part of the Open Source experience.

Now I just have to work on actually posting stuff to it. No promises on that front, as I know by now that I’m a horrible procrastinator with the attention span of a cockatiel. So, yeah. At least it isn’t a lame “my site is down” warning anymore.

False Advertising

If there’s one trend I wish would go away in the Open Source community, and especially the Ruby community, it’s the practice of over-marketing. There seems to be an endless stream of projects which start to gain in popularity, usually tangentially to the popularity or prestige of some website or another, but then are hyped and marketed and start to make grandiose claims about how much better or more novel they are, and it’s almost always gross exaggeration.

I love to see companies release their internal technology; I think it benefits them in all the usual Open Source ways, and provides the community with not only potentially-useful software, but also more examples of how programmers are thinking about problems. This kind of corpus of reference code is instrumental in the betterment of the legions of us that are self-taught. But just because Yahoo or Facebook or GitHub or Engine Yard is using it doesn’t mean that it’s necessarily better or brighter or a new way of solving a problem. People seem to assume that just because the code is behind a very high-traffic site, or used by thousands of people, that it’s inherently great. As I imagine anyone that’s worked behind the scenes at a company whose website became popular very quickly would agree, that’s a flawed assumption. Code that’s written under the gun to shore up a sagging infrastructure is usually expedient code, haphazardly tested, and solves only a very narrow subset of its problem domain. It can, and often does, get better, but code that’s a cornerstone of a site that has to be up all the time can only grow and improve at a cautious pace.

This is especially true of software that’s hyped as being “faster” than something it’s intended to replace. What the marketing hype usually fails to mention is that all that speed comes at the cost of cutting some fairly critical corners. Of course a “data structure server” can claim to be “blazing fast” if it sacrifices reliability to do so. If you don’t have to worry about things like transactional isolation and two-phase commit, you can make any software much faster than software that provides these kinds of safeguards. The apologists are quick to point out that you don’t always need reliability, and that’s true, but the second you add persistence to a datastore, I’d argue that you expect to be able to predict the state that’s persisted at any given time. Taken to the ridiculous extreme, a datastore that doesn’t actually “store” anything would be blindingly fast, but useless; it’s the careful balance between reliability and performance that informs our decisions about what technology to use.

The same is true for the raft of software claiming to be “less bloated”, or “pure Ruby”, or (my personal peeve) “more DRY”. It always seems to accrete around anything that gains critical mass, usually with only a small fraction of the original’s functionality.

At the very least, these kinds of caveats should be taken into consideration when talking about and comparing software, and should temper the fanatical hype that usually surrounds whatever New Shiny is in favor with the Code As Fashion people. Good software takes time to perfect, so I don’t expect that it’ll come out with all the problems solved immediately, but the marketing should take that into account.

Also: please stop thinking of memcached like a database. It’s a distributed cache.
