Monday, October 22, 2007

Time Machine: not just for backups

Time Machine occupies just a few bullet points in Apple's list of 300+ new features in Leopard. But it has the ability to fundamentally change the way you and I use our computers.

Disclaimer: I have not actually used Time Machine. My only knowledge of it comes from the public videos and demos and from reading the Mac sites like MacInTouch and The Unofficial Apple Weblog.



From the Apple literature, Time Machine is an easy way to retrieve your lost files. It's being introduced like that because it's simple to understand. If you've accidentally (or purposefully) deleted a file, you can simply use Time Machine to find it and bring it back. For this use, it's not much different from any other backup software. But the similarity ends there. This is not SuperDuper.

Time Machine maintains a complete history (at hourly intervals) of every file on your computer. To put it a different way, every hour it takes a "snapshot" of your computer and stores the way it looked at that moment. These snapshots comprise a set of "versions" of your files. Time Machine gives you a way to retrieve those versions.

Time Machine is not simply a backup of the way your computer's files currently look. The term "backup" implies a second copy of your current files. Stop thinking about backups. Start thinking about versions. Think about three, four, ten, hundreds of copies of your files, each representing the way it looked at a particular point in time.
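To make the "versions, not backups" idea concrete, here is a toy sketch in Python. It is not how Time Machine works internally (I haven't used it, remember); the `VersionStore` class and its method names are made up for illustration. It just keeps a time-ordered list of versions per file and hands back the one that was current at a given hour.

```python
from bisect import bisect_right


class VersionStore:
    """Toy model of hourly snapshots: each path maps to a time-ordered list of versions."""

    def __init__(self):
        self._times = {}     # path -> sorted list of snapshot hours
        self._contents = {}  # path -> contents, parallel to _times

    def snapshot(self, hour, files):
        """Record the state of the given files at `hour` (hours must increase)."""
        for path, content in files.items():
            times = self._times.setdefault(path, [])
            contents = self._contents.setdefault(path, [])
            # Store a new version only when the file actually changed.
            if not contents or contents[-1] != content:
                times.append(hour)
                contents.append(content)

    def version_at(self, path, hour):
        """Return the newest stored version at or before `hour`, or None if there is none."""
        times = self._times.get(path, [])
        i = bisect_right(times, hour)
        return self._contents[path][i - 1] if i else None


store = VersionStore()
store.snapshot(9, {"novel.txt": "Chapter 1"})
store.snapshot(10, {"novel.txt": "Chapter 1, Chapter 2"})
store.snapshot(11, {"novel.txt": "oops"})
print(store.version_at("novel.txt", 10))
```

Asking for the 10 o'clock version gets you the novel as it was before the 11 o'clock mess, which is the whole point.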

Here are a few scenarios that use versions--rather than backups of lost files--as their centerpieces.

Have you ever worked on a long-term project? One that spanned more than a few hours? It could be a project for work. Or it could be that Great American Novel you've been writing. Or an iMovie project of your last vacation. You haven't deleted it; you've simply been changing it. Have you ever messed it up so badly (and saved your mess) that you wished you could go back to the way it was earlier in the day? With Time Machine you can. Just roll back through time to find the last good version of your project. You can use Quick Look to glance at the document to verify it's the one you want. When you find it, restore it.

Some documents are ever-evolving. For example, a real estate office may have a letter they mail out to new clients periodically. When they want to send out a letter, they just change the address and a few words here and there to customize it. Because the document evolves, today's version bears little resemblance to the one they sent out six months ago. What if they wanted to see what the letter looked like back then and compare it to today's? They can with Time Machine. They don't even have to restore it; just use Quick Look to peek at it.

I'll leave you with this one final revolutionary usage of Time Machine. Most of us have an ever-increasing number of files on our computer. Organizing it all is becoming a chore. We download new files or create them constantly. They clutter up our folders and Desktop. It's maddening. And we should be backing it all up. Of course, you know that Time Machine will help you do the backups.

But think about this: Download or create a new file. Use it for its intended purpose now. Keep it around for an hour or two. Then throw it in the Trash. That's right: don't think about how you might want to use the file later. Never mind that your co-worker will be needing it next week. Or that you might want to refer to it next month. Throw it away now. De-clutter your life. Or at least your Mac.

Ah, don't worry. Time Machine has it. You can always get it back later. Spotlight can help you find it.

This may change the way we build computers. No longer will you need a gigantic primary hard drive that holds all your files, plus an equally monstrous hard drive to back it all up. Instead, your computer has a lean, fast primary storage device big enough to hold the files you're using right now. A few GBs of flash memory will do for most people. The rest--a complete history of every file you've ever worked on--is stored on a large-capacity Time Machine device.

That's the future of computing. Look out, because in a near-future release of OS X, I predict that Time Machine will be the filesystem. There will be no distinction between your "current files" and your "backup files." All your files, current or not, will simply be versions stored in a searchable database. Solaris' ZFS technology will help make that happen.

Saturday, October 6, 2007

Lifetime To-Do List

Doesn't everyone have a list of things they'd like to accomplish/do/see before they die? Here's mine, in no particular order.

See a space shuttle launch.

Experience weightlessness.

Visit Antarctica.

Visit Alaska.

Visit China.

Write an application that gets used by more than 100 people.

Make a hologram.

Make a daguerreotype.

What's yours?

Tuesday, August 21, 2007

iPhone features I'd like to see

OK, I'm thiiiis close to getting an Apple iPhone. I'd have to break my contract with Verizon. But there have been so many times that I've been out and about and wanted to quickly look up something on the web. Or check my email to see if I've been outbid in the last 10 seconds on a digital oscilloscope auction, again.

I've used my friends' iPhones and played with them in the store. As a result, I'm tempted to wait until software updates or a new iPhone model comes out. There are some nifty features missing on the current version that it desperately needs to be really, really usable.

GPS

Think about it: how many times have you been in an unfamiliar town (or even a familiar one) and wanted to know something like "where are the Italian restaurants nearest to me right now?" or "where's the nearest Post Office?" With the current iPhone, you can bring up Google Maps, which is great, but you still need to know what city or ZIP code you are in. What if you don't know your exact current location? You're screwed, because either you can't search at all or you get way too many results.

What I'd like to see is a way for the iPhone to know its current location. GPS would be great, as it would give your location to within 10 feet or so, but for Google Maps an exact location isn't necessary. Even knowing your approximate current location would be good enough. I don't know much about cell technology, but I imagine that each cell tower knows its own location. The iPhone could do some triangulation to figure out where it is in relation to the towers, at least well enough to make Google Maps searching incredibly useful.

Voice Dialing, Done Right
I use voice dialing when I'm driving. With my headset on, I just touch a button on the phone and ask it to dial one of the people in my contact list. Here's how the conversation with my phone usually goes:

Me: Call Brenda
Phone: Did you say, "Call Brenda?"
Me: Yes
Phone: Which number?
Me: Mobile
Phone: Calling...

My phone has a voice-recognition system that doesn't need training ("voice-independent recognition"). That is, out of the box, it's able to listen to the user's voice and match it with a name in the contact list. But because the voice recognition isn't tuned to a specific voice (mine), it needs to double-check the match. Hence, the question "Did you say...." What gets me is that it is always correct!

Phone manufacturers, unfortunately, have decided that implementing voice-independent recognition is inherently a binary decision (either it's used for all contacts, or it is not used at all) and, if voice-independent recognition is not used, training it is inherently modal. That is, in order to train the phone to recognize names, the user must go into a special "training" mode.

What I want is a phone that learns.

When I answer "yes" to the query "did you say...", the phone should immediately remember what I just said. The next time I ask to call that name, the training will have already been done. That should be the extent of training the recognition system -- totally transparent to the user.
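Here's a rough sketch of that behavior in Python. Everything in it is hypothetical: the `LearningDialer` class is invented for illustration, and the "recognition" is just a case-insensitive name lookup standing in for real speech matching. The only part that matters is that a "yes" gets remembered, so the confirmation question disappears the second time.

```python
class LearningDialer:
    """Toy voice dialer that stops asking for confirmation once a match is confirmed."""

    def __init__(self, contacts):
        self.contacts = set(contacts)
        self.trained = {}  # utterance -> contact name confirmed by the user

    def recognize(self, utterance):
        """Stand-in for speaker-independent recognition: a case-insensitive lookup."""
        for name in self.contacts:
            if name.lower() == utterance.lower():
                return name
        return None

    def call(self, utterance, confirm):
        """Return the list of prompts the phone speaks while handling one request."""
        if utterance in self.trained:
            return [f"Calling {self.trained[utterance]}..."]
        name = self.recognize(utterance)
        if name is None:
            return ["Sorry, I didn't catch that."]
        prompts = [f'Did you say, "Call {name}?"']
        if confirm(name):
            self.trained[utterance] = name  # learn silently from the "yes"
            prompts.append(f"Calling {name}...")
        return prompts


phone = LearningDialer(["Brenda"])
first = phone.call("brenda", confirm=lambda name: True)
second = phone.call("brenda", confirm=lambda name: True)
print(first)
print(second)
```

The first request still triggers "Did you say...", but the second goes straight to dialing. No training mode anywhere.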

Games
I need my Tetris and Bejeweled!

VOIP
That's Voice Over IP or, in layman's terms, Skype. Not going to happen, though. The purpose of a mobile phone is to make the carrier rich by charging for minutes of airtime.

On the other hand, T-Mobile is rolling out a new service called T-Mobile Hotspot @ Home in which a WiFi-enabled mobile phone can make free calls when in the presence of a WiFi hotspot.

So who knows, maybe we'll see Skype on the iPhone yet.

Saturday, August 18, 2007

AirPort Extreme Cripples Write Throughput

There is a major problem with the Apple AirPort Extreme involving its "AirDisk" feature. I have the "older" 100BaseT model to which I have attached a USB hard drive to share among my various Macs. There have been many postings on the Apple Discussion Boards about poor disk performance when the AirPort Extreme is used to share disks. While some users report slow throughput, others claim to have no issues at all.

I have confirmed, through testing, that some USB disks do indeed exhibit very poor write throughput when compared to other USB disks. This could have a serious impact for anyone using a USB disk with an AirPort Extreme, especially when used for its intended purpose: to do unobtrusive backups.

Background
This problem was first brought to my attention when I purchased an Other World Computing (OWC) Mercury Elite Pro "Quad Interface" enclosure. In it I placed a Seagate Barracuda 3.5" 400GB SATA drive. This enclosure, which sports four types of interfaces (USB 2.0, FW400, FW800, and eSATA) for attaching to a computer, performed admirably over FW800 when connected directly to my PowerMac G5 tower. But when I connected it to my AirPort Extreme, throughput suffered. Writing to the drive was especially bad.

The drive enclosure is manufactured by Newer Technologies. Many emails with their technical support division turned up no solutions. In fact, Newer's tech support denied that there was any kind of performance issue.

In desperation, I connected up an old "generic" USB enclosure with an even older 20GB IDE hard drive. To my surprise, throughput was quite good. Clearly, there is some bad interaction between the AirPort Extreme, the USB enclosure and its internal driver circuitry, and the hard drive.

Ultimately, I wanted a nicer enclosure that would fit under my AirPort Extreme. I settled on the miniStack v3. Like the Mercury Elite Pro, the miniStack v3 is also a quad interface enclosure. It, too, is manufactured by Newer Technologies. I also purchased a 500GB Hitachi SATA drive. After hooking everything up, the results were the same: write performance through the AirPort Extreme was poor.

Finally, at wit's end, I decided to settle this once and for all and run some tests. I suspect that there's something about the Newer Technologies driver circuitry that is reducing throughput. I wanted to pit the miniStack against the cheapest USB enclosure I could find. In the tests, I attempted to eliminate as many variables as possible. The only variable would be the USB enclosure. I also tested a direct USB connection to the Mac to obtain baseline figures.

The Hardware
Two enclosures:
Newer Technologies miniStack v3 with quad interfaces ($120, from OWC);
AverTech SATA HDD Enclosure with only a USB 2.0 interface ($35, at Fry's).

One hard drive:
Seagate Barracuda 7200.10 3.5" 400GB SATA hard drive.

Two connection types:
Directly connected to the Mac via USB cable;
Via the AirPort Extreme, using 100BaseT Fast Ethernet (not wireless). AirPort Extreme running v 7.1.1 firmware.

Methodology
I ran eight tests. Four were "read" tests and four were "write" tests. All tests used the same 600MB mix of files and folders. The files ranged in sizes from under 1K to over 200MB. Before each write test, the drive was reformatted as HFS+ with journaling using the Apple Disk Utility.

The results are as follows:


In all tests, the miniStack is no faster than the "cheapie" USB enclosure. In fact, it's always slower. But most alarming is the write throughput via the AirPort Extreme. The miniStack takes almost twice as long!

Also worth noting is the noise the drive makes when performing the tests. In seven of the eight tests (all except the slow write test noted above), the drive is nearly silent. All I can hear is the motor and a very quiet clicking as the heads move back and forth. In contrast, during the AirPort Extreme write test using the miniStack, the drive is very loud. The heads can be heard noisily clacking back and forth. Something different is clearly happening with the miniStack.

Conclusion
The OWC Mercury Elite Pro "Quad Interface" and the Newer Technologies miniStack v3 both behave nearly identically. I suspect that the internals of both enclosures are pretty much the same, as they are both made by the same manufacturer (Newer Tech) and both are quad interface devices. Both enclosures exhibit very slow write throughput and create lots of noise when connected to my Mac via the AirPort Extreme.

The AverTech USB enclosure (the cheapest thing I could find at my local Fry's) is identical in performance to the miniStack except when writing files via the AirPort Extreme. In that test, it is much faster and much quieter.

I suspect that there is some kind of bad interaction between the Apple AirPort Extreme and these "multi interface" enclosures. A no-name USB-only enclosure soundly beats a name brand multi-interface one. Someone, either Apple or the USB drive enclosure manufacturers, needs to explain these dismal throughput numbers.

I've learned one thing for sure: spending more -- a lot more -- on an enclosure doesn't necessarily get you better quality or higher speed.

Update (7:35 pm)
Just for grins, I included two more enclosure and drive combinations in my tests. All told, they are:

Newer Technologies miniStack v3 with Seagate 400GB SATA drive
AverTech "generic" USB with Seagate 400GB SATA drive
CompUSA "generic" USB with Quantum 20GB IDE drive
OWC Mercury Elite Pro Quad with Seagate 400GB SATA drive

Results:


Interestingly, the old Quantum IDE drive fared better in the AirPort write test than any of the others! I can't account for that, but I do know why its read speed over direct USB is slower: it has a longer rotational latency and therefore takes longer to access data.

The whole point of the graph is to show that the Seagate SATA drive performs poorly in the AirPort write test whether it's in the miniStack or the Mercury Elite. Something is definitely wrong with this picture.

Update update (Aug. 21, 1:15 AM)
A reader suggested turning off HFS+ journaling on the external drive. I did so and clocked the fastest AirPort write time yet: 125 seconds for the 600MB of files. Throughput rises to an astonishing 4.8 MBytes/sec! USB 2.0's maximum practical throughput is around 30 MBytes/sec. Fast Ethernet can pump data at approximately 12 MBytes/sec. Who knows what the internal disk transfer speed is of the Seagate drive, as Seagate does not publish that spec. I would expect performance of the overall system to be 6-9 MBytes/sec.
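For anyone who wants to check my arithmetic, here it is spelled out in Python. The numbers are the ones quoted above; treating the chain as only as fast as its slowest link is my own simplifying assumption and ignores protocol overhead and the disk itself.

```python
# Figures taken from the post; the min() bottleneck model is a simplifying assumption.
payload_mb = 600          # size of the test file set, in MB
write_seconds = 125       # fastest AirPort write time, journaling off

measured = payload_mb / write_seconds      # MB/s actually achieved
usb2_practical = 30.0                      # MB/s, practical USB 2.0 figure
fast_ethernet = 12.0                       # MB/s, approximate Fast Ethernet figure

# A chain of links can go no faster than its slowest link.
ceiling = min(usb2_practical, fast_ethernet)

print(f"measured: {measured:.1f} MB/s")
print(f"ceiling:  {ceiling:.1f} MB/s")
print(f"fraction of ceiling used: {measured / ceiling:.0%}")
```

Even in the best run, the measured figure sits well under the Fast Ethernet ceiling and below the 6-9 range I would expect.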

Still, this points to a problem with the AirPort firmware: writing to HFS+ journaled multi-interface external disks is extremely slow. HFS+ with journaling is the default for newly-formatted disks, so there should be no reason why an ordinary user would need to turn it off.

Saturday, August 11, 2007

Mobile phone myths

I've been a mobile phone user since 1999. I started out on Pacbell PCS, which later became Cingular. In 2004 I switched to Verizon.

Over the years I have learned some of the ins and outs of the mobile phone trade. These are all based on my personal experience and from reading the fine print. Here are four myths about mobile phones and their rate plans. I'll debunk each myth with the plain, simple truth.

Myth 1: My carrier requires a 1- or 2-year contract.
Carriers use contracts to help guarantee that you'll remain a customer for at least the length of the contract, not necessarily to extort money out of you. By doing so, they can generally offer lower rates than a pay-as-you-go customer would pay, knowing that they've got you as a customer for at least 12 or 24 months. To get the best of both worlds (lower rates without a contract), simply buy your phone at full retail price and opt for a month-to-month plan. The upfront cost will, obviously, be higher but the rate plan will be the same as the ones for contract. You can cancel at any time. Some dealers will even unlock your phone for you, allowing you to use your phone with other carriers.

Myth 2: My phone was free! (Or nearly free.)
Of course it wasn't. Look at the receipt from your phone purchase. You paid sales tax on the full retail price of the phone. In other words, you bought the phone for full price. But then they gave you an instant rebate. Notice how the rebate amount is approximately the same as the "early termination" fee? If you cancel early, the carrier wants you to pony up the amount that you "saved" by signing a contract. In other words, the early termination fee is the money you would have paid had you bought the phone at full price.

Myth 3: I'm locked into a contract.
No one is ever locked into a contract. Contracts can be broken at any time by either party. All you have to do is pay the "early termination fee." Can't afford the fee? Well, that's your problem, not the carrier's. They're not locking you in; you are. See Myth 1 for a way to avoid contracts in the future. Side note: some carriers will prorate the termination fee during the last year of the contract. For example, if you cancel your contract after 18 months, you'll need to pay only half of the termination fee.
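Here's the prorating from that side note worked out in Python. The $200 fee is a made-up number, the function name is mine, and straight-line prorating over the final 12 months is an assumption; check your own carrier's fine print.

```python
def termination_fee(full_fee, contract_months, months_elapsed):
    """Fee owed on cancellation, prorated linearly over the final 12 months."""
    months_left = max(contract_months - months_elapsed, 0)
    if months_left >= 12:
        return full_fee          # outside the last year: the full fee applies
    return full_fee * months_left / 12


# $200 is a made-up fee used only for illustration.
print(termination_fee(200, 24, 6))
print(termination_fee(200, 24, 18))
```

Cancel 6 months into a 24-month contract and you owe the whole fee; cancel at 18 months and you owe half.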

Myth 4: I can't change my rate plan.
That depends on what state you live in and which carrier you are using. Here in California, Verizon allows me to change my rate plan at any time without penalty. Cingular was the same way. In some other states, I've heard that any change in your plan results in a new contract. I would complain to the dealer or carrier and try to get it waived. By agreeing to a new contract, but not getting a new phone in the process, you're giving the carrier free money. See Myth 2.

And now some real truths about mobile phones.

The user interface sucks and the phone is falling apart.
Yup. You see, the phone manufacturers are in the business of selling phones. They build them to last about 18-24 months under normal use. By the time the thing is on its last legs, it's time to renew your contract and get a new phone. The user interface, too, is designed to irritate you just enough that you'll want to dump your current phone at the earliest opportunity. You've heard of "planned obsolescence," right?

Cell phone rate plans here in the US are more expensive than in ___.
Yes and no. In some countries, such as the UK, rate plans are actually more expensive. It may be why text messaging is so much more popular there; it costs less to fire off a few 100-character text messages than to talk for two or three minutes. That said, in some countries, such as India, rate plans are dirt cheap. However, the consumer generally buys the phone outright. In other words, the phone manufacturers and carriers are not as closely tied together as they are in the US. Because the consumer is free to switch plans at any time, the carriers compete ruthlessly for business. Consequently, rate plans are very cheap.

There you have it. Contracts, phones, and plans in the US generally suck. But that's the truth. Any comments?

Sunday, August 5, 2007

A new teaching tool for CS

I have been teaching full-time for five and a half years. Before that, part time for a couple of years. I love interacting with students and inspiring them to learn more than they expected. It's been a wonderful experience and I wish to continue it.

My area of interest, as you can imagine, is in education. Specifically, how to inspire a new generation of students who have grown up using iPods, playing 3D video games, and chatting on the Internet. It's a different world than when I was going through the MS program.

I am interested in new pedagogical approaches to CS education, specifically in the area of programming: introducing new students to programming, especially ones raised on a diet of the above-mentioned technologies. Today's introductory programming classes need to be interactive, graphical, and support a wide variety of teaching and learning preferences.

I've been attending the annual SIGCSE conference regularly for five years. I've heard from countless professors that today's CS1 and CS2 courses lack a compelling, rigorous programming environment for teaching and learning. BlueJ, Alice, DrScheme, and others are all good, solid attempts to bring modern IDEs and pedagogically-appropriate languages to beginning students. But they aren't enough. They all have their strengths and weaknesses. I have taught CS0 and CS1 courses with several of the teaching tools and have come to form my own opinion about what a good teaching tool should be.

I believe that what students need is a language and IDE that grows with them. Beginning students -- even those in primary and secondary schools -- can use a drag-and-drop interface similar to Alice. As they advance, the IDE can change to support more typing with less dragging with corresponding advances in program design flexibility. At its most advanced level, seasoned professional programmers could use the IDE to write production-quality programs.

It's an ambitious project, for sure. I intend to start simple and reuse existing technologies: XML, XSLT, CSS, virtual machines, and the like. In fact, the programming language specification is nothing more than an XML DTD file. The IDE is just an engine that interprets and builds correct XML files. The compiler is an XSLT engine that interprets an XML file and targets... well, anything. For starters, it would target an existing virtual machine, such as the JVM or Parrot.

In short, my idea is influenced by a well-known acronym: MVC (model-view-controller). While the MVC design architecture has been around for decades, it has not, to my knowledge, been applied to a programming language.

Model: an XML file with DTD specifying the "grammar."

View: an interchangeable view of the XML file. It could be graphical (like Alice), textual, 3D, flow charts, whatever. The view can also accommodate different user interface languages: English, Japanese, etc.

Controller: the user interface. Like the view, it is interchangeable. Users could choose a drag-and-drop interface, a more traditional text interface, or a hybrid of the two. Or even something completely new, yet-to-be-invented (but simmering in the back of my mind).
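To give a flavor of the model/view split, here is a tiny sketch in Python. The element names (`program`, `print`, `repeat`) are invented for this example, since I haven't written the actual DTD, and plain Python functions stand in for the XSLT step. One function renders the XML model as a textual view; another interprets the same model directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the real grammar has not been defined yet.
SOURCE = """
<program>
  <repeat times="2">
    <print text="hello"/>
  </repeat>
  <print text="done"/>
</program>
"""


def textual_view(node, depth=0):
    """Render the XML model as indented, Python-like text lines."""
    pad = "    " * depth
    lines = []
    for child in node:
        if child.tag == "print":
            lines.append(f'{pad}print({child.get("text")!r})')
        elif child.tag == "repeat":
            lines.append(f'{pad}for _ in range({int(child.get("times"))}):')
            lines.extend(textual_view(child, depth + 1))
        else:
            raise ValueError(f"unknown element: {child.tag}")
    return lines


def run(node, out):
    """Interpret the same model directly, appending printed text to `out`."""
    for child in node:
        if child.tag == "print":
            out.append(child.get("text"))
        elif child.tag == "repeat":
            for _ in range(int(child.get("times"))):
                run(child, out)
        else:
            raise ValueError(f"unknown element: {child.tag}")
    return out


model = ET.fromstring(SOURCE.strip())
print("\n".join(textual_view(model)))
print(run(model, []))
```

The XML never changes; only what you do with it does. A drag-and-drop view or a flow-chart view would be another function over the same tree.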

Once the prototype has been built, I imagine it would need to be tested in the classroom. A curriculum using this new language/IDE would be designed and tested against existing pedagogical approaches: BlueJ, DrScheme, Alice, Karel, etc. Quantifiable measurements would be test scores and how well the students performed in subsequent classes that used traditional programming languages.

Monday, July 23, 2007

Backups, pt 1

We all need to do backups of our computers. Most of us don't. Let's take a brief look at the typical backup options available to us.

1. Manual backup to removable media, such as tape, CD, DVD, etc. This has been the traditional way of doing backups. The rationale (at least, until recently) is that removable media can store more data than hard drives. For example, about ten years ago, a high-end tape cartridge could store about 8GB of data whereas the largest hard drive you could reasonably obtain was about 4GB. Also, backups were supposed to be write-mostly, meaning that backups were to be read very infrequently. The linear nature of tape meant that reads were very slow, but that was a small price to pay for the relatively low cost of the cartridges.

Nowadays, almost all removable media hold less than a hard drive, so this option isn't good for comprehensive backup solutions. Still, it's the preferred method among casual backer-uppers for keeping a copy of documents, photos, and so forth.

2. Automated backup to another hard drive. Given how ridiculously cheap hard drives are these days, this is my current method of doing backups. The backup drive is simply a large, cheap hard drive that keeps a copy of the data on my computer.

3. Automated backup to a network service. I like the concept of these backup services (Mozy, .Mac, Amazon S3). The storage is relatively cheap and you can be assured that the service itself is keeping backups, so you'll never lose your data. All your files are kept off-site, a big plus if your home or office is lost in a fire or flood.

For the automated solutions, there's a problem: the backups are performed at periodic intervals, such as every night or once a week. But the time you really need your backup is immediately after you realize you just deleted an important file. Ideally, the backup system should be making backups within minutes of making any filesystem change, whether it be adding, deleting, or modifying a file.

One easy fix that fits within the realm of automated, periodic backups is simply to reduce the backup interval. How about every five minutes? That would be swell. But that begets another problem: the actual time it takes to perform a backup may well exceed the backup interval! You see, typical backup applications figure out which files have changed by scanning all of your files and comparing the modification time to the last time the backup was made. My desktop computer has nearly a million files in it; scanning all of them takes a long, long time. It causes my computer to slow to a crawl, too, as system resources are consumed.
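The scan itself is simple to write, which is probably why so many backup tools do it this way. Here is a minimal Python sketch; the function name `changed_since` is mine, and real backup applications surely do more bookkeeping than this. Notice that it has to stat every single file just to find the few that changed.

```python
import os
import tempfile
import time


def changed_since(root, last_backup_time):
    """Walk every file under `root` and yield those modified after the last backup."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > last_backup_time:
                yield path


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "old.txt")
    new = os.path.join(root, "new.txt")
    for p in (old, new):
        with open(p, "w") as f:
            f.write("data")
    last_backup = time.time()
    os.utime(old, (last_backup - 3600, last_backup - 3600))  # modified before the backup
    os.utime(new, (last_backup + 60, last_backup + 60))      # modified after the backup
    changed = sorted(os.path.basename(p) for p in changed_since(root, last_backup))
    print(changed)
```

With two files this is instant. With a million files, the walk touches all million every time, no matter how few of them changed.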

I hate doing backups. I dread seeing the pop-up window that appears, letting me know that backups are about to begin. My system will become unusable for about an hour. Ugh.

How can we speed it up and make it better? We need help from the operating system itself, something that most OSs don't natively provide to applications. Apple has a solution in its upcoming Leopard release. We'll talk about it in an upcoming post.