Syncing under the weight of the cloud

10 01 2013

This will have to be a two-part post because this problem isn’t easy to solve. But I think I’ve found a solution. I haven’t tested it thoroughly enough to say “yes, this is as awesome as I want it to be,” but it is getting there…

After 8 months of preparing for the death of Windows Live Mesh, that day is so near that it was time to take action. But the cloud has killed Peer to Peer file sync!

There are two reasons I can’t use ONLY SkyDrive:

  1. I need, really need, Peer to Peer Sync without the cloud. And the “partial sync” of SkyDrive is great, but Comcast will simply shut down my up / downlink if I need to re-sync a few hundred GB of RAW photos between hard drives via the cloud.
  2. I need to have folders that sync outside a single hierarchy.

But what options are left if you want to NOT include the cloud?

It turns out, not very many. And, by the time you read this post, there may be fewer. At least 2 of those options that do P2P sync use Java – which is a non-starter for me personally right now. If the security profile of Java improves, that’s fine. However, a file sync engine with Java as its basis seems like an invitation to badness.

(Those two are Wuala and AeroFS – which is still in private beta.)

Also, most of the offerings that do P2P sync require a monthly fee, which includes cloud storage – which I already have SkyDrive for.

Why copying files is so freaking hard

Photos and photo editors are a funny thing. When you edit photos, they retain the same time and date stamp as when the photo was taken. Even if you delete and replace a JPG file from a RAW file using the same file name.

What that means is that many tricks “file sync” programs tend to use to speed up comparisons between folders simply don’t work because a photo may be the same filename, date and time, and maybe even the same size, but it is technically a different photo.

In my explorations, I tried Microsoft’s SyncToy Framework, which appeared to be pretty good for many things… except I noticed that it would miss huge swaths of photos I knew I had changed or updated.

The only way to catch those was via a hashing / CRC type comparison, which is extremely slow because every file has to be read in full on both sides.
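
To make that concrete, here is a minimal Python sketch of a content-hash comparison between two folder trees. It is only an illustration, not how SyncToy or TeraCopy work internally; the function names `file_digest` and `changed_files` and the choice of SHA-256 are assumptions made for this example. It reads every byte of every file on both sides, which is why this kind of comparison gets slow on a few hundred GB of RAW photos.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(source: Path, dest: Path) -> list[str]:
    """List relative paths that are missing in dest or differ in content.

    Name, size and timestamp are deliberately ignored: only the bytes count.
    """
    changed = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = dest / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            changed.append(rel.as_posix())
    return changed
```

A re-exported JPG with the same name, size and date as the old one would still show up in the result of `changed_files(source, dest)`, because the digests differ even when the metadata does not.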

TeraCopy is a wonderful piece of software that can compare folders, and make network file copies work quickly.

It can also uncover things like your Intel 4965AGN Wireless Adapter being unable to maintain a 300Mbps connection for large file transfers to gigabit connected peers.

Seriously… Intel has refused to fix this issue. If you have this card and it disconnects from your network for seemingly no reason, and will not reconnect with anything short of a reboot, you may have this problem. On x64 the problem gives you a BSOD; on 32-bit (x86) it just disconnects you into a frustrating non-routable networking land.

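RoboCopy is about the safest thing to use to keep two folders in sync on a local drive. And if you use the Archive attribute correctly (RoboCopy’s /M switch copies only files that have it set and then resets it, Windows sets it whenever a file is written, and the ATTRIB command lets you clear it by hand), you can get around all the nonsense about whether or not your files have changed.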

Here are examples of how I used RoboCopy (and I would follow with a comparison with TeraCopy periodically):

  • robocopy "C:\Users\username\Pictures\Imports" "M:\Backup\Imports Backup" /E /PURGE /COPYALL /M /Z /XA:H /W:5 /R:15 /LOG+:"M:\logs\mirrorlog.log"
  • attrib -A "C:\Users\username\Pictures\Imports\*.*" /S

Run these commands as Administrator (/COPYALL includes auditing info, which needs elevated rights) and see what happens. It’s a “mirror” – as in, /PURGE deletes on the destination whatever is missing from the source.
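
For readers who want to see what “mirror” means step by step, here is a rough Python sketch of the same idea: copy what is new or changed, then purge what no longer exists on the source. It is a sketch under stated assumptions, not a description of how RoboCopy is implemented. The `mirror` function is a made-up name, change detection here is just size plus modification time, and it ignores everything the real switches handle (attributes, ACLs, retries, restartable mode, logging).

```python
import shutil
from pathlib import Path


def mirror(source: Path, dest: Path) -> tuple[list[str], list[str]]:
    """One-way mirror of source into dest. Returns (copied, purged) file lists."""
    copied, purged = [], []

    # Copy pass: copy files that are missing or differ in size / modified time.
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = dest / rel
        s = src.stat()
        if dst.is_file():
            d = dst.stat()
            if (s.st_size, int(s.st_mtime)) == (d.st_size, int(d.st_mtime)):
                continue  # looks unchanged, skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps the modification time
        copied.append(rel.as_posix())

    # Purge pass: reverse order visits children before their parent folder,
    # so a folder that only held purged files is empty when we reach it.
    for dst in sorted(dest.rglob("*"), reverse=True):
        rel = dst.relative_to(dest)
        if (source / rel).exists():
            continue
        if dst.is_dir():
            dst.rmdir()
        else:
            dst.unlink()
            purged.append(rel.as_posix())

    return copied, purged
```

The purge pass is the dangerous half: point a mirror at the wrong destination and it will happily delete everything there that the source does not have, so a dry run against a scratch folder is a sensible first step.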

To verify with TeraCopy, open the source and destination folders. Open TeraCopy. Drag and drop all the folders from the source onto TeraCopy, then select the destination folder from the drop-down.

Click More… and click “Verify”. It will use checksums to give you a list of anything that’s wrong. Click “Clean Up” to see just a list of what went wrong and remedy the problem with Copy, Delete, Move, whatever!

Clearly that’s not a solution when we’re on the road…

The magic of Live Mesh was that I signed in, it sync’d, and “it just worked” wherever I was. On all my computers.

Even adding Hamachi back into the game makes RoboCopy difficult to use because it isn’t interruptible (at least not gracefully).

So that brings us to the only 2 real options I’ve found. One works; the other may work someday, but it doesn’t work the way I need it to today.

Cubby isn’t there for me yet…

I love Hamachi. I don’t currently use it. But it was like magic.

I wanted Cubby to be like magic too. But it just isn’t yet.

Cubby does work, but their pricing plan doesn’t work for me. “DirectSync” (their P2P sync) comes with cloud storage… which I don’t want / need. Plus, I often run a “dual boot” machine and share one set of folders on a common drive. Cubby, right now, can’t handle that. It re-indexes all those files as though they are “new” and compares them with the other sync partners. And it uses 80% of the CPU on both machines while it does that.

So, I submitted my “idea” to allow Cubby to support the dual boot edge case (which Live Mesh did – it spent about 2 minutes figuring out what was going on, and was fine), and have basically moved on.

The final answer is apparently GoodSync – so far

It’s real sync. Local, P2P, and with third party cloud offerings.

It’s basically like Microsoft’s SyncToy and Live Mesh had a freaky three-way with Hamachi.

Because I am just trying everything out, I can’t say 100% that it is doing everything I need it to do. But my use case will be to replace the RoboCopy local operations I illustrated above, as well as the P2P sync operations back to my Windows Home Server 2011 for photos, files etc.

So I’ve set up my router to allow in the GoodSync protocol and I’ve set up users, Sync Jobs etc. on my machines. GoodSync also has a Mac client, and a special dispensation for Windows Home Server 2011 use (so you don’t have to buy their real Server offering).

The sync operations support CRC checksum comparisons for files as well as flexible rules for creating jobs. In fact, you could throw Hamachi back in and sync to UNC paths through the VPN.

They appear to offer pricing discounts for multiple machine licenses, and because the software has quite a bit of engineering behind it, if it solves my use cases, it will be worth every penny.

A comment on SkyDrive’s Code of Conduct (and others)

From a photographer’s point of view, an issue with any cloud storage provider is that you don’t know what or who is looking through your stuff making “code of conduct” violation rulings. And while I know it probably isn’t a “real human” doing these rulings about what pictures might or might not be “appropriate” for their service, I can’t run the risk that I’ll accidentally “put that picture up” someday.

That’s not usually what’s in front of the lens, but when you automatically sync folders from your hard drive, anything could happen. And shutting down my SkyDrive account isn’t an option I can handle.

What next…

Testing testing testing…

I’m deploying about 12 different operations with GoodSync over 5 computers. I’ll be working on and off my LAN. So I’ll try to detail that setup because it will involve:

  • Windows 8 Pro
  • Windows Home Server 2011
  • Stablebit’s DrivePool (on Windows Home Server 2011)
  • SkyDrive (I’ll be using GoodSync to put files into my SkyDrive hierarchy)
  • Windows Phone 8 (GoodSync will put some files in here I need, and also “extract” files I want to offload)
  • Xbox 360 – maybe – I’d like to put slideshows up for SkyDrive there

Should be fun! And I guess I’ll finally have a way to be at peace over Mesh being gone…

One response

28 01 2013
Lynne N. Sellers

I use an FFS batch file to sync key data folders to my user folder every day which then syncs automatically to my skydrive folder in the cloud. Also to back up my whole 30 gig data partition twice a week to different external drives (one off-site). I also back-up all current folders once per month, which I store for 12 months, and for (permanent) year-end storage. All back-up files are encrypted in case an external drive gets stolen. The whole process takes about 10 minutes per week and runs mainly automatically or in the background.
