sandipb.net

› yours truly.

Serializing Structured Data Into Avro Using Python


It is impossible to ignore Avro at work - it is our data serialization format of choice (and rightly so), whether we are storing data in Kafka or in our document database, Espresso. Recently, I needed to read Avro data serialized by a Java application, and I looked into how I might use Python to read it.

Pagerduty’s Fantastic Zookeeper Bug


OK, I don't particularly like calling a bug fantastic; in this case, it is more a fantastic piece of troubleshooting. What I found interesting was how the layers were peeled back one by one to reach the probable region of the root cause. (Yeah, the root cause is probably so esoteric, and confined to such a specific combination of versions, that it is unlikely to be looked at by anybody.)

Here is Pagerduty's summary of the bug.

After more than a month of tireless research and testing, we have finally got to the bottom of our ZooKeeper mystery. Corruption during AES encryption in Xen v4.1 or v3.4 paravirtual guests running a Linux 3.0+ kernel, combined with the lack of TCP checksum validation in IPSec Transport mode, which leads to the admission of corrupted TCP data on a ZooKeeper node, resulting in an unhandled exception from which ZooKeeper is unable to recover. Jeez. Talk about a needle in a haystack… Even after all this, we are still unsure where precisely the bug lies. Despite that fact, we’re still pretty satisfied with the outcome of the investigation. Now all we need to do is work around it.
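The role of checksum validation in that chain can be illustrated with a small sketch. This is a 16-bit one's-complement checksum in the style of RFC 1071, the family TCP's checksum belongs to; it is not Pagerduty's code. It is simplified (real TCP also covers a pseudo-header), and the payload bytes are only sample data. A receiver that skips the comparison in `is_intact` would accept the corrupted bytes as valid.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver-side validation: recompute and compare."""
    return internet_checksum(data) == checksum


# Sample payload (illustrative bytes only).
payload = bytes.fromhex("0001f203f4f5f6f7")
sent_checksum = internet_checksum(payload)

# Simulate corruption in transit by flipping a single bit.
damaged = bytearray(payload)
damaged[3] ^= 0x10
corrupted = bytes(damaged)

print(is_intact(payload, sent_checksum))    # validation passes
print(is_intact(corrupted, sent_checksum))  # validation fails
```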

The Story Is More Important Than the Tool


I have a confession to make. Hollywood has always fascinated me. Not because of the larger-than-life stories they come up with, but because of the enormous machinery that churns out a movie. To the utter frustration of my family, I always stay back at the end of a movie, watching the credits flash by - to see the rest of the iceberg under the tip. Thousands of people made the movie happen; only a fraction of them get the worldwide adulation, but all of them were needed.

Apple Patents Tech to Allow Govt to Block Recording on Mobile Devices


A troubling development:

Apple has patented a piece of technology which would allow government and police to block transmission of information, including video and photographs, from any public gathering or venue they deem “sensitive”, and “protected from externalities.”

In other words, these powers will have control over what can and cannot be documented on wireless devices during any public event.

And while the company says the affected sites are to be mostly cinemas, theaters, concert grounds and similar locations, Apple Inc. also says “covert police or government operations may require complete ‘blackout’ conditions.”

And those who think that this is not coming to Android in the future are deluded. If Apple manages to get this technology into the field, it is only a matter of time before Android handset manufacturers are forced to incorporate it as well. If the technology exists, in today's post-9/11 world, it is difficult to resist government pressure on such matters.

Of course, it would be interesting to see the security features of this tech, as it is very likely to be abused - by repressive governments (read: every one) as well as criminal enterprises (recording-free drug zones, anybody?).

Apple-touch-icon 404 Errors in Logs


Curious about several peculiar Apple-related 404 errors for images in my web server logs, I decided to find out what was going on, and learned yet another nugget that I really didn't want to know. (sigh)

Use of Tor Will Make You Interesting to NSA


I just read a rather disturbing article from Sophos security. The article describes the NSA's interpretation of the law and some of the internal policies that it uses in surveillance.

They also reveal that courts don't always determine who's targeted for surveillance because that discretion is practiced by the NSA's own analysts, with only a percentage of decisions being reviewed by regular internal audits.

To make those decisions, NSA analysts use information including IP addresses, potential targets' statements, and public information and data collected by other agencies.

In the absence of such information - for example, if a potential target is using online anonymity services such as Tor, or sending encrypted email and instant messages - agents are encouraged to assume that the target is outside the US.

This is the part that needs to be emphasized again and again - all this hullabaloo in the USA about the NSA's surveillance is about snooping on American citizens. If you are not one, you have no rights at all, and the NSA has no limits on what it can sniff out about you or how long it can keep that info. I know, it is pretty much common sense, but when I see Indians getting all worked up about this revelation, I sometimes feel that some of them don't get this.

So, coming back to the article: if an American is using Tor, encrypted email or encrypted chat messages, then unless he has been positively identified as a US citizen, he will be treated like a foreign person - essentially with no rights.

And this part is interesting:

If communication is encrypted - particularly if a US person is using certain types of cryptology or steganography known to have been used by "individuals associated with a foreign power or foreign territory” - the NSA is free to collect it and store it "indefinitely" for future reference and cryptanalysis attempts.

That is a loophole right there in my opinion - will they still keep the crypto data if they already have the means to crack it? :-)

Law Enforcement Was Not Supposed to Be Easy


Touch of Evil by Pink Cow Photography

A scene from the ‘Touch of Evil’ (1958)

In this day and age of the surveillance state, a quotation worth remembering from the legendary Orson Welles over 50 years back.

A policeman’s job is only easy in a police state.

– Charlton Heston as Mike Vargas in the movie “Touch of Evil”(1958),
Orson Welles (screenwriter and director)

Curiously, a similar statement was made over a decade back, in fact a couple of years before 9/11, before the world changed, or actually before the United States' war on terror changed the world.

We should not be building surveillance technology into standards. Law enforcement was not supposed to be easy. Where it is easy, it’s called a police state.

– Jeff Schiller, an IESG member and MIT network manager, Wired Magazine, 1999

(via Answer Girl and Steve Worona)

Use Btsync and Owncloud to Create Your Own Free Personal Storage Cloud


Stormy storage. #clouds by scattered sunshine

Cloud storage?

High Scalability had an interesting link today about a project that combines Raspberry Pi, btsync and Owncloud to create what is essentially a personal Dropbox replacement with none of the costs or storage limits. Just as importantly, given the hot topic of the day, you get the peace of mind of knowing that you are not making it easy for intelligence agencies to go through your most important and personal data.

The players in this solution are:

  1. btsync: A still-alpha lab product from the original BitTorrent creators, which lets you securely sync a folder between multiple devices that you own. Ready-to-use binaries are provided for all the major platforms (desktop and mobile) as well as several ARM architectures (which is where Raspberry Pi comes in). The UI is not great, which is probably why the next piece of the puzzle, Owncloud, comes in. But if you only want the basics, this is all the software you need for a folder synchronized among multiple devices.

"The minimal btsync web ui"

Unfortunately, btsync is not open source software, so it is entirely up to you whom you trust more - Dropbox or BitTorrent Inc. Btsync is reported to phone home for version checks and to upload anonymized stats. I have looked around, and btsync doesn't have any open source competition yet.

  2. Owncloud: This is actually a standalone application for sharing your files via a Dropbox-like web interface. It has an extensive list of features - sync between devices, multiple user support, file versioning, undelete, Lucene-based search, shared calendar, tasks, data migration/backup and many more. Most importantly, this is open source software, with all the code available on GitHub.

One question that came to my mind after reading the feature set: Owncloud already has a multiple-device file sync feature, so why would you need btsync?

From what I have read around the net, btsync seems to be considered the more reliable file sync client. So the idea is to use btsync everywhere and, on one of the devices, use Owncloud to provide the interface to serve/edit files over the web.

  3. So how does Raspberry Pi - the overnight micro-computing sensation - fit into all this? The answer lies in the way BitTorrent works. For uploads to happen for a torrent, you need one seed up with the complete data. Since btsync is essentially multiple torrents bunched together, it needs a seed as well. And if all your devices are mobile and not always on, there is a good chance that when you need a file, none of the other devices is up and you are cut off from your data.

Raspberry PI by psd

Raspberry Pi

The solution is simple: keep one of the btsync devices always running, so that it acts as the seed for your data. If this always-on computer is a tiny box hanging off a wall socket and burning a mind-numbingly low 6 watts, well .. you can see the appeal of the R-Pi.
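The availability argument can be put into rough numbers. The sketch below assumes that each device is online independently for some fraction of the time; the uptime fractions are made-up illustrations, not measurements, and `chance_some_peer_online` is a hypothetical helper name.

```python
from math import prod


def chance_some_peer_online(uptimes):
    """Probability that at least one device is up, assuming independence."""
    return 1.0 - prod(1.0 - u for u in uptimes)


# Hypothetical uptime fractions for two phones and a laptop.
mobile_only = chance_some_peer_online([0.3, 0.3, 0.5])

# The same devices plus one always-on seed (uptime 1.0).
with_always_on = chance_some_peer_online([0.3, 0.3, 0.5, 1.0])

print(f"mobile devices only: {mobile_only:.3f}")
print(f"with an always-on seed: {with_always_on:.3f}")
```

Under these assumed numbers, the mobile-only setup leaves you without a peer roughly a quarter of the time, while a single always-on device removes the gap entirely.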

But I already have an always-on device - my Synology NAS, which also happens to be an ARM device. So to try it out, I downloaded the PPC version of btsync and tried to run it - no luck. The btsync binary is a glibc2.4 binary while the NAS firmware is glibc2.3. btsync uses inotify on glibc2.4 and therefore will never support glibc2.3, so I am out of luck here.

# ./btsync 
./btsync: /lib/libc.so.6: version `GLIBC_2.4' not found (required by ./btsync)
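A check like this can be done before downloading a binary. The following is a minimal sketch with hypothetical helper names: it compares a required glibc version (as it appears in an error such as the one above) with an installed one, and uses `os.confstr("CS_GNU_LIBC_VERSION")` to query the running system, which only works where glibc exposes that name.

```python
import os
import re


def parse_glibc_version(text):
    """Extract (major, minor) from strings such as 'glibc 2.3' or 'GLIBC_2.4'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return int(match.group(1)), int(match.group(2))


def can_run(required, installed):
    """True if the installed glibc is at least the required version."""
    # Tuples compare numerically, so (2, 10) correctly ranks above (2, 4).
    return parse_glibc_version(installed) >= parse_glibc_version(required)


def installed_glibc():
    """Return the running system's glibc version string, or None if unavailable."""
    try:
        return os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        return None  # not a glibc system, or confstr is not supported


print(can_run("GLIBC_2.4", "glibc 2.3"))  # the situation on the NAS above
```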

The one thing I am not yet comfortable with on the Raspberry Pi is its lack of a shutdown switch. The Raspberry Pi is perfect for headless usage, and with a USB wifi dongle the only wire it needs is the charger. However, to shut it down properly, you cannot just turn it off. Just like on any other Linux machine, you need to execute the shutdown command, which unmounts the filesystems cleanly before turning off the machine. Mess this up, and you will end up with a filesystem that needs an fsck on bootup, and the machine will not boot until you use a keyboard and console to fsck the filesystem.

Till I get myself a hack to shut down the R-Pi headlessly in a clean and convenient way, I am just not too comfortable using it for serious applications, let alone letting it touch my precious data. There is a nice discussion on the Raspberry Pi forums that I need to read up on, and a few blogs (like this) already provide various ways to do this. I just need to find some time to go through all that.
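For illustration only, here is one shape such a hack could take. It is a sketch under assumptions and not taken from the forum discussion or the blogs mentioned: `wait_and_shutdown` is a hypothetical helper that polls some trigger (for example, a function that reads a push button) and then runs the standard `shutdown -h now` command. The trigger and the command runner are passed in, so the logic can be tried without hardware or root access.

```python
import subprocess
import time

# Standard Linux command for a clean halt; assumes sudo rights on the device.
SHUTDOWN_COMMAND = ["sudo", "shutdown", "-h", "now"]


def wait_and_shutdown(trigger, run=subprocess.run, poll_seconds=0.5, max_polls=None):
    """Poll trigger() until it returns True, then run the shutdown command.

    Returns True if shutdown was requested, or False if max_polls
    attempts passed without the trigger firing.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if trigger():
            run(SHUTDOWN_COMMAND, check=True)
            return True
        polls += 1
        time.sleep(poll_seconds)
    return False


# Dry run with a fake trigger and a fake runner that only records the command.
recorded = []
presses = iter([False, False, True])
requested = wait_and_shutdown(
    lambda: next(presses),
    run=lambda cmd, check: recorded.append(cmd),
    poll_seconds=0,
)
print(requested, recorded)
```

On a real device the trigger would have to be wired to something physical, such as a GPIO pin, which is outside the scope of this sketch.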