Sharded Column Index for Cassandra

Eindex is a module for sharding column-based indexes. Using a scheme similar to how Cassandra shards row keys, we decided to shard column keys. With this scheme you can still use the Cassandra Random Partitioner and get range queries for keys. Our goal is to support hundreds of millions of keys across a Cassandra cluster.
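
To make the idea concrete, here is a minimal in-memory sketch of sharding keys by hash while still answering range queries. It is not Eindex's code: the class name `ShardedIndexSketch`, the use of `hashCode()` for shard selection, and the `TreeSet` standing in for a sorted Cassandra row are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: each TreeSet stands in for one sorted index row (shard). */
final class ShardedIndexSketch {
    private final List<TreeSet<String>> shards = new ArrayList<>();

    ShardedIndexSketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new TreeSet<>());
        }
    }

    /** Picks a shard from the key's hash (assumed scheme, not Eindex's actual one). */
    int shardFor(String key) {
        return Math.floorMod(key.hashCode(), shards.size());
    }

    void add(String key) {
        shards.get(shardFor(key)).add(key);
    }

    /** Range query: ask every shard for its slice, then merge the sorted results. */
    List<String> range(String fromInclusive, String toExclusive) {
        TreeSet<String> merged = new TreeSet<>();
        for (TreeSet<String> shard : shards) {
            merged.addAll(shard.subSet(fromInclusive, toExclusive));
        }
        return new ArrayList<>(merged);
    }
}
```

In this sketch the keys are spread evenly by hash, and a range query costs one slice read per shard plus a merge.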

Read More

Java Unique ID Generator

EID is a service for generating unique ID numbers at high scale with some simple guarantees (based on the work from https://github.com/twitter/snowflake). The service can run in-memory or as a RESTful web service using Jetty.
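
For readers unfamiliar with the Snowflake approach, the sketch below shows the general technique: pack a timestamp, a worker ID, and a per-millisecond sequence into one 64-bit number. It is not EID's implementation; the class name, the 10/12 bit split, and the epoch constant are assumptions chosen for illustration.

```java
/** Hypothetical Snowflake-style generator; the bit layout is assumed, not EID's. */
final class SnowflakeStyleIdGenerator {
    private static final long EPOCH_MS = 1288834974657L; // arbitrary fixed epoch (assumption)
    private static final int WORKER_BITS = 10;            // assumed width
    private static final int SEQUENCE_BITS = 12;          // assumed width
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeStyleIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    /** Returns an ID that is unique per worker and increases over time. */
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Each worker would be constructed with a distinct ID, for example `new SnowflakeStyleIdGenerator(7).nextId()`, so that IDs from different machines never collide.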

Read More

RAID Level 0 Setup on Amazon EC2 EBS drives

This will be a short post describing how we configure a RAID level 0 drive on an EC2 instance using EBS drives. For a lot of our functionality we typically use the ephemeral drives and periodically back up content using the EBS drives and snapshots. We mainly use RAIDed EBS drives to get the maximum performance out of Amazon EC2 small instances. For example, we have seen nearly double the performance out of our Cassandra cluster on small instances using RAIDed EBS drives.

Read More

Multi-machine EC2 Cassandra Setup in 30 minutes

In this post we will walk through setting up a production-ready 3-node Cassandra cluster with Munin monitoring running on Amazon EC2 in under 30 minutes. We will also walk through getting the sample Cassandra stress scripts running with a basic load on the 3-node cluster. This post builds on a previous post about how to set up and maintain an EC2 virtual instance with our supplied unattended install scripts. If you wish to know more about how our unattended install scripts work, please review my previous post.

Read More

Unattended Amazon EC2 Install Script

After maintaining several versions of my own private AMIs and realizing what a pain maintenance was, I decided to find a better solution. There is a lot of great information on the net if your google-fu is good, but I decided to compile all the information I use into a couple of scripts and describe each step in detail so others could understand, modify and use the scripts. The overriding goal is to allow the flexibility of launching and configuring remote Amazon EC2 instances in a non-interactive manner.

Read More

JMX Support for Java Perf Counters

I have added JMX support to the Simple Java Performance Counters. For a detailed description of the Java Performance Counters, please check out my previous post.

Read More

Java in Memory Cache

Let's look at creating and using a simple thread-safe Java in-memory cache. It would be nice to have a cache that can expire items based on a time to live as well as keep the most recently used items. Luckily, the Apache Commons Collections library has an LRUMap, which removes the least recently used entries from a fixed-size map. Great, one piece of the puzzle is complete. For the expiration of items we can timestamp the last access and, in a separate thread, remove the items when the time-to-live limit is reached. This is nice for reducing memory pressure for applications that have long idle times between accesses to the cached objects. There is also some debate about whether the cache should return a cloned object or the original. I prefer to keep it simple and fast by returning the original object, so the onus is on the user of the cache to understand that modifying the returned object will modify the object in the cache as well. Note that this is also an in-memory cache, so objects are not serialized to disk.
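
The design described above can be sketched roughly as follows. To stay dependency-free this sketch swaps the Apache `LRUMap` for a JDK `LinkedHashMap` in access order, so it is not the post's actual code; the class name `SimpleInMemoryCache` and its constructor parameters are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of an LRU + time-to-live cache; uses LinkedHashMap instead of LRUMap. */
final class SimpleInMemoryCache<K, V> {
    private static final class Item<V> {
        final V value;
        long lastAccessed = System.currentTimeMillis();
        Item(V value) { this.value = value; }
    }

    private final long timeToLiveMs;
    private final Map<K, Item<V>> map;

    SimpleInMemoryCache(long timeToLiveMs, long cleanupIntervalMs, final int maxItems) {
        this.timeToLiveMs = timeToLiveMs;
        // accessOrder = true keeps the least recently used entry first.
        this.map = new LinkedHashMap<K, Item<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Item<V>> eldest) {
                return size() > maxItems;
            }
        };
        if (timeToLiveMs > 0 && cleanupIntervalMs > 0) {
            // Background daemon thread that periodically expires idle items.
            Thread cleaner = new Thread(() -> {
                while (true) {
                    try {
                        Thread.sleep(cleanupIntervalMs);
                    } catch (InterruptedException e) {
                        return;
                    }
                    cleanup();
                }
            });
            cleaner.setDaemon(true);
            cleaner.start();
        }
    }

    synchronized void put(K key, V value) {
        map.put(key, new Item<>(value));
    }

    /** Returns the original object (not a clone) and refreshes its timestamp. */
    synchronized V get(K key) {
        Item<V> item = map.get(key);
        if (item == null) {
            return null;
        }
        item.lastAccessed = System.currentTimeMillis();
        return item.value;
    }

    synchronized int size() {
        return map.size();
    }

    /** Removes every item whose last access is older than the time to live. */
    synchronized void cleanup() {
        long now = System.currentTimeMillis();
        map.values().removeIf(item -> now - item.lastAccessed > timeToLiveMs);
    }
}
```

Passing a cleanup interval of zero disables the background thread, in which case the caller is expected to invoke `cleanup()` itself.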

Read More

Simple Java Performance Counters

There don’t seem to be a whole lot of open source options for Java performance counters. Since I found this frustrating and rolled my own, I decided to share my work so others could just ditto it. The overarching principle is simplicity, or more importantly KISS. I wanted something fast, simple, easy to use, fast, simple and thread-safe (did I mention fast and simple?). After working with Windows C++ performance counters (yuck, talk about warts!) and .NET performance counters (nice band-aid, but still didn’t cover the warts), I opted for a simple, under-engineered design. Before we dive into some samples, let's briefly explain the included Java performance counters.

Read More

Unattended Java Install on Linux

When building and configuring Amazon EC2 instances I find myself needing to install the Sun Java 6 runtime and/or the JDK unattended. This is sometimes referred to as a non-interactive or headless install. The script below is what I typically use to install Java on my Ubuntu 9.10 instances running on Amazon EC2.

Read More

Run a MySQL Script using Java

Sometimes it is nice to programmatically run .sql scripts on a MySQL database using Java. This is easily accomplished using the allowMultiQueries configuration property for the MySQL Connector/J driver. When set to true it allows the use of ‘;’ to delimit multiple queries.
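
A minimal sketch of the approach, assuming the Connector/J driver is on the classpath at run time. The class `MySqlScriptRunner`, its method names, and the host, database, and credential parameters are hypothetical; only the `allowMultiQueries` property comes from the text above.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

/** Hypothetical helper that runs a whole .sql file in one JDBC call. */
final class MySqlScriptRunner {

    /** Builds a JDBC URL with allowMultiQueries enabled. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?allowMultiQueries=true";
    }

    /** Reads the script and sends all ';'-delimited statements at once. */
    static void runScript(String url, String user, String password, String scriptPath)
            throws Exception {
        String sql = new String(Files.readAllBytes(Paths.get(scriptPath)),
                StandardCharsets.UTF_8);
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        }
    }
}
```

A call might look like `MySqlScriptRunner.runScript(MySqlScriptRunner.jdbcUrl("localhost", 3306, "test"), user, password, "schema.sql")`, where every argument value is a placeholder.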
Read More

Polling for EC2 instance availability

I often find myself writing scripts for Amazon EC2 that need to wait for the instance to become available. Instance availability, for me, is determined by when the SSH service becomes available. Let's create a simple script that will poll an SSH connection and wait until it can connect before letting the script continue.
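
The polling loop can be sketched in Java as shown below. This is not the script from the post: the class `PortPoller`, the two-second connect timeout, and the retry parameters are assumptions, and the check only confirms that the TCP port accepts connections, not that an SSH login would succeed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper that waits until a TCP port (e.g. 22 for SSH) accepts connections. */
final class PortPoller {

    /**
     * Tries to connect up to maxAttempts times, sleeping delayMs between tries.
     * Returns true as soon as a connection succeeds, false otherwise.
     */
    static boolean waitForPort(String host, int port, int maxAttempts, long delayMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 2000);
                return true;
            } catch (IOException e) {
                // Port not reachable yet; fall through and retry.
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

For an EC2 instance, a caller would pass the instance's public hostname and port 22, then continue with the rest of the setup once the method returns true.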
Read More