Archive for the ‘Clojure Coding’ Category

What Would I Build?

Monday, November 25th, 2013

Storm

I've been playing around with Storm for a while now, and while I don't think there are all that many folks in the world who are expert at it, I'm certainly an advanced novice, and that's good enough for the amount of time I've put into it. I've learned a lot about how they have tried to build a high-performance computing platform in clojure and on the JVM, and I've come away with an affirmation of the feelings I had when I was interviewed for this job and we discussed functional languages: Garbage Collection is the death of all functional languages, and certainly of Storm.

I like the simplicity of functional languages with a good library of functions. Face it, Java took off over C++ because C++ was the base language and Java had the rich class library that everyone built on. It made a huge difference in how fast people could build things. So if you want a functional language to gain a lot of traction fast, you need to make sure you don't send people off to re-invent the wheel for the most basic tasks.

But the real killer is Garbage Collection. I'm not a fan, and the reason is simple - if I'm trying to do some performant coding, I want to control when collection happens, and under what conditions. It's nice that novices can forget about memory and still write stable code, but when you want to move 1,000,000 msgs/sec, you can't do it without pools, lockless data structures, mutability, and solid resource control - none of which I get in the JVM, or anything based on it.

So what's a coder to do? Answer: Write another.

There used to be Xgrid from Apple, but they dropped that. They didn't see that it was in their best interests to write something that targets their machines as nodes in a compute cluster, and they aren't about to write something where you can use cheap linux boxes and cut them out altogether. Sadly, this is a company, and they want to make money.

But what if we made a library that used something like ZeroMQ for messaging, with C++ for the linux side and Obj-C++ for the Mac side, and made all the tools work like they do for Storm - except that instead of using clojure, the JVM, and a ton of server-side tools to handle all the coordination and messaging, we'd use something far more tightly coupled to the toolset we're working with?

First, no Thrift. It's bulky, expensive, and it's being used as a simple remote procedure call. There are a lot of better alternatives out there when you're using a single language. Stick with a recent version of ZeroMQ and decent bindings - like their C++ ones. Start small and build it up. Make a decent console - Storm is nice here, but there's a lot more that could be done, and the data in the Storm UI is not easily discernible. Make it clearer.

Maybe I'll get into this... it would certainly keep me off the streets.

Building Clojure Libraries for use in Storm

Friday, September 20th, 2013

Storm

Here's a little gem that I figured out while trying to deliver a nice clojure library for some storm work that I've been doing. The problem is that when you build an uberjar for a topology (or library) with leiningen, you don't want to include the storm jars, as they will mess up the storm cluster when you go to deploy the topology. So how do you get everything to work locally, but still have the uberjar build go smoothly?

Sadly, this is not clearly documented anywhere that I could find. But there were bits and pieces here and there and I was able to figure out what I needed with a little trial-and-error.

Thankfully, it's all in the leiningen project.clj file:

  (defproject having-fun "1.0.0"
    ;; AOT-compile the entry point, at the top level (the leiningen 2 way)
    :aot [project.core]
    ;; the :provided profile keeps the storm jars on the classpath locally,
    ;; but out of the uberjar
    :profiles {:provided {:dependencies [[storm "0.9.0-wip16"]]}}
    ;; where `lein deploy` pushes the finished artifact
    :repositories [["releases" {:url "http://nexus/content/repositories/releases/"
                                :sign-releases false}]]
    :main project.core)

The key seems to be that with leiningen 2, the :aot tag needs to be at the top level, and not buried in a :profiles entry. This seems to be the direction things are going, so I wanted to adopt it now - and it works better this way.

Additionally, the :profiles line is all about excluding the storm jars from the uberjar, which is just what I needed, and the :repositories tag is all about where to deploy it with a simple:

  $ lein deploy

With this, I've been able to build clojure libraries with defbolt and defspout constructs - which is exactly what I wanted to do, and then put this up on our local nexus server so that it's very easy for others in the group to put this library in their project.clj file and make use of it.
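
For reference, a bolt in that style is short - here's a minimal sketch using storm's clojure DSL (the namespace and names are made up for illustration):

  (ns having-fun.bolts
    (:use [backtype.storm clojure]))

  ;; a trivial bolt: upper-case the incoming word and re-emit it, anchored
  (defbolt shout ["word"] [tuple collector]
    (emit-bolt! collector [(.toUpperCase (.getString tuple 0))] :anchor tuple)
    (ack! collector tuple))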

Sweet.

Installing Hadoop on OS X

Wednesday, July 17th, 2013

Hadoop

This morning I finally got Hadoop installed on my work laptop, and I wanted to write it all down so that I can repeat it when necessary. As I found out, it's not at all like installing CouchDB, which is about as simple as anything could be. No… Hadoop is a far more difficult beast, and I guess I can understand why, but still, it'd be nice to have a simple Homebrew install that set it up in single-node mode and started everything with Launch Control. That's a wish, though, not a necessity.

So let's get into it. First, make sure that you have the SSH daemon running on your box. This is controlled in System Preferences -> Sharing -> Remote Login - make sure it's checked, save this, and it should be running just fine. Make sure you can ssh into your box - if necessary, make the SSH keys and put them in your ~/.ssh directory.
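
If you don't have keys yet, the usual passphrase-less pair from every single-node Hadoop guide does the trick (assuming rsa keys and the default file locations):

  $ ssh-keygen -t rsa -P ""
  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys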

Next, you certainly need to install Homebrew, and once that's all going, you need to install the basic Hadoop package:

  $ brew install hadoop

At this point, you will need to edit a few of the config files and make a few directories. Let's start with the directories. These will be the locations for the actual Hadoop data, the Map/Reduce data, and the NameNode data. I chose to place these next to the Homebrew install of Hadoop so that it's all in one place:

  $ cd /usr/local/Cellar/hadoop
  $ mkdir data
  $ cd data
  $ mkdir dfs
  $ mkdir mapred
  $ mkdir nn
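
(Equivalently, bash brace expansion gets you there in one line:)

  $ mkdir -p /usr/local/Cellar/hadoop/data/{dfs,mapred,nn}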

At this point we can go to the directory with the configuration files and update them:

  $ cd /usr/local/Cellar/hadoop/1.1.2/libexec/conf

The first update is to work around a known Kerberos bug in Hadoop on OS X. Do this by editing hadoop-env.sh to include:

  export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="

Next, edit the hdfs-site.xml file to include the following:

  <configuration>
    <property>
      <name>dfs.data.dir</name>
      <value>/usr/local/Cellar/hadoop/data/dfs</value>
    </property>
    <property>
      <name>dfs.name.dir</name>
      <value>/usr/local/Cellar/hadoop/data/nn</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
  </configuration>

Next, edit the core-site.xml file to include the following:

  <configuration>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/tmp/hdfs-${user.name}</value>
    </property>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
      <description>The name of the default file system.  A URI whose
      scheme and authority determine the FileSystem implementation.  The
      uri's scheme determines the config property (fs.SCHEME.impl) naming
      the FileSystem implementation class.  The uri's authority is used to
      determine the host, port, etc. for a filesystem.</description>
    </property>
  </configuration>

Finally, edit the mapred-site.xml file to include the following:

  <configuration>
    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
      <description>The host and port that the MapReduce job tracker runs
      at.  If "local", then jobs are run in-process as a single map
      and reduce task.</description>
    </property>
    <property>
      <name>mapred.local.dir</name>
      <value>/usr/local/Cellar/hadoop/data/mapred/</value>
    </property>
  </configuration>

We are finally all configured. At this point, you need to initialize the Name node:

  $ hadoop namenode -format

and then you can start all the necessary processes on the box:

  $ start-all.sh
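
To sanity-check that everything came up, the JDK's jps tool should list the five daemons - NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker:

  $ jps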

At this point, you will be able to hit the web endpoints - the Hadoop 1.x defaults are:

  NameNode status - http://localhost:50070/
  JobTracker - http://localhost:50030/

and using the WebHDFS REST endpoint, you can use any standard REST client to submit files, delete files, make directories, and generally manipulate the filesystem as needed.
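
For example, with dfs.webhdfs.enabled set as above, listing the root of the filesystem and making a directory are one-liners with curl (the /tmp/demo path is just an example):

  $ curl -i "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS"
  $ curl -i -X PUT "http://localhost:50070/webhdfs/v1/tmp/demo?op=MKDIRS"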

This was interesting, and digging around for what was needed was non-trivial, but it was well worth it. I'll now be able to run my code against the PostgreSQL and Hadoop installs on my box.

Sweet!

I Love Magic – as Entertainment

Tuesday, March 5th, 2013

Clojure

I love a magic show. Even one that my friends might think of as lame. I love the well-done illusion. I know it's not real, but it's fun to believe that it is. After all - it's entertainment, and if you don't enjoy entertainment, then watch something else. It's your time, your life, your choice. But where I don't like magic is in languages and coding - there I absolutely hate it.

Take clojure - and for that matter, ruby falls into this category as well. The Ruby-ism of convention over configuration is a nice thought, and can be helpful for new coders starting out, but it obscures all the details, and in that obscurity it masks all the performance limitations - and that includes threading. Clojure is the same. What's really being done? You don't quite know with a lazy sequence, do you? What's loaded when? If it's a database result set, does the code load in all the rows and then construct them as a lazy sequence, or does it read a few rows at a time and leave the connection open? Big difference, right?
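
To make the result-set question concrete, here's the classic form of the trap with the old clojure.java.jdbc API (the sql alias and the events table are just for illustration):

  ;; assumes (:require [clojure.java.jdbc :as sql])

  ;; returns a lazy seq that may not be realized until after the
  ;; connection has been closed - a runtime surprise waiting to happen
  (defn all-events [db]
    (sql/with-connection db
      (sql/with-query-results rows ["select * from events"]
        rows)))

  ;; forcing realization while the connection is still open avoids it
  (defn all-events-eager [db]
    (sql/with-connection db
      (sql/with-query-results rows ["select * from events"]
        (doall rows))))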

So I'm not a fan of this kind of code - except for simple one-off scripts and manual processes. You just have no idea what's really happening, and without that knowledge, you have to dig into the code and learn what it's doing. Don't forget to stay abreast of all the updates to the libraries as well - things could change at any time simply because of the cloaking power of that abstraction.

Why does this mean so much to me? Because every project I've been involved in over the last 15 years has come down to performance. It's always about how fast this can all be done, how much can be run on a single box, and so on. All of these are performance issues, and without in-depth, continual knowledge of every library in use, I'm bound to have to make some assumptions - until I'm proven wrong by the code itself.

And what's worse is that I at least know to look, whereas plenty of the junior developers I have worked with simply assume that waiting is par for the course, and don't even think about the performance consequences of their code. They've always had enough memory and CPU speed, and if it takes 20 mins - so what? It takes 20 mins! I wonder if they would feel that way if it were charging a defibrillator for their parents? I'm guessing not.

Time is the one limited resource we all have. Waiting is not acceptable if you can figure out a way to reduce or eliminate the wait. That's what I've been doing for the better part of 15 years now: removing the wait.

Starting late yesterday, I realized that we have a real performance problem with the clojure code we are working with. I'm not sure the code is really all that bad, as it works fine when the database isn't loaded, but when it is - and it doesn't have to be loaded very much - things slow to a crawl, and that's not good. So bad, in fact, that several processes failed last night as it was cranking through a new data set.

So what's a guy to do? Well… I know what to do to make JDBC faster; I just need to know what to do in clojure to get those parameters into the Statement-creation code in the project. Unfortunately, there's no simple way to see how to do it. Clojure libraries, like ruby's, are - for the most part - not well documented. This bites because I can see what I need to set, but not how to set it.
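
From poking through the clojure.java.jdbc source, I suspect the incantation is something like this - a sketch against the 0.2.x API, using plain interop for the query, with the fetch size being the parameter I actually care about:

  ;; assumes (:require [clojure.java.jdbc :as sql]) and a db-spec in info
  (sql/with-connection info
    (let [stmt (sql/prepare-statement (sql/connection)
                                      "select * from demands"
                                      :fetch-size 1000)] ; stream, don't slurp
      (with-open [rs (.executeQuery stmt)]
        (doall (resultset-seq rs)))))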

So I'm going to have to wait for our clojure expert to show up this morning and tell him to dig into it until he can give me a list of examples that I can work from. I have no doubt he's capable of doing it, but it's not terribly nice to have to wait for him to walk in.

But that's my problem, not his.

But Boy! I hate "magic" code.

Cool Sub-Selects in korma

Thursday, January 31st, 2013

Clojure

I was doing some selects from a postgres database into clojure using korma, and they were pretty straightforward for a master/detail scheme:

  (defn locations-for-demand-set
    [demand-set-id]
    (select locations
            (with demands)
            (fields "locations.*")
            (where {:demands.demand_set_id demand-set-id})))

and it was working pretty well. The data was coming back from the database, and everything was OK. But as the database got bigger and bigger, we started to see a real performance penalty. Specifically, the pulling of the locations was taking on the order of 30 seconds for a given demand set. That's too long.

The problem, of course, is that this is implemented as a join, and that's not going to be very fast. What's faster is a sub-select, where we can get all the demand ids for a given demand-set and then use that with an IN clause in SQL. Thankfully, I noticed that korma had just that capability:

  (select locations
    (where {:demand_id [in (subselect demands
                                      (fields :id)
                                      (where {:demand_set_id demand-set-id}))]}))

Unfortunately, this didn't really give me the kind of speed boost I was hoping for. In fact, it only cut about a half-second off the 31 sec runtime. Kinda disappointing. But the cause had to be related to the size of the sub-select: it was likely 25,000 elements, and doing an IN on that is clearly an expensive operation.

I like that korma supports this feature, but I need a faster way.

Hitting Teradata from Clojure

Monday, January 28th, 2013

Clojure

Today I worked on hitting Teradata from within clojure using clojure.java.jdbc, and I have to say it wasn't that bad. There are plenty of places that a few paragraphs of documentation could have saved me 30 mins or so, but all told, the delays due to googling weren't all that bad, and in the end I was able to get the requests working, and that's the most important part. I wanted to write it down because it's hard enough that it's not something I'll keep in memory, but it's not horrible.

First, set up the config for the parameters of the Teradata JDBC connection. I have a resources/ directory with a config.clj file in it that's read on startup. Its contents are (at least in part):

  {:teradata {:classname "com.teradata.jdbc.TeraDriver"
              :subprotocol "teradata"
              :subname "//tdwa"
              :user "me"
              :password "secret"}}

Then, because we're using Leiningen, the jars are pulled in by adding the following to the :dependencies in the project.clj file:

    [com.teradata/terajdbc4 "14.00.00.13"]
    [com.teradata/tdgssconfig "14.00.00.13"]

so that the next time we run lein, we'll get the jars, and they will know how to connect to the datasource.

Then I can simply make a function that hits the source:

  ;; assumes (:require [clojure.java.jdbc :as sql]) and that cfg/config is
  ;; the config-reading function sketched above
  (defn hit-teradata
    "Run a simple parameterized query against Teradata."
    [arg1 arg2]
    (let [info (cfg/config :teradata)]
      (sql/with-connection info
        (sql/with-query-results rows
          ["select one, two from table where arg1 = ? and arg2 = ?" arg1 arg2]
          (doall rows)))))   ; realize the rows before the connection closes

Sure, the example is simplistic, but it works, and you get the idea. It's really in the config and jar referencing that I spent the most time. Once I had that working, the rest was simple JDBC within clojure.

Adding More Metadata to the Demand Service

Thursday, January 24th, 2013

Dark Magic Demand

This morning I was asked to add something to the demand service so that it'd be easier to debug what's happening within that service. It wasn't horrible, and given that we already had the provision for the metadata associated with the demand, it was simply a matter of collecting the data and then placing it in the map at the right time.

I was pleasantly surprised to see how easy this was in clojure. Since everything is a simple data structure (as we're using it), it's pretty easy to change a single integer to a list and add a value to the end. Then it's just important to remember what's what when you start to use this.
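
In code, the whole change is about this big - a hypothetical sketch, since the real names aren't shown here:

  ;; turn a scalar metadata entry into a vector and append the new value
  (defn append-meta
    [demand k v]
    (update-in demand [:meta k]
               (fn [old]
                 (conj (cond (nil? old)    []
                             (vector? old) old
                             :else         [old])
                       v))))

  ;; (append-meta {:meta {:source 42}} :source 43) => {:meta {:source [42 43]}}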

I placed it into the functions and deployed the code, and we should be collecting the data on the next update. Good news.

Added Investigative Tool to Metrics App

Wednesday, January 23rd, 2013

WebDevel

This morning, with the right help, I was able to put up a slick little javascript JSON viewer on the main 'metrics' web page for the project I'm on at The Shop. The goal of this is really to give folks - both developers and support - a quick way to look at the details of the Demand objects in the demand service. Since each demand has a UUID, it's possible to look at the series of demand objects as each one "flows" through the service and is adjusted, combined, and delivered to the client.

I have to say that I'm very impressed with the speed of the clojure service. Then again, it's Java underneath, and there's not all that much to it - there's the indexed postgres table, a select statement, and then formatting. Sure, that's a simplification, but it's not like we're delivering streaming video or something. But wow! It's fast.

And the regex-based JSON classifier in javascript is pretty impressive as well. It's fast and clean, and the CSS is a lot of fun to play with in the syntax highlighting. I can really see how much fun this could be - and what a big time sink, if I played with it too much.

But it's nice to punch it out and make it available on the much improved unicorn/nginx platform. Those timeouts are gone, and the code is much simpler, and that's a wonderful thing.

Running Tests, Fixing Issues, Moving Forward

Monday, January 21st, 2013

Dark Magic Demand

Today I spent a lot of time with the new calculation chain reaction in the demand service trying to make sure that everything was running as it should. There were a few issues with the updates when there wasn't an updating component - like a demand set without a seasonality set to adjust it. In these cases, we just did nothing but log an error, when the right thing was to realize that a seasonal adjustment without any factors is just a no-op, and to return the original data as the "adjusted" data. Easy.
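
The fix amounts to something like this - a sketch with stand-in names, since the real functions are internal to the project:

  ;; an adjustment with no factors is a no-op: hand back the original
  ;; demands as the "adjusted" data instead of logging an error.
  ;; adjust-one stands in for the real per-demand adjustment function.
  (defn seasonally-adjust
    [adjust-one demands factors]
    (if (empty? factors)
      demands
      (map #(adjust-one % factors) demands)))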

But the guy that wrote it doesn't think like that. So I had to put in functions to return the empty set if there is nothing in the database with which to adjust the data. It's hokey, but hey, this entire project is like that. I'm not saying it's not valuable in some sense, but I'm looking at this and thinking that we picked a language, and a team (including me), that has no real traction here at The Shop - for the sake of what? Exposure? Coolness? Are we really moving in this direction? Or is it just another fad to pacify the Code Monkeys that want to play with new toys? Will we be moving to the next new toy when they get bored with this one?

Anyway… I've been fixing things up, and realizing that I'm getting to be a decent clojure developer. I'm not really good - let alone an expert - but it's a start, and I don't have to hit the docs every 5 mins to figure something out. It's starting to make sense, even if it isn't the way my co-worker might like it.

Thankfully, things are really working out well. By the end of the day I had updated the code in the main pipeline app to use either form of the demand coming out of the service, so that we can have much improved demand forecasting in the app. Very nice to see.

I have to wait a day to put it in UAT, just to make sure things have settled out on other fronts, and then we can isolate which changes are due to this effect. But it's progress, and that's good to see.

Updated Leiningen to 2.0.0

Monday, January 21st, 2013

Homebrew

This morning I saw that Leiningen 2.0.0 was officially released, and I updated it on my work laptop with the simple command:

  $ lein upgrade

as I'd downloaded that directly from the site. But on my own MacBook Pro, I was using the version from Homebrew, and this morning they, too, updated their "brew" and I just needed to say:

  $ brew update
  $ brew upgrade leiningen

and I had 2.0.0 on my box as well. I don't think there's any difference between the installs, but it's nice to keep using the brew version on my laptop, since that install is different from the one on my work box, and some folks may ask about the difference as we get more people on the team.