Web console for Kafka messaging system

Running Kafka for a streaming collection service can feel somewhat opaque at times, which is why I was thrilled to find the Kafka Web Console project on GitHub yesterday.  This Scala application can be downloaded and installed in a couple of steps.  An included web server can then be launched to serve it up quickly.  Here’s how to do all that.

For a quick intro to what the web console does, watch my video first.  The video and the instructions for getting started follow below.

Continue reading Web console for Kafka messaging system

Drinking from the (data) Firehose of Terror

Between classic business transactions and social interactions and machine-generated observations, the digital data tap has been turned on and it will never be turned off. The flow of data is everlasting. Which is why you see a lot of things in the loop around real time frameworks and streaming frameworks. – Mike Hoskins, CTO Actian

From Mike Hoskins to Mike Richards (yes we can do that kind of leap in logic, it’s the weekend)…

Oh, Joel Miller, you just found the marble in the oatmeal!   You’re a lucky, lucky, lucky little boy – because you know why?  You get to drink from… the firehose!  Okay, ready?  Open wide! – Stanley Spadowski, UHF

Firehose of Terror

I think you get the picture – a potentially frightening picture for those unprepared to handle the torrent of data that is coming down the pipe.  Unfortunately, for those who are unprepared, the disaster will not merely overwhelm them.  Quite the contrary – I believe they will be consumed by irrelevancy.

If you’re still with me, let me explain. Continue reading Drinking from the (data) Firehose of Terror

OSX Open Command – Launch Custom Application

The OSX “open” command line tool is very useful.  Use it to launch a URL or point to a folder and the web browser or Finder pops up automatically.  But what about when you want to launch a particular app to handle a resource you provide?  It can do that as well.  Easily.

In this example I’m setting up my environment to launch several Windows remote desktop client sessions using the wonderful CoRD application.  The idea is that I can put the CoRD app in a folder with a shell script and have an easily transportable launcher package to give to others – without them having to install a RDP client, etc.

Because I’m not installing the CoRD application globally, the RDP protocol doesn’t get associated with the app.  Therefore, when using the open command I need to explicitly tell it which application to launch, using the “-a” flag:

 open -a /Users/tyler/CoRD.app/Contents/MacOS/CoRD rdp://winserver1:3389

The benefit to using the open command here is that it won’t launch new CoRD instances, but will add them as sessions to the existing one.  Likewise, open will start the process and return you to the command line – so you can list several open statements in a script and it will run them all without having to do any funky backgrounding step.
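As a rough sketch of the launcher idea, here is a Python take on the same thing rather than the shell script itself.  It builds one open command per server and, by default, only prints them.  The app path and the second host name (winserver2) are placeholders I made up for illustration, and the helper names are hypothetical.

```python
import subprocess


def build_open_commands(app_path, hosts, port=3389):
    """Build one 'open -a <app> rdp://host:port' command per host."""
    return [
        ["open", "-a", app_path, "rdp://{}:{}".format(host, port)]
        for host in hosts
    ]


def launch(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False (macOS only)."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # open returns right away, so no backgrounding is needed
            subprocess.run(cmd, check=True)


# Placeholder path and hosts: adjust to wherever CoRD.app actually lives
commands = build_open_commands(
    "/path/to/CoRD.app/Contents/MacOS/CoRD",
    ["winserver1", "winserver2"],
)
launch(commands)
```

Call launch(commands, dry_run=False) on a Mac to actually start the sessions.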

Thanks to this old OSXDaily post for pointing me in the right direction.  I had given up on the command because I didn’t realise there were more options.  Next time I’ll read the manual first!

 

 

VIDEO: Kibana 3 Dashboard – 3 Use Cases Demonstrated

Kibana dashboards, from the Elasticsearch project, can help you visualise activity and incidents in log files. Here I show 3 different types of use cases for dashboards and how each can be used to answer different questions depending on the person.  Video and details follow. Continue reading VIDEO: Kibana 3 Dashboard – 3 Use Cases Demonstrated

Google wants “mobile-friendly” – fix your WordPress site

TheNextWeb reports: “Google will begin ranking mobile-friendly sites higher starting April 21“.  It’s always nice having advance warning, so use it wisely – here’s how to tweak WordPress to increase your mobile-friendliness.

Google Mobile-Friendly Check

I use a self-hosted WordPress site and wanted to make sure it was ready for action.  I already thought it was, because I’ve accessed it on a mobile device very often and it worked okay.

I even went onto the Google Web Admin tools and the mobile usability check said things were fine, but… Continue reading Google wants “mobile-friendly” – fix your WordPress site

iPhone cable – loose connection?

iPhone cable – mysterious loose connection bothering you?  Before buying a new gold-plated cord or adapter, clean out the port with a toothpick.  You will be amazed!

Kafka Consumer – Simple Python Script and Tips

[UPDATE: Check out the Kafka Web Console that allows you to manage topics and see traffic going through your topics – all in a browser!]


 

When you’re pushing data into a Kafka topic, it’s always helpful to monitor the traffic using a simple Kafka consumer script.  Here’s a simple script I’ve been using that subscribes to a given topic and outputs the results.  It depends on the kafka-python module and takes a single argument for the topic name.  Modify the script to point to the right server IP.

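# Requires the kafka-python module (older releases that provide SimpleConsumer)
from sys import argv

from kafka import KafkaClient, SimpleConsumer

# Point this at your own Kafka broker
kafka = KafkaClient("10.0.1.100:6667")
# The topic name comes from the first command-line argument
consumer = SimpleConsumer(kafka, "my-group", argv[1])
consumer.max_buffer_size=0  # zero removes the buffer size limit
consumer.seek(0,2)  # start from the most recent offset
for message in consumer:
    # message[0] is the offset; message[1][3] is the message payload
    print("OFFSET: "+str(message[0])+"\t MSG: "+str(message[1][3]))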
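Newer kafka-python releases dropped SimpleConsumer in favour of KafkaConsumer, so here is a minimal sketch of roughly the same monitor for those versions.  This is my assumption of an equivalent, not the script I actually ran; the format_record and consume helper names are hypothetical, and the server address and group name are simply carried over from the script above.

```python
def format_record(offset, value):
    """Render one record in the same layout the script above prints."""
    if isinstance(value, bytes):
        text = value.decode("utf-8", errors="replace")
    else:
        text = str(value)
    return "OFFSET: " + str(offset) + "\t MSG: " + text


def consume(topic, server="10.0.1.100:6667", group="my-group"):
    """Subscribe to a topic and print each record as it arrives."""
    # Imported here so format_record works without kafka-python installed
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=server,
        group_id=group,
        # Start at the newest message when no valid committed offset exists
        auto_offset_reset="latest",
    )
    for record in consumer:
        print(format_record(record.offset, record.value))


# Usage (needs a reachable broker): consume("my_messages")
```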

Max Buffer Size

There are two lines I wanted to focus on in particular.  The first is the “max_buffer_size” setting:

consumer.max_buffer_size=0

When subscribing to a topic with a large backlog of messages that have not been received before, the consumer/client can max out its buffer and fail.  Setting an infinite buffer size (zero) allows it to take everything that is available.

If you kill and restart the script it will continue where it last left off, at the last offset that was received.  This is pretty cool but in some environments it has some trouble, so I changed the default by adding another line.

Offset Out of Range Error

As I regularly kill the servers running Kafka and the producers feeding it (yes, just for fun), things sometimes go a bit crazy.  I’m not entirely sure why, but I got this error:

kafka.common.OffsetOutOfRangeError: FetchResponse(topic='my_messages', partition=0, error=1, highwaterMark=-1, messages=)

To fix it I added the “seek” setting:

consumer.seek(0,2)

If you set it to (0,0) it will restart scanning from the first message.  Setting it to (0,2) allows it to start from the most recent offset – so letting you tap back into the stream at the latest moment.
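The two arguments work like a file seek: the first is an offset and the second (“whence”) says what it is relative to.  Here is a small stand-alone sketch of that arithmetic, with no Kafka involved.  The resolve_seek function and the sample offsets are made up for illustration; they are not part of kafka-python.

```python
def resolve_seek(offset, whence, earliest, current, latest):
    """Return the absolute position a seek(offset, whence) call would target.

    whence 0 = relative to the earliest available offset,
    whence 1 = relative to the current offset,
    whence 2 = relative to the latest offset (the tail).
    """
    if whence == 0:
        base = earliest
    elif whence == 1:
        base = current
    elif whence == 2:
        base = latest
    else:
        raise ValueError("whence must be 0, 1 or 2")
    return base + offset


# Hypothetical partition holding offsets 100..900, consumer currently at 250
print(resolve_seek(0, 0, earliest=100, current=250, latest=900))    # from the first message
print(resolve_seek(0, 2, earliest=100, current=250, latest=900))    # from the most recent offset
print(resolve_seek(-10, 2, earliest=100, current=250, latest=900))  # ten messages before the tail
```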

Removing this line returns it to the behaviour mentioned earlier, where it picks up from the last message it previously received.  But if/when that breaks, you’ll want a line like this to save the day.


For more about Kafka on Hadoop – see Hortonworks’ excellent overview page, from which the screenshot above is taken.

Web Mapping Illustrated – 10 year celebration giveaway [ENDED!]

My O’Reilly, 2005 book

Update: All copies are gone!  If you want Geospatial Desktop or Geospatial Power Tools – go to LocatePress.com – quantity discounts available.  For Web Mapping Illustrated go to Amazon.


 

I’m giving away a couple copies of my circa 2005 classic book.  Details below…  When O’Reilly published Web Mapping Illustrated – Using Open Source GIS Toolkits – nothing like it existed on the market.  It was a gamble but worked out well in the end.

Primarily focused on MapServer, GDAL/OGR and PostGIS, it is a how-to guide for those building web apps that include maps.  That’s right, you couldn’t just use somebody else’s maps all the time – us geographers needed jobs, after all.

To help give you the context of the times, a couple of months before the final print date, Google Maps was released.  I blithely added a reference to their site just in case it became popular.

The book is still selling today and though I haven’t reviewed it in a while, I do believe many of the concepts are still as valid as when it was written.  In fact, it’s even easier to install and configure the apps now due to packaging and distribution options that didn’t exist back then.  Note this was also a year before OSGeo.org’s collaborative efforts started to help popularise the tools further.

In celebration of 10 years of sales I have a couple autographed copies as giveaways to the first two people who don’t mind paying only for the shipping (about USD$8) and who drop me a note expressing their interest.

Additionally, I have some of Gary Sherman’s excellent Geospatial Desktop books as giveaways as well.  Same deal, pay actual shipping cost only from my remote hut in northern Canada.  Just let me know you’d like one of them and I’ll email you the PayPal details.  Sorry, not autographed by Gary, though I was editor and publisher, so could scribble on it for you if desired.

Neo4j Cypher Query for Graph Density Analysis

Graph analysis is all about finding relationships. In this post I show how to compute graph density (a ratio of how well connected relationships in a graph are) using a Cypher query with Neo4j. This is a follow up to the earlier post: SPARQL Query for Graph Density Analysis.
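For reference, the usual textbook definition of density is the number of relationships divided by the number that could possibly exist between the nodes.  Below is a minimal Python sketch of that ratio, assuming a simple graph with no self-loops or duplicate relationships; the graph_density name is hypothetical and the Cypher query in the full post may count things differently.

```python
def graph_density(node_count, edge_count, directed=True):
    """Ratio of existing relationships to the maximum possible number."""
    if node_count < 2:
        return 0.0  # no pair of nodes, so nothing can be connected
    possible = node_count * (node_count - 1)  # ordered pairs of distinct nodes
    if not directed:
        possible = possible / 2  # each unordered pair counted once
    return edge_count / possible


# Four nodes with six relationships
print(graph_density(4, 6))                  # directed: 6 of 12 possible
print(graph_density(4, 6, directed=False))  # undirected: 6 of 6 possible
```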

Installing Neo4j Graph Database

In this example we launch Neo4j and enter Cypher commands into the web console… Continue reading Neo4j Cypher Query for Graph Density Analysis

Code snippet: SPARQL Query Graph Density


I’m testing out sharing SPARQL code snippets using GitHub Gist features. I’ll be adding more as I work through more graph-specific examples using SPARQLverse, but here is my first one:

Ideally we’d have a common landing place for building up a library of these kinds of examples.
