Would you like some Mongo with your Postgres? #

Speaking of PostgreSQL, ToroDB is a JSON database that runs on top of Postgres.

JSON documents are stored relationally, not as a blob/jsonb. This leads to significant storage and I/O savings. It speaks natively the MongoDB protocol, meaning that it can be used with any mongo-compatible client.

MongoDB client compatibility. Smart. Still early days, though:

ToroDB follows a RERO (Release Early, Release Often) policy. Current version is considered a “developer preview” and hence is not suitable for production use. However, any feedback, contributions, help and/or patches are very welcome.

backbone_query – MongoDB-like query interface to Backbone collections #

Nice JavaScript add-on for Backbone from Dave Tonge that provides a MongoDB-like query API for your collections:

MyCollection.query({
  // Models must match all these queries
  $and:{
    title: {$like: "news"}, // Title attribute contains the string "news"
    likes: {$gt: 10}}, // Likes attribute is greater than 10

  // Models must match one of these queries
  $or:{
    featured: true, // Featured attribute is true
    category:{$in:["code","programming","javascript"]}}
    //Category attribute is either "code", "programming", or "javascript"
});

Check the README for usage.

Qu – Background job queue for Ruby, Redis, and MongoDB #

Qu is an interesting project for doing background jobs in Ruby from Brandon Keepers, one of the maintainers of delayed_job:

class ProcessPresentation
  def self.perform(presentation_id)
    presentation = Presentation.find(presentation_id)
    presentation.process!
  end
end

job = Qu.enqueue ProcessPresentation, @presentation.id

Check out the README for usage and answers on why another Ruby queuing library.
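
To make the pattern above concrete, here is a minimal in-memory sketch of a queue that stores a job class plus its arguments and later calls the class's `perform` method. This is not Qu's implementation: `TinyQueue`, `work_off`, and `EchoJob` are hypothetical names invented for illustration, and a real backend would persist jobs in Redis or MongoDB rather than in an array.

```ruby
# A toy job queue, assuming jobs are classes that respond to .perform.
# TinyQueue, work_off, and EchoJob are illustrative names, not part of Qu.
class TinyQueue
  Job = Struct.new(:klass, :args)

  def initialize
    @jobs = []
  end

  # Record the job class and its arguments for later execution.
  def enqueue(klass, *args)
    job = Job.new(klass, args)
    @jobs << job
    job
  end

  # Run every pending job in FIFO order and collect the return values.
  def work_off
    results = []
    until @jobs.empty?
      job = @jobs.shift
      results << job.klass.perform(*job.args)
    end
    results
  end

  def size
    @jobs.size
  end
end

class EchoJob
  def self.perform(id)
    "processed #{id}"
  end
end

queue = TinyQueue.new
queue.enqueue(EchoJob, 42)
queue.enqueue(EchoJob, 43)
results = queue.work_off
```

Storing only the class and plain arguments (an id rather than a full object) is what lets a job be serialized into an external store and picked up by a separate worker process.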

rstat.us – Distributed Twitter with Ruby and ostatus #

So, I’m sure you’ve all been waiting with bated breath for me to begin my licensing series. I got lots of great feedback, but something’s made me put it off for a moment: coding. I plan on starting the series in earnest next week, but in its stead, I offer you this: rstat.us.

If you didn’t hear, a week ago Friday Twitter changed their terms of service. This got a lot of people upset, including me. My friends and I started thinking about it, and the real problem is this: any software that’s owned by one entity, corporate or not, is open to the possibility of being abused.

So we decided to fix it. Ten days later, here we are: http://rstat.us/ is born.


To boil it down, rstat.us is a Sinatra application that clones the basic functionality of Twitter. Fine. But here’s the interesting part: if you want to follow someone that’s not on the main rstat.us site, you can copy/paste a URL into a form, and from then on out, it just transparently works. We’re building on the ostatus protocol that other sites like Identi.ca use, so you can actually follow Identica users on rstat.us right now, and after we work out a kink or two, they can follow you, too.

Oh, and I should mention that: this is very much an alpha release. rstat.us was put together by 6 or 8 of my closest friends in a marathon coding session, so there’s some refactoring work to be done. The documentation is also a bit obtuse, partially to slightly discourage people from running their own nodes just yet. Eventually, this should be a two or three line process, and you can be running your own node up on Heroku. We also want to significantly improve our test coverage.

There’s some pretty big plans for the future: we want to extract a Sinatra extension that will enable anyone to easily build their own distributed network. We’re also releasing three Ruby gems that will let anyone work with the few standards that we build upon, so that other people can make their own tools that work with us, or build their own implementations and copy of the site. Check it out on GitHub, or drop by #rstatus on Freenode if you’d like to say hello.

It’s a distributed world that we live in. Own your own data. Build decentralized networks. Take control of your own social networking. And help us do it. :)

[GitHub] [README] [Discuss on HN]

Episode 0.5.1 – MongoDB, NoSQL, and Web Scale with Eliot Horowitz

Steve and Wynn sat down with Eliot Horowitz from 10gen to talk about MongoDB, the NoSQL landscape, and the fun of building at Web Scale. Items mentioned in the show: Eliot Horowitz CTO and Co-Founder of 10gen Dwight Merriman CEO & Co-Founder at 10gen NoSQL is a loose term for Key Value Stores, Graph Databases, […]

Graylog2: Java, Ruby, MongoDB-powered log management, monitoring, and alerting #

For developers, application logs are critical to figuring out what’s going on inside the apps we create. We tail them. We search them. We analyze and graph them. Graylog2, a slick log management, monitoring, and alerting tool powered by Java, Ruby, and MongoDB, does all of these well. Graylog consists of a Java server that collects your logging data and stuffs it into MongoDB, plus a Ruby on Rails web interface for searching, filtering, and graphing that data.


Collecting: Graylog Server

Graylog’s server component requires Mongo version 1.6 or later and a Java environment.

Check out the project’s wiki for installation and startup instructions. Graylog also supports AMQP as an alternate transport for messages; just set it up in your config file.

Graylog supports writing custom rules to determine what messages find their way to MongoDB and in what form, using Drools Expert, as in this example:

import org.graylog2.messagehandlers.gelf.GELFMessage

rule "Rewrite localhost host"
    when
        m : GELFMessage( host == "localhost" && version == "1.0" )
    then 
        m.setHost( "localhost.example.com" );
        System.out.println( "[Overwrite localhost rule fired] : " + m.toString() );
end

rule "Drop UDP and ICMP Traffic from firewall"
    when
        m : GELFMessage( fullMessage matches "(?i).*(ICMP|UDP) Packet(.|\n|\r)*" && host == "firewall" )
    then
        m.setFilterOut(true);
        System.out.println("[Drop all syslog ICMP and UDP traffic] : " + m.toString() );
end

Transport format: GELF

In addition to syslog format, Graylog also supports GELF, or the Graylog Extended Log Format, which offers

  • more than the 1024 bytes offered by syslog to accommodate more info such as backtraces
  • structured data.

Here’s a quick example of a GELF message:

{
  "version": "1.0",
  "host": "www1",
  "short_message": "Short message",
  "full_message": "Backtrace here\n\nmore stuff",
  "timestamp": 1291899928,
  "level": 1,
  "facility": "payment-backend",
  "file": "/var/www/somefile.rb",
  "line": 356,
  "_user_id": 42,
  "_something_else": "foo"
}

A GELF message is just a GZIP’d or ZLIB’d JSON string. Check the GELF Spec for a list of required fields.
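
As a rough sketch of what "a ZLIB'd JSON string" means in practice, the following builds a message using field names from the example above, compresses it with Ruby's standard `zlib` library, and then reverses the steps as a receiver would. The field values are made up, and transport to a Graylog server is deliberately left out because it depends on your setup.

```ruby
require "json"
require "zlib"

# Field names follow the example message above; the values are illustrative.
message = {
  "version"       => "1.0",
  "host"          => "www1",
  "short_message" => "Short message",
  "full_message"  => "Backtrace here\n\nmore stuff",
  "timestamp"     => 1291899928,
  "level"         => 1,
  "facility"      => "payment-backend",
  "_user_id"      => 42
}

# Serialize to JSON, then compress with ZLIB (Zlib::Deflate).
payload = Zlib::Deflate.deflate(JSON.generate(message))

# A receiver reverses both steps to recover the structured data.
decoded = JSON.parse(Zlib::Inflate.inflate(payload))
```

Because the payload is plain JSON underneath, the underscore-prefixed custom fields survive the round trip as structured data rather than as text that has to be parsed back out of a log line.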

Searching and analyzing: Graylog Web Interface

Graylog also ships with a rather slick web interface for searching and viewing Graylog messages. Filters can be applied and saved into logical “streams”, allowing you to look at a slice of your data.


Graylog can even alert you when certain thresholds are exceeded for a given stream, as in this example email alert:

From: graylog2@example.org
To: lennart@socketfeed.com
Subject: [graylog2] Stream alarm! (Stream: Finance)
# Stream >Finance< has 23 new messages in the last 15 minutes. Limit: 15
# Description: Just a dummy stream with a not-so-random name but random data.

From: graylog2@example.org
To: lennart@socketfeed.com
Subject: [graylog2] Subscription (Stream: Finance)
# Stream >Finance< has 24 new messages since 2011-01-08 20:31:13 +0100
2011-01-08 21:12:38 +0100 from >localhost.localdomain<
  sundaysister kernel: [92837.097110] CPU0: Core temperature/speed normal
2011-01-08 21:12:38 +0100 from >localhost.localdomain<
  sundaysister kernel: [92837.096461] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1485916)

Bonus: Capture exceptions with Rack

For Ruby web applications, there is a bonus Rack application that allows you to send all application exceptions to Graylog in the spirit of Hoptoad.

[Source on GitHub] [Web site]

MongoDB 1.7.5 released: Single Server Durability! #

Today brings a new release of MongoDB. Normally I wouldn’t make a fuss about a point release, but this one has a big feature: Single server durability.

For those of you not in the know, many NoSQL databases have their own little nooks, crannies, and special cases where they diverge from what you were expecting with traditional data stores. In Mongo’s case, you had no guarantee of durability with only one instance running: you’d better have both a master as well as at least one slave if you’re doing anything in production.

You can try this out by passing a flag to mongod:

$ mongod --dur

This turns on journaling. From the documentation:

With --dur enabled, journal files will be created in a journal/ subdirectory under your chosen db path. These files are write-ahead redo logs. In addition, a last sequence number file, journal/lsn, will be created. A clean shutdown removes all files under journal/.
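
The general idea of a write-ahead redo log can be sketched in a few lines. What follows is a conceptual illustration only, not how MongoDB implements its journal: `TinyStore`, its one-JSON-object-per-line journal format, and the `recover` method are all assumptions invented for this example.

```ruby
require "json"
require "tmpdir"

# Conceptual write-ahead redo log. TinyStore and its file format are
# hypothetical and unrelated to MongoDB's actual journal format.
class TinyStore
  attr_reader :data

  def initialize(journal_path)
    @journal_path = journal_path
    @data = {}
  end

  # Append the operation to the journal and flush it to disk
  # before applying it to the in-memory data.
  def set(key, value)
    File.open(@journal_path, "a") do |f|
      f.puts(JSON.generate("key" => key, "value" => value))
      f.fsync
    end
    @data[key] = value
  end

  # After a crash, replay the journal in order to rebuild the data.
  def recover
    return self unless File.exist?(@journal_path)
    File.foreach(@journal_path) do |line|
      op = JSON.parse(line)
      @data[op["key"]] = op["value"]
    end
    self
  end

  # On a clean shutdown the journal is no longer needed and is removed.
  # (A real store would first persist its data files; omitted here.)
  def clean_shutdown
    File.delete(@journal_path) if File.exist?(@journal_path)
  end
end

recovered = nil
journal_left = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "journal.log")
  store = TinyStore.new(path)
  store.set("likes", 10)
  store.set("likes", 11)

  # Simulate a crash: a fresh instance rebuilds its state from the journal.
  recovered = TinyStore.new(path).recover.data
  store.clean_shutdown
  journal_left = File.exist?(path)
end
```

The key ordering is that each write reaches the journal before it is applied, so anything acknowledged can be redone after an unclean stop.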

In addition, 1.7.5 also brings a few other changes. You can see the changelog here.

One quick note: MongoDB uses the standard “odd numbers are development, even are stable” versioning scheme. So the 1.7.x series is still under development, and you should probably wait until the release of 1.8.x to use this feature in production unless you fully understand what you’re doing.
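
The odd/even rule is easy to check mechanically. Here is a small sketch; `development_series?` is a hypothetical helper name, and it assumes version strings of the form MAJOR.MINOR or MAJOR.MINOR.PATCH.

```ruby
# Returns true when the minor version number is odd, which under the
# odd/even scheme described above marks a development series.
# development_series? is a hypothetical helper, not part of any library.
def development_series?(version)
  minor = version.split(".")[1]
  raise ArgumentError, "expected MAJOR.MINOR[.PATCH], got #{version.inspect}" if minor.nil?
  # Base 10 is given explicitly so a leading zero is not read as octal.
  Integer(minor, 10).odd?
end

dev    = development_series?("1.7.5")
stable = development_series?("1.8.0")
```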

[Mailing list announcement] [Downloads page]

rogue – A Lift/MongoDB query DSL from Foursquare #

Foursquare just announced a really neat new framework: rogue. Foursquare uses Scala to power their website, along with the Lift framework. They’ve been pretty vocal about their usage of MongoDB as well. This project combines all of that together.

Here’s what they have to say about the motivations for developing rogue:

Unfortunately, we found [Lift’s ORM] querying support a bit too expressive — you can pass in a query object that doesn’t represent a valid query, or query against fields that aren’t part of the record. And in addition it isn’t very type-safe. You can ask for, say, all Venue records where mayor = “Bob”, and it happily executes that query for you, returning nothing, never informing you that the mayor field is not a String but a Long representing the ID of the user. Well, we thought we could use the Scala type system to prevent this from ever happening, and that’s what we set out to do.

So what’s it look like? Here’s an example that should be familiar to anyone who’s used Foursquare:

Checkin where (_.venueid eqs id)
  and (_.userid eqs mayor.id)
  and (_.cheat eqs false)
  and (_._id after sixtyDaysAgo) 
  select(_._id) fetch()

Pretty cool. This will actually use Scala’s static type system to make sure you aren’t doing something stupid. For example, it will make sure that venueid is an actual member of Checkin, and also that id is of the same type as venueid.

Foursquare has made sure to mention that contributions are very welcome, so if Scala is your thing, fork away!

#40: Riak revisited with Andy Gross, Mark Phillips, and John Nunemaker

Wynn sat down with Andy Gross and Mark Phillips of Basho and John Nunemaker of Ordered List to talk about Riak, Riak Search, and moving an open source community to GitHub. Items mentioned in the show: NoSQL smackdown, live from SXSW 2010. Are you web scale? Drop us a ping@thechangelog.com and let us know who […]

mongomatic: Minimal Ruby mapper for Mongo #

If you’re a close-to-the-metal sort of developer who eschews conveniences like relationships, indexes, and query APIs, then check out Mongomatic from Ben Myles. Mongomatic aims to do ‘just enough’ by mapping your models to MongoDB collections but leaves the rest to you:

  • No additional query API. You simply drop down to the Ruby driver.
  • No relationships. Simply write your own finder methods.
  • No validations. Unless you write your own.

What’s the upside, you may ask? Minimal dependencies and better alignment with MongoDB native conventions.

A sample model

require 'mongomatic'

class User < Mongomatic::Base
  def validate
    self.errors << ["Name", "can't be empty"]  if self["name"].blank?
    self.errors << ["Email", "can't be empty"] if self["email"].blank?
  end
end

# set the db for all models:
Mongomatic.db = Mongo::Connection.new.db("mongomatic_test")
# or you can set it for a specific model:
User.db = Mongo::Connection.new.db("mongomatic_test_user")

Find a single user:

found = User.find_one({"name" => "Ben Myles"})
=> #<User:0x00000101939a48 @doc={"_id"=>BSON::ObjectID('4c32834f0218236321000001'), "name"=>"Ben Myles", "email"=>"me@somewhere.com"}, @removed=false, @is_new=false, @errors=[]>

Iterate over a cursor, the MongoDB way:

cursor = User.find({"name" => "Ben Myles"})
=> #<Mongomatic::Cursor:0x0000010195b4e0 @obj_class=User, @mongo_cursor=<Mongo::Cursor:0x80cadac0 namespace='mongomatic_test.User' @selector={"name"=>"Ben Myles"}>>
found = cursor.next
=> #<User:0x00000101939a48 @doc={"_id"=>BSON::ObjectID('4c32834f0218236321000001'), "name"=>"Ben Myles", "email"=>"me@somewhere.com"}, @removed=false, @is_new=false, @errors=[]>
found.remove
=> 67
User.count
=> 0
User.find({"name" => "Ben Myles"}).next
=> nil

If you need a quick-and-dirty model for your MongoDB Ruby app, give Mongomatic a look. It looks like a lightweight alternative to MongoMapper and Mongoid.

[Source on GitHub] [Homepage]

Video: NoSQL Smackdown Part 4

The last of our NoSQL Smackdown series features CouchDB contributor J Chris Anderson tagging in for Jan to talk about what makes CouchDB development so cool. [Download] [iPhone version]

Video: NoSQL Smackdown Part 3

The third installment of our NoSQL Smackdown video series asks if only the world’s largest sites have big data needs. Werner says the amount of social interaction on today’s web means even low volume sites have to deal with a lot of data. [Download] [Part 1] [Part 2] [Complete event audio]

Video: NoSQL Smackdown Part 2

Hot on the heels of Part 1, our next video installment of the NoSQL Smackdown has Werner telling you you’re crazy if you run your own database. Ready FIGHT! [Download] [iPhone/iPod version] [Complete EP 0.1.8 audio]

Delayed Job hits 2.0 #

Collective Idea has released Delayed Job 2.0 from their fork of everyone’s favorite database-driven job queue for Ruby. The latest version supports new database options including MongoMapper and DataMapper.

Version 2.0 is roughly six times faster than 1.8.5 when using the active_record backend:

                        user     system      total         real
delayed_job 1.8.5 195.670000  14.020000 209.690000 (230.887172)
delayed_job 2.0    36.200000   0.940000  37.140000 ( 39.959233)
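
The "roughly six times" figure can be checked directly from the wall-clock ("real") column. This is just arithmetic on the numbers in the table above; the hash keys are labels chosen for this sketch.

```ruby
# Wall-clock ("real") seconds copied from the benchmark table above.
real_seconds = {
  "delayed_job 1.8.5" => 230.887172,
  "delayed_job 2.0"   => 39.959233
}

# Ratio of old to new elapsed time; values above 1 mean 2.0 is faster.
speedup = real_seconds["delayed_job 1.8.5"] / real_seconds["delayed_job 2.0"]
puts format("2.0 is about %.1fx faster by wall-clock time", speedup)
```

The ratio comes out a little under six, which matches the claim.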

What’s even more surprising is that active_record is so much faster than the other two options based on Brandon’s benchmarks:

                     user     system      total        real
active_record      36.200000   0.940000  37.140000 ( 39.959233)
mongo_mapper       69.270000   3.220000  72.490000 ( 90.783220)
data_mapper       255.620000   2.880000 258.500000 (275.550383)

[Source on GitHub] [Changelog] [Compare 2.0 with 1.8.5]

Video: NoSQL Smackdown Part 1

By popular demand we’re posting the video behind Episode 0.1.8 NoSQL Smackdown from SXSW in a series of bite-sized chunks. In part one we talk about massively large documents, inside massively large databases, consistency models and the impact on application design. Enjoy! [Download] [iPhone/iPod version]

A Nunemaker Joint #

It’s never been easier to do file uploads with MongoMapper and GridFS. Joint, from MongoMapper creator John Nunemaker, is a MM plugin that adds some nice convenience methods to your models:

class Foo
  include MongoMapper::Document
  plugin Joint

  attachment :image
  attachment :pdf
end

By declaring these two attachments, you automagically get accessors for image and pdf. The setter methods take any IO (File, Tempfile, etc) and the getter methods return a GridIO type from the Ruby driver.

John even throws in some nifty proxy goodness:

doc.image.id
doc.image.size
doc.image.type
doc.image.name

[Source on GitHub] [John on Episode 0.1.1]

#18: NoSQL Smackdown!

While at SXSW Interactive, Adam and Wynn got to attend the Data Cluster Meetup hosted by Rackspace and Infochimps. Things got a bit rowdy when the panel debated features of Cassandra, CouchDB, MongoDB and Amazon SimpleDB and started throwing dirt at everybody else’s favorite NoSQL databases. The participants: Stu Hood from Cassandra Jan Lehnardt from […]

NoRM – Bringing MongoDB to .NET, LINQ, and Mono #

Just because you’re slinging C# doesn’t mean that Microsoft SQL Server is the only database in town, you know. Want to play with MongoDB? Then give NoRM from Andrew Theken a look. NoRM aims to:

  • Wrap the standard MongoDB operations in a strongly-typed interface
  • Provide ultra fast serialization of BSON to .NET CLR types and back
  • LINQ-to-Mongo support
  • Mono compatibility

A quick code example from the Wiki:

//open collection
var coll = (new MongoServer()).GetDatabase("Northwind")
                  .GetCollection<Product>();
//create a new object to be added to the collection
var obj = new Product();
obj._id = BSONOID.NewOID();
obj.Title = "Shoes";
//save the object
coll.Insert(obj);
//find the object
var obj2 = coll.FindOne(new { _id = obj._id}).First();

Glad to see Northwind Traders has upgraded to MongoDB…

[Source on GitHub]

Mongrations – migrations for MongoMapper #

Why would a schema-less database need migrations? Simple: to help you keep old data fresh as you change your data format. Recently added new columns to your MongoMapper model and need to update old values in your MongoDB collection? Terry Heath gives you Mongrations:

script/generate mongration update_followers_count_for_existing

You’ll get a new file with the familiar Rails migration format:

class UpdateFollowersCountForExisting < MongoMapper::Mongration
  def self.up
  end

  def self.down
  end
end

Just add your own code to manipulate your data and call rake db:mongrate. Mongrations include rake tasks for db:mongrate:redo, db:mongrate:up, db:mongrate:down, db:mongrate:rollback.

[Source on GitHub] [Blog post]

Kohana – The Swift PHP Framework #

Kohana is a PHP MVC web framework that aims to be easy, lightweight, and secure. It has no dependencies on PECL or PEAR extensions and uses strict PHP 5 OOP. It uses what it calls “cascading resources” to allow developers to extend the framework without editing the core system.

There is an extensive list of modules, including a MongoDB ORM, MangoDB.

[Source on GitHub] [Homepage] [Docs]

Episode 0.1.2 – Gordon is such a Showoff

Adam and Wynn continued chatting with John Nunemaker about some recent featured projects including Gordon, Showoff, jQuery Lint, JSpec, congomongo and more. Items discussed in the show: Friendly NoSQL in MySQL configliere – Simple Ruby configuration twitter-node – Node.js tweetstreaming Gordon – Pure JavaScript Flash replacement jQuery Lint – jQuery validator Showoff – Keynote killer? […]

Episode 0.1.1 – John Nunemaker from Ordered List, RailsTips.org, and MongoMapper

Adam and Wynn caught up with John Nunemaker from Ordered List to chat about open source, improving your craft, building a business, and how MongoDB has changed his life. Items mentioned in the show: John’s Ruby projects The Twitter gem, HTTParty, Crack, MongoMapper MongoHQ, shared MongoDB hosting The GitHub Fork Queue Harmony, the new CMS […]

Navvy: Simple database agnostic Ruby background job processor #

If you like delayed_job but want to BYODB, @jeffkreeftmeijer brings you Navvy.

From the README

Navvy is a simple Ruby background job processor inspired by delayed_job, but aiming for database agnosticism. Currently Navvy supports ActiveRecord, MongoMapper, Sequel and DataMapper but it’s extremely easy to write an adapter for your favorite ORM. Besides plain Ruby (1.8 & 1.9) it completely supports Rails Edge.

Navvy gets an endorsement from Mr. MongoMapper @jnunemaker in upcoming episode 0.1.2.

We’re looking forward to Monitor, the planned web UI to Navvy.

[Source on GitHub] [Wiki] [Roadmap]