
Today’s Hadoop World talk comes from Ashish Thusoo at Facebook and goes into detail about how Facebook uses Hadoop and Hive to expose massive volumes of data to their internal users familiar with traditional data warehousing tools. Thanks Ashish, and stay tuned for more!

Today’s Hadoop World video comes from Ed Capriolo and goes into detail about how to effectively monitor Hadoop in production environments. Thanks Ed, and stay tuned for more!

Avro is a recent addition to Apache’s Hadoop family of projects. Avro defines a data format designed to support Big Data applications, and provides support for this format in a variety of programming languages. ...

It has been almost a month since Hadoop World: NYC, and things are just starting to get back to normal here at Cloudera HQ. We were thrilled to see over 500 Apache Hadoop enthusiasts descend upon New York City for the first major Big Data event on the East Coast...

Around the world, individuals contribute to Hadoop and build community around the technology. This kind of collaboration is at the heart of open source software, and here at Cloudera, we feel privileged to be a part of the Apache Hadoop community...

At Hadoop World NYC, Cloudera announced a new product: Cloudera Desktop. Over the past several months, this product has been my principal concern here at Cloudera, where I’m the UI lead (actually, until about a week ago, I was the only UI developer). ...

Every day, we hear about people doing amazing things with Hadoop. The variety of applications across industries is clear evidence that Hadoop is radically changing the way data is processed at scale...

Today at Hadoop World NYC, we’re announcing the availability of Cloudera Desktop, a unified and extensible graphical user interface for Hadoop. The product is free to download and can be used with either internal clusters or clusters running on public clouds. ...

At the beginning of September, we announced the first release of CDH2, our current testing repository. Packages in our testing repository are recommended for people who want more features and are willing to upgrade as bugs are worked out...

One of the more common requests we receive from the community is to package HBase with Cloudera’s Distribution for Hadoop. Lately, I’ve been doing a lot of work on making Cloudera’s packages easy to use, and recently, the HBase team has pitched in to help us deliver compatible HBase packages. We’r...

In a previous post, I outlined how to build a basic trend tracking site called trendingtopics.org with Cloudera’s Distribution for Hadoop and Hive. TrendingTopics uses Hadoop to identify the top articles trending on Wikipedia and displays related news stories and charts. ...

Apache Hadoop’s jobtracker, namenode, secondary namenode, datanode, and tasktracker all generate logs. That includes logs from each of the daemons under normal operation, as well as configuration logs, statistics, standard error, standard out, and internal diagnostic information. ...

In March of this year, we released our distribution for Hadoop. Our initial focus was on stability and making Hadoop easy to install. This original distribution, now named CDH1, was based on the most stable version of Apache Hadoop at the time: 0.18.3...

It’s been a crazy few weeks here at Cloudera, and while there is no sign of things letting up before Hadoop World: NYC 2009 on October 2nd, we wanted to take a minute to share the latest details about the speakers, and to say thanks to our sponsors who have recently come on board. ...






